Grep filtering output from a process after it has already started? - linux

Normally when one wants to look at specific output lines from running something, one can do something like:
./a.out | grep IHaveThisString
but what if IHaveThisString is something which changes every time, so you first need to run it, watch the output to catch what IHaveThisString is on that particular run, and then grep it out? I could just dump to a file and grep it later, but is it possible to do something like backgrounding the process and then bringing it back to the foreground, now piped to some grep? Something akin to:
./a.out
Ctrl-Z
fg | grep NowIKnowThisString
just wondering..

No, it is only in your screen buffer if you didn't save it in some other way.
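If you can rerun the program, one common way to keep a copy you can grep later is tee; a minimal sketch, where output.log is just a placeholder name:
./a.out | tee output.log               # watch the output live while also saving a copy
grep NowIKnowThisString output.log     # once you know the string, grep the saved copy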

Short form: You can do this, but only if you set it up ahead of time; it's not something that can be put into place interactively after the fact.
Write your script to determine what the string is. We'd need a more detailed example of the output format to give a better example of usage, but here's one for the trivial case where the entire first line is the filter target:
run_my_command | { read string_to_filter_for; fgrep -e "$string_to_filter_for"; }
Replace the read string_to_filter_for with as many commands as necessary to read enough input to determine what the target string is; this could be a loop if necessary.
For instance, let's say that the output contains the following:
Session id: foobar
and thereafter, you want to grep for lines containing foobar.
...then you can pipe through the following script:
re='Session id: (.*)'
while read; do
    if [[ $REPLY =~ $re ]] ; then
        target=${BASH_REMATCH[1]}
        break
    else
        # if you want to print the preamble; leave this out otherwise
        printf '%s\n' "$REPLY"
    fi
done
[[ $target ]] && grep -F -e "$target"
If you want to manually specify the filter target, this can be done by having the loop check for a file being created with filter contents, and using that when starting up grep afterwards.
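A sketch of that manual variant, assuming an arbitrary control file /tmp/filter-target that you create from another terminal once you know the string (e.g. echo foobar > /tmp/filter-target):
while IFS= read -r line; do
    if [[ -s /tmp/filter-target ]]; then    # the control file has appeared with content
        target=$(</tmp/filter-target)
        break
    fi
    printf '%s\n' "$line"                   # keep passing the preamble through in the meantime
done
[[ $target ]] && grep -F -e "$target"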

What you need is a little unusual, but you can do it this way:
go into a script session first;
then use the shell as usual;
then start and interrupt your program;
then run grep over the typescript file.
Example:
$ script
$ ./a.out
Ctrl-Z
$ fg
$ grep NowIKnowThisString typescript

You could use a stream editor such as sed instead of grep. Here's an example of what I mean:
$ cat list
Name to look for: Mike
Dora 1
John 2
Mike 3
Helen 4
Here the name to look for is given in the first line, and we want to grep for it. Now, piping the command to sed:
$ cat list | sed -ne '1{s/Name to look for: //;h}' \
> -e ':r;n;G;/^.*\(.\+\).*\n\1$/P;s/\n.*//;br'
Mike 3
Note: sed itself can take a file as a parameter, but since you're working with a stream rather than a text file, this is how you'd use it.
Of course, you'd need to modify the command for your case.
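If the sed one-liner is hard to adapt, the same idea written with awk (a different tool than this answer uses, shown only as a sketch) may be easier to modify:
# line 1: strip the label and remember the name; later lines: print those containing it
awk 'NR==1 { sub(/^Name to look for: /, ""); name = $0; next }
     index($0, name)' list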

Related

Linux save string to file without ECHO command

I want to save a command to a file (for example I want to save the string "cat /etc/passwd" to a file) but I can't use the echo command.
How can I create and save a string to a file directly without using the echo command?
You can redirect cat to a file, type the text, and press Control-D when you're done, like this:
cat > file.txt
some text
some more text
^D
By ^D I mean pressing Control-D at the end, on an empty line.
It will not become part of the file; it just terminates the input.
Are you avoiding echo for security purposes (e.g. you're using a shared terminal and you don't want to leave a trace in the shell history of what you've written into your files), or are you just curious about an alternative method?
Simple alternative to echo:
As someone said, redirecting cat is probably the simpler way to go.
I'd suggest manually typing your end-of-file marker, like this:
cat <<EOF > outputfile
> type here
> your
> text
> and finish it with
> EOF
Here's the string you're asking for, as an example:
cat <<EOF > myscript.sh
cat /etc/passwd
EOF
You probably don't want everyone to know you've peeked into that file, but if that's your purpose, note that wrapping it inside an executable file won't make it any more private, as those lines will be logged anyway...
Security - Avoiding history logs etc..
In a modern shell, just add a space at the beginning of each command; it will be kept out of the history, so you can freely run whatever you want.
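For bash specifically, the leading-space trick only works when HISTCONTROL includes ignorespace (or ignoreboth); a quick sketch:
HISTCONTROL=ignoreboth          # ignorespace alone is enough for this purpose
 cat /etc/passwd > file.txt     # note the leading space: the command is not saved in history
history | tail -3               # the spaced command should not show up here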
BTW, my best hint is to avoid using that terminal at all, if you can. If you have two shells (another machine, or even just another secure user on the same machine), I'd recommend using netcat. See here: http://www.thegeekstuff.com/2012/04/nc-command-examples/?utm_source=feedburner
{ { command ls $(dirname $(which cat)) |
grep ^ca't$'; ls /etc/passwd; } |
tr \\n ' '; printf '\n'; } > output-file
But it's probably a lot simpler to just do : printf 'cat /etc/passwd\n'
To be clear, this is a tongue-in-cheek solution. The initial command is an extraordinarily convoluted way to get what you want, and this is intended to be a humorous answer. Perhaps instructive to understand.
I am not sure I understood you correctly, but:
cat /etc/passwd > target.file
uses the > operator to write it to a file without echoing.
If you need to use it inside a program:
cat <<EOF >file.txt
some text
some more text
EOF
I would imagine that you are probably trying to print the content of a string to a file, hence you mentioned echo.
You are avoiding this:
echo "cat /etc/passwd" > target.file
You can use a here string combined with cat.
cat > target.file <<< "cat /etc/passwd"
Now the file target.file will contain a string cat /etc/passwd.
$ cat target.file
cat /etc/passwd
$
To create the string:
var1='your command'
To save a file (or a variable) to another file without echo, use:
cat "$FILE" > /new/file/path
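A slightly fuller sketch covering both cases (the paths and names here are just placeholders):
var1='cat /etc/passwd'                   # the string to save
printf '%s\n' "$var1" > /new/file/path   # save a variable's contents without echo
cat "$FILE" > /new/file/path             # save an existing file's contents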

Can I avoid using a FIFO file to join the end of a Bash pipeline to be stored in a variable in the current shell?

I have the following functions:
execIn ()
{
    local STORE_INvar="${1}" ; shift
    printf -v "${STORE_INvar}" '%s' "$( eval "$@" ; printf %s x ; )"
    printf -v "${STORE_INvar}" '%s' "${!STORE_INvar%x}"
}
and
getFifo ()
{
    local FIFOfile
    FIFOfile="/tmp/diamondLang-FIFO-$$-${RANDOM}"
    while [ -e "${FIFOfile}" ]
    do
        FIFOfile="/tmp/diamondLang-FIFO-$$-${RANDOM}"
    done
    mkfifo "${FIFOfile}"
    echo "${FIFOfile}"
}
I want to store the output of the end of a pipeline into a variable, as given to a function at the end of the pipeline. However, the only way I have found to do this that works in early versions of Bash is to use mkfifo to make a temporary FIFO file. I was hoping to use file descriptors to avoid having to create temporary files. This works, but is not ideal:
Set Up: (before I can do this I need to have assigned a FIFO file to a var that can be used by the rest of the process)
$ FIFOfile="$( getFifo )"
The Pipeline I want to persist:
$ printf '\n\n123\n456\n524\n789\n\n\n' | grep 2 # for e.g.
The action: (I can now add) >${FIFOfile} &
$ printf '\n\n123\n456\n524\n789\n\n\n' | grep 2 >${FIFOfile} &
N.B. The need to background it with & - Problem 1: I get [1] <PID_NO> output to the screen.
The actual persist:
$ execIn SOME_VAR cat - <${FIFOfile}
Problem 2: I get more noise to the screen
[1]+ Done printf '\n\n123\n456\n524\n789\n\n\n' | grep 2 > ${FIFOfile}
Problem 3: I lose the blank lines at the start of the stream, rather than at the end as I have experienced before.
So, am I doing this the right way? I am sure there must be a way, using file descriptors, to avoid the need for a FIFO file that requires cleanup afterwards, but I cannot seem to manage it, as I cannot assign either side of the problem to a file descriptor that is not attached to a file or a FIFO file.
I can try to resolve the problems with what I have, although to make this work properly I guess I need to pre-establish a pool of FIFO files that can be pulled in for use, or else I have a prerequisite of establishing this file before the command. So, for many reasons this is far from ideal. If anyone can advise me of a better way you would make my day/week/month/life :)
Thanks in advance...
Process substitution has been available in bash since the ancient days. You absolutely do not have a version so ancient as to be unable to use it. Thus, there's no need to use a FIFO at all:
readToVar() { IFS= read -r -d '' "$1"; }
readToVar targetVar < <(printf '\n\n123\n456\n524\n789\n\n\n')
You'll observe that:
printf '%q\n' "$targetVar"
...correctly preserves the leading newlines as well as the trailing ones.
By contrast, in a use case where you can't afford to lose stdin:
readToVar() { IFS= read -r -d '' "$1" <"$2"; }
readToVar targetVar <(printf '\n\n123\n456\n524\n789\n\n\n')
If you really want to pipe to this command, are willing to require a very modern bash, and don't mind being incompatible with job control:
set +m # disable job control
shopt -s lastpipe # in a pipeline, parent shell becomes right-hand side
readToVar() { IFS= read -r -d '' "$1"; }
printf '\n\n123\n456\n524\n789\n\n\n' | grep 2 | readToVar targetVar
The issues you claim to run into with using a FIFO do not actually exist. Put this in a script, and run it:
#!/bin/bash
trap 'rm -rf "$tempdir"' 0 # cleanup on exit
tempdir=$(mktemp -d -t fifodir.XXXXXX)
mkfifo "$tempdir/fifo"
printf '\n\n123\n456\n524\n789\n\n\n' >"$tempdir/fifo" &
IFS= read -r -d '' content <"$tempdir/fifo"
printf '%q\n' "$content" # print content to console
You'll notice that, when run in a script, there is no "noise" printed to the screen, because all that status is explicitly tied to job control, which is disabled by default in scripts.
You'll also notice that both leading and trailing newlines are correctly represented.
One idea, tell me I am crazy: might it be possible to use the !! notation to grab the line just executed? E.g. if there is a command that can terminate a pipeline and stop it actually executing, while as far as the shell is concerned it still counts as a successful execution (I am thinking of something like the true command), I could then use !! to grab that line and call my existing function to execute it with process substitution or something. I could then wrap this into an alias, something like: alias streamTo=' | true ; LAST_EXEC="!!" ; myNewCommandVariation <<<' which I think could be used something like: $ cmd1 | cmd2 | myNewCommandVariation THE_VAR_NAME_TO_SET. The <<< from the alias would pass the var name to the command as an arg or via stdin; either way, the command would not be at the end of a pipeline. How mad is this idea?
Not a full answer but rather a first point: is there some good reason for not using mktemp to create a new file with a random name? As far as I can see, your getFifo() function doesn't do much more than that.
mktemp -u
will give you a fresh, unused name without creating anything; then you can use mkfifo with this name.
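For instance, getFifo could shrink to something like the sketch below; note that -u only hands out a name, it does not reserve it, so there is a small race window between mktemp and mkfifo:
getFifo ()
{
    local FIFOfile
    FIFOfile=$(mktemp -u /tmp/diamondLang-FIFO-XXXXXX) || return 1   # prints a name only, creates nothing
    mkfifo "$FIFOfile" || return 1                                   # fails if the name has been taken meanwhile
    echo "$FIFOfile"
}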

Line from bash command output stored in variable as string

I'm trying to find a solution to a problem analogous to this one:
#command_A
A_output_Line_1
A_output_Line_2
A_output_Line_3
#command_B
B_output_Line_1
B_output_Line_2
Now I need to compare A_output_Line_2 and B_output_Line_1 and echo "Correct" if they are equal and "Not Correct" otherwise.
I guess the easiest way to do this is to copy a line of output into some variable, and then, after executing the two commands, simply compare the variables and echo something.
I need to implement this in a bash script, and any information on how to get a certain line of output stored in a variable would help me put the pieces together.
Also, it would be cool if anyone could tell me how to copy/store not only a whole line, but just a word or a sequence, like line 1, bytes 4-12, stored as a string in a variable.
I am not a complete beginner, but also nowhere near an advanced Linux bash user. Thanks in advance for any help, and sorry for the bad English!
An easier way might be to use diff, no?
Something like:
command_A > command_A.output
command_B > command_B.output
diff command_A.output command_B.output
This will work for comparing multiple strings.
But, since you want to know about single lines (and words in the lines) here are some pointers:
# first line of output of command_A
command_A | head -n 1
The -n 1 option says to use only the first line (the default is 10).
# second line of output of command_A
command_A | head -n 2 | tail -n 1
that will take the first two lines of the output of command_A and then the last of those two lines. Happy times :)
You can now store this information in a variable:
export output_A=`command_A | head -n 2 | tail -n 1`
export output_B=`command_B | head -n 1`
And then compare it:
if [ "$output_A" == "$output_B" ]; then echo 'Correct'; else echo 'Not Correct'; fi
To just get parts of a string, try looking into cut or (for more powerful stuff) sed and awk.
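For the "line 1, bytes 4-12" part of the question, a minimal sketch (command_A as above):
first_line=$(command_A | head -n 1)
part=$(printf '%s' "$first_line" | cut -c4-12)   # characters 4 through 12 (use -b for bytes)
# pure-bash alternative: substring expansion, offset 3 (0-based), length 9
part=${first_line:3:9}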
Also, just learning a good general-purpose scripting language like python or ruby (even perl) can go a long way with this kind of problem.
Use the IFS (internal field separator) to separate on newlines and store the outputs in an array.
#!/bin/bash
IFS='
'
array_a=( $(./a.sh) )
array_b=( $(./b.sh) )
if [ "${array_a[1]}" = "${array_b[0]}" ]; then
echo "CORRECT"
else
echo "INCORRECT"
fi

Quick unix command to display specific lines in the middle of a file?

Trying to debug an issue with a server and my only log file is a 20GB log file (with no timestamps even! Why do people use System.out.println() as logging? In production?!)
Using grep, I've found an area of the file that I'd like to take a look at, line 347340107.
Other than doing something like
head -<$LINENUM + 10> filename | tail -20
... which would require head to read through the first 347 million lines of the log file, is there a quick and easy command that would dump lines 347340100 - 347340200 (for example) to the console?
Update: I totally forgot that grep can print the context around a match... this works well. Thanks!
I found two other solutions if you know the line number but nothing else (no grep possible):
Assuming you need lines 20 to 40,
sed -n '20,40p;41q' file_name
or
awk 'FNR>=20 && FNR<=40' file_name
When using sed it is more efficient to quit processing after having printed the last line than continue processing until the end of the file. This is especially important in the case of large files and printing lines at the beginning. In order to do so, the sed command above introduces the instruction 41q in order to stop processing after line 41 because in the example we are interested in lines 20-40 only. You will need to change the 41 to whatever the last line you are interested in is, plus one.
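Applied to the range from the question (lines 347340100 to 347340200 of filename), that gives:
sed -n '347340100,347340200p;347340201q' filename
# or, with awk, stopping as soon as the range has been printed:
awk 'FNR>=347340100 && FNR<=347340200; FNR>347340200 {exit}' filename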
# print line number 52
sed -n '52p' # method 1
sed '52!d' # method 2
sed '52q;d' # method 3, efficient on large files
Method 3 is efficient on large files and is the fastest way to display specific lines.
with GNU-grep you could just say
grep --context=10 ...
No there isn't, files are not line-addressable.
There is no constant-time way to find the start of line n in a text file. You must stream through the file and count newlines.
Use the simplest/fastest tool you have to do the job. To me, using head makes much more sense than grep, since the latter is way more complicated. I'm not saying "grep is slow", it really isn't, but I would be surprised if it's faster than head for this case. That'd be a bug in head, basically.
What about:
tail -n +347340107 filename | head -n 100
I didn't test it, but I think that would work.
I prefer just going into less and
typing 50% to go to halfway through the file,
43210G to go to line 43210,
or :43210 to do the same,
and stuff like that.
Even better: hit v to start editing (in vim, of course!) right at that location. And note that vim has the same key bindings!
You can use the ex command, a standard Unix editor (part of Vim now), e.g.
display a single line (e.g. 2nd one):
ex +2p -scq file.txt
corresponding sed syntax: sed -n '2p' file.txt
range of lines (e.g. 2-5 lines):
ex +2,5p -scq file.txt
sed syntax: sed -n '2,5p' file.txt
from the given line till the end (e.g. 5th to the end of the file):
ex +5,p -scq file.txt
sed syntax: sed -n '5,$p' file.txt
multiple line ranges (e.g. 2-4 and 6-8 lines):
ex +2,4p +6,8p -scq file.txt
sed syntax: sed -n '2,4p;6,8p' file.txt
Above commands can be tested with the following test file:
seq 1 20 > file.txt
Explanation:
+ or -c followed by the command - execute the (vi/vim) command after file has been read,
-s - silent mode, also uses current terminal as a default output,
the q after -c is the command to quit the editor (add ! to force quit, e.g. -scq!).
I'd first split the file into a few smaller ones, like this:
$ split --lines=50000 /path/to/large/file /path/to/output/file/prefix
and then grep on the resulting files.
If the line number you want to read is 100:
head -100 filename | tail -1
Get ack
Ubuntu/Debian install:
$ sudo apt-get install ack-grep
Then run:
$ ack --lines=$START-$END filename
Example:
$ ack --lines=10-20 filename
From $ man ack:
--lines=NUM
Only print line NUM of each file. Multiple lines can be given with multiple --lines options or as a comma separated list (--lines=3,5,7). --lines=4-7 also works.
The lines are always output in ascending order, no matter the order given on the command line.
sed will need to read the data too, in order to count the lines.
The only way a shortcut would be possible is if there were some context/order in the file to operate on. For example, if the log lines were prepended with a fixed-width time/date, etc.,
you could use the look unix utility to binary-search through the files for particular dates/times.
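For example, assuming the lines start with a fixed-width, sortable timestamp (so the file is effectively sorted on that prefix), look can binary-search straight to them; the file name and timestamp format here are just assumptions:
look '2019-03-07 14:2' server.log    # prints every line beginning with that prefix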
Use
x=`cat -n <file> | grep <match> | awk '{print $1}'`
Here you will get the line number where the match occurred.
Now you can use the following command to print 100 lines
awk -v var="$x" 'NR>=var && NR<=var+100{print}' <file>
or you can use "sed" as well
sed -n "${x},${x+100}p" <file>
With sed -e '1,N d; M q' you'll print lines N+1 through M. This is probably a bit better then grep -C as it doesn't try to match lines to a pattern.
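For example, to print lines 10 through 15 (N=9, M=15):
sed -e '1,9d;15q' file.txt    # delete lines 1-9, print 10-14, then print line 15 and quit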
Building on Sklivvz' answer, here's a nice function one can put in a .bash_aliases file. It is efficient on huge files when printing stuff from the front of the file.
function middle()
{
    startidx=$1
    len=$2
    endidx=$(($startidx+$len))
    filename=$3

    awk "FNR>=${startidx} && FNR<=${endidx} { print NR\" \"\$0 }; FNR>${endidx} { print \"END HERE\"; exit }" $filename
}
To display a line from a <textfile> by its <line#>, just do this:
perl -wne 'print if $. == <line#>' <textfile>
If you want a more powerful way to show a range of lines with regular expressions -- I won't say why grep is a bad idea for doing this, it should be fairly obvious -- this simple expression will show you your range in a single pass which is what you want when dealing with ~20GB text files:
perl -wne 'print if m/<regex1>/ .. m/<regex2>/' <filename>
(tip: if your regex has / in it, use something like m!<regex>! instead)
This would print out <filename> starting with the line that matches <regex1> up until (and including) the line that matches <regex2>.
It doesn't take a wizard to see how a few tweaks can make it even more powerful.
Last thing: perl, being a mature language, has many hidden enhancements to favor speed and performance. With this in mind, it is the obvious choice for such an operation, since it was originally developed for handling large log files, text, databases, etc.
print line 5
sed -n '5p' file.txt
sed -n '5p;5q' file.txt   # quits after line 5, so the rest of the file is not read
print everything other than line 5
sed '5d' file.txt
And here is my own creation, put together with the help of Google:
#!/bin/bash
#removeline.sh
# removes a line from INPUTFILE; with -o the removed line is moved (appended) to OUTPUTFILE xD
usage() { # Function: Print a help message.
    echo "Usage: $0 -l LINENUMBER -i INPUTFILE [ -o OUTPUTFILE ]"
    echo "line is removed from INPUTFILE"
    echo "line is appended to OUTPUTFILE"
}
exit_abnormal() { # Function: Exit with error.
    usage
    exit 1
}

while getopts l:i:o:b flag
do
    case "${flag}" in
        l) line=${OPTARG};;
        i) input=${OPTARG};;
        o) output=${OPTARG};;
    esac
done

if [ -f tmp ]; then
    echo "Temp file tmp exists. Delete it yourself :)"
    exit
fi

if [ -f "$input" ]; then
    re_isanum='^[0-9]+$'
    if ! [[ $line =~ $re_isanum ]] ; then
        echo "Error: LINENUMBER must be a positive, whole number."
        exit 1
    elif [ $line -eq "0" ]; then
        echo "Error: LINENUMBER must be greater than zero."
        exit_abnormal
    fi
    if [ ! -z $output ]; then
        sed -n "${line}p" $input >> $output
    fi
    if [ ! -z $input ]; then
        # removing this sed command turns the move into a copy (the line stays in INPUTFILE)
        sed "${line}d" $input > tmp && cp tmp $input
    fi
fi

if [ -f tmp ]; then
    rm tmp
fi
You could try this command:
egrep -n "*" <filename> | egrep "<line number>"
Easy with perl! If you want to get lines 1, 3 and 5 from a file, say /etc/passwd:
perl -e 'while(<>){if(++$l~~[1,3,5]){print}}' < /etc/passwd
I am surprised that only one other answer (by Ramana Reddy) suggested adding line numbers to the output. The following searches for the required line number and colours the output.
file=FILE
lineno=LINENO
wb="107"; bf="30;1"; rb="101"; yb="103"
cat -n ${file} | { GREP_COLORS="se=${wb};${bf}:cx=${wb};${bf}:ms=${rb};${bf}:sl=${yb};${bf}" grep --color -C 10 "^[[:space:]]\\+${lineno}[[:space:]]"; }

Egrep acts strange with -f option

I've got a strangely acting egrep -f.
Example:
$ egrep -f ~/tmp/tmpgrep2 orig_20_L_A_20090228.txt | wc -l
3
$ for lines in `cat ~/tmp/tmpgrep2` ; do egrep $lines orig_20_L_A_20090228.txt ; done | wc -l
12
Could someone give me a hint what could be the problem?
No, the files did not change between executions. The expected answer for the egrep line count is 12.
UPDATE on file contents: the searched file contains about 13000 lines, each of them 500 chars long; the pattern file contains 12 lines, each of them 24 chars long. The pattern always (and only) occurs at a fixed position in the searched file (columns 26-49).
UPDATE on pattern contents: every pattern in tmpgrep2 is a 24-char-long number.
If the search patterns are found on the same lines, then you can get the result you see:
Suppose you look for:
abc
def
ghi
jkl
and the data file is:
abcdefghijklmnoprstuvwxzy
then the one-time command will print 1 and the loop will print 4.
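You can reproduce the difference directly; the file names here are only for the demonstration:
printf 'abc\ndef\nghi\njkl\n' > patterns
printf 'abcdefghijklmnoprstuvwxzy\n' > data
egrep -f patterns data | wc -l                                     # prints 1 (one matching line)
for lines in `cat patterns` ; do egrep $lines data ; done | wc -l  # prints 4 (the line is printed once per pattern)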
Could it be that the lines read contain something that the shell is expanding/substituting for you in the second version? Then that doesn't get done by grep when it reads the patterns itself, thus leading to a different set of patterns being matched.
I'm not totally sure if the shell is doing any expansion on the variable value in an invocation like that, but it's an idea at least.
EDIT: Nope, it doesn't seem to do any substitutions. But it could be a quoting issue: if your patterns contain whitespace, the for loop will step through each token, not through each line. Take a look at the read bash builtin.
Do you have any duplicates in ~/tmp/tmpgrep2? Egrep will only use the dupes one time, but your loop will use each occurrence.
Get rid of dupes by doing something like this:
$ for lines in `sort < ~/tmp/tmpgrep2 | uniq` ; do egrep $lines orig_20_L_A_20090228.txt ; done | wc -l
I second #unwind.
Why don't you run without wc -l and see what each search is finding?
And maybe:
for lines in `cat ~/tmp/tmpgrep2` ; do echo $lines ; done
Just to see now the shell is handling $lines?
The others have already come up with most of the things I would look at. The next thing I would check is the environment variable GREP_OPTIONS, or whatever it is called on your machine. I've gotten the strangest error messages or behaviors when using a command line argument that interfered with the environment settings.
