Pipe 'tail -f' into awk without hanging - linux

Something like this will hang:
tail -f textfile | awk '{print $0}'
while grep won't hang when used instead of awk.
My actual intention is to add color to some log output using merely standard commands; however it seems that piping tail -f into awk won't work. I don't know if it's a buffer problem, but I tried some approaches that haven't worked, like:
awk '{print $0;fflush()}'
and also the suggestions from "How to pipe tail -f into awk".
Any ideas?

I ran into almost exactly the same problem with mawk. I think it is due to the way mawk flushes its buffer; the problem went away when I switched to gawk. Hope this helps (a bit late, I know).
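For the coloring goal in the question, a minimal sketch might look like the following. The function name colorize and the ERROR/WARN patterns are assumptions chosen for illustration, not part of the original post. Each line is flushed explicitly with fflush() so output is not held back when it goes to a pipe or a file. Some older mawk builds reportedly also buffer their input when reading from a pipe, in which case fflush() alone may not be enough and mawk -W interactive or gawk may be needed.

```shell
# Hypothetical helper: wrap lines containing ERROR in red and WARN in yellow.
# \033 is the octal escape for ESC; [31m, [33m and [0m are ANSI color codes.
colorize() {
  awk '
    /ERROR/ { print "\033[31m" $0 "\033[0m"; fflush(); next }
    /WARN/  { print "\033[33m" $0 "\033[0m"; fflush(); next }
            { print; fflush() }
  '
}

# Typical use (runs until interrupted):
#   tail -f textfile | colorize
```

Lines that match neither pattern pass through unchanged, so the helper can sit at the end of a longer pipeline.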

I tried this command :
tail -f test | awk '{print $0;}'
And it doesn't hang. Awk will print the new values each time I add something in the test file.
echo "test" >> test
I think you just forgot a quote in your command, because you wrote (edit: well, before your post was edited):
tail -f textfile | awk {print $0}'
Instead of :
tail -f textfile | awk '{print $0}'

Related

How Can I Perform Awk Commands Only On Certain Fields

I have CSV columns that I'm working with:
info,example-string,super-example-string,otherinfo
I would like to get:
example-string super example string
Right now, I'm running the following command:
awk -F ',' '{print $3}' | sed "s/-//g"
But, then I have to paste the lines together to combine $2 and $3.
Is there any way to do something like this?
awk -F ',' '{print $2" "$3}' | sed "s/-//g"
Except, where the sed command is only performed on $3 and $2 stays in place? I'm just concerned later on if the lines don't match up, the data could be misaligned.
Please note: I need to keep the pipe for the SED command. I just used a simple example but I end up running a lot of commands after that as well.
Try:
$ awk -F, '{gsub(/-/," ",$3); print $2,$3}' file
example-string super example string
How it works
-F,
This tells awk to use a comma as the field separator.
gsub(/-/," ",$3)
This replaces all - in field 3 with spaces.
print $2,$3
This prints fields 2 and 3.
Examples using pipelines
$ echo 'info,example-string,super-example-string,otherinfo' | awk -F, '{gsub(/-/," ",$3); print $2,$3}'
example-string super example string
In a pipeline with sed:
$ echo 'info,example-string,super-example-string,otherinfo' | awk -F, '{gsub(/-/," ",$3); print $2,$3}' | sed 's/string/String/g'
example-String super example String
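The same approach holds up across several rows, which speaks to the alignment worry: the substitution runs per record inside awk, so field 2 and the edited field 3 always come from the same input line. A small sketch, where the second row is made-up data for illustration:

```shell
# Two sample rows; the second is hypothetical test data.
csv='info,example-string,super-example-string,otherinfo
info2,second-key,another-long-value,more'

# gsub() edits only field 3; field 2 is printed untouched.
result=$(printf '%s\n' "$csv" | awk -F, '{ gsub(/-/, " ", $3); print $2, $3 }')
printf '%s\n' "$result"
```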
The best solution would be to use either a single sed or a single awk, but since you asked for a combined awk and sed solution, here is one. It assumes your actual data has the same shape as the sample Input_file.
awk -F, '{print $2,$3}' Input_file | sed 's/\([^ ]*\) \([^-]*\)-\([^-]*\)-\([^-]*\)/\1 \2 \3 \4/'
Output will be as follows.
example-string super example string

Awk, pipe and tail -f giving unexpected behavior [duplicate]

This question already has answers here:
Piping tail output though grep twice
(2 answers)
Closed 4 years ago.
Here is my sample log file: http://pastebin.com/DwWeFhJk
When I am doing
tail -f log | awk '{if (NF>3) {print $1}; }'
the result I am getting is correct
64.242.88.10
64.242.88.10
64.242.88.10
64.242.88.10
64.242.88.10
64.242.88.10
64.242.88.10
64.242.88.10
64.242.88.10
64.242.88.10
But when I am doing:
tail -f log |
awk '{if (NF>3) {print $1}; }' |
awk '{print $1}'
I get no output at all. There is also no output with:
tail -f log | awk '{if (NF>3) {print $1}; }' | grep "64"
I don't understand why the output of the first awk is not passed as input to the second awk/grep after the pipe.
When the output of the first awk is going to the terminal, the output is line-buffered, so each line is printed as it is produced. When the output is going to the second awk or the grep, it is fully buffered. The output won't be sent until the buffer is full. When enough extra records are appended to the log, the second awk will get a buffer full of data to process. Until then, nothing will happen.
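One way to work around this, sketched here under the assumption that the awk in use supports fflush() (gawk, mawk and current POSIX awks do), is to flush after every print in the middle stage. The function name first_field is made up for the example.

```shell
# Hypothetical wrapper around the question's filter, flushing per line so the
# next stage of the pipeline sees each record as soon as it is printed.
first_field() {
  awk 'NF > 3 { print $1; fflush() }'
}

# Typical use (runs until interrupted):
#   tail -f log | first_field | grep --line-buffered 64
```

The --line-buffered flag on grep only matters if grep itself feeds another pipe or a file; when it writes to the terminal it is already line-buffered.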
You start the command with tail -f, that keeps the output open and therefore does not send a needed newline to the other commands.
This works perfectly fine:
cat log | awk '{if (NF>3) {print $1}; }' | grep 64
So, the problem is buffering. The middle awk is doing normal buffering instead of interactive buffering. This works (non-portably) with mawk:
tail -f log | mawk -W interactive '{if (NF>3) {print $1}; }' | awk '{print}'
You can read the GNU description of the issue.
In any case, just check whether the awk used in the middle can be told to buffer interactively.
Added:
The command system("") seems to unblock the buffering. It is POSIX, but does not work with mawk.
tail -f log | awk '{if (NF>3) {print $1}; system("")}' | awk '{print}'
Search for "parallel --pipe" in the following link for another way to avoid the buffering:
https://www.gnu.org/software/parallel/parallel_tutorial.html

cat passwd | awk -F':' '{printf $1}' Is this command correct?

I'd like to know how cat passwd | awk -F':' '{printf $1}' works. cat /etc/passwd is a list of users with ID and folders from root to the current user (I don't know if it has something to do with cat passwd). -F is some kind of input file and {printf $1} is printing the first column. That's what I've found so far, but it seems confusing to me.
Can anyone help me or explain to me if it's right or wrong, please?
This is equivalent to awk -F: '{print $1}' passwd. The cat command is superfluous as all it does is read a file.
The -F option determines the field separator for awk. The quotes around the colon are also superfluous since colon is not special to the shell in this context. The print invocation tells awk to print the first field using $1. You are not passing a format string, so you probably mean print instead of printf.
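The difference between print and printf can be seen in a short sketch. The two-line sample below stands in for a real passwd file and is invented for the demonstration.

```shell
# A two-line stand-in for /etc/passwd (sample data, not a real file).
sample='root:x:0:0:root:/root:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin'

# print appends a newline after each record.
with_print=$(printf '%s\n' "$sample" | awk -F: '{ print $1 }')
# printf with only $1 uses the field itself as the format: no newline is added.
with_printf=$(printf '%s\n' "$sample" | awk -F: '{ printf $1 }')
# printf with an explicit format string behaves like print here.
with_format=$(printf '%s\n' "$sample" | awk -F: '{ printf "%s\n", $1 }')

printf 'print:\n%s\n' "$with_print"
printf 'printf without format:\n%s\n' "$with_printf"
printf 'printf with format:\n%s\n' "$with_format"
```

Using a field as the format string also means any % character in the data would be interpreted as a conversion, which is another reason to prefer print or an explicit format.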

Tail -f piped to > awk piped to file > file does not work

I'm having trouble wrapping my head around piping and a potential buffering issue. I am trying to perform a set of piped operations that seems to break at some level of the pipeline. To simplify, I narrowed it down to three piped operations that do not work correctly:
tail -f | awk '{print $1}' > file
results in no data being redirected to the file. However,
tail -f | awk '{print $1}'
prints the results to stdout fine.
Also,
tail -10 | awk '{print $1}' > file
works fine as well.
Thinking it might be a buffering issue, I tried
tail -f | unbuffer awk '{print $1}' > file
which did not help.
(Note: in the original command I have more operations in between, using grep --line-buffered, but the problem was narrowed down to the three piped commands tail -f | awk > file.)
The following will tail -f on a given file and whenever new data is added will automatically execute the while loop:
tail -f file_to_watch | while read a; do echo "$a" |awk '{print $1}' >> file; done
or more simply if you really only need to print the first field you could read it directly to your variable like this:
tail -f file_to_watch | while read a b; do echo "$a" >> file; done
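A slightly more defensive variant of the loop above, as a sketch: read -r keeps backslashes in log lines intact, and printf avoids surprises from echo when a field starts with a dash. The function name first_fields is an assumption for illustration.

```shell
# Hypothetical helper: print the first whitespace-separated field of each line.
# In common shells each printf call is written out immediately, so output
# appears line by line even when redirected to a file.
first_fields() {
  while read -r first rest; do
    printf '%s\n' "$first"
  done
}

# Typical use (runs until interrupted):
#   tail -f file_to_watch | first_fields >> file
```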
Here is how to handle log files:
tail --follow=name logfile | awk '{print $1 | "tee /var/log/file"}'
or for you this may be ok:
tail -f | awk '{print $1 | "tee /var/log/file"}'
--follow=name makes tail follow the file by name, so the command keeps working when the log file is rotated.
| "tee /var/log/file" sends the output to the file.

How to run grep inside awk?

Suppose I have a file input.txt with few columns and few rows, the first column is the key, and a directory dir with files which contain some of these keys. I want to find all lines in the files in dir which contain these key words. At first I tried to run the command
cat input.txt | awk '{print $1}' | xargs grep dir
This doesn't work because it thinks the keys are paths on my file system. Next I tried something like
cat input.txt | awk '{system("grep -rn dir $1")}'
But this didn't work either. Eventually I had to admit that even this doesn't work:
cat input.txt | awk '{system("echo $1")}'
After I tried to use \ to escape the white space and the $ sign, I came here to ask for your advice, any ideas?
Of course I can do something like
for x in `cat input.txt` ; do grep -rn $x dir ; done
This is not good enough, because it takes two commands, but I want only one. This also shows why xargs doesn't work: the parameter is not the last argument.
You don't need grep with awk, and you don't need cat to open files:
awk 'NR==FNR{keys[$1]; next} {for (key in keys) if ($0 ~ key) {print FILENAME, $0; next} }' input.txt dir/*
Nor do you need xargs, or shell loops or anything else - just one simple awk command does it all.
If input.txt is not a file, then tweak the above to:
real_input_generating_command |
awk 'NR==FNR{keys[$1]; next} {for (key in keys) if ($0 ~ key) {print FILENAME, $0; next} }' - dir/*
All it's doing is creating an array of keys from the first file (or input stream) and then looking for each key from that array in every file in the dir directory.
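Note that $0 ~ key treats each key as a regular expression. If the keys should match literally, a variant using index() can be sketched as follows. The temporary directory and the file contents are invented for the demonstration.

```shell
# Build a throwaway layout: input.txt with keys, dir/ with files to search.
workdir=$(mktemp -d)
mkdir "$workdir/dir"
printf 'foo 1\na.b 2\n' > "$workdir/input.txt"
printf 'has foo here\nnothing here\n' > "$workdir/dir/a.txt"
printf 'a.b literal\naxb lookalike\n' > "$workdir/dir/b.txt"

# index($0, key) is non-zero when key occurs in the line as a plain substring,
# so "a.b" does not match "axb" the way the regex a.b would.
matches=$(cd "$workdir" && awk '
  NR == FNR { keys[$1]; next }
  { for (key in keys) if (index($0, key)) { print FILENAME, $0; next } }
' input.txt dir/*)

printf '%s\n' "$matches"
rm -rf "$workdir"
```

The regex form remains the better fit when the keys are meant to be patterns; the index() form is only a safeguard for keys that contain characters such as . or *.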
Try the following:
awk '{print $1}' input.txt | xargs -n 1 -I pattern grep -rn pattern dir
First thing you should do is research this.
Next ... you don't need to grep inside awk. That's completely redundant. It's like ... stuffing your turkey with .. a turkey.
Awk can process input and do "grep" like things itself, without the need to launch the grep command. But you don't even need to do this. Adapting your first example:
awk '{print $1}' input.txt | xargs -n 1 -I % grep % dir
This uses xargs' -I option to put xargs' input into a different place on the command line it runs. In FreeBSD or OSX, you would use a -J option instead.
But I prefer your for loop idea, converted into a while loop:
while read key junk; do grep -rn "$key" dir ; done < input.txt
Use process substitution to create a keyword "file" that you can pass to grep via the -f option:
grep -f <(awk '{print $1}' input.txt) dir/*
This will search each file in dir for lines containing keywords printed by the awk command. It's equivalent to
awk '{print $1}' input.txt > tmp.txt
grep -f tmp.txt dir/*
grep requires parameters in order: [what to search] [where to search]. You need to merge keys received from awk and pass them to grep using the \| regexp operator.
For example:
arturcz#szczaw:/tmp/s$ cat words.txt
foo
bar
fubar
foobaz
arturcz#szczaw:/tmp/s$ grep 'foo\|baz' words.txt
foo
foobaz
Finally, you will finish with:
grep `commands|to|prepare|a|keywords|list` directory
In case you still want to use grep inside awk, make sure $1, $2, etc. are outside the quotes.
E.g., this works:
cat file_having_query | awk '{system("grep " $1 " file_to_be_greped")}'
// notice the space after grep and before file name
