tail -f piped to awk redirected to a file does not work - linux

Having trouble wrapping my head around piping and a potential buffering issue. I am trying to perform a set of piped operations that seem to break at some piping level. To simplify, I narrowed it down to 3 piped operations that do not work correctly:
tail -f | awk '{print $1}' > file
results in no data being redirected to the file; however,
tail -f | awk '{print $1}'
outputs the results to stdout just fine. Also,
tail -10 | awk '{print $1}' > file
works fine as well.
Thinking it might be a buffering issue, I tried
tail -f | unbuffer awk '{print $1}' > file
which produced no positive results.
(Note: in the original request I had more operations in between, using grep --line-buffered, but the problem was narrowed down to the 3 piped commands tail -f | awk > file.)

The following will tail -f a given file, and whenever new data is added it will automatically execute the while loop:
tail -f file_to_watch | while read a; do echo "$a" |awk '{print $1}' >> file; done
or, more simply, if you really only need to print the first field, you could read it directly into your variable like this:
tail -f file_to_watch | while read a b; do echo "$a" >> file; done
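If your system has GNU coreutils, stdbuf is another option: it forces awk's stdout to be line-buffered, so the redirection sees each line as it is produced. A minimal sketch, assuming an awk that buffers through C stdio (gawk does; for mawk, use its -W interactive flag instead):
# force line-buffered output from awk so the file is updated per line
tail -f file_to_watch | stdbuf -oL awk '{print $1}' > file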

Here is how to handle log files:
tail --follow=name logfile | awk '{print $1 | "tee /var/log/file"}'
or, in your case, this may be OK:
tail -f | awk '{print $1 | "tee /var/log/file"}'
--follow=name prevents the command from stopping when the log files are rotated.
| "tee /var/log/file" is what sends the output to the file.

Related

Bash tries to execute commands in heredoc

I am trying to write a simple bash script that will print multiline output to another file. I am doing it with a heredoc:
#!/bin/sh
echo "Hello!"
cat <<EOF > ~/Desktop/what.txt
a=`echo $1 | awk -F. '{print $NF}'`
b=`echo $2 | tr '[:upper:]' '[:lower:]'`
EOF
I was expecting to see a file in my desktop with these contents:
a=`echo $1 | awk -F. '{print $NF}'`
b=`echo $2 | tr '[:upper:]' '[:lower:]'`
But instead, I am seeing these as the contents of my what.txt file:
a=
b=
Somehow, even though it is part of a heredoc, bash is trying to execute it line by line. How do I prevent this and print the contents to the file as they are?
Quote EOF so that bash takes the input literally:
cat <<'EOF' > what.txt
a=`echo $1 | awk -F. '{print $NF}'`
b=`echo $2 | tr '[:upper:]' '[:lower:]'`
EOF
Also, start using $() for command substitution instead of the old and problematic backticks.
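For example, the two assignments above rewritten with $() (quoting the arguments while we're at it):
a=$(echo "$1" | awk -F. '{print $NF}')
b=$(echo "$2" | tr '[:upper:]' '[:lower:]')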

Issues passing AWK output to BASH Variable

I'm trying to parse lines from an error log in bash and then send a certain part out to a bash variable to be used later in the script, and I'm having issues once I try to pass it to the variable.
What the log file looks like:
1446851818|1446851808.1795|12|NONE|DID|8001234
I need the number in the third field of the line (in this case, the number is 12).
Here's an example of the command I'm running:
tail -n5 /var/log/asterisk/queue_log | grep 'CONNECT' | awk -F '[|]' '{print $3}'
The line of code is trying to accomplish the following:
Grab the last lines of the log file
Search for a phrase (in this case CONNECT; I'm using the same command to trigger different items)
Separate out the number in the third field of the line so it can be used elsewhere
If I run the above full command, it runs successfully like so:
tail -n5 /var/log/asterisk/queue_log | grep 'CONNECT' | awk -F '[|]' '{print $3}'
12
Now if I try to assign it to a variable in the same line/command, I'm unable to have it echo back.
My command when assigning to a variable looks like:
tail -n5 /var/log/asterisk/queue_log | grep 'CONNECT' | brand=$(awk -F '[|]' '{print $3}')
(It is being run in the same script as the echo command, so the variable should be fine. The test script looks like this:
#!/bin/bash
tail -n5 /var/log/asterisk/queue_log | grep 'CONNECT' | brand=$(awk -F '[|]' '{print $3}')
echo "$brand";
I'm aware this is most likely not the most efficient/eloquent way to do this, so if there are other ideas/ways to accomplish it I'm open to them as well (my bash skills are basic but improving).
You need to capture the output of the entire pipeline, not just the final section of it:
brand=$(tail -n5 /var/log/asterisk/queue_log | grep 'CONNECT' | awk -F '|' '{print $3}')
You may also want to consider what will happen if there is more than one line containing CONNECT in the final five lines of the file (or indeed, if there are none). That's going to cause brand to have multiple (or no) values.
If your intent is to get the third field from the latest line in the file containing CONNECT, awk can pretty much handle the entire thing without needing tail or grep:
brand=$(awk -F '|' '/CONNECT/ {latest = $3} END {print latest}' /var/log/asterisk/queue_log)
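And since the END block runs even when nothing matched, you may want to guard against an empty result before using it; a small sketch (the error message is just illustrative):
#!/bin/bash
brand=$(awk -F '|' '/CONNECT/ {latest = $3} END {print latest}' /var/log/asterisk/queue_log)
if [ -z "$brand" ]; then
    echo "no CONNECT lines found" >&2
    exit 1
fi
echo "$brand"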

Awk, pipe and tail -f giving unexpected behavior [duplicate]

This question already has answers here: Piping tail output through grep twice (2 answers). Closed 4 years ago.
Here is my sample log file: http://pastebin.com/DwWeFhJk
When I am doing
tail -f log | awk '{if (NF>3) {print $1}; }'
the result I am getting is correct
64.242.88.10
64.242.88.10
64.242.88.10
64.242.88.10
64.242.88.10
64.242.88.10
64.242.88.10
64.242.88.10
64.242.88.10
64.242.88.10
But when I am doing:
tail -f log |
awk '{if (NF>3) {print $1}; }' |
awk '{print $1}'
I am not getting any output. There is also no output in the case of
tail -f log | awk '{if (NF>3) {print $1}; }' | grep "64"
I do not understand why the output of the first awk is not being passed as input to the second awk/grep after the pipe.
When the output of the first awk is going to the terminal, it is line-buffered, so each line is printed as it is produced. When the output is going to the second awk or to grep, it is fully buffered: output won't be sent until the buffer is full. When enough extra records are appended to the log, the second awk will have a buffer full of data to process. Until then, nothing will happen.
Because the command starts with tail -f, the stream never closes, so the downstream commands never see end-of-file and their input buffers are never forced to flush.
This works perfectly fine:
cat log | awk '{if (NF>3) {print $1}; }' | grep 64
So the problem is buffering: the middle awk is doing normal block buffering instead of interactive line buffering. This works (non-portably) with mawk:
tail -f log | mawk -W interactive '{if (NF>3) {print $1}; }' | awk '{print}'
You can read the GNU description of the issue. In any case, check whether the awk used in the middle can be told to buffer interactively.
Added:
The command system("") seems to force the buffer to flush. It is POSIX, but it does not work with mawk.
tail -f log | awk '{if (NF>3) {print $1}; system("")}' | awk '{print}'
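If neither trick is available, the widely supported fflush() function (gawk, mawk, and BWK awk all have it) flushes the output explicitly after each record; a sketch:
tail -f log | awk '{if (NF>3) {print $1; fflush()}}' | awk '{print}'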
Search for "parallel --pipe" in the following link for another way to avoid the buffering:
https://www.gnu.org/software/parallel/parallel_tutorial.html

How to run grep inside awk?

Suppose I have a file input.txt with a few columns and a few rows, where the first column is the key, and a directory dir with files which contain some of these keys. I want to find all lines in the files in dir which contain these keywords. At first I tried to run the command
cat input.txt | awk '{print $1}' | xargs grep dir
This doesn't work because grep ends up treating the keys as paths on my file system. Next I tried something like
cat input.txt | awk '{system("grep -rn dir $1")}'
But this didn't work either, and eventually I had to admit that even this doesn't work:
cat input.txt | awk '{system("echo $1")}'
After trying to use \ to escape the whitespace and the $ sign, I came here to ask for your advice. Any ideas?
Of course I can do something like
for x in `cat input.txt` ; do grep -rn $x dir ; done
This is not good enough because it takes two commands, while I want only one. It also shows why xargs doesn't work here: the parameter is not the last argument.
You don't need grep with awk, and you don't need cat to open files:
awk 'NR==FNR{keys[$1]; next} {for (key in keys) if ($0 ~ key) {print FILENAME, $0; next} }' input.txt dir/*
Nor do you need xargs, or shell loops or anything else - just one simple awk command does it all.
If input.txt is not a file, then tweak the above to:
real_input_generating_command |
awk 'NR==FNR{keys[$1]; next} {for (key in keys) if ($0 ~ key) {print FILENAME, $0; next} }' - dir/*
All it's doing is creating an array of keys from the first file (or input stream) and then looking for each key from that array in every file in the dir directory.
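As a quick illustration of the mechanics, with made-up file names and contents (everything below is hypothetical):
# build tiny sample data, then run the same one-liner
printf 'alpha 1\nbeta 2\n' > input.txt
mkdir -p dir
printf 'x alpha y\nno match here\n' > dir/a.txt
awk 'NR==FNR{keys[$1]; next} {for (key in keys) if ($0 ~ key) {print FILENAME, $0; next} }' input.txt dir/*
# prints: dir/a.txt x alpha y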
Try the following:
awk '{print $1}' input.txt | xargs -n 1 -I pattern grep -rn pattern dir
The first thing you should do is research this.
Next... you don't need grep inside awk. That's completely redundant. It's like... stuffing your turkey with... a turkey.
Awk can process input and do grep-like things itself, without needing to launch the grep command. But you don't even need to do that. Adapting your first example:
awk '{print $1}' input.txt | xargs -n 1 -I % grep % dir
This uses xargs' -I option to put xargs' input into a different place on the command line it runs. On FreeBSD or OS X, you would use the -J option instead.
But I prefer your for-loop idea, converted into a while loop:
while read key junk; do grep -rn "$key" dir ; done < input.txt
Use process substitution to create a keyword "file" that you can pass to grep via the -f option:
grep -f <(awk '{print $1}' input.txt) dir/*
This will search each file in dir for lines containing keywords printed by the awk command. It's equivalent to
awk '{print $1}' input.txt > tmp.txt
grep -f tmp.txt dir/*
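If dir contains subdirectories, the same idea extends recursively with grep's -r flag; a sketch:
grep -rf <(awk '{print $1}' input.txt) dir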
grep requires its parameters in order: [what to search for] [where to search]. You need to merge the keys received from awk and pass them to grep using the \| regexp operator.
For example:
arturcz@szczaw:/tmp/s$ cat words.txt
foo
bar
fubar
foobaz
arturcz@szczaw:/tmp/s$ grep 'foo\|baz' words.txt
foo
foobaz
Finally, you will end up with:
grep `commands|to|prepare|a|keywords|list` directory
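One way to actually build that keyword list from input.txt, a sketch using -E so the | alternation needs no escaping (this assumes the keys contain no regex metacharacters):
grep -rE "$(awk '{print $1}' input.txt | paste -sd '|' -)" dir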
In case you still want to use grep inside awk, make sure $1, $2, etc. are outside the quotes.
E.g., this works perfectly:
cat file_having_query | awk '{system("grep " $1 " file_to_be_greped")}'
# notice the space after grep and before the file name

Pipe 'tail -f' into awk without hanging

Something like this will hang:
tail -f textfile | awk '{print $0}'
while grep won't hang when used instead of awk.
My actual intention is to add color to some log output using only standard commands; however, it seems that piping tail -f into awk won't work. I don't know if it's a buffering problem, but I tried some approaches that haven't worked, like:
awk '{print $0;fflush()}'
and also the approach from How to pipe tail -f into awk.
Any ideas?
I ran into almost exactly the same problem with mawk. I think it is due to the way mawk flushes its buffer; the problem went away when I switched to gawk. Hope this helps (a bit late, I know).
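For what it's worth, if you are stuck on mawk, its -W interactive flag (mentioned in the duplicate question above) should also stop the hang; a sketch:
tail -f textfile | mawk -W interactive '{print $0}'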
I tried this command:
tail -f test | awk '{print $0;}'
And it doesn't hang. Awk prints the new values each time I add something to the test file:
echo "test" >> test
I think you just forgot a quote in your command, because you wrote (edit: well, before your post was edited):
tail -f textfile | awk {print $0}'
Instead of :
tail -f textfile | awk '{print $0}'
