Read from an endless pipe in bash [duplicate] - linux

This question already has answers here:
How do you pipe input through grep to another utility?
(3 answers)
Closed 7 years ago.
I want to create a script that runs another script for each line the first one receives from a pipe.
Like this:
journalctl -f | myScript1.sh
myScript1.sh will then run another script like this:
./myScript2.sh $line_in_pipe
The problem I found is that every piece of code I tested only works with a finite pipe (i.e., until EOF).
But when I pipe in programs like tail -f or others, it just won't execute. I think it just waits for EOF before running the loop.
EDIT:
the endless pipe is like this:
tail -f /var/log/apache2/access.log | grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' | script_ip_check.sh
so the idea is for script_ip_check.sh to do something like this:
#!/bin/bash
for line in $(cat); do
    echo "process:$line"
    nmap -sV -p1234 --open -T4 $line | grep 'open' -B3 | grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' >> list_of_ip_mapped &
done
For each line (in this case an IP address) I will spawn an nmap process to scan something specific on that host.
I will use it to scan IPs that try to connect to some "hidden" port on my server.
So my script must run all the time, until I cancel it or it receives an EOF.
EDIT2:
I just found out that grep buffers its output when it isn't writing to a terminal, and that's why it wasn't working.
If I use --line-buffered, grep outputs each line as it's being processed.
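With that flag, the pipeline from the EDIT would look like this (same log path and pattern as above):
tail -f /var/log/apache2/access.log | grep --line-buffered -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' | ./script_ip_check.sh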

We can't say definitively without knowing what's in your script.
For instance, if you're doing this:
# DON'T DO THIS: Violates http://mywiki.wooledge.org/DontReadLinesWithFor
for line in $(cat); do
    : ...do something with "$line"...
done
...that'll wait until all stdin is available, resulting in the hang you describe.
However, if you're following best practices (per BashFAQ #1), your code will operate more like this:
while IFS= read -r line; do
    : ...do something with "$line"
done
...and that'll actually behave properly, subject to any buffering performed by the writer. For hints on controlling buffering, see BashFAQ #9.
Finally, quoting from DontReadLinesWithFor:
The final issue with reading lines with for is inefficiency. A while read loop reads one line at a time from an input stream; $(<afile) slurps the entire file into memory all at once. For small files, this is not a problem, but if you're reading large files, the memory requirement will be enormous. (Bash will have to allocate one string to hold the file, and another set of strings to hold the word-split results... essentially, the memory allocated will be twice the size of the input file.)
Obviously, if the input is indefinite, the memory requirements and completion time are likewise unbounded.
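Putting those pieces together, a minimal sketch of script_ip_check.sh along these lines (reusing the port, pattern, and output file from the question) could look like this:
#!/bin/bash
# Read stdin one line at a time (BashFAQ #1) so each IP is handled as soon as it arrives.
while IFS= read -r ip; do
    echo "process:$ip"
    # One backgrounded nmap per IP; the options and output file are the ones from the question.
    nmap -sV -p1234 --open -T4 "$ip" | grep 'open' -B3 | grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' >> list_of_ip_mapped &
done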

Related

How to grep and show only the data I want from the output of a running program?

I know that running
./mycommand | grep "keywords"
will get me the lines containing the keywords, but only after the program has finished executing.
But what if my program runs/loops constantly for a long time and logs results after each loop? How can I grep each new line of output as it appears in the terminal, without having to wait for the program to finish executing?
In order to do this, the best approach is to log the results of mycommand to a logfile, something like /var/log/mycommand_logfile.log (your program mycommand might need to be modified in order to do so).
You then open a new terminal and launch the following command:
tail -f /var/log/mycommand_logfile.log | grep "keywords"
This will show matching lines as they are written during mycommand's execution.
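If modifying mycommand is not an option, one possible workaround (assuming its results go to stdout/stderr) is to create the logfile with tee instead, keeping in mind that mycommand may buffer its output differently when writing to a pipe:
./mycommand 2>&1 | tee /var/log/mycommand_logfile.log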
If I understand what you're saying, what you have should be doing what you want. Grep doesn't wait for its input to end. It processes it line by line and produces output whenever it encounters a line that matches what it's looking for.
If you're finding that it's not working this way, is there anything else in your command line that might be producing the blockage? Like, for example, if you're really doing something like:
./mycommand | sort | grep "keyword"
then the "sort" command will wait until it's gotten all the data so it can sort it before passing it on to grep.
Or perhaps the problem is that mycommand is a very resource-intensive operation running at a high priority, so grep doesn't get any CPU cycles until mycommand has finished. I'm just spitballing here. The point is, you're doing the command line right.
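One more thing worth trying, purely as a guess about mycommand's behavior: if it block-buffers its stdout when writing to a pipe instead of a terminal, forcing line buffering with stdbuf can help (this only works if mycommand relies on the default stdio buffering):
stdbuf -oL ./mycommand | grep "keywords"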

Linux bash: grep from stream and write to file [duplicate]

This question already has answers here:
How to 'grep' a continuous stream?
(13 answers)
Closed 6 years ago.
I have a log file A that is constantly updated (but rolled over), and I need to constantly filter its content and write the result to a persistent file.
TL;DR
I need to:
tail -f A.log | grep "keyword" >> B.log
But this command does not write anything to B.log.
My research only turned up complex cases that don't match mine. My guess is that I'm missing some simple concept.
This is not the same question as the one marked as a possible duplicate: the grep works and I get its output if I don't try to write it to a file. The problem is the file.
If just the grep, without writing to the file, works, you've run into a buffering "problem". Unless the program implements it manually, I/O buffering is handled by the libc. If the program's stdout is a terminal, buffering is line-based. If not, the libc buffers output until the buffer reaches a size limit.
On Linux, meaning with glibc, you can use the stdbuf command to configure that buffering:
tail -f A.log | stdbuf -oL grep "keyword" >> B.log
-oL specifies that the output stream should be line-buffered.
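With GNU grep you can get the same effect without stdbuf, using grep's own flag:
tail -f A.log | grep --line-buffered "keyword" >> B.log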

Stream specific numbered Bash file descriptor into variable

I am trying to stream a specific numbered file descriptor into a variable in Bash. I can do this from normal standard input using the following function, but how do I do it from a specific file descriptor? I need to direct the FD into the sub-shell if I use the same approach. I could always do it by reading line by line, but if I can do it in a continuous stream then that would be massively preferable.
The function I have is:
streamStdInTo ()
{
    local STORE_INvar="${1}" ; shift
    printf -v "${STORE_INvar}" '%s' "$( cat - )"
}
Yes, I know this wouldn't work normally, since the end of a pipeline runs in a sub-shell and its variable assignment would be lost. However, either in the context of the Bash 4 set +m ; shopt -s lastpipe method of executing the end of a pipeline in the same shell as the start, or by directing into this via a different file descriptor, I am hoping to be able to use it.
So, my question is: how do I use the above, but with a file descriptor other than standard input?
It's not entirely clear what you mean, but perhaps you are looking for something like:
cat - <&4 # read from fd 4
Or, just call your current function with the redirect:
streamStdInTo foo <&4
edit:
Addressing some questions from the comment, you can use a fifo:
#!/bin/bash
trap 'rm -f "$f"' 0   # clean up the fifo on exit
f=$(mktemp)           # reserve a temporary path...
rm "$f"
mkfifo "$f"           # ...and turn it into a named pipe
echo foo > "$f" &     # writer runs in the background
exec 4< "$f"          # open fd 4 for reading from the pipe
cat - <&4             # read from fd 4
wait
I think there's a lot of confusion about what exactly you're trying to do. If I understand correctly, the end goal here is to run a pipeline and capture the output in a variable, right? Kind of like this:
var=$(cmd1 | cmd2)
Except I guess the idea here is that the name of "$var" is stored in another variable:
varname=var
You can do an end-run around Bash's usual job control situation by using process substitution. So instead of this normal pipeline (which would work in ksh or zsh, but not in bash unless you set lastpipe):
cmd1 | cmd2 | read "$varname"
You would use this command, which is equivalent apart from how the shell handles the job:
read "$varname" < <(cmd1 | cmd2)
With process substitution, "read $varname" isn't run in a pipeline, so Bash doesn't fork to run it. (You could use your streamStdInTo() function there as well, of course)
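For example, a quick sketch with echo and tr standing in for cmd1 and cmd2:
varname=var
IFS= read -r "$varname" < <(echo hello | tr a-z A-Z)
printf '%s\n' "${!varname}"   # prints HELLO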
As I understand it, you wanted to solve this problem by using numeric file descriptors:
cmd1 | cmd2 >&$fd1 &
read "$varname" <&$fd2
To create those file descriptors that connect the pipeline background job to the "read" command, what you need is called a pipe, or a fifo. These can be created without touching the file system (the shell does it all the time!) but the shell doesn't directly expose this functionality, which is why we need to resort to mkfifo to create a named pipe. A named pipe is a special file that exists on the filesystem, but the data you write to it doesn't go to the disk. It's a data queue stored in memory (a pipe). It doesn't need to stay on the filesystem after you've opened it, either, it can be deleted almost immediately:
pipedir=$(mktemp -d /tmp/pipe_maker_XXXX)
mkfifo ${pipedir}/pipe
exec {temp_fd}<>${pipedir}/pipe # Open both ends of the pipe
exec {fd1}>${pipedir}/pipe
exec {fd2}<${pipedir}/pipe
exec {temp_fd}<&- # Close the read/write FD
rm -rf ${pipedir} # Don't need the named FIFO any more
One of the difficulties in working with named pipes in the shell is that attempting to open them just for reading, or just for writing causes the call to block until something opens the other end of the pipe. You can get around that by opening one end in a background job before trying to open the other end, or by opening both ends at once as I did above.
The "{fd}<..." syntax dynamically assigns an unused file descriptor number to the variable $fd and opens the file on that file descriptor. It's been around in ksh for ages (since 1993?), but in Bash I think it only goes back to 4.1 (from 2010).

How to make nohup.out update with perl script?

I have a perl script that copies a large number of files. It prints some text to standard output and also writes a logfile. However, when running with nohup, both of these show up as blank files:
tail -f nohup.out
tail -f logfile.log
The files don't update until the script is done running. Moreover, for some reason tailing the .log file does work if I don't use nohup!
I found a similar question for Python (How come I can't tail my log?).
Is there a similar way to flush the output in perl?
I would use tmux or screen, but they don't exist on this server.
Check perldoc IO::Handle:
HANDLE->autoflush( EXPR );
To disable buffering on standard output, that would be:
STDOUT->autoflush(1);

How can I make Bash automatically pipe the output of every command to something like tee?

I use some magic in $PROMPT_COMMAND to automatically save every command I run to a database:
PROMPT_COMMAND='save_command "$(history 1)"'
where save_command is a more complicated function. It would be nice to also save the head/tail of the output of each command, but I can't think of a reasonable way to do this, other than manually prepending some sort of shell function to everything I type (and this becomes even more painful with complicated pipelines or boolean expressions). Basically, I just want the first and last 10 lines of whatever went to /dev/tty to get saved to a variable (or even a file) - is there any way to do this?
script(1) will probably get you started. It won't let you just record the first and last 10 lines, but you can do some post-processing on its output.
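For example, a rough sketch of that post-processing (the typescript and summary paths are just examples):
script -q /tmp/session.typescript           # record the session; exit the shell to stop recording
head -n 10 /tmp/session.typescript >  session_summary
tail -n 10 /tmp/session.typescript >> session_summary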
bash | tee /dev/tty ./bashout
This saves all stdout to bashout.
bash | tee /dev/tty | tail > ./bashout
The last 10 lines of the session's stdout get written to bashout.
bash | tee /dev/tty | sed -e :a -e '10p;$q;N;11,$D;ba' > ./bashout
The first and last 10 lines of the session's stdout get written to bashout.
These don't save the command, but if you modify your save_command to print the command to stdout, it will get in there.
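A hedged sketch of that tweak; the database part stands for whatever save_command already does:
save_command() {
    printf '+ %s\n' "$1"    # print the command to stdout so it flows into bashout too
    # ... existing code that stores "$1" in the database ...
}
PROMPT_COMMAND='save_command "$(history 1)"'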

Resources