How to grep text patterns from remote crontabs using xargs through SSH? - linux

I'm developping a script to search for patterns within scripts executed from CRON on a bunch of remote servers through SSH.
Script on client machine -- SSH --> Remote Servers CRON/Scripts
For now I can't get the correct output.
Script on client machine
#!/bin/bash
server_list=( '172.x.x.x' '172.x.x.y' '172.x.x.z' )
for s in ${server_list[#]}; do
ssh -i /home/user/.ssh/my_key.rsa user#${s} crontab -l | grep -v '^#\|^[[:space:]]*$' | cut -d ' ' -f 6- | awk '{print $1}' | grep -v '^$\|^echo\|^find\|^PATH\|^/usr/bin\|^/bin/' | xargs -0 grep -in 'server.tld\|10.x.x.x'
done
This only gives me the paths of scripts from crontab, not the matched lines and line number plus the first line is prefixed with "grep:" keyword (example below):
grep: /opt/directory/script1.sh
/opt/directory/script2.sh
/opt/directory/script3.sh
/opt/directory/script4.sh
How to get proper output, meaning the script path plus line number plus line of matching pattern?
Remote CRON examples
OO 6 * * * /opt/directory/script1.sh foo
30 6 * * * /opt/directory/script2.sh bar
Remote script content examples
1 ) This will match grep pattern
#!/bin/bash
ping -c 4 server.tld && echo "server.tld ($1)"
2 ) This won't match grep pattern
#!/bin/bash
ping -c 4 8.x.x.x && echo "8.x.x.x ($1)"

Without example input, it's really hard to see what your script is attempting to do. But the cron parsing could almost certainly be simplified tremendously by refactoring all of it into a single Awk script. Here is a quick stab, with obviously no way to test.
#!/bin/sh
# No longer using an array for no good reason, so /bin/sh will work
for s in 172.x.x.x 172.x.x.y 172.x.x.z; do
ssh -i /home/user/.ssh/my_key.rsa "user#${s}" crontab -l |
awk '! /^#|^[[:space:]]*$/ && $6 !~ /^$|^(echo|find|PATH|\/usr\/bin|\/bin\/)/ { print $6 }' |
# no -0; use grep -E and properly quote literal dot
xargs grep -Ein 'server\.tld|10.x.x.x'
done
Your command would not output null-delimited data to xargs so probably the immediate problem was that xargs -0 would receive all the file names as a single file name which obviously does not exist, and you forgot to include the ": file not found" from the end of the error message.
The use of grep -E is a minor hack to enable a more modern regex syntax which is more similar to that in Awk, where you don't have to backslash the "or" pipe etc.
This script, like your original, runs grep on the local system where you run the SSH script. If you want to run the commands on the remote server, you will need to refactor to put the entire pipeline in single quotes or a here document:
for s in 172.x.x.x 172.x.x.y 172.x.x.z; do
ssh -i /home/user/.ssh/my_key.rsa "user#${s}" <<\________HERE
crontab -l |
awk '! /^#|^[[:space:]]*$/ && $6 !~ /^$|^(echo|find|PATH|\/usr\/bin|\/bin\/)/ { print $6 }' |
xargs grep -Ein 'server\.tld|10.x.x.x'
________HERE
done
The refactored script contains enough complexities in the quoting that you probably don't want to pass it as an argument to ssh, which requires you to figure out how to quote strings both locally and remotely. It's easier then to pass it as standard input, which obviously just gets transmitted verbatim.
If you get "Pseudo-terminal will not be allocated because stdin is not a terminal.", try using ssh -t. Sometimes you need to add multiple -t options to completely get rid of this message.

Related

Grep function not stopping with head pipe

So i'm currently trying to grep a single result from a random file in a specific directory. The grepping works just fine and the expected output file is populated as expected, but for some reason, even after the output file has already been filled, the process won't stop. This is the grep command where the program seems to be getting stuck.
searchFILE(){
case $2 in
pref)
echo "Populating output file: $3-$1.data.out"
dataOutputFile="$3-$1.data.out"
zgrep -a "\"someParameter\"\:\"$1\"" /folder/anotherFolder/filetemplate.log.* | zgrep -a "\"parameter2\"\:\"$3\"" | head -1 > $dataOutputFile
;;
*)
echo "Unrecognized command"
;;
esac
echo "Query finished"
}
What is currently happening is that the output file is being populated as expected with the head pipe, but for some reason I'm not getting the "Query finished" message, and the process seems not to stop at all.
grep does not know that head -n1 is no longer reading from the pipe until it attempts to write to the pipe, which it will only do if another match is found. There is no direct communication between the processes. It will eventually stop, but only once all the data is read, a second match is found and write fails with EPIPE, or some other error occurs.
You can watch this happen in a simple pipeline like this:
cat /dev/urandom | grep -ao "12[0-9]" | head -n1
With a sufficiently rare pattern, you will observe a delay between output and exit.
One solution is to change your stop condition. Instead of waiting for SIGPIPE as your pipeline does, wait for grep to match once using the -m1 option:
cat /dev/urandom | grep -ao -m1 "12[0-9]"
I saw better performance results with zcat myZippedFile | grep whatever paradigm...
The first difference you need to try is pipe with | head -z --lines=1
The reason is null terminated lines instead of newlines (just in case).
My example script below worked (drop the case statement to make it more simple). If I hold onto $1 $2 inside functions things go wrong. I use parameter $names and only use the $1 $2 $# once, because it also goes wrong for me if I don't and in any case you can then shift over $# and catch arguments. The $# in the script itself are not the same as arguments in bash functions.
grep searching for 2 or multiple parameters in any order means using grep twice; in your case zgrep | grep. The second grep is a normal grep! You only need the first grep to be zgrep to do the unzip. Your question is simpler if you drop the case statement as bash case scares off people: bash was always an ugly lady that works good for short scripts.
zgrep searches text or compressed text, but newlines in LINUX style vs WINDOWS are not the same. So use dos2unix to convert files so that newlines work. I use compressed file simply because it is strange and rare to see zgrep, so it is demonstrated in a shell script with a compressed file! It works for me. I changed a few things, like >> and "sort -u" but you can obviously change them back.
#!/usr/bin/env bash
# Search for egA AND egB using option go
# COMMAND LINE: ./zgrp egA go egB
A="$1"
cOPT="$2" # expecting case go
B="$3"
LOG="./filetemplate.log" # use parameters for long names.
# Generate some data with gzip and delete the temporary file.
echo "\"pramA\":\"$A\" \"pramB\":\"$B\"" >> $B$A.tmp
rm -f ${LOG}.A; tar czf ${LOG}.A $B$A.tmp
rm -f $B$A.tmp
# Use paramaterise $names not $1 etc because you may want to do shift etc
searchFILE()
{
outFile="$B-$A.data.out"
case $cOPT in
go) # This is zgrep | grep NOT zgrep | zgrep
zgrep -a "\"pramA\":\"$A\"" ${LOG}.* | grep -a "\"pramB\":\"$B\"" | head -z --lines=1 >> $outFile
sort -u $outFile > ${outFile}.sorted # sort unique on your output.
;;
*) echo -e "ERROR second argument must be go.\n Usage: ./zgrp egA go egB"
exit 9
;;
esac
echo -e "\n ============ Done: $0 $# Fin. ============="
}
searchFILE "$#"
cat ${outFile}.sorted

Problems with tail -f and awk? [duplicate]

Is that possible to use grep on a continuous stream?
What I mean is sort of a tail -f <file> command, but with grep on the output in order to keep only the lines that interest me.
I've tried tail -f <file> | grep pattern but it seems that grep can only be executed once tail finishes, that is to say never.
Turn on grep's line buffering mode when using BSD grep (FreeBSD, Mac OS X etc.)
tail -f file | grep --line-buffered my_pattern
It looks like a while ago --line-buffered didn't matter for GNU grep (used on pretty much any Linux) as it flushed by default (YMMV for other Unix-likes such as SmartOS, AIX or QNX). However, as of November 2020, --line-buffered is needed (at least with GNU grep 3.5 in openSUSE, but it seems generally needed based on comments below).
I use the tail -f <file> | grep <pattern> all the time.
It will wait till grep flushes, not till it finishes (I'm using Ubuntu).
I think that your problem is that grep uses some output buffering. Try
tail -f file | stdbuf -o0 grep my_pattern
it will set output buffering mode of grep to unbuffered.
If you want to find matches in the entire file (not just the tail), and you want it to sit and wait for any new matches, this works nicely:
tail -c +0 -f <file> | grep --line-buffered <pattern>
The -c +0 flag says that the output should start 0 bytes (-c) from the beginning (+) of the file.
In most cases, you can tail -f /var/log/some.log |grep foo and it will work just fine.
If you need to use multiple greps on a running log file and you find that you get no output, you may need to stick the --line-buffered switch into your middle grep(s), like so:
tail -f /var/log/some.log | grep --line-buffered foo | grep bar
you may consider this answer as enhancement .. usually I am using
tail -F <fileName> | grep --line-buffered <pattern> -A 3 -B 5
-F is better in case of file rotate (-f will not work properly if file rotated)
-A and -B is useful to get lines just before and after the pattern occurrence .. these blocks will appeared between dashed line separators
But For me I prefer doing the following
tail -F <file> | less
this is very useful if you want to search inside streamed logs. I mean go back and forward and look deeply
Didn't see anyone offer my usual go-to for this:
less +F <file>
ctrl + c
/<search term>
<enter>
shift + f
I prefer this, because you can use ctrl + c to stop and navigate through the file whenever, and then just hit shift + f to return to the live, streaming search.
sed would be a better choice (stream editor)
tail -n0 -f <file> | sed -n '/search string/p'
and then if you wanted the tail command to exit once you found a particular string:
tail --pid=$(($BASHPID+1)) -n0 -f <file> | sed -n '/search string/{p; q}'
Obviously a bashism: $BASHPID will be the process id of the tail command. The sed command is next after tail in the pipe, so the sed process id will be $BASHPID+1.
Yes, this will actually work just fine. Grep and most Unix commands operate on streams one line at a time. Each line that comes out of tail will be analyzed and passed on if it matches.
This one command workes for me (Suse):
mail-srv:/var/log # tail -f /var/log/mail.info |grep --line-buffered LOGIN >> logins_to_mail
collecting logins to mail service
Coming some late on this question, considering this kind of work as an important part of monitoring job, here is my (not so short) answer...
Following logs using bash
1. Command tail
This command is a little more porewfull than read on already published answer
Difference between follow option tail -f and tail -F, from manpage:
-f, --follow[={name|descriptor}]
output appended data as the file grows;
...
-F same as --follow=name --retry
...
--retry
keep trying to open a file if it is inaccessible
This mean: by using -F instead of -f, tail will re-open file(s) when removed (on log rotation, for sample).
This is usefull for watching logfile over many days.
Ability of following more than one file simultaneously
I've already used:
tail -F /var/www/clients/client*/web*/log/{error,access}.log /var/log/{mail,auth}.log \
/var/log/apache2/{,ssl_,other_vhosts_}access.log \
/var/log/pure-ftpd/transfer.log
For following events through hundreds of files... (consider rest of this answer to understand how to make it readable... ;)
Using switches -n (Don't use -c for line buffering!).By default tail will show 10 last lines. This can be tunned:
tail -n 0 -F file
Will follow file, but only new lines will be printed
tail -n +0 -F file
Will print whole file before following his progression.
2. Buffer issues when piping:
If you plan to filter ouptuts, consider buffering! See -u option for sed, --line-buffered for grep, or stdbuf command:
tail -F /some/files | sed -une '/Regular Expression/p'
Is (a lot more efficient than using grep) a lot more reactive than if you does'nt use -u switch in sed command.
tail -F /some/files |
sed -une '/Regular Expression/p' |
stdbuf -i0 -o0 tee /some/resultfile
3. Recent journaling system
On recent system, instead of tail -f /var/log/syslog you have to run journalctl -xf, in near same way...
journalctl -axf | sed -une '/Regular Expression/p'
But read man page, this tool was built for log analyses!
4. Integrating this in a bash script
Colored output of two files (or more)
Here is a sample of script watching for many files, coloring ouptut differently for 1st file than others:
#!/bin/bash
tail -F "$#" |
sed -une "
/^==> /{h;};
//!{
G;
s/^\\(.*\\)\\n==>.*${1//\//\\\/}.*<==/\\o33[47m\\1\\o33[0m/;
s/^\\(.*\\)\\n==> .* <==/\\o33[47;31m\\1\\o33[0m/;
p;}"
They work fine on my host, running:
sudo ./myColoredTail /var/log/{kern.,sys}log
Interactive script
You may be watching logs for reacting on events?
Here is a little script playing some sound when some USB device appear or disappear, but same script could send mail, or any other interaction, like powering on coffe machine...
#!/bin/bash
exec {tailF}< <(tail -F /var/log/kern.log)
tailPid=$!
while :;do
read -rsn 1 -t .3 keyboard
[ "${keyboard,}" = "q" ] && break
if read -ru $tailF -t 0 _ ;then
read -ru $tailF line
case $line in
*New\ USB\ device\ found* ) play /some/sound.ogg ;;
*USB\ disconnect* ) play /some/othersound.ogg ;;
esac
printf "\r%s\e[K" "$line"
fi
done
echo
exec {tailF}<&-
kill $tailPid
You could quit by pressing Q key.
you certainly won't succeed with
tail -f /var/log/foo.log |grep --line-buffered string2search
when you use "colortail" as an alias for tail, eg. in bash
alias tail='colortail -n 30'
you can check by
type alias
if this outputs something like
tail isan alias of colortail -n 30.
then you have your culprit :)
Solution:
remove the alias with
unalias tail
ensure that you're using the 'real' tail binary by this command
type tail
which should output something like:
tail is /usr/bin/tail
and then you can run your command
tail -f foo.log |grep --line-buffered something
Good luck.
Use awk(another great bash utility) instead of grep where you dont have the line buffered option! It will continuously stream your data from tail.
this is how you use grep
tail -f <file> | grep pattern
This is how you would use awk
tail -f <file> | awk '/pattern/{print $0}'

How to turn off auto add escape character feature in Linux

I am trying to pass a command to a remote server using ssh. while my commands have some characters like ", $, ', \ which often requires a backslash as a escape character except ' (single quote), but the system is automatically taking an escape character \ before the single codes while execution. Can some one help me how to turn off this.
OS : RHEL
my Code :
ssh -q $server "ps -ef | grep mongo | grep conf | awk '{print \$(NF-2)}'
While execution, the code becomes
ssh -q $server "ps -ef | grep mongo | grep conf | awk \'{print $(NF-2)}\'"
I need to turn off this feature
Your analysis isn't really correct. Anyhow, there is no particular reason to run Awk remotely, or grep at all here (because Awk does all of that nicely and efficiently with a very minor refactoring):
ssh -q "$server" ps -ef |
# This runs locally instead
awk '/mongo/ && /conf/ {print $(NF-2)}'

xargs bash -c unexpected token

I'm experiencing an issue calling xargs inside a bash script to parallelize the launch of a function.
I have this line:
grep -Ev '^#|^$' "$listOfTables" | xargs -d '\n' -l1 -I args -P"$parallels" bash -c "doSqoop 'args'"
that launches the function doSqoop that I previously exported.
I am passing to xargs and then to bash -c a single, very long line, containing fields that I split and handle inside the function.
It is something like schema|tab|dest|desttab|query|splits|.... that I read from a file, via the grep command above. I am fine with this solution, I know xargs can split the line on | but I'm ok this way.
It used to work well since I had to add another field at the end, which contains this kind of value:
field1='varchar(12)',field2='varchar(4)',field3='timestamp',....
Now I have this error:
bash: -c: line 0: syntax error near unexpected token '('
I tried to escape the pharhentesis and and single quotes, without success.
It appears to me that bash -c is interpreting the arguments
Use GNU parallel that can call exported functions, and also has an easier syntax and much more capabilities.
Your sample command should could be replaced with
grep -Ev '^#|^$' file | parallel doSqoop
Test with below script:
#!/bin/bash
doSqoop() {
printf "%s\n" "$#"
}
export -f doSqoop
grep -Ev '^#|^$' file | parallel doSqoop
You can also set the number of processes with the -P option, otherwise it matches the number of cores in your system:
grep -Ev '^#|^$' file | parallel -P "$num" doSqoop

UNIX shell script to run a list of grep commands from a file and getting result in a single delimited file

I am beginner in unix programming and a way to automate my work
I want to run a list a grep commands and get the output of all the grep command in a in a single delimited file .
i am using the following bash script. But it's not working .
Mockup sh file:
!/bin/sh
grep -l abcd123
grep -l abcd124
grep -l abcd125
and while running i used the following command
$ ./Mockup.sh > output.txt
Is it the right command?
How can I get both the grep command and output in the output file?
how can i delimit the output after each command and result?
How can I get both the grep command and output in the output file
You can use bash -v (verbose) to print each command before execution on stderr and it's output will be as usual be available on stdout:
bash -v ./Mockup.sh > output.txt 2>&1
cat output.txt
Working Demo
A suitable shell script could be
#!/bin/sh
grep -l 'abcd123\|abcd124\|abcd125' "$#"
provided that the filenames you pass on the invocation of the script are "well behaved", that is no whitespace in them. (Edit Using the "$#" expansion takes care of generic whitespace in the filenames, tx to triplee for his/her comment)
This kind of invocation (with alternative matching strings, as per the \| syntax) has the added advantage that you have exactly one occurrence of a filename in your final list, because grep -l prints once the filename as soon as it finds the first occurrence of one of the three strings in a file.
Addendum about "$#"
% ff () { for i in "$#" ; do printf "[%s]\n" "$i" ; done ; }
% # NB "a s d" below is indeed "a SPACE s TAB d"
% ff "a s d" " ert " '345
345'
[a s d]
[ ert ]
[345
345]
%
cat myscript.sh
########################
#!/bin/bash
echo "Trying to find the file contenting the below string, relace your string with below string"
grep "string" /path/to/folder/* -R -l
########################
save above file and run it as below
sh myscript.sh > output.txt
once the command prmpt get return you can check the output.txt for require output.
Another approach, less efficient, that tries to address the OP question
How can I get both the grep command and output in the output file?
% cat Mockup
#!/bin/sh
grep -o -e string1 -e string2 -e string3 "$#" 2> /dev/null | sort -t: -k2 | uniq
Output: (mocked up as well)
% sh Mockup file{01..99}
file01:string1
file17:string1
file44:string1
file33:string2
file44:string2
file48:string2
%
looking at the output from POV of a consumer, one foresees problems with search strings and/or file names containing colons... oh well, that's another Q maybe

Resources