UNIX shell script to run a list of grep commands from a file and get the result in a single delimited file

I am a beginner in UNIX programming and am looking for a way to automate my work.
I want to run a list of grep commands and get the output of all of them in a single delimited file.
I am using the following bash script, but it's not working.
Mockup sh file:
#!/bin/sh
grep -l abcd123
grep -l abcd124
grep -l abcd125
and while running it I used the following command:
$ ./Mockup.sh > output.txt
Is it the right command?
How can I get both the grep command and output in the output file?
How can I delimit the output after each command and its result?

How can I get both the grep command and output in the output file
You can use bash -v (verbose) to print each command to stderr before execution; its output will, as usual, be available on stdout:
bash -v ./Mockup.sh > output.txt 2>&1
cat output.txt
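If you want explicit delimiters rather than relying on bash -v, another option is to restructure the mockup itself. A minimal sketch, assuming the files to search are passed as arguments (the patterns are the ones from the question, the separator string is arbitrary):
#!/bin/sh
# Hypothetical rewrite of Mockup.sh: print each grep command, then its
# output, then a separator, so output.txt is delimited per command and result.
for pattern in abcd123 abcd124 abcd125; do
    echo "grep -l $pattern $*"    # the command being run
    grep -l "$pattern" "$@"       # its output: names of matching files
    echo "--------"               # delimiter after each command/result pair
done
Invoked, for example, as ./Mockup.sh *.log > output.txt (the *.log file list is a placeholder).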

A suitable shell script could be
#!/bin/sh
grep -l 'abcd123\|abcd124\|abcd125' "$@"
provided that the filenames you pass on invocation of the script are "well behaved", that is, have no whitespace in them. (Edit: using the "$@" expansion takes care of generic whitespace in the filenames; thanks to triplee for the comment.)
This kind of invocation (with alternative matching strings, as per the \| syntax) has the added advantage that each filename appears exactly once in your final list, because grep -l prints the filename once, as soon as it finds the first occurrence of any of the three strings in a file.
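For example (file names here are placeholders), saving this as Mockup.sh and running
./Mockup.sh *.log > output.txt
leaves in output.txt the name of each log file that contains at least one of the three strings, listed exactly once.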
Addendum about "$@"
% ff () { for i in "$@" ; do printf "[%s]\n" "$i" ; done ; }
% # NB "a s d" below is indeed "a SPACE s TAB d"
% ff "a s d" " ert " '345
345'
[a s d]
[ ert ]
[345
345]
%

cat myscript.sh
########################
#!/bin/bash
echo "Trying to find the file contenting the below string, relace your string with below string"
grep "string" /path/to/folder/* -R -l
########################
Save the above file and run it as below:
sh myscript.sh > output.txt
Once the command prompt returns, you can check output.txt for the required output.

Another, less efficient approach that tries to address the OP's question:
How can I get both the grep command and output in the output file?
% cat Mockup
#!/bin/sh
grep -o -e string1 -e string2 -e string3 "$@" 2> /dev/null | sort -t: -k2 | uniq
Output: (mocked up as well)
% sh Mockup file{01..99}
file01:string1
file17:string1
file44:string1
file33:string2
file44:string2
file48:string2
%
Looking at the output from the POV of a consumer, one foresees problems with search strings and/or file names containing colons... oh well, that's another question, maybe.

Related

Grep function not stopping with head pipe

So I'm currently trying to grep a single result from a random file in a specific directory. The grepping works just fine and the expected output file is populated as expected, but for some reason, even after the output file has already been filled, the process won't stop. This is the grep command where the program seems to be getting stuck.
searchFILE(){
case $2 in
pref)
echo "Populating output file: $3-$1.data.out"
dataOutputFile="$3-$1.data.out"
zgrep -a "\"someParameter\"\:\"$1\"" /folder/anotherFolder/filetemplate.log.* | zgrep -a "\"parameter2\"\:\"$3\"" | head -1 > $dataOutputFile
;;
*)
echo "Unrecognized command"
;;
esac
echo "Query finished"
}
What is currently happening is that the output file is being populated as expected with the head pipe, but for some reason I'm not getting the "Query finished" message, and the process seems not to stop at all.
grep does not know that head -n1 is no longer reading from the pipe until it attempts to write to the pipe, which it will only do if another match is found. There is no direct communication between the processes. It will eventually stop, but only once all the data is read, a second match is found and write fails with EPIPE, or some other error occurs.
You can watch this happen in a simple pipeline like this:
cat /dev/urandom | grep -ao "12[0-9]" | head -n1
With a sufficiently rare pattern, you will observe a delay between output and exit.
One solution is to change your stop condition. Instead of waiting for SIGPIPE as your pipeline does, wait for grep to match once using the -m1 option:
cat /dev/urandom | grep -ao -m1 "12[0-9]"
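Applied to the zgrep pipeline from the question, that means dropping head -1 and letting the second stage stop itself after the first match. A sketch reusing the question's own paths and variables (the second stage can be a plain grep, since its input is already decompressed):
zgrep -a "\"someParameter\"\:\"$1\"" /folder/anotherFolder/filetemplate.log.* | grep -a -m1 "\"parameter2\"\:\"$3\"" > "$dataOutputFile"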
I saw better performance results with the zcat myZippedFile | grep whatever paradigm...
The first change you should try is piping through | head -z --lines=1
The reason is null-terminated lines instead of newlines (just in case).
My example script below worked (drop the case statement to make it simpler). If I hold onto $1 and $2 inside functions, things go wrong. I use named parameters and use $1, $2, and $@ only once, because it also goes wrong for me if I don't, and in any case you can then shift over the arguments and catch them. The $@ in the script itself is not the same as the arguments in bash functions.
grep searching for 2 or more parameters in any order means using grep twice; in your case zgrep | grep. The second grep is a normal grep! You only need the first grep to be zgrep to do the unzip. Your question is simpler if you drop the case statement, as bash case scares people off: bash was always an ugly lady that works well for short scripts.
zgrep searches text or compressed text, but newlines in LINUX style vs WINDOWS are not the same, so use dos2unix to convert files so that the newlines work. I use a compressed file simply because it is strange and rare to see zgrep, so it is demonstrated in a shell script with a compressed file! It works for me. I changed a few things, like >> and "sort -u", but you can obviously change them back.
#!/usr/bin/env bash
# Search for egA AND egB using option go
# COMMAND LINE: ./zgrp egA go egB
A="$1"
cOPT="$2" # expecting case go
B="$3"
LOG="./filetemplate.log" # use parameters for long names.
# Generate some data with gzip and delete the temporary file.
echo "\"pramA\":\"$A\" \"pramB\":\"$B\"" >> $B$A.tmp
rm -f ${LOG}.A; tar czf ${LOG}.A $B$A.tmp
rm -f $B$A.tmp
# Use parameterised $names, not $1 etc., because you may want to do shift etc.
searchFILE()
{
outFile="$B-$A.data.out"
case $cOPT in
go) # This is zgrep | grep NOT zgrep | zgrep
zgrep -a "\"pramA\":\"$A\"" ${LOG}.* | grep -a "\"pramB\":\"$B\"" | head -z --lines=1 >> $outFile
sort -u $outFile > ${outFile}.sorted # sort unique on your output.
;;
*) echo -e "ERROR second argument must be go.\n Usage: ./zgrp egA go egB"
exit 9
;;
esac
echo -e "\n ============ Done: $0 $# Fin. ============="
}
searchFILE "$#"
cat ${outFile}.sorted

Using sed to set a variable works on the command line, but not in a bash script

I have looked quite a bit for answers but I am not finding any suggestions that have worked so far.
on command line, this works:
$ myvar=$( cat -n /usr/share/dict/cracklib-small | grep $myrand | sed -e "s/$myrand//" )
$ echo $myvar
$ commonness
However, inside a bash script the same exact lines just echo out a blank line.
Notes: $myrand is a number, like 10340, generated with $RANDOM.
cat prints out a dictionary with line numbers.
grep grabs the line with $myrand in it; e.g. 10340 commonness.
sed is intended to remove the $myrand part of the line and replace it with nothing. Here is my sample script:
#!/bin/bash
# prints out a random word
myrand=$RANDOM
export myrand
myword=$( cat -n /path/to/dict/cracklib-small | grep myrand | sed -e "s/$myrand//g" <<<"$myword" )
echo $myword
Your command line code is running:
grep $myrand
Your script is running:
grep myrand
These are not the same thing; the latter is looking for a word that contains "myrand" within it, not a random number.
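For reference, a minimal corrected version of the script, keeping the question's dictionary path, restoring the $, and dropping the stray <<<"$myword" so that sed reads from the pipe as it does in the working command line:
#!/bin/bash
# prints out a random word
myrand=$RANDOM
myword=$( cat -n /path/to/dict/cracklib-small | grep "$myrand" | sed -e "s/$myrand//" )
echo "$myword"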
By the way -- I'd suggest a different way to get a random line. If you have GNU coreutils, the shuf tool is built-to-purpose:
myword=$(shuf -n 1 /path/to/dict/cracklib-small)
#!/bin/bash
# prints out a random word
myrand=$RANDOM
export myrand
myword=$( cat -n /path/to/dict/cracklib-small | grep myrand | sed -e "s/$myrand//g" <<<"$myword" )
echo $myword
Where is the $ sign in grep myrand?
You must put in some work before posting it here.

Make a variable containing all digits from the stdout of the command run directly before it

I am trying to make a bash shell script that launches some jobs on a queuing system. After a job is launched, the launch command prints the job-id to the stdout, which I would like to 'trap' and then use in the next command. The job-id digits are the only digits in the stdout message.
#!/bin/bash
./some_function
>>> this is some stdout text and the job number is 1234...
and then I would like to get to:
echo $job_id
>>> 1234
My current method is using a tee command to pipe the original command's stdout to a tmp.txt file and then making the variable by grepping that file with a regex filter...something like:
echo 'pretend this is some dummy output from a function 1234' 2>&1 | tee tmp.txt
job_id=`cat tmp.txt | grep -o '[0-9]'`
echo $job_id
>>> pretend this is some dummy output from a function 1234
>>> 1 2 3 4
...but I get the feeling this is not really the most elegant or 'standard' way of doing this. What is the better way to do this?
And for bonus points, how do I remove the spaces from the grep+regex output?
You can use grep -o when you call your script:
jobid=$(echo 'pretend this is some dummy output from a function 1234' 2>&1 |
tee tmp.txt | grep -Eo '[0-9]+$')
echo "$jobid"
1234
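If the temp file isn't actually needed, the id can also be captured directly (a sketch; ./some_function stands in for whatever launch command prints the job-id message):
job_id=$(./some_function 2>&1 | grep -Eo '[0-9]+')
echo "$job_id"
Because grep -o prints the whole run of digits as a single match, there are no spaces to strip.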
Something like this should work:
$ JOBID=`./some_function | sed 's/[^0-9]*\([0-9]*\)[^0-9]*/\1/'`
$ echo $JOBID
1234

Bash Sorting STDIN

I want to write a bash script that sorts the input by rules into different files. The first rule is to write all chars or strings in file1. The second rule is to write all numbers in file2. The third rule is to write all alphanumerical strings in file3. All special chars must be ignored. Because I am not familiar with bash I don't know how to realize this.
Could someone help me?
Thanks,
Haniball
Thanks for the answers,
I wrote this script,
#!/bin/bash
inp=0
echo "Which filename for strings?"
read strg
touch $strg
echo "Which filename for nums?"
read nums
touch $nums
echo "Which filename for alphanumerics?"
read alphanums
touch $alphanums
while [ "$inp" != "quit" ]
do
echo "Input: "
read inp
echo $inp | grep -o '\<[a-zA-Z]\+\>' > $strg
echo $inp | grep -o '\<[0-9]\>' > $nums
echo $inp | grep -o -E '\<[0-9]{2,}\>' > $nums
done
After I ran it, it only writes strings to the string file.
Greetings, Haniball
Sure can help. See here:
How To Ask Questions The Smart Way
Help Vampires: A Spotter’s Guide
A cool site about bash is here: http://wiki.bash-hackers.org/doku.php
for sorting try man sort
for pattern matching try man grep
other useful tools: man sed man awk man strings man tee
And it is always correct to tag your homework as "homework" ;)
You can try something like:
<input_file strings -1 -a | tee chars_and_strings.txt |\
grep "^[A-Za-z0-9][A-Za-z0-9]*$" | tee alphanum.txt |\
grep "^[0-9][0-9]*$" > numonly.txt
The above is only for USA - no international (read unicode) chars, where things become a little bit more complicated.
grep is sufficient (your question is a bit vague. If I got something wrong, let me know...)
Using the following input file:
this is a string containing words,
single digits as in 1 and 2 as well
as whole numbers 42 1066
all chars or strings
$ grep -o '\<[a-zA-Z]\+\>' sorting_input
this
is
a
string
containing
words
single
digits
as
in
and
as
well
all single digit numbers
$ grep -o '\<[0-9]\>' sorting_input
1
2
all multiple digit numbers
$ grep -o -E '\<[0-9]{2,}\>' sorting_input
42
1066
Redirect the output to a file, i.e. grep ... > file1
Bash really isn't the best language for this kind of task. While it's possible, I'd highly recommend the use of perl, python, or tcl for this.
That said, you can write all of stdin to a temporary file with shell redirection, then use a command like grep to output matches to another file. It might look something like this:
#!/bin/bash
cat > temp
grep pattern1 temp > file1
grep pattern2 temp > file2
grep pattern3 temp > file3
rm -f temp
Then run it like this:
cat file_to_process | ./script.sh
I'll leave the specifics of the pattern matching to you.
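For instance, the placeholder patterns could be filled in with the token-extracting greps from the earlier answer. A hypothetical completion (note the -o, so individual tokens rather than whole lines are written, and rule 3 is read here as tokens that mix letters and digits):
#!/bin/bash
cat > temp                              # save stdin to a temp file
grep -o '\<[a-zA-Z]\+\>' temp > file1   # rule 1: words
grep -o -E '\<[0-9]+\>' temp > file2    # rule 2: numbers
grep -o -E '\<[a-zA-Z0-9]+\>' temp | grep '[0-9]' | grep '[a-zA-Z]' > file3   # rule 3: mixed alphanumerics
rm -f temp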

Linux using grep to print the file name and first n characters

How do I use grep to perform a search which, when a match is found, will print the file name as well as the first n characters in that file? Note that n is a parameter that can be specified and it is irrelevant whether the first n characters actually contains the matching string.
grep -l pattern *.txt |
while read line; do
echo -n "$line: ";
head -c $n "$line";
echo;
done
Change -c to -n if you want to see the first n lines instead of bytes.
You need to pipe the output of grep to sed to accomplish what you want. Here is an example:
grep mypattern *.txt | sed 's/^\([^:]*:.......\).*/\1/'
The number of dots is the number of characters you want to print. Many versions of sed provide an option, like -r (GNU/Linux) and -E (FreeBSD), that allows you to use modern-style regular expressions. This makes it possible to specify the number of characters numerically.
N=7
grep mypattern *.txt /dev/null | sed -r "s/^([^:]*:.{$N}).*/\1/"
Note that this solution is a lot more efficient than others proposed, which invoke multiple processes.
There are few tools that print 'n characters' rather than 'n lines'. Are you sure you really want characters and not lines? The whole thing can perhaps be best done in Perl. As specified (using grep), we can do:
pattern="$1"
shift
n="$2"
shift
grep -l "$pattern" "$#" |
while read file
do
echo "$file:" $(dd if="$file" count=${n}c)
done
The quotes around $file preserve multiple spaces in file names correctly. We can debate the command line usage, currently (assuming the command name is 'ngrep'):
ngrep pattern n [file ...]
I note that @litb used 'head -c $n'; that's neater than the dd command I used. There might be some systems without head (but they'd be pretty archaic). I note that the POSIX version of head only supports -n and a number of lines; the -c option is probably a GNU extension.
Two thoughts here:
1) If efficiency was not a concern (like that would ever happen), you could check $status [csh] after running grep on each file. E.g.: (For N characters = 25.)
foreach FILE ( file1 file2 ... fileN )
grep targetToMatch ${FILE} > /dev/null
if ( $status == 0 ) then
echo -n "${FILE}: "
head -c25 ${FILE}
endif
end
2) GNU [FSF] head contains a --verbose [-v] switch, and grep and xargs both offer --null to accommodate filenames with spaces. And there's '--', to handle filenames like "-c". So you could do:
grep --null -l targetToMatch -- file1 file2 ... fileN |
xargs --null head -v -c25 --
