Bash Sorting STDIN - linux

I want to write a bash script that sorts the input by rules into different files. The first rule is to write all chars or strings to file1. The second rule is to write all numbers to file2. The third rule is to write all alphanumerical strings to file3. All special chars must be ignored. Because I am not familiar with bash, I don't know how to accomplish this.
Could someone help me?
Thanks,
Haniball
Thanks for the answers,
I wrote this script,
#!/bin/bash
inp=0
echo "Which filename for strings?"
read strg
touch $strg
echo "Which filename for nums?"
read nums
touch $nums
echo "Which filename for alphanumerics?"
read alphanums
touch $alphanums
while [ "$inp" != "quit" ]
do
echo "Input: "
read inp
echo $inp | grep -o '\<[a-zA-Z]\+\>' > $strg
echo $inp | grep -o '\<[0-9]\>' > $nums
echo $inp | grep -o -E '\<[0-9]{2,}\>' > $nums
done
After I ran it, it only writes string in the stringfile.
Greetings, Haniball

Sure can help. See here:
How To Ask Questions The Smart Way
Help Vampires: A Spotter’s Guide
A cool site about bash is here: http://wiki.bash-hackers.org/doku.php
for sorting try man sort
for pattern matching try man grep
other useful tools: man sed man awk man strings man tee
And it is always correct to tag your homework as "homework" ;)
You can try something like:
<input_file strings -1 -a | tee chars_and_strings.txt |
grep "^[A-Za-z0-9][A-Za-z0-9]*$" | tee alphanum.txt |
grep "^[0-9][0-9]*$" > numonly.txt
The above works only for ASCII; with international (read: Unicode) characters, things get a little more complicated.
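If your grep honors the locale, POSIX character classes get you part of the way toward international input; a sketch of the same pipeline under that assumption (behavior still depends on your LC_CTYPE setting):
<input_file strings -1 -a | tee chars_and_strings.txt |
grep "^[[:alnum:]][[:alnum:]]*$" | tee alphanum.txt |
grep "^[[:digit:]][[:digit:]]*$" > numonly.txt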

grep is sufficient (your question is a bit vague; if I got something wrong, let me know...)
Using the following input file:
this is a string containing words,
single digits as in 1 and 2 as well
as whole numbers 42 1066
all chars or strings
$ grep -o '\<[a-zA-Z]\+\>' sorting_input
this
is
a
string
containing
words
single
digits
as
in
and
as
well
all single digit numbers
$ grep -o '\<[0-9]\>' sorting_input
1
2
all multiple digit numbers
$ grep -o -E '\<[0-9]{2,}\>' sorting_input
42
1066
Redirect the output to a file, e.g. grep ... > file1
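One more note on your loop: each plain > truncates its target on every iteration, so earlier matches are lost. Appending with >> behaves better; a sketch of just the loop, reusing the variables from your script:
while [ "$inp" != "quit" ]
do
    echo "Input: "
    read inp
    echo "$inp" | grep -o '\<[a-zA-Z]\+\>' >> "$strg"
    echo "$inp" | grep -o '\<[0-9]\>' >> "$nums"
    echo "$inp" | grep -o -E '\<[0-9]{2,}\>' >> "$nums"
done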

Bash really isn't the best language for this kind of task. While it is possible, I'd highly recommend using perl, python, or tcl instead.
That said, you can write all of stdin to a temporary file with shell redirection. Then use a command like grep to write the matches to other files. It might look something like this.
#!/bin/bash
cat > temp
grep pattern1 temp > file1
grep pattern2 temp > file2
grep pattern3 temp > file3
rm -f temp
Then run it like this:
cat file_to_process | ./script.sh
I'll leave the specifics of the pattern matching to you.
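Should you want a starting point anyway, the patterns might look something like this (a sketch only, assuming one token per line and that "ignore specials" just means special characters match none of the three rules):
#!/bin/bash
cat > temp
grep -E '^[[:alpha:]]+$' temp > file1    # letters only -> strings
grep -E '^[[:digit:]]+$' temp > file2    # digits only -> numbers
grep -E '^[[:alnum:]]+$' temp | grep '[[:alpha:]]' | grep '[[:digit:]]' > file3    # both letters and digits -> alphanumerics
rm -f temp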

Related

grep 2 words in if statements in Bash

I am trying to see if my nohup file contains the words that I am looking for. If it does, then I need to put those lines into a tmp file.
So I am currently using:
if grep -q "Started|missing" $DIR3/$dirName/nohup.out
then
grep -E "Started|missing" "$DIR3/$dirName/nohup.out" > tmp
fi
But it never goes into the if statement even if there are words that I am looking for.
How can I fix this?
Since basic grep uses BRE, the regex alternation operator is written \| there; a plain | matches a literal | symbol. You don't need to escape the | in grep -E, which uses ERE.
if grep -q "Started\|missing" $DIR3/$dirName/nohup.out
You should use egrep instead of grep (Avinash Raj has explained that in other words already in his answer).
I would generally recommend using egrep as a default for everyday use (even though many expressions only need basic regular expression syntax). From a practical point of view, standard grep is only interesting for performance reasons.
Details about the advantages of grep vs. egrep can be found in that superuser question.
If you only want to put the grep results into the tmp file, you do not need to grep the file twice. But you cannot simply use
egrep "Started|missing" $DIR3/$dirName/nohup.out > tmp
since that would create an empty tmp file when nothing is found.
You can remove empty files with if [ ! -s tmp ] or use another solution:
Redirecting the grep results without grepping again can be done with
rm -f tmp 2>/dev/null
egrep "Started|missing" $DIR3/$dirName/nohup.out | while read -r strange_line; do
echo "${strange_line}" >> tmp
done
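Alternatively, since grep's exit status already tells you whether anything matched, you can capture the output once and write the file only on success; a sketch using the same paths as above:
if matches=$(grep -E "Started|missing" "$DIR3/$dirName/nohup.out"); then
    printf '%s\n' "$matches" > tmp    # grep exited 0, so there is something to write
fi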

Backticks can't handle pipes in variable

I have a problem with a bash script using the cat command.
This works:
#!/bin/bash
fil="| grep LSmonitor";
log="/var/log/sys.log ";
lines=`cat $log | grep LSmonitor | wc -l`;
echo $lines;
Output: 139
This does not:
#!/bin/bash
fil="| grep LSmonitor";
log="/var/log/sys.log ";
string="cat $log $fil | wc -l";
echo $string;
`$string`;
Output:
cat /var/log/sys.log | grep LSmonitor | wc -l
cat: invalid option -- 'l'
Try 'cat --help' for more information.
$fil is static in this example, but in the real script the parameter comes from an HTML form POST, and if I print it I can see that the content of $fil is correct.
In this case, since you're building a pipeline as a string, you would need:
eval "$string"
But DON'T DO THIS!!!! -- someone can easily enter the filter
; rm -rf *
and then you're hosed.
If you want a regex-based filter, get the user to just enter the regex, and then you'll do:
grep "$fil" "$log" | wc -l
Firstly, allow me to say that this sounds like a really bad idea:
[…] in real script, parameter is get from html form POST, […]
You should not be allowing the content of POST requests to be run by your shell. This is a massive attack vector, and whatever mechanisms you have in place to try to protect it are probably not as effective as you think.
Secondly, a | inside a variable is not treated as special. This isn't specific to backticks. Parameter expansion (e.g., replacing $fil with | grep LSmonitor) happens after the command is parsed and mostly processed. There's a little bit of post-processing done on the results of parameter expansion (including "word splitting", which is why $fil is equivalent to the three arguments '|' grep LSmonitor rather than to the single argument '| grep LSmonitor'), but nothing as dramatic as you describe. So, for example, this:
pipe='|'
echo $pipe cat
prints this:
| cat
Since your use-case is so frightening, I'm half-tempted to not explain how you can do what you want — I think you'll be better off not doing this — but since Stack Overflow answers are intended to be useful for more people than just the original poster, an example of how you can do this is below. I encourage the OP not to read on.
fil='| grep LSmonitor'
log=/var/log/sys.log
string="cat $log $fil | wc -l"
lines="$(eval "$string")"
echo "$lines"
Try using eval (taken from https://stackoverflow.com/a/11531431/2687324).
It looks like it's interpreting | as a string, not a pipe, so when it reaches -l, it treats it as if you're trying to pass in -l to cat instead of wc.
The other answers outline why you shouldn't do it this way.
grep LSmonitor /var/log/syslog | wc -l will do what you're looking for.
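If all you need is the count, grep -c does the line counting itself, which also sidesteps the pipe you were trying to store; a sketch, with a fixed pattern standing in for the (validated!) user input:
fil='LSmonitor'                    # treat the POST data as a plain pattern, never as shell code
log=/var/log/sys.log
lines=$(grep -c -- "$fil" "$log")  # -c counts matching lines, replacing | wc -l
echo "$lines"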

UNIX shell script to run a list of grep commands from a file and get the results in a single delimited file

I am a beginner in UNIX programming, looking for a way to automate my work.
I want to run a list of grep commands and get the output of all the grep commands in a single delimited file.
I am using the following bash script, but it's not working.
Mockup sh file:
#!/bin/sh
grep -l abcd123
grep -l abcd124
grep -l abcd125
and while running i used the following command
$ ./Mockup.sh > output.txt
Is it the right command?
How can I get both the grep command and output in the output file?
How can I delimit the output after each command and result?
How can I get both the grep command and output in the output file
You can use bash -v (verbose) to print each command on stderr before execution; its output will, as usual, be available on stdout:
bash -v ./Mockup.sh > output.txt 2>&1
cat output.txt
Working Demo
A suitable shell script could be
#!/bin/sh
grep -l 'abcd123\|abcd124\|abcd125' "$@"
provided that the filenames you pass when invoking the script are "well behaved", that is, no whitespace in them. (Edit: using the "$@" expansion takes care of generic whitespace in the filenames, tx to triplee for his/her comment)
This kind of invocation (with alternative matching strings, as per the \| syntax) has the added advantage that each filename occurs at most once in your final list, because grep -l prints the filename once, as soon as it finds the first occurrence of any of the three strings in a file.
Addendum about "$#"
% ff () { for i in "$@" ; do printf "[%s]\n" "$i" ; done ; }
% # NB "a s d" below is indeed "a SPACE s TAB d"
% ff "a s d" " ert " '345
345'
[a s d]
[ ert ]
[345
345]
%
cat myscript.sh
########################
#!/bin/bash
echo "Trying to find the file contenting the below string, relace your string with below string"
grep "string" /path/to/folder/* -R -l
########################
Save the above file and run it as below:
sh myscript.sh > output.txt
Once the command prompt returns, you can check output.txt for the required output.
Another approach, less efficient, that tries to address the OP's question
How can I get both the grep command and output in the output file?
% cat Mockup
#!/bin/sh
grep -o -e string1 -e string2 -e string3 "$@" 2> /dev/null | sort -t: -k2 | uniq
Output: (mocked up as well)
% sh Mockup file{01..99}
file01:string1
file17:string1
file44:string1
file33:string2
file44:string2
file48:string2
%
Looking at the output from the POV of a consumer, one foresees problems with search strings and/or file names containing colons... oh well, that's another question, maybe.
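If you also want each command echoed ahead of its results, with a separator after each block, a small loop will do; a sketch (the patterns are the ones from the question, and the files to search are passed as arguments):
#!/bin/sh
# print each grep command, then its results, then a delimiter line
for pat in abcd123 abcd124 abcd125; do
    echo "### grep -l $pat ###"
    grep -l "$pat" "$@"
    echo "----------------------"
done
Invoked as ./Mockup.sh file1 file2 ... > output.txt, this yields one delimited block per pattern.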

Pipe output to use as the search specification for grep on Linux

How do I pipe the output of grep as the search pattern for another grep?
As an example:
grep <Search_term> <file1> | xargs grep <file2>
I want the output of the first grep to act as the search term for the second grep. The above command treats the output of the first grep as the file name for the second grep. I tried using the -e option for the second grep, but it does not work either.
You need to use xargs's -i switch:
grep ... | xargs -ifoo grep foo file_in_which_to_search
This takes the option after -i (foo in this case) and replaces every occurrence of it in the command with the output of the first grep.
This is the same as:
grep `grep ...` file_in_which_to_search
Try
grep ... | fgrep -f - file1 file2 ...
If using Bash then you can use backticks:
> grep -e "`grep ... ...`" files
the -e flag and the double quotes are there to ensure that any output from the initial grep that starts with a hyphen isn't then interpreted as an option to the second grep.
Note that the double quoting trick (which also ensures that the output from grep is treated as a single parameter) only works with Bash. It doesn't appear to work with (t)csh.
Note also that backticks are the standard way to get the output from one program into the parameter list of another. Not all programs have a convenient way to read parameters from stdin the way that (f)grep does.
I wanted to search for text in files (using grep) that had a certain pattern in their file names (found using find) in the current directory. I used the following command:
grep -i "pattern1" $(find . -name "pattern2")
Here pattern2 is the pattern in the file names and pattern1 is the pattern searched for
within files matching pattern2.
edit: Not strictly piping but still related and quite useful...
This is what I use to search for a file from a listing:
ls -la | grep 'file-in-which-to-search'
Okay, breaking the rules here, as this isn't an answer but a note: I can't get any of these solutions to work.
% fgrep -f test file
works fine.
% cat test | fgrep -f - file
fgrep: -: No such file or directory
fails.
% cat test | xargs -ifoo grep foo file
xargs: illegal option -- i
usage: xargs [-0opt] [-E eofstr] [-I replstr [-R replacements]] [-J replstr]
[-L number] [-n number [-x]] [-P maxprocs] [-s size]
[utility [argument ...]]
fails. Note that a capital I is necessary. If I use that, all is good.
% grep "`cat test`" file
kinda works in that it returns a line for the terms that match, but it also prints "grep: line 3 in test: No such file or directory" for each file that doesn't find a match.
Am I missing something or is this just differences in my Darwin distribution or bash shell?
I tried it this way, and it works great.
[opuser@vjmachine abc]$ cat a
not problem
all
problem
first
not to get
read problem
read not problem
[opuser@vjmachine abc]$ cat b
not problem xxy
problem abcd
read problem werwer
read not problem 98989
123 not problem 345
345 problem tyu
[opuser@vjmachine abc]$ grep -e "`grep problem a`" b --col
not problem xxy
problem abcd
read problem werwer
read not problem 98989
123 not problem 345
345 problem tyu
[opuser@vjmachine abc]$
You should grep in such a way as to extract file names only; see the parameter -l (the lowercase L):
grep -l someSearch * | xargs grep otherSearch
With a plain grep, the output carries much more information than the file names alone. For instance, when you do
grep someSearch *
You will pipe to xargs info like this
filename1: blablabla someSearch blablabla something else
filename2: bla someSearch bla otherSearch
...
Piping lines like these to xargs makes no sense.
But when you do grep -l someSearch *, your output will look like this:
filename1
filename2
Such an output can be passed now to xargs
I have found the following command to work, using $() with my first command inside the parentheses so the shell executes it first.
grep "$(dig +short <hostname>)" file
I use this to look through files for an IP address when I am given a host name.
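For example (www.example.com standing in for the given host, and the log file made up):
grep "$(dig +short www.example.com)" /var/log/access.log
Note that dig +short can print several lines (e.g. for a CNAME chain), in which case you would want to keep only the final address.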

Linux using grep to print the file name and first n characters

How do I use grep to perform a search which, when a match is found, will print the file name as well as the first n characters in that file? Note that n is a parameter that can be specified and it is irrelevant whether the first n characters actually contains the matching string.
grep -l pattern *.txt |
while read line; do
    echo -n "$line: "
    head -c $n "$line"
    echo
done
Change -c to -n if you want to see the first n lines instead of bytes.
You need to pipe the output of grep to sed to accomplish what you want. Here is an example:
grep mypattern *.txt | sed 's/^\([^:]*:.......\).*/\1/'
The number of dots is the number of characters you want to print. Many versions of sed provide an option, like -r (GNU/Linux) or -E (FreeBSD), that enables modern-style regular expressions, making it possible to specify the number of characters numerically.
N=7
grep mypattern *.txt /dev/null | sed -r "s/^([^:]*:.{$N}).*/\1/"
Note that this solution is a lot more efficient than the others proposed, which invoke multiple processes.
There are few tools that print 'n characters' rather than 'n lines'. Are you sure you really want characters and not lines? The whole thing can perhaps be best done in Perl. As specified (using grep), we can do:
pattern="$1"
shift
n="$2"
shift
grep -l "$pattern" "$#" |
while read file
do
echo "$file:" $(dd if="$file" count=${n}c)
done
The quotes around $file preserve multiple spaces in file names correctly. We can debate the command line usage, currently (assuming the command name is 'ngrep'):
ngrep pattern n [file ...]
I note that @litb used 'head -c $n'; that's neater than the dd command I used. There might be some systems without head (but they'd be pretty archaic). I note that the POSIX version of head only supports -n and a number of lines; the -c option is probably a GNU extension.
Two thoughts here:
1) If efficiency were not a concern (like that would ever happen), you could check $status [csh] after running grep on each file, e.g. (for N characters = 25):
foreach FILE ( file1 file2 ... fileN )
    grep targetToMatch ${FILE} > /dev/null
    if ( $status == 0 ) then
        echo -n "${FILE}: "
        head -c25 ${FILE}
    endif
end
2) GNU [FSF] head contains a --verbose [-v] switch. It also offers --null, to accommodate filenames with spaces. And there's '--', to handle filenames like "-c". So you could do:
grep --null -l targetToMatch -- file1 file2 ... fileN |
xargs --null head -v -c25 --
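With GNU head, -v prints a ==> file <== header before each chunk, so the combined output looks roughly like this (file names made up):
==> report1.txt <==
<first 25 bytes of report1.txt>
==> report2.txt <==
<first 25 bytes of report2.txt>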
