VAR="1\n2\n3"
I'm trying to print the second-to-last line of this variable with a bash one-liner.
I've gotten this far: printf -- "$VAR" | head -2
However, it prints too much.
I can do this with a file no problem: tail -2 ~/file | head -1
You almost solved this yourself. Try:
VAR="1\n2\n3"; printf -- "$VAR"|tail -2|head -1
Here is one pure bash way of doing this:
readarray -t arr < <(printf -- "$VAR") && echo "${arr[-2]}"
2
You may also use this awk as a single command:
VAR="1\n2\n3"
awk -F '\\\\n' '{print $(NF-1)}' <<< "$VAR"
2
It may be more efficient to use a temporary variable and parameter expansions:
var=$'1\n2\n3' ; tmpvar=${var%$'\n'*} ; echo "${tmpvar##*$'\n'}"
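As a minimal sketch of how those two expansions cooperate, the same logic can be wrapped in a function. The name second_last_line is hypothetical and not part of the answer above, and the sketch assumes the variable holds real newlines (as $'...' produces), not literal \n sequences.

```bash
#!/usr/bin/env bash
# second_last_line is a hypothetical helper name used only for illustration.
second_last_line() {
  local var=$1
  # ${var%$'\n'*} removes the shortest suffix that starts with a newline,
  # i.e. the last line together with the newline before it.
  local tmp=${var%$'\n'*}
  # ${tmp##*$'\n'} removes the longest prefix that ends with a newline,
  # leaving only the final line of what remains.
  printf '%s\n' "${tmp##*$'\n'}"
}

second_last_line $'1\n2\n3'   # prints 2
```

If the input has only one line, neither pattern matches and that line itself is printed, so a caller that needs strict behavior should check the line count first.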
Use echo -e to interpret backslash escapes (translating \n to newlines), then select the line you are interested in by its number with NR.
$ echo -e "${VAR}" | awk 'NR==2'
2
With multiple lines, tail and head can be combined to print any particular line counted from the end.
$ echo -e "$VAR" | tail -2 | head -1
2
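The tail/head combination generalizes to any line counted from the end. The following is a sketch only; line_from_end is a hypothetical helper name, and it assumes the input has at least N lines.

```bash
#!/usr/bin/env bash
# line_from_end is a hypothetical helper: print the Nth line counted
# from the end of standard input.
line_from_end() {
  local n=$1
  # Keep the last N lines, then take the first of those.
  tail -n "$n" | head -n 1
}

VAR="1\n2\n3"
printf '%b\n' "$VAR" | line_from_end 2   # prints 2
```

When the input has fewer than N lines, tail returns everything and head prints the first line, so the result is silently wrong rather than empty.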
Or use a fancier sed: x swaps each line with the hold space, so the pattern space always holds the previous line, and $!d deletes it on every line except the last, where the previous line is printed:
$ echo -e "$VAR" | sed 'x;$!d'
2
We run the script below to count the occurrences of particular words in a log file.
We need suggestions to optimize the script.
Test.log size - approximately 500 to 600 MB
$wc -l Test.log
16609852 Test.log
po_numbers - 11k to 12k POs to search
$more po_numbers
xxx1335
AB1085
SSS6205
UY3347
OP9111
....and so on
Current Execution Time - 2.45 hrs
while IFS= read -r po
do
check=$(grep -c "PO_NUMBER=$po" Test.log)
echo $po "-->" $check >>list3
if [ "$check" = "0" ]
then
echo $po >>po_to_server
#else break
fi
done < po_numbers
You are reading your big file too many times when you execute
grep -c "PO_NUMBER=$po" Test.log
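You can try to split your big file into smaller ones, or write your patterns to a file and make grep use it. Inside the loop, append one pattern per line (do not add a blank line, because an empty pattern matches every line):
echo "PO_NUMBER=$po" >> patterns.txt
then, after the loop, search the log once:
grep -f patterns.txt Test.log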
$ grep -Fwf <(sed 's/.*/PO_NUMBER=&/' po_numbers) Test.log
This creates the lookup file from po_numbers on the fly (process substitution) and checks the log file for literal whole-word matches. It assumes the searched PO_NUMBER=xxx is a separate word; if not, remove -w. It also assumes the patterns are literal strings rather than regular expressions; if not, remove -F. Removing either option will slow down the search.
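To connect this back to the original goal (finding the POs that never occur), here is a rough sketch built on the same grep options. The helper name missing_pos and the sample data are made up for illustration, and it assumes PO_NUMBER=<po> appears as a separate word in the log.

```bash
#!/usr/bin/env bash
# missing_pos is a hypothetical helper: list the POs from file $1 that never
# appear in log $2 as PO_NUMBER=<po>. The log is read only once.
missing_pos() {
  local po_file=$1 log_file=$2
  # -o prints only the matched text, so each hit becomes one output line.
  grep -oFwf <(sed 's/^/PO_NUMBER=/' "$po_file") "$log_file" |
    sed 's/^PO_NUMBER=//' |
    sort -u |
    # Remove the POs that were seen; what is left was never found.
    grep -vxFf - "$po_file"
}

# Tiny made-up sample data, for illustration only.
dir=$(mktemp -d)
printf '%s\n' xxx1335 AB1085 SSS6205 > "$dir/po_numbers"
printf '%s\n' 'a PO_NUMBER=AB1085 b' 'c PO_NUMBER=xxx1335 d' > "$dir/Test.log"
missing_pos "$dir/po_numbers" "$dir/Test.log"   # prints SSS6205
rm -rf "$dir"
```

The final grep exits with status 1 when every PO was found, which is expected here and not an error.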
Using grep:
sed -e 's|^|PO_NUMBER=|' po_numbers | grep -o -F -f - Test.log | sed -e 's|^PO_NUMBER=||' | sort | uniq -c > list3
grep -o -F -f po_numbers list3 | grep -v -F -f - po_numbers > po_to_server
Using awk:
This awk program might be faster:
awk '(NR==FNR){ po[$0]=0; next }
{ for(key in po) {
str=$0
po[key]+=gsub("PO_NUMBER="key,"",str)
}
}
END {
for(key in po) {
if (po[key]==0) {print key >> "po_to_server" }
else {print key"-->"po[key] >> "list3" }
}
}' po_numbers Test.log
This does the following:
The first line loads the PO keys from the file po_numbers.
The second block parses the log file and counts the occurrences of PO_NUMBER=key on each line. (gsub is a function that performs a substitution and returns the substitution count.)
In the END block we print the requested output to the requested files.
The assumption here is that multiple patterns may occur multiple times on a single line of Test.log.
Comment: the original order of po_numbers is not preserved.
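Since the program above tries every key against every log line, a possible variation is to pull out the token that follows PO_NUMBER= and look it up in the array directly. This is only a sketch under assumptions not stated in the question: PO values consist solely of letters and digits (as in the sample), and the helper name count_pos is hypothetical. It also remembers the input order of po_numbers.

```bash
#!/usr/bin/env bash
# count_pos is a hypothetical helper: print "<po>--><count>" for every PO in
# file $1, counting occurrences of PO_NUMBER=<po> in log $2.
count_pos() {
  local po_file=$1 log_file=$2
  awk '
    # First file: load the PO keys and remember their original order.
    NR == FNR { count[$0] = 0; order[++n] = $0; next }
    {
      line = $0
      # Extract each PO_NUMBER=<token> on the line and do one array lookup.
      while (match(line, /PO_NUMBER=[A-Za-z0-9]+/)) {
        key = substr(line, RSTART + 10, RLENGTH - 10)   # 10 = length("PO_NUMBER=")
        if (key in count) count[key]++
        line = substr(line, RSTART + RLENGTH)
      }
    }
    END { for (i = 1; i <= n; i++) print order[i] "-->" count[order[i]] }
  ' "$po_file" "$log_file"
}

# Tiny made-up sample data, for illustration only.
dir=$(mktemp -d)
printf '%s\n' xxx1335 AB1085 SSS6205 > "$dir/po_numbers"
printf '%s\n' 'a PO_NUMBER=AB1085 b' 'PO_NUMBER=AB1085;PO_NUMBER=xxx1335' > "$dir/Test.log"
count_pos "$dir/po_numbers" "$dir/Test.log"
rm -rf "$dir"
```

Unlike the gsub version, this counts a PO only when it is the whole token after PO_NUMBER=, so a PO that is merely a prefix of a longer value is not counted.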
"finding out the occurrence"
Not sure if you mean to count the number of occurrences for each searched word or to output the lines in the log that contain at least one of the searched words. This is how you could solve it in the latter case:
(cat po_numbers; echo GO; cat Test.log) | \
perl -nle'$r?/$r/&&print:/GO/?($r=qr/@{[join"|",@s]}/):push@s,$_'
For example
echo "abc-1234a :" | grep <do-something>
to print only abc-1234a
I think these are closer to what you're getting at, but without knowing what you're really trying to achieve, it's hard to say.
echo "abc-1234a :" | egrep -o '^[^:]+'
... though this will also match lines that have no colon. If you only want lines with colons, and you must use only grep, this might work:
echo "abc-1234a :" | grep : | egrep -o '^[^:]+'
Of course, this only makes sense if your echo "abc-1234a :" is an example that would be replaced with possibly multiple lines of input.
The smallest tool you could use is probably cut:
echo "abc-1234a :" | cut -d: -f1
And sed is always available...
echo "abc-1234a :" | sed 's/ *:.*//'
For this last one, if you only want to print lines that include a colon, change it to:
echo "abc-1234a :" | sed -ne 's/ *:.*//p'
Heck, you could even do this in pure bash:
while read line; do
field="${line%%:*}"
# do stuff with $field
done <<<"abc-1234a :"
For information on the %% bit, you can man bash and search for "Parameter Expansion".
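As a small illustration of the difference between %% and %, and of how the trailing space left in front of the colon could be trimmed, consider this sketch. The helper name before_colon is hypothetical.

```bash
#!/usr/bin/env bash
# before_colon is a hypothetical helper name used only for illustration.
before_colon() {
  local line=$1
  # %% removes the longest suffix matching ':*', i.e. everything from the first colon.
  local field=${line%%:*}
  # Trim trailing whitespace: strip whatever follows the last non-blank character.
  field=${field%"${field##*[![:space:]]}"}
  printf '%s\n' "$field"
}

before_colon "abc-1234a :"     # prints abc-1234a

line="a:b:c"
printf '%s\n' "${line%%:*}"    # prints a    (longest match removed)
printf '%s\n' "${line%:*}"     # prints a:b  (shortest match removed)
```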
UPDATE:
You said:
It's the characters in the first line of input before the colon. The
input could have multiple line though.
The solutions with grep probably aren't your best choice, then, since they'll also print data from subsequent lines that might include colons. Of course, there are many ways to solve this requirement as well. We'll start with sample input:
$ function sample { printf "abc-1234a:foo\nbar baz:\nNarf\n"; }
$ sample
abc-1234a:foo
bar baz:
Narf
You could use multiple pipes, for example:
$ sample | head -1 | grep -Eo '^[^:]*'
abc-1234a
$ sample | head -1 | cut -d: -f1
abc-1234a
Or you could use sed to process only the first line:
$ sample | sed -ne '1s/:.*//p'
abc-1234a
Or tell sed to exit after printing the first line (which is faster than reading the whole file):
$ sample | sed 's/:.*//;q'
abc-1234a
Or do the same thing but only show output if a colon was found (for safety):
$ sample | sed -ne 's/:.*//p;q'
abc-1234a
Or have awk do the same thing (as the last 3 examples, respectively):
$ sample | awk '{sub(/:.*/,"")} NR==1'
abc-1234a
$ sample | awk 'NR>1{nextfile} {sub(/:.*/,"")} 1'
abc-1234a
$ sample | awk 'NR>1{nextfile} sub(/:.*/,"")'
abc-1234a
Or in bash, with no pipes at all:
$ read line < <(sample)
$ printf '%s\n' "${line%%:*}"
abc-1234a
It is possible to do what you want with only sed.
Here is an example:
#!/bin/sh
filename=$1
pattern=yourpattern
# the -n flag disables the default behavior of printing every line
sed -n "
/$pattern/q # stop at the first line containing the pattern, without printing it
p # print every line before it
" "$filename"
exit 0
This works at least for GNU's sed. It should work for other sed too, except
regarding the comments (some implementations of sed don't support comments).
Source: https://www.grymoire.com/Unix/Sed.html
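For comparison, the same "print everything before the first matching line" idea can be sketched with awk, which passes the pattern in as a variable instead of interpolating it into the script. This swaps in a different tool from the sed answer above, and lines_before is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# lines_before is a hypothetical helper: print the lines of stdin (or of the
# given files) that come before the first line matching the regex in $1.
lines_before() {
  local pattern=$1
  shift
  # Stop reading at the first match; print every line seen before it.
  awk -v pat="$pattern" '$0 ~ pat { exit } { print }' "$@"
}

printf 'one\ntwo\nyourpattern here\nthree\n' | lines_before yourpattern
# prints:
# one
# two
```

Note that awk -v interprets backslash escapes in the value, so a pattern containing backslashes needs them doubled.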
The file a.txt has two blank lines at the end:
[yaxin@oishi tmp]$ cat -n a.txt
1 jhasdfj
2
3 sdfjalskdf
4
5
and my script is:
[yaxin@oishi tmp]$ cat t.sh
#!/bin/sh
a=`cat a.txt`
a_length=`echo "$a" | awk 'END {print NR}'`
echo "$a"
echo $a_length
[yaxin@oishi tmp]$ sh t.sh
jhasdfj
sdfjalskdf
3
With debugging enabled:
[yaxin@oishi tmp]$ sh -x t.sh
++ cat a.txt
+ a='jhasdfj
sdfjalskdf'
++ echo 'jhasdfj
sdfjalskdf'
++ awk 'END {print NR}'
+ a_length=3
+ echo 'jhasdfj
sdfjalskdf'
jhasdfj
sdfjalskdf
+ echo 3
3
The cat command steals the blank lines at the end of the file. How can I solve this problem?
The cat command does not steal anything. It is the command substitution that does. man bash says:
Bash performs the expansion by executing command and replacing the command substitution with the standard output of the command, with any trailing newlines deleted. Embedded newlines are not deleted
If you want to store the output of a command in a variable with its trailing newlines intact, you can add && echo . after the command, store the output, and then remove the final . from the variable.
Also, to count the number of lines in a file, the canonical way is to run wc -l:
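A minimal sketch of that trick, assuming bash (for printf -v); the helper name slurp_file is hypothetical:

```bash
#!/usr/bin/env bash
# slurp_file is a hypothetical helper: store the exact contents of file $2,
# trailing newlines included, in the variable named by $1.
slurp_file() {
  local _slurp_content
  # The appended "." protects the file's trailing newlines from being stripped;
  # only the newline printed by echo itself is removed by $(...).
  _slurp_content=$(cat -- "$2" && echo .) || return
  # Drop the sentinel "." and assign the result to the caller's variable.
  printf -v "$1" '%s' "${_slurp_content%.}"
}

tmp=$(mktemp)
printf 'jhasdfj\n\nsdfjalskdf\n\n\n' > "$tmp"
slurp_file a "$tmp"
printf '%s' "$a" | awk 'END { print NR }'   # prints 5
rm -f "$tmp"
```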
wc -l < a.txt
You don't need the cat command here; use awk directly, like this:
awk 'END {print NR}' a.txt
Your problem is in storing cat's output in a shell variable. Even this will give the right output (though it is a case of UUOC):
cat a.txt | awk 'END {print NR}'
Update: When you try to do this:
a=`cat a.txt`
OR else:
a=$(cat a.txt)
The pitfall is that command substitution, i.e. a command inside backticks as you have, or inside $(), strips trailing newlines.
You can do this trick to get trailing newlines stored in a shell variable:
a=`cat a.txt; echo ';'`
a="${a%;}"
Test the variable value:
echo "$a"
printf "%q" "$a"
Then output will show newlines as well:
jhasdfj
sdfjalskdf
$'jhasdfj\n\nsdfjalskdf\n\n\n'
I have a file with contents:
20120619112139,3,22222288100597,01,503352786544597,,W,ROAMER,,,,0,mme2
20120703112557,3,00000000000000,,503352786544021,,B,,8,2505,,U,
20120611171517,3,22222288100620,,503352786544620,11917676228846,B,ROAMER,8,2505,,U,
20120703112557,3,00000000000000,,503352786544021,,B,,8,2505,,U,
20120703112557,3,00000000000000,,503352786544021,,B,,8,2505,,U,
20120611171003,3,22222288100618,02,503352786544618,,W,ROAMER,8,2505,,0,
20120611171046,3,00000000000000,02,503352786544618,11917676228846,W,ROAMER,8,2505,,0,
20120611171101,3,22222288100618,02,503352786544618,11917676228846,W,ROAMER,8,2505,,0,
20120611171101,3,22222222222222,02,503352786544618,11917676228846,W,ROAMER,8,2505,,0,
I need to check whether the third field of any line consists of one digit repeated 14 times, like 00000000000000, and print such lines to another file.
I tried this code:
awk '$3 ~ /[0-9]{14}/' myfile > output.txt
But this also prints lines with values such as "22222288100618".
I also tried:
for i in `cat myfile`
do
if [ `echo $i | cut -d"," -f 3 | egrep "^[0-9]{14}$"` ];
then echo $i >> output.txt;
fi
done
This doesn't help either; it also prints all the lines.
But I only need these lines in the output file.
20120703112557,3,00000000000000,,503352786544021,,B,,8,2505,,U,
20120703112557,3,00000000000000,,503352786544021,,B,,8,2505,,U,
20120703112557,3,00000000000000,,503352786544021,,B,,8,2505,,U,
20120611171046,3,00000000000000,02,503352786544618,11917676228846,W,ROAMER,8,2505,,0,
20120611171101,3,22222222222222,02,503352786544618,11917676228846,W,ROAMER,8,2505,,0,
Thanks in advance for any immediate help
Don't know if this can be done with awk but this should work:
perl -aF, -nle '$F[2]=~/(\d)\1{13}/&& print'
You can use an expression like 0{14}|1{14}.... Try this:
$ for i in 0 1 2 3 4 5 6 7 8 9; do re=$re${re:+|}$i{14}; done
$ awk -F, --posix \$3~/$re/ myfile
(Older versions of gawk require --posix to recognize the interval expression {14}. This may not be necessary with all awk implementations.)
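If interval expressions or back-references are not available in your awk, one possible workaround is to check the field without them: delete every copy of the first digit and see whether anything is left. This is a sketch only, and repeated_digit_lines is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# repeated_digit_lines is a hypothetical helper: print the lines whose third
# comma-separated field is a single digit repeated 14 times.
repeated_digit_lines() {
  awk -F, '
    # Only consider fields that are exactly 14 characters, all of them digits.
    length($3) == 14 && $3 !~ /[^0-9]/ {
      rest = $3
      gsub(substr($3, 1, 1), "", rest)   # delete every copy of the first digit
      if (rest == "") print              # nothing left: all 14 digits are identical
    }
  ' "$@"
}

# Three lines taken from the sample data in the question.
printf '%s\n' \
  '20120619112139,3,22222288100597,01,503352786544597,,W,ROAMER,,,,0,mme2' \
  '20120703112557,3,00000000000000,,503352786544021,,B,,8,2505,,U,' \
  '20120611171101,3,22222222222222,02,503352786544618,11917676228846,W,ROAMER,8,2505,,0,' |
  repeated_digit_lines
```

With this input, only the second and third lines are printed.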
Using grep:
grep -E "[0-9]+,[0-9]+,([0-9])\1{13}" myfile
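Using sed (the -E flag enables extended regular expressions, which this pattern needs for +, the group, and {13}):
sed -E -n '/^[^,]+,[^,]+,([0-9])\1{13}/p' input_file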