I'm trying to use an awk script in a file (awk.007) on file.txt.
If a string begins with the letter "J", it should be printed.
In the awk file I have this:
^J* {print $0}
and file.txt contains:
Name Surname
Maths 2 5 6
I run it with cat file.txt | awk -f awk.007 but each time it shows:
Syntax error: ^
If I run awk from the command line, everything works fine.
The regular expression needs to be in //'s:
awk '/^J/' file.txt
^J* will match every line, by the way, since * means 0 or more repetitions (so a line with zero J's still matches). And the default action for a pattern is print $0, so you don't really need to include that.
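For example, assuming the goal is to print lines that begin with "J", the awk.007 file itself could contain just:
/^J/ { print $0 }
or, relying on the default action, simply:
/^J/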
I have a log file with a specific string:
"Received bla bla with count {} 23567"
I need to get the number at the end of that line.
We can use awk or grep, but I'm not able to get it using the command below:
grep "Received bla bla" logfile.log | grep '[0-9]'
This doesn't isolate the number, since the log file has a timestamp at the beginning of each line.
Awk lets you easily grab the last element on a line.
awk '/Received bla bla/ { print $NF }' logfile.log
The variable NF contains the number of (by default, whitespace-separated) fields on the current line, and putting a dollar sign in front refers to the field with that index, i.e. the last field. (Conveniently, but slightly unusually for Unix tools, Awk indexing starts at 1, not 0.)
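As a quick illustration of NF and $NF (a small sketch, not from the original log):
echo 'a b c' | awk '{ print NF, $NF }'
This prints 3 c.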
If the regex needs to come from a variable, try
awk -v regex='Received bla bla' '$0 ~ regex { print $NF }' logfile.log
The operator ~ applies the argument on the right as a regex to the argument on the left, and returns a true value if it matches. $0 is the entire current input line. The -v option lets you set the value of an Awk variable from outside Awk before the script begins to execute.
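For example, if the pattern is held in a shell variable (the name pattern below is just for illustration), it might look like:
pattern='Received bla bla'
awk -v regex="$pattern" '$0 ~ regex { print $NF }' logfile.log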
GNU grep with PCRE matching:
grep -Po 'Received .* with count .*?\K\d+' file
sed -n 's/.*Received bla bla.* //p' logfile.log
Could you please try the following. I am on mobile so I couldn't test it just now, but these should work.
sed -E 's/.*[^0-9]([0-9]+)$/\1/' Input_file
In case you want to print the digits at the end of the line only on lines containing specific text, then try:
sed -nE '/text to search/s/.*[^0-9]([0-9]+)$/\1/p' Input_file
I have a file with several lines, one of which is:
-xxxxxxxx()xxxxxxxx
I want to add the contents of this line to a new file.
I did this:
awk ' /^-/ {system("echo" $0 ">" "newline.txt")} '
but this does not work; it returns an error that says:
Unexpected token '('
I believe this is due to the () present in the line. How can I overcome this issue?
You need to add proper spaces!
With your erroneous awk ' /^-/ {system("echo" $0 ">" "newline.txt")} ', the shell command is essentially echo-xxxxxxxx()xxxxxxxx>newline.txt, which surely doesn't work. You need to construct a proper shell command inside the awk string and obey awk's string concatenation rules, i.e. your intended script should look like this (which is still broken, because $0 is not properly quoted in the resulting shell command):
awk '/^-/ { system("echo " $0 " > newline.txt") }'
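If you really do need system(), a sketch of a variant that also quotes $0 for the shell could look like this (still fragile if a line itself contains a single quote):
awk '/^-/ { system("echo '\''" $0 "'\'' > newline.txt") }'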
However, if you really just need to echo $0 into a file, you can simply do:
awk '/^-/ { print $0 > "newline.txt" }'
Or even more simply
awk '/^-/' > newline.txt
This applies the default action to all records matching /^-/; the default action is print, which prints the current record, i.e. this script simply selects the desired records. The > newline.txt redirection outside awk then puts them into a file.
You don't need the system or echo commands; simply:
awk '/^-/ {print $1}' file > newfile
This will capture lines starting with -, but it prints only the first field, so everything after the first space is truncated.
awk '/^-/ {print $0}' file > newfile
Would capture the entire line including spaces.
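As a quick illustration of the difference (a sketch with made-up input):
echo '-abc def' | awk '{ print $1 }'
prints -abc, whereas {print $0} would print -abc def.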
You could use grep also:
grep -o '^-.*' file > newfile
Captures any lines starting with -
grep -o '^-.*().*' file > newfile
Would be more specific and capture lines starting with - that also contain ()
First of all, for simple extraction of patterns from a file you do not need awk; it is overkill. grep is more than enough for the task:
INPUT:
$ more file
123
-xxxxxxxx()xxxxxxxx
abc
-xyxyxxux()xxuxxuxx
123
abc
123
command:
$ grep -oE '^-[^(]+\(\).*' file
-xxxxxxxx()xxxxxxxx
-xyxyxxux()xxuxxuxx
explanations:
Options: -o outputs only the matched pattern rather than the whole line (it can be removed); -E enables extended regular expressions.
Regex: ^-[^(]+\(\).* selects lines that start with - and contain ()
You can redirect your output to a new_file by adding > new_file at the end of your command.
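For example, combining the command above with such a redirection (new_file is just a placeholder name):
grep -oE '^-[^(]+\(\).*' file > new_file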
I am a Java programmer and a newbie to shell scripting. I have a daunting task: parse multi-gigabyte logs and look for lines where '1' (just 1, no quotes) is present at the 446th position of the line. I am able to verify that the character 1 is present by running cat *.log | cut -c 446-446 | sort | uniq -c, but I am not able to extract those lines and print them to an output file.
awk '{if (substr($0,446,1) == "1") {print $0}}' file
is the basic idea.
You can use FILENAME in the print statement to add the filename to the output, so you could do:
awk '{if (substr($0,446,1) == "1") {print FILENAME ":" $0}}' file1 file2 ...
IHTH
Try adding grep to the pipe:
grep '^.\{445\}1.*$'
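For example, applied directly to the log files from the question (a sketch; output.txt is just a placeholder name):
grep '^.\{445\}1' *.log > output.txt
Note that with multiple input files grep prefixes each matching line with its filename; add -h to get the bare lines.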
You can use an awk command for that:
awk 'substr($0, 446, 1) == "1"' file.log
The substr function gets 1 character at position 446, and == "1" checks that the character is 1.
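To see substr's 1-based indexing in a tiny example (a sketch):
echo 'abcdef' | awk '{ print substr($0, 3, 1) }'
prints c.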
Another one in awk. To make a saner example, we print lines where the third character is 3:
$ cat file
123 # this
456 # not this
$ awk -F '' '$3==3' file
123 # this
Based on that example, but untested (note that splitting on an empty FS so that each character becomes its own field is not specified by POSIX, though GNU awk supports it):
$ awk -F '' '$446==1' file
I want to parse through a log file formatted like this:
INFO: Successfully received REQUEST_ID: 1111 from 164.12.1.11
INFO: Successfully received REQUEST_ID: 2222 from 164.12.2.22
ERROR: Some error
INFO: Successfully received REQUEST_ID: 3333 from 164.12.3.33
INFO: Successfully received REQUEST_ID: 4444 from 164.12.4.44
WARNING: Some warning
INFO: Some other info
I want a script that outputs 4444. So extract the next word after ^.*REQUEST_ID: from the last line that contains the pattern ^.*REQUEST_ID.
What I have so far:
ID=$(sed -n -e 's/^.*REQUEST_ID: //p' $logfile | tail -n 1)
For lines that match the pattern, it deletes all the text up to and including the match, leaving only the text after it, and prints the result. Then I tail it to get the last line. How do I make it print only the first word?
And is there a more efficient way of doing this than piping it to tail?
With awk:
awk '
$4 ~ /REQUEST_ID:/{val=$5}
END {print val}
' file.csv
$4 ~ /REQUEST_ID:/ : Match lines in which field 4 matches REQUEST_ID:.
{val=$5} : Store the value of field 5 in the variable val.
END {print val} : On closing the file, print the last value stored.
I have used a regex match to allow for some variance in the string and still get a match. A more lenient match would be (a match anywhere on the line):
awk ' /REQUEST_ID/ {val=$5}
END {print val}
' file.csv
If you value (or need) speed more than robustness, then use (quoting needed):
awk '
$4 == "REQUEST_ID:" {val=$5}
END {print val}
' file.csv
With GNU sed:
sed -nE 's/.* REQUEST_ID: ([0-9]+) .*/\1/p' file | tail -n 1
Output:
4444
With GNU grep:
grep -Po 'REQUEST_ID: \K[0-9]+' file | tail -n 1
Output:
4444
-P: Interpret PATTERN as a Perl regular expression.
-o: Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.
\K: Discard everything matched so far, so it is not included in the printed match.
sed '/^.*REQUEST_ID: \([0-9]\{1,\}\) .*/ {s//\1/;h;}
$!d
x' ${logfile}
POSIX version.
It prints an empty line if there is no occurrence, otherwise the next word (assuming here that it's a number).
Principle:
if the line contains REQUEST_ID
extract the next number
put it in the hold buffer
if not at the end of the input, delete the current content (and cycle to the next line)
load the hold buffer (and print the line, ending the cycle)
You can match the number and replace each matching line with that value:
sed -n -e 's/^.*REQUEST_ID: \([0-9]*\).*$/\1/p' $logfile
Print the field where line and column meet (for the sample input, line 5, field 5):
awk 'FNR == 5 {print $5}' file
4444
Another awk alternative if you don't know the position of the search word.
tac file | awk '{for(i=1;i<NF;i++) if($i=="REQUEST_ID:") {print $(i+1);exit}}'
Yet another one, without looping:
tac file | awk -vRS=" " 'n{print;exit} /REQUEST_ID:/{n=1}'
Is it possible to use grep to match only lines with numbers in a pre-specified range?
For instance I want to list all lines with numbers in the range [1024, 2048] of a log that contain the word 'error'.
I would like to keep the '-n' functionality i.e. have the number of the matched line in the file.
Use sed first:
sed -ne '1024,2048p' | grep ...
-n says don't print lines by default; the address x,y with the p command prints lines x through y inclusive (overriding the -n).
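A complete pipeline for the question's example might then look like this (logfile is a placeholder name; note that grep -n will number lines relative to the extracted range, not the original file):
sed -n '1024,2048p' logfile | grep -n 'error'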
sed -n '1024,2048{/error/{=;p}}' | paste - -
Here /error/ is the pattern to match, = prints the line number, and paste - - joins each printed number with its line on a single row.
Awk is a good tool for the job:
$ awk 'NR>=1024 && NR<=2048 && /error/ {print NR,$0}' file
In awk the variable NR contains the current line number and $0 contains the line itself.
The benefit of using awk is that you can easily change the output to display it however you want. For instance, to separate the line number from the line with a colon followed by a TAB:
$ awk 'NR>=1024 && NR<=2048 && /error/ {print NR,$0}' OFS=':\t' file