How can I display the next two characters from sed results (wildcard characters and then stop the results)?
echo 'this is a test line' | sed 's/^.*te*/te../'
Expected output:
test
Actual output:
te.. line
You can use
sed -n 's/.*\(te..\).*/\1/p' <<< 'this is a test line'
Here,
-n - suppresses the default line output
.*\(te..\).* - matches any zero or more chars (as many as possible), then captures te plus any two chars into Group 1, and then matches the rest of the string
\1 - replaces the whole match with the value of Group 1
p - prints the line only if a substitution was made.
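As a minimal sketch, the same substitution can be wrapped in a small shell function. The name next_two is hypothetical, and the sketch assumes the prefix contains no regex metacharacters and no slash.

```shell
# next_two PREFIX: read lines on stdin and print PREFIX plus the two
# characters that follow it. The function name is made up for this example.
next_two() {
    # .* is greedy, so the last occurrence of PREFIX on the line is used.
    # -n plus the p flag prints only lines where the substitution happened.
    sed -n "s/.*\($1..\).*/\1/p"
}

printf '%s\n' 'this is a test line' | next_two te    # prints: test
printf '%s\n' 'no match here' | next_two zz          # prints nothing
```

Because of -n and p, lines that do not contain the prefix are dropped instead of being echoed unchanged.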
GNU AWK solution
echo 'this is a test line' | awk 'BEGIN{FPAT="te.."}{print $1}'
output
test
Explanation: FPAT (Field PATtern) tells GNU AWK to treat every match of te.. as a field; the script then just prints the 1st field.
(tested in GNU Awk 5.0.1)
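FPAT is a GNU extension. Where only a POSIX awk is available, a similar result can probably be had with match() and substr(). This is a sketch, not the answer's method; first_match is a hypothetical name.

```shell
# first_match REGEX: print the leftmost match of REGEX on each input line.
# The function name is made up for this example.
first_match() {
    # match() sets RSTART and RLENGTH when the regex is found.
    awk -v re="$1" 'match($0, re) { print substr($0, RSTART, RLENGTH) }'
}

printf '%s\n' 'this is a test line' | first_match 'te..'    # prints: test
```

Like the FPAT version, this returns the first match on the line, not the last.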
Related
I have a log file that contains a specific string:
"Received bla bla with count {} 23567"
I need to get the number at the end of the line.
awk or grep would be fine, but I am not able to get it with the command below:
grep "Received bla bla" logfile.log | grep '[0-9]'
This does not help, since every line of the log file starts with a timestamp.
Awk lets you easily grab the last element on a line.
awk '/Received bla bla/ { print $NF }' logfile.log
The variable NF contains the number of (by default, whitespace-separated) fields on the current line, and putting a dollar sign in front refers to the field with that index, i.e. the last field. (Conveniently, but slightly unusually for Unix tools, Awk indexing starts at 1, not 0.)
If the regex needs to come from a variable, try
awk -v regex='Received bla bla' '$0 ~ regex { print $NF }' logfile.log
The operator ~ applies the argument on the right as a regex to the argument on the left, and returns a true value if it matches. $0 is the entire current input line. The -v option lets you set the value of an Awk variable from outside Awk before the script begins to execute.
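A small sketch of the variable-driven form on a single sample line. The timestamp format is invented for illustration, and last_field is a hypothetical name.

```shell
# Sample log line; the leading timestamp is made up for this example.
log_line='2021-03-04 10:15:00 Received bla bla with count {} 23567'

# last_field REGEX: print the last whitespace-separated field of every
# stdin line that matches REGEX.
last_field() {
    awk -v regex="$1" '$0 ~ regex { print $NF }'
}

printf '%s\n' "$log_line" | last_field 'Received bla bla'    # prints: 23567
```

The timestamp at the start of the line does not matter here, because $NF always refers to the last field.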
GNU grep with PCRE matching:
grep -Po 'Received .* with count .*?\K\d+' file
With sed (the leading .* also removes the timestamp):
sed -n 's/.*Received bla bla.* //p' logfile.log
Could you please try the following.
sed -E 's/.*[^0-9]([0-9]+)$/\1/' Input_file
The [^0-9] in front of the group keeps the greedy .* from swallowing all but the last digit. In case you want to print the digits at the end of a line only when that line contains a specific text, try:
sed -nE '/text to search/s/.*[^0-9]([0-9]+)$/\1/p' Input_file
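A caveat worth checking with patterns of this kind: .* is greedy, so when a digit group follows it directly, the group can be left with only the final digit. A small sketch on a made-up log line (the "ts" prefix stands in for a timestamp) compares that with a variant that forces a non-digit before the group.

```shell
# Made-up sample line for illustration.
line='ts Received bla bla with count {} 23567'

# Greedy .* takes as much as it can; the group keeps a single digit.
printf '%s\n' "$line" | sed -E 's/.*([0-9]+)$/\1/'          # prints: 7

# Requiring a non-digit before the group captures the whole number.
printf '%s\n' "$line" | sed -E 's/.*[^0-9]([0-9]+)$/\1/'    # prints: 23567
```

The second form assumes at least one non-digit character precedes the number on the line.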
I have a file named abc.lst, and I have stored its path in a variable. The file contains a 3-word string; I want to grep the second word, cut out the part from expdp to .dmp, and store that in a variable.
Example:
REFLIST_OP=/tmp/abc.lst
cat $REFLIST_OP
34 /data/abc/GOon/expdp_TEST_P119_*_18112017.dmp 12-JAN-18 04.27.00 AM
Desired Output:-
expdp_TEST_P119_*_18112017.dmp
I have tried the command below:
FULL_DMP_NAME=`cat $REFLIST_OP|grep /orabackup|awk '{print $2}'`
echo $FULL_DMP_NAME
/data/abc/GOon/expdp_TEST_P119_*_18112017.dmp
REFLIST_OP=/tmp/abc.lst
awk '{n=split($2,arr,/\//); print arr[n]}' "$REFLIST_OP"
Test Results:
$ REFLIST_OP=/tmp/abc.lst
$ cat "$REFLIST_OP"
34 /data/abc/GOon/expdp_TEST_P119_*_18112017.dmp 12-JAN-18 04.27.00 AM
$ awk '{n=split($2,arr,/\//); print arr[n]}' "$REFLIST_OP"
expdp_TEST_P119_*_18112017.dmp
To save in variable
myvar=$( awk '{n=split($2,arr,/\//); print arr[n]}' "$REFLIST_OP" )
Following awk may help you on same.
awk -F'/| ' '{print $6}' Input_file
OR
awk -F'/| ' '{print $6}' "$REFLIST_OP"
Explanation: this makes both space and / field separators (as per the sample Input_file shown) and then prints the 6th field of the line, which is the part the OP needs.
To see the field number and field's value you could use following command too:
awk -F'/| ' '{for(i=1;i<=NF;i++){print i,$i}}' "$REFLIST_OP"
Using sed with one of these regexes:
sed -e 's/.*\/\([^[:space:]]*\).*/\1/' abc.lst
Captures the non-space characters after the last / and prints only the captured part.
sed -re 's|.*/([^[:space:]]*).*|\1|' abc.lst
Same as above, but with a different separator, which avoids escaping the /. -r enables extended regexes, so the parentheses need no backslashes.
sed -e 's|.*/||' -e 's|[[:space:]].*||' abc.lst
In two steps: remove everything up to the last /, then remove everything from the first space to the end. (May be the easiest to read and understand.)
If you want to avoid an external command (sed), use shell parameter expansion (the $(<file) form needs bash; the quotes around $myvar keep the * from being expanded as a glob):
myvar=$(<abc.lst); myvar=${myvar##*/}; myvar=${myvar%% *}; echo "$myvar"
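The parameter-expansion steps can be traced on the sample line directly. This is a sketch that keeps the line in a variable instead of reading abc.lst; the variable names are chosen for illustration.

```shell
line='34 /data/abc/GOon/expdp_TEST_P119_*_18112017.dmp 12-JAN-18 04.27.00 AM'

name=${line##*/}    # remove the longest prefix ending in "/"
name=${name%% *}    # remove the longest suffix starting with a space

# Quote the variable so the literal * is not expanded as a glob.
printf '%s\n' "$name"    # prints: expdp_TEST_P119_*_18112017.dmp
```

Both expansions are plain POSIX shell, so no external process is started.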
I've a file with the below header generated by certain process
Link: <https://rnd.corp.zoom/api/v3/repositories/99/issues?state=all&per_page=100&page=2>; rel="next", <https://rnd.corp.zoom/api/v3/repositories/99/issues?state=all&per_page=100&page=8>; rel="last"
I want to cut just the number 8 from page=8 in the above content. How to go about it? Appreciate any help.
Try this -
$ cat f
Link: <https://rnd.corp.zoom/api/v3/repositories/99/issues?state=all&per_page=100&page=2>; rel="next", <https://rnd.corp.zoom/api/v3/repositories/99/issues?state=all&per_page=100&page=8>; rel="last"
$ awk -F'[&=<>]' '{for(i=1;i<=NF;i++) if($i ~ /^page$/) {print $(i+1)}}' f
2
8
If the file keeps getting appended to, the awk below prints only the last value:
$ awk -F'[&=<>]' '{for(i=1;i<=NF;i++) if($i ~ /^page$/) {kk=$(i+1)}} END{print kk}' f
8
Limitation: the input currently has page=2 and page=8, and the command above prints whichever page value comes last.
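One way around that limitation, sketched here as an assumption rather than as part of the answer, is to key on rel="last" instead of on position. The name last_rel_page is hypothetical.

```shell
header='Link: <https://rnd.corp.zoom/api/v3/repositories/99/issues?state=all&per_page=100&page=2>; rel="next", <https://rnd.corp.zoom/api/v3/repositories/99/issues?state=all&per_page=100&page=8>; rel="last"'

# last_rel_page: print the page number of the link marked rel="last".
# [?&] keeps "per_page=" from matching; -n/p suppress non-matching lines.
last_rel_page() {
    sed -n 's/.*[?&]page=\([0-9]\{1,\}\)>; rel="last".*/\1/p'
}

printf '%s\n' "$header" | last_rel_page    # prints: 8
```

This keeps working if the order of the two links is swapped, as long as the link has the exact form page=N>; rel="last".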
And if you always want the value "8" from the second line (an extra line has been added to the file below, on the assumption that it keeps growing), count the lines in k and print only when k is 2:
$ cat f
Link: <https://rnd.corp.zoom/api/v3/repositories/99/issues?state=all&per_page=100&page=2>; rel="next", <https://rnd.corp.zoom/api/v3/repositories/99/issues?state=all&per_page=100&page=8>; rel="last"
<https://rnd.corp.zoom/api/v3/repositories/99/issues?state=all&per_page=100&page=8>; rel="last"
$ awk -v k=1 -F'[&=<>]' '{for(i=1;i<=NF;i++) if(($i ~ /^page$/) && (k==2) ) {print $(i+1)} k++}' f
8
Following is an implementation using grep:
grep -Po "&page=[0-9]*" <file_name> | grep -Po "[0-9]*"
Example:
echo 'Link: <https://rnd.corp.zoom/api/v3/repositories/99/issues?state=all&per_page=100&page=2>; rel="next", <https://rnd.corp.zoom/api/v3/repositories/99/issues?state=all&per_page=100&page=8000>; rel="last"' | grep -Po "&page=[0-9]*" | grep -Po "[0-9]*"
This prints every page value, one per line (2 and 8000 here). To keep only the second value, filter on the line number:
echo 'Link: <https://rnd.corp.zoom/api/v3/repositories/99/issues?state=all&per_page=100&page=2>; rel="next", <https://rnd.corp.zoom/api/v3/repositories/99/issues?state=all&per_page=100&page=12345>; rel="last"' | grep -Po "&page=[0-9]*" | grep -Po "[0-9]*" | awk 'NR == 2'
In awk: reverse the text, remove the first match of [0-9]+=egap, and reverse the output again:
$ rev foo | awk 'sub(/[0-9]+=egap/,"")||1' |rev
Output:
Link: <https://rnd.corp.zoom/api/v3/repositories/99/issues?state=all&per_page=100&page=2>; rel="next", <https://rnd.corp.zoom/api/v3/repositories/99/issues?state=all&per_page=100&>; rel="last"
try:
awk '{gsub(/.*page=/,"page=");sub(/>.*/,"");print}' Input_file
This replaces everything matched by .*page= with page=; because * is greedy, the match runs through the last page= on the line. It then replaces >.* (from the first remaining > to the end of the line) with an empty string and prints the line, which is now page=8, or whatever the last page value is. Of course, this assumes that your Input_file looks like the example shown.
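If only the number is wanted, the same two-step idea can drop the page= text as well. A sketch on the sample header, with the result stored in a variable named page for illustration:

```shell
header='Link: <https://rnd.corp.zoom/api/v3/repositories/99/issues?state=all&per_page=100&page=2>; rel="next", <https://rnd.corp.zoom/api/v3/repositories/99/issues?state=all&per_page=100&page=8>; rel="last"'

page=$(printf '%s\n' "$header" | awk '{
    sub(/.*page=/, "")    # greedy: removes everything through the last "page="
    sub(/>.*/, "")        # removes from the first remaining ">" to the end
    print
}')

printf '%s\n' "$page"    # prints: 8
```

As with the original, this assumes every input line contains page=; other lines would pass through partly modified.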
awk -F'[= >]' '{print $12}' file
8
awk -F= '{split($8,a,">");print a[1]}' file
8
awk -F= '$8=="8>; rel"{print substr($8,1,1)}' file
8
The fact that a greedy regex is needed here (only the last occurrence of &page= should be matched) enables a simple sed solution:
sed -E 's/^.*&page=([0-9]+).*$/\1/' file
^.*&page= matches everything up to and including the last occurrence of &page= on the line.
([0-9]+) matches one or more digits and, thanks to the enclosing (...), stores the match in the 1st (and only) capture group, which the replacement string then references as \1.
.*$ matches any remaining character on the line.
By virtue of the regex having matched the entire line, \1 therefore results in just the captured number as the output.
The above works with both GNU and BSD/macOS sed and takes advantage of modern extended regular expressions (-E), but in case you need a POSIX-compliant solution (which must use basic regular expressions and is therefore more cumbersome):
sed 's/^.*&page=\([0-9]\{1,\}\).*$/\1/' file
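One behaviour to keep in mind: without -n, a line that does not contain &page= is printed unchanged. A hedged sketch that stores the result in a shell variable (last_page is a name chosen for this example) and prints nothing for non-matching lines:

```shell
header='Link: <https://rnd.corp.zoom/api/v3/repositories/99/issues?state=all&per_page=100&page=2>; rel="next", <https://rnd.corp.zoom/api/v3/repositories/99/issues?state=all&per_page=100&page=8>; rel="last"'

# -n suppresses default output; the p flag prints only substituted lines.
last_page=$(printf '%s\n' "$header" |
    sed -n 's/^.*&page=\([0-9]\{1,\}\).*$/\1/p')

printf '%s\n' "$last_page"    # prints: 8
```

An empty last_page then signals that no page parameter was found.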
With GNU grep (on Linux, as requested), a single-pass grep -Po solution is also possible; like the sed solution, it relies on greedily matching up to the last &page=:
grep -Po "^.*&page=\K[0-9]+" file
-P activates support for PCREs (Perl-compatible Regular Expressions).
-o only outputs the matching part of the line.
\K drops everything matched so far, so that what [0-9]+ matches - one or more digits - is the only output.
I want to extract timeTaken values from following line:
<some other log data> Exception, Curl1-Time: 0.258315s. Curl2-Time: 3.9092588424683s Exiting.
I am using following command with grep and awk:
grep -Po "Exception, Curl1-Time: \K(\d+.\d*)s. Curl2-Time: (\d+.\d+)" app.log | awk '{print $1 + $3}'
This outputs: 4.167565
Can this be done in a smarter way, maybe using sed or any other bash tool?
Is it OK to ignore the trailing "s." in the time-taken values, given that the result of the addition is correct?
You already use PCRE. Why not use Perl itself?
perl -lne 'print $1 + $2
if /Exception, Curl1-Time: ([\d.]+)s\. Curl2-Time: ([\d.]+)/
' < input
If you have GNU's grep, then you can execute:
var="<some other log data> Exception, Curl1-Time: 0.258315s. Curl2-Time: 3.9092588424683s Exiting."
grep -Eo '[[:digit:]]+\.[[:digit:]]+s?' <<< "$var"
Or you can use awk and stay POSIX:
var="<some other log data> Exception, Curl1-Time: 0.258315s. Curl2-Time: 3.9092588424683s Exiting."
awk '{ while (match($0, /[[:digit:]]+\.[[:digit:]]+s?/)) { print substr($0, RSTART, RLENGTH); $0 = substr($0, RSTART + RLENGTH) } }' <<< "$var"
As you can see, both commands use the regex [[:digit:]]+\.[[:digit:]]+s? to match a pattern of one or more digits, a dot, one or more digits and an optional 's'.
GNU's grep uses the -o option to extract the matching regex pattern.
The awk version uses its match and substr functions, to match and extract relevant data.
After a successful regex match, RSTART and RLENGTH are set, and we can use them to work out the start position and length to pass to substr.
RLENGTH is the length of the substring matched by the match function.
RSTART is the start-index in characters of the substring matched by the match function.
see section Built-in Functions for String Manipulation
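The same match/RSTART/RLENGTH loop can also do the addition itself, so no second tool is needed. This is a sketch; sum_times is a hypothetical name and the six-decimal output format is an arbitrary choice.

```shell
# sum_times: for each stdin line, add up every number of the form N.NNNs.
sum_times() {
    awk '{
        total = 0
        rest = $0
        while (match(rest, /[0-9]+\.[0-9]+s/)) {
            # RLENGTH - 1 leaves out the trailing "s"
            total += substr(rest, RSTART, RLENGTH - 1)
            # continue searching after the current match
            rest = substr(rest, RSTART + RLENGTH)
        }
        printf "%.6f\n", total
    }'
}

line='<some other log data> Exception, Curl1-Time: 0.258315s. Curl2-Time: 3.9092588424683s Exiting.'
printf '%s\n' "$line" | sum_times    # prints: 4.167574
```

Unlike the grep version, this sums however many timings appear on the line, not exactly two.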
sed -n 's/.*Curl1-Time: \([0-9.]*\)s\. Curl2-Time: \([0-9.]*\)s.*/\1 \2/p' filename | awk '{ print $1 + $2 }'
The two \(...\) groups capture the numbers that follow Curl1-Time: and Curl2-Time:, without the trailing s.
The entire line is replaced with the two captured groups, i.e. the output of sed is two numbers with a space between them, e.g. 0.258315 3.9092588424683. -n together with the p flag keeps lines that do not match out of the output.
awk parses the sed output with the default whitespace delimiter, adds the two fields, and prints the result.
I want to parse through a log file formatted like this:
INFO: Successfully received REQUEST_ID: 1111 from 164.12.1.11
INFO: Successfully received REQUEST_ID: 2222 from 164.12.2.22
ERROR: Some error
INFO: Successfully received REQUEST_ID: 3333 from 164.12.3.33
INFO: Successfully received REQUEST_ID: 4444 from 164.12.4.44
WARNING: Some warning
INFO: Some other info
I want a script that outputs 4444. So extract the next word after ^.*REQUEST_ID: from the last line that contains the pattern ^.*REQUEST_ID.
What I have so far:
ID=$(sed -n -e 's/^.*REQUEST_ID: //p' $logfile | tail -n 1)
For lines that match the pattern, it deletes all the text up to and including the match, leaving only the text after it, and prints the result. Then I pipe it to tail to get the last line. How do I make it print only the first word?
And is there a more efficient way of doing this than piping it to tail?
With awk:
awk '
$4 ~ /REQUEST_ID:/{val=$5}
END {print val}
' file.csv
$4 ~ /REQUEST_ID:/ : Match lines in which Field # 4 match REQUEST_ID:.
{val=$5} : Store the value of field 5 in the variable val.
END {print val} : On closing the file, print the last value stored.
I have used a regex match to allow for some variance on the string, and yet get a match. A more lenient match will be (a match at any place of the line):
awk ' /REQUEST_ID/ {val=$5}
END {print val}
' file.csv
If you value (or need) speed more than flexibility, use an exact string comparison (the string must be quoted):
awk '
$4 == "REQUEST_ID:" {val=$5}
END {print val}
' file.csv
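A self-contained sketch of the exact-comparison variant, run against the sample log from the question. The name last_request_id and the found flag are additions for this example; the flag only avoids printing an empty line when nothing matches.

```shell
sample_log='INFO: Successfully received REQUEST_ID: 1111 from 164.12.1.11
INFO: Successfully received REQUEST_ID: 2222 from 164.12.2.22
ERROR: Some error
INFO: Successfully received REQUEST_ID: 3333 from 164.12.3.33
INFO: Successfully received REQUEST_ID: 4444 from 164.12.4.44
WARNING: Some warning
INFO: Some other info'

# last_request_id [FILE...]: print the ID from the last matching line.
# Reads stdin when no file is given.
last_request_id() {
    awk '$4 == "REQUEST_ID:" { val = $5; found = 1 }
         END { if (found) print val }' "$@"
}

printf '%s\n' "$sample_log" | last_request_id    # prints: 4444
```

The file is read once and nothing is piped to tail, since awk keeps only the most recent value.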
With GNU sed:
sed -nE 's/.* REQUEST_ID: ([0-9]+) .*/\1/p' file | tail -n 1
Output:
4444
With GNU grep:
grep -Po 'REQUEST_ID: \K[0-9]+' file | tail -n 1
Output:
4444
-P: Interpret PATTERN as a Perl regular expression.
-o: Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.
\K: Drop everything before that point from the internal record.
sed '/^.*REQUEST_ID: \([0-9]\{1,\}\) .*/ {s//\1/;h;}
$!d
x' "${logfile}"
POSIX version.
Prints the next word (assumed to be a number here), or an empty line if there is no occurrence.
Principle:
if the line contains REQUEST_ID
extract the next number
put it in the hold buffer
if this is not the last line, delete the current content (and cycle to the next line)
on the last line, load the hold buffer (the line is printed as the cycle ends)
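The hold-buffer principle can also be written as a multi-line script with -n and an explicit p. This is a variant sketched for illustration, not the command above; last_id_sed is a hypothetical name.

```shell
sample_log='INFO: Successfully received REQUEST_ID: 1111 from 164.12.1.11
INFO: Successfully received REQUEST_ID: 2222 from 164.12.2.22
ERROR: Some error
INFO: Successfully received REQUEST_ID: 3333 from 164.12.3.33
INFO: Successfully received REQUEST_ID: 4444 from 164.12.4.44
WARNING: Some warning
INFO: Some other info'

# First block: on matching lines, reduce the line to the number and copy
# it to the hold buffer (h). Second block: on the last line ($), swap the
# hold buffer back in (x) and print it (p).
last_id_sed() {
    sed -n '
/REQUEST_ID: [0-9]/{
    s/.*REQUEST_ID: \([0-9]\{1,\}\).*/\1/
    h
}
${
    x
    p
}' "$@"
}

printf '%s\n' "$sample_log" | last_id_sed    # prints: 4444
```

As in the original, an input without any REQUEST_ID yields one empty line, because the hold buffer is still empty when it is printed.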
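You can match the number and replace the whole line with it; -n and the p flag restrict the output to matching lines, and tail keeps the last one:
sed -n -e 's/^.*REQUEST_ID: \([0-9]*\).*$/\1/p' "$logfile" | tail -n 1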
If the position is fixed, print the field where the known line and column meet (line 5, field 5 in the sample):
awk 'FNR == 5 {print $5}' file
4444
Another awk alternative if you don't know the position of the search word.
tac file | awk '{for(i=1;i<NF;i++) if($i=="REQUEST_ID:") {print $(i+1);exit}}'
Yet another one, without looping:
tac file | awk -vRS=" " 'n{print;exit} /REQUEST_ID:/{n=1}'