Awk regex: print params from URL request in access log [closed] - linux

I have an access log file containing the following data. I want to count how many times each &u={yyy} parameter value appears and print the corresponding count.
192.168.1.1 [2022/07/10 20:00:00] GET /action?t=test&u=123&b=check
192.168.1.2 [2022/07/10 20:00:00] GET /action?t=test&u=122&b=check
192.168.1.1 [2022/07/10 20:00:00] GET /action?t=test&u=122&b=check
Results:
2 122
1 123

I would harness GNU AWK for this task in the following way. Let file.txt content be
192.168.1.1 [2022/07/10 20:00:00] GET /action?t=test&u=123&b=check
192.168.1.2 [2022/07/10 20:00:00] GET /action?t=test&u=122&b=check
192.168.1.1 [2022/07/10 20:00:00] GET /action?t=test&u=122&b=check
then
awk 'match($0,"&u=[^&]*"){arr[substr($0, RSTART+3, RLENGTH-3)]++}END{for(i in arr){print arr[i],i}}' file.txt
gives output
2 122
1 123
Explanation: I use two string functions. The first is match, which sets RSTART and RLENGTH; its return value is used as the condition, so the action is executed only if a match was found. The action simply increments the array value under the key built from the match without its first 3 characters (&u=). After all lines are processed, I output the value-key pairs of the array. Disclaimer: this solution assumes any order of output lines is acceptable.
(tested in gawk 4.2.1)
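If a deterministic, count-sorted output is preferred (the disclaimer above notes that for (i in arr) visits keys in arbitrary order), GNU AWK can control the traversal order via PROCINFO["sorted_in"]. A gawk-only sketch:
awk 'match($0,"&u=[^&]*"){arr[substr($0, RSTART+3, RLENGTH-3)]++}END{PROCINFO["sorted_in"]="@val_num_desc"; for(i in arr){print arr[i],i}}' file.txt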

If the logfile always looks the same:
awk -F'&' '{print $2}' logfile | sort | uniq -c
(The sort is needed because uniq -c only counts adjacent duplicates; note the values keep their u= prefix.)
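An alternative that sidesteps awk entirely, assuming GNU grep with -o support (cut -c4- drops the leading &u=):
grep -o '&u=[^&]*' logfile | cut -c4- | sort | uniq -c | sort -rn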

Related

Is there a way to consolidate similar (but not the same) rows in a text file? [closed]

I have a text file on a linux box that has two columns.
1. An IP address
2. A code for a location
Some IP addresses are listed more than once because more than one code is associated with them.
Example:
140.90.218.62 vaac
140.90.220.11 aawu
140.90.220.11 afc
140.90.220.11 arh
140.90.220.40 afc
I would like to consolidate such IP addresses so that each is listed only once, with its several location codes, like this:
140.90.218.62 vaac
140.90.220.11 aawu:afc:arh
140.90.220.40 afc
I could always code a for loop to read in the file, consolidate the values into an array, and write the cleaned up version back out.
Before I do that, I was wondering if a combination of *nix utilities might do the job with less code.
Using awk
awk '{a[$1]=($1 in a?a[$1]":"$2:$2)}END{for (i in a) print i, a[i]}' file
Output:
140.90.220.11 aawu:afc:arh
140.90.220.40 afc
140.90.218.62 vaac
Explanation:
a[$1]=($1 in a?a[$1]":"$2:$2) - creates an associative array keyed by the IP address. Each $2 with the same IP is concatenated to the current value, separated by a colon if there's already a value.
for (i in a) print i,a[i] - when the input is exhausted, print all entries in a: the index (IP) first, then all the values.
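Note that for (i in a) visits keys in an unspecified order, which is why the output above is not sorted. If the input order should be preserved instead, a sketch that records each IP's first appearance:
awk '!($1 in a){order[++n]=$1} {a[$1]=($1 in a?a[$1]":"$2:$2)} END{for (j=1; j<=n; j++) print order[j], a[order[j]]}' file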
With bash version 4 (associative arrays):
declare -A data
while read -r ip value; do
    data[$ip]+=":$value"
done < file
for key in "${!data[@]}"; do
    printf "%s %s\n" "$key" "${data[$key]#:}"
done
With perl:
perl -lanE 'push @{$ips{$F[0]}}, $F[1]; END { $" = ":"; say "$_ @{$ips{$_}}" for sort keys %ips }' yourfile.txt
outputs
140.90.218.62 vaac
140.90.220.11 aawu:afc:arh
140.90.220.40 afc

Match a pattern and then go to the next condition and print the details [closed]

This is my demo file: Demo.txt
CP Used
----------------------------------- --------------
gyhjjjjjjjjjjjjj
gdhdhsdjjsdjsd
----------------------------------- --------------
list: 21305
DP Used
----------------------------------- --------------
asghjskkkkkkkkkkfe jfdkjcdf
ashdjdjksd
----------------------------------- --------------
list: 203899
Here I want to match the header (CP or DP) and then match list and print the details.
Expected output is:
21305,"CP"
203899,"DP"
Parse a simple table with awk:
awk '$2=="Used"{x=$1}; $1=="list:"{print $2",\""x"\""}' Demo.txt
If column 2 equals Used, the content of column 1 is saved in variable x.
If column 1 equals list:, column 2 and the content of variable x are printed.
Output:
21305,"CP"
203899,"DP"
Same approach as Cyrus's, with a small difference: using variables for the quote and comma, and $NF for the list value, as follows.
awk -v s1="\"" -v s2="," '/Used/{val=$1;next} /list:/{print $NF s2 s1 val s1;val=""}' Input_file
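For readability, the same logic can also be laid out as a multi-line script; this sketch is functionally identical to the one-liners above:
awk '
    $2 == "Used"  { section = $1; next }                  # remember the section name (CP, DP, ...)
    $1 == "list:" { printf "%s,\"%s\"\n", $2, section }   # emit count,"section"
' Demo.txt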

How would I make the true results of this awk command issue a command like sendmail etc [closed]

I'm trying to create a script that checks the 5th value of every line in a CSV output. For instance:
AAA,111,222,333,1
Here is what I am using:
awk -F "," '{if ($5 > 10) print $1 " has a value of " $5}' results
I was missing the "," ... What I was hoping to create: if the result is in fact greater (true), then issue a command like sendmail with the results; if false, do nothing.
All you need for what you say you want is:
awk -F, '{printf "1st-column-value has 5th-column-value %s than 10\n", ($5>10 ? "greater" : "less")}' file
but of course your logic is wrong (consider a value equal to 10), and I don't know if you actually wanted the first column's value printed instead of just the text 1st-column-value as you state in your question, and so on, since you didn't include concise, testable sample input and expected output in your question.
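To actually wire the check to an outgoing mail, one approach is to collect the matching lines first and only invoke the mailer when there are any. A sketch; it assumes a mail(1) client such as mailx is installed, and the recipient address and subject line are placeholders:
matches=$(awk -F, '$5 > 10 {print $1 " has a value of " $5}' results)   # collect offending lines
if [ -n "$matches" ]; then                                              # only mail when something matched
    printf '%s\n' "$matches" | mail -s "threshold exceeded" admin@example.com   # placeholder recipient
fi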

search for pattern and remove all lines [closed]

I have system logs where alarms are written. In my case I have lots of repeated alarms which I want to ignore, so that I can focus only on new alarms that might exist.
Sample alarm:
kbl1infn8:CCC_USER_2049.0002:2016/09/20-17:00:03.560451-00540-03276-CCC_USER_2049- <0N CpocsSs7CircuitCat#040200000000009a|6501646464644309|6501646464c90000-6503010117c80000-1.0.3|a40200003e3d8fd5|0000000e|0000000000000000
kbl1infn8:CCC_USER_2049.0002:2016/09/20-17:00:03.560451-00540-03276-CCC_USER_2049- |RC USSDString=*1234*#|MSISDN=93707678224|
kbl1infn8:CCC_USER_2049.0002:2016/09/20-17:00:03.560451-00540-03276-CCC_USER_2049- |NF NOT: src=ERROR_APPLICATION sev=SEVERITY_MAJOR id=010117c800000001
kbl1infn8:CCC_USER_2049.0002:2016/09/20-17:00:03.560451-00540-03276-CCC_USER_2049- de.siemens.advantage.in.featureframework.FeatureException: GenAcc> [0]theGenericAccess: No value available for SubsDMDB.Subscriber.LanguageID and type INTEGER
kbl1infn8:CCC_USER_2049.0002:2016/09/20-17:00:03.560451-00540-03276-CCC_USER_2049- at de.siemens.advantage.in.features.genericAccess.impl.DynamicAsciiBuffer$Handle.throwNotAvailableException(DynamicAsciiBuffer.java:1105)
--
kbl1infn4:CCC_USER_1025.0009:2016/09/20-00:23:03.981403-25661-28403-CCC_USER_1025- <0N CpocsSs7CircuitCat#020200000000008a|6501646464644309|6501646464c90000-6501646464640000-1.1.1|a20200003cc31dd2|0000000e|0000000000000000
kbl1infn4:CCC_USER_1025.0009:2016/09/20-00:23:03.981403-25661-28403-CCC_USER_1025- |RC CdPA=173|CgPA=93705040139|
kbl1infn4:CCC_USER_1025.0009:2016/09/20-00:23:03.981403-25661-28403-CCC_USER_1025- |NF NOT: src=ERROR_APPLICATION sev=SEVERITY_MAJOR id=6503010103c80016
kbl1infn4:CCC_USER_1025.0009:2016/09/20-00:23:03.981403-25661-28403-CCC_USER_1025- Exception in flexible core (e.g. during logic execution):de.siemens.advantage.in.featureframework.FeatureException: Call.checkIfCcOperationIsAllowed(): operation Call.playAnnouncement() only allowed within an open call control dialog
kbl1infn4:CCC_USER_1025.0009:2016/09/20-00:23:03.981403-25661-28403-CCC_USER_1025- at de.siemens.advantage.in.features.flexDTMF.actions.dtmfActions.impl.DTMFActionsController.playAnnouncementList(DTMFActionsController.java:360)
--
The above lines are related to one alarm; I want to omit such alarms from my log file.
I have tried using grep -v 'RC USSDString' IN-201609201800.txt | more, but this command removes only the line where the searched pattern exists, whereas I want to remove all lines of an alarm in which the pattern is found.
Edit: I have added one more alarm, separated by a double dash.
Assuming your alarms are multi-line and separated from each other by --:
awk -v RS="--" '{$1=$1} !/RC USSDString/' alarmfile
If you want multiple strings to be excluded from the output:
awk -v RS="--" '{$1=$1} !/string-1/ && !/string-2/' alarmfile
(Note the &&: a record is kept only if it matches neither string.)
What you have to do:
grep -Ev 'pattern1|pattern2|pattern3' file
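If whole alarms rather than single lines must be filtered, Perl can also use -- as the record separator, to the same effect as the awk RS approach above (a sketch):
perl -ne 'BEGIN { $/ = "--" } print unless /RC USSDString/' alarmfile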

Linux/ unix duplicate names [closed]

What I need to do is, to check for duplicate domain names and find if there is some.
So far I have tried many commands with grep, awk, sort, and uniq but couldn't work it out. I feel it's very simple, but can't get there.
P.S. If I use uniq -c I get a huge list of the strings in this file, with how many duplicates each has.
Here are 20 rows from the file I am using:
1,google.com
2,facebook.com
3,youtube.com
4,yahoo.com
5,baidu.com
6,amazon.com
7,wikipedia.org
8,twitter.com
9,taobao.com
10,qq.com
11,google.co.in
12,live.com
13,sina.com.cn
14,weibo.com
15,linkedin.com
16,yahoo.co.jp
17,tmall.com
18,blogspot.com
19,ebay.com
20,hao123.com
The output I would like to see:
2 google
2 yahoo
Thanks for the help!
You could use something like this to get the output you want:
$ awk -F'[.,]' '{++a[$2]}END{for(i in a)if(a[i]>1)print a[i],i}' file
2 google
2 yahoo
With the input field separator set to either . or ,, the first {block} is run for every row in the file. It builds up an array a using the second field: "google", "facebook", etc. $2 is the value of the second field, so ++a[$2] increments the value of the array a["google"], a["facebook"], etc. This means that the value in the array increases by one every time the same name is seen.
Once the whole file is processed, the for (i in a) loop runs through all of the keys in the array ("google", "facebook", etc.) and prints those whose value is greater than 1.
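For comparison, the sort/uniq toolchain mentioned in the question can also produce this, assuming the number,domain layout shown (uniq -d keeps only duplicated lines, -c prepends the count):
cut -d, -f2 file | cut -d. -f1 | sort | uniq -cd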
Given the same file as above, in a Perl one-liner:
$ perl -lane '$count{$1}++ if /^\d+,(\w+)/; END {while (($k, $v) = each %count) { print "$v $k" if $v>1}}' /tmp/test.txt
2 yahoo
2 google
