Shell Script Count String Occurrence - linux

I am working on a project and need help figuring out how to do a task.
I am going to be given a log file, and I need to parse it and count the number of times something occurs in a given minute.
For example, if I have a txt file:
Line 3: 0606 221241 successfully copied to **
Line 5: 0606 221242 successfully copied to **
Line 7: 0606 221242 successfully copied to **
Line 9: 0606 221342 successfully copied to **
I want to know how many times something was successfully copied at 2212.
So far, I have the following code, which separates out only the lines that were successfully copied and extracts the dates and times:
grep "successfully copied to" Text.log >> Success.txt
awk '{print ($1, $2)}' Success.txt > datesAndTimes.txt
This gives me
0606 221241
0606 221242
0606 221242
0606 221342
For some reason, I am having trouble figuring out how to count the number of times each specific time (e.g. 0606 2212) occurs.
I only need the minutes, not the seconds (the last two digits of the second column).
Eventually I want a log/txt file that says:
0606 2212 3
0606 2213 1
and so on....
If anyone has any ideas, I'm having a bit of a brain fart.
Thank you in advance!

You can do this with an awk one-liner:
awk '{mm=substr($4, 1, 4); cnt[$3 " " mm]++} END{for(a in cnt) print a " " cnt[a]}' Text.log
Live Demo: http://ideone.com/w2h64d
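The one-liner counts every line and prints the totals in whatever order awk walks the array. A minimal sketch of a variant that only counts the "successfully copied to" lines and sorts the result could look like the block below. The function name count_per_minute and the extra non-matching sample line are made up for illustration, and it assumes the "Line N:" prefix really is part of each line (so the date is field 3 and the time is field 4).

```shell
# Hypothetical helper: count "successfully copied to" lines per minute.
# Assumes the layout shown in the question, "Line N: MMDD HHMMSS message",
# so the date is field 3 and the time is field 4.
count_per_minute() {
    awk '/successfully copied to/ {
             minute = substr($4, 1, 4)    # keep HHMM, drop the seconds
             count[$3 " " minute]++
         }
         END { for (key in count) print key, count[key] }' "$@" | sort
}

# Sample lines from the question, plus one made-up non-matching line.
printf '%s\n' \
    'Line 3: 0606 221241 successfully copied to **' \
    'Line 4: 0606 221241 some other message' \
    'Line 5: 0606 221242 successfully copied to **' \
    'Line 7: 0606 221242 successfully copied to **' \
    'Line 9: 0606 221342 successfully copied to **' |
    count_per_minute
```

If the real log has no "Line N:" prefix, the same idea should work with $1 and $2 in place of $3 and $4. Calling count_per_minute Text.log > counts.txt would write the result to a file.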

Related

What does this SED command do and how can I modify it for my use case?

I have been asked to fix someone else's code, so I'm unsure how the command actually works, as I've never had to work with regex-type code.
sed -r 's/([0-9]{2})\/([0-9]{2})\/([0-9]{4})\s([0-9]{2}:[0-9]{2}:[0-9]{2})/\3\/\1\/\2 \4/g'
This code reads the txt file below and is 'meant' to display the number after the '/' (15 in this example).
placeholder_name 01/01/2022 12:00:00 01/01/2022 12:00:01 STATUS 12345/15 50
This is output to a new temp file but the issue is that only the first character in the number after the '/' is displayed, i.e. for the above example only 1 is displayed.
How would I modify the above command to take the full number after the '/'? Alternatively, if there is a nicer/better way to do this, I'd be happy to hear it.
Note: The number after the '/' has a range of 1-99.
Using sed
$ sed -E 's#.*/([[:digit:]]+).*#\1#' input_file
15
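The sed answer works because the leading .*/ is greedy, so it runs up to the last slash on the line rather than the slashes inside the dates. The sketch below runs it on the sample line from the question and, as an alternative, shows an awk version. The awk version rests on an assumption not stated in the question: that the value always sits in the second-to-last field.

```shell
# Sample line from the question; the wanted value is the number after the last "/".
line='placeholder_name 01/01/2022 12:00:00 01/01/2022 12:00:01 STATUS 12345/15 50'

# sed: the greedy ".*/" consumes everything up to the LAST slash on the line.
printf '%s\n' "$line" | sed -E 's#.*/([[:digit:]]+).*#\1#'

# awk alternative (assumption: the value is always in the second-to-last field).
printf '%s\n' "$line" | awk '{ split($(NF-1), part, "/"); print part[2] }'
```

Both commands print 15 for this line.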

echo not printing to file

I have a large list of telephone numbers I need to delete from a database. I'm adding the telephone numbers to a single file, one per line, and using the following script to generate the SQL delete commands for me to paste manually.
file="Input.txt"
while IFS= read line
do
echo "delete from usr_preferences where uuid like '$line';"; >> output.txt
done <"$file"
Input file data -
1111111111
2222222222
3333333333
4444444444
It's working as expected other than it prints in the terminal rather than printing to file output.txt
What have I missed?
Thanks
Thanks to Andrey B. Panfilov and markp-fuso.
Removing the ; before the >> redirection worked.
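In bash and sh, the ; ends the echo command, so the >> that follows is a separate, empty command that only opens the file. A minimal sketch of the corrected loop is below. It uses temporary files (the variable names input and output are placeholders) so that it runs on its own, and it adds -r to read, which was not in the original script.

```shell
# Throwaway files so the sketch is self-contained; in the question these
# were Input.txt and output.txt.
input=$(mktemp)
output=$(mktemp)
printf '%s\n' 1111111111 2222222222 3333333333 4444444444 > "$input"

while IFS= read -r line
do
    # No ";" between the echo and its redirection this time.
    echo "delete from usr_preferences where uuid like '$line';" >> "$output"
done < "$input"

cat "$output"
```

Another option is to redirect the whole loop once, with done < "$input" > "$output", which avoids reopening the file on every iteration.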

VMStat ran everyday at midnight with the time before each entry

I'm trying to run vmstat every 10 minutes (every 600 seconds, 144 times a day) and would like to prepend the time to each line.
0 00 * * * /usr/bin/vmstat 600 144|awk '{now=strftime("%T"); print now $0}' > /home/rory/rory_vmstat`date +\%d`
I keep getting a message in my mail saying:
/bin/sh: -c: line 0: unexpected EOF while looking for matching `''
/bin/sh: -c: line 1: syntax error: unexpected end of file
This works on the command line: /usr/bin/vmstat 600 144|awk '{now=strftime("%T"); print now $0}' so I'm not sure what's wrong.
I'm sure it's nothing too complex. I tried switching the ' and " round but no luck. Any help will be greatly appreciated :)
You've escaped the last % character, in date +\%d. You likely need to do the same with the first one too:
strftime("\%T")
The issue is that cron converts an unescaped % to a newline and sends the text after it to the command's stdin. The shell therefore only sees the command up to the first %, which leaves the opening single quote unmatched.
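Putting the answer together, the full crontab entry would look roughly like the fragment below. This is a sketch rather than a tested entry: the only changes from the question are the escaped % inside strftime and a comma in print, added here so that the time and the vmstat line are separated by a space. Note that strftime is a gawk extension, so this assumes the awk that cron finds supports it.

```
0 0 * * * /usr/bin/vmstat 600 144 | awk '{now=strftime("\%T"); print now, $0}' > /home/rory/rory_vmstat`date +\%d`
```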

Changing txt file via Bash [duplicate]

I have a text file that looks like
file:/path/to/file
..
..
DA:34,0,0
DA:86,0,0
DA:87,0,0
..
DA:89,0,0
file:/path/to/file
..
DA:23,0,1
..
DA:24,0,1
DA:25,0,1
..
I just want to keep the first line beginning with "DA" after the line beginning with "file". Other lines starting with "DA" have to be deleted. There are a lot of other lines (I marked them with ".."), they also need to be kept.
The result should look like this:
file:/path/to/file
..
..
DA:34,0,0
..
file:/path/to/file
..
DA:23,0,1
..
..
Can anybody help me? I would be really grateful. Thanks
This is very closely related to Printing with sed or awk a line following a matching pattern.
What you are after is:
awk '/^file/{f=1}(f&&/^DA/){f=0;print}!/^DA/' file
How does this work?
/^file/{f=1}: If you find a line which starts with the word "file", set a flag f to 1
(f&&/^DA/){f=0;print}: If the flag f is not zero, and the line starts with DA, print the line and set the flag to zero. This makes sure you only print the first DA after file.
!/^DA/: print all the lines that do not start with DA
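A shorter version, where the counter is only decremented on DA lines so that only the first one after file passes the test:
awk '/^file/{f=1} !/^DA/ || f-- > 0' file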
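To see the flag logic in action, the sketch below wraps the longer awk program in a function and feeds it a shortened version of the sample input. The function name keep_first_da is made up for illustration.

```shell
# Hypothetical wrapper around the flag-based awk program explained above.
keep_first_da() {
    awk '/^file/ { f = 1 }              # a new "file" block starts
         f && /^DA/ { f = 0; print }    # first DA line of the block
         !/^DA/' "$@"
}

# Shortened sample input from the question.
printf '%s\n' \
    'file:/path/to/file' '..' 'DA:34,0,0' 'DA:86,0,0' '..' 'DA:89,0,0' \
    'file:/path/to/file' 'DA:23,0,1' 'DA:24,0,1' '..' |
    keep_first_da
```

Only DA:34,0,0 and DA:23,0,1 survive, along with every line that does not start with DA.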

Search for a String, and retrieve the line having it and all lines following it until another specific pattern

Using Linux, I want to search a text file for the string Blah and then return the full line that contains the string and all the lines following it, up until a line that contains the word Failed.
For example,
Test Case Name "Blah"
Error 1
Error 2
Error 3
Failed
Test Case Name "Foo"
Pass
Test Case Name "Red"
Pass
In the above, I want to search for "Blah", and then return:
Test Case Name "Blah"
Error 1
Error 2
Error 3
Up until the line Failed. There can be any number of "Error" lines between Blah and Failed.
Follow up to make it faster
Both sed and awk options worked.
sed '/Blah/!d;:a;n;/Failed/d;ba' file
and
awk '/Failed/{p=0}/Blah/{p=1}p;' file
However, I noticed that while the expected output appears quickly, the command takes ages to exit. Maybe these commands keep searching for Blah and, given that it appears only once, they run until the end of the file.
This would not be much of a problem, but I'm working with a file that contains 10 million lines and for now it is painfully slow.
Any suggestions on how to exit after finding both the line containing Blah and the line containing Failed would be much appreciated.
Thanks!
With sed:
sed '/Blah/,/Failed/!d;//{1!d;}' file
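/Blah/,/Failed/!d: delete every line outside the range from Blah to Failed.
//{1!d;}: the empty pattern // reuses the last regular expression applied, so inside the range it matches the Blah and Failed lines; 1!d then deletes them unless they are line 1. The Blah line is kept in this example because it is the first line of the file.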
This might work for you (GNU sed):
sed -n '/Blah/,/Failed/{/Failed/!p}' file
Print the lines between and including Blah to Failed unless the line contains Failed.
sed ':a;/Blah/!d;:b;n;/Failed/ba;bb' file
If a line does not contain Blah delete it. Otherwise, print the current line and fetch the next (n). If this line contains Failed delete it and begin next iteration. Otherwise, repeat until successful or end-of-file.
The first solution prevents Blah and Failed from being printed if they are on the same line. The second alternative allows this.
Would you like to do it with awk?
awk '/Failed/{p=0}/Blah/{p=1}p;' file will work for you.
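None of the commands above stops reading once the block has been printed, which is the slowness described in the follow-up. A minimal sketch of variants that quit at the first Failed line after Blah is below. The function names are made up, and both variants assume Blah and Failed never appear on the same line.

```shell
# awk: start printing at Blah, exit as soon as Failed is reached.
print_block_awk() {
    awk '/Blah/ { p = 1 }          # start of the wanted block
         p && /Failed/ { exit }    # stop reading the file here
         p' "$@"
}

# sed: same idea, "q" quits at the Failed line and -n keeps it from printing.
print_block_sed() {
    sed -n '/Blah/,/Failed/{/Failed/q;p;}' "$@"
}

# Sample input from the question.
printf '%s\n' \
    'Test Case Name "Blah"' 'Error 1' 'Error 2' 'Error 3' 'Failed' \
    'Test Case Name "Foo"' 'Pass' |
    print_block_awk
```

On a 10-million-line file, either function should return shortly after the Failed line instead of scanning to the end. Both accept a file name, as in print_block_awk file.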

Resources