I have the following logs
2022-07-23T09:00:00,987 hi
2022-07-23T10:00:00,987 hi
2022-07-23T11:10:00,987 hi
2022-07-23T12:52:00,987 hi
2022-07-23T13:29:00,987 hi
2022-07-23T13:59:00,987 hi
I want to grep only the lines between 10:00 and 13:30. Here is my command, but it doesn't retrieve the expected result. Any idea what needs to be fixed?
sudo cat <path to my log file> | grep 'hi' | grep -E '2022-07-23T(10:[0-5][0-9]:[0-5][0-9]|13:30:00)'
awk is a better tool for this than grep:
awk -F '[T,]' '$2 >= "10:00" && $2 <= "13:30" && /hi/' file
2022-07-23T10:00:00,987 hi
2022-07-23T11:10:00,987 hi
2022-07-23T12:52:00,987 hi
2022-07-23T13:29:00,987 hi
Here:
Using -F '[T,]' we split fields on the T or , characters
$2 >= "10:00" && $2 <= "13:30" does a lexicographic (string) comparison of the 2nd field against our time range
/hi/ searches for hi in the line
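If the log spans more than one day, the same lexicographic comparison also works on the date field, for example (a sketch based on the sample data above):
awk -F '[T,]' '$1 == "2022-07-23" && $2 >= "10:00" && $2 <= "13:30" && /hi/' file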
Here is a grep solution using regex magic:
grep -E '^[^T]+T1([0-2]|3:([0-2][0-9]|30)):.* hi' file
2022-07-23T10:00:00,987 hi
2022-07-23T11:10:00,987 hi
2022-07-23T12:52:00,987 hi
2022-07-23T13:29:00,987 hi
RegEx Details:
^: Start of the line
[^T]+: Match one or more characters that are not T (the date part)
T1: Match a literal T followed by the digit 1
([0-2]|3:([0-2][0-9]|30)): Match a digit 0 to 2 (giving hours 10, 11 or 12), or else 3: followed by minutes 00 to 29 or 30 (giving 13:00 to 13:30)
:.* hi: Match :, then any text, then a space and hi
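Plugged into the original pipeline this becomes, for example (the path is just a placeholder for the asker's log file; no cat needed since grep can read the file directly):
sudo grep -E '^[^T]+T1([0-2]|3:([0-2][0-9]|30)):.* hi' /path/to/logfile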
I can run the command below in a shell script with no problem on Ubuntu 21.04:
grep -h "new tip" logs/node.log | tail -1000 | sed -rn "s/\[(.*)\].\[(.*)\].* ([0-9]+)/\2,\3/p" | awk -F, -v threshold=2 '{"date +%s.%3N -d\""$1"\""|getline actualTime; delay=actualTime-$2-1591566291; sum+=delay; if (delay > threshold ) print $1 " " delay;} END {print "AVG=", sum/NR}'
but when I run the exact same script on Ubuntu 20.04.2, I get this error:
/bin/sh: 1: Syntax error: Unterminated quoted string
It's definitely the exact same script because I scp'd it from the 21.04 machine to the 20.04.2 one. I couldn't find anything on Stack Overflow or the wider internet that addressed this difference. Both Ubuntu installs are on Linux cloud servers. About the only way to run the script with no error is taking out this piece of the awk line: "date +%s.%3N -d\""$1"\""|getline actualTime;
I tried playing around with the reference to the $1 field, but nothing worked. I also tried nawk instead of awk, with no luck. As a last resort I could upgrade the OS from 20.04 to 21.04.
Has anyone seen this before?
Added: Thanks all for the quick replies. Here are the first lines of the log file that the script is running against:
[Relaynod:cardano.node.ChainDB:Notice:60] [2021-06-30 02:20:14.36 UTC] Chain extended, new tip: de56b9f458e8942ca74c6a1913dc58fa896823dc19b366285e15481f434ed337 at slot 33453323
[Relaynod:cardano.node.ChainDB:Notice:60] [2021-06-30 02:20:15.17 UTC] Chain extended, new tip: e88ea4f438944bd15186fe93f321c117ec769cfbd33667654634f4510cfd3780 at slot 33453324
Just to make sure it's not a data issue, I ran the script on the Ubuntu 21 server against the file (it worked), then copied the file to the Ubuntu 20 server, ran the exact same script against the copied file, and got the error.
I'll try out the suggestions on this topic and will let everyone know the answer.
New update: after a laptop crash and replacement, I remembered to come back to this post. I ended up using mktime() like Ed mentioned, and it's working now.
The shell script:
#!/bin/bash
grep -h "new tip" logs/node.log | tail -1000 | sed -rn "s/[(.)].[(.)].* ([0-9]+)/\2,\3/p" | mawk -F, -v threshold=2 -f check_delay.awk
The awk script:
BEGIN { ENVIRON["TZ"] = "UTC"; }
{
    year  = substr($1,1,4);
    month = substr($1,6,2);
    day   = substr($1,9,2);
    hour  = substr($1,12,2);
    min   = substr($1,15,2);
    sec   = substr($1,18,2);
    timestamp = year" "month" "day" "hour" "min" "sec;
    actualTime = mktime(timestamp) + 7200;
    delay = actualTime - $2 - 1591566291;
    sum += delay;
    if (delay >= threshold)
        print $1 " " delay;
}
END { print "AVG=", sum/NR }
You're spawning a shell to call date with whatever value happens to be in $1 in your data, so the result will depend on your data. Look:
$ echo '3/27/2021' | awk '{"date +%s.%3N -d\""$1"\"" | getline; print}'
1616821200.000
$ echo 'a"b' | awk '{"date +%s.%3N -d\""$1"\"" | getline; print}'
sh: -c: line 0: unexpected EOF while looking for matching `"'
sh: -c: line 1: syntax error: unexpected end of file
a"b
and what this command outputs from a log file:
sed -rn "s/\[(.*)\].\[(.*)\].* ([0-9]+)/\2,\3/p"
will vary greatly depending on the contents of specific lines in the log file since the parts you're trying to isolate aren't anchored and use .*s when you presumably meant to use [^]]*s. For example:
$ echo '[foo] [3/27/2021] 15 something [probably] happened at line 50'
[foo] [3/27/2021] 15 something [probably] happened at line 50
$ echo '[foo] [3/27/2021] 15 something [probably] happened at line 50' | sed -rn "s/\[(.*)\].\[(.*)\].* ([0-9]+)/\2,\3/p"
3/27/2021] 15 something [probably,50
$ echo '[foo] [3/27/2021] 15 something [probably] happened at line 50' | sed -rn "s/\[(.*)\].\[(.*)\].* ([0-9]+)/\2,\3/p" | awk -F, -v threshold=2 '{"date +%s.%3N -d\""$1"\""|getline actualTime; delay=actualTime-$2-1591566291; sum+=delay; if (delay > threshold ) print $1 " " delay;} END {print "AVG=", sum/NR}'
date: invalid date ‘3/27/2021] 15 something [probably’
AVG= -1591566341
If you want to do that then you could introduce a check for a valid date to avoid THAT specific error, e.g. (but obviously create a better date verification regexp than this):
$ echo 'a"b' | awk '$1 ~ "[0-9]/[0-9]+/[0-9]" {"date +%s.%3N -d\""$1"\"" | getline; print}'
$
but it's still fragile and extremely slow.
You're using GNU sed (for -r), so you either have or can get GNU awk, which has built-in time functions. That means you shouldn't be spawning a subshell to call date in the first place; just use mktime() (see https://stackoverflow.com/a/68180908/1745001), which avoids cryptic errors like that and runs orders of magnitude faster.
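As a rough illustration (not the asker's exact script), converting one of those bracketed log timestamps to epoch seconds with mktime() could look like this; the field handling is an assumption based on the sample log lines above:
echo '2021-06-30 02:20:14.36 UTC' |
TZ=UTC0 gawk '{
    ts = $1 " " $2                # "2021-06-30 02:20:14.36"
    gsub(/[-:]/, " ", ts)         # -> "2021 06 30 02 20 14.36"
    sub(/\.[0-9]+$/, "", ts)      # drop the fractional seconds
    print mktime(ts)              # epoch seconds (TZ=UTC0, so interpreted as UTC)
}'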
I'm studying bash scripting and I'm stuck on an exercise from this site: https://ryanstutorials.net/bash-scripting-tutorial/bash-variables.php#activities
The task is to write a bash script to output a random word from a dictionary whose length is equal to the number supplied as the first command line argument.
My idea was to create a sub-dictionary, assign each word a line number, select a random number from those lines and filter the output. This worked for a similar, simpler script, but not for this one.
This is the code I used:
6 DIC='/usr/share/dict/words'
7 SUBDIC=$( egrep '^.{'$1'}$' $DIC )
8
9 MAX=$( $SUBDIC | wc -l )
10 RANDRANGE=$((1 + RANDOM % $MAX))
11
12 RWORD=$(nl "$SUBDIC" | grep "\b$RANDRANGE\b" | awk '{print $2}')
13
14 echo "Random generated word from $DIC which is $1 characters long:"
15 echo $RWORD
and this is the error I get when using "21" as input:
bash script.sh 21
script.sh: line 9: counterintelligence's: command not found
script.sh: line 10: 1 + RANDOM % 0: division by 0 (error token is "0")
nl: 'counterintelligence'\''s'$'\n''electroencephalograms'$'\n''electroencephalograph': No such file or directory
Random generated word from /usr/share/dict/words which is 21 characters long:
I tried splitting the code into smaller pieces in bash, obtaining no error (input = 21):
egrep '^.{'21'}$' /usr/share/dict/words | wc -l
3
but once inside the script, lines 9 and 10 give errors.
Where do you think the error is?
problems
SUBDIC=$( egrep '^.{'$1'}$' $DIC ) will store all words of the given length in the SUBDIC variable, so its content is now something like foo bar baz.
MAX=$( $SUBDIC | ... ) will try to run the command foo bar baz which is obviously bogus; it should be more like MAX=$(echo $SUBDIC | ... )
MAX=$( ... | wc -l ) will count the lines; when using the above mentioned echo $SUBDIC you will have multiple words, but all in one line...
RWORD=$(nl "$SUBDIC" | ...) same problem as above: there's only one line (also note @armali's answer that nl requires a file or stdin)
RWORD=$(... | grep "\b$RANDRANGE\b" | ...) might match the dictionary entry catch 22
RWORD=$(... | awk '{print $2}') likely won't handle lines containing spaces
a simple solution
doing a "random sort" over the all the possible words and taking the first line, should be sufficient:
egrep "^.{$1}$" "${DIC}" | sort -R | head -1
MAX=$( $SUBDIC | wc -l ) - A pipe is used for connecting a command's output, while $SUBDIC isn't a command; an appropriate syntax is MAX=$( <<<$SUBDIC wc -l ).
nl "$SUBDIC" - The argument to nl has to be a filename, which "$SUBDIC" isn't; an appropriate syntax is nl <<<"$SUBDIC".
This code will do it. My test dictionary of words is in the file file. It's a good idea to get all words of a given length first, but put them in an array rather than a plain variable, then pick a random index and echo that element.
dic=( $(sed -n "/^.\{$1\}$/p" file) )
ind=$((0 + RANDOM % ${#dic[@]}))
echo ${dic[$ind]}
I am also doing this activity and I came up with a simple solution.
I created this script:
#!/bin/bash
awk "NR==$1 {print}" /usr/share/dict/words
If you want a random word, run the script from the terminal as in the command below.
./script.sh $RANDOM
If you want to print the word at a specific line number, run it as in the command below.
./script.sh 465
cat /usr/share/dict/american-english | head -n $RANDOM | tail -n 1
$RANDOM - Returns a different random number each time it is referred to.
This simple line outputs a random word from the mentioned dictionary.
Otherwise, as umläute mentioned, you can do:
cat /usr/share/dict/american-english | sort -R | head -1
Hi, I'm trying to add text to the 1st line of a file using sed.
So far I've tried:
#!/bin/bash
touch test
sed -i -e '1i/etc/example/live/example.com/fullchain.pem;\' test
and this doesn't work.
I also tried:
#!/bin/bash
touch test
sed -i "1i ssl_certificate /etc/example/live/example.com/fullchain.pem;" test
This doesn't seem to work either.
Oddly, when I try:
#!/bin/bash
touch test
echo "ssl_certificate /etc/example/live/example.com/fullchain.pem;" > test
I get the 1st line of text to appear when I use cat test,
but as soon as I type sed -i "2i ssl_certificate_key /etc/example/live/example.com/privkey.pem;"
I can't see the text that should appear on line 2, namely ssl_certificate_key /etc/example/live/example.com/privkey.pem;
So, to summarise my question:
Can text be inserted into the 1st line of a newly created file using sed?
If yes, what's the best way of inserting text after the 1st line of text?
Suppose you have a file like this:
one
two
Then to append to the first line:
$ sed '1 s_$_/etc/example/live/example.com/fullchain.pem;_' file
one/etc/example/live/example.com/fullchain.pem;
two
To insert before the first line:
$ sed '1 i /etc/example/live/example.com/fullchain.pem;' file
/etc/example/live/example.com/fullchain.pem;
one
two
Or, to append after the first line:
$ sed '1 a /etc/example/live/example.com/fullchain.pem;' file
one
/etc/example/live/example.com/fullchain.pem;
two
Note the number 1 in those sed expressions - that's called the address in sed terminology. It tells sed on which line the command that follows is to operate.
If your file doesn't contain the line you're addressing, the sed command won't get executed. That's why you can't insert/append on line 1, if your file is empty.
Instead of using the stream editor, just use a shell redirection (>>) to append; this also works for empty files:
echo "content" >> file
Your problem stems from the fact that sed cannot locate the line you're telling it to write at, for example:
touch test
sed -i -e '1i/etc/example/live/example.com/fullchain.pem;\' test
attempts to insert at line 1 of test, but that line doesn't exist at that point. If you had created your file as:
echo -en "\n" > test
sed -i '1i/etc/example/live/example.com/fullchain.pem;\' test
it would not complain, but you'd end up with an extra line. Similarly, when you call:
sed -i "2i ssl_certificate_key /etc/example/live/example.com/privkey.pem;"
you're telling sed to insert the given text at line 2, which doesn't exist at that point, so sed doesn't get to edit the file.
So, for the initial line or the last line in the file, you should not use sed because simple > and >> stream redirects are more than enough.
Your command will work if you make sure the input file has at least one line:
[ "$(wc -l < test)" -gt 0 ] || printf '\n' >> test
sed -i -e '1 i/etc/example/live/example.com/fullchain.pem;\' test
To insert text at the first line and push the rest onto a new line using sed on macOS, this worked for me:
sed -i '' '1 i \
Insert
' ~/Downloads/File-path.txt
First and Last
I would assume that anyone who searched for how to insert/append text to the beginning/end of a file probably also needs to know how to do the other.
cal | \
gsed -E \
-e '1i\{' \
-e '1i\ "lines": [' \
-e 's/(.*)/ "\1",/' \
-e '$s/,$//' \
-e '$a\ ]' \
-e '$a\}'
Explanation
This is cal output piped to gnu-sed (called gsed on macOS installed via brew.sh) with extended RegEx (-E) and 6 "scripts" applied (-e) and line breaks escaped with \ for readability. Scripts 1 & 2 use 1i\ to "at line 1, insert". Scripts 5 & 6 use $a\ to "at line <last>, append". I vertically aligned the text outputs to make the code represent what is expected in the result. Scripts 3 & 4 do substitutions (the latter applying only to "line <last>"). The result is converting command output to valid JSON.
output
{
"lines": [
" October 2019 ",
"Su Mo Tu We Th Fr Sa ",
" 1 2 3 4 5 ",
" 6 7 8 9 10 11 12 ",
"13 14 15 16 17 18 19 ",
"20 21 22 23 24 25 26 ",
"27 28 29 30 31 ",
" "
]
}
For help getting this to work with the macos/BSD version of sed, see my answer here.
I'm trying to write a script in which I pass a shell variable into an awk command, but when I run it nothing happens. I tried running that line alone in the shell and found that no variable expansion happened like I expected. Here's the code:
#!/bin/bash

# Created By Rafael Adel

# This script is to start dwm with customizations needed


while true;do
    datestr=`date +"%r %d/%m/%Y"`
    batterystr=`acpi | grep -oP "([a-zA-Z]*), ([0-9]*)%"`
    batterystate=`echo $batterystr | grep -oP "[a-zA-Z]*"`
    batterypercent=`echo $batterystr | grep -oP "[0-9]*"`

    for nic in `ls /sys/class/net`
    do
        if [ -e "/sys/class/net/${nic}/operstate" ]
        then
            NicUp=`cat /sys/class/net/${nic}/operstate`
            if [ "$NicUp" == "up" ]
            then
                netstr=`ifstat | awk -v interface=${nic} '$1 ~ /interface/ {printf("D: %2.1fKiB, U: %2.1fKiB",$6/1000, $8/1000)}'`
                break
            fi
        fi
    done


    finalstr="$netstr | $batterystr | $datestr"

    xsetroot -name "$finalstr"
    sleep 1
done &

xbindkeys -f /etc/xbindkeysrc

numlockx on

exec dwm
This line:
netstr=`ifstat | awk -v interface=${nic} '$1 ~ /interface/ {printf("D: %2.1fKiB, U: %2.1fKiB",$6/1000, $8/1000)}'`
is what causes the netstr variable not to get assigned at all. That's because interface is not replaced with ${nic}, I guess.
So could you tell me what's wrong here? Thanks.
If you want to /grep/ with your variable, you have 2 choices:
interface=eth0
awk "/$interface/{print}"
or
awk -v interface=eth0 '$0 ~ interface{print}'
See http://www.gnu.org/software/gawk/manual/gawk.html#Using-Shell-Variables
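Applied to the line from the question, the second form would look roughly like this (assuming ifstat's column layout is what the original printf expected):
netstr=$(ifstat | awk -v interface="${nic}" '$1 ~ interface {printf("D: %2.1fKiB, U: %2.1fKiB", $6/1000, $8/1000)}')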
It's like I thought: awk substitutes variables properly, but between // (inside a regex, or an awk regex, depending on some awk parameter AFAIR) an awk variable cannot be used for substitution.
I had no issue grepping with a variable inside an awk program (for simple regexp cases):
sawk1='repo\s+module2'
sawk2='#project2\s+=\s+module2$'
awk "/${sawk1}/,/${sawk2}/"'{print}' aFile
(Here the /xxx/,/yyy/ displays everything between xxx and yyy)
(Note the double-quoted "/${sawk1}/,/${sawk2}/", followed by the single-quoted '{print}')
This works just fine, and comes from "awk: Using Shell Variables in Programs":
A common method is to use shell quoting to substitute the variable’s value into the program inside the script.
For example, consider the following program:
printf "Enter search pattern: "
read pattern
awk "/$pattern/ "'{ nmatches++ }
END { print nmatches, "found" }' /path/to/data
The awk program consists of two pieces of quoted text that are concatenated together to form the program.
The first part is double-quoted, which allows substitution of the pattern shell variable inside the quotes.
The second part is single-quoted.
It does add the caveat though:
Variable substitution via quoting works, but can potentially be messy.
It requires a good understanding of the shell’s quoting rules (see Quoting), and it’s often difficult to correctly match up the quotes when reading the program.
I need to get a row based on a column value, just like querying a database. I have command output like this:
Name        ID   Mem    VCPUs  State   Time(s)
Domain-0    0    15485  16     r-----  1779042.1
prime95-01  512  1      -b---- 61.9
Here I need to list only those rows where the state is "r". Something like this:
Domain-0    0    15485  16     r-----  1779042.1
I have tried using grep and awk, but I have not been able to get it working.
Any help is much appreciated.
There is a variety of tools available for filtering.
If you only want lines with "r-----", grep is more than enough:
command | grep "r-----"
Or
cat filename | grep "r-----"
grep can handle this for you:
yourcommand | grep -- 'r-----'
It's often useful to save the (full) output to a file to analyse later. For this I use tee.
yourcommand | tee somefile | grep 'r-----'
If you want to find the line containing "-b----" a little later on without re-running yourcommand, you can just use:
grep -- '-b----' somefile
No need for cat here!
I recommend putting -- after your call to grep since your patterns contain minus signs; if a minus sign appears at the beginning of the pattern, it would look like an option argument to grep rather than part of the pattern.
try:
awk '$5 ~ /^r.*/ { print }'
Like this:
cat file | awk '$5 ~ /^r.*/ { print }'
grep solution:
command | grep -E "^([^ ]+ ){4}r"
What this does (-E switches on extended regexp):
The first caret (^) matches the beginning of the line.
[^ ] matches exactly one occurrence of a non-space character; the following modifier (+) allows it to also match more occurrences.
Grouped together with the trailing space in ([^ ]+ ), it matches any sequence of non-space characters followed by a single space. The modifier {4} requires this construct to be matched exactly four times.
The single "r" is then the literal character you are searching for.
In plain words this could be written as: "If the line starts <^> with four strings that are followed by a space <([^ ]+ ){4}> and the next character is r, then the line matches."
A very good introduction to regular expressions has been written by Jan Goyvaerts (http://www.regular-expressions.info/quickstart.html).
Filtering with an awk command on Linux:
First, find the matching row with this command and store it in file2:
awk '/Domain-0 0 15485 /' file1 >file2
Output:
Domain-0 0 15485 16 r----- 1779042.1
After that, run this awk command on file2:
awk '{print $1,$2,$3,$4,"\n",$5,$6}' file2
Final output:
Domain-0 0 15485 16
r----- 1779042.1
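Both steps can also be collapsed into a single awk call that filters on the State column and prints the same two-line layout (a sketch, assuming the state is in the 5th field as in the other answers):
command | awk '$5 ~ /^r/ { print $1, $2, $3, $4 "\n" $5, $6 }'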