Bash replacing part of output with a string - linux

I have this command that produces a certain output:
uptime | sed -e 's/^[ \t]*//' > temp
awk '{ sub(" min",""); sub("users", "user"); print > "temp" }' ./temp
When I do cat temp, the output becomes:
13:24:16 up 1:33, 3 user, load average: 0.30, 0.56, 0.63
I want to replace 1:33 with the new time I create with this command:
awk '{printf("%02d:%02d\n",($1/60/60%24),($1/60%60))}' /proc/uptime > temp2
This gives as output:
01:33
So, in a nutshell, I want to replace 1:33 in the output of the first command with the output of the second command, 01:33. I have been googling and trying but I keep failing, so I decided to come here. I have found solutions with sed, awk and grep, but I can't figure out the right one for this problem.

Match the string with a regex and substitute with sed.
echo '13:24:16 up 1:33, 3 user, load average: 0.30, 0.56, 0.63' |
sed 's/up [^,]*,/up '"$(awk '{printf("%02d:%02d\n",($1/60/60%24),($1/60%60))}' /proc/uptime)"',/'
will output (on my system):
13:24:16 up 03:58, 3 user, load average: 0.30, 0.56, 0.63
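To run this against the temp file from the question instead of a hard-coded echo, a sketch (same sed and awk as above, reading temp and writing to stdout) would be:
sed 's/up [^,]*,/up '"$(awk '{printf("%02d:%02d\n",($1/60/60%24),($1/60%60))}' /proc/uptime)"',/' temp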

It seems like you're trying to convert human-readable time to HH:MM:SS:
uptime | awk '
NR==1 {t = sprintf("%02d:%02d:%02d", $1/3600, $1%3600/60, $1%60); next}
{split($0, a, /up +[^,]+, +/); print $1, $2, t, a[2]}
' /proc/uptime -
10:24:39 up 00:53:05 1 users, load average: 0.52, 0.58, 0.59
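A sketch of the same idea that edits the duration in place with sub() instead of splitting (same two inputs assumed; this variant keeps the comma after the time):
uptime | awk '
NR==1 {t = sprintf("%02d:%02d:%02d", $1/3600, $1%3600/60, $1%60); next}   # /proc/uptime line: seconds -> HH:MM:SS
{sub(/up +[^,]+,/, "up " t ","); print}                                   # uptime line: swap in the formatted duration
' /proc/uptime -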

Related

Filtering a file with values over 0.70 using AWK

I have a file of targets predicted by Diana and I would like to extract those with values over 0.70
>AAGACAACGUUUAAACCA|ENST00000367816|0.999999999975474
UTR3 693-701 0.00499294596715397
UTR3 1045-1053 0.405016433077734
>AAGACAACGUUUAAACCA|ENST00000392971|0.996695852735028
CDS 87-95 0.0112208345874892
I don't know why this script doesn't work, since it seems to be correct:
for file in SC*
do
grep ">" $file | awk 'BEGIN{FS="|"}{if($3 >= 0.70)}{print $2, $3}' > 70/$file.tab
done
The issue is that it doesn't filter; can you help me find the error?
For a start, that's not a valid awk script since you have a misplaced } character:
BEGIN{FS="|"}{if($3 >= 0.70)}{print $2, $3}
# |
# +-------------+
# move here |
# V
BEGIN{FS="|"}{if($3 >= 0.70){print $2, $3}}
You also don't need grep because awk can do that itself, and you can also set the field separator without a BEGIN block. For example, here's a command that will output field 3 values greater than 0.997, on lines starting with > (using | as a field separator):
pax> awk -F\| '/^>/ && $3 > 0.997 { print $3 }' prog.in
0.999999999975474
I chose 0.997 to ensure one of the lines in your input file was filtered out for being too low (as proof that it works). For your desired behaviour, the command would be:
pax> awk -F\| '/^>/ && $3 > 0.7 { print $2, $3 }' prog.in
ENST00000367816 0.999999999975474
ENST00000392971 0.996695852735028
Keep in mind I've used > 0.7 as per your "values over 0.70" in the heading and text of your question. If you really mean "values 0.70 and above" as per the code in your question, simply change > into >=.
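If you prefer to keep your original loop structure, a corrected sketch (assuming the 70/ directory already exists) would be:
for file in SC*
do
    awk -F'|' '/^>/ && $3 > 0.7 {print $2, $3}' "$file" > "70/$file.tab"
done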
It looks like you are running a for loop that starts the awk program multiple times (one awk process per file). You don't need to do that: awk can read all the files matching the glob by itself. So, apart from fixing the typo in your awk program, pass all the files to a single awk invocation, like:
awk -F\| 'FNR==1{close(out); out="70/"FILENAME".tab"} /^>/ && $3 > 0.7 { print $2,$3 > out }' SC*
I think it's perhaps safer to filter with a regex in string mode, instead of numerically:
$3 !~ /0[.][0-6]/
If awk interprets the input as a number and does a numeric compare, the comparison is subject to the rounding limits of floating-point math. With a string-based filter you avoid values just above
~0.69999999999999995559107901… (approximately the IEEE 754 double-precision representation of 7E-1)
being effectively rounded up.
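Plugged into the earlier command, that string-mode test would look something like this (same prog.in input assumed):
awk -F\| '/^>/ && $3 !~ /0[.][0-6]/ { print $2, $3 }' prog.in
which, for the sample input, prints the same two lines as the numeric version.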

Extract the uptime value from "w" command output

How can I get the value of up from the below command on Linux?
# w
01:16:08 up 20:29, 1 user, load average: 0.50, 0.34, 0.30
USER TTY LOGIN# IDLE JCPU PCPU WHAT
root pts/0 00:57 0.00s 0.11s 0.02s w
# w | grep up
01:16:17 up 20:29, 1 user, load average: 0.42, 0.33, 0.29
On Linux, the easiest way to get the uptime in (fractional) seconds is via the 1st field of /proc/uptime (see man proc):
$ cut -d ' ' -f1 /proc/uptime
350735.47
To format that number the same way that w and uptime do, using awk:
$ awk '{s=int($1);d=int(s/86400);h=int(s % 86400/3600);m=int(s % 3600 / 60);
printf "%d days, %02d:%02d\n", d, h, m}' /proc/uptime
4 days, 01:25 # 4 days, 1 hour, and 25 minutes
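If you also want seconds (not something w or uptime print themselves, just an extension of the same arithmetic), a sketch could be:
awk '{s=int($1);d=int(s/86400);h=int(s % 86400/3600);m=int(s % 3600/60);
printf "%d days, %02d:%02d:%02d\n", d, h, m, s % 60}' /proc/uptime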
To answer the question as asked - parsing the output of w (or uptime, whose output is the same as w's 1st output line, which contains all the information of interest), which also works on macOS/BSD, with a granularity of integral seconds:
A perl solution:
<(uptime) is a Bash process substitution that provides uptime's output as input to the perl command - see bottom.
$ perl -nle 'print for / up +((?:\d+ days?, +)?[^,]+)/' <(uptime)
4 days, 01:25
This assumes that days is the largest unit ever displayed.
perl -nle tells Perl to process the input line by line, without printing any output by default (-n), automatically stripping the trailing newline from each input line on input, and automatically appending one on output (-l); -e tells Perl to treat the next argument as the script (expression) to process.
print for /.../ tells Perl to output what each capture group (...) inside regex /.../ captures.
up + matches literal up, preceded by (at least) one space and followed by 1 or more spaces (+)
(?:\d+ days?, +)? is a non-capturing subexpression - due to ?: - that matches:
1 or more digits (\d+)
followed by a single space
followed by literal day, optionally followed by a literal s (s?)
the trailing ? makes the entire subexpression optional, given that a number-of-days part may or may not be present.
[^,]+ matches 1 or more (+) subsequent characters up to, but not including a literal , ([^,]) - this is the hh:mm part.
The overall capture group - the outer (...) - therefore captures the entire up-time expression - whether composed of hh:mm only, or preceded by "<n> day(s)," - and prints that.
<(uptime) is a Bash process substitution (<(...))
that, loosely speaking, presents uptime's output as a (temporary, self-deleting) file that perl can read via stdin.
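If your shell doesn't support process substitution, the same one-liner works with a plain pipe:
uptime | perl -nle 'print for / up +((?:\d+ days?, +)?[^,]+)/'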
Something like this with gnu sed:
$ w |head -n1
02:06:19 up 3:42, 1 user, load average: 0.01, 0.05, 0.13
$ w |sed -r '1 s/.*up *(.*),.*user.*/\1/g;q'
3:42
$ echo "18:35:23 up 18 days, 9:08, 6 users, load average: 0.09, 0.31, 0.41" \
|sed -r '1 s/.*up *(.*),.*user.*/\1/g;q'
18 days, 9:08
Given that the format of the uptime depends on whether it is less or more than 24 hours, the best I could come up with is a double awk:
$ w
18:35:23 up 18 days, 9:08, 6 users,...
$ w | awk -F 'user|up ' 'NF > 1 {print $2}' \
| awk -F ',' '{for(i = 1; i < NF; i++) {printf("%s ",$i)}} END{print ""}'
18 days 9:08

Remove lines containing non-numeric entries in bash

I have a sample data file (sample.log) that has entries
0.0262
0.0262
0.7634
5.7262
0.abc02
I need to filter out the lines that contain non-numeric data; in the lines above, that is the last entry.
I tried this
sed 's/[^0-9]//g' sample.log
It removes the non-numeric line, but it also removes the decimal points; the output I get is:
00262
00262
07634
57262
How can I keep the original values while eliminating the non-numeric lines? Can I use tr or awk?
You can't do this job robustly with sed or grep or any other tool that doesn't understand numbers; you need awk instead:
$ cat file
1e3
1f3
0.1.2.3
0.123
$ awk '$0==($0+0)' file
1e3
0.123
The best you could do with a sed solution would be:
$ sed '/[^0-9.]/d; /\..*\./d' file
0.123
which removes all lines that contain anything other than a digit or period, then all those that contain 2 or more periods (e.g. an IP address), but it still can't recognize exponent notation as a number.
If you have hex input data and GNU awk (see #dawg's comment below):
$ echo "0x123" | awk --non-decimal-data '$0==($0+0){printf "%s => %f\n", $0, ($0+0)}'
0x123 => 291.000000
In awk:
awk '/^[[:digit:].]+$/{print $0}' file
Or, you negate that (and add potential + or - if that is in your strings):
awk '/[^[:digit:].+-]/{next} 1' file
Or, same logic with sed:
sed '/[^[:digit:].+-]/d' file
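Applied to the sample.log from the question, that sed sketch keeps the numeric lines and drops the 0.abc02 line:
$ sed '/[^[:digit:].+-]/d' sample.log
0.0262
0.0262
0.7634
5.7262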
Ed Morton's solution is robust. Given:
$ cat nums.txt
1e6
.1e6
1E6
.001
.
0.001
.1.2
1abc2
0.0
-0
-0.0
0x123
0223
011
NaN
inf
abc
$ awk '$0==($0+0) {printf "%s => %f\n", $0, ($0+0)}
$0!=($0+0) {notf[$0]++;}
END {for (e in notf) print "\""e"\""" not a float"}' /tmp/nums.txt
1e6 => 1000000.000000
.1e6 => 100000.000000
1E6 => 1000000.000000
.001 => 0.001000
0.001 => 0.001000
0.0 => 0.000000
-0 => 0.000000
-0.0 => 0.000000
0x123 => 291.000000
0223 => 223.000000
011 => 11.000000
NaN => nan
inf => inf
".1.2" not a float
"1abc2" not a float
"abc" not a float
"." not a float
You can do it easily with grep if you discard any line that contains any letter:
grep -v '[a-z]' test
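Note that this only catches lowercase letters; a slightly more defensive sketch that also drops lines containing uppercase letters (still not a real numeric check, as noted above) would be:
grep -v '[[:alpha:]]' sample.log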
Use:
$ sed -i '/.*[a-z].*/d' sample.log
This might work for you (GNU sed):
sed '/[^0-9.]/d' file
However, this may give a false positive on, say, an IP address, i.e. it allows more than one period.
Using your test data:
sed '/^[0-9]\.[0-9]\{4\}$/!d' file
Would only match a digit, followed by a . followed by 4 digits.

grep a file and print only n'th word in the line

Content of file "test":
[]# 0-CPU4 8.9%, 9336/832, 0x5ffe9b88--0x5ffec000
[]# 0-CPU0 13.5%, aa: 4/3, xvl: 35
[]# 0-CPU1 8.6%, SM: 1/4, ovl: 60
[]# 0-CPU0 38.8%, SM: 1/4, ovl: 62
From this file, I want the percentage of the last CPU0 line, which is 38 (ignoring the decimal point).
I use the below shell command, which works fine; I'd like to know if there is a better way.
grep CPU0 test | tail -1 | awk '/0-CPU0/ {print $3}' | sed 's/\..*//'
#above command prints "38"
Assuming your data was in a file called test:
cat test | grep CPU0 | tail -1 | awk '{ printf("%d", $3) }'
grep CPU0 test | tail -1 | awk '{ printf("%d", $3) }' - condensed
awk ' /CPU0/ {a=$3} END{ printf("%d", a) }' test - more condensed
What it does:
cat will output all lines in test file
grep CPU0 will only output those lines that contain CPU0
tail -1 will give the last line from grep's output
awk will split []# 0-CPU0 38.8%, SM: 1/4, ovl: 62 by space(s)
first item is []#, second is 0-CPU0, third is 38.8%
awk's printf %d will give you just 38
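One small note: printf("%d", $3) does not emit a trailing newline; if you want one, the condensed form above can be tweaked to:
awk ' /CPU0/ {a=$3} END{ printf("%d\n", a) }' test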
$ awk '/CPU0/{last=$3} END{sub(/[.].*/,"",last); print last}' test
38
How it works
/CPU0/{last=$3}
Every time we reach a line that contains CPU0, we assign the third field to the variable last.
END{sub(/[.].*/,"",last); print last}
At the end of the file, however many times CPU0 appeared in the file, last contains the value for the last of the CPU0 lines. The sub command removes the decimal point and everything after it. Then, we print last.

awk after grep: print value when grep returns nothing

I have a question when I use awk and grep to parse log files.
The log file contains some strings with figures, e.g.
Amount: 20
Amount: 30.1
And I use grep to parse the lines with keyword "Amount", and then use awk to get the amount and do a sum:
the command is like:
cat mylog.log | grep Amount | awk -F 'Amount: ' '{sum+=$2}END{print sum}'
It works fine for me. However, sometimes the mylog.log file does not contain the keyword 'Amount'. In this case, I want to print 0, but the above awk command will print nothing. How can I make awk print something when grep returns nothing?
You can use this:
awk '/^Amount/ {amount+=$2} END {print amount+0}' file
With the +0 trick you make it print 0 in case the value is not set.
Explanation
There is no need for grep + awk; awk alone can grep (and do many more things!):
/^Amount/ {} on lines starting with "Amount", perform what is in {}.
amount+=$2 add field 2's value to the counter "amount".
END {print amount+0} after processing the whole file, print the value of amount. Doing +0 makes it print 0 if it wasn't set before.
Note also that there is no need to set 'Amount' as the field separator; the default one (whitespace) suffices.
Test
$ cat a
Amount: 20
Amount: 30.1
$ awk '/^Amount/ {amount+=$2} END {print amount+0}' a
50.1
$ cat b
hello
$ awk '/^Amount/ {amount+=$2} END {print amount+0}' b
0
If your line only contains "Amount: 20" then use #fedorqui's solution, but if it's more like "The quick brown fox had Amount: 20 bananas" then use:
awk -F'Amount:' 'NF==2{sum+=$2} END{print sum+0}' file
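For example, a quick check with the sentence from above:
$ echo 'The quick brown fox had Amount: 20 bananas' | awk -F'Amount:' 'NF==2{sum+=$2} END{print sum+0}'
20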
An awk one-liner:
awk -F 'Amount: ' '/Amount:/{print "1";sum+=$2}!/Amount:/{print "0"}END{print sum}' file
The above awk command prints 1 for lines that contain the string Amount and 0 for lines that don't. Also, if the string Amount is found on a line, it adds the value (column 2) to the sum variable. Finally, the value of sum is printed at the end.
Example:
$ cat file
Amount: 20
Amount: 30.1
foo bar
adbcksjc
sbcskcbks
cnskncsnc
$ awk -F 'Amount: ' '/Amount:/{print "1";sum+=$2}!/Amount:/{print "0"}END{print sum}' file
1
1
0
0
0
0
50.1
