How to read a log file for the last n minutes in Linux

I have a file in the following format, and I want to read the entries from the last n minutes.
2019-09-22T04:00:03.052+0000: 774093.613: [GC (Allocation Failure)
Desired survivor size 47710208 bytes, new threshold 15 (max 15)
[PSYoungGen: 629228K->22591K(650752K)] 1676693K->1075010K(2049024K), 0.0139764 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]
I want to read the log file for the last n minutes, where n is supplied by the user, so that I can monitor the last 30 or 120 minutes as required.
I tried the command below to read the file, but it does not work as expected:
awk -F - -vDT="$(date --date="60 minutes ago" +"%Y-%m-%dT%H:%M:%S")" ' DT > $NF,$0' gc-2019-09-13-04-58.log.0.current
Also, the command above hard-codes "60 minutes ago". I tried to pass the value as a variable, e.g. v1=30 with date --date="$v1 minutes ago", but that did not work either.
How can I read this file for the last x minutes?

Here is one for GNU awk (it uses the time functions and gensub()). First the test data: two records of your data, with the year changed in the first one:
2018-09-22T04:00:03.052+0000: 774093.613: [GC (Allocation Failure)
Desired survivor size 47710208 bytes, new threshold 15 (max 15)
[PSYoungGen: 629228K->22591K(650752K)] 1676693K->1075010K(2049024K), 0.0139764 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]
2019-09-22T04:00:03.052+0000: 774093.613: [GC (Allocation Failure)
Desired survivor size 47710208 bytes, new threshold 15 (max 15)
[PSYoungGen: 629228K->22591K(650752K)] 1676693K->1075010K(2049024K), 0.0139764 secs] [Times: user=0.05 sys=0.00, real=0.01 secs]
and the awk program, to which the data is fed backwards using tac:
$ tac file | gawk '
BEGIN {
    threshold = systime()-10*60*60    # time threshold is set to 10 hrs
    # threshold = systime()-mins*60   # uncomment and replace with above
}                                     # for command line switch
{
    if(match($1,/^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}/)) {
        if( mktime( gensub(/[-T:]/," ","g",substr($1,RSTART,RLENGTH))) < threshold)
            exit                      # exit once first beyond threshold time is found
        print $0 b                    # output current record and the buffer
        b=""                          # reset buffer
    } else                            # for non-time starting records:
        b=ORS $0 b                    # buffer them
}'
You could save the program code between the single quotes to a file, say program.awk, and run it with tac file | gawk -f program.awk. Furthermore, you can add a command line switch by uncommenting the marked line in the BEGIN section and running it with:
$ gawk -v mins=10 -f program.awk <(tac file)
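The question's own idea (compute a cutoff with date and compare in awk) can also be made to work without gawk's time functions. The following is a minimal sketch, not the answer's method: the function name last_minutes is hypothetical, and it assumes GNU date, timestamps written in UTC (the +0000 suffix in the sample), and that every record starts with YYYY-MM-DDTHH:MM:SS. Because timestamps in that form sort as plain strings, a string comparison is enough.

```shell
# Sketch only: assumes GNU date and UTC timestamps of the form
# YYYY-MM-DDTHH:MM:SS at the start of each record's first line.
last_minutes() {
    mins=$1
    file=$2
    # ISO 8601 timestamps sort as plain strings, so no date parsing is needed in awk.
    cutoff=$(date -u -d "$mins minutes ago" +%Y-%m-%dT%H:%M:%S) || return 1
    awk -v cutoff="$cutoff" '
        # A line that starts with a date opens a new record: decide whether to keep it.
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T/ { keep = (substr($0, 1, 19) >= cutoff) }
        # Print kept records together with their continuation lines.
        keep
    ' "$file"
}
```

It would be called as last_minutes 30 gc-2019-09-13-04-58.log.0.current, and the number of minutes can come from any shell variable, e.g. last_minutes "$v1" file.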


Related

What does "file system outputs" mean with time -v?

What is 'file system outputs' counting when using the Linux 'time' command with dd?
It doesn't equal dd 'count' (presumably the number of calls to fwrite?), nor the size of the output in 4096-byte pages (which should be 1024000 in this example).
An example:
> /usr/bin/time -v dd if=/dev/zero of=/tmp/dd.test bs=4M count=1000
1000+0 records in
1000+0 records out
4194304000 bytes (4.2 GB) copied, 4.94305 s, 849 MB/s
Command being timed: "dd if=/dev/zero of=/tmp/dd.test bs=4M count=1000"
User time (seconds): 0.00
System time (seconds): 4.72
Percent of CPU this job got: 95%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:04.94
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 5040
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 1322
Voluntary context switches: 32
Involuntary context switches: 15
Swaps: 0
File system inputs: 240
File system outputs: 8192000
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
The command time is printing out values from the rusage struct (see getrusage(2)).
And according to the source:
/*
 * We approximate number of blocks, because we account bytes only.
 * A 'block' is 512 bytes
 */
static inline unsigned long task_io_get_oublock(const struct task_struct *p)
{
        return p->ioac.write_bytes >> 9;
}
So (at least on Linux) "File system outputs" in time output is the total number of bytes written / 512.
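As a quick check of that claim against the numbers in the question, the byte count reported by dd can be shifted the same way the kernel helper does. This is only an arithmetic illustration; the variable names are made up for the example.

```shell
bytes=4194304000             # bytes copied, as reported by dd in the question
blocks=$(( bytes >> 9 ))     # same shift as task_io_get_oublock(): 512-byte units
echo "$blocks"               # prints 8192000, matching "File system outputs"
```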

How to extract a last block from a file

I have a file containing many blocks like this:
==9673==
==9673== HEAP SUMMARY:
==9673== in use at exit: 0 bytes in 0 blocks
==9673== total heap usage: 75,308 allocs, 75,308 frees, 7,099,382 bytes allocated
==9673==
==9673== All heap blocks were freed -- no leaks are possible
==9673==
==9673== For counts of detected and suppressed errors, rerun with: -v
==9673== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
....
....
....
....
==9655==
==9655== HEAP SUMMARY:
==9655== in use at exit: 0 bytes in 0 blocks
==9655== total heap usage: 75,308 allocs, 75,308 frees, 7,099,382 bytes allocated
==9655==
==9655== All heap blocks were freed -- no leaks are possible
==9655==
==9655== For counts of detected and suppressed errors, rerun with: -v
==9655== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
....
....
....
==9699==
==9699== HEAP SUMMARY:
==9699== in use at exit: 0 bytes in 0 blocks
==9699== total heap usage: 75,308 allocs, 75,308 frees, 7,099,382 bytes allocated
==9699==
==9699== All heap blocks were freed -- no leaks are possible
==9699==
==9699== For counts of detected and suppressed errors, rerun with: -v
==9699== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
I want to extract the last block starting with line:
==XXXX== HEAP SUMMARY:
So in my example I want to extract only the last block:
==9699== HEAP SUMMARY:
==9699== in use at exit: 0 bytes in 0 blocks
==9699== total heap usage: 75,308 allocs, 75,308 frees, 7,099,382 bytes allocated
==9699==
==9699== All heap blocks were freed -- no leaks are possible
==9699==
==9699== For counts of detected and suppressed errors, rerun with: -v
==9699== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
How can I do that with bash?
Using grep -zoP and a negative lookahead regex:
grep -zoP '==\w{4}== HEAP SUMMARY:(?![\s\S]*==\w{4}== HEAP SUMMARY:)[\s\S]*\z' file
==9699== HEAP SUMMARY:
==9699== in use at exit: 0 bytes in 0 blocks
==9699== total heap usage: 75,308 allocs, 75,308 frees, 7,099,382 bytes allocated
==9699==
==9699== All heap blocks were freed -- no leaks are possible
==9699==
==9699== For counts of detected and suppressed errors, rerun with: -v
==9699== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
-z treats the file data as null-terminated instead of newline-terminated, so the whole file is matched as a single string.
(?![\s\S]*==\w{4}== HEAP SUMMARY:) is a negative lookahead that asserts there is no other instance of the same header further down in the file.
If you have tac, this might be the easiest:
$ tac file | awk '1; /==....== HEAP SUMMARY/{exit}' | tac
If you know that the blocks are always the same length, you can simply use tail (9 lines including the leading ==XXXX== separator line, or 8 lines to start at HEAP SUMMARY):
tail -n9 file
With sed:
$ sed -n '/HEAP SUMMARY/{:a;/ERROR SUMMARY/bb;N;ba;:b;$p;d}' infile
==9699== HEAP SUMMARY:
==9699== in use at exit: 0 bytes in 0 blocks
==9699== total heap usage: 75,308 allocs, 75,308 frees, 7,099,382 bytes allocated
==9699==
==9699== All heap blocks were freed -- no leaks are possible
==9699==
==9699== For counts of detected and suppressed errors, rerun with: -v
==9699== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Here is how this works:
sed -n ' # Do not print lines at end of each cycle
/HEAP SUMMARY/ { # If line matches "HEAP SUMMARY"
:a # Label to jump back to
/ERROR SUMMARY/bb # If line matches "ERROR SUMMARY", jump to :b
N # Append next line to pattern space
ba # Jump to :a
:b # Label to jump forward to
$p # If we are on the last line, print pattern space
d # Delete pattern space
}
' infile
Each time this encounters HEAP SUMMARY, it reads all the lines up to the next ERROR SUMMARY into the pattern space. Then, it checks if the last line has been reached; if yes, the pattern space gets printed, if not, it gets deleted.
If the last line of the file also has the block number, this will get the block number fast (no reading of the whole file to find which number that is):
n="$(tail -n1 infile | awk '{print $1}')"
Then we can select all lines that have such block number from the end up:
tac infile | awk -vn="$n" '!($1~n){exit};1'| tac
This might work for you (GNU sed):
sed '/HEAP SUMMARY:/h;//!H;$!d;x' file
When encountering HEAP SUMMARY:, replace whatever is in the hold space (HS) with the current line. Append any other line to the HS. Delete every line except the last; on the last line, the pattern space (PS) is swapped with the HS and then printed.
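The same buffering idea can be written in plain awk, which some readers may find easier to follow than the hold space. This is a sketch under the assumption that the header line always contains "HEAP SUMMARY:"; the function name last_block is hypothetical. Unlike the sed version, it ignores any lines that appear before the first header.

```shell
# Keep only the lines from the last "HEAP SUMMARY:" header to the end of the file.
last_block() {
    awk '
        /HEAP SUMMARY:/ { buf = $0; next }        # a header starts a fresh buffer
        buf != ""       { buf = buf "\n" $0 }     # later lines are appended to it
        END             { if (buf != "") print buf }
    ' "$1"
}
```

It would be called as last_block file.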
Using the number in front of the data as an ID / group number:
id=$(tail -n1 file | grep -Po '(?<=\=\=)[0-9]*') && grep "$id" file |tail -n+2

perf: How to check processes running on a particular CPU

Is there any option in perf to look at the processes running on a particular CPU/core, and what percentage of that core each process takes?
Reference links would be helpful.
perf is intended for profiling, which is not a good fit for your case. You may try sampling /proc/sched_debug (if it is compiled into your kernel). For example, you can check which process is currently running on each CPU:
egrep '^R|cpu#' /proc/sched_debug
cpu#0, 917.276 MHz
R egrep 2614 37730.177313 ...
cpu#1, 917.276 MHz
R bash 2023 218715.010833 ...
Using its PID as a key, you can check how much CPU time, in milliseconds, it has consumed:
grep se.sum_exec_runtime /proc/2023/sched
se.sum_exec_runtime : 279346.058986
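Sampling that counter twice gives a rough CPU percentage for the process over the interval. The sketch below is an illustration only: the helper names runtime_ms and cpu_pct are hypothetical, and it assumes /proc/PID/sched is available on the kernel in use and that the value is in milliseconds, as stated above.

```shell
# Read se.sum_exec_runtime (CPU time in milliseconds) for the given PID.
runtime_ms() {
    awk '$1 == "se.sum_exec_runtime" { print $3 }' "/proc/$1/sched"
}

# CPU share in percent between two samples taken interval_s seconds apart.
# Arguments: ms_before ms_after interval_s
cpu_pct() {
    awk -v a="$1" -v b="$2" -v s="$3" \
        'BEGIN { printf "%.2f\n", (b - a) / (s * 1000) * 100 }'
}

# Example use (not run here):
#   before=$(runtime_ms 2023); sleep 1; after=$(runtime_ms 2023)
#   cpu_pct "$before" "$after" 1
```

This reports the share of one CPU used by the process; it does not say which CPU the time was spent on.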
However, as @BrenoLeitão mentioned, SystemTap is quite useful here. Here is a script for your task:
global cputimes;
global cmdline;
global oncpu;
global NS_PER_SEC = 1000000000;
probe scheduler.cpu_on {
    oncpu[pid()] = local_clock_ns();
}
probe scheduler.cpu_off {
    if(oncpu[pid()] == 0)
        next;
    cmdline[pid()] = cmdline_str();
    cputimes[pid(), cpu()] <<< local_clock_ns() - oncpu[pid()];
    delete oncpu[pid()];
}
probe timer.s(1) {
    printf("%6s %3s %6s %s\n", "PID", "CPU", "PCT", "CMDLINE");
    foreach([pid+, cpu] in cputimes) {
        cpupct = @sum(cputimes[pid, cpu]) * 10000 / NS_PER_SEC;
        printf("%6d %3d %3d.%02d %s\n", pid, cpu,
            cpupct / 100, cpupct % 100, cmdline[pid]);
    }
    delete cputimes;
}
It traces the moments when a process starts running on a CPU and when it stops executing there (due to migration or sleeping) by attaching to the scheduler.cpu_on and scheduler.cpu_off probes. The second probe calculates the time difference between these events and saves it to the cputimes aggregation, along with the process command line.
timer.s(1) fires once per second; it walks over the aggregation and calculates the percentages. Here is sample output for CentOS 7 with bash running an infinite loop:
0 0 100.16
30 1 0.00
51 0 0.00
380 0 0.02 /usr/bin/python -Es /usr/sbin/tuned -l -P
2016 0 0.08 sshd: root@pts/0 "" "" "" ""
2023 1 100.11 -bash
2630 0 0.04 /usr/libexec/systemtap/stapio -R stap_3020c9e7ba76838179be68cd2390a10c_2630 -F3
I understand that perf is not the proper tool for this, although you can limit perf to specific CPUs, using perf record -C <cpulist> or perf stat -C <cpulist>.
The closest you are going to get is the context-switch event, but that will not give you the application names.
I think you will need something more powerful, such as SystemTap.

time command output on an already running process

I have a process that spawns some other processes,
I want to use the time command on a specific process and get the same output as the time command.
Is that possible and how?
I want to use the time command on a specific process and get the same output as the time command.
Probably it is enough just to use pidstat to get user and sys time:
$ pidstat -p 30122 1 4
Linux 2.6.32-431.el6.x86_64 (hostname) 05/15/2014 _x86_64_ (8 CPU)
04:42:28 PM PID %usr %system %guest %CPU CPU Command
04:42:29 PM 30122 706.00 16.00 0.00 722.00 3 has_serverd
04:42:30 PM 30122 714.00 12.00 0.00 726.00 3 has_serverd
04:42:31 PM 30122 714.00 14.00 0.00 728.00 3 has_serverd
04:42:32 PM 30122 708.00 16.00 0.00 724.00 3 has_serverd
Average: 30122 710.50 14.50 0.00 725.00 - has_serverd
If not: according to strace, time uses the wait4 system call (http://linux.die.net/man/2/wait4) to get information about a process from the kernel. getrusage returns the same information, but according to its documentation (http://linux.die.net/man/2/getrusage) you cannot call it for an arbitrary process.
So I do not know of any command that gives the same output. However, it is feasible to write a bash script that takes the PID of the specific process and prints something like the output of time.
The script performs these steps:
1) Get the number of clock ticks per second
getconf CLK_TCK
I assume it is 100 and 1 tick is equal to 10 milliseconds.
2) Then, in a loop, repeat the same sequence of commands while the directory /proc/YOUR-PID exists:
while [ -e "/proc/YOUR-PID" ];
do
    read USER_TIME SYS_TIME REAL_TIME <<< $(cat /proc/YOUR-PID/stat | awk '{print $14, $15, $22;}')
    sleep 0.1
done
Some explanation - according to man proc :
user time: ($14) - utime - Amount of time that this process has been scheduled in user mode, measured in clock ticks
sys time: ($15) - stime - Amount of time that this process has been scheduled in kernel mode, measured in clock ticks
starttime ($22) - The time in jiffies the process started after system boot.
3) When the process is finished get finish time
read FINISH_TIME <<< $(cat '/proc/self/stat' | awk '{print $22;}')
And then output:
real time: ($FINISH_TIME - $REAL_TIME) * 1000 / $(getconf CLK_TCK) - in milliseconds
user time: $USER_TIME * 1000 / $(getconf CLK_TCK) - in milliseconds
sys time: $SYS_TIME * 1000 / $(getconf CLK_TCK) - in milliseconds
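The conversion from clock ticks to milliseconds can be sketched as a small helper. The name ticks_to_ms is hypothetical; the optional second argument exists only so the tick rate can be fixed for illustration, and otherwise the value of getconf CLK_TCK is used.

```shell
# Convert clock ticks to milliseconds.
# Arguments: ticks [ticks_per_second]; the rate defaults to getconf CLK_TCK.
ticks_to_ms() {
    ticks=$1
    clk_tck=${2:-$(getconf CLK_TCK)}
    echo $(( ticks * 1000 / clk_tck ))
}
```

With the assumed rate of 100 ticks per second, ticks_to_ms 988 100 prints 9880, i.e. 9.88 seconds.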
I think it should give roughly the same result as time. One possible problem I see is if the process exists for a very short period of time.
This is my implementation of time:
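#!/bin/bash
# Emulates time(1) for an already running process; takes the PID as its only argument.
# Uses here-strings, so bash is required.

# Print a duration given in milliseconds as "<label> <min>m<sec>.<msec>s".
print_res_ms()
{
    local time_m=$(( $2 / 60000 ))
    local time_s=$(( ($2 - time_m * 60000) / 1000 ))
    local time_ms=$(( $2 - time_m * 60000 - time_s * 1000 ))
    printf "%s\t%dm%d.%03ds\n" "$1" "$time_m" "$time_s" "$time_ms"
}

# Print a duration given in clock ticks (10 ms each) in the same format.
print_res_ticks()
{
    local time_m=$(( $2 / 6000 ))
    local time_s=$(( ($2 - time_m * 6000) / 100 ))
    local time_ms=$(( ($2 - time_m * 6000 - time_s * 100) * 10 ))
    printf "%s\t%dm%d.%03ds\n" "$1" "$time_m" "$time_s" "$time_ms"
}

# The arithmetic in this script assumes 100 clock ticks per second.
if [ "$(getconf CLK_TCK)" != 100 ]; then
    echo "unsupported CLK_TCK value" >&2
    exit 1
fi
if [ $# != 1 ]; then
    echo "usage: $0 PID" >&2
    exit 1
fi
PROC_DIR="/proc/$1"
if [ ! -e "$PROC_DIR" ]; then
    echo "no such process: $1" >&2
    exit 1
fi

USER_TIME=0
SYS_TIME=0
START_TIME=0
# Poll /proc/PID/stat until the process exits, keeping the last values read.
while [ -e "$PROC_DIR" ]; do
    read -r TMP_USER_TIME TMP_SYS_TIME TMP_START_TIME <<< "$(cat "$PROC_DIR/stat" 2>/dev/null | awk '{print $14, $15, $22;}')"
    if [ -e "$PROC_DIR" ]; then
        USER_TIME=$TMP_USER_TIME
        SYS_TIME=$TMP_SYS_TIME
        START_TIME=$TMP_START_TIME
        sleep 0.1
    else
        break
    fi
done

# /proc/self here is the cat process that was just started,
# so its start time approximates "now" in ticks since boot.
read -r FINISH_TIME <<< "$(cat /proc/self/stat | awk '{print $22;}')"
REAL_TIME=$(( (FINISH_TIME - START_TIME) * 10 ))
print_res_ms 'real' "$REAL_TIME"
print_res_ticks 'user' "$USER_TIME"
print_res_ticks 'sys' "$SYS_TIME"
And this is an example that compares my implementation with the real time command:
>time ./sys_intensive > /dev/null
Alarm clock
real 0m10.004s
user 0m9.883s
sys 0m0.034s
In another terminal window I run my_time.sh and give it the PID:
>./my_time.sh `pidof sys_intensive`
real 0m10.010s
user 0m9.780s
sys 0m0.030s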

Extract values from a text file in linux

I have a log file generated by sqlldr, and I was wondering whether I can write a shell script to extract the following values from the log below on Linux. Thanks
Table_name: TEST
Row_load: 100
Row_fail: 10
Date_run: Feb 07, 2014
Table TEST:
100 Rows successfully loaded.
10 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Bind array size not used in direct path.
Column array rows : 5000
Stream buffer bytes: 256000
Read buffer bytes: 1048576
Total logical records skipped: 0
Total logical records read: 14486
Total logical records rejected: 0
Total logical records discarded: 0
Total stream buffers loaded by SQL*Loader main thread: 3
Total stream buffers loaded by SQL*Loader load thread: 0
Run began on Fri Feb 07 12:21:24 2014
Run ended on Fri Feb 07 12:21:31 2014
Elapsed time was: 00:00:06.88
CPU time was: 00:00:00.11
If the structure of your log file is always the same, then you can do the following with awk:
awk '
NR==1 { sub(/:/,x); print "Table_name: "$NF}
NR==2 { print "Row_load: " $1}
NR==3 { print "Row_fail: " $1}
/Run ended/ { print "Date_run: "$5 FS $6","$8}' file
Output
$ awk '
NR==1 { sub(/:/,x); print "Table_name: "$NF}
NR==2 { print "Row_load: " $1}
NR==3 { print "Row_fail: " $1}
/Run ended/ { print "Date_run: "$5 FS $6","$8}' file
Table_name: TEST
Row_load: 100
Row_fail: 10
Date_run: Feb 07,2014
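If the line positions cannot be relied on, a variant that matches on the text of each line instead of the line number may be more forgiving. This is a sketch only, assuming the log lines are worded exactly as in the sample above; the function name extract_sqlldr is hypothetical.

```shell
# Pull the four values out of an sqlldr log by matching line contents.
extract_sqlldr() {
    awk '
        /^Table /                            { sub(/:$/, "", $2); print "Table_name: " $2 }
        /Rows successfully loaded/           { print "Row_load: " $1 }
        /Rows not loaded due to data errors/ { print "Row_fail: " $1 }
        /^Run ended on/                      { print "Date_run: " $5 " " $6 ", " $8 }
    ' "$1"
}
```

It would be called as extract_sqlldr file, and it also puts a space after the comma in the date, as in the requested format.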
