Save "perf stat" output per second to a file - linux

I want to get perf output and analyse it. I used
while (true) {
    system("sudo perf kvm stat -e r10f4 -a sleep 1 2>&1 | sed '1,3d' | sed '2,$d' > 'system.log' 2>&1");
    sleep(0.5);
}
The code above invokes perf repeatedly, which is costly. Instead, I'm running perf stat like: perf kvm stat -a -I 1000 > system.log 2>&1.
This command keeps writing data to "system.log", but I only need the data from the most recent second.
I'm wondering how to let the new data overwrite the older data every second.
Does anybody know if this is feasible, or another way to solve my problem?

You mean have perf keep overwriting, instead of appending, so you only have the final second?
One way would be to have your own program pipe perf stat -I 1000 into itself (e.g. via C stdio popen, or with POSIX pipe / dup2 / fork/exec). Then you get to implement the logic that chooses where and how to write the lines to a file. Instead of just writing them normally, you can lseek to the start of the output file before every write, or use pwrite instead of write to always write at file position 0. You can pad each line with spaces out to a fixed width, so a longer previous line doesn't leave stale characters in the file that you're not overwriting. (Or ftruncate the output file after writing a line shorter than the previous one.)
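A rough, untested sketch of that approach in C (error handling mostly omitted; it assumes the event and log-file name from the question, and that perf has the privileges it needs):

// Sketch: read perf's interval output via popen() and keep only the latest
// line at the start of system.log, padded so a shorter line fully
// overwrites a longer one.
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

int main(void) {
    FILE *p = popen("perf kvm stat -e r10f4 -a -I 1000 2>&1", "r");
    int out = open("system.log", O_WRONLY | O_CREAT | O_TRUNC, 0666);
    char line[256], padded[512];
    if (!p || out < 0)
        return 1;

    while (fgets(line, sizeof line, p)) {
        line[strcspn(line, "\n")] = '\0';              // drop the trailing newline
        int n = snprintf(padded, sizeof padded, "%-200s\n", line);
        pwrite(out, padded, n, 0);                     // always write at offset 0
    }
    pclose(p);
    close(out);
    return 0;
}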
Or: redirect perf stat -I 1000 >> file.log with O_APPEND, and periodically truncate the length.
A file opened for append will automatically write at wherever the current end is, so you can leave perf ... -I 1000 running and truncate the file every second or every 5 seconds or so. So at most you have 5 lines to read through to get to the final one you actually want. If using system to run it through a shell, use >>. Or use open(O_APPEND)/dup2 if using fork / execve.
To do the actual truncation, call truncate() by path, or ftruncate() on an open fd, in a sleep loop. Ideally you'd truncate right before a new line was about to be written, so that most of the time there'd be a line present. But unless you make extra system calls like fstat or inotify, you won't know exactly when to expect the next one, although dead reckoning, and assuming that sleep sleeps for only the minimum time, can work OK.
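A rough sketch of that truncation loop (untested; it assumes perf was started separately with its output appended to system.log, e.g. via a shell >> redirect):

// Sketch: perf is already appending to system.log, e.g. started with
//   perf kvm stat -e r10f4 -a -I 1000 >> system.log 2>&1 &
// Periodically throw away what has accumulated so the file stays tiny.
#include <stdio.h>
#include <unistd.h>

int main(void) {
    for (;;) {
        sleep(5);                              // let a few interval lines accumulate
        if (truncate("system.log", 0) != 0)    // drop everything written so far
            perror("truncate system.log");
        // The O_APPEND writer keeps appending at the new end of file, so a
        // reader sees at most the lines written since this truncation.
    }
}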
Or: redirect perf stat -I 1000 (without append), and lseek its fd from the parent process, to create the same effect as piping through a process / thread that seeks before write.
Why does forking my process cause the file to be read infinitely shows that parent/child processes with file descriptors that share the same open file description can influence each other's file position with lseek.
This only works if you do the redirect yourself with open / dup2, not if you leave it to a shell via system: you need your own descriptor that refers to the same open file description.
You can then play tricks like lseek on the same fd that the child process has open; avoiding O_APPEND lets you set the write position back to 0 without truncating, so the next write overwrites the existing contents. But that only works if you open / dup2 / fork/exec yourself, so the fd in the parent has its file position tied to the fd in the child process.
Totally untested example
This is untested, but it should compile with the usual headers and hopefully illustrates the idea.
// must *not* use O_APPEND, otherwise the file position is ignored for write()
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    int fd = open("system.log", O_WRONLY|O_CREAT|O_TRUNC, 0666);
    pid_t pid;
    if ((pid = fork()) != 0) {
        // parent
        while (1) {
            sleep(1);
            lseek(fd, 0, SEEK_SET);  // reset the position at which perf will do its next write
        }
    } else {
        // child
        dup2(fd, 1);   // redirect stdout to the open file
        // TODO: look up best practices for fork/exec, and error checking
        // note: argv[0] must be repeated, and -I 1000 keeps perf running
        execlp("sudo", "sudo", "perf", "kvm", "stat", "-e", "r10f4", "-a", "-I", "1000", (char *)NULL);
        perror("execlp sudo perf failed");
    }
}
Note the total lack of error checking of return values. This is more like pseudocode, even though it's written in (hopefully) valid C syntax.
And BTW, you can reduce that pipeline to one sed command, just sed -n 4p to print line 4 but not other lines. But perf ... stat -I 1000 doesn't waste lines so there'd be no need to filter on line numbers that way.

Related

Set line maximum of log file

Currently I write a simple logger to log messages from my bash script. The logger works fine; I simply write the date plus the message to the log file. Since the log file will keep growing, I would like to limit it to, for example, 1000 lines. After reaching 1000 lines it shouldn't delete or totally clear the log file; it should remove the first line and replace it with the new log line, so the file keeps 1000 lines and doesn't grow further. The latest line should always be at the top of the file. Is there any built-in method, or how else could I solve this?
Why would you want to replace the first line with the new message, thereby causing a jump in the order of messages in your log file, instead of just deleting the first line and appending the new message? E.g., simplistically:
log() {
    tail -999 logfile > tmp &&
    { cat tmp && printf '%s\n' "$*"; } > logfile
}

log "new message"
You don't even need a tmp file if your log file always contains short lines: just save the output of the tail in a variable and printf that too.
Note that unlike a sed -i solution, the above will not change the inode, hardlinks, permissions or anything else for logfile: it's the same file you started with, just with updated content; it's not getting replaced with a new file.
Your chosen example may not be the best. As the comments have already pointed out, logrotate is the best tool to keep log file sizes at bay; furthermore, a line is not the best unit to measure size. Those commenters are both right.
However, I take your question at face value and answer it.
You can achieve what you want with shell builtins, but it is much faster and simpler to use an external tool like sed. (awk is another option, but it lacks the -i switch, which simplifies your life in this case.)
So, suppose your file already exists and is named script.log; then
maxlines=1000
log_msg='Whatever the log message is'
sed -i -e1i"\\$log_msg" -e$((maxlines))',$d' script.log
does what you want.
-i means modify the given file in place.
-e1i"\\$log_msg" means insert $log_msg before the first (1) line.
-e$((maxlines))',$d' means delete each line from line number $((maxlines)) to the last one ($).

Shell script to trigger a command on every new line added to a file?

Is it possible to trigger a command on every new line added to a file?
For example: I have a log file, say maillog. I want to get every new entry in the log file as a mail.
If a new entry like "Mail Sent" is added to the maillog file, then my script should grep the new entry and send me a mail with the entry (data).
I know it's crazy, but I want to automate my Linux box with these kinds of things.
Regards,
Not so crazy. Check the file periodically (once per hour, once per day, whatever you like) for new parts by storing the original length of the file and comparing it to the current length; in case it has grown, handle the part which was appended:
length=0
while sleep 3600   # use wanted delay here
do
    new_length=$(find "$file" -printf "%s")
    if [ "$length" -lt "$new_length" ]
    then
        tail --bytes=$((new_length-length)) "$file" | handle_part
    fi
    length=$new_length
done
Now you only have to write that handle_part function which could for instance mail its input somewhere.
Doing it this way (instead of the obvious tail -f) has the advantage that you can store the current length in a file and, when restarting your script later, read that length again. So you won't process the whole file again after a restart of your script (e.g. due to a machine reboot).
If you want a faster response, you could have a look at inotify, which is a Linux facility for monitoring file actions, so that the polling could be replaced.
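For instance, a minimal untested C sketch of the inotify idea (it assumes the log file is ./maillog and leaves the actual mailing to the handling code):

// Sketch: block on inotify instead of polling the file size (Linux-only).
#include <stdio.h>
#include <sys/inotify.h>
#include <unistd.h>

int main(void) {
    char buf[4096];
    int fd = inotify_init();
    if (fd < 0 || inotify_add_watch(fd, "maillog", IN_MODIFY) < 0) {
        perror("inotify");
        return 1;
    }
    for (;;) {
        // Blocks until maillog is modified; no periodic wake-ups needed.
        if (read(fd, buf, sizeof buf) <= 0)
            break;
        // New data has been appended: read the tail of the file (e.g. using
        // the stored length, as in the loop above) and hand it to the mailer.
        printf("maillog was modified\n");
    }
    close(fd);
    return 0;
}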
Use tail -f, which watches a file and sends whatever is appended to it to stdout. If you have a script that performs the desired action, say mail_per_line, then you can set it up as
tail -f maillog | mail_per_line
In this case, mail_per_line runs once and gets all the lines. If you want to spawn a separate process each time a line comes in, use the shell built-in read:
tail -f maillog | while IFS='' read -r line; do
    send_a_message "$line"
done
To counter the effect described by Alfe, that a restart of this program will cause all the previous logs to be processed again, consider using logrotate.

Shell script to nullify a file every time it reaches a specific size

I am in the middle of writing a shell script to nullify/truncate a file when it reaches a certain size. The file is also being opened/written by a process the whole time. Now, every time I nullify the file, will the file pointer be repositioned to the start of the file, or will it remain at its previous position? Is there a way to reset the file pointer once the file has been truncated?
The position of the file pointer depends on how the file was opened by the process that has it open. If it was opened in append mode, then truncating the file means that new data will be written at the end of the file, which is also the beginning the first time it writes after the file is truncated. If it was not opened in append mode, then truncating the file simply means that there is a series of virtual zero bytes at the start of the file, but the real data will continue to be written at the same point where the last write finished. If the file is being reopened by the other process, rather than being held open, then roughly the same rules apply, but there is a better chance that the file will be written at the beginning. It all depends on how the first process to write to the file after the truncation is managing its file pointer.
You can't reset the file pointer of another process, AFAIK.
A cron job or something like this will do the task; it will find every file bigger than 4096 bytes and then nullify it:
$ find . -type f -size +4096c -print0 | while IFS= read -r -d $'\0' line; do cat /dev/null > "$line"; done

Empty a file while in use in Linux

I'm trying to empty a file in Linux while it is in use; it's a log file, so it is continuously being written to.
Right now I've used:
echo -n > filename
or
cat /dev/null > filename
but all of these produce an empty file with a newline character (or strange characters that I see as ^#^#^#^#^#^#^#^#^#^#^#^.. in vi), and I have to manually remove the first line with vi and dd and then save.
If I don't use vi and dd, I'm not able to manipulate the file with grep, but I need an automatic procedure that I can write in a shell script.
Ideas?
This should be enough to empty a file:
> file
However, the other methods you said you tried should also work. If you're seeing weird characters, then they are being written to the file by something else - most probably whatever process is logging there.
What's going on is fairly simple: you are emptying out the file.
Why is it full of ^#s, then, you ask? Well, in a very real sense, it is not. It does not contain those weird characters. It has a "hole".
The program that is writing to the file is writing a file that was opened with O_WRONLY (or perhaps O_RDWR) but not O_APPEND. This program has written, say, 65536 bytes into the file at the point when you empty out the file with cp /dev/null filename or : > filename or some similar command.
Now the program goes to write another chunk of data (say, 4096 or 8192 bytes). Where will that data be written? The answer is: "at the current seek offset on the underlying file descriptor". If the program used O_APPEND the write would be, in effect, preceded by an lseek call that did a "seek to current end-of-file, i.e., current length of file". When you truncate the file that "current end of file" would become zero (the file becoming empty) so the seek would move the write offset to position 0 and the write would go there. But the program did not use O_APPEND, so there is no pre-write "reposition" operation, and the data bytes are written at the current offset (which, again, we've claimed to be 65536 above).
You now have a file that has no data in byte offsets 0 through 65535 inclusive, followed by some data in byte offsets 65536 through 73727 (assuming the write writes 8192 bytes). That "missing" data is the "hole" in the file. When some other program goes to read the file, the OS pretends there is data there: all-zero-byte data.
If the program doing the write operations does not do them on block boundaries, the OS will in fact allocate some extra data (to fit the write into whole blocks) and zero it out. Those zero bytes are not part of the "hole" (they're real zero bytes in the file) but to ordinary programs that do not peek behind the curtain at the Wizard of Oz, the "hole" zero-bytes and the "non-hole" zero bytes are indistinguishable.
What you need to do is to modify the program to use O_APPEND, or to use library routines like syslog that know how to cooperate with log-rotation operations, or perhaps both.
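As a hypothetical little demo of the difference: compile and run this, truncate demo.log from another shell (e.g. with : > demo.log) while it sleeps, then look at the file with od -c, once as written below and once with O_APPEND added to the open() flags.

// Demo: why truncating a log leaves a "hole" when the writer did not use O_APPEND.
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    int fd = open("demo.log", O_WRONLY | O_CREAT, 0666);   // note: no O_APPEND
    const char *line = "some log data\n";
    write(fd, line, strlen(line));   // file offset is now strlen(line)
    sleep(15);                       // truncate demo.log from another terminal now
    // Without O_APPEND this write lands at the old offset, leaving a hole of
    // zero bytes at the start of the file; with O_APPEND it would land at the
    // new end of file (offset 0 right after the truncation).
    write(fd, line, strlen(line));
    return 0;
}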
[Edit to add: not sure why this suddenly showed up on the front page and I answered a question from 2011...]
Another way is the following:
cp /dev/null the_file
The advantage of this technique is that it is a single command, so in case it needs sudo access only one sudo call is required.
Why not just :>filename?
(: is a bash builtin with the same effect as /bin/true; neither command echoes anything)
Proof that it works:
fg#erwin ~ $ du t.txt
4 t.txt
fg#erwin ~ $ :>t.txt
fg#erwin ~ $ du t.txt
0 t.txt
If it's a log file then the proper way to do this is to use logrotate. As you mentioned, doing it manually does not work.
I don't have a Linux shell here to try it, but have you tried this?
echo "" > file

What does dup2 actually do in this case?

I need some clarification here:
I have some code like this:
child_map[0] = fileno(fd[0]);
..
pid = fork();
if (pid == 0)
    /* child process */
    dup2(child_map[0], STDIN_FILENO);
Now, will STDIN_FILENO and child_map[0] point to the same file descriptor? Will future input be taken from the file pointed to by child_map[0] and STDIN_FILENO?
I thought STDIN_FILENO meant standard output (the terminal).
After the dup2(), child_map[0] and STDIN_FILENO will continue to be separate file descriptors, but they will refer to the same open file description. That means that if, for example, child_map[0] == 5 and STDIN_FILENO == 0, then both file descriptor 5 and 0 will remain open after the dup2().
Referring to the same open file description means that the file descriptors are interchangeable - they share attributes like the current file offset. If you perform an lseek() on one file descriptor, the current file offset is changed for both.
To close the open file description, all file descriptors that point to it must be closed.
It is common to execute close(child_map[0]) after the dup2(), which leaves only one file descriptor open to the file.
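A small sketch of that usual pattern (hypothetical names; input.txt stands in for whatever child_map[0] actually refers to):

// Sketch: redirect stdin to a file with dup2(), then close the original fd.
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(void) {
    int child_map0 = open("input.txt", O_RDONLY);
    // fd 0 and child_map0 now refer to the same open file description
    dup2(child_map0, STDIN_FILENO);
    close(child_map0);               // leave only stdin pointing at the file
    char buf[128];
    // Reads from stdin now come from input.txt; the shared file offset advances.
    ssize_t n = read(STDIN_FILENO, buf, sizeof buf);
    printf("read %zd bytes via stdin\n", n);
    return 0;
}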
It causes all functions which read from stdin to get their data from the specified file descriptor, instead of the parent's stdin (often a terminal, but could be a file or pipe depending on shell redirection).
In fact, this is how a shell would launch processes with redirected input.
e.g.
cat somefile | uniq
uniq's standard input is bound to a pipe, not the terminal.
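For example, here's a rough, untested sketch of how a shell might wire up the uniq end of that pipeline with pipe() / fork() / dup2() (minimal error checking; the parent just plays the role of cat somefile):

// Sketch: run "uniq" with its stdin connected to the read end of a pipe,
// roughly what the shell does for "cat somefile | uniq".
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    int pfd[2];
    pipe(pfd);                        // pfd[0] = read end, pfd[1] = write end
    if (fork() == 0) {
        // child: make the pipe's read end its stdin, then run uniq
        dup2(pfd[0], STDIN_FILENO);
        close(pfd[0]);
        close(pfd[1]);                // don't hold the write end open here
        execlp("uniq", "uniq", (char *)NULL);
        perror("execlp uniq");
        _exit(1);
    }
    // parent: write a few lines into the pipe, as "cat somefile" would
    close(pfd[0]);
    write(pfd[1], "a\na\nb\n", 6);
    close(pfd[1]);                    // EOF so uniq can finish
    wait(NULL);
    return 0;
}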
STDIN_FILENO is stdin, not stdout. (There's a STDOUT_FILENO too.) Traditionally the former is 0 and the latter 1.
This code is using dup2() to redirect the child's stdin from another file descriptor that the parent had opened. (It is in fact the same basic mechanism used for redirection in shells.) What usually happens afterward is that some other program that reads from its stdin is execed, so the code has set up its stdin for that.

Resources