How To Log CPU, Memory, Bandwidth for a process on Linux? - linux

I am looking to log the CPU, Memory, Bandwidth for a process running on Linux. Ultimately the data will be stored in a database, but my main question is how to get access to this data to log in the first place.
My initial thought is to use the top command, and parse the data I need.
Can you think of a better way?

Check out the /proc pseudo filesystem—you can read the files in there from scripts, compiled programs, anywhere.
I have already implemented a similar system and used 'sar' extensively, parsing the output using 'awk', however 'perl', 'python', or any such would work just as well. I made each of these scripts output CSV and then bulk-loaded the CSVs into MySQL for later querying/charting via PHP.

You may use ps for CPU and Memory, something like:
ps -eo pid,user,args,%mem,%cpu
and then parse output.

The kernel can be configured to do this with "process accounting". A web search for that and "linux" will find a wealth of information on how to set it up.

Related

Accessing system performance data directly from the linux kernel

I need to write an application that gets performance statistics on a Linux machine. Unfortunately the environment is extremely memory constrained and so using the standard command line tools isn't really an option as I would need to poll them pretty frequently.
Ideally what I would like to be able to do would be to get the performance data directly from the kernel itself, using the same buffers and data that it uses to try and reduce the RAM requirements for my application as much as possible. Tying my app to the Linux kernel so closely isn't really a problem we have only ever used Linux in production and I can't see that ever changing really.
I've spent the last day or two looking through the kernel source but I have to admit to being somewhat lost. Can anyone point me to the right place for getting access to CPU performance information / I/O performance information / networking performance information and bandwidth usage information please?
I think there are several files under /proc, such as /proc/stat, /proc/diskstats, /proc/net/*.
For CPU performance information, using /proc/stat, the file format is defined in the file ./fs/proc/stat.c in Linux Kernel source code tree.
For disk access information, using /proc/diskstats, the file format is defined in the file ./block/genhd.c in Linux Kernel source code tree, the function is diskstats_show().
For network related statistics, one can refer to files under /proc/net/. But I don't know how to calculate the bandwidth usage based on file under directory /proc/net.

Can you load a tree structure in memory with Linux shell?

I want to create an application with a Linux shell script like this — but can it be done?
This application will create a tree containing data. The tree should be loaded in the memory. The tree (loaded in memory) could be readable from any other external Linux script.
Is it possible to do it with a Linux shell?
If yes, how can you do it?
And are there any simple examples for that?
There are a large number of misconceptions on display in the question.
Each process normally has its own memory; there's no trivial way to load 'the tree' into one process's memory and make it available to all other processes. You might devise a system of related programs that know about a shared memory segment (somehow — there's a problem right there) that contains the tree, but that's about it. They'd be special programs, not general shell scripts. That doesn't meet your 'any other external Linux script' requirement.
What you're seeking is simply not available in the Linux shell infrastructure. That answers your first question; the other two are moot given the answer to the first.
There is a related discussion here. They use shared memory device /dev/shm and, ostensibly, it works for multiple users. At least, it's worth a try:
http://www.linuxquestions.org/questions/linux-newbie-8/bash-is-it-possible-to-write-to-memory-rather-than-a-file-671891/
Edit: just tried it with two users on Ubuntu - looks like a normal directory and REALLY WORKS with the right chmod.
See also:
http://www.cyberciti.biz/tips/what-is-devshm-and-its-practical-usage.html
I don't think there is a way to do this as if you want to keep all the requirements of:
Building this as a shell script
In-memory
Usable across terminals / from external scripts
You would have to give up at least one requirement:
Give up shell script req - Build this in C to run as a Linux process. I only understand this up to the point to say that it would be non-trivial
Give up in-memory req - You can serialize the tree and keep the data in a temp file. This works as long as the file is small and performance bottleneck isn't around access to the tree. The good news is you can use the data across terminals / from external scripts
Give up usability from external scripts req - You can technically build a script and run it by sourcing it to add many (read: a mess of) variables representing the tree into your current shell session.
None of these alternatives are great, but if you had to go with one, number 2 is probably the least problematic.

Disk failure detection perl script

I need to write a script to check the disk every minute and report if it is failing by any reason. The error could be the absolute disk failure and a bad sector and so on .
First, I wonder if there is any script out there that does the same as it should be a standard procedure (because I really do not want to reinvent the wheel).
Second, I wonder if I want to look for errors in /var/log/messages, is there any list of standard error strings for disks that I can use?
I look for that on the net a lot, there are lots of info and at the same time no info about that.
Any help will be much appreciated.
Thanks,
You could simply parse the output of dmesg which usually reports fairly detailed information about drive errors, well that's how I've collected stats on failing drives before.
You might get better more well documented information by using Parse::Syslog or lower level kernel reporting directly though.
Logwatch does the /var/log/messages part of the ordeal (as well as any other logfiles that you choose to add). You can either choose to use that, or to use its code to roll your own sollution (it's all written in perl).
If your harddrives support SMART, i suggest you use smartctl output for diagnostics as it includes a lot of nice info that can be monitored over time to detect failure.

How do I measure net used disk space change due to activity by a given process in Linux?

I'd like to monitor disk space requirements of a running process. Ideally, I want to be able to point to a process and find out the net change in used disk space attributable to it. Is there an easy way of doing this in Linux? (I'm pretty sure it would be feasible, though maybe not very easy, to do this in Solaris with DTrace)
Probably you'll have to ptrace it (or get strace to do it for you and parse the output), and then try to work out what disc is being used.
This is nontrivial, as your tracing process will need to understand which file operations use disc space - and be free of race conditions. However, you might be able to do an approximation.
Quite a lot of things can use up disc space, because most Linux filesystems support "holes". I suppose you could count holes as well for accounting purposes.
Another problem is knowing what filesystem operations free up disc space - for example, opening a file for writing may, in some cases, truncate it. This clearly frees up space. Likewise, renaming a file can free up space if it's renamed over an existing file.
Another issue is processes which invoke helper processes to do stuff - for example if myprog does a system("rm -rf somedir").
Also it's somewhat difficult to know when a file has been completely deleted, as it might be deleted from the filesystem but still open by another process.
Happy hacking :)
If you know the PID of the process to monitor, you'll find plenty of information about it in /proc/<PID>.
The file /proc/<PID>/io contains statistics about bytes read and written by the process, it should be what you are seeking for.
Moreover, in /proc/<PID>/fd/ you'll find links to all the files opened by your process, so you could monitor them.
there is Dtrace for linux is available
http://librenix.com/?inode=13584
Ashitosh

Logging frameworks for embedded linux?

I need a small, portable framework for logging on embedded linux. Ideally it would output to a file or a socket, and having some sort of log rotation/compression would also be nice.
So far, I've found a lot of frameworks, but almost all of them have daunting build procedures or require the use of application frameworks (e.g. log4cxx requires the Apache Portable Runtime, which I'd rather not bother with...).
Just looking for something simple and robust, but everything I seem to find is complicated or requires lots of secondary junk just to run.
Suggestions? (and if the answer is roll my own, that's fine, but...it's be great to avoid that)
Use syslog(3) and syslogd from BusyBox. BusyBox can be very compact when stripped down and doesn't depend on anything other than libc. You can strip out everything you don't want so it is perfectly possible to use it only for logging.
We use BusyBox on a number of embedded systems, both Linux and uClinux, and find its logging facilities highly reliable.
I have no experience with the log4cxx-module but I am using APR on an embedded target running Linux (it is based on the Atmel AT91SAM926x processor family). It was really simple to configure and compile (more or less ./configure --host=arm-none-linux-gnueabi) so I would not be to afraid of going down the log4cxx-path.
Maybe you should consider spending some time on a good logging framework, since this is what you are going to use on your embedded Linux. ... and printf ...
I cooked something where I can enable/disable various logging levels per module in runtime.
Did you ever try debugging multithreaded apps on Linux?
Good luck!
Implementing very robust logging mechanism in C taking about 1000 code lines (from our code base). 90% of this defines of different sections. This includes different macros DBG_E DBG_W DBG_TRACE etc ... and spliting to the section, run time changing of debug level and debug modules (does not include compression just simple print abstraction that can be implemented in different ways file/socket/serial etc...) .
I will estimate that it take about few days to implement. The down side you will spend a few days the up side that you will get something that works for your needs and nothing more, i understand that you are working on embedded platform and footprint and memory usage are important, the best and optimized solution will be one you write. We invested those few days ones. and using it across different products/project and adjust/improve with the time past according to real needs. Main problem of generic solution that it usually will do sort of what you need and a lot more, this more usually just waist of resources.
I can't imagine that your platform is too small to include log4cxx and APR, neither is a large library, and even the tiniest platform is likely to have space for them.
You could just use syslog, which is provided by the C library - a syslog daemon is provided by busybox (which no doubt, you already use if you're on a really tiny platform). I don't know if busybox's syslogd can log to the network, but it has some level of flexibility. You can do log rotation using shell scripts pretty trivially.
Use klogd it reads the kernel log messages(from /proc/kmsg kernel) interface and redirect those messages to appropriate directory. you can use user configurable syslogd daemon along with klogd that will redirect kernel messages into appropriate files in /var/log/ directory.
For instance logs related to mail service will be stored in /var/log/main.log and logs related to kernel booting process will be stored in /var/log/boot.log . User can configure log parsing using syslogd configuration file.
But the use of syslogd may lead to your system performance degradation because for every log messages syslog daemon will do disk operation to store that log into appropriate file
Log sequence
Messages from kernel
---> klogd ( access messages from kernel ring buffer)-->syslogd --> /var/log/*

Resources