Monitor network usage of a process - Linux

This question might sound fairly repetitive, but there are subtle details which make it a bit different.
I am looking for a simple tool (for Ubuntu/Linux) to monitor network usage such that it gives the min, max, average, and a time-plot of network usage by 1) a single process and 2) the system, only during the time when the process was running. The major requirement is that I am not looking for a GUI-based tool (or a terminal GUI like top); I want this monitoring information to be pushed to a file so that I can perform some post-processing on it.
I came across the following link, which lists various options: http://www.binarytides.com/linux-commands-monitor-network/. However, most of the tools are GUI-based, and the ones that are not do not provide the above information.
Any help would be much appreciated.

Wireshark might work, depending on how far you're willing to relax the non-GUI requirement and whether locating your target processes is simple. Wireshark is of course a GUI app, but the tshark command which comes with it is headless and can be used to capture packets to a file. After capturing all packets on an interface, you can run tshark again on the pcap file to filter the file using Wireshark "Display filters" and extract just the packets for your process. That's one part that may or may not be simple, depending on whether you can identify your process from network traffic content, port(s), or by adding some sentinel dummy data. You'll then have two pcap files, one for the whole network interface and one for just your process.
The capinfos command will report the average throughput. Wireshark can be used to generate a time-plot of the traffic with millisecond (or other) granularity via the menu "Statistics >> IO Graph". As for min and max, you can either eyeball that from the time-plot or use editcap to split the pcap files into chunks, run capinfos on each chunk, and calculate the min and max over all chunks.
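For example, here is a rough sketch of that workflow, assuming the interface is eth0 and that the process's traffic can be isolated by TCP port (5000 is just a placeholder; substitute whatever display filter actually identifies your process):

# capture everything on the interface for the duration of the run
tshark -i eth0 -w all.pcap

# afterwards, extract only your process's packets into a second file
tshark -r all.pcap -Y "tcp.port == 5000" -w proc.pcap

# average throughput for each capture
capinfos all.pcap proc.pcap

# split the per-process capture into 1-second chunks, then report per-chunk
# stats; the min/max over the chunks approximate the min/max throughput
editcap -i 1 proc.pcap chunk.pcap
for f in chunk_*.pcap; do capinfos "$f"; done

Depending on your tshark version, the display-filter option on a saved file may be -R (with -2) rather than -Y.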
That might not be the simplest approach, it's just what occurred to me off the top of my head.

Related


Space Issues in a filesystem on Linux
Let's call it FILESYSTEM1.
Normally, space in FILESYSTEM1 is only about 40-50% used, but clients sometimes run reports or queries that produce massive files, about 4-5 GB in size, and these instantly fill up FILESYSTEM1.
We have some cleanup scripts in place, but they never catch this because it happens in a matter of minutes, and the cleanup scripts usually clean data that is more than 5-7 days old.
Another set of scripts is also in place that reports when free space in a filesystem drops below a certain threshold.
We have thought of possible solutions to detect and act on this proactively:
Increase the FILESYSTEM1 file system to double its size.
Set the threshold in the alert scripts for this filesystem to alert when it is 50% full.
This will hopefully give us enough time to catch the problem and act before the client reports issues due to FILESYSTEM1 being full.
Even though this solution works, it does not seem to be the best way to deal with the situation.
Any suggestions / comments / solutions are welcome.
thanks
It sounds like what you've found is that simple threshold-based monitoring doesn't work well for the usage patterns you're dealing with. I'd suggest something that pairs high-frequency sampling (say, once a minute) with a monitoring tool that can do some kind of regression on your data to predict when space will run out.
In addition to knowing when you've already run out of space, you also need to know whether you're about to run out of space. Several tools can do this, or you can write your own. One existing tool is Zabbix, which has predictive trigger functions that can be used to alert when file system usage seems likely to cross a threshold within a certain period of time. This may be useful in reacting to rapid changes that, left unchecked, would fill the file system.
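To illustrate the idea independently of any particular monitoring product, here is a minimal sketch that samples usage once a minute and does a crude linear projection of time-to-full; the mount point, the 30-minute horizon, and the alert address are all placeholders:

#!/bin/sh
# crude trend watcher for a fast-filling file system
FS=/mnt/filesystem1                                  # placeholder mount point
PREV=$(df -P "$FS" | awk 'NR==2 {print $3}')         # used KB at last sample
while sleep 60; do
    CUR=$(df -P "$FS" | awk 'NR==2 {print $3}')      # used KB now
    AVAIL=$(df -P "$FS" | awk 'NR==2 {print $4}')    # free KB now
    RATE=$((CUR - PREV))                             # KB consumed in the last minute
    # if the current growth rate would fill the remaining space within 30 minutes, alert
    if [ "$RATE" -gt 0 ] && [ $((AVAIL / RATE)) -lt 30 ]; then
        echo "$FS projected to fill within 30 minutes" | mail -s "disk space alert" ops@example.com
    fi
    PREV=$CUR
done

A real monitoring tool will do the regression properly over many samples, but the principle is the same: alert on the trend, not only on the current value.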

High performance packet handling in Linux

I’m working on a packet reshaping project in Linux using the BeagleBone Black. Basically, packets are received on one VLAN, modified, and then are sent out on a different VLAN. This process is bidirectional - the VLANs are not designated as being input-only or output-only. It’s similar to a network bridge, but packets are altered (sometimes fairly significantly) in-transit.
I’ve tried two different methods for accomplishing this:
1. Creating a user space application that opens raw sockets on both interfaces. All packet processing (including bridging) is handled in the application.
2. Setting up a software bridge (using the kernel bridge module) and adding a kernel module that installs a netfilter hook in post routing (NF_BR_POST_ROUTING). All packet processing is handled in the kernel.
The second option appears to be around 4 times faster than the first option. I’d like to understand more about why this is. I’ve tried brainstorming a bit and wondered if there is a substantial performance hit in rapidly switching between kernel and user space, or maybe something about the socket interface is inherently slow?
I think the user application is fairly optimized (for example, I’m using PACKET_MMAP), but it’s possible that it could be optimized further. I ran perf on the application and noticed that it was spending a good deal of time (35%) in v7_flush_kern_dcache_area, so perhaps this is a likely candidate. If there are any other suggestions on common ways to optimize packet processing I can give them a try.
Context switches are expensive, and kernel-to-user-space switches imply a context switch. You can see this article for exact numbers, but the stated durations are all on the order of microseconds.
You can also use lmbench to benchmark the real cost of context switches on your particular CPU.
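For instance, lmbench's lat_ctx benchmark measures context switch latency for a given per-process working-set size and number of processes; the sizes and process counts below are just example parameters:

# context switch latency with 0 KB of per-process state, for 2 to 16 processes
lat_ctx -s 0 2 4 8 16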
The performance of the user space application also depends on the syscall used to monitor the sockets. The fastest is epoll() when you need to handle a lot of sockets; select() will perform very poorly if you handle a lot of sockets.
See this post explaining it:
Why is epoll faster than select?

batch network monitoring per process/socket (Linux, shell)

I am looking for a quick tool to batch-monitor network traffic per socket and/or per process.
I.e., I would like it to measure over a given time interval and emit the accumulated traffic as text output on stdout.
So far I have checked several tools such as iftop, nethogs, iptraf-ng, ifstat, and tcptrack, which offer either nice statistics of the information I am looking for or a batch mode, but I did not find a way to combine the two.
Ideally, it would be something like iftop or nethogs (or iptraf), just with a batch option à la
iftop -i eth# -iterateinsec 60 > nettraf.txt
Is there a way to do so (maybe with the tools I tried and missed its batch feature) or some other ready available tool?
Cheers and thanks,
Thomas
I would suggest using SeaLion, which is very simple to use. There is a default set of commands that will give you all the statistics in a given timeline, and you can also add parameters to it.
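For what it's worth, some of the tools mentioned in the question also have text/batch modes that come close to the hypothetical command above; the exact flags may differ between versions, so check your man pages:

# iftop: text mode, print a single summary after 60 seconds, then quit
iftop -i eth0 -t -s 60 > nettraf.txt

# nethogs: trace mode (plain text per-process output) refreshed every 60 seconds
nethogs -t -d 60 eth0 > nethogs.txt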

How to monitor a process in Linux (CPU, memory, and time)

How can I benchmark a process in Linux? I need something like "top" and "time" put together for a particular process name (it is a multi-process program, so many PIDs will be given).
Moreover, I would like to have a plot over time of memory and CPU usage for these processes, not just the final numbers.
Any ideas?
I typically throw together a simple script for this type of work.
Take a look at the kernel documentation for the proc filesystem (Google 'linux proc.txt').
The first line of /proc/stat (Section 1.8 in proc.txt) will give you cumulative cpu usage stats (i.e. user, nice, system, idle, ...). For each process, the file /proc/$PID/stat (Table 1-4 in proc.txt) will provide you with both process-specific cpu usage stats and memory usage stats (see rss).
If you google a bit you'll find plenty of detailed info on these files, and pointers to libraries / apps / code snippets that can help you obtain / derive the values you need. With that in mind, I'll focus on the high-level strategy.
For CPU stats, use your favorite scripting language to create an executable that takes a set of process ids for monitoring. At a fixed interval (ex: 1 second) poll / calculate the cumulative totals for each process and the system as a whole. During each poll interval, write all results on a single line to stdout.
For memory stats, write a similar script, but simply log the per-process memory usage. Memory is a bit easier as we directly obtain the instantaneous values.
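A minimal sketch of what such loggers might look like (logcpu and logmem are just the illustrative names used in the commands below; the field positions come from proc.txt, and the per-process awk assumes the process name contains no spaces):

#!/bin/sh
# logcpu: once a second, log cumulative system-wide and per-process CPU counters
while sleep 1; do
    line="$(date +%s) $(head -1 /proc/stat)"              # system-wide jiffies
    for pid in "$@"; do
        # fields 14 and 15 of /proc/PID/stat are utime and stime, in jiffies
        line="$line $(awk '{print $14 + $15}' /proc/"$pid"/stat)"
    done
    echo "$line"
done

#!/bin/sh
# logmem: once a second, log per-process RSS (field 24 of /proc/PID/stat, in pages;
# multiply by $(getconf PAGESIZE) to get bytes)
while sleep 1; do
    line="$(date +%s)"
    for pid in "$@"; do
        line="$line $(awk '{print $24}' /proc/"$pid"/stat)"
    done
    echo "$line"
done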
Run these scripts for the duration of your test, passing the set of process ids that you'd like to monitor and redirecting their output to log files.
./logcpu $(pidof foo) $(pidof bar) > cpustats
./logmem $(pidof foo) $(pidof bar) > memstats
Import the contents of these files into a spreadsheet (for certain applications this is as easy as copy / paste). For CPU, you are after instantaneous values but have cumulative values, so you'll need to do some minor spreadsheet work to derive these values (it's just the delta 't(x + 1) - t(x)'). Of course you could have your cpu logger write the delta, but you'll be spending a bit more time up front on the script.
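For example, if the cumulative per-process counter ends up in column 2 of cpustats (a hypothetical column position), an awk one-liner can emit the per-interval deltas instead of a spreadsheet formula:

# print the difference between successive values of column 2
awk 'NR > 1 {print $2 - prev} {prev = $2}' cpustats > cpudeltas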
Finally, use your spreadsheet to generate a nice plot.
The following are tools for monitoring a Linux system:
System commands like top, free -m, vmstat, iostat, iotop, sar, netstat, etc. Nothing comes near these Linux utilities when you are debugging a problem; they give you a clear picture of what is going on inside your server.
SeaLion: an agent executes all the commands mentioned in #1 (also user-defined ones), and the outputs of these commands can be accessed in a beautiful web interface. This tool comes in handy when you are debugging across hundreds of servers, as installation is simple. And it's free.
Nagios: the mother of all monitoring/alerting tools. It is very customizable but quite difficult to set up for beginners. There is a set of tools called Nagios plugins that covers pretty much all the important Linux metrics.
Munin
Server Density: a cloud-based paid service that collects important Linux metrics and gives users the ability to write their own plugins.
New Relic: another well-known hosted monitoring service.
Zabbix

How to do like "netstat -p", but faster?

Both "netstat -p" and "lsof -n -i -P" seems to readlinking all processes fd's, like stat /proc/*/fd/*.
How to do it more efficiently?
My program wants to know what process is connecting to it. Traversing all processes again and again seems too ineffective.
Ways suggesting iptables things or kernel patches are welcome too.
Take a look at this answer, where various methods and programs that perform socket to process mappings are mentioned. You might also try several additional techniques to improve performance:
Caching the file descriptors in /proc, and the information in /proc/net. This is done by the programs mentioned in the linked answer, but is only viable if your process lasts more than a few seconds.
You might try getpeername(), but this relies on you knowing the possible endpoints and what processes they map to. Your question suggests that you are connecting sockets locally; you might try using Unix sockets, which allow you to receive the credentials of a peer when exchanging messages by passing SO_PASSCRED to setsockopt(). Take a look at these examples (they're pretty nasty, but the best I could find).
http://www.lst.de/~okir/blackhats/node121.html
http://www.zanshu.com/ebook/44_secure-programming-cookbook-for-c-and-cpp/0596003943_secureprgckbk-chp-9-sect-8.html
Take a look at fs/proc/base.c in the Linux kernel. This is the heart of the information given by the result of a readlink on a file descriptor in /proc/PID/fd/FD. A significant part of the overhead is the passing of requests up and down the VFS layer, the numerous locks taken on all the kernel data structures that provide the information, and the stringifying and destringifying at the kernel's end and your end respectively. You might adapt some of the code in this file to generate this information without many of the intermediate layers, in particular minimizing the locking to once per process, or simply once per scan of the entire data set you're after.
My personal recommendation is to just brute force it for now, ideally traversing the processes in /proc in reverse numerical order, as the more recent and interesting processes will have higher PIDs, and returning as soon as you've located the results you're after. Doing this once per incoming connection is relatively cheap; it really depends on how performance-critical your application is. You'll definitely find it worthwhile to bypass calling netstat and directly parse the new connection from /proc/net/PROTO, then locate the socket in /proc/PID/fd. If all your traffic is localhost, just switch to Unix sockets and get the credentials directly. Writing a new syscall or proc module that dumps huge amounts of data regarding file descriptors is something I'd save for last.
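A rough sketch of that brute-force lookup in shell, assuming a TCP connection that can be identified by its local port (12345 is a placeholder) and that exactly one connection matches:

# find the socket inode for the connection on local port 12345
PORT_HEX=$(printf '%04X' 12345)
INODE=$(awk -v p=":$PORT_HEX" '$2 ~ p {print $10}' /proc/net/tcp)

# locate the process holding that socket (the readlink calls are the expensive part)
for fd in /proc/[0-9]*/fd/*; do
    if [ "$(readlink "$fd" 2>/dev/null)" = "socket:[$INODE]" ]; then
        echo "owned by PID $(echo "$fd" | cut -d/ -f3)"
        break
    fi
done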
