Batch network monitoring per process/socket (Linux, shell)

I am looking for a quick tool to batch-monitor network traffic per socket and/or per process.
That is, I would like to sample over a given period and get the per-interval traffic as text output on stdout.
I have checked several tools so far, such as iftop, nethogs, iptraf-ng, ifstat, and tcptrack. They offer either nice statistics with the information I am looking for or a batch mode, but I did not find a way to combine the two.
Ideally, it would be something like iftop or nethogs (or iptraf), just with a batch option,
à la
iftop -i eth# -iterateinsec 60 > nettraf.txt
Is there a way to do so (maybe with the tools I tried, and I missed their batch feature), or is there some other readily available tool?
Cheers and thanks,
Thomas

I would suggest using SeaLion, which is very simple to use. It has a default set of commands that give you all the statistics on a timeline. You can also add parameters to the commands.

Related

Creating a time series with top and Node.js

I would like to create a time series and inject it into InfluxDB for a demo. I thought about using the top command (top -pid 1393 -stats cpu) and taking the CPU value, then using Node.js to extract the data and inject it into InfluxDB. However, there are a couple of catches:
1- The top command has a display section: can it be removed?
2- In Node, I would repeatedly call "top -pid 1393 -stats cpu -l 1", with the "-l 1" option to get only a single sample. I feel this misuses top, which already generates data at given intervals (basically, I would be recreating in Node what top does automatically).
Is there a better way to do this? In an ideal world, I would launch top from Node and "pipe" its output stream to a variable asynchronously (to execute the insertion into InfluxDB).
Thanks for any hints you may have.
Christian
In fact, there is a Node module for this: see npm usage.
Note that you may have an issue when installing the usage module, caused by node-gyp rebuild (usage requires node-gyp, which requires Xcode on the Mac platform; see https://github.com/nodejs/node-gyp). To solve this, look here: xcode-select active developer directory error.
Thanks - C
To monitor a process's resource-consumption metrics (correct me if I misread your intentions), you don't need the Node.js part at all.
All you need is a running Telegraf agent, with this plugin configured, on the machine(s) you're targeting.
Point the output plugin to your InfluxDB instance, and that's it.
As mentioned in other comments, there are specific tools for that task. Anyway, if you still want to do it programmatically, I would suggest either:
running top in non-interactive mode, as explained here: https://unix.stackexchange.com/questions/255100/get-top-output-for-non-interactive-shell
reading the information you need directly from /proc/<pid of your process>/<file of interest>
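As a rough illustration of the /proc route, here is a minimal Python sketch (Linux only) that pulls the cumulative user and system CPU time of one process out of /proc/<pid>/stat. The function names are made up for this example; the field positions (14 and 15) are the ones documented in man proc.

```python
import os

def parse_cpu_ticks(stat_line):
    """Return (utime, stime) in clock ticks from one /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or parentheses, so split after the LAST ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]), int(rest[12])

def read_cpu_seconds(pid):
    """Cumulative CPU seconds (user + system) used by a process so far."""
    with open(f"/proc/{pid}/stat") as f:
        utime, stime = parse_cpu_ticks(f.read())
    return (utime + stime) / os.sysconf("SC_CLK_TCK")
```

Calling read_cpu_seconds(1393) at a fixed interval and pushing the difference between consecutive readings would give the kind of time series asked for, without spawning top at all.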

Monitor network usage of a process

This question might sound fairly repetitive, but there are subtle details which make it a bit different.
I am looking for a simple tool (for Ubuntu/Linux) to monitor network usage such that it gives the min, max, average, and time-plot of network usage by 1) a single process; and, 2) the system; only during the time when the process was running. The major requirement is that I am not looking for a GUI-based tool (or a terminal GUI like top); I want this monitoring information to be written to a file so that I can do some post-processing on it.
I came across the following link, which lists various options: http://www.binarytides.com/linux-commands-monitor-network/. However, most of the tools are GUI-based, and the ones that are not do not provide the above information.
Any help would be much appreciated.
Wireshark might work, depending on how far you're willing to relax the non-GUI requirement and whether locating your target processes is simple. Wireshark is of course a GUI app, but the tshark command which comes with it is headless and can be used to capture packets to a file. After capturing all packets on an interface, you can run tshark again on the pcap file to filter the file using Wireshark "Display filters" and extract just the packets for your process. That's one part that may or may not be simple, depending on whether you can identify your process from network traffic content, port(s), or by adding some sentinel dummy data. You'll then have two pcap files, one for the whole network interface and one for just your process.
The capinfos command will report the average throughput. Wireshark can be used to generate a time-plot of the traffic with millisecond (or other) granularity via the menu "Statistics >> IO Graph". As for min and max, you can either eyeball that from the time-plot or use editcap to split the pcap files into chunks, run capinfos on each chunk, and calculate the min and max over all chunks.
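The "calculate the min and max over all chunks" step could look something like the Python sketch below. It assumes you have already collected, for each chunk, the byte count and duration that capinfos reports; it does not parse capinfos output itself, and the function name is only illustrative.

```python
def throughput_stats(chunks):
    """chunks: list of (bytes_captured, duration_seconds), one per pcap chunk.
    Returns (min, max, average) throughput in bytes per second."""
    # Ignore zero-length chunks so we never divide by zero.
    usable = [(nbytes, secs) for nbytes, secs in chunks if secs > 0]
    if not usable:
        raise ValueError("no chunks with a positive duration")
    rates = [nbytes / secs for nbytes, secs in usable]
    total_bytes = sum(nbytes for nbytes, _ in usable)
    total_secs = sum(secs for _, secs in usable)
    # The average is taken over the whole capture, not over per-chunk rates.
    return min(rates), max(rates), total_bytes / total_secs
```

Running it once on the chunks of the per-process capture and once on the chunks of the whole-interface capture would give both sets of figures.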
That might not be the simplest approach; it's just what occurred to me off the top of my head.

How to monitor a process's CPU, memory, and time in Linux

How can I benchmark a process in Linux? I need something like "top" and "time" put together for a particular process name (it is a multiprocess program so many PIDs will be given)?
Moreover I would like to have a plot over time of memory and cpu usage for these processes and not just final numbers.
Any ideas?
I typically throw together a simple script for this type of work.
Take a look at the kernel documentation for the proc filesystem (Google 'linux proc.txt').
The first line of /proc/stat (Section 1.8 in proc.txt) will give you cumulative cpu usage stats (i.e. user, nice, system, idle, ...). For each process, the file /proc/$PID/stat (Table 1-4 in proc.txt) will provide you with both process-specific cpu usage stats and memory usage stats (see rss).
If you google a bit you'll find plenty of detailed info on these files, and pointers to libraries / apps / code snippets that can help you obtain / derive the values you need. With that in mind, I'll focus on the high-level strategy.
For CPU stats, use your favorite scripting language to create an executable that takes a set of process IDs to monitor. At a fixed interval (e.g., 1 second), poll / calculate the cumulative totals for each process and for the system as a whole. At each poll interval, write all results on a single line to stdout.
For memory stats, write a similar script, but simply log the per-process memory usage. Memory is a bit easier, as we obtain the instantaneous values directly.
Run these scripts for the duration of your test, passing the set of process IDs that you'd like to monitor and redirecting their output to a log file.
./logcpu $(pidof foo) $(pidof bar) > cpustats
./logmem $(pidof foo) $(pidof bar) > memstats
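For what it's worth, the memory logger could be sketched in Python roughly as follows. This is only one possible shape for the hypothetical logmem script: it reads VmRSS from /proc/<pid>/status (rather than the rss field of /proc/<pid>/stat) because that value is already in kB, and the function names are invented for this example.

```python
import sys
import time

def parse_rss_kb(status_text):
    """Return the VmRSS value (in kB) from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None  # e.g. kernel threads have no VmRSS line

def log_memory(pids, interval=1.0, out=sys.stdout):
    """Every `interval` seconds, write a timestamp plus one RSS value per PID."""
    while True:
        values = []
        for pid in pids:
            try:
                with open(f"/proc/{pid}/status") as f:
                    values.append(parse_rss_kb(f.read()))
            except (FileNotFoundError, ProcessLookupError):
                values.append(None)  # process has exited
        if all(v is None for v in values):
            break  # nothing left to monitor
        row = " ".join(str(v) for v in values)
        out.write(f"{time.time():.0f} {row}\n")
        out.flush()
        time.sleep(interval)
```

The script body would then just call log_memory([int(p) for p in sys.argv[1:]]), which matches the command lines above; the loop ends on its own once all monitored processes are gone.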
Import the contents of these files into a spreadsheet (for certain applications this is as easy as copy / paste). For CPU, you are after instantaneous values but have cumulative values, so you'll need to do some minor spreadsheet work to derive these values (it's just the delta 't(x + 1) - t(x)'). Of course you could have your cpu logger write the delta, but you'll be spending a bit more time up front on the script.
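If you would rather do the delta step in code than in the spreadsheet, a small Python sketch is below. It assumes the process counters and the system-wide counters were sampled at the same instants; the function names are illustrative only.

```python
def cumulative_to_deltas(samples):
    """Turn cumulative counters [t0, t1, t2, ...] into per-interval
    deltas [t1 - t0, t2 - t1, ...]."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]

def cpu_percent(proc_ticks, total_ticks):
    """Per-interval CPU share of one process, given cumulative process
    ticks and cumulative system-wide ticks taken at the same instants."""
    proc_d = cumulative_to_deltas(proc_ticks)
    total_d = cumulative_to_deltas(total_ticks)
    # Guard against an interval in which the system counter did not move.
    return [100.0 * p / t if t else 0.0 for p, t in zip(proc_d, total_d)]
```

Note that if the system-wide total is the sum of the first line of /proc/stat, it covers all CPUs, so the result is a share of the whole machine's capacity rather than of a single core.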
Finally, use your spreadsheet to generate a nice plot.
The following tools can be used to monitor a Linux system:
System commands like top, free -m, vmstat, iostat, iotop, sar, netstat, etc. Nothing comes close to these Linux utilities when you are debugging a problem. These commands give you a clear picture of what is going on inside your server.
SeaLion: an agent executes all the commands mentioned above (as well as user-defined ones), and their output can be accessed in a beautiful web interface. This tool comes in handy when you are debugging across hundreds of servers, as installation is simple. It is also free.
Nagios: the mother of all monitoring/alerting tools. It is highly customizable but difficult for beginners to set up. There is a set of tools called Nagios plugins that cover pretty much all important Linux metrics.
Munin
Server Density: a cloud-based paid service that collects important Linux metrics and lets users write their own plugins.
New Relic: another well-known hosted monitoring service.
Zabbix

Disk failure detection Perl script

I need to write a script that checks the disk every minute and reports if it is failing for any reason. The error could be an outright disk failure, a bad sector, and so on.
First, I wonder if there is an existing script that does this, since it should be a standard procedure (I really do not want to reinvent the wheel).
Second, if I want to look for errors in /var/log/messages, is there a list of standard error strings for disks that I can use?
I have searched the net a lot; there is plenty of information, yet nothing that answers this directly.
Any help will be much appreciated.
Thanks,
You could simply parse the output of dmesg, which usually reports fairly detailed information about drive errors. That's how I've collected stats on failing drives before.
You might get better-documented information by using Parse::Syslog, or lower-level kernel reporting directly, though.
Logwatch handles the /var/log/messages part (as well as any other log files that you choose to add). You can either use it as is or use its code to roll your own solution (it's all written in Perl).
If your hard drives support SMART, I suggest you use smartctl output for diagnostics, as it includes a lot of useful information that can be monitored over time to detect failure.
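To give an idea of the smartctl route, here is a minimal sketch. It is in Python rather than Perl, purely for illustration, and it only looks at the overall health verdict printed by smartctl -H. The two verdict lines it matches are the ones I have seen for ATA and SCSI drives; treat the exact wording as an assumption and check it against the output on your own systems. The function names are made up.

```python
import re
import subprocess

# Verdict lines as printed for ATA drives and for SCSI drives (assumed wording).
_HEALTH_RE = re.compile(
    r"SMART overall-health self-assessment test result:\s*(\S+)"
    r"|SMART Health Status:\s*(\S+)"
)

def smart_health_ok(smartctl_output):
    """True/False if a health verdict was found, None if there was none."""
    match = _HEALTH_RE.search(smartctl_output)
    if match is None:
        return None
    verdict = match.group(1) or match.group(2)
    return verdict in ("PASSED", "OK")

def check_disk(device):
    """Run `smartctl -H` on a device such as /dev/sda (usually needs root)."""
    result = subprocess.run(
        ["smartctl", "-H", device], capture_output=True, text=True
    )
    return smart_health_ok(result.stdout)
```

A cron job calling check_disk once a minute and alerting on anything other than True would cover the "report if it is failing" part; the individual SMART attributes would need separate parsing if you want to watch trends over time.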

Using "top" in Linux as semi-permanent instrumentation

I'm trying to find the best way to use 'top' as semi-permanent instrumentation in the development of a box running embedded Linux. (The instrumentation will be removed from the final-test and production releases.)
My first pass is to simply add this to init.d:
top -b -d 15 >/tmp/toploop.out &
This runs top in "batch" mode every 15 seconds. Let's assume that /tmp has plenty of space…
Questions:
Is 15 seconds a good value to choose for general-purpose monitoring?
Other than disk space, how seriously is this perturbing the state of the system?
What other (perhaps better) tools could be used like this?
Look at collectd. It's a very lightweight system monitoring framework built for performance.
We use sysstat to monitor things like this.
You might find that vmstat and iostat with a delay and no repeat counter is a better option.
I suspect 15 seconds would be more than adequate unless you actually want to watch what's happening in real time, but that doesn't appear to be the case here.
As for load: on an idling 900 MHz PIII with 768 MB of RAM running Ubuntu (not sure which version, but not more than a year old), top updating every 0.5 seconds uses about 2% CPU. At 15-second updates, I'm seeing 0.1% CPU utilization.
Depending on what exactly you want, you could use the output of uptime, free, and ps to get most, if not all, of top's information.
If you are looking for overall load, uptime is probably sufficient. However, if you want specific information about processes, you are adventurous, and you have the /proc filesystem enabled, you may want to write your own tools. The primary benefit in this environment is that you can focus on exactly what you want and minimize the load introduced to the system.
The proc filesystem gives your application read access to the kernel data structures that keep track of many of the interesting variables. Reading from /proc is one of the lightest ways to get this information. Additionally, you may be able to get more information than top provides. I've done this in the past to get the amount of time a process spent in user and system mode. You can also use it to get the number of file descriptors a process has open, or detailed information about how the network system is working.
Much of this information is pre-processed by other applications, which you can use if they give you the information you need. However, it is rather straightforward to read the raw information. Run man proc for more information.
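As an example of how little code reading the raw information takes, here is a Python sketch (Python is my choice for illustration; any language available on the box would do) that extracts the per-interface byte counters from /proc/net/dev. The function names are made up, and note that these counters are system-wide per interface, not per process.

```python
def parse_net_dev(text):
    """Map interface name -> (rx_bytes, tx_bytes) from /proc/net/dev text."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        name, sep, data = line.partition(":")
        if not sep:
            continue  # skip anything that is not an "iface: ..." line
        fields = data.split()
        # Column 0 is received bytes; column 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters

def read_net_dev():
    """Read and parse the live counters (Linux only)."""
    with open("/proc/net/dev") as f:
        return parse_net_dev(f.read())
```

The counters are cumulative, so sampling read_net_dev() at a fixed interval and subtracting consecutive readings gives the traffic per interval.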
Pity you haven't said what you are monitoring for.
You should decide whether 15 seconds is OK or not. Feel free to drop it much lower if you wish (and have a fast HDD).
No worries, unless you are running a soft real-time system.
Have a look at the tools suggested in other answers. I'll add another suggestion: "iotop", for answering "who is thrashing the HDD" questions.
At work for system monitoring during stress tests we use a tool called nmon.
What I love about nmon is it has the ability to export to XLS and generate beautiful graphs for you.
It generates statistics for:
Memory Usage
CPU Usage
Network Usage
Disk I/O
Good luck :)

Resources