Hi Stackoverflow community,
For our tool we're trying to build an agent that remotely monitors Linux processes: the resources they use, I/O information and TCP/IP connectivity. The agent should gather these metrics and send them to the server using log4. We already do this on Windows using WMI and log4net and are looking to do the same on Linux.
In essence, it should do what top, lsof -i and iostat do, then send the results to the central server.
Have seen some initiatives at:
lttng
munin monitoring
systemtap
opennms
godrb.com
mcollective
http://bitbucket.org/chrismiles/psi/wiki/Home
Looking at the source code of top, it queries /proc directly, whereas lttng needs kernel modules to be installed. Bearing in mind that the end use is for enterprise systems, we would like to stay as close to the kernel as possible without needing to add new modules.
Our goal is to monitor what each process on the box uses (CPU, memory, I/O, etc.), any process info (e.g. version) and its TCP connections (source and destination), and to send this to the server using log4. We're happy for it to be in any language: C, PHP, Python, Ruby, etc.
Do you have any suggestions?
Bill
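As a rough illustration of the /proc approach that top uses, here is a minimal Python sketch. It parses /proc/<pid>/stat for CPU ticks and resident pages, and /proc/net/tcp for IPv4 TCP endpoints. The function names (parse_stat, parse_net_tcp, snapshot) are made up for this example, the address decoding assumes a little-endian host such as x86, and shipping the result to the server via a log4-style logger is left out.

```python
import os
import socket
import struct


def parse_stat(text):
    """Parse the contents of /proc/<pid>/stat into a small dict."""
    # The command name sits in parentheses and may contain spaces, so
    # split around the last ')' instead of splitting on whitespace only.
    left, right = text.rsplit(")", 1)
    pid, comm = left.split("(", 1)
    fields = right.split()
    # fields[0] is field 3 of proc(5) (state); utime and stime are
    # fields 14 and 15, rss is field 24.
    return {
        "pid": int(pid),
        "comm": comm,
        "state": fields[0],
        "utime_ticks": int(fields[11]),
        "stime_ticks": int(fields[12]),
        "rss_pages": int(fields[21]),
    }


def decode_addr(hex_addr):
    """Turn '0100007F:0277' into ('127.0.0.1', 631).

    Assumption: the monitored host is little-endian (e.g. x86).
    """
    ip_hex, port_hex = hex_addr.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def parse_net_tcp(text):
    """Parse the contents of /proc/net/tcp (IPv4 only)."""
    conns = []
    for line in text.splitlines()[1:]:  # skip the header line
        parts = line.split()
        if len(parts) < 4:
            continue
        conns.append({
            "local": decode_addr(parts[1]),
            "remote": decode_addr(parts[2]),
            "state": parts[3],  # hex state code, e.g. 0A = LISTEN
        })
    return conns


def snapshot(proc_root="/proc"):
    """Collect parse_stat() results for every process currently visible."""
    procs = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, name, "stat")) as fh:
                procs.append(parse_stat(fh.read()))
        except OSError:
            # The process may have exited between listdir() and open().
            continue
    return procs
```

Calling snapshot() twice and diffing the tick counters gives per-process CPU usage over the interval, which is roughly how top arrives at its percentages. Per-process I/O counters live in /proc/<pid>/io on kernels that have them enabled.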
"In essence, doing what top, lsof -i and iostat do then sending it to the central server"
Try SeaLion. It's exactly what you want. It also gives you the flexibility to add more commands as your monitoring requirements change or grow, so you wouldn't have to look for other tools. It is lesser known but easy to use: simple setup and a great timeline for viewing past data.
For true enterprise monitoring, I would look at Megamon (http://www.megamon.com)
I have already tried to set up monitoring on Linux using these references:
1. Monitor server resource utilization with JMeter SSHMon Listener
2. Monitor server health performance using JMeter Perfmon Agent
But neither option is working for me. For SSHMon, I have already done some troubleshooting but have not found a solution; here is the issue description: JMeter SSHMon Listener issue error - I/O and Swap not captured.
For JMeter PerfMon, the Server Agent installed successfully, but I cannot get it working because of the firewall on the server side. The other team doesn't want to change the existing firewall defaults, so I opted for SSHMon, but that is still not working.
We plan to extract the Linux server monitoring data manually. On Windows we can use Windows Performance Monitor, but what about Linux? What is the best way to do this: third-party Linux monitoring tools or simply Linux command-line tools? If command-line tools, is it possible to run them on a schedule?
Appreciate your help. Thanks
For the PerfMon Agent: if the "other team" is not willing to open the default port 4444, you can bind the agent to other ports, like:
./startAgent.sh --udp-port 1234 --tcp-port 5678
Replace the ports with ones that are open on your Linux server.
Also be aware of the SSH tunnelling option: you can forward port 4444 on the remote machine to port 4444 on your local machine and connect to localhost:4444 with the PerfMon Metrics Collector listener.
For SSHMon: saying that it "doesn't work" sounds odd, because it just executes the command you provide and plots the returned values in an "over-time" chart. If you cannot come up with a proper command, that is your problem rather than a problem with JMeter or its plugin. If you're not comfortable with sar, there are alternatives such as cat /proc/swaps or the free command, which give you swap utilization. There are also programs like mpstat and iostat which might be easier to use and parse. See the article How to Monitor Server Resource Utilization with JMeter’s SSHMon Listener for sample commands.
If you need further support, you need to indicate the exact metric(s) and the anticipated values (percentage of total or absolute value and, for absolute values, which unit you would like to see, etc.).
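To make the cat /proc/swaps suggestion concrete, here is a small Python sketch that reduces the contents of that file to a single "percent of swap used" number, which is the kind of value a chart can plot. The function names are made up for this example, and it assumes Python is available on the monitored server; the Size and Used columns in /proc/swaps are in KiB.

```python
def parse_swaps(text):
    """Return (total_kib, used_kib) summed over all swap areas."""
    total_kib = 0
    used_kib = 0
    for line in text.splitlines()[1:]:  # skip the header line
        parts = line.split()
        if len(parts) < 5:
            continue
        # Columns are: Filename Type Size Used Priority.
        # Index from the right so the filename column cannot shift things.
        total_kib += int(parts[-3])
        used_kib += int(parts[-2])
    return total_kib, used_kib


def swap_used_percent(text):
    """Percentage of swap in use, or 0.0 if no swap is configured."""
    total_kib, used_kib = parse_swaps(text)
    if total_kib == 0:
        return 0.0
    return 100.0 * used_kib / total_kib
```

On the server you would feed it the real file, for example print(swap_used_percent(open("/proc/swaps").read())), and have the monitoring command print just that number.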
I am wondering if it is possible to write a program on Windows that communicates with a program within a Linux Virtualbox on the same machine. If this is possible, what is the best approach to doing this? Is there a way to do this without using the internet to communicate?
I found instructions showing how you could potentially use SSH, but I have never tried doing this before, so I do not know if using SSH to communicate would be the best option.
I was going to put this as a comment to a very vague question, but then it got too long.
It depends what you mean by "communicate"....
If the Windows machine should start a program on the Linux VM, you probably want plink.exe - see here.
If you want to transfer whole files, you probably want scp or FTP or FileZilla - see here.
If you want to send small messages occasionally, maybe netcat, also known as nc - see Netcat Cheatsheet here.
If you want full-on, high speed, continuous messages, maybe sockets or some messaging protocol like mqtt.
If you want to share data structures, like lists, queues or sets, you could allow both Windows and the Linux machine to access a shared Redis database - see here.
Or maybe it is enough to share a filesystem between the two machines - in which case you can make a Shared Folder in VirtualBox on your host and the VM can just mount that and read/write it.
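For the sockets option mentioned above, a minimal Python sketch could look like the following. It is only an illustration: the function names are made up, and the demo runs both ends on 127.0.0.1 so it is self-contained. In a real setup the server half would run inside the Linux VM and the Windows program would connect to the VM's address, for instance over a VirtualBox host-only network, which keeps the traffic on the local machine.

```python
import socket
import threading


def serve_once(server_sock):
    """Accept one connection and reply with the message in upper case."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def send_message(host, port, message):
    """Send one short message and return the reply as text."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(message.encode("utf-8"))
        return sock.recv(1024).decode("utf-8")


def demo():
    """Run server and client in one process on the loopback interface."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 asks the OS for any free port, to avoid clashes in the demo.
    server_sock.bind(("127.0.0.1", 0))
    server_sock.listen(1)
    port = server_sock.getsockname()[1]

    server = threading.Thread(target=serve_once, args=(server_sock,))
    server.start()
    reply = send_message("127.0.0.1", port, "hello from the host")
    server.join()
    server_sock.close()
    return reply
```

A single recv() is enough here only because the messages are tiny; anything larger needs some framing, such as a length prefix or newline-terminated lines.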
We are looking into implementing an in-memory utility which can recover the system in case of a disk/filesystem lockup. This utility has to detect the lockup and take corrective action, such as rebooting or just shutting down an interface.
The server platform is Gentoo Linux 2.4.
Any suggestions on an existing utility, or on which scripting method will work best (expect, native C++)?
You'll want the S.M.A.R.T. monitoring tools (smartmontools).
http://en.wikipedia.org/wiki/S.M.A.R.T.
Note that not all statistics correlate with impending drive failure, and sometimes (for some brands and models) you may need to pass in special flags or you will get garbage. See the Wikipedia article for which attributes really indicate danger.
The command is smartctl. You may need to run it with sudo. smartctl --all followed by a device path (for example /dev/sda) prints all the SMART information for that drive, including its health status; this may briefly spin the drive up.
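If you want to poll this from a script, a rough Python sketch is below. It runs smartctl -H (the health-only check) on one device and pulls out the verdict. It assumes the ATA-style output line "SMART overall-health self-assessment test result: ..."; SCSI/SAS drives word this differently, so treat the parsing as an assumption to adapt. The function names are made up for the example.

```python
import subprocess


def parse_health(output):
    """Extract the verdict (e.g. 'PASSED') from smartctl -H output.

    Returns None if the expected ATA-style line is not present.
    """
    for line in output.splitlines():
        if "overall-health self-assessment test result" in line:
            return line.split(":", 1)[1].strip()
    return None


def check_drive(device):
    """Run 'smartctl -H <device>' and return the parsed verdict."""
    # smartctl's exit status is a bit mask, and non-zero does not always
    # mean the command itself failed, so check=True is deliberately not
    # used here.
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True,
        text=True,
    )
    return parse_health(result.stdout)
```

Something like check_drive("/dev/sda") would normally need root. Note that this only reports the drive's own health assessment; it does not by itself detect a hung filesystem.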
What type of errors are you looking for?
smartmontools and smartd which ship with most distros should be able to help you. They work at a low level with the disk.
SMART on Wikipedia
smartmontools
I have written a daemon in Linux that does DHCP for an embedded system. This platform only has a Linux kernel running on it and has no CLI support. What is the best way for me to test my daemon? How do I write a program that will call the main function in this daemon and verify that it's working correctly?
Appreciate the answers.
When I've been in a situation like this, I've written a second daemon (or had a second listener in the existing daemon) to take the place of a CLI, listening at a particular port and responding to a very limited command set of your own choosing.
In this case, all you really care about is triggering the function on demand, so you could even have it trigger when you connect to this second port, and then report results back to the socket.
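To show the shape of that idea, here is a minimal sketch in Python (your daemon is presumably in C, so take it as pseudocode that happens to run). A listener accepts one connection, reads a one-word command, calls the matching function and writes the result back to the socket. Everything named here is hypothetical: run_self_test stands in for whatever function of your daemon you want to trigger, and "selftest" is an arbitrary command word.

```python
import socket
import threading


def run_self_test():
    """Hypothetical stand-in for the daemon function under test."""
    return "OK"


# The deliberately tiny command set the control port understands.
COMMANDS = {"selftest": run_self_test}


def handle_one(server_sock):
    """Serve a single control connection, then close the listener."""
    try:
        conn, _addr = server_sock.accept()
        with conn:
            command = conn.recv(256).decode("ascii", "replace").strip()
            handler = COMMANDS.get(command)
            reply = handler() if handler else "ERR unknown command"
            conn.sendall((reply + "\n").encode("ascii"))
    finally:
        server_sock.close()


def start_control_listener(host="127.0.0.1", port=0):
    """Start the listener in a background thread; return (port, thread)."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server_sock.bind((host, port))  # port 0 lets the OS pick a free one
    server_sock.listen(1)
    thread = threading.Thread(target=handle_one, args=(server_sock,),
                              daemon=True)
    thread.start()
    return server_sock.getsockname()[1], thread


def send_command(host, port, command):
    """Test-side helper: send one command and return the reply."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall((command + "\n").encode("ascii"))
        return sock.recv(256).decode("ascii").strip()
```

From the test machine you would then call send_command with the device's address and the control port and compare the reply against what you expect. A real daemon would keep accepting connections in a loop rather than serving just one.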
I strongly recommend, by the way, making sure your embedded system has some more generic mechanism for logging information to persistent storage and retrieving that log. It doesn't have to be syslog or anything so complicated. But you will want that ability in the future to enable forensic analysis of problems in the field.
You will want to write and debug your daemon in a full featured environment first, then install it on the embedded system at the end when you are sure it works properly.
If you can build a dhcp server for the embedded system you can surely build a simple shell for it also. Try building BusyBox or ash or dash.
You could also try using GDB remote debugging. I found an article about it.
We implemented a server application that is available on Windows only. Now we would like to port it to Linux, HP-UX and AIX, too. This application exposes internal statistics as performance counters in the Windows Performance Monitor.
To be more precise: the application is a database, and we would like to provide information such as the number of connected users or the number of requests executed to the administrator. So this is "new" information, proprietary to our application. But we would like to make it available in the same environment where the operating system delivers information like CPU usage, etc. The goal is to make it easily readable for the administrator.
What is the appropriate and commonly used performance monitor under Linux, HP-UX and AIX?
I would say: that depends on which performance you want to monitor. Used CPU time? Free RAM? Disk IO? Number of beers in your freezer...
But regardless of this you can look at any files below /proc. I'm not sure for HP, but at least Linux and AIX should have that tree (if it's not deactivated at kernel compile time).
Management is where most OSes depart from one another. For this reason there are not many tools that are common between all the OSes.
Additionally, Unix tools follow the one-process, one-responsibility idiom, where one tool gets CPU info, another gets memory info, and so on.
The only tool I have seen in the Unix world that gets all this info in one place is top. Almost all sysadmins are familiar with this tool, and it works on all the flavors of OS you are interested in. It also has the additional advantage of being open source. You could simply extend this tool to expose the counters you are interested in and ship it along with your application.
Another way to do this might be to expose your counters through SNMP and leave it to a third-party SNMP tool like HP OpenView to collect them and present a consistent view along with other management info. This might be a more enterprise-oriented solution, which might appeal to the marketing folks.
I would also say it's a good idea to write a standalone console tool that admins can use from their custom home-grown scripts (there are many firms out there with superhuman admins or overpaid IT staff that do this).
All together, this would be a healthy solution for your requirement, I think.
The most standard unix tools for such data are the *stat (iostat, vmstat, netstat) tools and sar. On Linux you'll find all this information in /proc, but most Unixes don't have /proc nicely filled with what you are looking for. The mentioned tools are quite standardized and can be used to gather the data you need.