Storing system data into graphite/statsd - statistics

I have set up graphite and statsd on a machine that will be dedicated to stats. Now I would like my application servers to report stats to it. What would be the best way to do that?
I know that carbon already does this for the stats machine itself, but what do I do on the app servers that don't have graphite installed?
What I am looking for is to store load, disk usage and memory free/used.

Running collectd (http://collectd.org/) with a graphite agent (https://github.com/indygreg/collectd-carbon) would be an excellent start for gathering the information you're after.
There is an almost unlimited number of ways to get your data into graphite.
You can find a list of tools that are known to work very well with graphite in the documentation on readthedocs.org: http://graphite.readthedocs.org/en/0.9.10/tools.html
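If you would rather not install an agent at all, carbon also accepts metrics over its plaintext protocol, one "path value timestamp" line per metric, by default on TCP port 2003. A minimal sketch of that approach, assuming a Linux app server (it reads /proc/meminfo) and a default carbon listener; the metric prefix and the function names are made up for illustration:

```python
import os
import shutil
import socket
import time


def read_meminfo(path="/proc/meminfo"):
    """Return /proc/meminfo as {field: kilobytes} (Linux only)."""
    info = {}
    with open(path) as fh:
        for line in fh:
            key, _, rest = line.partition(":")
            fields = rest.split()
            if fields:
                info[key.strip()] = int(fields[0])
    return info


def collect(prefix):
    """Gather load, root-disk usage and memory as {metric_path: value}."""
    load1, load5, load15 = os.getloadavg()
    disk = shutil.disk_usage("/")
    mem = read_meminfo()
    return {
        f"{prefix}.load.1min": load1,
        f"{prefix}.load.5min": load5,
        f"{prefix}.load.15min": load15,
        f"{prefix}.disk.root.used_bytes": disk.used,
        f"{prefix}.disk.root.free_bytes": disk.free,
        f"{prefix}.memory.total_kb": mem.get("MemTotal", 0),
        f"{prefix}.memory.free_kb": mem.get("MemFree", 0),
    }


def format_lines(metrics, timestamp):
    """Render one '<path> <value> <timestamp>' line per metric."""
    return "".join(
        f"{path} {value} {int(timestamp)}\n"
        for path, value in sorted(metrics.items())
    )


def send(metrics, host, port=2003):
    """Send metrics to carbon's plaintext listener (port 2003 is the default)."""
    payload = format_lines(metrics, time.time()).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)
```

Calling send(collect("servers.app1"), "stats.example.com") once a minute from cron would be enough to get graphs going; both the prefix and the host name here are placeholders. For anything beyond a handful of metrics, collectd is likely the better choice.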

Related

How to monitor performance counters of HP Web Tour sample application apache server installed on local machine

How can I monitor the performance counters of the Apache server of the HP Web Tour sample application, installed locally on my system, using jvisualvm or any other utility?
Looking into "Monitoring Your Server" and "9 Key Apache Web Server Performance Metrics to Monitor", it appears that:
You need to keep an eye on the Apache error log.
You need to consider Apache-specific metrics like requests per second, bytes per second, and bytes per request. You should also be able to extract these metrics from your performance testing tool, which normally reports these kinds of stats.
You need to consider infrastructure metrics like CPU, RAM, disk, network, and swap usage on the machine where you're running this sample application. Most operating systems come with built-in monitoring tools, e.g. Windows Performance Monitor or a number of command-line utilities on Linux. Alternatively, use a third-party cross-platform monitoring solution like PerfMon or Zabbix.
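One way to get the Apache-specific numbers is mod_status, whose machine-readable page (server-status?auto) reports fields such as ReqPerSec, BytesPerSec and BytesPerReq as "Key: value" lines. A minimal sketch of parsing that output, assuming mod_status is enabled on your Apache; the sample text below is invented for illustration, not taken from a real server:

```python
def parse_server_status(text):
    """Parse 'Key: value' lines into a dict, converting numeric values to float."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if not sep:
            continue  # skip anything that is not a 'Key: value' line
        value = value.strip()
        try:
            stats[key] = float(value)
        except ValueError:
            stats[key] = value  # e.g. the Scoreboard string
    return stats


# Hypothetical sample of mod_status "?auto" output.
SAMPLE = """\
Total Accesses: 1200
Total kBytes: 4800
Uptime: 600
ReqPerSec: 2
BytesPerSec: 8192
BytesPerReq: 4096
BusyWorkers: 3
IdleWorkers: 7
"""

stats = parse_server_status(SAMPLE)
print(stats["ReqPerSec"], stats["BytesPerSec"], stats["BytesPerReq"])
```

In a real setup the text would come from fetching the status URL (for example with urllib.request.urlopen) at a regular interval during the test run.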

Is it possible to log system memory and CPU usage using IIS logs?

I have a requirement to monitor what the CPU and memory usage of the system was when a particular request came in.
Is it possible to do so using IIS logs or any other method/tool?
We don't want the usage of the IIS process; we want the usage of the whole system at that time.
You can use Windows Performance Monitor to record CPU and memory usage (using data collector sets). Then you can check in your IIS logs at what time the request in question came in and look up the recorded data in the Performance Monitor data collector set.
I don't think there is a tool which automatically combines the IIS log with system performance data. There are tools which include IIS monitoring, but those usually won't break reports down to a single request. If you want to do some further research, you can use my list of 40 Windows server performance monitoring tools as a starting point.
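The manual lookup described above is easy to script once both logs are parsed. A minimal sketch, assuming you have already loaded the recorded samples into a time-sorted list of (timestamp, cpu_percent, memory_mb) tuples; the tuple layout and the sample values are hypothetical:

```python
from bisect import bisect_left
from datetime import datetime


def nearest_sample(samples, when):
    """Return the sample closest in time to `when`.

    `samples` must be non-empty and sorted by timestamp (first tuple element).
    """
    times = [s[0] for s in samples]
    i = bisect_left(times, when)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = samples[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda s: abs(s[0] - when))


# Hypothetical performance samples recorded every 15 seconds.
SAMPLES = [
    (datetime(2024, 1, 1, 12, 0, 0), 35.0, 2100.0),
    (datetime(2024, 1, 1, 12, 0, 15), 92.5, 3400.0),
    (datetime(2024, 1, 1, 12, 0, 30), 40.0, 2200.0),
]

request_time = datetime(2024, 1, 1, 12, 0, 17)  # taken from the IIS log
print(nearest_sample(SAMPLES, request_time))
```

Before comparing, make sure both timestamps are in the same time zone: IIS writes its W3C logs in UTC by default, which may differ from the clock used in the performance log.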

How to take input in logstash?

When should I use filebeat, packetbeat or topbeat?
I am new to the ELK stack. I may sound silly, but I am really confused by these. I would appreciate any sort of help.
It took me a while, but I have figured out the solution.
Filebeat is used to read input from files. We can use it when an application writes its logs to a file; elasticsearch's logs, for example, are written to a log file, so we can use filebeat to read them.
Topbeat is used to collect CPU usage, RAM usage and other system-resource data for visualisation.
Packetbeat can be used to analyze network traffic: given the ports on which transactions are happening, it can log those transactions directly.
While I was wondering about the difference between logstash and the beats platform, it turned out that beats are more lightweight: you need not install a JVM on each of your servers, as you would to run logstash. However, logstash has a rich community of plugins, more than 200 of them, while beats is still under development, so logstash can be used if beats lacks the protocol support we need.
These are all Elasticsearch data shippers belonging to Elastic's Beats family. Each beat helps you analyze different bits and pieces in your environment.
Referring specifically to the beats you mentioned:
Filebeat is good for tracking and forwarding specific log files (e.g. apache access log)
Packetbeat is good for network analysis, monitoring the actual data packets being transferred across the wire
Topbeat can be used for infrastructure monitoring, giving you perf metrics on CPU usage, memory, etc.
There are plenty of resources to help you get started. Try Elastic's site. I also saw a series of tutorials on the Logz.io blog.

Remote monitoring of system stats with node.js

We have implemented a monitoring solution in node.js, which does some basic checks for database integrity and API up-time. We want to expand this system to collect basic system stats of our Linux servers like CPU and disc usage. Some of these servers are behind a firewall which is out of our control, with only some very basic ports open (ssh,ftp,http,https).
How can I gather the system information of these servers in node.js? Are there monitoring systems which expose this information through a (secured) RESTful API?
I've had a lot of success with this ssh client written in javascript:
https://github.com/mscdex/ssh2
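The SSH approach works the same way from any language: run a couple of standard commands on the remote box and parse their text output. A minimal sketch of the idea in Python rather than node.js, assuming key-based SSH access, a Linux target that has /proc/loadavg, and the system ssh binary on the monitoring host; the helper names are invented for illustration:

```python
import subprocess


def run_remote(host, command):
    """Run `command` on `host` via the system ssh client and return its stdout."""
    result = subprocess.run(
        ["ssh", host, command],
        capture_output=True, text=True, check=True, timeout=30,
    )
    return result.stdout


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into (1min, 5min, 15min) floats."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def parse_df(text):
    """Parse `df -P` output into {mount_point: percent_used}."""
    usage = {}
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        # POSIX format: filesystem, blocks, used, available, capacity, mount point
        if len(fields) < 6 or not fields[4].endswith("%"):
            continue
        mount = " ".join(fields[5:])
        usage[mount] = int(fields[4].rstrip("%"))
    return usage


def collect(host):
    """Fetch load averages and disk usage for one server."""
    return {
        "load": parse_loadavg(run_remote(host, "cat /proc/loadavg")),
        "disk": parse_df(run_remote(host, "df -P")),
    }
```

Since only the SSH port is needed, this fits servers behind a restrictive firewall; with ssh2 in node.js the structure would be the same, with the library's exec call in place of run_remote.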
There are tons of available solutions for monitoring system stats: Nagios, Zabbix, Scout, Cacti. There are even some hosted ones like ServerDensity.
All of these systems should cover the top-level stats: CPU, RAM, Disk IO & Network. They all have a plug-in infrastructure so that you can send custom stats (API uptime, DB availability) and send them along with the regular stats.
If you're running on a cloud infrastructure somewhere, many of these provide information "out of the box", generally in your account dashboard (see guys like Joyent or Azure).
Big question here is "what else do you need"?
Use NRPE from Nagios as a client on the box you want to monitor. It's fairly simple to set up and its API is documented. http://exchange.nagios.org/directory/Addons/Monitoring-Agents/NRPE--2D-Nagios-Remote-Plugin-Executor/details

Which resources should one monitor on a Linux server running a web-server or database

When running any kind of server under load there are several resources that one would like to monitor to make sure that the server is healthy. This is specifically true when testing the system under load.
Some examples for this would be CPU utilization, memory usage, and perhaps disk space.
What other resource should I be monitoring, and what tools are available to do so?
As many as you can afford to, and can then graph/understand/look at the results. Monitoring resources is useful for not only capacity planning, but anomaly detection, and anomaly detection significantly helps your ability to detect security events.
You have a decent start with your basic graphs. I'd want to also monitor the number of threads, number of connections, network I/O, disk I/O, page faults (arguably this is related to memory usage), context switches.
I really like munin for graphing things related to hosts.
I use Zabbix extensively in production, which comes with a stack of useful defaults. Some examples of the sorts of things we've configured it to monitor:
Network usage
CPU usage (% user,system,nice times)
Load averages (1m, 5m, 15m)
RAM usage (real, swap, shm)
Disc throughput
Active connections (by port number)
Number of processes (by process type)
Ping time from remote location
Time to SSL certificate expiry
MySQL internals (query cache usage, num temporary tables in RAM and on disc, etc)
Anything you can monitor with Zabbix, you can also attach triggers to - so it can restart failed services; or page you to alert about problems.
Collect the data now, before performance becomes an issue. When it does, you'll be glad of the historical baselines, and the fact you'll be able to show what date and time problems started happening for when you need to hunt down and punish exactly which developer made bad changes :)
I ended up using dstat which is vmstat's nicer looking cousin.
This will show most everything you need to know about a machine's health, including:
CPU
Disk
Memory
Network
Swap
Use "df -h" to make sure that no partition runs full, which can lead to all kinds of funky problems. Watching the syslog is of course also useful; for that I recommend installing "logwatch" on your server, which sends you an email if weird things start showing up in your syslog.
Cacti is a good web-based monitoring/graphing solution. Very complete, very easy to use, with a large userbase including many large Enterprise-level installations.
If you want more 'alerting' and less 'graphing', check out nagios.
As for 'what to monitor', you want to monitor systems at both the system and application level, so yes: network/memory/disk i/o, interrupts and such over the system level. The application level gets more specific, so a webserver might measure hits/second, errors/second (non-200 responses), etc and a database might measure queries/second, average query fulfillment time, etc.
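As an illustration of the application-level numbers for a web server, here is a minimal sketch that derives hits/second and errors/second from already-parsed access-log entries. It assumes each entry has been reduced to an (epoch_second, http_status) pair, and it counts every non-200 response as an error, as described above; in practice you may want to exclude redirects.

```python
from collections import Counter


def per_second_rates(requests):
    """Count hits and non-200 responses per second.

    `requests` is an iterable of (epoch_second, http_status) pairs.
    Returns {epoch_second: (hits, errors)}.
    """
    hits = Counter()
    errors = Counter()
    for second, status in requests:
        hits[second] += 1
        if status != 200:
            errors[second] += 1
    return {second: (hits[second], errors[second]) for second in sorted(hits)}


# Hypothetical parsed log entries.
LOG = [(100, 200), (100, 500), (100, 200), (101, 404)]
print(per_second_rates(LOG))
```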
Beware the aforementioned slowquerylog in mysql. It should only be used when trying to figure out why some queries are slow. It has the side effect of making ALL your queries slow while it's enabled. :P It's intended for debugging, not logging.
Think 'passive monitoring' whenever possible. For instance, sniff the network traffic rather than monitor it from your server -- have another machine watch the packets fly back and forth and record statistics about them.
(By the way, that's one of my favorites -- if you watch connections being established and note when they end, you can find a lot of data about slow queries or slow anything else, without putting any load on the server you care about.)
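The connection-watching idea boils down to pairing each "connection opened" observation with its "connection closed" one. A minimal sketch, assuming the sniffer on the other machine already yields (timestamp_ms, connection_id, kind) events; the event format and the function names are hypothetical:

```python
def connection_durations(events):
    """Pair open/close events and return {connection_id: duration_ms}.

    `events` is an iterable of (timestamp_ms, connection_id, kind) tuples,
    where kind is "open" or "close", in chronological order.
    """
    opened = {}
    durations = {}
    for timestamp, conn_id, kind in events:
        if kind == "open":
            opened[conn_id] = timestamp
        elif kind == "close" and conn_id in opened:
            durations[conn_id] = timestamp - opened.pop(conn_id)
    return durations


def slow_connections(durations, threshold_ms):
    """Return the ids of connections that lasted at least `threshold_ms`."""
    return sorted(c for c, d in durations.items() if d >= threshold_ms)


# Hypothetical events observed on the wire.
EVENTS = [
    (0, "a", "open"),
    (100, "b", "open"),
    (200, "a", "close"),
    (3100, "b", "close"),
]

durations = connection_durations(EVENTS)
print(slow_connections(durations, threshold_ms=1000))
```

Connections that never close within the capture simply do not appear in the result, so a long-running capture window gives a more complete picture.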
In addition to top and auth.log, I often look at mtop, and enable mysql's slowquerylog and watch mysqldumpslow.
I also use Nagios to monitor CPU, Memory, and logged in users (on a VPS or dedicated server). That last lets me know when someone other than me has logged in.
Network, of course :) Use MRTG to get some nice bandwidth graphs. They're just pretty most of the time, until a spammer finds a hole in your security and traffic suddenly increases.
Nagios is good for alerting as mentioned, and is easy to get set up. You can then use the mrtg plugin to get alerts for your network traffic too.
I also recommend ntop as it shows where your network traffic is going.
Munin and Monit are also a good combination to get you going.
I typically watch top and tail -f /var/log/auth.log.

Resources