How to take input in logstash?

When should I use Filebeat, Packetbeat or Topbeat?
I am new to the ELK stack. I may sound silly, but I am really confused about these. I would appreciate any sort of help.

It took me a while, but I have figured out the answer.
Filebeat is used to read input from files. We can use it when an application writes its logs to a file; for example, Elasticsearch writes its logs to a log file, so we can use Filebeat to read data from it.
Topbeat is used to visualise CPU usage, RAM usage and other metrics related to system resources.
Packetbeat can be used to analyse network traffic, and we can directly log the transactions taking place on the ports where they happen.
While I was wondering about the difference between Logstash and the Beats platform, it turned out that the Beats are more lightweight: unlike Logstash, they do not require a JVM to be installed on each of your servers. However, Logstash has a rich community of plugins, with their count exceeding 200, while Beats is still under development, so Logstash can be used when we don't have the required protocol support in Beats.
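For instance, a minimal Logstash pipeline that takes its input from a beat looks roughly like the sketch below; the port, Elasticsearch address and index name are placeholders you would adapt to your own setup.

    # Listen for events shipped by Filebeat/Topbeat/Packetbeat over the Beats protocol
    input {
      beats {
        port => 5044            # port the beats are configured to send to (placeholder)
      }
    }

    # Forward everything to Elasticsearch; host and index name are placeholders
    output {
      elasticsearch {
        hosts => ["localhost:9200"]
        index => "logs-%{+YYYY.MM.dd}"
      }
    }

On the shipper side, each beat just points its Logstash output at the same host and port.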

These are all Elasticsearch data shippers belonging to Elastic's Beats family. Each beat helps you analyze different bits and pieces of your environment.
Referring specifically to the beats you mentioned:
Filebeat is good for tracking and forwarding specific log files (e.g. the Apache access log).
Packetbeat is good for network analysis, monitoring the actual data packets being transferred across the wire.
Topbeat can be used for infrastructure monitoring, giving you performance metrics on CPU usage, memory, etc.
There are plenty of resources to help you get started. Try Elastic's site. I also saw a series of tutorials on the Logz.io blog.

Related

Emitting application-level metrics in Node.js

I want to emit metrics from my Node application to monitor how frequently a certain branch of code is reached. For example, I am interested in knowing how many times a service call didn't return the expected response. I also want to be able to emit, for each service call, the time it took, etc.
I am expecting to use a client in the code that will emit metrics to a server, and then I will be able to view the metrics in a dashboard on that server. I am mostly interested in open source solutions that I can host on my own infrastructure.
Please note, I am not interested in system metrics here, such as CPU or memory usage.
Implement pervasive logging and then use something like Elasticsearch + Kibana to display the logs in a dashboard.
There are other metric dashboard systems such as Grafana, Graphite, Tableau, etc. A lot of them work with metrics, which are numbers associated with tags, such as counts of function calls, CPU load, etc. The main reason I like the Kibana solution is that it is not based on metrics alone; instead it extracts metrics from your log files.
The only thing you really need to do with your code is make sure your logs are timestamped.
Google for Kibana or "ELK stack" (ELK stands for Elasticsearch + Logstash + Kibana) to see how to set this up. The first time I set it up, it took me just a few hours to get results.
Node has several loggers that can be configured to send log events to ELK. In addition, the Logstash (or the more modern Beats) part of ELK can ingest any log file and parse it with regular expressions to forward data to Elasticsearch, so you do not need to modify your software.
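As a rough sketch of that approach (the file path, log format and field names below are made up for illustration), a Logstash pipeline could tail the application's log file, pull fields out with grok, and use the line's own timestamp:

    input {
      file {
        path => "/var/log/myapp/app.log"   # hypothetical application log
      }
    }

    filter {
      # Expect application log lines like: 2016-05-01T12:00:00Z INFO payments took 123ms
      grok {
        match => { "message" => "%{TIMESTAMP_ISO8601:ts} %{LOGLEVEL:level} %{NOTSPACE:service} took %{NUMBER:duration_ms:int}ms" }
      }
      # Use the timestamp from the log line instead of the time of ingestion
      date {
        match => [ "ts", "ISO8601" ]
      }
    }

    output {
      elasticsearch { hosts => ["localhost:9200"] }
    }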
The ELK solution can be configured simply, or you can spend literally weeks tuning your data parsing and graphs to get more insight; it is very flexible, and how you use it is up to you.
Metrics vs Logs (opinion):
What you want is, of course, the metrics. But metrics alone don't say much. What you are ultimately after is the ability to analyse your system for debugging and optimisation. This is where logging has an advantage.
With a solution like Kibana that extracts metrics from logs, you have another layer behind the metrics to deep-dive into. You can query it to find which events caused the metrics. That is not easy to do on a running system, because you would normally have to simulate inputs to your system to reproduce similar metrics and figure out what is happening. With Kibana you can instead analyse historical events that have already happened!
Here's an old screenshot of a Kibana set-up I did a few years back to monitor a web service (including all the emails it receives):
Note that in the screenshot, apart from the graphs and metrics I extract from my system, I also display parsed logs at the bottom of the dashboard, so I get a near real-time view of what is happening. This is the email-received dashboard, which we used to monitor things like subscriptions, complaints, click-through rates, etc.

Will Logstash parsing offline logs give better performance than online?

I have the ELK stack installed and am about to do performance testing.
I have the following doubt, which I am not able to resolve myself; expert suggestions/opinions would be helpful.
I am unsure whether to:
1. Run Logstash live - meaning, install Logstash and run ELK in parallel with the performance testing of my application.
2. Or first do the performance testing, collect the logs, and feed them to Logstash offline. (This option is very much possible, as I am running this test for only about 30 minutes.)
Which will perform better?
My application is in Java, and since Logstash also uses a JVM for its parsing, I am afraid it will have an impact on my application's performance.
Considering this, I prefer to go with option 2, but I would like to know whether there are any benefits/advantages to option 1 that I am missing.
Help/suggestions much appreciated.
Test your real environment under real conditions to get anything meaningful.
Will you run Logstash on the server? Or will you feed your logs in the background to, e.g., Kafka, as described in the blog post you summoned me from? Or will you run a batch job and then collect the logs after the fact?
Of course, doing anything on the server itself during processing will have an impact, and tuning your JVM will also have a big influence on how well everything performs. In general it is not an issue to run multiple JVMs on the same server.
Do your tests once with Logstash / Kafka / Flume or any other log processing or shipping tool you want to use enabled, and then run a second pass without these tools to get an idea of how much they impact performance.
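If you do feed the logs through a queue such as Kafka, the indexing side can be as small as the sketch below; the broker address and topic are assumptions, and the option names shown are those of the newer kafka input plugin (recent Logstash versions), so check the documentation for the version you actually run.

    input {
      kafka {
        bootstrap_servers => "localhost:9092"   # assumed broker address
        topics => ["app-logs"]                  # assumed topic your applications/shippers write to
      }
    }

    output {
      elasticsearch { hosts => ["localhost:9200"] }
    }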

Logstash reaches 99% CPU usage and freezes forever (or until restarted)

I'm currently running an ELK cluster on reasonably weak hardware (four virtual machines, each with 4 GB of memory assigned and two cores). This is slated to change in a couple of months, but for now we still need to ingest logs and make them available.
After getting all of the servers of one service sending their logs to Logstash via nxlog, collection worked fairly well for a few days. Shortly after that, Logstash frequently started to wedge. The Logstash thread 'filterworker.0' will jump to 93% and then 99% of the server's CPU. Logstash itself won't terminate; instead it will continue on, hung, never sending any fresh logs to Elasticsearch. Debug logs show that Logstash is continually calling 'flush by interval'. It will never recover from this state; it ran an entire weekend hung and only resumed normal operations when I restarted it. Logstash would start catching up on the weekend's logs and then quickly freeze again (usually within five to ten minutes), requiring another restart of the service. Once the logs had mostly caught up (many restarts later, and after turning off some complicated grok filters), Logstash returned to its previous habit of wedging every five to thirty minutes.
I attempted to narrow this down to a particular configuration and swapped my log filters into and out of the conf.d directory. With fewer configs, Logstash would run for longer periods of time (up to an hour and a half), but eventually it would freeze again.
Connecting jstack to the PID of the frozen filterworker.0 thread returned mostly 'get_thread_regs failed for a lwp' debugger exceptions and no deadlocks found.
There are no actual failures in Logstash's logs when run at debug verbosity; just those buffering log lines.
The disks are not full.
Our current configuration is three Elasticsearch nodes, all receiving input from the Logstash server (using Logstash's internal load balancer). We have a single Logstash server. These are all CentOS 7 machines. The Logstash machine is running version 2.1.3, sourced from Elastic's yum repository.
I've played around with changing the heap size, but nothing appears to help, so I'm currently running it at the out-of-the-box defaults. We only use one worker thread, as it's a single-core virtual machine. We used to use multiline, but that was the first thing I commented out when this started to happen.
I'm not sure where to go next. My theory is that Logstash's buffer is just unable to handle the current log traffic, but without any conclusive errors in the logs, I'm not sure how to prove it. I feel like it might be worth putting a Redis or RabbitMQ queue between nxlog and Logstash to buffer the flood; does that seem like a reasonable next step?
Any suggestions that people might have would be greatly appreciated.
You might try resetting the Java environment. When I start up my Logstash it goes up to 99% CPU usage, but once the JVM has finished starting, the CPU usage drops to 3%. So I guess maybe something is wrong with your Java environment.
Hope this helps.
I use monit to monitor the service, check for high CPU usage, and then restart Logstash according to the findings. It's a bit of a workaround, not really a long-term solution.
A queuing system would probably do the trick; check out Kafka, Redis, or RabbitMQ. You would need to measure the difference between the rate at which the queue is written to and the rate at which it is read from.
It sounds like you need more Logstash nodes. We experienced similar outages, caused by CPU, when the log throughput went up for various reasons. We are putting through approx. 6K lines per second and have 6 nodes (just for reference).
Also, putting a Redis pipeline in front of the Logstash nodes allowed us to configure our Logstash nodes to pull and process at their own pace. Redis has allowed our Logstash nodes to be over-provisioned, as they no longer bear the brunt of the traffic. They pull log entries, and their utilization is more consistent (no more crashing).
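As a rough illustration of that layout (the Redis host, list key and cluster addresses are placeholders), the shipping side pushes events onto a Redis list and the indexing Logstash nodes pull from it at their own pace:

    # On the shipping side (for example a lightweight Logstash shipper), push each event onto a Redis list
    output {
      redis {
        host => "redis.internal"   # placeholder Redis host
        data_type => "list"
        key => "logstash"          # placeholder list key
      }
    }

    # On the indexer nodes, pull from the same list and process as fast as capacity allows
    input {
      redis {
        host => "redis.internal"
        data_type => "list"
        key => "logstash"
      }
    }

    output {
      elasticsearch { hosts => ["es1:9200", "es2:9200", "es3:9200"] }   # placeholder cluster addresses
    }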

Are a Logstash shipper instance and Redis required in this architecture?

I have created a demo environment using Logstash, Redis, Elasticsearch and Kibana. (http://indico.cern.ch/getFile.....
Here the Logstash shipper reads logs from a log file which I have centralised using syslog-ng. The Logstash shipper forwards them to Redis, then to the Logstash indexer (filter), and finally to Elasticsearch.
Now I want to skip the Logstash shipper and Redis parts. Is this a good idea? Or is Redis mandatory, or required to deal with heavy load? I'm not sure about it.
In the PDF linked above I have read that Logstash has little buffering and Redis manages the flow of logs; that is why Redis is used. As Redis keeps data in memory, what happens if memory gets full? I have also read that Logstash and Elasticsearch can be quite hungry in terms of RAM usage and that the JVM options need to be properly tuned. If so, how do I tune the JVM?
Is it required to purge/rotate Elasticsearch data/indices?
So which setup is best suited for heavy load? I want to parse logs such as system (OS and daemon) logs, syslog, web server logs (Apache, lighttpd), application server logs (Tomcat), database server logs (MySQL) and some application logs (from log files).
Please give your suggestions for improvement. Thanks!
Kindly find the following link for the image:
(http://a.disquscdn.com/uploads/mediaembed/images/709/3604/original.jpg)
In the setup you describe, Redis should not be required; using syslog-ng to centralise the log files serves the same purpose that Redis does when multiple shippers are used.
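A single Logstash instance reading the centralised files directly is then enough. A minimal sketch, assuming syslog-ng writes everything under /var/log/central/ (the path is a placeholder):

    input {
      file {
        path => "/var/log/central/*.log"   # wherever syslog-ng writes the centralised logs (placeholder)
        type => "syslog"
      }
    }

    filter {
      # parse the standard syslog header into structured fields
      grok {
        match => { "message" => "%{SYSLOGLINE}" }
      }
    }

    output {
      elasticsearch { hosts => ["localhost:9200"] }
    }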
It might be necessary to prune Elasticsearch indices to reduce disk space requirements. This will depend on how quickly your Elasticsearch data are growing, how much disk space you have available, and how long you need the logs to remain searchable.
I can't advise on JVM tuning.

Storing system data into graphite/statsd

I have set up Graphite and StatsD on a dedicated stats machine. Now, if I would like to connect my application servers to provide stats, what would be the best way?
I know that carbon already handles this on the stats machine, but what do I do on the app servers, which don't have Graphite installed?
What I am looking for is to store load, disk usage and free/used memory.
Running collectd (http://collectd.org/) with a Graphite agent (https://github.com/indygreg/collectd-carbon) would be an excellent start for gathering the information you're after.
There is an almost unlimited number of ways to get your data into Graphite.
You can find a list of tools known to work well with Graphite on the readthedocs.org page: http://graphite.readthedocs.org/en/0.9.10/tools.html

Resources