How to aggregate logs for all docker containers in mesos - node.js

I have multiple microservices written in Node.js. Each microservice runs in its own Docker container, and we use Mesos+Marathon for clustering.
How can I aggregate the logs of all the containers (microservices) running on different instances?

We're using Docker+Mesos as well and are shipping all logs to a log analytics service (it's the service the company I work for offers, http://logz.io). There are a couple of ways to achieve that:
Run a log shipper agent inside each Docker container (e.g. rsyslog, NXLog, Logstash, or logstash-forwarder); that agent ships the data to a central logging solution.
Run one container per machine with the shipper agent (rsyslog, NXLog, Logstash, logstash-forwarder); that agent reads the logs of all containers on the host and ships them to a central location. This is the path we're taking (see the sketch below).
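For the second option, a minimal sketch of a per-host shipper container, here using Filebeat as the agent; the image tag, config path, and output endpoint are assumptions, not details from the original setup:
# Run one shipper container per host; it tails every container's JSON log
# file from the Docker log directory and forwards it to the central endpoint
# defined in filebeat.yml. The docker.sock mount is only needed if the
# config enriches events with container metadata.
docker run -d --name log-shipper \
  -v /var/lib/docker/containers:/var/lib/docker/containers:ro \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  -v "$(pwd)/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro" \
  docker.elastic.co/beats/filebeat:7.17.0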

This is a broad question, but I suggest you set up an Elasticsearch, Logstash, Kibana (ELK) stack:
https://www.elastic.co/products/elasticsearch
https://www.elastic.co/products/logstash
https://www.elastic.co/products/kibana
Then, on each of your containers, you can run the Logstash forwarder/shipper to send logs to your Logstash frontend.
Logs are stored in Elasticsearch, and you can search them using Kibana or the Elasticsearch API; a minimal pipeline sketch follows.
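For illustration, a minimal Logstash pipeline that accepts events from the shipper agents (here via the Beats input, the successor of the old logstash-forwarder/lumberjack protocol) and indexes them into Elasticsearch; the port, host, and index name are assumptions:
# logstash.conf (sketch)
input {
  beats {
    port => 5044                          # agents in the containers ship to this port
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]           # assumed Elasticsearch address
    index => "docker-logs-%{+YYYY.MM.dd}"
  }
}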
Hope it helps.

I am also doing some Docker + Mesos + Marathon work, so I guess I am going to face the same question you have.
I don't know if there's a native solution yet, but there's a blog post by the folks at elastic.io on how they went about solving this issue.
Here's the link - Log aggregation for Docker containers in Mesos / Marathon cluster

Related

AWS EKS with Fargate send logs to logstash

Good afternoon.
I need to collect application logs in an EKS cluster that uses Fargate. In a node-group environment I run Fluent Bit as a DaemonSet to collect the logs and send them to Logstash. But since Fargate doesn't support DaemonSets, I'm looking for alternatives that avoid AWS Elastic, because we need to send the logs to Logstash.
Has anyone done something like this using these products?
You probably already have a Filebeat configuration that sends logs to Logstash, so you can run Filebeat as a sidecar to do the same. If you would like to stick with Fluent Bit (not the AWS-managed Fluent Bit provided by Fargate), you can use the HTTP output plugin of OSS Fluent Bit together with an HTTP input in Logstash. A sidecar sketch follows.
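A rough sketch of the Filebeat sidecar's filebeat.yml, assuming the application writes its logs to a volume shared between the app container and the sidecar; the log path and the Logstash address are assumptions:
# filebeat.yml (sidecar sketch)
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app/*.log                      # emptyDir volume shared with the app container

output.logstash:
  hosts: ["logstash.internal.example:5044"]     # assumed Logstash endpoint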

Does Kubernetes restart a failed container or create a new container when the running container fails for any reason?

I ran a Docker container locally, and it stores data in a file (currently no volume is mounted). I stored some data using the API, then crashed the container with process.exit(1) and started it again. The previously stored data in the container survives (as expected). But when I do the same thing in Kubernetes (minikube), the data is lost.
Posting this as a community wiki for better visibility, feel free to edit and expand it.
As described in the comments, Kubernetes replaces a failed container with a new (identical) one, which explains why the container's filesystem is clean.
Also, as mentioned, containers should be stateless. There are different options for running applications and taking care of their data (a sketch of the persistent-volume approach follows the list):
Run a stateless application using a Deployment
Run a stateful application either as a single instance or as a replicated set
Run automated tasks with a CronJob
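For example, if the file written by the application has to survive container restarts, you could mount a PersistentVolumeClaim into the pod. A minimal sketch; the names, size, image, and mount path are placeholders:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: my-node-app:latest          # placeholder image
      volumeMounts:
        - name: data
          mountPath: /data               # have the app write its file under /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data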
Useful links:
Kubernetes workloads
Pod lifecycle

How to forward logs to s3 from yarn container?

I am setting up Spark on a Hadoop YARN cluster on AWS EC2 machines.
The cluster is ephemeral (it runs for a few hours within a day), so I want to forward the container logs it generates to S3.
I have seen that Amazon EMR supports this feature by forwarding logs to S3 every 5 minutes.
Is there any built-in configuration in Hadoop/Spark that I can leverage?
Any other solution to this problem would also be helpful.
Sounds like you're looking for YARN log aggregation.
I haven't tried changing it myself, but you can configure yarn.nodemanager.remote-app-log-dir to point to an S3 filesystem, assuming you've set up your core-site.xml accordingly (see the sketch below).
yarn.log-aggregation.retain-seconds and yarn.log-aggregation.retain-check-interval-seconds determine how long the aggregated logs are retained and how often expired logs are cleaned up.
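A sketch of the relevant yarn-site.xml properties; the bucket name is a placeholder, and core-site.xml must already have the s3a filesystem configured (credentials or an EC2 instance profile):
<!-- yarn-site.xml (sketch) -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>s3a://my-log-bucket/yarn-app-logs</value>
</property>
<property>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>259200</value> <!-- keep aggregated logs for 3 days -->
</property>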
The alternative would be to build your own AMI that has Fluentd or Filebeat pointing at the local YARN log directories, then set up those log forwarders to write to a remote location. For example, Elasticsearch (or one of the AWS logging solutions) would be a better destination than plain S3.

Apache Cassandra monitoring

What is the best way to monitor whether Cassandra nodes are up? For security reasons, JMX and nodetool are out of the question. I have cluster-metrics monitoring via a REST API, but I understand that even if a node goes down, the REST API will only report on the cluster as a whole.
Well, I have integrated a system that lets me monitor all the metrics of every node in my cluster. It sounds complicated but is pretty simple to set up. You will need the following components to build a monitoring system for Cassandra:
jolokia jar
telegraf
influxdb
grafana
Here is a short procedure describing how it works.
Step 1: copy the Jolokia JVM agent JAR to install_dir/apache-cassandra-version/lib/. The Jolokia JVM agent can be downloaded from the Jolokia website.
Step 2: add the following line to install_dir/apache-cassandra-version/conf/cassandra-env.sh
JVM_OPTS="$JVM_OPTS -javaagent:<here_goes_the_path_of_your_jolokia_jar>"
Step 3: install Telegraf on each node, configure the metrics you want to monitor, and start the Telegraf service (see the sketch after these steps).
Step 4: install Grafana, configure its IP, port, and protocol, and start the Grafana service. Grafana gives you a dashboard to keep an eye on your nodes; this is where your metrics become visible.
Step 5: install InfluxDB on the server where you want to store the metrics data that arrives from the Telegraf agents.
Step 6: open the Grafana address in a browser, add the InfluxDB address as a data source, and then customize your dashboard.
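A minimal Telegraf sketch for step 3, assuming the Jolokia agent listens on its default port 8778 and InfluxDB runs on a host called influxdb-host (both are assumptions):
# telegraf.conf (sketch)
[[outputs.influxdb]]
  urls = ["http://influxdb-host:8086"]        # assumed InfluxDB address
  database = "cassandra"                      # assumed database name

[[inputs.jolokia2_agent]]
  urls = ["http://localhost:8778/jolokia"]    # Jolokia JVM agent default endpoint

  # Example Cassandra metric: client read latency
  [[inputs.jolokia2_agent.metric]]
    name  = "clientrequest_read_latency"
    mbean = "org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency"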
(Image source and further reading: https://blog.pythian.com/monitoring-cassandra-grafana-influx-db/)
This is not full monitoring, only node state.
The Cassandra CQL driver reports whether a particular node is UP or DOWN through the Host.StateListener interface; the driver itself uses this information to mark nodes up or down. It can therefore be used to track node state when JMX is not accessible (see the sketch below).
Java driver API docs: https://docs.datastax.com/en/drivers/java/3.3/
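A minimal sketch of such a listener with the DataStax Java driver 3.x; the contact point is an assumption, and the println calls stand in for whatever alerting you hook up:
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Host;

public class NodeStateMonitor implements Host.StateListener {

    @Override
    public void onUp(Host host) {
        System.out.println("Node UP: " + host.getAddress());
    }

    @Override
    public void onDown(Host host) {
        // hook your alerting/monitoring integration here
        System.out.println("Node DOWN: " + host.getAddress());
    }

    @Override
    public void onAdd(Host host) { }

    @Override
    public void onRemove(Host host) { }

    @Override
    public void onRegister(Cluster cluster) { }

    @Override
    public void onUnregister(Cluster cluster) { }

    public static void main(String[] args) {
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")   // assumed contact point
                .build();
        cluster.register(new NodeStateMonitor());
        cluster.connect();                      // listener now receives UP/DOWN events
    }
}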
I came up with a script that watches for DN (down) nodes in the cluster and reports them to our monitoring setup, which is integrated with PagerDuty.
The script runs on one of our nodes, executes nodetool status every minute, and reports all down nodes.
Here is the script https://gist.github.com/johri21/87d4d549d05c3e2162af7929058a00d1

Logstash: what will the Kibana URL be if there are 2 Elasticsearch instances on the same machine

What will the URL be if there are two Elasticsearch instances on the same machine, say one Logstash config file writing to the embedded Elasticsearch and the other pointing to an external ES instance, both on the same machine?
How do I differentiate the two?
By default we access Kibana using the URL http://<host>:9292.
So far, Kibana 3 does not support this if the two Elasticsearch instances are not in the same cluster, so you have to set up a separate Kibana for each Elasticsearch instance (see the sketch below).
You can follow this discussion about the feature you want.
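For illustration, each Kibana 3 copy is pointed at its own Elasticsearch instance through the elasticsearch setting in its config.js. A rough sketch of the Kibana 3 config format; the port 9201 for the second instance is an assumption:
// config.js of the Kibana copy that should use the second Elasticsearch instance (sketch)
define(['settings'],
function (Settings) {
  return new Settings({
    // point this Kibana copy at the second Elasticsearch instance
    elasticsearch: "http://localhost:9201",
    default_route: '/dashboard/file/default.json',
    kibana_index: "kibana-int"
  });
});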
