After Linux restart, Kafka throwing "no brokers found when trying to rebalance"

I followed an excellent step-by-step tutorial for installing Kafka on Linux. Everything was working fine for me until I restarted Linux. After the restart, I get the following error when I try to consume a queue with kafka-console-consumer.sh.
$ ~/kafka/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic TutorialTopic --from-beginning
[2016-02-04 03:16:54,944] WARN [console-consumer-6966_bob-kafka-storm-1454577414492-8728ae43], no brokers found when trying to rebalance. (kafka.consumer.ZookeeperConsumerConnector)
Before I ran the kafka-console-consumer.sh script, I did push data to the Kafka queue using the kafka-console-producer.sh script. These steps worked without issue before restarting Linux.
I can fix the error by manually starting Kafka, but I would rather have Kafka start automatically.
What might cause Kafka to not start correctly after restart?
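One common way to get Kafka back automatically after a reboot is to run ZooKeeper and Kafka as systemd services. A minimal sketch, assuming Kafka lives under /home/bob/kafka as in the tutorial (the paths, user name and a matching zookeeper.service are assumptions to adapt to your install):

# /etc/systemd/system/kafka.service (hypothetical paths - adjust to your install)
[Unit]
Description=Apache Kafka broker
Requires=zookeeper.service
After=zookeeper.service

[Service]
User=bob
ExecStart=/home/bob/kafka/bin/kafka-server-start.sh /home/bob/kafka/config/server.properties
ExecStop=/home/bob/kafka/bin/kafka-server-stop.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target

Then enable it so it starts on boot:

sudo systemctl enable kafka
sudo systemctl start kafka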

So I had a similar issue in our Hortonworks cluster earlier today. It seems like ZooKeeper was not starting correctly. I first tried kafka-console-producer and got the exception below:
kafka-console-producer.sh --broker-list=localhost:9093 --topic=some_topic < /tmp/sometextfile.txt
kafka.common.KafkaException: fetching topic metadata for topics
The solution for me was to restart the server that had stopped responding. Yours may be different, but play around with the console producer and see what errors you're getting.
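If you want to check whether any broker is actually registered before trying the console producer again, you can ask ZooKeeper directly (a quick sketch, assuming ZooKeeper on localhost:2181):

# list the broker ids currently registered in ZooKeeper; an empty list means no broker is up
~/kafka/bin/zookeeper-shell.sh localhost:2181 ls /brokers/ids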

I had this same issue today when running a consumer:
WARN [console-consumer-74006_my-machine.com-1519117041191-ea485191], no brokers found when trying to rebalance. (kafka.consumer.ZookeeperConsumerConnector)
It turned out to be a disk full issue on the machine. Once space was freed up, the issue was resolved.
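A quick way to rule this out is to check free space on the filesystem that holds the Kafka data directory (a sketch, assuming the default log.dirs of /tmp/kafka-logs):

df -h /tmp/kafka-logs      # free space on the filesystem holding the Kafka data
du -sh /tmp/kafka-logs/*   # which partitions are taking up the space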

Deleting everything relating to ZooKeeper and Kafka out of the /tmp folder worked for me. But my system is not production, so proceed with caution.

By default Kafka stores all of its data under the /tmp folder, which is typically cleared on restart. You can change where those files go by setting the log.dirs property in config/server.properties.
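For example, to keep the data out of /tmp, point log.dirs (and ZooKeeper's dataDir, if you run the bundled ZooKeeper) at a persistent location; the paths below are only an illustration:

# config/server.properties
log.dirs=/var/lib/kafka-logs

# config/zookeeper.properties
dataDir=/var/lib/zookeeper

Create the directories, adjust permissions, and restart ZooKeeper and Kafka afterwards.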

In my case, I started a consumer to fetch messages from the broker cluster, but the Kafka server was shut down, so it returned the message "no brokers found when trying to rebalance".
When I restarted the Kafka server, the error disappeared.

For the producer and consumer to work you have to start a broker. The broker is what mediates message passing from the producer to the consumer.
sh kafka/bin/kafka-server-start.sh kafka/config/server.properties
The broker runs on port 9092 by default.
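Once the broker is started, you can confirm it is actually listening before retrying the producer or consumer (a quick check, assuming the default port):

ss -ltn | grep 9092                                # is anything listening on the broker port?
nc -z localhost 9092 && echo "broker port open"    # same check with netcat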

Related

Kafka rolling restart – what is the right approach to perform Kafka rolling restart on production clusters

We are using Kafka version 2.7.1. The cluster includes 5 Kafka machines on Linux RHEL 7.6.
We want to restart Kafka on all 5 brokers, but since this is a production cluster, a rolling restart should be the right way.
So we wrote a bash script that does the following:
1. stop the kafka01 broker (via systemctl)
2. verify no Kafka PID (process) remains
3. start the kafka01 broker
4. verify kafka01 is listening on Kafka port 6667
and then repeat the same steps 1-4 on kafka02, kafka03, kafka04 and kafka05.
In addition to steps 1-4, we want to add a verification that no corrupted indexes appear after starting the Kafka broker (they show up in the Kafka log, server.log) before continuing to the next broker, but we are not sure whether this additional step is needed.
NOTE - after a Kafka broker restart we usually see in server.log that Kafka tries to fix corrupted indexes (this can take anywhere from about 10 minutes to an hour).
After restarting a broker, make sure there are no under-replicated partitions before moving on to the next broker:
kafka-topics.sh --bootstrap-server kafka01:9092 --describe --under-replicated-partitions | wc -l
To avoid an outage while one broker is down, make sure all your topics use replication.factor=3 and min.insync.replicas=2.
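To check and, if needed, change these settings on an existing topic, something like the following can be used (the topic name is a placeholder):

# show replication factor, ISR and topic-level overrides
kafka-topics.sh --bootstrap-server kafka01:9092 --describe --topic my_topic
# set min.insync.replicas on the topic (the replication factor itself can only be changed via a partition reassignment)
kafka-configs.sh --bootstrap-server kafka01:9092 --alter --entity-type topics --entity-name my_topic --add-config min.insync.replicas=2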
To avoid passing the controller around more than once, check which broker is currently the Kafka controller and restart it last:
zookeeper-shell.sh [ZK_IP] get /controller
When the controller broker is restarted, the controller role is assigned to another running broker, so restarting it last ensures the controller only moves once.
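Putting the above together, a rolling-restart loop could look roughly like the sketch below. The host names, systemd unit name, SSH access, ports and log path come from the question or are assumptions, so adjust them to your environment:

#!/bin/bash
# rough rolling-restart sketch - broker list, unit name, ports and paths are assumptions
BROKERS="kafka01 kafka02 kafka03 kafka04 kafka05"   # order them so the current controller host is last
zookeeper-shell.sh zk01:2181 get /controller | grep -o '"brokerid":[0-9]*'   # find the controller

for host in $BROKERS; do
    ssh "$host" "sudo systemctl stop kafka"
    while ssh "$host" "pgrep -f kafka.Kafka" > /dev/null; do sleep 5; done   # verify no Kafka PID remains
    ssh "$host" "sudo systemctl start kafka"
    until ssh "$host" "ss -ltn | grep -q 6667"; do sleep 5; done             # wait for the broker port

    # optional: watch server.log until index recovery messages stop before moving on
    # ssh "$host" "grep -ci corrupt /var/log/kafka/server.log"

    # do not continue while any partition is under-replicated
    while [ "$(kafka-topics.sh --bootstrap-server kafka01:9092 --describe --under-replicated-partitions | wc -l)" -gt 0 ]; do
        sleep 30
    done
done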

Kafka listening, even when AWS SecGrp is turned off

We're seeing a really strange issue here. We have Kafka running on AWS, with the producer instance talking to these Kafka instances, all within the same VPC.
To simulate a Kafka-down scenario, we removed the entry in the Kafka instances' AWS SecGrp for 9092-9093 ports incoming from the producer's SecGrp. In other words, the producer should ideally NOT be able to talk to Kafka anymore. But, here's the strange part: Kafka continues to ingest data from the producer for another couple of minutes even after this.
There HAS to be something more to this than meets the eye! Can someone enlighten me, please?
It sounds like there is an existing, open connection that is persisting for a few minutes.
You should be able to run something on either end to let you know what's happening with open connections - probably netstat would help.
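For example, on the producer host you can watch the established connections to the broker ports; an already-open session keeps showing up until the TCP connection is finally torn down:

# established TCP connections to the Kafka ports (9092/9093 here)
ss -tn state established '( dport = :9092 or dport = :9093 )'
# or with classic netstat
netstat -tn | grep -E ':909[23]'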

What happens to data sent to NodeRed output node which is currently down?

I am currently implementing a flow in Node-RED where an MQTT subscriber node sends data to a Kafka producer node, i.e. an output node in Node-RED.
If the Kafka producer node is not able to send data because the remote Kafka is down, what happens to the data that the MQTT subscriber node pushed to the Kafka producer node?
I cannot afford to lose a single data set.
That will depend on how the Kafka producer node has been written, but having had a quick look at the src, it seems to just log an error and throw the message away if there is a problem delivering it to Kafka.
There is no retry/queuing built into Node-RED; it would have to be added to a given output node. The problem comes with working out what should be kept and for how long, and whether it should be stored on disk or in memory...
I solved the issue by adding a "Catch" node which catches the error thrown by the Kafka producer node (when the remote cluster is unavailable, it throws the data along with the error). The data can then be extracted and re-sent to another cluster.

Azure Eventhub Apache Storm issue

I followed this article to try Event Hubs with Apache Storm, but when I run the Storm topology it receives events for a minute and then stops receiving. When I restart my program, it receives the remaining messages. Every time I run the program, it stops receiving from Event Hubs after about a minute. Please help me understand the possible causes of this issue.
Should I change any configuration in Storm or ZooKeeper?
The above jar contains a fix for a known issue in the QPID JMS client, which is used by the Event Hub spout implementation. When the service sends an empty frame (heartbeat to keep connection alive), a decoding error occurs in the client and that causes the client to stop processing commands. Details of this issue can be found here: https://issues.apache.org/jira/browse/QPID-6378

Using Apache Kafka for log aggregation

I am learning Apache Kafka from their quickstart tutorial: http://kafka.apache.org/documentation.html#quickstart. Up to now, I have done the setup as follows: a producer node, where a web server is running on port 8888, and a Kafka server (broker), consumer and ZooKeeper instance on another node. I have tested the default console/file producer and consumer with 3 partitions. The setup works, and I am able to see the messages I sent in the order they were created (within each partition).
Now, I want to send the logs generated by the web server to the Kafka broker. These messages will be processed by a consumer later. Currently I am using syslog-ng to capture server logs to a text file. I have come up with 3 rough ideas on how to implement the producer to use Kafka for log aggregation.
Producer Implementations
First kind: Listen to the TCP port of syslog-ng, fetch each message and send it to the Kafka server. Here we have two middle processes: the producer and syslog-ng.
Second kind: Use syslog-ng itself as the producer. We would need to find a way to send messages to the Kafka server instead of writing them to a file. Syslog-ng, the producer, is the only middle process.
Third kind: Configure the web server itself as the producer.
Am I correct in my thinking? In the last case we don't have any middle process, but I am concerned its implementation will affect server performance. Can anyone let me know the best way of using Apache Kafka (if the above 3 are not good) and guide me through an appropriate server configuration?
P.S.: I am using node.js for my web server
Thanks,
Sarath
Since you say you wish to send the generated logs to the Kafka broker, it indeed looks as if running an extra process just to listen and resend messages mainly creates another point of failure with no additional value (unless you need a specific syslog-ng capability).
Syslog-ng can send messages to external applications using:
http://www.balabit.com/sites/default/files/documents/syslog-ng-ose-3.4-guides/en/syslog-ng-ose-v3.4-guide-admin/html/configuring-destinations-program.html. I don't know if there are other ways to do that.
For the third option, I am not sure if Kafka can easily be integrated into Node.js, as it requires a C++ producer, and when I last looked for one I was not able to find it. However, an easy alternative could be to have Kafka read the log file created by the server and send those logs (using the console producer provided with Kafka). This is usually a good way, as it completely removes the dependency between Kafka and the web server (embedding the producer would require error handling, configuration, etc.). It requires the use of tail --follow, and it works very well for us. If you wish for more details on that, I can include them as well. You would still need to supervise Kafka execution to make sure messages are not lost (and provide a recovery option to re-send messages that failed while Kafka was unavailable). But the good thing about this method is that there are no dependencies between the tools.
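For reference, the tail-based approach is essentially a one-liner; the log path and topic name below are placeholders:

# follow the web server log (surviving rotation) and push each line to Kafka
tail -F /var/log/webserver/access.log | ~/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic weblogs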
Hope it helps...
Eran
