kafka.common.OffsetOutOfRangeException - for no valid reason - node.js

I am using the kafka-node library to consume Kafka messages from Node.js. This is the simple code I am using to consume:
var kafka = require('kafka-node'),
    Consumer = kafka.Consumer,
    client = new kafka.Client(),
    consumer = new Consumer(
        client,
        [
            { topic: 't', partition: 0 },
            { topic: 't1', partition: 1 }
        ],
        {
            autoCommit: false
        }
    );

consumer.on('message', function (message) {
    console.log(message);
});
This was working perfectly until recently, but now I keep getting this error:
kafka.common.OffsetOutOfRangeException: Request for offset 19 but we only have log segments in the range 0 to 0.
I don't understand what makes Kafka throw this error for the same code that was working before; I never specified an offset of 19 anywhere. I tried deleting the topics, the logs, etc., but I keep getting this error. Can anyone please guide me on how to handle this?
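One way to recover (a rough sketch, assuming kafka-node's Offset API and consumer.setOffset behave as documented) is to handle the consumer's offsetOutOfRange event and rewind to an offset the broker still has; kafka, client and consumer below are the objects from the snippet above:

var offset = new kafka.Offset(client);

consumer.on('offsetOutOfRange', function (topicPayload) {
    // Ask the broker which offsets are still valid for this topic/partition
    topicPayload.maxNum = 2;
    offset.fetch([topicPayload], function (err, offsets) {
        if (err) return console.error(err);
        // Rewind to the earliest valid offset so consumption can continue
        var min = Math.min.apply(null, offsets[topicPayload.topic][topicPayload.partition]);
        consumer.setOffset(topicPayload.topic, topicPayload.partition, min);
    });
});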


Waiting for leadership elections in KafkaJS

The Situation
I am using kafkajs to write to some dynamically generated Kafka topics.
I am finding that writing to those topics immediately after registering my producer will regularly cause an error: There is no leader for this topic-partition as we are in the middle of a leadership election.
The full error is:
{"level":"ERROR","timestamp":"2020-08-24T17:48:40.201Z","logger":"kafkajs","message":"[Connection] Response Metadata(key: 3, version: 5)","broker":"localhost:9092","clientId":"tv-kitchen","error":"There is no leader for this topic-partition as we are in the middle of a leadership election","correlationId":1,"size":146}
The Code
Here is the code that is causing the problem:
import kafka from 'myConfiguredKafkaJs'

const run = async () => {
    const producer = kafka.producer()
    await producer.connect()

    producer.send({
        topic: 'myRandomTopicString',
        messages: [{
            value: 'yolo',
        }],
    })
}

run()
The Question
Two questions:
Is there anything special I should be doing when connecting to the producer (or sending) in order to ensure that logic blocks until the producer is truly ready to send data to a kafka topic?
Is there anything special I should be doing when sending data to the producer in order to ensure that messages are not dropped?
The Solution
KafkaJS offers a createTopics method through the admin client, which has an optional waitForLeaders flag:
admin.createTopics({
    waitForLeaders: true,
    topics: [
        { topic: 'myRandomTopicString123' },
    ],
})
Using this resolves the problem.
import kafka from 'myConfiguredKafkaJs'

const run = async () => {
    const producer = kafka.producer()
    const admin = kafka.admin()

    await admin.connect()
    await producer.connect()

    await admin.createTopics({
        waitForLeaders: true,
        topics: [
            { topic: 'myRandomTopicString' },
        ],
    })

    await producer.send({
        topic: 'myRandomTopicString',
        messages: [{
            value: 'yolo',
        }],
    })
}

run()
Unfortunately this will result in a different error if the topic already exists, but that's a separate question, and I suspect that error is informational rather than breaking.
{"level":"ERROR","timestamp":"2020-08-24T18:19:48.465Z","logger":"kafkajs","message":"[Connection] Response CreateTopics(key: 19, version: 2)","broker":"localhost:9092","clientId":"tv-kitchen","error":"Topic with this name already exists","correlationId":2,"size":86}
EDIT: the above settings do require that your Kafka instance is properly configured. It is possible for leadership elections to never resolve, in which case KafkaJS will still complain about leadership elections!
In my experience this has been due to situations where a Kafka broker was stopped without being de-registered from ZooKeeper, meaning ZooKeeper is waiting for a response from something that no longer exists.

Kafka-node suddenly consumes from offset 0

Sometimes, the kafka-node consumer starts consuming from offset 0, while its default behavior is to consume only newer messages, and it will not switch back to its default behavior. Do you know how to solve this, and why its behavior suddenly changes? The code is very simple, and this happens without altering the code.
var kafka = require("kafka-node");
Consumer = kafka.Consumer;
client = new kafka.KafkaClient();

consumer = new Consumer(client, [
    { topic: "Topic_23", partition: 0 }
]);

consumer.on("message", function (message) {
    console.log(message);
});
The only solution I have found so far is to change the Kafka topic; then everything works fine again. Any ideas?
In Kafka, offsets are not associated with specific consumers; instead, they are linked to consumer groups. In your code you don't provide a consumer group, so every time you fire up the consumer it is assigned to a different consumer group and the offset starts from 0.
The following should do the trick (obviously the first time you are going to read all the messages):
var kafka = require("kafka-node");
Consumer = kafka.Consumer;
client = new kafka.KafkaClient();

payload = [{
    topic: "Topic_23",
    partition: 0
}];

var options = {
    groupId: 'test-consumer-group',
    fromOffset: 'latest'
};

consumer = new Consumer(client, payload, options);

consumer.on("message", function (message) {
    console.log(message);
});

How to create kafka topic with partitions in nodejs?

I am using the kafka-node API for creating Kafka topics. I did not find how to create a Kafka topic with partitions.
var kafka = require('kafka-node'),
    Producer = kafka.Producer,
    client = new kafka.Client(),
    producer = new Producer(client);

// Create topics sync
producer.createTopics(['t', 't1'], false, function (err, data) {
    console.log(data);
});

// Create topics async
producer.createTopics(['t'], true, function (err, data) {});
producer.createTopics(['t'], function (err, data) {}); // Simply omit 2nd arg
How do I create a Kafka topic with partitions in Node.js?
From your Node.js app, execute the shell command $KAFKA_HOME/bin/kafka-topics.sh --create --topic topicname --partitions 8 --replication-factor 1 --zookeeper localhost:2181, where $KAFKA_HOME is the location where you installed Kafka.
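A minimal sketch of running that command from Node.js with the built-in child_process module (the KAFKA_HOME path, topic name, and ZooKeeper address are placeholders to adapt):

const { execFile } = require('child_process');

execFile(process.env.KAFKA_HOME + '/bin/kafka-topics.sh', [
    '--create',
    '--topic', 'topicname',
    '--partitions', '8',
    '--replication-factor', '1',
    '--zookeeper', 'localhost:2181'
], function (err, stdout, stderr) {
    if (err) return console.error('Topic creation failed:', stderr || err);
    console.log(stdout);
});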
As the documentation describes, this method works only when auto.create.topics.enable is set to true:
This method is used to create topics on the Kafka server. It only works when auto.create.topics.enable, on the Kafka server, is set to true. Our client simply sends a metadata request to the server which will auto create topics. When async is set to false, this method does not return until all topics are created, otherwise it returns immediately.
This means that any operation on an unknown topic will lead to its creation with the default number of partitions, configured by the num.partitions parameter.
I'm not sure, but maybe one of the node-rdkafka implementations would allow you to call the corresponding librdkafka method to create a topic?
I am not completely sure, but I think the code has since been updated to cover your requirement (see https://github.com/SOHU-Co/kafka-node#createtopicstopics-cb), adding a "replicaAssignment" parameter:
// Optional explicit partition / replica assignment
// When this property exists, partitions and replicationFactor properties are ignored
replicaAssignment: [
    {
        partition: 0,
        replicas: [3, 4]
    },
    {
        partition: 1,
        replicas: [2, 1]
    }
]
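For context, a rough sketch of how that property fits into the newer KafkaClient.createTopics call (the topic names, partition counts, and broker ids below are placeholders; check the linked README for the exact shape supported by your kafka-node version):

var kafka = require('kafka-node');
var client = new kafka.KafkaClient({ kafkaHost: 'localhost:9092' });

var topicsToCreate = [
    {
        topic: 'topic-with-partitions',
        partitions: 5,
        replicationFactor: 2
    },
    {
        topic: 'topic-with-explicit-assignment',
        // partitions and replicationFactor are ignored when replicaAssignment is present
        replicaAssignment: [
            { partition: 0, replicas: [3, 4] },
            { partition: 1, replicas: [2, 1] }
        ]
    }
];

client.createTopics(topicsToCreate, function (error, result) {
    // result reports per-topic errors, if any
    console.log(error || result);
});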
The Producer.createTopics method takes a partitions option. See https://www.npmjs.com/package/kafka-node#createtopicstopics-cb
Pass an object, rather than a string:
producer.createTopics(['t', 't1'], true, function (err, data) {});
becomes
producer.createTopics(
    [
        { topic: 't', partitions: 5 },
        { topic: 't1', partitions: 23 },
    ],
    true,
    function (err, data) {}
);

How to control commit of a consumed kafka message using kafka-node

I'm using Node with Kafka for the first time, using kafka-node. Consuming a message requires calling an external API, which might even take a second to respond. I wish to survive sudden failures of my consumer, so that if a consumer fails, the consumer that replaces it will receive the same message whose processing was not completed.
I'm using Kafka 0.10 and trying to use ConsumerGroup.
I thought of setting autoCommit: false in the options and committing the message only once its processing has completed (as I previously did with some Java code a while back).
However, I can't figure out how to correctly commit the message only once it is done. How should I commit it?
Another worry I have is that, because of the callbacks, it seems the next message is read before the previous one has finished. I'm afraid that if message x+2 finishes before message x+1, the offset will be set at x+2, so in case of failure x+1 will never be re-executed.
Here is basically what I have so far:
var options = {
    host: connectionString,
    groupId: consumerGroupName,
    id: clientId,
    autoCommit: false
};

var kafka = require("kafka-node");
var ConsumerGroup = kafka.ConsumerGroup;

var consumerGroup = new ConsumerGroup(options, topic);

consumerGroup.on('connect', function () {
    console.log("Consuming Kafka %s, topic=%s", JSON.stringify(options), topic);
});

consumerGroup.on('message', function (message) {
    console.log('%s read msg Topic="%s" Partition=%s Offset=%d', this.client.clientId, message.topic, message.partition, message.offset);
    console.log(message.value);
    doSomeStuff(function () {
        // HOW TO COMMIT????
        consumerGroup.commit(function (err, data) {
            console.log("------ Message done and committed ------");
        });
    });
});

consumerGroup.on('error', function (err) {
    console.log("Error in consumer: " + err);
    close();
});

process.once('SIGINT', function () {
    close();
});

var close = function () {
    // SHOULD SEND 'TRUE' TO CLOSE ???
    consumerGroup.close(true, function (error) {
        if (error) {
            console.log("Consuming closed with error", error);
        } else {
            console.log("Consuming closed");
        }
    });
};
One thing you can do here is to have a retry mechanism for every message you process.
You can consult my answer on this thread:
https://stackoverflow.com/a/44328233/2439404
I consume messages from Kafka using kafka-consumer, batch them together using async/cargo and put them in async/queue (an in-memory queue). The queue takes a worker function as an argument, to which I am passing an async/retryable.
For your problem, you can just use retryable to do the processing of your messages.
https://caolan.github.io/async/docs.html#retryable
This may solve your problem.
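A rough sketch of that idea under my own assumptions (doSomeStuff stands in for the external API call; the retry count and concurrency are arbitrary): wrap the per-message work in async.retryable, run it through an async.queue, and commit only after the work succeeds.

var async = require('async');

// Retry each message up to 3 times, waiting 1s between attempts
var processMessage = async.retryable({ times: 3, interval: 1000 }, function (message, callback) {
    doSomeStuff(message, callback); // callback(err) triggers a retry
});

// Concurrency of 1 preserves ordering within this consumer
var queue = async.queue(processMessage, 1);

consumerGroup.on('message', function (message) {
    queue.push(message, function (err) {
        if (err) {
            return console.error('Giving up on offset', message.offset, err);
        }
        // Commit only after the message has been fully processed
        consumerGroup.commit(function (commitErr) {
            if (commitErr) console.error('Commit failed', commitErr);
        });
    });
});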

Explicitly decrement the offset in Kafka (for retrying/reprocessing a Kafka message)

I have a Kafka queue with autoCommit set to true. I am looking for a way to set the offset back to a previous position in case an error occurs while processing the current message. Something like:
var consumer = new HighLevelConsumer(client, topics, options);

consumer.on('message', function (message) {
    console.log('Got msg', message);
    var currentOffset = message.offset;
    /*
    // program logic which caused an exception; we want to decrement the offset
    // so that the same message is picked up again
    */
    if (errorOccured) {
        handleErrorCondition(currentOffset);
    }
});

// this function sets the offset to the current-1 position
function handleErrorCondition(currentOffset) {
    offset.commit(groupId, [
        { topic: 'test-kafka', partition: 0, offset: (currentOffset - 1) }
    ], function (err, data) {
        console.log('Attempted to commit the offset # ' + (currentOffset - 1));
        var jsnData = JSON.stringify(data);
        console.log(err + ' - ' + jsnData);
    });
    // consumer.setOffset('test-kafka', 0, (currentOffset - 1)); // this didn't work either
}
But this code doesn't set the offset to currentOffset - 1; it just keeps on reading new messages. The basic intention is to reprocess a failed message by decrementing the offset. I also tried setting autoCommit to false and then explicitly committing the offset upon successful execution of the code. The idea there was that the message would be read again if the code execution failed (because the offset wasn't incremented), but even this doesn't work.
I am using the "kafka-node" package in Node.js with Kafka_2.10 (single node).
Any idea how to make it work? Or any alternate approach to solving the same problem?
PS: I tried an alternate approach of putting the message back in the queue, and it works, but I wanted the offset approach to work.
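For what it's worth, a sketch of one pattern that is often suggested with kafka-node, under my own assumptions (autoCommit turned off, processMessage standing in for the failing program logic, and pause/setOffset/resume/commit behaving as in the kafka-node consumer API): rewind and retry on failure, and commit only after success.

var consumer = new HighLevelConsumer(client, topics, { autoCommit: false });

consumer.on('message', function (message) {
    processMessage(message, function (err) {
        if (err) {
            // Stop fetching, rewind to the failed message, and try it again
            consumer.pause();
            consumer.setOffset(message.topic, message.partition, message.offset);
            consumer.resume();
        } else {
            // Commit only once the message has been processed successfully
            consumer.commit(function (commitErr) {
                if (commitErr) console.log('Commit failed: ' + commitErr);
            });
        }
    });
});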
