Multithreaded Kafka Consumer not processing all the partitions in parallel - multithreading

I have created a multithreaded Kafka consumer in which one thread is assigned to each partition (there are 100 partitions in total). I followed the https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example link.
Below is the init method of my consumer.
consumer = kafka.consumer.Consumer.createJavaConsumerConnector(createConsumerConfig());
System.out.println("Kafka Consumer initialized.");
Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
topicCountMap.put(topicName, 100); // request 100 streams for the topic, one per partition
Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumer.createMessageStreams(topicCountMap);
List<KafkaStream<byte[], byte[]>> streams = consumerMap.get(topicName);
executor = Executors.newFixedThreadPool(100); // one worker thread per stream
In the above init method, I get the list of 100 Kafka streams, each of which should be connected to one of the partitions (which is happening as expected).
I then submitted each of the streams to a different thread using the snippet below.
public Object call() {
    for (final KafkaStream stream : streams) {
        executor.execute(new StreamWiseConsumer(stream));
    }
    return true;
}
Below is the StreamWiseConsumer class.
public class StreamWiseConsumer extends Thread {
    ConsumerIterator<byte[], byte[]> consumerIterator;
    private KafkaStream m_stream;
    private volatile boolean interrupted = false; // shutdown flag, set elsewhere on shutdown

    public StreamWiseConsumer(ConsumerIterator<byte[], byte[]> consumerIterator) {
        this.consumerIterator = consumerIterator;
    }

    public StreamWiseConsumer(KafkaStream kafkaStream) {
        this.m_stream = kafkaStream;
    }

    @Override
    public void run() {
        ConsumerIterator<byte[], byte[]> consumerIterator = m_stream.iterator();
        while (!Thread.currentThread().isInterrupted() && !interrupted) {
            try {
                if (consumerIterator.hasNext()) {
                    String reqId = UUID.randomUUID().toString();
                    System.out.println(reqId + " : Event received by threadId : " + Thread.currentThread().getId());
                    MessageAndMetadata<byte[], byte[]> messageAndMetaData = consumerIterator.next();
                    byte[] keyBytes = messageAndMetaData.key();
                    String key = null;
                    if (keyBytes != null) {
                        key = new String(keyBytes);
                    }
                    byte[] eventBytes = messageAndMetaData.message();
                    if (eventBytes == null) {
                        System.out.println("Topic: No event fetched for transaction Id:" + key);
                        continue;
                    }
                    String event = new String(eventBytes).trim();
                    // Some Processing code
                    System.out.println(reqId + " : Processing completed for threadId = " + Thread.currentThread().getId());
                    consumer.commitOffsets();
                }
            } catch (Exception ex) {
                // exceptions are currently swallowed
            }
        }
    }
}
Ideally, it should start processing all 100 partitions in parallel. Instead, it picks some random number of events from one thread, processes them, and only then does another thread start processing another partition. It looks like sequential processing, just on different threads. I was expecting processing to happen concurrently on all 100 threads. Am I missing something here?
Please find the links to the logs below.
https://drive.google.com/file/d/14b7gqPmwUrzUWewsdhnW8q01T_cQ30ES/view?usp=sharing
https://drive.google.com/file/d/1PO_IEsOJFQuerW0y-M9wRUB-1YJuewhF/view?usp=sharing

I doubt whether this is the right approach for vertically scaling Kafka consumption.
Kafka Streams inherently supports multithreaded consumption.
Increase the number of threads used for processing via the num.stream.threads configuration.
If you want 100 threads to process the 100 partitions, set num.stream.threads to 100, as in the sketch below.
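For illustration, a minimal sketch of that setup, assuming you move to the Kafka Streams API (the application id, broker list, topic name, and per-record processing body are placeholders, not from the original question):
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class StreamsThreadingSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-consumer-app"); // placeholder
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.ByteArray().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.ByteArray().getClass());
        // One stream thread per partition: the 100 partitions are distributed
        // across the 100 threads of this single instance.
        props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 100);

        StreamsBuilder builder = new StreamsBuilder();
        builder.<byte[], byte[]>stream("topicName") // placeholder topic
               .foreach((key, value) -> {
                   // Some Processing code
               });

        new KafkaStreams(builder.build(), props).start();
    }
}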

Related

Sleep ThreadPoolExecutor threads until interruption

This is a sample batch-processing application. Initially I create a thread pool for the batch processing and then submit the batching work to those threads. Once batch processing has started, it keeps looping, searching the queue for entries to batch. That busy-waiting is a huge performance drain and drives CPU usage to the max.
The following block contains the sample code I am currently working on.
BlockingQueue<BatchRequestEntry> batchingQueue;
ThreadPoolExecutor executorService;
private boolean isRunning = false;

private ExecutorService getExecutorService() {
    if (executorService == null) {
        ThreadFactory threadFactory = new ThreadFactoryBuilder().setNameFormat("batch-processor-%d").build();
        executorService = new ThreadPoolExecutor(batchProcessorConfiguration.getThreadCount(), batchProcessorConfiguration.getThreadCount(),
                0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(), threadFactory);
    }
    return executorService;
}
public synchronized void start() {
    if (isRunning) {
        return;
    }
    isRunning = true; // set before submitting, so the loop below sees it
    getExecutorService().submit(() -> {
        while (isRunning) {
            if (!batchingQueue.isEmpty()) {
                // batch(...) is a custom helper on our queue abstraction
                List<BatchRequestEntry> entries = batchingQueue.batch(batchProcessorConfiguration.getBatchingMaxCount());
                executorService.submit(() -> process(entries));
            }
        }
    });
}
public SettableFuture<BatchResultEntry> append(BatchRequestEntry batchRequestEntry) {
    if (!isRunning) {
        start();
    }
    SettableFuture<BatchResultEntry> future = SettableFuture.create();
    batchingQueue.append(batchRequestEntry);
    futures.put(batchRequestEntry.getEntryId(), future);
    return future;
}
What I want to provide as a solution is to count the loop iterations that find the batching queue empty, compare that count with a threshold value, and put the threads in the pool to sleep. Once new entries are appended to the batching queue, I want to interrupt the sleeping thread and continue the batching loop. I think that would solve the problem.
I want to figure out how to do that with ThreadPoolExecutor, and I would appreciate it if there are better approaches to address this problem. Thank you!
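One way to get the same effect without counting empty iterations or juggling sleep/interrupt by hand is to block on the queue itself. Below is a minimal sketch (my assumption, not the original code) that treats the queue as a plain java.util.concurrent.BlockingQueue, written generically over the entry type: poll() parks the thread until an entry arrives, and drainTo() collects whatever else is already queued.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class BatchLoop<T> implements Runnable {
    private final BlockingQueue<T> batchingQueue;
    private final int batchingMaxCount;
    private volatile boolean isRunning = true;

    public BatchLoop(BlockingQueue<T> batchingQueue, int batchingMaxCount) {
        this.batchingQueue = batchingQueue;
        this.batchingMaxCount = batchingMaxCount;
    }

    @Override
    public void run() {
        while (isRunning) {
            try {
                // Blocks until an entry arrives (or the timeout elapses),
                // so an empty queue costs no CPU.
                T first = batchingQueue.poll(1, TimeUnit.SECONDS);
                if (first == null) {
                    continue; // timed out with nothing queued; re-check isRunning
                }
                List<T> entries = new ArrayList<>();
                entries.add(first);
                // Grab whatever else is already queued, up to the batch limit.
                batchingQueue.drainTo(entries, batchingMaxCount - 1);
                process(entries);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag and exit
                return;
            }
        }
    }

    public void stop() {
        isRunning = false;
    }

    private void process(List<T> entries) {
        // hand the batch to the worker pool, as in the question
    }
}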

About understanding partition lease expiration

I have an event hub with 4 partitions and 2 consumer groups, and 2 webjobs that read the data using an EventProcessor, each for a different consumer group.
I have configured the event processors like this:
var host = new EventProcessorHost(
    Guid.NewGuid().ToString(),
    configurationManager.EventHubConfiguration.Path,
    configurationManager.EventHubConfiguration.ConsumerGroupName,
    configurationManager.EventHubConfiguration.ListenerConnectionString,
    configurationManager.StorageConfiguration.ConnectionString)
{
    PartitionManagerOptions = new PartitionManagerOptions
    {
        AcquireInterval = TimeSpan.FromSeconds(10),
        RenewInterval = TimeSpan.FromSeconds(10),
        LeaseInterval = TimeSpan.FromSeconds(30)
    }
};
var options = EventProcessorOptions.DefaultOptions;
options.MaxBatchSize = 250;
await host.RegisterEventProcessorFactoryAsync(new PlanCareEventProcessorFactory(telemetryClient, configurationManager), options);
return host;
In my EventProcessor I keep track of the progress (some methods skipped to keep it short and readable):
internal class PlanCareEventProcessor : IEventProcessor
{
    public Task OpenAsync(PartitionContext context)
    {
        namespaceManager = NamespaceManager.CreateFromConnectionString(configurationManager.EventHubConfiguration.ManagerConnectionString);
        if (namespaceManager == null)
            return Task.CompletedTask;
        var currentSeqNo = context.Lease.SequenceNumber;
        var lastSeqNo = namespaceManager.GetEventHubPartition(context.EventHubPath, context.ConsumerGroupName, context.Lease.PartitionId).EndSequenceNumber;
        var delta = lastSeqNo - currentSeqNo;
        var msg = $"Last processed seqnr for partition {context.Lease.PartitionId}: {currentSeqNo} of {lastSeqNo} in consumergroup '{context.ConsumerGroupName}' (lag: {delta})";
        telemetryClient.TrackTrace(new TraceTelemetry(msg, SeverityLevel.Information));
        telemetryClient.TrackMetric(new MetricTelemetry($"Partition_Lag_{context.Lease.PartitionId}_{context.ConsumerGroupName}", delta));
        return Task.CompletedTask;
    }

    public async Task ProcessEventsAsync(PartitionContext context, IEnumerable<EventData> events)
    {
        progressCounter++;
        ...
        await LogProgress(context);
    }

    private async Task LogProgress(PartitionContext context)
    {
        if (progressCounter >= 100)
        {
            await CheckPointAsync(context);
            progressCounter = 0;
        }
    }
}
Now I noticed a difference between the webjobs in how often OpenAsync and CloseAsync are called: for one of the consumer groups it is about every half hour, while for the other it is several times a minute.
Since both webjobs use the same code and run on the same app plan, what could be the reason for this?
It bothers me because one of the webjobs hardly ever checkpoints via await CheckPointAsync(context), since it does not reach the threshold before the lease is gone.
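As an aside on the checkpointing part: a count-or-time policy would checkpoint even when a partition never reaches the event threshold before its lease moves on. A minimal sketch of the idea (in Java, to match the other examples on this page; the thresholds and the shouldCheckpoint() helper are my own, not part of the Event Hubs SDK):
public class CheckpointPolicy {
    private int progressCounter = 0;
    private long lastCheckpointMillis = System.currentTimeMillis();

    // Returns true when a checkpoint is due: after 100 events OR after
    // 60 seconds, whichever comes first. Both thresholds are placeholders.
    public synchronized boolean shouldCheckpoint() {
        progressCounter++;
        long now = System.currentTimeMillis();
        if (progressCounter >= 100 || now - lastCheckpointMillis >= 60_000) {
            progressCounter = 0;        // reset the event count
            lastCheckpointMillis = now; // and the timer
            return true;
        }
        return false;
    }
}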

Storm analog for spark.streaming.kafka.maxRatePerPartition in Spark

There is a spark.streaming.kafka.maxRatePerPartition property in Spark Streaming, which limits the number of messages read from Apache Kafka per second. Is there a similar property for Storm?
I think there is no property to limit the number of messages per second.
If you use the new Kafka client (Kafka 0.9) spout, you can set maxUncommittedOffsets, which throttles the number of uncommitted offsets (i.e. the number of in-flight messages).
However, if you are still using the old Kafka spout (Kafka prior to 0.9), you can use the Storm property topology.max.spout.pending, which throttles the total number of unacknowledged messages per spout task; see the sketch below.
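For example, a minimal sketch of the old-spout option (package names assume Storm 1.x; on older versions Config lives under backtype.storm, and the threshold is a placeholder):
import org.apache.storm.Config;

public class ThrottledTopologyConfig {
    public static Config build() {
        Config conf = new Config();
        // topology.max.spout.pending: the spout stops emitting once this many
        // tuples are still unacked. It only takes effect in reliable
        // (anchored and acked) topologies.
        conf.setMaxSpoutPending(500);
        return conf;
    }
}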
There is a workaround that helps to do this in Storm. You can simply write the following wrapper for KafkaSpout, which counts the number of messages emitted by the spout per second. When it reaches the desired number (Config.RATE), it emits nothing.
public class MyKafkaSpout extends KafkaSpout {
    private int counter = 0;
    private int currentSecond = 0;
    private final int tuplesPerSecond = Config.RATE;

    public MyKafkaSpout(SpoutConfig spoutConf) {
        super(spoutConf);
    }

    @Override
    public void nextTuple() {
        if (counter == tuplesPerSecond) {
            int newSecond = (int) TimeUnit.MILLISECONDS.toSeconds(System.currentTimeMillis());
            if (newSecond <= currentSecond) {
                // still within the same second: emit nothing this round
                return;
            }
            counter = 0;
            currentSecond = newSecond;
        }
        ++counter;
        super.nextTuple();
    }
}

How to read InputStream only once using CustomReceiver

I have written a custom receiver to receive the stream that is generated by one of our applications. The receiver starts the process, gets the stream, and then calls store. However, the receive method gets called multiple times; I wrote what I thought was a proper loop-break condition, but it does not work. How can I ensure it reads the data only once and does not re-read already processed data?
Here is my custom receiver code:
class MyReceiver() extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) with Logging {

  def onStart() {
    new Thread("Splunk Receiver") {
      override def run() { receive() }
    }.start()
  }

  def onStop() {
  }

  private def receive() {
    try {
      /* My code to run a process and get the stream */
      val reader = new ResultsReader(job.getResults()); // ResultsReader is the reader for the application
      var event: String = reader.getNextLine
      while (!isStopped || event != null) {
        store(event)
        event = reader.getNextLine
      }
      reader.close()
    } catch {
      case t: Throwable =>
        restart("Error receiving data", t)
    }
  }
}
Where did I go wrong?
Problems
1) The job and stream reading happen every 2 seconds and the same data piles up, so for 60 lines of data I sometimes get 1800 or more lines in total.
Streaming Code:
val conf = new SparkConf
conf.setAppName("str1");
conf.setMaster("local[2]")
conf.set("spark.driver.allowMultipleContexts", "true");
val ssc = new StreamingContext(conf, Minutes(2));
val customReceiverStream = ssc.receiverStream(new MyReceiver)
println(" searching ");
//if(customReceiverStream.count() > 0 ){
customReceiverStream.foreachRDD(x => {println("=====>"+ x.count());x.count()});
//}
ssc.start();
ssc.awaitTermination()
Note: I am trying this on my local cluster, with the master set to local[2].
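For what it's worth, one suspect is the loop condition itself: with ||, once the reader is exhausted (event == null) the loop keeps going as long as the receiver is not stopped, store(null) can then fail, and the catch block's restart(...) re-runs receive() and re-reads the same job results. Below is a minimal sketch of the corrected loop, written against Spark's Receiver API in Java (a socket-based BufferedReader stands in for the Splunk ResultsReader; the host, port, and the && fix are my assumptions):
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.Socket;
import org.apache.spark.storage.StorageLevel;
import org.apache.spark.streaming.receiver.Receiver;

public class OnceOnlyReceiver extends Receiver<String> {

    public OnceOnlyReceiver() {
        super(StorageLevel.MEMORY_AND_DISK_2());
    }

    @Override
    public void onStart() {
        new Thread(this::receive, "Splunk Receiver").start();
    }

    @Override
    public void onStop() {
        // nothing to clean up; receive() checks isStopped()
    }

    private void receive() {
        try (Socket socket = new Socket("localhost", 9999); // placeholder source
             BufferedReader reader = new BufferedReader(new InputStreamReader(socket.getInputStream()))) {
            String event = reader.readLine();
            // && instead of ||: stop as soon as the source is exhausted,
            // and never call store(null).
            while (!isStopped() && event != null) {
                store(event);
                event = reader.readLine();
            }
            // No restart() here: restarting re-runs the job and re-reads
            // the same data every batch interval.
        } catch (Throwable t) {
            restart("Error receiving data", t);
        }
    }
}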

Azure Worker Role Asynchronous Receive message: how long should I put the sleep for? (milliseconds)

Sample code here
public override void Run()
{
    while (true)
    {
        IAsyncResult result = CUDClient.BeginReceive(TimeSpan.FromSeconds(10), OnMessageReceive, CUDClient);
        Thread.Sleep(10000);
    }
}
I have tested this Azure worker role: I put 100 messages in the Service Bus queue. Each message performs entity updates as its operation (Entity Framework). It took 15 minutes to process all the messages, which seems too long. Any suggestions to improve this?
Thanks in Advance
Actually, Service Bus is fast enough in my experience. What is wrong here is the Thread.Sleep(10000):
you sleep 10 seconds for each message.
For 100 messages, 100 * 10 = 1000 seconds ≈ 16.7 minutes.
So that is what causes the delay...
Solution:
Don't use Thread.Sleep(10000); it is not needed with BeginReceive (it only makes sense with the blocking Receive).
public override void Run() // This should not be a thread... if it is, your thread will terminate after receiving the first message
{
    IAsyncResult result = CUDClient.BeginReceive(TimeSpan.MaxValue, OnMessageReceive, CUDClient);
}

// In OnMessageReceive:
// {
//     Process the message, then issue the next receive:
//     IAsyncResult result = CUDClient.BeginReceive(TimeSpan.MaxValue, OnMessageReceive, CUDClient);
// }
Using TimeSpan.MaxValue, your connection to the Service Bus is kept open for a long time, so there are fewer null messages (less cost)...
Try using XecMe parallel task for processing the message reading.
XecMe: xecme.codeplex.com
Try this one...
// Somefunction
IAsyncResult result = CUDClient.BeginReceive(OnMessageReceive, CUDClient);
while (true)
    Thread.Sleep(1000); // in case you are using a thread
// Somefunction end

public static void OnMessageReceive(IAsyncResult result)
{
    // Immediately issue the next receive, so messages are processed back to back
    CUDClient.BeginReceive(OnMessageReceive, CUDClient);
    SubscriptionClient queueClient = (SubscriptionClient)result.AsyncState;
    IBusinessLogicProvider Obj;
    try
    {
        // Receive the message with the EndReceive call
        BrokeredMessage receivedmsg = queueClient.EndReceive(result);
        //receivedmsg = CUDClient.Receive();
        if (receivedmsg != null)
        {
            switch (receivedmsg.ContentType)
            {
                case "Project":
                    Obj = new ProjectsBL();
                    Obj.HandleMessage(receivedmsg);
                    receivedmsg.BeginComplete(OnMessageComplete, receivedmsg);
                    break;
            }
        }
    }
    catch (Exception ex)
    {
        // handle/log receive failures
    }
}
I tried this.
while (true)
{
    // read all topic messages in a sequential way...
    IAsyncResult result = CUDClient.BeginReceive(OnMessageReceive, CUDClient);
    Thread.Sleep(1000);
}

public static void OnMessageReceive(IAsyncResult result)
{
    SubscriptionClient queueClient = (SubscriptionClient)result.AsyncState;
    IBusinessLogicProvider Obj;
    try
    {
        // Receive the message with the EndReceive call
        BrokeredMessage receivedmsg = queueClient.EndReceive(result);
        //receivedmsg = CUDClient.Receive();
        if (receivedmsg != null)
        {
            switch (receivedmsg.ContentType)
            {
                case "Project":
                    Obj = new ProjectsBL();
                    Obj.HandleMessage(receivedmsg);
                    receivedmsg.BeginComplete(OnMessageComplete, receivedmsg);
                    break;
            }
        }
    }
    catch (Exception ex)
    {
        // handle/log receive failures
    }
}
It processed all 100 messages in about a minute (00:01:02). A lot better than the previous one.
