Kafka-Streams throwing NullPointerException when consuming - groovy

I have this problem:
When I'm consuming from a topic using the Processor API and call context().forward(K, V) inside the processor, Kafka Streams throws a NullPointerException.
This is the stacktrace of it:
Exception in thread "StreamThread-1" java.lang.NullPointerException
at org.apache.kafka.streams.processor.internals.StreamTask.forward(StreamTask.java:336)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:187)
at org.apache.kafka.streams.processor.ProcessorContext$forward.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:133)
at com.bnsf.ltf.processor.ConversionProcessor.process(ConversionProcessor.groovy:23)
at com.bnsf.ltf.processor.ConversionProcessor.process(ConversionProcessor.groovy)
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:68)
at org.apache.kafka.streams.processor.internals.StreamTask.forward(StreamTask.java:338)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:187)
at org.apache.kafka.streams.processor.internals.SourceNode.process(SourceNode.java:64)
at org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:174)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:320)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:218)
My Gradle dependencies look like this:
compile('org.codehaus.groovy:groovy-all')
compile('org.apache.kafka:kafka-streams:0.10.0.0')
Update: I tried with version 0.10.0.1 and it still throws the same error.
This is the code of the Topology I'm building...
topologyBuilder.addSource('inboundTopic', stringDeserializer, stringDeserializer, conversionConfiguration.inTopic)
        .addProcessor('conversionProcess', new ProcessorSupplier() {
            @Override
            Processor get() {
                return conversionProcessor
            }
        }, 'inboundTopic')
        .addSink('outputTopic', conversionConfiguration.outTopic, stringSerializer, stringSerializer, 'conversionProcess')

stream = new KafkaStreams(topologyBuilder, streamConfig)
stream.start()
My processor looks like this:
@Override
void process(String key, String message) {
    // Call to a service; the return value of the service is set on the
    // local variable named converted
    context().forward(key, converted)
    context().commit()
}

Provide your Processor directly.
.addProcessor('conversionProcess', () -> new MyProcessor(), 'inboundTopic')
MyProcessor should, in turn, inherit from AbstractProcessor.
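For illustration, a minimal Java sketch of such a processor (the class name and the conversion logic are placeholders, not the original code). AbstractProcessor stores the ProcessorContext handed to init() and exposes it via context(), and the supplier lambda above gives Kafka Streams a fresh instance per task:
import org.apache.kafka.streams.processor.AbstractProcessor;

public class MyProcessor extends AbstractProcessor<String, String> {

    @Override
    public void process(String key, String message) {
        // Placeholder for the real conversion-service call
        String converted = message.toUpperCase();
        // context() returns the ProcessorContext that Kafka Streams initialized via init()
        context().forward(key, converted);
        context().commit();
    }
}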

Related

Implementing clustered coordinated timer (runs on one node only in Payara Micro Cluster) using IScheduledExecutorService

I am trying to achieve the following behavior for clustered coordinated events:
a timer (event) is executed in only one thread/JVM in the Payara Micro cluster;
in case a node goes down, the timer (event) will be executed on another node in the cluster.
From the Payara Micro guide:
Persistent timers are NOT coordinated across a Payara Micro cluster.
They are always executed on an instance with the same name that
created the timers.
and
If that instance goes down, the timer will be recreated on another
instance with the same name once it joins the cluster. Until that
time, the timer becomes inactive.
It seems persistent timers will not work as desired in a Payara Micro cluster by definition.
As such, I am trying to use IScheduledExecutorService from Hazelcast, which seems to be a perfect match.
Basically, the implementation with IScheduledExecutorService works well except in the scenario where a new Payara Micro node is starting and joining the cluster (a cluster where some events were already scheduled using IScheduledExecutorService). During this time the following exceptions happen:
Exception 1: java.lang.RuntimeException: ConcurrentRuntime not initialized
[2021-02-15T23:00:31.870+0800] [] [INFO] [] [fish.payara.nucleus.cluster.PayaraCluster] [tid: _ThreadID=63 _ThreadName=hz.angry_yalow.event-5] [timeMillis: 1613401231870] [levelValue: 800] [[
Data Grid Status
Payara Data Grid State: DG Version: 4 DG Name: testClusterDev DG Size: 2
Instances: {
DataGrid: testClusterDev Name: testNode0 Lite: false This: true UUID: 493b19ed-a58d-4508-b9ef-f5c58e05b859 Address: /10.41.0.7:6900
DataGrid: testClusterDev Lite: false This: false UUID: f12342bf-a37e-452a-8c67-1d36dd4dbac7 Address: /10.41.0.7:6901
}]]
[2021-02-15T23:00:32.290+0800] [] [WARNING] [] [com.hazelcast.internal.partition.operation.MigrationRequestOperation] [tid: _ThreadID=160 _ThreadName=ForkJoinPool.commonPool-worker-6] [timeMillis: 1613401232290] [levelValue: 900] [[
[10.41.0.7]:6900 [testClusterDev] [4.1] Failure while executing MigrationInfo{uuid=fc68e9ac-1081-4f9b-a70a-6fb0aae19016, partitionId=27, source=[10.41.0.7]:6900 - 493b19ed-a58d-4508-b9ef-f5c58e05b859, sourceCurrentReplicaIndex=0, sourceNewReplicaIndex=1, destination=[10.41.0.7]:6901 - f12342bf-a37e-452a-8c67-1d36dd4dbac7, destinationCurrentReplicaIndex=-1, destinationNewReplicaIndex=0, master=[10.41.0.7]:6900, initialPartitionVersion=1, partitionVersionIncrement=2, status=ACTIVE}
com.hazelcast.nio.serialization.HazelcastSerializationException: java.lang.RuntimeException: ConcurrentRuntime not initialized
at com.hazelcast.internal.serialization.impl.SerializationUtil.handleException(SerializationUtil.java:103)
at com.hazelcast.internal.serialization.impl.AbstractSerializationService.readObject(AbstractSerializationService.java:292)
at com.hazelcast.internal.serialization.impl.ByteArrayObjectDataInput.readObject(ByteArrayObjectDataInput.java:567)
at com.hazelcast.scheduledexecutor.impl.ScheduledRunnableAdapter.readData(ScheduledRunnableAdapter.java:106)
at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.readInternal(DataSerializableSerializer.java:160)
at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.read(DataSerializableSerializer.java:106)
at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.read(DataSerializableSerializer.java:51)
at com.hazelcast.internal.serialization.impl.StreamSerializerAdapter.read(StreamSerializerAdapter.java:44)
at com.hazelcast.internal.serialization.impl.AbstractSerializationService.readObject(AbstractSerializationService.java:286)
at com.hazelcast.internal.serialization.impl.ByteArrayObjectDataInput.readObject(ByteArrayObjectDataInput.java:567)
at com.hazelcast.scheduledexecutor.impl.TaskDefinition.readData(TaskDefinition.java:144)
at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.readInternal(DataSerializableSerializer.java:160)
at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.read(DataSerializableSerializer.java:106)
at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.read(DataSerializableSerializer.java:51)
at com.hazelcast.internal.serialization.impl.StreamSerializerAdapter.read(StreamSerializerAdapter.java:44)
at com.hazelcast.internal.serialization.impl.AbstractSerializationService.readObject(AbstractSerializationService.java:286)
at com.hazelcast.internal.serialization.impl.ByteArrayObjectDataInput.readObject(ByteArrayObjectDataInput.java:567)
at com.hazelcast.scheduledexecutor.impl.ScheduledTaskDescriptor.readData(ScheduledTaskDescriptor.java:208)
at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.readInternal(DataSerializableSerializer.java:160)
at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.read(DataSerializableSerializer.java:106)
at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.read(DataSerializableSerializer.java:51)
at com.hazelcast.internal.serialization.impl.StreamSerializerAdapter.read(StreamSerializerAdapter.java:44)
at com.hazelcast.internal.serialization.impl.AbstractSerializationService.readObject(AbstractSerializationService.java:286)
at com.hazelcast.internal.serialization.impl.ByteArrayObjectDataInput.readObject(ByteArrayObjectDataInput.java:567)
at com.hazelcast.scheduledexecutor.impl.operations.ReplicationOperation.readInternal(ReplicationOperation.java:87)
at com.hazelcast.spi.impl.operationservice.Operation.readData(Operation.java:750)
at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.readInternal(DataSerializableSerializer.java:160)
at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.read(DataSerializableSerializer.java:106)
at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.read(DataSerializableSerializer.java:51)
at com.hazelcast.internal.serialization.impl.StreamSerializerAdapter.read(StreamSerializerAdapter.java:44)
at com.hazelcast.internal.serialization.impl.AbstractSerializationService.readObject(AbstractSerializationService.java:286)
at com.hazelcast.internal.serialization.impl.ByteArrayObjectDataInput.readObject(ByteArrayObjectDataInput.java:567)
at com.hazelcast.internal.partition.ReplicaFragmentMigrationState.readData(ReplicaFragmentMigrationState.java:97)
at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.readInternal(DataSerializableSerializer.java:160)
at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.read(DataSerializableSerializer.java:106)
at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.read(DataSerializableSerializer.java:51)
at com.hazelcast.internal.serialization.impl.StreamSerializerAdapter.read(StreamSerializerAdapter.java:44)
at com.hazelcast.internal.serialization.impl.AbstractSerializationService.readObject(AbstractSerializationService.java:286)
at com.hazelcast.internal.serialization.impl.ByteArrayObjectDataInput.readObject(ByteArrayObjectDataInput.java:567)
at com.hazelcast.internal.partition.operation.MigrationOperation.readInternal(MigrationOperation.java:249)
at com.hazelcast.spi.impl.operationservice.Operation.readData(Operation.java:750)
at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.readInternal(DataSerializableSerializer.java:160)
at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.read(DataSerializableSerializer.java:106)
at com.hazelcast.internal.serialization.impl.DataSerializableSerializer.read(DataSerializableSerializer.java:51)
at com.hazelcast.internal.serialization.impl.StreamSerializerAdapter.read(StreamSerializerAdapter.java:44)
at com.hazelcast.internal.serialization.impl.AbstractSerializationService.toObject(AbstractSerializationService.java:205)
at com.hazelcast.spi.impl.NodeEngineImpl.toObject(NodeEngineImpl.java:346)
at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:437)
at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.process(OperationThread.java:166)
at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.process(OperationThread.java:136)
at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.executeRun(OperationThread.java:123)
at com.hazelcast.internal.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:102)
Caused by: java.lang.RuntimeException: ConcurrentRuntime not initialized
at org.glassfish.concurrent.runtime.ConcurrentRuntime.getRuntime(ConcurrentRuntime.java:121)
at org.glassfish.concurrent.runtime.InvocationContext.readObject(InvocationContext.java:214)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1184)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2296)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:503)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:461)
at com.hazelcast.internal.serialization.impl.defaultserializers.JavaDefaultSerializers$JavaSerializer.read(JavaDefaultSerializers.java:83)
at com.hazelcast.internal.serialization.impl.defaultserializers.JavaDefaultSerializers$JavaSerializer.read(JavaDefaultSerializers.java:76)
at fish.payara.nucleus.hazelcast.PayaraHazelcastSerializer.read(PayaraHazelcastSerializer.java:84)
at com.hazelcast.internal.serialization.impl.StreamSerializerAdapter.read(StreamSerializerAdapter.java:44)
at com.hazelcast.internal.serialization.impl.AbstractSerializationService.readObject(AbstractSerializationService.java:286)
... 50 more
]]
[2021-02-15T23:00:32.304+0800] [] [WARNING] [] [com.hazelcast.internal.partition.impl.MigrationManager] [tid: _ThreadID=160 _ThreadName=ForkJoinPool.commonPool-worker-6] [timeMillis: 1613401232304] [levelValue: 900] [10.41.0.7]:6900 [testClusterDev] [4.1] Migration failed: MigrationInfo{uuid=fc68e9ac-1081-4f9b-a70a-6fb0aae19016, partitionId=27, source=[10.41.0.7]:6900 - 493b19ed-a58d-4508-b9ef-f5c58e05b859, sourceCurrentReplicaIndex=0, sourceNewReplicaIndex=1, destination=[10.41.0.7]:6901 - f12342bf-a37e-452a-8c67-1d36dd4dbac7, destinationCurrentReplicaIndex=-1, destinationNewReplicaIndex=0, master=[10.41.0.7]:6900, initialPartitionVersion=1, partitionVersionIncrement=2, status=ACTIVE}
This seems to happen because the new node is not fully initialized (as it is just starting). This exception looks less critical compared with the next one.
Exception 2: java.lang.NullPointerException: Failed to execute java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask
[2021-02-15T23:44:19.544+0800] [] [SEVERE] [] [com.hazelcast.spi.impl.executionservice.ExecutionService] [tid: _ThreadID=35 _ThreadName=hz.elated_murdock.scheduled.thread-] [timeMillis: 1613403859544] [levelValue: 1000] [[
[10.4.0.7]:6901 [testClusterDev] [4.1] Failed to execute java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask#55a27ce3
java.lang.NullPointerException
at org.glassfish.concurrent.runtime.ContextSetupProviderImpl.isApplicationEnabled(ContextSetupProviderImpl.java:326)
at org.glassfish.concurrent.runtime.ContextSetupProviderImpl.setup(ContextSetupProviderImpl.java:194)
at org.glassfish.enterprise.concurrent.internal.ContextProxyInvocationHandler.invoke(ContextProxyInvocationHandler.java:94)
at com.sun.proxy.$Proxy154.run(Unknown Source)
at com.hazelcast.scheduledexecutor.impl.ScheduledRunnableAdapter.call(ScheduledRunnableAdapter.java:56)
at com.hazelcast.scheduledexecutor.impl.TaskRunner.call(TaskRunner.java:78)
at com.hazelcast.scheduledexecutor.impl.TaskRunner.run(TaskRunner.java:104)
at com.hazelcast.spi.impl.executionservice.impl.DelegateAndSkipOnConcurrentExecutionDecorator$DelegateDecorator.run(DelegateAndSkipOnConcurrentExecutionDecorator.java:77)
at com.hazelcast.internal.util.executor.CachedExecutorServiceDelegate$Worker.run(CachedExecutorServiceDelegate.java:217)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
at com.hazelcast.internal.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:76)
at com.hazelcast.internal.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:102)
]]
This exception happens on the new node which is joining the cluster. It doesn't always happen; probably Hazelcast is trying to execute the event on the new node while it is still starting, and it fails because the environment is not yet fully initialized. The issue is that after two such failed attempts, the event gets unloaded by Hazelcast.
Implementation insights:
The method which schedules the event using IScheduledExecutorService (it resides in an application-scoped bean in the main app WAR):
@Resource
ContextService _ctxService;

public void sheduleClusteredEvent() {
    IScheduledExecutorService executorService = _instance.getScheduledExecutorService("default");
    ClusteredEvent ce = new ClusteredEvent(new DiagEvent(null, "TestEvent1"));
    Object ceProxy = _ctxService.createContextualProxy(ce, Runnable.class, Serializable.class);
    executorService.scheduleAtFixedRate((Runnable) ceProxy, 0, 3, TimeUnit.SECONDS);
}
The ClusteredEvent class (it resides in a separate JAR added to the classpath via the --addLibs param of Payara Micro). It needs to somehow inform the main app about the event to be triggered, thus BeanManager.fireEvent() is used.
public class ClusteredEvent implements Runnable, Serializable {

    private final DiagEvent _event;

    public ClusteredEvent(DiagEvent event) {
        _event = event;
    }

    @Override
    public void run() {
        // For the sake of brevity, all null checks etc. were removed;
        // ic is a javax.naming.InitialContext (setup omitted)
        ((BeanManager) ic.lookup("java:comp/BeanManager")).fireEvent(_event);
    }
}
So my questions:
How to solve the exceptions / issues mentioned above?
Am I heading in the right direction for achieving coordinated clustered event behaviour in a Payara Micro cluster? I would expect this to be a trivial task working out of the box, but instead it requires some custom implementation, since persistent timers do not work as desired. Is there any other, more elegant way available with Payara Micro Cluster (>=v5.2021.1) of achieving coordinated clustered event behaviour?
Thank you so much in advance!
Update 1:
Just to recall, the main purpose of this exercise is to have coordinated timer (event) functionality available in the Payara Micro cluster, thus suggestions on more elegant solutions are highly welcome.
Addressing questions/suggestions from the comments:
Q1:
why do you need to create a contextual proxy for the event object?
A1: Indeed, making the contextual proxy out of the plain ClusteredEvent() object adds the main complexity here and causes the exceptions listed above (meaning: scheduling ClusteredEvent() without making a contextual proxy out of it works fine and doesn't cause exceptions, but there is a caveat).
The reason a contextual proxy is used is that I need to somehow trigger the main app running on Payara Micro from the unmanaged thread launched by IScheduledExecutorService. So far I haven't found any other workable way of triggering a CDI/EJB bean in the main app from an unmanaged thread. Only making it contextual allows ClusteredEvent.run() to communicate with the main app, via the BeanManager for example.
Any suggestions on how to establish communication between an unmanaged thread and CDI/EJB beans running in a separate app (both running on the same Payara Micro instance) are welcome.
Q2:
You can for example wrap the ceProxy to a Runnable, that executes ceProxy.run() in a try catch block
A2: I have tried it and indeed it helps to handle "Exception 2" mentioned above. I am posting the implementation of the ClusteredEventWrapper class below; the try/catch inside the run() method handles "Exception 2".
Q3:
The first exception comes from Hazelcast trying to deserialize the
proxy on the new instance, which fails because the proxy needs an
initialized environment to deserialize. To solve this, you would need
to wrap the ceProxy object and customize the deserialization of the
wrapper to wait until the ContextService is initialized.
A3: Adding a custom implementation for serialization/deserialization of ClusteredEventWrapper indeed allows handling "Exception 1", but here I am still struggling with the best way of handling it. Postponing deserialization via Thread.sleep() causes new (different) exceptions. Suppressing the exceptions still needs to be checked, but in that case I am afraid ClusteredEventWrapper will not be properly deserialized on the new (starting) node, as Hazelcast will consider the sync successful and will not try to sync it again (I may be wrong; this I still need to check). Currently it seems Hazelcast tries to sync several times until "Exception 1" is gone.
Implementation of the ClusteredEventWrapper which wraps ClusteredEvent:
public class ClusteredEventWrapper implements Runnable, Serializable {

    private static final long serialVersionUID = 5878537035999797427L;
    private static final Logger LOG = Logger.getLogger(ClusteredEventWrapper.class.getName());

    private final Runnable _clusteredEvent;

    public ClusteredEventWrapper(Runnable clusteredEvent) {
        _clusteredEvent = clusteredEvent;
    }

    @Override
    public void run() {
        try {
            _clusteredEvent.run();
        } catch (Throwable e) {
            if (e instanceof NullPointerException
                    && e.getStackTrace() != null && e.getStackTrace().length > 0
                    && "org.glassfish.concurrent.runtime.ContextSetupProviderImpl".equals(e.getStackTrace()[0].getClassName())
                    && "isApplicationEnabled".equals(e.getStackTrace()[0].getMethodName())) {
                // Means we got "Exception 2" (posted above)
                LOG.log(Level.WARNING, "Skipping scheduled event execution on this node as this node is still being initialized...");
            } else {
                LOG.log(Level.SEVERE, "Error executing scheduled event", e);
            }
        }
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        LOG.log(Level.INFO, "1_WRITE_OBJECT...");
        out.defaultWriteObject();
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        LOG.log(Level.INFO, "2_READ_OBJECT...");
        int retry = 0;
        while (readObjectInner(in) != true && retry < 5) { // This doesn't work well, need to think of some other way of handling it
            retry++;
            LOG.log(Level.INFO, "2_READ_OBJECT: retry {0}", retry);
            try {
                // We need to wait
                Thread.sleep(15000);
            } catch (InterruptedException ex) {
            }
        }
    }

    private boolean readObjectInner(ObjectInputStream in) throws IOException, ClassNotFoundException {
        try {
            in.defaultReadObject();
            return true;
        } catch (Throwable e) {
            if (e instanceof RuntimeException && "ConcurrentRuntime not initialized".equals(e.getMessage())) {
                // This means the node which is trying to deserialize this object is not ready yet
                return false;
            } else {
                // For all other exceptions - rethrow
                throw e;
            }
        }
    }
}
So now the event is scheduled in the following way:
@Resource
ContextService _ctxService;

public void sheduleClusteredEvent() {
    IScheduledExecutorService executorService = _instance.getScheduledExecutorService("default");
    ClusteredEvent ce = new ClusteredEvent(new DiagEvent(null, "PersistentEvent1"));
    Object ceProxy = _ctxService.createContextualProxy(ce, Runnable.class, Serializable.class);
    executorService.scheduleAtFixedRate(new ClusteredEventWrapper((Runnable) ceProxy), 0, 3, TimeUnit.SECONDS);
}
Below I am posting the implemented solution, based on suggestions from @OndroMih in the comments:
Excerpt 1:
...a better approach to this is to avoid wrapping your object into a
contextual and instead register BeanManager into a global variable
(singleton) at application startup. In ClusteredEvent.run() you would
retrieve it from a static method, e.g. Registry.getBeanManager(). This
method would have to wait until the application starts up and saves
its BeanManager instance with Registry.setBeanManager()
And this one:
Excerpt 2:
Or maybe even better if you store a reference to the
ManagedExecutorService instead of the BeanManager, execute the run
method with that executor and just inject anything you need.
@OndroMih, please post these as a reply and I will mark it as the accepted answer!
Before going into the details of the implementation, a few words on our application packaging. It consists of:
the main WAR file, which is bundled into Payara Micro as an Uber JAR, so we do not redeploy the application WAR; we start and stop the whole Payara Micro with the WAR deployed on it;
and a tiny JAR lib with a few classes used mainly by Hazelcast, provided via the --addLibs arg to the Payara Micro Uber JAR to avoid ClassNotFoundExceptions when Hazelcast syncs objects in the Data Grid.
And now about the implementation which has given us the desired behavior for the clustered timer/events (see the 1st post):
I) Using ManagedExecutorService as per the suggestion above indeed looks much more flexible, as it allows injecting any desired object into the clustered event, so I started with this approach. But for some reason I was not able to inject anything. Due to limited time I left this for future investigation and switched to the next approach. I am also providing sample code for this case at the end of this post.
II) So I switched to the scenario with BeanManager.
I got the Registry singleton implemented as follows (all comments are removed for the sake of brevity). This class resides in the tiny JAR added via the --addLibs arg to Payara Micro:
public final class Registry {

    private ManagedExecutorService _executorService;
    private BeanManager _beanManager;

    private Registry() {
    }

    public ManagedExecutorService getExecutorService() {
        return _executorService;
    }

    public void setExecutorService(ManagedExecutorService executorService) {
        _executorService = executorService;
    }

    public BeanManager getBeanManager() {
        return _beanManager;
    }

    public void setBeanManager(BeanManager beanManager) {
        _beanManager = beanManager;
    }

    public static Registry getInstance() {
        return InstanceHolder._instance;
    }

    private static class InstanceHolder {
        private static final Registry _instance = new Registry();
    }
}
In the main app WAR we already had an AppListener class which listens for the event fired when the app is deployed, so we added the Registry population logic into it:
public class AppListener implements SystemEventListener {
    ...

    @Resource
    private ManagedExecutorService _managedExecutorService;

    @Resource
    private BeanManager _beanManager;

    @Override
    public void processEvent(SystemEvent event) throws AbortProcessingException {
        try {
            if (event instanceof PostConstructApplicationEvent) {
                LOG.log(Level.INFO, ">> Application started");
                ...
                // Once app marked as started - populate global objects in the Registry
                Registry.getInstance().setExecutorService(_managedExecutorService);
                Registry.getInstance().setBeanManager(_beanManager);
            }
            ...
        } catch (Exception e) {
            LOG.log(Level.SEVERE, ">> Error processing event: " + event, e);
        }
    }
}
The ClusteredEvent class, which is scheduled via IScheduledExecutorService.scheduleAtFixedRate(), also resides in the tiny JAR and has the following implementation:
public final class ClusteredEvent implements NamedTask, Runnable, Serializable {
    ...
    private final MultiTenantEvent _event;

    public ClusteredEvent(MultiTenantEvent event) {
        if (event == null) {
            throw new NullPointerException("Event can not be null");
        }
        _event = event;
    }

    @Override
    public void run() {
        try {
            if (Registry.getInstance().getBeanManager() == null) {
                LOG.log(Level.WARNING, "Skipping timer execution - application not initialized yet...");
                return;
            }
            Registry.getInstance().getBeanManager().fireEvent(_event);
        } catch (Throwable e) {
            LOG.log(Level.SEVERE, "Error executing timer: " + _event, e);
        }
    }

    @Override
    public final String getName() {
        return _event.getName();
    }
}
And basically that is all. Scheduling is done using the following simple steps:
@Resource(lookup = "payara/Hazelcast")
private HazelcastInstance _instance;

_instance.getScheduledExecutorService("default").scheduleAtFixedRate(new ClusteredEvent(event), initialDelaySec, invocationPeriodSec, TimeUnit.SECONDS);
All tests have gone well so far. I was worried about Registry.getBeanManager() getting 'spoiled' after some time due to some closed contexts somewhere (I am not sure about the nature of the BeanManager reference), but tests have shown that the reference to the BeanManager stays valid after 1 day, so hopefully it will work fine.
Another consideration (not really a concern, but a caveat to keep in mind) is that there is no way to control on which node an event is fired by IScheduledExecutorService; as such, when an event is triggered on a node which is not yet initialized (still starting), the event gets skipped. But for our usage scenario this is acceptable, so currently we can live with these considerations.
And getting back to the issue with the usage of ManagedExecutorService: ClusteredEvent was implemented as provided below:
public class ClusteredEvent implements Runnable, Serializable {

    private final MultiTenantEvent _event;

    public ClusteredEvent(MultiTenantEvent event) {
        _event = event;
    }

    @Override
    public void run() {
        try {
            LOG.log(Level.INFO, "TIMER THREAD NAME: {0}", Thread.currentThread().getName());
            if (Registry.getInstance().getExecutorService() == null) {
                LOG.log(Level.WARNING, "Skipping timer execution - application not initialized yet...");
                return;
            }
            Registry.getInstance().getExecutorService().submit(new Callable<Boolean>() {
                @Override
                public Boolean call() throws Exception {
                    LOG.log(Level.INFO, "Timer.Run() THREAD NAME: {0}", Thread.currentThread().getName());
                    String beanManagerJndiName = "java:comp/BeanManager";
                    try {
                        Context ic = new InitialContext();
                        BeanManager beanManager = (BeanManager) ic.lookup(beanManagerJndiName);
                        beanManager.fireEvent(_event);
                        return true;
                    } catch (NullPointerException | NamingException ex) {
                        LOG.log(Level.SEVERE, "ERROR: no BeanManager resource could be located by JNDI name: " + beanManagerJndiName, ex);
                        return false;
                    }
                }
            }).get();
        } catch (Throwable e) {
            LOG.log(Level.SEVERE, "Error executing timer: " + _event, e);
        }
    }
}
Output was the following:
[2021-02-24 07:56:07] [INFO] [ua.appName.model.event.ClusteredEvent run]
TIMER THREAD NAME: hz.competent_mccarthy.cached.thread-11
[2021-02-24 07:56:07] [INFO] [ua.appName.model.event.ClusteredEvent$1 call]
Timer.Run() THREAD NAME: concurrent/__defaultManagedExecutorService-managedThreadFactory-Thread-1
[2021-02-24 07:56:07] [SEVERE] [ua.appName.model.event.ClusteredEvent$1 call]
ERROR: no BeanManager resource could be located by JNDI name: java:comp/BeanManager
javax.naming.NamingException: Lookup failed for 'java:comp/BeanManager' in SerialContext[myEnv={java.naming.factory.initial=com.sun.enterprise.naming.impl.SerialInitContextFactory, java.naming.factory.url.pkgs=com.sun.enterprise.naming, java.naming.factory.state=com.sun.corba.ee.impl.presentation.rmi.JNDIStateFactoryImpl} [Root exception is javax.naming.NamingException: Invocation exception: Got null ComponentInvocation ]
at com.sun.enterprise.naming.impl.SerialContext.lookup(SerialContext.java:496)
at com.sun.enterprise.naming.impl.SerialContext.lookup(SerialContext.java:442)
at javax.naming.InitialContext.lookup(InitialContext.java:417)
at javax.naming.InitialContext.lookup(InitialContext.java:417)
at ua.appName.model.event.ClusteredEvent$1.call(ClusteredEvent.java:70)
at ua.appName.model.event.ClusteredEvent$1.call(ClusteredEvent.java:63)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.glassfish.enterprise.concurrent.internal.ManagedFutureTask.run(ManagedFutureTask.java:143)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
at org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)
Caused by: javax.naming.NamingException: Invocation exception: Got null ComponentInvocation
at com.sun.enterprise.naming.impl.GlassfishNamingManagerImpl.getComponentId(GlassfishNamingManagerImpl.java:870)
at com.sun.enterprise.naming.impl.GlassfishNamingManagerImpl.lookup(GlassfishNamingManagerImpl.java:737)
at com.sun.enterprise.naming.impl.JavaURLContext.lookup(JavaURLContext.java:167)
at com.sun.enterprise.naming.impl.SerialContext.lookup(SerialContext.java:476)
... 11 more
So the line Timer.Run() THREAD NAME: concurrent/__defaultManagedExecutorService-managedThreadFactory-Thread-1 confirms that the code already runs inside a managed thread, but I still was not able to inject or look up anything. I have left this investigation for the future for now.
Once again, many thanks to @OndroMih for your suggestions on the implementation!
Thank you!

How to pause the Spring cloud data flow Source class from sending data to kafka?

I am working on a Spring Cloud Data Flow application; following is the code snippet:
@Bean
@InboundChannelAdapter(channel = TbeSource.PR1, poller = @Poller(fixedDelay = "2000"))
public MessageSource<Product> getProductSource(ProductBuilder dataAccess) {
    return new MessageSource<Product>() {
        @SneakyThrows
        @Override
        public Message<Product> receive() {
            System.out.println("calling method");
            return MessageBuilder.withPayload(dataAccess.getNext()).build();
        }
    };
}
In the above code, the getNext() method gets the data from the database and returns that object, so once the data has been completely read it will return null,
but we can't return null to this MessageSource.
So are there any options available to pause and resume this Source connection class whenever we need to?
Has anyone faced / overcome this scenario?
First of all, you can just have a Supplier<Product> instead of that MessageSource, and your code would be just like this:
return () -> dataAccess.getNext();
The null result is valid over here: no message is going to be emitted in this case, and no error either, since the framework handles a null result properly.
You can still have idle functionality on that @InboundChannelAdapter when the result of the method call is null. For that you need to take a look at the SimpleActiveIdleMessageSourceAdvice. See the docs for more info: https://docs.spring.io/spring-integration/docs/5.3.4.RELEASE/reference/html/core.html#simpleactiveidlereceivemessageadvice
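For illustration, a hedged sketch of the Supplier-based approach as a Spring bean (the class name ProductSourceConfig is made up here, Product and ProductBuilder are assumed from the snippet above, and binding the supplier to an output destination still has to be configured for your binder):
import java.util.function.Supplier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ProductSourceConfig {

    // Returning null from the supplier simply means "no message for this poll";
    // the framework skips the emission instead of failing.
    @Bean
    public Supplier<Product> productSupplier(ProductBuilder dataAccess) {
        return dataAccess::getNext;
    }
}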

Numerous Kafka Producer Network Thread generated during data publishing, Null Pointer Exception Spring Kafka

I am writing a Kafka producer using Spring Kafka 2.3.9 that is supposed to publish around 200000 messages to a topic. For example, I have a list of 200000 objects that I fetched from a database and I want to publish JSON messages of those objects to a topic.
The producer that I have written works fine for publishing, let's say, 1000 messages. Then it starts producing a null pointer error (the error is included below).
During debugging, I found that the number of Kafka producer network threads is very high. I could not count them, but they are definitely more than 500.
I have read the thread Kafka Producer Thread, huge amound of threads even when no message is send and did a similar configuration by setting the producerPerConsumerPartition property to false on DefaultKafkaProducerFactory. But it still does not decrease the Kafka producer network thread count.
Below are my code snippets, the error, and a picture of those threads. I can't post all of the code segments since they are from a real project.
Code segments
public DefaultKafkaProducerFactory<String, String> getProducerFactory() throws IOException, IllegalStateException {
    Map<String, Object> configProps = getProducerConfigMap();
    DefaultKafkaProducerFactory<String, String> defaultKafkaProducerFactory = new DefaultKafkaProducerFactory<>(configProps);
    //defaultKafkaProducerFactory.transactionCapable();
    defaultKafkaProducerFactory.setProducerPerConsumerPartition(false);
    defaultKafkaProducerFactory.setProducerPerThread(false);
    return defaultKafkaProducerFactory;
}

public Map<String, Object> getProducerConfigMap() throws IOException, IllegalStateException {
    Map<String, Object> configProps = new HashMap<>();
    configProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaProperties.getBootstrapAddress());
    configProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    configProps.put(ProducerConfig.RETRIES_CONFIG, kafkaProperties.getKafkaRetryConfig());
    configProps.put(ProducerConfig.ACKS_CONFIG, kafkaProperties.getKafkaAcknowledgementConfig());
    configProps.put(ProducerConfig.CLIENT_ID_CONFIG, kafkaProperties.getKafkaClientId());
    configProps.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 512 * 1024 * 1024);
    configProps.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 10 * 1000);
    configProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    //updateSSLConfig(configProps);
    return configProps;
}

@Bean
public KafkaTemplate<String, String> kafkaTemplate() {
    ProducerFactory<String, String> producerFactory = getProducerFactory();
    KafkaTemplate<String, String> kt = new KafkaTemplate<String, String>(stringProducerFactory, true);
    kt.setCloseTimeout(java.time.Duration.ofSeconds(5));
    return kt;
}
Error
2020-12-07 18:14:19.249 INFO 26651 --- [onPool-worker-1] o.a.k.clients.producer.KafkaProducer : [Producer clientId=kafka-client-09f48ec8-7a69-4b76-a4f4-a418e96ff68e-1] Closing the Kafka producer with timeoutMillis = 0 ms.
2020-12-07 18:14:19.254 ERROR 26651 --- [onPool-worker-1] c.w.p.r.g.xxxxxxxx.xxx.KafkaPublisher : Exception happened publishing to topic. Failed to construct kafka producer
2020-12-07 18:14:19.273 INFO 26651 --- [ main] ConditionEvaluationReportLoggingListener :
Error starting ApplicationContext. To display the conditions report re-run your application with 'debug' enabled.
2020-12-07 18:14:19.281 ERROR 26651 --- [ main] o.s.boot.SpringApplication : Application run failed
java.lang.IllegalStateException: Failed to execute CommandLineRunner
at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:787) [spring-boot-2.2.8.RELEASE.jar:2.2.8.RELEASE]
at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:768) [spring-boot-2.2.8.RELEASE.jar:2.2.8.RELEASE]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:322) [spring-boot-2.2.8.RELEASE.jar:2.2.8.RELEASE]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1226) [spring-boot-2.2.8.RELEASE.jar:2.2.8.RELEASE]
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1215) [spring-boot-2.2.8.RELEASE.jar:2.2.8.RELEASE]
at xxx.xxx.xxx.Application.main(Application.java:46) [classes/:na]
Caused by: java.util.concurrent.CompletionException: java.lang.NullPointerException
at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273) ~[na:1.8.0_144]
at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280) ~[na:1.8.0_144]
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1592) ~[na:1.8.0_144]
at java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1582) ~[na:1.8.0_144]
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) ~[na:1.8.0_144]
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056) ~[na:1.8.0_144]
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692) ~[na:1.8.0_144]
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157) ~[na:1.8.0_144]
Caused by: java.lang.NullPointerException: null
at com.xxx.xxx.xxx.xxx.KafkaPublisher.publishData(KafkaPublisher.java:124) ~[classes/:na]
at com.xxx.xxx.xxx.xxx.lambda$0(Publisher.java:39) ~[classes/:na]
at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_144]
at com.xxx.xxx.xxx.xxx.publishData(Publisher.java:38) ~[classes/:na]
at com.xxx.xxx.xxx.xxx.Application.lambda$0(Application.java:75) [classes/:na]
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590) ~[na:1.8.0_144]
... 5 common frames omitted
Following is the code for publishing the message; line number 124 is where we actually call KafkaTemplate:
public void publishData(Object object) {
    ListenableFuture<SendResult<String, String>> future = null;
    // Convert the Object to JSON
    String json = convertObjectToJson(object);
    // Generate unique key for the message
    String key = UUID.randomUUID().toString();
    // Post the JSON to Kafka
    try {
        future = kafkaConfig.kafkaTemplate().send(kafkaProperties.getTopicName(), key, json);
    } catch (Exception e) {
        logger.error("Exception happened publishing to topic. {}", e.getMessage());
    }
    future.addCallback(new ListenableFutureCallback<SendResult<String, String>>() {
        @Override
        public void onSuccess(SendResult<String, String> result) {
            logger.info("Sent message with key=[" + key + "]");
        }

        @Override
        public void onFailure(Throwable ex) {
            logger.error("Unable to send message=[ {} due to {}", json, ex.getMessage());
        }
    });
    kafkaConfig.kafkaTemplate().flush();
}
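Worth noting: if send() throws in the snippet above, future stays null and the later future.addCallback(...) call would itself raise a NullPointerException. A minimal defensive sketch that keeps the callback inside the try block, reusing the names from the snippet above:
try {
    ListenableFuture<SendResult<String, String>> sendFuture =
            kafkaConfig.kafkaTemplate().send(kafkaProperties.getTopicName(), key, json);
    // Attach the callback only when the send call itself succeeded,
    // so a failed send cannot lead to a null dereference.
    sendFuture.addCallback(new ListenableFutureCallback<SendResult<String, String>>() {
        @Override
        public void onSuccess(SendResult<String, String> result) {
            logger.info("Sent message with key=[{}]", key);
        }

        @Override
        public void onFailure(Throwable ex) {
            logger.error("Unable to send message=[{}] due to {}", json, ex.getMessage());
        }
    });
} catch (Exception e) {
    logger.error("Exception happened publishing to topic. {}", e.getMessage());
}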
============================
I am not sure if this error is caused by those many network threads.
After posting the data, I called the KafkaTemplate flush method. It did not work.
I also called the ProducerFactory closeThreadBoundProducer, reset, and destroy methods. None of them seems to work.
Am I missing any configuration?
The null pointer issue was not actually related to Spring Kafka. We were reading the topic name from a different location reached over a network connection. That network connection was failing in a few cases, and that caused the null pointer issue which ultimately caused the above error.

Register Java Class in Flink Cluster

I am running my fat JAR in a Flink cluster; it reads from Kafka and saves to Cassandra. The code is:
final Properties prop = getProperties();
final FlinkKafkaConsumer<String> flinkConsumer =
        new FlinkKafkaConsumer<>(kafkaTopicName, new SimpleStringSchema(), prop);
flinkConsumer.setStartFromEarliest();

final DataStream<String> stream = env.addSource(flinkConsumer);

DataStream<Person> sensorStreaming = stream.flatMap(new FlatMapFunction<String, Person>() {
    @Override
    public void flatMap(String value, Collector<Person> out) throws Exception {
        try {
            out.collect(objectMapper.readValue(value, Person.class));
        } catch (JsonProcessingException e) {
            logger.error("Json Processing Exception", e);
        }
    }
});

savePersonDetails(sensorStreaming);
env.execute();
and the Person POJO contains:
@Column(name = "event_time")
private Instant eventTime;
A codec is required on the Cassandra side to store Instant, registered as below:
final Cluster cluster = ClusterManager.getCluster(cassandraIpAddress);
cluster.getConfiguration().getCodecRegistry().register(InstantCodec.instance);
When I run it standalone it works fine, but when I run it on a local cluster it throws the error below:
Caused by: com.datastax.driver.core.exceptions.CodecNotFoundException: Codec not found for requested operation: [timestamp <-> java.time.Instant]
at com.datastax.driver.core.CodecRegistry.notFound(CodecRegistry.java:679)
at com.datastax.driver.core.CodecRegistry.createCodec(CodecRegistry.java:526)
at com.datastax.driver.core.CodecRegistry.findCodec(CodecRegistry.java:506)
at com.datastax.driver.core.CodecRegistry.access$200(CodecRegistry.java:140)
at com.datastax.driver.core.CodecRegistry$TypeCodecCacheLoader.load(CodecRegistry.java:211)
at com.datastax.driver.core.CodecRegistry$TypeCodecCacheLoader.load(CodecRegistry.java:208)
I read the document below about registering custom serializers:
https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/custom_serializers.html
but InstantCodec is a 3rd-party one. How can I register it?
I solved the problem: there was a LocalDateTime being emitted, and when I converted it with the same type, the above error occurred. I changed the type to java.util.Date and then it worked.
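Alternatively, a hedged sketch of registering the codec inside the Flink Cassandra connector's ClusterBuilder, so the registration runs on each task manager where the sink builds its own Cluster (the contact point is a placeholder and the flink-connector-cassandra dependency is assumed):
import com.datastax.driver.core.Cluster;
import com.datastax.driver.extras.codecs.jdk8.InstantCodec;
import org.apache.flink.streaming.connectors.cassandra.CassandraSink;
import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;

CassandraSink.addSink(sensorStreaming)
        .setClusterBuilder(new ClusterBuilder() {
            @Override
            protected Cluster buildCluster(Cluster.Builder builder) {
                Cluster cluster = builder.addContactPoint("127.0.0.1").build();
                // Register the 3rd-party codec on the Cluster instance created on this worker
                cluster.getConfiguration().getCodecRegistry().register(InstantCodec.instance);
                return cluster;
            }
        })
        .build();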

How to read messages in MQs using spark streaming,i.e ZeroMQ,RabbitMQ?

As the Spark docs say, Kafka is supported as a data streaming source, but I use ZeroMQ, and there is no ZeroMQUtils. So how can I use it? And generally, how about other MQs? I am totally new to Spark and Spark Streaming, so I am sorry if the question is stupid. Could anyone give me a solution? Thanks.
BTW, I use Python.
Update: I finally did it in Java with a custom receiver. Below is my solution:
public class ZeroMQReceiver<T> extends Receiver<T> {

    private static final ObjectMapper mapper = new ObjectMapper();

    public ZeroMQReceiver() {
        super(StorageLevel.MEMORY_AND_DISK_2());
    }

    @Override
    public void onStart() {
        // Start the thread that receives data over a connection
        new Thread(this::receive).start();
    }

    @Override
    public void onStop() {
        // There is nothing much to do as the thread calling receive()
        // is designed to stop by itself once isStopped() returns true
    }

    /** Create a socket connection and receive data until receiver is stopped */
    private void receive() {
        String message = null;
        try {
            ZMQ.Context context = ZMQ.context(1);
            ZMQ.Socket subscriber = context.socket(ZMQ.SUB);
            subscriber.connect("tcp://ip:port");
            subscriber.subscribe("".getBytes());

            // Until stopped or connection broken continue reading
            while (!isStopped() && (message = subscriber.recvStr()) != null) {
                List<T> results = mapper.readValue(message,
                        new TypeReference<List<T>>() {});
                for (T item : results) {
                    store(item);
                }
            }
            // Restart in an attempt to connect again when server is active again
            restart("Trying to connect again");
        } catch (Throwable t) {
            // restart if there is any other error
            restart("Error receiving data", t);
        }
    }
}
I assume you are talking about Structured Streaming.
I am not familiar with ZeroMQ, but an important point about Spark Structured Streaming sources is replayability (in order to ensure fault tolerance), which, if I understand correctly, ZeroMQ doesn't deliver out of the box.
A practical approach would be buffering the data either in Kafka and using the KafkaSource, or as files in a directory (local FS/NFS, HDFS, S3) and using the FileSource for reading. Cf. the Spark docs. If you use the FileSource, make sure not to append anything to an existing file in the FileSource's input directory, but move files into the directory atomically.
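For reference, a minimal sketch of reading the buffered data back through the Kafka source in Structured Streaming (the bootstrap server address and topic name are placeholders):
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

SparkSession spark = SparkSession.builder().appName("zeromq-buffered").getOrCreate();

// Messages previously buffered into Kafka become a replayable streaming source
Dataset<Row> messages = spark.readStream()
        .format("kafka")
        .option("kafka.bootstrap.servers", "host1:9092")
        .option("subscribe", "buffered-topic")
        .load();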
