Hazelcast Jet: error when enriching a stream using direct lookup

I am following the docs to try out how to enrich an unbounded stream by directly looking up values in an IMap. I have two maps:
Product: Map<String, Product> (ProductId as key)
Seller: Map<String, Seller> (SellerId as key)
Both Product and Seller are very simple classes:
public class Product implements DataSerializable {
    String productId;
    String sellerId;
    int price;
    ...

public class Seller implements DataSerializable {
    String sellerId;
    int revenue;
    ...
I have two data generators that keep pushing data to the two maps. The event journal is enabled for both maps, and I have verified that it works fine.
I want to enrich the stream of events from the Product map with the Seller map. Here is a snippet of my code:
IMap<String, Seller> sellerIMap = jetClient.getMap(SellerDataGenerator.SELLER_MAP);
StreamSource<Product> productStreamSource = Sources.mapJournal(
        ProductDataGenerator.PRODUCT_MAP, Util.mapPutEvents(), Util.mapEventNewValue(), START_FROM_CURRENT);
p.drawFrom(productStreamSource)
 .withoutTimestamps()
 .groupingKey(Product::getSellerId)
 .mapUsingIMap(sellerIMap, (product, seller) -> new EnrichedProduct(product, seller))
 .drainTo(getSink());
try {
    JobConfig jobConfig = new JobConfig();
    jobConfig.addClass(TaskSubmitter.class).addClass(Seller.class).addClass(Product.class).addClass(ExtendedProduct.class);
    jobConfig.setName(Constants.BASIC_TASK);
    Job job = jetClient.newJob(p, jobConfig);
} finally {
    jetClient.shutdown();
}
When the job was submitted, I got the following error:
com.hazelcast.spi.impl.operationservice.impl.Invocation - [172.31.33.212]:80 [jet] [3.1] Failed asynchronous execution of execution callback: com.hazelcast.util.executor.DelegatingFuture$DelegatingExecutionCallback#77ac0407for call Invocation{op=com.hazelcast.map.impl.operation.GetOperation{serviceName='hz:impl:mapService', identityHash=1939050026, partitionId=70, replicaIndex=0, callId=-37944, invocationTime=1570410704479 (2019-10-07 01:11:44.479), waitTimeout=-1, callTimeout=60000, name=sellerMap}, tryCount=250, tryPauseMillis=500, invokeCount=1, callTimeoutMillis=60000, firstInvocationTimeMs=1570410704479, firstInvocationTime='2019-10-07 01:11:44.479', lastHeartbeatMillis=0, lastHeartbeatTime='1970-01-01 00:00:00.000', target=[172.31.33.212]:80, pendingResponse={VOID}, backupsAcksExpected=0, backupsAcksReceived=0, connection=null}
I tried running one and two instances in my cluster and got the same error message. I couldn't figure out the root cause.

It seems that your problem is a ClassNotFoundException, even though you added the appropriate classes to the job. The objects you store in the IMap exist independently of your Jet job, and when the event journal source asks for them, Jet's IMap code tries to deserialize them and fails because Jet doesn't have your domain model classes on its classpath.
To move on, add a JAR with the classes you use in the IMap to Jet's classpath. We are looking for a solution that will remove this requirement.
The reason you don't see the exception stack trace in the log output is the default java.util.logging setup you end up with when you don't explicitly add a more flexible logging module, such as Log4j.
The next version of Jet's packaging will improve this aspect. Until that time you can follow these steps:
Go to the lib directory of Jet's distribution package and download Log4j into it:
$ cd lib
$ wget https://repo1.maven.org/maven2/log4j/log4j/1.2.17/log4j-1.2.17.jar
Edit bin/common.sh to add the module to the classpath. Towards the end of the file there is a line
CLASSPATH="$JET_HOME/lib/hazelcast-jet-3.1.jar:$CLASSPATH"
You can duplicate this line and replace hazelcast-jet-3.1 with log4j-1.2.17.
At the end of common.sh there is a multi-line command that constructs the JAVA_OPTS variable. Add "-Dhazelcast.logging.type=log4j" and "-Dlog4j.configuration=file:$JET_HOME/config/log4j.properties" to the list.
Create a file log4j.properties in the config directory:
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %5p [%c{1}] [%t] - %m%n
# Change this level to debug to diagnose failed cluster formation:
log4j.logger.com.hazelcast.internal.cluster=info
log4j.logger.com.hazelcast.jet=info
log4j.rootLogger=info, stdout

Related

How to read Cassandra FQL logs in Java?

I have a bunch of Cassandra FQL logs with the "cq4" extension. I would like to read them in Java; is there a Java class that these log entries can be mapped into?
These are the logs I see.
I want to read this with this code:
import net.openhft.chronicle.Chronicle;
import net.openhft.chronicle.ChronicleQueueBuilder;
import net.openhft.chronicle.ExcerptTailer;
import java.io.IOException;
public class Main {
    public static void main(String[] args) throws IOException {
        Chronicle chronicle = ChronicleQueueBuilder.indexed("/Users/pavelorekhov/Desktop/fql_logs").build();
        ExcerptTailer tailer = chronicle.createTailer();
        while (tailer.nextIndex()) {
            tailer.readInstance(/*class goes here*/);
        }
    }
}
I think from the code and screenshot you can understand what kind of class I need in order to read log entries into objects. Does that class exist in some Cassandra Maven dependency?
You are using Chronicle 3.x, which is very old.
I suggest using Chronicle 5.20.123, which is the version Cassandra uses.
I would assume Cassandra has its own tool for reading the contents of these files; however, you can dump the raw messages with net.openhft.chronicle.queue.main.DumpMain.
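For example, a minimal sketch (assuming a recent Chronicle Queue on the classpath; the directory path is just a placeholder):
import net.openhft.chronicle.queue.main.DumpMain;

public class DumpFqlLogs {
    public static void main(String[] args) throws Exception {
        // Dumps the raw contents of the .cq4 files in the given directory to stdout.
        // Replace the placeholder path with your FQL log directory.
        DumpMain.main(new String[]{"/path/to/fql_logs"});
    }
}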
I ended up cloning cassandra's github repo from here: https://github.com/apache/cassandra
In their code they have the FQLQueryIterator class which you can use to read logs, like so:
SingleChronicleQueue scq = SingleChronicleQueueBuilder.builder().path("/Users/pavelorekhov/Desktop/fql_logs").build();
ExcerptTailer excerptTailer = scq.createTailer();
FQLQueryIterator iterator = new FQLQueryIterator(excerptTailer, 1);
while (iterator.hasNext()) {
    FQLQuery fqlQuery = iterator.next(); // object that holds the log entry
    // do whatever you need to do with that log entry...
}

Hazelcast Spring Session SubZero(Kryo) EntryBackupProcessorImpl NullPointerException issue

I am using hazelcast-3.11.2 and SubZero-0.9 as the global serializer. I am trying to configure Spring Session using this example. When I have more than one node in the cluster, I get the following exception when trying to get the session id:
2019-03-20 15:01:59.088 ERROR 13635 --- [ration.thread-3] c.h.m.i.operation.EntryBackupOperation : [x.x.x.x]:5701 [hazelcast-group] [3.11.2] null
java.lang.NullPointerException: null
    at com.hazelcast.map.AbstractEntryProcessor$EntryBackupProcessorImpl.processBackup(AbstractEntryProcessor.java:83)
    at com.hazelcast.map.impl.operation.EntryOperator.process(EntryOperator.java:314)
    at com.hazelcast.map.impl.operation.EntryOperator.operateOnKeyValueInternal(EntryOperator.java:181)
    at com.hazelcast.map.impl.operation.EntryOperator.operateOnKey(EntryOperator.java:166)
    at com.hazelcast.map.impl.operation.EntryBackupOperation.run(EntryBackupOperation.java:60)
    at com.hazelcast.spi.impl.operationservice.impl.operations.Backup.run(Backup.java:158)
    at com.hazelcast.spi.Operation.call(Operation.java:170)
    at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.call(OperationRunnerImpl.java:208)
    at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:197)
    at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:413)
    at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.process(OperationThread.java:153)
    at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.process(OperationThread.java:123)
    at com.hazelcast.spi.impl.operationexecutor.impl.OperationThread.run(OperationThread.java:110)
My instance config looks like this:
@Configuration
@EnableHazelcastHttpSession
public class HazelcastSessionConfig extends AbstractHttpSessionApplicationInitializer {

    @Bean
    public HazelcastInstance hazelcastInstance() {
        Config config = new Config();
        SubZero.useAsGlobalSerializer(config);
        MapAttributeConfig attributeConfig = new MapAttributeConfig()
                .setName(HazelcastSessionRepository.PRINCIPAL_NAME_ATTRIBUTE)
                .setExtractor(PrincipalNameExtractor.class.getName());
        config.getMapConfig(HazelcastSessionRepository.DEFAULT_SESSION_MAP_NAME)
                .addMapAttributeConfig(attributeConfig)
                .addMapIndexConfig(new MapIndexConfig(
                        HazelcastSessionRepository.PRINCIPAL_NAME_ATTRIBUTE, false));
        return Hazelcast.newHazelcastInstance(config);
    }
}
Removing SubZero from the configuration removes the exception, so it looks like a SubZero issue. I also use this instance as my cache provider and for the Hibernate second-level cache, so I cannot get rid of SubZero.
My thoughts were:
- Have two different clusters: one for the cache, another for the sessions. This doesn't work for me, since I do not know how to configure Spring Session to use a specific Hazelcast instance (pass an instance name, the bean itself, etc.).
- Specify which classes should be used with SubZero (see the sketch below) - but since I have plenty of classes, and new ones are going to be added, this is not the best idea.
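For reference, the second option would look roughly like this. This is only a minimal sketch, assuming SubZero's useForClasses API; the class names are placeholders, not real types from the project:
import com.hazelcast.config.Config;
import info.jerrinot.subzero.SubZero;
import java.io.Serializable;

public class CacheConfigSketch {

    // Placeholder domain classes standing in for the real cached types.
    static class MyCachedValue implements Serializable {}
    static class MyOtherCachedValue implements Serializable {}

    public static Config configure() {
        Config config = new Config();
        // Register SubZero only for the listed classes, instead of as the global serializer.
        SubZero.useForClasses(config, MyCachedValue.class, MyOtherCachedValue.class);
        return config;
    }
}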
I will appreciate any help.

Spring Integration XML to Java Config Conversion

I am very new to Spring Integration, and my project uses the file support module to read a file and load it into a database.
I have the XML config and am trying to understand its content.
<int-file:inbound-channel-adapter auto-startup="true" channel="channelOne" directory="${xx}" filename-regex="${xx}" id="id" prevent-duplicates="false">
    <int:poller fixed-delay="1000" receive-timeout="5000"/>
</int-file:inbound-channel-adapter>
<int:channel id="channelOne"/>
From the above piece, my understanding is:
We define a channel, and
then define an inbound-channel-adapter: this will look into the directory for the file and create a message with the file as the payload.
I was able to convert this to Java config as below:
@Bean
public MessageChannel fileInputChannel() {
    return new DirectChannel();
}

@Bean
@InboundChannelAdapter(value = "fileInputChannel", poller = @Poller(fixedDelay = "1000"))
public MessageSource<File> fileReadingMessageSource() {
    FileReadingMessageSource sourceReader = new FileReadingMessageSource();
    RegexPatternFileListFilter regexPatternFileListFilter =
            new RegexPatternFileListFilter(fileRegex); // the regex from filename-regex="${xx}"
    // Several filters could also be combined:
    // List<FileListFilter<File>> fileListFilter = new ArrayList<>();
    // fileListFilter.add(regexPatternFileListFilter);
    // CompositeFileListFilter<File> compositeFileListFilter = new CompositeFileListFilter<>(fileListFilter);
    sourceReader.setDirectory(new File(inputDirectoryWhereFileComes));
    sourceReader.setFilter(regexPatternFileListFilter);
    return sourceReader;
}
Then comes the next piece of code, which I am really struggling to understand, and even more so to convert to Java config.
Here is the next piece:
<int-file:outbound-gateway
delete-source-files="true"
directory="file:${pp}"
id="id"
reply-channel="channelTwo"
request-channel="channelOne"
temporary-file-suffix=".tmp"/>
<int:channel id="channelTwo"/>
<int:outbound-channel-adapter channel="channelTwo" id="id" method="load" ref="beanClass"/>
So from this piece, my understanding is:
1: Define an output channel.
2: Define an outbound-gateway, which will write the message as a file again in the other directory and also remove the file from the source directory. Finally, it will call the load method of the bean class. This is our class; it has a load method which takes a file as input and loads it into the DB.
I tried to convert it into Java config. Here is my code:
@Bean
@ServiceActivator(inputChannel = "fileInputChannel")
public MessageHandler fileWritingMessageHandler() throws IOException, ParseException {
    // outputDirectory is a placeholder for the path to the output directory
    FileWritingMessageHandler handler = new FileWritingMessageHandler(new File(outputDirectory));
    handler.setFileExistsMode(FileExistsMode.REPLACE);
    beaObject.load(new File(outputDirectory)); // tried the output and the input directory: nothing worked
    handler.setDeleteSourceFiles(true);
    handler.setOutputChannel(fileOutputChannel());
    return handler;
}
I am able to write the file to the output folder and delete it from the source. After that I am totally lost. I have to call the load method of my bean class (ref="beanClass" in the XML).
I tried a lot but was not able to get it working. I read the Spring Integration file support docs multiple times but couldn't figure it out.
Note: when I tried, I got a FileNotFoundException. I believe I am able to call my method but cannot get the file.
The XML config works perfectly fine.
Suggestions for doing this with the Spring Integration Java DSL are also welcome.
Please help me understand the basic flow and get this done. Any help and comments are really appreciated.
Thanks in advance.
First of all, you need to understand that a @Bean method is exactly for configuration and component definitions which are going to be used later at runtime. You definitely must not call business logic in a @Bean method. I mean that your beaObject.load() is totally wrong.
So, please, go first to the Spring Framework docs to understand what @Bean and its parent @Configuration are: https://docs.spring.io/spring/docs/5.1.2.RELEASE/spring-framework-reference/core.html#beans-java
Your @ServiceActivator for the FileWritingMessageHandler is really correct (once you remove that beaObject.load()). What you need is to declare one more @ServiceActivator for calling your beaObject.load() at runtime, when a message appears in the fileOutputChannel:
@ServiceActivator(inputChannel = "fileOutputChannel")
public void loadFileIntoDb(File payload) {
    this.beaObject.load(payload);
}
See https://docs.spring.io/spring-integration/docs/5.1.1.BUILD-SNAPSHOT/reference/html/configuration.html#annotations for more info.
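Since the question also asks about the Java DSL: the whole flow could be expressed roughly as below. This is only a sketch, assuming Spring Integration 5.x; inputDir, outputDir, fileRegex and beaObject are placeholders for your actual values and bean, and the relevant imports are org.springframework.integration.dsl.*, org.springframework.integration.file.dsl.Files and org.springframework.integration.file.support.FileExistsMode.
@Bean
public IntegrationFlow fileToDbFlow() {
    return IntegrationFlows
            // poll the input directory for files matching the regex
            .from(Files.inboundAdapter(new File(inputDir)).regexFilter(fileRegex),
                    e -> e.poller(Pollers.fixedDelay(1000)))
            // write the file to the output directory and delete the source file
            .handle(Files.outboundGateway(new File(outputDir))
                    .fileExistsMode(FileExistsMode.REPLACE)
                    .deleteSourceFiles(true))
            // finally call beaObject.load(File) with the written file
            .handle(beaObject, "load")
            .get();
}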

Sink component doesn't get the right data with Kafka in Spring Cloud Data Flow

I am not a native English speaker, but I will try to express my question as clearly as possible.
I have encountered a problem which has confused me for two days, and I still can't find the solution.
I have built a stream that runs in Spring Cloud Data Flow on Hadoop YARN.
The stream is composed of an HTTP source, a processor, and a file sink.
1. HTTP Source
The HTTP source component has two output channels bound to two different destinations, dest1 and dest2, which are defined in application.properties.
spring.cloud.stream.bindings.output.destination=dest1
spring.cloud.stream.bindings.output2.destination=dest2
Below is the code snippet of the HTTP source for your reference.
@Autowired
private EssSource channels; // EssSource is the interface for multiple output channels

// output channel 1:
@RequestMapping(path = "/file", method = POST, consumes = {"text/*", "application/json"})
@ResponseStatus(HttpStatus.ACCEPTED)
public void handleRequest(@RequestBody byte[] body, @RequestHeader(HttpHeaders.CONTENT_TYPE) Object contentType) {
    logger.info("enter ... handleRequest1...");
    channels.output().send(MessageBuilder.createMessage(body,
            new MessageHeaders(Collections.singletonMap(MessageHeaders.CONTENT_TYPE, contentType))));
}

// output channel 2:
@RequestMapping(path = "/test", method = POST, consumes = {"text/*", "application/json"})
@ResponseStatus(HttpStatus.ACCEPTED)
public void handleRequest2(@RequestBody byte[] body, @RequestHeader(HttpHeaders.CONTENT_TYPE) Object contentType) {
    logger.info("enter ... handleRequest2...");
    channels.output2().send(MessageBuilder.createMessage(body,
            new MessageHeaders(Collections.singletonMap(MessageHeaders.CONTENT_TYPE, contentType))));
}
2. Processor
The processor has two input channels and two output channels bound to different destinations.
The destination bindings are defined in application.properties of the processor component project.
# input channel binding
spring.cloud.stream.bindings.input.destination=dest1
spring.cloud.stream.bindings.input2.destination=dest2
# output channel binding
spring.cloud.stream.bindings.output.destination=hdfsSink
spring.cloud.stream.bindings.output2.destination=fileSink
Below is the code snippet for Processor.
@Transformer(inputChannel = EssProcessor.INPUT, outputChannel = EssProcessor.OUTPUT)
public Object transform(Message<?> message) {
    logger.info("enter ...transform...");
    return "processed by transform1";
}

@Transformer(inputChannel = EssProcessor.INPUT_2, outputChannel = EssProcessor.OUTPUT_2)
public Object transform2(Message<?> message) {
    logger.info("enter ... transform2...");
    return "processed by transform2";
}
3. The file sink component.
I use the official file sink component from Spring:
maven://org.springframework.cloud.stream.app:file-sink-kafka:1.0.0.BUILD-SNAPSHOT
I just add the destination binding in its application.properties file:
spring.cloud.stream.bindings.input.destination=fileSink
4. Findings:
The data flow I expected looks like this:
Source.handleRequest() --> Processor.handleRequest()
Source.handleRequest2() --> Processor.handleRequest2() --> Sink.fileWritingMessageHandler();
Only the string "processed by transform2" should be saved to the file.
But after my testing, the data flow is actually like this:
Source.handleRequest() --> Processor.handleRequest() --> Sink.fileWritingMessageHandler();
Source.handleRequest2() --> Processor.handleRequest2() --> Sink.fileWritingMessageHandler();
Both the "processed by transform1" and "processed by transform2" strings are saved to the file.
5. Question:
Although the destination for the output channel in Processor.handleRequest() is bound to hdfsSink instead of fileSink, the data still flows to the file sink. I can't understand this, and it is not what I want.
I only want the data from Processor.handleRequest2() to flow to the file sink, not both.
If I am not doing it right, could anyone tell me how to do it and what the solution is?
It has confused me for two days.
Thank you for your kind help.
Alex
Is your stream definition something like this (where the '-2' versions are the ones with multiple channels)?
http-source-2 | processor-2 | file-sink
Note that Spring Cloud Data Flow will override the destinations defined in application.properties, which is why, even if spring.cloud.stream.bindings.output.destination for the processor is set to hdfs-sink, it will actually match the input of file-sink.
The way destinations are configured from a stream definition is explained here (in the context of taps): http://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#spring-cloud-dataflow-stream-tap-dsl
What you can do is simply swap the meaning of channels 1 and 2: use the side channel for HDFS. This is a bit brittle, though, as the input/output channels of the stream will be configured automatically and the other channels will be configured via application.properties. In this case it may be better to configure the side channel destinations via the stream definition or at deployment time; see http://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#_application_properties.
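For example, the side channel destination could be pinned at deployment time with a deployment property along these lines. This is only a sketch: the stream name mystream, the app name processor-2 and the binding name output2 are assumptions based on the definitions above, and the exact app.* property prefix depends on your Data Flow version:
dataflow:> stream deploy mystream --properties "app.processor-2.spring.cloud.stream.bindings.output2.destination=fileSink"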
It seems to me that these could just as well be two streams listening on separate endpoints, using regular components, given that the data is supposed to flow side by side.

Log4net EventLogAppender Log Event ID

Is there a way to log an event into the Windows event log with a specified event ID per message? I am using log4net v1.2.10.
Based on what I see in the EventLogAppender source code the following should do the trick:
log4net.ThreadContext.Properties["EventID"] = 5;
Just call this before you write your log messages (if you do not set it for all messages, you should remove "EventID" from the Properties again afterwards).
N.B. the property key is case sensitive.
When one uses the native .NET Event Log APIs in System.Diagnostics, the WriteEntry methods allow setting the eventID and category. In these APIs:
- eventID is a 32-bit int, but its value must be between 0 and 65535.
- category is a 16-bit int, but its value must be positive. If the event source includes a category resource file, the event viewer will use the integer category value to look up a localized "Task category" string. Otherwise, the integer value is displayed. The categories must be numbered consecutively, beginning with the number 1.
Log4net supports writing an EventID and a Category, but it isn't straightforward. When log4net's EventLogAppender logs an event, it looks at a dictionary of properties. The named properties "EventID" and "Category" are automatically mapped by the EventLogAppender to the corresponding values in the event log. I've seen a few good suggested ways to use log4net's EventLogAppender and set the EventID and Category in the Windows event log.
a. Using log4net’s appender filtering, a filter may be registered that can add the EventID and Category properties. This method has a nice benefit that the standard log4net wrappers are used and so this can be implemented without changing existing logging code. The difficulty in this method is some mechanism has to be created to calculate the EventID and Category from the logged information. For instance, the filter could look at the exception source and map that source to a Category value.
b. Log4net may be extended so custom logging wrappers can be used that can include EventID and Category parameters. Adding EventID is demonstrated in the log4net sample “Extensibility – EventIDLogApp” which is included in the log4net source. In the extension sample a new interface (IEventIDLog) is used that extends the standard ILog interface used by applications to log. This provides new logging methods that include an eventId parameter. The new logging methods add the eventId to the Properties dictionary before logging the event.
public void Info(int eventId, object message, System.Exception t)
{
    if (this.IsInfoEnabled)
    {
        LoggingEvent loggingEvent = new LoggingEvent(ThisDeclaringType, Logger.Repository, Logger.Name, Level.Info, message, t);
        loggingEvent.Properties["EventID"] = eventId;
        Logger.Log(loggingEvent);
    }
}
c. Log4net supports a ThreadContext object that contains a Properties dictionary. An application could set the EventID and Category properties in this dictionary and then when the thread calls a logging method, the values will be used by the EventLogAppender.
log4net.ThreadContext.Properties["EventID"] = 5;
Some helpful references:
Log4net home page
Log4net SDK reference
Log4net samples
Log4net source
Enhancing log4net exception logging
Log4Net Tutorial pt 6: Log Event Context
Customizing Event Log Categories
EventSourceCreationData.CategoryResourceFile Property
Event Logging Elements
EventLog.WriteEntry Method
Well, the solution was to build the extension project "log4net.Ext.EventID" and to use its types: IEventIDLog, EventIDLogImpl and EventIDLogManager.
Another solution is to add a custom Filter as described here: Enhancing log4net exception logging (direct link to the Gist just in case).
As the author points out:
... EventLogAppender uses inline consts to check them. Once they are added they will be used by the mentioned EventLogAppender to mark the given entries with EventId and Category.
The filter implementation will look like the code below (a stripped-down version of the gist), with the added benefit that if you make the GetEventId method public, you can write some tests against it.
public class ExceptionBasedLogEnhancer : FilterSkeleton
{
    private const string EventLogKeyEventId = "EventID";

    public override FilterDecision Decide(LoggingEvent loggingEvent)
    {
        var ex = loggingEvent.ExceptionObject;
        if (ex != null)
        {
            loggingEvent.Properties[EventLogKeyEventId] = GetEventId(ex);
        }
        return FilterDecision.Neutral;
    }

    private static short GetEventId(Exception ex)
    {
        // more fancy implementation, like getting hash of ex properties
        // can be provided, or mapping types of exceptions to eventids
        // return no more than short.MaxValue, otherwise the EventLog will throw
        return 0;
    }
}
Extend ILog.Info() to take an event ID:
public static class LogUtils
{
    public static void Info(this ILog logger, int eventId, object message)
    {
        log4net.ThreadContext.Properties["EventID"] = eventId;
        logger.Info(message);
        log4net.ThreadContext.Properties["EventID"] = 0; // back to default
    }
}
Then call it like this:
using LogUtils; // i.e. import the namespace that contains the LogUtils class
private static readonly log4net.ILog _logger = log4net.LogManager.GetLogger(System.Reflection.MethodBase.GetCurrentMethod().DeclaringType);
_logger.Info(3, "First shalt thou take out the Holy Pin, then shalt thou count to three.");
