I recently had a scenario like below:
Flow_A ------> Flow_B ------> Flow_C ------> Flow_D
Where
Flow_A is the initiator and should pass messageA.
Flow_B should pass messageA+messageB.
Flow_C should pass messageA+messageB+messageC.
Flow_D should pass messageA+messageB+messageC+messageD.
So, I was thinking of enriching the headers with the previous message and passing that on to the next flow, but by the end the message would become very bulky.
Should I instead store the message somewhere and pass only the messageId in the header, so that the next flow can fetch the old message by that messageId?
What should be the best way to achieve this?
See Claim Check pattern: https://docs.spring.io/spring-integration/docs/current/reference/html/message-transformation.html#claim-check
1. You store a message using a ClaimCheckInTransformer and get its id as the output payload.
2. You move this id into a header and produce the next message.
3. Repeat steps 1 and 2 for this second message to prepare the third one.
4. And so on to prepare the environment for the fourth message.
To restore those messages you repeat the procedure in the opposite direction:
Get the id header from the message into the payload, remove that header and call a ClaimCheckOutTransformer to restore the stored message. I say "remove the header" so that the stack can be restored properly, because the ClaimCheckOutTransformer has logic like this:
AbstractIntegrationMessageBuilder<?> responseBuilder = getMessageBuilderFactory().fromMessage(retrievedMessage);
// headers on the 'current' message take precedence
responseBuilder.copyHeaders(message.getHeaders());
So, without removing that header, the same message id would be carried into the next step and you would end up in a loop, finishing with a StackOverflowError.
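For reference, here is a minimal Java DSL sketch of those check-in/check-out steps (the SimpleMessageStore bean, the channel name and the "checkedInId" header name are my assumptions, not something from your flows):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.dsl.IntegrationFlow;
import org.springframework.integration.store.MessageStore;
import org.springframework.integration.store.SimpleMessageStore;

@Configuration
public class ClaimCheckConfig {

    @Bean
    public MessageStore messageStore() {
        return new SimpleMessageStore();
    }

    // Flow_A side: check messageA in, keep only its id in a header, emit messageB
    @Bean
    public IntegrationFlow flowA(MessageStore messageStore) {
        return f -> f
                .claimCheckIn(messageStore)                     // payload becomes the stored message's UUID
                .enrichHeaders(h -> h.headerFunction("checkedInId", m -> m.getPayload()))
                .transform(p -> "messageB")                     // the next payload for Flow_B
                .channel("flowBInput");
    }

    // Restoring side: id header back into the payload, drop the header, check the message out
    @Bean
    public IntegrationFlow restoreFlow(MessageStore messageStore) {
        return f -> f
                .handle((p, h) -> h.get("checkedInId"))         // move the stored id back into the payload
                .headerFilter("checkedInId")                    // remove it, otherwise the same id loops forever
                .claimCheckOut(messageStore);                   // the payload is the original messageA again
    }
}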
Another option is to store the messages manually somewhere, e.g. in a MetadataStore, and collect their ids in a list used as the payload. This way you don't need extra logic to deal with headers: everything is in the list in your payload, and you can consult the store at any time for any id in that list!
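A rough sketch of that alternative could be a tiny helper like the following (hypothetical class, assuming String payloads and an in-memory SimpleMetadataStore):

import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

import org.springframework.integration.metadata.ConcurrentMetadataStore;
import org.springframework.integration.metadata.SimpleMetadataStore;

public class MessageArchive {

    private final ConcurrentMetadataStore store = new SimpleMetadataStore();

    // store the current payload and append its id to the accumulated list;
    // the returned list is what you pass as the payload to the next flow
    public List<String> checkIn(List<String> previousIds, String payload) {
        String id = UUID.randomUUID().toString();
        store.put(id, payload);
        List<String> ids = new ArrayList<>(previousIds);
        ids.add(id);
        return ids;
    }

    // look up any earlier message by its id at any point downstream
    public String checkOut(String id) {
        return store.get(id);
    }
}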
Initially our flow of communicating with Google Pub/Sub was as follows:
1. The application accepts a message
2. It checks that the message doesn't exist in the idempotencyStore
3.1 If it doesn't exist, put it into the idempotency store (the key is the value of a unique header, the value is the current timestamp)
3.2 If it exists, just ignore the message
4. When processing is finished, send an acknowledge
5. In the acknowledge success callback, remove this message from the MetadataStore
Point 5 is wrong, because theoretically we can get a duplicated message even after the message has been processed. Moreover, we found out that sometimes the message is not removed even though the successful callback was invoked (the message is received from the Google Pub/Sub subscription again and again after acknowledge [Heisenbug]). So we decided to update the value after the message is processed and replace the timestamp with a "FINISHED" string.
But sooner or later this table will become overcrowded, so we have to clean up messages in the MetadataStore. We can remove messages which have been processed and whose processing finished more than 1 day ago.
As was mentioned in the comments of https://stackoverflow.com/a/51845202/2674303, I can add an additional column to the metadataStore table where I could mark whether a message is processed. That is not a problem at all. But how can I use this flag in my cleaner? The MetadataStore has only a key and a value.
In the acknowledge success callback, remove this message from the MetadataStore
I don't see a reason for this step at all.
Since you say that you store a timestamp in the value, that means you can analyze this table from time to time and remove definitely old entries.
In one of my projects we have a daily DB job to archive a table for better main-process performance. Right, just because we don't need the old data any more. For that reason we definitely check some timestamp in the row to determine whether it should go into the archive or not. I wouldn't remove data immediately after processing, just because there is a chance of redelivery from the external system.
On the other hand, for better performance I would add an extra indexed column of timestamp type to that metadata table and populate its value via a trigger on each update or insert. Well, the MetadataStore just inserts an entry from the MetadataStoreSelector:
return this.metadataStore.putIfAbsent(key, value) == null;
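For reference, such a selector is usually configured along these lines (just a sketch; the "uniqueId" header name is an assumption):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.metadata.ConcurrentMetadataStore;
import org.springframework.integration.selector.MetadataStoreSelector;

@Configuration
public class IdempotencyConfig {

    // key = value of the unique header, value = current timestamp (steps 2 and 3.1 above)
    @Bean
    public MetadataStoreSelector idempotentSelector(ConcurrentMetadataStore metadataStore) {
        return new MetadataStoreSelector(
                m -> m.getHeaders().get("uniqueId", String.class),
                m -> String.valueOf(System.currentTimeMillis()),
                metadataStore);
    }
}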
So, you need an on-insert trigger to populate that date column. This way, at the end of the day you will know whether you need to remove an entry or not.
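For example, with the JDBC-backed store the daily cleanup could be a simple scheduled job (a sketch only, assuming the default INT_METADATA_STORE table of the JdbcMetadataStore; PROCESSED_AT is the extra trigger-populated column suggested above, and 'FINISHED' is the marker you already write after processing):

import java.sql.Timestamp;
import java.time.Duration;
import java.time.Instant;

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class MetadataStoreCleaner {

    private final JdbcTemplate jdbcTemplate;

    public MetadataStoreCleaner(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    // once a day, drop entries that finished processing more than a day ago
    @Scheduled(cron = "0 0 3 * * *")
    public void removeOldEntries() {
        Timestamp cutoff = Timestamp.from(Instant.now().minus(Duration.ofDays(1)));
        jdbcTemplate.update(
                "DELETE FROM INT_METADATA_STORE WHERE METADATA_VALUE = 'FINISHED' AND PROCESSED_AT < ?",
                cutoff);
    }
}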
Part of my project uses Esper in Java for complex event processing. I'm planning to replace Esper with Azure Stream Analytics.
Use Case: FTOD (First Ticket of the Day) & FTOP (First Ticket of Project)
I'm continuously getting ticket data from Event Hub and want to generate 2 types of alerts (FTOD & FTOP). I think a tumbling window is the best fit for this scenario.
But I'm not able to pick the first record in the window. Any suggestion on how to pick the first record in a 24-hour window?
Below is the Esper query for FTOD:
String statementQuery = "context context_" + plantIdStr
+ " select distinct * from TicketInfoComplete as ticket where plantId = '"
+ entry.getKey() + "' and ruleType='FTOD' output first every 24 hours";
Below is my incoming message data
[{"DeviceSerialNumber":"190203XXX001TEST","MessageTimestamp":"2019-02-11T13:46:08.0000000Z","PlantId":"141","ProjectId":"Mobitest","ProjectName":"Mobitest","TicketNumber":"84855","TicketDateTimeinUTC":"2019-02-11T13:46:08.0000000Z","AdditionalInfo":{"value123":"value2"},"Timeout":60000,"Traffic":1,"Make":"Z99","TruckMake":"Z99","PlantName":"RMZ","Status":"Valid","PlantMakeSerialNumber":"Z99|190203XXX001TEST","ErrorMessageJsonString":"[]","Timezone":"India Standard Time"}]
Based on your description, I think you should know about the LAST operator with the GROUP BY condition. LAST allows you to look up the most recent event in an event stream within defined constraints.
In Stream Analytics, the scope of LAST (that is, how far back in history from the current event it needs to look) is always limited to a finite time interval, using the LIMIT DURATION clause. LAST can optionally be limited to only consider events that match the current event on a certain property or condition using the PARTITION BY and WHEN clauses. LAST is not affected by predicates in the WHERE clause, join conditions in the JOIN clause, or grouping expressions in the GROUP BY clause of the current query.
Please see the example in the above document:
SELECT
LAST(TicketNumber) OVER (LIMIT DURATION(hour, 24))
FROM input
To summarize, the IsFirst function needs to be considered when you want to get the first item.
Below are the exact queries I used with the IsFirst function for the FTOD & FTOP alerts.
SELECT
DeviceSerialNumber,MessageTimestamp,PlantId,TruckId,ProjectId,ProjectName,
CustomerId,CustomerName,TicketNumber,TicketDateTimeinUTC,TruckSerialNumber,
TruckMake,PlantName,PlantMakeSerialNumber,Timezone,'FTOD' as alertType
INTO
[alertOutput]
FROM
[ticketInput]
where ISFIRST(mi, 2)=1
SELECT
DeviceSerialNumber,MessageTimestamp,PlantId,TruckId,ProjectId,ProjectName,
CustomerId,CustomerName,TicketNumber,TicketDateTimeinUTC,TruckSerialNumber,
TruckMake,PlantName,PlantMakeSerialNumber,Timezone,'FTOP' as alertType
INTO
[ftopOutput]
FROM
[ticketInput]
where ISFIRST(mi, 2) OVER (PARTITION BY PlantId) = 1
In spring-batch, data can be passed between various steps via the ExecutionContext: you can set the details in one step and retrieve them in the next. Do we have anything of this sort in spring-integration?
My use case is that I have to pick up a file from an FTP location, then split it based on certain business logic and then process the pieces. The client id is derived from the file name, and this client id would be used in the splitter, service activator and aggregator components.
With my newbie level of expertise in Spring, I could not find anything which helps me share state for a particular run. I wanted to know if spring-integration provides this state-sharing context in some way.
Please let me know if there is a way to do this in spring-integration.
In Spring Integration applications there is no single ExecutionContext for state sharing. Instead, as Gary Russell mentioned, each message carries all the information within its payload or its headers.
If you use the Spring Integration Java DSL and want to transport the clientId in a message header, you can use the enrichHeaders transformer. Being supplied with a HeaderEnricherSpec, it can accept a function which returns a dynamically determined value for the specified header. For your use case this might look like:
return IntegrationFlows
        .from(/* ftp source */)
        .enrichHeaders(e -> e.headerFunction("clientId", this::deriveClientId))
        // ...split, aggregate, etc. the file according to clientId...
        .get();
where the deriveClientId method might look something like:
private String deriveClientId(Message<File> fileMessage) {
    String fileName = fileMessage.getHeaders().get(FileHeaders.FILENAME, String.class);
    String clientId = /* some other logic for deriving the clientId from */ fileName;
    return clientId;
}
(the FILENAME header is provided by the FTP message source)
When you need to access the clientId header somewhere in the downstream flow, you can do it the same way as with the file name mentioned above:
String clientId = message.getHeaders().get("clientId", String.class);
But make sure that the message still contains such a header, because it could have been lost somewhere among the intermediate flow components. This is likely to happen if at some point you construct a message manually and send it further. In order not to lose any headers from the preceding message, you can copy them while building the new one:
Message<PayloadType> newMessage = MessageBuilder
        .withPayload(payloadValue)
        .copyHeaders(precedingMessage.getHeaders())
        .build();
Please note that message headers are immutable in Spring Integration. This means you can't just add or change a header of an existing message: you should create a new message or use a HeaderEnricher for that purpose. Examples of both approaches are shown above.
Typically you convey information between components in the message payload itself, or often via message headers - see Message Construction and Header Enricher
The headers object has a series of parameters, and one of them is a list of other parameters. Some of those parameters are sensitive and I don't want to log them. How can I filter out those parameters only for this channel?
<integration:logging-channel-adapter channel="logChannel" level="INFO" expression="'message received, headers:' + headers"/>
Add a header-filter upstream of the logging adapter to remove any headers you don't want logged.
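For example (channel and header names here are made up), something like this placed in front of logChannel:

<integration:header-filter input-channel="preLogChannel"
                            output-channel="logChannel"
                            header-names="password, secretParams"/>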