Spring Batch: completed job is not actually ending - multithreading

I'm using Spring Batch 3.0.1.
I have a two-step job.
The first step is a read-process-write chunk. It generates PDF files with iText and writes to two databases using Atomikos and JTA.
The second is just a dummy step that logs "process ended" to a file.
I have configured the job and launched it with CommandLineJobRunner, and the work I expected gets done: both the PDFs and the database writes.
The final "process ended" message is written, and in the batch_job_execution and batch_step_execution tables the rows show a COMPLETED status and exit code, with the end time filled in.
So, my problem is that the prompt never comes back, and the process seems to keep running.
When I debug the process in Eclipse, the behaviour is the same: the work is done, but the process does not end. Threads related to Atomikos, the database pool and the step remain running, and I don't understand why.
The test case has 9 PDFs (items) to process.
The first step configuration is this:
<bean id="printingTaskExecutor" class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
<property name="corePoolSize" value="10" />
<property name="maxPoolSize" value="20" />
<property name="queueCapacity" value="50" />
<property name="waitForTasksToCompleteOnShutdown" value="false" />
</bean>
<step id="pdfStep" next="endLoggerStep">
<tasklet transaction-manager="jtaTransactionManager" task-executor="printingTaskExecutor" throttle-limit="10" >
<batch:chunk reader="pdfItemReader" processor="pdfItemProcessor" writer="pdfItemWriter" commit-interval="20">
</batch:chunk>
</tasklet>
</step>
pdfItemReader is a JdbcPagingItemReader, with its pageSize parameter also set to 20.
How can I get the process to actually end, whether from the command line or from Eclipse? Is something wrong in my configuration?
Any help is much appreciated. Thanks.
[SOLVED]
I was going mad over this. Finally, logging at trace level, the last line was:
Invoking destroy method 'close' on bean with name 'atomikosTransactionManager'
I changed the forceShutdown flag to true, and the program finally ends gracefully.
<bean id="atomikosTransactionManager" class="com.atomikos.icatch.jta.UserTransactionManager"
init-method="init" destroy-method="close">
<property name="forceShutdown" value="true" /><!-- Change this to true -->
</bean>
Hope it helps.

You can call System.exit(0); at the end of your main class, which will terminate the JVM and, with it, the batch execution.

Related

Spring Integration - exponential retry with CircuitBreakerAdvice

I want to implement exponential retry with CircuitBreakerAdvice:
Can I increase halfOpenAfter exponentially? For example: it tries up to the threshold (3) with halfOpenAfter at 15 seconds; then it tries up to the threshold again with halfOpenAfter at 30 seconds; then again with halfOpenAfter at 60 seconds; and finally the file has to be moved to an error folder. This has to happen for all the files in the source folder.
I tried the code below, but I am not sure how to make the retry exponential or how to move the file to the error folder. This code gets into an infinite loop and keeps retrying forever.
<int-file:inbound-channel-adapter directory="sourcedirectorypath"
        prevent-duplicates="false" auto-startup="true"
        id="fileInbound" channel="sftpChannel">
    <int:poller fixed-rate="5000" error-channel=""/>
</int-file:inbound-channel-adapter>

<int:channel id="sftpChannel"/>

<int:service-activator input-channel="sftpChannel" output-channel="outsftpChannel"
    expression="#dsf.setThreadKey(#root, headers['file_name'])"/>

<int:channel id="outsftpChannel"/>

<int-sftp:outbound-gateway id="sftpOutboundAdapter" session-factory="dsf" command="put"
        request-channel="outsftpChannel" charset="UTF-8" chmod="774" reply-channel="replyChannel"
        remote-directory-expression="#sftpConfig.getRemoteDirectory(headers['file_name'])">
    <int-sftp:request-handler-advice-chain>
        <bean class="org.springframework.integration.handler.advice.ExpressionEvaluatingRequestHandlerAdvice" id="handlerAdvice">
            <property name="failureChannel" ref="sftpFailureChannel"/>
        </bean>
    </int-sftp:request-handler-advice-chain>
</int-sftp:outbound-gateway>

<int:service-activator input-channel="sftpFailureChannel" expression="#sftpConfig.failed(#root)">
    <int:request-handler-advice-chain>
        <bean id="circuitBreakerAdvice"
              class="org.springframework.integration.handler.advice.RequestHandlerCircuitBreakerAdvice">
            <property name="threshold" value="3"/> <!-- open after 3 failures -->
            <property name="halfOpenAfter" value="15000"/> <!-- half open after 15 seconds -->
        </bean>
    </int:request-handler-advice-chain>
</int:service-activator>

<int:channel id="replyChannel"/>

<int:service-activator input-channel="replyChannel" output-channel="nullChannel"
    expression="#dsf.clearThreadKey(#root, headers['file_name'])" requires-reply="true"/>
Please correct me if my ask is wrong, but I need exactly the behaviour below. It can be implemented with resilience4j + Spring Boot, but I need it with Spring Integration.
resilience4j:
  retry:
    instances:
      intervalFunctionExponentialExample:
        maxRetryAttempts: 3
        waitDuration: 15s
        enableExponentialBackoff: true
        exponentialBackoffMultiplier: 5
After 3 attempts with exponentially growing waits of 15s, 75s and 375s, it should go to the fallback method "moveFileToErrorDirectory".
The RequestHandlerCircuitBreakerAdvice does not support an exponential halfOpenAfter at the moment. You can use a combination of it and a RequestHandlerRetryAdvice, where the retry advice goes first in the advice chain.
What you show for resilience4j is really a retry configuration, not a circuit breaker. So think again about whether just a RequestHandlerRetryAdvice is enough for you. The ErrorMessageSendingRecoverer can be used as a fallback. The point of Spring Integration is really sending messages, not calling methods.
See more info in the docs: https://docs.spring.io/spring-integration/docs/current/reference/html/messaging-endpoints.html#message-handler-advice-chain
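For illustration, here is a minimal, untested sketch of that combination in the XML style above. It assumes Spring Retry is on the classpath; moveToErrorChannel is a hypothetical channel you would subscribe to a service that moves the file to the error folder. The intervals mirror the resilience4j example (15s initial, multiplier 5):
<int-sftp:request-handler-advice-chain>
    <bean class="org.springframework.integration.handler.advice.RequestHandlerRetryAdvice">
        <property name="retryTemplate">
            <bean class="org.springframework.retry.support.RetryTemplate">
                <property name="retryPolicy">
                    <!-- give up after 3 attempts -->
                    <bean class="org.springframework.retry.policy.SimpleRetryPolicy">
                        <constructor-arg value="3"/>
                    </bean>
                </property>
                <property name="backOffPolicy">
                    <!-- waits of 15s, 75s, 375s between attempts -->
                    <bean class="org.springframework.retry.backoff.ExponentialBackOffPolicy">
                        <property name="initialInterval" value="15000"/>
                        <property name="multiplier" value="5"/>
                        <property name="maxInterval" value="375000"/>
                    </bean>
                </property>
            </bean>
        </property>
        <!-- once retries are exhausted, send an ErrorMessage as the fallback -->
        <property name="recoveryCallback">
            <bean class="org.springframework.integration.handler.advice.ErrorMessageSendingRecoverer">
                <constructor-arg ref="moveToErrorChannel"/>
            </bean>
        </property>
    </bean>
</int-sftp:request-handler-advice-chain>
A <int:service-activator> subscribed to moveToErrorChannel could then perform the actual move of the file to the error folder.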

Spring integration multithreading

Referring to my earlier question (Spring integration multithreading requirement), I think I may have figured out the root cause of the issue.
My requirement in brief -
Poll the database after a fixed delay of 1 second, then publish a very limited set of data to a Tibco EMS queue. From this EMS queue I then have to do the following tasks, all in a multithreaded fashion: i) consume the messages, ii) fetch the full data from the database, and iii) convert this data into JSON format.
My design -
`<int:channel id="dbchannel"/>
<int-jdbc:inbound-channel-adapter id="dbchanneladapter"
channel="dbchannel" data-source="datasource"
query="${selectquery}" update="${updatequery}"
max-rows-per-poll="1000">
<int:poller id="dbchanneladapterpoller"
fixed-delay="1000">
<int:transactional transaction-manager="transactionmanager" />
</int:poller>
</int-jdbc:inbound-channel-adapter>
<int:service-activator input-channel="dbchannel"
output-channel="publishchannel" ref="jdbcmessagehandler" method="handleJdbcMessage" />
<bean id="jdbcmessagehandler" class="com.citigroup.handler.JdbcMessageHandler" />
<int:publish-subscribe-channel id="publishchannel"/>
<int-jms:outbound-channel-adapter id="publishchanneladapter"
channel="publishchannel" jms-template="publishrealtimefeedinternaljmstemplate" />
<int:channel id="subscribechannel"/>
<int-jms:message-driven-channel-adapter
id="subscribechanneladapter" destination="subscriberealtimeinternalqueue"
connection-factory="authenticationconnectionfactory" channel="subscribechannel"
concurrent-consumers="5" max-concurrent-consumers="5" />
<int:service-activator input-channel="subscribechannel"
ref="subscribemessagehandler" method="logJMSMessage" />
<bean id="subscribemessagehandler" class="com.citigroup.handler.SubscribeJMSMessageHandler" />
</beans>
<bean id="authenticationconnectionfactory"
class="org.springframework.jms.connection.UserCredentialsConnectionFactoryAdapter">
<property name="targetConnectionFactory" ref="connectionFactory" />
<property name="username" value="test" />
<property name="password" value="test123" />
</bean>
<bean id="connectionFactory" class="org.springframework.jndi.JndiObjectFactoryBean">
<property name="jndiTemplate">
<ref bean="jndiTemplate" />
</property>
<property name="jndiName" value="app.jndi.testCF" />
</bean>
<bean id="subscriberealtimeinternalqueue" class="org.springframework.jndi.JndiObjectFactoryBean">
<property name="jndiTemplate">
<ref bean="jndiTemplate" />
</property>
<property name="jndiName"
value="app.queue.testQueue" />
</bean>
<bean id="jndiTemplate" class="org.springframework.jndi.JndiTemplate">
<property name="environment">
<props>
<prop key="java.naming.factory.initial">com.tibco.tibjms.naming.TibjmsInitialContextFactory
</prop>
<prop key="java.naming.provider.url">tibjmsnaming://test01d.nam.nsroot.net:7222</prop>
</props>
</property>
</bean>`
Issue -
I am using a message-driven-channel-adapter with the concurrent consumers value set to 5. However, it looks like just one consumer thread (container-2) is created, and it picks up the messages from the EMS queue. Please find the log4j log below -
16 Aug 2018 11:31:12,077 INFO SubscribeJMSMessageHandler [subscribechanneladapter.container-2][]: Total count of records read from Queue at this moment is 387
record#1:: [ID=7694066395] record#2:: [ID=7694066423] .. .. .. record#387:: [ID=6147457333]
Probable root cause here -
Maybe it is the first step in the configuration, where I poll the database after a fixed delay, that is causing this multithreading issue. Referring to the logs above, my assumption is that since 387 records were fetched and all of them are bundled into a single List object, they are being treated as just one message/payload instead of 387 different messages, and that is why just one thread/container/consumer picks up this bundled message. The reason for this assumption is the log below -
GenericMessage [payload=[{"ID":7694066395},{"ID":7694066423},{"ID":6147457333}], headers={json__ContentTypeId__=class org.springframework.util.LinkedCaseInsensitiveMap, jms_redelivered=false, json__TypeId__=class java.util.ArrayList, jms_destination=Queue[app.queue.testQueue], id=e034ba73-7781-b62c-0307-170099263068, priority=4, jms_timestamp=1534820792064, contentType=application/json, jms_messageId=ID:test.21415B667C051:40C149C0, timestamp=1534820792481}]
Question -
Is my understanding of the root cause correct? If so, what can be done to treat these 387 messages as individual messages (and not one List of messages) and publish them one by one without impacting the transaction management?
I had discussed this issue with https://stackoverflow.com/users/2756547/artem-bilan in my earlier post on Stack Overflow, and I had to check this design by replacing Tibco EMS with ActiveMQ. However, the ActiveMQ infrastructure is still being analysed by our architecture team, so it can't be used until it is approved.
Oh! Now I see what your problem is. The int-jdbc:inbound-channel-adapter indeed returns a list of the records it could select from the DB, and that whole list is sent as a single message to JMS. That is why you see only one thread on the consumer side: there is only one message to get from the queue.
If you would like a separate message for each pulled record, you need to consider using a <splitter> between the JDBC polling operation and the sending to JMS.
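A minimal sketch of that suggestion against the configuration above; the splitchannel name is mine, and it assumes handleJdbcMessage passes the list through. A default <int:splitter> (no ref) splits a List payload into one message per element:
<int:service-activator input-channel="dbchannel"
    output-channel="splitchannel" ref="jdbcmessagehandler" method="handleJdbcMessage" />

<int:channel id="splitchannel"/>

<!-- the default splitter emits one message per list element -->
<int:splitter input-channel="splitchannel" output-channel="publishchannel"/>
With that in place, each of the 387 records becomes its own JMS message, so the five concurrent consumers have separate messages to compete for; the split still happens on the polling thread, inside the same poller transaction.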

Spring Integration with RedisLockRegistry example

We are implementing a flow where an <int-sftp:inbound-streaming-channel-adapter/> polls a directory for a file and, when one is found, passes the stream to a service activator.
The issue is that we will have multiple instances of the app running, and we would like to lock the process so that only one instance can pick up the file.
Looking at the documentation, the Redis Lock Registry looks to be the solution; is there an example of it being used in XML?
All I can find are a few references to it and its source code.
http://docs.spring.io/spring-integration/reference/html/redis.html point 24.1
Added info:
I've added the RedisMetadataStore and SftpSimplePatternFileListFilter. It does work, but with one oddity: when sftpInboundAdapter is activated by the poller, it adds an entry for each file to the metadata store. Say there are 10 files: there would be 10 entries in the store, but it does not process all 10 files in one go; only one file is processed per poll of the adapter. That would be fine, but in a multi-instance environment, if the server which picked up the files went down after processing 5 of them, another server does not seem to pick up the remaining 5 files unless they are "touched".
Is the behaviour of picking up one file per poll correct, or should it process all valid files during one poll?
Below is my XML:
<int:channel id="sftpInbound"/> <!-- To Java -->
<int:channel id="sftpOutbound"/>
<int:channel id="sftpStreamTransformer"/>
<int-sftp:inbound-streaming-channel-adapter id="sftpInboundAdapter"
channel="sftpInbound"
session-factory="sftpSessionFactory"
filter="compositeFilter"
remote-file-separator="/"
remote-directory="${sftp.directory}">
<int:poller cron="${sftp.cron}"/>
</int-sftp:inbound-streaming-channel-adapter>
<int:stream-transformer input-channel="sftpStreamTransformer" output-channel="sftpOutbound"/>
<bean id="compositeFilter"
class="org.springframework.integration.file.filters.CompositeFileListFilter">
<constructor-arg>
<list>
<bean
class="org.springframework.integration.sftp.filters.SftpSimplePatternFileListFilter">
<constructor-arg value="Receipt*.txt" />
</bean>
<bean id="SftpPersistentAcceptOnceFileListFilter" class="org.springframework.integration.sftp.filters.SftpPersistentAcceptOnceFileListFilter">
<constructor-arg ref="metadataStore" />
<constructor-arg value="ReceiptLock_" />
</bean>
</list>
</constructor-arg>
</bean>
<bean id="redisConnectionFactory"
class="org.springframework.data.redis.connection.jedis.JedisConnectionFactory">
<property name="port" value="${redis.port}" />
<property name="password" value="${redis.password}" />
<property name="hostName" value="${redis.host}" />
</bean>
No; you need to use an SftpPersistentAcceptOnceFileListFilter (docs here) with a Redis (or some other) metadata store, not a lock registry.
EDIT
Regarding your comment below.
Yes, it's a known issue; in the next release we've added a max-fetch-size for exactly this reason, so that the instances can each retrieve some of the files rather than the first instance grabbing them all.
(The inbound adapter works by first copying files that are found, and are not already in the store, to the local disk, and then emitting them one at a time.)
5.0 is only available as a milestone right now (M2 at the time of writing), but the current version and the milestone repo can be found here; it won't be released for a few more months.
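For illustration, once on 5.0 the new attribute might sit on the adapter above roughly like this (a sketch that assumes the rest of your configuration stays unchanged; max-fetch-size is the only addition):
<int-sftp:inbound-streaming-channel-adapter id="sftpInboundAdapter"
        channel="sftpInbound"
        session-factory="sftpSessionFactory"
        filter="compositeFilter"
        remote-file-separator="/"
        remote-directory="${sftp.directory}"
        max-fetch-size="1"> <!-- fetch at most one file per poll; other instances can grab the rest -->
    <int:poller cron="${sftp.cron}"/>
</int-sftp:inbound-streaming-channel-adapter>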
Another alternative would be to use outbound gateways: one to LS the files and one to GET individual files; your app would have to use the metadata store itself, though, to determine which file(s) can be fetched.
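A rough sketch of that alternative, with hypothetical listChannel, fileNamesChannel, getFileChannel and fetchedFileChannel names; your own code would consult the metadata store between the LS and the GET to decide which names to fetch:
<!-- LS: list the remote file names (-1 = names only) -->
<int-sftp:outbound-gateway session-factory="sftpSessionFactory"
    command="ls" command-options="-1"
    request-channel="listChannel" reply-channel="fileNamesChannel"
    expression="'${sftp.directory}'"/>

<!-- GET: fetch a single file whose name arrives as the payload -->
<int-sftp:outbound-gateway session-factory="sftpSessionFactory"
    command="get"
    request-channel="getFileChannel" reply-channel="fetchedFileChannel"
    expression="'${sftp.directory}/' + payload"/>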

Spring Batch - How to achieve multi-threading/partitioning with Single large XML - StaxEventItemReader

I am using Spring Batch 3.2 for a bulk migration of data from XML to a database.
My XML contains around 140K users, and I want to dump them into the DB.
I do not want to process it in a single thread.
I tried using a TaskExecutor but was not able to succeed, due to the error below.
at java.lang.Thread.run(Thread.java:724)
Caused by: com.ctc.wstx.exc.WstxParsingException: Unexpected close tag </consumerName>; expected </first>.
at [row,col {unknown-source}]: [4814,26]
at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:606)
at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:479)
at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:464)
at com.ctc.wstx.sr.BasicStreamReader.reportWrongEndElem(BasicStreamReader.java:3283)
at com.ctc.wstx.sr.BasicStreamReader.readEndElem(BasicStreamReader.java:3210)
at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2829)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1072)
at org.codehaus.stax2.ri.Stax2EventReaderImpl.peek(Stax2EventReaderImpl.java:367)
at org.springframework.batch.item.xml.stax.DefaultFragmentEventReader.nextEvent(DefaultFragmentEventReader.java:114)
at org.springframework.batch.item.xml.stax.DefaultFragmentEventReader.markFragmentProcessed(DefaultFragmentEventReader.java:184)
where consumerName and first are XML nodes.
I know that StaxEventItemReader is not thread-safe. Multiple threads are reading the XML, and because of that there is a problem in marking fragments as processed, so I am not able to get a unique record, or even a complete fragment, to process.
Can anyone suggest how I can use multi-threading/partitioning in my case?
What I want
Using multi-threading, how can I make sure that each thread gets a unique chunk, i.e. thread 1 gets fragments 1-100, thread 2 gets fragments 101-200, and so on?
Each thread should process its unique chunk and dump it into the DB.
My configuration
<batch:job id="sampleJob">
<batch:step id="multiThreadStep" allow-start-if-complete="true">
<batch:tasklet transaction-manager="transactionManager" task-executor="taskExecutor" throttle-limit="10">
<batch:chunk reader="xmlItemReader" writer="itemWriter"
processor="itemProcessor" commit-interval="10" skip-limit="1500000">
<batch:skippable-exception-classes>
<batch:include class="java.lang.Exception" />
</batch:skippable-exception-classes>
</batch:chunk>
</batch:tasklet>
</batch:step>
</batch:job>
<!-- <bean id="taskExecutor" class="org.springframework.core.task.SimpleAsyncTaskExecutor">
<property name="concurrencyLimit" value="10"/>
</bean> -->
<bean id="taskExecutor" class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
<property name="corePoolSize" value="10" />
<property name="maxPoolSize" value="10" />
<property name="allowCoreThreadTimeOut" value="true" />
</bean>
<bean id="xmlItemReader" class="org.springframework.batch.item.xml.StaxEventItemReader">
<property name="fragmentRootElementName" value="userItem" />
<property name="unmarshaller" ref="userDetailUnmarshaller" />
<property name="saveState" value="false" />
</bean>
Sample XML
<userItems>
    <userItem>
        <...>
        <...>
    </userItem>
    <userItem>
        <...>
        <...>
    </userItem>
    ...
    ...
</userItems>

Spring integration: how to handle exceptions in services after an aggregator?

I have an application relying on Spring Integration (4.0.4.RELEASE) and RabbitMQ. My flow is as follows:
Messages are put in a queue via a process (they do not expect any answer):
Gateway -> Channel -> RabbitMQ
And then drained by another process:
RabbitMQ --1--> inbound-channel-adapter A --2--> chain B --3--> aggregator C --4--> service-activator D --5--> final service-activator E
Explanations & context
The specific thing is that nowhere in my application am I using a splitter: aggregator C just waits for enough messages to come in, or for a timeout to expire, and then forwards the batch to service D. Messages can get stuck in aggregator C for quite a long time, and should NOT be considered consumed there. They should only be considered consumed once service D successfully completes. Therefore, I am using MANUAL acknowledgement on inbound-channel-adapter A, and service E is in charge of acknowledging the batch.
Custom aggregator
I solved the acknowledgement issue I had with AUTO mode by redefining the aggregator. Indeed, messages are acknowledged immediately if any asynchronous process occurs in the flow (see the question here). Therefore, I switched to MANUAL acknowledgement and implemented the aggregator like this:
<bean class="org.springframework.integration.config.ConsumerEndpointFactoryBean">
<property name="inputChannel" ref="channel3"/>
<property name="handler">
<bean class="org.springframework.integration.aggregator.AggregatingMessageHandler">
<constructor-arg name="processor">
<bean class="com.test.AMQPAggregator"/>
</constructor-arg>
<property name="correlationStrategy">
<bean class="com.test.AggregatorDefaultCorrelationStrategy" />
</property>
<property name="releaseStrategy">
<bean class="com.test.AggregatorMongoReleaseStrategy" />
</property>
<property name="messageStore" ref="messageStoreBean"/>
<property name="expireGroupsUponCompletion" value="true"/>
<property name="sendPartialResultOnExpiry" value="true"/>
<property name="outputChannel" ref="channel4"/>
</bean>
</property>
</bean>
<bean id="messageStoreBean" class="org.springframework.integration.store.SimpleMessageStore"/>
<bean id="messageStoreReaperBean" class="org.springframework.integration.store.MessageGroupStoreReaper">
<property name="messageGroupStore" ref="messageStore" />
<property name="timeout" value="${myapp.timeout}" />
</bean>
<task:scheduled-tasks>
<task:scheduled ref="messageStoreReaperBean" method="run" fixed-rate="2000" />
</task:scheduled-tasks>
Indeed, I wanted to aggregate the headers in a different way, and to keep the highest value of all the amqp_deliveryTag headers for later multi-acknowledgement in service E (see this thread). This works great so far, apart from the fact that it is far more verbose than the typical aggregator namespace (see this old Jira ticket).
Services
I am just using basic configurations:
chain-B
<int:chain input-channel="channel2" output-channel="channel3">
<int:header-enricher>
<int:error-channel ref="errorChannel" /> // Probably useless
</int:header-enricher>
<int:json-to-object-transformer/>
<int:transformer ref="serviceABean"
method="doThis" />
<int:transformer ref="serviceBBean"
method="doThat" />
</int:chain>
service-D
<int:service-activator ref="serviceDBean"
method="doSomething"
input-channel="channel4"
output-channel="channel5" />
Error management
As I rely on MANUAL acknowledgement, I need to manually reject messages as well in case an exception occurs. I have the following definition for inbound-channel-adapter A:
<int-amqp:inbound-channel-adapter channel="channel2"
queue-names="si.queue1"
error-channel="errorChannel"
mapped-request-headers="*"
acknowledge-mode="MANUAL"
prefetch-count="${properties.prefetch_count}"
connection-factory="rabbitConnectionFactory"/>
I use the following definition for errorChannel:
<int:chain input-channel="errorChannel">
<int:transformer ref="errorUnwrapperBean" method="unwrap" />
<int:service-activator ref="amqpAcknowledgerBean" method="rejectMessage" />
</int:chain>
The ErrorUnwrapper is based on this code, and the whole exception-detection and message-rejection mechanism works well until messages reach aggregator C.
Problem
If an exception is raised while processing the messages in service-activator D, I see the exception, but errorChannel does not seem to receive any message and my ErrorUnwrapper's unwrap() method is not called. The trimmed stack traces I see when an Exception("ahahah") is thrown are as follows:
2014-09-23 16:41:18,725 ERROR o.s.i.s.SimpleMessageStore:174: Exception in expiry callback
org.springframework.messaging.MessageHandlingException: java.lang.Exception: ahahaha
at org.springframework.integration.handler.MethodInvokingMessageProcessor.processMessage(MethodInvokingMessageProcessor.java:78)
at org.springframework.integration.handler.ServiceActivatingHandler.handleRequestMessage(ServiceActivatingHandler.java:71)
at org.springframework.integration.handler.AbstractReplyProducingMessageHandler.handleMessageInternal(AbstractReplyProducingMessageHandler.java:170)
at org.springframework.integration.handler.AbstractMessageHandler.handleMessage(AbstractMessageHandler.java:78)
(...)
Caused by: java.lang.Exception: ahahaha
at com.myapp.ServiceD.doSomething(ServiceD.java:153)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
(...)
2014-09-23 16:41:18,733 ERROR o.s.s.s.TaskUtils$LoggingErrorHandler:95: Unexpected error occurred in scheduled task.
org.springframework.messaging.MessageHandlingException: java.lang.Exception: ahahaha
(...)
Question
How can one tell the services that process messages coming from such an aggregator to publish their errors to errorChannel? I tried to specify the error channel in the header via a header-enricher, with no luck. I am using the default errorChannel definition, but I also tried changing its name and redefining it. I am clueless here, and even though I found this and that, I have not managed to get it to work. Thanks in advance for your help!
As you can see from the stack trace, your process is started from the MessageGroupStoreReaper thread, which is initiated from the default ThreadPoolTaskScheduler.
So, you must provide a custom bean for that:
<bean id="scheduler" class="org.springframework.scheduling.concurrent.ThreadPoolTaskScheduler">
<property name="errorHandler">
<bean class="org.springframework.integration.channel.MessagePublishingErrorHandler">
<property name="defaultErrorChannel" ref="errorChannel"/>
</bean>
</property>
</bean>
<task:scheduled-tasks scheduler="scheduler">
<task:scheduled ref="messageStoreReaperBean" method="run" fixed-rate="2000" />
</task:scheduled-tasks>
However, I do see the benefit of having an error-channel option on the <aggregator> itself, since it really has several points running on detached threads that we cannot otherwise deal with easily.
