health check before processing file stream in xd - spring-integration

I am pulling files from s3 and processing them using spring xd. I have one processor http client component where i do some RESTful request .Now the problem with this approach is if my webservice is down the files get accumulated in rabbit mq transport .Hence before pulling a individual file from s3 I want to do a health check on my rest service.How can I tackle this my configuration file looks something like this.
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:int="http://www.springframework.org/schema/integration"
xmlns:int-aws="http://www.springframework.org/schema/integration/aws"
xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
http://www.springframework.org/schema/integration http://www.springframework.org/schema/integration/spring-integration.xsd
http://www.springframework.org/schema/integration/aws http://www.springframework.org/schema/integration/aws/spring-integration-aws-1.0.xsd">
<int:poller fixed-delay="${fixed-delay}" default="true"/>
<bean id="credentials" class="org.springframework.integration.aws.core.BasicAWSCredentials">
<property name="accessKey" value="${accessKey}"/>
<property name="secretKey" value="${secretKey}"/>
</bean>
<bean
class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
<property name="location">
<value>dms-aws-s3-nonprod.properties</value>
</property>
</bean>
<bean id="clientConfiguration" class="com.amazonaws.ClientConfiguration">
<property name="proxyHost" value="${proxyHost}"/>
<property name="proxyPort" value="${proxyPort}"/>
<property name="preemptiveBasicProxyAuth" value="false"/>
</bean>
<bean id="s3Operations" class="org.springframework.integration.aws.s3.core.CustomC1AmazonS3Operations">
<constructor-arg index="0" ref="credentials"/>
<constructor-arg index="1" ref="clientConfiguration"/>
<property name="awsEndpoint" value="s3.amazonaws.com"/>
<property name="temporaryDirectory" value="${temporaryDirectory}"/>
<property name="awsSecurityKey" value="${awsSecurityKey}"/>
</bean>
<!-- aws-endpoint="https://s3.amazonaws.com" -->
<int-aws:s3-inbound-channel-adapter aws-endpoint="s3.amazonaws.com"
bucket="${bucket}"
s3-operations="s3Operations"
credentials-ref="credentials"
file-name-wildcard="${file-name-wildcard}"
remote-directory="${remote-directory}"
channel="splitChannel"
local-directory="${local-directory}"
accept-sub-folders="false"
delete-source-files="true"
archive-bucket="${archive-bucket}"
archive-directory="${archive-directory}">
</int-aws:s3-inbound-channel-adapter>
int-file:splitter input-channel="splitChannel" output-channel="output" markers="true"/>
<int:channel id="output"/>
My stream defination
xd-shell>stream create feedTest16 --definition "aws-s3-source |processor-http-client| log" --deploy

Starting with Spring Integration 4.1, the PollSkipAdvice has been introduced.
Implement your own ServiceHealthCheckPollSkipStrategy and inject it into the <advice-chain> of the <poller> for your <int-aws:s3-inbound-channel-adapter> and you're good with the requirement!
Only one issue is there that your s3-source is tied with the target service for the http-client...

Related

dynamic delay configuration in spring poller

Polling dynamically
I am using poller component from integration to poll files from s3.The fixed delay is 15 min and max message rate is 1 .The reason i did this was in down stream messages in xd were clogging since I am using http.Now this is file for 100k records file but when file size is small i still wait 15 min though i can process fast.Now is there any way to dynamically set delay depending in size of file.Because we don't know which files will be polled also to know it?Depending on file size i will pick up or number of records can we change dynamically the fixed delay or fixed rate?
<int:poller fixed-delay="${fixedDelay}" default="true" max-messages-per-poll="${maxMessageRate}">
<int:advice-chain>
<ref bean="pollAdvise"/>
</int:advice-chain>
</int:poller>
<bean id="pollAdvise" class="org.springframework.integration.scheduling.PollSkipAdvice">
<constructor-arg ref="healthCheckStrategy"/>
</bean>
<bean id="healthCheckStrategy" class="test.ServiceHealthCheckPollSkipStrategy">
<property name="url" value="${url}"/>
<property name="doHealthCheck" value="${doHealthCheck}"/>
</bean>
<bean id="credentials" class="org.springframework.integration.aws.core.BasicAWSCredentials">
<property name="accessKey" value="${accessKey}"/>
<property name="secretKey" value="${secretKey}"/>
</bean>
<bean id="clientConfiguration" class="com.amazonaws.ClientConfiguration">
<property name="proxyHost" value="${proxyHost}"/>
<property name="proxyPort" value="${proxyPort}"/>
<property name="preemptiveBasicProxyAuth" value="false"/>
</bean>
<bean id="s3Operations" class="org.springframework.integration.aws.s3.core.CustomC1AmazonS3Operations">
<constructor-arg index="0" ref="credentials"/>
<constructor-arg index="1" ref="clientConfiguration"/>
<property name="awsEndpoint" value="s3.amazonaws.com"/>
<property name="temporaryDirectory" value="${temporaryDirectory}"/>
<property name="awsSecurityKey" value="${awsSecurityKey}"/>
</bean>
<!-- aws-endpoint="https://s3.amazonaws.com" -->
<int-aws:s3-inbound-channel-adapter aws-endpoint="s3.amazonaws.com"
bucket="${bucket}"
s3-operations="s3Operations"
credentials-ref="credentials"
file-name-wildcard="${fileNameWildcard}"
remote-directory="${remoteDirectory}"
channel="splitChannel"
local-directory="${localDirectory}"
accept-sub-folders="false"
delete-source-files="true"
archive-bucket="${archiveBucket}"
archive-directory="${archiveDirectory}">
</int-aws:s3-inbound-channel-adapter>
<int-file:splitter id="s3splitter" input-channel="splitChannel" output-channel="bridge" markers="false" charset="UTF-8">
<int-file:request-handler-advice-chain>
<bean class="org.springframework.integration.handler.advice.ExpressionEvaluatingRequestHandlerAdvice">
<property name="onSuccessExpression" value="payload.delete()"/>
</bean>
</int-file:request-handler-advice-chain>
</int-file:splitter>
Starting with Spring Integration 4.2 the AbstractMessageSourceAdvice has been introduced:
This method is called after the receive() method; again, you can reconfigure the source, or take any action perhaps depending on the result (which can be null if there was no message created by the source). You can even return a different message!
Starting with version 4.3 we introduce CompoundTriggerAdvice: http://docs.spring.io/spring-integration/docs/4.3.0.BUILD-SNAPSHOT/reference/html/messaging-channels-section.html#_compoundtriggeradvice
Which you can use for your use-case based on the payload size.

Enabling logging in spring integration utility

Below I have the program to send a message and consume a message from queue, right now I have commented out the sending part and only want to consume the messages from queue
Now I want to enable logging in the below program such that a log file is generated in my c: drive and inside that log file it should indicate that what message it is consuming at what time stamp please advise how to configure logging in the below configuration file
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:int="http://www.springframework.org/schema/integration"
xmlns:jms="http://www.springframework.org/schema/integration/jms"
xmlns:file="http://www.springframework.org/schema/integration/file"
xmlns:context="http://www.springframework.org/schema/context"
xsi:schemaLocation="http://www.springframework.org/schema/beans
http://www.springframework.org/schema/beans/spring-beans.xsd
http://www.springframework.org/schema/integration http://www.springframework.org/schema/integration/spring-integration.xsd
http://www.springframework.org/schema/integration/jms
http://www.springframework.org/schema/integration/jms/spring-integration-jms.xsd
http://www.springframework.org/schema/integration/spring-integration.xsd
http://www.springframework.org/schema/context
http://www.springframework.org/schema/integration/file
http://www.springframework.org/schema/integration/file/spring-integration-file.xsd
http://www.springframework.org/schema/context/spring-context.xsd">
<int:poller id="poller" default="true">
<int:interval-trigger interval="200" />
</int:poller>
<int:channel id="input">
<int:queue capacity="10" />
</int:channel>
<bean id="tibcoEMSJndiTemplate" class="org.springframework.jndi.JndiTemplate">
<property name="environment">
<props>
<prop key="java.naming.factory.initial">com.tibco.tibjms.naming.TibjmsInitialContextFactory
</prop>
<prop key="java.naming.provider.url">tcp://lsdrtems2.fm.crdgrp.net:7333</prop>
<prop key="java.naming.security.principal">acfgtir</prop>
<prop key="java.naming.security.credentials">acfgtir</prop>
</props>
</property>
</bean>
<bean id="tibcoEMSConnFactory" class="org.springframework.jndi.JndiObjectFactoryBean">
<property name="jndiTemplate">
<ref bean="tibcoEMSJndiTemplate" />
</property>
<property name="jndiName">
<value>GenericConnectionFactory</value>
</property>
</bean>
<bean id="tibcosendJMSTemplate" class="org.springframework.jms.core.JmsTemplate">
<property name="connectionFactory">
<ref bean="tibcoEMSConnFactory" />
</property>
<property name="defaultDestinationName">
<value>acfgtirrtyation.ioa.swretift_publish_poc1</value>
</property>
<property name="pubSubDomain">
<value>false</value>
</property>
<property name="receiveTimeout">
<value>120000</value>
</property>
</bean>
<!-- <jms:outbound-channel-adapter channel="input"
destination-name="acfgtirrtyation.ioa.swretift_publish_poc1" connection-factory="tibcoEMSConnFactory" /> -->
<int:channel id="objetChannel"></int:channel>
<int:channel id="StringChannel"></int:channel>
<int:channel id="jmsInChannel" />
<jms:message-driven-channel-adapter id="jmsIn" concurrent-consumers="10"
destination-name="acfgtirrtyation.ioa.swretift_publish_poc1" connection-factory="tibcoEMSConnFactory" extract-payload="false"
channel="jmsInChannel" />
<int:payload-type-router input-channel="jmsInChannel">
<int:mapping type="javax.jms.ObjectMessage" channel="objetChannel" />
<int:mapping type="javax.jms.TextMessage" channel="StringChannel" />
</int:payload-type-router>
<file:outbound-channel-adapter id="filesoutOject" channel="objetChannel" directory="C:\\abcsaral"
filename-generator="generatorr" />
<file:outbound-channel-adapter id="filesoutString" channel="StringChannel" directory="C:\\abcsaral"
filename-generator="generatorr" />
<bean id="generatorr" class="com.abs.tibco.TimestampTextGenerator">
</bean>
</beans>
Add log4j (or logback, or any java logging system supported by commons-logging) to your classpath and configure it to log a DEBUG level for category org.springframework.integration.
Or you can add a wire tap to the channel and route it to a Logging channel adapter
<int:channel id="in">
<int:interceptors>
<int:wire-tap channel="logger"/>
</int:interceptors>
</int:channel>
<int:logging-channel-adapter id="logger" level="DEBUG"/>

File getting deleted before processing

I have spring xd module with rabbit transport which pulls files from s3 and split line by line and delete it after processing(ExpressionAdvice) .I have around 1 million messages(lines) in my file which is in s3.The file gets downloaded to xd container box and i checked md5sum and its same and has same lines . I see only 260k odd message are coming to output channel which is processor.I am loosing around 740 messages. sometimes it random once i see all messages like 1 million in my output channel and sometimes only 250k .I am measuring this using counter for my stream.File is downloaded but i feel its getting deleted before processing all records/lines in 10 seconds, my file size is around 700Mb.Please let me know if expression advice is deleting before processing.
module.aws-s3-source.count=1 and module.aws-s3-source.concurrency=70
stream1 as-s3-source |processor|sink
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:int="http://www.springframework.org/schema/integration"
xmlns:context="http://www.springframework.org/schema/context"
xmlns:int-aws="http://www.springframework.org/schema/integration/aws"
xmlns:int-file="http://www.springframework.org/schema/integration/file"
xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
http://www.springframework.org/schema/integration/file
http://www.springframework.org/schema/integration/file/spring-integration-file.xsd
http://www.springframework.org/schema/context
http://www.springframework.org/schema/context/spring-context.xsd
http://www.springframework.org/schema/integration http://www.springframework.org/schema/integration/spring-integration.xsd
http://www.springframework.org/schema/integration/aws http://www.springframework.org/schema/integration/aws/spring-integration-aws-1.0.xsd">
<context:property-placeholder location="classpath*:test-${region}.properties" />
<int:poller fixed-delay="${fixedDelay}" default="true">
<int:advice-chain>
<ref bean="pollAdvise"/>
</int:advice-chain>
</int:poller>
<bean id="pollAdvise" class="org.springframework.integration.scheduling.PollSkipAdvice">
<constructor-arg ref="healthCheckStrategy"/>
</bean>
<bean id="healthCheckStrategy" class="test.ServiceHealthCheckPollSkipStrategy">
<property name="url" value="${url}"/>
<property name="doHealthCheck" value="${doHealthCheck}"/>
<property name="restTemplate" ref="restTemplate"/>
</bean>
<bean id="restTemplate"
class="org.springframework.web.client.RestTemplate">
<constructor-arg ref="requestFactory"/>
</bean>
<bean id="requestFactory"
class="test.BatchClientHttpRequestFactory">
<constructor-arg ref="verifier"/>
</bean>
<bean id="verifier"
class="test.NullHostnameVerifier">
</bean>
<bean id="encryptedDatum" class="test.EncryptedSecuredDatum"/>
<bean id="clientConfiguration" class="com.amazonaws.ClientConfiguration">
<property name="proxyHost" value="${proxyHost}"/>
<property name="proxyPort" value="${proxyPort}"/>
<property name="preemptiveBasicProxyAuth" value="false"/>
</bean>
<bean id="s3Operations" class="test.CustomC1AmazonS3Operations">
<constructor-arg index="0" ref="clientConfiguration"/>
<property name="awsEndpoint" value="s3.amazonaws.com"/>
<property name="temporaryDirectory" value="${temporaryDirectory}"/>
<property name="awsSecurityKey" value=""/>
</bean>
<bean id="credentials" class="org.springframework.integration.aws.core.BasicAWSCredentials">
</bean>
<int-aws:s3-inbound-channel-adapter aws-endpoint="s3.amazonaws.com"
bucket="${bucket}"
s3-operations="s3Operations"
credentials-ref="credentials"
file-name-wildcard="${fileNameWildcard}"
remote-directory="${prefix}"
channel="splitChannel"
local-directory="${localDirectory}"
accept-sub-folders="false"
delete-source-files="true"
archive-bucket="${archiveBucket}"
archive-directory="${archiveDirectory}">
</int-aws:s3-inbound-channel-adapter>
<int-file:splitter input-channel="splitChannel" output-channel="output" markers="false" charset="UTF-8">
<int-file:request-handler-advice-chain>
<bean class="org.springframework.integration.handler.advice.ExpressionEvaluatingRequestHandlerAdvice">
<property name="onSuccessExpression" value="payload.delete()"/>
</bean>
</int-file:request-handler-advice-chain>
</int-file:splitter>
<int:channel-interceptor pattern="*" order="3">
<bean class="org.springframework.integration.channel.interceptor.WireTap">
<constructor-arg ref="loggingChannel" />
</bean>
</int:channel-interceptor>
<int:logging-channel-adapter id="loggingChannel" log-full-message="true" level="INFO"/>
<int:channel id="output"/>
</beans>
Update 2 :
My stream is like below
aws-s3-source|processor|http-client| processor> queue:testQueue
1)Now I split the stream like below:
aws-s3-source> queue:s3Queue
I was able to read all my 1 million messages very fast.
2)Now i added one more stream like below i see issue again was s3 stops pulling file and message are lost everytime
queue:s3Queue>processor|http-client| processor> queue:testQueue
3)Observation is when i add http-client this issue happens again ,i.e. some message from input source is missing.
4)Now I split the file into 125 Mb 5 files instead of 660mb one file .ie 200 k records 5 files.I don't see the issue i get all my messages
I also see lot of messages clogging in queue before http-client .
I am thinking is it something to do with memory or threading inside xd?
Please let me know if expression advice is deleting before processing.
No; the advice is an around advice around the message handler; it can't execute (evaluate the expression), until the splitter has emitted all the lines.
Is it possible the file is pulled from s3 before it's completely written?
To debug this problem, I would suggest changing the advice to send the file to another subflow and do some analysis/logging there before deleting.

Multiple message processed

I have a spring xd source module which pulls file from s3 and splits line by line.I have my spring config as below.But I have 3 container and 1 admin server.Now I see duplicate message being processed by each container as each of them is downloading there own copy.
I can solve with making source s3 module deployment count as 1 but my processing of message is getting slow.?Any inputs to solve this?
<int:poller fixed-delay="${fixedDelay}" default="true">
<int:advice-chain>
<ref bean="pollAdvise"/>
</int:advice-chain>
</int:poller>
<bean id="pollAdvise"
</bean>
<bean id="credentials" class="org.springframework.integration.aws.core.BasicAWSCredentials">
<property name="accessKey" value="#{encryptedDatum.decryptBase64Encoded('${accessKey}')}"/>
<property name="secretKey" value="${secretKey}"/>
</bean>
<bean id="clientConfiguration" class="com.amazonaws.ClientConfiguration">
<property name="proxyHost" value="${proxyHost}"/>
<property name="proxyPort" value="${proxyPort}"/>
<property name="preemptiveBasicProxyAuth" value="false"/>
</bean>
<bean id="s3Operations" class="org.springframework.integration.aws.s3.core.CustomC1AmazonS3Operations">
<constructor-arg index="0" ref="credentials"/>
<constructor-arg index="1" ref="clientConfiguration"/>
<property name="awsEndpoint" value="s3.amazonaws.com"/>
<property name="temporaryDirectory" value="${temporaryDirectory}"/>
<property name="awsSecurityKey" value="${awsSecurityKey}"/>
</bean>
<bean id="encryptedDatum" class="abc"/>
<!-- aws-endpoint="https://s3.amazonaws.com" -->
<int-aws:s3-inbound-channel-adapter aws-endpoint="s3.amazonaws.com"
bucket="${bucket}"
s3-operations="s3Operations"
credentials-ref="credentials"
file-name-wildcard="${fileNameWildcard}"
remote-directory="${remoteDirectory}"
channel="splitChannel"
local-directory="${localDirectory}"
accept-sub-folders="false"
delete-source-files="true"
archive-bucket="${archiveBucket}"
archive-directory="${archiveDirectory}">
</int-aws:s3-inbound-channel-adapter>
<int-file:splitter input-channel="splitChannel" output-channel="output" markers="false" charset="UTF-8">
<int-file:request-handler-advice-chain>
<bean class="org.springframework.integration.handler.advice.ExpressionEvaluatingRequestHandlerAdvice">
<property name="onSuccessExpression" value="payload.delete()"/>
</bean>
</int-file:request-handler-advice-chain>
</int-file:splitter>
<int:channel id="output"/>
[Updated]
I added the idempotency as suggested by you with a metadata store.But since my xd is running in 3 container cluster with rabbit will simple metadatastore work?I think I should use reds/mongo metadata source.If I use mongo/redis metadatastore howcan i evict/remove the messages because messages will pile up over time?
<int:idempotent-receiver id="expressionInterceptor" endpoint="output"
metadata-store="store"
discard-channel="nullChannel"
throw-exception-on-rejection="false"
key-expression="payload"/>
<bean id="store" class="org.springframework.integration.metadata.SimpleMetadataStore"/>
I can suggest you to take a look to the Idempotent Receiver.
With that you can use shared MetadataStore and don't accept duplicate files.
The <idempotent-receiver> should be configured for that your <int-file:splitter>. And yes: with the discard logic to avoid duplicate message.
UPDATE
.But since my xd is running in 3 container cluster with rabbit will simple metadatastore work?
That doesn't matter because you start the stream from the S3 MessageSource, so you should filter files already there. Therefore you need external shared MetadataStore.
.If I use mongo/redis metadatastore howcan i evict/remove the messages because messages will pile up over time?
That's correct. It is a side affect of the Idempotent Receiver logic. Not sure how it is a problem for you if you use a DataBase...
You can clean the collection/keys by some periodic task. Maybe once a week...

spring integration aws s3 delete local file

I am using spring integration to read file from s3 however this works but my local directory is getting full I want to delete files from local directory after files are processed from s3?
<bean id="credentials" class="org.springframework.integration.aws.core.BasicAWSCredentials">
<property name="accessKey" value="${accessKey}"/>
<property name="secretKey" value="${secretKey}"/>
</bean>
<bean id="clientConfiguration" class="com.amazonaws.ClientConfiguration">
<property name="proxyHost" value="${proxyHost}"/>
<property name="proxyPort" value="${proxyPort}"/>
<property name="preemptiveBasicProxyAuth" value="false"/>
</bean>
<bean id="s3Operations" class="org.springframework.integration.aws.s3.core.CustomC1AmazonS3Operations">
<constructor-arg index="0" ref="credentials"/>
<constructor-arg index="1" ref="clientConfiguration"/>
<property name="awsEndpoint" value="s3.amazonaws.com"/>
<property name="temporaryDirectory" value="${temporaryDirectory}"/>
<property name="awsSecurityKey" value="${awsSecurityKey}"/>
</bean>
<!-- aws-endpoint="https://s3.amazonaws.com" -->
<int-aws:s3-inbound-channel-adapter aws-endpoint="s3.amazonaws.com"
bucket="${bucket}"
s3-operations="s3Operations"
credentials-ref="credentials"
file-name-wildcard="${fileNameWildcard}"
remote-directory="${remoteDirectory}"
channel="splitChannel"
local-directory="${localDirectory}"
accept-sub-folders="false"
delete-source-files="true"
archive-bucket="${archiveBucket}"
archive-directory="${archiveDirectory}">
</int-aws:s3-inbound-channel-adapter>
Looks like you are looking for ExpressionEvaluatingRequestHandlerAdvice.
Please, find Retry and More. There is something like <property name="onSuccessExpression" value="payload.delete()" /> in the expression-advice-context.xml config to take care about the local file after the proper finish of the process.

Resources