I understand how aggregating based on size works, but I also want the release strategy to depend on whether another step in the pipeline is still running. The idea is that I move files to a "source" directory, aggregate enough files, move them from "source" to "stage", and then process the staged files. While this processing is running I don't want to put more files in "stage", but I do want to keep adding files to the "source" folder (that part is handled by a dispatcher channel connected to the file inbound adapter before the aggregator).
<int:aggregator id="filesBuffered"
input-channel="sourceFilesProcessed"
output-channel="stagedFiles"
release-strategy-expression="size() == 10"
correlation-strategy-expression="'mes-group'"
expire-groups-upon-completion="true"
/>
<int:channel id="stagedFiles" />
<int:service-activator input-channel="stagedFiles"
output-channel="readyForMes"
ref="moveToStage"
method="move" />
So, as you can see, I don't want to release the aggregated messages while an existing invocation of the moveToStage service activator is still running.
I thought about making the stagedFiles channel a queue channel, but that doesn't seem right: I want the files to be passed to moveToStage as a Collection, and I am assuming that a queue channel would deliver them one file at a time. Instead I want to reach a threshold (e.g. 10 files) and pass those files to stagedFiles so that moveToStage can process them. Until that step is done, I want the aggregator to keep aggregating files, and then release everything it has aggregated.
Thanks
I suggest you define a flag as an AtomicBoolean bean, set it from your moveToStage#move, and check its state in the release strategy:
release-strategy-expression="size() >= 10 and @stagingFlag.get()"
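To make the gating idea concrete, here is a minimal plain-Java sketch of the logic behind that expression. It is a model only, not Spring Integration code: the class StagingGate and its methods are hypothetical names, and it assumes the flag is true while staging is idle and is flipped by the move step itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Plain-Java model of the gated release; all names here are hypothetical.
final class StagingGate {
    // Assumption: true means the staging step is idle and may accept a batch.
    private final AtomicBoolean stagingFree = new AtomicBoolean(true);
    private final List<String> group = new ArrayList<>();
    private final int threshold;

    StagingGate(int threshold) {
        this.threshold = threshold;
    }

    // Mirrors the expression "size() >= threshold and the flag is set".
    boolean canRelease() {
        return group.size() >= threshold && stagingFree.get();
    }

    // Adds a file; returns the released batch, or an empty list if the group is held back.
    List<String> add(String file) {
        group.add(file);
        if (!canRelease()) {
            return Collections.emptyList();
        }
        List<String> batch = new ArrayList<>(group);
        group.clear();
        return batch;
    }

    // The move step calls this when it starts working on a batch.
    void moveStarted() {
        stagingFree.set(false);
    }

    // The move step calls this when it has finished.
    void moveFinished() {
        stagingFree.set(true);
    }

    int pending() {
        return group.size();
    }
}
```

In this model the check runs only when a file is added, so a group that grew while staging was busy is released in full by the first file that arrives after moveFinished() is called.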
Related
I have a spring-integration flow that starts with a file inbound-channel-adapter activated by a transactional poller (tx is handled by atomikos).
The text in the file is processed and the message goes down through the flow until it gets sent to one of the JMS queues (JMS outbound-channel-adapter).
In the middle, there are some database writes within a nested transaction.
The system is meant to run 24/7.
The flow for a single message progressively slows down, and when I investigated I found that the stage responsible for the increasing delay is the read from the filesystem.
Below is the first portion of the integration flow:
<logging-channel-adapter id="logger" level="INFO"/>
<transaction-synchronization-factory id="sync-factory">
<after-commit expression="payload.delete()" channel="after-commit"/>
</transaction-synchronization-factory>
<service-activator input-channel="after-commit" output-channel="nullChannel" ref="tx-after-commit-service"/>
<!-- typeb inbound from filesystem -->
<file:inbound-channel-adapter id="typeb-file-inbound-adapter"
auto-startup="${fs.typeb.enabled:true}"
channel="typeb-inbound"
directory="${fs.typeb.directory.in}"
filename-pattern="${fs.typeb.filter.filenamePattern:*}"
prevent-duplicates="${fs.typeb.preventDuplicates:false}" >
<poller id="poller"
fixed-delay="${fs.typeb.polling.millis:1000}"
max-messages-per-poll="${fs.typeb.polling.numMessages:-1}">
<transactional synchronization-factory="sync-factory"/>
</poller>
</file:inbound-channel-adapter>
<channel id="typeb-inbound">
<interceptors>
<wire-tap channel="logger"/>
</interceptors>
</channel>
I read something about issues related to the prevent-duplicates option that stores a list of seen files, but that is not the case because I turned it off.
I don't think that it may be related to the filter (filename-pattern) because the expression I use in my config (*.RCV) is easy to apply and the input folder does not contain a lot of files (less than 100) at the same time.
Still, there is something that gradually makes the read from filesystem slower and slower over time, from a few millis to over 3 seconds within a few days of up-time.
Any hints?
You should remove, or move files after they have been processed; otherwise the whole directory has to be rescanned.
In newer versions, you can use a WatchServiceDirectoryScanner which is more efficient.
But it's still best practice to clean up old files.
Finally I got the solution.
The issue was related to the specific version of Spring I was using (4.3.4) that is affected by a bug I had not discovered yet.
The problem is something about DefaultConversionService and the use of converterCache (look at this for more details https://jira.spring.io/browse/SPR-14929).
Upgrading to a more recent version resolved it.
So I think I need to extend the current redis-sink provided in spring-xd to write into a capped Redis list rather than creating a new one. Unfortunately it seems to get worse, because I will have to go deeper into spring-integration and further back into spring-data (spring-data-redis): the whole redis-sink seems to be based on the generic pub/sub abstraction on Redis. Or is there some type of handler that can be defined once the message arrives at the channel handler?
To get the effect of a capped list when I push data to Redis, I need to execute a Redis "push" followed by an "rtrim", as outlined here - http://redis.io/topics/data-types-intro. If I am to build a custom spring-integration / spring-data module, I believe I see support for "ltrim" but not for "rtrim" here: http://docs.spring.io/spring-data/redis/docs/1.7.0.RC1/api/
Any Advice on how/where to start or an easier approach would be appreciated.
Actually, Redis itself doesn't have an RTRIM command. We don't need one, because negative indexes with LTRIM achieve the same behavior:
start and end can also be negative numbers indicating offsets from the end of the list, where -1 is the last element of the list, -2 the penultimate element and so on.
I think you should use <int-redis:store-outbound-channel-adapter> and add something like this to its configuration:
<int-redis:request-handler-advice-chain>
<beans:bean class="org.springframework.integration.handler.advice.ExpressionEvaluatingRequestHandlerAdvice">
<beans:property name="onSuccessExpression" value="@redisTemplate.boundListOps(${keyExpression}).trim(1, -1)"/>
</beans:bean>
</int-redis:request-handler-advice-chain>
This removes the oldest element from the Redis list.
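The negative-index behavior is easier to see on a small in-memory model. The sketch below is not a Redis client and does not use spring-data-redis; CappedListModel is a hypothetical class that imitates RPUSH and LTRIM on a Java list, assuming new elements are pushed at the tail.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory model of Redis list semantics; names are hypothetical.
final class CappedListModel {
    private final List<String> items = new ArrayList<>();

    // RPUSH: append at the tail of the list.
    void rpush(String value) {
        items.add(value);
    }

    // LTRIM start stop: keep only the inclusive range.
    // Negative indexes count from the tail, so -1 is the last element.
    void ltrim(int start, int stop) {
        int size = items.size();
        int from = start < 0 ? Math.max(size + start, 0) : start;
        int to = stop < 0 ? size + stop : Math.min(stop, size - 1);
        List<String> kept = (from > to || from >= size)
                ? new ArrayList<>()
                : new ArrayList<>(items.subList(from, to + 1));
        items.clear();
        items.addAll(kept);
    }

    // Push, then keep only the newest `cap` elements.
    void pushCapped(String value, int cap) {
        rpush(value);
        ltrim(-cap, -1);
    }

    List<String> snapshot() {
        return new ArrayList<>(items);
    }
}
```

With this model, trimming to the range (-cap, -1) after each push caps the list at a fixed length, while a trim of (1, -1) drops exactly one element from the head.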
I am building a system that calls many different web services, and I want to generate a report of all the errors returned by those calls.
For that, I use an <int:aggregator> to aggregate messages from the error channel, but I don't know which release strategy to use, because I want to aggregate all messages on the error channel. How can I configure <int:aggregator> to aggregate all messages?
<int:aggregator
correlation-strategy-expression="'${error.msg.correlation.key}'"
input-channel="ws.rsp.error.channel"
output-channel="outboundMailChannel"
ref="errorAggregator"
method="generateErrorReport"
release-strategy-expression="false"
group-timeout="2000"
expire-groups-upon-completion="true"/>
<int:service-activator
input-channel="outboundMailChannel"
ref="errorMsgAgregatedActivator"
method="handleMessage"
/>
And the activator:
@ServiceActivator
public void handleMessage(Message<Collection<Object>> errorList) {
Collection<Object> payload=errorList.getPayload();
System.out.println("error list: "+payload.toString());
}
thanks.
Aggregation either needs an appropriate release strategy, or you can simply use release-strategy-expression="false" (never release), and use a group-timeout to release whatever's in the group after some time.
You may want to use a constant correlation, correlation-strategy-expression="'foo'", and set expire-groups-upon-completion="true" so that a new group starts with the next message.
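As a rough illustration of "never release, then time out", here is a plain-Java model of that behavior. It is not the framework's implementation: TimeoutGroupModel is a hypothetical class, time is passed in explicitly so the example stays deterministic, and it assumes the timeout is measured from the last message that arrived.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Plain-Java model of a never-releasing group with an idle timeout; names are hypothetical.
final class TimeoutGroupModel {
    private final long groupTimeoutMillis;
    private final List<String> group = new ArrayList<>();
    private long lastArrivalMillis;

    TimeoutGroupModel(long groupTimeoutMillis) {
        this.groupTimeoutMillis = groupTimeoutMillis;
    }

    // Release strategy "false": adding a message never releases the group.
    void add(String error, long nowMillis) {
        group.add(error);
        lastArrivalMillis = nowMillis;
    }

    // Releases whatever is in the group once it has been idle for the timeout.
    List<String> releaseIfIdle(long nowMillis) {
        if (group.isEmpty() || nowMillis - lastArrivalMillis < groupTimeoutMillis) {
            return Collections.emptyList();
        }
        List<String> released = new ArrayList<>(group);
        // Clearing the group stands in for expire-groups-upon-completion="true":
        // the next message starts a fresh group.
        group.clear();
        return released;
    }
}
```

With a 2000 ms timeout, errors added at 0 ms and 1500 ms are still held at 3000 ms and come out together at 3500 ms.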
I want to write a batch job that reads website access log files (CSV) from a path every day and does some analysis using Spring Integration.
This is a simplified version of the input CSV file:
srcIp1,t1,path1
srcIp2,t2,path2
srcIp1,t3,path2
srcIp1,t4,path1
The number of accesses per source IP and path is to be calculated after some filtering logic.
I made an input channel whose payload is the parsed log line, applied a filter, and finally added an aggregator to calculate the final result.
The problem is choosing the right group release strategy; the default release strategy (SequenceSizeReleaseStrategy) does not work.
None of the other out-of-the-box Spring Integration release
strategies (ExpressionEvaluatingReleaseStrategy,
MessageCountReleaseStrategy, MethodInvokingReleaseStrategy,
SequenceSizeReleaseStrategy, TimeoutCountSequenceSizeReleaseStrategy)
seems to fit my needs either.
Or does Spring Integration assume that a channel carries a message stream with no concept of an "end of messages", making it unsuitable for my problem?
You can write a custom ReleaseStrategy if you have some way to tell when the group is complete. It is consulted each time a message is added to the group.
Or, you can use a group-timeout to release a partial group after some time when no messages arrived.
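One way to tell when the group is complete is to send an explicit end-of-file marker after the last line. The sketch below models that idea in plain Java using the sample lines from the question. It is an assumption-laden illustration, not framework code: AccessLogGroup and the "EOF" marker are hypothetical, and in a real flow the completeness check would live in a ReleaseStrategy implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of a marker-based release check plus the aggregation step.
final class AccessLogGroup {
    // Hypothetical marker sent after the last line of a file.
    static final String END_MARKER = "EOF";

    private final List<String> lines = new ArrayList<>();

    // Consulted for every arriving message: returns true once the marker is seen.
    boolean addAndCheckRelease(String line) {
        if (END_MARKER.equals(line)) {
            return true;
        }
        lines.add(line);
        return false;
    }

    // Aggregation: number of accesses per "srcIp|path" key, in first-seen order.
    Map<String, Integer> countBySourceAndPath() {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            String[] fields = line.split(",");
            counts.merge(fields[0] + "|" + fields[2], 1, Integer::sum);
        }
        return counts;
    }
}
```

The marker approach only works if something upstream knows where the file ends, for example the component that reads and splits the file; otherwise the timeout-based release is the simpler option.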
I am using spring integration to download files and to process them.
<int-sftp:inbound-channel-adapter channel="FileDownloadChannel"
session-factory="SftpSessionFactory"
remote-directory="/home/sshaji/from_disney/files"
filter = "modifiedFileListFilter"
local-directory="/home/sshaji/to_disney/downloads"
auto-create-local-directory="true" >
<integration:poller cron="*/10 * * * * *" default="true"/>
</int-sftp:inbound-channel-adapter>
<integration:transformer input-channel="FileDownloadChannel"
ref="ErrorTransformer"
output-channel="EndChannel"/>
<integration:router input-channel="FileErrorProcessingChannel"
expression="payload.getErrorCode() > 0">
<integration:mapping value="true" channel="ReportErrorChannel"/>
<integration:mapping value="false" channel="FilesBackupChannel"/>
</integration:router>
The int-sftp:inbound-channel-adapter downloads files from the SFTP server.
It downloads about 6 files, all XML.
The transformer iterates over all 6 files and checks whether each one has an error tag.
If there is an error tag, it sets the error code to 1; otherwise it sets it to 0.
When the messages come out of the transformer and reach the router,
I want to move the files whose error code is 1 to a specific folder (Error)
and those whose error code is 0 to another folder (NoError).
Currently the transformer returns a list, "fileNames", which contains the error code and file name of all 6 files.
How can I check the error code for each file in the router, and then route that particular file to the right channel?
The equivalent C-style logic for my problem:
for (int i = 0; i < fileNames.length; i++) {
if (fileNames[i].getErrorCode() == 1) {
moveToErrorFolder(fileNames[i].getName());
} else {
moveToNoErrors(fileNames[i].getName());
}
}
How can I achieve this using Spring Integration?
If it's not possible, is there a workaround?
I hope it is clear now. I am sorry for not providing enough details last time.
Also, in the int-sftp:inbound-channel-adapter I have hard-coded the "remote-directory" and "local-directory" attributes to specific folders on the system. Can I take these from a bean property or a constant value instead?
I need to configure these values from a config.xml file; is that possible?
I am new to Spring Integration. Please help me.
Thanks in advance.
It's still not clear to me what you mean by "The transformer iterates all the 6 files".
Each file will be passed to the transformer in a single message, so I don't see how it can emit a list of 6.
It sounds like you need an <aggregator/> with a correlation-strategy-expression="'foo'" and release-strategy-expression="size() == 6". This would aggregate each single File into a list of File and pass it to your transformer. It then transforms it to a list of your status objects containing the filename and error code.
Finally, you would add a <splitter/> which would split the list into separate FileName messages to send to the router.
You can use normal Spring property placeholders for the directory attributes ${some.property} or SpEL to use a property of another bean #{someBean.remoteDir}.