Spring File Inbound Channel Adapter slows down

I have a spring-integration flow that starts with a file inbound-channel-adapter activated by a transactional poller (tx is handled by atomikos).
The text in the file is processed and the message goes down through the flow until it gets sent to one of the JMS queues (JMS outbound-channel-adapter).
In the middle, there are some database writes within a nested transaction.
The system is meant to run 24/7.
The processing of a single message progressively slows down, and when I investigated I found that the stage responsible for the increasing delay is the read from the filesystem.
Below is the first portion of the integration flow:
<logging-channel-adapter id="logger" level="INFO"/>
<transaction-synchronization-factory id="sync-factory">
    <after-commit expression="payload.delete()" channel="after-commit"/>
</transaction-synchronization-factory>
<service-activator input-channel="after-commit" output-channel="nullChannel" ref="tx-after-commit-service"/>
<!-- typeb inbound from filesystem -->
<file:inbound-channel-adapter id="typeb-file-inbound-adapter"
        auto-startup="${fs.typeb.enabled:true}"
        channel="typeb-inbound"
        directory="${fs.typeb.directory.in}"
        filename-pattern="${fs.typeb.filter.filenamePattern:*}"
        prevent-duplicates="${fs.typeb.preventDuplicates:false}">
    <poller id="poller"
            fixed-delay="${fs.typeb.polling.millis:1000}"
            max-messages-per-poll="${fs.typeb.polling.numMessages:-1}">
        <transactional synchronization-factory="sync-factory"/>
    </poller>
</file:inbound-channel-adapter>
<channel id="typeb-inbound">
    <interceptors>
        <wire-tap channel="logger"/>
    </interceptors>
</channel>
I read about issues related to the prevent-duplicates option, which stores a list of seen files, but that does not apply here because I turned it off.
I don't think it is related to the filter (filename-pattern) either, because the expression I use in my config (*.RCV) is cheap to apply and the input folder never contains many files at the same time (fewer than 100).
Still, there is something that gradually makes the read from filesystem slower and slower over time, from a few millis to over 3 seconds within a few days of up-time.
Any hints?

You should remove, or move files after they have been processed; otherwise the whole directory has to be rescanned.
In newer versions, you can use a WatchServiceDirectoryScanner which is more efficient.
But it's still best practice to clean up old files.
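
The clean-up advice above could be sketched in plain Java roughly as follows. This is only an illustration using java.nio, not Spring Integration code: the class name ProcessedFileArchiver, its two methods, and the temp directory names are made up for the example, and a real flow would more likely do the move from an after-commit hook.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Spring Integration): keeps the input
// directory small by moving each processed file to an archive directory.
final class ProcessedFileArchiver {

    /** Lists the regular files in inputDir that match the glob, e.g. "*.RCV". */
    static List<Path> pending(Path inputDir, String glob) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(inputDir, glob)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    result.add(p);
                }
            }
        }
        return result;
    }

    /** Moves a processed file into archiveDir so that later scans no longer see it. */
    static Path archive(Path file, Path archiveDir) throws IOException {
        Files.createDirectories(archiveDir);
        return Files.move(file, archiveDir.resolve(file.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Temp directories stand in for the real input and archive folders.
        Path in = Files.createTempDirectory("typeb-in");
        Path done = Files.createTempDirectory("typeb-done");
        Files.write(in.resolve("a.RCV"), "payload".getBytes());
        for (Path p : pending(in, "*.RCV")) {
            archive(p, done); // "processing" is omitted in this sketch
        }
        System.out.println("files left in input dir: " + pending(in, "*.RCV").size());
    }
}
```

With the processed files gone from the input directory, each poll only has to list what is still waiting.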

Finally I got the solution.
The issue was related to the specific version of Spring I was using (4.3.4), which is affected by a bug I had not discovered yet.
The problem involves DefaultConversionService and its use of converterCache (see https://jira.spring.io/browse/SPR-14929 for more details).
Upgrading to a more recent version resolved it.

Related

Limiting the scope of a transaction in Apache Camel

I have a transacted Camel route with a number of processors
from(Constant.RouteA)
    .transacted()
    .process(processor1)
    .process(processor2)
    .process(processor3)
    .wireTap(Constant.RouteB)
    .wireTap(Constant.RouteC)
    .end()
My problem is that I don't want the final part of the route (the wiretaps) to be part of the transaction, i.e. I want them to be executed once processor3 has finished and the transaction has been committed.
Initially I looked at using onCompletion(), but it doesn't seem to work together with transacted().
So I found another way, which uses policy() to limit the transaction scope, i.e.
from(Constant.RouteA)
    .policy("PROPAGATION_REQUIRED")
        .process(processor1)
        .process(processor2)
        .process(processor3)
    .end()
    .wireTap(Constant.RouteB)
    .wireTap(Constant.RouteC)
    .end()
The problem is that this solution requires defining a SpringTransactionPolicy in the Spring configuration, but the software I'm working on doesn't use Spring. Transactions are managed by Bitronix and everything works just by using the transacted() method, which as far as I can tell doesn't allow you to limit the scope of a transaction.
Is there a simple way to achieve my goal? Hopefully without bringing Spring into the picture. Thank you!

Try to create two routes. For example:
from("direct:startRoute")
    .to(Constant.RouteA)
    .wireTap(Constant.RouteB)
    .wireTap(Constant.RouteC);

from(Constant.RouteA)
    .transacted()
    .process(processor1)
    .process(processor2)
    .process(processor3);
Once route "Constant.RouteA" has finished, all changes will be committed.

spring-xd custom redis-sink

I think I need to extend the current redis-sink provided in spring-xd so that it writes into a Redis capped list, rather than creating a new one. Unfortunately this looks harder than expected: I would have to go deeper into spring-integration and further back into spring-data (spring-data-redis), because the whole redis-sink seems to be based on the generic pub/sub abstraction on Redis. Or is there some type of handler that can be defined for when the message arrives at the channel handler?
To get the effect of a capped list when I push data to Redis, I need to execute a Redis "push" followed by an "rtrim", as outlined here: http://redis.io/topics/data-types-intro. If I am to build a custom spring-integration / spring-data module, I believe I see support for the "ltrim" operation but not for "rtrim" here: http://docs.spring.io/spring-data/redis/docs/1.7.0.RC1/api/
Any advice on how or where to start, or an easier approach, would be appreciated.

Actually, Redis itself has no RTRIM command. It isn't needed, because negative indexes with LTRIM give the same behavior:
start and end can also be negative numbers indicating offsets from the end of the list, where -1 is the last element of the list, -2 the penultimate element and so on.
I think you should use <int-redis:store-outbound-channel-adapter> and add something like this to its configuration:
<int-redis:request-handler-advice-chain>
    <beans:bean class="org.springframework.integration.handler.advice.ExpressionEvaluatingRequestHandlerAdvice">
        <beans:property name="onSuccessExpression" value="#redisTemplate.boundListOps(${keyExpression}).trim(1, -1)"/>
    </beans:bean>
</int-redis:request-handler-advice-chain>
This removes the oldest element from the Redis list after each successful write.
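
To make the index arithmetic concrete, here is a small plain-Java sketch that mimics how LTRIM treats negative indexes on an in-memory list. It does not talk to Redis or use spring-data-redis; LtrimSketch, ltrim and pushCapped are names invented for the illustration, and the bounds handling reflects my reading of the LTRIM documentation quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// In-memory illustration only; no Redis connection is involved.
final class LtrimSketch {

    /** Keeps the elements from start to stop (both inclusive), LTRIM-style. */
    static <T> List<T> ltrim(List<T> list, int start, int stop) {
        int size = list.size();
        if (start < 0) {
            start = Math.max(size + start, 0); // -1 is the last element
        }
        if (stop < 0) {
            stop = size + stop;
        }
        if (stop >= size) {
            stop = size - 1;                   // clamp to the end of the list
        }
        if (start > stop || start >= size) {
            return new ArrayList<>();          // nothing left in range
        }
        return new ArrayList<>(list.subList(start, stop + 1));
    }

    /** Appends at the tail (like RPUSH), then keeps only the newest maxSize elements (maxSize > 0). */
    static <T> List<T> pushCapped(List<T> list, T value, int maxSize) {
        List<T> grown = new ArrayList<>(list);
        grown.add(value);
        return ltrim(grown, -maxSize, -1);
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("a", "b", "c");
        System.out.println(ltrim(list, 1, -1));       // drops the oldest (head) element
        System.out.println(pushCapped(list, "d", 3)); // capped at three elements
    }
}
```

Note that trim(1, -1) drops exactly one element per call, while a range like (-N, -1) caps the list at N elements however many were pushed; which one fits depends on whether the list is already at its target size.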

Spring Integration: Is there a way to aggregate from "all" messages in a channel?

I want to write a batch job that reads website access log files (CSV) from a path every day and does some analysis using Spring Integration.
This is a simplified version of the input CSV file:
srcIp1,t1,path1
srcIp2,t2,path2
srcIp1,t3,path2
srcIp1,t4,path1
The number of accesses per source IP and path is to be calculated after some filtering logic.
I made an input channel whose payload is the parsed log line, then applied a filter, and finally an aggregator to calculate the result.
The problem is choosing the right group release strategy: the default release strategy (SequenceSizeReleaseStrategy) does not work.
None of the other out-of-the-box Spring Integration release strategies (ExpressionEvaluatingReleaseStrategy, MessageCountReleaseStrategy, MethodInvokingReleaseStrategy, SequenceSizeReleaseStrategy, TimeoutCountSequenceSizeReleaseStrategy) seems to fit my needs either.
Or does Spring Integration assume that a channel carries a message stream with no concept of an "end of messages", which would make it unsuitable for my problem?

You can write a custom ReleaseStrategy if you have some way to tell when the group is complete. It is consulted each time a message is added to the group.
Or, you can use a group-timeout to release a partial group after a period in which no new messages have arrived.
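
As a rough illustration of the first option, the plain-Java sketch below treats the group as complete once a sentinel line has been seen and then counts accesses per source IP and path. It deliberately avoids the Spring Integration types: AccessLogGroup, the "EOF" marker and the method names are all assumptions made for the example, and in a real flow the same check would live in your ReleaseStrategy implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for a message group; not the Spring Integration API.
final class AccessLogGroup {

    // Hypothetical sentinel that whoever reads the file sends after the last line.
    static final String END_MARKER = "EOF";

    private final List<String> lines = new ArrayList<>();
    private boolean endSeen;

    /** Adds one parsed CSV line, or records that the end marker has arrived. */
    void add(String line) {
        if (END_MARKER.equals(line)) {
            endSeen = true;
        } else {
            lines.add(line);
        }
    }

    /** Consulted after every add, like a release strategy would be. */
    boolean canRelease() {
        return endSeen;
    }

    /** Counts accesses keyed by "srcIp,path" for lines shaped like srcIp,time,path. */
    Map<String, Integer> countBySourceAndPath() {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            String[] fields = line.split(",");
            counts.merge(fields[0] + "," + fields[2], 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        AccessLogGroup group = new AccessLogGroup();
        for (String line : new String[] {
                "srcIp1,t1,path1", "srcIp2,t2,path2", "srcIp1,t3,path2", "srcIp1,t4,path1", END_MARKER}) {
            group.add(line);
        }
        if (group.canRelease()) {
            System.out.println(group.countBySourceAndPath());
        }
    }
}
```

If the producer cannot send an end marker, the group-timeout route is probably the simpler one.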

aggregator release strategy depend on another service activator running

I understand how aggregating based on size works, but I also want the release strategy to depend on whether another step in the pipeline is still running. The idea is that I move files to a certain directory "source", aggregate enough files, then move them from "source" to "stage" and process the staged files. While this process is running I don't want to put more files in stage, but I do want to keep adding files to the source folder (that part is handled by a dispatcher channel connected to the file inbound adapter before the aggregator).
<int:aggregator id="filesBuffered"
        input-channel="sourceFilesProcessed"
        output-channel="stagedFiles"
        release-strategy-expression="size() == 10"
        correlation-strategy-expression="'mes-group'"
        expire-groups-upon-completion="true"
/>
<int:channel id="stagedFiles" />
<int:service-activator input-channel="stagedFiles"
        output-channel="readyForMes"
        ref="moveToStage"
        method="move" />
So, as you can see, I don't want to release the aggregated messages while an existing instance of the moveToStage service activator is running.
I thought about making the stagedFiles channel a queue channel, but that doesn't seem right: I want the files to be passed to moveToStage as a Collection, and I assume a queue channel would deliver a single file at a time. Instead I want to reach a threshold (e.g. 10 files) and pass those files to stagedFiles so that moveToStage can process them; until that step is done, I want the aggregator to keep aggregating files and then release everything it has aggregated.
Thanks

I suggest declaring a flag as an AtomicBoolean bean, setting it from your moveToStage#move, and checking its state in the release strategy (in SpEL, @stagingFlag references the bean named stagingFlag):
release-strategy-expression="size() >= 10 and @stagingFlag.get()"
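
The flag idea could look roughly like this in plain Java. It is a sketch under assumptions, not your actual bean: StagingGate and its method names are invented, and here the flag is true while moveToStage is idle so that it can be and-ed directly with the size check.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of the flag shared between the release check and move().
final class StagingGate {

    // true while no move is in progress, so a new batch may be released
    private final AtomicBoolean idle = new AtomicBoolean(true);

    /** Same condition as the expression: enough files and the stage step is idle. */
    boolean canRelease(int groupSize, int threshold) {
        return groupSize >= threshold && idle.get();
    }

    /** Wraps the staging work; the flag is reset even if the work throws. */
    void move(Runnable work) {
        idle.set(false);
        try {
            work.run();
        } finally {
            idle.set(true);
        }
    }

    public static void main(String[] args) {
        StagingGate gate = new StagingGate();
        System.out.println("before move: " + gate.canRelease(10, 10));
        gate.move(() -> System.out.println("during move: " + gate.canRelease(12, 10)));
        System.out.println("after move: " + gate.canRelease(12, 10));
    }
}
```

Because a release strategy is only consulted when a message is added to the group, a group that reached the threshold while the move was running is released on the next arriving file rather than the moment the move finishes.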

Nlog: send mail when Loglevel >= Loglevel.Error with last x messages

I'd like to do this (from the log4net documentation) with NLog:
This example shows how to deliver only significant events. A LevelEvaluator is specified with a threshold of WARN. This means that an email will be sent for each WARN or higher level message that is logged. Each email will also contain up to 512 (BufferSize) previous messages of any level to provide context. Messages not sent will be discarded.
Is it possible?
I only found this on codeproject.
But it uses a wrapper target that flushes based on the number of messages, not on the log level.
Thanks
Tobi

There is a QueuedTargetWrapper (a target that buffers log events and sends them in batches to the wrapped target) that seems to address the requirement. I haven't tried it yet.
There is a related discussion "The Benefits of Trace Level Logging in Production Without the Drawback of Enormous Files"

A simple solution that writes the last 200 log events on error:
<target name="String" xsi:type="AutoFlushWrapper" condition="level >= LogLevel.Error" flushOnConditionOnly="true">
    <target xsi:type="BufferingWrapper"
            bufferSize="200"
            overflowAction="Discard">
        <target xsi:type="wrappedTargetType" ...target properties... />
    </target>
</target>
See also: https://github.com/nlog/NLog/wiki/BufferingWrapper-target
