My requirement is to download multiple files from a remote directory on a file server in my Spring Batch application. One of the examples here suggests using a SourcePollingChannelAdapter. In that example the adapter is started with adapter.start(), but it is never stopped. So how does its lifecycle work? My requirement is to download first, and only when all files are downloaded, proceed to read them. But the download appears to be an asynchronous process. So how would I get notified when all the files are downloaded and it is safe to proceed? I don't see any methods for injecting handlers into the adapter or the pollable channel.
Sample config used:
<int-sftp:inbound-channel-adapter id="sftpInbondAdapterBean"
        auto-startup="false" channel="receiveChannelBean" session-factory="sftpSessionFactory"
        local-directory="${sftp.download.localDirResource}"
        remote-directory="${sftp.download.remoteFilePath}"
        auto-create-local-directory="true" delete-remote-files="false"
        filename-regex=".*\.csv$">
    <int:poller fixed-rate="5000" max-messages-per-poll="3" />
</int-sftp:inbound-channel-adapter>
<int:channel id="receiveChannelBean">
    <int:queue/>
</int:channel>
It throws an InterruptedException if adapter.stop() is called explicitly. If stop() is not called, the files are downloaded asynchronously without blocking, so the next step doesn't know whether the download is completed or still in progress.
EDIT: Code snippet of the DownloadingTasklet:
@Override
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
    try {
        sftpInbondAdapter.start();
        Thread.sleep(15000); // --- Is this mandatory?
    } catch (Exception e) {
        throw new BatchClientException("Batch File Download was un-successful !", e);
    } finally {
        // sftpInbondAdapter.stop();
        Logger.logDebug("BATCH_INPUT_FILE_DOWNLOAD", "COMPLETED");
    }
    return RepeatStatus.FINISHED;
}
The SftpInboundFileSynchronizingMessageSource works in two phases:
First of all it does synchronization from the remote directory to the local one:
Message<File> message = this.fileSource.receive();
if (message == null) {
    this.synchronizer.synchronizeToLocalDirectory(this.localDirectory);
    message = this.fileSource.receive();
}
And as you see, only after that does it start to emit the files as messages, and only the local files at that. So the download from the remote server has already completed by then.
The next download attempt happens only after all local files have been cleared during the sending process.
So, what you want is built-in functionality.
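If the download still has to be driven from your Spring Batch tasklet, one way to avoid the arbitrary Thread.sleep() is to drain the queue channel until receive(timeout) returns null, which, given the behaviour described above, means the synchronizer has already transferred and emitted everything. This is only a minimal sketch: the 10-second timeout is arbitrary, and sftpInbondAdapter (the SourcePollingChannelAdapter) and receiveChannelBean (the PollableChannel) are assumed to be injected from the config in the question.
@Override
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
    sftpInbondAdapter.start();
    try {
        List<File> downloaded = new ArrayList<>();
        Message<?> message;
        // receive(timeout) blocks until a file message arrives or the timeout elapses;
        // a null return means no more files were emitted within 10 seconds.
        while ((message = receiveChannelBean.receive(10000)) != null) {
            downloaded.add((File) message.getPayload());
        }
        Logger.logDebug("BATCH_INPUT_FILE_DOWNLOAD", "COMPLETED: " + downloaded.size() + " files");
    } finally {
        sftpInbondAdapter.stop();
    }
    return RepeatStatus.FINISHED;
}
All downloaded files are then available locally before the step returns, so the next step can read them safely.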
EDIT
Some DEBUG logs from our tests:
19:29:32,005 DEBUG task-scheduler-1 util.SimplePool:190 - Obtained new org.springframework.integration.sftp.session.SftpSession@26f16625.
19:29:32,036 DEBUG task-scheduler-1 inbound.SftpInboundFileSynchronizer:287 - cannot copy, not a file: /sftpSource/subSftpSource
19:29:32,036 DEBUG task-scheduler-1 session.CachingSessionFactory:187 - Releasing Session org.springframework.integration.sftp.session.SftpSession@26f16625 back to the pool.
19:29:32,036 DEBUG task-scheduler-1 util.SimplePool:221 - Releasing org.springframework.integration.sftp.session.SftpSession@26f16625 back to the pool
19:29:32,037 DEBUG task-scheduler-1 inbound.SftpInboundFileSynchronizer:270 - 3 files transferred
19:29:32,037 DEBUG task-scheduler-1 file.FileReadingMessageSource:380 - Added to queue: [local-test-dir\rollback\ sftpSource1.txt, local-test-dir\rollback\sftpSource2.txt]
19:29:32,043 INFO task-scheduler-1 file.FileReadingMessageSource:368 - Created message: [GenericMessage [payload=local-test-dir\rollback\ sftpSource1.txt, headers={id=40f75bd1-2150-72a4-76f0-f5c619a246da, timestamp=1465514972043}]]
This is with a config like:
<int-sftp:inbound-channel-adapter
session-factory="sftpSessionFactory"
channel="requestChannel"
remote-directory="/sftpSource"
local-directory="local-test-dir/rollback"
auto-create-local-directory="true"
local-filter="acceptOnceFilter">
<int:poller fixed-delay="1" max-messages-per-poll="0"/>
</int-sftp:inbound-channel-adapter>
Notice how it says 3 files transferred, where one of them is a directory and is therefore skipped (cannot copy, not a file).
Then it says Added to queue:
And only after that does it start to emit them as messages.
And all of that happens within a single poll, even though I have fixed-delay="1", i.e. a delay of only 1 millisecond.
The max-messages-per-poll="0" means everything is processed within one polling task.
setThreadKey is getting invoked for every file, but clearThreadKey is getting invoked only for alternate files. Whenever a flow invokes clearThreadKey, the file is not uploaded to the SFTP destination path. Out of 10 files, only 5 files are getting uploaded. I have customized the DelegatingSessionFactory class to decide the threadKey. I guess I am doing something wrong. I think clearThreadKey is getting invoked even before the file is uploaded to the destination, which is why the flow is not able to upload the file. But it is supposed to be invoked only after uploading the file. At the same time, strangely, clearThreadKey is not invoked for all the files.
public Message<?> setThreadKey(Message<?> message, Object key) {
    String keyStr = String.valueOf(key).split("-")[0];
    this.threadKey.set(keyStr);
    return message;
}

public Message<?> clearThreadKey(Message<?> message, Object key) {
    this.threadKey.remove();
    return message;
}
integration.xml
<int-file:inbound-channel-adapter directory="myowndirectorypath" id="fileInbound" channel="sftpChannel">
<int:poller fixed-rate="1000" max-messages-per-poll="100"/>
</int-file:inbound-channel-adapter>
<int:channel id="sftpChannel"/>
<int:service-activator input-channel="sftpChannel" output-channel="outsftpChannel"
expression="#dsf.setThreadKey(#root, headers['file_name'])"/>
<int:channel id="outsftpChannel"/>
<int-sftp:outbound-channel-adapter id="sftpOutboundAdapter" session-factory="dsf"
channel="outsftpChannel" charset="UTF-8"
remote-directory-expression="#sftpConfig.determineRemoteDirectory(headers['file_name'])"/>
<int:service-activator input-channel="outsftpChannel" output-channel="nullChannel"
expression="#dsf.clearThreadKey(#root, headers['file_name'])" requires-reply="true"/>
You have two subscribers on outsftpChannel; they will receive alternate messages in round-robin fashion because it is a DirectChannel.
You need a publish-subscribe-channel so that both recipients will receive the message.
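In the XML above that simply means replacing <int:channel id="outsftpChannel"/> with <int:publish-subscribe-channel id="outsftpChannel"/>. For completeness, a minimal Java-config sketch of the same idea (the bean name comes from the question; the class name is illustrative):
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.channel.PublishSubscribeChannel;
import org.springframework.messaging.MessageChannel;

@Configuration
public class SftpChannelConfig {

    // Both subscribers (the SFTP outbound adapter and the clearThreadKey service
    // activator) now receive every message. Without a task executor the subscribers
    // are invoked sequentially on the sending thread, so the upload completes before
    // the thread key is cleared when the adapter is ordered first.
    @Bean
    public MessageChannel outsftpChannel() {
        return new PublishSubscribeChannel();
    }
}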
My requirement is to poll a directory for a specified time interval, say 10 minutes. If a file with a particular extension, say *.xml, is found in the directory, it is simply consumed (i.e. picked up and deleted) and its name is printed; otherwise, after the specified interval (say 10 minutes), an email is sent saying that the file has not been picked up (i.e. consumed) or has not arrived.
There are 2 options: either I do it through Spring Integration or with the WatchService of core Java. The following is the Spring Integration code I have written so far:
<int:channel id="fileChannel" />
<int:channel id="processedFileChannel" />
<context:property-placeholder location="localProps.properties" />
<int:poller default="true" fixed-rate="10000" id="poller"></int:poller>
<int-file:inbound-channel-adapter
directory="file:${inbound.folder}" channel="fileChannel"
filename-pattern="*.xml" />
<int:service-activator input-channel="fileChannel"
ref="fileHandlerService" method="processFile" output-channel="processedFileChannel"/>
<bean id="fileHandlerService" class="com.practice.cmrs.springintegration.Poll" />
The above code successfully polls the folder for the given file pattern. Now I have 2 things to do:
1) Stop polling after a particular (configurable) time interval, say 10 minutes.
2) Check whether a file with the particular extension is in the folder: if it is, consume and then delete it; otherwise send an email to a group of people (the email part is done).
Please help me with the above 2 points.
You can use a Smart Poller to do things like that.
You can adjust the poller and/or take different actions if/when the poll results in a message.
Version 4.2 introduced the AbstractMessageSourceAdvice. Any Advice objects in the advice-chain that subclass this class are applied only to the receive operation. Such classes implement the following methods:
boolean beforeReceive(MessageSource<?> source)
This method is called before the MessageSource.receive() method. It enables you to examine and/or reconfigure the source at this time. Returning false cancels this poll (similar to the PollSkipAdvice mentioned above).
Message<?> afterReceive(Message<?> result, MessageSource<?> source)
This method is called after the receive() method; again, you can reconfigure the source, or take any action perhaps depending on the result (which can be null if there was no message created by the source). You can even return a different message!
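A hedged sketch of such an advice for the two requirements above: it polls normally until a configurable deadline, then cancels further polls and, if no file was ever received, triggers the (already written) email notification. The class name, deadline and mail hook are all illustrative; the advice bean would be referenced from the <int:advice-chain> of the poller.
import org.springframework.integration.aop.AbstractMessageSourceAdvice;
import org.springframework.integration.core.MessageSource;
import org.springframework.messaging.Message;

// Illustrative advice (Spring Integration 4.2+): give the directory a fixed amount
// of time to produce a matching file; if the deadline passes without one, send the
// notification mail once and skip every further poll.
public class TimeoutPollerAdvice extends AbstractMessageSourceAdvice {

    private final long deadlineMillis = System.currentTimeMillis() + 10 * 60 * 1000; // 10 minutes

    private volatile boolean fileSeen;

    private volatile boolean alerted;

    @Override
    public boolean beforeReceive(MessageSource<?> source) {
        if (System.currentTimeMillis() < deadlineMillis) {
            return true;                      // still inside the polling window
        }
        if (!fileSeen && !alerted) {
            alerted = true;
            // mailService.sendFileMissingAlert();  // hypothetical hook to the existing email code
        }
        return false;                         // returning false cancels this poll (and every later one)
    }

    @Override
    public Message<?> afterReceive(Message<?> result, MessageSource<?> source) {
        if (result != null) {
            fileSeen = true;                  // a matching *.xml arrived in time
        }
        return result;                        // pass the message on unchanged
    }
}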
I am attempting to process a file exactly like this question:
read a remote file line by line
The one answer to that question suggests using RemoteFileTemplate, but I am attempting to use the -stream option, as suggested in the last comment on the answer. Also, that seems to be the whole point of the -stream option: to get a stream.
My implementation successfully obtains the InputStream and kicks off a separate thread to process it in a BufferedReader.
This works happily on my Windows laptop, but when deployed on a Linux machine I sometimes get a "Write end dead" exception when trying to read the BufferedReader in my processing thread.
Research into this suggests the writer is not closing the stream properly:
Write end dead exception using PipedInputStream
So either this is a bug in spring-integration or there is something missing in my configuration. I am hoping it is the latter and could use feedback on the way I am obtaining the InputStream. If I am getting the InputStream correctly, how can I get the writer to close the stream after writing?
Thanks!
Here is outbound gateway configuration:
<int-ftp:outbound-gateway session-factory="ftpClientFactory"
request-channel="inboundGetStream" command="get" command-options="-stream"
expression="payload" remote-directory="/" reply-channel="stream">
</int-ftp:outbound-gateway>
<int:channel id="stream">
<int:queue/>
</int:channel>
Here is where I obtain the InputStream:
public InputStream openFileStream(final String filename, final String directory) throws Exception {
    if (inboundGetStream.send(MessageBuilder.withPayload(directory + "/" + filename).build(), ftpTimeout)) {
        return getInputStream();
    }
    return null;
}

private InputStream getInputStream() {
    Message<?> msgs = stream.receive(ftpTimeout);
    if (msgs == null) {
        return null;
    }
    InputStream is = (InputStream) msgs.getPayload();
    return is;
}
It would be better if you shared more of the stack trace to investigate.
Plus, I don't see that you close the session, as is recommended:
When consuming remote files as streams, the user is responsible for closing the Session after the stream is consumed. For convenience, the Session is provided in the file_remoteSession header.
<int:service-activator input-channel="markers"
expression="payload.mark.toString().equals('END') ? headers['file_remoteSession'].close() : null"/>
That is a sample from the Reference Manual.
On the other hand, it would be better to consume the InputStream on the same thread on which it was obtained, rather than handing such a low-level, session-tied resource off to a queue.
Also, please check this bug: https://bugs.eclipse.org/bugs/show_bug.cgi?id=359184 . Maybe you really should upgrade to something more recent: http://search.maven.org/#search|ga|1|g%3A%22com.jcraft%22
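Putting both points together, a rough sketch (reusing inboundGetStream, stream and ftpTimeout from the question's code, with process(line) as a hypothetical per-line handler): read the stream on the calling thread and close both the reader and the Session taken from the file_remoteSession header when done.
public void readRemoteFile(final String filename, final String directory) throws IOException {
    if (!inboundGetStream.send(MessageBuilder.withPayload(directory + "/" + filename).build(), ftpTimeout)) {
        return;
    }
    Message<?> reply = stream.receive(ftpTimeout);
    if (reply == null) {
        return;
    }
    // The remote Session that backs the stream travels along in a header.
    Session<?> session = (Session<?>) reply.getHeaders().get("file_remoteSession");
    try (BufferedReader reader =
            new BufferedReader(new InputStreamReader((InputStream) reply.getPayload()))) {
        String line;
        while ((line = reader.readLine()) != null) {
            process(line);   // hypothetical per-line processing
        }
    } finally {
        if (session != null) {
            session.close(); // release the FTP session only after the stream is fully consumed
        }
    }
}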
I'm trying to do the exact opposite of what is being accomplished here:
How do I create a tcp-inbound-gateway which detects connection loss and auto-reconnects?
I have a set of collaborating client tcp adapters taken from the sample application. In general the application will be sending a high volume of client requests to a 3rd party over a shared connection, but there will be times when there are no requests coming through at all.
Because of this, there is a requirement from the 3rd party that says I should send them a ping message every 2 minutes. If the response to the ping isn't received within a predefined period of time (which could be shorter than the timeout for other calls), I should kill the connection and reconnect.
My first thought was to create a scheduled task that sends the ping every 2 minutes, but I'm not sure how I would kill the connection and re-establish it from inside the task. Another option I was considering was to use a connection interceptor and time the request and response, but that doesn't seem right. I am brand new to SI, so any push in the right direction would be helpful.
Another idea I just had was autowiring the tcp-outbound-channel-adapter into the job and calling retryConnection().
<converter>
<beans:bean class="org.springframework.integration.samples.tcpclientserver.ByteArrayToStringConverter" />
</converter>
<!-- Given we are looking for performance, let's use
the most performant wire protocol. -->
<beans:bean id="fastestWireFormatSerializer" class="org.springframework.integration.ip.tcp.serializer.ByteArrayLengthHeaderSerializer">
<beans:constructor-arg value="1" />
</beans:bean>
<!-- Client side -->
<gateway id="gw"
service-interface="org.springframework.integration.samples.tcpclientserver.SimpleGateway"
default-request-channel="input" />
<ip:tcp-connection-factory id="client"
type="client"
host="localhost"
port="${availableServerSocket}"
single-use="false"
serializer="fastestWireFormatSerializer"
deserializer="fastestWireFormatSerializer"
so-timeout="10000" />
<publish-subscribe-channel id="input" />
<ip:tcp-outbound-channel-adapter id="outAdapter.client"
order="2"
channel="input"
connection-factory="client" /> <!-- Collaborator -->
<!-- Also send a copy to the custom aggregator for correlation and
so this message's replyChannel will be transferred to the
aggregated message.
The order ensures this gets to the aggregator first -->
<bridge input-channel="input" output-channel="toAggregator.client"
order="1"/>
<!-- Asynch receive reply -->
<ip:tcp-inbound-channel-adapter id="inAdapter.client"
channel="toAggregator.client"
connection-factory="client" /> <!-- Collaborator -->
<!-- dataType attribute invokes the conversion service, if necessary -->
<channel id="toAggregator.client" datatype="java.lang.String" />
<aggregator input-channel="toAggregator.client"
output-channel="toTransformer.client"
correlation-strategy-expression="payload.substring(0,3)"
release-strategy-expression="size() == 2" />
<transformer input-channel="toTransformer.client"
expression="payload.get(1)"/> <!-- The response is always second -->
Update
I am using SI 3.0.7.RELEASE
The only exception I see in the log is:
22:02:38.956 ERROR [pool-1-thread-1][org.springframework.integration.ip.tcp.connection.TcpNetConnection] Read exception localhost:4607:39129:464bd042-dd9c-4639-8d1d-cdda61dc988a SocketTimeoutException:Read timed out
This is when the so-timeout on the clientFactory is set to 10 seconds and I force the server to sleep for 15 seconds. Nothing is ever returned to the SimpleGateway call. It just sits there and waits forever.
Sample Code using the configuration listed above:
String input = "ping";
String result = null;
System.out.println("Sending message: " + input);
try{
result = gateway.send(input);
}catch(Exception e){
System.out.println("There was an exception sending the message");
}
System.out.println("response: " + result);
Output:
Sending message: test
22:12:17.093 ERROR [pool-1-thread-1][org.springframework.integration.ip.tcp.connection.TcpNetConnection] Read exception localhost:4607:39232:fe90d394-edc3-440a-ab68-34e7162db6ec SocketTimeoutException:Read timed out
The gateway call never returns.
There's a new component in 4.2 called Thread Barrier.
This allows you to wait for some (configurable) time until an async event occurs.
See the barrier sample for an example.
The ping reply would 'release' the waiting thread; in the case of the timeout you could use a request handler advice on the handler to catch the exception and take the action you need (you can get a list of open connections from the factory and close them by id).
BTW, we could probably rewrite that sample to use the barrier instead of an aggregator.
There's really no need to reopen the connection (unless you anticipate unsolicited responses on a new connection); it will be opened on the next send.
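A rough sketch of the scheduled keep-alive idea, under the assumptions that SimpleGateway has a reply timeout configured and that the client connection factory bean from the config above is injected; on a missed pong the open connections are closed by id, as suggested, and the factory reopens on the next send:
import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.integration.ip.tcp.connection.AbstractClientConnectionFactory;
import org.springframework.integration.samples.tcpclientserver.SimpleGateway;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

// Illustrative keep-alive: ping every 2 minutes; if no reply arrives in time,
// close the shared connection so the next real request reconnects transparently.
@Component
public class PingTask {

    @Autowired
    private SimpleGateway gateway;

    @Autowired
    private AbstractClientConnectionFactory client;   // the "client" connection factory bean

    @Scheduled(fixedRate = 120000)
    public void ping() {
        String reply = null;
        try {
            reply = gateway.send("ping");              // correlated like any other request
        } catch (Exception e) {
            // e.g. the propagated SocketTimeoutException described below
        }
        if (reply == null) {
            List<String> ids = client.getOpenConnectionIds();
            for (String id : ids) {
                client.closeConnection(id);            // close by id; reopened on next send
            }
        }
    }
}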
EDIT:
I need to see more of your log; I just tried it and got the following:
2015-09-28 09:05:53,995 [pool-2-thread-1] ERROR: org.springframework.integration.ip.tcp.connection.TcpNetConnection - Read exception localhost:5678:51323:f3537a39-feb4-4237-9508-7e67c1b79654
java.net.SocketTimeoutException: Read timed out
...
2015-09-28 09:05:53,996 [pool-2-thread-1] TRACE: org.springframework.integration.ip.tcp.TcpOutboundGateway - onMessage: localhost:5678:51323:f3537a39-feb4-4237-9508-7e67c1b79654([Payload=java.net.SocketTimeoutException: Read timed out][Headers={timestamp=1443445553996, id=0b3fae63-f431-c09c-ac4e-26ce4b012b6b, ip_connectionId=localhost:5678:51323:f3537a39-feb4-4237-9508-7e67c1b79654}])
2015-09-28 09:05:53,996 [main] DEBUG: org.springframework.integration.ip.tcp.TcpOutboundGateway - second chance
2015-09-28 09:05:55,998 [main] ERROR: org.springframework.integration.ip.tcp.TcpOutboundGateway - Tcp Gateway exception
org.springframework.integration.MessagingException: Exception while awaiting reply
...
Caused by: java.net.SocketTimeoutException: Read timed out
...
However, it is broken in 4.2 (and for single-use sockets) so I have opened a JIRA Issue. But yours is not single-use so I would expect the same results as mine.
EDIT 2:
Skipping a socket timeout because we have a recent send localhost:4607:26489:926b1c1f-49cc-4f44-b20d-78f2e94e82d1
I should have mentioned this before.
Because the connection factory supports completely asynchronous messaging, we might have to wait for 2 timeouts before closing the socket and propagating the exception to the gateway thread.
The reasoning is thus (using your 10 second so-timeout):
Connection created at t+0; the read thread is started and is blocked in the socket (not interruptible).
Message sent at t+9.
Read times out at t+10.
It would be too early to use this timeout so we ignore it because there has been a send within the last 10 seconds.
Read times out again at t+20.
Exception propagated to caller.
On average, the gateway thread will get the timeout at so-timeout * 1.5, but it will be in the range so-timeout to so-timeout * 2.
EDIT 3:
I can't tell from the log why it's not being propagated in your case.
I just reverted the tcp-client-server sample on this branch to 3.0.7 and changed the test to force a timeout (after skipping one), and it all worked fine for me.
We need to figure out what is different about your case and mine.
You can see the changes I made to the sample in the last commit on that branch.
If you can come up with a similar test case that reproduces the problem, I can help you debug it.
At the beginning of my flow I have a file inbound adapter which reads a directory periodically:
<int-file:inbound-channel-adapter id="filteredFiles"
directory="${controller.cycle.lists.input.dir}"
channel="semaphoreChannel" filename-pattern="*.xml">
<int:poller fixed-delay="3000"/>
</int-file:inbound-channel-adapter>
When the SI workflow finishes, it never runs again. It seems the poller is dead and has stopped working.
There aren't any error messages or warnings in the log.
Channel configuration:
<int:channel id="semaphoreChannel" datatype="java.io.File"/>
Second configuration:
<int-file:inbound-channel-adapter id="filteredFiles"
directory="${controller.cycle.lists.input.dir}"
channel="semaphoreChannel" filename-pattern="*.xml">
<int:poller cron="0 * * * * *" />
</int-file:inbound-channel-adapter>
It does not make sense.
Since you use default settings for other <poller> options, you end up with:
public static final int MAX_MESSAGES_UNBOUNDED = Integer.MIN_VALUE;
private volatile long maxMessagesPerPoll = MAX_MESSAGES_UNBOUNDED;
That means the FileReadingMessageSource reads all files matching the provided pattern during a single poll cycle.
The poller doesn't stop working; there is simply nothing left in the directory to read.
Change it to max-messages-per-poll="1" and let us know how it goes.
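For reference, a Java-config sketch of the same adapter with that setting (Spring Integration 4.x annotation config; the directory path is a placeholder for the property in the question):
import java.io.File;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.annotation.InboundChannelAdapter;
import org.springframework.integration.annotation.Poller;
import org.springframework.integration.core.MessageSource;
import org.springframework.integration.file.FileReadingMessageSource;
import org.springframework.integration.file.filters.SimplePatternFileListFilter;

@Configuration
public class FilePollingConfig {

    // One file per poll: the adapter emits a message every 3 seconds while files
    // remain, instead of draining the whole directory in a single cycle.
    @Bean
    @InboundChannelAdapter(value = "semaphoreChannel",
            poller = @Poller(fixedDelay = "3000", maxMessagesPerPoll = "1"))
    public MessageSource<File> filteredFiles() {
        FileReadingMessageSource source = new FileReadingMessageSource();
        source.setDirectory(new File("/path/from/controller.cycle.lists.input.dir")); // placeholder
        source.setFilter(new SimplePatternFileListFilter("*.xml"));
        return source;
    }
}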
On the other hand, you can switch on DEBUG logging for org.springframework.integration.endpoint.SourcePollingChannelAdapter and you will see a message like this in the logs:
Received no Message during the poll, returning 'false'