spring integration sftp channel - spring-integration

On the remote SFTP server I have two folders, [ready] and [process]. What I need to do is first move the file from ready to process, and then move that file to a local directory, using a single channel.
Please check my code; is this correct?
My code works fine, but I am not sure which happens first: the move to the remote process folder or the copy to the local folder?
@Bean
public IntegrationFlow remoteToLocal() {
    return IntegrationFlows
            .from(Sftp.inboundAdapter(sftpSessionFactory())
                            .remoteDirectory(sftpProperties.getRemoteRootDir() + "/ready")
                            .regexFilter(FILE_PATTERN_REGEX)
                            .deleteRemoteFiles(true)
                            .localDirectory(new File(mmFileProperties.getMcfItes() + mmFileProperties.getInboundDirectory()))
                            .preserveTimestamp(true)
                            .temporaryFileSuffix(".tmp"),
                    e -> e.poller(Pollers.fixedDelay(sftpProperties.getPollerIntervalMs()))
                            .id("sftpInboundAdapter"))
            .handle(Sftp.outboundAdapter(mmSftpSessionFactory())
                    .remoteDirectory(sftpProperties.getRemoteRootDir() + "/process")
                    .temporaryFileSuffix(".tmp"))
            .get();
}
Please check the new code; it is not working.
private StandardIntegrationFlow remoteToLocalFlow(final String localDirectory, final String remoteDirectoryProcessing, final String adapterName) {
    return IntegrationFlows
            .from(Sftp.inboundAdapter(mmSftpSessionFactory())
                            .remoteDirectory(remoteRootDir + remoteDirectoryProcessing)
                            .regexFilter(FILE_PATTERN_REGEX)
                            .deleteRemoteFiles(true)
                            .localDirectory(Paths.get(localDirectory).toFile())
                            .preserveTimestamp(true)
                            .temporaryFileSuffix(".tmp"),
                    e -> {
                        e.poller(Pollers.fixedDelay(mmSftpProperties.getPollerIntervalMs()))
                                .id(adapterName);
                    })
            .handle(m -> logger.trace("File received from sftp interface: {}", m))
            .handleWithAdapter(h -> h.sftpGateway(sftpSessionFactory(), AbstractRemoteFileOutboundGateway.Command.MV, "payload")
                    .renameExpression(remoteRootDir + ready)
                    .localDirectoryExpression(remoteRootDir + process))
            .get();
}

It looks ok, but it's not the best way to do it; you are copying the file, deleting it and sending it back with another name.
Use an SftpOutboundGateway with a MV (move) command instead.
You can also use a gateway to list and get files.
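For example, here is a rough (untested) sketch of a flow driven entirely by outbound gateways; the trigger channel, the /remote/ready and /remote/process directories, the local directory, and the use of the file_renameTo header from the MV reply are my assumptions, not code from the question:

@Bean
public IntegrationFlow moveThenFetchFlow() {
    return IntegrationFlows.from("moveAndFetchChannel")
            // list only the file names sitting in the remote "ready" directory
            .handle(Sftp.outboundGateway(sftpSessionFactory(),
                            AbstractRemoteFileOutboundGateway.Command.LS, "'/remote/ready'")
                    .options(AbstractRemoteFileOutboundGateway.Option.NAME_ONLY))
            .split()
            // MV renames (moves) each file on the server: ready -> process
            .handle(Sftp.outboundGateway(sftpSessionFactory(),
                            AbstractRemoteFileOutboundGateway.Command.MV, "'/remote/ready/' + payload")
                    .renameExpression("'/remote/process/' + payload"))
            // the MV reply carries the new remote path in the file_renameTo header,
            // so GET downloads the file from the "process" directory to the local dir
            .handle(Sftp.outboundGateway(sftpSessionFactory(),
                            AbstractRemoteFileOutboundGateway.Command.GET, "headers['file_renameTo']")
                    .localDirectory(new File("/local/inbound")))
            .channel("downloadedFiles")
            .get();
}

Whatever sends a message to moveAndFetchChannel (a poller, a cron trigger, etc.) starts the LS -> MV -> GET sequence, so nothing is downloaded until the server-side move has completed.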

Related

Copy remote file in sftp using Spring Integration

I need to copy/duplicate a remote file on an SFTP server and also rename it when copied. I read here that copying a remote file over SFTP isn't supported, so the only available option I had was to GET the file to local, then PUT it again to the SFTP server and delete the local file. I have successfully achieved my goal, but the problem is a log entry printed from org.springframework.core.log.LogAccessor: I have no idea where it is coming from.
Code that helps in copying the remote file:
@Bean
public IntegrationFlow copyRemoteFile() {
    return IntegrationFlows.from("integration.channel.copy")
            .handle(Sftp.outboundGateway(sftpSessionFactory(),
                    AbstractRemoteFileOutboundGateway.Command.GET,
                    "headers[" + COPY_SOURCE_PATH.value + "]+'/'+" +
                            "headers[" + COPY_SOURCE_FILENAME.value + "]")
                    .autoCreateLocalDirectory(true)
                    .fileExistsMode(FileExistsMode.REPLACE)
                    .localDirectory(new File(localPath)))
            .log(LoggingHandler.Level.INFO, "SftpCopyService")
            .handle(Sftp.outboundGateway(sftpSessionFactory(),
                    AbstractRemoteFileOutboundGateway.Command.PUT,
                    "payload")
                    .remoteDirectoryExpression("headers[" + COPY_DEST_PATH.value + "]")
                    .fileNameGenerator(n -> (String) n.getHeaders().get(COPY_DEST_FILENAME.value))
                    .fileExistsMode(FileExistsMode.REPLACE))
            .log(LoggingHandler.Level.INFO, "SftpCopyService")
            .handle((p, h) -> {
                try {
                    return Files.deleteIfExists(
                            Paths.get(localPath + File.separator + h.get(COPY_SOURCE_FILENAME.value)));
                }
                catch (IOException e) {
                    e.printStackTrace();
                    return false;
                }
            })
            .get();
}
Here is the log.
2021-02-16 18:10:22,577 WARN [http-nio-9090-exec-1] org.springframework.core.log.LogAccessor: Failed to delete C:\Users\DELL\Desktop\GetTest\Spring Integration.txt
2021-02-16 18:10:22,784 INFO [http-nio-9090-exec-1] org.springframework.core.log.LogAccessor: GenericMessage [payload=C:\Users\DELL\Desktop\GetTest\Spring Integration.txt, headers={file_remoteHostPort=X.X.X.X:22, replyChannel=nullChannel, sourceFileName=Spring Integration.txt, file_remoteDirectory=/uploads/, destFileName=Spring Integrat.txt, destPath=uploads/dest, id=5105bdd1-8180-1185-3661-2ed708e07ab9, sourcePath=/uploads, file_remoteFile=Spring Integration.txt, timestamp=1613479222779}]
2021-02-16 18:10:23,011 INFO [http-nio-9090-exec-1] org.springframework.core.log.LogAccessor: GenericMessage [payload=uploads/dest/Spring Integrat.txt, headers={file_remoteHostPort=X.X.X.X:22, replyChannel=nullChannel, sourceFileName=Spring Integration.txt, file_remoteDirectory=/uploads/, destFileName=Spring Integrat.txt, destPath=uploads/dest, id=1bf83b0f-3b24-66bd-ffbf-2a9018b499fb, sourcePath=/uploads, file_remoteFile=Spring Integration.txt, timestamp=1613479223011}]
The more surprising part is that it appears very early, even before the flow is executed, although I have placed the file deletion at the very end. How can I get rid of this log message? It doesn't affect my process, but the message is misleading.
Also, is there a better way to copy a remote file to another path on the SFTP server?
EDIT
Like you suggested, I tried the SftpRemoteFileTemplate.execute() method to copy files on the SFTP server, but when session.write(InputStream stream, String path) is called, control never returns from the method; it blocks forever.
I tried debugging; control is lost when execution reaches here:
for(_ackcount = this.seq - startid; _ackcount > ackcount && this.checkStatus((int[])null, header); ++ackcount) {
}
This code sits inside _put method of ChannelSftp.class
Here is the sample code that I'm trying
public boolean copy() {
    return remoteFileTemplate.execute(session -> {
        if (!session.exists("uploads/Spring Integration.txt")) {
            return false;
        }
        if (!session.exists("uploads/dest")) {
            session.mkdir("uploads/dest");
        }
        InputStream inputStream = session.readRaw("uploads/Spring Integration.txt");
        session.write(inputStream, "uploads/dest/spring.txt");
        session.finalizeRaw();
        return true;
    });
}
Would you please point out what mistake I'm making here?
Instead of writing the whole flow via a local file copy, I'd suggest looking into a single service activator for SftpRemoteFileTemplate.execute(SessionCallback<F, T>). The SftpSession provided to that callback can be used for InputStream readRaw() and write(InputStream inputStream, String destination). In the end you must call finalizeRaw().
The LogAccessor issue is not clear. Which Spring Integration version do you use? Do you override the Spring Core version, though?
I think we can improve that WARN message and skip the File.delete() call when exists() returns false.
Feel free to provide such a contribution!
UPDATE
The JUnit test to demonstrate how to perform a copy on SFTP server:
@Test
public void testSftpCopy() throws Exception {
    this.template.execute(session -> {
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream out = new PipedOutputStream(in);
        session.read("sftpSource/sftpSource2.txt", out);
        session.write(in, "sftpTarget/sftpTarget2.txt");
        return null;
    });

    Session<?> session = this.sessionFactory.getSession();
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    FileCopyUtils.copy(session.readRaw("sftpTarget/sftpTarget2.txt"), baos);
    assertThat(session.finalizeRaw()).isTrue();
    assertThat(new String(baos.toByteArray())).isEqualTo("source2");

    baos = new ByteArrayOutputStream();
    FileCopyUtils.copy(session.readRaw("sftpSource/sftpSource2.txt"), baos);
    assertThat(session.finalizeRaw()).isTrue();
    assertThat(new String(baos.toByteArray())).isEqualTo("source2");

    session.close();
}
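For completeness, here is a rough sketch (my own, not part of the original answer) of wiring that same piped-stream copy into a single service activator; the channel name, the remote paths, and the SftpRemoteFileTemplate bean are assumptions, and the piped-stream approach suits small files because the pipe is filled before it is drained:

@Bean
public IntegrationFlow sftpCopyFlow(SftpRemoteFileTemplate template) {
    return IntegrationFlows.from("sftpCopyChannel")
            .handle((payload, headers) -> template.execute(session -> {
                PipedInputStream in = new PipedInputStream();
                PipedOutputStream out = new PipedOutputStream(in);
                // read the source into the pipe, then write the pipe to the target,
                // all on the same SFTP session
                session.read("uploads/Spring Integration.txt", out);
                session.write(in, "uploads/dest/spring.txt");
                return true;
            }))
            .get();
}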

Run Spring Integration flow concurrently for each Ftp file

I have an integration flow configured using the Java DSL which pulls files from an FTP server using Ftp.inboundChannelAdapter, then transforms them to a JobRequest, and then a .handle() method triggers my batch job. Everything works as required, but the process runs sequentially for each file in the FTP folder.
I added logging of the current thread name in my transformer endpoint, and it printed the same thread name for each file.
Here is what I have tried so far:
1. Task executor bean
@Bean
public TaskExecutor taskExecutor() {
    return new SimpleAsyncTaskExecutor("Integration");
}
2. Integration flow
@Bean
public IntegrationFlow integrationFlow(JobLaunchingGateway jobLaunchingGateway) throws IOException {
    return IntegrationFlows.from(Ftp.inboundAdapter(myFtpSessionFactory)
                    .remoteDirectory("/bar")
                    .localDirectory(localDir.getFile()),
            c -> c.poller(Pollers.fixedRate(1000).taskExecutor(taskExecutor()).maxMessagesPerPoll(20)))
            .transform(fileMessageToJobRequest(importUserJob(step1())))
            .handle(jobLaunchingGateway)
            .log(LoggingHandler.Level.WARN, "headers.id + ': ' + payload")
            .route(JobExecution.class, j -> j.getStatus().isUnsuccessful() ? "jobFailedChannel" : "jobSuccessfulChannel")
            .get();
}
3. I also read in another SO thread that I need an ExecutorChannel, so I configured one, but I don't know how to inject this channel into my Ftp.inboundAdapter. From the logs I see that the channel is always integrationFlow.channel#0, which I guess is a DirectChannel.
@Bean
public MessageChannel inputChannel() {
    return new ExecutorChannel(taskExecutor());
}
I don't know what I'm missing here, or maybe I have not properly understood the Spring messaging system, as I'm very new to Spring and Spring Integration.
Any help is appreciated.
Thanks
The ExecutorChannel you can simply inject into the flow and it is going to be applied to the SourcePollingChannelAdapter by the framework. So, having that inputChannel defined as a bean you just do this:
.channel(inputChannel())
before your .transform(fileMessageToJobRequest(importUserJob(step1()))).
See more in docs: https://docs.spring.io/spring-integration/docs/current/reference/html/dsl.html#java-dsl-channels
On the other hand, to process your files in parallel according to your .taskExecutor(taskExecutor()) configuration, you just need to change that .maxMessagesPerPoll(20) to 1. The logic in the AbstractPollingEndpoint is like this:
this.taskExecutor.execute(() -> {
    int count = 0;
    while (this.initialized && (this.maxMessagesPerPoll <= 0 || count < this.maxMessagesPerPoll)) {
        if (pollForMessage() == null) {
            break;
        }
        count++;
    }
});
So, we do have tasks in parallel, but only when they reach that maxMessagesPerPoll, which is 20 in your current case. There is also some explanation in the docs: https://docs.spring.io/spring-integration/docs/current/reference/html/messaging-endpoints.html#endpoint-pollingconsumer
The maxMessagesPerPoll property specifies the maximum number of messages to receive within a given poll operation. This means that the poller continues calling receive() without waiting, until either null is returned or the maximum value is reached. For example, if a poller has a ten-second interval trigger and a maxMessagesPerPoll setting of 25, and it is polling a channel that has 100 messages in its queue, all 100 messages can be retrieved within 40 seconds. It grabs 25, waits ten seconds, grabs the next 25, and so on.
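Putting the two suggestions together, the flow from the question could look roughly like this (a sketch only, reusing the bean and method names from the question above):

@Bean
public IntegrationFlow integrationFlow(JobLaunchingGateway jobLaunchingGateway) throws IOException {
    return IntegrationFlows.from(Ftp.inboundAdapter(myFtpSessionFactory)
                    .remoteDirectory("/bar")
                    .localDirectory(localDir.getFile()),
            // one message per poll: each polling task scheduled on the taskExecutor
            // picks up a single file, so files can be processed concurrently
            c -> c.poller(Pollers.fixedRate(1000).taskExecutor(taskExecutor()).maxMessagesPerPoll(1)))
            // hand the message over to the ExecutorChannel threads before transforming
            .channel(inputChannel())
            .transform(fileMessageToJobRequest(importUserJob(step1())))
            .handle(jobLaunchingGateway)
            .log(LoggingHandler.Level.WARN, "headers.id + ': ' + payload")
            .route(JobExecution.class, j -> j.getStatus().isUnsuccessful() ? "jobFailedChannel" : "jobSuccessfulChannel")
            .get();
}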

TcpNioClientConnectionFactory vs TcpNetClientConnectionFactory

I am using :
spring-integration-java-dsl-1.2.3.RELEASE
spring-integration-ip-4.3.19.RELEASE
spring-integration-http-4.3.19.RELEASE
I want to know what is the difference between these two implementations TcpNetClientConnectionFactory and TcpNioClientConnectionFactory.
I have created an application to connect to a server, and my application must support a high volume of transactions per second, maybe 100 transactions per second.
I don't know whether my implementation is correct to support a high volume or not.
The NIO implementation is usually recommended to avoid blocking, but I don't know whether my application will improve if I change the type of implementation.
public IntegrationFlow createTcpConnection(String connectionId, String host, int port, int headBytes,
        int retryInterval) {
    LOGGER.debug("createTcpConnection -> connectionId: {} - host: {} - port: {} - headBytes: {} - retryInterval: {}",
            connectionId, host, port, headBytes, retryInterval);
    IntegrationFlow ifr = existsConnection(connectionId);
    if (ifr == null) {
        TcpNetClientConnectionFactory cf = new TcpNetClientConnectionFactory(host, port);
        final ByteArrayLengthHeaderSerializer by = new ByteArrayLengthHeaderSerializer(headBytes);
        cf.setSingleUse(false);
        cf.setSoKeepAlive(true);
        cf.setSerializer(by);
        cf.setDeserializer(by);
        cf.setComponentName(connectionId);
        // Inbound adapter
        TcpReceivingChannelAdapter adapter = new TcpReceivingChannelAdapter();
        adapter.setConnectionFactory(cf);
        adapter.setClientMode(true);
        adapter.setErrorChannelName("errorChannel");
        adapter.setRetryInterval(retryInterval);
        ifr = IntegrationFlows
                .from(adapter)
                .enrichHeaders(h -> h.header("connectionId", connectionId))
                .channel(fromTcp)
                .handle("BridgeMessageEndpoint", "outbound")
                .get();
        this.flowContext.registration(ifr).id(connectionId + CONNECTION_SUFFIX + ".in").addBean(cf).register();
        // Outbound adapter
        TcpSendingMessageHandler sender = new TcpSendingMessageHandler();
        sender.setConnectionFactory(cf);
        IntegrationFlow flow = f -> f.handle(sender);
        this.flowContext.registration(flow).id(connectionId + CONNECTION_SUFFIX + ".out").register();
        LOGGER.debug("createTcpConnection: Connection created");
    }
    return ifr;
}
Generally speaking, from an I/O perspective, TcpNet* will be more efficient for a small/medium number of long-lived connections. TcpNio* would be better for a large number of connections and/or very short-lived connections.
If you want to process inbound messages in parallel with a TcpNet... configuration, you can use an executor channel as the adapter's output channel so the IO thread hands off the work to another thread.
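For example, here is a sketch of the question's inbound flow with such a hand-off; the tcpExecutor() bean is my assumption, and its pool would need tuning for the expected load:

ifr = IntegrationFlows
        .from(adapter)
        // hand the inbound message off to a thread pool right after the adapter,
        // so the connection factory's I/O thread is released immediately
        .channel(c -> c.executor(tcpExecutor()))
        .enrichHeaders(h -> h.header("connectionId", connectionId))
        .channel(fromTcp)
        .handle("BridgeMessageEndpoint", "outbound")
        .get();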

How to handle errors after message has been handed off to QueueChannel?

I have 10 RabbitMQ queues, called event.q.0, event.q.2, <...>, event.q.9. Each of these queues receives messages routed from the event.consistent-hash exchange. I want to build a fault-tolerant solution that will consume messages for a specific event in a sequential manner, since ordering is important. For this I have set up a flow that listens to those queues and routes messages based on the event ID to a specific worker flow. Worker flows work based on queue channels, which should guarantee FIFO order for an event with a specific ID. I have come up with the following setup:
@Bean
public IntegrationFlow eventConsumerFlow(RabbitTemplate rabbitTemplate, Advice retryAdvice) {
    return IntegrationFlows
            .from(Amqp.inboundAdapter(new SimpleMessageListenerContainer(rabbitTemplate.getConnectionFactory()))
                    .configureContainer(c -> c
                            .adviceChain(retryAdvice())
                            .addQueueNames(queueNames)
                            .prefetchCount(amqpProperties.getPreMatch().getDefinition().getQueues().getEvent().getPrefetch()))
                    .messageConverter(rabbitTemplate.getMessageConverter()))
            .<Event, String>route(e -> String.format("worker-input-%d", e.getId() % numberOfWorkers))
            .get();
}
private Advice deadLetterAdvice() {
    return RetryInterceptorBuilder
            .stateless()
            .maxAttempts(3)
            .recoverer(recoverer())
            .backOffPolicy(backOffPolicy())
            .build();
}

private ExponentialBackOffPolicy backOffPolicy() {
    ExponentialBackOffPolicy backOffPolicy = new ExponentialBackOffPolicy();
    backOffPolicy.setInitialInterval(1000);
    backOffPolicy.setMultiplier(3.0);
    backOffPolicy.setMaxInterval(15000);
    return backOffPolicy;
}

private MessageRecoverer recoverer() {
    return new RepublishMessageRecoverer(
            rabbitTemplate,
            "error.exchange.dlx");
}
@PostConstruct
public void init() {
    for (int i = 0; i < numberOfWorkers; i++) {
        flowContext.registration(workerFlow(MessageChannels.queue(String.format("worker-input-%d", i), queueCapacity).get()))
                .autoStartup(false)
                .id(String.format("worker-flow-%d", i))
                .register();
    }
}
private IntegrationFlow workerFlow(QueueChannel channel) {
    return IntegrationFlows
            .from(channel)
            .<Object, Class<?>>route(Object::getClass, m -> m
                    .resolutionRequired(true)
                    .defaultOutputToParentFlow()
                    .subFlowMapping(EventOne.class, s -> s.handle(oneHandler))
                    .subFlowMapping(EventTwo.class, s -> s.handle(anotherHandler)))
            .get();
}
Now, when, let's say, an error happens in eventConsumerFlow, the retry mechanism works as expected, but when an error happens in workerFlow, the retry doesn't work anymore and the message doesn't get sent to the dead letter exchange. I assume this is because once a message is handed off to a QueueChannel, it gets acknowledged automatically. How can I make the retry mechanism work in workerFlow as well, so that if an exception happens there, it could retry a couple of times and send the message to the DLX when the retries are exhausted?
If you want resiliency, you shouldn't be using queue channels at all; the messages will be acknowledged immediately after they are put in the in-memory queue; if the server crashes, those messages will be lost.
You should configure a separate adapter for each queue if you want no message loss.
That said, to answer the general question, any errors on downstream flows (including after a queue channel) will be sent to the errorChannel defined on the inbound adapter.
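As a rough illustration (my sketch, not tested), declaring such an error channel on the inbound adapter could look like this; eventErrorChannel and the error-handling flow that republishes to the DLX are assumptions:

@Bean
public MessageChannel eventErrorChannel() {
    return new DirectChannel();
}

@Bean
public IntegrationFlow eventConsumerFlow(RabbitTemplate rabbitTemplate) {
    return IntegrationFlows
            .from(Amqp.inboundAdapter(new SimpleMessageListenerContainer(rabbitTemplate.getConnectionFactory()))
                    .configureContainer(c -> c.addQueueNames(queueNames))
                    // errors thrown downstream (including after a queue channel) arrive here
                    .errorChannel(eventErrorChannel()))
            .<Event, String>route(e -> String.format("worker-input-%d", e.getId() % numberOfWorkers))
            .get();
}

@Bean
public IntegrationFlow eventErrorFlow(RabbitTemplate rabbitTemplate) {
    return IntegrationFlows.from(eventErrorChannel())
            // the ErrorMessage payload is typically a MessagingException carrying the failed message
            .handle(m -> rabbitTemplate.convertAndSend("error.exchange.dlx", "failed",
                    ((MessagingException) m.getPayload()).getFailedMessage().getPayload()))
            .get();
}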

Spring Integration: Framework Error Handling is Inconsistent

In my experience, the error handling strategy in the various EIP components has little or no consistency.
Case 1: handle:
return IntegrationFlows.from(inputChannel)
        .enrichHeaders(spec -> spec.header(ERROR_CHANNEL, ARTIFACTORY_ERROR_CHANNEL, true))
        .handle(WebFlux.outboundGateway(uri, webClient)
                .expectedResponseType(ArtifactSearchResponse.class)
                .httpMethod(GET)
                .mappedRequestHeaders(ACCEPT))
        .log(LoggingHandler.Level.INFO, CLASS_NAME, Message::getPayload)
        .handle(transformer)
        .channel(outputChannel)
        .get();
In this case, if handle(transformer) throws an exception, the message is sent to the ARTIFACTORY_ERROR_CHANNEL as expected, but the exception is returned to the caller. Thus, a test has to use try-catch to not fail.
try {
    inputChannel.send(new GenericMessage<>("start"));
}
catch (Exception e) {
    // no-op
}
verify(mockMessageHandler, timeout.times(1)).handleMessage(any(ErrorMessage.class));
Case 2: transform:
Change handle(transformer) to transform(transformer) and the exception is never sent to the ARTIFACTORY_ERROR_CHANNEL channel.
Case 3: Gateway:
public IntegrationFlow fileStreamingFlow() {
    return IntegrationFlows.from(inputChannel)
            .gateway(f -> f.handle(String.class, (fileName, headers) -> {
                throw new RuntimeException();
            }), spec -> spec.requiresReply(false).errorChannel(S3_ERROR_CHANNEL))
            .channel(outputChannel)
            .get();
}
In this case, the call blocks forever. See #2451.
Case 4: handle with routeByException:
return IntegrationFlows.from(s3Properties.getFileStreamingInputChannel())
        .enrichHeaders(spec -> spec.header(ERROR_CHANNEL, S3_ERROR_CHANNEL, true))
        .handle(String.class, (fileName, h) -> {
            return new ErrorMessage(new RuntimeException(), h);
        }, spec -> spec.requiresReply(false))
        .channel(outputChannel)
        .routeByException(r -> r.channelMapping(Exception.class, S3_ERROR_CHANNEL))
        .get();
In order for the exception to be sent to S3_ERROR_CHANNEL, I need to convert the exception to an ErrorMessage, and also apply a routeByException although there is already a previously configured ERROR_CHANNEL.
What I expect: if the user defines an error channel, send all exceptions there. If the error handler associated with that channel returns null, terminate the flow; if it returns something else, continue. If the user doesn't define an error channel, send the exception to the framework's default error channel. Do this regardless of the flow definition.
transformer: if your transformer returns a Message<?>, it is responsible for propagating the errorChannel header.
When using a gateway, the error channel must be declared thereon rather than adding it later.
I don't understand what you are trying to do there.
In general, it's best not to manipulate framework headers in this way, but to declare the channel on the proper elements (gateways, pollers, etc.).
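For instance, here is a minimal sketch (my illustration, not from the answer) of Case 1 with a transformer that returns a Message<?> and carries the errorChannel header along; toSummary() is just a placeholder for whatever conversion is done:

return IntegrationFlows.from(inputChannel)
        .enrichHeaders(spec -> spec.header(ERROR_CHANNEL, ARTIFACTORY_ERROR_CHANNEL, true))
        .handle(WebFlux.outboundGateway(uri, webClient)
                .expectedResponseType(ArtifactSearchResponse.class)
                .httpMethod(GET)
                .mappedRequestHeaders(ACCEPT))
        // build the reply Message ourselves and copy the request headers so the
        // errorChannel header set above is still present downstream
        .transform(Message.class, request -> MessageBuilder
                .withPayload(toSummary((ArtifactSearchResponse) request.getPayload()))
                .copyHeaders(request.getHeaders())
                .build())
        .channel(outputChannel)
        .get();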

Resources