FileSystemPersistentAcceptOnceFileListFilter Not Picking Up File - spring-integration

I am using OpenJDK 17 and Spring Integration with Spring Boot 2.7.4. While watching a directory for files with the code below, I could see that the metadata store table was updated with the file and its timestamp, but it never got to the fileChannel code for processing. A timing issue, perhaps? This app had been running for a few months with no issues before today. I ran a touch command on the file and then it got triggered.
Any suggestions? Thanks in advance for any assistance.
@Bean
public MessageChannel fileChannel() {
    return new DirectChannel();
}

@Bean
@InboundChannelAdapter(value = "fileChannel", poller = @Poller(fixedDelay = "30000"))
public MessageSource<File> watchSourceDirectory() {
    FileReadingMessageSource source = new FileReadingMessageSource();
    source.setDirectory(new File(appConfig.getLocal().getSourceDir()));
    source.setAutoCreateDirectory(true);
    CompositeFileListFilter<File> compositeFileListFilter = new CompositeFileListFilter<>();
    compositeFileListFilter.addFilter(new RegexPatternFileListFilter(appConfig.getLocal().getFilePattern()));
    compositeFileListFilter.addFilter(new LastModifiedFileListFilter(10));
    compositeFileListFilter.addFilter(new FileSystemPersistentAcceptOnceFileListFilter(metadataStore, ""));
    source.setFilter(compositeFileListFilter);
    return source;
} // end watchSourceDirectory()

There is a problem with CompositeFileListFilter: it applies the polled files to all of the filters, but it returns only those that remain from the original list after every filter has been consulted:
public List<F> filterFiles(F[] files) {
    Assert.notNull(files, "'files' should not be null");
    List<F> results = new ArrayList<>(Arrays.asList(files));
    for (FileListFilter<F> fileFilter : this.fileFilters) {
        List<F> currentResults = fileFilter.filterFiles(files);
        results.retainAll(currentResults);
    }
    return results;
}
Consider using a ChainFileListFilter instead. Its logic is to call the next filter only with the files already accepted by the previous one:
public List<F> filterFiles(F[] files) {
    Assert.notNull(files, "'files' should not be null");
    List<F> leftOver = Arrays.asList(files);
    for (FileListFilter<F> fileFilter : this.fileFilters) {
        if (leftOver.isEmpty()) {
            break;
        }
        @SuppressWarnings("unchecked")
        F[] fileArray = leftOver.toArray((F[]) Array.newInstance(leftOver.get(0).getClass(), leftOver.size()));
        leftOver = fileFilter.filterFiles(fileArray);
    }
    return leftOver;
}
So your FileSystemPersistentAcceptOnceFileListFilter is updated for nothing.
This is what is going on in your case:
1. RegexPatternFileListFilter: probably all the files are accepted. No state.
2. LastModifiedFileListFilter: your files are not accepted because they are not old enough yet. No state.
3. FileSystemPersistentAcceptOnceFileListFilter: your files are accepted because they are not present in the store. There is state.
But since the last-modified filter has already removed your files from the list, the result of the accept-once filter is ignored, and on subsequent polls the accept-once filter rejects the files because they are now in the store.
That also explains the touch: this filter stores the file's lastModified timestamp, so changing it makes the file acceptable again.
Earlier it probably worked because the files were old enough, or you didn't have that LastModifiedFileListFilter yet.
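For example, the adapter above could be rewritten like this (a sketch, reusing your appConfig and metadataStore as-is):

@Bean
@InboundChannelAdapter(value = "fileChannel", poller = @Poller(fixedDelay = "30000"))
public MessageSource<File> watchSourceDirectory() {
    FileReadingMessageSource source = new FileReadingMessageSource();
    source.setDirectory(new File(appConfig.getLocal().getSourceDir()));
    source.setAutoCreateDirectory(true);
    // Each filter sees only the files accepted by the previous one, so the accept-once
    // filter records state only for files that will really be emitted downstream.
    ChainFileListFilter<File> chain = new ChainFileListFilter<>();
    chain.addFilter(new RegexPatternFileListFilter(appConfig.getLocal().getFilePattern()));
    chain.addFilter(new LastModifiedFileListFilter(10));
    chain.addFilter(new FileSystemPersistentAcceptOnceFileListFilter(metadataStore, ""));
    source.setFilter(chain);
    return source;
}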

Related

How to handle error during file transformation

When a large file is uploaded to a polled directory and we attempt to unzip it, an error occurs because the file is not yet complete.
How can I make the poll retry after some time?
@Bean
@InboundChannelAdapter(value = "inputChannel", poller = @Poller(fixedRate = "1500"))
public FileReadingMessageSource poll() {
    FileReadingMessageSource source = new FileReadingMessageSource();
    source.setScanEachPoll(true);
    source.setDirectory(new File(pathConfig.getIncomingDirPath()));
    source.setUseWatchService(true);
    source.setFilter(new SimplePatternFileListFilter("*.zip"));
    return source;
}
@Transformer(inputChannel = "inputChannel", outputChannel = "unzipChannel")
public Message<?> convert(Message<File> fileMessage) {
    UnZipTransformer unzipTransformer = new UnZipTransformer();
    unzipTransformer.setZipResultType(ZipResultType.FILE);
    unzipTransformer.setWorkDirectory(new File(pathConfig.getWorkDirPath()));
    unzipTransformer.setDeleteFiles(false);
    unzipTransformer.afterPropertiesSet();
    return unzipTransformer.transform(fileMessage);
}
EDIT: I can't get the LastModifiedFileListFilter to work with large files. It works fine with small ones, but when I upload a large one it no longer reacts.
EDIT2
Thanks to advice from Artem and Gary, here is the solution that worked for me.
@Bean
@InboundChannelAdapter(value = "inputChannel", poller = @Poller(fixedDelay = "1500"))
public FileReadingMessageSource poll() {
    FileReadingMessageSource source = new FileReadingMessageSource();
    source.setScanEachPoll(true);
    source.setDirectory(new File(pathConfig.getIncomingDirPath()));
    source.setUseWatchService(true);
    FileListFilter<File> simplePatternFileListFilter = new SimplePatternFileListFilter("*.zip");
    source.setFilter(new ChainFileListFilter<File>().addFilter(simplePatternFileListFilter));
    return source;
}

@Transformer(inputChannel = "inputChannel", outputChannel = "unzipChannel",
        adviceChain = "retryOnIncompleteData")
public Message<?> convertZip(Message<File> fileMessage) {
    UnZipTransformer unzipTransformer = new UnZipTransformer();
    unzipTransformer.setZipResultType(ZipResultType.FILE);
    unzipTransformer.setWorkDirectory(new File(pathConfig.getWorkDirPath()));
    unzipTransformer.setDeleteFiles(false);
    unzipTransformer.afterPropertiesSet();
    return unzipTransformer.transform(fileMessage);
}

@Bean
public Advice retryOnIncompleteData() {
    RequestHandlerRetryAdvice advice = new RequestHandlerRetryAdvice();
    RetryTemplate template = createRetryTemplate();
    advice.setRetryTemplate(template);
    return advice;
}

private RetryTemplate createRetryTemplate() {
    RetryTemplate template = new RetryTemplate();
    SimpleRetryPolicy policy = new SimpleRetryPolicy();
    policy.setMaxAttempts(15);
    template.setRetryPolicy(policy);
    FixedBackOffPolicy backOffPolicy = new FixedBackOffPolicy();
    backOffPolicy.setBackOffPeriod(25000L);
    template.setBackOffPolicy(backOffPolicy);
    return template;
}
Well, actually, since you use only a SimplePatternFileListFilter, all of your files are going to be re-fetched on each poll interval.
To avoid re-fetching the same file over and over (if you don't delete it at the end of the process), we recommend using an AcceptOnceFileListFilter as part of the CompositeFileListFilter.
In case of an error like yours, you can use an ExpressionEvaluatingRequestHandlerAdvice whose onFailureExpression calls the ResettableFileListFilter.remove() implementation of the AcceptOnceFileListFilter.
On the other hand, instead of the ExpressionEvaluatingRequestHandlerAdvice, you can consider using a RequestHandlerRetryAdvice to retry the unzipping process: that way you don't need to re-fetch the file from the local file system.
You should apply these AOP advices to the @Transformer.
See more info in the Reference Manual.
BTW, I would say fixedRate is not good for large files, especially when you have errors like this. fixedDelay would be better. See their JavaDocs for more information.
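For reference, the ExpressionEvaluatingRequestHandlerAdvice variant might look like this (a sketch: the acceptOnceFilter bean is an assumption that would also be added to the filter chain of the poll() bean above, and the setter names are per recent Spring Integration versions):

@Bean
public AcceptOnceFileListFilter<File> acceptOnceFilter() {
    return new AcceptOnceFileListFilter<>();
}

@Bean
public Advice removeFileOnFailure(AcceptOnceFileListFilter<File> acceptOnceFilter) {
    ExpressionEvaluatingRequestHandlerAdvice advice = new ExpressionEvaluatingRequestHandlerAdvice();
    // On failure, evict the file from the accept-once filter so a later poll re-processes it.
    advice.setOnFailureExpressionString("@acceptOnceFilter.remove(payload)");
    advice.setTrapException(true);
    return advice;
}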

FtpInboundFileSynchronizer.setDeleteRemoteFiles(true) not deleting remote files after transfer

I'm trying the Spring Boot integration FTP example (http://docs.spring.io/spring-integration/reference/html/ftp.html, 15.5.1). I'm able to read files from the remote directory, but somehow the files are not deleted despite the setDeleteRemoteFiles(true) flag. Please let me know if I'm missing any flags needed to remove remote files.
@Bean
public SessionFactory<FTPFile> ftpSessionFactory() {
    DefaultFtpSessionFactory sf = new DefaultFtpSessionFactory();
    sf.setHost("localhost");
    sf.setPort(21);
    sf.setUsername("sudaredd");
    sf.setPassword("");
    return new CachingSessionFactory<FTPFile>(sf);
}

@Bean
public FtpInboundFileSynchronizer ftpInboundFileSynchronizer() {
    FtpInboundFileSynchronizer fileSynchronizer = new FtpInboundFileSynchronizer(ftpSessionFactory());
    fileSynchronizer.setDeleteRemoteFiles(true);
    fileSynchronizer.setRemoteDirectory("");
    fileSynchronizer.setFilter(new FtpSimplePatternFileListFilter("*"));
    return fileSynchronizer;
}

@Bean
@InboundChannelAdapter(channel = "ftpChannel", poller = @Poller(fixedDelay = "5000"))
public MessageSource<File> ftpMessageSource() {
    FtpInboundFileSynchronizingMessageSource source =
            new FtpInboundFileSynchronizingMessageSource(ftpInboundFileSynchronizer());
    source.setLocalDirectory(new File("ftp-inbound"));
    source.setAutoCreateLocalDirectory(true);
    source.setLocalFilter(new AcceptOnceFileListFilter<File>());
    return source;
}
All the config looks good.
Try switching on DEBUG logging for the org.springframework.integration category to trace the behavior.
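With Spring Boot, for example, that is just a property in application.properties:

logging.level.org.springframework.integration=DEBUG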
OTOH, please be sure that your FTP server supports the delete operation and that your user really has permission to perform it.
Since @InboundChannelAdapter performs its task via a taskScheduler, you might simply be missing an error in the logs.
It would also be great if you could debug the code and place a breakpoint in AbstractInboundFileSynchronizer.copyFileToLocalDirectory() to follow the logic and determine what may be wrong.

What is the proper Java Config for simple file copy

I'm extremely new to Spring, and even more so to Spring Integration, so apologies if this is a very basic question.
I want to build a very basic log file processor to learn the ropes. Very similar to this: example
I also want to use a Java config approach, but most of the examples I've been following are XML-driven, and I'm having a hard time doing the translation.
Ultimately I'd like to recursively poll a source directory for log files and use a persistence store to keep track of what's been found.
Then, copy those files to be processed to a processing folder, and then kick off a spring batch job to process the contents of the file.
When everything completes the processed file can be deleted from the processing location.
I can't seem to figure out the proper way to wire up the flow (using general Java config or SpEL). Also, I'm still very unsure of what the proper pieces should be.
Again something along these basic, high-level lines for the file moving:
file:inbound-channel-adapter -> channel -> file:outbound-adapter
basic sample
Here is what I have so far
EDIT
I've updated with Artem's solution. My source files are now properly copied to the destination location. Thanks Artem!
Ultimately I am still facing the same problem. The files to be scanned are found immediately (and the metadata-store.properties file is populated immediately), but the files are copied to the destination folder slowly. If a crash happens, any source files that have not yet been copied to the destination folder will essentially be "lost". Perhaps I need to look at other forms of persistence stores, like a custom JDBC filter.
@Value("${logProcessor.filenamePattern}")
private String filenamePattern;

@Value("${logProcessor.sourceDirectory}")
private String sourceDirectory;

@Value("${logProcessor.processingDirectory}")
private String processingDirectory;

@Bean
@InboundChannelAdapter(channel = "sourceFileChannel", poller = @Poller(fixedRate = "5000"))
public MessageSource<File> sourceFiles() {
    CompositeFileListFilter<File> filters = new CompositeFileListFilter<>();
    filters.addFilter(new SimplePatternFileListFilter(filenamePattern));
    filters.addFilter(persistentFilter());
    FileReadingMessageSource source = new FileReadingMessageSource();
    source.setAutoCreateDirectory(true);
    source.setDirectory(new File(sourceDirectory));
    source.setFilter(filters);
    source.setUseWatchService(true);
    return source;
}

@Bean
@InboundChannelAdapter(channel = "processingFileChannel", poller = @Poller(fixedRate = "5000"))
public MessageSource<File> processingFiles() {
    CompositeFileListFilter<File> filters = new CompositeFileListFilter<>();
    filters.addFilter(new SimplePatternFileListFilter(filenamePattern));
    FileReadingMessageSource source = new FileReadingMessageSource();
    source.setAutoCreateDirectory(true);
    source.setDirectory(new File(processingDirectory));
    source.setFilter(filters);
    return source;
}

@Bean
@ServiceActivator(inputChannel = "sourceFileChannel")
public MessageHandler fileOutboundChannelAdapter() {
    FileWritingMessageHandler adapter = new FileWritingMessageHandler(new File(processingDirectory));
    adapter.setDeleteSourceFiles(false);
    adapter.setAutoCreateDirectory(true);
    adapter.setExpectReply(false);
    return adapter;
}

@Bean
public MessageChannel sourceFileChannel() {
    return new DirectChannel();
}

@Bean
public MessageChannel processingFileChannel() {
    return new DirectChannel();
}

@Bean
public DefaultDirectoryScanner defaultDirectoryScanner() {
    return new DefaultDirectoryScanner();
}

@Bean
public FileSystemPersistentAcceptOnceFileListFilter persistentFilter() {
    FileSystemPersistentAcceptOnceFileListFilter filter = new FileSystemPersistentAcceptOnceFileListFilter(metadataStore(), "");
    filter.setFlushOnUpdate(true);
    return filter;
}

@Bean
public PropertiesPersistingMetadataStore metadataStore() {
    PropertiesPersistingMetadataStore metadataStore = new PropertiesPersistingMetadataStore();
    metadataStore.setBaseDirectory("C:\\root\\code\\logProcessor");
    return metadataStore;
}
Your config is good so far.
With such a complex task, it's hard to help; you should ask a more specific question. We can't write the solution for you.
I'm not sure why you need to copy files from one dir to another if you can simply poll them from the source dir, store them in the metadataStore, and start the file processing.
So far I see one small problem in your config: the FileWritingMessageHandler sends its results to the processingFileChannel, and the same is done by the second FileReadingMessageSource. I'm not sure that is your intention; just calling it to your attention.
You might also want to know about the FileSplitter, which lets you process a file line by line; see the sketch below.
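A minimal FileSplitter sketch (linesChannel is an assumed downstream channel):

@Bean
@Splitter(inputChannel = "processingFileChannel")
public MessageHandler fileSplitter() {
    FileSplitter splitter = new FileSplitter();
    // Emits one message per line of each file arriving on processingFileChannel.
    splitter.setOutputChannelName("linesChannel");
    return splitter;
}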
Also, you say processingDirectory, but then you use tmpDir for the FileWritingMessageHandler, which, I guess, is where your copy logic comes in.
Let's do the task step by step! Then you will figure out what, where, and how to use!
EDIT
If you just need to copy a file to the processingDirectory without any reply, you should make it a one-way adapter:
@Bean
@ServiceActivator(inputChannel = "sourceFileChannel")
public MessageHandler fileOutboundChannelAdapter() {
    FileWritingMessageHandler adapter = new FileWritingMessageHandler(new File(processingDirectory));
    adapter.setDeleteSourceFiles(true);
    adapter.setAutoCreateDirectory(true);
    adapter.setExpectReply(false);
    return adapter;
}
And then your @InboundChannelAdapter(channel = "processingFileChannel") is good for picking up files for processing.
Not sure that you need setDeleteSourceFiles(true), though...

Spring Integration- FTP should synchronize with local folder

I have files in an FTP location and a local folder. The first time, the files are copied to the local folder; on restarting the server, it currently copies the already-copied files to the local folder again. It should not look for files that already exist locally; it should look for new files only. Please let me know whether it is possible to achieve this using Spring Integration FTP.
I have also added a Filter, but it is still not working. Please let me know where I am going wrong:
@Bean
@InboundChannelAdapter(value = "inputChannel", poller = @Poller(fixedDelay = "1000", maxMessagesPerPoll = "1"))
public MessageSource<?> receive() {
    FtpInboundFileSynchronizingMessageSource messageSource = new FtpInboundFileSynchronizingMessageSource(synchronizer());
    PropertiesPersistingMetadataStore metadataStore = new PropertiesPersistingMetadataStore();
    FileSystemPersistentAcceptOnceFileListFilter acceptOnceFilter = new FileSystemPersistentAcceptOnceFileListFilter(metadataStore, "*.xml");
    File temp = new File(TEMP_FOLDER);
    metadataStore.setBaseDirectory(TEMP_FOLDER);
    messageSource.setLocalDirectory(temp);
    messageSource.setAutoCreateLocalDirectory(false);
    messageSource.setLocalFilter(acceptOnceFilter);
    return messageSource;
}

private AbstractInboundFileSynchronizer<FTPFile> synchronizer() {
    folderCleanUp();
    AbstractInboundFileSynchronizer<FTPFile> fileSynchronizer = new FtpInboundFileSynchronizer(sessionFactory());
    fileSynchronizer.setRemoteDirectory(ftpFileLocation);
    fileSynchronizer.setDeleteRemoteFiles(false);
    Pattern pattern = Pattern.compile(".*\\.xml$");
    FtpRegexPatternFileListFilter ftpRegexPatternFileListFilter = new FtpRegexPatternFileListFilter(pattern);
    fileSynchronizer.setFilter(ftpRegexPatternFileListFilter);
    return fileSynchronizer;
}
To clarify Artem's advice about implementing your own custom FileListFilter, here is an example of such a filter (aimed at filtering out files older than a given moment):
@Component
public class OldFilesFilter extends AbstractFileListFilter<FTPFile> {

    // (oldFilesTimestamp field declaration and its source)

    @Override
    protected boolean accept(FTPFile file) {
        String fileName = file.getName();
        long fileTimestamp = file.getTimestamp().getTimeInMillis();
        ZonedDateTime fileModTimestamp = ZonedDateTime.ofInstant(Instant.ofEpochMilli(fileTimestamp), ZoneId.systemDefault());
        boolean isFileAcceptable = fileModTimestamp.isAfter(oldFilesTimestamp);
        if (log.isTraceEnabled()) {
            log.trace("File {}:\n" +
                    "file timestamp : {};\n" +
                    "given timestamp: {};\n" +
                    "file is new    : {}",
                    fileName, fileModTimestamp, oldFilesTimestamp, isFileAcceptable);
        }
        return isFileAcceptable;
    }

}
Also note that Spring Integration allows multiple filters to be applied to a single file source at the same time. This can be achieved with a CompositeFileListFilter:
private CompositeFileListFilter<FTPFile> remoteFileFilter() {
    FtpPersistentAcceptOnceFileListFilter persistentFilter =
            new FtpPersistentAcceptOnceFileListFilter(metadataStore, "remoteProcessedFiles.");
    return new CompositeFileListFilter<>(Arrays.asList(new FtpSimplePatternFileListFilter("*.zip"),
            persistentFilter,
            oldFilesFilter /* known from previous example */));
}
Yes, it is. Take a look at the local-filter property; the FileSystemPersistentAcceptOnceFileListFilter is there for you to track local files via an external MetadataStore, e.g. Redis, MongoDB, or any other that keeps the data over system restarts.
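One note on the config in the question: the PropertiesPersistingMetadataStore is created inline, so the container never runs its lifecycle callbacks and its state may never be flushed to disk. Declaring it as a bean lets the container manage that (a sketch, assuming the TEMP_FOLDER constant from the question):

@Bean
public PropertiesPersistingMetadataStore metadataStore() {
    PropertiesPersistingMetadataStore metadataStore = new PropertiesPersistingMetadataStore();
    // Loads its state on startup and writes the .properties file back on shutdown (or flush()).
    metadataStore.setBaseDirectory(TEMP_FOLDER);
    return metadataStore;
}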

Lost headers when using UnZipResultSplitter

I'm using the Spring Integration Zip extension, and it appears that I'm losing headers I've added upstream in the flow. I'm guessing that they are being lost in UnZipResultSplitter.splitUnzippedMap(), as I don't see anything that explicitly copies them over.
I seem to recall that this is not unusual with splitters, but I can't determine what strategy one should use in such a case.
Yep!
It looks like a bug.
The splitter contract is like this:
if (item instanceof Message) {
    builder = this.getMessageBuilderFactory().fromMessage((Message<?>) item);
}
else {
    builder = this.getMessageBuilderFactory().withPayload(item);
    builder.copyHeaders(headers);
}
So, if the split items are already messages, as in the case of our UnZipResultSplitter, the message is used as is, without copying headers from the upstream message.
Please raise a JIRA ticket (https://jira.spring.io/browse/INTEXT) on the matter.
Meanwhile, let's consider a workaround:
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import org.apache.commons.io.FilenameUtils;
import org.springframework.integration.file.FileHeaders;
import org.springframework.integration.support.MessageBuilder;
import org.springframework.integration.zip.ZipHeaders;
import org.springframework.messaging.Message;

public class MyUnZipResultSplitter {

    public List<Message<Object>> splitUnzipped(Message<Map<String, Object>> unzippedEntries) {
        final List<Message<Object>> messages = new ArrayList<>(unzippedEntries.getPayload().size());
        for (Map.Entry<String, Object> entry : unzippedEntries.getPayload().entrySet()) {
            final String path = FilenameUtils.getPath(entry.getKey());
            final String filename = FilenameUtils.getName(entry.getKey());
            final Message<Object> splitMessage = MessageBuilder.withPayload(entry.getValue())
                    // Copy the upstream headers first so they don't overwrite the entry-specific ones below.
                    .copyHeaders(unzippedEntries.getHeaders())
                    .setHeader(FileHeaders.FILENAME, filename)
                    .setHeader(ZipHeaders.ZIP_ENTRY_PATH, path)
                    .build();
            messages.add(splitMessage);
        }
        return messages;
    }

}
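Such a POJO could then be wired as a splitter endpoint, e.g. (a sketch; unzipChannel and filesChannel are assumed channel names):

@Bean
@Splitter(inputChannel = "unzipChannel", outputChannel = "filesChannel")
public MyUnZipResultSplitter unZipResultSplitter() {
    return new MyUnZipResultSplitter();
}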
