When a large file is uploaded to a polled directory, the attempt to unzip it fails because the file is not yet complete.
How can I make the polling retry after some time?
@Bean
@InboundChannelAdapter(value = "inputChannel", poller = @Poller(fixedRate = "1500"))
public FileReadingMessageSource poll() {
    FileReadingMessageSource source = new FileReadingMessageSource();
    source.setScanEachPoll(true);
    source.setDirectory(new File(pathConfig.getIncomingDirPath()));
    source.setUseWatchService(true);
    source.setFilter(new SimplePatternFileListFilter("*.zip"));
    return source;
}
@Transformer(inputChannel = "inputChannel", outputChannel = "unzipChannel")
public Message<?> convert(Message<File> fileMessage) {
    UnZipTransformer unzipTransformer = new UnZipTransformer();
    unzipTransformer.setZipResultType(ZipResultType.FILE);
    unzipTransformer.setWorkDirectory(new File(pathConfig.getWorkDirPath()));
    unzipTransformer.setDeleteFiles(false);
    unzipTransformer.afterPropertiesSet();
    return unzipTransformer.transform(fileMessage);
}
EDIT: I can't get LastModifiedFileListFilter to work with large files. It works fine with small ones, but when I upload a large one, it no longer reacts.
EDIT2
Thanks to advice from Artem and Gary, here is the solution that worked for me.
@Bean
@InboundChannelAdapter(value = "inputChannel", poller = @Poller(fixedDelay = "1500"))
public FileReadingMessageSource poll() {
    FileReadingMessageSource source = new FileReadingMessageSource();
    source.setScanEachPoll(true);
    source.setDirectory(new File(pathConfig.getIncomingDirPath()));
    source.setUseWatchService(true);
    // Typed filter chain instead of raw types
    ChainFileListFilter<File> filter = new ChainFileListFilter<>();
    filter.addFilter(new SimplePatternFileListFilter("*.zip"));
    source.setFilter(filter);
    return source;
}
@Transformer(inputChannel = "inputChannel", outputChannel = "unzipChannel",
        adviceChain = "retryOnIncompleteData")
public Message<?> convertZip(Message<File> fileMessage) {
    UnZipTransformer unzipTransformer = new UnZipTransformer();
    unzipTransformer.setZipResultType(ZipResultType.FILE);
    unzipTransformer.setWorkDirectory(new File(pathConfig.getWorkDirPath()));
    unzipTransformer.setDeleteFiles(false);
    unzipTransformer.afterPropertiesSet();
    return unzipTransformer.transform(fileMessage);
}
@Bean
public Advice retryOnIncompleteData() {
    RequestHandlerRetryAdvice advice = new RequestHandlerRetryAdvice();
    RetryTemplate template = createRetryTemplate();
    advice.setRetryTemplate(template);
    return advice;
}

private RetryTemplate createRetryTemplate() {
    RetryTemplate template = new RetryTemplate();
    // Up to 15 attempts in total
    SimpleRetryPolicy policy = new SimpleRetryPolicy();
    policy.setMaxAttempts(15);
    template.setRetryPolicy(policy);
    // Wait 25 seconds between attempts
    FixedBackOffPolicy backOffPolicy = new FixedBackOffPolicy();
    backOffPolicy.setBackOffPeriod(25000L);
    template.setBackOffPolicy(backOffPolicy);
    return template;
}
Well, actually, since you use only SimplePatternFileListFilter, every file is going to be picked up again on each poll interval.
To avoid re-fetching the same file over and over (if you don't delete it at the end of the process), we recommend using AcceptOnceFileListFilter as part of a CompositeFileListFilter.
In case of an error like yours, you can use ExpressionEvaluatingRequestHandlerAdvice and its onFailureExpression to call the ResettableFileListFilter.remove() implementation of the AcceptOnceFileListFilter.
On the other hand, instead of ExpressionEvaluatingRequestHandlerAdvice, you can consider using RequestHandlerRetryAdvice to retry the unzipping process: that way you don't need to re-fetch the file from the local file system.
You should apply these AOP advices to the @Transformer.
See more info in the Reference Manual.
BTW, I would say fixedRate is not good for large files, especially when you have errors like this. fixedDelay would be better. See their JavaDocs for more information.
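To make the retry idea concrete outside of Spring, here is a minimal plain-Java sketch of what a retry with a maximum number of attempts and a fixed back-off does conceptually. The class FixedBackOffRetry and its execute method are hypothetical names used only for illustration; this is not the Spring Retry API, and the short back-off in main is chosen just to keep the demo fast.

```java
import java.util.concurrent.Callable;

// Hypothetical illustration of "retry with fixed back-off"; not Spring Retry itself.
public class FixedBackOffRetry {

    /**
     * Calls the action up to maxAttempts times, sleeping backOffMillis
     * after each failed attempt. Rethrows the last failure if every attempt fails.
     */
    public static <T> T execute(Callable<T> action, int maxAttempts, long backOffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backOffMillis); // fixed pause before the next attempt
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulates an unzip that fails twice because the upload is still in progress.
        String result = execute(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("file not complete yet");
            }
            return "unzipped";
        }, 15, 10L);
        System.out.println(result + " after " + calls[0] + " attempts"); // unzipped after 3 attempts
    }
}
```

The point of the design is that the same in-memory message is retried, so the file does not have to be picked up from the directory again.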
I am using OpenJDK Java 17 and Spring Integration with Spring Boot 2.7.4. While watching a directory for files with the code below, I could see that the metadata store table was updated with the file and its timestamp, but the message never reached the fileChannel code for processing. Could it be a timing issue? This app had been running for a few months with no issues before today. When I ran a touch command on the file, processing was triggered.
Any suggestions? Thanks in advance for any assistance.
@Bean
public MessageChannel fileChannel() {
    return new DirectChannel();
}

@Bean
@InboundChannelAdapter(value = "fileChannel", poller = @Poller(fixedDelay = "30000"))
public MessageSource<File> watchSourceDirectory() {
    FileReadingMessageSource source = new FileReadingMessageSource();
    source.setDirectory(new File(appConfig.getLocal().getSourceDir()));
    source.setAutoCreateDirectory(true);
    CompositeFileListFilter<File> compositeFileListFilter = new CompositeFileListFilter<>();
    compositeFileListFilter.addFilter(new RegexPatternFileListFilter(appConfig.getLocal().getFilePattern()));
    compositeFileListFilter.addFilter(new LastModifiedFileListFilter(10));
    compositeFileListFilter.addFilter(new FileSystemPersistentAcceptOnceFileListFilter(metadataStore, ""));
    source.setFilter(compositeFileListFilter);
    return source;
} // end watchSourceDirectory()
There is a problem with CompositeFileListFilter: it passes the polled files to all of the filters, but returns only those that every filter has accepted:
public List<F> filterFiles(F[] files) {
    Assert.notNull(files, "'files' should not be null");
    List<F> results = new ArrayList<>(Arrays.asList(files));
    for (FileListFilter<F> fileFilter : this.fileFilters) {
        List<F> currentResults = fileFilter.filterFiles(files);
        results.retainAll(currentResults);
    }
    return results;
}
Consider using a ChainFileListFilter instead. Its logic is to call the next filter only with the files that have already passed the previous filters:
public List<F> filterFiles(F[] files) {
    Assert.notNull(files, "'files' should not be null");
    List<F> leftOver = Arrays.asList(files);
    for (FileListFilter<F> fileFilter : this.fileFilters) {
        if (leftOver.isEmpty()) {
            break;
        }
        @SuppressWarnings("unchecked")
        F[] fileArray = leftOver.toArray((F[]) Array.newInstance(leftOver.get(0).getClass(), leftOver.size()));
        leftOver = fileFilter.filterFiles(fileArray);
    }
    return leftOver;
}
This way, your FileSystemPersistentAcceptOnceFileListFilter is not updated for files that an earlier filter has already rejected.
This is what is going on in your case:
1. RegexPatternFileListFilter: probably all the files are accepted. No state.
2. LastModifiedFileListFilter: your files are not accepted because they are not old enough. No state.
3. FileSystemPersistentAcceptOnceFileListFilter: your files are accepted because they are not present in the store. This filter has state, so the files are recorded in the store.
But since the last-modified filter has already removed your files from the list, the result of the accept-once filter is ignored. On later polls the files are old enough, but the accept-once filter now rejects them because they are already in the store.
Earlier it probably worked because the files were old enough or you didn't have that LastModifiedFileListFilter.
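The difference between the two strategies can be reproduced with a small plain-Java sketch. The Filter interface, the AcceptOnce class and the composite/chain methods below are simplified, hypothetical stand-ins written for this illustration; they only mimic the logic quoted above and are not the Spring Integration classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified stand-ins for illustration; not the Spring Integration classes.
public class FilterOrderDemo {

    public interface Filter {
        List<String> filter(List<String> files);
    }

    /** Stateful filter: accepts a file name only the first time it sees it. */
    public static class AcceptOnce implements Filter {
        public final Set<String> seen = new HashSet<>();

        @Override
        public List<String> filter(List<String> files) {
            List<String> accepted = new ArrayList<>();
            for (String file : files) {
                if (seen.add(file)) {
                    accepted.add(file);
                }
            }
            return accepted;
        }
    }

    /** Composite style: every filter sees the full input; the results are intersected. */
    public static List<String> composite(List<String> files, List<Filter> filters) {
        List<String> results = new ArrayList<>(files);
        for (Filter filter : filters) {
            results.retainAll(filter.filter(files));
        }
        return results;
    }

    /** Chain style: each filter sees only what the previous filters accepted. */
    public static List<String> chain(List<String> files, List<Filter> filters) {
        List<String> leftOver = files;
        for (Filter filter : filters) {
            if (leftOver.isEmpty()) {
                break;
            }
            leftOver = filter.filter(leftOver);
        }
        return leftOver;
    }

    public static void main(String[] args) {
        Filter tooNew = files -> new ArrayList<>(); // first poll: file is not old enough
        Filter oldEnough = files -> files;          // later poll: file is old enough
        List<String> polled = Arrays.asList("a.log");

        AcceptOnce inComposite = new AcceptOnce();
        composite(polled, Arrays.asList(tooNew, inComposite));
        System.out.println("composite remembered: " + inComposite.seen); // [a.log]
        System.out.println("composite second poll: "
                + composite(polled, Arrays.asList(oldEnough, inComposite))); // []

        AcceptOnce inChain = new AcceptOnce();
        chain(polled, Arrays.asList(tooNew, inChain));
        System.out.println("chain remembered: " + inChain.seen); // []
        System.out.println("chain second poll: "
                + chain(polled, Arrays.asList(oldEnough, inChain))); // [a.log]
    }
}
```

In the composite variant the file is recorded on the first poll even though it is not delivered, so it is never delivered later; in the chain variant the stateful filter only ever sees files that the earlier filters let through.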
I'm trying out the Spring Boot integration FTP example (http://docs.spring.io/spring-integration/reference/html/ftp.html, 15.5.1). I'm able to read files from the remote directory, but somehow the files are not deleted even with setDeleteRemoteFiles(true). Please let me know if I'm missing any flag needed to remove the remote files.
@Bean
public SessionFactory<FTPFile> ftpSessionFactory() {
    DefaultFtpSessionFactory sf = new DefaultFtpSessionFactory();
    sf.setHost("localhost");
    sf.setPort(21);
    sf.setUsername("sudaredd");
    sf.setPassword("");
    return new CachingSessionFactory<FTPFile>(sf);
}

@Bean
public FtpInboundFileSynchronizer ftpInboundFileSynchronizer() {
    FtpInboundFileSynchronizer fileSynchronizer = new FtpInboundFileSynchronizer(ftpSessionFactory());
    fileSynchronizer.setDeleteRemoteFiles(true);
    fileSynchronizer.setRemoteDirectory("");
    fileSynchronizer.setFilter(new FtpSimplePatternFileListFilter("*"));
    return fileSynchronizer;
}

@Bean
@InboundChannelAdapter(channel = "ftpChannel", poller = @Poller(fixedDelay = "5000"))
public MessageSource<File> ftpMessageSource() {
    FtpInboundFileSynchronizingMessageSource source =
            new FtpInboundFileSynchronizingMessageSource(ftpInboundFileSynchronizer());
    source.setLocalDirectory(new File("ftp-inbound"));
    source.setAutoCreateLocalDirectory(true);
    source.setLocalFilter(new AcceptOnceFileListFilter<File>());
    return source;
}
All the config looks good.
Try switching on DEBUG logging for the org.springframework.integration category to track the behavior.
OTOH, please make sure that your FTP server supports the delete operation and that your user really has permission to perform it.
Since @InboundChannelAdapter performs its task via the taskScheduler, you might just have missed an error in the logs.
It would also be great if you could debug the code and place a breakpoint in AbstractInboundFileSynchronizer.copyFileToLocalDirectory() to follow the logic and determine what may be wrong.
I'm extremely new to Spring, and even more so to Spring Integration, so apologies if this is a very basic question.
I want to build a very basic log file processor to learn the ropes. Very similar to this: example
I also want to use a Java config approach, but most of the examples I've been following are XML-driven and I'm having a hard time doing the translation.
Ultimately I'd like to recursively poll a source directory for log files and use a persistence store to keep track of what's been found.
Then, copy those files to be processed to a processing folder, and then kick off a spring batch job to process the contents of the file.
When everything completes the processed file can be deleted from the processing location.
I can't seem to figure out the proper way to wire up (using general java config of SpEL) the flow. Also, I'm still very unsure of what the proper pieces should be.
Again something along these basic, high-level lines for the file moving:
file:inbound-channel-adapter -> channel -> file:outbound-adapter
basic sample
Here is what I have so far
EDIT
I've updated with Artem's solution. My source files are now properly copied to the destination location. Thanks Artem!
Ultimately I am still facing the same problem. The files to be scanned are found immediately (and the metadata-store.properties file is populated immediately), but the files are copied to the destination folder slowly. If a crash happens, any source files that have not yet been copied to the destination folder will essentially be "lost". Perhaps I need to look at other kinds of persistent store, such as a custom JDBC filter.
@Value("${logProcessor.filenamePattern}")
private String filenamePattern;

@Value("${logProcessor.sourceDirectory}")
private String sourceDirectory;

@Value("${logProcessor.processingDirectory}")
private String processingDirectory;

@Bean
@InboundChannelAdapter(channel = "sourceFileChannel", poller = @Poller(fixedRate = "5000"))
public MessageSource<File> sourceFiles() {
    CompositeFileListFilter<File> filters = new CompositeFileListFilter<>();
    filters.addFilter(new SimplePatternFileListFilter(filenamePattern));
    filters.addFilter(persistentFilter());
    FileReadingMessageSource source = new FileReadingMessageSource();
    source.setAutoCreateDirectory(true);
    source.setDirectory(new File(sourceDirectory));
    source.setFilter(filters);
    source.setUseWatchService(true);
    return source;
}

@Bean
@InboundChannelAdapter(channel = "processingFileChannel", poller = @Poller(fixedRate = "5000"))
public MessageSource<File> processingFiles() {
    CompositeFileListFilter<File> filters = new CompositeFileListFilter<>();
    filters.addFilter(new SimplePatternFileListFilter(filenamePattern));
    FileReadingMessageSource source = new FileReadingMessageSource();
    source.setAutoCreateDirectory(true);
    source.setDirectory(new File(processingDirectory));
    source.setFilter(filters);
    return source;
}
@Bean
@ServiceActivator(inputChannel = "sourceFileChannel")
public MessageHandler fileOutboundChannelAdapter() {
    FileWritingMessageHandler adapter = new FileWritingMessageHandler(new File(processingDirectory));
    adapter.setDeleteSourceFiles(false);
    adapter.setAutoCreateDirectory(true);
    adapter.setExpectReply(false);
    return adapter;
}

@Bean
public MessageChannel sourceFileChannel() {
    return new DirectChannel();
}

@Bean
public MessageChannel processingFileChannel() {
    return new DirectChannel();
}

@Bean
public DefaultDirectoryScanner defaultDirectoryScanner() {
    return new DefaultDirectoryScanner();
}

@Bean
public FileSystemPersistentAcceptOnceFileListFilter persistentFilter() {
    FileSystemPersistentAcceptOnceFileListFilter fileSystemPersistentAcceptOnceFileListFilter =
            new FileSystemPersistentAcceptOnceFileListFilter(metadataStore(), "");
    fileSystemPersistentAcceptOnceFileListFilter.setFlushOnUpdate(true);
    return fileSystemPersistentAcceptOnceFileListFilter;
}

@Bean
public PropertiesPersistingMetadataStore metadataStore() {
    PropertiesPersistingMetadataStore metadataStore = new PropertiesPersistingMetadataStore();
    metadataStore.setBaseDirectory("C:\\root\\code\\logProcessor");
    return metadataStore;
}
Your config is good so far.
With such a complex task, I'm not sure how to help you.
You should ask a more specific question. We can't write the solution for you.
I'm not sure why you need to copy files from one directory to another if you can simply poll them from the source directory, store them in the metadataStore and start the file processing.
So far I see one small problem in your config. The FileWritingMessageHandler sends its results to the processingFileChannel, and so does the second FileReadingMessageSource. I'm not sure that is your intention; I just want to bring it to your attention.
You might also want to know about FileSplitter, which lets you process a file line by line.
Also, you say processingDirectory, but then you use tmpDir for the FileWritingMessageHandler, which, I guess, is part of your copy logic.
Let's do the task step by step! Then you can figure out what to use, where and how!
EDIT
If you just need to copy the file to the processingDirectory without any reply, you should use a one-way adapter:
@Bean
@ServiceActivator(inputChannel = "sourceFileChannel")
public MessageHandler fileOutboundChannelAdapter() {
    FileWritingMessageHandler adapter = new FileWritingMessageHandler(new File(processingDirectory));
    adapter.setDeleteSourceFiles(true);
    adapter.setAutoCreateDirectory(true);
    adapter.setExpectReply(false);
    return adapter;
}
Then your @InboundChannelAdapter(channel = "processingFileChannel") is fine for picking up the files for processing.
I'm not sure that you need setDeleteSourceFiles(true), though...
I am trying to build a spring batch application that starts a job only after a file comes into a directory. For that I need a file poller and something like the snippet found in Spring reference manual:
public class FileMessageToJobRequest {

    private Job job;

    private String fileParameterName;

    public void setFileParameterName(String fileParameterName) {
        this.fileParameterName = fileParameterName;
    }

    public void setJob(Job job) {
        this.job = job;
    }

    @Transformer
    public JobLaunchRequest toRequest(Message<File> message) {
        JobParametersBuilder jobParametersBuilder =
                new JobParametersBuilder();
        jobParametersBuilder.addString(fileParameterName,
                message.getPayload().getAbsolutePath());
        return new JobLaunchRequest(job, jobParametersBuilder.toJobParameters());
    }
}
I would like to manage everything with configuration classes, but I can't really figure out how to make it work.
Your question isn't clear. It would be better to show something that works, or at least your own PoC or attempt at the task.
Anyway, it looks like you would like to avoid XML configuration and use only Java and annotation configuration.
For this purpose I suggest you take a look at the Reference Manual, where you can also find this sample in the File Support chapter:
@Bean
@InboundChannelAdapter(value = "fileInputChannel", poller = @Poller(fixedDelay = "1000"))
public MessageSource<File> fileReadingMessageSource() {
    FileReadingMessageSource source = new FileReadingMessageSource();
    source.setDirectory(new File(INBOUND_PATH));
    source.setFilter(new SimplePatternFileListFilter("*.txt"));
    return source;
}
When I start the jar separately on a Unix machine, the task-scheduler thread stops listening after some time, although it works fine on a Windows machine. On Linux the application works at startup, but after a while it stops working. Please let me know if there is any way to avoid this issue.
@Bean
@InboundChannelAdapter(value = "inputChannel", poller = @Poller(fixedDelay = "1000", maxMessagesPerPoll = "1"))
public MessageSource<?> receive() {
    FtpInboundFileSynchronizingMessageSource messageSource = new FtpInboundFileSynchronizingMessageSource(synchronizer());
    File tempDirectory = new File(TEMP_FOLDER);
    messageSource.setLocalDirectory(tempDirectory);
    messageSource.setAutoCreateLocalDirectory(true);
    return messageSource;
}

private AbstractInboundFileSynchronizer<FTPFile> synchronizer() {
    AbstractInboundFileSynchronizer<FTPFile> fileSynchronizer = new FtpInboundFileSynchronizer(sessionFactory());
    fileSynchronizer.setRemoteDirectory(ftpFileLocation);
    fileSynchronizer.setDeleteRemoteFiles(false);
    Pattern pattern = Pattern.compile(".*\\.xml$");
    FtpRegexPatternFileListFilter ftpRegexPatternFileListFilter = new FtpRegexPatternFileListFilter(pattern);
    fileSynchronizer.setFilter(ftpRegexPatternFileListFilter);
    return fileSynchronizer;
}

@Bean(name = "sessionFactory")
public SessionFactory<FTPFile> sessionFactory() {
    DefaultFtpSessionFactory sessionFactory = new DefaultFtpSessionFactory();
    sessionFactory.setHost(ftpHostName);
    sessionFactory.setUsername(ftpUserName);
    sessionFactory.setPassword(ftpPassWord);
    return sessionFactory;
}

@Bean(name = "inputChannel")
public PollableChannel inputChannel() {
    return new QueueChannel();
}

@Bean(name = PollerMetadata.DEFAULT_POLLER)
public PollerMetadata defaultPoller() {
    PollerMetadata pollerMetadata = new PollerMetadata();
    pollerMetadata.setTrigger(new PeriodicTrigger(100));
    return pollerMetadata;
}

@ServiceActivator(inputChannel = "inputChannel")
public void transferredFilesFromFTP(File payload) {
    callWork(payload);
}
There is no reason to have one poller immediately after another. I mean, you don't need that QueueChannel.
It would be really interesting to know what that magic callWork(payload); code does. Does it do anything that blocks for a long time? Even though it is a void method (so nothing waits for a return value), it may contain code that causes thread starvation by holding on to all the threads of the default TaskScheduler (10 by default).
It looks like this is closely related to your other question, "Spring Integration ftp Thread process".
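The thread starvation mentioned above can be sketched with a plain java.util.concurrent scheduler, independent of Spring. The class SchedulerStarvationDemo and its method are hypothetical names for this illustration, and the pool size of 2 is chosen only to keep the demo small; the blocking tasks play the role of a long-running callWork(payload).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of scheduler thread starvation; not Spring's TaskScheduler.
public class SchedulerStarvationDemo {

    /**
     * Occupies the pool with tasks that block, then submits one more task.
     * Returns true if the extra task managed to run while the others were still blocked.
     */
    public static boolean extraTaskRanWhileBlocked(int poolSize, int blockingTasks)
            throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(poolSize);
        CountDownLatch release = new CountDownLatch(1);
        // Only as many blocking tasks as there are threads can actually start.
        CountDownLatch started = new CountDownLatch(Math.min(poolSize, blockingTasks));
        CountDownLatch extraRan = new CountDownLatch(1);
        try {
            for (int i = 0; i < blockingTasks; i++) {
                scheduler.execute(() -> {
                    started.countDown();
                    try {
                        release.await(); // stands in for long blocking work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            started.await();
            scheduler.execute(extraRan::countDown); // stands in for the next poll
            return extraRan.await(300, TimeUnit.MILLISECONDS);
        } finally {
            release.countDown();
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("all threads blocked, next task ran: " + extraTaskRanWhileBlocked(2, 2));
        System.out.println("one thread free, next task ran: " + extraTaskRanWhileBlocked(2, 1));
    }
}
```

When every scheduler thread is busy with blocking work, the next task just sits in the queue, which from the outside looks like the poller has stopped listening.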