Spring Integration: defining inputs for each MessageSource poll

I've spent a fair amount of time over the past few weeks learning about Spring Integration, and something I'm not seeing in the docs is how the various MessageSources prevent (or can be configured to prevent) reading duplicate inputs into an IntegrationFlow.
For instance, let's take an FTP server:
return IntegrationFlows.from("ftp-endpoint")
.handle(fileProcessor())
.get();
Let's say at one point in time the directory on the FTP server (indicated by "ftp-endpoint" above) has the following files on it:
readingDir/
file1.txt
file2.txt
file3.txt
The next time the scheduled FTP message source (ftp-endpoint) runs, it ingests all 3 files. Then some more files get added:
readingDir/
file1.txt
file2.txt
file3.txt
file4.txt
file5.txt
How do we prevent file1.txt, file2.txt and file3.txt from being read the next time the flow polls the FTP server and runs? What if we actually want it to re-run file3.txt for some reason -- how would we tell it to read file3.txt (re-run), file4.txt (new) and file5.txt (new) but skip file1.txt and file2.txt?
And this question is not just about FTP; it applies to any polling endpoint: email, Dropbox, S3, etc. Hopefully it's the same API/strategy for all of them!
The only thing I can see on the API that jumps out is to provide a SourcePollingChannelAdapterSpec on the endpoint, so:
IntegrationFlows.from("ftp-endpoint", sourcePollingChannelAdapterSpec)...
And then configure the SourcePollingChannelAdapterSpec to have a piece of Advice (setAdvice(Arrays.asList(specialAdvice))) that has logic in it, but that feels kind of clumsy (I'm not a huge fan of tag interfaces like Advice). I'll use it if that's the only solution, but there's gotta be a better way to tell each MessageSource what to consider "valid inputs" on each run!
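For illustration, here is a minimal sketch of what that Advice-based approach might look like, assuming Spring Integration 5.3+ where MessageSourceMutator exists; the class and its dedup logic are hypothetical, not part of the framework:

import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import org.springframework.integration.aop.MessageSourceMutator;
import org.springframework.integration.core.MessageSource;
import org.springframework.integration.file.FileHeaders;
import org.springframework.messaging.Message;

// Hypothetical advice: drop the poll result if the remote file name
// was seen before; returning null skips this poll entirely.
public class AcceptOnceAdvice implements MessageSourceMutator {

    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    @Override
    public Message<?> afterReceive(Message<?> result, MessageSource<?> source) {
        if (result == null) {
            return null; // nothing polled this time
        }
        String remoteFile = result.getHeaders().get(FileHeaders.REMOTE_FILE, String.class);
        return seen.add(remoteFile) ? result : null; // null == filter out
    }
}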

It is not clear what resources you investigated to learn about this stuff.
Please, take a look into official FTP Inbound Channel Adapter docs:
https://docs.spring.io/spring-integration/docs/current/reference/html/ftp.html#ftp-inbound
https://docs.spring.io/spring-integration/docs/current/reference/html/file.html#remote-persistent-flf
So, you are completely missing the fact that there is a FileListFilter strategy, with a lot of out-of-the-box implementations to consider. What you are asking about is covered by the AcceptOnceFileListFilter mentioned in the first doc. It holds a reference to each seen file in order to skip it on the next poll. If you'd like to re-fetch some old file, you can just call AcceptOnceFileListFilter.remove(). Or look into the FtpPersistentAcceptOnceFileListFilter, which tracks not just the file name but also its timestamp. So, if the content of a remote file is updated, its timestamp changes, and this filter will treat the file as a new one and fetch it again.
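To make that concrete, here is a minimal sketch with the Java DSL (the directories and the in-memory metadata store are assumptions for illustration; ftpSessionFactory() and fileProcessor() are presumed to be beans defined elsewhere):

import java.io.File;
import org.springframework.context.annotation.Bean;
import org.springframework.integration.dsl.IntegrationFlow;
import org.springframework.integration.dsl.IntegrationFlows;
import org.springframework.integration.dsl.Pollers;
import org.springframework.integration.ftp.dsl.Ftp;
import org.springframework.integration.ftp.filters.FtpPersistentAcceptOnceFileListFilter;
import org.springframework.integration.metadata.SimpleMetadataStore;

@Bean
public IntegrationFlow ftpFlow() {
    // In-memory store for illustration only; use a persistent
    // MetadataStore (JDBC, Redis, ...) to survive restarts.
    FtpPersistentAcceptOnceFileListFilter filter =
            new FtpPersistentAcceptOnceFileListFilter(new SimpleMetadataStore(), "ftp-");

    return IntegrationFlows
            .from(Ftp.inboundAdapter(ftpSessionFactory())
                            .remoteDirectory("readingDir")
                            .filter(filter)                 // skips files already seen
                            .localDirectory(new File("local-dir")),
                    e -> e.poller(Pollers.fixedDelay(5000)))
            .handle(fileProcessor())
            .get();
}

Since this filter is also a ResettableFileListFilter, calling its remove() method with the remote file entry makes file3.txt eligible to be fetched again on the next poll.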

Related

PTC Integrity batch update member revision

Is there a way to update the member revision of a big list of files via command line?
I can't use :working or :head but have to specify a different revision for each file.
As far as I know --selectionFile only takes paths as input, but not the revision numbers.
edit: I wanted to set the member revision for a very big list of files, and I wanted to avoid writing the command si updaterevision ... for every file, as it takes ages to complete for that many files. Instead I wanted to know if there is a more advanced method to specify a list of files and their revisions, to be able to run updaterevision only once (as it is with :working) for the whole list of files.
But as was said in the comment, there is no such possibility.
edit2: I have used MKS for a couple of years now, and as I now know, there is no possibility (at least up to MKS 11.6) to update many files to different revisions with one single command-line call. But using one call per member, as was proposed, made the whole operation take up to several hours, as I had many thousands of members in the sandbox and MKS needs some time to complete each si command.
Some time has already passed since you asked this question; here is my comment in case it could still be useful for you in the future.
First, it is not completely clear what you want to achieve. Please be more descriptive and, if possible, provide an example.
What I understand so far is that you need to set a bunch of listed files as the member revision through the command line. This is fairly simple; the most complicated part is actually having the list of files to be updated and the revision you want to set as member for each.
I recommend creating a batch file with the command to set each file's member revision. You can use a regex to build it very quickly and without much trouble.
Here is an example for updating one file member revision:
si updaterevision --hostname=servername --port=portnumber --user=username --changepackageid=5873763:2 --revision=:working myfile_a1.c
where
servername = the name of the server where your sandbox is located
portnumber = the port that provides access to the server for your sandbox
username = your login user id
changepackageid = change the number here to use your defined TASK:ChangePackage for these changes
revision = if you have a working revision that you want to become the member revision, just use ":working"; otherwise you can give a specific revision number, e.g. --revision=1.2
At the end you give the name of the file you want to update.
Go to your sandbox root folder, open a CMD window, and run the batch file. It will execute each line, applying your changes.
If you have a list of files with the revision you want as member, you can use a regex to convert it into a batch file.
Example list of files in text file:
file1.c 1.10
file3.c 1.19
sec_file1.c 1.1.2.1
support.h 1.7
Use Notepad++ or another text editor with regex support to do a simple search and replace:
Search = ([\w.]+)\s+([\d.]+).*
Replace = si updaterevision --hostname=servername --port=portnum --user=userid --changepackageid=6123933:4 --revision=\2 \1
\1 => FileName
\2 => File revision
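For the sample list above, the search and replace would generate lines like these (server name, port, user, and change package are the placeholders from the example command):
si updaterevision --hostname=servername --port=portnum --user=userid --changepackageid=6123933:4 --revision=1.10 file1.c
si updaterevision --hostname=servername --port=portnum --user=userid --changepackageid=6123933:4 --revision=1.19 file3.c
si updaterevision --hostname=servername --port=portnum --user=userid --changepackageid=6123933:4 --revision=1.1.2.1 sec_file1.c
si updaterevision --hostname=servername --port=portnum --user=userid --changepackageid=6123933:4 --revision=1.7 support.h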
Finally, just save the document as a batch file and run it.
Just speculating: if you have a large list of members along with the member revisions you want to update to, then you also have a sandbox that served to generate this list.
If so, my approach would be:
c:\MySandbox> si updaterevision --recurse --revision=:working
If your member/revision list comes from a development path, you could first have a sandbox targeting that devpath, resync, (close the sandbox if it is opened in the GUI), retarget the sandbox to the destination devpath (or mainline) you want, and then issue the command above.
For a single-member approach, I would use 'si rlog' to generate a list of si commands directly:
si rlog -R --noheaderformat --notrailerformat --revision=:working --format="si updaterevision {membername} --revision={revision}\r\n" > updaterevs.bat.txt
Review updaterevs.bat.txt, rename it to updaterevs.bat, and execute it.
(Be careful if using it on other sandboxes)
Other interesting reading here might be the "snapshot sandbox" feature, checkpointing in general, and variants and devpaths.
Using only these features might be more politically correct within the philosophy of Integrity.

How to retrieve files generated in the past 120 minutes in Linux and move them to another location

For one of my projects, I have a challenge where I need to take all the reports generated in a certain path, and I want this to be an automated process in Linux. I know how to get the names of files that have been updated in the past 120 minutes, but not the files themselves. My requirements are as follows:
1. Take the files that have been updated in the past 120 minutes from the path
/source/folder/which/contains/files
2. Do some business logic on these generated files (which I can take care of)
3. Move these files to
/destination/folder/where/files/should/go
I know how to achieve #2 and #3, but I'm not sure about #1. Can someone help me achieve this?
Thanks in advance.
Write a shell script; sample below. I haven't provided the command to get the actual list of file names, as you said you know how to do that.
#!/bin/sh
files=<my file list>
for file in $files; do
    cp "$file" <destination_directory>
done
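For #1 specifically, here is a minimal sketch assuming GNU find (its -mmin test selects by modification time in minutes), using the paths from the question:

#!/bin/sh
# Copy every regular file modified in the last 120 minutes.
find /source/folder/which/contains/files -type f -mmin -120 \
    -exec cp {} /destination/folder/where/files/should/go \;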

Unload a file from a ftp and rename it in host

I have one file delivered via FTP daily. This file doesn't have the same name every day. It carries the date and the time of its creation. For example, today the file is named 20130814_XX_YY_20130814152345, created at 15:23:45, and tomorrow the file can be named 20130815_XX_YY_20130815152421. The _XX_YY_ part is always the same, but the timestamp changes every day.
I want to create host JCL that gets this file with a variable name and renames it to a host file. How can I do this?
Thank you
Regards
Chuchito
STEP1: You can use LS in FTP to write a listing to disk, so you can have a file with the file name in it. Then GET that file.
STEP2: Process the contents of that file to generate the FTP control cards (at least for the GET). The generated GET will be of the form GET 20130814_XX_YY_20130814152345 'HLQ.MAINFRAM.DATASET', where the remote file name comes from the file GETted in STEP1, and the local (mainframe) file can be hard-coded, or supplied to the generation if flexibility is required.
STEP3: Run FTP again with the Control Card(s) generated.
Isn't there anything in the Spec?
Sometimes we create complexities where an "out of the box" solution simplifies life considerably.
After the post was updated, I now understand the problem a bit better.
If the name is required to be so specific, then the other suggested solution (if I understand it) is to have a fixed file name on the server that contains a list of the file names to be uploaded.
In fact, the server could create a fixed file name that is really the JCL to run on the mainframe!!! This file would include the //SYSIN DD * and GET commands! The mainframe uploads this file and submits it as-is to the job reader, which then runs on the mainframe. The last step of this job (created by the server, but run on the mainframe) is to FTP an empty JCL file back to the server; this way the server "knows" that the mainframe has uploaded the files.
Alternatively, why does the non-z/OS system need to name the file with time information? If the mainframe processes the file daily, then the date should be sufficient.
With this change the mainframe can reliably predict the file name for the day, generate the appropriate GET command and run.
With a job scheduler it would be easy to run the upload to the mainframe twice a day. This might address any concerns that are expressed in the desire to include a time in the file's name.
Run a Rexx step via a background TSO step.
You can then run a LISTCAT to get all the files. You could either write the LISTCAT output to a file and read it in, or trap the output via the ADDRESS command or the OUTTRAP function. Then use the standard TSO RENAME command.
Alternatively, you could run a background ISPF Rexx program and use the ISPF equivalents to get the file name.
(1) The real solution to this should be through a scheduling tool for Mainframe jobs. These tools provide capabilities to take care of formatting like the one you described.
(2) Alternatives: REXX and COBOL
(3) If you don't prefer REXX, here's a brief outline of how you could create the JCL dynamically using COBOL:
A COBOL program that would read a "template" JCL.
Using INSPECT / REPLACE, you could substitute the prototypes with the string that is populated with the date of your choice (you could supply this as a simple SYSIN parm too, if you want the COBOL code to be flexible on the date selection)
Now that your formatted JCL is ready, you could write it to the output stream
//OUTFILE DD SYSOUT=(A,INTRDR)
or
//OUTFILE DD SYSOUT=(,INTRDR)
Anything that is written to INTRDR (the internal reader) goes straight to JES to submit your job!
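A hedged sketch of the INSPECT/REPLACING substitution (the placeholder, field names, and files are hypothetical; note that REPLACING ALL needs operands of equal length):

       01  WS-DATE8    PIC X(8).      *> e.g. '20130815'
       01  JCL-LINE    PIC X(80).     *> one line of the template JCL
      *    Swap the 8-byte placeholder 'YYYYMMDD' for today's date,
      *    then write the line to OUTFILE (the INTRDR DD above).
           INSPECT JCL-LINE REPLACING ALL 'YYYYMMDD' BY WS-DATE8
           WRITE OUT-REC FROM JCL-LINE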
Hope this helps.

In bash, how to split or merge a file between different machines?

Actually, I have a server and many clients. Each client has its own config file. The server gets all the clients' config files, merges them into one file, and then works with it. Sometimes a client will get the server's merged file, modify its part, and send it back.
Well, I have tried one way to do this.
Each client keeps its own split config file and updates it. When a client modifies its piece, it gets all the clients' pieces and merges them into one file, then sends this file to the server.
cat split.1 >> all.config
cat split.2 >> all.config
cat split.3 >> all.config
When a client syncs the config file, it gets all.config from the server, then splits it into pieces by a "split flag"...
I think it's a really stupid way, but it works. Are there any more beautiful ways to meet this need?
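For what it's worth, a minimal sketch of the split side, assuming GNU csplit and that each piece in all.config is preceded by a marker line such as "### client-1 ###" (the marker format is an assumption):

#!/bin/sh
# Split all.config into split.00, split.01, ... at each marker line.
# -z drops empty leading pieces; '{*}' repeats the pattern to the end.
csplit -z -f split. all.config '/^### client-/' '{*}'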

Can TortoiseSVN provide a cross-repository view of user activity?

Is there a way I can see my commit history for a given time period across multiple repositories using TortoiseSVN? It would be nice to be able to see this, and it's a little cumbersome to get my complete commit history if I'm working in multiple repositories.
If you're not going to rule out the svn.exe client, you could do:
svn log <path_to_repo> -r1:head -q | find "william_leara" >> c:\my_commits.txt
Do this for every repository, and "my_commits.txt" will contain your commits from every repository. If you don't have an obscene number of repositories, it's not a big deal. Further example:
:: dump my commits
svn log http://<server>/<path1> -r1:head -q | find "william_leara" >> c:\my_commits.txt
svn log http://<server>/<path2> -r1:head -q | find "william_leara" >> c:\my_commits.txt
svn log file:///c:/src/myrepo -r1:head -q | find "william_leara" >> c:\my_commits.txt
. . . I think you get the idea. Of course you can edit the range as necessary, or write a batch file that accepts arguments to specify repository/range/user, whatever.
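For instance, a small sketch of such a batch file (the repository URLs are placeholders; pass the user name and output file as arguments):

@echo off
rem Usage: my_commits.bat <username> <output-file>
for %%R in (http://server/path1 http://server/path2 file:///c:/src/myrepo) do (
    svn log %%R -r1:HEAD -q | find "%1" >> %2
)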
The only way to get something like a cross-repository view is to use the Settings menu, then Log Caching -> Cached Repositories. This lets you view SVN repository statistics (actually related to local usage of the particular repository) via Details, and export repository data as a set of files: [filename].changes.csv, [filename].merges.csv, [filename].paths.csv, [filename].revisions.csv, etc. The last of these is most probably the one you are interested in. I think it could easily be processed, for example with Perl, to produce a commit history for a given period in the form you need.
