How to know whether a file was changed on the same server or transferred from another server in ASP.NET - FileSystemWatcher

We have a web-farm scenario in which FileSystemWatcher is used to get notified of changes to files. When a file is changed or created on one server, the change gets noticed and is transferred to the other servers in the web farm. The transferred files then raise the changed event on those servers and get synced back to the originating server, which creates a redundant sync operation. We want to sync changes only if they happened on the server itself, not changes transferred from other servers. How could this be done?

Have the servers monitor a temporary directory where uploaded files get stored.
Let the FileSystemWatcher sync with the other servers at the same time it moves the file to the intended working directory.
The sync sends the file to the other servers' working directories, of course, thus bypassing their FileSystemWatchers. Voila!
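A minimal sketch of that pattern, assuming hypothetical /srv/uploads and /srv/www directories and a syncToPeers helper; it is written with Java's java.nio.file.WatchService for consistency with the rest of this page, but the same flow applies to .NET's FileSystemWatcher:

import java.nio.file.*;

public class UploadStagingWatcher {

    // Hypothetical layout: clients upload into a staging directory and only
    // this watcher moves files into the real working directory.
    private static final Path UPLOADS = Paths.get("/srv/uploads");
    private static final Path WORKING = Paths.get("/srv/www");

    public static void main(String[] args) throws Exception {
        WatchService watcher = FileSystems.getDefault().newWatchService();
        UPLOADS.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);

        while (true) {
            WatchKey key = watcher.take();
            for (WatchEvent<?> event : key.pollEvents()) {
                Path name = (Path) event.context();
                Path staged = UPLOADS.resolve(name);
                // Push the file straight into the OTHER servers' working
                // directories (rsync, HTTP, whatever); their watchers only
                // observe the staging directory, so they never fire for it.
                syncToPeers(staged);
                // Then move it into this server's own working directory.
                Files.move(staged, WORKING.resolve(name),
                        StandardCopyOption.REPLACE_EXISTING);
            }
            key.reset();
        }
    }

    private static void syncToPeers(Path file) {
        // Hypothetical helper: transfer the file to each peer's working dir.
    }
}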

Related

Spring Integration: Inbound File Adapter drops files when service restarts

We're using the S3InboundFileSynchronizingMessageSource feature of Spring Integration to locally sync and then send messages for any files retrieved from an S3 bucket.
Before syncing, we apply a couple of S3PersistentAcceptOnceFileListFilter filters (to check the file's TimeModified and Hash/ETag) to make sure we only sync "new" files.
Note: We use the JdbcMetadataStore table to persist the record of the files that have previously made it through the filters (using a different REGION for each filter).
Finally, for the S3InboundFileSynchronizingMessageSource local filter, we have a FileSystemPersistentAcceptOnceFileListFilter -- again on TimeModified and again persisted, but in a different region.
The issue is: if the service is restarted after the file has made it through the 1st filter but before the message source successfully sent the message along, we essentially drop the file and never actually process it.
What are we doing wrong? How can we avoid this "dropped file" issue?
I assume you use a FileSystemPersistentAcceptOnceFileListFilter for the localFilter since S3PersistentAcceptOnceFileListFilter is not going to work there.
Let's see how you use those filters in the configuration! I wonder if switching to the ChainFileListFilter for your remote files helps you somehow.
See docs: https://docs.spring.io/spring-integration/docs/current/reference/html/file.html#file-reading
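For illustration, combining the two remote filters into one chain might look like this (a sketch only: metadataStore and synchronizer stand for your existing beans, the region prefixes are invented, and the S3ObjectSummary type parameter depends on your spring-integration-aws version):

// Both filters share one JdbcMetadataStore, each with its own region prefix.
ChainFileListFilter<S3ObjectSummary> remoteFilter = new ChainFileListFilter<>();
remoteFilter.addFilter(new S3PersistentAcceptOnceFileListFilter(metadataStore, "s3-modified-"));
remoteFilter.addFilter(new S3PersistentAcceptOnceFileListFilter(metadataStore, "s3-etag-"));
// synchronizer is the S3InboundFileSynchronizer behind the message source
synchronizer.setFilter(remoteFilter);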
EDIT
if the service is restarted after the file has made it through the 1st filter but before the message source successfully sent the message along
I think Gary is right: you need a transaction around that polling operation which includes filter logic as well.
See docs: https://docs.spring.io/spring-integration/docs/current/reference/html/jdbc.html#jdbc-metadata-store
This way the TX is not going to be committed until the message for a file leaves the polling channel adapter. Therefore, after a restart, you will simply be able to synchronize the rolled-back files again, because they are no longer present in the store used for filtering.
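A sketch of what that could look like with the Java DSL (bean names, the fixed delay, and the downstream endpoint are placeholders; the key point is that the DataSourceTransactionManager shares its DataSource with the JdbcMetadataStore):

@Bean
public IntegrationFlow s3FileFlow(S3InboundFileSynchronizingMessageSource s3Source,
                                  DataSourceTransactionManager txManager) {
    return IntegrationFlows
            .from(s3Source,
                    e -> e.poller(Pollers.fixedDelay(5000)
                            // one transaction around the whole poll: the
                            // metadata-store inserts made by the filters only
                            // commit once the message has left the adapter
                            .transactional(txManager)))
            .handle("fileHandler", "process") // placeholder endpoint
            .get();
}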

Is it possible to turn off lock on the file set up by SFTP Session Factory

I'm struggling with the cached SFTP session factory. Namely, I suffered from sessions being unavailable because I used too many in my application. Currently I have one default, non-cached session, which writes files to the SFTP server but sets up locks on them, so they can't be read by any other user. I'd like to avoid that; ideally, the lock would be released after each single file is uploaded. Is that possible?
Test structure
Start polling adapter
Upload file to remote
Check whether files are uploaded
Stop polling adapter
Clean up remote
When you deal with data transferred over the network, you need to be sure that you release the resources you use to do that. For example, be sure to close the InputStream after sending data to the SFTP server. It is really not the framework's responsibility to close it automatically. Moreover, you might hand over not an InputStream but just a plain byte[] instead. That is the only reason I can think of for the locking-like behavior.
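For example, a sketch of an upload that closes both the stream and the session deterministically (sessionFactory, localFile, and the remote path are placeholders; Session.write() is Spring Integration's remote-session API):

// try-with-resources guarantees the handles are released even on failure;
// a stream left open can look exactly like a lock on the remote file
try (Session<?> session = sessionFactory.getSession();
     InputStream in = Files.newInputStream(localFile)) {
    session.write(in, "/remote/out/" + localFile.getFileName());
}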

Azure Logic App: Why FTP Connector delays to Identify files over FTP connector?

I have been working with the FTP connector in my Azure Logic App for unzipping files on an FTP server from a Source to a Destination folder.
I have configured the FTP connector to trigger whenever a file is added to the Source folder.
The problem I face is the delay before the connector triggers.
Once I add the zip file to the Source folder, it takes around 1 minute for the Azure FTP connector to identify and pick up the file over FTP.
To identify whether the issue is with the Azure FTP connector or the FTP server, I tried using Blob storage instead of the FTP server, and the connector was triggered within a second!
What I understand from this is that the delay comes from the FTP side, or from the way the FTP connector communicates with the FTP server.
Can anyone point out the areas of optimization here? What changes can I make to minimize this delay?
I also noticed this behaviour of the FTP trigger and found the reason for the delay in the FTP Trigger doco here:
https://learn.microsoft.com/en-us/azure/connectors/connectors-create-api-ftp#how-ftp-triggers-work
...when a trigger finds a new file, the trigger checks that the new file is complete, and not partially written. For example, a file might have changes in progress when the trigger checks the file server. To avoid returning a partially written file, the trigger notes the timestamp for the file that has recent changes, but doesn't immediately return that file. The trigger returns the file only when polling the server again. Sometimes, this behavior might cause a delay that is up to twice the trigger's polling interval.
Firstly, you need to know that the Logic App file trigger is different from a Function: mostly, it won't fire immediately. When you set up the trigger you will find it needs an interval; even if a file is already there, it won't fire until the interval elapses.
Then it's about how the FTP trigger works: when it triggers the Logic App, if you check the trigger history you will find multiple Succeeded entries but only one fired run, with about a 2-minute delay. For the reason, check the connector reference: How FTP triggers work. There is a description of this behavior.
The trigger returns the file only when polling the server again. Sometimes, this behavior might cause a delay that is up to twice the trigger's polling interval.
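The two-poll completeness check is easy to reproduce in a sketch (illustrative only, not the connector's actual code): a file is returned only when its timestamp is unchanged between two consecutive polls, hence the delay of up to twice the polling interval.

import java.io.IOException;
import java.nio.file.*;
import java.nio.file.attribute.FileTime;
import java.util.*;

class CompletenessCheckingPoller {

    // Remember each file's last-modified time per poll; return only files
    // whose timestamp is unchanged since the previous poll. (Bookkeeping
    // for files already returned is omitted for brevity.)
    private final Map<Path, FileTime> lastSeen = new HashMap<>();

    List<Path> poll(Path dir) throws IOException {
        List<Path> complete = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path f : files) {
                FileTime mtime = Files.getLastModifiedTime(f);
                FileTime previous = lastSeen.put(f, mtime);
                if (mtime.equals(previous)) {
                    complete.add(f); // stable across two polls => complete
                }
            }
        }
        return complete;
    }
}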

File locking was not happening on either server1 or server2 using the Spring poller

I have implemented the Spring poller in my application. My application runs on two servers, and both servers point to the same inbound folder. Once I place multiple files, say 10-15 txt/pdf/xlsx files, in the inbound folder, both servers (both JVMs) try to pick up the same file: one server processes the file and moves it from inbound to inprocess, and when the other server tries to do the same, it throws a FileNotFoundException.
Is there any way to lock the file so the other server is not able to read the same file? Or is there any other solution to fix this issue?
Thanks in advance.
The <int-file:inbound-channel-adapter> can be supplied with the <locker> sub-element. See Reference Manual:
When multiple processes are reading from the same directory it can be desirable to lock files to prevent them from being picked up concurrently. To do this you can use a FileLocker. There is a java.nio based implementation available out of the box, but it is also possible to implement your own locking scheme. The nio locker can be injected as follows
<int-file:inbound-channel-adapter id="filesIn"
        directory="file:${input.directory}" prevent-duplicates="true">
    <int-file:nio-locker/>
</int-file:inbound-channel-adapter>
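The equivalent in Java configuration is a one-liner on the message source (a sketch; the channel name, directory, and poller interval are placeholders):

@Bean
@InboundChannelAdapter(channel = "files", poller = @Poller(fixedDelay = "1000"))
public FileReadingMessageSource filesIn() {
    FileReadingMessageSource source = new FileReadingMessageSource();
    source.setDirectory(new File("/shared/inbound")); // placeholder path
    // java.nio-based locker: a file locked by one JVM is skipped by the other
    source.setLocker(new NioFileLocker());
    return source;
}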

Designing a message processing system

I have been asked to create a message processing system as described below. As I am not sure if this is the right place to post this, feel free to move it to any other appropriate SC group.
Problem
The server has about 100 to 500 clients connected at any moment. When a client connects, the server loads part of their data and caches it in memory for faster access. The server will receive between 200~1000 messages per second across all clients; these messages are relatively small (about 500 bytes). Any change to data in the cache should be saved to disk as soon as possible. When a client disconnects, all their data is saved to disk and removed from the cache. Each message contains some instruction and a text message which will be saved as a file. Instructions should be executed as fast as possible (near instant), and all clients using that file should get the update. Only writing the modified message to disk can be delayed.
Here is my solution in a diagram
My solution consists of a web server (HTTP or socket), a message queue, and two or more instances of a file server and an instruction server.
The web server grabs client messages and, if there is a message available for a client in the message queue, pushes it back to that client.
The instruction processor grabs instructions from the queue, creates the necessary message to be processed by the file server (get/set file), waits for the file to be available in the queue, and does further processing to create another message for the client.
The file server only provides the files, either from cache or from a physical file, depending on the type of file.
Concerns:
There are peak times when the total of connected clients might go over 10,000 at once and the total of messages received from clients increases to 10~15K.
I should be able to clear the queue and go back to the normal state as soon as possible (while still processing requests, obviously).
I should be able to add extra instruction processors and file servers on the fly without having to shut down the other instances.
In case the file server crashes, it shouldn't lose files, so it has to write files to disk as soon as there are any changes and processing time is available.
The file system should be in B+ tree format so some applications (local reporting apps) can easily access files without having to go through the queue server.
My Solution
I am thinking of using Node.js for the socket/web server, maybe a NoSQL database for the file server, and a queue server such as RabbitMQ or Node_Redis with Redis.
Questions:
Is there a better way of structuring this system?
What are my other options for components of this system?
Is it possible to run all the instances on the same server machine, or even in the same application (in different threads)?
You have a couple of holes here, mostly around the web server "pushing" the message back to the client. That doesn't really work in a web-based world. You can try and use websockets, but generally, this ends up being polling based.
I don't know what the "instructions" to be executed are, but saving 1000 500-byte messages per second is trivial. Many NoSQL solutions boast million-plus writes-per-second capacity, especially if you let the commit to disk lag.
Don't bother with the queue for the return of the file. A good NoSQL solution will scale better. Build out a Cassandra cluster, load test it until it can handle your peak load.
This simplifies your architecture to one or more web servers, clients polling those servers for file updates, a queue for submitting "messages" to the "instruction server" (also known as an application server in web-developer terms), and a NoSQL database for the instruction server to write files to.
This makes scaling easy, you can always add more web servers, and with a decent cluster size for your no-sql server, you should get to scale horizontally there as well. Your only real bottleneck is your instruction server queue, which you could always throw more instruction servers at.
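As a sketch of the instruction-server side of that revised architecture (the queue name, table, and hosts are invented; RabbitMQ's Java client plus the DataStax Cassandra driver are just one plausible pairing):

import java.nio.charset.StandardCharsets;
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class InstructionWorker {

    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("queue-host"); // placeholder
        CqlSession cassandra = CqlSession.builder().build(); // localhost by default
        PreparedStatement insert = cassandra.prepare(
                "INSERT INTO app.messages (client_id, body) VALUES (?, ?)");

        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {
            channel.queueDeclare("instructions", true, false, false, null);
            channel.basicConsume("instructions", false, (tag, delivery) -> {
                String clientId = delivery.getProperties().getCorrelationId();
                String body = new String(delivery.getBody(), StandardCharsets.UTF_8);
                // Persist first, ack second: a crash before the ack just
                // redelivers the message, so nothing is lost.
                cassandra.execute(insert.bind(clientId, body));
                channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
            }, tag -> { });
            Thread.currentThread().join(); // keep the worker alive
        }
    }
}

Scaling out the instruction tier is then just a matter of starting more copies of this worker against the same queue.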
