Monitor directory for file changes on Linux

I have a directory with files. An application is going to scan these files and then in some way mark each file as scanned. Then I want to get a notification that a file was scanned and delete it.
How can the application mark the file as scanned?
Regular attributes are not suited for me because, for example, the file could be read by someone and that doesn't mean it was scanned. How can I get a notification about scanned files?
Thank you.

You can use inotify (manpage) to get notified about changes. You get notified only once, so there is no need to mark things as 'notified'.
An example is given here.
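To illustrate the idea, here is a minimal sketch of an inotify watcher; the directory path and the event mask (IN_ATTRIB | IN_CLOSE_WRITE) are assumptions you would adapt to however the scanner ends up marking files:

    /* Minimal inotify watcher (sketch): prints events for files in WATCH_DIR. */
    #include <stdio.h>
    #include <sys/inotify.h>
    #include <unistd.h>

    #define WATCH_DIR "/path/to/dir"   /* hypothetical directory */

    int main(void) {
        int fd = inotify_init1(0);
        if (fd < 0) { perror("inotify_init1"); return 1; }

        /* IN_ATTRIB: metadata changed; IN_CLOSE_WRITE: file written and closed. */
        int wd = inotify_add_watch(fd, WATCH_DIR, IN_ATTRIB | IN_CLOSE_WRITE);
        if (wd < 0) { perror("inotify_add_watch"); return 1; }

        char buf[4096] __attribute__((aligned(8)));
        for (;;) {
            ssize_t len = read(fd, buf, sizeof buf);   /* blocks until events arrive */
            if (len <= 0) break;
            for (char *p = buf; p < buf + len; ) {
                struct inotify_event *ev = (struct inotify_event *)p;
                if (ev->len > 0)
                    printf("event 0x%x on %s\n", ev->mask, ev->name);
                p += sizeof(struct inotify_event) + ev->len;
            }
        }
        close(fd);
        return 0;
    }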

The application that scans the files should keep a record of which files it has already scanned. Either a text file or a small database...
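As a rough illustration of the text-file variant (the log path and the one-path-per-line format are made up for this sketch; a small database would work the same way conceptually):

    /* Sketch: record scanned files in a plain text log, one path per line. */
    #include <stdio.h>
    #include <string.h>

    #define SCANNED_LOG "/var/tmp/scanned.log"   /* hypothetical */

    /* Append a path to the log after scanning it. */
    void mark_scanned(const char *path) {
        FILE *f = fopen(SCANNED_LOG, "a");
        if (!f) return;
        fprintf(f, "%s\n", path);
        fclose(f);
    }

    /* Return 1 if the path already appears in the log, 0 otherwise. */
    int already_scanned(const char *path) {
        char line[4096];
        FILE *f = fopen(SCANNED_LOG, "r");
        if (!f) return 0;
        while (fgets(line, sizeof line, f)) {
            line[strcspn(line, "\n")] = '\0';   /* strip trailing newline */
            if (strcmp(line, path) == 0) { fclose(f); return 1; }
        }
        fclose(f);
        return 0;
    }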

I would recommend not going for a "file system" solution; it can be pretty complicated and buggy.
How about having the scanning service send a message to the deletion service after each file that was scanned?
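If you go that route, one simple message channel on Linux is a named pipe. A sketch of both ends is below; the pipe path is hypothetical, and a real service would reopen the pipe and handle errors more carefully:

    /* Sketch of the message-passing idea: the scanner writes each scanned path into a
     * named pipe, and the deletion service reads one path per line and removes the file.
     * Create the pipe once with: mkfifo /tmp/scanned.pipe */
    #include <stdio.h>
    #include <string.h>

    #define SCAN_PIPE "/tmp/scanned.pipe"   /* hypothetical */

    /* Scanner side: called after one file has been scanned. */
    int notify_scanned(const char *path) {
        FILE *pipe = fopen(SCAN_PIPE, "a");   /* blocks until the reader has the pipe open */
        if (!pipe) return -1;
        fprintf(pipe, "%s\n", path);
        fclose(pipe);
        return 0;
    }

    /* Deletion-service side: read paths and delete them.  When all writers close the
     * pipe, fgets() sees EOF; a real service would reopen the pipe and keep going. */
    void deletion_loop(void) {
        char path[4096];
        FILE *pipe = fopen(SCAN_PIPE, "r");   /* blocks until a writer opens the pipe */
        if (!pipe) return;
        while (fgets(path, sizeof path, pipe)) {
            path[strcspn(path, "\n")] = '\0';   /* strip trailing newline */
            if (remove(path) != 0)
                perror(path);
        }
        fclose(pipe);
    }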

Related

How can I completely delete a file that was uploaded to Gitlab in an issue comment?

Someone uploaded (attached) a file in a Gitlab issue comment. They did not mean to share that file publicly. I can delete the comment, but the file is still available via the original direct url. The file is at:
https://gitlab.com/<username>/<repo>/uploads/<hash>/<filename>
Is there any way to completely remove files from this uploads directory?
Short version: there's the server-side Uploads administration | GitLab page, but little to nothing else.
TLDR:
For the owner of a repository, there seems to be no way to get hold of these uploads directly; there doesn't even seem to be a way to list all uploads pertaining to a specific repository (or user/owner), let alone modify them.
Use cases where this would be desirable:
deletion of data that should not have been exposed but was, erroneously
down-scaling of oversized files (images, PDFs, etc.)
replacing files with updated versions
deleting space-hogs that are no longer needed
deleting files that got uploaded accidentally by trigger-happy mice, or when the result of a previous upload didn't show up in time for the impatient user
Making these files changeable would cause several issues rooted in their current/previous immutable status:
Users aware of this status will frequently re-use the url to an already uploaded file for perusal in other issues, or the associated wiki (even across projects) to avoid duplication. Afaik, there is no such thing as a link-count for upload items, so deleting an item might result in orphaned references, and changing an uploaded file might render other references out-of-context.
It would solve the serious issue of leaked information, though. The only way I have found so far to remove a file would be to send a prayer to the administrator of the GitLab server and ask him/her to take care of the uploads directory on the server, as described in Uploads administration | GitLab.

Setting up a trigger to watch new folders in Azure Logic Apps

I am trying to create a logic app that will transfer files from my FTP server to my Azure file share as they are created. The folder my trigger is watching is structured by date (see below). Each day that a file is added, a new folder is created, so I need the trigger to check new subfolders, but I don't want to go into the app every day to change which folder the trigger looks at. Is this possible?
Here's how my folder (called DATA) is structured; each day that a file is added, a new folder is created.
-DATA-
2016-10-01
2016-10-02
2016-10-03
...
The FTP Connector uses configurable polling, where you set how often it should look for a file. The trigger currently does not support dynamic folders. However, what you could try is the following:
Trigger your logic app by recurrence (same principle as the FTP trigger in fact)
Action: Create a variable to store the current date/time in the format used in your folder naming
Action: Do a 'List files in folder' (here you should be able to set the folder name dynamically using the variable you created; see the expression sketch below this answer)
For-each file in folder
Action: Get File Content
Whatever you need to do with the file (calling a nested logic app is smart if you need to run multiple processing actions on each file or handle resubmits of the flow per file)
In order to avoid picking up every file on each run, you will need to find a way to exclude files which have been processed in an earlier run. So either rename the file after it's processed to an extension you can exclude in the next run, or move the file to a subfolder "Processed\datetime" in the root.
This solution will require more actions and thus will be more expensive. I haven't tried it out, but I think this should work. Or at least it's the approach I would try to set up.
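If it helps, the dynamic folder path for the 'List files in folder' step could be built with a workflow expression along these lines (a sketch assuming the DATA root and the yyyy-MM-dd naming shown in the question):

    concat('/DATA/', formatDateTime(utcNow(), 'yyyy-MM-dd'))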
Unfortunately, what you're asking is not possible with the current FTP Connector. And there aren't any really great solutions right now... :(
As an aside, I've seen this pattern several times and, as you are seeing, it just causes more problems than it could solve, which realistically is 0. :)
If you own the FTP Server, the best thing to do is put the files in one folder.
If you do not own the FTP Server, politely mention to the owner that this pattern is causing problems and doesn't help you in any way, so please put the files in one folder ;)

How persistent is data I put on my Azure WebApp via FTP?

I've been searching around and can't find any clear answers to this. I need a small amount of data - talking kilobytes, probably not ever reaching megabyte range - available as a file on my Azure instance, outside the web app itself, for a web job to work with. I won't get into why this is necessary, but it is (alternatives have been explored), and the question is now where to put those files. The obvious answer seems to be to connect to the FTP, create a directory, plop them there and work with them there.
I did a quick test and I'm able to create a "downloads" directory within the "data" directory, drop some files in it, and work with them there. It works great for this very small, simple need that I have.
How long will that data stay there? Is that directory purged at any point automatically by the servers? Is that directory part of any backups that are maintained? How "safe" is something I manually put outside the wwwroot folder?
It will never be purged. The only folder that can get purged is the %TEMP% folder. All other folders that you have write access to will be persisted forever.

Perforce deleting a file from workspace and reflecting that in Perforce

Is it possible to delete a file from your workspace, then hit submit in Perforce, and have that file deleted from the Perforce server?
open for read: F:\LocalSource\Perforce\MainBranch\blah\New Text Document.txt: The system cannot find the file specified.
Submit aborted -- fix problems then use 'p4 submit -c 4799463'.
Some file(s) could not be transferred from client.
I get this message when I try to submit. In Subversion I could do this. I had a look on the internet and it looks like this isn't possible, but I thought I'd check on here.
(The reason I want this is that I have a spreadsheet and I want to extract the modules from the spreadsheet and put them into source control. But sometimes modules in that spreadsheet may be removed, and I want to be able to just check in the modules that changed and do the deletions on the server, without having to go into the Perforce client and delete the files marked for deletion there.) One method was to delete all the files in Perforce and then do a dummy commit of an empty directory, and then add all the files extracted from the spreadsheet again. But then my version history will always have a version with a full delete.
Any simple ideas, special commands that I can use?
Thanks,
Chris
If you delete files directly on disk, without using the Perforce client to delete them (e.g., you use your spreadsheet command to delete those files directly), that's called "offline work", and in order to tell Perforce that you've made those changes, you just need to go back into your P4V window and use "Reconcile Offline Work".
See Working Disconnected From The Perforce Server for complete instructions.
See also this related question: Sync offline changes to a workspace into Perforce
Perforce has a command-line client (http://www.perforce.com/product/components/perforce_commandline_client) which you should be able to execute from Excel like any .exe file via the Shell function.
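For the offline-delete scenario above, the command-line equivalent would look roughly like this (the depot path is a placeholder, and p4 reconcile needs a reasonably recent client and server):

    p4 reconcile -n //depot/MainBranch/blah/...    # preview which adds/edits/deletes Perforce would open
    p4 reconcile //depot/MainBranch/blah/...       # open files for add/edit/delete to match the workspace
    p4 submit -d "Sync modules extracted from spreadsheet"    # submit the resulting changelist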

How to programmatically detect a new file in a SharePoint shared folder

I am using WSS 3.0 and I need a way to listen on a shared folder library for file changes coming from users, check out those files, and copy them somewhere else on disk. It's almost like the alert functionality, but every time it fires, instead of emailing people it needs to run some code that checks out the new files and copies them to a network location.
The best solution I can come up with is creating a custom timer job and checking which files have changed since my last successful run, but then I will need to save the date/time of my last successful run somewhere.
If anybody has a better idea, they are more than welcome to share it.
You can add an Event Receiver to this library, and it will fire every time an item is added. Then inside the Event Receiver you would copy the file to your disk location.
