Monitor Directory for Changes - linux

Much like a similar SO question, I am trying to monitor a directory on a Linux box for the addition of new files and would like to immediately process these new files when they arrive. Any ideas on the best way to implement this?

Look at inotify.
With inotify you can watch a directory for file creation.

First make sure inotify-tools is installed.
Then use them like this:
logOfChanges="/tmp/changes.log.csv" # Set your file name here.
# Lock and load
inotifywait -mrcq "$DIR" > "$logOfChanges" &
IN_PID=$! # PID of the background inotifywait ($$ would be this shell's own PID)
# Do your stuff here
...
# Kill and analyze
kill "$IN_PID"
while IFS= read -r entry; do
# Split your CSV, but beware that file names may contain spaces too.
# Just look up how to parse CSV with bash. :)
path=...
event=...
... # Other stuff like time stamps?
# Depending on the event…
case "$event" in
SOME_EVENT) myHandlingCode "$path" ;;
...
*) myDefaultHandlingCode "$path" ;;
esac
done < "$logOfChanges"
Alternatively, using --format instead of -c on inotifywait would be an idea.
See man inotifywait and man inotifywatch for more information.
You can also use incron to call a handling script.
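As a rough sketch of the --format idea: inotifywait can print one event per line as "EVENTS|PATH", which is easier to split than CSV. The handle_event and dispatch_events functions and the '|' separator are my own illustrative choices, not part of inotify-tools, and file names containing newlines would still break this.

```shell
#!/usr/bin/env bash
# Sketch only: handle_event, dispatch_events and the '|' separator are
# illustrative choices, not something inotify-tools provides.

# Decide what to do with one event. $1 = event names, $2 = full path.
handle_event() {
    local events="$1" path="$2"
    case "$events" in
        *CREATE*|*MOVED_TO*) echo "new: $path" ;;
        *DELETE*)            echo "gone: $path" ;;
        *)                   echo "other ($events): $path" ;;
    esac
}

# Read "EVENTS|PATH" lines from stdin and dispatch each one.
dispatch_events() {
    local events path
    while IFS='|' read -r events path; do
        handle_event "$events" "$path"
    done
}

# Typical use (blocks until interrupted; requires inotify-tools):
#   inotifywait -mrq --format '%e|%w%f' "$DIR" | dispatch_events
```

Keeping the handler in a function reading from stdin means you can try it out with printf before pointing a live inotifywait at it.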

One solution I thought of is to create a "file listener" coupled with a cron job. I'm not crazy about this but I think it could work.

fschange (Linux File System Change Notification) is a perfect solution, but it requires patching your kernel.

Related

Read updates from continuously updated directory

I am writing a bash script that looks at each file in a directory and does some sort of action to it. It's supposed to look something like this (maybe?).
for file in "$dir"* ; do
something
done
Cool, right? The problem is, this directory is being updated frequently (with new files). At some point I may technically be done with all the files in the dir (and therefore exit the for-loop) while new files are still being fed into the directory. There is also no guarantee that the feeding will ever stop (well... take that with a grain of salt).
I do NOT want to process the same file more than once.
I was thinking of making a while loop that runs forever and keeps updating some file-list A, while making another file-list B that keeps track of all the files I already processed, and the first file in file-list A that is not in file-list B gets processed.
Is there a better method? Does this method even work? Thanks
Edit: Mandatory "I am bash newb"
@Barmar has a good suggestion. One way to handle this is using inotify to watch for new files. After installing inotify-tools on your system, you can use the inotifywait command to feed new-file events into a loop.
You may start with something like:
inotifywait -m -e moved_to,close_write myfolder |
while read -r dir events file; do
echo "Processing file $file"
...do something with $dir/$file...
mv "$dir/$file" /some/place/for/processed/files
done
This inotifywait command will generate events for (a) files that are moved into the directory and (b) files that are closed after being opened for writing. This will generally get you what you want, but there are always corner cases that depend on your particular application.
The output of inotifywait looks something like:
tmp/work/ CLOSE_WRITE,CLOSE file1
tmp/work/ MOVED_TO file2
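To see how a loop using read dir events file splits output like this, here is a small sketch that feeds sample lines to the same read without running inotifywait at all. The split_events name is made up for illustration. Because file is the last variable it receives the rest of the line, so a name with spaces in the middle survives; a watched directory with spaces in its path, or a file name with leading or trailing spaces or a newline, would not.

```shell
#!/usr/bin/env bash
# Sketch: split_events is a hypothetical helper, not part of inotify-tools.
split_events() {
    local dir events file
    # Default inotifywait output is "WATCHED_DIR EVENTS FILENAME".
    while read -r dir events file; do
        printf 'dir=%s events=%s file=%s\n' "$dir" "$events" "$file"
    done
}

printf '%s\n' 'tmp/work/ CLOSE_WRITE,CLOSE file1' \
              'tmp/work/ MOVED_TO my report.txt' | split_events
```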

Check directory daily for new files - linux bash script

I'd like to monitor a directory for new files daily using a linux bash script.
New files are added to the directory every 4 hours or so, so I'd like to process all the files at the end of the day.
By process I mean convert them to an alternative file type then pipe them to another folder once converted.
I've looked at inotify to monitor the directory but can't tell if you can make this a daily thing.
Using inotify I have got this code working in a sample script:
#!/bin/bash
while read line
do
echo "close_write: $line"
done < <(inotifywait -mr -e close_write "/home/tmp/")
This does notify when new files are added and it is immediate.
I was considering using this to keep track of the new files and then processing them all at once at the end of the day.
I haven't done this before so I was hoping for some help.
Maybe something other than inotify will work better.
Thanks!
You can use a daily cron job: http://linux.die.net/man/1/crontab
You should definitely look into using a cron job. Edit your crontab and put this in:
0 0 * * * /path/to/script.sh
That means run your script at midnight every day. Then in your script.sh, all you would do is for all the files, "convert them to an alternative file type then pipe them to another folder once converted".
Your cron job (see other answers on this page) should keep a list of the files you have already processed and then use comm -13 processed-list all-list (with both lists sorted) to get the new files, i.e. the lines that appear only in all-list.
man comm
It's a better alternative to
awk 'FNR==NR{a[$0];next}!($0 in a)' processed-list all-list
and probably more robust than using find since you record the ones that you have actually processed.
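A minimal sketch of the processed-list idea, assuming both lists hold one file name per line. comm needs sorted input, and -13 keeps only the lines unique to the second file, which here are the names not yet processed. The new_files helper and the list contents are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: new_files is a hypothetical helper; the list names are illustrative.

# Print names present in the "all" list ($2) but not in the processed list ($1).
new_files() {
    local tmp
    tmp=$(mktemp -d) || return 1
    # comm requires both inputs sorted in the same collation order.
    LC_ALL=C sort "$1" > "$tmp/processed"
    LC_ALL=C sort "$2" > "$tmp/all"
    # -1 drops lines only in the processed list, -3 drops lines in both.
    LC_ALL=C comm -13 "$tmp/processed" "$tmp/all"
    rm -rf "$tmp"
}

workdir=$(mktemp -d)
printf '%s\n' a.txt b.txt       > "$workdir/processed-list"
printf '%s\n' c.txt a.txt b.txt > "$workdir/all-list"
new_files "$workdir/processed-list" "$workdir/all-list"   # prints c.txt
rm -rf "$workdir"
```

After a file has been handled successfully, append its name to the processed list so the next run skips it.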
To collect the files by the end of day, just use find:
find "$DIR" -daystart -mtime -1 -type f
Then as others pointed out, set up a cron job to run your script.
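Here is a sketch of what the script run by cron might look like, under some assumptions: -daystart is a GNU find extension, the process_todays_files name and the two directory arguments are made up, and the conversion step is left as a comment because it depends on your file types. Passing the names through -exec ... {} + avoids trouble with spaces in file names.

```shell
#!/usr/bin/env bash
# Sketch: process_todays_files and the directory layout are illustrative.
# -daystart is a GNU find extension.

# Move every regular file under $1 that was modified today into $2.
# Keep $2 outside $1 so find does not walk into the destination.
process_todays_files() {
    local src="$1" dest="$2"
    mkdir -p "$dest" || return 1
    find "$src" -daystart -mtime -1 -type f -exec sh -c '
        dest=$1; shift
        for f in "$@"; do
            # A real script would convert "$f" here before moving it.
            mv -- "$f" "$dest/"
        done
    ' sh "$dest" {} +
}

# Typical use from script.sh (paths are placeholders):
#   process_todays_files /path/to/incoming /path/to/converted
```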

"Spoof" File Extension In Bash

Is there a way to "spoof" the file extension of a file in bash for consumption by another program? I can think of doing some shell scripting and making lots of soft-links, but that isn't very scalable.
Let's imagine I have a program I'm trying to use that requires input files to be of a specific file extension, and it has no method of turning off this check.
You could make a fifo with the requisite extension and cat any other file type into it. So, if your crazy program needs to see files that end in .funky, you can do this:
mkfifo file.funky
cat someotherfile > file.funky &
someprogram file.funky
Create a symbolic link for each file you want to have a particular extension, then pass the name of the symlink to the command.
For example suppose you have files with names of the form *.foo and you need to refer to them with extensions of .bar:
for file in *.foo ; do
ln -s "$file" "_$$_$file.bar"
done
I precede each symlink name with _$$_ to avoid the possibility of colliding with an existing file name (you don't want to do ln -s file.foo file.bar if file.bar already exists).
With a little more programming, your script can keep track of which symlinks it created and, if you like, clean them up after executing the command.
This assumes, as you stated in the question, that the command can't be forced to accept a different extension.
You could, without too much difficulty, create a wrapper script that replaces the command in question, creating the symlinks, invoking the command, and cleaning up after itself automatically.
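A rough sketch of such a wrapper, with a few assumptions of mine: run_with_ext is a hypothetical name, it takes the required extension, the command, and the input files, and it puts the links in a fresh mktemp -d directory rather than using the _$$_ prefix, so nothing can collide with existing files. One consequence is that the program sees its inputs in that temporary directory, so anything it writes next to them is removed along with the links.

```shell
#!/usr/bin/env bash
# Sketch: run_with_ext is a hypothetical wrapper; adapt names to your program.

# Usage: run_with_ext EXT COMMAND FILE...
# Runs COMMAND on symlinks named FILE.EXT, then removes the links.
run_with_ext() (
    ext="$1"; cmd="$2"; shift 2
    linkdir=$(mktemp -d) || exit 1
    trap 'rm -rf "$linkdir"' EXIT      # clean up even if COMMAND fails
    for f in "$@"; do
        # Link to the absolute path so the link resolves from linkdir.
        abs="$(cd "$(dirname "$f")" && pwd)/$(basename "$f")"
        link="$linkdir/$(basename "$f").$ext"
        ln -s "$abs" "$link"
        set -- "$@" "$link"            # queue the link name
        shift                          # and drop the original name
    done
    "$cmd" "$@"
)

# Typical use (someprogram is a placeholder):
#   run_with_ext bar someprogram *.foo
```

The function body runs in a subshell, so the EXIT trap fires as soon as the command finishes and does not touch traps in the calling script.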

Inotify-compatible shell script to monitor a certain directory

This is my first question on Stack Overflow. I need an inotify-compatible script that will monitor a certain directory and, if any new files/folders are created in it, copy those files to another folder. I need the script to monitor constantly for changes rather than run periodically.
Thanks in advance.
You can use inotifywait, from the inotify-tools package, to build something like this. A typical use:
inotifywait -m /tmp | while read path events name; do
echo "Now I am going to do something with $name in directory $path."
done
There are oodles of options for controlling how inotifywait operates; consult the man page for details.

How can I loop through some files in my script?

I am very much a beginner at this and have searched for answers my question but have not found any that I understand how to implement. Any help would be greatly appreciated.
I have a script:
FILE$=`ls ~/Desktop/File_Converted/`
mkdir /tmp/$FILE
mv ~/Desktop/File_Converted/* /tmp/$FILE/
So I can use AppleScript to say: when a file is dropped into this desktop folder, create a temp directory, move the file there, and then do other stuff. I then delete the temp directory. This is fine as far as it goes, but the problem is that if another file is dropped into the File_Converted directory before I am done doing stuff to the file I am currently working with, it will change the value of the $FILE variable before the script has completed operating on the current file.
What I'd like to do is use a variable set up where the variable is, say, $FILE1. I check to see if $FILE1 is defined and, if not, use it. If it is defined, then try $FILE2, etc... In the end, when I am done, I want to reclaim the variable so $FILE1 gets set back to null again and the next file dropped into the File_Converted folder can use it again.
Any help would be greatly appreciated. I'm new to this so I don't know where to begin.
Thanks!
Dan
Your question is a little difficult to parse, but I think you're not really understanding shell globs or looping constructs. The globs are expanded based on what's there now, not what might be there earlier or later.
DIR=$(mktemp -d)
mv ~/Desktop/File_Converted/* "$DIR"
cd "$DIR"
for file in *; do
: # whatever you want to do to "$file"
done
You don't need a LIFO -- multiple copies of the script run for different events won't have conflict over their variable names. What they will conflict on is shared temporary directories, and you should use mktemp -d to create a temporary directory with a new, unique, and guaranteed-nonconflicting name every time your script is run.
tempdir=$(mktemp -t -d mytemp.XXXXXX)
mv ~/Desktop/File_Converted/* "$tempdir"
cd "$tempdir"
for f in *; do
...whatever...
done
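Since the question mentions deleting the temp directory afterwards, here is a sketch of the same pattern with cleanup handled by a trap, so the directory goes away even if the processing step fails. The process_batch name and its argument are illustrative, and the echo stands in for whatever you actually do to each file.

```shell
#!/usr/bin/env bash
# Sketch: process_batch and its argument are illustrative.

# Move everything from inbox $1 into a private temp dir, process, clean up.
process_batch() (
    inbox="$1"
    tempdir=$(mktemp -d) || exit 1
    trap 'rm -rf "$tempdir"' EXIT       # remove the temp dir on any exit
    for f in "$inbox"/*; do
        [ -e "$f" ] || continue         # glob matched nothing
        mv -- "$f" "$tempdir/"
    done
    for f in "$tempdir"/*; do
        [ -e "$f" ] || continue
        echo "processing $(basename "$f")"   # placeholder for real work
    done
)

# Typical use:
#   process_batch ~/Desktop/File_Converted
```

Each run gets its own directory from mktemp -d, so two runs triggered close together only ever see the files they moved themselves.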
What you describe is a classic race condition, in which it is not clear that one operation will finish before a conflicting operation starts. These are not easy to handle, but you will learn so much about scripting and programming by handling them that it is well worth the effort to do so, even just for learning's sake.
I would recommend that you start by reviewing the lockfile or flock manpage. Try some experiments. It looks as though you probably have the right aptitude for this, for you are asking exactly the right questions.
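As a starting point for those experiments, here is a small sketch using flock(1) from util-linux (it may not be installed on every system). The with_lock helper and the lock file path are my own illustrative names. It runs a command while holding an exclusive lock and gives up immediately if another instance already holds it.

```shell
#!/usr/bin/env bash
# Sketch: the with_lock helper and lock file path are illustrative.
# Requires flock(1) from util-linux.

# Usage: with_lock LOCKFILE COMMAND [ARGS...]
# Runs COMMAND while holding an exclusive lock; fails fast if already locked.
with_lock() (
    lockfile="$1"; shift
    exec 9> "$lockfile" || exit 1       # open the lock file on fd 9
    if ! flock -n 9; then               # -n: do not wait for the lock
        echo "another instance holds $lockfile" >&2
        exit 1
    fi
    "$@"                                # lock is released when fd 9 closes
)

# Typical use (the script name is a placeholder):
#   with_lock /tmp/file_converted.lock ./handle_dropped_file.sh
```

Dropping the -n makes flock wait for the lock instead, which serializes the runs rather than skipping one.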
By the way, I suspect that you want to kill the $ in
FILE$=`ls ~/Desktop/File_Converted/`
Incidentally, @CharlesDuffy correctly observes that "using ls in scripts is indicative of something being done wrong in and of itself. See mywiki.wooledge.org/ParsingLs and mywiki.wooledge.org/BashPitfalls." One suspects that the suggested lockfile exercise will clear up both points, though it will probably take you several hours to work through it.

Resources