How to implement a semaphore that will synchronize several different copies of the same program in Linux - linux

I have a program that can be run several times. The program uses a working directory where it saves and manipulates its runtime files and puts its results. I want to make sure that if several copies of the program run simultaneously, they won't use the same folder. To do this, I add a hidden file to the working directory when it's created, which marks the directory as being in use, and delete it when the program exits. When a program wants to use a certain directory as its working directory, it checks whether that file exists. If it doesn't, the program uses the directory; otherwise, it uses a directory of the same name with its process id appended. The implementation is: (in Tcl)
upon starting:
if {[file exists [db_work_area]/.folder_used]} {
    reg set work_area_override [db_work_area]_[pid]
}
...
exec touch ${db_wa}/.folder_used
when exiting:
if {[file exists [db_work_area]/.folder_used]} {
    file delete [db_work_area]/.folder_used
}
This works when the copies of the program are opened one at a time. However, I am afraid that if several copies of the program are opened at the same time, there will be a synchronization problem: two programs will check whether the file exists at the same moment, both see that it doesn't, both choose that directory, and only afterwards add the file. How can I implement a semaphore that can synchronize several different copies of the same program running at once?

You should not do a [file exists] check and then the touch later; between the two steps another process can create the file. It works better to use open with the EXCL flag, which does both in a single step.
Try to use something like this to create the file and fail if it already exists in an atomic way.
if {[catch {open ${db_wa}/.folder_used {WRONLY EXCL CREAT}} fd]} {
    # error happened, file exists
    # pick a different area
} else {
    # just close it again, like a touch to create the file
    close $fd
}
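For comparison, the same atomic "create or fail" idea can be sketched in Python. This is only an illustration under stated assumptions: the function name claim_work_area and the temporary directory layout are hypothetical, while the .folder_used marker and the PID-suffixed fallback follow the question.

```python
import os
import tempfile

def claim_work_area(base_dir):
    """Return base_dir if this process created the marker file, else a PID-suffixed directory."""
    os.makedirs(base_dir, exist_ok=True)
    marker = os.path.join(base_dir, ".folder_used")
    try:
        # O_CREAT | O_EXCL: the existence check and the creation are one atomic step
        fd = os.open(marker, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        # Another copy already claimed base_dir, so pick a different area
        fallback = f"{base_dir}_{os.getpid()}"
        os.makedirs(fallback, exist_ok=True)
        return fallback
    os.close(fd)  # like touch: the file only needs to exist
    return base_dir

# Hypothetical demo: two claims on the same directory
root = tempfile.mkdtemp()
base = os.path.join(root, "work")
first = claim_work_area(base)   # wins the marker
second = claim_work_area(base)  # marker exists, falls back
```

Note that the marker is left behind if the program dies before its cleanup runs, so a stale .folder_used only costs a fallback directory, never a shared one.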

Related

what happens when calling ```touch .``` in linux?

This is a very specific question.
I'm mainly interested in the open() system calls that happen when running touch .
So I ran strace touch . and saw that openat() is called three times.
But I don't really understand what's going on: touch . does not print anything to the console and does not create a new file named ".", since "." refers to the current directory (as can be seen by running ls -a), so nothing is created because that name is already in use.
This is my assumption:
open() is called to check whether the specified file name already exists; if a file descriptor is returned, the name is already in use and the operation is canceled.
Please correct me if I'm wrong.
GNU touch prefers to use a file descriptor when touching files, since it's possible to write touch - > foo and expect the file foo to be touched. As a result, it always tries to open the specified path as a writable file, and if that's possible, it then uses that file descriptor to update the file timestamp.
In this case, it's not possible to open . for writing, so openat fails with EISDIR. touch notices that it's a directory, so its internal fdutimensat function is called with an invalid file descriptor and falls back to using utimensat instead of futimens.
The openat call is not used to check that the file exists. Rather, using a file descriptor for many operations means that you don't have to deal with path resolution multiple times or handle symlinks, since all of those are resolved when the file descriptor is opened. This is why many long-lived programs choose to open a file descriptor to their current working directory, then change directories, and then use the file descriptor with fchdir to change back. Any changes to permissions after the program starts are then not a problem.
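The two steps described above can be reproduced by hand. The following is a rough Python sketch, not what GNU touch itself runs: it uses a temporary directory as a stand-in for "." (an assumption, so the example does not depend on the current directory being writable), tries to open it for writing, and then updates the timestamps by path.

```python
import errno
import os
import tempfile

target = tempfile.mkdtemp()  # hypothetical stand-in for "."

# Step 1: try to open the path for writing, as touch does
try:
    fd = os.open(target, os.O_WRONLY)
    os.close(fd)
    open_failed_with = None
except IsADirectoryError as exc:
    open_failed_with = exc.errno  # EISDIR: directories cannot be opened for writing

# Step 2: no usable descriptor, so update the timestamps by path instead
os.utime(target, ns=(1_000_000_000, 2_000_000_000))  # (atime, mtime) in nanoseconds
mtime_ns = os.stat(target).st_mtime_ns
```

The failed open is not an existence check; it only decides whether the timestamp update goes through a descriptor or through the path.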

How do I get the filename of an open std::fs::File in Rust?

I have an open std::fs::File, and I want to get its filename, e.g. as a PathBuf. How do I do that?
The simple solution would be to just save the path used in the call to File::open. Unfortunately, this does not work for me. I am trying to write a program that reads log files, and the program that writes the logs keeps changing the filenames as part of its log rotation. So the file may very well have been renamed since it was opened. This is on Linux, so renaming open files is possible.
How do I get around this issue, and get the current filename of an open file?
On a typical Unix filesystem, a file may have multiple filenames at once, or even none at all. The file metadata is stored in an inode, which has a unique inode number, and this inode number can be linked from any number of directory entries. However, there are no reverse links from the inode back to the directory entries.
Given an open File object in Rust, you can get the inode number using the ino() method. If you know the directory the log file is in, you can use std::fs::read_dir() to iterate over all entries in that directory, and each entry will also have an ino() method, so you can find the one(s) matching your open file object. Of course this approach is subject to race conditions – the directory entry may already be gone again once you try to do anything with it.
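The inode-matching idea is not specific to Rust. As a rough illustration in Python, the sketch below compares the device and inode numbers of an open file against every entry in a directory; the helper name names_for_open_file and the app.log file names are hypothetical.

```python
import os
import tempfile

def names_for_open_file(f, directory):
    """Return the names in `directory` that refer to the same inode as open file f."""
    target = os.fstat(f.fileno())
    matches = []
    with os.scandir(directory) as entries:
        for entry in entries:
            try:
                st = entry.stat(follow_symlinks=False)
            except FileNotFoundError:
                continue  # entry vanished between listing and stat (the race noted above)
            # Inode numbers are only unique per device, so compare both
            if (st.st_dev, st.st_ino) == (target.st_dev, target.st_ino):
                matches.append(entry.name)
    return sorted(matches)

# Hypothetical demo: open a log file, rotate it, then recover its new name
log_dir = tempfile.mkdtemp()
old_path = os.path.join(log_dir, "app.log")
f = open(old_path, "w")
os.rename(old_path, os.path.join(log_dir, "app.log.1"))
found = names_for_open_file(f, log_dir)
f.close()
```

The result is a list because a file can have several hard links in the same directory, or none at all.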
On Linux, file handles held by the current process can be found under /proc/self/fd. These look and act like symlinks to the original files (though I think they may technically be something else; perhaps someone who knows more can chip in).
You can therefore recover the (possibly changed) file name by constructing the correct path in /proc/self/fd using your file descriptor, and then following the symlink back to the filesystem.
This snippet shows the steps:
use std::fs::{read_link, File};
use std::io;
use std::os::unix::io::AsRawFd;
use std::path::PathBuf;

// Returns the current path of an open file (Linux only).
fn current_path(f: &File) -> io::Result<PathBuf> {
    // first construct the path to the symlink under /proc
    let path_in_proc = PathBuf::from(format!("/proc/self/fd/{}", f.as_raw_fd()));
    // ...and follow it back to the original file
    read_link(path_in_proc)
}

How does logrotate work when two processes use the same file?

For example:
1. Program A is writing its log to the file "test.log".
2. When logrotate runs, it first renames "test.log" to "test.log.1" and then creates a new file "test.log".
After step 2, program A does not report any error, but A's log output does not appear in the new file "test.log".
The questions are:
Where does the data that A writes after step 2 go?
How can logrotate rename the file and create a new one while another process is writing to it? (Is there something I'm missing about logrotate?)
Thanks!
This is very tightly related to how POSIX filesystems work. When you rename a file, only the name of the file changes; the physical file on the disk does not. Also, once a file is opened, the process using the file only has a link (through many layers) to the physical file on the disk; the name is only used when opening the file.
That means the program A will still write to the same file, which now has the new name (i.e. test.log.1 in your example).
A common solution to this problem is to have the log rotation program send a signal (e.g. SIGHUP or SIGUSR1 or similar) to the process. The process will detect this signal and then reopen the logging to use the new file.
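Both effects can be seen in a small Python sketch. It is only an illustration under stated assumptions: the writer and the "rotation" run in the same process, the signal is raised by the script itself as a stand-in for the one a rotation tool would send, and the names write_line and on_sighup are hypothetical.

```python
import os
import signal
import tempfile

log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "test.log")
rotated_path = log_path + ".1"

log = open(log_path, "a")
reopen_requested = False

def on_sighup(signum, frame):
    # Only set a flag here; the reopen itself happens in write_line()
    global reopen_requested
    reopen_requested = True

def write_line(text):
    global log, reopen_requested
    if reopen_requested:
        log.close()
        log = open(log_path, "a")  # the name is resolved again, creating a new file if needed
        reopen_requested = False
    log.write(text + "\n")
    log.flush()

signal.signal(signal.SIGHUP, on_sighup)

write_line("before rotation")
os.rename(log_path, rotated_path)   # the rename step of the rotation
write_line("after rename")          # still goes to the renamed file
signal.raise_signal(signal.SIGHUP)  # stand-in for the signal sent after rotation
write_line("after reopen")          # goes to a fresh test.log
log.close()

with open(rotated_path) as fh:
    rotated_lines = fh.read().splitlines()
with open(log_path) as fh:
    current_lines = fh.read().splitlines()
```

Here rotated_lines holds the first two lines and current_lines only the last one, which matches the behaviour described in the question.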

How to get the filename, along with the absolute path to the file, whenever a new file is created, using inodes in Linux?

I am doing some experiments with my Linux OS (CentOS) and I want to track all the tool logs created under the same environment. Each tool generates its own logs (.log extension). To track these changes I wrote a Perl watcher that monitors the directory I set and reports when a new file is created, but it consumes a lot of memory and CPU, as I have set the sleep period to 2 seconds.
My question: is there a better way of doing this? I thought of using the inode table to track all the changes in the system. Can this solve my issue? If yes, could you please let us know how?
It seems that you want to monitor a directory for changes. This is a complex job, but there are good modules for it. The easiest one to recommend is probably Linux::Inotify2.
This module implements an interface to the Linux 2.6.13 and later Inotify file/directory change notification system.
This seems to be along the lines of what you wanted.
Any such monitor needs additional event handling. This example uses AnyEvent.
use warnings;
use strict;
use feature 'say';
use AnyEvent;
use Linux::Inotify2;
my $dir = 'dir_to_watch';
my $inotify = Linux::Inotify2->new or die "Can't create inotify object: $!";
$inotify->watch( $dir, IN_MODIFY | IN_CREATE, sub {
my $e = shift;
my $name = $e->fullname;
say "$name modified" if $e->IN_MODIFY; # Both show the new file
say "$name created" if $e->IN_CREATE; # but see comments below
});
my $inotify_w = AnyEvent->io (
fh => $inotify->fileno, poll => 'r', cb => sub { $inotify->poll }
);
1 while $inotify->poll;
If you only care about new files then you only need one constant above.
For both types of events the $name has the name of the new file. From man inotify on my system
... the name field in the returned inotify_event structure identifies the name of the file within the directory.
The inotify_event structure is suitably represented by a Linux::Inotify2::Event object, which is what the callback receives.
Using IN_CREATE seems to be an obvious solution for your purpose. I tested by creating two files, with two redirected echo commands separated by semi-colon on the same command line, and also by touch-ing a file. The written files are detected as separate events, and so is the touch-ed file.
Using IN_MODIFY may also work since it monitors (in $dir)
... any file system object in the watched object (always a directory), that is files, directories, symlinks, device nodes etc. ...
As for tests, both files written by echo as above are reported, as separate events. But a touch-ed file is not reported, since data didn't change (the file wasn't written to).
Which is better suited for your need depends on details. For example, a tool may open a log file as it starts, only to write to it much later. The two ways above will behave differently in that case. All this should be investigated carefully under your specific conditions.
We may think of a race condition, since while the code executes other file(s) could slip in. But the module is far better than that and it does report new changes after the handler completes. I tested by creating files while that code runs (and sleeps) and they are reported.
Some other notable frameworks for event-driven programming are POE and IO::Async.
The File::Monitor does this kind of work, too.

How to call a bash script automatically when directory contents change

My goal is to run a bash script automatically whenever any new file is added to a particular directory or any subdirectory of that particular directory.
Detail Scenario:
I am creating an automated process for file submission from teachers to students and vice versa. The sender will upload a file and it will be stored inside the Uploads directory on the LAMP server in the format, e.g., "name_course-name_filename.pdf". I want some method so that whenever a file is stored inside the Uploads folder, a script is called at the same time and sends that file to the list of receivers.
From the database I can find the list of receiver for that particular course and student.
The only concern of mine is, how to call a script automatically and make it work on individual file whenever the content of the directory changes. Cron will do in intervals but not a real time work.
Linux provides a nice mechanism for that purpose, called inotify. inotify is primarily a C API, but shell utilities have been built on top of it as well. You can use inotifywait from inotify-tools (the package name in Debian) for this. Here is a basic example:
#!/bin/bash
directory="/tmp" # or whatever you are interested in
inotifywait -m -e create "$directory" |
while read -r folder eventlist eventfile
do
    echo "the following events happened in folder $folder:"
    echo "$eventlist $eventfile"
done
Update:
If the problem gets more complicated, for example if you have to monitor recursive, dynamic directory structures, you should have a look at incron. It's a cron-like daemon which executes scripts on certain events, but the events are file system events rather than timer events.
There is another option to 'inotifywait':
-d --daemon
Same as --monitor, except run in the background logging events to a file
that must be specified by --outfile. Implies --syslog.
For completeness:
-m --monitor
Instead of exiting after receiving a single event, execute indefinitely.
The default behaviour is to exit after the first event occurs.
Within the do-done block of your 'while' statement, you might parse each event report for interesting details then use 'case-esac' to take action based on each event that you care about.
For something that you plan to rely on for your operations, you might also consider replacing the hard-coded '$directory' with some sort of configuration file. Such a file might include the path and filename, the interesting events for that path and file, and a script to run when those events happened.
The script might take the list of events as parameters and then 'case-esac' again.
Just one man's ramblins,
~~~ 8d;-Dan
