How does logrotate work when two processes use the same file? (Linux)

For example:
Program A is writing its log to the file "test.log".
If logrotate runs, it will (1) rename "test.log" to "test.log.1", and then (2) create a new file "test.log".
After step 2, program A does not report any error, but A's log output does not appear in the new file "test.log".
The questions are:
Where does the data that A writes after step 2 end up?
How can logrotate rename and create a new file while another process is writing to the file? (Is there something I'm missing about logrotate?)
Thanks!

This is very tightly related to how POSIX filesystems work. When you rename a file, only the name changes: the directory entry is updated, but the physical file on disk, the inode, stays the same. Also, once a file is opened, the process using it holds a reference (through several layers) directly to that inode; the name is only used when opening the file.
That means program A will keep writing to the same file, which now has the new name (i.e. test.log.1 in your example).
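You can see this with a few lines of C (a minimal sketch; the file names mirror the example above). Both writes land in the same inode, which is named test.log.1 by the time the second write happens:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* open and write, as program A would */
    int fd = open("test.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) { perror("open"); return 1; }
    write(fd, "before rotation\n", 16);

    /* what logrotate does: rename the file, then create a fresh one */
    rename("test.log", "test.log.1");
    close(open("test.log", O_WRONLY | O_CREAT, 0644));

    /* A's descriptor still points at the same inode, now named test.log.1 */
    write(fd, "after rotation\n", 15);
    close(fd);
    return 0;
}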
A common solution to this problem is to have the log rotation program send a signal (e.g. SIGHUP or SIGUSR1 or similar) to the process. The process catches the signal and reopens its log file, so it picks up the newly created one; this is what logrotate's postrotate script is typically used for. If the program cannot be told to reopen its log, logrotate's copytruncate option instead copies the file and truncates the original in place, so the writer's file descriptor stays valid.
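A minimal sketch of that reopen-on-SIGHUP pattern in C (the log path and loop body are placeholders; real handlers should only set a flag, since very little is async-signal-safe):

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t reopen = 0;

static void on_sighup(int sig) {
    (void)sig;
    reopen = 1;    /* just set a flag; do the real work in the main loop */
}

int main(void) {
    FILE *logf = fopen("test.log", "a");
    if (logf == NULL) return 1;
    signal(SIGHUP, on_sighup);
    for (;;) {
        if (reopen) {
            reopen = 0;
            logf = freopen("test.log", "a", logf);  /* pick up the file logrotate created */
        }
        fprintf(logf, "still logging\n");
        fflush(logf);
        sleep(1);
    }
}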


What happens when calling `touch .` in Linux?

This is a very specific question.
I'm mainly interested in the open() system calls that happen when running `touch .`.
So I ran strace touch . and saw that openat() is called three times.
But I'm not really understanding what's going on: touch . does not print anything to the console, and it does not create a new file named ".", since "." refers to the current directory (as can be seen by running ls -a), so nothing is created because that name is already in use.
This is my assumption:
open() is called to check whether the specified file name already exists; if a file descriptor is returned, this means the name is already in use and the operation is cancelled.
Please correct me if I'm wrong.
GNU touch prefers to use a file descriptor when touching files, since it's possible to write touch - > foo and expect the file foo to be touched. As a result, it always tries to open the specified path as a writable file, and if that's possible, it then uses that file descriptor to update the file timestamp.
In this case, it's not possible to open . for writing, so openat fails with EISDIR. touch notices that it's a directory, so its call to its internal fdutimensat function gets an invalid file descriptor and falls back to using utimensat instead of futimens.
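A rough C sketch of that logic (this is the shape of the fallback, not GNU touch's actual source):

#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

int touch_path(const char *path) {
    /* try to get a writable descriptor first, as GNU touch does */
    int fd = openat(AT_FDCWD, path, O_WRONLY | O_CREAT | O_NONBLOCK, 0666);
    if (fd >= 0) {
        int r = futimens(fd, NULL);   /* NULL times mean "set both to now" */
        close(fd);
        return r;
    }
    if (errno == EISDIR) {
        /* a directory can't be opened for writing: fall back to the path-based call */
        return utimensat(AT_FDCWD, path, NULL, 0);
    }
    return -1;
}

int main(void) {
    return touch_path(".") != 0;   /* same effect as `touch .` on a directory */
}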
It isn't the case that the openat call is used to check that the file exists. Rather, using a file descriptor for many operations means that you don't have to deal with path resolution multiple times or handle symlinks, since all of that is resolved when the file descriptor is opened. This is why many long-lived programs open a file descriptor to their current working directory, change directories to do their work, and then use the saved descriptor with fchdir to change back. Any changes to permissions after the program starts are then not a problem.
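The working-directory trick from that last sentence looks like this in C:

#include <fcntl.h>
#include <unistd.h>

int main(void) {
    /* resolve the current directory once and keep a handle to it */
    int cwd = open(".", O_RDONLY | O_DIRECTORY);
    if (cwd < 0) return 1;

    chdir("/tmp");     /* go work somewhere else */
    /* ... */
    fchdir(cwd);       /* jump straight back, no path resolution involved */
    close(cwd);
    return 0;
}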

How do I get the filename of an open std::fs::File in Rust?

I have an open std::fs::File, and I want to get its filename, e.g. as a PathBuf. How do I do that?
The simple solution would be to just save the path used in the call to File::open. Unfortunately, this does not work for me. I am trying to write a program that reads log files, and the program that writes the logs keeps changing the filenames as part of its log rotation. So the file may very well have been renamed since it was opened. This is on Linux, so renaming open files is possible.
How do I get around this issue, and get the current filename of an open file?
On a typical Unix filesystem, a file may have multiple filenames at once, or even none at all. The file metadata is stored in an inode, which has a unique inode number, and this inode number can be linked from any number of directory entries. However, there are no reverse links from the inode back to the directory entries.
Given an open File object in Rust, you can get the inode number using the ino() method from std::os::unix::fs::MetadataExt on the file's metadata. If you know the directory the log file is in, you can use std::fs::read_dir() to iterate over all entries in that directory; each entry also has an ino() method, so you can find the one(s) matching your open file object. Of course this approach is subject to race conditions – the directory entry may already be gone again by the time you try to do anything with it.
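There is nothing Rust-specific about the idea, so here is the same approach sketched in plain C: fstat the open descriptor, then scan the directory for entries with a matching inode number (assuming the directory is on the same filesystem; the file name and directory are illustrative):

#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>

/* print every name in dirpath that currently refers to fd's inode */
static int find_names(int fd, const char *dirpath) {
    struct stat st;
    if (fstat(fd, &st) != 0) return -1;

    DIR *d = opendir(dirpath);
    if (d == NULL) return -1;

    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (e->d_ino == st.st_ino)            /* inode numbers match */
            printf("%s/%s\n", dirpath, e->d_name);
    }
    closedir(d);
    return 0;
}

int main(void) {
    FILE *f = fopen("test.log", "a");
    if (f == NULL) return 1;
    return find_names(fileno(f), ".") != 0;
}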
On Linux, file handles held by the current process can be found under /proc/self/fd. These look and act like symlinks to the original files (though I think they may technically be something else – perhaps someone who knows more can chip in).
You can therefore recover the (possibly changed) file name by constructing the correct path in /proc/self/fd from your file descriptor, and then following the symlink back to the filesystem.
This snippet shows the steps:
use std::fs::{read_link, File};
use std::os::unix::io::AsRawFd;
use std::path::PathBuf;

fn main() -> std::io::Result<()> {
    // f is your open std::fs::File
    let f = File::open("test.log")?;
    // first construct the path to the symlink under /proc
    let path_in_proc = PathBuf::from(format!("/proc/self/fd/{}", f.as_raw_fd()));
    // ...and follow it back to the original file
    let new_file_name = read_link(&path_in_proc)?;
    println!("{}", new_file_name.display());
    Ok(())
}
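One caveat: if the file has been deleted rather than renamed, the symlink target will be the old path with " (deleted)" appended, so it is worth checking for that suffix before trusting the result.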

Download a file from FTP and rename it on the host

I have one file delivered to an FTP server daily. This file doesn't have the same name every day; it carries the date and the hour of its creation. For example, today the file is named 20130814_XX_YY_20130814152345, created at 15:23:45, and tomorrow the file might be named 20130815_XX_YY_20130815152421. The _XX_YY_ part is always the same, but the hour changes every day.
I want to create host JCL that gets this file, whose name varies, and renames it to a host file. How can I do this?
Thank you
Regards
Chuchito
STEP1: You can use LS in FTP with its output written to disk, so you end up with a file containing the server file name. That gives you the name you need for the GET.
STEP2: Process the contents of that file to generate the FTP control cards (at least for the GET). The generated GET will be of the form GET 20130814_XX_YY_20130814152345 'HLQ.MAINFRAM.DATASET', where the server file name comes from the file written in STEP1 and the local (mainframe) file can be hard-coded, or supplied to the generation step if flexibility is required.
STEP3: Run FTP again with the Control Card(s) generated.
Isn't there anything in the Spec?
Sometimes we create complexities where an "out of the box" solution simplifies life considerably.
After the post was updated, I now understand the problem a bit better.
If the name is required to be so specific, then the other suggested solution (if I understand it) is to have a fixed file name on the server that contains a list of file names to be uploaded.
In fact, the server could create a fixed-name file that is really the JCL to run on the mainframe! This file would include the //SYSIN DD * and GET commands. The mainframe retrieves this file and submits it as-is to the job reader, and the job then runs on the mainframe. The last step of this job (created by the server, but run on the mainframe) is to FTP an empty JCL file back to the server; this way the server "knows" that the mainframe has picked up the files.
Alternatively, why does the non-z/OS system need to name the file with time information? If the mainframe processes the file daily, then the date should be sufficient.
With this change, the mainframe can reliably predict the file name for the day, generate the appropriate GET command, and run.
With a job scheduler it would be easy to run the upload to the mainframe twice a day. This might address any concerns that are expressed in the desire to include a time in the file's name.
Run a REXX step via a background TSO step.
You can then run a LISTCAT to get all the files. You could either write the LISTCAT output to a file and read it in, or trap the output via the Address command or the OutTrap function.
Then use the standard TSO Rename command.
Alternatively, you could run a background ISPF REXX program and use the ISPF equivalents to get the file name.
(1) The real solution to this should be through a scheduling tool for Mainframe jobs. These tools provide capabilities to take care of formatting like the one you described.
(2) Alternatives: REXX and COBOL
(3) If you'd rather not use REXX, here's a brief outline of how you could create the JCL dynamically using COBOL:
A COBOL program that would read a "template" JCL.
Using INSPECT / REPLACE, you could substitute the placeholders with a string populated with the date of your choice (you could supply this as a simple SYSIN parm too, if you want the COBOL code to be flexible about date selection).
Now that your formatted JCL is ready, you could write it to the output stream
//OUTFILE DD SYSOUT=(*,INTRDR)
Note that the writer name INTRDR goes in the second position of SYSOUT; the first position is the output class.
Anything written to INTRDR (the internal reader) goes straight to JES, which submits your job!
Hope this helps.

How to call a bash script automatically when directory contents change

My goal is to run a bash script automatically whenever any new file is added to a particular directory or any subdirectory of that particular directory.
Detail Scenario:
I am creating an automated process for file submission from teachers to students and vice versa. The sender will upload a file, and it will be stored inside the Uploads directory on the LAMP server in the format "name_course-name_filename.pdf". I want some method so that whenever a file is stored inside the Uploads folder, a script is called at that moment to send the file to the list of receivers.
From the database I can find the list of receivers for that particular course and student.
My only concern is how to call a script automatically and make it work on each individual file whenever the contents of the directory change. Cron would work at intervals, but it is not real-time.
Linux provides a nice mechanism for this purpose called inotify. inotify is mostly available as a C API, but shell utilities have been developed as well. You should use inotifywait from inotify-tools (the package name in Debian) for this. Here is a basic example:
#!/bin/bash
directory="/tmp" # or whatever you are interested in
inotifywait -m -e create "$directory" |
while read folder eventlist eventfile
do
    echo "the following events happened in folder $folder:"
    echo "$eventlist $eventfile"
done
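Since inotifywait is just a front end for the C API mentioned above, here is a minimal sketch of using inotify directly (watching /tmp for file creation, like the shell example):

#include <stdio.h>
#include <sys/inotify.h>
#include <unistd.h>

int main(void) {
    char buf[4096] __attribute__((aligned(__alignof__(struct inotify_event))));
    int fd = inotify_init();
    if (fd < 0) { perror("inotify_init"); return 1; }

    /* watch the same directory as the shell example, for file creation */
    inotify_add_watch(fd, "/tmp", IN_CREATE);

    for (;;) {
        ssize_t len = read(fd, buf, sizeof buf);   /* blocks until events arrive */
        if (len <= 0) break;
        for (char *p = buf; p < buf + len; ) {
            const struct inotify_event *ev = (const struct inotify_event *)p;
            if (ev->len > 0)
                printf("created: %s\n", ev->name);
            p += sizeof(struct inotify_event) + ev->len;
        }
    }
    return 0;
}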
Update:
If the problem gets more complicated, for example if you have to monitor recursive, dynamic directory structures, you should have a look at incron. It's a cron-like daemon which executes scripts on certain events, but the events are file system events rather than timer events.
There is another option to inotifywait:
-d, --daemon
    Same as --monitor, except run in the background logging events to a file
    that must be specified by --outfile. Implies --syslog.
For completeness:
-m, --monitor
    Instead of exiting after receiving a single event, execute indefinitely.
    The default behaviour is to exit after the first event occurs.
Within the do-done block of your while statement, you might parse each event report for interesting details, then use case-esac to take action based on each event that you care about.
For something that you plan to rely on for your operations, you might also consider replacing the hard-coded $directory with some sort of configuration file. Such a file might include the path and filename, the interesting events for that path and file, and a script to run when those events happen.
The script might take the list of events as parameters and then case-esac again.
Just one man's ramblins,
~~~ 8d;-Dan

How to implement a semaphore that will synchronize several different copies of the same program in Linux

I have a program that can be run several times. The program uses a working directory where it saves/manipulates its runtime files and puts results. I want to make sure that if several copies of the program run simultaneously, they won't use the same folder. To do this I add a hidden file to the work directory when it is created, meaning that the directory is in use, and delete it when the program exits. When a program wants to use a certain directory as its working directory, it checks whether that file exists; if not, it will use the directory, otherwise it will use a directory of the same name with its process id appended. The implementation is (in Tcl):
upon starting:
if {[file exists [db_work_area]/.folder_used]} {
    reg set work_area_override [db_work_area]_[pid]
}
...
exec touch ${db_wa}/.folder_used
when exiting:
if {[file exists [db_work_area]/.folder_used]} {
    file delete [db_work_area]/.folder_used
}
This works when the copies of the program are opened one at a time. However, I am afraid that if several copies are opened at the same time, there will be a synchronization problem: two programs could both check whether the file exists, both see that it doesn't, both choose that directory, and only then create the file. How can I implement a semaphore that can synchronize the several different copies of the same program?
You should not do a [file exists] check and a touch afterwards; it works better to use open to do both in a single step with the EXCL flag.
Try something like this to create the file atomically, failing if it already exists.
if {[catch {open ${db_wa}/.folder_used {WRONLY EXCL CREAT}} fd]} {
    # an error happened: the file already exists
    # pick a different area
} else {
    # just close it again, like a touch, to create the file
    close $fd
}
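Those Tcl flags map directly onto the underlying open(2) call, so for reference, the same atomic create-or-fail looks like this in C (the lock-file path is illustrative):

#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int main(void) {
    /* O_CREAT|O_EXCL is atomic: exactly one of several racing processes wins */
    int fd = open("/path/to/work_area/.folder_used", O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0) {
        if (errno == EEXIST)
            return 1;      /* someone else owns the directory: pick another area */
        return 2;          /* some other error */
    }
    close(fd);             /* just a touch: the file's existence is the lock */
    return 0;
}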
