Multiple cronjobs possibly accessing log file at the same time - cron

As the title says: I have multiple cronjobs loading python scripts, that start at different times. The scripts do different things and thus take different amounts of time to finish. So it's pretty likely that they will overlap from time to time. During their runtime they write all their output into log file. The scripts (and its results) are completely independent from each other.
What happens if multiple scripts try to write to the same logfile at the same time?
If there is a problem, how can I solve it?

Related

Can multiple qsub submissions read the same group of files?

I would like to use a bash script and qsub to run 30-40 python programs at once.
Each python program reads and searches through the same set of files (~400 total) for a set of sequences.
Is there a problem where multiple python programs could be trying to read from the same file? If so, what are the consequences?
There are no problems with multiple jobs reading from the same files that are imposed by Torque. (I think that statement is likely to be true for any resource manager / scheduler.)
The main issue I can imagine would be your file system's performance and whether or not it can keep up with potentially concurrent accesses to those files.

"find" command cannot detect files added during execution

Stackoverflow has saved my life on countless occasions over the years. Now, it's time for me to post my first question ever, the answer to which I have been unable to find so far.
I have a tool (language/implementation is irrelevant) which accepts a text file as input. This text file (let's call it file_list.txt) contains a long list of file paths, one per line. The tool then iterates over the lines in file_list.txt and does something with every file path. This needs to be done continuously and file_list.txt needs to always contain the latest file paths because users continuously upload or delete files from the share being monitored. To achieve this, I have set up a cron job which calls a script. First the script calls the find utility with the search parameters required and pipes the output to a temporary file. When the file is fully populated, it is moved to file_list.txt. Then, once this is done, the tool is invoked with file_list.txt as an input parameter.
So far, so good. The share being monitored is VERY LARGE (~60 TB) and the find command takes around 5 hours to execute. This is not a problem since we have multiple overlapping find commands running in parallel (triggered once per hour). The entire setup runs on a compute farm, so CPU utilization, etc. is also not an issue.
The problem arises in the lag time for file detection. Ideally, I want a user to add a file and I want one of the already running, overlapping find commands to detect this file within a matter of minutes. However, I have noticed that none of the already-running find commands will detect this file. Only a find command started AFTER this file was added will detect it. This means that generally, I need to wait around 5 hours for a newly added file to be detected. This leads me to believe that the find utility somehow acts on a "cached" version of the share state when it was triggered. Is this true? Can anyone confirm this? And if so, what can I do to improve the detection lag?
Please let me know if further clarificaion is required. I am happy to provide any further details.
To summarize: you have a gigantic filesystem volume (60 TB) which contains a huge number of files, and you use find(1) to name a large number of those files and put those names into a text file for analysis. You have discovered that files are not listed if they are created after find(1) was started but before it finished.
I think the best solution is to stop thinking of this as a batch job, and do it "online" using inotify(7). You can use the inotify API to be immediately informed of changes to your filesystem, including new files being created. There is of course the original C API, as well as the excellent pyinotify.
With inotify, you can start a watcher program once and leave it running continuously (under a supervisor if needed for restarts). The operating system can then notify you whenever a relevant filesystem event occurs, and you can respond immediately rather than waiting for the next scan.
The one downside for your use case might be that the watcher program does need to run on a machine which has the filesystem mounted locally. But the overall compute resources required are probably much less than your current approach of repeated linear scans.
executing find commands and piping the output to temporary files might work up to a certain scale, but is far from optimal. If you want a less resource intensive, more reactive solution, I would recommend considering to reimplement your software using the inotify interface:
The inotify API provides a mechanism for monitoring filesystem events.
Inotify can be used to monitor individual files, or to monitor
directories. When a directory is monitored, inotify will return
events for the directory itself, and for files inside the directory.
So an event will be raised for each file change; or file being added.
Note that you can then keep an internal list of files up to date which only needs to be changed when you get a event.

When running a shell script with loops, operation just stops

NOTICE: Feedback on how the question can be improved would be great as I am still learning, I understand there is no code because I am confident it does not need fixing. I have researched online a great deal and cannot seem to find the answer to my question. My script works as it should when I change the parameters to produce less outputs so I know it works just fine. I have debugged the script and got no errors. When my parameters are changed to produce more outputs and the script runs for hours then it stops. My goal for the question below is to determine if linux will timeout a process running over time (or something related) and, if, how it can be resolved.
I am running a shell script that has several for loops which does the following:
- Goes through existing files and copies data into a newly saved/named file
- Makes changes to the data in each file
- Submits these files (which number in the thousands) to another system
The script is very basic (beginner here) but so long as I don't give it too much to generate, it works as it should. However if I want it to loop through all possible cases which means I will generates 10's of thousands of files, then after a certain amount of time the shell script just stops running.
I have more than enough hard drive storage to support all the files being created. One thing to note however is that during the part where files are being submitted, if the machine they are submitted to is full at that moment in time, the shell script I'm running will have to pause where it is and wait for the other machine to clear. This process works for a certain amount of time but eventually the shell script stops running and won't continue.
Is there a way to make it continue or prevent it from stopping? I typed control + Z to suspend the script and then fg to resume but it still does nothing. I check the status by typing ls -la to see if the file size is increasing and it is not although top/ps says the script is still running.
Assuming that you are using 'Bash' for your script - most likely, you are running out of 'system resources' for your shell session. Also most likely, the manner in which your script works is causing the issue. Without seeing your script it will be difficult to provide additional guidance, however, you can check several items at the 'system level' that may assist you, i.e.
review system logs for errors about your process or about 'system resources'
check your docs: man ulimit (or 'man bash' and search for 'ulimit')
consider removing 'deep nesting' (if present); instead, create work sets where step one builds the 'data' needed for the next step, i.e. if possible, instead of:
step 1 (all files) ## guessing this is what you are doing
step 2 (all files)
step 3 (all files
Try each step for each file - Something like:
for MY_FILE in ${FILE_LIST}
do
step_1
step_2
step_3
done
:)
Dale

Will conflicts arise between larger and smaller files in CRON job?

I have set a cron job for a file for every 6 hours. The file may run for 4hours.
If i set cron for another file , will it affect the previous one which may run for 4hours?
No. If the job is not working on same resources, it wont conflict even if it's running simultaneously.
The cron daemon doesn't check to see if anything else by the same name is running, if that is what you mean, so cron will not care. However, if your script creates temporary files, for example, without using helper-tools like "mktemp" they could conflict with each other - so that will depend how well written your script is.

Should I worry about File handles in logging from async methods?

I'm using the each() method of the async lib and experiencing some very odd (and inconsistent) errors that appear to be File handle errors when I attempt to log to file from within the child processes.
The array that I'm handing to this method frequently has hundreds of items and I'm curious if Node is having trouble running out of available file handles as it tries to log to file from within all these simultaneous processes. The problem goes away when I comment out my log calls, so it's definitely related to this somehow, but I'm having a tough time tracking down why.
All the logging is trying to go into a single file... I'm entirely unclear on how that works given that each write (presumably) blocks, which makes me wonder how all these simultaneous processes are able to run independently if they're all sitting around waiting on the file to become available to write to.
Assuming that this IS the source of my troubles, what's the right way to log from a process such as Asnyc.each() which runs N number of processes at once?
I think you should have some adjustable limit to how many concurrent/outstanding write calls you are going to do. No, none of them will block, but I think async.eachLimit or async.queue will give you the flexibility to set the limit low and be sure things behave and then gradually increase it to find out what resource constraints you eventually bump up against.

Resources