Sequential FIFO queue for linux command line [closed]

I'm trying to find or implement a simple solution that can sequentially queue up Linux shell commands, so that they are executed one at a time. Here are the criteria:
The queue must execute the commands one at a time, i.e. no two commands can run at the same time.
I don't have the list of commands ahead of time. They will be coming in from web requests that my web server receives. That means the queue could be empty for a long time, and 10 requests can come in at the same time.
My web server can only do system calls to the shell, so this program/solution needs to be callable from the command line.
I only have one machine, so it can't and doesn't need to farm out the work to multiple machines.
Originally I thought the at command could do what I want, but the problem is that it doesn't execute the commands sequentially.
I'm thinking of implementing my own solution in Python with these parts:
Have a dedicated directory with a lock file
Queued commands are stored as individual files with the filename containing an incrementing sequence ID or timestamp or something similar, which I'll call "command files"
Write a Python script that uses the fcntl module on the lock file to ensure only 1 instance of the script is running
The script will watch the directory for any files and execute the shell commands in the files in the order of the filename
When the directory has no more "command files", the script will unlock the lock file and exit
When my web server wants to enqueue jobs, it will add a new "command file" and call my python script
The python script will check if another instance of itself is running. If yes, then exit, which will let the other instance handle the newly queued "command file". If no, then lock the lock file and start executing the "command files" in order
Does this sound like it'll work? The only race condition that I don't know how to handle is when the first instance of the script checks the directory and sees that it's empty, and before it unlocks the lock file, a new command is queued and a new instance of the script is invoked. That new instance will exit when it sees the file is locked, and then the original script will unlock the file and exit, leaving the new command unprocessed until the next request arrives.
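For concreteness, a rough shell sketch of what I have in mind (using flock(1) here instead of Python's fcntl; the directory and file names are just placeholders):
#!/bin/bash
QUEUE_DIR=/var/spool/cmdqueue        # placeholder spool directory

exec 9>"$QUEUE_DIR/.lock"            # open (and create) the lock file on fd 9
flock -n 9 || exit 0                 # another instance holds the lock; let it drain the queue

while true; do
    next=$(ls "$QUEUE_DIR" 2>/dev/null | sort | head -n 1)   # oldest "command file" first
    [ -z "$next" ] && break          # queue empty: exit, which releases the lock
    sh "$QUEUE_DIR/$next"            # execute the queued command
    rm -f "$QUEUE_DIR/$next"
done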
Is there something out there that already does this, so I don't have to implement it myself?

Use a named pipe, aka FIFO:
mkfifo /tmp/shellpipe
Start a shell process whose input comes from the pipe:
/bin/sh < /tmp/shellpipe
When the web server wants to execute a command, it writes it to the pipe.
sprintf(cmdbuf, "echo '%s' > /tmp/shellpipe", command);
system(cmdbuf);
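Note that the reading shell exits as soon as every writer has closed the pipe, so a single /bin/sh < /tmp/shellpipe only survives the first burst of commands. To keep the queue alive across requests, restart it in a loop (a minimal sketch):
while true; do
    /bin/sh < /tmp/shellpipe    # exits on end-of-file once all writers close; the loop reopens the pipe
done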

A POSIX message queue seems tailor-made for this and a whole lot simpler (and faster) than messing around with timestamped files and such. A script can enqueue the requests when they come in; another script dequeues the requests and executes them. There are some size limitations that apply to the queues, but it doesn't sound like you would come close to hitting them.

Related

How to acquire a lock file in Linux Bash [duplicate]

This question already has an answer here: Linux flock, how to "just" lock a file? (1 answer). Closed 1 year ago.
I have an existing lock file that is sometimes used by other processes. I want to temporarily acquire this lock file so that other programs that potentially use it have to wait for me to unlock it. Then I want to run a few commands and unlock it. How do I do this? I thought this would be easy, but for some reason I cannot figure it out at all. I understand that I would most likely need to use flock for this, but what arguments should I be using in this scenario? flock seems to always need a command or a second file to work; however, in this situation there doesn't seem to be one.
Context: A bash script I am using is running into race conditions around a particular lock file (/var/lib/apt/lists/lock to be specific), and to test my solution I want to reliably be able to lock this file so I can check if my changes to the script work.
You have an example in the flock(1) man page. You should execute the commands in a subshell:
(
flock -n 9 || exit 1    # try to take an exclusive lock on fd 9; exit right away if it is already held
...                     # the commands that need the lock go here
) 9>/var/lib/apt/lists/lock
Using this form the lock is released when the subshell exits.
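If you'd rather not wrap the commands in a subshell, you can also open the descriptor with exec in the current shell and lock and unlock it explicitly; a minimal sketch:
exec 9>/var/lib/apt/lists/lock   # open the lock file on file descriptor 9
flock 9                          # block until the exclusive lock is acquired
# ... run the commands that need the lock ...
flock -u 9                       # release the lock
exec 9>&-                        # close the descriptor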

Jenkins: After build cancel, handle SIGTERM interruption inside the Python script in order to clean up before finishing

My Jenkins server (version 2.167) is running a shell build job that executes a script written with Python 3.7.0.
Sometimes users need to cancel the build manually (by clicking on the red button with the white cross in the Jenkins GUI), and the Python script needs to handle the interruption in order to perform cleanup tasks before exiting. Sometimes the interruption is handled correctly, but other times it seems that the parent process gets terminated before the Python script can run the cleanup procedure.
At the beginning of the Python script, I defined the following:
import signal
import sys

def cleanup_after_int(signum, frame):
    # some cleanup code here
    sys.exit(0)

signal.signal(signal.SIGINT, cleanup_after_int)
signal.signal(signal.SIGTERM, cleanup_after_int)

# the rest of the script here
Is the code I'm using sufficient, or should I consider something more?
The Jenkins doc for aborting a build is https://wiki.jenkins.io/display/JENKINS/Aborting+a+build
Found a pretty good document showing how this works: https://gist.github.com/datagrok/dfe9604cb907523f4a2f
You describe a race:
it seems that the parent process [sometimes] gets terminated before the Python script can run the cleanup procedure.
It would be helpful to know how you know that, in terms of the symptoms you observe.
In any event, the Python code you posted looks fine. It should work as expected if SIGTERM is delivered to your Python process. Perhaps Jenkins is just terminating the parent bash, or perhaps both bash and Python are in the same process group and Jenkins signals the whole group. Pay attention to PGRP in ps -j output.
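For example (the PIDs here are placeholders):
ps -j -p 12345,12346   # compare the process-group column (PGID/PGRP) for the shell and the Python process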
Perhaps your cleanup code is complex and requires resources that are not always available. For example, perhaps stdout is a pipe to the parent, and cleanup code logs to that open file descriptor, though sometimes a dead parent has closed it.
You might consider having the cleanup code first "daemonize", using this chapter 3 call: http://man7.org/linux/man-pages/man3/daemon.3.html. Then your cleanup would at least be less racy, leading to more reproducible results when you test it and when you use it in production.
You could choose to have the parent bash script orchestrate the cleanup:
trap "python cleanup.py" SIGINT SIGTERM
python doit.py
You could choose to not worry about cleaning upon exit at all. Instead, log whatever you've dirtied, and (synchronously) clean that just before starting, then begin your regularly scheduled script that does the real work. Suppose you create three temp files, and wish to tidy them up. Append each of their names to /tmp/temp_files.txt just before creating each one. Be sure to flush buffers and persist the write with fsync() or close().
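A sketch of that pattern, with placeholder file names:
# On startup: remove whatever the previous run recorded, then truncate the log.
if [ -f /tmp/temp_files.txt ]; then
    xargs -r rm -f < /tmp/temp_files.txt
    : > /tmp/temp_files.txt
fi

# During the run: record each name just before creating the file, as described above.
tmp=/tmp/doit.$$.scratch
echo "$tmp" >> /tmp/temp_files.txt
: > "$tmp"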
Rather than logging, you could choose to clean at startup without a log. For example:
$ rm -f /tmp/{1,2,3}.txt
might suffice. If only the first two were created last time, and the third does not exist, no big deal. Use wildcards where appropriate.

Is there a way to make a bash script process messages that have been sent to it using the write command

Is there a way to make a bash script process messages that have been sent to it using the "write" command? So for example, if a user wants to activate a feature in my script, could I make it so that they can send the script a command using the write command?
One possible method I thought of was to configure logging for a screen session and then have the bash script parse the logged text, but I'm not sure if there would be a simpler or more efficient way to tackle this.
EDIT: I was thinking that as an alternative solution I could use a named pipe. I'm worried, though, that it would break if the /tmp partition gets filled up completely (not sure if this would impact write as well?). I'm going to be running this script on a shared box, and every once in a while someone will completely fill up the /tmp partition and then just leave it like that until people start complaining.
Hmm, you are trying to bend a poor Unix command into doing something it was never specified for. From the man page (emphasis mine):
The write utility allows you to communicate with other users, by copying
lines from your terminal to theirs
That means that write is intended to copy lines directly onto terminals. As soon as you say "I will dump the terminal output with screen and then parse the dump file", you lose the simplicity of write (and you also need disk space, plus the problem of removing old lines from a sequential file).
Worse, since your script lives on its own, it could (should?) be a daemon attached to no terminal at all.
So if I have correctly understood your question, your requirements are:
a script that does some tasks and should be able to respond to asynchronous requests. Common approaches are named pipes, or network or Unix domain sockets; less common are files dropped in a dedicated folder with an optional signal for immediate processing. Appending lines to a shared sequential file, while possible, is uncommon because of access-synchronization problems
a simple and user-friendly way for users to pass requests. OK, write is nice for that part, but it is much too hard to interface with, IMHO
If you do not want to waste time on that part and would rather use standard tools, I would recommend the mail system. It is trivial to alias a mail address to a program that will be called with the mail message as its input. But I am not sure it is worth it, because the user could just as well call the program directly with the request as input or as a command-line parameter.
So the client part could be simply a program that (a shell sketch follows this list):
create a temporary file in a dedicated folder (mkstemp is your friend in C or C++, or mktemp in shell; but beware of race conditions)
write the request to that file
optionally send a signal to a PID, provided the script writes its own PID to a dedicated file on startup
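A shell sketch of such a client (every name here is hypothetical, not something prescribed above):
#!/bin/bash
SPOOL=/var/spool/myscript
req=$(mktemp "$SPOOL/req.XXXXXX")       # mktemp avoids the race condition mentioned above
printf '%s\n' "activate feature X" > "$req"
# Optionally nudge the daemon, assuming it wrote its PID to a file at startup.
[ -f /var/run/myscript.pid ] && kill -USR1 "$(cat /var/run/myscript.pid)"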

linux: Determining file handle identity via procfs

I'm trying to determine whether it's possible to distinguish between two separate handles on the same file, and a single handle with two file descriptors pointing to it, using metadata from procfs.
Case 1: Two File Handles
# setup
exec 3>test.lck
exec 4>test.lck
# usage
flock -x 3 # this grabs an exclusive lock
flock -s 4 # this blocks
echo "This code is never reached"
Case 2: One Handle, Two FDs
# setup
exec 3>test.lck
exec 4>&3
# usage
flock -x 3 # this grabs an exclusive lock
flock -s 4 # this converts that lock to a shared lock
echo "This code gets run"
If I'm inspecting a system's state from userland after the "setup" stage has finished and before the "usage", and I want to distinguish between those two cases, is the necessary metadata available? If not, what's the best way to expose it? (Is adding kernelspace pointers to /proc/*/fdinfo a reasonable action, which upstream is likely to accept as a patch?)
I'm unaware of anything exposing this in proc as it is. Figuring this out may be useful when debugging some crap, but then you can just inspect the state with the kernel debugger or a systemtap script.
From your question it seems you want to achieve this in a manner that can be easily scripted, and here I have to ask what the real problem is.
I have no idea if the Linux folks would be interested in exposing this. One problem is that exposing a pointer to a struct file adds another infoleak and thus would likely be plugged in the future. Other means would require numbering all file objects, and that's not going to happen. Regardless, you would be asked for a justification, in a similar way to how I asked you above.

How to create a virtual command-backed file in Linux?

What is the most straightforward way to create a "virtual" file in Linux that would allow the read operation on it, always returning the output of some particular command (run every time the file is read)? So, every read operation would cause an execution of a command, catching its output and passing it as the "content" of the file.
There is no way to create such a so-called "virtual file" directly. On the other hand, you can achieve this behaviour by implementing a simple synthetic filesystem in userspace via FUSE. Moreover, you don't have to use C; there are bindings even for scripting languages such as Python.
Edit: And chances are that something like this already exists: see for example scriptfs.
This is a great answer I copied below.
Basically, named pipes let you do this in scripting, and FUSE lets you do it easily in Python.
You may be looking for a named pipe.
mkfifo f
{
echo 'V cebqhpr bhgchg.'
sleep 2
echo 'Urer vf zber bhgchg.'
} >f
rot13 < f
Writing to the pipe doesn't start the listening program. If you want to process input in a loop, you need to keep a listening program running.
while true; do rot13 <f >decoded-output-$(date +%s.%N); done
Note that all data written to the pipe is merged, even if there are multiple processes writing. If multiple processes are reading, only one gets the data. So a pipe may not be suitable for concurrent situations.
A named socket can handle concurrent connections, but this is beyond the capabilities of basic shell scripts.
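If you do need concurrent clients from a script, an external tool such as socat can provide the listening side; a sketch with hypothetical paths:
# Each connecting client gets its own handler process, with the request on its stdin.
socat UNIX-LISTEN:/tmp/cmd.sock,fork EXEC:/usr/local/bin/handle-request.sh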
At the most complex end of the scale are custom filesystems, which let you design and mount a filesystem where each open, write, etc., triggers a function in a program. The minimum investment is tens of lines of nontrivial coding, for example in Python. If you only want to execute commands when reading files, you can use scriptfs or fuseflt.
No one mentioned this, but if you can choose the path to the file, you can use the standard input, /dev/stdin.
Every time the cat program runs, it ends up reading the output of the program writing to the pipe, which here is simply echo my input:
for i in 1 2 3; do
echo my input | cat /dev/stdin
done
outputs:
my input
my input
my input
I'm afraid this is not easily possible. When a process reads from a file, it uses system calls like open, fstat, read. You would need to intercept these calls and output something different from what they would return. This would require writing some sort of kernel module, and even then it may turn out to be impossible.
However, if you simply need to trigger something whenever a certain file is accessed, you could play with inotifywait:
#!/bin/bash
while inotifywait -qq -e access /path/to/file; do
echo "$(date +%s)" >> /tmp/access.txt
done
Run this as a background process, and you will get an entry in /tmp/access.txt each time your file is being read.
