So I found out the hard way that doing this is really bad in Linux:
# a.sh
while true
do
some stuff
sh a.sh
done
I want to be able to update the script and have it fix itself. Is something like this considered safe instead?
# a.sh
while true
do
wget http://127.0.0.1/b.sh
sh b.sh
done
# b.sh
some stuff
This way I can update script b.sh and the next execution of it will be force updated since a.sh calls it?
If you want a process to alter its source code and re-launch itself, and you don't want multiple (slightly different) copies of the process running at the same time, then you somehow need to kill the parent process. It doesn't matter much how you do this; for instance, if the "rewrite self" section of the code is only triggered when a rewrite is actually necessary, then you could just put "exit" after the line that calls the rewritten script. (I suspect that something like mv -f b.sh a.sh && sh a.sh && exit might work, too, since I think the entire line would be sent to the Bash interpreter before the script is destroyed.)
If you do want multiple copies of the process running, then you need to find some way to limit this number. For instance, you might have the script check how many iterations of itself are already running before forking.
Note that in both cases, this assumes that your script has accurately modified itself, which...is a tricky problem, to say the least. You are treading on dangerous ground.
Finally, if all you really want is a daemon that gets automatically relaunched when it dies (which is what it sounds like you're describing in your comments), there are other ways to accomplish this. I'm not terribly familiar with how this sort of thing is typically done, but I imagine that in a shell script you could simply use trap. (trap a.sh EXIT might be sufficient, though that would make your script quite difficult to permanently kill should you later decide that you made a mistake.)
You probably want to exec the script. exec replaces the script in execution with a new execution environment, so it never returns (unless there was some problem starting the indicated program).
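A minimal sketch of what "never returns" means in practice. The function name demo_exec is made up for illustration, and the subshell is only there so the demo does not replace your interactive shell:

```shell
# demo_exec: hypothetical helper showing that nothing after a successful exec runs.
demo_exec() {
    (
        echo "before exec"
        # exec replaces the subshell process with the external echo program
        exec echo "replaced by exec"
        # only reached if exec itself fails to start the program
        echo "never printed"
    )
}
```

Calling demo_exec prints "before exec" and "replaced by exec" only. In the a.sh from the question, the equivalent change would be to end the script with an exec of itself instead of calling sh a.sh inside the loop, so no chain of parent processes piles up.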
Unless you know what you're doing, it's a bad idea to overwrite a running script. bash does not read the entire file into memory when it starts a script, so it will continue at the same byte offset in the overwritten file.
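One way to sidestep the byte-offset problem, sketched under the assumption that the new version is first downloaded to a separate file: copy it next to the target and rename it into place. A rename swaps the directory entry, so a shell that is still running the old file keeps reading the old contents, and the next launch picks up the new ones. The name install_atomically is hypothetical.

```shell
# install_atomically SRC DEST: replace DEST with a copy of SRC via rename,
# instead of overwriting DEST in place.
install_atomically() {
    src=$1
    dest=$2
    # The temp file must live in DEST's directory so that mv is a plain rename.
    tmp=$(mktemp "$(dirname "$dest")/.update.XXXXXX") || return 1
    if cp "$src" "$tmp" && mv -f "$tmp" "$dest"; then
        return 0
    fi
    rm -f "$tmp"
    return 1
}
```

Usage would look like install_atomically b.sh.new b.sh before the next run of b.sh. Note that mktemp creates the file readable by its owner only, which is enough when the script is started through an interpreter by the same user.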
Finally, don't use sh when you mean bash. sh might be a different shell, and even if it is bash, it will have different behaviour.
My Jenkins server (version 2.167) is running a shell build job that executes a script written with Python 3.7.0.
Sometimes users need to cancel the build manually (by clicking the red button with the white cross in the Jenkins GUI), and the Python script needs to handle the interruption in order to perform cleanup tasks before exiting. Sometimes the interruption is handled correctly, but at other times it seems that the parent process gets terminated before the Python script can run the cleanup procedure.
At the beginning of the Python script, I defined the following:
import signal
import sys

def cleanup_after_int(signum, frame):
    # some cleanup code here
    sys.exit(0)

signal.signal(signal.SIGINT, cleanup_after_int)
signal.signal(signal.SIGTERM, cleanup_after_int)
# the rest of the script here
Is the code I'm using sufficient, or should I consider something more?
The Jenkins doc for aborting a build is https://wiki.jenkins.io/display/JENKINS/Aborting+a+build
Found a pretty good document showing how this works: https://gist.github.com/datagrok/dfe9604cb907523f4a2f
You describe a race:
it seems that the parent process [sometimes] gets terminated before the Python script can run the cleanup procedure.
It would be helpful to know how you know that, in terms of the symptoms you observe.
In any event, the Python code you posted looks fine. It should work as expected if SIGTERM is delivered to your Python process. Perhaps Jenkins is just terminating the parent bash. Or perhaps both bash and Python are in the same process group and Jenkins signals the whole process group. Pay attention to the PGID column in ps -j output.
Perhaps your cleanup code is complex and requires resources that are not always available. For example, perhaps stdout is a pipe to the parent, and cleanup code logs to that open file descriptor, though sometimes a dead parent has closed it.
You might consider having the cleanup code first "daemonize", using the daemon(3) library call (section 3 of the manual): http://man7.org/linux/man-pages/man3/daemon.3.html. Then your cleanup would at least be less racy, leading to more reproducible results when you test it and when you use it in production.
You could choose to have the parent bash script orchestrate the cleanup:
trap "python cleanup.py" SIGINT SIGTERM
python doit.py
You could choose to not worry about cleaning upon exit at all. Instead, log whatever you've dirtied, and (synchronously) clean that just before starting, then begin your regularly scheduled script that does the real work. Suppose you create three temp files, and wish to tidy them up. Append each of their names to /tmp/temp_files.txt just before creating each one. Be sure to flush buffers and persist the write with fsync() or close().
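A rough sketch of that log-then-clean idea as shell helpers, in case the orchestration lives in the parent script. The function names and the MANIFEST variable are made up for illustration; the default path is the /tmp/temp_files.txt mentioned above.

```shell
# Path of the log of files that may need tidying (override MANIFEST to relocate it).
MANIFEST=${MANIFEST:-/tmp/temp_files.txt}

# create_tracked PATH: record PATH in the manifest first, then create the file.
# The redirection opens, appends and closes the manifest on every call.
create_tracked() {
    printf '%s\n' "$1" >> "$MANIFEST" && : > "$1"
}

# clean_previous_run: at startup, delete whatever the last run recorded,
# then empty the manifest. Entries for files that never got created are harmless.
clean_previous_run() {
    [ -f "$MANIFEST" ] || return 0
    while IFS= read -r path; do
        rm -f -- "$path"
    done < "$MANIFEST"
    : > "$MANIFEST"
}
```

The real work would call clean_previous_run once at the top and create_tracked for each temp file, so an aborted build leaves nothing that the next build cannot find and remove.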
Rather than logging, you could choose to clean at startup without a log. For example:
$ rm -f /tmp/{1,2,3}.txt
might suffice. If only the first two were created last time, and the third does not exist, no big deal. Use wildcards where appropriate.
What is the most straightforward way to create a "virtual" file in Linux that would allow the read operation on it, always returning the output of some particular command (run every time the file is being read from)? So, every read operation would cause an execution of a command, catching its output and passing it as the "content" of the file.
There is no way to create such a so-called "virtual file" directly. On the other hand, you can achieve this behaviour by implementing a simple synthetic filesystem in userspace via FUSE. Moreover, you don't have to use C; there are bindings even for scripting languages such as Python.
Edit: And chances are that something like this already exists: see for example scriptfs.
This is a great answer, which I copied below.
Basically, named pipes let you do this in scripting, and FUSE lets you do it easily in Python.
You may be looking for a named pipe.
mkfifo f
# Run the writer in the background: opening a FIFO blocks until the other end is opened too.
{
    echo 'V cebqhpr bhgchg.'
    sleep 2
    echo 'Urer vf zber bhgchg.'
} >f &
rot13 < f
Writing to the pipe doesn't start the listening program. If you want to process input in a loop, you need to keep a listening program running.
while true; do rot13 <f >decoded-output-$(date +%s.%N); done
Note that all data written to the pipe is merged, even if there are multiple processes writing. If multiple processes are reading, only one gets the data. So a pipe may not be suitable for concurrent situations.
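For the original question (a file whose reads return fresh command output), the same mechanism can be turned around: keep a writer loop running and let each reader trigger one run of the command. This is only a sketch; serve_fifo is a hypothetical name, and the COUNT argument exists just so the example terminates.

```shell
# serve_fifo FIFO COUNT CMD...: run CMD once per reader, COUNT times in total.
serve_fifo() {
    fifo=$1
    count=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        # Opening the FIFO for writing blocks until some process opens it
        # for reading, so CMD only runs when a reader shows up.
        "$@" > "$fifo"
        i=$((i + 1))
    done
}
```

For example, after mkfifo clock, running serve_fifo clock 3 date & lets cat clock return a fresh timestamp. Because of the merging described above, a second run of the command can end up in the same reader if it starts before that reader has seen end-of-file, so treat this as an approximation of a virtual file rather than a replacement for a FUSE filesystem.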
A named socket can handle concurrent connections, but this is beyond the capabilities for basic shell scripts.
At the most complex end of the scale are custom filesystems, which let you design and mount a filesystem where each open, write, etc., triggers a function in a program. The minimum investment is tens of lines of nontrivial coding, for example in Python. If you only want to execute commands when reading files, you can use scriptfs or fuseflt.
No one mentioned this but if you can choose the path to the file you can use the standard input /dev/stdin.
Every time the cat program runs, it ends up reading the output of the program writing to the pipe, which here is simply echo my input:
for i in 1 2 3; do
echo my input | cat /dev/stdin
done
outputs:
my input
my input
my input
I'm afraid this is not easily possible. When a process reads from a file, it uses system calls like open, fstat, read. You would need to intercept these calls and output something different from what they would return. This would require writing some sort of kernel module, and even then it may turn out to be impossible.
However, if you simply need to trigger something whenever a certain file is accessed, you could play with inotifywait:
#!/bin/bash
while inotifywait -qq -e access /path/to/file; do
echo "$(date +%s)" >> /tmp/access.txt
done
Run this as a background process, and you will get an entry in /tmp/access.txt each time your file is being read.
As for bash, is it a bad practice to store text output in variables? I don't mean few lines, but even as much as few MB of data. Should the variables be emptied after the script is done?
Edit: I didn't clarify the second part enough, I wanted to ask whether I should empty the variables in the case I run scripts in the current shell, not in a subshell, so that it doesn't drain memory. Or, shouldn't I run scripts in the current one at all?
Should the variables be emptied after the script is done
You need to understand that a script is executed in a sub-shell (a child of the present shell) that gets its own environment and variable space. When the script ends, that sub-shell exits and all the variables it held get destroyed/released anyway, so there is no need to empty variables programmatically.
As for bash, is it a bad practice to store text output in variables?
That's great practice! Carry on with bash programming, and don't worry about this kind of memory issue (until you want to store a Debian DVD image in a single $debian_iso variable; then you can have a problem).
I don't mean few lines, but even as much as few MB of data. Should the variables be emptied after the script is done?
All your variables in the bash shell evaporate when you finish executing your script. It will manage the memory for you. That said, if you assign foo="bar" you can access $foo in the same script, but obviously you won't see that $foo in another script.
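Regarding the edit about running scripts in the current shell: that case is different, because a sourced script shares the caller's variables. A small sketch to see both behaviours side by side (demo_scopes and the variable name big are made up for the example):

```shell
# demo_scopes: compare running a script as a child process with sourcing it.
demo_scopes() {
    script=$(mktemp) || return 1
    echo 'big="payload set by script"' > "$script"
    unset big
    sh "$script"                          # child process: its variables die with it
    echo "after child: ${big:-<unset>}"
    . "$script"                           # sourced: runs in the current shell
    echo "after source: ${big:-<unset>}"
    unset big                             # release it explicitly
    echo "after unset: ${big:-<unset>}"
    rm -f "$script"
}
```

This prints "after child: <unset>", then "after source: payload set by script", then "after unset: <unset>". So a large value set by a sourced script stays in the interactive shell until you unset it or close that shell.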
I am spawning multiple sh (sipp) scripts from a Tcl script. I want to know whether those scripts will run in parallel or as a child process, because I want to run them in parallel.
If I use threads extension then do I need to use any other packages along with that?
Thanks in advance.
Tcl can quite easily run multiple subprocesses in parallel. The way you do so depends on how you want to handle what those subprocesses do. (Bourne shell — sh — scripts work just fine as subprocesses.) None of these require threads. You can use threads too, but they're not necessary for just running subprocesses as, from Tcl's perspective at least, subprocess handling is a purely I/O-bound matter.
For more details, please narrow down (in another question) which type of subprocess handling you want to do.
Fire and Forget
If you don't care about tracking the subprocesses at all, just set them going in the background by putting a & as the last word to exec:
exec /bin/sh myscript.sh &
Keeping in Touch
To keep in touch with the subprocess, you need to open a pipeline (and use this weird stanza to do so; put the arguments inside a list with a | concatenated on the front):
set thePipe [open |[list /bin/sh myscript.sh]]
You can then read/gets from the pipe to get the output (yes, it supports fileevent and asynchronous I/O on all platforms). If you want to write to the pipe (i.e., to the subprocess's stdin) open with mode w, and to both read and write, use mode r+ or w+ (doesn't matter which, as it is a pipe and not a file). Be aware that you have to be a bit careful with pipes; you can get deadlocked or highly confused. I recommend using the asynchronous I/O style, with fconfigure $thePipe -blocking 0, but that's quite a bit different to a synchronous style of I/O handling.
Great Expectations
You can also use the Expect extension to work with multiple spawned subprocesses at once. To do that, you must save the id from each spawn in its own variable, then pass that id to expect and send with the -i option. You probably want to use expect_background.
set theId [spawn /bin/sh myscript.sh]
expect_background {
    -i $theId
    "password:" {
        send -i $theId "$mypass\r"
        # Etc.
    }
}
# Note that [expect_background] doesn't support 'timeout'
I have a very long command running on a very large file. It involves sort, uniq, grep and awk commands in the single command that pipes the results of one command to another.
Once I issue this command, the command prompt doesn't return until the command has completely executed.
Is there a way to know what is the progress of the command in terms of how much of its execution it has completed or anything similar that gives us an idea of how much of a particular command inside this main command has completed?
Without knowing exactly what you're doing I can't say whether or not it would work for you, but have a look at pv. It might fit the bill.
Perl was originally created because AWK wasn't quite powerful enough for the task at hand. With commands like sort and grep, and a syntax very similar to AWK's, it should not be hard to translate a command line using those programs into a short Perl script.
The advantage of Perl is that you can easily communicate the progress of your script via print statements. For example, you could indicate when the input file was done being loaded, when the sort was completed, etc.
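The same "report progress with print statements" idea can also stay inside the existing pipeline, since it already uses awk. This is a sketch rather than a replacement for pv; progress_every is a hypothetical helper, and it assumes an awk that accepts "/dev/stderr" as an output file (gawk, mawk and BusyBox awk do).

```shell
# progress_every N: copy stdin to stdout unchanged and report on stderr
# every N lines, so the counter never mixes with the data.
progress_every() {
    awk -v n="$1" '
        { print }
        NR % n == 0 { printf "%d lines read\n", NR > "/dev/stderr" }
    '
}
```

Something like progress_every 1000000 < big.log | sort | uniq -c (file name made up) would then show how far the input has been read. Keep in mind that sort has to consume all of its input before it emits anything, so a meter placed in front of it measures reading, not the sorting itself.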