Prevent script running with same arguments twice - linux

We are looking into building a logcheck script that will tail a given log file and email when the given arguments are found. I am having trouble accurately determining if another version of this script is running with at least one of the same arguments against the same file. Script can take the following:
logcheck -i <filename(s)> <searchCriterion> <optionalEmailAddresses>
I have tried to use ps aux with a series of grep, sed, and cut, but it always ends up being more code than the script itself and seldom works very efficiently. Is there an efficient way to tell if another version of this script is running with the same filename and search criteria? A few examples of input:
EX1 ./logcheck -i file1,file2,file3 "foo string 0123" email@address.com
EX2 ./logcheck -s file1 Hello,World,Foo
EX3 ./logcheck -i file3 foo email@address1.com,email@address2.com
In this case 3 should not run because 1 is already running with parameters file3 and foo.

There are many solutions to your problem. I would recommend creating a lock file with the following format:
arg1Ex1 PID#(Ex1)
arg2Ex1 PID#(Ex1)
arg3Ex1 PID#(Ex1)
arg4Ex1 PID#(Ex1)
arg1Ex2 PID#(Ex2)
arg2Ex2 PID#(Ex2)
arg3Ex2 PID#(Ex2)
arg4Ex2 PID#(Ex2)
When your script starts:
It searches the lock file for all the arguments it has received (with awk or grep).
If one of the arguments is present in the list, it fetches the recorded PID (awk '{print $2}', for example) and checks with ps whether that process is still running. This double check guards against concurrency and against stale entries left behind by a process that previously ended abnormally.
If the PID is still alive, the script does not run.
Otherwise, it appends its arguments to the lock file with the current process PID and runs.
At the end of execution, remove the lines that contain the arguments used by the script, or remove all lines with its PID.
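The steps above can be sketched in bash roughly as follows. This is a minimal sketch, not a hardened implementation: the lock file path and the function names (args_in_use, register_args, release_args) are hypothetical, records are tab-separated so that arguments may contain spaces, and kill -0 is used instead of ps to test whether a PID is alive (it only works for processes you are allowed to signal). The check and the append are not atomic, so two instances starting at the same instant could still both pass the check.

```shell
#!/usr/bin/env bash
# Hypothetical lock file location; one "argument<TAB>PID" record per line.
LOCKFILE=${LOCKFILE:-/tmp/logcheck.lock}

# Succeeds if any of the given arguments is registered by a live process.
args_in_use() {
    local arg pid wanted
    [ -f "$LOCKFILE" ] || return 1
    while IFS=$'\t' read -r arg pid; do
        for wanted in "$@"; do
            # Stale records (dead PIDs) are simply ignored.
            if [ "$arg" = "$wanted" ] && kill -0 "$pid" 2>/dev/null; then
                return 0
            fi
        done
    done < "$LOCKFILE"
    return 1
}

# Appends one record per argument, owned by the current shell's PID.
register_args() {
    local arg
    for arg in "$@"; do
        printf '%s\t%s\n' "$arg" "$$"
    done >> "$LOCKFILE"
}

# Drops every record owned by the current shell's PID.
release_args() {
    local tmp
    [ -f "$LOCKFILE" ] || return 0
    tmp=$(mktemp) || return
    awk -F'\t' -v pid="$$" '$2 != pid' "$LOCKFILE" > "$tmp" && mv "$tmp" "$LOCKFILE"
}
```

A script would typically call args_in_use with its file names and search terms, exit if it succeeds, and otherwise call register_args followed by trap release_args EXIT.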

Related

How to use set -x without showing stdout?

Within CI, I am running a bash script that calls many bash scripts.
./internals/declination/create "${RELEASE_VERSION}" "${CI_COMMIT_REF_NAME}" > /dev/null
This does not disable the stdout returned by the script.
The GitLab CI runners stop logging once the log grows too large; the job reports: Job's log exceeded limit of 10240000 bytes.
I know the script's log can only grow.
How can I optimize the output log size?
I don't need to have all the stdout, I can have stderr but then it will be a long running script without information.
Is there a way to display the commands which are running, like when doing set -x?
Edit
Reading the answers, I was not able to solve my issue. I need to add that I am using nodejs to run the bash script that runs the long bash script.
This is how I call my node script within .gitlab-ci.yml:
script:
  - node my_script.js
Within my_script.js, I have:
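const { spawn } = require('child_process');
const path = require('path');

exports.handler = () => {
  // Run release.sh with the parent's stdio so its output reaches the CI log.
  const ls = spawn('bash', [path.join(__dirname, 'release.sh')], { stdio: 'inherit' });
  ls.on('close', (code) => {
    if (code !== 0) {
      console.log(`ps process exited with code ${code}`);
      process.exitCode = code;
    }
  });
};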
Within my_script.sh, I have:
./internals/declination/create "${RELEASE_VERSION}" "${CI_COMMIT_REF_NAME}" > /dev/null
You can selectively redirect file handles with exec.
exec >stdout 2>stderr
This however loses the connection to the terminal, so there is no simple way to output anything to the terminal after this point.
You can instead duplicate a file handle with m>&n, where n is the number of the file descriptor to duplicate and m is the number of the new one (choose a big number like 99 to not accidentally clobber an existing handle).
exec 98<&1 # stdout
exec 99<&2 # stderr
exec >/dev/null 2>&1
:
To re-enable output,
exec 1<&98 2<&99
If you redirected to a temporary file instead of /dev/null you could obviously now show the tail of those files to the caller.
tail -n 100 "$TMPDIR"/stdout "$TMPDIR"/stderr
(On a shared server, probably use mktemp to create a unique temporary directory at the beginning of your script; static hard-coded file names make it impossible to run two builds at the same time.)
As you usually can't predict where the next error will happen, probably put all of this in a wrapper script which performs the redirection, runs the build, and finally displays the tail end of the temporary log files. Some build servers probably want to see some signs of life in the log file every few minutes, so perhaps tail a few lines every once in a while in a loop, too.
On the other hand, if there is just a single build command, the whole build job's stdout and stderr can simply be redirected to a log file, and you don't need to exec things back and forth. If you need to enable output selectively for portions of the script, use exec as above; but for wholesale redirection, just redirect the one command.
In summary, maybe your build script would look something like this.
#!/bin/sh
t=$(mktemp -t -d cibuild.XXXXXXXX) || exit
trap 'kill $buildpid; wait $buildpid; tail -n 500 "$t"/*; rm -rf "$t"' 0 1 2 3 5 15
# Your original commands here
${initial_process_wd}/internals/declination/create "${RELEASE_VERSION}" "${CI_COMMIT_REF_NAME}">"$t"/stdout 2>"$t"/stderr &
buildpid=$!
while kill -0 $buildpid; do
    sleep 180
    date
    tail -n 1 "$t"/*
done
wait
A flaw with this approach is that you lose timing information. A proper solution would let you see when each line was produced, and display standard output and standard error intermixed in the order the messages were printed, perhaps with visible time stamps, and even with coloring hints (red time stamps for stderr?).
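One way to recover some of that timing information is to pass each stream through a small filter that prefixes every line with the time and a stream tag. The following is only a sketch: stamp is a hypothetical helper name, the time format is an arbitrary choice, and because the two streams go through separate filter processes their relative order is not strictly guaranteed.

```shell
#!/usr/bin/env bash
# stamp TAG: hypothetical filter that prefixes each input line
# with the current time and the given tag.
stamp() {
    local tag=$1 line
    while IFS= read -r line; do
        printf '%s [%s] %s\n' "$(date +%H:%M:%S)" "$tag" "$line"
    done
}

# Possible usage with bash process substitution (build.sh is a placeholder):
#   ./build.sh > >(stamp out) 2> >(stamp err >&2)
```

Both stamped streams can then be redirected to the same log file, and the tag shows which stream a line came from.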
Option 1
If your script will output the error message to stderr, you can ignore all output to stdout by using command > /dev/null, where /dev/null is a black hole that will take away any output to it.
Option 2
If there's any pattern on your error message, you can use grep to filter out those error messages.
Edit 1:
To show the command that is running, you can pass the -x option to bash; your command then becomes
bash -x ${initial_process_wd}/internals/declination/create "${RELEASE_VERSION}" "${CI_COMMIT_REF_NAME}" > /dev/null
bash will print each command it executes to stderr.
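The same effect can be had for a single command from inside a script. The sketch below assumes a hypothetical helper named run_quietly: tracing is enabled in a subshell so it does not leak into the rest of the script, the trace goes to stderr, and only stdout is discarded.

```shell
#!/usr/bin/env bash
# run_quietly CMD [ARGS...]: hypothetical helper that traces the command
# on stderr (as set -x does) while discarding the command's stdout.
run_quietly() {
    ( set -x; "$@" ) >/dev/null
}

# Possible usage, with the command from the question:
#   run_quietly ./internals/declination/create "${RELEASE_VERSION}" "${CI_COMMIT_REF_NAME}"
```

Error messages from the command still appear, because stderr is left untouched.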
Edit 2:
If you want to reduce the size of the output file, you can pass it to gzip by using ${initial_process_wd}/internals/declination/create "${RELEASE_VERSION}" "${CI_COMMIT_REF_NAME}" | gzip > logfile.
To read the content of the logfile, you can use zcat logfile.

How to get the complete calling command of a BASH script from inside the script (not just the arguments)

I have a BASH script that has a long set of arguments and two ways of calling it:
my_script --option1 value --option2 value ... etc
or
my_script val1 val2 val3 ..... valn
This script in turn compiles and runs a large FORTRAN code suite that eventually produces a netcdf file as output. I already have all the metadata in the netcdf output global attributes, but it would be really nice to also include the full run command one used to create that experiment. Thus another user who receives the netcdf file could simply reenter the run command to rerun the experiment, without having to piece together all the options.
So that is a long way of saying, in my BASH script, how do I get the last command entered from the parent shell and put it in a variable? i.e. the script is asking "how was I called?"
I could try to piece it together from the option list, but the very long option list and two interface methods would make this long and arduous, and I am sure there is a simple way.
I found this helpful page:
BASH: echoing the last command run
but this only seems to work to get the last command executed within the script itself. The asker also refers to use of history, but the answers seem to imply that the history will only contain the command after the programme has completed.
Many thanks if any of you have any idea.
You can try the following:
myInvocation="$(printf %q "$BASH_SOURCE")$((($#)) && printf ' %q' "$@")"
$BASH_SOURCE refers to the running script (as invoked), and $@ is the array of arguments; (($#)) && ensures that the following printf command is only executed if at least 1 argument was passed; printf %q is explained below.
While this won't always be a verbatim copy of your command line, it'll be equivalent - the string you get is reusable as a shell command.
chepner points out in a comment that this approach will only capture what the original arguments were ultimately expanded to:
For instance, if the original command was my_script $USER "$(date +%s)", $myInvocation will not reflect these arguments as-is, but will rather contain what the shell expanded them to; e.g., my_script jdoe 1460644812
chepner also points out that getting the actual raw command line as received by the parent process will be (next to) impossible. Do tell me if you know of a way.
However, if you're prepared to ask users to do extra work when invoking your script or you can get them to invoke your script through an alias you define - which is obviously tricky - there is a solution; see bottom.
Note that use of printf %q is crucial to preserving the boundaries between arguments - if your original arguments had embedded spaces, something like $0 $* would result in a different command.
printf %q also protects against other shell metacharacters (e.g., |) embedded in arguments.
printf %q quotes the given argument for reuse as a single argument in a shell command, applying the necessary quoting; e.g.:
$ printf %q 'a |b'
a\ \|b
a\ \|b is equivalent to single-quoted string 'a |b' from the shell's perspective, but this example shows how the resulting representation is not necessarily the same as the input representation.
Incidentally, ksh and zsh also support printf %q, and ksh actually outputs 'a |b' in this case.
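To see that the quoted string really is reusable, here is a small round-trip sketch. quote_cmd is a hypothetical helper name; it relies on bash's printf -v and %q, and the sample arguments are made up for illustration.

```shell
#!/usr/bin/env bash
# quote_cmd CMD [ARGS...]: hypothetical helper that renders a command and
# its arguments as one string that can be pasted back into a shell.
quote_cmd() {
    local out
    printf -v out '%q ' "$@"      # quote each word, separated by spaces
    printf '%s\n' "${out% }"      # drop the trailing space
}

# Save a command line whose arguments contain spaces and a pipe symbol...
saved=$(quote_cmd printf '<%s>\n' 'a |b' 'two  spaces')

# ...and re-run it from the saved string; argument boundaries survive.
eval "$saved"
```

The saved string will usually not look like what was typed, but evaluating it passes the same arguments to the command.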
If you're prepared to modify how your script is invoked, you can pass $BASH_COMMAND as an extra argument: $BASH_COMMAND contains the raw[1] command line of the currently executing command.
For simplicity of processing inside the script, pass it as the first argument (note that the double quotes are required to preserve the value as a single argument):
my_script "$BASH_COMMAND" --option1 value --option2
Inside your script:
# The *first* argument is what "$BASH_COMMAND" expanded to,
# i.e., the entire (alias-expanded) command line.
myInvocation=$1 # Save the command line in a variable...
shift # ... and remove it from "$@".
# Now process "$@", as you normally would.
Unfortunately, there are only two options when it comes to ensuring that your script is invoked this way, and they're both suboptimal:
The end user has to invoke the script this way - which is obviously tricky and fragile (you could however, check in your script whether the first argument contains the script name and error out, if not).
Alternatively, provide an alias that wraps the passing of $BASH_COMMAND as follows:
alias my_script='/path/to/my_script "$BASH_COMMAND"'
The tricky part is that this alias must be defined in all end users' shell initialization files to ensure that it's available.
Also, inside your script, you'd have to do extra work to re-transform the alias-expanded version of the command line into its aliased form:
# The *first* argument is what "$BASH_COMMAND" expanded to,
# i.e., the entire (alias-expanded) command line.
# Here we also re-transform the alias-expanded command line to
# its original aliased form, by replacing everything up to and including
# "$BASH_COMMAND" with the alias name.
myInvocation=$(sed 's/^.* "\$BASH_COMMAND"/my_script/' <<<"$1")
shift # Remove the first argument from "$@".
# Now process "$@", as you normally would.
Sadly, wrapping the invocation via a script or function is not an option, because the $BASH_COMMAND truly only ever reports the current command's command line, which in the case of a script or function wrapper would be the line inside that wrapper.
[1] The only thing that gets expanded are aliases, so if you invoked your script via an alias, you'll still see the underlying script in $BASH_COMMAND, but that's generally desirable, given that aliases are user-specific.
All other arguments and even input/output redirections, including process substitutions <(...), are reflected as-is.
"$0" contains the script's name, "$@" contains the parameters.
Do you mean something like echo $0 $*?

How to use sed command to delete lines without backup file?

I have a large file, about 130 GB in size.
# ls -lrth
-rw-------. 1 root root 129G Apr 20 04:25 syslog.log
I need to reduce the file size by deleting the lines that start with "Nov 2", so I ran the following command:
sed -i '/Nov 2/d' syslog.log
I also can't edit the file using the Vim editor.
When I run the sed command, it creates a backup file as well, but I don't have much space in root. Please suggest an alternative way to delete particular lines from this file without using more space on the server.
It does not create a real backup file. sed is a stream editor. When applied to a file with option -i, it will stream that file through the sed process, write the output to a new (temporary) file and, when everything is done, rename the new file to the original name.
(There are options to create backup files also, but you didn't give them, so I won't mention that further.)
In your case you have a very large file and don't want to create any copy, however temporary. For this you need to open the file for reading and writing at the same time, then your sed process can overwrite the original. After this, you will have to truncate the file at the end of the writing.
To demonstrate how this can be done, we first perform a test case.
Create a test file, containing lots of lines:
seq 0 999999 > x
Now, let's say we want to remove all lines containing the digit 4:
grep -v 4 1<>x <x
This will open the file for reading and writing as STDOUT (1), and for reading as STDIN. The grep command will read all lines and will output only the lines not containing a 4 (option -v).
This will effectively overwrite the beginning of the original file.
You will not know how long the output is, so after the output the original contents of the file will appear:
…
999991
999992
999993
999995
999996
999997
999998
999999
537824
537825
537826
537827
537828
537829
…
You can use the Unix tool truncate to shorten your file manually afterwards. In a real scenario you will have trouble finding the right spot for this, so it makes sense to count the number of bytes written (using wc):
(Don't forget to recreate the original x for this test.)
(grep -v 4 <x | tee /dev/stderr 1<>x) |& wc -c
This will perform the step above and additionally print out the number of bytes written to the terminal; in this example case the output will be 3653658. Now use truncate:
truncate -s 3653658 x
Now you have the result you want.
If you want to do this in a script, i. e. without interaction, you can use this:
length=$( (grep -v 4 <x | tee /dev/stderr 1<>x) |& wc -c)
truncate -s "$length" x
I cannot guarantee that this will work for files >2GB or >4GB on your machine; depending on your operating system (32bit?) and the versions of the installed tools you might run into largefile issues. I'd perform tests with large files first (>4GB as this is typically a limit for many things) and then cross your fingers and give it a try :)
Some caveats you have to keep in mind:
Of course, nobody is supposed to append log entries to that log file while the procedure is running.
Also, any abort during the running of the process (power failure, signal caught, etc.) will leave the file in an undefined state. But re-running the command again after such a mishap will in most cases produce the correct output; some lines might be doubled, but not more than a single line should be corrupted then.
The output must be smaller than the input, of course, otherwise the writing will overtake the reading, corrupting the whole result so that lines which should be there will be missing (or truncated at the start).
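Wrapped up as a function, the technique might look like the sketch below. The name delete_lines_in_place is hypothetical, the pattern is treated as an extended regular expression, and GNU truncate plus a Linux-style /dev/stderr are assumed. All of the caveats above still apply, so try it on a copy or a small test file first.

```shell
#!/usr/bin/env bash
# delete_lines_in_place PATTERN FILE: hypothetical helper that removes all
# lines matching PATTERN from FILE without creating a temporary copy.
delete_lines_in_place() {
    local pattern=$1 file=$2 length
    # Refuse to touch the file if grep rejects the pattern (exit status 2).
    grep -Eq -- "$pattern" </dev/null || [ "$?" -eq 1 ] || return 2
    # tee overwrites the start of the file through 1<> (no truncation) and
    # sends a second copy via /dev/stderr into the pipe, where wc counts it.
    length=$( (grep -Ev -- "$pattern" <"$file" | tee /dev/stderr 1<>"$file") 2>&1 | wc -c ) || return
    # Cut off the leftover tail of the original contents.
    truncate -s "$length" "$file"
}

# Possible usage for the question's case:
#   delete_lines_in_place '^Nov 2' syslog.log
```

Anything grep prints on stderr while filtering would be counted as well, which is one more reason to test the pattern on a small sample beforehand.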

Bash: pipe command output to function as the second argument

In my bash script I have a function for appending messages to the log file. It is used as follows:
addLogEntry (debug|info|warning|error) message
It produces nicely formatted lines with severity indication, timestamp and calling function name.
I've been looking for a way to pass output of some standard commands like rm to this function, while still being able to specify severity as the first argument. I'd also like to capture both stdout and stderr.
Is this possible without using a variable? It just feels excessive to involve variables to record a measly log message, and it encumbers the code too.
You have two choices:
You can add support to your addLogEntry function to have it accept the message from standard input (when no message argument is given or when - is given as the message).
You can use Command Substitution to run the command and capture its output as an argument to your function:
addLogEntry info "$(rm -v .... 2>&1)"
Note that this will lose any trailing newlines in the output however (in case that matters).
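The first choice could look something like this. The function below is only a stand-in for your real addLogEntry (it prints "severity message" instead of writing formatted lines to a log file); the part of interest is the branch that reads the message from standard input when no message argument, or -, is given.

```shell
#!/usr/bin/env bash
# Stand-in for the real addLogEntry: prints "severity message" lines.
# With no message argument (or "-"), each line of stdin becomes one entry.
addLogEntry() {
    local severity=$1 line
    shift
    if [ "$#" -eq 0 ] || [ "$1" = "-" ]; then
        while IFS= read -r line; do
            printf '%s %s\n' "$severity" "$line"
        done
    else
        printf '%s %s\n' "$severity" "$*"
    fi
}

# Possible usage, capturing both stdout and stderr without a variable:
#   rm -v ... 2>&1 | addLogEntry info
```

Because the function runs in the current shell at the end of the pipeline, no variable is needed and the severity stays the first argument.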
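You can also use xargs to accomplish this, provided addLogEntry is available as an executable command (xargs cannot call a shell function directly):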
$ rm -v ... 2>&1 | xargs -I% addLogEntry info %
info removed 'blah1'
info removed 'blah2'
...
With this command, addLogEntry is called once for every line of input.

Unix: What does cat by itself do?

I saw the line data=$(cat) in a bash script (just declaring an empty variable) and am mystified as to what that could possibly do.
I read the man pages, but it doesn't have an example or explanation of this. Does this capture stdin or something? Any documentation on this?
EDIT: Specifically how the heck does doing data=$(cat) allow for it to run this hook script?
#!/bin/bash
# Runs all executable pre-commit-* hooks and exits after,
# if any of them was not successful.
#
# Based on
# http://osdir.com/ml/git/2009-01/msg00308.html
data=$(cat)
exitcodes=()
hookname=`basename $0`
# Run each hook, passing through STDIN and storing the exit code.
# We don't want to bail at the first failure, as the user might
# then bypass the hooks without knowing about additional issues.
for hook in $GIT_DIR/hooks/$hookname-*; do
    test -x "$hook" || continue
    echo "$data" | "$hook"
    exitcodes+=($?)
done
https://github.com/henrik/dotfiles/blob/master/git_template/hooks/pre-commit
cat will catenate its input to its output.
In the context of the variable capture you posted, the effect is to assign the statement's (or containing script's) standard input to the variable.
The command substitution $(command) will return the command's output; the assignment will assign the substituted string to the variable; and in the absence of a file name argument, cat will read and print standard input.
The Git hook script you found this in captures the commit data from standard input so that it can be repeatedly piped to each hook script separately. You only get one copy of standard input, so if you need it multiple times, you need to capture it somehow. (I would use a temporary file, and quote all file name variables properly; but keeping the data in a variable is certainly okay, especially if you only expect fairly small amounts of input.)
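A small sketch of that idea: standard input is captured once and then fed to two different consumers. The function name replay_twice and the two consumers are arbitrary choices for illustration. Note that command substitution strips trailing newlines, so the copy is not always byte-for-byte identical to the input.

```shell
#!/usr/bin/env bash
# replay_twice: hypothetical demo that reads stdin once with $(cat)
# and then pipes the captured text to two separate commands.
replay_twice() {
    local data
    data=$(cat)                               # consumes all of stdin
    printf '%s\n' "$data" | grep -c ''        # first consumer: count lines
    printf '%s\n' "$data" | tr 'a-z' 'A-Z'    # second consumer: upper-case
}

# Possible usage:
#   printf 'one\ntwo\n' | replay_twice
```

Without the capture, the first consumer would read stdin to the end and the second would see nothing.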
Doing:
t@t:~# temp=$(cat)
hello how
are you?
t@t:~# echo $temp
hello how are you?
(A single Ctrl-D on a line by itself following "are you?" terminates the input.)
As manual says
cat - concatenate files and print on the standard output
Also
cat Copy standard input to standard output.
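Here, cat copies your STDIN to its output, and the command substitution assigns that text to the variable temp. (The line break is still stored in temp; the unquoted echo $temp above merely prints it as a space.)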
Say your bash script script.sh is:
#!/bin/bash
data=$(cat)
Then, the following commands will store the string STR in the variable data:
echo STR | bash script.sh
bash script.sh < <(echo STR)
bash script.sh <<< STR
