"Argument list too long" with only 2 arguments? - linux

I'm debugging someone else's code, and I've run into a situation I wouldn't know how to produce if I tried to code it deliberately. It's coming from a very large Bash script, being run by Bash 4.1.2 on a CentOS 6 box. While the overall program is gigantic, the error consistently occurs in the following function:
get_las() {
echo "Getting LAS..."
pushd ${ferret_workdir} >& /dev/null
#Download:
if [ ! -e ${las_dist_file} ] || ((force_install)) ; then
echo "Don't see LAS tar file ${las_dist_file}"
echo "Downloading LAS from ${las_dist_file} -to-> $(pwd)/${las_dist_file}"
echo "wget -O '${las_dist_file}' '${las_tar_url}'"
wget -O "${las_dist_file}" "${las_tar_url}"
[ $? != 0 ] && echo " ERROR: Could not download LAS:${las_dist_file}" && popd >/dev/null && checked_done 1
fi
popd >& /dev/null
return 0
}
If I allow the script to run from scratch in a pristine environment, when this section is reached it will spit out the following error and die:
Don't see LAS tar file las-esg-v7.3.9.tar.gz
Downloading LAS from las-esg-v7.3.9.tar.gz -to-> /usr/local/src/esgf/workbench/esg/ferret/7.3.9/las-esg-v7.3.9.tar.gz
wget -O 'las-esg-v7.3.9.tar.gz' 'ftp://ftp.pmel.noaa.gov/pub/las/las-esg-v7.3.9.tar.gz'
/usr/local/bin/esg-product-server: line 428: /usr/bin/wget: Argument list too long
ERROR: Could not download LAS:las-esg-v7.3.9.tar.gz
Note that I even have a debug echo in there to prove that the arguments are only two small strings.
If I let the program error out at the point above and then immediately re-run it from the same expect script, with the only change being that it has already completed all the stages prior to this one and is detecting that and skipping them, this section will execute normally with no error. This behavior is 100% reproducible on my test box -- if I purge all traces that running the code leaves, the first run thereafter bombs out at this point, and subsequent runs will be fine.
The only thing I can think of is that I've run into some obscure bug in Bash itself that is somehow causing it to leak MAX_ARG_PAGES memory invisibly, but I can't think of even a theoretical way to make that happen, so I'm asking here.
What the heck is going on and how do I make it stop (without extreme measures like recompiling the kernel to just throw more memory at it)?
Update: To answer a question in the comments, line 428 is
wget -O "${las_dist_file}" "${las_tar_url}"

The error E2BIG refers to the sum of the bytes in the environment and the argv list. Has the script exported a huge number (or huge size) of variables? Run printenv right before the wget to see what's going on.
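For example, a quick way to gauge how large the environment has grown (a rough sketch; the exact per-exec limit depends on the kernel, but a multi-megabyte environment is a red flag) is to drop something like this in right before the wget:
# total bytes in the exported environment
echo "environment size: $(printenv | wc -c) bytes"
# the ten largest exported variables, by length of NAME=value
printenv | awk -F= '{ print length($0), $1 }' | sort -rn | head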

Standard error file when there is no error

I'm new to Linux and shell scripting, and I'm struggling to check whether a compilation succeeded.
g++ code.cpp -o code.o 2>error.txt
if [ ! -e error.txt ]
then
do something
else
echo "Failed to compile"
I guess an error file is created even if the compilation is successful. What is the content of the error file when there is no error? I need to change the if condition to check if the compilation is successful.
It's just the order of things. What happens when the shell parses the string g++ code.cpp -o code.o 2>error.txt is:
1. The shell creates error.txt, truncating the file if that name already exists.
2. g++ is called with its error output redirected to the new file.
If g++ does not write any data, then the file remains as it was (empty) at the end of step 1.
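A quick way to see this for yourself (hello.cpp here is just a stand-in for any source file that compiles cleanly):
g++ hello.cpp -o hello 2>error.txt
ls -l error.txt    # the file exists...
wc -c error.txt    # ...but it contains 0 bytes, because g++ wrote nothing to stderr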
You probably aren't so much interested in the error file as you are in the return value. You probably ought to just do:
if g++ code.cpp -o code.o; then : do something; fi
or even just:
g++ code.cpp -o code.o && : do something
but if really want to do something else with the errors, you can do:
if g++ code.cpp -o code.o 2> error.txt; then
rm error.txt
: do something
else
echo >&2 Failed to compile code.cpp.\ See "$(pwd)"/error.txt for details.
fi
Make sure you escape at least one of the spaces after the . so that you get 2 spaces after the period (or just quote the whole argument to echo). Although it's become fashionable lately to claim that you only need one space, all of those arguments rely on the use of variable width fonts and any command line tool worth using will be used most often in an environment where fixed width fonts are still dominant. This last point is totally unrelated to your question, but is worth remembering.

How to suppress irrelevant ShellCheck messages?

Environment
System: Linux Mint 19 (based on Ubuntu 18.04).
Editor: I use Visual Studio Code with the ShellCheck plugin to check for errors, warnings, and hints on the fly.
ShellCheck is a necessary tool for every shell script writer.
Although the developers must have put in enormous effort to make it as good as it gets, it sometimes produces irrelevant warnings and/or information.
Example code, with such messages (warning SC2120 + directly adjacent information SC2119):
Example shell script snippet
am_i_root ()
# expected arguments: none
{
# check if no argument has been passed
[ "$#" -eq 0 ] || print_error_and_exit "am_i_root" "Some arguments have been passed to the function! No arguments expected. Passed: $*"
# check if the user is root
# this will return an exit code of the command itself directly
[ "$(id -u)" -eq 0 ]
}
# check if the user had by any chance run the script with root privileges and if so, quit
am_i_root && print_error_and_exit "am_i_root" "This script should not be run as root! Quitting to shell."
Where:
am_i_root is checking for unwanted arguments passed. Its real purpose is self-explanatory.
print_error_and_exit does what its name says; it is more or less self-explanatory.
If any argument has been passed, I want the function / script to print error message and exit.
Question
How do I disable these messages (locally only)?
Think it through before doing this!
Do this only if you are 100% positive that the message(s) really are irrelevant. Then read the relevant ShellCheck wiki pages on this topic.
Once you have assured yourself the message(s) are irrelevant
Generally speaking there are several ways to achieve this goal, but since the aim is to disable those messages locally, in reality only one applies.
That is to add the following line directly above the line that triggers the message:
# shellcheck disable=code
Notably, adding any text after that on the same line will result in an error, as ShellCheck will try to interpret it as part of the directive.
If you want to add an explanation as to why you are suppressing the warning, you can add another hash # to prevent shellcheck from interpreting the rest of the line.
Incorrect:
# shellcheck disable=code irrelevant because reasons
Correct:
# shellcheck disable=code # code is irrelevant because reasons
Note that it is possible to add multiple codes separated by commas, as in this example:
# shellcheck disable=SC2119,SC2120
Note that the leading # is an integral part of the disabling directive!
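Applied to the snippet from the question, the suppression might look like this (a sketch using the SC2120/SC2119 codes mentioned above):
# shellcheck disable=SC2120  # am_i_root is intentionally never called with arguments
am_i_root ()
{
[ "$#" -eq 0 ] || print_error_and_exit "am_i_root" "Some arguments have been passed to the function! No arguments expected. Passed: $*"
[ "$(id -u)" -eq 0 ]
}
# shellcheck disable=SC2119  # the call without arguments is deliberate
am_i_root && print_error_and_exit "am_i_root" "This script should not be run as root! Quitting to shell."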
With Shellcheck 0.7.1 and later, you can suppress irrelevant messages on the command line by filtering on severity (valid options are: error, warning, info, style):
$ shellcheck --severity=error my_script.sh
This will only show errors and will suppress the annoying SC2034, SC2086, etc. warnings and style recommendations.
You can also suppress messages per-code with a directive in your ~/.shellcheckrc file, such as:
disable=SC2076,SC2016
Both of these options allow you to filter messages globally, rather than having to edit each source code file with the same directives.
If your distro does not have the latest version, you can upgrade with something like:
scversion="stable" # or "0.7.1" or "latest"
wget -qO- "https://github.com/koalaman/shellcheck/releases/download/${scversion?}/shellcheck-${scversion?}.linux.x86_64.tar.xz" | tar -xJv
sudo cp "shellcheck-${scversion}/shellcheck" /usr/bin/
shellcheck --version

How do I rerun a bash script skipping over lines which have previously run successfully?

I have a bash script which acts as a wrapper for an analysis pipeline. If the script errors out, I want to be able to resume from the point at which the error occurred simply by re-running the original command.
I have set two different traps: one removes the last file being generated on a non-zero exit from my script; the other removes all the temporary files on exit signal 0 and essentially cleans up the file system at the end of the run. I turned on noclobber in the bash environment, which allows my script to skip over lines where files have already been written, but only if I do not set the non-zero exit trap. As soon as I set that trap, the script exits at the first line where noclobber identifies a file it will not overwrite.
Is there a way for me to skip over lines of code that have run successfully before, rather than having to re-run my code from the start? I know I could use conditional statements for each line, but I thought there might be a neater way of doing this.
set -o noclobber
# Function to clean up temporary folders when script exits at the end
rmfile() { rm -r "$1"; }
# Function to remove the file being currently generated
# Function executed if script errors out
rmlast() {
if [ ! -z "$CURRENTFILE" ]
then
rm -r "$1"
exit 1
fi }
# Trap to remove the currently generated file
trap 'rmlast "$CURRENTFILE"' ERR SIGINT
#Make temporary directory if it has not been created in a previous run
TEMPDIR=$(find . -name "tmp*")
if [ -z "$TEMPDIR" ]
then
TEMPDIR=$(mktemp -d /test/tmpXXX)
fi
# Set CURRENTFILE variable
CURRENTFILE="${TEMPDIR}/Variants.vcf"
# Run the first analysis step, writing to CURRENTFILE
complexanalysis_tool input_file > $CURRENTFILE
# Set CURRENTFILE variable
CURRENTFILE="${TEMPDIR}/Filtered.vcf"
complexanalysis_tool2 input_file2 > $CURRENTFILE
CURRENTFILE="${TEMPDIR}/Filtered_2.vcf"
complexanalysis_tool3 input_file3 > $CURRENTFILE
# Move files to final destination folder
mv -nv $TEMPDIR/*.vcf /test/newdest/
# Trap to remove temporary folders when script finishes running
trap 'rmfile "$TEMPDIR"' 0
Update:
I have been offered answers suggesting the use of the make utility. I want to make use of its built-in dependency checking to see whether a step's output has already been produced.
In my hands the makefile suggested by VK Kashyap does not seem to skip execution of previously accomplished tasks. For example, I ran the script above and interrupted it with Ctrl-C while it was generating filtered.vcf. When I re-run the script, it starts from the beginning again, i.e. at variants.vcf. Am I missing something in order to get the makefile to recognise the earlier targets as already fulfilled?
Answer to update:
OK, this was a rookie mistake, but since I am not familiar with writing makefiles I will post this explanation of my error. The reason my makefile was not re-running from the exit point was that I had given the targets different names from the output files being generated. So, as VK Kashyap quite correctly answered, if you name the targets, e.g.
variants.vcf
filtered.vcf
filtered2.vcf
the same as the output files being generated, then make will skip the previously accomplished tasks.
The make utility might be an answer for what you want to achieve.
It has built-in dependency checking (the thing you are trying to achieve with the tmp files).
# run the all target when all of the files are available
all: variants.vcf filtered.vcf filtered2.vcf
	mv -nv $(TEMPDIR)/*.vcf /test/newdest/
variants.vcf:
	complexanalysis_tool input_file > variants.vcf
filtered.vcf:
	complexanalysis_tool2 input_file2 > filtered.vcf
filtered2.vcf:
	complexanalysis_tool3 input_file3 > filtered2.vcf
You may use a bash script to invoke this makefile as:
#!/bin/bash
export TEMPDIR=xyz
make -C $TEMPDIR all
The make utility will check for already-accomplished tasks itself and skip execution of the finished ones. It will continue from the point where the error interrupted the run.
You can find more details about the exact makefile syntax on the internet.
There is no built-in way to do that.
However, you could brew something like that by keeping track of the last successful line and building your own goto statement, as described here and in Is there a "goto" statement in bash? (just replace the 'labels' with actual line numbers).
However, the question is whether this is really a smart idea.
A better way is to only run the commands whose output is still missing, rather than tracking which commands have not yet been executed.
This could be done either with explicit conditionals in your bash script:
produce_if_missing() {
# check whether the first argument (the output file) already exists;
# if not, run the remaining arguments as a command and redirect its output into that file
local curfile=$1
shift
if [ ! -e "${curfile}" ]; then
"$@" > "${curfile}"
fi
}
produce_if_missing Variants.vcf complexanalysis_tool input_file
produce_if_missing Filtered.vcf complexanalysis_tool2 input_file2
or by using tools that are made for such things (see VK Kashyap's answer using make, though I prefer using the automatic variables in the make rules to minimize typos):
Variants.vcf: input_file
	complexanalysis_tool $^ > $@
Filtered.vcf: input_file2
	complexanalysis_tool2 $^ > $@
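(Here $^ expands to the rule's prerequisites and $@ to its target, so each tool reads its input file and writes the correspondingly named .vcf.)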

Incorrect exit status of wrapper script in KSH

We have a wrapper script for the Teradata TPT utility. The wrapper script is pretty straightforward, but the problem is that the exit status of the wrapper is not the same as that of the utility. In many cases, the script returns 0 even if the utility fails. I have saved the exit status in a separate variable because some steps need to be done before exiting, but exiting with this variable's value doesn't seem to work. Or is the utility returning status 0 for some failures even though the logs clearly report a different status?
The worst part is that this behavior is quite random; sometimes the script does fail with the exit status of the utility. I want to be sure whether there is some problem with the utility's exit status.
The script runs through KSH. The final part of the wrapper script is:
tbuild -f $sql.tmp -j ${id}_$JOB >$out 2>&1
ret_code=$?
cd ${TWB_ROOT}/logs
logpath=`ls -t ${TWB_ROOT}/logs/${id}_${JOB}*.out |head -1`
logpath1=${logpath##*/}
logname=${logpath1%-*}
tlogview -l ${logpath} > /edw/$GROUP/tnl/jobs/$JOB/logs/tpt_logs/${logname}.log
### Maintaining 3 TPT binary log files
if [ $ret_code -eq 0 ]
then
binout=$TPTLOGDIR/${logname}.dat
binout1=$TPTLOGDIR/${logname}.dat1
binout2=$TPTLOGDIR/${logname}.dat2
[ -f $binout1 ] && mv $binout1 $binout2
[ -f $binout ] && mv $binout $binout1
mv "$logpath" "/edw/${GROUP}/tnl/jobs/$JOB/logs/tpt_logs/${logname}.dat"
fi
rm -f $sql.tmp
echo ".exit"
exit $ret_code
Thanks in advance for the help and suggestions.
The script looks ok, and should indeed return the same exit code as the tbuild utility.
It comes down to knowledge of the specific product.
I've never worked with any of these products, but Teradata has an ample User Guide for the Parallel Transporter, with an explicit Post-Job Considerations section, warning:
Even if the job completed successfully, action may still be required based on error and warning information in the job logs and error tables.
So technically, a job might complete, but results may vary from time to time.
I guess you have to define your own policies and scan the logfiles for patterns of warnings and error messages, and then generate your own exit codes for semantic failures. Tools like logstash or splunk might come in handy.
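As a rough sketch of such a policy (the grep patterns are hypothetical placeholders; adjust them to the messages your TPT jobs actually emit), you could downgrade a "successful" run after scanning the extracted log:
# override a zero exit status if the job log contains error patterns
if [ $ret_code -eq 0 ] && grep -qiE 'error|fatal' /edw/$GROUP/tnl/jobs/$JOB/logs/tpt_logs/${logname}.log
then
ret_code=2
fi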
BTW, you might consider using logrotate for rotating the $TPTLOGDIR/${logname}.dat files.
It turns out that the issue was in the utility itself, as suspected. The shell script worked fine.

Unable to run BASH script in current environment multiple times

I have a bash script that I use to move between source and bin directories from wherever I currently am (I call this script 'teleport'). Since it is basically just a glorified 'cd' command, I have to run it in the current shell (i.e. . ./teleport.sh). I've set up an alias in my .bashrc file so that 'teleport' expands to '. teleport.sh'.
The first time I run it, it works fine. But then, if I run it again after it has run once, it doesn't do anything. It works again if I close my terminal and then open a new one, but only the first time. My intuition is that there is something internally going on with BASH that I'm not familiar with, so I thought I would run it through the gurus here to see if I can get an answer.
The script is:
numargs=$#
function printUsage
{
echo -e "Usage: $0 [-o | -s] <PROJECT>\n"
echo -e "\tMagically teleports you into the main source directory of a project.\n"
echo -e "\t PROJECT: The current project you wish to teleport into."
echo -e "\t -o: Teleport into the objdir.\n"
echo -e "\t -s: Teleport into the source dir.\n"
}
if [ $numargs -lt 2 ]
then
printUsage
fi
function teleportToObj
{
OBJDIR=${HOME}/Source/${PROJECT}/obj
cd ${OBJDIR}
}
function teleportToSrc
{
cd ${HOME}/Source/${PROJECT}/src
}
while getopts "o:s:" opt
do
case $opt in
o)
PROJECT=$OPTARG
teleportToObj
;;
s)
PROJECT=$OPTARG
teleportToSrc
;;
esac
done
My usage of it is something like:
sjohnson@corellia:~$ cd /usr/local/src
sjohnson@corellia:/usr/local/src$ . ./teleport -s some-proj
sjohnson@corellia:~/Source/some-proj/src$ teleport -o some-proj
sjohnson@corellia:~/Source/some-proj/src$
<... START NEW TERMINAL ...>
sjohnson@corellia:~$ . ./teleport -o some-proj
sjohnson@corellia:~/Source/some-proj/obj$
The problem is that getopts necessarily keeps a little bit of state so that it can be called in a loop, and you're not clearing that state. Each time it's called, it processes one more argument, and it increments the shell's OPTIND variable so it'll know which argument to process the next time it's called. When it's done with all the arguments, it returns 1 (false) every time it's invoked, which makes the while exit.
The first time you source your script, it works as expected. The second (and third, fourth...) time, getopts does nothing but return false.
Add one line to reset the state before you start looping:
unset OPTIND # clear state so getopts will start over
while getopts "o:s:" opt
do
# ...
done
(I assume there's a typo in your transcript, since it shows you invoking the script -- not sourcing it -- on the second try, but that's not the real problem here.)
The problem is that the first time you call it, you are sourcing the script (that's what ". ./teleport" does), which runs the script in the current shell and therefore preserves the cd. The second time you call it, it isn't sourced, so you create a subshell, cd to the appropriate directory inside it, and then exit the subshell, putting you right back where you called the script from!
The way to make this work is simply to make teleportToSrc and teleportToObj aliases or functions defined in the current shell (i.e. outside a script), as in the sketch below.
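A minimal sketch of the function approach (put these in ~/.bashrc; the directory layout is taken from the question):
# change into a project's source or object directory without leaving the current shell
teleportToSrc() { cd "${HOME}/Source/${1}/src" || return; }
teleportToObj() { cd "${HOME}/Source/${1}/obj" || return; }
They can then be called directly, e.g. teleportToSrc some-proj, with no sourcing required.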
