Bash debug calling stack - linux

I'm using a generic procedure to trap and describe errors and abnormal situations, instead of the usual '2>...' error construct.
The idea is to have a procedure like this simplified version:
function debug(){
    echo "Fatal Error: $PWD:$BASH_SOURCE:$LINENO $*"
    ....
    exit 1
}
and then use it as in this example:
[ -z "$PARAMETER" ] && debug The parameter was not provided
The issues are:
BASH_SOURCE is the source file currently running. The idea is to show the calling source, since the procedure 'debug' is global.
LINENO is the line where the expansion is executed, not the calling address.
Note: BASH_SOURCE[0] and BASH_SOURCE[1] provide 'some help' when the procedure is sourced.
This will be used to report 'user' errors through a centralized error-message procedure. That may include posting to syslog and to other log files. However, some messages may look alike, and knowing where in the source code the error was detected helps the developers.
Is there any method to obtain a calling stack in bash?

You can use the bash built-in command caller for this.
Link --> http://wiki.bash-hackers.org/commands/builtin/caller
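For example, here is a sketch of the debug function from the question reworked around caller (the message format and exit handling are just illustrative):
function debug(){
    echo "Fatal Error: $PWD: $*" >&2
    # caller N prints "line function file" for stack frame N and
    # returns non-zero once N is past the top of the call stack
    local frame=0
    while caller "$frame" >&2; do
        frame=$((frame + 1))
    done
    exit 1
}
[ -z "$PARAMETER" ] && debug The parameter was not provided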

If you add the following line to the script, bash will show the execution flow of the script:
set -x
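If you also want the source file and line number in that trace, a common companion trick (a sketch, adjust the format to taste) is to set PS4 before enabling tracing:
PS4='+ ${BASH_SOURCE}:${LINENO}: '
set -x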


How to capture a Linux command's log into a file?

Let's say I have the below command.
STATE_NOT_C_COUNT=`mongo --host "${DB_HOST}" --port 27017 "${MONGO_DATABASE}" --eval "db.$MONGO_DATABASE.count({\"state\" : {"'"$ne"'":\"C\"},\"physicalTableName\":\"table_name\"},{nolock:true})" | tail -1`
When I run the above command, I get an exception like:
exception: connect failed
I want to capture this exception into a file via the error function.
error(){
    if [ "$?" -ne "0" ]; then
        echo "$1" 2>&1 error_log
        exit 1
    fi
}
I'm using the above function like this:
error $STATE_NOT_C_COUNT
But I'm not able to capture the exception into the file through the function.
What you are doing is terrible. Let the program that fails print its error messages to stderr, and ensure that stderr points to the right place. However, the major issue you are having is just a lack of quotes. Try:
error "$STATE_NOT_C_COUNT"
The issue is that the command error $STATE_NOT_C_COUNT is subject to field splitting, so if $STATE_NOT_C_COUNT contains any whitespace it is split into separate arguments, and you are only writing the first one. Another alternative is to write echo "$*" in the function, but this squashes whitespace. However, it cannot be stressed enough that this is a terrible approach, completely against the Unix philosophy. The program should write its errors to stderr, and you should let them go there; just make sure stderr points where you want it. The only real reason to capture stderr is if you want to write it to multiple locations, so you might pipe it to tee, to a syslogger, or to some other message bus, but even doing that is questionable.
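For instance, a sketch of that stderr-first approach (error_log is just an example path, and some_command is a placeholder for the mongo pipeline from the question):
#!/bin/bash
# Append this script's stderr to error_log while still showing it on the
# terminal; use plain `exec 2>>error_log` if you only want the file.
exec 2> >(tee -a error_log >&2)

# The command's own "exception: connect failed" now lands in error_log;
# the script only reacts to the exit status.
if ! STATE_NOT_C_COUNT=$(some_command); then
    exit 1
fi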

How to suppress irrelevant ShellCheck messages?

Environment
System: Linux Mint 19 (based on Ubuntu 18.04).
Editor: I use Visual Studio Code with the ShellCheck plugin to check for errors, warnings, and hints on the fly.
ShellCheck is a necessary tool for every shell-script writer. Although the developers have clearly put enormous effort into making it as good as it gets, it sometimes produces irrelevant warnings and/or informational messages.
Example code, with such messages (warning SC2120 + directly adjacent information SC2119):
Example shell script snippet
am_i_root ()
# expected arguments: none
{
    # check if no argument has been passed
    [ "$#" -eq 0 ] || print_error_and_exit "am_i_root" "Some arguments have been passed to the function! No arguments expected. Passed: $*"

    # check if the user is root
    # this will return an exit code of the command itself directly
    [ "$(id -u)" -eq 0 ]
}
# check if the user had by any chance run the script with root privileges and if so, quit
am_i_root && print_error_and_exit "am_i_root" "This script should not be run as root! Quitting to shell."
Where:
am_i_root checks for unwanted arguments being passed; its real purpose is self-explanatory.
print_error_and_exit does what its name says; it is more or less self-explanatory.
If any argument has been passed, I want the function / script to print an error message and exit.
Question
How do I disable these messages (locally only)?
Think it through before doing this!
Do this only if you are 100.0% positive that the message(s) really is irrelevant. Then read the ShellCheck Wiki on this topic.
Once you assured yourself the message(s) is irrelevant
Generally speaking there are more ways to achieve this goal, but since the aim is to disable these messages locally, only one really applies.
That is adding the following line before the actual occurrence of the message:
# shellcheck disable=code
Note that adding free text after the directive on the same line will result in an error, as it too will be interpreted by shellcheck.
If you want to add an explanation as to why you are suppressing the warning, you can add another hash # to prevent shellcheck from interpreting the rest of the line.
Incorrect:
# shellcheck disable=code irrelevant because reasons
Correct:
# shellcheck disable=code # code is irrelevant because reasons
Note that it is possible to add multiple codes separated by commas, as in this example:
# shellcheck disable=SC2119,SC2120
Note that the # in front is an integral part of the disabling directive!
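Applied to the snippet from the question, the directives might look like this (which codes you disable, and why, is of course your call):
# shellcheck disable=SC2120  # am_i_root takes no arguments by design
am_i_root ()
{
    [ "$#" -eq 0 ] || print_error_and_exit "am_i_root" "Some arguments have been passed to the function! No arguments expected. Passed: $*"
    [ "$(id -u)" -eq 0 ]
}

# shellcheck disable=SC2119  # calling it without arguments is intentional
am_i_root && print_error_and_exit "am_i_root" "This script should not be run as root! Quitting to shell."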
With Shellcheck 0.7.1 and later, you can suppress irrelevant messages on the command line by filtering on severity (valid options are: error, warning, info, style):
$ shellcheck --severity=error my_script.sh
This will only show errors and will suppress the annoying SC2034, SC2086, etc. warnings and style recommendations.
You can also suppress messages per-code with a directive in your ~/.shellcheckrc file, such as:
disable=SC2076,SC2016
Both of these options allow you to filter messages globally, rather than having to edit each source code file with the same directives.
If your distro does not have the latest version, you can upgrade with something like:
scversion="stable" # or "0.7.1" or "latest"
wget -qO- "https://github.com/koalaman/shellcheck/releases/download/${scversion?}/shellcheck-${scversion?}.linux.x86_64.tar.xz" | tar -xJv
sudo cp "shellcheck-${scversion}/shellcheck" /usr/bin/
shellcheck --version

Bash does not print any error msg upon non-existing commands starting with dot

This is really just out of curiosity.
A typo made me notice that in Bash, the following:
$ .anything
does not print any error ("anything" not to be interpreted literally, it can really be anything, and no space after the dot).
I am curious about how this is interpreted in bash.
Note that echo $? after such command returns 127. This usually means "command not found". It does make sense in this case, however I find it odd that no error message is printed.
Why would $ anything actually print bash: anything: command not found... (assuming that no anything command is in the PATH), while $ .anything slips through silently?
System: Fedora Core 22
Bash version: GNU bash, version 4.3.39(1)-release (x86_64-redhat-linux-gnu)
EDIT:
Some comments below initially indicated the problem as non-reproducible.
The answer by @hek2mgl below summarises the many contributions to this issue, which was eventually found (by @n.m.) to be reproducible on FC22 and submitted as a bug report at https://bugzilla.redhat.com/show_bug.cgi?id=1292531
bash supports a handler for situations when a command can't be found. You can define the following function:
function command_not_found_handle() {
    command=$1
    # do something
}
Using that function it is possible to suppress the error message. Search for that function in your bash startup files.
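For illustration only, a hypothetical handler like the following would produce exactly the symptom described here: silence for names starting with a dot, the usual message otherwise.
command_not_found_handle() {
    # $1 is the missing command name, the rest are its arguments
    case $1 in
        .*) return 127 ;;                                   # swallow the message
        *)  printf 'bash: %s: command not found\n' "$1" >&2
            return 127 ;;
    esac
}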
Another way to find that out is to unset the function, like this:
$ unset -f command_not_found_handle
$ .anything # Should display the error message
After some research, @n.m. found out that the described behaviour is intentional. FC22 implements command_not_found_handle and calls the program /etc/libexec/pk-command-not-found. This program is part of the PackageKit project and tries to suggest installable packages when you type a command name that can't be found.
In its main() function the program explicitly checks whether the command name starts with a dot and silently returns in that case. This behaviour was introduced in this commit:
https://github.com/hughsie/PackageKit/commit/0e85001b
as a response to this bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=1151185
IMHO this behaviour is questionable; at least, other distros are not doing this. But now you know that the behaviour is 100% reproducible, and you may follow up on that bug report.

Calling a script with x3270 -script

I have an old script which is used to scrape information from an IBM server via x3270. However, I can't get it to work correctly. This is how I'm calling it:
/usr/X11R6/bin/x3270 -script -model 3279-2 -geom +110+160 -efont 3270-20 'Script( "/usr/X11R6/lib/X11/x3270/qmon_script.sh" )'
I get an x3270 window and the following error message: Hostname syntax error: Multiple port names
The script I'm calling handles all the connection details, but x3270 appears to be confused, thinking that 'Script( "/usr/X11R6/lib/X11/x3270/qmon_script.sh" )' is the hostname (which is obviously not correct).
I've been unable to find any good examples on how to call a script through x3270 like this. Any ideas?
According to the documentation for x3270:
-script
Causes x3270 to read commands from standard input, with the results written to standard
output. The protocol for these commands is documented in x3270-script(1).
So it doesn't allow giving the script itself on the command line. Instead you're supposed to supply the script through standard input. You probably want either:
echo 'Script( "/usr/X11R6/lib/X11/x3270/qmon_script.sh" )' | /usr/X11R6/bin/x3270 -script -model 3279-2 -geom +110+160 -efont 3270-20
Or maybe:
/usr/X11R6/bin/x3270 -script -model 3279-2 -geom +110+160 -efont 3270-20 < /usr/X11R6/lib/X11/x3270/qmon_script.sh

sh: command not found

In my Fortran program I want to call the system to run my code (via the alias asv20r3). To do this I use:
call system ("asv20r3 " //filename)
But I obtain the following message:
sh: asv20r3: command not found
Is it necessary to define something more in order to make the system understand that I want to execute the code via the alias asv20r3?
Thanks!
The following Fortran 90 program contains the answer to your question:
program Test
    print*, 'Printing environment variables : '
    call system("set")
    print*, 'Printing environment aliases : '
    call system("alias")
end program Test
The output of the program speaks for itself: environment variables are inherited; aliases are not.
You can either rely on the content of an environment variable (using call get_environment_variable(...)) or hardcode the path and/or command, as someone else suggested.
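You can see the same effect from an interactive shell (the alias target below is a made-up path):
$ alias asv20r3='/home/user/bin/asv20r3'   # usable in this interactive shell
$ sh -c 'asv20r3'                          # a child shell, like the one system() spawns
sh: asv20r3: command not found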
You probably need to supply the path to asv20r3 in the system call.
Your program, asv20r3, needs to be in the $PATH of your system.
If it is in the same directory from which you run your main program, then you must tell your system that by prepending "./" to it, such as "./asv20r3".
