How to execute Rust code directly on Unix systems? (using the shebang)

How to execute Rust code directly on Unix systems? (using the shebang) - rust

From reading this thread, it looks like its possible to use the shebang to run Rust *.
#!/usr/bin/env rustc
fn main() {
println!("Hello World!");
}
Making this executable and running does compile, but not run the code.
chmod +x hello_world.rs
./hello_world.rs
However this only compiles the code into hello_world.
Can *.rs files be executed directly, similar to a shell script?
* This references rustx, I looked into this, but its a bash script which compiles the script every time (without caching) and never removes the file from the temp directory, although this could be improved. Also it has the significant limitation that it can't use crates.

There's cargo-script. That also lets you use dependencies.
After installing cargo-script via cargo install cargo-script, you can create your script file (hello.rs) like this:
#!/usr/bin/env run-cargo-script
fn main() {
println!("Hello World!");
}
To execute it, you need to:
$ chmod +x hello.rs
$ ./hello.rs
Compiling hello v0.1.0 (file://~/.cargo/.cargo/script-cache/file-hello-d746fc676c0590b)
Finished release [optimized] target(s) in 0.80 secs
Hello World!
To use crates from crates.io, please see the tutorial in the README linked above.

This seems to work:
#!/bin/sh
//usr/bin/env rustc $0 -o a.out && ./a.out && rm ./a.out ; exit
fn main() {
println!("Hello World!");
}

I have written a tool just for that: Scriptisto. It is a fully language agnostic tool and it works with other compiled languages or languages that have expensive validation steps (Python with mypy).
For Rust it can also fetch dependencies behind the scenes or build entirely in Docker without having a Rust compiler installed. scriptisto embeds those templates into the binary so you can bootstrap easily:
$ scriptisto new rust > ./script.rs
$ chmod +x ./script.rs
$ ./script.rs
Instead of new rust you can do new docker-rust and the build will not require Rust compiler on your host system.

#!/bin/sh
#![allow()] /*
exec cargo-play --cached --release $0 -- "$#"
*/
Needs cargo-play. You can see a solution that doesn't need anything here:
#!/bin/sh
#![allow()] /*
# rust self-compiler by Mahmoud Al-Qudsi, Copyright NeoSmart Technologies 2020
# See <https://neosmart.net/blog/self-compiling-rust-code/> for info & updates.
#
# This code is freely released to the public domain. In case a public domain
# license is insufficient for your legal department, this code is also licensed
# under the MIT license.
# Get an output path that is derived from the complete path to this self script.
# - `realpath` makes sure if you have two separate `script.rs` files in two
# different directories, they get mapped to different binaries.
# - `which` makes that work even if you store this script in $PATH and execute
# it by its filename alone.
# - `cut` is used to print only the hash and not the filename, which `md5sum`
# always includes in its output.
OUT=/tmp/$(printf "%s" $(realpath $(which "$0")) | md5sum | cut -d' ' -f1)
# Calculate hash of the current contents of the script, so we can avoid
# recompiling if it hasn't changed.
MD5=$(md5sum "$0" | cut -d' ' -f1)
# Check if we have a previously compiled output for this exact source code.
if !(test -f "${OUT}.md5" && test "${MD5}" = "$(cat ${OUT}.md5)"); then
# The script has been modified or is otherwise not cached.
# Check if the script already contains an `fn main()` entry point.
if grep -Eq '^\s*(\[.*?\])*\s*fn\s*main\b*' "$0"; then
# Compile the input script as-is to the previously determined location.
rustc "$0" -o ${OUT}
# Save rustc's exit code so we can compare against it later.
RUSTC_STATUS=$?
else
# The script does not contain an `fn main()` entry point, so add one.
# We don't use `printf 'fn main() { %s }' because the shebang must
# come at the beginning of the line, and we don't use `tail` to skip
# it because that would result in incorrect line numbers in any errors
# reported by rustc, instead we just comment out the shebang but leave
# it on the same line as `fn main() {`.
printf "fn main() {//%s\n}" "$(cat $0)" | rustc - -o ${OUT}
# Save rustc's exit code so we can compare against it later.
RUSTC_STATUS=$?
fi
# Check if we compiled the script OK, or exit bubbling up the return code.
if test "${RUSTC_STATUS}" -ne 0; then
exit ${RUSTC_STATUS}
fi
# Save the MD5 of the current version of the script so we can compare
# against it next time.
printf "%s" ${MD5} > ${OUT}.md5
fi
# Execute the compiled output. This also ends execution of the shell script,
# as it actually replaces its process with ours; see exec(3) for more on this.
exec ${OUT} $#
# At this point, it's OK to write raw rust code as the shell interpreter
# never gets this far. But we're actually still in the rust comment we opened
# on line 2, so close that: */

Related

Is there a way to know how the user invoked a program from bash?

Here's the problem: I have this script foo.py, and if the user invokes it without the --bar option, I'd like to display the following error message:
Please add the --bar option to your command, like so:
python foo.py --bar
Now, the tricky part is that there are several ways the user might have invoked the command:
They may have used python foo.py like in the example
They may have used /usr/bin/foo.py
They may have a shell alias frob='python foo.py', and actually ran frob
Maybe it's even a git alias flab=!/usr/bin/foo.py, and they used git flab
In every case, I'd like the message to reflect how the user invoked the command, so that the example I'm providing would make sense.
sys.argv always contains foo.py, and /proc/$$/cmdline doesn't know about aliases. It seems to me that the only possible source for this information would be bash itself, but I don't know how to ask it.
Any ideas?
UPDATE How about if we limit possible scenarios to only those listed above?
UPDATE 2: Plenty of people wrote very good explanation about why this is not possible in the general case, so I would like to limit my question to this:
Under the following assumptions:
The script was started interactively, from bash
The script was start in one of these 3 ways:
foo <args> where foo is a symbolic link /usr/bin/foo -> foo.py
git foo where alias.foo=!/usr/bin/foo in ~/.gitconfig
git baz where alias.baz=!/usr/bin/foo in ~/.gitconfig
Is there a way to distinguish between 1 and (2,3) from within the script? Is there a way to distinguish between 2 and 3 from within the script?
I know this is a long shot, so I'm accepting Charles Duffy's answer for now.
UPDATE 3: So far, the most promising angle was suggested by Charles Duffy in the comments below. If I can get my users to have
trap 'export LAST_BASH_COMMAND=$(history 1)' DEBUG
in their .bashrc, then I can use something like this in my code:
like_so = None
cmd = os.environ['LAST_BASH_COMMAND']
if cmd is not None:
cmd = cmd[8:] # Remove the history counter
if cmd.startswith("foo "):
like_so = "foo --bar " + cmd[4:]
elif cmd.startswith(r"git foo "):
like_so = "git foo --bar " + cmd[8:]
elif cmd.startswith(r"git baz "):
like_so = "git baz --bar " + cmd[8:]
if like_so is not None:
print("Please add the --bar option to your command, like so:")
print(" " + like_so)
else:
print("Please add the --bar option to your command.")
This way, I show the general message if I don't manage to get their invocation method. Of course, if I'm going to rely on changing my users' environment I might as well ensure that the various aliases export their own environment variables that I can look at, but at least this way allows me to use the same technique for any other script I might add later.

No, there is no way to see the original text (before aliases/functions/etc).
Starting a program in UNIX is done as follows at the underlying syscall level:
int execve(const char *path, char *const argv[], char *const envp[]);
Notably, there are three arguments:
The path to the executable
An argv array (the first item of which -- argv[0] or $0 -- is passed to that executable to reflect the name under which it was started)
A list of environment variables
Nowhere in here is there a string that provides the original user-entered shell command from which the new process's invocation was requested. This is particularly true since not all programs are started from a shell at all; consider the case where your program is started from another Python script with shell=False.
It's completely conventional on UNIX to assume that your program was started through whatever name is given in argv[0]; this works for symlinks.
You can even see standard UNIX tools doing this:
$ ls '*.txt' # sample command to generate an error message; note "ls:" at the front
ls: *.txt: No such file or directory
$ (exec -a foobar ls '*.txt') # again, but tell it that its name is "foobar"
foobar: *.txt: No such file or directory
$ alias somesuch=ls # this **doesn't** happen with an alias
$ somesuch '*.txt' # ...the program still sees its real name, not the alias!
ls: *.txt: No such file
If you do want to generate a UNIX command line, use pipes.quote() (Python 2) or shlex.quote() (Python 3) to do it safely.
try:
from pipes import quote # Python 2
except ImportError:
from shlex import quote # Python 3
cmd = ' '.join(quote(s) for s in open('/proc/self/cmdline', 'r').read().split('\0')[:-1])
print("We were called as: {}".format(cmd))
Again, this won't "un-expand" aliases, revert to the code that was invoked to call a function that invoked your command, etc; there is no un-ringing that bell.
That can be used to look for a git instance in your parent process tree, and discover its argument list:
def find_cmdline(pid):
return open('/proc/%d/cmdline' % (pid,), 'r').read().split('\0')[:-1]
def find_ppid(pid):
stat_data = open('/proc/%d/stat' % (pid,), 'r').read()
stat_data_sanitized = re.sub('[(]([^)]+)[)]', '_', stat_data)
return int(stat_data_sanitized.split(' ')[3])
def all_parent_cmdlines(pid):
while pid > 0:
yield find_cmdline(pid)
pid = find_ppid(pid)
def find_git_parent(pid):
for cmdline in all_parent_cmdlines(pid):
if cmdline[0] == 'git':
return ' '.join(quote(s) for s in cmdline)
return None

See the Note at the bottom regarding the originally proposed wrapper script.
A new more flexible approach is for the python script to provide a new command line option, permitting users to specify a custom string they would prefer to see in error messages.
For example, if a user prefers to call the python script 'myPyScript.py' via an alias, they can change the alias definition from this:
alias myAlias='myPyScript.py $#'
to this:
alias myAlias='myPyScript.py --caller=myAlias $#'
If they prefer to call the python script from a shell script, it can use the additional command line option like so:
#!/bin/bash
exec myPyScript.py "$#" --caller=${0##*/}
Other possible applications of this approach:
bash -c myPyScript.py --caller="bash -c myPyScript.py"
myPyScript.py --caller=myPyScript.py
For listing expanded command lines, here's a script 'pyTest.py', based on feedback by #CharlesDuffy, that lists cmdline for the running python script, as well as the parent process that spawned it.
If the new -caller argument is used, it will appear in the command line, although aliases will have been expanded, etc.
#!/usr/bin/env python
import os, re
with open ("/proc/self/stat", "r") as myfile:
data = [x.strip() for x in str.split(myfile.readlines()[0],' ')]
pid = data[0]
ppid = data[3]
def commandLine(pid):
with open ("/proc/"+pid+"/cmdline", "r") as myfile:
return [x.strip() for x in str.split(myfile.readlines()[0],'\x00')][0:-1]
pid_cmdline = commandLine(pid)
ppid_cmdline = commandLine(ppid)
print "%r" % pid_cmdline
print "%r" % ppid_cmdline
After saving this to a file named 'pytest.py', and then calling it from a bash script called 'pytest.sh' with various arguments, here's the output:
$ ./pytest.sh a b "c d" e
['python', './pytest.py']
['/bin/bash', './pytest.sh', 'a', 'b', 'c d', 'e']
NOTE: criticisms of the original wrapper script aliasTest.sh were valid. Although the existence of a pre-defined alias is part of the specification of the question, and may be presumed to exist in the user environment, the proposal defined the alias (creating the misleading impression that it was part of the recommendation rather than a specified part of the user's environment), and it didn't show how the wrapper would communicate with the called python script. In practice, the user would either have to source the wrapper or define the alias within the wrapper, and the python script would have to delegate the printing of error messages to multiple custom calling scripts (where the calling information resided), and clients would have to call the wrapper scripts. Solving those problems led to a simpler approach, that is expandable to any number of additional use cases.
Here's a less confusing version of the original script, for reference:
#!/bin/bash
shopt -s expand_aliases
alias myAlias='myPyScript.py'
# called like this:
set -o history
myAlias $#
_EXITCODE=$?
CALL_HISTORY=( `history` )
_CALLING_MODE=${CALL_HISTORY[1]}
case "$_EXITCODE" in
0) # no error message required
;;
1)
echo "customized error message #1 [$_CALLING_MODE]" 1>&2
;;
2)
echo "customized error message #2 [$_CALLING_MODE]" 1>&2
;;
esac
Here's the output:
$ aliasTest.sh 1 2 3
['./myPyScript.py', '1', '2', '3']
customized error message #2 [myAlias]

There is no way to distinguish between when an interpreter for a script is explicitly specified on the command line and when it is deduced by the OS from the hashbang line.
Proof:
$ cat test.sh
#!/usr/bin/env bash
ps -o command $$
$ bash ./test.sh
COMMAND
bash ./test.sh
$ ./test.sh
COMMAND
bash ./test.sh
This prevents you from detecting the difference between the first two cases in your list.
I am also confident that there is no reasonable way of identifying the other (mediated) ways of calling a command.

I can see two ways to do this:
The simplest, as suggested by 3sky, would be to parse the command line from inside the python script. argparse can be used to do so in a reliable way. This only works if you can change that script.
A more complex way, slightly more generic and involved, would be to change the python executable on your system.
Since the first option is well documented, here are a bit more details on the second one:
Regardless of the way your script is called, python is ran. The goal here is to replace the python executable with a script that checks if foo.py is among the arguments, and if it is, check if --bar is as well. If not, print the message and return.
In every other case, execute the real python executable.
Now, hopefully, running python is done trough the following shebang: #!/usr/bin/env python3, or trough python foo.py, rather than a variant of #!/usr/bin/python or /usr/bin/python foo.py. That way, you can change the $PATH variable, and prepend a directory where your false python resides.
In the other case, you can replace the /usr/bin/python executable, at the risk of not playing nice with updates.
A more complex way of doing this would probably be with namespaces and mounts, but the above is probably enough, especially if you have admin rights.
Example of what could work as a script:
#!/usr/bin/env bash
function checkbar
{
for i in "$#"
do
if [ "$i" = "--bar" ]
then
echo "Well done, you added --bar!"
return 0
fi
done
return 1
}
command=$(basename ${1:-none})
if [ $command = "foo.py" ]
then
if ! checkbar "$#"
then
echo "Please add --bar to the command line, like so:"
printf "%q " $0
printf "%q " "$#"
printf -- "--bar\n"
exit 1
fi
fi
/path/to/real/python "$#"
However, after re-reading your question, it is obvious that I misunderstood it. In my opinion, it is all right to just print either "foo.py must be called like foo.py --bar", "please add bar to your arguments" or "please try (instead of )", regardless of what the user entered:
If that's an (git) alias, this is a one time error, and the user will try their alias after creating it, so they know where to put the --bar part
with either with /usr/bin/foo.py or python foo.py:
If the user is not really command line-savvy, they can just paste the working command that is displayed, even if they don't know the difference
If they are, they should be able to understand the message without trouble, and adjust their command line.

I know it's bash task, but i think the easiest way is modify 'foo.py'. Of course it depends on level of script complicated, but maybe it will fit. Here is sample code:
#!/usr/bin/python
import sys
if len(sys.argv) > 1 and sys.argv[1] == '--bar':
print 'make magic'
else:
print 'Please add the --bar option to your command, like so:'
print ' python foo.py --bar'
In this case, it does not matter how user run this code.
$ ./a.py
Please add the --bar option to your command, like so:
python foo.py --bar
$ ./a.py -dua
Please add the --bar option to your command, like so:
python foo.py --bar
$ ./a.py --bar
make magic
$ python a.py --t
Please add the --bar option to your command, like so:
python foo.py --bar
$ /home/3sky/test/a.py
Please add the --bar option to your command, like so:
python foo.py --bar
$ alias a='python a.py'
$ a
Please add the --bar option to your command, like so:
python foo.py --bar
$ a --bar
make magic

How do I rerun a bash script skipping over lines which have previously run sucesfully?

I have a bash script which acts as a wrapper for an analysis pipeline. If the script errors out I want to be able to run the script from the point at which the errors occurred by simply re-running the original command. I have set two different traps; one which will remove the last file being generated on a non-zero exit from my script, the other will remove all the temporary files on exit signal = 0 and essentially cleans up the file system at the end of the run. I turned on noclobber in the bash environment which allows my script to skip over lines of the script where files have already been written but this will only do this if I do not set the non-zero exit trap. As soon as I set this trap then it will exit at the first line where noclobber IDs a file it will not overwrite. Is there a way for me to skip over lines of code that have successfully run previously rather than having to re-run my code from the start? I know I could use conditional statements for each line but I thought there might be a neater way of doing this.
set -o noclobber
# Function to clean up temporary folders when script exits at the end
rmfile() { rm -r $1 }
# Function to remove the file being currently generated
# Function executed if script errors out
rmlast() {
if [ ! -z "$CURRENTFILE" ]
then
rm -r $1
exit 1
fi }
# Trap to remove the currently generated file
trap 'rmlast "$CURRENTFILE"' ERR SIGINT
#Make temporary directory if it has not been created in a previous run
TEMPDIR=$(find . -name "tmp*")
if [ -z "$TEMPDIR" ]
then
TEMPDIR=$(mktemp -d /test/tmpXXX)
fi
# Set CURRENTFILE variable
CURRENTFILE="${TEMPDIR}/Variants.vcf"
# Set CURRENTFILE variable
complexanalysis_tool input_file > $CURRENTFILE
# Set CURRENTFILE variable
CURRENTFILE="${TEMPDIR}/Filtered.vcf"
complexanalysis_tool2 input_file2 > $CURRENTFILE
CURRENTFILE="${TEMPDIR}/Filtered_2.vcf"
complexanalysis_tool3 input_file3 > $CURRENTFILE
# Move files to final destination folder
mv -nv $TEMPDIR/*.vcf /test/newdest/
# Trap to remove temporary folders when script finishes running
trap 'rmfile "$TEMPDIR"' 0
Update:
I have been offered answers suggesting the use of the make utility. I want to make use of its inbuilt utility to check if a dependency has been fulfilled.
In my hands the makefile suggested by VK Kashyap does not seem to skip execution for previously accomplished tasks. So for example I ran the script above and interrupted the script when it was running filtered.vcf with ctrl c. When I rerun the script again it runs from the beginning again i.e. starts again at varaints.vcf. Am I missing something in order to get the makefile to show sources as being fullfilled?
Answer to update:
OK this is a rookie mistake but since I am not familiar with generating makefiles I will post this explanation of my error. The reason my makefile was not rerunning from the exit point was that I had named the targets a different name to the output files being generated. So as VK Kashyap quite correctly answered if you name the targets eg.
variants.vcf
filtered.vcf
filtered2.vcf
the same as the output files being generated then the script will skip previously accomplished tasks.

make utility might be an answer for the thing you want to achive.
it has inbuilt dependecy checking (the stuff which you are trying to achive with tmp files)
#run all target when all of the files are available
all: variants.vcf filtered.vcf filtered2.vcf
mv -nv $(TEMPDIR)/*.vcf /test/newdest/
variants.vcf:
complexanalysis_tool input_file > variants.vcf
filtered.vcf:
complexanalysis_tool2 input_file2 > filtered.vcf
filtered2.vcf:
complexanalysis_tool3 input_file3 > filtered2.vcf
you may use bash script to invoke this make file as:
#/bin/bash
export TEMPDIR=xyz
make -C $TEMPDIR all
make utility will check itself for already accomplished task and skip execution for done stuffs. it will continue where you had the error finishing the task.
you can find more details on internet about exact syntax for makefile.

there is no built-in way to do that.
however, you could brew something like that by keeping track of the last successful line and building your own goto statement, as described here and in Is there a "goto" statement in bash? (just replace the 'labels' with actual line-numbers).
however, the question is whether this is really a smart idea.
a better way is to only run the commands needed, not the commands not-yet-executed.
this could be done either by explicit conditionals in your bash-script:
produce_if_missing() {
# check if first argument is existing
# if not run the rest of the arguments and pipe it into the first one
local curfile=$1
shift
if [ ! -e "${curfile}" ]; then
$# > "${curfile}"
fi
}
produce_if_missing Variants.vcf complexanalysis_tool input_file
produce_if_missing Filtered.vcf complexanalysis_tool2 input_file2
or using tools that are made for such things (see VK Kahyap's answer using make, though i prefer using variables in the make-rules to minimize typos):
Variants.vcf: input_file
complexanalysis_tool $^ > $#
Filtered.vcf: input_file
complexanalysis_tool2 $^ > $#

Making Sublime Text 2 command on linux behave as it does on MacOS X

There are many questions asking about accessing the Sublime Text 2 editor from the command line. The responses, in summary, are to make a symlink, alias or simple shell script to run the appropriate sublime_text command. I can do that. What I want is to make the linux version behave like the MacOS version.
On MacOS, I have the following:
ln -s /Applications/Sublime\ Text\ 2.app/Contents/SharedSupport/bin/subl ~/bin/subl
Then in my .zshrc:
alias subl="$HOME/bin/subl -n"
export EDITOR="$HOME/bin/subl -n -w"
This does two things. It gives me a subl command that opens any files given on the command line in a new window. The subl command does not block the terminal. It also sets up my editor to open sublime text to edit the arguments, but this time it does block. In particular, $EDITOR blocks until its arguments are closed. It does not block on unrelated sublime text windows.
I can achieve a similar effect on linux with the following:
In ~/bin/subl:
#! /bin/zsh
$HOME/Sublime\ Text\ 2/sublime_text -n $# &
and then in ~/bin/subl_wait: (think mate_wait for TextMate users)
#! /bin/zsh
exec $HOME/Sublime\ Text\ 2/sublime_text -n -w $#
I can then set EDITOR to subl_wait, and things almost work. subl opens files for editing and doesn't block. subl_wait opens files for editing and does block.
The problem is that subl_wait is waiting until all open files are closed, not just its arguments.
Is it possible to get this working perfectly?

Looks like I've found the issue. (Thanks to this post: http://www.sublimetext.com/forum/viewtopic.php?f=2&t=7003 )
Basic point: sublime behaves differently depending upon whether an instance is already running!
If an instance is already running then sublime on linux behaves similarly to MacOS. If no instance is running then the terminal blocks until you exit sublime.
With that in mind, we just need to modify the scripts to make sure sublime is running:
in ~/bin/subl_start:
#! /bin/zsh
if [ ! "$(pidof sublime_text)" ] ; then
# start sublime text main instance
# echo "Starting Sublime Text 2"
$HOME/Sublime\ Text\ 2/sublime_text &
sleep 1 # needed to avoid a race condition
fi
in ~/bin/subl:
#! /bin/zsh
. $HOME/bin/subl_start
exec $HOME/Sublime\ Text\ 2/sublime_text -n $#
in ~/bin/subl_wait:
#! /bin/zsh
. $HOME/bin/subl_start
exec $HOME/Sublime\ Text\ 2/sublime_text -n -w $#
Note that I've used the -n flags everywhere. This might not be your cup of tea. If you are using -n then you possibly also want to look at your close_windows_when_empty setting.

Inspired by the OP's answer, I've created a bash wrapper script for Sublime Text that incorporates all your findings and runs on both OSX and Linux.
Its purpose is threefold:
provide a unified subl CLI that works like ST's own subl on OSX: invoke ST without blocking, unless waiting is explicitly requested.
encapsulate a workaround for the waiting-related bug on Linux.
when saved or symlinked to as sublwait, provide a sublwait CLI that automatically applies the --wait and --new-window options so as to make it suitable for use with $EDITOR (note that some programs, e.g. npm, require the $EDITOR to contain the name of an executable only - executables + options are not supported); also makes sure that at least one file is specified.
The only open question is whether the OP's approach to avoiding the race condition - sleep 1 - is robust enough.
Update:
Note that subl on OSX is by default NOT placed in the $PATH - you normally have to do that manually. If you haven't done so, the script will now locate subl inside ST's application bundle; (it tries app names in the following sequence: 'Sublime Text', 'Sublime Text 2', 'Sublime Text 3', first in /Applications, then in ~/Applications.)
Here's the output from running the script with -h:
Multi-platform (OSX, Linux) wrapper script for invocation of Sublime Text (ST)
from the command line.
Linux:
Works around undesired blocking of the shell (unless requested)
and a bug when waiting for specific files to be edited.
Both platforms:
When invoked as `sublwait`, automatically applies the
--wait --new-window
options to make it suitable for use with $EDITOR.
Therefore, you can to the following:
- Name this script `subl` for a CLI that supports ALL options.
(On OSX, this will simply defer to the `subl` CLI that came with ST.)
- Place the script in a directory in your $PATH.
- In the same directory, create a symlink to the `subl` script named
`sublwait`:
ln -s subl sublwait
and, if desired, add
export EDITOR=sublwait
to your shell profile.
Note that if you only use OSX, you can make do with ST's own subl and just save this script directly as sublwait.
Script source:
#!/usr/bin/env bash
# Multi-platform (OSX, Linux) wrapper script for invocation of Sublime Text (ST)
# from the command line. Invoke with -h for details.
[[ $1 == '-h' || $1 == '--help' ]] && showHelpOnly=1 || showHelpOnly=0
[[ $(basename "$BASH_SOURCE") == 'sublwait' ]] && invokedAsSublWait=1 || invokedAsSublWait=0
[[ $(uname) == 'Darwin' ]] && isOsX=1 || isOsX=0
# Find the platform-appropriate ST executable.
if (( isOsX )); then # OSX: ST comes with a bona-fide CLI, `subl`.
# First, try to find the `subl` CLI in the $PATH.
# Note: This CLI is NOT there by default; it must be created by symlinking it from
# its location inside the ST app bundle.
# Find the `subl` executable, ignoring this script, if named subl' as well, or a
# script by that name in the same folder as this one (when invoked via symlink 'sublwait').
stExe=$(which -a subl | fgrep -v -x "$(dirname "$BASH_SOURCE")/subl" | head -1)
# If not already in the path, look for it inside the application bundle. Try several locations and versions.
if [[ -z $stExe ]]; then
for p in {,$HOME}"/Applications/Sublime Text"{,' 2',' 3'}".app/Contents/SharedSupport/bin/subl"; do
[[ -f $p ]] && { stExe=$p; break; }
done
fi
[[ -x $stExe ]] || { echo "ERROR: Sublime Text CLI 'subl' not found." 1>&2; exit 1; }
else # Linux: `sublime_text` is the only executable - the app itself.
stExe='sublime_text'
which "$stExe" >/dev/null || { echo "ERROR: Sublime Text executable '$stExe' not found." 1>&2; exit 1; }
fi
# Show command-line help, if requested.
# Add preamble before printing ST's own help.
# Note that we needn't worry about blocking the
# shell in this case - ST just outputs synchronously
# to stdout, then exits.
if (( showHelpOnly )); then
bugDescr=$(
cat <<EOF
works around a bug on Linux (as of v2.0.2), where Sublime Text,
if it is not already running, mistakenly blocks until it is exited altogether.
EOF
)
if (( invokedAsSublWait )); then
# We provide variant-specific help here.
cat <<EOF
Wrapper script for Sublime Text suitable for use with the \$EDITOR variable.
Opens the specified files for editing in a new window and blocks the
invoking program (shell) until they are closed.
In other words: the --wait and --new-window options are automatically
applied.
Aside from encapsulating this functionality without the need for options
- helpful for tools that require \$EDITOR to be an executable name only -
$bugDescr
Usage: sublwait file ...
EOF
# Note: Adding other options doesn't make sense in this scenario
# (as of v2.0.2), so we do NOT show ST's own help here.
else
cat <<EOF
Multi-platform (OSX, Linux) wrapper script for invocation of
Sublime Text (ST) from the command line.
Linux:
Works around undesired blocking of the shell (unless requested)
and a bug when waiting for specific files to be edited.
Both platforms:
When invoked as \`sublwait\`, automatically applies the
--wait --new-window
options to make it suitable for use with \$EDITOR.
Therefore, you can to the following:
- Name this script \`subl\` for a CLI that supports ALL options.
(On OSX, this will simply defer to the \`subl\` CLI that came with ST.)
- Place the script in a directory in your \$PATH.
- In the same directory, create a symlink to the \`subl\` script named
\`sublwait\`:
ln -s subl sublwait
and, if desired, add
export EDITOR=sublwait
to your shell profile.
Sublime Text's own help:
------------------------
EOF
# Finally, print ST's own help and exit.
exec "$stExe" "$#"
fi
exit 0
fi
# Invoked as `sublwait`? -> automatically apply --wait --new-window options.
if (( invokedAsSublWait )); then
# Validate parameters.
# - We expect NO options - to keep things simple and predictable, we do NOT allow
# specifying additional options (beyond the implied ones).
# - We need at least 1 file argument.
# - As a courtesy, we ensure that no *directories* are among the arguments - ST doesn't support
# that properly (always waits for ST exit altogether); beyond that, however, we leave input
# validation to ST.
if [[ "$1" =~ ^-[[:alnum:]]+$ || "$1" =~ ^--[[:alnum:]]+[[:alnum:]-]+$ ]]; then # options specified?
{ echo "ERROR: Unexpected option specified: '$1'. Use -h for help." 1>&2; exit 1; }
elif (( $# == 0 )); then # no file arguments?
{ echo "ERROR: Missing file argument. Use -h for help." 1>&2; exit 1; }
else # any directories among the arguments?
# Note: We do NOT check for file existence - files could be created on demand.
# (Things can still go wrong - e.g., /nosuchdir/mynewfile - and ST doesn't
# handle that gracefully, but we don't want to do too much here.)
for f in "$#"; do
[[ ! -d "$f" ]] || { echo "ERROR: Specifying directories is not supported: '$f'. Use -h for help." 1>&2; exit 1; }
done
fi
# Prepend the implied options.
set -- '--wait' '--new-window' "$#"
fi
# Finally, invoke ST:
if (( isOsX )); then # OSX
# `subl` on OSX handles all cases correctly; simply pass parameters through.
exec "$stExe" "$#"
else # LINUX: `sublime_text`, the app executable itself, does have a CLI, but it blocks the shell.
# Determine if the wait option was specified.
mustWait=0
if (( invokedAsSublWait )); then
mustWait=1
else
# Look for the wait option in the parameters to pass through.
for p in "$#"; do
[[ $p != -* ]] && break # past options
[[ $p == '--wait' || $p =~ ^-[[:alnum:]]*w[[:alnum:]]*$ ]] && { mustWait=1; break; }
done
fi
if (( mustWait )); then # Invoke in wait-for-specified-files-to-close mode.
# Quirk on Linux:
# If sublime_text isn't running yet, we must start it explicitly first.
# Otherwise, --wait will wait for ST *as a whole* to be closed before returning,
# which is undesired.
# Thanks, http://stackoverflow.com/questions/14598261/making-sublime-text-2-command-on-linux-behave-as-it-does-on-macos-x
if ! pidof "$stExe" 1>/dev/null; then
# Launch as BACKGROUND task to avoid blocking.
# (Sadly, the `--background` option - designed not to activate the Sublime Text window
# on launching - doesn't actually work on Linux (as of ST v2.0.2 on Ubuntu 12.04).)
("$stExe" --background &)
# !! We MUST give ST some time to start up, otherwise the 2nd invocation below will be ignored.
# ?? Does a fixed sleep time of 1 second work reliably?
sleep 1
fi
# Invoke in blocking manner, as requested.
exec "$stExe" "$#"
else # Ensure invocation in NON-blocking manner.
if ! pidof "$stExe" 1>/dev/null; then # ST isn't running.
# If ST isn't running, invoking it *always* blocks.
# Therefore, we launch it as a background taks.
# Invocation via a subshell (parentheses) suppresses display of the
# background-task 'housekeeping' info.
("$stExe" "$#" &)
else # ST is already running, we can safely invoke it directly without fear of blocking.
exec "$stExe" "$#"
fi
fi
fi

On Ubuntu Gnu/Linux 13.04 64-bit:
I just keep subl running pretty much all the time. So my git config has:
core.editor=/usr/bin/subl -n -w
And that's all I need. I save the git commit file with ctrl-s, close the window with ctrl-w and I'm done. But I then have to really close the window by hitting the X in the upper corner... 96% perfect.

Trace of executed programs called by a Bash script

A script is misbehaving. I need to know who calls that script, and who calls the calling script, and so on, only by modifying the misbehaving script.
This is similar to a stack-trace, but I am not interested in a call stack of function calls within a single bash script.
Instead, I need the chain of executed programs/scripts that is initiated by my script.

A simple script I wrote some days ago...
# FILE : sctrace.sh
# LICENSE : GPL v2.0 (only)
# PURPOSE : print the recursive callers' list for a script
# (sort of a process backtrace)
# USAGE : [in a script] source sctrace.sh
#
# TESTED ON :
# - Linux, x86 32-bit, Bash 3.2.39(1)-release
# REFERENCES:
# [1]: http://tldp.org/LDP/abs/html/internalvariables.html#PROCCID
# [2]: http://linux.die.net/man/5/proc
# [3]: http://linux.about.com/library/cmd/blcmdl1_tac.htm
#! /bin/bash
TRACE=""
CP=$$ # PID of the script itself [1]
while true # safe because "all starts with init..."
do
CMDLINE=$(cat /proc/$CP/cmdline)
PP=$(grep PPid /proc/$CP/status | awk '{ print $2; }') # [2]
TRACE="$TRACE [$CP]:$CMDLINE\n"
if [ "$CP" == "1" ]; then # we reach 'init' [PID 1] => backtrace end
break
fi
CP=$PP
done
echo "Backtrace of '$0'"
echo -en "$TRACE" | tac | grep -n ":" # using tac to "print in reverse" [3]
... and a simple test.
I hope you like it.

You can use Bash Debugger http://bashdb.sourceforge.net/
Or, as mentioned in the previous comments, the caller bash built-in. See: http://wiki.bash-hackers.org/commands/builtin/caller
i=0; while caller $i ;do ((i++)) ;done
Or as a bash function:
dump_stack(){
local i=0
local line_no
local function_name
local file_name
while caller $i ;do ((i++)) ;done | while read line_no function_name file_name;do echo -e "\t$file_name:$line_no\t$function_name" ;done >&2
}
Another way to do it is to change PS4 and enable xtrace:
PS4='+$(date "+%F %T") ${FUNCNAME[0]}() $BASH_SOURCE:${BASH_LINENO[0]}+ '
set -o xtrace # Comment this line to disable tracing.

~$ help caller
caller: caller [EXPR]
Returns the context of the current subroutine call.
Without EXPR, returns "$line $filename". With EXPR,
returns "$line $subroutine $filename"; this extra information
can be used to provide a stack trace.
The value of EXPR indicates how many call frames to go back before the
current one; the top frame is frame 0.

Since you say you can edit the script itself, simply put a:
ps -ef >/tmp/bash_stack_trace.$$
in it, where the problem is occurring.
This will create a number of files in your tmp directory that show the entire process list at the time it happened.
You can then work out which process called which other process by examining this output. This can either be done manually, or automated with something like awk, since the output is regular - you just use those PID and PPID columns to work out the relationships between all the processes you're interested in.
You'll need to keep an eye on the files, since you'll get one per process so they may have to be managed. Since this is something that should only be done during debugging, most of the time that line will be commented out (preceded by #), so the files won't be created.
To clean them up, you can simply do:
rm /tmp/bash_stack_trace.*

UPDATE:
The code below should work. Now I have a newer answer with a newer code version that allows a message inserted in the stacktrace.
IIRC I just couldn't find this answer to update it as well at the time. But now decided code is better kept in git so latest version of the above should be in this gist.
original code-corrected answer below:
There was another answer about this somewhere but here is a function to use for getting stack trace in the sense used for example in the java programming language. You call the function and it puts the stack trace into the variable $STACK. It show the code points that led to get_stack being called. This is mostly useful for complicated execution where single shell sources multiple script snippets and nesting.
function get_stack () {
STACK=""
# to avoid noise we start with 1 to skip get_stack caller
local i
local stack_size=${#FUNCNAME[#]}
for (( i=1; i<$stack_size ; i++ )); do
local func="${FUNCNAME[$i]}"
[ x$func = x ] && func=MAIN
local linen="${BASH_LINENO[(( i - 1 ))]}"
local src="${BASH_SOURCE[$i]}"
[ x"$src" = x ] && src=non_file_source
STACK+=$'\n'" "$func" "$src" "$linen
done
}

adding pstree -p -u `whoami` >>output in your script will probably get you the information you need.

The simplest script which returns a stack trace with all callers:
i=0; while caller $i ;do ((i++)) ;done

You could try something like
strace -f -e execve script.sh

/usr/bin/env questions regarding shebang line pecularities

Questions:
What does the kernel do if you stick a shell-script into the shebang line?
How does the Kernel know which interpreter to launch?
Explanation:
I recently wanted to write a wrapper around /usr/bin/env because my CGI Environment does not allow me to set the PATH variable, except globally (which of course sucks!).
So I thought, "OK. Let's set PREPENDPATH and set PATH in a wrapper around env.". The resulting script (here called env.1) looked like this:
#!/bin/bash
/usr/bin/env PATH=$PREPENDPATH:$PATH $*
which looks like it should work. I checked how they both react, after setting PREPENDPATH:
$ which /usr/bin/env python
/usr/bin/env
/usr/bin/python
$ which /usr/bin/env.1 python
/usr/bin/env
/home/pi/prepend/bin/python
Look absolutely perfect! So far, so good. But look what happens to "Hello World!".
# Shebang is #!/usr/bin/env python
$ test-env.py
Hello World!
# Shebang is #!/usr/bin/env.1 python
$ test-env.1.py
Warning: unknown mime-type for "Hello World!" -- using "application/*"
Error: no such file "Hello World!"
I guess I am missing something pretty fundamental about UNIX.
I'm pretty lost, even after looking at the source code of the original env. It sets the environment and launches the program (or so it seems to me...).

First of all, you should very seldom use $* and you should almost always use "$#" instead. There are a number of questions here on SO which explain the ins and outs of why.
Second - the env command has two main uses. One is to print the current environment; the other is to completely control the environment of a command when it is run. The third use, which you are demonstrating, is to modify the environment, but frankly there's no need for that - the shells are quite capable of handling that for you.
Mode 1:
env
Mode 2:
env -i HOME=$HOME PATH=$PREPENDPATH:$PATH ... command args
This version cancels all inherited environment variables and runs command with precisely the environment set by the ENVVAR=value options.
The third mode - amending the environment - is less important because you can do that fine with regular (civilized) shells. (That means "not C shell" - again, there are other questions on SO with answers that explain that.) For example, you could perfectly well do:
#!/bin/bash
export PATH=${PREPENDPATH:?}:$PATH
exec python "$#"
This insists that $PREPENDPATH is set to a non-empty string in the environment, and then prepends it to $PATH, and exports the new PATH setting. Then, using that new PATH, it executes the python program with the relevant arguments. The exec replaces the shell script with python. Note that this is quite different from:
#!/bin/bash
PATH=${PREPENDPATH:?}:$PATH exec python "$#"
Superficially, this is the same. However, this will execute the python found on the pre-existing PATH, albeit with the new value of PATH in the process's environment. So, in the example, you'd end up executing Python from /usr/bin and not the one from /home/pi/prepend/bin.
In your situation, I would probably not use env and would just use an appropriate variant of the script with the explicit export.
The env command is unusual because it does not recognize the double-dash to separate options from the rest of the command. This is in part because it does not take many options, and in part because it is not clear whether the ENVVAR=value options should come before or after the double dash.
I actually have a series of scripts for running (different versions of) a database server. These scripts really use env (and a bunch of home-grown programs) to control the environment of the server:
#!/bin/ksh
#
# #(#)$Id: boot.black_19.sh,v 1.3 2008/06/25 15:44:44 jleffler Exp $
#
# Boot server black_19 - IDS 11.50.FC1
IXD=/usr/informix/11.50.FC1
IXS=black_19
cd $IXD || exit 1
IXF=$IXD/do.not.start.$IXS
if [ -f $IXF ]
then
echo "$0: will not start server $IXS because file $IXF exists" 1>&2
exit 1
fi
ONINIT=$IXD/bin/oninit.$IXS
if [ ! -f $ONINIT ]
then ONINIT=$IXD/bin/oninit
fi
tmpdir=$IXD/tmp
DAEMONIZE=/work1/jleffler/bin/daemonize
stdout=$tmpdir/$IXS.stdout
stderr=$tmpdir/$IXS.stderr
if [ ! -d $tmpdir ]
then asroot -u informix -g informix -C -- mkdir -p $tmpdir
fi
# Specialized programs carried to extremes:
# * asroot sets UID and GID values and then executes
# * env, which sets the environment precisely and then executes
# * daemonize, which makes the process into a daemon and then executes
# * oninit, which is what we really wanted to run in the first place!
# NB: daemonize defaults stdin to /dev/null and could set umask but
# oninit dinks with it all the time so there is no real point.
# NB: daemonize should not be necessary, but oninit doesn't close its
# controlling terminal and therefore causes cron-jobs that restart
# it to hang, and interactive shells that started it to hang, and
# tracing programs.
# ??? Anyone want to integrate truss into this sequence?
asroot -u informix -g informix -C -a dbaao -a dbsso -- \
env -i HOME=$IXD \
INFORMIXDIR=$IXD \
INFORMIXSERVER=$IXS \
INFORMIXCONCSMCFG=$IXD/etc/concsm.$IXS \
IFX_LISTEN_TIMEOUT=3 \
ONCONFIG=onconfig.$IXS \
PATH=/usr/bin:$IXD/bin \
SHELL=/usr/bin/ksh \
TZ=UTC0 \
$DAEMONIZE -act -d $IXD -o $stdout -e $stderr -- \
$ONINIT "$#"
case "$*" in
(*v*) track-oninit-v $stdout;;
esac

You should carefully read the wikipedia article about shebang.
When your system sees the magic number corresponding to the shebang, it does an execve on the given path after the shebang and gives the script itself as an argument.
Your script fails because the file you give (/usr/bin/env.1) is not an executable, but begins itself by a shebang....
Ideally, you could resolve it using... env on your script with this line as a shebang:
#!/usr/bin/env /usr/bin/env.1 python
It won't work though on linux as it treats "/usr/bin/env.1 python" as a path (it doesn't split arguments)
So the only way I see is to write your env.1 in C
EDIT: seems like no one belives me ^^, so I've written a simple and dirty env.1.c:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
const char* prependpath = "/your/prepend/path/here:";
int main(int argc, char** argv){
int args_len = argc + 1;
char* args[args_len];
const char* env = "/usr/bin/env";
int i;
/* arguments: the same */
args[0] = env;
for(i=1; i<argc; i++)
args[i] = argv[i];
args[argc] = NULL;
/* environment */
char* p = getenv("PATH");
char* newpath = (char*) malloc(strlen(p)
+ strlen(prependpath));
sprintf(newpath, "%s%s", prependpath, p);
setenv("PATH", newpath, 1);
execv(env, args);
return 0;
}

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string