Which shell should be used for Linux/MacOS/UNIX best compatibility?

My question may seem related to the SO question "What Linux shell should I use?", but my problem is to know which shell should be used to write an application start script, knowing that this is a cross-platform Java application (almost all Linux distributions, MacOS, Solaris, ...). So I'm adding compatibility concerns here.
Please note that I'm not asking "which is the best shell to use" in general (a question that, in my opinion, may have no answer: it's subjective and depends on needs), but rather which shell has the best chance, today, of being available (and suitable for starting a Java application) on most operating systems.
Also, can I simply use the shebang #!/bin/bash to "use bash" (or, for example, #!/bin/ksh for the Korn shell)? What happens if that shell is not available on the OS?
We're actually using a ".sh" file with the shebang #!/bin/sh (which is the Bourne shell, I guess), but some users are complaining about errors on some Linux distributions (we don't know yet which ones they use, but we would like to take a more global approach instead of fixing errors one by one). MacOS currently uses bash as its default shell, but so far we don't have any issue on MacOS with /bin/sh...
Note: we'd like to avoid having several start scripts (i.e. using different shells)

For maximum portability, your best bet is /bin/sh using only POSIX sh features (no extensions). Any other shell you pick might not be installed on some system (BSDs rarely have bash, while Linux rarely has ksh).
The problem you can run into is that /bin/sh is frequently not an actual Bourne sh or a strictly POSIX sh; it's often just a link to /bin/bash or /bin/ksh that runs that other shell in sh-compatibility mode. That means that while any POSIX sh script should run fine, extensions will also be accepted, so constructs that are illegal per POSIX may run as well. You might therefore have a script that you think is fine (it runs fine when you test it) but that actually depends on some bash or ksh extension that other shells don't support.
You can try running your script with multiple shells in POSIX compatibility mode (try, say, bash, ksh, and dash) and make sure it runs on all of them, so you're not accidentally relying on an extension that only one of them supports.
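For instance, here is a hypothetical check that looks fine under bash but breaks under a stricter sh such as dash, together with a portable rewrite (JAVA_HOME is just an illustrative variable):

# Bashism: [[ ]] with == pattern matching is not POSIX; dash rejects it.
if [[ $JAVA_HOME == /usr/* ]]; then echo "system JDK"; fi

# Portable POSIX equivalent using case:
case $JAVA_HOME in
  /usr/*) echo "system JDK" ;;
esac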

You won't find a shell implementation that is installed on every one of these OSes; however, all of them ship a shell that is POSIX compliant, or at least more or less close to compliant.
You should therefore restrict your shell scripts to the POSIX standard as far as possible.
However, there is no simple, portable way to state that a script must be executed in a POSIX context, and in particular no universal shebang to specify. I would suggest using a post-install script that inserts the correct shebang on the target platform, retrieved using this command:
#!/bin/sh
printf "#!%s\n" "$(PATH=$(getconf PATH) command -v sh)"
Your scripts should also include this instruction once, before calling any external command:
export PATH=$(getconf PATH):$PATH
to make sure the utilities called are the POSIX ones. Moreover, beware that some Unix implementations may require an environment variable to be set for them to behave in a POSIX way (e.g. BIN_SH=xpg4 is required on Tru64/OSF1, XPG_SUS_ENV=ON on AIX, ...).
To develop your script, I would recommend using a shell that has the fewest extensions to the standard, such as dash. That helps to quickly detect errors caused by bashisms (or kshisms, or whatever).
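As a rough harness (assuming dash, bash, and ksh are installed on the development machine, and start.sh stands in for your start script), you could run the script under each shell in turn:

for shell in dash "bash --posix" ksh; do
  echo "== $shell =="
  $shell ./start.sh || echo "FAILED under $shell"
done

Any failure here usually points at an extension that one of the shells doesn't support.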
PS: beware that, despite popular belief, /bin/sh is not guaranteed to be POSIX compliant even on a POSIX compliant OS (on Solaris 10, for instance, /bin/sh is the traditional Bourne shell, while the POSIX sh lives in /usr/xpg4/bin/sh).

Related

Determine how a Linux process was invoked

Is there a way, in Linux, to determine how a process was invoked?
I know that ps displays the startup parameters, but I'm interested in how the process was started.
Was it an init.d script, a cron job, or a manual invocation via the CLI?
Right now I am looking through all the configs/commands manually; is there an easier way I am overlooking?
(I also know that the presence of systemd etc. is distro-related, which helps to prioritize a little bit.)
In most cases, for a process with PID 1234, you can get valuable information about it through /proc/1234/ (see proc(5) for details).
See also credentials(7), Advanced Linux Programming, and Linux From Scratch.
For example, try ps $$, then cat /proc/$$/status, then cat /proc/$$/maps, then cat /proc/$$/comm in your terminal (which is probably running the GNU bash shell, or zsh).
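As a quick illustration, a small shell loop can walk the parent chain through /proc to see what started a given process (1234 below is just an example PID):

pid=${1:-1234}
while [ "$pid" -gt 1 ]; do
  printf '%s\t%s\n' "$pid" "$(cat /proc/$pid/comm)"
  pid=$(awk '/^PPid:/ {print $2}' /proc/$pid/status)
done

If the chain ends in crond, the process came from a cron job; if it goes straight up to init or systemd, it was probably started at boot; an interactive shell such as bash in the chain suggests manual invocation.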
Consider writing a C program that makes the appropriate syscalls(2) (perhaps with opendir(3) and readdir(3)...) to query that information from /proc/ ....
Remember to read errno(3): a lot of functions (like open(2), read(2), getpwnam(3), ...) can fail.
Download, then study for inspiration, the source code of the GNU bash shell (or even of the Linux kernel); it is free software.

Why isn't this command running properly?

So I'm making a better command-line frontend for APT, and I'm putting on some finishing touches. When the code below runs:
Command::new("unbuffer")
.arg("apt")
.arg("list")
.arg("|")
.arg("less")
.arg("-r")
.status()
.expect("Something went wrong.");
it spits out:
E: Command line option 'r' [from -r] is not understood in combination with the other options.
but when I just run unbuffer apt list | less -r manually in my terminal it works perfectly. How do I get it to run properly when calling it in Rust?
Spawning a process via Command uses the system's native functionality to create a process. This is a low-level feature and has little to do with the shell/terminal you are used to. In particular, your shell (e.g. bash or zsh, running inside your terminal) offers a lot more features. For example, piping via | is such a feature. Command does not support these features, as the low-level system API doesn't.
Luckily, the low-level interface offers other means of achieving a lot of this. Piping, for example, is mostly just redirecting the standard inputs and outputs. You can do that with Command::{stdin, stdout, stderr}. Please see this part of the documentation for more information.
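For this specific pipeline, a sketch along the following lines (using the piping pattern from the documentation linked above; unbuffer, apt, and less are taken from your question) should behave like unbuffer apt list | less -r:

use std::process::{Command, Stdio};

fn main() {
    // Spawn `unbuffer apt list` with its stdout captured rather than
    // inherited, so we can feed it into the next process ourselves.
    let apt = Command::new("unbuffer")
        .args(["apt", "list"])
        .stdout(Stdio::piped())
        .spawn()
        .expect("failed to start unbuffer");

    // Hand that stdout to `less -r` as its stdin; this is the job the
    // shell's `|` normally does for you.
    let status = Command::new("less")
        .arg("-r")
        .stdin(apt.stdout.expect("stdout was not captured"))
        .status()
        .expect("failed to run less");

    println!("pipeline exited with: {status}");
}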
There are a few very similar questions, which are not similar enough to warrant closing this as a dupe, though:
Execute a shell command
Why does the compgen command work in the Linux terminal but not with process::Command?: mentions shell built-in commands that do not work with Command.
Executing find using std::process::Command on cygwin does not work

Why are some Bash commands both built-in and external?

Some commands are internal built-in Bash commands while others are external (other programs). I see why certain commands need to be built-in. Some of the reasons are:
If a command needs to change the internal state of the shell process.
If a command performs a very basic operation in the shell.
If a command is called often and needs to be made fast. An external command is executed by loading an external program and hence is slower.
But why are some commands both built-in and external, for example echo and test? I understand echo is used a lot and thus is built-in (Reason 3). But then why also have it as an external command and have a binary for it in /bin/echo? The built-in version of echo will always take precedence over the external version and thus, the external version is hardly ever used. So, why then have an external version of it at all?
It's exactly your point 3. When a command does very little (echo is a good example), spawning a new process dominates the run-time behavior. With growing disks and bandwidth and code bases, you always reach a point where one less spawn per file makes a difference of minutes (our code base at work has 100k files!).
That's also why the typical built-in is a drop-in replacement which takes (perhaps a superset of) the same arguments as the binary.
You also ask why the old binary is still retained even though Bash has it as a built-in — the answer is that a lot of programs rely on the existence of that /bin/echo. It's actually standardized.
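You can see both versions from an interactive Bash session, and pick one explicitly when you need to (output abridged; paths may vary by system):

$ type -a echo
echo is a shell builtin
echo is /bin/echo
$ echo hello          # runs the builtin
$ /bin/echo hello     # bypasses the builtin, runs the external binary
$ env echo hello      # env can only execute real programs, so external too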
Bash is only one of many user interfaces and offline command interpreters. They all have different sets of built-ins. Some shells are purposefully small and rely a lot on what you could call "legacy" binaries. One example is ash and its successor, Dash. Dash is now the default /bin/sh in Ubuntu and Debian due to its speed, and is popular for embedded systems due to its small size. (But even Dash has builtins for echo, test and dozens of other commands, and provides a command history for interactive use.)

What are the differences between running a process with and without a shell?

In the Node.js docs for child_process, I came upon this line:
Since a shell is not spawned, behaviors such as I/O redirection and file globbing are not supported.
That's good to know, but the "such as" worries me. What other behaviors are missing? And what counts as running without a shell in the first place? Isn't sh/cmd.exe still parsing the command-line input?
It looks like my initial assumption was flawed: neither the Bourne shell (sh) nor Windows's cmd.exe parses the command when a process is spawned without a shell; it's up to the consuming application to do so.
The most significant lost feature is the shell's expansion of file names and paths. The called program receives its arguments exactly as written, so:
No globbing (a literal *.txt is passed through instead of a list of matching files)
No ~ for the user's home directory
(Plain ../ and ./ still resolve, because the kernel handles those during path lookup, not the shell.)
Other shell syntax, such as redirection operators (> for example) and pipes (|), is also unsupported.
Basically, if it's listed as a feature on the shell's Wikipedia page, non-shell execution won't have it.
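You can observe the difference directly with child_process itself: exec runs the command through a shell, execFile does not. A minimal sketch (ls and the *.js glob are just illustrative):

const { exec, execFile } = require('child_process');

// With a shell: /bin/sh expands *.js before ls ever runs.
exec('ls *.js', (err, stdout) => console.log('shell:', stdout));

// Without a shell: '*.js' reaches ls as a literal argument, so ls
// complains unless a file literally named "*.js" exists.
execFile('ls', ['*.js'], (err, stdout) => {
  if (err) return console.error('no shell:', err.message);
  console.log('no shell:', stdout);
});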

Tcl, Linux, and flock

So I am working with a program written in Tcl that uses a flock command to lock files. I am testing it on a newer version of Linux than the one it currently runs on, and I found that when the newer machine runs the script, it picks up flock from /usr/bin/flock, which differs from the Tcl version. The Tcl version uses options like -read and -write, while the Linux version uses completely different options.
In short, the program stops working and errors out when it gets to any flock call. If I change the options to fit the Linux version, it breaks the program on the other machines.
Is there a way to make it use the Tcl version as opposed to the Linux one?
Tcl itself does not come with a flock command, though you might be seeing Tcl automatically falling back to the system command if you're testing interactively. Such automatic use of system commands is not done in scripts (that would be hellishly prone to instability due to varying PATHs), so when writing a script you should be explicit about what you mean.
If you want to use the system command (itself non-portable, especially to non-Linux systems) then just do:
exec flock $options...
Be aware that Tcl uses a different form of argument quoting to the shell. This can sometimes catch people out when writing exec calls.
Alternatively, use the flock Tcl command that is in the TclX package. The syntax is a little different to that of the Linux system utility, in large part because it's a bit lower-level. In its favor, it is rather more portable.
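With TclX the lock is taken on an open channel, roughly like this (the lock-file path is just an example):

package require Tclx

set f [open /tmp/myapp.lock w]
flock -write $f          ;# blocks until an exclusive lock is granted
# ... critical section ...
funlock $f
close $f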
