Make linux SPLIT command compatible with Mac OS terminal

I have a bash script that works fine in Linux, but when I run it in my Mac terminal it fails, as the options for the split command are slightly different in the Mac terminal. My script is:
## Merge and half final two segments
last_file=`ls temp_filt.snplist_* | tail -n 1`
penultimate_file=`ls temp_filt.snplist_* | tail -n 2 | head -1`
cat $penultimate_file $last_file > temp && mv temp $penultimate_file
split -n l/2 $penultimate_file && mv xaa $penultimate_file; mv xab $last_file
The script fails at the final line, since -n l/2 doesn't exist in tcsh (the default shell environment in Mac OS 10.x.x). I was wondering what the equivalent script in tcsh would be.
Is there a generic way to run linux script in Mac OS terminal, without the need to change the script?

It's not the MacOS terminal that's doing the split. It's a program called split. MacOS is built on the FreeBSD userland tools, which behave differently from the GNU utils.
There are two options:
Install the FreeBSD tools on your Linux boxes to make them compatible with FreeBSD.
Install the GNU utils on your MacOS machine. If you have brew you can do this with brew install coreutils
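If you go the second route, note that Homebrew's coreutils installs the GNU tools with a g prefix by default (gsplit rather than split). A minimal sketch of how a script might pick whichever is available; the helper name pick_split is made up for illustration:

```shell
# Hypothetical helper: prefer the GNU split that Homebrew installs as "gsplit",
# otherwise fall back to whatever "split" is found on the PATH.
pick_split() {
    if command -v gsplit >/dev/null 2>&1; then
        echo gsplit
    else
        echo split
    fi
}

SPLIT=$(pick_split)
# Example use (the -n option is GNU-only, so the fallback must be GNU split):
#   "$SPLIT" -n l/2 "$penultimate_file"
```

This keeps the rest of the script unchanged on Linux, where plain split is already the GNU version.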

An option is to use the language built-ins and limit external commands
Note the script contains several flaws: ls is useless and parsing ls output is not safe
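array=(temp_filt.snplist_*)
# Negative subscripts need a newer bash (4.2+); the bash 3.2 shipped with macOS rejects them.
last_file=${array[${#array[@]}-1]}
penultimate_file=${array[${#array[@]}-2]}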
If the files are big, the bash read built-in will be very slow.
A simple solution in this case uses cat, wc, head and tail, which are compatible between systems. Note that variables passed to a command must be double-quoted to avoid word splitting.
cat "$penultimate_file" "$last_file" > temp || exit 1
nb_lines=$(wc -l < temp)
((half_nb_lines=nb_lines/2))
head -n "$half_nb_lines" temp > "$penultimate_file" || exit 1
tail -n "+$((half_nb_lines+1))" temp > "$last_file" || exit 1
rm temp
Note that the last line of the original script has the form
command1 && command2 ; command3
where command3 is executed whatever the exit status of command1. Use { ; } to group commands:
command1 && { command2 ; command3; }
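The difference is easy to see with a first command that always fails. A small sketch (the function names are made up for illustration):

```shell
# Without grouping: "third" is printed even though the first command failed.
ungrouped() {
    false && echo "second"; echo "third"
}

# With grouping: nothing is printed, because the whole { ...; } block is skipped.
grouped() {
    false && { echo "second"; echo "third"; }
}
```

Running ungrouped prints third, while grouped prints nothing and returns a non-zero status.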

Related

How to get/set modification times in cross platform way from a (bash) script?

I use a combination of stat and touch for getting/setting timestamps on files and directories. But I need different set-ups depending on whether I am on mac os x or GNU/Linux:
touch on mac os x does not know the -d option described there
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/touch.html
which allows things like
touch -d "2007-11-12 10:15:30.002Z" ajosey
I am seemingly constrained to -t [[CC]YY]MMDDhhmm[.SS].
stat also differs, for example on a Linux account of mine, it does not recognize the -t format from the stat on mac os x.
Thus on the Linux I currently do something like
stat --format 'touch -d "%y" "%n"' index.html
to create a command line of the type
touch -d "2015-04-08 00:38:51.940365000 +0200" "index.html"
whereas on the mac os x I have
stat -f "touch -t %Sm \"%N\"" -t %Y%m%d%H%M.%S index.html
which gives me something (this is not the same index.html as prior) like:
touch -t 201503281339.42 "index.html"
How could I handle this in a unified way? Perhaps with some sed in between?
I need to produce a sequence of touch commands in a format working on both platforms. The creation of this sequence must work on both platforms.
I am open to scripting languages other than bash, with the constraint that on the Linux side I am on a system with no admin rights. The perl there reports: This is perl, v5.10.1 (*) built for x86_64-linux-thread-multi.

Short of a better method, I will temporarily adopt the following, which is based on these observations:
touch -t works the same on my mac os x and the Linux I have access to.
On the Linux side, I can use date -d to transform a date as produced by stat -c %y to the YYYYMMDDHHMM.SS format that touch -t accepts as input, and on the Mac OS X side I can use stat directly with suitable options to get this result.
For batch processing of files in a directory, where I was using stat with * shell expansion, I can replace that with a for shell loop.
Putting these things together I end with the following script:
#!/bin/sh
case `uname -s` in
"Linux" )
MYDATEFORTOUCH() {
date -d"$(stat -c %y "$1")" +%Y%m%d%H%M.%S
}
;;
"Darwin" )
MYDATEFORTOUCH() {
stat -f %Sm -t %Y%m%d%H%M.%S "$1"
}
;;
* )
MYDATEFORTOUCH() {
echo 197001010000.00
}
;;
esac
echo "#!/bin/sh" > fichierTEMPA
for file in *
do echo "touch -ch -t $(MYDATEFORTOUCH "$file") \"$file\"" >> fichierTEMPA
done
Executing this in a directory produces a file (with the silly name fichierTEMPA here) which is a series of touch -t commands. The -h is for not following symbolic links; on mac os x, it implies -c, which avoids creating a file that didn't exist. I am not sure if -c is also implied by -h on GNU/Linux.
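To check that the format really round-trips, one can read a timestamp with the per-platform function and feed it straight back to touch -t. A minimal sketch, assuming GNU stat and date on Linux and the BSD stat on Mac OS X; the names mtime_for_touch and copy_mtime are made up for illustration:

```shell
# Print the modification time of "$1" in the [[CC]YY]MMDDhhmm[.SS] form
# that touch -t accepts. Only Linux (GNU tools) and Darwin are handled.
mtime_for_touch() {
    case $(uname -s) in
        Linux)  date -d "$(stat -c %y "$1")" +%Y%m%d%H%M.%S ;;
        Darwin) stat -f %Sm -t %Y%m%d%H%M.%S "$1" ;;
        *)      return 1 ;;
    esac
}

# Give file "$2" the same modification time (to the second) as file "$1".
copy_mtime() {
    touch -t "$(mtime_for_touch "$1")" "$2"
}
```

Sub-second precision is lost, since touch -t only takes whole seconds.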

Install the GNU Coreutils on your Mac and you can stop bothering about incompatibilities. It is explained here how to do it.

Find the current shell of the user using a shell script [duplicate]

How can I determine the current shell I am working on?
Would the output of the ps command alone be sufficient?
How can this be done in different flavors of Unix?

There are three approaches to finding the name of the current shell's executable:
Please note that all three approaches can be fooled if the executable of the shell is /bin/sh, but it's really a renamed bash, for example (which frequently happens).
Thus your second question of whether ps output will do is answered with "not always".
echo $0 - will print the program name... which in the case of the shell is the actual shell.
ps -ef | grep $$ | grep -v grep - this will look for the current process ID in the list of running processes. Since the current process is the shell, it will be included.
This is not 100% reliable, as you might have other processes whose ps listing includes the same number as shell's process ID, especially if that ID is a small number (for example, if the shell's PID is "5", you may find processes called "java5" or "perl5" in the same grep output!). This is the second problem with the "ps" approach, on top of not being able to rely on the shell name.
echo $SHELL - The path to the current shell is stored as the SHELL variable for any shell. The caveat for this one is that if you launch a shell explicitly as a subprocess (for example, it's not your login shell), you will get your login shell's value instead. If that's a possibility, use the ps or $0 approach.
If, however, the executable doesn't match your actual shell (e.g. /bin/sh is actually bash or ksh), you need heuristics. Here are some environmental variables specific to various shells:
$version is set on tcsh
$BASH is set on bash
$shell (lowercase) is set to actual shell name in csh or tcsh
$ZSH_NAME is set on zsh
ksh has $PS3 and $PS4 set, whereas the normal Bourne shell (sh) only has $PS1 and $PS2 set. This generally seems like the hardest to distinguish - the only difference in the entire set of environment variables between sh and ksh we have installed on Solaris boxen is $ERRNO, $FCEDIT, $LINENO, $PPID, $PS3, $PS4, $RANDOM, $SECONDS, and $TMOUT.
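The variable checks above can be strung together into a small function. A rough sketch that only covers the two Bourne-style shells whose variables are listed ($BASH and $ZSH_NAME); the function name guess_shell is made up, and anything else is simply reported as unknown:

```shell
# Heuristic only: these variables can be set by hand in any shell,
# so treat the answer as a hint rather than a guarantee.
guess_shell() {
    if [ -n "${BASH:-}" ]; then
        echo bash
    elif [ -n "${ZSH_NAME:-}" ]; then
        echo zsh
    else
        echo unknown
    fi
}
```

The csh/tcsh variables ($version, $shell) are left out because this syntax would not run there anyway.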

ps -p $$
should work anywhere that the solutions involving ps -ef and grep do (on any Unix variant which supports POSIX options for ps) and will not suffer from the false positives introduced by grepping for a sequence of digits which may appear elsewhere.

Try
ps -p $$ -oargs=
or
ps -p $$ -ocomm=
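For use in a script, the second form can be wrapped in a function that also trims the decoration some systems add. A small sketch, assuming a ps that supports the POSIX -p and -o options; the name current_shell is made up for illustration:

```shell
current_shell() {
    # $$ still expands to the main shell's PID inside $(...).
    name=$(ps -p "$$" -o comm=) || return 1
    name=${name#-}                 # login shells may be shown as e.g. "-bash"
    printf '%s\n' "${name##*/}"    # drop a leading path, if any
}
```

As noted elsewhere in this thread, this reports the executable's name, which can still be sh when /bin/sh is really another shell.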

If you just want to ensure the user is invoking a script with Bash:
if [ -z "$BASH" ]; then echo "Please run this script $0 with bash"; exit; fi
or
if [ -z "$BASH" ]; then exec bash "$0" "$@"; exit; fi

You can try:
ps | grep `echo $$` | awk '{ print $4 }'
Or:
echo $SHELL

$SHELL need not always show the current shell. It only reflects the default shell to be invoked.
To test the above, say bash is the default shell: try echo $SHELL, then in the same terminal start some other shell (KornShell (ksh), for example) and try echo $SHELL again. You will see the result as bash in both cases.
To get the name of the current shell, use cat /proc/$$/cmdline, and to get the path to the shell executable, use readlink /proc/$$/exe.
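One detail worth knowing: the arguments in /proc/$$/cmdline are separated by NUL bytes, so plain cat runs them together. A small sketch for Linux only (it assumes /proc is mounted; the function names are made up for illustration):

```shell
# Print the shell's command line with the NUL separators turned into spaces.
shell_cmdline() {
    tr '\0' ' ' < "/proc/$$/cmdline"
    echo
}

# Print the path of the executable the shell process is actually running.
shell_exe() {
    readlink "/proc/$$/exe"
}
```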

There are many ways to find out the shell and its corresponding version. Here are a few which worked for me.
Straightforward
$> echo $0 (Gives you the program name. In my case the output was -bash.)
$> $SHELL (This takes you into the shell and in the prompt you get the shell name and version. In my case bash3.2$.)
$> echo $SHELL (This will give you executable path. In my case /bin/bash.)
$> $SHELL --version (This will give complete info about the shell software with license type)
Hackish approach
$> ******* (Type a set of random characters and in the output you will get the shell name. In my case -bash: chapter2-a-sample-isomorphic-app: command not found)

ps is the most reliable method. The SHELL environment variable is not guaranteed to be set and even if it is, it can be easily spoofed.

I have a simple trick to find the current shell. Just type a random string (which is not a command). It will fail and return a "not found" error, but at start of the line it will say which shell it is:
ksh: aaaaa: not found [No such file or directory]
bash: aaaaa: command not found

I have tried many different approaches and the best one for me is:
ps -p $$
It also works under Cygwin and, unlike PID grepping, cannot produce false positives. With some cleaning, it outputs just an executable name (under Cygwin with path):
ps -p $$ | tail -1 | awk '{print $NF}'
You can create a function so you don't have to memorize it:
# Print currently active shell
shell () {
ps -p $$ | tail -1 | awk '{print $NF}'
}
...and then just execute shell.
It was tested under Debian and Cygwin.

The following will always give the actual shell used - it gets the name of the actual executable and not the shell name (i.e. ksh93 instead of ksh, etc.). For /bin/sh, it will show the actual shell used, i.e. dash.
ls -l /proc/$$/exe | sed 's%.*/%%'
I know that there are many who say the ls output should never be processed, but what is the probability you'll have a shell you are using that is named with special characters or placed in a directory named with special characters? If this is still the case, there are plenty of other examples of doing it differently.
As pointed out by Toby Speight, this would be a more proper and cleaner way of achieving the same:
basename $(readlink /proc/$$/exe)

My variant on printing the parent process:
ps -p $$ | awk '$1 == PP {print $4}' PP=$$
Don't run unnecessary applications when AWK can do it for you.

Provided that your /bin/sh supports the POSIX standard and your system has the lsof command installed - a possible alternative to lsof could in this case be pid2path - you can also use (or adapt) the following script that prints full paths:
#!/bin/sh
# cat /usr/local/bin/cursh
set -eu
pid="$$"
set -- sh bash zsh ksh ash dash csh tcsh pdksh mksh fish psh rc scsh bournesh wish Wish login
unset echo env sed ps lsof awk getconf
# getconf _POSIX_VERSION # reliable test for availability of POSIX system?
PATH="`PATH=/usr/bin:/bin:/usr/sbin:/sbin getconf PATH`"
[ $? -ne 0 ] && { echo "'getconf PATH' failed"; exit 1; }
export PATH
cmd="lsof"
env -i PATH="${PATH}" type "$cmd" 1>/dev/null 2>&1 || { echo "$cmd not found"; exit 1; }
awkstr="`echo "$@" | sed 's/\([^ ]\{1,\}\)/|\/\1/g; s/ /$/g' | sed 's/^|//; s/$/$/'`"
ppid="`env -i PATH="${PATH}" ps -p $pid -o ppid=`"
[ "${ppid}"X = ""X ] && { echo "no ppid found"; exit 1; }
lsofstr="`lsof -p $ppid`" ||
{ printf "%s\n" "lsof failed" "try: sudo lsof -p \`ps -p \$\$ -o ppid=\`"; exit 1; }
printf "%s\n" "${lsofstr}" |
LC_ALL=C awk -v var="${awkstr}" '$NF ~ var {print $NF}'

My solution:
ps -o command | grep -v -e "\<ps\>" -e grep -e tail | tail -1
This should be portable across different platforms and shells. It uses ps like other solutions, but it doesn't rely on sed or awk and filters out junk from piping and ps itself so that the shell should always be the last entry. This way we don't need to rely on non-portable PID variables or picking out the right lines and columns.
I've tested on Debian and macOS with Bash, Z shell (zsh), and fish (which doesn't work with most of these solutions without changing the expression specifically for fish, because it uses a different PID variable).

If you just want to check that you are running (a particular version of) Bash, the best way to do so is to use the $BASH_VERSINFO array variable. As a (read-only) array variable it cannot be set in the environment, so you can be sure it is coming (if at all) from the current shell.
However, since Bash has a different behavior when invoked as sh, you do also need to check the $BASH environment variable ends with /bash.
In a script I wrote that uses function names with - (not underscore), and depends on associative arrays (added in Bash 4), I have the following sanity check (with helpful user error message):
case `eval 'echo $BASH#${BASH_VERSINFO[0]}' 2>/dev/null` in
*/bash#[456789])
# Claims bash version 4+, check for func-names and associative arrays
if ! eval "declare -A _ARRAY && func-name() { :; }" 2>/dev/null; then
echo >&2 "bash $BASH_VERSION is not supported (not really bash?)"
exit 1
fi
;;
*/bash#[123])
echo >&2 "bash $BASH_VERSION is not supported (version 4+ required)"
exit 1
;;
*)
echo >&2 "This script requires BASH (version 4+) - not regular sh"
echo >&2 "Re-run as \"bash $CMD\" for proper operation"
exit 1
;;
esac
You could omit the somewhat paranoid functional check for features in the first case, and just assume that future Bash versions would be compatible.

None of the answers worked with fish shell (it doesn't have the variables $$ or $0).
This works for me (tested on sh, bash, fish, ksh, csh, true, tcsh, and zsh; openSUSE 13.2):
ps | tail -n 4 | sed -E '2,$d;s/.* (.*)/\1/'
This command outputs a string like bash. Here I'm only using ps, tail, and sed (without GNU extensions; try to add --posix to check it). They are all standard POSIX commands. I'm sure tail can be removed, but my sed fu is not strong enough to do this.
It seems to me, that this solution is not very portable as it doesn't work on OS X. :(

echo $$ # Gives the process ID of the current shell
ps -ef | grep $$ | awk '{print $8}' # Use the PID to see what the process is.
From How do you know what your current shell is?.

This is not a very clean solution, but it does what you want.
# MUST BE SOURCED..
getshell() {
local shell="`ps -p $$ | tail -1 | awk '{print $4}'`"
shells_array=(
# It is important that the shells are listed in descending order of their name length.
pdksh
bash dash mksh
zsh ksh
sh
)
local suited=false
for i in ${shells_array[*]}; do
if ! [ -z `printf $shell | grep $i` ] && ! $suited; then
shell=$i
suited=true
fi
done
echo $shell
}
getshell
Now you can use $(getshell) --version.
This works, though, only on KornShell-like shells (ksh).

Do the following to know whether your shell is using Dash/Bash.
ls -la /bin/sh:
if the result is /bin/sh -> /bin/bash ==> Then your shell is using Bash.
if the result is /bin/sh ->/bin/dash ==> Then your shell is using Dash.
If you want to change from Bash to Dash or vice-versa, use the below code:
ln -s /bin/bash /bin/sh (change shell to Bash)
Note: If the above command results in an error saying /bin/sh already exists, remove /bin/sh and try again.

I like Nahuel Fouilleul's solution particularly, but I had to run the following variant of it on Ubuntu 18.04 (Bionic Beaver) with the built-in Bash shell:
bash -c 'shellPID=$$; ps -ocomm= -q $shellPID'
Without the temporary variable shellPID, e.g. the following:
bash -c 'ps -ocomm= -q $$'
Would just output ps for me. Maybe you aren't all using non-interactive mode, and that makes a difference.

Get it with the $SHELL environment variable. A simple sed could remove the path:
echo $SHELL | sed -E 's/^.*\/([a-zA-Z]+$)/\1/g'
Output:
bash
It was tested on macOS, Ubuntu, and CentOS.
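The path can also be stripped without sed, using POSIX parameter expansion. A small sketch (the function name shell_basename is made up for illustration); like any $SHELL-based approach, it reports the login shell, not necessarily the one currently running:

```shell
# Remove everything up to and including the last "/" from the given path.
shell_basename() {
    printf '%s\n' "${1##*/}"
}

# Typical use:
#   shell_basename "$SHELL"
```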

On Mac OS X (and FreeBSD):
ps -p $$ -axco command | sed -n '$p'

Grepping PID from the output of "ps" is not needed, because you can read the respective command line for any PID from the /proc directory structure:
echo $(cat /proc/$$/cmdline)
However, that might not be any better than just simply:
echo $0
About running an actually different shell than the name indicates, one idea is to request the version from the shell using the name you got previously:
<some_shell> --version
sh seems to fail with exit code 2 while others give something useful (but I am not able to verify all since I don't have them):
$ sh --version
sh: 0: Illegal option --
echo $?
2

One way is:
ps -p $$ -o exe=
which is IMO better than using -o args or -o comm as suggested in another answer (these may use, e.g., some symbolic link like when /bin/sh points to some specific shell as Dash or Bash).
The above returns the path of the executable, but beware that due to /usr-merge, one might need to check for multiple paths (e.g., /bin/bash and /usr/bin/bash).
Also note that the above is not fully POSIX-compatible (POSIX ps doesn't have exe).

Kindly use the below command:
ps -p $$ | tail -1 | awk '{print $4}'

This one works well on Red Hat Linux (RHEL), macOS, BSD and some AIXes:
ps -T $$ | awk 'NR==2{print $NF}'
alternatively, the following one should also work if pstree is available,
pstree | egrep $$ | awk 'NR==2{print $NF}'

You can use echo $SHELL|sed "s/\/bin\///g"

And I came up with this:
sed 's/.*SHELL=//; s/[[:upper:]].*//' /proc/$$/environ

How to run matlab code in linux as script file?

I am looking into running matlab script in Linux similar to bash/python scripts. I.e., a matlab script that can be run as an application.

You can get a similar effect without your custom mash script by adding the following header to the files you want to be executable:
#!/usr/bin/bash
/path/to/matlab -r "$(sed -n -e '4,$p' < "$0")"
exit $?
If you want matlab to terminate after executing the script, as in your example, you could replace the second line with
sed -n -e '4,$p' < "$0" | /path/to/matlab
The idea here is to execute a bash command that simply clips off the header of the script, and passes the rest along to matlab.

Here is the implementation I came up with -
Create /usr/bin/mash script file containing the following lines -
#!/bin/bash
grep -ve '^\(#!\|^\s*$\)' ${@: -1} | ${@: 1:$#-1}
exit $?
Make mash script executable -
$ chmod +x /usr/bin/mash
Write matlab script file called test.msh
#!/usr/bin/mash /usr/local/MATLAB/R2012a/bin/matlab -nodisplay
format long
a = 2*pi % matlab commands ...
Make test.msh script executable -
$ chmod +x test.msh
Run test.msh
$ ./test.msh
...
>> >> a =
6.283185307179586

Why does bash behave differently, when it is called as sh?

I have an ubuntu machine with default shell set to bash and both ways to the binary in $PATH:
$ which bash
/bin/bash
$ which sh
/bin/sh
$ ll /bin/sh
lrwxrwxrwx 1 root root 4 Mar 6 2013 /bin/sh -> bash*
But when I try to call a script that uses the inline file descriptor (that only bash can handle, but not sh) both calls behave differently:
$ . ./inline-pipe
reached
$ bash ./inline-pipe
reached
$ sh ./inline-pipe
./inline-pipe: line 6: syntax error near unexpected token `<'
./inline-pipe: line 6: `done < <(echo "reached")'
The example-script I am referring to looks like that
#!/bin/sh
while read line; do
if [[ "$line" == "reached" ]]; then echo "reached"; fi
done < <(echo "reached")
the real one is a little bit longer:
#!/bin/sh
declare -A elements
while read line
do
for ele in $(echo $line | grep -o "[a-z]*:[^ ]*")
do
id=$(echo $ele | cut -d ":" -f 1)
elements["$id"]=$(echo $ele | cut -d ":" -f 2)
done
done < <(adb devices -l)
echo ${elements[*]}

When bash is invoked as sh, it (mostly) restricts itself to features found in the POSIX standard. Process substitution is not one of those features, hence the error.
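If the script has to keep its #!/bin/sh line, the loop can be fed without process substitution. One possible rewrite (not the only one) uses a here-document wrapping a command substitution, which POSIX shells accept; printf stands in here for the real command, and the function name is made up for illustration:

```shell
# Count matching lines. The loop runs in the current shell (no pipe),
# so the variable n is still set after the loop finishes.
count_reached() {
    n=0
    while read -r line; do
        if [ "$line" = "reached" ]; then
            n=$((n + 1))
        fi
    done <<EOF
$(printf '%s\n' "reached" "other" "reached")
EOF
    echo "$n"
}
```

Note that [[ ... ]] and declare -A from the original scripts are bash extensions too, so the longer script is probably better off with a #!/bin/bash line instead.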

In theory, this is a feature of bash: if you call it as "sh", it switches off many of its extensions by default. And the root shell is "/bin/sh" by default.
Its primary goal is security. The secondary goal is to produce some level of compatibility between the shells of the system, because it enables the system scripts to run in an alternate (faster? more secure?) environment.
This is the theory.
In practice, there are always people in a development team who want to reduce and eliminate everything, with various arguments (security, simplicity, safety, stability), but these arguments somehow always point in the direction of removal and deletion.
This is why the bash in Debian doesn't have network sockets, why Debian wasn't able in 20 years to properly integrate the best compressors (bz2, xz), and why the root shell is by default as primitive as that of a PDP-11 of the eighties.

I believe sh on ubuntu is actually dash which is smaller than bash with fewer features.

How do I know if I'm running a nested shell?

When using a *nix shell (usually bash), I often spawn a sub-shell with which I can take care of a small task (usually in another directory), then exit out of to resume the session of the parent shell.
Once in a while, I'll lose track of whether I'm running a nested shell, or in my top-level shell, and I'll accidentally spawn an additional sub-shell or exit out of the top-level shell by mistake.
Is there a simple way to determine whether I'm running in a nested shell? Or am I going about my problem (by spawning sub-shells) in a completely wrong way?

The $SHLVL variable tracks your shell nesting level:
$ echo $SHLVL
1
$ bash
$ echo $SHLVL
2
$ exit
$ echo $SHLVL
1
As an alternative to spawning sub-shells you could push and pop directories from the stack and stay in the same shell:
[root@localhost /old/dir]# pushd /new/dir
/new/dir /old/dir
[root@localhost /new/dir]# popd
/old/dir
[root@localhost /old/dir]#
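For a single command, a different technique from pushd/popd also avoids an interactive sub-shell: run the command inside parentheses, so the directory change is undone automatically when the parentheses close. A minimal sketch (the function name run_in_dir is made up for illustration):

```shell
# Run a command in another directory without changing the caller's directory.
# Usage: run_in_dir DIRECTORY COMMAND [ARGS...]
run_in_dir() {
    ( cd "$1" && shift && "$@" )
}
```

For example, run_in_dir /new/dir ls -l lists /new/dir and leaves you in /old/dir.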

Here is a simplified version of part of my prompt:
PS1='$(((SHLVL>1))&&echo $SHLVL)\$ '
If I'm not in a nested shell, it doesn't add anything extra, but it shows the depth if I'm in any level of nesting.
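The same test can be used outside the prompt. A small sketch of a function that reports the depth in words; the name nesting_note and the optional argument (which exists only to make it easy to try out) are made up for illustration:

```shell
# Report whether the given level (default: $SHLVL) means a nested shell.
nesting_note() {
    level=${1:-${SHLVL:-1}}
    if [ "$level" -gt 1 ]; then
        echo "nested shell, depth $level"
    else
        echo "top-level shell"
    fi
}
```

Keep in mind that a terminal started from a desktop session or a multiplexer may already begin at a level above 1.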

Look at $0: if it starts with a minus -, you're in the login shell.

pstree -s $$ is quite useful to see your depth.

The environment variable $SHLVL contains the shell "depth".
echo $SHLVL
The shell depth can also be determined using pstree (version 23 and above):
pstree -s $$ | grep sh- -o | wc -l
I've found the second way to be more robust than the first whose value was reset when using sudo or became unreliable with env -i.
None of them can correctly deal with su.
The information can be made available in your prompt:
PS1='\u@\h/${SHLVL} \w \$ '
PS1='\u@\h/$(pstree -s $$ | grep sh- -o | tail -n +2 | wc -l) \w \$ '
The | tail -n +2 is there to remove one line from the grep output. Since we are using a pipeline inside a "$(...)" command substitution, the shell needs to invoke a sub-shell, so pstree reports it and grep detects one more sh- level.
In debian-based distributions, pstree is part of the package psmisc. It might not be installed by default on non-desktop distributions.

As @John Kugelman says, echo $SHLVL will tell you the bash shell depth.
And as @Dennis Williamson shows, you can edit your prompt via the PS1 variable to get it to print this value.
I prefer that it always prints the shell depth value, so here's what I've done: edit your "~/.bashrc" file:
gedit ~/.bashrc
and add the following line to the end:
export PS1='\$SHLVL'":$SHLVL\n$PS1"
Now you will always see a printout of your current bash level just above your prompt. Ex: here you can see I am at a bash level (depth) of 2, as indicated by the $SHLVL:2:
$SHLVL:2
7510-gabriels ~ $
Now, watch the prompt as I go down into some bash levels via the bash command, then come back up via exit. Here you see my commands and prompt (response), starting at level 2 and going down to 5, then coming back up to level 2:
$SHLVL:2
7510-gabriels ~ $ bash
$SHLVL:3
7510-gabriels ~ $ bash
$SHLVL:4
7510-gabriels ~ $ bash
$SHLVL:5
7510-gabriels ~ $ exit
exit
$SHLVL:4
7510-gabriels ~ $ exit
exit
$SHLVL:3
7510-gabriels ~ $ exit
exit
$SHLVL:2
7510-gabriels ~ $
Bonus: always show the git branch you are currently on in your terminal too!
Make your prompt also show you your git branch you are working on by using the following in your "~/.bashrc" file instead:
git_show_branch() {
__gsb_BRANCH=$(git symbolic-ref -q --short HEAD 2>/dev/null)
if [ -n "$__gsb_BRANCH" ]; then
echo "$__gsb_BRANCH"
fi
}
export PS1="\e[7m\$(git_show_branch)\e[m\n\h \w $ "
export PS1='\$SHLVL'":$SHLVL $PS1"
Source: I have no idea where git_show_branch() originally comes from, but I got it from Jason McMullan on 5 Apr. 2018. I then added the $SHLVL part shown above just last week.
Sample output:
$SHLVL:2 master
7510-gabriels ~/GS/dev/temp $
And here's a screenshot showing it in all its glory. Notice the git branch name, master, highlighted in white!
Update to the Bonus section
I've improved it again and put my ~/.bashrc file on github here. Here's a sample output of the new terminal prompt. Notice how it shows the shell level as 1, and it shows the branch name of the currently-checked-out branch (master in this case) whenever I'm inside a local git repo!:
Cross-referenced:
Output of git branch in tree like fashion

ptree $$ will also show you how many levels deep you are

If you are running inside a sub-shell, the following code will yield 2:
ps | fgrep bash | wc -l
Otherwise, it will yield 1.
EDIT: OK, it's not such a robust approach, as was pointed out in the comments :)
Another thing to try is
ps -ef | awk '{print $2, " ", $8;}' | fgrep $PPID
will yield 'bash' if you are in a sub-shell.
