Where is $PATH in a system() declared - linux

I had something weird happen on my computer.
I had gperf installed under /usr/local/bin.
As mentioned in a question I asked here, I had a Perl script running on my computer that contains a system() call to gperf with some flags, something like this:
perl file:
system("gperf ...") == 0 || die "calling gperf failed: $?";
However, no matter how hard I tried, gperf would not run, and the script printed the failure message.
To debug, I tried something like
system("echo \$PATH") == 0 || die "calling gperf failed: $?";
and found that the output does not contain /usr/local/bin/, where I installed gperf; it only contains /usr/bin, where gperf is not installed.
So the $PATH is wrong...
So I googled around and saw that system() is the same as running the command through /bin/sh, so I started /bin/sh and ran echo $PATH, which, to my disbelief, did contain /usr/local/bin/.
So my question is: where is the $PATH for a system() call declared? Why is it different from the one inside a Bourne shell?

The PATH used by commands launched via system is the same as the one in the perl script, accessible through $ENV{PATH}. It's the PATH that the perl script inherits from the program that called it, unless you changed it in the script.
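The same inheritance rule can be seen from a plain shell. The following is only a minimal sketch, not something from the script in question: it starts a child /bin/sh with a chosen PATH and lets the child report what it sees. The directory /opt/demo/bin is a made-up value used purely for illustration.

```shell
#!/bin/sh
# Hypothetical PATH value; /opt/demo/bin is made up for illustration.
parent_path="/opt/demo/bin:/usr/bin:/bin"

# Start a child shell with that PATH and let it report what it sees.
# The child does not look PATH up anywhere: it inherits the value
# handed over by the process that started it.
child_path=$(PATH="$parent_path" /bin/sh -c 'printf "%s" "$PATH"')

echo "child sees: $child_path"
```

If the child reports a PATH without /usr/local/bin, the place to look is whatever started the parent process, not the child.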
What's biting you is probably that you set up your PATH in the wrong configuration file. Define it in ~/.profile, /etc/profile or other system-wide file, not in a shell configuration file such as .bashrc. See this question for some general information.
If you want to set the path manually inside the perl script, you can use something like
$ENV{PATH} = "/usr/local/bin:$ENV{PATH}" unless ":$ENV{PATH}:" =~ m~:/usr/local/bin:~;
but this is probably a bad idea: in most cases, your script should not modify the path chosen by the user who runs that script.
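For comparison, the same "prepend only if missing" check can be sketched in POSIX shell. The function name prepend_once is my own invention for this example, and the same caveat applies: changing the user's PATH is rarely the right fix.

```shell
#!/bin/sh
# prepend_once DIR PATHLIST   (hypothetical helper name)
# Print PATHLIST with DIR added at the front, unless DIR is already listed.
prepend_once() {
  case ":$2:" in
    *":$1:"*) printf '%s\n' "$2" ;;      # already present: leave unchanged
    *)        printf '%s\n' "$1:$2" ;;   # missing: put DIR first
  esac
}

# Usage: update PATH in the current shell.
PATH=$(prepend_once /usr/local/bin "$PATH")
export PATH
```

Wrapping the list in colons before matching avoids false hits on directories that merely share a prefix, such as /usr/local/bin2.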
If you're having trouble finding the right place to set PATH on your system after reading the question I linked to and the questions linked in my answer there, ask on Unix & Linux and be sure to state the details of your operating system (distribution, version, etc.) and how you log in (this is a user question, not a programming question).

On a linux system using BASH as the shell, the PATH is set at login time from the user's .bash_profile file in their home directory. You can append the /usr/local/bin directory with a line like this at the end of the file:
PATH=$PATH:/usr/local/bin
The other (probably more reliable) way to fix it is to use the absolute path in your system call, like this:
system("/usr/local/bin/gperf")

Related

Getting bash script to update parent shell's Environment

I am attempting to write a bash command line tool that is usable immediately after installation, i.e. in the same shell in which its installation script was called. Let's say install-script.sh (designed for Ubuntu) looks like:
# Get the script's absolute path:
pushd `dirname $0` > /dev/null
SCRIPTPATH=`pwd`
popd > /dev/null
# Add lines to bash.bashrc to export the environment variable:
echo "SCRIPT_HOME=${SCRIPTPATH}" >> /etc/bash.bashrc
echo "export SCRIPT_HOME" >> /etc/bash.bashrc
# Create a new command:
cp ${SCRIPTPATH}/newcomm /usr/bin
chmod a+x /usr/bin/newcomm
The idea is that the new command newcomm uses the SCRIPT_HOME environment variable to reference the main script - which is also in SCRIPTPATH:
exec "${SCRIPT_HOME}/main-script.sh"
Now, the updated bash.bashrc hasn't been loaded into the parent shell yet. Worse, I cannot source it from within the script - which is running in a child shell. Using export to change SCRIPT_HOME in the parent shell would at best be duct-taping the issue, but even this is impossible. Also note that the installation script needs to be run using sudo so it cannot be called from the parent shell using source.
It should be possible since package managers like apt do it. Is there a robust way to patch up my approach? How is this usually done, and is there a good guide to writing bash installers?
You can't. Neither can apt.
A package manager will instead just write required data/variables to a file, which are read either by the program itself, by a patch to the program, or by a wrapper.
Good examples can be found in /etc/default/*. These are files with variable definitions, and some even helpfully describe where they're sourced from:
$ cat /etc/default/ssh
# Default settings for openssh-server. This file is sourced by /bin/sh from
# /etc/init.d/ssh.
# Options to pass to sshd
SSHD_OPTS=
You'll notice that none of the options are set in your current shell after installing a package, since programs get them straight from the files in one way or another.
The only way to modify the current shell is to source a script. That's unavoidable, so start there. Write a script that is sourced. That script will in turn call your current script.
Your current script will need to communicate with the sourced one to tell it what to change. A common way is to echo variable assignments that can be directly executed by the caller. For instance:
printf 'export SCRIPT_HOME=%q\n' "$SCRIPTPATH"
Using printf with %q ensures any special characters will be escaped properly.
Then have the sourced script eval the inner script.
eval "$(sudo install-script.sh)"
If you want to hide the sourcing of the top script, you could put it behind an alias or shell function.
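Put together, the print-then-eval pattern can be sketched as below. This assumes Bash, because %q is a Bash extension to printf. The function emit_env is a hypothetical stand-in for the inner install script, and the path with a space in it is made up to show that the quoting survives the round trip.

```shell
#!/usr/bin/env bash
# emit_env is a hypothetical stand-in for the inner installer script:
# it only prints a shell command and changes nothing itself.
emit_env() {
  printf 'export SCRIPT_HOME=%q\n' "$1"
}

# The calling shell evaluates the printed text, so the variable
# ends up in the caller's own environment.
eval "$(emit_env '/opt/my tool')"

echo "SCRIPT_HOME is: $SCRIPT_HOME"
```

In the real setup, everything the inner script writes to standard output gets evaluated, so progress messages would have to go to standard error instead.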

Import PATH environment variable into Bash script launched with cron

When creating Bash scripts, I have always had a line right at the start defining the PATH environment variable. I recently discovered that this doesn't make the script very portable as the PATH variable is different for different versions of Linux (in my case, I moved the script from Arch Linux to Ubuntu and received errors as various executables weren't in the same places).
Is it possible to copy the PATH environment variable defined by the login shell into the current Bash script?
EDIT:
I see that my question has caused some confusion resulting in some thinking that I want to change the PATH environment variable of the login shell with a bash script, which is the exact opposite of what I want.
This is what I currently have at the top of one of my Bash scripts:
#!/bin/bash
PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl
# Test if an internet connection is present
wget -O /dev/null google.com
I want to replace that second line with something that copies the value of PATH from the login shell into the script environment:
#!/bin/bash
PATH=$(command that copies value of PATH from login shell)
# Test if an internet connection is present
wget -O /dev/null google.com
EDIT 2: Sorry for the big omission on my part. I forgot to mention that the scripts in question are being run on a schedule through cron. Cron creates its own environment for running the scripts, which neither uses nor modifies the environment variables of the login shell. I just tried running the following script in cron:
#!/bin/bash
echo $PATH >> /home/user/output.txt
The result is as follows. As you can see, the PATH variable used by cron is different to the login shell:
user@ubuntu_router:~$ cat output.txt
/usr/bin:/bin
user@ubuntu_router:~$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
Don't touch the user's PATH at all unless you have a specific reason. Not doing anything will (basically) accomplish what you ask.
You don't have to do anything to get the user's normal PATH since every process inherits the PATH and all other environment variables automatically.
If you need to add something nonstandard to the PATH, the usual approach is to prepend (or append) the new directory to the user's existing PATH, like so:
PATH=/opt/your/random/dir:$PATH
The environment of cron jobs is pretty close to the system's "default" (for some definition of "default") though interactive shells may generally run with a less constrained environment. But again, the fix for that is to add any missing directories to the current value at the beginning of the script. Adding directories which don't exist on this particular system is harmless, as is introducing duplicate directories.
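To see how a script behaves under such a sparse environment without waiting for cron, one option is to start it with env -i. This is only a rough approximation of what cron does; the PATH value is the one reported in the question and may differ on other systems.

```shell
#!/bin/sh
# Run a command with an almost empty environment, similar to what cron provides.
# /usr/bin:/bin is the PATH cron reported in the question; adjust as needed.
cron_like_path=$(env -i PATH=/usr/bin:/bin /bin/sh -c 'printf "%s" "$PATH"')

echo "PATH inside the sparse environment: $cron_like_path"
```

Replacing the inline command with the path to your own script shows which executables it can no longer find.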
I've managed to find the answer to my question:
PATH=$PATH:$(sed -n '/PATH=/s/^.*=// ; s/\"//gp' '/etc/environment')
This command will grab the value assigned to PATH by Linux from the environment file and append it to the PATH used by Cron.
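A slightly stricter variation of that extraction is sketched below; it is my own rewrite, not the command above. It anchors the match at the start of the line, so a line such as MANPATH=... is not picked up, and it is demonstrated on a throwaway file rather than the real /etc/environment. The function name path_from_env_file is made up.

```shell
#!/bin/sh
# path_from_env_file FILE   (hypothetical helper name)
# Print the value of the first line of the form PATH=... or PATH="...".
path_from_env_file() {
  sed -n 's/^PATH="\{0,1\}\([^"]*\)"\{0,1\}$/\1/p' "$1" | head -n 1
}

# Demonstration on a temporary file that imitates /etc/environment.
sample=$(mktemp)
printf '%s\n' 'LANG="en_US.UTF-8"' 'PATH="/usr/local/bin:/usr/bin:/bin"' > "$sample"
extracted=$(path_from_env_file "$sample")
rm -f "$sample"

echo "$extracted"
```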
I used the following resources to help find the answer:
How to grep for contents after pattern?
https://help.ubuntu.com/community/EnvironmentVariables#System-wide_environment_variables

Setting path variables and running Ruby script

This is my first time working with a Ruby script, and, in order to run this script, I have to first cd into the root of the project, which is /usr/local/bin/youtube-multiple-dl and then execute the script as bin/youtube-multiple-dl.
I tried setting the PATH variable
echo 'export PATH="$HOME/youtube-multiple-dl/bin:$PATH"' >> ~/.bash_profile
in hopes that I can run this from anywhere on the machine without having to cd to the project's root, however, no luck with that so far.
System: Ubuntu 15.04 server
Script Repo
My current way of executing the script is:
root@box15990:~# cd /usr/local/bin/youtube-multiple-dl
root@box15990:/usr/local/bin/youtube-multiple-dl# bin/youtube-multiple-dl
Desired way of executing script:
root@box15990:~# youtube-multiple-dl
How can I properly set the environment path for this script in order to run it from anywhere?
echo 'export PATH="$HOME/youtube-multiple-dl/bin:$PATH"' >> ~/.bash_profile
isn't how we set a PATH entry.
The PATH is a list of directories to be searched, not a list of files.
Typically, the PATH should contain something like:
/usr/local/bin:/usr/bin
somewhere in it.
If it doesn't, then you want to modify it using a text editor, such as nano, pico or vim using one of these commands:
nano ~/.bash_profile
pico ~/.bash_profile
vim ~/.bash_profile
You probably want one of the first two over vim, as vim, while being extremely powerful and one of the most-used editors in the world, is not overly intuitive if you're not used to it. You can use man nano or man pico to learn about the other two.
Once you're in your file editor, scroll to the bottom and remove the line you added. Then find the /usr/bin section in your PATH and add /usr/local/bin: before it. The colon (:) is the delimiter between directories. That change tells the shell to look in /usr/local/bin before /usr/bin, so that anything you added to /usr/local/bin will be found before the system-installed code in /usr/bin.
It's possible that there isn't a PATH statement in the file. If you don't see one, simply add:
export PATH=/usr/local/bin:$PATH
After modifying your ~/.bash_profile, save the file and exit the editor, and then restart your shell. You can do that by exiting and re-opening a terminal window, or by running:
exec $SHELL
at the command-line.
At that point, running:
echo $PATH
should reflect the change to your path.
To confirm that the change is in effect, you can run:
which youtube-multiple-dl
and you should get back:
/usr/local/bin/youtube-multiple-dl
At that point you should be able to run:
youtube-multiple-dl -h
and get back a response showing the built-in help. This is because the shell will search the path, starting with the first defined directory, and continue until it exhausts the list, and will execute the first file matching that name.
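The search described above can be imitated in a few lines of POSIX shell. This is a simplified sketch of the idea, not how any particular shell implements it: real shells also consult aliases, functions, builtins, and a lookup cache first. The function name find_in_path is made up.

```shell
#!/bin/sh
# find_in_path NAME PATHLIST   (hypothetical helper name)
# Print the first executable regular file called NAME found in PATHLIST,
# scanning the colon-separated directories from left to right.
# The body runs in a subshell, so the IFS and globbing changes do not leak out.
find_in_path() (
  set -f                      # do not expand wildcard characters in directory names
  IFS=:
  for dir in $2; do
    [ -n "$dir" ] || dir=.    # an empty entry means the current directory
    if [ -f "$dir/$1" ] && [ -x "$dir/$1" ]; then
      printf '%s\n' "$dir/$1"
      exit 0
    fi
  done
  exit 1
)

# Usage: where would "sh" be found with the current PATH?
find_in_path sh "$PATH"
```

Note that the loop only ever tests directory/name combinations, which is why a PATH entry has to name the directory that contains the executable.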
Because of the difficulties you're having, I'd strongly recommend reading some tutorials about managing a *nix system. It's not overly hard to learn the basics, and having an understanding of how the shell finds files and executes them is essential for anyone programming a scripting language like Ruby, Python, Perl, etc. We're using the OS constantly, installing files for system and user's use, and doing so correctly and safely is very important for the security and stability of the machine.

What exactly does "/usr/bin/env node" do at the beginning of node files?

I have seen the line #!/usr/bin/env node at the beginning of some Node.js examples, and I googled without finding anything that explains the reason for it.
The nature of the words makes searching for it difficult.
I've read some JavaScript and Node.js books recently, and I don't remember seeing it in any of them.
If you want an example, you could see the RabbitMQ official tutorial, they have it in almost all of their examples, here is one of them:
#!/usr/bin/env node
var amqp = require('amqplib/callback_api');
amqp.connect('amqp://localhost', function(err, conn) {
  conn.createChannel(function(err, ch) {
    var ex = 'logs';
    var msg = process.argv.slice(2).join(' ') || 'Hello World!';
    ch.assertExchange(ex, 'fanout', {durable: false});
    ch.publish(ex, '', new Buffer(msg));
    console.log(" [x] Sent %s", msg);
  });
  setTimeout(function() { conn.close(); process.exit(0) }, 500);
});
Could someone explain to me what this line means?
What is the difference if I put or remove this line? In what cases do I need it?
#!/usr/bin/env node is an instance of a shebang line: the very first line in an executable plain-text file on Unix-like platforms that tells the system what interpreter to pass that file to for execution, via the command line following the magic #! prefix (called shebang).
Note: Windows does not support shebang lines, so they're effectively ignored there; on Windows it is solely a given file's filename extension that determines what executable will interpret it. However, you still need them in the context of npm.[1]
The following, general discussion of shebang lines is limited to Unix-like platforms:
In the following discussion I'll assume that the file containing source code for execution by Node.js is simply named file.
You NEED this line, if you want to invoke a Node.js source file directly, as an executable in its own right - this assumes that the file has been marked as executable with a command such as chmod +x ./file, which then allows you to invoke the file with, for instance, ./file, or, if it's located in one of the directories listed in the $PATH variable, simply as file.
Specifically, you need a shebang line to create CLIs based on Node.js source files as part of an npm package, with the CLI(s) to be installed by npm based on the value of the "bin" key in a package's package.json file; also see this answer for how that works with globally installed packages. Footnote [1] shows how this is handled on Windows.
You do NOT need this line to invoke a file explicitly via the node interpreter, e.g., node ./file
Optional background information:
#!/usr/bin/env <executableName> is a way of portably specifying an interpreter: in a nutshell, it says: execute <executableName> wherever you (first) find it among the directories listed in the $PATH variable (and implicitly pass it the path to the file at hand).
This accounts for the fact that a given interpreter may be installed in different locations across platforms, which is definitely the case with node, the Node.js binary.
By contrast, the location of the env utility itself can be relied upon to be in the same location across platforms, namely /usr/bin/env - and specifying the full path to an executable is required in a shebang line.
Note that POSIX utility env is being repurposed here to locate by filename and execute an executable in the $PATH.
The true purpose of env is to manage the environment for a command - see env's POSIX spec and Keith Thompson's helpful answer.
It's also worth noting that Node.js is making a syntax exception for shebang lines, given that they're not valid JavaScript code (# is not a comment character in JavaScript, unlike in POSIX-like shells and other interpreters).
[1] In the interest of cross-platform consistency, npm creates wrapper *.cmd files (batch files) on Windows when installing executables specified in a package's package.json file (via the "bin" property). Essentially, these wrapper batch files mimic Unix shebang functionality: they invoke the target file explicitly with the executable specified in the shebang line - thus, your scripts must include a shebang line even if you only ever intend to run them on Windows - see this answer of mine for details.
Since *.cmd files can be invoked without the .cmd extension, this makes for a seamless cross-platform experience: on both Windows and Unix you can effectively invoke an npm-installed CLI by its original, extension-less name.
Scripts that are to be executed by an interpreter normally have a shebang line at the top to tell the OS how to execute them.
If you have a script named foo whose first line is #!/bin/sh, the system will read that first line and execute the equivalent of /bin/sh foo. Because of this, most interpreters are set up to accept the name of a script file as a command-line argument.
The interpreter name following the #! has to be a full path; the OS won't search your $PATH to find the interpreter.
If you have a script to be executed by node, the obvious way to write the first line is:
#!/usr/bin/node
but that doesn't work if the node command isn't installed in /usr/bin.
A common workaround is to use the env command (which wasn't really intended for this purpose):
#!/usr/bin/env node
If your script is called foo, the OS will do the equivalent of
/usr/bin/env node foo
The env command executes another command whose name is given on its command line, passing any following arguments to that command. The reason it's used here is that env will search $PATH for the command. So if node is installed in /usr/local/bin/node, and you have /usr/local/bin in your $PATH, the env command will invoke /usr/local/bin/node foo.
The main purpose of the env command is to execute another command with a modified environment, adding or removing specified environment variables before running the command. But with no additional arguments, it just executes the command with an unchanged environment, which is all you need in this case.
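The lookup can be tried out without Node at all. The sketch below uses a made-up interpreter called demo-interp, placed in a temporary directory, and a script whose first line is #!/usr/bin/env demo-interp. It assumes /usr/bin/env exists and that files under the temporary directory may be executed.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# demo-interp is a hypothetical interpreter: it just reports which file it was given.
cat > "$workdir/demo-interp" <<'EOF'
#!/bin/sh
echo "demo-interp received: $(basename "$1")"
EOF

# A script that asks env to locate demo-interp somewhere on PATH.
cat > "$workdir/hello" <<'EOF'
#!/usr/bin/env demo-interp
EOF

chmod +x "$workdir/demo-interp" "$workdir/hello"

# With workdir on PATH, env finds demo-interp and passes it the script's path.
result=$(PATH="$workdir:$PATH" "$workdir/hello")
echo "$result"    # prints "demo-interp received: hello"

rm -rf "$workdir"
```

Running the script without the temporary directory on PATH makes env fail with a "No such file or directory" style error, which is the failure mode described for a missing node.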
There are some drawbacks to this approach. Most modern Unix-like systems have /usr/bin/env, but I worked on older systems where the env command was installed in a different directory. There might be limitations on additional arguments you can pass using this mechanism. If the user doesn't have the directory containing the node command in $PATH, or has some different command called node, then it could invoke the wrong command or not work at all.
Other approaches are:
Use a #! line that specifies the full path to the node command itself, updating the script as needed for different systems; or
Invoke the node command with your script as an argument.
See also this question (and my answer) for more discussion of the #!/usr/bin/env trick.
Incidentally, on my system (Linux Mint 17.2), it's installed as /usr/bin/nodejs. According to my notes, it changed from /usr/bin/node to /usr/bin/nodejs between Ubuntu 12.04 and 12.10. The #!/usr/bin/env trick won't help with that (unless you set up a symlink or something similar).
UPDATE: A comment by mtraceur says (reformatted):
A workaround for the nodejs vs node problem is to start the file with
the following six lines:
#!/bin/sh -
':' /*-
test1=$(nodejs --version 2>&1) && exec nodejs "$0" "$@"
test2=$(node --version 2>&1) && exec node "$0" "$@"
exec printf '%s\n' "$test1" "$test2" 1>&2
*/
This will first try nodejs and then try node, and only
print the error messages if both of them are not found. An explanation
is out of scope of these comments, I'm just leaving it here in case it
helps anyone deal with the problem since this answer brought the
problem up.
I haven't used NodeJS lately. My hope is that the nodejs vs. node issue has been resolved in the years since I first posted this answer. On Ubuntu 18.04, the nodejs package installs /usr/bin/nodejs as a symlink to /usr/bin/node. On some earlier OS (Ubuntu or Linux Mint, I'm not sure which), there was a nodejs-legacy package that provided node as a symlink to nodejs. No guarantee that I have all the details right.
The exec system call of the Linux kernel understands shebangs (#!) natively
When you run the following in Bash:
./something
on Linux, this calls the exec system call with the path ./something.
This line of the kernel gets called on the file passed to exec: https://github.com/torvalds/linux/blob/v4.8/fs/binfmt_script.c#L25
if ((bprm->buf[0] != '#') || (bprm->buf[1] != '!'))
It reads the very first bytes of the file, and compares them to #!.
If the comparison is true, then the rest of the line is parsed by the Linux kernel, which makes another exec call with:
executable: /usr/bin/env
first argument: node
second argument: script path
therefore equivalent to:
/usr/bin/env node /path/to/script.js
env is an executable that searches PATH to e.g. find /usr/bin/node, and then finally calls:
/usr/bin/node /path/to/script.js
The Node.js interpreter does see the #! line in the file, but it must be programmed to ignore that line even though # is not in general a valid comment character in Node (unlike many other languages such as Python where it is), see also: Pound Sign (#) As Comment Start In JavaScript?
And yes, you can make an infinite loop with:
printf '#!/a\n' | sudo tee /a
sudo chmod +x /a
/a
Bash recognizes the error:
-bash: /a: /a: bad interpreter: Too many levels of symbolic links
#! just happens to be human readable, but that is not required.
If the file started with different bytes, then the exec system call would use a different handler. The other most important built-in handler is for ELF executable files: https://github.com/torvalds/linux/blob/v4.8/fs/binfmt_elf.c#L1305 which checks for bytes 7f 45 4c 46 (which also happens to be human readable for .ELF). Let's confirm that by reading the 4 first bytes of /bin/ls, which is an ELF executable:
head -c 4 "$(which ls)" | hd
output:
00000000 7f 45 4c 46 |.ELF|
00000004
So when the kernel sees those bytes, it takes the ELF file, puts it into memory correctly, and starts a new process with it. See also: How does kernel get an executable binary file running under linux?
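The same check on the leading bytes can be mimicked from the shell. This is a rough sketch of the idea only: the kernel has more handlers than the two shown here, and the function name exec_kind is made up.

```shell
#!/bin/sh
# exec_kind FILE   (hypothetical helper name)
# Classify FILE by its first bytes, the way the handlers described above do.
exec_kind() {
  # First four bytes as lowercase hex without separators, e.g. 7f454c46.
  magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
  case $magic in
    2321*)    echo script ;;    # 23 21 is "#!"
    7f454c46) echo elf ;;       # 7f 45 4c 46 is "\x7fELF"
    *)        echo unknown ;;
  esac
}

# Usage: inspect the shell binary itself.
exec_kind /bin/sh
```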
Finally, you can add your own shebang handlers with the binfmt_misc mechanism. For example, you can add a custom handler for .jar files. This mechanism even supports handlers by file extension. Another application is to transparently run executables of a different architecture with QEMU.
I don't think POSIX specifies shebangs, however: https://unix.stackexchange.com/a/346214/32558 . It does mention them in rationale sections, in the form "if executable scripts are supported by the system something may happen". macOS and FreeBSD also seem to implement them.
PATH search motivation
Likely, one big motivation for the existence of shebangs is the fact that in Linux, we often want to run commands from PATH just as:
basename-of-command
instead of:
/full/path/to/basename-of-command
But then, without the shebang mechanism, how would Linux know how to launch each type of file?
Hardcoding the extension in commands:
basename-of-command.js
or implementing PATH search on every interpreter:
node basename-of-command
would be a possibility, but this has the major problem that everything breaks if we ever decide to refactor the command into another language.
Shebangs solve this problem beautifully.
Short answer:
It is the path to the interpreter.
EDIT (Long Answer):
The reason there is no slash before "node" is that node is an argument to /usr/bin/env, not part of a path: you cannot rely on the interpreter being installed in a fixed directory such as /bin/ or /usr/bin/. Going through env makes the script more portable across platforms, because env searches the directories in $PATH and so finds the interpreter more reliably.
You do not strictly need it, but it is good to use to ensure portability (and professionalism).

How can I change the binary file link to something else

I have two questions and they are linked. I execute the command like this:
python in the shell, and it opens the Python shell.
Now I want to know:
Which file is it linked to? I mean, when I run python, what is the path of the file that gets executed: /usr/bin/python or something else?
How can I change that link to some other location, so that when I run python it opens /usr/bal/bla/python2.7?
The command run when you type python is determined primarily by the setting of your $PATH. The first executable file called python that is found in a directory listed on your $PATH will be the one executed. There is no 'link' per se. The which command will tell you what the shell executes when you type python.
If you want python to open a different program, there are a number of ways to do it. If you have $HOME/bin on your $PATH ahead of /usr/bin, then you can create a symlink:
ln -s /usr/bal/bla/python2.7 $HOME/bin/python
This will now be executed instead of /usr/bin/python. Alternatively, you can create an alias:
alias python=/usr/bal/bla/python2.7
Alternatively again, if /usr/bal/bla contains other useful programs, you could add /usr/bal/bla to your $PATH ahead of /usr/bin.
There are other mechanisms too, but one of these is likely to be the one you use. I'd most probably use the symlink in $HOME/bin.
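The symlink approach can be tried safely in a scratch directory first. In this sketch the temporary bin directory stands in for $HOME/bin, and a tiny shell script stands in for /usr/bal/bla/python2.7; both are placeholders, not real installations.

```shell
#!/bin/sh
sandbox=$(mktemp -d)
mkdir "$sandbox/bin" "$sandbox/opt"

# Stand-in for the alternative interpreter (/usr/bal/bla/python2.7 in the question).
printf '#!/bin/sh\necho "custom build"\n' > "$sandbox/opt/python2.7"
chmod +x "$sandbox/opt/python2.7"

# A symlink named "python" in a directory that comes first on PATH.
ln -s "$sandbox/opt/python2.7" "$sandbox/bin/python"

# Resolve and run "python" with the sandbox directory at the front of PATH.
# The command substitutions run in subshells, so the real PATH is untouched.
resolved=$(PATH="$sandbox/bin:$PATH"; command -v python)
output=$(PATH="$sandbox/bin:$PATH"; python)

echo "python resolves to: $resolved"
echo "it prints: $output"
rm -rf "$sandbox"
```

Because the sandbox directory is listed first, the symlink wins over any python installed elsewhere, which is exactly the effect of putting $HOME/bin ahead of /usr/bin.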

Resources