Bash built-in "history" can be executed, but is not in $PATH - linux

I know shell builtins are loaded into memory and thought I could find all builtins in /usr/bin or somewhere in echo $PATH. I was trying to find out how the history command works. My assumption was that it is reading from ~/bash_history. So I tried objdump -S $(which history)
which history
echo $?
1
This did not return the path of the command which makes me where the binary for history is located.
type -t which
builtin
I assume this means that it is loaded onto memory. So does a shell process load builtins that are stored outside of echo $PATH

A shell builtin is literally built into the shell executable itself. It is invoked by the shell not as a separate process, but simply as a regular function call within the shell process. So if you want to find its source code, you need to look in the source code of bash.
For many builtins, such as cd, the main reason for it being a builtin is that it modifies the state of the shell itself. It would be pointless to have cd be a separate process, as that process would only change its own current directory, not that of the shell process. In the case of history, the reason is presumably that ~/.bash_history is only written when the shell exits, so the command also needs access to the in-memory history of the current session, which is contained within the running bash process. For other builtins, such as echo, the reason is performance: the command is assumed to be so frequently used that we want to avoid spawning a new process every time it is invoked (but if you really do want a process, there is also /bin/echo, which may behave differently).

Related

How to know what script header to use and why it matters?

I infrequently have to write bash scripts for various unrelated purposes and while I usually have a good idea what commands I want in the script, I often have no idea what header to use or why I'm using one when I do find it. For example(s):
Standard shell script:
#!/bin/bash
Python:
#!/usr/bin/env python
Scripts seem to work fine without headers but if headers are the standard, there's a reason for them and they shouldn't be ignored. If it has an effect, then it's a valuable tool that could be used to accomplish more.
Minimally, I'd like to know what headers to use with MySQL scripts and what the headers do on Standard, Python, and MySQL scripts. Ideally, I'd like a generic list of headers or an understanding of how to create a header based on what program is being used.
How the Kernel Executes Things
Simplified (a bit), there are two ways the kernel in a POSIX system knows how to execute a program. One, if the program is in a binary format the kernel understands (such as ELF), the kernel can execute it "directly" (more detail out of scope). If the program is a text file starting with a shebang, such as
#!/usr/bin/somebinary -arg
or what-have-you, the kernel actually executes the command as if it had been directed to execute:
/usr/bin/somebinary -arg "$0"
where $0 here is the name of the script file you just tried to execute. (So you can immediately tell why so many scripting languages use # as a comment-starter – it means they don't have to treat the shebang as special.)
PATH and the env command
The kernel does not look at the PATH environment variable to determine which executable you're talking about, so if you are distributing a python script to systems that may have multiple versions of python installed, you can't guarantee that there will be a
#!/usr/bin/python
env, however, is POSIX, so you can count on it existing, and it will look up python in PATH. Thus,
#!/usr/bin/env python
will execute the script with the first python found in your PATH.
BASH, SH and Special Meanings for Invocation
Some programs have special semantics for how they're invoked. In particular, on many systems /bin/sh is a symlink to another shell, such as /bin/bash. While bash does not contain a perfectly POSIXLY_STRICT implementation of sh, when it is invoked as /bin/sh it is stricter than it would be if invoked as plain-old-bash.
MySQL and arg limitations
The shebang line can be length limited and technically, it can only support one argument, so mysql is a bit tricky – you can't expect to pass a username and database name to a mysql script.
#!/usr/bin/env mysql
use mydb;
select * from mytbl;
Will fail because the kernel will try mysql "$0". Even if you have your credentials in a .my.cnf file, mysql itself will try to treat "$0" as a database name. Likewise:
#!/usr/bin/mysql -e
use mydb;
select * from mytbl;
will fail because again, "$0" is not a table name (you hope).
There does not seem to be an appropriate syntax for directly executing a mysql script this way. Your best bet is to pipe the sql commands to mysql directly:
mysql < my_sql_commands
http://mywiki.wooledge.org/BashGuide/Practices#Choose_Your_Shell
When the first line of a script starts with #!, that's what's called a "shebang". When that script is run as an executable, the operating system uses that line to determine how to run the script -- that is to say, to find the program with which the script should be executed.
It's incorrect that "scripts work fine without headers" -- if you don't have a shebang line, you can't be invoked using the execve() call, which means that many (most?) programs won't be able to execute your script. Sometimes invocation from a shell will try to use that shell itself in the absence of a shebang, but you can't trust that to be the case.
(There's an exception to that -- if someone starts your script by running sh yourscript or bash yourscript, the shebang line isn't read at all, and the script they chose is used; however, running scripts this way is a bad practice, as the author typically knows better than the user what the correct interpreter is).
In short:
If you want to use modern features, and you want the user to be able to override the shell version in use by putting a different release of bash earlier in their path, use #!/usr/bin/env bash
If you want to use modern features and ensure that you always run with the system shell, use #!/bin/bash
If you're going to write your script to strictly conform with POSIX sh, use #!/bin/sh
There's not a limited list of shebang lines we can give you, since any native executable (non-script program) can be used as a script interpreter, and thus be placed in a shebang. If you created a file called myscript with #!/usr/bin/env yourprogram, gave it executable permissions, and ran ./myscript foo bar, this would result in /usr/bin/env yourprogram myscript foo bar being invoked; yourprogram would be run by /usr/bin/env (after a PATH lookup), and would be responsible for knowing what to do with myscript and its arguments.
For an extremely detailed history of shebang lines and how they work across systems both modern and ancient, see http://www.in-ulm.de/~mascheck/various/shebang/

Why calling a script by "scriptName" doesn't work?

I have a simple script cmakeclean to clean cmake temp files:
#!/bin/bash -f
rm CMakeCache.txt
rm *.cmake
which I call like
$ cmakeclean
And it does remove CMakeCache.txt, but it doesn't remove cmake_install.cmake:
rm: *.cmake: No such file or directory
When I run it like:
$ . cmakeclean
it does remove both.
What is the difference and can I make this script work like an usual linux command (without . in front)?
P.S.
I am sure the both times is same script is executed. To check this I added echo meme in the script and rerun it in both ways.
Remove the -f from your #!/bin/bash -f line.
-f prevents pathname expansion, which means that *.cmake will not match anything. When you run your script as a script, it interprets the shebang line, and in effect runs /bin/bash -f scriptname. When you run it as . scriptname, the shebang is just seen as a comment line and ignored, so the fact that you do not have -f set in your current environment allows it to work as expected.
. script is short for source script which means the current shell executes the commands in the script. If there's an exit in there, the current shell will exit (and e. g. the terminal window will close).
This is typically used to modify the environment of the current shell (set variables etc.).
script asks the shell to fork itself, then exec the given script in the child process, and then wait in the father for the termination of the child. If there's an exit in the script, this will be executed by the child shell and thus only terminate this. The father shell stays intact and unaltered by this call.
This is typically used to start other programs from the current shell.
Is this about ClearCase? What did you do in your poor life where you've been assigned to work in the deepest bowels of hell?
For years, I was a senior ClearCase Administer. I haven't touched it in over a decade. My life is way better now. The sky is bluer, bird songs are more melodious, and my dread over coming to work every day is now a bit less.
Getting back to your issue: It's hard to say exactly what's going on. ClearCase does some wacky things. In a dynamic view, the ClearCase repository on Unix systems is hidden in the shell's environment. Now you see it, now you don't.
When you run a shell script, it starts up a new environment. If a particular shell variable is not imported, it is invisible that shell script. When you merely run cmakeclean from the command line, you are spawning a new shell -- one that does not contain your ClearCase environment.
When you run a shell script with a dot prefix like . cmakeclean, you are running that shell script in the current shell which contains your ClearCase environment. Thus, it can see your ClearCase view.
If you're using a snapshot view, it is possible that you have a $HOME/.bashrc that's changing directories on you. When a new shell environment runs in BASH (the default shell in MacOS X and Linux), it first runs $HOME/.bashrc. If this sets a particular directory, then you end up in that directory and not in the directory where you ran your shell script. I use to see this when I too was involved in ClearCase hell. People setup their .kshrc script (it was the days before BASH and most people used Kornshell) to setup their views. Unfortunately, this made running any other shell script almost impossible to do.

What occurs when a file is `source`-d in Unix/Linux context?

I've seen shell scripts that include a line such as:
source someOtherFile
I know that causes the content of someOtherFile to execute, but what is the significance of source?
Follow-up questions: Can ANY script be sourced, or only certain type of scripts? Are there any side-effects other than environment variables when a script is sourced (as opposed to normally executing it)?
Running the command source on a script executes the script within the context of the current process. This means that environment variables set by the script remain available after it's finished running. This is in contrast to running a script normally, in which case environment variables set within the newly-spawned process will be lost once the script exits.
You can source any runnable shell script. The end effect will be the same as if you had typed the commands in the script into your terminal. For example, if the script changes directories, when it finishes running, your current working directory will have changed.
If you tell the shell, e.g. bash, to read a file and execute the commands in the file, it's called sourcing. The main point is, the current process (shell) does this, not a new child process.
In BASH you can use the source command or simply . to source a file.
source is a Unix command that evaluates the file following the command, as a list of commands, executed in the current context. You can also use . for sourcing the file.
source my-script.sh;
. my-script.sh;
Both commands will have the same effect.
In contrast, passing the script filename to the desired shell will run the script in a subshell, not the current context.

Location of cd executable

I read that the executables for the commands issued using exec() calls are supposed to be stored in directories that are part of the PATH variable.
Accordingly, I found the executables for ls, chmod, grep, cat in /bin.
However, I could not find the executable for cd.
Where is it located?
A process can only affect its own working directory. When an executable is executed by the shell it executes as a child process, so a cd executable (if one existed) would change that child process's working directory without affecting the parent process (the shell), hence the cd command must be implemented as a shell built-in that actually executes in the shell's own process.
cd is a shell built-in, unfortunately.
$ type cd
cd is a shell builtin
...from http://www.linuxquestions.org/questions/linux-newbie-8/whereis-cd-sudo-doesnt-find-cd-464767/
But you should be able to get it working with:
sh -c "cd /somedir; do something"
Not all utilities that you can execute at a shell prompt need actually exist as actual executables in the filesystem. They can also be so-called shell built-ins, which means – you guessed it – that they are built into the shell.
The Single Unix Specification does, in general, not specify whether a utility has to be provided as an executable or as a built-in, that is left as a private internal implementation detail to the OS vendor.
The only exceptions are the so-called special built-ins, which must be provided as built-ins, because they affect the behavior of the shell itself in a manner that regular executables (or even regular built-ins) can't (for example set, which sets variables that persist even after set exits). Those special built-ins are:
break
:
continue
.
eval
exec
exit
export
readonly
return
set
shift
times
trap
unset
Note that cd is not on that list, which means that cd is not a special built-in. In fact, according to the specification, it would be perfectly legal to implement cd as a regular executable. It's just not possible, for the reasons given by the other answers.
And if you scroll down to the non-normative section of the specification, i.e. to the part that is not officially part of the specification but only purely informational, you will find that fact explicitly mentioned:
Since cd affects the current shell execution environment, it is always provided as a shell regular built-in.
So, the specification doesn't require cd to be a built-in, but it's simply impossible to do otherwise.
Note that sometimes utilities are provided both as a built-in and as an executable. A good example is the time utility, which on a typical GNU system is provided both as an executable by the Coreutils package and as a shell regular built-in by Bash. This can lead to confusion, because when you do man time, you get the manpage of the time executable (the time builtin is documented in man builtins), but when you execute time you get the time built-in, which does not support the same features as the time executable whose manpage you just read. You have to explicitly run /usr/bin/time (or whatever path you installed Coreutils into) to get the executable.
According to this, cd is always a built-in command and never an executable:
Since cd affects the current shell execution environment, it is always provided as a shell regular built-in.
cd is part of the shell; an internal command. There is no binary for it.
The command cd is built-in in your command line shell. It could not affect the working directory of your shell otherwise.
I also searched the executable of "cd" and there is no such.
You can work with chdir (pathname) in C, it has the same effect.

what are shell built-in commands in linux?

I have just started using Linux and I am curious how shell built-in commands such as cd are defined.
Also, I'd appreciate if someone could explain how they are implemented and executed.
If you want to see how bash builtins are defined then you just need to look at Section 4 of The Bash Man Page.
If, however, you want to know how bash bultins are implemented, you'll need to look at the Bash source code because these commands are compiled into the bash executable.
One fast and easy way to see whether or not a command is a bash builtin is to use the help command. Example, help cd will show you how the bash builtin of 'cd' is defined. Similarly for help echo.
The actual set of built-ins varies from shell to shell. There are:
Special built-in utilities, which must be built-in, because they have some special properties
Regular built-in utilities, which are almost always built-in, because of the performance or other considerations
Any standard utility can be also built-in if a shell implementer wishes.
You can find out whether the utility is built in using the type command, which is supported by most shells (although its output is not standardized). An example from dash:
$ type ls
ls is /bin/ls
$ type cd
cd is a shell builtin
$ type exit
exit is a special shell builtin
Re cd utility, theoretically there's nothing preventing a shell implementer to implement it as external command. cd cannot change the shell's current directory directly, but, for instance, cd could communicate new directory to the shell process via a socket. But nobody does so because there's no point. Except very old shells (where there was not a notion of built-ins), where cd used some dirty system hack to do its job.
How is cd implemented inside the shell? The basic algorithm is described here. It can also do some work to support shell's extra features.
Manjari,
Check the source code of bash shell from ftp://ftp.gnu.org/gnu/bash/bash-2.05b.tar.gz
You will find that the definition of shell built-in commands in not in a separate binary executable but its within the shell binary itself (the name shell built-in clearly suggests this).
Every Unix shell has at least some builtin commands. These builtin commands are part of the shell, and are implemented as part of the shell's source code. The shell recognizes that the command that it was asked to execute was one of its builtins, and it performs that action on its own, without calling out to a separate executable. Different shells have different builtins, though there will be a whole lot of overlap in the basic set.
Sometimes, builtins are builtin for performance reasons. In this case, there's often also a version of that command in $PATH (possibly with a different feature set, different set of recognized command line arguments, etc), but the shell decided to implement the command as a builtin as well so that it could save the work of spawning off a short-lived process to do some work that it could do itself. That's the case for bash and printf, for example:
$ type printf
printf is a shell builtin
$ which printf
/usr/bin/printf
$ printf
printf: usage: printf [-v var] format [arguments]
$ /usr/bin/printf
/usr/bin/printf: missing operand
Try `/usr/bin/printf --help' for more information.
Note that in the above example, printf is both a shell builtin (implemented as part of bash itself), as well as an external command (located at /usr/bin/printf). Note that they behave differently as well - when called with no arguments, the builtin version and the command version print different error messages. Note also the -v var option (store the results of this printf into a shell variable named var) can only be done as part of the shell - subprocesses like /usr/bin/printf have no access to the variables of the shell that executed them.
And that brings us to the 2nd part of the story: some commands are builtin because they need to be. Some commands, like chmod, are thin wrappers around system calls. When you run /bin/chmod 777 foo, the shell forks, execs /bin/chmod (passing "777" and "foo") as arguments, and the new chmod process runs the C code chmod("foo", 777); and then returns control to the shell. This wouldn't work for the cd command, though. Even though cd looks like the same case as chmod, it has to behave differently: if the shell spawned another process to execute the chdir system call, it would change the directory only for that newly spawned process, not the shell. Then, when the process returned, the shell would be left sitting in the same directory as it had been in all along - therefore cd needs to be implemented as a shell builtin.
A Shell builtin -- http://linux.about.com/library/cmd/blcmdl1_builtin.htm
for eg. -
which cd
/usr/bin/which: no cd in (/usr/bin:/usr/local/bin......
Not a shell builtin but a binary.
which ls
/bin/ls
http://ss64.com/bash/ this will help you.
and here is shell scripting guide
http://www.freeos.com/guides/lsst/

Resources