Unknown bash command causing PC to freeze [duplicate] - linux

I looked at this page and can't understand how this works.
This command "exponentially spawns subprocesses until your box locks up".
But why? What I understand least are the colons.
user@host$ :(){ :|:& };:

:(){ :|:& };:
...defines a function named :, which spawns itself twice (one instance piping into the other) and backgrounds the pipeline.
With line breaks:
:()
{
:|:&
};
:
Renaming the : function to forkbomb:
forkbomb()
{
forkbomb | forkbomb &
};
forkbomb
You can prevent such attacks by using ulimit to limit the number of processes-per-user:
$ ulimit -u 50
$ :(){ :|:& };:
-bash: fork: Resource temporarily unavailable
$
More permanently, you can use /etc/security/limits.conf (on Debian and others, at least), for example:
* hard nproc 50
Of course that means you can only run 50 processes; you may want to increase this depending on what the machine is doing!

That defines a function called : which calls itself twice (Code: : | :). It does that in the background (&). After the ; the function definition is done and the function : gets started.
So every instance of : starts two new ones, and so on, like a binary tree of processes...
Written in plain C, that is roughly:
fork();
fork();
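Strictly speaking, two bare fork() calls only quadruple the process count once and then stop; a C sketch closer to the shell function, which keeps calling itself, would loop forever (only try this in a throwaway VM with a low ulimit -u):
#include <unistd.h>

/* Rough C analogue of the fork bomb: every process keeps forking,
 * so both parent and child loop and the process count explodes
 * until the kernel refuses further forks. */
int main(void)
{
    for (;;)
        fork();
}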

Just to add to the above answers: the behavior of the pipe | is to create two processes at once and connect them with a pipe (the pipe itself is implemented by the operating system). So each parent process spawns two other processes, which makes resource usage grow exponentially, so resources are used up faster. The & backgrounds the whole thing, so the prompt returns immediately and the next call executes even sooner.
Conclusion:
|: uses system resources faster (exponential growth)
&: backgrounds the process so each new one starts even faster
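Here is a minimal sketch of roughly what a shell does for a two-command background pipeline like a | b & (the names left-command and right-command are placeholders, not anything from the original post):
#include <unistd.h>

int main(void)
{
    int fd[2];
    pipe(fd);                          /* fd[0] = read end, fd[1] = write end */

    if (fork() == 0) {                 /* left side of the pipeline */
        dup2(fd[1], STDOUT_FILENO);    /* its stdout goes into the pipe */
        close(fd[0]); close(fd[1]);
        execlp("left-command", "left-command", (char *)NULL);
        _exit(127);
    }
    if (fork() == 0) {                 /* right side of the pipeline */
        dup2(fd[0], STDIN_FILENO);     /* its stdin comes from the pipe */
        close(fd[0]); close(fd[1]);
        execlp("right-command", "right-command", (char *)NULL);
        _exit(127);
    }
    close(fd[0]); close(fd[1]);
    return 0;                          /* the trailing & means: don't wait for either child */
}
In the fork bomb both sides of the pipe are the : function itself, so each invocation immediately creates two more.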

This defines a function called : (:()). Inside the function ({...}), there's a :|:& which is like this:
: calls this : function again.
| signifies piping the output to a command.
: after | means pipe to the function :.
&, in this case, means run the preceding in the background.
Then there's a ; that is known as a command separator.
Finally, the : starts this "chain reaction", activating the fork bomb.
The C equivalent would be:
#include <sys/types.h>
#include <unistd.h>
int main()
{
    fork();
    fork();
}

Related

Child processes in bash

We use bash scripts with asynchronous calls using '&'. Something like this:
function test() {
    sleep 1
}
test &
mypid=$!
# do some stuff for two hours
wait $mypid
Usually everything is OK, but sometimes we get the error
"wait: pid 419090 is not a child of this shell"
I know that bash keeps child pids in a special table, and I know (man wait) that bash is allowed not to store status information in this table if nobody uses $! and nobody calls 'wait $mypid'. I suspect that this optimization contains a bug that causes the error. Does somebody know how to print this table or how to disable this optimization?
I was recently trying something quite similar.
Are you sure that the second process you run concurrently starts before the previous one dies? In that case, I think there is a possibility it gets the same pid as the one that just died.
Also, I think we cannot be sure that $! holds the pid of the process we last ran, because there may be several background processes from other functions starting or ending at the same time.
I would suggest using something like this.
mypid=$(ps -ef | grep name_of_your_process | awk ' !/grep/ {print $2} ')
In grep name_of_your_process you can specify some parameters too so as to get the exact process you want.
I hope it helps a bit.
Having written something similar, I suggest the correct strategy is to fork into the background BOTH the test function and the long-running two-hour work.
Then you can wait on a list of pids, sorted by the expected running times (fastest first); a short sketch follows below.
The bash(1) wait builtin also allows you to simply wait for all of the child processes to complete, but that may require a protocol for checking successful completion.
An alternative approach for greater reliability is to use batch queues, with a separate process started to check for successful completions.
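A minimal bash sketch of the pid-list idea (short_task and long_running_stuff are stand-ins for the real work):
#!/bin/bash
# Run both tasks in the background, remember each pid from $!,
# then wait for each pid explicitly and check its exit status.
short_task() { sleep 1; }
long_running_stuff() { sleep 7200; }   # stand-in for the two hours of work

short_task & short_pid=$!
long_running_stuff & long_pid=$!

for pid in "$short_pid" "$long_pid"; do
    wait "$pid" || echo "process $pid failed" >&2
done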
You can use gdb to attach to a running shell and see what's happening. On my system I ran yum install bash-debuginfo, then ran gdb and attached to a running shell.
(gdb) b wait_for_single_pid
Breakpoint 1 at 0x441840: file jobs.c, line 2115.
(gdb) c
Continuing.
Breakpoint 1, wait_for_single_pid (pid=11298) at jobs.c:2115
2115 {
(gdb) n
2120 BLOCK_CHILD (set, oset);
(gdb)
2121 child = find_pipeline (pid, 0, (int *)NULL);
(gdb) s
find_pipeline (pid=pid#entry=11298, alive_only=alive_only#entry=0, jobp=jobp#entry=0x0) at jobs.c:1308
1308 {
(gdb)
1313 if (jobp)
(gdb) n
1315 if (the_pipeline)
(gdb)
1329 job = find_job (pid, alive_only, &p);
(gdb) s
find_job (pid=11298, alive_only=0, procp=procp#entry=0x7ffdc053f038) at jobs.c:1364
1364 for (i = 0; i < js.j_jobslots; i++)
(gdb) n
1372 if (jobs[i])
(gdb)
1374 p = jobs[i]->pipe;
(gdb)
1378 if (p->pid == pid && ((alive_only == 0 && PRECYCLED(p) == 0) || PALIVE(p)))
(gdb)
1385 p = p->next;
(gdb)
1387 while (p != jobs[i]->pipe);
The code is traversing the pipe linked lists attached to the jobs array. I didn't encounter any bugs, but perhaps you can spot them with this approach.

How to restart a group of processes when it is triggered from one of them in C code

I have a few processes (*.rt) written in C.
I want to restart all of them (*.rt) from within the process foo.rt (one of the *.rt processes) itself, in built-in C code.
Normally I have two bash scripts, stop.sh and start.sh, which are invoked from the shell.
Here is what the scripts do:
stop.sh --> sends kill -9 to all "*.rt" processes.
start.sh --> starts the processes named "*.rt".
My problem is how to restart all the *.rt processes from C code. Is there any way to restart all "*.rt" processes, triggered from the foo.rt file?
I tried the following in foo.rt but it doesn't work, because stop.sh kills all *.rt processes, including the child that was forked to execute the start.sh script:
...
case 708: /* There is a trigger signal here */
{
    result = APP_RES_PRG_OK;
    if (fork() == 0) { /* child */
        execl("/bin/sh", "sh", "-c", "/sbin/stop.sh", NULL);
        execl("/bin/sh", "sh", "-c", "/sbin/start.sh", NULL); /* Error: this will be killed by /sbin/stop.sh */
    }
}
I have solved the problem with the 'at' daemon in Linux.
I invoke two system() calls, stop and start.
My first attempt was faulty, as explained above: execl replaces the process image and never returns, so the later execl is only reached if the first one fails.
Here is my solution
case 708: /* There is a trigger signal here */
{
    system("echo '/sbin/start.sh' | at now + 2 min");
    system("echo '/sbin/stop.sh' | at now + 1 min");
}
You could use process groups, at least if all your related processes originate from the same process...
So you could write a glue program in C which sets up a new process group using setpgrp(2) and stores its pid (or keeps running, waiting for some IPC).
Then you would stop that process group by using killpg(2); a rough sketch follows below.
See also the notion of sessions and setsid(2).
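A minimal sketch of that idea, assuming a hypothetical worker binary called ./worker.rt (setpgid is used here as the modern equivalent of setpgrp):
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t leader = fork();
    if (leader == 0) {                     /* child: become leader of a new process group */
        setpgid(0, 0);
        execl("./worker.rt", "worker.rt", (char *)NULL);
        _exit(127);
    }
    setpgid(leader, leader);               /* parent sets it too, to avoid a startup race */

    sleep(10);                             /* ... later, time to restart ... */

    killpg(leader, SIGTERM);               /* signal every process in the group */
    while (waitpid(-leader, NULL, 0) > 0)  /* reap everything in the group */
        ;
    return 0;
}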

Bash script execution with and without shebang in Linux and BSD

How is it determined, and by whom, what executes when a Bash-like script is run as a binary without a shebang?
I guess that running a normal script with a shebang is handled by the binfmt_script Linux module, which checks for a shebang, parses the command line and runs the designated script interpreter.
But what happens when someone runs a script without a shebang? I've tested the direct execv approach and found out that there's no kernel magic in there - i.e. a file like this:
$ cat target-script
echo Hello
echo "bash: $BASH_VERSION"
echo "zsh: $ZSH_VERSION"
Running a compiled C program that does just an execv call yields:
$ cat test-runner.c
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char *const argv[] = { "./target-script", NULL };
    if (execv("./target-script", argv) == -1)
        perror("./target-script");
}
$ ./test-runner
./target-script: Exec format error
However, if I do the same thing from another shell script, it runs the target script using the same shell interpreter as the original one:
$ cat test-runner.bash
#!/bin/bash
./target-script
$ ./test-runner.bash
Hello
bash: 4.1.0(1)-release
zsh:
If I do the same trick with other shells (for example, Debian's default sh - /bin/dash), it also works:
$ cat test-runner.dash
#!/bin/dash
./target-script
$ ./test-runner.dash
Hello
bash:
zsh:
Mysteriously, it doesn't quite work as expected with zsh and doesn't follow the general scheme. Looks like zsh executed /bin/sh on such files after all:
greycat@burrow-debian ~/z/test-runner $ cat test-runner.zsh
#!/bin/zsh
echo ZSH_VERSION=$ZSH_VERSION
./target-script
greycat@burrow-debian ~/z/test-runner $ ./test-runner.zsh
ZSH_VERSION=4.3.10
Hello
bash:
zsh:
Note that ZSH_VERSION in parent script worked, while ZSH_VERSION in child didn't!
How does a shell (Bash, dash) determine what gets executed when there's no shebang? I've tried to dig up the relevant place in the Bash/dash sources, but, alas, I'm kind of lost in there. Can anyone shed some light on the magic that determines whether a target file without a shebang is executed as a script or as a binary in Bash/dash? Or maybe there is some sort of interaction with the kernel/libc, in which case I'd welcome explanations of how it works in the Linux and FreeBSD kernels/libcs.
Since this happens in dash and dash is simpler, I looked there first.
It seems like exec.c is the place to look, and the relevant function is tryexec, which is called from shellexec, which is called whenever the shell thinks a command needs to be executed. A simplified version of the tryexec function is as follows:
STATIC void
tryexec(char *cmd, char **argv, char **envp)
{
    char *const path_bshell = _PATH_BSHELL;
repeat:
    execve(cmd, argv, envp);
    if (cmd != path_bshell && errno == ENOEXEC) {
        *argv-- = cmd;
        *argv = cmd = path_bshell;
        goto repeat;
    }
}
So, it simply always replaces the command to execute with the path to the default shell (_PATH_BSHELL defaults to "/bin/sh") if ENOEXEC occurs. There's really no magic here.
I find that FreeBSD exhibits identical behavior in bash and in its own sh.
The way bash handles this is similar but much more complicated. If you want to look into it further I recommend reading bash's execute_command.c and looking specifically at execute_shell_script and then shell_execve. The comments are quite descriptive.
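To see the same fallback outside of the shell sources, here is a minimal sketch of a test-runner that imitates it, reusing the target-script from the question (this is an illustration of the idea, not code from dash or bash):
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char *const argv[] = { "./target-script", NULL };

    execv("./target-script", argv);        /* only returns on failure */
    if (errno == ENOEXEC)                  /* no shebang, not a known binary format */
        execl("/bin/sh", "sh", "./target-script", (char *)NULL);
    perror("./target-script");
    return 127;
}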
(Looks like Sorpigal has covered it but I've already typed this up and it may be of interest.)
According to Section 3.16 of the Unix FAQ, the shell first looks at the magic number (first two bytes of the file). Some numbers indicate a binary executable; #! indicates that the rest of the line should be interpreted as a shebang. Otherwise, the shell tries to run it as a shell script.
Additionally, it seems that csh looks at the first byte, and if it's #, it'll try to run it as a csh script.

on-the-fly output redirection, seeing the file redirection output while the program is still running

If I use a command like this one:
./program >> a.txt &
and the program is a long-running one, then I can only see the output once the program has ended. That means I have no way of knowing whether the computation is going well until it actually stops computing. I want to be able to read the redirected output file while the program is still running.
This is similar to opening a file, appending to it, then closing it again after every write. If the file is only closed at the end of the program, then no data can be read from it until the program ends. The only redirection I know of behaves as if the file were closed at the end of the program.
You can test it with this little python script. The language doesn't matter. Any program that writes to standard output has the same problem.
l = range(0, 100000)
for i in l:
    if i % 1000 == 0:
        print i
    for j in l:
        s = i + j
One can run this with:
python program.py >> a.txt &
Then cat a.txt: you will only get results once the script is done computing.
From the stdout manual page:
The stream stderr is unbuffered. The stream stdout is line-buffered when it points to a terminal. Partial lines will not appear until fflush(3) or exit(3) is called, or a newline is printed.
Bottom line: Unless the output is a terminal, your program will have its standard output in fully buffered mode by default. This essentially means that it will output data in large-ish blocks, rather than line-by-line, let alone character-by-character.
Ways to work around this:
Fix your program: If you need real-time output, you need to fix your program. In C you can use fflush(stdout) after each output statement, or setvbuf() to change the buffering mode of the standard output (there is a short C sketch of this at the end of this answer). For Python there is sys.stdout.flush() or even some of the suggestions here.
Use a utility that can record from a PTY, rather than an outright stdout redirection. GNU Screen can do this for you:
screen -d -m -L python test.py
would be a start. This will log the output of your program to a file called screenlog.0 (or similar) in your current directory with a default delay of 10 seconds, and you can use screen to connect to the session where your command is running to provide input or terminate it. The delay and the name of the logfile can be changed in a configuration file or manually once you connect to the background session.
EDIT:
On most Linux systems there is a third workaround: You can use the LD_PRELOAD variable and a preloaded library to override select functions of the C library and use them to set the stdout buffering mode when those functions are called by your program. This method may work, but it has a number of disadvantages:
It won't work at all on static executables
It's fragile and rather ugly.
It won't work at all with SUID executables - the dynamic loader will refuse to read the LD_PRELOAD variable when loading such executables for security reasons.
It's fragile and rather ugly.
It requires that you find and override a library function that is called by your program after it initially sets the stdout buffering mode and preferably before any output. getenv() is a good choice for many programs, but not all. You may have to override common I/O functions such as printf() or fwrite() - if push comes to shove you may just have to override all functions that control the buffering mode and introduce a special condition for stdout.
It's fragile and rather ugly.
It's hard to ensure that there are no unwelcome side-effects. To do this right you'd have to ensure that only stdout is affected and that your overrides will not crash the rest of the program if e.g. stdout is closed.
Did I mention that it's fragile and rather ugly?
That said, the process is relatively simple. You put the replacement functions in a C file, e.g. linebufferedstdout.c:
#define _GNU_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <dlfcn.h>
char *getenv(const char *s) {
    static char *(*getenv_real)(const char *s) = NULL;
    if (getenv_real == NULL) {
        getenv_real = dlsym(RTLD_NEXT, "getenv");
        setlinebuf(stdout);
    }
    return getenv_real(s);
}
Then you compile that file as a shared object:
gcc -O2 -o linebufferedstdout.so -fpic -shared linebufferedstdout.c -ldl -lc
Then you set the LD_PRELOAD variable to load it along with your program:
$ LD_PRELOAD=./linebufferedstdout.so python test.py | tee -a test.out
0
1000
2000
3000
4000
If you are lucky, your problem will be solved with no unfortunate side-effects.
You can set the LD_PRELOAD library in the shell, if necessary, or even specify that library system-wide (definitely NOT recommended) in /etc/ld.so.preload.
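Coming back to the first option (fixing the program itself), here is a minimal C sketch of what that looks like, mirroring the Python loop from the question:
#include <stdio.h>

int main(void)
{
    /* Line-buffer stdout explicitly, even when it is redirected to a file. */
    setvbuf(stdout, NULL, _IOLBF, 0);

    for (int i = 0; i < 100000; i++) {
        if (i % 1000 == 0) {
            printf("%d\n", i);
            fflush(stdout);                /* belt and braces: push the line out now */
        }
        /* ... the real computation would go here ... */
    }
    return 0;
}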
If you're trying to modify the behavior of an existing program try stdbuf (part of coreutils starting with version 7.5 apparently).
This buffers stdout up to a line:
stdbuf -oL command > output
This disables stdout buffering altogether:
stdbuf -o0 command > output
Have you considered piping to tee?
./program | tee a.txt
However, even tee won't work if "program" doesn't write anything to stdout until it is done. So, the effectiveness depends a lot on how your program behaves.
If the program writes to a file, you can read it while it is being written using tail -f a.txt.
Your problem is that most programs check whether the output is a terminal or not. If the output is a terminal, then output is buffered one line at a time (so each line is output as it is generated), but if the output is not a terminal, then the output is buffered in larger chunks (4096 bytes at a time is typical). This behaviour is normal in the C library (when using printf, for example) and also in the C++ library (when using cout, for example), so any program written in C or C++ will do this.
Most other scripting languages (like perl, python, etc.) are written in C or C++ and so they have exactly the same buffering behaviour.
The answer above (using LD_PRELOAD) can be made to work on perl or python scripts, since the interpreters are themselves written in C.
The unbuffer command from the expect package does exactly what you are looking for.
$ sudo apt-get install expect
$ unbuffer python program.py | cat -
<watch output immediately show up here>
