wget and bash error: bash: line 0: fg: no job control

I am trying to run a series of commands in parallel through xargs. I created a null-separated list of commands in a file cmd_list.txt and then attempted to run them in parallel with 6 threads as follows:
cat cmd_list.txt | xargs -0 -P 6 -I % bash -c %
However, I get the following error:
bash: line 0: fg: no job control
I've narrowed down the problem to be related to the length of the individual commands in the command list. Here's an example artificially-long command to download an image:
mkdir a-very-long-folder-de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8
wget --no-check-certificate --no-verbose -O a-very-long-folder-de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8/blah.jpg http://d4u3lqifjlxra.cloudfront.net/uploads/example/file/48/accordion.jpg
Just running the wget command on its own, without the file list and without xargs, works fine. However, running this command at the bash command prompt (again, without the file list) fails with the no job control error:
echo "wget --no-check-certificate --no-verbose -O a-very-long-folder-de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8/blah.jpg http://d4u3lqifjlxra.cloudfront.net/uploads/example/file/48/accordion.jpg" | xargs -I % bash -c %
If I leave out the long folder name and therefore shorten the command, it works fine:
echo "wget --no-check-certificate --no-verbose -O /tmp/blah.jpg http://d4u3lqifjlxra.cloudfront.net/uploads/example/file/48/accordion.jpg" | xargs -I % bash -c %
xargs has a -s (size) parameter that can change the maximum command-line length, but I tried increasing it to preposterous sizes (e.g., 16000) without any effect. I thought the problem might be related to the length of the string passed to bash -c, but the following command also works without trouble:
bash -c "wget --no-check-certificate --no-verbose -O a-very-long-folder-de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8/blah.jpg http://d4u3lqifjlxra.cloudfront.net/uploads/example/file/48/accordion.jpg"
I understand that there are other options to run commands in parallel, such as the parallel command (https://stackoverflow.com/a/6497852/1410871), but I'm still very interested in fixing my setup or at least figuring out where it's going wrong.
I'm on Mac OS X 10.10.1 (Yosemite).

It looks like the solution is to avoid the -I parameter for xargs which, per the OS X xargs man page, has a 255-byte limit on the replacement string. When the replacement would exceed that limit, OS X xargs apparently leaves the literal % unreplaced, and bash -c % then interprets the bare % as job-control shorthand for fg, which is exactly the "fg: no job control" error above. Instead, the -J parameter is available, which does not have a 255-byte limit.
So my command would look like:
echo "wget --no-check-certificate --no-verbose -O a-very-long-folder-de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8/blah.jpg http://d4u3lqifjlxra.cloudfront.net/uploads/example/file/48/accordion.jpg" | xargs -J % bash -c %
However, in the above command, only the portion of the replacement string before the first whitespace is passed to bash, so bash tries to execute:
wget
which obviously results in an error. My solution is to ensure that xargs interprets the commands as null-delimited instead of whitespace-delimited using the -0 parameter, like so:
echo "wget --no-check-certificate --no-verbose -O a-very-long-folder-de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8de090952623b4865c2c34bd6330f8a423ed05ed8/blah.jpg http://d4u3lqifjlxra.cloudfront.net/uploads/example/file/48/accordion.jpg" | xargs -0 -J % bash -c %
and finally, this works!
Thank you to @CharlesDuffy who provided most of this insight. And no thank you to my OS X version of xargs for its poor handling of replacement strings that exceed the 255-byte limit.
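For what it's worth, here is a sketch of an alternative that sidesteps the replacement-string limit entirely: rather than substituting the command text into the bash -c string, let xargs append each NUL-delimited entry as a positional argument and eval it (this assumes, as in the setup above, that every entry in cmd_list.txt is a complete shell command):
xargs -0 -n 1 -P 6 bash -c 'eval "$1"' bash < cmd_list.txt
This avoids both the 255-byte -I limit and any quoting surprises from splicing text into the -c string.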

I suspect it's the percent symbol, with your top shell complaining.
cat cmd_list.txt | xargs -0 -P 6 -I % bash -c %
Percent is a metacharacter for job control, e.g. "fg %2" or "kill %4".
Try escaping the percents with a backslash to signal to the top shell that it should not try to interpret the percent, and xargs should be handed a literal percent character.
cat cmd_list.txt | xargs -0 -P 6 -I \% bash -c \%

Related

How do I make my bash script on download automatically turn into a terminal command? [duplicate]

Say I have a file at the URL http://mywebsite.example/myscript.txt that contains a script:
#!/bin/bash
echo "Hello, world!"
read -p "What is your name? " name
echo "Hello, ${name}!"
And I'd like to run this script without first saving it to a file. How do I do this?
Now, I've seen the syntax:
bash < <(curl -s http://mywebsite.example/myscript.txt)
But this doesn't seem to work the way it would if I saved it to a file and then executed it. For example, readline doesn't work, and the output is just:
$ bash < <(curl -s http://mywebsite.example/myscript.txt)
Hello, world!
Similarly, I've tried:
curl -s http://mywebsite.example/myscript.txt | bash -s --
With the same results.
Originally I had a solution like:
timestamp=`date +%Y%m%d%H%M%S`
curl -s http://mywebsite.example/myscript.txt -o /tmp/.myscript.${timestamp}.tmp
bash /tmp/.myscript.${timestamp}.tmp
rm -f /tmp/.myscript.${timestamp}.tmp
But this seems sloppy, and I'd like a more elegant solution.
I'm aware of the security issues regarding running a shell script from a URL, but let's ignore all of that for right now.
source <(curl -s http://mywebsite.example/myscript.txt)
ought to do it. Alternatively, leave off the initial redirection on yours, which is redirecting standard input; bash takes a filename to execute just fine without redirection, and the <(command) syntax provides a path.
bash <(curl -s http://mywebsite.example/myscript.txt)
It may be clearer if you look at the output of echo <(cat /dev/null)
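On Linux, for example, that prints the /dev/fd path bash substitutes for the process (the exact descriptor number may vary):
$ echo <(cat /dev/null)
/dev/fd/63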
This is how to execute a remote script while passing it some arguments (arg1 arg2):
curl -s http://server/path/script.sh | bash /dev/stdin arg1 arg2
For bash, Bourne shell and fish:
curl -s http://server/path/script.sh | bash -s arg1 arg2
Flag "-s" makes shell read from stdin.
Use:
curl -s -L URL_TO_SCRIPT_HERE | bash
For example:
curl -s -L http://bitly/10hA8iC | bash
Using wget, which is usually part of a default system installation:
bash <(wget -qO- http://mywebsite.example/myscript.txt)
You can also do this:
wget -O - https://raw.github.com/luismartingil/commands/master/101_remote2local_wireshark.sh | bash
The best way to do it is
curl http://domain/path/to/script.sh | bash -s arg1 arg2
which is a slight change of the answer by @user77115
You can use curl and send it to bash like this:
bash <(curl -s http://mywebsite.example/myscript.txt)
I often find the following is enough:
curl -s http://mywebsite.example/myscript.txt | sh
But on an old system (kernel 2.4) it ran into problems; I tried many other approaches, and only the following worked:
curl -s http://mywebsite.example/myscript.txt -o a.sh && sh a.sh && rm -f a.sh
Examples
$ curl -s someurl | sh
Starting to insert crontab
sh: _name}.sh: command not found
sh: line 208: syntax error near unexpected token `then'
sh: line 208: ` -eq 0 ]]; then'
$
The problem may be caused by a slow network, or by a bash version too old to handle a slow network gracefully.
However, the following solves the problem
$ curl -s someurl -o a.sh && sh a.sh && rm -f a.sh
Starting to insert crontab
Insert crontab entry is ok.
Insert crontab is done.
okay
$
Also:
curl -sL https://.... | sudo bash -
Just combining amra and user77115's answers:
wget -qO- https://raw.githubusercontent.com/lingtalfi/TheScientist/master/_bb_autoload/bbstart.sh | bash -s -- -v -v
It executes the remote script bbstart.sh, passing it the -v -v options.
In some unattended scripts I use the following command:
sh -c "$(curl -fsSL <URL>)"
I recommend avoiding executing scripts directly from URLs. You should be sure the URL is safe, and you should check the content of the script before executing; you can use a SHA256 checksum to validate the file before running it.
Instead of executing the script directly, first download it and then execute it:
SOURCE='https://gist.githubusercontent.com/cci-emciftci/123123/raw/123123/sample.sh'
curl $SOURCE -o ./my_sample.sh
chmod +x my_sample.sh
./my_sample.sh
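To add the checksum validation mentioned above, here is a minimal sketch using sha256sum from GNU coreutils (the expected hash is a placeholder you would pin in advance from a trusted copy; on macOS, use shasum -a 256 -c instead):
EXPECTED='<known-good-sha256-hash>'
curl $SOURCE -o ./my_sample.sh
echo "${EXPECTED}  my_sample.sh" | sha256sum -c - && bash my_sample.sh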
This way is good and conventional:
17:04:59@itqx|~
qx>source <(curl -Ls http://192.168.80.154/cent74/just4Test) Lord Jesus Loves YOU
Remote script test...
Param size: 4
---------
17:19:31@node7|/var/www/html/cent74
arch>cat just4Test
echo Remote script test...
echo Param size: $#
If you want the script run using the current shell, regardless of what it is, use:
${SHELL:-sh} -c "$(wget -qO - http://mywebsite.example/myscript.txt)"
if you have wget, or:
${SHELL:-sh} -c "$(curl -Ls http://mywebsite.example/myscript.txt)"
if you have curl.
This command will still work if the script is interactive, i.e., if it asks the user for input: since the script text is passed as an argument rather than on stdin, standard input stays attached to the terminal and read works as expected.
Note: OpenWRT has a wget clone but not curl, by default.
bash | curl http://your.url.here/script.txt
actual example:
juan@juan-MS-7808:~$ bash | curl https://raw.githubusercontent.com/JPHACKER2k18/markwe/master/testapp.sh
Oh, wow im alive
juan@juan-MS-7808:~$

How to store output for every xargs instance separately

cat domains.txt | xargs -P10 -I % ffuf -u %/FUZZ -w wordlist.txt -o output.json
ffuf is used for directory and file bruteforcing, while domains.txt contains valid HTTP and HTTPS URLs like http://example.com, http://example2.com. I used xargs to speed up the process by running 10 parallel instances. But the problem here is that I am unable to store the output of each instance separately; output.json gets overwritten by every running instance. Is there anything we can do to make output.json unique for every instance, so that all the data gets saved separately? I tried ffuf/$(date '+%s').json instead, but it didn't work either.
Sure. Just name your output file using the domain. E.g.:
xargs -P10 -I % ffuf -u %/FUZZ -w wordlist.txt -o output-%.json < domains.txt
(I dropped cat because it was unnecessary.)
I missed the fact that your domains.txt file is actually a list of URLs rather than a list of domain names. I think the easiest fix is just to simplify domains.txt to be just domain names, but you could also try something like:
xargs -P10 -I % sh -c 'domain="%"; ffuf -u %/FUZZ -w wordlist.txt -o output-${domain##*/}.json' < domains.txt
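A variant sketch that passes the URL as a positional argument instead of splicing % into the shell string (which is fragile if a URL ever contains quotes); here ${url#*//} strips the scheme so the output file name stays valid:
xargs -P10 -n1 sh -c 'url=$1; ffuf -u "$url/FUZZ" -w wordlist.txt -o "output-${url#*//}.json"' sh < domains.txt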
cat domains.txt | xargs -P10 -I % sh -c "ping % > output.json.%"
Like this, your "%" can be part of the file name. (I changed your command to ping for my testing.)
So maybe something more like this:
cat domains.txt | xargs -P10 -I % sh -c "ffuf -u %/FUZZ -w wordlist.txt -o output.json.%
"
I would replace your ffuf command with the following script and call that from the xargs command. It replaces the characters that are invalid in file names (":" and "/") with a dot, then runs the command:
#!/usr/bin/bash
# Take the URL as the first argument and build a safe output file name from it.
URL=$1
# Replace runs of ":" and "/" (e.g. "://") with a single dot
FILE=$(echo "$URL" | sed 's|[:/][:/]*|.|g')
ffuf -u "${URL}/FUZZ" -w wordlist.txt -o "output-${FILE}.json"
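Assuming the script is saved as run_ffuf.sh (a name chosen here for illustration) and made executable, the xargs invocation becomes:
chmod +x run_ffuf.sh
xargs -P10 -n1 ./run_ffuf.sh < domains.txt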

How to use wget spider to identify broken URLs from a list of URLs and save the broken ones

I am trying to write a shell script to identify broken urls from a list of urls.
Here is an input_url.csv sample:
https://www.google.com/
https://www.nbc.com
https://www.google.com.hksjkhkh/
https://www.google.co.jp/
https://www.google.ca/
Here is what I have which works:
wget --spider -nd -nv -H --max-redirect 0 -o run.log -i input_url.csv
and this gives me '2019-09-03 19:48:37 URL: https://www.nbc.com 200 OK' for valid URLs, and for broken ones it gives me '0 redirections exceeded.'
What I expect is to save only the broken links into my output file.
sample expect output:
https://www.google.com.hksjkhkh/
I think I would go with:
<input.csv xargs -n1 -P10 sh -c 'wget --spider --quiet "$1" || echo "$1"' --
You can use the -P <count> option to xargs to run count processes in parallel.
xargs runs the command sh -c '....' -- for each line of the input file, appending the input line as the argument to the script.
Then sh inside runs wget ... "$1". The || checks whether the return status is nonzero, which means failure; on wget failure, echo "$1" is executed.
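To actually save the broken URLs into a file, as the question asks, redirect the collected output (broken.txt is an assumed name):
<input.csv xargs -n1 -P10 sh -c 'wget --spider --quiet "$1" || echo "$1"' -- > broken.txt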
You could filter the output of wget -nd -nv and then regex the output, like:
wget --spider -nd -nv -H --max-redirect 0 -i input 2>&1 | grep -v '200 OK' | grep 'unable' | sed 's/.* .//; s/.$//'
but this does not look extensible, and it is not parallel, so it is probably slower and probably not worth the hassle.

linux strace: How to filter system calls that take more than a second

I'm using "strace -f -T -tt -o foo.txt -p 1234" to print the time spent in system calls. This makes the output file huge, is there a way to just print the system calls that took greater than 1second. I can grep it out from the file later, but is there a better way?
If we simply omit the -o foo.txt argument, the output goes to standard error. We can redirect that into a pipe through grep and on to the file:
strace -f -T -tt -p 1234 2>&1 | grep pattern > foo.txt
To watch the output at the same time:
strace -f -T -tt -p 1234 2>&1 | grep pattern | tee foo.txt
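For this question specifically, the grep pattern can match the <seconds> suffix that -T appends to each line, keeping only durations with a nonzero integer part (a rough sketch; it deliberately drops everything under one second):
strace -f -T -tt -p 1234 2>&1 | grep -E '<[1-9][0-9]*\.[0-9]+>' > foo.txt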
If a command prints only to a file that is passed as an argument, and we want to filter/redirect its output, the first step is to check whether it implements the dash convention: can you specify standard input or output using - as a filename argument:
some_command - | our_pipe > file.txt
If not, then the recourse is to use Bash process substitution syntax: >(output command) and <(input command):
some_command >(our_pipe > file.txt)
The process substitution syntax expands into a token that is suitable as a filename argument for a command or function. When the program opens that token, it gets a file descriptor to the command's input or output, depending on direction.
With process substitution, we can redirect the input or output of stubborn programs that work only with files passed by name as arguments, and that do not support any convention for requesting that standard input or output be used in place of a file.
The token used by process substitution is platform-dependent; we can see what it is using echo. For instance on GNU/Linux, Bash takes advantage of the /dev/fd operating system feature:
$ echo <(ls -l)
/dev/fd/63
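Putting this together for the original question, here is a sketch that keeps strace's -o argument but hands it a process substitution instead of a plain file (pattern is a placeholder; this must run under bash, since process substitution is not POSIX sh):
strace -f -T -tt -o >(grep pattern > foo.txt) -p 1234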
You can use the following command:
strace -T command 2>&1 >/dev/null | awk '{gsub(/[<>]/,"",$NF)}$NF+.0>1.0'
Explanation:
strace -T adds the time spent in the syscall at the end of the line, enclosed in <...>
2>&1 >/dev/null | awk pipes stderr to awk and discards stdout. (strace writes its output to stderr!)
The awk command removes the <> from the last field $NF and prints lines where the time spent is higher than a second.
You'll probably also want to pass the threshold as a variable to the awk command:
strace -T command 2>&1 >/dev/null \
| awk -v thres=0.001 '{gsub(/[<>]/,"",$NF)}$NF+.0>thres+.0'

Passing arguments to a script invoked with bash -c

I'm testing a Bash script I created on GitHub for behavioral correctness (e.g. that it parses options correctly). I want to do this without having to clone the repository locally, so here is how I'm doing it:
curl -sSL https://github.com/jamesqo/gid/raw/master/gid | xargs -0 bash -c
My question is, how can I pass arguments to the script in question? I tried bash -c --help, but that didn't work since it got interpreted as part of the script.
Thanks!
You’re actually over-complicating things by using xargs with Bash’s -c option.
Download the script directly
You don’t need to clone the repository to run the script. Just download it directly:
curl -o gid https://raw.githubusercontent.com/jamesqo/gid/master/gid
Now that it’s downloaded as gid, you can run it as a Bash script, e.g.,
bash gid --help
You can also make the downloaded script executable in order to run it as a regular Unix script file (using its shebang, #!/bin/bash):
chmod +x gid
./gid --help
Use process substitution
If you wanted to run the script without actually saving it to a file, you could use Bash process substitution:
bash <(curl -sSL https://github.com/jamesqo/gid/raw/master/gid) --help
I'll echo Anthony's comments - it makes a lot more sense to download the script and execute it directly, but if you're really set on using the -c option for bash, it's a little bit complicated. The problem is that when you do:
something | xargs -0 bash -c
there's no opportunity to pass any arguments. They all get swallowed as the argument to -c - it essentially gets turned into:
bash -c "$(something)"
so if you place extra arguments after the -c in the xargs invocation, they end up before the something. There is no opportunity to put anything after something, as xargs doesn't let you.
If you want to pass arguments, you have to use the substitution option for xargs, which lets you position where the argument goes. The option is -I <replstr>, and the next thing to realize is that the first argument will become $0, so you have to do:
something | xargs -0 -I # bash -c # something <arg1> <arg2>…
I can emulate this with:
echo 'echo hi: ~$0~ ~$1~ ~$2~ ~$3~' | xargs -0 -I # bash -c # something one two three four
which yields:
hi: ~something~ ~one~ ~two~ ~three~
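Alternatively, here is a sketch that avoids xargs entirely by reusing the bash -s idiom from the earlier answers, with -- marking the end of bash's own options so that --help reaches the script:
curl -sSL https://github.com/jamesqo/gid/raw/master/gid | bash -s -- --help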
