Making symlinks with xargs from the output of a command - linux

I'm trying to make a symlink whose name is taken from the output of a previous command, but I get the error "is not a directory".
For each file in a folder I want to create a symlink named <hash>.0. The following snippet shows what I run, here against an example file 213123.0:
for x in *; do openssl x509 -noout -hash -in $x|xargs ln -s $x {} ; done;
Returned:
"ln: target ‘b28afb7c’ is not a directory"
How can I do this correctly?

xargs is not find; you do not need {} to tell xargs where to put the argument, it just appends it at the end. Drop the {} argument and your command will work.
Using the -t option of xargs, which makes it show the command it is about to run, would have revealed this for you.
It should also be pointed out that openssl (at least in some versions) ships a c_rehash perl script that does this for you and handles corner cases that naive attempts will not (such as duplicated certificate files and duplicate hash results). Additionally, your snippet doesn't actually append the .0 you said you wanted.
You cannot use xargs to do what you want here, as it does not give you enough control over the placement of the argument to build the hash.0 filename you want. That said, xargs is entirely unnecessary here, as you only have a single piece of output to deal with.
Either use hash=$(openssl ... "$x"); ln -s "$x" "${hash}.0", or drop the variable entirely and use ln -s "$x" "$(openssl ... "$x").0".
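Putting that together with your original loop, a minimal sketch (filenames quoted, .0 suffix appended; note it does not handle the duplicate-hash corner cases that c_rehash does) would be:
for x in *; do
    ln -s "$x" "$(openssl x509 -noout -hash -in "$x").0"
done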

If you have GNU Parallel installed:
parallel 'ln -s {} $(openssl x509 -noout -hash -in {}).0' ::: *
If it is not packaged for your system, this should install it in 10 seconds:
(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash
To learn more: Watch the intro video for a quick introduction:
https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
Walk through the tutorial (man parallel_tutorial). Your command line will love you for it.

Related

Creating compressed tar file, with only subset of files, remotely over SSH

I've successfully managed to transfer a tar file over SSH on stdout from a remote system, creating a compressed file locally, by doing something like this:
read -s sudopass
ssh me@remote "echo $sudopass | sudo -S tar cf - '/dir'" 2>/dev/null | XZ_OPT='-6 -T0 -v' xz > dir.tar.xz
As expected this gets me a dir.tar.xz locally which is all of the remote /dir compressed.
I've also managed to figure out how to compress only a subset of files locally, by passing a file list to tar with -T on STDIN:
find '/dir' -name '*.log' | XZ_OPT='-6 -T0 -v' tar cJvf /root/logs.txz -T -
My main question is: how would I go about doing the first thing (transfer the plain tar remotely, then compress locally) while at the same time telling tar that I only want it to include a specific subset of files?
When I try combining the two:
ssh me@remote "echo $sudopass | sudo -S find '/dir' -name '*.log' | tar cf
-T -" | XZ_OPT='-6 -T0 -v' xz > cypress_logs.tar.xz
I get errors like:
tar: -: Cannot stat: No such file or directory
I feel like tar isn't liking the fact that I'm both passing it something on STDIN as well as expecting it to output to STDOUT. Adding another - didn't seem to help either.
Also, as a bonus question, if anyone has a better idea on how to pass $sudopass above that would be great, since this method -- while avoiding having the password in the bash history -- makes the sudo password show up in the process list while it's running.
Remember that the f option requires an argument, so when you write cf -T -, I suspect that the -T is getting consumed as the argument to f, which throws off the rest of the command line.
This works for me:
ssh me@remote "echo $password | sudo -S find /tmp/dir -name '*.log' | tar -cf- -T-"
You could also write it like this:
ssh me@remote "echo $password | sudo -S find /tmp/dir -name '*.log' | tar cf - -T-"
But I prefer to always use - for options, rather than legacy tar's weird options without any prefix.
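Putting that together with the xz step from the question, the full pipeline would look something like this (untested sketch; the same caveat about the password being visible in the process list still applies):
ssh me@remote "echo $sudopass | sudo -S find '/dir' -name '*.log' | tar -cf- -T-" | XZ_OPT='-6 -T0 -v' xz > cypress_logs.tar.xz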

Passing arguments to a script invoked with bash -c

I'm testing a Bash script I created on GitHub for behavioral correctness (e.g. that it parses options correctly). I want to do this without having to clone the repository locally, so here is how I'm doing it:
curl -sSL https://github.com/jamesqo/gid/raw/master/gid | xargs -0 bash -c
My question is, how can I pass arguments to the script in question? I tried bash -c --help, but that didn't work since it got interpreted as part of the script.
Thanks!
You’re actually over-complicating things by using xargs with Bash’s -c option.
Download the script directly
You don’t need to clone the repository to run the script. Just download it directly:
curl -o gid https://raw.githubusercontent.com/jamesqo/gid/master/gid
Now that it’s downloaded as gid, you can run it as a Bash script, e.g.,
bash gid --help
You can also make the downloaded script executable in order to run it as a regular Unix script file (using its shebang, #!/bin/bash):
chmod +x gid
./gid --help
Use process substitution
If you wanted to run the script without actually saving it to a file, you could use Bash process substitution:
bash <(curl -sSL https://github.com/jamesqo/gid/raw/master/gid) --help
I'll echo Anthony's comments: it makes a lot more sense to download the script and execute it directly. But if you're really set on using the -c option for bash, it's a little bit complicated. The problem is that when you do:
something | xargs -0 bash -c
there's no opportunity to pass any arguments: the entire input gets swallowed as the argument to -c. It essentially gets turned into:
bash -c "$(something)"
so if you place arguments after the -c in the xargs command, they end up before the script text, and there is no way to put anything after the script, because xargs always appends its input at the end.
If you want to pass arguments, you have to use the replacement option of xargs, which lets you choose where the argument goes. The option is -I <replstr>, and the next thing to realize is that the first argument after the script becomes $0, so you have to do:
something | xargs -0 -I # bash -c # something <arg1> <arg2>…
I can emulate this with:
echo 'echo hi: ~$0~ ~$1~ ~$2~ ~$3~' | xargs -0 -I # bash -c # something one two three four
which yields:
hi: ~something~ ~one~ ~two~ ~three~
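Applied to the original curl pipeline, an untested sketch would be:
curl -sSL https://github.com/jamesqo/gid/raw/master/gid | xargs -0 -I # bash -c # gid --help
Here gid is just an arbitrary value for $0 and --help arrives as $1 inside the script; some xargs implementations limit the size of the replacement text, so downloading the file first is still the more robust option.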

What does `!:-` do?

I am very new to bash scripting and to the Ubuntu/Debian package system.
Today I am studying the contents of a preinst file, the script that runs before the package is unpacked from its Debian archive (.deb) file.
My first question is about a line containing this:
!:-
Probably it is a stupid question, but using Google I can't find an answer.
!:- inserts the last command without its last argument (bash history expansion). Given
/usr/sbin/ab2 -f TLS1 -S -n 1000 -c 100 -t 2 http://www.google.com/
then
!:- http://www.stackoverflow.com/
is the same as
/usr/sbin/ab2 -f TLS1 -S -n 1000 -c 100 -t 2 http://www.stackoverflow.com/

Parallel download using Curl command line utility

I want to download some pages from a website, and I did that successfully using curl, but I was wondering whether curl can download multiple pages at a time, the way most download managers do; it would speed things up a bit. Is this possible with the curl command line utility?
The current command I am using is
curl 'http://www...../?page=[1-10]' 2>&1 > 1.html
Here I am downloading pages from 1 to 10 and storing them in a file named 1.html.
Also, is it possible for curl to write the output of each URL to a separate file, say URL.html, where URL is the actual URL of the page being processed?
My answer is a bit late, but I believe all of the existing answers fall just a little short. The way I do things like this is with xargs, which is capable of running a specified number of commands in subprocesses.
The one-liner I would use is, simply:
$ seq 1 10 | xargs -n1 -P2 bash -c 'i=$0; url="http://example.com/?page${i}.html"; curl -O -s $url'
This warrants some explanation. The use of -n 1 instructs xargs to process a single input argument at a time. In this example, the numbers 1 ... 10 are each processed separately. And -P 2 tells xargs to keep 2 subprocesses running all the time, each one handling a single argument, until all of the input arguments have been processed.
You can think of this as MapReduce in the shell. Or perhaps just the Map phase. Regardless, it's an effective way to get a lot of work done while ensuring that you don't fork bomb your machine. It's possible to do something similar with a for loop in the shell, but then you end up doing the process management yourself, which starts to seem pretty pointless once you realize how insanely great this use of xargs is.
Update: I suspect that my example with xargs could be improved (at least on Mac OS X and BSD with the -J flag). With GNU Parallel, the command is a bit less unwieldy as well:
parallel --jobs 2 curl -O -s http://example.com/?page{}.html ::: {1..10}
Well, curl is just a simple UNIX process. You can have as many of these curl processes running in parallel and sending their outputs to different files.
curl can use the filename part of the URL to generate the local file. Just use the -O option (man curl for details).
You could use something like the following
urls="http://example.com/?page1.html http://example.com?page2.html" # add more URLs here
for url in $urls; do
    # run the curl job in the background so we can start another job,
    # and disable the progress bar (-s)
    echo "fetching $url"
    curl "$url" -O -s &
done
wait # wait for all background jobs to terminate
As of 7.66.0, the curl utility finally has built-in support for parallel downloads of multiple URLs within a single non-blocking process, which should be much faster and more resource-efficient compared to xargs and background spawning, in most cases:
curl -Z 'http://httpbin.org/anything/[1-9].{txt,html}' -o '#1.#2'
This will download 18 links in parallel and write them out to 18 different files, also in parallel. The official announcement of this feature from Daniel Stenberg is here: https://daniel.haxx.se/blog/2019/07/22/curl-goez-parallel/
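Applied to the command from the question, that would look roughly like this (the #1 output variable expands to the current value from the [1-10] range, so each page lands in its own file):
curl -Z 'http://www...../?page=[1-10]' -o '#1.html'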
For launching parallel commands, why not use the venerable make command line utility? It supports parallel execution and dependency tracking and whatnot.
How? In the directory where you are downloading the files, create a new file called Makefile with the following contents:
# which page numbers to fetch
numbers := $(shell seq 1 10)
# default target which depends on files 1.html .. 10.html
# (patsubst replaces % with %.html for each number)
all: $(patsubst %,%.html,$(numbers))
# the rule which tells how to generate a %.html dependency
# $@ is the target filename e.g. 1.html
%.html:
	curl -C - 'http://www...../?page='$(patsubst %.html,%,$@) -o $@.tmp
	mv $@.tmp $@
NOTE The last two lines should start with a TAB character (instead of 8 spaces) or make will not accept the file.
Now you just run:
make -k -j 5
The curl command I used stores the output in 1.html.tmp, and only if the curl command succeeds does it get renamed to 1.html (by the mv command on the next line). Thus if some download fails, you can just re-run the same make command and it will resume/retry downloading the files that failed the first time. Once all files have been successfully downloaded, make will report that there is nothing more to be done, so there is no harm in running it one extra time to be "safe".
(The -k switch tells make to keep downloading the rest of the files even if one single download should fail.)
Curl can also accelerate a download of a file by splitting it into parts:
$ man curl |grep -A2 '\--range'
-r/--range <range>
    (HTTP/FTP/SFTP/FILE) Retrieve a byte range (i.e a partial document)
    from a HTTP/1.1, FTP or SFTP server or a local FILE.
Here is a script that will automatically launch curl with the desired number of concurrent processes: https://github.com/axelabs/splitcurl
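A bare-bones manual version of the same idea might look like the following (hypothetical URL and byte boundary; it assumes the server honours range requests and that you know roughly where to split):
url='http://example.com/big.iso'     # hypothetical URL
curl -r 0-4999999 -o part1 "$url" &  # first ~5 MB
curl -r 5000000-  -o part2 "$url" &  # the rest
wait
cat part1 part2 > big.iso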
Starting from 7.68.0, curl can fetch several URLs in parallel. This example will fetch the URLs listed in the urls.txt file using 3 parallel connections:
curl --parallel --parallel-immediate --parallel-max 3 --config urls.txt
urls.txt:
url = "example1.com"
output = "example1.html"
url = "example2.com"
output = "example2.html"
url = "example3.com"
output = "example3.html"
url = "example4.com"
output = "example4.html"
url = "example5.com"
output = "example5.html"
curl and wget cannot download a single file in parallel chunks, but there are alternatives:
aria2 (written in C++, available in Deb and Cygwin repos)
aria2c -x 5 <url>
axel (written in C, available in Deb repo)
axel -n 5 <url>
wget2 (written in C, available in Deb repo)
wget2 --max-threads=5 <url>
lftp (written in C++, available in Deb repo)
lftp -n 5 <url>
hget (written in Go)
hget -n 5 <url>
pget (written in Go)
pget -p 5 <url>
Running a limited number of processes is easy if your system has commands like pidof or pgrep which, given a process name, return the PIDs (counting them tells you how many are running).
Something like this:
#!/bin/sh
max=4
running_curl() {
    set -- $(pidof curl)
    echo $#
}
while [ $# -gt 0 ]; do
    while [ $(running_curl) -ge $max ]; do
        sleep 1
    done
    curl "$1" --create-dirs -o "${1##*://}" &
    shift
done
to call like this:
script.sh $(for i in `seq 1 10`; do printf "http://example/%s.html " "$i"; done)
The curl line of the script is untested.
I came up with a solution based on fmt and xargs. The idea is to specify multiple URLs inside braces, http://example.com/page{1,2,3}.html, and run them in parallel with xargs. The following would start downloads in 3 processes:
seq 1 50 | fmt -w40 | tr ' ' ',' \
| awk -v url="http://example.com/" '{print url "page{" $1 "}.html"}' \
| xargs -P3 -n1 curl -o
This generates 4 lines of downloadable URLs which are sent to xargs:
curl -o http://example.com/page{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16}.html
curl -o http://example.com/page{17,18,19,20,21,22,23,24,25,26,27,28,29}.html
curl -o http://example.com/page{30,31,32,33,34,35,36,37,38,39,40,41,42}.html
curl -o http://example.com/page{43,44,45,46,47,48,49,50}.html
Bash 3 or above lets you populate an array with multiple values as it expands sequence expressions:
$ urls=( "" http://example.com?page={1..4} )
$ unset urls[0]
Note the empty [0] value, which was provided as shorthand to make the indices line up with page numbers, since bash arrays number from zero. This strategy obviously might not always work. Anyway, you can just unset it, as in this example.
Now you have an array, and you can verify its contents with declare -p:
$ declare -p urls
declare -a urls=([1]="http://example.com?page=1" [2]="http://example.com?page=2" [3]="http://example.com?page=3" [4]="http://example.com?page=4")
Now that you have a list of URLs in an array, expand the array into a curl command line:
$ curl $(for i in ${!urls[@]}; do echo "-o $i.html ${urls[$i]}"; done)
The curl command can take multiple URLs and fetch all of them, recycling the existing connection (HTTP/1.1) to a common server, but it needs the -o option before each one in order to download and save each target. Note that characters within some URLs may need to be escaped to avoid interacting with your shell.
I am not sure about curl, but you can do that using wget.
wget \
--recursive \
--no-clobber \
--page-requisites \
--html-extension \
--convert-links \
--restrict-file-names=windows \
--domains website.org \
--no-parent \
www.website.org/tutorials/html/

Wget Output in Recursive mode

I am using wget -r to download 3 .zip files from a specified webpage. Here is what I have so far:
wget -r -nd -l1 -A.zip http://www.website.com/example
Right now, the zip files all begin with abc_*.zip, where * seems to be random. I want the first downloaded file to be called xyz_1.zip, the second xyz_2.zip, and the third xyz_3.zip.
Is this possible with wget?
Many thanks!
I don't think it's possible with wget alone. After downloading you could use some simple shell scripting to rename the files, like:
i=1; for f in abc_*.zip; do mv "$f" "xyz_$i.zip"; i=$(($i+1)); done
Try to get a listing first and then download each file separately.
let n=1
wget -nv -l1 -r --spider http://www.website.com/example 2>&1 | \
    egrep -io 'http://.*\.zip' | \
    while read url; do
        wget -nd -nv -O $(echo $url | sed 's%^.*/\(.*\)_.*$%\1%')_$n.zip "$url"
        let n++
    done
I don't think there is a way you can do it within a single wget command.
wget does have a -O option which you can use to tell it which file to output to, but it won't work in your case because multiple files will get concatenated together.
You will have to write a script which renames the files from abc_*.zip to xyz_*.zip after wget has completed.
Alternatively, invoke wget for one zip file at a time and use the -O option.
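A minimal sketch of that alternative, assuming you have already collected the zip URLs in a file (urls.txt here is hypothetical, e.g. produced by the --spider listing shown above):
n=1
while read -r url; do
    wget -nv -O "xyz_$n.zip" "$url"
    n=$((n+1))
done < urls.txt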
