Use mget to download files in Bash - linux

I am creating a bash script on CentOS 7 and I want mget to download the files that are named in a file. How can I do this?
I tried these commands, where "prueba" is the file that contains the filenames:
mget prueba
mget prueba/*
Thank you for your help.

Are you talking about this mget? If so, it's not directly possible to use this utility to download a list of URLs you specify in a file.
You can however use xargs to simulate the same effect:
xargs -n 1 -a prueba mget
This would effectively call mget for each line in the file you specify (e.g. prueba).
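As a rough sketch, if prueba contained the (hypothetical) lines ftp://example.com/file1.dat and ftp://example.com/file2.dat, the call above would behave like running:
mget ftp://example.com/file1.dat
mget ftp://example.com/file2.dat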

This should solve your problem:
xargs -n 1 -P 8 -a prueba wget
-a Use file as input
-n1 Use one argument at a time
-P8 Use up to 8 processes (no need to use mget, since xargs handles the parallel downloads)
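For example, assuming prueba lists one URL per line (the URLs below are made up for illustration):
printf '%s\n' http://example.com/a.iso http://example.com/b.iso > prueba
xargs -n 1 -P 8 -a prueba wget
This downloads both files, running up to 8 wget processes in parallel.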

Related

How to understand this shell script?

cat urls.txt | xargs -P 10 -n 1 wget -nH -nc -x]
This shell command is very confusing to a new user; is there any reference document I can refer to?
There is nothing much confusing about it.
If you want to know what the commands do then use the manual.
man cat
man xargs
The pipe sends the output of one command to the next, in this case cat urls.txt to xargs.
cat urls.txt will write the contents of the file urls.txt to stdout, which is then used as the input for xargs.
xargs -P 10 -n 1 will execute a command with the input (the contents of urls.txt) as arguments. The command in this case being wget -nH -nc -x]. I don't know what ] is supposed to do there, but that's probably a typo.
All in all you can understand, without caring much about the options, that this will download the list of files in urls.txt into your current directory. Of course it's always wise to check the option flags; in this case -nc (--no-clobber), for example, causes wget to skip the download if the file already exists in the directory instead of saving another copy under a numbered name.
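As a rough sketch of what happens (the URLs are made up), if urls.txt contains:
http://example.com/a.html
http://example.com/b.html
then xargs runs, up to 10 at a time:
wget -nH -nc -x http://example.com/a.html
wget -nH -nc -x http://example.com/b.html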
All three man pages can also be found online:
cat
xargs
wget
You can follow this book: https://www.iiitd.edu.in/~amarjeet/Files/SM2012/Linux%20Dummies%209th.pdf
The best way to learn a Linux command is to use the man command.
For example, type man xargs in a terminal and you will get all the details.
There is a man page for every Linux command.
Another good resource is https://explainshell.com

Pass each file obtained from a command to another command as a parameter

I am using the following line to take a pdf and split it:
pdfseparate -f 14 -l 23 ALF.SS.0.pdf "${FILE}"-%d.pdf
Now I want for each file produced, to run several commands like this:
pdfcrop --margins '-30 0 -385 0' outputOfpdfSeparate outputOfpdfSeparate-1stCol.pdf
I am trying to figure out the best way to do this:
With a single loop: for each file created by pdfseparate, if I managed to "know" the name of the file, I could pass it to pdfcrop and be done. But since it uses %d, I do not know how to handle this "new name" in which each file gets a new number. I know how to do this in Java, but here I do not see it so clearly.
Using pipes. I think I have the same issue, since if I do
pdfseparate [options] | pdfcrop inputfile outputfile,
I do not know how to "use" the name of inputfile. I am sure it is easy but I don't see it.
Using xargs. I am studying this command since it is new for me.
Using exec. I am under the impression this is not necessary but maybe I am wrong since it's been a long while since I last used exec.
Thanks in advance.
You can use xargs. It is the best way in terms of speed.
I usually use it for converting a lot of .mp4 file to .mp3.
Doing this conversion one by one is not only tedious but also takes a long time. Therefore you can use xargs' automatic parallelism via the -P 0 option.
For example, if I had 10 .mp4 files I would do this:
ls *.mp4 | xargs -I xxx -P 0 ffmpeg -i xxx xxx.mp3
After running this line, 10 ffmpeg commands run simultaneously.
The other way to do this is to store the list of .mp4 files in a text file like this:
ls *.mp4 > list-mp4
then:
xargs -I xxx -P 0 ffmpeg -i xxx xxx.mp3 < list-mp4
Or you may have access to GNU parallel, in which case you can do:
parallel ffmpeg -i {} {}.mp3 ::: *.mp4
Now for your case: if you want to use these (xargs or parallel) or your own command, note that your first command must send its output to stdout, because the second command is going to read its stdin from the stdout of the first command; bash wires this up for you.
So you can only use a pipe (|) with pdfseparate if it sends its output to stdout. If it does not, the right-hand side of the pipe (the second command) has nothing to work with; and conversely, the second command must be able to read its stdin from the incoming stdout.
For example
ls *.txt | echo {}
here echo does not read any incoming stdout from the ls command and just prints {}
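By contrast, with xargs the placeholder is actually substituted with each incoming line:
ls *.txt | xargs -I {} echo {}
This prints each .txt filename on its own line.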
Ultimately, your pdfseparate would have to send its output to stdout. xargs then stores each line in whatever placeholder you name with -I and calls your second command.
Therefore:
pdfseparate options... | xargs -I ABC -P 0 your-second-command+its-options ABC
NOTE 1: xargs stores the incoming stdout line by line in ABC and passes it to your second command as its argument.
NOTE 2: you do not have to use -P 0 at all; it is just for speeding up execution. You can omit it, but then your second command runs sequentially, one invocation per incoming line.
pdfseparate does not print the names of the files it creates, so you have to use a glob (or the "ls" command) to get the list of files you want to operate on.
# separate the pdfs
pdfseparate -f 14 -l 23 ALF.SS.0.pdf "${FILE}"-%d.pdf
# operate on the just created files, assumes that a "FILE" variable is set, which might not be the case
for i in "${FILE}"-*.pdf; do pdfcrop --margins '-30 0 -385 0' "$i"; done
# assuming that FILE variable in your case would match ALF.SS.0-[0-9]*.pdf, you'd use this:
for i in ALF.SS.0-[0-9]*.pdf; do pdfcrop --margins '-30 0 -385 0' "$i"; done
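If you prefer xargs over a shell loop, a minimal sketch under the same naming assumption (the -1stCol output suffix is taken from the question) would be:
printf '%s\n' ALF.SS.0-[0-9]*.pdf | xargs -I ABC pdfcrop --margins '-30 0 -385 0' ABC ABC-1stCol.pdf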

how to copy file to multiple sub directories linux

I have a file that needs to be copied to every directory named test.
The directory structure is as below:
/contentroot/path/a/x/test
/contentroot/path/a/y/test
/contentroot/path/a/z/test
--------------------------
As above, I have more than 250 such test directories.
I have tried the command below (using an asterisk), but it only copies to one test directory and gives an error (cp: omitting directory):
cp myfile.txt /contentroot/path/a/*/test
Any help?
Perhaps a for loop?
for FOLDER in /contentroot/path/a/*/test; do
cp myfile.txt "$FOLDER"
done
You can feed the output of an echo as an input to xargs. xargs will then run the cp command three times, appending the next directory path piped to it from the echo.
The -n 1 option on the xargs command is so it only appends one of those arguments at a time to the cp each time it runs.
echo /contentroot/path/a/x/test /contentroot/path/a/y/test /contentroot/path/a/z/test | xargs -n 1 cp myfile.txt
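With the three example paths from the question, this effectively runs:
cp myfile.txt /contentroot/path/a/x/test
cp myfile.txt /contentroot/path/a/y/test
cp myfile.txt /contentroot/path/a/z/test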
Warnings! Firstly, this will overwrite files (if they exist), and secondly, any bash command should be tested and used at the runner's risk! ;)

mention extensions when split (linux)

I have a pretty simple question:
exec('split -d -l 10 _.txt part');
This splits my _.txt file into chunks part00, part01, etc.
Can I set a file extension for these chunks somehow?
Thank you,
It is possible by using the --filter option as documented in info coreutils 'split invocation':
split -d -l 10 _.txt part --filter='cat > $FILE.txt'
This will create part00.txt, part01.txt and so on. Also seems to work for binary files (with -b instead of -l).
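For example, to split a (hypothetical) binary file big.img into 1 MiB chunks named part00.bin, part01.bin and so on:
split -d -b 1M --filter='cat > $FILE.bin' big.img part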
Alternatively, you can rename the chunks after splitting (shown here as a dry run with echo; drop the echo to actually rename):
# touch xaa xab xac; for f in xa{a..c}; do echo mv -- "$f" "$f.txt"; done

shell script to download latest file from FTP

I am writing a shell script for the first time, and I want to download the latest created file from FTP.
I want to download the latest file in a specific folder. Below is my code for that, but it downloads all the files in the folder, not the latest one.
ftp -in ftp.abc.com << SCRIPTEND
user xyz xyz
binary
cd Rpts/
mget ls -t -r | tail -n 1
quit
SCRIPTEND
Can you help me with this, please?
Try using the wget or lftp utility instead; it compares file times/dates, and AFAIR FTP scripting is exactly its purpose. Switch to ssh/rsync if possible; you can read a bit about lftp versus rsync here:
https://serverfault.com/questions/24622/how-to-use-rsync-over-ftp
Probably the easiest way is to have the server side link the latest version to "current", and always fetch the file that link points to. If you are not the admin of the server, you need to list all files with their date/time, grab that information, parse it, and decide which one is newest; in the meantime the state on the server can change, and you end up with a solution more complicated than it's worth.
The point is that "ls" sorts its output in some way, and time may not be the default. There are switches to sort it, e.g. by modification time, but even when the server responds with OK to ls -t, you can't be sure it really supports sorting; it may just ignore all switches and always return the same list. That's why admins usually provide a "current" link (ln -s). If there's no "current", then to make sure you have the right file you need to parse the listing anyway (ls -al).
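For example, on the server side the admin could publish a stable name pointing at the newest file (the file name here is hypothetical):
ln -sf report-2015-06-01.pdf current
Clients then always download "current" without having to sort listings themselves.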
http://www.catb.org/esr/writings/unix-koans/shell-tools.html
Looking at the code, the line
mget ls -t -r | tail -n 1
doesn't do what you think. It actually grabs all of the output of ls -t and then tail processes the output of mget. You could replace this line with
mget $(ls -t -r | tail -n 1)
but I am not sure if ftp will support such a call...
Try using an FTP client other than ftp. For example, curlftpfs, available at curlftpfs.sourceforge.net, is a good candidate, as it allows you to mount an FTP server to a directory as if it were a local folder and then run different commands on the files there (including find, grep, etc.). Take a look at this article.
This way, since the output comes from a local command, you'd be more certain that ls -t returns a properly sorted list.
Btw, it's a bit less convoluted to use ls -t | head -1 than ls -t -r | tail -1. They produce the same result but why reverse and grab from the tail when you can just grab the head :)
If you use curlftpfs then your script would be something like this (assuming server ftp.abc.com and user xyz with password xyz).
mkdir /tmp/ftpsession
curlftpfs ftp://xyz:xyz@ftp.abc.com /tmp/ftpsession
cd /tmp/ftpsession/Rpts
cp -Rpf $(ls -t | head -1) /your/destination/folder/or/file
cd -
umount /tmp/ftpsession
My solution is this:
curl 'ftp://server.de/dir/'$(curl 'ftp://server.de/dir/' 2>/dev/null | tail -1 | awk '{print $(NF)}')
The inner curl fetches the directory listing, tail -1 takes its last line, awk '{print $(NF)}' extracts the file name from that line, and the outer curl then downloads that file.
