How to use wget with multiple threads and wildcards? - linux

I want to use wget to download multiple files at once in a script using wildcards, like this:
wget -r -nd --no-parent --no-remove-listing $ftpUrl -l1 -A file1*.txt &
wget -r -nd --no-parent --no-remove-listing $ftpUrl -l1 -A file2*.txt &
wget -r -nd --no-parent --no-remove-listing $ftpUrl -l1 -A file3*.txt &
The problem is that wget downloads the .listing file every time, and because multiple instances are running, sometimes the file is being downloaded while another instance is reading it.
Is there a way to lock the .listing file or tell wget not to download it (I can do that manually as a first command)? I don't understand how it reads this .listing file, since it's not a plain list of URLs but rather something like this:
drwxr-xr-x 3 4015 4015 16384 Dec 14 21:23 .
drwxr-xr-x 4 4015 4015 4096 Dec 14 21:23 ..
-rw-r--r-- 1 4015 4015 327 Feb 15 2022 file1-bla.txt
-rw-r--r-- 1 4015 4015 10716 Feb 15 2022 file2-bla.txt
-rw-r--r-- 1 4015 4015 163 Feb 15 2022 file3-bla.txt
If I try to use -i .listing (or even if I rename .listing to list.txt and use -i list.txt) I get an error saying that the URLs are not valid or something.

RFC 959 stipulates that
LIST (LIST) This command causes a list to be sent from the server to the passive DTP. If the pathname specifies a directory or other group of files, the server should transfer a list of files in the specified directory. If the pathname specifies a file then the server should send current information on the file. A null argument implies the user's current working or default directory. The data transfer is over the data connection in type ASCII or type EBCDIC. (The user must ensure that the TYPE is appropriately ASCII or EBCDIC). Since the information on a file may vary widely from system to system, this information may be hard to use automatically in a program, but may be quite useful to a human user.
Observe that it does not impose a formal requirement on what exactly will be returned.
drwxr-xr-x 3 4015 4015 16384 Dec 14 21:23 .
drwxr-xr-x 4 4015 4015 4096 Dec 14 21:23 ..
-rw-r--r-- 1 4015 4015 327 Feb 15 2022 file1-bla.txt
-rw-r--r-- 1 4015 4015 10716 Feb 15 2022 file2-bla.txt
-rw-r--r-- 1 4015 4015 163 Feb 15 2022 file3-bla.txt
This looks akin to the output of the ls -lah command. If you want to use it to form a URL list that wget accepts, take the last column (excluding . and ..) and prefix each name with the URL of the FTP server. If that URL is ftp://ftp.example.com and you got the list above, your list of URLs should look as follows (a way to generate it is sketched after the list):
ftp://ftp.example.com/file1-bla.txt
ftp://ftp.example.com/file2-bla.txt
ftp://ftp.example.com/file3-bla.txt
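A minimal way to produce such a list from the .listing you already have (assuming the base URL is ftp://ftp.example.com, and that none of the file names contain spaces) is to keep only the regular-file entries, i.e. the lines starting with -:
# keep only regular-file entries and turn the last column into a URL
awk '/^-/ { print "ftp://ftp.example.com/" $NF }' .listing > list.txt
wget -nd -i list.txt
If you keep the three parallel wget instances, each one can add its own filter to the awk condition (for example /^-/ && $NF ~ /^file1/), so the .listing only needs to be fetched once, up front.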

Related

listing files in UNIX owned by a particular user

How do I list the files owned by a particular user in UNIX?
If I use the ls -l command in a shared directory, it lists all the files with their details. This shared directory contains many files created by many users in a group, and I am in a situation where I want to see only the files created by a particular user. Is there a listing command that takes a username as input?
Refer to the example below.
command: ls -l
drwxr-xr-x 2 user_1 main 4.0K Feb 12 16:43 proj_1
drwxrws--- 6 user_2 main 20M Feb 18 11:07 proj_2
drwxr-xr-x 3 user_1 main 1.3M Feb 18 00:18 proj_3
drwxrwsr-x 2 user_2 main 8.0K Dec 27 01:23 proj_4
drwxrwsr-x 2 user_3 main 8.1K Dec 27 01:23 proj_5
I am looking for a command to display only the files created by user_2, with my expected output as below:
drwxrws--- 6 user_2 main 20M Feb 18 11:07 proj_2
drwxrwsr-x 2 user_2 main 8.0K Dec 27 01:23 proj_4
Kindly let me know if there is a way.
It should be possible to use awk together with ls -l:
ls -l | awk '$3=="user_2" { print $0 }'
This will print all lines where the third field (the user) matches "user_2".
You can simply use the find command like this:
find . -maxdepth 1 -user some_user -exec ls -lsad {} \;
Why the options are used:
-maxdepth 1: we only want to look at the current directory level
-user: we only want files owned by the given user
-exec: lets us do something with each file that is found
What we want to do with each file:
ls -lsad gives the long listing of the current file; if it is a directory, it is listed itself rather than descended into.
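As a side note, find has a built-in -ls action, so a shorter form that avoids spawning one ls process per match (same placeholder some_user) should be:
find . -maxdepth 1 -user some_user -ls
The output columns differ slightly from ls -l (an inode number and a block count are prepended), but the owner, size, and name are all there.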

Script that calls another script to execute on every file in a directory

There are two directories that contains these files:
The first one is /usr/local/nagios/etc/hosts:
[root@localhost hosts]$ ll
total 12
-rw-rw-r-- 1 apache nagios 1236 Feb 7 10:10 10.80.12.53.cfg
-rw-rw-r-- 1 apache nagios 1064 Feb 27 22:47 10.80.12.62.cfg
-rw-rw-r-- 1 apache nagios 1063 Feb 22 12:02 localhost.cfg
And the second one is /usr/local/nagios/etc/services:
[root@localhost services]$ ll
total 20
-rw-rw-r-- 1 apache nagios 2183 Feb 27 22:48 10.80.12.62.cfg
-rw-rw-r-- 1 apache nagios 1339 Feb 13 10:47 Check usage _etc.cfg
-rw-rw-r-- 1 apache nagios 7874 Feb 22 11:59 localhost.cfg
And I have a script that goes through each file in the Hosts directory and pastes some lines from that file into the corresponding file in the Services directory.
The script is run like this:
./nagios-contacts.sh /usr/local/nagios/etc/hosts/10.80.12.62.cfg /usr/local/nagios/etc/services/10.80.12.62.cfg
How can I get another script to call my script for every file in the Hosts directory, doing its job on the file with the same name in the Services directory?
In my script I'm pulling out contacts from 10.80.12.62.cfg in the Hosts directory and appending them to the file with the same name in the Services directory.
Don't use ls output as input to a for loop; use the shell's built-in wildcards instead. See why parsing ls is not a good idea.
for f in /usr/local/nagios/etc/hosts/*.cfg
do
    basef=$(basename "$f")
    ./nagios-contacts.sh "$f" "/usr/local/nagios/etc/services/${basef}"
done
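One defensive tweak, offered only as a sketch: if a host .cfg has no counterpart under services, you may prefer to skip it rather than hand nagios-contacts.sh a path that does not exist (whether that matters depends on what your script does with its second argument):
for f in /usr/local/nagios/etc/hosts/*.cfg
do
    basef=$(basename "$f")
    svc="/usr/local/nagios/etc/services/${basef}"
    # skip hosts that have no matching services file
    [ -e "$svc" ] || continue
    ./nagios-contacts.sh "$f" "$svc"
done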
It sounds like you just need to do some iteration.
echo $(pwd)
for file in $(ls); do ./nagios-contacts.sh $file; done;
So it will loop over all files in the current directory.
You can also modify it to work from an absolute path:
abspath=$1
for file in $(ls $abspath); do ./nagios-contacts.sh $abspath/$file; done
which would loop over all files in a set directory, and then pass the abspath/filename into your script.

Linux: Finding Newly Added Files

I am trying to obtain a backup of 'newly' added files to a Fedora system. Files can be copied through a Windows Samba share and appear to retain the original created timestamp. However, because it retains this timestamp I am having issues identifying which files were newly added to the system.
Currently, the only way I can think of doing this is to keep a master snapshot of all the files on the system at a specific time. Then, when I perform the backup, I compare the previous snapshot with a current one. That would also detect files that were removed from the system, but it seems excessive, and I was thinking there must be an easier way to back up newly added files.
Terry
Try using find. Something like this:
find . -ctime -10
That will give you a list of files and directories, starting from your current directory, that have had their status changed within the last 10 days.
Example:
My Downloads directory looks like this:
kobus@akira:~/Downloads$ ll
total 2025284
drwxr-xr-x 4 kobus kobus 4096 Nov 4 11:25 ./
drwxr-xr-x 41 kobus kobus 4096 Oct 30 09:26 ../
-rw-rw-r-- 1 kobus kobus 8042383 Oct 28 14:08 apache-maven-3.3.3-bin.tar.gz
drwxrwxr-x 2 kobus kobus 4096 Oct 14 09:55 ELKImages/
-rw-rw-r-- 1 kobus kobus 1469054976 Nov 4 11:25 Fedora-Live-Workstation-x86_64-23-10.iso
-rw------- 1 kobus kobus 351004 Sep 21 14:07 GrokConstructor-master.zip
drwxrwxr-x 11 kobus kobus 4096 Jul 11 2014 jboss-eap-6.3/
-rw-rw-r-- 1 kobus kobus 183399393 Oct 19 16:26 jboss-eap-6.3.0-installer.jar
-rw-rw-r-- 1 kobus kobus 158177216 Oct 19 16:26 jboss-eap-6.3.0.zip
-rw-rw-r-- 1 kobus kobus 71680110 Oct 13 13:51 jre-8u60-linux-x64.tar.gz
-rw-r--r-- 1 kobus kobus 4680 Oct 12 12:34 nginx-release-centos-7-0.el7.ngx.noarch.rpm
-rw-r--r-- 1 kobus kobus 3479765 Oct 12 14:22 ngx_openresty-1.9.3.1.tar.gz
-rw------- 1 kobus kobus 16874455 Sep 15 16:49 Oracle_VM_VirtualBox_Extension_Pack-5.0.4-102546.vbox-extpack
-rw-r--r-- 1 kobus kobus 7505310 Oct 6 10:29 sublime_text_3_build_3083_x64.tar.bz2
-rw------- 1 kobus kobus 41467245 Sep 7 10:37 tagspaces-1.12.0-linux64.tar.gz
-rw-rw-r-- 1 kobus kobus 42658300 Nov 4 10:14 tagspaces-2.0.1-linux64.tar.gz
-rw------- 1 kobus kobus 70046668 Sep 15 16:49 VirtualBox-5.0-5.0.4_102546_el7-1.x86_64.rpm
Here's what the find returns:
kobus@akira:~/Downloads$ find . -ctime -10
.
./tagspaces-2.0.1-linux64.tar.gz
./apache-maven-3.3.3-bin.tar.gz
./Fedora-Live-Workstation-x86_64-23-10.iso
kobus@akira:~/Downloads$
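If the real goal is "everything added since the last backup" rather than a fixed number of days, a common pattern is to keep a marker file and compare against it with -cnewer. A minimal sketch, where /data and /var/backups/.last-backup are just placeholder paths:
# files whose status changed after the marker was last touched
find /data -type f -cnewer /var/backups/.last-backup
# ... run the backup, then move the marker forward for next time
touch /var/backups/.last-backup
Because the ctime is set when the file lands on the Fedora box, this catches files copied in over Samba even when they keep their old modification timestamp.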
Most unices do not have a concept of file creation time. You can't make ls print it because the information is not recorded. If you need creation time, use a version control system: define creation time as the check-in time.
If your unix variant has a creation time, look at its documentation. For example, on Mac OS X (the only example I know of), use ls -tU. Windows also stores a creation time, but it's not always exposed to ports of unix utilities; for example, Cygwin ls doesn't have an option to show it. The stat utility can show the creation time, called “birth time” in GNU utilities, so under Cygwin you can show files sorted by birth time with stat -c '%W %n' * | sort -k1n.
Note that the ctime (ls -lc) is not the file creation time, it's the inode change time. The inode change time is updated whenever anything about the file changes (contents or metadata) except that the ctime isn't updated when the file is merely read (even if the atime is updated). In particular, the ctime is always more recent than the mtime (file content modification time) unless the mtime has been explicitly set to a date in the future.
"Newly added files, Fedora" : The below examples will show a list with date and time.
Example, all installed packages : $ rpm -qa --last
Example, the latest 100 packages : $ rpm -qa --last | head -100
Example, create a text file : $ rpm -qa --last | head -100 >> last-100-packages.txt

Automatically launching Firefox from terminal using at command

I am a beginner at Linux and really enthusiastic to learn the OS. I am trying to launch Firefox (or any other software, like Evince) from the command line as follows:
[root@localhost ~]# at 1637
[root@localhost ~]# at> firefox
[root@localhost ~]# at> ^d
The job gets scheduled without any error. But at the specified time it does not run.
I also tried giving the following path:
[root@localhost ~]# at 1637
[root@localhost ~]# at> /usr/bin/firefox
[root@localhost ~]# at> ^d
Still no result. But when I try to use echo to display text on the screen, it appears at the specified time as desired. What might be the issue?
I think you have not set DISPLAY. at will run in a separate shell where the display is not set.
Try the following:
dinesh:~$ at 2120
warning: commands will be executed using /bin/sh
at> export DISPLAY=:0
at> /usr/bin/firefox > firefox.log 2>&1
at> <EOT>
job 7 at Tue Mar 11 21:20:00 2014
If it is still failing check firefox.log for more information.
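The same job can also be submitted non-interactively with a heredoc, which is handy in scripts; here :0 is assumed to be your X display and firefox.log is only an example path:
at 1637 <<'EOF'
export DISPLAY=:0
/usr/bin/firefox > "$HOME/firefox.log" 2>&1
EOF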
1) It's not always recommended to run things as root.
2) You can also try ./firefox if you are in the directory that contains firefox. In Linux you need to pay attention to your PATH variable. Unless . (the current directory) is in your PATH, you will have to type ./program if the program is in the same directory as you.
You also need to pay attention to file permissions: in Linux you have read-write-execute (rwx) access.
ls -l will list a directory and show the file permissions:
drwxr-xr-x 10 user staff 340 Oct 6 2012 GlassFish_Server/
drwx------@ 15 jeffstein staff 510 Oct 6 15:01 Google Drive/
drwxr-xr-x 20 jeffstein staff 680 May 14 2013 Kindle/
drwx------+ 67 jeffstein staff 2278 Jan 26 14:22 Library/
drwx------+ 19 jeffstein staff 646 Oct 23 18:28 Movies/
drwx------+ 15 jeffstein staff 510 Jan 3 20:29 Music/
drwx------+ 90 jeffstein staff 3060 Mar 9 20:23 Pictures/
drwxr-xr-x+ 6 jeffstein staff 204 Nov 3 21:16 Public/
drwxr-xr-x 22 jeffstein staff 748 Jan 14 2012 androidTools/
-rwxrwxrwx 1 jeffstein staff 1419 Aug 28 2013 color.sh*
This is an example of ls -l; here you can see that color.sh has -rwxrwxrwx, which means that anybody can read, write, or run the file.
Without actually knowing where you installed firefox I can't be of more help, but these are some small pointers that might help.
Try finding where firefox is actually installed using the whereis firefox command.
Then try using that path in the at command.
In order to get directions on how to use a command, type:
man at
This will display the manual.
DESCRIPTION
The at and batch utilities read commands from standard input or a specified file. The commands are executed at a later time, using sh(1).
at      executes commands at a specified time;
atq     lists the user's pending jobs, unless the user is the superuser; in that case, everybody's jobs are listed;
atrm    deletes jobs;
batch   executes commands when system load levels permit; in other words, when the load average drops below _LOADAVG_MX (1.5), or the value specified in the invocation of atrun.
So obviously you need to schedule a job with at, and you can see whether it worked with atq.
Read the manual and it should help; if I have more time I'll write you a quick example.
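For instance, a quick throwaway test of the whole cycle could look like this (the job number printed by at will vary):
echo 'touch /tmp/at-test' | at now + 1 minute   # schedule a harmless test job
atq                                             # confirm it is queued
# after a minute /tmp/at-test should exist; atrm <jobnumber> removes a pending job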

Why doesn't grep work if a file is not specified?

I have a problem with the Linux grep command, it doesn't work!!!
I am trying the following test on my Ubuntu system:
I have created the following folder: /home/andrea/Scrivania/prova
Inside this folder I have created a txt file named prova.txt, and inside this file I have written the string test and saved it.
In the shell I first entered the folder /home/andrea/Scrivania/prova and then launched the grep command in the following way:
~/Scrivania/prova$ grep test
The problem is that the cursor continues to blink endlessly and NOTHING is found! Why? What is the problem?
You've not provided any files for the grep command to scan:
grep "test" *
or, for a recursive search:
grep -r "test" *
Because grep searches standard input if no files are given. Try this.
grep test *
You are not running the command you were looking for.
grep test * will look for test in all files in your current directory.
grep test prova.txt will look for test specifically in prova.txt
(grep test will search for the string test on stdin, and will not return until EOF.)
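To see the stdin behaviour for yourself (reusing the prova.txt from the question):
echo "this is a test" | grep test    # grep reads the pipe, so it returns immediately
grep test < prova.txt                # or feed the file to grep via stdin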
You need to pipe something into grep; you can't just call grep test without any other arguments, because it will just sit there reading standard input. Try grep test *
Another use for grep is to pipe the output of a command into it, e.g. this is my home directory:
drwx------+ 3 oliver staff 102 12 Nov 21:57 Desktop
drwx------+ 10 oliver staff 340 17 Nov 18:34 Documents
drwx------+ 17 oliver staff 578 20 Nov 18:57 Downloads
drwx------@ 12 oliver staff 408 13 Nov 20:53 Dropbox
drwx------@ 52 oliver staff 1768 11 Nov 12:05 Library
drwx------+ 3 oliver staff 102 12 Nov 21:57 Movies
drwx------+ 5 oliver staff 170 17 Nov 10:40 Music
drwx------+ 3 oliver staff 102 20 Nov 19:17 Pictures
drwxr-xr-x+ 4 oliver staff 136 12 Nov 21:57 Public
If I run
ls -l | grep Do
I get the result:
drwx------+ 10 oliver staff 340 17 Nov 18:34 Documents
drwx------+ 17 oliver staff 578 20 Nov 18:57 Downloads
Remember to pipe input into the grep command.
From the grep man page:
grep searches the named input FILEs (or standard input if no files are named, or the file name - is given) for lines containing a match to the given PATTERN.
If you don't provide file name(s) for it to use, it will try to read from stdin.
Try grep test *
As per GNU Grep 3.0
A file named - stands for standard input. If no input is specified,
grep searches the working directory . if given a command-line
option specifying recursion; otherwise, grep searches standard input.
So for the OP's command, without any additional specification, grep tries to read standard input, which never receives anything there.
A simple approach is grep -r [pattern]: as per the above, this specifies recursion with -r and searches the current directory and sub-directories.
Also note that the wildcard * matches directories as well as files; without -r, grep will print a hint for each directory it encounters:
grep: [directory_name]: Is a directory
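If those messages get in the way, GNU grep can be told to skip directories explicitly (this relies on the --directories option; check grep --help on your system):
grep -d skip test *
# long form of the same thing
grep --directories=skip test *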
