Linux: downloading a file and appending the output to a log file

So I'm downloading a file and getting the data from the download, not actually storing the file itself.
Just want to see the speeds and log them.
wget http://www.google.com/download -a log.log -O /dev/null &
wget http://www.google.com/download -a log.log -O /dev/null &
wget http://www.google.com/download -a log.log -O /dev/null
I am trying to download simultaneously, but the output in the log overlaps. How do I prevent this?

You could first write your output to different files and then merge them together.
This should work:
wget http://www.google.com/download -a log1.log &
wget http://www.google.com/download -a log2.log &
wget http://www.google.com/download -a log3.log ;
cat log2.log >> log1.log;
cat log3.log >> log1.log
Now you should have all your output in log1.log
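If you would rather not rely on the last wget finishing after the two background ones, a small variation of the same idea (just a sketch with the same example URL and log names) waits for every background job before merging, and sends the downloaded data to /dev/null as in the question:
# Run the three downloads in parallel, each with its own log.
wget http://www.google.com/download -a log1.log -O /dev/null &
wget http://www.google.com/download -a log2.log -O /dev/null &
wget http://www.google.com/download -a log3.log -O /dev/null &
# Wait for all background wgets to finish, then merge the logs.
wait
cat log2.log log3.log >> log1.log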

If you want to append, the -a option is fine. If you don't want to append but to overwrite the log instead, change -a to -o:
wget http://www.google.com/download -o log.log -O /dev/null
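A quick throw-away check makes the difference visible (any URL will do): run the same download twice and look at the log.
# With -a the log grows on every run ...
wget http://www.google.com/download -a log.log -O /dev/null
wget http://www.google.com/download -a log.log -O /dev/null
wc -l log.log
# ... with -o it only ever contains the last run.
wget http://www.google.com/download -o log.log -O /dev/null
wc -l log.log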

Related

How to use wget --spider to identify broken URLs from a list of URLs and save the broken ones

I am trying to write a shell script to identify broken URLs from a list of URLs.
Here is a sample of input_url.csv:
https://www.google.com/
https://www.nbc.com
https://www.google.com.hksjkhkh/
https://www.google.co.jp/
https://www.google.ca/
Here is what I have which works:
wget --spider -nd -nv -H --max-redirect 0 -o run.log -i input_url.csv
This gives me '2019-09-03 19:48:37 URL: https://www.nbc.com 200 OK' for valid URLs, and '0 redirections exceeded.' for the broken ones.
I only want to save the broken links to my output file.
Sample expected output:
https://www.google.com.hksjkhkh/
I think I would go with:
<input.csv xargs -n1 -P10 sh -c 'wget --spider --quiet "$1" || echo "$1"' --
You can use the -P <count> option of xargs to run <count> processes in parallel.
xargs runs the command sh -c '...' -- for each line of the input file, appending that line as an argument to the script.
Then sh runs wget ... "$1" on it. The || checks whether the exit status is nonzero, which means failure; on wget failure, echo "$1" is executed.
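To actually save the broken links to a file, as asked, the same one-liner can simply be redirected (broken_urls.txt is just an example name):
# Anything wget --spider fails on is echoed and collected in broken_urls.txt.
<input.csv xargs -n1 -P10 sh -c 'wget --spider --quiet "$1" || echo "$1"' -- > broken_urls.txt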
You could also filter the output of wget -nd -nv and pick out the broken ones with a regex, like so:
wget --spider -nd -nv -H --max-redirect 0 -i input 2>&1 | grep -v '200 OK' | grep 'unable' | sed 's/.* .//; s/.$//'
but this doesn't look very extensible, it isn't parallel (so it's probably slower), and it's probably not worth the hassle.

Using wget on Linux to scan sub-folders for specific files

Using wget -r -P Home -A jpg http://example.com gives me a list of files from that website directory. What I'm looking for is how to query a range, something like: wget -r -P home -A jpg http://example.com/<from 65121 to 75121>/file_<100 to 200>.jpg
Example(s):
wget -r -P home -A jpg http://example.com/65122/file_102.jpg
wget -r -P home -A jpg http://example.com/65123/file_103.jpg
wget -r -P home -A jpg http://example.com/65124/file_104.jpg
Is it possible to achieve that on a Linux distro?
I'm fairly new to Linux, so any tips are welcome.
Use a nested for loop and some bash scripting:
for i in {65121..75121}; do for j in {100..200}; do wget -r -P home -A jpg "http://example.com/${i}/file_${j}.jpg"; done; done
You can also let the shell's brace expansion do the looping, handing wget the whole range of URLs at once:
wget -nd -H -p -A file_{100..200}.jpg -e robots=off http://example.com/{65121..75121}/
If the directories contain nothing but file_{100..200}.jpg, it's simpler:
wget -nd -H -p -A jpg -e robots=off http://example.com/{65121..75121}/
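Note that in both variants the {65121..75121} part is expanded by bash, not by wget itself, so wget simply receives the full list of URLs. Also, wget's -A option expects a single comma-separated list, so to restrict downloads to file_100.jpg through file_200.jpg it seems safer to build that list in the shell first; a rough sketch:
# Build a comma-separated accept list (file_100.jpg,file_101.jpg,...,file_200.jpg)
# and let brace expansion generate one URL per directory.
ACCEPT=$(printf '%s,' file_{100..200}.jpg)
ACCEPT=${ACCEPT%,}   # drop the trailing comma
wget -nd -H -p -A "$ACCEPT" -e robots=off http://example.com/{65121..75121}/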

ssh tunneled command output to file

I have an old Syno NAS and wish to use the "shred" command to wipe the disks inside. The idea is to let the command run to completion on the box itself without the need of a computer.
So far I have managed...
1) to get the right parameters for 'shred'
* runs in the background using the &
2) get that command to output the progress (-v option) to a file shred.txt
* to see from the file what the progress is
shred -v -f -z -n 2 /dev/hdd 2>&1 | tee /volume1/backup/shred.txt &
3) ssh tunnel the command so I can turn off my laptop while it's running
ssh -n -f root@host "sh -c 'nohup /opt/bin/shred -f -z -n 2 /dev/sdd > /dev/null 2>&1 &'"
The problem is that I can't combine 2) and 3)
I tried to combine them like this, but the resulting file remained empty:
ssh -n -f root@host "sh -c 'nohup /opt/bin/shred -f -z -n 2 /dev/sdd 2>&1 | tee /volume1/backup/shred.txt > /dev/null &'"
It might be a noob question, but I can't figure out how to get this done.
Any suggestions?
Thanks. Vince
The sh and tee commands are not needed here:
ssh -n root@host 'nohup /opt/bin/shred -f -z -n 2 /dev/sdd >/volume1/backup/shred.txt 2>&1 &' >/dev/null
The final >/dev/null is optional; it just discards anything else the remote host happens to print (greetings, banners and so on).
I tried the following command (based on Grzegorz's suggestion) and included the opening date stamp and the previously forgotten verbose switch. Latest version of the command string:
ssh -n root@host 'date > /volume1/backup/shred_sda.txt; nohup /opt/bin/shred -v -f -z -n 4 /dev/sda >> /volume1/backup/shred_sda.txt 2>&1 # >/dev/null'
The last thing to figure out is how to include the date stamp when the shred command has completed.
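For that last point, one possibility (only a sketch, reusing the paths and options from above and untested on a Synology box) is to run date, shred and date again as a single nohup'd sh -c sequence on the NAS, so the closing time stamp is only written once shred has finished:
# Everything between the single quotes runs on the NAS. nohup keeps the inner
# sh alive after the ssh session ends, and the second date only runs after shred.
ssh -n root@host 'nohup sh -c "date > /volume1/backup/shred_sda.txt;
  /opt/bin/shred -v -f -z -n 4 /dev/sda >> /volume1/backup/shred_sda.txt 2>&1;
  date >> /volume1/backup/shred_sda.txt" >/dev/null 2>&1 &'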

Update bash script, file check, how?

#!/bin/sh
# Where to keep the mirrored update files, and a scratch directory.
LOCAL=/var/local
TMP=/var/tmp
URL=http://um10.eset.com/eset_upd
USER=""
PASSWD=""
WGET="wget --user=$USER --password=$PASSWD -t 15 -T 15 -N -nH -nd -q"
UPDATEFILE="update.ver"
cd $LOCAL
# Fetch the update index; abort if the download fails.
CMD="$WGET $URL/$UPDATEFILE"
eval "$CMD" || exit 1;
# If update.ver is actually a RAR archive, unpack it into the scratch directory.
if [ -n "`file $UPDATEFILE|grep -i rar`" ]; then
(
cd $TMP
rm -f $TMP/$UPDATEFILE
unrar x $LOCAL/$UPDATEFILE ./
)
UPDATEFILE=$TMP/$UPDATEFILE
URL=`echo $URL|sed -e s:/eset_upd::`
fi
# Extract the list of files referenced by the "file=/..." lines in update.ver.
TMPFILE=$TMP/nod32tmpfile
grep file=/ $UPDATEFILE|tr -d \\r > $TMPFILE
FILELIST=`cut -c 6- $TMPFILE`
rm -f $TMPFILE
echo "Downloading updates..."
# Download every file in the list.
for FILE in $FILELIST; do
CMD="$WGET \"$URL$FILE\""
eval "$CMD"
done
# Keep a copy of the index and strip the /download/... prefixes from it.
cp $UPDATEFILE $LOCAL/update.ver
perl -i -pe 's/\/download\/\S+\/(\S+\.nup)/\1/g' $LOCAL/update.ver
echo "Done."
So I have this script to download definitions for my antivirus. The only problem is that it downloads all the files every time I run the script. Is it possible to implement some sort of file checking? Let's say, for example,
"if that file is present and has the same file size, skip it"
The -nc argument to wget will not re-fetch files that already exist. It is, however, not compatible with the -N switch. So you'll have to change your WGET line to:
WGET="wget --user=$USER --password=$PASSWD -t 15 -T 15 -nH -nd -q -nc"

WGET without the log file

Every time I use wget http://www.domain.com, a log file is saved automatically on my server. Is there any way to run this command without logging?
Thanks,
Joel
You could try -o and -q
-o logfile
--output-file=logfile
    Log all messages to logfile. The messages are normally reported to standard error.
-q
--quiet
    Turn off Wget's output.
So you'd have:
wget ... -q -o /dev/null ...
This will print the site contents to standard output. Is this what you mean when you say you don't want logging to a file?
wget -O - http://www.domain.com/
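Combining this with -q from the answer above gives you the page on stdout with no log file at all, which is handy for piping (the grep here is just an example):
# No log file and no saved file: the page body goes straight down the pipe.
wget -q -O - http://www.domain.com/ | grep -i '<title>'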
I personally found that @Paul's answer was still writing to a log file, regardless of the -q option.
I added -O /dev/null on top of the -o output-file argument:
wget [url] -q -o /dev/null -O /dev/null
