Add some text to each line of a txt file and pass it to wget - linux

I have a file called filename.txt that contains file names with extensions.
I want to add a URL prefix like www.abc.com/ before each line
and pass the result to wget like this:
cat filename.txt | xargs -n 1 -P 16 wget -q -P /location
Thanks

Sounds like you want to prefix each line in filename.txt with a string:
sed -e 's#^#www.abc.com/#' filename.txt
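If you want to feed that straight into your download step, the sed output can be piped into the xargs/wget command from your question (a sketch, keeping www.abc.com/ as the placeholder prefix):
sed -e 's#^#www.abc.com/#' filename.txt | xargs -n 1 -P 16 wget -q -P /location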

I got my answer; thanks to all for your valuable responses:
awk '{print "https://<URL>" $0;}' filename.txt | xargs -n 1 -P 16 wget -q -P /location

Related

How to store output for every xargs instance separately

cat domains.txt | xargs -P10 -I % ffuf -u %/FUZZ -w wordlist.txt -o output.json
ffuf is used for directory and file bruteforcing, while domains.txt contains valid HTTP and HTTPS URLs like http://example.com, http://example2.com. I used xargs to speed up the process by running 10 parallel instances. The problem is that I am unable to store the output of each instance separately: output.json gets overwritten by every running instance. Is there anything we can do to make output.json unique for every instance so that all the data gets saved separately? I tried ffuf/$(date '+%s').json instead, but it didn't work either.
Sure. Just name your output file using the domain. E.g.:
xargs -P10 -I % ffuf -u %/FUZZ -w wordlist.txt -o output-%.json < domains.txt
(I dropped cat because it was unnecessary.)
I missed the fact that your domains.txt file is actually a list of URLs rather than a list of domain names. I think the easiest fix is just to simplify domains.txt to be just domain names, but you could also try something like:
xargs -P10 -I % sh -c 'domain="%"; ffuf -u %/FUZZ -w wordlist.txt -o output-${domain##*/}.json' < domains.txt
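For reference, the ${domain##*/} expansion just strips everything up to the last slash, so a URL from domains.txt becomes a bare host name that is safe to use in a file name. A quick check in a shell:
domain="http://example.com"
echo "${domain##*/}"   # prints: example.com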
cat domains.txt | xargs -P10 -I % sh -c "ping % > output.json.%"
Like this, your "%" can be part of the file name. (I changed your command to ping for my testing.)
So maybe something more like this:
cat domains.txt | xargs -P10 -I % sh -c "ffuf -u %/FUZZ -w wordlist.txt -o output.json.%
"
I would replace your ffuf command with the following script, and call this from the xargs command. It just strips out the invalid file name characters, replaces them with a dot, and then runs the command:
#!/usr/bin/bash
URL=$1
# replace the "://" in the URL with a dot so it is safe to use in a file name
FILE=$(echo "$URL" | sed 's/:\/\//./g')
ffuf -u "${URL}/FUZZ" -w wordlist.txt -o "output-${FILE}.json"
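Assuming the script above is saved as something like run-ffuf.sh (the name is just a placeholder) and made executable, the xargs call could then be:
chmod +x run-ffuf.sh
xargs -P10 -I % ./run-ffuf.sh % < domains.txt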

how to echo the filename?

I'm searching the content of .docx files with this command:
unzip -p *.docx word/document.xml | sed -e 's/<[^>]\{1,\}>//g; s/[^[:print:]]\{1,\}//g' | grep $1
But I also need the name of the file that contains the word I searched for. How can I do that?
You can walk through the files with a for loop:
for file in *.docx; do
unzip -p "$file" word/document.xml | sed -e 's/<[^>]\{1,\}>//g; s/[^[:print:]]\{1,\}//g' | grep PATTERN && echo $file
done
The && echo "$file" part prints the filename when grep finds the pattern.
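If you only want the file names and not the matching lines themselves, a variant of the same loop using grep -q (quiet mode) should also work, as a sketch:
for file in *.docx; do
  if unzip -p "$file" word/document.xml | sed -e 's/<[^>]\{1,\}>//g; s/[^[:print:]]\{1,\}//g' | grep -q PATTERN; then
    echo "$file"
  fi
done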
Try with:
find . -name "*your_file_name*" | xargs grep your_word | cut -d':' -f1
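If your grep supports -l, which prints only the names of files with matches (GNU grep does), a shorter variant of the same idea might be:
find . -name "*your_file_name*" -exec grep -l your_word {} +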
If you're using GNU grep (likely, as you're on Linux), you might want to use this option:
--label=LABEL
Display input actually coming from standard input as input coming from file LABEL. This is especially useful when implementing tools like zgrep, e.g., gzip -cd foo.gz | grep --label=foo -H something. See also the -H option.
So you'd have something like
sed_command='s/<[^>]\{1,\}>//g; s/[^[:print:]]\{1,\}//g'   # the sed expression from your question
for f in *.docx
do unzip -p "$f" word/document.xml \
| sed -e "$sed_command" \
| grep -H --label="$f" "$1"
done

Need response time and download time for URLs and a shell script for the same

I have used this command to get the response time:
curl -s -w "%{time_total}\n" -o /dev/null https://www.ziffi.com/suggestadoc/js/ds.ziffi.https.v308.js
and I also need the download time of the js file linked below, so I used wget to download it, but I get output with multiple parameters. I just need the download time:
$ wget --output-document=/dev/null https://www.ziffi.com/suggestadoc/js/ds.ziffi.https.v307.js
Please suggest.
I think what you are looking for is this:
wget --output-document=/dev/null https://www.ziffi.com/suggestadoc/js/ds.ziffi.https.v307.js 2>&1 >/dev/null | grep = | awk '{print $5}' | sed 's/^.*\=//'
Explanation:
2>&1 >/dev/null | --> makes sure stderr gets piped instead of stdout
grep = --> selects the line that contains the '=' symbol
awk '{print $5}' --> picks the fifth whitespace-separated field of that line
sed 's/^.*\=//' --> deletes everything from the start of the line up to the last = symbol
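As an alternative, curl can report both timings directly: %{time_starttransfer} is roughly the response time (time to first byte) and %{time_total} is the total download time. A single call along these lines might be enough (a sketch using the URL from the question):
curl -s -o /dev/null -w 'response time (TTFB): %{time_starttransfer}s\ntotal download time: %{time_total}s\n' https://www.ziffi.com/suggestadoc/js/ds.ziffi.https.v307.js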

how to grep external txt file in bash script?

I would like to run a script that can be accessed from other terminals besides my own, like bash <(curl -s http://domain.com/scripts/hello.sh), but I need this script to grep a txt file that is also located on that server, for example domain.com/scripts/stuff.txt. What would be the best method?
Download a text file, and pipe it into grep:
curl http://domain.com/scripts/stuff.txt | grep foo
Try:
curl -s http://domain.com/scripts/stuff.txt | grep "$Hi"
You can redirect the output to a local text file, like so:
curl -s http://domain.com/scripts/stuff.txt | grep "$Hi" > stuff_that_starts_with_hi.txt
You can also use wget:
wget -qO - http://domain.com/scripts/stuff.txt | grep dat
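Putting it together, the remote hello.sh itself could do the fetch-and-grep, so any terminal that runs it gets the same result (a sketch; "foo" stands in for whatever you search for):
#!/bin/bash
# hello.sh, run remotely with: bash <(curl -s http://domain.com/scripts/hello.sh)
# fetch the remote text file and grep it for the search term ("foo" is just an example)
curl -s http://domain.com/scripts/stuff.txt | grep "foo"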

Search for a string within an HTML link on a webpage and download the linked file

I am trying to write a linux script to search for a link on a web page and download the file from that link...
the webpage is:
http://ocram.github.io/picons/downloads.html
The link I am interested in is:
"hd.reflection-black.7z"
The original way I was doing this was using these commands..
lynx -dump -listonly http://ocram.github.io/picons/downloads.html &> output1.txt
cat output1.txt | grep "17" &> output2.txt
cut -b 1-6 --complement output2.txt &> output3.txt
wget -i output3.txt
I am hoping there is an easier way to search the webpage for the link "hd.reflection-black.7z" and save the linked file.
The files are stored on Google Drive, whose URLs do not contain the filename, hence the use of "17" in the second line of code above.
#linuxnoob, if you want to download the file (curl is more powerful than wget):
curl -L --compressed `(curl --compressed "http://ocram.github.io/picons/downloads.html" 2> /dev/null | \
grep -o '<a .*href=.*>' | \
sed -e 's/<a /\n<a /g' | \
grep hd.reflection-black.7z | \
sed -e 's/<a .*href=['"'"'"]//' -e 's/["'"'"'].*$//' -e '/^$/ d')` > hd.reflection-black.7z
without indentation, for your script:
curl -L --compressed `(curl --compressed "http://ocram.github.io/picons/downloads.html" 2> /dev/null | grep -o '<a .*href=.*>' | sed -e 's/<a /\n<a /g' | grep hd.reflection-black.7z | sed -e 's/<a .*href=['"'"'"]//' -e 's/["'"'"'].*$//' -e '/^$/ d')` > hd.reflection-black.7z 2>/dev/null
You can try it!
What about this?
curl --compressed "http://ocram.github.io/picons/downloads.html" | \
grep -o '<a .*href=.*>' | \
sed -e 's/<a /\n<a /g' | \
grep hd.reflection-black.7z | \
sed -e 's/<a .*href=['"'"'"]//' -e 's/["'"'"'].*$//' -e '/^$/ d'
I'd try to avoid using regular expressions since they tend to break in unexpected ways (e.g. the output is split across more than one line for some reason).
I suggest using a scripting language like Ruby or Python, where higher-level tools are available.
The following example is in Ruby:
#!/usr/bin/ruby
require 'rubygems'
require 'nokogiri'
require 'open-uri'

main_url = ARGV[0] # 'http://ocram.github.io/picons/downloads.html'
filename = ARGV[1] # 'hd.reflection-black.7z'

doc = Nokogiri::HTML(open(main_url))
url = doc.xpath("//a[text()='#{filename}']").first['href']

File.open(filename, 'w+') do |file|
  open(url, 'r') do |link|
    IO.copy_stream(link, file)
  end
end
Save it to a file like fetcher.rb and then you can use it with
ruby fetcher.rb http://ocram.github.io/picons/downloads.html hd.reflection-black.7z
To make it work you'll have to install Ruby and the Nokogiri library (both are available in most distros' repositories).
