How to use "curl -Is link | head -n 1" with many links in a file - linux

I want to use curl -Is link | head -n 1 to check if a link is alive, but this only works for one link. Can anyone help me check a large number of links stored in a file? I have tried curl -Is Mylink.txt | head -n 1, but it does not achieve my goal. Thanks for any help.

Assuming your lines are well formatted (as curl likely needs them to be anyway), @codeforester's answer should work for you. As a matter of habit, I try to avoid using cat and for loops to parse files, though, so I'd do it a little differently:
while read -r line; do
    printf 'URL %s: %s\n' "$line" "$(curl -Is "$line" | head -n 1)"
done < Mylink.txt

This sounds like a job for xargs! Depending on what your file looks like, that is. I'm assuming there is one link per line. Why are you using -Is? Especially the -s for silent mode, if you want to see errors? You may actually want the -s and -S options together (link)
xargs < links.txt -I % sh -c 'curl -Is "$1" | head -n1' _ %
edited to reflect corrections in the comments, thanks!
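If you do add -S so errors show up, the same xargs pattern still works; a minimal variant (a sketch, assuming one link per line in links.txt):
xargs < links.txt -I % sh -c 'curl -IsS "$1" | head -n 1' _ %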

for link in $(cat Mylink.txt)
do
    echo $link: $(curl -Is $link | head -n 1)
done
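If all you need is a yes/no liveness check rather than the raw status line, curl's --write-out can print just the HTTP code. A sketch, assuming one URL per line in Mylink.txt (a code of 000 means the request itself failed):
while read -r link; do
    code=$(curl -o /dev/null -IsS -w '%{http_code}' "$link")
    printf '%s %s\n' "$code" "$link"
done < Mylink.txt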

Related

Multiple process curl command for urls to output to individual files

I am attempting to curl multiple URLs in a bash command. Eventually I will be curling a large number of URLs, so I am using xargs with multiple processes to speed things up.
My file consists of x number of URLs:
https://someurl.com
https://someotherurl.com
My issue comes when attempting to output the results to separate files named after the URLs I curl.
The bash command I have is:
xargs -P 5 -n 1 -I% curl -k -L % -0 % < urls.txt
When I run this I get 'Failed to create file https://someotherurl.com'
You cannot create a file with / in the filename. You could do it this way:
#!/bin/bash
while IFS= read -r line
do
    echo "LINE: $line"
    if [[ "$line" != "" ]]
    then
        filename="${line#https://}"
        echo "FILENAME: $filename"
        # YOUR CURL COMMAND HERE, USING $filename
    fi
done < url.txt
It ignores empty lines
Variable substitution is used to remove the https:// part of each URL
This is what allows you to create the file
Note: if your URLs contain sub-directories, they must be removed as well.
Example: you want to fetch https://www.exemple.com/some/sub/dir
The script I suggested here would try to create a file named "www.exemple.com/some/sub/dir". In this case, you could replace the / with _ using tr.
The script would become:
#!/bin/bash
while IFS= read -r line
do
    echo "LINE: $line"
    if [[ "$line" != "" ]]
    then
        filename=$(echo "$line" | tr '/' '_')
        filename2=${filename#https:__}
        echo "FILENAME: $filename2"
        # YOUR CURL COMMAND HERE, USING $filename2
    fi
done < url.txt
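If you want to keep the parallel xargs invocation from your original command, roughly the same sanitisation can be done inline. A sketch, assuming one URL per line in urls.txt (both / and : are turned into _):
xargs -P 5 -I{} sh -c 'f=$(printf %s "$1" | tr "/:" "__"); curl -ksSL -o "$f" "$1"' _ {} < urls.txt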
Because your question is ambiguous, I would assume:
You have a file urls.txt that contains URLs separated by LF.
You want to download all URLs by curl and use each URL as its filename.
Unfortunately, that's not possible, because a URL contains characters that are invalid in a filename, like the slash /. Alternatively, for this case, I would suggest you encode each URL with the URL-safe Base64 alphabet (RFC 3548) before using it as the filename.
After applying this requirement, your script would become like:
seq 100 | xargs -I# echo 'https://example.com?#' > urls.txt
xargs -P0 -L1 sh -c 'curl -SskL0 -o $(printf %s "$1" | uuencode -m /dev/stdout | sed "1d;\$d" | tr +/ -_) "$1"' sh < urls.txt
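If GNU coreutils base64 is available (an assumption; the above uses uuencode), the encoding step can be written a bit more directly, with -w0 keeping the whole name on one line:
xargs -P0 -L1 sh -c 'curl -sSkL -o "$(printf %s "$1" | base64 -w0 | tr "+/" "-_")" "$1"' sh < urls.txt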

Calling a shell script that is stored in another shell script variable

I searched SO but could not find any relevant post with this specific problem. I would like to know how to call a shell script which is stored in a variable of another shell script.
In the script below I am trying to read a service name and its corresponding shell script, check if the service is running, and if not, start the service using the shell script associated with that service name. I have tried multiple options shared in various forums (like eval, etc.) with no luck. Please share your suggestions on this.
checker.sh
#!/bin/sh
while read service
do
servicename=`echo $service | cut -d: -f1`
servicestartcommand=`echo $service | rev | cut -d: -f1 | rev`
if (( $(ps -ef | grep -v grep | grep $servicename | wc -l) > 0 ))
then
echo "$servicename Running"
else
echo "!!$servicename!! Not Running, calling $servicestartcommand"
eval "$servicestartcommand"
fi
done < names.txt
Names.txt
WebSphere:\opt\software\WebSphere\startServer.sh
WebLogic:\opt\software\WebLogic\startWeblogic.sh
Your script can be refactored into this:
#!/bin/bash
while IFS=: read -r servicename servicestartcommand; do
    if ps cax | grep -q "$servicename"; then
        echo "$servicename Running"
    else
        echo "!!$servicename!! Not Running, calling $servicestartcommand"
        $servicestartcommand
    fi
done < names.txt
No need to use wc -l on grep's output, as you can use grep -q
No need to read the full line and then use cut, rev, etc. later. You can set IFS=: and read the line into 2 separate variables (see the sketch below)
No need to use eval in the end
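To illustrate the IFS=: split, here is a standalone bash sketch (note it uses forward slashes, unlike the backslashes in the question's Names.txt):
IFS=: read -r servicename servicestartcommand <<< "WebSphere:/opt/software/WebSphere/startServer.sh"
echo "$servicename"           # prints: WebSphere
echo "$servicestartcommand"   # prints: /opt/software/WebSphere/startServer.sh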
It is much simpler than you expect. Instead of:
eval "$servicestartcommand"
eval should only be used in extreme circumstances. All you need is
$servicestartcommand
Note: no quotes.
As an example, try this on the command-line:
cmd='ls -l'
$cmd
That should work. But:
"$cmd"
will fail. It will look for a program with a space in its name called 'ls -l'.
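If you are on bash rather than plain sh, an array is a more robust way to hold a command plus its arguments than a flat string, because it survives spaces inside arguments. A small sketch with a hypothetical path:
cmd=(ls -l "/opt/some dir")   # hypothetical directory name containing a space
"${cmd[@]}"                   # runs: ls -l '/opt/some dir'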
Maybe I don't get the idea, but why not use environment variables?
export FOO=bar
echo $FOO
bar

How do I search for a file based on what is output by a command running on that file

I am working on a project for one of my professors, and he asked me to sort a couple hundred .fits images based on their header files (specifically, what star they are images of). I think that grep would be the best way to do this; however, I can't seem to figure out how to use grep based on the header.
I am entering:
ls | imhead *.fits | grep -E -r "PG\ 1104+243" *
to just list them out for now, once they are listed I know how to copy them into a directory.
I am new to using grep, so I am unsure where my error lies. Any help would be greatly appreciated! Thanks!
Assuming that imhead will extract the headers of the .fits as text, you can use a simple shell script to do it:
script.sh
#!/bin/bash
grep "$1" "$2" > /dev/null 2>&1 && echo "$2"
Note that the + is a special character if you use extended regular expression, meaning if you pass the -E as in the question. A simple grep without any options should do the trick here.
Use find to exec the script on every *.fits file in the current folder:
find -maxdepth 1 -name '*.fits' -exec ./script.sh 'PG 1104+243' {} \;
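If you do want to keep -E, the + has to be escaped; alternatively, grep -F treats the pattern as a fixed string. A quick sketch (file.fits is a hypothetical name, and imhead is assumed to print the header to stdout):
imhead file.fits | grep -E 'PG 1104\+243'   # escaped + under extended regexes
imhead file.fits | grep -F 'PG 1104+243'    # or match the string literally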
If you are going to copy/move/alter or do something with the files you find, you might be better off, in terms of complexity and ease of quoting, using a loop like this:
#!/bin/bash
find . -name \*.fits -print0 | while read -d '' -r file; do
    echo "Checking file: $file"
    imhead "$file" | grep -q 'PG 1104+243'
    if [ $? -eq 0 ]; then
        echo "Object matches: $file"
    fi
done

Bash, wget remove comma from output filename

I am reading a file with URLs line by line, and then I pass each URL to wget:
FILE=/home/img-url.txt
while read line; do
url=$line
wget -N -P /home/img/ $url
done < $FILE
This works, but some files contain a comma in the filename. How can I save the file without the comma?
Example:
http://xy.com/0005.jpg -> saved as 0005.jpg
http://xy.com/0022,22.jpg -> saved as 002222.jpg, not as 0022,22.jpg
I hope you find my question interesting.
UPDATE:
We have some nice solutions, but is there any solution to the timestamping warning?
WARNING: timestamping does nothing in combination with -O. See the manual
for details.
This should work:
url="$line"
filename="${url##*/}"
filename="${filename//,/}"
wget -P /home/img/ "$url" -O "$filename"
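For the second sample URL from the question, those two expansions behave like this:
url='http://xy.com/0022,22.jpg'
filename="${url##*/}"        # 0022,22.jpg  (strip everything up to the last /)
filename="${filename//,/}"   # 002222.jpg   (delete every comma)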
Using both -N and -O will throw a warning message. The wget manual says:
-N (for timestamp-checking) is not supported in
combination with -O: since file is always newly created, it will always
have a very new timestamp.
So, when you use the -O option, it actually creates a new file with a new timestamp, and thus the -N option becomes a dummy (it can't do what it is there for). If you want to preserve the timestamping, then a workaround might be this:
url="$line"
wget -N -P /home/img/ "$url"
file="${url##*/}"
newfile="${filename//,/}"
[[ $file != $newfile ]] && cp -p /home/img/"$file" /home/img/"$newfile" && rm /home/img/"$file"
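A simpler equivalent of the copy-then-delete step is a plain mv, which also preserves the timestamp (same assumptions about the /home/img/ layout):
[[ $file != "$newfile" ]] && mv /home/img/"$file" /home/img/"$newfile"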
In the body of the loop you need to generate the filename from the URL, without the commas and without the leading part of the URL, and tell wget to save under that name.
url=$line
file=`echo $url | sed -e 's|^.*/||' -e 's/,//g'`
wget -N -P /home/image/dema-ktlg/ -O $file $url
In the meantime I wrote this:
url=$line
$file=`echo ${url##*/} | sed 's/,//'`
wget -N -P /home/image/dema-ktlg/ -O $file $url
Seems to work fine, is there any trivial problem with my code?

How to tail -f the latest log file with a given pattern

I work with some log system which creates a log file every hour, as follows:
SoftwareLog.2010-08-01-08
SoftwareLog.2010-08-01-09
SoftwareLog.2010-08-01-10
I'm trying to tail and follow the latest log file matching a given pattern (e.g. SoftwareLog*), and I realize there's:
tail -F (tail --follow=name --retry)
but that only follow one specific name - and these have different names by date and hour. I tried something like:
tail --follow=name --retry SoftwareLog*(.om[1])
but the wildcard is resolved before it gets passed to tail and doesn't get re-evaluated every time tail retries.
Any suggestions?
I believe the simplest solution is as follows:
tail -f `ls -tr | tail -n 1`
Now, if your directory contains other log files like "SystemLog" and you only want the latest "SoftwareLog" file, then you would simply include a grep as follows:
tail -f `ls -tr | grep SoftwareLog | tail -n 1`
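Note that the unquoted backticks break if the newest filename contains whitespace; quoting the substitution (and letting a shell glob stand in for the grep) is a small, safe tweak:
tail -f "$(ls -tr SoftwareLog* | tail -n 1)"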
[Edit: after a quick googling for a tool]
You might want to try out multitail - http://www.vanheusden.com/multitail/
If you want to stick with Dennis Williamson's answer (and I've +1'ed him accordingly) here are the blanks filled in for you.
In your shell, run the following script (or its zsh equivalent; I whipped this up in bash before I saw the zsh tag):
#!/bin/bash
TARGET_DIR="some/logfiles/"
SYMLINK_FILE="SoftwareLog.latest"
SYMLINK_PATH="$TARGET_DIR/$SYMLINK_FILE"
function getLastModifiedFile {
    echo $(ls -t "$TARGET_DIR" | grep -v "$SYMLINK_FILE" | head -1)
}
function getCurrentlySymlinkedFile {
    if [[ -h $SYMLINK_PATH ]]
    then
        echo $(ls -l $SYMLINK_PATH | awk '{print $NF}')
    else
        echo ""
    fi
}
symlinkedFile=$(getCurrentlySymlinkedFile)
while true
do
    sleep 10
    lastModified=$(getLastModifiedFile)
    if [[ $symlinkedFile != $lastModified ]]
    then
        ln -nsf $lastModified $SYMLINK_PATH
        symlinkedFile=$lastModified
    fi
done
Background that process using the normal method (again, I don't know zsh, so it might be different)...
./updateSymlink.sh > /dev/null 2>&1 &
Then tail -F $SYMLINK_PATH so that tail handles the changing of the symbolic link or a rotation of the file.
This is slightly convoluted, but I don't know of another way to do it with tail. If anyone else knows of a utility that handles this, please step forward, because I'd love to see it myself too - applications like Jetty log this way by default, and I always end up scripting a symlinking job on a cron to compensate for it.
[Edit: Removed an erroneous 'j' from the end of one of the lines. You also had a bad variable name "lastModifiedFile" didn't exist, the proper name that you set is "lastModified"]
I haven't tested this, but an approach that may work would be to run a background process that creates and updates a symlink to the latest log file and then you would tail -f (or tail -F) the symlink.
#!/bin/bash
PATTERN="$1"
# Try to make sure sub-shells exit when we do.
trap "kill -9 -- -$BASHPID" SIGINT SIGTERM EXIT
PID=0
OLD_FILES=""
while true; do
    FILES="$(echo $PATTERN)"
    if test "$FILES" != "$OLD_FILES"; then
        if test "$PID" != "0"; then
            kill $PID
            PID=0
        fi
        if test "$FILES" != "$PATTERN" || test -f "$PATTERN"; then
            tail --pid=$$ -n 0 -F $PATTERN &
            PID=$!
        fi
    fi
    OLD_FILES="$FILES"
    sleep 1
done
Then run it as: tail.sh 'SoftwareLog*'
The script will lose some log lines if the logs are written to between checks. But at least it's a single script, with no symlinks required.
We have daily rotating log files as: /var/log/grails/customer-2020-01-03.log. To tail the latest one, the following command worked fine for me:
tail -f /var/log/grails/customer-`date +'%Y-%m-%d'`.log
(NOTE: no space after the + sign in the expression)
So, for you, the following should work (if you are in the same directory of the logs):
tail -f SoftwareLog.`date +'%Y-%m-%d-%H'`
I believe the easiest way is to use tail with ls and head, try something like this
tail -f `ls -t SoftwareLog* | head -1`
