Trying to download the latest SBT version from GitHub:
version="$(curl -vsLk https://github.com/sbt/sbt/releases/latest 2>&1 | grep "< Location" | rev | cut -d'/' -f1 | rev)"
version is set to v1.1.0-RC2
Then attempting to download the .tar.gz package:
curl -fsSLk "https://github.com/sbt/sbt/archive/${version}.tar.gz" | tar xvfz - -C /home/myuser
However, instead of the correct URL:
https://github.com/sbt/sbt/archive/v1.1.0-RC2.tar.gz
Somehow the version string is interpreted as a command(?!), resulting in:
.tar.gzttps://github.com/sbt/sbt/archive/v1.1.0-RC2
When I manually set version="v1.1.0-RC2", this doesn't happen.
Thanks in advance!
You should use the -I flag in your curl command and a much simpler pipeline to grab the version number, like this:
curl -sILk https://github.com/sbt/sbt/releases/latest |
awk -F '[/ ]+' '$1 == "Location:"{sub(/\r$/, ""); print $NF}'
v1.1.0-RC2
Also note the use of the sub function to strip the trailing \r from the end of curl's header output; that stray carriage return is what was mangling your URL in the first place.
Your script then becomes:
version=$(curl -sILk https://github.com/sbt/sbt/releases/latest | awk -F '[/ ]+' '$1 == "Location:"{sub(/\r$/, ""); print $NF}')
curl -fsSLk "https://github.com/sbt/sbt/archive/${version}.tar.gz" | tar xvfz - -C /home/myuser
Related
I have a script that downloads Slack with the wget command. Since the script runs every time a computer is configured, I always need to download the latest version of Slack.
I work on Debian 9.
Right now I'm doing this:
wget https://downloads.slack-edge.com/linux_releases/slack-desktop-3.3.7-amd64.deb
and I tried this:
curl -s https://slack.com/intl/es/release-notes/linux | grep "<h2>Slack" | head -1 | sed 's/[<h2>/]//g' | sed 's/[a-z A-Z]//g' | sed "s/ //g"
This returns: 3.3.7
I then add that to: wget https://downloads.slack-edge.com/linux_releases/slack-desktop-$curl-amd64.deb
but it's not working.
Do you know why this doesn't work?
Your script produces a long string with a lot of leading whitespace.
bash$ curl -s https://slack.com/intl/es/release-notes/linux |
> grep "<h2>Slack" | head -1 |
> sed 's/[<h2>/]//g' | sed 's/[a-z A-Z]//g' | sed "s/ //g"
3.3.7
You want the string without spaces, and the fugly long pipeline can be simplified significantly.
bash$ curl -s https://slack.com/intl/es/release-notes/linux |
> sed -n "/^.*<h2>Slack /{;s///;s/[^0-9.].*//p;q;}"
3.3.7
Notice also that the character class [<h2>/] doesn't mean what you think at all. It matches a single character that is < or h or 2 or > or /, regardless of context. So, for example, if the current version number were to contain the digit 2, you would zap that too.
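A quick illustration of that pitfall, using a hypothetical version number that contains a 2:
echo "<h2>Slack 2.3.7</h2>" | sed 's/[<h2>/]//g'
Slack .3.7
The tag characters are gone, but so is the 2 inside the version.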
Scraping like this is very brittle, though. I notice that if I change the /es/ in the URL to /en/ I get no output at all. Perhaps you can find a better way to obtain the newest version (using apt should allow you to install the newest version without any scripting on your side).
echo wget "https://downloads.slack-edge.com/linux_releases/slack-desktop-$(curl -s "https://slack.com/intl/es/release-notes/linux" | xmllint --html --xpath '//h2' - 2>/dev/null | head -n1 | sed 's/<h2>//;s#</h2>##;s/Slack //')-amd64.deb"
will output:
wget https://downloads.slack-edge.com/linux_releases/slack-desktop-3.3.7-amd64.deb
I used xmllint to parse the HTML and extract the first part between <h2> tags. Then, after some cleanup with sed, I get the newest version.
Edit:
Noticing that you can just grep <h2> from the site to get the version, you can get it with just:
curl -s "https://slack.com/intl/es/release-notes/linux" | grep -m1 "<h2>" | cut -d' ' -f2 | cut -d'<' -f1
I have a .env file and I am trying to parse a value from it.
I ran this:
cat .env | grep PORT=
I got
PORT=3333
How do I grab the value of a specific key?
cat .env | grep PORT= | cut -d '=' -f2
Let's say your input looks like this:
$ cat test.txt
Port=2020
Email=me#myserver.com
Version=2.02
Then this will do:
awk -F'=' '/^Version/ { print $2}' test.txt
Output
2.02
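Applied to the PORT question, the same pattern would be (assuming the file is named .env as in the question):
awk -F'=' '/^PORT/ { print $2 }' .env
3333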
Use eval to execute the assignment line; afterwards the variable's value can be expanded with $:
eval "$(grep ^PORT= .env)"
echo $PORT
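If the whole .env file consists of plain KEY=VALUE lines that are valid shell syntax, you can also export everything in one go by sourcing it; a sketch under that assumption:
set -a          # auto-export every variable assigned from here on
. ./.env        # each KEY=VALUE line becomes an exported shell variable
set +a
echo "$PORT"    # 3333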
I would like to do the following using svn: search a file for a certain string across its revision history. Is this possible? How do I best go about it?
Do I come up with a script using svn cat -r1 and grep, or is there some better method?
SVN has no direct support for this.
If it's only one file this will work, albeit slowly.
svn log -q <file> | grep '^r' | awk '{print $1;}' | \
xargs -n 1 -i svn cat -r {} <file> | grep '<string>'
Fill in <file> and <string>
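For example, to look for the string TODO across the history of trunk/README.txt (both the path and the string here are just placeholders):
svn log -q trunk/README.txt | grep '^r' | awk '{print $1;}' | \
xargs -n 1 -i svn cat -r {} trunk/README.txt | grep 'TODO'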
A for loop will also work so you can print the matching file/revision if desired.
Using a for loop to have some more output control (this is bash):
#!/usr/bin/env bash
f=$1   # file to search
s=$2   # string to look for
# Walk every revision that touched the file and grep each historical version of it.
for r in $(svn log -q "$f" | grep '^r' | awk '{print $1;}'); do
    e=$(svn cat -r "$r" "$f" | grep "$s")
    if [[ -n "$e" ]]; then
        echo "Found in revision $r: $e"
    fi
done
This takes two arguments: the (path to) the file to search and the string to search for in the file.
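For example, saved as svn-grep.sh (the name is just an example) and made executable, you would run it like this; it prints one "Found in revision rNNN: ..." line per hit:
chmod +x svn-grep.sh
./svn-grep.sh trunk/README.txt 'TODO'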
I have a file with the content below and want to validate it as follows:
1. The file has entries of the form rec$NUM, and each rec$NUM field should be repeated exactly 7 times. For example, for an entry like rec1.any_attribute, rec1 should appear only 7 times in the whole file.
2. I need a validating script for this. If there are fewer than 7 or more than 7 records for a given rec$NUM, the script should report that record.
The file is as follows:
rec1:sourcefile.name=
rec1:mapfile.name=
rec1:outputfile.name=
rec1:logfile.name=
rec1:sourcefile.nodename_col=
rec1:sourcefle.snmpnode_col=
rec1:mapfile.enc=
rec2:sourcefile.name=abc
rec2:mapfile.name=
rec2:outputfile.name=
rec2:logfile.name=
rec2:sourcefile.nodename_col=
rec2:sourcefle.snmpnode_col=
rec2:mapfile.enc=
rec3:sourcefile.name=abc
rec3:mapfile.name=
rec3:outputfile.name=
rec3:logfile.name=
rec3:sourcefile.nodename_col=
rec3:sourcefle.snmpnode_col=
rec3:mapfile.enc=
Please Help
Thanks in Advance... :)
Simple awk:
awk -F: '/^rec/{a[$1]++}END{for(t in a){if(a[t]!=7){print "Some error for record: " t}}}' test.rc
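If you also want the report to show how many times the offending record actually appeared, a small variation of the same awk (a sketch, same assumptions) would be:
awk -F: '/^rec/ { a[$1]++ } END { for (t in a) if (a[t] != 7) printf "Record %s appears %d times (expected 7)\n", t, a[t] }' test.rc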
grep '^rec1' file.txt | wc -l
grep '^rec2' file.txt | wc -l
grep '^rec3' file.txt | wc -l
All of the above should return 7.
The command:
grep rec file2.txt | cut -d':' -f1 | uniq -c | egrep -v '^ *7 '
will print nothing if the file follows your rules, and will print the offending record(s) if it doesn't.
(Insert sort before uniq -c, i.e. ... | cut -d':' -f1 | sort | uniq -c | ..., if lines for different record numbers can be interleaved, since uniq only counts adjacent duplicates.)
I have this bash script that I wrote to analyse the HTML of any given web page. What it's actually supposed to do is return the domains on that page. Currently it's returning the number of URLs on that web page.
#!/bin/sh
echo "Enter a url eg www.bbc.com:"
read url
content=$(wget "$url" -q -O -)
echo "Enter file name to store URL output"
read file
echo $content > $file
echo "Enter file name to store filtered links:"
read links
found=$(cat $file | grep -o -E 'href="([^"#]+)"' | cut -d '"' -f2 | sort | uniq | awk '/http/' > $links)
output=$(egrep -o '^http://[^/]+/' $links | sort | uniq -c > out)
cat out
How can I get it to return the domains instead of the URLs? From my programming knowledge I know it's supposed to parse from the right, but I am a newbie at bash scripting. Can someone please help me? This is as far as I have gone.
I know there's a better way to do this in awk but you can do this with sed, by appending this after your awk '/http/':
| sed -e 's;https\?://;;' | sed -e 's;/.*$;;'
Then you want to move your sort and uniq to the end of that.
So that the whole line will look like:
found=$(cat $file | grep -o -E 'href="([^"#]+)"' | cut -d '"' -f2 | awk '/http/' | sed -e 's;https\?://;;' | sed -e 's;/.*$;;' | sort | uniq -c > out)
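To see what the two added sed stages do to a single URL (the example URL is arbitrary):
echo 'https://www.bbc.com/news/world' | sed -e 's;https\?://;;' | sed -e 's;/.*$;;'
www.bbc.com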
You can get rid of this line:
output=$(egrep -o '^http://[^/]+/' $links | sort | uniq -c > out)
EDIT 2:
Please note that you might want to adapt the search patterns in the sed expressions to your needs. This solution only handles the http(s):// protocol and www. server prefixes...
EDIT:
If you want both the counts and the domains:
lynx -dump -listonly http://zelleke.com | \
sed -n '4,$ s#^.*https\?://\([^/]*\).*$#\1#p' | \
sort | \
uniq -c | \
sed 's/www\.//'
gives
2 wordpress.org
10 zelleke.com
Original Answer:
You might want to use lynx for extracting links from URL
lynx -dump -listonly http://zelleke.com
gives
# blank line at the top of the output
References
1. http://www.zelleke.com/feed/
2. http://www.zelleke.com/comments/feed/
3. http://www.zelleke.com/
4. http://www.zelleke.com/#content
5. http://www.zelleke.com/#secondary
6. http://www.zelleke.com/
7. http://www.zelleke.com/wp-login.php
8. http://www.zelleke.com/feed/
9. http://www.zelleke.com/comments/feed/
10. http://wordpress.org/
11. http://www.zelleke.com/
12. http://wordpress.org/
Based on this output, you can achieve the desired result with:
lynx -dump -listonly http://zelleke.com | \
sed -n '4,$ s#^.*http://\([^/]*\).*$#\1#p' | \
sort -u | \
sed 's/www\.//'
gives
wordpress.org
zelleke.com
You can remove the path from the URL with sed:
sed 's#http://##; s#/.*##'
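For example:
echo 'http://www.example.com/some/path' | sed 's#http://##; s#/.*##'
www.example.com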
I also want to tell you that these two lines are wrong:
found=$(cat $file | grep -o -E 'href="([^"#]+)"' | cut -d '"' -f2 | sort | uniq | awk '/http/' > $links)
output=$(egrep -o '^http://[^/]+/' $links | sort | uniq -c > out)
You must use either the redirection (> out) or the command substitution $(), but not both at the same time, because the variables will be empty in this case.
This part
content=$(wget "$url" -q -O -)
echo $content > $file
would also be better written this way:
wget "$url" -q -O - > $file
You may be interested in this:
https://www.rfc-editor.org/rfc/rfc3986#appendix-B
It explains how to parse a URI using a regex.
So you can parse a URI from the left this way and extract the "authority" part, which contains the domain and subdomain names.
sed -r 's_^([^:/?#]+:)?(//([^/?#]*))?.*_\3_g';
grep -Eo '[^\.]+\.[^\.]+$'    # piped after the first command, this gives what you need
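For example (the URL here is just an illustration):
echo 'https://blog.example.com/post?id=1' | sed -r 's_^([^:/?#]+:)?(//([^/?#]*))?.*_\3_g' | grep -Eo '[^\.]+\.[^\.]+$'
example.com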
This is also interesting:
http://www.scribd.com/doc/78502575/124/Extracting-the-Host-from-a-URL
Assuming that a URL always begins this way
https?://(www\.)?
is really hazardous.