I'm trying to write a script and one of the parts of the script requires me to concatenate some variables together to create a URL.
REPO_URL='https://github.com/Example/Repo.Game/'
FILENAME='Example.Game-linux.zip'
latest_version="$(curl -LIs "${REPO_URL}/releases/latest" | grep -i '^location:' | cut -d' ' -f2 | cut -d'/' -f8)"
echo "$latest_version"
echo "$FILENAME"
echo "$REPO_URL"
echo "${REPO_URL}releases/download/${latest_version}/${FILENAME}"
Output:
2.0.5164
Example.Game-linux.zip
https://github.com/Example/Repo.Game/
/Example.Game-linux.ziple/Repo.Game/releases/download/2.0.5164
My actual output:
2.0.5164
Oxide.Rust-linux.zip
https://github.com/OxideMod/Oxide.Rust/
/Oxide.Rust-linux.zipideMod/Oxide.Rust/releases/download/2.0.5164
It looks like some kind of overflow problem? I'm not exactly sure. I added abcabc to the filename and the output became
/Oxide.Rust-linux.zipabcabc/Oxide.Rust/releases/download/2.0.5164
Any help would be appreciated.
I resolved the problem by removing the carriage return value from the variable.
tr -d '\r' seems to have resolved it. I'm not sure where the carriage return came from, and if anyone has advice on how to clean up this mess I would love to hear it.
latest_version="$(curl -LIs "${REPO_URL}/releases/latest" | grep -i '^location:' | cut -d' ' -f2 | cut -d'/' -f8 | tr -d '\r')"
You can use ANSI quoting, and variable substitution to remove control characters from variables without having to invoke sub-shells.
ANSI quoting uses the special format $'...' to represent special characters. For example, use $'\t' for tab, $'\n' for newline and $'\r' for carriage return.
Variable substitution uses extra characters at the end of the variable name to perform actions on the variable. For example
${variable//[pattern]/[substitution]} will replace all instances of [pattern] in ${variable} with [substitution].
${variable%[pattern]} will remove [pattern] from ${variable} if it is at the end.
By combining these two, you can remove carriage-return characters from the end of your variable like this:
echo ${variable%$'\r'}
Note: Variable substitution doesn't actually change the contents of the variable. To do that, you have to re-assign the result back to the variable:
variable="${variable%$'\r'}"
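As a minimal sketch of the two expansions above, the following simulates a value that picked up a trailing carriage return (HTTP header lines end in CR LF, which is the likely source here) and cleans it before building the URL. The sample version string and the names clean_version, clean_all and url are assumptions for illustration, and the sketch assumes bash, since $'...' may not be available in every sh.

```bash
#!/usr/bin/env bash
# Simulated value captured from a header line: note the trailing carriage return.
latest_version=$'2.0.5164\r'
REPO_URL='https://github.com/Example/Repo.Game/'
FILENAME='Example.Game-linux.zip'

# Strip a single trailing carriage return, if there is one.
clean_version="${latest_version%$'\r'}"

# Alternative: remove every carriage return anywhere in the value.
clean_all="${latest_version//$'\r'/}"

url="${REPO_URL}releases/download/${clean_version}/${FILENAME}"
printf '%s\n' "$url"
```

With the carriage return gone, the terminal no longer jumps back to column 0 in the middle of the line, so the URL prints in the expected order.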
There is a cleaner way to get the version number, minus any trailing carriage-return, from github using sed.
latest_version=$(curl -LIs "${REPO_URL}/releases/latest" | sed -n 's/^Location:.*\/\([^\r]*\).*$/\1/p')
sed reads every line of input (STDIN by default) and performs operations on it defined by the action string parameter. The action string is a little tricky to explain in this case, but here goes:
The -n option suppresses the printing of each input line. Output will then only happen if it is explicitly stated in the action string.
The s/[pattern]/[substitution]/p construct says: whenever you find [pattern], replace it with [substitution] and print the result. Our [pattern] is ^Location:.*\/\([^\r]*\).*$, and our [substitution] is \1.
The expression ^ matches the beginning of the line.
The expression . means any single character, and the expression .* means any number of characters (including zero). This will match the largest possible string, so, for example .*/ will match abc/def/ in the string abc/def/ghi.
The expression \/ just escapes the forward slash (because we are using the forward slash as the delimiter, we have to escape it).
The expression \([pattern]\) says any time you find [pattern], remember it. In our case, it will remember whatever matches [^\r]*.
The expression [{chars}] matches any one of the characters in {chars}. [^{chars}] matches any character that is not in {chars}. So [^\r]* matches any number of characters that are not a carriage return.
The expression $ matches the end of a line.
The expression \1 is replaced by the first remembered pattern.
So altogether, our action string says:
If you find a line that starts with Location:, followed by any number of characters, followed by a /, followed by any number of characters that are not a carriage return (which will be remembered), followed by any number of characters, followed by an end of line, then print the remembered characters.
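To try the action string without touching the network, here is a rough sketch that feeds sed a canned set of response headers. The header text, the headers and cr variable names, and the [Ll] tweak (some servers send the header name in lower case) are assumptions for illustration. The carriage return is passed in through a shell variable so the bracket expression does not depend on how a particular sed treats \r.

```bash
#!/usr/bin/env bash
# Canned headers standing in for the output of: curl -LIs "${REPO_URL}/releases/latest"
headers=$'HTTP/2 302\r\nlocation: https://github.com/Example/Repo.Game/releases/tag/2.0.5164\r\n\r\n'

# A literal carriage return, spliced into the bracket expression below.
cr=$'\r'

# Print only the last path component of the Location header, minus the CR.
latest_version=$(printf '%s' "$headers" |
    sed -n "s/^[Ll]ocation:.*\/\([^$cr]*\).*\$/\1/p")

printf '%s\n' "$latest_version"
```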
I want to remove the first two words that come up in my output string. this string is also within another string.
What I have:
for servers in `ls /data/field`
do
string=`cat /data/field/$servers/time`
This sends this text:
00:00 down server
I would like to remove "00:00 down" so that it only displays "server".
I have tried using cut -d ' ' -f2- $string which ends up just removing directories that the command searches.
Any ideas?
Please do things properly:
for servers in /data/field/*; do
string=$(cut -d" " -f3- "$servers/time")
echo "$string"
done
As of 2014, backticks are deprecated in favor of the $( ) form.
Don't parse ls output; use a glob instead, as I do with /data/field/*
Check http://mywiki.wooledge.org/BashFAQ for various subjects
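A runnable sketch of the glob-based loop follows. It builds a throwaway directory in place of /data/field so it can be tried anywhere; the directory names alpha and beta, the sample file contents, and the results array are made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for /data/field so the sketch runs on any machine.
field_dir=$(mktemp -d)
mkdir -p "$field_dir/alpha" "$field_dir/beta"
printf '00:00 down server\n' > "$field_dir/alpha/time"
printf '12:30 up server hello world\n' > "$field_dir/beta/time"

results=()
for server_dir in "$field_dir"/*/; do
    # The glob already expands to the full path (with a trailing slash),
    # so the directory prefix must not be added a second time.
    string=$(cut -d' ' -f3- "${server_dir}time")
    results+=("$string")
    printf '%s\n' "$string"
done

rm -rf "$field_dir"
```

Because the glob yields complete paths, prefixing the loop variable with the directory again would point at a file that does not exist.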
Use the -d option to set the delimiter to space
$ echo 00:00 down server | cut -d" " -f3-
server
Note: use field number 3, as the count starts from 1, not 0.
From man page
-d, --delimiter=DELIM
use DELIM instead of TAB for field delimiter
N- from N'th byte, character or field, to end of line
More Tests
$ echo 00:00 down server hello world| cut -d" " -f3-
server hello world
The for loop is capable of iterating through the files using globbing. So I would write something like
for servers in /data/field/*
do
string=`cut -d" " -f3- "$servers/time"`
...
...
You can use sed as well:
sed 's/^.* * //'
For the examples given, I prefer cut. But for the general problem expressed by the question, the answers above have minor shortcomings: cut fails when you don't know how many spaces separate the words, cut and sed both stumble when the line may or may not start with a space, and the shell for-loop cannot easily be used in a pipeline. Here's a Perl example that is fast, efficient, and not too hard to remember:
| perl -pe 's/^\s*(\S+\s+){2}//'
Perl's -p operates like sed's default mode. That is, it reads input one line at a time, like -n, and after doing its work, prints the line again. The -e starts the command-line-based script. The script is simply a one-line substitute s/// expression: substitute whatever matches the regular expression on the left-hand side with the string on the right-hand side. In this case, the right-hand side is empty, so we're just cutting out the expression found on the left-hand side.
The regular expression, particular to Perl (and all PCRE derivatives, like those in Python and Ruby and JavaScript), uses \s to match whitespace, and \S to match non-whitespace. So the combination of \S+\s+ matches a word followed by its whitespace. We group that sub-expression together with (...) and then tell Perl to match exactly 2 of those in a row with the {m,n} expression, where n is optional and m is 2. The leading \s* means trim leading whitespace.
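If Perl is not at hand, a similar effect can be sketched in plain bash with the read builtin. This is a different technique from the Perl one-liner above, not a translation of it: read splits on runs of whitespace and ignores leading whitespace, so the first two words can be discarded and the remainder kept. The function name drop_two_words and the sample line are assumptions for illustration.

```bash
#!/usr/bin/env bash
# drop_two_words: read lines on stdin and print each one without its first two words.
drop_two_words() {
    local first second rest
    # read trims leading whitespace and splits on runs of spaces or tabs;
    # everything after the second word ends up in "rest".
    # The "|| [ -n ... ]" part handles a final line with no trailing newline.
    while read -r first second rest || [ -n "$first" ]; do
        printf '%s\n' "$rest"
    done
}

printf '  00:00   down  server hello\n' | drop_two_words
```

Like the Perl version, it works in a pipeline and does not care how many spaces separate the words.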
How can I use sed to get the SOMETHING in <version.suffix>SOMETHING</version.suffix>?
I tried sed 's#.*>\(.*\)\<version\.suffix\>#\1#', but it fails.
Try this one:
sed 's/<.*>\(.*\)<.*>/\1/'
It should be general enough to get every xml value.
If you need to eliminate the indentation add \s* at the beginning like this:
sed 's/\s*<.*>\(.*\)<.*>/\1/'
Alternatively if you only want version.suffix's value, you can make the command more specific like this:
sed 's/<version\.suffix>\(.*\)<.*>/\1/'
You could use the below sed command,
$ echo '<version.suffix>SOMETHING</version.suffix>' | sed 's#^<[^>]*>\(.*\)<\/[^>]*>$#\1#'
SOMETHING
^<[^>]*> Matches the first tag string <version.suffix>.
\(.*\)<\/[^>]*>$ Characters up to the next closing tag are captured, and the remaining closing tag is matched by the <\/[^>]*> regex.
Finally all the matched characters are replaced by the characters which are present inside the group index 1.
Your regex is almost correct; the only problem is that you forgot to use / inside the closing tag.
$ echo '<version.suffix>SOMETHING</version.suffix>' | sed 's#.*>\(.*\)</version\.suffix>#\1#'
|<-Here
SOMETHING
Many ways possible, e.g:
with sed
echo '<version.suffix>SOMETHING</version.suffix>' | sed 's#<[^>]*>##g'
or grep
echo '<version.suffix>SOMETHING</version.suffix>' | grep -oP '<version.suffix>\KSOMETHING(?=</version.suffix>)'
Assuming the formatting of the question is accurate, when I run the example in the question as-is:
$ echo '<version.suffix>SOMETHING</version.suffix>' | sed 's#.*>\(.*\)\<version\.suffix\>#\1#'
I see the following output:
SOMETHING</>
In case my formatting skills fail me, this output ends with the trailing left angle bracket, a forward slash, and finally the right angle bracket.
So, why this "failure"? Well, on my system (Linux with GNU grep 2.14), grep(1) includes the following snippet:
The Backslash Character and Special Expressions
The symbols \< and \> respectively match the empty string at the beginning and end of a word.
Other answers suggest good alternatives to extract the value in XML tag syntax; use them.
I just wanted to point out why the RE in the original problem fails on current Linux systems: some symbols match no actual characters, but instead match empty boundaries in these apps that support posix-extended regular expressions. So, in this example, the brackets in the source are matched in unexpected ways:
the \(.*\) has matched SOMETHING</, to be printed by the \1 back-reference
the left-hand side of version.suffix is matched by \<
version.suffix is matched by version\.suffix
the right-hand side of version.suffix is matched by \>
the trailing > character remains in sed's pattern space and is printed.
TL;DR: "\X" does not mean "just match an X" for all X!
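For completeness, here is a small sketch that avoids \< and \> altogether and matches the angle brackets literally. The surrounding <project> element and the variable names xml and suffix are made up for illustration; the -n and p pair makes sed print only the lines where the tag was actually found.

```bash
#!/usr/bin/env bash
# Sample input; the enclosing <project> element is hypothetical.
xml='<project>
  <version.suffix>SOMETHING</version.suffix>
</project>'

# Literal < and > need no escaping in a basic regular expression.
suffix=$(printf '%s\n' "$xml" |
    sed -n 's#.*<version\.suffix>\(.*\)</version\.suffix>.*#\1#p')

printf '%s\n' "$suffix"
```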
I have a string like this:
/var/cpanel/users/joebloggs:DNS9=domain.example
I need to extract the username (joebloggs) from this string and store it in a variable.
The format of the string will always be the same with exception of joebloggs and domain.example so I am thinking the string can be split twice using cut?
The first split would split by : and we would store the first part in a variable to pass to the second split function.
The second split would split by / and store the last word (joebloggs) into a variable
I know how to do this in PHP using arrays and splits but I am a bit lost in bash.
To extract joebloggs from this string in bash using parameter expansion without any extra processes...
MYVAR="/var/cpanel/users/joebloggs:DNS9=domain.example"
NAME=${MYVAR%:*} # retain the part before the colon
NAME=${NAME##*/} # retain the part after the last slash
echo $NAME
Doesn't depend on joebloggs being at a particular depth in the path.
Summary
An overview of a few parameter expansion modes, for reference...
${MYVAR#pattern} # delete shortest match of pattern from the beginning
${MYVAR##pattern} # delete longest match of pattern from the beginning
${MYVAR%pattern} # delete shortest match of pattern from the end
${MYVAR%%pattern} # delete longest match of pattern from the end
So # means match from the beginning (think of a comment line) and % means from the end. One instance means shortest and two instances means longest.
You can get substrings based on position using numbers:
${MYVAR:3} # Remove the first three chars (leaving 4..end)
${MYVAR::3} # Return the first three characters
${MYVAR:3:5} # The next five characters after removing the first 3 (chars 4-8)
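A quick way to check the offsets is to try them on a string whose positions are easy to count. The variable names here are only for illustration.

```bash
#!/usr/bin/env bash
s="abcdefghij"

tail_part=${s:3}     # skip the first three characters: defghij
head_part=${s::3}    # keep only the first three characters: abc
mid_part=${s:3:5}    # skip three, then take five: defgh

printf '%s\n' "$tail_part" "$head_part" "$mid_part"
```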
You can also replace particular strings or patterns using:
${MYVAR/search/replace}
The pattern is in the same format as file-name matching, so * (any characters) is common, often followed by a particular symbol like / or .
Examples:
Given a variable like
MYVAR="users/joebloggs/domain.example"
Remove the path leaving file name (all characters up to a slash):
echo ${MYVAR##*/}
domain.example
Remove the file name, leaving the path (delete shortest match after last /):
echo ${MYVAR%/*}
users/joebloggs
Get just the file extension (remove all before last period):
echo ${MYVAR##*.}
example
NOTE: To do two operations, you can't combine them, but have to assign to an intermediate variable. So to get the file name without path or extension:
NAME=${MYVAR##*/} # remove part before last slash
echo ${NAME%.*} # from the new var remove the part after the last period
domain
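Putting the summary to work on the string from the question, a sketch along these lines pulls out both the username and the domain with intermediate variables. The names path, username and domain are assumptions for illustration.

```bash
#!/usr/bin/env bash
MYVAR="/var/cpanel/users/joebloggs:DNS9=domain.example"

path=${MYVAR%%:*}       # longest match of ":*" removed from the end
username=${path##*/}    # longest match of "*/" removed from the beginning
domain=${MYVAR##*=}     # everything through the last "=" removed

printf '%s %s\n' "$username" "$domain"
```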
Define a function like this:
getUserName() {
echo "$1" | cut -d : -f 1 | xargs basename
}
And pass the string as a parameter:
userName=$(getUserName "/var/cpanel/users/joebloggs:DNS9=domain.example")
echo $userName
What about sed? That will work in a single command:
sed 's#.*/\([^:]*\).*#\1#' <<<$string
The # are being used for regex dividers instead of / since the string has / in it.
.*/ grabs the string up to the last forward slash.
\( .. \) marks a capture group. This is \([^:]*\).
The [^:] says any character except a colon, and the * means zero or more.
.* means the rest of the line.
\1 means substitute what was found in the first (and only) capture group. This is the name.
Here's the breakdown matching the string with the regular expression:
       /var/cpanel/users/  joebloggs  :DNS9=domain.example   joebloggs
sed 's#.*/                 \([^:]*\)  .*                  #\1       #'
Using a single Awk:
... | awk -F '[/:]' '{print $5}'
That is, using as field separator either / or :, the username is always in field 5.
To store it in a variable:
username=$(... | awk -F '[/:]' '{print $5}')
A more flexible implementation with sed that doesn't require username to be field 5:
... | sed -e 's/:.*//' -e 's?.*/??'
That is, delete everything from : onward, and then delete everything up to the last /. Since this does not depend on the field position, it is the more robust alternative, and sed is probably also faster than awk.
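The difference between the two approaches shows up when the path depth changes. In this sketch the second input line, with an extra /home component, is made up for illustration, as are the variable names.

```bash
#!/usr/bin/env bash
line="/var/cpanel/users/joebloggs:DNS9=domain.example"
deeper="/home/var/cpanel/users/joebloggs:DNS9=domain.example"   # hypothetical deeper path

by_awk=$(printf '%s\n' "$line" | awk -F '[/:]' '{print $5}')
by_sed=$(printf '%s\n' "$line" | sed -e 's/:.*//' -e 's?.*/??')

# With one more directory level, field 5 is no longer the username.
deeper_awk=$(printf '%s\n' "$deeper" | awk -F '[/:]' '{print $5}')
deeper_sed=$(printf '%s\n' "$deeper" | sed -e 's/:.*//' -e 's?.*/??')

printf '%s %s %s %s\n' "$by_awk" "$by_sed" "$deeper_awk" "$deeper_sed"
```

Quoting the sed scripts matters here: an unquoted s?.*/?? contains characters the shell could treat as a glob.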
Using a single sed
echo "/var/cpanel/users/joebloggs:DNS9=domain.example" | sed 's/.*\/\(.*\):.*/\1/'
I like to chain together awk commands using different delimiters set with the -F argument. First, split the string on /users/ and then on :
txt="/var/cpanel/users/joebloggs:DNS9=domain.com"
echo $txt | awk -F"/users/" '{print$2}' | awk -F: '{print $1}'
$2 gives the text after the delim, $1 the text before it.
I know I'm a little late to the party and there's already good answers, but here's my method of doing something like this.
DIR="/var/cpanel/users/joebloggs:DNS9=domain.example"
echo ${DIR} | rev | cut -d'/' -f 1 | rev | cut -d':' -f1
I would like to display the last word in these lines. I tried searching for the word value, but got no result, so I thought of searching for the words between quotes; however, my file contains other quoted words that I do not need. What I actually want is to display the values of the select tag in my HTML file.
grep '*' hosts.html | awk '{print $NF}'
For example:
value='www.visit-tunisia.com'>www.visit-tunisia.com
value='www.watania1.tn'>www.watania1.tn
value='www.watania2.tn'>www.watania2.tn
I would have
www.visit-tunisia.com
www.watania1.tn
www.watania2.tn
You need to set the field separator to >, which you do with the -F option:
$ awk -F'>' '{print $NF}' hosts.html
www.visit-tunisia.com
www.watania1.tn
www.watania2.tn
Note: I'm not sure what you are trying to achieve by grep '*' hosts.html?
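If the real hosts.html also contains lines without a >, a small guard keeps them out of the result. This is only a sketch: the sample lines (including the stray "blah" line) stand in for the file, and the NF > 1 condition is an addition that is not part of the command above.

```bash
#!/usr/bin/env bash
# Sample lines standing in for hosts.html; the "blah" line is made up.
sample="value='www.visit-tunisia.com'>www.visit-tunisia.com
blah
value='www.watania1.tn'>www.watania1.tn
value='www.watania2.tn'>www.watania2.tn"

# NF > 1 is true only for lines that actually contain the separator.
hosts=$(printf '%s\n' "$sample" | awk -F'>' 'NF > 1 {print $NF}')

printf '%s\n' "$hosts"
```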
Interpreting the comment liberally, you have input lines which might contain:
value='www.visit-tunisia.com'>www.visit-tunisia.com
value='www.watania1.tn'>www.watania1.tn
value='www.watania2.tn'>www.watania2.tn
and you would like the names which are repeated on a line as the output:
www.visit-tunisia.com
www.watania1.tn
www.watania2.tn
This can be done using sed and capturing parentheses.
sed -n -e "s/.*'\([^']*\)'.*\1.*/\1/p"
The -n says "don't print unless I say to do so". The s///p command prints if the substitute works. The pattern looks for a stream of 'anything' (.*), a single quote, captures what's inside up to the next single quote ('\([^']*\)') followed by any text, the captured text (the first \1), and anything. The replacement text is what was captured (the second \1).
Example:
$ cat data
www and wotnot
value='www.visit-tunisia.com'>www.visit-tunisia.com
blah
value='www.watania1.tn'>www.watania1.tn
hooplah
value='www.watania2.tn'>www.watania2.tn
if 'nothing' is required, nothing will be done.
$ sed -n -e "s/.*'\([^']*\)'.*\1.*/\1/p" data
www.visit-tunisia.com
www.watania1.tn
www.watania2.tn
nothing
$
Clearly, you can refine the [^']* part of the match if you want to. I used double quotes around the expression since the pattern matches on single quotes. Life is trickier if you need to allow both single and double quotes; at that point, I'd put the script into a file and run sed -f script data to make life easier.
sed 's/.*>\(.*\)/\1/g' your_file
I'm stumped with how to remove a portion of a string that has forward slashes and question marks in it.
Example: /diag/PeerManager/list?deviceid=RXMWANT8WFYJNF7K6DXXXJLJVN
and I need the output to be RXMWANT8WFYJNF7K6DXXXJLJVN
I've tried tr and sed but tr removes some of the characters I need in the output. sed is giving me trouble because of the forward slashes.
What's a quick method to remove the /diag/PeerManager/list?deviceid= portion of my string?
thanks!
echo "/diag/PeerManager/list?deviceid=RXMWANT8WFYJNF7K6DXXXJLJVN" | sed -n 's:/[a-zA-Z]*/[a-zA-Z]*/[a-zA-Z]*?[a-zA-Z]*=::p'
This should do the trick. I chose the colon as the delimiter as it will not cause any issues with the forward slash. This makes a lot of assumptions about the type of input it will be receiving, specifically that it will contain exactly three forward slashes with lower- and uppercase letters between them, a series of letters ending in a question mark, and another series of letters ending in an equals sign. This then removes those items and prints the remaining characters (your device id).
This worked for me:
sed 's/.*deviceid=\([^&]*\).*/\1/'
Example:
$ echo '/diag/PeerManager/list?deviceid=RXMWANT8WFYJNF7K6DXXXJLJVN' | sed 's/.*deviceid=\([^&]*\).*/\1/'
RXMWANT8WFYJNF7K6DXXXJLJVN
This is not the most robust solution, but if you have a fixed set of input that will never change, it's probably good enough.
One way using awk, if there is only a single occurrence of an = on each line:
awk -F= '{ print $2 }' file.txt
Results:
RXMWANT8WFYJNF7K6DXXXJLJVN
Use Equals Sign as Field Delimiter
If you know that your GET query string will always have only one parameter (in this case, deviceid) then you can just use the equals sign as a field delimiter with the standard cut utility. For example:
$ echo '/diag/PeerManager/list?deviceid=RXMWANT8WFYJNF7K6DXXXJLJVN' |
cut -d= -f2-
RXMWANT8WFYJNF7K6DXXXJLJVN
How about:
$ echo '/diag/PeerManager/list?deviceid=RXMWANT8WFYJNF7K6DXXXJLJVN' | sed 's/^.*=//'
RXMWANT8WFYJNF7K6DXXXJLJVN
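When the string is already in a shell variable, the same result can be sketched with bash parameter expansion and no external process. The extra &format=json parameter is made up to show that trailing parameters are dropped, and the variable names are assumptions for illustration.

```bash
#!/usr/bin/env bash
url='/diag/PeerManager/list?deviceid=RXMWANT8WFYJNF7K6DXXXJLJVN&format=json'

device_id=${url##*deviceid=}   # drop everything through "deviceid="
device_id=${device_id%%&*}     # drop any following "&name=value" parameters

printf '%s\n' "$device_id"
```

Forward slashes and question marks need no special handling here, because nothing is being passed through a sed delimiter.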