In shell, split a portion of a string with dot as delimiter [duplicate] - linux

This question already has answers here:
How do I split a string on a delimiter in Bash?
(37 answers)
Closed 4 years ago.
I am new to shell scripting. Can you please help with the requirement below? Thanks.
$AU_NAME=AU_MSM3-3.7-00.01.02.03
#separate the string after last "-", with "." as delimiter
#that is, separate "00.01.02.03" and print/save as below.
major=00
minor=01
micro=02
build=03

First, note that you don't use $ when assigning to a parameter in the shell. Your first line should be just this:
AU_NAME=AU_MSM3-3.7-00.01.02.03
The $ is used to get the value of the parameter once assigned. And the bit after the $ can be an expression in curly braces with extra stuff besides just the name, allowing you to perform various operations on the value. For example, you can do something like this:
IFS=. read major minor micro build <<EOF
${AU_NAME##*-}
EOF
where the ##*- strips off everything from the beginning of the string through the last '-', leaving just "00.01.02.03", and the IFS (Internal Field Separator) parameter tells the shell where to break the string into fields.
In bash, zsh, and ksh93+, you can get that onto one line by shortening the here-document to a here-string:
IFS=. read major minor micro build <<<"${AU_NAME##*-}"
More generally, in those same shells, you can split into an arbitrarily-sized array instead of distinct variables:
IFS=. components=(${AU_NAME##*-})
(Though that syntax won't work in especially-ancient versions of ksh; in them you have to do this instead:
IFS=. set -A components ${AU_NAME##*-}
)
That gets you this equivalence (except in zsh, which by default numbers the elements 1-4 instead of 0-3):
major=${components[0]}
minor=${components[1]}
micro=${components[2]}
build=${components[3]}
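As a minimal sketch of the same idea, assuming bash (read -a is a bash feature, not something the answer above relies on), the array can also be filled with read. The IFS=. prefix then applies to that one command only, so the shell's normal field splitting is left alone afterwards:

```shell
#!/usr/bin/env bash
AU_NAME=AU_MSM3-3.7-00.01.02.03

# IFS=. is a temporary assignment scoped to this single read command.
# -r keeps backslashes literal, -a stores the fields in an indexed array.
IFS=. read -r -a components <<<"${AU_NAME##*-}"

major=${components[0]}
minor=${components[1]}
micro=${components[2]}
build=${components[3]}

printf 'major=%s\nminor=%s\nmicro=%s\nbuild=%s\n' "$major" "$minor" "$micro" "$build"
# prints:
# major=00
# minor=01
# micro=02
# build=03
```

Because the number of fields is not fixed, the same line also copes with a three-part or five-part version; ${#components[@]} gives the count.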

In bash, you can do something like this:
version=$(echo "$AU_NAME" | grep -o '[^-]*$')
major=$(echo "$version" | cut -d. -f1)
minor=$(echo "$version" | cut -d. -f2)
micro=$(echo "$version" | cut -d. -f3)
build=$(echo "$version" | cut -d. -f4)
The grep call uses -o which outputs only the matching part of the line. The match itself is every non-hyphen character to the end of the line.
The cut command uses . as the delimiter (-d.) and -f to select individual fields.
It's a little clunky. I'm sure there are probably better ways to achieve this, but you can do quite a lot with grep and cut alone so they're handy tools to have in your arsenal.

You can use parameter expansion and the special IFS variable.
#! /bin/bash
AU_NAME=AU_MSM3-3.7-00.01.02.03
IFS=. VER=(${AU_NAME##*-})
for i in {0..3} ; do
echo ${VER[i]}
done
major=${VER[0]}
minor=${VER[1]}
micro=${VER[2]}
build=${VER[3]}
BTW, in an assignment, do not start the variable on the left hand side with a dollar sign.
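One thing to be aware of: a line consisting only of assignments, such as IFS=. VER=(...), leaves IFS set to a dot for the rest of the script. A possible way around that, sketched here with a hypothetical helper named split_version (not part of the answer above), is to confine the change to a function with local:

```shell
#!/usr/bin/env bash
# split_version is a hypothetical helper name used only for this sketch.
# It stores the dot-separated fields of the last "-" component in VER.
split_version() {
    local IFS=.        # IFS is changed only while this function runs
    # Unquoted on purpose: word splitting on "." is what fills the array.
    # (Unquoted expansions also undergo globbing; version digits are safe.)
    VER=(${1##*-})
}

ifs_before=$IFS
AU_NAME=AU_MSM3-3.7-00.01.02.03
split_version "$AU_NAME"

printf '%s\n' "${VER[@]}"
# prints:
# 00
# 01
# 02
# 03

[ "$IFS" = "$ifs_before" ] && echo "IFS unchanged"
```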

Related

Using echo in bash puts last variable in front of the output

I'm trying to write a script and one of the parts of the script requires me to concatenate some variables together to create a URL.
REPO_URL='https://github.com/Example/Repo.Game/'
FILENAME='Example.Game-linux.zip'
latest_version="$(curl -LIs "${REPO_URL}/releases/latest" | grep -i '^location:' | cut -d' ' -f2 | cut -d'/' -f8)"
echo "$latest_version"
echo "$FILENAME"
echo "$REPO_URL"
echo "${REPO_URL}releases/download/${latest_version}/${FILENAME}"
Output:
2.0.5164
Example.Game-linux.zip
https://github.com/Example/Repo.Game/
/Example.Game-linux.ziple/Repo.Game/releases/download/2.0.5164
My actual output:
2.0.5164
Oxide.Rust-linux.zip
https://github.com/OxideMod/Oxide.Rust/
/Oxide.Rust-linux.zipideMod/Oxide.Rust/releases/download/2.0.5164
It looks like some kind of overflow problem? I'm not exactly sure. I added abcabc to the filename and the output became
/Oxide.Rust-linux.zipabcabc/Oxide.Rust/releases/download/2.0.5164
Any help would be appreciated.
I resolved the problem by removing the carriage return value from the variable.
tr -d '\r' seems to have resolved it. I'm not sure where the variable came from and if anyone has advice on how to clean up this mess I would love some advice.
latest_version="$(curl -LIs "${REPO_URL}/releases/latest" | grep -i '^location:' | cut -d' ' -f2 | cut -d'/' -f8 | tr -d '\r')"
You can use ANSI-C quoting and variable substitution to remove control characters from variables without having to invoke sub-shells.
ANSI-C quoting uses the special format $'...' to represent special characters. For example, use $'\t' for tab, $'\n' for new-line and $'\r' for carriage-return.
Variable substitution uses extra characters at the end of the variable name to perform actions on the variable. For example
${variable//[pattern]/[substitution]} will replace all instances of [pattern] in ${variable} with [substitution].
${variable%[pattern]} will remove [pattern] from ${variable} if it is at the end.
By combining these two, you can remove carriage-return characters from the end of your variable like this:
echo ${variable%$'\r'}
Note: Variable substitution doesn't actually change the contents of the variable. To do that, you have to re-assign the result back to the variable:
variable="${variable%$'\r'}"
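To see the effect without touching the network, here is a small sketch that simulates the problem. The value assigned to raw_version is made up for illustration; it stands in for a header field that still carries the carriage return from a CRLF line ending:

```shell
#!/usr/bin/env bash
# Simulated value with a trailing carriage return (assumption: this mimics
# what the curl | grep | cut pipeline in the question produced).
raw_version=$'2.0.5164\r'
FILENAME='Example.Game-linux.zip'

# %q prints the value in a re-usable quoted form, which makes the otherwise
# invisible carriage return show up as an escape sequence.
printf '%q\n' "$raw_version"

# Strip one trailing carriage return, if present, and re-assign.
latest_version=${raw_version%$'\r'}

url="releases/download/${latest_version}/${FILENAME}"
printf '%s\n' "$url"
# prints: releases/download/2.0.5164/Example.Game-linux.zip
```

With the carriage return left in place, the terminal would move the cursor back to column 0 after the version, and the file name would overwrite the start of the line, which is the scrambled output shown in the question.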
There is a cleaner way to get the version number, minus any trailing carriage-return, from github using sed.
latest_version=$(curl -LIs "${REPO_URL}/releases/latest" | sed -n 's/^Location:.*\/\([^\r]*\).*$/\1/p')
sed reads every line of input (STDIN by default) and performs operations on it defined by the action string parameter. The action string is a little tricky to explain in this case, but here goes:
The -n option suppresses the printing of each input line. Output will then only happen if it is explicitly stated in the action string.
The s/[pattern]/[substitution]/p construct says whenever you find [pattern], replace it with [substitution] and print it. Our [pattern] is ^Location:.*\/\([^\r]*\).*$, and our [substitution] is \1.
The expression ^ matches the beginning of the line.
The expression . means any single character, and the expression .* means any number of characters (including zero). This will match the largest possible string, so, for example .*/ will match abc/def/ in the string abc/def/ghi.
The expression \/ just escapes the forward slash (because we are using the forward slash as the delimiter of the s command, we have to escape it).
The expression \([pattern]\) says any time you find [pattern], remember it. In our case, it will remember whatever matches [^\r]*.
The expression [{chars}] matches any one of the characters in {chars}. [^{chars}] matches any character that is not in {chars}. So [^\r]* matches any number of characters that are not a carriage return.
The expression $ matches the end of a line.
The expression \1 is replaced by the first remembered pattern.
So altogether, our action string says:
If you find a line that starts with Location:, followed by any number of characters, followed by a /, followed by any number of characters that are not a carriage return (which will be remembered), followed by any number of characters, followed by an end of line, then print the remembered characters.
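The walkthrough above can be tried offline against a canned response. This is only a sketch: the headers value is invented sample data (including the releases/tag/ path), and it deliberately swaps in a slightly different technique, deleting the carriage returns with tr first and matching both Location: and location:, because \r inside a bracket expression is not understood by every sed and servers do not always capitalize header names:

```shell
#!/usr/bin/env bash
# Invented sample of what `curl -LIs .../releases/latest` might return.
headers=$'HTTP/2 302\r\nlocation: https://github.com/Example/Repo.Game/releases/tag/2.0.5164\r\n'

# tr removes every carriage return; sed -n prints only the line where the
# substitution succeeded, keeping what follows the last "/".
latest_version=$(printf '%s' "$headers" |
    tr -d '\r' |
    sed -n 's/^[Ll]ocation:.*\/\(.*\)$/\1/p')

printf '%s\n' "$latest_version"
# prints: 2.0.5164
```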

bash extract version string & convert to version dot

I want to extract version string (1_4_5) from my-app-1_4_5.img and then convert into dot version (1.4.5) without filename. Version string will have three (1_4_5) or four (1_4_5_7) segments.
Have this one liner working ls my-app-1_4_5.img | cut -d'-' -f 3 | cut -d'.' -f 1 | tr _ .
Would like to know if there is any better way rather than piping output from cut.
Here's an attempt with parameter expansion. I'm assuming you have a wildcard pattern you want to loop over.
for file in *-*.img; do
base=${file%.img}
ver=${base##*-}
echo "${ver//_/.}"
done
The construct ${var%pattern} returns the value of var with any suffix matching pattern trimmed off. Similarly, ${var#pattern} trims any prefix which matches pattern. In both cases, doubling the operator switches to trimming the longest possible match instead of the shortest. (These are POSIX-compatible parameter expansions, i.e. not strictly Bash only.) The construct ${var/pattern/replacement} replaces the first match of pattern in var with replacement; doubling the first slash causes every match to be replaced. (This one is not POSIX; Bash supports it, as do ksh93 and zsh.)
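The difference between the single and doubled operators is easiest to see side by side. A short sketch, using a four-segment file name as mentioned in the question (the intermediate variable names are only for illustration):

```shell
#!/usr/bin/env bash
file=my-app-1_4_5_7.img

shortest_prefix=${file#*-}    # app-1_4_5_7.img  (removes "my-")
longest_prefix=${file##*-}    # 1_4_5_7.img      (removes "my-app-")
no_ext=${file%.*}             # my-app-1_4_5_7   (removes ".img")

ver=${longest_prefix%.img}    # 1_4_5_7
first_only=${ver/_/.}         # 1.4_5_7  (single slash: first match only)
dotted=${ver//_/.}            # 1.4.5.7  (double slash: every match)

printf '%s\n' "$shortest_prefix" "$longest_prefix" "$no_ext" "$first_only" "$dotted"
```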
You can do it with sed:
sed -E "s/.*([0-9]+)_([0-9]+)_([0-9]+).*/\1.\2.\3/" <<< my-app-1_4_5.img
Assuming the version number will always be between the last dash and the file extension, you can use something like this in pure Bash:
name="file-name-x-1_2_3_4_5.ext"
version=${name##*-}
version=${version%%.ext}
version=${version//_/.}
echo $version
The code above will result in:
1.2.3.4.5
For a complete explanation of the parameter expansions used above, please take a look at the Bash Reference Manual: 3.5.3 Shell Parameter Expansion.
Remove everything but 0 to 9, _ and newline and then replace all _ with .:
echo "my-app-1_4_5.img" | tr -cd '0-9_\n' | tr '_' '.'
Output:
1.4.5
With bash and a regex:
echo "my-app-1_4_5.img" | while IFS= read -r line; do [[ "$line" =~ [^0-9]([0-9_]+)[^0-9] ]] && echo "${BASH_REMATCH[1]//_/.}"; done
Output:
1.4.5
A slightly shorter variant
name=my-app-1_4_5.img
vers=${name//[!0-9_]}
echo "${vers//_/.}"
1.4.5

how to remove first two words of a strings output

I want to remove the first two words that come up in my output string. This string is also within another string.
What I have:
for servers in `ls /data/field`
do
string=`cat /data/field/$servers/time`
This sends this text:
00:00 down server
I would like to remove "00:00 down" so that it only displays "server".
I have tried using cut -d ' ' -f2- $string which ends up just removing directories that the command searches.
Any ideas?
Please, do the things properly :
for servers in /data/field/*; do
string=$(cut -d" " -f3- "$servers/time")
echo "$string"
done
Backticks are outdated; prefer the $( ) form.
Don't parse ls output; use a glob instead, as I do with /data/field/*.
Check http://mywiki.wooledge.org/BashFAQ for various subjects
Use the -d option to set the delimiter to a space
$ echo 00:00 down server | cut -d" " -f3-
server
Note: use field number 3, as the count starts from 1 and not 0.
From man page
-d, --delimiter=DELIM
use DELIM instead of TAB for field delimiter
N- from N'th byte, character or field, to end of line
More Tests
$ echo 00:00 down server hello world| cut -d" " -f3-
server hello world
The for loop is capable of iterating through the files using globbing. So I would write something like
for servers in /data/field/*
do
string=`cut -d" " -f3- "$servers/time"`
...
...
You can use sed as well:
sed 's/^[^ ]* *[^ ]* *//'
For the examples given, I prefer cut. But for the general problem expressed by the question, the answers above have minor shortcomings: they break when you don't know how many spaces are between the words (cut) or whether the line starts with a space (cut, sed), or they cannot easily be used in a pipeline (shell for-loop). Here's a perl example that is fast, efficient, and not too hard to remember:
| perl -pe 's/^\s*(\S+\s+){2}//'
Perl's -p operates like sed: it reads input one line at a time, as -n does, and after doing its work prints the line again. The -e introduces a script given on the command line. The script is a single substitution, s///, which replaces whatever matches the regular expression on the left-hand side with the string on the right-hand side. In this case, the right-hand side is empty, so we're just cutting out the text matched on the left-hand side.
The regular expression syntax, particular to Perl (and PCRE-style derivatives, like those in Python, Ruby, and JavaScript), uses \s to match whitespace and \S to match non-whitespace. So the combination \S+\s+ matches a word followed by its whitespace. We group that sub-expression together with (...) and then tell Perl to match exactly 2 of those in a row with the {m,n} quantifier, where n is optional and m is 2. The leading \s* means trim leading whitespace.
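If Perl is not at hand, the same tolerance for leading and repeated whitespace can be had from the shell itself. This is a different technique from the one-liner above, sketched under the assumption that bash is in use (for the here-string): read splits on runs of whitespace, throws the first two words into a placeholder variable, and leaves the remainder in the last one.

```shell
#!/usr/bin/env bash
# Sample input with a leading space and uneven gaps between words.
line='  00:00   down server hello world'

# With the default IFS, read skips leading whitespace and treats any run of
# blanks as one separator. "_" is just a throwaway variable name; the last
# variable (rest) receives everything after the second word.
read -r _ _ rest <<<"$line"

printf '%s\n' "$rest"
# prints: server hello world
```

Inside a pipeline the same read can sit in a while loop: ... | while read -r _ _ rest; do printf '%s\n' "$rest"; done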

Making a bash script to accept input from file OR piping output

I have the following bash script which takes the tabular data as input,
get the first line and spit them vertically:
#!/bin/bash
# my_script.sh
export LC_ALL=C
file=$1
head -n1 $file |
tr "\t" "\n" |
awk '{print $1 " " NR-1}'
The problem is that I can only execute it this way:
$ myscript.sh some_tab_file.txt
What I want, on top of the above capability, is to also be able to do this:
$ cat some_tab_file.txt | myscript.sh
Namely take it from pipe output. How can I achieve that?
I'd normally write:
export LC_ALL=C
head -n1 "$@" |
tr "\t" "\n" |
awk '{print $1 " " NR-1}'
This works with any number of file arguments; with none, head reads standard input. Using "$@" is important in this and many other contexts. See the Bash manual on special parameters and shell parameter expansion for more information on the many and varied notations available for controlling how shell parameters are handled. Generally, double quotes are a good idea, especially if the file names may contain spaces.
A common idiom is to fall back to - (the conventional name for standard input) as the input file if there are no parameters. There is a convenient shorthand for that:
file=${1--}
The substitution ${variable-fallback} evaluates to the variable's value, or fallback if it's unset.
I believe your script should work as-is, though; head will read standard input if the (unquoted!) file name you pass in evaluates to the empty string.
Take care to properly double-quote all interpolations of "$file", by the way; otherwise, your script won't work on filenames containing spaces or shell metacharacters. (Then you break the fortunate side effect of not passing a filename to head if your script did not receive one, though.)
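To check that both invocation styles behave the same, the pipeline can be wrapped in a function and fed the same data twice. This is a sketch only: print_header_fields is a hypothetical name, and the sample table is made up.

```shell
#!/usr/bin/env bash
export LC_ALL=C

# Hypothetical wrapper around the pipeline from the question.
# "$@" expands to the file arguments, or to nothing at all, in which case
# head falls back to reading standard input.
print_header_fields() {
    head -n1 "$@" |
        tr '\t' '\n' |
        awk '{print $1 " " NR-1}'
}

# Made-up tab-separated sample data.
tmp=$(mktemp)
printf 'id\tname\tscore\n1\tfoo\t3\n' > "$tmp"

from_file=$(print_header_fields "$tmp")          # file name as argument
from_pipe=$(cat "$tmp" | print_header_fields)    # data arriving on a pipe
rm -f "$tmp"

printf '%s\n' "$from_pipe"
# prints:
# id 0
# name 1
# score 2
```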

Cut not working as a variable

I have a weird situation. I am trying to cut some info from a file and everything works fine when I run the command straight in the terminal, but as soon as I make it a variable in a script it returns a mixture of what it should cut and a list of the files in the current directory.
cat query.sql | cut -d':' -f3,4
works but...
QUERY_SQL="query.sql"
MYSQL_COMMAND=`cat $QUERY_SQL | cut -d':' -f3,4`
echo $MYSQL_COMMAND
returns the weird output mentioned above.
What am I doing wrong?
EDIT:
The query file looks something like this...
email@somehwhere.com:3:SQL CODE
I suspect something in the contents of MYSQL_COMMAND is being interpreted as a filename glob pattern. Try changing
MYSQL_COMMAND=`cat $QUERY_SQL | cut -d':' -f3,4`
echo $MYSQL_COMMAND
to
MYSQL_COMMAND="$(cut -d: -f3,4 < "$QUERY_SQL")"
printf '%s\n' "$MYSQL_COMMAND"
Best defensive coding practice for shell is to put double quotes around every variable substitution, unless you know for a fact that you need word splitting and glob expansion to happen after a particular substitution. Changing echo to printf '%s\n' avoids a related set of problems. I can never remember whether you actually need double quotes around $(...) in a variable assignment, so I put them in just to be safe.
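The suspected cause can be reproduced in a scratch directory. In this sketch the SQL text and the file names a.txt and b.txt are invented for illustration; the point is only that an unquoted * in the variable's value is expanded against the current directory:

```shell
#!/usr/bin/env bash
# Scratch directory containing two made-up files.
tmpdir=$(mktemp -d)
touch "$tmpdir/a.txt" "$tmpdir/b.txt"

# Hypothetical query text; the * is what triggers the surprise.
MYSQL_COMMAND='SELECT * FROM users'

# Unquoted: the shell word-splits the value and expands * as a glob.
unquoted=$(cd "$tmpdir" && echo $MYSQL_COMMAND)

# Quoted: the value is passed through untouched.
quoted=$(cd "$tmpdir" && printf '%s\n' "$MYSQL_COMMAND")

rm -rf "$tmpdir"

printf '%s\n' "$unquoted" "$quoted"
# prints:
# SELECT a.txt b.txt FROM users
# SELECT * FROM users
```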
