bash extract version string & convert to version dot - linux

I want to extract the version string (1_4_5) from my-app-1_4_5.img and then convert it into a dotted version (1.4.5) without the filename. The version string will have three (1_4_5) or four (1_4_5_7) segments.
I have this one-liner working:
ls my-app-1_4_5.img | cut -d'-' -f 3 | cut -d'.' -f 1 | tr _ .
I would like to know whether there is a better way than piping the output through cut.

Here's an attempt with parameter expansion. I'm assuming you have a wildcard pattern you want to loop over.
for file in *-*.img; do
base=${file%.img}
ver=${base##*-}
echo "${ver//_/.}"
done
The construct ${var%pattern} returns the variable var with any suffix matching pattern trimmed off. Similarly, ${var#pattern} trims any prefix which matches pattern. In both cases, doubling the operator switches to trimming the longest possible match instead of the shortest. (These are POSIX-compatible parameter expansions, i.e. not strictly Bash only.) The construct ${var/pattern/replacement} replaces the first match in var on pattern with replacement; doubling the first slash causes every match to be replaced. (This is Bash only.)
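As a small illustration of the shortest-versus-longest distinction described above, here is a sketch that applies each operator to the sample filename. The variable name file is only there for the demo.

```shell
file=my-app-1_4_5.img

echo "${file#*-}"    # shortest prefix matching *- removed: app-1_4_5.img
echo "${file##*-}"   # longest prefix matching *- removed:  1_4_5.img
echo "${file%_*}"    # shortest suffix matching _* removed: my-app-1_4
echo "${file%%_*}"   # longest suffix matching _* removed:  my-app-1
```

All four forms are POSIX, so this part should also run under plain sh.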

You can do it with sed:
sed -E "s/.*([0-9]+)_([0-9]+)_([0-9]+).*/\1.\2.\3/" <<< my-app-1_4_5.img
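The sed command above is written for exactly three segments. A possible variant that accepts three, four or more segments is sketched below; it assumes the name ends in -<version>.img, and version_sed is just a hypothetical helper name.

```shell
# Hypothetical helper: print the dotted version embedded in an .img file name.
# Assumes the name looks like <anything>-<n>_<n>[_<n>...].img
version_sed() {
  printf '%s\n' "$1" |
    sed -E -e 's/^.*-([0-9]+(_[0-9]+)*)\.img$/\1/' -e 's/_/./g'
}

version_sed my-app-1_4_5.img     # 1.4.5
version_sed my-app-1_4_5_7.img   # 1.4.5.7
```

If the name does not match that shape, the first substitution does nothing and only the underscores get replaced, so the input should be checked first when that matters.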

Assuming the version number will always be between the last dash and the file extension, you can use something like this in pure Bash:
name="file-name-x-1_2_3_4_5.ext"
version=${name##*-}
version=${version%%.ext}
version=${version//_/.}
echo $version
The code above will result in:
1.2.3.4.5
For a complete explanation of the parameter expansions used above, please take a look at the Bash Reference Manual: 3.5.3 Shell Parameter Expansion.

Remove everything but 0 to 9, _ and newline and then replace all _ with .:
echo "my-app-1_4_5.img" | tr -cd '0-9_\n' | tr '_' '.'
Output:
1.4.5

With bash and a regex:
echo "my-app-1_4_5.img" | while IFS= read -r line; do [[ "$line" =~ [^0-9]([0-9_]+)[^0-9] ]] && echo "${BASH_REMATCH[1]//_/.}"; done
Output:
1.4.5
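The digit-based approaches rely on the version being the only digits in the name; something like my-app2-1_4_5.img would probably trip them up. A stricter sketch anchors the regex on the last dash and the extension instead. The function name version_from_name is made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the dotted version, or fail if the name does not
# end in -<digits>_<digits>[_<digits>...].<extension>
version_from_name() {
  local re='-([0-9]+(_[0-9]+)+)\.[^.]+$'
  [[ $1 =~ $re ]] || return 1
  # BASH_REMATCH[1] holds the first capture group, e.g. 1_4_5
  printf '%s\n' "${BASH_REMATCH[1]//_/.}"
}

version_from_name my-app2-1_4_5.img   # 1.4.5
```

Keeping the pattern in a variable avoids quoting surprises on the right-hand side of =~.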

A slightly shorter variant
$ name=my-app-1_4_5.img
$ vers=${name//[!0-9_]}
$ echo ${vers//_/.}
1.4.5

Related

Find the depth of the current path

How can I write a shell script to find the depth of the current path?
Assuming I am in:
/home/user/test/test1/test2/test3
It should return 6.
With shell parameter expansions, no external commands:
$ var=${PWD//[!\/]}
$ echo ${#var}
6
The first expansion removes all characters that are not /; the second one prints the length of var.
Explanations with details on support by POSIX shell or Bash (the note in parentheses after each item says whether the feature is available in POSIX sh, Bash, or both):
$PWD contains the path to the current working directory. (sh/Bash)
The ${parameter/pattern/string} expansion replaces the first occurrence of pattern in the expansion of parameter with string. (Bash)
If the first slash is doubled (as in our case), all occurrences are replaced.
If string is empty, the slash after pattern is optional (as in our case).
The pattern [!\/] is a bracket expression and stands for "any character other than slash". (sh/Bash)
The slash has to be escaped, \/, or it is interpreted as ending the pattern.
! as the first character in a bracket expression negates the expression: any character other than the ones in the expression match the pattern. POSIX sh requires support for ! and says the behaviour for using ^ is undefined; Bash supports both ! and ^. Notice that this is not a bracket expression as seen in regular expressions, where only ^ is valid.
${#parameter} expands to the length of parameter. (sh/Bash)
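Put together, the two expansions explained above could be wrapped in a small function. This is only a sketch; path_depth is a hypothetical name, and it simply counts slashes, so a trailing slash would add one to the result.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the slashes in a path (defaults to $PWD).
path_depth() {
  local path=${1:-$PWD}
  local slashes=${path//[!\/]}   # delete every character that is not a slash
  echo "${#slashes}"             # length of what is left = number of slashes
}

path_depth /home/user/test/test1/test2/test3   # 6
```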
A simple approach in fish:
count (string split --no-empty / $PWD)
You could count the number of slashes in the current path:
pwd | awk -F"/" '{print NF-1}'
You can do this using a pipeline. Pipe the string into grep with the -o option, which prints each "/" on its own line, then pipe that into wc -l to count the number of lines printed.
echo "$path_str" | grep -o '/' - | wc -l
Assuming you don't have trailing "/", you can just count the "/".
So you would
Remove everything that is not a "/"
Count the length of the resulting string
In fish, this would be done with something like
string replace --regex --all '[^/]' '' -- $PWD | string length
The regular expression - [^/] here matches every single character that is not a "/". With "--all", this will be done as often as possible, and replace it with '', i.e. nothing.
The -- is the option separator, so that nothing in the argument is interpreted as an option (otherwise you'd have issues if an argument started with a "-a").
$PWD is the current directory.
string length simply outputs the length of its input.
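The same two steps (remove everything that is not a slash, then measure what is left) can also be sketched in plain POSIX shell with tr and wc. The helper name count_slashes is an assumption for the example.

```shell
# Hypothetical helper, POSIX sh compatible: count the slashes in a path.
count_slashes() {
  # tr -cd '/' deletes everything except slashes; wc -c counts what remains.
  # The arithmetic expansion strips the padding some wc implementations add.
  echo $(( $(printf '%s' "$1" | tr -cd '/' | wc -c) ))
}

count_slashes /home/user/test/test1/test2/test3   # 6
```

Unlike the pure parameter expansion version, this spawns two external processes, which only matters if it runs in a tight loop.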
Using perl :
echo '/home/user/test/test1/test2/test3' |
perl -lne '@_ = split /\//; print scalar(@_) - 1'
Output
6
You could use find just like that :
find / -printf "%d %p\n" 2>/dev/null | grep "$PWD$" | awk '{print $1}'
Maybe not the most efficient, but handles slashes well.

Bash: How to extract numbers preceded by _ and followed by .

I have the following format for filenames: filename_1234.svg
How can I retrieve the numbers preceded by an underscore and followed by a dot? There can be between one and four digits before the .svg
I have tried:
width=${fileName//[^0-9]/}
but if the fileName contains a number as well, it will return all numbers in the filename, e.g.
file6name_1234.svg
I found solutions for two underscores (and splitting it into an array), but I am looking for a way to check for the underscore as well as the dot.
You can use simple parameter expansion with substring removal to simply trim from the right up to, and including, the '.', then trim from the left up to, and including, the '_', leaving the number you desire, e.g.
$ width=filename_1234.svg; val="${width%.*}"; val="${val##*_}"; echo $val
1234
note: # trims from left to first-occurrence while ## trims to last-occurrence. % and %% work the same way from the right.
Explained:
width=filename_1234.svg - width holds your filename
val="${width%.*}" - val holds filename_1234
val="${val##*_}" - finally val holds 1234
Of course, there is no need to use a temporary value like val if your intent is that width should hold the width. I just used a temp to protect against changing the original contents of width. If you want the resulting number in width, just replace val with width everywhere above and operate directly on width.
note 2: using shell capabilities like parameter expansion avoids creating a subshell and spawning a separate process, which is what happens when you use a utility like sed, grep or awk (or anything that isn't part of the shell, for that matter).
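The two trims above return whatever sits between the last underscore and the last dot, digits or not. If you also want to reject non-numeric results, a case pattern can do the check without leaving the shell. A rough sketch, with trailing_number as a made-up helper name:

```shell
# Hypothetical helper, POSIX sh compatible: print the number between the last
# "_" and the last "." of a file name, or fail if that part is not all digits.
# (Plain sh has no local variables, so name and num are globals here.)
trailing_number() {
  name=${1%.*}       # drop the extension:       file6name_1234
  num=${name##*_}    # keep text after last "_": 1234
  case $num in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
  esac
  printf '%s\n' "$num"
}

trailing_number file6name_1234.svg   # 1234
```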
Try the following code :
filename="filename_6_1234.svg"
if [[ "$filename" =~ ^(.*)_([^.]*)\..*$ ]];
then
echo "${BASH_REMATCH[0]}" #will display 'filename_6_1234.svg'
echo "${BASH_REMATCH[1]}" #will display 'filename_6'
echo "${BASH_REMATCH[2]}" #will display '1234'
fi
Explanation :
=~ : bash operator for regex comparison
^(.*)_([^.]*)\..*$ : we look for any characters, followed by an underscore, followed by any characters other than a dot, followed by a dot and an extension. We create 2 capture groups, one for what comes before the last underscore, one for what comes after it
BASH_REMATCH : array containing the captured groups
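Since the question says there are one to four digits before .svg, the regex could be tightened to accept only that. A sketch under that assumption (svg_width is a hypothetical name):

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the 1-4 digits between the last "_" and ".svg".
svg_width() {
  local re='_([0-9]{1,4})\.svg$'
  [[ $1 =~ $re ]] || return 1
  printf '%s\n' "${BASH_REMATCH[1]}"
}

svg_width file6name_1234.svg     # 1234
svg_width filename_6_1234.svg    # 1234
```

The function returns a non-zero status when the name does not fit, so it can be used directly in an if.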
Some more ways:
[akshay@localhost tmp]$ filename=file1b2aname_1234.svg
[akshay@localhost tmp]$ after=${filename##*_}
[akshay@localhost tmp]$ echo ${after//[^0-9]}
1234
Using awk
[akshay@localhost tmp]$ awk -F'[_.]' '{print $2}' <<< "$filename"
1234
I would use
sed 's!_! !g' | awk '{print "_" $NF}'
to get from filename_1234.svg to _1234.svg then
sed 's!svg!!g'
to get rid of the extension.
If you set IFS, you can use Bash's built-in read.
This splits the filename by underscores and dots and stores the result in the array a.
IFS='_.' read -a a <<<'file1b2aname_1234.svg'
And this takes the second last element from the array.
echo ${a[-2]}
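Negative array subscripts are not available in older Bash releases. If that is a concern, the index can be computed from the array length instead. A sketch, with second_last_field as an invented helper name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: split on "_" and "." and print the second-to-last field.
second_last_field() {
  local -a parts
  # The IFS assignment applies to this read command only.
  IFS='_.' read -r -a parts <<< "$1"
  local n=${#parts[@]}
  (( n >= 2 )) || return 1
  printf '%s\n' "${parts[n-2]}"
}

second_last_field file1b2aname_1234.svg   # 1234
```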
There's a solution using cut:
name="file6name_1234.svg"
num=$(echo "$name" | cut -d '_' -f 2 | cut -d '.' -f 1)
echo "$num"
-d is for specifying a delimiter.
-f refers to the desired field.
I don't know anything about performance but it's simple to understand and simple to maintain.

Linux Bash. Delete line if field exactly matches

I have something like this in a file named file.txt
AA.201610.pancake.Paul
AA.201610.hello.Robert
A.201610.hello.Mark
Now, I only have the first three fields, in three variables:
field1="A"
field2="201610"
field3='hello'
I'd like to remove a line if its first three fields match exactly; in the case described above, I want only the third line to be removed from file.txt. Is there a way to do that? And is there a way to do it in the same file?
I tried with:
sed -i /$field1"."$field2"."$field3"."/Id file.txt
but of course this removes both the second and the third line
I suggest using awk for this as sed can only do regex search and that requires escaping all special meta-chars and anchors, word boundaries etc to avoid false matches.
Suggested awk with non-regex matching:
awk -F '[.]' -v f1="$field1" -v f2="$field2" -v f3="$field3" '
!($1==f1 && $2==f2 && $3==f3)' file
AA.201610.pancake.Paul
AA.201610.hello.Robert
Use ^ to anchor the pattern at the beginning of the line. Also note that . in a regex means "any character" and not a literal period. You have to escape it: either \. (be careful with shell escaping and the difference between single and double quotes) or [.]
Sed cannot do string matches, only regexp matches which becomes horrendously complicated to work around when you simply want to match a literal string (see Is it possible to escape regex metacharacters reliably with sed). Just use awk:
$ awk -v str="${field1}.${field2}.${field3}." 'index($0,str)!=1' file
AA.201610.pancake.Paul
AA.201610.hello.Robert
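The question also asks about changing the file itself. awk has no portable in-place option, so one common workaround is to write to a temporary file and copy the result back. A minimal sketch, assuming mktemp is available; delete_prefixed_lines is a hypothetical helper name:

```shell
# Hypothetical helper: delete, in place, every line of a file that starts with
# the given literal prefix.
delete_prefixed_lines() {
  file=$1
  prefix=$2
  tmp=$(mktemp) || return 1
  # index() compares literally, so the dots in the prefix are not wildcards.
  # Copying back with cat keeps the original file's permissions.
  awk -v str="$prefix" 'index($0, str) != 1' "$file" > "$tmp" &&
    cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

It could then be called as delete_prefixed_lines file.txt "${field1}.${field2}.${field3}." Note that awk -v still interprets backslash escapes in the value, so a prefix containing backslashes would need extra care.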
The question was about bash so in bash:
#!/usr/bin/env bash
field1="A"
field2="201610"
field3='hello'
while IFS= read -r i
do
case "$i" in
"${field1}.${field2}.${field3}."*) ;;
*) echo -E "$i"
esac
done < file.txt

how to replace a special characters by character using shell

I have a string variable x=tmp/variable/custom-sqr-sample/test/example
in the script. What I want to do is replace every "-" with "/",
after which I should get the following string:
x=tmp/variable/custom/sqr/sample/test/example
Can anyone help me?
I tried the following syntax, but it did not work:
exa=tmp/variable/custom-sqr-sample/test/example
exa=$(echo $exa|sed 's/-///g')
sed supports almost any delimiter, which comes in handy when you need to match a /. The most common alternatives are |, # and @; pick one that does not occur in the string you need to work on.
$ echo $x
tmp/variable/custom-sqr-sample/test/example
$ sed 's#-#/#g' <<< $x
tmp/variable/custom/sqr/sample/test/example
In the command you tried above, all you need is to escape the slash, i.e.
echo $exa | sed 's/-/\//g'
but choosing a different delimiter is nicer.
The tr tool may be a better choice than sed in this case:
x=tmp/variable/custom-sqr-sample/test/example
echo "$x" | tr -- - /
(The -- isn't strictly necessary, but keeps tr (and humans) from mistaking - for an option.)
In bash, you can use parameter substitution:
$ exa=tmp/variable/custom-sqr-sample/test/example
$ exa=${exa//-/\/}
$ echo $exa
tmp/variable/custom/sqr/sample/test/example

Complex shell wildcard

I want to use echo to display directories (their names, not their content) whose names have at least 2 characters but don't begin with "an".
For example, if I had the following in the directory:
a as an23 an23 blue
I would only get
as blue back
I tried echo ^an* but that returns the directory with 1 character too.
Is there any way I can do this in the form of echo globalpattern?
You can use the shell's extended globbing feature. In bash:
bash$ shopt -s extglob
bash$ echo !(@(?|an*))
The !() construct inverts its internal expression; see the Pattern Matching section of the Bash manual for more.
In zsh:
zsh$ setopt extendedglob
zsh$ print *~(?|an*)
In this case the ~ negates the pattern before the tilde. See the manual for more.
Since you want at least two characters in the names, you can use printf '%s\n' ??* to echo each such name on a separate line. You can then eliminate those names that start with an with grep -v '^an', leading to:
printf '%s\n' ??* | grep -v '^an'
The quotes aren't strictly necessary in the grep command with modern shells. Once upon a quarter of a century or so ago, the Bourne shell had ^ as a synonym for | so I still use quotes around carets.
If you absolutely must use echo instead of printf, then you'll have to map white space to newlines (assuming you don't have any names that contain white space).
I'm trying with just the echo command, no grep either?
What about:
echo [!a]?* a[!n]*
The first term lists all the two-plus character names not beginning with a; the second lists all the two-plus character names where the first is a and the second is not n.
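To see the two-term pattern in action without touching a real directory, it can be tried in a scratch directory. This is only a sketch: list_names is a made-up helper, and the directory names are a sample set in the spirit of the question.

```shell
# Hypothetical helper: run the two-term glob inside the given directory.
# The body is a subshell, so the cd does not affect the caller.
list_names() (
  cd "$1" || exit 1
  echo [!a]?* a[!n]*
)

demo_dir=$(mktemp -d)
mkdir "$demo_dir/a" "$demo_dir/as" "$demo_dir/an" "$demo_dir/an23" "$demo_dir/blue"
list_names "$demo_dir"   # blue as
rm -r "$demo_dir"
```

One caveat: if one of the two terms matches nothing, the shell prints that pattern literally unless something like Bash's nullglob option is enabled.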
This should do it, but you'd likely be better off with ls or even find:
echo * | tr ' ' '\012' | egrep '..' | egrep -v '^an'
Shell globbing resembles regular expressions, but it is a separate pattern language and not as powerful as egrep's regexes.
