Find the depth of the current path - linux

How can I write a shell script to find the depth of the current path?
Assuming I am in:
/home/user/test/test1/test2/test3
It should return 6.

With shell parameter expansions, no external commands:
$ var=${PWD//[!\/]}
$ echo ${#var}
6
The first expansion removes all characters that are not /; the second one prints the length of var.
Details, with a note on whether each feature is supported by POSIX sh or only by Bash:
$PWD contains the path to the current working directory. (sh/Bash)
The ${parameter/pattern/string} expansion replaces the first occurrence of pattern in the expansion of parameter with string. (Bash)
If the first slash is doubled (as in our case), all occurrences are replaced.
If string is empty, the slash after pattern is optional (as in our case).
The pattern [!\/] is a bracket expression and stands for "any character other than slash". (sh/Bash)
The slash has to be escaped, \/, or it is interpreted as ending the pattern.
! as the first character in a bracket expression negates the expression: any character other than the ones in the expression matches the pattern. POSIX sh requires support for ! and says the behaviour for using ^ is undefined; Bash supports both ! and ^. Notice that this is not a bracket expression as seen in regular expressions, where only ^ is valid.
${#parameter} expands to the length of parameter. (sh/Bash)
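As a minimal sketch, the two expansions can be wrapped in a function. The name path_depth is hypothetical, and stripping a single trailing slash first is an assumption that the answer above does not make. Like the original, it needs Bash because of the ${var//pattern} expansion.

```shell
#!/usr/bin/env bash
# path_depth: hypothetical helper that counts the "/" characters in a path.
path_depth() {
    local path=${1:-$PWD}          # default to the current directory
    # Assumption: ignore one trailing slash, but leave "/" itself untouched.
    [[ $path != / ]] && path=${path%/}
    local slashes=${path//[!\/]}   # delete every character that is not "/"
    echo "${#slashes}"             # the length is the number of slashes
}

path_depth /home/user/test/test1/test2/test3   # prints 6
```

Called without an argument, the function falls back to $PWD, which matches the one-liner above.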

A simple approach in fish:
count (string split / $PWD)

You could count the number of slashes in the current path:
pwd | awk -F"/" '{print NF-1}'

You can do this with a pipeline. Pipe the string into grep with the -o option, which prints each "/" on its own line, then pipe that into wc -l to count the lines printed.
echo "$path_str" | grep -o '/' - | wc -l

Assuming you don't have trailing "/", you can just count the "/".
So you would
Remove everything that is not a "/"
Count the length of the resulting string
In fish, this would be done with something like
string replace --regex --all '[^/]' '' -- $PWD | string length
The regular expression - [^/] here matches every single character that is not a "/". With "--all", this will be done as often as possible, and replace it with '', i.e. nothing.
The -- is the option separator, so that nothing in the argument is interpreted as an option (otherwise you'd have issues if an argument started with a "-a").
$PWD is the current directory.
string length simply outputs the length of its input.

Using perl:
echo '/home/user/test/test1/test2/test3' |
perl -lne '@_ = split /\//; print scalar(@_) - 1'
Output
6

You could use find just like that :
find / -printf "%d %p\n" 2>/dev/null | grep "$PWD$" | awk '{print $1}'
Maybe not the most efficient, but handles slashes well.

Related

bash extract version string & convert to version dot

I want to extract version string (1_4_5) from my-app-1_4_5.img and then convert into dot version (1.4.5) without filename. Version string will have three (1_4_5) or four (1_4_5_7) segments.
Have this one liner working ls my-app-1_4_5.img | cut -d'-' -f 3 | cut -d'.' -f 1 | tr _ .
Would like to know if there is any better way rather than piping output from cut.
Here's an attempt with parameter expansion. I'm assuming you have a wildcard pattern you want to loop over.
for file in *-*.img; do
base=${file%.img}
ver=${base##*-}
echo "${ver//_/.}"
done
The construct ${var%pattern} returns the variable var with any suffix matching pattern trimmed off. Similarly, ${var#pattern} trims any prefix which matches pattern. In both cases, doubling the operator switches to trimming the longest possible match instead of the shortest. (These are POSIX-compatible parameter expansions, i.e. not strictly Bash only.) The construct ${var/pattern/replacement} replaces the first match in var on pattern with replacement; doubling the first slash causes every match to be replaced. (This is Bash only.)
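The three constructs can be combined into one small function, sketched below. The name dotted_version is hypothetical, and it assumes file names of the form name-version.ext, with the version after the last dash.

```shell
#!/usr/bin/env bash
# dotted_version: hypothetical helper; assumes names like <name>-<version>.<ext>.
dotted_version() {
    local base=${1%.*}      # trim the shortest suffix matching ".*" (the extension)
    local ver=${base##*-}   # trim the longest prefix matching "*-"
    echo "${ver//_/.}"      # replace every "_" with "." (Bash only)
}

dotted_version my-app-1_4_5.img     # prints 1.4.5
dotted_version my-app-1_4_5_7.img   # prints 1.4.5.7
```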
You can do it with sed:
sed -E "s/.*([0-9]+)_([0-9]+)_([0-9]+).*/\1.\2.\3/" <<< my-app-1_4_5.img
Assuming the version number will always be between the last dash and the file extension, you can use something like this in pure Bash:
name="file-name-x-1_2_3_4_5.ext"
version=${name##*-}
version=${version%%.ext}
version=${version//_/.}
echo $version
The code above will result in:
1.2.3.4.5
For a complete explanation of the parameter expansions used above, please take a look at Bash Reference Manual: 3.5.3 Shell Parameter Expansion.
Remove everything but 0 to 9, _ and newline and then replace all _ with .:
echo "my-app-1_4_5.img" | tr -cd '0-9_\n' | tr '_' '.'
Output:
1.4.5
With bash and a regex:
echo "my-app-1_4_5.img" | while IFS= read -r line; do [[ "$line" =~ [^0-9]([0-9_]+)[^0-9] ]] && echo "${BASH_REMATCH[1]//_/.}"; done
Output:
1.4.5
A slightly shorter variant
$ name=my-app-1_4_5.img
$ vers=${name//[!0-9_]}
$ echo ${vers//_/.}
1.4.5

Use sed/grep to get string from tail to a char in middle

I have some strings (variables) i need to edit in the bash for further analysis
They consist of things like
str="~/folder/item"
How can I use sed or grep to grab just "item" (meaning from tail to char '/')?
Refer to Shell Parameter Expansion. You can say:
$ str="~/folder/item"
$ echo ${str##*/}
item
Quoting from the manual:
${parameter##word}
The word is expanded to produce a pattern just as in filename
expansion (see Filename Expansion). If the pattern matches the
beginning of the expanded value of parameter, then the result of the
expansion is the expanded value of parameter with the shortest
matching pattern (the ‘#’ case) or the longest matching pattern (the
‘##’ case) deleted. If parameter is ‘@’ or ‘*’, the pattern removal
operation is applied to each positional parameter in turn, and the
expansion is the resultant list. If parameter is an array variable
subscripted with ‘@’ or ‘*’, the pattern removal operation is applied
to each member of the array in turn, and the expansion is the
resultant list.
Using grep:
$ grep -Po '.*/\K.*' <<< $str
item
Assuming you are dealing with a text file containing lines like the one shown, then:
sed 's%.*/\([^/]*\)"%\1%' <<< 'str="~/folder/item"'
This yields:
item
If you are dealing with a variable str that contains a string ~/folder/item, then you can use:
basename "$str"
or:
echo "${str##*/}"
basename "${str}" gives the last part of the string after the final /, including any file extensions you may have. If you want to do the opposite and grab the directory, use dirname "${str}"
echo $str|awk -F"/" '{print $NF}'
or
echo "$str" | perl -pe 's/.*\///g'
basename is also suitable for this, for example:
str="~/folder/item"
name=$(basename "$str")
echo "$name"
The output would be "item".

Extract part of a string using bash/cut/split

I have a string like this:
/var/cpanel/users/joebloggs:DNS9=domain.example
I need to extract the username (joebloggs) from this string and store it in a variable.
The format of the string will always be the same with exception of joebloggs and domain.example so I am thinking the string can be split twice using cut?
The first split would split by : and we would store the first part in a variable to pass to the second split function.
The second split would split by / and store the last word (joebloggs) into a variable
I know how to do this in PHP using arrays and splits but I am a bit lost in bash.
To extract joebloggs from this string in bash using parameter expansion without any extra processes...
MYVAR="/var/cpanel/users/joebloggs:DNS9=domain.example"
NAME=${MYVAR%:*} # retain the part before the colon
NAME=${NAME##*/} # retain the part after the last slash
echo $NAME
Doesn't depend on joebloggs being at a particular depth in the path.
Summary
An overview of a few parameter expansion modes, for reference...
${MYVAR#pattern} # delete shortest match of pattern from the beginning
${MYVAR##pattern} # delete longest match of pattern from the beginning
${MYVAR%pattern} # delete shortest match of pattern from the end
${MYVAR%%pattern} # delete longest match of pattern from the end
So # means match from the beginning (think of a comment line) and % means from the end. One instance means shortest and two instances means longest.
You can get substrings based on position using numbers:
${MYVAR:3} # Remove the first three chars (leaving 4..end)
${MYVAR::3} # Return the first three characters
${MYVAR:3:5} # The next five characters after removing the first 3 (chars 4-8)
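A short sketch of the three positional forms on a sample string. The string and the variable name s are made up for illustration. Substring expansion is a Bash feature and is not part of POSIX sh.

```shell
#!/usr/bin/env bash
s="abcdefghij"     # hypothetical sample string
echo "${s:3}"      # from offset 3 to the end: defghij
echo "${s::3}"     # the first three characters: abc
echo "${s:3:5}"    # five characters starting at offset 3: defgh
```

Offsets count from zero, so ${s:3} skips exactly three characters.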
You can also replace particular strings or patterns using:
${MYVAR/search/replace}
The pattern is in the same format as file-name matching, so * (any characters) is common, often followed by a particular symbol like / or .
Examples:
Given a variable like
MYVAR="users/joebloggs/domain.example"
Remove the path leaving file name (all characters up to a slash):
echo ${MYVAR##*/}
domain.example
Remove the file name, leaving the path (delete shortest match after last /):
echo ${MYVAR%/*}
users/joebloggs
Get just the file extension (remove all before last period):
echo ${MYVAR##*.}
example
NOTE: To do two operations, you can't combine them, but have to assign to an intermediate variable. So to get the file name without path or extension:
NAME=${MYVAR##*/} # remove part before last slash
echo ${NAME%.*} # from the new var remove the part after the last period
domain
Define a function like this:
getUserName() {
echo $1 | cut -d : -f 1 | xargs basename
}
And pass the string as a parameter:
userName=$(getUserName "/var/cpanel/users/joebloggs:DNS9=domain.example")
echo $userName
What about sed? That will work in a single command:
sed 's#.*/\([^:]*\).*#\1#' <<<$string
The # are being used for regex dividers instead of / since the string has / in it.
.*/ grabs the string up to the last forward slash.
\( .. \) marks a capture group. This is \([^:]*\).
The [^:] says any character except a colon, and the * means zero or more.
.* means the rest of the line.
\1 means substitute what was found in the first (and only) capture group. This is the name.
Here's the breakdown matching the string with the regular expression:
       /var/cpanel/users/  joebloggs  :DNS9=domain.example   joebloggs
sed 's#.*/                 \([^:]*\)  .*                    #\1       #'
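For comparison, the same capture-group idea can be sketched with Bash's built-in [[ =~ ]] operator instead of sed. The function name extract_user is hypothetical. As an assumption beyond the sed expression, the capture group here also excludes "/", so that the match is unambiguous.

```shell
#!/usr/bin/env bash
# extract_user: hypothetical helper using Bash regex matching instead of sed.
extract_user() {
    # ".*/" runs up to the last slash; the group captures up to the colon.
    # Assumption: "/" is excluded from the group, unlike the sed version.
    local re='.*/([^/:]*)'
    if [[ $1 =~ $re ]]; then
        echo "${BASH_REMATCH[1]}"   # text matched by the first capture group
    else
        return 1                    # no slash in the input
    fi
}

extract_user "/var/cpanel/users/joebloggs:DNS9=domain.example"   # prints joebloggs
```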
Using a single Awk:
... | awk -F '[/:]' '{print $5}'
That is, using as field separator either / or :, the username is always in field 5.
To store it in a variable:
username=$(... | awk -F '[/:]' '{print $5}')
A more flexible implementation with sed that doesn't require username to be field 5:
... | sed -e 's/:.*//' -e 's?.*/??'
That is, delete everything from : onwards, and then delete everything up to the last /. The expressions are quoted so that the shell does not try to expand the * and ? characters. sed is probably also faster than awk, so this alternative is likely the better choice.
Using a single sed
echo "/var/cpanel/users/joebloggs:DNS9=domain.example" | sed 's/.*\/\(.*\):.*/\1/'
I like to chain together awk commands using different delimiters set with the -F argument. First, split the string on /users/ and then on :
txt="/var/cpanel/users/joebloggs:DNS9=domain.com"
echo $txt | awk -F"/users/" '{print$2}' | awk -F: '{print $1}'
$2 gives the text after the delim, $1 the text before it.
I know I'm a little late to the party and there's already good answers, but here's my method of doing something like this.
DIR="/var/cpanel/users/joebloggs:DNS9=domain.example"
echo ${DIR} | rev | cut -d'/' -f 1 | rev | cut -d':' -f1

Extract file basename without path and extension in bash [duplicate]

Given file names like these:
/the/path/foo.txt
bar.txt
I hope to get:
foo
bar
Why doesn't this work?
#!/bin/bash
fullfile=$1
fname=$(basename $fullfile)
fbname=${fname%.*}
echo $fbname
What's the right way to do it?
You don't have to call the external basename command. Instead, you could use the following commands:
$ s=/the/path/foo.txt
$ echo "${s##*/}"
foo.txt
$ s=${s##*/}
$ echo "${s%.txt}"
foo
$ echo "${s%.*}"
foo
Note that this solution should work in all recent (post 2004) POSIX compliant shells, (e.g. bash, dash, ksh, etc.).
Source: Shell Command Language 2.6.2 Parameter Expansion
More on bash String Manipulations: http://tldp.org/LDP/LG/issue18/bash.html
The basename command has two different invocations; in one, you specify just the path, in which case it gives you the last component, while in the other you also give a suffix that it will remove. So, you can simplify your example code by using the second invocation of basename. Also, be careful to correctly quote things:
fbname=$(basename "$1" .txt)
echo "$fbname"
A combination of basename and cut works fine, even in case of double ending like .tar.gz:
fbname=$(basename "$fullfile" | cut -d. -f1)
It would be interesting to know whether this solution needs less processing power than Bash parameter expansion.
Here are oneliners:
$(basename "${s%.*}")
$(basename "${s}" ".${s##*.}")
I needed this, the same as asked by bongbang and w4etwetewtwet.
Pure bash, no basename, no variable juggling. Set a string and echo:
p=/the/path/foo.txt
echo "${p//+(*\/|.*)}"
Output:
foo
Note: the bash extglob option must be "on", (Ubuntu sets extglob "on" by default), if it's not, do:
shopt -s extglob
Walking through the ${p//+(*\/|.*)}:
1. ${p -- start with $p.
2. // substitute every instance of the pattern that follows.
3. +( match one or more of the pattern list in parentheses (i.e. up to item 7 below).
4. 1st pattern: *\/ matches anything before a literal "/" char.
5. pattern separator | which in this instance acts like a logical OR.
6. 2nd pattern: .* matches anything after a literal "." -- that is, in bash the "." is just a period char, and not a regex dot.
7. ) end pattern list.
8. } end parameter expansion. With a string substitution, there's usually another / there, followed by a replacement string. But since there's no / there, the matched patterns are substituted with nothing; this deletes the matches.
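The expansion walked through above can be wrapped in a function, sketched here. The name strip_path_and_ext is hypothetical. The shopt line is included because the +( ... ) operator is only recognised with extglob enabled.

```shell
#!/usr/bin/env bash
shopt -s extglob   # enable extended patterns such as +(...)

# strip_path_and_ext: hypothetical wrapper around the extglob expansion.
strip_path_and_ext() {
    local p=$1
    # Delete every run matching "<anything>/" or ".<anything>".
    echo "${p//+(*\/|.*)}"
}

strip_path_and_ext /the/path/foo.txt   # prints foo
strip_path_and_ext bar.txt             # prints bar
```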
Relevant man bash background:
pattern substitution:
${parameter/pattern/string}
Pattern substitution. The pattern is expanded to produce a pattern
just as in pathname expansion. Parameter is expanded and the longest
match of pattern against its value is replaced with string. If pattern
begins with /, all matches of pattern are replaced with string.
Normally only the first match is replaced. If pattern begins with #,
it must match at the beginning of the expanded value of parameter.
If pattern begins with %, it must match at the end of the expanded
value of parameter. If string is null, matches of pattern are deleted
and the / following pattern may be omitted. If parameter is @ or *,
the substitution operation is applied to each positional parameter in
turn, and the expansion is the resultant list. If parameter is an
array variable subscripted with @ or *, the substitution operation is
applied to each member of the array in turn, and the expansion is the
resultant list.
extended pattern matching:
If the extglob shell option is enabled using the shopt builtin, several
extended pattern matching operators are recognized. In the following
description, a pattern-list is a list of one or more patterns separated
by a |. Composite patterns may be formed using one or more of the
following sub-patterns:
?(pattern-list)
Matches zero or one occurrence of the given patterns
*(pattern-list)
Matches zero or more occurrences of the given patterns
+(pattern-list)
Matches one or more occurrences of the given patterns
@(pattern-list)
Matches one of the given patterns
!(pattern-list)
Matches anything except one of the given patterns
Here is another (more complex) way of getting either the filename or extension, first use the rev command to invert the file path, cut from the first . and then invert the file path again, like this:
filename=`rev <<< "$1" | cut -d"." -f2- | rev`
fileext=`rev <<< "$1" | cut -d"." -f1 | rev`
If you want to play nice with Windows file paths (under Cygwin) you can also try this:
fname=${fullfile##*[/|\\]}
This will account for backslash separators when using Bash on Windows.
Just an alternative that I came up with to extract an extension, using the posts in this thread with my own small knowledge base that was more familiar to me.
ext="$(rev <<< "$(cut -f "1" -d "." <<< "$(rev <<< "file.docx")")")"
Note: Please advise on my use of quotes; it worked for me but I might be missing something on their proper use (I probably use too many).
Use the basename command. Its manpage is here: http://unixhelp.ed.ac.uk/CGI/man-cgi?basename

Prepend to regex match

I got a variable in a bash script that I need to replace. The only constant in the line is that it will be ending in "_(*x)xxxp.mov". Where x's are numbers and can be of either 3 or 4 of length. For example, I know how to replace the value but only if it is a constant:
echo 'whiteout-tlr1_1080p.mov' | sed 's/_[0-9]*[0-9][0-9][0-9]p.mov/_h1080p.mov/g'
How can I carry over the regex match to replacement line?
Edit:
OK, I just learned that grep can print only the match. Would it be better to do something like this?
urltrail=$(echo $@ | grep -o [0-9]*[0-9][0-9][0-9]p.mov)
newurl=$(sed 's/$urltrail/h$urltrail/g')
Hmm, tried the above but am getting a hang.
Back Reference
sed 's/_\([0-9]*[0-9][0-9][0-9]\)p.mov/_h\1p.mov/g'
The back-reference \n, where n is a single digit, matches the substring previously matched by the nth parenthesized subexpression of the regular expression.
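A minimal sketch of the back-reference approach as a reusable filter. The function name add_h_prefix and the second sample file name are hypothetical. Escaping the dot and anchoring the match with $ are small tightenings that the command above does not include.

```shell
#!/usr/bin/env bash
# add_h_prefix: hypothetical filter; reads file names on stdin.
add_h_prefix() {
    # \( ... \) captures the resolution digits (three or more);
    # \1 reinserts them after the added "h".
    sed 's/_\([0-9]*[0-9][0-9][0-9]\)p\.mov$/_h\1p.mov/'
}

echo 'whiteout-tlr1_1080p.mov' | add_h_prefix   # prints whiteout-tlr1_h1080p.mov
echo 'clip_720p.mov' | add_h_prefix             # prints clip_h720p.mov
```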
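You're not piping the old path into sed, so sed hangs waiting for input. The sed script also needs double quotes, because the shell does not expand $urltrail inside single quotes:
newurl=$(echo "$@" | sed "s/$urltrail/h$urltrail/g")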
