Remove path prefix of space separated paths - linux

Given a list of paths separated by a single space:
/home/me/src/test /home/me/src/vendor/a /home/me/src/vendor/b
I want to remove the prefix /home/me/src/ so that the result is:
test vendor/a vendor/b
For a single path I would do: ${PATH#/home/me/src/} but how do I apply it to this series?

You can use // to replace all occurrences of a substring. Replacing it with an empty string removes them.
$ path="/home/me/src/test /home/me/src/vendor/a /home/me/src/vendor/b"
$ echo ${path//\/home\/me\/src\/}
test vendor/a vendor/b
Reference: ${parameter/pattern/string} in Bash reference manual
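Note that // removes the substring wherever it occurs, not only at the start of each path. If that matters, the single-path expansion from the question can be applied to each path in a loop. A minimal sketch, assuming the paths contain no spaces or glob characters (the names paths, prefix and result are made up for illustration):

```bash
#!/usr/bin/env bash
# Illustrative names; the values are taken from the question.
paths="/home/me/src/test /home/me/src/vendor/a /home/me/src/vendor/b"
prefix="/home/me/src/"

result=""
for p in $paths; do   # unquoted on purpose: split the list on spaces
    # ${p#"$prefix"} strips the prefix only from the start of this path
    result="${result:+$result }${p#"$prefix"}"
done
echo "$result"
```

The ${result:+$result } part only adds a separating space once result is non-empty, so the output has no leading space.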

Using shell parameter expansion is useful, as nu11p01n73R's answer reveals.
For clarity, I would use sed with the syntax sed 's#pattern#replacement#g':
$ str="/home/me/src/test /home/me/src/vendor/a /home/me/src/vendor/b"
$ sed 's#/home/me/src/##g' <<< "$str"
test vendor/a vendor/b
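If the prefix should only be removed where a path begins, the pattern can be anchored to the start of the line or to a preceding space. This is a sketch of that variation, not part of the answer above; it assumes a sed that supports -E (GNU and BSD sed both do), and the variable name stripped is illustrative:

```bash
#!/usr/bin/env bash
str="/home/me/src/test /home/me/src/vendor/a /home/me/src/vendor/b"
# (^| ) matches the start of the line or a space; \1 puts the space back.
stripped=$(printf '%s\n' "$str" | sed -E 's#(^| )/home/me/src/#\1#g')
echo "$stripped"
```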

As always, a grep solution from my side:
echo 'your string' | grep -Po '^/([^ /]*/)+\K.+'
Please note that the above regex does this for any string like /x/y/z/test ... But if you are only interested in replacing /home/me/src/, try the following:
echo 'your string' | grep -Po '^/home/me/src/\K.+' --color

Related

Cutting certain string of variable

I'd like to cut off some special strings of a variable.
The variable contains the following, including a lot of blank space before <div... and a class attribute:
<div data-href="/www.somewebspace.com" class="class1 class2">
I would like to extract the contents of the data-href attribute i.e have this output /www.somewebspace.com
I tried out the following code, the output starts with the contents of the data-href attribute and the class attribute.
echo $Test | grep -oP '(?<=<div data-href=").*(?=")'
How can I get rid of the class attribute?
Kind regards and grateful for every reply,
X3nion
P.S. Another question arose. I've got these strings that I'd like to extract from a text file:
<div class="aditem-addon">
Today, 23:23</div>
What would be the correct command to extract only "Today, 23:23", without any whitespace before or after the term?
Maybe I would have to delete the blank spaces first?
Your regex is correct; you only need to adjust the greediness of the * quantifier:
* is a greedy quantifier : match as much as possible whilst getting a match
*? is a reluctant quantifier : match the minimum characters to get a match
# Correct
Test='<div data-href="/www.somewebspace.com" class="fdgks"></div>'
echo $Test | grep -oP '(?<=<div data-href=").*?(?=")'
#> /www.somewebspace.com
# the desired output
# WRONG
echo $Test | grep -oP '(?<=<div data-href=").*(?=")'
#> /www.somewebspace.com" class="fdgks
# didn't stop until it matched the last quote `"`
echo $Test$Test | grep -oP '(?<=<div data-href=").*(?=")'
#> /www.somewebspace.com" class="fdgks"></div><div data-href="/www.somewebspace.com" class="fdgks
# same as the last one
for a more detailed explanation about the difference between greedy, reluctant and possessive quantifiers (see)
EDIT
echo "$Test$Test" | grep -Poz '(?<=<div class="aditem-addon">\n ).*?(?=<\/div>)'
#> Today, 23:23
#> Today, 23:23
\n followed by a space matches the newline and the leading space.
If the string you're looking for contains the newline character \n, you'll need to add the z option to grep, i.e. the call will be grep -ozP.
Unless the input is very simple, consider using xmllint or another HTML parsing tool. For very simple cases, you can use a shell solution:
#! /bin/sh
s=' <div data-href="/www.somewebspace.com" class="class1 class2"> '
s1=${s##*data-href=\"}
s1=${s1%%\"*}
echo "$s1"
Which will print
/www.somewebspace.com
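A portable alternative to the lazy quantifier is a negated character class: [^"]* cannot match past the closing quote, so the greediness problem does not arise and PCRE support (grep -P) is not needed. A minimal sketch with plain sed, assuming the attribute value contains no double quote and appears once per line (the variable name href is illustrative):

```bash
#!/usr/bin/env bash
s='      <div data-href="/www.somewebspace.com" class="class1 class2">'
# [^"]* stops at the first closing quote, so no lazy quantifier is needed.
href=$(printf '%s\n' "$s" | sed -n 's/.*data-href="\([^"]*\)".*/\1/p')
echo "$href"
```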

Can't input date variable in bash

I have a directory /user/reports under which many files are there, one of them is :
report.active_user.30092018.77325.csv
I need output as number after date i.e. 77325 from above file name.
I created below command to find a value from file name:
ls /user/reports | awk -F. '/report.active_user.30092018/ {print $(NF-1)}'
Now, I want current date to be passed in above command as variable and get result:
ls /user/reports | awk -F. '/report.active_user.$(date +'%d%m%Y')/ {print $(NF-1)}'
But not getting required output.
Tried bash script:
#!/usr/bin/env bash
_date=`date +%d%m%Y`
active=$(ls /user/reports | awk -F. '/report.active_user.${_date}/ {print $(NF-1)}')
echo $active
But still output is blank.
Please help with proper syntax.
As @cyrus said, you must use double quotes in your variable assignment, because single quotes produce a literal string in which variables are not expanded.
Bad use case
number=10
string='I m sentence with or without var $number'
echo $string
Correct use case
number=10
string_with_number="I m sentence with var $number"
echo $string_with_number
You can use single quotes, as long as they do not enclose the variable:
number=10
string_with_number='I m sentence with var '$number
echo $string_with_number
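Applied to the awk command from the question, one common way to avoid the quoting problem altogether is to pass the shell value in with awk's -v option and keep the awk program in single quotes. A minimal sketch under some assumptions: the file listing is simulated with printf, and the name format report.active_user.DATE.NUMBER.csv is taken from the question (the names today, d and number are illustrative):

```bash
#!/usr/bin/env bash
today=$(date +%d%m%Y)

# Simulated file names standing in for the directory listing.
number=$(printf '%s\n' \
        "report.active_user.${today}.77325.csv" \
        "report.other.${today}.1.csv" |
    # -v copies the shell value into the awk variable d
    awk -F. -v d="$today" '$2 == "active_user" && $3 == d { print $(NF-1) }')

echo "$number"
```

Comparing fields with == also avoids treating the dots in the file name as regex wildcards.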
Don't parse ls
You don't need awk for this: you can manage with the shell's capabilities
for file in report.active_user."$(date "+%d%m%Y")"*; do
    tmp=${file%.*}       # remove the extension
    number=${tmp##*.}    # remove the prefix up to and including the last dot
    echo "$number"
done
See https://www.gnu.org/software/bash/manual/bashref.html#Shell-Parameter-Expansion

Using SED to replace long string - but got unterminated substitute in regular expression

Hi, I am trying to replace the following string with a long one:
#x#
with string that I got from the command line:
read test
sed -i --backup 's/#x#/'${test}'/g' file.json README.md
But it only works for one word; it does not work if there is a space between words, even between quotes:
sed: 1: "s/#x#/string test string: unterminated substitute in regular expression
In case you run it on macOS and are struggling with "unterminated substitute in regular expression", there is an easier explanation for this:
macOS ships a slightly different version of sed than the one usually found on Linux. There, -i requires an argument. If you have none, just add "" after -i:
sed -i "" --backup 's/#x#/'${test}'/g' file.json README.md
Or, for example, if you just have to delete some line, this works on Linux but produces "invalid command code" on macOS:
sed -i 39d filenamehere.log
and this works on MacOS
sed -i "" 39d filenamehere.log
The problem originates from the way you are using the single quotes. Currently you are terminating your quoted input after the second single quote, so the unquoted ${test} is split on spaces and sed receives only part of the script. See the error message: it tells you that something is missing.
If you have a file with the following content:
foo #x# foo
Then you can replace the content, e.g. with the following command:
sed 's/#x#/bar foo bar/' foo.txt > foo2.txt
And get:
foo bar foo bar foo
If you need to pass in a variable the comment from Gordon Davisson shows you the right way.
By the way, if you want to use the inplace option, on my linux you would need to use the command like this:
sed -i.old "s/#x#/${test}/" foo.txt
But I think this might depend on your environment (Mac?).
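The quoting issue can be illustrated as follows. This is only a sketch, not a quote of the comment referred to above: the whole sed script is put in double quotes so that a value containing spaces stays in a single argument. It assumes the value contains no /, & or backslash, which sed would treat specially (the name replacement is illustrative):

```bash
#!/usr/bin/env bash
replacement='bar foo bar'   # hypothetical value containing spaces
# Double quotes keep the whole sed script as a single argument.
result=$(printf 'foo #x# foo\n' | sed "s/#x#/${replacement}/g")
echo "$result"
```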
sed doesn't understand strings where a string is a series of literal characters. It replaces a regexp (not a string) with a backreference-enabled "string" (also not a string) all within a set of delimiters (which ALSO require careful handling in both the regexp and the replacement). See Is it possible to escape regex metacharacters reliably with sed for more info.
To replace a string with another string the simplest approach is to just use a tool that understands strings such as awk:
$ cat file
before stuff
foo #x# bar
after stuff
$ cat tst.awk
BEGIN {
    old = ARGV[1]
    new = ARGV[2]
    ARGV[1] = ARGV[2] = ""
}
s = index($0,old) { $0 = substr($0,1,s-1) new substr($0,s+length(old)) }
{ print }
$ test='a/\t/&"b'
$ awk -f tst.awk '#x#' "$test" file
before stuff
foo a/\t/&"b bar
after stuff
The above will work no matter what characters test contains, even newlines:
$ test='contains a
newline'
$ awk -f tst.awk '#x#' "$test" file
before stuff
foo contains a
newline bar
after stuff

remove from the string "89dde7.rqsnhq34h.fmu8s1vn0i94hl.tgz.tar.gz" only the ".tar.gz"

remove from the string 89dde7.rqsnhq34h.fmu8s1vn0i94hl.tgz.tar.gz only the .tar.gz part and the result should be 89dde7.rqsnhq34h.fmu8s1vn0i94hl.tgz.
It can also happen some files with this extension:
91xhq8vkxlkdfpmfg566qahrwkh01c7n0scpdsr4p4vf6.tbz.tar.bz2 and others with double extension tar.tbz tar.zip and so on ...
In case .tar.zip the result must be nomearchivio.tar in the case 91xhq8vkxlkdfpmfg566qahrwkh01c7n0scpdsr4p4vf6.tbz.tar.bz2 must be 91xhq8vkxlkdfpmfg566qahrwkh01c7n0scpdsr4p4vf6.tbz
I use this :
nameFile="89dde7.rqsnhq34h.fmu8s1vn0i94hl.tgz.tar.gz"
name=${nameFile%.*}
and the result is :
echo $name
89dde7.rqsnhq34h.fmu8s1vn0i94hl.tgz.tar
can you help me? Thanks
P.S. note that there are also other points within the file name.
Since you know exactly what you want to remove, just write it in full:
name=${nameFile%.tar.gz}
Or to remove the last two "extensions" .*.*:
name=${nameFile%.*.*}
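Since ${nameFile%.*.*} removes any two trailing extensions, it may be safer to strip only suffixes that are known to be archive extensions. A minimal sketch, assuming the list of compound suffixes below covers the files in question (the function name strip_archive_ext and the suffix list are hypothetical):

```bash
#!/usr/bin/env bash
# Hypothetical helper; the list of compound suffixes is an assumption.
strip_archive_ext() {
    case $1 in
        *.tar.gz|*.tar.bz2|*.tar.zip|*.tar.tbz)
            printf '%s\n' "${1%.tar.*}" ;;   # drop the trailing ".tar.<ext>"
        *)
            printf '%s\n' "$1" ;;            # unknown suffix: leave unchanged
    esac
}

strip_archive_ext "89dde7.rqsnhq34h.fmu8s1vn0i94hl.tgz.tar.gz"
strip_archive_ext "91xhq8vkxlkdfpmfg566qahrwkh01c7n0scpdsr4p4vf6.tbz.tar.bz2"
```

Because % removes the shortest matching suffix, only the final ".tar.<ext>" is dropped, even if ".tar." appears earlier in the name.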
You could use sed and remove the last 7 characters
echo $nameFile |sed 's/.\{7\}$//'
You could give a try to awk, for example:
echo 89dde7.rqsnhq34h.fmu8s1vn0i94hl.tgz.tar.gz | awk -F '\.tgz' '{print $1".tgz"}'
It will output:
89dde7.rqsnhq34h.fmu8s1vn0i94hl.tgz
For other files:
echo "01c7n0scpdsr4p4vf6.tbz.tar.bz2" | awk -F '\.tbz' '{print $1".tbz"}'
It will output:
01c7n0scpdsr4p4vf6.tbz
In this case, awk uses your pattern (.tgz or .tbz) as the field separator via -F, then prints everything to the left of it plus your desired extension.

Replace one character with another in Bash

I need to replace a space ( ) with a dot (.) in a string in bash.
I think this would be pretty simple, but I'm new so I can't figure out how to modify a similar example for this use.
Use inline shell string replacement. Example:
foo=" "
# replace first blank only
bar=${foo/ /.}
# replace all blanks
bar=${foo// /.}
See http://tldp.org/LDP/abs/html/string-manipulation.html for more details.
You could use tr, like this:
tr " " .
Example:
# echo "hello world" | tr " " .
hello.world
From man tr:
DESCRIPTION
Translate, squeeze, and/or delete characters from standard input, writing to standard output.
In bash, you can do pattern replacement in a string with the ${VARIABLE//PATTERN/REPLACEMENT} construct. Use just / and not // to replace only the first occurrence. The pattern is a wildcard pattern, like file globs.
string='foo bar qux'
one="${string/ /.}" # sets one to 'foo.bar qux'
all="${string// /.}" # sets all to 'foo.bar.qux'
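Because the pattern is a glob, it can also be a bracket expression. As a small sketch of that point, the following replaces every whitespace character (spaces and tabs alike) with a dot; it relies on bash-specific syntax, and the names string and dotted are illustrative:

```bash
#!/usr/bin/env bash
string=$'foo bar\tqux'            # contains a space and a tab
dotted=${string//[[:space:]]/.}   # bracket expression matches any whitespace
echo "$dotted"
```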
Try this
echo "hello world" | sed 's/ /./g'
Use parameter substitution:
string=${string// /.}
Try this for paths:
echo \"hello world\"|sed 's/ /+/g'|sed 's/+/\\/g'|sed 's/\"//g'
It replaces the space inside the double-quoted string with a + sign, then replaces the + sign with a backslash, then removes the double quotes.
I had to use this to replace the spaces in one of my paths in Cygwin.
echo \"$(cygpath -u $JAVA_HOME)\"|sed 's/ /+/g'|sed 's/+/\\/g'|sed 's/\"//g'
The recommended solution by shellcheck would be the following:
string="Hello World" ; echo "${string// /.}"
output: Hello.World

Resources