[UPDATED QUESTION]
I've got a variable $CHANGED which stores the output of a subversion command like this: CHANGED="$(svnlook changed -r $REV $REPOS)".
Executing svnlook changed -r $REV $REPOS will output the following to the command line:
A /path/to/file
A /path/to/file2
A /path/to/file3
However, I need to store the output formatted as shown below in a variable $FILES:
A /path/to/file<br />A /path/to/file2<br />A /path/to/file3<br />
I need this so that I can use $FILES in a command which generates an email message like this:
sendemail [some-options] $FILES
It should replace $FILES with A /path/to/file<br />A /path/to/file2<br />A /path/to/file3<br /> so that it can interpret the HTML break tags.
In bash:
echo "${VAR//$'\n'/<br />}"
See Parameter Expansion
The Parameter Expansion section of the man page is your friend.
Starting with
changed="
A /path/to/file
A /path/to/other/file
A /path/to/new/file
"
You can remove leading and trailing newlines using the # and % expansions:
files="${changed#$'\n'}"
files="${files%$'\n'}"
Then replace the other newlines with <br />:
files="${files//$'\n'/<br />}"
Demonstration:
printf '***%s***\n' "$files"
***A /path/to/file<br />A /path/to/other/file<br />A /path/to/new/file***
(Note that I've changed your all-uppercase variable names to lower case. Avoid uppercase names for your locals, as these tend to be used for communication via the environment.)
If you dislike writing newline as $'\n', you may of course store it in a variable:
nl=$'\n'
files="${changed#$nl}"
files="${files%$nl}"
files="${files//$nl/<br />}"
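Putting the steps above together for the original question might look like the sketch below. The helper name to_html_breaks is hypothetical, and the svnlook call is replaced by a literal string so the example runs anywhere. Command substitution strips trailing newlines, so the last entry would otherwise have no break tag; the sketch appends one to match the format requested in the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn newline-separated text into a single line in which
# every entry is followed by an HTML break tag.
to_html_breaks() {
  local text=$1
  local nl=$'\n'
  # Drop one leading and one trailing newline, if present.
  text="${text#"$nl"}"
  text="${text%"$nl"}"
  # Empty input yields empty output rather than a lone break tag.
  [ -n "$text" ] || return 0
  # Replace the remaining newlines, then terminate the last entry as well.
  printf '%s\n' "${text//"$nl"/<br />}<br />"
}

# Stand-in for: changed="$(svnlook changed -r "$REV" "$REPOS")"
changed=$'A /path/to/file\nA /path/to/file2\nA /path/to/file3'
files="$(to_html_breaks "$changed")"
printf '%s\n' "$files"
```

When passing the result on, quote it ("$files") so the shell hands it over as a single argument.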
You can modify hek2mgl's answer to strip out the first <br /> (if any):
CHANGED="
A /path/to/file
A /path/to/other/file
A /path/to/new/file
"
FILES="$(echo "${CHANGED//$'\n'/<br />}" | sed 's#^<br />##g')"
echo "$FILES"
Output:
A /path/to/file<br />A /path/to/other/file<br />A /path/to/new/file<br />
Another way (with only sed):
FILES="$(echo "$CHANGED" | sed ':a;N;$!ba;s#\n#<br />#g;s#^<br />##g')"
Related
I've got a text file containing the html-source of a web page. There are lines with "data-adid="...". These lines I'd like to capture.
Therefore, I use:
Id=$(grep -m 10 -A 1 "data-adid" Textfile)
to get the first ten results.
The variable Id contains the following:
<arcicle class="aditem" data-adid="1234567890" <div class="aditem-image"> --
<arcicle class="aditem" data-adid="2134567890" <div class="aditem-image"> --
<arcicle class="aditem" data-adid="2134567890" <div class="aditem-image"> --
...
I would like to get the following output:
id="1234567890" id="2134567890" id="3124567890"
When using the grep command, I only manage to get the numbers, e.g.
Id2=$(echo $Id | grep -oP '(?<=data-adid=").*?(?=")')
gets 1234567890 2134567890 3124567890
When trying
Id2=$(echo $Id | grep -oP '(?<=data-ad).*?(?=")')
this will only give me id= id= id=
How could the code be changed to get the desired output?
HTML is better handled by tools that understand it, but since the OP asks for shell-level tools, I would go with awk for this one. Written and tested at https://ideone.com/EpU1aW
echo "$var" |
awk '
match($0,/data-adid="[^"]*"/){
val=substr($0,RSTART,RLENGTH)
sub(/^data-ad/,"",val)
print val
val=""
}
'
Your lookbehind matches only data-ad. Match the id= part as well, from the opening " up to the next ". I also see no reason to use fancy lookarounds: just match the string and output only the matched part.
grep -oP 'data-ad\Kid="[^"]*"'
That should be enough. Note that the unquoted $Id undergoes word splitting and most probably should be quoted. Also, HTML cannot be parsed reliably with regular expressions, so you should most probably use HTML-aware tools instead.
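For a grep built without -P support, the same idea can be sketched with plain grep -o plus sed. The sample input below is made up for illustration (modeled on the question, with the tag spelled article), and paste is used only to join the matches onto one line as in the desired output.

```shell
#!/usr/bin/env bash
# Illustrative input; in the question this comes from grep -m 10 -A 1 ... Textfile
Id='<article class="aditem" data-adid="1234567890" <div class="aditem-image"> --
<article class="aditem" data-adid="2134567890" <div class="aditem-image"> --'

# 1. grep -o prints each match of the whole attribute on its own line.
# 2. sed drops the leading "data-ad", leaving id="...".
# 3. paste joins the lines with single spaces.
Id2=$(printf '%s\n' "$Id" |
  grep -o 'data-adid="[^"]*"' |
  sed 's/^data-ad//' |
  paste -sd ' ' -)

printf '%s\n' "$Id2"
```

Quoting "$Id" keeps the line breaks intact, so each input line is processed separately.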
With any sed:
$ sed 's/.*data-ad\(id="[^"]*"\).*/\1/' file
id="1234567890"
id="2134567890"
id="2134567890"
I'd like to cut off some special strings of a variable.
The variable contains the following, including a lot of blank space before <div... and a class attribute:
<div data-href="/www.somewebspace.com" class="class1 class2">
I would like to extract the contents of the data-href attribute, i.e. get this output: /www.somewebspace.com
I tried out the following code, the output starts with the contents of the data-href attribute and the class attribute.
echo $Test | grep -oP '(?<=<div data-href=").*(?=")'
How can I get rid of the class attribute?
Kind regards and grateful for every reply,
X3nion
P.S. Another question arose. I've got these strings I'd like to extract from a text file:
<div class="aditem-addon">
Today, 23:23</div>
What would be the correct command to extract only the "Today, 23:23", without any spaces before or after the term?
Maybe I would have to delete the blank spaces first?
Your regex is correct; you only need to adjust the greediness of the * quantifier:
* is a greedy quantifier : match as much as possible whilst getting a match
*? is a reluctant quantifier : match the minimum characters to get a match
# Correct
Test='<div data-href="/www.somewebspace.com" class="fdgks"></div>'
echo $Test | grep -oP '(?<=<div data-href=").*?(?=")'
#> /www.somewebspace.com
# the desired output
# WRONG
echo $Test | grep -oP '(?<=<div data-href=").*(?=")'
#> /www.somewebspace.com" class="fdgks
# didn't stop until it matched the last quote `"`
echo $Test$Test | grep -oP '(?<=<div data-href=").*(?=")'
#> /www.somewebspace.com" class="fdgks"></div><div data-href="/www.somewebspace.com" class="fdgks
# same as the last one
for a more detailed explanation about the difference between greedy, reluctant and possessive quantifiers (see)
EDIT
echo "$Test$Test" | grep -Poz '(?<=<div class="aditem-addon">\n ).*?(?=<\/div>)'
#> Today, 23:23
#> Today, 23:23
The \n followed by a space matches a newline and one leading space.
If the string you're looking for contains the newline character \n, you'll need to add the z option to grep, i.e. the call will be grep -ozP.
Unless the input is very simple, consider using xmllint or another HTML parsing tool. For very simple cases, you can use a plain shell solution based on parameter expansion:
#! /bin/sh
s=' <div data-href="/www.somewebspace.com" class="class1 class2"> '
s1=${s##*data-href=\"}
s1=${s1%%\"*}
echo "$s1"
Which will print
/www.somewebspace.com
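The follow-up question about "Today, 23:23" can be approached with the same kind of parameter expansion. This is only a sketch for input as simple as the snippet in the question; the variable names are made up, and the amount of indentation in the sample is an assumption.

```shell
#!/usr/bin/env bash
# Sample modeled on the question: the text sits on the line after the opening tag.
snippet='<div class="aditem-addon">
                Today, 23:23</div>'

text=${snippet#*>}    # drop everything through the first '>'
text=${text%%<*}      # drop everything from the first '<' onward

# Trim leading whitespace: the inner expansion isolates the leading run of
# whitespace, and the outer one removes it.
text=${text#"${text%%[![:space:]]*}"}
# Trim trailing whitespace in the same way.
text=${text%"${text##*[![:space:]]}"}

printf '%s\n' "$text"
```

Because [[:space:]] also covers newlines and tabs, the line break after the opening tag is removed along with the indentation.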
I am trying to replace the strings in an xml file using the sed command. My script contains the following code.
SEARCH='key="identifierA" value ="000000 00:00:00"'
REPLACE='key="identifierA" value ="101617 00:00:00"'
TEST_DIR=home/test/
TEST_FILE="test.xml"
ChangeXml(){
ModifyValue $TEST_DIR $TEST_FILE $SEARCH $REPLACE
}
ModifyValue (){
cd $1
echo "Search : $3 Replace : $4 "
sed -i "s/$3/$4/g" $2
}
#Actions performed
ChangeXml
But $3 in the echo returns identifierA and $4 returns 000000 00:00:00. They are supposed to give the values assigned to those variables instead. Because of this, the replacement is not working as expected. I tried to escape the space in key="identifierA" value ="000000 00:00:00", but did not get the expected result. I am very new to shell scripting. Can anyone tell me the reason and help me achieve the expected result?
Quote the variables if they can contain whitespace:
ModifyValue "$TEST_DIR" "$TEST_FILE" "$SEARCH" "$REPLACE"
Otherwise, $SEARCH is sent in pieces (split on whitespace) and populates more than one argument.
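The effect of the quotes can be made visible with a small sketch. The helpers count_args and modify_value are hypothetical; modify_value prints the result instead of editing the file in place, and it assumes the search and replacement text contain no characters that are special to sed (such as /, & or \).

```shell
#!/usr/bin/env bash
SEARCH='key="identifierA" value ="000000 00:00:00"'
REPLACE='key="identifierA" value ="101617 00:00:00"'

# Hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

unquoted=$(count_args $SEARCH)    # word splitting breaks the string apart
quoted=$(count_args "$SEARCH")    # the whole string arrives as one argument
echo "unquoted: $unquoted, quoted: $quoted"

# Same idea as ModifyValue, with every expansion quoted.
modify_value() {
  local file=$1 search=$2 replace=$3
  sed "s/$search/$replace/g" "$file"
}

# Temporary sample file standing in for test.xml.
sample=$(mktemp)
printf '<entry key="identifierA" value ="000000 00:00:00"/>\n' > "$sample"
result=$(modify_value "$sample" "$SEARCH" "$REPLACE")
rm -f "$sample"
echo "$result"
```

The same quoting applies inside the function: "$1", "$2" and so on, otherwise the values are split again there.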
Given a list of paths separated by a single space:
/home/me/src/test /home/me/src/vendor/a /home/me/src/vendor/b
I want to remove the prefix /home/me/src/ so that the result is:
test vendor/a vendor/b
For a single path I would do: ${PATH#/home/me/src/} but how do I apply it to this series?
You can use // to replace all occurrences of substring. Replace it with null string to remove them.
$ path="/home/me/src/test /home/me/src/vendor/a /home/me/src/vendor/b"
$ echo ${path//\/home\/me\/src\/}
test vendor/a vendor/b
Reference: ${parameter/pattern/string} in Bash reference manual
Shell parameter expansion is useful here, as nu11p01n73R's answer reveals.
For clarity, I would use sed with the syntax sed 's#pattern#replacement#g':
$ str="/home/me/src/test /home/me/src/vendor/a /home/me/src/vendor/b"
$ sed 's#/home/me/src/##g' <<< "$str"
test vendor/a vendor/b
As always, a grep solution from my side:
echo 'your string' | grep -Po '^/([^ /]*/)+\K.+'
Please note that the above regex does this for any string like /x/y/z/test ..., and that, because of the ^ anchor, only the prefix at the start of the line is removed. If you are interested only in removing /home/me/src/, try the following:
echo 'your string' | grep -Po '^/home/me/src/\K.+' --color
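If the prefix should be removed only at the start of each path, and not wherever it happens to occur in the string, one option in bash is to split the list into an array and apply the # expansion to every element. This is a sketch that assumes the paths themselves contain no whitespace; the variable names are illustrative.

```shell
#!/usr/bin/env bash
paths="/home/me/src/test /home/me/src/vendor/a /home/me/src/vendor/b"

# Split the space-separated list into an array.
read -r -a items <<< "$paths"

# The # expansion is applied to each array element in turn.
stripped=("${items[@]#/home/me/src/}")

# "${stripped[*]}" joins the elements with a single space.
echo "${stripped[*]}"
```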
I need to replace a space ( ) with a dot (.) in a string in bash.
I think this would be pretty simple, but I'm new so I can't figure out how to modify a similar example for this use.
Use inline shell string replacement. Example:
foo=" "
# replace first blank only
bar=${foo/ /.}
# replace all blanks
bar=${foo// /.}
See http://tldp.org/LDP/abs/html/string-manipulation.html for more details.
You could use tr, like this:
tr " " .
Example:
# echo "hello world" | tr " " .
hello.world
From man tr:
DESCRIPTION
Translate, squeeze, and/or delete characters from standard input, writing to standard output.
In bash, you can do pattern replacement in a string with the ${VARIABLE//PATTERN/REPLACEMENT} construct. Use just / and not // to replace only the first occurrence. The pattern is a wildcard pattern, like file globs.
string='foo bar qux'
one="${string/ /.}" # sets one to 'foo.bar qux'
all="${string// /.}" # sets all to 'foo.bar.qux'
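Because the pattern is a glob, a bracket expression works too. As a small sketch, [[:space:]] replaces tabs as well as spaces; the sample string is made up.

```shell
#!/usr/bin/env bash
# A string containing both a space and a tab.
string=$'foo bar\tqux'

# Replace every whitespace character with a dot.
dotted="${string//[[:space:]]/.}"
echo "$dotted"
```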
Try this
echo "hello world" | sed 's/ /./g'
Use parameter substitution:
string=${string// /.}
Try this for paths:
echo \"hello world\"|sed 's/ /+/g'|sed 's/+/\\/g'|sed 's/\"//g'
It replaces the space inside the double-quoted string with a + sign, then replaces the + sign with a backslash, then removes the double quotes.
I had to use this to replace the spaces in one of my paths in Cygwin.
echo \"$(cygpath -u $JAVA_HOME)\"|sed 's/ /+/g'|sed 's/+/\\/g'|sed 's/\"//g'
The recommended solution by shellcheck would be the following:
string="Hello World" ; echo "${string// /.}"
output: Hello.World