I have a long string like following:
string='<span id="/yourid/12345" class="noname">lala1</span><span id="/yourid/34567" class="noname">lala2</span><span id="/yourid/39201" class="noname">lala3</span>'
The objective is to loop through each 'yourid' and echo the ids 12345, 34567 and 39201 for further processing. How can this be achieved in a bash shell?
GNU grep:
grep -oP '(?<=/yourid/)\d+' <<< "$string"
12345
34567
39201
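If GNU grep's -P option is not available, a similar loop can be sketched in bash alone. This is only an illustration, assuming the ids are always digits that follow /yourid/; the names rest, re and ids are made up for the example.

```shell
#!/usr/bin/env bash
string='<span id="/yourid/12345" class="noname">lala1</span><span id="/yourid/34567" class="noname">lala2</span><span id="/yourid/39201" class="noname">lala3</span>'

ids=()
rest=$string                  # the part of the string not scanned yet
re='/yourid/([0-9]+)'         # assumption: ids consist of digits only

# Match the first remaining id, store it, then drop everything
# up to and including that match before trying again.
while [[ $rest =~ $re ]]; do
    ids+=( "${BASH_REMATCH[1]}" )
    rest=${rest#*"${BASH_REMATCH[0]}"}
done

for id in "${ids[@]}"; do
    echo "$id"
done
```

Each pass consumes one match, so the loop ends once no /yourid/ is left in the remaining text.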
Use a real XML parser. For instance, if you have XMLStarlet installed...
while read -r id; do
[[ $id ]] || continue
printf '%s\n' "${id#/yourid/}"
done < <(xmlstarlet sel -t -m '//span[@id]' -v ./@id -n <<<"<root>${string}</root>")
With Perl:
declare -a ids
ids=( $(perl -lne 'while(m!yourid/(\w+)!g){print $1}' <<< "$string") )
echo "${ids[@]}"
I tried:
here is content of file.txt
some other text
#1.something1=kjfk
#2.something2=dfkjdk
#3.something3=3232
some other text
bash script:
ids=( `grep "something" file.txt | cut -d'.' -f1` )
for id in "${ids[@]}"; do
echo $id
done
result:
(nothing newline...)
(nothing newline...)
(nothing newline...)
But all it prints is an empty line for every id found. What am I missing?
Your grep and cut should be working but you can use awk and reduce 2 commands into one:
while read -r id; do
echo "$id"
done < <(awk -F '\\.' '/something/{print $1}' file.txt)
To populate an array:
ids=()
while read -r id; do
ids+=( "$id" )
done < <(awk -F '\\.' '/something/{print $1}' file.txt)
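On bash 4 or later, the same array can probably be filled without an explicit loop by using the mapfile builtin. The sketch below writes the sample lines from the question to a temporary file so that it runs on its own; tmpfile is a name invented for the example.

```shell
#!/usr/bin/env bash
# Sample data copied from the question, written to a temporary file.
tmpfile=$(mktemp)
printf '%s\n' \
    'some other text' \
    '#1.something1=kjfk' \
    '#2.something2=dfkjdk' \
    '#3.something3=3232' \
    'some other text' > "$tmpfile"

# mapfile -t stores one line per array element and strips the newline.
mapfile -t ids < <(awk -F '\\.' '/something/{print $1}' "$tmpfile")
rm -f "$tmpfile"

printf '%s\n' "${ids[@]}"
```

Unlike ids=( $(...) ), this does not word-split or glob-expand the lines, which matters if a field could ever contain spaces or wildcard characters.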
You can use grep's -o option to output only the text matched by a regular expression:
$ ids=($(grep -Eo '^#[0-9]+' file.txt))
$ echo "${ids[@]}"
#1 #2 #3
This of course doesn't check for the existence of a period on the line... If that's important, then you could either expand things with another pipe:
$ ids=($(grep -Eo '^#[0-9]+\.something' file.txt | grep -o '^#[0-9]*'))
or you could trim the array values after populating the array:
$ ids=($(grep -Eo '^#[0-9]+\.something' file.txt))
$ echo "${ids[@]}"
#1.something #2.something #3.something
$ for key in "${!ids[@]}"; do ids[key]="${ids[key]%.*}"; done
$ echo "${ids[@]}"
#1 #2 #3
I have a string like below
QUERY_RESULT='88371087|COB-A#2014-04-22,COB-C#2014-04-22,2014-04-22,2014-04-23 88354188|COB-W#2014-04-22,2014-04-22,2014-04-22 88319898|COB-X#2014-04-22,COB-K#2014-04-22,2014-04-22,2014-04-22'
This is the result of a database query. Now I want to take all the values before the pipe and separate them with commas. So the output needed is:
A='88371087,88354188,88319898'
The database values can be different every time; there can be one, two or more values.
How do I do it?
Using awk
A=`echo $QUERY_RESULT | awk '{ nreg=split($0,reg);for(i=1;i<=nreg;i++){split(reg[i],fld,"|");printf("%s%s",(i==1?"":","),fld[1]);}}'`
echo $A
88371087,88354188,88319898
Using grep -oP
grep -oP '(^| )\K[^|]+' <<< "$QUERY_RESULT"
88371087
88354188
88319898
OR to get comma separated value:
A=$(grep -oP '(^| )\K[^|]+' <<< "$QUERY_RESULT"|sed '$!s/$/,/'|tr -d '\n')
echo "$A"
88371087,88354188,88319898
$ words=( $( grep -oP '\S+(?=\|)' <<< "$QUERY_RESULT") )
$ A=$(IFS=,; echo "${words[*]}")
$ echo "$A"
88371087,88354188,88319898
Bash only.
shopt -s extglob
result=${QUERY_RESULT//|+([^ ]) /,}
result=${result%|*}
echo "$result"
Output:
88371087,88354188,88319898
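A variant that needs no extglob might split the string into records first and then trim each one. This is a sketch under the assumption that records are separated by whitespace and that each key ends at the first pipe; records and keys are illustrative names.

```shell
#!/usr/bin/env bash
QUERY_RESULT='88371087|COB-A#2014-04-22,COB-C#2014-04-22,2014-04-22,2014-04-23 88354188|COB-W#2014-04-22,2014-04-22,2014-04-22 88319898|COB-X#2014-04-22,COB-K#2014-04-22,2014-04-22,2014-04-22'

# Split on whitespace into one array element per record.
read -r -a records <<< "$QUERY_RESULT"

keys=()
for rec in "${records[@]}"; do
    keys+=( "${rec%%|*}" )    # keep only the text before the first pipe
done

# Join with commas; IFS is changed only inside the subshell.
A=$(IFS=,; echo "${keys[*]}")
echo "$A"
```

With a single record the join simply produces that one key without a trailing comma.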
I have a string that contains a path:
string="toto.titi.1.tata.2.abc.def"
I want to extract the substring that comes after toto.titi.1.tata.2., but 1 and 2 are only examples and could be other numbers.
In general, I want to extract the substring that comes after toto.titi.[i].tata.[j]., where [i] and [j] are numbers.
How can I do that?
Pure bash solution:
[[ $string =~ toto\.titi\.[0-9]+\.tata\.[0-9]+\.(.*$) ]] && result="${BASH_REMATCH[1]}"
echo "$result"
An alternate bash solution that uses parameter expansion instead of a regular expression:
echo "${string#toto.titi.[0-9].tata.[0-9].}"
If the numbers can be multi-digit values (i.e., greater than 9), you would need to use an extended pattern:
shopt -s extglob
echo "${string#toto.titi.+([0-9]).tata.+([0-9]).}"
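To see the difference between the two patterns, here is a small comparison on a made-up input with multi-digit numbers. The variable names single and multi are only for illustration.

```shell
#!/usr/bin/env bash
shopt -s extglob

# Hypothetical input with multi-digit numbers.
string="toto.titi.12.tata.345.abc.def"

# [0-9] matches exactly one digit, so the prefix does not match
# and the string comes back unchanged.
single=${string#toto.titi.[0-9].tata.[0-9].}

# +([0-9]) matches one or more digits, so the prefix is removed.
multi=${string#toto.titi.+([0-9]).tata.+([0-9]).}

echo "$single"
echo "$multi"
```

In these glob patterns a dot is a literal dot, unlike in a regular expression, so no escaping is needed.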
You can use cut
echo $string | cut -f6- -d'.'
This does it:
echo ${string} | sed -re 's/^toto\.titi\.[[:digit:]]+\.tata\.[[:digit:]]+\.//'
Maybe like this:
echo "$string" | cut -d '.' -f 6-
You can use sed. Like this:
string="toto.titi.1.tata.2.abc.def"
string=$(sed 's/toto\.titi\.[0-9]\.tata\.[0-9]\.//' <<< "$string")
echo "$string"
Output:
abc.def
try this awk line:
awk -F'toto\\.titi\\.[0-9]+\\.tata\\.[0-9]+\\.' '{print $2}' file
with your example:
kent$ echo "toto.titi.1.tata.2.abc.def"|awk -F'toto\\.titi\\.[0-9]+\\.tata\\.[0-9]+\\.' '{print $2}'
abc.def
There is a string $STRING in which syllables are separated by spaces. If the variable $WORD contains at least one syllable from this string, report that in any way.
Your solution checks to see if $WORD exists in $STRING when it should be the other way around. Try this:
string="run walk stand"
word=walking
if echo "$string" | sed -e 's/ /\n/g' | grep -Fqif - <(echo "$word")
then
echo "Match!"
fi
As you can see, you can test the result of the grep without having to save the output in a variable.
By the way -n is the same as ! -z.
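The same check can probably be done without sed and grep. The sketch below is a bash-only variant; note that it is case-sensitive, unlike the grep -i version, and that syllables and matched are names chosen for the example. It also shows -n used directly.

```shell
#!/usr/bin/env bash
string="run walk stand"
word=walking

# Split the space-separated syllables into an array.
read -r -a syllables <<< "$string"

matched=""
for syllable in "${syllables[@]}"; do
    # True when $word contains $syllable as a literal substring.
    if [[ $word == *"$syllable"* ]]; then
        matched=$syllable
        break
    fi
done

# -n is true for a non-empty string, the same as ! -z.
if [[ -n $matched ]]; then
    echo "Match: $matched"
fi
```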
Can anyone advise how to search on Linux for data between tilde characters? I need to get the IP address, but the data is formatted like the line below.
Details:
20110906000418~118.221.246.17~DATA~DATA~DATA
One more:
echo '20110906000418~118.221.246.17~DATA~DATA~DATA' | sed -r 's/[^~]*~([^~]+)~.*/\1/'
echo "20110906000418~118.221.246.17~DATA~DATA~DATA" | cut -d'~' -f2
This uses the cut command with the delimiter set to ~. The -f2 switch then outputs just the 2nd field.
If the text you give is in a file (called filename), try:
grep "[0-9]*~" filename | cut -d'~' -f2
With cut:
echo "20110906000418~118.221.246.17~DATA~DATA~DATA" | cut -d~ -f2
With awk:
echo "20110906000418~118.221.246.17~DATA~DATA~DATA" | awk -F~ '{ print $2 }'
In awk:
echo '20110906000418~118.221.246.17~DATA~DATA~DATA' | awk -F~ '{print $2}'
Just use bash
$ string="20110906000418~118.221.246.17~DATA~DATA~DATA"
$ echo ${string#*~}
118.221.246.17~DATA~DATA~DATA
$ string=${string#*~}
$ echo ${string%%~*}
118.221.246.17
one more, using perl:
$ perl -F~ -lane 'print $F[1]' <<< '20110906000418~118.221.246.17~DATA~DATA~DATA'
118.221.246.17
bash:
#!/bin/bash
IFS='~'
while read -a array;
do
echo ${array[1]}
done < ip
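When the line is already in a variable rather than in a file, a similar split can be sketched with a here-string. Putting IFS in front of read limits the change to that one command, so the rest of the script keeps the default field separator. The names line, fields and ip are assumptions for the example.

```shell
#!/usr/bin/env bash
line='20110906000418~118.221.246.17~DATA~DATA~DATA'

# IFS applies only to this read; -a stores the fields in an array.
IFS='~' read -r -a fields <<< "$line"

ip=${fields[1]}    # arrays are zero-based, so index 1 is the second field
echo "$ip"
```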
If the string always has the same layout (fixed offset and length), the following parameter expansion performs substring extraction:
$ a=20110906000418~118.221.246.17~DATA~DATA~DATA
$ echo ${a:15:14}
118.221.246.17
or using a regular expression with the external expr utility:
$ expr "$a" : '[^~]*~\([^~]*\)~.*'
118.221.246.17
last one, again using pure bash methods:
$ tmp=${a#*~}
$ echo $tmp
118.221.246.17~DATA~DATA~DATA
$ echo ${tmp%%~*}
118.221.246.17