How to extract a part of string? - linux

I have a string that contains a path:
string="toto.titi.1.tata.2.abc.def"
I want to extract the substring that comes after toto.titi.1.tata.2. (the 1 and 2 are only examples and could be other numbers).
In general, I want to extract the substring that comes after toto.titi.[i].tata.[j].
where [i] and [j] are numbers.
How can I do it?

Pure bash solution:
[[ $string =~ toto\.titi\.[0-9]+\.tata\.[0-9]+\.(.*$) ]] && result="${BASH_REMATCH[1]}"
echo "$result"

An alternate bash solution that uses parameter expansion instead of a regular expression:
echo "${string#toto.titi.[0-9].tata.[0-9].}"
If the numbers can be multi-digit values (i.e., greater than 9), you would need to use an extended pattern:
shopt -s extglob
echo "${string#toto.titi.+([0-9]).tata.+([0-9]).}"
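As a quick check of the difference between the two patterns, the sketch below runs both against a string with multi-digit numbers. The sample value and the variable names plain and extended are made up for illustration.

```shell
# Hypothetical sample with multi-digit numbers (not from the question).
string="toto.titi.12.tata.345.abc.def"

# Plain glob: [0-9] matches exactly one digit, so the prefix does not
# match here and the string comes back unchanged.
plain="${string#toto.titi.[0-9].tata.[0-9].}"

# Extended glob: +([0-9]) matches one or more digits.
shopt -s extglob
extended="${string#toto.titi.+([0-9]).tata.+([0-9]).}"

echo "$plain"
echo "$extended"
```

The first echo prints the original string, the second prints abc.def.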

You can use cut
echo "$string" | cut -f6- -d'.'

This does it:
echo "${string}" | sed -re 's/^toto\.titi\.[[:digit:]]+\.tata\.[[:digit:]]+\.//'

Maybe like this:
echo "$string" | cut -d '.' -f 6-
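One thing to keep in mind with cut: it selects fields purely by position and never checks that the prefix really is toto.titi.[i].tata.[j]. A small sketch of that behaviour, where the second input is a hypothetical string that does not follow the expected layout:

```shell
string="toto.titi.1.tata.2.abc.def"
other="foo.bar.baz.qux.quux.abc.def"   # hypothetical input with a different prefix

# Both calls keep field 6 onwards, whatever the first five fields contain.
first=$(printf '%s\n' "$string" | cut -d '.' -f 6-)
second=$(printf '%s\n' "$other" | cut -d '.' -f 6-)

echo "$first"
echo "$second"
```

Both lines print abc.def, so cut is only a good fit when the input is already known to have the expected shape.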

You can use sed. Like this:
string="toto.titi.1.tata.2.abc.def"
string=$(sed 's/^toto\.titi\.[0-9][0-9]*\.tata\.[0-9][0-9]*\.//' <<< "$string")
echo "$string"
Output:
abc.def

try this awk line:
awk -F'toto\\.titi\\.[0-9]+\\.tata\\.[0-9]+\\.' '{print $2}' file
with your example:
kent$ echo "toto.titi.1.tata.2.abc.def"|awk -F'toto\\.titi\\.[0-9]+\\.tata\\.[0-9]+\\.' '{print $2}'
abc.def

Related

How extract a substring from string using bash commands

I want to extract the substring:
"1.0.119"
from:
"/abc/efg/v/y-1.0.119.u"
How can it be done?
(I get the whole string from a pipe.)
Thanks
> echo "/abc/efg/v/y-1.0.119.u" | cut -d'-' -f2 | cut -d'.' -f1-3
1.0.119
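cut -d'-' -f2 returns the second field when splitting on -, i.e. the text after the first - (up to the next -, if there is one).
cut -d'.' -f1-3 returns the first three fields when splitting on ., i.e. the text before the third dot.
You can match the characters between the first - and the last . with a Bash regex: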
$ [[ "/abc/efg/v/y-1.0.119.u" =~ [^-]+-(.*)\. ]] && echo "${BASH_REMATCH[1]}"
1.0.119
Or sed with the same pattern:
$ echo "/abc/efg/v/y-1.0.119.u" | sed 's/^\([^-]*-\)\(.*\)\(\..*$\)/\2/'
1.0.119
echo "/abc/efg/v/y-1.0.119.u" | sed 's/\(^.*-\)\(.*\)\(\..*$\)/\2/'
This uses sed to split the text into three capture groups, delimited by \( and \), and prints only the second group.
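Since the string arrives on a pipe, another option is to read it in bash and trim it with parameter expansion instead of cut or sed. This is only a sketch of that alternative: the function name extract_version is made up, and it assumes the wanted part sits between the last - and the last dot of each line.

```shell
# extract_version is a hypothetical helper name.
extract_version() {
  local line
  while IFS= read -r line; do
    line="${line##*-}"            # drop everything up to and including the last '-'
    printf '%s\n' "${line%.*}"    # drop the last '.' and whatever follows it
  done
}

echo "/abc/efg/v/y-1.0.119.u" | extract_version
```

For the example input this prints 1.0.119 without starting any external process.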

Best way to swap first 4 chars with last 4 chars of string?

What's the best way to swap the first 4 chars with the last 4 chars of a string?
e.g. I have the string 20140613, I'd like to convert that to 06132014.
$ f=20140613
$ g=${f#????}${f%????}
$ echo $g
06132014
For dealing with longer strings something like the following is needed. (With inspiration from konsolebox's answer.)
echo ${f:(-4)}${f:4:${#f} - 8}${f:0:4}
Using pure BASH regex:
s='20140613'
[[ "$s" =~ ^(.*)([[:digit:]]{4})$ ]] && echo "${BASH_REMATCH[2]}${BASH_REMATCH[1]}"
06132014
Simply use substring expansion:
$ STRING=20140613
$ echo "${STRING:(-4)}${STRING:0:4}"
06132014
See Parameter Expansion.
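The substring expansions above can be wrapped in a small function that also copes with longer and shorter inputs. This is a sketch under assumptions: the name swap_ends and the optional second argument n are made up, and strings shorter than 2*n characters are returned unchanged because the two ends would overlap.

```shell
# swap_ends is a hypothetical helper: swap the first n and last n characters.
swap_ends() {
  local s=$1 n=${2:-4}
  # Too short to swap without overlap: return the input as is.
  if (( ${#s} < 2 * n )); then
    printf '%s\n' "$s"
    return
  fi
  # last n chars + untouched middle + first n chars
  printf '%s\n' "${s:${#s}-n}${s:n:${#s}-2*n}${s:0:n}"
}

swap_ends 20140613
swap_ends 1111xxxx2222
swap_ends 111222
```

The three calls print 06132014, 2222xxxx1111 and 111222 respectively.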
Using date, which is well suited to this kind of conversion (the -d option is specific to GNU date):
$ str="20140613"
$ date +"%m%d%Y" -d "$str"
06132014
When you have to convert dates, no need to look so far ;)
Using sed:
STRING="20140613"
STRING=$(echo $STRING | sed 's/\(....\)\(.*\)/\2\1/')
Or using awk:
echo 20140613 | awk '{print substr($0,5,4) substr($0,1,4)}'
Test:
~$ echo 20140613 | awk '{print substr($0,5,4) substr($0,1,4)}'
>> 06132014
Through sed,
$ echo 20140613 | sed 's/^\(.\{4\}\)\(.\{4\}\)$/\2\1/g'
06132014
Through perl,
$ echo 20140613 | perl -pe 's/^(.{4})(.{4})$/$2$1/'
06132014
With GNU Coreutils:
input=20140613
output=$(echo $input | fold -w4 | tac | tr -d \\n)
If you also need the trailing line feed, you can replace tr -d \\n with xargs printf %s%s\\n or just append && echo to the command.
With perl
for str in 11112222 1111xxxx2222 111222
do
echo -n "$str -> "
echo "$str" | perl -ple 's/^(.{4})(.*)(.{4})$/$3$2$1/'
done
produces:
11112222 -> 22221111
1111xxxx2222 -> 2222xxxx1111
111222 -> 111222

How to search through a string and extract the required value in unix

I have a string like the one below:
QUERY_RESULT='88371087|COB-A#2014-04-22,COB-C#2014-04-22,2014-04-22,2014-04-23 88354188|COB-W#2014-04-22,2014-04-22,2014-04-22 88319898|COB-X#2014-04-22,COB-K#2014-04-22,2014-04-22,2014-04-22'
This is the result of a database query. I want to take all the values before each pipe and join them with commas, so the output needed is:
A='88371087,88354188,88319898'
The database values can differ every time; there can be one, two, or more values.
How do I do it?
Using awk
A=`echo $QUERY_RESULT | awk '{ nreg=split($0,reg);for(i=1;i<=nreg;i++){split(reg[i],fld,"|");printf("%s%s",(i==1?"":","),fld[1]);}}'`
echo $A
88371087,88354188,88319898
Using grep -oP
grep -oP '(^| )\K[^|]+' <<< "$QUERY_RESULT"
88371087
88354188
88319898
OR to get comma separated value:
A=$(grep -oP '(^| )\K[^|]+' <<< "$QUERY_RESULT"|sed '$!s/$/,/'|tr -d '\n')
echo "$A"
88371087,88354188,88319898
$ words=( $( grep -oP '\S+(?=\|)' <<< "$QUERY_RESULT") )
$ A=$(IFS=,; echo "${words[*]}")
$ echo "$A"
88371087,88354188,88319898
Bash only.
shopt -s extglob
result=${QUERY_RESULT//|+([^ ]) /,}
result=${result%|*}
echo "$result"
Output:
88371087,88354188,88319898
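A variant that needs neither extglob nor external tools is to split the result into an array and trim each record in a loop. This is a sketch, assuming records are separated by whitespace and each one contains a | after the ID; the array names records and ids are made up.

```shell
QUERY_RESULT='88371087|COB-A#2014-04-22,COB-C#2014-04-22,2014-04-22,2014-04-23 88354188|COB-W#2014-04-22,2014-04-22,2014-04-22 88319898|COB-X#2014-04-22,COB-K#2014-04-22,2014-04-22,2014-04-22'

# Split the query result on whitespace into one array element per record.
read -r -a records <<< "$QUERY_RESULT"

ids=()
for record in "${records[@]}"; do
  ids+=("${record%%|*}")   # keep only the part before the first '|'
done

# Join the IDs with commas; IFS is changed only inside the subshell.
A=$(IFS=,; echo "${ids[*]}")
echo "$A"
```

This prints 88371087,88354188,88319898 and works the same way when the result holds a single record.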

Script on bash, can you?

There is a string $STRING containing syllables separated by spaces. If the variable $WORD contains at least one syllable from this string, report this in any way.
Your solution checks to see if $WORD exists in $STRING when it should be the other way around. Try this:
string="run walk stand"
word=walking
if echo "$string" | sed -e 's/ /\n/g' | grep -Fqif - <(echo "$word")
then
echo "Match!"
fi
As you can see, you can test the result of the grep without having to save the output in a variable.
By the way -n is the same as ! -z.
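The same check can also be sketched in pure bash with a loop and a [[ ... ]] pattern match. Assumptions here: the function name contains_syllable is made up, the syllables contain no glob characters, and the comparison is case-sensitive (unlike grep -i above).

```shell
# contains_syllable is a hypothetical helper: succeed if WORD contains
# any of the space-separated syllables.
contains_syllable() {
  local word=$1 syllables=$2 syl
  for syl in $syllables; do            # intentionally unquoted: split on spaces
    if [[ $word == *"$syl"* ]]; then   # quoted $syl is matched literally
      return 0
    fi
  done
  return 1
}

string="run walk stand"
word=walking
if contains_syllable "$word" "$string"; then
  echo "Match!"
fi
```

Here walking contains walk, so the script prints Match!.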

Need to grab data in between tilde characters

Can anyone advise how to extract data between tilde characters on Linux? I need to get the IP address, but the data is formatted like the line below.
Details:
20110906000418~118.221.246.17~DATA~DATA~DATA
One more:
echo '20110906000418~118.221.246.17~DATA~DATA~DATA' | sed -r 's/[^~]*~([^~]+)~.*/\1/'
echo "20110906000418~118.221.246.17~DATA~DATA~DATA" | cut -d'~' -f2
This uses the cut command with the delimiter set to ~. The -f2 switch then outputs just the 2nd field.
If the text you give is in a file (called filename), try:
grep "[0-9]*~" filename | cut -d'~' -f2
With cut:
echo "20110906000418~118.221.246.17~DATA~DATA~DATA" | cut -d~ -f2
With awk:
echo "20110906000418~118.221.246.17~DATA~DATA~DATA" | awk -F~ '{ print $2 }'
In awk:
echo '20110906000418~118.221.246.17~DATA~DATA~DATA' | awk -F~ '{print $2}'
Just use bash
$ string="20110906000418~118.221.246.17~DATA~DATA~DATA"
$ echo ${string#*~}
118.221.246.17~DATA~DATA~DATA
$ string=${string#*~}
$ echo ${string%%~*}
118.221.246.17
one more, using perl:
$ perl -F~ -lane 'print $F[1]' <<< '20110906000418~118.221.246.17~DATA~DATA~DATA'
118.221.246.17
bash:
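#!/bin/bash
# Read the file named ip line by line, splitting each line on '~'.
# IFS is set for the read command only, not for the rest of the script.
while IFS='~' read -r -a array
do
echo "${array[1]}"   # second field (index 1)
done < ip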
If the field offsets are fixed, the following substring expansion extracts the value by position:
$ a=20110906000418~118.221.246.17~DATA~DATA~DATA
$ echo ${a:15:14}
118.221.246.17
or using a regular expression with expr (an external command, not a bash builtin):
$ expr "$a" : '[^~]*~\([^~]*\)~.*'
118.221.246.17
last one, again using pure bash methods:
$ tmp=${a#*~}
$ echo $tmp
118.221.246.17~DATA~DATA~DATA
$ echo ${tmp%%~*}
118.221.246.17
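If fields other than the second are needed as well, the splitting can be generalised with read -a. This is a sketch, assuming a single-character delimiter and a 1-based field number; the function name nth_field is made up.

```shell
# nth_field is a hypothetical helper: print field N of LINE split on DELIM.
nth_field() {
  local line=$1 delim=$2 n=$3 fields
  # IFS applies to this read only; the here-string feeds it one line.
  IFS="$delim" read -r -a fields <<< "$line"
  printf '%s\n' "${fields[n-1]}"   # array indices are 0-based
}

nth_field '20110906000418~118.221.246.17~DATA~DATA~DATA' '~' 2
```

For the sample line this prints 118.221.246.17, and passing 1 instead of 2 would print the timestamp.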

Resources