How to extract a substring beginning and ending with user-defined special characters from a string in linux?

I am working on linux scripts and want to extract a substring out of a master string as in the following example :-
Master string =
2011-12-03 11:04:22#Alex#Audrino^13b11254^Townville#USA#
What I require is :-
Substring =
13b11254
I simply want to read and extract whatever is there in between ^ ^ special characters.
This code will be used in a linux script.

Using standard shell parameter expansion:
% s='2011-12-03 11:04:22#Alex#Audrino^13b11254^Townville#USA#' ss=${s#*^} ss=${ss%^*}
% printf '%s\n' "$ss"
13b11254

The solution below uses the cut utility, which spawns a process and is slower than the shell parameter expansion solution. It might be easier to understand, and can be run on a file instead of on a single string.
s='2011-12-03 11:04:22#Alex#Audrino^13b11254^Townville#USA#'
echo "$s" | cut -d '^' -f 2
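Because cut works line by line, the same command handles many records at once. A minimal sketch, in which the second record is made up for illustration and the two-line variable stands in for the contents of a file:

```shell
# Two records; the second one is a hypothetical sample, not from the question.
records='2011-12-03 11:04:22#Alex#Audrino^13b11254^Townville#USA#
2011-12-04 09:15:00#Sam#Example^77c00001^Otherville#USA#'

# cut treats each input line independently and prints field 2 of each.
ids=$(printf '%s\n' "$records" | cut -d '^' -f 2)
printf '%s\n' "$ids"
```

With a real file, passing the file name to cut as a last argument does the same job. Note that cut prints lines containing no ^ unchanged unless the -s option is given.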

You can also use bash arrays and field separator:
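s='2011-12-03 11:04:22#Alex#Audrino^13b11254^Townville#USA#'
# Set IFS only for this read, so the rest of the script keeps the default separator
IFS='^' read -r -a array <<< "$s"
echo "${array[1]}"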
This allows you to test if you have exactly 2 separators:
if [ ${#array[*]} -ne 3 ]
then
echo error
else
echo ok
fi

POSIX sh compatible:
string='2011-12-03 11:04:22#Alex#Audrino^13b11254^Townville#USA#'
temp="${string#*^}"
printf "%s\n" "${temp%^*}"
Assumes that ^ is only used 2x per string as the 2 delimiters.
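To see why that assumption matters, here is a small sketch with a made-up input containing three carets. It compares the % form used above with %%, which cuts at the first remaining ^ instead of the last:

```shell
# Hypothetical input with three carets instead of two.
s='a^first^second^d'

rest=${s#*^}         # drop everything up to and including the first ^
greedy=${rest%^*}    # strip from the LAST ^ onward  -> first^second
strict=${rest%%^*}   # strip from the FIRST ^ onward -> first
printf '%s\n' "$greedy" "$strict"
```

If only the text between the first two carets is wanted, the %% variant may be the safer choice.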

Related

Storing escape characters in unix variable

I am extracting a part from an existing file and storing it as a string in a variable. The string looks something like this:
var="*a<br>*b<br>*c"
Now, as * is a special character in unix, it does not work in further operations (like sed, grep) until I put an escape character in front of every *.
That's why I am doing something like this:
echo $var | sed 's/\*/\\*/g'
On running this command in bash we get
echo $var | sed 's/\*/\\*/g'
\*a<br>\*b<br>\*c
which is the desired output, but when I try to store this in a variable, I get back my original variable, like so:
var=`echo $var | sed 's/\*/\\*/g'`
echo $var
*a<br>*b<br>*c
I am assuming this happens because the variable ignores the backslashes interpreting them as escape characters. How can I retain the backslashes and store them as in a variable?
The problem is caused by backticks. Use $( ) instead, and it goes away:
var="*a<br>*b<br>*c"
var=$(printf '%s\n' "$var" | sed 's/\*/\\*/g')
printf '%s\n' "$var"
(Why is this problem caused by backticks? Because the only way to nest them is to escape the inner ones with backslashes, so they necessarily change how backslashes behave; whereas $( ), because it uses different starting and ending sigils, can be nested natively).
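The difference can be seen side by side. This is only a sketch; the variable names with_backticks and with_dollar are invented for the comparison:

```shell
var='*a<br>*b<br>*c'

# Inside backticks the shell turns \\ into \ before running the command,
# so sed receives s/\*/\*/g and the replacement is a plain *.
with_backticks=`printf '%s\n' "$var" | sed 's/\*/\\*/g'`

# Inside $( ) the sed script is passed through untouched.
with_dollar=$(printf '%s\n' "$var" | sed 's/\*/\\*/g')

printf '%s\n' "$with_backticks" "$with_dollar"
```

The first line printed is the unchanged input, the second has a backslash in front of every *.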
That said, if your shell is one (like bash) with ksh-inspired extensions, you don't need sed at all here, as the shell can perform simple string replacements natively via parameter expansion:
var="*a<br>*b<br>*c"
printf '%s\n' "${var//'*'/'\*'}"
For background on why this answer uses printf instead of echo, see Why is printf better than echo? at [unix.se], or the APPLICATION USAGE section of the POSIX specification for echo.
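One case where the choice matters is a value that looks like an option. The sketch below uses an invented value, -n, which bash's builtin echo would treat as an option rather than as text to print:

```shell
# Hypothetical value that happens to look like an echo option.
var='-n'

# printf prints its arguments under control of the format string,
# so the value is never mistaken for an option.
out=$(printf '%s\n' "$var")
printf '%s\n' "$out"
```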

bash extract version string & convert to version dot

I want to extract version string (1_4_5) from my-app-1_4_5.img and then convert into dot version (1.4.5) without filename. Version string will have three (1_4_5) or four (1_4_5_7) segments.
Have this one liner working ls my-app-1_4_5.img | cut -d'-' -f 3 | cut -d'.' -f 1 | tr _ .
Would like to know if there is any better way rather than piping output from cut.
Here's an attempt with parameter expansion. I'm assuming you have a wildcard pattern you want to loop over.
for file in *-*.img; do
base=${file%.img}
ver=${base##*-}
echo "${ver//_/.}"
done
The construct ${var%pattern} returns the variable var with any suffix matching pattern trimmed off. Similarly, ${var#pattern} trims any prefix which matches pattern. In both cases, doubling the operator switches to trimming the longest possible match instead of the shortest. (These are POSIX-compatible parameter expansions, i.e. not strictly Bash only.) The construct ${var/pattern/replacement} replaces the first match in var on pattern with replacement; doubling the first slash causes every match to be replaced. (This one is not POSIX; it is available in Bash and in other shells with ksh-style extensions.)
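The shortest versus longest distinction is easiest to see on a concrete value. A small sketch: the name backup.tar.gz is a made-up example, and the tr call is one possible stand-in for ${ver//_/.} in a shell that lacks it:

```shell
file='my-app-1_4_5.img'
short_prefix=${file#*-}      # shortest prefix match removed -> app-1_4_5.img
long_prefix=${file##*-}      # longest prefix match removed  -> 1_4_5.img

archive='backup.tar.gz'      # hypothetical file name
short_suffix=${archive%.*}   # shortest suffix match removed -> backup.tar
long_suffix=${archive%%.*}   # longest suffix match removed  -> backup

# POSIX-only fallback for the replacement step, at the cost of a pipe.
ver=${long_prefix%.img}
ver=$(printf '%s\n' "$ver" | tr '_' '.')
printf '%s\n' "$ver"
```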
You can do it with sed:
sed -E 's/.*-([0-9_]+)\.img$/\1/; s/_/./g' <<< my-app-1_4_5.img
Assuming the version number will always be between the last dash and the file extension, you can use something like this in pure Bash:
name="file-name-x-1_2_3_4_5.ext"
version=${name##*-}
version=${version%%.ext}
version=${version//_/.}
echo $version
The code above will result in:
1.2.3.4.5
For a complete explanation of the parameter expansions used above, please take a look at Bash Reference Manual: 3.5.3 Shell Parameter Expansion.
Remove everything but 0 to 9, _ and newline and then replace all _ with .:
echo "my-app-1_4_5.img" | tr -cd '0-9_\n' | tr '_' '.'
Output:
1.4.5
With bash and a regex:
echo "my-app-1_4_5.img" | while IFS= read -r line; do [[ "$line" =~ [^0-9]([0-9_]+)[^0-9] ]] && echo "${BASH_REMATCH[1]//_/.}"; done
Output:
1.4.5
A slightly shorter variant:
name=my-app-1_4_5.img
vers=${name//[!0-9_]}
echo ${vers//_/.}
1.4.5

Extract path from a entire string in bash shell script

I need to extract path from a string. I found examples in another post, but missing additional steps.
I have a string as below:
title="test test good dskgkdh hdfyr /rlsmodules/svnrepo/SOURCE/CBL/MQ/BASELINE/MQO000.CBL kdlkfg nsfgf trhrnrt"
cobsrc=$(awk '{match($0,/\/[^"]*/,a);print a[0]}' <<< $title)
echo $cobsrc
Output is
/rlsmodules/svnrepo/SOURCE/CBL/MQ/BASELINE/MQO000.CBL kdlkfg nsfgf trhrnrt
I need only
/rlsmodules/svnrepo/SOURCE/CBL/MQ/BASELINE/MQO000.CBL
What modification is required?
An existing post on similar query:
how to extract path from string in shell script
Four solutions, in order of my own preference.
First option would be simple parameter expansion, in two steps:
$ title="/${title#*/}"
$ title="${title%% *}"
$ echo "$title"
/rlsmodules/svnrepo/SOURCE/CBL/MQ/BASELINE/MQO000.CBL
The first line removes everything up to the first slash (while prepending a slash to replace the one that's stripped); the second line removes everything from the first bit of whitespace that remains.
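The two steps can be wrapped in a small function that also copes with input containing no slash at all. This is a sketch only; the function name extract_path and its return-code convention are assumptions, not part of the original answer:

```shell
# extract_path is a hypothetical helper: prints the first /-prefixed word
# of its argument, or returns 1 if the argument contains no slash.
extract_path() {
    case $1 in
        */*) ;;              # at least one slash present, carry on
        *) return 1 ;;       # nothing that looks like a path
    esac
    rest="/${1#*/}"          # drop text before the first slash, keep the slash
    printf '%s\n' "${rest%% *}"   # drop everything from the first space on
}

title="test test good dskgkdh hdfyr /rlsmodules/svnrepo/SOURCE/CBL/MQ/BASELINE/MQO000.CBL kdlkfg nsfgf trhrnrt"
cobsrc=$(extract_path "$title")
printf '%s\n' "$cobsrc"
```

Like the original two-liner, this assumes the path itself contains no spaces.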
Or, if you prefer, use a regex:
$ [[ $title =~ ^[^/]*(/[^ ]+)\ ]]
$ echo ${BASH_REMATCH[1]}
/rlsmodules/svnrepo/SOURCE/CBL/MQ/BASELINE/MQO000.CBL
The regex translates as:
null at the beginning of the line,
a run of zero or more non-slashes,
an atom:
a slash followed by non-space characters
a space, to end the previous atom.
The $BASH_REMATCH array contains the content of the bracketed atom.
Next option might be grep -o:
$ grep -o '/[^ ]*' <<<"$title"
(Result redacted -- you know what it'll be.)
You could of course assign this output to a variable using command substitution, which you already know about.
Last option is another external tool...
$ sed 's:^[^/]*::;s/ .*//' <<<"$title"
This is the same functionality as is handled by the parameter expansion (at the top of the answer) only in a sed script, which requires a call to an external program. Included only for pedantry. :)
Could you please try the following:
echo "$title" | awk 'match($0,/\/.*\/[^ ]*/){print substr($0,RSTART,RLENGTH)}'
Output will be as follows.
/rlsmodules/svnrepo/SOURCE/CBL/MQ/BASELINE/MQO000.CBL
Second solution: assuming the path itself does not contain any spaces, the following may help you too.
echo "$title" | awk '{sub(/[^/]* /,"");sub(/ .*/,"")} 1'

Get first character of a string SHELL

I want to get the first character of a string, for example:
$>./first $foreignKey
And I want to get "$"
I googled it and found some solutions, but they only concern bash and not sh!
This should work on any Posix compatible shell (including sh). printf is not required to be a builtin but it often is, so this may save a fork or two:
first_letter=$(printf %.1s "$1")
Note: (Possibly I should have explained this six years ago when I wrote this brief answer.) It might be tempting to write %c instead of %.1s; that produces exactly the same result except in the case where the argument "$1" is empty. printf %c "" actually produces a NUL byte, which is not a valid character in a Posix shell; different shells might treat this case differently. Some will allow NULs as an extension; others, like bash, ignore the NUL but generate an error message to tell you it has happened. The precise semantics of %.1s is "at most 1 character at the start of the argument", which means that first_letter is guaranteed to be set to the empty string if the argument is the empty string, without raising any error indication.
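A short sketch of the %.1s behaviour for both a normal and an empty argument. The wrapper name first_char is made up for the example:

```shell
# first_char is a hypothetical wrapper around the printf call above.
first_char() {
    printf '%.1s' "$1"
}

a=$(first_char '$foreignKey')   # single quotes keep the $ literal
b=$(first_char '')              # empty input gives empty output, no error

printf 'a=[%s] b=[%s]\n' "$a" "$b"
```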
Well, you'll probably need to escape that particular value to prevent it being interpreted as a shell variable but, if you don't have access to the nifty bash substring facility, you can still use something like:
name=paxdiablo
firstchar=`echo $name | cut -c1-1`
If you do have bash (it's available on most Linux distros and, even if your login shell is not bash, you should be able to run scripts with it), it's much easier:
firstchar=${name:0:1}
For escaping the value so that it's not interpreted by the shell, you need to use:
./first \$foreignKey
and the following first script shows how to get it:
letter=`echo $1 | cut -c1-1`
echo ".$letter."
Maybe this is an old question, but I recently ran into the same problem. Based on the POSIX shell manual's section on substring processing, this is my solution, which does not involve any subshell/fork:
a="some string here"
printf 'first char is "%s"\n' "${a%"${a#?}"}"
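The nested expansion is easier to follow when split into its two steps. A sketch using the same string, with rest and first as illustrative variable names:

```shell
a="some string here"

# Step 1: ${a#?} removes the first character (? matches any single one).
rest=${a#?}

# Step 2: removing that remainder as a suffix leaves only the first character.
# The inner quotes make the shell treat $rest literally, not as a pattern.
first=${a%"$rest"}

printf 'first char is "%s"\n' "$first"
```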
For sh:
echo "hello" | cut -b 1 # -b 1 extract the 1st byte
h
echo "hello" |grep -o "." | head -n 1
h
echo "hello" | awk -F "" '{print $1}'
h
You can try this for bash:
s='hello'; echo ${s:0:1}
h
In bash, printf -v can also store the first character directly in a variable:
printf -v first_character "%c" "${variable}"

How can I convert a string from uppercase to lowercase in Bash?

I have been searching to find a way to convert a string value from uppercase to lowercase. All the search results show approaches of using the tr command.
The problem with the tr command is that I am able to get the result only when I use the command with the echo statement. For example:
y="HELLO"
echo $y| tr '[:upper:]' '[:lower:]'
The above works and results in 'hello', but I need to assign the result to a variable as below:
y="HELLO"
val=$y| tr '[:upper:]' '[:lower:]'
string=$val world
When assigning the value like above it gives me an empty result.
PS: My Bash version is 3.1.17
If you are using Bash 4, you can use the following approach:
x="HELLO"
echo $x # HELLO
y=${x,,}
echo $y # hello
z=${y^^}
echo $z # HELLO
Use only one , or ^ to make the first letter lowercase or uppercase.
One way to implement your code is
y="HELLO"
val=$(echo "$y" | tr '[:upper:]' '[:lower:]')
string="$val world"
This uses $(...) notation to capture the output of the command in a variable. Note also the quotation marks around the string variable -- you need them there to indicate that $val and world are a single thing to be assigned to string.
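To make the contrast with the question's attempt concrete, the sketch below runs both forms. In bash, the assignment in val=$y | tr ... is executed as the first command of a pipeline, in a subshell, and prints nothing, so val is never set in the calling shell. The variable broken is only there to record that:

```shell
y="HELLO"
unset val

# The question's form: the assignment runs in the pipeline's subshell,
# tr receives no input, and val stays unset afterwards.
val=$y | tr '[:upper:]' '[:lower:]'
broken=${val-unset}

# Command substitution captures what the pipeline prints.
val=$(printf '%s\n' "$y" | tr '[:upper:]' '[:lower:]')
string="$val world"

printf '%s\n' "$broken" "$string"
```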
If you have Bash 4.0 or higher, a more efficient & elegant way to do it is to use Bash built-in string manipulation:
y="HELLO"
string="${y,,} world"
Note that some tr implementations (GNU tr, for instance) operate on single bytes, so a tr-based solution can fail when facing non-ASCII, multibyte characters.
Depending on the Bash version and locale, the same can go for the Bash 4-based ${x,,} solution.
AWK implementations with multibyte support, such as GNU awk, on the other hand handle UTF-8 input properly.
y="HELLO"
val=$(echo "$y" | awk '{print tolower($0)}')
string="$val world"
Answer courtesy of liborw.
Execute in backticks:
x=`echo "$y" | tr '[:upper:]' '[:lower:]'`
This assigns the result of the command in backticks to the variable x. (I.e., it's not particular to tr, but it is a common pattern/solution for shell scripting.)
You can use $(..) instead of the backticks.
I'm on Ubuntu 14.04 (Trusty Tahr), with Bash version 4.3.11. However, I still don't have the fun built-in string manipulation ${y,,}
This is what I used in my script to force capitalization:
CAPITALIZED=`echo "${y}" | tr '[a-z]' '[A-Z]'`
If you define your variable using declare (old: typeset) then you can state the case of the value throughout the variable's use.
declare -u FOO=AbCxxx
echo $FOO
Output:
ABCXXX
Option -l to declare does lowercase:
When the variable is assigned a value, all upper-case characters are converted to lower-case. The upper-case attribute is disabled.
Building on Rody's answer, this worked for me.
y="HELLO"
val=$(echo $y | tr '[:upper:]' '[:lower:]')
string="$val world"
One small modification: if you are using underscore next to the variable, you need to encapsulate the variable name in {}.
string="${val}_world"
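The reason is that an underscore is a valid character in a variable name, so without braces the shell looks up a different variable. A quick sketch, with wrong and right as illustrative names:

```shell
val=hello
unset val_world            # make sure no such variable exists for the demo

wrong="$val_world"         # expands the unset variable val_world: empty
right="${val}_world"       # expands val, then appends _world

printf 'wrong=[%s] right=[%s]\n' "$wrong" "$right"
```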
