Add suffix to comma-separated strings in bash ecosystem

Is there a way to transform a comma-delimited variable so that a suffix is added to each token, using standard GNU tools? For example:
VARIABLE="aaa,bbb,ccc"
suffix="-foo"
Expected output: aaa-foo,bbb-foo,ccc-foo
Additionally, if there is only one token, the transformation should behave the same way,
e.g. aaa -> aaa-foo

echo "aaa,bbb,ccc" | sed -E 's/([^,]+)/\1-foo/g'
It matches each run of characters that are not "," and appends -foo to it.
With variables:
suffix="-foo"; VARIABLE="aaa,bbb,ccc"; echo ${VARIABLE} | sed -E "s/([^,]+)/\1${suffix}/g"

echo "$VARIABLE" | tr "," "\n" | awk '{print $1"-foo"}' | paste -sd "," -
Explanation:
Put each token on its own line:
tr "," "\n"
Append "-foo" to each token:
awk '{print $1"-foo"}'
Join the lines back together with commas:
paste -sd "," -

Try:
answer=$(echo "$VARIABLE" | sed "s/,/-foo,/g" | sed "s/$/-foo/")
If you need to have the suffix as a variable then try:
answer=$(echo "$VARIABLE" | sed "s/,/${suffix},/g" | sed "s/$/${suffix}/")
I don't have access to a Unix box at the moment to prove this works.

The following:
s="aaa,bbb,ccc"
IFS=,
a=( $s )
mapfile -t b < <(printf '%s-foo\n' "${a[@]}")
should give us:
$ declare -p b
declare -a b=([0]="aaa-foo" [1]="bbb-foo" [2]="ccc-foo")
From there, you can reconstruct the original format in a number of ways...
IFS=, eval 'JOINED="${b[*]}"'
Or if you don't like using eval, perhaps:
d=""; o=""
for x in "${b[@]}"; do
printf -v o '%s%s%s' "$o" "$d" "$x"
d=,
done
... which will put the complete modified string in $o.
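The split, append, and join steps above can also be wrapped in a function so that the IFS change does not leak into the rest of the script. This is only a sketch: join_with_suffix is a made-up name, and it assumes the tokens contain no newlines.

```shell
#!/usr/bin/env bash
# join_with_suffix is a hypothetical helper name used only for this sketch.
join_with_suffix() {
  local IFS=,                 # local, so the caller's IFS is left untouched
  local -a tokens
  read -r -a tokens <<< "$1"  # split the first argument on commas
  # ${tokens[@]/%/$2} appends $2 to every element of the array.
  local -a suffixed=( "${tokens[@]/%/$2}" )
  # "${suffixed[*]}" joins the elements with the first character of IFS.
  printf '%s\n' "${suffixed[*]}"
}

join_with_suffix "aaa,bbb,ccc" "-foo"
```

Because IFS is declared local, neither eval nor a manual join loop is needed here.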

With bash Parameter Expansion
var='aaa,bbb,ccc';[ -n "$var" ] && printf "%s\n" "${var//,/-foo,}-foo"
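The same expansion with the suffix held in a variable might look like the sketch below. add_suffix is a hypothetical name, and the sketch assumes the suffix contains no & character, which newer bash versions may treat specially in the replacement part of the expansion.

```shell
#!/usr/bin/env bash
# add_suffix is a hypothetical helper name, not part of the original answer.
add_suffix() {
  local list=$1 suffix=$2
  # An empty list has no tokens, so print nothing.
  [ -z "$list" ] && return 0
  # Put the suffix in front of every comma, then once more after the last token.
  printf '%s\n' "${list//,/$suffix,}$suffix"
}

add_suffix "aaa,bbb,ccc" "-foo"
add_suffix "aaa" "-foo"
```

A single token contains no comma, so only the trailing suffix is added, which covers the one-token case from the question.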

Related

How to convert a string to lower case in Bash, when the string is potentially -e, -E or -n? [duplicate]

In this question: How to convert a string to lower case in Bash?
The accepted answer is:
tr:
echo "$a" | tr '[:upper:]' '[:lower:]'
awk:
echo "$a" | awk '{print tolower($0)}'
Neither of these solutions works if $a is -e, -E, or -n.
Would this be a more appropriate solution?
echo "#$a" | sed 's/^#//' | tr '[:upper:]' '[:lower:]'

Use
printf '%s\n' "$a" | tr '[:upper:]' '[:lower:]'

Don't bother with tr. Since you're using bash, just use the ,, operator in parameter expansion:
$ a='-e BAR'
$ printf "%s\n" "${a,,?}"
-e bar
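Combining printf with the expansion gives a small helper that should be safe for the problem values from the question. This is a sketch assuming bash 4 or later; to_lower is a made-up name.

```shell
#!/usr/bin/env bash
# to_lower is a hypothetical helper name. ${var,,} needs bash 4 or later.
to_lower() {
  # With an explicit format string, printf never parses its data as options,
  # so values such as -e, -E, or -n are printed literally.
  printf '%s\n' "${1,,}"
}

for value in '-E' '-N' 'Hello World'; do
  to_lower "$value"
done
```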

Using typeset (or declare) you can define a variable to automatically convert data to lower case when assigning to the variable, eg:
$ a='-E'
$ printf "%s\n" "${a}"
-E
$ typeset -l a # change attribute of variable 'a' to automatically convert assigned data to lowercase
$ printf "%s\n" "${a}" # lowercase doesn't apply to already assigned data
-E
$ a='-E' # but for new assignments to variable 'a'
$ printf "%s\n" "${a}" # we can see that the data is
-e # converted to lowercase
If you need to maintain case sensitivity of the current variable you can always define a new variable to hold the lowercase value, eg:
$ typeset -l lower_a
$ lower_a="${a}" # convert data to lowercase upon assignment to variable 'lower_a'
$ printf "%s\n" "${lower_a}"
-e

Unix Script loop through individual variables in a list and execute code

I have been busting my head all day long without coming up with a successful solution.
Setup:
We have Linux RHEL 8.3 and a file, script.sh
There is an environment variable set by an application with a dynamic string in it.
export PROGARM_VAR="abc10,def20,ghi30"
The delimiter is always "," and the number of values varies from 1 to 20.
Inside the script I have defined 20 variables to hold the values.
Using the "cut" command, I take each value and assign it to a variable:
var1=$(echo $PROGARM_VAR | cut -f1 -d,)
var2=$(echo $PROGARM_VAR | cut -f2 -d,)
var3=$(echo $PROGARM_VAR | cut -f3 -d,)
var4=$(echo $PROGARM_VAR | cut -f4 -d,)
etc
In our case we will have:
var1="abc10" var2="def20" var3="ghi30" and var4="" which is empty
The loop must take each variable, test whether it is non-empty, and execute 10 pages of code using the tested variable. When it reaches an empty variable it should break.
Could you give me a hand please?
Thank you

Just split it with a comma. There are endless possibilities. You could:
10_pages_of_code() { echo "$1"; }
IFS=, read -r -a vars <<<"abc10,def20,ghi30"
for i in "${vars[@]}"; do 10_pages_of_code "$i"; done
or:
printf "%s" "abc10,def20,ghi30" | xargs -n1 -d, bash -c 'echo 10_pages_of_code "$1"' _
Safer code could use readarray instead of read to properly handle newlines in values, but I doubt that matters for you:
IFS= readarray -d , -t vars < <(printf "%s" "abc10,def20,ghi30")
You could also read the values from a stream:
while IFS= read -r -d, var || [[ -n "$var" ]]; do
10_pages_of_code "$var"
done < <(printf "%s" "abc10,def20,ghi30")
But still you could do it with cut... just actually write a loop and use an iterator.
i=1   # cut numbers its fields starting from 1
while var=$(printf "%s\n" "$PROGARM_VAR" | cut -f"$i" -d,) && [[ -n "$var" ]]; do
10_pages_of_code "$var"
((i++))
done

or
echo "$PROGARM_VAR" | tr , \\n | while read var; do
: something with $var
done
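Putting the pieces together for the setup in the question, a minimal sketch could look like this. process_item is a placeholder for the 10 pages of code, and the variable is given a sample value here only so the snippet runs on its own.

```shell
#!/usr/bin/env bash
# process_item is a placeholder name standing in for the "10 pages of code".
process_item() {
  printf 'processing %s\n' "$1"
}

# Sample value for illustration; normally the application sets this.
PROGARM_VAR="abc10,def20,ghi30"

# Split once into an array instead of calling cut for each of 20 variables.
IFS=, read -r -a items <<< "$PROGARM_VAR"

for item in "${items[@]}"; do
  # Stop at the first empty value, as the question asks.
  [[ -z $item ]] && break
  process_item "$item"
done
```

The array only ever holds as many elements as there are values, so no fixed set of 20 variables is needed.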

How to extract numbers from a string?

I have a string that contains a path:
string="toto.titi.12.tata.2.abc.def"
I want to extract only the numbers from this string.
To extract the first number:
tmp="${string#toto.titi.*.}"
num1="${tmp%.tata*}"
To extract the second number:
tmp="${string#toto.titi.*.tata.*.}"
num2="${tmp%.abc.def}"
So extracting each number takes two steps. How can I extract a number in one step?

You can use tr to delete all of the non-digit characters, like so:
echo toto.titi.12.tata.2.abc.def | tr -d -c 0-9

To extract all the individual numbers and print one number per line, pipe the input through:
tr '\n' ' ' | sed -e 's/[^0-9]/ /g' -e 's/^ *//g' -e 's/ *$//g' | tr -s ' ' | sed 's/ /\n/g'
Breakdown:
Replaces all line breaks with spaces: tr '\n' ' '
Replaces all non numbers with spaces: sed -e 's/[^0-9]/ /g'
Remove leading white space: -e 's/^ *//g'
Remove trailing white space: -e 's/ *$//g'
Squeeze spaces in sequence to 1 space: tr -s ' '
Replace remaining space separators with line break: sed 's/ /\n/g'
Example:
echo -e " this 20 is 2sen\nten324ce 2 sort of" | tr '\n' ' ' | sed -e 's/[^0-9]/ /g' -e 's/^ *//g' -e 's/ *$//g' | tr -s ' ' | sed 's/ /\n/g'
Will print out
20
2
324
2

Here is a short one:
string="toto.titi.12.tata.2.abc.def"
id=$(echo "$string" | grep -o -E '[0-9]+')
echo $id   # => output: 12 2
with a space between the numbers.
Hope it helps...

Parameter expansion would seem to be the order of the day.
$ string="toto.titi.12.tata.2.abc.def"
$ read num1 num2 <<<${string//[^0-9]/ }
$ echo "$num1 / $num2"
12 / 2
This of course depends on the format of $string. But at least for the example you've provided, it seems to work.
This may be superior to anubhava's awk solution which requires a subshell. I also like chepner's solution, but regular expressions are "heavier" than parameter expansion (though obviously way more precise). (Note that in the expression above, [^0-9] may look like a regex atom, but it is not.)
You can read about this form of Parameter Expansion in the bash man page. Note that ${string//this/that} (as well as the <<<) is a bashism, and is not compatible with traditional Bourne or POSIX shells.

This would be easier to answer if you provided exactly the output you're looking to get. If you mean you want to get just the digits out of the string, and remove everything else, you can do this:
d@AirBox:~$ string="toto.titi.12.tata.2.abc.def"
d@AirBox:~$ echo "${string//[a-z,.]/}"
122
If you clarify a bit I may be able to help more.

You can also use sed:
echo "toto.titi.12.tata.2.abc.def" | sed 's/[0-9]*//g'
Here, sed replaces
any digits (class [0-9])
repeated any number of times (*)
with nothing (nothing between the second and third /),
and g stands for globally.
Output will be:
toto.titi..tata..abc.def

Convert your string to an array like this:
$ str="toto.titi.12.tata.2.abc.def"
$ arr=( ${str//[!0-9]/ } )
$ echo "${arr[@]}"
12 2

Use regular expression matching:
string="toto.titi.12.tata.2.abc.def"
[[ $string =~ toto\.titi\.([0-9]+)\.tata\.([0-9]+)\. ]]
# BASH_REMATCH[0] would be "toto.titi.12.tata.2.", the entire match
# Successive elements of the array correspond to the parenthesized
# subexpressions, in left-to-right order. (If there are nested parentheses,
# they are numbered in depth-first order.)
first_number=${BASH_REMATCH[1]}
second_number=${BASH_REMATCH[2]}
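If the surrounding words are not known in advance, the same =~ operator can be applied in a loop to collect every run of digits. The following is a sketch rather than a drop-in solution; extract_numbers and the numbers array are names invented for the example.

```shell
#!/usr/bin/env bash
# extract_numbers is a hypothetical helper; it fills the global array "numbers".
extract_numbers() {
  local rest=$1
  numbers=()
  # Match the leftmost run of digits, store it, then drop everything
  # up to and including that run before looking for the next one.
  while [[ $rest =~ [0-9]+ ]]; do
    numbers+=( "${BASH_REMATCH[0]}" )
    rest=${rest#*"${BASH_REMATCH[0]}"}
  done
}

extract_numbers "toto.titi.12.tata.2.abc.def"
printf '%s\n' "${numbers[@]}"
```

For the sample string this leaves 12 in numbers[0] and 2 in numbers[1].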

Using awk:
arr=( $(echo $string | awk -F "." '{print $3, $5}') )
num1=${arr[0]}
num2=${arr[1]}

Adding yet another way to do this, using cut:
echo $string | cut -d'.' -f3,5 | tr '.' ' '
This gives you the following output:
12 2

Fixing newline issue (for mac terminal):
cat temp.txt | tr '\n' ' ' | sed -e 's/[^0-9]/ /g' -e 's/^ *//g' -e 's/ *$//g' | tr -s ' ' | sed $'s/ /\\\n/g'

Assumptions:
there is no embedded white space
the string of text always has 7 period-delimited strings
the string always contains numbers in the 3rd and 5th period-delimited positions
One bash idea that does not require spawning any subprocesses:
$ string="toto.titi.12.tata.2.abc.def"
$ IFS=. read -r x1 x2 num1 x3 num2 rest <<< "${string}"
$ typeset -p num1 num2
declare -- num1="12"
declare -- num2="2"
In a comment OP has stated they wish to extract only one number at a time; the same approach can still be used, eg:
$ string="toto.titi.12.tata.2.abc.def"
$ IFS=. read -r x1 x2 num1 rest <<< "${string}"
$ typeset -p num1
declare -- num1="12"
$ IFS=. read -r x1 x2 x3 x4 num2 rest <<< "${string}"
$ typeset -p num2
declare -- num2="2"
A variation on anubhava's answer that uses parameter expansion instead of a subprocess call to awk, and still working with the same set of initial assumptions:
$ arr=( ${string//./ } )
$ num1=${arr[2]}
$ num2=${arr[4]}
$ typeset -p num1 num2
declare -- num1="12"
declare -- num2="2"

How to split a list by comma not space

I want to split a text with comma , not space in for foo in list. Suppose I have a CSV file CSV_File with following text inside it:
Hello,World,Questions,Answers,bash shell,script
...
I used following code to split it into several words:
for word in $(cat CSV_File | sed -n 1'p' | tr ',' '\n')
do echo $word
done
It prints:
Hello
World
Questions
Answers
bash
shell
script
But I want it to split the text by commas not spaces:
Hello
World
Questions
Answers
bash shell
script
How can I achieve this in bash?

Set IFS to ,:
sorin@sorin:~$ IFS=',' ;for i in `echo "Hello,World,Questions,Answers,bash shell,script"`; do echo $i; done
Hello
World
Questions
Answers
bash shell
script
sorin@sorin:~$

Using a command substitution to parse the words undoes all the work you are doing to keep the words with spaces together.
Try instead:
cat CSV_File | sed -n 1'p' | tr ',' '\n' | while read word; do
echo $word
done
That also increases parallelism. Using a subshell as in your question forces the entire subshell process to finish before you can start iterating over the answers. Piping to a subshell (as in my answer) lets them work in parallel. This matters only if you have many lines in the file, of course.

I think the canonical method is:
while IFS=, read field1 field2 field3 field4 field5 field6; do
: do stuff with the fields
done < CSV.file
If you don't know or don't care about how many fields there are:
IFS=,
while read line; do
# split into an array
field=( $line )
for word in "${field[@]}"; do echo "$word"; done
# or use the positional parameters
set -- $line
for word in "$@"; do echo "$word"; done
done < CSV.file
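A variant that avoids changing IFS for the whole script is to set it only for the read command and let read build the array directly. This is a sketch with sample data written to a temporary file; the file and the words array exist only for the example.

```shell
#!/usr/bin/env bash
# Sample data in a temporary file, created only for this example.
csv_file=$(mktemp)
printf '%s\n' 'Hello,World,Questions,Answers,bash shell,script' > "$csv_file"

words=()
# IFS=, applies to this read command only; the rest of the script keeps the default IFS.
while IFS=, read -r -a fields; do
  for word in "${fields[@]}"; do
    printf '%s\n' "$word"
    words+=( "$word" )   # keep a copy for later use
  done
done < "$csv_file"

rm -f "$csv_file"
```

Because the elements are expanded with quotes, "bash shell" stays a single item.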

kent$ echo "Hello,World,Questions,Answers,bash shell,script"|awk -F, '{for (i=1;i<=NF;i++)print $i}'
Hello
World
Questions
Answers
bash shell
script

Create a bash function
split_on_commas() {
local IFS=,
local WORD_LIST=($1)
for word in "${WORD_LIST[@]}"; do
echo "$word"
done
}
split_on_commas "this,is a,list" | while read item; do
# Custom logic goes here
echo Item: ${item}
done
... this generates the following output:
Item: this
Item: is a
Item: list
(Note, this answer has been updated according to some feedback)

Read: http://linuxmanpages.com/man1/sh.1.php
& http://www.gnu.org/s/hello/manual/autoconf/Special-Shell-Variables.html
IFS The Internal Field Separator that is used for word splitting
after expansion and to split lines into words with the read
builtin command. The default value is ``<space><tab><newline>''.
IFS is a shell environment variable so it will remain unchanged within the context of your Shell script but not otherwise, unless you EXPORT it. ALSO BE AWARE, that IFS will not likely be inherited from your Environment at all: see this gnu post for the reasons and more info on IFS.
Your code written like this:
IFS=","
for word in $(cat tmptest | sed -n 1'p' | tr ',' '\n'); do echo $word; done;
should work, I tested it on command line.
sh-3.2#IFS=","
sh-3.2#for word in $(cat tmptest | sed -n 1'p' | tr ',' '\n'); do echo $word; done;
World
Questions
Answers
bash shell
script

You can use:
cat f.csv | sed 's/,/ /g' | awk '{print $1 " / " $4}'
or
echo "Hello,World,Questions,Answers,bash shell,script" | sed 's/,/ /g' | awk '{print $1 " / " $4}'
This is the part that replace comma with space
sed 's/,/ /g'

For me, splitting into an array is simpler:
IN="bla@some.com;john@home.com"
arrIN=(${IN//;/ })
echo ${arrIN[1]}

Using readarray(mapfile):
$ cat csf
Hello,World,Questions,Answers,bash shell,script
$ readarray -td, arr < csf
$ printf '%s\n' "${arr[@]}"
Hello
World
Questions
Answers
bash shell
script
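One thing to watch with this approach: when the input comes straight from a file, the newline at the end of the line is not a delimiter, so it stays attached to the last element. A sketch that feeds the line through printf '%s' instead, assuming bash 4.4 or later for readarray -d:

```shell
#!/usr/bin/env bash
line='Hello,World,Questions,Answers,bash shell,script'

# printf '%s' emits no trailing newline, so the last element stays clean.
# The -d option of readarray needs bash 4.4 or later.
readarray -t -d , arr < <(printf '%s' "$line")

# Brackets make any stray whitespace in an element visible.
printf '[%s]\n' "${arr[@]}"
```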

linux shell title case

I am writing a shell script and have a variable like this: something-that-is-hyphenated.
I need to use it in various points in the script as:
something-that-is-hyphenated, somethingthatishyphenated, SomethingThatIsHyphenated
I have managed to change it to somethingthatishyphenated by stripping out - using sed "s/-//g".
I am sure there is a simpler way, and also, need to know how to get the camel cased version.
Edit: Working function derived from @Michał's answer
function hyphenToCamel {
tr '-' '\n' | awk '{printf "%s%s", toupper(substr($0,1,1)), substr($0,2)}'
}
CAMEL=$(echo something-that-is-hyphenated | hyphenToCamel)
echo $CAMEL
Edit: Finally, a sed one liner thanks to @glenn
echo a-hyphenated-string | sed -E "s/(^|-)([a-z])/\u\2/g"

a GNU sed one-liner
echo something-that-is-hyphenated |
sed -e 's/-\([a-z]\)/\u\1/g' -e 's/^[a-z]/\u&/'
\u in the replacement string is documented in the sed manual.

Pure bashism:
var0=something-that-is-hyphenated
var1=(${var0//-/ })
var2=${var1[*]^}
var3=${var2// /}
echo $var3
SomethingThatIsHyphenated
Line 1 is trivial.
Line 2 is the bashism for replaceAll or 's/-/ /g', wrapped in parens, to build an array.
Line 3 uses ${foo^}, which means uppercase (while ${foo,} would mean 'lowercase' [note, how ^ points up while , points down]) but to operate on every first letter of a word, we address the whole array with ${foo[*]} (or ${foo[@]}, if you would prefer that).
Line 4 is again a replace-all: blank with nothing.
Line 5 is trivial again.

You can define a function:
hypenToCamel() {
tr '-' '\n' | awk '{printf "%s%s", toupper(substr($0,1,1)), substr($0,2)}'
}
CAMEL=$(echo something-that-is-hyphenated | hypenToCamel)
echo $CAMEL

In the shell you are stuck with being messy:
aa="aaa-aaa-bbb-bbb"
echo " $aa" | sed -e 's/--*/ /g' -e 's/ a/A/g' -e 's/ b/B/g' ... -e 's/ *//g'
Note the carefully placed space in the echo and the double space in the last -e.
I leave it as an exercise to complete the code.
In perl it is a bit easier as a one-line shell command:
perl -e 'print map{ $a = ucfirst; $a =~ s/ +//g; $a} split( /-+/, $ARGV[0] ), "\n"' $aa

For the record, here's a pure Bash safe method (that is not subject to pathname expansion), using Bash≥4:
var0=something-that-is-hyphenated
IFS=- read -r -d '' -a var1 < <(printf '%s\0' "${var0,,}")
printf '%s' "${var1[@]^}"
This (safely) splits the lowercase expansion of var0 at the hyphens, with each split part in array var1. Then we use the ^ parameter expansion to uppercase the first character of the fields of this array, and concatenate them.
If your variable may also contain spaces and you want to act on them too, change IFS=- into IFS='- '.
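Wrapped in a function, the split and capitalize steps might look like the sketch below. hyphen_to_camel is a made-up name, the sketch assumes bash 4 or later, and unlike the version above it does not lowercase the input first.

```shell
#!/usr/bin/env bash
# hyphen_to_camel is a hypothetical name; ${var^} needs bash 4 or later.
hyphen_to_camel() {
  local IFS=-                # split on hyphens, only inside this function
  local -a parts
  read -r -a parts <<< "$1"
  # ${parts[@]^} uppercases the first character of each element;
  # printf reuses its '%s' format for every argument, concatenating them.
  printf '%s' "${parts[@]^}"
  printf '\n'
}

name=something-that-is-hyphenated
hyphen_to_camel "$name"
# The plain concatenated form needs only one expansion.
printf '%s\n' "${name//-/}"
```

This covers the other two forms asked for in the question: the capitalized one via the function and the stripped one via ${name//-/}.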
