split string to array where length is not fixed (not bash) - linux

How is it possible to split a string to an array?
#!/bin/sh
eg.
str="field 1,field 2,field 3,field 4"
The number of fields varies.
I have found a lot of solutions, but they only work in bash.
update
This only works if the string has exactly 4 fields, but what if it has 10?
The for loop doesn't seem to work either:
arr=$(echo "field 1,field 2,field 3,field 4" | awk '{split($0,a,","); print a[1],a[2],a[3],a[4]}');
for value in ${arr[@]}
do
echo "$value\n"
done

To get the split into variables in dash (that doesn't support arrays) use:
string="field 1,field 2,field 3,field 4"
IFS=","
set -- $string
for val
do
echo "$val"
done
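Note that the snippet above leaves IFS set to a comma and lets the shell glob-expand a field such as *. A minimal sketch that confines both effects to a subshell could look like this; the function name split_fields is made up for illustration, and a POSIX sh such as dash is assumed:

```shell
# Hypothetical helper (not part of the original answer): print each
# comma-separated field of "$1" on its own line.
# The ( ... ) function body runs in a subshell, so the IFS change, "set -f"
# and the new positional parameters never leak into the caller.
split_fields() (
    set -f          # disable pathname expansion so a field like * stays literal
    IFS=,
    set -- $1       # unquoted on purpose: field splitting happens here
    for field do
        printf '%s\n' "$field"
    done
)

split_fields "field 1,field 2,field 3,field 4"
```

Because the function prints one field per line, it only suits fields that do not themselves contain newlines.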

In bash you can do this:
str="field 1,field 2,field 3,field 4"
IFS=, array=($str)
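IFS is the internal field separator; the shell splits unquoted expansions on its characters. Note that this assignment leaves IFS set to a comma for the rest of the script.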
Zsh is much more elegant:
array=(${(s:,:)str})
You can even do it directly
% for i in "${(s:,:)str[@]}"; do echo $i; done
field 1
field 2
field 3
field 4

Related

How can I truncate a line of text longer than a given length?

How would you go about removing everything after x number of characters? For example, cut everything after 15 characters and add ... to it.
This is an example sentence should turn into This is an exam...
GNU head can take a byte count instead of a line count:
head -c 15 <<<'This is an example sentence'
Keep in mind that head -c counts bytes, not characters, so it can cut a multi-byte character such as a UTF-8 encoded ü in half.
Bash built-in string indexing works:
str='This is an example sentence'
echo "${str:0:15}"
Output:
This is an exam
And finally something that works with ksh, dash, zsh…:
printf '%.15s\n' 'This is an example sentence'
Even programmatically:
n=15
printf '%.*s\n' $n 'This is an example sentence'
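The question also asks for "..." to be appended. A rough sketch that combines the printf precision trick with a length check follows; the name truncate_text is hypothetical, and single-byte text is assumed, because shells differ on whether ${#var} and %.Ns count bytes or characters:

```shell
# Hypothetical helper: truncate_text MAX TEXT
# Prints TEXT cut to MAX characters followed by "..." when TEXT is longer
# than MAX; otherwise prints TEXT unchanged.
truncate_text() {
    if [ "${#2}" -gt "$1" ]; then
        printf '%.*s...\n' "$1" "$2"
    else
        printf '%s\n' "$2"
    fi
}

truncate_text 15 'This is an example sentence'
truncate_text 15 'short string'
```

The first call prints This is an exam... and the second prints short string unchanged.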
If you are using Bash, you can directly assign the output of printf to a variable and save a sub-shell call with:
trim_length=15
full_string='This is an example sentence'
printf -v trimmed_string '%.*s' $trim_length "$full_string"
Use sed:
echo 'some long string value' | sed 's/\(.\{15\}\).*/\1.../'
Output:
some long strin...
This solution has the advantage that short strings do not get the ... tail added:
echo 'short string' | sed 's/\(.\{15\}\).*/\1.../'
Output:
short string
So it's one solution for all sized outputs.
Use cut:
echo "This is an example sentence" | cut -c1-15
This is an exam
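This selects characters 1-15. Per cut(1), -c is meant to count characters rather than bytes, although GNU cut has historically treated -c the same as -b, so multi-byte characters may still be split there: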
-b, --bytes=LIST
select only these bytes
-c, --characters=LIST
select only these characters
Awk can also accomplish this:
$ echo 'some long string value' | awk '{print substr($0, 1, 15) "..."}'
some long strin...
In awk, $0 is the current line. substr($0, 1, 15) extracts characters 1 through 15 from $0. The trailing "..." appends three dots.
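As written, the awk command appends "..." even to lines shorter than 15 characters. A possible variant that only appends the dots when something was actually cut off is sketched below; the wrapper name awk_truncate is an assumption made for this example:

```shell
# Hypothetical wrapper: awk_truncate MAX  (reads lines from standard input)
# Lines longer than MAX are cut to MAX characters and get "..." appended;
# shorter lines pass through unchanged.
awk_truncate() {
    awk -v max="$1" '{
        if (length($0) > max) {
            print substr($0, 1, max) "..."
        } else {
            print
        }
    }'
}

echo 'some long string value' | awk_truncate 15
echo 'short string' | awk_truncate 15
```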
Todd actually has a good answer; however, I chose to change it up a little to make the function better and remove unnecessary parts :p
trim() {
    if (( "${#1}" > "$2" )); then
        echo "${1:0:$2}$3"
    else
        echo "$1"
    fi
}
In this version, the first argument is the text itself, the second is the maximum length, and the third is the text appended to strings that get truncated.
No need for variables :)
Using Bash Shell Expansions (No External Commands)
If you don't care about shell portability, you can do this entirely within Bash using a number of different shell expansions in the printf builtin. This avoids shelling out to external commands. For example:
trim () {
    local str ellipsis_utf8
    local -i maxlen
    # use explaining variables; avoid magic numbers
    str="$*"
    maxlen="15"
    ellipsis_utf8=$'\u2026'
    # only truncate $str when longer than $maxlen
    if (( "${#str}" > "$maxlen" )); then
        printf "%s%s\n" "${str:0:$maxlen}" "${ellipsis_utf8}"
    else
        printf "%s\n" "$str"
    fi
}
trim "This is an example sentence." # This is an exam…
trim "Short sentence." # Short sentence.
trim "-n Flag-like strings." # -n Flag-like st…
trim "With interstitial -E flag." # With interstiti…
You can also loop through an entire file this way. Given a file containing the same sentences above (one per line), you can use the read builtin's default REPLY variable as follows:
while IFS= read -r; do
    trim "$REPLY"
done < example.txt
Whether or not this approach is faster or easier to read is debatable, but it's 100% Bash and executes without forks or subshells.

concatenate variable into single variable separated by comma with for loop

I want to add the values to a variable, separated by comma, using for loop.
First values should remain first and so on.
for ((i=0; i<${#MYARRAY[@]}; i++));
do
ALL=$ALL${MYARRAY[$i]},
done
echo $ALL
I expect the output
val1,val2,val3
but the actual output is
val1,val2,val3,
How to avoid the comma after the last value?
Just add one of the three statements after your for loop:
ALL=${ALL%,}
ALL=${ALL::-1}
ALL=${ALL%?}
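Of these, ${ALL%,} and ${ALL%?} are plain POSIX parameter expansions, while ${ALL::-1} needs Bash 4.2 or later. A small worked example of the first form, using a plain word list in place of the array from the question (an assumption made so that it also runs in sh):

```shell
ALL=
for value in val1 val2 val3; do
    ALL=$ALL$value,        # same pattern as the question: value, then comma
done
ALL=${ALL%,}               # strip one trailing comma, if there is one
echo "$ALL"
```

This prints val1,val2,val3. Unlike ${ALL%?}, the ${ALL%,} form leaves the string alone when it does not end in a comma.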
Another option is with the translate (tr) command. For example:
$ myarray=(val1 val2 val3 val4)
$ echo ${myarray[*]}
val1 val2 val3 val4
$ myarray=$(echo ${myarray[*]} | tr ' ' ,) # Replace space with ','
$ echo $myarray # Gives what you need
val1,val2,val3,val4
https://www.tldp.org/LDP/abs/html/string-manipulation.html is a good source. Insert the following line after the loop.
ALL=${ALL%,}
In this example, the first iteration does not put a comma in $ALL. In the following iteration, a comma is placed before the value. This way, there won't be any comma at the end of the output string.
MYARRAY=(val val val)
for (( i=0; i<${#MYARRAY[@]}; i++ ))
do
    if [ "$i" -eq 0 ]
    then
        ALL=$ALL${MYARRAY[$i]}
    else
        ALL=$ALL,${MYARRAY[$i]}
    fi
done
echo "$ALL"
This is exactly what the [*] construct is for:
myarray=(val1 val2 val3 val4)
oldIFS="$IFS"
IFS=','
echo "${myarray[*]}"
IFS="$oldIFS"
gives:
val1,val2,val3,val4
I am using lowercase myarray because uppercase should be reserved for system (bash) variables.
Note that "${myarray[*]}" must be inside double-quotes, otherwise you do not get the join magic. The elements are joined by the first character of IFS, which by default is a space.
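The same join also works on the positional parameters through "$*", which avoids saving and restoring IFS by hand. A minimal sketch, with the hypothetical name join_by_comma:

```shell
# Hypothetical helper: join all arguments with commas.
# The ( ... ) function body runs in a subshell, so the IFS change stays local.
join_by_comma() (
    IFS=,
    printf '%s\n' "$*"     # "$*" joins with the first character of IFS
)

join_by_comma val1 val2 "val 3"
```

This prints val1,val2,val 3. In Bash it can be called on an array as join_by_comma "${myarray[@]}".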

A pure Bash string-cutting script that does not work reliably

The solutions already posted that use awk or sed are quite standard and are a good fallback when something else does not work correctly.
For example:
StringStr="ValueA:ValueB,ValueC:ValueC" ;
echo ${StringStr} | gawk -F',' 'BEGIN{}{for(intx=1;intx<=NF;intx++){printf("%s\n",$(intx))}}END{}'
produces the expected result. But a restricted user who can log in to their account with fewer options, for instance one who is not allowed to use awk or gawk, needs something that works every time.
For efficiency, I am developing my own Bash function library on github.com, and I ran into a technique that does not work as expected. Here is a working example first:
This technique uses Bash's 'Remove matching prefix pattern' and 'Remove matching suffix pattern' expansions. The goal is to take a string of chained information and extract the inserted elements using shell features that are as simple as possible.
First, I build a string in a specific format:
Ex:
StringPattern="__VALUE1__:__VALUE2__,"
The format assumes that many patterns of type StringPattern are chained together.
The trailing ',' is used to split the string back into its
VALUE1:VALUE2 parts.
For instance, StringStorage accumulates several StringPattern values. Here are 2 examples:
1 - sample 1
StringPattern="VariableA:InformationA,"
StringStorage="${StringStorage}${StringPattern}" ;
2 - sample 2
StringPattern="VariableB:InformationB,"
StringStorage="${StringStorage}${StringPattern}" ;
At this point, StringStorage holds this information:
StringStorage="VariableA:InformationA,VariableB:InformationB,"
With StringStorage, the following Bash algorithm, a mix of 'Remove matching prefix pattern' and 'Remove matching suffix pattern', works for this case:
### IntCsvCount: remove every ',' from StringStorage and subtract the
### length of the result from the original length. This gives the number
### of separators, here IntCsvCount == 2.
IntCsvCount=$( cstr=${StringStorage//,/} ; echo $(( ${#StringStorage} - ${#cstr} )) ) ;
### astr: working copy consumed by the loop (assumed; the loop below reads astr).
astr="${StringStorage}" ;
### bstr: receives each extracted segment.
bstr="" ;
### Loop from 0 to the last element, i.e. ${IntCsvCount}-1, which is 1 in
### my example.
for (( intx=0 ; intx <= ${IntCsvCount}-1 ; intx++ )) ; do
    ### Extract the first segment with 'Remove matching suffix pattern'
    ### ${parameter%word}, where word is ${astr#*,} ('Remove matching
    ### prefix pattern'): everything in $astr after the first ','.
    bstr=${astr%*${astr#*,}} ;
    ### Drop the $bstr part from astr by taking the substring that starts
    ### at offset ${#bstr} and has length ${#astr} - ${#bstr}.
    astr=${astr:${#bstr}:$(( ${#astr} - ${#bstr}))} ;
    echo -ne "Element: ${bstr}\n" ;
done
This should produce the following output:
Element: VariableA:InformationA,
Element: VariableB:InformationB,
Putting this into a function only requires changing the separator to ':' in order to extract 'VariableA' and 'InformationA'.
The problem starts with a non-uniform string. As seen on this board, cutting a part out of a sentence should work on non-uniform strings, but here is a sample that does not work. I have already been advised more than once to use gawk, sed, or even cut, but this algorithm does not work with this sample:
astr="master|ZenityShellEval|Variable declaration|Added Zenity font support to allow choosing both font-name and size and parsing the zenition option, notice --font option require a space between font and size.|20170127|"
coming from
astr=$( zenity --width=640 --height=600 --forms --show-header --text="Commit Message" --add-entry="Branch name" --add-entry="function" --add-entry="section" --add-entry="commit Message" --add-calendar="Commit Date" --forms-date-format="%Y%m%d" --separator='|' ) ;
I also force the output to look like StringPattern by appending a trailing separator:
astr="${astr}|" ;
The code is the same, except that the separator was changed from ',' to '|':
IntCsvCount=$( cstr=${astr//|/} ; echo $(( ${#astr} - ${#cstr} )) ) ;
bstr="" ;
for (( intx=0 ; intx <= ${IntCsvCount}-1 ; intx++ )) ; do
bstr=${astr%*${astr#*|}} ;
astr=${astr:${#bstr}:$(( ${#astr} - ${#bstr}))} ;
echo -ne "Element: ${bstr}\n" ;
done
This time it generates the following output:
Element:master|ZenityShellEval|Variable declaration|Added Zenity font support to allow choosing both font-name and size and parsing the zenition option, notice --font option require a space between font and size.|20170127|
Element:
Element:
Element:
Is there some reason why it should not work every time?
So, you posted this AWK script:
BEGIN{}{for(intx=1;intx<=NF;intx++){printf("%s\n",$(intx))}}END{}
If I understand correctly, you're saying that it does exactly what you want, and the only problem is that you don't want to rely on AWK?
In that case, you're really making this more complicated than you need to. You can use Bash's substring-replacement functionality directly:
str=ValueA:ValueB,ValueC:ValueC
printf '%s\n' "${str//,/$'\n'}"
If I am understanding the end of your question properly, you have a string like astr="master|ZenityShellEval|Variable declaration|Added Zenity font support to allow choosing both font-name and size and parsing the zenition option, notice --font option require a space between font and size.|20170127|", and you want the following output:
Element: master
Element: ZenityShellEval
Element: Variable declaration
Element: Added Zenity font support to allow choosing both font-name and size and parsing the zenition option, notice --font option require a space between font and size.
Element: 20170127
The simplest way I could think of doing this is the following:
s="${astr%|}"; echo "Element: ${s//|/$'\n'Element: }";
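For comparison, the prefix/suffix-removal idea from the question can also be written as a short loop that uses only POSIX expansions. This is just a sketch: the function name print_elements is invented here, and a trailing empty field (from the final |) is deliberately dropped.

```shell
# Hypothetical helper: print every |-separated field of "$1",
# using only ${var%%pattern} and ${var#pattern}.
print_elements() {
    rest=$1
    while [ -n "$rest" ]; do
        printf 'Element: %s\n' "${rest%%|*}"   # text before the first |
        case $rest in
            *'|'*) rest=${rest#*|} ;;          # drop that field and its |
            *)     rest= ;;                    # no separator left: done
        esac
    done
}

print_elements 'master|ZenityShellEval|Variable declaration|20170127|'
```

Only the literal | is used as a pattern here, so glob characters inside the fields are never interpreted.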
Also, don't forget about arrays! I think they'll come in handy for what you're working on. The following also produces the desired output:
(IFS='|'; declare -a a=(${astr}); printf "Element: %s\n" "${a[@]}")
Bash Hackers Wiki has a great page on arrays, which I recommend looking over.
Here is another variation on the same theme:
IFS="|" read -ra arr<<<"${astr}"
printf "Element: %s\n" "${arr[@]}"
I thought I would add that your original awk is a little bloated; it can be reduced to:
echo -n "ValueA:ValueB,ValueC:ValueC" | awk '1' RS=","
And of course, an awk version of the current solution:
awk 'NF && $0 = "Element: " $0' RS="|" <<<"$astr"

How to get value from command line using for loop

Following is the code for extracting input from command line into bash script:
input=(*);
for i in {1..5..1}
do
input[i]=$($i);
done;
My question is: how to get $1, $2, $3, $4 values from input command line, where command line code input is:
bash script.sh "abc.txt" "|" "20" "yyyy-MM-dd"
Note: Not using for i in "${@}"
#!/bin/bash
for ((i=$#-1;i>=0;i--)); do
echo "${BASH_ARGV[$i]}"
done
Example: ./script.sh a "foo bar" c
Output:
a
foo bar
c
I don't know what you have against for i in "$@"; do..., but you can certainly do it with shift, for example:
while [ "$#" -gt 0 ]; do
printf " '%s'\n" "$1"
shift
done
Output
$ bash script.sh "abc.txt" "|" "20" "yyyy-MM-dd"
'abc.txt'
'|'
'20'
'yyyy-MM-dd'
Personally, I don't see why you exclude for i in "$@"; do ... it is a valid way to iterate through the args that will preserve quoted whitespace. You can also use the array and C-style for loop as indicated in the other answers.
Note: if you are going to use your input array, you should use input=("$@") instead of input=($*). Using the latter will not preserve quoted whitespace in your positional parameters, e.g.
input=("$@")
for ((i = 0; i < ${#input[@]}; i++)); do
    printf " '%s'\n" "${input[i]}"
done
works fine, but if you use input=($*) with an argument like "a b", it will treat it as two separate arguments.
If I'm correctly understanding what you're trying to do, you can write:
input=("$@")
to copy the positional parameters into an array named input.
If you specifically want only the first five positional parameters, you can write:
input=("${@:1:5}")
Edited to add: Or are you asking, given a variable i that contains the integer 2, how you can get $2? If that's your question, then — you can use indirect expansion, where Bash retrieves the value of a variable, then uses that value as the name of the variable to substitute. Indirect expansion uses the ! character:
i=2
input[i]="${!i}" # same as input[2]="$2"
This is almost always a bad idea, though. You should rethink what you're doing.
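If all that is needed is each argument together with its position, a counter next to a shift loop avoids indirect expansion altogether. A minimal sketch under that assumption (the function name show_args is hypothetical; inside a function, "$@" and shift act on the function's own arguments):

```shell
# Hypothetical helper: print each argument with its position.
show_args() {
    index=1
    while [ "$#" -gt 0 ]; do       # also handles empty-string arguments
        printf '$%s = %s\n' "$index" "$1"
        index=$((index + 1))
        shift
    done
}

show_args "abc.txt" "|" "20" "yyyy-MM-dd"
```

In a script, the same loop body can run at the top level against the script's own positional parameters.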

string alignment in perl / match alignment

I have two strings $dna1 and $dna2. Print the two strings as concatenated, and then print the second string lined up over its copy at the end of the concatenated strings. For example, if the input
strings are AAAA and TTTT, print:
AAAATTTT
    TTTT
This is a self-study exercise, not homework.
I tried using index:
#!/usr/bin/perl -w
$a ='AAAAAAAAAATTTTTTTTT';
$b ='TTTTTTTTTT';
print $a,"\n";
print ''x index($a,$b),$b,"\n";
but it is not working as needed. Help, please.
Start by checking what index($a,$b) is returning... Perhaps you should pick a $b that's actually in $a!
Then realise that concatenating 10 instances of an empty string is an empty string, not 10 spaces.
This is a fun little exercise. I did this:
perl -lwe'$a="AAAA"; $b="TTTT"; $c = $a.$b; $i = index($c,$b) + length($b);
print $c; printf "%${i}s\n", $b;'
AAAATTTT
    TTTT
Note that generally speaking, using the variable names $a through $c is a bad idea, and only acceptable here because it is a one-liner. $a and $b are also reserved variable names used with sort.

Resources