The goal is a pure Bash function for splitting strings that accepts any string as the delimiter and any string as the input.
QUESTION: How can I create a function for splitting strings that accepts any string as input and any string as delimiter?
!!!REASON FOR QUESTION!!! There are many, many proposals (see this example) for splitting strings with Bash commands, but almost all of them work only in specific cases and do not meet the requirement above.
NOTES: We consider the latest versions of the following Linux distributions to be eligible as compatible platforms -> Debian, Ubuntu (server and desktop), Arch, RedHat, CentOS, SUSE (server and desktop).
Thanks and be kind!
SOME INPUT TO TEST:
read -r -d '' FILE_CONTENT << 'HEREDOC'
BEGIN
§\\§[+][.][-]
A literal backslash, ‘\’.°
°\a
The “alert” character, Ctrl-g, ASCII code 7 (BEL). (This often makes some sort of audible noise.)
\b
Backspace, Ctrl-h, ASCII code 8 (BS).
\f
Formfeed, Ctrl-l, ASCII code 12 (FF).
\n
Newline, Ctrl-j, ASCII code 10 (LF).
\r
Carriage return, Ctrl-m, ASCII code 13 (CR).
\t
Horizontal TAB, Ctrl-i, ASCII code 9 (HT).
\v
Vertical TAB, Ctrl-k, ASCII code 11 (VT).-
\nnn
The octal value nnn, where nnn stands for 1 to 3 digits between ‘0’ and ‘7’. For example, the code for the ASCII ESC (escape) character is ‘\033’.
15
It may also be helpful to note (though understandably you had no room to do so) that the -d option to readarray first appears in Bash 4.4. –
fbicknel
Aug 18, 2017 at 15:57
4
Great answer (+1). If you change your awk to awk '{ gsub(/,[ ]+|$/,"\0"); print }' ./ and eliminate that concatenation of the final ", " then you don't have to go through the gymnastics on eliminating the final record. So: readarray -td '' a < <(awk '{ gsub(/,[ ]+/,"\0"); print; }' <<<"$string") on Bash that supports readarray. Note your method is Bash 4.4+ I think because of the -d in readarray –
dawg
Nov 26, 2017 at 22:28
10
Wow, what a brilliant answer! Hee hee, my response: ditched the bash script and fired up python! –
artfulrobot
May 14, 2018 at 11:32
11
I'd move your right answers up to the top, I had to scroll through a lot of rubbish to find out how to do it properly :-) –
paxdiablo
Jan 9, 2020 at 12:31
44
This is exactly the kind of thing that will convince you to never code in bash. An astoundingly simple task that has 8 incorrect solutions. Btw, this is without a design constraint of, "Make it as obscure and finicky as possible"§$
END
HEREDOC
F_MS_STR_TO_SPLIT="${FILE_CONTENT:6:-3}"
F_MS_DELIMITER_P="int }' ./ and eliminate"
f_my_answer "$F_MS_STR_TO_SPLIT" "$F_MS_DELIMITER_P"
f_my_answer "$F_MS_STR_TO_SPLIT" "."
f_my_answer "$F_MS_STR_TO_SPLIT" "+"
f_my_answer "$F_MS_STR_TO_SPLIT" "'"
f_my_answer "$F_MS_STR_TO_SPLIT" "\\"
f_my_answer "$F_MS_STR_TO_SPLIT" "-"
f_my_answer "a.+b.+c" "[.][+]"
f_my_answer "a[.][+]b[.][+]c" "[.][+]"
f_my_answer "a.+b.+c" ".+"
There are many, many proposals for splitting strings with Bash commands, but almost all of them work only in specific cases and do not accept any string as input and as delimiter.
The function below, created by jhnc and modified by me, accepts any string as input and as delimiter.
FUNCTION
declare -a F_MASTER_SPLITTER_R
f_master_splitter(){
    : 'Split a given string and return the parts in an array.
    Args:
        F_MS_STR_TO_SPLIT (str): String to split.
        F_MS_DELIMITER_P (Optional[str]): Delimiter used to split. If omitted,
    the string is split on single spaces.
    Returns:
        F_MASTER_SPLITTER_R (array): Array with the given string split on
    the delimiter.
    '
    local F_MS_STR_TO_SPLIT="$1"
    local F_MS_DELIMITER_P="$2"
    if [ -z "$F_MS_DELIMITER_P" ] ; then
        F_MS_DELIMITER_P=" "
    fi
    F_MASTER_SPLITTER_R=()
    local F_MS_ITEM=""
    while
        # Everything before the first delimiter; the inner quotes make the
        # delimiter match literally instead of as a glob pattern.
        F_MS_ITEM="${F_MS_STR_TO_SPLIT%%"$F_MS_DELIMITER_P"*}"
        F_MASTER_SPLITTER_R+=("$F_MS_ITEM")
        # Drop the item just stored; the loop continues while text remains.
        F_MS_STR_TO_SPLIT="${F_MS_STR_TO_SPLIT:${#F_MS_ITEM}}"
        ((${#F_MS_STR_TO_SPLIT}))
    do
        # Skip the delimiter. Use the effective delimiter rather than $2,
        # which is empty when the default space is in use.
        F_MS_STR_TO_SPLIT="${F_MS_STR_TO_SPLIT:${#F_MS_DELIMITER_P}}"
    done
}
USAGE
f_master_splitter "<STR_INPUT>" "<STR_DELIMITER>"
NOTE: The f_master_splitter above was made available completely free as part of this project ez_i - Create shell script installers easily!.
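The key detail in the function is that the delimiter is quoted inside the parameter expansion. A minimal sketch of the difference, using one of the test inputs from the question (the variable names s, d, unquoted and quoted are only for illustration):

```shell
#!/usr/bin/env bash
# Input and delimiter taken from the test cases in the question.
s='a.+b.+c'
d='[.][+]'

# Unquoted: $d is treated as a glob pattern, so "[.][+]" matches the text ".+".
unquoted="${s%%$d*}"

# Quoted: "$d" is matched literally, and the text "[.][+]" does not occur in s.
quoted="${s%%"$d"*}"

printf 'unquoted: %s\n' "$unquoted"   # a
printf 'quoted:   %s\n' "$quoted"     # a.+b.+c
```

With the quoted form, a delimiter such as "[.][+]" splits only on that exact text, which is the behavior the question asks for.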
TO TEST (MORE ELABORATE)
read -r -d '' FILE_CONTENT << 'HEREDOC'
BEGIN
§\\§[+][.][-]
A literal backslash, ‘\’.°
°\a
The “alert” character, Ctrl-g, ASCII code 7 (BEL). (This often makes some sort of audible noise.)
\b
Backspace, Ctrl-h, ASCII code 8 (BS).
\f
Formfeed, Ctrl-l, ASCII code 12 (FF).
\n
Newline, Ctrl-j, ASCII code 10 (LF).
\r
Carriage return, Ctrl-m, ASCII code 13 (CR).
\t
Horizontal TAB, Ctrl-i, ASCII code 9 (HT).
\v
Vertical TAB, Ctrl-k, ASCII code 11 (VT).-''%s
\nnn
The octal value nnn, where nnn stands for 1 to 3 digits between ‘0’ and ‘7’. For example, the code for the ASCII ESC (escape) character is ‘\033’.
15
It may also be helpful to note (though understandably you had no room to do so) that the -d option to readarray first appears in Bash 4.4. –
fbicknel
Aug 18, 2017 at 15:57
4
Great answer (+1). If you change your awk to awk '{ gsub(/,[ ]+|$/,"\0"); print }' %s./ \"and eliminate+.-Β°\aβ\b\f\n\r\t\v\nnnββ\033`` that concatenation of the final ", " then you don't have to go through the gymnastics on eliminating the final record. So: readarray -td '' a < <(awk '{ gsub(/,[ ]+/,"\0"); print; }' <<<"$string") on Bash that supports readarray. Note your method is Bash 4.4+ I think because of the -d in readarray β
dawg
Nov 26, 2017 at 22:28
10
Wow, what a brilliant answer! Hee hee, my response: ditched the bash script and fired up python! –
artfulrobot
May 14, 2018 at 11:32
11
I'd move your right answers up to the top, I had to scroll through a lot of rubbish to find out how to do it properly :-) –
paxdiablo
Jan 9, 2020 at 12:31
44
This is exactly the kind of thing that will convince you to never code in bash. An astoundingly simple task that has 8 incorrect solutions. Btw, this is without a design constraint of, "Make it as obscure and finicky as possible"§$
END
HEREDOC
F_MS_STR_TO_SPLIT="${FILE_CONTENT:6:-3}"
F_MS_DELIMITER_P="int }' %s./ \\\"and eliminate+.-Β°\aβ\b\f\n\r\t\v\nnnββ\033\`\`"
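f_print_my_array() {
    local i
    # Print every element between marker lines so empty items stay visible.
    for ((i = 0; i < ${#F_MASTER_SPLITTER_R[@]}; i++)); do
        echo ">>>>>>>>>>"
        # printf instead of echo, so items such as "-n" print unchanged.
        printf '%s\n' "${F_MASTER_SPLITTER_R[i]}"
        echo "<<<<<<<<<<"
    done
}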
echo ">>>>>>>>>>>>>>>>>>>>"
f_master_splitter "$F_MS_STR_TO_SPLIT" "$F_MS_DELIMITER_P"
f_print_my_array
echo "<<<<<<<<<<<<<<<<<<<<"
echo ">>>>>>>>>>>>>>>>>>>>"
f_master_splitter "$F_MS_STR_TO_SPLIT" "."
f_print_my_array
echo "<<<<<<<<<<<<<<<<<<<<"
echo ">>>>>>>>>>>>>>>>>>>>"
f_master_splitter "$F_MS_STR_TO_SPLIT" "+"
f_print_my_array
echo "<<<<<<<<<<<<<<<<<<<<"
echo ">>>>>>>>>>>>>>>>>>>>"
f_master_splitter "$F_MS_STR_TO_SPLIT" "'"
f_print_my_array
echo "<<<<<<<<<<<<<<<<<<<<"
echo ">>>>>>>>>>>>>>>>>>>>"
f_master_splitter "$F_MS_STR_TO_SPLIT" "\\"
f_print_my_array
echo "<<<<<<<<<<<<<<<<<<<<"
echo ">>>>>>>>>>>>>>>>>>>>"
f_master_splitter "$F_MS_STR_TO_SPLIT" "-"
f_print_my_array
echo "<<<<<<<<<<<<<<<<<<<<"
echo ">>>>>>>>>>>>>>>>>>>>"
f_master_splitter "a.+b.+c" "[.][+]"
f_print_my_array
echo "<<<<<<<<<<<<<<<<<<<<"
echo ">>>>>>>>>>>>>>>>>>>>"
f_master_splitter "a[.][+]b[.][+]c" "[.][+]"
f_print_my_array
echo "<<<<<<<<<<<<<<<<<<<<"
echo ">>>>>>>>>>>>>>>>>>>>"
f_master_splitter "a.+b.+c" ".+"
f_print_my_array
echo "<<<<<<<<<<<<<<<<<<<<"
A very special thanks to jhnc! You rock!
I have the input file (myfile) as:
/data/152.18224487:2,S/proforma invoice.doc
/data/152.916612:2,/proforma invoice.doc
/data/152.48152834/Bank T.T Copy 12 d3d.doc
/data/155071755/Bank T.T Copy.doc
/data/1521/Quotation Request.doc
/data/15.462/Quotation Request 2ds.doc
/data/15.22649962_test4/Quotation Request 33 zz (.doc
/data/15.226462_test6/Quotation Request.doc
and I need to remove everything from the last "/" to the end of each line to get this output:
/data/152.18224487:2,S
/data/152.916612:2,
/data/152.48152834
/data/155071755
/data/1521
/data/15.462
/data/15.22649962_test4
/data/15.226462_test6
How can I do this from the Linux command line?
This is a follow-up to the question "extract last section of data from file using linux command".
Could you please try the following.
awk 'match($0,/\/.*\//){print substr($0,RSTART,RLENGTH-1)}' Input_file
The command above matches from the first / to the last / on each line. If a line in your Input_file may start with something other than /, try the following instead.
awk 'match($0,/.*\//){print substr($0,RSTART,RLENGTH-1)}' Input_file
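To see how the two match() patterns differ, here is a small shell sketch that runs both on two sample lines. The second line, which does not start with /, is a hypothetical input and not taken from the question. match() sets RSTART and RLENGTH to the position and length of the matched text, and RLENGTH-1 drops the trailing slash.

```shell
#!/usr/bin/env bash
# Two sample lines; the second one is hypothetical and does not start with "/".
sample='/data/1521/Quotation Request.doc
x/data/1521/Quotation Request.doc'

# Pattern 1: from the first "/" to the last "/".
first=$(printf '%s\n' "$sample" |
    awk 'match($0,/\/.*\//){print substr($0,RSTART,RLENGTH-1)}')

# Pattern 2: from the start of the line to the last "/".
second=$(printf '%s\n' "$sample" |
    awk 'match($0,/.*\//){print substr($0,RSTART,RLENGTH-1)}')

printf 'pattern 1:\n%s\n' "$first"
printf 'pattern 2:\n%s\n' "$second"
```

Pattern 1 prints /data/1521 for both lines because it skips the leading x, while pattern 2 keeps it and prints x/data/1521 for the second line.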
This one is combined with your previous question,
i.e. with data like:
>> Vi 'x' found in file /data/152.916612:2,/proforma invoice.doc
>> Vi 'x' found in file /data/152.48152834/Bank T.T Copy 12 d3d.doc
>> Vi 'x' found in file /data/155071755/Bank T.T Copy.doc
...
awk:
$ awk '
(s=match($0,/found in file /)+RLENGTH) && (match(substr($0,s),/.*\//)) {
print substr($0,s,RLENGTH-1)
}' file
Output:
/data/152.18224487:2,S
/data/152.916612:2,
/data/152.48152834
...
Try
sed 's:/[^/]*$::' < inputfile > outputfile
You stated in a comment elsewhere that you also need only the rest after the last slash, so here we go:
sed 's:^.*/::' < inputfile > outputfile
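If the lines are already in a shell variable, the same two cuts can be sketched with Bash parameter expansion instead of sed. This is an alternative technique, not what the answer above uses, and the variable names line, dir_part and file_part are only for illustration.

```shell
#!/usr/bin/env bash
# One line from the question's input file.
line='/data/152.18224487:2,S/proforma invoice.doc'

dir_part="${line%/*}"     # remove the shortest suffix matching "/*"
file_part="${line##*/}"   # remove the longest prefix matching "*/"

printf '%s\n' "$dir_part"    # /data/152.18224487:2,S
printf '%s\n' "$file_part"   # proforma invoice.doc
```

For a whole file, the expansion can be applied inside a while IFS= read -r line loop. sed is likely the better fit for large files, since it avoids a shell loop.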
awk -F/ '{print "/"$1$2"/"$3}' file
/data/152.18224487:2,S
/data/152.916612:2,
/data/152.48152834
/data/155071755
/data/1521
/data/15.462
/data/15.22649962_test4
/data/15.226462_test6
I'm trying to replace the following line with multiple lines:
setsid /usr/local/bin/Naming_Service ${OPTIONS} &
replacing with
setsid /usr/local/bin/Naming_Service ${OPTIONS_13016} &
setsid /usr/local/bin/Naming_Service ${OPTIONS_13018} &
I tried this command:
sed '0,/setsid \/usr\/var\/run\/Naming_Serivce ${OPTIONS}/s//setsid \/usr\/var\/run\/Naming_Serivce ${OPTIONS_13016}\n\setsid \/usr\/var\/run\/Naming_Serivce ${OPTIONS_13018}\n /' script > new_script
Can you please help me resolve this?
sed 's/^\(.*\)\(${OPTIONS}\)\(.*\)$/\1${OPTIONS_13016}\3\n\1${OPTIONS_13018}\3/' < script > new_script
\(...\) - creates a group
\1 \3 - refer back to these groups
\n - newline
.* - any sequence of characters
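A minimal sketch of the group-based substitution applied to the line from the question, assuming GNU sed (which accepts \n in the replacement text). The variable names input and result are only for illustration.

```shell
#!/usr/bin/env bash
# Assumes GNU sed, which accepts \n in the replacement text.
# Single quotes keep ${OPTIONS} literal instead of expanding it in the shell.
input='setsid /usr/local/bin/Naming_Service ${OPTIONS} &'

# \1 = text before ${OPTIONS}, \3 = text after it; the line is emitted twice.
result=$(printf '%s\n' "$input" |
    sed 's/^\(.*\)\(${OPTIONS}\)\(.*\)$/\1${OPTIONS_13016}\3\n\1${OPTIONS_13018}\3/')

printf '%s\n' "$result"
```

Because \1 and \3 carry over whatever surrounds ${OPTIONS}, the trailing " &" is kept on both output lines without being spelled out in the expression.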
For your requirement, use the syntax below.
Syntax:
sed -e "s/setsid \/usr\/local\/bin\/Naming_Service \${OPTIONS}/setsid \/usr\/local\/bin\/Naming_Service \${OPTIONS_13016} \&\nsetsid \/usr\/local\/bin\/Naming_Service \${OPTIONS_13018}/g" script > new_script
I'm using PowerShell to trim the space between strings and I need help. I'm reading the values into a variable using Get-Content.
Here is my input data:
04:31 Alex M.O.R.P.H. & Natalie Gioia - My Heaven http://goo.gl/rMOa2q
[ARMADA MUSIC]

12:37 Chakra - Home (Alexander Popov Remix) http://goo.gl/3janGY
[SOUNDPIERCING]
See the space between the two songs? I want to eliminate it, so that the output is:
04:31 Alex M.O.R.P.H. & Natalie Gioia - My Heaven http://goo.gl/rMOa2q
[ARMADA MUSIC]
12:37 Chakra - Home (Alexander Popov Remix) http://goo.gl/3janGY
[SOUNDPIERCING]
I put the contents in a file called foo.txt.
foreach ($line in Get-Content foo.txt) {
    if ($line -ne '') {
        $line
    }
}
$noEmptyLines = Get-Content -Path C:\FilePath\File.Txt | Where-Object{$_ -notmatch "^\s*$"}
$noEmptyLines then holds the lines of the file with every empty or whitespace-only line removed.
I have this line:
\\Server1\A Share & Test & Check M
I want this output:
\\Server1\A Share & Test & Check
The line should end with a visible character, not with trailing spaces (\\Server1\A Share & Test & Check)
I tried this:
sed -i "s/[ *\t[a-z]]*$//I" shares.txt
It removes the last letter but not the spaces.
The regex you are after is \s*[a-z]*$
sed -i "s/\s*[a-z]*$//I" shares.txt
\s matches any whitespace character
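A small sketch of this regex on the line from the question, run through a pipe instead of editing shares.txt in place. The first command assumes GNU sed, since \s and the I flag are GNU extensions. The second is a variant with POSIX character classes that I would expect to work on other sed implementations as well. The variable names are only for illustration.

```shell
#!/usr/bin/env bash
# Single quotes keep both leading backslashes of the share path.
line='\\Server1\A Share & Test & Check M'

# GNU sed: \s and the I (case-insensitive) flag are GNU extensions.
gnu=$(printf '%s\n' "$line" | sed 's/\s*[a-z]*$//I')

# Variant with POSIX character classes; no case flag is needed.
posix=$(printf '%s\n' "$line" | sed 's/[[:space:]]*[[:alpha:]]*$//')

# Brackets make any leftover trailing whitespace visible.
printf '[%s]\n' "$gnu" "$posix"
```

Both commands remove the trailing letter together with the whitespace in front of it, so the printed lines end directly after "Check".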
Try this:
printf '%s\n' '\\Server1\A Share & Test & Check M' | sed 's/[[:space:]]*[A-Za-z]*$//'
Output:
\\Server1\A Share & Test & Check