Is there a way in bash to convert a string into a lower case string?
For example, if I have:
a="Hi all"
I want to convert it to:
"hi all"
There are various ways:
POSIX standard
tr
$ echo "$a" | tr '[:upper:]' '[:lower:]'
hi all
AWK
$ echo "$a" | awk '{print tolower($0)}'
hi all
Non-POSIX
You may run into portability issues with the following examples:
Bash 4.0
$ echo "${a,,}"
hi all
sed
$ echo "$a" | sed -e 's/\(.*\)/\L\1/'
hi all
# this also works:
$ sed -e 's/\(.*\)/\L\1/' <<< "$a"
hi all
Perl
$ echo "$a" | perl -ne 'print lc'
hi all
Bash
lc(){
case "$1" in
[A-Z])
n=$(printf "%d" "'$1")
n=$((n+32))
printf \\$(printf "%o" "$n")
;;
*)
printf "%s" "$1"
;;
esac
}
word="I Love Bash"
for((i=0;i<${#word};i++))
do
ch="${word:$i:1}"
lc "$ch"
done
Note: YMMV on this one. Doesn't work for me (GNU bash version 4.2.46 and 4.0.33 (and same behaviour 2.05b.0 but nocasematch is not implemented)) even with using shopt -u nocasematch;. Unsetting that nocasematch causes [[ "fooBaR" == "FOObar" ]] to match OK BUT inside case weirdly [b-z] are incorrectly matched by [A-Z]. Bash is confused by the double-negative ("unsetting nocasematch")! :-)
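A likely cause of the behaviour in the note (this is my assumption; the note itself does not confirm it) is locale collation: in some locales the range [A-Z] is ordered so that it also covers most lowercase letters. A minimal sketch that sidesteps ranges by matching on the POSIX character class instead; the name lc_char is made up for illustration:

```shell
# Hypothetical variant of lc() that matches with [[:upper:]] instead of [A-Z],
# so the result does not depend on how the locale orders letter ranges.
lc_char() {
    case "$1" in
        [[:upper:]])
            # tr does the actual conversion; no ASCII arithmetic needed
            printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
            ;;
        *)
            printf '%s' "$1"
            ;;
    esac
}

lc_char "B"; lc_char "b"; lc_char "7"; echo   # prints: bb7
```

This forks tr once per uppercase character, so it is only a sketch for short strings.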
In Bash 4:
To lowercase
$ string="A FEW WORDS"
$ echo "${string,}"
a FEW WORDS
$ echo "${string,,}"
a few words
$ echo "${string,,[AEIUO]}"
a FeW WoRDS
$ string="A Few Words"
$ declare -l string
$ string=$string; echo "$string"
a few words
To uppercase
$ string="a few words"
$ echo "${string^}"
A few words
$ echo "${string^^}"
A FEW WORDS
$ echo "${string^^[aeiou]}"
A fEw wOrds
$ string="A Few Words"
$ declare -u string
$ string=$string; echo "$string"
A FEW WORDS
Toggle (undocumented, but optionally configurable at compile time)
$ string="A Few Words"
$ echo "${string~~}"
a fEW wORDS
$ string="A FEW WORDS"
$ echo "${string~}"
a FEW WORDS
$ string="a few words"
$ echo "${string~}"
A few words
Capitalize (undocumented, but optionally configurable at compile time)
$ string="a few words"
$ declare -c string
$ string=$string
$ echo "$string"
A few words
Title case:
$ string="a few words"
$ string=($string)
$ string="${string[@]^}"
$ echo "$string"
A Few Words
$ declare -c string
$ string=(a few words)
$ echo "${string[@]}"
A Few Words
$ string="a FeW WOrdS"
$ string=${string,,}
$ string=${string~}
$ echo "$string"
A few words
To turn off a declare attribute, use +. For example, declare +c string. This affects subsequent assignments and not the current value.
The declare options change the attribute of the variable, but not the contents. The reassignments in my examples update the contents to show the changes.
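To make the point about attributes versus contents concrete, here is a small sketch (assuming Bash 4+; the variable names are only for illustration) showing that removing the attribute leaves the current value alone and only changes what later assignments do:

```shell
#!/usr/bin/env bash
# Requires Bash 4+ for the -l attribute.
declare -l name        # lowercase every future assignment to name
name="Hi All"
first=$name            # hi all

declare +l name        # drop the attribute; the stored value is untouched
second=$name           # still hi all

name="Hi All"          # assigned after the attribute was removed
third=$name            # Hi All

printf '%s\n' "$first" "$second" "$third"
```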
Edit:
Added "toggle first character by word" (${var~}) as suggested by ghostdog74.
Edit: Corrected tilde behavior to match Bash 4.3.
echo "Hi All" | tr "[:upper:]" "[:lower:]"
tr:
a="$(tr '[A-Z]' '[a-z]' <<< "$a")"
AWK:
{ print tolower($0) }
sed:
y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/
I know this is an oldish post but I made this answer for another site so I thought I'd post it up here:
UPPER -> lower:
use python:
b=`echo "print '$a'.lower()" | python`
Or Ruby:
b=`echo "print '$a'.downcase" | ruby`
Or Perl:
b=`perl -e "print lc('$a');"`
Or PHP:
b=`php -r "print strtolower('$a');"`
Or Awk:
b=`echo "$a" | awk '{ print tolower($0) }'`
Or Sed:
b=`echo "$a" | sed 's/./\L&/g'`
Or Bash 4:
b=${a,,}
Or NodeJS:
b=`node -p "\"$a\".toLowerCase()"`
You could also use dd:
b=`echo "$a" | dd conv=lcase 2> /dev/null`
lower -> UPPER:
use python:
b=`echo "print '$a'.upper()" | python`
Or Ruby:
b=`echo "print '$a'.upcase" | ruby`
Or Perl:
b=`perl -e "print uc('$a');"`
Or PHP:
b=`php -r "print strtoupper('$a');"`
Or Awk:
b=`echo "$a" | awk '{ print toupper($0) }'`
Or Sed:
b=`echo "$a" | sed 's/./\U&/g'`
Or Bash 4:
b=${a^^}
Or NodeJS:
b=`node -p "\"$a\".toUpperCase()"`
You could also use dd:
b=`echo "$a" | dd conv=ucase 2> /dev/null`
Also when you say 'shell' I'm assuming you mean bash but if you can use zsh it's as easy as
b=$a:l
for lower case and
b=$a:u
for upper case.
Bash 5.1 provides a straightforward way to do this with the L parameter transformation:
${var@L}
So for example you can say:
v="heLLo"
echo "${v@L}"
# hello
You can also do uppercase with U:
v="hello"
echo "${v@U}"
# HELLO
And uppercase the first letter with u:
v="hello"
echo "${v@u}"
# Hello
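Since the @L transformation needs Bash 5.1, a script that may run on older versions could check BASH_VERSINFO first. A rough sketch, where lower_compat is a hypothetical name and the fallbacks are the ${var,,} and tr approaches from the other answers (I have only reasoned about, not tested, very old Bash versions):

```shell
# Hypothetical helper: pick the newest lowercase mechanism the running Bash supports.
lower_compat() {
    local v=$1
    if (( BASH_VERSINFO[0] > 5 || (BASH_VERSINFO[0] == 5 && BASH_VERSINFO[1] >= 1) )); then
        printf '%s\n' "${v@L}"      # Bash 5.1+: parameter transformation
    elif (( BASH_VERSINFO[0] >= 4 )); then
        printf '%s\n' "${v,,}"      # Bash 4.0+: case-modification expansion
    else
        printf '%s\n' "$v" | tr '[:upper:]' '[:lower:]'   # older Bash: external tr
    fi
}

lower_compat "heLLo"   # prints: hello
```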
In zsh:
echo $a:u
Gotta love zsh!
Using GNU sed:
sed 's/.*/\L&/'
Example:
$ foo="Some STRIng";
$ foo=$(echo "$foo" | sed 's/.*/\L&/')
$ echo "$foo"
some string
Pre Bash 4.0
Bash Lower the Case of a string and assign to variable
VARIABLE=$(echo "$VARIABLE" | tr '[:upper:]' '[:lower:]')
echo "$VARIABLE"
For the Bash command line and depending on locale and international letters, this might work (assembled from the answers from others):
$ echo "ABCÆØÅ" | python -c "print(open(0).read().lower())"
abcæøå
$ echo "ABCÆØÅ" | sed 's/./\L&/g'
abcæøå
$ a="ABCÆØÅ"; echo "${a,,}"
abcæøå
Whereas these variations might NOT work:
$ echo "ABCÆØÅ" | tr "[:upper:]" "[:lower:]"
abcÆØÅ
$ echo "ABCÆØÅ" | awk '{print tolower($1)}'
abcÆØÅ
$ echo "ABCÆØÅ" | perl -ne 'print lc'
abcÆØÅ
$ echo 'ABCÆØÅ' | dd conv=lcase 2> /dev/null
abcÆØÅ
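Whether the non-ASCII letters get converted depends on both the tool and the active locale. If you want predictable ASCII-only behaviour from Bash itself, one option is to force the C locale for the conversion. A sketch assuming Bash 4+; lower_ascii is a hypothetical name:

```shell
# Hypothetical helper: lowercase only A-Z, whatever the caller's locale is.
lower_ascii() {
    # The subshell keeps the locale change from leaking into the caller.
    ( LC_ALL=C; printf '%s\n' "${1,,}" )
}

lower_ascii "ABCÆØÅ"
```

In the C locale Bash should treat the multi-byte letters as opaque bytes and leave them untouched, so only ABC is lowercased.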
Simple way
echo "Hi all" | awk '{ print tolower($0); }'
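Using only shell builtins (note that the ${var:offset:length} substring expansion is not POSIX, so this needs bash, ksh93, zsh, or a similar shell):
uppers=ABCDEFGHIJKLMNOPQRSTUVWXYZ
lowers=abcdefghijklmnopqrstuvwxyz
lc(){ #usage: lc "SOME STRING" -> "some string"
i=0
OUTPUT= # reset, so repeated calls do not accumulate
while [ $i -lt ${#1} ]; do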
CUR=${1:$i:1}
case $uppers in
*$CUR*)CUR=${uppers%$CUR*};OUTPUT="${OUTPUT}${lowers:${#CUR}:1}";;
*)OUTPUT="${OUTPUT}$CUR";;
esac
i=$((i+1))
done
echo "${OUTPUT}"
}
And for upper case:
uc(){ #usage: uc "some string" -> "SOME STRING"
i=0
OUTPUT= # reset, so repeated calls do not accumulate
while [ $i -lt ${#1} ]; do
CUR=${1:$i:1}
case $lowers in
*$CUR*)CUR=${lowers%$CUR*};OUTPUT="${OUTPUT}${uppers:${#CUR}:1}";;
*)OUTPUT="${OUTPUT}$CUR";;
esac
i=$((i+1))
done
echo "${OUTPUT}"
}
In bash 4 you can use typeset
Example:
A="HELLO WORLD"
typeset -l A=$A
You can try this
s="Hello World!"
echo $s # Hello World!
a=${s,,}
echo $a # hello world!
b=${s^^}
echo $b # HELLO WORLD!
ref : http://wiki.workassis.com/shell-script-convert-text-to-lowercase-and-uppercase/
From the bash manpage:
${parameter^pattern}
${parameter^^pattern}
${parameter,pattern}
${parameter,,pattern}
Case modification. This expansion modifies the case of alphabetic characters in parameter. The pattern is expanded to produce a
pattern just as in pathname expansion. Each character in the expanded
value of parameter is tested against pattern, and, if it matches
the pattern, its case is converted. The pattern should not attempt to
match more than one character. The ^ operator converts lowercase
letters matching pattern to uppercase; the , operator converts
matching uppercase letters to lowercase. The ^^ and ,,
expansions convert each matched character in the expanded value; the
^ and , expansions match and convert only the first character in the expanded value. If pattern is omitted, it is treated like a
?, which matches every character. If parameter is @ or *, the case modification operation is applied to each positional parameter in turn, and the expansion is the resultant list. If
parameter is an array variable subscripted with @ or *, the case modification operation is applied to each member of the array in
turn, and the expansion is the resultant list.
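The last part of that manpage excerpt (applying the operation to each array member) can be sketched like this, assuming Bash 4+; the array names are just for illustration:

```shell
#!/usr/bin/env bash
words=("Alpha" "BETA" "gamma")

lower=("${words[@],,}")            # every character of every element
capitalized=("${words[@]^}")       # first character of every element
vowels_up=("${words[@]^^[ae]}")    # only characters matching the pattern

printf '%s\n' "${lower[*]}" "${capitalized[*]}" "${vowels_up[*]}"
# alpha beta gamma
# Alpha BETA Gamma
# AlphA BETA gAmmA
```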
Regular expression
I would like to take credit for the command I wish to share, but the truth is I obtained it for my own use from http://commandlinefu.com. If you cd to any directory within your own home folder and run it, it will change all files and folders to lower case recursively, so please use it with caution. It is a brilliant command-line fix, and especially useful for those multitudes of albums you have stored on your drive.
find . -depth -exec rename 's/(.*)\/([^\/]*)/$1\/\L$2/' {} \;
You can specify a directory in place of the dot(.) after the find which denotes current directory or full path.
I hope this solution proves useful. The one thing this command does not do is replace spaces with underscores; oh well, another time perhaps.
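If the Perl rename utility is not installed (there are several incompatible programs called rename), roughly the same recursive renaming can be sketched with find, tr, and mv. This is my own variant, not the commandlinefu command; lowercase_tree is a hypothetical name, and mv -n is used so an existing lowercase name is not overwritten:

```shell
# Hypothetical helper: lowercase the last path component of everything below a directory.
lowercase_tree() {
    local root=${1:-.} path dir base lower
    # -depth lists children before their parent, so renaming a directory
    # never invalidates paths that are still waiting to be processed.
    find "$root" -depth -mindepth 1 -print0 |
    while IFS= read -r -d '' path; do
        dir=${path%/*}      # everything before the last slash
        base=${path##*/}    # the last path component
        lower=$(printf '%s' "$base" | tr '[:upper:]' '[:lower:]')
        if [ "$base" != "$lower" ]; then
            mv -n -- "$path" "$dir/$lower"
        fi
    done
}
```

Usage would be something like lowercase_tree ~/Music/albums. As with the original command, try it on a copy first; on case-insensitive filesystems the mv -n step may refuse to rename.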
Case conversion applies to letters only, so this should work neatly.
I am focusing on converting the letters a-z between upper and lower case. Any other character is printed to stdout as is.
This converts all text in path/to/file/filename within the a-z range to A-Z.
For converting lower case to upper case
cat path/to/file/filename | tr 'a-z' 'A-Z'
For converting from upper case to lower case
cat path/to/file/filename | tr 'A-Z' 'a-z'
For example,
filename:
my name is xyz
gets converted to:
MY NAME IS XYZ
Example 2:
echo "my name is 123 karthik" | tr 'a-z' 'A-Z'
# Output:
# MY NAME IS 123 KARTHIK
Example 3:
echo "my name is 123 &&^&& ##$##%%& kAR2~thik" | tr 'a-z' 'A-Z'
# Output:
# MY NAME IS 123 &&^&& ##0#%%& KAR2~THIK
Many answers use external programs, which is not really using Bash.
If you know you will have Bash 4 available, you should really just use the ${VAR,,} notation (it is easy and cool). For Bash before 4 (my Mac still uses Bash 3.2, for example), I used the corrected version of @ghostdog74's answer to create a more portable version.
It is one you can call as lowercase 'my STRING' to get a lowercase version. I read comments about setting the result to a var, but that is not really portable in Bash, since we can't return strings. Printing it is the best solution, and it is easy to capture with something like var="$(lowercase $str)".
How this works
The way this works is by getting the ASCII integer representation of each char with printf and then adding 32 if upper-to->lower, or subtracting 32 if lower-to->upper. Then use printf again to convert the number back to a char. From 'A' -to-> 'a' we have a difference of 32 chars.
Using printf to explain:
$ printf "%d\n" "'a"
97
$ printf "%d\n" "'A"
65
97 - 65 = 32
And this is the working version with examples.
Please note the comments in the code, as they explain a lot of stuff:
#!/bin/bash
# lowerupper.sh
# Prints the lowercase version of a char
lowercaseChar(){
case "$1" in
[A-Z])
n=$(printf "%d" "'$1")
n=$((n+32))
printf \\$(printf "%o" "$n")
;;
*)
printf "%s" "$1"
;;
esac
}
# Prints the lowercase version of a sequence of strings
lowercase() {
word="$@"
for((i=0;i<${#word};i++)); do
ch="${word:$i:1}"
lowercaseChar "$ch"
done
}
# Prints the uppercase version of a char
uppercaseChar(){
case "$1" in
[a-z])
n=$(printf "%d" "'$1")
n=$((n-32))
printf \\$(printf "%o" "$n")
;;
*)
printf "%s" "$1"
;;
esac
}
# Prints the uppercase version of a sequence of strings
uppercase() {
word="$@"
for((i=0;i<${#word};i++)); do
ch="${word:$i:1}"
uppercaseChar "$ch"
done
}
# The functions will not add a new line, so use echo or
# append it if you want a new line after printing
# Printing stuff directly
lowercase "I AM the Walrus!"$'\n'
uppercase "I AM the Walrus!"$'\n'
echo "----------"
# Printing a var
str="A StRing WITH mixed sTUFF!"
lowercase "$str"$'\n'
uppercase "$str"$'\n'
echo "----------"
# Not quoting the var should also work,
# since we use "$@" inside the functions
lowercase $str$'\n'
uppercase $str$'\n'
echo "----------"
# Assigning to a var
myLowerVar="$(lowercase $str)"
myUpperVar="$(uppercase $str)"
echo "myLowerVar: $myLowerVar"
echo "myUpperVar: $myUpperVar"
echo "----------"
# You can even do stuff like
if [[ 'option 2' = "$(lowercase 'OPTION 2')" ]]; then
echo "Fine! All the same!"
else
echo "Ops! Not the same!"
fi
exit 0
And the results after running this:
$ ./lowerupper.sh
i am the walrus!
I AM THE WALRUS!
----------
a string with mixed stuff!
A STRING WITH MIXED STUFF!
----------
a string with mixed stuff!
A STRING WITH MIXED STUFF!
----------
myLowerVar: a string with mixed stuff!
myUpperVar: A STRING WITH MIXED STUFF!
----------
Fine! All the same!
This should only work for ASCII characters though.
For me it is fine, since I know I will only pass ASCII chars to it.
I am using this for some case-insensitive CLI options, for example.
For Bash3.2.+ | Mac:
read -p 'What is your email? ' email
email=$(echo $email | tr '[:upper:]' '[:lower:]')
email="$email"
echo $email
If using v4, this is baked-in. If not, here is a simple, widely applicable solution. Other answers (and comments) on this thread were quite helpful in creating the code below.
# Like echo, but converts to lowercase
echolcase () {
tr '[:upper:]' '[:lower:]' <<< "${*}"
}
# Takes one arg by reference (var name) and makes it lowercase
lcase () {
eval "${1}"=\'$(echo ${!1//\'/"'\''"} | tr '[:upper:]' '[:lower:]' )\'
}
Notes:
Doing: a="Hi All" and then: lcase a will do the same thing as: a=$( echolcase "Hi All" )
In the lcase function, using ${!1//\'/"'\''"} instead of ${!1} allows this to work even when the string has quotes.
This is a far faster variation of JaredTS486's approach that uses native Bash capabilities (including Bash versions <4.0) to optimize his approach.
I've timed 1,000 iterations of this approach for a small string (25 characters) and a larger string (445 characters), both for lowercase and uppercase conversions. Since the test strings are predominantly lowercase, conversions to lowercase are generally faster than to uppercase.
I've compared my approach with several other answers on this page that are compatible with Bash 3.2. My approach is far more performant than most approaches documented here, and is even faster than tr in several cases.
Here are the timing results for 1,000 iterations of 25 characters:
0.46s for my approach to lowercase; 0.96s for uppercase
1.16s for Orwellophile's approach to lowercase; 1.59s for uppercase
3.67s for tr to lowercase; 3.81s for uppercase
11.12s for ghostdog74's approach to lowercase; 31.41s for uppercase
26.25s for technosaurus' approach to lowercase; 26.21s for uppercase
25.06s for JaredTS486's approach to lowercase; 27.04s for uppercase
Timing results for 1,000 iterations of 445 characters (consisting of the poem "The Robin" by Witter Bynner):
2s for my approach to lowercase; 12s for uppercase
4s for tr to lowercase; 4s for uppercase
20s for Orwellophile's approach to lowercase; 29s for uppercase
75s for ghostdog74's approach to lowercase; 669s for uppercase. It's interesting to note how dramatic the performance difference is between a test with predominant matches vs. a test with predominant misses
467s for technosaurus' approach to lowercase; 449s for uppercase
660s for JaredTS486's approach to lowercase; 660s for uppercase. It's interesting to note that this approach generated continuous page faults (memory swapping) in Bash
Solution:
#!/bin/bash
set -e
set -u
declare LCS="abcdefghijklmnopqrstuvwxyz"
declare UCS="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
function lcase()
{
local TARGET="${1-}"
local UCHAR=''
local UOFFSET=''
while [[ "${TARGET}" =~ ([A-Z]) ]]
do
UCHAR="${BASH_REMATCH[1]}"
UOFFSET="${UCS%%${UCHAR}*}"
TARGET="${TARGET//${UCHAR}/${LCS:${#UOFFSET}:1}}"
done
echo -n "${TARGET}"
}
function ucase()
{
local TARGET="${1-}"
local LCHAR=''
local LOFFSET=''
while [[ "${TARGET}" =~ ([a-z]) ]]
do
LCHAR="${BASH_REMATCH[1]}"
LOFFSET="${LCS%%${LCHAR}*}"
TARGET="${TARGET//${LCHAR}/${UCS:${#LOFFSET}:1}}"
done
echo -n "${TARGET}"
}
The approach is simple: while the input string has any remaining uppercase letters present, find the next one, and replace all instances of that letter with its lowercase variant. Repeat until all uppercase letters are replaced.
Some performance characteristics of my solution:
Uses only shell builtin utilities, which avoids the overhead of invoking external binary utilities in a new process
Avoids sub-shells, which incur performance penalties
Uses shell mechanisms that are compiled and optimized for performance, such as global string replacement within variables, variable suffix trimming, and regex searching and matching. These mechanisms are far faster than iterating manually through strings
Loops only the number of times required by the count of unique matching characters to be converted. For example, converting a string that has three different uppercase characters to lowercase requires only 3 loop iterations. For the preconfigured ASCII alphabet, the maximum number of loop iterations is 26
UCS and LCS can be augmented with additional characters
So I attempted to perform some updated benchmarking using the consensus approach for each utility, but instead of repeating a tiny set many times, I ...
fed in a 1.85 GB .txt file that's filled to the brim w/ multi-byte Unicode chars in UTF-8 encoding,
via the pipe in order to equalize I/O aspect,
while also enforcing LC_ALL=C for all to ensure level playing field
————————————————————————————————————————
Both bsd-sed and gnu-sed are rather mediocre, to put it very nicely.
I don't even know what bsd-sed was trying to do, as their xxhash doesn't match
was python3 trying to do Unicode letter-casing ?
(even though I already forced the locale setting LC_ALL=C )
tr is the most extreme
gnu-tr is, by far, the fastest among all
bsd-tr utterly atrocious
perl5 is faster than any awk variant I have, unless you're okay with loading the whole file at once using mawk2 in order to gain a tiny bit over perl5 :
2.935s mawk2
vs
3.081s perl5
within awk, gnu-gawk appears slowest among the 3 , mawk 1.3.4 in the middle, and mawk 1.9.9.6 fastest : more than 50% time savings over gawk
. (I didn't waste my time with the useless macosx nawk)
.
out9: 1.85GiB 0:00:03 [ 568MiB/s] [ 568MiB/s] [ <=> ]
in0: 1.85GiB 0:00:03 [ 568MiB/s] [ 568MiB/s] [============>] 100%
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C mawk2 '{ print tolower($_) }' FS='^$'; )
mawk 1.9.9.6 (mawk2-beta)
3.07s user 0.66s system 111% cpu 3.348 total
85759a34df874966d096c6529dbfb9d5 stdin
out9: 1.85GiB 0:00:06 [ 297MiB/s] [ 297MiB/s] [ <=> ]
in0: 1.85GiB 0:00:06 [ 297MiB/s] [ 297MiB/s] [============>] 100%
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C mawk '{ print tolower($_) }' FS='^$'; )
mawk 1.3.4
6.01s user 0.83s system 107% cpu 6.368 total
85759a34df874966d096c6529dbfb9d5 stdin
out9: 23.8MiB 0:00:00 [ 238MiB/s] [ 238MiB/s] [ <=> ]
in0: 1.85GiB 0:00:07 [ 244MiB/s] [ 244MiB/s] [============>] 100%
out9: 1.85GiB 0:00:07 [ 244MiB/s] [ 244MiB/s] [ <=> ]
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C gawk -be '{ print tolower($_) }' FS='^$';
GNU Awk 5.1.1, API: 3.1 (GNU MPFR 4.1.0, GNU MP 6.2.1)
7.49s user 0.78s system 106% cpu 7.763 total
85759a34df874966d096c6529dbfb9d5 stdin
out9: 1.85GiB 0:00:03 [ 616MiB/s] [ 616MiB/s] [ <=> ]
in0: 1.85GiB 0:00:03 [ 617MiB/s] [ 617MiB/s] [============>] 100%
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C perl -ne 'print lc'; )
perl5 (revision 5 version 34 subversion 0)
2.70s user 0.85s system 115% cpu 3.081 total
85759a34df874966d096c6529dbfb9d5 stdin
out9: 1.85GiB 0:00:32 [57.4MiB/s] [57.4MiB/s] [ <=> ]
in0: 1.85GiB 0:00:32 [57.4MiB/s] [57.4MiB/s] [============>] 100%
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C gsed 's/.*/\L&/'; ) # GNU-sed
gsed (GNU sed) 4.8
32.57s user 0.97s system 101% cpu 32.982 total
85759a34df874966d096c6529dbfb9d5 stdin
out9: 1.86GiB 0:00:38 [49.7MiB/s] [49.7MiB/s] [ <=> ]
in0: 1.85GiB 0:00:38 [49.4MiB/s] [49.4MiB/s] [============>] 100%
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C sed 's/.*/\L&/'; ) # BSD-sed
37.94s user 0.86s system 101% cpu 38.318 total
d5e2d8487df1136db7c2334a238755c0 stdin
in0: 313MiB 0:00:00 [3.06GiB/s] [3.06GiB/s] [=====>] 16% ETA 0:00:00
out9: 1.85GiB 0:00:11 [ 166MiB/s] [ 166MiB/s] [ <=>]
in0: 1.85GiB 0:00:00 [3.31GiB/s] [3.31GiB/s] [============>] 100%
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C python3 -c "print(open(0).read().lower()))
Python 3.9.12
9.04s user 2.18s system 98% cpu 11.403 total
7ddc0b5cbcfbbfac3c2b6da6731bd262 stdin
out9: 2.51MiB 0:00:00 [25.1MiB/s] [25.1MiB/s] [ <=> ]
in0: 1.85GiB 0:00:11 [ 171MiB/s] [ 171MiB/s] [============>] 100%
out9: 1.85GiB 0:00:11 [ 171MiB/s] [ 171MiB/s] [ <=> ]
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C ruby -pe '$_.downcase!'; )
ruby 2.6.8p205 (2021-07-07 revision 67951) [universal.arm64e-darwin21]
10.46s user 1.23s system 105% cpu 11.073 total
85759a34df874966d096c6529dbfb9d5 stdin
in0: 1.85GiB 0:00:01 [1.01GiB/s] [1.01GiB/s] [============>] 100%
out9: 1.85GiB 0:00:01 [1.01GiB/s] [1.01GiB/s] [ <=> ]
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C gtr '[A-Z]' '[a-z]'; ) # GNU-tr
gtr (GNU coreutils) 9.1
1.11s user 1.21s system 124% cpu 1.855 total
85759a34df874966d096c6529dbfb9d5 stdin
out9: 1.85GiB 0:01:19 [23.7MiB/s] [23.7MiB/s] [ <=> ]
in0: 1.85GiB 0:01:19 [23.7MiB/s] [23.7MiB/s] [============>] 100%
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C tr '[A-Z]' '[a-z]'; ) # BSD-tr
78.94s user 1.50s system 100% cpu 1:19.67 total
85759a34df874966d096c6529dbfb9d5 stdin
( time ( pvE0 < "${m3t}" | LC_ALL=C gdd conv=lcase ) | pvE9 ) | xxh128sum | lgp3; sleep 3;
out9: 0.00 B 0:00:01 [0.00 B/s] [0.00 B/s] [<=> ]
in0: 1.85GiB 0:00:06 [ 295MiB/s] [ 295MiB/s] [============>] 100%
out9: 1.81GiB 0:00:06 [ 392MiB/s] [ 294MiB/s] [ <=> ]
3874110+1 records in
3874110+1 records out
out9: 1.85GiB 0:00:06 [ 295MiB/s] [ 295MiB/s] [ <=> ]
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C gdd conv=lcase; ) # GNU-dd
gdd (coreutils) 9.1
1.93s user 4.35s system 97% cpu 6.413 total
85759a34df874966d096c6529dbfb9d5 stdin
% ( time ( pvE0 < "${m3t}" | LC_ALL=C dd conv=lcase ) | pvE9 ) | xxh128sum | lgp3; sleep 3;
out9: 36.9MiB 0:00:00 [ 368MiB/s] [ 368MiB/s] [ <=> ]
in0: 1.85GiB 0:00:04 [ 393MiB/s] [ 393MiB/s] [============>] 100%
out9: 1.85GiB 0:00:04 [ 393MiB/s] [ 393MiB/s] [ <=> ]
3874110+1 records in
3874110+1 records out
out9: 1.85GiB 0:00:04 [ 393MiB/s] [ 393MiB/s] [ <=> ]
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C dd conv=lcase; ) # BSD-dd
1.92s user 4.24s system 127% cpu 4.817 total
85759a34df874966d096c6529dbfb9d5 stdin
————————————————————————————————————————
mawk2 can be made artificially faster than perl5 by having the file load all at once, and doing tolower() for all 1.85 GB in a single function call ::
( time ( pvE0 < "${m3t}" |
LC_ALL=C mawk2 '
BEGIN { FS = RS = "^$" }
END { print tolower($(ORS = "")) }'
) | pvE9 ) | xxh128sum| lgp3
in0: 1.85GiB 0:00:00 [3.35GiB/s] [3.35GiB/s] [============>] 100%
out9: 1.85GiB 0:00:02 [ 647MiB/s] [ 647MiB/s] [ <=> ]
( pvE 0.1 in0 < "${m3t}" | LC_ALL=C mawk2 ; )
1.39s user 1.31s system 91% cpu 2.935 total
85759a34df874966d096c6529dbfb9d5 stdin
For Bash versions earlier than 4.0, this version should be fastest (as it doesn't fork/exec any commands):
function string.monolithic.tolower
{
local __word=$1
local __len=${#__word}
local __char
local __octal
local __decimal
local __result
for (( i=0; i<__len; i++ ))
do
__char=${__word:$i:1}
case "$__char" in
[A-Z] )
printf -v __decimal '%d' "'$__char"
printf -v __octal '%03o' $(( $__decimal ^ 0x20 ))
printf -v __char \\$__octal
;;
esac
__result+="$__char"
done
REPLY="$__result"
}
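The `^ 0x20` in the function above works because ASCII upper- and lowercase letters differ only in bit 5 (value 32), so XOR with 0x20 flips the case of a letter. A minimal sketch of just that step; flip_case_char is a hypothetical helper and assumes its argument is a single ASCII letter:

```shell
# Hypothetical helper: flip the case of one ASCII letter using XOR 0x20.
flip_case_char() {
    local code octal
    printf -v code '%d' "'$1"                    # character -> ASCII code
    printf -v octal '%03o' $(( code ^ 0x20 ))    # flip bit 5, format as octal
    printf "\\$octal"                            # octal escape -> character
}

flip_case_char A; flip_case_char z; echo   # prints: aZ
```

Because XOR toggles rather than sets the bit, the caller has to make sure only uppercase letters reach it, which is what the [A-Z] case branch does in the function above.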
technosaurus's answer had potential too, although it did run properly for me.
In spite of how old this question is, and although this is similar to the answer by technosaurus, I had a hard time finding a solution that was portable across most platforms (that I use) as well as older versions of bash. I have also been frustrated with arrays, functions, and the use of prints, echos, and temporary files to retrieve trivial variables. This has worked very well for me so far, so I thought I would share it.
My main testing environments are:
GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu)
GNU bash, version 3.2.57(1)-release (sparc-sun-solaris2.10)
lcs="abcdefghijklmnopqrstuvwxyz"
ucs="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
input="Change Me To All Capitals"
for (( i=0; i<"${#input}"; i++ )) ; do :
for (( j=0; j<"${#lcs}"; j++ )) ; do :
if [[ "${input:$i:1}" == "${lcs:$j:1}" ]] ; then
input="${input/${input:$i:1}/${ucs:$j:1}}"
fi
done
done
Simple C-style for loop to iterate through the strings.
If you have not seen anything like the line below before, this is where I learned it. The line checks whether the char ${input:$i:1} (lower case) exists in input and, if so, replaces its first occurrence with the given char ${ucs:$j:1} (upper case) and stores the result back into input.
input="${input/${input:$i:1}/${ucs:$j:1}}"
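For anyone new to that expansion, here is a tiny standalone illustration of the single-slash and double-slash forms (the variable names are only for illustration):

```shell
s="banana"
first_only=${s/a/A}    # replaces the first match only
all=${s//a/A}          # replaces every match
printf '%s\n' "$first_only" "$all"
# bAnana
# bAnAnA
```

The loop above gets away with the single-slash form because any earlier occurrences of the same letter have already been uppercased by the time position i is reached.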
To store the transformed string into a variable. Following worked for me -
$SOURCE_NAME to $TARGET_NAME
TARGET_NAME="`echo $SOURCE_NAME | tr '[:upper:]' '[:lower:]'`"
Based on Dejay Clayton's excellent solution, I've generalized the uppercase/lowercase to a transpose function (independently useful), returned the result in a variable (faster/safer), and added a Bash v4+ optimization:
pkg::transpose() { # <retvar> <string> <from> <to>
local __r=$2 __m __p
while [[ ${__r} =~ ([$3]) ]]; do
__m="${BASH_REMATCH[1]}"; __p="${3%${__m}*}"
__r="${__r//${__m}/${4:${#__p}:1}}"
done
printf -v "$1" "%s" "${__r}"
}
pkg::lowercase() { # <retvar> <string>
if (( BASH_VERSINFO[0] >= 4 )); then
printf -v "$1" "%s" "${2,,}"
else
pkg::transpose "$1" "$2" "ABCDEFGHIJKLMNOPQRSTUVWXYZ" \
"abcdefghijklmnopqrstuvwxyz"
fi
}
pkg::uppercase() { # <retvar> <string>
if (( BASH_VERSINFO[0] >= 4 )); then
printf -v "$1" "%s" "${2^^}"
else
pkg::transpose "$1" "$2" "abcdefghijklmnopqrstuvwxyz" \
"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
fi
}
To keep things simple I didn't add any set -e support (or any error checking really)... but otherwise it generally follows shellguide and pkg::transpose() tries to avoid any likely variable name clashes for the printf -v
isn't this cleaner than a full chain of shell variable(s) + declare + eval + single quote escapes + echo + pipe(s) + tr just to avoid a sub-shell or external process ?
# ***MUCH*** faster for ASCII only
mawk '$!NF = toupper($_)' <<< 'abcxyz'
ABCXYZ
gawk '$_ = tolower($_)' <<< 'FAB-EDC'
fab-edc
and Unicodes are just as easy to work with, without having to "unpack" or "encode" or "decode" bytes
printf '%s' "${test_utf8}" | ……
1 ÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øù
úûüýþÿĀāĂ㥹ĆćĈĉĊċČčĎďĐđĒēĔĕĖėĘęĚěĜĝĞğĠġĢģĤĥĦħĨĩĪī
ĬĭĮįİıIJijĴĵĶķĸĹĺĻļĽľĿŀŁłŃńŅņŇňʼnŊŋŌōŎŏŐőŒœŔŕŖŗŘřŚśŜŝ
ŞşŠšŢţŤťŦŧŨũŪūŬŭŮůŰűŲųŴŵŶŷŸŹźŻżŽžſƀƁƂƃƄƅƆƇƈƉƊƋƌƍƎƏ
ƐƑƒƓƔƕƖƗƘƙƚƛƜƝƞƟƠơƢƣƤƥƦƧƨƩƪƫƬƭƮƯưƱƲƳƴƵƶƷƸƹƺƻƼƽƾƿǀǁǂ
ǃDŽDždžLJLjljNJNjnjǍǎǏǐǑǒǓǔǕǖǗǘǙǚǛǜǝǞǟǠǡǢǣǤǥǦǧǨǩǪǫǬǭǮǯǰDZDzdzǴ
…… | gawk '$_ = toupper($_)'
1 ÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ÷ØÙ
ÚÛÜÝÞŸĀĀĂĂĄĄĆĆĈĈĊĊČČĎĎĐĐĒĒĔĔĖĖĘĘĚĚĜĜĞĞĠĠĢĢĤĤĦĦĨĨĪĪ
ĬĬĮĮİIIJIJĴĴĶĶĸĹĹĻĻĽĽĿĿŁŁŃŃŅŅŇŇʼnŊŊŌŌŎŎŐŐŒŒŔŔŖŖŘŘŚŚŜŜ
ŞŞŠŠŢŢŤŤŦŦŨŨŪŪŬŬŮŮŰŰŲŲŴŴŶŶŸŹŹŻŻŽŽSƀƁƂƂƄƄƆƇƇƉƊƋƋƍƎƏ
ƐƑƑƓƔǶƖƗƘƘƚƛƜƝȠƟƠƠƢƢƤƤƦƧƧƩƪƫƬƬƮƯƯƱƲƳƳƵƵƷƸƸƺƻƼƼƾǷǀǁǂ
ǃDŽDžDŽLJLjLJNJNjNJǍǍǏǏǑǑǓǓǕǕǗǗǙǙǛǛƎǞǞǠǠǢǢǤǤǦǦǨǨǪǪǬǬǮǮǰDZDzDZǴ
[dev@localhost ~]$ TEST=STRESS2
[dev@localhost ~]$ echo ${TEST,,}
stress2
Use this command to do the same; it converts uppercase letters to lowercase (the \L escape is a GNU sed extension):
sed 's/[A-Z]/\L&/g' <filename>
Related
I want to uppercase just the first character in my string with bash.
foo="bar";
//uppercase first character
echo $foo;
should print "Bar";
One way with bash (version 4+):
foo=bar
echo "${foo^}"
prints:
Bar
foo="$(tr '[:lower:]' '[:upper:]' <<< ${foo:0:1})${foo:1}"
One way with sed:
echo "$(echo "$foo" | sed 's/.*/\u&/')"
Prints:
Bar
$ foo="bar";
$ foo=`echo ${foo:0:1} | tr '[a-z]' '[A-Z]'`${foo:1}
$ echo $foo
Bar
To capitalize first word only:
foo='one two three'
foo="${foo^}"
echo $foo
One two three
To capitalize every word in the variable:
foo="one two three"
foo=( $foo ) # without quotes
foo="${foo[@]^}"
echo $foo
One Two Three
(works in bash 4+)
Using awk only
foo="uNcapItalizedstrIng"
echo $foo | awk '{print toupper(substr($0,1,1))tolower(substr($0,2))}'
Here is the "native" text tools way:
#!/bin/bash
string="abcd"
first=`echo $string|cut -c1|tr [a-z] [A-Z]`
second=`echo $string|cut -c2-`
echo $first$second
just for fun here you are :
foo="bar";
echo $foo | awk '{$1=toupper(substr($1,1,1))substr($1,2)}1'
# or
echo ${foo^}
# or
echo $foo | head -c 1 | tr '[a-z]' '[A-Z]'; echo $foo | tail -c +2
# or (GNU sed)
echo $foo | sed -e 's/^./\U&/'
It can be done in pure bash with bash-3.2 as well:
# First, get the first character.
fl=${foo:0:1}
# Safety check: it must be a letter :).
if [[ ${fl} == [a-z] ]]; then
# Now, obtain its octal value using printf (builtin).
ord=$(printf '%o' "'${fl}")
# Fun fact: [a-z] maps onto 0141..0172. [A-Z] is 0101..0132.
# We can use decimal '- 40' to get the expected result!
ord=$(( ord - 40 ))
# Finally, map the new value back to a character.
fl=$(printf '%b' '\'${ord})
fi
echo "${fl}${foo:1}"
This works too...
FooBar=baz
echo ${FooBar^^${FooBar:0:1}}
=> Baz
FooBar=baz
echo ${FooBar^^${FooBar:1:1}}
=> bAz
FooBar=baz
echo ${FooBar^^${FooBar:2:2}}
=> baZ
And so on.
Sources:
Bash Manual: Shell Parameter Expansion
Full Bash Guide: Parameters
Bash Hacker's Wiki Parameter Expansion
Introductions/Tutorials:
Cyberciti.biz: 8. Convert to upper to lower case or vice versa
Opensource.com: An introduction to parameter expansion in Bash
This one worked for me:
Search for all *php files in the current directory and replace the first character of each filename with a capital letter:
e.g: test.php => Test.php
for f in *php ; do mv "$f" "$(\sed 's/.*/\u&/' <<< "$f")" ; done
Alternative and clean solution for both Linux and OSX, it can also be used with bash variables
python -c "print(\"abc\".capitalize())"
returns Abc
This works in sh as far as I know, although cut -z is a GNU extension rather than part of POSIX.
upper_first.sh:
#!/bin/sh
printf "$1" | cut -c1 -z | tr -d '\0' | tr [:lower:] [:upper:]
printf "$1" | cut -c2-
cut -c1 -z ends the first string with \0 instead of \n. It gets removed with tr -d '\0'. It also works to omit the -z and use tr -d '\n' instead, but this breaks if the first character of the string is a newline.
Usage:
$ upper_first.sh foo
Foo
$
In a function:
#!/bin/sh
upper_first ()
{
printf "$1" | cut -c1 -z | tr -d '\0' | tr [:lower:] [:upper:]
printf "$1" | cut -c2-
}
old="foo"
new="$(upper_first "$old")"
echo "$new"
Posix compliant and with less sub-processes:
v="foo[Bar]"
printf "%s" "${v%"${v#?}"}" | tr '[:lower:]' '[:upper:]' && printf "%s" "${v#?}"
==> Foo[Bar]
first-letter-to-lower () {
str=""
space=" "
for i in $@
do
if [ -z $(echo $i | grep "the\|of\|with" ) ]
then
str=$str"$(echo ${i:0:1} | tr '[A-Z]' '[a-z]')${i:1}$space"
else
str=$str${i}$space
fi
done
echo $str
}
first-letter-to-upper-xc () {
v-first-letter-to-upper | xclip -selection clipboard
}
first-letter-to-upper () {
str=""
space=" "
for i in $@
do
if [ -z $(echo $i | grep "the\|of\|with" ) ]
then
str=$str"$(echo ${i:0:1} | tr '[a-z]' '[A-Z]')${i:1}$space"
else
str=$str${i}$space
fi
done
echo $str
}
first-letter-to-lower-xc(){
v-first-letter-to-lower | xclip -selection clipboard
}
Not exactly what asked but quite helpful
declare -u foo #When the variable is assigned a value, all lower-case characters are converted to upper-case.
foo=bar
echo $foo
BAR
And the opposite
declare -l foo #When the variable is assigned a value, all upper-case characters are converted to lower-case.
foo=BAR
echo $foo
bar
What if the first character is not a letter (but a tab, a space, or an escaped double quote)? We'd better test until we find a letter! So:
S=' \"ó foo bar\"'
N=0
until [[ ${S:$N:1} =~ [[:alpha:]] ]]; do N=$[$N+1]; done
#F=`echo ${S:$N:1} | tr [:lower:] [:upper:]`
#F=`echo ${S:$N:1} | sed -E -e 's/./\u&/'` #other option
F=`echo ${S:$N:1}`
F=`echo ${F^}` #pure Bash solution to "upper"
echo "$F"${S:(($N+1))} #without garbage
echo '='${S:0:(($N))}"$F"${S:(($N+1))}'=' #garbage preserved
Foo bar
= \"Foo bar=
I have a version number with three columns and two digits (xx:xx:xx). Can anyone please tell me how to increment that using shell script.
Min Value
00:00:00
Max Value
99:99:99
Sample IO
10:23:56 -> 10:23:57
62:54:99 -> 62:55:00
87:99:99 -> 88:00:00
As a one liner using awk, assuming VERSION is a variable with the version in it:
echo $VERSION | awk 'BEGIN { FS=":" } { $3++; if ($3 > 99) { $3=0; $2++; if ($2 > 99) { $2=0; $1++ } } } { printf "%02d:%02d:%02d\n", $1, $2, $3 }'
Nothing fancy (other than Bash) needed:
$ ver=87:99:99
$ echo "$ver"
87:99:99
$ printf -v ver '%06d' $((10#${ver//:}+1))
$ ver=${ver%????}:${ver: -4:2}:${ver: -2:2}
$ echo "$ver"
88:00:00
We just use the parameter expansion ${ver//:} to remove the colons: we're then left with a usual decimal number, increment it and reformat it using printf; then use some more parameter expansions to group the digits.
This assumes that ver has already been thoroughly checked (with a regex or glob).
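A sketch of what that check could look like when wrapped around the same arithmetic; bump_version and its return codes are my own hypothetical choices, not part of the answer above:

```shell
# Hypothetical wrapper: validate xx:xx:xx, then increment as a 6-digit decimal.
bump_version() {
    local ver=$1
    local re='^[0-9]{2}:[0-9]{2}:[0-9]{2}$'
    if ! [[ $ver =~ $re ]]; then
        return 1                      # malformed input
    fi
    if [[ $ver == 99:99:99 ]]; then
        return 2                      # already at the maximum, would overflow
    fi
    # 10# forces base 10, so values like 000008 are not read as octal
    printf -v ver '%06d' $(( 10#${ver//:/} + 1 ))
    printf '%s\n' "${ver:0:2}:${ver:2:2}:${ver:4:2}"
}

bump_version 87:99:99   # prints: 88:00:00
```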
It's easy, just needs some little math tricks and bc command, here is how:
#!/bin/bash
# read VERSION from $1 into VER
IFS=':' read -r -a VER <<< "$1"
# increment by 1
INCR=$(echo "ibase=10; ${VER[0]}*100*100+${VER[1]}*100+${VER[2]}+1"|bc)
# prepend zeros
INCR=$(printf "%06d" ${INCR})
# output the result
echo ${INCR:0:2}:${INCR:2:2}:${INCR:4:2}
If you need overflow checking, you can do it with a trick similar to the INCR statement.
This basically works; 10# forces base 10 so that fields such as 08 are not parsed as octal, and printf restores the zero padding:
IN=43:99:99
F1=$(echo "$IN" | cut -d: -f1)
F2=$(echo "$IN" | cut -d: -f2)
F3=$(echo "$IN" | cut -d: -f3)
F3=$(( 10#$F3 + 1 ))
if [ "$F3" -gt 99 ] ; then F3=0 ; F2=$(( 10#$F2 + 1 )) ; fi
if [ "$F2" -gt 99 ] ; then F2=0 ; F1=$(( 10#$F1 + 1 )) ; fi
OUT=$(printf '%02d:%02d:%02d' "$((10#$F1))" "$((10#$F2))" "$((10#$F3))")
echo "$OUT"
Try this one-liner:
awk '{gsub(/:/,"");$0++;gsub(/../,"&:");sub(/:$/,"")}7'
tests:
kent$ awk '{gsub(/:/,"");$0++;gsub(/../,"&:");sub(/:$/,"")}7' <<< "22:33:99"
22:34:00
kent$ awk '{gsub(/:/,"");$0++;gsub(/../,"&:");sub(/:$/,"")}7' <<< "22:99:99"
23:00:00
kent$ awk '{gsub(/:/,"");$0++;gsub(/../,"&:");sub(/:$/,"")}7' <<< "22:99:88"
22:99:89
Note: corner cases were not tested. In particular, leading zeros are lost, e.g. 00:00:05 becomes 6.
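A sketch of an awk variant that keeps the two-digit padding for inputs such as 00:00:05 by formatting each field with printf. The wrapper name increment_awk is made up, and the input is assumed to be well-formed xx:xx:xx below 99:99:99.

```shell
# Sketch: increment xx:xx:xx in awk and keep the zero padding.
increment_awk() {
    printf '%s\n' "$1" | awk -F: '{
        n = $1 * 10000 + $2 * 100 + $3 + 1   # treat the three fields as one number
        printf "%02d:%02d:%02d\n", int(n / 10000), int(n / 100) % 100, n % 100
    }'
}

increment_awk 00:00:05   # prints 00:00:06
increment_awk 22:99:99   # prints 23:00:00
```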
So I have this function with the following output:
AGsg4SKKs74s62#
I need to find a way to scramble the characters without deleting anything, i.e. all characters must still be present after I scramble them.
I can only use bash utilities, including awk and sed.
echo 'AGsg4SKKs74s62#' | sed 's/./&\n/g' | shuf | tr -d "\n"
Output (e.g.):
S7s64#2gKAGsKs4
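The \n in the sed replacement is a GNU extension. A sketch of the same idea with fold -w1 doing the splitting instead, assuming ASCII input and that shuf (GNU coreutils) is installed; the function name scramble_pipe is made up:

```shell
# Sketch: one character per line, shuffle the lines, join them again.
scramble_pipe() {
    printf '%s\n' "$1" | fold -w1 | shuf | tr -d '\n'
    echo   # restore the trailing newline removed by tr
}

scramble_pipe 'AGsg4SKKs74s62#'
```

The output differs on every run, but it always has the same characters as the input.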
Here's a pure Bash function that does the job:
scramble() {
# $1: string to scramble
# return in variable scramble_ret
local a=$1 i
scramble_ret=
while((${#a})); do
((i=RANDOM%${#a}))
scramble_ret+=${a:i:1}
a=${a::i}${a:i+1}
done
}
See if it works:
$ scramble 'AGsg4SKKs74s62#'
$ echo "$scramble_ret"
G4s6s#2As74SgKK
Looks all right.
I know that you haven't mentioned Perl but it could be done like this:
perl -MList::Util=shuffle -F'' -lane 'print shuffle @F' <<<"AGsg4SKKs74s62#"
-a enables auto-split mode and -F'' sets the field separator to an empty string, so each character goes into a separate array element. The array is shuffled using the function provided by the core module List::Util.
Here is my solution; usage: shuffleString "any-string". I did not take performance into consideration, since this is bash.
function shuffleString() {
local line="$1"
for i in $(seq 1 ${#line}); do
local p=$(expr $RANDOM % ${#line})
if [[ $p -lt $i ]]; then
local line="${line:0:$p}${line:$i:1}${line:$p+1:$i-$p-1}${line:$p:1}${line:$i+1}"
elif [[ $p -gt $i ]]; then
local line="${line:0:$i}${line:$p:1}${line:$i+1:$p-$i-1}${line:$i:1}${line:$p+1}"
fi
done
echo "$line"
}
I'm trying to write a script where the user enters a number as a parameter and the script calculates the sum of all the digits e.g.,
./myScript 963
18
So the script takes the string "963" and adds all the characters in the string 9+6+3=18. I'm thinking I could get the length of the string and use a loop to add all the indexes of the string together but I cannot figure out how to get an index of the string without already knowing the character you're looking for.
I was able to break the string up using the following command,
echo "963" | fold -w1
9
6
3
But I'm not sure if/how I could pipe | or redirect > the results into a variable and add it to a total each time.
How can I get a character of a string at a particular index?
Update:
Example 1:
$1=59 then the operation is
5+9=14
Example 2:
$1=2222 then the operation is
2+2+2+2=8
All the characters in the string are added to a total sum.
The following script loops through all of the digits in the input string and adds them together:
#!/bin/bash
s="$1"
for ((i=0; i<${#s}; ++i)); do
((t+=${s:i:1}))
done
echo "sum of digits: $t"
The syntax ${s:i:1} extracts a substring of length 1 from position i in the string $s.
Output:
$ ./add.sh 963
sum of digits: 18
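To see the ${s:i:1} indexing on its own, here is a small sketch in Bash (the variable name s just mirrors the script above):

```shell
s=963
echo "${s:0:1}"   # first character: 9
echo "${s:1:1}"   # second character: 6
echo "${s: -1}"   # last character: 3 (the space before -1 is required)
echo "${#s}"      # length of the string: 3
```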
If you wanted to continue adding together the digits until there was only one remaining, you could do this instead:
#!/bin/bash
s="$1"
while (( ${#s} > 1 )); do
t=0
for ((i=0; i<${#s}; ++i)); do
((t+=${s:i:1}))
done
echo "iteration $((++n)): $t"
s=$t
done
echo "final result: $s"
The outer while loop continues as long as the length of the string is greater than 1. The inner for loop adds together each digit in the string.
Output:
$ ./add.sh 963
iteration 1: 18
iteration 2: 9
final result: 9
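If only the final single digit is needed, the repeated sum can also be computed directly: for a positive integer n it equals 1 + (n - 1) % 9. This is a different technique from the loop above, shown only as a sketch; the function name digital_root is made up, and it assumes the number fits in Bash's 64-bit arithmetic.

```shell
# Sketch: repeated digit sum via the modulo-9 shortcut instead of a loop.
digital_root() {
    local n=$1
    # Zero is the only input whose result is 0.
    (( 10#$n == 0 )) && { echo 0; return; }
    echo $(( 1 + (10#$n - 1) % 9 ))
}

digital_root 963   # prints 9
```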
Not that you asked for it but there are many ways to sum all of the digits in a string. Here's another one using Perl:
$ perl -MList::Util=sum -F -anE 'say sum @F' <<<639
18
List::Util is a core module in Perl. The sum subroutine does a reduction sum on a list to produce a single value. -a enables auto-split mode so the input is split into the array @F. -F is used to set the field delimiter (in this case it is blank, so every character counts as a separate field). -n processes every line of input one at a time and -E is used to enter a Perl one-liner but with newer features (such as say) enabled. say is like print but a newline is added to the output.
If you're not familiar with the <<< syntax, it is equivalent to echo 639 | perl ....
This does not use string subscripting, but it computes the desired sum:
number=963
sum=0
for d in `echo "$number" | sed 's,\(.\), \1,g'`
do
sum=$(($sum + $d))
done
echo $sum
Output: 18
I would do this:
num="963"
echo "$num" | grep -o . | paste -sd+ - | bc
#or using your fold
echo "$num" | fold -w1 | paste -sd+ - | bc
Both print
18
Explanation
grep -o . prints each digit of the number on its own line, just as fold -w1 does
paste -sd+ - merges those lines into one line using + as the delimiter, producing a calculation string such as 9+6+3
bc does the calculation
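The stages can be looked at one at a time. In this sketch the last step uses shell arithmetic instead of bc, purely so that it has no dependency beyond grep and paste; unlike bc, that limits it to sums that fit in the shell's integers. The variable names digits and expr_str are made up.

```shell
num=963
digits=$(echo "$num" | grep -o .)           # one digit per line
expr_str=$(echo "$digits" | paste -sd+ -)   # lines joined with +
echo "$expr_str"                            # prints 9+6+3
echo "$(( $expr_str ))"                     # prints 18
```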
If you want a script, e.g. digadd.sh, use
grep -o . <<<"$1" | paste -sd+ - | bc
using it
$ bash digadd.sh #nothing
$ #will return nothing
$ bash digadd.sh 1273617617273450359345873647586378242349239471289638982
268
$
For fun, doing this in loop until the result is only 1 digit
num=12938932923849028940802934092840924
while (( ${#num} > 1 ))
do
echo -n "sum of digits for $num is:"
num=$(echo "$num" | grep -o . | paste -sd+ - | bc)
echo $num
done
echo "final result: $num"
prints
sum of digits for 12938932923849028940802934092840924 is:159
sum of digits for 159 is:15
sum of digits for 15 is:6
final result: 6
Another fun variant, which extracts all the digits from any string, is:
grep -oP '\d' <<<"$1" | paste -sd+ - | bc
so using it in the script digadd.sh like
bash digadd.sh 9q6w3
produces
18
To answer the question in the title: to get the Nth character of any string you can use
echo "${string:POSITION:length}" #position counts from 0
e.g. to get the 1st digit
echo "${num:0:1}"
You can use cut with the -c option to get the character at any position (counting from 1). For example:
echo "963" | cut -c1
Outputs: 9
Using awk:
awk 'split($0,a,""){for(i in a) sum+=a[i]}END{print sum}' <<<"$1"
This can be done using substring manipulation (supported by busybox ash, but not posix sh compliant)
#!/bin/ash
i=0
sum=0
while [ $i -lt ${#1} ]; do
sum=$((sum+${1:i:1}));
i=$((i+1))
done
echo $sum
If you really must have a posix shell compliant version, you can use:
#!/bin/sh
sum=0
A=$1
while [ ${#B} -lt ${#A} ];do
B=$B?
done
while [ "$A" ]; do
B=${B#?*}
sum=$((sum+${A%$B}))
A=${A#?*}
done
echo $sum
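Another way to peel off one character at a time with only POSIX parameter expansion is the ${rest%"${rest#?}"} idiom. A sketch, assuming the argument contains only digits; the function and variable names are made up, and the variables are global because POSIX sh has no local:

```shell
# Sketch: sum the digits of $1 using only POSIX parameter expansion.
digit_sum() {
    rest=$1
    sum=0
    while [ -n "$rest" ]; do
        tail=${rest#?}           # everything after the first character
        first=${rest%"$tail"}    # strip that tail, leaving the first character
        sum=$((sum + first))
        rest=$tail
    done
    echo "$sum"
}

digit_sum 963   # prints 18
```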
For example:
s1="my_foo"
s2="not_my_bar"
the desired result would be my_o. How do I do this in bash?
My solution below uses fold to break the string into one character per line, sort to sort the lists, comm to compare the two strings and finally tr to delete the new line characters
comm -12 <(fold -w1 <<< $s1 | sort -u) <(fold -w1 <<< $s2 | sort -u) | tr -d '\n'
Alternatively, here is a pure Bash solution (which also maintains the order of the characters). It iterates over the first string and checks if each character is present in the second string.
s="temp_foo_bar"
t="temp_bar"
i=0
while [ $i -ne ${#s} ]
do
c=${s:$i:1}
if [[ $result != *$c* && $t == *$c* ]]
then
result=$result$c
fi
((i++))
done
echo $result
prints: temp_bar
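The same loop can be wrapped in a function. In this sketch (the name common_chars is made up) "$c" is quoted inside the patterns, so that a character such as ? or * in the input is compared literally instead of acting as a wildcard:

```shell
# Sketch: characters of $1 that also occur in $2, in order, without repeats.
common_chars() {
    local s=$1 t=$2 result= c i
    for (( i = 0; i < ${#s}; i++ )); do
        c=${s:i:1}
        # Quoting "$c" makes glob characters match themselves only.
        if [[ $result != *"$c"* && $t == *"$c"* ]]; then
            result+=$c
        fi
    done
    printf '%s\n' "$result"
}

common_chars my_foo not_my_bar   # prints my_o
common_chars 'a?b' xyz           # prints an empty line
```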
Assuming the strings do not contain embedded newlines:
s1='my_foo' s2='my_bar'
intersect=$(
comm -12 <(
fold -w1 <<< "$s1" |
sort -u
) <(
fold -w1 <<< "$s2" |
sort -u
) |
tr -d \\n
)
printf '%s\n' "$intersect"
And another one:
tr -dc "$s2" <<< "$s1"
a late entry, I've just found this page:
echo "$str2" |
awk 'BEGIN{FS=""}
{ n=0; while(n<=NF) {
if ($n == substr(test,n,1)) { if(!found[$n]) printf("%c",$n); found[$n]=1;} n++;
} print ""}' test="$str1"
And another one; this one builds a regexp for matching (note: it doesn't work with special characters, but that's not hard to fix with another sed)
echo "$str1" |
grep -E -o ^`echo -n "$str2" | sed 's/\(.\)/(|\1/g'; echo "$str2" | sed 's/./)/g'`
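Should be a portable solution (it prints the common prefix of the two strings; the ${s1:0:1} substring syntax works in bash, ksh, and zsh, but is not POSIX sh):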
s1="my_foo"
s2="my_bar"
while [ -n "$s1" -a -n "$s2" ]
do
if [ "${s1:0:1}" = "${s2:0:1}" ]
then
printf %s "${s1:0:1}"
else
break
fi
s1="${s1:1:${#s1}}"
s2="${s2:1:${#s2}}"
done
A solution using a single sed execution:
echo -e "$s1\n$s2" | sed -e 'N;s/^/\n/;:begin;s/\n\(.\)\(.*\)\n\(.*\)\1\(.*\)/\1\n\2\n\3\4/;t begin;s/\n.\(.*\)\n\(.*\)/\n\1\n\2/;t begin;s/\n\n.*//'
Like any cryptic sed script, it needs an explanation, given here in the form of a sed script file that can be run with echo -e "$s1\n$s2" | sed -f script:
# Read the next line so s1 and s2 are in the pattern space only separated by a \n.
N
# Put a \n at the beginning of the pattern space.
s/^/\n/
# During the script execution, the pattern space will contain <result so far>\n<what is left of s1>\n<what is left of s2>.
:begin
# If the 1st char of s1 is found in s2, remove it from s1 and s2, append it to the result and do this again until it fails.
s/\n\(.\)\(.*\)\n\(.*\)\1\(.*\)/\1\n\2\n\3\4/
t begin
# When previous substitution fails, remove 1st char of s1 and try again to find 1st char of S1 in s2.
s/\n.\(.*\)\n\(.*\)/\n\1\n\2/
t begin
# When previous substitution fails, s1 is empty so remove the \n and what is left of s2.
s/\n\n.*//
If you want to remove duplicates, add the following at the end of the script:
:end;s/\(.\)\(.*\)\1/\1\2/;t end
Edit: I realize that dogbane's pure shell solution has the same algorithm, and is probably more efficient.
comm=""
for ((i=0;i<${#s1};i++))
do
if test "${s1:$i:1}" = "${s2:$i:1}"
then
comm=${comm}${s1:$i:1}
fi
done
Since everyone loves perl one-liners full of punctuation:
perl -e '$a{$_}++ for split "",shift; $b{$_}++ for split "",shift; for (sort keys %a){print if defined $b{$_}}' my_foo not_my_bar
Creates hashes %a and %b from the input strings.
Prints any characters common to both strings.
outputs:
_moy
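For comparison, roughly the same idea in Bash 4+, using an associative array in place of the Perl hash. This is only a sketch: the names common_sorted and in_b are made up, the x key prefix is just there to keep keys non-empty and uniform, and input containing characters that are special inside array subscripts has not been considered.

```shell
# Sketch (Bash 4+): record the characters of $2 in an associative array,
# then print the characters of $1 that were recorded, sorted and de-duplicated.
common_sorted() {
    local a=$1 b=$2 c i
    local -A in_b=()
    for (( i = 0; i < ${#b}; i++ )); do
        in_b[x${b:i:1}]=1
    done
    for (( i = 0; i < ${#a}; i++ )); do
        c=${a:i:1}
        [[ ${in_b[x$c]+set} ]] && printf '%s\n' "$c"
    done | LC_ALL=C sort -u | tr -d '\n'
    echo
}

common_sorted my_foo not_my_bar   # prints _moy
```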
"flower","flow","flight" --> output fl
s="flower"
t="flow"
i=0
while [ $i -ne ${#s} ]
do
c=${s:$i:1}
if [[ $result != *$c* && $t == *$c* ]]
then
result=$result$c
fi
((i++))
done
echo $result
p=$result
q="flight"
j=0
while [ $j -ne ${#p} ]
do
c1=${p:$j:1}
if [[ $result1 != *$c1* && $q == *$c1* ]]
then
result1=$result1$c1
fi
((j++))
done
echo $result1