How to compare strings in golang? [closed]

I want to make a function that calculates the length of the common segment (starting from the beginning) in two strings. For example:
foo:="Makan"
bar:="Makon"
The result should be 3.
foo:="Indah"
bar:="Ihkasyandehlo"
The result should be 1.

It's not clear what you are asking because you limited your test cases to ASCII characters.
I've added a Unicode test case and I've included answers for bytes, runes, or both.
play.golang.org:
package main
import (
"fmt"
"unicode/utf8"
)
func commonBytes(s, t string) (bytes int) {
if len(s) > len(t) {
s, t = t, s
}
i := 0
for ; i < len(s); i++ {
if s[i] != t[i] {
break
}
}
return i
}
func commonRunes(s, t string) (runes int) {
if len(s) > len(t) {
s, t = t, s
}
i := 0
for ; i < len(s); i++ {
if s[i] != t[i] {
break
}
}
return utf8.RuneCountInString(s[:i])
}
func commonBytesRunes(s, t string) (bytes, runes int) {
if len(s) > len(t) {
s, t = t, s
}
i := 0
for ; i < len(s); i++ {
if s[i] != t[i] {
break
}
}
return i, utf8.RuneCountInString(s[:i])
}
func main() {
Tests := []struct {
word1, word2 string
}{
{"Makan", "Makon"},
{"Indah", "Ihkasyandehlo"},
{"日本語", "日本語"},
}
for _, test := range Tests {
fmt.Println("Words: ", test.word1, test.word2)
fmt.Println("Bytes: ", commonBytes(test.word1, test.word2))
fmt.Println("Runes: ", commonRunes(test.word1, test.word2))
fmt.Print("Bytes & Runes: ")
fmt.Println(commonBytesRunes(test.word1, test.word2))
}
}
Output:
Words: Makan Makon
Bytes: 3
Runes: 3
Bytes & Runes: 3 3
Words: Indah Ihkasyandehlo
Bytes: 1
Runes: 1
Bytes & Runes: 1 1
Words: 日本語 日本語
Bytes: 9
Runes: 3
Bytes & Runes: 9 3
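One caveat with counting runes in a byte-wise prefix: if two strings diverge in the middle of a multi-byte character, the prefix ends on a partial rune and the rune count can come out too high. Below is a minimal sketch of a rune-by-rune comparison that avoids this. The name commonPrefixRunes is hypothetical, and treating an invalid byte as the end of the prefix is an assumption, not something the question specifies.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// commonPrefixRunes is a hypothetical helper (not from the answer above).
// It counts the leading runes shared by s and t, decoding whole runes so a
// match can never end in the middle of a multi-byte character.
func commonPrefixRunes(s, t string) int {
	n := 0
	for len(s) > 0 && len(t) > 0 {
		r1, w1 := utf8.DecodeRuneInString(s)
		r2, w2 := utf8.DecodeRuneInString(t)
		// Assumption: an invalid byte (RuneError with width 1) ends the prefix.
		if r1 == utf8.RuneError && w1 == 1 {
			break
		}
		// Comparing widths as well keeps a real U+FFFD (width 3) from
		// matching an invalid byte (width 1) in the other string.
		if r1 != r2 || w1 != w2 {
			break
		}
		s, t = s[w1:], t[w2:]
		n++
	}
	return n
}

func main() {
	fmt.Println(commonPrefixRunes("Makan", "Makon"))
	fmt.Println(commonPrefixRunes("日本語", "日本語"))
	// "日" and "旦" share their first two bytes but are different runes.
	fmt.Println(commonPrefixRunes("日", "旦"))
}
```

The last call is the interesting one: the two characters share leading bytes, yet no whole rune is common, so the function reports 0.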

Note that if you were working with Unicode characters, the result could be quite different.
Try for instance using utf8.DecodeRuneInString().
See this example:
package main
import "fmt"
import "unicode/utf8"
func index(s1, s2 string) int {
res := 0
for i, w := 0, 0; i < len(s2); i += w {
if i >= len(s1) {
return res
}
runeValue1, width := utf8.DecodeRuneInString(s1[i:])
runeValue2, width := utf8.DecodeRuneInString(s2[i:])
if runeValue1 != runeValue2 {
return res
}
if runeValue1 == utf8.RuneError || runeValue2 == utf8.RuneError {
return res
}
w = width
res = i + w
}
return res
}
func main() {
foo := "日本本a語"
bar := "日本本b語"
fmt.Println(index(foo, bar))
foo = "日本語"
bar = "日otest"
fmt.Println(index(foo, bar))
foo = "\xF0"
bar = "\xFF"
fmt.Println(index(foo, bar))
}
Here, the result would be:
9 (3 common runes of width '3')
3 (1 rune of width '3')
0 (invalid rune, meaning utf8.RuneError)

Do you mean something like this? Please note that it compares byte by byte, so it handles only ASCII, not multi-byte UTF-8.
package main
import (
"fmt"
)
func equal(s1, s2 string) int {
eq := 0
if len(s1) > len(s2) {
s1, s2 = s2, s1
}
for key := range s1 {
if s1[key] == s2[key] {
eq++
} else {
break
}
}
return eq
}
func main() {
fmt.Println(equal("buzzfizz", "buzz"))
fmt.Println(equal("Makan", "Makon"))
fmt.Println(equal("Indah", "Ihkasyandehlo"))
}

Related

Split string along regex, but keep matches

I want to split a string on a regular expression, but preserve the matches.
I have tried splitting the string on a regex, but it throws away the matches. I have also tried using this, but I am not very good at translating code from language to language, let alone C#.
re := regexp.MustCompile(`\d`)
array := re.Split("ab1cd2ef3", -1)
I need the value of array to be ["ab", "1", "cd", "2", "ef", "3"], but the value of array is ["ab", "cd", "ef"]. No errors.

The kind of regex support in the link you have pointed out is NOT available in Go regex package. You can read the related discussion.
What you want to achieve (as per the sample given) can be done using regex to match digits or non-digits.
package main
import (
"fmt"
"regexp"
)
func main() {
str := "ab1cd2ef3"
r := regexp.MustCompile(`(\d|[^\d]+)`)
fmt.Println(r.FindAllStringSubmatch(str, -1))
}
Playground: https://play.golang.org/p/L-ElvkDky53
Output:
[[ab ab] [1 1] [cd cd] [2 2] [ef ef] [3 3]]
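If a flat slice is wanted rather than a slice of submatch pairs, the same idea can be sketched with FindAllString and no capture group. The helper name splitKeep is made up for illustration, and the pattern assumes, as in the question, that the separators are single digits.

```go
package main

import (
	"fmt"
	"regexp"
)

// digitOrRun matches either one digit or a run of non-digits.
var digitOrRun = regexp.MustCompile(`\d|\D+`)

// splitKeep is a hypothetical helper: it returns the pieces of s in order,
// keeping each digit as its own element instead of discarding it.
func splitKeep(s string) []string {
	return digitOrRun.FindAllString(s, -1)
}

func main() {
	fmt.Printf("%q\n", splitKeep("ab1cd2ef3"))
}
```

For the sample input this prints ["ab" "1" "cd" "2" "ef" "3"], which is the shape the question asks for.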

I don't think this is possible with the current regexp package, but the Split could be easily extended to such behavior.
This should work for your case:
func Split(re *regexp.Regexp, s string, n int) []string {
if n == 0 {
return nil
}
matches := re.FindAllStringIndex(s, n)
strings := make([]string, 0, len(matches))
beg := 0
end := 0
for _, match := range matches {
if n > 0 && len(strings) >= n-1 {
break
}
end = match[0]
if match[1] != 0 {
strings = append(strings, s[beg:end])
}
beg = match[1]
// This also appends the current match
strings = append(strings, s[match[0]:match[1]])
}
if end != len(s) {
strings = append(strings, s[beg:])
}
return strings
}

A crude solution: insert a separator around each match in the string, then split on that separator.
package main
import (
"fmt"
"regexp"
"strings"
)
func main() {
re := regexp.MustCompile(`\d+`)
input := "ab1cd2ef3"
sep := "|"
indexes := re.FindAllStringIndex(input, -1)
fmt.Println(indexes)
move := 0
for _, v := range indexes {
p1 := v[0] + move
p2 := v[1] + move
input = input[:p1] + sep + input[p1:p2] + sep + input[p2:]
move += 2
}
result := strings.Split(input, sep)
fmt.Println(result)
}

You can use a bufio.Scanner:
package main
import (
"bufio"
"strings"
)
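// digit is a bufio.SplitFunc that emits each digit as its own token and
// each run of non-digits before a digit as another token.
func digit(data []byte, eof bool) (int, []byte, error) {
for i, b := range data {
if '0' <= b && b <= '9' {
if i > 0 {
return i, data[:i], nil
}
return 1, data[:1], nil
}
}
// No digit found: at EOF, emit whatever is left instead of dropping it.
if eof && len(data) > 0 {
return len(data), data, nil
}
return 0, nil, nil
}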
func main() {
s := bufio.NewScanner(strings.NewReader("ab1cd2ef3"))
s.Split(digit)
for s.Scan() {
println(s.Text())
}
}
https://golang.org/pkg/bufio#Scanner.Split

Is there a better way to insert "|" into a binary string representation to get 10|000|001?

Is there a better way to insert "|" into a string
given a binary string representation of decimal 200 = 11001000
this function returns a string = 11|001|000
While this function works, it seems very kludgy. Why is it so
hard in Go to do a simple character insertion?
func (i Binary) FString() string {
a := strconv.FormatUint(i.Get(), 2)
y := make([]string, len(a), len(a)*2)
data := []rune(a)
r := []rune{}
for i := len(data) - 1; i >= 0; i-- {
r = append(r, data[i])
}
for j := len(a) - 1; j >= 0; j-- {
y = append(y, string(r[j]))
if ((j)%3) == 0 && j > 0 {
y = append(y, "|")
}
}
return strings.Join(y, "")
}

Depends on what you call better. I'd use regular expressions.
In this case, the complexity arises from inserting separators from the right. If we padded the string so that its length was a multiple of 3, we could insert the separator from the left. And we could easily use a regular expression to insert | before every three characters. Then, we can just strip off the leading | + padding.
func (i Binary) FString() string {
a := strconv.FormatUint(i.Get(), 2)
pad_req := len(a) % 3
padding := strings.Repeat("0", (3 - pad_req))
a = padding + a
re := regexp.MustCompile("([01]{3})")
a = re.ReplaceAllString(a, "|$1")
start := len(padding) + 1
if len(padding) == 3 {
// If we padded with "000", we want to remove the `|` before *and* after it
start = 5
}
a = a[start:]
return a
}
Snippet on the Go Playground

If performance is not critical and you just want a compact version, you may copy the input digits to the output and insert a | symbol whenever a group of 3 has been written to the output.
Groups are counted from right-to-left, so when copying the digits from left-to-right, the first group might be smaller. So the counter of digits inside a group may not necessarily start from 0 in case of the first group, but from len(input)%3.
Here is an example of it:
func Format(s string) string {
b, count := &bytes.Buffer{}, len(s)%3
for i, r := range s {
if i > 0 && count == i%3 {
b.WriteRune('|')
}
b.WriteRune(r)
}
return b.String()
}
Testing it:
for i := uint64(0); i < 10; i++ {
fmt.Println(Format(strconv.FormatUint(i, 2)))
}
fmt.Println(Format(strconv.FormatInt(1234, 2)))
Output (try it on the Go Playground):
0
1
10
11
100
101
110
111
1|000
1|001
10|011|010|010
If you have to do this many times and performance does matter, then check out my answer to the question: How to fmt.Printf an integer with thousands comma
Based on that a fast solution can be:
func Format(s string) string {
if s == "" {
return "" // nothing to group; also avoids indexing s[-1] in the loop below
}
out := make([]byte, len(s)+(len(s)-1)/3)
for i, j, k := len(s)-1, len(out)-1, 0; ; i, j = i-1, j-1 {
out[j] = s[i]
if i == 0 {
return string(out)
}
if k++; k == 3 {
j, k = j-1, 0
out[j] = '|'
}
}
}
Output is the same of course. Try it on the Go Playground.

This is a partitioning problem. You can use this function:
func partition(s, separator string, pLen int) string {
if pLen < 1 || len(s) == 0 || len(separator) == 0 {
return s
}
buffer := []rune(s)
L := len(buffer)
pCount := L / pLen
result := []string{}
index := 0
for ; index < pCount; index++ {
_from := L - (index+1)*pLen
_to := L - index*pLen
result = append(result, string(buffer[_from:_to]))
}
if L%pLen != 0 {
result = append(result, string(buffer[0:L-index*pLen]))
}
for h, t := 0, len(result)-1; h < t; h, t = h+1, t-1 {
result[t], result[h] = result[h], result[t]
}
return strings.Join(result, separator)
}
And s := partition("11001000", "|", 3) will give you 11|001|000.
Here is a little test:
func TestSmokeTest(t *testing.T) {
input := "11001000"
s := partition(input, "|", 3)
if s != "11|001|000" {
t.Fail()
}
s = partition(input, "|", 2)
if s != "11|00|10|00" {
t.Fail()
}
input = "0111001000"
s = partition(input, "|", 3)
if s != "0|111|001|000" {
t.Fail()
}
s = partition(input, "|", 2)
if s != "01|11|00|10|00" {
t.Fail()
}
}

golang: bitwise operation on very long binary bit string representation

As an exercise, I am given two very long strings, each holding a binary representation. The example here is short, but the inputs can have more than 100 bits:
Example
11100
00011
With output in bitwise OR (as string)
11111
My approach was to walk through the characters of each string, OR them one pair at a time, and build a new string, but that is too slow on large inputs.
I also tried strconv.ParseInt, but it is limited to 64 bits:
num1, err:= strconv.ParseInt("11100", 2, 64)
num2, err:= strconv.ParseInt("00011", 2, 64)
res := num1 | num2
How to deal with a bitwise OR between 2 string binary representation?

You could create the resulting bitwise OR string by doing character comparisons, or you can perform arbitrary large numeric operations using math/big. Here is an example of such an operation:
package main
import "fmt"
import "math/big"
func main() {
num1 := "11100"
num2 := "00011"
var bigNum1 big.Int
var bigNum2 big.Int
var result big.Int
if _, ok := bigNum1.SetString(num1, 2); !ok {
panic("invalid num1")
}
if _, ok := bigNum2.SetString(num2, 2); !ok {
panic("invalid num2")
}
result.Or(&bigNum1, &bigNum2)
for i := result.BitLen() - 1; i >= 0; i-- {
fmt.Print(result.Bit(i))
}
fmt.Println()
}
Go Playground

While you could convert these to numbers to perform bitwise operations, if your only goal is to perform a single bitwise OR on the two strings, parsing the strings into numbers will be less efficient than simply iterating over the string to achieve your end result. Doing so would only make sense if you were performing lots of operations on the numbers in their binary form.
Example code for performing an OR operation on the strings below. Do note that this code assumes the strings are the same length as the examples in the question are, if they were of different lengths you would need to handle that as well.
package main
import "fmt"
func main() {
n1 := "1100"
n2 := "0011"
fmt.Printf("Input: %v | %v\n", n1, n2)
if len(n1) != len(n2) {
fmt.Println("Only supports strings of the same length")
return
}
result := make([]byte, len(n1))
for i := 0; i < len(n1); i++ {
switch n1[i] {
case '0':
switch n2[i] {
case '0':
result[i] = '0'
case '1':
result[i] = '1'
}
case '1':
switch n2[i] {
case '0':
result[i] = '1'
case '1':
result[i] = '1'
}
}
}
fmt.Println("Result: ", string(result))
}
http://play.golang.org/p/L3o6_jYdi1

How about this:
package main
import "fmt"
func main(){
a := "01111100"
b := "1001000110"
var longest, len_diff int
if len(a) > len(b) {
longest = len(a)
len_diff = len(a) - len(b)
} else {
longest = len(b)
len_diff = len(b) - len(a)
}
temp_slice := make([] byte, longest)
var a_start, b_start int
if len(a) > len(b) {
for i := 0; i < len_diff; i++ {
temp_slice[i] = a[i]
}
a_start = len_diff
} else {
for i := 0; i < len_diff; i++ {
temp_slice[i] = b[i]
}
b_start = len_diff
}
for i := 0; i < (longest - len_diff); i++ {
if a[a_start + i] == '1' || b[b_start + i] == '1' {
temp_slice[len_diff + i] = '1'
} else {
temp_slice[len_diff + i] = '0'
}
}
fmt.Println(string(temp_slice))
}
goplayground

Alternative: try this library: https://github.com/aristofanio/bitwiser.
It lets you parse large byte arrays from a bit string. See:
package main
import (
"github.com/aristofanio/bitwiser"
)
func main() {
//
b0, _ := bitwiser.ParseFromBits("011100")
b1, _ := bitwiser.ParseFromBits("11010011100")
//
println(b0.ToString()) //output: 0x1c (len(array) = 1byte)
println(b1.ToString()) //output: 0x069c (len(array) = 2bytes)
}

Split string by length in Golang

Does anyone know how to split a string in Golang by length?
For example to split "helloworld" after every 3 characters, so it should ideally return an array of "hel" "low" "orl" "d"?
Alternatively a possible solution would be to also append a newline after every 3 characters..
All ideas are greatly appreciated!

Make sure to convert your string into a slice of rune: see "Slice string into letters".
A for ... range loop over a string decodes it into runes automatically, so no extra code is needed to convert the string to runes first.
for i, r := range s {
fmt.Printf("i%d r %c\n", i, r)
// every 3 i, do something
}
Slicing with a[n:n+3] works best when a is a slice of rune.
With a rune slice the index increases by one per rune, whereas when ranging over a string the index is a byte offset and can jump by more than one: for "世界", i would be 0 and then 3, because a character (rune) can be encoded in multiple bytes.
For instance, consider s := "世a界世bcd界efg世": 12 runes. (see play.golang.org)
If you try to parse it byte by byte, you will miss (in a naive split every 3 chars implementation) some of the "index modulo 3" (equals to 2, 5, 8 and 11), because the index will increase past those values:
for i, r := range s {
res = res + string(r)
fmt.Printf("i %d r %c\n", i, r)
if i > 0 && (i+1)%3 == 0 {
fmt.Printf("=>(%d) '%v'\n", i, res)
res = ""
}
}
The output:
i 0 r 世
i 3 r a <== miss i==2
i 4 r 界
i 7 r 世 <== miss i==5
i 10 r b <== miss i==8
i 11 r c ===============> would print '世a界世bc', not exactly '3 chars'!
i 12 r d
i 13 r 界
i 16 r e <== miss i==14
i 17 r f ===============> would print 'd界ef'
i 18 r g
i 19 r 世 <== miss the rest of the string
But if you were to iterate on runes (a := []rune(s)), you would get what you expect, as the index would increase one rune at a time, making it easy to aggregate exactly 3 characters:
for i, r := range a {
res = res + string(r)
fmt.Printf("i%d r %c\n", i, r)
if i > 0 && (i+1)%3 == 0 {
fmt.Printf("=>(%d) '%v'\n", i, res)
res = ""
}
}
Output:
i 0 r 世
i 1 r a
i 2 r 界 ===============> would print '世a界'
i 3 r 世
i 4 r b
i 5 r c ===============> would print '世bc'
i 6 r d
i 7 r 界
i 8 r e ===============> would print 'd界e'
i 9 r f
i10 r g
i11 r 世 ===============> would print 'fg世'

Here is another variant playground.
It is considerably more efficient in both speed and memory than the other answers; benchmarks are available if you want to run them. In general it is about 5 times faster than the previous version, which was already the fastest answer.
func Chunks(s string, chunkSize int) []string {
if len(s) == 0 {
return nil
}
if chunkSize >= len(s) {
return []string{s}
}
var chunks []string = make([]string, 0, (len(s)-1)/chunkSize+1)
currentLen := 0
currentStart := 0
for i := range s {
if currentLen == chunkSize {
chunks = append(chunks, s[currentStart:i])
currentLen = 0
currentStart = i
}
currentLen++
}
chunks = append(chunks, s[currentStart:])
return chunks
}
Please note that the index points to a first byte of a rune on iterating over a string. The rune takes from 1 to 4 bytes. Slicing also treats the string as a byte array.
PREVIOUS SLOWER ALGORITHM
The code is here playground. The conversion from bytes to runes and then to bytes again takes a lot of time actually. So better use the fast algorithm at the top of the answer.
func ChunksSlower(s string, chunkSize int) []string {
if chunkSize >= len(s) {
return []string{s}
}
var chunks []string
chunk := make([]rune, chunkSize)
n := 0 // runes collected in the current chunk (avoids shadowing the builtin len)
for _, r := range s {
chunk[n] = r
n++
if n == chunkSize {
chunks = append(chunks, string(chunk))
n = 0
}
}
if n > 0 {
chunks = append(chunks, string(chunk[:n]))
}
return chunks
}
Please note that these two algorithms treat invalid UTF-8 bytes differently. The first one passes them through as is, while the second one replaces them with the utf8.RuneError symbol ('\uFFFD'), whose UTF-8 encoding in hexadecimal is efbfbd.
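The difference is easy to observe on a small input. In this sketch the sample string and the helper name throughRunes are made up for illustration: slicing leaves the stray byte alone, while a round trip through runes rewrites it.

```go
package main

import "fmt"

// throughRunes is a hypothetical helper that rebuilds s from the runes
// produced by ranging over it, the way a rune-collecting chunker does.
func throughRunes(s string) string {
	var rs []rune
	for _, r := range s {
		rs = append(rs, r)
	}
	return string(rs)
}

func main() {
	s := "a\xc5b" // \xc5 is a lone lead byte, so s is not valid UTF-8

	// Slicing works on bytes and keeps the invalid byte untouched.
	fmt.Printf("% x\n", s[1:2])

	// Ranging decodes the invalid byte as U+FFFD, which is then encoded
	// as the three bytes ef bf bd when converted back to a string.
	fmt.Printf("% x\n", throughRunes(s))
}
```

The first line printed is c5 and the second is 61 ef bf bd 62, so the rebuilt string is two bytes longer than the original.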

Also needed a function to do this recently, see example usage here
func SplitSubN(s string, n int) []string {
sub := ""
subs := []string{}
runes := bytes.Runes([]byte(s))
l := len(runes)
for i, r := range runes {
sub = sub + string(r)
if (i + 1) % n == 0 {
subs = append(subs, sub)
sub = ""
} else if (i + 1) == l {
subs = append(subs, sub)
}
}
return subs
}

Here is another example (you can try it here):
package main
import (
"fmt"
"strings"
)
func ChunkString(s string, chunkSize int) []string {
var chunks []string
runes := []rune(s)
if len(runes) == 0 {
return []string{s}
}
for i := 0; i < len(runes); i += chunkSize {
nn := i + chunkSize
if nn > len(runes) {
nn = len(runes)
}
chunks = append(chunks, string(runes[i:nn]))
}
return chunks
}
func main() {
fmt.Println(ChunkString("helloworld", 3))
fmt.Println(strings.Join(ChunkString("helloworld", 3), "\n"))
}

An easy solution using regex
re := regexp.MustCompile(`(\S{3})`) // raw string literal; a trailing chunk shorter than 3 is not matched
x := re.FindAllStringSubmatch("HelloWorld", -1)
fmt.Println(x)
https://play.golang.org/p/mfmaQlSRkHe

I tried three versions of the function; the one named "splitByWidthMake" is the fastest.
These functions index bytes, so they only handle ASCII correctly, not multi-byte Unicode text.
package main
import (
"fmt"
"strings"
"time"
"math"
)
func splitByWidthMake(str string, size int) []string {
strLength := len(str)
splitedLength := int(math.Ceil(float64(strLength) / float64(size)))
splited := make([]string, splitedLength)
var start, stop int
for i := 0; i < splitedLength; i += 1 {
start = i * size
stop = start + size
if stop > strLength {
stop = strLength
}
splited[i] = str[start : stop]
}
return splited
}
func splitByWidth(str string, size int) []string {
strLength := len(str)
var splited []string
var stop int
for i := 0; i < strLength; i += size {
stop = i + size
if stop > strLength {
stop = strLength
}
splited = append(splited, str[i:stop])
}
return splited
}
func splitRecursive(str string, size int) []string {
if len(str) <= size {
return []string{str}
}
return append([]string{string(str[0:size])}, splitRecursive(str[size:], size)...)
}
func main() {
/*
testStrings := []string{
"hello world",
"",
"1",
}
*/
testStrings := make([]string, 10)
for i := range testStrings {
testStrings[i] = strings.Repeat("#", int(math.Pow(2, float64(i))))
}
//fmt.Println(testStrings)
t1 := time.Now()
for i := range testStrings {
_ = splitByWidthMake(testStrings[i], 2)
//fmt.Println(t)
}
elapsed := time.Since(t1)
fmt.Println("for loop version elapsed: ", elapsed)
t1 = time.Now()
for i := range testStrings {
_ = splitByWidth(testStrings[i], 2)
}
elapsed = time.Since(t1)
fmt.Println("for loop without make version elapsed: ", elapsed)
t1 = time.Now()
for i := range testStrings {
_ = splitRecursive(testStrings[i], 2)
}
elapsed = time.Since(t1)
fmt.Println("recursive version elapsed: ", elapsed)
}

Not the most efficient, but it will work for most use cases.
Go playground: https://play.golang.org/p/0JSqv3OMdCR
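// splitBy splits s into chunks of n bytes; the last chunk may be shorter.
// It slices bytes, so it is only safe for single-byte (ASCII) text.
func splitBy(s string, n int) []string {
if n < 1 {
return []string{s} // no sensible chunk size; return the input whole
}
var ss []string
for len(s) > n {
ss = append(ss, s[:n])
s = s[n:]
}
ss = append(ss, s)
return ss
}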
// test
s := "helloworld"
ss := splitBy(s, 3)
fmt.Println(ss)
# output
$ go run main.go
[hel low orl d]

How to reverse a string in Go?

How can we reverse a simple string in Go?

In Go1 rune is a builtin type.
func Reverse(s string) string {
runes := []rune(s)
for i, j := 0, len(runes)-1; i < j; i, j = i+1, j-1 {
runes[i], runes[j] = runes[j], runes[i]
}
return string(runes)
}

Russ Cox, on the golang-nuts mailing list, suggests
package main
import "fmt"
func main() {
input := "The quick brown 狐 jumped over the lazy 犬"
// Get Unicode code points.
n := 0
rune := make([]rune, len(input))
for _, r := range input {
rune[n] = r
n++
}
rune = rune[0:n]
// Reverse
for i := 0; i < n/2; i++ {
rune[i], rune[n-1-i] = rune[n-1-i], rune[i]
}
// Convert back to UTF-8.
output := string(rune)
fmt.Println(output)
}

This works, without all the mucking about with functions:
func Reverse(s string) (result string) {
for _,v := range s {
result = string(v) + result
}
return
}

From Go example projects: golang/example/stringutil/reverse.go, by Andrew Gerrand
/*
Copyright 2014 Google Inc.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
// Reverse returns its argument string reversed rune-wise left to right.
func Reverse(s string) string {
r := []rune(s)
for i, j := 0, len(r)-1; i < len(r)/2; i, j = i+1, j-1 {
r[i], r[j] = r[j], r[i]
}
return string(r)
}
Go Playground for reverse a string
After reversing string "bròwn", the correct result should be "nwòrb", not "nẁorb".
Note the grave above the letter o.
For preserving Unicode combining characters such as "as⃝df̅" with reverse result "f̅ds⃝a",
please refer to another code listed below:
http://rosettacode.org/wiki/Reverse_a_string#Go
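To see the problem with combining characters concretely, here is a minimal sketch. reverseRunes is a hypothetical name for the plain rune-wise reversal, and the input is assumed to spell ò as the letter o followed by U+0300 (combining grave accent) rather than as a single precomposed character.

```go
package main

import "fmt"

// reverseRunes is a hypothetical helper: a plain rune-wise reversal with
// no special handling for combining marks.
func reverseRunes(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

func main() {
	in := "bro\u0300wn" // "o" followed by a combining grave accent
	out := reverseRunes(in)

	// %+q escapes non-ASCII runes, which makes the mark's position visible.
	fmt.Printf("%+q\n", in)
	fmt.Printf("%+q\n", out)
}
```

The second line printed is "nw\u0300orb": the accent now follows the w instead of the o, which is exactly the "nẁorb" result described above.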

This works on Unicode strings by relying on two things:
range over a string enumerates its Unicode code points (runes)
a string can be constructed from a rune slice, where each element is one code point.
So here it goes:
func reverse(s string) string {
o := make([]rune, utf8.RuneCountInString(s))
i := len(o)
for _, c := range s {
i--
o[i] = c
}
return string(o)
}

There are too many answers here. Some of them are clear duplicates. But even among the rest, it is hard to select the best solution.
So I went through the answers, threw away the ones that do not work for Unicode, and removed duplicates. I benchmarked the survivors to find the fastest. Here are the results with attribution (if you notice an answer that I missed but that is worth adding, feel free to modify the benchmark):
Benchmark_rmuller-4 100000 19246 ns/op
Benchmark_peterSO-4 50000 28068 ns/op
Benchmark_russ-4 50000 30007 ns/op
Benchmark_ivan-4 50000 33694 ns/op
Benchmark_yazu-4 50000 33372 ns/op
Benchmark_yuku-4 50000 37556 ns/op
Benchmark_simon-4 3000 426201 ns/op
So here is the fastest method by rmuller:
func Reverse(s string) string {
size := len(s)
buf := make([]byte, size)
for start := 0; start < size; {
r, n := utf8.DecodeRuneInString(s[start:])
start += n
utf8.EncodeRune(buf[size-start:], r)
}
return string(buf)
}
For some reason I can't add a benchmark, so you can copy it from PlayGround (you can't run tests there). Rename it and run go test -bench=.

I noticed this question when Simon posted his solution which, since strings are immutable, is very inefficient. The other proposed solutions are also flawed; they don't work or they are inefficient.
Here's an efficient solution that works, except when the string is not valid UTF-8 or the string contains combining characters.
package main
import "fmt"
func Reverse(s string) string {
n := len(s)
runes := make([]rune, n)
for _, rune := range s {
n--
runes[n] = rune
}
return string(runes[n:])
}
func main() {
fmt.Println(Reverse(Reverse("Hello, 世界")))
fmt.Println(Reverse(Reverse("The quick brown 狐 jumped over the lazy 犬")))
}

I wrote the following Reverse function which respects UTF8 encoding and combined characters:
// Reverse reverses the input while respecting UTF8 encoding and combined characters
func Reverse(text string) string {
textRunes := []rune(text)
textRunesLength := len(textRunes)
if textRunesLength <= 1 {
return text
}
i, j := 0, 0
for i < textRunesLength && j < textRunesLength {
j = i + 1
for j < textRunesLength && isMark(textRunes[j]) {
j++
}
if isMark(textRunes[j-1]) {
// Reverses Combined Characters
reverse(textRunes[i:j], j-i)
}
i = j
}
// Reverses the entire array
reverse(textRunes, textRunesLength)
return string(textRunes)
}
func reverse(runes []rune, length int) {
for i, j := 0, length-1; i < length/2; i, j = i+1, j-1 {
runes[i], runes[j] = runes[j], runes[i]
}
}
// isMark determines whether the rune is a marker
func isMark(r rune) bool {
return unicode.Is(unicode.Mn, r) || unicode.Is(unicode.Me, r) || unicode.Is(unicode.Mc, r)
}
I did my best to make it as efficient and readable as possible. The idea is simple, traverse through the runes looking for combined characters then reverse the combined characters' runes in-place. Once we have covered them all, reverse the runes of the entire string also in-place.
Say we would like to reverse this string bròwn. The ò is represented by two runes, one for the o and one for the combining mark \u0300 that represents the grave accent.
For simplicity, let's represent the string like this bro'wn. The first thing we do is look for combined characters and reverse them. So now we have the string br'own. Finally, we reverse the entire string and end up with nwo'rb. This is returned to us as nwòrb
You can find it here https://github.com/shomali11/util if you would like to use it.
Here are some test cases to show a couple of different scenarios:
func TestReverse(t *testing.T) {
assert.Equal(t, Reverse(""), "")
assert.Equal(t, Reverse("X"), "X")
assert.Equal(t, Reverse("b\u0301"), "b\u0301")
assert.Equal(t, Reverse("😎⚽"), "⚽😎")
assert.Equal(t, Reverse("Les Mise\u0301rables"), "selbare\u0301siM seL")
assert.Equal(t, Reverse("ab\u0301cde"), "edcb\u0301a")
assert.Equal(t, Reverse("This `\xc5` is an invalid UTF8 character"), "retcarahc 8FTU dilavni na si `�` sihT")
assert.Equal(t, Reverse("The quick bròwn 狐 jumped over the lazy 犬"), "犬 yzal eht revo depmuj 狐 nwòrb kciuq ehT")
}

//Reverse reverses string using strings.Builder. It's about 3 times faster
//than the one with using a string concatenation
func Reverse(in string) string {
var sb strings.Builder
runes := []rune(in)
for i := len(runes) - 1; 0 <= i; i-- {
sb.WriteRune(runes[i])
}
return sb.String()
}
//Reverse reverses string using string
func Reverse(in string) (out string) {
for _, r := range in {
out = string(r) + out
}
return
}
BenchmarkReverseStringConcatenation-8 1000000 1571 ns/op 176 B/op 29 allocs/op
BenchmarkReverseStringsBuilder-8 3000000 499 ns/op 56 B/op 6 allocs/op
Using strings.Builder is about 3 times faster than using string concatenation

Here is a quite different, I would say more functional, approach that is not listed among the other answers:
func reverse(s string) (ret string) {
for _, v := range s {
defer func(r rune) { ret += string(r) }(v)
}
return
}

This is the fastest implementation
func Reverse(s string) string {
size := len(s)
buf := make([]byte, size)
for start := 0; start < size; {
r, n := utf8.DecodeRuneInString(s[start:])
start += n
utf8.EncodeRune(buf[size-start:], r)
}
return string(buf)
}
const (
s = "The quick brown 狐 jumped over the lazy 犬"
reverse = "犬 yzal eht revo depmuj 狐 nworb kciuq ehT"
)
func TestReverse(t *testing.T) {
if Reverse(s) != reverse {
t.Error(s)
}
}
func BenchmarkReverse(b *testing.B) {
for i := 0; i < b.N; i++ {
Reverse(s)
}
}

A simple stroke with rune:
func ReverseString(s string) string {
runes := []rune(s)
size := len(runes)
for i := 0; i < size/2; i++ {
runes[size-i-1], runes[i] = runes[i], runes[size-i-1]
}
return string(runes)
}
func main() {
fmt.Println(ReverseString("Abcdefg 汉语 The God"))
}
Output: doG ehT 语汉 gfedcbA

You could also import an existing implementation:
import "4d63.com/strrev"
Then:
strrev.Reverse("abåd") // returns "dåba"
Or to reverse a string including unicode combining characters:
strrev.ReverseCombining("abc\u0301\u031dd") // returns "d\u0301\u031dcba"
These implementations support correct ordering of Unicode multibyte and combining characters when reversed.
Note: Built-in string reverse functions in many programming languages do not preserve combining, and identifying combining characters requires significantly more execution time.

func ReverseString(str string) string {
output :=""
for _, char := range str {
output = string(char) + output
}
return output
}
// "Luizpa" -> "apziuL"
// "123日本語" -> "語本日321"
// "⚽😎" -> "😎⚽"
// "´a´b´c´" -> "´c´b´a´"

This code preserves sequences of combining characters intact, and
should work with invalid UTF-8 input too.
package stringutil
import "golang.org/x/text/unicode/norm" // formerly code.google.com/p/go.text/unicode/norm
func Reverse(s string) string {
bound := make([]int, 0, len(s) + 1)
var iter norm.Iter
iter.InitString(norm.NFD, s)
bound = append(bound, 0)
for !iter.Done() {
iter.Next()
bound = append(bound, iter.Pos())
}
bound = append(bound, len(s))
out := make([]byte, 0, len(s))
for i := len(bound) - 2; i >= 0; i-- {
out = append(out, s[bound[i]:bound[i+1]]...)
}
return string(out)
}
It could be a little more efficient if the unicode/norm primitives
allowed iterating through the boundaries of a string without
allocating. See also https://code.google.com/p/go/issues/detail?id=9055 .

If you need to handle grapheme clusters, use the unicode or regexp package.
package main
import (
"unicode"
"regexp"
)
func main() {
str := "\u0308" + "a\u0308" + "o\u0308" + "u\u0308"
println("u\u0308" + "o\u0308" + "a\u0308" + "\u0308" == ReverseGrapheme(str))
println("u\u0308" + "o\u0308" + "a\u0308" + "\u0308" == ReverseGrapheme2(str))
}
func ReverseGrapheme(str string) string {
buf := []rune("")
checked := false
index := 0
ret := ""
for _, c := range str {
if !unicode.Is(unicode.M, c) {
if len(buf) > 0 {
ret = string(buf) + ret
}
buf = buf[:0]
buf = append(buf, c)
if checked == false {
checked = true
}
} else if checked == false {
ret = string(append([]rune(""), c)) + ret
} else {
buf = append(buf, c)
}
index += 1
}
return string(buf) + ret
}
func ReverseGrapheme2(str string) string {
re := regexp.MustCompile("\\PM\\pM*|.")
slice := re.FindAllString(str, -1)
length := len(slice)
ret := ""
for i := 0; i < length; i += 1 {
ret += slice[length-1-i]
}
return ret
}

It's assuredly not the most memory efficient solution, but for a "simple" UTF-8 safe solution the following will get the job done and not break runes.
It's in my opinion the most readable and understandable on the page.
func reverseStr(str string) (out string) {
for _, s := range str {
out = string(s) + out
}
return
}

The following two methods run faster than the fastest solution that preserves combining characters, though I may be missing something in my benchmark setup.
//input string s
bs := []byte(s)
var rs string
for len(bs) > 0 {
r, size := utf8.DecodeLastRune(bs)
rs += fmt.Sprintf("%c", r)
bs = bs[:len(bs)-size]
} // rs has reversed string
Second method inspired by this
//input string s
bs := []byte(s)
cs := make([]byte, len(bs))
b1 := 0
for len(bs) > 0 {
r, size := utf8.DecodeLastRune(bs)
d := make([]byte, size)
_ = utf8.EncodeRune(d, r)
b1 += copy(cs[b1:], d)
bs = bs[:len(bs) - size]
} // cs has reversed bytes

NOTE: This answer is from 2009, so there are probably better solutions out there by now.
Looks a bit 'roundabout', and probably not very efficient, but illustrates how the Reader interface can be used to read from strings. IntVectors also seem very suitable as buffers when working with utf8 strings.
It would be even shorter when leaving out the 'size' part, and insertion into the vector by Insert, but I guess that would be less efficient, as the whole vector then needs to be pushed back by one each time a new rune is added.
This solution definitely works with utf8 characters.
package main
import "container/vector";
import "fmt";
import "utf8";
import "bytes";
import "bufio";
func
main() {
toReverse := "Smørrebrød";
fmt.Println(toReverse);
fmt.Println(reverse(toReverse));
}
func
reverse(str string) string {
size := utf8.RuneCountInString(str);
output := vector.NewIntVector(size);
input := bufio.NewReader(bytes.NewBufferString(str));
for i := 1; i <= size; i++ {
rune, _, _ := input.ReadRune();
output.Set(size - i, rune);
}
return string(output.Data());
}
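
For readers on a current Go release, the same Reader-based idea might look roughly like the sketch below. It assumes a plain []rune slice as a stand-in for the old IntVector and strings.NewReader as the input source. The name reverseWithReader is made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
	"unicode/utf8"
)

// reverseWithReader (hypothetical name) reads runes through a bufio.Reader
// and stores each one at its mirrored position in a pre-sized slice.
func reverseWithReader(str string) string {
	size := utf8.RuneCountInString(str)
	output := make([]rune, size) // stands in for the removed IntVector
	input := bufio.NewReader(strings.NewReader(str))
	for i := 1; i <= size; i++ {
		r, _, err := input.ReadRune()
		if err != nil {
			break // not expected: the reader holds exactly size runes
		}
		output[size-i] = r
	}
	return string(output)
}

func main() {
	toReverse := "Smørrebrød"
	fmt.Println(toReverse)
	fmt.Println(reverseWithReader(toReverse))
}
```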

func Reverse(s string) string {
r := []rune(s)
var output strings.Builder
for i := len(r) - 1; i >= 0; i-- {
output.WriteRune(r[i])
}
return output.String()
}

Simple, Sweet and Performant
func reverseStr(str string) string {
strSlice := []rune(str) //converting to slice of runes
length := len(strSlice)
for i := 0; i < (length / 2); i++ {
strSlice[i], strSlice[length-i-1] = strSlice[length-i-1], strSlice[i]
}
return string(strSlice) //converting back to string
}

Reversing a string by word is a similar process. First, we convert the string into an array of strings where each entry is a word. Next, we apply the normal reverse loop to that array. Finally, we smush the results back together into a string that we can return to the caller.
package main
import (
"fmt"
"strings"
)
func reverse_words(s string) string {
words := strings.Fields(s)
for i, j := 0, len(words)-1; i < j; i, j = i+1, j-1 {
words[i], words[j] = words[j], words[i]
}
return strings.Join(words, " ")
}
func main() {
fmt.Println(reverse_words("one two three"))
}

Another hack is to use built-in language features, for example, defer:
package main
import "fmt"
func main() {
var name string
fmt.Scanln(&name)
for _, char := range []rune(name) {
defer fmt.Printf("%c", char) // <-- LIFO does it all for you
}
}
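
The same LIFO trick can also return the reversed string instead of printing it. The sketch below is only an illustration of how deferred calls unwind. The name reverseWithDefer is made up, and one deferred call per rune is not something to use on long inputs.

```go
package main

import (
	"fmt"
	"strings"
)

// reverseWithDefer (hypothetical name) relies on deferred calls running in
// last-in, first-out order when the function returns.
func reverseWithDefer(s string) (out string) {
	var b strings.Builder
	// Deferred first, so it runs last and copies the finished result
	// into the named return value.
	defer func() { out = b.String() }()
	for _, r := range s {
		// The argument r is evaluated now; the write happens at return time,
		// starting with the last rune of s.
		defer b.WriteRune(r)
	}
	return ""
}

func main() {
	fmt.Println(reverseWithDefer("abc"))
}
```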

For simple (single-byte, ASCII-only) strings, it is possible to use a construction like this:
func Reverse(str string) string {
if str != "" {
return Reverse(str[1:]) + str[:1]
}
return ""
}
For Unicode strings it might look like this:
func RecursiveReverse(str string) string {
if str == "" {
return ""
}
runes := []rune(str)
return RecursiveReverse(string(runes[1:])) + string(runes[0])
}

A version which I think works on unicode. It is built on the utf8.Rune functions:
func Reverse(s string) string {
b := make([]byte, len(s));
for i, j := len(s)-1, 0; i >= 0; i-- {
if utf8.RuneStart(s[i]) {
rune, size := utf8.DecodeRuneInString(s[i:len(s)]);
utf8.EncodeRune(b[j:j+size], rune); // current signature: destination slice first, then the rune
j += size;
}
}
return string(b);
}

rune is a type, so use it. Moreover, Go doesn't use semicolons.
func reverse(s string) string {
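l := utf8.RuneCountInString(s) // count runes, not bytes; len(s) would leave NUL padding at the front for multi-byte input (needs import "unicode/utf8")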
m := make([]rune, l)
for _, c := range s {
l--
m[l] = c
}
return string(m)
}
func main() {
str := "the quick brown 狐 jumped over the lazy 犬"
fmt.Printf("reverse(%s): [%s]\n", str, reverse(str))
}

Try the code below:
package main
import "fmt"
func reverse(s string) string {
chars := []rune(s)
for i, j := 0, len(chars)-1; i < j; i, j = i+1, j-1 {
chars[i], chars[j] = chars[j], chars[i]
}
return string(chars)
}
func main() {
fmt.Printf("%v\n", reverse("abcdefg"))
}
For more info, see http://golangcookbook.com/chapters/strings/reverse/
and http://www.dotnetperls.com/reverse-string-go

func reverseString(someString string) string {
runeString := []rune(someString)
var reverseString string
for i := len(runeString) - 1; i >= 0; i-- {
reverseString += string(runeString[i])
}
return reverseString
}

Strings are immutable in Go, so unlike in C, a string cannot be reversed in place.
In C, you can do something like this:
void reverseString(char *str) {
int length = strlen(str);
for(int i = 0, j = length-1; i < length/2; i++, j--)
{
char tmp = str[i];
str[i] = str[j];
str[j] = tmp;
}
}
In Go, the following function converts the input to a byte slice, reverses the slice in place, and converts it back to a string before returning. It works only for strings without multi-byte (non-ASCII) characters.
package main
import "fmt"
func main() {
s := "test123 4"
fmt.Println(reverseString(s))
}
func reverseString(s string) string {
a := []byte(s)
for i, j := 0, len(a)-1; i < j; i, j = i+1, j-1 {
a[i], a[j] = a[j], a[i]
}
return string(a)
}
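
To make the limitation concrete, here is a small sketch that applies the same byte-wise swap to an ASCII string and to one containing multi-byte characters, then checks whether the result is still valid UTF-8. The name reverseBytes is made up for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// reverseBytes (hypothetical name) reverses the raw bytes of s, which is
// only safe when every character occupies a single byte.
func reverseBytes(s string) string {
	a := []byte(s)
	for i, j := 0, len(a)-1; i < j; i, j = i+1, j-1 {
		a[i], a[j] = a[j], a[i]
	}
	return string(a)
}

func main() {
	for _, s := range []string{"test123 4", "Smørrebrød"} {
		r := reverseBytes(s)
		// The two bytes of each "ø" get swapped, so the second result
		// is no longer well-formed UTF-8.
		fmt.Printf("%q -> valid UTF-8: %t\n", s, utf8.ValidString(r))
	}
}
```

For input that may contain non-ASCII text, one of the []rune-based versions on this page is the safer choice.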

Here is yet another solution:
func ReverseStr(s string) string {
chars := []rune(s)
rev := make([]rune, 0, len(chars))
for i := len(chars) - 1; i >= 0; i-- {
rev = append(rev, chars[i])
}
return string(rev)
}
However, yazu's solution above is more elegant since he reverses the []rune slice in place.
