Split string by length in Golang

Split string by length in Golang - string

Does anyone know how to split a string in Golang by length?
For example to split "helloworld" after every 3 characters, so it should ideally return an array of "hel" "low" "orl" "d"?
Alternatively a possible solution would be to also append a newline after every 3 characters..
All ideas are greatly appreciated!

Make sure to convert your string into a slice of rune: see "Slice string into letters".
for automatically converts string to rune so there is no additional code needed in this case to convert the string to rune first.
for i, r := range s {
fmt.Printf("i%d r %c\n", i, r)
// every 3 i, do something
}
r[n:n+3] will work best with a being a slice of rune.
The index will increase by one every rune, while it might increase by more than one for every byte in a slice of string: "世界": i would be 0 and 3: a character (rune) can be formed of multiple bytes.
For instance, consider s := "世a界世bcd界efg世": 12 runes. (see play.golang.org)
If you try to parse it byte by byte, you will miss (in a naive split every 3 chars implementation) some of the "index modulo 3" (equals to 2, 5, 8 and 11), because the index will increase past those values:
for i, r := range s {
res = res + string(r)
fmt.Printf("i %d r %c\n", i, r)
if i > 0 && (i+1)%3 == 0 {
fmt.Printf("=>(%d) '%v'\n", i, res)
res = ""
}
}
The output:
i 0 r 世
i 3 r a <== miss i==2
i 4 r 界
i 7 r 世 <== miss i==5
i 10 r b <== miss i==8
i 11 r c ===============> would print '世a界世bc', not exactly '3 chars'!
i 12 r d
i 13 r 界
i 16 r e <== miss i==14
i 17 r f ===============> would print 'd界ef'
i 18 r g
i 19 r 世 <== miss the rest of the string
But if you were to iterate on runes (a := []rune(s)), you would get what you expect, as the index would increase one rune at a time, making it easy to aggregate exactly 3 characters:
for i, r := range a {
res = res + string(r)
fmt.Printf("i%d r %c\n", i, r)
if i > 0 && (i+1)%3 == 0 {
fmt.Printf("=>(%d) '%v'\n", i, res)
res = ""
}
}
Output:
i 0 r 世
i 1 r a
i 2 r 界 ===============> would print '世a界'
i 3 r 世
i 4 r b
i 5 r c ===============> would print '世bc'
i 6 r d
i 7 r 界
i 8 r e ===============> would print 'd界e'
i 9 r f
i10 r g
i11 r 世 ===============> would print 'fg世'

Here is another variant playground.
It is by far more efficient in terms of both speed and memory than other answers. If you want to run benchmarks here they are benchmarks. In general it is 5 times faster than the previous version that was a fastest answer anyway.
func Chunks(s string, chunkSize int) []string {
if len(s) == 0 {
return nil
}
if chunkSize >= len(s) {
return []string{s}
}
var chunks []string = make([]string, 0, (len(s)-1)/chunkSize+1)
currentLen := 0
currentStart := 0
for i := range s {
if currentLen == chunkSize {
chunks = append(chunks, s[currentStart:i])
currentLen = 0
currentStart = i
}
currentLen++
}
chunks = append(chunks, s[currentStart:])
return chunks
}
Please note that the index points to a first byte of a rune on iterating over a string. The rune takes from 1 to 4 bytes. Slicing also treats the string as a byte array.
PREVIOUS SLOWER ALGORITHM
The code is here playground. The conversion from bytes to runes and then to bytes again takes a lot of time actually. So better use the fast algorithm at the top of the answer.
func ChunksSlower(s string, chunkSize int) []string {
if chunkSize >= len(s) {
return []string{s}
}
var chunks []string
chunk := make([]rune, chunkSize)
len := 0
for _, r := range s {
chunk[len] = r
len++
if len == chunkSize {
chunks = append(chunks, string(chunk))
len = 0
}
}
if len > 0 {
chunks = append(chunks, string(chunk[:len]))
}
return chunks
}
Please note that these two algorithms treat invalid UTF-8 characters in a different way. First one processes them as is when second one replaces them by utf8.RuneError symbol ('\uFFFD') that has following hexadecimal representation in UTF-8: efbfbd.

Also needed a function to do this recently, see example usage here
func SplitSubN(s string, n int) []string {
sub := ""
subs := []string{}
runes := bytes.Runes([]byte(s))
l := len(runes)
for i, r := range runes {
sub = sub + string(r)
if (i + 1) % n == 0 {
subs = append(subs, sub)
sub = ""
} else if (i + 1) == l {
subs = append(subs, sub)
}
}
return subs
}

Here is another example (you can try it here):
package main
import (
"fmt"
"strings"
)
func ChunkString(s string, chunkSize int) []string {
var chunks []string
runes := []rune(s)
if len(runes) == 0 {
return []string{s}
}
for i := 0; i < len(runes); i += chunkSize {
nn := i + chunkSize
if nn > len(runes) {
nn = len(runes)
}
chunks = append(chunks, string(runes[i:nn]))
}
return chunks
}
func main() {
fmt.Println(ChunkString("helloworld", 3))
fmt.Println(strings.Join(ChunkString("helloworld", 3), "\n"))
}

An easy solution using regex
re := regexp.MustCompile((\S{3}))
x := re.FindAllStringSubmatch("HelloWorld", -1)
fmt.Println(x)
https://play.golang.org/p/mfmaQlSRkHe

I tried 3 version to implement the function, the function named "splitByWidthMake" is fastest.
These functions ignore the unicode but only the ascii code.
package main
import (
"fmt"
"strings"
"time"
"math"
)
func splitByWidthMake(str string, size int) []string {
strLength := len(str)
splitedLength := int(math.Ceil(float64(strLength) / float64(size)))
splited := make([]string, splitedLength)
var start, stop int
for i := 0; i < splitedLength; i += 1 {
start = i * size
stop = start + size
if stop > strLength {
stop = strLength
}
splited[i] = str[start : stop]
}
return splited
}
func splitByWidth(str string, size int) []string {
strLength := len(str)
var splited []string
var stop int
for i := 0; i < strLength; i += size {
stop = i + size
if stop > strLength {
stop = strLength
}
splited = append(splited, str[i:stop])
}
return splited
}
func splitRecursive(str string, size int) []string {
if len(str) <= size {
return []string{str}
}
return append([]string{string(str[0:size])}, splitRecursive(str[size:], size)...)
}
func main() {
/*
testStrings := []string{
"hello world",
"",
"1",
}
*/
testStrings := make([]string, 10)
for i := range testStrings {
testStrings[i] = strings.Repeat("#", int(math.Pow(2, float64(i))))
}
//fmt.Println(testStrings)
t1 := time.Now()
for i := range testStrings {
_ = splitByWidthMake(testStrings[i], 2)
//fmt.Println(t)
}
elapsed := time.Since(t1)
fmt.Println("for loop version elapsed: ", elapsed)
t1 = time.Now()
for i := range testStrings {
_ = splitByWidth(testStrings[i], 2)
}
elapsed = time.Since(t1)
fmt.Println("for loop without make version elapsed: ", elapsed)
t1 = time.Now()
for i := range testStrings {
_ = splitRecursive(testStrings[i], 2)
}
elapsed = time.Since(t1)
fmt.Println("recursive version elapsed: ", elapsed)
}

Not the most efficient, will work for most use-cases.
Go playground: https://play.golang.org/p/0JSqv3OMdCR
// splitBy splits a string s by int n.
func splitBy(s string, n int) []string {
var ss []string
for i := 1; i < len(s); i++ {
if i%n == 0 {
ss = append(ss, s[:i])
s = s[i:]
i = 1
}
}
ss = append(ss, s)
return ss
}
// test
s := "helloworld"
ss := splitBy(s, 3)
fmt.Println(ss)
# output
$ go run main.go
[hel low orl d]

Related

Efficient reverse indexing of number (type string) in Go

Background
TLDR (and simplified): Given a string s, where s is any positive integer, reverse the order and summate each digit multiplied by it's new index (+1).
For example, the value returned from "98765" would be: (1x5) + (2x6) + (3x7) + (4x8) + (5*9)= 115.
My current working solution can be found here: Go playground. I'd like to know whether there's a better way of doing this, be it readability or efficiency. For example, I decided in favour of a count variable instead utilising i and len as it seemed clearer. I'm also not very familiar with int/string conversions but I'm assuming making use of strconv is required.
func reverseStringSum(s string) int {
total := 0
count := 1
for i := len(s) - 1; i >= 0; i-- {
char := string([]rune(s)[i])
num, _ := strconv.Atoi(char)
total += count * num
count++
}
return total
}

Here's an efficient way to solve the complete problem: sum("987-65") = 115. The complete problem is documented in your working solution link: https://go.dev/play/p/DJ1ZYYDFnfq.
package main
import "fmt"
func reverseSum(s string) int {
sum := 0
for i, j := len(s)-1, 0; i >= 0; i-- {
d := int(s[i]) - '0'
if 0 <= d && d <= 9 {
j++
sum += j * d
}
}
return sum
}
func main() {
s := "987-65"
sum := reverseSum(s)
fmt.Println(sum)
}
https://go.dev/play/p/bx7wfmtXaie
115
Since we are talkng about efficient Go code, we need some Go benchmarks.
$ go test reversesum_test.go -bench=. -benchmem
BenchmarkSumTBJ-8 4001182 295.8 ns/op 52 B/op 6 allocs/op
BenchmarkSumA2Q-8 225781720 5.284 ns/op 0 B/op 0 allocs/op
Your solution (TBJ) is slow.
reversesum_test.go:
package main
import (
"strconv"
"strings"
"testing"
)
func reverseSumTBJ(s string) int {
total := 0
count := 1
for i := len(s) - 1; i >= 0; i-- {
char := string([]rune(s)[i])
num, _ := strconv.Atoi(char)
total += count * num
count++
}
return total
}
func BenchmarkSumTBJ(b *testing.B) {
for n := 0; n < b.N; n++ {
rawString := "987-65"
stringSlice := strings.Split(rawString, "-")
numberString := stringSlice[0] + stringSlice[1]
reverseSumTBJ(numberString)
}
}
func reverseSumA2Q(s string) int {
sum := 0
for i, j := len(s)-1, 0; i >= 0; i-- {
d := int(s[i]) - '0'
if 0 <= d && d <= 9 {
j++
sum += j * d
}
}
return sum
}
func BenchmarkSumA2Q(b *testing.B) {
for n := 0; n < b.N; n++ {
rawString := "987-65"
reverseSumA2Q(rawString)
}
}
The reverse sum is part of a larger problem, computing a CAS Registry Number check digit.
package main
import "fmt"
// CASRNCheckDigit returns the computed
// CAS Registry Number check digit.
func CASRNCheckDigit(s string) string {
// CAS Registry Number
// https://en.wikipedia.org/wiki/CAS_Registry_Number
//
// The check digit is found by taking the last digit times 1,
// the preceding digit times 2, the preceding digit times 3 etc.,
// adding all these up and computing the sum modulo 10.
//
// The CAS number of water is 7732-18-5:
// the checksum 5 is calculated as
// (8×1 + 1×2 + 2×3 + 3×4 + 7×5 + 7×6)
// = 105; 105 mod 10 = 5.
//
// Check Digit Verification of CAS Registry Numbers
// https://www.cas.org/support/documentation/chemical-substances/checkdig
for i, sep := 0, 0; i < len(s); i++ {
if s[i] == '-' {
sep++
if sep == 2 {
s = s[:i]
break
}
}
}
sum := 0
for i, j := len(s)-1, 0; i >= 0; i-- {
d := int(s[i]) - '0'
if 0 <= d && d <= 9 {
j++
sum += j * d
}
}
return string(rune(sum%10 + '0'))
}
func main() {
var rn, cd string
// 987-65-5: Adenosine 5'-triphosphate disodium salt
// https://www.chemicalbook.com/CASEN_987-65-5.htm
rn = "987-65"
cd = CASRNCheckDigit(rn)
fmt.Println("CD:", cd, "\tRN:", rn)
// 732-18-5: Water
// https://www.chemicalbook.com/CASEN_7732-18-5.htm
rn = "7732-18-5"
cd = CASRNCheckDigit(rn)
fmt.Println("CD:", cd, "\tRN:", rn)
// 7440-21-3: Silicon
// https://www.chemicalbook.com/CASEN_7440-21-3.htm
rn = "7440-21-3"
cd = CASRNCheckDigit(rn)
fmt.Println("CD:", cd, "\tRN:", rn)
}
https://go.dev/play/p/VYh-5LuGpCn
BenchmarkCD-4 37187641 30.29 ns/op 4 B/op 1 allocs/op

Maybe this is more efficient
func reverseStringSum(s string) int {
total := 0
count := 1
for i := len(s) - 1; i >= 0; i-- {
num, _ := strconv.Atoi(string(s[i]))
total += count * num
count++
}
return total
}

Bitmasking conversion of CPU ids with Go

I have a mask that contains a binary counting of cpu_ids (0xA00000800000 for 3 CPUs) which I want to convert into a string of comma separated cpu_ids: "0,2,24".
I did the following Go implementation (I am a Go starter). Is it the best way to do it? Especially the handling of byte buffers seems to be inefficient!
package main
import (
"fmt"
"os"
"os/exec"
)
func main(){
cpuMap := "0xA00000800000"
cpuIds = getCpuIds(cpuMap)
fmt.Println(cpuIds)
}
func getCpuIds(cpuMap string) string {
// getting the cpu ids
cpu_ids_i, _ := strconv.ParseInt(cpuMap, 0, 64) // int from string
cpu_ids_b := strconv.FormatInt(cpu_ids_i, 2) // binary as string
var buff bytes.Buffer
for i, runeValue := range cpu_ids_b {
// take care! go returns code points and not the string
if runeValue == '1' {
//fmt.Println(bitString, i)
buff.WriteString(fmt.Sprintf("%d", i))
}
if (i+1 < len(cpu_ids_b)) && (runeValue == '1') {
//fmt.Println(bitString)
buff.WriteString(string(","))
}
}
cpuIds := buff.String()
// remove last comma
cpuIds = cpuIds[:len(cpuIds)-1]
//fmt.Println(cpuIds)
return cpuIds
}
Returns:
"0,2,24"

What you're doing is essentially outputting the indices of the "1"'s in the binary representation from left-to-right, and starting index counting from the left (unusal).
You can achieve the same using bitmasks and bitwise operators, without converting it to a binary string. And I would return a slice of indices instead of its formatted string, easier to work with.
To test if the lowest (rightmost) bit is 1, you can do it like x&0x01 == 1, and to shift a whole number bitwise to the right: x >>= 1. After a shift, the rightmost bit "disappears", and the previously 2nd bit becomes the 1st, so you can test again with the same logic. You may loop until the number is greater than 0 (which means it sill has 1-bits).
See this question for more examples of bitwise operations: Difference between some operators "|", "^", "&", "&^". Golang
Of course if we test the rightmost bit and shift right, we get the bits (indices) in reverse order (compared to what you want), and the indices are counted from right, so we have to correct this before returning the result.
So the solution looks like this:
func getCpuIds(cpuMap string) (r []int) {
ci, err := strconv.ParseInt(cpuMap, 0, 64)
if err != nil {
panic(err)
}
count := 0
for ; ci > 0; count, ci = count+1, ci>>1 {
if ci&0x01 == 1 {
r = append(r, count)
}
}
// Indices are from the right, correct it:
for i, v := range r {
r[i] = count - v - 1
}
// Result is in reverse order:
for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
r[i], r[j] = r[j], r[i]
}
return
}
Output (try it on the Go Playground):
[0 2 24]
If for some reason you need the result as a comma separated string, this is how you can obtain that:
buf := &bytes.Buffer{}
for i, v := range cpuIds {
if i > 0 {
buf.WriteString(",")
}
buf.WriteString(strconv.Itoa(v))
}
cpuIdsStr := buf.String()
fmt.Println(cpuIdsStr)
Output (try it on the Go Playground):
0,2,24

Is there a better way to insert "|' into binary string rep to get this 10|000|001

Is there a better way to insert "|" into a string
given a binary string representation of decimal 200 = 11001000
this function returns a string = 11|001|000
While this function works, it seems very kludgy!! Why is it so
hard in GO to do a simple character insertion???
func (i Binary) FString() string {
a := strconv.FormatUint(i.Get(), 2)
y := make([]string, len(a), len(a)*2)
data := []rune(a)
r := []rune{}
for i := len(data) - 1; i >= 0; i-- {
r = append(r, data[i])
}
for j := len(a) - 1; j >= 0; j-- {
y = append(y, string(r[j]))
if ((j)%3) == 0 && j > 0 {
y = append(y, "|")
}
}
return strings.Join(y, "")
}

Depends on what you call better. I'd use regular expressions.
In this case, the complexity arises from inserting separators from the right. If we padded the string so that its length was a multiple of 3, we could insert the separator from the left. And we could easily use a regular expression to insert | before every three characters. Then, we can just strip off the leading | + padding.
func (i Binary) FString() string {
a := strconv.FormatUint(i.Get(), 2)
pad_req := len(a) % 3
padding := strings.Repeat("0", (3 - pad_req))
a = padding + a
re := regexp.MustCompile("([01]{3})")
a = re.ReplaceAllString(a, "|$1")
start := len(padding) + 1
if len(padding) == 3 {
// If we padded with "000", we want to remove the `|` before *and* after it
start = 5
}
a = a[start:]
return a
}
Snippet on the Go Playground

If performance is not critical and you just want a compact version, you may copy the input digits to output, and insert a | symbol whenever a group of 2 has been written to the output.
Groups are counted from right-to-left, so when copying the digits from left-to-right, the first group might be smaller. So the counter of digits inside a group may not necessarily start from 0 in case of the first group, but from len(input)%3.
Here is an example of it:
func Format(s string) string {
b, count := &bytes.Buffer{}, len(s)%3
for i, r := range s {
if i > 0 && count == i%3 {
b.WriteRune('|')
}
b.WriteRune(r)
}
return b.String()
}
Testing it:
for i := uint64(0); i < 10; i++ {
fmt.Println(Format(strconv.FormatUint(i, 2)))
}
fmt.Println(Format(strconv.FormatInt(1234, 2)))
Output (try it on the Go Playground):
0
1
10
11
100
101
110
111
1|000
1|001
10|011|010|010
If you have to do this many times and performance does matter, then check out my answer to the question: How to fmt.Printf an integer with thousands comma
Based on that a fast solution can be:
func Format(s string) string {
out := make([]byte, len(s)+(len(s)-1)/3)
for i, j, k := len(s)-1, len(out)-1, 0; ; i, j = i-1, j-1 {
out[j] = s[i]
if i == 0 {
return string(out)
}
if k++; k == 3 {
j, k = j-1, 0
out[j] = '|'
}
}
}
Output is the same of course. Try it on the Go Playground.

This is a partitioning problem. You can use this function:
func partition(s, separator string, pLen int) string {
if pLen < 1 || len(s) == 0 || len(separator) == 0 {
return s
}
buffer := []rune(s)
L := len(buffer)
pCount := L / pLen
result := []string{}
index := 0
for ; index < pCount; index++ {
_from := L - (index+1)*pLen
_to := L - index*pLen
result = append(result, string(buffer[_from:_to]))
}
if L%pLen != 0 {
result = append(result, string(buffer[0:L-index*pLen]))
}
for h, t := 0, len(result)-1; h < t; h, t = h+1, t-1 {
result[t], result[h] = result[h], result[t]
}
return strings.Join(result, separator)
}
And s := partition("11001000", "|", 3) will give you 11|001|000.
Here is a little test:
func TestSmokeTest(t *testing.T) {
input := "11001000"
s := partition(input, "|", 3)
if s != "11|001|000" {
t.Fail()
}
s = partition(input, "|", 2)
if s != "11|00|10|00" {
t.Fail()
}
input = "0111001000"
s = partition(input, "|", 3)
if s != "0|111|001|000" {
t.Fail()
}
s = partition(input, "|", 2)
if s != "01|11|00|10|00" {
t.Fail()
}
}

How to compare strings in golang? [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 4 years ago.
Improve this question
I want to make a function that calculates the length of the common segment (starting from the beginning) in two strings. For example:
foo:="Makan"
bar:="Makon"
The result should be 3.
foo:="Indah"
bar:="Ihkasyandehlo"
The result should be 1.

It's not clear what you are asking because you limited your test cases to ASCII characters.
I've added a Unicode test case and I've included answers for bytes, runes, or both.
play.golang.org:
package main
import (
"fmt"
"unicode/utf8"
)
func commonBytes(s, t string) (bytes int) {
if len(s) > len(t) {
s, t = t, s
}
i := 0
for ; i < len(s); i++ {
if s[i] != t[i] {
break
}
}
return i
}
func commonRunes(s, t string) (runes int) {
if len(s) > len(t) {
s, t = t, s
}
i := 0
for ; i < len(s); i++ {
if s[i] != t[i] {
break
}
}
return utf8.RuneCountInString(s[:i])
}
func commonBytesRunes(s, t string) (bytes, runes int) {
if len(s) > len(t) {
s, t = t, s
}
i := 0
for ; i < len(s); i++ {
if s[i] != t[i] {
break
}
}
return i, utf8.RuneCountInString(s[:i])
}
func main() {
Tests := []struct {
word1, word2 string
}{
{"Makan", "Makon"},
{"Indah", "Ihkasyandehlo"},
{"日本語", "日本語"},
}
for _, test := range Tests {
fmt.Println("Words: ", test.word1, test.word2)
fmt.Println("Bytes: ", commonBytes(test.word1, test.word2))
fmt.Println("Runes: ", commonRunes(test.word1, test.word2))
fmt.Print("Bytes & Runes: ")
fmt.Println(commonBytesRunes(test.word1, test.word2))
}
}
Output:
Words: Makan Makon
Bytes: 3
Runes: 3
Bytes & Runes: 3 3
Words: Indah Ihkasyandehlo
Bytes: 1
Runes: 1
Bytes & Runes: 1 1
Words: 日本語 日本語
Bytes: 9
Runes: 3
Bytes & Runes: 9 3

Note that if you were working with Unicode characters, the result could be quite different.
Try for instance using utf8.DecodeRuneInString().
See this example:
package main
import "fmt"
import "unicode/utf8"
func index(s1, s2 string) int {
res := 0
for i, w := 0, 0; i < len(s2); i += w {
if i >= len(s1) {
return res
}
runeValue1, width := utf8.DecodeRuneInString(s1[i:])
runeValue2, width := utf8.DecodeRuneInString(s2[i:])
if runeValue1 != runeValue2 {
return res
}
if runeValue1 == utf8.RuneError || runeValue2 == utf8.RuneError {
return res
}
w = width
res = i + w
}
return res
}
func main() {
foo := "日本本a語"
bar := "日本本b語"
fmt.Println(index(foo, bar))
foo = "日本語"
bar = "日otest"
fmt.Println(index(foo, bar))
foo = "\xF0"
bar = "\xFF"
fmt.Println(index(foo, bar))
}
Here, the result would be:
9 (3 common runes of width '3')
3 (1 rune of width '3')
0 (invalid rune, meaning utf8.RuneError)

You mean like this. Please note, this will not handle UTF 8, only ascii.
package main
import (
"fmt"
)
func equal(s1, s2 string) int {
eq := 0
if len(s1) > len(s2) {
s1, s2 = s2, s1
}
for key, _ := range s1 {
if s1[key] == s2[key] {
eq++
} else {
break
}
}
return eq
}
func main() {
fmt.Println(equal("buzzfizz", "buzz"))
fmt.Println(equal("Makan", "Makon"))
fmt.Println(equal("Indah", "Ihkasyandehlo"))
}

How to reverse a string in Go?

How can we reverse a simple string in Go?

In Go1 rune is a builtin type.
func Reverse(s string) string {
runes := []rune(s)
for i, j := 0, len(runes)-1; i < j; i, j = i+1, j-1 {
runes[i], runes[j] = runes[j], runes[i]
}
return string(runes)
}

Russ Cox, on the golang-nuts mailing list, suggests
package main
import "fmt"
func main() {
input := "The quick brown 狐 jumped over the lazy 犬"
// Get Unicode code points.
n := 0
rune := make([]rune, len(input))
for _, r := range input {
rune[n] = r
n++
}
rune = rune[0:n]
// Reverse
for i := 0; i < n/2; i++ {
rune[i], rune[n-1-i] = rune[n-1-i], rune[i]
}
// Convert back to UTF-8.
output := string(rune)
fmt.Println(output)
}

This works, without all the mucking about with functions:
func Reverse(s string) (result string) {
for _,v := range s {
result = string(v) + result
}
return
}

From Go example projects: golang/example/stringutil/reverse.go, by Andrew Gerrand
/*
Copyright 2014 Google Inc.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
// Reverse returns its argument string reversed rune-wise left to right.
func Reverse(s string) string {
r := []rune(s)
for i, j := 0, len(r)-1; i < len(r)/2; i, j = i+1, j-1 {
r[i], r[j] = r[j], r[i]
}
return string(r)
}
Go Playground for reverse a string
After reversing string "bròwn", the correct result should be "nwòrb", not "nẁorb".
Note the grave above the letter o.
For preserving Unicode combining characters such as "as⃝df̅" with reverse result "f̅ds⃝a",
please refer to another code listed below:
http://rosettacode.org/wiki/Reverse_a_string#Go

This works on unicode strings by considering 2 things:
range works on string by enumerating unicode characters
string can be constructed from int slices where each element is a unicode character.
So here it goes:
func reverse(s string) string {
o := make([]int, utf8.RuneCountInString(s));
i := len(o);
for _, c := range s {
i--;
o[i] = c;
}
return string(o);
}

There are too many answers here. Some of them are clear duplicates. But even from the left one, it is hard to select the best solution.
So I went through the answers, thrown away the one that does not work for unicode and also removed duplicates. I benchmarked the survivors to find the fastest. So here are the results with attribution (if you notice the answers that I missed, but worth adding, feel free to modify the benchmark):
Benchmark_rmuller-4 100000 19246 ns/op
Benchmark_peterSO-4 50000 28068 ns/op
Benchmark_russ-4 50000 30007 ns/op
Benchmark_ivan-4 50000 33694 ns/op
Benchmark_yazu-4 50000 33372 ns/op
Benchmark_yuku-4 50000 37556 ns/op
Benchmark_simon-4 3000 426201 ns/op
So here is the fastest method by rmuller:
func Reverse(s string) string {
size := len(s)
buf := make([]byte, size)
for start := 0; start < size; {
r, n := utf8.DecodeRuneInString(s[start:])
start += n
utf8.EncodeRune(buf[size-start:], r)
}
return string(buf)
}
For some reason I can't add a benchmark, so you can copy it from PlayGround (you can't run tests there). Rename it and run go test -bench=.

I noticed this question when Simon posted his solution which, since strings are immutable, is very inefficient. The other proposed solutions are also flawed; they don't work or they are inefficient.
Here's an efficient solution that works, except when the string is not valid UTF-8 or the string contains combining characters.
package main
import "fmt"
func Reverse(s string) string {
n := len(s)
runes := make([]rune, n)
for _, rune := range s {
n--
runes[n] = rune
}
return string(runes[n:])
}
func main() {
fmt.Println(Reverse(Reverse("Hello, 世界")))
fmt.Println(Reverse(Reverse("The quick brown 狐 jumped over the lazy 犬")))
}

I wrote the following Reverse function which respects UTF8 encoding and combined characters:
// Reverse reverses the input while respecting UTF8 encoding and combined characters
func Reverse(text string) string {
textRunes := []rune(text)
textRunesLength := len(textRunes)
if textRunesLength <= 1 {
return text
}
i, j := 0, 0
for i < textRunesLength && j < textRunesLength {
j = i + 1
for j < textRunesLength && isMark(textRunes[j]) {
j++
}
if isMark(textRunes[j-1]) {
// Reverses Combined Characters
reverse(textRunes[i:j], j-i)
}
i = j
}
// Reverses the entire array
reverse(textRunes, textRunesLength)
return string(textRunes)
}
func reverse(runes []rune, length int) {
for i, j := 0, length-1; i < length/2; i, j = i+1, j-1 {
runes[i], runes[j] = runes[j], runes[i]
}
}
// isMark determines whether the rune is a marker
func isMark(r rune) bool {
return unicode.Is(unicode.Mn, r) || unicode.Is(unicode.Me, r) || unicode.Is(unicode.Mc, r)
}
I did my best to make it as efficient and readable as possible. The idea is simple, traverse through the runes looking for combined characters then reverse the combined characters' runes in-place. Once we have covered them all, reverse the runes of the entire string also in-place.
Say we would like to reverse this string bròwn. The ò is represented by two runes, one for the o and one for this unicode \u0301a that represents the "grave".
For simplicity, let's represent the string like this bro'wn. The first thing we do is look for combined characters and reverse them. So now we have the string br'own. Finally, we reverse the entire string and end up with nwo'rb. This is returned to us as nwòrb
You can find it here https://github.com/shomali11/util if you would like to use it.
Here are some test cases to show a couple of different scenarios:
func TestReverse(t *testing.T) {
assert.Equal(t, Reverse(""), "")
assert.Equal(t, Reverse("X"), "X")
assert.Equal(t, Reverse("b\u0301"), "b\u0301")
assert.Equal(t, Reverse("😎⚽"), "⚽😎")
assert.Equal(t, Reverse("Les Mise\u0301rables"), "selbare\u0301siM seL")
assert.Equal(t, Reverse("ab\u0301cde"), "edcb\u0301a")
assert.Equal(t, Reverse("This `\xc5` is an invalid UTF8 character"), "retcarahc 8FTU dilavni na si `�` sihT")
assert.Equal(t, Reverse("The quick bròwn 狐 jumped over the lazy 犬"), "犬 yzal eht revo depmuj 狐 nwòrb kciuq ehT")
}

//Reverse reverses string using strings.Builder. It's about 3 times faster
//than the one with using a string concatenation
func Reverse(in string) string {
var sb strings.Builder
runes := []rune(in)
for i := len(runes) - 1; 0 <= i; i-- {
sb.WriteRune(runes[i])
}
return sb.String()
}
//Reverse reverses string using string
func Reverse(in string) (out string) {
for _, r := range in {
out = string(r) + out
}
return
}
BenchmarkReverseStringConcatenation-8 1000000 1571 ns/op 176 B/op 29 allocs/op
BenchmarkReverseStringsBuilder-8 3000000 499 ns/op 56 B/op 6 allocs/op
Using strings.Builder is about 3 times faster than using string concatenation

Here is quite different, I would say more functional approach, not listed among other answers:
func reverse(s string) (ret string) {
for _, v := range s {
defer func(r rune) { ret += string(r) }(v)
}
return
}

This is the fastest implementation
func Reverse(s string) string {
size := len(s)
buf := make([]byte, size)
for start := 0; start < size; {
r, n := utf8.DecodeRuneInString(s[start:])
start += n
utf8.EncodeRune(buf[size-start:], r)
}
return string(buf)
}
const (
s = "The quick brown 狐 jumped over the lazy 犬"
reverse = "犬 yzal eht revo depmuj 狐 nworb kciuq ehT"
)
func TestReverse(t *testing.T) {
if Reverse(s) != reverse {
t.Error(s)
}
}
func BenchmarkReverse(b *testing.B) {
for i := 0; i < b.N; i++ {
Reverse(s)
}
}

A simple stroke with rune:
func ReverseString(s string) string {
runes := []rune(s)
size := len(runes)
for i := 0; i < size/2; i++ {
runes[size-i-1], runes[i] = runes[i], runes[size-i-1]
}
return string(runes)
}
func main() {
fmt.Println(ReverseString("Abcdefg 汉语 The God"))
}
: doG ehT 语汉 gfedcbA

You could also import an existing implementation:
import "4d63.com/strrev"
Then:
strrev.Reverse("abåd") // returns "dåba"
Or to reverse a string including unicode combining characters:
strrev.ReverseCombining("abc\u0301\u031dd") // returns "d\u0301\u031dcba"
These implementations supports correct ordering of unicode multibyte and combing characters when reversed.
Note: Built-in string reverse functions in many programming languages do not preserve combining, and identifying combining characters requires significantly more execution time.

func ReverseString(str string) string {
output :=""
for _, char := range str {
output = string(char) + output
}
return output
}
// "Luizpa" -> "apziuL"
// "123日本語" -> "語本日321"
// "⚽😎" -> "😎⚽"
// "´a´b´c´" -> "´c´b´a´"

This code preserves sequences of combining characters intact, and
should work with invalid UTF-8 input too.
package stringutil
import "code.google.com/p/go.text/unicode/norm"
func Reverse(s string) string {
bound := make([]int, 0, len(s) + 1)
var iter norm.Iter
iter.InitString(norm.NFD, s)
bound = append(bound, 0)
for !iter.Done() {
iter.Next()
bound = append(bound, iter.Pos())
}
bound = append(bound, len(s))
out := make([]byte, 0, len(s))
for i := len(bound) - 2; i >= 0; i-- {
out = append(out, s[bound[i]:bound[i+1]]...)
}
return string(out)
}
It could be a little more efficient if the unicode/norm primitives
allowed iterating through the boundaries of a string without
allocating. See also https://code.google.com/p/go/issues/detail?id=9055 .

If you need to handle grapheme clusters, use unicode or regexp module.
package main
import (
"unicode"
"regexp"
)
func main() {
str := "\u0308" + "a\u0308" + "o\u0308" + "u\u0308"
println("u\u0308" + "o\u0308" + "a\u0308" + "\u0308" == ReverseGrapheme(str))
println("u\u0308" + "o\u0308" + "a\u0308" + "\u0308" == ReverseGrapheme2(str))
}
func ReverseGrapheme(str string) string {
buf := []rune("")
checked := false
index := 0
ret := ""
for _, c := range str {
if !unicode.Is(unicode.M, c) {
if len(buf) > 0 {
ret = string(buf) + ret
}
buf = buf[:0]
buf = append(buf, c)
if checked == false {
checked = true
}
} else if checked == false {
ret = string(append([]rune(""), c)) + ret
} else {
buf = append(buf, c)
}
index += 1
}
return string(buf) + ret
}
func ReverseGrapheme2(str string) string {
re := regexp.MustCompile("\\PM\\pM*|.")
slice := re.FindAllString(str, -1)
length := len(slice)
ret := ""
for i := 0; i < length; i += 1 {
ret += slice[length-1-i]
}
return ret
}

It's assuredly not the most memory efficient solution, but for a "simple" UTF-8 safe solution the following will get the job done and not break runes.
It's in my opinion the most readable and understandable on the page.
func reverseStr(str string) (out string) {
for _, s := range str {
out = string(s) + out
}
return
}

The following two methods run faster than the fastest solution that preserve combining characters, though that's not to say I'm missing something in my benchmark setup.
//input string s
bs := []byte(s)
var rs string
for len(bs) > 0 {
r, size := utf8.DecodeLastRune(bs)
rs += fmt.Sprintf("%c", r)
bs = bs[:len(bs)-size]
} // rs has reversed string
Second method inspired by this
//input string s
bs := []byte(s)
cs := make([]byte, len(bs))
b1 := 0
for len(bs) > 0 {
r, size := utf8.DecodeLastRune(bs)
d := make([]byte, size)
_ = utf8.EncodeRune(d, r)
b1 += copy(cs[b1:], d)
bs = bs[:len(bs) - size]
} // cs has reversed bytes

NOTE: This answer is from 2009, so there are probably better solutions out there by now.
Looks a bit 'roundabout', and probably not very efficient, but illustrates how the Reader interface can be used to read from strings. IntVectors also seem very suitable as buffers when working with utf8 strings.
It would be even shorter when leaving out the 'size' part, and insertion into the vector by Insert, but I guess that would be less efficient, as the whole vector then needs to be pushed back by one each time a new rune is added.
This solution definitely works with utf8 characters.
package main
import "container/vector";
import "fmt";
import "utf8";
import "bytes";
import "bufio";
func
main() {
toReverse := "Smørrebrød";
fmt.Println(toReverse);
fmt.Println(reverse(toReverse));
}
func
reverse(str string) string {
size := utf8.RuneCountInString(str);
output := vector.NewIntVector(size);
input := bufio.NewReader(bytes.NewBufferString(str));
for i := 1; i <= size; i++ {
rune, _, _ := input.ReadRune();
output.Set(size - i, rune);
}
return string(output.Data());
}

func Reverse(s string) string {
r := []rune(s)
var output strings.Builder
for i := len(r) - 1; i >= 0; i-- {
output.WriteString(string(r[i]))
}
return output.String()
}

Simple, Sweet and Performant
func reverseStr(str string) string {
strSlice := []rune(str) //converting to slice of runes
length := len(strSlice)
for i := 0; i < (length / 2); i++ {
strSlice[i], strSlice[length-i-1] = strSlice[length-i-1], strSlice[i]
}
return string(strSlice) //converting back to string
}

Reversing a string by word is a similar process. First, we convert the string into an array of strings where each entry is a word. Next, we apply the normal reverse loop to that array. Finally, we smush the results back together into a string that we can return to the caller.
package main
import (
"fmt"
"strings"
)
func reverse_words(s string) string {
words := strings.Fields(s)
for i, j := 0, len(words)-1; i < j; i, j = i+1, j-1 {
words[i], words[j] = words[j], words[i]
}
return strings.Join(words, " ")
}
func main() {
fmt.Println(reverse_words("one two three"))
}

Another hack is to use built-in language features, for example, defer:
package main
import "fmt"
func main() {
var name string
fmt.Scanln(&name)
for _, char := range []rune(name) {
defer fmt.Printf("%c", char) // <-- LIFO does it all for you
}
}

For simple strings it possible to use such construction:
func Reverse(str string) string {
if str != "" {
return Reverse(str[1:]) + str[:1]
}
return ""
}
For Unicode strings it might look like this:
func RecursiveReverse(str string) string {
if str == "" {
return ""
}
runes := []rune(str)
return RecursiveReverse(string(runes[1:])) + string(runes[0])
}

A version which I think works on unicode. It is built on the utf8.Rune functions:
func Reverse(s string) string {
b := make([]byte, len(s));
for i, j := len(s)-1, 0; i >= 0; i-- {
if utf8.RuneStart(s[i]) {
rune, size := utf8.DecodeRuneInString(s[i:len(s)]);
utf8.EncodeRune(rune, b[j:j+size]);
j += size;
}
}
return string(b);
}

rune is a type, so use it. Moreover, Go doesn't use semicolons.
func reverse(s string) string {
l := len(s)
m := make([]rune, l)
for _, c := range s {
l--
m[l] = c
}
return string(m)
}
func main() {
str := "the quick brown 狐 jumped over the lazy 犬"
fmt.Printf("reverse(%s): [%s]\n", str, reverse(str))
}

try below code:
package main
import "fmt"
func reverse(s string) string {
chars := []rune(s)
for i, j := 0, len(chars)-1; i < j; i, j = i+1, j-1 {
chars[i], chars[j] = chars[j], chars[i]
}
return string(chars)
}
func main() {
fmt.Printf("%v\n", reverse("abcdefg"))
}
for more info check http://golangcookbook.com/chapters/strings/reverse/
and http://www.dotnetperls.com/reverse-string-go

func reverseString(someString string) string {
runeString := []rune(someString)
var reverseString string
for i := len(runeString)-1; i >= 0; i -- {
reverseString += string(runeString[i])
}
return reverseString
}

Strings are immutable object in golang, unlike C inplace reverse is not possible with golang.
With C , you can do something like,
void reverseString(char *str) {
int length = strlen(str)
for(int i = 0, j = length-1; i < length/2; i++, j--)
{
char tmp = str[i];
str[i] = str[j];
str[j] = tmp;
}
}
But with golang, following one, uses byte to convert the input into bytes first and then reverses the byte array once it is reversed, convert back to string before returning. works only with non unicode type string.
package main
import "fmt"
func main() {
s := "test123 4"
fmt.Println(reverseString(s))
}
func reverseString(s string) string {
a := []byte(s)
for i, j := 0, len(s)-1; i < j; i++ {
a[i], a[j] = a[j], a[i]
j--
}
return string(a)
}

Here is yet another solution:
func ReverseStr(s string) string {
chars := []rune(s)
rev := make([]rune, 0, len(chars))
for i := len(chars) - 1; i >= 0; i-- {
rev = append(rev, chars[i])
}
return string(rev)
}
However, yazu's solution above is more elegant since he reverses the []rune slice in place.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Split string by length in Golang - string

An easy solution using regex re := regexp.MustCompile((\S{3})) x := re.FindAllStringSubmatch("HelloWorld", -1) fmt.Println(x) https://play.golang.org/p/mfmaQlSRkHe

Related

Efficient reverse indexing of number (type string) in Go

Bitmasking conversion of CPU ids with Go

Is there a better way to insert "|' into binary string rep to get this 10|000|001

How to compare strings in golang? [closed]

How to reverse a string in Go?

Categories

Resources