Replace a character in a string in golang

I am trying to replace a specific position character from an array of strings. Here is what my code looks like:
package main
import (
"fmt"
)
func main() {
str := []string{"test","testing"}
str[0][2] = 'y'
fmt.Println(str)
}
Now, running this gives me the error:
cannot assign to str[0][2]
Any idea how to do this? I have tried using strings.Replace, but AFAIK it will replace all occurrences of the given character, while I want to replace only the character at that specific position. Any help is appreciated. TIA.

Strings in Go are immutable: you can't change their content. To change the value of a string variable, you have to assign a new string value.
An easy way is to first convert the string to a byte or rune slice, do the change and convert back:
s := []byte(str[0])
s[2] = 'y'
str[0] = string(s)
fmt.Println(str)
This will output (try it on the Go Playground):
[teyt testing]
Note: I converted the string to byte slice, because this is what happens when you index a string: it indexes its bytes. A string stores the UTF-8 byte sequence of the text, which may not necessarily map bytes to characters one-to-one.
If you need to replace the character at index 2 (rather than the byte at index 2), use []rune instead:
s := []rune(str[0])
s[2] = 'y'
str[0] = string(s)
fmt.Println(str)
In this example it doesn't matter though, but in general it may.
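To see where the distinction matters, here is a minimal sketch that applies both conversions to a string containing a multi-byte character. The helper names replaceByteAt and replaceRuneAt are hypothetical, and neither guards against out-of-range indices.

```go
package main

import "fmt"

// replaceByteAt returns a copy of s with the byte at index i set to b.
// Hypothetical helper; it panics if i is out of range.
func replaceByteAt(s string, i int, b byte) string {
	bs := []byte(s)
	bs[i] = b
	return string(bs)
}

// replaceRuneAt returns a copy of s with the rune at index i set to r.
// Hypothetical helper; it panics if i is out of range.
func replaceRuneAt(s string, i int, r rune) string {
	rs := []rune(s)
	rs[i] = r
	return string(rs)
}

func main() {
	// Pure ASCII: both approaches agree.
	fmt.Println(replaceByteAt("test", 2, 'y')) // teyt
	fmt.Println(replaceRuneAt("test", 2, 'y')) // teyt

	// U+00E9 (e with acute) occupies two bytes in UTF-8, so byte index 2
	// is its second byte and overwriting it leaves invalid UTF-8 behind.
	fmt.Printf("%q\n", replaceByteAt("h\u00e9llo", 2, 'y')) // "h\xc3yllo"
	fmt.Printf("%q\n", replaceRuneAt("h\u00e9llo", 2, 'y')) // "héylo"
}
```

The byte version is cheaper, but it is only safe when the text is known to be single-byte (for example ASCII) up to the index being replaced.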
Also note that strings.Replace() does not (necessarily) replace all occurrences:
func Replace(s, old, new string, n int) string
The parameter n sets the maximum number of replacements to perform (a negative n means no limit). So the following also works, because the first occurrence of "s" happens to be at index 2 (try it on the Go Playground):
str[0] = strings.Replace(str[0], "s", "y", 1)
Yet another solution is to slice the string up to the character to be replaced, slice it again starting from the character after that one, and concatenate the two pieces around the new character (try this one on the Go Playground):
str[0] = str[0][:2] + "y" + str[0][3:]
Care must be taken here too: the slice indices are byte indices, not character (rune) indices.
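One way to keep the slicing approach while addressing characters by rune index is to translate the rune index into a byte offset first. The following is only a sketch; replaceAtRuneIndex is a hypothetical name, and it assumes the string is valid UTF-8.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// replaceAtRuneIndex returns s with its n-th rune (0-based) replaced by repl.
// The second result is false if s has fewer than n+1 runes.
// Hypothetical helper written for this sketch.
func replaceAtRuneIndex(s string, n int, repl string) (string, bool) {
	if n < 0 {
		return s, false
	}
	count := 0
	for start := range s { // start is the byte offset of each rune
		if count == n {
			// size is the number of bytes the rune at start occupies.
			_, size := utf8.DecodeRuneInString(s[start:])
			return s[:start] + repl + s[start+size:], true
		}
		count++
	}
	return s, false
}

func main() {
	fmt.Println(replaceAtRuneIndex("test", 2, "y"))   // teyt true
	fmt.Println(replaceAtRuneIndex("日本語", 1, "y")) // 日y語 true
	fmt.Println(replaceAtRuneIndex("ab", 5, "y"))     // ab false
}
```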
See related question: Immutable string and pointer address

Here's a function that will do that for you. It takes care of converting the string that you want to modify into a []rune, and then back out to string.
If your intention is to replace bytes rather than runes, you can:
copy this function's code and rename it from runeSub to byteSub
change the r rune parameter to b byte, and convert the string to []byte instead of []rune
Also available on repl.it
package main
import "fmt"
// runeSub - given an array of strings (ss), replace the
// (ri)th rune (character) in the (si)th string
// of (ss), with the rune (r)
//
// ss - the array of strings
// si - the index of the string in ss that you want to modify
// ri - the index of the rune in ss[si] that you want to replace
// r - the rune you want to insert
//
// NOTE: this function has no panic protection from things like
// out-of-bound index values
func runeSub(ss []string, si, ri int, r rune) {
rr := []rune(ss[si])
rr[ri] = r
ss[si] = string(rr)
}
func main() {
ss := []string{"test","testing"}
runeSub(ss, 0, 2, 'y')
fmt.Println(ss)
}

Related

How to write the result of a function in a slice

In the example, everything works fine. But they do not use the variable a and immediately display it https://play.golang.org/p/O0XwtQJRej
But I have a problem:
package main
import (
"fmt"
"strings"
)
func main() {
str := "fulltext"
var slice []string
slice = strings.Split(str , "")
fmt.Printf("anwer: ", slice)
}
The output contains superfluous characters, for example:
%!(EXTRA []string=
P.S. I know that I need to use append to add elements to the slice, but I do not understand how to apply append here.
Update:
Now I have the answer:
anwer: %!(EXTRA []string=[f u l l t e x t])
But I need just:
[f u l l t e x t]
But I do not understand how I should change my code?
The problem is not with the assignment of the return value of strings.Split() to the local variable slice, which is totally fine.
The problem is that you used fmt.Printf() which expects a format string, and based on that format string it formats / substitutes expected parameters. Since your format string does not contain any verbs, that fmt.Printf() call expects no parameters, yet you pass it one, so it signals this with those extra characters (kind of error string).
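The effect is easy to reproduce with fmt.Sprintf(), which formats the same way but returns the result as a string. In this small sketch the helper render is hypothetical; it only exists so the same operand can be formatted with different format strings.

```go
package main

import "fmt"

// render formats parts with a format string chosen at run time.
// Hypothetical helper used only to compare format strings.
func render(format string, parts []string) string {
	return fmt.Sprintf(format, parts)
}

func main() {
	parts := []string{"f", "u", "l", "l"}

	// No verb in the format string: the operand is reported as EXTRA.
	fmt.Println(render("answer: ", parts)) // answer: %!(EXTRA []string=[f u l l])

	// One verb for one operand: the operand is substituted.
	fmt.Println(render("answer: %v", parts)) // answer: [f u l l]
}
```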
Provide a valid format string where you indicate you will supply 1 parameter, a slice:
fmt.Printf("answer: %v", slice)
With this, the output is:
answer: [f u l l t e x t]
Or alternatively use fmt.Println(), which does not expect a format string:
fmt.Println("answer:", slice)
(Note that there is no space after the colon in the string, because fmt.Println() always adds a space between operands.)
Output is the same. Try the examples on the Go Playground.
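The spacing rules differ between fmt.Println() and fmt.Print(): according to the fmt documentation, Println always puts a space between operands, while Print only does so when neither neighbouring operand is a string. A short sketch to compare them; joinPrintln and joinPrint are hypothetical wrappers that return the formatted text.

```go
package main

import "fmt"

// joinPrintln returns what fmt.Println would print for the operands.
// Hypothetical wrapper for this comparison.
func joinPrintln(a ...interface{}) string { return fmt.Sprintln(a...) }

// joinPrint returns what fmt.Print would print for the operands.
// Hypothetical wrapper for this comparison.
func joinPrint(a ...interface{}) string { return fmt.Sprint(a...) }

func main() {
	slice := []string{"f", "u"}

	fmt.Printf("%q\n", joinPrintln("answer:", slice)) // "answer: [f u]\n"
	fmt.Printf("%q\n", joinPrint("answer:", slice))   // "answer:[f u]"
	fmt.Printf("%q\n", joinPrint(1, 2))               // "1 2"
}
```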
Staying with fmt.Printf(): when the operands involve string values, the %q verb is often more useful. It prints quoted string values, which makes certain mistakes much easier to spot (e.g. invisible characters or embedded spaces):
fmt.Printf("answer: %q\n", slice)
Output of this (try it on the Go Playground):
answer: ["f" "u" "l" "l" "t" "e" "x" "t"]
If you wanted to append the result of a function call to an existing slice, it could look like this:
slice := []string{"initial", "content"}
slice = append(slice, strings.Split(str, "")...)
fmt.Printf("answer: %q\n", slice)
And now the output (try it on the Go Playground):
answer: ["initial" "content" "f" "u" "l" "l" "t" "e" "x" "t"]
Give Printf the format string it expects; in most cases, %v is fine.
package main
import (
"fmt"
"strings"
)
func main() {
str := "fulltext"
var slice []string
slice = strings.Split(str, "")
fmt.Printf("answer: %v", slice)
}
see https://golang.org/pkg/fmt/ for more info.

Golang convert integer to unicode character

Given the following input:
intVal := 2612
strVal := "2612"
What is a mechanism for mapping these to the associated Unicode character as a string?
For example, the following code prints "☒"
fmt.Println("\u2612")
But the following does not work:
fmt.Println("\\u" + strVal)
I researched runes, strconv, and unicode/utf8 but was unable to find a suitable conversion strategy.
2612 is not the integer value of the Unicode rune; the integer value of \u2612 is 9746 (0x2612). The string "2612" is the hex representation of the rune's value, so parse it as a hexadecimal number and convert the result to a rune.
i, err := strconv.ParseInt(strVal, 16, 32)
if err != nil {
log.Fatal(err)
}
r := rune(i)
fmt.Println(string(r))
https://play.golang.org/p/t_e6AfbKQq
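The same conversion can be wrapped in a small function that also rejects values which are not valid code points. This is a sketch under the assumption that the input is always a bare hex string such as "2612"; hexToString is a hypothetical name.

```go
package main

import (
	"fmt"
	"strconv"
	"unicode/utf8"
)

// hexToString converts a hexadecimal code point such as "2612" into the
// corresponding one-character string. Hypothetical helper for this sketch.
func hexToString(hex string) (string, error) {
	n, err := strconv.ParseInt(hex, 16, 32)
	if err != nil {
		return "", err
	}
	r := rune(n)
	// ValidRune rejects negative values, surrogate halves and values
	// beyond the Unicode range.
	if !utf8.ValidRune(r) {
		return "", fmt.Errorf("%q is not a valid Unicode code point", hex)
	}
	return string(r), nil
}

func main() {
	s, err := hexToString("2612")
	fmt.Println(s, err) // ☒ <nil>

	// If only the integer 2612 is available, its decimal digits can be
	// reinterpreted as hex by formatting it back to a string first.
	intVal := 2612
	s, err = hexToString(strconv.Itoa(intVal))
	fmt.Println(s, err) // ☒ <nil>
}
```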
This one works:
fmt.Println("\u2612")
It works because "\u2612" is an interpreted string literal in the source code, and the compiler unquotes (interprets) it. It is not the fmt package that processes the escape sequence.
This doesn't work:
fmt.Println("\\u" + strVal)
It fails because the interpreted string literal "\\u" resolves to the two-character string value \u, which is then concatenated at run time with the value of the local variable strVal (2612), so the final string value is the six characters \u2612. That is not an interpreted string literal but the final result; it is not processed or unquoted any further.
Alternatively to JimB's answer, you may also use strconv.Unquote() which does an unquoting similar to what the compiler does.
See this example:
// The original that works:
s := "\u2612"
fmt.Println(s, []byte(s))
// Using strconv.Unquote():
strVal := "2612"
s2, err := strconv.Unquote(`"\u` + strVal + `"`)
fmt.Println(s2, []byte(s2), err)
fmt.Println(s == s2)
Output (try it on the Go Playground):
☒ [226 152 146]
☒ [226 152 146] <nil>
true
Something to note here: we want strconv.Unquote() to unquote the text \u2612, but Unquote() requires its input to be in quotes ("Unquote interprets s as a single-quoted, double-quoted, or backquoted Go string literal..."), which is why we prepended and appended a quotation mark.

Go lang's equivalent of charCode() method of JavaScript

The charCodeAt() method in JavaScript returns the numeric Unicode value of the character at the given index, e.g.
"s".charCodeAt(0) // returns 115
How would I get the numeric Unicode value of the same string/letter in Go?
The character type in Go is rune which is an alias for int32 so it is already a number, just print it.
You still need a way to get the character at the specified position. Simplest way is to convert the string to a []rune which you can index. To convert a string to runes, simply use the type conversion []rune("some string"):
fmt.Println([]rune("s")[0])
Prints:
115
If you want it printed as a character, use the %c verb:
fmt.Println([]rune("absdef")[2]) // Also prints 115
fmt.Printf("%c", []rune("absdef")[2]) // Prints s
Also note that the for range on a string iterates over the runes of the string, so you can also use that. It is more efficient than converting the whole string to []rune:
i := 0
for _, r := range "absdef" {
if i == 2 {
fmt.Println(r)
break
}
i++
}
Note that the counter i must be a distinct counter, it cannot be the loop iteration variable, as the for range returns the byte position and not the rune index (which will be different if the string contains multi-byte characters in the UTF-8 representation).
Wrapping it into a function:
func charCodeAt(s string, n int) rune {
i := 0
for _, r := range s {
if i == n {
return r
}
i++
}
return 0
}
Try these on the Go Playground.
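One caveat when porting from JavaScript: JavaScript strings are sequences of UTF-16 code units, so charCodeAt() and the rune-based charCodeAt function above agree only for characters in the Basic Multilingual Plane. If exact compatibility matters, a sketch along these lines may help; charCodeAtUTF16 is a hypothetical name, and re-encoding the whole string on every call is a simplification.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// charCodeAtUTF16 returns the UTF-16 code unit at index n of s, which is
// the value JavaScript's charCodeAt reports. ok is false if n is out of
// range. Hypothetical helper; it re-encodes the string on every call.
func charCodeAtUTF16(s string, n int) (unit uint16, ok bool) {
	units := utf16.Encode([]rune(s))
	if n < 0 || n >= len(units) {
		return 0, false
	}
	return units[n], true
}

func main() {
	fmt.Println(charCodeAtUTF16("s", 0)) // 115 true

	// U+1F49A lies outside the BMP, so UTF-16 stores it as a surrogate pair.
	fmt.Println(charCodeAtUTF16("\U0001F49A", 0)) // 55357 true
	fmt.Println([]rune("\U0001F49A")[0])          // 128154
}
```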
Also note that strings in Go are stored in memory as a sequence of bytes, which is the UTF-8 encoded byte sequence of the text (read the blog post Strings, bytes, runes and characters in Go for more info). If you have guarantees that the string uses only characters whose code is less than 128 (ASCII), you can simply work with bytes. Indexing a string in Go indexes its bytes, so for example "s"[0] is the byte value of 's', which is 115.
fmt.Println("s"[0]) // Prints 115
fmt.Println("absdef"[2]) // Prints 115
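When that guarantee is not known in advance, it can be checked at run time before falling back to byte indexing. A minimal sketch, with isASCII as a hypothetical helper name:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isASCII reports whether every byte of s is below utf8.RuneSelf (0x80),
// in which case byte indices and rune indices coincide.
// Hypothetical helper for this sketch.
func isASCII(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] >= utf8.RuneSelf {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isASCII("absdef"), "absdef"[2]) // true 115
	fmt.Println(isASCII("h\u00e9llo"))          // false
}
```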
Internally, a string in Go is a sequence of 8-bit bytes. For text that contains only ASCII characters, each byte is the ASCII value of the corresponding character.
str := "abc"
byteValue := str[0]
intValue := int(byteValue)
fmt.Println(byteValue) // 97
fmt.Println(intValue)  // 97

How to get a single Unicode character from string

I wonder how I can get a Unicode character from a string. For example, if the string is "你好", how can I get the first character "你"?
From another place I get one way:
var str = "你好"
runes := []rune(str)
fmt.Println(string(runes[0]))
It does work.
But I still have some questions:
Is there another way to do it?
Why in Go does str[0] not get a Unicode character from a string, but it gets byte data?
First, you may want to read https://blog.golang.org/strings
It will answer part of your questions.
A string in Go can contain arbitrary bytes. When you write str[i], the result is a byte, and the index is always a number of bytes.
Most of the time, strings are encoded in UTF-8 though. You have multiple ways to deal with UTF-8 encoding in a string.
For instance, you can use the for...range statement to iterate on a string rune by rune.
var first rune
for _,c := range str {
first = c
break
}
// first now contains the first rune of the string
You can also leverage the unicode/utf8 package. For instance:
r, size := utf8.DecodeRuneInString(str)
// r contains the first rune of the string
// size is the size of the rune in bytes
If the string is encoded in UTF-8, there is no direct way to access the nth rune of the string, because the size of the runes (in bytes) is not constant. If you need this feature, you can easily write your own helper function to do it (with for...range, or with the unicode/utf8 package).
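Such a helper could be sketched with the unicode/utf8 package as follows. The name nthRune and the (rune, bool) result are choices made for this example, not part of any library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// nthRune returns the n-th rune (0-based) of s by decoding one rune at a
// time. The second result is false when s has fewer than n+1 runes.
// Hypothetical helper for this sketch.
func nthRune(s string, n int) (rune, bool) {
	if n < 0 {
		return 0, false
	}
	for len(s) > 0 {
		r, size := utf8.DecodeRuneInString(s)
		if n == 0 {
			return r, true
		}
		s = s[size:] // skip the bytes of the rune just decoded
		n--
	}
	return 0, false
}

func main() {
	r, ok := nthRune("你好", 1)
	fmt.Println(string(r), ok) // 好 true

	_, ok = nthRune("你好", 2)
	fmt.Println(ok) // false
}
```

The cost is linear in n, since every preceding rune has to be decoded to find where the requested one starts.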
You can use the utf8string package:
package main
import "golang.org/x/exp/utf8string"
func main() {
s := utf8string.NewString("ÄÅàâäåçèéêëìîïü")
// example 1
r := s.At(1)
println(r == 'Å')
// example 2
t := s.Slice(1, 3)
println(t == "Åà")
}
https://pkg.go.dev/golang.org/x/exp/utf8string
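You can do this:
package main
import "fmt"
func main() {
str := "cat"
var s rune
for i, c := range str {
if i == 2 { // i is the byte offset of c; it equals the rune index only for ASCII text
s = c
}
}
fmt.Printf("%c\n", s) // prints t
}
s is now equal to 't', the character at index 2.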

Indexing string as chars

The elements of strings have type byte and may be accessed using the
usual indexing operations.
How can I get an element of a string as a char?
"some"[1] -> "o"
The simplest solution is to convert it to a slice of runes:
var runes = []rune("someString")
fmt.Println(string(runes[1])) // prints o
Note that when you iterate over a string, you don't need the conversion. See this example from Effective Go:
for pos, char := range "日本語" {
fmt.Printf("character %c starts at byte position %d\n", char, pos)
}
This prints
character 日 starts at byte position 0
character 本 starts at byte position 3
character 語 starts at byte position 6
Go strings are usually, but not necessarily, UTF-8 encoded. When they do hold Unicode text, the term "char[acter]" is pretty complex, and there is no general, unique bijection between runes (code points) and Unicode characters.
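A small illustration of why runes and user-visible characters do not map one-to-one: the same accented letter can be written as one precomposed code point or as a base letter followed by a combining mark. The wrapper runeCount below is hypothetical and only forwards to utf8.RuneCountInString.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeCount is a thin, hypothetical wrapper around utf8.RuneCountInString.
func runeCount(s string) int {
	return utf8.RuneCountInString(s)
}

func main() {
	precomposed := "\u00e9" // e with acute as a single code point
	combining := "e\u0301"  // e followed by COMBINING ACUTE ACCENT

	// Both render as the same character, yet they differ in rune count
	// and do not compare equal as strings.
	fmt.Println(runeCount(precomposed), runeCount(combining)) // 1 2
	fmt.Println(precomposed == combining)                     // false
}
```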
Anyway one can easily work with code points (runes) in a slice and use indexes into it using a conversion:
package main
import "fmt"
func main() {
utf8 := "Hello, 世界"
runes := []rune(utf8)
fmt.Printf("utf8:% 02x\nrunes: %#v\n", []byte(utf8), runes)
}
Also here: http://play.golang.org/p/qWVSA-n93o
Note: Often the desire to access Unicode "characters" by index is a design mistake. Most textual data is processed sequentially.
Another option is the package utf8string:
package main
import "golang.org/x/exp/utf8string"
func main() {
s := utf8string.NewString("🧡💛💚💙💜")
t := s.At(2)
println(t == '💚')
}
https://pkg.go.dev/golang.org/x/exp/utf8string
