Shouldn't strconv.Unquote handle both single and double quotes?
See also https://golang.org/src/strconv/quote.go - line 350
However, the following code returns a syntax error:
s, err := strconv.Unquote(`'test'`)
if err != nil {
fmt.Println(err)
} else {
fmt.Println(s)
}
https://play.golang.org/p/TnprqhNdwD1
But double quotes work as expected:
s, err := strconv.Unquote(`"test"`)
if err != nil {
fmt.Println(err)
} else {
fmt.Println(s)
}
What am I missing?
There is no ready function for what you want in the standard library.
What you presented works, but we can make it simpler (and likely more efficient):
func trimQuotes(s string) string {
if len(s) >= 2 {
if c := s[len(s)-1]; s[0] == c && (c == '"' || c == '\'') {
return s[1 : len(s)-1]
}
}
return s
}
Testing it:
fmt.Println(trimQuotes(`'test'`))
fmt.Println(trimQuotes(`"test"`))
fmt.Println(trimQuotes(`"'test`))
Output (try it on the Go Playground):
test
test
"'test
strconv.Unquote does properly handle both single and double quotes, but it isn't intended to be used the way your code snippet invokes it. It's intended for cases where you are processing Go source code and come across a quoted literal. A single-quoted literal is valid only for a single character (a rune), not for a string. In your Go source files, if you try to use single quotes for a multi-character string literal, you'll get a compiler error similar to illegal rune literal.
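As a rough illustration of that behavior, the sketch below feeds a few quoted literals to strconv.Unquote. The sample inputs and the helper name unquoteOrError are made up for this example; only strconv.Unquote itself comes from the discussion above.

```go
package main

import (
	"fmt"
	"strconv"
)

// unquoteOrError reports what strconv.Unquote makes of a quoted Go literal.
// The helper name is hypothetical and exists only for this example.
func unquoteOrError(lit string) string {
	s, err := strconv.Unquote(lit)
	if err != nil {
		return "error: " + err.Error()
	}
	return s
}

func main() {
	fmt.Println(unquoteOrError(`"test"`)) // interpreted string literal
	fmt.Println(unquoteOrError("`test`")) // raw string literal
	fmt.Println(unquoteOrError(`'t'`))    // rune literal: exactly one character
	fmt.Println(unquoteOrError(`'test'`)) // not a valid rune literal, so an error
}
```

Only the last call fails, which matches the syntax error reported in the question.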
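What you can do instead to remove quotes from the start and end of a string is use the strings.Trim function. Note that it strips every leading and trailing character found in the cutset, so unmatched or repeated quotes are removed too.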
s := strings.Trim(`'test'`, `'"`)
fmt.Println(s)
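Because strings.Trim works on a set of characters rather than on a matching pair, it can behave differently from a trimQuotes-style helper. A small sketch, with made-up inputs and a hypothetical wrapper named trimAllQuotes, shows what it does with unbalanced quotes:

```go
package main

import (
	"fmt"
	"strings"
)

// trimAllQuotes strips every leading and trailing ' or " character,
// whether or not the quotes at both ends match.
func trimAllQuotes(s string) string {
	return strings.Trim(s, `'"`)
}

func main() {
	for _, in := range []string{`'test'`, `"test"`, `"'test`, `''test"`} {
		fmt.Printf("%-8s -> %s\n", in, trimAllQuotes(in))
	}
}
```

Whether that is acceptable depends on the input: if an unmatched quote should be preserved, a pair-checking helper is the safer choice.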
Temp workaround:
func trimQuotes(s string) string {
if len(s) >= 2 {
switch {
case s[0] == '"' && s[len(s)-1] == '"':
return s[1 : len(s)-1]
case s[0] == '\'' && s[len(s)-1] == '\'':
return s[1 : len(s)-1]
}
}
return s
}
I have a particular string that I need to run base64 decode on in Go. This string looks something like this:
qU4aaakFmjaaaaI5aaa\/EN\/aaa\/SaaaJaaa6aa+nGnk=
Please note this is not the exact same string but it does have the same shape and number of characters, padding characters and it has those \/ things on the same positions in the string.
Let's call it key.
In PHP if I run
base64_decode($key);
the decode operation is successful
If in Python I run
base64.b64decode(key)
the decode operation is once more successful. Problem is, I can't do base64 decoding on this thing in Go.
dcd, err := base64.StdEncoding.DecodeString("qU4aaakFmjaaaaI5aaa\\/EN\\/aaa\\/SaaaJaaa6aa+nGnk=")
if err != nil {
log.Fatal(err)
}
return dcd
This will return the error
illegal base64 data at input byte 19
In the Go version, I have to escape those backslashes. It seems that the error appears at byte 19. Bearing in mind that this string that I am using as an example has the same length as the string that is actually causing the problem I would believe that the error happens right at the byte with the \ character. What can I do about this?
The standard Base64 alphabet does not contain the backslash, so the input qU4aaakFmjaaaaI5aaa\/EN\/aaa\/SaaaJaaa6aa+nGnk= is not a valid Base64-encoded string.
The forward slash is a valid character in Base64; the backslash is not. It's possible that \/ is an escape sequence designating a single slash. If so, replace each \/ sequence with a single / and you're good to go.
For example:
s := `qU4aaakFmjaaaaI5aaa\/EN\/aaa\/SaaaJaaa6aa+nGnk=`
s = strings.ReplaceAll(s, `\/`, `/`)
dcd, err := base64.StdEncoding.DecodeString(s)
if err != nil {
log.Fatal(err)
}
fmt.Println(string(dcd))
Which outputs (try it on the Go Playground):
�Ni��6�i�9i����i��i��i��i��y
If \/ is not a special sequence and you want to discard all invalid characters from the input, this is how it could be done:
var valid = map[rune]bool{}
func init() {
for _, r := range "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=" {
valid[r] = true
}
}
func clean(s string) string {
return strings.Map(func(r rune) rune {
if valid[r] {
return r
}
return -1
}, s)
}
func main() {
s := `qU4aaakFmjaaaaI5aaa\/EN\/aaa\/SaaaJaaa6aa+nGnk=`
s = clean(s)
dcd, err := base64.StdEncoding.DecodeString(s)
if err != nil {
log.Fatal(err)
}
fmt.Println(string(dcd))
}
Output is the same. Try this one on the Go Playground.
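One common source of \/ is JSON, where it is a legal escape for /. If the key was copied out of a JSON document (an assumption; the question does not say where it came from), letting encoding/json unescape the string literal is an alternative to replacing the sequence by hand. The helper name decodeJSONBase64 is made up for this sketch.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"log"
)

// decodeJSONBase64 takes a JSON string literal (including its surrounding
// double quotes), unescapes it, and Base64-decodes the result.
func decodeJSONBase64(jsonLiteral string) ([]byte, error) {
	var s string
	if err := json.Unmarshal([]byte(jsonLiteral), &s); err != nil {
		return nil, err
	}
	return base64.StdEncoding.DecodeString(s)
}

func main() {
	dcd, err := decodeJSONBase64(`"qU4aaakFmjaaaaI5aaa\/EN\/aaa\/SaaaJaaa6aa+nGnk="`)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%d bytes: %x\n", len(dcd), dcd)
}
```

If the whole payload is JSON, it is usually simpler to unmarshal the complete document and decode the field afterwards, rather than handling the literal in isolation.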
I'm attempting to check if the first character in a string matches the following, note the UTF-8 quote characters:
c := t.Content[0]
if c != '.' && c != ',' && c != '?' && c != '“' && c != '”'{
This code does not work due to the special characters in the last two checks.
What is the correct way to do this?
Indexing a string indexes its bytes (in UTF-8 encoding - this is how Go stores strings in memory), but you want to test the first character.
So you should get the first rune and not the first byte. For efficiency you may use utf8.DecodeRuneInString() which only decodes the first rune. If you need all the runes of the string, you may use type conversion like all := []rune("I'm a string").
See this example:
for _, s := range []string{"asdf", ".asdf", "”asdf"} {
c, _ := utf8.DecodeRuneInString(s)
if c != '.' && c != ',' && c != '?' && c != '“' && c != '”' {
fmt.Println("Ok:", s)
} else {
fmt.Println("Not ok:", s)
}
}
Output (try it on the Go Playground):
Ok: asdf
Not ok: .asdf
Not ok: ”asdf
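To make the byte-versus-rune difference concrete, the following sketch prints the first byte and the first rune of a string that starts with a curly quote, and wraps the check in a helper. The name startsWithPunct and the use of strings.ContainsRune are choices made for this example, not part of the answer above.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// startsWithPunct reports whether the first rune of s is one of . , ? “ ”.
func startsWithPunct(s string) bool {
	r, size := utf8.DecodeRuneInString(s)
	if size == 0 {
		return false // empty string: there is no first character
	}
	return strings.ContainsRune(".,?“”", r)
}

func main() {
	s := "”asdf"
	r, size := utf8.DecodeRuneInString(s)
	fmt.Printf("first byte: %#x\n", s[0])              // only part of the quote character
	fmt.Printf("first rune: %q (%d bytes)\n", r, size) // the whole character
	fmt.Println(startsWithPunct(s), startsWithPunct("asdf"), startsWithPunct(""))
}
```

The size check also covers the empty string, which would make an index expression such as t.Content[0] panic.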
Adding to @icza's great answer: It's worth noting that while indexing of strings is in bytes, range of strings is in runes. So the following also works:
for _, s := range []string{"asdf", ".asdf", "”asdf"} {
for _, c := range s {
if c != '.' && c != ',' && c != '?' && c != '“' && c != '”' {
fmt.Println("Ok:", s)
} else {
fmt.Println("Not ok:", s)
}
break // we break after the first character regardless
}
}
Input   Output
abc     abc___
a       a_____
abcde   abcde_
Attempt
package main
import "fmt"
import "unicode/utf8"
func main() {
input := "abc"
if utf8.RuneCountInString(input) == 1 {
fmt.Println(input + "_____")
} else if utf8.RuneCountInString(input) == 2 {
fmt.Println(input + "____")
} else if utf8.RuneCountInString(input) == 3 {
fmt.Println(input + "___")
} else if utf8.RuneCountInString(input) == 4 {
fmt.Println(input + "__")
} else if utf8.RuneCountInString(input) == 5 {
fmt.Println(input + "_")
} else {
fmt.Println(input)
}
}
returns
abc___
Discussion
Although the code produces the expected output, it looks very verbose and convoluted.
Question
Is there a concise way?
The strings package has a Repeat function, so something like
input += strings.Repeat("_", desiredLen - utf8.RuneCountInString(input))
would be simpler. You should probably check that desiredLen is not smaller than the input length first, since strings.Repeat panics if the count is negative.
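A minimal sketch of that suggestion, with the length check included, might look like this. The function name padRight and the target length of 6 are assumptions taken from the sample table in the question.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// padRight appends underscores until s is desiredLen runes long.
// Strings that are already long enough are returned unchanged.
func padRight(s string, desiredLen int) string {
	missing := desiredLen - utf8.RuneCountInString(s)
	if missing <= 0 {
		return s // strings.Repeat would panic on a negative count
	}
	return s + strings.Repeat("_", missing)
}

func main() {
	for _, in := range []string{"abc", "a", "abcde", "abcdefgh"} {
		fmt.Println(padRight(in, 6))
	}
}
```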
You can also do this efficiently without loops and "external" function calls, by slicing a prepared "max padding" (slice out the required padding and simply add it to the input):
const max = "______"
func pad(s string) string {
if i := utf8.RuneCountInString(s); i < len(max) {
s += max[i:]
}
return s
}
Using it:
fmt.Println(pad("abc"))
fmt.Println(pad("a"))
fmt.Println(pad("abcde"))
Output (try it on the Go Playground):
abc___
a_____
abcde_
Notes:
len(max) is a constant (because max is a constant): Spec: Length and capacity:
The expression len(s) is constant if s is a string constant.
Slicing a string is efficient:
An important consequence of this slice-like design for strings is that creating a substring is very efficient. All that needs to happen is the creation of a two-word string header. Since the string is read-only, the original string and the string resulting from the slice operation can share the same array safely.
You could just do input += "_" in a loop, but that would allocate unnecessary strings. Here is a version that doesn't allocate more than it needs (it counts bytes, so it assumes ASCII input):
const limit = 6
func f(s string) string {
if len(s) >= limit {
return s
}
b := make([]byte, limit)
copy(b, s)
for i := len(s); i < limit; i++ {
b[i] = '_'
}
return string(b)
}
Playground: http://play.golang.org/p/B_Wx1449QM.
Let's say for example that I have one string, like this:
<h1>Hello World!</h1>
What Go code would be able to extract Hello World! from that string? I'm still relatively new to Go. Any help is greatly appreciated!
If the string looks like whatever;START;extract;END;whatever you can use this which will get the string in between:
// GetStringInBetween returns the text between the first occurrence of start
// and the next occurrence of end, or an empty string if either is missing.
func GetStringInBetween(str string, start string, end string) (result string) {
	s := strings.Index(str, start)
	if s == -1 {
		return
	}
	s += len(start)
	// e is relative to str[s:], so offset it by s before slicing str.
	e := strings.Index(str[s:], end)
	if e == -1 {
		return
	}
	return str[s : s+e]
}
What happens here: it finds the first index of START, adds the length of the START string, and returns everything from there up to the first following index of END.
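On Go 1.18 or newer, the same find-start-then-find-end idea can be sketched with strings.Cut, which avoids the index arithmetic entirely. The function name between is hypothetical, and the version requirement is an assumption about your toolchain.

```go
package main

import (
	"fmt"
	"strings"
)

// between returns the text after the first occurrence of start and before
// the next occurrence of end. The boolean is false if either is missing.
func between(s, start, end string) (string, bool) {
	_, rest, ok := strings.Cut(s, start)
	if !ok {
		return "", false
	}
	inner, _, ok := strings.Cut(rest, end)
	if !ok {
		return "", false
	}
	return inner, true
}

func main() {
	fmt.Println(between("whatever;START;extract;END;whatever", "START;", ";END"))
	fmt.Println(between("<h1>Hello World!</h1>", "<h1>", "</h1>"))
	fmt.Println(between("no markers here", "<h1>", "</h1>"))
}
```

Returning a boolean makes it possible to tell "markers not found" apart from "markers found with nothing in between".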
There are lots of ways to split strings in all programming languages.
Since I don't know what you are especially asking for I provide a sample way to get the output
you want from your sample.
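package main

import (
	"fmt"
	"strings"
)

func main() {
	initial := "<h1>Hello World!</h1>"
	// TrimPrefix and TrimSuffix remove exact substrings; TrimLeft and
	// TrimRight would treat "<h1>" as a set of characters to strip.
	out := strings.TrimSuffix(strings.TrimPrefix(initial, "<h1>"), "</h1>")
	fmt.Println(out)
}
In the above code you remove <h1> from the left of the string and </h1> from the right.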
As I said there are hundreds of ways to split specific strings and this is only a sample to get you started.
Hope it helps, Good luck with Golang :)
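One thing to watch for with this approach: strings.TrimLeft and strings.TrimRight take a set of characters, not a substring, so they can eat into the content when it starts or ends with a character that also appears in the tag. The sketch below contrasts them with strings.TrimPrefix and strings.TrimSuffix; the function names and the lowercase sample input are made up for the illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// trimCutset strips any of the characters in "<h1>" and "</h1>" from the ends.
func trimCutset(s string) string {
	return strings.TrimLeft(strings.TrimRight(s, "</h1>"), "<h1>")
}

// trimTags strips exactly one leading "<h1>" and one trailing "</h1>".
func trimTags(s string) string {
	return strings.TrimSuffix(strings.TrimPrefix(s, "<h1>"), "</h1>")
}

func main() {
	in := "<h1>hello</h1>"
	fmt.Printf("%q\n", trimCutset(in)) // the leading 'h' of the content is lost
	fmt.Printf("%q\n", trimTags(in))   // the content is left intact
}
```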
I improved Jan Kardaš's answer.
Now you can find strings delimited by start and end markers of more than one character.
func GetStringInBetweenTwoString(str string, startS string, endS string) (result string,found bool) {
s := strings.Index(str, startS)
if s == -1 {
return result,false
}
newS := str[s+len(startS):]
e := strings.Index(newS, endS)
if e == -1 {
return result,false
}
result = newS[:e]
return result,true
}
Here is my answer using regex. Not sure why no one suggested this safest approach
package main
import (
"fmt"
"regexp"
)
func main() {
content := "<h1>Hello World!</h1>"
re := regexp.MustCompile(`<h1>(.*)</h1>`)
match := re.FindStringSubmatch(content)
if len(match) > 1 {
fmt.Println("match found -", match[1])
} else {
fmt.Println("match not found")
}
}
Playground - https://play.golang.org/p/Yc61x1cbZOJ
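A caveat with the pattern above: (.*) is greedy, so with several headings in the input it captures everything between the first <h1> and the last </h1>. A rough comparison with the non-greedy (.*?) follows; the helper name headings and the two-heading input are assumptions for this sketch, and regular expressions remain a poor fit for HTML in general.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	greedy = regexp.MustCompile(`<h1>(.*)</h1>`)
	lazy   = regexp.MustCompile(`<h1>(.*?)</h1>`)
)

// headings returns the first capture group of every match of re in content.
func headings(re *regexp.Regexp, content string) []string {
	var out []string
	for _, m := range re.FindAllStringSubmatch(content, -1) {
		out = append(out, m[1])
	}
	return out
}

func main() {
	content := "<h1>Hello</h1><h1>World!</h1>"
	fmt.Printf("%q\n", headings(greedy, content)) // one match spanning both headings
	fmt.Printf("%q\n", headings(lazy, content))   // one match per heading
}
```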
In the strings pkg you can use the Replacer to great effect.
r := strings.NewReplacer("<h1>", "", "</h1>", "")
fmt.Println(r.Replace("<h1>Hello World!</h1>"))
Go play!
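// findInString returns the bytes between the first occurrence of start and
// the next occurrence of end. It needs the "errors" and "strings" packages.
func findInString(str, start, end string) ([]byte, error) {
	var match []byte
	index := strings.Index(str, start)
	if index == -1 {
		return match, errors.New("Not found")
	}
	index += len(start)
	// Stop at the end of str so a missing end marker cannot cause a panic.
	for index < len(str) {
		if strings.HasPrefix(str[index:], end) {
			return match, nil
		}
		match = append(match, str[index])
		index++
	}
	return nil, errors.New("Not found")
}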
Read up on the strings package. Have a look into the SplitAfter function which can do something like this:
var sample = "[this][is my][string]"
t := strings.SplitAfter(sample, "[")
That should produce a slice something like: "[", "this][", "is my][", "string]". Using further functions for Trimming you should get your solution. Best of luck.
func Split(str, before, after string) string {
a := strings.SplitAfterN(str, before, 2)
b := strings.SplitAfterN(a[len(a)-1], after, 2)
if 1 == len(b) {
return b[0]
}
return b[0][0:len(b[0])-len(after)]
}
The first call to SplitAfterN splits the original string into an array of two parts, divided after the first occurrence of before, or produces an array containing one part equal to the original string.
The second call to SplitAfterN uses a[len(a)-1] as input, as it is "the last item of array a": either the string following before or the original string str. That input is split into an array of two parts, divided after the first occurrence of after, or yields an array containing one part equal to the input.
If after was not found, then we can simply return b[0], as it is equal to a[len(a)-1].
If after is found, it will be included at the end of the b[0] string, so you have to trim it via b[0][0:len(b[0])-len(after)].
All string comparisons are case sensitive.
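To see the two SplitAfterN steps in isolation, this sketch returns the intermediate slices for a sample input. The helper name splitSteps and the sample string are made up, and strings.TrimSuffix stands in for the manual slice expression used above.

```go
package main

import (
	"fmt"
	"strings"
)

// splitSteps performs the two SplitAfterN calls described above and returns
// both intermediate slices plus the final text with the marker removed.
func splitSteps(str, before, after string) (a, b []string, inner string) {
	a = strings.SplitAfterN(str, before, 2)         // "before" stays at the end of a[0]
	b = strings.SplitAfterN(a[len(a)-1], after, 2)  // "after" stays at the end of b[0]
	return a, b, strings.TrimSuffix(b[0], after)
}

func main() {
	a, b, inner := splitSteps("a<h1>b</h1>c", "<h1>", "</h1>")
	fmt.Printf("%q\n", a)
	fmt.Printf("%q\n", b)
	fmt.Printf("%q\n", inner)
}
```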
I'm trying to alter an existing string in Go but I keep getting this error "cannot assign to new_str[i]"
package main
import "fmt"
func ToUpper(str string) string {
new_str := str
for i:=0; i<len(str); i++{
if str[i]>='a' && str[i]<='z'{
chr:=uint8(rune(str[i])-'a'+'A')
new_str[i]=chr
}
}
return new_str
}
func main() {
fmt.Println(ToUpper("cdsrgGDH7865fxgh"))
}
This is my code. I wish to change lowercase to uppercase, but I can't alter the string. Why? How can I alter it?
P.S I wish to use ONLY the fmt package!
Thanks in advance.
You can't... they are immutable. From the Go Language Specification:
Strings are immutable: once created, it is impossible to change the contents of a string.
You can, however, convert it to a []byte slice and alter that:
func ToUpper(str string) string {
new_str := []byte(str)
for i := 0; i < len(str); i++ {
if str[i] >= 'a' && str[i] <= 'z' {
chr := uint8(rune(str[i]) - 'a' + 'A')
new_str[i] = chr
}
}
return string(new_str)
}
Working sample: http://play.golang.org/p/uZ_Gui7cYl
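The reason this works without violating immutability is that the conversion []byte(str) copies the string's bytes, so the original string is never touched. A small sketch, using a hypothetical helper named replaceFirst, makes that visible:

```go
package main

import "fmt"

// replaceFirst returns a copy of s whose first byte is replaced by c.
// s itself is never modified: []byte(s) copies the string's bytes.
func replaceFirst(s string, c byte) string {
	if s == "" {
		return s
	}
	b := []byte(s)
	b[0] = c
	return string(b)
}

func main() {
	orig := "cdsrg"
	changed := replaceFirst(orig, 'C')
	fmt.Println(orig, changed) // orig still holds its original contents
}
```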
Use range and avoid unnecessary conversions and allocations. Strings are immutable. For example,
package main
import "fmt"
func ToUpper(s string) string {
var b []byte
for i, c := range s {
if c >= 'a' && c <= 'z' {
if b == nil {
b = []byte(s)
}
b[i] = byte('A' + rune(c) - 'a')
}
}
if b == nil {
return s
}
return string(b)
}
func main() {
fmt.Println(ToUpper("cdsrgGDH7865fxgh"))
}
Output:
CDSRGGDH7865FXGH
In Go strings are immutable. Here is one very bad way of doing what you want (playground)
package main
import "fmt"
func ToUpper(str string) string {
new_str := ""
for i := 0; i < len(str); i++ {
chr := str[i]
if chr >= 'a' && chr <= 'z' {
chr = chr - 'a' + 'A'
}
new_str += string(chr)
}
return new_str
}
func main() {
fmt.Println(ToUpper("cdsrgGDH7865fxgh"))
}
This is bad because
you are treating your string as single-byte characters - what if it contains multi-byte UTF-8? Using range str is the way to go
appending to strings is slow - lots of allocations - a bytes.Buffer (or strings.Builder) would be a good idea
there is a very good library routine to do this already strings.ToUpper
It is worth exploring the line new_str += string(chr) a bit more. Strings are immutable, so what this does is make a new string with the chr on the end, it doesn't extend the old string. This is wildly inefficient for long strings as the allocated memory will tend to the square of the string length.
Next time just use strings.ToUpper!
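For completeness, here is a sketch of how the first two objections could be addressed by hand: it ranges over runes and writes into a strings.Builder instead of concatenating strings. It uses the strings and unicode packages, so it does not meet the fmt-only constraint from the question, and the name toUpper is made up for the example.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// toUpper upper-cases s rune by rune, writing into a strings.Builder so the
// result is built in one growing buffer rather than one new string per step.
func toUpper(s string) string {
	var sb strings.Builder
	sb.Grow(len(s)) // a hint only; some runes change byte length when upper-cased
	for _, r := range s {
		sb.WriteRune(unicode.ToUpper(r))
	}
	return sb.String()
}

func main() {
	in := "cdsrgGDH7865fxgh"
	fmt.Println(toUpper(in))
	fmt.Println(toUpper(in) == strings.ToUpper(in))
}
```

In real code strings.ToUpper remains the better choice; the sketch is only meant to show where the allocations go away.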