I was wondering if there is any way I could easily split a string at spaces, except when the space is inside quotation marks?
For example, changing
Foo bar random "letters lol" stuff
into
Foo, bar, random, "letters lol", stuff
Think about it. You have a string in comma separated values (CSV) file format, RFC4180, except that your separator, outside quote pairs, is a space (instead of a comma). For example,
package main
import (
"encoding/csv"
"fmt"
"strings"
)
func main() {
s := `Foo bar random "letters lol" stuff`
fmt.Printf("String:\n%q\n", s)
// Split string
r := csv.NewReader(strings.NewReader(s))
r.Comma = ' ' // space
fields, err := r.Read()
if err != nil {
fmt.Println(err)
return
}
fmt.Printf("\nFields:\n")
for _, field := range fields {
fmt.Printf("%q\n", field)
}
}
Playground: https://play.golang.org/p/Ed4IV97L7H
Output:
String:
"Foo bar random \"letters lol\" stuff"
Fields:
"Foo"
"bar"
"random"
"letters lol"
"stuff"
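Using strings.FieldsFunc, try this. Note that the strings documentation says FieldsFunc makes no guarantees about the order in which it calls the function, so a stateful callback like this one relies on the current left-to-right implementation: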
package main
import (
"fmt"
"strings"
)
func main() {
s := `Foo bar random "letters lol" stuff`
quoted := false
a := strings.FieldsFunc(s, func(r rune) bool {
if r == '"' {
quoted = !quoted
}
return !quoted && r == ' '
})
out := strings.Join(a, ", ")
fmt.Println(out) // Foo, bar, random, "letters lol", stuff
}
Using a simple strings.Builder while ranging over the string, and keeping or dropping the " characters as you prefer, try this:
package main
import (
"fmt"
"strings"
)
func main() {
s := `Foo bar random "letters lol" stuff`
a := []string{}
sb := &strings.Builder{}
quoted := false
for _, r := range s {
if r == '"' {
quoted = !quoted
sb.WriteRune(r) // keep '"' otherwise comment this line
} else if !quoted && r == ' ' {
a = append(a, sb.String())
sb.Reset()
} else {
sb.WriteRune(r)
}
}
if sb.Len() > 0 {
a = append(a, sb.String())
}
out := strings.Join(a, ", ")
fmt.Println(out) // Foo, bar, random, "letters lol", stuff
// not keep '"': // Foo, bar, random, letters lol, stuff
}
Using scanner.Scanner, try this:
package main
import (
"fmt"
"strings"
"text/scanner"
)
func main() {
var s scanner.Scanner
s.Init(strings.NewReader(`Foo bar random "letters lol" stuff`))
slice := make([]string, 0, 5)
tok := s.Scan()
for tok != scanner.EOF {
slice = append(slice, s.TokenText())
tok = s.Scan()
}
out := strings.Join(slice, ", ")
fmt.Println(out) // Foo, bar, random, "letters lol", stuff
}
Using csv.NewReader which removes " itself, try this:
package main
import (
"encoding/csv"
"fmt"
"log"
"strings"
)
func main() {
s := `Foo bar random "letters lol" stuff`
r := csv.NewReader(strings.NewReader(s))
r.Comma = ' '
record, err := r.Read()
if err != nil {
log.Fatal(err)
}
out := strings.Join(record, ", ")
fmt.Println(out) // Foo, bar, random, letters lol, stuff
}
Using regexp, try this:
package main
import (
"fmt"
"regexp"
"strings"
)
func main() {
s := `Foo bar random "letters lol" stuff`
r := regexp.MustCompile(`[^\s"]+|"([^"]*)"`)
a := r.FindAllString(s, -1)
out := strings.Join(a, ", ")
fmt.Println(out) // Foo, bar, random, "letters lol", stuff
}
You could use a regex.
This will cover multiple words inside quotes and multiple quoted entries in your string:
package main
import (
"fmt"
"regexp"
)
func main() {
s := `Foo bar random "letters lol" stuff "also will" work on "multiple quoted stuff"`
r := regexp.MustCompile(`[^\s"']+|"([^"]*)"|'([^']*)'`) // bare words, "double-quoted" or 'single-quoted' runs
arr := r.FindAllString(s, -1)
fmt.Println("your array: ", arr)
}
Output will be:
your array:  [Foo bar random "letters lol" stuff "also will" work on "multiple quoted stuff"]
If you want to learn more about regex, the SO answer "Learning Regular Expressions" has some very handy resources at the end.
Hope this helps
I want to split a string on a regular expression, but preserve the matches.
I have tried splitting the string on a regex, but that throws away the matches. I have also tried adapting this, but I am not very good at translating code from one language to another, let alone from C#.
re := regexp.MustCompile(`\d`)
array := re.Split("ab1cd2ef3", -1)
I need the value of array to be ["ab", "1", "cd", "2", "ef", "3"], but the value of array is ["ab", "cd", "ef"]. No errors.
The kind of regex support shown in the link you pointed to is not available in Go's regexp package. You can read the related discussion.
What you want to achieve (as per the sample given) can be done using regex to match digits or non-digits.
package main
import (
"fmt"
"regexp"
)
func main() {
str := "ab1cd2ef3"
r := regexp.MustCompile(`(\d|[^\d]+)`)
fmt.Println(r.FindAllStringSubmatch(str, -1))
}
Playground: https://play.golang.org/p/L-ElvkDky53
Output:
[[ab ab] [1 1] [cd cd] [2 2] [ef ef] [3 3]]
I don't think this is possible with the current regexp package, but the Split could be easily extended to such behavior.
This should work for your case:
// Split works like (*regexp.Regexp).Split, but also includes each match in
// the result.
func Split(re *regexp.Regexp, s string, n int) []string {
	if n == 0 {
		return nil
	}
	matches := re.FindAllStringIndex(s, n)
	// Named parts rather than strings, to avoid shadowing the strings package.
	parts := make([]string, 0, len(matches))
	beg := 0
	end := 0
	for _, match := range matches {
		if n > 0 && len(parts) >= n-1 {
			break
		}
		end = match[0]
		if match[1] != 0 {
			parts = append(parts, s[beg:end])
		}
		beg = match[1]
		// This also appends the current match
		parts = append(parts, s[match[0]:match[1]])
	}
	if end != len(s) {
		parts = append(parts, s[beg:])
	}
	return parts
}
A crude solution: insert a separator around each match, then split on that separator. This assumes the separator (here "|") never occurs in the input.
package main
import (
"fmt"
"regexp"
"strings"
)
func main() {
re := regexp.MustCompile(`\d+`)
input := "ab1cd2ef3"
sep := "|"
indexes := re.FindAllStringIndex(input, -1)
fmt.Println(indexes)
move := 0
for _, v := range indexes {
p1 := v[0] + move
p2 := v[1] + move
input = input[:p1] + sep + input[p1:p2] + sep + input[p2:]
move += 2
}
result := strings.Split(input, sep)
fmt.Println(result)
}
You can use a bufio.Scanner:
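package main

import (
	"bufio"
	"fmt"
	"strings"
)

// digit is a bufio.SplitFunc that yields each ASCII digit as its own token
// and each run of non-digits as a token.
func digit(data []byte, atEOF bool) (int, []byte, error) {
	for i, b := range data {
		if '0' <= b && b <= '9' {
			if i > 0 {
				// Emit the non-digit run that precedes the digit.
				return i, data[:i], nil
			}
			return 1, data[:1], nil
		}
	}
	if atEOF && len(data) > 0 {
		// No digit left: emit the trailing run instead of dropping it.
		return len(data), data, nil
	}
	// Ask the Scanner for more data.
	return 0, nil, nil
}

func main() {
	s := bufio.NewScanner(strings.NewReader("ab1cd2ef3"))
	s.Split(digit)
	for s.Scan() {
		fmt.Println(s.Text())
	}
}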
https://golang.org/pkg/bufio#Scanner.Split
I am trying to access a string as a character array or as runes and join the characters with some separator. What is the right way to do it?
Here are the two ways I tried, but I get an error like the one below:
cannot use ([]rune)(t)[i] (type rune) as type []string in argument to strings.Join
How is a string represented in Go? Is it like a character array?
package main
import (
"fmt"
"strings"
)
func main() {
var t = "hello"
s := ""
for i, rune := range t {
s += strings.Join(rune, "\n")
}
fmt.Println(s)
}
package main
import (
"fmt"
"strings"
)
func main() {
var t = "hello"
s := ""
for i := 0; i < len(t); i++ {
s += strings.Join([]rune(t)[i], "\n")
}
fmt.Println(s)
}
I also tried the approach below, but it does not work for me either.
var t = "hello"
s := ""
for i := 0; i < len(t); i++ {
s += strings.Join(string(t[i]), "\n")
}
fmt.Println(s)
The strings.Join function expects a slice of strings as its first argument, but you are passing it a single rune (or, in the last attempt, a single string).
You can use the strings.Split function with an empty separator to obtain a slice of one-character strings from a string.
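The Split-then-Join approach can be sketched as follows. The joinChars helper is a made-up name for illustration. With an empty separator, strings.Split splits after each UTF-8 sequence, so multi-byte characters stay intact.

```go
package main

import (
	"fmt"
	"strings"
)

// joinChars puts sep between the characters of s.
func joinChars(s, sep string) string {
	chars := strings.Split(s, "") // one string per UTF-8 sequence
	return strings.Join(chars, sep)
}

func main() {
	fmt.Println(joinChars("hello", "-")) // h-e-l-l-o
	fmt.Println(joinChars("狐犬", "-"))    // 狐-犬
}
```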
I'm just wondering why the Asian characters in this string won't show up when I reverse the string and print its individual characters.
package main
import "fmt"
func main() {
a := "The quick brown 狐 jumped over the lazy 犬"
var lenght int = len(a) - 1
for ; lenght > -1; lenght-- {
fmt.Printf("%c", a[lenght])
}
fmt.Println()
}
You are indexing the string by byte, not by 'logical character' (rune).
To make this clearer, the example below first converts the string to a slice of runes and then prints the runes in reverse order.
http://play.golang.org/p/bzbo7k6WZT
package main
import "fmt"
func main() {
msg := "The quick brown 狐 jumped over the lazy 犬"
elements := make([]rune, 0)
for _, rune := range msg {
elements = append(elements, rune)
}
for i := len(elements) - 1; i >= 0; i-- {
fmt.Println(string(elements[i]))
}
}
Shorter Version: http://play.golang.org/p/PYsduB4Rgq
package main
import "fmt"
func main() {
msg := "The quick brown 狐 jumped over the lazy 犬"
elements := []rune(msg)
for i := len(elements) - 1; i >= 0; i-- {
fmt.Println(string(elements[i]))
}
}
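If you want the reversed text back as a single string rather than one character per line, a common pattern is to swap the runes in place. A minimal sketch (reverse is a hypothetical helper; it treats every rune separately, so combining character sequences would not stay together):

```go
package main

import "fmt"

// reverse returns s with its runes in reverse order.
func reverse(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

func main() {
	fmt.Println(reverse("lazy 犬")) // 犬 yzal
}
```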
I am looking for a function, that can sort string or []byte:
"bcad" to "abcd"
or
[]byte("bcad") to []byte("abcd")
The string only contains letters - but sorting should also work for letters and numbers.
I found sort package but not the function I want.
It feels wasteful to create a string for each character just to Join them.
Here's one that is a little less wasteful, but with more boilerplate. playground://XEckr_rpr8
package main
import (
"fmt"
"sort"
)
type sortRunes []rune
func (s sortRunes) Less(i, j int) bool {
return s[i] < s[j]
}
func (s sortRunes) Swap(i, j int) {
s[i], s[j] = s[j], s[i]
}
func (s sortRunes) Len() int {
return len(s)
}
func SortString(s string) string {
r := []rune(s)
sort.Sort(sortRunes(r))
return string(r)
}
func main() {
w1 := "bcad"
w2 := SortString(w1)
fmt.Println(w1)
fmt.Println(w2)
}
You can convert the string to a slice of strings, sort it, and then convert it back to a string:
package main
import (
"fmt"
"sort"
"strings"
)
func SortString(w string) string {
s := strings.Split(w, "")
sort.Strings(s)
return strings.Join(s, "")
}
func main() {
w1 := "bcad"
w2 := SortString(w1)
fmt.Println(w1)
fmt.Println(w2)
}
This prints:
bcad
abcd
Try it: http://play.golang.org/p/_6cTBAAZPb
There is a simple way, using the sort.Slice function:
package main
import (
"fmt"
"sort"
)
func main() {
word := "1BCagM9"
s := []rune(word)
sort.Slice(s, func(i int, j int) bool { return s[i] < s[j] })
fmt.Println(string(s))
}
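The question also asks about []byte. The same sort.Slice call works there too. A minimal sketch, where sortBytes is a hypothetical helper that sorts a copy so the caller's slice is left untouched:

```go
package main

import (
	"fmt"
	"sort"
)

// sortBytes returns a sorted copy of b.
func sortBytes(b []byte) []byte {
	out := make([]byte, len(b))
	copy(out, b)
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	fmt.Println(string(sortBytes([]byte("bcad")))) // abcd
	fmt.Println(string(sortBytes([]byte("b2a1")))) // 12ab
}
```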
package main
import (
"fmt"
"sort"
)
func main() {
word := "1àha漢字Pépy5"
charArray := []rune(word)
sort.Slice(charArray, func(i int, j int) bool {
return charArray[i] < charArray[j]
})
fmt.Println(string(charArray))
}
Output:
15Pahpyàé字漢
The thing is, Go does not have a convenient function for sorting a string.
Sort using int: this needs a lot of conversion, I think, and it still uses runes to build the resulting string.
func sortByInt(s string) string {
var si = []int{}
var sr = []rune{}
for _, r := range s {
si = append(si, int(r))
}
sort.Ints(si)
for _, r := range si {
sr = append(sr, rune(r))
}
return string(sr)
}
Sort a []rune with sort.Slice. Just remember that rune is an alias for int32 and byte is an alias for uint8.
func sortBySlice(s string) []rune {
sr := []rune(s)
sort.Slice(sr, func(i int, j int) bool {
return sr[i] < sr[j]
})
return sr
}
You can sort a slice of strings in Go like this:
// name of the file is main.go
package main
import (
"fmt"
"sort"
)
/*
*This function is used to sort a string array
*/
func main() {
var names = []string{"b", "e", "a", "d", "g", "c", "f"}
fmt.Println("original string array: ", names)
sort.Strings(names)
fmt.Println("After string sort array ", names)
return
}
i := 123
s := string(i)
s is "{" (the character with code point 123), but what I want is "123".
Please tell me how I can get "123".
In Java, I can concatenate strings this way:
String s = "ab" + "c" // s is "abc"
How can I concat two strings in Go?
Use the strconv package's Itoa function.
For example:
package main
import (
"strconv"
"fmt"
)
func main() {
t := strconv.Itoa(123)
fmt.Println(t)
}
You can concat strings simply by +'ing them, or by using the Join function of the strings package.
fmt.Sprintf("%v", value)
If you know the specific type of the value, use the corresponding verb, for example %d for int.
More info: see the fmt package documentation.
fmt.Sprintf, strconv.Itoa and strconv.FormatInt will all do the job. But Sprintf uses the reflect package and allocates one more object, so it is the less efficient choice.
It is interesting to note that strconv.Itoa is shorthand for
func FormatInt(i int64, base int) string
with base 10
For Example:
strconv.Itoa(123)
is equivalent to
strconv.FormatInt(int64(123), 10)
You can use fmt.Sprintf or strconv.FormatFloat
For example
package main
import (
"fmt"
)
func main() {
val := 14.7
s := fmt.Sprintf("%f", val)
fmt.Println(s)
}
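The example above only shows fmt.Sprintf. For comparison, here is a sketch of strconv.FormatFloat, where the precision argument controls the number of digits and -1 asks for the shortest representation that round-trips (floatStrings is a made-up helper name):

```go
package main

import (
	"fmt"
	"strconv"
)

// floatStrings formats v three ways: Sprintf's default %f, the shortest
// round-trip form, and fixed two decimal places.
func floatStrings(v float64) (sprintf, shortest, twoPlaces string) {
	sprintf = fmt.Sprintf("%f", v)
	shortest = strconv.FormatFloat(v, 'f', -1, 64)
	twoPlaces = strconv.FormatFloat(v, 'f', 2, 64)
	return
}

func main() {
	a, b, c := floatStrings(14.7)
	fmt.Println(a) // 14.700000
	fmt.Println(b) // 14.7
	fmt.Println(c) // 14.70
}
```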
In this case strconv and fmt.Sprintf do the same job, but the strconv package's Itoa function is the better choice, because fmt.Sprintf allocates one more object during the conversion.
check the benchmark here: https://gist.github.com/evalphobia/caee1602969a640a4530
see https://play.golang.org/p/hlaz_rMa0D for example.
Converting int64:
n := int64(32)
str := strconv.FormatInt(n, 10)
fmt.Println(str)
// Prints "32"
Another option:
package main
import "fmt"
func main() {
n := 123
s := fmt.Sprint(n)
fmt.Println(s == "123")
}
https://golang.org/pkg/fmt#Sprint
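OK, most of the answers have already shown you something good.
Let me give you this (jsoncrack.Time is not a standard-library type, so drop that case unless you have such a package):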
// ToString Change arg to string
func ToString(arg interface{}, timeFormat ...string) string {
if len(timeFormat) > 1 {
log.SetFlags(log.Llongfile | log.LstdFlags)
log.Println(errors.New(fmt.Sprintf("timeFormat's length should be one")))
}
var tmp = reflect.Indirect(reflect.ValueOf(arg)).Interface()
switch v := tmp.(type) {
case int:
return strconv.Itoa(v)
case int8:
return strconv.FormatInt(int64(v), 10)
case int16:
return strconv.FormatInt(int64(v), 10)
case int32:
return strconv.FormatInt(int64(v), 10)
case int64:
return strconv.FormatInt(v, 10)
case string:
return v
case float32:
return strconv.FormatFloat(float64(v), 'f', -1, 32)
case float64:
return strconv.FormatFloat(v, 'f', -1, 64)
case time.Time:
if len(timeFormat) == 1 {
return v.Format(timeFormat[0])
}
return v.Format("2006-01-02 15:04:05")
case jsoncrack.Time:
if len(timeFormat) == 1 {
return v.Time().Format(timeFormat[0])
}
return v.Time().Format("2006-01-02 15:04:05")
case fmt.Stringer:
return v.String()
case reflect.Value:
return ToString(v.Interface(), timeFormat...)
default:
return ""
}
}
package main
import (
"fmt"
"strconv"
)
func main(){
//First question: how to get int string?
intValue := 123
// keeping it in separate variable :
strValue := strconv.Itoa(intValue)
fmt.Println(strValue)
//Second question: how to concat two strings?
firstStr := "ab"
secondStr := "c"
s := firstStr + secondStr
fmt.Println(s)
}