String splitting before character - string

I'm new to Go and have been using Split to my advantage. Recently I ran into a problem: I wanted to split a string and keep the splitting character at the start of the second part, rather than removing it (as Split does) or leaving it at the end of the first part (as SplitAfter does).
For example the following code:
strings.Split("email#email.com", "#")
returned: ["email", "email.com"]
strings.SplitAfter("email#email.com", "#")
returned: ["email#", "email.com"]
What's the best way to get ["email", "#email.com"]?

Use strings.Index to find the # and slice to get the two parts:
var part1, part2 string
if i := strings.Index(s, "#"); i >= 0 {
part1, part2 = s[:i], s[i:]
} else {
// handle case with no #
}
Run it on the playground.
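The same idea can be wrapped in a small helper. This is only a minimal sketch: the name splitBefore is made up for illustration and is not part of the strings package, and returning the whole input as the first part when the separator is missing is an assumption about what the caller wants.

```go
package main

import (
	"fmt"
	"strings"
)

// splitBefore is a hypothetical helper (not in the standard library).
// It splits s at the first occurrence of sep and keeps sep at the start
// of the second part. If sep is absent, ok is false and head is s.
func splitBefore(s, sep string) (head, tail string, ok bool) {
	i := strings.Index(s, sep)
	if i < 0 {
		return s, "", false
	}
	return s[:i], s[i:], true
}

func main() {
	head, tail, ok := splitBefore("email#email.com", "#")
	fmt.Printf("%q %q %v\n", head, tail, ok) // "email" "#email.com" true
}
```

Because the slicing uses the byte index returned by strings.Index, this also works when the separator is longer than one byte.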

Could this work for you?
s := strings.Split("email#email.com", "#")
address, domain := s[0], "#"+s[1]
fmt.Println(address, domain)
// email #email.com
Then combine them back into a single string:
var buffer bytes.Buffer
buffer.WriteString(address)
buffer.WriteString(domain)
result := buffer.String()
fmt.Println(result)
// email#email.com
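Note that indexing s[1] panics when the input contains no "#". On Go 1.18 or later, strings.Cut reports whether the separator was found, so a variant along these lines avoids that. It is a sketch, and cutKeepSep is a hypothetical name, not a library function.

```go
package main

import (
	"fmt"
	"strings"
)

// cutKeepSep is a hypothetical helper built on strings.Cut (Go 1.18+).
// Cut drops the separator, so it is added back to the second part.
// When sep is missing, the whole string is returned as the first part.
func cutKeepSep(s, sep string) (string, string) {
	before, after, found := strings.Cut(s, sep)
	if !found {
		return s, ""
	}
	return before, sep + after
}

func main() {
	address, domain := cutKeepSep("email#email.com", "#")
	fmt.Println(address, domain) // email #email.com
}
```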

You can use bufio.Scanner:
package main
import (
"bufio"
"strings"
)
func email(data []byte, eof bool) (int, []byte, error) {
for i, b := range data {
if b == '#' {
if i > 0 {
return i, data[:i], nil
}
return len(data), data, nil
}
}
return 0, nil, nil
}
func main() {
s := bufio.NewScanner(strings.NewReader("email#email.com"))
s.Split(email)
for s.Scan() {
println(s.Text())
}
}
https://golang.org/pkg/bufio#Scanner.Split
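The split function above drops the input when it contains no '#'. A possible variant that also handles several separators and flushes the remaining data at EOF is sketched below. The names splitBeforeByte and tokens are made up for this example.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"strings"
)

// splitBeforeByte returns a bufio.SplitFunc that starts a new token at
// every occurrence of sep and keeps sep at the front of that token.
func splitBeforeByte(sep byte) bufio.SplitFunc {
	return func(data []byte, atEOF bool) (int, []byte, error) {
		if len(data) == 0 {
			return 0, nil, nil // nothing buffered
		}
		// Search from the second byte so a leading sep stays with its token.
		if i := bytes.IndexByte(data[1:], sep); i >= 0 {
			return i + 1, data[:i+1], nil
		}
		if atEOF {
			return len(data), data, nil // last token: whatever is left
		}
		return 0, nil, nil // no sep yet: request more data
	}
}

// tokens collects every token the scanner produces for input.
func tokens(input string, sep byte) []string {
	sc := bufio.NewScanner(strings.NewReader(input))
	sc.Split(splitBeforeByte(sep))
	var out []string
	for sc.Scan() {
		out = append(out, sc.Text())
	}
	return out
}

func main() {
	fmt.Printf("%q\n", tokens("email#email.com", '#')) // ["email" "#email.com"]
}
```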

Related

Parse variable length array from csv to struct

I have the following setup to parse a csv file:
package main
import (
"fmt"
"os"
"encoding/csv"
)
type CsvLine struct {
Id string
Array1 [] string
Array2 [] string
}
func ReadCsv(filename string) ([][]string, error) {
f, err := os.Open(filename)
if err != nil {
return [][]string{}, err
}
defer f.Close()
lines, err := csv.NewReader(f).ReadAll()
if err != nil {
return [][]string{}, err
}
return lines, nil
}
func main() {
lines, err := ReadCsv("./data/sample-0.3.csv")
if err != nil {
panic(err)
}
for _, line := range lines {
fmt.Println(line)
data := CsvLine{
Id: line[0],
Array1: line[1],
Array2: line[2],
}
fmt.Println(data.Id)
fmt.Println(data.Array1)
fmt.Println(data.Array2)
}
}
And the following setup in my csv file:
594385903dss,"['fhjdsk', 'dfjdskl', 'fkdsjgooiertio']","['jflkdsjfl', 'fkjdlsfjdslkfjldks']"
87764385903dss,"['cxxc', 'wqeewr', 'opi', 'iy', 'qw']","['cvbvc', 'gf', 'mnb', 'ewr']"
My understanding is that variable length lists should be parsed into a slice. Is it possible to do this directly via a csv reader? (The csv output was generated by a Python project.)
Help/suggestions appreciated.
CSV does not have a notion of "variable length arrays", it is just a comma separated list of values. The format is described in RFC 4180, and that is exactly what the encoding/csv package implements.
You can only get a string slice out of a CSV line. How you interpret the values is up to you. You have to post process your data if you want to split it further.
What you have may be simply processed with the regexp package, e.g.
var r = regexp.MustCompile(`'[^']*'`)
func split(s string) []string {
parts := r.FindAllString(s, -1)
for i, part := range parts {
parts[i] = part[1 : len(part)-1]
}
return parts
}
Testing it:
s := `['one', 'two', 'three']`
fmt.Printf("%q\n", split(s))
s = `[]`
fmt.Printf("%q\n", split(s))
s = `['o,ne', 't,w,o', 't,,hree']`
fmt.Printf("%q\n", split(s))
Output (try it on the Go Playground):
["one" "two" "three"]
[]
["o,ne" "t,w,o" "t,,hree"]
Using this split() function, processing may look like this:
for _, line := range lines {
data := CsvLine{
Id: line[0],
Array1: split(line[1]),
Array2: split(line[2]),
}
fmt.Printf("%+v\n", data)
}
This outputs (try it on the Go Playground):
{Id:594385903dss Array1:[fhjdsk dfjdskl fkdsjgooiertio] Array2:[jflkdsjfl fkjdlsfjdslkfjldks]}
{Id:87764385903dss Array1:[cxxc wqeewr opi iy qw] Array2:[cvbvc gf mnb ewr]}
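Putting the pieces together, a self-contained version might look like the sketch below. It reads the CSV from a string instead of a file so it runs as is. The parse function and the check for three fields are assumptions of this example, and the regexp still does not handle items that themselves contain a single quote.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"regexp"
	"strings"
)

type CsvLine struct {
	Id     string
	Array1 []string
	Array2 []string
}

var quoted = regexp.MustCompile(`'[^']*'`)

// split extracts the single-quoted items from a Python-style list literal.
func split(s string) []string {
	parts := quoted.FindAllString(s, -1)
	for i, part := range parts {
		parts[i] = part[1 : len(part)-1]
	}
	return parts
}

// parse reads CSV text and converts each record to a CsvLine.
// Records with fewer than three fields are rejected (an assumption
// of this sketch; the sample data always has three columns).
func parse(input string) ([]CsvLine, error) {
	records, err := csv.NewReader(strings.NewReader(input)).ReadAll()
	if err != nil {
		return nil, err
	}
	out := make([]CsvLine, 0, len(records))
	for _, rec := range records {
		if len(rec) < 3 {
			return nil, fmt.Errorf("expected 3 fields, got %d", len(rec))
		}
		out = append(out, CsvLine{Id: rec[0], Array1: split(rec[1]), Array2: split(rec[2])})
	}
	return out, nil
}

func main() {
	const sample = `594385903dss,"['fhjdsk', 'dfjdskl', 'fkdsjgooiertio']","['jflkdsjfl', 'fkjdlsfjdslkfjldks']"
87764385903dss,"['cxxc', 'wqeewr', 'opi', 'iy', 'qw']","['cvbvc', 'gf', 'mnb', 'ewr']"
`
	lines, err := parse(sample)
	if err != nil {
		panic(err)
	}
	for _, l := range lines {
		fmt.Printf("%+v\n", l)
	}
}
```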

How to convert Camel case string to snake case

I have a string
str := "IGotInternAtGeeksForGeeks"
I am trying to convert it to
str = "i_got_intern_at_geeks_for_geeks"
Try this,
import (
"fmt"
"strings"
"regexp"
)
var matchFirstCap = regexp.MustCompile("(.)([A-Z][a-z]+)")
var matchAllCap = regexp.MustCompile("([a-z0-9])([A-Z])")
func ToSnakeCase(str string) string {
snake := matchFirstCap.ReplaceAllString(str, "${1}_${2}")
snake = matchAllCap.ReplaceAllString(snake, "${1}_${2}")
return strings.ToLower(snake)
}
Run:
func main() {
fmt.Println(ToSnakeCase("IGotInternAtGeeksForGeeks"))
}
Output:
i_got_intern_at_geeks_for_geeks
NOTE: This will not work for many non-English languages.
I know this is an old post, but I've created a package named gobeam/Stringy. With it you can easily convert a camel case string to snake case or kebab case, and vice versa. Example:
package main
import (
"fmt"
stringy "github.com/gobeam/Stringy"
)
func main() {
str := stringy.New("HelloGuysHowAreYou?")
snakeStr := str.SnakeCase("?", "")
fmt.Println(snakeStr.ToLower()) // hello_guys_how_are_you
fmt.Println(snakeStr.ToUpper()) // HELLO_GUYS_HOW_ARE_YOU
}
A version without regular expressions.
It handles letters only, because the use case is a struct field db tag. Feel free to modify it for other use cases.
func ToSnake(camel string) (snake string) {
var b strings.Builder
diff := 'a' - 'A'
l := len(camel)
for i, v := range camel {
// A is 65, a is 97
if v >= 'a' {
b.WriteRune(v)
continue
}
// v is capital letter here
// never add an underscore before the first letter
// add underscore if last letter is capital letter
// add underscore when previous letter is lowercase
// add underscore when next letter is lowercase
if (i != 0 || i == l-1) && ( // head and tail
(i > 0 && rune(camel[i-1]) >= 'a') || // pre
(i < l-1 && rune(camel[i+1]) >= 'a')) { //next
b.WriteRune('_')
}
b.WriteRune(v + diff)
}
return b.String()
}
// here is the test
func TestToSnake(t *testing.T) {
input := "MyLIFEIsAwesomE"
want := "my_life_is_awesom_e"
if got := ToSnake(input); got != want {
t.Errorf("ToSnake(%v) = %v, want %v", input, got, want)
}
}
Faster and simpler version:
import "bytes"
func SnakeCase(camel string) string {
var buf bytes.Buffer
for _, c := range camel {
if 'A' <= c && c <= 'Z' {
// just convert [A-Z] to _[a-z]
if buf.Len() > 0 {
buf.WriteRune('_')
}
buf.WriteRune(c - 'A' + 'a')
} else {
buf.WriteRune(c)
}
}
return buf.String()
}
Known bugs:
1. Non-ASCII input is not handled.
2. Runs of uppercase letters (abbreviations) are split letter by letter, e.g. baseURL becomes the ugly base_u_r_l rather than base_url; consider using a whitelist of abbreviations to filter them.
I wrapped it into a package:
import (
"fmt"
"github.com/buxizhizhoum/inflection"
)
func example () {
// to convert a string to underscore
res := inflection.Underscore("aA")
// will return a_a
fmt.Println(res)
// to convert a string to camelize
// will return AA
fmt.Println(inflection.Camelize("a_a", true))
}
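The abbreviation problem listed under the known bugs (baseURL turning into base_u_r_l) can also be approached without a whitelist, by looking at the neighbouring characters before inserting an underscore. The following is one possible sketch, not the method used by any of the packages mentioned above; ToSnakeAcronym is a made-up name.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// ToSnakeAcronym (hypothetical name) inserts an underscore before an
// uppercase rune when either
//   - the previous rune is a lowercase letter or a digit, or
//   - the previous rune is uppercase and the next one is lowercase
//     (the last letter of an abbreviation starts the next word).
func ToSnakeAcronym(s string) string {
	runes := []rune(s)
	var b strings.Builder
	for i, r := range runes {
		if unicode.IsUpper(r) && i > 0 {
			prev := runes[i-1]
			nextLower := i+1 < len(runes) && unicode.IsLower(runes[i+1])
			if unicode.IsLower(prev) || unicode.IsDigit(prev) ||
				(unicode.IsUpper(prev) && nextLower) {
				b.WriteByte('_')
			}
		}
		b.WriteRune(unicode.ToLower(r))
	}
	return b.String()
}

func main() {
	fmt.Println(ToSnakeAcronym("baseURL"))                   // base_url
	fmt.Println(ToSnakeAcronym("HTTPServer"))                // http_server
	fmt.Println(ToSnakeAcronym("IGotInternAtGeeksForGeeks")) // i_got_intern_at_geeks_for_geeks
}
```

Since it relies on the unicode package rather than ASCII arithmetic, it also lowercases non-ASCII letters, although word boundaries in scripts without case are still not detected.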

Update a string value in loop

Is it possible to update the value of a string when we execute a for loop?
package main
import (
"fmt"
"strings"
)
func Chop(r int, s string) string {
return s[r:]
}
func main() {
s:= "ThisIsAstring1ThisIsAstring2ThisIsAstring3"
for strings.Contains(s, "string") {
// Original value > ThisIsAstring1ThisIsAstring2ThisIsAstring3
fmt.Println(s)
// I delete a part of the string > ThisIsAstring1
remove := len(s)/3
// Now, I update the value of string > string := ThisIsAstring2ThisIsAstring3
s := Chop(remove, s)
fmt.Println(s)
break
}
}
I don't know how to do it.
I have no clue what the use case is, but here goes. Let's start with identifying the issues in your code:
// "string" is a predeclared type name, not a reserved keyword: this compiles,
// but it shadows the type, so pick a different variable name
string:= "ThisIsAstring1ThisIsAstring2ThisIsAstring3"
for strings.Contains(string, "string") {
// len(string)/3 is integer division, so remove is already an int
remove := len(string)/3
// := declares a new variable that only exists inside the loop body.
// To update the outer variable you want =, not :=
string := Chop(remove, string)
fmt.Println(string)
}
Now, here's a solution that will work for your use case:
str := "ThisIsAstring1ThisIsAstring2ThisIsAstring3"
for strings.Contains(str, "string") {
fmt.Println(str)
remove := len(str) / 3 // integer division, already an int
str = Chop(remove, str)
}
fmt.Println(str)
GoPlay:
https://play.golang.org/p/NdROIFDS_5
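The difference between = and := in a loop body can be seen in isolation in the sketch below. The function names chopWhile and shadow are made up for this example, and the early exit when nothing can be chopped is an added safeguard, not part of the original code.

```go
package main

import (
	"fmt"
	"strings"
)

// chopWhile repeatedly drops the first third of s while it still
// contains sub. It assigns with "=", so the loop condition sees the
// updated value on the next iteration.
func chopWhile(s, sub string) string {
	for strings.Contains(s, sub) {
		n := len(s) / 3 // integer division, already an int
		if n == 0 {
			break // nothing left to chop; avoid looping forever
		}
		s = s[n:]
	}
	return s
}

// shadow shows what ":=" does inside a nested block: it declares a
// second variable with the same name and leaves the outer one alone.
func shadow() (inner, outer string) {
	s := "outer"
	{
		s := "inner"
		inner = s
	}
	return inner, s
}

func main() {
	fmt.Println(chopWhile("ThisIsAstring1ThisIsAstring2ThisIsAstring3", "string")) // tring3
	inner, outer := shadow()
	fmt.Println(inner, outer) // inner outer
}
```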

How to check if there is a special character in string or if a character is a special character in GoLang

After reading a string from the input, I need to check if there is a special character in it
You can use strings.ContainsAny to see if a rune exists:
package main
import (
"fmt"
"strings"
)
func main() {
fmt.Println(strings.ContainsAny("Hello World", ",|"))
fmt.Println(strings.ContainsAny("Hello, World", ",|"))
fmt.Println(strings.ContainsAny("Hello|World", ",|"))
}
Or, if you want to flag every character outside the range 'A' to 'z' (the ASCII letters, plus the few punctuation characters that sit between 'Z' and 'a'), you can use strings.IndexFunc:
package main
import (
"fmt"
"strings"
)
func main() {
f := func(r rune) bool {
return r < 'A' || r > 'z'
}
if strings.IndexFunc("HelloWorld", f) != -1 {
fmt.Println("Found special char")
}
if strings.IndexFunc("Hello World", f) != -1 {
fmt.Println("Found special char")
}
}
Depending on your definition of special character, the simplest solution would probably be to do a for range loop over your string (which yields runes instead of bytes) and, for each rune, check whether it is in your list of allowed/forbidden runes.
See Strings, bytes, runes and characters in Go for more about the relations between string, bytes and runes.
Playground example
package main
var allowed = []rune{'a','b','c','d','e','f','g'}
func haveSpecial(input string) bool {
for _, char := range input {
found := false
for _, c := range allowed {
if c == char {
found = true
break
}
}
if !found {
return true
}
}
return false
}
func main() {
cases := []string{
"abcdef",
"abc$€f",
}
for _, input := range cases {
if haveSpecial(input) {
println(input + ": NOK")
} else {
println(input + ": OK")
}
}
}
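The nested loop can be shortened by combining strings.IndexFunc with strings.ContainsRune. This is a sketch of the same allow-list check; firstDisallowed is a hypothetical name and the allowed set is just the one from the example above.

```go
package main

import (
	"fmt"
	"strings"
)

// firstDisallowed returns the byte index of the first rune in input
// that is not listed in allowed, or -1 if every rune is allowed.
func firstDisallowed(input, allowed string) int {
	return strings.IndexFunc(input, func(r rune) bool {
		return !strings.ContainsRune(allowed, r)
	})
}

func main() {
	const allowed = "abcdefg"
	for _, input := range []string{"abcdef", "abc$€f"} {
		fmt.Println(input, firstDisallowed(input, allowed))
	}
}
```

For a large allowed set, a map[rune]bool lookup would avoid scanning the allowed string for every rune.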
You want to use the unicode package, which has a nice function to check for symbols.
https://golang.org/pkg/unicode/#IsSymbol
package main
import (
"fmt"
"unicode"
)
func hasSymbol(str string) bool {
for _, letter := range str {
if unicode.IsSymbol(letter) {
return true
}
}
return false
}
func main() {
var strs = []string {
"A quick brown fox",
"A+quick_brown<fox",
}
for _, str := range strs {
if hasSymbol(str) {
fmt.Printf("String '%v' contains symbols.\n", str)
} else {
fmt.Printf("String '%v' did not contain symbols.\n", str)
}
}
}
This will provide the following output:
String 'A quick brown fox' did not contain symbols.
String 'A+quick_brown<fox' contains symbols.
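Keep in mind that unicode.IsSymbol only reports characters in the Unicode symbol categories. In the second string it is '+' and '<' that match; '_' is classified as punctuation and would not be reported on its own. If "special" means anything that is not a letter, digit or whitespace, a check along these lines may fit better. It is a sketch under that assumed definition, and hasSpecial is a made-up name.

```go
package main

import (
	"fmt"
	"unicode"
)

// hasSpecial reports whether s contains a rune that is neither a
// letter, a digit, nor whitespace. This definition of "special" is
// an assumption; adjust the predicate to your own rules.
func hasSpecial(s string) bool {
	for _, r := range s {
		if !unicode.IsLetter(r) && !unicode.IsDigit(r) && !unicode.IsSpace(r) {
			return true
		}
	}
	return false
}

func main() {
	for _, s := range []string{"A quick brown fox", "A+quick_brown<fox", "snake_case"} {
		fmt.Printf("%q: %v\n", s, hasSpecial(s))
	}
}
```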
I ended up doing something like this
alphabet := "abcdefghijklmnopqrstuvwxyz"
alphabetSplit := strings.Split(alphabet, "")
inputLetters := strings.Split(input, "")
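for index, value := range inputLetters {
special := 1
for _, char := range alphabetSplit {
if char == value {
special = 0
break
}
}
// illustrative use of the result; the snippet originally stopped before this point
if special == 1 {
fmt.Println("special character", value, "at position", index)
}
}
It may contain mistakes, because I used it for something specific and had to edit it before posting it here.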

Read a text file, replace its words, output to another text file

I am trying to write a program in Go that takes a text file full of code, converts it into Go code, and saves the result to a Go file or text file. I can't figure out how to save the changes I make: the only way I can see them is through a Println statement, because I am using strings.Replace on each element of the string slice that holds the file's lines to change every occurrence of a word that needs to be changed (e.g. BEGIN -> { and END -> }). Is there another way of searching and replacing in Go that I don't know about, is there a way to edit a text file that I don't know about, or is this impossible?
Thanks
Here is the code I have so far.
package main
import (
"os"
"bufio"
"bytes"
"io"
"fmt"
"strings"
)
func readLines(path string) (lines []string, errr error) {
var (
file *os.File
part []byte
prefix bool
)
if file, errr = os.Open(path); errr != nil {
return
}
defer file.Close()
reader := bufio.NewReader(file)
buffer := bytes.NewBuffer(make([]byte, 0))
for {
if part, prefix, errr = reader.ReadLine(); errr != nil {
break
}
buffer.Write(part)
if !prefix {
lines = append(lines, buffer.String())
buffer.Reset()
}
}
if errr == io.EOF {
errr = nil
}
return
}
func writeLines(lines []string, path string) (errr error) {
var (
file *os.File
)
if file, errr = os.Create(path); errr != nil {
return
}
defer file.Close()
for _,item := range lines {
_, errr := file.WriteString(strings.TrimSpace(item) + "\n");
if errr != nil {
fmt.Println(errr)
break
}
}
return
}
func FixBegin(lines []string) (errr error) {
var(
a string
)
for i := 0; ; i++ {
a = lines[i];
fmt.Println(strings.Replace(a, "BEGIN", "{", -1))
}
return
}
func FixEnd(lines []string) (errr error) {
var(
a string
)
for i := 0; ; i++ {
a = lines[i];
fmt.Println(strings.Replace(a, "END", "}", -1))
}
return
}
func main() {
lines, errr := readLines("foo.txt")
if errr != nil {
fmt.Println("Error: %s\n", errr)
return
}
for _, line := range lines {
fmt.Println(line)
}
errr = FixBegin(lines)
errr = writeLines(lines, "beer2.txt")
fmt.Println(errr)
errr = FixEnd(lines)
lines, errr = readLines("beer2.txt")
if errr != nil {
fmt.Println("Error: %s\n", errr)
return
}
errr = writeLines(lines, "beer2.txt")
fmt.Println(errr)
}
jnml@fsc-r630:~/src/tmp/SO/13789882$ ls
foo.txt main.go
jnml@fsc-r630:~/src/tmp/SO/13789882$ cat main.go
package main
import (
"bytes"
"io/ioutil"
"log"
)
func main() {
src, err := ioutil.ReadFile("foo.txt")
if err != nil {
log.Fatal(err)
}
src = bytes.Replace(src, []byte("BEGIN"), []byte("{"), -1)
src = bytes.Replace(src, []byte("END"), []byte("}"), -1)
if err = ioutil.WriteFile("beer2.txt", src, 0666); err != nil {
log.Fatal(err)
}
}
jnml@fsc-r630:~/src/tmp/SO/13789882$ cat foo.txt
BEGIN
FILE F(KIND=REMOTE);
EBCDIC ARRAY E[0:11];
REPLACE E BY "HELLO WORLD!";
WRITE(F, *, E);
END.
jnml@fsc-r630:~/src/tmp/SO/13789882$ go run main.go
jnml@fsc-r630:~/src/tmp/SO/13789882$ cat beer2.txt
{
FILE F(KIND=REMOTE);
EBCDIC ARRAY E[0:11];
REPLACE E BY "HELLO WORLD!";
WRITE(F, *, E);
}.
jnml@fsc-r630:~/src/tmp/SO/13789882$
I agree with @jnml about using ioutil to slurp the file and write it back. But I don't think the replacing should be done in multiple passes over a []byte. Code and data are strings/text and should be treated as such (even if dealing with encodings other than ASCII/UTF-8 requires extra work). A one-pass replacement (of all placeholders at once) avoids the risk of replacing the results of previous changes (even though my regexp proposal must be improved to handle non-trivial tasks).
package main
import(
"fmt"
"io/ioutil"
"log"
"regexp"
"strings"
)
func main() {
// (1) slurp the file
data, err := ioutil.ReadFile("../tmpl/xpl.go")
if err != nil {
log.Fatal("ioutil.ReadFile: ", err)
}
s := string(data)
fmt.Printf("----\n%s----\n", s)
// => function that works for files of (known) encodings other than ascii or utf8
// (2) create a map that maps placeholder to be replaced to the replacements
x := map[string]string {
"BEGIN" : "{",
"END" : "}"}
ks := make([]string, 0, len(x))
for k := range x {
ks = append(ks, k)
}
// => function(s) that gets the keys from maps
// (3) create a regexp that finds the placeholder to be replaced
p := strings.Join(ks, "|")
fmt.Printf("/%s/\n", p)
r := regexp.MustCompile(p)
// => funny letters & order need more consideration
// (4) create a callback function for ..ReplaceAllStringFunc that knows
// about the map x
f := func(s string) string {
fmt.Printf("*** '%s'\n", s)
return x[s]
}
// => function (?) to do Step (2) .. (4) in a reusable way
// (5) do the replacing (s will be overwritten with the result)
s = r.ReplaceAllStringFunc(s, f)
fmt.Printf("----\n%s----\n", s)
// (6) write back
err = ioutil.WriteFile("result.go", []byte(s), 0644)
if err != nil {
log.Fatal("ioutil.WriteFile: ", err)
}
// => function that works for files of (known) encodings other than ascii or utf8
}
output:
go run 13789882.go
----
func main() BEGIN
END
----
/BEGIN|END/
*** 'BEGIN'
*** 'END'
----
func main() {
}
----
If the file is huge, reading everything into memory might be neither possible nor advisable. Give BytesReplacingReader a try, as it does the replacement in a streaming fashion and is reasonably performant. If you want to replace two strings (such as BEGIN -> { and END -> }), just wrap two BytesReplacingReaders around the original reader, one for BEGIN and one for END:
r := NewBytesReplacingReader(
NewBytesReplacingReader(inputReader, []byte("BEGIN"), []byte("{")),
[]byte("END"), []byte("}"))
// use r normally and all non-overlapping occurrences of
// "BEGIN" and "END" will be replaced with "{" and "}"
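BytesReplacingReader is a third-party type whose source is not shown here. If pulling in a dependency is not an option, a rough streaming alternative using only the standard library is to process the input line by line. The sketch below assumes that placeholders never span a line break and that each line fits in bufio.Scanner's default buffer; replaceStream and replaceString are made-up names.

```go
package main

import (
	"bufio"
	"io"
	"os"
	"strings"
)

var placeholders = strings.NewReplacer("BEGIN", "{", "END", "}")

// replaceStream copies r to w one line at a time, replacing the
// placeholders in each line. Every output line is terminated with
// "\n", even if the last input line was not.
func replaceStream(r io.Reader, w io.Writer) error {
	sc := bufio.NewScanner(r)
	bw := bufio.NewWriter(w)
	for sc.Scan() {
		if _, err := placeholders.WriteString(bw, sc.Text()); err != nil {
			return err
		}
		if err := bw.WriteByte('\n'); err != nil {
			return err
		}
	}
	if err := sc.Err(); err != nil {
		return err
	}
	return bw.Flush()
}

// replaceString runs replaceStream on an in-memory string.
func replaceString(s string) (string, error) {
	var sb strings.Builder
	err := replaceStream(strings.NewReader(s), &sb)
	return sb.String(), err
}

func main() {
	in := strings.NewReader("BEGIN\nWRITE(F, *, E);\nEND.\n")
	if err := replaceStream(in, os.Stdout); err != nil {
		panic(err)
	}
}
```

In a real program the reader and writer would be files opened with os.Open and os.Create, so only one line is held in memory at a time.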

Resources