Parse variable length array from csv to struct - string

I have the following setup to parse a csv file:
package main
import (
"fmt"
"os"
"encoding/csv"
)
type CsvLine struct {
Id string
Array1 [] string
Array2 [] string
}
func ReadCsv(filename string) ([][]string, error) {
f, err := os.Open(filename)
if err != nil {
return [][]string{}, err
}
defer f.Close()
lines, err := csv.NewReader(f).ReadAll()
if err != nil {
return [][]string{}, err
}
return lines, nil
}
func main() {
lines, err := ReadCsv("./data/sample-0.3.csv")
if err != nil {
panic(err)
}
for _, line := range lines {
fmt.Println(line)
data := CsvLine{
Id: line[0],
Array1: line[1],
Array2: line[2],
}
fmt.Println(data.Id)
fmt.Println(data.Array1)
fmt.Println(data.Array2)
}
}
And the following setup in my csv file:
594385903dss,"['fhjdsk', 'dfjdskl', 'fkdsjgooiertio']","['jflkdsjfl', 'fkjdlsfjdslkfjldks']"
87764385903dss,"['cxxc', 'wqeewr', 'opi', 'iy', 'qw']","['cvbvc', 'gf', 'mnb', 'ewr']"
My understanding is that variable length lists should be parsed into a slice, is it possible to do this directly via a csv reader? (The csv output was generated via a python project.)
Help/suggestions appreciated.

CSV does not have a notion of "variable length arrays", it is just a comma separated list of values. The format is described in RFC 4180, and that is exactly what the encoding/csv package implements.
You can only get a string slice out of a CSV line. How you interpret the values is up to you. You have to post process your data if you want to split it further.
What you have may be simply processed with the regexp package, e.g.
var r = regexp.MustCompile(`'[^']*'`)
func split(s string) []string {
parts := r.FindAllString(s, -1)
for i, part := range parts {
parts[i] = part[1 : len(part)-1]
}
return parts
}
Testing it:
s := `['one', 'two', 'three']`
fmt.Printf("%q\n", split(s))
s = `[]`
fmt.Printf("%q\n", split(s))
s = `['o,ne', 't,w,o', 't,,hree']`
fmt.Printf("%q\n", split(s))
Output (try it on the Go Playground):
["one" "two" "three"]
[]
["o,ne" "t,w,o" "t,,hree"]
Using this split() function, this is how processing may look like:
for _, line := range lines {
data := CsvLine{
Id: line[0],
Array1: split(line[1]),
Array2: split(line[2]),
}
fmt.Printf("%+v\n", data)
}
This outputs (try it on the Go Playground):
{Id:594385903dss Array1:[fhjdsk dfjdskl fkdsjgooiertio] Array2:[jflkdsjfl fkjdlsfjdslkfjldks]}
{Id:87764385903dss Array1:[cxxc wqeewr opi iy qw] Array2:[cvbvc gf mnb ewr]}

Related

String splitting before character

I'm new to go and have been using split to my advantage. Recently I came across a problem I wanted to split something, and keep the splitting char in my second slice rather than removing it, or leaving it in the first slice as with SplitAfter.
For example the following code:
strings.Split("email#email.com", "#")
returned: ["email", "email.com"]
strings.SplitAfter("email#email.com", "#")
returned: ["email#", "email.com"]
What's the best way to get ["email", "#email.com"]?
Use strings.Index to find the # and slice to get the two parts:
var part1, part2 string
if i := strings.Index(s, "#"); i >= 0 {
part1, part2 = s[:i], s[i:]
} else {
// handle case with no #
}
Run it on the playground.
Could this work for you?
s := strings.Split("email#email.com", "#")
address, domain := s[0], "#"+s[1]
fmt.Println(address, domain)
// email #email.com
Then combing and creating a string
var buffer bytes.Buffer
buffer.WriteString(address)
buffer.WriteString(domain)
result := buffer.String()
fmt.Println(result)
// email#email.com
You can use bufio.Scanner:
package main
import (
"bufio"
"strings"
)
func email(data []byte, eof bool) (int, []byte, error) {
for i, b := range data {
if b == '#' {
if i > 0 {
return i, data[:i], nil
}
return len(data), data, nil
}
}
return 0, nil, nil
}
func main() {
s := bufio.NewScanner(strings.NewReader("email#email.com"))
s.Split(email)
for s.Scan() {
println(s.Text())
}
}
https://golang.org/pkg/bufio#Scanner.Split

Go Templates: range over string

Is there any way to range over a string in Go templates (that is, from the code in the template itself, not from native Go)? It doesn't seem to be supported directly (The value of the pipeline must be an array, slice, map, or channel.), but is there some hack like splitting the string into an array of single-character strings or something?
Note that I am unable to edit any go source: I'm working with a compiled binary here. I need to make this happen from the template code alone.
You can use FuncMap to split string into characters.
package main
import (
"text/template"
"log"
"os"
)
func main() {
tmpl, err := template.New("foo").Funcs(template.FuncMap{
"to_runes": func(s string) []string {
r := []string{}
for _, c := range []rune(s) {
r = append(r, string(c))
}
return r
},
}).Parse(`{{range . | to_runes }}[{{.}}]{{end}}`)
if err != nil {
log.Fatal(err)
}
err = tmpl.Execute(os.Stdout, "hello world")
if err != nil {
log.Fatal(err)
}
}
This should be:
[h][e][l][l][o][ ][w][o][r][l][d]

Encode/Decode base64

here is my code and I don't understand why the decode function doesn't work.
Little insight would be great please.
func EncodeB64(message string) (retour string) {
base64Text := make([]byte, base64.StdEncoding.EncodedLen(len(message)))
base64.StdEncoding.Encode(base64Text, []byte(message))
return string(base64Text)
}
func DecodeB64(message string) (retour string) {
base64Text := make([]byte, base64.StdEncoding.DecodedLen(len(message)))
base64.StdEncoding.Decode(base64Text, []byte(message))
fmt.Printf("base64: %s\n", base64Text)
return string(base64Text)
}
It gaves me :
[Decode error - output not utf-8][Decode error - output not utf-8]
The len prefix is superficial and causes the invalid utf-8 error:
package main
import (
"encoding/base64"
"fmt"
"log"
)
func main() {
str := base64.StdEncoding.EncodeToString([]byte("Hello, playground"))
fmt.Println(str)
data, err := base64.StdEncoding.DecodeString(str)
if err != nil {
log.Fatal("error:", err)
}
fmt.Printf("%q\n", data)
}
(Also here)
Output
SGVsbG8sIHBsYXlncm91bmQ=
"Hello, playground"
EDIT: I read too fast, the len was not used as a prefix. dystroy got it right.
DecodedLen returns the maximal length.
This length is useful for sizing your buffer but part of the buffer won't be written and thus won't be valid UTF-8.
You have to use only the real written length returned by the Decode function.
l, _ := base64.StdEncoding.Decode(base64Text, []byte(message))
log.Printf("base64: %s\n", base64Text[:l])
To sum up the other two posts, here are two simple functions to encode/decode Base64 strings with Go:
// Dont forget to import "encoding/base64"!
func base64Encode(str string) string {
return base64.StdEncoding.EncodeToString([]byte(str))
}
func base64Decode(str string) (string, bool) {
data, err := base64.StdEncoding.DecodeString(str)
if err != nil {
return "", true
}
return string(data), false
}
Try it!
#Denys Séguret's answer is almost 100% correct. As an improvement to avoid wasting memory with non used space in base64Text, you should use base64.DecodedLen. Take a look at how base64.DecodeString uses it.
It should look like this:
func main() {
message := base64.StdEncoding.EncodeToString([]byte("Hello, playground"))
base64Text := make([]byte, base64.StdEncoding.DecodedLen(len(message)))
n, _ := base64.StdEncoding.Decode(base64Text, []byte(message))
fmt.Println("base64Text:", string(base64Text[:n]))
}
Try it here.
More or less like above, but using []bytes and part of a bigger struct:
func (s secure) encodePayload(body []byte) string {
//Base64 Encode
return base64.StdEncoding.EncodeToString(body)
}
func (s secure) decodePayload(body []byte) ([]byte, error) {
//Base64 Decode
b64 := make([]byte, base64.StdEncoding.DecodedLen(len(body)))
n, err := base64.StdEncoding.Decode(b64, body)
if err != nil {
return nil, err
}
return b64[:n], nil
}

Read a text file, replace its words, output to another text file

So I am trying to make a program in GO to take a text file full of code and convert that into GO code and then save that file into a GO file or text file. I have been trying to figure out how to save the changes I made to the text file, but the only way I can see the changes is through a println statement because I am using strings.replace to search the string array that the text file is stored in and change each occurrence of a word that needs to be changed (ex. BEGIN -> { and END -> }). So is there any other way of searching and replacing in GO I don't know about or is there a way to edit a text file that I don't know about or is this impossible?
Thanks
Here is the code I have so far.
package main
import (
"os"
"bufio"
"bytes"
"io"
"fmt"
"strings"
)
func readLines(path string) (lines []string, errr error) {
var (
file *os.File
part []byte
prefix bool
)
if file, errr = os.Open(path); errr != nil {
return
}
defer file.Close()
reader := bufio.NewReader(file)
buffer := bytes.NewBuffer(make([]byte, 0))
for {
if part, prefix, errr = reader.ReadLine(); errr != nil {
break
}
buffer.Write(part)
if !prefix {
lines = append(lines, buffer.String())
buffer.Reset()
}
}
if errr == io.EOF {
errr = nil
}
return
}
func writeLines(lines []string, path string) (errr error) {
var (
file *os.File
)
if file, errr = os.Create(path); errr != nil {
return
}
defer file.Close()
for _,item := range lines {
_, errr := file.WriteString(strings.TrimSpace(item) + "\n");
if errr != nil {
fmt.Println(errr)
break
}
}
return
}
func FixBegin(lines []string) (errr error) {
var(
a string
)
for i := 0; ; i++ {
a = lines[i];
fmt.Println(strings.Replace(a, "BEGIN", "{", -1))
}
return
}
func FixEnd(lines []string) (errr error) {
var(
a string
)
for i := 0; ; i++ {
a = lines[i];
fmt.Println(strings.Replace(a, "END", "}", -1))
}
return
}
func main() {
lines, errr := readLines("foo.txt")
if errr != nil {
fmt.Println("Error: %s\n", errr)
return
}
for _, line := range lines {
fmt.Println(line)
}
errr = FixBegin(lines)
errr = writeLines(lines, "beer2.txt")
fmt.Println(errr)
errr = FixEnd(lines)
lines, errr = readLines("beer2.txt")
if errr != nil {
fmt.Println("Error: %s\n", errr)
return
}
errr = writeLines(lines, "beer2.txt")
fmt.Println(errr)
}
jnml#fsc-r630:~/src/tmp/SO/13789882$ ls
foo.txt main.go
jnml#fsc-r630:~/src/tmp/SO/13789882$ cat main.go
package main
import (
"bytes"
"io/ioutil"
"log"
)
func main() {
src, err := ioutil.ReadFile("foo.txt")
if err != nil {
log.Fatal(err)
}
src = bytes.Replace(src, []byte("BEGIN"), []byte("{"), -1)
src = bytes.Replace(src, []byte("END"), []byte("}"), -1)
if err = ioutil.WriteFile("beer2.txt", src, 0666); err != nil {
log.Fatal(err)
}
}
jnml#fsc-r630:~/src/tmp/SO/13789882$ cat foo.txt
BEGIN
FILE F(KIND=REMOTE);
EBCDIC ARRAY E[0:11];
REPLACE E BY "HELLO WORLD!";
WRITE(F, *, E);
END.
jnml#fsc-r630:~/src/tmp/SO/13789882$ go run main.go
jnml#fsc-r630:~/src/tmp/SO/13789882$ cat beer2.txt
{
FILE F(KIND=REMOTE);
EBCDIC ARRAY E[0:11];
REPLACE E BY "HELLO WORLD!";
WRITE(F, *, E);
}.
jnml#fsc-r630:~/src/tmp/SO/13789882$
I agree with #jnml wrt using ioutil to slurp the file and to write it back. But I think that the replacing shouldn't be done by multiple passes over []byte. Code and data are strings/text and should be treated as such (even if dealing with non ascii/utf8 encodings requires estra work); a one pass replacement (of all placeholders 'at once') avoids the risk of replacing results of previous changes (even if my regexp proposal must be improved to handle non-trivial tasks).
package main
import(
"fmt"
"io/ioutil"
"log"
"regexp"
"strings"
)
func main() {
// (1) slurp the file
data, err := ioutil.ReadFile("../tmpl/xpl.go")
if err != nil {
log.Fatal("ioutil.ReadFile: ", err)
}
s := string(data)
fmt.Printf("----\n%s----\n", s)
// => function that works for files of (known) other encodings that ascii or utf8
// (2) create a map that maps placeholder to be replaced to the replacements
x := map[string]string {
"BEGIN" : "{",
"END" : "}"}
ks := make([]string, 0, len(x))
for k := range x {
ks = append(ks, k)
}
// => function(s) that gets the keys from maps
// (3) create a regexp that finds the placeholder to be replaced
p := strings.Join(ks, "|")
fmt.Printf("/%s/\n", p)
r := regexp.MustCompile(p)
// => funny letters & order need more consideration
// (4) create a callback function for ..ReplaceAllStringFunc that knows
// about the map x
f := func(s string) string {
fmt.Printf("*** '%s'\n", s)
return x[s]
}
// => function (?) to do Step (2) .. (4) in a reusable way
// (5) do the replacing (s will be overwritten with the result)
s = r.ReplaceAllStringFunc(s, f)
fmt.Printf("----\n%s----\n", s)
// (6) write back
err = ioutil.WriteFile("result.go", []byte(s), 0644)
if err != nil {
log.Fatal("ioutil.WriteFile: ", err)
}
// => function that works for files of (known) other encodings that ascii or utf8
}
output:
go run 13789882.go
----
func main() BEGIN
END
----
/BEGIN|END/
*** 'BEGIN'
*** 'END'
----
func main() {
}
----
If your file size is huge, reading everything in memory might not be possible nor advised. Give BytesReplacingReader a try as it is done replacement in streaming fashion. And it's reasonably performant. If you want to replace two strings (such as BEGIN -> { and END -> }), just need to wrap two BytesReplacingReader over original reader, one for BEGIN and one for END:
r := NewBytesReplacingReader(
NewBytesReplacingReader(inputReader, []byte("BEGIN"), []byte("{"),
[]byte("END"), []byte("}")
// use r normally and all non-overlapping occurrences of
// "BEGIN" and "END" will be replaced with "{" and "}"

How can I read a whole file into a string variable

I have lots of small files, I don't want to read them line by line.
Is there a function in Go that will read a whole file into a string variable?
Use ioutil.ReadFile:
func ReadFile(filename string) ([]byte, error)
ReadFile reads the file named by filename and returns the contents. A successful call
returns err == nil, not err == EOF. Because ReadFile reads the whole file, it does not treat
an EOF from Read as an error to be reported.
You will get a []byte instead of a string. It can be converted if really necessary:
s := string(buf)
Edit: the ioutil package is now deprecated: "Deprecated: As of Go 1.16, the same functionality is now provided by package io or package os, and those implementations should be preferred in new code. See the specific function documentation for details." Because of Go's compatibility promise, ioutil.ReadMe is safe, but #openwonk's updated answer is better for new code.
If you just want the content as string, then the simple solution is to use the ReadFile function from the io/ioutil package. This function returns a slice of bytes which you can easily convert to a string.
Go 1.16 or later
Replace ioutil with os for this example.
package main
import (
"fmt"
"os"
)
func main() {
b, err := os.ReadFile("file.txt") // just pass the file name
if err != nil {
fmt.Print(err)
}
fmt.Println(b) // print the content as 'bytes'
str := string(b) // convert content to a 'string'
fmt.Println(str) // print the content as a 'string'
}
Go 1.15 or earlier
package main
import (
"fmt"
"io/ioutil"
)
func main() {
b, err := ioutil.ReadFile("file.txt") // just pass the file name
if err != nil {
fmt.Print(err)
}
fmt.Println(b) // print the content as 'bytes'
str := string(b) // convert content to a 'string'
fmt.Println(str) // print the content as a 'string'
}
I think the best thing to do, if you're really concerned about the efficiency of concatenating all of these files, is to copy them all into the same bytes buffer.
buf := bytes.NewBuffer(nil)
for _, filename := range filenames {
f, _ := os.Open(filename) // Error handling elided for brevity.
io.Copy(buf, f) // Error handling elided for brevity.
f.Close()
}
s := string(buf.Bytes())
This opens each file, copies its contents into buf, then closes the file. Depending on your situation you may not actually need to convert it, the last line is just to show that buf.Bytes() has the data you're looking for.
This is how I did it:
package main
import (
"fmt"
"os"
"bytes"
"log"
)
func main() {
filerc, err := os.Open("filename")
if err != nil{
log.Fatal(err)
}
defer filerc.Close()
buf := new(bytes.Buffer)
buf.ReadFrom(filerc)
contents := buf.String()
fmt.Print(contents)
}
You can use strings.Builder:
package main
import (
"io"
"os"
"strings"
)
func main() {
f, err := os.Open("file.txt")
if err != nil {
panic(err)
}
defer f.Close()
b := new(strings.Builder)
io.Copy(b, f)
print(b.String())
}
Or if you don't mind []byte, you can use
os.ReadFile:
package main
import "os"
func main() {
b, err := os.ReadFile("file.txt")
if err != nil {
panic(err)
}
os.Stdout.Write(b)
}
For Go 1.16 or later you can read file at compilation time.
Use the //go:embed directive and the embed package in Go 1.16
For example:
package main
import (
"fmt"
_ "embed"
)
//go:embed file.txt
var s string
func main() {
fmt.Println(s) // print the content as a 'string'
}
I'm not with computer,so I write a draft. You might be clear of what I say.
func main(){
const dir = "/etc/"
filesInfo, e := ioutil.ReadDir(dir)
var fileNames = make([]string, 0, 10)
for i,v:=range filesInfo{
if !v.IsDir() {
fileNames = append(fileNames, v.Name())
}
}
var fileNumber = len(fileNames)
var contents = make([]string, fileNumber, 10)
wg := sync.WaitGroup{}
wg.Add(fileNumber)
for i,_:=range content {
go func(i int){
defer wg.Done()
buf,e := ioutil.Readfile(fmt.Printf("%s/%s", dir, fileName[i]))
defer file.Close()
content[i] = string(buf)
}(i)
}
wg.Wait()
}

Resources