How can I read a whole file into a string variable

I have lots of small files, I don't want to read them line by line.
Is there a function in Go that will read a whole file into a string variable?

Use ioutil.ReadFile:
func ReadFile(filename string) ([]byte, error)
ReadFile reads the file named by filename and returns the contents. A successful call
returns err == nil, not err == EOF. Because ReadFile reads the whole file, it does not treat
an EOF from Read as an error to be reported.
You will get a []byte instead of a string. It can be converted if really necessary:
s := string(buf)
Edit: the ioutil package is now deprecated: "Deprecated: As of Go 1.16, the same functionality is now provided by package io or package os, and those implementations should be preferred in new code. See the specific function documentation for details." Because of Go's compatibility promise, ioutil.ReadFile is safe, but @openwonk's updated answer is better for new code.

If you just want the content as string, then the simple solution is to use the ReadFile function from the io/ioutil package. This function returns a slice of bytes which you can easily convert to a string.
Go 1.16 or later
Replace ioutil with os for this example.
package main
import (
"fmt"
"os"
)
func main() {
b, err := os.ReadFile("file.txt") // just pass the file name
if err != nil {
fmt.Println(err)
return // nothing to print if the read failed
}
fmt.Println(b) // print the content as 'bytes'
str := string(b) // convert content to a 'string'
fmt.Println(str) // print the content as a 'string'
}
Go 1.15 or earlier
package main
import (
"fmt"
"io/ioutil"
)
func main() {
b, err := ioutil.ReadFile("file.txt") // just pass the file name
if err != nil {
fmt.Println(err)
return // nothing to print if the read failed
}
fmt.Println(b) // print the content as 'bytes'
str := string(b) // convert content to a 'string'
fmt.Println(str) // print the content as a 'string'
}

I think the best thing to do, if you're really concerned about the efficiency of concatenating all of these files, is to copy them all into the same bytes buffer.
buf := bytes.NewBuffer(nil)
for _, filename := range filenames {
f, _ := os.Open(filename) // Error handling elided for brevity.
io.Copy(buf, f) // Error handling elided for brevity.
f.Close()
}
s := string(buf.Bytes())
This opens each file, copies its contents into buf, then closes the file. Depending on your situation you may not actually need to convert it, the last line is just to show that buf.Bytes() has the data you're looking for.

This is how I did it:
package main
import (
"fmt"
"os"
"bytes"
"log"
)
func main() {
filerc, err := os.Open("filename")
if err != nil{
log.Fatal(err)
}
defer filerc.Close()
buf := new(bytes.Buffer)
if _, err := buf.ReadFrom(filerc); err != nil {
log.Fatal(err)
}
contents := buf.String()
fmt.Print(contents)
}

You can use strings.Builder:
package main
import (
"io"
"os"
"strings"
)
func main() {
f, err := os.Open("file.txt")
if err != nil {
panic(err)
}
defer f.Close()
b := new(strings.Builder)
if _, err := io.Copy(b, f); err != nil {
panic(err)
}
print(b.String())
}
Or if you don't mind []byte, you can use os.ReadFile:
package main
import "os"
func main() {
b, err := os.ReadFile("file.txt")
if err != nil {
panic(err)
}
os.Stdout.Write(b)
}

With Go 1.16 or later you can read the file at compile time.
Use the //go:embed directive together with the embed package.
For example:
package main
import (
"fmt"
_ "embed"
)
//go:embed file.txt
var s string
func main() {
fmt.Println(s) // print the content as a 'string'
}

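I'm not at a computer, so this is only a draft, but it should make the idea clear: read the files concurrently, one goroutine per file.
package main
import (
"io/ioutil"
"log"
"path/filepath"
"sync"
)
func main() {
const dir = "/etc/"
filesInfo, err := ioutil.ReadDir(dir)
if err != nil {
log.Fatal(err)
}
// Collect the names of the regular entries, skipping directories.
fileNames := make([]string, 0, len(filesInfo))
for _, v := range filesInfo {
if !v.IsDir() {
fileNames = append(fileNames, v.Name())
}
}
contents := make([]string, len(fileNames))
var wg sync.WaitGroup
wg.Add(len(fileNames))
for i := range fileNames {
go func(i int) {
defer wg.Done()
buf, err := ioutil.ReadFile(filepath.Join(dir, fileNames[i]))
if err != nil {
log.Print(err) // e.g. permission denied; contents[i] stays empty
return
}
contents[i] = string(buf) // each goroutine writes only its own index
}(i)
}
wg.Wait()
log.Printf("read %d files", len(contents))
}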

Related

Parse variable length array from csv to struct

I have the following setup to parse a csv file:
package main
import (
"fmt"
"os"
"encoding/csv"
)
type CsvLine struct {
Id string
Array1 []string
Array2 []string
}
func ReadCsv(filename string) ([][]string, error) {
f, err := os.Open(filename)
if err != nil {
return [][]string{}, err
}
defer f.Close()
lines, err := csv.NewReader(f).ReadAll()
if err != nil {
return [][]string{}, err
}
return lines, nil
}
func main() {
lines, err := ReadCsv("./data/sample-0.3.csv")
if err != nil {
panic(err)
}
for _, line := range lines {
fmt.Println(line)
data := CsvLine{
Id: line[0],
Array1: line[1],
Array2: line[2],
}
fmt.Println(data.Id)
fmt.Println(data.Array1)
fmt.Println(data.Array2)
}
}
And the following setup in my csv file:
594385903dss,"['fhjdsk', 'dfjdskl', 'fkdsjgooiertio']","['jflkdsjfl', 'fkjdlsfjdslkfjldks']"
87764385903dss,"['cxxc', 'wqeewr', 'opi', 'iy', 'qw']","['cvbvc', 'gf', 'mnb', 'ewr']"
My understanding is that variable length lists should be parsed into a slice, is it possible to do this directly via a csv reader? (The csv output was generated via a python project.)
Help/suggestions appreciated.
CSV does not have a notion of "variable length arrays", it is just a comma separated list of values. The format is described in RFC 4180, and that is exactly what the encoding/csv package implements.
You can only get a string slice out of a CSV line. How you interpret the values is up to you. You have to post process your data if you want to split it further.
What you have may be simply processed with the regexp package, e.g.
var r = regexp.MustCompile(`'[^']*'`)
func split(s string) []string {
parts := r.FindAllString(s, -1)
for i, part := range parts {
parts[i] = part[1 : len(part)-1]
}
return parts
}
Testing it:
s := `['one', 'two', 'three']`
fmt.Printf("%q\n", split(s))
s = `[]`
fmt.Printf("%q\n", split(s))
s = `['o,ne', 't,w,o', 't,,hree']`
fmt.Printf("%q\n", split(s))
Output:
["one" "two" "three"]
[]
["o,ne" "t,w,o" "t,,hree"]
Using this split() function, this is what the processing may look like:
for _, line := range lines {
data := CsvLine{
Id: line[0],
Array1: split(line[1]),
Array2: split(line[2]),
}
fmt.Printf("%+v\n", data)
}
This outputs:
{Id:594385903dss Array1:[fhjdsk dfjdskl fkdsjgooiertio] Array2:[jflkdsjfl fkjdlsfjdslkfjldks]}
{Id:87764385903dss Array1:[cxxc wqeewr opi iy qw] Array2:[cvbvc gf mnb ewr]}
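Putting the pieces together, here is a self-contained sketch that reads the two sample rows from a string instead of a file. The parse helper and the quoted variable are names made up for this example; split is the function shown above. It assumes no list item contains an apostrophe, since the regexp treats every single quote as a delimiter.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"regexp"
	"strings"
)

type CsvLine struct {
	Id     string
	Array1 []string
	Array2 []string
}

// quoted matches one single-quoted item of a Python-style list literal.
var quoted = regexp.MustCompile(`'[^']*'`)

// split extracts the quoted items from a string such as "['a', 'b']".
func split(s string) []string {
	parts := quoted.FindAllString(s, -1)
	for i, part := range parts {
		parts[i] = part[1 : len(part)-1] // strip the surrounding quotes
	}
	return parts
}

// parse reads CSV text and converts each three-field record into a CsvLine.
func parse(input string) ([]CsvLine, error) {
	r := csv.NewReader(strings.NewReader(input))
	r.FieldsPerRecord = 3 // reject rows that do not have exactly three fields
	records, err := r.ReadAll()
	if err != nil {
		return nil, err
	}
	out := make([]CsvLine, 0, len(records))
	for _, rec := range records {
		out = append(out, CsvLine{
			Id:     rec[0],
			Array1: split(rec[1]),
			Array2: split(rec[2]),
		})
	}
	return out, nil
}

func main() {
	const sample = `594385903dss,"['fhjdsk', 'dfjdskl', 'fkdsjgooiertio']","['jflkdsjfl', 'fkjdlsfjdslkfjldks']"
87764385903dss,"['cxxc', 'wqeewr', 'opi', 'iy', 'qw']","['cvbvc', 'gf', 'mnb', 'ewr']"
`
	lines, err := parse(sample)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, l := range lines {
		fmt.Printf("%+v\n", l)
	}
}
```

Setting FieldsPerRecord makes the reader report malformed rows as an error, so the rec[1] and rec[2] accesses cannot go out of range.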

How to encrypt a value imported from a JSON file using SOPS (Secrets OPerationS) and Go?

I have a JSON file as follows.
secret.json:
{
"secret": "strongPassword"
}
I want to print out an encrypted value of the key "secret".
I've so far tried as follows.
package main
import (
"encoding/json"
"fmt"
"io/ioutil"
"go.mozilla.org/sops"
)
type secretValue struct {
Value string `json:"secret"`
}
func main() {
file, _ := ioutil.ReadFile("secret.json")
getSecretValue := secretValue{}
_ = json.Unmarshal([]byte(file), &getSecretValue)
encryptedValue, err := sops.Tree.Encrypt([]byte(getSecretValue.Value), file)
if err != nil {
panic(err)
}
fmt.Println(encryptedValue)
}
As you might have guessed, I'm pretty new to Go and the code above doesn't work.
How can I improve the code to print out the encrypted value?
Please note that I'm writing code like this only to see how SOPS works using Go. I don't print out secret value like this in production.
Edit:
I think the problem is the arguments for the Encrypt function. According to the documentation, it should take []byte key and Cipher arguments, but I don't know either if I'm setting the []byte key correct or where that Cipher comes from. Is it from crypto/cipher package?
Edit 2:
Thank you @HolaYang for the great answer.
I tried to make your answer work with the external JSON file as follows, but it gave me an error message saying cannot use fileContent (type secretValue) as type []byte in argument to (&"go.mozilla.org/sops/stores/json".Store literal).LoadPlainFile.
package main
import (
hey "encoding/json"
"fmt"
"io/ioutil"
"go.mozilla.org/sops"
"go.mozilla.org/sops/aes"
"go.mozilla.org/sops/stores/json"
)
type secretValue struct {
Value string `json:"secret"`
}
func main() {
// fileContent := []byte(`{
// "secret": "strongPassword"
// }`)
file, _ := ioutil.ReadFile("secret.json")
fileContent := secretValue{}
//_ = json.Unmarshal([]byte(file), &fileContent)
_ = hey.Unmarshal([]byte(file), &fileContent)
encryptKey := []byte("0123456789012345") // length 16
branches, _ := (&json.Store{}).LoadPlainFile(fileContent)
tree := sops.Tree{Branches: branches}
r, err := tree.Encrypt(encryptKey, aes.NewCipher())
if err != nil {
panic(err)
}
fmt.Println(r)
}
Look at the declaration of sops.Tree.Encrypt (your code calls it incorrectly): it is a method on a sops.Tree value, and it takes a key and a Cipher.
So the code has to do two things:
1. Construct a sops.Tree instance from the JSON file.
2. Pick a Cipher to encrypt with.
Please try it yourself along these lines.
The demo below uses AES as the Cipher. Note that through this source-code interface sops can only encrypt the whole tree.
package main
import (
"fmt"
"io/ioutil"
"go.mozilla.org/sops"
"go.mozilla.org/sops/aes"
"go.mozilla.org/sops/stores/json"
)
func main() {
/*
fileContent := []byte(`{
"secret": "strongPassword"
}`)
*/
fileContent, _ := ioutil.ReadFile("xxx.json")
encryptKey := []byte("0123456789012345") // length 16
branches, _ := (&json.Store{}).LoadPlainFile(fileContent)
tree := sops.Tree{Branches: branches}
r, err := tree.Encrypt(encryptKey, aes.NewCipher())
if err != nil {
panic(err)
}
fmt.Println(r)
}

Thesaurus of strings: because so many different start characters, need to use Split with not equal to logic

I have a .dat file that is a dictionary/thesaurus containing about 300k lines
For each word, the following lines below it that have a word in brackets at the start of the string are the thesaurus' alternatives with the word in the brackets being the type. So a noun or adjective. For example:
acceptant|1
(adj)|acceptive|receptive
acceptation|3
(noun)|acceptance
(noun)|word meaning|word sense|sense|signified
(noun)|adoption|acceptance|espousal|blessing|approval|approving
accepted|6
(adj)|recognized|recognised|acknowledged
(adj)|undisputed|uncontroversial |noncontroversial
(adj)|standard
(adj)|acceptable|standard |received
(adj)|established |constituted
(adj)|received|conventional
accepting|1
(adj)|acceptive
So in the above there are 4 words from the dictionary, but each word has multiple different entries for the thesaurus
I want to split the strings using:
strings.Split(dictionary, !"(")
Meaning anything that isn't the "(" character. This is because it's an extensive dictionary with slang and abbreviations and whatnot. But I can't work out how to use the not equal to operator
Does anyone know how to use split with not equal to logic? Or can anyone suggest some clever alternative ideas?
@MostafaSolati's solution can be made more efficient.
package main
import (
"bufio"
"bytes"
"fmt"
"os"
)
func main() {
file, _ := os.Open("dic.dat")
scanner := bufio.NewScanner(file)
for scanner.Scan() {
data := scanner.Bytes()
if bytes.HasPrefix(data, []byte("(")) {
continue
}
line := scanner.Text()
fmt.Println(line)
}
}
Output:
acceptant|1
acceptation|3
accepted|6
accepting|1
By design, Go code is expected to be efficient. The Go standard library testing package includes a benchmark feature.
It's important to avoid unnecessary conversions and allocations. For example, converting a byte slice read from a file to a string costs an allocation and a copy.
In this case, only the lines we accept need to be converted to strings. So test the prefix on Bytes and call Text only for the lines you keep.
$ go test dict_test.go -bench=.
BenchmarkText-4 500 2486306 ns/op 898528 B/op 14170 allocs/op
BenchmarkBytes-4 1000 1489828 ns/op 34080 B/op 609 allocs/op
$
Sample benchmark data:
KEY: Aback.
SYN: Backwards, rearwards, aft, abaft, astern, behind, back.
ANT: Onwards, forwards, ahead, before, afront, beyond, afore.
=
KEY: Abandon.
SYN: Leave, forsake, desert, renounce, cease, relinquish,
discontinue, castoff, resign, retire, quit, forego, forswear,
depart from, vacate, surrender, abjure, repudiate.
ANT: Pursue, prosecute, undertake, seek, court, cherish, favor,
protect, claim, maintain, defend, advocate, retain, support, uphold,
occupy, haunt, hold, assert, vindicate, keep.
=
dict_test.go:
package main
import (
"bufio"
"bytes"
"io/ioutil"
"strings"
"testing"
)
func BenchmarkText(b *testing.B) {
b.ReportAllocs()
for N := 0; N < b.N; N++ {
file := bytes.NewReader(benchData)
scanner := bufio.NewScanner(file)
for scanner.Scan() {
line := scanner.Text()
if !strings.HasPrefix(line, "KEY") {
continue
}
_ = line // process line
}
if err := scanner.Err(); err != nil {
b.Fatal(err)
}
}
}
func BenchmarkBytes(b *testing.B) {
b.ReportAllocs()
for N := 0; N < b.N; N++ {
file := bytes.NewReader(benchData)
scanner := bufio.NewScanner(file)
for scanner.Scan() {
data := scanner.Bytes()
if !bytes.HasPrefix(data, []byte("KEY")) {
continue
}
line := scanner.Text()
_ = line // process line
}
if err := scanner.Err(); err != nil {
b.Fatal(err)
}
}
}
var benchData = func() []byte {
// A Complete Dictionary of Synonyms and Antonyms by Samuel Fallows
// http://www.gutenberg.org/files/51155/51155-0.txt
data, err := ioutil.ReadFile(`/home/peter/dictionary.51155-0.txt`)
if err != nil {
panic(err)
}
return data
}()
package main
import (
"bufio"
"fmt"
"os"
"strings"
)
func main() {
file, _ := os.Open("dic.dat")
scanner := bufio.NewScanner(file)
for scanner.Scan() {
line := scanner.Text()
if strings.HasPrefix(line, "(") {
continue
}
fmt.Println(line)
}
}

Go Templates: range over string

Is there any way to range over a string in Go templates (that is, from the code in the template itself, not from native Go)? It doesn't seem to be supported directly (The value of the pipeline must be an array, slice, map, or channel.), but is there some hack like splitting the string into an array of single-character strings or something?
Note that I am unable to edit any go source: I'm working with a compiled binary here. I need to make this happen from the template code alone.
You can use FuncMap to split string into characters.
package main
import (
"text/template"
"log"
"os"
)
func main() {
tmpl, err := template.New("foo").Funcs(template.FuncMap{
"to_runes": func(s string) []string {
r := []string{}
for _, c := range []rune(s) {
r = append(r, string(c))
}
return r
},
}).Parse(`{{range . | to_runes }}[{{.}}]{{end}}`)
if err != nil {
log.Fatal(err)
}
err = tmpl.Execute(os.Stdout, "hello world")
if err != nil {
log.Fatal(err)
}
}
This should print:
[h][e][l][l][o][ ][w][o][r][l][d]
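A shorter variant of the same idea, as a sketch: strings.Split with an empty separator already splits a string after each UTF-8 sequence, so it can be registered directly. The function name "split" and the render helper are made up for this example, and like the code above it still needs the function to be registered on the Go side.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
	"text/template"
)

// render executes a template that ranges over the characters of s.
func render(s string) (string, error) {
	tmpl, err := template.New("chars").Funcs(template.FuncMap{
		// "split" is a hypothetical name; it exposes strings.Split to the template.
		"split": strings.Split,
	}).Parse(`{{range split . ""}}[{{.}}]{{end}}`)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, s); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := render("héllo")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(out) // prints [h][é][l][l][o]
}
```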

Read a text file, replace its words, output to another text file

I am trying to write a Go program that takes a text file full of code, converts it into Go code, and saves the result to a Go file or a text file. I have been trying to figure out how to save the changes I make to the text, but so far the only way I can see the changes is through a Println statement: I use strings.Replace to search the string slice that holds the file's lines and change each occurrence of a word that needs to be changed (e.g. BEGIN -> { and END -> }). Is there another way of searching and replacing in Go that I don't know about, is there a way to edit a text file that I don't know about, or is this impossible?
Thanks
Here is the code I have so far.
package main
import (
"os"
"bufio"
"bytes"
"io"
"fmt"
"strings"
)
func readLines(path string) (lines []string, errr error) {
var (
file *os.File
part []byte
prefix bool
)
if file, errr = os.Open(path); errr != nil {
return
}
defer file.Close()
reader := bufio.NewReader(file)
buffer := bytes.NewBuffer(make([]byte, 0))
for {
if part, prefix, errr = reader.ReadLine(); errr != nil {
break
}
buffer.Write(part)
if !prefix {
lines = append(lines, buffer.String())
buffer.Reset()
}
}
if errr == io.EOF {
errr = nil
}
return
}
func writeLines(lines []string, path string) (errr error) {
var (
file *os.File
)
if file, errr = os.Create(path); errr != nil {
return
}
defer file.Close()
for _,item := range lines {
_, errr := file.WriteString(strings.TrimSpace(item) + "\n");
if errr != nil {
fmt.Println(errr)
break
}
}
return
}
func FixBegin(lines []string) (errr error) {
var(
a string
)
for i := 0; ; i++ {
a = lines[i];
fmt.Println(strings.Replace(a, "BEGIN", "{", -1))
}
return
}
func FixEnd(lines []string) (errr error) {
var(
a string
)
for i := 0; ; i++ {
a = lines[i];
fmt.Println(strings.Replace(a, "END", "}", -1))
}
return
}
func main() {
lines, errr := readLines("foo.txt")
if errr != nil {
fmt.Println("Error: %s\n", errr)
return
}
for _, line := range lines {
fmt.Println(line)
}
errr = FixBegin(lines)
errr = writeLines(lines, "beer2.txt")
fmt.Println(errr)
errr = FixEnd(lines)
lines, errr = readLines("beer2.txt")
if errr != nil {
fmt.Println("Error: %s\n", errr)
return
}
errr = writeLines(lines, "beer2.txt")
fmt.Println(errr)
}
jnml@fsc-r630:~/src/tmp/SO/13789882$ ls
foo.txt main.go
jnml@fsc-r630:~/src/tmp/SO/13789882$ cat main.go
package main
import (
"bytes"
"io/ioutil"
"log"
)
func main() {
src, err := ioutil.ReadFile("foo.txt")
if err != nil {
log.Fatal(err)
}
src = bytes.Replace(src, []byte("BEGIN"), []byte("{"), -1)
src = bytes.Replace(src, []byte("END"), []byte("}"), -1)
if err = ioutil.WriteFile("beer2.txt", src, 0666); err != nil {
log.Fatal(err)
}
}
jnml@fsc-r630:~/src/tmp/SO/13789882$ cat foo.txt
BEGIN
FILE F(KIND=REMOTE);
EBCDIC ARRAY E[0:11];
REPLACE E BY "HELLO WORLD!";
WRITE(F, *, E);
END.
jnml@fsc-r630:~/src/tmp/SO/13789882$ go run main.go
jnml@fsc-r630:~/src/tmp/SO/13789882$ cat beer2.txt
{
FILE F(KIND=REMOTE);
EBCDIC ARRAY E[0:11];
REPLACE E BY "HELLO WORLD!";
WRITE(F, *, E);
}.
jnml@fsc-r630:~/src/tmp/SO/13789882$
I agree with @jnml about using ioutil to slurp the file and to write it back. But I think the replacing shouldn't be done by multiple passes over []byte. Code and data are strings/text and should be treated as such (even if dealing with encodings other than ASCII/UTF-8 requires extra work). A one-pass replacement (of all placeholders at once) avoids the risk of replacing the results of previous changes (even if my regexp proposal must be improved to handle non-trivial tasks).
package main
import(
"fmt"
"io/ioutil"
"log"
"regexp"
"strings"
)
func main() {
// (1) slurp the file
data, err := ioutil.ReadFile("../tmpl/xpl.go")
if err != nil {
log.Fatal("ioutil.ReadFile: ", err)
}
s := string(data)
fmt.Printf("----\n%s----\n", s)
// => function that works for files of (known) encodings other than ASCII or UTF-8
// (2) create a map that maps placeholder to be replaced to the replacements
x := map[string]string {
"BEGIN" : "{",
"END" : "}"}
ks := make([]string, 0, len(x))
for k := range x {
ks = append(ks, k)
}
// => function(s) that gets the keys from maps
// (3) create a regexp that finds the placeholder to be replaced
p := strings.Join(ks, "|")
fmt.Printf("/%s/\n", p)
r := regexp.MustCompile(p)
// => funny letters & order need more consideration
// (4) create a callback function for ..ReplaceAllStringFunc that knows
// about the map x
f := func(s string) string {
fmt.Printf("*** '%s'\n", s)
return x[s]
}
// => function (?) to do Step (2) .. (4) in a reusable way
// (5) do the replacing (s will be overwritten with the result)
s = r.ReplaceAllStringFunc(s, f)
fmt.Printf("----\n%s----\n", s)
// (6) write back
err = ioutil.WriteFile("result.go", []byte(s), 0644)
if err != nil {
log.Fatal("ioutil.WriteFile: ", err)
}
// => function that works for files of (known) encodings other than ASCII or UTF-8
}
output:
go run 13789882.go
----
func main() BEGIN
END
----
/BEGIN|END/
*** 'BEGIN'
*** 'END'
----
func main() {
}
----
If the file is huge, reading everything into memory might be neither possible nor advisable. Give BytesReplacingReader a try: it does the replacement in a streaming fashion, and it's reasonably performant. If you want to replace two strings (such as BEGIN -> { and END -> }), just wrap two BytesReplacingReaders around the original reader, one for BEGIN and one for END:
r := NewBytesReplacingReader(
NewBytesReplacingReader(inputReader, []byte("BEGIN"), []byte("{")),
[]byte("END"), []byte("}"))
// use r normally and all non-overlapping occurrences of
// "BEGIN" and "END" will be replaced with "{" and "}"

Resources