I have this little bit of code that kept me busy the whole weekend.
package main
import (
"encoding/csv"
"fmt"
"log"
"os"
)
func main() {
f, err := os.Create("./test.csv")
if err != nil {
log.Fatalf("Error: %s", err)
}
defer f.Close()
w := csv.NewWriter(f)
var record []string
record = append(record, "Unquoted string")
s := "Cr#zy text with , and \\ and \" etc"
record = append(record, s)
fmt.Println(record)
w.Write(record)
record = make([]string, 0)
record = append(record, "Quoted string")
s = fmt.Sprintf("%q", s)
record = append(record, s)
fmt.Println(record)
w.Write(record)
w.Flush()
}
When run it prints out:
[Unquoted string Cr#zy text with , and \ and " etc]
[Quoted string "Cr#zy text with , and \\ and \" etc"]
The second, quoted text is exactly what I would wish to see in the CSV, but instead I get this:
Unquoted string,"Cr#zy text with , and \ and "" etc"
Quoted string,"""Cr#zy text with , and \\ and \"" etc"""
Where do those extra quotes come from and how do I avoid them?
I have tried a number of things, including strconv.Quote and the like, but I can't seem to find a perfect solution. Help, please?
It's part of the standard for storing data as CSV.
Double quote characters need to be escaped for parsing reasons.
A (double) quote character in a field must be represented by two (double) quote characters.
From: http://en.wikipedia.org/wiki/Comma-separated_values
You don't really have to worry because the CSV reader un-escapes the double quote.
Example:
package main
import (
"encoding/csv"
"fmt"
"os"
)
func checkError(e error){
if e != nil {
panic(e)
}
}
func writeCSV(){
fmt.Println("Writing csv")
f, err := os.Create("./test.csv")
checkError(err)
defer f.Close()
w := csv.NewWriter(f)
s := "Cr#zy text with , and \\ and \" etc"
record := []string{
"Unquoted string",
s,
}
fmt.Println(record)
w.Write(record)
record = []string{
"Quoted string",
fmt.Sprintf("%q",s),
}
fmt.Println(record)
w.Write(record)
w.Flush()
}
func readCSV(){
fmt.Println("Reading csv")
file, err := os.Open("./test.csv")
checkError(err)
defer file.Close()
cr := csv.NewReader(file)
records, err := cr.ReadAll()
checkError(err)
for _, record := range records {
fmt.Println(record)
}
}
func main() {
writeCSV()
readCSV()
}
Output
Writing csv
[Unquoted string Cr#zy text with , and \ and " etc]
[Quoted string "Cr#zy text with , and \\ and \" etc"]
Reading csv
[Unquoted string Cr#zy text with , and \ and " etc]
[Quoted string "Cr#zy text with , and \\ and \" etc"]
Here's the code for the write function.
func (w *Writer) Write(record []string) (err error)
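The same round trip can be sketched in memory, without a file. This is only an illustration: roundTrip is a hypothetical helper name, and the sample record is made up. It shows the quote being doubled in the raw CSV text and restored by the reader.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"strings"
)

// roundTrip (hypothetical helper) writes one record to an in-memory CSV
// buffer and reads it back. It returns the raw CSV text and the decoded record.
func roundTrip(record []string) (string, []string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	if err := w.Write(record); err != nil {
		return "", nil, err
	}
	w.Flush()
	if err := w.Error(); err != nil {
		return "", nil, err
	}
	raw := buf.String()
	got, err := csv.NewReader(strings.NewReader(raw)).Read()
	return raw, got, err
}

func main() {
	raw, got, err := roundTrip([]string{"plain", `say "hi", ok`})
	if err != nil {
		panic(err)
	}
	fmt.Print(raw)      // raw CSV text: the embedded quotes are doubled
	fmt.Println(got[1]) // decoded field: the original text again
}
```

So the doubled quotes exist only in the file representation; any CSV-aware reader hands back the original field.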
I have a CSV file containing a line with a bare double quote inside a field, like:
text;//*[#class="price"]/span;text
The csv Reader returns an error when it reads this file.
What helped was:
reader := csv.NewReader(file)
reader.LazyQuotes = true
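A small sketch comparing the strict and lazy behaviour on that line. It assumes the file is semicolon-delimited, so Comma is set to ';' (the snippet above does not show that), and parseLine is a hypothetical helper name.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// parseLine (hypothetical helper) reads a single semicolon-separated line.
// The lazy flag toggles csv.Reader.LazyQuotes so both modes can be compared.
func parseLine(line string, lazy bool) ([]string, error) {
	r := csv.NewReader(strings.NewReader(line))
	r.Comma = ';' // assumption: the file uses ';' as its delimiter
	r.LazyQuotes = lazy
	return r.Read()
}

func main() {
	line := `text;//*[#class="price"]/span;text`

	// Strict mode rejects a bare quote inside an unquoted field.
	if _, err := parseLine(line, false); err != nil {
		fmt.Println("strict:", err)
	}

	// Lazy mode accepts it and keeps the quotes as part of the field.
	rec, err := parseLine(line, true)
	fmt.Println("lazy:", rec, err)
}
```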
The s variable's value is not what you think it is. http://play.golang.org/p/vAEYkINWnm
I have the text below:
str := `
Maybe we should all just listen to
records and quit our jobs
— gach White —
AZ QUOTES
`
And I want to remove ALL empty lines.
I was able to remove the empty lines between the paragraphs with:
str = strings.Replace(str, "\n\n", "\n", -1)
fmt.Println(str)
And ended up with:
Maybe we should all just listen to
records and quit our jobs
— gach White —
AZ QUOTES
So there are still a couple of empty lines at the beginning and a few at the end. How can I get rid of them?
In my app I'm trying to extract the text from all "png" files in the same directory and print it in a clean format. My full code so far is:
package main
import (
"fmt"
"io/ioutil"
"os"
"os/exec"
"path/filepath"
"strings"
_ "image/png"
)
func main() {
var files []string
root := "."
err := filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
if filepath.Ext(path) == ".png" {
path = strings.TrimSuffix(path, filepath.Ext(path))
files = append(files, path)
}
return nil
})
if err != nil {
panic(err)
}
for _, file := range files {
fmt.Println(file)
err = exec.Command(`tesseract`, file+".png", file).Run()
if err != nil {
fmt.Printf("Error: %s\n", err)
} else {
b, err := ioutil.ReadFile(file + ".txt") // just pass the file name
if err != nil {
fmt.Print(err)
} else {
str := string(b) // convert content to a 'string'
str = strings.Replace(str, "\n\n", "\n", -1)
fmt.Println(str) // print the content as a 'string'
}
}
}
}
Split the string on \n, skip the elements that contain only whitespace, and then join the rest with \n:
func trimEmptyNewLines(str string) string{
strs := strings.Split(str, "\n")
str = ""
for _, s := range strs {
if len(strings.TrimSpace(s)) == 0 {
continue
}
str += s+"\n"
}
str = strings.TrimSuffix(str, "\n")
return str
}
You can use strings.TrimSpace to remove all leading and trailing whitespace:
str = strings.TrimSpace(str)
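Note that this only touches the two ends of the string. A quick sketch with a made-up sample string (trimOuter is a hypothetical wrapper, added only to give the behaviour a name):

```go
package main

import (
	"fmt"
	"strings"
)

// trimOuter (hypothetical wrapper) removes leading and trailing whitespace
// only; blank lines in the middle of the text are left alone.
func trimOuter(s string) string {
	return strings.TrimSpace(s)
}

func main() {
	s := "\n \n\nfirst\n\nsecond\n\n \n"
	fmt.Printf("%q\n", trimOuter(s)) // prints "first\n\nsecond"
}
```

The empty line between the two words survives, so TrimSpace still has to be combined with one of the other approaches to remove interior empty lines.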
A little different answer.
package main
import (
"fmt"
)
func main() {
str := `
Maybe we should all just listen to
records and quit our jobs
— gach White —
AZ QUOTES
`
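// first stays -1 until a non-whitespace byte is seen, so a string that
// starts with text at index 0 is handled correctly.
first := -1
last := 0
for i, j := range []byte(str) {
if j != 10 && j != 32 { // 10 is '\n', 32 is ' '
if first == -1 {
first = i
}
last = i
}
}
if first >= 0 {
str = str[first : last+1]
} else {
str = ""
}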
fmt.Print(str)
}
I copied your string and turned it into JSON:
package main
import (
"encoding/json"
"log"
)
func main() {
// The string from the original post.
myString := `
Maybe we should all just listen to
records and quit our jobs
— gach White —
AZ QUOTES
`
// Marshal to json.
data, err := json.Marshal(myString)
if err != nil {
log.Fatalf("Failed to marshal string to JSON.\nError: %s", err.Error())
}
// Print the string to stdout.
println(string(data))
}
It'll probably be easier to see the whitespace in JSON.
"\n \n\nMaybe we should all just listen to\nrecords and quit our jobs\n\n— gach White —\n\nAZ QUOTES\n\n \n\n \n\n "
Do you see the problem here? There are a couple of spaces between some of your newline characters, and the number of consecutive newline characters is uneven, so replacing \n\n with \n won't behave as you'd like it to.
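As an aside, the %q verb of fmt gives a similar view without going through JSON. A minimal sketch (visible is a hypothetical helper name, and the sample string is made up):

```go
package main

import "fmt"

// visible (hypothetical helper) returns s as a Go-quoted literal, which
// makes newlines and other escapes explicit and spaces easy to spot.
func visible(s string) string {
	return fmt.Sprintf("%q", s)
}

func main() {
	fmt.Println(visible("\n \n\nrecords\n")) // prints "\n \n\nrecords\n"
}
```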
I see one of your goals is this:
And want to remove ALL empty lines.
(I'm not addressing extracting text from PNG files, as that's a separate question.)
package main
import (
"encoding/json"
"log"
"strings"
)
func main() {
// The string from the original post.
myString := `
Maybe we should all just listen to
records and quit our jobs
— gach White —
AZ QUOTES
`
// Create a resulting string.
result := ""
// Iterate through the lines in this string.
for _, line := range strings.Split(myString, "\n") {
if line = strings.TrimSpace(line); line != "" {
result += line + "\n"
}
}
// Print the result to stdout.
println(result)
// Marshal the result to JSON.
resultJSON, err := json.Marshal(result)
if err != nil {
log.Fatalf("Failed to marshal result to JSON.\nError: %s", err.Error())
}
println(string(resultJSON))
}
stdout:
Maybe we should all just listen to
records and quit our jobs
— gach White —
AZ QUOTES
"Maybe we should all just listen to\nrecords and quit our jobs\n— gach White —\nAZ QUOTES\n"
It looks like you have white space in between, e.g.
\n \n
So, doing a regexp replace with regular expression \n[ \t]*\n might be more sensible.
This won't remove single empty lines at the beginning though, for this you would use ^\n* and replace with an empty string.
Refining this a bit further, you can add more white space like \f and consider multiple empty lines at once
\n([ \t\f]*\n)+
\n a newline
(...)+ followed by one or more
[ \t\f]*\n empty lines
This clears all empty lines in between, but may keep white space at the beginning or the end of the string. As suggested in other answers, adding a strings.TrimSpace() takes care of this.
Putting everything together gives
https://play.golang.org/p/E07ZkE2nlcp
package main
import (
"fmt"
"regexp"
"strings"
)
func main() {
str := `
Maybe we should all just listen to
records and quit our jobs
— gach White —
AZ QUOTES
`
re := regexp.MustCompile(`\n([ \t\f]*\n)+`)
str = re.ReplaceAllString(str, "\n")
str = strings.TrimSpace(str)
fmt.Println("---")
fmt.Println(str)
fmt.Println("---")
}
which finally shows
---
Maybe we should all just listen to
records and quit our jobs
— gach White —
AZ QUOTES
---
I'm reading a file line by line and would like to split each line on a substring. But when I pass the line I read to SplitAfterN, I get the error below:
cannot convert 'variable' (type []string) to type string
where 'variable' is of type []string
package main
import (
"bufio"
"flag"
"fmt"
"log"
"os"
"strings"
)
func main() {
var fLine []string
FileName := flag.String("fpath", "Default file path", "File path description ")
flag.Parse()
fptr, err := os.Open(*FileName)
if err != nil {
log.Fatal(err)
}
FileScanner := bufio.NewScanner(fptr)
for FileScanner.Scan() {
// Append each line into one buffer while reading
fLine = append(fLine, FileScanner.Text())
splitline := strings.SplitAfterN(fLine, "12345", 2)
fmt.Println("Splited string = ", splitline[1])
}
}
I'm expecting the line below to split the passed argument (fLine):
splitline := strings.SplitAfterN(fLine, "12345", 2)
fLine is not the line you just read; it is a slice of all the lines read so far. The most recent line is the one returned by FileScanner.Text(). If you want to split that line, either store it in a variable or use the last element of the slice.
If you choose to store it in a variable:
line := FileScanner.Text()
fLine = append(fLine, line)
splitline := strings.SplitAfterN(line, "12345", 2)
If you just want to use the last slice element:
fLine = append(fLine, FileScanner.Text())
splitline := strings.SplitAfterN(fLine[len(fLine)-1], "12345", 2)
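Either way, indexing splitline[1] panics for a line that does not contain the separator, because SplitAfterN then returns a single-element slice. A hedged sketch of a guard (afterSep is a hypothetical helper name, not part of the question's code):

```go
package main

import (
	"fmt"
	"strings"
)

// afterSep (hypothetical helper) returns the text following the first
// occurrence of sep in line, and false when sep does not occur at all.
func afterSep(line, sep string) (string, bool) {
	parts := strings.SplitAfterN(line, sep, 2)
	if len(parts) < 2 {
		return "", false
	}
	return parts[1], true
}

func main() {
	for _, line := range []string{"abc12345def", "no separator here"} {
		if rest, ok := afterSep(line, "12345"); ok {
			fmt.Println("Split string =", rest)
		} else {
			fmt.Println("separator not found in:", line)
		}
	}
}
```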
So you just want to convert the slice to a string, right?
strings.Join should do what you need:
splitline := strings.SplitAfterN(strings.Join(fLine," "), "12345", 2)
I currently have a script that runs an OS command which returns a great deal of data. At the end of the data it gives a total, in the form:
N Total.
N can be any number from 0 upward.
I want to perform this command, and take N then put it into a value. I have the command running and I'm storing it in a bytes.Buffer, however I'm unsure how to scrape this so that I only get the number. The "N Total." string is always at the end of the output. Any help would be appreciated as I've seen various different methods but they all seem quite convoluted.
You can use a bufio.Scanner to read the command's output line-wise. Then just remember the last line and parse it once the command has finished.
package main
import (
"bufio"
"fmt"
"io"
"os/exec"
"strings"
)
func main() {
r, w := io.Pipe()
cmd := exec.Command("fortune")
cmd.Stdout = w
go func() {
cmd.Run()
// Closing the write end delivers EOF to the scanner below.
w.Close()
}()
sc := bufio.NewScanner(r)
var lastLine string
for sc.Scan() {
line := sc.Text()
fmt.Println("debug:", line)
if strings.TrimSpace(line) != "" {
lastLine = line
}
}
fmt.Println(lastLine)
}
Sample output:
debug: "Get back to your stations!"
debug: "We're beaming down to the planet, sir."
debug: -- Kirk and Mr. Leslie, "This Side of Paradise",
debug: stardate 3417.3
stardate 3417.3
Parsing lastLine is left as an exercise for the reader.
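One possible way to do that parsing, assuming the final line has the form "N Total." described in the question. parseTotal is a hypothetical helper name, and the sample lines are made up.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseTotal (hypothetical helper) extracts N from a summary line of the
// form "N Total." and reports an error for anything that looks different.
func parseTotal(line string) (int, error) {
	fields := strings.Fields(line)
	if len(fields) != 2 || !strings.HasPrefix(fields[1], "Total") {
		return 0, fmt.Errorf("unexpected summary line %q", line)
	}
	return strconv.Atoi(fields[0])
}

func main() {
	n, err := parseTotal("1001 Total.")
	fmt.Println(n, err)

	_, err = parseTotal("stardate 3417.3")
	fmt.Println(err)
}
```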
You can split the string by \n and get the last line.
package main
import (
"fmt"
"strconv"
"strings"
)
func main() {
output := `
Some os output
Some more os output
Again some os output
1001 Total`
// If you're getting the string from the bytes.Buffer do this:
// output := myBytesBuffer.String()
outputSplit := strings.Split(output, "\n") // Break into lines
// Get last line from the end.
// The -1 assumes the number is in the last line. Change it if it's not.
lastLine := outputSplit[len(outputSplit)-1]
lastLine = strings.Replace(lastLine, " Total", "", -1) // Remove text
number, _ := strconv.Atoi(lastLine) // Convert from text to number
fmt.Println(number)
}
peterSO points out that for big output the above may be slow.
Here's another way that uses a compiled regular expression to match against a small subset of bytes.
package main
import (
"bytes"
"fmt"
"os/exec"
"regexp"
"strconv"
)
func main() {
// Create regular expression. You only create this once.
// Would be regexpNumber := regexp.MustCompile(`(\d+) Total`) for you
regexpNumber := regexp.MustCompile(`(\d+) bits physical`)
// Whatever your os command is
command := exec.Command("cat", "/proc/cpuinfo")
output, _ := command.Output()
// Your bytes.Buffer
var b bytes.Buffer
b.Write(output)
// Get end of bytes slice
var end []byte
if b.Len()-200 > 0 {
end = b.Bytes()[b.Len()-200:]
} else {
end = b.Bytes()
}
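// Get matches. matches[1] contains your number
matches := regexpNumber.FindSubmatch(end)
if matches == nil {
// Nothing matched in the tail of the output; avoid indexing a nil slice.
fmt.Println("no match found")
return
}
// Convert bytes to int
number, _ := strconv.Atoi(string(matches[1])) // Convert from text to number
fmt.Println(number)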
}
I want to create the string \"str\", but with the value of a variable in place of str.
For example:
x := "name"
q := fmt.Sprintf("\"%s\"", x)
I want q = "\"name\""
I tried this
Use escape sequences preceded by \ to show literal special characters in a formatted string: \\ for \ and \" for ".
package main
import (
"fmt"
)
func main() {
x := "hello"
q := fmt.Sprintf("\\\"%s\\\"", x)
fmt.Println(q)
}
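Two other spellings may be easier to read, depending on which result is wanted. This is a sketch with hypothetical helper names: quoted wraps the value in plain double quotes using the %q verb, and backslashQuoted uses a raw string literal so the backslashes need no escaping in the source.

```go
package main

import "fmt"

// quoted (hypothetical helper) returns x wrapped in double quotes, escaping
// any special characters inside x the way a Go string literal would.
func quoted(x string) string {
	return fmt.Sprintf("%q", x)
}

// backslashQuoted (hypothetical helper) returns x surrounded by a literal
// backslash followed by a double quote on each side.
func backslashQuoted(x string) string {
	return fmt.Sprintf(`\"%s\"`, x)
}

func main() {
	x := "name"
	fmt.Println(quoted(x))          // prints "name"
	fmt.Println(backslashQuoted(x)) // prints \"name\"
}
```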
A more functional, flexible solution, depending on your taste:
x := "hello"
p := []byte{'"', '\\', '"', '"'}
q := append(append(p, []byte(x)...), p...)
fmt.Printf("%s", q)
https://play.golang.org/p/MHOsdefZYW
The file names.txt consists of many names in the form of:
"KELLEE","JOSLYN","JASON","INGER","INDIRA","GLINDA","GLENNIS"
Does anyone know how to split the string so that it is individual names separated by commas?
KELLEE,JOSLYN,JASON,INGER,INDIRA,GLINDA,GLENNIS
The following code splits by comma and leaves the quotes around each name. What is the escape character needed to split out the "? Can it be done in one Split statement, splitting on "," and leaving a comma as the separator?
package main
import "fmt"
import "io/ioutil"
import "strings"
func main() {
fData, err := ioutil.ReadFile("names.txt") // read in the external file
if err != nil {
fmt.Println("Err is ", err) // print any error
}
strbuffer := string(fData) // convert read in file to a string
arr := strings.Split(strbuffer, ",")
fmt.Println(arr)
}
By the way, this is part of Project Euler problem # 22. http://projecteuler.net/problem=22
Jeremy's answer is basically correct and does exactly what you have asked for. But the format of your "names.txt" file is actually a well-known one called CSV (comma-separated values). Luckily, Go comes with an encoding/csv package (part of the standard library) for decoding and encoding such formats easily. Compared with your and Jeremy's solution, this package also gives exact error messages if the format is invalid, supports multi-line records, and properly unquotes quoted strings.
The basic usage looks like this:
package main
import (
"encoding/csv"
"fmt"
"io"
"os"
)
func main() {
file, err := os.Open("names.txt")
if err != nil {
fmt.Println("Error:", err)
return
}
defer file.Close()
reader := csv.NewReader(file)
for {
record, err := reader.Read()
if err == io.EOF {
break
} else if err != nil {
fmt.Println("Error:", err)
return
}
fmt.Println(record) // record has the type []string
}
}
There is also a ReadAll method that might make your program even shorter, assuming that the whole file fits into memory.
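A sketch of the ReadAll variant, reading from an inline sample string instead of names.txt so that it runs on its own. parseNames is a hypothetical helper name, and setting FieldsPerRecord to -1 is an extra assumption to tolerate lines with differing numbers of names.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// parseNames (hypothetical helper) decodes CSV text with ReadAll and
// flattens every record into a single slice of names.
func parseNames(data string) ([]string, error) {
	r := csv.NewReader(strings.NewReader(data))
	r.FieldsPerRecord = -1 // allow lines with differing numbers of names
	records, err := r.ReadAll()
	if err != nil {
		return nil, err
	}
	var names []string
	for _, rec := range records {
		names = append(names, rec...)
	}
	return names, nil
}

func main() {
	// Inline sample standing in for the contents of names.txt.
	names, err := parseNames(`"KELLEE","JOSLYN","JASON"`)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println(strings.Join(names, ",")) // prints KELLEE,JOSLYN,JASON
}
```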
Update: dystroy has just pointed out that your file has only one line anyway. The CSV reader works well for that too, but the following, less general solution should also be sufficient:
var name string
for {
if n, _ := fmt.Fscanf(file, "%q,", &name); n != 1 {
break
}
fmt.Println("name:", name)
}
Split doesn't remove characters from the substrings. Your split is fine; you just need to process the slice afterwards with strings.Trim(val, "\"").
for i, val := range arr {
arr[i] = strings.Trim(val, "\"")
}
Now arr will have the leading and trailing "s removed.
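The question also asks whether a single Split can do it. One hedged possibility is to trim the outermost quotes first and then split on the three-character separator `","`. splitNames is a hypothetical helper name, and the approach assumes every name is quoted and none contains a quote or comma itself.

```go
package main

import (
	"fmt"
	"strings"
)

// splitNames (hypothetical helper) strips surrounding whitespace and the
// outermost quotes, then splits on the quote-comma-quote separator.
// Assumption: every name is quoted and contains no quote or comma.
func splitNames(s string) []string {
	s = strings.TrimSpace(s)
	s = strings.Trim(s, `"`)
	return strings.Split(s, `","`)
}

func main() {
	arr := splitNames(`"KELLEE","JOSLYN","JASON"`)
	fmt.Println(strings.Join(arr, ",")) // prints KELLEE,JOSLYN,JASON
}
```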