Truncate every member of a slice using Go - string

I just started with Go, and I'm having a little trouble accomplishing what I want to do. I am loading a large text file in which each line begins with a word I want in my array, followed by text, delimited by single or multiple spaces, that I do not care about.
My first line of code creates an array of lines:
lines := strings.Split( string( file ), "\n" )
The next step is to truncate each line, which I can do with a split statement. I'm sure I could do this with a for loop, but I'm trying to learn some of the more efficient operations in Go (compared to C/C++).
I was hoping I could do something like this:
lines := strings.Split( (lines...), " " )
Is there a better way to do this or should I just use some type of for loop?

Use bufio.NewScanner to read line by line, take word := strings.Fields(scanner.Text()), and append word[0] to the slice, as in this working sample code:
package main
import (
"bufio"
"fmt"
"strings"
)
func main() {
s := ` wanted1 not wanted
wanted2 not wanted
wanted3 not wanted
`
slice := []string{}
// scanner := bufio.NewScanner(os.Stdin)
scanner := bufio.NewScanner(strings.NewReader(s))
for scanner.Scan() {
word := strings.Fields(scanner.Text())
if len(word) > 0 {
slice = append(slice, word[0])
}
}
fmt.Println(slice)
}
Alternatively, use strings.Fields(line) on lines you have already split, as in this working sample code:
package main
import "fmt"
import "strings"
func main() {
s := `
wanted1 not wanted
wanted2 not wanted
wanted3 not wanted
`
lines := strings.Split(s, "\n")
slice := make([]string, 0, len(lines))
for _, line := range lines {
words := strings.Fields(line)
if len(words) > 0 {
slice = append(slice, words[0])
}
}
fmt.Println(slice)
}
output:
[wanted1 wanted2 wanted3]

Related

Unable to correctly compare two strings in go

Hi, I am trying to find the number of times a digit appears in a number using the code below, but the value of j is always 0 even if the digit appears many times in the number. I would like to know why the current comparison does not work. Is it possible to do this without converting the input to an integer?
package main
import "fmt"
import "bufio"
import "os"
func main (){
reader := bufio.NewReader(os.Stdin)
c,_ := reader.ReadString('\n')
d,_ := reader.ReadString('\n')
j := 0
for _,i := range(c){
if string(i) == d{
fmt.Printf("inside if")
j = j+1
}
}
fmt.Println(j)
}
func (b *Reader) ReadString(delim byte) (string, error)
ReadString reads until the first occurrence of delim in the input, returning a string containing the data up to and including the delimiter.
So if you enter 3 for d, then d == "3\n".
You probably just need to do:
d,_ := reader.ReadString('\n')
d = d[:len(d)-1]

Removing first and last empty lines from a string

I've the below text:
str := `
Maybe we should all just listen to
records and quit our jobs
— gach White —
AZ QUOTES
`
And want to remove ALL empty lines.
I was able to remove the empty lines in the paragraphs as:
str = strings.Replace(str, "\n\n", "\n", -1)
fmt.Println(str)
And ended up with:
Maybe we should all just listen to
records and quit our jobs
— gach White —
AZ QUOTES
So I still have a couple of empty lines at the beginning and a few empty lines at the end. How can I get rid of them?
In my app I'm trying to extract the text from all "png" files in the same directory and print it in a pretty format. My full code so far is:
package main
import (
"fmt"
"io/ioutil"
"os"
"os/exec"
"path/filepath"
"strings"
_ "image/png"
)
func main() {
var files []string
root := "."
err := filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
if filepath.Ext(path) == ".png" {
path = strings.TrimSuffix(path, filepath.Ext(path))
files = append(files, path)
}
return nil
})
if err != nil {
panic(err)
}
for _, file := range files {
fmt.Println(file)
err = exec.Command(`tesseract`, file+".png", file).Run()
if err != nil {
fmt.Printf("Error: %s\n", err)
} else {
b, err := ioutil.ReadFile(file + ".txt") // just pass the file name
if err != nil {
fmt.Print(err)
} else {
str := string(b) // convert content to a 'string'
str = strings.Replace(str, "\n\n", "\n", -1)
fmt.Println(str) // print the content as a 'string'
}
}
}
}
Split the string on \n, skip the elements that contain only whitespace, and then concatenate the rest with \n:
func trimEmptyNewLines(str string) string{
strs := strings.Split(str, "\n")
str = ""
for _, s := range strs {
if len(strings.TrimSpace(s)) == 0 {
continue
}
str += s+"\n"
}
str = strings.TrimSuffix(str, "\n")
return str
}
You can use strings.TrimSpace to remove all leading and trailing whitespace:
str = strings.TrimSpace(str)
A little different answer.
package main
import (
"fmt"
)
func main() {
str := `
Maybe we should all just listen to
records and quit our jobs
— gach White —
AZ QUOTES
`
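// -1 means "no non-whitespace byte seen yet"; using 0 as the sentinel
// would misbehave when the text starts at index 0.
first := -1
last := -1
for i, j := range []byte(str) {
if j != '\n' && j != ' ' {
if first == -1 {
first = i
}
last = i
}
}
if first == -1 {
str = "" // the string contained only newlines and spaces
} else {
str = str[first : last+1]
}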
fmt.Print(str)
}
I copied your string and turned it into JSON:
package main
import (
"encoding/json"
"log"
)
func main() {
// The string from the original post.
myString := `
Maybe we should all just listen to
records and quit our jobs
— gach White —
AZ QUOTES
`
// Marshal to json.
data, err := json.Marshal(myString)
if err != nil {
log.Fatalf("Failed to marshal string to JSON.\nError: %s", err.Error())
}
// Print the JSON-encoded string (the builtin println writes to standard error).
println(string(data))
}
It'll probably be easier to see the whitespace in JSON.
"\n \n\nMaybe we should all just listen to\nrecords and quit our jobs\n\n— gach White —\n\nAZ QUOTES\n\n \n\n \n\n "
Do you see the problem here? There are a couple of spaces in between your newline characters. Additionally, you have uneven numbers of consecutive newline characters, so replacing \n\n with \n won't behave as you'd like it to.
I see one of your goals is this:
And want to remove ALL empty lines.
(I'm not addressing extracting text from PNG files, as that's a separate question.)
package main
import (
"encoding/json"
"log"
"strings"
)
func main() {
// The string from the original post.
myString := `
Maybe we should all just listen to
records and quit our jobs
— gach White —
AZ QUOTES
`
// Create a resulting string.
result := ""
// Iterate through the lines in this string.
for _, line := range strings.Split(myString, "\n") {
if line = strings.TrimSpace(line); line != "" {
result += line + "\n"
}
}
// Print the result (the builtin println writes to standard error).
println(result)
// Marshal the result to JSON.
resultJSON, err := json.Marshal(result)
if err != nil {
log.Fatalf("Failed to marshal result to JSON.\nError: %s", err.Error())
}
println(string(resultJSON))
}
Output:
Maybe we should all just listen to
records and quit our jobs
— gach White —
AZ QUOTES
"Maybe we should all just listen to\nrecords and quit our jobs\n— gach White —\nAZ QUOTES\n"
It looks like you have whitespace in between the newlines, e.g.
\n \n
So a regexp replace with the regular expression \n[ \t]*\n might be more sensible.
This won't remove single empty lines at the beginning, though; for those you would use ^\n* and replace it with an empty string.
Refining this a bit further, you can add more whitespace characters such as \f and handle multiple empty lines at once:
\n([ \t\f]*\n)+
\n a newline
(...)+ followed by one or more
[ \t\f]*\n empty lines
This clears all empty lines in between, but may keep white space at the beginning or the end of the string. As suggested in other answers, adding a strings.TrimSpace() takes care of this.
Putting everything together gives
https://play.golang.org/p/E07ZkE2nlcp
package main
import (
"fmt"
"regexp"
"strings"
)
func main() {
str := `
Maybe we should all just listen to
records and quit our jobs
— gach White —
AZ QUOTES
`
re := regexp.MustCompile(`\n([ \t\f]*\n)+`)
str = re.ReplaceAllString(str, "\n")
str = strings.TrimSpace(str)
fmt.Println("---")
fmt.Println(str)
fmt.Println("---")
}
which finally shows
---
Maybe we should all just listen to
records and quit our jobs
— gach White —
AZ QUOTES
---

Isolating String Output

I currently have a script that runs an OS command that returns a great deal of data. At the end of the data it gives a total, like this:
N Total.
N can be any number from 0 upward.
I want to run this command and store N in a variable. I have the command running and I'm storing its output in a bytes.Buffer, but I'm unsure how to scrape it so that I only get the number. The "N Total." string is always at the end of the output. Any help would be appreciated, as I've seen various methods but they all seem quite convoluted.
You can use a bufio.Scanner to read the command's output line-wise. Then just remember the last line and parse it once the command has finished.
package main
import (
"bufio"
"fmt"
"io"
"os/exec"
"strings"
)
func main() {
r, w := io.Pipe()
cmd := exec.Command("fortune")
cmd.Stdout = w
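go func() {
_ = cmd.Run()
// Closing only the write end lets the scanner see EOF once the
// command has exited; closing the read end would surface an error.
w.Close()
}()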
sc := bufio.NewScanner(r)
var lastLine string
for sc.Scan() {
line := sc.Text()
fmt.Println("debug:", line)
if strings.TrimSpace(line) != "" {
lastLine = line
}
}
fmt.Println(lastLine)
}
Sample output:
debug: "Get back to your stations!"
debug: "We're beaming down to the planet, sir."
debug: -- Kirk and Mr. Leslie, "This Side of Paradise",
debug: stardate 3417.3
stardate 3417.3
Parsing lastLine is left as an exercise for the reader.
You can split the string by \n and get the last line.
package main
import (
"fmt"
"strconv"
"strings"
)
func main() {
output := `
Some os output
Some more os output
Again some os output
1001 Total`
// If you're getting the string from the bytes.Buffer do this:
// output := myBytesBuffer.String()
outputSplit := strings.Split(output, "\n") // Break into lines
// Take the last line.
// Indexing with len-1 assumes the number is on the last line; adjust it if it is not.
lastLine := outputSplit[len(outputSplit)-1]
lastLine = strings.Replace(lastLine, " Total", "", -1) // Remove text
number, _ := strconv.Atoi(lastLine) // Convert from text to number
fmt.Println(number)
}
peterSO points out that for big output the above may be slow.
Here's another way that uses a compiled regular expression to match against a small subset of the bytes.
package main
import (
"bytes"
"fmt"
"os/exec"
"regexp"
"strconv"
)
func main() {
// Create regular expression. You only create this once.
// Would be regexpNumber := regexp.MustCompile(`(\d+) Total`) for you
regexpNumber := regexp.MustCompile(`(\d+) bits physical`)
// Whatever your os command is
command := exec.Command("cat", "/proc/cpuinfo")
output, _ := command.Output()
// Your bytes.Buffer
var b bytes.Buffer
b.Write(output)
// Get end of bytes slice
var end []byte
if b.Len()-200 > 0 {
end = b.Bytes()[b.Len()-200:]
} else {
end = b.Bytes()
}
// Get matches. matches[1] contains your number
matches := regexpNumber.FindSubmatch(end)
if matches == nil {
// No match: indexing matches[1] would panic.
fmt.Println("no match found")
return
}
// Convert the matched bytes to an int
number, _ := strconv.Atoi(string(matches[1]))
fmt.Println(number)
}

Access a string as a character array for using in strings.Join() method: GO language

I am trying to access a string as a character array or as runes and join the characters with some separator. What is the right way to do it?
Here are the two ways I tried, but I get the error below:
cannot use ([]rune)(t)[i] (type rune) as type []string in argument to strings.Join
How is a string represented in Go? Is it like a character array?
package main
import (
"fmt"
"strings"
)
func main() {
var t = "hello"
s := ""
for i, rune := range t {
s += strings.Join(rune, "\n")
}
fmt.Println(s)
}
package main
import (
"fmt"
"strings"
)
func main() {
var t = "hello"
s := ""
for i := 0; i < len(t); i++ {
s += strings.Join([]rune(t)[i], "\n")
}
fmt.Println(s)
}
I also tried the approach below, but it does not work for me either.
var t = "hello"
s := ""
for i := 0; i < len(t); i++ {
s += strings.Join(string(t[i]), "\n")
}
fmt.Println(s)
The strings.Join function expects a slice of strings as its first argument, but you are giving it a single rune.
You can use the strings.Split function with an empty separator to obtain a slice of one-character strings from a string.
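A minimal sketch of that approach, using the "hello" string from the question. joinChars is a hypothetical helper name. With an empty separator, strings.Split splits after each UTF-8 sequence, so multi-byte characters stay intact.

```go
package main

import (
	"fmt"
	"strings"
)

// joinChars is a hypothetical helper: it splits t into its individual
// characters and joins them again with sep between them.
func joinChars(t, sep string) string {
	// With an empty separator, Split returns one string per UTF-8
	// encoded character, which is the []string that Join expects.
	chars := strings.Split(t, "")
	return strings.Join(chars, sep)
}

func main() {
	fmt.Println(joinChars("hello", "\n"))
}
```

On the representation question: a Go string is an immutable sequence of bytes. Indexing with t[i] yields a byte, while ranging over the string yields runes, which is why neither attempt produced the []string that Join needs.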

strings.Split in Go

The file names.txt consists of many names in the form of:
"KELLEE","JOSLYN","JASON","INGER","INDIRA","GLINDA","GLENNIS"
Does anyone know how to split the string so that it is individual names separated by commas?
KELLEE,JOSLYN,JASON,INGER,INDIRA,GLINDA,GLENNIS
The following code splits by comma and leaves quotes around the names. What is the escape character needed to split out the "? Can it be done in one Split statement, splitting on "," and leaving a comma as the separator?
package main
import "fmt"
import "io/ioutil"
import "strings"
func main() {
fData, err := ioutil.ReadFile("names.txt") // read in the external file
if err != nil {
fmt.Println("Err is ", err) // print any error
}
strbuffer := string(fData) // convert read in file to a string
arr := strings.Split(strbuffer, ",")
fmt.Println(arr)
}
By the way, this is part of Project Euler problem # 22. http://projecteuler.net/problem=22
Jeremy's answer is basically correct and does exactly what you have asked for. But the format of your "names.txt" file is actually a well-known one called CSV (comma-separated values). Luckily, Go comes with the encoding/csv package (part of the standard library) for decoding and encoding such formats easily. Compared with your and Jeremy's solution, this package also gives exact error messages if the format is invalid, supports multi-line records, and properly unquotes quoted strings.
The basic usage looks like this:
package main
import (
"encoding/csv"
"fmt"
"io"
"os"
)
func main() {
file, err := os.Open("names.txt")
if err != nil {
fmt.Println("Error:", err)
return
}
defer file.Close()
reader := csv.NewReader(file)
for {
record, err := reader.Read()
if err == io.EOF {
break
} else if err != nil {
fmt.Println("Error:", err)
return
}
fmt.Println(record) // record has the type []string
}
}
There is also a ReadAll method that might make your program even shorter, assuming that the whole file fits into memory.
Update: dystroy has just pointed out that your file has only one line anyway. The CSV reader works well for that too, but the following, less general solution should also be sufficient:
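var name string
for {
// %q reads one double-quoted string; the comma in the format consumes the separator.
if n, _ := fmt.Fscanf(file, "%q,", &name); n != 1 {
break
}
fmt.Println("name:", name)
}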
Split doesn't remove characters from the substrings. Your split is fine; you just need to process the slice afterwards with strings.Trim(val, "\""):
for i, val := range arr {
arr[i] = strings.Trim(val, "\"")
}
Now arr will have the leading and trailing "s removed.
