How to test that multiline output matches an expected string in Go

I have the following code which generates some string output:
package formatter
import (
"bytes"
"log"
"text/template"
"github.com/foo/bar/internal/mapper"
)
// map of template functions that enable us to identify the final item within a
// collection being iterated over.
var fns = template.FuncMap{
"plus1": func(x int) int {
return x + 1
},
}
// Dot renders our results in dot format for use with graphviz
func Dot(results []mapper.Page) string {
dotTmpl := `digraph sitemap { {{range .}}
"{{.URL}}"
-> { {{$n := len .Anchors}}{{range $i, $v := .Anchors}}
"{{.}}"{{if eq (plus1 $i) $n}}{{else}},{{end}}{{end}}
} {{end}}
}`
tmpl, err := template.New("digraph").Funcs(fns).Parse(dotTmpl)
if err != nil {
log.Fatal(err)
}
var output bytes.Buffer
if err := tmpl.Execute(&output, results); err != nil {
log.Fatal(err)
}
return output.String()
}
It generates output like:
digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}
Below is a test for this functionality...
package formatter
import (
"testing"
"github.com/foo/bar/internal/mapper"
)
func TestDot(t *testing.T) {
input := []mapper.Page{
mapper.Page{
URL: "http://www.example.com/",
Anchors: []string{
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz",
},
Links: []string{
"http://www.example.com/foo.css",
"http://www.example.com/bar.css",
"http://www.example.com/baz.css",
},
Scripts: []string{
"http://www.example.com/foo.js",
"http://www.example.com/bar.js",
"http://www.example.com/baz.js",
},
},
}
output := `digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}`
actual := Dot(input)
if actual != output {
t.Errorf("expected: %s\ngot: %s", output, actual)
}
}
Which fails with the following error (which is related to the outputted format spacing)...
--- FAIL: TestDot (0.00s)
format_test.go:43: expected: digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}
got: digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}
I've tried tweaking my test output variable so the spacing would align with what's actually outputted from the real code. That didn't work.
I also tried using strings.Replace() on both my output variable and the actual output and, bizarrely, the output from my function (even though it was passed through strings.Replace) would still be multi-line, so the test would still fail.
Anyone have any ideas how I can make the output consistent for the sake of code verification?
Thanks.
UPDATE
I tried the approach suggested by @icza and the test still fails, although the output in the test now looks more like what is expected:
=== RUN TestDot
--- FAIL: TestDot (0.00s)
format_test.go:65: expected: digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}
got: digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}

If you want to ignore format, you can use strings.Fields.
output := strings.Fields(`digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}`)
actual := strings.Fields(Dot(input))
if !equal(output,actual) {
// ...
}
where equal is a simple function that compares two slices.
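As a minimal sketch of that idea, the equal helper could look like the following. The name sameIgnoringWhitespace and the sample strings are assumptions made for illustration, not part of the original answer. Note that strings.Fields also collapses whitespace inside quoted values, so this comparison is deliberately loose.

```go
package main

import (
	"fmt"
	"strings"
)

// equal reports whether a and b hold the same strings in the same order.
func equal(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// sameIgnoringWhitespace (hypothetical helper) splits both strings on runs of
// whitespace and compares the resulting tokens.
func sameIgnoringWhitespace(want, got string) bool {
	return equal(strings.Fields(want), strings.Fields(got))
}

func main() {
	want := "digraph sitemap {\n\"a\"\n-> {\n\"b\"\n}\n}"
	got := "digraph sitemap { \n\t\"a\"\n\t-> { \n\t\t\"b\"\n\t} \n}"
	fmt.Println(sameIgnoringWhitespace(want, got)) // prints true
}
```

In a test, the failure message can still print the raw strings with %q so that invisible whitespace differences are easy to spot.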

The simplest solution is to use the same indentation in the test's expected output as you use in the template.
You have:
output := `digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}`
Change it to:
output := `digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}`
Note that, for example, the final line is not indented. When you use a raw string literal, every character, including indentation, is part of the literal as-is.
Steps to create a correct, un-indented raw string literal
In the end, this is not a coding issue at all, but rather a matter of editor auto-formatting and of how the raw string literal is defined. An easy way to get it right is to first write an empty raw string literal, add an empty line to it, and clear the auto-indentation inserted by the editor:
output := `
`
When you have this, copy-paste the correct input before the closing backtick, e.g.:
output := `
digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}`
As a last step, remove the line break from the first line of the raw string literal, and you have the correct raw string literal:
output := `digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}`
Once you have this, running gofmt or an editor's auto-formatting will not mess with it anymore.
UPDATE:
I checked your updated test result. In the output you get, there is a space after the first line (digraph sitemap {) and another after the 3rd line (-> {), but your expected output does not contain them. Either add those spaces to your expected output, or remove them from the template. Strings are compared byte-wise, so every character (including whitespace) matters.
To remove those extra spaces from the template:
dotTmpl := `digraph sitemap { {{- range .}}
"{{.URL}}"
-> { {{- $n := len .Anchors}}{{range $i, $v := .Anchors}}
"{{.}}"{{if eq (plus1 $i) $n}}{{else}},{{end}}{{end}}
} {{end}}
}`
Note the use of {{-. This trims whitespace around template actions; it was added in Go 1.6.
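To see the effect of the trim markers in isolation, here is a small sketch. The render helper and the toy templates are assumptions made for illustration; only the {{- and -}} syntax comes from text/template itself.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// render (hypothetical helper) parses src as a template, executes it against
// data, and returns the generated text.
func render(src string, data interface{}) (string, error) {
	tmpl, err := template.New("demo").Parse(src)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	items := []string{"a", "b"}
	// Without trim markers, the spaces next to the actions end up in the output.
	plain, err := render("list { {{range .}}[{{.}}]{{end}} }", items)
	if err != nil {
		fmt.Println(err)
		return
	}
	// "{{- " trims whitespace before the action, " -}}" trims whitespace after it.
	trimmed, err := render("list { {{- range .}}[{{.}}]{{end -}} }", items)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%q\n", plain)   // prints "list { [a][b] }"
	fmt.Printf("%q\n", trimmed) // prints "list {[a][b]}"
}
```

Printing with %q is a convenient way to make trailing spaces visible when a byte-wise comparison fails.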

The problem is the extra space right after each { in your template. Simply deleting the space would put three braces in a row ({{{), which text/template cannot parse, so emit the literal brace from an action instead. You can fix it by changing your template string to this:
`digraph sitemap {{"{"}}{{range .}}
"{{.URL}}"
-> {{"{"}}{{$n := len .Anchors}}{{range $i, $v := .Anchors}}
"{{.}}"{{if eq (plus1 $i) $n}}{{else}},{{end}}{{end}}
}{{end}}
}`

Related

How to create map variable from string literal in Golang?

I have a prepared map structure as a string literal. Note that it is not JSON: the last element of each block is followed by a trailing comma.
dict := `{
"ru": {
"test_key": "Тестовый ключ",
"some_err": "Произошла ошибка",
},
"en": {
"test_key": "Test key",
"some_err": "Error occurs",
},
}`
I want to transform this string into a real value of map type (map[string]map[string]string). I need it for tests. Is it possible?
If this is just for testing, I would remove the "unneeded" commas from the source string and use JSON unmarshaling.
To remove the unneeded commas: I'd use the regexp ,\s*}, and replace it with a single }.
For example:
dict = regexp.MustCompile(`,\s*}`).ReplaceAllLiteralString(dict, "}")
var m map[string]map[string]string
if err := json.Unmarshal([]byte(dict), &m); err != nil {
panic(err)
}
fmt.Println(m)
Output (try it on the Go Playground):
map[en:map[some_err:Error occurs test_key:Test key] ru:map[some_err:Произошла ошибка test_key:Тестовый ключ]]
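For use in tests, the same two steps can be wrapped in a helper. This is only a sketch: parseDict and trailingComma are names invented here, and the approach assumes that no string value in the input itself contains a comma followed by a closing brace, since the regexp would rewrite that too.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// trailingComma matches a comma that is followed, after optional whitespace,
// by a closing brace.
var trailingComma = regexp.MustCompile(`,\s*}`)

// parseDict (hypothetical helper) strips trailing commas and then decodes the
// result as JSON into a nested map.
func parseDict(s string) (map[string]map[string]string, error) {
	cleaned := trailingComma.ReplaceAllLiteralString(s, "}")
	var m map[string]map[string]string
	if err := json.Unmarshal([]byte(cleaned), &m); err != nil {
		return nil, err
	}
	return m, nil
}

func main() {
	dict := `{
	"en": {
		"test_key": "Test key",
		"some_err": "Error occurs",
	},
}`
	m, err := parseDict(dict)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(m["en"]["test_key"]) // prints Test key
}
```

Returning the error instead of panicking lets a test report malformed fixtures with t.Fatal.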

Matching text to images using fuzzy search

I am using this package: https://github.com/blevesearch/bleve to create a mapping of products2images.
It works fine when I use single terms, but not at all when I use an entire phrase. For instance, if I use this:
query := bleve.NewFuzzyQuery("lacteo")
it maps to the right image. However, if I do this:
query := bleve.NewFuzzyQuery("lacteo leche yogurt cebolla")
it does not match anything at all.
What am I doing wrong here?
Set up the DB:
package main
import (
	"github.com/blevesearch/bleve"
)
func main() {
	message := []struct {
		Id   string
		Body string
	}{
		{
			Id:   "lacteos.jpg",
			Body: "lacteo leche yogurt cebolla",
		},
		{
			Id:   "cafe.jpg",
			Body: "café yerba té",
		},
		{
			Id:   "queso.jpg",
			Body: "lacteo leche yogurt cebolla queso",
		},
		{
			Id:   "harina.jpg",
			Body: "harina",
		},
	}
	mapping := bleve.NewIndexMapping()
	index, err := bleve.New("example.bleve", mapping)
	if err != nil {
		panic(err)
	}
	// index each document under its image file name
	for _, m := range message {
		if err := index.Index(m.Id, m); err != nil {
			panic(err)
		}
	}
}
Search for something:
package main
import (
	"fmt"
	"log"
	"github.com/blevesearch/bleve"
)
func main() {
	index, err := bleve.Open("example.bleve")
	if err != nil {
		log.Fatal(err)
	}
	query := bleve.NewFuzzyQuery("lacteo leche yogurt cebolla queso")
	query.SetFuzziness(2)
	searchRequest := bleve.NewSearchRequest(query)
	searchResult, err := index.Search(searchRequest)
	if err != nil {
		log.Fatal(err.Error())
	}
	// print the id and score of every hit
	for _, v := range searchResult.Hits {
		fmt.Println(v.ID)
		fmt.Println(v.Score)
		fmt.Println("-------------")
	}
}
After posting an issue on their repo (https://github.com/blevesearch/bleve/issues/1565), I found out that this is not actually supported. I ended up adding a little more logic on my side to make it work.

Remove all characters after a delimiter in a string

I am building a web crawler application in golang.
After downloading the HTML of a page, I separate out the URLs.
I am presented with URLs that have "#s" in them, such as "en.wikipedia.org/wiki/Race_condition#Computing". I would like to get rid of all characters following the "#", since these lead to the same page anyway. Any advice on how to do so?
Use the url package:
u, _ := url.Parse("SOME_URL_HERE")
u.Fragment = ""
return u.String()
An improvement on the answer by Luke Joshua Park is to parse the URL relative to the URL of the source page. This creates an absolute URL from what might be a relative URL on the page (scheme not specified, host not specified, relative path). Another improvement is to check and handle errors.
func clean(pageURL, linkURL string) (string, error) {
	p, err := url.Parse(pageURL)
	if err != nil {
		return "", err
	}
	l, err := p.Parse(linkURL) // resolve linkURL relative to the page
	if err != nil {
		return "", err
	}
	l.Fragment = "" // chop off the fragment
	return l.String(), nil
}
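A runnable sketch of this approach, with a usage example, might look as follows. The page and link URLs are made-up sample inputs; the function body follows the answer above.

```go
package main

import (
	"fmt"
	"net/url"
)

// clean resolves linkURL against pageURL and drops the fragment, returning an
// absolute URL.
func clean(pageURL, linkURL string) (string, error) {
	p, err := url.Parse(pageURL)
	if err != nil {
		return "", err
	}
	l, err := p.Parse(linkURL) // resolve relative to the page
	if err != nil {
		return "", err
	}
	l.Fragment = "" // chop off the fragment
	return l.String(), nil
}

func main() {
	// Sample inputs, assumed for illustration.
	page := "https://en.wikipedia.org/wiki/Main_Page"
	link := "/wiki/Race_condition#Computing"
	u, err := clean(page, link)
	if err != nil {
		fmt.Println("bad URL:", err)
		return
	}
	fmt.Println(u) // prints https://en.wikipedia.org/wiki/Race_condition
}
```

Because every link is normalized to an absolute URL, the crawler can use the result directly as a key when tracking visited pages.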
If you are not interested in getting an absolute URL, then chop off everything after the #. This works because the only valid use of # in a URL is the fragment separator.
func clean(linkURL string) string {
	i := strings.LastIndexByte(linkURL, '#')
	if i < 0 {
		return linkURL
	}
	return linkURL[:i]
}

Insert an mgo query []bson.M result into a file.txt as a string

I need to write the result of an mgo (MongoDB) query, converted in Go, to a file in order to get the ids of images.
var path = "/home/Medo/text.txt"
pipe := cc.Pipe([]bson.M{
	{"$unwind": "$images"},
	{"$group": bson.M{"_id": "null", "images": bson.M{"$push": "$images"}}},
	{"$project": bson.M{"_id": 0}}})
response := []bson.M{}
errResponse := pipe.All(&response)
if errResponse != nil {
	fmt.Println("error Response: ", errResponse)
}
fmt.Println(response) // print to make sure the query works
data, err22 := bson.Marshal(&response)
if err22 != nil {
	fmt.Println("error insertion ", err22)
}
s := string(data)
Here is the part where I have to create a file and write to it.
The problem is that when I write the query result to the text file, I get an enumeration value at the end of each value, for example:
id of images
23456678`0`
24578689`1`
23678654`2`
12890762`3`
76543890`4`
64744848`5`
So each value is followed by a sequential number, and I can't figure out why. After getting the response from the query, I converted the BSON to []byte and then to a string, but I keep getting those enumerated values at the end of each result.
I'd like to drop those 0 1 2 3 4 5.
var _, errExistFile = os.Stat(path)
if os.IsNotExist(errExistFile) {
	var file, errCreateFile = os.Create(path)
	if isError(errCreateFile) {
		return
	}
	defer file.Close()
}
fmt.Println("==> done creating file", path)
var file, errii = os.OpenFile(path, os.O_RDWR, 0644)
if isError(errii) {
	return
}
defer file.Close()
// write the marshaled result to the file
_, erri := file.WriteString(s)
if isError(erri) {
	return
}
// flush the written data to disk
erri = file.Sync()
if isError(erri) {
	return
}
fmt.Println("==> done writing to file")
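The trailing digits are most likely array index keys: BSON encodes an array as a document keyed "0", "1", and so on, so converting the marshaled bytes to a string exposes them. Instead of marshaling, you could declare a simple struct, e.g.:
type simple struct {
	ID    idtype `bson:"_id"`
	Image int    `bson:"images"`
}
The code to write the image ids to the file would be:
// open file stuff…
result := simple{}
iter := collection.Find(nil).Iter()
for iter.Next(&result) {
	// write one image id per line
	fmt.Fprintf(file, "%d\n", result.Image)
}
if err := iter.Close(); err != nil {
	fmt.Println("error iterating: ", err)
}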

How to check file existence by its base name (without extension)?

The question is quite self-explanatory.
Could anybody show me a short and efficient way to check whether a file exists by its name (without extension)? It would be great if the code returned several matches when the folder has several files with the same name.
Example:
folder/
file.html
file.md
UPDATE:
It is not obvious from the official documentation how to use the filepath.Match() and filepath.Glob() functions, so here are some examples:
matches, _ := filepath.Glob("./folder/file*") // returns paths of existing files: [folder/file.html folder/file.md]
matchesToPattern, _ := filepath.Match("./folder/file*", "./folder/file.html") // returns true, but it only compares strings and does not check the file system
You need to use the path/filepath package.
The functions to check are: Glob(), Match() and Walk() — pick whatever suits your taste better.
Here is the updated code:
package main
import (
	"fmt"
	"os"
	"path/filepath"
	"regexp"
)
func main() {
	dirname := "." + string(filepath.Separator)
	d, err := os.Open(dirname)
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
	defer d.Close()
	entries, err := d.Readdir(-1) // read every entry in the directory
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
	r := regexp.MustCompile("f([a-z]+)le") // the pattern to match
	for _, fi := range entries {
		// only regular files whose name matches the pattern
		if fi.Mode().IsRegular() && r.MatchString(fi.Name()) {
			fmt.Println(fi.Name(), fi.Size(), "bytes")
		}
	}
}
With this one you can also search for date, size, include subfolders or file properties.
