I'm trying to create a Go function that produces the same result as the Ubuntu Linux "cksum" command, for example:
$ echo 123 > /tmp/foo
$ cksum /tmp/foo
2330645186 4 /tmp/foo
Could someone please provide a Go function that produces the first field of the above result ("2330645186")? Thank you.
(Update)
It turns out cksum doesn't quite implement the standard CRC-32 checksum. To check a standard CRC-32 value (the kind you'd usually find listed as a "CRC32 checksum") you can use the CRC calculator at http://zorc.breitbandkatze.de/; Go's hash/crc32.ChecksumIEEE implementation matches it.
To implement the cksum CRC algorithm (also known as POSIX cksum), I instead wrote a Go version of the C algorithm found on a cksum man page (which uses a lookup table):
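As a quick way to see the difference described above, here is a minimal sketch that computes the standard CRC-32 with hash/crc32. The helper name ieeeCRC is made up for illustration, and the "123456789" input with its check value 0xCBF43926 is the customary CRC-32 test vector rather than something from the original post.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// ieeeCRC is a hypothetical helper that returns the standard CRC-32 (IEEE)
// of data. This is NOT the value printed by the cksum command.
func ieeeCRC(data []byte) uint32 {
	return crc32.ChecksumIEEE(data)
}

func main() {
	// The customary CRC-32 test vector; its check value is 0xCBF43926.
	fmt.Printf("%#x\n", ieeeCRC([]byte("123456789")))

	// The same four bytes as /tmp/foo. Compare the printed number with the
	// first field of the cksum output to see that the two algorithms differ.
	fmt.Println(ieeeCRC([]byte("123\n")))
}
```

If the second number does not match what cksum prints for the same file, you are looking at the plain CRC-32 rather than the POSIX variant.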
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
)

var tbl = [256]uint32{0x00000000, 0x04C11DB7, 0x09823B6E, 0x0D4326D9,
0x130476DC, 0x17C56B6B, 0x1A864DB2, 0x1E475005,
0x2608EDB8, 0x22C9F00F, 0x2F8AD6D6, 0x2B4BCB61,
0x350C9B64, 0x31CD86D3, 0x3C8EA00A, 0x384FBDBD,
0x4C11DB70, 0x48D0C6C7, 0x4593E01E, 0x4152FDA9,
0x5F15ADAC, 0x5BD4B01B, 0x569796C2, 0x52568B75,
0x6A1936C8, 0x6ED82B7F, 0x639B0DA6, 0x675A1011,
0x791D4014, 0x7DDC5DA3, 0x709F7B7A, 0x745E66CD,
0x9823B6E0, 0x9CE2AB57, 0x91A18D8E, 0x95609039,
0x8B27C03C, 0x8FE6DD8B, 0x82A5FB52, 0x8664E6E5,
0xBE2B5B58, 0xBAEA46EF, 0xB7A96036, 0xB3687D81,
0xAD2F2D84, 0xA9EE3033, 0xA4AD16EA, 0xA06C0B5D,
0xD4326D90, 0xD0F37027, 0xDDB056FE, 0xD9714B49,
0xC7361B4C, 0xC3F706FB, 0xCEB42022, 0xCA753D95,
0xF23A8028, 0xF6FB9D9F, 0xFBB8BB46, 0xFF79A6F1,
0xE13EF6F4, 0xE5FFEB43, 0xE8BCCD9A, 0xEC7DD02D,
0x34867077, 0x30476DC0, 0x3D044B19, 0x39C556AE,
0x278206AB, 0x23431B1C, 0x2E003DC5, 0x2AC12072,
0x128E9DCF, 0x164F8078, 0x1B0CA6A1, 0x1FCDBB16,
0x018AEB13, 0x054BF6A4, 0x0808D07D, 0x0CC9CDCA,
0x7897AB07, 0x7C56B6B0, 0x71159069, 0x75D48DDE,
0x6B93DDDB, 0x6F52C06C, 0x6211E6B5, 0x66D0FB02,
0x5E9F46BF, 0x5A5E5B08, 0x571D7DD1, 0x53DC6066,
0x4D9B3063, 0x495A2DD4, 0x44190B0D, 0x40D816BA,
0xACA5C697, 0xA864DB20, 0xA527FDF9, 0xA1E6E04E,
0xBFA1B04B, 0xBB60ADFC, 0xB6238B25, 0xB2E29692,
0x8AAD2B2F, 0x8E6C3698, 0x832F1041, 0x87EE0DF6,
0x99A95DF3, 0x9D684044, 0x902B669D, 0x94EA7B2A,
0xE0B41DE7, 0xE4750050, 0xE9362689, 0xEDF73B3E,
0xF3B06B3B, 0xF771768C, 0xFA325055, 0xFEF34DE2,
0xC6BCF05F, 0xC27DEDE8, 0xCF3ECB31, 0xCBFFD686,
0xD5B88683, 0xD1799B34, 0xDC3ABDED, 0xD8FBA05A,
0x690CE0EE, 0x6DCDFD59, 0x608EDB80, 0x644FC637,
0x7A089632, 0x7EC98B85, 0x738AAD5C, 0x774BB0EB,
0x4F040D56, 0x4BC510E1, 0x46863638, 0x42472B8F,
0x5C007B8A, 0x58C1663D, 0x558240E4, 0x51435D53,
0x251D3B9E, 0x21DC2629, 0x2C9F00F0, 0x285E1D47,
0x36194D42, 0x32D850F5, 0x3F9B762C, 0x3B5A6B9B,
0x0315D626, 0x07D4CB91, 0x0A97ED48, 0x0E56F0FF,
0x1011A0FA, 0x14D0BD4D, 0x19939B94, 0x1D528623,
0xF12F560E, 0xF5EE4BB9, 0xF8AD6D60, 0xFC6C70D7,
0xE22B20D2, 0xE6EA3D65, 0xEBA91BBC, 0xEF68060B,
0xD727BBB6, 0xD3E6A601, 0xDEA580D8, 0xDA649D6F,
0xC423CD6A, 0xC0E2D0DD, 0xCDA1F604, 0xC960EBB3,
0xBD3E8D7E, 0xB9FF90C9, 0xB4BCB610, 0xB07DABA7,
0xAE3AFBA2, 0xAAFBE615, 0xA7B8C0CC, 0xA379DD7B,
0x9B3660C6, 0x9FF77D71, 0x92B45BA8, 0x9675461F,
0x8832161A, 0x8CF30BAD, 0x81B02D74, 0x857130C3,
0x5D8A9099, 0x594B8D2E, 0x5408ABF7, 0x50C9B640,
0x4E8EE645, 0x4A4FFBF2, 0x470CDD2B, 0x43CDC09C,
0x7B827D21, 0x7F436096, 0x7200464F, 0x76C15BF8,
0x68860BFD, 0x6C47164A, 0x61043093, 0x65C52D24,
0x119B4BE9, 0x155A565E, 0x18197087, 0x1CD86D30,
0x029F3D35, 0x065E2082, 0x0B1D065B, 0x0FDC1BEC,
0x3793A651, 0x3352BBE6, 0x3E119D3F, 0x3AD08088,
0x2497D08D, 0x2056CD3A, 0x2D15EBE3, 0x29D4F654,
0xC5A92679, 0xC1683BCE, 0xCC2B1D17, 0xC8EA00A0,
0xD6AD50A5, 0xD26C4D12, 0xDF2F6BCB, 0xDBEE767C,
0xE3A1CBC1, 0xE760D676, 0xEA23F0AF, 0xEEE2ED18,
0xF0A5BD1D, 0xF464A0AA, 0xF9278673, 0xFDE69BC4,
0x89B8FD09, 0x8D79E0BE, 0x803AC667, 0x84FBDBD0,
0x9ABC8BD5, 0x9E7D9662, 0x933EB0BB, 0x97FFAD0C,
0xAFB010B1, 0xAB710D06, 0xA6322BDF, 0xA2F33668,
0xBCB4666D, 0xB8757BDA, 0xB5365D03, 0xB1F740B4}

// crc accumulates a POSIX cksum value one byte at a time.
type crc struct {
	p, r  uint32
	Size  int  // number of bytes added so far
	final bool // set once Crc has been called
}

func NewCrc() *crc {
	return &crc{0, 0, 0, false}
}

// Add feeds one byte into the running CRC. It is a no-op after Crc is called.
func (pr *crc) Add(b byte) {
	if pr.final {
		return
	}
	pr.r = (pr.r << 8) ^ tbl[byte(pr.r>>24)^b]
	pr.Size++
}

// Crc appends the length (least significant byte first), complements the
// result and returns it. Later calls return the same value.
func (pr *crc) Crc() uint32 {
	if pr.final {
		return pr.r
	}
	for m := pr.Size; m > 0; {
		b := byte(m & 0xFF)
		m = m >> 8
		pr.r = (pr.r << 8) ^ tbl[byte(pr.r>>24)^b]
	}
	pr.final = true // prevent further modification
	pr.r = ^pr.r
	return pr.r
}

// cksum returns the POSIX checksum and the size in bytes of the named file.
func cksum(filename string) (uint32, int, error) {
	f, err := os.Open(filename)
	if err != nil {
		return 0, 0, err
	}
	defer f.Close()
	in := bufio.NewReader(f)
	pr := NewCrc()
	for done := false; !done; {
		switch b, err := in.ReadByte(); err {
		case io.EOF:
			done = true
		case nil:
			pr.Add(b)
		default:
			return 0, 0, err
		}
	}
	return pr.Crc(), pr.Size, nil
}

func main() {
	var filename = "foo"
	crc, size, err := cksum(filename)
	if err != nil {
		fmt.Println("Error: ", err)
		return
	}
	fmt.Printf("%d %d %s\n", crc, size, filename)
}
Obviously the filename is hardcoded here (to foo), but you could change that with flags. The content of foo is 123\n (**note: on Windows you'll need to convert the line endings, otherwise you end up with a 5-byte file). Results:
linux: $ cksum foo
2330645186 4 foo
linux: $ go run cksum.go
2330645186 4 foo
windows: > go run cksum.go **
2330645186 4 foo
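If you would rather not paste the 256-entry table, it can also be generated. The following is a sketch under the assumption that the table is the usual MSB-first table for the polynomial 0x04C11DB7 (the first entries it produces, 0x00000000, 0x04C11DB7, 0x09823B6E, 0x0D4326D9, agree with the table above). The names posixTable and posixCksum are made up for this example, and it works on a byte slice instead of a file.

```go
package main

import "fmt"

// posixTable builds the 256-entry lookup table for the polynomial
// 0x04C11DB7, processing bits most significant first.
func posixTable() [256]uint32 {
	const poly = 0x04C11DB7
	var t [256]uint32
	for i := range t {
		c := uint32(i) << 24
		for bit := 0; bit < 8; bit++ {
			if c&0x80000000 != 0 {
				c = (c << 1) ^ poly
			} else {
				c <<= 1
			}
		}
		t[i] = c
	}
	return t
}

// posixCksum follows the same steps as the Add and Crc methods above:
// feed every data byte, then the length (least significant byte first),
// then complement the result.
func posixCksum(data []byte) uint32 {
	tbl := posixTable()
	var r uint32
	for _, b := range data {
		r = (r << 8) ^ tbl[byte(r>>24)^b]
	}
	for n := len(data); n > 0; n >>= 8 {
		r = (r << 8) ^ tbl[byte(r>>24)^byte(n)]
	}
	return ^r
}

func main() {
	data := []byte("123\n") // same content as the file foo
	fmt.Println(posixCksum(data), len(data))
}
```

Rebuilding the table on every call keeps the sketch short; in real code you would probably build it once, for example in a package-level variable.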
Actually, I found a simpler answer to my original question, using this package:
https://pkg.go.dev/github.com/cxmcc/unixsums#section-readme
Here is the snippet that computes the POSIX cksum value of a file in Go:
// ioutil.ReadFile still works; os.ReadFile is the non-deprecated equivalent since Go 1.16.
data, err := ioutil.ReadFile("/tmp/test.loop")
if err != nil {
	log.Fatal(err)
}
fmt.Printf("cksum: %d\n", cksum.Cksum(data))
I have the following code which generates some string output:
package formatter
import (
"bytes"
"log"
"text/template"
"github.com/foo/bar/internal/mapper"
)
// map of template functions that enable us to identify the final item within a
// collection being iterated over.
var fns = template.FuncMap{
"plus1": func(x int) int {
return x + 1
},
}
// Dot renders our results in dot format for use with graphviz
func Dot(results []mapper.Page) string {
dotTmpl := `digraph sitemap { {{range .}}
"{{.URL}}"
-> { {{$n := len .Anchors}}{{range $i, $v := .Anchors}}
"{{.}}"{{if eq (plus1 $i) $n}}{{else}},{{end}}{{end}}
} {{end}}
}`
tmpl, err := template.New("digraph").Funcs(fns).Parse(dotTmpl)
if err != nil {
log.Fatal(err)
}
var output bytes.Buffer
if err := tmpl.Execute(&output, results); err != nil {
log.Fatal(err)
}
return output.String()
}
It generates output like:
digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}
Below is a test for this functionality...
package formatter
import (
"testing"
"github.com/foo/bar/internal/mapper"
)
func TestDot(t *testing.T) {
input := []mapper.Page{
mapper.Page{
URL: "http://www.example.com/",
Anchors: []string{
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz",
},
Links: []string{
"http://www.example.com/foo.css",
"http://www.example.com/bar.css",
"http://www.example.com/baz.css",
},
Scripts: []string{
"http://www.example.com/foo.js",
"http://www.example.com/bar.js",
"http://www.example.com/baz.js",
},
},
}
output := `digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}`
actual := Dot(input)
if actual != output {
t.Errorf("expected: %s\ngot: %s", output, actual)
}
}
Which fails with the following error (which is related to the spacing of the output)...
--- FAIL: TestDot (0.00s)
format_test.go:43: expected: digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}
got: digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}
I've tried tweaking my test output variable so the spacing would align with what the real code actually outputs. That didn't work.
I also tried using strings.Replace() on both my output variable and the actual output, and bizarrely the output from my function (even though it was passed through strings.Replace) would still be multi-line, so the test would still fail.
Does anyone have any ideas how I can make the output consistent for the sake of verification?
Thanks.
UPDATE
I tried the approach suggested by @icza and the test still fails, although the output in the test now looks more like what's expected:
=== RUN TestDot
--- FAIL: TestDot (0.00s)
format_test.go:65: expected: digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}
got: digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}
If you want to ignore formatting, you can use strings.Fields.
output := strings.Fields(`digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}`)
actual := strings.Fields(Dot(input))
if !equal(output, actual) {
	// ...
}
where equal is a simple function that compares two slices.
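The equal function is left to the reader above; one possible version is sketched below. The wrapper sameFields is a made-up convenience name, and the two sample strings are shortened stand-ins for the real expected and actual output.

```go
package main

import (
	"fmt"
	"strings"
)

// equal reports whether two string slices have the same length and the
// same elements in the same order.
func equal(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// sameFields (hypothetical helper) compares two strings while ignoring
// every difference in whitespace.
func sameFields(a, b string) bool {
	return equal(strings.Fields(a), strings.Fields(b))
}

func main() {
	want := "digraph sitemap {\n\"a\"\n-> {\n\"b\"\n}\n}"
	got := "digraph sitemap { \n\t\"a\"\n\t-> { \n\t\t\"b\"\n\t} \n}"
	fmt.Println(want == got)           // false: the bytes differ
	fmt.Println(sameFields(want, got)) // true: the tokens are the same
}
```

On Go 1.21 or newer, slices.Equal from the standard library does the same job as equal. Keep in mind that this comparison also ignores whitespace inside quoted strings, so it is a looser check than comparing the raw output.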
The simplest solution is to use the same indentation for the expected output in the test as you use in the template.
You have:
output := `digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}`
Change it to:
output := `digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}`
Note that, for example, the final line is not indented. When you use a raw string literal, every character, including indentation, is part of the literal as-is.
Steps to create a correct, un-indented raw string literal
After all, this is not really a coding issue, but rather an issue of editors' auto-formatting and of how a raw string literal is defined. An easy way to get it right is to first write an empty raw string literal, add an empty line to it, and clear the auto-indentation inserted by the editor:
output := `
`
When you have this, copy-paste the correct input before the closing backtick, e.g.:
output := `
digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}`
And as a last step, remove the line break from the first line of the raw string literal, and you have the correct raw string literal:
output := `digraph sitemap {
"http://www.example.com/"
-> {
"http://www.example.com/foo",
"http://www.example.com/bar",
"http://www.example.com/baz"
}
}`
Once you have this, running gofmt or your editor's auto-formatting will not mess with it anymore.
UPDATE:
I checked your updated test result: in the output you get, there is a space at the end of the first line (after digraph sitemap {) and another at the end of the 3rd line (after -> {), but you don't include those in your expected output. Either add them to your expected output too, or remove those spaces from the template! Strings are compared byte-wise, so every character (including whitespace) matters.
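Trailing spaces like these are hard to spot in a test failure printed with %s. A small sketch of one way to make them visible: print both strings with %q and report the first differing byte. The firstDiff helper is made up for illustration, and the sample strings are shortened versions of the output in the question.

```go
package main

import "fmt"

// firstDiff returns the index of the first byte at which a and b differ,
// or -1 if the strings are identical. If one string is a prefix of the
// other, it returns the length of the shorter one.
func firstDiff(a, b string) int {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		if a[i] != b[i] {
			return i
		}
	}
	if len(a) != len(b) {
		return n
	}
	return -1
}

func main() {
	want := "digraph sitemap {\n\"http://www.example.com/\"\n"
	got := "digraph sitemap { \n\"http://www.example.com/\"\n"

	// %q escapes newlines and tabs, so a stray space before \n stands out.
	fmt.Printf("want: %q\ngot:  %q\n", want, got)
	fmt.Println("first difference at byte", firstDiff(want, got))
}
```

Using %q instead of %s in the t.Errorf call of the test gives the same benefit without any helper.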
To remove those extra spaces from the template:
dotTmpl := `digraph sitemap { {{- range .}}
"{{.URL}}"
-> { {{- $n := len .Anchors}}{{range $i, $v := .Anchors}}
"{{.}}"{{if eq (plus1 $i) $n}}{{else}},{{end}}{{end}}
} {{end}}
}`
Note the use of {{-. It trims the whitespace in front of the template action (and -}} trims the whitespace after it); this syntax was added in Go 1.6.
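To see the effect of the trim marker in isolation, here is a minimal sketch. The render helper and the tiny templates are made up for this example; they are not the template from the question.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"text/template"
)

// render (hypothetical helper) parses text as a template and executes it
// with data, returning the generated string.
func render(text string, data []string) (string, error) {
	tmpl, err := template.New("demo").Parse(text)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Same template twice: the first keeps the space after "{",
	// the second trims it with "{{-".
	plain := "digraph { {{range .}}\n\"{{.}}\"{{end}}\n}"
	trimmed := "digraph { {{- range .}}\n\"{{.}}\"{{end}}\n}"

	for _, text := range []string{plain, trimmed} {
		out, err := render(text, []string{"a", "b"})
		if err != nil {
			log.Fatal(err)
		}
		// %q makes the trailing space before \n visible.
		fmt.Printf("%q\n", out)
	}
}
```

The first line printed contains a space between { and the first \n; in the second line that space is gone, which is the difference the failing test was tripping over.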
The problem is that there is an extra space in your formatted text right after each {. You can fix it by removing that space from your format string. Note that writing {{{range .}} will not parse, because the first two braces are read as the action delimiter, so emit the literal brace from an action instead:
`digraph sitemap {{"{"}}{{range .}}
"{{.URL}}"
-> {{"{"}}{{$n := len .Anchors}}{{range $i, $v := .Anchors}}
"{{.}}"{{if eq (plus1 $i) $n}}{{else}},{{end}}{{end}}
}{{end}}
}`