I want to create an empty file with a size of 1GB without initializing the contents of the file. Is there a way to quickly create a file of a specified size like in golang?
golang codeļ¼
func CreateFileBySize(file string, size int64) error {
f, err := os.Create(file)
if err != nil {
return err
}
defer f.Close()
if err := f.Truncate(size); err != nil {
return err
}
return nil
}
Truncate changes the size of the file. It does not change the I/O offset.
Is there an existing similar method in rust
As far as i know there is no support for sparse files in the Rust standard library (assuming that's what you mean, I have no idea if it is).
According to https://www.systutorials.com/handling-sparse-files-on-linux/ and https://users.rust-lang.org/t/rust-create-sparse-file/57276/2 you can just seek forward and write a single byte, and you'll get a sparse file.
Alternatively, you can use the libc crate and call fallocate(2) and / or pwrite(2) depending on your requirements
I have a thermal printer connected via USB at /dev/usb/lp0. I want to print a label by writing my EPL (Zebra commands) to this file. But the printer doesn't receive all the written bytes. I suspect some kind of buffering but don't know hot to avoid it.
I have a working java example:
FileOutputStream fo = new FileOutputStream("/dev/usb/lp0");
fo.write(bytes, 0 , bytes.length);
fo.flush();
fo.close();
But my go code doesnt't work:
f, err := os.OpenFile("/dev/usb/lp0", os.O_RDWR, 755)
if err != nil {
return err
}
n, err := f.Write(bytes)
if err != nil {
f.Close()
return err
}
log.Printf("Wrote %d bytes", n)
return f.Close()
Putting the printer into Trace mode, I found out, that the printer receives only the first 128 of
181 bytes. When I add a second write method after the first line, like this
f.WriteString(" ")
the sequence get printed as submitted. Using f.Sync() to flush doesn't work, because it's not supported for files like /dev/usb/*. How do I correctly write to the printer without using a second write method?
How would a process read its own output stream? I am writing automated tests which start a few application sub-processes (applications) in the same process as the test. Therefore, the standard out is a mix of test output and application output.
I want to read the output stream at runtime and fail the test if I see errors from the application. Is this possible/feasible? If so, how do I do it?
Note: I know I could start the applications as their own separate processes and then read their output streams. That's a lot of work from where I am now.
Also note, this is not a dupe of How to test a function's output (stdout/stderr) in Go unit tests, although that ticket is similar and helpful. The other ticket is about capturing output for a single function call. This ticket is about reading the entire stream, continuously. The correct answer is also a little different - it requires a pipe.
Yes, You may use os.Pipe() then process it yourself:
tmp := os.Stdout
r, w, err := os.Pipe()
if err != nil {
panic(err)
}
os.Stdout = w
Or divert os.Stdout to a another file or strings.Builder.
Here is the detailed answer:
In Go, how do I capture stdout of a function into a string?
A slightly modified version of an answer given in In Go, how do I capture stdout of a function into a string? using a os.Pipe (a form of IPC):
Pipe returns a connected pair of Files; reads from r return bytes written to w. It returns the files and an error, if any.
As os.Stdout is an *os.File, you could replace it with any file.
package main
import (
"bytes"
"fmt"
"io"
"log"
"os"
)
func main() {
old := os.Stdout
r, w, _ := os.Pipe() // TODO: handle error.
os.Stdout = w
// All stdout will be caputered from here on.
fmt.Println("this will be caputered")
// Access output and restore previous stdout.
outc := make(chan string)
go func() {
var buf bytes.Buffer
io.Copy(&buf, r) // TODO: handle error
outc <- buf.String()
}()
w.Close()
os.Stdout = old
out := <-outc
log.Printf("captured: %s", out)
}
I need to elaborate a file (potentially a big file) one block at a time and write the result to a new file.
To put it simply, I have the basic function to elaborate a block:
func elaborateBlock(block []byte) []byte { ... }
Every block needs to be elaborated and then written to the output file sequentially (preserving original order).
The one-thread implementation is trivial:
for {
buffer := make([]byte, BlockSize)
_, err := inputFile.Read(buffer)
if err == io.EOF {
break
}
processedData := elaborateBlock(buffer)
outputFile.Write(processedData)
}
But the elaboration can be heavy and every block can be processed separately, so a multi-threaded implementation is the natural evolution.
The solution I came up with is to create an array of channels, compute every block in a different thread and sync the final write by looping the channel array:
Utility function:
func blockThread(channel chan []byte, block []byte) {
channel <- elaborateBlock(block)
}
In the main program:
chans = []chan []byte {}
for {
buffer := make([]byte, BlockSize)
_, err := inputFile.Read(buffer)
if err == io.EOF {
break
}
channel := make(chan []byte)
chans = append(chans, channel)
go blockThread(channel, buffer)
}
for i := range chans {
data := <- chans[i]
outputFile.Write(data)
}
This approach works but can be problematic with large files because it requires to load the whole file in memory before starting writing the output.
Do you think there can be a better solution, with also better performance overall?
If blocks do need to be written out in order
If you want to work on multiple blocks concurrently, obviously you need to hold multiple blocks in memory at the same time.
You may decide how many blocks you want to process concurrently, and it's enough to read as many into memory at the same time. E.g. you may say you want to process 5 blocks concurrently. This will limit memory usage, and still utilize your CPU resources potentially to the max. Recommended to pick a number based on your available CPU cores (if processing a block does not already use multi cores). This can be queried using runtime.GOMAXPROCS(0).
You should have a single goroutine that reads the input file sequentially, and prodocue the blocks wrapped in Jobs (which also contain the block index).
You should have multiple worker goroutines, preferable as many as cores you have (but experiment with smaller and higher values too). Each worker goroutine just receives jobs, and calls elaborateBlock() on the data, and delivers it on the results channel.
There should be a single, designated consumer which receives completed jobs, and writes them in order to the output file. Since goroutines run concurrently and we have no control in which order the blocks are completed, the consumer should keep track of the index of the next block to be written to the output. Blocks arriving out of order should only be stored, and only proceed with writing if the subsequent block arrives.
This is an (incomplete) example how to do all these:
const BlockSize = 1 << 20 // 1 MB
func elaborateBlock(in []byte) []byte { return in }
type Job struct {
Index int
Block []byte
}
func producer(jobsCh chan<- *Job) {
// Init input file:
var inputFile *os.File
for index := 0; ; index++ {
job := &Job{
Index: index,
Block: make([]byte, BlockSize),
}
_, err := inputFile.Read(job.Block)
if err != nil {
break
}
jobsCh <- job
}
}
func worker(jobsCh <-chan *Job, resultCh chan<- *Job) {
for job := range jobsCh {
job.Block = elaborateBlock(job.Block)
resultCh <- job
}
}
func consumer(resultCh <-chan *Job) {
// Init output file:
var outputFile *os.File
nextIdx := 0
jobMap := map[int]*Job{}
for job := range resultCh {
jobMap[job.Index] = job
// Write out all blocks we have in contiguous index range:
for {
j := jobMap[nextIdx]
if j == nil {
break
}
if _, err := outputFile.Write(j.Block); err != nil {
// handle error, maybe terminate?
}
delete(nextIdx) // This job is written out
nextIdx++
}
}
}
func main() {
jobsCh := make(chan *Job)
resultCh := make(chan *Job)
for i := 0; i < 5; i++ {
go worker(jobsCh, resultCh)
}
wg := sync.WaitGroup{}
wg.Add(1)
go func() {
defer wg.Done()
consumer(resultCh)
}()
// Start producing jobs:
producer(jobsCh)
// No more jobs:
close(jobsCh)
// Wait for consumer to complete:
wg.Wait()
}
One thing to note here: this alone won't guarantee limiting the used memory. Imagine a case where the first block would require an enormous time to calculate, while subsequent blocks do not. What would happen? The first block would occupy a worker, and the other workers would "quickly" complete the subsequent blocks. The consumer would store all in memory, waiting for the first block to complete (as that has to be written out first). This could increase memory usage.
How could we avoid this?
By introducing a job pool. New jobs could not be created arbitrarily, but taken from a pool. If the pool is empty, the producer has to wait. So when the producer needs a new Job, takes one from a pool. When the consumer has written out a Job, puts it back into the pool. Simple as that. This would also reduce pressure on the garbage collector, as jobs (and large []byte buffers) are not created and thrown away, they could be re-used.
For a simple Job pool implementation you could use a buffered channel. For details, see How to implement Memory Pooling in Golang.
If blocks can be written in any order
Another option could be to allocate the output file in advance. If the size of the output blocks are also deterministic, you can do so (e.g. outsize := (insize / blocksize) * outblockSize).
To what end?
If you have the output file pre-allocated, the consumer does not need to wait input blocks in order. Once an input block is calculated, you can calculate the position where it will go in the output, seek to that position and just write it. For this you may use File.Seek().
This solution still requires to send the block index from the producer to the consumer, but the consumer won't need to store blocks arriving out-of-order, so the consumer can be simpler, and does not need to store completed blocks until the subsequent one arrives in order to proceed with writing the output file.
Note that this solution naturally does not pose a memory threat, as completed jobs are never accumulated / cached, they are written out in the order of completion.
See related questions for more details and techniques:
Is this an idiomatic worker thread pool in Go?
How to collect values from N goroutines executed in a specific order?
here is a working example that should work and is as close as possible to your original code.
the idea is to turn your array into a channel of channels of bytes. then
first fire up a consumer that will read on this channel of channels , get the channel of bytes, read from it and write the result.
Back on the main thread you create a channel of bytes, write it to the channel of channels (now the consumer reading sequentially from them will read the results in order) and then fire up the process that will do the work and write on the allocated channel (producers).
what will happen now is that the there will be a "race" between the procuders and the consumer, as soon as a produced block is read from the consumer and written the resources associated with it will be deallocated. this could be an improvement to your original design.
here is the code and the playground link:
package main
import (
"bytes"
"fmt"
"io"
"sync"
)
func elaborateBlock(b []byte) []byte {
return []byte("werkwerkwerk")
}
func blockThread(channel chan []byte, block []byte, wg *sync.WaitGroup) {
channel <- elaborateBlock(block)
wg.Done()
}
func main() {
chans := make(chan chan []byte)
BlockSize := 3
inputBytes := bytes.NewBuffer([]byte("transmutemetowerkwerkwerk"))
producewg := sync.WaitGroup{}
consumewg := sync.WaitGroup{}
consumewg.Add(1)
go func() {
chancount := 0
for ch := range chans {
data := <-ch
fmt.Printf("got %d block, result:%s\n", chancount, data)
chancount++
}
fmt.Printf("done receiving\n")
consumewg.Done()
}()
for {
buffer := make([]byte, BlockSize)
_, err := inputBytes.Read(buffer)
if err == io.EOF {
go func() {
//wait for all the procuders to finish
producewg.Wait()
//then close the main channel to notify the consumer
close(chans)
}()
break
}
channel := make(chan []byte)
chans <- channel //give the channel that we return the result to the receiver
producewg.Add(1)
go blockThread(channel, buffer, &producewg)
}
consumewg.Wait()
fmt.Printf("main exiting")
}
playground link
as a minor point i don't feel right about the "read the whole file into memory" statement cause you are just reading a block every time from the Reader, maybe "holding the result of the whole computation in memory" is more appropriate?
Is there a simple way to format a string in Go without printing the string?
I can do:
bar := "bar"
fmt.Printf("foo: %s", bar)
But I want the formatted string returned rather than printed so I can manipulate it further.
I could also do something like:
s := "foo: " + bar
But this becomes difficult to read when the format string is complex, and cumbersome when one or many of the parts aren't strings and have to be converted first, like
i := 25
s := "foo: " + strconv.Itoa(i)
Is there a simpler way to do this?
Sprintf is what you are looking for.
Example
fmt.Sprintf("foo: %s", bar)
You can also see it in use in the Errors example as part of "A Tour of Go."
return fmt.Sprintf("at %v, %s", e.When, e.What)
1. Simple strings
For "simple" strings (typically what fits into a line) the simplest solution is using fmt.Sprintf() and friends (fmt.Sprint(), fmt.Sprintln()). These are analogous to the functions without the starter S letter, but these Sxxx() variants return the result as a string instead of printing them to the standard output.
For example:
s := fmt.Sprintf("Hi, my name is %s and I'm %d years old.", "Bob", 23)
The variable s will be initialized with the value:
Hi, my name is Bob and I'm 23 years old.
Tip: If you just want to concatenate values of different types, you may not automatically need to use Sprintf() (which requires a format string) as Sprint() does exactly this. See this example:
i := 23
s := fmt.Sprint("[age:", i, "]") // s will be "[age:23]"
For concatenating only strings, you may also use strings.Join() where you can specify a custom separator string (to be placed between the strings to join).
Try these on the Go Playground.
2. Complex strings (documents)
If the string you're trying to create is more complex (e.g. a multi-line email message), fmt.Sprintf() becomes less readable and less efficient (especially if you have to do this many times).
For this the standard library provides the packages text/template and html/template. These packages implement data-driven templates for generating textual output. html/template is for generating HTML output safe against code injection. It provides the same interface as package text/template and should be used instead of text/template whenever the output is HTML.
Using the template packages basically requires you to provide a static template in the form of a string value (which may be originating from a file in which case you only provide the file name) which may contain static text, and actions which are processed and executed when the engine processes the template and generates the output.
You may provide parameters which are included/substituted in the static template and which may control the output generation process. Typical form of such parameters are structs and map values which may be nested.
Example:
For example let's say you want to generate email messages that look like this:
Hi [name]!
Your account is ready, your user name is: [user-name]
You have the following roles assigned:
[role#1], [role#2], ... [role#n]
To generate email message bodies like this, you could use the following static template:
const emailTmpl = `Hi {{.Name}}!
Your account is ready, your user name is: {{.UserName}}
You have the following roles assigned:
{{range $i, $r := .Roles}}{{if $i}}, {{end}}{{.}}{{end}}
`
And provide data like this for executing it:
data := map[string]interface{}{
"Name": "Bob",
"UserName": "bob92",
"Roles": []string{"dbteam", "uiteam", "tester"},
}
Normally output of templates are written to an io.Writer, so if you want the result as a string, create and write to a bytes.Buffer (which implements io.Writer). Executing the template and getting the result as string:
t := template.Must(template.New("email").Parse(emailTmpl))
buf := &bytes.Buffer{}
if err := t.Execute(buf, data); err != nil {
panic(err)
}
s := buf.String()
This will result in the expected output:
Hi Bob!
Your account is ready, your user name is: bob92
You have the following roles assigned:
dbteam, uiteam, tester
Try it on the Go Playground.
Also note that since Go 1.10, a newer, faster, more specialized alternative is available to bytes.Buffer which is: strings.Builder. Usage is very similar:
builder := &strings.Builder{}
if err := t.Execute(builder, data); err != nil {
panic(err)
}
s := builder.String()
Try this one on the Go Playground.
Note: you may also display the result of a template execution if you provide os.Stdout as the target (which also implements io.Writer):
t := template.Must(template.New("email").Parse(emailTmpl))
if err := t.Execute(os.Stdout, data); err != nil {
panic(err)
}
This will write the result directly to os.Stdout. Try this on the Go Playground.
try using Sprintf(); it will not print the output but save it for future purpose.
check this out.
package main
import "fmt"
func main() {
address := "NYC"
fmt.Sprintf("I live in %v", address)
}
when you run this code, it will not output anything. But once you assigned the Sprintf() to a separate variable, it can be used for future purposes.
package main
import "fmt"
func main() {
address := "NYC"
fmt.Sprintf("I live in %v", address)
var city = fmt.Sprintf("lives in %v", address)
fmt.Println("Michael",city)
}
I've created go project for string formatting from template (it allow to format strings in C# or Python style) and by performance tests it is fater than fmt.Sprintf, you could find it here https://github.com/Wissance/stringFormatter
it works in following manner:
func TestStrFormat(t *testing.T) {
strFormatResult, err := Format("Hello i am {0}, my age is {1} and i am waiting for {2}, because i am {0}!",
"Michael Ushakov (Evillord666)", "34", "\"Great Success\"")
assert.Nil(t, err)
assert.Equal(t, "Hello i am Michael Ushakov (Evillord666), my age is 34 and i am waiting for \"Great Success\", because i am Michael Ushakov (Evillord666)!", strFormatResult)
strFormatResult, err = Format("We are wondering if these values would be replaced : {5}, {4}, {0}", "one", "two", "three")
assert.Nil(t, err)
assert.Equal(t, "We are wondering if these values would be replaced : {5}, {4}, one", strFormatResult)
strFormatResult, err = Format("No args ... : {0}, {1}, {2}")
assert.Nil(t, err)
assert.Equal(t, "No args ... : {0}, {1}, {2}", strFormatResult)
}
func TestStrFormatComplex(t *testing.T) {
strFormatResult, err := FormatComplex("Hello {user} what are you doing here {app} ?", map[string]string{"user":"vpupkin", "app":"mn_console"})
assert.Nil(t, err)
assert.Equal(t, "Hello vpupkin what are you doing here mn_console ?", strFormatResult)
}
In your case, you need to use Sprintf() for format string.
func Sprintf(format string, a ...interface{}) string
Sprintf formats according to a format specifier and returns the resulting string.
s := fmt.Sprintf("Good Morning, This is %s and I'm living here from last %d years ", "John", 20)
Your output will be :
Good Morning, This is John and I'm living here from last 20 years.
fmt.SprintF function returns a string and you can format the string the very same way you would have with fmt.PrintF
I came to this page specifically looking for a way to format an error string. So if someone needs help with the same, you want to use the fmt.Errorf() function.
The method signature is func Errorf(format string, a ...interface{}) error.
It returns the formatted string as a value that satisfies the error interface.
You can look up more details in the documentation - https://golang.org/pkg/fmt/#Errorf.
Instead of using template.New, you can just use the new builtin with
template.Template:
package main
import (
"strings"
"text/template"
)
func format(s string, v interface{}) string {
t, b := new(template.Template), new(strings.Builder)
template.Must(t.Parse(s)).Execute(b, v)
return b.String()
}
func main() {
bar := "bar"
println(format("foo: {{.}}", bar))
i := 25
println(format("foo: {{.}}", i))
}