Read entire msgpack file with Rust - rust

How do I read an entire msgpack file and convert it to a JSON file with Rust?
I have tried the following:
use std::fs::File;
use std::io::{BufReader, BufWriter, Write};
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let file_path = "./src/foo.msgpack";
    let reader = BufReader::new(File::open(file_path).unwrap());
    let writer = BufWriter::new(File::create("./src/results.json").unwrap());
    let mut deserializer = rmp_serde::Deserializer::from_read(reader);
    // let mut serializer = serde_json::Serializer::new(std::io::stdout());
    let mut serializer = serde_json::Serializer::pretty(writer);
    serde_transcode::transcode(&mut deserializer, &mut serializer).unwrap();
    serializer.into_inner().flush().unwrap();
    Ok(())
}
This reads the first line out of the msgpack file, and no more.
Other libraries such as serde_json can turn a deserializer into an iterator (https://docs.rs/serde_json/latest/serde_json/de/struct.Deserializer.html#method.into_iter), which seems to be what I want, but rmp_serde doesn't do this, and I'm too much of a Rust newbie to figure this out myself.
Thanks!

Related

How would I save this to a file instead of stdout?

I'm probably just missing something simple, and I have tried using std::fs::File.
use std::io::{stdout, Write};
use curl::easy::Easy;
fn main() {
    let mut easy = Easy::new();
    easy.url("checkip.amazonaws.com").unwrap();
    easy.write_function(|data| {
        Ok(stdout().write(data).unwrap())
    }).unwrap();
    easy.perform().unwrap();
}
Move a file into the closure and write all the content:
use std::fs::File;
use std::io::Write;
use curl::easy::Easy;
fn main() {
    let mut file = File::create("foo.txt").expect("open file");
    let mut easy = Easy::new();
    easy.url("checkip.amazonaws.com").unwrap();
    // `move` gives the closure ownership of `file`
    easy.write_function(move |data| {
        file.write_all(data).expect("write data");
        Ok(data.len())
    }).unwrap();
    easy.perform().unwrap();
}
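The same pattern can be tried without a network connection. The following is a minimal sketch using only the standard library: `deliver_chunks` is a hypothetical stand-in for a transfer library's data callback (it is not part of curl), and `save_chunks` and the temporary file name are likewise made up for illustration. It shows the two points the answer relies on: the closure takes ownership of the file via `move`, and it reports how many bytes it consumed after `write_all` has written the whole chunk.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical stand-in for a transfer library: hands each chunk to the
/// callback and expects the number of bytes consumed in return.
fn deliver_chunks<F>(chunks: &[&[u8]], mut on_data: F) -> io::Result<()>
where
    F: FnMut(&[u8]) -> io::Result<usize>,
{
    for &chunk in chunks {
        let consumed = on_data(chunk)?;
        if consumed != chunk.len() {
            return Err(io::Error::new(io::ErrorKind::WriteZero, "short write"));
        }
    }
    Ok(())
}

/// Writes every chunk to `path` and returns the resulting file size.
fn save_chunks(path: &Path, chunks: &[&[u8]]) -> io::Result<u64> {
    let mut file = File::create(path)?;
    // `move` transfers ownership of `file` into the closure, so the file
    // lives exactly as long as the callback does.
    deliver_chunks(chunks, move |data| {
        file.write_all(data)?; // write_all keeps writing until all of `data` is out
        Ok(data.len())
    })?;
    Ok(fs::metadata(path)?.len())
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("closure_write_demo.txt");
    let chunks: [&[u8]; 2] = [b"hello, ", b"world\n"];
    let written = save_chunks(&path, &chunks)?;
    println!("wrote {written} bytes to {}", path.display());
    fs::remove_file(&path)
}
```

A plain `write` call may accept only part of the slice, which is why the closure uses `write_all` before returning `data.len()`.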

How to read bytes into BytesMut from a file?

I have this code to read a file into a BytesMut:
use std::fs::File;
use std::io;
use std::io::prelude::*;
use bytes::BytesMut;
fn main() -> io::Result<()> {
    let mut f = File::open("foo.txt")?;
    let mut b = BytesMut::with_capacity(10);
    f.read(b.as_mut())?;
    println!("The bytes: {:?}", b.len());
    Ok(())
}
But b.len() is always zero. The content of "foo.txt" is 0x00000010 (decimal 16). I can read into a [0; 10] array and then convert it to a BytesMut, but is there an easier way to do this?
The f.read() call writes into an existing mutable byte slice, &mut [u8]. You have allocated capacity for 10 bytes, but the BytesMut doesn't yet hold any bytes: its length is still 0. When it is dereferenced into a byte slice it yields an empty slice, so read has nowhere to put the data.
You'll want something like BytesMut::zeroed and then it will work:
use std::fs::File;
use std::io::{Read, Result};
use bytes::BytesMut;
fn main() -> Result<()> {
    let mut f = File::open("foo.txt")?;
    // zeroed(10) sets the length to 10, not just the capacity
    let mut b = BytesMut::zeroed(10);
    f.read(b.as_mut())?;
    println!("The bytes: {:?}", b.len());
    Ok(())
}
The bytes: 10
If you want to read the whole file without sizing the buffer up front, you can use std::io::copy with a Write adapter obtained via .writer():
use bytes::{BufMut, BytesMut};
use std::fs::File;
use std::io::Result;
fn main() -> Result<()> {
    let mut f = File::open("foo.txt")?;
    let mut b = BytesMut::new().writer();
    std::io::copy(&mut f, &mut b)?;
    println!("The bytes: {:?}", b.into_inner().len());
    Ok(())
}
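The capacity-versus-length distinction is not specific to BytesMut. As an analogy only, here is a small sketch that reproduces it with a plain Vec<u8> from the standard library; the function names `read_into_reserved` and `read_into_zeroed` are made up for this illustration, and the four input bytes mirror the file content mentioned in the question.

```rust
use std::io::{self, Read};

/// Reads into a buffer that has capacity reserved but a length of zero.
fn read_into_reserved<R: Read>(mut src: R) -> io::Result<usize> {
    let mut buf: Vec<u8> = Vec::with_capacity(10);
    // The slice covers only the `len()` initialized bytes, which is none.
    src.read(buf.as_mut_slice())
}

/// Reads into a buffer whose length (not just capacity) is 10.
fn read_into_zeroed<R: Read>(mut src: R) -> io::Result<usize> {
    let mut buf = vec![0u8; 10];
    src.read(buf.as_mut_slice())
}

fn main() -> io::Result<()> {
    // `&[u8]` implements Read, so a slice can stand in for the file.
    let data = [0x00u8, 0x00, 0x00, 0x10];
    println!("reserved only: {} bytes read", read_into_reserved(&data[..])?);
    println!("zero-filled:   {} bytes read", read_into_zeroed(&data[..])?);
    Ok(())
}
```

Note that `read` returns the number of bytes it actually filled, which can be smaller than the buffer. Printing the buffer length, as the snippets above do, reports the buffer size rather than the amount of data read.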

Rust generic struct reader and writer

I'm wondering if there is a good way to write a function that can read any type of struct from a file. I was able to write a file with the function below, which lets me write any struct that implements Serialize. I'm trying to do something similar for reading, using generics and structs that implement Deserialize. However, I keep hitting issues with the generics and lifetimes. Is there a way to read any type of struct from a file?
extern crate bincode;
extern crate serde;
#[macro_use]
extern crate serde_derive;
use serde::{Serialize, Deserialize};
use std::fs::File;
use std::io::{Read, Write};
use std::time::Instant;
fn main() {
    let filename = String::from("./prices.ruststruct");
    {
        let now = Instant::now();
        let (open_prices, close_prices) = gen_random_prices();
        let test_prices = TestPrices::new(open_prices, close_prices);
        write_struct(&filename, &test_prices);
        println!("{}", now.elapsed().as_micros());
    }
    {
        let test_prices = read_struct::<Prices>(&filename);
        let now = Instant::now();
        let total_prices: f64 = test_prices.open_price.iter().sum();
        println!("{}", now.elapsed().as_micros());
    }
}
#[derive(Deserialize, Serialize)]
struct Prices {
    open_price: Vec<f64>,
    close_price: Vec<f64>,
}
fn write_struct(filename: &str, data: &impl Serialize) {
    let filename = format!("{}.ruststruct", filename);
    let bytes: Vec<u8> = bincode::serialize(&data).unwrap();
    let mut file = File::create(filename).unwrap();
    file.write_all(&bytes).unwrap();
}
fn read_struct<'a, T: Deserialize<'a>>(filename: &str) -> T {
    let filename = format!("{}.ruststruct", filename);
    let mut file = File::open(filename).unwrap();
    let mut buffer = Vec::<u8>::new();
    file.read_to_end(&mut buffer).unwrap();
    let decoded: T = bincode::deserialize(&buffer[..]).unwrap();
    decoded
}
fn read_struct<'a, T: Deserialize<'a>>(filename: &str) -> T {
Deserialize<'a> is only a suitable bound when you are planning to let the deserialized structures borrow from the input data. But this function cannot allow that, because it discards buffer when it returns.
For a structure like your Prices, T: DeserializeOwned will work. This guarantees that the structure won't borrow from the input data, so it's okay to drop the data.
If you want to allow borrowing then you must put reading the file into a buffer, and deserializing from the buffer, in separate functions so that the caller can keep the buffer alive as long as it wants to use the deserialized structure.
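The split described in the last paragraph can be illustrated without serde. The sketch below is only an analogy built on the standard library: `Record`, `load`, and `parse` are hypothetical names, and the hand-written `name=value` parser stands in for a deserializer that borrows from its input. The point is the shape of the API: one function returns an owned buffer, the other returns a value tied to that buffer's lifetime.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical borrowed view: both fields point into the caller's buffer.
#[derive(Debug, PartialEq)]
struct Record<'a> {
    name: &'a str,
    value: &'a str,
}

/// Step 1: load the raw data. The caller owns the returned buffer.
fn load(path: &Path) -> io::Result<String> {
    fs::read_to_string(path)
}

/// Step 2: parse a "name=value" line, borrowing from `buffer`.
fn parse<'a>(buffer: &'a str) -> Option<Record<'a>> {
    let (name, value) = buffer.trim_end().split_once('=')?;
    Some(Record { name, value })
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("borrowed_record_demo.txt");
    fs::write(&path, "open_price=1.5\n")?;
    // `buffer` must stay alive for as long as `record` is used.
    let buffer = load(&path)?;
    let record = parse(&buffer);
    println!("{record:?}");
    fs::remove_file(&path)
}
```

Merging `load` and `parse` into one function that returns `Record<'a>` would fail to compile for the same reason the original read_struct does: the buffer would be dropped while the result still borrows from it.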

type `std::result::Result<u8, std::io::Error>` cannot be dereferenced

use std::env;
use std::fs::File;
use std::io::{BufReader, BufWriter, Read, Write};
fn main() {
    let args = env::args().collect::<Vec<String>>();
    let file = File::open(&args[1]).expect("file not found");
    let reader = BufReader::new(file);
    let mut writer = BufWriter::new(std::io::stdout());
    for it in reader.bytes() {
        writer.write(&[*it]);
    }
}
Why does this give an error?
type `std::result::Result<u8, std::io::Error>` cannot be dereferenced
From the documentation,
fn bytes(self) -> Bytes<Self> where Self: Sized,
Transforms this Read instance to an Iterator over its bytes.
The returned type implements Iterator where the Item is Result<u8, io::Error>. The yielded item is Ok if a byte was successfully read and Err otherwise. EOF is mapped to returning None from this iterator.
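The * operator only works on references, raw pointers, and types that implement std::ops::Deref. Result<u8, io::Error> is none of these, so extract the byte from the Result instead, for example with unwrap: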
use std::env;
use std::fs::File;
use std::io::{BufReader, BufWriter, Read, Write};
fn main() {
    let args = env::args().collect::<Vec<String>>();
    let file = File::open(&args[1]).expect("file not found");
    let reader = BufReader::new(file);
    let mut writer = BufWriter::new(std::io::stdout());
    for it in reader.bytes() {
        // unwrap the io::Result<u8>; write_all also reports write failures
        writer.write_all(&[it.unwrap()]).unwrap();
    }
}
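If panicking on an I/O error is not acceptable, the Result yielded by bytes() can be propagated with `?` instead. This is a minimal sketch, not part of the original answer: `copy_bytes` is a hypothetical helper name, and it is generic over Read and Write so that it can be exercised with in-memory data as well as files.

```rust
use std::io::{self, BufReader, BufWriter, Read, Write};

/// Copies `input` to `output` one byte at a time, returning the byte count.
fn copy_bytes<R: Read, W: Write>(input: R, output: W) -> io::Result<u64> {
    let reader = BufReader::new(input);
    let mut writer = BufWriter::new(output);
    let mut count = 0;
    for byte in reader.bytes() {
        // `byte` is an io::Result<u8>; `?` yields the u8 or returns the error.
        writer.write_all(&[byte?])?;
        count += 1;
    }
    writer.flush()?;
    Ok(count)
}

fn main() -> io::Result<()> {
    // A byte slice implements Read, so it can stand in for the file.
    let copied = copy_bytes(&b"hello\n"[..], io::stdout())?;
    eprintln!("copied {copied} bytes");
    Ok(())
}
```

For a plain copy, std::io::copy does the same job in larger chunks; the byte-by-byte loop is kept here only to mirror the code in the question.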

How to create an in-memory object that can be used as a Reader, Writer, or Seek in Rust?

I need a completely in-memory object that I can give to BufReader and BufWriter. Something like Python's StringIO. I want to write to and read from such an object using methods ordinarily used with Files.
Is there a way to do this using the standard library?
In fact there is a way: Cursor<T>!
(please also read Shepmaster's answer on why often it's even easier)
In the documentation you can see that there are the following impls:
impl<T> Seek for Cursor<T> where T: AsRef<[u8]>
impl<T> Read for Cursor<T> where T: AsRef<[u8]>
impl Write for Cursor<Vec<u8>>
impl<T> AsRef<[T]> for Vec<T>
From this you can see that you can use the type Cursor<Vec<u8>> just as an ordinary file, because Read, Write and Seek are implemented for that type!
A small example:
use std::io::{Cursor, Read, Seek, SeekFrom, Write};
// Create fake "file"
let mut c = Cursor::new(Vec::new());
// Write into the "file" and seek to the beginning
c.write_all(&[1, 2, 3, 4, 5]).unwrap();
c.seek(SeekFrom::Start(0)).unwrap();
// Read the "file's" contents into a vector
let mut out = Vec::new();
c.read_to_end(&mut out).unwrap();
println!("{:?}", out);
For a more useful example, check the documentation linked above.
You don't need a Cursor most of the time.
The question asks for an "object that I can give to BufReader and BufWriter".
BufReader requires a value that implements Read:
impl<R: Read> BufReader<R> {
    pub fn new(inner: R) -> BufReader<R>;
}
BufWriter requires a value that implements Write:
impl<W: Write> BufWriter<W> {
    pub fn new(inner: W) -> BufWriter<W>;
}
If you view the implementors of Read you will find impl<'a> Read for &'a [u8].
If you view the implementors of Write, you will find impl Write for Vec<u8>.
use std::io::{Read, Write};
fn main() {
    // Create fake "file"
    let mut file = Vec::new();
    // Write into the "file"
    file.write_all(&[1, 2, 3, 4, 5]).unwrap();
    // Read the "file's" contents into a new vector
    let mut out = Vec::new();
    let mut c = file.as_slice();
    c.read_to_end(&mut out).unwrap();
    println!("{:?}", out);
}
Writing to a Vec will always append to the end. We also take a slice to the Vec that we can update. Each read of c will advance the slice further and further until it is empty.
The main differences from Cursor:
Cannot seek the data, so you cannot easily re-read data
Cannot write to anywhere but the end
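To make the two differences concrete, here is a short sketch of what Cursor allows and a plain Vec or slice does not: seeking back and overwriting bytes in the middle of the data. The helper name `overwrite_in_place` is made up for this example.

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom, Write};

/// Writes five bytes, overwrites two in the middle, then reads everything back.
fn overwrite_in_place() -> io::Result<Vec<u8>> {
    let mut c = Cursor::new(Vec::<u8>::new());
    c.write_all(&[1, 2, 3, 4, 5])?;
    // Seek to offset 1 and overwrite the bytes at offsets 1 and 2.
    c.seek(SeekFrom::Start(1))?;
    c.write_all(&[9, 9])?;
    // Rewind so the data can be re-read from the beginning.
    c.seek(SeekFrom::Start(0))?;
    let mut out = Vec::new();
    c.read_to_end(&mut out)?;
    Ok(out)
}

fn main() -> io::Result<()> {
    println!("{:?}", overwrite_in_place()?); // prints [1, 9, 9, 4, 5]
    Ok(())
}
```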
If you want to use BufReader with an in-memory String, you can use the as_bytes() method:
use std::io::BufRead;
use std::io::BufReader;
use std::io::Read;
fn read_buff<R: Read>(mut buffer: BufReader<R>) {
    let mut data = String::new();
    let _ = buffer.read_line(&mut data);
    println!("read_buff got {}", data);
}
fn main() {
    read_buff(BufReader::new("Potato!".as_bytes()));
}
This prints read_buff got Potato!. There is no need to use a cursor for this case.
To use an in-memory String with BufWriter, you can use the as_mut_vec method. Unfortunately it is unsafe and I have not found any other way. I don't like the Cursor approach since it consumes the vector and I have not found a way yet to use the Cursor together with BufWriter.
use std::io::BufWriter;
use std::io::Write;
pub fn write_something<W: Write>(mut buf: BufWriter<W>) {
    // write_all keeps writing until every byte is out; plain write may stop early
    buf.write_all("potato".as_bytes()).unwrap();
}
#[cfg(test)]
mod tests {
    use super::*;
    use std::io::BufWriter;
    #[test]
    fn testing_bufwriter_and_string() {
        let mut s = String::new();
        // unsafe: the caller must ensure only valid UTF-8 ends up in the Vec
        write_something(unsafe { BufWriter::new(s.as_mut_vec()) });
        assert_eq!("potato", &s);
    }
}
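A possible alternative that avoids `unsafe`, offered as a sketch rather than a drop-in replacement: let BufWriter borrow a Vec<u8> (directly, or through a Cursor over `&mut Vec<u8>`, which also implements Write) and convert the bytes to a String afterwards with String::from_utf8, which checks the UTF-8 validity that `as_mut_vec` leaves to the caller. The helper names `via_borrowed_vec` and `via_borrowed_cursor` are hypothetical, and `write_something` is changed here to return an io::Result.

```rust
use std::io::{self, BufWriter, Cursor, Write};

fn write_something<W: Write>(mut buf: BufWriter<W>) -> io::Result<()> {
    buf.write_all(b"potato")?;
    buf.flush()
}

/// Converts collected bytes to a String, mapping invalid UTF-8 to an io::Error.
fn into_string(bytes: Vec<u8>) -> io::Result<String> {
    String::from_utf8(bytes).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

/// BufWriter borrows the Vec, so the Vec is still usable afterwards.
fn via_borrowed_vec() -> io::Result<String> {
    let mut bytes = Vec::<u8>::new();
    write_something(BufWriter::new(&mut bytes))?;
    into_string(bytes)
}

/// Same idea with a Cursor that borrows the Vec instead of consuming it.
fn via_borrowed_cursor() -> io::Result<String> {
    let mut bytes = Vec::<u8>::new();
    write_something(BufWriter::new(Cursor::new(&mut bytes)))?;
    into_string(bytes)
}

fn main() -> io::Result<()> {
    println!("{}", via_borrowed_vec()?);
    println!("{}", via_borrowed_cursor()?);
    Ok(())
}
```

The trade-off is one validation pass over the bytes at the end, in exchange for not having to uphold the UTF-8 invariant by hand.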
