Writing a Vec<String> to files using std::fs::write - rust

I'm writing a program that handles a vector whose elements combine numbers and letters (hence Vec<String>). I sort it with the .sort() method and am now trying to write it to a file.
strvec is my sorted vector, which I'm trying to write using std::fs::write:
println!("Save results to file?");
let to_save: String = read!();
match to_save.as_str() {
"y" => {
println!("Enter filename");
let filename: String = read!();
let pwd = current_dir().into();
write("/home/user/dl/results", strvec);
Rust tells me "the trait AsRef<[u8]> is not implemented for Vec<String>". I've also tried using &strvec.
How do I avoid this/fix it?

When writing objects to a file, you might want to consider serialization. The most common library for this in Rust is serde. However, in this case, where you want to store a vector of Strings and don't need the file to be human readable (the upside is a small file size), you can also use bincode:
use std::fs;
use bincode;
fn main() {
let v = vec![String::from("aaa"), String::from("bbb")];
let encoded_v = bincode::serialize(&v).expect("Could not encode vector");
fs::write("file", encoded_v).expect("Could not write file");
let read_v = fs::read("file").expect("Could not read file");
let decoded_v: Vec<String> = bincode::deserialize(&read_v).expect("Could not decode vector");
println!("{:?}", decoded_v);
}
Remember to add bincode = "1.3.3" under [dependencies] in Cargo.toml.
EDIT:
Actually, a String can be written to a file directly, so a simple join() should do:
use std::fs;
fn main() {
let v = vec![
String::from("aaa"),
String::from("bbb"),
String::from("ccc")];
fs::write("file", v.join("\n")).expect("Could not write file");
}

std::fs::write only accepts contents that can be viewed as a byte slice, that is, anything implementing AsRef<[u8]>. There are too many different ways data could be interpreted before it gets flattened, so you need to handle all of that ahead of time. For a Vec<String> it's pretty simple: you can use concat to squish everything down to a single String, which can be viewed as a &[u8] because of its AsRef<[u8]> trait impl.
Another option is join, in case you want some sort of delimiter between your strings, such as a space or a comma.
fn main() {
let strvec = vec![
"hello".to_string(),
"world".to_string(),
];
// "helloworld"
std::fs::write("/tmp/example", strvec.concat()).expect("failed to write to file");
// "hello world"
std::fs::write("/tmp/example", strvec.join(" ")).expect("failed to write to file");
}
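Tying this back to the question, a minimal sketch might sort the vector, join it with newlines, and write the result in one call. The save_sorted helper and the temp-file name are made up for illustration; only sort, join, and std::fs::write come from the discussion above.

```rust
use std::fs;
use std::io;
use std::path::Path;

// Hypothetical helper: sorts the strings and writes them one per line.
fn save_sorted(path: &Path, mut items: Vec<String>) -> io::Result<()> {
    items.sort();
    // join() builds a single String, which implements AsRef<[u8]>.
    fs::write(path, items.join("\n"))
}

fn main() -> io::Result<()> {
    // Assumed location: a file in the system temp directory.
    let path = std::env::temp_dir().join("strvec_example.txt");
    let strvec = vec!["b2".to_string(), "a1".to_string(), "c3".to_string()];
    save_sorted(&path, strvec)?;
    // Read the file back to confirm the layout.
    let text = fs::read_to_string(&path)?;
    println!("{}", text);
    Ok(())
}
```

Returning io::Result instead of calling expect leaves the decision about how to report a failed write to the caller.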

You can't get a &[u8] from a Vec<String> without copying since a slice must refer to a contiguous sequence of items. Each String will have its own allocation on the heap somewhere, so while each individual String can be converted to a &[u8], you can't convert the whole vector to a single &[u8].
While you can .collect() the vector into a single String and then get a &[u8] from that, this does some unnecessary copying. Consider instead just iterating the Strings and writing each one to the file. With this helper, it's no more complex than using std::fs::write():
use std::path::Path;
use std::fs::File;
use std::io::Write;
fn write_each(
path: impl AsRef<Path>,
items: impl IntoIterator<Item=impl AsRef<[u8]>>,
) -> std::io::Result<()> {
let mut file = File::create(path)?;
for i in items {
file.write_all(i.as_ref())?;
}
// Surface any I/O errors that could otherwise be swallowed when
// the file is closed implicitly by being dropped.
file.sync_all()
}
The bound impl IntoIterator<Item=impl AsRef<[u8]>> is satisfied by both Vec<String> and by &Vec<String>, so you can call this as either write_each("path/to/output", strvec) (to consume the vector) or write_each("path/to/output", &strvec) (if you need to hold on to the vector for later).
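If each string should land on its own line, one possible variant of the helper writes a separator after every item and goes through a BufWriter so that many small writes are batched. write_lines is a hypothetical name, and taking any Write rather than a path is an assumption made here so the sketch can be exercised against an in-memory Vec<u8>.

```rust
use std::io::{self, BufWriter, Write};

// Hypothetical variant of write_each: one item per line, any writer.
fn write_lines<W: Write>(
    out: W,
    items: impl IntoIterator<Item = impl AsRef<[u8]>>,
) -> io::Result<()> {
    let mut out = BufWriter::new(out);
    for item in items {
        out.write_all(item.as_ref())?;
        out.write_all(b"\n")?;
    }
    // Flush explicitly so buffered write errors are reported, not dropped.
    out.flush()
}

fn main() -> io::Result<()> {
    let strvec = vec!["hello".to_string(), "world".to_string()];
    let mut buf: Vec<u8> = Vec::new();
    // Borrowing with &strvec leaves the vector usable afterwards.
    write_lines(&mut buf, &strvec)?;
    print!("{}", String::from_utf8_lossy(&buf));
    // A File from File::create(path)? could be passed in the same way.
    Ok(())
}
```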

Related

How do I insert a dynamic byte string into a vector?

I need to create a packet to send to a server. For this purpose I use a vector with the byteorder crate. When I try to append a string, the Rust compiler tells me that I'm calling an unsafe function and gives me an error.
use byteorder::{LittleEndian, WriteBytesExt};
fn main () {
let login = "test";
let packet_length = 30 + (login.len() as i16);
let mut packet = Vec::new();
packet.write_u8(0x00);
packet.write_i16::<LittleEndian>(packet_length);
packet.append(&mut Vec::from(String::from("game name ").as_bytes_mut()));
// ... rest code
}
The error is:
packet.append(&mut Vec::from(String::from("game name ").as_bytes_mut()));
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ call to unsafe function
This is playground to reproduce: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=381c6d14660d47beaece15d068b3dc6a
What is the correct way to insert a string as bytes into a vector?
The unsafe function called was as_bytes_mut(). This creates a mutable reference with exclusive access to the bytes representing the string, allowing you to modify them. You do not really need a mutable reference in this case, as_bytes() would have sufficed.
However, there is a more idiomatic way. Vec<u8> also functions as a writer (it implements std::io::Write), so you can use one of its methods, or even the write! macro, to write encoded text into it.
use std::io::Write;
use byteorder::{LittleEndian, WriteBytesExt};
fn main () -> Result<(), std::io::Error> {
let login = "test";
let packet_length = 30 + (login.len() as i16);
let mut packet = Vec::new();
packet.write_u8(0x00)?;
packet.write_i16::<LittleEndian>(packet_length)?;
let game_name = String::from("game name");
write!(packet, "{} ", game_name)?;
Ok(())
}
See also:
Use write! macro with a string instead of a string literal
What's the de-facto way of reading and writing files in Rust 1.x?
You can use .extend() on the Vec and pass in the bytes representation of the String:
use byteorder::{LittleEndian, WriteBytesExt};
fn main() {
let login = "test";
let packet_length = 30 + (login.len() as i16);
let mut packet = Vec::new();
packet.write_u8(0x00);
packet.write_i16::<LittleEndian>(packet_length);
let string = String::from("game name ");
packet.extend(string.as_bytes());
}
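For comparison, here is a rough sketch of the same packet layout using only the standard library, in case pulling in byteorder is not desired. build_packet is a hypothetical helper; the layout (a zero byte, a little-endian i16 length, then the name) is copied from the question, and i16::to_le_bytes stands in for write_i16::<LittleEndian>.

```rust
// Hypothetical helper mirroring the packet layout from the question.
fn build_packet(login: &str) -> Vec<u8> {
    let packet_length = 30 + (login.len() as i16);
    let mut packet = Vec::new();
    packet.push(0x00);
    // to_le_bytes() yields the little-endian encoding as a [u8; 2].
    packet.extend_from_slice(&packet_length.to_le_bytes());
    // as_bytes() borrows the UTF-8 bytes; no unsafe and no mutation needed.
    packet.extend_from_slice("game name ".as_bytes());
    packet
}

fn main() {
    let packet = build_packet("test");
    println!("{:?}", packet);
}
```

Pushing onto a Vec<u8> cannot fail, so this version has no Result values to handle.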

Is it possible to build a HashMap of &str referencing environment variables?

I'm trying to make a read-only map of environment variables.
fn os_env_hashmap() -> HashMap<&'static str, &'static str> {
let mut map = HashMap::new();
use std::env;
for (key,val) in env::vars_os() {
let k = key.to_str();
if k.is_none() { continue }
let v = val.to_str();
if v.is_none() { continue }
k.unwrap();
//map.insert( k.unwrap(), v.unwrap() );
}
return map;
}
Can't seem to uncomment the "insert" line near the bottom without compiler errors about key,val,k, and v being local.
I might be able to fix the compiler error by using String instead of str, but str seems perfect for a read-only result.
Feel free to suggest a more idiomatic way to do this.
This is unfortunately not straightforward using only the facilities of the Rust standard library.
If env::vars_os() returned an iterator over &'static OsStr instead of OsString, this would be trivial. Unfortunately, not all platforms allow creating an &OsStr to the contents of an environment variable. In particular, on Windows, the native encoding is UTF-16 but the encoding needed by OsStr is WTF-8. For this reason, there really is no OsStr anywhere you could take a reference to, until you create an OsString by iterating over env::vars_os().
The simplest thing, as the question comments mention, is to return owned Strings:
use std::collections::HashMap;
use std::env;
fn os_env_hashmap() -> HashMap<String, String> {
let mut map = HashMap::new();
for (key, val) in env::vars_os() {
// Use pattern bindings instead of testing .is_some() followed by .unwrap().
// Variables whose name or value is not valid Unicode are skipped.
if let (Ok(k), Ok(v)) = (key.into_string(), val.into_string()) {
map.insert(k, v);
}
}
map
}
The result is not "read-only", but it is not shared, so you cannot cause data races or other weird bugs by mutating it.
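If a genuinely read-only, &'static view is wanted, one option (not part of the answer above, and assuming Rust 1.70 or later for std::sync::OnceLock) is to build the owned map once and hand out shared references to it. The names env_map and env_get are hypothetical, and the map is a snapshot taken on first use, so later changes to the environment are not reflected.

```rust
use std::collections::HashMap;
use std::env;
use std::sync::OnceLock;

// Hypothetical accessor: builds the map once, then hands out shared references.
fn env_map() -> &'static HashMap<String, String> {
    static ENV: OnceLock<HashMap<String, String>> = OnceLock::new();
    ENV.get_or_init(|| {
        env::vars_os()
            // Skip variables whose name or value is not valid Unicode.
            .filter_map(|(k, v)| Some((k.into_string().ok()?, v.into_string().ok()?)))
            .collect()
    })
}

// Borrowed lookups come out as &'static str without further allocation.
fn env_get(name: &str) -> Option<&'static str> {
    env_map().get(name).map(String::as_str)
}

fn main() {
    match env_get("PATH") {
        Some(path) => println!("PATH = {}", path),
        None => println!("PATH is not set or not valid Unicode"),
    }
}
```

The Strings are still allocated once; what this buys is &str lookups with a 'static lifetime and no way to mutate the map through the shared reference.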
See also
Is there any way to return a reference to a variable created in a function?
Return local String as a slice (&str)

Is there a way to avoid cloning when converting a PathBuf to a String?

I need to simply (and dangerously - error handling omitted for brevity) get the current executable name. I made it work, but my function converts a &str to String only to call as_str() on it later for pattern matching.
fn binary_name() -> String {
std::env::current_exe().unwrap().file_name().unwrap().to_str().unwrap().to_string()
}
As I understand it, std::env::current_exe() gives me ownership of the PathBuf which I could transfer by returning it. As it stands, I borrow it to convert it to &str. From there, the only way to return the string is to clone it before the PathBuf is dropped.
Is there any way to avoid this &OsStr -> &str -> String -> &str cycle?
Is there a way to avoid cloning when converting a PathBuf to a String?
Absolutely. However, that's not what you are doing. You are taking a part of the PathBuf via file_name and converting that. You cannot take ownership of a part of a string.
If you weren't taking a subset, then converting an entire PathBuf can be done by converting to an OsString and then to a String. Here, I ignore the specific errors and just return success or failure:
use std::path::PathBuf;
fn exe_name() -> Option<String> {
std::env::current_exe()
.ok()
.map(PathBuf::into_os_string)
.and_then(|exe| exe.into_string().ok())
}
Is there any way to avoid this &OsStr -> &str -> String -> &str cycle?
No, because you are creating the String (or OsString or PathBuf, whichever holds ownership depending on the variant of code) inside your method. Check out Return local String as a slice (&str) for why you cannot return a reference to a stack-allocated item (including a string).
As stated in that Q&A, if you want to have references, the thing owning the data has to outlive the references:
use std::env;
use std::path::Path;
use std::ffi::OsStr;
fn binary_name(path: &Path) -> Option<&str> {
path.file_name().and_then(OsStr::to_str)
}
fn main() {
let exe = env::current_exe().ok();
match exe.as_ref().and_then(|e| binary_name(e)) {
Some("cat") => println!("Called as cat"),
Some("dog") => println!("Called as dog"),
Some(other) => println!("Why did you call me {}?", other),
None => println!("Not able to figure out what I was called as"),
}
}
Your original code can be written to not crash on errors easily enough:
fn binary_name() -> Option<String> {
let exe = std::env::current_exe();
exe.ok()
.as_ref()
.and_then(|p| p.file_name())
.and_then(|s| s.to_str())
.map(String::from)
}

Read file character-by-character in Rust

Is there an idiomatic way to process a file one character at a time in Rust?
This seems to be roughly what I'm after:
let mut f = io::BufReader::new(try!(fs::File::open("input.txt")));
for c in f.chars() {
println!("Character: {}", c.unwrap());
}
But Read::chars is still unstable as of Rust v1.6.0.
I considered using Read::read_to_string, but the file may be large and I don't want to read it all into memory.
Let's compare 4 approaches.
1. Read::chars
You could copy the Read::chars implementation, but it is marked unstable with the note
"the semantics of a partial read/write of where errors happen is currently unclear and may change"
so some care must be taken. Anyway, this seems to be the best approach.
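As a rough idea of what such a hand-rolled reader could look like on stable Rust, the sketch below decodes one UTF-8 character at a time and never holds more than four bytes of its own. It is not the removed standard-library implementation; read_char and utf8_len are hypothetical names, and the error policy (InvalidData for malformed input, UnexpectedEof for a sequence cut short) is an assumption.

```rust
use std::io::{self, BufReader, Read};

// Length of a UTF-8 sequence as implied by its first byte.
fn utf8_len(first: u8) -> Option<usize> {
    match first {
        0x00..=0x7F => Some(1),
        0xC2..=0xDF => Some(2),
        0xE0..=0xEF => Some(3),
        0xF0..=0xF4 => Some(4),
        _ => None, // continuation byte or invalid lead byte
    }
}

// Hypothetical helper: reads one char, or Ok(None) at end of input.
fn read_char<R: Read>(reader: &mut R) -> io::Result<Option<char>> {
    let mut buf = [0u8; 4];
    match reader.read_exact(&mut buf[..1]) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => return Ok(None),
        Err(e) => return Err(e),
    }
    let len = utf8_len(buf[0])
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "invalid UTF-8 lead byte"))?;
    // A sequence cut short by end of input surfaces as UnexpectedEof.
    reader.read_exact(&mut buf[1..len])?;
    let s = std::str::from_utf8(&buf[..len])
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
    Ok(s.chars().next())
}

fn main() -> io::Result<()> {
    // A byte slice stands in for a file; BufReader::new(File::open(...)?) works too.
    let mut reader = BufReader::new("héllo".as_bytes());
    while let Some(c) = read_char(&mut reader)? {
        println!("Character: {}", c);
    }
    Ok(())
}
```

Wrapping the source in a BufReader matters here, because read_char issues many tiny reads.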
2. flat_map
The flat_map alternative does not compile:
use std::io::{BufRead, BufReader};
use std::fs::File;
pub fn main() {
let mut f = BufReader::new(File::open("input.txt").expect("open failed"));
for c in f.lines().flat_map(|l| l.expect("lines failed").chars()) {
println!("Character: {}", c);
}
}
The problem is that chars borrows from the string, but l.expect("lines failed") lives only inside the closure, so the compiler gives the error "borrowed value does not live long enough".
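One way to make the flat_map version compile, at the cost of an extra allocation per line, is to collect each line's characters into an owned Vec<char> before the String is dropped. This is only a sketch: line_chars is a hypothetical helper, and note that lines() strips the line terminators, so they are not yielded.

```rust
use std::io::BufRead;

// Hypothetical helper: yields chars line by line, newlines stripped by lines().
fn line_chars<R: BufRead>(reader: R) -> impl Iterator<Item = char> {
    reader.lines().flat_map(|line| {
        // The Vec<char> owns its data, so nothing borrows from the dropped String.
        line.expect("lines failed").chars().collect::<Vec<char>>()
    })
}

fn main() {
    // A byte slice implements BufRead; a BufReader<File> works the same way.
    for c in line_chars("ab\ncd".as_bytes()) {
        println!("Character: {}", c);
    }
}
```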
3. Nested for
This code
use std::io::{BufRead, BufReader};
use std::fs::File;
pub fn main() {
let mut f = BufReader::new(File::open("input.txt").expect("open failed"));
for line in f.lines() {
for c in line.expect("lines failed").chars() {
println!("Character: {}", c);
}
}
}
works, but it keeps allocating a string for each line. Besides, if there is no line break in the input file, the whole file would be loaded into memory.
4. BufRead::read_until
A memory-efficient alternative to approach 3 is to use BufRead::read_until and reuse a single buffer to read each line:
use std::io::{BufRead, BufReader};
use std::fs::File;
pub fn main() {
let mut f = BufReader::new(File::open("input.txt").expect("open failed"));
let mut buf = Vec::<u8>::new();
while f.read_until(b'\n', &mut buf).expect("read_until failed") != 0 {
// this moves the ownership of the read data to s
// there is no allocation
let s = String::from_utf8(buf).expect("from_utf8 failed");
for c in s.chars() {
println!("Character: {}", c);
}
// this returns the ownership of the read data to buf
// there is no allocation
buf = s.into_bytes();
buf.clear();
}
}
I cannot use lines() because my file could be a single line that is gigabytes in size. This is an improvement on @malbarbo's recommendation of copying Read::chars from an old version of Rust: the utf8-chars crate already adds .chars() to BufRead for you.
Inspecting their repository, it doesn't look like they load more than 4 bytes at a time.
Your code will look the same as it did before Rust removed Read::chars:
use std::io::stdin;
use utf8_chars::BufReadCharsExt;
fn main() {
for c in stdin().lock().chars().map(|x| x.unwrap()) {
println!("{}", c);
}
}
Add the following to your Cargo.toml:
[dependencies]
utf8-chars = "1.0.0"
There are two solutions that make sense here.
First, you could copy the implementation of Read::chars() and use it; that would make it completely trivial to move your code over to the standard library implementation if/when it stabilizes.
On the other hand, you could simply iterate line by line (using f.lines()) and then use line.chars() on each line to get the chars. This is a little more hacky, but it will definitely work.
If you only wanted one loop, you could use flat_map() with a lambda like |line| line.chars().

How would I create and use a string to string Hashmap in Rust?

How would I idiomatically create a string-to-string hashmap in Rust? The following works, but is it the right way to do it? Is there a different kind of string I should be using?
use std::collections::hashmap::HashMap;
//use std::str;
fn main() {
let mut mymap = HashMap::new();
mymap.insert("foo".to_string(), "bar".to_string());
println!("{0}", mymap["foo".to_string()]);
}
Assuming you would like the flexibility of String, HashMap<String, String> is correct. The other choice is &str, but that imposes significant restrictions on how the HashMap can be used and where it can be passed around; if it works for you, though, changing one or both parameters to &str will be more efficient. This choice should be dictated by what sort of ownership semantics you need and how dynamic the strings are; see this answer and the strings guide for more.
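To make the trade-off concrete, here is a small sketch in current Rust (HashMap::from on an array assumes Rust 1.56 or later). static_map and owned_map are hypothetical names: the first only works because every key and value is a literal that lives for the whole program, while the second owns strings that may have been built at run time.

```rust
use std::collections::HashMap;

// Keys and values known at compile time: &'static str avoids allocation.
fn static_map() -> HashMap<&'static str, &'static str> {
    HashMap::from([("foo", "bar"), ("baz", "qux")])
}

// Strings built at run time must be owned by the map.
fn owned_map(pairs: &[(String, String)]) -> HashMap<String, String> {
    pairs.iter().cloned().collect()
}

fn main() {
    let fixed = static_map();
    println!("{:?}", fixed.get("foo"));

    let key = format!("user-{}", 42);
    let dynamic = owned_map(&[(key, "alice".to_string())]);
    println!("{:?}", dynamic.get("user-42"));
}
```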
BTW, searching a HashMap<String, ...> with a String can be expensive: if you don't already have one, it requires allocating a new String. We have a workaround in the form of find_equiv, which allows you to pass a string literal (and, more generally, any &str) without allocating a new String:
use std::collections::HashMap;
fn main() {
let mut mymap = HashMap::new();
mymap.insert("foo".to_string(), "bar".to_string());
println!("{}", mymap.find_equiv(&"foo"));
println!("{}", mymap.find_equiv(&"not there"));
}
(Note that I've left the Option in the return value; one could call .unwrap() or handle a missing key properly.)
Another slightly different option (more general in some circumstances, less in others), is the std::string::as_string function, which allows viewing the data in &str as if it were a &String, without allocating (as the name suggests). It returns an object that can be dereferenced to a String, e.g.
use std::collections::HashMap;
use std::string;
fn main() {
let mut mymap = HashMap::new();
mymap.insert("foo".to_string(), "bar".to_string());
println!("{}", mymap[*string::as_string("foo")]);
}
(There is a similar std::vec::as_vec.)
Writing this answer for future readers. huon's answer was correct at the time, but the *_equiv methods were removed some time ago.
The HashMap documentation provides an example on using String-String hashmaps, where &str can be used.
The following code will work just fine. No new String allocation necessary:
use std::collections::HashMap;
fn main() {
let mut mymap = HashMap::new();
mymap.insert("foo".to_string(), "bar".to_string());
println!("{0}", mymap["foo"]);
println!("{0}", mymap.get("foo").unwrap());
}
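Both lines above panic if the key is absent (indexing panics, and so does unwrap on None). A minimal sketch of a non-panicking lookup, with lookup as a hypothetical helper and the fallback string chosen arbitrarily, could be:

```rust
use std::collections::HashMap;

// Hypothetical helper: borrowed lookup with a fallback instead of a panic.
fn lookup<'a>(map: &'a HashMap<String, String>, key: &str, default: &'a str) -> &'a str {
    // get() accepts &str because String implements Borrow<str>.
    map.get(key).map(String::as_str).unwrap_or(default)
}

fn main() {
    let mut mymap = HashMap::new();
    mymap.insert("foo".to_string(), "bar".to_string());
    println!("{}", lookup(&mymap, "foo", "<missing>"));
    println!("{}", lookup(&mymap, "not there", "<missing>"));
}
```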
