How to efficiently push displayable item into String? [duplicate] - rust

This question already has an answer here:
How can I append a formatted string to an existing String?
(1 answer)
Closed 5 years ago.
See this example:
fn concat<T: std::fmt::Display>(s: &mut String, thing: T) {
// TODO
}
fn main() {
let mut s = "Hello ".into();
concat(&mut s, 42);
assert_eq!(&s, "Hello 42");
}
I know that I can use this:
s.push_str(&format!("{}", thing))
but this is not the most efficient, because format! allocate a String that is not necessary.
The most efficient is to write directly the string representation of the displayable item into the String buffer. How to do this?

There are multiple formatting macros, and in your case you want the write! macro:
use std::fmt::{Display, Write};
fn concat<T: Display>(s: &mut String, thing: &T) {
write!(s, "{}", thing).unwrap();
}
fn main() {
let mut s = "Hello ".into();
concat(&mut s, &42);
assert_eq!(&s, "Hello 42");
}
Anything that implements one of the Write trait (and String does) is a valid target for write!.
Note: actually anything that implements a write_fmt method, as macro don't care much about semantics; which is why either fmt::Write or io::Write work.

Related

How to define one function for two types in Rust?

I am writing a Rust application. I'd like to have a method to display the text whether it is a string or number. Furthermore, I came up with the following solution, but it is duplicating the code. Is there a better way to do it in Rust?
Notice: I am not looking for a built-in function to print variables. It's just an example. I am looking for a way to implement the same feature for two types.
trait Display {
fn display(self);
}
impl Display for String {
fn display(self) -> () {
println!("You wrote: {}", self);
}
}
impl Display for i32 {
fn display(self) -> () {
println!("You wrote: {}", self);
}
}
fn main() {
let name: String = String::from("Tom Smykowski");
name.display();
let age: i32 = 22;
age.display();
}
You came close. But there is already a trait for converting things to strings - std::fmt::Display (and the automatically-implemented ToString) - so you don't need to have your own trait:
fn display<T: std::fmt::Display>(v: T) {
println!("You wrote: {v}");
}
fn main() {
let name: String = String::from("Tom Smykowski");
display(name);
let age: i32 = 22;
display(age);
}
Even if you don't need to display the types but do something else with them, we can take the idea from Display - instead of defining the whole functionality, define only the pieces that are different. For example, you can create a trait to convert the numbers to strings (or the opposite), or just have functions for each different piece - for example, printing itself without "You wrote: ".
I came up with the following solution, but it is duplicating the code. Is there a better way to do it in Rust?
Add a simple declarative macro on top, that is very common in the stdlib and all. e.g.
macro_rules! impl_display {
($t:ty) => {
impl Display for $t {
fn display(self) {
println!("You wrote {self}");
}
}
}
}
impl_display!(String);
impl_display!(i32);
impl_display!(i64);
impl_display!(f32);
Although:
usually the implementations would be different, though not always e.g. implementing an operation on all numeric types, or all unsigned numbers, that's one of the most common context you'll see it in the stdlib: the stdlib has no numeric trait but methods are usually implemented on all numeric types, so there's a handful of macros used for all of them, and when new methods are added they're just added to the relevant macro
here you're already relying on the existence and implementation of std::fmt::Display so you should just use that, your trait is not really useful

Creating an owned iterator with lines

I am learning Rust by following the Rust Book and I am currently trying to modify the project in Chapter 12, but I can't understand why my code is not working.
The function in question is the search function
fn search(query: &str, contents: String) -> Vec<String> {
contents.lines().filter(|line| line.contains(query)).collect()
}
which is supposed to get contents of a file as a string and return a collection of the lines in the file containing query. In this form, it throws the error "a value of type std::vec::Vec<std::string::String> cannot be built from an iterator over elements of type &str".
I think that the error comes from the use of lines since it doesn't take ownership of contents. My question is if there is a better way to do this or if there is a similar method to lines that does take ownership.
As a Rust learner it is important to know the differences between strings (String) and string slices (&str), and how those two types interact.
The lines() method of an &str returns the lines as iterator over &str, which is the same type.
However the lines method of a string also returns an iterator over &str, which is the same type as before but in this case not the same type as the input.
This means, your output will be of type Vec<&str>.
However in that case you need a lifetime because otherwise you can't return a reference. In this case your example would look like this:
fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
contents.lines().filter(|line| line.contains(query)).collect()
}
fn main() {
println!("found: {:?}",search("foo", "the foot\nof the\nfool"));
}
However if you want the vector to contain strings, you can use the to_owned() function to convert a &str into a String:
fn search(query: &str, contents: &str) -> Vec<String> {
contents.lines().map(|line| line.to_owned()).filter(|line| line.contains(query)).collect()
}
fn main() {
println!("{:?}",search("foo", "the foot\nof the\nfool"));
}
However this is inefficient because some strings are created that aren't used so it is better to map last:
fn search(query: &str, contents: &str) -> Vec<String> {
contents.lines().filter(|line| line.contains(query)).map(|line| line.to_owned()).collect()
}
fn main() {
println!("{:?}",search("foo", "the foot\nof the\nfool"));
}
Or with contents of type String, but I think this doesn't make much sense:
fn search(query: &str, contents: String) -> Vec<String> {
contents.lines().map(|line| line.to_owned()).filter(|line| line.contains(query)).collect()
}
fn main() {
println!("{:?}",search("foo", "the foot\nof the\nfool".to_owned()));
}
Explanation: Passing contents as a String isn't very useful because the search function will own it, but it is not mutable, so you can't change it to the search result, and also your search result is a vector, and you can't transform a single owned String into multiple owned ones.
P.S.: I'm also relatively new to Rust, so feel free to comment or edit my post if I missed something.

Attempt to implement sscanf in Rust, failing when passing &str as argument

Problem:
Im new to Rust, and im trying to implement a macro which simulates sscanf from C.
So far it works with any numeric types, but not with strings, as i am already trying to parse a string.
macro_rules! splitter {
( $string:expr, $sep:expr) => {
let mut iter:Vec<&str> = $string.split($sep).collect();
iter
}
}
macro_rules! scan_to_types {
($buffer:expr,$sep:expr,[$($y:ty),+],$($x:expr),+) => {
let res = splitter!($buffer,$sep);
let mut i = 0;
$(
$x = res[i].parse::<$y>().unwrap_or_default();
i+=1;
)*
};
}
fn main() {
let mut a :u8; let mut b :i32; let mut c :i16; let mut d :f32;
let buffer = "00:98;76,39.6";
let sep = [':',';',','];
scan_to_types!(buffer,sep,[u8,i32,i16,f32],a,b,c,d); // this will work
println!("{} {} {} {}",a,b,c,d);
}
This obviously wont work, because at compile time, it will try to parse a string slice to str:
let a :u8; let b :i32; let c :i16; let d :f32; let e :&str;
let buffer = "02:98;abc,39.6";
let sep = [':',';',','];
scan_to_types!(buffer,sep,[u8,i32,&str,f32],a,b,e,d);
println!("{} {} {} {}",a,b,e,d);
$x = res[i].parse::<$y>().unwrap_or_default();
| ^^^^^ the trait `FromStr` is not implemented for `&str`
What i have tried
I have tried to compare types using TypeId, and a if else condition inside of the macro to skip the parsing, but the same situation happens, because it wont expand to a valid code:
macro_rules! scan_to_types {
($buffer:expr,$sep:expr,[$($y:ty),+],$($x:expr),+) => {
let res = splitter!($buffer,$sep);
let mut i = 0;
$(
if TypeId::of::<$y>() == TypeId::of::<&str>(){
$x = res[i];
}else{
$x = res[i].parse::<$y>().unwrap_or_default();
}
i+=1;
)*
};
}
Is there a way to set conditions or skip a repetition inside of a macro ? Or instead, is there a better aproach to build sscanf using macros ? I have already made functions which parse those strings, but i couldnt pass types as arguments, or make them generic.
Note before the answer: you probably don't want to emulate sscanf() in Rust. There are many very capable parsers in Rust, so you should probably use one of them.
Simple answer: the simplest way to address your problem is to replace the use of &str with String, which makes your macro compile and run. If your code is not performance-critical, that is probably all you need. If you care about performance and about avoiding allocation, read on.
A downside of String is that under the hood it copies the string data from the string you're scanning into a freshly allocated owned string. Your original approach of using an &str should have allowed for your &str to directly point into the data that was scanned, without any copying. Ideally we'd like to write something like this:
trait MyFromStr {
fn my_from_str(s: &str) -> Self;
}
// when called on a type that impls `FromStr`, use `parse()`
impl<T: FromStr + Default> MyFromStr for T {
fn my_from_str(s: &str) -> T {
s.parse().unwrap_or_default()
}
}
// when called on &str, just return it without copying
impl MyFromStr for &str {
fn my_from_str(s: &str) -> &str {
s
}
}
Unfortunately that doesn't compile, complaining of a "conflicting implementation of trait MyFromStr for &str", even though there is no conflict between the two implementations, as &str doesn't implement FromStr. But the way Rust currently works, a blanket implementation of a trait precludes manual implementations of the same trait, even on types not covered by the blanket impl.
In the future this will be resolved by specialization. Specialization is not yet part of stable Rust, and might not come to stable Rust for years, so we have to think of another solution. In case of macro usage, we can just let the compiler "specialize" for us by creating two traits with the same name. (This is similar to the autoref-based specialization invented by David Tolnay, but even simpler because it doesn't require autoref resolution to work, as we have the types provided explicitly.)
We create separate traits for parsed and unparsed values, and implement them as needed:
trait ParseFromStr {
fn my_from_str(s: &str) -> Self;
}
impl<T: FromStr + Default> ParseFromStr for T {
fn my_from_str(s: &str) -> T {
s.parse().unwrap_or_default()
}
}
pub trait StrFromStr {
fn my_from_str(s: &str) -> &str;
}
impl StrFromStr for &str {
fn my_from_str(s: &str) -> &str {
s
}
}
Then in the macro we just call <$y>::my_from_str() and let the compiler generate the correct code. Since macros are untyped, this works because we never need to provide a single "trait bound" that would disambiguate which my_from_str() we want. (Such a trait bound would require specialization.)
macro_rules! scan_to_types {
($buffer:expr,$sep:expr,[$($y:ty),+],$($x:expr),+) => {
#[allow(unused_assignments)]
{
let res = splitter!($buffer,$sep);
let mut i = 0;
$(
$x = <$y>::my_from_str(&res[i]);
i+=1;
)*
}
};
}
Complete example in the playground.

Cannot split a string into string slices with explicit lifetimes because the string does not live long enough

I'm writing a library that should read from something implementing the BufRead trait; a network data stream, standard input, etc. The first function is supposed to read a data unit from that reader and return a populated struct filled mostly with &'a str values parsed from a frame from the wire.
Here is a minimal version:
mod mymod {
use std::io::prelude::*;
use std::io;
pub fn parse_frame<'a, T>(mut reader: T)
where
T: BufRead,
{
for line in reader.by_ref().lines() {
let line = line.expect("reading header line");
if line.len() == 0 {
// got empty line; done with header
break;
}
// split line
let splitted = line.splitn(2, ':');
let line_parts: Vec<&'a str> = splitted.collect();
println!("{} has value {}", line_parts[0], line_parts[1]);
}
// more reads down here, therefore the reader.by_ref() above
// (otherwise: use of moved value).
}
}
use std::io;
fn main() {
let stdin = io::stdin();
let locked = stdin.lock();
mymod::parse_frame(locked);
}
An error shows up which I cannot fix after trying different solutions:
error: `line` does not live long enough
--> src/main.rs:16:28
|
16 | let splitted = line.splitn(2, ':');
| ^^^^ does not live long enough
...
20 | }
| - borrowed value only lives until here
|
note: borrowed value must be valid for the lifetime 'a as defined on the body at 8:4...
--> src/main.rs:8:5
|
8 | / {
9 | | for line in reader.by_ref().lines() {
10 | | let line = line.expect("reading header line");
11 | | if line.len() == 0 {
... |
22 | | // (otherwise: use of moved value).
23 | | }
| |_____^
The lifetime 'a is defined on a struct and implementation of a data keeper structure because the &str requires an explicit lifetime. These code parts were removed as part of the minimal example.
BufReader has a lines() method which returns Result<String, Err>. I handle errors using expect or match and thus unpack the Result so that the program now has the bare String. This will then be done multiple times to populate a data structure.
Many answers say that the unwrap result needs to be bound to a variable otherwise it gets lost because it is a temporary value. But I already saved the unpacked Result value in the variable line and I still get the error.
How to fix this error - could not get it working after hours trying.
Does it make sense to do all these lifetime declarations just for &str in a data keeper struct? This will be mostly a readonly data structure, at most replacing whole field values. String could also be used, but have found articles saying that String has lower performance than &str - and this frame parser function will be called many times and is performance-critical.
Similar questions exist on Stack Overflow, but none quite answers the situation here.
For completeness and better understanding, following is an excerpt from complete source code as to why lifetime question came up:
Data structure declaration:
// tuple
pub struct Header<'a>(pub &'a str, pub &'a str);
pub struct Frame<'a> {
pub frameType: String,
pub bodyType: &'a str,
pub port: &'a str,
pub headers: Vec<Header<'a>>,
pub body: Vec<u8>,
}
impl<'a> Frame<'a> {
pub fn marshal(&'a self) {
//TODO
println!("marshal!");
}
}
Complete function definition:
pub fn parse_frame<'a, T>(mut reader: T) -> Result<Frame<'a>, io::Error> where T: BufRead {
Your problem can be reduced to this:
fn foo<'a>() {
let thing = String::from("a b");
let parts: Vec<&'a str> = thing.split(" ").collect();
}
You create a String inside your function, then declare that references to that string are guaranteed to live for the lifetime 'a. Unfortunately, the lifetime 'a isn't under your control — the caller of the function gets to pick what the lifetime is. That's how generic parameters work!
What would happen if the caller of the function specified the 'static lifetime? How would it be possible for your code, which allocates a value at runtime, to guarantee that the value lives longer than even the main function? It's not possible, which is why the compiler has reported an error.
Once you've gained a bit more experience, the function signature fn foo<'a>() will jump out at you like a red alert — there's a generic parameter that isn't used. That's most likely going to mean bad news.
return a populated struct filled mostly with &'a str
You cannot possibly do this with the current organization of your code. References have to point to something. You are not providing anywhere for the pointed-at values to live. You cannot return an allocated String as a string slice.
Before you jump to it, no you cannot store a value and a reference to that value in the same struct.
Instead, you need to split the code that creates the String and that which parses a &str and returns more &str references. That's how all the existing zero-copy parsers work. You could look at those for inspiration.
String has lower performance than &str
No, it really doesn't. Creating lots of extraneous Strings is a bad idea, sure, just like allocating too much is a bad idea in any language.
Maybe the following program gives clues for others who also also having their first problems with lifetimes:
fn main() {
// using String und &str Slice
let my_str: String = "fire".to_owned();
let returned_str: MyStruct = my_func_str(&my_str);
println!("Received return value: {ret}", ret = returned_str.version);
// using Vec<u8> und &[u8] Slice
let my_vec: Vec<u8> = "fire".to_owned().into_bytes();
let returned_u8: MyStruct2 = my_func_vec(&my_vec);
println!("Received return value: {ret:?}", ret = returned_u8.version);
}
// using String -> str
fn my_func_str<'a>(some_str: &'a str) -> MyStruct<'a> {
MyStruct {
version: &some_str[0..2],
}
}
struct MyStruct<'a> {
version: &'a str,
}
// using Vec<u8> -> & [u8]
fn my_func_vec<'a>(some_vec: &'a Vec<u8>) -> MyStruct2<'a> {
MyStruct2 {
version: &some_vec[0..2],
}
}
struct MyStruct2<'a> {
version: &'a [u8],
}

Why is it discouraged to accept a reference &String, &Vec, or &Box as a function argument?

I wrote some Rust code that takes a &String as an argument:
fn awesome_greeting(name: &String) {
println!("Wow, you are awesome, {}!", name);
}
I've also written code that takes in a reference to a Vec or Box:
fn total_price(prices: &Vec<i32>) -> i32 {
prices.iter().sum()
}
fn is_even(value: &Box<i32>) -> bool {
**value % 2 == 0
}
However, I received some feedback that doing it like this isn't a good idea. Why not?
TL;DR: One can instead use &str, &[T] or &T to allow for more generic code.
One of the main reasons to use a String or a Vec is because they allow increasing or decreasing the capacity. However, when you accept an immutable reference, you cannot use any of those interesting methods on the Vec or String.
Accepting a &String, &Vec or &Box also requires the argument to be allocated on the heap before you can call the function. Accepting a &str allows a string literal (saved in the program data) and accepting a &[T] or &T allows a stack-allocated array or variable. Unnecessary allocation is a performance loss. This is usually exposed right away when you try to call these methods in a test or a main method:
awesome_greeting(&String::from("Anna"));
total_price(&vec![42, 13, 1337])
is_even(&Box::new(42))
Another performance consideration is that &String, &Vec and &Box introduce an unnecessary layer of indirection as you have to dereference the &String to get a String and then perform a second dereference to end up at &str.
Instead, you should accept a string slice (&str), a slice (&[T]), or just a reference (&T). A &String, &Vec<T> or &Box<T> will be automatically coerced (via deref coercion) to a &str, &[T] or &T, respectively.
fn awesome_greeting(name: &str) {
println!("Wow, you are awesome, {}!", name);
}
fn total_price(prices: &[i32]) -> i32 {
prices.iter().sum()
}
fn is_even(value: &i32) -> bool {
*value % 2 == 0
}
Now you can call these methods with a broader set of types. For example, awesome_greeting can be called with a string literal ("Anna") or an allocated String. total_price can be called with a reference to an array (&[1, 2, 3]) or an allocated Vec.
If you'd like to add or remove items from the String or Vec<T>, you can take a mutable reference (&mut String or &mut Vec<T>):
fn add_greeting_target(greeting: &mut String) {
greeting.push_str("world!");
}
fn add_candy_prices(prices: &mut Vec<i32>) {
prices.push(5);
prices.push(25);
}
Specifically for slices, you can also accept a &mut [T] or &mut str. This allows you to mutate a specific value inside the slice, but you cannot change the number of items inside the slice (which means it's very restricted for strings):
fn reset_first_price(prices: &mut [i32]) {
prices[0] = 0;
}
fn lowercase_first_ascii_character(s: &mut str) {
if let Some(f) = s.get_mut(0..1) {
f.make_ascii_lowercase();
}
}
In addition to Shepmaster's answer, another reason to accept a &str (and similarly &[T] etc) is because of all of the other types besides String and &str that also satisfy Deref<Target = str>. One of the most notable examples is Cow<str>, which lets you be very flexible about whether you are dealing with owned or borrowed data.
If you have:
fn awesome_greeting(name: &String) {
println!("Wow, you are awesome, {}!", name);
}
But you need to call it with a Cow<str>, you'll have to do this:
let c: Cow<str> = Cow::from("hello");
// Allocate an owned String from a str reference and then makes a reference to it anyway!
awesome_greeting(&c.to_string());
When you change the argument type to &str, you can use Cow seamlessly, without any unnecessary allocation, just like with String:
let c: Cow<str> = Cow::from("hello");
// Just pass the same reference along
awesome_greeting(&c);
let c: Cow<str> = Cow::from(String::from("hello"));
// Pass a reference to the owned string that you already have
awesome_greeting(&c);
Accepting &str makes calling your function more uniform and convenient, and the "easiest" way is now also the most efficient. These examples will also work with Cow<[T]> etc.
The recommendation is using &str over &String because &str also satisfies &String which could be used for both owned strings and the string slices but not the other way around:
use std::borrow::Cow;
fn greeting_one(name: &String) {
println!("Wow, you are awesome, {}!", name);
}
fn greeting_two(name: &str) {
println!("Wow, you are awesome, {}!", name);
}
fn main() {
let s1 = "John Doe".to_string();
let s2 = "Jenny Doe";
let s3 = Cow::Borrowed("Sally Doe");
let s4 = Cow::Owned("Sally Doe".to_string());
greeting_one(&s1);
// greeting_one(&s2); // Does not compile
// greeting_one(&s3); // Does not compile
greeting_one(&s4);
greeting_two(&s1);
greeting_two(s2);
greeting_two(&s3);
greeting_two(&s4);
}
Using vectors to manipulate text is never a good idea and does not even deserve discussion because you will loose all the sanity checks and performance optimizations. String type uses vector internally anyway. Remember, Rust uses UTF-8 for strings for storage efficiency. If you use vector, you have to repeat all the hard work. Other than that, borrowing vectors or boxed values should be OK.
Because those types can be coerced, so if we use those types functions will accept less types:
1- a reference to String can be coerced to a str slice. For example create a function:
fn count_wovels(words:&String)->usize{
let wovels_count=words.chars().into_iter().filter(|x|(*x=='a') | (*x=='e')| (*x=='i')| (*x=='o')|(*x=='u')).count();
wovels_count
}
if you pass &str, it will not be accepted:
let name="yilmaz".to_string();
println!("{}",count_wovels(&name));
// this is not allowed because argument should be &String but we are passing str
// println!("{}",wovels("yilmaz"))
But if that function accepts &str instead
// words:&str
fn count_wovels(words:&str)->usize{ ... }
we can pass both types to the function
let name="yilmaz".to_string();
println!("{}",count_wovels(&name));
println!("{}",wovels("yilmaz"))
With this, our function can accept more types
2- Similary, a reference to Box &Box[T], will be coerced to the reference to the value inside the Box Box[&T]. for example
fn length(name:&Box<&str>){
println!("lenght {}",name.len())
}
this accepts only &Box<&str> type
let boxed_str=Box::new("Hello");
length(&boxed_str);
// expected reference `&Box<&str>` found reference `&'static str`
// length("hello")
If we pass &str as type, we can pass both types
3- Similar relation exists between ref to a Vec and ref to an array
fn square(nums:&Vec<i32>){
for num in nums{
println!("square of {} is {}",num,num*num)
}
}
fn main(){
let nums=vec![1,2,3,4,5];
let nums_array=[1,2,3,4,5];
// only &Vec<i32> is accepted
square(&nums);
// mismatched types: mismatched types expected reference `&Vec<i32>` found reference `&[{integer}; 5]`
//square(&nums_array)
}
this will work for both types
fn square(nums:&[i32]){..}

Resources