Is it possible to "format!" a &str / String in Rust? [duplicate] - string

I want to use the format! macro with a String as first argument, but because the macro expects a string literal, I am not able pass anything different to it.
I want to do this to dynamically add strings into the current string for use in a view engine. I'm open for suggestions if there might be a better way to do it.
let test = String::from("Test: {}");
let test2 = String::from("Not working!");
println!(test, test2);
What I actually want to achieve is the below example, where main.html contains {content}.
use std::io::prelude::*;
use std::fs::File;
use std::io;
fn main() {
let mut buffer = String::new();
read_from_file_using_try(&mut buffer);
println!(&buffer, content="content");
}
fn read_from_file_using_try(buffer: &mut String) -> Result<(), io::Error> {
let mut file = try!(File::open("main.html"));
try!(file.read_to_string(buffer));
Ok(())
}
So I want to print the contents of main.html after I formatted it.

Short answer: it cannot be done.
Long answer: the format! macro (and its derivatives) requires a string literal, that is a string known at compilation-time. In exchange for this requirement, if the arguments provided do not match the format, a compilation error is raised.
What you are looking for is known as a template engine. A non-exhaustive list of Rust template engines in no particular order:
Handlebars
Rustache
Maud
Horrorshow
fomat-macros
...
Template engines have different characteristics, and notably differ by the degree of validation occurring at compile-time or run-time and their flexibility (I seem to recall that Maud was very HTML-centric, for example). It's up to you to find the one most fitting for your use case.

Check out the strfmt library, it is the closest I've found to do dynamic string formatting.

Related

Rust Manipulating Strings in Functions

I'm new to Rust, and I want to process strings in a function in Rust and then return a struct that contains the results of that processing to use in more functions. This is very simplified and a bit messier because of all my attempts to get this working, but:
struct Strucc<'a> {
string: &'a str,
booool: bool
}
fn do_stuff2<'a>(input: &'a str) -> Result<Strucc, &str> {
let to_split = input.to_lowercase();
let splitter = to_split.split("/");
let mut array: Vec<&str> = Vec::new();
for split in splitter {
array.push(split);
}
let var = array[0];
println!("{}", var);
let result = Strucc{
string: array[0],
booool: false
};
Ok(result)
}
The issue is that to convert the &str to lowercase, I have to create a new String that's owned by the function.
As I understand it, the reason this won't compile is because when I split the new String I created, all the &strs I get from it are substrings of the String, which are all still owned by the function, and so when the value is returned and that String goes out of scope, the value in the struct I returned gets erased.
I tried to fix this with lifetimes (as you can see in the function definition), but from what I can tell I'd have to give the String a lifetime which I can't do as far as I'm aware because it isn't borrowed. Either that or I need to make the struct own that String (which I also don't understand how to do, nor does it seem reasonable as I'd have to make the struct mutable).
Also as a sidenote: Previously I have tried to just use a String in the struct but I want to define constants which won't work with that, and I still don't think it would solve the issue. I've also tried to use .clone() in various places just in case but had no luck (though I know why this shouldn't work anyway).
I have been looking for some solution for this for hours and it feels like such a small step so I feel I may be asking the wrong questions or have missed something simple but please explain it like I'm five because I'm very confused.
I think you misunderstand what &str actually is. &str is just a pointer to the string data plus a length. The point of &str is to be an immutable reference to a specific string, which enables all sorts of nice optimizations. When you attempt to turn the &str lowercase, Rust needs somewhere to put the data, and the only place to put it would be a String, because Strings own their data. Take a look at this post for more information.
Your goal is unachievable without Strucc containing a String, since .to_lowercase() has to create new data, and you have to allocate the resulting data somewhere in order to own a reference to it. The best place to put the resulting data would be the returned struct, i.e. Strucc, and therefore Strucc must contain a String.
Also as a sidenote: Previously I have tried to just use a String in the struct but I want to define constants which won't work with that, and I still don't think it would solve the issue.
You can use "x".to_owned() to create a String literal.
If you're trying to create a global constant, look at once_cell's lazy global initialization.

How to return a Result containing a formatted string, as &str [duplicate]

There are several questions that seem to be about the same problem I'm having. For example see here and here. Basically I'm trying to build a String in a local function, but then return it as a &str. Slicing isn't working because the lifetime is too short. I can't use str directly in the function because I need to build it dynamically. However, I'd also prefer not to return a String since the nature of the object this is going into is static once it's built. Is there a way to have my cake and eat it too?
Here's a minimal non-compiling reproduction:
fn return_str<'a>() -> &'a str {
let mut string = "".to_string();
for i in 0..10 {
string.push_str("ACTG");
}
&string[..]
}
No, you cannot do it. There are at least two explanations why it is so.
First, remember that references are borrowed, i.e. they point to some data but do not own it, it is owned by someone else. In this particular case the string, a slice to which you want to return, is owned by the function because it is stored in a local variable.
When the function exits, all its local variables are destroyed; this involves calling destructors, and the destructor of String frees the memory used by the string. However, you want to return a borrowed reference pointing to the data allocated for that string. It means that the returned reference immediately becomes dangling - it points to invalid memory!
Rust was created, among everything else, to prevent such problems. Therefore, in Rust it is impossible to return a reference pointing into local variables of the function, which is possible in languages like C.
There is also another explanation, slightly more formal. Let's look at your function signature:
fn return_str<'a>() -> &'a str
Remember that lifetime and generic parameters are, well, parameters: they are set by the caller of the function. For example, some other function may call it like this:
let s: &'static str = return_str();
This requires 'a to be 'static, but it is of course impossible - your function does not return a reference to a static memory, it returns a reference with a strictly lesser lifetime. Thus such function definition is unsound and is prohibited by the compiler.
Anyway, in such situations you need to return a value of an owned type, in this particular case it will be an owned String:
fn return_str() -> String {
let mut string = String::new();
for _ in 0..10 {
string.push_str("ACTG");
}
string
}
In certain cases, you are passed a string slice and may conditionally want to create a new string. In these cases, you can return a Cow. This allows for the reference when possible and an owned String otherwise:
use std::borrow::Cow;
fn return_str<'a>(name: &'a str) -> Cow<'a, str> {
if name.is_empty() {
let name = "ACTG".repeat(10);
name.into()
} else {
name.into()
}
}
You can choose to leak memory to convert a String to a &'static str:
fn return_str() -> &'static str {
let string = "ACTG".repeat(10);
Box::leak(string.into_boxed_str())
}
This is a really bad idea in many cases as the memory usage will grow forever every time this function is called.
If you wanted to return the same string every call, see also:
How to create a static string at compile time
The problem is that you are trying to create a reference to a string that will disappear when the function returns.
A simple solution in this case is to pass in the empty string to the function. This will explicitly ensure that the referred string will still exist in the scope where the function returns:
fn return_str(s: &mut String) -> &str {
for _ in 0..10 {
s.push_str("ACTG");
}
&s[..]
}
fn main() {
let mut s = String::new();
let s = return_str(&mut s);
assert_eq!("ACTGACTGACTGACTGACTGACTGACTGACTGACTGACTG", s);
}
Code in Rust Playground:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=2499ded42d3ee92d6023161fe82e9b5f
This is an old question but a very common one. There are many answers but none of them addresses the glaring misconception people have about the strings and string slices, which stems from not knowing their true nature.
But lets start with the obvious question before addressing the implied one: Can we return a reference to a local variable?
What we are asking to achieve is the textbook definition of a dangling pointer. Local variables will be dropped when the function completes its execution. In other words they will be pop off the execution stack and any reference to the local variables then on will be pointing to some garbage data.
Best course of action is either returning the string or its clone. No need to obsess over the speed.
However, I believe the essence of the question is if there is a way to convert a String into an str? The answer is no and this is where the misconception lies:
You can not turn a String into an str by borrowing it. Because a String is heap allocated. If you take a reference to it, you still be using heap allocated data but through a reference. str, on the other hand, is stored directly in the data section of the executable file and it is static. When you take a reference to a string, you will get matching type signature for common string manipulations, not an actual &str.
You can check out this post for detailed explanation:
What are the differences between Rust's `String` and `str`?
Now, there may be a workaround for this particular use case if you absolutely use static text:
Since you use combinations of four bases A, C, G, T, in groups of four, you can create a list of all possible outcomes as &str and use them through some data structure. You will jump some hoops but certainly doable.
if it is possible to create the resulting STRING in a static way at compile time, this would be a solution without memory leaking
#[macro_use]
extern crate lazy_static;
fn return_str<'a>() -> &'a str {
lazy_static! {
static ref STRING: String = {
"ACTG".repeat(10)
};
}
&STRING
}
Yes you can - the method replace_range provides a work around -
let a = "0123456789";
//println!("{}",a[3..5]); fails - doesn't have a size known at compile-time
let mut b = String::from(a);
b.replace_range(5..,"");
b.replace_range(0..2,"");
println!("{}",b); //succeeds
It took blood sweat and tears to achieve this!

Formatting a const string in Rust [duplicate]

This extremely simple Rust program:
fn main() {
let c = "hello";
println!(c);
}
throws the following compile-time error:
error: expected a literal
--> src/main.rs:3:14
|
3 | println!(c);
| ^
In previous versions of Rust, the error said:
error: format argument must be a string literal.
println!(c);
^
Replacing the program with:
fn main() {
println!("Hello");
}
Works fine.
The meaning of this error isn't clear to me and a Google search hasn't really shed light on it. Why does passing c to the println! macro cause a compile time error? This seems like quite unusual behaviour.
This should work:
fn main() {
let c = "hello";
println!("{}", c);
}
The string "{}" is a template where {} will be replaced by the next argument passed to println!.
TL;DR If you don't care why and just want to fix it, see the sibling answer.
The reason that
fn main() {
let c = "hello";
println!(c);
}
Cannot work is because the println! macro looks at the string at compile time and verifies that the arguments and argument specifiers match in amount and type (this is a very good thing!). At this point in time, during macro evaluation, it's not possible to tell that c came from a literal or a function or what have you.
Here's an example of what the macro expands out to:
let c = "hello";
match (&c,) {
(__arg0,) => {
#[inline]
#[allow(dead_code)]
static __STATIC_FMTSTR: &'static [&'static str] = &[""];
::std::io::stdio::println_args(&::std::fmt::Arguments::new(
__STATIC_FMTSTR,
&[::std::fmt::argument(::std::fmt::Show::fmt, __arg0)]
))
}
};
I don't think that it's actually impossible for the compiler to figure this out, but it would probably take a lot of work with potentially little gain. Macros operate on portions of the AST and the AST only has type information. To work in this case, the AST would have to include the source of the identifier and enough information to determine it's acceptable to be used as a format string. In addition, it might interact poorly with type inference - you'd want to know the type before it's been picked yet!
The error message asks for a "string literal". What does the word "literal" mean? asks about what that means, which links to the Wikipedia entry:
a literal is a notation for representing a fixed value in source code
"foo" is a string literal, 8 is a numeric literal. let s = "foo" is a statement that assigns the value of a string literal to an identifier (variable). println!(s) is a statement that provides an identifier to the macro.
If you really want to define the first argument of println! in one place, I found a way to do it. You can use a macro:
macro_rules! hello {() => ("hello")};
println!(hello!());
Doesn't look too useful here, but I wanted to use the same formatting in a few places, and in this case the method was very helpful:
macro_rules! cell_format {() => ("{:<10}")}; // Pads with spaces on right
// to fill up 10 characters
println!(cell_format!(), "Foo");
println!(cell_format!(), 456);
The macro saved me from having to duplicate the formatting option in my code.
You could also, obviously, make the macro more fancy and take arguments if necessary to print different things with different arguments.
If your format string will be reused only a moderate number of times, and only some variable data will be changed, then a small function may be a better option than a macro:
fn pr(x: &str) {
println!("Some stuff that will always repeat, something variable: {}", x);
}
pr("I am the variable data");
Outputs
Some stuff that will always repeat, something variable: I am the variable data

How can I use a dynamic format string with the format! macro?

I want to use the format! macro with a String as first argument, but because the macro expects a string literal, I am not able pass anything different to it.
I want to do this to dynamically add strings into the current string for use in a view engine. I'm open for suggestions if there might be a better way to do it.
let test = String::from("Test: {}");
let test2 = String::from("Not working!");
println!(test, test2);
What I actually want to achieve is the below example, where main.html contains {content}.
use std::io::prelude::*;
use std::fs::File;
use std::io;
fn main() {
let mut buffer = String::new();
read_from_file_using_try(&mut buffer);
println!(&buffer, content="content");
}
fn read_from_file_using_try(buffer: &mut String) -> Result<(), io::Error> {
let mut file = try!(File::open("main.html"));
try!(file.read_to_string(buffer));
Ok(())
}
So I want to print the contents of main.html after I formatted it.
Short answer: it cannot be done.
Long answer: the format! macro (and its derivatives) requires a string literal, that is a string known at compilation-time. In exchange for this requirement, if the arguments provided do not match the format, a compilation error is raised.
What you are looking for is known as a template engine. A non-exhaustive list of Rust template engines in no particular order:
Handlebars
Rustache
Maud
Horrorshow
fomat-macros
...
Template engines have different characteristics, and notably differ by the degree of validation occurring at compile-time or run-time and their flexibility (I seem to recall that Maud was very HTML-centric, for example). It's up to you to find the one most fitting for your use case.
Check out the strfmt library, it is the closest I've found to do dynamic string formatting.

Is this the right way to read lines from file and split them into words in Rust?

Editor's note: This code example is from a version of Rust prior to 1.0 and is not syntactically valid Rust 1.0 code. Updated versions of this code produce different errors, but the answers still contain valuable information.
I've implemented the following method to return me the words from a file in a 2 dimensional data structure:
fn read_terms() -> Vec<Vec<String>> {
let path = Path::new("terms.txt");
let mut file = BufferedReader::new(File::open(&path));
return file.lines().map(|x| x.unwrap().as_slice().words().map(|x| x.to_string()).collect()).collect();
}
Is this the right, idiomatic and efficient way in Rust? I'm wondering if collect() needs to be called so often and whether it's necessary to call to_string() here to allocate memory. Maybe the return type should be defined differently to be more idiomatic and efficient?
There is a shorter and more readable way of getting words from a text file.
use std::io::{BufRead, BufReader};
use std::fs::File;
let reader = BufReader::new(File::open("file.txt").expect("Cannot open file.txt"));
for line in reader.lines() {
for word in line.unwrap().split_whitespace() {
println!("word '{}'", word);
}
}
You could instead read the entire file as a single String and then build a structure of references that points to the words inside:
use std::io::{self, Read};
use std::fs::File;
fn filename_to_string(s: &str) -> io::Result<String> {
let mut file = File::open(s)?;
let mut s = String::new();
file.read_to_string(&mut s)?;
Ok(s)
}
fn words_by_line<'a>(s: &'a str) -> Vec<Vec<&'a str>> {
s.lines().map(|line| {
line.split_whitespace().collect()
}).collect()
}
fn example_use() {
let whole_file = filename_to_string("terms.txt").unwrap();
let wbyl = words_by_line(&whole_file);
println!("{:?}", wbyl)
}
This will read the file with less overhead because it can slurp it into a single buffer, whereas reading lines with BufReader implies a lot of copying and allocating, first into the buffer inside BufReader, and then into a newly allocated String for each line, and then into a newly allocated the String for each word. It will also use less memory, because the single large String and vectors of references are more compact than many individual Strings.
A drawback is that you can't directly return the structure of references, because it can't live past the stack frame the holds the single large String. In example_use above, we have to put the large String into a let in order to call words_by_line. It is possible to get around this with unsafe code and wrapping the String and references in a private struct, but that is much more complicated.

Resources