I was reading through the book section about Strings and found they were using &* combined together to convert a piece of text. The following is what it says:
use std::net::TcpStream;
TcpStream::connect("192.168.0.1:3000"); // Parameter is of type &str.
let addr_string = "192.168.0.1:3000".to_string();
TcpStream::connect(&*addr_string); // Convert `addr_string` to &str.
In other words, they are saying they are converting a String to a &str. But why is that conversion done using both of the aforementioned signs? Should this not be done using some other method? Does not the & mean we are taking its reference, then using the * to dereference it?
In short: the * triggers an explicit deref, which can be overloaded via ops::Deref.
More Detail
Look at this code:
let s = "hi".to_string(); // : String
let a = &s;
What's the type of a? It's simply &String! This shouldn't be very surprising, since we take the reference of a String. Ok, but what about this?
let s = "hi".to_string(); // : String
let b = &*s; // equivalent to `&(*s)`
What's the type of b? It's &str! Wow, what happened?
Note that *s is executed first. Like most operators, the dereference operator * is overloadable, and using it can be considered syntax sugar for *std::ops::Deref::deref(&s) (note that we are recursively dereferencing here!). String does overload this operator:
impl Deref for String {
type Target = str;
fn deref(&self) -> &str { ... }
}
So, *s is actually *std::ops::Deref::deref(&s), in which the deref() function has the return type &str which is then dereferenced again. Thus, *s has the type str (note the lack of &).
Since str is unsized and not very handy on its own, we'd like to have a reference to it instead, namely &str. We can do this by adding a & in front of the expression! Tada, now we reached the type &str!
&*s is rather the manual and explicit form. Often, the Deref-overload is used via automatic deref coercion. When the target type is fixed, the compiler will deref for you:
fn takes_string_slice(_: &str) {}
let s = "hi".to_string(); // : String
takes_string_slice(&s); // this works!
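To put the explicit and the automatic form side by side, here is a minimal sketch. The helper byte_len is a made-up name used only for illustration; everything else is plain std.

```rust
// Hypothetical helper: it only accepts a string slice.
fn byte_len(s: &str) -> usize {
    s.len()
}

fn main() {
    let s = String::from("hi");

    let explicit: &str = &*s; // explicit deref, then re-borrow
    let coerced: &str = &s; // deref coercion, driven by the type annotation
    let method: &str = s.as_str(); // named method, same result

    assert_eq!(explicit, coerced);
    assert_eq!(coerced, method);

    // At a call site the parameter type fixes the target, so `&s` is enough.
    println!("{}", byte_len(&s));
}
```

Without the annotation, `let r = &s;` stays a &String, since nothing asks for a &str.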
In general, &* means to first dereference (*) and then reference (&) a value. In many cases, this would be silly, as we'd end up at the same thing.
However, Rust has deref coercions. Combined with the Deref and DerefMut traits, a type can dereference to a different type!
This is useful for Strings as that means that they can get all the methods from str, it's useful for Vec<T> as it gains the methods from [T], and it's super useful for all the smart pointers, like Box<T>, which will have all the methods of the contained T!
Following the chain for String:
String --deref--> str --ref--> &str
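The same chain works for your own types. A rough sketch, assuming a hypothetical newtype Name and a hypothetical function shout (neither comes from the question):

```rust
use std::ops::Deref;

// Hypothetical newtype, used only to illustrate a user-defined `Deref`.
struct Name(String);

impl Deref for Name {
    type Target = String;

    fn deref(&self) -> &String {
        &self.0
    }
}

// Hypothetical function that wants a string slice.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let name = Name(String::from("ferris"));

    // One `*` per hop: Name -> String -> str, then `&` to borrow the result.
    let slice: &str = &**name;
    assert_eq!(slice, "ferris");

    // Method lookup walks the same chain automatically.
    assert_eq!(name.len(), 6);

    // So does deref coercion at a call site: &Name -> &String -> &str.
    println!("{}", shout(&name));
}
```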
Does not the & mean we are taking its reference, then using the * to dereference it?
No, your order of operations is backwards. * and & associate to the right. In this example, dereferencing is first, then referencing.
I think now you can do this &addr_string
(from a comment)
Sometimes, this will do the same thing. See What are Rust's exact auto-dereferencing rules? for the full details, but yes, a &String can be passed to a function that requires a &str. There are still times where you need to do this little dance by hand. The most common I can think of is:
let name: Option<String> = Some("Hello".to_string());
let name2: Option<&str> = name.as_ref().map(|s| &**s);
You'll note that we actually dereference twice:
&String -->deref--> String --deref--> str --ref--> &str
Although this case can now be done with name.as_ref().map(String::as_str);
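For comparison, here is a small sketch of the spellings side by side, plus Option::as_deref, which has been in std since Rust 1.40. The helper name views is made up for this example.

```rust
// Hypothetical helper returning the same borrowed view three different ways.
fn views(name: &Option<String>) -> [Option<&str>; 3] {
    [
        name.as_ref().map(|s| &**s),       // the manual dance
        name.as_ref().map(String::as_str), // named method instead of `&**`
        name.as_deref(),                   // goes through `Deref` for you
    ]
}

fn main() {
    let name: Option<String> = Some("Hello".to_string());
    let [by_hand, by_method, by_as_deref] = views(&name);

    assert_eq!(by_hand, by_method);
    assert_eq!(by_method, by_as_deref);
    println!("{:?}", by_as_deref);
}
```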
I am a new learner of Rust. I see that the * operator can be overloaded through the Deref trait. The std::string::String type implements Deref, and its deref() returns a &str. However, when I run the following test, the compiler tells me the type of s2 is str, with the error message "size for values of type str cannot be known at compilation time", so the code does not compile. My question is: why is s2 a str? Shouldn't it be the same type as s1?
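use std::ops::Deref; // brings the deref() method into scope
let owned = "test".to_string(); // owned type is String
let s1 = owned.deref(); // s1 type is &str
let s2 = *owned; // s2 type is str; this is the line that fails to compile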
Deref is a bit of a special trait in Rust, and the rules can be found in the docs. There are some other places where Deref coercion occurs, but since you asked about unary *, the first rule on that page is relevant to you.
If T implements Deref<Target = U>, and x is a value of type T, then:
In immutable contexts, *x (where T is neither a reference nor a raw pointer) is equivalent to *Deref::deref(&x).
So after Deref::deref is called, Rust tries the unary * again. This can invoke Deref on some other type, as seen in this question. This is also the same way C++'s overloadable -> operator works. It does some sort of (user-defined) coercion and then tries to dereference again, which may recursively call -> on something else.
So this
let s2 = *owned;
is equivalent to
let s2 = *owned.deref();
And has type str. str is not a Sized type and hence can't be stored in a variable, which causes your error.
As for why Rust does this, the Deref trait is defined to take a reference and return a reference. This makes sense, since it's coercing some sort of reference behind the scenes, not actually creating data. Nine times out of ten, Deref simply returns a reference to some inner data on the outer structure (Box being a prime example of this).
On the other hand, when you as the programmer write *, you clearly don't want a reference. After all, you just went out of your way to dereference the data. So the * allows deref-coercion through the Deref trait but then still tries to take ownership of (or copy, if applicable) the data after coercion is finished.
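A quick way to see the "copy, if applicable" part is a target type that is Copy. This is only a sketch: Meters and double are hypothetical names invented for the example.

```rust
use std::ops::Deref;

// Hypothetical wrapper around a `Copy` value.
struct Meters(u32);

impl Deref for Meters {
    type Target = u32;

    fn deref(&self) -> &u32 {
        &self.0
    }
}

// Hypothetical helper: the first `*` strips the reference, the second goes
// through `Deref` and copies the u32 out.
fn double(m: &Meters) -> u32 {
    **m * 2
}

fn main() {
    let m = Meters(21);

    let copied: u32 = *m; // sugar for the line below
    let same: u32 = *Deref::deref(&m);
    assert_eq!(copied, same);

    let owned = String::from("test");
    // let moved = *owned; // error: `str` has no size known at compile time
    let borrowed: &str = &*owned; // re-borrow instead of moving
    println!("{} {} {}", copied, borrowed, double(&m));
}
```

With u32 the copy out of the reference is fine; with str there is nothing Sized to copy or move, hence the error.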
Let's take another look at the relationship between Deref and the dereference operator:
use std::ops::Deref; // needed to call deref() as a method
let owned = "test".to_string(); // owned type is String
let s1 = owned.deref(); // type of s1 is &str
let s2 = &*owned; // type of s2 is also &str
//let s3 = *owned; // if this compiled, type of s3 would be str
Note how *x expands to *x.deref(), not to x.deref() itself. Thus you can think of deref() as a "pre-processing" step that runs before the actual dereference operator is applied. This is why the above example needed &*owned, and why *owned doesn't compile even though owned.deref() compiles just fine.
I was reading the doc from rust lang website and in chapter 4 they did the following example:
let s = String::from("hello world");
let hello = &s[0..5];
let world = &s[6..11];
hello is of type &str that I created from a variable s of type String.
Some rows below they define the following function:
fn first_word(s: &String) -> &str {
let bytes = s.as_bytes();
for (i, &item) in bytes.iter().enumerate() {
if item == b' ' {
return &s[0..i];
}
}
&s[..]
}
This time s is of type &String but still &s[0..i] gave me a &str slice.
How is it possible? I thought that the correct way to achieve this would be something like &((*str)[0..i]).
Am I missing something? Maybe during the [0..i] operation Rust auto-dereferences the variable?
Thanks
Maybe during the [0..i] operation Rust auto-dereferences the variable?
This is exactly what happens. When you call a method on a reference, or index through one, Rust automatically dereferences it before applying the operation. You can opt your own types into this behavior by implementing the Deref trait. String implements Deref with a target of str, which means you can call str methods on a String. Read more about deref coercion here.
It's important to realize what happens with &s[1..5]: it parses as &(s[1..5]). That is, s[1..5] is evaluated first, which yields a value of type str, and then a reference to that value is taken. In fact, there's even more indirection: x[y] in Rust is actually syntactic sugar for *std::ops::Index::index(&x, y). Note the dereference: index() always returns a reference, which the sugar dereferences, and the & in our code then references it again. Naturally, the compiler will optimize this and ensure we are not pointlessly taking references only to dereference them again.
It so happens that the String type implements the Index<Range<usize>> trait, and its Index::Output type is str.
It also happens that the str type implements the same trait, and that its Output type is also str, via a blanket implementation over SliceIndex.
On your question of auto-dereferencing: it is true that String also implements Deref, so that in many contexts, such as this one, a &String is automatically converted to a &str. Most places that expect a &str also accept a &String, which means the implementation of Index<Range<usize>> on String is really there to avoid this indirection. If it were not there, the code would still work, and perhaps the compiler could even optimize the indirection away.
But that automatic casting is not why it works — it simply works because indexing is defined on many different types.
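To make the desugaring concrete, here is a small sketch that writes the indexing out by hand. The function names sugar, desugared and on_slice are invented for the example; the &String parameter just mirrors first_word from the question.

```rust
use std::ops::Index;

// What the question wrote: index a `&String`, then borrow the result.
fn sugar(s: &String) -> &str {
    &s[0..5]
}

// Roughly what the compiler does with it: call `Index::index`, dereference
// the returned reference, and let the leading `&` borrow it again.
fn desugared(s: &String) -> &str {
    &*Index::index(s, 0..5usize)
}

// The same syntax works on `str` directly, since `str` has its own `Index` impl.
fn on_slice(s: &str) -> &str {
    &s[0..5]
}

fn main() {
    let s = String::from("hello world");
    assert_eq!(sugar(&s), desugared(&s));
    assert_eq!(on_slice(&s), "hello");
    println!("{}", sugar(&s));
}
```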
Finally:
I thought that the correct way to achieve this would be something like &((*str)[0..i]).
This would not work regardless: a &str is not the same as a &String and cannot be dereferenced to a String the way a &String can. In fact, a &str is in many ways closer to a String than it is to a &String. A &str is really just a fat pointer to a sequence of UTF-8 bytes, with the length of that sequence stored in the second word; a String is, if you will, an extra-fat pointer that also stores the current capacity of the buffer and owns the buffer it points to, so it can free and resize it.
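If you want to see the pointer widths for yourself, something like the following should do. It is a sketch only: words_in is a made-up helper, and the exact size of String is an implementation detail rather than a documented guarantee, so it is printed instead of asserted.

```rust
use std::mem::size_of;

// Hypothetical helper: size of a type, measured in machine words.
fn words_in<T>() -> usize {
    size_of::<T>() / size_of::<usize>()
}

fn main() {
    // &str is a fat pointer: address + length.
    assert_eq!(words_in::<&str>(), 2);
    // &String is an ordinary thin pointer to the String itself.
    assert_eq!(words_in::<&String>(), 1);
    // String holds a pointer, a length and a capacity; its layout is not
    // guaranteed, so just print what this compiler uses.
    println!("String is {} words", words_in::<String>());

    let mut s = String::with_capacity(16);
    s.push_str("hi");
    let view: &str = &s;
    assert_eq!(view.len(), 2); // the slice only knows its length...
    assert!(s.capacity() >= 16); // ...while the String also tracks capacity
}
```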
In this example code from the Rust documentation:
fn takes_str(s: &str) { }
let s = String::from("Hello");
takes_str(&s);
What exactly is going on behind the scenes that causes &s to become a &str instead of a &String? The documentation seems to suggest that there's some dereferencing going on, but I thought * was for dereferencing, not &?
What's going on here is called deref coercion. It allows a reference to a type that implements the Deref trait to be used where a reference to its Target type is expected. As your example shows, a &String can be used anywhere a &str is required, because String implements Deref<Target = str>.
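The coercion can also apply more than once in a row. A minimal sketch, reusing the takes_str name from the question but giving it a return value so there is something to check:

```rust
// Same shape as the function in the question, returning the length so the
// result can be checked.
fn takes_str(s: &str) -> usize {
    s.len()
}

fn main() {
    let s = String::from("Hello");
    let boxed: Box<String> = Box::new(String::from("Hello"));

    // One step: &String -> &str.
    assert_eq!(takes_str(&s), 5);

    // Two steps: &Box<String> -> &String -> &str.
    assert_eq!(takes_str(&boxed), 5);

    // The explicit spelling of the first call, for comparison.
    assert_eq!(takes_str(&*s), 5);
}
```

So & is still "take a reference"; the dereferencing is inserted by the compiler because the parameter type asks for a &str.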
There's an answer here for how to capitalize the ASCII characters in a string. This is not quite adequate for my specific problem.
For anyone who wanders in off Google, things have improved since Rust 0.13.0. (I'm on Rust 1.13.0)
&str (a string slice) provides to_uppercase(), and any string type can be coerced to it.
// First, with an &str, which any string can coerce to.
let str = "hello øåÅßç";
let up = str.to_uppercase();
println!("{}", up);
// And again with a String
let nonstatic_string = String::from(str);
let up2 = nonstatic_string.to_uppercase();
println!("{}", up2);
// ...if this fails in an edge case, use &nonstatic_string to force coercion
Regarding Matthieu M.'s comment, this conversion should be fully Unicode-compliant.
Here's a runnable copy on the Rust Playground
In case it helps newcomers to understand what's going on:
Strings implement Deref<Target=str>, so you can call any str methods on String.
str::to_uppercase() takes &str (a borrowed, non-mutable slice; in other words, a read-only reference to a string), so it'll accept pretty much anything.
str::to_uppercase() allocates a new String and returns it, so its return value isn't constrained by the borrowing rules for the input.
The caveat mentioned in the comment at the end is that, if you call my_string.to_uppercase() as str::to_uppercase(my_string), then it'll complain like this:
expected &str, found struct `std::string::String`
What's happening is that my_string.to_uppercase() isn't actually equivalent to that... it's equivalent to str::to_uppercase(&my_string).
(You can't auto-deref if you're not starting with a reference. You don't need to provide the & when making a method call because the &self in the method definition does it for you.)
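Here is a short sketch of the two spellings next to each other. shout is a hypothetical helper; my_string is the name used above.

```rust
// Hypothetical helper: the path form of `my_string.to_uppercase()`.
// The `&String` argument is coerced to `&str` at the call.
fn shout(my_string: &String) -> String {
    str::to_uppercase(my_string)
}

fn main() {
    let my_string = String::from("hello");

    let via_method = my_string.to_uppercase(); // auto-ref + auto-deref
    let via_path = str::to_uppercase(&my_string); // explicit `&`, then coercion
    // str::to_uppercase(my_string); // error: expected `&str`, found `String`

    assert_eq!(via_method, via_path);
    println!("{}", shout(&my_string));
}
```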
I don't think there's any function to do it directly, but you can use the functions in the UnicodeChar trait with chars and map like:
let str = "hello øåÅßç";
let up = str.chars().map(|c| c.to_uppercase()).collect::<String>();
println!("{}", up);
Output:
HELLO ØÅÅßÇ
Tested on rustc 0.13.0-dev (66601647c 2014-11-27 06:41:17 +0000)
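Note that the snippet above targets a pre-1.0 compiler. On current Rust, char::to_uppercase() returns an iterator, because one character can uppercase to several (ß becomes SS), so the output differs from the one shown. A rough modern equivalent, with upper_by_char as a made-up name:

```rust
// Hypothetical helper: uppercase a string one char at a time.
// `flat_map` is needed because each char may expand to more than one char.
fn upper_by_char(s: &str) -> String {
    s.chars().flat_map(|c| c.to_uppercase()).collect()
}

fn main() {
    let s = "hello øåÅßç";

    // For this input the per-char version agrees with the built-in method.
    assert_eq!(upper_by_char(s), s.to_uppercase());
    println!("{}", upper_by_char(s));
}
```

In practice str::to_uppercase() is the simpler choice; the per-char version is only worth it if you need to filter or transform characters along the way.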
I am unable to compile code that converts a type from an integer to a string. I'm running an example from the Rust for Rubyists tutorial which has various type conversions such as:
"Fizz".to_str() and num.to_str() (where num is an integer).
I think the majority (if not all) of these to_str() function calls have been deprecated. What is the current way to convert an integer to a string?
The errors I'm getting are:
error: type `&'static str` does not implement any method in scope named `to_str`
error: type `int` does not implement any method in scope named `to_str`
Use to_string() (running example here):
let x: u32 = 10;
let s: String = x.to_string();
println!("{}", s);
You're right; to_str() was renamed to to_string() before Rust 1.0 was released for consistency because an allocated string is now called String.
If you need to pass a string slice somewhere, you need to obtain a &str reference from String. This can be done using & and a deref coercion:
let ss: &str = &s; // specifying type is necessary for deref coercion to fire
let ss = &s[..]; // alternatively, use slicing syntax
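Putting the pieces together, a small sketch of the round trip. int_to_string and padded are hypothetical helper names; the calls inside them are plain std.

```rust
// Hypothetical helper: anything implementing `Display` gets `to_string()`.
fn int_to_string(x: u32) -> String {
    x.to_string()
}

// Hypothetical helper: `format!` gives control over padding and width.
fn padded(x: u32) -> String {
    format!("{:03}", x)
}

fn main() {
    let s = int_to_string(10);

    let slice: &str = s.as_str(); // borrow the String as a &str
    let back: u32 = slice.parse().expect("only digits here"); // and back again

    assert_eq!(back, 10);
    println!("{} {}", s, padded(7));
}
```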
The tutorial you linked to seems to be obsolete. If you're interested in strings in Rust, you can look through the strings chapter of The Rust Programming Language.