I noticed that Box<T> implements everything that T implements and can be used transparently. For Example:
let mut x: Box<Vec<u8>> = Box::new(Vec::new());
x.push(5);
I would like to be able to do the same.
This is one use case:
Imagine I'm writing functions that operate using an axis X and an axis Y. I'm using values to change those axis that are of type numbers but belongs only to one or the other axis.
I would like my compiler to fail if I attempt to do operations with values that doesn't belong to the good axis.
Example:
let x = AxisX(5);
let y = AxisY(3);
let result = x + y; // error: incompatible types
I can do this by making a struct that will wrap the numbers:
struct AxisX(i32);
struct AxisY(i32);
But that won't give me access to all the methods that i32 provides like abs(). Example:
x.abs() + 3 // error: abs() does not exist
// ...maybe another error because I don't implement the addition...
Another possible use case:
You can appropriate yourself a struct of another library and implement or derive anything more you would want. For example: a struct that doesn't derive Debug could be wrapped and add the implementation for Debug.
You are looking for std::ops::Deref:
In addition to being used for explicit dereferencing operations with the (unary) * operator in immutable contexts, Deref is also used implicitly by the compiler in many circumstances. This mechanism is called 'Deref coercion'. In mutable contexts, DerefMut is used.
Further:
If T implements Deref<Target = U>, and x is a value of type T, then:
In immutable contexts, *x on non-pointer types is equivalent to *Deref::deref(&x).
Values of type &T are coerced to values of type &U
T implicitly implements all the (immutable) methods of the type U.
For more details, visit the chapter in The Rust Programming Language as well as the reference sections on the dereference operator, method resolution and type coercions.
By implementing Deref it will work:
impl Deref for AxisX {
type Target = i32;
fn deref(&self) -> &i32 {
&self.0
}
}
x.abs() + 3
You can see this in action on the Playground.
However, if you call functions from your underlying type (i32 in this case), the return type will remain the underlying type. Therefore
assert_eq!(AxisX(10).abs() + AxisY(20).abs(), 30);
will pass. To solve this, you may overwrite some of those methods you need:
impl AxisX {
pub fn abs(&self) -> Self {
// *self gets you `AxisX`
// **self dereferences to i32
AxisX((**self).abs())
}
}
With this, the above code fails. Take a look at it in action.
I have a 2D vector that rejects indexing using i32 values, but works if I cast those values using as usize:
#[derive(Clone)]
struct Color;
struct Pixel {
color: Color,
}
fn shooting_star(p: &mut Vec<Vec<Pixel>>, x: i32, y: i32, w: i32, h: i32, c: Color) {
for i in x..=w {
for j in y..=h {
p[i][j].color = c.clone();
}
}
}
fn main() {}
When I compile, I get the error message
error[E0277]: the trait bound `i32: std::slice::SliceIndex<[std::vec::Vec<Pixel>]>` is not satisfied
--> src/main.rs:11:13
|
11 | p[i][j].color = c.clone();
| ^^^^ slice indices are of type `usize` or ranges of `usize`
|
= help: the trait `std::slice::SliceIndex<[std::vec::Vec<Pixel>]>` is not implemented for `i32`
= note: required because of the requirements on the impl of `std::ops::Index<i32>` for `std::vec::Vec<std::vec::Vec<Pixel>>`
If I change the code to have
p[i as usize][j as usize].color = c.clone();
Then everything works fine. However, this feels like it would be a really bizarre choice with no reason not to be handled by the Vec type.
In the documentation, there are plenty of examples like
assert_eq!(vec[0], 1);
By my understanding, if a plain number with no decimal is by default an i32, then there's no reason using an i32 to index shouldn't work.
Unlike Java, C# or even C++, numeric literals in Rust do not have a fixed type. The numeric type of a literal is usually inferred by the compiler, or explicitly stated using a suffix (0usize, 0.0f64, and so on). In that regard, the type of the 0 literal in assert_eq!(vec[0], 1); is inferred to be a usize, since Rust allows Vec indexing by numbers of type usize only.
As for the rationale behind using usize as the indexing type: a usize is equivalent to a word in the target architecture. Thus, a usize can refer to the index/address of all possible memory locations for the computer the program is running on. Thus, the maximum possible length of a vector is the maximum possible value that can be contained in a isize (isize::MAX == usize::MAX / 2). Using usize sizes and indices for a Vec prevents the creation and usage of a vector larger than the available memory itself.
Furthermore, the usage of an unsigned integer just large enough to refer all possible memory locations allows the removal of two dynamic checks, one, the supplied size/index is non-negative (if isize was used, this check would have to be implemented manually), and two, creating a vector or dereferencing a value of a vector will not cause the computer to run out of memory. However, the latter is guaranteed only when the type stored in the vector fits into one word.
I have a type defined as follows (by another crate):
trait Bar {
type Baz;
}
struct Foo<B: Bar, T> {
baz: B::Baz,
t: ::std::marker::PhantomData<T>
}
The type parameter T serves to encode some data at compile-time, and no instances of it will ever exist.
I would like to store a number of Foos, all with the same B, but with different Ts, in a Vec. Any time I am adding to or removing from this Vec I will know the proper T for the item in question by other means.
I know I could have a Vec<Box<Any>>, but do not want to incur the overhead of dynamic dispatch here.
I decided to make it a Vec<Foo<B, ()>>, and transmute to the proper type whenever necessary. However, to my surprise, a function like the following is not allowed:
unsafe fn change_t<B: Bar, T, U>(foo: Foo<B, T>) -> Foo<B, U> {
::std::mem::transmute(foo)
}
This gives the following error:
error[E0512]: transmute called with types of different sizes
--> src/main.rs:13:5
|
13 | ::std::mem::transmute(foo)
| ^^^^^^^^^^^^^^^^^^^^^
|
= note: source type: Foo<B, T> (size can vary because of <B as Bar>::Baz)
= note: target type: Foo<B, U> (size can vary because of <B as Bar>::Baz)
I find this very confusing, as both types have the same B, so their B::Bazes must also be the same, and the type T should not affect the type's layout whatsoever. Why is this not allowed?
It's been brought it to my attention that this sort of transmute results in an error even when the type parameter T is not present at all!
trait Bar {
type Baz;
}
struct Foo<B: Bar> {
baz: B::Baz,
}
unsafe fn change_t<B: Bar>(foo: B::Baz) -> B::Baz {
::std::mem::transmute(foo)
}
playground
Bizarrely, it is not possible to translate even between B::Baz and B::Baz. Unless there is some extremely subtle reasoning that makes this unsafe, this seems very much like a compiler bug.
It seems that for transmute size compatibility can't be verified at compile time for generic T values: so you have to use &T references:
unsafe fn change_t<B: Bar>(foo: &B::Baz) -> &B::Baz {
::std::mem::transmute(foo)
}
This related question contains some interesting details about transmuting.
I released the cluFullTransmute crate (GitHub repository) that solves your question, but it requires a nightly compiler. Give it a try.
I was reading the lifetimes chapter of the Rust book, and I came across this example for a named/explicit lifetime:
struct Foo<'a> {
x: &'a i32,
}
fn main() {
let x; // -+ x goes into scope
// |
{ // |
let y = &5; // ---+ y goes into scope
let f = Foo { x: y }; // ---+ f goes into scope
x = &f.x; // | | error here
} // ---+ f and y go out of scope
// |
println!("{}", x); // |
} // -+ x goes out of scope
It's quite clear to me that the error being prevented by the compiler is the use-after-free of the reference assigned to x: after the inner scope is done, f and therefore &f.x become invalid, and should not have been assigned to x.
My issue is that the problem could have easily been analyzed away without using the explicit 'a lifetime, for instance by inferring an illegal assignment of a reference to a wider scope (x = &f.x;).
In which cases are explicit lifetimes actually needed to prevent use-after-free (or some other class?) errors?
The other answers all have salient points (fjh's concrete example where an explicit lifetime is needed), but are missing one key thing: why are explicit lifetimes needed when the compiler will tell you you've got them wrong?
This is actually the same question as "why are explicit types needed when the compiler can infer them". A hypothetical example:
fn foo() -> _ {
""
}
Of course, the compiler can see that I'm returning a &'static str, so why does the programmer have to type it?
The main reason is that while the compiler can see what your code does, it doesn't know what your intent was.
Functions are a natural boundary to firewall the effects of changing code. If we were to allow lifetimes to be completely inspected from the code, then an innocent-looking change might affect the lifetimes, which could then cause errors in a function far away. This isn't a hypothetical example. As I understand it, Haskell has this problem when you rely on type inference for top-level functions. Rust nipped that particular problem in the bud.
There is also an efficiency benefit to the compiler — only function signatures need to be parsed in order to verify types and lifetimes. More importantly, it has an efficiency benefit for the programmer. If we didn't have explicit lifetimes, what does this function do:
fn foo(a: &u8, b: &u8) -> &u8
It's impossible to tell without inspecting the source, which would go against a huge number of coding best practices.
by inferring an illegal assignment of a reference to a wider scope
Scopes are lifetimes, essentially. A bit more clearly, a lifetime 'a is a generic lifetime parameter that can be specialized with a specific scope at compile time, based on the call site.
are explicit lifetimes actually needed to prevent [...] errors?
Not at all. Lifetimes are needed to prevent errors, but explicit lifetimes are needed to protect what little sanity programmers have.
Let's have a look at the following example.
fn foo<'a, 'b>(x: &'a u32, y: &'b u32) -> &'a u32 {
x
}
fn main() {
let x = 12;
let z: &u32 = {
let y = 42;
foo(&x, &y)
};
}
Here, the explicit lifetimes are important. This compiles because the result of foo has the same lifetime as its first argument ('a), so it may outlive its second argument. This is expressed by the lifetime names in the signature of foo. If you switched the arguments in the call to foo the compiler would complain that y does not live long enough:
error[E0597]: `y` does not live long enough
--> src/main.rs:10:5
|
9 | foo(&y, &x)
| - borrow occurs here
10 | };
| ^ `y` dropped here while still borrowed
11 | }
| - borrowed value needs to live until here
The lifetime annotation in the following structure:
struct Foo<'a> {
x: &'a i32,
}
specifies that a Foo instance shouldn't outlive the reference it contains (x field).
The example you came across in the Rust book doesn't illustrate this because f and y variables go out of scope at the same time.
A better example would be this:
fn main() {
let f : Foo;
{
let n = 5; // variable that is invalid outside this block
let y = &n;
f = Foo { x: y };
};
println!("{}", f.x);
}
Now, f really outlives the variable pointed to by f.x.
Note that there are no explicit lifetimes in that piece of code, except the structure definition. The compiler is perfectly able to infer lifetimes in main().
In type definitions, however, explicit lifetimes are unavoidable. For example, there is an ambiguity here:
struct RefPair(&u32, &u32);
Should these be different lifetimes or should they be the same? It does matter from the usage perspective, struct RefPair<'a, 'b>(&'a u32, &'b u32) is very different from struct RefPair<'a>(&'a u32, &'a u32).
Now, for simple cases, like the one you provided, the compiler could theoretically elide lifetimes like it does in other places, but such cases are very limited and do not worth extra complexity in the compiler, and this gain in clarity would be at the very least questionable.
If a function receives two references as arguments and returns a reference, then the implementation of the function might sometimes return the first reference and sometimes the second one. It is impossible to predict which reference will be returned for a given call. In this case, it is impossible to infer a lifetime for the returned reference, since each argument reference may refer to a different variable binding with a different lifetime. Explicit lifetimes help to avoid or clarify such a situation.
Likewise, if a structure holds two references (as two member fields) then a member function of the structure may sometimes return the first reference and sometimes the second one. Again explicit lifetimes prevent such ambiguities.
In a few simple situations, there is lifetime elision where the compiler can infer lifetimes.
I've found another great explanation here: http://doc.rust-lang.org/0.12.0/guide-lifetimes.html#returning-references.
In general, it is only possible to return references if they are
derived from a parameter to the procedure. In that case, the pointer
result will always have the same lifetime as one of the parameters;
named lifetimes indicate which parameter that is.
The case from the book is very simple by design. The topic of lifetimes is deemed complex.
The compiler cannot easily infer the lifetime in a function with multiple arguments.
Also, my own optional crate has an OptionBool type with an as_slice method whose signature actually is:
fn as_slice(&self) -> &'static [bool] { ... }
There is absolutely no way the compiler could have figured that one out.
As a newcomer to Rust, my understanding is that explicit lifetimes serve two purposes.
Putting an explicit lifetime annotation on a function restricts the type of code that may appear inside that function. Explicit lifetimes allow the compiler to ensure that your program is doing what you intended.
If you (the compiler) want(s) to check if a piece of code is valid, you (the compiler) will not have to iteratively look inside every function called. It suffices to have a look at the annotations of functions that are directly called by that piece of code. This makes your program much easier to reason about for you (the compiler), and makes compile times managable.
On point 1., Consider the following program written in Python:
import pandas as pd
import numpy as np
def second_row(ar):
return ar[0]
def work(second):
df = pd.DataFrame(data=second)
df.loc[0, 0] = 1
def main():
# .. load data ..
ar = np.array([[0, 0], [0, 0]])
# .. do some work on second row ..
second = second_row(ar)
work(second)
# .. much later ..
print(repr(ar))
if __name__=="__main__":
main()
which will print
array([[1, 0],
[0, 0]])
This type of behaviour always surprises me. What is happening is that df is sharing memory with ar, so when some of the content of df changes in work, that change infects ar as well. However, in some cases this may be exactly what you want, for memory efficiency reasons (no copy). The real problem in this code is that the function second_row is returning the first row instead of the second; good luck debugging that.
Consider instead a similar program written in Rust:
#[derive(Debug)]
struct Array<'a, 'b>(&'a mut [i32], &'b mut [i32]);
impl<'a, 'b> Array<'a, 'b> {
fn second_row(&mut self) -> &mut &'b mut [i32] {
&mut self.0
}
}
fn work(second: &mut [i32]) {
second[0] = 1;
}
fn main() {
// .. load data ..
let ar1 = &mut [0, 0][..];
let ar2 = &mut [0, 0][..];
let mut ar = Array(ar1, ar2);
// .. do some work on second row ..
{
let second = ar.second_row();
work(second);
}
// .. much later ..
println!("{:?}", ar);
}
Compiling this, you get
error[E0308]: mismatched types
--> src/main.rs:6:13
|
6 | &mut self.0
| ^^^^^^^^^^^ lifetime mismatch
|
= note: expected type `&mut &'b mut [i32]`
found type `&mut &'a mut [i32]`
note: the lifetime 'b as defined on the impl at 4:5...
--> src/main.rs:4:5
|
4 | impl<'a, 'b> Array<'a, 'b> {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^
note: ...does not necessarily outlive the lifetime 'a as defined on the impl at 4:5
--> src/main.rs:4:5
|
4 | impl<'a, 'b> Array<'a, 'b> {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^
In fact you get two errors, there is also one with the roles of 'a and 'b interchanged. Looking at the annotation of second_row, we find that the output should be &mut &'b mut [i32], i.e., the output is supposed to be a reference to a reference with lifetime 'b (the lifetime of the second row of Array). However, because we are returning the first row (which has lifetime 'a), the compiler complains about lifetime mismatch. At the right place. At the right time. Debugging is a breeze.
The reason why your example does not work is simply because Rust only has local lifetime and type inference. What you are suggesting demands global inference. Whenever you have a reference whose lifetime cannot be elided, it must be annotated.
I think of a lifetime annotation as a contract about a given ref been valid in the receiving scope only while it remains valid in the source scope. Declaring more references in the same lifetime kind of merges the scopes, meaning that all the source refs have to satisfy this contract.
Such annotation allow the compiler to check for the fulfillment of the contract.
I tried to create vector of closures:
fn main() {
let mut vec = Vec::new();
vec.push(Box::new(|| 10));
vec.push(Box::new(|| 20));
println!("{}", vec[0]());
println!("{}", vec[1]());
}
That yielded the following error report:
error[E0308]: mismatched types
--> src/main.rs:5:23
|
5 | vec.push(Box::new(|| 20));
| ^^^^^ expected closure, found a different closure
|
= note: expected type `[closure#src/main.rs:4:23: 4:28]`
found type `[closure#src/main.rs:5:23: 5:28]`
= note: no two closures, even if identical, have the same type
= help: consider boxing your closure and/or using it as a trait object
I fixed it by specifying the type explicitly:
let mut vec: Vec<Box<Fn() -> i32>> = Vec::new();
What is the inferred type of vec and why is it that way?
Each closure has an auto-generated, unique, anonymous type. As soon as you add the first closure to the vector, that is the type of all items in the vector. However, when you try to add the second closure, it has a different auto-generated, unique, anonymous type, and so you get the error listed.
Closures are essentially structs that are created by the compiler that implement one of the Fn* traits. The struct contains fields for all the variables captured by the closure, so it by definition needs to be unique, as each closure will capture different numbers and types of variables.
Why can't it just infer Box<Fn() -> i32>?
"can't" is a tough question to answer. It's possible that the compiler could iterate through all the traits of every type that is used to see if some intersection caused the code to compile, but that feels a bit magical to me. You could try opening a feature request or discussing it on one of the forums to see if there is general acceptance of such an idea.
However, Rust does try to make things explicit, especially things that might involve performance. When you go from a concrete struct to a trait object, you are introducing indirection, which has the possibility of being slower.
Right now, the Fn* traits work the same as a user-constructed trait:
trait MyTrait {
fn hello(&self) {}
}
struct MyStruct1;
impl MyTrait for MyStruct1 {}
struct MyStruct2;
impl MyTrait for MyStruct2 {}
fn main() {
let mut things = vec![];
things.push(MyStruct1);
things.push(MyStruct2);
}
error[E0308]: mismatched types
--> src/main.rs:14:17
|
14 | things.push(MyStruct2);
| ^^^^^^^^^ expected struct `MyStruct1`, found struct `MyStruct2`
|
= note: expected type `MyStruct1`
found type `MyStruct2`