I read the below syntax from byteorder:
rdr.read_u16::<BigEndian>()
I can't find any documentation which explains the syntax instance.method::<SomeThing>()
This construct is called the turbofish. If you search for that term, you will find its definition and usage.
Although the first edition of The Rust Programming Language is outdated, I feel that this particular section is better than in the second book.
Quoting the second edition:
path::<...>, method::<...>
Specifies parameters to generic type, function, or method in an expression; often referred to as turbofish (e.g., "42".parse::<i32>())
You can use it in any kind of situation where the compiler is not able to deduce the type parameter, e.g.
fn main() {
    let a = (0..255).sum();
    let b = (0..255).sum::<u32>();
    let c: u32 = (0..255).sum();
}
a does not work because it cannot deduce the variable type.
b does work because we specify the type parameter directly with the turbofish syntax.
c does work because we specify the type of c directly.
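As a further sketch, the same choice between a turbofish and an annotated binding comes up with parse and collect. The function names below are made up for illustration and do not come from the book:

```rust
// Illustrative helpers; the names are hypothetical.
fn parse_with_turbofish(s: &str) -> i32 {
    // The turbofish tells `parse` which FromStr implementation to use.
    s.parse::<i32>().unwrap()
}

fn parse_with_annotation(s: &str) -> i32 {
    // Same effect, but the type is given on the binding instead.
    let n: i32 = s.parse().unwrap();
    n
}

fn evens_below(limit: u32) -> Vec<u32> {
    // Only the container is named; `_` leaves the element type to inference.
    (0..limit).filter(|n| n % 2 == 0).collect::<Vec<_>>()
}

fn main() {
    println!("{}", parse_with_turbofish("42"));
    println!("{}", parse_with_annotation("42"));
    println!("{:?}", evens_below(7));
}
```

Which form to use is mostly a matter of taste: the turbofish keeps everything in one expression, while the annotation puts the type next to the name it describes.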
I have some code that's supposed to get image filenames from a database and add them to a vector.
extern crate postgres;
use postgres::{Connection, TlsMode};
fn main() {
    let conn = Connection::connect(
        "postgres://postgres:password@localhost:5432/test",
        TlsMode::None,
    ).unwrap();
    let mut filenames = Vec::new();
    if let Ok(filename_results) = conn.query("SELECT filename FROM images", &[]) {
        for row in &filename_results {
            filenames.push(format!("{}.jpg", row.get(0)));
        }
    }
    println!("{:?}", filenames);
}
This fails with:
error[E0283]: type annotations required: cannot resolve `_: postgres::types::FromSql`
  --> src/main.rs:14:54
   |
14 |                 filenames.push(format!("{}.jpg", row.get(0)));
   |                                                      ^^^
I don't understand why Rust can't figure out the type in this context, though I've figured out a way to make it work. I'm wondering what the simplest, most idiomatic way is to tell format!() which type it should expect, and why row.get(0) doesn't need a type annotation unless I slap a format!() around it. This is my best attempt at a solution:
for row in &filename_results {
    let filename: String = row.get(0);
    filenames.push(format!("{}.jpg", filename));
}
Let's look at the signature of the function you're calling:
fn get<I, T>(&self, idx: I) -> T
where
    I: RowIndex + Debug,
    T: FromSql,
That is, this function actually has two type parameters, I and T. It uses I as the type to index with. The argument you pass has this type. T is the return type. The constraints (the where clause) don't really matter here, but they specify that the argument type I has to be something postgres can use as a row index, and the return type T has to be something postgres can create from an SQL result.
Usually, Rust can infer the type parameters of functions. Argument types are usually easier to infer, because there's a value of the desired type right there. Even C++ can infer argument types! Return types are harder to infer because they depend on the context the function is called from, but Rust can often infer those too.
Let's look at your function call and the context it's used:
format!("{}.jpg", row.get(0))
Here it's obvious that the argument is an integer, because it's a literal, and it's right there. There are rules for working out what integer type it could be, but in this case, it has to be usize because that's the only integer type the RowIndex trait is implemented for.
But what return type are you expecting? format! can take almost any type, so the compiler has no way to know what get needs to return. All it knows is that T has to have the FromSql trait. This is what the error message tells you:
error[E0283]: type annotations required: cannot resolve `_: postgres::types::FromSql`
Luckily, Rust has a syntax for explicitly passing type parameters to functions, so you don't have to rely on its type inference. Shepmaster wrote a good explanation of it in this answer to a similar question. Jumping straight to the answer, you can write row.get::<_, String>(0) to specify only the second type parameter, and let inference work on the first type parameter.
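To make this concrete without a database, here is a minimal sketch that mimics the shape of get. The Row struct, the RowIndex trait, and the helper functions are hypothetical stand-ins written for this example; they are not the postgres crate's types, and FromStr takes the place of FromSql:

```rust
use std::fmt::Debug;
use std::str::FromStr;

// Hypothetical index trait, implemented only for usize (an assumption of this sketch).
trait RowIndex {
    fn index(&self) -> usize;
}

impl RowIndex for usize {
    fn index(&self) -> usize {
        *self
    }
}

// Hypothetical stand-in for a database row: every cell is stored as text.
struct Row {
    cells: Vec<String>,
}

impl Row {
    // Two type parameters, as in the signature above: I for the index, T for the result.
    fn get<I, T>(&self, idx: I) -> T
    where
        I: RowIndex,
        T: FromStr,
        T::Err: Debug,
    {
        self.cells[idx.index()]
            .parse()
            .expect("cell did not parse as the requested type")
    }
}

fn jpg_name_annotated(row: &Row) -> String {
    // `format!("{}.jpg", row.get(0))` on its own would not compile: nothing pins down T.
    let name: String = row.get(0); // the annotation on the binding fixes T
    format!("{}.jpg", name)
}

fn jpg_name_turbofish(row: &Row) -> String {
    // `_` leaves I to inference; String fixes T.
    format!("{}.jpg", row.get::<_, String>(0))
}

fn main() {
    let row = Row { cells: vec!["cat".to_string()] };
    println!("{}", jpg_name_annotated(&row));
    println!("{}", jpg_name_turbofish(&row));
}
```

In both helpers the index type is still inferred from the literal; only the return type needs help.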
You specifically ask for a more idiomatic way to specify the type, and I think what you already have is more idiomatic. With the explicit type parameter, a reader still needs to understand the signature of get to know that String will be the return type. It's not always the case that the second type parameter will be the return type, and it's easy to get confused and specify them in the wrong order. By naming and type-annotating the result, you make it obvious what value the type annotation refers to.
let filename: String = row.get(0);
filenames.push(format!("{}.jpg", filename));
If you do want to write your code in the more functional style that Shepmaster suggested, you can still use this style:
let filenames: Vec<String> = filename_results.iter().map(|row| { let f: String = row.get(0); format!("{}.jpg", f) }).collect();
and break the "one-liner" across lines if that suits your taste.
In type hints in Rust it is possible to use partial types in annotations like this:
let myvec: Vec<_> = vec![1, 2, 3];
What is the correct terminology for the underscore in the partial type annotation? I'm interested in both the Rust terminology as well as more academic type theory terminology.
I was able to find a piece of official documentation where the underscore is named in the context of patterns, but I doubt it's a "strict" name:
Patterns consist of some combination of literals, destructured arrays or enum constructors, structs and tuples, variable binding specifications, wildcards (..), and placeholders (_).
The Book provides the following description in the glossary:
_: "ignored" pattern binding (see Patterns (Ignoring bindings)). Also used to make integer-literals readable (see Reference (Integer literals)).
I was not able to find a definition pointing specifically to partial type annotations, but I think "placeholder" (or "type placeholder", depending on the context) would not be ambiguous.
After some digging it seems that Vec<_> is consistently called a partial type (so in let x: Vec<_> we have a partial type annotation, while Fn(String) -> _ would be a partial type signature) but the _ in this context is varyingly called either a type wildcard or a type placeholder, and _ in the type grammar can be read as the token for "infer this type" (at the time of the PR mentioned below, TyInfer internally in the compiler).
Some interesting reading:
Partial type signatures in Haskell
The pull request which added _ to the Rust type grammar
Intermingled parameter lists - Niko Matsakis' blog post in which he proposes to "Introduce _ as a notation for an unspecified lifetime or type"
Interesting detail from the PR:
let x: _ = 5;
let x = 5;
The two lines above are equivalent, and both parsed as variable x with type TyInfer.
In the compiler, it seems to be called Infer (in syntax::ast, rustc::hir, and rustc::ty)
I think this naming is somewhat reasonable, because these _s are replaced with fresh (type) inference variables before doing Hindley-Milner-like type inference.
Seems like the grammar refers to it as an "inferred type". Per the documentation:
Inferred type
Syntax:
InferredType : _
The inferred type asks the compiler to infer the type if possible based on the surrounding information available. It cannot be used in item signatures. It is often used in generic arguments:
let x: Vec<_> = (0..10).collect();
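A short sketch of partial type annotations in practice follows. The function names are invented for this example; in each case only the container is spelled out and every `_` is left for the compiler to fill in:

```rust
use std::collections::HashMap;

fn squares() -> Vec<u64> {
    // Vec<_>: the element type is inferred from the iterator's item type (u64).
    let v: Vec<_> = (1..=4u64).map(|n| n * n).collect();
    v
}

fn word_lengths(words: &[&str]) -> HashMap<String, usize> {
    // HashMap<_, _>: key and value types are inferred from the (String, usize) pairs.
    let m: HashMap<_, _> = words.iter().map(|w| (w.to_string(), w.len())).collect();
    m
}

fn main() {
    println!("{:?}", squares());
    println!("{:?}", word_lengths(&["ab", "cde"]));
}
```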
In this snippet from Hyper's example, there's a bit of code that I've annotated with types that compiles successfully:
.map_err(|x: std::io::Error| -> hyper::Error {
::std::convert::From::<std::io::Error>::from(x)
})
The type definition of From::from() seems to be fn from(T) -> Self;
How is it that what seems to be a std::io::Error -> Self seems to return a hyper::Error value, when none of the generics and arguments I give it are of the type hyper::Error?
It seems that some sort of implicit type conversion is happening even when I specify all the types explicitly?
Type information in Rust can flow backwards.
The return type of the closure is specified to be hyper::Error. Therefore, the result of the block must be hyper::Error, therefore the result of From::from must be hyper::Error.
If you wanted to, you could use ...
<hyper::Error as ::std::convert::From<std::io::Error>>::from(x)
... which would be the even more fully qualified version. But with the closure return type there, it's unnecessary.
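The same effect can be reproduced without hyper. In the sketch below, AppError is a hypothetical error type standing in for hyper::Error, and the two function names are invented for illustration:

```rust
use std::io;

// Hypothetical error type standing in for hyper::Error.
#[derive(Debug)]
enum AppError {
    Io(io::Error),
}

impl From<io::Error> for AppError {
    fn from(e: io::Error) -> Self {
        AppError::Io(e)
    }
}

fn convert_inferred(e: io::Error) -> AppError {
    // Self is never named here; it is inferred from the function's return type.
    From::from(e)
}

fn convert_explicit(e: io::Error) -> AppError {
    // Fully qualified form: names both Self and the trait's type parameter.
    <AppError as From<io::Error>>::from(e)
}

fn main() {
    let e = io::Error::new(io::ErrorKind::Other, "disk unavailable");
    println!("{:?}", convert_inferred(e));
    let e = io::Error::new(io::ErrorKind::Other, "disk unavailable");
    println!("{:?}", convert_explicit(e));
}
```

Both functions call the same implementation; the first relies on the expected return type to select it.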
Type inference has varying degrees.
For example, in C++ each literal is typed, and only a fully formed type can be instantiated, therefore the type of any expression can be computed (and is). Before C++11, this led to the compiler giving an error message: You are attempting to assign a value of type X to a variable of type Y. In C++11, auto was introduced to let the compiler figure out the type of the variable based on the value that was assigned to it.
In Java, this works slightly differently: the type of a variable has to be fully spelled out, but in exchange when constructing a type the generic bits can be left out since they are deduced from the variable the value is assigned to.
Those two examples are interesting because type information does not flow the same way in both of them, which hints that there is no reason for the flow to go one way or another; there are however technical constraints aplenty.
Rust, instead, uses a variation of the Hindley Milner type unification algorithm.
I personally see Hindley Milner as a system of equations:
Give each potential type a name: A, B, C, ...
Create equations tying together those types based on the structure of the program.
For example, imagine the following:
fn print_slice(s: &[u32]) {
    println!("{:?}", s);
}
fn main() {
    let mut v = Vec::new();
    v.push(1);
    print_slice(&v);
}
And start from main:
Assign names to types: v => A, 1 => B,
Put forth some equations: A = Vec<C> (from v = Vec::new()), C = B (from v.push(1)), &A = &[u32] OR &<A as Deref>::Target = &[u32] OR ... (from print_slice(&v)),
First round of solving: A = Vec<B>, &[B] = &[u32],
Second round of solving: B = u32, A = Vec<u32>.
There are some difficulties woven into the mix because of subtyping (which the original HM doesn't have), however it's essentially just that.
In this process, there is no consideration for going backward or forward; it's just equation solving either way.
This process is known as Type Unification and if it fails you get a hopefully helpful compiler error.
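One way to observe this is to check the size of an unsuffixed literal after a later line has constrained it. This is a small sketch; the function names are invented for the example:

```rust
use std::mem::size_of_val;

fn total(s: &[u64]) -> u64 {
    s.iter().sum()
}

fn literal_width_in_bytes() -> usize {
    let x = 1; // no suffix: the integer type is still an open variable
    let mut v = Vec::new(); // element type unknown so far
    v.push(x); // equation: element type of v = type of x
    let _ = total(&v); // equation: &v must become &[u64], so the element type is u64
    size_of_val(&x) // x was solved to u64, which is 8 bytes wide
}

fn main() {
    println!("{}", literal_width_in_bytes());
}
```

The call to total comes after the declaration of x, yet it decides the type of x. Without that call, the literal would fall back to the default integer type, i32.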
In Python you can do something like this:
if isinstance("hello", basestring):
    print "hello is a string"
else:
    print "Not a string"
My question is: can this kind of code be recreated or emulated in Rust? If it is possible, is this kind of checking necessary or useful in Rust?
Python is dynamically typed. When you write for example a function def foo(x):, the caller can choose to give a value of any type as the parameter x. That’s why Python has isinstance(), so that you can check when it’s important.
Rust is statically typed. Any variable in the code has a type that is known at compile-time. For function parameters you have to write it explicitly: fn foo(x: String) {. For local variables you can write it: let x: String = something(); or leave it to the compiler’s type inference to figure out: let x = something(); based on other information (here based on the return type of something()). Sometimes there is not enough context for type inference and you have to write an explicit type annotation.
If everything has a known type, an isinstance function that returns true or false doesn’t make sense. So Rust doesn’t quite have one.
Note that some form of dynamic typing is possible with trait objects and the Any trait:
http://doc.rust-lang.org/book/trait-objects.html
http://doc.rust-lang.org/std/any/trait.Any.html
So you can write:
use std::any::Any;
fn foo(object: &dyn Any) {
    if object.is::<String>() {
        // ...
    }
}
object’s type is still static: it’s &dyn Any (written &Any in older Rust). But it also represents a value of some other, arbitrary type. You can access that value with other Any methods such as downcast_ref.
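A minimal sketch of downcast_ref in use follows. The describe function and its messages are invented for this example:

```rust
use std::any::Any;

// Hypothetical helper: reports what concrete type hides behind the trait object.
fn describe(object: &dyn Any) -> String {
    if let Some(s) = object.downcast_ref::<String>() {
        // downcast_ref returns Some(&String) only if the concrete type is String.
        format!("a String of length {}", s.len())
    } else if object.is::<i32>() {
        "an i32".to_string()
    } else {
        "something else".to_string()
    }
}

fn main() {
    println!("{}", describe(&String::from("hello")));
    println!("{}", describe(&7i32));
    // A string literal is a &str, not a String, so neither check matches.
    println!("{}", describe(&"hello"));
}
```

The check is exact: it does not see through references or conversions, which is why the string literal is reported as something else.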
Rust has a limited form of downcasting, provided by Any: you can use Any to query whether the concrete type is X or Y.
The usefulness of the construct is rather limited though; Rust is a statically typed language, so in most situations you:
either know the exact concrete type
or use a trait that has sufficient methods for your needs
Still, Chris Morgan developed AnyMap, for example, to store one value of each type without knowing said types a priori, which he then used to provide a typed interface to HTTP headers without restricting the set of headers to a known set.
I think the most similar paradigm in Rust is using the match keyword on enum types. For instance, in the std::net::IpAddr case, you can use matching to decide whether you are dealing with an Ipv4Addr or an Ipv6Addr.
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};
fn main() {
    do_something(IpAddr::V4(Ipv4Addr::new(0, 0, 0, 0)));
    do_something(IpAddr::V6(Ipv6Addr::new(0, 0, 0, 0, 0, 0, 0, 0)));
}
fn do_something(v: IpAddr) {
    match v {
        IpAddr::V4(_x) => {
            println!("I'm an IPv4 Address!");
        }
        IpAddr::V6(_x) => {
            println!("I'm an IPv6 Address!");
        }
    }
}
Link to Rust Playground with working example
This has the advantage of having well-defined types everywhere, which is of course required as Rust is statically typed. However, it does have a potential impact on how you arrange your data structures; thus this sort of behavior has to be planned in advance.
When declaring a variable of type vector or a hash map in Rust, we do:
let v: Vec<i32>
let m: HashMap<i32, i32>
To instantiate, we need to call new(). However, we do so thusly:
Vec::<i32>::new()
   ^^
HashMap::<i32, i32>::new()
       ^^
Note the sudden appearance of ::. Coming from C++, these are odd. Why do these occur? Does having a leading :: make IDENTIFIER :: < IDENTIFIER … easier to parse than IDENTIFIER < IDENTIFIER, which might be construed as a less-than operation? (And thus, this is simply a thing to make the language easier to parse? But if so, why not also do it during type specifications, so as to have the two mirror each other?)
(As Shepmaster notes, often Vec::new() is enough; the type can often be inferred.)
When parsing an expression, it would be ambiguous whether a < was the start of a type parameter list or a less-than operator. Rust always assumes the latter and requires ::< for type parameter lists.
When parsing a type, it's always unambiguously a type parameter list, so ::< is never necessary.
In C++, this ambiguity is kept in the parser, which makes parsing C++ much more difficult than parsing Rust. See here for an explanation why this matters.
Anyway, most of the time in Rust, the types can be inferred and you can just write Vec::new(). Since ::< is usually not needed and is fairly ugly, it makes sense to keep only < in types, rather than making the two syntaxes match up.
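The two positions can be compared side by side in a short sketch. The function names are invented for illustration:

```rust
use std::mem::size_of;

fn empty_len_turbofish() -> usize {
    // Expression position: generic arguments need the leading `::`.
    let v = Vec::<i32>::new();
    v.len()
}

fn empty_len_annotated() -> usize {
    // Type position: plain `<...>` is unambiguous, and `new` is inferred from it.
    let v: Vec<i32> = Vec::new();
    v.len()
}

fn bytes_in_u64() -> usize {
    // The same rule applies to generic functions called in an expression.
    size_of::<u64>()
}

fn main() {
    println!("{}", empty_len_turbofish());
    println!("{}", empty_len_annotated());
    println!("{}", bytes_in_u64());
}
```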
The two different syntaxes don't even specify the same type parameters necessarily.
In this example:
let mut map: HashMap<K, V>;
K and V fill the type parameters of the struct HashMap declaration, the type itself.
In this expression:
HashMap::<K, V>::new()
K and V fill the type parameters of the impl block where the method new is defined! The impl block need not have the same, as many, or the same default, type parameters as the type itself.
In this particular case, the struct has the parameters HashMap<K, V, S = RandomState> (3 parameters, 1 defaulted). And the impl block containing ::new() has parameters impl<K, V> (2 parameters, not implemented for arbitrary states).
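The difference can be seen in a small sketch. FixedState is a hypothetical alias chosen for this example, and the function names are invented; the return types spell out the third parameter so the compiler confirms which state each constructor produces:

```rust
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

// Hypothetical alias for a non-default hasher state.
type FixedState = BuildHasherDefault<DefaultHasher>;

fn with_default_state() -> HashMap<&'static str, i32, RandomState> {
    // `new` is defined only for the RandomState case, so two parameters
    // are enough here and the third is determined by the impl block.
    let mut m = HashMap::<&str, i32>::new();
    m.insert("one", 1);
    m
}

fn with_custom_state() -> HashMap<&'static str, i32, FixedState> {
    // For any other state, a constructor from the impl that is generic over S is needed.
    let mut m = HashMap::<&str, i32, FixedState>::with_hasher(FixedState::default());
    m.insert("one", 1);
    m
}

fn main() {
    println!("{:?}", with_default_state());
    println!("{:?}", with_custom_state());
}
```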