Why is num::One needed for iterating over a range? - rust

Why does the following for loop with a char range fail to compile?
fn main() {
for c in 'a'..'z' {
println!("{}", c);
}
}
Error...
main.rs:11:5: 14:2 error: the trait `core::num::One` is not implemented for the type `char` [E0277]
main.rs:11 for c in 'a'..'z' {
main.rs:12 println!("{}", c);
main.rs:13 }
main.rs:14 }
main.rs:11:5: 14:2 error: the trait `core::iter::Step` is not implemented for the type `char` [E0277]
main.rs:11 for c in 'a'..'z' {
main.rs:12 println!("{}", c);
main.rs:13 }
main.rs:14 }
Why do you even need core::num::One for a iterating over a range?

The x..y syntax is sugar for std::ops::Range { start: x, end: y }. This type (Range<A>) is iterable due to the implementation of Iterator for it, specifically, from that page:
impl<A> Iterator for Range<A>
where A: One + Step,
&'a A: Add<&'a A>,
&'a A::Output == A {
type Item = A;
This is saying that Range<A> can behave as an iterator over As if the type A implements One and Step, and can be added in the right way.
In this case, char satisfies none of those: it is semantically nonsense for char to have One or be addable, and it doesn't implement Step either.
That said, since char doesn't implement those traits (and hence Range<char> doesn't behave like an iterator via that impl), it should be possible to have a manual impl:
impl Iterator for Range<char> {
type Item = char;
which would allow for x in 'a'..'z' to work.
However, this probably isn't semantically what we want: the .. range doesn't include the last element, which would be suprising for characters, one would have to write 'a'..'{' to get the letters A through Z. There's been proposals for inclusive-range syntax, e.g. one example is 'a'...'z' (more dots == more elements), and I would imagine that there would be an Iterator implementation for this type with chars.
As others have demonstrated, for ASCII characters one can use byte literals, and more generally, one can cast characters to u32s:
for i in ('à' as u32)..('æ' as u32) + 1 {
let c = std::char::from_u32(i).unwrap();
println!("{}", c);
}
Which gives:
à
á
â
ã
ä
å
æ
NB. this approach isn't perfect, it will crash if the range crosses the surrogate range, 0xD800-0xDFFF.
I just published a crate, char-iter, which handles the latter correctly and behaves like one would expect. Once added (via cargo), it can be used like:
extern crate char_iter;
// ...
for c in char_iter::new('a', 'z') {
// ...
}
for c in char_iter::new('à', 'æ') {
// ...
}

Well, you need Step to denote that the structure can be stepped over in both directions.
/// Objects that can be stepped over in both directions.
///
/// The `steps_between` function provides a way to efficiently compare
/// two `Step` objects.
pub trait Step: PartialOrd
One on the other hand is used to retrieve a value from mutable iterator, while simultaneously incrementing it:
#[inline]
fn next(&mut self) -> Option<A> {
if self.start < self.end {
let mut n = &self.start + &A::one();
mem::swap(&mut n, &mut self.start);
Some(n)
} else {
None
}
}
Source
What you could do is make your range a u8 and then convert it back to char, like this:
fn main() {
for c in (b'a'..b'z'+1) {
println!(" {:?}", c as char);
}
}
Note: That range are exclusive so ('a'..'z') is actually ('a', 'b', ... 'y'). Or in math notation [a,z) ;) .
That's why I add b'z'+1 instead of b'z'.
Note: u8 is valid, only because the characters are ASCII.

Related

What does this Rust Closure argument syntax mean?

I modified code found on the internet to create a function that obtains the statistical mode of any Hashable type that implements Eq, but I do not understand some of the syntax. Here is the function:
use std::hash::Hash;
use std::collections::HashMap;
pub fn mode<'a, I, T>(items: I) -> &'a T
where I: IntoIterator<Item = &'a T>,
T: Hash + Clone + Eq, {
let mut occurrences: HashMap<&T, usize> = HashMap::new();
for value in items.into_iter() {
*occurrences.entry(value).or_insert(0) += 1;
}
occurrences
.into_iter()
.max_by_key(|&(_, count)| count)
.map(|(val, _)| val)
.expect("Cannot compute the mode of zero items")
}
(I think requiring Clone may be overkill.)
The syntax I do not understand is in the closure passed to map_by_key:
|&(_, count)| count
What is the &(_, count) doing? I gather the underscore means I can ignore that parameter. Is this some sort of destructuring of a tuple in a parameter list? Does this make count take the reference of the tuple's second item?
.max_by_key(|&(_, count)| count) is equivalent to .max_by_key(f) where f is this:
fn f<T>(t: &(T, usize)) -> usize {
(*t).1
}
f() could also be written using pattern matching, like this:
fn f2<T>(&(_, count): &(T, usize)) -> usize {
count
}
And f2() is much closer to the first closure you're asking about.
The second closure is essentially the same, except there is no reference slightly complicating matters.

Is it legal to cast a struct to an array?

Consider the following:
// Just a sequence of adjacent fields of same the type
#[repr(C)]
#[derive(Debug)]
struct S<T> {
a : T,
b : T,
c : T,
d : T,
}
impl<T : Sized> S<T> {
fn new(a : T, b : T, c : T, d : T) -> Self {
Self {
a,
b,
c,
d,
}
}
// reinterpret it as an array
fn as_slice(&self) -> &[T] {
unsafe { std::slice::from_raw_parts(self as *const Self as *const T, 4) }
}
}
fn main() {
let s = S::new(1, 2, 3, 4);
let a = s.as_slice();
println!("s :: {:?}\n\
a :: {:?}", s, a);
}
Is this code portable?
Is it always safe to assume a repr(C) struct with fields of same type can be reinterpreted like an array? Why?
Yes, it is safe and portable, except for very large T (fix below). None of the points listed in the safety section of the documentation for std::slice::from_raw_parts are a concern here:
The data pointer is valid for mem::size_of::<T>() * 4, which is the size of S<T>, and is properly aligned.
All of the items are in the same allocation object, because they are in the same struct.
The pointer is not null, because it is a cast from the safe &self parameter, and it is properly aligned, because S<T> has (at least) the alignment of T.
The data parameter definitely points to 4 consecutive initialized Ts, because S is marked #[repr(C)] which is defined such that in your struct, no padding would be introduced. (repr(Rust) makes no such guarantee).
The memory referenced is not mutated during the lifetime of the reference, which is guaranteed by the borrow checker.
The total size of the slice must not be greater than isize::MAX. The code does not check this, so it is technically a safety hole. To be sure, add a check to as_slice, before the unsafe:
assert!(std::mem::size_of::<S<T>>() <= isize::MAX as _);
The check will normally be optimized out.

Why do I need to dereference a variable when comparing it but not when doing arithmetic?

I have the following code:
fn example(known_primes: &[i32], number: i32, prime: i32, limit: i32) {
let mut is_prime = true;
for prime in known_primes {
if number % prime == 0 {
is_prime = false;
break;
}
if *prime > limit {
break;
}
}
}
Why do I need to dereference prime in the second condition (*prime > limit), when I don't need to do so in the first one (number % prime == 0)?
Both % and < are operators that take two numbers and return something. The only difference seems to be in what they return (a number vs. a boolean). While Why isn't it possible to compare a borrowed integer to a literal integer? does explain what would be required to make the code work (implementations for all overloads, ideally in the standard library), it does not say why it does work for a % b. Is there a fundamental difference between these operators? Or is it just not implemented yet?
Comparison operators actually do behave differently than arithmetic operators. The difference becomes obvious when looking at the trait definitions. As an example, here is the PartialEq trait
pub trait PartialEq<Rhs = Self>
where
Rhs: ?Sized,
{
fn eq(&self, other: &Rhs) -> bool;
fn ne(&self, other: &Rhs) -> bool { ... }
}
and the Add trait
pub trait Add<RHS = Self> {
type Output;
fn add(self, rhs: RHS) -> Self::Output;
}
We can see that comparison traits take the operands by reference, while the arithmetic traits take the operands by value. This difference is reflected in how the compiler translates operator expressions:
a == b ==> std::cmp::PartialEq::eq(&a, &b)
a + b ==> std::ops::Add::add(a, b)
The operands of comparisons are evaluated as place expressions, so they can never move values. Operands of arithmetic operators, on the other hand, are evaluated as value expressions, so they are moved or copied depending on whether the operand type is Copy.
As a result of this difference, if we implement PartialEq for the type A, we can not only compare A and A, but also &A and &A by virtue of deref coercions for the operands. For Add on the other hand we need a separate implementation to be able to add &A and &A.
I can't answer why the standard library implements the "mixed" versions for reference and value for arithmetic operators, but not for comparisons. I can't see a fundamental reason why the latter can't be done.
Because you can have Rem implementation for different types and the core library implements
impl<'a> Rem<&'a i32> for i32 { /* … */ }
This is impossible for PartialOrd and Ord traits, so you need to compare exactly the same types, in this case i32, that is why there is requirement for dereference.

How do I compare a vector against a reversed version of itself?

Why won't this compile?
fn isPalindrome<T>(v: Vec<T>) -> bool {
return v.reverse() == v;
}
I get
error[E0308]: mismatched types
--> src/main.rs:2:25
|
2 | return v.reverse() == v;
| ^ expected (), found struct `std::vec::Vec`
|
= note: expected type `()`
found type `std::vec::Vec<T>`
Since you only need to look at the front half and back half, you can use the DoubleEndedIterator trait (methods .next() and .next_back()) to look at pairs of front and back elements this way:
/// Determine if an iterable equals itself reversed
fn is_palindrome<I>(iterable: I) -> bool
where
I: IntoIterator,
I::Item: PartialEq,
I::IntoIter: DoubleEndedIterator,
{
let mut iter = iterable.into_iter();
while let (Some(front), Some(back)) = (iter.next(), iter.next_back()) {
if front != back {
return false;
}
}
true
}
(run in playground)
This version is a bit more general, since it supports any iterable that is double ended, for example slice and chars iterators.
It only examines each element once, and it automatically skips the remaining middle element if the iterator was of odd length.
Read up on the documentation for the function you are using:
Reverse the order of elements in a slice, in place.
Or check the function signature:
fn reverse(&mut self)
The return value of the method is the unit type, an empty tuple (). You can't compare that against a vector.
Stylistically, Rust uses 4 space indents, snake_case identifiers for functions and variables, and has an implicit return at the end of blocks. You should adjust to these conventions in a new language.
Additionally, you should take a &[T] instead of a Vec<T> if you are not adding items to the vector.
To solve your problem, we will use iterators to compare the slice. You can get forward and backward iterators of a slice, which requires a very small amount of space compared to reversing the entire array. Iterator::eq allows you to do the comparison succinctly.
You also need to state that the T is comparable against itself, which requires Eq or PartialEq.
fn is_palindrome<T>(v: &[T]) -> bool
where
T: Eq,
{
v.iter().eq(v.iter().rev())
}
fn main() {
println!("{}", is_palindrome(&[1, 2, 3]));
println!("{}", is_palindrome(&[1, 2, 1]));
}
If you wanted to do the less-space efficient version, you have to allocate a new vector yourself:
fn is_palindrome<T>(v: &[T]) -> bool
where
T: Eq + Clone,
{
let mut reverse = v.to_vec();
reverse.reverse();
reverse == v
}
fn main() {
println!("{}", is_palindrome(&[1, 2, 3]));
println!("{}", is_palindrome(&[1, 2, 1]));
}
Note that we are now also required to Clone the items in the vector, so we add that trait bound to the method.

Calling a function which takes a closure twice with different closures

As a project for learning rust, I am writing a program which can parse sgf files (a format for storing go games, and technically also other games). Currently the program is supposed to parse strings of the type (this is just an exampel) ";B[ab]B[cd]W[ef]B[gh]" into [Black((0,1)),Black((2,3,)),White((4,5)),Black((6,7))]
For this I am using the parser-combinators library.
I have run into the following error:
main.rs:44:15: 44:39 error: can't infer the "kind" of the closure; explicitly annotate it; e.g. `|&:| {}` [E0187]
main.rs:44 pmove().map(|m| {Property::White(m)})
^~~~~~~~~~~~~~~~~~~~~~~~
main.rs:44:15: 44:39 error: mismatched types:
expected `closure[main.rs:39:15: 39:39]`,
found `closure[main.rs:44:15: 44:39]`
(expected closure,
found a different closure) [E0308]
main.rs:44 pmove().map(|m| {Property::White(m)})
^~~~~~~~~~~~~~~~~~~~~~~~
error: aborting due to 2 previous errors
Could not compile `go`.
The function in question is below. I am completely new to rust, so I can't really isolate the problem further or recreate it in a context without the parser-combinators library (might even have something to do with that library?).
fn parse_go_sgf(input: &str) -> Vec<Property> {
let alphabetic = |&:| {parser::satisfy(|c| {c.is_alphabetic()})};
let prop_value = |&: ident, value_type| {
parser::spaces().with(ident).with(parser::spaces()).with(
parser::between(
parser::satisfy(|c| c == '['),
parser::satisfy(|c| c == ']'),
value_type
)
)
};
let pmove = |&:| {
alphabetic().and(alphabetic())
.map(|a| {to_coord(a.0, a.1)})
};
let pblack = prop_value(
parser::string("B"),
pmove().map(|m| {Property::Black(m)}) //This is where I am first calling the map function.
);
let pwhite = prop_value(
parser::string("W"),
pmove().map(|m| {Property::White(m)}) //This is where the compiler complains
);
let pproperty = parser::try(pblack).or(pwhite);
let mut pnode = parser::spaces()
.with(parser::string(";"))
.with(parser::many(pproperty));
match pnode.parse(input) {
Ok((value, _)) => value,
Err(err) => {
println!("{}",err);
vec!(Property::Unkown)
}
}
}
So I am guessing this has something to do with closures all having different types. But in other cases it seems possible to call the same function with different closures. For example
let greater_than_forty_two = range(0, 100)
.find(|x| *x > 42);
let greater_than_forty_three = range(0, 100)
.find(|x| *x > 43);
Seems to work just fine.
So what is going on in my case that is different.
Also, as I am just learning, any general comments on the code are also welcome.
Unfortunately, you stumbled upon one of the rough edges in the Rust type system (which is, given the closure-heavy nature of parser-combinators, not really unexpected).
Here is a simplified example of your problem:
fn main() {
fn call_closure_fun<F: Fn(usize)>(f: F) { f(12) } // 1
fn print_int(prefix: &str, i: usize) { println!("{}: {}", prefix, i) }
let call_closure = |&: closure| call_closure_fun(closure); // 2
call_closure(|&: i| print_int("first", i)); // 3.1
call_closure(|&: i| print_int("second", i)); // 3.2
}
It gives exactly the same error as your code:
test.rs:8:18: 8:47 error: mismatched types:
expected `closure[test.rs:7:18: 7:46]`,
found `closure[test.rs:8:18: 8:47]`
(expected closure,
found a different closure) [E0308]
test.rs:8 call_closure(|&: i| print_int("second", i));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We have (referenced in the comments in the code):
a function which accepts a closure of some concrete form;
a closure which calls a function (1) with its own argument;
two invocations of closure (1), passing different closures each time.
Rust closures are unboxed. It means that for each closure the compiler generates a fresh type which implements one of closure traits (Fn, FnMut, FnOnce). These types are anonymous - they don't have a name you can write out. All you know is that these types implement a certain trait.
Rust is a strongly- and statically-typed language: the compiler must know exact type of each variable and each parameter at the compile time. Consequently it has to assign types for every parameter of every closure you write. But what type should closure argument of (2) have? Ideally, it should be some generic type, just like in (1): the closure should accept any type as long as it implements a trait. However, Rust closures can't be generic, and so there is no syntax to specify that. So Rust compiler does the most natural thing it can - it infers the type of closure argument based on the first use of call_closure, i.e. from 3.1 invocation - that is, it assigns the anonymous type of the closure in 3.1!
But this anonymous type is different from the anonymous type of the closure in 3.2: the only thing they have in common is that they both implement Fn(usize). And this is exactly what error is about.
The best solution would be to use functions instead of closures because functions can be generic. Unfortunately, you won't be able to do that either: your closures return structures which contain closures inside themselves, something like
pub struct Satisfy<I, Pred> { ... }
where Pred is later constrained to be Pred: FnMut(char) -> bool. Again, because closures have anonymous types, you can't specify them in type signatures, so you won't be able to write out the signature of such generic function.
In fact, the following does work (because I've extracted closures for parser::satisfy() calls to parameters):
fn prop_value<'r, I, P, L, R>(ident: I, value_type: P, l: L, r: R) -> pp::With<pp::With<pp::With<pp::Spaces<&'r str>, I>, pp::Spaces<&'r str>>, pp::Between<pp::Satisfy<&'r str, L>, pp::Satisfy<&'r str, R>, P>>
where I: Parser<Input=&'r str, Output=&'r str>,
P: Parser<Input=&'r str, Output=Property>,
L: Fn(char) -> bool,
R: Fn(char) -> bool {
parser::spaces().with(ident).with(parser::spaces()).with(
parser::between(
parser::satisfy(l),
parser::satisfy(r),
value_type
)
)
}
And you'd use it like this:
let pblack = prop_value(
parser::string("B"),
pmove().map(|&: m| Property::Black(m)),
|c| c == '[', |c| c == ']'
);
let pwhite = prop_value(
parser::string("W"),
pmove().map(|&: m| Property::White(m)),
|c| c == '[', |c| c == ']'
);
pp is introduced with use parser::parser as pp.
This does work, but it is really ugly - I had to use the compiler error output to actually determine the required return type. With the slightest change in the function it will have to be adjusted again. Ideally this is solved with unboxed abstract return types - there is a postponed RFC on them - but we're still not there yet.
As the author of parser-combinators I will just chime in on another way of solving this, without needing to use the compiler to generate the return type.
As each parser is basically just a function together with 2 associated types there are implementations for the Parser trait for all function types.
impl <I, O> Parser for fn (State<I>) -> ParseResult<O, I>
where I: Stream { ... }
pub struct FnParser<I, O, F>(F);
impl <I, O, F> Parser for FnParser<I, O, F>
where I: Stream, F: FnMut(State<I>) -> ParseResult<O, I> { ... }
These should all be replaced by a single trait and the FnParser type removed, once the orphan checking allows it. In the meantime we can use the FnParser type to create a parser from a closure.
Using these traits we can essentially hide the big parser type returned from in Vladimir Matveev's example.
fn prop_value<'r, I, P, L, R>(ident: I, value_type: P, l: L, r: R, input: State<&'r str>) -> ParseResult<Property, &'r str>
where I: Parser<Input=&'r str, Output=&'r str>,
P: Parser<Input=&'r str, Output=Property>,
L: Fn(char) -> bool,
R: Fn(char) -> bool {
parser::spaces().with(ident).with(parser::spaces()).with(
parser::between(
parser::satisfy(l),
parser::satisfy(r),
value_type
)
).parse_state(input)
}
And we can now construct the parser with this
let parser = FnParser(move |input| prop_value(ident, value_type, l, r, input));
And this is basically the best we can do at the moment using rust. Unboxed anonymous return types would make all of this significantly easier since complex return types would not be needed (nor created since the library itself could be written to utilize this, avoiding the complex types entirely).
Two facets of Rust's closures are causing your problem, one, closures cannot be generic, and two, each closure is its own type. Because closure's cannot be generic,prop_value's parameter value_type must be a specific type. Because each closure is a specific type, the closure you pass to prop_value in pwhite is a different type from the one in pblack. What the compiler does is conclude that the value_type must have the type of the closure in pblack, and when it gets to pwhite it finds a different closure, and gives an error.
Judging from your code sample, the simplest solution would probably be to make prop_value a generic fn - it doesn't look like it needs to be a closure. Alternatively, you could declare its parameter value_type to be a closure trait object, e.g. &Fn(...) -> .... Here is a simplifed example demonstrating these approaches:
fn higher_fn<F: Fn() -> bool>(f: &F) -> bool {
f()
}
let higher_closure = |&: f: &Fn() -> bool | { f() };
let closure1 = |&:| { true };
let closure2 = |&:| { false };
higher_fn(&closure1);
higher_fn(&closure2);
higher_closure(&closure1);
higher_closure(&closure2);
The existing answer(s) are good, but I wanted to share an even smaller example of the problem:
fn thing<F: FnOnce(T), T>(f: F) {}
fn main() {
let caller = |&: f| {thing(f)};
caller(|&: _| {});
caller(|&: _| {});
}
When we define caller, its signature is not fully fixed yet. When we call it the first time, type inference sets the input and output types. In this example, after the first call, caller will be required to take a closure with a specific type, the type of the first closure. This is because each and every closure has its own unique, anonymous type. When we call caller a second time, the second closure's (unique, anonymous) type doesn't fit!
As #wingedsubmariner points out, there's no way to create closures with generic types. If we had hypothetical syntax like for<F: Fn()> |f: F| { ... }, then perhaps we could work around this. The suggestion to make a generic function is a good one.

Resources