How to reduce std::io::Chain

Moving on from https://doc.rust-lang.org/rust-by-example/std_misc/file/read_lines.html, I would like to define a function that accepts an iterable of Paths and returns a reader that wraps all the files into a single stream. My non-compiling attempt:
fn read_lines<P, I: IntoIterator<Item = P>>(files: I) -> Result<io::Lines<io::BufReader<File>>>
where
    P: AsRef<Path>,
{
    let handles = files.into_iter()
        .map(|path|
             File::open(path).unwrap());
    // I guess it is hard (impossible?) to define the type of this reduction,
    // Chain<File, Chain<File, ..., Chain<File, File>>>
    // and that is the reason the compiler is complaining.
    match handles.reduce(|a, b| a.chain(b)) {
        Some(combination) => Ok(BufReader::new(combination).lines()),
        None => {
            // Not nice, hard fail if the array len is 0
            Ok(BufReader::new(handles.next().unwrap()).lines())
        },
    }
}
This gives an expected error, which I am unsure how to address,
error[E0599]: the method `chain` exists for struct `File`, but its trait bounds were not satisfied
--> src/bin.rs:136:35
|
136 | match handles.reduce(|a, b| a.chain(b)) {
| ^^^^^ method cannot be called on `File` due to unsatisfied trait bounds
|
::: /home/test/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/fs.rs:91:1
|
91 | pub struct File {
| --------------- doesn't satisfy `File: Iterator`
|
::: /home/test/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/mod.rs:902:8
|
902 | fn chain<R: Read>(self, next: R) -> Chain<Self, R>
| ----- the method is available for `Box<File>` here
|
= note: the following trait bounds were not satisfied:
`File: Iterator`
which is required by `&mut File: Iterator`
= help: items from traits can only be used if the trait is in scope
help: the following trait is implemented but not in scope; perhaps add a `use` for it:
|
1 | use std::io::Read;
|
error: aborting due to previous error
I've tried contorting the code with Boxes without success, but it seems the fundamental issue is that the type of this reduction is "undefined": Chain<File, Chain<File, ..., Chain<File, File>>>, if I understand correctly. How would a Rustacean define a method like this? Is it possible without using dynamic "boxes"?

I guess it is hard (impossible?) to define the type of this reduction, Chain<File, Chain<File, ..., Chain<File, File>>>. [...] How would a Rustacean define a method like this?
The combinator you are looking for is flat_map:
let handles = files.into_iter().map(|path| File::open(path).unwrap());
handles.flat_map(|handle| BufReader::new(handle).lines())
Also, your return type is unnecessarily specific, committing to a particular implementation of both the iterator over the handles and the iterator over the lines coming from a handle. Even if you get it to work, the signature of your function will be tightly coupled to its implementation, meaning you won't be able to, for example, switch to a more efficient approach without introducing a breaking change to the API.
To avoid such coupling, you can use an impl Trait return type. That way the signature of your function only promises that the type of the returned value will implement Iterator. The function could then look like this:
fn read_lines<P, I: IntoIterator<Item = P>>(files: I) -> impl Iterator<Item = io::Result<String>>
where
    P: AsRef<Path>,
{
    let handles = files.into_iter().map(|path| File::open(path).unwrap());
    handles.flat_map(|handle| BufReader::new(handle).lines())
}
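As a minimal, self-contained sketch of the flat_map approach, the version below is generic over any Read source instead of opening files, so it can be run without touching the file system. The name lines_of_all and the in-memory byte slices are assumptions made for illustration; they are not part of the original code.

```rust
use std::io::{self, BufRead, BufReader, Read};

// Hypothetical helper (not from the original post): yields the lines of every
// reader in turn, without ever naming a nested Chain type.
fn lines_of_all<R, I>(readers: I) -> impl Iterator<Item = io::Result<String>>
where
    R: Read,
    I: IntoIterator<Item = R>,
{
    readers
        .into_iter()
        .flat_map(|reader| BufReader::new(reader).lines())
}

fn main() {
    // `&[u8]` implements `Read`, so byte slices stand in for opened files here.
    let inputs = vec!["a\nb\n".as_bytes(), "c\n".as_bytes()];
    for line in lines_of_all(inputs) {
        println!("{}", line.unwrap());
    }
}
```

Swapping the byte slices for File handles should give the behaviour of read_lines above, since File implements Read as well.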
Finally, if you really want to combine reduce and chain, you can do that too. Your intuition that you need to use a Box is correct, but it is much easier to use fold() than reduce():
handles.fold(
    Box::new(std::iter::empty()) as Box<dyn Iterator<Item = _>>,
    |iter, handle| Box::new(iter.chain(BufReader::new(handle).lines())),
)
Folding starts with an empty iterator, boxed and cast to a trait object, and proceeds to chain lines of each handle to the end of the previous iterator chain. Each result of the chain is boxed so that its type is erased to Box<dyn Iterator<Item = io::Result<String>>>, which eliminates the recursion on the type level. The return type of the function can be either impl Iterator or Box<dyn Iterator>, both will compile.
Note that this solution is inefficient, not just due to boxing, but also because the final iterator will wrap all the previous ones. Although the recursion is not visible from the erased types, it's there in the implementation, and the final next() will internally have to go through all the stacked iterators, possibly even blowing up the stack if there is a sufficient number of files. The solution based on flat_map() doesn't have that issue.
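The fold-based variant can be sketched in runnable form as follows. It is an illustration under assumptions: chained_lines and the BoxedLines alias are hypothetical names, the readers are in-memory byte slices rather than files, and the 'static bound is added only so the boxed trait object needs no explicit lifetime.

```rust
use std::io::{self, BufRead, BufReader, Read};

// Hypothetical alias for the type-erased iterator described above.
type BoxedLines = Box<dyn Iterator<Item = io::Result<String>>>;

// Hypothetical helper: every step boxes the new Chain, so the accumulator
// keeps the same type no matter how many readers are folded in.
fn chained_lines<R, I>(readers: I) -> BoxedLines
where
    R: Read + 'static,
    I: IntoIterator<Item = R>,
{
    readers.into_iter().fold(
        Box::new(std::iter::empty()) as BoxedLines,
        |iter, reader| Box::new(iter.chain(BufReader::new(reader).lines())),
    )
}

fn main() {
    let inputs = vec!["a\nb\n".as_bytes(), "c\n".as_bytes()];
    for line in chained_lines(inputs) {
        println!("{}", line.unwrap());
    }
}
```

Each reader adds one more layer of Box and Chain around the previous iterator, which is the nesting the note above warns about.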

Related

How do I use collect::<HashSet<_>>.intersection() without the values becoming borrowed?

I am looping over a Vec<&str>, each time reassigning a variable that holds the intersection of the last two checked. This results in "expected char, found &char". I think this is happening because the loop is a new block scope, which means the values from the original HashSet are borrowed and go into the new HashSet as borrowed. Unfortunately, the type checker doesn't like that. How do I create a new HashSet<char> instead of HashSet<&char>?
Here is my code:
use std::collections::HashSet;
fn find_item_in_common(sacks: Vec::<&str>) -> char {
    let mut item: Option<char> = None;
    let mut sacks_iter = sacks.iter();
    let matching_chars = sacks_iter.next().unwrap().chars().collect::<HashSet<_>>();
    loop {
        let next_sack = sacks_iter.next();
        if next_sack.is_none() { break; }
        let next_sack_values: HashSet<_> = next_sack.unwrap().chars().collect();
        matching_chars = matching_chars.intersection(&next_sack_values).collect::<HashSet<_>>();
    }
    matching_chars.drain().nth(0).unwrap()
}
and here are the errors that I'm seeing:
error[E0308]: mismatched types
--> src/bin/03.rs:13:26
|
6 | let matching_chars = sacks_iter.next().unwrap().chars().collect::<HashSet<_>>();
| ---------------------------------------------------------- expected due to this value
...
13 | matching_chars = matching_chars.intersection(&next_sack_values).collect::<HashSet<_>>();
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected `char`, found `&char`
|
= note: expected struct `HashSet<char>`
found struct `HashSet<&char>`
By the way, what is that first error trying to tell me? It seems like it is missing something before or after "expected" -- <missing thing?> expected <or missing thing?> due to this value?
I also tried changing matching_chars = matching_chars to matching_chars = matching_chars.cloned() and I get the following error. I understand what the error is saying, but I don't know how to resolve it.
error[E0599]: the method `cloned` exists for struct `HashSet<char>`, but its trait bounds were not satisfied
--> src/bin/03.rs:13:41
|
13 | matching_chars = matching_chars.cloned().intersection(&next_sack_values).collect::<HashSet<_>>();
| ^^^^^^ method cannot be called on `HashSet<char>` due to unsatisfied trait bounds
|
::: /Users/brandoncc/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/src/rust/library/std/src/collections/hash/set.rs:112:1
|
112 | pub struct HashSet<T, S = RandomState> {
| -------------------------------------- doesn't satisfy `HashSet<char>: Iterator`
|
= note: the following trait bounds were not satisfied:
`HashSet<char>: Iterator`
which is required by `&mut HashSet<char>: Iterator`

Your attempt at using cloned() was almost right but you have to call it after you create the iterator:
matching_chars.intersection(&next_sack_values).cloned().collect::<HashSet<_>>()
or for Copy types you should use the more appropriate .copied() adapter:
matching_chars.intersection(&next_sack_values).copied().collect::<HashSet<_>>()

Looking at the signature of HashSet::intersection will make this clearer:
pub fn intersection<'a>(
    &'a self,
    other: &'a HashSet<T, S>
) -> Intersection<'a, T, S>
The type Intersection<'a, T, S> implements Iterator<Item=&'a T>. So when you collect this iterator, you get a HashSet<&char> as opposed to a HashSet<char>.
The solution is simply to use .cloned on the iterator before you use .collect, since char is Clone, like so:
matching_chars = matching_chars.intersection(&next_sack_values).cloned().collect()
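Putting the fix into the whole function might look like the sketch below. It is one possible arrangement rather than the original author's code: the slice parameter and the Option<char> return type are assumptions chosen to avoid the unwrap calls on empty input.

```rust
use std::collections::HashSet;

// Sketch: returns a character shared by every sack, or None if there is no
// sack at all or nothing in common.
fn find_item_in_common(sacks: &[&str]) -> Option<char> {
    let mut sacks_iter = sacks.iter();
    let mut matching: HashSet<char> = sacks_iter.next()?.chars().collect();
    for sack in sacks_iter {
        let next: HashSet<char> = sack.chars().collect();
        // intersection() yields &char; copied() turns each item into an owned
        // char, so collect() builds a HashSet<char> again.
        matching = matching.intersection(&next).copied().collect();
    }
    matching.into_iter().next()
}

fn main() {
    println!("{:?}", find_item_in_common(&["abc", "bcd", "xcz"]));
}
```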

By the way, what is that first error trying to tell me?
The error is telling you that it expects char because (due to) the original value for matching_chars has type HashSet<char>.
I also tried changing matching_chars = matching_chars to matching_chars = matching_chars.cloned() and I get the following error. I understand what the error is saying, but I don't know how to resolve it.
Do you, really?
str::chars returns an Iterator<Item=char>, so when you collect() it into a hash set you get a HashSet<char>.
The problem is that intersection borrows the hashset, and since the items the hashset contains may or may not be Clone, it also has to borrow the set items, it can't just copy or clone them (not without restricting its flexibility anyway).
So that's where you need to add the cloned call, on the HashSet::intersection in order to adapt it from an Iterator<Item=&char> to an Iterator<Item=char>.
Or you can just use the & operator, which takes two borrowed hashsets and returns an owned hashset (requiring that the items be Clone).
Alternatively, use Iterator::filter or Iterator::find on one of the sets, checking whether the other's HashSet::contains the item being looked at. Fundamentally that's basically what intersection does, and you know there's just one item at the end.
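The & operator mentioned above can be sketched like this. The function name common_chars and the fold over the sets are illustrative assumptions; the point is only that &a & &b produces an owned HashSet<char>.

```rust
use std::collections::HashSet;

// Hypothetical helper: intersects the character sets of all sacks using the
// BitAnd impl for &HashSet, which returns an owned set.
fn common_chars(sacks: &[&str]) -> HashSet<char> {
    let mut sets = sacks.iter().map(|s| s.chars().collect::<HashSet<char>>());
    // An empty input simply yields an empty set.
    let first = sets.next().unwrap_or_default();
    sets.fold(first, |acc, set| &acc & &set)
}

fn main() {
    println!("{:?}", common_chars(&["abc", "bcd", "xcz"]));
}
```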

Why does auto borrowing not occur in Rust if I implement `TryFrom` for a reference type?

Let's say I want to implement a conversion on a reference. In this case, it's a conversion from &f64 -> Foo.
use std::convert::{TryFrom, TryInto};
struct Foo {
    a: f64
}
impl TryFrom<&f64> for Foo {
    type Error = String;
    fn try_from(value: &f64) -> Result<Foo, String> {
        Ok(Foo {
            a: *value
        })
    }
}
fn main() {
    let foo: Foo = 5.0.try_into().unwrap();
    let bar: Foo = (&5.0).try_into().unwrap();
}
(Yes of course this is a pointless and stupid example, but it helps simplify the problem)
Now, the second line in the main method, with manual borrowing, succeeds.
However, the first line in the main method, without the manual borrowing, fails with this error:
error[E0277]: the trait bound `Foo: From<{float}>` is not satisfied
--> src/main.rs:18:24
|
18 | let foo: Foo = 5.0.try_into().unwrap();
| ^^^^^^^^ the trait `From<{float}>` is not implemented for `Foo`
|
= note: required because of the requirements on the impl of `Into<Foo>` for `{float}`
note: required because of the requirements on the impl of `TryFrom<{float}>` for `Foo`
--> src/main.rs:7:6
|
7 | impl TryFrom<&f64> for Foo {
| ^^^^^^^^^^^^^ ^^^
= note: required because of the requirements on the impl of `TryInto<Foo>` for `{float}`
For more information about this error, try `rustc --explain E0277`.
error: could not compile `playground` due to previous error
Why is automatic borrowing not working here?

Just as the error message suggests, the problem is that the trait bound Foo: From<{float}> is not satisfied. When matching traits, Rust does not perform any coercion; it only probes for a suitable method on the receiver. This is documented in The Rustonomicon, which reads:
Note that we do not perform coercions when matching traits (except for receivers, see the next page). If there is an impl for some type U and T coerces to U, that does not constitute an implementation for T.
and the next page says
Suppose we have a function foo that has a receiver (a self, &self or &mut self parameter). If we call value.foo(), the compiler needs to determine what type Self is before it can call the correct implementation of the function. ... If it can't call this function (for example, if the function has the wrong type or a trait isn't implemented for Self), then the compiler tries to add in an automatic reference. This means that the compiler tries <&T>::foo(value) and <&mut T>::foo(value). This is called an "autoref" method call.
So when resolving a method call, the compiler applies auto-ref/deref to the receiver type only. The dot operator is syntactic sugar for a fully qualified function call, so 5.0.try_into() is first tried with the receiver taken by value, as <f64 as TryInto<_>>::try_into(5.0). The standard library provides a blanket impl of TryInto<U> for every T, conditional on U: TryFrom<T>, so this by-value candidate is found and the compiler commits to it without going on to the auto-referenced form <&f64 as TryInto<_>>::try_into(&5.0). Only then is the target type resolved to Foo, which requires Foo: TryFrom<f64>. The only impl available is TryFrom<&f64>, and the compiler does not coerce f64 to &f64 when checking a trait bound, so the check fails and you see the error message above.
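For reference, here is a small sketch of two spellings that do compile when only the TryFrom<&f64> impl exists. The variable names are illustrative; Foo and its impl are taken from the question.

```rust
use std::convert::{TryFrom, TryInto};

struct Foo {
    a: f64,
}

impl TryFrom<&f64> for Foo {
    type Error = String;

    fn try_from(value: &f64) -> Result<Foo, String> {
        Ok(Foo { a: *value })
    }
}

fn main() {
    let x: f64 = 5.0;

    // Borrow explicitly, so the receiver already has type &f64.
    let via_into: Foo = (&x).try_into().unwrap();

    // Or name the conversion directly and pass the reference as an argument.
    let via_from = Foo::try_from(&x).unwrap();

    println!("{} {}", via_into.a, via_from.a);
}
```

If calling try_into() on a plain f64 is the goal, adding a second impl of TryFrom<f64> for Foo would be another option.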

Why isn't `std::mem::drop` exactly the same as the closure |_|() in higher-ranked trait bounds?

The implementation of std::mem::drop is documented to be the following:
pub fn drop<T>(_x: T) { }
As such, I would expect the closure |_| () (colloquially known as the toilet closure) to be a potential 1:1 replacement to drop, in both directions. However, the code below shows that drop isn't compatible with a higher ranked trait bound on the function's parameter, whereas the toilet closure is.
fn foo<F, T>(f: F, x: T)
where
    for<'a> F: FnOnce(&'a T),
{
    dbg!(f(&x));
}
fn main() {
    foo(|_| (), "toilet closure"); // this compiles
    foo(drop, "drop"); // this does not!
}
The compiler's error message:
error[E0631]: type mismatch in function arguments
--> src/main.rs:10:5
|
1 | fn foo<F, T>(f: F, x: T)
| ---
2 | where
3 | for<'a> F: FnOnce(&'a T),
| ------------- required by this bound in `foo`
...
10 | foo(drop, "drop"); // this does not!
| ^^^
| |
| expected signature of `for<'a> fn(&'a _) -> _`
| found signature of `fn(_) -> _`
error[E0271]: type mismatch resolving `for<'a> <fn(_) {std::mem::drop::<_>} as std::ops::FnOnce<(&'a _,)>>::Output == ()`
--> src/main.rs:10:5
|
1 | fn foo<F, T>(f: F, x: T)
| ---
2 | where
3 | for<'a> F: FnOnce(&'a T),
| ------------- required by this bound in `foo`
...
10 | foo(drop, "drop"); // this does not!
| ^^^ expected bound lifetime parameter 'a, found concrete lifetime
Considering that drop is supposedly generic with respect to any sized T, it sounds unreasonable that the "more generic" signature fn(_) -> _ is not compatible with for<'a> fn (&'a _) -> _. Why is the compiler not admitting the signature of drop here, and what makes it different when the toilet closure is placed in its stead?

The core of the issue is that drop is not a single function, but rather a parameterized set of functions that each drop some particular type. To satisfy a higher-ranked trait bound (hereafter hrtb), you'd need a single function that can simultaneously take references to a type with any given lifetime.
We'll use drop as our typical example of a generic function, but all this applies more generally too. Here's the code for reference: fn drop<T>(_: T) {}.
Conceptually, drop is not a single function, but rather one function for every possible type T. Any particular instance of drop takes only arguments of a single type. This is called monomorphization. If a different T is used with drop, a different version of drop is compiled. That's why you can't pass a generic function as an argument and use that function in full generality (see this question)
On the other hand, a function like fn pass(x: &i32) -> &i32 {x} satisfies the hrtb for<'a> Fn(&'a i32) -> &'a i32. Unlike drop, we have a single function that simultaneously satisfies Fn(&'a i32) -> &'a i32 for every lifetime 'a. This is reflected in how pass can be used.
fn pass(x: &i32) -> &i32 {
    x
}
fn two_uses<F>(f: F)
where
    for<'a> F: Fn(&'a i32) -> &'a i32, // By the way, this can simply be written
                                       // F: Fn(&i32) -> &i32 due to lifetime elision rules.
                                       // That applies to your original example too.
{
    {
        // x has some lifetime 'a
        let x = &22;
        println!("{}", f(x));
        // 'a ends around here
    }
    {
        // y has some lifetime 'b
        let y = &23;
        println!("{}", f(y));
        // 'b ends around here
    }
    // 'a and 'b are unrelated since they have no overlap
}
fn main() {
    two_uses(pass);
}
In the example, the lifetimes 'a and 'b have no relation to each other: neither completely encompasses the other. So there isn't some kind of subtyping thing going on here. A single instance of pass is really being used with two different, unrelated lifetimes.
This is why drop doesn't satisfy for<'a> FnOnce(&'a T). Any particular instance of drop can only cover one lifetime (ignoring subtyping). If we passed drop into two_uses from the example above (with slight signature changes and assuming the compiler let us), it would have to choose some particular lifetime 'a, and the instance of drop in the scope of two_uses would be Fn(&'a i32) for that concrete lifetime 'a. Since the function would only apply to the single lifetime 'a, it wouldn't be possible to use it with two unrelated lifetimes.
So why does the toilet closure get a hrtb? When inferring the type for a closure, if the expected type hints that a higher-ranked trait bound is needed, the compiler will try to make one fit. In this case, it succeeds.
Issue #41078 is closely related to this and in particular, eddyb's comment here gives essentially the explanation above (though in the context of closures, rather than ordinary functions). The issue itself doesn't address the present problem though. It instead addresses what happens if you assign the toilet closure to a variable before using it (try it out!).
It's possible that the situation will change in the future, but it would require a pretty big change in how generic functions are monomorphized.
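A common workaround, sketched here under assumptions, is to wrap drop in a closure so that the closure gets the higher-ranked signature and drop is only instantiated inside its body. This foo is a variation on the one in the question: it returns x so the call has an observable result, and the allow attribute only silences the lint that newer compilers emit for dropping a reference.

```rust
// Variation on foo from the question: hands x back after calling f on a borrow.
fn foo<F, T>(f: F, x: T) -> T
where
    for<'a> F: FnOnce(&'a T),
{
    f(&x);
    x
}

#[allow(dropping_references)]
fn main() {
    let a = foo(|_| (), "toilet closure");
    // The closure's signature is inferred from the bound on foo; drop is then
    // called with whatever reference type the closure receives.
    let b = foo(|r| drop(r), "drop, wrapped in a closure");
    println!("{} / {}", a, b);
}
```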

In short, both lines should fail. But because one step in the old way of handling hrtb lifetimes, namely the leak check, currently has a soundness issue, rustc ends up (incorrectly) accepting one and rejecting the other with a pretty bad error message.
If you disable the leak check with rustc +nightly -Zno-leak-check, you'll be able to see a more sensible error message:
error[E0308]: mismatched types
--> src/main.rs:10:5
|
10 | foo(drop, "drop");
| ^^^ one type is more general than the other
|
= note: expected type `std::ops::FnOnce<(&'a &str,)>`
found type `std::ops::FnOnce<(&&str,)>`
My interpretation of this error is that the &x in the body of the foo function only has a scope lifetime confined to the said body, so f(&x) also has the same scope lifetime which can't possibly satisfy the for<'a> universal quantification required by the trait bound.
The question you present here is almost identical to issue #57642, which also has two contrasting parts.
The new way to process hrtb lifetimes is by using so-called universes. Niko has a WIP to tackle the leak check with universes. Under this new regime, both parts of issue #57642 linked above are said to fail with far clearer diagnostics. I suppose the compiler should be able to handle your example code correctly by then, too.

Array cannot be indexed by RangeFull?

Consider the following example:
use std::ops::Index;
use std::ops::RangeFull;
fn f<T: Index<RangeFull>>(x: T) {}
fn main() {
    let x: [i32; 4] = [0, 1, 2, 3];
    f(x);
}
Upon calling f(x), I get an error:
error[E0277]: the type `[i32; 4]` cannot be indexed by `std::ops::RangeFull`
--> src/main.rs:8:5
|
8 | f(x);
| ^ `[i32; 4]` cannot be indexed by `std::ops::RangeFull`
|
= help: the trait `std::ops::Index<std::ops::RangeFull>` is not implemented for `[i32; 4]`
note: required by `f`
--> src/main.rs:4:1
|
4 | fn f<T: Index<RangeFull>>(x: T) {}
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I am confused. I can obviously write, for example, let y = x[..];. Does this not mean indexing x with RangeFull? Are arrays somehow special in this regard?

As you can see in the documentation for the primitive array type, Index<…> is not directly implemented for arrays. This is partly because it would currently be impossible to provide blanket implementations for all array sizes, but mainly because it's not necessary; the implementation for slices is sufficient for most purposes.
The expression x[..] is translated to *std::ops::Index::index(&x, ..) by the compiler, which in turn is evaluated according to the usual method call semantics. Since there is no implementation of Index<RangeFull> for arrays, the compiler repeatedly dereferences &x and performs an unsized coercion at the end, eventually finding the implementation of Index<RangeFull> for [i32].
The process of calling a generic function, like f() in your example, is different from method call semantics. The compiler first infers what T is based on the argument you are passing; in this case T is inferred to be [i32; 4]. In the next step, the compiler verifies whether T satisfies the trait bounds, and since it doesn't, you get an error message.
If we want to make your code work, we need to make sure to pass a slice to f(). Since a slice is unsized, we need to pass it by reference, so we need to define f() like this:
fn f<T: ?Sized + Index<RangeFull>>(_: &T) {}
The ?Sized is necessary since type parameters receive an implicit Sized bound. When calling f(), we need to make sure T is actually inferred as [i32] rather than [i32; 4]. To this end, we can either explicitly specify T
f::<[_]>(&x);
or explicitly perform the unsized conversion before passing the argument, so the compiler infers the desired type:
f(&x as &[_]);
f(&x[..]);
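The three call forms can be tried together in the sketch below. The function full is a hypothetical variant of f() that returns the indexed value, added only so that the result of each call can be inspected.

```rust
use std::ops::{Index, RangeFull};

// Hypothetical variant of f(): takes the value by reference, allows unsized
// T, and returns the result of indexing with `..`.
fn full<T: ?Sized + Index<RangeFull>>(x: &T) -> &T::Output {
    &x[..]
}

fn main() {
    let x: [i32; 4] = [0, 1, 2, 3];

    // Name T explicitly, so &x is coerced from &[i32; 4] to &[i32].
    let a: &[i32] = full::<[_]>(&x);
    // Or perform the unsized conversion before the call.
    let b: &[i32] = full(&x as &[_]);
    let c: &[i32] = full(&x[..]);

    println!("{:?} {:?} {:?}", a, b, c);
}
```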

What is the inferred type of a vector of closures?

I tried to create vector of closures:
fn main() {
    let mut vec = Vec::new();
    vec.push(Box::new(|| 10));
    vec.push(Box::new(|| 20));
    println!("{}", vec[0]());
    println!("{}", vec[1]());
}
That yielded the following error report:
error[E0308]: mismatched types
--> src/main.rs:5:23
|
5 | vec.push(Box::new(|| 20));
| ^^^^^ expected closure, found a different closure
|
= note: expected type `[closure#src/main.rs:4:23: 4:28]`
found type `[closure#src/main.rs:5:23: 5:28]`
= note: no two closures, even if identical, have the same type
= help: consider boxing your closure and/or using it as a trait object
I fixed it by specifying the type explicitly:
let mut vec: Vec<Box<dyn Fn() -> i32>> = Vec::new();
What is the inferred type of vec and why is it that way?

Each closure has an auto-generated, unique, anonymous type. As soon as you add the first closure to the vector, that is the type of all items in the vector. However, when you try to add the second closure, it has a different auto-generated, unique, anonymous type, and so you get the error listed.
Closures are essentially structs that are created by the compiler that implement one of the Fn* traits. The struct contains fields for all the variables captured by the closure, so it by definition needs to be unique, as each closure will capture different numbers and types of variables.
Why can't it just infer Box<dyn Fn() -> i32>?
"can't" is a tough question to answer. It's possible that the compiler could iterate through all the traits of every type that is used to see if some intersection caused the code to compile, but that feels a bit magical to me. You could try opening a feature request or discussing it on one of the forums to see if there is general acceptance of such an idea.
However, Rust does try to make things explicit, especially things that might involve performance. When you go from a concrete struct to a trait object, you are introducing indirection, which has the possibility of being slower.
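A minimal sketch of the explicit version follows. The helper make_closures and the captured variable offset are illustrative assumptions; the annotation on vec is what asks for the trait object.

```rust
// Hypothetical helper: because the element type is spelled out, each
// Box<closure> is converted to Box<dyn Fn() -> i32> when it is pushed.
fn make_closures() -> Vec<Box<dyn Fn() -> i32>> {
    let offset = 5;
    let mut vec: Vec<Box<dyn Fn() -> i32>> = Vec::new();
    vec.push(Box::new(|| 10));
    // This closure captures a variable, so its anonymous struct has a field,
    // yet it fits in the same vector once boxed as a trait object.
    vec.push(Box::new(move || offset + 20));
    vec
}

fn main() {
    let vec = make_closures();
    println!("{}", vec[0]());
    println!("{}", vec[1]());
}
```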
Right now, the Fn* traits work the same as a user-constructed trait:
trait MyTrait {
    fn hello(&self) {}
}
struct MyStruct1;
impl MyTrait for MyStruct1 {}
struct MyStruct2;
impl MyTrait for MyStruct2 {}
fn main() {
    let mut things = vec![];
    things.push(MyStruct1);
    things.push(MyStruct2);
}
error[E0308]: mismatched types
--> src/main.rs:14:17
|
14 | things.push(MyStruct2);
| ^^^^^^^^^ expected struct `MyStruct1`, found struct `MyStruct2`
|
= note: expected type `MyStruct1`
found type `MyStruct2`
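The same explicit opt-in makes the user-defined trait example compile. In this sketch hello() is changed to return a string so the dynamic dispatch is visible, and make_things is a hypothetical helper; neither is part of the original example.

```rust
trait MyTrait {
    fn hello(&self) -> &'static str;
}

struct MyStruct1;
impl MyTrait for MyStruct1 {
    fn hello(&self) -> &'static str {
        "one"
    }
}

struct MyStruct2;
impl MyTrait for MyStruct2 {
    fn hello(&self) -> &'static str {
        "two"
    }
}

// Hypothetical helper: the annotation requests trait objects, so both struct
// types can live in one vector behind a Box.
fn make_things() -> Vec<Box<dyn MyTrait>> {
    let mut things: Vec<Box<dyn MyTrait>> = Vec::new();
    things.push(Box::new(MyStruct1));
    things.push(Box::new(MyStruct2));
    things
}

fn main() {
    for thing in make_things() {
        println!("{}", thing.hello());
    }
}
```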
