Rust: Ownership problem when trying to get next byte from Stdin - rust

I was modifying a code snippet from GitHub that enables fast I/O in Rust for competitive programming.
I want to add a feature that lets it read a single character from stdin, just like getchar in C. My approach combines the pub fn read<T: std::str::FromStr> from the original snippet with this SO answer:
//////////////////////////////////////////////////////////////////////
/// https://github.com/kenkoooo/competitive-programming-rs/blob/master/src/utils/scanner.rs
/// let (stdin, stdout) = (std::io::stdin(), std::io::stdout());
/// let mut sc = IO::new(stdin.lock(), stdout.lock());
pub struct IO<R, W: std::io::Write>(R, std::io::BufWriter<W>);
impl<R: std::io::Read, W: std::io::Write> IO<R, W> {
pub fn new(r: R, w: W) -> IO<R, W> {
IO(r, std::io::BufWriter::new(w))
}
pub fn write<S: ToString>(&mut self, s: S) {
use std::io::Write;
self.1.write_all(s.to_string().as_bytes()).unwrap();
}
pub fn read<T: std::str::FromStr>(&mut self) -> T {
use std::io::Read;
let buf = self
.0
.by_ref()
.bytes()
.map(|b| b.unwrap())
.skip_while(|&b| b == b' ' || b == b'\n' || b == b'\r' || b == b'\t')
.take_while(|&b| b != b' ' && b != b'\n' && b != b'\r' && b != b'\t')
.collect::<Vec<_>>();
unsafe { std::str::from_utf8_unchecked(&buf) }
.parse()
.ok()
.expect("Parse error.")
}
pub fn usize0(&mut self) -> usize {
self.read::<usize>() - 1
}
pub fn vec<T: std::str::FromStr>(&mut self, n: usize) -> Vec<T> {
(0..n).map(|_| self.read()).collect()
}
pub fn chars(&mut self) -> Vec<char> {
self.read::<String>().chars().collect()
}
pub fn char(&mut self) -> char {
self
.0
.by_ref()
.bytes()
.next()
.unwrap()
.unwrap() as char
}
}
///////////////////////////////////////////////////////////////////////
#[allow(non_snake_case)]
fn main() {
let (stdin, stdout) = (std::io::stdin(), std::io::stdout());
let mut sc = IO::new(stdin.lock(), stdout.lock());
let c = sc.char();
sc.write(c);
}
I ran cargo run and the output was:
$ cargo run
warning: unused manifest key: package.author
Compiling ralgo v0.1.0 (/home/xxx/ralgo)
error[E0507]: cannot move out of a mutable reference
--> src/main.rs:40:9
|
40 | / self
41 | | .0
42 | | .by_ref()
43 | | .bytes()
| | ------^
| |__________|_____|
| | move occurs because value has type `R`, which does not implement the `Copy` trait
| value moved due to this method call
|
note: `bytes` takes ownership of the receiver `self`, which moves value
--> /home/xxx/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/io/mod.rs:922:14
|
922 | fn bytes(self) -> Bytes<Self>
| ^^^^
For more information about this error, try `rustc --explain E0507`.
error: could not compile `ralgo` due to previous error
My questions:
The reason for this error and how to fix it.
Is there any better way to achieve getchar in Rust?
Why the rust-analyser shows the return type of let buf = self.0.by_ref().bytes() in the original read function is Bytes<&mut R> but for my code self.0.by_ref().bytes() it was Bytes<R>?

Upfront, I want to note that your read is unsound: it calls from_utf8_unchecked on input bytes that were never validated. Furthermore, the cost of allocating a vector will almost certainly dwarf the cost of validating that the vector is UTF-8, so skipping the check buys you very little.
The reason for this error and how to fix it.
Because you don't import std::io::Read at module level, the trait is not in scope inside char. The compiler knows that R implements Read only because of the bound; it does not consider any other implementation of Read. Notably, it does not know about impl<R: Read + ?Sized> Read for &mut R. As far as it is concerned, the closest thing with a bytes method is the original reader, so it auto-dereferences the &mut R returned by by_ref and tries to move R out of it. The error is much clearer if you assign the result of by_ref to a local variable.
Just stop using fully qualified paths like this, especially for traits. Your code is strictly less readable than if you just imported Read, Write, and BufWriter at the top.
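As a minimal sketch of that fix, the version below imports the traits once at the top so that Read is in scope everywhere, including inside char. Returning Option<char> instead of panicking at end of input is my own assumption, not part of the original snippet, and the `as char` cast is only meaningful for ASCII input.

```rust
use std::io::{BufWriter, Read, Write};

pub struct IO<R, W: Write>(R, BufWriter<W>);

impl<R: Read, W: Write> IO<R, W> {
    pub fn new(r: R, w: W) -> IO<R, W> {
        IO(r, BufWriter::new(w))
    }

    pub fn write<S: ToString>(&mut self, s: S) {
        self.1.write_all(s.to_string().as_bytes()).unwrap();
    }

    /// Reads the next byte and converts it to a char.
    /// Returns None at end of input (an assumption of this sketch).
    pub fn char(&mut self) -> Option<char> {
        // With `Read` in scope, `bytes` resolves on `&mut R` itself,
        // so this is `Bytes<&mut R>` and nothing is moved out of `self`.
        self.0
            .by_ref()
            .bytes()
            .next()
            .map(|b| b.expect("read error") as char)
    }
}
```

Any reader works here, so an in-memory byte slice can stand in for stdin when trying it out.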
Is there any better way to achieve getchar in Rust?
getchar is a horrible function so I'm not too clear on why you'd want that. But I would suggest just using byteorder's read_u8.
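If you would rather not pull in a crate, a std-only sketch of the same idea is a one-byte read_exact. The helper name read_byte and the choice to map end of input to Ok(None) are assumptions made for this illustration; they are not byteorder's API.

```rust
use std::io::{self, Read};

/// Hypothetical helper: reads exactly one byte from `reader`.
/// Returns Ok(None) at end of input and propagates other I/O errors.
pub fn read_byte<R: Read>(reader: &mut R) -> io::Result<Option<u8>> {
    let mut buf = [0u8; 1];
    match reader.read_exact(&mut buf) {
        Ok(()) => Ok(Some(buf[0])),
        // read_exact reports a short read as UnexpectedEof.
        Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => Ok(None),
        Err(e) => Err(e),
    }
}
```

Unlike getchar, this keeps "no more input" and "I/O failure" as separate outcomes rather than folding both into one sentinel value.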
Why the rust-analyser shows the return type
Possibly because it's getting confused and assumes all the traits are in scope.
.skip_while(|&b| b == b' ' || b == b'\n' || b == b'\r' || b == b'\t')
.take_while(|&b| b != b' ' && b != b'\n' && b != b'\r' && b != b'\t')
These seem like complicated ways of not calling u8::is_ascii_whitespace.
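A sketch of what that could look like, combined with a checked UTF-8 conversion so that no unsafe is needed. The free function read_token is a hypothetical stand-in for the read method, and returning Option instead of panicking is my assumption. Note that u8::is_ascii_whitespace also treats form feed as whitespace, which the hand-written comparison did not.

```rust
use std::io::Read;
use std::str::FromStr;

/// Hypothetical variant of `read`: skips leading ASCII whitespace,
/// collects one token, validates it as UTF-8, and parses it.
pub fn read_token<R: Read, T: FromStr>(reader: &mut R) -> Option<T> {
    let buf: Vec<u8> = reader
        .by_ref()
        .bytes()
        .map(|b| b.expect("read error"))
        .skip_while(|b| b.is_ascii_whitespace())
        .take_while(|b| !b.is_ascii_whitespace())
        .collect();
    // Checked conversion: invalid UTF-8 yields None instead of undefined behaviour.
    String::from_utf8(buf).ok()?.parse().ok()
}
```

As in the original, take_while also consumes the delimiter byte that ends the token.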

Related

How to return offset char_indices from a function

Suppose the following Rust snippet:
use std::borrow::Cow;
fn char_indices_from(s: &str, offset: usize) -> impl Iterator<Item=(usize, char)> + '_ {
s[offset..].char_indices().map(move |(i,c)| (i+offset,c))
}
fn main() {
let mut m = Cow::from("watermelons and stuff");
let offset = 2;
for (i, c) in char_indices_from(&m, offset) {
if i == 3 {
m = Cow::from("clouds and the sky");
break
}
}
}
This displeases the borrow checker:
error[E0506]: cannot assign to `m` because it is borrowed
--> src/main.rs:12:13
|
10 | for (i, c) in char_indices_from(&m, offset) {
| -----------------------------
| | |
| | borrow of `m` occurs here
| a temporary with access to the borrow is created here ...
11 | if i == 3 {
12 | m = Cow::from("clouds and the sky");
| ^ assignment to borrowed `m` occurs here
...
15 | }
| - ... and the borrow might be used here, when that temporary is dropped and runs the destructor for type `impl Iterator<Item = (usize, char)>`
Doing this, however, works just fine:
use std::borrow::Cow;
fn main() {
let mut m = Cow::from("watermelons and stuff");
let offset = 2;
for (i, c) in m[offset..].char_indices().map(|(i,c)| (i+offset, c)) {
if i == 3 {
m = Cow::from("clouds and the sky");
break
}
}
}
Those are some excellent diagnostics given by rustc. Nevertheless, I find myself confused as to how one would fix char_indices_from such that the first program satisfies Rust's borrowing rules.
Your assumption is that you can overwrite m because it's the last thing you do before break.
It's true that the Rust borrow checker is smart enough to figure this out; your second example proves this.
The borrow checker rightfully complains about the first example, though, because you forget destructors, meaning, the Drop trait. Because your return type is impl Iterator + '_, it has to assume this could be any type that implements Iterator and depends on the input lifetimes. Which includes types that use the borrowed values in their Drop implementation. This is also what the compiler tries to tell you.
You could fix that by replacing the impl return type with the actual type, proving to the borrow checker that there is no Drop implementation. Although you will also get problems with that, because your type contains a closure whose type cannot be named.
That's why these things usually return their own iterator type (for example, in the itertools crate, none of the functions have an impl return type).
So that's what I would do: implement your own iterator return type.
use std::{borrow::Cow, str::CharIndices};
struct CharIndicesFrom<'a> {
raw_indices: CharIndices<'a>,
offset: usize,
}
impl Iterator for CharIndicesFrom<'_> {
type Item = (usize, char);
fn next(&mut self) -> Option<Self::Item> {
self.raw_indices.next().map(|(i, c)| (i + self.offset, c))
}
}
fn char_indices_from(s: &str, offset: usize) -> CharIndicesFrom<'_> {
CharIndicesFrom {
raw_indices: s[offset..].char_indices(),
offset,
}
}
fn main() {
let mut m = Cow::from("watermelons and stuff");
let offset = 2;
for (i, c) in char_indices_from(&m, offset) {
if i == 3 {
m = Cow::from("clouds and the sky");
break;
}
}
}

How do I implement an O(n) time and O(1) space FromIterator implementation for a linked list in safe code?

I have a Cons list:
#[derive(Debug)]
pub enum Cons {
Empty,
Pair(i64, Box<Cons>),
}
I want to implement FromIterator<i64> for this type, in an efficient manner.
Attempt one is straightforward: implement a push method which recursively traverses the list and transforms a Cons::Empty into a Cons::Pair(x, Box::new(Cons::Empty)); repeatedly call this push method. This operation is O(n^2) in time and O(n) in temporary space for the stack frames.
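For reference, attempt one might look roughly like the sketch below. The function name from_iter_quadratic is made up for this illustration, and PartialEq is derived only so the result can be compared.

```rust
#[derive(Debug, PartialEq)]
pub enum Cons {
    Empty,
    Pair(i64, Box<Cons>),
}

impl Cons {
    /// Walks to the end of the list recursively, then replaces the
    /// trailing Empty with a new Pair. One stack frame per element.
    pub fn push(&mut self, x: i64) {
        match self {
            Cons::Empty => *self = Cons::Pair(x, Box::new(Cons::Empty)),
            Cons::Pair(_, rest) => rest.push(x),
        }
    }
}

/// Hypothetical helper: builds the list by pushing every item in turn.
/// Each push traverses the whole list, hence O(n^2) time overall.
pub fn from_iter_quadratic<I: IntoIterator<Item = i64>>(iter: I) -> Cons {
    let mut list = Cons::Empty;
    for x in iter {
        list.push(x);
    }
    list
}
```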
Attempt two will combine the recursion with the iterator to improve the time performance: by pulling a single item from the iterator to construct a Cons::Pair and then recursing to construct the remainder of the list, we now construct the list in O(n) time and O(n) temporary space:
impl FromIterator<i64> for Cons {
fn from_iter<I>(iter: I) -> Self
where
I: IntoIterator<Item = i64>,
{
let mut iter = iter.into_iter();
match iter.next() {
Some(x) => Cons::Pair(x, Box::new(iter.collect())),
None => Cons::Empty,
}
}
}
In C, it would be possible to implement this method using O(n) operations and O(1) working space size. However, I cannot translate it into Rust. The idea is simple, but it requires storing two mutable pointers to the same value; something that Rust forbids. A failed attempt:
impl FromIterator<i64> for Cons {
fn from_iter<I>(iter: I) -> Self
where
I: IntoIterator<Item = i64>,
{
let mut iter = iter.into_iter();
let ret = Box::new(Cons::Empty);
let mut cursor = ret;
loop {
match iter.next() {
Some(x) => {
let mut next = Box::new(Cons::Empty);
*cursor = Cons::Pair(x, next);
cursor = next;
}
None => break,
}
}
return *ret;
}
}
error[E0382]: use of moved value: `next`
--> src/lib.rs:20:30
|
18 | let mut next = Box::new(Cons::Empty);
| -------- move occurs because `next` has type `Box<Cons>`, which does not implement the `Copy` trait
19 | *cursor = Cons::Pair(x, next);
| ---- value moved here
20 | cursor = next;
| ^^^^ value used here after move
error[E0382]: use of moved value: `*ret`
--> src/lib.rs:25:16
|
13 | let ret = Box::new(Cons::Empty);
| --- move occurs because `ret` has type `Box<Cons>`, which does not implement the `Copy` trait
14 | let mut cursor = ret;
| --- value moved here
...
25 | return *ret;
| ^^^^ value used here after move
Is it possible to perform this algorithm in safe Rust? How else could I implement an efficient FromIterator for my Cons type? I understand that I may be able to make some headway by switching Box to Rc, but I'd like to avoid this if possible.
You are attempting to have two owners of a single variable, but Rust only allows a single owner. You do this twice: once for ret and once for next. Instead, use mutable references.
I chose to introduce a last() method which can be used in an implementation of Extend and participate in more abstractions.
#[derive(Debug)]
pub enum Cons {
Empty,
Pair(i64, Box<Cons>),
}
impl Cons {
fn last(&mut self) -> &mut Self {
let mut this = self;
loop {
eprintln!("1 loop turn");
match this {
Cons::Empty => return this,
Cons::Pair(_, next) => this = next,
}
}
}
}
impl FromIterator<i64> for Cons {
fn from_iter<I>(iter: I) -> Self
where
I: IntoIterator<Item = i64>,
{
let mut this = Cons::Empty;
this.extend(iter);
this
}
}
impl Extend<i64> for Cons {
fn extend<I>(&mut self, iter: I)
where
I: IntoIterator<Item = i64>,
{
let mut this = self.last();
for i in iter {
eprintln!("1 loop turn");
*this = Cons::Pair(i, Box::new(Cons::Empty));
this = match this {
Cons::Empty => unreachable!(),
Cons::Pair(_, next) => next,
};
}
}
}
fn main() {
dbg!(Cons::from_iter(0..10));
}
This produces
Pair(0, Pair(1, Pair(2, Pair(3, Pair(4, Pair(5, Pair(6, Pair(7, Pair(8, Pair(9, Empty))))))))))
0 -> 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8 -> 9 -> ⏚
See also:
Adding an append method to a singly linked list
How to implement an addition method of linked list?
How do I return a mutable reference to the last element in a singly linked list to append an element?
Learn Rust With Entirely Too Many Linked Lists

Iterate through a whole file one character at a time

I'm new to Rust and I'm struggling with the concept of lifetimes. I want to make a struct that iterates through a file a character at a time, but I'm running into issues where I need lifetimes. I've tried to add them where I thought they should be, but the compiler isn't happy. Here's my code:
struct Advancer<'a> {
line_iter: Lines<BufReader<File>>,
char_iter: Chars<'a>,
current: Option<char>,
peek: Option<char>,
}
impl<'a> Advancer<'a> {
pub fn new(file: BufReader<File>) -> Result<Self, Error> {
let mut line_iter = file.lines();
if let Some(Ok(line)) = line_iter.next() {
let char_iter = line.chars();
let mut advancer = Advancer {
line_iter,
char_iter,
current: None,
peek: None,
};
// Prime the pump. Populate peek so the next call to advance returns the first char
let _ = advancer.next();
Ok(advancer)
} else {
Err(anyhow!("Failed reading an empty file."))
}
}
pub fn next(&mut self) -> Option<char> {
self.current = self.peek;
if let Some(char) = self.char_iter.next() {
self.peek = Some(char);
} else {
if let Some(Ok(line)) = self.line_iter.next() {
self.char_iter = line.chars();
self.peek = Some('\n');
} else {
self.peek = None;
}
}
self.current
}
pub fn current(&self) -> Option<char> {
self.current
}
pub fn peek(&self) -> Option<char> {
self.peek
}
}
fn main() -> Result<(), Error> {
let file = File::open("input_file.txt")?;
let file_buf = BufReader::new(file);
let mut advancer = Advancer::new(file_buf)?;
while let Some(char) = advancer.next() {
print!("{}", char);
}
Ok(())
}
And here's what the compiler is telling me:
error[E0515]: cannot return value referencing local variable `line`
--> src/main.rs:37:13
|
25 | let char_iter = line.chars();
| ---- `line` is borrowed here
...
37 | Ok(advancer)
| ^^^^^^^^^^^^ returns a value referencing data owned by the current function
error[E0597]: `line` does not live long enough
--> src/main.rs:49:34
|
21 | impl<'a> Advancer<'a> {
| -- lifetime `'a` defined here
...
49 | self.char_iter = line.chars();
| -----------------^^^^--------
| | |
| | borrowed value does not live long enough
| assignment requires that `line` is borrowed for `'a`
50 | self.peek = Some('\n');
51 | } else {
| - `line` dropped here while still borrowed
error: aborting due to 2 previous errors
Some errors have detailed explanations: E0515, E0597.
For more information about an error, try `rustc --explain E0515`.
error: could not compile `advancer`.
Some notes:
The Chars iterator borrows from the String it was created from. So you can't drop the String while the iterator is alive. But that's what happens in your new() method, the line variable owning the String disappears while the iterator referencing it is stored in the struct.
You could also try storing the current line in the struct, then it would live long enough, but that's not an option – a struct cannot hold a reference to itself.
Can you make a char iterator on a String that doesn't store a reference into the String? Yes, probably, for instance by storing the current position in the string as an integer – it shouldn't be the index of the char, because chars can be more than one byte long, so you'd need to deal with the underlying bytes yourself (using e.g. is_char_boundary() to take the next bunch of bytes starting from your current index that form a char).
Is there an easier way? Yes, if performance is not of the highest importance, one solution is to make use of Vec's IntoIterator instance (which uses unsafe magic to create an object that hands out parts of itself):
let char_iter = file_buf.lines().flat_map(|line_res| {
let line = line_res.unwrap_or(String::new());
line.chars().collect::<Vec<_>>()
});
Note that just returning line.chars() would have the same problem as the first point.
You might think that String should have a similar IntoIterator instance, and I wouldn't disagree.
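The index-based idea from the notes above can be sketched as follows. The type name OwnedChars is hypothetical. Rather than calling is_char_boundary by hand, this version decodes the next char from the remaining slice and advances the byte offset by its encoded length, which keeps the offset on a char boundary.

```rust
/// Hypothetical owning char iterator: it stores the String itself plus a
/// byte offset, so it holds no reference into the String.
pub struct OwnedChars {
    line: String,
    pos: usize, // byte offset of the next char; always on a char boundary
}

impl OwnedChars {
    pub fn new(line: String) -> Self {
        OwnedChars { line, pos: 0 }
    }
}

impl Iterator for OwnedChars {
    type Item = char;

    fn next(&mut self) -> Option<char> {
        // Decode one char starting at the current offset.
        let c = self.line[self.pos..].chars().next()?;
        // Advance by the char's UTF-8 length, not by 1.
        self.pos += c.len_utf8();
        Some(c)
    }
}
```

Because the struct owns its String and has no lifetime parameter, it can be stored in a field such as char_iter and replaced whenever a new line is read.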

Is there a way to use locked standard input and output in a constructor to live as long as the struct you're constructing?

I'm building a PromptSet that can ask a series of questions in a row. For testing reasons, it allows you to pass a reader and writer instead of using stdin & stdout directly.
Because stdin and stdout are the common use case, I would like to create a default "constructor" that allows the user to produce a PromptSet<StdinLock, StdoutLock> without needing any parameters. Here's the code so far:
use std::io::{self, BufRead, StdinLock, StdoutLock, Write};
pub struct PromptSet<R, W>
where
R: BufRead,
W: Write,
{
pub reader: R,
pub writer: W,
}
impl<R, W> PromptSet<R, W>
where
R: BufRead,
W: Write,
{
pub fn new(reader: R, writer: W) -> PromptSet<R, W> {
return PromptSet {
reader: reader,
writer: writer,
};
}
pub fn default<'a>() -> PromptSet<StdinLock<'a>, StdoutLock<'a>> {
let stdin = io::stdin();
let stdout = io::stdout();
return PromptSet {
reader: stdin.lock(),
writer: stdout.lock(),
};
}
pub fn prompt(&mut self, question: &str) -> String {
let mut input = String::new();
write!(self.writer, "{}: ", question).unwrap();
self.writer.flush().unwrap();
self.reader.read_line(&mut input).unwrap();
return input.trim().to_string();
}
}
fn main() {}
StdinLock and StdoutLock both need a lifetime declared. To complicate it, I think the original stdin()/stdout() handles need to live at least as long as the locks do. I would like the references to StdinLock and StdoutLock to live as long as my PromptSet does but no matter what I try I can't get it to work. Here is the error that I keep getting:
error[E0597]: `stdin` does not live long enough
--> src/main.rs:30:21
|
30 | reader: stdin.lock(),
| ^^^^^ borrowed value does not live long enough
...
33 | }
| - borrowed value only lives until here
|
note: borrowed value must be valid for the lifetime 'a as defined on the method body at 25:5...
--> src/main.rs:25:5
|
25 | / pub fn default<'a>() -> PromptSet<StdinLock<'a>, StdoutLock<'a>> {
26 | | let stdin = io::stdin();
27 | | let stdout = io::stdout();
28 | |
... |
32 | | };
33 | | }
| |_____^
error[E0597]: `stdout` does not live long enough
--> src/main.rs:31:21
|
31 | writer: stdout.lock(),
| ^^^^^^ borrowed value does not live long enough
32 | };
33 | }
| - borrowed value only lives until here
|
note: borrowed value must be valid for the lifetime 'a as defined on the method body at 25:5...
--> src/main.rs:25:5
|
25 | / pub fn default<'a>() -> PromptSet<StdinLock<'a>, StdoutLock<'a>> {
26 | | let stdin = io::stdin();
27 | | let stdout = io::stdout();
28 | |
... |
32 | | };
33 | | }
| |_____^
It's perfectly possible I just don't understand the concept of lifetimes or something else super basic.
The lock method's signature is fn lock(&self) -> StdinLock, which, when fully expanded with lifetime annotations, is fn lock<'a>(&'a self) -> StdinLock<'a>. Thus the StdinLock can only live as long as the value that the lock method is called on. Since you defined stdin in this very function, the StdinLock can't outlive the function. This is the same as returning a reference to a local value. You also can't return the reference and the referred-to value together.
You can't do this, and you can't work around it. The only fix is to have the default method take a Stdin and a Stdout object as arguments.
That said, you can work around it. Yes I know, I just said the exact opposite, but it's more of a "no one other than me will ever use stdin/stdout" (a.k.a., println! will not work anymore!).
Since Rust 1.26, you can use Box::leak to leak the Stdin into a &'static Stdin, which will yield a StdinLock<'static>. Before Rust 1.26, you can use the leak crate:
pub fn default() -> PromptSet<StdinLock<'static>, StdoutLock<'static>> {
let stdin = Box::leak(Box::new(io::stdin()));
let stdout = Box::leak(Box::new(io::stdout()));
PromptSet {
reader: stdin.lock(),
writer: stdout.lock(),
}
}
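On newer toolchains the leak should not be necessary: as of Rust 1.61, Stdin::lock and Stdout::lock return locks with a 'static lifetime. The sketch below assumes such a toolchain. The constructor name from_std is hypothetical, and it lives in a separate impl block for the concrete lock types so that callers do not have to spell out R and W.

```rust
use std::io::{self, BufRead, StdinLock, StdoutLock, Write};

pub struct PromptSet<R: BufRead, W: Write> {
    pub reader: R,
    pub writer: W,
}

impl<R: BufRead, W: Write> PromptSet<R, W> {
    pub fn new(reader: R, writer: W) -> Self {
        PromptSet { reader, writer }
    }

    pub fn prompt(&mut self, question: &str) -> String {
        let mut input = String::new();
        write!(self.writer, "{}: ", question).unwrap();
        self.writer.flush().unwrap();
        self.reader.read_line(&mut input).unwrap();
        input.trim().to_string()
    }
}

// Assumes Rust 1.61+, where both `lock` methods return 'static guards.
impl PromptSet<StdinLock<'static>, StdoutLock<'static>> {
    /// Hypothetical convenience constructor for the stdin/stdout case.
    pub fn from_std() -> Self {
        PromptSet::new(io::stdin().lock(), io::stdout().lock())
    }
}
```

The generic new still accepts an in-memory reader and writer for tests, for example a byte slice and a Vec<u8>. The caveat about holding the locks for the lifetime of the PromptSet still applies.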
This might not really be an answer to your question, but rather to a similar problem. Here's my solution.
The main trick here is to call stdin.lock() for every single line.
use std::io;
use std::io::prelude::*;
use std::io::Stdin;
struct StdinWrapper {
stdin: Stdin,
}
impl Iterator for StdinWrapper {
type Item = String;
fn next(&mut self) -> Option<Self::Item> {
let stdin = &self.stdin;
let mut lines = stdin.lock().lines();
match lines.next() {
Some(result) => Some(result.expect("Cannot read line")),
None => None,
}
}
}
/**
* Callers of this method should not know concrete source of the strings.
* It could be Stdin, a file, DB, or even aliens from SETI.
*/
fn read() -> Box<dyn Iterator<Item = String>> {
let stdin = io::stdin();
Box::new(StdinWrapper { stdin })
}
fn main() {
let lines = read();
for line in lines {
println!("{}", line);
}
}

How to destructure tuple struct with reference

I'm trying to use the hyper library to make some requests. The Headers::get() method returns Option<&H>, where H is a tuple struct with one field. I can use if let Some() to destructure the Option. But how do we destructure the &H? Sure I could always access the field with .0, but I'm curious if Rust has a syntax to do this.
struct s(String);
fn f(input: &s) -> &s {
input
}
fn main() {
let my_struct1 = s("a".to_owned());
let s(foo) = my_struct1;
let my_struct2 = s("b".to_owned());
let &s(bar) = f(&my_struct2); // this does not work
let baz = &my_struct2.0; // this works
}
When you try to compile this, the Rust compiler will tell you how to fix the error with a nice message:
error[E0507]: cannot move out of borrowed content
--> <anon>:11:9
|
11 | let &s(bar) = f(&my_struct2); // this does not work
| ^^^---^
| | |
| | hint: to prevent move, use `ref bar` or `ref mut bar`
| cannot move out of borrowed content
This is needed to tell the compiler that you only want a reference to the field in the struct; the default matching will perform a move and the original struct value will no longer be valid.
Let's fix the example:
struct s(String);
fn f(input: &s) -> &s {
input
}
fn main() {
let my_struct1 = s("a".to_owned());
let s(foo) = my_struct1;
let my_struct2 = s("b".to_owned());
let &s(ref bar) = f(&my_struct2);
}
Another way is to dereference first and drop the &. I think this is preferred in Rust:
struct s(String);
fn f(input: &s) -> &s {
input
}
fn main() {
let my_struct1 = s("a".to_owned());
let s(foo) = my_struct1;
let my_struct2 = s("b".to_owned());
let s(ref bar) = *f(&my_struct2);
}
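For comparison, here is a sketch of the options side by side. The type Name and the function names are hypothetical, chosen only for this illustration. The third variant relies on default binding modes (stable since Rust 1.26): when a non-reference pattern is matched against a reference, its bindings automatically become references, so neither & nor ref is needed.

```rust
pub struct Name(pub String);

/// Explicit reference pattern plus `ref`, as in the first fix.
pub fn field_via_ref_keyword(n: &Name) -> &String {
    let &Name(ref inner) = n;
    inner
}

/// Dereference first, then bind by reference, as in the second fix.
pub fn field_via_deref(n: &Name) -> &String {
    let Name(ref inner) = *n;
    inner
}

/// Default binding modes: `inner` is inferred to be `&String`.
pub fn field_via_default_binding(n: &Name) -> &String {
    let Name(inner) = n;
    inner
}
```

All three borrow the field without moving it out of the struct.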
