Unwrap a BTreeSet in rust - rust

In Rust, the following function is legal:
fn unwrap<T>(s:Option<T>) -> T {
s.unwrap()
}
It takes ownership of s, panics if it is a None, and returns ownership of the contents of s (which is legal since an Option owns its contents).
I was trying to write a similar function with signature
fn unwrap_set<T>(s: BTreeSet<T>) -> T {
...
}
The idea is that it panics unless s has size 1, in which case it returns the single element. It seems like this should be possible for the same reason unwrap is possible, however none of the methods on BTreeSet had the right signature (they would need to have return type T). The closest was take, and I tried to do
let mut s2 = s;
let v: &T = s2.iter().next().unwrap();
s2.take(v).unwrap()
but this failed.
Is writing a function like unwrap_set possible?

The easiest way to do this would be to use BTreeSet<T>'s implementation of IntoIterator, which would allow you to easily pull owned values out of the set one at a time:
use std::collections::BTreeSet;
fn unwrap_set<T>(s: BTreeSet<T>) -> T {
    let mut it = s.into_iter();
    if let Some(first) = it.next() {
        // Only return the element if there is no second one.
        if it.next().is_none() {
            return first;
        }
    }
    panic!("set must have a single value");
}
If you wanted to indirectly rely on IntoIterator you could also use a normal loop, but I don't think it's as readable that way so I probably wouldn't do this:
fn unwrap_set<T>(s: BTreeSet<T>) -> T {
    let mut result = None;
    for item in s {
        // If there is a second value, bail out
        if let Some(_) = result {
            result = None;
            break;
        }
        result = Some(item);
    }
    return result.expect("set must have a single value");
}
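On newer toolchains there may be a more direct route. The following is a minimal sketch, assuming Rust 1.66 or later (where BTreeSet::pop_first is stable) and assuming an extra T: Ord bound that the signature in the question does not have:

```rust
use std::collections::BTreeSet;

/// Returns the only element of `s`; panics unless `s` has exactly one element.
/// Assumption: the `T: Ord` bound is added because `pop_first` requires it,
/// unlike the `into_iter` versions above.
fn unwrap_set<T: Ord>(mut s: BTreeSet<T>) -> T {
    assert_eq!(s.len(), 1, "set must have a single value");
    // The set is known to be non-empty here, so `pop_first` returns `Some`.
    s.pop_first().expect("set must have a single value")
}
```

The length check comes first so that a set with two or more elements panics instead of silently returning its smallest element.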

Related

Make iterator of nested iterators

How could I pack the following code into a single iterator?
use std::io::{BufRead, BufReader};
use std::fs::File;
let file = BufReader::new(File::open("sample.txt").expect("Unable to open file"));
for line in file.lines() {
for ch in line.expect("Unable to read line").chars() {
println!("Character: {}", ch);
}
}
Naively, I’d like to have something like (I skipped unwraps)
let lines = file.lines().next();
Reader {
line: lines,
char: next().chars()
}
and iterate over Reader.char till hitting None, then refreshing Reader.line to a new line and Reader.char to the first character of the line. This doesn't seem to be possible though because Reader.char depends on the temporary variable.
Please note that the question is about nested iterators; reading text files is only used as an example.
You can use the flat_map() iterator utility to create a new iterator that can produce any number of items for each item in the iterator it's called on.
In this case, that's complicated by the fact that lines() returns an iterator of Results, so the Err case must be handled.
There's also the issue that .chars() references the original string to avoid an additional allocation, so you have to collect the characters into another iterable container.
Solving both issues results in this mess:
fn example() -> impl Iterator<Item=Result<char, std::io::Error>> {
    let file = BufReader::new(File::open("sample.txt").expect("Unable to open file"));
    file.lines().flat_map(|line| match line {
        Err(e) => vec![Err(e)],
        Ok(line) => line.chars().map(Ok).collect(),
    })
}
If String gave us an into_chars() method we could avoid collect() here, but then we'd have differently-typed iterators and would need to use either Box<dyn Iterator> or something like either::Either.
Since you already use .expect() here, you can simplify a bit by using .expect() within the closure to avoid handling the Err case:
fn example() -> impl Iterator<Item=char> {
    let file = BufReader::new(File::open("sample.txt").expect("Unable to open file"));
    file.lines().flat_map(|line|
        line.expect("Unable to read line").chars().collect::<Vec<_>>()
    )
}
In the general case, flat_map() is usually quite easy. You just need to be mindful of whether you are iterating owned vs borrowed values; both cases have some sharp corners. In this case, iterating over owned String values makes using .chars() problematic. If we could iterate over borrowed str slices we wouldn't have to .collect().
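To illustrate the borrowed case, here is a minimal sketch. The function name chars_of and the choice of a slice of Strings as input are assumptions made for the example; the point is only that no collect() is needed when the lines outlive the iterator:

```rust
/// Flattens borrowed lines into a stream of chars without collecting.
/// Each `Chars` iterator borrows from `lines`, which lives for `'a`,
/// so nothing refers to a temporary owned by the closure.
fn chars_of<'a>(lines: &'a [String]) -> impl Iterator<Item = char> + 'a {
    lines.iter().flat_map(|line| line.chars())
}
```

This only works because the caller keeps the strings alive; with file.lines() each String is owned by the closure argument and dropped at the end of the closure, which is why the versions above collect into a Vec.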
Drawing on the answer from @cdhowie and this answer that suggests using IntoIter to get an iterator of owned chars, I was able to come up with this solution that is the closest to what I expected:
use std::fs::File;
use std::io;
use std::io::{BufRead, BufReader, Lines};
use std::vec::IntoIter;
struct Reader {
    lines: Lines<BufReader<File>>,
    iter: IntoIter<char>,
}
impl Reader {
    fn new(filename: &str) -> Self {
        let file = BufReader::new(File::open(filename).expect("Unable to open file"));
        let mut lines = file.lines();
        let iter = Reader::char_iter(lines.next().expect("Unable to read file"));
        Reader { lines, iter }
    }
    fn char_iter(line: io::Result<String>) -> IntoIter<char> {
        line.unwrap().chars().collect::<Vec<_>>().into_iter()
    }
}
impl Iterator for Reader {
    type Item = char;
    fn next(&mut self) -> Option<Self::Item> {
        match self.iter.next() {
            None => {
                self.iter = match self.lines.next() {
                    None => return None,
                    Some(line) => Reader::char_iter(line),
                };
                Some('\n')
            }
            Some(val) => Some(val),
        }
    }
}
It works as expected:
let reader = Reader::new("src/main.rs");
for ch in reader {
    print!("{}", ch);
}

How can I implement the typestate pattern based on a discriminator? [duplicate]

I was wondering if it was possible to return different types depending on the conditions in the function:
This code will work if you remove '|| bool' and the 'if/else' statements.
Thanks in advance.
fn main() {
let vector: Vec<i32> = vec![0, 2, 5, 8, 9];
let target: i32 = 3;
let found_item = linear_search(vector, target);
println!("{}", &found_item);
}
fn linear_search(vector: Vec<i32>, target: i32) -> i32 || bool {
let mut found: i32 = 0;
for item in vector {
if item == target {
found = item;
break
}
}
if found == 0 {
false
} else {
found
}
}
The precise type must be known at compile time (and is subsequently erased). You cannot decide arbitrarily which types to return at runtime.
However, you can do what you've tried to do by wrapping the types in a generic enum (which replaces the || in your code):
enum TypeOr<S, T> {
Left(S),
Right(T),
}
fn linear_search(vector: ...) -> TypeOr<i32, bool> { //...
The downside is that you must unwrap the value from the enum before you can do anything else with the result. However, this isn't so arduous in practice.
This is essentially a generalised version of the commonly used Option and Result types.
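A fuller sketch of that idea follows. It assumes the search takes a slice instead of a Vec<i32>, and the describe helper is a hypothetical addition that shows how the caller unwraps the enum with match:

```rust
#[derive(Debug, PartialEq)]
enum TypeOr<S, T> {
    Left(S),
    Right(T),
}

// Assumption: a slice parameter is used here instead of `Vec<i32>`.
fn linear_search(vector: &[i32], target: i32) -> TypeOr<i32, bool> {
    for &item in vector {
        if item == target {
            return TypeOr::Left(item);
        }
    }
    TypeOr::Right(false)
}

// Hypothetical helper: the caller must match on the enum to get at the value.
fn describe(result: TypeOr<i32, bool>) -> String {
    match result {
        TypeOr::Left(found) => format!("found {}", found),
        TypeOr::Right(flag) => format!("not found ({})", flag),
    }
}
```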
Edit: In fact, in your case, you are served very nicely by the semantics of the Option type: you never return true, so you may equate the None result with the false result your function returns, and this captures the idea you're trying to express: either your linear search finds the target and returns it (Some(found)), or it does not, and has nothing to return (None).
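A minimal sketch of the Option version, again assuming a slice parameter rather than the Vec<i32> from the question:

```rust
/// Returns `Some(item)` for the first element equal to `target`, else `None`.
fn linear_search(vector: &[i32], target: i32) -> Option<i32> {
    vector.iter().copied().find(|&item| item == target)
}
```

Unlike the code in the question, this does not use 0 as a "not found" marker, so searching for 0 in a list that contains it yields Some(0).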

Adding an append method to a singly linked list

I was looking through the singly linked list example on rustbyexample.com and I noticed the implementation had no append method, so I decided to try and implement it:
fn append(self, elem: u32) -> List {
let mut node = &self;
loop {
match *node {
Cons(_, ref tail) => {
node = tail;
},
Nil => {
node.prepend(elem);
break;
},
}
}
return self;
}
The above is one of many different attempts, but I cannot seem to find a way to iterate down to the tail and modify it, then somehow return the head, without upsetting the borrow checker in some way.
I am trying to figure out a solution that doesn't involve copying data or doing additional bookkeeping outside the append method.
As described in Cannot obtain a mutable reference when iterating a recursive structure: cannot borrow as mutable more than once at a time, you need to transfer ownership of the mutable reference when performing iteration. This is needed to ensure you never have two mutable references to the same thing.
We use similar code as that Q&A to get a mutable reference to the last item (back) which will always be the Nil variant. We then call it and set that Nil item to a Cons. We wrap all that with a by-value function because that's what the API wants.
No extra allocation, no risk of running out of stack frames.
use List::*;
#[derive(Debug)]
enum List {
    Cons(u32, Box<List>),
    Nil,
}
impl List {
    fn back(&mut self) -> &mut List {
        let mut node = self;
        loop {
            match {node} {
                &mut Cons(_, ref mut next) => node = next,
                other => return other,
            }
        }
    }
    fn append_ref(&mut self, elem: u32) {
        *self.back() = Cons(elem, Box::new(Nil));
    }
    fn append(mut self, elem: u32) -> Self {
        self.append_ref(elem);
        self
    }
}
fn main() {
    let n = Nil;
    let n = n.append(1);
    println!("{:?}", n);
    let n = n.append(2);
    println!("{:?}", n);
    let n = n.append(3);
    println!("{:?}", n);
}
With non-lexical lifetimes, which current Rust compilers enable by default, this function can be written more plainly:
fn back(&mut self) -> &mut List {
    let mut node = self;
    while let Cons(_, next) = node {
        node = next;
    }
    node
}
As the len method is implemented recursively, I have done the same for the append implementation:
fn append(self, elem: u32) -> List {
    match self {
        Cons(current_elem, tail_box) => {
            let tail = *tail_box;
            let new_tail = tail.append(elem);
            new_tail.prepend(current_elem)
        }
        Nil => {
            List::new().prepend(elem)
        }
    }
}
One possible iterative solution would be to implement append in terms of prepend and a reverse function, like so (it won't be as performant but should still only be O(N)):
// Reverses the list
fn rev(self) -> List {
    let mut result = List::new();
    let mut current = self;
    while let Cons(elem, tail) = current {
        result = result.prepend(elem);
        current = *tail;
    }
    result
}
fn append(self, elem: u32) -> List {
    self.rev().prepend(elem).rev()
}
So, it's actually going to be slightly more difficult than you may think; mostly because Box is really missing a destructive take method which would return its content.
Easy way: the recursive way, no return.
fn append_rec(&mut self, elem: u32) {
    match *self {
        Cons(_, ref mut tail) => tail.append_rec(elem),
        Nil => *self = Cons(elem, Box::new(Nil)),
    }
}
This is relatively easy, as mentioned.
Harder way: the recursive way, with return.
fn append_rec(self, elem: u32) -> List {
    match self {
        Cons(e, tail) => Cons(e, Box::new((*tail).append_rec(elem))),
        Nil => Cons(elem, Box::new(Nil)),
    }
}
Note that this is grossly inefficient. For a list of size N, we are destroying N boxes and allocating N new ones. In-place mutation (the first approach) was much better in this regard.
Harder way: the iterative way, with no return.
fn append_iter_mut(&mut self, elem: u32) {
    let mut current = self;
    loop {
        match {current} {
            &mut Cons(_, ref mut tail) => current = tail,
            c @ &mut Nil => {
                *c = Cons(elem, Box::new(Nil));
                return;
            },
        }
    }
}
Okay... so iterating (mutably) over a nested data structure is not THAT easy because ownership and borrow-checking will ensure that:
a mutable reference is never copied, only moved,
a mutable reference with an outstanding borrow cannot be modified.
This is why here:
we use {current} to move current into the match,
we use c @ &mut Nil because we need to name the match of &mut Nil since current has been moved.
Note that thankfully rustc is smart enough to check the execution path and detect that it's okay to continue looping as long as we take the Cons branch since we reinitialize current in that branch, however it's not okay to continue after taking the Nil branch, which forces us to terminate the loop :)
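On current compilers the same in-place append can likely be written without the {current} trick. The following is a sketch under that assumption, with the List type redeclared (and variants fully qualified) so the block stands alone:

```rust
#[derive(Debug, PartialEq)]
enum List {
    Cons(u32, Box<List>),
    Nil,
}

impl List {
    fn append_iter_mut(&mut self, elem: u32) {
        let mut current = self;
        // Each iteration moves the mutable reference one node further down.
        while let List::Cons(_, tail) = current {
            current = tail;
        }
        // `current` now refers to the terminating `Nil`; overwrite it in place.
        *current = List::Cons(elem, Box::new(List::Nil));
    }
}
```

The two rules listed above still apply; the borrow checker is simply able to see that the borrow taken by the pattern ends when current is reassigned.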
Harder way: the iterative way, with return
fn append_iter(self, elem: u32) -> List {
    let mut stack = List::default();
    {
        let mut current = self;
        while let Cons(elem, tail) = current {
            stack = stack.prepend(elem);
            current = take(tail);
        }
    }
    let mut result = List::new();
    result = result.prepend(elem);
    while let Cons(elem, tail) = stack {
        result = result.prepend(elem);
        stack = take(tail);
    }
    result
}
In the recursive way, we were using the stack to keep the items for us, here we use a stack structure instead.
It's even more inefficient than the recursive way with return was; each node causes two deallocations and two allocations.
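The append_iter snippet relies on helpers that are not shown: take, List::new, prepend and a Default implementation. The sketch below fills them in so the function can be compiled; all of these definitions are assumptions, in particular take, which is guessed here to simply move the list out of its box:

```rust
#[derive(Debug, PartialEq)]
enum List {
    Cons(u32, Box<List>),
    Nil,
}

// Assumed: the default list is the empty list.
impl Default for List {
    fn default() -> Self {
        List::Nil
    }
}

// Hypothetical helper: moves the list out of its box, freeing the box.
fn take(boxed: Box<List>) -> List {
    *boxed
}

impl List {
    fn new() -> List {
        List::Nil
    }

    fn prepend(self, elem: u32) -> List {
        List::Cons(elem, Box::new(self))
    }

    fn append_iter(self, elem: u32) -> List {
        // First pass: push every element onto a reversed "stack" list.
        let mut stack = List::default();
        let mut current = self;
        while let List::Cons(e, tail) = current {
            stack = stack.prepend(e);
            current = take(tail);
        }
        // Second pass: start from the new element and prepend in original order.
        let mut result = List::new().prepend(elem);
        while let List::Cons(e, tail) = stack {
            result = result.prepend(e);
            stack = take(tail);
        }
        result
    }
}
```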
TL;DR: in-place modifications are generally more efficient, don't be afraid of using them when necessary.

Can I combine variable assignment with an if?

I have this code:
let fd = libc::creat(path, FILE_MODE);
if fd < 0 {
/* error */
}
The equivalent in C is shorter:
if ((fd = creat(path, FILE_MODE)) < 0) {
/* error */
}
Can I do a similar thing in Rust? I tried to map it to if let, but that looks like it is meant for handling Options.
No, it's not possible by design. let bindings are one of the two non-expression statements in Rust. That means that the binding does not return any value that could be used further.
Bindings as expressions don't make a whole lot of sense in Rust in general. Consider let s = String::new(): this expression can't return String, because s owns the string. Or what about let (x, _) = get_tuple()? Would the expression return the whole tuple or just the not-ignored elements? So ⇒ let bindings aren't expressions.
About the if let: Sadly that won't work either. It just enables us to test if a destructuring works or to put it in other words: destructure a refutable pattern. This doesn't only work with Option<T>, but with all types.
If you really want to shorten this code, there is a way: make c_int easily convertible into a more idiomatic type, like Result. This is best done via extension trait:
trait LibcIntExt {
    fn to_res(self) -> Result<u32, u32>;
}
impl LibcIntExt for c_int {
    fn to_res(self) -> Result<u32, u32> {
        if self < 0 {
            Err(-self as u32)
        } else {
            Ok(self as u32)
        }
    }
}
With this you can use if let in the resulting Result:
if let Err(fd) = libc::creat(path, FILE_MODE).to_res() {
    /* error */
}
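For experimenting without the libc crate, the trait can be exercised against a stand-in. In this sketch fake_creat and open_or_report are hypothetical names invented for the example, and c_int is taken from std::os::raw:

```rust
use std::os::raw::c_int;

trait LibcIntExt {
    fn to_res(self) -> Result<u32, u32>;
}

impl LibcIntExt for c_int {
    fn to_res(self) -> Result<u32, u32> {
        if self < 0 {
            // Negation binds tighter than `as`, so this is `(-self) as u32`.
            Err(-self as u32)
        } else {
            Ok(self as u32)
        }
    }
}

// Hypothetical stand-in for `libc::creat`: a descriptor, or a negative value.
fn fake_creat(should_fail: bool) -> c_int {
    if should_fail { -1 } else { 3 }
}

// Hypothetical caller: both outcomes are handled in one expression.
fn open_or_report(should_fail: bool) -> String {
    match fake_creat(should_fail).to_res() {
        Ok(fd) => format!("opened fd {}", fd),
        Err(code) => format!("failed with {}", code),
    }
}
```

Using match instead of if let keeps the descriptor available in the Ok arm, which is what the C idiom was after.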

Is boxing or explicit lifetimes the right solution when referencing a collection item in a loop?

In Rust (version 1.x) I want to use elements of a collection inside a loop, as in the example below (which records the characters it has seen and does something when it spots a repeated char), where the collection is defined inside the function and only used by the loop.
fn do_something(word:&str) -> u32 {
let mut seen_chars = HashMap::new();
let mut answer : u32 = 0;
for (i,c) in word.chars().enumerate() {
let char_str = Box::new(c.to_string());
match seen_chars.get(&char_str) {
Some(&index) => {
answer = answer + index;
},
None => {seen_chars.insert(char_str,i);}
};
}
answer
}
In order to store references to c in my hashmap (which I have declared outside the loop) I need to box c and allocate it on the heap. This feels inefficient, as if I must be doing something wrong.
I wondered if using explicit lifetimes would be a better way to do things, below is my best attempt but I can't get it to compile.
fn do_something<'a>(word:&str) -> u32 {
let mut seen_chars = : &'a HashMap<&str,usize> = &HashMap::new();
let mut answer : u32 = 0;
for (i,c) in word.chars().enumerate() {
let char_str = &'a str = &c.to_string();
match seen_chars.get(&char_str) {
Some(&index) => {
answer = answer + index;
},
None => {seen_chars.insert(char_str,i);}
};
}
answer
}
When I try compiling I get "error: borrowed value does not live long enough" with an indication that "&HashMap::new()" is the problem.
Can I use lifetime specification to solve this issue or am I doing things the wrong way here?
I don't think either of your approaches is the best solution. You can just use the char itself as a key for your HashMap, no need to convert it to a String:
fn do_something(word: &str) -> usize {
    let mut seen_chars = HashMap::new();
    let mut answer: usize = 0;
    for (i, c) in word.chars().enumerate() {
        match seen_chars.get(&c) {
            Some(&index) => {
                answer = answer + index;
            },
            None => { seen_chars.insert(c, i); }
        };
    }
    answer
}
(I also had to change the type of answer to get this to compile, since enumerate gives you usizes. Alternatively, you could cast i to u32 where necessary.)
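A sketch of that alternative, keeping the u32 return type from the question by casting the index; the explicit HashMap<char, u32> annotation is an assumption added for clarity:

```rust
use std::collections::HashMap;

fn do_something(word: &str) -> u32 {
    let mut seen_chars: HashMap<char, u32> = HashMap::new();
    let mut answer: u32 = 0;
    for (i, c) in word.chars().enumerate() {
        // `enumerate` yields `usize`; the cast truncates for very long inputs.
        let i = i as u32;
        match seen_chars.get(&c) {
            // Repeated char: add the index where it was first seen.
            Some(&index) => answer += index,
            None => {
                seen_chars.insert(c, i);
            }
        }
    }
    answer
}
```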
If, for some reason, you wanted to have string keys instead of char, you would have to use owned strings (i.e. String) instead of string slices (&str). You would end up with something like this:
fn do_something(word: &str) -> usize {
    let mut seen_chars: HashMap<String, usize> = HashMap::new();
    let mut answer: usize = 0;
    for (i, c) in word.chars().enumerate() {
        let char_str = c.to_string();
        match seen_chars.get(&char_str) {
            Some(&index) => {
                answer = answer + index;
            },
            None => { seen_chars.insert(char_str, i); }
        };
    }
    answer
}
But I strongly suspect that the first option is what you actually want.
