Possible to combine assignment and comparison in an expression? - rust

In C, it's common to assign and compare in a single expression:
n = n_init;
do {
func(n);
} while ((n = n.next) != n_init);
As I understand it this can be expressed in Rust as:
n = n_init;
loop {
func(n);
n = n.next;
if n == n_init {
break;
}
}
Which works the same as the C version (assuming the body of the loop doesn't use continue).
Is there a more terse way to express this in Rust, or is the example above ideal?
For the purposes of this question, assume ownership or satisfying the borrow checker isn't an issue. It's up to the developer to satisfy these requirements.
For example, as an integer:
n = n_init;
loop {
func(&vec[n]);
n = vec[n].next;
if n == n_init {
break;
}
}
It may seem obvious that the Rust example above is idiomatic. However, I'm looking to move quite a lot of loops in this style to Rust, so I'm interested to know whether there is a better or different way to express it.

The idiomatic way to represent iteration in Rust is to use an Iterator. Thus you would implement an iterator that does the n = n.next and then use a for loop to iterate over the iterator.
struct MyIter<'a> {
    pos: &'a MyData,
    start: &'a MyData,
}
impl<'a> Iterator for MyIter<'a> {
    type Item = &'a MyData;
    fn next(&mut self) -> Option<&'a MyData> {
        // Compare addresses, not contents: stop once we are back at the start node.
        if std::ptr::eq(self.pos, self.start) {
            None
        } else {
            let pos = self.pos;
            // Assumes MyData has a `next` field holding a reference to the following node.
            self.pos = self.pos.next;
            Some(pos)
        }
    }
}
Adapting this iterator so that it starts from the first element instead of the second is left as an exercise to the reader.
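One possible take on that exercise, as a minimal sketch for the index-based variant from the question rather than the reference-based one. The names Node, Ring, and ring are hypothetical, and the sketch assumes that start is a valid index and that the links really form a cycle through start.

```rust
/// Hypothetical node type: `next` is the index of the following node in a circular list.
struct Node {
    value: u32,
    next: usize,
}

/// Iterator over a circular, index-linked list that yields every node exactly once,
/// beginning with the node at `start`.
struct Ring<'a> {
    nodes: &'a [Node],
    start: usize,
    // `None` once the walk has come back around to `start`.
    pos: Option<usize>,
}

/// Creates the iterator; assumes `start` is a valid index into `nodes`.
fn ring(nodes: &[Node], start: usize) -> Ring<'_> {
    Ring { nodes, start, pos: Some(start) }
}

impl<'a> Iterator for Ring<'a> {
    type Item = &'a Node;

    fn next(&mut self) -> Option<&'a Node> {
        let current = self.pos?;
        let nodes = self.nodes;
        let node = &nodes[current];
        // Stop after the node whose successor is the starting node.
        self.pos = if node.next == self.start {
            None
        } else {
            Some(node.next)
        };
        Some(node)
    }
}
```

With this in place the loop from the question becomes for node in ring(&vec, n_init) { func(node); }, and continue inside the body is no longer a problem because the advance step lives in the iterator.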

Rust supports pattern matching in if and while (if let and while let):
- instead of having a boolean condition, the test is considered successful if the pattern matches;
- as part of pattern matching, you bind the values matched to names.
Thus, if instead of having a boolean condition you were building an Option...
fn check(next: *mut Node, init: *mut Node) -> Option<*mut Node>;
let mut n = n_init;
loop {
func(n);
if let Some(x) = check(n.next, n_init) {
n = x;
} else {
break;
}
}
However, if you can use an Iterator instead you'll be much more idiomatic.
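To make the if let version concrete, here is a small self-contained sketch that uses indices instead of raw pointers. The Node type, the body of check, and the visit_all helper are assumptions made for illustration; the snippet above only declares the signature of check.

```rust
/// Hypothetical node type for an index-linked circular list.
struct Node {
    next: usize,
}

/// Returns `Some(next)` while the walk should continue,
/// and `None` once it would arrive back at `init`.
fn check(next: usize, init: usize) -> Option<usize> {
    if next == init {
        None
    } else {
        Some(next)
    }
}

/// Visits every node once, starting at `n_init`, and returns the visit order.
fn visit_all(nodes: &[Node], n_init: usize) -> Vec<usize> {
    let mut visited = Vec::new();
    let mut n = n_init;
    loop {
        visited.push(n); // stands in for func(n)
        // The pattern match both tests the condition and binds the next index.
        if let Some(x) = check(nodes[n].next, n_init) {
            n = x;
        } else {
            break;
        }
    }
    visited
}
```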

An assignment in Rust returns the empty tuple. If you are fine with non-idiomatic code you can compare the assignment-result with such an empty tuple and use a logical conjunction to chain your actual loop condition.
let mut current = 3;
let mut parent;
while (parent = get_parent(current)) == () && parent != current {
println!("currently {}, parent is {}", current, parent);
current = parent;
}
// example function
fn get_parent(x: usize) -> usize {
if x > 0 { x - 1 } else { x }
}
// currently 3, parent is 2
// currently 2, parent is 1
// currently 1, parent is 0
This has the disadvantage that entering the loop needs to run logic (which you can avoid with C's do {..} while(); style loops).
You can use this approach inside a do-while macro, but readability isn't that great and at that point a refactoring might be preferable. In any case, this is how it could look:
do_it!({
println!("{}", n);
} while (n = n + 1) == () && n < 4);
This is the code for the macro:
macro_rules! do_it {
($b: block while $e:expr) => {
loop {
$b
if !($e) { break };
}
}
}
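A related trick that avoids both the macro and the comparison with (): a block is an expression in Rust, so the loop body and the update step can sit inside the while condition, with the real condition as the block's final expression. This is only a sketch, and many would consider it just as non-idiomatic as the version above. The next_index and visit_from helpers are hypothetical stand-ins for n = vec[n].next and func(n).

```rust
/// Hypothetical stand-in for `n = vec[n].next`: walks the indices 0 -> 1 -> 2 -> 0.
fn next_index(n: usize) -> usize {
    (n + 1) % 3
}

/// Collects the indices visited by a do-while style loop starting at `n_init`.
fn visit_from(n_init: usize) -> Vec<usize> {
    let mut visited = Vec::new();
    let mut n = n_init;
    // The block runs the body, advances `n`, and evaluates to the loop condition.
    // The actual loop body `{}` is empty.
    while {
        visited.push(n); // stands in for func(n)
        n = next_index(n);
        n != n_init
    } {}
    visited
}
```

Like C's do { .. } while, the body runs once before the condition is first evaluated.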

Related

Recursive closure inside a function [duplicate]

This is a very simple example, but how would I do something similar to:
let fact = |x: u32| {
match x {
0 => 1,
_ => x * fact(x - 1),
}
};
I know that this specific example can be easily done with iteration, but I'm wondering if it's possible to make a recursive function in Rust for more complicated things (such as traversing trees) or if I'm required to use my own stack instead.
There are a few ways to do this.
You can put closures into a struct and pass this struct to the closure. You can even define structs inline in a function:
fn main() {
struct Fact<'s> { f: &'s dyn Fn(&Fact, u32) -> u32 }
let fact = Fact {
f: &|fact, x| if x == 0 {1} else {x * (fact.f)(fact, x - 1)}
};
println!("{}", (fact.f)(&fact, 5));
}
This gets around the problem of having an infinite type (a function that takes itself as an argument) and the problem that fact isn't yet defined inside the closure itself when one writes let fact = |x| {...} and so one can't refer to it there.
Another option is to just write a recursive function as a fn item, which can also be defined inline in a function:
fn main() {
fn fact(x: u32) -> u32 { if x == 0 {1} else {x * fact(x - 1)} }
println!("{}", fact(5));
}
This works fine if you don't need to capture anything from the environment.
One more option is to use the fn item solution but explicitly pass the args/environment you want.
fn main() {
struct FactEnv { base_case: u32 }
fn fact(env: &FactEnv, x: u32) -> u32 {
if x == 0 {env.base_case} else {x * fact(env, x - 1)}
}
let env = FactEnv { base_case: 1 };
println!("{}", fact(&env, 5));
}
All of these approaches work with Rust 1.17 (the dyn keyword used in the first example needs Rust 1.27 or later) and have probably worked since version 0.6. The fns defined inside other fns are no different from those defined at the top level, except that they are only accessible within the fn they are defined inside.
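Since the question mentions traversing trees, here is a minimal sketch of the "nested fn with an explicit environment" approach applied to a tree walk. The Tree type, the leaf helper, and sum_at_least are hypothetical names chosen for the example.

```rust
/// Hypothetical binary tree used only for illustration.
struct Tree {
    value: u32,
    left: Option<Box<Tree>>,
    right: Option<Box<Tree>>,
}

/// Convenience constructor for a node without children.
fn leaf(value: u32) -> Option<Box<Tree>> {
    Some(Box::new(Tree { value, left: None, right: None }))
}

/// Sums all values that are at least `threshold`.
/// `threshold` plays the role of state a closure would have captured.
fn sum_at_least(root: &Tree, threshold: u32) -> u32 {
    // A nested fn item cannot capture `threshold`, so it is passed explicitly.
    fn walk(node: &Tree, threshold: u32) -> u32 {
        let own = if node.value >= threshold { node.value } else { 0 };
        let left = node.left.as_deref().map_or(0, |l| walk(l, threshold));
        let right = node.right.as_deref().map_or(0, |r| walk(r, threshold));
        own + left + right
    }
    walk(root, threshold)
}
```

The recursion uses the ordinary call stack, so no hand-written stack is needed unless the tree is deep enough to risk overflowing it.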
As of Rust 1.62 (July 2022), there's still no direct way to recurse in a closure. As the other answers have pointed out, you need at least a bit of indirection, like passing the closure to itself as an argument, or moving it into a cell after creating it. These things can work, but in my opinion they're kind of gross, and they're definitely hard for Rust beginners to follow. If you want to use recursion but you have to have a closure, for example because you need something that implements FnOnce() to use with thread::spawn, then I think the cleanest approach is to use a regular fn function for the recursive part and to wrap it in a non-recursive closure that captures the environment. Here's an example:
let x = 5;
let fact = || {
fn helper(arg: u64) -> u64 {
match arg {
0 => 1,
_ => arg * helper(arg - 1),
}
}
helper(x)
};
assert_eq!(120, fact());
Here's a really ugly and verbose solution I came up with:
use std::{
cell::RefCell,
rc::{Rc, Weak},
};
fn main() {
let weak_holder: Rc<RefCell<Weak<dyn Fn(u32) -> u32>>> =
Rc::new(RefCell::new(Weak::<fn(u32) -> u32>::new()));
let weak_holder2 = weak_holder.clone();
let fact: Rc<dyn Fn(u32) -> u32> = Rc::new(move |x| {
let fact = weak_holder2.borrow().upgrade().unwrap();
if x == 0 {
1
} else {
x * fact(x - 1)
}
});
weak_holder.replace(Rc::downgrade(&fact));
println!("{}", fact(5)); // prints "120"
println!("{}", fact(6)); // prints "720"
}
The advantages of this are that you call the function with the expected signature (no extra arguments needed), it's a closure that can capture variables (by move), it doesn't require defining any new structs, and the closure can be returned from the function or otherwise stored in a place that outlives the scope where it was created (as an Rc<dyn Fn...>), and it still works.
A closure is just a struct that carries some additional context. Therefore, you can write the struct yourself to achieve recursion (suppose you want to compute a factorial recursively while updating a mutable accumulator):
#[derive(Default)]
struct Fact {
ans: i32,
}
impl Fact {
fn call(&mut self, n: i32) -> i32 {
if n == 0 {
self.ans = 1;
return 1;
}
self.call(n - 1);
self.ans *= n;
self.ans
}
}
To use this struct, just:
let mut fact = Fact::default();
let ans = fact.call(5);

Rust - create filter predicate as function

I want to create a function which returns a predicate.
I have a vector and I want to filter it based on a specific character. My code:
let n = 0; // Irrelevant. This will change
co2 = co2
.into_iter()
.filter(|&binary| binary.chars().collect::<Vec<char>>().get(n).unwrap() == &'1')
.collect();
I have a couple of characters I want to match. My intention was to create a function like this:
fn create_predicate(character: char, n: usize) -> impl FnMut(&str) {
move |x: &str| {
return x.chars().collect::<Vec<char>>().get(n).unwrap() == &character;
}
}
co2 = co2.filter(create_predicate('q', n));
This won't work, but I would like it to look something like that.
So I want to have a function that creates a predicate which I can use in my filter. How can I do that?
Hard to give a complete answer without a full minimal reproducible example, but you can get close with two changes to your code:
The predicate must return bool if you want to use it in filter.
Possibly, the predicate must take &&str.
Working example:
fn create_predicate(character: char, n: usize) -> impl FnMut(&&str) -> bool {
move |x: &&str| {
return x.chars().collect::<Vec<char>>().get(n).unwrap() == &character;
}
}
fn main() {
let co2 = [ "azerty", "foo", "bar" ];
let co2 = co2.iter().copied().filter(create_predicate('q', 1));
}
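A slightly leaner variant, as a sketch: chars().nth(n) avoids collecting into a Vec on every call, and comparing against Some(character) means strings that are too short simply do not match instead of panicking. That non-panicking behaviour is an assumption about what is wanted, not something the question states. The keep_matching helper is hypothetical and only shows the predicate in use.

```rust
/// Builds a predicate that is true when the `n`-th character (0-based) of a string
/// equals `character`. Strings with fewer than `n + 1` characters do not match.
fn create_predicate(character: char, n: usize) -> impl FnMut(&&str) -> bool {
    move |x: &&str| x.chars().nth(n) == Some(character)
}

/// Hypothetical helper showing the predicate inside a `filter` call.
fn keep_matching<'a>(items: &[&'a str], character: char, n: usize) -> Vec<&'a str> {
    items
        .iter()
        .copied()
        .filter(create_predicate(character, n))
        .collect()
}
```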

How to correctly return a reference to a mutated linked list in Rust?

I am solving a LeetCode linked list problem in Rust.
I have a working algorithm, but I am stuck on returning the result from the function. Below is my solution:
pub fn remove_nth_from_end(head: Option<Box<ListNode>>, n: i32) -> Option<Box<ListNode>> {
let mut cursor = head.clone().unwrap();
let mut count: i32 = 0;
while cursor.next != None {
count += 1;
cursor = cursor.next.unwrap();
}
let mut n = count - n;
let mut new_cursor = head.unwrap();
while n != 0 {
n -= 1;
new_cursor = new_cursor.next.unwrap();
}
new_cursor.next = new_cursor.next.unwrap().next;
head // <- error: use of moved value
}
I first clone the head so that I can iterate through the linked list to get its total number of nodes.
Then, I will have to remove one node from the list, hence I'm not cloning the head, instead I use it directly, in this case the variable is moved. So after I am done removing the node, I would like to return the head, so that I can return the whole linked list.
However, because of the ownership system in rust, I wasn't able to return a moved value. The problem is I couldn't clone the value as well because if I were to clone, then the head is no longer pointing to the linked list where I removed one node from it.
How would one solve this kind of issue in Rust? I am fairly new to Rust, just picked up the language recently.
One way is to use &mut over the nodes and then use Option::take to take ownership of the nodes while leaving None behind. Use those combinations to mutate the list:
impl Solution {
pub fn remove_nth_from_end(mut head: Option<Box<ListNode>>, mut n: i32) -> Option<Box<ListNode>> {
match n {
0 => head.and_then(|node| node.next),
_ => {
let mut new_head = &mut head;
while n > 0 {
new_head = if let Some(next) = new_head {
&mut next.next
} else {
return head;
};
n -= 1;
}
let to_skip = new_head.as_mut().unwrap().next.take();
new_head.as_mut().map(|node| {
node.next = if let Some(mut other_node) = to_skip {
other_node.next.take()
} else {
None
};
});
head
}
}
}
}
Disclaimer: this does not count from the end of the list but from the beginning. I didn't notice that part of the question at first, but adapting the counting is the actual exercise.
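For completeness, here is a sketch that does count from the end. It counts the nodes through shared references first, so nothing is cloned or moved, and then walks a &mut cursor to the link that owns the node to remove. The ListNode definition is assumed to match the usual LeetCode one; from_slice and to_vec are hypothetical helpers for trying it out, and leaving the list unchanged for an out-of-range n is an assumption.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct ListNode {
    pub val: i32,
    pub next: Option<Box<ListNode>>,
}

pub fn remove_nth_from_end(mut head: Option<Box<ListNode>>, n: i32) -> Option<Box<ListNode>> {
    // First pass: count the nodes through shared references; `head` is not moved.
    let mut len = 0;
    let mut node = head.as_ref();
    while let Some(current) = node {
        len += 1;
        node = current.next.as_ref();
    }
    // Assumption: an out-of-range `n` leaves the list unchanged.
    if n < 1 || n > len {
        return head;
    }
    // Second pass: advance a mutable cursor to the link that owns the node to remove.
    let mut cursor = &mut head;
    for _ in 0..(len - n) {
        cursor = &mut cursor.as_mut().unwrap().next;
    }
    // Splice the node out by replacing the link with the removed node's successor.
    let removed = cursor.take();
    *cursor = removed.and_then(|node| node.next);
    head
}

/// Hypothetical helper: builds a list from a slice.
pub fn from_slice(values: &[i32]) -> Option<Box<ListNode>> {
    let mut head = None;
    for &val in values.iter().rev() {
        head = Some(Box::new(ListNode { val, next: head }));
    }
    head
}

/// Hypothetical helper: reads the values back out of a list.
pub fn to_vec(mut node: &Option<Box<ListNode>>) -> Vec<i32> {
    let mut out = Vec::new();
    while let Some(current) = node {
        out.push(current.val);
        node = &current.next;
    }
    out
}
```

Because the cursor only borrows head mutably and that borrow ends before the final line, head can be returned without any "use of moved value" error.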

Iterating through a Window of a String without collect

I need to iterate through and compare a window of unknown length of a string. My current implementation works, however I've done performance tests against it, and it is very inefficient. The method needs to be guaranteed to be safe against Unicode.
fn foo(line: &str, patt: &str) {
for window in line.chars().collect::<Vec<char>>().windows(patt.len()) {
let mut bar = String::new();
for ch in window {
bar.push(*ch);
}
// perform various comparison checks
}
}
An improvement on Shepmaster's final solution, which significantly lowers overhead (by a factor of ~1.5), is
fn foo(line: &str, pattern: &str) -> bool {
let pattern_len = pattern.chars().count();
let starts = line.char_indices().map(|(i, _)| i);
let mut ends = line.char_indices().map(|(i, _)| i);
// Itertools::dropping
if pattern_len != 0 { ends.nth(pattern_len - 1); }
for (start, end) in starts.zip(ends.chain(Some(line.len()))) {
let bar = &line[start..end];
if bar == pattern { return true }
}
false
}
That said, your code from the GitHub page is a little odd. For instance, you try to deal with open and close tags of different lengths with a wordier version of
let length = cmp::max(comment.len(), comment_end.len());
but your check
if window.contains(comment)
could then trigger multiple times!
Much better would be to just iterate over shrinking slices. In the mini example this would be
fn foo(line: &str, pattern: &str) -> bool {
let mut chars = line.chars();
loop {
let bar = chars.as_str();
if bar.starts_with(pattern) { return true }
if chars.next().is_none() { break }
}
false
}
(Note that this once again improves performance by another factor of ~1.5.)
and in a larger example this would be something like
let mut is_in_comments = 0u64;
let start = match line.find(comment) {
Some(start) => start,
None => return false,
};
let end = match line.rfind(comment_end) {
Some(end) => end,
None => return true,
};
let mut chars = line[start..end + comment_end.len()].chars();
loop {
let window = chars.as_str();
if window.starts_with(comment) {
if nested {
is_in_comments += 1;
} else {
is_in_comments = 1;
}
} else if window.starts_with(comment_end) {
is_in_comments = is_in_comments.saturating_sub(1);
}
if chars.next().is_none() { break }
}
Note that this still counts overlaps, so /*/ might count as an opening /* immediately followed by a closing */.
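The same "test the remaining slice with starts_with" idea can also report where the matches are. The following is a sketch under the assumption that byte offsets into the line are what is wanted; match_offsets is a hypothetical name. Like the code above, it counts overlapping matches.

```rust
/// Returns the byte offsets at which `pattern` occurs in `line`, overlaps included.
fn match_offsets(line: &str, pattern: &str) -> Vec<usize> {
    line.char_indices()
        .map(|(offset, _)| offset)
        // `offset` always sits on a char boundary, so slicing here cannot panic.
        .filter(|&offset| line[offset..].starts_with(pattern))
        .collect()
}
```

No String or Vec of chars is allocated per window; only the result vector is.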
"The method needs to be guaranteed to be safe against Unicode."
pattern.len() returns the number of bytes that the string requires, so it's already possible that your code is doing the wrong thing. I might suggest you check out tools like QuickCheck to produce arbitrary strings that include Unicode.
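A tiny sketch of the difference, with byte_and_char_len as a hypothetical helper name: for any string containing non-ASCII characters, the byte length is larger than the character count, so windows(patt.len()) over a Vec of chars gets the wrong window size.

```rust
/// Returns `(bytes, chars)` for a string, to show how the two counts diverge.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}
```

For ASCII-only input both numbers agree, which is why the bug is easy to miss in tests.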
Here's my test harness:
use std::iter;
fn main() {
let mut haystack: String = iter::repeat('a').take(1024*1024*100).collect();
haystack.push('b');
println!("{}", haystack.len());
}
And I'm compiling and timing via cargo build --release && time ./target/release/x. Creating the string by itself takes 0.274s.
I used this version of your original code just to have some kind of comparison:
fn foo(line: &str, pattern: &str) -> bool {
for window in line.chars().collect::<Vec<char>>().windows(pattern.len()) {
let mut bar = String::new();
for ch in window {
bar.push(*ch);
}
if bar == pattern { return true }
}
false
}
This takes 4.565s, or 4.291s for just foo.
The first thing I see is that there is a lot of allocation happening on the inner loop. The code creates, allocates, and destroys the String for each iteration. Let's reuse the String allocation:
fn foo_mem(line: &str, pattern: &str) -> bool {
let mut bar = String::new();
for window in line.chars().collect::<Vec<char>>().windows(pattern.len()) {
bar.clear();
bar.extend(window.iter().cloned());
if bar == pattern { return true }
}
false
}
This takes 2.155s or 1.881s for just foo_mem.
Continuing on, the remaining extraneous allocation is the String itself. We already have bytes that look like the right thing, so let's reuse them:
fn foo_no_string(line: &str, pattern: &str) -> bool {
let indices: Vec<_> = line.char_indices().map(|(i, _c)| i).collect();
let l = pattern.chars().count();
for window in indices.windows(l + 1) {
let first_idx = *window.first().unwrap();
let last_idx = *window.last().unwrap();
let bar = &line[first_idx..last_idx];
if bar == pattern { return true }
}
// Do the last pair
{
let last_idx = indices[indices.len() - l];
let bar = &line[last_idx..];
if bar == pattern { return true }
}
false
}
This code is ugly and unidiomatic. I'm pretty sure some thinking (that I'm currently too lazy to do) would make it look a lot better.
This takes 1.409s or 1.135s for just foo_no_string.
As this is ~25% of the original time, Amdahl's Law suggests this is a reasonable stopping point.

Lazy sequence generation in Rust

How can I create what other languages call a lazy sequence or a "generator" function?
In Python, I can use yield as in the following example (from Python's docs) to lazily generate a sequence that is iterable in a way that does not use the memory of an intermediary list:
# a generator that yields items instead of returning a list
def firstn(n):
    num = 0
    while num < n:
        yield num
        num += 1
sum_of_first_n = sum(firstn(1000000))
How can I do something similar in Rust?
Rust does have generators, but they are highly experimental and not currently available in stable Rust.
Works in stable Rust 1.0 and above
Range handles your concrete example. You can use it with the syntactical sugar of ..:
fn main() {
let sum: u64 = (0..1_000_000).sum();
println!("{}", sum)
}
What if Range didn't exist? We can create an iterator that models it!
struct MyRange {
start: u64,
end: u64,
}
impl MyRange {
fn new(start: u64, end: u64) -> MyRange {
MyRange {
start: start,
end: end,
}
}
}
impl Iterator for MyRange {
type Item = u64;
fn next(&mut self) -> Option<u64> {
if self.start == self.end {
None
} else {
let result = Some(self.start);
self.start += 1;
result
}
}
}
fn main() {
let sum: u64 = MyRange::new(0, 1_000_000).sum();
println!("{}", sum)
}
The guts are the same, but more explicit than the Python version. Notably, Python's generators keep track of the state for you. Rust prefers explicitness, so we have to create our own state and update it manually. The important part is the implementation of the Iterator trait. We specify that the iterator yields values of a specific type (type Item = u64) and then deal with stepping each iteration and how to tell we have reached the end of iteration.
This example is not as powerful as the real Range, which uses generics, but shows an example of how to go about it.
Works in nightly Rust
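Nightly Rust does have generators, but they are highly experimental. You need to bring in a few unstable features to create one. However, it looks pretty close to the Python example, with some Rust-specific additions. (The code below targets the nightly named in its first comment; newer nightlies have renamed the feature to coroutines, so the feature gates and trait names differ there.)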
// 1.43.0-nightly (2020-02-09 71c7e149e42cb0fc78a8)
#![feature(generators, generator_trait)]
use std::{
ops::{Generator, GeneratorState},
pin::Pin,
};
fn firstn(n: u64) -> impl Generator<Yield = u64, Return = ()> {
move || {
let mut num = 0;
while num < n {
yield num;
num += 1;
}
}
}
Since everything in current Rust operates on iterators, we create an adapter that converts a generator into an iterator in order to play with the broader ecosystem. I'd expect that such an adapter would be present in the standard library eventually:
struct GeneratorIteratorAdapter<G>(Pin<Box<G>>);
impl<G> GeneratorIteratorAdapter<G>
where
G: Generator<Return = ()>,
{
fn new(gen: G) -> Self {
Self(Box::pin(gen))
}
}
impl<G> Iterator for GeneratorIteratorAdapter<G>
where
G: Generator<Return = ()>,
{
type Item = G::Yield;
fn next(&mut self) -> Option<Self::Item> {
match self.0.as_mut().resume(()) {
GeneratorState::Yielded(x) => Some(x),
GeneratorState::Complete(_) => None,
}
}
}
Now we can use it:
fn main() {
let generator_iterator = GeneratorIteratorAdapter::new(firstn(1_000_000));
let sum: u64 = generator_iterator.sum();
println!("{}", sum);
}
What's interesting about this is that it's less powerful than an implementation of Iterator. For example, iterators have the size_hint method, which allows consumers of the iterator to have an idea of how many elements are remaining. This allows optimizations when collecting into a container. Generators do not have any such information.
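To illustrate the size_hint point, here is a sketch of a hand-written range iterator that reports how many elements remain. It reuses the MyRange name from above for readability, but the size_hint implementation is an addition for illustration, not part of the original example.

```rust
use std::convert::TryFrom;

struct MyRange {
    start: u64,
    end: u64,
}

impl Iterator for MyRange {
    type Item = u64;

    fn next(&mut self) -> Option<u64> {
        if self.start >= self.end {
            None
        } else {
            let result = self.start;
            self.start += 1;
            Some(result)
        }
    }

    // Consumers such as `collect` can use this to pre-allocate the right capacity.
    fn size_hint(&self) -> (usize, Option<usize>) {
        let remaining = self.end.saturating_sub(self.start);
        match usize::try_from(remaining) {
            Ok(n) => (n, Some(n)),
            // More elements than fit in usize: no upper bound can be stated.
            Err(_) => (usize::MAX, None),
        }
    }
}
```

An iterator built from a generator or closure has no way to provide this information, so it falls back to the default hint of (0, None).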
As of Rust 1.34 stable, you have the convenient std::iter::from_fn utility. It is not a coroutine (i.e. you still have to return each time), but at least it saves you from defining another struct.
from_fn accepts a closure FnMut() -> Option<T> and repeatedly calls it to create an Iterator<T>. In pseudo-Python, def from_fn(f): while (val := f()) is not None: yield val.
// impl Trait in return position works in every edition since Rust 1.26
fn firstn(n: u64) -> impl std::iter::Iterator<Item = u64> {
let mut num = 0;
std::iter::from_fn(move || {
let result;
if num < n {
result = Some(num);
num += 1
} else {
result = None
}
result
})
}
fn main() {
let sum_of_first_n = firstn(1000000).sum::<u64>();
println!("sum(0 to 999999): {}", sum_of_first_n);
}
std::iter::successors is also available. It is less general but might be a bit easier to use since you just pass around the seed value explicitly. In pseudo-Python: def successors(seed, f): while seed is not None: yield seed; seed = f(seed).
fn firstn(n: u64) -> impl std::iter::Iterator<Item = u64> {
    std::iter::successors(
        // Seed with 0 only if 0 < n, so that firstn(0) yields nothing.
        Some(0).filter(|&seed| seed < n),
        move |&num| {
            let next = num + 1;
            if next < n {
                Some(next)
            } else {
                None
            }
        },
    )
}
However, Shepmaster's note applies to these utilities too (tl;dr: hand-rolled Iterators are often more memory efficient):
"What's interesting about this is that it's less powerful than an implementation of Iterator. For example, iterators have the size_hint method, which allows consumers of the iterator to have an idea of how many elements are remaining. This allows optimizations when collecting into a container. Generators do not have any such information."
(Note: returning impl Trait requires Rust 1.26 or later and is not tied to the 2018 edition. See the Edition Guide, the release announcement, or Rust By Example for an explanation.)
Rust 1.0 does not have generator functions, so you'd have to do it manually with explicit iterators.
First, rewrite your Python example as a class with a next() method, since that is closer to the model you're likely to get in Rust. Then you can rewrite it in Rust with a struct that implements the Iterator trait.
You might also be able to use a function that returns a closure to achieve a similar result, but I don't think it would be possible to have that implement the Iterator trait (since it would require being called to generate a new result).
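On newer compilers the closure idea does work as an iterator: since Rust 1.34, std::iter::from_fn wraps any FnMut() -> Option<T> closure in a type that implements Iterator. A minimal sketch, where make_counter is a hypothetical name for the function that returns the closure:

```rust
/// Returns a closure that yields 0, 1, ..., n - 1 on successive calls and then `None`.
fn make_counter(n: u64) -> impl FnMut() -> Option<u64> {
    let mut num = 0;
    move || {
        if num < n {
            num += 1;
            Some(num - 1)
        } else {
            None
        }
    }
}

/// Adapts the closure into an iterator; `from_fn` calls it once per `next()`.
fn firstn(n: u64) -> impl Iterator<Item = u64> {
    std::iter::from_fn(make_counter(n))
}
```

The closure's captured num is the state that a Python generator would keep track of implicitly.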
You can use my stackful Rust generator library which supports stable Rust:
#[macro_use]
extern crate generator;
use generator::{Generator, Gn};
fn firstn(n: usize) -> Generator<'static, (), usize> {
Gn::new_scoped(move |mut s| {
let mut num = 0;
while num < n {
s.yield_(num);
num += 1;
}
done!();
})
}
fn main() {
let sum_of_first_n: usize = firstn(1000000).sum();
println!("sum ={}", sum_of_first_n);
}
or more simply:
let n = 100000;
let range = Gn::new_scoped(move |mut s| {
let mut num = 0;
while num < n {
s.yield_(num);
num += 1;
}
done!();
});
let sum: usize = range.sum();
