how to solve cyclic-dependency & Iterator lifetime problem? - rust

Rustaceans. when I start to write a BloomFilter example in rust. I found I have serveral problems have to solve. I struggle to solve them but no progress in a day. I need help, any suggestion will help me a lot, Thanks.
Problems
How to solve lifetime when pass a Iterator into another function?
// let bits = self.hash(value); // how to solve such lifetime error without use 'static storage?
// Below is a workaround code but need to computed in advanced.
let bits = Box::new(self.hash(value).collect::<Vec<u64>>().into_iter());
self.0.set(bits);
How to solve cyclic-dependency between struts without modify lower layer code, e.g: bloom_filter ?
// cyclic-dependency:
// RedisCache -> BloomFilter -> Storage
// | ^
// ------------<impl>------------
//
// v--- cache ownership has moved here
let filter = BloomFilter::by(Box::new(cache));
cache.1.replace(filter);
Since rust does not have null value, How can I solve the cyclic-dependency initialization without any stubs?
let mut cache = RedisCache(
Client::open("redis://localhost").unwrap(),
// I found can use Weak::new() to solve it,but need to downgrade a Rc reference.
// v-- need a BloomFilter stub to create RedisCache
RefCell::new(BloomFilter::new()),
);
Code
#![allow(unused)]
mod bloom_filter {
use std::{hash::Hash, marker::PhantomData};
pub type BitsIter = Box<dyn Iterator<Item = u64>>;
pub trait Storage {
fn set(&mut self, bits: BitsIter);
fn contains_all(&self, bits: BitsIter) -> bool;
}
pub struct BloomFilter<T: Hash>(Box<dyn Storage>, PhantomData<T>);
impl<T: Hash> BloomFilter<T> {
pub fn new() -> BloomFilter<T> {
return Self::by(Box::new(ArrayStorage([0; 5000])));
struct ArrayStorage<const N: usize>([u8; N]);
impl<const N: usize> Storage for ArrayStorage<N> {
fn set(&mut self, bits: BitsIter) {
let size = self.0.len() as u64;
bits.map(|bit| (bit % size) as usize)
.for_each(|index| self.0[index] = 1);
}
fn contains_all(&self, bits: BitsIter) -> bool {
let size = self.0.len() as u64;
bits.map(|bit| (bit % size) as usize)
.all(|index| self.0[index] == 1)
}
}
}
pub fn by(storage: Box<dyn Storage>) -> BloomFilter<T> {
BloomFilter(storage, PhantomData)
}
pub fn add(&mut self, value: T) {
// let bits = self.hash(value); // how to solve such lifetime error?
let bits = Box::new(self.hash(value).collect::<Vec<u64>>().into_iter());
self.0.set(bits);
}
pub fn contains(&self, value: T) -> bool {
// lifetime problem same as Self::add(T)
let bits = Box::new(self.hash(value).collect::<Vec<u64>>().into_iter());
self.0.contains_all(bits)
}
fn hash<'a, H: Hash + 'a>(&self, _value: H) -> Box<dyn Iterator<Item = u64> + 'a> {
todo!()
}
}
}
mod spi {
use super::bloom_filter::*;
use redis::{Client, Commands, RedisResult};
use std::{
cell::RefCell,
rc::{Rc, Weak},
};
pub struct RedisCache<'a>(Client, RefCell<BloomFilter<&'a str>>);
impl<'a> RedisCache<'a> {
pub fn new() -> RedisCache<'a> {
let mut cache = RedisCache(
Client::open("redis://localhost").unwrap(),
// v-- need a BloomFilter stub to create RedisCache
RefCell::new(BloomFilter::new()),
);
// v--- cache ownership has moved here
let filter = BloomFilter::by(Box::new(cache));
cache.1.replace(filter);
return cache;
}
pub fn get(&mut self, key: &str, load_value: fn() -> Option<String>) -> Option<String> {
let filter = self.1.borrow();
if filter.contains(key) {
if let Ok(value) = self.0.get::<&str, String>(key) {
return Some(value);
}
if let Some(actual_value) = load_value() {
let _: () = self.0.set(key, &actual_value).unwrap();
return Some(actual_value);
}
}
return None;
}
}
impl<'a> Storage for RedisCache<'a> {
fn set(&mut self, bits: BitsIter) {
todo!()
}
fn contains_all(&self, bits: BitsIter) -> bool {
todo!()
}
}
}
Updated
First, thanks #Colonel Thirty Two give me a lot of information that I haven't mastered and help me fixed the problem of the iterator lifetime.
The cyclic-dependency I have solved by break the responsibility of the Storage into another struct RedisStorage without modify the bloom_filter module, but make the example bloated. Below is their relationships:
RedisCache -> BloomFilter -> Storage <---------------
| |
|-------> redis::Client <- RedisStorage ---<impl>---
I realized the ownership & lifetime system is not only used by borrow checker, but also Rustaceans need a bigger front design to obey the rules than in a GC language, e.g: java. Am I right?
Final Code
mod bloom_filter {
use std::{
hash::{Hash, Hasher},
marker::PhantomData,
};
pub type BitsIter<'a> = Box<dyn Iterator<Item = u64> + 'a>;
pub trait Storage {
fn set(&mut self, bits: BitsIter);
fn contains_all(&self, bits: BitsIter) -> bool;
}
pub struct BloomFilter<T: Hash>(Box<dyn Storage>, PhantomData<T>);
impl<T: Hash> BloomFilter<T> {
#[allow(unused)]
pub fn new() -> BloomFilter<T> {
return Self::by(Box::new(ArrayStorage([0; 5000])));
struct ArrayStorage<const N: usize>([u8; N]);
impl<const N: usize> Storage for ArrayStorage<N> {
fn set(&mut self, bits: BitsIter) {
let size = self.0.len() as u64;
bits.map(|bit| (bit % size) as usize)
.for_each(|index| self.0[index] = 1);
}
fn contains_all(&self, bits: BitsIter) -> bool {
let size = self.0.len() as u64;
bits.map(|bit| (bit % size) as usize)
.all(|index| self.0[index] == 1)
}
}
}
pub fn by(storage: Box<dyn Storage>) -> BloomFilter<T> {
BloomFilter(storage, PhantomData)
}
pub fn add(&mut self, value: T) {
self.0.set(self.hash(value));
}
pub fn contains(&self, value: T) -> bool {
self.0.contains_all(self.hash(value))
}
fn hash<'a, H: Hash + 'a>(&self, value: H) -> BitsIter<'a> {
Box::new(
[3, 11, 31, 71, 131]
.into_iter()
.map(|salt| SimpleHasher(0, salt))
.map(move |mut hasher| hasher.hash(&value)),
)
}
}
struct SimpleHasher(u64, u64);
impl SimpleHasher {
fn hash<H: Hash>(&mut self, value: &H) -> u64 {
value.hash(self);
self.finish()
}
}
impl Hasher for SimpleHasher {
fn finish(&self) -> u64 {
self.0
}
fn write(&mut self, bytes: &[u8]) {
self.0 += bytes.iter().fold(0u64, |acc, k| acc * self.1 + *k as u64)
}
}
}
mod spi {
use super::bloom_filter::*;
use redis::{Client, Commands};
use std::{cell::RefCell, rc::Rc};
pub struct RedisCache<'a>(Rc<RefCell<Client>>, BloomFilter<&'a str>);
impl<'a> RedisCache<'a> {
pub fn new(client: Rc<RefCell<Client>>, filter: BloomFilter<&'a str>) -> RedisCache<'a> {
RedisCache(client, filter)
}
pub fn get<'f>(
&mut self,
key: &str,
load_value: fn() -> Option<&'f str>,
) -> Option<String> {
if self.1.contains(key) {
let mut redis = self.0.as_ref().borrow_mut();
if let Ok(value) = redis.get::<&str, String>(key) {
return Some(value);
}
if let Some(actual_value) = load_value() {
let _: () = redis.set(key, &actual_value).unwrap();
return Some(actual_value.into());
}
}
return None;
}
}
struct RedisStorage(Rc<RefCell<Client>>);
const BLOOM_FILTER_KEY: &str = "bloom_filter";
impl Storage for RedisStorage {
fn set(&mut self, bits: BitsIter) {
bits.for_each(|slot| {
let _: bool = self
.0
.as_ref()
.borrow_mut()
.setbit(BLOOM_FILTER_KEY, slot as usize, true)
.unwrap();
})
}
fn contains_all(&self, mut bits: BitsIter) -> bool {
bits.all(|slot| {
self.0
.as_ref()
.borrow_mut()
.getbit(BLOOM_FILTER_KEY, slot as usize)
.unwrap()
})
}
}
#[test]
fn prevent_cache_penetration_by_bloom_filter() {
let client = Rc::new(RefCell::new(Client::open("redis://localhost").unwrap()));
redis::cmd("FLUSHDB").execute(&mut *client.as_ref().borrow_mut());
let mut filter: BloomFilter<&str> = BloomFilter::by(Box::new(RedisStorage(client.clone())));
assert!(!filter.contains("Rust"));
filter.add("Rust");
assert!(filter.contains("Rust"));
let mut cache = RedisCache::new(client, filter);
assert_eq!(
cache.get("Rust", || Some("System Language")),
Some("System Language".to_string())
);
assert_eq!(
cache.get("Rust", || panic!("must never be called after cached")),
Some("System Language".to_string())
);
assert_eq!(
cache.get("Go", || panic!("reject to loading `Go` from external storage")),
None
);
}
}

pub type BitsIter = Box<dyn Iterator<Item = u64>>;
In this case, the object in the box must be valid for the 'static lifetime. This isn't the case for the iterator returned by hash - its limited to the lifetime of self.
Try replacing with:
pub type BitsIter<'a> = Box<dyn Iterator<Item = u64> + 'a>;
Or using generics instead of boxed trait objects.
So your RedisClient needs a BloomFilter, but the BloomFilter also needs the RedisClient?
Your BloomFilter should not use the RedisCache that itself uses the BloomFilter - that's a recipe for infinitely recursing calls (how do you know what calls to RedisCache::add should update the bloom filter and which calls are from the bloom filter?).
If you really have to, you need some form of shared ownership, like Rc or Arc. Your BloomFilter will also need to use a weak reference, or else the two objects will refer to each other and will never free.

Related

What pattern to utilize to use a Vec of differing nested generic types/trait objects?

I'm trying to implement a pattern where different Processors can dictate the input type they take and produce a unified output (currently a fixed type, but I'd like to get it generic once this current implementation is working).
Below is a minimal example:
use std::convert::From;
use processor::NoOpProcessor;
use self::{
input::{Input, InputStore},
output::UnifiedOutput,
processor::{MultiplierProcessor, Processor, StringProcessor},
};
mod input {
use std::collections::HashMap;
#[derive(Debug)]
pub struct Input<T>(pub T);
#[derive(Default)]
pub struct InputStore(HashMap<String, String>);
impl InputStore {
pub fn insert<K, V>(mut self, key: K, value: V) -> Self
where
K: ToString,
V: ToString,
{
let key = key.to_string();
let value = value.to_string();
self.0.insert(key, value);
self
}
pub fn get<K, V>(&self, key: K) -> Option<Input<V>>
where
K: ToString,
for<'a> &'a String: Into<V>,
{
let key = key.to_string();
self.0.get(&key).map(|value| Input(value.into()))
}
}
}
mod processor {
use super::{input::Input, output::UnifiedOutput};
use super::I32Input;
pub struct NoOpProcessor;
pub trait Processor {
type I;
fn process(&self, input: &Input<Self::I>) -> UnifiedOutput;
}
impl Processor for NoOpProcessor {
type I = I32Input;
fn process(&self, input: &Input<Self::I>) -> UnifiedOutput {
UnifiedOutput(input.0 .0)
}
}
pub struct MultiplierProcessor(pub i32);
impl Processor for MultiplierProcessor {
type I = I32Input;
fn process(&self, input: &Input<Self::I>) -> UnifiedOutput {
UnifiedOutput(input.0 .0 * self.0)
}
}
pub struct StringProcessor;
impl Processor for StringProcessor {
type I = String;
fn process(&self, input: &Input<Self::I>) -> UnifiedOutput {
UnifiedOutput(input.0.parse().unwrap())
}
}
}
mod output {
#[derive(Debug)]
pub struct UnifiedOutput(pub i32);
}
pub fn main() {
let input_store = InputStore::default()
.insert("input_a", 123)
.insert("input_b", 567)
.insert("input_c", "789");
let processors = {
let mut labelled_processors = Vec::new();
// let mut labelled_processors: Vec<LabelledProcessor<Input<>>> = Vec::new(); // What's the correct type?
labelled_processors.push(LabelledProcessor("input_a", Box::new(NoOpProcessor)));
labelled_processors.push(LabelledProcessor(
"input_b",
Box::new(MultiplierProcessor(3)),
));
// labelled_processors.push(LabelledProcessor("input_c", Box::new(StringProcessor)));
labelled_processors
};
for processor in processors {
let output = retrieve_input_and_process(&input_store, processor);
println!("{:?}", output);
}
}
#[derive(Debug)]
pub struct I32Input(pub i32);
impl From<&String> for I32Input {
fn from(s: &String) -> Self {
Self(s.parse().unwrap())
}
}
struct LabelledProcessor<I>(&'static str, Box<dyn Processor<I = I>>)
where
for<'a> &'a String: Into<I>;
fn retrieve_input_and_process<T>(
store: &InputStore,
processor: LabelledProcessor<T>,
) -> UnifiedOutput
where
for<'a> &'a String: Into<T>,
{
let input = store.get(processor.0).unwrap();
processor.1.process(&input)
}
When // labelled_processors.push(LabelledProcessor("input_c", Box::new(StringProcessor))); is uncommented, I get the below compilation error:
error[E0271]: type mismatch resolving `<attempt2::processor::StringProcessor as attempt2::processor::Processor>::I == attempt2::I32Input`
--> src/attempt2.rs:101:63
|
101 | labelled_processors.push(LabelledProcessor("input_c", Box::new(StringProcessor)));
| ^^^^^^^^^^^^^^^^^^^^^^^^^ type mismatch resolving `<attempt2::processor::StringProcessor as attempt2::processor::Processor>::I == attempt2::I32Input`
|
note: expected this to be `attempt2::I32Input`
--> src/attempt2.rs:75:18
|
75 | type I = String;
| ^^^^^^
= note: required for the cast from `attempt2::processor::StringProcessor` to the object type `dyn attempt2::processor::Processor<I = attempt2::I32Input>`
I think I've learnt enough to "get" what the issue is - the labelled_processors vec expects all its items to have the same type. My problem is I'm unsure how to rectify this. I've tried to leverage dynamic dispatch more (for example changing LabelledProcessor to struct LabelledProcessor(&'static str, Box<dyn Processor<dyn Input>>);). However these changes spiral to their own issues with the type system too.
Other answers I've found online generally don't address this level of complexity with respect to the nested generics/traits - stopping at 1 level with the answer being let vec_x: Vec<Box<dyn SomeTrait>> .... This makes me wonder if there's an obvious answer that can be reached that I've just missed or if there's a whole different pattern I should be employing instead to achieve this goal?
I'm aware of potentially utilizing enums as wel, but that would mean all usecases would need to be captured within this module and it may not be able to define inputs/outputs/processors in external modules.
A bit lost at this point.
--- EDIT ---
Some extra points:
This is just an example, so things like InputStore basically converting everything to String is just an implementation detail. It's mainly to symbolize the concept of "the type needs to comply with some trait to be accepted", I just chose String for simplicity.
One possible solution would be to make retrieve_input_and_process a method of LabelledProcessor, and then hide the type behind a trait:
use std::convert::From;
use processor::NoOpProcessor;
use self::{
input::InputStore,
output::UnifiedOutput,
processor::{MultiplierProcessor, Processor, StringProcessor},
};
mod input {
use std::collections::HashMap;
#[derive(Debug)]
pub struct Input<T>(pub T);
#[derive(Default)]
pub struct InputStore(HashMap<String, String>);
impl InputStore {
pub fn insert<K, V>(mut self, key: K, value: V) -> Self
where
K: ToString,
V: ToString,
{
let key = key.to_string();
let value = value.to_string();
self.0.insert(key, value);
self
}
pub fn get<K, V>(&self, key: K) -> Option<Input<V>>
where
K: ToString,
for<'a> &'a str: Into<V>,
{
let key = key.to_string();
self.0.get(&key).map(|value| Input(value.as_str().into()))
}
}
}
mod processor {
use super::{input::Input, output::UnifiedOutput};
use super::I32Input;
pub struct NoOpProcessor;
pub trait Processor {
type I;
fn process(&self, input: &Input<Self::I>) -> UnifiedOutput;
}
impl Processor for NoOpProcessor {
type I = I32Input;
fn process(&self, input: &Input<Self::I>) -> UnifiedOutput {
UnifiedOutput(input.0 .0)
}
}
pub struct MultiplierProcessor(pub i32);
impl Processor for MultiplierProcessor {
type I = I32Input;
fn process(&self, input: &Input<Self::I>) -> UnifiedOutput {
UnifiedOutput(input.0 .0 * self.0)
}
}
pub struct StringProcessor;
impl Processor for StringProcessor {
type I = String;
fn process(&self, input: &Input<Self::I>) -> UnifiedOutput {
UnifiedOutput(input.0.parse().unwrap())
}
}
}
mod output {
#[derive(Debug)]
pub struct UnifiedOutput(pub i32);
}
pub fn main() {
let input_store = InputStore::default()
.insert("input_a", 123)
.insert("input_b", 567)
.insert("input_c", "789");
let processors = {
let mut labelled_processors: Vec<Box<dyn LabelledProcessorRef>> = Vec::new();
labelled_processors.push(Box::new(LabelledProcessor(
"input_a",
Box::new(NoOpProcessor),
)));
labelled_processors.push(Box::new(LabelledProcessor(
"input_b",
Box::new(MultiplierProcessor(3)),
)));
labelled_processors.push(Box::new(LabelledProcessor(
"input_c",
Box::new(StringProcessor),
)));
labelled_processors
};
for processor in processors {
let output = processor.retrieve_input_and_process(&input_store);
println!("{:?}", output);
}
}
#[derive(Debug)]
pub struct I32Input(pub i32);
impl From<&str> for I32Input {
fn from(s: &str) -> Self {
Self(s.parse().unwrap())
}
}
struct LabelledProcessor<I>(&'static str, Box<dyn Processor<I = I>>);
impl<I> LabelledProcessorRef for LabelledProcessor<I>
where
for<'a> &'a str: Into<I>,
{
fn retrieve_input_and_process(&self, store: &InputStore) -> UnifiedOutput {
let input = store.get(self.0).unwrap();
self.1.process(&input)
}
}
trait LabelledProcessorRef {
fn retrieve_input_and_process(&self, store: &InputStore) -> UnifiedOutput;
}
UnifiedOutput(123)
UnifiedOutput(1701)
UnifiedOutput(789)

Rust mutably borrowed lifetime issue

This is en excerpt from my actual code, but the issue is the same. I cannot borrow the staging_buffer in Buffer::from_data as mutable, and I cannot figure out why this is the case. Why isn't the mutable borrow of the Memory<'a> from the staging_Buffer released when the block is finished?
Note that the u32 reference in the Device struct is just a dummy, and in practice is a non-copyable type.
pub struct Device<'a> {
id: &'a u32,
}
impl<'a> Device<'a> {
pub fn new(id: &'a u32) -> Self {
Self { id }
}
}
pub struct Memory<'a> {
device: &'a Device<'a>,
}
impl<'a> Memory<'a> {
pub fn new(device: &'a Device) -> Self {
Self { device }
}
pub fn map(&mut self) {}
pub fn unmap(&mut self) {}
pub fn copy_from_host<T>(&self, _slice: &[T]) {}
}
impl<'a> Drop for Memory<'a> {
fn drop(&mut self) {}
}
pub struct Buffer<'a> {
device: &'a Device<'a>,
memory: Memory<'a>,
}
impl<'a> Buffer<'a> {
pub fn new(device: &'a Device) -> Self {
let buffer_memory = Memory::new(device);
Self {
device,
memory: buffer_memory,
}
}
pub fn from_data<T: Copy>(device: &'a Device, _data: &[T]) -> Self {
let mut staging_buffer = Buffer::new(device);
{
let staging_buffer_memory = staging_buffer.memory_mut();
staging_buffer_memory.map();
staging_buffer_memory.unmap();
}
let buffer = Self::new(device);
staging_buffer.copy_to_buffer(&buffer);
buffer
}
pub fn memory(&self) -> &Memory {
&self.memory
}
pub fn memory_mut(&'a mut self) -> &mut Memory {
&mut self.memory
}
fn copy_to_buffer(&self, _dst_buffer: &Buffer) {
//
}
}
impl<'a> Drop for Buffer<'a> {
fn drop(&mut self) {}
}
fn main() {
let id = 5;
let device = Device::new(&id);
let _buffer = Buffer::new(&device);
}
Okay, I finally figured it out!
The error here is actually in your implementation of memory_mut, and not in the rest of the code (although I agree that the lifetimes and references are making things harder than they need to be).
One way to see that this is the issue is to just mutably borrow memory directly, like so.
...
pub fn from_data<T: Copy>(device: &'a Device, _data: &[T]) -> Self {
let mut staging_buffer = Buffer::new(device);
{
// Don't use the stagin_buffer_memory variable
// let staging_buffer_memory = staging_buffer.memory_mut();
staging_buffer.memory.map();
staging_buffer.memory.unmap();
}
let buffer = Self::new(device);
staging_buffer.copy_to_buffer(&buffer);
buffer
}
...
but instead you can fix memory_mut by rewriting it like this:
...
pub fn memory_mut(&mut self) -> &mut Memory<'a> {
&mut self.memory
}
...
In this case the implementation of from_data works as it should.
I am not very confident with this, but I think it is due to the fact that the same lifetime annotation 'a is used everywhere.
Then, when lifetimes are considered by the borrow-checker, some are supposed to be equivalent (because of the same annotation) although they are not.
A lifetime is indeed necessary for the reference id in Device.
Another is needed for the reference device in Memory but this is not necessarily the same as the previous.
Then, I propagated these distinct lifetimes all over the code.
A Rust expert could probably explain this better than me.
pub struct Device<'i> {
id: &'i u32,
}
impl<'i> Device<'i> {
pub fn new(id: &'i u32) -> Self {
Self { id }
}
}
pub struct Memory<'d, 'i> {
device: &'d Device<'i>,
}
impl<'d, 'i> Memory<'d, 'i> {
pub fn new(device: &'d Device<'i>) -> Self {
Self { device }
}
pub fn map(&mut self) {}
pub fn unmap(&mut self) {}
pub fn copy_from_host<T>(
&self,
_slice: &[T],
) {
}
}
impl<'d, 'i> Drop for Memory<'d, 'i> {
fn drop(&mut self) {}
}
pub struct Buffer<'d, 'i> {
device: &'d Device<'i>,
memory: Memory<'d, 'i>,
}
impl<'d, 'i> Buffer<'d, 'i> {
pub fn new(device: &'d Device<'i>) -> Self {
let buffer_memory = Memory::new(device);
Self {
device,
memory: buffer_memory,
}
}
pub fn from_data<T: Copy>(
device: &'d Device<'i>,
_data: &[T],
) -> Self {
let mut staging_buffer = Buffer::new(device);
{
let staging_buffer_memory = staging_buffer.memory_mut();
staging_buffer_memory.map();
staging_buffer_memory.unmap();
}
let buffer = Self::new(device);
staging_buffer.copy_to_buffer(&buffer);
buffer
}
pub fn memory(&self) -> &Memory<'d, 'i> {
&self.memory
}
pub fn memory_mut(&mut self) -> &mut Memory<'d, 'i> {
&mut self.memory
}
fn copy_to_buffer(
&self,
_dst_buffer: &Buffer,
) {
//
}
}
impl<'d, 'i> Drop for Buffer<'d, 'i> {
fn drop(&mut self) {}
}
fn main() {
let id = 5;
let device = Device::new(&id);
let _buffer = Buffer::new(&device);
}

Using Rayon into_par_iter().sum() with custom struct

I'm trying to get use Rayon::prelude::into_par_iter to sum up a bunch of instances of a struct. I've implemented std::iter::Sum for the struct, but am still running into an error. Here is my example code;
use std::iter::Sum;
use rayon::prelude::*;
pub struct TestStruct {
val: f32,
}
impl TestStruct {
pub fn new(v: f32) -> Self {
Self { val: v }
}
}
impl<'a> Sum<&'a Self> for TestStruct {
fn sum<I>(iter: I) -> Self
where
I: Iterator<Item = &'a Self>,
{
iter.fold(Self { val: 0. }, |a, b| Self {
val: a.val + b.val,
})
}
}
fn main() {
let default_val: f32 = 1.1;
let sum: TestStruct = (0..5).into_par_iter().map(|&default_val| {
TestStruct::new(default_val)
})
.sum();
println!("{}", sum.val);
}
and I am told the trait 'std::iter::Sum' is not implemented for 'TestStruct', but I think I am implementing it. Am I missing something here?
Two small compounded issues. One is that the std::iter::Sum is defined as:
pub trait Sum<A = Self> {
fn sum<I>(iter: I) -> Self
where
I: Iterator<Item = A>;
}
The Self here refers to whatever the struct gets the implementation and the default generic parameter has already been filled in. You don't need to specify it any more if implementing Sum for struct itself.
The other is that the map closure parameter default_val, which is an integer, shadows the previous float one with the same name. Since TestStruct::new expects a f32, it leads to a type error.
This works:
impl Sum for TestStruct {
fn sum<I>(iter: I) -> Self
where
I: Iterator<Item = Self>,
{
...
}
}
fn main() {
let default_val: f32 = 1.1;
let sum: TestStruct = (0..5)
.into_par_iter()
.map(|_| TestStruct::new(default_val))
.sum();
println!("{}", sum.val);
}

Is there an owned version of String::chars?

The following code does not compile:
use std::str::Chars;
struct Chunks {
remaining: Chars,
}
impl Chunks {
fn new(s: String) -> Self {
Chunks {
remaining: s.chars(),
}
}
}
The error is:
error[E0106]: missing lifetime specifier
--> src/main.rs:4:16
|
4 | remaining: Chars,
| ^^^^^ expected lifetime parameter
Chars doesn't own the characters it iterates over and it can't outlive the &str or String it was created from.
Is there an owned version of Chars that does not need a lifetime parameter or do I have to keep a Vec<char> and an index myself?
std::vec::IntoIter is an owned version of every iterator, in a sense.
use std::vec::IntoIter;
struct Chunks {
remaining: IntoIter<char>,
}
impl Chunks {
fn new(s: String) -> Self {
Chunks {
remaining: s.chars().collect::<Vec<_>>().into_iter(),
}
}
}
Playground link
Downside is additional allocation and a space overhead, but I am not aware of the iterator for your specific case.
Ouroboros
You can use the ouroboros crate to create a self-referential struct containing the String and a Chars iterator:
use ouroboros::self_referencing; // 0.4.1
use std::str::Chars;
#[self_referencing]
pub struct IntoChars {
string: String,
#[borrows(string)]
chars: Chars<'this>,
}
// All these implementations are based on what `Chars` implements itself
impl Iterator for IntoChars {
type Item = char;
#[inline]
fn next(&mut self) -> Option<Self::Item> {
self.with_mut(|me| me.chars.next())
}
#[inline]
fn count(mut self) -> usize {
self.with_mut(|me| me.chars.count())
}
#[inline]
fn size_hint(&self) -> (usize, Option<usize>) {
self.with(|me| me.chars.size_hint())
}
#[inline]
fn last(mut self) -> Option<Self::Item> {
self.with_mut(|me| me.chars.last())
}
}
impl DoubleEndedIterator for IntoChars {
#[inline]
fn next_back(&mut self) -> Option<Self::Item> {
self.with_mut(|me| me.chars.next_back())
}
}
impl std::iter::FusedIterator for IntoChars {}
// And an extension trait for convenience
trait IntoCharsExt {
fn into_chars(self) -> IntoChars;
}
impl IntoCharsExt for String {
fn into_chars(self) -> IntoChars {
IntoCharsBuilder {
string: self,
chars_builder: |s| s.chars(),
}
.build()
}
}
See also:
How can I store a Chars iterator in the same struct as the String it is iterating on?
Rental
You can use the rental crate to create a self-referential struct containing the String and a Chars iterator:
#[macro_use]
extern crate rental;
rental! {
mod into_chars {
pub use std::str::Chars;
#[rental]
pub struct IntoChars {
string: String,
chars: Chars<'string>,
}
}
}
use into_chars::IntoChars;
// All these implementations are based on what `Chars` implements itself
impl Iterator for IntoChars {
type Item = char;
#[inline]
fn next(&mut self) -> Option<Self::Item> {
self.rent_mut(|chars| chars.next())
}
#[inline]
fn count(mut self) -> usize {
self.rent_mut(|chars| chars.count())
}
#[inline]
fn size_hint(&self) -> (usize, Option<usize>) {
self.rent(|chars| chars.size_hint())
}
#[inline]
fn last(mut self) -> Option<Self::Item> {
self.rent_mut(|chars| chars.last())
}
}
impl DoubleEndedIterator for IntoChars {
#[inline]
fn next_back(&mut self) -> Option<Self::Item> {
self.rent_mut(|chars| chars.next_back())
}
}
impl std::iter::FusedIterator for IntoChars {}
// And an extension trait for convenience
trait IntoCharsExt {
fn into_chars(self) -> IntoChars;
}
impl IntoCharsExt for String {
fn into_chars(self) -> IntoChars {
IntoChars::new(self, |s| s.chars())
}
}
See also:
How can I store a Chars iterator in the same struct as the String it is iterating on?
There's also the owned-chars crate, which
provides an extension trait for String with two methods, into_chars and into_char_indices. These methods parallel String::chars and String::char_indices, but the iterators they create consume the String instead of borrowing it.
You could implement your own iterator, or wrap Chars like this (with just one small unsafe block):
// deriving Clone would be buggy. With Rc<>/Arc<> instead of Box<> it would work though.
struct OwnedChars {
// struct fields are dropped in order they are declared,
// see https://stackoverflow.com/a/41056727/1478356
// with `Chars` it probably doesn't matter, but for good style `inner`
// should be dropped before `storage`.
// 'static lifetime must not "escape" lifetime of the struct
inner: ::std::str::Chars<'static>,
// we need to box anyway to be sure the inner reference doesn't move when
// moving the storage, so we can erase the type as well.
// struct OwnedChar<S: AsRef<str>> { ..., storage: Box<S> } should work too
storage: Box<AsRef<str>>,
}
impl OwnedChars {
pub fn new<S: AsRef<str>+'static>(s: S) -> Self {
let storage = Box::new(s) as Box<AsRef<str>>;
let raw_ptr : *const str = storage.as_ref().as_ref();
let ptr : &'static str = unsafe { &*raw_ptr };
OwnedChars{
storage: storage,
inner: ptr.chars(),
}
}
pub fn as_str(&self) -> &str {
self.inner.as_str()
}
}
impl Iterator for OwnedChars {
// just `char` of course
type Item = <::std::str::Chars<'static> as Iterator>::Item;
fn next(&mut self) -> Option<Self::Item> {
self.inner.next()
}
}
impl DoubleEndedIterator for OwnedChars {
fn next_back(&mut self) -> Option<Self::Item> {
self.inner.next_back()
}
}
impl Clone for OwnedChars {
fn clone(&self) -> Self {
// need a new allocation anyway, so simply go for String, and just
// clone the remaining string
OwnedChars::new(String::from(self.inner.as_str()))
}
}
impl ::std::fmt::Debug for OwnedChars {
fn fmt(&self, f: &mut ::std::fmt::Formatter) -> ::std::fmt::Result {
let storage : &str = self.storage.as_ref().as_ref();
f.debug_struct("OwnedChars")
.field("storage", &storage)
.field("inner", &self.inner)
.finish()
}
}
// easy access
trait StringExt {
fn owned_chars(self) -> OwnedChars;
}
impl<S: AsRef<str>+'static> StringExt for S {
fn owned_chars(self) -> OwnedChars {
OwnedChars::new(self)
}
}
See playground
As copied from How can I store a Chars iterator in the same struct as the String it is iterating on?:
use std::mem;
use std::str::Chars;
/// I believe this struct to be safe because the String is
/// heap-allocated (stable address) and will never be modified
/// (stable address). `chars` will not outlive the struct, so
/// lying about the lifetime should be fine.
///
/// TODO: What about during destruction?
/// `Chars` shouldn't have a destructor...
struct OwningChars {
_s: String,
chars: Chars<'static>,
}
impl OwningChars {
fn new(s: String) -> Self {
let chars = unsafe { mem::transmute(s.chars()) };
OwningChars { _s: s, chars }
}
}
impl Iterator for OwningChars {
type Item = char;
fn next(&mut self) -> Option<Self::Item> {
self.chars.next()
}
}
Here is a solution without unsafe.
It provides the same effect as s.chars().collect::<Vec<_>>().into_iter(), but without the allocation overhead.
struct OwnedChars {
s: String,
index: usize,
}
impl OwnedChars {
pub fn new(s: String) -> Self {
Self { s, index: 0 }
}
}
impl Iterator for OwnedChars {
type Item = char;
fn next(&mut self) -> Option<Self::Item> {
// Slice of leftover characters
let slice = &self.s[self.index..];
// Iterator over leftover characters
let mut chars = slice.chars();
// Query the next char
let next_char = chars.next()?;
// Compute the new index by looking at how many bytes are left
// after querying the next char
self.index = self.s.len() - chars.as_str().len();
// Return next char
Some(next_char)
}
}
Together with a little bit of trait magic:
trait StringExt {
fn into_chars(self) -> OwnedChars;
}
impl StringExt for String {
fn into_chars(self) -> OwnedChars {
OwnedChars::new(self)
}
}
You can do:
struct Chunks {
remaining: OwnedChars,
}
impl Chunks {
fn new(s: String) -> Self {
Chunks {
remaining: s.into_chars(),
}
}
}

How would I create a handle manager in Rust?

pub struct Storage<T>{
vec: Vec<T>
}
impl<T: Clone> Storage<T>{
pub fn new() -> Storage<T>{
Storage{vec: Vec::new()}
}
pub fn get<'r>(&'r self, h: &Handle<T>)-> &'r T{
let index = h.id;
&self.vec[index]
}
pub fn set(&mut self, h: &Handle<T>, t: T){
let index = h.id;
self.vec[index] = t;
}
pub fn create(&mut self, t: T) -> Handle<T>{
self.vec.push(t);
Handle{id: self.vec.len()-1}
}
}
struct Handle<T>{
id: uint
}
I am currently trying to create a handle system in Rust and I have some problems. The code above is a simple example of what I want to achieve.
The code works but has one weakness.
let mut s1 = Storage<uint>::new();
let mut s2 = Storage<uint>::new();
let handle1 = s1.create(5);
s1.get(handle1); // works
s2.get(handle1); // unsafe
I would like to associate a handle with a specific storage like this
//Pseudo code
struct Handle<T>{
id: uint,
storage: &Storage<T>
}
impl<T> Handle<T>{
pub fn get(&self) -> &T;
}
The problem is that Rust doesn't allow this. If I would do that and create a handle with the reference of a Storage I wouldn't be allowed to mutate the Storage anymore.
I could implement something similar with a channel but then I would have to clone T every time.
How would I express this in Rust?
The simplest way to model this is to use a phantom type parameter on Storage which acts as a unique ID, like so:
use std::kinds::marker;
pub struct Storage<Id, T> {
marker: marker::InvariantType<Id>,
vec: Vec<T>
}
impl<Id, T> Storage<Id, T> {
pub fn new() -> Storage<Id, T>{
Storage {
marker: marker::InvariantType,
vec: Vec::new()
}
}
pub fn get<'r>(&'r self, h: &Handle<Id, T>) -> &'r T {
let index = h.id;
&self.vec[index]
}
pub fn set(&mut self, h: &Handle<Id, T>, t: T) {
let index = h.id;
self.vec[index] = t;
}
pub fn create(&mut self, t: T) -> Handle<Id, T> {
self.vec.push(t);
Handle {
marker: marker::InvariantLifetime,
id: self.vec.len() - 1
}
}
}
pub struct Handle<Id, T> {
id: uint,
marker: marker::InvariantType<Id>
}
fn main() {
struct A; struct B;
let mut s1 = Storage::<A, uint>::new();
let s2 = Storage::<B, uint>::new();
let handle1 = s1.create(5);
s1.get(&handle1);
s2.get(&handle1); // won't compile, since A != B
}
This solves your problem in the simplest case, but has some downsides. Mainly, it depends on the use to define and use all of these different phantom types and to prove that they are unique. It doesn't prevent bad behavior on the user's part where they can use the same phantom type for multiple Storage instances. In today's Rust, however, this is the best we can do.
An alternative solution that doesn't work today for reasons I'll get in to later, but might work later, uses lifetimes as anonymous id types. This code uses the InvariantLifetime marker, which removes all sub typing relationships with other lifetimes for the lifetime it uses.
Here is the same system, rewritten to use InvariantLifetime instead of InvariantType:
use std::kinds::marker;
pub struct Storage<'id, T> {
marker: marker::InvariantLifetime<'id>,
vec: Vec<T>
}
impl<'id, T> Storage<'id, T> {
pub fn new() -> Storage<'id, T>{
Storage {
marker: marker::InvariantLifetime,
vec: Vec::new()
}
}
pub fn get<'r>(&'r self, h: &Handle<'id, T>) -> &'r T {
let index = h.id;
&self.vec[index]
}
pub fn set(&mut self, h: &Handle<'id, T>, t: T) {
let index = h.id;
self.vec[index] = t;
}
pub fn create(&mut self, t: T) -> Handle<'id, T> {
self.vec.push(t);
Handle {
marker: marker::InvariantLifetime,
id: self.vec.len() - 1
}
}
}
pub struct Handle<'id, T> {
id: uint,
marker: marker::InvariantLifetime<'id>
}
fn main() {
let mut s1 = Storage::<uint>::new();
let s2 = Storage::<uint>::new();
let handle1 = s1.create(5);
s1.get(&handle1);
// In theory this won't compile, since the lifetime of s2
// is *slightly* shorter than the lifetime of s1.
//
// However, this is not how the compiler works, and as of today
// s2 gets the same lifetime as s1 (since they can be borrowed for the same period)
// and this (unfortunately) compiles without error.
s2.get(&handle1);
}
In a hypothetical future, the assignment of lifetimes may change and we may grow a better mechanism for this sort of tagging. However, for now, the best way to accomplish this is with phantom types.

Resources