Group vector of structs by field

I want to create a vector containing all of the structs whose id field matches, process that new vector, and then repeat the process. Basically, I want to group together the structs with a matching id field.
Is there a way to do this without using the unstable feature drain_filter?
#![feature(drain_filter)]
#[derive(Debug)]
struct Person {
id: u32,
}
fn main() {
let mut people = vec![];
for p in 0..10 {
people.push(Person { id: p });
}
while !people.is_empty() {
let first_person_id = people.first().unwrap().id;
let drained: Vec<Person> = people.drain_filter(|p| p.id == first_person_id).collect();
println!("{:#?}", drained);
}
}

If you are looking to group your vector by the person id, it's likely to be more efficient to use a HashMap from id to Vec<Person>, where each id holds a vector of persons. You can then loop through the HashMap and process each vector (group). Draining people on every iteration has O(N^2) worst-case time complexity, whereas building the HashMap takes O(N) on average. Note that a HashMap iterates over its groups in arbitrary order.
use std::collections::HashMap;
#[derive(Debug)]
struct Person {
id: u32,
}
fn main() {
let mut people = vec![];
let mut groups: HashMap<u32, Vec<Person>> = HashMap::new();
for p in 0..10 {
people.push(Person { id: p });
}
people.into_iter().for_each(|person| {
let group = groups.entry(person.id).or_insert(vec![]);
group.push(person);
});
for (_id, group) in groups {
println!("{:#?}", group);
}
}
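If the drain-and-process loop from the question should be kept, with groups coming out in order of first appearance, a stable alternative to drain_filter is Iterator::partition. The following is only a sketch under that assumption; the function name group_by_first_id is made up for illustration, and the approach is still O(N * number of groups), so the HashMap version above remains the better choice for large inputs.

```rust
#[derive(Debug, PartialEq)]
struct Person {
    id: u32,
}

/// Hypothetical helper: repeatedly splits off every person sharing the id
/// of the first remaining person, using only stable APIs.
fn group_by_first_id(mut people: Vec<Person>) -> Vec<Vec<Person>> {
    let mut groups = Vec::new();
    // Copy the id out so the borrow of `people` ends before we move it.
    while let Some(first_id) = people.first().map(|p| p.id) {
        // `partition` consumes the vector and returns (matching, rest).
        let (matching, rest): (Vec<Person>, Vec<Person>) =
            people.into_iter().partition(|p| p.id == first_id);
        groups.push(matching);
        people = rest;
    }
    groups
}
```

Each pass allocates two new vectors, which is the price paid for staying on stable Rust without changing the shape of the original loop.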

Related

Prehash a struct

I have struct with many fields that I want to use as a key in a HashMap.
I often need to use the struct multiple times to access different HashMaps and I don't want to compute the hash and possible clone each time as the program needs to be as performant as possible and it will be accessing the HashMaps a lot (billions of times so the time really stacks up).
Here is a simplified example:
use std::collections::HashMap;
#[derive(Hash, Eq, PartialEq, Clone)]
struct KeyStruct {
field1: usize,
field2: bool,
}
fn main() {
// This is what I'm doing now
let key = KeyStruct { field1: 1, field2: true };
// This is what I'd like to do
// let key = key.get_hash()
let mut map1 = HashMap::new();
let mut map2 = HashMap::new();
let mut map3 = HashMap::new();
let mut map4 = HashMap::new();
if !map1.contains_key(&key) {
map1.insert(key.clone(), 1);
}
if !map2.contains_key(&key) {
map2.insert(key.clone(), 2);
}
if !map3.contains_key(&key) {
map3.insert(key.clone(), 3);
}
if !map4.contains_key(&key) {
map4.insert(key.clone(), 4);
}
}
I never actually use the values in the KeyStruct, I just want to use it as a key to the HashMaps. I would like to avoid hashing it multiple times and cloning it like is done in that example.
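One possible direction, offered only as a sketch and not as an answer from the original thread: wrap the key in a type that computes the hash once and stores it, and share the key behind an Rc so that cloning is just a reference-count bump. The names Prehashed and insert_if_absent are invented for this illustration. Note that the HashMap still runs its own hasher over the cached u64, which is cheap compared with hashing many fields but not free.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::rc::Rc;

#[derive(Hash, Eq, PartialEq, Debug)]
struct KeyStruct {
    field1: usize,
    field2: bool,
}

/// Hypothetical wrapper: the field-by-field hash is computed once in `new`.
#[derive(Clone, Debug)]
struct Prehashed {
    hash: u64,
    key: Rc<KeyStruct>, // cloning copies a pointer, not the fields
}

impl Prehashed {
    fn new(key: KeyStruct) -> Self {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        Prehashed { hash: hasher.finish(), key: Rc::new(key) }
    }
}

impl PartialEq for Prehashed {
    fn eq(&self, other: &Self) -> bool {
        // Compare the cheap cached hash first, then the real key.
        self.hash == other.hash && self.key == other.key
    }
}
impl Eq for Prehashed {}

impl Hash for Prehashed {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Feed only the cached value to the map's hasher.
        state.write_u64(self.hash);
    }
}

/// The entry API does the "contains, then insert" check in a single lookup.
fn insert_if_absent(map: &mut HashMap<Prehashed, i32>, key: &Prehashed, value: i32) {
    map.entry(key.clone()).or_insert(value);
}
```

Whether this actually helps depends on the real key size and access pattern, so it would need to be benchmarked against the plain derive(Hash) version.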

My Rust OR_INSERT Hashmap code to update a struct content works without dereferencing. Why?

From Rust documentation, this count variable wouldn't work without dereferencing (*)
let text = "hello world wonderful world";
let mut map = HashMap::new();
for word in text.split_whitespace() {
let count = map.entry(word).or_insert(0);
*count += 1;
}
println!("{:?}", map);
However, I have the following code which tries to update a u8 variable (i.e team_val.goals_scored ) in a Struct if the key is found in a hashmap. It works without dereferencing. My understanding from the above Rust documentation was I need to dereference the team_val.goals_scored in order to update the content of the struct which is also a value for the hash map. Whelp!
My code:
#[derive(Debug)]
struct Team {
name: String,
goals_scored: u8,
goals_conceded: u8,
}
fn build_scores_table(results: String) -> HashMap<String, Team> {
// The name of the team is the key and its associated struct is the value.
let mut scores: HashMap<String, Team> = HashMap::new();
for r in results.lines() {
let v: Vec<&str> = r.split(',').collect();
let team_1_name = v[0].to_string();
let team_1_score: u8 = v[2].parse().unwrap();
let team_2_name = v[1].to_string();
let team_2_score: u8 = v[3].parse().unwrap();
// TODO: Populate the scores table with details extracted from the
// current line. Keep in mind that goals scored by team_1
// will be number of goals conceded from team_2, and similarly
// goals scored by team_2 will be the number of goals conceded by
// team_1.
let mut team_1_struct= Team {
name: team_1_name.clone(),
goals_scored: team_1_score,
goals_conceded: team_2_score
};
let mut team_2_struct= Team {
name: team_2_name.clone(),
goals_scored: team_2_score,
goals_conceded: team_1_score
};
if scores.contains_key(&team_1_name) {
let team_val = scores.entry(team_1_name.clone()).or_insert(team_1_struct);
println!("Yooo {:#?}",team_val);
team_val.goals_scored +=team_1_score;
team_val.goals_conceded += team_2_score;
} else {
scores.insert(team_1_name,team_1_struct);
}
if scores.contains_key(&team_2_name) {
let team_val = scores.entry(team_2_name.clone()).or_insert(team_2_struct);
println!("Yooo {:#?}",team_val);
team_val.goals_scored +=team_2_score;
team_val.goals_conceded += team_1_score;
} else {
scores.insert(team_2_name,team_2_struct);
}
}
scores
}

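Rust does some automatic dereferencing: field access and method calls through a reference dereference it for you, while an operator such as += applied to the reference itself does not. We can see the difference between the documentation code and what you wrote:
// This needs an explicit dereference
*count += 1;
// versus this
team_val.goals_scored += team_1_score;
// ^--- the field access causes an automatic dereferencing of team_val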
If you're coming from C I think this documentation may be even clearer.
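To make the contrast concrete, here is a small sketch that puts both cases side by side. The Team struct is cut down to one field and the function names count_words and add_goals are chosen just for this example.

```rust
use std::collections::HashMap;

#[derive(Debug, Default)]
struct Team {
    goals_scored: u8,
}

fn count_words(text: &str) -> HashMap<&str, u32> {
    let mut map = HashMap::new();
    for word in text.split_whitespace() {
        // `count` is a `&mut u32`; `+=` is applied to the referenced integer,
        // so the reference must be dereferenced explicitly.
        let count: &mut u32 = map.entry(word).or_insert(0);
        *count += 1;
    }
    map
}

fn add_goals(scores: &mut HashMap<String, Team>, name: &str, goals: u8) {
    // `team` is a `&mut Team`; the `.` field access dereferences it for us.
    let team: &mut Team = scores.entry(name.to_string()).or_default();
    team.goals_scored += goals; // same as (*team).goals_scored += goals
}
```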
Answering the follow-up question "can you use entry() without using clone() on the key?": you cannot. entry() takes ownership of the key you pass it and doesn't give it back, so if you still need the key afterwards (here, to store it in the struct as well) the borrow checker forces you to clone it. It's currently a limitation of the API, but if the key is something as cheap to copy as a short string, this shouldn't impact you much.
You can do a fair amount to slim down your code, though (with the caveat that I only did this for one team - it's easily extensible):
use std::collections::HashMap;
struct Team {
name: String,
goals: u8
}
type Scorecard = HashMap<String, Team>;
fn scoring(records: String) -> Scorecard {
let mut scores : Scorecard = HashMap::new();
for r in records.lines() {
let record: Vec<&str> = r.split(',').collect();
let team_name = record[0].to_string();
let team_score: u8 = record[1].parse().unwrap();
// Note that we do not need to create teams here on every iteration.
// There is a chance they already exist, and we can create them only if
// the `or_insert()` clause is activated.
// Note that the `entry()` clause when used with `or_insert()`
// gives you an implicit 'does this key exist?' check.
let team = scores.entry(team_name.clone()).or_insert(Team {
name: team_name,
goals: 0,
});
team.goals += team_score;
}
scores
}
fn main() {
let record = "Thunderers,1\nFlashians,1\nThunderers,2";
let scores = scoring(record.to_string());
for (key, value) in &scores {
println!("{}: The mighty {} scored {} points.", key, value.name, value.goals)
}
// Flattening Some Options!
// let o = Some(Some(5));
// let p = Some(5);
// println!("o: {:#?}", o);
// println!("p: {:#?}", p);
// println!("flattened o: {:#?}", o.flatten());
// println!("flattened p: {:#?}", p.flatten());
}
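As a variation on the code above, the clone can be limited to the case where a team is actually inserted by using Entry::or_insert_with_key, which hands the closure a reference to the key already stored in the entry. This is a sketch rather than the answer's own code; it also skips malformed lines instead of panicking, which is an assumption about the desired behaviour.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
struct Team {
    name: String,
    goals: u8,
}

fn scoring(records: &str) -> HashMap<String, Team> {
    let mut scores = HashMap::new();
    for line in records.lines() {
        let mut parts = line.split(',');
        // Skip lines that lack a name or a parsable score (assumed policy).
        let (name, score) = match (
            parts.next(),
            parts.next().and_then(|s| s.trim().parse::<u8>().ok()),
        ) {
            (Some(name), Some(score)) => (name, score),
            _ => continue,
        };
        // The closure only runs when the key is new, so the name is cloned
        // once per team rather than once per record.
        let team = scores
            .entry(name.to_string())
            .or_insert_with_key(|key| Team { name: key.clone(), goals: 0 });
        team.goals += score;
    }
    scores
}
```

The key String is still allocated for every record because entry() needs an owned key; only the second copy stored inside Team is avoided for existing teams.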

Filter a vector without losing its ownership

Given a struct as follows:
#[derive(Debug)]
struct Item {
id: u32
}
impl Item {
fn new(id: u32) -> Item {
Item { id }
}
}
I'm looking for a way to perform filter on a vector of that struct without taking its ownership. The following code won't work because the ownership has been moved:
fn main() {
let items: Vec<Item> = vec![Item::new(1), Item::new(2), Item::new(3)];
let odd_items: Vec<Item> = items.into_iter()
.filter(| item | item.id % 2 == 1)
.collect();
for i in odd_items.iter() { println!("{:?}", i); }
for i in items.iter() { println!("{:?}", i); }
}
Currently, I have 2 solutions:
Having a vector of &Item instead of Item, however, I find it a bit awkward to start from Vec<Item> but end up with Vec<&Item>:
fn main() {
let items: Vec<Item> = vec![Item::new(1), Item::new(2), Item::new(3)];
let odd_items: Vec<&Item> = items.iter()
.filter(| item | item.id % 2 == 1)
.collect();
for i in odd_items.iter() { println!("{:?}", i); }
for i in items.iter() { println!("{:?}", i); }
}
Clone an initial vector. I prefer this one but it results in unnecessary cloning, I'd prefer cloning only the items after filtering:
fn main() {
let items: Vec<Item> = vec![Item::new(1), Item::new(2), Item::new(3)];
let odd_items: Vec<Item> = items.clone()
.into_iter()
.filter(| item | item.id % 2 == 1)
.collect();
for i in odd_items.iter() { println!("{:?}", i); }
for i in items.iter() { println!("{:?}", i); }
}
I wonder if there is any better way to filter a vector without losing ownership.

"Clone an initial vector. I prefer this one but it results in unnecessary cloning, I'd prefer cloning only the items after filtering"
If you want to clone after filtering, you can certainly do so, and it will have the side effect of converting &Item to Item (which is what clone() does by definition):
let items: Vec<Item> = vec![Item::new(1), Item::new(2), Item::new(3)];
let odd_items: Vec<Item> = items
.iter()
.filter(|item| item.id % 2 == 1)
.cloned() // the same as .map(Item::clone)
.collect();
BTW if you can get away with Vec<&Item>, that solution might be more efficient because it uses less space (if actual items are larger than a pointer, that is), and it creates odd_items as a "view" into items. If that's not what you want, then go for the cloning, you'll just need to add #[derive(Clone)] to Item.
Also note that, despite its bad rap, clone() doesn't have to be expensive - "cloning" a struct that just contains a u32 is no more expensive than a u32 assignment. It's only when the type is large and/or contains heap-allocated data that cloning becomes a thing to shy away from.
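For comparison, the two options discussed above can be written as small functions that borrow the original vector, so the caller keeps ownership either way. The function names odd_refs and odd_clones are illustrative only.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Item {
    id: u32,
}

/// Returns a "view": references into `items`, no Item is copied.
fn odd_refs(items: &[Item]) -> Vec<&Item> {
    items.iter().filter(|item| item.id % 2 == 1).collect()
}

/// Returns owned copies, cloning only the items that pass the filter.
fn odd_clones(items: &[Item]) -> Vec<Item> {
    items
        .iter()
        .filter(|item| item.id % 2 == 1)
        .cloned()
        .collect()
}
```

The first result cannot outlive items, while the second is independent of it; which one fits depends on how long the filtered list needs to live.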

In your specific case, you don't actually need to call collect as the next for loop can use an iterator directly:
fn main() {
let items: Vec<Item> = vec![Item::new(1), Item::new(2), Item::new(3)];
let odd_items = items.iter().filter(|item| item.id % 2 == 1);
for i in odd_items {
println!("{:?}", i);
}
for i in items.iter() {
println!("{:?}", i);
}
}
With this technique, odd_items is a binding to an iterator of &Item but type inference hides it (the actual type is more complex, as chaining iterators usually results in things like Filter<Map<Iter<...>>> etc).
The main advantage of doing so is that no Vec is allocated in addition to items, and the latter retains ownership.

Rust: concurrency error, program hangs after first thread

I have created a simplified version of my problem below, I have a Bag struct and Item struct. I want to spawn 10 threads that execute item_action method from Bag on each item in an item_list, and print a statement if both item's attributes are in the bag's attributes.
use std::sync::{Mutex,Arc};
use std::thread;
#[derive(Clone, Debug)]
struct Bag{
attributes: Arc<Mutex<Vec<usize>>>
}
impl Bag {
fn new(n: usize) -> Self {
let mut v = Vec::with_capacity(n);
for _ in 0..n {
v.push(0);
}
Bag{
attributes:Arc::new(Mutex::new(v)),
}
}
fn item_action(&self, item_attr1: usize, item_attr2: usize) -> Result<(),()> {
if self.attributes.lock().unwrap().contains(&item_attr1) ||
self.attributes.lock().unwrap().contains(&item_attr2) {
println!("Item attributes {} and {} are in Bag attribute list!", item_attr1, item_attr2);
Ok(())
} else {
Err(())
}
}
}
#[derive(Clone, Debug)]
struct Item{
item_attr1: usize,
item_attr2: usize,
}
impl Item{
pub fn new(item_attr1: usize, item_attr2: usize) -> Self {
Item{
item_attr1: item_attr1,
item_attr2: item_attr2
}
}
}
fn main() {
let mut item_list: Vec<Item> = Vec::new();
for i in 0..10 {
item_list.push(Item::new(i, (i+1)%10));
}
let bag: Bag= Bag::new(10); //create 10 attributes
let mut handles = Vec::with_capacity(10);
for x in 0..10 {
let bag2 = bag.clone();
let item_list2= item_list.clone();
handles.push(
thread::spawn(move || {
bag2.item_action(item_list2[x].item_attr1, item_list2[x].item_attr2);
})
)
}
for h in handles {
println!("Here");
h.join().unwrap();
}
}
When running, I only got one line, and the program just stops there without returning.
Item attributes 0 and 1 are in Bag attribute list!
May I know what went wrong? Please see code in Playground
Updated:
With the suggestion from @loganfsmyth, the program can return now... but it still only prints 1 line as above. I expect it to print 10 because my item_list has 10 items. Not sure if my thread logic is correct.
I have added println!("Here"); when calling join all threads. And I can see Here is printed 10 times, just not the actual log from item_action

I believe this is because Rust is not running your
if self.attributes.lock().unwrap().contains(&item_attr1) ||
self.attributes.lock().unwrap().contains(&item_attr2) {
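expression the way you expect. The guard returned by the first lock() is a temporary, and temporaries created in an if condition live until the whole condition has been evaluated. So whenever the first contains() returns false, the second lock() is attempted while the first guard is still held. What you essentially end up with is
let condition = {
let lock1 = self.attributes.lock().unwrap();
// lock1 is still held here, so the second lock() on the same thread never succeeds
lock1.contains(&item_attr1) || self.attributes.lock().unwrap().contains(&item_attr2)
};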
if condition {
which is causing your code to deadlock.
You should instead write:
let attributes = self.attributes.lock().unwrap();
if attributes.contains(&item_attr1) ||
attributes.contains(&item_attr2) {
so that there is only one lock.
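Your code would also work as-is with a reentrant lock (for example ReentrantMutex from the parking_lot crate) instead of a Mutex, since that allows the same thread to acquire the lock more than once. An RwLock with two read() calls may appear to work too, but the standard library does not guarantee that a second read lock taken by the same thread succeeds, so the single-lock version above is the safer fix.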
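Putting the single-lock fix into a complete sketch: item_action takes the lock once, and a helper (run_all, a name invented here) spawns one thread per item and collects whether each call succeeded. Returning the results instead of printing is a change made only so the behaviour is easy to check.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Clone, Debug)]
struct Bag {
    attributes: Arc<Mutex<Vec<usize>>>,
}

impl Bag {
    fn new(n: usize) -> Self {
        // n attributes, all zero, as in the question.
        Bag { attributes: Arc::new(Mutex::new(vec![0; n])) }
    }

    fn item_action(&self, item_attr1: usize, item_attr2: usize) -> Result<(), ()> {
        // Lock once; the guard is a named local that is dropped at the end
        // of the function, so no second lock() is ever attempted.
        let attributes = self.attributes.lock().unwrap();
        if attributes.contains(&item_attr1) || attributes.contains(&item_attr2) {
            Ok(())
        } else {
            Err(())
        }
    }
}

/// Hypothetical helper: one thread per (attr1, attr2) pair.
fn run_all(bag: &Bag, items: &[(usize, usize)]) -> Vec<bool> {
    let handles: Vec<_> = items
        .iter()
        .copied()
        .map(|(a, b)| {
            let bag = bag.clone(); // clones the Arc, not the data
            thread::spawn(move || bag.item_action(a, b).is_ok())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

With the bag from the question (ten zeros) and items (i, (i+1) % 10), only the items containing 0 succeed, which is why far fewer than ten lines are printed even once the deadlock is gone.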

Access another element immutably while mutating an element of a HashMap

I'm working on a game that involves a bunch of Beetle objects stored in a HashMap. Each beetle has a position, and it can also have a target id which is the key for another beetle in the hash. If a beetle has a target, it needs to move toward the target each time the game loop executes.
I can't perform the lookup of the target's current position, because you can't have a mutable and an immutable borrow at the same time. I get that, but any ideas how to restructure for my specific case?
I think I'm just getting caught up in how easy this would be in pretty much any other language, I can't see the idiomatic Rust way to do it. Here's a pretty minimal but complete example:
use std::collections::HashMap;
type Beetles = HashMap<i32, Beetle>;
struct Beetle {
x: f32,
y: f32,
target_id: i32,
}
impl Beetle {
fn new() -> Beetle {
Beetle {
x: 0.0,
y: 0.0,
target_id: -1,
}
}
}
fn main() {
let mut beetles: Beetles = HashMap::new();
beetles.insert(0, Beetle::new());
beetles.insert(1, Beetle::new());
set_target(&mut beetles, 0, 1);
move_toward_target(&mut beetles, 0);
}
fn set_target(beetles: &mut Beetles, subject_id: i32, target_id: i32) {
if let Some(subject) = beetles.get_mut(&subject_id) {
subject.target_id = target_id;
}
}
fn move_toward_target(beetles: &mut Beetles, beetle_id: i32) {
if let Some(subject) = beetles.get_mut(&beetle_id) {
if let Some(target) = beetles.get(&subject.target_id) {
// update subject position to move closer to target...
}
}
}

You could solve your specific problem by performing a double lookup for the subject. First, borrow immutably from the hash map to collect the information necessary for updating the subject. Then finally update the subject using the collected information by borrowing mutably from the hash map:
fn move_toward_target(beetles: &mut Beetles, beetle_id: i32) {
if let Some(subject_target_id) = beetles.get(&beetle_id).map(|b| b.target_id) {
let mut target_xy = None; // example
if let Some(target) = beetles.get(&subject_target_id) {
// collect information about target relevant for updating subject
target_xy = Some((target.x, target.y)) // example
}
let subject = beetles.get_mut(&beetle_id).unwrap();
// update subject using collected information about target
if let Some((target_x, target_y)) = target_xy{ // example
subject.x = target_x;
subject.y = target_y;
}
}
}
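The same two-phase idea can be written a little more compactly, and extended so that the subject actually moves toward the target instead of jumping onto it. This is a sketch under assumptions: the step parameter and the movement rule are invented here, since the question leaves the update as a comment.

```rust
use std::collections::HashMap;

type Beetles = HashMap<i32, Beetle>;

#[derive(Debug, Clone, PartialEq)]
struct Beetle {
    x: f32,
    y: f32,
    target_id: i32,
}

/// Moves the beetle at most `step` units toward its target (assumed rule).
fn move_toward_target(beetles: &mut Beetles, beetle_id: i32, step: f32) {
    // Phase 1: immutable lookups only. The coordinates are copied out,
    // so no borrow of the map survives past this statement.
    let target_xy = beetles
        .get(&beetle_id)
        .and_then(|subject| beetles.get(&subject.target_id))
        .map(|target| (target.x, target.y));

    // Phase 2: a single mutable lookup to apply the update.
    if let (Some((tx, ty)), Some(subject)) = (target_xy, beetles.get_mut(&beetle_id)) {
        let (dx, dy) = (tx - subject.x, ty - subject.y);
        let dist = (dx * dx + dy * dy).sqrt();
        if dist <= step {
            // Close enough: land exactly on the target.
            subject.x = tx;
            subject.y = ty;
        } else {
            subject.x += dx / dist * step;
            subject.y += dy / dist * step;
        }
    }
}
```

This works because the data needed from the target is Copy; if larger state had to be read, it would need to be cloned or summarised in phase 1.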
However, it is likely that you will run into similar and more complex problems with your beetles in the future, because the beetles are your central game objects, which you will likely want to reference mutably and immutably at the same time in several places in your code.
Therefore, it makes sense to wrap your beetles in std::cell::RefCells, which check borrow rules dynamically at runtime. This gives you a lot of flexibility when referencing beetles in your hash map:
fn main() {
let mut beetles: Beetles = HashMap::new();
beetles.insert(0, RefCell::new(Beetle::new()));
beetles.insert(1, RefCell::new(Beetle::new()));
set_target(&mut beetles, 0, 1);
move_toward_target(&mut beetles, 0);
}
fn set_target(beetles: &mut Beetles, subject_id: i32, target_id: i32) {
if let Some(mut subject) = beetles.get_mut(&subject_id).map(|b| b.borrow_mut()) {
subject.target_id = target_id;
}
}
fn move_toward_target(beetles: &mut Beetles, beetle_id: i32) {
if let Some(mut subject) = beetles.get(&beetle_id).map(|b| b.borrow_mut()) {
if let Some(target) = beetles.get(&subject.target_id).map(|b| b.borrow()) {
//example for updating subject based on target
subject.x = target.x;
subject.y = target.y;
}
}
}
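Updated Beetles type, together with the imports the code above needs:
use std::cell::RefCell;
use std::collections::HashMap;
type Beetles = HashMap<i32, RefCell<Beetle>>;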
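One caveat with the RefCell version: because the borrow rules are checked at runtime, a beetle whose target_id is its own id would call borrow() on a cell that is already mutably borrowed, and that panics. A hedged sketch of one way to guard against this uses try_borrow; the bool return value is an addition made here for illustration.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

type Beetles = HashMap<i32, RefCell<Beetle>>;

struct Beetle {
    x: f32,
    y: f32,
    target_id: i32,
}

/// Returns true if the subject was updated. Only `&Beetles` is needed,
/// because the mutation goes through the RefCell.
fn move_toward_target(beetles: &Beetles, beetle_id: i32) -> bool {
    let cell = match beetles.get(&beetle_id) {
        Some(cell) => cell,
        None => return false,
    };
    let mut subject = cell.borrow_mut();
    let target_id = subject.target_id;
    // try_borrow returns Err instead of panicking when the target cell is
    // already mutably borrowed, i.e. when a beetle targets itself.
    let moved = match beetles.get(&target_id).map(|c| c.try_borrow()) {
        Some(Ok(target)) => {
            subject.x = target.x;
            subject.y = target.y;
            true
        }
        _ => false,
    };
    moved
}
```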
