Why is this hashmap search slower than expected? - rust

What is the best way to check a hash map for a key?
Currently I am using this:
let hashmap = HashMap::<&str, &str>::new(); // Empty hashmap
let name = "random";
for _ in 0..5000000 {
    if !hashmap.contains_key(&name) {
        // Do nothing
    }
}
This seems to be fast in most cases and takes 0.06 seconds when run as shown, but when I use it in this following loop it becomes very slow and takes almost 1 min on my machine. (This is compiling with cargo run --release).
The code aims to open an external program, and loop over the output from that program.
let a = vec!["view", "-h"]; // Arguments to open process with
let mut child = Command::new("samtools").args(&a)
    .stdout(Stdio::piped())
    .spawn()
    .unwrap();
let collect_pairs = HashMap::<&str, &str>::new();
if let Some(ref mut stdout) = child.stdout {
    for line in BufReader::new(stdout).lines() {
        // Do stuff here
        let name = "random";
        if !collect_pairs.contains_key(&name) {
            // Do nothing
        }
    }
}
For some reason adding the if !collect_pairs.contains_key( line increases the run time by almost a minute. The output from child is around 5 million lines. All this code exists in fn main()
EDIT
This appears to fix the problem, resulting in a fast run time, but I do not know why the !hashmap.contains_key does not work well here:
let n: Option<&&str> = collect_pairs.get(name);
if match n {Some(v) => 1, None => 0} == 1 {
// Do something
}

One thing to consider is that HashMap<K, V> by default uses a hashing algorithm chosen for resistance to HashDoS attacks (currently SipHash 1-3) rather than for raw speed, so it will always be a bit slow by nature.
get() boils down to
self.search(k).map(|bucket| bucket.into_refs().1)
contains_key is
self.search(k).is_some()
As such, that get() is faster for you seems strange to me, it's doing more work!
Also,
if match n {Some(v) => 1, None => 0} == 1 {
This can be written more idiomatically as
if let Some(v) = n {

I've found my problem; I'm sorry I didn't pick it up until now. I wasn't checking the result of if !collect_pairs.contains_key(&name) properly. The condition evaluates to true (the key is absent, so the negated check succeeds), which means the rest of the if block runs. I had assumed it evaluated to false. Thanks for the help.
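The behaviour described above can be reproduced in isolation. This is a minimal sketch, assuming the same HashMap<&str, &str> type as in the question; the helper names is_missing and lookup are made up for illustration and are not part of the original code.

```rust
use std::collections::HashMap;

// Hypothetical helper: true when `key` is NOT present in the map.
// On an empty map this is always true, so an `if` guarded by it always runs.
fn is_missing(map: &HashMap<&str, &str>, key: &str) -> bool {
    !map.contains_key(key)
}

// Hypothetical helper: the same lookup written with `get` and `if let`,
// which also hands back the stored value when the key exists.
fn lookup<'a>(map: &HashMap<&str, &'a str>, key: &str) -> Option<&'a str> {
    if let Some(v) = map.get(key) {
        Some(*v)
    } else {
        None
    }
}
```

Calling is_missing on a freshly created map returns true for any key, which matches the observation that the body of the if block was being executed on every line.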

Related

Using while let with two variables simultaneously

I'm learning Rust and have been going through leetcode problems. One of them includes merging two linked lists, whose nodes are optional. I want to write a while loop that would go on until at least 1 node becomes None, and I was trying to use the while let loop for that.
However, it looks like the while let syntax supports only one optional, e.g.:
while let Some(n) = node {
// do stuff
}
but I can't write
while let Some(n1) = node1 && Some(n2) = node2 {
}
Am I misunderstanding the syntax? I know I can rewrite it with a while true loop, but is there a more elegant way of doing it?
Also, can one do multiple checks with if let? Like if let None=node1 && None=node2 {return}
You can pattern match with Option::zip:
while let Some((n1, n2)) = node1.zip(node2) {
...
}
In addition to what @Netwave said, on nightly you can use the unstable let_chains feature:
#![feature(let_chains)]
while let Some(n1) = node1 && let Some(n2) = node2 {
// ...
}
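One detail worth keeping in mind is that Option::zip takes both options by value. The sketch below works around that by walking the lists through Option<&ListNode>, which is Copy. The ListNode type is an assumption modelled on the usual LeetCode definition, and pairwise_sums is a hypothetical function used only to show the loop shape.

```rust
// Assumed node type, modelled on the common LeetCode definition.
pub struct ListNode {
    pub val: i32,
    pub next: Option<Box<ListNode>>,
}

// Hypothetical example: sum the values of two lists position by position,
// stopping as soon as either list runs out.
pub fn pairwise_sums(a: &Option<Box<ListNode>>, b: &Option<Box<ListNode>>) -> Vec<i32> {
    // `as_deref` turns `&Option<Box<ListNode>>` into `Option<&ListNode>`,
    // which is `Copy`, so `zip` does not move the cursors out of the loop.
    let mut node1 = a.as_deref();
    let mut node2 = b.as_deref();
    let mut out = Vec::new();
    while let Some((n1, n2)) = node1.zip(node2) {
        out.push(n1.val + n2.val);
        node1 = n1.next.as_deref();
        node2 = n2.next.as_deref();
    }
    out
}
```

The loop ends when at least one of the two cursors becomes None, which is the stopping condition asked about in the question.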

error handling when unwrapping several try_into calls

I have a case where I need to parse some different values out from a vector.
I made a function for it that returns an Option, which should be either Some or None, depending on whether the unwrapping succeeds.
Currently it looks like this:
fn extract_edhoc_message(msg: Vec<u8>) -> Option<EdhocMessage> {
    let mtype = msg[0];
    let fcnt = msg[1..3].try_into().unwrap();
    let devaddr = msg[3..7].try_into().unwrap();
    let msg = msg[7..].try_into().unwrap();
    Some(EdhocMessage {
        m_type: mtype,
        fcntup: fcnt,
        devaddr: devaddr,
        edhoc_msg: msg,
    })
}
But, I would like to be able to return a None, if any of the unwrap calls fail.
I can do that by pattern matching on each of them and then explicitly returning None if anything fails, but that would be a lot of repeated code.
Is there any way to say something like:
"if any of these unwraps fail, return a None?"
This is exactly what ? does. It's even shorter than the .unwrap() version:
fn extract_edhoc_message(msg: Vec<u8>) -> Option<EdhocMessage> {
    let m_type = msg[0];
    let fcntup = msg[1..3].try_into().ok()?;
    let devaddr = msg[3..7].try_into().ok()?;
    let edhoc_msg = msg[7..].try_into().ok()?;
    Some(EdhocMessage {
        m_type,
        fcntup,
        devaddr,
        edhoc_msg
    })
}
See this relevant part of the Rust Book.
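Note that indexing such as msg[1..3] still panics when the vector is too short, before try_into is ever reached. A possible variation, sketched below, uses slice::get so that out-of-range input also yields None. The field types of EdhocMessage are assumptions (the question does not show the struct), and the function takes a slice instead of a Vec purely for convenience.

```rust
use std::convert::TryInto;

// Assumed layout: the question does not show the struct, so these field types are guesses.
#[derive(Debug, PartialEq)]
pub struct EdhocMessage {
    pub m_type: u8,
    pub fcntup: [u8; 2],
    pub devaddr: [u8; 4],
    pub edhoc_msg: Vec<u8>,
}

pub fn extract_edhoc_message(msg: &[u8]) -> Option<EdhocMessage> {
    // `first` and `get` return None instead of panicking on short input.
    let m_type = *msg.first()?;
    // `try_into` converts the two-byte slice into a `[u8; 2]`; `ok()?` maps an error to None.
    let fcntup = msg.get(1..3)?.try_into().ok()?;
    let devaddr = msg.get(3..7)?.try_into().ok()?;
    // Everything after the header is the payload (possibly empty).
    let edhoc_msg = msg.get(7..)?.to_vec();
    Some(EdhocMessage {
        m_type,
        fcntup,
        devaddr,
        edhoc_msg,
    })
}
```

With this shape, a message shorter than seven bytes returns None rather than aborting the program.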

Cannot get Hash::get_mut() and File::open() to agree about mutability

During a lengthy computation, I need to look up some data in a number of different files. I cannot know beforehand how many or which files exactly, but chances are high that each file is used many times (on the order of 100 million times).
In the first version, I opened the file (whose name is an intermediate result of the computation) each time for lookup.
In the second version, I have a HashMap<String, Box<File>> where I remember already open files and open new ones lazily on demand.
I couldn't manage to handle the mutable stuff that arises from the need to have Files to be mutable. I got something working, but it looks overly silly:
let path = format!("egtb/{}.egtb", self.signature());
let hentry = hash.get_mut(&self.signature());
let mut file = match hentry {
    Some(f) => f,
    None => {
        let rfile = File::open(&path);
        let wtf = Box::new(match rfile {
            Err(ioe) => return Err(format!("could not open EGTB file {} ({})", path, ioe)),
            Ok(opened) => opened,
        });
        hash.insert(self.signature(), wtf);
        // the following won't work
        // wtf
        // &wtf
        // &mut wtf
        // So I came up with the following, but it doesn't feel right, does it?
        hash.get_mut(&self.signature()).unwrap()
    }
};
Is there a canonical way to get a mut File from File::open() or File::create()? In the manuals, this is always done with:
let mut file = File::open("foo.txt")?;
This means my function would have to return Result<_, io::Error> and I can't have that.
The problem seems to be that with the hash-lookup Some(f) gives me a &mut File but the Ok(f) from File::open gives me just a File, and I don't know how to make a mutable reference from that, so that the match arm's types match. I have no clear idea why the version as above at least compiles, but I'd very much like to learn how to do that without getting the File from the HashMap again.
Attempts to use wtf after it has been inserted into the hashmap fail to compile because the value was moved into the hashmap. Instead, you need to obtain the reference into the value stored in the hashmap. To do so without a second lookup, you can use the entry API:
let path = format!("egtb/{}.egtb", self.signature());
let mut file = match hash.entry(self.signature()) {
    Entry::Occupied(e) => e.into_mut(),
    Entry::Vacant(e) => {
        // Bind the I/O error as `ioe` so it can be included in the message.
        let rfile = File::open(&path)
            .map_err(|ioe| format!("could not open EGTB file {} ({})", path, ioe))?;
        e.insert(Box::new(rfile))
    }
};
// `file` is a mutable reference to the stored (boxed) File, use it as needed
Note that map_err() allows you to use ? even when your function returns a Result not immediately compatible with the one you have.
Also note that there is no reason to box the File, a HashMap<String, File> would work just as nicely.
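Putting both notes together, a self-contained version of the lookup might look like the sketch below. It stores File directly, without the Box. The function name cached_file and the choice to key the cache by path are assumptions made for the example; the original code keys the map by self.signature().

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::fs::File;

// Hypothetical helper: return the already-open file for `path`,
// opening and caching it on first use. Only one hash lookup is performed.
pub fn cached_file<'a>(
    cache: &'a mut HashMap<String, File>,
    path: &str,
) -> Result<&'a mut File, String> {
    match cache.entry(path.to_string()) {
        // Already open: hand out a mutable reference to the stored File.
        Entry::Occupied(e) => Ok(e.into_mut()),
        Entry::Vacant(e) => {
            // Convert the io::Error into the String error type of this function.
            let file = File::open(path)
                .map_err(|ioe| format!("could not open file {} ({})", path, ioe))?;
            // `insert` on a vacant entry returns a mutable reference to the new value.
            Ok(e.insert(file))
        }
    }
}
```

If opening fails, nothing is inserted, so a later call with the same path will simply try again.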

Issue when creating a Vector and assigning it later on [closed]

After reading a LOT of documentation I figured out that I am facing a problem of scope, but I have no idea how to solve it. See the example code below:
fn main() {
    let mut bytes_buf: Vec<u8> = Vec::new(); // 1) where I declare the Vector, the compiler forces me to initialize it.
    loop {
        match socket.recv_from(&mut buf) {
            Ok((size, src)) => {
                if count == 0 {
                    chunks_cnt = ...
                    bytes_buf = vec![0; MAX_CHUNK_SIZE * chunks_cnt as usize]; // 2) where I want to set vector size, only ONCE, and after knowing chunks_cnt
                }
                bytes_buf[start..end].copy_from_slice(buf); // 3) where I want to gradually fill the vector
            }
        }
    }
}
For convenience, you can check the full code here
Possible solution
Here the socket fills the array buf. If recv_from fails, an error message is shown. If it succeeds, the received bytes are processed.
On each successful iteration, the first size bytes of buf (the part that recv_from actually filled) are appended to bytes_buf. If this is the first iteration, a value is also inserted at the first position, and the first flag is then set to false. After that, all iterations just continue appending data to the vector.
The following minimal example should compile fine:
use std::net::UdpSocket;
const UDP_HEADER: usize = 8;
const IP_HEADER: usize = 20;
const MAX_DATA_LENGTH: usize = (64 * 1024 - 1) - UDP_HEADER - IP_HEADER;
fn main() {
    let socket = UdpSocket::bind("0.0.0.0:8888").expect("Could not bind socket");
    let mut buf = [0u8; MAX_DATA_LENGTH]; // Array that will be filled by recv_from.
    let mut bytes_buf: Vec<u8> = Vec::new(); // Vector where the data will be collected.
    let mut first = true; // Flag that indicates if this is our first iteration.
    loop {
        match socket.recv_from(&mut buf) {
            Ok((size, _src)) => {
                // Append only the bytes that were actually received to bytes_buf.
                bytes_buf.extend_from_slice(&buf[..size]);
                if first {
                    // Insert function inserts the element at the specified position and shifts
                    // all elements after it to the right.
                    bytes_buf.insert(0, 10u8); // IDK What value you need here.
                }
                first = false; // Set first to false
            },
            Err(err) => eprintln!("Error: {}", err) // If we fail display the error.
        }
    }
}
Side note
Your example was missing many variables and much of the context. Even so, using the link you shared, I managed to create a minimal working example of what I believe you are trying to achieve, although it differs quite a bit from your code. Next time, please provide a minimal reproducible example. More information here: How to create a Minimal, Reproducible Example
Have a nice day!
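For the pattern the question originally asked about (declare the vector early, size it once, then fill it chunk by chunk), a socket-free sketch could look like this. Vec::new() does not allocate, so the initial declaration is cheap. The function assemble, its (index, data) chunk representation, and the small MAX_CHUNK_SIZE value are all assumptions made for the illustration.

```rust
// Hypothetical chunk size, kept small so the example is easy to follow.
pub const MAX_CHUNK_SIZE: usize = 4;

// Hypothetical helper: place each chunk at `index * MAX_CHUNK_SIZE` in one buffer.
// Panics if a chunk does not fit into the buffer, like the slicing in the question.
pub fn assemble(chunks_cnt: usize, chunks: &[(usize, &[u8])]) -> Vec<u8> {
    // 1) Declared up front; an empty Vec performs no allocation.
    let mut bytes_buf: Vec<u8> = Vec::new();
    for (count, (index, data)) in chunks.iter().enumerate() {
        if count == 0 {
            // 2) Sized exactly once, after the number of chunks is known.
            bytes_buf = vec![0; MAX_CHUNK_SIZE * chunks_cnt];
        }
        // 3) Filled gradually, one chunk at a time.
        let start = index * MAX_CHUNK_SIZE;
        let end = start + data.len();
        bytes_buf[start..end].copy_from_slice(data);
    }
    bytes_buf
}
```

Because the buffer is assigned inside the loop but declared outside it, the filled vector remains available after the loop ends, which addresses the scoping concern in the question.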

getting payload from a substrate event back in rust tests

I've successfully created my first Substrate project, and the pallet I built also works fine. Now I want to write tests for the flow and the provided functions.
My flow is to generate a random hash and store it, associated with the sender of the transaction:
let _sender = ensure_signed(origin)?;
let nonce = Nonce::get();
let _random_seed = <randomness_collective_flip::Module<T>>::random_seed();
let random_hash = (_random_seed, &_sender, nonce).using_encoded(T::Hashing::hash);
ensure!(!<Hashes<T>>::contains_key(random_hash), "This new id already exists");
let _now = <timestamp::Module<T>>::get();
let new_elem = HashElement {
    id: random_hash,
    parent: parent,
    updated: _now,
    created: _now
};
<Hashes<T>>::insert(random_hash, new_elem);
<HashOwner<T>>::insert(random_hash, &_sender);
Self::deposit_event(RawEvent::Created(random_hash, _sender));
Ok(())
This works well so far. Now I want to test the flow with a written test: I want to check whether the hash emitted in the Created event is also assigned in the HashOwner map. For this I need to get the value back out of the event.
And this is my problem: I'm not experienced in Rust, and all the examples I found expect every value emitted in the event to be known in advance, like this:
// construct event that should be emitted in the method call directly above
let expected_event = TestEvent::generic_event(RawEvent::EmitInput(1, 32));
// iterate through array of `EventRecord`s
assert!(System::events().iter().any(|a| a.event == expected_event));
When debugging my written test:
assert_ok!(TemplateModule::create_hash(Origin::signed(1), None));
let events = System::events();
let lastEvent = events.last().unwrap();
let newHash = &lastEvent.event;
I can see in VS Code that the values are available:
[image: debug window of VS Code]
But I don't know how to get this hash back into a variable. Maybe this is only a one-liner, but my Rust knowledge is too limited.
Thank you for your help.
Here's a somewhat generic example of how to parse and check events, if you only care about the last event that your module put in system and nothing else.
assert_eq!(
    System::events()
        // this gives you an EventRecord { event: ..., ...}
        .into_iter()
        // map into the inner `event`.
        .map(|r| r.event)
        // the inner event is like `OuterEvent::moduleEvent(EventEnum)`. The name of the outer
        // event comes from whatever you have placed in your `decl_event! {}` in test mocks.
        .filter_map(|e| {
            if let MetaEvent::templateModule(inner) = e {
                Some(inner)
            } else {
                None
            }
        })
        .last()
        .unwrap(),
    // RawEvent is defined and imported in the template.rs file.
    // val1 and val2 are things that you want to assert against.
    RawEvent::Created(val1, val2),
);
Indeed you can also omit the first map or do it in more compact ways, but I have done it like this so you can see it step by step.
Print the System::events(), this also helps.
I got it working thanks to kianenigma's response.
I wanted to reuse the data carried by the event:
let lastEvent = System::events()
    // this gives you an EventRecord { event: ..., ...}
    .into_iter()
    // map into the inner `event`.
    .map(|r| r.event)
    // the inner event is like `OuterEvent::moduleEvent(EventEnum)`. The name of the outer
    // event comes from whatever you have placed in your `decl_event! {}` in test mocks.
    .filter_map(|e| {
        if let TestEvent::pid(inner) = e {
            Some(inner)
        } else {
            None
        }
    })
    .last()
    .unwrap();
if let RawEvent::Created(newHash, initiatedAccount) = lastEvent {
    // there are the values :D
}
This can maybe be written better, but it works for me.
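The same extraction can be tried outside Substrate with plain enums. The sketch below is framework-free: RawEvent, TestEvent, and EventRecord are simplified stand-ins (with integers in place of the real hash and account types), and last_created is a hypothetical helper, so none of this is the actual Substrate API.

```rust
// Simplified stand-ins for the pallet and mock types; the real ones come from Substrate.
#[derive(Debug, Clone, PartialEq)]
pub enum RawEvent {
    Created(u64, u32), // (hash, account), reduced to plain integers here
    Removed(u64),
}

#[derive(Debug, Clone, PartialEq)]
pub enum TestEvent {
    System(String),
    Pid(RawEvent),
}

pub struct EventRecord {
    pub event: TestEvent,
}

// Hypothetical helper: return the payload of the last `Created` event, if any.
pub fn last_created(events: Vec<EventRecord>) -> Option<(u64, u32)> {
    events
        .into_iter()
        // unwrap the record into its inner event
        .map(|r| r.event)
        // a nested pattern picks out the payload in one step
        .filter_map(|e| match e {
            TestEvent::Pid(RawEvent::Created(hash, who)) => Some((hash, who)),
            _ => None,
        })
        .last()
}
```

Matching the nested pattern directly avoids the separate if let after the iterator chain, and returning an Option avoids the unwrap when no matching event was emitted.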
