Using `replacen` and assigning the result back to the original variable - rust

I tried assigning the string with the characters removed back to the original magazine variable, but it doesn't compile.
fn can_construct(ransom_note: String, magazine: String) -> bool {
let ransom_arr: Vec<char> = ransom_note.chars().collect();
let mut magazine = magazine.as_str();
for index in 0..ransom_arr.len() {
let original_length = &magazine.len();
let mut new_str = &magazine.replacen(ransom_arr[index], "", 1);
if &new_str.len() == original_length {
return false;
}
magazine = new_str.as_mut_str();
}
true
}
I don't understand why it fails.
let original_length = &magazine.len();
-------------- borrow later used here
let mut new_str = &magazine.replacen(ransom_arr[index], "", 1);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ creates a temporary which is freed while still in use
...
}
- temporary value is freed at the end of this statement
= note: consider using a `let` binding to create a longer lived value
Given two strings ransomNote and magazine, return true if ransomNote can be constructed by using the letters from magazine and false otherwise.
Each letter in magazine can only be used once in ransomNote.
Those are the rules of the problem.
Example:
Input: ransomNote = "aa", magazine = "ab"
Output: false

The problem is that replacen returns a new String, and you're taking and storing a reference to it. That String is a temporary, so it is dropped at the end of the statement, but the reference is stored in a variable that lives longer than that. You can instead store the String value itself:
fn can_construct(ransom_note: String, mut magazine: String) -> bool {
let ransom_arr: Vec<char> = ransom_note.chars().collect();
for index in 0..ransom_arr.len() {
let original_length = magazine.len();
let new_str = magazine.replacen(ransom_arr[index], "", 1);
if new_str.len() == original_length {
return false;
}
magazine = new_str;
}
true
}
This compiles but since you haven't given example inputs/outputs, I couldn't test to see if it works as you expect.
This can be simplified further using iterators:
fn can_construct(ransom_note: String, mut magazine: String) -> bool {
for c in ransom_note.chars() {
let original_length = magazine.len();
let new_str = magazine.replacen(c, "", 1);
if new_str.len() == original_length {
return false;
}
magazine = new_str;
}
true
}
Both of these solutions are O(n^2), because each string replacement is an O(n) operation. If I understand the objective correctly, an O(n) solution is possible using a HashMap.
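For illustration, here is a minimal sketch of what that HashMap approach could look like. It is not from the original answer: it assumes the function may take `&str` instead of `String`, and it counts `char`s rather than bytes.

```rust
use std::collections::HashMap;

/// Returns true if every character of `ransom_note` can be taken from
/// `magazine`, using each magazine character at most once.
/// Assumption: taking `&str` is acceptable; the question's signature uses `String`.
fn can_construct(ransom_note: &str, magazine: &str) -> bool {
    // Count how many times each character is available in the magazine.
    let mut available: HashMap<char, usize> = HashMap::new();
    for c in magazine.chars() {
        *available.entry(c).or_insert(0) += 1;
    }

    // Spend one occurrence per character of the note; fail on the first shortage.
    for c in ransom_note.chars() {
        match available.get_mut(&c) {
            Some(count) if *count > 0 => *count -= 1,
            _ => return false,
        }
    }
    true
}

fn main() {
    // The example from the question: only one 'a' is available.
    println!("{}", can_construct("aa", "ab")); // prints "false"
    println!("{}", can_construct("aa", "aab")); // prints "true"
}
```

Each string is walked once, so the running time is linear in the combined length of the two inputs (with the usual expected-time caveat for hash maps).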

Related

Why can't I collect strings from a Vec<String> after calling `push` on them?

I am trying to collect vector of string to string with separator $.
let v = [String::from("bump"), String::from("sage"),String::from("lol"), String::from(" kek ")];
let s: String = v.into_iter().map(|x| x.push_str("$")).collect();
println!("{:?}",s );
The code above does not work, but this:
let v = [String::from("hello"), String::from("world"),String::from("shit"), String::from(" +15 ")];
let s: String = v.into_iter().collect();
println!("{:?}",s );
is working. How do I solve this problem?
Your code isn't working because push_str() does not return the string.
So your map() function maps from String to (), because you don't return x from it.
Further, x is not mutable, so you cannot call push_str() on it. You have to declare it mut.
This is your code, minimally modified so that it works:
fn main(){
let v = [String::from("bump"), String::from("sage"),String::from("lol"), String::from(" kek ")];
let s: String = v.into_iter().map(|mut x| {x.push_str("$"); x}).collect();
println!("{:?}",s );
}
"bump$sage$lol$ kek $"
Further, if you only push a single character, do push('$') instead.
You will notice, however, that there is a $ at the end of the string. Your use case is perfect for reduce(), so I'd use @Aleksander's answer instead.
You can use Iterator::reduce. Note that it puts separators only between items (and not at the end of the string, as in Petterrabit's answer), and it reuses the allocation of the first string (which results in slightly better memory efficiency).
fn main() {
let v = [
String::from("bump"),
String::from("sage"),
String::from("lol"),
String::from(" kek "),
];
let s: String = v
.into_iter()
.reduce(|mut acc, x| {
acc.push('$');
acc.push_str(&x);
acc
})
.unwrap_or_default();
println!("{}", s);
}
You must return strings from your map.
Note that push_str doesn't return anything.
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=50534d454a46093299ce38682733c86a
fn main() {
let v = [
String::from("bump"),
String::from("sage"),
String::from("lol"),
String::from(" kek "),
];
let s: String = v
.iter()
.map(|x| {
let mut x = x.to_owned();
x.push_str("$");
x
})
.collect();
println!("{}", s);
}
EDIT
If your real use case is more complex and you must use an iterator and a map, you should prefer the answers above, which are better (there is no need to own the strings returned from the map, because you collect them into a new string anyway).
That said, if the only purpose is to join your strings with a separator, you should simply do
fn main() {
let v = [
String::from("bump"),
String::from("sage"),
String::from("lol"),
String::from(" kek "),
];
println!("{}", v.join("$"));
}

Rust - Multiple Calls to Iterator Methods

I have the following Rust code:
fn tokenize(line: &str) -> Vec<&str> {
let mut tokens = Vec::new();
let mut chars = line.char_indices();
for (i, c) in chars {
match c {
'"' => {
if let Some(pos) = chars.position(|(_, x)| x == '"') {
tokens.push(&line[i..=i+pos]);
} else {
// Not a complete string
}
}
// Other options...
}
}
tokens
}
I am trying to elegantly extract a string surrounded by double quotes from the line, but since chars.position takes a mutable reference and chars is moved into the for loop, I get a compilation error - "value borrowed after move". The compiler suggests borrowing chars in the for loop but this doesn't work because an immutable reference is not an iterator (and a mutable one would cause the original problem where I can't borrow mutably again for position).
I feel like there should be a simple solution to this.
Is there an idiomatic way to do this or do I need to regress to appending characters one by one?
A for loop takes ownership of chars (it calls .into_iter() on it), so you can instead iterate through chars manually using a while loop:
fn tokenize(line: &str) -> Vec<&str> {
let mut tokens = Vec::new();
let mut chars = line.char_indices();
while let Some((i, c)) = chars.next() {
match c {
'"' => {
if let Some(pos) = chars.position(|(_, x)| x == '"') {
tokens.push(&line[i..=i+pos]);
} else {
// Not a complete string
}
}
// Other options...
}
}
tokens
}
It works if you just desugar the for-loop:
fn tokenize(line: &str) -> Vec<&str> {
let mut tokens = Vec::new();
let mut chars = line.char_indices();
while let Some((i, c)) = chars.next() {
match c {
'"' => {
if let Some(pos) = chars.position(|(_, x)| x == '"') {
tokens.push(&line[i..=i+pos]);
} else {
// Not a complete string
}
},
_ => {},
}
}
tokens
}
The normal for-loop prevents additional modification of the iterator because this usually leads to surprising and hard-to-read code. Doing it as a while-loop has no such protection.
If all you want to do is find quoted strings, I would not, however, go with an iterator at all here.
fn tokenize(line: &str) -> Vec<&str> {
let mut tokens = Vec::new();
let mut line = line;
while let Some(pos) = line.find('"') {
line = &line[(pos+1)..];
if let Some(end) = line.find('"') {
tokens.push(&line[..end]);
line = &line[(end+1)..];
} else {
// Not a complete string
}
}
tokens
}
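One caveat about the `while let` versions above: `Iterator::position` returns the number of items skipped, not a byte offset, so `i + pos` only lines up with the string for single-byte characters, and slicing can panic when the quoted text contains multi-byte ones. The sketch below is one possible variation, not part of the original answers. It uses `Iterator::find` to get the byte offset of the closing quote directly, and it assumes the quotes should be part of the token.

```rust
/// Collects every double-quoted section of `line`, quotes included.
/// Assumption: keeping the surrounding quotes in the token is what is wanted.
fn tokenize(line: &str) -> Vec<&str> {
    let mut tokens = Vec::new();
    let mut chars = line.char_indices();
    while let Some((start, c)) = chars.next() {
        if c == '"' {
            // `find` yields the (byte offset, char) pair of the closing quote,
            // whereas `position` would yield a count of items consumed.
            if let Some((end, _)) = chars.find(|&(_, x)| x == '"') {
                // '"' is one byte long, so `..=end` includes the closing quote.
                tokens.push(&line[start..=end]);
            }
            // An unterminated string is silently skipped in this sketch.
        }
    }
    tokens
}

fn main() {
    let tokens = tokenize(r#"say "héllo" and "bye""#);
    println!("{:?}", tokens);
}
```

Because both ends of the slice come from `char_indices`, they are always valid character boundaries.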

Issues with Rust lifetimes and ownership

I am trying to create a HashMap by reading a file. Below is the code that I have written. The twist is that I need to persist subset_description until the next iteration so that I can store it in the HashMap and then finally return the HashMap.
fn myfunction(filename: &Path) -> io::Result<HashMap<&str, &str>> {
let mut SIF = HashMap::new();
let file = File::open(filename).unwrap();
let mut subset_description = "";
for line in BufReader::new(file).lines() {
let thisline = line?;
let line_split: Vec<&str> = thisline.split("=").collect();
subset_description = if thisline.starts_with("a") {
let subset_description = line_split[1].trim();
subset_description
} else {
""
};
let subset_ids = if thisline.starts_with("b") {
let subset_ids = line_split[1].split(",");
let subset_ids = subset_ids.map(|s| s.trim());
subset_ids.collect()
} else {
Vec::new()
};
for k in subset_ids {
SIF.insert(k, subset_description);
println!("");
}
if thisline.starts_with("!dataset_table_begin") {
break;
}
}
Ok(SIF)
}
I am getting the below error and not able to resolve this
error[E0515]: cannot return value referencing local variable `thisline`
--> src/main.rs:73:5
|
51 | let line_split: Vec<&str> = thisline.split("=").collect();
| -------- `thisline` is borrowed here
...
73 | Ok(SIF)
| ^^^^^^^ returns a value referencing data owned by the current function
The problem lies in the guarantees that Rust makes on your behalf. You are reading a file, processing its contents into a HashMap, and trying to return references to the data you read. But by returning a reference you would need to guarantee that the strings you read won't be changed or freed later on, which you naturally cannot do.
In Rust terms, you keep trying to return references to local variables, which are dropped before the function returns; that would effectively leave you with dangling pointers. Here are the changes I made; they may not be the most efficient, but they do compile.
fn myfunction(filename: &Path) -> io::Result<HashMap<String, String>> {
let mut SIF = HashMap::new();
let file = File::open(filename).unwrap();
let mut subset_description = "";
for line in BufReader::new(file).lines() {
let thisline = line?;
let line_split: Vec<String> = thisline.split("=").map(|s| s.to_string()).collect();
subset_description = if thisline.starts_with("a") {
let subset_description = line_split[1].trim();
subset_description
} else {
""
};
let subset_ids = if thisline.starts_with("b") {
let subset_ids = line_split[1].split(",");
let subset_ids = subset_ids.map(|s| s.trim());
subset_ids.map(|s| s.to_string()).collect()
} else {
Vec::new()
};
for k in subset_ids {
SIF.insert(k, subset_description.to_string());
println!("");
}
if thisline.starts_with("!dataset_table_begin") {
break;
}
}
Ok(SIF)
}
As you can see, the return value now owns its strings. This is achieved by changing the return type to HashMap<String, String> and calling to_string() to copy the local strings into owned values that are moved into the HashMap.
There is an argument that to_string() is slow, so you can explore into() or to_owned() instead, but as I am not proficient with those I cannot assist with that optimization.
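The question also asks for the description to persist until the next iteration. Here is a minimal sketch of how an owned String can do that. It is not the answer's code: the function name `parse_subsets`, the generic `BufRead` parameter, and the sample input are assumptions made for illustration, since the real file format is not shown.

```rust
use std::collections::HashMap;
use std::io::{self, BufRead};

/// Maps each id on a "b" line to the description from the latest "a" line.
/// Hypothetical helper: the name and the generic reader are illustration only.
fn parse_subsets<R: BufRead>(reader: R) -> io::Result<HashMap<String, String>> {
    let mut subsets = HashMap::new();
    // An owned String outlives each `line`, so it can persist across iterations.
    let mut description = String::new();

    for line in reader.lines() {
        let line = line?;
        if line.starts_with("!dataset_table_begin") {
            break;
        }
        // Skip lines without a `key = value` shape instead of indexing blindly.
        let Some((_, value)) = line.split_once('=') else {
            continue;
        };
        if line.starts_with('a') {
            description = value.trim().to_owned();
        } else if line.starts_with('b') {
            for id in value.split(',') {
                subsets.insert(id.trim().to_owned(), description.clone());
            }
        }
    }
    Ok(subsets)
}

fn main() -> io::Result<()> {
    // Hypothetical input; the real file format is not shown in the question.
    let input = "a_desc = control\nb_ids = s1, s2\n!dataset_table_begin\nb_ids = s3\n";
    let subsets = parse_subsets(input.as_bytes())?;
    println!("{:?}", subsets.get("s1"));
    Ok(())
}
```

Accepting any `BufRead` keeps the sketch testable without a file on disk; passing `BufReader::new(File::open(path)?)` would read from a file instead.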

Iterating through a Window of a String without collect

I need to iterate through and compare a window of unknown length of a string. My current implementation works, however I've done performance tests against it, and it is very inefficient. The method needs to be guaranteed to be safe against Unicode.
fn foo(line: &str, patt: &str) {
for window in line.chars().collect::<Vec<char>>().windows(patt.len()) {
let mut bar = String::new();
for ch in window {
bar.push(*ch);
}
// perform various comparison checks
}
}
An improvement on Shepmaster's final solution, which significantly lowers overhead (by a factor of ~1.5), is
fn foo(line: &str, pattern: &str) -> bool {
let pattern_len = pattern.chars().count();
let starts = line.char_indices().map(|(i, _)| i);
let mut ends = line.char_indices().map(|(i, _)| i);
// Itertools::dropping
if pattern_len != 0 { ends.nth(pattern_len - 1); }
for (start, end) in starts.zip(ends.chain(Some(line.len()))) {
let bar = &line[start..end];
if bar == pattern { return true }
}
false
}
That said, your code from the GitHub page is a little odd. For instance, you try to deal with different length open and close tags with a wordier version of
let length = cmp::max(comment.len(), comment_end.len());
but your check
if window.contains(comment)
could then trigger multiple times!
Much better would be to just iterate over shrinking slices. In the mini example this would be
fn foo(line: &str, pattern: &str) -> bool {
let mut chars = line.chars();
loop {
let bar = chars.as_str();
if bar.starts_with(pattern) { return true }
if chars.next().is_none() { break }
}
false
}
(Note that this once again improves performance, by another factor of ~1.5.)
and in a larger example this would be something like
let mut is_in_comments = 0u64;
let start = match line.find(comment) {
Some(start) => start,
None => return false,
};
let end = match line.rfind(comment_end) {
Some(end) => end,
None => return true,
};
let mut chars = line[start..end + comment_end.len()].chars();
loop {
let window = chars.as_str();
if window.starts_with(comment) {
if nested {
is_in_comments += 1;
} else {
is_in_comments = 1;
}
} else if window.starts_with(comment_end) {
is_in_comments = is_in_comments.saturating_sub(1);
}
if chars.next().is_none() { break }
}
Note that this still counts overlaps, so /*/ might count as an opening /* immediately followed by a closing */.
The method needs to be guaranteed to be safe against Unicode.
pattern.len() returns the number of bytes that the string requires, so it's already possible that your code is doing the wrong thing. I might suggest you check out tools like QuickCheck to produce arbitrary strings that include Unicode.
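A small sketch of the difference, using a hypothetical helper `byte_and_char_len` that is not part of the original code:

```rust
/// Returns (number of UTF-8 bytes, number of `char`s) for `s`.
/// Hypothetical helper, for illustration only.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // "h\u{e9}llo" is "héllo" with a precomposed 'é', which takes two bytes.
    let (bytes, chars) = byte_and_char_len("h\u{e9}llo");
    println!("bytes: {bytes}, chars: {chars}"); // prints "bytes: 6, chars: 5"
}
```

Passing `pattern.len()` to `windows` over a `Vec<char>` therefore asks for windows that are too wide whenever the pattern contains a multi-byte character.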
Here's my test harness:
use std::iter;
fn main() {
let mut haystack: String = iter::repeat('a').take(1024*1024*100).collect();
haystack.push('b');
println!("{}", haystack.len());
}
And I'm compiling and timing via cargo build --release && time ./target/release/x. Creating the string by itself takes 0.274s.
I used this version of your original code just to have some kind of comparison:
fn foo(line: &str, pattern: &str) -> bool {
for window in line.chars().collect::<Vec<char>>().windows(pattern.len()) {
let mut bar = String::new();
for ch in window {
bar.push(*ch);
}
if bar == pattern { return true }
}
false
}
This takes 4.565s, or 4.291s for just foo.
The first thing I see is that there is a lot of allocation happening on the inner loop. The code creates, allocates, and destroys the String for each iteration. Let's reuse the String allocation:
fn foo_mem(line: &str, pattern: &str) -> bool {
let mut bar = String::new();
for window in line.chars().collect::<Vec<char>>().windows(pattern.len()) {
bar.clear();
bar.extend(window.iter().cloned());
if bar == pattern { return true }
}
false
}
This takes 2.155s or 1.881s for just foo_mem.
Continuing on, another extraneous allocation is the one for the String at all. We already have bytes that look like the right thing, so let's reuse them:
fn foo_no_string(line: &str, pattern: &str) -> bool {
let indices: Vec<_> = line.char_indices().map(|(i, _c)| i).collect();
let l = pattern.chars().count();
for window in indices.windows(l + 1) {
let first_idx = *window.first().unwrap();
let last_idx = *window.last().unwrap();
let bar = &line[first_idx..last_idx];
if bar == pattern { return true }
}
// Do the last pair
{
let last_idx = indices[indices.len() - l];
let bar = &line[last_idx..];
if bar == pattern { return true }
}
false
}
This code is ugly and unidiomatic. I'm pretty sure some thinking (that I'm currently too lazy to do) would make it look a lot better.
This takes 1.409s or 1.135s for just foo_no_string.
As this is ~25% of the original time, Amdahl's Law suggests this is a reasonable stopping point.

How do I return a struct or anything more complicated than a primitive?

I've been tinkering with Rust and I'm a little confused with function return types. As an experiment I'm writing an IRC log parser. I'm familiar with the primitive types, and having functions return those. What about more complex types when returning multiple pieces of data?
/* Log line example from log.txt */
/* [17:35] <#botname> name1 [460/702] has challenged name2 [224/739] and taken them in combat! */
#[derive(Show)]
struct Challenger {
challenger: String,
defender: String
}
fn main() {
let path = Path::new("log.txt");
let mut file = BufferedReader::new(File::open(&path));
for line in file.lines() {
let mut unwrapped_line = line.unwrap();
let mut chal = challenges3(unwrapped_line);
println!("Challenger: {}", chal.challenger);
println!("Defender: {}", chal.defender);
}
}
fn challenges3(text: String)-> Challenger {
let s: String = text;
let split: Vec<&str> = s.as_slice().split(' ').collect();
if(split[4] == "has" && split[5] == "challenged") {
let mychallenger = Challenger { challenger: split[2].to_string(), defender: split[6].to_string()};
return mychallenger;
}
}
I realize this code isn't very idiomatic; I'm still getting familiar with the language.
I get an error with this code:
"mismatched types: expected `Challenger`, found `()` (expected struct Challenger, found ())"
How can I return a Struct or a HashMap? Is there a better way to return multiple fields of data?
The if in challenges3 has no else block, so if the condition isn't met, execution continues after the if block. There's nothing there, so the function implicitly returns () at this point. You must also return a Challenger after the if block, or panic! to abort the program.
Alternatively, you could change the return type of your function to Option<Challenger>. Return Some(mychallenger) in the if block, and None after the if block:
fn challenges3(text: String) -> Option<Challenger> {
let s: String = text;
let split: Vec<&str> = s.as_slice().split(' ').collect();
if split[4] == "has" && split[5] == "challenged" {
let mychallenger = Challenger { challenger: split[2].to_string(), defender: split[6].to_string()};
return Some(mychallenger);
}
None
}
You can also use Result instead of Option if you want to return some information about the error.
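As a rough sketch of the Result variant: the code below is written for current stable Rust, so it uses `#[derive(Debug)]` and `split` on a `&str` rather than the pre-1.0 `Show` and `as_slice()` seen in the question. The `ParseError` type and the `parse_challenge` name are hypothetical and only serve the example.

```rust
#[derive(Debug, PartialEq)]
struct Challenger {
    challenger: String,
    defender: String,
}

/// Hypothetical error type for this sketch; not part of the original code.
#[derive(Debug, PartialEq)]
enum ParseError {
    TooShort,
    NotAChallenge,
}

fn parse_challenge(text: &str) -> Result<Challenger, ParseError> {
    let words: Vec<&str> = text.split(' ').collect();
    // Check the length first so the indexing below cannot panic.
    if words.len() < 7 {
        return Err(ParseError::TooShort);
    }
    if words[4] != "has" || words[5] != "challenged" {
        return Err(ParseError::NotAChallenge);
    }
    Ok(Challenger {
        challenger: words[2].to_string(),
        defender: words[6].to_string(),
    })
}

fn main() {
    let line = "[17:35] <#botname> name1 [460/702] has challenged name2 [224/739] and taken them in combat!";
    match parse_challenge(line) {
        Ok(chal) => println!("{} challenged {}", chal.challenger, chal.defender),
        Err(err) => println!("skipped line: {:?}", err),
    }
}
```

The caller decides what to do with lines that do not describe a challenge, instead of the parser panicking or inventing a placeholder value.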
