Can I reset a borrow of a local in a loop? - rust

I have a processing loop that needs a pointer to a large lookup table.
The pointer is unfortunately triply indirected from the source data, so keeping that pointer around for the inner loop is essential for performance.
Is there any way I can tell the borrow checker that I'm "unborrowing" the state variable in the unlikely event that I need to modify the state, so that I only have to look the slice up again when modify_state is actually called?
One solution I thought of was to change data to be a slice reference, do a mem::replace on the struct at the beginning of the function to pull the slice into local scope, then replace it back at the end of the function. That is very brittle and error-prone, though, as I need to remember to put the item back on every return. Is there another way to accomplish this?
struct DoubleIndirect {
    data: [u8; 512 * 512],
    lut: [usize; 16384],
    lut_index: usize,
}
#[cold]
fn modify_state(s: &mut DoubleIndirect) {
    s.lut_index += 63;
    s.lut_index %= 16384;
}
fn process(state: &mut DoubleIndirect) -> [u8; 65536] {
    let mut ret: [u8; 65536] = [0; 65536];
    let mut count = 0;
    let mut data_slice = &state.data[state.lut[state.lut_index]..];
    for ret_item in ret.iter_mut() {
        *ret_item = data_slice[count];
        if count % 197 == 196 {
            data_slice = &[];
            modify_state(state);
            data_slice = &state.data[state.lut[state.lut_index]..];
        }
        count += 1
    }
    return ret;
}

The simplest way to do this is to ensure the borrows of state are all disjoint:
#[cold]
fn modify_state(lut_index: &mut usize) {
    *lut_index += 63;
    *lut_index %= 16384;
}
fn process(state: &mut DoubleIndirect) -> [u8; 65536] {
    let mut ret: [u8; 65536] = [0; 65536];
    let mut count = 0;
    // Borrow only the field that gets mutated; data and lut stay free.
    let lut_index = &mut state.lut_index;
    let mut data_slice = &state.data[state.lut[*lut_index]..];
    for ret_item in ret.iter_mut() {
        *ret_item = data_slice[count];
        if count % 197 == 196 {
            // Touches lut_index only, so data_slice remains valid.
            modify_state(lut_index);
            data_slice = &state.data[state.lut[*lut_index]..];
        }
        count += 1
    }
    return ret;
}
The problem is basically two things: first, Rust will not look beyond a function's signature to find out what it does. As far as the compiler knows, your call to modify_state could be changing state.data as well, and it can't allow that.
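The second problem is that borrows were lexical in the compiler of the time; it looked at the block of code where the borrow might be used and went with that. It didn't bother to try to reduce the length of borrows to match where they're actually active. Compilers with non-lexical lifetimes (the 2018 edition since Rust 1.31) do shorten borrows this way, but the first point still applies.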
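For readers on a current toolchain, here is a scaled-down sketch of the question's loop that should be accepted once borrows end at their last use (non-lexical lifetimes). The State struct, its small array sizes and the modulo indexing are illustrative assumptions, not the asker's code.

```rust
// Scaled-down stand-in for the question's DoubleIndirect (sizes are assumptions).
struct State {
    data: [u8; 16],
    lut: [usize; 4],
    lut_index: usize,
}

// Still takes the whole struct, exactly like the question's modify_state.
fn modify_state(s: &mut State) {
    s.lut_index = (s.lut_index + 1) % 4;
}

fn process(state: &mut State) -> [u8; 8] {
    let mut ret = [0u8; 8];
    let mut data_slice = &state.data[state.lut[state.lut_index]..];
    for (count, ret_item) in ret.iter_mut().enumerate() {
        // Modulo keeps the toy example in bounds.
        *ret_item = data_slice[count % data_slice.len()];
        if count % 3 == 2 {
            // data_slice is overwritten before its next use, so the old
            // shared borrow is no longer live at this call.
            modify_state(state);
            data_slice = &state.data[state.lut[state.lut_index]..];
        }
    }
    ret
}
```

With data filled with 0..16 and lut set to [0, 4, 8, 12], the slice is re-derived twice during the eight iterations. The compiler still treats modify_state as possibly touching every field; it is only the liveness of data_slice that makes the call acceptable.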
You can also play games with, for example, using std::mem::replace to pull state.data out into a local variable, do your work, then replace it back just before you return.
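A minimal sketch of that mem::replace game, assuming the table lives in a Vec so that an empty placeholder is cheap to leave behind. The State struct and its field names are hypothetical.

```rust
use std::mem;

// Hypothetical state: `data` stands in for the large lookup table.
struct State {
    data: Vec<u8>,
    offset: usize,
}

fn modify_state(s: &mut State) {
    s.offset = (s.offset + 3) % 8;
}

fn process(state: &mut State, n: usize) -> Vec<u8> {
    // Move the table into a local, leaving an empty Vec in the struct.
    let data = mem::replace(&mut state.data, Vec::new());
    let mut out = Vec::with_capacity(n);
    let mut slice = &data[state.offset..];
    for count in 0..n {
        out.push(slice[count % slice.len()]);
        if count % 4 == 3 {
            // `slice` borrows the local `data`, not `state`, so this is allowed.
            modify_state(state);
            slice = &data[state.offset..];
        }
    }
    // Put the table back. Any early return added above this line would
    // leave state.data empty, which is the brittleness the question mentions.
    state.data = data;
    out
}
```

While the table is moved out, anything that reads state.data (including modify_state, if it ever did) sees an empty Vec, so this only works when the mutation really is independent of the table.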

Related

Rust double mut borrow in loops

I'm looking for a way to push into both a Vec<Vec<>> and its inner Vec<>. I understand why it fails, but I still struggle to find a graceful way to solve it.
fn example() {
    let mut vec: Vec<Vec<i32>> = vec![];
    vec.push(vec![]);
    for i in &mut vec {
        i.push(1);
        if vec.len() < 10 {
            vec.push(vec![]); // second mut borrow
        }
    }
}
The borrow checker won't allow you to iterate over a vector by reference and modify it during iteration. The reason is that modifying the vector can reallocate its storage, which would invalidate the references used for iteration. (There is also the question of what it means to iterate over a changing vector: do you want to visit the elements added during iteration, or just the elements that were present originally?)
The easiest fix that allows you to do what you want is to just iterate the vector using an index:
fn example() {
    let mut vec: Vec<Vec<i32>> = vec![];
    vec.push(vec![]);
    let mut ind = 0;
    while ind < vec.len() {
        let i = &mut vec[ind];
        i.push(1);
        if vec.len() < 10 {
            vec.push(vec![]);
        }
        ind += 1;
    }
}
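The question raised above, whether elements added during iteration should be visited too, comes down to when the length is read. A small sketch of both choices follows; the function names and the limit parameter are made up for illustration.

```rust
// Re-reads vec.len() on every pass, so appended elements are visited too.
fn grow_visiting_all(limit: usize) -> Vec<Vec<i32>> {
    let mut vec: Vec<Vec<i32>> = vec![vec![]];
    let mut ind = 0;
    while ind < vec.len() {
        vec[ind].push(1);
        if vec.len() < limit {
            vec.push(vec![]);
        }
        ind += 1;
    }
    vec
}

// Snapshots the length first, so only the original elements are visited.
fn grow_visiting_original(limit: usize) -> Vec<Vec<i32>> {
    let mut vec: Vec<Vec<i32>> = vec![vec![]];
    let original_len = vec.len();
    for ind in 0..original_len {
        vec[ind].push(1);
        if vec.len() < limit {
            vec.push(vec![]);
        }
    }
    vec
}
```

Both versions index into the vector afresh on each pass, so no reference is held across the push that might reallocate.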

Why does a generic function replicating C's fread for unsigned integers always return zero?

I am trying to read in binary 16-bit machine instructions from a 16-bit architecture (the exact nature of that is irrelevant here), and print them back out as hexadecimal values. In C, I found this simple by using the fread function to read 16 bits into a uint16_t.
I figured that I would try to replicate fread in Rust. It seems to be reasonably trivial if I can know ahead-of-time the exact size of the variable that is being read into, and I had that working specifically for 16 bits.
I decided that I wanted to try to make the fread function generic over the various built-in unsigned integer types. For that I came up with the function below, using some traits from the num crate:
fn fread<T>(
    buffer: &mut T,
    element_count: usize,
    stream: &mut BufReader<File>,
) -> Result<usize, std::io::Error>
where
    T: num::PrimInt + num::Unsigned,
{
    let type_size = std::mem::size_of::<T>();
    let mut buf = Vec::with_capacity(element_count * type_size);
    let buf_slice = buf.as_mut_slice();
    let bytes_read = match stream.read_exact(buf_slice) {
        Ok(()) => element_count * type_size,
        Err(ref e) if e.kind() == std::io::ErrorKind::UnexpectedEof => 0,
        Err(e) => panic!("{}", e),
    };
    *buffer = buf_slice
        .iter()
        .enumerate()
        .map(|(i, &b)| {
            let mut holder2: T = num::zero();
            holder2 = holder2 | T::from(b).expect("Casting from u8 to T failed");
            holder2 << ((type_size - i) * 8)
        })
        .fold(num::zero(), |acc, h| acc | h);
    Ok(bytes_read)
}
The issue is that when I call it in the main function, I seem to always get 0x00 back out, but the number of bytes read that is returned by the function is always 2, so that the program enters an infinite loop:
extern crate num;
use std::fs::File;
use std::io::BufReader;
use std::io::prelude::Read;
fn main() -> Result<(), std::io::Error> {
    let cmd_line_args = std::env::args().collect::<Vec<_>>();
    let f = File::open(&cmd_line_args[1])?;
    let mut reader = BufReader::new(f);
    let mut instructions: Vec<u16> = Vec::new();
    let mut next_instruction: u16 = 0;
    fread(&mut next_instruction, 1, &mut reader)?;
    let base_address = next_instruction;
    while fread(&mut next_instruction, 1, &mut reader)? > 0 {
        instructions.push(next_instruction);
    }
    println!("{:#04x}", base_address);
    for i in instructions {
        println!("0x{:04x}", i);
    }
    Ok(())
}
It appears to me that I'm somehow never reading anything from the file, so the function always just returns the number of bytes it was supposed to read. I'm clearly not using something correctly here, but I'm honestly unsure what I'm doing wrong.
This is compiled on Rust 1.26 stable for Windows if that matters.
What am I doing wrong, and what should I do differently to replicate fread? I realise that this is probably a case of the XY problem (in that there's almost certainly a better Rust way to repeatedly read some bytes from a file and pack them into one unsigned integer), but I'm really curious as to what I'm doing wrong here.
Your problem is that this line:
let mut buf = Vec::with_capacity(element_count * type_size);
creates a zero-length vector, even though it allocates memory for element_count * type_size bytes. Therefore you are asking stream.read_exact to read zero bytes. One way to fix this is to replace the above line with:
let mut buf = vec![0; element_count * type_size];
Side note: when the read succeeds, bytes_read is set to the number of bytes you expected to read, not to a count taken from the buffer, which is why the empty buffer went unnoticed. You should probably derive it from the buffer itself, e.g. with std::mem::size_of_val(buf_slice) (or simply buf_slice.len(), since the elements are u8).
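The difference between capacity and length can be checked without a file by reading from an in-memory std::io::Cursor. The sketch below is std-only and fixed to u16 rather than generic; all function names are hypothetical. It also shifts by (len - 1 - i) * 8: with the question's (type_size - i) * 8, the first byte of a u16 would be shifted by 16 bits, which is an overflow once the buffer actually contains data.

```rust
use std::io::{Cursor, Read};

// Mirrors the question: capacity is reserved, but the Vec's length is 0,
// so read_exact is handed an empty slice and consumes nothing.
fn bytes_consumed_with_capacity(input: &[u8], n: usize) -> usize {
    let mut stream = Cursor::new(input);
    let mut buf: Vec<u8> = Vec::with_capacity(n);
    stream.read_exact(buf.as_mut_slice()).expect("read failed");
    stream.position() as usize
}

// The fix from the answer: a Vec that really has n (zeroed) elements.
fn read_zero_filled(input: &[u8], n: usize) -> std::io::Result<Vec<u8>> {
    let mut stream = Cursor::new(input);
    let mut buf = vec![0u8; n];
    stream.read_exact(&mut buf)?;
    Ok(buf)
}

// Packs at most two bytes into a u16, most significant byte first.
fn pack_big_endian(bytes: &[u8]) -> u16 {
    assert!(bytes.len() <= 2, "a u16 holds at most two bytes");
    bytes
        .iter()
        .enumerate()
        .fold(0u16, |acc, (i, &b)| acc | (u16::from(b) << ((bytes.len() - 1 - i) * 8)))
}
```

The byte order chosen here is an assumption; the byteorder-based answer reads little-endian values.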
Regarding "there's almost certainly a better Rust way to repeatedly read some bytes from a file and pack them into one unsigned integer":
Yes, use the byteorder crate. This requires no unneeded heap allocation (the Vec in the original code):
extern crate byteorder;
use byteorder::{LittleEndian, ReadBytesExt};
use std::{
    fs::File,
    io::{self, BufReader, Read},
};
fn read_instructions_to_end<R>(mut rdr: R) -> io::Result<Vec<u16>>
where
    R: Read,
{
    let mut instructions = Vec::new();
    loop {
        match rdr.read_u16::<LittleEndian>() {
            Ok(instruction) => instructions.push(instruction),
            Err(e) => {
                return if e.kind() == std::io::ErrorKind::UnexpectedEof {
                    Ok(instructions)
                } else {
                    Err(e)
                }
            }
        }
    }
}
fn main() -> Result<(), std::io::Error> {
    let name = std::env::args().skip(1).next().expect("no file name");
    let f = File::open(name)?;
    let mut f = BufReader::new(f);
    let base_address = f.read_u16::<LittleEndian>()?;
    let instructions = read_instructions_to_end(f)?;
    println!("{:#04x}", base_address);
    for i in &instructions {
        println!("0x{:04x}", i);
    }
    Ok(())
}
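If pulling in a crate is not an option, roughly the same loop can be sketched with the standard library alone, assuming a toolchain that has u16::from_le_bytes (newer than the Rust 1.26 mentioned in the question). The helper names are hypothetical.

```rust
use std::io::{self, Read};

// Reads one little-endian u16; Ok(None) signals a clean end of input.
fn read_u16_le<R: Read>(rdr: &mut R) -> io::Result<Option<u16>> {
    let mut bytes = [0u8; 2]; // stack buffer, no heap allocation
    match rdr.read_exact(&mut bytes) {
        Ok(()) => Ok(Some(u16::from_le_bytes(bytes))),
        Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => Ok(None),
        Err(e) => Err(e),
    }
}

// Collects values until the input runs out; other I/O errors are propagated.
fn read_all_u16_le<R: Read>(mut rdr: R) -> io::Result<Vec<u16>> {
    let mut out = Vec::new();
    while let Some(word) = read_u16_le(&mut rdr)? {
        out.push(word);
    }
    Ok(out)
}
```

Because &[u8] implements Read, a call such as read_all_u16_le(&bytes[..]) works for quick experiments. Like the byteorder version, a trailing odd byte is silently dropped rather than reported.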

Using str and String interchangeably

Suppose I'm trying to do a fancy zero-copy parser in Rust using &str, but sometimes I need to modify the text (e.g. to implement variable substitution). I really want to do something like this:
fn main() {
    let mut v: Vec<&str> = "Hello there $world!".split_whitespace().collect();
    for t in v.iter_mut() {
        if (t.contains("$world")) {
            *t = &t.replace("$world", "Earth");
        }
    }
    println!("{:?}", &v);
}
But of course the String returned by t.replace() doesn't live long enough. Is there a nice way around this? Perhaps there is a type which means "ideally a &str but if necessary a String"? Or maybe there is a way to use lifetime annotations to tell the compiler that the returned String should be kept alive until the end of main() (or have the same lifetime as v)?
Rust has exactly what you want in form of a Cow (Clone On Write) type.
use std::borrow::Cow;
fn main() {
    let mut v: Vec<_> = "Hello there $world!".split_whitespace()
        .map(|s| Cow::Borrowed(s))
        .collect();
    for t in v.iter_mut() {
        if t.contains("$world") {
            *t.to_mut() = t.replace("$world", "Earth");
        }
    }
    println!("{:?}", &v);
}
As @sellibitze correctly notes, to_mut() creates a new String, which causes a heap allocation just to hold a copy of the previously borrowed value. If you are sure you only have borrowed strings, then you can use
*t = Cow::Owned(t.replace("$world", "Earth"));
If the Vec may already contain Cow::Owned elements, this would still throw away their allocation. You can prevent that with the following very fragile and unsafe code inside your for loop. (It manipulates the bytes of a UTF-8 string directly and relies on the fact that "Earth" has exactly as many bytes as "world", i.e. the pattern without its $ sign.)
let mut last_pos = 0; // so we don't start at the beginning every time
while let Some(pos) = t[last_pos..].find("$world") {
    let p = pos + last_pos; // find() returns an offset relative to last_pos
    last_pos = p + 5; // continue right after the inserted "Earth"
    unsafe {
        let s = t.to_mut().as_mut_vec(); // operating on Vec is easier
        s.remove(p); // remove $ sign
        for (c, sc) in "Earth".bytes().zip(&mut s[p..]) {
            *sc = c;
        }
    }
}
Note that this is tailored exactly to the "$world" -> "Earth" mapping. Any other mappings require careful consideration inside the unsafe code.
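A safe variant of the same idea can be sketched with String::replace_range, assuming a toolchain that provides it. The helper name is made up, and whether the existing buffer is reused is an implementation detail of the standard library rather than a guarantee, so treat that part as an expectation.

```rust
use std::borrow::Cow;

// Replaces every occurrence of `pat` in place. A borrowed Cow is only
// turned into an owned String once a match is actually found.
fn replace_in_place(t: &mut Cow<str>, pat: &str, with: &str) {
    if pat.is_empty() {
        return; // an empty pattern would match forever
    }
    let mut last_pos = 0;
    while let Some(pos) = t[last_pos..].find(pat) {
        let p = last_pos + pos; // absolute byte offset of the match
        t.to_mut().replace_range(p..p + pat.len(), with);
        last_pos = p + with.len(); // resume after the inserted text
    }
}
```

Unlike the byte-level version, this works for replacements of any length and cannot produce invalid UTF-8, since both range ends fall on match boundaries.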
Use std::borrow::Cow, specifically as Cow<'a, str>, where 'a is the lifetime of the string being parsed.
use std::borrow::Cow;
fn main() {
    let mut v: Vec<Cow<'static, str>> = vec![];
    v.push("oh hai".into());
    v.push(format!("there, {}.", "Mark").into());
    println!("{:?}", v);
}
Produces:
["oh hai", "there, Mark."]
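Applied to the tokenizer from the question, the Cow<'a, str> idea could look like the sketch below. The substitute function and its parameters are hypothetical names; tokens without the variable stay borrowed from the input, and only substituted tokens allocate.

```rust
use std::borrow::Cow;

// Splits `input` on whitespace and substitutes `var` with `value`.
fn substitute<'a>(input: &'a str, var: &str, value: &str) -> Vec<Cow<'a, str>> {
    input
        .split_whitespace()
        .map(|tok| {
            if tok.contains(var) {
                Cow::Owned(tok.replace(var, value)) // needs its own String
            } else {
                Cow::Borrowed(tok) // zero-copy view into `input`
            }
        })
        .collect()
}

// Helper for inspecting which variant a token ended up as.
fn is_borrowed(token: &Cow<str>) -> bool {
    matches!(token, Cow::Borrowed(_))
}
```

Calling substitute("Hello there $world!", "$world", "Earth") yields three tokens, of which only the last is owned. Since Cow<str> dereferences to str, the rest of the parser can keep treating every token as a &str.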

More convenient way to work with strings in winapi calls

I'm looking for a more convenient way to work with std::String in winapi calls in Rust.
I'm using Rust 0.12.0-nightly with winapi 0.1.22 and user32-sys 0.1.1.
At the moment I'm using something like this:
use winapi;
use user32;
pub fn get_window_title(handle: i32) -> String {
    let mut v: Vec<u16> = Vec::new();
    v.reserve(255);
    let mut p = v.as_mut_ptr();
    let len = v.len();
    let cap = v.capacity();
    let mut read_len = 0;
    unsafe {
        mem::forget(v);
        read_len = unsafe { user32::GetWindowTextW(handle as winapi::HWND, p, 255) };
        if read_len > 0 {
            return String::from_utf16_lossy(Vec::from_raw_parts(p, read_len as usize, cap).as_slice());
        } else {
            return "".to_string();
        }
    }
}
I think this vector-based memory allocation is rather bizarre, so I'm looking for an easier way to convert the wide string (LPWSTR) that the API fills in into a std::String.
In your situation, you always want a maximum of 255 UTF-16 code units, so you can use a fixed-size array instead of a vector. This reduces the entire boilerplate to an array initialization, an as_mut_ptr() call and a slicing operation.
unsafe {
    // Zero-initialized; mem::uninitialized() is deprecated in current Rust.
    let mut v = [0u16; 255];
    let read_len = user32::GetWindowTextW(
        handle as winapi::HWND,
        v.as_mut_ptr(),
        255,
    );
    // GetWindowTextW returns a C int, so convert it before slicing.
    String::from_utf16_lossy(&v[0..read_len as usize])
}
In case you wanted to use a Vec, there's an easier way than to destroy the vec and re-create it. You can write to the Vec's content directly and let Rust handle everything else.
let mut v: Vec<u16> = Vec::with_capacity(255);
unsafe {
    let read_len = user32::GetWindowTextW(
        handle as winapi::HWND,
        v.as_mut_ptr(),
        v.capacity() as i32, // nMaxCount is a C int
    );
    // this is undefined behavior if read_len > v.capacity()
    v.set_len(read_len as usize);
    String::from_utf16_lossy(&v)
}
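The write-then-set_len pattern can be tried without Windows by swapping the API for a stand-in. In the sketch below, fake_get_text is a hypothetical function that imitates the buffer contract assumed for GetWindowTextW (copy at most max_count - 1 units plus a terminator, return the count without the terminator); it is not part of winapi.

```rust
use std::ptr;

/// Hypothetical stand-in for a C API that fills a caller-provided buffer.
///
/// Safety: `buf` must be valid for writes of `max_count` u16 values.
unsafe fn fake_get_text(buf: *mut u16, max_count: i32) -> i32 {
    let text: Vec<u16> = "Untitled - Notepad".encode_utf16().collect();
    if max_count <= 0 {
        return 0;
    }
    let n = text.len().min(max_count as usize - 1);
    unsafe {
        ptr::copy_nonoverlapping(text.as_ptr(), buf, n);
        buf.add(n).write(0); // terminating NUL, not counted in the result
    }
    n as i32
}

fn window_title() -> String {
    let mut v: Vec<u16> = Vec::with_capacity(255);
    let read_len = unsafe { fake_get_text(v.as_mut_ptr(), v.capacity() as i32) };
    // Sound only because the callee initialized the first `read_len`
    // elements and read_len is smaller than the capacity passed in.
    unsafe { v.set_len(read_len as usize) };
    String::from_utf16_lossy(&v)
}
```

The Vec keeps ownership of its allocation the whole time, so there is no need for mem::forget or Vec::from_raw_parts.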
As a side-note, it is idiomatic in Rust to not use return on the last statement in a function, but to simply let the expression stand there without a semicolon. In your original code, the final if-expression could be written as
if read_len > 0 {
    String::from_utf16_lossy(Vec::from_raw_parts(p, read_len as usize, cap).as_slice())
} else {
    "".to_string()
}
but I removed the entire condition from my samples, as it is unnecessary to handle 0 read characters differently from n characters.

rust: error: lifetime of non-lvalue is too short to guarantee its contents can be safely reborrowed

I can't figure out what this error in Rust means:
error: lifetime of non-lvalue is too short to guarantee its contents can be safely reborrowed
What's a non-lvalue? (I suspect it is not simply an rvalue.)
I want to understand what the error means and to be able to modify "objects" through a vector of mutable references.
This is a minimum test case to produce the error. I insert in a vector a mutable reference to a struct, and then I try to modify the pointed struct.
struct Point {
    x: uint,
    y: uint
}
fn main() {
    let mut p = Point { x: 0, y: 0};
    p.x += 1; // OK, p owns the point
    let mut v: Vec<&mut Point> = Vec::new();
    v.push(&mut p);
    // p.x += 1 // FAIL (expected), v has borrowed the point
    let p1:&mut Point = *v.get_mut(0); // ERROR, lifetime of non-lvalue...
    // never reached this line
    // p1.x += 1;
}
Let's go over what you're trying to do here:
let p1:&mut Point = *v.get_mut(0);
*v.get_mut(0) takes a mutable reference to the first element of the vector (itself a mutable reference), then dereferences it. If this compiled, you would end up with two mutable references to the same object: one in the vector, the other in the p1 variable. Rust rejects this because it's not safe.
By far the best solution is to make the vector the owner of your Point objects, i.e. use a Vec<Point> instead of a Vec<&mut Point>.
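A minimal sketch of the owning version, written for current Rust (so u32 instead of the question's uint). The helper name bump_first is hypothetical.

```rust
#[derive(Debug, PartialEq)]
struct Point {
    x: u32,
    y: u32,
}

// The vector owns the points, so a mutable borrow of the vector is
// enough to hand out a mutable reference to one element.
fn bump_first(points: &mut Vec<Point>) {
    let p1: &mut Point = &mut points[0];
    p1.x += 1;
}
```

After let mut v = vec![Point { x: 0, y: 0 }]; and bump_first(&mut v);, the first point has x == 1. There is exactly one path to each Point, so no second mutable reference can come into existence.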
If you need something more complicated, you can use a RefCell for dynamically checked borrowing:
use std::cell::RefCell;
struct Point {
    x: usize,
    y: usize,
}
fn main() {
    let p = RefCell::new(Point { x: 0, y: 0 });
    p.borrow_mut().x += 1;
    let mut v: Vec<&RefCell<Point>> = Vec::new();
    v.push(&p);
    let p1 = v[0]; // copies the shared reference out of the vector
    p1.borrow_mut().x += 1; // borrow is checked at run time
    p.borrow_mut().x += 1;
}
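If the vector really has to hold &mut Point, current Rust lets you reach the pointee through the vector with an explicit reborrow instead of moving the reference out. This is a sketch under that assumption; bump_all is a made-up helper.

```rust
struct Point {
    x: u32,
    y: u32,
}

fn bump_all(v: &mut Vec<&mut Point>) {
    // Reborrow through the vector: p1 borrows `v` for as long as it is
    // used, so the reference stored in the vector cannot be used meanwhile.
    let p1: &mut Point = &mut *v[0];
    p1.x += 1;
    // Once p1 is no longer used, the vector is available again.
    for p in v.iter_mut() {
        p.y += 1;
    }
}
```

The difference from the failing line is that nothing is moved or copied out of the vector; the new reference is derived from the element and tied to the borrow of v.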

Resources