Extend lifetime of a variable for thread - multithreading

I am reading a string from a file, splitting it by lines into a vector and then I want to do something with the extracted lines in separate threads. Like this:
use std::fs::File;
use std::io::prelude::*;
use std::thread;

fn main() {
    match File::open("data") {
        Ok(mut result) => {
            let mut s = String::new();
            result.read_to_string(&mut s);
            let k : Vec<_> = s.split("\n").collect();
            for line in k {
                thread::spawn(move || {
                    println!("nL: {:?}", line);
                });
            }
        }
        Err(err) => {
            println!("Error {:?}",err);
        }
    }
}
Of course this throws an error, because s will go out of scope before the threads are started:
`s` does not live long enough
main.rs:9 let k : Vec<_> = s.split("\n").collect();
^
What can I do now? I've tried many things like Box or Arc, but I couldn't get it working. I somehow need to create a copy of s which also lives in the threads. But how do I do that?

The problem, fundamentally, is that line is a borrowed slice into s. There's really nothing you can do here, since there's no way to guarantee that each line will not outlive s itself.
Also, just to be clear: there is absolutely no way in Rust to "extend the lifetime of a variable". It simply cannot be done.
The simplest way around this is to go from line being borrowed to owned. Like so:
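use std::thread;

fn main() {
    let s: String = "One\nTwo\nThree\n".into();
    // Convert each borrowed &str line into an owned String.
    let k : Vec<String> = s.split("\n").map(|s| s.into()).collect();
    let mut handles = Vec::new();
    for line in k {
        handles.push(thread::spawn(move || {
            println!("nL: {:?}", line);
        }));
    }
    // Wait for the threads so that main does not exit before they print.
    for handle in handles {
        handle.join().unwrap();
    }
}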
The .map(|s| s.into()) converts from &str to String. Since a String owns its contents, it can be safely moved into each thread's closure, and will live independently of the thread that created it.
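Note: scoped threads let the spawned threads borrow from s directly, because every thread is joined before the scope ends. That API was unstable for a long time; it has been available on stable as std::thread::scope since Rust 1.63.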
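A minimal sketch of the scoped-thread alternative, assuming Rust 1.63 or later, where std::thread::scope is part of the standard library. The helper name line_lengths is hypothetical and only there for illustration; each thread borrows a line from s without copying it.

```rust
use std::thread;

/// Hypothetical helper: handles each line of `s` on its own scoped thread
/// and returns the length of every line, in order.
fn line_lengths(s: &str) -> Vec<usize> {
    let lines: Vec<&str> = s.lines().collect();
    // All threads spawned inside the scope are joined before `scope` returns,
    // so they may borrow from `lines` (and therefore from `s`).
    thread::scope(|scope| {
        let handles: Vec<_> = lines
            .iter()
            .map(|line| scope.spawn(move || line.len()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let s = String::from("One\nTwo\nThree\n");
    println!("{:?}", line_lengths(&s));
}
```

No String is cloned here; the trade-off is that the threads cannot outlive the call to thread::scope.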

Related

Rust: How to fix borrowed value does not live long enough

I have a simple client/server application. The server receives a message from the client, and I want to forward that message through a channel from the server to another file, but I get the error "borrowed value does not live long enough".
I have searched Stack Overflow for similar questions, but I still don't understand lifetimes well enough. Is there good documentation or a simple example on this topic?
For now, if someone can help me fix this code (maybe by editing the portion that needs fixing), that would be helpful.
Thanks in advance.
Server side:
use std::os::unix::net::UnixDatagram;
use std::path::Path;

fn unlink_socket (path: impl AsRef<Path>) {
    let path = path.as_ref();
    if Path::new(path).exists() {
        let result = std::fs::remove_file(path);
        match result {
            Err(e) => {
                println!("Couldn't remove the file: {:?}", e);
            },
            _ => {}
        }
    }
}

pub fn tcp_datagram_server() {
    pub static FILE_PATH: &'static str = "/tmp/datagram.sock";
    let (tx, rx) = mpsc::channel();
    let mut buf = vec![0; 1024];
    unlink_socket(FILE_PATH);
    let socket = match UnixDatagram::bind(FILE_PATH) {
        Ok(socket) => socket,
        Err(e) => {
            println!("Couldn't bind: {:?}", e);
            return;
        }
    };
    println!("Waiting for client to connect...");
    loop {
        let received_bytes = socket.recv(buf.as_mut_slice()).expect("recv function failed");
        println!("Received {:?}", received_bytes);
        let received_message = from_utf8(buf.as_slice()).expect("utf-8 convert failed");
        tx.clone().send(received_message);
    }
}

fn main() {
    tcp_datagram_server();
}
client side:
use std::sync::mpsc;
use std::os::unix::net::UnixDatagram;
use std::path::Path;
use std::io::prelude::*;

pub fn tcp_datagram_client() {
    pub static FILE_PATH: &'static str = "/tmp/datagram.sock";
    let socket = UnixDatagram::unbound().unwrap();
    match socket.connect(FILE_PATH) {
        Ok(socket) => socket,
        Err(e) => {
            println!("Couldn't connect: {:?}", e);
            return;
        }
    };
    println!("TCP client Connected to TCP Server {:?}", socket);
    loop {
        socket.send(b"Hello from client to server").expect("recv function failed");
    }
}

fn main() {
    tcp_datagram_client();
}
Error I am getting
error[E0597]: `buf` does not live long enough
  --> src/unix_datagram_server.rs:38:42
   |
38 |         let received_message = from_utf8(buf.as_slice()).expect("utf-8 convert failed");
   |                                          ^^^ borrowed value does not live long enough
...
41 | }
   | -
   | |
   | `buf` dropped here while still borrowed
   | borrow might be used here, when `tx` is dropped and runs the `Drop` code for type `std::sync::mpsc::Sender`
   |
   = note: values in a scope are dropped in the opposite order they are defined

error: aborting due to previous error; 8 warnings emitted
For now if someone can help me to fix this code (may be edit the portion of code which needs to fix) that would be helpful.
Well the message seems rather clear. send does exactly what it says it does, it sends the parameter through the channel. This means the data must live long enough and remain valid "forever" (it needs to be alive and valid in the channel, as well as when fetched from it by the receiver).
That is not the case here. rustc can't understand that the function never returns, and it can panic anyway which will end up the same: the function will terminate, which will invalidate buf. Since received_message borrows buf, that means received_message can't be valid after the function has terminated. But at that point the message would still be in the channel waiting to be read (or retrieved by the receiver doing who knows what).
Therefore your construction is not allowed.
A second issue is that you're overwriting the buffer data on every loop, which has the same effect of breaking the message you sent during the previous iteration, and thus is not correct either. Though Rust won't let you do that either: if you work around the first error it will tell you that there's an outstanding shared borrow (the message sent through the channel) so you can't modify the backing buffer in the following iteration.
The solution is quite simple: have each iteration create an owned string (copying the current iteration's message) and send that through the channel:
tx.clone().send(received_message.to_string());
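To see the owned-copy approach in isolation, here is a small sketch that needs no socket. The function forward_messages and its datagrams parameter are hypothetical stand-ins for the receive loop; the point is only that one buffer can be reused on every iteration once each message is copied into its own String before being sent.

```rust
use std::str::from_utf8;
use std::sync::mpsc;

/// Hypothetical stand-in for the receive loop: reuses one buffer for every
/// "datagram" and sends an owned copy of each message through the channel.
fn forward_messages(datagrams: &[&[u8]]) -> Vec<String> {
    let (tx, rx) = mpsc::channel::<String>();
    let mut buf = vec![0u8; 1024];
    for datagram in datagrams {
        let n = datagram.len();
        // Simulates `socket.recv(&mut buf)` overwriting the start of the buffer.
        buf[..n].copy_from_slice(datagram);
        let message = from_utf8(&buf[..n]).expect("utf-8 convert failed");
        // `to_string` copies the bytes, so the channel never borrows `buf`.
        tx.send(message.to_string()).expect("receiver dropped");
    }
    drop(tx); // close the channel so the iterator below terminates
    rx.iter().collect()
}

fn main() {
    let received = forward_messages(&["hello".as_bytes(), "hi".as_bytes()]);
    println!("{:?}", received);
}
```

Because every message owns its data, overwriting buf in the next iteration cannot corrupt a message that is still waiting in the channel.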
Also, these are more style / inefficiency remarks but:
The clone() on tx is completely redundant. The point of having a sender that is Clone is being able to send from multiple threads (hence mp in the channel name, that's for multiple producers). Here you have a single thread, the original sender works fine.
.as_slice() and .as_mut_slice() are rarely used unless necessary, which they aren't here: a reference to a Vec coerces to a slice, so you can just use &mut buf and &buf. And why are you calling Path::new on something that's already a path? It does no harm, but it's not useful either.
It is rather annoying that your snippet is missing multiple imports and thus doesn't even compile as is.
From more of a unixy perspective, errors are usually printed on stderr. In Rust, eprintln! does that for you (it otherwise works the same way println! does). And I don't understand the purpose of marking a lexically nested static pub. Since the static is inside the function, it's not even visible to the function's siblings, to say nothing of external callers. As a result I'd end up with this:
use std::os::unix::net::UnixDatagram;
use std::path::Path;
use std::str::from_utf8;
use std::sync::mpsc;

fn unlink_socket(path: impl AsRef<Path>) {
    let path = path.as_ref();
    if path.exists() {
        if let Err(e) = std::fs::remove_file(path) {
            eprintln!("Couldn't remove the file: {:?}", e);
        }
    }
}

static FILE_PATH: &'static str = "/tmp/datagram.sock";

pub fn tcp_datagram_server() {
    unlink_socket(FILE_PATH);
    let socket = match UnixDatagram::bind(FILE_PATH) {
        Ok(socket) => socket,
        Err(e) => {
            eprintln!("Couldn't bind: {:?}", e);
            return;
        }
    };
    // Keep the receiver alive (`_rx`, not `_`); otherwise every send fails.
    let (tx, _rx) = mpsc::channel();
    let mut buf = vec![0; 1024];
    println!("Waiting for client to connect...");
    loop {
        let received_bytes = socket.recv(&mut buf).expect("recv function failed");
        println!("Received {:?}", received_bytes);
        // Decode only the bytes that were actually received in this datagram.
        let received_message = from_utf8(&buf[..received_bytes]).expect("utf-8 convert failed");
        // Send an owned copy so the message does not borrow from `buf`.
        tx.send(received_message.to_string()).expect("send failed");
    }
}
There's a hint in the compiler message: values in a scope are dropped in the opposite order they are defined in. In the example, buf is defined after tx, which means it will be dropped before tx. Since a reference to buf (in the form of received_message) is passed to tx.send(), buf must live longer than tx, and therefore switching the definition order will fix this particular error (i.e. define buf before tx).
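The drop-order rule can be checked with a small sketch. The Noisy type and the drop_order function are hypothetical and exist only for this illustration: each value records its name when it is dropped, and the two values are declared in the same order as tx and buf in the question.

```rust
use std::cell::RefCell;

/// Hypothetical type that records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Declares "tx" first and "buf" second, then reports the order of the drops.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _tx = Noisy { name: "tx", log: &log };
        let _buf = Noisy { name: "buf", log: &log };
        // Both values go out of scope here, in reverse declaration order.
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order()); // prints ["buf", "tx"]
}
```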

Wait for backend thread to finish after application.run in Rust

I want to wait for a backend thread (Like this but in my case the backend manages a database which I want to close properly before the application actually exits) to finish (e.g. join it) after application.run() has finished.
My actual non working main.rs (the closure needs to be non-mut)
the thread to wait for
use gio::prelude::*;
use gtk::prelude::*;
use gtk::{ApplicationWindow, Label};
use std::env::args;
use std::thread;

fn main() {
    let application = gtk::Application::new(
        Some("com.github.gtk-rs.examples.communication_thread"),
        Default::default(),
    )
    .expect("Initialization failed...");
    let (thr, mut receiver) = start_communication_thread();
    application.connect_activate(move |application| {
        build_ui(application, receiver.take().unwrap())
    });
    application.run(&args().collect::<Vec<_>>());
    thr.join();
}

fn build_ui(application: &gtk::Application, receiver: glib::Receiver<String>) {
    let window = ApplicationWindow::new(application);
    let label = Label::new(None);
    window.add(&label);
    spawn_local_handler(label, receiver);
    window.show_all();
}

/// Spawn channel receive task on the main event loop.
fn spawn_local_handler(label: gtk::Label, receiver: glib::Receiver<String>) {
    receiver.attach(None, move |item| {
        label.set_text(&item);
        glib::Continue(true)
    });
}

/// Spawn separate thread to handle communication.
fn start_communication_thread() -> (thread::JoinHandle<()>, Option<glib::Receiver<String>>) {
    let (sender, receiver) = glib::MainContext::channel(glib::PRIORITY_DEFAULT);
    let thr = thread::spawn(move || {
        let mut counter = 0;
        loop {
            let data = format!("Counter = {}!", counter);
            println!("Thread received data: {}", data);
            if sender.send(data).is_err() {
                break
            }
            counter += 1;
            thread::sleep(std::time::Duration::from_millis(100));
        }
    });
    (thr, Some(receiver))
}
As mentioned above, the only error remaining is that application.connect_activate() takes an Fn closure, the current implementation is FnMut.
The error message is:
error[E0596]: cannot borrow `receiver` as mutable, as it is a captured variable in a `Fn` closure
  --> src/main.rs:17:31
   |
17 |         build_ui(application, receiver.take().unwrap())
   |                               ^^^^^^^^ cannot borrow as mutable
So you cannot use "receiver" mutably, which is necessary for you to take() its contents.
But if you wrap the receiver inside a Cell (std::cell::Cell, which needs to be imported), then you can access the immutable Cell's contents mutably. So add this line directly after the line with start_communication_thread():
let receiver = Cell::new(receiver);
There might be some more correct answer as I am only a beginner at Rust, but at least it seems to work.
Please note that this changes the take() call to be called against the Cell instead of Option, whose implementation has the same effect, replacing the Cell's contents with None.
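The same pattern can be tried without GTK. In this sketch, call_fn is a hypothetical stand-in for an API that accepts only Fn closures (as connect_activate does), and a String stands in for the receiver. Cell::take needs only a shared reference, so the closure stays Fn.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for an API that only accepts `Fn` closures.
fn call_fn(f: impl Fn() -> Option<String>) -> Option<String> {
    f()
}

/// Calls the handler twice and returns what each call managed to take.
fn take_once() -> (Option<String>, Option<String>) {
    let receiver = Cell::new(Some(String::from("receiver")));
    // `Cell::take` replaces the contents with `None` through `&self`,
    // so this closure does not need to be `FnMut`.
    let handler = move || receiver.take();
    let first = call_fn(&handler);
    let second = call_fn(&handler);
    (first, second)
}

fn main() {
    println!("{:?}", take_once());
}
```

The second call yields None, which is why the question's unwrap() would panic if the activate handler ever ran twice.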

Creates a temporary which is freed while still in use

I'm creating a small application that explores variable lifetimes and threads. I want to load in a file once, and then use its contents (in this case an audio file) in a separate channel. I am having issues with value lifetimes.
I'm almost certain the syntax is wrong for what I have so far (for creating a static variable), but I can't find any resources for File types and lifetimes. What I have thus far produces this error:
let file = &File::open("src/censor-beep-01.wav").unwrap();
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ creates a temporary which is freed while still in use
let x: &'static File = file;
       ------------- type annotation requires that borrow lasts for `'static`
The code I currently have is:
#![allow(dead_code)]
#![allow(unused_imports)]
#![allow(unused_must_use)]
#![allow(unused_variables)]
use std::io::{self, BufRead, BufReader, stdin, Read};
use std::sync::mpsc::{self, TryRecvError};
use std::thread;
use std::time::Duration;
use std::fs::File;
use std::rc::Rc;
use rodio::Source;

fn main() {
    let file = &File::open("src/censor-beep-01.wav").unwrap();
    let x: &'static File = file;
    loop {
        let (tx, rx) = mpsc::channel();
        thread::spawn(move || loop {
            let tmp = x;
            let (stream, stream_handle) = rodio::OutputStream::try_default().unwrap();
            let source = rodio::Decoder::new(BufReader::new(tmp)).unwrap();
            stream_handle.play_raw(source.convert_samples());
            match rx.try_recv() {
                Ok(_) | Err(TryRecvError::Disconnected) => {
                    break;
                }
                Err(TryRecvError::Empty) => {
                    println!("z");
                    thread::sleep(Duration::from_millis(1000));
                }
            }
        });
        let mut line = String::new();
        let stdin = io::stdin();
        let _ = stdin.lock().read_line(&mut line);
        let _ = tx.send(());
        return;
    }
}
You need to wrap the file with Arc and Mutex like Arc::new(Mutex::new(file)) and then clone the file before passing it to the thread.
Arc is used for reference counting, which is needed to share the target object (in your case it is a file) across the thread and Mutex is needed to access the target object synchronously.
sample code (I have simplified your code to make it more understandable):
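// Requires: use std::ops::Deref; use std::sync::{Arc, Mutex};
let file = Arc::new(Mutex::new(File::open("src/censor-beep-01.wav").unwrap()));
loop {
    // Each thread gets its own handle to the same shared file.
    let file = file.clone();
    thread::spawn(move || loop {
        // Recover the guard even if another thread panicked while holding the lock.
        let file_guard = match file.lock() {
            Ok(guard) => guard,
            Err(poison) => poison.into_inner(),
        };
        let file = file_guard.deref();
        // now you can pass above file object to BufReader like "BufReader::new(file)"
    });
}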
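Reason for the "creates a temporary which is freed while still in use" error:
You have only stored a reference to the file, not the File object itself. The File is a temporary that is dropped when main ends at the latest, so a reference to it cannot satisfy the 'static lifetime that the type annotation asks for.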
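A self-contained sketch of the Arc and Mutex pattern, under the assumption that an in-memory Cursor<Vec<u8>> is an acceptable stand-in for the File so that no file on disk is needed. The helper read_from_threads is hypothetical; each thread locks the shared reader, rewinds it, and reads it to the end.

```rust
use std::io::{Cursor, Read, Seek, SeekFrom};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical helper: `n` threads share one reader and each reports how
/// many bytes it read after rewinding to the start.
fn read_from_threads(data: Vec<u8>, n: usize) -> Vec<usize> {
    // Cursor<Vec<u8>> stands in for File in this sketch.
    let shared = Arc::new(Mutex::new(Cursor::new(data)));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            // Clone the Arc (a cheap reference-count bump), not the data.
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                let mut guard = shared
                    .lock()
                    .unwrap_or_else(|poison| poison.into_inner());
                guard.seek(SeekFrom::Start(0)).unwrap();
                let mut bytes = Vec::new();
                let count = guard.read_to_end(&mut bytes).unwrap();
                count
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    println!("{:?}", read_from_threads(b"beep".to_vec(), 3));
}
```

The Mutex matters because reading moves the shared position; without the rewind under the lock, a later thread would start where the previous one stopped.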

Using str and String interchangeably

Suppose I'm trying to do a fancy zero-copy parser in Rust using &str, but sometimes I need to modify the text (e.g. to implement variable substitution). I really want to do something like this:
fn main() {
    let mut v: Vec<&str> = "Hello there $world!".split_whitespace().collect();
    for t in v.iter_mut() {
        if (t.contains("$world")) {
            *t = &t.replace("$world", "Earth");
        }
    }
    println!("{:?}", &v);
}
But of course the String returned by t.replace() doesn't live long enough. Is there a nice way around this? Perhaps there is a type which means "ideally a &str but if necessary a String"? Or maybe there is a way to use lifetime annotations to tell the compiler that the returned String should be kept alive until the end of main() (or have the same lifetime as v)?
Rust has exactly what you want in form of a Cow (Clone On Write) type.
use std::borrow::Cow;

fn main() {
    let mut v: Vec<_> = "Hello there $world!".split_whitespace()
        .map(|s| Cow::Borrowed(s))
        .collect();
    for t in v.iter_mut() {
        if t.contains("$world") {
            *t.to_mut() = t.replace("$world", "Earth");
        }
    }
    println!("{:?}", &v);
}
As @sellibitze correctly notes, to_mut() creates a new String, which causes a heap allocation to store the previously borrowed value. If you are sure you only have borrowed strings, then you can use
*t = Cow::Owned(t.replace("$world", "Earth"));
In case the Vec contains Cow::Owned elements, this would still throw away the allocation. You can prevent that by using the following very fragile and unsafe code inside your for loop. (It does direct byte-based manipulation of UTF-8 strings and relies on the fact that "Earth" happens to be exactly the same number of bytes as "world".)
let mut last_pos = 0; // so we don't start at the beginning every time
while let Some(pos) = t[last_pos..].find("$world") {
    let p = pos + last_pos; // find returns an offset relative to last_pos
    last_pos = p + 5; // resume after the 5 bytes of "Earth" written below
    unsafe {
        let s = t.to_mut().as_mut_vec(); // operating on Vec is easier
        s.remove(p); // remove $ sign
        for (c, sc) in "Earth".bytes().zip(&mut s[p..]) {
            *sc = c;
        }
    }
}
Note that this is tailored exactly to the "$world" -> "Earth" mapping. Any other mappings require careful consideration inside the unsafe code.
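For comparison, the safe variant can be packaged as a function that allocates only when a substitution actually happens. This is a sketch; the name substitute and its parameters are hypothetical and not part of any library.

```rust
use std::borrow::Cow;

/// Hypothetical helper: returns the token unchanged (borrowed) unless it
/// contains `var`, in which case an owned String with the replacement is built.
fn substitute<'a>(token: &'a str, var: &str, value: &str) -> Cow<'a, str> {
    if token.contains(var) {
        Cow::Owned(token.replace(var, value))
    } else {
        Cow::Borrowed(token)
    }
}

fn main() {
    let v: Vec<Cow<str>> = "Hello there $world!"
        .split_whitespace()
        .map(|t| substitute(t, "$world", "Earth"))
        .collect();
    println!("{:?}", v); // prints ["Hello", "there", "Earth!"]
}
```

Tokens without the variable stay zero-copy slices of the input; only "Earth!" lives in its own allocation.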
Use std::borrow::Cow, specifically as Cow<'a, str>, where 'a is the lifetime of the string being parsed.
use std::borrow::Cow;

fn main() {
    let mut v: Vec<Cow<'static, str>> = vec![];
    v.push("oh hai".into());
    v.push(format!("there, {}.", "Mark").into());
    println!("{:?}", v);
}
Produces:
["oh hai", "there, Mark."]

Does Read::read guarantee to append data and not overwrite any existing one?

I'm working on an SMTP library that reads lines over the network using a buffered reader.
I want a nice, safe way to read data from the network, without depending on Rust internals to make sure the code works as expected. Specifically, I'm wondering if the Read trait guarantees that data read with Read::read is appended to the buffer passed as an argument rather than overwriting the buffer entirely.
At the moment, I use a Range to make sure existing data is not overwritten without depending on Rust internals.
However, given that Rust used to have a nice way to do what I want, I'm wondering if the current code can be improved, possibly removing the unsafe blocks too.
No, it does not guarantee that:
use std::io::prelude::*;
use std::str;

fn main() {
    let mut source1 = "hello, world!".as_bytes();
    let mut source2 = "moo".as_bytes();
    let mut dest = [0; 128];
    source1.read(&mut dest).unwrap();
    source2.read(&mut dest).unwrap();
    let s = str::from_utf8(&dest[..16]).unwrap();
    println!("{:?}", s)
}
This prints
"moolo, world!\u{0}\u{0}\u{0}"
Specifically, it cannot do what you want, based purely on the type signature:
fn read(&mut self, buf: &mut [u8]) -> Result<usize>;
All that the read method has access to is your mutable slice - there's nowhere to store information like "how far in the buffer you are". Furthermore, you aren't allowed to "extend" a mutable slice with more elements - you are only allowed to mutate the values within the slice.
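Since read cannot remember a position for you, the caller has to track how much of the buffer is filled and hand read only the unfilled tail. A minimal sketch of that bookkeeping follows; the function read_all_into is hypothetical, and in-memory byte slices stand in for network sources.

```rust
use std::io::Read;

/// Hypothetical helper: reads every source in turn and appends into `dest`
/// by passing `read` only the part of the buffer that is still unfilled.
fn read_all_into(sources: &mut [&[u8]], dest: &mut [u8]) -> std::io::Result<usize> {
    let mut filled = 0;
    for source in sources.iter_mut() {
        loop {
            // `read` writes at the start of the slice it is given, so slicing
            // from `filled` is what turns overwriting into appending.
            let n = source.read(&mut dest[filled..])?;
            if n == 0 {
                break; // source exhausted, or `dest` is full
            }
            filled += n;
        }
    }
    Ok(filled)
}

fn main() -> std::io::Result<()> {
    let mut dest = [0u8; 128];
    let mut sources = ["hello, world!".as_bytes(), "moo".as_bytes()];
    let n = read_all_into(&mut sources, &mut dest)?;
    println!("{:?}", std::str::from_utf8(&dest[..n]).unwrap());
    Ok(())
}
```

With a fixed-size dest, data that does not fit is silently left unread in this sketch; real code would need to decide whether to grow the buffer or report an error.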
For your particular case, you may want to look at BufRead::read_until. Here's a barely-tested example:
use std::io::{BufRead,BufReader};
use std::str;

fn main() {
    let source1 = "header 1\r\nheader 2\r\n".as_bytes();
    let mut reader = BufReader::new(source1);
    let mut buf = vec![];
    buf.reserve(128); // Maybe more efficient?
    loop {
        match reader.read_until(b'\n', &mut buf) {
            Ok(0) => break,
            Ok(_) => {},
            Err(_) => panic!("Handle errors"),
        }
        if buf.len() < 2 { continue }
        if buf[buf.len() - 2] == b'\r' {
            {
                let s = str::from_utf8(&buf).unwrap();
                println!("Got a header {:?}", s);
            }
            buf.clear();
        }
    }
}

Resources