How to append to a file-backed mmap using the memmap crate? - linux

I have a file foo.txt with the content
foobar
I want to continuously append to this file and have access to the modified file.
MmapMut
The first thing I tried is to mutate the mmap directly:
use memmap;
use std::fs;
use std::io::prelude::*;
fn main() -> Result<(), Box<std::error::Error>> {
let backing_file = fs::OpenOptions::new()
.read(true)
.append(true)
.create(true)
.write(true)
.open("foo.txt")?;
let mut mmap = unsafe { memmap::MmapMut::map_mut(&backing_file)? };
loop {
println!("{}", std::str::from_utf8(&mmap[..])?);
std::thread::sleep(std::time::Duration::from_secs(5));
let buf = b"somestring";
(&mut mmap[..]).write_all(buf)?;
mmap.flush()?;
}
}
This will result in a panic:
Error: Custom { kind: WriteZero, error: StringError("failed to write whole buffer") }
The resulting file reads somest
Appending to the backing file directly
After that, I tried to append to the backing file directly:
use memmap;
use std::fs;
use std::io::prelude::*;
fn main() -> Result<(), Box<std::error::Error>> {
let mut backing_file = fs::OpenOptions::new()
.read(true)
.append(true)
.create(true)
.write(true)
.open("foo.txt")?;
let mmap = unsafe { memmap::MmapMut::map_mut(&backing_file)? };
loop {
println!("{}", std::str::from_utf8(&mmap[..])?);
std::thread::sleep(std::time::Duration::from_secs(5));
let buf = b"somestring";
backing_file.write_all(buf)?;
backing_file.flush()?;
}
}
This does not result in a panic. The file will get updated regularly, but my mmap does not reflect these changes. I expected standard output to look like this:
foobar
foobarsomestring
foobarsomestringsomestring
...
But I got
foobar
foobar
foobar
...
I am mainly interested in a Linux solution if it is platform-dependent.

First off, based on my understanding, I would urge you to be highly suspicious of that crate. It allows you do do things in safe Rust that you should not.
For example, if you have a file-backed mmap, then any process on your computer with the right permissions to the file can modify it. This means that:
It's never valid for the mmapped file to be treated as an immutable slice of bytes (&[u8]) because it might be mutated!
It's never valid for the mmapped file to be treated as a mutable slice of bytes (&mut [u8]) because a mutable reference implies an exclusive owner that can change that data, but you don't have that.
The documentation for that crate covers none of these concerns and doesn't discuss how you are supposed to use the small handful of unsafe functions in a safe manner. To me, these are signs that you may be introducing undefined behavior into your code, which is a Very Bad Thing.
For example:
use memmap;
use std::{fs, io::prelude::*};
fn main() -> Result<(), Box<std::error::Error>> {
let mut backing_file = fs::OpenOptions::new()
.read(true)
.append(true)
.create(true)
.write(true)
.open("foo.txt")?;
backing_file.write_all(b"initial")?;
let mut mmap_mut = unsafe { memmap::MmapMut::map_mut(&backing_file)? };
let mmap_immut = unsafe { memmap::Mmap::map(&backing_file)? };
// Code after here violates the rules of references, but doesn't use `unsafe`
let a_str: &str = std::str::from_utf8(&mmap_immut)?;
println!("{}", a_str); // initial
mmap_mut[0] = b'x';
// Look, we just changed an "immutable reference"!
println!("{}", a_str); // xnitial
Ok(())
}
Since people generally don't like being told "no, don't do that, it's a bad idea", here's how to get your code to "work": directly append to the file and then recreate the mmap:
use memmap;
use std::{fs, io::prelude::*, thread, time::Duration};
fn main() -> Result<(), Box<std::error::Error>> {
let mut backing_file = fs::OpenOptions::new()
.read(true)
.append(true)
.create(true)
.write(true)
.open("foo.txt")?;
// mmap requires that the initial mapping be non-zero
backing_file.write_all(b"initial")?;
for _ in 0..3 {
let mmap = unsafe { memmap::MmapMut::map_mut(&backing_file)? };
// I think this line can introduce memory unsafety
println!("{}", std::str::from_utf8(&mmap[..])?);
thread::sleep(Duration::from_secs(1));
backing_file.write_all(b"somestring")?;
}
Ok(())
}
You may want to preallocate a "big" chunk of space in this file so that you can just open it up and start writing, instead of having to re-map it.
I would not use this code for anything where it's important that the data is correct, myself.
See also:
Is appending to a file that is Mmap-ed safe and cross-platform?
appending to a memory-mapped file
How to create and write to memory mapped files?

Related

Reading ZIP file in Rust causes data owned by the current function

I'm new to Rust and am likely have a huge knowledge gap. Basically, I'm hoping to be create a utility function that would except a regular text file or a ZIP file and return a BufRead where the caller can start processing line by line. It is working well for non ZIP files but I am not understanding how to achieve the same for the ZIP files. The ZIP files will only contain a single file within the archive which is why I'm only processing the first file in the ZipArchive.
I'm running into the the following error.
error[E0515]: cannot return value referencing local variable `archive_contents`
--> src/file_reader.rs:30:9
|
27 | let archive_file: zip::read::ZipFile = archive_contents.by_index(0).unwrap();
| ---------------- `archive_contents` is borrowed here
...
30 | Ok(Box::new(BufReader::with_capacity(128 * 1024, archive_file)))
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ returns a value referencing data owned by the current function
It seems the archive_contents is preventing the BufRead object from returning to the caller. I'm just not sure how to work around this.
file_reader.rs
use std::ffi::OsStr;
use std::fs::File;
use std::io::BufRead;
use std::io::BufReader;
use std::path::Path;
pub struct FileReader {
pub file_reader: Result<Box<BufRead>, &'static str>,
}
pub fn file_reader(filename: &str) -> Result<Box<BufRead>, &'static str> {
let path = Path::new(filename);
let file = match File::open(&path) {
Ok(file) => file,
Err(why) => panic!(
"ERROR: Could not open file, {}: {}",
path.display(),
why.to_string()
),
};
if path.extension() == Some(OsStr::new("zip")) {
// Processing ZIP file.
let mut archive_contents: zip::read::ZipArchive<std::fs::File> =
zip::ZipArchive::new(file).unwrap();
let archive_file: zip::read::ZipFile = archive_contents.by_index(0).unwrap();
// ERRORS: returns a value referencing data owned by the current function
Ok(Box::new(BufReader::with_capacity(128 * 1024, archive_file)))
} else {
// Processing non-ZIP file.
Ok(Box::new(BufReader::with_capacity(128 * 1024, file)))
}
}
main.rs
mod file_reader;
use std::io::BufRead;
fn main() {
let mut files: Vec<String> = Vec::new();
files.push("/tmp/text_file.txt".to_string());
files.push("/tmp/zip_file.zip".to_string());
for f in files {
let mut fr = match file_reader::file_reader(&f) {
Ok(fr) => fr,
Err(e) => panic!("Error reading file."),
};
fr.lines().for_each(|l| match l {
Ok(l) => {
println!("{}", l);
}
Err(e) => {
println!("ERROR: Failed to read line:\n {}", e);
}
});
}
}
Any help is greatly appreciated!
It seems the archive_contents is preventing the BufRead object from returning to the caller. I'm just not sure how to work around this.
You have to restructure the code somehow. The issue here is that, well, the archive data is part of the archive. So unlike file, archive_file is not an independent item, it is rather a pointer of sort into the archive itself. Which means the archive needs to live longer than archive_file for this code to be correct.
In a GC'd language this isn't an issue, archive_file has a reference to archive and will keep it alive however long it needs. Not so for Rust.
A simple way to fix this would be to just copy the data out of archive_file and into an owned buffer you can return to the parent. An other option might be to return a wrapper for (archive_contents, item_index), which would delegate the reading (might be somewhat tricky though). Yet another would be to not have file_reader.
Thanks to #Masklinn for the direction! Here's the working solution using their suggestion.
file_reader.rs
use std::ffi::OsStr;
use std::fs::File;
use std::io::BufRead;
use std::io::BufReader;
use std::io::Cursor;
use std::io::Error;
use std::io::Read;
use std::path::Path;
use zip::read::ZipArchive;
pub fn file_reader(filename: &str) -> Result<Box<dyn BufRead>, Error> {
let path = Path::new(filename);
let file = match File::open(&path) {
Ok(file) => file,
Err(why) => return Err(why),
};
if path.extension() == Some(OsStr::new("zip")) {
let mut archive_contents = ZipArchive::new(file)?;
let mut archive_file = archive_contents.by_index(0)?;
// Read the contents of the file into a vec.
let mut data = Vec::new();
archive_file.read_to_end(&mut data)?;
// Wrap vec in a std::io::Cursor.
let cursor = Cursor::new(data);
Ok(Box::new(cursor))
} else {
// Processing non-ZIP file.
Ok(Box::new(BufReader::with_capacity(128 * 1024, file)))
}
}
While the solution you have settled on does work, it has a few disadvantages. One is that when you read from a zip file, you have to read the contents of the file you want to process into memory before proceeding, which might be impractical for a large file. Another is that you have to heap allocate the BufReader in either case.
Another possibly more idiomatic solution is to restructure your code, such that the BufReader does not need to be returned from the function at all - rather, structure your code so that it has a function that opens the file, which in turn calls a function that processes the file:
use std::ffi::OsStr;
use std::fs::File;
use std::io::BufRead;
use std::io::BufReader;
use std::path::Path;
pub fn process_file(filename: &str) -> Result<usize, String> {
let path = Path::new(filename);
let file = match File::open(&path) {
Ok(file) => file,
Err(why) => return Err(format!(
"ERROR: Could not open file, {}: {}",
path.display(),
why.to_string()
)),
};
if path.extension() == Some(OsStr::new("zip")) {
// Handling a zip file
let mut archive_contents=zip::ZipArchive::new(file).unwrap();
let mut buf_reader = BufReader::with_capacity(128 * 1024,archive_contents.by_index(0).unwrap());
process_reader(&mut buf_reader)
} else {
// Handling a plain file.
process_reader(&mut BufReader::with_capacity(128 * 1024, file))
}
}
pub fn process_reader(reader: &mut dyn BufRead) -> Result<usize, String> {
// Example, just count the number of lines
return Ok(reader.lines().count());
}
fn main() {
let mut files: Vec<String> = Vec::new();
files.push("/tmp/text_file.txt".to_string());
files.push("/tmp/zip_file.zip".to_string());
for f in files {
match process_file(&f) {
Ok(count) => println!("File {} Count: {}", &f, count),
Err(e) => println!("Error reading file: {}", e),
};
}
}
This way, you don't need any Boxes and you don't need to read the file into memory before processing it.
A drawback to this solution would if you had multiple functions that need to be able to read from zip files. One way to handle that would be to define process_file to take a callback function to do the processing. First you would change the definition of process_file to be:
pub fn process_file<C>(filename: &str, process_reader: C) -> Result<usize, String>
where C: FnOnce(&mut dyn BufRead)->Result<usize, String>
The rest of the function body can be left unchanged. Now, process_reader can be passed into the function, like this:
process_file(&f, count_lines)
where count_lines would be the original simple function to count the lines, for instance.
This would also allow you to pass in a closure:
process_file(&f, |reader| Ok(reader.lines().count()))

How to do simple math with a list of numbers from a file and print out the result in Rust?

use std::fs::File;
use std::io::prelude::*;
use std::io::BufReader;
use std::iter::Iterator;
fn main() -> std::io::Result<()> {
let file = File::open("input")?; // file is input
let mut buf_reader = BufReader::new(file);
let mut contents = String::new();
buf_reader.read_to_string(&mut contents)?;
for i in contents.parse::<i32>() {
let i = i / 2;
println!("{}", i);
}
Ok(())
}
list of numbers:
50951
69212
119076
124303
95335
65069
109778
113786
124821
103423
128775
111918
138158
141455
92800
50908
107279
77352
129442
60097
84670
143682
104335
105729
87948
59542
81481
147508
str::parse::<i32> can only parse a single number at a time, so you will need to split the text first and then parse each number one by one. For example if you have one number per line and no extra whitespace, you can use BufRead::lines to process the text line by line:
use std::fs::File;
use std::io::{BufRead, BufReader};
fn main() -> std::io::Result<()> {
let file = File::open("input")?; // file is input
let mut buf_reader = BufReader::new(file);
for line in buf_reader.lines() {
let value = line?
.parse::<i32>()
.expect("Not able to parse: Content is malformed !");
println!("{}", value / 2);
}
Ok(())
}
As an extra bonus this avoids reading the whole file into memory, which can be important if the file is big.
For tiny examples like this, I'd read the entire string at once, then split it up on lines.
use std::fs;
fn main() -> Result<(), Box<dyn std::error::Error>> {
let contents = fs::read_to_string("input")?;
for line in contents.trim().lines() {
let i: i32 = line.trim().parse()?;
let i = i / 2;
println!("{}", i);
}
Ok(())
}
See also:
What's the de-facto way of reading and writing files in Rust 1.x?
For tightly-controlled examples like this, I'd ignore errors occurring while parsing:
use std::fs;
fn main() -> Result<(), Box<dyn std::error::Error>> {
let contents = fs::read_to_string("input")?;
for i in contents.trim().lines().flat_map(|l| l.trim().parse::<i32>()) {
let i = i / 2;
println!("{}", i);
}
Ok(())
}
See also:
Why does `Option` support `IntoIterator`?
For fixed-input examples like this, I'd avoid opening the file at runtime at all, pushing the error to compile-time:
fn main() -> Result<(), Box<dyn std::error::Error>> {
let contents = include_str!("../input");
for i in contents.trim().lines().flat_map(|l| l.trim().parse::<i32>()) {
let i = i / 2;
println!("{}", i);
}
Ok(())
}
See also:
Is there a good way to include external resource data into Rust source code?
If I wanted to handle failures to parse but treat the iterator as if errors were impossible, I'd use Itertools::process_results:
use itertools; // 0.8.2
fn main() -> Result<(), Box<dyn std::error::Error>> {
let contents = include_str!("../input");
let numbers = contents.trim().lines().map(|l| l.trim().parse::<i32>());
let sum = itertools::process_results(numbers, |i| i.sum::<i32>());
println!("{:?}", sum);
Ok(())
}
See also:
How do I perform iterator computations over iterators of Results without collecting to a temporary vector?
How do I stop iteration and return an error when Iterator::map returns a Result::Err?

How to convert a String to a &[u8] in order to write it to a file? [duplicate]

With Rust being comparatively new, I've seen far too many ways of reading and writing files. Many are extremely messy snippets someone came up with for their blog, and 99% of the examples I've found (even on Stack Overflow) are from unstable builds that no longer work. Now that Rust is stable, what is a simple, readable, non-panicking snippet for reading or writing files?
This is the closest I've gotten to something that works in terms of reading a text file, but it's still not compiling even though I'm fairly certain I've included everything I should have. This is based off of a snippet I found on Google+ of all places, and the only thing I've changed is that the old BufferedReader is now just BufReader:
use std::fs::File;
use std::io::BufReader;
use std::path::Path;
fn main() {
let path = Path::new("./textfile");
let mut file = BufReader::new(File::open(&path));
for line in file.lines() {
println!("{}", line);
}
}
The compiler complains:
error: the trait bound `std::result::Result<std::fs::File, std::io::Error>: std::io::Read` is not satisfied [--explain E0277]
--> src/main.rs:7:20
|>
7 |> let mut file = BufReader::new(File::open(&path));
|> ^^^^^^^^^^^^^^
note: required by `std::io::BufReader::new`
error: no method named `lines` found for type `std::io::BufReader<std::result::Result<std::fs::File, std::io::Error>>` in the current scope
--> src/main.rs:8:22
|>
8 |> for line in file.lines() {
|> ^^^^^
To sum it up, what I'm looking for is:
brevity
readability
covers all possible errors
doesn't panic
None of the functions I show here panic on their own, but I am using expect because I don't know what kind of error handling will fit best into your application. Go read The Rust Programming Language's chapter on error handling to understand how to appropriately handle failure in your own program.
Rust 1.26 and onwards
If you don't want to care about the underlying details, there are one-line functions for reading and writing.
Read a file to a String
use std::fs;
fn main() {
let data = fs::read_to_string("/etc/hosts").expect("Unable to read file");
println!("{}", data);
}
Read a file as a Vec<u8>
use std::fs;
fn main() {
let data = fs::read("/etc/hosts").expect("Unable to read file");
println!("{}", data.len());
}
Write a file
use std::fs;
fn main() {
let data = "Some data!";
fs::write("/tmp/foo", data).expect("Unable to write file");
}
Rust 1.0 and onwards
These forms are slightly more verbose than the one-line functions that allocate a String or Vec for you, but are more powerful in that you can reuse allocated data or append to an existing object.
Reading data
Reading a file requires two core pieces: File and Read.
Read a file to a String
use std::fs::File;
use std::io::Read;
fn main() {
let mut data = String::new();
let mut f = File::open("/etc/hosts").expect("Unable to open file");
f.read_to_string(&mut data).expect("Unable to read string");
println!("{}", data);
}
Read a file as a Vec<u8>
use std::fs::File;
use std::io::Read;
fn main() {
let mut data = Vec::new();
let mut f = File::open("/etc/hosts").expect("Unable to open file");
f.read_to_end(&mut data).expect("Unable to read data");
println!("{}", data.len());
}
Write a file
Writing a file is similar, except we use the Write trait and we always write out bytes. You can convert a String / &str to bytes with as_bytes:
use std::fs::File;
use std::io::Write;
fn main() {
let data = "Some data!";
let mut f = File::create("/tmp/foo").expect("Unable to create file");
f.write_all(data.as_bytes()).expect("Unable to write data");
}
Buffered I/O
I felt a bit of a push from the community to use BufReader and BufWriter instead of reading straight from a file
A buffered reader (or writer) uses a buffer to reduce the number of I/O requests. For example, it's much more efficient to access the disk once to read 256 bytes instead of accessing the disk 256 times.
That being said, I don't believe a buffered reader/writer will be useful when reading the entire file. read_to_end seems to copy data in somewhat large chunks, so the transfer may already be naturally coalesced into fewer I/O requests.
Here's an example of using it for reading:
use std::fs::File;
use std::io::{BufReader, Read};
fn main() {
let mut data = String::new();
let f = File::open("/etc/hosts").expect("Unable to open file");
let mut br = BufReader::new(f);
br.read_to_string(&mut data).expect("Unable to read string");
println!("{}", data);
}
And for writing:
use std::fs::File;
use std::io::{BufWriter, Write};
fn main() {
let data = "Some data!";
let f = File::create("/tmp/foo").expect("Unable to create file");
let mut f = BufWriter::new(f);
f.write_all(data.as_bytes()).expect("Unable to write data");
}
A BufReader is more useful when you want to read line-by-line:
use std::fs::File;
use std::io::{BufRead, BufReader};
fn main() {
let f = File::open("/etc/hosts").expect("Unable to open file");
let f = BufReader::new(f);
for line in f.lines() {
let line = line.expect("Unable to read line");
println!("Line: {}", line);
}
}
For anybody who is writing to a file, the accepted answer is good but if you need to append to the file you have to use the OpenOptions struct instead:
use std::io::Write;
use std::fs::OpenOptions;
fn main() {
let data = "Some data!\n";
let mut f = OpenOptions::new()
.append(true)
.create(true) // Optionally create the file if it doesn't already exist
.open("/tmp/foo")
.expect("Unable to open file");
f.write_all(data.as_bytes()).expect("Unable to write data");
}
Buffered writing still works the same way:
use std::io::{BufWriter, Write};
use std::fs::OpenOptions;
fn main() {
let data = "Some data!\n";
let f = OpenOptions::new()
.append(true)
.open("/tmp/foo")
.expect("Unable to open file");
let mut f = BufWriter::new(f);
f.write_all(data.as_bytes()).expect("Unable to write data");
}
By using the Buffered I/O you can copy the file size is greater than the actual memory.
use std::fs::{File, OpenOptions};
use std::io::{BufReader, BufWriter, Write, BufRead};
fn main() {
let read = File::open(r#"E:\1.xls"#);
let write = OpenOptions::new().write(true).create(true).open(r#"E:\2.xls"#);
let mut reader = BufReader::new(read.unwrap());
let mut writer = BufWriter::new(write.unwrap());
let mut length = 1;
while length > 0 {
let buffer = reader.fill_buf().unwrap();
writer.write(buffer);
length = buffer.len();
reader.consume(length);
}
}

Does Read::read guarantee to append data and not overwrite any existing one?

I'm working on an SMTP library that reads lines over the network using a buffered reader.
I want a nice, safe way to read data from the network, without depending on Rust internals to make sure the code works as expected. Specifically, I'm wondering if the Read trait guarantees that data read with Read::read is appended to the buffer passed as an argument rather than overwriting the buffer entirely.
At the moment, I use a Range to make sure existing data is not overwritten without depending on Rust internals.
However, given that Rust used to have a nice way to do what I want, I'm wondering if the current code can be improved, possibly removing the unsafe blocks too.
No, it does not guarantee that:
use std::io::prelude::*;
use std::str;
fn main() {
let mut source1 = "hello, world!".as_bytes();
let mut source2 = "moo".as_bytes();
let mut dest = [0; 128];
source1.read(&mut dest).unwrap();
source2.read(&mut dest).unwrap();
let s = str::from_utf8(&dest[..16]).unwrap();
println!("{:?}", s)
}
This prints
"moolo, world!\u{0}\u{0}\u{0}"
Specifically, it cannot do what you want, based purely on the type signature:
fn read(&mut self, buf: &mut [u8]) -> Result<usize>;
All that the read method has access to is your mutable slice - there's nowhere to store information like "how far in the buffer you are". Furthermore, you aren't allowed to "extend" a mutable slice with more elements - you are only allowed to mutate the values within the slice.
For your particular case, you may want to look at BufRead::read_until. Here's a barely-tested example:
use std::io::{BufRead,BufReader};
use std::str;
fn main() {
let source1 = "header 1\r\nheader 2\r\n".as_bytes();
let mut reader = BufReader::new(source1);
let mut buf = vec![];
buf.reserve(128); // Maybe more efficient?
loop {
match reader.read_until(b'\n', &mut buf) {
Ok(0) => break,
Ok(_) => {},
Err(_) => panic!("Handle errors"),
}
if buf.len() < 2 { continue }
if buf[buf.len() - 2] == b'\r' {
{
let s = str::from_utf8(&buf).unwrap();
println!("Got a header {:?}", s);
}
buf.clear();
}
}
}

How to create and write to memory mapped files?

Editor's note: This code example is from a version of Rust prior to 1.0 and the code it uses does not exist in Rust 1.0. Some answers have been updated to answer the core question for newer versions of Rust.
I'm trying to create a memory mapped file using std::os::MemoryMap. The current approach looks as follows:
use std::os;
use std::ptr;
use std::old_io as io;
use std::os::unix::prelude::AsRawFd;
use std::os::MapOption;
let path = Path::new("test.mmap");
let f = match io::File::open_mode(&path, io::Open, io::ReadWrite) {
Ok(f) => f,
Err(err) => panic!("Could not open file: {}", err),
};
let mmap_opts = &[
MapOption::MapReadable,
MapOption::MapWritable,
MapOption::MapFd(f.as_raw_fd())
];
let mmap = match os::MemoryMap::new(1024*1024, mmap_opts) {
Ok(mmap) => {
println!("Successfully created the mmap: {}", mmap.len());
mmap
}
Err(err) => panic!("Could not read the mmap: {}", err),
};
unsafe {
let data = mmap.data();
if data.is_null() {
panic!("Could not access data from memory mapped file")
}
let src = "Hello!";
ptr::copy_memory(data, src.as_ptr(), src.as_bytes().len());
}
This program fails with
Process didn't exit successfully: `target/mmap` (status=4)
when calling ptr::copy_memory or any other operations on data.
What is the reason I cannot write (or read) the data from the MemoryMap?
What is the correct way to use MemoryMap in Rust?
The real answer is to use a crate that provides this functionality, ideally in a cross-platform manner.
use memmap; // 0.7.0
use std::{
fs::OpenOptions,
io::{Seek, SeekFrom, Write},
};
const SIZE: u64 = 1024 * 1024;
fn main() {
let src = "Hello!";
let mut f = OpenOptions::new()
.read(true)
.write(true)
.create(true)
.open("test.mmap")
.expect("Unable to open file");
// Allocate space in the file first
f.seek(SeekFrom::Start(SIZE)).unwrap();
f.write_all(&[0]).unwrap();
f.seek(SeekFrom::Start(0)).unwrap();
let mut data = unsafe {
memmap::MmapOptions::new()
.map_mut(&f)
.expect("Could not access data from memory mapped file")
};
data[..src.len()].copy_from_slice(src.as_bytes());
}
Note that this is still possible for this code to lead to undefined behavior. Since the slice is backed by a file, the contents of the file (and thus the slice) may change from outside of the Rust program, breaking the invariants that the unsafe block is supposed to hold. The programmer needs to ensure that the file doesn't change during the lifetime of the map. Unfortunately, the crate itself does not provide much assistance to prevent this from happening or even any documentation warning the user.
If you wish to use lower-level system calls, you are missing two main parts:
mmap doesn't allocate any space on its own, so you need to set some space in the file. Without this, I get Illegal instruction: 4 when running on macOS.
MemoryMap (was) private by default so you need to mark the mapping as public so that changes are written back to the file (I'm assuming you want the writes to be saved). Without this, the code runs, but the file is never changed.
Here's a version that works for me:
use libc; // 0.2.67
use std::{
fs::OpenOptions,
io::{Seek, SeekFrom, Write},
os::unix::prelude::AsRawFd,
ptr,
};
fn main() {
let src = "Hello!";
let size = 1024 * 1024;
let mut f = OpenOptions::new()
.read(true)
.write(true)
.create(true)
.open("test.mmap")
.expect("Unable to open file");
// Allocate space in the file first
f.seek(SeekFrom::Start(size as u64)).unwrap();
f.write_all(&[0]).unwrap();
f.seek(SeekFrom::Start(0)).unwrap();
// This refers to the `File` but doesn't use lifetimes to indicate
// that. This is very dangerous, and you need to be careful.
unsafe {
let data = libc::mmap(
/* addr: */ ptr::null_mut(),
/* len: */ size,
/* prot: */ libc::PROT_READ | libc::PROT_WRITE,
// Then make the mapping *public* so it is written back to the file
/* flags: */ libc::MAP_SHARED,
/* fd: */ f.as_raw_fd(),
/* offset: */ 0,
);
if data == libc::MAP_FAILED {
panic!("Could not access data from memory mapped file")
}
ptr::copy_nonoverlapping(src.as_ptr(), data as *mut u8, src.len());
}
}
Up to date version:
use std::ptr;
use std::fs;
use std::io::{Write, SeekFrom, Seek};
use std::os::unix::prelude::AsRawFd;
use mmap::{MemoryMap, MapOption};
// from crates.io
extern crate mmap;
extern crate libc;
fn main() {
let size: usize = 1024*1024;
let mut f = fs::OpenOptions::new().read(true)
.write(true)
.create(true)
.open("test.mmap")
.unwrap();
// Allocate space in the file first
f.seek(SeekFrom::Start(size as u64)).unwrap();
f.write_all(&[0]).unwrap();
f.seek(SeekFrom::Start(0)).unwrap();
let mmap_opts = &[
// Then make the mapping *public* so it is written back to the file
MapOption::MapNonStandardFlags(libc::consts::os::posix88::MAP_SHARED),
MapOption::MapReadable,
MapOption::MapWritable,
MapOption::MapFd(f.as_raw_fd()),
];
let mmap = MemoryMap::new(size, mmap_opts).unwrap();
let data = mmap.data();
if data.is_null() {
panic!("Could not access data from memory mapped file")
}
let src = "Hello!";
let src_data = src.as_bytes();
unsafe {
ptr::copy(src_data.as_ptr(), data, src_data.len());
}
}
2022-version:
use memmap2::Mmap;
use std::fs::{self};
use std::io::{Seek, SeekFrom, Write};
use std::ops::DerefMut;
pub fn memmap2() {
// How to write to a file using mmap
// First open the file with writing option
let mut file = fs::OpenOptions::new()
.read(true)
.write(true)
.create(true)
.open("mmap_write_example2.txt")
.unwrap();
// Allocate space in the file for the data to be written,
// UTF8-encode string to get byte slice.
let data_to_write: &[u8] =
"Once upon a midnight dreary as I pondered weak and weary; äåößf\n".as_bytes();
let size: usize = data_to_write.len();
file.seek(SeekFrom::Start(size as u64 - 1)).unwrap();
file.write_all(&[0]).unwrap();
file.seek(SeekFrom::Start(0)).unwrap();
// Then write to the file
let mmap = unsafe { Mmap::map(&file).unwrap() };
let mut mut_mmap = mmap.make_mut().unwrap();
mut_mmap.deref_mut().write_all(data_to_write).unwrap();
}

Resources