Syn read file and add line to function - rust

I'm trying to parse a file with syn and add a line to the single function in it. However, the file appears unmodified when I write it back out. I'm fairly sure I don't fully understand proc macros and am using them wrong.
In my Cargo.toml I define a lib and bin like so:
[lib]
name = "gen"
path = "src/gen.rs"
proc-macro = true
[[bin]]
name = "main"
path = "src/main.rs"
In my gen.rs file, I define a macro to take in the input, get the function and modify it like so:
use proc_macro::TokenStream;
use quote::quote;
#[proc_macro]
pub fn gen(input: TokenStream) -> TokenStream {
let item = syn::parse(input.clone());
match item {
Ok(mut v) => {
let fn_item = match &mut v {
syn::Item::Fn(fn_item) => fn_item,
_ => panic!("expected fn"),
};
fn_item.block.stmts.insert(
0,
syn::parse(quote!(println!("count me in");).into()).unwrap(),
);
use quote::ToTokens;
return v.into_token_stream().into();
}
Err(error) => {
println!("{:?}", error);
return input;
}
};
}
Now in my main.rs file, I read the file, convert it to a TokenStream, and use my macro on it and write out the output to a file:
fn main() {
if let Err(error) = try_main() {
let _ = writeln!(io::stderr(), "{}", error);
process::exit(1);
}
}
fn try_main() -> Result<(), Error> {
let mut args = env::args_os();
let _ = args.next(); // executable name
let filepath = PathBuf::from("./src/file-to-parse.rs");
let code = fs::read_to_string(&filepath).map_err(Error::ReadFile)?;
let syntax = syn::parse_file(&code).map_err({
|error| Error::ParseFile {
error,
filepath,
source_code: code,
}
})?;
let mut token_stream = TokenStream::new();
syntax.to_tokens(&mut token_stream);
let file_contents_updated = gen::gen!(&token_stream);
std::fs::write("./src/file-updated.rs", file_contents_updated.to_string());
Ok(())
}
Running this, my output file looks the same as the input. For reference, my input file looks like:
fn init() {
println!("Hello, world!");
}

Yes, you've misunderstood what proc macros do.
gen::gen!(&token_stream) invokes gen!() at compile time with the literal tokens & token_stream, not with the runtime value of that variable. Since those two tokens don't look like a function, syn fails to parse them, and your code runs println!("{:?}", error); return input; (which, by the way, is a bad idea in a proc macro: a parsing failure should abort compilation. Use return error.into_compile_error().into() instead). So the macro returns its input unchanged, the expression evaluates to &token_stream, and the output is the same as the input.
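You can use syn and quote for general-purpose code generation, but you should not use proc macros for that. Rather, use them as libraries and call an ordinary function at runtime: gen::gen(token_stream) instead of gen::gen!(&token_stream). For that, drop the #[proc_macro] attribute and work with proc_macro2::TokenStream, because the proc_macro types are only usable while the compiler is expanding a macro. The function can then also live in the same crate as main.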

Related

How do I conditionally import modules and add instances of struct to vec!, only when the module (and struct) exists?

I can't figure out how to do import- and instancing-lines such that they tolerate non-existing files/modules and structs.
I tried making a macro that unwraps into such lines based on what files it finds in the directory, using a crate I found that had promise - include_optional - which allows to check for existence of files already at compile-time (since it's a macro).
However, I can't figure out how to use it properly in a macro, neither did I manage to use it without macro using the example at bottom of the docs conditional compilation chapter.
if cfg!(unix) { "unix" } else if cfg!(windows) { "windows" } else { "unknown" } (from the docs)
vs
if include_optional::include_bytes_optional!("day1.rs").is_some() { Some(&day01::Day01 {}) } else { None } // assume day1.rs and thus Day01 are non-existent (my attempt at doing same thing)
My if-statement compiles both cases, including the unreachable code (causing a compilation error), even though according to the docs it supposedly doesn't for cfg! ("conditional compilation").
Essentially, what I want is something of this form:
// Macro to generate code based on how many files/structs has been created
// There are anywhere between 1-25 days
get_days_created!;
/* // Let's assume 11 have been created so far, then the macro should evaluate to this code:
* mod day1;
* use day1 as day0#;
* // ...
* mod day11;
* use day11 as day11;
*
* // ...
* fn main() -> Result<(), i32> {
* let days : Vec<&dyn Day> = vec![
* &day01::Day01 {},
* // ...
* &day11::Day11 {},
* ];
* // ...
* }
*/
The solution is to create a proc_macro. These work like regular macros, except that you write them as ordinary Rust functions that run at compile time: the function is given a TokenStream (the tokens passed to the macro) and returns a TokenStream (the tokens the macro should expand to).
To create a proc_macro, the first and most important thing to know is that you can't define one just anywhere. You need to create a new library crate and set proc-macro = true in its Cargo.toml file. Then you can declare the macros in its lib.rs. An example TOML would look something like this:
[package]
name = "some_proc_macro_lib"
version = "0.1.0"
edition = "2021"
[lib]
proc-macro = true
[dependencies]
glob = "0.3.0"
regex = "1.7.0"
Then you can create your macros in this library as regular functions, with the #[proc_macro] attribute/annotation. Here's an example lib.rs with as few dependencies as possible. For my exact question, the input TokenStream is irrelevant and can be ignored, and instead you want to generate and return a new one:
use proc_macro::TokenStream;
use glob::glob;
use regex::Regex;
#[proc_macro]
pub fn import_days(_: TokenStream) -> TokenStream {
let mut stream = TokenStream::new();
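// \D+ rather than .+ so a greedy prefix cannot swallow leading digits (with .+, day11 would capture only "1")
let re = Regex::new(r"\D+(\d+)").unwrap();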
for entry in glob("./src/day*.rs").expect("Failed to read pattern") {
if let Ok(path) = entry {
let prefix = path.file_stem().unwrap().to_str().unwrap();
let caps = re.captures(prefix);
if let Some(caps) = caps {
let n: u32 = caps.get(1).unwrap().as_str().parse().unwrap();
let day = &format!("{}", prefix);
let day_padded = &format!("day{:0>2}", n);
stream.extend(format!("mod {};", day).parse::<TokenStream>().unwrap());
if n < 10 {
stream.extend(format!("use {} as {};", day, day_padded).parse::<TokenStream>().unwrap());
}
}
}
}
return proc_macro::TokenStream::from(stream);
}
The question could be considered answered with this already, but the answer can and should be further expanded on in my opinion. And as such I will do so.
Some additional explanations and suggestions, beyond the scope of the question
There are however quite a few other crates beside proc_macro that can aid you with both parsing the input stream, and building the output one. Of note are the dependencies syn and quote, and to aid them both there's the crate proc_macro2.
The syn crate
With syn you get helpful types, methods and macros for parsing the input TokenStream. Essentially, with a struct Foo implementing syn::parse::Parse and the macro call let foo = syn::parse_macro_input!(input as Foo), you can much more easily parse the input into a custom struct thanks to syn::parse::ParseStream. An example would be something like this:
use proc_macro2::Ident;
use syn;
use syn::parse::{Parse, ParseStream};
#[derive(Debug, Default)]
struct Foo {
idents: Vec<Ident>,
}
impl syn::parse::Parse for Foo {
fn parse(input: syn::parse::ParseStream) -> syn::Result<Self> {
let mut foo= Foo::default();
while !input.is_empty() {
let fn_ident = input.parse::<Ident>()?;
foo.idents.push(fn_ident);
// Optional comma: Ok vs Err doesn't matter. Just consume if it exists and ignore failures.
input.parse::<syn::token::Comma>().ok();
}
return Ok(foo);
}
}
Note that the syn::Result return-type allows for nice propagation of parsing-errors when using the sugary ? syntax: input.parse::<SomeType>()?
The quote crate
With quote you get a helpful macro for generating a tokenstream more akin to how macro_rules does it. As an argument you write essentially regular code, and tell it to use the value of variables by prefixing with #.
Do note that you can't just pass it variables containing strings and expect it to expand into identifiers, as strings resolve to the value "foo" (quotes included). ie. mod "day1"; instead of mod day1;. You need to turn them into either:
a proc_macro2::Ident
syn::Ident::new(foo_str, proc_macro2::Span::call_site())
or a proc_macro2::TokenStream
foo_str.parse::<TokenStream>().unwrap()
The latter also allows to convert longer strings with more than a single Ident, and manages things such as literals etc., making it possible to skip the quote! macro entirely and just use this tokenstream directly (as seen in import_days).
Here's an example that creates a struct with dynamic name, and implements a specific trait for it:
use proc_macro2::TokenStream;
use quote::quote;
// ...
let mut stream = TokenStream::new();
stream.extend(quote!{
#[derive(Debug)]
pub struct #day_padded_upper {}
impl Day for #day_padded_upper {
#trait_parts
}
});
return proc_macro::TokenStream::from(stream);
Finally, on how to implement my question
This 'chapter' is a bit redundant, as I essentially answered the question with the first two code snippets (.toml and fn import_days), and the rest could have been considered an exercise for the reader. However, while the question is about reading the filesystem at compile time in a macro to 'dynamically' change its expansion (sort of), I wrote it in a more general form asking how to achieve a specific result (as old me didn't know macros could do that). So for completeness I'll include this 'chapter' nevertheless.
There is also the fact that the last macro in this 'chapter' - impl_day (which wasn't mentioned at all in the question) - serves as a good example of how to achieve two adjacent but important and relevant tasks.
Retrieving and using call-site's filename.
Parsing the input TokenStream using the syn dependency as shown above.
In other words: knowing all the above, this is how you can create macros for importing all targeted files, instantiating structs for all targeted files, as well as to declare + define the struct from current file's name.
Importing all targeted files:
See import_days above at the start.
Instantiating Vec with structs from all targeted files:
#[proc_macro]
pub fn instantiate_days(_: proc_macro::TokenStream) -> proc_macro::TokenStream {
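// \D+ rather than .+ so a greedy prefix cannot swallow leading digits (with .+, day11 would capture only "1")
let re = Regex::new(r"\D+(\d+)").unwrap();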
let mut stream = TokenStream::new();
let mut block = TokenStream::new();
for entry in glob("./src/day*.rs").expect("Failed to read pattern") {
match entry {
Ok(path) => {
let prefix = path.file_stem().unwrap().to_str().unwrap();
let caps = re.captures(prefix);
if let Some(caps) = caps {
let n: u32 = caps.get(1).unwrap().as_str().parse().unwrap();
let day_padded = &format!("day{:0>2}", n);
let day_padded_upper = &format!("Day{:0>2}", n);
let instance = &format!("&{}::{} {{}}", day_padded, day_padded_upper).parse::<TokenStream>().unwrap();
block.extend(quote!{
v.push( #instance );
});
}
},
Err(e) => println!("{:?}", e),
}
}
stream.extend(quote!{
{
let mut v: Vec<&dyn Day> = Vec::new();
#block
v
}
});
return proc_macro::TokenStream::from(stream);
}
Declaring and defining struct for current file invoking this macro:
#[derive(Debug, Default)]
struct DayParser {
parts: Vec<Ident>,
}
impl Parse for DayParser {
fn parse(input: ParseStream) -> syn::Result<Self> {
let mut day_parser = DayParser::default();
while !input.is_empty() {
let fn_ident = input.parse::<Ident>()?;
// Optional, Ok vs Err doesn't matter. Just consume if it exists.
input.parse::<syn::token::Comma>().ok();
day_parser.parts.push(fn_ident);
}
return Ok(day_parser);
}
}
#[proc_macro]
pub fn impl_day(input: proc_macro::TokenStream) -> proc_macro::TokenStream {
let mut stream = TokenStream::new();
let span = Span::call_site();
let binding = span.source_file().path();
let file = binding.to_str().unwrap();
// Escape the dot and anchor the end so only names ending in ".rs" match
let re = Regex::new(r".*day(\d+)\.rs$").unwrap();
let caps = re.captures(file);
if let Some(caps) = caps {
let n: u32 = caps.get(1).unwrap().as_str().parse().unwrap();
let day_padded_upper = format!("Day{:0>2}", n).parse::<TokenStream>().unwrap();
let day_parser = syn::parse_macro_input!(input as DayParser);
let mut trait_parts = TokenStream::new();
for (k, fn_ident) in day_parser.parts.into_iter().enumerate() {
let k = k+1;
let trait_part_ident = format!("part_{}", k).parse::<TokenStream>().unwrap();
// let trait_part_ident = proc_macro::Ident::new(format!("part_{}", k).as_str(), span);
trait_parts.extend(quote!{
fn #trait_part_ident(&self, input: &str) -> Result<String, ()> {
return Ok(format!("Part {}: {:?}", #k, #fn_ident(input)));
}
});
}
stream.extend(quote!{
#[derive(Debug)]
pub struct #day_padded_upper {}
impl Day for #day_padded_upper {
#trait_parts
}
});
} else {
// Emit a compile error at the call site instead of silently generating nothing.
// quote! interpolates a String as a string literal, which is what compile_error! expects.
let msg = format!("Tried to implement Day for a file with malformed name: file = \"{}\" , re = \"{:?}\"", file, re);
stream.extend(quote!{ compile_error!(#msg); });
}
return proc_macro::TokenStream::from(stream);
}

How to read a text file in Rust and read multiple values per line

So basically, I have a text file with the following syntax:
String int
String int
String int
I have an idea how to read the Values if there is only one entry per line, but if there are multiple, I do not know how to do it.
In Java, I would do something simple with while and Scanner but in Rust I have no clue.
I am fairly new to Rust so please help me.
Thanks for your help in advance
Solution
Here is my modified version of @netwave's code:
use std::fs;
use std::io::{BufRead, BufReader, Error};
fn main() -> Result<(), Error> {
let file = "./data.txt"; // assumed path, adjust as needed
let buff_reader = BufReader::new(fs::File::open(file)?);
for line in buff_reader.lines() {
let parsed = sscanf::scanf!(line?, "{} {}", String, i32);
println!("{:?}\n", parsed);
}
Ok(())
}
You can use the BufRead trait, which has a read_line method. You can also use lines.
The easiest way to do so is to wrap the File instance in a BufReader:
use std::fs;
use std::io::{BufRead, BufReader};
...
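// read_line takes &mut self, so the reader must be mutable
let mut buff_reader = BufReader::new(fs::File::open(path)?);
loop {
let mut buff = String::new();
// read_line returns Ok(0) once the end of the file is reached
if buff_reader.read_line(&mut buff)? == 0 {
break;
}
println!("{}", buff);
}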
Once you have each line you can easily use sscanf crate to parse the line to the types you need:
let parsed = sscanf::scanf!(buff, "{} {}", String, i32);
Based on: https://doc.rust-lang.org/rust-by-example/std_misc/file/read_lines.html
For data.txt to contain:
str1 100
str2 200
str3 300
use std::fs::File;
use std::io::{self, BufRead};
use std::path::Path;
fn main() {
// The file data.txt must exist in the current path before this produces output
if let Ok(lines) = read_lines("./data.txt") {
// Consumes the iterator, returns an (Optional) String
for line in lines {
if let Ok(data) = line {
let values: Vec<&str> = data.split(' ').collect();
match values.len() {
2 => {
let strdata = values[0].parse::<String>();
let intdata = values[1].parse::<i32>();
println!("Got: {:?} {:?}", strdata, intdata);
},
_ => panic!("Invalid input line {}", data),
};
}
}
}
}
// The output is wrapped in a Result to allow matching on errors
// Returns an Iterator to the Reader of the lines of the file.
fn read_lines<P>(filename: P) -> io::Result<io::Lines<io::BufReader<File>>>
where P: AsRef<Path>, {
let file = File::open(filename)?;
Ok(io::BufReader::new(file).lines())
}
Outputs:
Got: Ok("str1") Ok(100)
Got: Ok("str2") Ok(200)
Got: Ok("str3") Ok(300)
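As a variation on the answers above, here is a minimal sketch that uses only the standard library and turns each line into a typed (String, i32) pair instead of printing the intermediate Result values. The helper names parse_line and parse_all are made up for this example, and the input is an in-memory Cursor standing in for data.txt, so this is one possible shape rather than the only way to do it.

```rust
use std::io::{self, BufRead, Cursor};

/// Parses one "name number" line; returns None if the line is malformed.
fn parse_line(line: &str) -> Option<(String, i32)> {
    let mut parts = line.split_whitespace();
    let name = parts.next()?.to_string();
    let number = parts.next()?.parse::<i32>().ok()?;
    // Reject lines that carry extra fields after the number.
    if parts.next().is_some() {
        return None;
    }
    Some((name, number))
}

/// Reads every line from `reader`, skipping blank lines and failing on malformed ones.
fn parse_all<R: BufRead>(reader: R) -> io::Result<Vec<(String, i32)>> {
    let mut entries = Vec::new();
    for line in reader.lines() {
        let line = line?;
        if line.trim().is_empty() {
            continue;
        }
        let entry = parse_line(&line).ok_or_else(|| {
            io::Error::new(io::ErrorKind::InvalidData, format!("invalid line: {}", line))
        })?;
        entries.push(entry);
    }
    Ok(entries)
}

fn main() -> io::Result<()> {
    // In-memory stand-in for data.txt; for a real file, pass
    // BufReader::new(File::open(path)?) instead.
    let input = Cursor::new("str1 100\nstr2 200\nstr3 300\n");
    for (name, number) in parse_all(input)? {
        println!("{} {}", name, number);
    }
    Ok(())
}
```

split_whitespace tolerates tabs and repeated spaces, which split(' ') does not, and returning an io::Error for a bad line lets the caller decide what to do instead of panicking.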

Is there a way to write a function that uses odbc::Statement in a loop?

I have working example of a simple loop (mostly taken from the odbc crate's example):
use std::io;
use odbc::*;
use odbc_safe::AutocommitOn;
fn main(){
let env = create_environment_v3().map_err(|e| e.unwrap()).unwrap();
let conn = env.connect_with_connection_string(CONN_STRING).unwrap();
let mut stmt = Statement::with_parent(&conn).unwrap();
loop {
let mut sql_text = String::new();
println!("Please enter SQL statement string: ");
io::stdin().read_line(&mut sql_text).unwrap();
stmt = match stmt.exec_direct(&sql_text).unwrap() {
Data(mut stmt) => {
let cols = stmt.num_result_cols().unwrap();
while let Some(mut cursor) = stmt.fetch().unwrap() {
for i in 1..(cols + 1) {
match cursor.get_data::<&str>(i as u16).unwrap() {
Some(val) => print!(" {}", val),
None => print!(" NULL"),
}
}
println!();
}
stmt.close_cursor().unwrap()
}
NoData(stmt) => {println!("Query executed, no data returned"); stmt}
}
}
}
I don't want to create new Statements for each query, as I just can .close_cursor().
I'd like to extract the loop's body to a function, like this:
fn exec_stmt(stmt: Statement<Allocated, NoResult, AutocommitOn>) {
//loop's body here
}
But I just can't! The .exec_direct() method consumes my Statement and returns another. I tried different ways to pass the Statement argument to the function (borrow, RefCell, etc.), but they all fail when used in a loop. I am still new to Rust, so most likely I just don't know something. Or does .exec_direct's consumption of the Statement make it impossible?
There's no nice way to move and then move back values through parameters. It's probably best to copy what .exec_direct does and just make the return type of your function a statement as well.
The usage would then look like this:
let mut stmt = Statement::with_parent(&conn).unwrap();
loop {
stmt = exec_stmt(stmt);
}
and your function signature would be:
fn exec_stmt(stmt: Statement<...>) -> Statement<...> {
match stmt.exec_direct() {
...
}
}
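The move-in, move-out shape can be tried without a database. The sketch below uses a made-up Statement type, not the odbc crate's API: the names Statement, StatementWithCursor, ExecResult and the executed counter are assumptions for illustration. Its exec_direct consumes self the way the real method does.

```rust
// Hypothetical stand-ins for a typestate API whose methods consume `self`.
struct Statement {
    executed: u32,
}

struct StatementWithCursor {
    inner: Statement,
    rows: Vec<String>,
}

enum ExecResult {
    Data(StatementWithCursor),
    NoData(Statement),
}

impl Statement {
    fn new() -> Self {
        Statement { executed: 0 }
    }

    // Takes `self` by value, so the caller loses the statement until it is handed back.
    fn exec_direct(mut self, sql: &str) -> ExecResult {
        self.executed += 1;
        if sql.trim_start().to_uppercase().starts_with("SELECT") {
            let rows = vec![format!("result of {}", sql.trim())];
            ExecResult::Data(StatementWithCursor { inner: self, rows })
        } else {
            ExecResult::NoData(self)
        }
    }
}

impl StatementWithCursor {
    // Closing the cursor gives the plain statement back.
    fn close_cursor(self) -> Statement {
        self.inner
    }
}

// The loop body as a function: take the statement by value and return it again.
fn exec_stmt(stmt: Statement, sql: &str) -> Statement {
    match stmt.exec_direct(sql) {
        ExecResult::Data(with_cursor) => {
            for row in &with_cursor.rows {
                println!("{}", row);
            }
            with_cursor.close_cursor()
        }
        ExecResult::NoData(stmt) => {
            println!("Query executed, no data returned");
            stmt
        }
    }
}

fn main() {
    let mut stmt = Statement::new();
    for sql in ["SELECT 1", "DELETE FROM t"] {
        stmt = exec_stmt(stmt, sql);
    }
    println!("executed {} statements", stmt.executed);
}
```

Because ownership travels through the return value, the borrow checker can see that the caller always ends up with a usable statement again.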
I probably wouldn't recommend this, but if you really wanted to get it to work you could use Option and the .take() method.
fn exec_stmt(some_stmt: &mut Option<Statement<...>>) {
let stmt = some_stmt.take().unwrap();
// do stuff ...
some_stmt.replace(stmt);
}
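For comparison, a self-contained sketch of the Option and take() variant, again with a hypothetical Stmt type standing in for the real statement (the runs field and exec method are assumptions for illustration):

```rust
// Hypothetical consuming API: `exec` takes the value by move and returns a new one.
#[derive(Debug, PartialEq)]
struct Stmt {
    runs: u32,
}

impl Stmt {
    fn exec(self) -> Stmt {
        Stmt { runs: self.runs + 1 }
    }
}

fn exec_stmt(slot: &mut Option<Stmt>) {
    // take() leaves None in the slot and hands us the owned value.
    let stmt = slot.take().expect("statement slot was empty");
    let stmt = stmt.exec();
    // Put the value back so the caller can reuse it on the next iteration.
    *slot = Some(stmt);
}

fn main() {
    let mut slot = Some(Stmt { runs: 0 });
    for _ in 0..3 {
        exec_stmt(&mut slot);
    }
    println!("{:?}", slot);
}
```

One reason to be wary of this pattern: if anything between take() and the write-back panics or returns early, the slot stays None and the next call fails at the expect.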
The odbc-safe crate tried to have each state transition of ODBC reflected in a different type. The odbc-api crate also tries to protect you from errors, but is a bit more subtle about it. Your use case would be covered by the Preallocated struct.
The analogous example from the odbc-api documentation looks like this:
use odbc_api::{Connection, Error};
use std::io::{self, stdin, Read};
fn interactive(conn: &Connection) -> io::Result<()>{
let mut statement = conn.preallocate().unwrap();
let mut query = String::new();
stdin().read_line(&mut query)?;
while !query.is_empty() {
match statement.execute(&query, ()) {
Err(e) => println!("{}", e),
Ok(None) => println!("No results set generated."),
Ok(Some(cursor)) => {
// ...print cursor contents...
},
}
// read_line appends to the buffer, so clear the previous query first
query.clear();
stdin().read_line(&mut query)?;
}
Ok(())
}
This will allow you to declare a function without any trouble:
use odbc_api::Preallocated;
fn exec_statement(stmt: &mut Preallocated) {
// loop's body here
}

Reading ZIP file in Rust causes data owned by the current function

I'm new to Rust and likely have a huge knowledge gap. Basically, I'm hoping to create a utility function that would accept a regular text file or a ZIP file and return a BufRead that the caller can process line by line. It is working well for non-ZIP files, but I don't understand how to achieve the same for ZIP files. The ZIP files will only contain a single file within the archive, which is why I'm only processing the first file in the ZipArchive.
I'm running into the following error.
error[E0515]: cannot return value referencing local variable `archive_contents`
--> src/file_reader.rs:30:9
|
27 | let archive_file: zip::read::ZipFile = archive_contents.by_index(0).unwrap();
| ---------------- `archive_contents` is borrowed here
...
30 | Ok(Box::new(BufReader::with_capacity(128 * 1024, archive_file)))
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ returns a value referencing data owned by the current function
It seems the archive_contents is preventing the BufRead object from returning to the caller. I'm just not sure how to work around this.
file_reader.rs
use std::ffi::OsStr;
use std::fs::File;
use std::io::BufRead;
use std::io::BufReader;
use std::path::Path;
pub struct FileReader {
pub file_reader: Result<Box<BufRead>, &'static str>,
}
pub fn file_reader(filename: &str) -> Result<Box<BufRead>, &'static str> {
let path = Path::new(filename);
let file = match File::open(&path) {
Ok(file) => file,
Err(why) => panic!(
"ERROR: Could not open file, {}: {}",
path.display(),
why.to_string()
),
};
if path.extension() == Some(OsStr::new("zip")) {
// Processing ZIP file.
let mut archive_contents: zip::read::ZipArchive<std::fs::File> =
zip::ZipArchive::new(file).unwrap();
let archive_file: zip::read::ZipFile = archive_contents.by_index(0).unwrap();
// ERRORS: returns a value referencing data owned by the current function
Ok(Box::new(BufReader::with_capacity(128 * 1024, archive_file)))
} else {
// Processing non-ZIP file.
Ok(Box::new(BufReader::with_capacity(128 * 1024, file)))
}
}
main.rs
mod file_reader;
use std::io::BufRead;
fn main() {
let mut files: Vec<String> = Vec::new();
files.push("/tmp/text_file.txt".to_string());
files.push("/tmp/zip_file.zip".to_string());
for f in files {
let mut fr = match file_reader::file_reader(&f) {
Ok(fr) => fr,
Err(e) => panic!("Error reading file."),
};
fr.lines().for_each(|l| match l {
Ok(l) => {
println!("{}", l);
}
Err(e) => {
println!("ERROR: Failed to read line:\n {}", e);
}
});
}
}
Any help is greatly appreciated!
It seems the archive_contents is preventing the BufRead object from returning to the caller. I'm just not sure how to work around this.
You have to restructure the code somehow. The issue here is that, well, the archive data is part of the archive. So unlike file, archive_file is not an independent item; it is rather a pointer of sorts into the archive itself. This means the archive needs to live at least as long as archive_file for this code to be correct.
In a GC'd language this isn't an issue, archive_file has a reference to archive and will keep it alive however long it needs. Not so for Rust.
A simple way to fix this would be to copy the data out of archive_file into an owned buffer that you can return to the caller. Another option might be to return a wrapper for (archive_contents, item_index), which would delegate the reading (might be somewhat tricky though). Yet another would be to not have file_reader at all.
Thanks to @Masklinn for the direction! Here's the working solution using their suggestion.
file_reader.rs
use std::ffi::OsStr;
use std::fs::File;
use std::io::BufRead;
use std::io::BufReader;
use std::io::Cursor;
use std::io::Error;
use std::io::Read;
use std::path::Path;
use zip::read::ZipArchive;
pub fn file_reader(filename: &str) -> Result<Box<dyn BufRead>, Error> {
let path = Path::new(filename);
let file = match File::open(&path) {
Ok(file) => file,
Err(why) => return Err(why),
};
if path.extension() == Some(OsStr::new("zip")) {
let mut archive_contents = ZipArchive::new(file)?;
let mut archive_file = archive_contents.by_index(0)?;
// Read the contents of the file into a vec.
let mut data = Vec::new();
archive_file.read_to_end(&mut data)?;
// Wrap vec in a std::io::Cursor.
let cursor = Cursor::new(data);
Ok(Box::new(cursor))
} else {
// Processing non-ZIP file.
Ok(Box::new(BufReader::with_capacity(128 * 1024, file)))
}
}
While the solution you have settled on does work, it has a few disadvantages. One is that when you read from a zip file, you have to read the contents of the file you want to process into memory before proceeding, which might be impractical for a large file. Another is that you have to heap allocate the BufReader in either case.
Another possibly more idiomatic solution is to restructure your code, such that the BufReader does not need to be returned from the function at all - rather, structure your code so that it has a function that opens the file, which in turn calls a function that processes the file:
use std::ffi::OsStr;
use std::fs::File;
use std::io::BufRead;
use std::io::BufReader;
use std::path::Path;
pub fn process_file(filename: &str) -> Result<usize, String> {
let path = Path::new(filename);
let file = match File::open(&path) {
Ok(file) => file,
Err(why) => return Err(format!(
"ERROR: Could not open file, {}: {}",
path.display(),
why.to_string()
)),
};
if path.extension() == Some(OsStr::new("zip")) {
// Handling a zip file
let mut archive_contents=zip::ZipArchive::new(file).unwrap();
let mut buf_reader = BufReader::with_capacity(128 * 1024,archive_contents.by_index(0).unwrap());
process_reader(&mut buf_reader)
} else {
// Handling a plain file.
process_reader(&mut BufReader::with_capacity(128 * 1024, file))
}
}
pub fn process_reader(reader: &mut dyn BufRead) -> Result<usize, String> {
// Example, just count the number of lines
return Ok(reader.lines().count());
}
fn main() {
let mut files: Vec<String> = Vec::new();
files.push("/tmp/text_file.txt".to_string());
files.push("/tmp/zip_file.zip".to_string());
for f in files {
match process_file(&f) {
Ok(count) => println!("File {} Count: {}", &f, count),
Err(e) => println!("Error reading file: {}", e),
};
}
}
This way, you don't need any Boxes and you don't need to read the file into memory before processing it.
A drawback to this solution would arise if you had multiple functions that need to be able to read from zip files. One way to handle that would be to define process_file to take a callback function to do the processing. First you would change the definition of process_file to be:
pub fn process_file<C>(filename: &str, process_reader: C) -> Result<usize, String>
where C: FnOnce(&mut dyn BufRead)->Result<usize, String>
The rest of the function body can be left unchanged. Now, process_reader can be passed into the function, like this:
process_file(&f, count_lines)
where count_lines would be the original simple function to count the lines, for instance.
This would also allow you to pass in a closure:
process_file(&f, |reader| Ok(reader.lines().count()))
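To try the callback shape in isolation, here is a minimal standard-library sketch. It departs from the answer's process_file in two labeled ways: the hypothetical helper with_reader takes any Read source instead of a filename (so an in-memory Cursor can stand in for the file or zip entry), and it is generic over the result type T rather than fixed to usize.

```rust
use std::io::{BufRead, BufReader, Cursor, Read};

// Owns the reader for the duration of the call and lends it to the callback,
// so nothing borrowed ever has to be returned to the caller.
fn with_reader<R, C, T>(source: R, process: C) -> Result<T, String>
where
    R: Read,
    C: FnOnce(&mut dyn BufRead) -> Result<T, String>,
{
    let mut reader = BufReader::with_capacity(128 * 1024, source);
    process(&mut reader)
}

// A plain function can serve as the callback.
fn count_lines(reader: &mut dyn BufRead) -> Result<usize, String> {
    Ok(reader.lines().count())
}

fn main() {
    let lines = with_reader(Cursor::new("a\nb\nc\n"), count_lines).unwrap();
    println!("lines: {}", lines);

    // So can a closure, here computing the length of the longest line.
    let longest = with_reader(Cursor::new("ab\nabcd\n"), |reader| {
        Ok(reader
            .lines()
            .filter_map(Result::ok)
            .map(|line| line.len())
            .max()
            .unwrap_or(0))
    })
    .unwrap();
    println!("longest line: {}", longest);
}
```

The same structure carries over to the zip case: the archive and the entry borrowed from it both live inside the helper, and only the callback's owned result leaves it.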

How do I use include_str! for multiple files or an entire directory?

I would like to copy an entire directory to a location in a user's $HOME. Individually copying files to that directory is straightforward:
let contents = include_str!("resources/profiles/default.json");
let fpath = dpath.join(&fname);
fs::write(fpath, contents).expect(&format!("failed to create profile: {}", n));
I haven't found a way to adapt this to multiple files:
for n in ["default"] {
let fname = format!("{}{}", n, ".json");
let x = format!("resources/profiles/{}", fname).as_str();
let contents = include_str!(x);
let fpath = dpath.join(&fname);
fs::write(fpath, contents).expect(&format!("failed to create profile: {}", n));
}
...the compiler complains that x must be a string literal.
As far as I know, there are two options:
Write a custom macro.
Replicate the first code for each file I want to copy.
What is the best way of doing this?
I would create a build script that iterates through a directory, building up an array of tuples containing the name and another macro call to include the raw data:
use std::{
env,
error::Error,
fs::{self, File},
io::Write,
path::Path,
};
const SOURCE_DIR: &str = "some/path/to/include";
fn main() -> Result<(), Box<dyn Error>> {
let out_dir = env::var("OUT_DIR")?;
let dest_path = Path::new(&out_dir).join("all_the_files.rs");
let mut all_the_files = File::create(&dest_path)?;
writeln!(&mut all_the_files, r##"["##,)?;
for f in fs::read_dir(SOURCE_DIR)? {
let f = f?;
if !f.file_type()?.is_file() {
continue;
}
writeln!(
&mut all_the_files,
r##"("{name}", include_bytes!(r#"{name}"#)),"##,
name = f.path().display(),
)?;
}
writeln!(&mut all_the_files, r##"]"##,)?;
Ok(())
}
This has some weaknesses, namely that it requires the path to be expressible as a &str. Since you were already using include_str!, I don't think that's an extra requirement. This also means that the generated string has to be a valid Rust string. We use raw strings inside the generated file, but this can still fail if a filename were to contain the string "#. A better solution would probably use str::escape_default.
Since we are including files, I used include_bytes! instead of include_str!, but you can switch back if you really need to. Using raw bytes skips UTF-8 validation at compile time, so it's a small win.
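A small sketch of what the str::escape_default suggestion could look like. The helper names rust_string_literal and tuple_entry are hypothetical and not part of the build script above; the idea is to emit an ordinary escaped string literal instead of a raw string, so quotes, hashes and backslashes in a path cannot terminate the literal early.

```rust
/// Renders `s` as a Rust string literal, escaping quotes, backslashes,
/// control characters and non-ASCII characters.
fn rust_string_literal(s: &str) -> String {
    format!("\"{}\"", s.escape_default())
}

/// Builds one tuple entry in the style of the generated all_the_files.rs.
fn tuple_entry(path: &str) -> String {
    let lit = rust_string_literal(path);
    format!("({lit}, include_bytes!({lit})),", lit = lit)
}

fn main() {
    // A name containing "# would break the r#"..."# form used in the build script.
    println!("{}", tuple_entry("assets/we\"#ird.json"));
    // Backslashes (for example in Windows paths) are doubled.
    println!("{}", tuple_entry(r"C:\data\file.bin"));
}
```

In the build script, the writeln! call would then write tuple_entry(...) for each path instead of formatting the raw-string template.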
Using it involves importing the generated value:
const ALL_THE_FILES: &[(&str, &[u8])] = &include!(concat!(env!("OUT_DIR"), "/all_the_files.rs"));
fn main() {
for (name, data) in ALL_THE_FILES {
println!("File {} is {} bytes", name, data.len());
}
}
See also:
How can I locate resources for testing with Cargo?
You can use the include_dir macro from the include_dir crate.
use include_dir::{include_dir, Dir};
use std::path::Path;
const PROJECT_DIR: Dir = include_dir!(".");
// of course, you can retrieve a file by its full path
let lib_rs = PROJECT_DIR.get_file("src/lib.rs").unwrap();
// you can also inspect the file's contents
let body = lib_rs.contents_utf8().unwrap();
assert!(body.contains("SOME_INTERESTING_STRING"));
Using a macro:
macro_rules! incl_profiles {
( $( $x:expr ),* ) => {
{
let mut profs = Vec::new();
$(
profs.push(($x, include_str!(concat!("resources/profiles/", $x, ".json"))));
)*
profs
}
};
}
...
let prof_tups: Vec<(&str, &str)> = incl_profiles!("default", "python");
for (prof_name, prof_str) in prof_tups {
let fname = format!("{}{}", prof_name, ".json");
let fpath = dpath.join(&fname);
fs::write(fpath, prof_str).expect(&format!("failed to create profile: {}", prof_name));
}
Note: This is not dynamic. The files ("default" and "python") are specified in the call to the macro.
Updated: Use Vec instead of HashMap.

Resources