I would like to compile a Rust program and then embed the contents of the compiled binary in a second Rust program as a variable, so that the second program can write it out to a file. For example, if I have an executable that prints hello world, I would like to copy it into my second Rust program as a variable and have that program write it to a file. Basically, an executable that creates an executable. I don't know if such a thing is possible, but if it is, I would like to know how it's done.
Rust provides macros for loading files as static variables at compile time. You might have to make sure they get compiled in the correct order, but it should work for your use case.
// The compiler will read the file and embed its contents in the
// resulting binary (assuming it doesn't get optimized out)
let file_contents: &'static [u8] = include_bytes!("../../target/release/foo.exe");
// There is also a version that loads the file as a string, but you likely don't want this version
let string_file_contents: &'static str = include_str!("data.txt");
So you can put this together to create a function for it.
use std::io::{self, Write};
use std::fs::File;
use std::path::Path;
/// Save the foo executable as `foo.exe` inside the given directory.
pub fn save_foo<P: AsRef<Path>>(path: P) -> io::Result<()> {
    let mut file = File::create(path.as_ref().join("foo.exe"))?;
    file.write_all(include_bytes!("foo.exe"))
}
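Note that on Unix-like systems a freshly created file is not executable, so the written copy may also need its permission bits set. A minimal sketch of that idea follows; the helper name write_executable and the 0o755 mode are my own choices for illustration, not part of the answer above:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Write `bytes` to `path` and, on Unix, mark the file as executable.
/// (Hypothetical helper; name and mode are assumptions for this sketch.)
pub fn write_executable(path: &Path, bytes: &[u8]) -> io::Result<()> {
    fs::write(path, bytes)?;
    #[cfg(unix)]
    {
        use std::os::unix::fs::PermissionsExt;
        // rwxr-xr-x; creating the file alone leaves it non-executable.
        fs::set_permissions(path, fs::Permissions::from_mode(0o755))?;
    }
    Ok(())
}
```

It could then be called as write_executable(Path::new("foo"), include_bytes!("../../target/release/foo.exe")), assuming that path points at the already-built binary.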
References:
https://doc.rust-lang.org/std/macro.include_bytes.html
https://doc.rust-lang.org/std/macro.include_str.html
Related
I want a function that mutates the underlying String of a PathBuf object, but all I could achieve now is creation of a new object instead.
use std::path::{Path, PathBuf};
fn expanduser(path: &mut PathBuf) -> PathBuf {
    return PathBuf::from(
        &path
            .to_str()
            .unwrap()
            .replace("~", PathBuf::home().to_str().unwrap()),
    );
}
PathBuf wraps an OsString, not a String. They are very different types: String contains a UTF-8 string, while OsString depends on the platform (arbitrary bytes on Unix and potentially malformed UTF-16 on Windows).
You can use into_os_string to convert a PathBuf to an OsString, and From for the reverse.
If you are just trying to replace ~ with the home path, your best bet is to check if the first component (via the components method) is a Normal component containing "~" and join the rest of the components to the home path if so. There are crates that do this for you.
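A rough sketch of the component-based approach, using only the standard library. The function name expand_tilde is made up for this example, and the home directory is passed in by the caller, because how you look it up (environment variable, a crate, ...) is a separate decision:

```rust
use std::ffi::OsStr;
use std::path::{Component, Path, PathBuf};

/// Replace a leading `~` component with `home`; any other path is returned unchanged.
/// (Hypothetical helper; `~user` style prefixes are deliberately not handled.)
fn expand_tilde(path: &Path, home: &Path) -> PathBuf {
    let mut components = path.components();
    match components.next() {
        // Only a first component that is exactly "~" gets expanded.
        Some(Component::Normal(first)) if first == OsStr::new("~") => {
            // `as_path` yields whatever is left after the consumed "~".
            home.join(components.as_path())
        }
        _ => path.to_path_buf(),
    }
}
```

Because this works on components rather than on a string, a ~ that appears later in the path (for example /tmp/~) is left alone, and no UTF-8 conversion is needed.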
I hate that function, cause if the path I'm canonicalizing doesn't exist, it throws an error instead of gracefully doing what it should do. No one asked it to check existence.
You're likely misunderstanding the function. canonicalize resolves symlinks, so of course it won't work if the path doesn't exist. Also worth mentioning is that foo/bar/../baz is NOT necessarily the same as foo/baz, if foo/bar is a symlink.
I tried to compile the following program:
use std::io;
fn main() {
    io::stdout().write(b"Please enter your name: ");
    io::stdout().flush();
}
Unfortunately, the compiler refused:
error: no method named `write` found for type `std::io::Stdout` in the current scope
 --> hello.rs:4:18
  |
4 |     io::stdout().write(b"Please enter your name: ");
  |                  ^^^^^
  |
  = help: items from traits can only be used if the trait is in scope; the following trait is implemented but not in scope, perhaps add a `use` for it:
  = help: candidate #1: `use std::io::Write`
I found that I needed to write use std::io::{self, Write};. What does use std::io; actually do, then, and how do I (if possible) pull in all names defined in std::io? Also, would that be bad style?
What does use std::io; actually do then?
It does what every use-statement does: makes the last part of the used path directly available (pulling it into the current namespace). That means that you can write io and the compiler knows that you mean std::io.
How do I pull all names defined in std::io?
With use std::io::*;. This is commonly referred to as a glob import.
Also, would it be a bad style?
Yes, it would. Usually you should avoid glob imports. They can be handy in certain situations, but they cause trouble in most cases. For example, there is also the trait std::fmt::Write, so importing everything from both fmt and io could result in name clashes. Rust values explicitness over implicitness, so it is better to avoid glob imports.
However, there is one type of module that is usually used with a glob-import: preludes. And in fact, there is even a std::io::prelude which reexports important symbols. See the documentation for more information.
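To make this concrete, here is a small sketch of the program's logic with the trait imported explicitly. The helper name prompt is hypothetical; it takes any writer, so it can be pointed at io::stdout() or at an in-memory buffer:

```rust
use std::io::{self, Write};

/// Write a prompt and flush it so it shows up before input is read.
/// `write_all` and `flush` are methods of the `Write` trait, which is why
/// the trait has to be in scope, not just the `io` module.
fn prompt<W: Write>(out: &mut W, text: &str) -> io::Result<()> {
    out.write_all(text.as_bytes())?;
    out.flush()
}
```

Calling prompt(&mut io::stdout(), "Please enter your name: ") reproduces the original program. Importing std::io::prelude::* alongside use std::io; would work as well, since the prelude re-exports Write.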
I'm working in Rust and using a shared library written in C++. The problem is that the C++ library spawns a few threads that constantly print to stdout (1), and that interferes with my own logging.
I was able to duplicate stdout using dup to fd = 3. Then I open a pipe (4, 5) and use dup2 to move old stdout to one end of the pipe.
As a result:
C++ library writes to fd = 1 (old stdout), but that goes to another pipe where I can capture the data in another thread (say I read from fd = 5). Then I can parse those logs and print them to the console.
In Rust code I can use libc::write to fd = 3 and that will go directly to the console.
The problem now is that standard Rust macros such as println! will still try to write to fd = 1. I'd like to change the default behavior so that Rust code writes to fd = 3 instead of 1. That way, any Rust print-related function will print to the console, and everything from the shared library will be parsed on a separate thread.
Is that possible to do in stable Rust? Closest thing I found is set_print function which looks like it's unstable and I couldn't even use it using +nightly build.
If it's acceptable to use writeln! instead of println!, you can call from_raw_fd to open a proper File:
use std::fs::File;
use std::io::{BufWriter, Write};
use std::os::unix::io::FromRawFd;
let mut out = BufWriter::new(unsafe { File::from_raw_fd(3) });
writeln!(out, "hello world!")?;
// ...
Note: from_raw_fd is unsafe because you must ensure that the File assumes sole ownership of the file descriptor; in this case, that no one else closes or interacts with file descriptor 3. (But you can use into_raw_fd to take back ownership of the fd while consuming the File.)
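If the duplicate is made from Rust in the first place, the unsafe block can probably be avoided altogether. The following is only a sketch under that assumption: duplicate_stdout and log_line are names I made up, and it relies on the std::os::fd module (Unix-like targets, reasonably recent Rust). It would need to run before fd 1 is redirected to the pipe:

```rust
use std::fs::File;
use std::io::{self, Write};
use std::os::fd::AsFd;

/// Duplicate the current stdout into an owned `File`.
/// The new descriptor keeps pointing at the console even if fd 1 is
/// later replaced (e.g. with dup2).
fn duplicate_stdout() -> io::Result<File> {
    let owned = io::stdout().as_fd().try_clone_to_owned()?;
    Ok(File::from(owned))
}

/// Write one line to the saved console handle.
fn log_line(console: &mut File, msg: &str) -> io::Result<()> {
    writeln!(console, "{}", msg)
}
```

Since the File owns its descriptor here, it is closed automatically when the File is dropped, and there is no raw fd number to keep track of.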
Is it possible to take any Rust code and make it work in only one line (without any line breaks)? In particular, it should work exactly like the "normal" multi line code:
if it's an executable, the runtime behavior should be the same.
if it's a library, the documentation and .rlib file should be the same.
This is a purely theoretical question and I don't plan on actually writing my Rust code like this :P
I know that most typical Rust code can be written in one line. Hello world is easy-peasy:
fn main() { println!("Hello, world"); }
Are there any Rust constructs that can't be written in one line? I thought of a few candidates already:
Doc comments. I usually see them written as /// or //! and they include everything until the end of the line.
Macros, especially procedural ones, can do some strange unexpected things. Maybe it is possible to construct macros that only work on multiple lines?
String literals can be written over multiple lines in which case they will include the linebreaks. I know that those line breaks can also be written as \n, but maybe there is something about multi-line strings that does not work in a single line? Maybe something something about raw string literals?
Maybe some planned future extensions of Rust?
Probably many other things I didn't think of...
Out of curiosity, I did a quick read through Rust's lexer code, and there is only one case (that I noticed) which requires a newline and cannot be rewritten any other way. As people have pointed out, there are directly equivalent ways to write doc comments, string literals, and macros on a single line.
That being said, this is technically not considered Rust syntax. However, it has been part of rustc since the creation of rustc_lexer (4 years ago) and likely long before that so I'm going to count it. Plus if we squint really hard then it kind of looks like it might just fit the restriction of "if it's an executable, the runtime behavior should be the same".
Rust files allow the inclusion of a shebang on the first line. Since shebangs are a Unix convention, Rust needs to follow the existing standard (which requires a line break before the rest of the file contents). For example, it is not possible to write this Rust file without any newline characters in a way that preserves the behavior when run (cough on systems that support shebangs cough):
#!/usr/bin/rustrun
fn main() {
    println!("Hello World!");
}
Rust Playground (You won't be able to try using the shebang on Rust Playground, but at least you can see that it compiles and cannot be reduced to a single line)
For anyone who is curious, here is how it is described in the lexer code:
/// `rustc` allows files to have a shebang, e.g. "#!/usr/bin/rustrun",
/// but shebang isn't a part of rust syntax.
pub fn strip_shebang(input: &str) -> Option<usize> {
    // Shebang must start with `#!` literally, without any preceding whitespace.
    // For simplicity we consider any line starting with `#!` a shebang,
    // regardless of restrictions put on shebangs by specific platforms.
    if let Some(input_tail) = input.strip_prefix("#!") {
        // Ok, this is a shebang but if the next non-whitespace token is `[`,
        // then it may be valid Rust code, so consider it Rust code.
        let next_non_whitespace_token = tokenize(input_tail).map(|tok| tok.kind).find(|tok| {
            !matches!(
                tok,
                TokenKind::Whitespace
                    | TokenKind::LineComment { doc_style: None }
                    | TokenKind::BlockComment { doc_style: None, .. }
            )
        });
        if next_non_whitespace_token != Some(TokenKind::OpenBracket) {
            // No other choice than to consider this a shebang.
            return Some(2 + input_tail.lines().next().unwrap_or_default().len());
        }
    }
    None
}
https://github.com/rust-lang/rust/blob/e1c91213ff80af5b87a197b784b40bcbc8cf3add/compiler/rustc_lexer/src/lib.rs#L221-L244
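For reference, the single-line equivalents mentioned at the start of this answer can be sketched like this. The function names are invented for the example; the #[doc = "..."] attribute and the /** ... */ block form are the one-line counterparts of ///, and the \n escape stands in for a literal line break inside the string:

```rust
#[doc = "Returns a two-line greeting."] pub fn greeting() -> &'static str { "Hello,\nworld" } /** Block doc comments also fit on one line. */ pub fn answer() -> u32 { 42 }
```

Both items carry documentation, and the string still contains a line break, even though the source itself has none.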
Given a std::path::Path, what's the most direct way to convert this to a null-terminated std::os::raw::c_char? (for passing to C functions that take a path).
use std::ffi::CString;
use std::os::raw::c_char;
use std::os::raw::c_void;
extern "C" {
    fn some_c_function(path: *const c_char);
}
fn example_c_wrapper(path: &std::path::Path) {
    let path_str_c = CString::new(path.as_os_str().to_str().unwrap()).unwrap();
    // Calling a foreign function is unsafe; path_str_c must outlive the call.
    unsafe { some_c_function(path_str_c.as_ptr()) };
}
Is there a way to avoid having so many intermediate steps?
Path -> OsStr -> &str -> CString -> as_ptr()
It's not as easy as it looks. There's one piece of information you didn't provide: what encoding is the C function expecting the path to be in?
On Linux, paths are "just" arrays of bytes (0 being invalid), and applications usually don't try to decode them. (However, they may have to decode them with a particular encoding to e.g. display them to the user, in which case they will usually try to decode them according to the current locale, which will often use the UTF-8 encoding.)
On Windows, it's more complicated, because there are variations of API functions that use an "ANSI" code page and variations that use "Unicode" (UTF-16). Additionally, Windows has historically not supported setting UTF-8 as the "ANSI" code page. This means that unless the library specifically expects UTF-8 and converts the path to the native encoding itself, passing it a UTF-8 encoded path is definitely wrong (though it might appear to work for strings containing only ASCII characters).
(I don't know about other platforms, but it's messy enough already.)
In Rust, Path is just a wrapper for OsStr. OsStr uses a platform-dependent representation that happens to be compatible with UTF-8 when the string is indeed valid UTF-8, but non-UTF-8 strings use an unspecified encoding (on Windows, it's actually using WTF-8, but this is not contractual; on Linux, it's just the array of bytes as is).
Before you pass a path to a C function, you must determine what encoding it's expecting the string to be in, and if it doesn't match Rust's encoding, you'll have to convert it before wrapping it in a CString. Rust doesn't let you convert a Path or an OsStr to anything other than a str in a platform-independent way. On Unix-based targets, the OsStrExt trait is available and provides access to the OsStr as a slice of bytes.
Rust used to provide a to_cstring method on OsStr, but it was never stabilized, and it was deprecated in Rust 1.6.0, as it was realized that the behavior was inappropriate for Windows (it returned a UTF-8 encoded path, but Windows APIs don't support that!).
As Path is just a thin wrapper around OsStr, you could nearly pass it as-is to your C function. But to be a valid C string we have to add the NUL terminating byte. Thus we must allocate a CString.
On the other hand, conversion to a str is both risky (what if the Path is not a valid UTF-8 string?) and an unnecessary cost: I use as_bytes() instead of to_str().
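use std::os::unix::ffi::OsStrExt; // provides as_bytes() on OsStr (Unix only)
fn example_c_wrapper<P: AsRef<std::path::Path>>(path: P) {
    // CString::new fails only if the path contains an interior NUL byte.
    let path_str_c = CString::new(path.as_ref().as_os_str().as_bytes()).unwrap();
    // path_str_c must stay alive for as long as the C side uses the pointer.
    unsafe { some_c_function(path_str_c.as_ptr()) };
}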
This is for Unix. I do not know how it works for Windows.
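If panicking on a bad path is not acceptable, the same conversion can be wrapped so that it returns an error instead. This is just a sketch for Unix targets; path_to_cstring is a name I picked, and mapping the failure to io::ErrorKind::InvalidInput is one possible choice among several:

```rust
use std::ffi::CString;
use std::io;
use std::os::unix::ffi::OsStrExt; // Unix-only: exposes the raw bytes of an OsStr
use std::path::Path;

/// Convert a path to a NUL-terminated C string without going through `str`.
/// Fails only if the path contains an interior NUL byte.
fn path_to_cstring(path: &Path) -> io::Result<CString> {
    CString::new(path.as_os_str().as_bytes())
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidInput, e))
}
```

The returned CString has to be kept in a variable for as long as the C function uses the pointer obtained from as_ptr().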
If your goal is to convert a path to some sequence of bytes that is interpreted as a "native" path on whatever platform the code was compiled for, then the most direct way to do this is by using the OsStrExt of each platform you want to support:
let path = ..;
let mut buf = Vec::new();
#[cfg(unix)] {
    use std::os::unix::ffi::OsStrExt;
    buf.extend(path.as_os_str().as_bytes());
    buf.push(0);
}
#[cfg(windows)] {
    use std::os::windows::ffi::OsStrExt;
    buf.extend(path.as_os_str()
        .encode_wide()
        .chain(Some(0))
        // Split each UTF-16 code unit into its two native-endian bytes.
        .flat_map(|unit| unit.to_ne_bytes()));
}
This code[1] gives you a buffer of bytes that represents the path as a series of null-terminated bytes when you run it on Linux, and as null-terminated "Unicode" (UTF-16) when run on Windows. You could add a fallback that converts OsStr to a str on other platforms, but I strongly recommend against that (see why below).
On Windows, you'll want to cast your buffer pointer to wchar_t * before using it with Unicode functions (e.g. _wfopen). This code assumes that wchar_t is two bytes large, and that the buffer is properly aligned for wchar_t.
On the Linux side, just use the pointer as-is.
About converting paths to unicode strings: Don't. Contrary to recommendations here and elsewhere, blindly converting a path to utf8 is not the correct way to handle a system path. Requiring that paths be valid unicode will cause your code to fail when (not if) it encounters paths that are not valid unicode. If you're handling real world paths, you will inevitably be handling non-utf8 paths. Doing it right the first time will help avoid a lot of pain and misery in the long run.
[1]: This code was taken directly out of a library I'm working on (feel free to reuse). It has been tested against both linux and 64-bit windows via wine.
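To illustrate the point about non-Unicode paths, here is a small Unix-only sketch. The helper names inspect_path and latin1_example are made up for this example; the file name is "café.txt" encoded as Latin-1, which is a legal Unix file name but not valid UTF-8:

```rust
use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt;
use std::path::Path;

/// Returns (is_valid_utf8, lossy_bytes, raw_bytes) for a path.
fn inspect_path(path: &Path) -> (bool, Vec<u8>, Vec<u8>) {
    let raw = path.as_os_str().as_bytes().to_vec();
    let lossy = path.to_string_lossy().into_owned().into_bytes();
    (path.to_str().is_some(), lossy, raw)
}

/// A Latin-1 encoded file name: 0xE9 is "é" in Latin-1 but an
/// incomplete sequence in UTF-8.
fn latin1_example() -> &'static Path {
    Path::new(OsStr::from_bytes(b"caf\xE9.txt"))
}
```

For this path, to_str() returns None and to_string_lossy() substitutes U+FFFD for the offending byte, so only the raw bytes still name the original file.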
If you are trying to produce a Vec<u8>, I usually phone it in and do:
use std::path::Path;
#[cfg(unix)]
fn path_to_bytes<P: AsRef<Path>>(path: P) -> Vec<u8> {
    use std::os::unix::ffi::OsStrExt;
    path.as_ref().as_os_str().as_bytes().to_vec()
}
#[cfg(not(unix))]
fn path_to_bytes<P: AsRef<Path>>(path: P) -> Vec<u8> {
    // On Windows, could use std::os::windows::ffi::OsStrExt to encode_wide(),
    // but you end up with a Vec<u16> instead of a Vec<u8>, so that doesn't
    // really help.
    path.as_ref().to_string_lossy().into_owned().into_bytes()
}
Knowing full well that non-UTF8 paths on non-UNIX will not be supported correctly. Note that you might need a Vec<u8> if working with Thrift/protocol buffers as opposed to a C API.