How can I convert LPWSTR into &str - string

In my previous question I asked how to cast LPVOID to LPNETRESOURCEW, and that works now. I have a NETRESOURCEW struct with these fields:
dwScope: DWORD,
dwType: DWORD,
dwDisplayType: DWORD,
dwUsage: DWORD,
lpLocalName: LPWSTR,
lpRemoteName: LPWSTR,
lpComment: LPWSTR,
lpProvider: LPWSTR,
According to the docs, nr.lpRemoteName is an LPWSTR, i.e. *mut u16. I've tried using OsString::from_wide, but couldn't get it to work. How can I convert an LPWSTR into a Rust &str or String and print it to the console?

This is normally done by marshalling the string pointer into a slice.
First you need to get the length of the string / slice which can be done like this if the string is null-terminated:
let length = (0..).take_while(|&i| *my_string.offset(i) != 0).count();
Then you can create the slice like so
let slice = std::slice::from_raw_parts(my_string, length);
and finally convert the slice into an OsString:
let my_rust_string = OsString::from_wide(slice).to_string_lossy().into_owned();
Update due to comment
I have verified that the given approach works as can be reproduced using this snippet:
use std::{ffi::OsString, os::windows::prelude::OsStringExt};
use windows_sys::w;
fn main() {
let my_string = w!("This is a UTF-16 string.");
let slice;
unsafe {
let length = (0..).take_while(|&i| *my_string.offset(i) != 0).count();
slice = std::slice::from_raw_parts(my_string, length);
}
let my_rust_string = OsString::from_wide(slice).to_string_lossy().into_owned();
println!("{}", my_rust_string);
}
If you get a STATUS_ACCESS_VIOLATION it is most likely because your string is not null-terminated in which case you would need to determine the length of the string in another way or preallocate the buffer.
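The three steps above can be wrapped into one helper. This is a minimal sketch, not the answer's exact code: the name wide_ptr_to_string is made up for illustration, it adds a null-pointer check, and it uses String::from_utf16_lossy instead of OsString::from_wide so that it also compiles outside Windows. Like the original, it assumes the string is null-terminated.

```rust
/// Hypothetical helper (not part of any crate): converts a null-terminated
/// UTF-16 string, such as an LPWSTR, into an owned String.
///
/// # Safety
/// `ptr` must be null or point to readable u16 values that end with a 0.
unsafe fn wide_ptr_to_string(ptr: *const u16) -> Option<String> {
    if ptr.is_null() {
        return None;
    }
    unsafe {
        // Count the code units before the terminating 0.
        let mut len = 0usize;
        while *ptr.add(len) != 0 {
            len += 1;
        }
        let slice = std::slice::from_raw_parts(ptr, len);
        // Unpaired surrogates become U+FFFD instead of causing an error.
        Some(String::from_utf16_lossy(slice))
    }
}
```

With the struct from the question, the call would look like unsafe { wide_ptr_to_string(nr.lpRemoteName) }, and the resulting Option<String> can be printed with println!.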

Related

get byte offset after first char of str in rust

In rust, I want to get the byte offset immediately after of the first character of a str.
fn main() {
let s: &str = "⚀⚁";
// char is 4 bytes, right??? (not always when in a str)
let offset: usize = 4;
let s1: &str = &s[offset..];
eprintln!("s1 {:?}", s1);
}
The program expectedly crashes with:
thread 'main' panicked at 'byte index 4 is not a char boundary; it is inside '⚁' (bytes 3..6) of `⚀⚁`'
How can I find the byte offset of the second char '⚁'?
Bonus if this can be done safely and without std.
Related:
How to get the byte offset between &str
How to find the starting offset of a string slice of another string?
A char is a 32-bit integer (a unicode scalar value), but individual characters inside a str are variable width UTF-8, as small as a single 8-bit byte.
You can iterate through the characters of the str and their boundaries using str::char_indices, and your code would look like this:
fn main() {
let s: &str = "⚀⚁";
let (offset, _) = s.char_indices().nth(1).unwrap();
dbg!(offset); // 3
let s1: &str = &s[offset..];
eprintln!("s1 {:?}", s1); // s1 "⚁"
}
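If only the offset after the first character is needed, it equals that character's UTF-8 length. The following is a minimal sketch that avoids the unwrap and uses only methods available in core (str::chars and char::len_utf8); the function names are made up for illustration.

```rust
/// Byte offset just past the first char of `s`, or None if `s` is empty.
fn offset_after_first_char(s: &str) -> Option<usize> {
    s.chars().next().map(char::len_utf8)
}

/// The rest of `s` after its first char; an empty input stays empty.
fn skip_first_char(s: &str) -> &str {
    let offset = offset_after_first_char(s).unwrap_or(0);
    &s[offset..]
}
```

Unlike char_indices().nth(1), this also works when the string holds a single character: the offset is then s.len() and the remaining slice is empty.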

How to create a &str from a single character? [duplicate]

This question already has answers here:
Converting a char to &str
(3 answers)
Closed 1 year ago.
I can't believe I'm asking this frankly, but how do I create a &str (or a String) when I have a single character?
The first thing to try for simple conversions is into().
It works for String because String implements From<char>.
let c: char = 'π';
let s: String = c.into();
You can't build a &str directly from a char. A &str is a reference type. The easiest solution is to build it from a string:
let s: &str = &s;
An alternative for most kinds of values is the format macro:
let s = format!("{}", c);
If you just need to use the &str locally and you want to avoid heap allocation, you can use the char method encode_utf8:
fn main() {
let c = 'n';
let mut tmp = [0; 1];
let foo = c.encode_utf8(&mut tmp);
println!("str: {}", foo);
}
or
fn main() {
let tmp = [b'n'; 1];
let foo = std::str::from_utf8(&tmp).unwrap();
println!("str: {}", foo);
}
To work with every char you need a u8 array of length 4, [0; 4]. In UTF-8, ASCII chars are represented as a single byte, but all other characters require more bytes, up to a maximum of 4.
This is a simplified example based on an answer from a very similar question:
Converting a char to &str
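A variant of the encode_utf8 approach that works for any char, using the 4-byte buffer mentioned above. This is only a sketch; with_char_str is a hypothetical helper name, and the closure is there because the &str borrows a stack buffer and cannot outlive it.

```rust
/// Hypothetical helper: lends a stack-allocated &str view of `c` to `f`,
/// without any heap allocation.
fn with_char_str<R>(c: char, f: impl FnOnce(&str) -> R) -> R {
    // Four bytes is the maximum length of one UTF-8 encoded char.
    let mut buf = [0u8; 4];
    let s: &str = c.encode_utf8(&mut buf);
    f(s)
}
```

For example, with_char_str('π', |s| println!("str: {}", s)) prints the character through a &str.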

Provide `char **` argument to C function from Rust?

I've got a C function that (simplified) looks like this:
static char buffer[13];
void get_string(const char **s) {
sprintf(buffer, "Hello World!");
*s = buffer;
}
I've declared it in Rust:
extern pub fn get_string(s: *mut *const c_char);
But I can't figure out the required incantation to call it, and convert the result to a Rust string. Everything I've tried either fails to compile, or causes a SEGV.
Any pointers?
First of all, char in Rust is not the equivalent to a char in C:
The char type represents a single character. More specifically, since 'character' isn't a well-defined concept in Unicode, char is a 'Unicode scalar value', which is similar to, but not the same as, a 'Unicode code point'.
In Rust you may use u8 or i8, depending on the platform. You can use std::os::raw::c_char for this:
Equivalent to C's char type.
C's char type is completely unlike Rust's char type; while Rust's type represents a unicode scalar value, C's char type is just an ordinary integer. This type will always be either i8 or u8, as the type is defined as being one byte long.
C chars are most commonly used to make C strings. Unlike Rust, where the length of a string is included alongside the string, C strings mark the end of a string with the character '\0'. See CStr for more information.
First, we need a variable, which can be passed to the function:
let mut ptr: *const c_char = std::ptr::null();
To pass it as *mut you simply can use a reference:
get_string(&mut ptr);
Now use the *const c_char for creating a CStr:
let c_str = CStr::from_ptr(ptr);
For converting it to a String you can choose:
c_str.to_string_lossy().to_string()
or
c_str.to_str().unwrap().to_string()
However, you shouldn't use String if you don't really need to. In most scenarios, a Cow<str> fulfills the needs. It can be obtained with c_str.to_string_lossy():
If the contents of the CStr are valid UTF-8 data, this function will return a Cow::Borrowed(&str) with the corresponding &str slice. Otherwise, it will replace any invalid UTF-8 sequences with U+FFFD REPLACEMENT CHARACTER and return a Cow::Owned(String) with the result.
You can see this in action on the Playground. This Playground shows the usage with to_string_lossy().
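The borrowed-versus-owned behaviour of to_string_lossy described above can be observed directly. A small sketch; the function name lossy is made up, and CStr::from_bytes_with_nul stands in for a pointer that would normally come from C.

```rust
use std::borrow::Cow;
use std::ffi::CStr;

/// Converts a NUL-terminated byte string lossily and reports whether the
/// result borrowed from the input (true) or had to allocate (false).
fn lossy(bytes_with_nul: &[u8]) -> (String, bool) {
    let c_str = CStr::from_bytes_with_nul(bytes_with_nul)
        .expect("input must end with exactly one NUL byte");
    let text: Cow<str> = c_str.to_string_lossy();
    let borrowed = matches!(text, Cow::Borrowed(_));
    (text.into_owned(), borrowed)
}
```

Valid UTF-8 input such as b"Hello\0" is borrowed, while input containing an invalid byte such as b"Hi\xFF\0" is copied with the bad byte replaced by U+FFFD.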
Combine Passing a Rust variable to a C function that expects to be able to modify it
unsafe {
let mut c_buf = std::ptr::null();
get_string(&mut c_buf);
}
With How do I convert a C string into a Rust string and back via FFI?:
extern crate libc;
use libc::c_char;
use std::ffi::CStr;
use std::str;
extern "C" {
fn get_string(s: *mut *const c_char);
}
fn main() {
unsafe {
let mut c_buf = std::ptr::null();
get_string(&mut c_buf);
let c_str = CStr::from_ptr(c_buf);
let str_slice: &str = c_str.to_str().unwrap();
let str_buf: String = str_slice.to_owned(); // if necessary
};
}
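The same call pattern can be wrapped in a safe function that also checks for a null result. This is a sketch under stated assumptions: get_string below is a Rust stand-in for the C function from the question so that the example links without a C compiler, and fetch_string is a hypothetical wrapper name. In real code get_string would come from an extern "C" block as shown above.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

// Stand-in for the C side: a static, NUL-terminated buffer.
static BUFFER: &[u8] = b"Hello World!\0";

// Stand-in for the C function; it stores the buffer address in `*s`.
unsafe extern "C" fn get_string(s: *mut *const c_char) {
    unsafe {
        *s = BUFFER.as_ptr() as *const c_char;
    }
}

/// Hypothetical safe wrapper: returns None if the callee left `ptr` null.
fn fetch_string() -> Option<String> {
    let mut ptr: *const c_char = std::ptr::null();
    unsafe {
        get_string(&mut ptr);
        if ptr.is_null() {
            return None;
        }
        // Copy the bytes out so the String does not depend on the C buffer.
        Some(CStr::from_ptr(ptr).to_string_lossy().into_owned())
    }
}
```

Copying into a String matters here because the C function reuses one static buffer, so a borrowed &str could change on the next call.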

Creating a string from Vec<char> [duplicate]

This question already has answers here:
How to convert Vec<char> to a string
(2 answers)
Closed 6 years ago.
I've got a Vec<char> that I need to turn into a &str or String, but I'm unsure of the best way to do this. I've looked around and every resource I've found seems to be out-dated in some way. The answers in this question don't seem to be applicable for the newest build.
I'm using the nightly for 2015-3-19
The iterator based approach with .collect should work, after updating for language changes:
char_vector.iter().cloned().collect::<String>();
(I've chosen to replace .map(|c| *c) with .cloned() but either works.)
If your vector can be consumed, you can also use into_iter to avoid the clone:
fn main() {
let char_vector = vec!['h', 'e', 'l', 'l', 'o'];
let str: String = char_vector.into_iter().collect();
println!("{}", str);
}
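When the vector has to stay usable, the borrowed form can be collected directly, because String implements FromIterator<&char>. A minimal sketch; chars_to_string is a made-up name.

```rust
/// Builds a String from a borrowed slice of chars; the caller keeps the Vec.
fn chars_to_string(chars: &[char]) -> String {
    // Collecting &char items works without an explicit .cloned().
    chars.iter().collect()
}
```

A Vec<char> can be passed as chars_to_string(&char_vector) and used again afterwards.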
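You can convert the Vec into a String without doing any allocations, though it requires quite a lot of unsafe code. Note that this relies on pre-1.0 nightly features (std::raw::Repr and the old encode_utf8 signature) and does not compile on current Rust. It is also unsound today, because Vec::from_raw_parts requires the element type to have the same alignment the buffer was allocated with, and char and u8 differ: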
#![feature(raw, unicode)]
use std::raw::Repr;
use std::slice::from_raw_parts_mut;
fn inplace_to_string(v: Vec<char>) -> String {
unsafe {
let mut i = 0;
{
let ch_v = &v[..];
let r = ch_v.repr();
let p: &mut [u8] = from_raw_parts_mut(r.data as *mut u8, r.len*4);
for ch in ch_v {
i += ch.encode_utf8(&mut p[i..i+4]).unwrap();
}
}
let p = v.as_ptr();
let cap = v.capacity()*4;
std::mem::forget(v);
let v = Vec::from_raw_parts(p as *mut u8, i, cap);
String::from_utf8_unchecked(v)
}
}
fn main() {
let char_vector = vec!['h', 'ä', 'l', 'l', 'ö'];
let str: String = char_vector.iter().cloned().collect();
let str2 = inplace_to_string(char_vector);
println!("{}", str);
println!("{}", str2);
}
Detailed Explanation
This creates a mutable u8 slice and a char slice simultaneously to the same buffer (breaking all Rust guarantees). Note that the u8 slice is four times as large as the char slice, since char always takes up 4 bytes.
let ch_v = &v[..];
let r = ch_v.repr();
let p: &mut [u8] = from_raw_parts_mut(r.data as *mut u8, r.len*4);
We need that to iterate over the chars and replace them with their UTF-8 encoded counterparts. Since the UTF-8 encoding of a char is never longer than the 4 bytes the char itself occupies, we can guarantee that we never overwrite any part we haven't read yet.
for ch in ch_v {
i += ch.encode_utf8(&mut p[i..i+4]).unwrap();
}
Since a char is always a valid Unicode scalar value and the slice we pass is always exactly 4 bytes (the maximum a UTF-8 encoded char needs), encoding cannot fail, so we can unwrap without checking. The encode_utf8 function returns the length of the UTF-8 representation, so our index i always points just past the last byte written.
Finally we need to do some cleaning up. Our vector is still of type Vec<char>. We get all the info we need (Pointer to the heap allocated array and the capacity)
let p = v.as_ptr();
let cap = v.capacity()*4;
Then we release the previous vector from all obligations like freeing memory.
std::mem::forget(v);
and finally recreate the u8 vector with correct length and capacity, and directly turn it into a String. The conversion to String does not need to be checked, as we already know the utf8 is correct, since the original Vec<char> could only contain correct unicode chars.
let v = Vec::from_raw_parts(p as *mut u8, i, cap);
String::from_utf8_unchecked(v)

Convert a String to int?

Note: this question contains deprecated pre-1.0 code! The answer is correct, though.
To convert a str to an int in Rust, I can do this:
let my_int = from_str::<int>(my_str);
The only way I know how to convert a String to an int is to get a slice of it and then use from_str on it like so:
let my_int = from_str::<int>(my_string.as_slice());
Is there a way to directly convert a String to an int?
You can directly convert to an int using the str::parse::<T>() method, which returns a Result containing the int.
let my_string = "27".to_string(); // `parse()` works with `&str` and `String`!
let my_int = my_string.parse::<i32>().unwrap();
You can either specify the type to parse to with the turbofish operator (::<>) as shown above or via explicit type annotation:
let my_int: i32 = my_string.parse().unwrap();
Since parse() returns a Result, it will either be an Err if the string couldn't be parsed as the type specified (for example, the string "peter" can't be parsed as i32), or an Ok with the value in it.
let my_u8: u8 = "42".parse().unwrap();
let my_u32: u32 = "42".parse().unwrap();
// or, to be safe, match the `Err`
match "foobar".parse::<i32>() {
Ok(n) => do_something_with(n),
Err(e) => weep_and_moan(),
}
str::parse::<u32> returns a Result<u32, core::num::ParseIntError> and Result::unwrap "Unwraps a result, yielding the content of an Ok [or] panics if the value is an Err, with a panic message provided by the Err's value."
str::parse is a generic function, hence the type in angle brackets.
If you get your string from stdin().read_line, you have to trim it first.
let my_num: i32 = my_num.trim().parse()
.expect("please give me correct string number!");
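To see why the trim matters and to avoid panicking on bad input, the Result can be handed back to the caller. A small sketch; parse_line is a hypothetical name.

```rust
use std::num::ParseIntError;

/// Parses one line of user input as an i32, ignoring surrounding whitespace
/// such as the trailing newline that read_line leaves in the buffer.
fn parse_line(line: &str) -> Result<i32, ParseIntError> {
    line.trim().parse::<i32>()
}
```

Without the trim, a line such as "42\n" fails to parse because of the trailing newline.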
With a recent nightly, you can do this:
let my_int = from_str::<int>(&*my_string);
What's happening here is that String can now be dereferenced into a str. However, the function wants an &str, so we have to borrow again. For reference, I believe this particular pattern (&*) is called "cross-borrowing".
You can use the FromStr trait's from_str method, which is implemented for i32 (the trait must be in scope, via use std::str::FromStr):
let my_num = i32::from_str("9").unwrap_or(0);
Yes, you can use the parse method on a String to directly convert it to an integer like so:
let my_string = "42".to_string();
let my_int = my_string.parse::<i32>().unwrap();
The parse method returns a Result object, so you will need to handle the case where the string cannot be parsed into an integer. You can use unwrap as shown above to get the value if the parse was successful, or it will panic if the parse failed.
Or you can use the match expression to handle the success and failure cases separately like so:
let my_string = "42".to_string();
let my_int = match my_string.parse::<i32>() {
Ok(n) => n,
Err(_) => {
println!("Failed to parse integer");
0
},
};
FYI, the parse method is available for any type that implements the FromStr trait, which includes all of the integer types (e.g. i32, i64, etc.) as well as many other types such as f32 and bool.
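Because parse is generic over FromStr, one helper can cover all of these types. A sketch with a made-up name, parse_or, that falls back to a default instead of panicking:

```rust
use std::str::FromStr;

/// Parses `input` as any FromStr type, returning `default` on failure.
fn parse_or<T: FromStr>(input: &str, default: T) -> T {
    input.trim().parse().unwrap_or(default)
}
```

The target type is inferred from the default, so parse_or("42", 0i32) yields an i32 and parse_or("true", false) yields a bool.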
Well, no. Why should there be? Just discard the string if you don't need it anymore.
&str is more useful than String when you need to only read a string, because it is only a view into the original piece of data, not its owner. You can pass it around more easily than String, and it is copyable, so it is not consumed by the invoked methods. In this regard it is more general: if you have a String, you can pass it to where an &str is expected, but if you have &str, you can only pass it to functions expecting String if you make a new allocation.
You can find more on the differences between these two and when to use them in the official strings guide.
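The difference shows up in function signatures. A short sketch with two hypothetical functions, one borrowing and one taking ownership:

```rust
/// Takes a read-only view, so it accepts literals and borrowed Strings alike.
fn count_words(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Takes ownership, so a caller holding only a &str must allocate first.
fn into_upper(text: String) -> String {
    text.to_uppercase()
}
```

A String named owned can be passed as count_words(&owned) and remains usable afterwards, whereas calling into_upper with a literal requires something like "abc".to_string().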
So basically you want to convert a String into an integer, right?
Here is what I mostly use, and it is also mentioned in the official documentation:
fn main() {
let input = "23";
let number: i32 = input.trim().parse().unwrap();
println!("{}", number + 1);
}
This works for both String and &str
Hope this will help too.
