What is the third argument to std::int::to_str_bytes? - rust

In Rust programming language - I am trying to convert an integer into the string representation and so I write something like:
use std::int::to_str_bytes;
...
to_str_bytes(x, 10);
...but it says that I have to specify a third argument. The documentation is here: http://static.rust-lang.org/doc/master/std/int/fn.to_str_bytes.html , but I am not clever enough to understand what it expects as the third argument.

Using x.to_str() as in Njol's answer is the straightforward way to get a string representation of an integer. However, x.to_str() returns an owned (and therefore heap-allocated) string (~str). As long as you don't need to store the resulting string permanently, you can avoid the expense of an extra heap allocation by allocating the string representation on the stack. This is exactly the point of the std::int::to_str_bytes function - to provide a temporary string representation of a number.
The third argument, of type f: |v: &[u8]| -> U, is a closure that takes a byte slice (I don't think Rust has stack-allocated strings). You use it like this:
let mut f = std::io::stdout();
let result = std::int::to_str_bytes(100, 16, |v| {
f.write(v);
Some(())
});
to_str_bytes returns whatever the closure does, in this case Some(()).

int seems to implement ToStr: http://static.rust-lang.org/doc/master/std/to_str/trait.ToStr.html
so you should be able to simply use x.to_str() or to_str(x)
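Note that both answers describe a very early version of Rust. As far as I know, std::int::to_str_bytes and the ToStr trait no longer exist in current Rust; the closest equivalents are to_string() (from the ToString trait) and the formatting macros. A minimal sketch under that assumption, where write_hex is a hypothetical helper name and not part of any library:

```rust
use std::io::{self, Write};

/// Hypothetical helper: formats `x` in base 16 directly into `out`,
/// without building an intermediate String first.
fn write_hex<W: Write>(out: &mut W, x: i64) -> io::Result<()> {
    write!(out, "{:x}", x)
}

fn main() {
    let x: i64 = 100;

    // Owned, heap-allocated strings (the modern counterpart of x.to_str()).
    let decimal: String = x.to_string();
    let hex: String = format!("{:x}", x);
    println!("{} {}", decimal, hex);

    // Writing straight to stdout, in the spirit of the to_str_bytes closure.
    let mut out = io::stdout();
    write_hex(&mut out, x).expect("failed to write to stdout");
    writeln!(out).expect("failed to write to stdout");
}
```

The write! form plays the role of the closure above: the digits go directly to the writer, so no String has to be kept around.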

Related

Rust Manipulating Strings in Functions

I'm new to Rust, and I want to process strings in a function in Rust and then return a struct that contains the results of that processing to use in more functions. This is very simplified and a bit messier because of all my attempts to get this working, but:
struct Strucc<'a> {
string: &'a str,
booool: bool
}
fn do_stuff2<'a>(input: &'a str) -> Result<Strucc, &str> {
let to_split = input.to_lowercase();
let splitter = to_split.split("/");
let mut array: Vec<&str> = Vec::new();
for split in splitter {
array.push(split);
}
let var = array[0];
println!("{}", var);
let result = Strucc{
string: array[0],
booool: false
};
Ok(result)
}
The issue is that to convert the &str to lowercase, I have to create a new String that's owned by the function.
As I understand it, the reason this won't compile is because when I split the new String I created, all the &strs I get from it are substrings of the String, which are all still owned by the function, and so when the value is returned and that String goes out of scope, the value in the struct I returned gets erased.
I tried to fix this with lifetimes (as you can see in the function definition), but from what I can tell I'd have to give the String a lifetime which I can't do as far as I'm aware because it isn't borrowed. Either that or I need to make the struct own that String (which I also don't understand how to do, nor does it seem reasonable as I'd have to make the struct mutable).
Also as a sidenote: Previously I have tried to just use a String in the struct but I want to define constants which won't work with that, and I still don't think it would solve the issue. I've also tried to use .clone() in various places just in case but had no luck (though I know why this shouldn't work anyway).
I have been looking for some solution for this for hours and it feels like such a small step so I feel I may be asking the wrong questions or have missed something simple but please explain it like I'm five because I'm very confused.
I think you misunderstand what &str actually is. &str is just a pointer to the string data plus a length. The point of &str is to be an immutable reference to a specific string, which enables all sorts of nice optimizations. When you attempt to turn the &str lowercase, Rust needs somewhere to put the data, and the only place to put it would be a String, because Strings own their data. Take a look at this post for more information.
Your goal is unachievable without Strucc containing a String, since .to_lowercase() has to create new data, and you have to allocate the resulting data somewhere in order to own a reference to it. The best place to put the resulting data would be the returned struct, i.e. Strucc, and therefore Strucc must contain a String.
Also as a sidenote: Previously I have tried to just use a String in the struct but I want to define constants which won't work with that, and I still don't think it would solve the issue.
You can use "x".to_owned() to create a String from a string literal.
If you're trying to create a global constant, look at once_cell's lazy global initialization.
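A minimal sketch of what "Strucc must contain a String" could look like, keeping the names from the question. The choice of String as the error type and the rule that an empty first segment is an error are my assumptions, not something the question specifies:

```rust
#[derive(Debug)]
struct Strucc {
    // Owned data: the struct no longer borrows from anything.
    string: String,
    booool: bool,
}

fn do_stuff2(input: &str) -> Result<Strucc, String> {
    // to_lowercase() allocates a new String that is local to this function.
    let lowered = input.to_lowercase();

    // split() always yields at least one item, possibly an empty one.
    let first = lowered.split('/').next().unwrap_or("");
    if first.is_empty() {
        // Assumption: an empty leading segment counts as an error.
        return Err(format!("no leading segment in {:?}", input));
    }

    // Copy the slice into an owned String so it can outlive `lowered`.
    Ok(Strucc {
        string: first.to_owned(),
        booool: false,
    })
}

fn main() {
    let s = do_stuff2("Hello/World").expect("non-empty first segment");
    println!("{} {}", s.string, s.booool);
}
```

No lifetime parameter is needed any more, and the struct does not have to be mutable: the String is moved into it once, when it is constructed.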

Converting a String to an &str Rust without resulting in borrowing problem

Is there some way to convert a String (function parameter) into &str (used in return value) while avoiding borrowing problems or referencing a temporary value?
Here is an example use case:
PS: I am constrained by existing functions to these types and cannot just change the type of the function's parameter nor its &str consuming return value.
Also the following example uses clap::Arg::new to emphasize this aspect.
// f is any iterator over String values, for example sake I am using Vec<String> with literal value which does not represent the real case.
let f: Vec<String> = vec!["A".into(),"B".into(),"C".into()].into_iter();
let arg: clap::Arg = f.map(|s: String|{
clap::Arg::new(s.as_str())
});
Is there some way to convert a String (function parameter) into &str (used in return value) while avoiding borrowing problems or referencing a temporary value?
No but yes, but, really, no.
"No": if your parameter is a String, then the function owns the String, meaning that once the function exits the String is dropped, so any reference to it would be dangling.
"but yes": you can convert the String into a Box<str>, which you can then leak. This gives you a &'static mut str, which coerces to &'static str and lives for the rest of the program.
"but, really, no": while leaking memory is safe as far as Rust is concerned, it's still leaking memory. This might be fine in some contexts (e.g. short-running processes where explicitly freeing is a waste of resources, or some forms of ad-hoc interning), but in general it's really bad practice.

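For reference, the "but yes" route from the answer above can be sketched with only the standard library. leak_string is a hypothetical helper name, and the clap call from the question is left out so the example stays self-contained:

```rust
/// Hypothetical helper: turns an owned String into a &'static str by
/// deliberately leaking its buffer. The memory is never freed.
fn leak_string(s: String) -> &'static str {
    Box::leak(s.into_boxed_str())
}

fn main() {
    let names: Vec<String> = vec!["A".into(), "B".into(), "C".into()];

    // Each String is consumed and leaked; the resulting references are
    // valid for the rest of the program.
    let leaked: Vec<&'static str> = names.into_iter().map(leak_string).collect();

    println!("{:?}", leaked);
}
```

Every call leaks one allocation, so this is only reasonable for a small, bounded number of strings, such as command-line argument names built once at startup.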
Slice of String vs Slice &String

I was reading the doc from rust lang website and in chapter 4 they did the following example:
let s = String::from("hello world");
let hello = &s[0..5];
let world = &s[6..11];
hello is of type &str that I created from a variable s of type String.
Some rows below they define the following function:
fn first_word(s: &String) -> &str {
let bytes = s.as_bytes();
for (i, &item) in bytes.iter().enumerate() {
if item == b' ' {
return &s[0..i];
}
}
&s[..]
}
This time s is of type &String but still &s[0..i] gave me a &str slice.
How is it possible? I thought that the correct way to achieve this would be something like &((*str)[0..i]).
Am I missing something? Maybe during the [0..i] operation Rust auto-dereferences the variable?
Thanks
Maybe during the [0..i] operation Rust auto-dereferences the variable?
This is exactly what happens. When you call a method on a reference, or index it, Rust automatically dereferences it before applying the operation. You can get the same behavior for your own types by implementing the Deref trait. String implements Deref with a target of str, which is why you can call str methods on a String. Read more about deref coercion here.
It's important to realize what happens with &s[1..5]: it parses as &(s[1..5]). That is, s[1..5] is evaluated first, which yields a value of type str, and a reference to that value is then taken. In fact, there's even more indirection: x[y] in Rust is actually syntactic sugar for *std::ops::Index::index(&x, y). Note the dereference: this function always returns a reference, which is dereferenced by the sugar and then referenced again by the & in our code. Naturally, the compiler will optimize this and ensure we are not pointlessly taking references only to dereference them again.
It so happens that the String type implements the Index<Range<usize>> trait and its Index::Output type is str.
It also happens that the str type implements the same trait, and that its Output type is also str, via a blanket implementation over SliceIndex.
On your question of auto-dereferencing, it is true that String also implements the Deref trait, so that in many contexts, such as this one, &String is automatically coerced to &str: any context that accepts a &str also accepts a &String. This means that the Index<Range<usize>> implementation on String is really an optimization that avoids this indirection. If it were not there, the code would still work, and perhaps the compiler could even optimize the indirection away.
But that automatic coercion is not why it works; it simply works because indexing is defined on many different types.
Finally:
I thought that the correct way to achieve this would be something like &((*str)[0..i]).
This would not work regardless: a &str is not the same as a &String and cannot be dereferenced to a String the way a &String can. In fact, a &str is in many ways closer to a String than to a &String. A &str is really just a fat pointer to a sequence of UTF-8 bytes, with the length of that sequence in the second word; a String is, if you will, an extra-fat pointer that also carries the current capacity of the buffer and owns the buffer it points to, so it can free and resize it.
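To make the desugaring concrete, here is a small sketch that slices the same &String three ways and checks that they agree. The variable names are mine, and first_word is rewritten to take &str (my variation on the book's function, not the book's code) to show that a &String argument is accepted through deref coercion:

```rust
use std::ops::{Index, Range};

// Taking &str accepts both string literals and &String arguments.
fn first_word(s: &str) -> &str {
    match s.find(' ') {
        Some(i) => &s[..i],
        None => s,
    }
}

fn main() {
    let owned = String::from("hello world");
    let by_ref: &String = &owned;

    // Indexing sugar on a &String: the reference is auto-dereferenced and
    // String's own Index implementation is used.
    let sugar: &str = &by_ref[0..5];

    // The same call spelled out without the sugar.
    let desugared: &str = <String as Index<Range<usize>>>::index(by_ref, 0..5);

    // Going through str explicitly gives the same slice.
    let via_str: &str = &by_ref.as_str()[0..5];

    assert_eq!(sugar, "hello");
    assert_eq!(desugared, "hello");
    assert_eq!(via_str, "hello");

    // &String coerces to &str at the call site.
    assert_eq!(first_word(by_ref), "hello");
    assert_eq!(first_word("single"), "single");
}
```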

How can I convert a float to string?

How can a float value be converted to a String? For whatever reason, the documentation and all online sources I can find are only concerned with the other way around.
let value: f32 = 17.65;
let value_as_str: String = .....
Sometimes, the answer is easy: to_string().
let pi = 3.1415926;
let s = pi.to_string(); // : String
Background
The foundation for "creating a readable string representation of something" is in the fmt module. Probably the most important trait in this module is Display. Display is an abstraction over types that can be formatted as a user-facing string (pretty much exactly what you want). Usually the Display trait is used by println!() and friends. So you can already convert your float to string with the format!() macro:
let s = format!("{}", pi);
But there is something else: the ToString trait. This trait talks about types that can be converted to a String. And now, there is a magic implementation:
impl<T> ToString for T
where T: Display + ?Sized
This means: every type which implements Display also automatically implements ToString! So instead of writing format!("{}", your_value) you can simply write your_value.to_string()!
While these wildcard implementations are extremely useful and versatile, they have one disadvantage: finding methods is much harder. As you point out, the documentation of f32 doesn't mention to_string() at all. This is not very good, but it is a known issue. We're trying to improve this situation!
Advanced formatting
The to_string() method uses the default formatting options, so it's equivalent to format!("{}", my_value). But sometimes, you want to tweak how the value is converted into a string. To do that, you have to use format!() and the full power of the fmt format specifier. You can read about those in the module documentation. One example:
let s = format!("{:.2}", pi);
This will result in a string with exactly two digits after the decimal point ("3.14").
If you want to convert your float into a string using scientific notation, you can use the {:e} (or {:E}) format specifier which corresponds to the LowerExp (or UpperExp) trait.
let s = format!("{:e}", pi * 1_000_000.0);
This will result in "3.1415926e6".
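The three approaches above can be put together in one runnable sketch. to_fixed and to_scientific are hypothetical helper names I chose for illustration; they only wrap format!():

```rust
/// Hypothetical helper: fixed number of digits after the decimal point.
/// `{:.*}` takes the precision as an extra argument before the value.
fn to_fixed(x: f64, decimals: usize) -> String {
    format!("{:.*}", decimals, x)
}

/// Hypothetical helper: scientific notation via the LowerExp trait.
fn to_scientific(x: f64) -> String {
    format!("{:e}", x)
}

fn main() {
    // Default formatting, same as format!("{}", value).
    let value: f32 = 17.65;
    let value_as_str: String = value.to_string();
    println!("{}", value_as_str);

    // Two digits after the decimal point.
    println!("{}", to_fixed(3.1415926, 2));

    // Scientific notation.
    println!("{}", to_scientific(1234.5));
}
```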

Does Rust provide a way to parse integer numbers directly from ASCII data in byte (u8) arrays?

Rust has FromStr, however as far as I can see this only takes Unicode text input. Is there an equivalent to this for [u8] arrays?
By "parse" I mean take ASCII characters and return an integer, like C's atoi does.
Or do I need to either...
Convert the u8 array to a string first, then call FromStr.
Call out to libc's atoi.
Write an atoi in Rust.
In nearly all cases the first option is reasonable; however, there are cases where files may be very large, with no predefined encoding, or contain mixed binary and text, where it's most straightforward to read integer numbers as bytes.
No, the standard library has no such feature, but it doesn't need one.
As stated in the comments, the raw bytes can be converted to a &str via:
str::from_utf8
str::from_utf8_unchecked
Neither of these performs an extra allocation. The first one checks that the bytes are valid UTF-8; the second does not, and is therefore unsafe. Everyone should use the checked form until profiling proves that it's a bottleneck, and switch to the unchecked form only once it's proven safe to do so.
If bytes deeper in the data need to be parsed, a slice of the raw bytes can be obtained before conversion:
use std::str;
fn main() {
let raw_data = b"123132";
let the_bytes = &raw_data[1..4];
let the_string = str::from_utf8(the_bytes).expect("not UTF-8");
let the_number: u64 = the_string.parse().expect("not a number");
assert_eq!(the_number, 231);
}
As in other code, these lines can be extracted into a function or a trait to allow for reuse. However, once that path is followed, it would be a good idea to look into one of the many great crates aimed at parsing. This is especially true if there's a need to parse binary data in addition to textual data.
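As a sketch of the "extract into a function" suggestion, the two steps can be wrapped like this. parse_ascii_u64 is a hypothetical name, and collapsing both failure modes into None is my simplification; a real version might want a proper error type:

```rust
use std::str;

/// Hypothetical helper: parses a byte slice containing only ASCII digits.
/// Returns None if the bytes are not valid UTF-8 or not a valid u64.
fn parse_ascii_u64(bytes: &[u8]) -> Option<u64> {
    // No allocation here: from_utf8 only validates and reinterprets the slice.
    let text = str::from_utf8(bytes).ok()?;
    text.parse().ok()
}

fn main() {
    let raw_data = b"123132";
    // Parse the bytes at positions 1, 2 and 3, as in the example above.
    println!("{:?}", parse_ascii_u64(&raw_data[1..4]));
}
```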
I do not know of any way in the standard library, but maybe the atoi crate works for you? Full disclosure: I am its author.
use atoi::atoi;
let (number, digits) = atoi::<u32>(b"42 is the answer"); //returns (42,2)
You can check whether the second element of the tuple is zero to see if the slice starts with a digit.
let (number, digits) = atoi::<u32>(b"x"); //returns (0,0)
let (number, digits) = atoi::<u32>(b"0"); //returns (0,1)
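For the "write an atoi in Rust" option from the question, a std-only sketch with similar leading-digit behavior could look like the following. This is not the atoi crate's implementation; parse_leading_u32 is a hypothetical name, and returning None on overflow is my own choice:

```rust
/// Hypothetical helper: parses leading ASCII digits from `bytes`.
/// Returns the value and the number of digits consumed, or None if the
/// value does not fit into a u32.
fn parse_leading_u32(bytes: &[u8]) -> Option<(u32, usize)> {
    let mut value: u32 = 0;
    let mut digits = 0;
    for &b in bytes {
        if !b.is_ascii_digit() {
            break; // stop at the first non-digit byte
        }
        // Checked arithmetic so that overflow yields None instead of wrapping.
        value = value.checked_mul(10)?.checked_add(u32::from(b - b'0'))?;
        digits += 1;
    }
    Some((value, digits))
}

fn main() {
    println!("{:?}", parse_leading_u32(b"42 is the answer"));
    println!("{:?}", parse_leading_u32(b"x"));
    println!("{:?}", parse_leading_u32(b"0"));
}
```

Because it works on bytes directly, this never needs the input to be valid UTF-8, which is the case the question cares about.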
