What are Some and None? - rust

I came across some output I don't understand using Vec::get. Here's the code:
fn main() {
    let command = [('G', 'H'), ('H', '5')];
    for i in 0..3 {
        print!(" {} ", i);
        println!("{:?}", command.get(i));
    }
}
the output is
0 Some(('G', 'H'))
1 Some(('H', '5'))
2 None
I've dabbled in Haskell before, and by that I mean looked at a tutorial site for 10 minutes and ran back to C++, but I remember reading something about Some and None for Haskell. I was surprised to see this here in Rust. Could someone explain why .get() returns Some or None?

The signature of get (for slices, not Vec, since you're using an array/slice) is
fn get(&self, index: usize) -> Option<&T>
That is, it returns an Option, which is an enum defined like
pub enum Option<T> {
    None,
    Some(T),
}
None and Some are the variants of the enum, that is, a value with type Option<T> can either be a None, or it can be a Some containing a value of type T. You can create the Option enum using the variants as well:
let foo = Some(42);
let bar = None;
This is the same as Haskell's core type data Maybe a = Nothing | Just a; both represent an optional value: it's either there (Some/Just) or it's not (None/Nothing).
These types are often used to represent failure when there's only one possible reason for something to fail. For example, .get uses Option to give type-safe, bounds-checked array access: it returns None (i.e. no data) when the index is out of bounds; otherwise it returns a Some containing a reference to the requested element.
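As a minimal sketch of how a caller might consume that Option, the snippet below matches on the result of get. The helper name describe is hypothetical and exists only for illustration.

```rust
// Hypothetical helper: turns the result of `get` into a message.
fn describe(command: &[(char, char)], i: usize) -> String {
    match command.get(i) {
        // `pair` is a reference to the element stored at index `i`.
        Some(pair) => format!("index {} holds {:?}", i, pair),
        // The index was past the end of the slice.
        None => format!("index {} is out of bounds", i),
    }
}

fn main() {
    let command = [('G', 'H'), ('H', '5')];
    for i in 0..3 {
        println!("{}", describe(&command, i));
    }
}
```

Because match must cover both variants, the out-of-bounds case cannot be forgotten.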
See also:
Why don't Option's Some and None variants need to be qualified?
What is the difference between Some and Option in Rust?

Think of Some and None as the canonical "safe" way of working around the fact that the Rust language does not support "safe" use of NULL pointers. Your array holds only two pairs, but the loop asks for indices 0, 1, and 2, so there is nothing at index 2; instead of returning NULL, get returns None.
Rust provides safety guarantees by forcing us at compile-time, via Some / None, to always deal with the possibility of None being returned.

command is not a vector (type Vec<T>), it is a fixed-size array (type [(char, char); 2] in your case), and arrays are automatically borrowed into slices (views into arrays), hence you can use all methods defined on slices, including get:
Returns the element of a slice at the given index, or None if the index is out of bounds.
The behavior is pretty obvious: when given index is valid, it returns Some with the element under that index, otherwise it returns None.
There is another way to access elements in a slice - the indexing operator, which should be familiar to you:
let nums = [1, 2, 3];
let x = nums[1];
It returns the element of the slice directly, but it will panic (older Rust versions called this "failing the current task") if the index is out of bounds:
fn main() {
    let x = [1, 2];
    for i in 0..3 {
        println!("{}", x[i]);
    }
}
This program fails:
% ./main2
1
2
task '<main>' failed at 'index out of bounds: the len is 2 but the index is 2', main2.rs:4
The get() method is needed for convenience; it saves you from checking in advance if the given index is valid.
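To make the convenience concrete, here is a small sketch that compares a manual bounds check with get. Both function names are hypothetical, and copied is used only to turn the Option<&i32> into an Option<i32>.

```rust
// Manual bounds check followed by indexing.
fn element_checked_by_hand(x: &[i32], i: usize) -> Option<i32> {
    if i < x.len() {
        Some(x[i])
    } else {
        None
    }
}

// The same result using `get`; `copied` turns Option<&i32> into Option<i32>.
fn element_with_get(x: &[i32], i: usize) -> Option<i32> {
    x.get(i).copied()
}

fn main() {
    let x = [1, 2];
    for i in 0..3 {
        // Neither call panics, even for the out-of-bounds index 2.
        println!(
            "{:?} {:?}",
            element_checked_by_hand(&x, i),
            element_with_get(&x, i)
        );
    }
}
```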
If you don't know what Some and None really are and why they are needed in general, you should read the official tutorial; it explains them because they are a very basic concept.

The Option enum has two variants:
1- None, which indicates failure or the absence of a value
2- Some, a tuple-like variant that wraps the value
If you had to write this structure in an OOP language, for example TypeScript, you could write it like this, which may make the situation easier to visualize.
Define Option as the base interface:
interface Option<T = any> {
    // declare all the methods here
    // unwrap is used to access the wrapped value
    unwrap(): T;
}
Write a Some class that implements Option. Some returns the wrapped value:
class Some<T> implements Option<T> {
    private value: T;
    constructor(v: T) {
        this.value = v;
    }
    unwrap(): T {
        return this.value;
    }
}
Write a None class that also implements Option. None returns null:
class None<T> implements Option<T> {
    // you do not need a constructor here
    unwrap(): T {
        return null as T;
    }
}

The other answers, which explain that get() returns the Option enum, are accurate, but I think it also helps to know how to remove the Some from the printed output. A quick way is to call unwrap on the Option, although this is not recommended in production code: unwrap panics when the value is None, which is what happens at index 2 in the loop below. For a discussion of Option, take a look at the Rust book here.
Updated with unwrap code in playground (below)
fn main() {
    let command = [('G', 'H'), ('H', '5')];
    for i in 0..3 {
        print!(" {} ", i);
        println!("{:?}", command.get(i).unwrap());
    }
}
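If the goal is only to print the inner value without risking a panic, one possible alternative is if let. This is a sketch, not part of the answer above; the helper name show and the fallback text are made up for illustration.

```rust
// Hypothetical helper: formats the pair at `i`, or a fallback if there is none.
fn show(command: &[(char, char)], i: usize) -> String {
    if let Some(pair) = command.get(i) {
        // Prints the tuple itself, without the surrounding `Some(...)`.
        format!("{:?}", pair)
    } else {
        String::from("(nothing here)")
    }
}

fn main() {
    let command = [('G', 'H'), ('H', '5')];
    for i in 0..3 {
        println!(" {} {}", i, show(&command, i));
    }
}
```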

Related

How to define an array or vector that can contain any primitive data type in rust?

let v: [std::any::Any] = <something>;
Solution 1 (faster + safer)
enum Data {
    Usize(usize),
    U8(u8),
    U16(u16),
    U32(u32),
    U64(u64),
    Isize(isize),
    I8(i8),
    I16(i16),
    I32(i32),
    I64(i64),
    F32(f32),
    F64(f64),
    Bool(bool),
    Char(char),
}
fn main() {
    let v: [Data; 3] = [Data::Usize(1), Data::Bool(true), Data::Char('a')];
}
Solution 2 (flexible, supports almost any type)
use std::any::Any;
fn main() {
    let v: [Box<dyn Any>; 3] = [Box::new(1), Box::new('a'), Box::new(true)];
}
You can't. The very point of an array or a Vec is that they are homogeneous, which also means each member needs to be the same size. But a u8 and a u128, which both implement std::any::Any, do not have the same size. One therefore needs a layer of indirection, e.g. via [Box<dyn std::any::Any>; _] or Vec<Box<dyn std::any::Any>>.
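Storing the values is only half of the story; to read them back you have to ask each boxed value for a concrete type. The following sketch uses downcast_ref for that. The helper describe and the set of types it checks are assumptions chosen for illustration.

```rust
use std::any::Any;

// Hypothetical helper: reports which of a few known types a value holds.
fn describe(value: &dyn Any) -> String {
    if let Some(n) = value.downcast_ref::<i32>() {
        format!("i32: {}", n)
    } else if let Some(c) = value.downcast_ref::<char>() {
        format!("char: {}", c)
    } else if let Some(b) = value.downcast_ref::<bool>() {
        format!("bool: {}", b)
    } else {
        String::from("unknown type")
    }
}

fn main() {
    let v: [Box<dyn Any>; 3] = [Box::new(1), Box::new('a'), Box::new(true)];
    for item in &v {
        // `&**item` borrows the boxed value itself; passing the Box directly
        // would make `describe` inspect the Box rather than its contents.
        println!("{}", describe(&**item));
    }
}
```

Every type that should be recognized needs its own downcast_ref branch, which is part of why an enum is usually easier to work with.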
Contrary to JavaScript, Rust is a statically typed language. This means that every variable has a single type that is known at compile time. This includes vector and array elements. You can work around this limitation by wrapping your data in an enum that keeps track of the actual type of the contained value at runtime.
Note that I strongly advise against using Any until you have lots of experience with Rust and know that you are using it for the right reason (and in particular not just to reproduce a pattern found in scripting languages).
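For comparison, here is a sketch of how the enum approach reads values back with match. The enum Value and its three variants are a reduced, hypothetical stand-in for a full list of primitive types.

```rust
// A reduced version of the enum idea with three hypothetical variants.
enum Value {
    Int(i64),
    Flag(bool),
    Letter(char),
}

fn describe(v: &Value) -> String {
    // The compiler checks that every variant is handled.
    match v {
        Value::Int(n) => format!("integer {}", n),
        Value::Flag(b) => format!("boolean {}", b),
        Value::Letter(c) => format!("character {}", c),
    }
}

fn main() {
    let items = vec![Value::Int(1), Value::Flag(true), Value::Letter('a')];
    for item in &items {
        println!("{}", describe(item));
    }
}
```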

Flatten vector of enums in Rust

I am trying to flatten a vector of Enum in Rust, but I am having some issues:
enum Foo {
    A(i32),
    B(i32, i32),
}
fn main() {
    let vf = vec![Foo::A(1), Foo::A(2), Foo::B(3, 4)];
    let vi: Vec<i32> = vf
        .iter()
        .map(|f| match f {
            Foo::A(i) => [i].into_iter(),
            Foo::B(i, j) => [i, j].into_iter(),
        })
        .collect(); // this does not compile
    // I want vi = [1, 2, 3, 4]. vf must still be valid
}
I could just use a regular for loop and insert elements into an existing vector, but that would be no fun. I'd like to know if there is a more idiomatic Rust way of doing it.
Here's a way to do it that produces an iterator (rather than necessarily a vector, as the fold() based solution does).
use std::iter::once;
enum Foo {
    A(i32),
    B(i32, i32),
}
fn main() {
    let vf = vec![Foo::A(1), Foo::A(2), Foo::B(3, 4)];
    let vi: Vec<i32> = vf
        .iter()
        .flat_map(|f| {
            match f {
                &Foo::A(i) => once(i).chain(None),
                &Foo::B(i, j) => once(i).chain(Some(j)),
            }
        })
        .collect();
    dbg!(vi);
}
This does essentially the same thing that you were attempting, but in a way which will succeed. Here are the parts I changed, in the order they appear in the code:
I used .flat_map() instead of .map(). flat_map accepts a function which returns an iterator and produces the elements of that iterator ("flattening") whereas .map() would have just given the iterator.
I used & in the match patterns. This is because, since you are using .iter() on the vector (which is appropriate for your requirement “vf must still be valid”), you have references to enums, and pattern matching on a reference to an enum will normally give you references to its elements, but we almost certainly want to handle the i32s by value instead. There are several other things I could have done, such as using the * dereference operator on the values instead, but this is concise and tidy.
You tried to .into_iter() an array. Unfortunately, in Rust at the time this was written (before the 2021 edition), this does not do what you want and you can't actually return that iterator, for somewhat awkward reasons; since the 2021 edition, calling .into_iter() on an array iterates over its elements by value. But even when it means what you wanted, you get an error because the two match arms have unequal types: one is an iterator over [i32; 1] and the other is an iterator over [i32; 2].
Instead, you need to build two possible iterators which are clearly of the same type. There are lots of ways to do this, and the way I picked was to use Iterator::chain to combine once(i), an iterator that returns the single element i, with an Option<i32> (which implements IntoIterator) that contains the second element j if it exists.
Notice that in the first match arm I wrote the seemingly useless expression .chain(None); this is so that the two arms have the same type. Another way to write the same thing, which is arguably clearer since it doesn't duplicate code that has to be identical, is:
let (i, opt_j) = match f {
    &Foo::A(i) => (i, None),
    &Foo::B(i, j) => (i, Some(j)),
};
once(i).chain(opt_j)
In either case, the iterator's type is std::iter::Chain<std::iter::Once<i32>, std::option::IntoIter<i32>> — you don't need to know this exactly, just notice that there must be a type which handles both the A(i) and the B(i, j) cases.
First of all, you need to change the i32 references to owned values, e.g. by dereferencing them. Then you can avoid going through intermediate arrays by using fold():
enum Foo {
    A(i32),
    B(i32, i32),
}
fn main() {
    let vf = vec![Foo::A(1), Foo::A(2), Foo::B(3, 4)];
    let vi: Vec<i32> = vf
        .iter()
        .fold(Vec::new(), |mut acc, f| {
            match f {
                Foo::A(i) => acc.push(*i),
                Foo::B(i, j) => {
                    acc.push(*i);
                    acc.push(*j);
                }
            }
            acc
        });
}
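Another option, shown here only as a sketch, is to let both match arms return a Vec<i32>. The arms then trivially share one type, at the cost of a small heap allocation per element. The function name flatten is hypothetical.

```rust
enum Foo {
    A(i32),
    B(i32, i32),
}

// Borrows the input, so the caller's vector stays valid afterwards.
fn flatten(vf: &[Foo]) -> Vec<i32> {
    vf.iter()
        .flat_map(|f| match f {
            // Both arms produce a Vec<i32>, so their types agree.
            Foo::A(i) => vec![*i],
            Foo::B(i, j) => vec![*i, *j],
        })
        .collect()
}

fn main() {
    let vf = vec![Foo::A(1), Foo::A(2), Foo::B(3, 4)];
    let vi = flatten(&vf);
    println!("{:?}", vi);
    // `vf` is still usable here.
    println!("{}", vf.len());
}
```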

Why do I get the error "no method named collect found for type Option"?

I'm doing the Exercism Rust problem in which a string has arbitrary length, but could be null, and needs to be classified based on its last two graphemes.
My understanding is that Option is used to account for something that could be null, or could be not null, when this is unknown at compile time, so I've tried this:
extern crate unicode_segmentation;
use unicode_segmentation::UnicodeSegmentation;
pub fn reply(message: &str) -> &str {
    let message_opt: Option<[&str; 2]> = message.graphemes(true).rev().take(2).nth(0).collect();
}
My understanding is that the right-hand side will give an array of two &strs if the string is nonzero in length, or will return None, and the left-hand side will store it as an Option (so that I can later match on Some or None).
The error is:
no method named 'collect' found for type std::option::Option<&str> in the current scope
This doesn't make sense to me, as I (think) I'm trying to collect the output of an iterator, I am not collecting an option.
The error message isn't lying to you. Option does not have a method called collect.
I (think) I'm trying to collect the output of an iterator
Iterator::nth returns an Option. Option does not implement Iterator; you cannot call collect on it.
Option<[&str; 2]>
You can't do this, either:
How do I collect into an array?
I'd write this as
let mut graphemes = message.graphemes(true).fuse();
let message_opt = match (graphemes.next_back(), graphemes.next_back()) {
    (Some(a), Some(b)) => Some([a, b]),
    _ => None,
};
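The same pattern can be tried without the external crate. The sketch below is an assumption-laden variant: it works on chars rather than graphemes, the function name last_two is made up, and it returns the pair in reading order (second-to-last first), unlike the snippet above.

```rust
// Assumption: operates on `char`s, not graphemes, to stay within std.
fn last_two(message: &str) -> Option<[char; 2]> {
    let mut chars = message.chars();
    // Tuple fields are evaluated left to right, so the first call
    // yields the last character and the second call the one before it.
    match (chars.next_back(), chars.next_back()) {
        (Some(last), Some(second_last)) => Some([second_last, last]),
        _ => None,
    }
}

fn main() {
    // `nth` returns an Option holding a single item, not an iterator.
    let first: Option<char> = "hey?".chars().nth(0);
    println!("{:?}", first);
    println!("{:?}", last_two("hey?"));
    println!("{:?}", last_two("a"));
}
```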

Is it possible to map a function over a Vec without allocating a new Vec?

I have the following:
enum SomeType {
    VariantA(String),
    VariantB(String, i32),
}
fn transform(x: SomeType) -> SomeType {
    // very complicated transformation, reusing parts of x in order to produce result:
    match x {
        SomeType::VariantA(s) => SomeType::VariantB(s, 0),
        SomeType::VariantB(s, i) => SomeType::VariantB(s, 2 * i),
    }
}
fn main() {
    let mut data = vec![
        SomeType::VariantA("hello".to_string()),
        SomeType::VariantA("bye".to_string()),
        SomeType::VariantB("asdf".to_string(), 34),
    ];
}
I would now like to call transform on each element of data and store the resulting value back in data. I could do something like data.into_iter().map(transform).collect(), but this will allocate a new Vec. Is there a way to do this in place, reusing the allocated memory of data? There once was Vec::map_in_place in Rust, but it was removed some time ago.
As a work-around, I've added a Dummy variant to SomeType and then do the following:
for x in &mut data {
    let original = ::std::mem::replace(x, SomeType::Dummy);
    *x = transform(original);
}
This does not feel right, and I have to deal with SomeType::Dummy everywhere else in the code, although it should never be visible outside of this loop. Is there a better way of doing this?
Your first problem is not map, it's transform.
transform takes ownership of its argument, while the Vec has ownership of its elements. One of the two has to give, and poking a hole in the Vec would be a bad idea: what if transform panics?
The best fix, thus, is to change the signature of transform to:
fn transform(x: &mut SomeType) { ... }
then you can just do:
for x in &mut data { transform(x) }
Other solutions will be clunky, as they will need to deal with the fact that transform might panic.
No, it is not possible in general because the size of each element might change as the mapping is performed (fn transform(u8) -> u32).
Even when the sizes are the same, it's non-trivial.
In this case, you don't need to create a Dummy variant because creating an empty String is cheap; only 3 pointer-sized values and no heap allocation:
impl SomeType {
    fn transform(&mut self) {
        use SomeType::*;
        let old = std::mem::replace(self, VariantA(String::new()));
        // Note this line for the detailed explanation
        *self = match old {
            VariantA(s) => VariantB(s, 0),
            VariantB(s, i) => VariantB(s, 2 * i),
        };
    }
}
for x in &mut data {
    x.transform();
}
An alternate implementation that just replaces the String:
impl SomeType {
    fn transform(&mut self) {
        use SomeType::*;
        *self = match self {
            VariantA(s) => {
                let s = std::mem::replace(s, String::new());
                VariantB(s, 0)
            }
            VariantB(s, i) => {
                let s = std::mem::replace(s, String::new());
                VariantB(s, 2 * *i)
            }
        };
    }
}
In general, yes, you have to create some dummy value to do this generically and with safe code. Many times, you can wrap your whole element in Option and call Option::take to achieve the same effect.
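A minimal sketch of the Option::take idea, assuming the elements are stored as Option<SomeType> (the helper transform_all is hypothetical, and the derives are added only so the result can be printed and compared):

```rust
#[derive(Debug, PartialEq)]
enum SomeType {
    VariantA(String),
    VariantB(String, i32),
}

fn transform(x: SomeType) -> SomeType {
    match x {
        SomeType::VariantA(s) => SomeType::VariantB(s, 0),
        SomeType::VariantB(s, i) => SomeType::VariantB(s, 2 * i),
    }
}

// Each slot is an Option so that `take` can leave None behind temporarily.
fn transform_all(data: &mut [Option<SomeType>]) {
    for slot in data.iter_mut() {
        // `take` moves the value out and leaves None in the slot.
        if let Some(old) = slot.take() {
            *slot = Some(transform(old));
        }
    }
}

fn main() {
    let mut data = vec![
        Some(SomeType::VariantA("hello".to_string())),
        Some(SomeType::VariantB("asdf".to_string(), 34)),
    ];
    transform_all(&mut data);
    println!("{:?}", data);
}
```

Here None plays the role of the dummy value: if transform panics, the slot holds None, which is still a valid value to drop.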
See also:
Change enum variant while moving the field to the new variant
Why is it so complicated?
See this proposed and now-closed RFC for lots of related discussion. My understanding of that RFC (and the complexities behind it) is that there's a period of time during which your value would be in an undefined state, which is not safe. If a panic were to happen at that exact moment, then when your value is dropped, you might trigger undefined behavior, a bad thing.
If your code were to panic at the commented line, then the value of self is a concrete, known value. If it were some unknown value, dropping that string would try to drop that unknown value, and we are back in C. This is the purpose of the Dummy value - to always have a known-good value stored.
You even hinted at this (emphasis mine):
I have to deal with SomeType::Dummy everywhere else in the code, although it should never be visible outside of this loop
That "should" is the problem. During a panic, that dummy value is visible.
See also:
How can I swap in a new value for a field in a mutable reference to a structure?
Temporarily move out of borrowed content
How do I move out of a struct field that is an Option?
The now-removed implementation of Vec::map_in_place spans almost 175 lines of code, most of it dealing with unsafe code and the reasoning for why it is actually safe! Some crates have re-implemented this concept and attempted to make it safe; you can see an example in Sebastian Redl's answer.
You can write a map_in_place in terms of the take_mut or replace_with crates:
fn map_in_place<T, F>(v: &mut [T], f: F)
where
    F: Fn(T) -> T,
{
    for e in v {
        // Pass a reference so that f is not moved on the first iteration.
        take_mut::take(e, &f);
    }
}
However, if this panics in the supplied function, the program aborts completely; you cannot recover from the panic.
Alternatively, you could supply a placeholder element that sits in the empty spot while the inner function executes:
use std::mem;
fn map_in_place_with_placeholder<T, F>(v: &mut [T], f: F, mut placeholder: T)
where
    F: Fn(T) -> T,
{
    for e in v {
        let mut tmp = mem::replace(e, placeholder);
        tmp = f(tmp);
        placeholder = mem::replace(e, tmp);
    }
}
If this panics, the placeholder you supplied will sit in the panicked slot.
Finally, you could produce the placeholder on-demand; basically replace take_mut::take with take_mut::take_or_recover in the first version.
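A std-only way to sketch the on-demand placeholder is to require T: Default and use std::mem::take. This is an assumption on my part, not what the take_mut crate does internally, and the function name map_in_place_default is made up.

```rust
use std::mem;

// Assumption: T implements Default, so a placeholder can be made on demand.
fn map_in_place_default<T, F>(v: &mut [T], f: F)
where
    T: Default,
    F: Fn(T) -> T,
{
    for e in v {
        // `mem::take` leaves T::default() in the slot while `f` runs.
        let old = mem::take(e);
        *e = f(old);
    }
}

fn main() {
    let mut words = vec![String::from("hello"), String::from("bye")];
    map_in_place_default(&mut words, |mut s| {
        s.push('!');
        s
    });
    println!("{:?}", words);
}
```

If f panics, the slot being processed holds the default value, so the vector is still safe to drop.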

How do I cope with lazy iterators?

I'm trying to sort an array with a map() over an iterator.
struct A {
    b: Vec<B>,
}
#[derive(PartialEq, Eq, PartialOrd, Ord)]
struct B {
    c: Vec<i32>,
}
fn main() {
    let mut a = A { b: Vec::new() };
    let b = B { c: vec![5, 2, 3] };
    a.b.push(b);
    a.b.iter_mut().map(|b| b.c.sort());
}
Gives the warning:
warning: unused `std::iter::Map` that must be used
  --> src/main.rs:16:5
   |
16 |     a.b.iter_mut().map(|b| b.c.sort());
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: #[warn(unused_must_use)] on by default
   = note: iterators are lazy and do nothing unless consumed
Which is true, sort() isn't actually called here. This warning is described in the book, but I don't understand why this variation with iter_mut() works fine:
a.b.iter_mut().find(|b| b == b).map(|b| b.c.sort());
As the book you linked to says:
If you are trying to execute a closure on an iterator for its side effects, use for instead.
That way it works, and it's much clearer to anyone reading the code. You should use map when you want to transform a vector to a different one.
I don't understand why this variation with iter_mut() works fine:
a.b.iter_mut().find(|b| b == b).map(|b| b.c.sort());
It works because find is not lazy; it's an iterator consumer. It returns an Option not an Iterator. This might be why it is confusing you, because Option also has a map method, which is what you are using here.
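The difference can be observed by counting closure calls. The sketch below is illustrative only; the function names are made up, and a Cell is used so the counter can be read while the lazy iterator still exists.

```rust
use std::cell::Cell;

// Iterator::map is lazy: the closure runs only once the iterator is consumed.
fn lazy_map_calls() -> (usize, usize) {
    let calls = Cell::new(0);
    let v = vec![3, 1, 2];
    let mapped = v.iter().map(|x| {
        calls.set(calls.get() + 1);
        x * 2
    });
    let before = calls.get(); // nothing has run yet
    let _doubled: Vec<i32> = mapped.collect();
    let after = calls.get(); // one call per element
    (before, after)
}

// Option::map is eager: `find` returns an Option, and its `map`
// runs the closure immediately if a value was found.
fn option_map_calls() -> usize {
    let calls = Cell::new(0);
    let v = vec![3, 1, 2];
    let _ran = v
        .iter()
        .find(|x| **x == 1)
        .map(|_| calls.set(calls.get() + 1));
    calls.get()
}

fn main() {
    println!("{:?}", lazy_map_calls());
    println!("{}", option_map_calls());
}
```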
As others have said, map is intended for transforming data, without modifying it and without any other side-effects. If you really want to use map, you can map over the collection and assign it back:
fn main() {
    let mut a = A { b: Vec::new() };
    let b = B { c: vec![5, 2, 3] };
    a.b.push(b);
    a.b = a
        .b
        .into_iter()
        .map(|mut b| {
            b.c.sort();
            b
        })
        .collect();
}
Note that the vector's sort method returns (), so you have to explicitly return b, which now holds the sorted vector, from the mapping function.
I use for_each.
According to the doc:
It is equivalent to using a for loop on the iterator, although break and continue are not possible from a closure. It's generally more idiomatic to use a for loop, but for_each may be more legible when processing items at the end of longer iterator chains. In some cases for_each may also be faster than a loop, because it will use internal iteration on adaptors like Chain.
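Applied to the sorting problem from the question, the two side-effect styles might look like the sketch below. The function names are hypothetical; the structs mirror the ones in the question.

```rust
struct A {
    b: Vec<B>,
}

struct B {
    c: Vec<i32>,
}

// Plain `for` loop over mutable references.
fn sort_with_for(a: &mut A) {
    for b in &mut a.b {
        b.c.sort();
    }
}

// `for_each` consumes the iterator, so the closure really runs.
fn sort_with_for_each(a: &mut A) {
    a.b.iter_mut().for_each(|b| b.c.sort());
}

fn main() {
    let mut first = A { b: vec![B { c: vec![5, 2, 3] }] };
    let mut second = A { b: vec![B { c: vec![9, 7, 8] }] };
    sort_with_for(&mut first);
    sort_with_for_each(&mut second);
    println!("{:?} {:?}", first.b[0].c, second.b[0].c);
}
```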
