Is there a Rust equivalent of the following C++ sample (that I've written for this question):
union example {
uint32_t fullValue;
struct {
unsigned sixteen1: 16;
unsigned sixteen2: 16;
};
struct {
unsigned three: 3;
unsigned twentynine: 29;
};
};
example e;
e.fullValue = 12345678;
std::cout << e.sixteen1 << ' ' << e.sixteen2 << ' ' << e.three << ' ' << e.twentynine;
For reference, I'm writing a CPU emulator, and being able to easily split out binary parts of a variable like this and reference them by different names makes the code much simpler. I know how to do this in C++ (as above), but am struggling to work out how to do the equivalent in Rust.
You could do this by creating a newtype struct and extracting the relevant bits using masking and/or shifts.
The code to do this is slightly longer (but not by much) and, importantly, avoids the undefined behavior you are triggering in C++ by reading a union member other than the one last written.
#[derive(Debug, Clone, Copy)]
struct Example(pub u32);
impl Example {
pub fn sixteen1(self) -> u32 {
self.0 & 0xffff
}
pub fn sixteen2(self) -> u32 {
self.0 >> 16
}
pub fn three(self) -> u32 {
self.0 & 7
}
pub fn twentynine(self) -> u32 {
self.0 >> 3
}
}
pub fn main() {
let e = Example(12345678);
println!("{} {} {} {}", e.sixteen1(), e.sixteen2(), e.three(), e.twentynine());
}
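If you also need to write individual fields, the same masking idea extends to setters. The following is only a sketch: the names set_three and set_twentynine are hypothetical and not part of the answer above, and the struct is reduced to two of the four getters to keep it short.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Example(pub u32);

impl Example {
    pub fn three(self) -> u32 {
        self.0 & 0b111
    }

    pub fn twentynine(self) -> u32 {
        self.0 >> 3
    }

    // Hypothetical setter: replace the low 3 bits, keep the upper 29.
    pub fn set_three(&mut self, value: u32) {
        self.0 = (self.0 & !0b111) | (value & 0b111);
    }

    // Hypothetical setter: replace the upper 29 bits, keep the low 3.
    // Bits of `value` above bit 28 are shifted out and lost.
    pub fn set_twentynine(&mut self, value: u32) {
        self.0 = (self.0 & 0b111) | (value << 3);
    }
}

fn main() {
    let mut e = Example(12345678);
    e.set_three(1);
    println!("{} {}", e.three(), e.twentynine());
}
```

Each setter clears the target bits first and then ORs in the new value, so the neighbouring field is left untouched.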
Update
You can make some macros for extracting certain bits:
// Create a u32 mask that's all 0 except for one patch of 1's that
// begins at index `start` and continues for `len` digits.
macro_rules! mask {
($start:expr, $len:expr) => {
{
assert!($start >= 0);
assert!($len > 0);
assert!($start + $len <= 32);
if $len == 32 {
assert!($start == 0);
0xffffffffu32
} else {
((1u32 << $len) - 1) << $start
}
}
}
}
const _: () = assert!(mask!(3, 7) == 0b1111111000);
const _: () = assert!(mask!(0, 32) == 0xffffffff);
// Select `num_bits` bits from `value` starting at `start`.
// For example, select_bits!(0xabcd1234, 8, 12) == 0xd12
// because the created mask is 0x000fff00.
macro_rules! select_bits {
($value:expr, $start:expr, $num_bits:expr) => {
{
let mask = mask!($start, $num_bits);
($value & mask) >> mask.trailing_zeros()
}
}
}
const _: () = assert!(select_bits!(0xabcd1234, 8, 12) == 0xd12);
Then either use these directly on a u32 or make a struct to implement taking certain bits:
struct Example {
v: u32,
}
impl Example {
pub fn first_16(&self) -> u32 {
select_bits!(self.v, 0, 16)
}
pub fn last_16(&self) -> u32 {
select_bits!(self.v, 16, 16)
}
pub fn first_3(&self) -> u32 {
select_bits!(self.v, 0, 3)
}
pub fn last_29(&self) -> u32 {
select_bits!(self.v, 3, 29)
}
}
fn main() {
// Use hex for more easily checking the expected values.
let e = Example { v: 0x12345678 };
println!("{:x} {:x} {:x} {:x}", e.first_16(), e.last_16(), e.first_3(), e.last_29());
// Or use decimal for checking with the provided C code.
let e = Example { v: 12345678 };
println!("{} {} {} {}", e.first_16(), e.last_16(), e.first_3(), e.last_29());
}
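As a possible alternative to macros, the same logic can probably be written as const fn, which gives type-checked u32 arguments and still works in const contexts. This is a sketch under that assumption; the functions mask and select_bits below are hypothetical counterparts of the macros, not code from the answer.

```rust
// Hypothetical const-fn counterpart of the `mask!` macro.
const fn mask(start: u32, len: u32) -> u32 {
    assert!(len > 0);
    assert!(start + len <= 32);
    if len == 32 {
        // Shifting a u32 by 32 would overflow, so handle the full mask directly.
        u32::MAX
    } else {
        ((1u32 << len) - 1) << start
    }
}

// Hypothetical const-fn counterpart of `select_bits!`.
const fn select_bits(value: u32, start: u32, num_bits: u32) -> u32 {
    (value & mask(start, num_bits)) >> start
}

// Evaluated at compile time, like the macro checks above.
const _: () = assert!(mask(3, 7) == 0b11_1111_1000);
const _: () = assert!(select_bits(0xabcd1234, 8, 12) == 0xd12);

fn main() {
    println!("{:x}", select_bits(0x12345678, 16, 16));
}
```

Because the start index is already known, the result is shifted by start directly instead of counting the mask's trailing zeros.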
Original Answer
While Rust does have unions, it may be better to use a struct for your use case and just get bits from the struct's single value.
// Create a u32 mask that's all 0 except for one patch of 1's that
// begins at index `start` and continues for `len` digits.
macro_rules! mask {
($start:expr, $len:expr) => {
{
assert!($start >= 0);
assert!($len > 0);
assert!($start + $len <= 32);
let mut mask = 0u32;
for i in 0..$len {
mask |= 1u32 << (i + $start);
}
mask
}
}
}
struct Example {
v: u32,
}
impl Example {
pub fn first_16(&self) -> u32 {
self.get_bits(mask!(0, 16))
}
pub fn last_16(&self) -> u32 {
self.get_bits(mask!(16, 16))
}
pub fn first_3(&self) -> u32 {
self.get_bits(mask!(0, 3))
}
pub fn last_29(&self) -> u32 {
self.get_bits(mask!(3, 29))
}
// Get the bits of `self.v` specified by `mask`.
// Example:
// self.v == 0xa9bf01f3
// mask == 0x00fff000
// The result is 0xbf0
fn get_bits(&self, mask: u32) -> u32 {
// Find how many trailing zeros `mask` (in binary) has.
// For example, the mask 0xa0 == 0b10100000 has 5.
let mut trailing_zeros_count_of_mask = 0;
while mask & (1u32 << trailing_zeros_count_of_mask) == 0 {
trailing_zeros_count_of_mask += 1;
}
(self.v & mask) >> trailing_zeros_count_of_mask
}
}
fn main() {
// Use hex for more easily checking the expected values.
let e = Example { v: 0x12345678 };
println!("{:x} {:x} {:x} {:x}", e.first_16(), e.last_16(), e.first_3(), e.last_29());
// Or use decimal for checking with the provided C code.
let e = Example { v: 12345678 };
println!("{} {} {} {}", e.first_16(), e.last_16(), e.first_3(), e.last_29());
}
This setup makes it easy to select any range of bits you want. For example, if you want to get the middle 16 bits of the u32, you just define:
pub fn middle_16(&self) -> u32 {
self.get_bits(mask!(8, 16))
}
And you don't even really need the struct. Instead of having get_bits() be a method, you could define it to take a u32 value and mask, and then define functions like
pub fn first_3(v: u32) -> u32 {
get_bits(v, mask!(0, 3))
}
Note
I think this Rust code works the same regardless of your machine's endianness, but I've only run it on my little-endian machine. You should double check it if it could be a problem for you.
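To illustrate the endianness point: shifts and masks operate on the numeric value of the u32, not on its bytes in memory, so they should give the same result on any target. Only the in-memory byte order differs. A small sketch (the helper names low16 and high16 are made up for this example):

```rust
// Hypothetical helpers that use only masks and shifts.
fn low16(v: u32) -> u32 {
    v & 0xffff
}

fn high16(v: u32) -> u32 {
    v >> 16
}

fn main() {
    let v: u32 = 0x12345678;

    // Value-level operations: identical on little- and big-endian machines.
    assert_eq!(low16(v), 0x5678);
    assert_eq!(high16(v), 0x1234);

    // The native byte layout is what actually depends on the platform.
    let bytes = v.to_ne_bytes();
    if cfg!(target_endian = "little") {
        assert_eq!(bytes, [0x78, 0x56, 0x34, 0x12]);
    } else {
        assert_eq!(bytes, [0x12, 0x34, 0x56, 0x78]);
    }
    println!("{:x} {:x} {:x?}", low16(v), high16(v), bytes);
}
```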
You could use the bitfield crate.
This appears to approximate what you are looking for at least on a syntactic level.
For reference, your original C++ code prints:
24910 188 6 1543209
Now there is no built-in functionality in Rust for bitfields, but there is the bitfield crate.
It allows specifying a newtype struct and then generates setters/getters for parts of the wrapped value.
For example pub twentynine, set_twentynine: 31, 3; means that it should generate the setter set_twentynine() and getter twentynine() that sets/gets the bits 3 through 31, both included.
So, transferring your C++ union into a Rust bitfield, this is what it could look like:
use bitfield::bitfield;
bitfield! {
pub struct Example (u32);
pub full_value, set_full_value: 31, 0;
pub sixteen1, set_sixteen1: 15, 0;
pub sixteen2, set_sixteen2: 31, 16;
pub three, set_three: 2, 0;
pub twentynine, set_twentynine: 31, 3;
}
fn main() {
let mut e = Example(0);
e.set_full_value(12345678);
println!(
"{} {} {} {}",
e.sixteen1(),
e.sixteen2(),
e.three(),
e.twentynine()
);
}
24910 188 6 1543209
Note that those generated setters/getters are small enough to have a very high chance to be inlined by the compiler, giving you zero overhead.
Of course, if you want to avoid adding an additional dependency and instead want to implement the getters/setters by hand, look at @apilat's answer instead.
Alternative: the c2rust-bitfields crate:
use c2rust_bitfields::BitfieldStruct;
#[repr(C, align(1))]
#[derive(BitfieldStruct)]
struct Example {
#[bitfield(name = "full_value", ty = "u32", bits = "0..=31")]
#[bitfield(name = "sixteen1", ty = "u16", bits = "0..=15")]
#[bitfield(name = "sixteen2", ty = "u16", bits = "16..=31")]
#[bitfield(name = "three", ty = "u8", bits = "0..=2")]
#[bitfield(name = "twentynine", ty = "u32", bits = "3..=31")]
data: [u8; 4],
}
fn main() {
let mut e = Example { data: [0; 4] };
e.set_full_value(12345678);
println!(
"{} {} {} {}",
e.sixteen1(),
e.sixteen2(),
e.three(),
e.twentynine()
);
}
24910 188 6 1543209
The advantage of this one is that you can specify the type of each field yourself; in the bitfield crate example, all of them were u32.
I'm unsure, however, how endianness plays into this one. It might yield different results on a system with a different endianness, so further research may be needed to be sure.
example
struct MyStruct{
row: u8,
column: u8
}
let my_vector: Vec<MyStruct> = vec![/* about 100 items */];
Let's say I have a simple setup like this ^. I want to sort the my_vector list of, say, 100 items by row AND THEN by column, so that my vector looks like sample 1 instead of sample 2.
sample 1
my_vector = vec![
MyStruct { row: 10, column: 1 },
MyStruct { row: 10, column: 2 },
MyStruct { row: 10, column: 3 }, ]
sample 2
my_vector = vec![
MyStruct { row: 10, column: 3 },
MyStruct { row: 10, column: 1 },
MyStruct { row: 10, column: 2 }, ]
Currently I've been working off this post, which describes how to sort by a single key with the sort_by_key() function, but the issue I'm having is that I can only sort by a single key, not by two or more keys. This results in problems like sample 2, where I get my rows sorted but my columns in a random order.
I want both my rows and columns to be ordered. How can I do this? Thanks.
Since tuples in Rust implement Ord with lexicographic comparison, you can use one of the sort_by_key() family of methods with a tuple as the key:
my_vector.sort_unstable_by_key(|item| (item.row, item.column));
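As a small illustration of how tuple keys compare, and of how one key could be reversed, here is a sketch. The function name sort_rows_then_columns_desc is made up for this example; std::cmp::Reverse is a standard-library wrapper that inverts the ordering of the value it holds.

```rust
use std::cmp::Reverse;

#[derive(Debug, PartialEq)]
struct MyStruct {
    row: u8,
    column: u8,
}

// Hypothetical helper: ascending by row, then descending by column.
fn sort_rows_then_columns_desc(items: &mut [MyStruct]) {
    items.sort_unstable_by_key(|item| (item.row, Reverse(item.column)));
}

fn main() {
    // Tuples compare element by element, left to right.
    assert!((10, 1) < (10, 2));
    assert!((9, 200) < (10, 0));

    let mut my_vector = vec![
        MyStruct { row: 10, column: 3 },
        MyStruct { row: 9, column: 1 },
        MyStruct { row: 10, column: 1 },
    ];
    sort_rows_then_columns_desc(&mut my_vector);
    println!("{:?}", my_vector);
}
```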
You can also implement PartialOrd and Ord for MyStruct (see the std::cmp module):
use core::cmp::Ordering;
#[derive(Debug, Eq, PartialEq, Ord)]
struct MyStruct {
row: u8,
column: u8,
}
impl PartialOrd for MyStruct {
fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
if self.row == other.row {
return Some(self.column.cmp(&other.column));
}
Some(self.row.cmp(&other.row))
}
}
fn main() {
let mut my_vector = vec![
MyStruct { row: 10, column: 3 },
MyStruct { row: 10, column: 1 },
MyStruct { row: 10, column: 2 },
];
my_vector.sort();
println!("{:?}", my_vector);
}
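If the desired order happens to match the order in which the fields are declared, deriving both traits may be enough, since derived PartialOrd and Ord compare fields top to bottom. A minimal sketch, assuming row is declared before column (the function sorted is a made-up helper):

```rust
// Derived comparison goes field by field in declaration order:
// `row` first, then `column`.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
struct MyStruct {
    row: u8,
    column: u8,
}

// Hypothetical helper that sorts using the derived ordering.
fn sorted(mut items: Vec<MyStruct>) -> Vec<MyStruct> {
    items.sort();
    items
}

fn main() {
    let my_vector = sorted(vec![
        MyStruct { row: 10, column: 3 },
        MyStruct { row: 9, column: 7 },
        MyStruct { row: 10, column: 1 },
    ]);
    println!("{:?}", my_vector);
}
```

Reordering the fields in the struct definition would change the sort order, so a hand-written impl is the safer choice when that coupling is not wanted.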
By combining the information from the previous two answers, and with the help of GitHub Copilot, here is a working way to sort by two keys.
Pass a comparison closure to the sort_by method of a mutable vector:
my_vector.sort_by(| a, b | if a.row == b.row {
a.column.partial_cmp(&b.column).unwrap()
} else {
a.row.partial_cmp(&b.row).unwrap()
});
Then printing the sorted vector (named locations in my own code, with a Location struct in place of MyStruct) using:
println!("{:#?}", locations);
would output:
[
    Location {
        row: 1,
        column: 1,
    },
    Location {
        row: 1,
        column: 2,
    },
    Location {
        row: 2,
        column: 1,
    },
    Location {
        row: 2,
        column: 2,
    },
]
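You can use sort_by():
my_vector.sort_by(|a, b| ...);
The closure receives two elements, a and b, and must return a std::cmp::Ordering (Less, Equal or Greater) that says how a compares to b; it does not return a bool.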
I think using match is a better alternative because it compares each pair of rows only once. Using sort_unstable_by might be a bit better still, because an unstable sort is typically faster. It is also easy to make any comparison go in reverse (descending order): simply swap a and b in that cmp() call.
use std::cmp::Ordering;

struct MyStruct {
    row: u8,
    column: u8,
}

fn sort(items: &mut [MyStruct]) {
    items.sort_unstable_by(|a, b| {
        // Compare rows first; fall back to columns only on a tie.
        match a.row.cmp(&b.row) {
            Ordering::Equal => a.column.cmp(&b.column),
            v => v,
        }
    });
}
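The same "compare rows, then columns on a tie" logic can also be expressed with Ordering::then_with from the standard library, which only evaluates the second comparison when the first one is Equal. A sketch (the function name by_row_then_column is made up for this example):

```rust
use std::cmp::Ordering;

#[derive(Debug, PartialEq)]
struct MyStruct {
    row: u8,
    column: u8,
}

// Hypothetical comparator: rows first, columns only when rows are equal.
fn by_row_then_column(a: &MyStruct, b: &MyStruct) -> Ordering {
    a.row.cmp(&b.row).then_with(|| a.column.cmp(&b.column))
}

fn main() {
    let mut items = vec![
        MyStruct { row: 10, column: 3 },
        MyStruct { row: 2, column: 9 },
        MyStruct { row: 10, column: 1 },
    ];
    items.sort_unstable_by(by_row_then_column);
    println!("{:?}", items);
}
```

More keys can be added by chaining further then_with calls.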
I would like to make an array of vectors of different types, and I realize that in Rust this isn't as straightforward as:
let array_of_vecs = [
vec![],
vec![],
vec![]
]
Instead, I took a look through the Rust Book and found this https://doc.rust-lang.org/book/ch08-01-vectors.html#using-an-enum-to-store-multiple-types.
fn main() {
enum DataType {
Int(Vec<i32>),
Float(Vec<f64>),
Text(Vec<String>)
}
let array_of_vecs = [
DataType::Int(vec![1, 3]),
DataType::Text(vec![]),
DataType::Float(vec![])
];
println!("{:?}",array_of_vecs[0][1]);
}
However, when I run this, I get the following error:
error[E0608]: cannot index into a value of type `DataType`
--> src/main.rs:15:19
|
15 | println!("{:?}",array_of_vecs[0][1]);
| ^^^^^^^^^^^^^^^^^^^
I understand that array_of_vecs[0] is a DataType, but isn't it also a Vec<i32>? How do I access the Vec<i32>?
You have to destructure the vectors. While every variant of your DataType is a tuple variant with a single vector field in this case, it's possible to have whatever types you want, so the Rust compiler can't guarantee that array_of_vecs[0] is a vector. This means that you have to account for every case:
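// Match on a reference: matching on array_of_vecs[0] by value would try
// to move the Vec out of the array, which the compiler rejects.
match &array_of_vecs[0] {
    DataType::Int(v) => println!("{:?}", v[1]),
    DataType::Float(v) => println!("{:?}", v[1]),
    DataType::Text(v) => println!("{:?}", v[1]),
}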
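When only one variant is of interest, if let or a small accessor method can be shorter than a full match. The following is a sketch: the method as_ints is a hypothetical helper, not something the enum provides on its own.

```rust
#[derive(Debug)]
enum DataType {
    Int(Vec<i32>),
    Float(Vec<f64>),
    Text(Vec<String>),
}

impl DataType {
    // Hypothetical helper: borrow the inner integers if this is the Int variant.
    fn as_ints(&self) -> Option<&[i32]> {
        match self {
            DataType::Int(v) => Some(v.as_slice()),
            _ => None,
        }
    }
}

fn main() {
    let array_of_vecs = [
        DataType::Int(vec![1, 3]),
        DataType::Text(vec![]),
        DataType::Float(vec![]),
    ];

    // Borrow the element so the Vec is not moved out of the array.
    if let DataType::Int(v) = &array_of_vecs[0] {
        println!("{:?}", v[1]);
    }

    // The accessor returns None for the other variants.
    println!("{:?}", array_of_vecs[1].as_ints());
}
```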
You probably need to implement the Index trait for your type, so that Rust knows what you're trying to do when you index the value with [1]. The example from the standard library documentation shows the pattern:
use std::ops::Index;
enum Nucleotide {
A,
C,
G,
T,
}
struct NucleotideCount {
a: usize,
c: usize,
g: usize,
t: usize,
}
impl Index<Nucleotide> for NucleotideCount {
type Output = usize;
fn index(&self, nucleotide: Nucleotide) -> &Self::Output {
match nucleotide {
Nucleotide::A => &self.a,
Nucleotide::C => &self.c,
Nucleotide::G => &self.g,
Nucleotide::T => &self.t,
}
}
}
let nucleotide_count = NucleotideCount {a: 14, c: 9, g: 10, t: 12};
assert_eq!(nucleotide_count[Nucleotide::A], 14);
assert_eq!(nucleotide_count[Nucleotide::C], 9);
assert_eq!(nucleotide_count[Nucleotide::G], 10);
assert_eq!(nucleotide_count[Nucleotide::T], 12);
https://doc.rust-lang.org/std/ops/trait.Index.html
I am learning Rust and recently went through an exercise where I had to iterate through numbers that could go in either direction. I tried the below with unexpected results.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Point {
x: i32,
y: i32
}
fn test() {
let p1 = Point { x: 1, y: 8 };
let p2 = Point { x: 3, y: 6 };
let all_x = p1.x..=p2.x;
println!("all_x: {:?}", all_x.clone().collect::<Vec<i32>>());
let all_y = p1.y..=p2.y;
println!("all_y: {:?}", all_y.clone().collect::<Vec<i32>>());
let points: Vec<Point> = all_x.zip(all_y).map(|(x, y)| Point { x, y }).collect();
println!("points: {:?}", points);
}
The output was
all_x: [1, 2, 3]
all_y: []
points: []
After some googling I found an explanation and some old answers which basically amount to using (a..b).rev() as needed.
My question is, how do I do this in a dynamic way? If I use an if...else like so
let all_x = if p1.x < p2.x { (p1.x..=p2.x) } else { (p2.x..=p1.x).rev() };
I get a type error because the if and else branches have different types:
|
58 | let all_x = if p1.x < p2.x { (p1.x..=p2.x) }
| - ------------- expected because of this
| _________________|
| |
59 | | else { (p2.x..=p1.x).rev() };
| |____________^^^^^^^^^^^^^^^^^^^_- `if` and `else` have incompatible types
| |
| expected struct `RangeInclusive`, found struct `Rev`
|
= note: expected type `RangeInclusive<_>`
found struct `Rev<RangeInclusive<_>>`
After trying a bunch of different variations on let all_x: dyn Range<Item = i32>, let all_x: dyn Iterator<Item = i32>, etc, the only way I have managed to do this is by turning them into collections and then back to iterators.
let all_x: Vec<i32>;
if p1.x < p2.x { all_x = (p1.x..=p2.x).collect(); }
else { all_x = (p2.x..=p1.x).rev().collect(); }
let all_x = all_x.into_iter();
println!("all_x: {:?}", all_x.clone().collect::<Vec<i32>>());
let all_y: Vec<i32>;
if p1.y < p2.y { all_y = (p1.y..=p2.y).collect(); }
else { all_y = (p2.y..=p1.y).rev().collect(); }
let all_y = all_y.into_iter();
println!("all_y: {:?}", all_y.clone().collect::<Vec<i32>>());
which provides the desired outcome
all_x: [1, 2, 3]
all_y: [8, 7, 6]
points: [Point { x: 1, y: 8 }, Point { x: 2, y: 7 }, Point { x: 3, y: 6 }]
but is a bit repetitive, inelegant and I'm assuming not very efficient at large numbers. Is there a better way to handle this situation?
NOTE: Sorry for including the Point struct. I could not get my example to work with x1, x2, etc. Probably a different question for a different post lol.
You can use dynamic dispatch: wrap each iterator in a Box and return it as a trait object, a dyn Iterator in this case. For example:
fn maybe_reverse_range(init: usize, end: usize, reverse: bool) -> Box<dyn Iterator<Item = usize>> {
    if reverse {
        Box::new((init..end).rev())
    } else {
        Box::new(init..end)
    }
}
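Applied to the Point example from the question, the boxed approach might look like the sketch below. The helpers span and points_between are hypothetical names; span picks the direction from its arguments instead of taking a flag, and uses inclusive i32 ranges as in the question.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Point {
    x: i32,
    y: i32,
}

// Hypothetical helper: iterate from `start` to `end` inclusive, in either direction.
fn span(start: i32, end: i32) -> Box<dyn Iterator<Item = i32>> {
    if start <= end {
        Box::new(start..=end)
    } else {
        Box::new((end..=start).rev())
    }
}

// Hypothetical helper: pair up the x and y spans into points.
fn points_between(p1: Point, p2: Point) -> Vec<Point> {
    span(p1.x, p2.x)
        .zip(span(p1.y, p2.y))
        .map(|(x, y)| Point { x, y })
        .collect()
}

fn main() {
    let p1 = Point { x: 1, y: 8 };
    let p2 = Point { x: 3, y: 6 };
    println!("{:?}", points_between(p1, p2));
}
```

The cost is one heap allocation per iterator and a virtual call per item, which is usually negligible next to collecting into a Vec first.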
The enum itertools::Either can be used to solve the incompatible type error in the if/else statement. A function like get_range_iter below using Either can reduce the code repetition.
use itertools::Either;
fn get_range_iter(start: i32, end: i32) -> impl Iterator<Item=i32> {
if start < end {
Either::Left(start..=end)
} else {
Either::Right((end..=start).rev())
}
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Point {
x: i32,
y: i32
}
fn main() {
let p1 = Point { x: 1, y: 8 };
let p2 = Point { x: 3, y: 6 };
let all_x = get_range_iter(p1.x, p2.x);
let all_y = get_range_iter(p1.y, p2.y);
println!("all_x: {:?}", all_x.collect::<Vec<_>>());
println!("all_y: {:?}", all_y.collect::<Vec<_>>());
}
I want to compare two pointers within this loop:
#[derive(Debug)]
struct Test {
first: i32,
second: i32
}
fn main() {
let test = vec![Test {first: 1, second: 2}, Test {first: 3, second: 4}, Test {first: 5, second: 6}];
for item in test.iter() {
println!("--- {:?}", item);
println!("item {:p}", item);
println!("test.last().unwrap() {:p}", test.last().unwrap());
// if item == test.last().unwrap() {
// println!("Last item!");
// }
}
}
The println! output shows that, for the last item, both addresses are the same:
--- Test { first: 1, second: 2 }
item 0x563caaf3bb40
test.last().unwrap() 0x563caaf3bb50
--- Test { first: 3, second: 4 }
item 0x563caaf3bb48
test.last().unwrap() 0x563caaf3bb50
--- Test { first: 5, second: 6 }
item 0x563caaf3bb50
test.last().unwrap() 0x563caaf3bb50
But when I uncomment the if statement the following error is thrown:
error[E0369]: binary operation `==` cannot be applied to type `&Test`
--> src/main.rs:20:17
|
20 | if item == test.last().unwrap() {
| ---- ^^ -------------------- &Test
| |
| &Test
|
= note: an implementation of `std::cmp::PartialEq` might be missing for `&Test`
How can I compare only the two pointers?
When you compare references with ==, you are actually comparing the values they point to, not the addresses. This is because std contains a number of implementations of this form:
impl<'_, '_, A, B> PartialEq<&'_ B> for &'_ A
where
A: PartialEq<B> + ?Sized,
B: ?Sized,
that do exactly that.
If you want to compare the pointers themselves you can use std::ptr::eq:
pub fn eq<T: ?Sized>(a: *const T, b: *const T) -> bool
Note that even though it takes raw pointers, it is safe because it does not dereference the pointers. Since there is an automatic coercion from a reference to a raw pointer, you can use:
if std::ptr::eq(item, test.last().unwrap()) {
println!("Last item!");
}
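To make the difference between value equality and pointer equality concrete, here is a small sketch. It assumes PartialEq is derived on Test (which the question's struct does not do) purely so that == can be shown side by side; the helper is_last is a made-up name.

```rust
#[derive(Debug, PartialEq)]
struct Test {
    first: i32,
    second: i32,
}

// Hypothetical helper: true only if `item` is the very element stored last in `items`.
fn is_last(items: &[Test], item: &Test) -> bool {
    items.last().map_or(false, |last| std::ptr::eq(item, last))
}

fn main() {
    let test = vec![
        Test { first: 1, second: 2 },
        Test { first: 5, second: 6 },
        Test { first: 5, second: 6 },
    ];

    // == compares field values, so the two identical elements are equal...
    assert!(test[1] == test[2]);
    // ...but only the element at the last address is pointer-equal to last().
    assert!(!is_last(&test, &test[1]));
    assert!(is_last(&test, &test[2]));

    for item in test.iter() {
        if is_last(&test, item) {
            println!("Last item! {:?}", item);
        }
    }
}
```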