How to evaluate and stringify a compile-time constant? - rust

How to evaluate and stringify a compile-time constant into a static str? I suppose there should be a macro for this but couldn't find one. This is the code which shows the intentions and my best effort:
const NUM: i32 = 11;
fn main() {
let s = concat!("test", 10, stringify!(NUM), 'b', true);
assert_eq!(s, "test1011btrue");
}

You can use the concatcp! macro from the const_format crate:
use const_format::concatcp;
const NUM: i32 = 11;
fn main() {
let s = concatcp!("test", 10, NUM, 'b', true);
assert_eq!(s, "test1011btrue");
}
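For context on why the attempt in the question fails: stringify! turns the tokens of its argument into a string without evaluating them, so NUM is never replaced by 11. A minimal std-only sketch of that behavior (the helper name stringified is made up for illustration):

```rust
const NUM: i32 = 11;

// Hypothetical helper: returns what the question's `concat!` call really builds.
fn stringified() -> &'static str {
    // `stringify!(NUM)` yields the identifier text "NUM", not the value 11.
    concat!("test", 10, stringify!(NUM), 'b', true)
}

fn main() {
    println!("{} (NUM = {})", stringified(), NUM);
}
```

concat! itself only accepts literals, which is why a crate-provided macro such as concatcp! is needed to splice in the value of a const.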

Related

How do you pass a const or static parameter to a function?

How do you pass a const or static to a function in Rust?
Why do these not work?:
const COUNT: i32 = 5;
fn main() {
let repeated = "*".repeat(COUNT);
println!("Repeated Value: {}", repeated);
}
And
static COUNT: i32 = 5;
fn main() {
let repeated = "*".repeat(COUNT);
println!("Repeated Value: {}", repeated);
}
They return:
mismatched types
expected `usize`, found `i32`
But these work fine?:
fn main() {
let repeated = "*".repeat(5);
println!("Repeated Value: {}", repeated);
}
And
fn main() {
let count = 5;
let repeated = "*".repeat(count);
println!("Repeated Value: {}", repeated);
}
Surely const works the same way as 5? Both should be type i32
= "*".repeat(COUNT)
vs
= "*".repeat(5)
And similarly shouldn't 'static' work like 'let'? What am I missing here? How do you use a const as a parameter to a function call?
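It has nothing to do with COUNT being a const or a static. repeat takes a usize, not an i32. A bare literal such as 5, or an untyped let count = 5, has no fixed type yet, so the compiler infers usize from the call; COUNT is explicitly declared as i32.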
You either need to define COUNT as a usize
const COUNT: usize = 5;
Or convert COUNT to a usize when you call repeat
let repeated = "*".repeat(COUNT as usize);
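One caveat with the as cast: it turns a negative i32 into a huge usize instead of reporting an error. If COUNT could ever be negative, a checked conversion is one way to hedge. This is only a sketch; the stars helper and the choice to treat negative counts as zero are assumptions for illustration, not part of the answer above:

```rust
use std::convert::TryFrom; // already in the prelude on edition 2021

const COUNT: i32 = 5;

// Hypothetical helper: repeats "*" `count` times, treating negative counts as 0.
fn stars(count: i32) -> String {
    // `usize::try_from` returns Err for negative input instead of wrapping.
    let n = usize::try_from(count).unwrap_or(0);
    "*".repeat(n)
}

fn main() {
    println!("Repeated Value: {}", stars(COUNT));
}
```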

Assertion Error When Accessing Struct Field

I'm trying to declare a global variable whose type is a struct with a function pointer and a char pointer element, { i64 ()*, i8* }, and then set the fields to null during main, but I'm getting an assertion error using a debug version of LLVM.
/media/work/contrib/llvm-project/llvm/lib/IR/ConstantsContext.h:745: void llvm::ConstantUniqueMap<ConstantClass>::remove(ConstantClass*) [with ConstantClass = llvm::ConstantExpr]: Assertion `I != Map.end() && "Constant not found in constant table!"' failed.
I believe this problem is causing another issue during optimization when compiling something a bit more complicated. The error itself occurs when disposing of the module at the end. A distilled runnable example in rust is:
use std::ffi::CString;
extern crate llvm_sys;
pub use self::llvm_sys::prelude::{ LLVMValueRef };
use self::llvm_sys::*;
use self::llvm_sys::prelude::*;
use self::llvm_sys::core::*;
use self::llvm_sys::target::*;
use self::llvm_sys::target_machine::*;
use self::llvm_sys::transforms::pass_manager_builder::*;
fn main() {
unsafe {
let context = LLVMContextCreate();
let module = LLVMModuleCreateWithName(cstr("module"));
build(module, context);
println!("{}", emit_module(module));
LLVMDisposeModule(module);
LLVMContextDispose(context);
}
}
pub fn cstr(string: &str) -> *mut i8 {
CString::new(string).unwrap().into_raw()
}
pub unsafe fn build(module: LLVMModuleRef, context: LLVMContextRef) {
let builder = LLVMCreateBuilderInContext(context);
let mut argtypes = vec!();
let func_type = LLVMFunctionType(LLVMInt64TypeInContext(context), argtypes.as_mut_ptr(), argtypes.len() as u32, false as i32);
let fptr_type = LLVMPointerType(func_type, 0);
let context_type = LLVMPointerType(LLVMInt8TypeInContext(context), 0);
let mut structfields = vec!(fptr_type, context_type);
let struct_type = LLVMStructType(structfields.as_mut_ptr(), structfields.len() as u32, false as i32);
let initializer = LLVMConstNull(struct_type);
let global = LLVMAddGlobal(module, struct_type, cstr("function"));
LLVMSetInitializer(global, initializer);
LLVMSetLinkage(global, LLVMLinkage::LLVMExternalLinkage);
let mut argtypes = vec!();
let main_type = LLVMFunctionType(LLVMInt64TypeInContext(context), argtypes.as_mut_ptr(), argtypes.len() as u32, false as i32);
let function = LLVMAddFunction(module, cstr("main"), main_type);
let bb = LLVMAppendBasicBlockInContext(context, function, cstr("entry"));
LLVMPositionBuilderAtEnd(builder, bb);
let mut indices = vec!(LLVMConstInt(LLVMInt32TypeInContext(context), 0, 0), LLVMConstInt(LLVMInt32TypeInContext(context), 0, 0));
let field = LLVMBuildGEP(builder, global, indices.as_mut_ptr(), indices.len() as u32, cstr(""));
LLVMBuildStore(builder, LLVMConstNull(fptr_type), field);
LLVMBuildRet(builder, LLVMConstInt(LLVMInt64TypeInContext(context), 0, 0));
LLVMDisposeBuilder(builder);
}
pub fn emit_module(module: LLVMModuleRef) -> String {
unsafe { CString::from_raw(LLVMPrintModuleToString(module)).into_string().unwrap() }
}
The full output is:
; ModuleID = 'module'
source_filename = "module"
@function = global { i64 ()*, i8* } zeroinitializer
define i64 @main() {
entry:
store i64 ()* null, i64 ()** getelementptr inbounds ({ i64 ()*, i8* }, { i64 ()*, i8* }* @function, i32 0, i32 0), align 8
ret i64 0
}
/media/work/contrib/llvm-project/llvm/lib/IR/ConstantsContext.h:745: void llvm::ConstantUniqueMap<ConstantClass>::remove(ConstantClass*) [with ConstantClass = llvm::ConstantExpr]: Assertion `I != Map.end() && "Constant not found in constant table!"' failed.
Aborted (core dumped)
Any help or suggestions would be most appreciated. Thanks
I finally figured out what the issue is. In the example above, LLVMStructType() has an alternate version, LLVMStructTypeInContext(); using that instead fixes the assertion error. There is also LLVMModuleCreateWithNameInContext, which should probably be used as well, although the example works without that change.
It's also possible to fix the example by replacing all the InContext versions with their non-context versions.
In my actual program, removing the InContext versions didn't work for some reason. The two non-context functions mentioned above were being used, as well as a couple of calls to the non-context LLVMAppendBasicBlock. Replacing these with the InContext versions fixed all the assertion errors, including the one that started this:
void llvm::Value::doRAUW(llvm::Value*, llvm::Value::ReplaceMetadataUses): Assertion `New->getType() == getType() && "replaceAllUses of value with new value of different type!"' failed.
The non-context functions create their types in LLVM's global context, so the type pointers weren't the same as the ones created in my own context. That mismatch between contexts must have been the cause.

How can I convert from Vec<char> to u32 in Rust without going through String?

My rust code runs in an environment where I have no access to std::string and std::* (but I have access to core::str). How can I convert a Vec<char> to u32 without going through String, such as:
let num_in_chars: Vec<char> = vec!['1', '2'];
// some process here
// let num = ...
// This is how I could do it if I have access to `String`
// let num = num_in_chars.iter().collect::<String>().parse::<u32>().unwrap();
assert_eq!(12, num);
Thanks
You must convert each char to a digit (in the map), then multiply the previous result by 10 and add the new digit (in the try_fold):
/// Returns `None` in case of invalid digit.
pub fn vec_to_int(digits: impl IntoIterator<Item = char>) -> Option<u32> {
const RADIX: u32 = 10;
digits
.into_iter()
.map(|c| c.to_digit(RADIX))
.try_fold(0, |ans, i| i.map(|i| ans * RADIX + i))
}
#[test]
fn it_works() {
let nums = vec!['1', '2'];
let num = vec_to_int(nums);
assert_eq!(Some(12), num);
}
#[test]
fn invalid_digit() {
let nums = vec!['1', 'a'];
let num = vec_to_int(nums);
assert_eq!(None, num);
}
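One caveat with the fold above: ans * RADIX + i can overflow u32 on long inputs (a panic in debug builds, wraparound in release). A core-only sketch with checked arithmetic follows; the name chars_to_u32 and the decision to reject empty input are assumptions of this sketch rather than part of the answer:

```rust
/// Hypothetical variant: `None` on an invalid digit, on overflow, or on empty input.
pub fn chars_to_u32(digits: &[char]) -> Option<u32> {
    if digits.is_empty() {
        return None;
    }
    digits.iter().try_fold(0u32, |acc, &c| {
        let d = c.to_digit(10)?; // not a decimal digit
        acc.checked_mul(10)?.checked_add(d) // None instead of overflowing
    })
}

fn main() {
    println!("{:?}", chars_to_u32(&['1', '2']));
}
```

Taking a slice means a Vec<char> can be passed as &num_in_chars without giving up ownership.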

How to convert a string of digits into a vector of digits?

I'm trying to store a string (or str) of digits, e.g. 12345 into a vector, such that the vector contains {1,2,3,4,5}.
As I'm totally new to Rust, I'm having problems with the types (String, str, char, ...) but also the lack of any information about conversion.
My current code looks like this:
fn main() {
let text = "731671";
let mut v: Vec<i32>;
let mut d = text.chars();
for i in 0..text.len() {
v.push( d.next().to_digit(10) );
}
}
You're close!
First, the index loop for i in 0..text.len() is not necessary since you're going to use an iterator anyway. It's simpler to loop directly over the iterator: for ch in text.chars(). Not only that, but your index loop and the character iterator can diverge, because len() returns the number of bytes while chars() yields Unicode scalar values. Since the string is UTF-8, any non-ASCII character makes it contain fewer scalar values than bytes.
Next hurdle is that to_digit(10) returns an Option, telling you that there is a possibility the character won't be a digit. You can check whether to_digit(10) returned the Some variant of an Option with if let Some(digit) = ch.to_digit(10).
Pieced together, the code might now look like this:
fn main() {
let text = "731671";
let mut v = Vec::new();
for ch in text.chars() {
if let Some(digit) = ch.to_digit(10) {
v.push(digit);
}
}
println!("{:?}", v);
}
Now, this is rather imperative: you're making a vector and filling it digit by digit, all by yourself. You can try a more declarative or functional approach by applying a transformation over the string:
fn main() {
let text = "731671";
let v: Vec<u32> = text.chars().flat_map(|ch| ch.to_digit(10)).collect();
println!("{:?}", v);
}
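The byte-versus-char mismatch mentioned above is easy to see with a non-ASCII string. A small sketch (the helper name byte_and_char_counts is made up for illustration):

```rust
// Hypothetical helper: (number of bytes, number of Unicode scalar values).
fn byte_and_char_counts(text: &str) -> (usize, usize) {
    (text.len(), text.chars().count())
}

fn main() {
    // Pure ASCII digits: one byte per char, so both counts agree.
    println!("{:?}", byte_and_char_counts("731671"));
    // U+00EF (i with diaeresis) takes two bytes in UTF-8, so the counts differ.
    println!("{:?}", byte_and_char_counts("na\u{ef}ve"));
}
```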
ArtemGr's answer is pretty good, but their version will skip any characters that aren't digits. If you'd rather have it fail on bad digits, you can use this version instead:
fn to_digits(text: &str) -> Option<Vec<u32>> {
text.chars().map(|ch| ch.to_digit(10)).collect()
}
fn main() {
println!("{:?}", to_digits("731671"));
println!("{:?}", to_digits("731six71"));
}
Output:
Some([7, 3, 1, 6, 7, 1])
None
To mention the quick and dirty elephant in the room: if you REALLY know your string contains only digits in the range '0'..='9', then you can avoid memory allocations and copies and use the underlying &[u8] representation of the string directly, via str::as_bytes. Subtract b'0' from each element whenever you access it.
If you are doing competitive programming, this is one of the worthwhile speed and memory optimizations.
fn main() {
let text = "12345";
let digit = text.as_bytes();
println!("Text = {:?}", text);
println!("value of digit[3] = {}", digit[3] - b'0');
}
Output:
Text = "12345"
value of digit[3] = 4
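If the digits-only guarantee is less certain, a cheap check keeps the byte trick from returning garbage (or underflowing on bytes below b'0'). A sketch, with digit_at as a hypothetical helper name:

```rust
// Hypothetical helper: the digit at byte `index`, or None if the index is out of
// range or the byte is not an ASCII digit.
fn digit_at(text: &str, index: usize) -> Option<u8> {
    let b = *text.as_bytes().get(index)?;
    if b.is_ascii_digit() {
        Some(b - b'0')
    } else {
        None
    }
}

fn main() {
    println!("{:?}", digit_at("12345", 3));
}
```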
This solution combines ArtemGr's + notriddle's solutions:
fn to_digits(string: &str) -> Vec<u32> {
let opt_vec: Option<Vec<u32>> = string
.chars()
.map(|ch| ch.to_digit(10))
.collect();
match opt_vec {
Some(vec_of_digits) => vec_of_digits,
None => vec![],
}
}
In my case, I implemented this function as a trait method on &str.
pub trait ExtraProperties {
fn to_digits(self) -> Vec<u32>;
}
impl ExtraProperties for &str {
fn to_digits(self) -> Vec<u32> {
let opt_vec: Option<Vec<u32>> = self
.chars()
.map(|ch| ch.to_digit(10))
.collect();
match opt_vec {
Some(vec_of_digits) => vec_of_digits,
None => vec![],
}
}
}
In this way, I transform &str to a vector containing digits.
fn main() {
let cnpj: &str = "123456789";
let nums: Vec<u32> = cnpj.to_digits();
println!("cnpj: {cnpj}"); // cnpj: 123456789
println!("nums: {nums:?}"); // nums: [1, 2, 3, 4, 5, 6, 7, 8, 9]
}

Is there a method like JavaScript's substr in Rust?

I looked at the Rust docs for String but I can't find a way to extract a substring.
Is there a method like JavaScript's substr in Rust? If not, how would you implement it?
str.substr(start[, length])
The closest is probably slice_unchecked (since deprecated in favor of get_unchecked), but it uses byte offsets instead of character indexes and is marked unsafe.
For characters, you can use s.chars().skip(pos).take(len):
fn main() {
let s = "Hello, world!";
let ss: String = s.chars().skip(7).take(5).collect();
println!("{}", ss);
}
Beware of the definition of Unicode characters though.
For bytes, you can use the slice syntax:
fn main() {
let s = b"Hello, world!";
let ss = &s[7..12];
println!("{:?}", ss);
}
You can use the as_str method on the Chars iterator to get back a &str slice after you have advanced the iterator. So to skip the first start chars, you can call
let s = "Some text to slice into";
let mut iter = s.chars();
for _ in 0..start { iter.next(); } // eat up start values (nth(start) would eat start + 1)
let slice = iter.as_str(); // get back a slice of the rest of the iterator
Now if you also want to limit the length, you first need to figure out the byte position of the character at index length:
let end_pos = slice.char_indices().nth(length).map(|(n, _)| n).unwrap_or(0);
let substr = &slice[..end_pos];
This might feel a little roundabout, but Rust is not hiding anything from you that might take up CPU cycles. That said, I wonder why there's no crate yet that offers a substr method.
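Put together as a function, the two steps might look like the sketch below. It consumes exactly start chars, and when fewer than length chars remain it falls back to the end of the string; both are choices of this sketch, and the function name substr is only illustrative:

```rust
// Sketch: the chars `start..start + length` of `s`, as a borrowed slice.
fn substr(s: &str, start: usize, length: usize) -> &str {
    let mut iter = s.chars();
    // Consume exactly `start` chars, stopping early if the string runs out.
    for _ in 0..start {
        if iter.next().is_none() {
            break;
        }
    }
    // The rest of the string, borrowed from `s` without allocating.
    let rest = iter.as_str();
    // Byte offset of the char at index `length`, or the end of `rest` if it has fewer.
    let end = rest
        .char_indices()
        .nth(length)
        .map(|(n, _)| n)
        .unwrap_or(rest.len());
    &rest[..end]
}

fn main() {
    println!("{}", substr("Some text to slice into", 5, 4));
}
```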
This code performs both substring-ing and string-slicing, without panicking or allocating:
use std::ops::{Bound, RangeBounds};
trait StringUtils {
fn substring(&self, start: usize, len: usize) -> &str;
fn slice(&self, range: impl RangeBounds<usize>) -> &str;
}
impl StringUtils for str {
fn substring(&self, start: usize, len: usize) -> &str {
let mut char_pos = 0;
let mut byte_start = 0;
let mut it = self.chars();
loop {
if char_pos == start { break; }
if let Some(c) = it.next() {
char_pos += 1;
byte_start += c.len_utf8();
}
else { break; }
}
char_pos = 0;
let mut byte_end = byte_start;
loop {
if char_pos == len { break; }
if let Some(c) = it.next() {
char_pos += 1;
byte_end += c.len_utf8();
}
else { break; }
}
&self[byte_start..byte_end]
}
fn slice(&self, range: impl RangeBounds<usize>) -> &str {
let start = match range.start_bound() {
Bound::Included(bound) | Bound::Excluded(bound) => *bound,
Bound::Unbounded => 0,
};
let len = match range.end_bound() {
Bound::Included(bound) => *bound + 1,
Bound::Excluded(bound) => *bound,
Bound::Unbounded => self.len(),
}.saturating_sub(start); // saturate so a range that ends before it starts yields "" instead of underflowing
self.substring(start, len)
}
}
fn main() {
let s = "abcdèfghij";
// All three statements should print:
// "abcdè, abcdèfghij, dèfgh, dèfghij."
println!("{}, {}, {}, {}.",
s.substring(0, 5),
s.substring(0, 50),
s.substring(3, 5),
s.substring(3, 50));
println!("{}, {}, {}, {}.",
s.slice(..5),
s.slice(..50),
s.slice(3..8),
s.slice(3..));
println!("{}, {}, {}, {}.",
s.slice(..=4),
s.slice(..=49),
s.slice(3..=7),
s.slice(3..));
}
For my_string.substring(start, len)-like syntax, you can write a custom trait:
trait StringUtils {
fn substring(&self, start: usize, len: usize) -> Self;
}
impl StringUtils for String {
fn substring(&self, start: usize, len: usize) -> Self {
self.chars().skip(start).take(len).collect()
}
}
// Usage:
fn main() {
let phrase: String = "this is a string".to_string();
println!("{}", phrase.substring(5, 8)); // prints "is a str"
}
The solution given by oli_obk does not handle last index of string slice. It can be fixed with .chain(once(s.len())).
Here the function substr implements a substring slice with error handling. If an invalid index is passed, the valid part of the string slice is returned in the Err variant. All corner cases should be handled correctly.
fn substr(s: &str, begin: usize, length: Option<usize>) -> Result<&str, &str> {
use std::iter::once;
let mut itr = s.char_indices().map(|(n, _)| n).chain(once(s.len()));
let beg = itr.nth(begin);
if beg.is_none() {
return Err("");
} else if length == Some(0) {
return Ok("");
}
let end = length.map_or(Some(s.len()), |l| itr.nth(l-1));
if let Some(end) = end {
return Ok(&s[beg.unwrap()..end]);
} else {
return Err(&s[beg.unwrap()..s.len()]);
}
}
let s = "abc🙂";
assert_eq!(Ok("bc"), substr(s, 1, Some(2)));
assert_eq!(Ok("c🙂"), substr(s, 2, Some(2)));
assert_eq!(Ok("c🙂"), substr(s, 2, None));
assert_eq!(Err("c🙂"), substr(s, 2, Some(99)));
assert_eq!(Ok(""), substr(s, 2, Some(0)));
assert_eq!(Err(""), substr(s, 5, Some(4)));
Note that this does not handle unicode grapheme clusters. For example, "y̆es" contains 4 unicode chars but 3 grapheme clusters. Crate unicode-segmentation solves this problem. Unicode grapheme clusters are handled correctly if part
let mut itr = s.char_indices()...
is replaced with
use unicode_segmentation::UnicodeSegmentation;
let mut itr = s.grapheme_indices(true)...
Then the following also works:
assert_eq!(Ok("y̆"), substr("y̆es", 0, Some(1)));
Knowing about the various syntaxes of the slice type might be beneficial for some of the readers.
Reference to a part of a string
&s[6..11]
If you start at index 0, you can omit the start value:
&s[0..1] is equivalent to &s[..1]
Likewise, you can omit the end value if your substring extends to the last byte of the string:
&s[3..s.len()] is equivalent to &s[3..]
This also applies when the slice encompasses the entire string
&s[..]
You can also use the range inclusive operator to include the last value
&s[..=1]
Link to docs: https://doc.rust-lang.org/book/ch04-03-slices.html
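Note that these ranges are byte offsets, and indexing panics if an offset is out of bounds or falls inside a multi-byte character. str::get accepts the same ranges but returns an Option instead. A brief sketch (safe_slice is a hypothetical wrapper, not a standard function):

```rust
// Hypothetical wrapper around `str::get`: None instead of a panic.
fn safe_slice(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}

fn main() {
    let s = "Hello, world!";
    // The shorthand forms select the same bytes as the explicit ones.
    assert_eq!(&s[0..1], &s[..1]);
    assert_eq!(&s[3..s.len()], &s[3..]);
    assert_eq!(&s[..=1], &s[..2]);
    println!("{:?}", safe_slice(s, 7, 12));
    // U+00E9 occupies bytes 1..3 here, so offset 2 is not a char boundary.
    println!("{:?}", safe_slice("h\u{e9}llo", 0, 2));
}
```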
I would suggest you use the crate substring. (And look at its source code if you want to learn how to do this properly.)
I couldn't find the exact substr implementation that I'm familiar with from other programming languages such as JavaScript and Dart.
Here is a possible implementation of a substr method for &str and String.
Let's define a trait so that we can add methods to built-in types (like extensions in Dart).
trait Substr {
fn substr(&self, start: usize, end: usize) -> String;
}
Then implement this trait for &str
impl<'a> Substr for &'a str {
fn substr(&self, start: usize, end: usize) -> String {
if start >= end {
return String::new();
}
self.chars().skip(start).take(end - start).collect()
}
}
Try:
fn main() {
let string = "Hello, world!";
let substring = string.substr(0, 4);
println!("{}", substring); // Hell
}
You can also use .to_string()[ <range> ].
This example slices a temporary copy of the string (to_string() allocates a new String), then mutates the original to demonstrate that the slice is unaffected.
let mut s: String = "Hello, world!".to_string();
let substring: &str = &s.to_string()[..6];
s.replace_range(..6, "Goodbye,");
println!("{} {} universe!", s, substring);
// Goodbye, world! Hello, universe!
I'm not very experienced in Rust but I gave it a try. If someone could correct my answer please don't hesitate.
fn substring(string:String, start:u32, end:u32) -> String {
let mut substr = String::new();
let mut i = start;
while i < end + 1 {
substr.push_str(&*(string.chars().nth(i as usize).unwrap().to_string()));
i += 1;
}
return substr;
}
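Since the answer invites corrections, here is one possible tightening, offered as a sketch rather than a definitive version: it borrows the input, walks the chars once instead of calling nth on every iteration, and clamps out-of-range indexes instead of panicking on unwrap. It keeps the inclusive end of the original; the clamping behavior is an assumption of this sketch:

```rust
// Sketch: chars `start..=end` (inclusive, as in the original), clamped to the string.
fn substring(s: &str, start: usize, end: usize) -> String {
    if end < start {
        return String::new();
    }
    // Number of chars to take; saturating so `end == usize::MAX` cannot overflow.
    let len = (end - start).saturating_add(1);
    s.chars().skip(start).take(len).collect()
}

fn main() {
    println!("{}", substring("Hello, world!", 7, 11));
}
```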
