Generating a random Character from range of 2 characters


I would like to randomly give "flag" the value of either "D" or "C", but I'm having some problems...
Would appreciate if someone could help out on what I'm missing or provide an easy way to achieve this.
Tried this but without success:
let mut letter: char = rng.gen_range(b'A', b'Z') as char;

You can just use an array of all of your options (in this case ['C', 'D']) and use SliceRandom::choose to pick one at random. I would generally recommend this since it doesn't assume that 'C' and 'D' are consecutive letters, which may no longer hold if you later extend your code to more than two characters.
use rand::prelude::*;
let mut rng = thread_rng();
let letter = ['C', 'D'].choose(&mut rng); // Option<&char>; None only if the slice is empty
Alternatively, if you really do want to use gen_range despite the above, you need to pass in a range value, like 'C'..='D' (inclusive range from 'C' to 'D'). There's no need to use byte literals here, since char ranges are already valid.
use rand::prelude::*;
let mut rng = thread_rng();
let letter = rng.gen_range('C'..='D');

Related

taking only a int from a text with int string in Rust

I need to extract only the integer from a string like "Critical: 3\r\n". Note that the value changes every time, so I can't search for "3"; I need to search for a generic integer.
Thanks.

Many ways to do it. There are already some answers. Here is one more approach:
let s = "Critical: 3\r\n";
let s_res = s.split(":").collect::<Vec<&str>>()[1].trim();
println!("s_res = {s_res:?}"); // "3"
In the above code s_res will be a string (&str). To convert that string to an integer, you can do something like this:
let n: isize = s_res.parse().expect("Failed to parse the integer!");
println!("n = {n}"); // 3
Note that, depending on your needs, you might want to add some extra validations/asserts, in case you expect the pattern might change (for example, the number of colons not to be 1, etc.).
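As a rough illustration of the extra validation mentioned above, here is a minimal sketch that returns an Option instead of panicking when the colon is missing or the value is not a number. The helper name parse_after_colon and the choice of i64 are assumptions made for this example, not part of the original answer.

```rust
/// Hypothetical helper: returns the integer that follows the first ':' in `line`,
/// or `None` if there is no colon or the remainder is not a valid integer.
fn parse_after_colon(line: &str) -> Option<i64> {
    // `split_once` yields the text before and after the first ':'.
    let (_label, value) = line.split_once(':')?;
    // `trim` drops the leading space and the trailing "\r\n".
    value.trim().parse().ok()
}

fn main() {
    println!("{:?}", parse_after_colon("Critical: 3\r\n"));
    println!("{:?}", parse_after_colon("no colon here"));
}
```

Returning an Option leaves it to the caller to decide what a malformed line should mean, rather than aborting the program.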

Building on @AlexanderKrauze's comment, the most common way to do so is using a regex, which lets you look for any pattern in a string:
use regex::{Match, Regex};
let your_text = "Critical: 3\r\n";
let re = Regex::new(r"\d+").unwrap(); // matches one or more consecutive digits
let result: Option<Match> = re.find(your_text); // returns the first match, if any
let number: u32 = result.map(|m| m.as_str().parse::<u32>().unwrap()).unwrap_or(0); // converts to int, defaulting to 0 when nothing matched
print!("{}", number);
That would be the code for it. To match exactly one digit, use r"\d" instead.
The regex crate documentation has more details.

You can use chars to get an iterator over the chars of a string, and then apply filter on that iterator to keep only the digits (is_digit).
fn main() {
    let my_str: String = "Critical: 3\r\n".to_owned();
    // Keep every decimal digit found in the string, in order.
    let digits: String = my_str.chars().filter(|c| c.is_digit(10)).collect();
    println!("{}", digits);
}
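One thing to keep in mind with the filter approach is that it concatenates every digit in the string, so an input such as "12 of 30" would yield "1230". If only the first number is wanted and pulling in a regex feels heavy, a standard-library sketch could look like the following. The helper name first_integer and the u32 result type are assumptions for illustration.

```rust
/// Hypothetical helper: parses the first run of consecutive ASCII digits in `s`.
/// Returns `None` if there are no digits or the number does not fit in a u32.
fn first_integer(s: &str) -> Option<u32> {
    // Byte offset of the first digit, if any.
    let start = s.find(|c: char| c.is_ascii_digit())?;
    let rest = &s[start..];
    // The run ends at the first non-digit, or at the end of the string.
    let end = rest
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(rest.len());
    rest[..end].parse().ok()
}

fn main() {
    println!("{:?}", first_integer("Critical: 3\r\n"));
    println!("{:?}", first_integer("Warning: 12 of 30"));
}
```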

How to get a substring of a &str based on character index?

I am trying to write a program that takes a list of words and then, if the word has an even length, prints the two middle letters. If the word has an odd length, it prints the single middle letter.
I can find the index of the middle letter(s), but I do not know how to use that index to print the corresponding letters of the word.
fn middle(wds: &[&str]) {
    for word in wds {
        // Character index of the middle of the word.
        let index = word.chars().count() / 2;
        match word.chars().count() % 2 {
            0 => println!("Even word found"),
            _ => println!("odd word found"),
        }
    }
}
fn main(){
let wordlist = ["Some","Words","to","test","testing","elephant","absolute"];
middle(&wordlist);
}

You can use slices for this, specifically &str slices. Note the &.
These links might be helpful:
https://riptutorial.com/rust/example/4146/string-slicing
https://doc.rust-lang.org/book/ch04-03-slices.html
fn main() {
let s = "elephant";
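let mid = s.len() / 2; // len() counts bytes, so this assumes single-byte (ASCII) characters
let sliced = &s[mid - 1..mid + 1]; // the two middle letters of an even-length word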
println!("{}", sliced);
}

Hey, after posting I found two different ways of doing it. Having two separate approaches in my head was confusing me and stopping me from finding the exact answer.
// I fixed printing the middle letter of an odd-length word with
word.chars().nth(index).unwrap()
// To fix the even-length case I used
&word[index-1..index+1]
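Both snippets above mix character counts with byte indices, which only lines up for ASCII words. As a hedged sketch of how the two cases could be combined while staying correct for multi-byte characters, the following translates character positions into byte offsets with char_indices. The function name and signature are chosen here for illustration and are not from the original posts.

```rust
/// Hypothetical helper: returns the middle character of an odd-length word,
/// or the two middle characters of an even-length word, counted in `char`s.
fn middle(word: &str) -> &str {
    let count = word.chars().count();
    if count == 0 {
        return "";
    }
    // Position of the first middle char and how many chars to take.
    let (first, take) = if count % 2 == 0 {
        (count / 2 - 1, 2)
    } else {
        (count / 2, 1)
    };
    // Byte offset of every char, followed by the end of the string.
    let mut offsets = word
        .char_indices()
        .map(|(i, _)| i)
        .chain(std::iter::once(word.len()));
    let start = offsets.nth(first).unwrap();
    // `nth` continues from where the previous call stopped.
    let end = offsets.nth(take - 1).unwrap();
    &word[start..end]
}

fn main() {
    for word in ["Some", "Words", "to", "test", "testing", "elephant", "absolute"] {
        println!("{} -> {}", word, middle(word));
    }
}
```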

Slice a string containing Unicode chars

I have a piece of text with characters of different bytelength.
let text = "Hello привет";
I need to take a slice of the string given start (included) and end (excluded) character indices. I tried this
let slice = &text[start..end];
and got the following error
thread 'main' panicked at 'byte index 7 is not a char boundary; it is inside 'п' (bytes 6..8) of `Hello привет`'
I suppose it happens since Cyrillic letters are multi-byte and the [..] notation takes chars using byte indices. What can I use if I want to slice using character indices, like I do in Python:
slice = text[start:end] ?
I know I can use the chars() iterator and manually walk through the desired substring, but is there a more concise way?

Possible solutions to codepoint slicing
I know I can use the chars() iterator and manually walk through the desired substring, but is there a more concise way?
If you know the exact byte indices, you can slice a string:
let text = "Hello привет";
println!("{}", &text[2..10]);
This prints "llo пр". So the problem is to find out the exact byte position. You can do that fairly easily with the char_indices() iterator (alternatively you could use chars() with char::len_utf8()):
let text = "Hello привет";
let end = text.char_indices().map(|(i, _)| i).nth(8).unwrap();
println!("{}", &text[2..end]);
As another alternative, you can first collect the string into Vec<char>. Then, indexing is simple, but to print it as a string, you have to collect it again or write your own function to do it.
let text = "Hello привет";
let text_vec = text.chars().collect::<Vec<_>>();
println!("{}", text_vec[2..8].iter().cloned().collect::<String>());
Why is this not easier?
As you can see, neither of these solutions is all that great. This is intentional, for two reasons:
As str is simply a UTF-8 buffer, indexing by Unicode codepoints is an O(n) operation. Usually, people expect the [] operator to be an O(1) operation. Rust makes this runtime complexity explicit and doesn't try to hide it. In both solutions above you can clearly see that it's not O(1).
But the more important reason:
Unicode codepoints are generally not a useful unit
What Python does (and what you think you want) is not all that useful. It all comes down to the complexity of language and thus the complexity of Unicode. Python slices by Unicode codepoints. This is what a Rust char represents. It is 32 bits wide (a few fewer bits would suffice, but it is rounded up to a power of 2).
But what you actually want to do is slice user perceived characters. But this is an explicitly loosely defined term. Different cultures and languages regard different things as "one character". The closest approximation is a "grapheme cluster". Such a cluster can consist of one or more unicode codepoints. Consider this Python 3 code:
>>> s = "Jürgen"
>>> s[0:2]
'Ju'
Surprising, right? This is because the string above is:
0x004A LATIN CAPITAL LETTER J
0x0075 LATIN SMALL LETTER U
0x0308 COMBINING DIAERESIS
...
This is an example of a combining character that is rendered as part of the previous character. Python slicing does the "wrong" thing here.
Another example:
>>> s = "ﬁre"
>>> s[0:2]
'ﬁr'
Also not what you'd expect. This time, fi is actually the ligature ﬁ, which is one codepoint.
There are far more examples where Unicode behaves in a surprising way. See the links at the bottom for more information and examples.
So if you want to work with international strings that should be able to work everywhere, don't do codepoint slicing! If you really need to semantically view the string as a series of characters, use grapheme clusters. To do that, the crate unicode-segmentation is very useful.
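The same surprise can be reproduced in Rust with nothing but the standard library. The sketch below mirrors the two Python examples; first_chars is a hypothetical helper written for this illustration, and the strings are spelled with explicit escapes so the combining mark and the ligature are visible in the source.

```rust
/// Hypothetical helper: the first `n` code points of `s`, like Python's s[0:n].
fn first_chars(s: &str, n: usize) -> String {
    s.chars().take(n).collect()
}

fn main() {
    // "Jürgen" spelled as 'u' followed by U+0308 COMBINING DIAERESIS.
    let name = "Ju\u{0308}rgen";
    // Counts code points, which is one more than the visible letters.
    println!("{}", name.chars().count());
    // The diaeresis is cut off from the 'u'.
    println!("{}", first_chars(name, 2));
    // U+FB01 LATIN SMALL LIGATURE FI is a single code point.
    println!("{}", first_chars("\u{FB01}re", 2));
}
```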
Further resources on this topic:
Blogpost "Let's stop ascribing meaning to unicode codepoints"
Blogpost "Breaking our Latin-1 assumptions"
http://utf8everywhere.org/

A UTF-8 encoded string may contain characters which consist of multiple bytes. In your case, п starts at index 6 (inclusive) and ends at position 8 (exclusive), so index 7 is not the start of a character. This is why your error occurred.
You may use str::char_indices() for solving this (remember, that getting to a position in a UTF-8 string is O(n)):
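fn get_utf8_slice(string: &str, start: usize, end: usize) -> Option<&str> {
    assert!(end >= start);
    string.char_indices().nth(start).and_then(|(start_pos, _)| {
        let rest = &string[start_pos..];
        // Byte offsets within `rest`, plus its length so `end` may reach the end of the string.
        rest.char_indices()
            .map(|(pos, _)| pos)
            .chain(std::iter::once(rest.len()))
            .nth(end - start)
            .map(|end_pos| &rest[..end_pos])
    })
}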
You may use str::chars() if you are fine with getting a String:
let string: String = text.chars().take(end).skip(start).collect();

Here is a function which retrieves a UTF-8 slice, with the following pros:
handles all edge cases (empty input, 0-width output ranges, out-of-scope ranges);
never panics;
uses start-inclusive, end-exclusive ranges.
pub fn utf8_slice(s: &str, start: usize, end: usize) -> Option<&str> {
let mut iter = s.char_indices()
.map(|(pos, _)| pos)
.chain(Some(s.len()))
.skip(start)
.peekable();
let start_pos = *iter.peek()?;
for _ in start..end { iter.next(); }
Some(&s[start_pos..*iter.peek()?])
}
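To make the listed edge cases concrete, here is a small sketch that repeats the function unchanged and exercises a few ranges. The sample inputs are only illustrative choices, not taken from the original answer.

```rust
pub fn utf8_slice(s: &str, start: usize, end: usize) -> Option<&str> {
    // Byte offset of every char, followed by the end of the string.
    let mut iter = s.char_indices()
        .map(|(pos, _)| pos)
        .chain(Some(s.len()))
        .skip(start)
        .peekable();
    let start_pos = *iter.peek()?;
    for _ in start..end { iter.next(); }
    Some(&s[start_pos..*iter.peek()?])
}

fn main() {
    let text = "Hello привет";
    // Character indices 6..12 cover the Cyrillic word.
    println!("{:?}", utf8_slice(text, 6, 12));
    // A zero-width range is still a valid (empty) slice.
    println!("{:?}", utf8_slice(text, 3, 3));
    // An end index past the last character gives None rather than a panic.
    println!("{:?}", utf8_slice("abc", 1, 5));
}
```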

[] operator for strings, link with slices for vectors

Why do you have to walk over the string to find the nᵗʰ letter when you do s[n], where s is a string? (According to https://doc.rust-lang.org/book/strings.html)
From what I understood, a string is an array of chars, and a char is an array of 4 bytes or a 4-byte number. So would getting the nth letter be similar to doing v[4*n..4*n+4], where v is a vector?
What is the cost of v[i..j] ?
I would assume that the cost of v[i..j] is j-i, and so the cost of s[n] should be 4.

Note: The second edition of The Rust Programming Language has an improved, smoother explanation of strings in Rust, which you might wish to read as well. The answer below, although still accurate, quotes from the first edition of the book.
I will try to clarify these misconceptions about strings in Rust by quoting from the book (https://doc.rust-lang.org/book/strings.html).
A ‘string’ is a sequence of Unicode scalar values encoded as a stream of UTF-8 bytes. All strings are guaranteed to be a valid encoding of UTF-8 sequences.
With this in mind, plus the fact that UTF-8 encoded code points are variably sized (1 to 4 bytes depending on the character), all strings in Rust, whether they are &str or String, are not arrays of characters and cannot be treated as such. The section on slicing explains why:
Because strings are valid UTF-8, they do not support indexing:
let s = "hello";
println!("The first letter of s is {}", s[0]); // ERROR!!!
Usually, access to a vector with [] is very fast. But, because each character in a UTF-8 encoded string can be multiple bytes, you have to walk over the string to find the nᵗʰ letter of a string. This is a significantly more expensive operation, and we don’t want to be misleading.
Unlike what was mentioned in the question, one cannot do s[n], because although in theory this would allow us to fetch the nth byte in constant time, that byte is not guaranteed to make any sense on its own.
What is the cost of v[i..j] ?
The cost of slicing is actually constant, because it is done at byte-level:
You can get a slice of a string with slicing syntax:
let dog = "hachiko";
let hachi = &dog[0..5];
But note that these are byte offsets, not character offsets. So this will fail at runtime:
let dog = "忠犬ハチ公";
let hachi = &dog[0..2];
with this error:
thread '<main>' panicked at 'index 0 and/or 2 in 忠犬ハチ公 do not lie on character boundary'
Basically, slicing is acceptable and will yield a new view of that string, so no copies are made. However, it should only be used when you are completely sure that the offsets are right in terms of character boundaries.
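When the offsets are not known to be safe, the standard library offers str::get and str::is_char_boundary, which report a bad offset instead of panicking. The following is a minimal sketch; the wrapper name checked_slice is an assumption made for this example.

```rust
/// Hypothetical wrapper: slices by byte range, returning `None` instead of
/// panicking when an offset is out of range or not on a char boundary.
fn checked_slice(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}

fn main() {
    // Every character in this string takes 3 bytes in UTF-8.
    let dog = "忠犬ハチ公";
    println!("{}", dog.len());
    println!("{}", dog.is_char_boundary(2));
    println!("{:?}", checked_slice(dog, 0, 2));
    println!("{:?}", checked_slice(dog, 0, 3));
}
```

Like the indexing syntax, get only validates and re-borrows the existing buffer, so no copy is made.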
In order to iterate over each character of a string, you may instead call chars():
let c = s.chars().nth(n);
Even with that in mind, note that handling Unicode characters this way might not be exactly what you want if you need to deal with combining character modifiers (which are scalar values by themselves but should not be treated individually either). Quoting now from the str API:
fn chars(&self) -> Chars
Returns an iterator over the chars of a string slice.
As a string slice consists of valid UTF-8, we can iterate through a string slice by char. This method returns such an iterator.
It's important to remember that char represents a Unicode Scalar Value, and may not match your idea of what a 'character' is. Iteration over grapheme clusters may be what you actually want.
Remember, chars may not match your human intuition about characters:
let y = "y̆";
let mut chars = y.chars();
assert_eq!(Some('y'), chars.next()); // not 'y̆'
assert_eq!(Some('\u{0306}'), chars.next());
assert_eq!(None, chars.next());
The unicode_segmentation crate provides a means to define grapheme cluster boundaries:
extern crate unicode_segmentation;
use unicode_segmentation::UnicodeSegmentation;
let s = "a̐éö̲\r\n";
let g = UnicodeSegmentation::graphemes(s, true).collect::<Vec<&str>>();
let b: &[_] = &["a̐", "é", "ö̲", "\r\n"];
assert_eq!(g, b);

If you do want to treat the string as an array of codepoints (which isn't strictly the same as characters; there are combining marks, emoji with separate skin-tone modifiers, etc.), you can collect it into a Vec:
fn main() {
let s = "£10 🙃!";
for (i,c) in s.char_indices() {
println!("{} {}", i, c);
}
let v: Vec<char> = s.chars().collect();
println!("v[5] = {}", v[5]);
}
With bonus demonstration of some varying character widths, this outputs:
0 £
2 1
3 0
4
5 🙃
9 !
v[5] = !

How to check if two strings can be made equal by using recursion?

I am trying to practice recursion, but at the moment I don't quite understand it well...
I want to write a recursive Boolean function which takes 2 strings as arguments, and returns true if the second string can be made equal to the first by replacing some letters with a certain special character.
I'll demonstrate what I mean:
Let s1 = "hello", s2 = "h%lo", where '%' is the special character.
The function will return true since '%' can replace "el", causing the two strings to be equal.
Another example:
Let s1 = "hello", s2 = "h%l".
The function will return false since an 'o' is lacking in the second string, and there is no special character that can replace the 'o' (h%l% would return true).
Now the problem isn't so much with writing the code, but with understanding how to solve the problem in general, I don't even know where to begin.
If someone could guide me in the right direction I would be very grateful, even by just using English words, I'll try to translate it to code (Java)...
Thank you.

So this is relatively easy to do in Python. The method I chose was to put the first string ("hello") into an array then iterate over the second string ("h%lo") comparing the elements to those in the array. If the element was in the array i.e. 'h', 'l', 'o' then I would pop it from the array. The resulting array is then ['e','l']. The special character can be found as it is the element which does not exist in the initial array.
One can then substitute the special character for the joined array "el" in the string and compare with the first string.
In the first case this will give "hello" == "hello" -> True
In the second case this will give "hello" == "helol" -> False
I hope this helps and makes sense.
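Since the question asks specifically for recursion, here is one possible recursive formulation, sketched in Rust rather than Java to match the rest of this page. It assumes that '%' stands for one or more characters, which the question does not state explicitly. The names can_match and matches_from are made up for this illustration. The idea is to compare the first characters of both strings: if the pattern starts with '%', it consumes one character of the text and then either stops or keeps consuming.

```rust
/// Recursive core. Assumption: '%' stands for one or more characters.
fn matches_from(text: &[char], pattern: &[char]) -> bool {
    match (text.split_first(), pattern.split_first()) {
        // Both exhausted at the same time: the strings can be made equal.
        (None, None) => true,
        // One side ran out before the other.
        (Some(_), None) | (None, Some(_)) => false,
        // '%' consumes one char, then is either finished or keeps consuming.
        (Some((_, text_rest)), Some((&'%', pattern_rest))) => {
            matches_from(text_rest, pattern_rest) || matches_from(text_rest, pattern)
        }
        // Ordinary characters must be equal, then recurse on the rest.
        (Some((t, text_rest)), Some((p, pattern_rest))) => {
            t == p && matches_from(text_rest, pattern_rest)
        }
    }
}

/// Hypothetical entry point: can `s2` be turned into `s1` by expanding each '%'?
fn can_match(s1: &str, s2: &str) -> bool {
    let text: Vec<char> = s1.chars().collect();
    let pattern: Vec<char> = s2.chars().collect();
    matches_from(&text, &pattern)
}

fn main() {
    println!("{}", can_match("hello", "h%lo"));
    println!("{}", can_match("hello", "h%l"));
    println!("{}", can_match("hello", "h%l%"));
}
```

This plain recursion can take exponential time on patterns with many '%' characters; it is meant to show the shape of the recursion, not to be efficient.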
