How to convert Vec<char> to string form so that I can print it?
Use collect() on an iterator:
let v = vec!['a', 'b', 'c', 'd'];
let s: String = v.into_iter().collect();
println!("{}", s);
The original vector will be consumed. If you need to keep it, use v.iter():
let s: String = v.iter().collect();
There is no more direct way because char is a 32-bit Unicode scalar value, and strings in Rust are sequences of bytes (u8) representing text in UTF-8 encoding. They do not map directly to sequences of chars.
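To make the size difference concrete, here is a minimal sketch. The helper name chars_to_string is made up for illustration; it only wraps the v.iter().collect() call shown above, so the vector stays usable afterwards.

```rust
use std::mem::size_of;

/// Hypothetical helper: builds a String from a slice of chars
/// without consuming the original collection.
fn chars_to_string(chars: &[char]) -> String {
    chars.iter().collect()
}

fn main() {
    let v = vec!['a', 'ä', '日'];
    let s = chars_to_string(&v);

    // Every char occupies 4 bytes inside the vector...
    println!("bytes per char: {}", size_of::<char>());
    // ...but its UTF-8 encoding inside the String takes 1 to 4 bytes.
    println!("chars: {}, UTF-8 bytes: {}", v.len(), s.len());
    println!("{}", s);
}
```

The three chars take 12 bytes in the vector but only 6 bytes in the String, which is why one cannot simply be reinterpreted as the other.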
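Here is a more readable version that consumes the vector:
use std::iter::FromIterator;
fn main() {
    let v = vec!['a', 'b', 'c', 'd'];
    // Clone only so that both forms can be shown side by side.
    let s = String::from_iter(v.clone());
    println!("{}", s);
    // vs
    let s: String = v.into_iter().collect();
    println!("{}", s);
}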
Note that collect is implemented with a call to FromIterator::from_iter:
fn collect<B: FromIterator<Self::Item>>(self) -> B
where
Self: Sized,
{
FromIterator::from_iter(self)
}
I'm learning Rust and am messing around with conversions of types because I need it for my first program.
Basically I'm trying to convert a singular string of numbers into an array of numbers.
eg. "609" -> [6,0,9]
const RADIX: u32 = 10;
let lines: Vec<String> = read_lines(filename);
let nums = lines[0].chars().map(|c| c.to_digit(RADIX).expect("conversion error"));
println!("Line: {:?}, Converted: {:?}", lines[0], nums);
I tried the above and the output is as follows:
Line: "603", Converted: Map { iter: Chars(['6', '0', '3']) }
Which I assume isn't correct. I'd need it to be just a pure array of integers so I can perform operations with it later.
You're almost there. Add a type annotation to nums:
let nums: Vec<u32> = ...
and end the method chain with .collect() to turn it into a vector of digits.
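Put together, the suggestion might look like the sketch below. The digits helper is not part of the question; it is a hypothetical variant that returns None instead of panicking when a character is not a digit.

```rust
const RADIX: u32 = 10;

/// Hypothetical helper: converts each character of `line` to its digit value.
/// Returns None if any character is not a digit in the given radix.
fn digits(line: &str) -> Option<Vec<u32>> {
    line.chars().map(|c| c.to_digit(RADIX)).collect()
}

fn main() {
    let line = "609";
    // The annotation tells collect() which collection to build.
    let nums: Vec<u32> = line
        .chars()
        .map(|c| c.to_digit(RADIX).expect("conversion error"))
        .collect();
    println!("Line: {:?}, Converted: {:?}", line, nums);

    // The non-panicking variant reports bad input as None.
    println!("{:?}", digits("6a9"));
}
```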
I have an application that receives its input as a String, and the string ends with some unwanted repeated characters. How can I remove everything after a specific index?
main.rs
fn main() {
let s:String = "{\"name\":\"xx/yyyy/machine/zzz/test_int4\",\"status\":\"online\",\"timestamp\":\"2021-06-11 18:20:42.231770800 UTC\",\"value\":7}8668982856274}".to_string();
println!("{}", s);
}
How can I get the result
"{\"name\":\"xx/yyyy/machine/zzz/test_int4\",\"status\":\"online\",\"timestamp\":\"2021-06-11 18:20:42.231770800 UTC\",\"value\":7}"
instead of
"{\"name\":\"xx/yyyy/machine/zzz/test_int4\",\"status\":\"online\",\"timestamp\":\"2021-06-11 18:20:42.231770800 UTC\",\"value\":7}}8668982856274}"
String slicing in Rust works on byte offsets, so you need to find the byte index at which to cut, like this:
let s = "{\"name\":\"xx/yyyy/machine/zzz/test_int4\",\"status\":\"online\",\"timestamp\":\"2021-06-11 18:20:42.231770800 UTC\",\"value\":7}8668982856274}";
let closing_bracket_idx = s
.as_bytes()
.iter()
.position(|&x| x == b'}')
.map(|i| i + 1)
.unwrap_or_else(|| s.len());
let v: serde_json::Value = serde_json::from_str(&s[..closing_bracket_idx]).unwrap();
println!("{:?}", v);
However, keep in mind that this approach does not work in general for more complex cases, for example a } inside a JSON string value, nested objects, or a top-level value other than an object (e.g. [1, {"2": 3}, 4]). A neater way is to let the parser ignore the trailing data, for example with serde_json:
let v = serde_json::Deserializer::from_str(s)
.into_iter::<serde_json::Value>()
.next()
.expect("empty input")
.expect("invalid json value");
println!("{:?}", v);
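If you only need the truncated text and not a parsed value, the first approach (cut after the first closing brace) can also be sketched with the standard library alone. The function name up_to_first_brace is made up for illustration, and the sketch has the same limitations as described above: it is wrong for nested objects or for a } inside a string value.

```rust
/// Hypothetical helper: returns the prefix of `s` up to and including
/// the first '}', or the whole string if there is none.
fn up_to_first_brace(s: &str) -> &str {
    match s.find('}') {
        // '}' is a single byte in UTF-8, so `..=i` ends on a char boundary.
        Some(i) => &s[..=i],
        None => s,
    }
}

fn main() {
    let s = "{\"value\":7}8668982856274}";
    println!("{}", up_to_first_brace(s));
}
```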
I am getting this error
expected &str, found char
For this code
// Expected output
// -------
// h exists
// c exists
fn main() {
let list = ["c","h","p","u"];
let s = "Hot and Cold".to_string();
let mut v: Vec<String> = Vec::new();
for i in s.split(" ") {
let c = i.chars().nth(0).unwrap().to_lowercase().nth(0).unwrap();
println!("{}", c);
if list.contains(&c) {
println!("{} exists", c);
}
}
}
How do I solve this?
Change list from an array of &strs to an array of chars:
let list = ['c', 'h', 'p', 'u'];
Double-quotes "" create string literals, while single-quotes '' create character literals. See Literal Expressions in the Rust reference.
I'm assuming you want list to be a list of chars rather than a list of &strs. In that case, try changing
let list = ["c","h","p","u"];
to
let list = ['c','h','p','u'];
and it should work
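For reference, here is a sketch of the whole program with that change applied. The matching_initials function is a hypothetical restructuring of the loop from the question, not something either answer wrote.

```rust
/// Hypothetical helper: returns the lowercased first character of each
/// space-separated word, keeping only those that appear in `list`.
fn matching_initials(text: &str, list: &[char]) -> Vec<char> {
    text.split(' ')
        .filter_map(|word| word.chars().next())
        .filter_map(|c| c.to_lowercase().next())
        .filter(|c| list.contains(c))
        .collect()
}

fn main() {
    // Single quotes: these are chars, so contains(&char) type-checks.
    let list = ['c', 'h', 'p', 'u'];
    for c in matching_initials("Hot and Cold", &list) {
        println!("{} exists", c);
    }
}
```

Using next() instead of nth(0).unwrap() also avoids a panic on empty words, for example when the input contains two consecutive spaces.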
I am looking for the best way to go from String to Windows<T> using the windows function provided for slices.
I understand how to use windows this way:
fn main() {
let tst = ['a', 'b', 'c', 'd', 'e', 'f', 'g'];
let mut windows = tst.windows(3);
// prints ['a', 'b', 'c']
println!("{:?}", windows.next().unwrap());
// prints ['b', 'c', 'd']
println!("{:?}", windows.next().unwrap());
// etc...
}
But I am a bit lost when working this problem:
fn main() {
let tst = String::from("abcdefg");
let inter = ? //somehow create slice of character from tst
let mut windows = inter.windows(3);
// prints ['a', 'b', 'c']
println!("{:?}", windows.next().unwrap());
// prints ['b', 'c', 'd']
println!("{:?}", windows.next().unwrap());
// etc...
}
Essentially, I am looking for how to convert a string into a char slice that I can use the window method with.
The problem that you are facing is that String is really represented as something like a Vec<u8> under the hood, with some APIs to let you access chars. In UTF-8 the representation of a code point can be anything from 1 to 4 bytes, and they are all compacted together for space-efficiency.
The only slice you could get directly of an entire String, without copying everything, would be a &[u8], but you wouldn't know if the bytes corresponded to whole or just parts of code points.
The char type corresponds exactly to a code point, and therefore has a size of 4 bytes, so that it can accommodate any possible value. So, if you build a slice of char by copying from a String, the result could be up to 4 times larger.
To avoid making a potentially large, temporary memory allocation, you should consider a more lazy approach – iterate through the String, making slices at exactly the char boundaries. Something like this:
fn char_windows<'a>(src: &'a str, win_size: usize) -> impl Iterator<Item = &'a str> {
src.char_indices()
.flat_map(move |(from, _)| {
src[from ..].char_indices()
.skip(win_size - 1)
.next()
.map(|(to, c)| {
&src[from .. from + to + c.len_utf8()]
})
})
}
This will give you an iterator where the items are &str, each with 3 chars:
let windows = char_windows(&tst, 3);
for win in windows {
println!("{:?}", win);
}
The nice thing about this approach is that it hasn't done any copying at all - each &str produced by the iterator is still a slice into the original source String.
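To see that the windows really follow char boundaries, here is a small self-contained sketch that runs the function on input with multi-byte characters. The function body is the one from above, with skip(n).next() written as nth(n); the sample string is my own. Note that win_size must be at least 1, since win_size - 1 would otherwise underflow.

```rust
fn char_windows<'a>(src: &'a str, win_size: usize) -> impl Iterator<Item = &'a str> {
    src.char_indices().flat_map(move |(from, _)| {
        src[from..]
            .char_indices()
            // Find the last char of the window starting at `from`.
            .nth(win_size - 1)
            .map(|(to, c)| &src[from..from + to + c.len_utf8()])
    })
}

fn main() {
    let tst = String::from("a日本b");
    for win in char_windows(&tst, 3) {
        // Each window holds 3 chars, but its byte length depends on the chars.
        println!("{:?} ({} bytes)", win, win.len());
    }
}
```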
All of that complexity is because Rust uses UTF-8 encoding for strings by default. If you absolutely know that your input string doesn't contain any multi-byte characters, you can treat it as ASCII bytes, and taking slices becomes easy:
let tst = String::from("abcdefg");
let inter = tst.as_bytes();
let windows = inter.windows(3);
However, you now have slices of bytes, and you'll need to turn them back into strings to do anything with them:
for win in windows {
println!("{:?}", String::from_utf8_lossy(win));
}
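As a sketch of how the ASCII assumption could be checked rather than taken on trust, the byte-window approach can be wrapped in a function that refuses non-ASCII input. The name ascii_windows and the choice to return Option are assumptions of mine, not part of the answer above.

```rust
/// Hypothetical helper: collects all byte windows of `win_size` as Strings.
/// Returns None for non-ASCII input (a window could split a code point)
/// and for a window size of 0 (slice::windows would panic).
fn ascii_windows(src: &str, win_size: usize) -> Option<Vec<String>> {
    if !src.is_ascii() || win_size == 0 {
        return None;
    }
    Some(
        src.as_bytes()
            .windows(win_size)
            .map(|win| String::from_utf8_lossy(win).into_owned())
            .collect(),
    )
}

fn main() {
    let tst = String::from("abcdefg");
    if let Some(windows) = ascii_windows(&tst, 3) {
        for win in windows {
            println!("{:?}", win);
        }
    }
}
```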
This solution will work for your purpose:
fn main() {
let tst = String::from("abcdefg");
let inter = tst.chars().collect::<Vec<char>>();
let mut windows = inter.windows(3);
// prints ['a', 'b', 'c']
println!("{:?}", windows.next().unwrap());
// prints ['b', 'c', 'd']
println!("{:?}", windows.next().unwrap());
// etc...
println!("{:?}", windows.next().unwrap());
}
String can iterate over its chars, but it's not a slice, so you have to collect it into a vec, which then coerces into a slice.
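If you want strings back rather than char slices, each window can be collected again. This is only a sketch with a made-up function name, and it copies the data twice (once into the Vec<char>, once per window), so it trades memory for simplicity.

```rust
/// Hypothetical helper: char-based windows returned as owned Strings.
/// Panics if `win_size` is 0, because slice::windows does.
fn vec_char_windows(src: &str, win_size: usize) -> Vec<String> {
    // Copy the chars out so that fixed-size indexing becomes possible.
    let chars: Vec<char> = src.chars().collect();
    chars
        .windows(win_size)
        .map(|w| w.iter().collect::<String>())
        .collect()
}

fn main() {
    for win in vec_char_windows("日本語ab", 3) {
        println!("{}", win);
    }
}
```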
You can use itertools to walk over windows of any iterator. In the version used here (0.7.8), tuple_windows supports widths up to 4:
extern crate itertools; // 0.7.8
use itertools::Itertools;
fn main() {
let input = "日本語";
for (a, b) in input.chars().tuple_windows() {
println!("{}, {}", a, b);
}
}
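If you would rather not add a dependency, pairs specifically can be sketched with the standard library by zipping the char iterator with a copy of itself shifted by one. This is a different technique from tuple_windows, and char_pairs is a made-up name.

```rust
/// Hypothetical helper: adjacent char pairs using only the standard library.
fn char_pairs(input: &str) -> Vec<(char, char)> {
    // zip stops as soon as the shorter (shifted) iterator runs out.
    input.chars().zip(input.chars().skip(1)).collect()
}

fn main() {
    for (a, b) in char_pairs("日本語") {
        println!("{}, {}", a, b);
    }
}
```

The string is decoded twice here, which is usually fine for short inputs; wider windows get awkward quickly with this pattern.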
See also:
Are there equivalents to slice::chunks/windows for iterators to loop over pairs, triplets etc?
I've got a Vec<char> that I need to turn into a &str or String, but I'm unsure of the best way to do this. I've looked around and every resource I've found seems to be outdated in some way. The answers to "How to convert Vec<char> to a string" don't seem to be applicable for the newest build.
I'm using the nightly for 2015-3-19
The iterator based approach with .collect should work, after updating for language changes:
char_vector.iter().cloned().collect::<String>();
(I've chosen to replace .map(|c| *c) with .cloned() but either works.)
If your vector can be consumed, you can also use into_iter to avoid the clone:
fn main() {
let char_vector = vec!['h', 'e', 'l', 'l', 'o'];
let str: String = char_vector.into_iter().collect();
println!("{}", str);
}
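You can convert the Vec into a String without allocating a new buffer. It requires quite a bit of unsafe code, though. Note that the code below targets a 2015 nightly compiler: std::raw::Repr has since been removed and char::encode_utf8 now has a different signature, so it does not compile on current Rust. Vec::from_raw_parts is also documented to require the same alignment as the original allocation, which a char buffer reused as a u8 buffer does not satisfy: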
#![feature(raw, unicode)]
use std::raw::Repr;
use std::slice::from_raw_parts_mut;
fn inplace_to_string(v: Vec<char>) -> String {
unsafe {
let mut i = 0;
{
let ch_v = &v[..];
let r = ch_v.repr();
let p: &mut [u8] = from_raw_parts_mut(r.data as *mut u8, r.len*4);
for ch in ch_v {
i += ch.encode_utf8(&mut p[i..i+4]).unwrap();
}
}
let p = v.as_ptr();
let cap = v.capacity()*4;
std::mem::forget(v);
let v = Vec::from_raw_parts(p as *mut u8, i, cap);
String::from_utf8_unchecked(v)
}
}
fn main() {
let char_vector = vec!['h', 'ä', 'l', 'l', 'ö'];
let str: String = char_vector.iter().cloned().collect();
let str2 = inplace_to_string(char_vector);
println!("{}", str);
println!("{}", str2);
}
Detailed Explanation
This creates a mutable u8 slice and a char slice that alias the same buffer (breaking Rust's aliasing guarantees). Note that the u8 slice is four times as long as the char slice, since a char always takes up 4 bytes.
let ch_v = &v[..];
let r = ch_v.repr();
let p: &mut [u8] = from_raw_parts_mut(r.data as *mut u8, r.len*4);
We need that to iterate over the Unicode chars and replace them with their UTF-8 encoded counterparts. Since the UTF-8 encoding of a char never takes more than the 4 bytes the char itself occupies, we can guarantee that we never overwrite any part we haven't read yet.
for ch in ch_v {
    i += ch.encode_utf8(&mut p[i..i+4]).unwrap();
}
Since a char is always a valid Unicode scalar value and the target slice is always exactly 4 bytes (the maximum number of bytes a UTF-8 encoded char needs), we can encode our chars without checking whether it worked (it always will). In this nightly version, encode_utf8 returns the length of the UTF-8 representation, so the index i always points just past the last byte written.
Finally, we need to do some cleaning up. Our vector is still of type Vec<char>. We get all the info we need (the pointer to the heap-allocated array and the capacity):
let p = v.as_ptr();
let cap = v.capacity()*4;
Then we release the previous vector from all obligations like freeing memory.
std::mem::forget(v);
and finally recreate the u8 vector with correct length and capacity, and directly turn it into a String. The conversion to String does not need to be checked, as we already know the utf8 is correct, since the original Vec<char> could only contain correct unicode chars.
let v = Vec::from_raw_parts(p as *mut u8, i, cap);
String::from_utf8_unchecked(v)
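For comparison, here is a safe sketch of the same encode-each-char loop on current stable Rust, where char::encode_utf8 takes a byte buffer and returns the encoded &mut str instead of a length. Unlike the in-place version above, it allocates a new buffer for the String; the function name encode_chars is made up for illustration.

```rust
/// Hypothetical helper: encodes each char into a 4-byte scratch buffer
/// and appends the result to a freshly allocated String.
fn encode_chars(chars: &[char]) -> String {
    // Upper bound: no char needs more than 4 UTF-8 bytes.
    let mut out = String::with_capacity(chars.len() * 4);
    let mut buf = [0u8; 4];
    for ch in chars {
        // encode_utf8 writes into `buf` and returns the encoded part as a str.
        out.push_str(ch.encode_utf8(&mut buf));
    }
    out
}

fn main() {
    let char_vector = vec!['h', 'ä', 'l', 'l', 'ö'];
    let s = encode_chars(&char_vector);
    println!("{} ({} bytes)", s, s.len());
}
```

In practice char_vector.iter().collect::<String>() does the same job; the explicit loop is only here to mirror the steps explained above.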