Need to extract the last word in a Rust string

I am doing some processing of a string in Rust, and I need to be able to extract the last set of characters from that string. In other words, given a string like the following:
some|not|necessarily|long|name
I need to be able to get the last part of that string, namely "name" and put it into another String or a &str, in a manner like:
let last = call_some_function("some|not|necessarily|long|name");
so that last becomes equal to "name".
Is there a way to do this? Is there a string function that will allow this to be done easily? If not (after looking at the documentation, I doubt that there is), how would one do this in Rust?

While the answer from @effect is correct, it is neither the most idiomatic nor the most performant way to do it: it walks the entire string and matches every | to reach the last one. You can improve on that, but there is a method of str that does exactly what you want, rsplit_once():
let (_, name) = s.rsplit_once('|').unwrap();
// Or
// let name = s.rsplit_once('|').unwrap().1;
//
// You can also pass a &str separator, which may span several characters:
// let (_, name) = s.rsplit_once("|").unwrap();
// But in the case of a single character, a `char` type is likely to be more performant.
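If the input might not contain the separator at all, unwrap() will panic. A minimal sketch of a non-panicking variant follows; last_segment is a hypothetical helper name, and falling back to the whole input is an assumption about what you want in that case:

```rust
/// Returns the text after the last `sep`, or the whole input when `sep`
/// does not occur. `last_segment` is a hypothetical helper, not a std API.
fn last_segment(s: &str, sep: char) -> &str {
    match s.rsplit_once(sep) {
        // `rsplit_once` splits at the LAST occurrence of the separator.
        Some((_, last)) => last,
        // No separator: assume the whole string is the "last word".
        None => s,
    }
}
```

Calling last_segment("some|not|necessarily|long|name", '|') gives "name", and a string without any | is returned unchanged.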

You can use the str::split() method (available on String through deref), which returns an iterator over the substrings separated by the given pattern, and then use Iterator::last() to get the last element of that iterator, like so:
let s = String::from("some|not|necessarily|long|name");
let last = s.split('|').last().unwrap();
assert_eq!(last, "name");
Note that split is defined on string slices (&str), so you don't need a std::string::String at all:
let s = "some|not|necessarily|long|name";
let last = s.split('|').last().unwrap();
assert_eq!(last, "name");
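If you would rather keep the split style but avoid scanning from the front, str::rsplit() iterates over the pieces starting from the end, so the first item it yields is the last segment. A small sketch (last_by_rsplit is a hypothetical name used only for illustration):

```rust
/// Returns the segment after the last `|` by iterating from the end.
/// `last_by_rsplit` is a hypothetical helper name.
fn last_by_rsplit(s: &str) -> &str {
    // `rsplit` yields at least one item (the whole string when there is no
    // separator), so `unwrap_or` here is only a safeguard.
    s.rsplit('|').next().unwrap_or(s)
}
```

This works the same way on a String, since the method lives on str.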

Related

Rust reconstitute format=flowed emails, or an iterator that combines some elements of the previous iterator

Currently I have a program that is reading some emails from disk, and parsing some included text that is csv-like (although it happens to have fixed-width fields and be '|'-separated).
The emails are not particularly huge, so I fs::read_to_string them into a string (in a loop), and for each one use .split("\n") to iterate over lines, then run a constructor on each line to create a struct for each valid csv-like line.
So like
let mut hostiter = text.split("\n")
.filter_map(|x| HostInfo::from_str(x));
Where HostInfo has owned values, copying from the &str references.
This all works fine as is, but now I want to be able to handle emails that quote the records I'm looking for (i.e. lines that start with "> > "). That's easy enough:
let quotes = &['>', ' '];
let mut hostiter = text.split("\n")
.map(|x| x.trim_start_matches(quotes))
.filter_map(|x| HostInfo::from_str(x));
I also need to cope with RFC 3676 (format=flowed) emails. This means that, when forwarded or replied to, email clients split the lines so that each record I'm looking for is spread over 2 or more lines. Continuation lines are marked by " \r\n", i.e. there is a space before the CR/LF; non-continuation lines have the "\r\n" after a non-space character. (Currently my code skips these partial records.) I need an iterator that iterates over complete lines. I'm thinking of two ways of doing this:
1. The easiest may be to split the string (on '\n'), trim any quoting from the start of each line, then collect the lines into a new string joined with '\n'. Then a second pass replaces every " \r\n" with ' ', again producing a new string. Now I have a string that can be split on '\n' and has complete records.
2. Otherwise, is there an iterator adapter I can use that will combine elements if they are continuation lines? E.g. can I use group_by to group lines with their continuation lines?
I realize I can't have an iterator that returns complete records as a single &str (unless I do 1.), since the records are split in the original string. However I can refactor my constructor to take a vector of &str instead of a single &str.
In the end I used coalesce to group the lines. Since the items I'm iterating over are &str which can't be joined without allocation I decided to store the output as Vec<&str>. Since coalesce wants the same types as input and output (why?), I needed to convert the &str to single item vectors before using it. The resulting code was:
let mut hostiter = text.split("\r\n")
.map(|x| vec![x.trim_start_matches(quotes)])
.coalesce(|mut x, mut y| match o.flowed && x[x.len()-1].ends_with(' ') {
true => { x.append(&mut y); Ok( x )},
false => Err( (x,y) ),
})
.filter_map(|x| HostInfo::from_vec_str(x));
(o.flowed is a flag indicating whether we picked up a Content-Type: header with format=flowed in the email.)
I had to convert my HostInfo::from_str function to HostInfo::from_vec_str to take a Vec<&str> instead of a &str. Since my from_str function splits the &str on spaces anyway, it was easy enough to use flat_map to split each &str in the Vec and output words...
Not sure if coalesce is the best way to do this. I was looking for an iterator adaptor that would take a closure that takes a collection and an item and returns a bool, i.e. does this item belong with the other items in this collection? The output of the adaptor would then iterate over collections of items.
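For comparison, the same grouping can be sketched with the standard library only, as a plain loop instead of an adaptor. This is not the coalesce version above: group_flowed is a hypothetical helper, it collects eagerly into a Vec rather than staying lazy, and the flowed parameter stands in for the o.flowed flag:

```rust
/// Groups lines into records: when `flowed` is set, a line that ends in a
/// space is a continuation and is grouped with the line(s) that follow it.
/// `group_flowed` is a hypothetical helper, not part of itertools or std.
fn group_flowed<'a>(text: &'a str, quotes: &[char], flowed: bool) -> Vec<Vec<&'a str>> {
    let mut records: Vec<Vec<&'a str>> = Vec::new();
    let mut current: Vec<&'a str> = Vec::new();
    for line in text.split("\r\n") {
        // Strip quoting such as "> > " from the start of the line.
        let line = line.trim_start_matches(quotes);
        let continues = flowed && line.ends_with(' ');
        current.push(line);
        if !continues {
            // The record is complete: move it out and start a fresh one.
            records.push(std::mem::take(&mut current));
        }
    }
    // A trailing continuation line with nothing after it still forms a record.
    if !current.is_empty() {
        records.push(current);
    }
    records
}
```

Each inner Vec<&str> borrows from the original text, so nothing is copied until the record constructor builds its owned values.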

How can I extract a string prefix that comes before a given character?

Is there a way of extracting a prefix defined by a character without using split? For instance, given "some-text|more-text" how can I extract "some-text" which is the string that comes before "|"?
If your issue with split is that it may also split the rest of the string, then consider str.splitn(2, '|'), which splits into at most two parts, so you get the string up to the first | and then the rest of the string, even if it contains other | characters.
Split may be the best way. But if, for whatever reason, you want to do it without .split(), here are a couple of alternatives.
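A short sketch of the splitn approach, to show that later separators are left alone (both function names are hypothetical, chosen just for this example):

```rust
/// Returns the part of `s` before the first `|`, or all of `s` if there is none.
/// `prefix_before_pipe` is a hypothetical helper name.
fn prefix_before_pipe(s: &str) -> &str {
    // `splitn(2, ..)` yields at most two pieces, so only the first `|` is consumed.
    s.splitn(2, '|').next().unwrap_or(s)
}

/// Collects both pieces so the untouched remainder is visible.
fn two_parts(s: &str) -> Vec<&str> {
    s.splitn(2, '|').collect()
}
```

For "some-text|more-text|even-more", two_parts returns "some-text" and "more-text|even-more".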
You can use the .chars() iterator, chained with .take_while(), then produce a String with .collect().
// pfx will be "foo".
let pfx = "foo|bar".chars()
.take_while(|&ch| ch != '|')
.collect::<String>();
Another way, which is pretty efficient but fallible, is to create a slice of the original str:
let s = "foo|bar";
let pfx = &s[..s.find('|').unwrap()];
// or - if used in a function/closure that returns an `Option`.
let pfx = &s[..s.find('|')?];
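To make the Option-returning variant concrete, here is a minimal sketch wrapped in a function (prefix is a hypothetical name). Returning None when the separator is missing is one possible choice; you may prefer to return the whole string instead:

```rust
/// Returns the text before the first `|`, or `None` if there is no `|`.
/// `prefix` is a hypothetical helper name.
fn prefix(s: &str) -> Option<&str> {
    // `find` returns the byte index of the first match; `?` propagates `None`.
    let end = s.find('|')?;
    // Slicing at an index returned by `find` is always on a char boundary.
    Some(&s[..end])
}
```

Unlike the unwrap() version, this never panics on input without a separator.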
As @Kornel points out, you can use the split method:
let raw_text = "some-text|more-text";
let extracted_text = raw_text.split("|").next().unwrap();

Is there an equivalent to the string function String(format: ...) using Swift formatting

I'm starting to like Swift string formatting, since it uses variable names in the string rather than ambiguous formatting tags like "%@".
I want to load a large string from a file that has Swift-style formatting in it (like this)
Now is the time for all good \(who) to come to babble incoherently.
Then I want to feed the contents of that String variable into a statement that lets me replace
\(who)
with the contents of the constant/variable who at runtime.
The code below works with a string constant as the formatting string.
let who = "programmers"
let aString = "Now is the time for all good \(who) to come to babble incoherently."
That code does formatting of a quoted string that appears in-line in my code.
Instead I want something like the code
let formatString = "Now is the time for all good %@ to come to babble incoherently."
aString = String(format: formatString, who)
But where I can pass in a Swift-style format string in a constant/variable I read from a file.
Is that possible? I didn't have any luck searching for it since I wasn't exactly sure what search terms to use.
I can always use C-style string formatting and the String class' initWithFormat method if I have to...
I don't think there's a way to do this. String interpolation is implemented via conforming to the StringInterpolationConvertible protocol, and presumably you're hoping to tap into that in the same way you can tap into the methods required by StringLiteralConvertible, a la:
let someString = toString(42)
// this is the method String implements to conform to StringLiteralConvertible
let anotherString = String(stringLiteral: someString)
// anotherString will be "42"
print(anotherString)
Unfortunately, you can't do quite the same trick with StringInterpolationConvertible. Seeing how the protocol works may help:
struct MyString: Printable {
let actualString: String
var description: String { return actualString }
}
extension MyString: StringInterpolationConvertible {
// first, this will get called for each "segment"
init<T>(stringInterpolationSegment expr: T) {
println("Processing segment: " + toString(expr))
actualString = toString(expr)
}
// here is a type-specific override for Int, that converts
// small numbers into words:
init(stringInterpolationSegment expr: Int) {
if (0..<4).contains(expr) {
println("Embiggening \(expr)")
let numbers = ["zero","one","two","three"]
actualString = numbers[expr]
}
else {
println("Processing segment: " + toString(expr))
actualString = toString(expr)
}
}
// finally, this gets called with an array of all of the
// converted segments
init(stringInterpolation strings: MyString...) {
// strings will be a bunch of MyString objects
actualString = "".join(strings.map { $0.actualString })
}
}
let number = 3
let aString: MyString = "Then shalt thou count to \(number), no more, no less."
println(aString)
// prints "Then shalt thou count to three, no more, no less."
So, while you can call String.init(stringInterpolation:) and String.init(stringInterpolationSegment:) directly yourself if you want (just try String(stringInterpolationSegment: 3.141) and String(stringInterpolation: "blah", "blah")), this doesn't really help you much. What you really need is a facade function that coordinates the calls to them. And unless there's a handy pre-existing function in the standard library that does exactly that which I've missed, I think you're out of luck. I suspect it's built into the compiler.
You could maybe write your own to achieve your goal, but it would be a lot of effort, since you'd have to break the string you want to interpolate into bits manually and handle it yourself, calling the segment init in a loop. You'd also hit problems calling the combining function, since you can't splat an array into a variadic function call.
I don't think so. The compiler needs to be able to resolve the interpolated variable at compile time.
I'm not a Swift programmer, specifically, but I think you can work around it to get something pretty close to what you want using a Dictionary and standard string-replacing and splitting methods:
var replacement = [String: String]()
replacement["who"] = "programmers"
Having that, you can find each occurrence of "\(", read the name that follows up to the next ")", look that name up in the dictionary, and reconstruct your string from the pieces you get (this post can help with the splitting part, and this one with the replacing part).
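The find-and-look-up idea is not tied to Swift. Here is a rough sketch of it in Rust, the language of the other snippets on this page. fill_template is a hypothetical function, and leaving unknown or unterminated placeholders untouched is an assumption about the desired behaviour:

```rust
use std::collections::HashMap;

/// Replaces each `\(name)` placeholder in `template` with `values[name]`.
/// Unknown or unterminated placeholders are copied through unchanged.
/// `fill_template` is a hypothetical helper, not a library function.
fn fill_template(template: &str, values: &HashMap<&str, &str>) -> String {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(start) = rest.find("\\(") {
        // Copy the literal text in front of the placeholder.
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find(')') {
            Some(end) => {
                let name = &after[..end];
                match values.get(name) {
                    Some(value) => out.push_str(value),
                    // Unknown name: keep the placeholder text as it was.
                    None => out.push_str(&rest[start..start + 2 + end + 1]),
                }
                rest = &after[end + 1..];
            }
            None => {
                // No closing parenthesis: keep the remainder untouched.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

With a map containing "who" => "programmers", the template from the question comes back with \(who) replaced by programmers.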
This one works like a charm:
let who = "programmers"
let formatString = "Now is the time for all good %@ to come to babble incoherently."
let aString = String(format: formatString, who)

Best way to compare multiple strings in Java

Suppose I have a string "That question is on the minds of every one.".
I want to compare each word in the string with a set of words, i.e. (to, is, on, of), and if any of those words occurs I want to append some string to the existing string.
Eg.
to = append "Hi";
Is = append "Hello";
And so on.
To be more specific, I have used StringTokenizer to get each word and compared it through if/else statements. We could also use a switch, but switching on strings is only available from JDK 1.7.
I don't know if this is what you mean, but:
You could use String.split() to separate the words from your string like
String[] words = myString.split(" ");
and then, for each word, compare it with the given set
for(String s : words)
{
switch(s)
{
case("to"):
[...]
}
}
Or you could just use the String.contains() method without even splitting your string, but I don't know if that's what you wanted.
Use a HashMap<String,String> variable to store your set of words and the replacement words you want. Then split your string with split(), loop through the resulting String[] and for each String in the String[], check whether the HashMap containsKey() that String. Build your output/resulting String in the loop - if the word is contained in the HashMap, replace it with the value of the corresponding key in the HashMap, otherwise use the String you are currently on from the String[].
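The map-lookup approach is the same in most languages. A small sketch of it in Rust, the language of the other snippets on this page (replace_words is a hypothetical name, and splitting on single spaces is an assumption about the input):

```rust
use std::collections::HashMap;

/// Rebuilds `sentence`, swapping every word that is a key in `replacements`
/// for its mapped value and keeping all other words as they are.
/// `replace_words` is a hypothetical helper name.
fn replace_words(sentence: &str, replacements: &HashMap<&str, &str>) -> String {
    sentence
        .split(' ')
        // Look each word up; fall back to the word itself when it is not a key.
        .map(|word| replacements.get(word).copied().unwrap_or(word))
        .collect::<Vec<_>>()
        .join(" ")
}
```

The lookup is exact and case-sensitive, so "That" does not match a key "to", and punctuation attached to a word (such as "one.") prevents a match.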

repeat string with LINQ/extensions methods [duplicate]

Just a curiosity I was investigating.
The matter: simply repeating (multiplying, someone would say) a string/character n times.
I know there is Enumerable.Repeat for this aim, but I was trying to do this without it.
LINQ in this case seems pretty useless, because in a query like
from X in "s" select X
the string "s" is being explored and so X is a char. The same is with extension methods, because for example "s".Aggregate(blablabla) would again work on just the character 's', not the string itself. For repeating the string something "external" would be needed, so I thought lambdas and delegates, but it can't be done without declaring a variable to assign the delegate/lambda expression to.
So something like defining a function and calling it inline:
( (a)=>{return " "+a;} )("a");
or
delegate(string a){return " "+a}(" ");
would give a "without name" error (and so no recursion, AFAIK, even by passing a possible lambda/delegate as a parameter), and in the end couldn't even be created by C# because of its limitations.
It could be that I'm looking at this from the wrong perspective. Any ideas?
This is just an experiment; I don't care about performance or memory use, just that it is one line and more or less self-contained. Maybe one could do something with Copy/CopyTo, or by casting it to some other collection, I don't know. Reflection is accepted too.
To repeat a character n-times you would not use Enumerable.Repeat but just this string constructor:
string str = new string('X', 10);
To repeat a string, I don't know anything better than using string.Join and Enumerable.Repeat:
string foo = "Foo";
string str = string.Join("", Enumerable.Repeat(foo, 10));
edit: you could use string.Concat instead if you need no separator:
string str = string.Concat( Enumerable.Repeat(foo, 10) );
If you're trying to repeat a string, rather than a character, a simple way would be to use the StringBuilder.Insert method, which takes an insertion index and a count for the number of repetitions to use:
var sb = new StringBuilder();
sb.Insert(0, "hi!", 5);
Console.WriteLine(sb.ToString());
Otherwise, to repeat a single character, use the string constructor as I've mentioned in the comments for the similar question here. For example:
string result = new String('-', 5); // -----
For the sake of completeness, it's worth noting that StringBuilder provides an overloaded Append method that can repeat a character, but has no such overload for strings (which is where the Insert method comes in). I would prefer the string constructor to the StringBuilder if that's all I was interested in doing. However, if I was already working with a StringBuilder, it might make sense to use the Append method to benefit from some chaining. Here's a contrived example to demonstrate:
var sb = new StringBuilder("This item is ");
sb.Insert(sb.Length, "very ", 2) // insert at the end to append
.Append('*', 3)
.Append("special")
.Append('*', 3);
Console.WriteLine(sb.ToString()); // This item is very very ***special***

Resources