Use match to match double letter pattern - rust

I'm iterating over a vector of chars. I would like to detect specific sequences of chars ("ab", "cd", "pq" and "xy"). If I find one of these, I want to return false. However if I find a double letter sequence (e.g: "aa"), I want to return true.
I came up with this:
let chars: Vec<char> = line.chars().collect();
for (idx, c) in chars.iter().enumerate() {
    if idx > 0 {
        match (chars[idx - 1], c) {
            ('a', 'b') => return false,
            ('c', 'd') => return false,
            ('p', 'q') => return false,
            ('x', 'y') => return false,
            (c, c) => return true,
            _ => (),
        };
    }
}
However when I run this I get the following error:
36 | (c, c) => return true,
| ^ used in a pattern more than once
and I can't understand why.
Running rustc --explain E0416 seems to give a solution:
match (chars[idx - 1], c) {
    ('a', 'b') => return false,
    ('c', 'd') => return false,
    ('p', 'q') => return false,
    ('x', 'y') => return false,
    (prev, curr) if &prev == curr => return true,
    _ => (),
};
But I would like to understand what's happening here. I'd have expected the equality check to happen with (c, c).

From the documentation:
To create a match expression that compares the values of the outer x and y, rather than introducing a shadowed variable, we would need to use a match guard conditional
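A match pattern must be either a literal or the declaration of new variables. An identifier in a pattern always introduces a new binding, so you can't put an existing dynamic value there to compare against, and (c, c) tries to declare the same new variable twice.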
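To illustrate the difference between binding and comparing, here is a minimal sketch. The function names matches_by_binding and equal_by_guard are made up for this example and do not come from the question.

```rust
// Hypothetical helper: an identifier in a pattern is a fresh binding,
// so this arm matches every value. The parameter `y` is shadowed and
// never compared against `x`.
#[allow(unused_variables)]
fn matches_by_binding(x: i32, y: i32) -> bool {
    match x {
        y => true,
    }
}

// Hypothetical helper: a match guard is what actually compares the
// matched value against the outer `y`.
fn equal_by_guard(x: i32, y: i32) -> bool {
    match x {
        v if v == y => true,
        _ => false,
    }
}
```

Calling matches_by_binding(1, 2) yields true even though the values differ, while equal_by_guard(1, 2) yields false.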
You can use the solution given by the compiler, or the more direct
(a, _) if a==*c => return true,
Note that there are ways to avoid collecting into a vector here. You could for example store the previous value in a mutable variable of type char (by taking the first value before the loop if you know the string isn't empty) or into an Option<char>.
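A minimal sketch of the Option<char> variant, assuming the surrounding function should report the first decisive pair. The name first_pair_verdict and the Option<bool> return type are assumptions made for this example; the original code returned a plain bool from a function that isn't shown.

```rust
/// Hypothetical helper: scans `line` once without collecting into a Vec.
/// Returns Some(false) at the first forbidden pair, Some(true) at the
/// first double letter, and None if neither occurs.
fn first_pair_verdict(line: &str) -> Option<bool> {
    // Holds the previous character; None until the first char is seen.
    let mut prev: Option<char> = None;
    for c in line.chars() {
        if let Some(p) = prev {
            match (p, c) {
                ('a', 'b') | ('c', 'd') | ('p', 'q') | ('x', 'y') => return Some(false),
                // The guard compares the two freshly bound values.
                (a, b) if a == b => return Some(true),
                _ => (),
            }
        }
        prev = Some(c);
    }
    None
}
```

Because the previous character lives in an Option, an empty string needs no special case: the loop body simply never runs.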

Related

Comparing Lists of Strings in Scala

I know lists are immutable but I'm still confused on how I would go about this. I have two lists of strings - For example:
var list1: List[String] = List("M", "XW1", "HJ", "K")
var list2: List[String] = List("M", "XW4", "K", "YN")
I want to loop through these lists and see if the elements match. If it doesn't, the program would immediately return false. If it is a match, it will continue to iterate until it finds an element that begins with X. If it is indeed an X, I want to return true regardless of whether the number is the same or not.
Problem I'm having is that currently I have a conditional stating that if the two elements do not match, return false immediately. This is a problem because obviously XW1 and XW4 are not the same and it will return false. How can I bypass this and determine that it is a match to my eyes regardless of the number?
I also have a counter and two length variables to account for the fact that the lists may be of differing lengths. My counter goes up to the length of the shortest list: for (x <- 0 to (c-1)) (c being the counter).
You want to use zipAll & forall.
def compareLists(l1: List[String], l2: List[String]): Boolean =
l1.zipAll(l2, "", "").forall {
case (x, y) =>
(x == y) || (x.startsWith("X") && y.startsWith("X"))
}
Note that I am assuming an empty string will always be different than any other element.
If I understand your requirement correctly, to be considered a match, 1) each element in the same position of the two lists being simultaneously iterated must be the same except when both start with X (in which case it should return true without comparing any further), and 2) both lists must be of the same size.
If that's correct, I would recommend using a simple recursive function like below:
def compareLists(ls1: List[String], ls2: List[String]): Boolean = (ls1, ls2) match {
  case (Nil, Nil) =>
    true
  case (h1 :: t1, h2 :: t2) =>
    if (h1.startsWith("X") && h2.startsWith("X"))
      true  // short-circuiting
    else
      if (h1 != h2)
        false
      else
        compareLists(t1, t2)
  case _ =>
    false
}
Based on your comment that the result should be true for the lists given in the question, you could do something like this:
val list1: List[String] = List("M", "XW1", "HJ", "K")
val list2: List[String] = List("M", "XW4", "K", "YN")
val (matched, unmatched) = list1.zipAll(list2, "", "").partition { case (x, y) => x == y }
val result = unmatched match {
case Nil => true
case (x, y) :: _ => (x.startsWith("X") && y.startsWith("X"))
}
You could also use cats foldM to iterate through the lists and terminate early if there is either (a) a mismatch, or (b) two elements that begin with 'X':
import cats.implicits._
val list1: List[String] = List("M", "XW1", "HJ", "K")
val list2: List[String] = List("M", "XW4", "K", "YN")
list1.zip(list2).foldM(()){
case (_, (s1, s2)) if s1 == s2 => ().asRight
case (_, (s1, s2)) if s1.startsWith("X") && s2.startsWith("X") => true.asLeft
case _ => false.asLeft
}.left.getOrElse(false)

replace if condition by match

I am trying to improve my use of match expressions.
I have code like the following, where foo is a String:
if foo.chars().nth(0).unwrap() != '2' &&
foo.chars().nth(0).unwrap() != '3' &&
&foo[0..3] != "xyz"
{
return message;
}
Is it possible to create the same behavior using match?
Something like this idea:
match foo {
&[0] == (2 | 3) => do_nothing
&[0..3] == "xyz" => do_nothing
_ => return message;
}
There are several problems with your approach:
The matched expression must be unique. You can't match foo[0] in some branches and foo[0..3] in some other branches. So let's pick the biggest range: foo[0..3].
Rust string matching cannot match sub-strings: it's all or nothing. Slices don't have that limitation and we can freely get a slice of bytes, so let's match &foo.as_bytes()[0..3].
match &foo.as_bytes()[0..3] {
    &[b'2', ..] | &[b'3', ..] => do_nothing(),
    b"xyz" => do_nothing(),
    _ => return message,
}
Since Rust 1.53, | is a regular pattern rather than a special construct of match, so the first branch can also be written as &[b'2' | b'3', ..].
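Putting the points above together, here is a minimal sketch that matches on the byte slice with slice patterns. The function name should_return_message is hypothetical, and the sketch deliberately differs from the original in one respect: it matches the whole slice instead of indexing [0..3], so strings shorter than three bytes fall through to the default arm rather than panicking.

```rust
/// Hypothetical helper: returns true when the original `if` would have
/// returned `message`, i.e. when `foo` starts with neither '2', '3'
/// nor "xyz".
fn should_return_message(foo: &str) -> bool {
    match foo.as_bytes() {
        // Nested or-pattern: the first byte is '2' or '3'.
        [b'2' | b'3', ..] => false,
        // The first three bytes spell "xyz"; `..` ignores the rest.
        [b'x', b'y', b'z', ..] => false,
        _ => true,
    }
}
```

Matching on bytes is only safe here because the compared prefixes are ASCII; for non-ASCII prefixes, foo.starts_with would be the more robust choice.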

Puppet: map in a loop adds empty elements to the array

$hash_arr_1 = { b => 2, c => 3, f => 1 }
$arr = ['a', 'c', 'd', 'f', 'e']
$hash_arr_2 = $arr.map |$param| {
if has_key($hash_arr_1, $param) {
{$param => $hash_arr_1[$param]}
}
}
notice($hash_arr_2)
Result: [{ , c => 3, , f => 1, ,}]
How can I make sure there are no empty elements in the array?
The problem here is that you are using the map lambda function when really you want to be using filter. Summary from linked documentation is as follows:
Applies a lambda to every value in a data structure and returns an array or hash containing any elements for which the lambda evaluates to true.
So the solution for you is:
$hash_arr_2 = $hash_arr_1.filter |$key, $value| { $key in $arr }
This will iterate through the keys of the hash $hash_arr_1, check if the key exists as a member of the array $arr with the provided conditional, and then return a hash with only the key value pairs that evaluated to true.

Scala String Similarity

I have some Scala code that computes the similarity between a set of strings and gives all the unique strings.
val filtered = z.reverse.foldLeft((List.empty[String], z.reverse)) {
  case ((acc, zt), zz) =>
    // Each step must return the (accumulator, remaining) pair.
    if (zt.tail.exists(tt => similarity(tt, zz) < threshold)) (acc, zt.tail)
    else (zz :: acc, zt.tail)
}._1
I'll try to explain what is going on here:
This uses a fold over the reversed input data, starting from an empty list of Strings (to accumulate results) and the (reverse of the) remaining input data (to compare against; I labeled it zt for "z-tail").
The fold then cycles through the data, checking each entry against the tail of the remaining data (so it doesn't get compared to itself or any earlier entry)
If there is a match, just the existing accumulator (labelled acc) will be allowed through, otherwise, add the current entry (zz) to the accumulator. This updated accumulator is paired with the tail of the "remaining" Strings (zt.tail), to ensure a reducing set to compare against.
Finally, we end up with a pair of lists: the required remaining Strings, and an empty list (no Strings left to compare against), so we take the first of these as our result.
The problem is that if, say, the 1st, 4th and 8th strings are similar, I get only the 1st string. Instead, I should get the set (1st, 4th, 8th); then, if the 2nd, 5th, 14th and 21st strings are similar, I should get the set (2nd, 5th, 14th, 21st).
If I understand you correctly - you want the result to be of type List[List[String]] and not the List[String] you are getting now - where each item is a list of similar Strings (right?).
If so - I can't see a trivial change to your implementation that would achieve this, as the similar values are lost (when you enter the if(true) branch and just return the acc - you skip an item and you'll never "see" it again).
Two possible solutions I can think of:
Based on your idea, but using a 3-tuple of the form (acc, zt, scanned) as the foldLeft result type, where the added scanned is the list of already-scanned items. This way we can refer back to them when we find an element that has no preceding similar elements:
val filtered = z.reverse.foldLeft((List.empty[List[String]],z.reverse,List.empty[String])) {
case ((acc, zt, scanned), zz) =>
val hasSimilarPreceeding = zt.tail.exists { tt => similarity(tt, zz) < threshold }
val similarFollowing = scanned.collect { case tt if similarity(tt, zz) < threshold => tt }
(if (hasSimilarPreceeding) acc else (zz :: similarFollowing) :: acc, zt.tail, zz :: scanned)
}._1
A probably-slower but much simpler solution would be to just groupBy the group of similar strings:
val alternative = z.groupBy(s => z.collect {
case other if similarity(s, other) < threshold => other
}.toSet ).values.toList
All of this assumes that the function:
def f(a: String, b: String): Boolean = similarity(a, b) < threshold
is commutative (symmetric) and transitive, i.e.:
f(a, b) && f(a, c) means that f(b, c)
f(a, b) if and only if f(b, a)
To test both implementations I used:
// strings are similar if they start with the same character
def similarity(s1: String, s2: String) = if (s1.head == s2.head) 0 else 100
val threshold = 1
val z = List("aa", "ab", "c", "a", "e", "fa", "fb")
And both options produce the same results:
List(List(aa, ab, a), List(c), List(e), List(fa, fb))

Detecting the index in a string that is not printable character with Scala

I have a method that detects the index in a string that is not printable as follows.
def isPrintable(v:Char) = v >= 0x20 && v <= 0x7E
val ba = List[Byte](33,33,0,0,0)
ba.zipWithIndex.filter { v => !isPrintable(v._1.toChar) } map {v => v._2}
> res115: List[Int] = List(2, 3, 4)
The first element of the result list is the index, but I wonder if there is a simpler way to do this.
If you want an Option[Int] of the first non-printable character (if one exists), you can do:
ba.zipWithIndex.collectFirst{
case (char, index) if (!isPrintable(char.toChar)) => index
}
> res4: Option[Int] = Some(2)
If you want all the indices like in your example, just use collect instead of collectFirst and you'll get back a List.
For getting only the first index that meets the given condition:
ba.indexWhere(v => !isPrintable(v.toChar))
(it returns -1 if nothing is found)
You can use a regular expression directly to find unprintable characters by their Unicode category.
Resource: Regexp page
This way you can filter your string with such a pattern, for instance:
val text = "this is \n sparta\t\r\n!!!"
text.zipWithIndex.filter(_._1.toString.matches("\\p{C}")).map(_._2)
> res3: Vector(8, 16, 17, 18)
As a result you'll get a Vector with the indices of all unprintable characters in the String.
If only the first occurrence of a non-printable char is desired
The span method applied to a List delivers two sublists: the first contains the leading elements that all satisfy a condition, and the second starts with the first element that fails it. In this case consider,
val (l,r) = ba.span(b => isPrintable(b.toChar))
l: List(33, 33)
r: List(0, 0, 0)
To get the index of the first non printable char,
l.size
res: Int = 2
If all the occurrences of non-printable chars are desired
Consider partitioning a given List by a criterion. For instance, for
val ba2 = List[Byte](33,33,0,33,33)
val (l,r) = ba2.zipWithIndex.partition(b => isPrintable(b._1.toChar))
l: List((33,0), (33,1), (33,3), (33,4))
r: List((0,2))
where r includes tuples with non printable chars and their position in the original List.
I am not sure whether a list of indexes or of tuples is needed, nor whether 'ba' needs to be a list of bytes or starts off as a string.
for { i <- 0 until ba.length if !isPrintable(ba(i).toChar) } yield i
Here is a tail-recursive version, because people need performance :)
def getNonPrintable(ba: List[Byte]): List[Int] = {
  import scala.annotation.tailrec
  import scala.collection.mutable.ListBuffer
  val buffer = ListBuffer[Int]()
  @tailrec
  def go(xs: List[Byte], cur: Int): ListBuffer[Int] = {
    xs match {
      case Nil => buffer
      case y :: ys => {
        if (!isPrintable(y.toChar)) buffer += cur
        go(ys, cur + 1)
      }
    }
  }
  go(ba, 0)
  buffer.toList
}
