replace if condition by match - rust

I am trying to improve my use of match expressions.
I have code like the following, where foo is a String:
if foo.chars().nth(0).unwrap() != '2' &&
foo.chars().nth(0).unwrap() != '3' &&
&foo[0..3] != "xyz"
{
return message;
}
Is it possible to create the same behavior using match?
Something like this idea:
match foo {
&[0] == (2 | 3) => do_nothing
&[0..3] == "xyz" => do_nothing
_ => return message;
}

There are several problems with your approach:
The matched expression must be unique. You can't match foo[0] in some branches and foo[0..3] in some other branches. So let's pick the biggest range: foo[0..3].
Rust string matching cannot match sub-strings: it's all or nothing. Slices don't have that limitation and we can freely get a slice of bytes, so let's match &foo.as_bytes()[0..3].
match &foo.as_bytes()[0..3] {
&[b'2', ..] | &[b'3', ..] => do_nothing(),
b"xyz" => do_nothing(),
_ => return message,
}
There is a proposal to make | a regular pattern rather than a special construct of match, which would make the first branch expressible as &[b'2' | b'3', ..].
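For reference, nested or-patterns have been stable since Rust 1.53, so the idea above can be written directly. The following is only a sketch: the helper name has_accepted_prefix is hypothetical, and it matches on the whole byte slice instead of &foo.as_bytes()[0..3] so that inputs shorter than three bytes do not panic.

```rust
// Hypothetical helper (not from the original post): returns true when `foo`
// starts with '2', '3', or "xyz".
fn has_accepted_prefix(foo: &str) -> bool {
    // Matching on the full byte slice avoids the out-of-range panic that
    // `&foo.as_bytes()[0..3]` would cause for inputs shorter than 3 bytes.
    match foo.as_bytes() {
        // Nested or-pattern: the first byte is b'2' or b'3'.
        [b'2' | b'3', ..] => true,
        // The first three bytes spell "xyz".
        [b'x', b'y', b'z', ..] => true,
        _ => false,
    }
}
```

With this helper, the original check could read `if !has_accepted_prefix(&foo) { return message; }`.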

Related

Use match to match double letter pattern

I'm iterating over a vector of chars. I would like to detect specific sequences of chars ("ab", "cd", "pq" and "xy"). If I find one of these, I want to return false. However if I find a double letter sequence (e.g: "aa"), I want to return true.
I came up with this:
let chars: Vec<char> = line.chars().collect();
for (idx, c) in chars.iter().enumerate() {
if idx > 0 {
match (chars[idx - 1], c) {
('a', 'b') => return false,
('c', 'd') => return false,
('p', 'q') => return false,
('x', 'y') => return false,
(c, c) => return true,
_ => (),
};
}
}
However when I run this I get the following error:
36 | (c, c) => return true,
| ^ used in a pattern more than once
and I can't understand why.
Running rustc --explain E0416 seems to give a solution:
match (chars[idx - 1], c) {
('a', 'b') => return false,
('c', 'd') => return false,
('p', 'q') => return false,
('x', 'y') => return false,
(prev, curr) if &prev == curr => return true,
_ => (),
};
But I would like to understand what's happening here. I'd have expected the equality check to happen with (c, c).
From the documentation:
To create a match expression that compares the values of the outer x and y, rather than introducing a shadowed variable, we would need to use a match guard conditional
A match pattern can only contain literals or declare new variables; you can't put another dynamic value in it to compare against.
You can use the solution given by the compiler, or the more direct
(a, _) if a==*c => return true,
Note that there are ways to avoid collecting into a vector here. You could for example store the previous value in a mutable variable of type char (by taking the first value before the loop if you know the string isn't empty) or into an Option<char>.
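The Option<char> variant mentioned above might look like the sketch below. The function name first_pair_verdict and the Option<bool> return type are assumptions made for illustration; the original question does not say what should happen when neither kind of pair is found, so this sketch returns None in that case.

```rust
// Hypothetical function: scans adjacent pairs without collecting into a Vec.
// Some(false) = a forbidden pair came first, Some(true) = a double letter
// came first, None = neither was found (an assumption of this sketch).
fn first_pair_verdict(line: &str) -> Option<bool> {
    // Holds the previous character; None before the first iteration.
    let mut prev: Option<char> = None;
    for curr in line.chars() {
        if let Some(p) = prev {
            match (p, curr) {
                ('a', 'b') | ('c', 'd') | ('p', 'q') | ('x', 'y') => return Some(false),
                // A guard is needed here: a pattern cannot reuse one binding twice.
                (a, b) if a == b => return Some(true),
                _ => (),
            }
        }
        prev = Some(curr);
    }
    None
}
```

Because prev starts as None, the idx > 0 check from the original loop is no longer needed.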

Longest common suffix

I would like to find the longest common suffix of two strings in Scala.
def longestSuffix(s1: String, s2: String) = {
val it = (s1.reverseIterator zip s2.reverseIterator) takeWhile {case (x, y) => x == y}
it.map (_._1).toList.reverse.mkString
}
This code is clumsy and probably inefficient (e.g. because of reversing). How would you find the longest common suffix functionally, i.e. without mutable variables?
One way to improve it would be to combine reverse and map in the last operation:
str1.reverseIterator.zip(str2.reverseIterator).takeWhile( c => c._1 == c._2)
.toList.reverseMap(c => c._1) mkString ""
This first builds a list and then calls reverseMap on it.
We can iterate over substrings, without reverse:
def longestSuffix(s1: String, s2: String) = {
s1.substring(s1.length to 0 by -1 takeWhile { n => s2.endsWith(s1.substring(n)) } last)
}
Let tails produce the sub-strings and then return the first that fits.
def longestSuffix(s1: String, s2: String) =
s1.tails.dropWhile(!s2.endsWith(_)).next
Some efficiency might be gained by calling tails on the shorter of the two inputs.
I came up with a solution like this:
def commonSuffix(s1: String, s2: String): String = {
val n = (s1.reverseIterator zip s2.reverseIterator) // mutable !
.takeWhile {case (a, b) => a == b}
.size
s1.substring(s1.length - n) // is it efficient ?
}
Note that I am using substring for efficiency (not sure if it's correct).
This solution is also not completely "functional", since I am using reverseIterator even though it is mutable, because I did not find another way to iterate over strings in reverse order. How would you suggest fixing or improving it?

Given a string sequence of words, check if it matches a pattern

I encountered this problem in an interview and I was stuck on the best way to go about it. The question is as follows:
Given a string sequence of words and a string sequence pattern, return true if the sequence of words matches the pattern otherwise false.
Definition of match: A word that is substituted for a variable must always follow that substitution. For example, if "f" is substituted as "monkey" then any time we see another "f" then it must match "monkey" and any time we see "monkey" again it must match "f".
Examples
input: "ant dog cat dog", "a d c d"
output: true
This is true because every variable maps to exactly one word and vice versa.
a -> ant
d -> dog
c -> cat
d -> dog
input: "ant dog cat dog", "a d c e"
output: false
This is false because if we substitute "d" as "dog" then you can not also have "e" be substituted as "dog".
a -> ant
d, e -> dog (Both d and e can't both map to dog so false)
c -> cat
input: "monkey dog eel eel", "e f c c"
output: true
This is true because every variable maps to exactly one word and vice versa.
e -> monkey
f -> dog
c -> eel
Initially, I thought of doing something as follows...
function matchPattern(pattern, stringToMatch) {
var patternBits = pattern.split(" ");
var stringBits = stringToMatch.split(" ");
var dict = {};
if (patternBits.length < 0
|| patternBits.length !== stringBits.length) {
return false;
}
for (var i = 0; i < patternBits.length; i++) {
if (dict.hasOwnProperty(patternBits[i])) {
if (dict[patternBits[i]] !== stringBits[i]) {
return false;
}
} else {
dict[patternBits[i]] = stringBits[i];
}
}
return true;
}
var ifMatches = matchPattern("a e c d", "ant dog cat dog");
console.log("Pattern: " + (ifMatches ? "matches!" : "does not match!"));
However, I realized that this won't work and fails example #2 as it erroneously returns true. One way to deal with this issue is to use a bi-directional dictionary or two dictionaries, i.e. store both {"a": "ant"} and {"ant": "a"} and check both scenarios in the if check. However, that seemed like wasted space. Is there a better way to tackle this problem without using regular expressions?
I think a simple choice that is quadratic in the length of the list of words is to verify that every pairing of list indices has the same equality characteristics in the two lists. I'll assume that you get the "words" and "pattern" as lists already and don't need to parse out spaces and whatever -- that ought to be a separate function's responsibility anyway.
function matchesPatternReference(words, pattern) {
if(words.length !== pattern.length) return false;
for(var i = 0; i < words.length; i++)
for(var j = i+1; j < words.length; j++)
if((words[i] === words[j]) !== (pattern[i] === pattern[j]))
return false;
return true;
}
A slightly better approach would be to normalize both lists, then compare the normalized lists for equality. To normalize a list, replace each list element by the number of unique list elements that appear before its first occurrence in the list. This will be linear in the length of the longer list, assuming you believe hash lookups and list appends take constant time. I don't know enough Javascript to know if these are warranted; certainly at worst the idea behind this algorithm can be implemented with suitable data structures in n*log(n) time even without believing that hash lookups are constant time (a somewhat questionable assumption no matter the language).
function normalize(words) {
var next_id = 0;
var ids = {};
var result = [];
for(var i = 0; i < words.length; i++) {
if(!ids.hasOwnProperty(words[i])) {
ids[words[i]] = next_id;
next_id += 1;
}
result.push(ids[words[i]]);
}
return result;
}
function matchesPatternFast(words, pattern) {
return normalize(words) === normalize(pattern);
}
Note: As pointed out in the comments, one should check deep equality of the normalized arrays manually, since === on arrays does an identity comparison in Javascript and does not compare elementwise. See also How to check if two arrays are equal with Javascript?.
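To illustrate the normalization idea with an elementwise comparison, here is a sketch in Rust, where == on Vec compares contents, not identity. The names normalize and matches_pattern_fast mirror the JavaScript above, but the signatures (slices of &str) are assumptions of this sketch, and the inputs are assumed to be already split into tokens.

```rust
use std::collections::HashMap;

// Replace each item by the number of distinct items seen before its first
// occurrence, e.g. ["ant", "dog", "cat", "dog"] becomes [0, 1, 2, 1].
fn normalize(items: &[&str]) -> Vec<usize> {
    let mut ids: HashMap<&str, usize> = HashMap::new();
    let mut result = Vec::with_capacity(items.len());
    for &item in items {
        // The next fresh id equals the number of distinct items seen so far.
        let next_id = ids.len();
        let id = *ids.entry(item).or_insert(next_id);
        result.push(id);
    }
    result
}

// Two token lists match when their normalized forms are equal.
// Vec equality checks lengths and then compares element by element.
fn matches_pattern_fast(words: &[&str], pattern: &[&str]) -> bool {
    normalize(words) == normalize(pattern)
}
```

For the examples in the question, "ant dog cat dog" against "a d c d" normalizes to [0, 1, 2, 1] on both sides and matches, while "a d c e" normalizes to [0, 1, 2, 3] and does not.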
Addendum: Below I argue that matchesPatternFast and matchesPatternReference compute the same function -- but use the faulty assumption that === on arrays compares elements pointwise rather than being a pointer comparison.
We can define the following function:
function matchesPatternSlow(words, pattern) {
return matchesPatternReference(normalize(words), normalize(pattern));
}
I observe that normalize(x).length === x.length and normalize(x)[i] === normalize(x)[j] if and only if x[i] === x[j]; therefore matchesPatternSlow computes the same function as matchesPatternReference.
I will now argue that matchesPatternSlow(x,y) === matchesPatternFast(x,y). Certainly if normalize(x) === normalize(y) then we will have this property. matchesPatternFast will manifestly return true. On the other hand, matchesPatternSlow operates by making a number of queries on its two inputs and verifying that these queries always return the same results for both lists: outside the loop, the query is function(x) { return x.length }, and inside the loop, the query is function(x, i, j) { return x[i] === x[j]; }. Since equal objects will respond identically to any query, it follows that all queries on the two normalized lists will align, matchesPatternSlow will also return true.
What if normalize(x) !== normalize(y)? Then matchesPatternFast will manifestly return false. But if they are not equal, then either their lengths do not match -- in which case matchesPatternSlow will also return false from the first check in matchesPatternReference as we hoped -- or else the elements at some index are not equal. Suppose the smallest mismatching index is i. It is a property of normalize that the element at index i will either be equal to an element at index j<i or else it will be one larger than the maximal element from indices 0 through i-1. So we now have four cases to consider:
We have j1<i and j2<i for which normalize(x)[j1] === normalize(x)[i] and normalize(y)[j2] === normalize(y)[i]. But since normalize(x)[i] !== normalize(y)[i] we then know that normalize(x)[j1] !== normalize(y)[i]. So when matchesPatternReference chooses the indices j1 and i, we will find that normalize(x)[j1] === normalize(x)[i] is true and normalize(y)[j1] === normalize(y)[i] is false and immediately return false as we are trying to show.
We have j<i for which normalize(x)[j] === normalize(x)[i] and normalize(y)[i] is not equal to any previous element of normalize(y). Then matchesPatternReference will return false when it chooses the indices j and i, since normalize(x) matches on these indices but normalize(y) doesn't.
We have j<i for which normalize(y)[j] === normalize(y)[i] and normalize(x)[i] is not equal to any previous element of normalize(x). Basically the same as in the previous case.
We have that normalize(x)[i] is one larger than the largest earlier element in normalize(x) and normalize(y)[i] is one larger than the largest earlier element in normalize(y). But since normalize(x) and normalize(y) agree on all previous elements, this means normalize(x)[i] === normalize(y)[i], a contradiction to our assumption that the normalized lists differ at this index.
So in all cases, matchesPatternFast and matchesPatternSlow agree -- hence matchesPatternFast and matchesPatternReference compute the same function.
For this special case, I assume the pattern refers to matching first character. If so, you can simply zip and compare.
# python2.7
words = "ant dog cat dog"
letters = "a d c d"
letters2 = "a d c e"
def match(ws, ls):
ws = ws.split()
ls = ls.split()
return all(w[0] == l for w, l in zip(ws + [[0]], ls + [0]))
print match(words, letters)
print match(words, letters2)
The funny [[0]] and [0] in the end is to ensure that the pattern and the words have the same length.

Detecting the index in a string that is not printable character with Scala

I have a method that detects the index in a string that is not printable as follows.
def isPrintable(v:Char) = v >= 0x20 && v <= 0x7E
val ba = List[Byte](33,33,0,0,0)
ba.zipWithIndex.filter { v => !isPrintable(v._1.toChar) } map {v => v._2}
> res115: List[Int] = List(2, 3, 4)
The first element of the result list is the index, but I wonder if there is a simpler way to do this.
If you want an Option[Int] of the first non-printable character (if one exists), you can do:
ba.zipWithIndex.collectFirst{
case (char, index) if (!isPrintable(char.toChar)) => index
}
> res4: Option[Int] = Some(2)
If you want all the indices like in your example, just use collect instead of collectFirst and you'll get back a List.
For getting only the first index that meets the given condition:
ba.indexWhere(v => !isPrintable(v.toChar))
(it returns -1 if nothing is found)
You can use a regexp directly to find unprintable characters by their Unicode category.
Resource: Regexp page
That way you can filter your string directly with such a pattern, for instance:
val text = "this is \n sparta\t\r\n!!!"
text.zipWithIndex.filter(_._1.toString.matches("\\p{C}")).map(_._2)
> res3: Vector(8, 16, 17, 18)
As a result you'll get a Vector with the indices of all unprintable characters in the String.
If only the first occurrence of a non-printable char is desired
Method span applied on a List delivers two sublists, the first where all the elements hold a condition, the second starts with an element that falsified the condition. In this case consider,
val (l,r) = ba.span(b => isPrintable(b.toChar))
l: List(33, 33)
r: List(0, 0, 0)
To get the index of the first non printable char,
l.size
res: Int = 2
If all the occurrences of non-printable chars are desired
Consider partitioning a given List by a criterion. For instance, for
val ba2 = List[Byte](33,33,0,33,33)
val (l,r) = ba2.zipWithIndex.partition(b => isPrintable(b._1.toChar))
l: List((33,0), (33,1), (33,3), (33,4))
r: List((0,2))
where r includes tuples with non printable chars and their position in the original List.
I am not sure whether a list of indexes or of tuples is needed, and I am not sure whether 'ba' needs to be a list of bytes or starts off as a string.
for { i <- 0 until ba.length if !isPrintable(ba(i).toChar) } yield i
here, because people need performance :)
def getNonPrintable(ba:List[Byte]):List[Int] = {
import scala.collection.mutable.ListBuffer
var buffer = ListBuffer[Int]()
@scala.annotation.tailrec
def go(xs: List[Byte], cur: Int): ListBuffer[Int] = {
xs match {
case Nil => buffer
case y :: ys => {
if (!isPrintable(y.toChar)) buffer += cur
go(ys, cur + 1)
}
}
}
go(ba, 0)
buffer.toList
}

groovy - is there any implicit variable to get access to item index in "each" method

Is there any way to remove the variable "i" in the following example and still get access to the index of the item being printed?
def i = 0;
"one two three".split().each {
println ("item [ ${i++} ] = ${it}");
}
=============== EDIT ================
I found that one possible solution is to use the "eachWithIndex" method:
"one two three".split().eachWithIndex { it, i ->
println ("item [ ${i} ] = ${it}");
}
Please do let me know if there are other solutions.
You can use eachWithIndex():
"one two three four".split().eachWithIndex() { entry, index ->
println "${index} : ${entry}" }
this will result in
0 : one
1 : two
2 : three
3 : four
Not sure what 'other solutions' you are looking for... The only other thing you can do that I can think of (with Groovy 1.8.6), is something like:
"one two three".split().with { words ->
[words,0..<words.size()].transpose().collect { word, index ->
word * index
}
}
As you can see, this allows you to use collect with an index as well (as there is no collectWithIndex method).
Another approach, if you want to use the index of the collection with methods other than each, is to define an enumerate method that returns pairs [index, element], analogous to Python's enumerate:
Iterable.metaClass.enumerate = { start = 0 ->
def index = start
delegate.collect { [index++, it] }
}
So, for example:
assert 'un dos tres'.tokenize().enumerate() == [[0,'un'], [1,'dos'], [2,'tres']]
(notice that I'm using tokenize instead of split because the former returns an Iterable, while the latter returns a String[])
And we can use this new collection with each, as you wanted:
'one two three'.tokenize().enumerate().each { index, word ->
println "$index: $word"
}
Or we can use it with other iteration methods :D
def repetitions = 'one two three'.tokenize().enumerate(1).collect { n, word ->
([word] * n).join(' ')
}
assert repetitions == ['one', 'two two', 'three three three']
Note: Another way of defining the enumerate method, following tim_yates' more functional approach is:
Iterable.metaClass.enumerate = { start = 0 ->
def end = start + delegate.size() - 1
[start..end, delegate].transpose()
}

Resources