takeRightWhile() method in scala - string

I might be missing something but recently I came across a task to get last symbols according to some condition. For example I have a string: "this_is_separated_values_5". Now I want to extract 5 as Int.
Note: number of parts separated by _ is not defined.
If a string had a method takeRightWhile(f: Char => Boolean), this would be trivial: takeRightWhile(ch => ch != '_'). It would also be efficient: a straightforward implementation would find the last index of _ and then take a substring, whereas this method would skip the first step and give better average time complexity.
UPDATE: All the variations of str.reverse.takeWhile(_!='_').reverse are quite inefficient, as they use additional O(n) space. To implement takeRightWhile efficiently, you could iterate starting from the right, accumulating the result in a string builder or whatever else, and return the result. I am asking about this kind of method, not the implementation that was already described and declined in the question itself.
Question: Does this kind of method exist in the Scala standard library? If not, is there a combination of standard library methods that achieves the same in a minimum number of lines?
Thanks in advance.

Possible solution:
str.reverse.takeWhile(_!='_').reverse
Update
You can go from right to left with following expression using foldRight:
str.toList.foldRight(List.empty[Char]) {
case (item, acc) => item::acc
}
Here you need to check the condition and stop adding items once it is no longer met. For this you can pass a flag along with the accumulated value:
val (_, list) = str.toList.foldRight((false, List.empty[Char])) {
case (item, (false, list)) if item!='_' => (false, item::list)
case (_, (_, list)) => (true, list)
}
val res = list.mkString.toInt
This solution is even less efficient than the double-reverse solution:
The implementation of foldRight uses a combination of List reverse and foldLeft
You cannot break out of foldRight, so you need a flag to skip all items after the condition is met
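The right-to-left scan described in the question's update can be sketched without reversing or folding. This is only a minimal sketch: takeRightWhile here is a hypothetical helper written for illustration, not a standard library method. It walks an index back from the end and takes a single substring.

```scala
// Hypothetical helper (not part of the Scala standard library).
// Walks backwards from the end while the predicate holds, then takes
// one substring, so no intermediate reversed copy is built.
def takeRightWhile(s: String)(p: Char => Boolean): String = {
  var i = s.length
  while (i > 0 && p(s.charAt(i - 1))) i -= 1
  s.substring(i)
}
```

With this helper, takeRightWhile("this_is_separated_values_5")(_ != '_').toInt extracts the trailing number, and a string without any _ is returned whole.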

I'd go with this:
val s = "string_with_following_number_42"
s.split("_").reverse.head
// res:String = 42
This is a naive attempt and by no means optimized. It splits the String into an Array of Strings, reverses the array, and takes the first element. Note that, because the reversing happens after the splitting, the order of the characters is correct.

I am not exactly sure about the problem you are facing. My understanding is that you have a string of the format xxx_xxx_xx_...._xxx_123 and you want to extract the part at the end as an Int.
import scala.util.Try
val yourStr = "xxx_xxx_xxx_xx...x_xxxxx_123"
val yourInt = yourStr.split('_').last.toInt
// But remember that the above is unsafe so you may want to take it as Option
val yourIntOpt = Try(yourStr.split('_').last.toInt).toOption
Or... let's say your requirement is to collect the right suffix for as long as some boolean condition remains true.
import scala.util.Try
val yourStr = "xxx_xxx_xxx_xx...x_xxxxx_123"
val rightSuffix = yourStr.reverse.takeWhile(c => c != '_').reverse
val yourInt = rightSuffix.toInt
// but above is unsafe so
val yourIntOpt = Try(rightSuffix.toInt).toOption
Comment if your requirement is different from this.

You can use StringBuilder and lastIndexWhere.
val str = "this_is_separated_values_5"
val sb = new StringBuilder(str)
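// Index of the last '_' (-1 if there is none)
val lastIdx = sb.lastIndexWhere(ch => ch == '_')
// Everything after that index is the trailing part, here "5"
val suffix = str.substring(lastIdx + 1)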

Related

Scala - Executing every element until they all have finished

I cannot figure out why my function invokeAll does not give out the correct output/work properly. Any solutions? (No futures or parallel collections allowed and the return type needs to be Seq[Int])
def invokeAll(work: Seq[() => Int]): Seq[Int] = {
//this is what we should return as an output "return res.toSeq"
//res cannot be changed!
val res = new Array[Int](work.length)
var list = mutable.Set[Int]()
var n = res.size
val procedure = (0 until n).map(work =>
new Runnable {
def run {
//add the finished element/Int to list
list += work
}
}
)
val threads = procedure.map(new Thread(_))
threads.foreach(x => x.start())
threads.foreach (x => (x.join()))
res ++ list
//this should be the final output ("return res.toSeq")
return res.toSeq
}
OMG, I know a java programmer, when I see one :)
Don't do this, it's not java!
val results: Future[Seq[Int]] = Future.traverse(work)(task => Future(task()))
This is how you do it in scala.
This gives you a Future with the results of all executions, that will be satisfied when all work is finished. You can use .map, .flatMap etc. to access and transform those results. For example
val sumOfAll: Future[Int] = results.map(_.sum)
Or (in the worst case, when you just want to give the result back to imperative code), you could block and wait on the future to get hold of the actual result (don't do this unless you are absolutely desperate): Await.result(results, 365.days), with scala.concurrent.duration._ imported.
If you want the results as array, results.map(_.toArray) will do that ... but you really should not: arrays aren't really a good choice for the vast majority of use cases in scala. Just stick with Seq.
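For completeness, here is a minimal sketch of the Future-based approach with the imports and execution context it needs. The name invokeAllAsync is made up for this illustration, and passing the execution context as an implicit parameter is just one reasonable choice.

```scala
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical name, for illustration only.
// Wraps every task in its own Future and collects the results in input order.
def invokeAllAsync(work: Seq[() => Int])(implicit ec: ExecutionContext): Future[Seq[Int]] =
  Future.traverse(work)(task => Future(task()))
```

A caller would typically bring an execution context into scope (for example ExecutionContext.global) and then map over the returned Future rather than block on it.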
The main problem in your code is that you are using a fixed-size array and trying to add elements to it with the ++ (concatenate) operator: res ++ list. That produces a new Seq, but you don't store it in any val.
If you remove the last line, return res.toSeq, you will see that res ++ list becomes the return value: your array res of work.length zeros followed by the elements of list. Try reading more about Scala collections: most of them are immutable, and using immutable data structures is good practice. In Scala, the ++ operator does not accumulate values into its left operand, and arrays are fixed-size.
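If the constraints from the question must be kept (plain threads, no futures, result returned as res.toSeq), one possible fix is to let each thread write its result into its own slot of res. This is a sketch under those assumptions, not the only way to do it; the key changes are that each task is actually invoked and that its index is used to store the result.

```scala
def invokeAll(work: Seq[() => Int]): Seq[Int] = {
  val res = new Array[Int](work.length)
  // One thread per task; each thread writes only to its own index,
  // so no two threads touch the same array slot.
  val threads = work.zipWithIndex.map { case (task, i) =>
    new Thread(new Runnable {
      def run(): Unit = { res(i) = task() }
    })
  }
  threads.foreach(_.start())
  // join() waits for every thread, so all writes are done before we read res.
  threads.foreach(_.join())
  res.toSeq
}
```

The results come back in the same order as the tasks, regardless of which thread finishes first.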

Is StringBuffer the Kotlin way to handle multiple string concatenation like in java?

What would be the kotlin way to handle multiple string concatenation?
--edit--
placing the piece of code that led me to this doubt
fun getNCharsFromRange(n: Int, range: CharRange): String {
val chars = range.toList()
val buffer = StringBuffer()
while (buffer.length < n) {
val randomInt = Random.Default.nextInt(0, chars.lastIndex)
val newchar = chars[randomInt]
val lastChar = buffer.lastOrNull() ?: ""
if (newchar != lastChar) {
buffer.append(newchar)
}
}
return buffer.toString()
}
A StringBuilder is the standard way to construct a String in Kotlin, as in Java.
(Unless it can be done in one line, of course, where a string template is usually better than Java-style concatenation.)
Kotlin has one improvement, though: you can use buildString to handle that implicitly, which can make the code a little more concise.  For example, your code can be written as:
fun getNCharsFromRange(n: Int, range: CharRange): String {
val chars = range.toList()
return buildString {
while (length < n) {
// nextInt's upper bound is exclusive, so use chars.size to keep the last character reachable
val randomInt = Random.Default.nextInt(0, chars.size)
val newChar = chars[randomInt]
val lastChar = lastOrNull() ?: ""
if (newChar != lastChar)
append(newChar)
}
}
}
That has no mention of buffer: instead, buildString creates a StringBuilder, makes it available as this, and then returns the resulting String.  (So length, lastOrNull(), and append refer to the StringBuilder.)
For short code, this can be significantly more concise and clearer; though the benefits are much less clear with longer code.  (Your code may be in the grey area between…)
Worth pointing out that the function name is misleading: it avoids repeated characters, but allows duplicates that are not consecutive.  If that's deliberate, then it would be worth making clear in the function name (and/or its doc comment).  Alternatively, if the intent is to avoid all duplicates, then there's an approach which is much simpler and/or more efficient: to shuffle the range (or at least part of it).
Using existing library functions, and making it an extension function on CharRange, the whole thing could be as simple as:
fun CharRange.randomChars(n: Int) = shuffled().take(n).joinToString("")
That shuffles the whole list, even if only a few characters are needed.  So it would be even more efficient to shuffle just the part needed.  But there's no library function for that, so you'd have to write that manually.  I'll leave it as an exercise!

Scala: replace char at position i in String

I have an initial String (binary) looking like this :
val mask = "00000000000000000000000000000000" of length 32
Additionally, I have a list of positions i (0 <= i <= 31) at which I want the mask to have value 1.
For instance List(0,12,30,4) should give the following result :
mask = "10001000000010000000000000000010"
How can I do this efficiently in scala ?
Thank you
A naive approach would be to fold over the positions with zero element 'mask' and successively update the char at the given position:
List(0,12,30,4).foldLeft(mask)((s, i) => s.updated(i, '1'))
- Daniel
Unfortunately the most efficient way I can think of to do this is the same as you'd do it in any other (not functional) programming language:
val mask = "00000000000000000000000000000000"
val l = List(0, 12, 30, 4)
val sb = new StringBuilder(mask)
for (i <- l) { sb(i) = '1' }
println(sb.toString)
This should actually be more efficient than Daniel's answer, but I'd prefer Daniel's for its clarity. Still, you asked for the most efficient way.
Updated
Ok, I think this should be more or less efficient and FP-style - the trick is to use views:
val view : SeqView[Char, Seq[_]] = (mask: Seq[Char]).view
println(List(0,12,30,4).foldLeft(view)((s, i) => s.updated(i, '1')).mkString)
I know this isn't exactly what you asked—but I have to wonder why you're using a String. There's a very efficient data structure for storing this type of information, called a BitSet.
If you're using a BitSet, then setting the bits corresponding to a list of integers is trivial.
If you want a mutable BitSet:
scala.collection.mutable.BitSet.empty ++= List(0,12,30,4)
If you want an immutable BitSet:
scala.collection.immutable.BitSet.empty ++ List(0,12,30,4)
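If the string form of the mask is still needed at the end, a BitSet can be rendered back into it. The following is a small sketch; maskFromPositions and its width parameter are names chosen for this example, and it assumes all positions are non-negative and below width.

```scala
import scala.collection.immutable.BitSet

// Hypothetical helper: builds a BitSet from the positions and renders it
// as a string of '0' and '1' characters of the given width.
def maskFromPositions(positions: Seq[Int], width: Int = 32): String = {
  val bits = BitSet.empty ++ positions
  (0 until width).map(i => if (bits(i)) '1' else '0').mkString
}
```

Calling maskFromPositions(List(0, 12, 30, 4)) yields the 32-character mask from the question.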

How to find maximum overlap between two strings in Scala?

Suppose I have two strings: s and t. I need to write a function f to find a max. t prefix, which is also an s suffix. For example:
s = "abcxyz", t = "xyz123", f(s, t) = "xyz"
s = "abcxxx", t = "xx1234", f(s, t) = "xx"
How would you write it in Scala ?
This first solution is easily the most concise, also it's more efficient than a recursive version as it's using a lazily evaluated iteration
s.tails.find(t.startsWith).get
Now there has been some discussion regarding whether tails would end up copying the whole string over and over. In which case you could use toList on s then mkString the result.
s.toList.tails.find(t.startsWith(_: List[Char])).get.mkString
For some reason the type annotation is required to get it to compile. I haven't actually tried to see which one is faster.
UPDATE - OPTIMIZATION
As som-snytt pointed out, t cannot start with any string that is longer than it, and therefore we could make the following optimization:
s.drop(s.length - t.length).tails.find(t.startsWith).get
Efficient, this is not, but it is a neat (IMO) one-liner.
val s = "abcxyz"
val t ="xyz123"
(s.tails.toSet intersect t.inits.toSet).maxBy(_.size)
//res8: String = xyz
(take all the suffixes of s that are also prefixes of t, and pick the longest)
If we only need to find the common overlapping part, we can recursively take the tail of the first string until the remaining part is a prefix of the second string. This also covers the case when the strings have no overlap, because then the empty string is returned.
scala> def findOverlap(s:String, t:String):String = {
if (s == t.take(s.size)) s else findOverlap (s.tail, t)
}
findOverlap: (s: String, t: String)String
scala> findOverlap("abcxyz", "xyz123")
res3: String = xyz
scala> findOverlap("one","two")
res1: String = ""
UPDATE: It was pointed out that tail might not be implemented in the most efficient way (i.e. it creates a new string each time it is called). If that becomes an issue, using substring(1) instead of tail (or converting both Strings to Lists, where tail / head should have O(1) complexity) might give better performance. By the same token, we can replace t.take(s.size) with t.substring(0, s.size), but only when s is no longer than t: substring throws on an out-of-range index, whereas take does not.
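One way to sidestep the copying concern entirely is to compare in place by index and take a single substring at the end. This is only a sketch; maxOverlap is a name chosen for illustration, and it relies on java.lang.String.regionMatches to compare a prefix of t against a suffix of s without creating intermediate strings.

```scala
// Hypothetical helper: longest suffix of s that is also a prefix of t.
def maxOverlap(s: String, t: String): String = {
  // A suffix longer than t can never be a prefix of t, so start no earlier than this.
  val start = math.max(0, s.length - t.length)
  // Try the longest candidate suffix first; i == s.length (the empty suffix) always matches.
  val from = (start to s.length)
    .find(i => t.regionMatches(0, s, i, s.length - i))
    .getOrElse(s.length)
  s.substring(from)
}
```

For the inputs from the question, maxOverlap("abcxyz", "xyz123") gives "xyz" and maxOverlap("abcxxx", "xx1234") gives "xx"; strings with no overlap give the empty string.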

Scala check if element is present in a list

I need to check if a string is present in a list, and call a function which accepts a boolean accordingly.
Is it possible to achieve this with a one liner?
The code below is the best I could get:
val strings = List("a", "b", "c")
val myString = "a"
strings.find(x=>x == myString) match {
case Some(_) => myFunction(true)
case None => myFunction(false)
}
I'm sure it's possible to do this with less coding, but I don't know how!
Just use contains
myFunction(strings.contains(myString))
And if you didn't want to use strict equality, you could use exists:
myFunction(strings.exists { x => customPredicate(x) })
Even easier!
strings contains myString
this should work also with different predicate
myFunction(strings.find( _ == myString ).isDefined)
In your case I would consider using a Set rather than a List, to ensure you have unique values only, unless you sometimes need to include duplicates.
In this case, you don't need to add any wrapper functions around lists.
You can also implement a contains method with foldLeft, it's pretty awesome. I just love foldLeft algorithms.
For example:
object ContainsWithFoldLeft extends App {
val list = (0 to 10).toList
println(contains(list, 10)) //true
println(contains(list, 11)) //false
def contains[A](list: List[A], item: A): Boolean = {
list.foldLeft(false)((r, c) => c.equals(item) || r)
}
}
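One trade-off worth keeping in mind with the foldLeft version: a fold always walks the whole list, while exists can stop at the first match. The sketch below makes that visible by counting how many elements each approach inspects; both function names are invented for this illustration.

```scala
// Hypothetical helpers that return the number of elements inspected.

// foldLeft has no way to stop early, so every element is visited.
def visitsWithFoldLeft[A](list: List[A], item: A): Int = {
  var visited = 0
  list.foldLeft(false) { (found, current) =>
    visited += 1
    current == item || found
  }
  visited
}

// exists stops as soon as the predicate returns true.
def visitsWithExists[A](list: List[A], item: A): Int = {
  var visited = 0
  list.exists { current =>
    visited += 1
    current == item
  }
  visited
}
```

For the list (0 to 10).toList and the item 0, the fold inspects all 11 elements while exists inspects only the first.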
