I am trying to remove the last occurrence of a character in a string. I can get its index:
str.lastIndexOf(',')
I have already tried to use split and the replace function on the string.
You could use patch.
scala> val s = "s;dfkj;w;erw"
s: String = s;dfkj;w;erw
scala> s.patch(s.lastIndexOf(';'), "", 1)
res6: String = s;dfkj;werw
Curious why Scala doesn't have a .replaceLast but there must be a reason...
Reverse the String and use str.replaceFirst then reverse again
I doubt this is terrible efficient but it is effective :)
scala> "abc.xyz.abc.xyz".reverse.replaceFirst("zyx.", "").reverse
res5: String = abc.xyz.abc
As a def it would look like this:
def replaceLast(input: String, matchOn: String, replaceWith: String) = {
input.reverse.replaceFirst(matchOn.reverse, replaceWith.reverse).reverse
}
scala> def removeLast(x: Char, xs: String): String = {
|
| val accumulator: (Option[Char], String) = (None, "")
|
| val (_, applied) = xs.foldRight(accumulator){(e: Char, acc: (Option[Char], String)) =>
| val (alreadyReplaced, runningAcc) = acc
| alreadyReplaced match {
| case some # Some(_) => (some, e + runningAcc)
| case None => if (e == x) (Some(e), runningAcc) else (None, e + runningAcc)
| }
| }
|
| applied
| }
removeLast: (x: Char, xs: String)String
scala> removeLast('f', "foobarf")
res7: String = foobar
scala> removeLast('f', "foobarfff")
res8: String = foobarff
You could try the following:
val input = "The quick brown fox jumps over the lazy dog"
val lastIndexOfU = input.lastIndexOf("u")
val splits = input.splitAt(lastIndexOfU)
val inputWithoutLastU = splits._1 + splits._2.drop(1) // "The quick brown fox jmps over the lazy dog"
Related
Suppose I need to merge two overlapping strings like that:
def mergeOverlap(s1: String, s2: String): String = ???
mergeOverlap("", "") // ""
mergeOverlap("", "abc") // abc
mergeOverlap("xyz", "abc") // xyzabc
mergeOverlap("xab", "abc") // xabc
I can write this function using the answer to one of my previous questions:
def mergeOverlap(s1: String, s2: String): String = {
val n = s1.tails.find(tail => s2.startsWith(tail)).map(_.size).getOrElse(0)
s1 ++ s2.drop(n)
}
Could you suggest either a simpler or maybe more efficient implementation of mergeOverlap?
You can find the overlap between two strings in time proportional to the total length of the strings O(n + k) using the algorithm to calculate the prefix function. Prefix function of a string at index i is defined as the size of the longest suffix at index i that is equal to the prefix of the whole string (excluding the trivial case).
See those links for more explanation of the definition and the algorithm to compute it:
https://cp-algorithms.com/string/prefix-function.html
https://hyperskill.org/learn/step/6413#a-definition-of-the-prefix-function
Here is an implementation of a modified algorithm that calculates the longest prefix of the second argument, equal to the suffix of the first argument:
import scala.collection.mutable.ArrayBuffer
def overlap(hasSuffix: String, hasPrefix: String): Int = {
val overlaps = ArrayBuffer(0)
for (suffixIndex <- hasSuffix.indices) {
val currentCharacter = hasSuffix(suffixIndex)
val currentOverlap = Iterator.iterate(overlaps.last)(overlap => overlaps(overlap - 1))
.find(overlap =>
overlap == 0 ||
hasPrefix.lift(overlap).contains(currentCharacter))
.getOrElse(0)
val updatedOverlap = currentOverlap +
(if (hasPrefix.lift(currentOverlap).contains(currentCharacter)) 1 else 0)
overlaps += updatedOverlap
}
overlaps.last
}
And with that mergeOverlap is just
def mergeOverlap(s1: String, s2: String) =
s1 ++ s2.drop(overlap(s1, s2))
And some tests of this implementation:
scala> mergeOverlap("", "")
res0: String = ""
scala> mergeOverlap("abc", "")
res1: String = abc
scala> mergeOverlap("", "abc")
res2: String = abc
scala> mergeOverlap("xyz", "abc")
res3: String = xyzabc
scala> mergeOverlap("xab", "abc")
res4: String = xabc
scala> mergeOverlap("aabaaab", "aab")
res5: String = aabaaab
scala> mergeOverlap("aabaaab", "aabc")
res6: String = aabaaabc
scala> mergeOverlap("aabaaab", "bc")
res7: String = aabaaabc
scala> mergeOverlap("aabaaab", "bbc")
res8: String = aabaaabbc
scala> mergeOverlap("ababab", "ababc")
res9: String = abababc
scala> mergeOverlap("ababab", "babc")
res10: String = abababc
scala> mergeOverlap("abab", "aab")
res11: String = ababaab
It's not tail recursive but it is a very simple algorithm.
def mergeOverlap(s1: String, s2: String): String =
if (s2 startsWith s1) s2
else s1.head +: mergeOverlap(s1.tail, s2)
Using scala, how to print string in even and odd indices of a given string? I am aware of the imperative approach using var. I am looking for an approach that uses immutability, avoids side-effects (of course, until need to print result) and concise.
Here is a tail-recursive solution returning even and odd chars (List[Char], List[Char]) in one go
def f(in: String): (List[Char], List[Char]) = {
#tailrec def run(s: String, idx: Int, accEven: List[Char], accOdd: List[Char]): (List[Char], List[Char]) = {
if (idx < 0) (accEven, accOdd)
else if (idx % 2 == 0) run(s, idx - 1, s.charAt(idx) :: accEven, accOdd)
else run(s, idx - 1, accEven, s.charAt(idx) :: accOdd)
}
run(in, in.length - 1, Nil, Nil)
}
which could be printed like so
val (even, odd) = f("abcdefg")
println(even.mkString)
Another way to explore is using zipWithIndex
def printer(evenOdd: Int) {
val str = "1234"
str.zipWithIndex.foreach { i =>
i._2 % 2 match {
case x if x == evenOdd => print(i._1)
case _ =>
}
}
}
In this case you can check the results by using the printer function
scala> printer(1)
24
scala> printer(0)
13
.zipWithIndex takes a List and returns tuples of the elements coupled with their index. Knowing that a String is a list of Char
Looking at str
scala> val str = "1234"
str: String = 1234
str.zipWithIndex
res: scala.collection.immutable.IndexedSeq[(Char, Int)] = Vector((1,0), (2,1), (3,2), (4,3))
Lastly, as you only need to print, using foreach instead of map is more ideal as you aren't expecting values to be returned
You can use the sliding function, which is quite simple:
scala> "abcdefgh".sliding(1,2).mkString("")
res16: String = aceg
scala> "abcdefgh".tail.sliding(1,2).mkString("")
res17: String = bdfh
val s = "abcd"
// ac
(0 until s.length by 2).map(i => s(i))
// bd
(1 until s.length by 2).map(i => s(i))
just pure functions with map operator
I am coding in scala-spark and trying to segregate all strings and column datatypes .
I am getting the output for columns(2)_2 albeit with a warning but when i apply the same thing in the if statement i get an error . Any idea why. This part got solved by adding columns(2)._2 : David Griffin
var df = some dataframe
var columns = df.dtypes
var colnames = df.columns.size
var stringColumns:Array[(String,String)] = null;
var doubleColumns:Array[(String,String)] = null;
var otherColumns:Array [(String,String)] = null;
columns(2)._2
columns(2)._1
for (x<-1 to colnames)
{
if (columns(x)._2 == "StringType")
{stringColumns = stringColumns ++ Seq((columns(x)))}
if (columns(x)._2 == "DoubleType")
{doubleColumns = doubleColumns ++ Seq((columns(x)))}
else
{otherColumns = otherColumns ++ Seq((columns(x)))}
}
Previous Output:
stringColumns: Array[(String, String)] = null
doubleColumns: Array[(String, String)] = null
otherColumns: Array[(String, String)] = null
res158: String = DoubleType
<console>:127: error: type mismatch;
found : (String, String)
required: scala.collection.GenTraversableOnce[?]
{stringColumns = stringColumns ++ columns(x)}
Current Output:
stringColumns: Array[(String, String)] = null
doubleColumns: Array[(String, String)] = null
otherColumns: Array[(String, String)] = null
res382: String = DoubleType
res383: String = CVB
java.lang.NullPointerException
^
I believe you are missing a .. Change this:
columns(2)_2
to
columns(2)._2
If nothing else, it will get rid of the warning.
And then, you need to do:
++ Seq(columns(x))
Here's a cleaner example:
scala> val arr = Array[(String,String)]()
arr: Array[(String, String)] = Array()
scala> arr ++ (("foo", "bar"))
<console>:9: error: type mismatch;
found : (String, String)
required: scala.collection.GenTraversableOnce[?]
arr ++ (("foo", "bar"))
scala> arr ++ Seq(("foo", "bar"))
res2: Array[(String, String)] = Array((foo,bar))
This is the answer modified from David Griffins answer so please up-vote him too . Just altered ++ to +:=
var columns = df.dtypes
var colnames = df.columns.size
var stringColumns= Array[(String,String)]();
var doubleColumns= Array[(String,String)]();
var otherColumns= Array[(String,String)]();
for (x<-0 to colnames-1)
{
if (columns(x)._2 == "StringType"){
stringColumns +:= columns(x)
}else if (columns(x)._2 == "DoubleType") {
doubleColumns +:= columns(x)
}else {
otherColumns +:= columns(x)
}
}
println(stringColumns)
println(doubleColumns)
println(otherColumns)
I'm searching for idiomatic scala way to format string with named arguments. I'm aware of the String method format, but it does not allow to specify named arguments, only positional are available.
Simple example:
val bob = "Bob"
val alice = "Alice"
val message = "${b} < ${a} and ${a} > ${b}"
message.format(a = alice, b = bob)
Defining message as separate value is crucial, since I want to load it from resource file and not specify in the code directly. There are plenty of similar questions that was answered with the new scala feature called String Interpolation. But this does not address my case: I could not let the compiler do all the work, since resource file is loaded in the run time.
It's not clear whether by "positional args" you mean the usual sense or the sense of "argument index" as used by Formatter.
scala> val bob = "Bob"
bob: String = Bob
scala> val alice = "Alice"
alice: String = Alice
scala> val message = "%2$s likes %1$s and %1$s likes %2$s" format (bob, alice)
message: String = Alice likes Bob and Bob likes Alice
You'd like:
scala> def f(a: String, b: String) = s"$b likes $a and $a likes $b"
f: (a: String, b: String)String
scala> f(b = bob, a = alice)
res2: String = Bob likes Alice and Alice likes Bob
You can compile the interpolation with a reflective toolbox or the scripting engine.
scala> import scala.tools.reflect._
import scala.tools.reflect._
scala> val tb = reflect.runtime.currentMirror.mkToolBox()
tb: scala.tools.reflect.ToolBox[reflect.runtime.universe.type] = scala.tools.reflect.ToolBoxFactory$ToolBoxImpl#162b3d47
scala> val defs = s"""val a = "$alice" ; val b = "$bob""""
defs: String = val a = "Alice" ; val b = "Bob"
scala> val message = "${b} < ${a} and ${a} > ${b}"
message: String = ${b} < ${a} and ${a} > ${b}
scala> val msg = s"""s"$message""""
msg: String = s"${b} < ${a} and ${a} > ${b}"
scala> tb eval (tb parse s"$defs ; $msg")
res3: Any = Bob < Alice and Alice > Bob
or
scala> def f(a: String, b: String) = tb eval (tb parse s"""val a = "$a"; val b = "$b"; s"$message"""")
f: (a: String, b: String)Any
scala> f(a = alice, b = bob)
res4: Any = Bob < Alice and Alice > Bob
It's a little overmuch.
But consider:
https://www.playframework.com/documentation/2.0/ScalaTemplates
I don't know whether it's possible to use the same parser the compiler uses.
But Regex can handle the example you've given.
val values = Map("a" -> "Alice", "b" -> "Bob")
val message = "${b} < ${a} and ${a} > ${b}"
def escapeReplacement(s: String): String =
s.replace("\\", "\\\\").replace("$", "\\$")
"\\$\\{([^\\}]*)\\}".r.replaceAllIn(message,
m => escapeReplacement(values(m.group(1))))
One approach might be to have a list of substitutions (no limit on how many variables) and then go with something like:
def substituteVar(s:String,subs:Seq[(String,String)]):String=
subs.foldLeft(s)((soFar,item) => soFar replaceAll("\\$\\{"+item._1+"}", item._2))
substituteVar(message,Seq(("a",alice),("b",bob)))
res12: String = Bob < Alice and Alice > Bob
Could be extended as well...
I tried to use readInt() to read two integers from the same line but that is not how it works.
val x = readInt()
val y = readInt()
With an input of 1 727 I get the following exception at runtime:
Exception in thread "main" java.lang.NumberFormatException: For input string: "1 727"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:492)
at java.lang.Integer.parseInt(Integer.java:527)
at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:231)
at scala.collection.immutable.StringOps.toInt(StringOps.scala:31)
at scala.Console$.readInt(Console.scala:356)
at scala.Predef$.readInt(Predef.scala:201)
at Main$$anonfun$main$1.apply$mcVI$sp(Main.scala:11)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:75)
at Main$.main(Main.scala:10)
at Main.main(Main.scala)
I got the program to work by using readf but it seems pretty awkward and ugly to me:
val (x,y) = readf2("{0,number} {1,number}")
val a = x.asInstanceOf[Int]
val b = y.asInstanceOf[Int]
println(function(a,b))
Someone suggested that I just use Java's Scanner class, (Scanner.nextInt()) but is there a nice idiomatic way to do it in Scala?
Edit:
My solution following paradigmatic's example:
val Array(a,b) = readLine().split(" ").map(_.toInt)
Followup Question: If there were a mix of types in the String how would you extract it? (Say a word, an int and a percentage as a Double)
If you mean how would you convert val s = "Hello 69 13.5%" into a (String, Int, Double) then the most obvious way is
val tokens = s.split(" ")
(tokens(0).toString,
tokens(1).toInt,
tokens(2).init.toDouble / 100)
// (java.lang.String, Int, Double) = (Hello,69,0.135)
Or as mentioned you could match using a regex:
val R = """(.*) (\d+) (\d*\.?\d*)%""".r
s match {
case R(str, int, dbl) => (str, int.toInt, dbl.toDouble / 100)
}
If you don't actually know what data is going to be in the String, then there probably isn't much reason to convert it from a String to the type it represents, since how can you use something that might be a String and might be in Int? Still, you could do something like this:
val int = """(\d+)""".r
val pct = """(\d*\.?\d*)%""".r
val res = s.split(" ").map {
case int(x) => x.toInt
case pct(x) => x.toDouble / 100
case str => str
} // Array[Any] = Array(Hello, 69, 0.135)
now to do anything useful you'll need to match on your values by type:
res.map {
case x: Int => println("It's an Int!")
case x: Double => println("It's a Double!")
case x: String => println("It's a String!")
case _ => println("It's a Fail!")
}
Or if you wanted to take things a bit further, you could define some extractors which will do the conversion for you:
abstract class StringExtractor[A] {
def conversion(s: String): A
def unapply(s: String): Option[A] = try { Some(conversion(s)) }
catch { case _ => None }
}
val intEx = new StringExtractor[Int] {
def conversion(s: String) = s.toInt
}
val pctEx = new StringExtractor[Double] {
val pct = """(\d*\.?\d*)%""".r
def conversion(s: String) = s match { case pct(x) => x.toDouble / 100 }
}
and use:
"Hello 69 13.5%".split(" ").map {
case intEx(x) => println(x + " is Int: " + x.isInstanceOf[Int])
case pctEx(x) => println(x + " is Double: " + x.isInstanceOf[Double])
case str => println(str)
}
prints
Hello
69 is Int: true
0.135 is Double: true
Of course, you can make the extrators match on anything you want (currency mnemonic, name begging with 'J', URL) and return whatever type you want. You're not limited to matching Strings either, if instead of StringExtractor[A] you make it Extractor[A, B].
You can read the line as a whole, split it using spaces and then convert each element (or the one you want) to ints:
scala> "1 727".split(" ").map( _.toInt )
res1: Array[Int] = Array(1, 727)
For most complex inputs, you can have a look at parser combinators.
The input you are describing is not two Ints but a String which just happens to be two Ints. Hence you need to read the String, split by the space and convert the individual Strings into Ints as suggested by #paradigmatic.
One way would be splitting and mapping:
// Assuming whatever is being read is assigned to "input"
val input = "1 727"
val Array(x, y) = input split " " map (_.toInt)
Or, if you have things a bit more complicated than that, a regular expression is usually good enough.
val twoInts = """^\s*(\d+)\s*(\d+)""".r
val Some((x, y)) = for (twoInts(a, b) <- twoInts findFirstIn input) yield (a, b)
There are other ways to use regex. See the Scala API docs about them.
Anyway, if regex patterns are becoming too complicated, then you should appeal to Scala Parser Combinators. Since you can combine both, you don't loose any of regex's power.
import scala.util.parsing.combinator._
object MyParser extends JavaTokenParsers {
def twoInts = wholeNumber ~ wholeNumber ^^ { case a ~ b => (a.toInt, b.toInt) }
}
val MyParser.Success((x, y), _) = MyParser.parse(MyParser.twoInts, input)
The first example was more simple, but harder to adapt to more complex patterns, and more vulnerable to invalid input.
I find that extractors provide some machinery that makes this type of processing nicer. And I think it works up to a certain point nicely.
object Tokens {
def unapplySeq(line: String): Option[Seq[String]] =
Some(line.split("\\s+").toSeq)
}
class RegexToken[T](pattern: String, convert: (String) => T) {
val pat = pattern.r
def unapply(token: String): Option[T] = token match {
case pat(s) => Some(convert(s))
case _ => None
}
}
object IntInput extends RegexToken[Int]("^([0-9]+)$", _.toInt)
object Word extends RegexToken[String]("^([A-Za-z]+)$", identity)
object Percent extends RegexToken[Double](
"""^([0-9]+\.?[0-9]*)%$""", _.toDouble / 100)
Now how to use:
List("1 727", "uptime 365 99.999%") collect {
case Tokens(IntInput(x), IntInput(y)) => "sum " + (x + y)
case Tokens(Word(w), IntInput(i), Percent(p)) => w + " " + (i * p)
}
// List[java.lang.String] = List(sum 728, uptime 364.99634999999995)
To use for reading lines at the console:
Iterator.continually(readLine("prompt> ")).collect{
case Tokens(IntInput(x), IntInput(y)) => "sum " + (x + y)
case Tokens(Word(w), IntInput(i), Percent(p)) => w + " " + (i * p)
case Tokens(Word("done")) => "done"
}.takeWhile(_ != "done").foreach(println)
// type any input and enter, type "done" and enter to finish
The nice thing about extractors and pattern matching is that you can add case clauses as necessary, you can use Tokens(a, b, _*) to ignore some tokens. I think they combine together nicely (for instance with literals as I did with done).