I wrote the following Scala code below to handle a String that I pass in, format the String, append it to a StringBuilder and return the formatted String with escaped unicode back to my caller for other processing.
The Scala compiler complains of the following on the lines where there is a String.format call with the following error:
overloaded method value format with alternatives: (x$1; java.util.Locale; x$2: String, X$3: Object*) (x$1:String,x$2: Object*) String cannot be applied to (*String, Int)
class TestClass {
private def escapeUnicodeStuff(input: String): String = {
//type StringBuilder = scala.collection.mutable.StringBuilder
val sb = new StringBuilder()
val cPtArray = toCodePointArray(input) //this method call returns an Array[Int]
val len = cPtArray.length
for (i <- 0 until len) {
if (cPtArray(i) > 65535) {
val hi = (cPtArray(i) - 0x10000) / 0x400 + 0xD800
val lo = (cPtArray(i) - 0x10000) % 0x400 + 0xDC00
sb.append(String.format("\\u%04x\\u%04x", hi, lo)) //**complains here**
} else if (codePointArray(i) > 127) {
sb.append(String.format("\\u%04x", codePointArray(i))) //**complains here**
} else {
sb.append(String.format("%c", codePointArray(i))) //**complains here**
}
}
sb.toString
}
}
How do I address this problem? How can I clean up the code to accomplish my purpose of formatting a String? Thanks in advance to the Scala experts here
The String.format method in Java expects Objects as its arguments. The Object type in Java is equivalent to the AnyRef type in Scala. The primitive types in Scala extend AnyVal – not AnyRef. Read more about the differences between AnyVal, AnyRef, and Any in the docs or in this answer. The most obvious fix is to use the Integer wrapper class from Java to get an Object representation of your Ints:
String.format("\\u%04x\\u%04x", new Integer(hi), new Integer(lo))
Using those wrapper classes is almost emblematic of unidiomatic Scala code, and should only be used for interoperability with Java when there is no better option. The more natural way to do this in Scala would be to either use the StringOps equivalent method format:
"\\u%04x\\u%04x".format(hi, lo)
You can also use the f interpolator for a more concise syntax:
f"\\u$hi%04x\\u$lo%04x"
Also, using a for loop like you have here is unidiomatic in Scala. You're better off using one of the functional list methods like map, foldLeft, or even foreach together with a partial function using the match syntax. For example, you might try something like:
toCodePointArray(input).foreach {
case x if x > 65535 =>
val hi = (x - 0x10000) / 0x400 + 0xD800
val lo = (x - 0x10000) % 0x400 + 0xDC00
sb.append(f"\\u$hi%04x\\u$lo%04x")
case x if > 127 => sb.append(f"\\u$x%04x")
case x => sb.append(f"$x%c")
}
Or, if you don't have to use StringBuilder, which really only needs to be used in cases where you are appending many strings, you can replace your whole method body with foldLeft:
def escapeUnicodeStuff(input: String) = toCodePointArray(input).foldLeft("") {
case (acc, x) if x > 65535 =>
val hi = (x - 0x10000) / 0x400 + 0xD800
val lo = (x - 0x10000) % 0x400 + 0xDC00
acc + f"\\u$hi%04x\\u$lo%04x"
case (acc, x) if x > 127 => acc + f"\\u$x%04x"
case (acc, x) => acc + f"$x%c"
}
Or a even map followed by a mkString:
def escapeUnicodeStuff(input: String) = toCodePointArray(input).map {
case x if x > 65535 =>
val hi = (x - 0x10000) / 0x400 + 0xD800
val lo = (x - 0x10000) % 0x400 + 0xDC00
f"\\u$hi%04x\\u$lo%04x"
case x if x > 127 => f"\\u$x%04x"
case x => f"$x%c"
}.mkString
I couldn't figure out what exactly is causing that "overload clash" but notice the code below:
scala> "\\u%04x\\u%04x".format(10,20)
res12: String = \u000a\u0014
Using the one provided by StringOps works.
Related
Using scala, how to print string in even and odd indices of a given string? I am aware of the imperative approach using var. I am looking for an approach that uses immutability, avoids side-effects (of course, until need to print result) and concise.
Here is a tail-recursive solution returning even and odd chars (List[Char], List[Char]) in one go
def f(in: String): (List[Char], List[Char]) = {
#tailrec def run(s: String, idx: Int, accEven: List[Char], accOdd: List[Char]): (List[Char], List[Char]) = {
if (idx < 0) (accEven, accOdd)
else if (idx % 2 == 0) run(s, idx - 1, s.charAt(idx) :: accEven, accOdd)
else run(s, idx - 1, accEven, s.charAt(idx) :: accOdd)
}
run(in, in.length - 1, Nil, Nil)
}
which could be printed like so
val (even, odd) = f("abcdefg")
println(even.mkString)
Another way to explore is using zipWithIndex
def printer(evenOdd: Int) {
val str = "1234"
str.zipWithIndex.foreach { i =>
i._2 % 2 match {
case x if x == evenOdd => print(i._1)
case _ =>
}
}
}
In this case you can check the results by using the printer function
scala> printer(1)
24
scala> printer(0)
13
.zipWithIndex takes a List and returns tuples of the elements coupled with their index. Knowing that a String is a list of Char
Looking at str
scala> val str = "1234"
str: String = 1234
str.zipWithIndex
res: scala.collection.immutable.IndexedSeq[(Char, Int)] = Vector((1,0), (2,1), (3,2), (4,3))
Lastly, as you only need to print, using foreach instead of map is more ideal as you aren't expecting values to be returned
You can use the sliding function, which is quite simple:
scala> "abcdefgh".sliding(1,2).mkString("")
res16: String = aceg
scala> "abcdefgh".tail.sliding(1,2).mkString("")
res17: String = bdfh
val s = "abcd"
// ac
(0 until s.length by 2).map(i => s(i))
// bd
(1 until s.length by 2).map(i => s(i))
just pure functions with map operator
I am currently working on a small code that should allow to tell if a given substring is within a string. I checked all the other similar questions but everybody is using predefined functions. I need to build it from scratch… could you please tell me what I did wrong?
def substring(s: String, t: String): Boolean ={
var i = 0 // position on substring
var j = 0 // position on string
var result = false
var isSim = true
var n = s.length // small string size
var m = t.length // BIG string size
// m must always be bigger than n + j
while (m>n+j && isSim == true){
// j grows with i
// stopping the loop at i<n
while (i<n && isSim == true){
// if characters are similar
if (s(i)==t(j)){
// add 1 to i. So j will increase by one as well
// this will run the loop looking for similarities. If not, exit the loop.
i += 1
j = i+1
// exciting the loop if no similarity is found
}
// answer given if no similarity is found
isSim = false
}
}
// printing the output
isSim
}
substring("moth", "ramathaaaaaaa")
The problem consists of two subproblems of same kind. You have to check whether
there exists a start index j such that
for all i <- 0 until n it holds that substring(i) == string(j + i)
Whenever you have to check whether some predicate holds for some / for all elements of a sequence, it can be quite handy if you can short-circuit and exit early by using the return keyword. Therefore, I'd suggest to eliminate all variables and while-loops, and use a nested method instead:
def substring(s: String, t: String): Boolean ={
val n = s.length // small string size
val m = t.length // BIG string size
def substringStartingAt(startIndex: Int): Boolean = {
for (i <- 0 until n) {
if (s(i) != t(startIndex + i)) return false
}
true
}
for (possibleStartIndex <- 0 to m - n) {
if (substringStartingAt(possibleStartIndex)) return true
}
false
}
The inner method checks whether all s(j + i) == t(i) for a given j. The outer for-loop checks whether there exists a suitable offset j.
Example:
for (
(sub, str) <- List(
("moth", "ramathaaaaaaa"),
("moth", "ramothaaaaaaa"),
("moth", "mothraaaaaaaa"),
("moth", "raaaaaaaamoth"),
("moth", "mmoth"),
("moth", "moth"),
)
) {
println(sub + " " + " " + str + ": " + substring(sub, str))
}
output:
moth ramathaaaaaaa: false
moth ramothaaaaaaa: true
moth mothraaaaaaaa: true
moth raaaaaaaamoth: true
moth mmoth: true
moth moth: true
If you were allowed to use built-in methods, you could of course also write
def substring(s: String, t: String): Boolean = {
val n = s.size
val m = t.size
(0 to m-n).exists(j => (0 until n).forall(i => s(i) == t(j + i)))
}
I offer the following slightly more idiomatic Scala code, not because I think it will perform better than Andrey's code--I don't--but simply because it uses recursion and is, perhaps, slightly easier to read:
/**
* Method to determine if "sub" is a substring of "string".
*
* #param sub the candidate substring.
* #param string the full string.
* #return true if "sub" is a substring of "string".
*/
def substring(sub: String, string: String): Boolean = {
val p = sub.toList
/**
* Tail-recursive method to determine if "p" is a subsequence of "s"
*
* #param s the super-sequence to be tested (part of the original "string").
* #return as follows:
* (1) "p" longer than "s" => false;
* (2) "p" elements match the corresponding "s" elements (starting at the start of "s") => true
* (3) recursively invoke substring on "p" and the tail of "s".
*/
#tailrec def substring(s: Seq[Char]): Boolean = p.length <= s.length && (
s.startsWith(p) || (
s match {
case Nil => false
case _ :: z => substring(z)
}
)
)
p.isEmpty || substring(string.toList)
}
If you object to using the built-in method startsWith then we could use something like:
(p zip s forall (t => t._1==t._2))
But we have to draw the line somewhere between creating everything from scratch and using built-in functions.
What would be an idiomatic way to split a string into strings of 2 characters each?
Examples:
"" -> [""]
"ab" -> ["ab"]
"abcd" -> ["ab", "cd"]
We can assume that the string has a length which is a multiple of 2.
I could use a regex like in this Java answer but I was hoping to find a better way (i.e. using one of kotlin's additional methods).
Once Kotlin 1.2 is released, you can use the chunked function that is added to kotlin-stdlib by the KEEP-11 proposal. Example:
val chunked = myString.chunked(2)
You can already try this with Kotlin 1.2 M2 pre-release.
Until then, you can implement the same with this code:
fun String.chunked(size: Int): List<String> {
val nChunks = length / size
return (0 until nChunks).map { substring(it * size, (it + 1) * size) }
}
println("abcdef".chunked(2)) // [ab, cd, ef]
This implementation drops the remainder that is less than size elements. You can modify it do add the remainder to the result as well.
A functional version of chunked using generateSequence:
fun String.split(n: Int) = Pair(this.drop(n), this.take(n))
fun String.chunked(n: Int): Sequence<String> =
generateSequence(this.split(n), {
when {
it.first.isEmpty() -> null
else -> it.first.split(n)
}
})
.map(Pair<*, String>::second)
Output:
"".chunked(2) => []
"ab".chunked(2) => [ab]
"abcd".chunked(2) => [ab, cd]
"abc".chunked(2) => [ab, c]
How can I convert a string in Scala into a corresponding operator?
Given two integers and the string "+" I want the result of adding these two integers.
The last question is very simple:
def applyOperator(x: Int, y: Int, operator: String) = operator match {
case "+" => x + y
case "-" => x - y
...
}
You could try using Twitter's Eval library or reflection, but I wouldn't recommend it given the simpler solution.
For the first question: operators themselves aren't values, so you can't "convert a string into an operator". But you can come close: convert a string into a function which will add (or subtract, etc.) its arguments:
def stringToOperator(operator: String): (Int, Int) => Int = operator match {
case "+" => _ + _
case "-" => _ - _
...
}
You can even generalize it a bit to work not just on integers:
def stringToOperator[A: Numeric](operator: String): (A, A) => A = operator match { ... }
(This also applies to the first answer in the obvious way.)
This one
case class Evaluatee(v1: Int, operator: String, v2: Int)
object Evaluator {
def raw(s: String)(v1: Int, v2: Int) = s match {
case "+" => (v1 + v2)
case "-" => (v1 - v2)
case "*" => (v1 * v2)
case "/" => (v1 / v2)
}
def evaluate(evaluatee: Evaluatee) =
raw(evaluatee.operator)(evaluatee.v1, evaluatee.v2)
}
accomplishes this tests:
test("1+1=2"){
assert(Evaluator.evaluate(Evaluatee(1, "+", 1)) == 2)
}
test("2-1=1"){
assert(Evaluator.evaluate(Evaluatee(2, "-", 1)) == 1)
}
test("1+1=2 raw"){
assert(Evaluator.raw("+")(1,1) == 2)
}
We cannot just do something like 1 "+" 2 because I think the biggest feature of scala to can make an own DSL is the apply method but I can not just calling it with nothing, I'm pretty sure we always need to use () or {} for example List(1) we can't do List 1 but we can List{1}.
But try this maybe could work for you
case class NumOp (num1:Int){
def apply(op:String)(num2:Int):Int = {
op match {
case "+" => num1+num2
case "-" => num1-num2
case _ => 0
}
}
}
object ConvertsNumOp{
implicit def convert(a:Int):NumOp= NumOp(a)
}
import ConvertsNumOp._
scala> 2 ("-") (1)
res0: Int = 1
scala> 4 ("-") (2)
res1: Int = 2
scala> 4 ("+") (2)
res2: Int = 6
scala> 4 ("-") (2) ("+") (1) ("-") (8)
res0: Int = -5
You can do things dynamically so maybe could works.
EDITED:
Here is another version of NumOp maybe cleanest
case class NumOp(num1:Int) {
def apply(op:String):Int => Int = {
op match {
case "+" => num1.+_
case "-" => num1.-_
case _ => throw new NotImplementedError("Operator not implemented")
}
}
}
Using it dynamically
val numList = List(1,2,3,4,5,6,7,8,9,10);
val optList = List("+","-");
var retVal = for{ a <- numList; op <- optList }
yield (a)(op)(a)
I tried to use readInt() to read two integers from the same line but that is not how it works.
val x = readInt()
val y = readInt()
With an input of 1 727 I get the following exception at runtime:
Exception in thread "main" java.lang.NumberFormatException: For input string: "1 727"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:492)
at java.lang.Integer.parseInt(Integer.java:527)
at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:231)
at scala.collection.immutable.StringOps.toInt(StringOps.scala:31)
at scala.Console$.readInt(Console.scala:356)
at scala.Predef$.readInt(Predef.scala:201)
at Main$$anonfun$main$1.apply$mcVI$sp(Main.scala:11)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:75)
at Main$.main(Main.scala:10)
at Main.main(Main.scala)
I got the program to work by using readf but it seems pretty awkward and ugly to me:
val (x,y) = readf2("{0,number} {1,number}")
val a = x.asInstanceOf[Int]
val b = y.asInstanceOf[Int]
println(function(a,b))
Someone suggested that I just use Java's Scanner class, (Scanner.nextInt()) but is there a nice idiomatic way to do it in Scala?
Edit:
My solution following paradigmatic's example:
val Array(a,b) = readLine().split(" ").map(_.toInt)
Followup Question: If there were a mix of types in the String how would you extract it? (Say a word, an int and a percentage as a Double)
If you mean how would you convert val s = "Hello 69 13.5%" into a (String, Int, Double) then the most obvious way is
val tokens = s.split(" ")
(tokens(0).toString,
tokens(1).toInt,
tokens(2).init.toDouble / 100)
// (java.lang.String, Int, Double) = (Hello,69,0.135)
Or as mentioned you could match using a regex:
val R = """(.*) (\d+) (\d*\.?\d*)%""".r
s match {
case R(str, int, dbl) => (str, int.toInt, dbl.toDouble / 100)
}
If you don't actually know what data is going to be in the String, then there probably isn't much reason to convert it from a String to the type it represents, since how can you use something that might be a String and might be in Int? Still, you could do something like this:
val int = """(\d+)""".r
val pct = """(\d*\.?\d*)%""".r
val res = s.split(" ").map {
case int(x) => x.toInt
case pct(x) => x.toDouble / 100
case str => str
} // Array[Any] = Array(Hello, 69, 0.135)
now to do anything useful you'll need to match on your values by type:
res.map {
case x: Int => println("It's an Int!")
case x: Double => println("It's a Double!")
case x: String => println("It's a String!")
case _ => println("It's a Fail!")
}
Or if you wanted to take things a bit further, you could define some extractors which will do the conversion for you:
abstract class StringExtractor[A] {
def conversion(s: String): A
def unapply(s: String): Option[A] = try { Some(conversion(s)) }
catch { case _ => None }
}
val intEx = new StringExtractor[Int] {
def conversion(s: String) = s.toInt
}
val pctEx = new StringExtractor[Double] {
val pct = """(\d*\.?\d*)%""".r
def conversion(s: String) = s match { case pct(x) => x.toDouble / 100 }
}
and use:
"Hello 69 13.5%".split(" ").map {
case intEx(x) => println(x + " is Int: " + x.isInstanceOf[Int])
case pctEx(x) => println(x + " is Double: " + x.isInstanceOf[Double])
case str => println(str)
}
prints
Hello
69 is Int: true
0.135 is Double: true
Of course, you can make the extrators match on anything you want (currency mnemonic, name begging with 'J', URL) and return whatever type you want. You're not limited to matching Strings either, if instead of StringExtractor[A] you make it Extractor[A, B].
You can read the line as a whole, split it using spaces and then convert each element (or the one you want) to ints:
scala> "1 727".split(" ").map( _.toInt )
res1: Array[Int] = Array(1, 727)
For most complex inputs, you can have a look at parser combinators.
The input you are describing is not two Ints but a String which just happens to be two Ints. Hence you need to read the String, split by the space and convert the individual Strings into Ints as suggested by #paradigmatic.
One way would be splitting and mapping:
// Assuming whatever is being read is assigned to "input"
val input = "1 727"
val Array(x, y) = input split " " map (_.toInt)
Or, if you have things a bit more complicated than that, a regular expression is usually good enough.
val twoInts = """^\s*(\d+)\s*(\d+)""".r
val Some((x, y)) = for (twoInts(a, b) <- twoInts findFirstIn input) yield (a, b)
There are other ways to use regex. See the Scala API docs about them.
Anyway, if regex patterns are becoming too complicated, then you should appeal to Scala Parser Combinators. Since you can combine both, you don't loose any of regex's power.
import scala.util.parsing.combinator._
object MyParser extends JavaTokenParsers {
def twoInts = wholeNumber ~ wholeNumber ^^ { case a ~ b => (a.toInt, b.toInt) }
}
val MyParser.Success((x, y), _) = MyParser.parse(MyParser.twoInts, input)
The first example was more simple, but harder to adapt to more complex patterns, and more vulnerable to invalid input.
I find that extractors provide some machinery that makes this type of processing nicer. And I think it works up to a certain point nicely.
object Tokens {
def unapplySeq(line: String): Option[Seq[String]] =
Some(line.split("\\s+").toSeq)
}
class RegexToken[T](pattern: String, convert: (String) => T) {
val pat = pattern.r
def unapply(token: String): Option[T] = token match {
case pat(s) => Some(convert(s))
case _ => None
}
}
object IntInput extends RegexToken[Int]("^([0-9]+)$", _.toInt)
object Word extends RegexToken[String]("^([A-Za-z]+)$", identity)
object Percent extends RegexToken[Double](
"""^([0-9]+\.?[0-9]*)%$""", _.toDouble / 100)
Now how to use:
List("1 727", "uptime 365 99.999%") collect {
case Tokens(IntInput(x), IntInput(y)) => "sum " + (x + y)
case Tokens(Word(w), IntInput(i), Percent(p)) => w + " " + (i * p)
}
// List[java.lang.String] = List(sum 728, uptime 364.99634999999995)
To use for reading lines at the console:
Iterator.continually(readLine("prompt> ")).collect{
case Tokens(IntInput(x), IntInput(y)) => "sum " + (x + y)
case Tokens(Word(w), IntInput(i), Percent(p)) => w + " " + (i * p)
case Tokens(Word("done")) => "done"
}.takeWhile(_ != "done").foreach(println)
// type any input and enter, type "done" and enter to finish
The nice thing about extractors and pattern matching is that you can add case clauses as necessary, you can use Tokens(a, b, _*) to ignore some tokens. I think they combine together nicely (for instance with literals as I did with done).