So I was reading up on how Scala lets you treat a String as a sequence of Chars through its implicit mechanism. I created a generic Trie class for a general element type and wanted to use its Char-based implementation with String-like syntax.
import collection.mutable
import scala.annotation.tailrec
case class Trie[Elem, Meta](children: mutable.Map[Elem, Trie[Elem, Meta]], var metadata: Option[Meta] = None) {
def this() = this(mutable.Map.empty)
@tailrec
final def insert(item: Seq[Elem], metadata: Meta): Unit = {
item match {
case Nil =>
this.metadata = Some(metadata)
case x :: xs =>
children.getOrElseUpdate(x, new Trie()).insert(xs, metadata)
}
}
def insert(items: (Seq[Elem], Meta)*): Unit = items.foreach { case (item, meta) => insert(item, meta) }
def find(item: Seq[Elem]): Option[Meta] = {
item match {
case Nil => metadata
case x :: xs => children.get(x).flatMap(_.metadata)
}
}
}
object Trie extends App {
type Dictionary = Trie[Char, String]
val dict = new Dictionary()
dict.insert( "hello", "meaning of hello")
dict.insert("hi", "another word for hello")
dict.insert("bye", "opposite of hello")
println(dict)
}
Weird thing is, it compiles fine but gives error on running:
Exception in thread "main" scala.MatchError: hello (of class scala.collection.immutable.WrappedString)
at Trie.insert(Trie.scala:11)
at Trie$.delayedEndpoint$com$inmobi$data$mleap$Trie$1(Trie.scala:34)
at Trie$delayedInit$body.apply(Trie.scala:30)
at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
at scala.App$class.main(App.scala:76)
at Trie$.main(Trie.scala:30)
at Trie.main(Trie.scala)
It's able to implicitly convert String to WrappedString, but that doesn't match the ::. Any workarounds for this?
You can use startsWith as follows:
val s = "ThisIsAString"
s match {
case x if x.startsWith("T") => 1
case _ => 0
}
Or convert your String to List of chars with toList
scala> val s = "ThisIsAString"
s: String = ThisIsAString
scala> s.toList
res10: List[Char] = List(T, h, i, s, I, s, A, S, t, r, i, n, g)
And then use it like any other List:
s.toList match {
case h::t => whatever
case _ => anotherThing
}
Your insert method declares item to be a Seq, but your pattern match only matches on List. A string can be implicitly converted to a Seq[Char], but it isn't a List. Use a pattern match on Seq instead of List using +:.
@tailrec
final def insert(item: Seq[Elem], metadata: Meta): Unit = {
item match {
case Seq() =>
this.metadata = Some(metadata)
case x +: xs =>
children.getOrElseUpdate(x, new Trie()).insert(xs, metadata)
}
}
The same applies to your find method.
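As a rough sketch of what that could look like for both methods, here is a self-contained variant. The class name SeqTrie is hypothetical (chosen only to avoid clashing with the Trie in the question), and find is written to descend into the child node rather than stop at the first level, which is an assumption about the intended behaviour.

```scala
import scala.annotation.tailrec
import scala.collection.mutable

// Hypothetical stand-in for the question's Trie, matching on Seq instead of List.
final class SeqTrie[Elem, Meta] {
  private val children = mutable.Map.empty[Elem, SeqTrie[Elem, Meta]]
  private var metadata: Option[Meta] = None

  @tailrec
  def insert(item: Seq[Elem], meta: Meta): Unit = item match {
    case Seq()   => metadata = Some(meta)
    case x +: xs => children.getOrElseUpdate(x, new SeqTrie[Elem, Meta]).insert(xs, meta)
  }

  @tailrec
  def find(item: Seq[Elem]): Option[Meta] = item match {
    case Seq() => metadata
    case x +: xs =>
      children.get(x) match {
        case Some(child) => child.find(xs) // keep walking down the trie
        case None        => None           // no such prefix
      }
  }
}
```

Because a String converts implicitly to a Seq[Char], calls such as insert("hello", "meaning of hello") and find("hello") should work without toList.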
Hi, I'm trying to create a Map[String, String] from a text file. The file contains arbitrary lines beginning with ";;;", which my function ignores; the remaining lines are the key -> value pairs, with key and value separated by 2 spaces.
Whenever I run my code I get an error saying the expected type Map[String,String] isn't the required type, even though my conversions seem correct.
def createMap(filename: String): Map[String,String] = {
for (line <- Source.fromFile(filename).getLines) {
if (line.nonEmpty && !line.startsWith(";;;")) {
val string: String = line.toString
val splits: Array[String] = string.split(" ")
splits.map(arr => arr(0) -> arr(1)).toMap
}
}
}
I expect it to return a (String -> String) map, but instead I get a bunch of errors. How would I fix this?
Your if is a statement inside a plain for loop, so the loop evaluates to Unit and nothing is returned. To return a result, turn the loop into a for-comprehension with yield, and use the if as a guard that filters the lines. You can then convert the resulting key/value pairs to a Map.
import scala.io.Source
def createMap(filename: String): Map[String,String] = {
val keyValuePairs = for (line <- Source.fromFile(filename).getLines; if line.nonEmpty && !line.startsWith(";;;")) yield {
val string = line.toString
val splits: Array[String] = string.split(" ")
splits(0) -> splits(1)
}
keyValuePairs.toMap
}
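To try the filtering and splitting logic without a file, the same idea can be sketched over lines that have already been read. The helper name parsePairs is hypothetical, and the separator is assumed to be exactly two spaces, as the question describes.

```scala
// Hypothetical helper: works on any Iterator[String], e.g. Source.fromFile(...).getLines().
def parsePairs(lines: Iterator[String]): Map[String, String] =
  lines
    .filter(line => line.nonEmpty && !line.startsWith(";;;")) // skip blanks and ";;;" comments
    .map(_.split("  ", 2))                                     // split at the first two-space separator only
    .collect { case Array(key, value) => key -> value }        // silently drop lines without a separator
    .toMap
```

Lines that do not contain the separator are dropped here rather than raising an exception; whether that is the right policy depends on the data.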
Okay, so I took a second look. It looks like the file has some corrupt encodings. You can try this as a solution. It worked in my Scala REPL:
import java.nio.charset.CodingErrorAction
import scala.io.{Codec, Source}
def createMap(filename: String): Map[String,String] = {
val decoder = Codec.UTF8.decoder.onMalformedInput(CodingErrorAction.IGNORE)
Source.fromFile(filename)(decoder).getLines()
.filter(line => line.nonEmpty && !line.startsWith(";;;"))
.flatMap(line => {
val arr = line.split("\\s+")
arr match {
case Array(key, value) => Some(key -> value)
case Array(key, values @ _*) => Some(key -> values.mkString(" "))
case _ => None
}
}).toMap
}
How can I convert a string in Scala into a corresponding operator?
Given two integers and the string "+" I want the result of adding these two integers.
The last question is very simple:
def applyOperator(x: Int, y: Int, operator: String) = operator match {
case "+" => x + y
case "-" => x - y
...
}
You could try using Twitter's Eval library or reflection, but I wouldn't recommend it given the simpler solution.
For the first question: operators themselves aren't values, so you can't "convert a string into an operator". But you can come close: convert a string into a function which will add (or subtract, etc.) its arguments:
def stringToOperator(operator: String): (Int, Int) => Int = operator match {
case "+" => _ + _
case "-" => _ - _
...
}
You can even generalize it a bit to work not just on integers:
def stringToOperator[A: Numeric](operator: String): (A, A) => A = operator match { ... }
(This also applies to the first answer in the obvious way.)
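One way the Numeric version might be filled in is sketched below. Returning an Option for unknown operators and the particular set of supported operators are assumptions of this sketch, not part of the answer above.

```scala
// Sketch: looks up a binary function for the operator, for any type with a Numeric instance.
def stringToOperator[A](operator: String)(implicit num: Numeric[A]): Option[(A, A) => A] = {
  val table: Map[String, (A, A) => A] = Map(
    "+" -> ((x: A, y: A) => num.plus(x, y)),
    "-" -> ((x: A, y: A) => num.minus(x, y)),
    "*" -> ((x: A, y: A) => num.times(x, y))
  )
  table.get(operator) // None for operators that are not in the table
}
```

For example, stringToOperator[Int]("+").map(f => f(2, 3)) gives Some(5). Division is left out because Numeric has no division method; Integral or Fractional would be needed for that.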
This one
case class Evaluatee(v1: Int, operator: String, v2: Int)
object Evaluator {
def raw(s: String)(v1: Int, v2: Int) = s match {
case "+" => (v1 + v2)
case "-" => (v1 - v2)
case "*" => (v1 * v2)
case "/" => (v1 / v2)
}
def evaluate(evaluatee: Evaluatee) =
raw(evaluatee.operator)(evaluatee.v1, evaluatee.v2)
}
passes these tests:
test("1+1=2"){
assert(Evaluator.evaluate(Evaluatee(1, "+", 1)) == 2)
}
test("2-1=1"){
assert(Evaluator.evaluate(Evaluatee(2, "-", 1)) == 1)
}
test("1+1=2 raw"){
assert(Evaluator.raw("+")(1,1) == 2)
}
We cannot simply write 1 "+" 2. I think the main Scala feature for building your own DSL is the apply method, but it cannot be invoked with no delimiter at all: as far as I know we always need () or {}. For example, we can't write List 1, but we can write List(1) or List{1}.
Something like this might work for you, though:
case class NumOp (num1:Int){
def apply(op:String)(num2:Int):Int = {
op match {
case "+" => num1+num2
case "-" => num1-num2
case _ => 0
}
}
}
object ConvertsNumOp{
implicit def convert(a:Int):NumOp= NumOp(a)
}
import ConvertsNumOp._
scala> 2 ("-") (1)
res0: Int = 1
scala> 4 ("-") (2)
res1: Int = 2
scala> 4 ("+") (2)
res2: Int = 6
scala> 4 ("-") (2) ("+") (1) ("-") (8)
res0: Int = -5
Since the operator is just a String, you can choose it at runtime, so this might work for you.
EDITED:
Here is another version of NumOp, perhaps cleaner:
case class NumOp(num1:Int) {
def apply(op:String):Int => Int = {
op match {
case "+" => num1 + _
case "-" => num1 - _
case _ => throw new NotImplementedError("Operator not implemented")
}
}
}
Using it dynamically
val numList = List(1,2,3,4,5,6,7,8,9,10);
val optList = List("+","-");
var retVal = for{ a <- numList; op <- optList }
yield (a)(op)(a)
For a normal String I am able to do this:
val str = "asdzxc"
val (first, second) = str.splitAt(3) // splits into ("asd", "zxc")
I'd like to be able to do a similar thing for an Option[String]:
val optionalString: Option[String] = getOptionalString(...)
val (firstOption, secondOption) = ???
So that the types of firstOption and secondOption would be Option[String]. I know I can do this:
optionalString.map(_.splitAt(3))
which returns me an Option[(String, String)] but that is not what I'm looking for. Any ideas?
You can either use a fold:
optionalString.map(_.splitAt(3)).fold((None: Option[String], None: Option[String])) { case (a, b) => (Some(a), Some(b)) }
Or pattern matching (possibly a bit clearer since it doesn't need the explicit type annotations):
optionalString.map(_.splitAt(3)) match {
case None => (None,None)
case Some((a,b)) => (Some(a), Some(b))
}
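A third possibility, sketched here with a hypothetical helper name splitOption, is to map over the option twice and project out each half of the pair:

```scala
// Sketch: splits the string inside the Option and returns each half as its own Option.
def splitOption(s: Option[String], n: Int): (Option[String], Option[String]) = {
  val parts = s.map(_.splitAt(n)) // Option[(String, String)]
  (parts.map(_._1), parts.map(_._2))
}
```

It can then be used in the destructuring form from the question: val (firstOption, secondOption) = splitOption(optionalString, 3).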
I am attempting to create a Scala method that will take one parent group of parentheses, represented as a String, and then map each subgroup of parentheses to a different letter. It should then put these in a map which it returns, so basically I call the following method like this:
val s = "((2((x+3)+6)))"
val map = mapParentheses(s)
Where s could contain any number of sets of parentheses, and the Map returned should contain:
"(x+3)" -> 'a'
"(a+6)" -> 'b'
"(2b)" -> 'c'
"(c)" -> 'd'
So that elsewhere in my program I can recall 'd' and get "(c)" which will become "((2b))" then ((2(a+6))) and finally ((2((x+3)+6))). The string sent to the method mapParentheses will never have unmatched parentheses, or extra chars outside of the main parent parentheses, so the following items will never be sent:
"(fsf)a" because the a is outside the parent parentheses
"(a(aa))(a)" because the (a) is outside the parent parentheses
"((a)" because the parentheses are unmatched
")a(" because the parentheses are unmatched
So I was wondering if anyone knew of an easy (or not easy) way of creating this mapParentheses method.
You can do this pretty easily with Scala's parser combinators. First for the import and some simple data structures:
import scala.collection.mutable.Queue
import scala.util.parsing.combinator._
sealed trait Block {
def text: String
}
case class Stuff(text: String) extends Block
case class Paren(m: List[(String, Char)]) extends Block {
val text = m.head._2.toString
def toMap = m.map { case (k, v) => "(" + k + ")" -> v }.toMap
}
I.e., a block represents a substring of the input that is either some non-parenthetical stuff or a parenthetical.
Now for the parser itself:
class ParenParser(fresh: Queue[Char]) extends RegexParsers {
val stuff: Parser[Stuff] = "[^\\(\\)]+".r ^^ (Stuff(_))
def paren: Parser[Paren] = ("(" ~> insides <~ ")") ^^ {
case (s, m) => Paren((s -> fresh.dequeue) :: m)
}
def insides: Parser[(String, List[(String, Char)])] =
rep1(paren | stuff) ^^ { blocks =>
val s = blocks.flatMap(_.text)(collection.breakOut)
val m = blocks.collect {
case Paren(n) => n
}.foldLeft(List.empty[(String, Char)])(_ ++ _)
(s, m)
}
def parse(input: String) = this.parseAll(paren, input).get.toMap
}
Using get in the last line is very much not ideal, but is justified by your assertion that we can expect well-formed input.
Now we can create a new parser and pass in a mutable queue with some fresh variables:
val parser = new ParenParser(Queue('a', 'b', 'c', 'd', 'e', 'f'))
And now try out your test string:
scala> println(parser parse "((2((x+3)+6)))")
Map((c) -> d, (2b) -> c, (a+6) -> b, (x+3) -> a)
As desired. A more interesting exercise (left to the reader) would be to thread some state through the parser to avoid the mutable queue.
Classic recursive parsing problem. It can be handy to hold the different bits. We'll add a few utility methods to help us out later.
trait Part {
def text: String
override def toString = text
}
class Text(val text: String) extends Part {}
class Parens(val contents: Seq[Part]) extends Part {
val text = "(" + contents.mkString + ")"
def mapText(m: Map[Parens, Char]) = {
val inside = contents.collect{
case p: Parens => m(p).toString
case x => x.toString
}
"(" + inside.mkString + ")"
}
override def equals(a: Any) = a match {
case p: Parens => text == p.text
case _ => false
}
override def hashCode = text.hashCode
}
Now you need to parse into these things:
def str2parens(s: String): (Parens, String) = {
def fail = throw new Exception("Wait, you told me the input would be perfect.")
if (s(0) != '(') fail
def parts(s: String, found: Seq[Part] = Vector.empty): (Seq[Part], String) = {
if (s(0)==')') (found,s)
else if (s(0)=='(') {
val (p,s2) = str2parens(s)
parts(s2, found :+ p)
}
else {
val (tx,s2) = s.span(c => c != '(' && c != ')')
parts(s2, found :+ new Text(tx))
}
}
val (inside, more) = parts(s.tail)
if (more(0)!=')') fail
(new Parens(inside), more.tail)
}
Now we've got the whole thing parsed. So let's find all the bits.
def findParens(p: Parens): Set[Parens] = {
val inside = p.contents.collect{ case q: Parens => findParens(q) }
inside.foldLeft(Set(p)){_ | _}
}
Now we can build the map you want.
def mapParentheses(s: String) = {
val (p,_) = str2parens(s)
val pmap = findParens(p).toSeq.sortBy(_.text.length).zipWithIndex.toMap
val p2c = pmap.mapValues(i => ('a'+i).toChar)
p2c.map{ case(p,c) => (p.mapText(p2c), c) }.toMap
}
Evidence that it works:
scala> val s = "((2((x+3)+6)))"
s: java.lang.String = ((2((x+3)+6)))
scala> val map = mapParentheses(s)
map: scala.collection.immutable.Map[java.lang.String,Char] =
Map((x+3) -> a, (a+6) -> b, (2b) -> c, (c) -> d)
I will leave it as an exercise to the reader to figure out how it works, with the hint that recursion is a really powerful way to parse recursive structures.
def parse(s: String,
c: Char = 'a', out: Map[Char, String] = Map() ): Option[Map[Char, String]] =
"""\([^\(\)]*\)""".r.findFirstIn(s) match {
case Some(m) => parse(s.replace(m, c.toString), (c + 1).toChar , out + (c -> m))
case None if s.length == 1 => Some(out)
case _ => None
}
This outputs an Option containing a Map if it parses, which is better than throwing an exception if it doesn't. I suspect you really wanted a map from the Char to the String, so that's what this outputs. c and out are default parameters, so you don't need to supply them yourself. The regex just means "any number of characters that aren't parens, enclosed in parens" (the paren characters need to be escaped with "\"). findFirstIn finds the first match and returns an Option[String], which we can pattern match on, replacing that string with the relevant character.
val s = "((2((x+3)+6)))"
parse(s) //Some(Map(a -> (x+3), b -> (a+6), c -> (2b), d -> (c)))
parse("(a(aa))(a)") //None
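Going the other way, from a letter back to the fully expanded string as the question describes, could be sketched as follows. The function name expand is hypothetical, and the table is written out by hand in the Char -> String shape shown above.

```scala
// Sketch: recursively replaces every mapped letter with its parenthesised group.
// Assumes the original expression does not itself use the substituted letters.
def expand(c: Char, table: Map[Char, String]): String =
  table.get(c) match {
    case Some(group) => group.flatMap(ch => expand(ch, table)) // expand each char of the group
    case None        => c.toString                             // ordinary character, keep as is
  }

val table = Map('a' -> "(x+3)", 'b' -> "(a+6)", 'c' -> "(2b)", 'd' -> "(c)")
val full = expand('d', table) // "((2((x+3)+6)))"
```

The caveat in the comment matters: if the input already contains a letter that is also used as a substitution key, the mapping becomes ambiguous.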
I tried to use readInt() to read two integers from the same line but that is not how it works.
val x = readInt()
val y = readInt()
With an input of 1 727 I get the following exception at runtime:
Exception in thread "main" java.lang.NumberFormatException: For input string: "1 727"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:492)
at java.lang.Integer.parseInt(Integer.java:527)
at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:231)
at scala.collection.immutable.StringOps.toInt(StringOps.scala:31)
at scala.Console$.readInt(Console.scala:356)
at scala.Predef$.readInt(Predef.scala:201)
at Main$$anonfun$main$1.apply$mcVI$sp(Main.scala:11)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:75)
at Main$.main(Main.scala:10)
at Main.main(Main.scala)
I got the program to work by using readf but it seems pretty awkward and ugly to me:
val (x,y) = readf2("{0,number} {1,number}")
val a = x.asInstanceOf[Int]
val b = y.asInstanceOf[Int]
println(function(a,b))
Someone suggested that I just use Java's Scanner class, (Scanner.nextInt()) but is there a nice idiomatic way to do it in Scala?
Edit:
My solution following paradigmatic's example:
val Array(a,b) = readLine().split(" ").map(_.toInt)
Followup Question: If there were a mix of types in the String how would you extract it? (Say a word, an int and a percentage as a Double)
If you mean how would you convert val s = "Hello 69 13.5%" into a (String, Int, Double) then the most obvious way is
val tokens = s.split(" ")
(tokens(0).toString,
tokens(1).toInt,
tokens(2).init.toDouble / 100)
// (java.lang.String, Int, Double) = (Hello,69,0.135)
Or as mentioned you could match using a regex:
val R = """(.*) (\d+) (\d*\.?\d*)%""".r
s match {
case R(str, int, dbl) => (str, int.toInt, dbl.toDouble / 100)
}
If you don't actually know what data is going to be in the String, then there probably isn't much reason to convert it from a String to the type it represents, since how can you use something that might be a String and might be an Int? Still, you could do something like this:
val int = """(\d+)""".r
val pct = """(\d*\.?\d*)%""".r
val res = s.split(" ").map {
case int(x) => x.toInt
case pct(x) => x.toDouble / 100
case str => str
} // Array[Any] = Array(Hello, 69, 0.135)
Now, to do anything useful, you'll need to match on your values by type:
res.map {
case x: Int => println("It's an Int!")
case x: Double => println("It's a Double!")
case x: String => println("It's a String!")
case _ => println("It's a Fail!")
}
Or if you wanted to take things a bit further, you could define some extractors which will do the conversion for you:
abstract class StringExtractor[A] {
def conversion(s: String): A
def unapply(s: String): Option[A] = try { Some(conversion(s)) }
catch { case _: Exception => None }
}
val intEx = new StringExtractor[Int] {
def conversion(s: String) = s.toInt
}
val pctEx = new StringExtractor[Double] {
val pct = """(\d*\.?\d*)%""".r
def conversion(s: String) = s match { case pct(x) => x.toDouble / 100 }
}
and use:
"Hello 69 13.5%".split(" ").map {
case intEx(x) => println(x + " is Int: " + x.isInstanceOf[Int])
case pctEx(x) => println(x + " is Double: " + x.isInstanceOf[Double])
case str => println(str)
}
prints
Hello
69 is Int: true
0.135 is Double: true
Of course, you can make the extractors match on anything you want (currency mnemonic, name beginning with 'J', URL) and return whatever type you want. You're not limited to matching Strings either, if instead of StringExtractor[A] you make it Extractor[A, B].
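A minimal sketch of that generalisation might look like this. The EvenHalf extractor is a made-up example, and scala.util.Try is used here in place of the try/catch purely as a stylistic choice.

```scala
import scala.util.Try

// Sketch: an extractor from any input type A to any result type B.
abstract class Extractor[A, B] {
  def conversion(a: A): B
  // A failed conversion (any non-fatal exception) simply means "no match".
  def unapply(a: A): Option[B] = Try(conversion(a)).toOption
}

// Hypothetical example: matches even Ints and yields half their value.
object EvenHalf extends Extractor[Int, Int] {
  def conversion(n: Int): Int = { require(n % 2 == 0, "odd"); n / 2 }
}

def describe(n: Int): String = n match {
  case EvenHalf(h) => "half is " + h
  case _           => "odd"
}
```

Here describe(10) returns "half is 5" and describe(7) returns "odd".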
You can read the line as a whole, split it using spaces and then convert each element (or the one you want) to ints:
scala> "1 727".split(" ").map( _.toInt )
res1: Array[Int] = Array(1, 727)
For more complex inputs, you can have a look at parser combinators.
The input you are describing is not two Ints but a String which just happens to contain two Ints. Hence you need to read the String, split it on the space and convert the individual Strings into Ints, as suggested by @paradigmatic.
One way would be splitting and mapping:
// Assuming whatever is being read is assigned to "input"
val input = "1 727"
val Array(x, y) = input split " " map (_.toInt)
Or, if you have things a bit more complicated than that, a regular expression is usually good enough.
val twoInts = """^\s*(\d+)\s+(\d+)""".r
val Some((x, y)) = for (twoInts(a, b) <- twoInts findFirstIn input) yield (a.toInt, b.toInt)
There are other ways to use regex. See the Scala API docs about them.
Anyway, if regex patterns are becoming too complicated, then you should turn to Scala Parser Combinators. Since you can combine both, you don't lose any of regex's power.
import scala.util.parsing.combinator._
object MyParser extends JavaTokenParsers {
def twoInts = wholeNumber ~ wholeNumber ^^ { case a ~ b => (a.toInt, b.toInt) }
}
val MyParser.Success((x, y), _) = MyParser.parse(MyParser.twoInts, input)
The first example was simpler, but it is harder to adapt to more complex patterns and more vulnerable to invalid input.
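If invalid input is a concern with the split-and-map approach, one option is to return an Option instead of letting the pattern match or toInt throw. This is only a sketch, and the name parseTwoInts is hypothetical.

```scala
import scala.util.Try

// Sketch: Some((x, y)) only when the line holds exactly two whitespace-separated Ints.
def parseTwoInts(line: String): Option[(Int, Int)] =
  line.trim.split("\\s+") match {
    case Array(a, b) => Try((a.toInt, b.toInt)).toOption // non-numeric tokens give None
    case _           => None                             // wrong number of tokens
  }
```

For the input from the question, parseTwoInts("1 727") gives Some((1, 727)), while "1 x" or "1 2 3" give None.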
I find that extractors provide some machinery that makes this type of processing nicer. And I think it works up to a certain point nicely.
object Tokens {
def unapplySeq(line: String): Option[Seq[String]] =
Some(line.split("\\s+").toSeq)
}
class RegexToken[T](pattern: String, convert: (String) => T) {
val pat = pattern.r
def unapply(token: String): Option[T] = token match {
case pat(s) => Some(convert(s))
case _ => None
}
}
object IntInput extends RegexToken[Int]("^([0-9]+)$", _.toInt)
object Word extends RegexToken[String]("^([A-Za-z]+)$", identity)
object Percent extends RegexToken[Double](
"""^([0-9]+\.?[0-9]*)%$""", _.toDouble / 100)
Now how to use:
List("1 727", "uptime 365 99.999%") collect {
case Tokens(IntInput(x), IntInput(y)) => "sum " + (x + y)
case Tokens(Word(w), IntInput(i), Percent(p)) => w + " " + (i * p)
}
// List[java.lang.String] = List(sum 728, uptime 364.99634999999995)
To use for reading lines at the console:
Iterator.continually(readLine("prompt> ")).collect{
case Tokens(IntInput(x), IntInput(y)) => "sum " + (x + y)
case Tokens(Word(w), IntInput(i), Percent(p)) => w + " " + (i * p)
case Tokens(Word("done")) => "done"
}.takeWhile(_ != "done").foreach(println)
// type any input and enter, type "done" and enter to finish
The nice thing about extractors and pattern matching is that you can add case clauses as necessary, you can use Tokens(a, b, _*) to ignore some tokens. I think they combine together nicely (for instance with literals as I did with done).
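As a small illustration of the Tokens(a, b, _*) form, here is a sketch that keeps only the first two tokens of a line. The Tokens object is repeated so the snippet stands alone (with a trim added), and firstTwo is a hypothetical helper.

```scala
// Same idea as the Tokens extractor above: split a line into whitespace-separated tokens.
object Tokens {
  def unapplySeq(line: String): Option[Seq[String]] =
    Some(line.trim.split("\\s+").toSeq)
}

// Sketch: matches lines with at least two tokens and ignores the rest via _*.
def firstTwo(line: String): Option[(String, String)] = line match {
  case Tokens(a, b, _*) => Some((a, b))
  case _                => None
}
```

With this, firstTwo("uptime 365 99.999%") gives Some(("uptime", "365")) and a single-token line gives None.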