I have various types of strings like the following:
sales_data_type
saledatatypes
sales_data.new.metric1
sales_data.type.other.metric2
sales_data.type3.metric3
I'm trying to parse them to get a substring with a word before and after the last dot. For example: new.metric1, other.metric2, type3.metric3. If a word doesn't contain dots, it has to be returned as is: sales_data_type, saledatatypes.
With a Regex it may be done this way:
val infoStr = "sales_data.type.other.metric2"
val pattern = ".*?([^.]+\\.)?([^.]+)$"
println(infoStr.replaceAll(pattern, "$1$2"))
// prints other.metric2
// for saledatatypes just prints nullsaledatatypes ??? but seems to work
I want to find a way to achieve this with Scala, without using Regex in order to expand my understanding of Scala features. Will be grateful for any ideas.
One-liner:
dataStr.split('.').takeRight(2).mkString(".")
takeRight(2) will take the last 2 if there are 2 to take, else it will take the last, and only, 1. mkString(".") will re-insert the dot only if there are 2 elements for the dot to go between, else it will leave the string unaltered.
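To see that behaviour on the strings from the question, here is a minimal sketch. The object name LastTwoSegments and the helper lastTwo are made up for illustration; only split, takeRight and mkString come from the answer above.

```scala
object LastTwoSegments {
  // Keep at most the last two dot-separated segments of a string.
  // With a single segment, mkString has nothing to join, so no dot is added.
  def lastTwo(s: String): String =
    s.split('.').takeRight(2).mkString(".")

  def main(args: Array[String]): Unit = {
    val samples = List(
      "sales_data_type",
      "saledatatypes",
      "sales_data.new.metric1",
      "sales_data.type.other.metric2",
      "sales_data.type3.metric3"
    )
    // Print each input next to its shortened form.
    samples.foreach(s => println(s"$s -> ${lastTwo(s)}"))
  }
}
```

Wrapping the one-liner in a small function like this also makes it easy to map over a whole collection of names.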
Here's one with lots of Scala features for you.
val string = "head.middle.last"
val split = string.split('.') // Array(head, middle, last)
val result = split.toSeq match {
  case Seq(word) => word
  case _ :+ before :+ after => s"$before.$after"
}
println(result) // middle.last
First we split the string on your . and get individual parts.
Then we pattern match those parts, first to check if there is only one (in which case we just return it), and second to grab the last two elements in the seq.
Finally we put a . back in between those last two using string interpolation.
One way of doing it:
val value = "sales_data.type.other.metric2"
val elems = value.split("\\.").toList
elems match {
  case _ :+ beforeLast :+ last => s"${beforeLast}.${last}"
  case _ => throw new NoSuchElementException
}
for (s <- strs) yield {
  val s1 = s.split('.')
  if (s1.size >= 2) s1.takeRight(2).mkString(".") else s
}
or
for (s <- strs) yield {
  val s1 = s.split('.')
  if (s1.size >= 2) s1.init.last + '.' + s1.last else s
}
In Scala REPL:
scala> val strs = Vector("sales_data_type","saledatatypes","sales_data.new.metric1","sales_data.type.other.metric2","sales_data.type3.metric3")
strs: scala.collection.immutable.Vector[String] = Vector(sales_data_type, saledatatypes, sales_data.new.metric1, sales_data.type.other.metric2, sales_data.type3.metric3)
scala> for(s<-strs) yield { val s1 = s.split('.');if(s1.size>=2)s1.takeRight(2).mkString(".") else s }
res62: scala.collection.immutable.Vector[String] = Vector(sales_data_type, saledatatypes, new.metric1, other.metric2, type3.metric3)
scala> for(s<-strs) yield { val s1 = s.split('.');if(s1.size>=2)s1.init.last+'.'+s1.last else s }
res60: scala.collection.immutable.Vector[String] = Vector(sales_data_type, saledatatypes, new.metric1, other.metric2, type3.metric3)
Use Scala's match expression, like this:
def getFormattedStr(str: String): String = {
  str.contains(".") match {
    case true => {
      val arr = str.split("\\.")
      val len = arr.length
      len match {
        case 1 => str
        case _ => arr(len - 2) + "." + arr(len - 1)
      }
    }
    case _ => str
  }
}
Is there a base function or simple way to replace multiple strings with multiple strings in a reference String?
I have seen Replace multiple strings with multiple other strings but it is using known lists instead of variable ones.
For example:
I have val str = "THE GOAT IS RED" , and I want to replace all the characters with other characters or digits, something like:
str.replace("THEGOAISRD".toList(), "0123456789".toList())
which should result in
"012 3450 67 829"
val list1 = listOf('a', 'b', 'c')
val list2 = listOf('0', '1', '2')
val str = "abacada"
val transform = list1.withIndex().associate { it.value to list2[it.index] }
val result = str.map { transform[it] ?: it }.joinToString(separator = "")
println(result)
prints 01020d0
You could do that by first building a dictionary (Map<Char, Char>) using zip and then iterating over the string, transforming it with joinToString like this:
val str = "THE GOAT IS RED"
val dictionary = "THEGOAISRD".zip("0123456789").toMap()
val result = str.toCharArray().joinToString("") {
    dictionary.getOrDefault(it, it).toString()
}
println(result)
I have two strings
val a = "abc"
val b = "xyz"
I want to merge them by interleaving their characters, so that the output looks like this:
axbycz
I put both strings in a list and then used flatMap on it:
val c = listOf(a, b)
val d = c.flatMap {
it.toList()
}
but this does not give the desired result.
Use the zip function. It creates a list of pairs with "adjacent" letters. You can then use joinToString with a transformer to create your final result.
a.zip(b) // Returns the list [(a, x), (b, y), (c, z)]
.joinToString("") { (a, b) -> "$a$b" } // Joins the list back to a string with no separator
You can always use a simple loop, assuming both strings have the same size. That way you only allocate a StringBuilder and a counter variable, without any lists, arrays or pairs:
val a = "abc"
val b = "xyz"
val sb = StringBuilder()
for (i in 0 until a.length) {
    sb.append(a[i]).append(b[i])
}
val d = sb.toString()
marstran's answer is really concise and Pawel's answer is really fast. Using buildString you can have the best of both worlds:
buildString {
    a.zip(b).forEach { (a, b) ->
        append(a).append(b)
    }
}
buildString creates a StringBuilder and offers it as receiver in the lambda. It returns the built string.
Try it out here: Kotlin Playground. Thanks to Pawel for creating the original benchmark.
I have a method that finds the indices of the non-printable characters in a string, as follows.
def isPrintable(v:Char) = v >= 0x20 && v <= 0x7E
val ba = List[Byte](33,33,0,0,0)
ba.zipWithIndex.filter { v => !isPrintable(v._1.toChar) } map {v => v._2}
> res115: List[Int] = List(2, 3, 4)
The first element of the result list is the index, but I wonder if there is a simpler way to do this.
If you want an Option[Int] of the first non-printable character (if one exists), you can do:
ba.zipWithIndex.collectFirst {
  case (char, index) if (!isPrintable(char.toChar)) => index
}
> res4: Option[Int] = Some(2)
If you want all the indices like in your example, just use collect instead of collectFirst and you'll get back a List.
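The difference between the two calls can be sketched side by side. This is only an illustration: the object name NonPrintableIndices is made up, while isPrintable and ba are copied from the question.

```scala
object NonPrintableIndices {
  // Same predicate as in the question: printable ASCII range.
  def isPrintable(v: Char): Boolean = v >= 0x20 && v <= 0x7E

  val ba: List[Byte] = List[Byte](33, 33, 0, 0, 0)

  // collectFirst stops at the first element the partial function accepts
  // and wraps the result in an Option.
  val first: Option[Int] = ba.zipWithIndex.collectFirst {
    case (b, index) if !isPrintable(b.toChar) => index
  }

  // collect applies the same partial function to every element
  // and keeps all results.
  val all: List[Int] = ba.zipWithIndex.collect {
    case (b, index) if !isPrintable(b.toChar) => index
  }
}
```

For the sample input, first is Some(2) and all is List(2, 3, 4).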
For getting only the first index that meets the given condition:
ba.indexWhere(v => !isPrintable(v.toChar))
(it returns -1 if nothing is found)
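If the -1 sentinel is awkward to work with, one possible sketch wraps the result in an Option. The object FirstNonPrintable and the helper firstNonPrintable are hypothetical names; isPrintable is the predicate from the question.

```scala
object FirstNonPrintable {
  // Same predicate as in the question: printable ASCII range.
  def isPrintable(v: Char): Boolean = v >= 0x20 && v <= 0x7E

  // indexWhere signals "not found" with -1; translate that to None
  // so callers cannot mistake it for a real index.
  def firstNonPrintable(bytes: Seq[Byte]): Option[Int] = {
    val i = bytes.indexWhere(b => !isPrintable(b.toChar))
    if (i >= 0) Some(i) else None
  }
}
```

For example, firstNonPrintable(List[Byte](33, 33, 0, 0, 0)) gives Some(2), and a list with only printable bytes gives None.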
You can use a regular expression directly to find unprintable characters by their Unicode category.
Resource: Regexp page
In this way you can filter your string with such a pattern, for instance:
val text = "this is \n sparta\t\r\n!!!"
text.zipWithIndex.filter(_._1.toString.matches("\\p{C}")).map(_._2) // Char has no matches method, so convert to String first
> res3: Vector(8, 16, 17, 18)
As a result you get a Vector with the indices of all unprintable characters in the String.
If only the first occurrence of a non-printable char is desired
The span method applied to a List returns two sublists: the first contains the leading elements that all satisfy a condition, and the second starts with the first element that fails it. In this case consider,
val (l,r) = ba.span(b => isPrintable(b.toChar))
l: List(33, 33)
r: List(0, 0, 0)
To get the index of the first non printable char,
l.size
res: Int = 2
If all the occurrences of non-printable chars are desired
Consider partitioning a given List by a criterion. For instance, for
val ba2 = List[Byte](33,33,0,33,33)
val (l,r) = ba2.zipWithIndex.partition(b => isPrintable(b._1.toChar))
l: List((33,0), (33,1), (33,3), (33,4))
r: List((0,2))
where r includes tuples with non printable chars and their position in the original List.
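If only the positions are needed, the tuples can be reduced to their index component. A small sketch under that assumption follows; the object name PartitionIndices and the value names are made up, while ba2 and isPrintable come from the text above.

```scala
object PartitionIndices {
  // Same predicate as in the question: printable ASCII range.
  def isPrintable(v: Char): Boolean = v >= 0x20 && v <= 0x7E

  val ba2: List[Byte] = List[Byte](33, 33, 0, 33, 33)

  // partition keeps every element, sorted into "holds" and "does not hold".
  val (printable, nonPrintable) =
    ba2.zipWithIndex.partition { case (b, _) => isPrintable(b.toChar) }

  // Drop the byte and keep only the original position.
  val nonPrintableIndices: List[Int] = nonPrintable.map(_._2)
}
```

Here nonPrintableIndices is List(2), matching the single tuple in r.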
I am not sure whether list of indexes or tuples is needed and I am not sure whether 'ba' needs to be an list of bytes or starts off as a string.
for { i <- 0 until ba.length if !isPrintable(ba(i).toChar) } yield i
Here is a tail-recursive version, because people need performance :)
import scala.annotation.tailrec
import scala.collection.mutable.ListBuffer

def getNonPrintable(ba: List[Byte]): List[Int] = {
  val buffer = ListBuffer[Int]() // collects the indices found so far
  @tailrec
  def go(xs: List[Byte], cur: Int): ListBuffer[Int] = {
    xs match {
      case Nil => buffer
      case y :: ys =>
        if (!isPrintable(y.toChar)) buffer += cur
        go(ys, cur + 1)
    }
  }
  go(ba, 0)
  buffer.toList
}
I would like to split a string on whitespace that has 4 elements:
1 1 4.57 0.83
and I am trying to convert it into a List[(String,String,Point)] such that the first two tokens are the first two elements of the tuple and the last two form a Point. I am doing the following, but it doesn't seem to work:
Source.fromFile(filename).getLines.map(string => {
val split = string.split(" ")
(split(0), split(1), split(2))
}).map{t => List(t._1, t._2, t._3)}.toIterator
How about this:
scala> case class Point(x: Double, y: Double)
defined class Point
scala> s43.split("\\s+") match { case Array(i, j, x, y) => (i.toInt, j.toInt, Point(x.toDouble, y.toDouble)) }
res00: (Int, Int, Point) = (1,1,Point(4.57,0.83))
You could use pattern matching to extract what you need from the array:
case class Point(pts: Seq[Double])
val lines = List("1 1 4.34 2.34")
val coords = lines.map(_.split("\\s+")).collect {
  case Array(s1, s2, points @ _*) => (s1, s2, Point(points.map(_.toDouble)))
}
You are not converting the third and fourth tokens into a Point, nor are you converting the lines into a List. Also, you are not rendering each element as a Tuple3, but as a List.
The following should be more in line with what you are looking for.
case class Point(x: Double, y: Double) // Simple point class
Source.fromFile(filename).getLines.map(line => {
  val tokens = line.split("""\s+""") // Use a regex to avoid empty tokens
  (tokens(0), tokens(1), Point(tokens(2).toDouble, tokens(3).toDouble))
}).toList // Convert from an Iterator to List
case class Point(pts: Seq[Double])
val lines = "1 1 4.34 2.34"
val splitLines = lines.split("\\s+") match {
  case Array(s1, s2, points @ _*) => (s1, s2, Point(points.map(_.toDouble)))
}
And for the curious, the @ in pattern matching binds a variable to the pattern, so points @ _* binds the variable points to the pattern _*. And _* matches the rest of the array, so points ends up being a Seq[String].
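The binding can be tried out on a few inputs with a sketch like the one below. The object name SeqWildcardBinding and the parse helper are made up, and returning None for lines with fewer than two tokens is an assumption added here rather than part of the answer.

```scala
object SeqWildcardBinding {
  case class Point(pts: Seq[Double])

  // Split on whitespace; the first two tokens are kept as strings and
  // every remaining token is bound to `points` by the `@ _*` pattern.
  // Note: toDouble throws NumberFormatException on non-numeric tokens.
  def parse(line: String): Option[(String, String, Point)] =
    line.trim.split("\\s+") match {
      case Array(s1, s2, points @ _*) =>
        Some((s1, s2, Point(points.map(_.toDouble).toSeq)))
      case _ =>
        None // fewer than two tokens (assumed handling)
    }
}
```

With this, parse("1 1 4.34 2.34") yields Some(("1", "1", Point(Seq(4.34, 2.34)))), while a line with only two tokens yields a Point with an empty sequence.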
There are ways to convert a Tuple to a List or Seq. One way is:
scala> (1,2,3).productIterator.toList
res12: List[Any] = List(1, 2, 3)
But as you can see, the element type is Any and not Int.
To convert while keeping the element types, you can use an HList from shapeless:
https://github.com/milessabin/shapeless
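When all components of the tuple share one type, plain destructuring is a simpler option that needs no library. The sketch below is an alternative to the shapeless suggestion, not an example of it, and the object name TupleToList is made up.

```scala
object TupleToList {
  val t: (Int, Int, Int) = (1, 2, 3)

  // productIterator is an Iterator[Any], so the element type is lost.
  val untyped: List[Any] = t.productIterator.toList

  // Destructuring the tuple keeps the static element type.
  val typed: List[Int] = t match {
    case (a, b, c) => List(a, b, c)
  }
}
```

This only helps for a fixed arity and a common element type; tuples with mixed types are where an HList becomes useful.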
I'm fairly new to Scala, but I'm doing my exercises now.
I have a string like "A>Augsburg;B>Berlin". What I want at the end is a map
val mymap = Map("A"->"Augsburg", "B"->"Berlin")
What I did is:
val st = locations.split(";").map(dynamicListExtract _)
with the function
private def dynamicListExtract(input: String) = {
  if (input contains ">") {
    val split = input split ">"
    Some(split(0), split(1)) // return key, value
  } else {
    None
  }
}
Now I have an Array[Option[(String, String)]].
How do I elegantly convert this into a Map[String, String]
Can anybody help?
Thanks
Just change your map call to flatMap:
scala> sPairs.split(";").flatMap(dynamicListExtract _)
res1: Array[(java.lang.String, java.lang.String)] = Array((A,Augsburg), (B,Berlin))
scala> Map(sPairs.split(";").flatMap(dynamicListExtract _): _*)
res2: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map((A,Augsburg), (B,Berlin))
For comparison:
scala> Map("A" -> "Augsburg", "B" -> "Berlin")
res3: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map((A,Augsburg), (B,Berlin))
In 2.8, you can do this:
val locations = "A>Augsburg;B>Berlin"
val result = locations.split(";").map(_ split ">") collect { case Array(k, v) => (k, v) } toMap
collect is like map but also filters values that aren't defined in the partial function. toMap will create a Map from a Traversable as long as it's a Traversable[(K, V)].
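The filtering effect of collect shows up most clearly with a malformed entry in the input. The sketch below is an illustration only: the object ParseLocations and the parse helper are made-up names, and the "broken" sample entry is invented for the demonstration.

```scala
object ParseLocations {
  // Entries that do not split into exactly two parts around ">" are
  // not matched by the partial function and are silently dropped.
  def parse(locations: String): Map[String, String] =
    locations
      .split(";")
      .map(_.split(">"))
      .collect { case Array(k, v) => (k, v) }
      .toMap

  def main(args: Array[String]): Unit = {
    println(parse("A>Augsburg;B>Berlin"))
    println(parse("A>Augsburg;broken;B>Berlin")) // "broken" is skipped
  }
}
```

Using map instead of collect with the same case clause would throw a MatchError on the malformed entry.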
It's also worth seeing Randall's solution in for-comprehension form, which might be clearer, or at least give you a better idea of what flatMap is doing.
Map.empty ++ (for(possiblePair<-sPairs.split(";"); pair<-dynamicListExtract(possiblePair)) yield pair)
A simple solution (not handling error cases):
val str = "A>Aus;B>Ber"
var map = Map[String,String]()
str.split(";").map(_.split(">")).foreach(a=>map += a(0) -> a(1))
but Ben Lings' is better.
val str= "A>Augsburg;B>Berlin"
Map(str.split(";").map(_ split ">").map(s => (s(0),s(1))):_*)
--or--
str.split(";").map(_ split ">").foldLeft(Map[String,String]())((m,s) => m + (s(0) -> s(1)))