Unable to split String using |

When I split a String using "," it works as expected:
val line1 = "this,is,a,test" //> line1 : String = this,is,a,test
val sLine = line1.split(",")
However, if I use "|", the String is split into its individual characters, which are added to the array:
val line1 = "this|is|a|test" //> line1 : String = this|is|a|test
val sLine = line1.split("|") //> sLine : Array[String] = Array("", t, h, i, s, |, i, s, |, a, |, t, e, s, t)
Why does the | character cause this?

Possible solutions:
val sLine2 = line1.split('|')
Because single quotes denote a Char, split treats the argument as a literal character, not as a regexp.
val sLine2 = line1.split("\\|")
This escapes the special regexp alternation character |, which is why the original call isn't working. split(String) treats its argument as a regexp, and an unescaped | is an alternation of two empty patterns, which matches the empty string at every position, so the string is broken up into its individual characters.

Since the pipe is a special regex character, I believe you need to escape it as "\\|" for this to work:

scala> val line1 = "this,is,a,test"
line1: java.lang.String = this,is,a,test
scala> line1.split(",")
res2: Array[java.lang.String] = Array(this, is, a, test)
scala> var line2 = "this|is|a|test"
line2: java.lang.String = this|is|a|test
scala> line2.split("\\|")
res3: Array[java.lang.String] = Array(this, is, a, test)
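To see the options from the answers above side by side, here is a minimal sketch. The object name SplitOnPipe and its value names are made up for illustration. Pattern.quote is the standard java.util.regex helper that turns any string into a literal pattern; it is an extra alternative that the answers do not mention.

```scala
import java.util.regex.Pattern

// Hypothetical wrapper object, only here to group the examples.
object SplitOnPipe {
  val line = "this|is|a|test"

  // A Char separator is taken literally, so no escaping is needed.
  val byChar: List[String] = line.split('|').toList

  // A String separator is a regex, so the pipe has to be escaped.
  val byEscapedRegex: List[String] = line.split("\\|").toList

  // Pattern.quote builds a literal pattern from an arbitrary delimiter string.
  val byQuotedRegex: List[String] = line.split(Pattern.quote("|")).toList
}
```

Pattern.quote is mostly useful when the delimiter comes from configuration or user input and may contain any regex metacharacter.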

Related

How do I split one string into two strings in Kotlin with "."?

Let's say I have this string:
val mainString: String = "AAA.BBB"
And now I define two children strings:
val firstString: String = ""
val secondString: String = ""
What code should I write to make firstString equal to "AAA" and secondString equal to "BBB"?
The code below works for any number of substrings separated by a delimiter:
val texto = "111.222.333"
val vet = texto.split(".")
for (st in vet) println(st)
It prints
111
222
333
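A note for readers coming from the Scala question at the top: Kotlin's split(".") takes its argument as a literal delimiter, which is why the unescaped dot works there. The Scala sketch below (object and value names are hypothetical) shows that the same call behaves differently with the JVM's regex-based split, and how the two parts can be bound to separate names.

```scala
// Hypothetical wrapper object, only here to group the examples.
object SplitOnDot {
  val mainString = "AAA.BBB"

  // In Scala, split(String) is regex-based: "." matches every character,
  // so every piece is empty and trailing empty strings are dropped.
  val regexDot: Array[String] = mainString.split(".")

  // A Char separator is literal; match on the result to name the two parts.
  val (firstString, secondString) = mainString.split('.') match {
    case Array(a, b) => (a, b)
    case _           => ("", "") // fallback when there are not exactly two parts
  }
}
```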

How do I remove a substring/character from a string in Scala?

I am writing a program in which I need to filter a string. So I have a map of characters, and I want the string to filter out all characters that are not in the map. Is there a way for me to do this?
Let's say we have the string and map:
str = "ABCDABCDABCDABCDABCD"
Map('A' -> "A", 'D' -> "D")
Then I want the string to be filtered down to:
str = "BCBCBCBCBC"
Also, if I find a given substring in the string, is there a way I can replace that with a different substring?
So for example, if we have the string:
"The number ten is even"
Could we replace that with:
"The number 10 is even"
Filtering the String with the map takes a single filterNot call:
val str = "ABCDABCDABCDABCDABCD"
val m = Map('A' -> "A", 'D' -> "D")
str.filterNot(elem => m.contains(elem))
A more concise, point-free alternative, as recommended in the comments:
str.filterNot(m.contains)
Output
scala> str.filterNot(elem => m.contains(elem))
res3: String = BCBCBCBCBC
To replace elements in the String:
string.replace("ten", "10")
Output
scala> val s = "The number ten is even"
s: String = The number ten is even
scala> s.replace("ten", "10")
res4: String = The number 10 is even
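The question's wording (keep only the characters in the map) and its example output (drop them) point in opposite directions, so a small sketch covering both may help. The object and value names below are illustrative only. It also shows that replace works on literal text, while replaceAll interprets its first argument as a regex.

```scala
// Hypothetical wrapper object, only here to group the examples.
object FilterAndReplace {
  val str = "ABCDABCDABCDABCDABCD"
  val m   = Map('A' -> "A", 'D' -> "D")

  // Drop every character that is a key of the map (matches the example output).
  val withoutMapChars: String = str.filterNot(m.contains)

  // Keep only the characters that are keys of the map (matches the wording).
  val onlyMapChars: String = str.filter(m.contains)

  // replace treats both arguments as literal text.
  val literal: String = "1.5 and 2.5".replace(".", ",")

  // replaceAll treats the first argument as a regex, so "." matches any character.
  val regex: String = "1.5".replaceAll(".", "x")
}
```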

Removing characters from List of Strings

To remove characters from a List of Strings I use :
val validLines : List[String] = List("test[" , "test]")
val charsToClean: List[String] = List("\"", "[", "]", "'")
val filtered = validLines.map(line => line.replace(charsToClean(0), "")
  .replace(charsToClean(1), "")
  .replace(charsToClean(2), "")
  .replace(charsToClean(3), ""))
I'm attempting to use an inner map function instead of hard-coding the positions of the chars to replace:
val filtered1 : List[String] = validLines.map(line => charsToClean.map {c => line.replace(c , "") })
But I receive a compiler error:
mismatch; found : List[List[String]] required: List[String]
Shouldn't the result of line.replace(c, "") be returned?
No. Your code says: for every line and for every unwanted character, return a copy of the line with only that one character removed. That produces one string per (line, character) pair, which is why the result is a List[List[String]].
What you probably wanted to do can be achieved using the following snippet:
val rawLines : List[String] = List("test[" , "test]")
val blacklist: Set[Char] = Set('\"', '[', ']', '\'')
rawLines.map(line => line.filterNot(c => blacklist.contains(c)))
// res2: List[String] = List(test, test)
Or alternatively, use a regexp, as @ka4eli has shown in his answer.
replaceAll accepts a more complex regex built with the | (or) operator:
validLines.map(_.replaceAll("\\[|\\]|'|\"", ""))
You need to use foldLeft instead of map.
val validLines : List[String] = List("test[" , "test]")
val charsToClean: List[String] = List("\"", "[", "]", "'")
val filtered : List[String] = validLines.map(line => charsToClean.foldLeft(line)(_.replace(_, "")))
This is equivalent to @om-nom-nom's answer, but uses for-comprehension syntax. Given a set of characters to erase,
val charsToClean = """"[]'""".toSet
then
for (l <- validLines) yield l.filterNot(charsToClean)
where the Set itself serves as the Char => Boolean membership function passed to filterNot.
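The difference between the inner map and foldLeft can be made concrete with the data from the question. This is a sketch only; the object name CleanLines and the value names nested and cleaned are made up for illustration.

```scala
// Hypothetical wrapper object, only here to group the examples.
object CleanLines {
  val validLines: List[String]   = List("test[", "test]")
  val charsToClean: List[String] = List("\"", "[", "]", "'")

  // Inner map: one copy of the line per character, each with only that
  // single character removed, so the result is nested.
  val nested: List[List[String]] =
    validLines.map(line => charsToClean.map(c => line.replace(c, "")))

  // foldLeft: each replacement is applied to the result of the previous one,
  // so every line collapses to a single fully cleaned string.
  val cleaned: List[String] =
    validLines.map(line => charsToClean.foldLeft(line)((acc, c) => acc.replace(c, "")))
}
```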

Find the intersection of two strings in order with Scala

I'm trying to find the intersection of two strings in order with Scala. I'm pretty new to Scala, but I feel like this should be a one-liner. I've tried using both map and foldLeft, and have yet to attain the correct answer.
Given two strings, return a list of characters that are the same in order. For instance, "abcd", "acc" should return "a", and "abcd", "abc" should return "abc".
Here are the two functions I've tried so far
(str1 zip str2).map{ case(a, b) => if (a == b) a else ""}
and
(str1 zip str2).foldLeft(""){case(acc,n) => if (n._1 == n._2) acc+n._1.toString else ""}
What I want to do is something like this
(str1 zip str2).map{ case(a, b) => if (a == b) a else break}
but that doesn't work.
I know that I can do this with multiple lines and a for loop, but this feels like a one liner. Can anyone help?
Thanks
(str1 zip str2).takeWhile( pair => pair._1 == pair._2).map( _._1).mkString
Testing it out in the scala REPL:
scala> val str1 = "abcd"
str1: String = abcd
scala> val str2 = "abc"
str2: String = abc
scala> (str1 zip str2).takeWhile( pair => pair._1 == pair._2).map( _._1).mkString
res26: String = abc
Edited to pass both test cases
scala> (str1 zip "acc").takeWhile( pair => pair._1 == pair._2).map( _._1).mkString
res27: String = a
This is not at all efficient, but it is obvious:
def lcp(str1: String, str2: String) =
  (str1.inits.toSet intersect str2.inits.toSet).maxBy(_.length)
lcp("abce", "abcd") //> res0: String = abc
lcp("abcd", "bcd") //> res1: String = ""
(take the longest of the intersection of all of the prefixes of string 1 with all of the prefixes of string 2)
Alternatively, to avoid zipping the entire strings:
(s1, s2).zipped.takeWhile(Function.tupled(_ == _)).unzip._1.mkString
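Since zipped on tuples is deprecated as of Scala 2.13, a variant that also stops at the first mismatch without building the full zipped collection can be written with plain iterators. This is a sketch under that assumption; the object and method names are hypothetical.

```scala
// Hypothetical wrapper object, only here to group the example.
object CommonPrefix {
  // Walks both strings in lockstep and stops at the first differing pair,
  // so nothing beyond the common prefix is materialised.
  def commonPrefix(s1: String, s2: String): String =
    s1.iterator
      .zip(s2.iterator)
      .takeWhile { case (a, b) => a == b }
      .map(_._1)
      .mkString
}
```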
Here it is:
scala> val (s1, s2) = ("abcd", "bcd")
s1: String = abcd
s2: String = bcd
scala> Iterator.iterate(s1)(_.init).find(s2.startsWith).get
res1: String = ""
scala> val (s1, s2) = ("abcd", "abc")
s1: String = abcd
s2: String = abc
scala> Iterator.iterate(s1)(_.init).find(s2.startsWith).get
res2: String = abc

Split String into alternating words (Scala)

I want to split a String into alternating words. There will always be an even number.
e.g.
val text = "this here is a test sentence"
should transform to some ordered collection type containing
"this", "is", "test"
and
"here", "a", "sentence"
I've come up with
val (l1, l2) = text.split(" ").zipWithIndex.partition(_._2 % 2 == 0) match {
case (a,b) => (a.map(_._1), b.map(_._1))}
which gives me the right results as two Arrays.
Can this be done more elegantly?
scala> val s = "this here is a test sentence"
s: java.lang.String = this here is a test sentence
scala> val List(l1, l2) = s.split(" ").grouped(2).toList.transpose
l1: List[java.lang.String] = List(this, is, test)
l2: List[java.lang.String] = List(here, a, sentence)
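The answer above works in two steps: grouped(2) turns the words into consecutive pairs, and transpose turns the list of pairs into one list of first elements and one list of second elements. A sketch that wraps this in a function and returns a tuple instead of relying on a List(l1, l2) pattern (the object and method names are hypothetical):

```scala
// Hypothetical wrapper object, only here to group the example.
object AlternatingWords {
  // Assumes an even number of words, as the question states.
  def alternate(text: String): (List[String], List[String]) = {
    // e.g. List(List(this, here), List(is, a), List(test, sentence))
    val pairs = text.split(" ").toList.grouped(2).toList
    pairs.transpose match {
      case List(firsts, seconds) => (firsts, seconds)
      case _                     => (Nil, Nil) // fewer than two words
    }
  }
}
```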
So, how about this:
scala> val text = "this here is a test sentence"
text: java.lang.String = this here is a test sentence
scala> val Reg = """\s*(\w+)\s*(\w+)""".r
Reg: scala.util.matching.Regex = \s*(\w+)\s*(\w+)
scala> (for(Reg(x,y) <- Reg.findAllIn(text)) yield(x,y)).toList.unzip
res8: (List[String], List[String]) = (List(this, is, test),List(here, a, sentence))
