Splitting String with multiple spaces - string

I want to split a string containing two words separated by one or more spaces.
Unfortunately it doesn't work as expected: I end up with only one string.
I read a file that always has two words per line. It looks like this: "word1 word2".
getData() returns a List[(Int, String)], where the string contains the two words.
As already mentioned, these two words can be separated by one or more spaces.
val myMap = getData("MyFile.txt").map { line =>
  val tempList = line._2.split(" +")
  println(line)
  println(tempList(0))
  (tempList(0), tempList(1).toInt)
}.toMap
Result of the prints:
(13,word1 word2)
word1 word2

Is this what you need? There seems to be nothing wrong with the split itself:
val a = "word1 world2"
val b = a.split(" +")
println(b(1))

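Is this the answer you need?
import scala.io.Source
object Test {
  def main(args: Array[String]): Unit = {
    val filename = "C:\\src/com/practice/MyFile.txt"
    val source = Source.fromFile(filename)
    // Join lines with a space so words on adjacent lines are not glued together
    val lines = try source.getLines().mkString(" ") finally source.close()
    val contents = lines.split(" +")
    print(contents(1)) // second word in the file
  }
}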
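One possible cause of getting a single string is that the words are separated by a tab or other whitespace rather than plain spaces. This is an assumption, since the question does not show the raw file contents. In that case split(" +") finds nothing to split on. A minimal sketch that splits on any run of whitespace instead (the object and method names are illustrative, not from the question):

```scala
object SplitOnWhitespace {
  // Splits a line into tokens on any run of whitespace (spaces, tabs, ...).
  // trim removes leading/trailing whitespace so no empty first token appears.
  def tokens(line: String): Array[String] =
    line.trim.split("\\s+")

  def main(args: Array[String]): Unit = {
    val samples = List("word1 word2", "word1    word2", "word1\tword2")
    // Each sample prints [word1, word2]
    samples.foreach(s => println(tokens(s).mkString("[", ", ", "]")))
  }
}
```

With this, tokens(line._2) could replace line._2.split(" +") in the original map, if the separator turns out to be the problem.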

Related

Kotlin Regex to Extract after and before a specific character

In Kotlin, I need to extract the string value between two specific characters.
For example:
Item1. myResultValue: someother string
Here I need the value after ". " and before ":", so my output should be "myResultValue". If nothing matches, it can return an empty or null value.
Please help me write a regex for this.
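For your specific case you can use this regex:
import java.util.regex.Pattern

val text = "Item1. myResultValue: someother string"
// \\s* skips the space after the dot so the captured group has no leading whitespace
val pattern = Pattern.compile(".*\\.\\s*(.*):.*")
val matcher = pattern.matcher(text)
if (matcher.matches()) {
    println(matcher.group(1))
}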
If you want something more flexible that lets you define the start and end patterns, you can do it like this:
// requires: import java.util.regex.Pattern
val startSearch = "\\."
val endSearch = ":"
val text = "Item1. myResultValue: someother string"
// \\s* skips whitespace between the start pattern and the captured value
val pattern = Pattern.compile(".*$startSearch\\s*(.*)$endSearch.*")
val matcher = pattern.matcher(text)
if (matcher.matches()) {
    println(matcher.group(1))
}

How do I split one string into two strings in Kotlin with "."?

Let's say I have this string:
val mainString: String = "AAA.BBB"
And now I define two children strings:
val firstString: String = ""
val secondString: String = ""
What code should I write to make firstString equal to "AAA" and secondString equal to "BBB"?
The code below works for any number of substrings separated by the delimiter:
val texto = "111.222.333"
val vet = texto.split(".")
for (st in vet) println(st)
It prints
111
222
333

How to replace a string with star except first character in kotlin

I want to know how I can replace every character of a given string with a star, except the first character, in Kotlin.
For example, I have the string "Rizwan" and I want it to become "R*****".
You can do it with padEnd():
val name = "Rizwan"
val newName = name[0].toString().padEnd(name.length, '*')
Result:
"R*****"
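To mask the middle of a string such as a phone number with ★, try replaceRange():
phone.replaceRange(2, phone.length - 3, "★".repeat(phone.length - 5))
Note that the repeat count is the length minus the 2 leading and 3 trailing characters that are kept (2 + 3 = 5).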
Result:
"09★★★★★229"
Try replacing (?<=.). with *:
val input = "Rizwan"
val output = input.replace(Regex("(?<=.)."), "*")
println(output)
This prints:
R*****
The lookbehind (?<=.) in the regex pattern ensures that we only replace a character if at least one character precedes it. This spares the first character from being replaced.
I am not an expert in Kotlin so this may not be the best way to do it but it will work for sure.
var s = "Rizwan"
var l = s.length
val first = s[0]
s = ""
while (l > 1) {
    s = s + "*"
    l--
}
s = first + s
print(s)
This is a basic algorithm that uses no library functions.

Scala/Spark efficient partial string match

I am writing a small program in Spark using Scala and came across a problem. I have a List/RDD of single-word strings and a List/RDD of sentences that may or may not contain words from the list of single words. For example:
val singles = Array("this", "is")
val sentence = Array("this Date", "is there something", "where are something", "this is a string")
I want to select the sentences that contain one or more of the words from singles, so that the result looks something like:
output[(this, Array(this Date, this is a String)),(is, Array(is there something, this is a string))]
I thought about two approaches: one is to split each sentence and filter using .contains; the other is to split and format the sentences into an RDD and use .join for an RDD intersection. I am looking at around 50 single words and 5 million sentences. Which method would be faster? Are there any other solutions? Could you also help me with the code? Mine produces no results, although it compiles and runs without error.
You can create a set of required keys, look up the keys in sentences and group by keys.
val singles = Array("this", "is")
val sentences = Array("this Date",
"is there something",
"where are something",
"this is a string")
val rdd = sc.parallelize(sentences) // create RDD
val keys = singles.toSet // words required as keys.
val result = rdd.flatMap { sen =>
    val words = sen.split(" ").toSet
    val common = keys & words // intersect
    common.map(x => (x, sen)) // map as key -> sen
  }
  .groupByKey.mapValues(_.toArray) // group values for a key
  .collect // get rdd contents as array
// result:
// Array((this, Array(this Date, this is a string)),
// (is, Array(is there something, this is a string)))
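The flatMap, intersect and group steps above can be tried without a cluster. The following is a minimal sketch using plain Scala collections rather than Spark RDDs; the object and method names are illustrative, and it says nothing about performance at the 5-million-sentence scale mentioned in the question.

```scala
object GroupSentencesByKeyword {
  // Returns, for each keyword, the sentences that contain it as a whole word.
  def group(singles: Seq[String], sentences: Seq[String]): Map[String, Seq[String]] = {
    val keys = singles.toSet
    sentences
      .flatMap { sen =>
        val words = sen.split("\\s+").toSet       // distinct words of this sentence
        (keys & words).toSeq.map(k => (k, sen))   // one (keyword, sentence) pair per hit
      }
      .groupBy(_._1)                                    // group the pairs by keyword
      .map { case (k, pairs) => (k, pairs.map(_._2)) }  // keep only the sentences
  }

  def main(args: Array[String]): Unit = {
    val singles = Seq("this", "is")
    val sentences = Seq("this Date", "is there something",
                        "where are something", "this is a string")
    group(singles, sentences).foreach { case (k, v) =>
      println(s"$k -> ${v.mkString(", ")}")
    }
  }
}
```

Because each sentence is split into a set of words once and intersected with the small keyword set, the work per sentence does not grow with the number of sentences, which is the same idea the RDD version relies on.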
I've just tried to solve your problem and I've ended up with this code:
def check(s: String, l: Array[String]): Boolean =
  l.contains(s) // true if any element of l equals s
val singles = sc.parallelize(Array("this", "is"))
val sentence = sc.parallelize(Array("this Date", "is there something", "where are something", "this is a string"))
val result = singles.cartesian(sentence)
  .filter(x => check(x._1, x._2.split(" ")))
  .groupByKey()
  .map(x => (x._1, x._2.mkString(", "))) // pay attention here (*)
result.foreach(println)
The last map line (*) is there just because without it I get output containing CompactBuffer, like this:
(is,CompactBuffer(is there something, this is a string))
(this,CompactBuffer(this Date, this is a string))
With that map line (with a mkString command) I get a more readable output like this:
(is,is there something, this is a string)
(this,this Date, this is a string)
Hope it could help in some way.
FF

Find the intersection of two strings in order with Scala

I'm trying to find the intersection of two strings in order with Scala. I'm pretty new to Scala, but I feel like this should be a one-liner. I've tried using both map and foldLeft, and have yet to attain the correct answer.
Given two strings, return a list of characters that are the same in order. For instance, "abcd", "acc" should return "a", and "abcd", "abc" should return "abc".
Here are the two functions I've tried so far
(str1 zip str2).map{ case(a, b) => if (a == b) a else ""}
and
(str1 zip str2).foldLeft(""){case(acc,n) => if (n._1 == n._2) acc+n._1.toString else ""}
What I want to do is something like this
(str1 zip str2).map{ case(a, b) => if (a == b) a else break}
but that doesn't work.
I know that I can do this with multiple lines and a for loop, but this feels like a one liner. Can anyone help?
Thanks
(str1 zip str2).takeWhile( pair => pair._1 == pair._2).map( _._1).mkString
Testing it out in the scala REPL:
scala> val str1 = "abcd"
str1: String = abcd
scala> val str2 = "abc"
str2: String = abc
scala> (str1 zip str2).takeWhile( pair => pair._1 == pair._2).map( _._1).mkString
res26: String = abc
Edited to pass both test cases
scala> (str1 zip "acc").takeWhile( pair => pair._1 == pair._2).map( _._1).mkString
res27: String = a
This is not at all efficient, but it is obvious:
def lcp(str1:String, str2:String) =
(str1.inits.toSet intersect str2.inits.toSet).maxBy(_.length)
lcp("abce", "abcd") //> res0: String = abc
lcp("abcd", "bcd") //> res1: String = ""
(take the longest of the intersection of all of the prefixes of string 1 with all of the prefixes of string 2)
Alternatively, to avoid zipping the entire strings:
(s1, s2).zipped.takeWhile(Function.tupled(_ == _)).unzip._1.mkString
Here it is:
scala> val (s1, s2) = ("abcd", "bcd")
s1: String = abcd
s2: String = bcd
scala> Iterator.iterate(s1)(_.init).find(s2.startsWith).get
res1: String = ""
scala> val (s1, s2) = ("abcd", "abc")
s1: String = abcd
s2: String = abc
scala> Iterator.iterate(s1)(_.init).find(s2.startsWith).get
res2: String = abc
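For comparison, here is a minimal index-based sketch of the same common-prefix idea. It is not a one-liner, but it builds no intermediate pairs or prefix collections. The names CommonPrefix and commonPrefix are illustrative and do not come from the answers above.

```scala
object CommonPrefix {
  // Returns the longest prefix shared by both strings.
  def commonPrefix(a: String, b: String): String = {
    val limit = math.min(a.length, b.length)
    var i = 0
    // Advance while both strings have the same character at position i
    while (i < limit && a.charAt(i) == b.charAt(i)) i += 1
    a.substring(0, i)
  }

  def main(args: Array[String]): Unit = {
    println(commonPrefix("abcd", "acc")) // a
    println(commonPrefix("abcd", "abc")) // abc
    println(commonPrefix("abcd", "bcd")) // prints an empty line
  }
}
```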

Resources