Scala: using string interpolation for string replacement

scala 2.11.6
val fontColorMap = Map( "Good" -> "#FFA500", "Bad" -> "#0000FF")
val content = "Good or Bad?"
"(Bad|Good)".r.replaceFirstIn(content,s"""<font color="${fontColorMap("$1")}">$$1</font>""")
I want to replace the string using a regex. In the replacement,
$$1 fetches the matched string, but I don't know how to get at the match inside ${}.
I also know that Scala translates the interpolation
into something like this:
new StringContext("<font color=\"", "\">$1</font>").s(fontColorMap("$1"))
So fontColorMap is looked up with the literal key "$1", and the call fails.
Is there any way I can handle this gracefully?

You can use the version of replaceAllIn that takes a function:
"(Bad|Good)".r.replaceAllIn(content, m =>
s"""<font color="${fontColorMap(m.matched)}">${m.matched}</font>"""
)
where m is of type scala.util.matching.Regex.Match.
There doesn't seem to be a version of replaceFirstIn that does the same thing though.
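If only the first match should be replaced, one possible workaround is to combine findFirstMatchIn with the before and after parts of the Match. This is a minimal sketch, not a library feature: replaceFirstWith and the enclosing object are hypothetical names introduced here for illustration.

```scala
import scala.util.matching.Regex

object FirstMatchReplace {
  val fontColorMap = Map("Good" -> "#FFA500", "Bad" -> "#0000FF")
  val pattern: Regex = "(Bad|Good)".r

  // Hypothetical helper: rebuild the string around the first match only.
  // If nothing matches, the input is returned unchanged.
  def replaceFirstWith(re: Regex, input: String)(f: Regex.Match => String): String =
    re.findFirstMatchIn(input) match {
      case Some(m) => s"${m.before}${f(m)}${m.after}"
      case None    => input
    }

  val result: String = replaceFirstWith(pattern, "Good or Bad?") { m =>
    s"""<font color="${fontColorMap(m.matched)}">${m.matched}</font>"""
  }
  // result: <font color="#FFA500">Good</font> or Bad?
}
```

Because the pieces are concatenated directly, the text returned by the function is not scanned for $ or \ group references.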

This seems to be caused by the order of evaluation: the StringContext interpolation is evaluated first, before the regex substitutes the group reference, so fontColorMap receives the literal "$1". One option is to look up the value before doing the regex replacement, like:
"(Bad|Good)".r.findFirstIn(content).map(key => {
val value = fontColorMap(key)
content.replaceFirst(key, s"""<font color="$value">$key</font>""")
}).getOrElse(content) // fall back to the original content when nothing matches
// <font color="#FFA500">Good</font> or Bad?

finding similar values of variable in pyspark dataframe

I can find similar strings with the following command, but instead of using a string, I need to use a variable to find similar values between two columns.
df.filter(df.column_name.like("%Spark%")).show()
pattern = "Spark"
df.filter(df.column_name.like("%" + pattern + "%")).show()
A perhaps better way is to use rlike with a regex pattern to take care of word boundaries:
pattern = "Spark"
df.filter(df.column_name.rlike("\\b" + pattern + "\\b")).show()
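The difference between the substring match and the word-boundary regex can be seen without a Spark session. The sketch below is plain Scala, not PySpark, and only illustrates the matching semantics; the sample values and the object name are made up for the example, and Pattern.quote is an extra precaution in case the variable contains regex metacharacters.

```scala
import java.util.regex.Pattern

object WordBoundaryDemo {
  val pattern = "Spark"
  // Made-up sample values standing in for a DataFrame column.
  val values = List("Apache Spark 3", "Sparkling water", "PySpark", "spark")

  // Substring containment, comparable to like("%" + pattern + "%").
  val substringHits: List[String] = values.filter(_.contains(pattern))
  // substringHits: List(Apache Spark 3, Sparkling water, PySpark)

  // Whole-word match, comparable to rlike("\\b" + pattern + "\\b").
  val wordRegex = ("\\b" + Pattern.quote(pattern) + "\\b").r
  val wordHits: List[String] = values.filter(v => wordRegex.findFirstIn(v).isDefined)
  // wordHits: List(Apache Spark 3)
}
```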

How do I take a string and turn it into a list using Scala?

I am brand new to Scala and having a tough time figuring this out.
I have a string like this:
a = "The dog crossed the street"
I want to create a list that looks like below:
a = List("The","dog","crossed","the","street")
I tried doing this using .split(" ") and then returning that, but it seems to do nothing and returns the same string. Could anyone help me out here?
It's safer to split() on one-or-more whitespace characters, just in case there are any tabs or adjacent spaces in the mix.
split() returns an Array so if you want a List you'll need to convert it.
"The dog\tcrossed\nthe street".split("\\s+").toList
//res0: List[String] = List(The, dog, crossed, the, street)
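A likely reason the split seemed to "do nothing" is that strings are immutable: split returns a new Array and leaves the original untouched, so the result has to be captured. A small sketch, assuming the value names below (they are illustrative only):

```scala
object SplitToList {
  val a = "The dog crossed the street"

  // split does not modify a; its result must be assigned or returned.
  val words: List[String] = a.split("\\s+").toList
  // words: List(The, dog, crossed, the, street)

  // Leading whitespace produces an empty first element, so trim first if that can occur.
  val untrimmed: List[String] = "  The dog".split("\\s+").toList
  // untrimmed: List("", The, dog)
  val trimmed: List[String] = "  The dog".trim.split("\\s+").toList
  // trimmed: List(The, dog)
}
```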

Replacing a certain part of string with a pre-specified Value

I am fairly new to Puppet and Ruby. Most likely this question has been asked before but I am not able to find any relevant information.
In my puppet code I will have a string variable retrieved from the fact hostname.
$n="$facts['hostname'].ex-ample.com"
I am expecting to get the values like these
DEV-123456-02B.ex-ample.com,
SCC-123456-02A.ex-ample.com,
DEV-123456-03B.ex-ample.com,
SCC-999999-04A.ex-ample.com
I want to perform the following action: change the string to lowercase and then replace
-02, -03 or -04 with -01.
So my output would be like
dev-123456-01b.ex-ample.com,
scc-123456-01a.ex-ample.com,
dev-123456-01b.ex-ample.com,
scc-999999-01a.ex-ample.com
I figured I would need to use .downcase on $n to make everything lowercase, but I am not sure how to replace the digits. I was thinking of .gsub or split, but I am not sure how. I would prefer to do this in a one-liner.
If you really want a one-liner, you could run this against each string:
str
.downcase
.split('-')
.map
.with_index { |substr, i| i == 2 ? substr.gsub(/0[0-9]/, '01') : substr }
.join('-')
Without knowing what format your input list is taking, I'm not sure how to advise on how to iterate through it, but maybe you have that covered already. Hope it helps.
Note that Puppet and Ruby are entirely different languages and the other answers are for Ruby and won't work in Puppet.
What you need is:
$h = downcase(regsubst($facts['hostname'], '..(.)$', '01\1'))
$n = "${h}.ex-ample.com"
notice($n)
Note:
The downcase function comes from stdlib (it is also built into Puppet 6 and later); regsubst is a built-in Puppet function.
The regsubst call does a regex search and replace: the pattern ..(.)$ matches two characters followed by one captured character at the end of the string, and the replacement is 01 followed by the captured character.
All of that is then downcased.
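To check the pattern against the sample hostnames outside Puppet, the same regex can be tried in Scala. This is only a cross-check of the regex logic, not Puppet code: the rewrite helper is hypothetical, and Java-style regexes write the captured group as $1 rather than \1.

```scala
object HostnameRewrite {
  // Hypothetical helper mirroring downcase(regsubst(host, '..(.)$', '01\1')).
  def rewrite(host: String): String =
    host.replaceFirst("..(.)$", "01$1").toLowerCase + ".ex-ample.com"

  // Hostnames taken from the question, without the domain suffix.
  val samples = List("DEV-123456-02B", "SCC-123456-02A", "DEV-123456-03B", "SCC-999999-04A")
  val rewritten: List[String] = samples.map(rewrite)
  // rewritten: List(dev-123456-01b.ex-ample.com, scc-123456-01a.ex-ample.com,
  //                 dev-123456-01b.ex-ample.com, scc-999999-01a.ex-ample.com)
}
```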
If the -01 to -04 part is always at the same string index, you could use that index to replace the content.
original = 'DEV-123456-02B.ex-ample.com'
#                  11 -^
string = original.downcase # creates a new downcased string
string[11, 2] = '01' # replace from index 11, 2 characters
string #=> "dev-123456-01b.ex-ample.com"

Scala: split string via pattern matching

Is it possible to split a string into lexemes, somehow like
"user#domain.com" match {
case name :: "#" :: domain :: "." :: zone => doSmth(name, domain, zone)
}
In other words, in the same manner as lists...
Yes, you can do this with Scala's Regex functionality.
I found an email regex on this site, feel free to use another one if this doesn't suit you:
[-0-9a-zA-Z.+_]+#[-0-9a-zA-Z.+_]+\.[a-zA-Z]{2,4}
The first thing we have to do is add parentheses around groups:
([-0-9a-zA-Z.+_]+)#([-0-9a-zA-Z.+_]+)\.([a-zA-Z]{2,4})
With this we have three groups: the part before the #, between # and ., and finally the TLD.
Now we can create a Scala regex from it and then use Scala's pattern matching unapply to get the groups from the Regex bound to variables:
val Email = """([-0-9a-zA-Z.+_]+)#([-0-9a-zA-Z.+_]+)\.([a-zA-Z]{2,4})""".r
// Email: scala.util.matching.Regex = ([-0-9a-zA-Z.+_]+)#([-0-9a-zA-Z.+_]+)\.([a-zA-Z]{2,4})
"user#domain.com" match {
case Email(name, domain, zone) =>
println(name)
println(domain)
println(zone)
}
// user
// domain
// com
Starting with Scala 2.13, it's possible to pattern match a String by unapplying a string interpolator:
val s"$user#$domain.$zone" = "user#domain.com"
// user: String = "user"
// domain: String = "domain"
// zone: String = "com"
If you are expecting malformed inputs, you can also use a match statement:
"user#domain.com" match {
case s"$user#$domain.$zone" => Some((user, domain, zone))
case _ => None
}
// Option[(String, String, String)] = Some(("user", "domain", "com"))
In general regex is horribly inefficient, so I wouldn't advise it.
You CAN do it using Scala pattern matching by calling .toList on your string to turn it into a List[Char]. Your parts name, domain and zone will then also be List[Char]; to turn them back into Strings, use .mkString. I'm not sure how efficient this is, though.
I have benchmarked basic string operations (like substring, indexOf, etc.) against regex for various use cases, and regex is usually an order of magnitude or two slower. And of course regex is hideously unreadable.
UPDATE: The best thing to do is to use parsers, either the native Scala ones or Parboiled2.
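For reference, the basic string operations mentioned above (indexOf, substring) could look roughly like this. It is a minimal sketch under assumptions: parse is a hypothetical helper, it keeps the '#' separator used in the question, and it splits on the last '.' so the final part is treated as the zone.

```scala
object SplitAddress {
  // Hypothetical helper: returns None when a separator is missing or a part would be empty.
  def parse(s: String): Option[(String, String, String)] = {
    val hash = s.indexOf('#')
    val dot  = s.lastIndexOf('.')
    if (hash > 0 && dot > hash + 1 && dot < s.length - 1)
      Some((s.substring(0, hash), s.substring(hash + 1, dot), s.substring(dot + 1)))
    else
      None
  }

  val ok: Option[(String, String, String)] = parse("user#domain.com")
  // ok: Some((user,domain,com))
  val bad: Option[(String, String, String)] = parse("user#domain")
  // bad: None
}
```

No performance claim is made for this sketch; it only shows the shape of the non-regex approach.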

extract string with linq

Is there a nice way to extract part of a string with linq, example:
I have
string s = "System.Collections.*";
or
string s2 = "System.Collections.Somethingelse.*";
My goal is to extract everything in the string except the trailing '.*'.
Thanks. I am using C#.
The simplest way might be to use String.LastIndexOf followed by String.Substring
int index = s.LastIndexOf('.');
string output = s.Substring(0, index);
Unless you have a specific requirement to use LINQ for learning purposes of course.
You might want a regex instead. (.*)\.\*
With the regex:
string input="System.Collections.Somethingelse.*";
string output=Regex.Match(input,@"\b.*\b").Value; // requires using System.Text.RegularExpressions;
output is:
"System.Collections.Somethingelse"
(because "*" is not a word character), although a simple
output=input.Replace(".*","");
would have worked :P
