Could anyone explain this spark expression for me? - apache-spark

I'm a new learner of spark. There's one line of code estimating pi but I don't quite understand how it works.
scala>val pi_approx = f"pi = ${355f/113}%.5f"
pi_approx: String = pi = 3.14159
I don't understand the 'f' '$' and '%' in the expression above. Could anyone explain the usage of them? Thanks!

This is an example of string interpolation, which lets you embed variable references directly in processed string literals. For example:
scala> val name = "Scala"
name: String = Scala
scala> println(s"Hello, $name")
Hello, Scala
In the example above, the literal s"Hello, $name" is a processed string literal.
Scala provides three string interpolation methods out of the box: s, f and raw.
Prepending f to any string literal allows the creation of simple formatted strings, similar to printf in other languages.
In your expression, $ marks an embedded value, and ${} lets you embed an arbitrary expression (here 355f/113). The format specifier after the % character says how to render that value: %.5f formats it as a decimal number with five digits after the decimal point. For example, with an arbitrary expression inside ${}:
scala> println(s"1 + 1 = ${1 + 1}")
1 + 1 = 2
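As a minimal sketch of how the three interpolators differ (the names name, count and the other values are made up for illustration), something like this should behave the same in the REPL or in a script:

```scala
val name  = "Scala"
val count = 42

// s: plain substitution; ${...} accepts any expression
val greeting = s"Hello, $name! 1 + 1 = ${1 + 1}"

// f: printf-style; the specifier after % formats the value just before it
val padded = f"$count%05d"   // zero-padded to width 5
val label  = f"$name%-8s|"   // left-justified in an 8-character column

// The expression from the question; the decimal separator follows the default locale
val piApprox = f"pi = ${355f / 113}%.5f"

// raw: substitution still happens, but \n stays as a backslash followed by n
val unescaped = raw"a\n$name"

println(greeting)
println(padded)
println(label)
println(piApprox)
println(unescaped)
```

The f interpolator also checks at compile time that each specifier fits the type of the value it follows, which plain format does not do.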
More detailed information can be found on:
Scala String Interpolation
Java Formatter

Related

How to split a string in scala?

I have a below string which I want to parse in Scala.
word, {"..Json Structure..."}
In Python I can split the string by giving ", {" as the argument. However, Scala is not accepting the space as an argument.
Can you please help me with this?
Scala's String split method takes a regular expression, and { is a special character in regular expressions, used for quantifying matched patterns. To treat it as a literal, escape it with a backslash, which is written \\{ in an ordinary string literal:
val s = """word, {"..Json Structure..."}"""
// s: String = word, {"..Json Structure..."}
s.split(", \\{")
// res32: Array[String] = Array(word, "..Json Structure..."})
Or:
s.split(""", \{""")
// res33: Array[String] = Array(word, "..Json Structure..."})
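If the delimiter is meant literally, another option is to let java.util.regex.Pattern.quote do the escaping. A small sketch, reusing the string from the question (the limit of 2 is my own assumption, so that a later ", {" inside the JSON would not split again):

```scala
import java.util.regex.Pattern

val s = """word, {"..Json Structure..."}"""

// Pattern.quote wraps the delimiter so that split treats it as a literal
val parts = s.split(Pattern.quote(", {"), 2)

val word = parts(0)   // text before the delimiter
val rest = parts(1)   // everything after it (the opening brace is consumed)

println(word)
println(rest)
```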

What's the difference between raw string interpolation and triple quotes in scala

Scala has triple quoted strings """String\nString""" to use special characters in the string without escaping. Scala 2.10 also added raw"String\nString" for the same purpose.
Is there any difference in how raw"" and """""" work? Can they produce different output for the same string?
Looking at the source for the default interpolators (found here: https://github.com/scala/scala/blob/2.11.x/src/library/scala/StringContext.scala) it looks like the "raw" interpolator calls the identity function on each letter, so what you put in is what you get out. The biggest difference that you will find is that if you are providing a string literal in your source that includes the quote character, the raw interpolator still won't work. i.e. you can't say
raw"this whole "thing" should be one string object"
but you can say
"""this whole "thing" should be one string object"""
So you might be wondering "Why would I ever bother using the raw interpolator then?" and the answer is that the raw interpolator still performs variable substitution. So
val helloVar = "hello"
val helloWorldString = raw"""$helloVar, "World"!\n"""
Will give you the string "hello, "World"!\n" with the \n not being converted to a newline, and the quotes around the word world.
It is surprising that using the s-interpolator turns escapes back on, even when using triple quotes:
scala> "hi\nthere."
res5: String =
hi
there.
scala> """hi\nthere."""
res6: String = hi\nthere.
scala> s"""hi\nthere."""
res7: String =
hi
there.
The s-interpolator doesn't know that it's processing string parts that were originally triple-quoted. Hence:
scala> raw"""hi\nthere."""
res8: String = hi\nthere.
This matters when you're using backslashes in other ways, such as regexes:
scala> val n = """\d"""
n: String = \d
scala> s"$n".r
res9: scala.util.matching.Regex = \d
scala> s"\d".r
scala.StringContext$InvalidEscapeException: invalid escape character at index 0 in "\d"
at scala.StringContext$.loop$1(StringContext.scala:231)
at scala.StringContext$.replace$1(StringContext.scala:241)
at scala.StringContext$.treatEscapes0(StringContext.scala:245)
at scala.StringContext$.treatEscapes(StringContext.scala:190)
at scala.StringContext$$anonfun$s$1.apply(StringContext.scala:94)
at scala.StringContext$$anonfun$s$1.apply(StringContext.scala:94)
at scala.StringContext.standardInterpolator(StringContext.scala:124)
at scala.StringContext.s(StringContext.scala:94)
... 33 elided
scala> s"""\d""".r
scala.StringContext$InvalidEscapeException: invalid escape character at index 0 in "\d"
at scala.StringContext$.loop$1(StringContext.scala:231)
at scala.StringContext$.replace$1(StringContext.scala:241)
at scala.StringContext$.treatEscapes0(StringContext.scala:245)
at scala.StringContext$.treatEscapes(StringContext.scala:190)
at scala.StringContext$$anonfun$s$1.apply(StringContext.scala:94)
at scala.StringContext$$anonfun$s$1.apply(StringContext.scala:94)
at scala.StringContext.standardInterpolator(StringContext.scala:124)
at scala.StringContext.s(StringContext.scala:94)
... 33 elided
scala> raw"""\d$n""".r
res12: scala.util.matching.Regex = \d\d
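To make the differences above concrete, here is a small comparison sketch (the value names are invented for the example). Comparing lengths shows whether \n became a single newline character or stayed as two characters:

```scala
val plain  = "hi\nthere"          // escape processed by the compiler
val triple = """hi\nthere"""      // no escape processing
val viaS   = s"""hi\nthere"""     // s processes escapes, even in triple quotes
val viaRaw = raw"""hi\nthere"""   // raw leaves the backslash alone

println(Seq(plain, triple, viaS, viaRaw).map(_.length))

// raw keeps regex backslashes intact and still substitutes variables
val n = """\d"""
val twoDigits = raw"""\d$n""".r
println(twoDigits.findFirstIn("room 42"))
```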

Scala String interpolation with Format, how to change locale?

When doing format string interpolation in Sweden I get a comma instead of a dot when creating strings with decimal numbers:
scala> val a = 5.010
a: Double = 5.01
scala> val a = 5.0101
a: Double = 5.0101
scala> f"$a%.2f"
res0: String = 5,01
My question is, how do I set the format so that I get the result 5.01? I would like to be able to set the locale only for that String, i.e. so that I don't change the locale for the whole environment.
Cheers,
Johan
Using the same Java number formatting support, accessible from the StringOps-enriched String class, you could specify another locale just for that output:
"%.2f".formatLocal(java.util.Locale.US, a)
(as described in "How to convert an Int to a String of a given length with leading zeros to align?")
The Scala way would be to use the f string interpolator (Scala 2.10+), as in the OP's question, but it uses the current default locale and offers no easy way to set a different locale for just one call. Changing the default works, but it affects the whole JVM rather than only this string:
import java.util.Locale
Locale.setDefault(Locale.US)
println(f"$a%.2f")
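To keep the change local to one string, a sketch along these lines avoids touching the default locale. It uses formatLocal, as above, and the java.lang.String.format overload that takes a Locale; the Swedish locale is included only to show the contrast:

```scala
import java.util.Locale

val a = 5.0101

// The locale is passed explicitly, so the JVM default is not changed
val us = "%.2f".formatLocal(Locale.US, a)
val sv = "%.2f".formatLocal(Locale.forLanguageTag("sv-SE"), a)

// The plain Java call does the same thing
val viaJava = String.format(Locale.US, "%.2f", Double.box(a))

println(us)
println(sv)
println(viaJava)
```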

How to insert double quotes into String with interpolation in scala

I'm having trouble escaping all the quotes in my function
(basic usage: if I find a string, do nothing; if it's not a string, add " at the beginning and end).
Code snippet:
def putTheDoubleQuotes(value: Any): Any = {
  value match {
    case s: String => s //do something ...
    case _ => s"\"$value\"" //not working
  }
}
The only thing that worked was:
case _ => s"""\"$value\""""
Is there a better syntax for this?
It looks terrible and the IDE (IntelliJ) marks it in red (but lets you run it, which is really annoying).
This is a bug in Scala:
escape does not work with string interpolation
but maybe you can use:
scala> import org.apache.commons.lang.StringEscapeUtils.escapeJava
import org.apache.commons.lang.StringEscapeUtils.escapeJava
scala> escapeJava("this is a string\nover two lines")
res1: java.lang.String = this is a string\nover two lines
You don't need to escape quotes in a triple-quoted string, so s""""$value"""" will work. Admittedly, it doesn't look good either.
Another solution (also mentioned in the Scala tracker) is to use
case _ => s"${'"'}$value${'"'}"
Still ugly, but sometimes perhaps may be preferred over triple quotes.
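For comparison, a quick sketch of the two workarounds above side by side (value is just a sample input); both should give the same quoted result:

```scala
val value: Any = 42

// Triple quotes: the fourth quote at each end is part of the content
val viaTriple = s""""$value""""

// A Char literal spliced in as an expression
val viaChar = s"${'"'}$value${'"'}"

println(viaTriple)
println(viaChar)
```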
It seems an escape sequence $" was suggested as a part of SIP-24 for 2.12:
case _ => s"$"$value$""
This SIP was never accepted, as it contained other more controversial suggestions. Currently there is an effort to get the escape sequence $" implemented in 2.13, as the Pre SIP/mini SIP "$" escapes in interpolations".
An example:
scala> val username="admin"
username: String = admin
scala> val pass="xyz"
pass: String = xyz
scala> println(s"""{"username":"$username", "pass":"$pass"}""")
{"username":"admin", "pass":"xyz"}
This fixed the problem for me, I tested this out and this is what I used.
raw"""
Inside this block you can put "as many" quotes as you "want" and even "${5 + 7}" interpolate inside the quotes
"""
http://docs.scala-lang.org/overviews/core/string-interpolation.html#the-raw-interpolator
For your use case, custom interpolators make it easy to achieve nice syntax.
scala> implicit class `string quoter`(val sc: StringContext) {
| def q(args: Any*): String = "\"" + sc.s(args: _*) + "\""
| }
defined class string$u0020quoter
scala> q"hello,${" "*8}world"
res0: String = "hello,        world"
scala> "hello, world"
res1: String = hello, world // REPL doesn't add the quotes, sanity check
scala> " hello, world "
res2: String = " hello, world " // unless the string is untrimmed
Squirrel the implicit away in a package object somewhere.
You can name the interpolator something besides q, of course.
Last week, someone asked on the mailing list for the ability to use backquoted identifiers. Right now you can do res3 but not res4:
scala> val `"` = "\""
": String = "
scala> s"${`"`}"
res3: String = "
scala> s"hello, so-called $`"`world$`"`"
res4: String = hello, so-called "world"
Another idea that just occurred to me was that the f-interpolator already does some work to massage your string. For instance, it has to handle "%n" intelligently. It could, at the same time, handle an additional escape "%q" which it would not pass through to the underlying formatter.
That would look like:
scala> f"%qhello, world%q"
<console>:9: error: conversions must follow a splice; use %% for literal %, %n for newline
That's worth an enhancement request.
Update: just noticed that octals aren't deprecated in interpolations yet:
scala> s"\42hello, world\42"
res12: String = "hello, world"
As already mentioned, this is a known bug in Scala. A workaround is to use \042.
Simple way:-
val str="abc"
println(s"$str") //without double quotes
println(s"""\"$str\"""") // with double quotes
Starting Scala 2.13.6 escaped double quotes work as expected in string interpolations
Welcome to Scala 2.13.6 (OpenJDK 64-Bit Server VM, Java 15.0.2).
Type in expressions for evaluation. Or try :help.
scala> s"\"Hello\""
val res0: String = "Hello"
scala> s"$"Hello$""
val res1: String = "Hello"
How about
s"This is ${"\"" + variable + "\""}" inserted in string with quotes
It's heavily used in my case, therefore I created this version:
object StringUtil{
implicit class StringImprovements(s: String) {
def quoted = "\""+s+"\""
}
}
val myStatement = s"INSERT INTO ${tableName.quoted} ..."
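A self-contained sketch of that approach, assuming a made-up tableName value (the rest of the original statement is elided, so only the quoted part is shown):

```scala
object StringUtil {
  implicit class StringImprovements(s: String) {
    // Wraps the string in double quotes
    def quoted: String = "\"" + s + "\""
  }
}

import StringUtil._

val tableName = "users"   // hypothetical table name
val fragment  = s"INSERT INTO ${tableName.quoted}"

println(fragment)
```

Note that this only adds the surrounding quotes; it does not escape quotes that are already inside the string.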
Taking @Pascalius's suggestion a few steps further: class StringImprovements extends AnyVal.
object StringUtil{
implicit class StringImprovements(val s: String) extends AnyVal {
def dqt = "\""+s+"\"" // double quote
def sqt = s"'$s'" // single quote
}
}
Normally Scala creates a StringImprovements wrapper object just so that the two extension methods, dqt and sqt, can be called on it implicitly. Making the class extend AnyVal turns it into a value class, which lets the compiler skip creating that object in most cases and call the method directly.
Here is a simple example using the above implicit class, mixing named variables (a string and a boolean) and a method call in the interpolated string.
import StringUtil._
abstract class Animal {
...
override def toString(): String = s"Animal:${getFullName().dqt}, CanFly:$canFly, Sound:${getSound.dqt}"
}
The following worked for me inside my terminal.
scala> val str: String = "hi"
scala> print(str)
hi~
scala> print(s"${str}")
hi~
scala> print(s"\"${str}\"")
"hi"~
scala> print(s"\"$str\"")
"hi"~
scala>

Short for String.format in Scala

Is there a short syntax for string interpolation in Scala? Something like:
"my name is %s" < "jhonny"
Instead of
"my name is %s" format "jhonny"
No, but you can add it yourself:
scala> implicit def betterString(s:String) = new { def %(as:Any*)=s.format(as:_*) }
betterString: (s: String)java.lang.Object{def %(as: Any*): String}
scala> "%s" % "hello"
res3: String = hello
Note that you can't use <, because that would conflict with a different implicit conversion already defined in Predef.
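On Scala 2.10 or later, the same idea can be written as an implicit class, which avoids the structural type in the version above. A sketch, with FormatOps as an invented name:

```scala
implicit class FormatOps(private val s: String) {
  // Shorthand for s.format(args...)
  def %(args: Any*): String = s.format(args: _*)
}

val greeting = "my name is %s" % "jhonny"

println(greeting)
```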
In case you are wondering what syntax may be in the works:
$ ./scala -nobootcp -Xexperimental
Welcome to Scala version 2.10.0.r25815-b20111011020241
scala> val s = "jhonny"
s: String = jhonny
scala> "my name is \{ s }"
res0: String = my name is jhonny
Playing some more:
scala> "those things \{ "ne\{ "ts".reverse }" }"
res9: String = those things nest
scala> println("Hello \{ readLine("Who am I speaking to?") }")
Who am I speaking to?[typed Bozo here]Hello Bozo
I seem to remember Martin Odersky being quoted as saying that string concatenation, in the style presented in "Programming in Scala", is a useful approximation to interpolation. The idea is that, without spaces around the +, you only use a few extra characters per substitution. For example:
val x = "Mork"
val y = "Ork"
val intro = "my name is "+x+", I come from "+y
The format method provides a lot more power however. Daniel Sobral has blogged on a regex based technique too.
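Since Scala 2.10, the s and f interpolators cover most of what the question asks for. A short sketch comparing them with format (who is a made-up name):

```scala
val who = "jhonny"

val viaFormat = "my name is %s".format(who)
val viaS      = s"my name is $who"
val viaF      = f"my name is $who%s"

println(viaFormat)
println(viaS)
println(viaF)
```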
