Scala StringLike split method creates extra double quotes for leading spaces

I tried a simple split on a CSV-style string that contains spaces after the commas, like this:
scala> """"First", "SecondAfterSpace"""".split(",")
res0: Array[String] = Array("First", " "SecondAfterSpace"")
scala> res0(0)
res3: String = "First"
scala> res0(1)
res4: String = " "SecondAfterSpace""
The second string in the resulting Array has unexpected double quotes, more than the original string has.
It is OK that it contains the additional leading space, as I have not trimmed it yet. But I would expect a result similar to the following, just with an additional leading space instead of the extra double quotes:
scala> """"First","SecondNoSpace"""".split(",")
res1: Array[String] = Array("First", "SecondNoSpace")
I know I can work around this issue with the following, but I'd like to understand whether I am doing something wrong or whether this is a bug:
scala> """"First", "SecondAfterSpaceTrimmed"""".split(",").map(_.trim)
res2: Array[String] = Array("First", "SecondAfterSpaceTrimmed")
Just to be sure I also tried all variants like
.split(',')
.split(""",""")
.split("""\,""")
.split(Array(','))
but all give the same result with extra double quotes.
In that context: from the Scaladoc I see that the method in StringLike is used. The documentation talks about a char array, yet I can pass a regex, which is not documented. That made me suspect that the split method of Java's String is being used. I am confused.

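No, it is not a bug. That is just the way the REPL displays the value: it wraps a string in double quotes when the string has leading or trailing whitespace.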
scala> val xs = """"First", "SecondAfterSpace"""".split(",")
xs: Array[String] = Array("First", " "SecondAfterSpace"")
scala> xs.last
res0: String = " "SecondAfterSpace""
scala> xs.last.count(_ == '"')
res1: Int = 2
As you can see, there are no extra quotes: the string contains only its two original quote characters.
To drop the space after the comma, you can use a regex in split:
scala> val xs = """"First", "SecondAfterSpace"""".split(",[ ]?")
xs: Array[String] = Array("First", "SecondAfterSpace")
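The point above can be checked outside the REPL, where no display quoting gets in the way. The following is a minimal sketch; the object name SplitDemo and its member names are made up for illustration. It also touches on the question about which split is called: split with a String argument is the regex-based method of java.lang.String, while the Char overload comes from Scala's string extension methods.

```scala
object SplitDemo {
  // The same input as in the question, written with escaped quotes.
  val line: String = "\"First\", \"SecondAfterSpace\""

  // java.lang.String.split: the argument is treated as a regex.
  val plain: Array[String] = line.split(",")

  // Scala's Char overload: the separator is a literal character.
  val byChar: Array[String] = line.split(',')

  // The second element keeps its leading space but gains no quotes.
  val quoteCount: Int = plain.last.count(_ == '"')

  // Let the regex absorb any whitespace around the comma.
  val trimmed: Array[String] = line.split("\\s*,\\s*")
}
```

Here quoteCount is 2, so the extra quotes exist only in the REPL output, and trimmed contains both fields without the leading space.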

Related

What's the difference between raw string interpolation and triple quotes in scala

Scala has triple quoted strings """String\nString""" to use special characters in the string without escaping. Scala 2.10 also added raw"String\nString" for the same purpose.
Is there any difference in how raw"" and """""" work? Can they produce different output for the same string?
Looking at the source for the default interpolators (found here: https://github.com/scala/scala/blob/2.11.x/src/library/scala/StringContext.scala) it looks like the "raw" interpolator calls the identity function on each letter, so what you put in is what you get out. The biggest difference that you will find is that if you are providing a string literal in your source that includes the quote character, the raw interpolator still won't work. i.e. you can't say
raw"this whole "thing" should be one string object"
but you can say
"""this whole "thing" should be one string object"""
So you might be wondering "Why would I ever bother using the raw interpolator then?" and the answer is that the raw interpolator still performs variable substitution. So
val helloVar = "hello"
val helloWorldString = raw"""$helloVar, "World"!\n"""
Will give you the string "hello, "World"!\n" with the \n not being converted to a newline, and the quotes around the word world.
It is surprising that using the s-interpolator turns escapes back on, even when using triple quotes:
scala> "hi\nthere."
res5: String =
hi
there.
scala> """hi\nthere."""
res6: String = hi\nthere.
scala> s"""hi\nthere."""
res7: String =
hi
there.
The s-interpolator doesn't know that it's processing string parts that were originally triple-quoted. Hence:
scala> raw"""hi\nthere."""
res8: String = hi\nthere.
This matters when you're using backslashes in other ways, such as regexes:
scala> val n = """\d"""
n: String = \d
scala> s"$n".r
res9: scala.util.matching.Regex = \d
scala> s"\d".r
scala.StringContext$InvalidEscapeException: invalid escape character at index 0 in "\d"
at scala.StringContext$.loop$1(StringContext.scala:231)
at scala.StringContext$.replace$1(StringContext.scala:241)
at scala.StringContext$.treatEscapes0(StringContext.scala:245)
at scala.StringContext$.treatEscapes(StringContext.scala:190)
at scala.StringContext$$anonfun$s$1.apply(StringContext.scala:94)
at scala.StringContext$$anonfun$s$1.apply(StringContext.scala:94)
at scala.StringContext.standardInterpolator(StringContext.scala:124)
at scala.StringContext.s(StringContext.scala:94)
... 33 elided
scala> s"""\d""".r
scala.StringContext$InvalidEscapeException: invalid escape character at index 0 in "\d"
at scala.StringContext$.loop$1(StringContext.scala:231)
at scala.StringContext$.replace$1(StringContext.scala:241)
at scala.StringContext$.treatEscapes0(StringContext.scala:245)
at scala.StringContext$.treatEscapes(StringContext.scala:190)
at scala.StringContext$$anonfun$s$1.apply(StringContext.scala:94)
at scala.StringContext$$anonfun$s$1.apply(StringContext.scala:94)
at scala.StringContext.standardInterpolator(StringContext.scala:124)
at scala.StringContext.s(StringContext.scala:94)
... 33 elided
scala> raw"""\d$n""".r
res12: scala.util.matching.Regex = \d\d
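The differences described above can be summarised in a few lines. This is only a sketch; the object name EscapeDemo and its member names are invented for illustration.

```scala
object EscapeDemo {
  val name = "hello"

  // Triple quotes alone: no escape processing, no substitution.
  val tripleQuoted: String = """hi\nthere"""

  // s-interpolator: escapes are processed even inside triple quotes.
  val viaS: String = s"""hi\nthere"""

  // raw-interpolator: substitution happens, escapes are left alone.
  val viaRaw: String = raw"""$name\n"""
}
```

tripleQuoted keeps the backslash and the letter n as two characters, viaS contains a real newline, and viaRaw is hello followed by a literal backslash and n.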

Scala raw strings error in unicode escape

In a Scala String I need to include this literal verbatim: \usepackage{X}. That is, the desired behaviour is:
val s = """ ... \usepackage{X} ... """
println(s)
... \usepackage{X} ...
Attempts so far include,
scala> """\usepackage{X}"""
<console>:1: error: error in unicode escape
"""\usepackage{X}"""
^
scala> raw"""\usepackage{X}"""
<console>:1: error: error in unicode escape
raw"""\usepackage{X}"""
^
Single double-quoted strings prove unsuccessful as well.
Following http://docs.scala-lang.org/overviews/core/string-interpolation.html , a working example includes
scala> raw"a\nb"
res1: String = a\nb
which does not cover unicode cases.
You appear to be facing issue SI-4706: Unicode literal syntax thwarts common use cases for triple-quotes.
In Scala, unicode escape sequences are processed not only inside character or string literals. It may not be obvious that the following code would work:
scala> 5 \u002B 10
res0: Int = 15
Unfortunately, there doesn't seem to be a good way around this if you don't want to disable unicode escapes completely (-Xno-uescape, only available until Scala 2.13.1, see PR #8282 and ee8c1ef8).
One of the workarounds suggested in the SI-4706 issue is to separate the backslash character from the rest of the string:
scala> """\""" + """usepackage{X}"""
res1: String = \usepackage{X}
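The same workaround can be packaged so that the backslash never appears directly in front of a u in the source. A minimal sketch follows; the names LatexLiteral, backslash, usepackage and viaInterpolation are made up for illustration.

```scala
object LatexLiteral {
  // Keep the backslash in its own literal so the source never
  // contains a backslash immediately followed by the letter u.
  val backslash: String = "\\"

  // Plain concatenation, as in the workaround above.
  val usepackage: String = backslash + "usepackage{X}"

  // The same idea with the s-interpolator; {X} is literal text
  // here because it is not preceded by a dollar sign.
  val viaInterpolation: String = s"${backslash}usepackage{X}"
}
```

Both values hold the 14-character string that starts with a single backslash.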

How to insert double quotes into String with interpolation in scala

I am having trouble escaping all the quotes in my function.
(Basic usage: if the value is a string, do nothing; if it is not a string, add " at the beginning and end.)
Code snippet:
def putTheDoubleQuotes(value: Any): Any = {
value match {
case s: String => s //do something ...
case _ => s"\"$value\"" //not working
}
}
The only thing that worked was:
case _ => s"""\"$value\""""
Is there a better syntax for this?
It looks terrible, and the IDE (IntelliJ) marks it in red (but lets you run it, which really pisses me off!)
This is a bug in Scala:
escape does not work with string interpolation
but maybe you can use:
scala> import org.apache.commons.lang.StringEscapeUtils.escapeJava
import org.apache.commons.lang.StringEscapeUtils.escapeJava
scala> escapeJava("this is a string\nover two lines")
res1: java.lang.String = this is a string\nover two lines
You don't need to escape quotes in a triple-quoted string, so s""""$value"""" will work. Admittedly, it doesn't look good either.
Another solution (also mentioned in the Scala tracker) is to use
case _ => s"${'"'}$value${'"'}"
Still ugly, but sometimes perhaps may be preferred over triple quotes.
It seems an escape sequence $" was suggested as a part of SIP-24 for 2.12:
case _ => s"$"$value$""
This SIP was never accepted, as it contained other more controversial suggestions. Currently there is an effort to get escape sequence $" implemented in 2.13 as Pre SIP/mini SIP $” escapes in interpolations.
An example:
scala> val username="admin"
> username: String = admin
scala> val pass="xyz"
> pass: String = xyz
scala> println(s"""{"username":"$username", "pass":"$pass"}""")
> {"username":"admin", "pass":"xyz"}
This fixed the problem for me, I tested this out and this is what I used.
raw"""
Inside this block you can put "as many" quotes as you "want" and even "${5 + 7}" interpolate inside the quotes
"""
http://docs.scala-lang.org/overviews/core/string-interpolation.html#the-raw-interpolator
For your use case, custom string interpolators make it easy to achieve nice syntax.
scala> implicit class `string quoter`(val sc: StringContext) {
| def q(args: Any*): String = "\"" + sc.s(args: _*) + "\""
| }
defined class string$u0020quoter
scala> q"hello,${" "*8}world"
res0: String = "hello,        world"
scala> "hello, world"
res1: String = hello, world // REPL doesn't add the quotes, sanity check
scala> " hello, world "
res2: String = " hello, world " // unless the string is untrimmed
Squirrel the implicit away in a package object somewhere.
You can name the interpolator something besides q, of course.
Last week, someone asked on the ML for the ability to use backquoted identifiers. Right now you can do res3 but not res4:
scala> val `"` = "\""
": String = "
scala> s"${`"`}"
res3: String = "
scala> s"hello, so-called $`"`world$`"`"
res4: String = hello, so-called "world"
Another idea that just occurred to me was that the f-interpolator already does some work to massage your string. For instance, it has to handle "%n" intelligently. It could, at the same time, handle an additional escape "%q" which it would not pass through to the underlying formatter.
That would look like:
scala> f"%qhello, world%q"
<console>:9: error: conversions must follow a splice; use %% for literal %, %n for newline
That's worth an enhancement request.
Update: just noticed that octals aren't deprecated in interpolations yet:
scala> s"\42hello, world\42"
res12: String = "hello, world"
As already mentioned, this is a known bug in Scala. A workaround is to use \042.
Simple way:-
val str="abc"
println(s"$str") //without double quotes
println(s"""\"$str\"""") // with double quotes
Starting with Scala 2.13.6, escaped double quotes work as expected in string interpolations:
Welcome to Scala 2.13.6 (OpenJDK 64-Bit Server VM, Java 15.0.2).
Type in expressions for evaluation. Or try :help.
scala> s"\"Hello\""
val res0: String = "Hello"
scala> s"$"Hello$""
val res1: String = "Hello"
How about:
s"This is ${"\"" + variable + "\""} inserted in string with quotes"
It's heavily used in my case, therefore I created this version:
object StringUtil{
implicit class StringImprovements(s: String) {
def quoted = "\""+s+"\""
}
}
val myStatement = s"INSERT INTO ${tableName.quoted} ..."
Taking Pascalius's suggestion a step further: make the StringImprovements class extend AnyVal.
object StringUtil{
implicit class StringImprovements(val s: String) extends AnyVal {
def dqt = "\""+s+"\"" // double quote
def sqt = s"'$s'" // single quote
}
}
Scala uses the StringImprovements class only to create an intermediate object on which the two extension methods dqt and sqt are implicitly called. We can eliminate the creation of this object, and so improve performance, by making the class extend AnyVal: Scala provides value classes for exactly such cases, where the compiler replaces the wrapper object with a direct call to the method.
Here is a simple example that uses the above implicit class in an interpolated string that mixes named variables (a string and a boolean) with a function call.
import StringUtil._
abstract class Animal {
...
override def toString(): String = s"Animal:${getFullName().dqt}, CanFly:$canFly, Sound:${getSound.dqt}"
}
The following worked for me inside my terminal.
scala> val str: String = "hi"
scala> print(str)
hi~
scala> print(s"${str}")
hi~
scala> print(s"\"${str}\"")
"hi"~
scala> print(s"\"$str\"")
"hi"~
scala>
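For versions where escaped quotes inside an interpolated string do not compile, two of the approaches from the answers above can be written as complete functions. This is a sketch only; the object name QuoteDemo and the method names are hypothetical, and the String case simply returns the value unchanged, as in the question.

```scala
object QuoteDemo {
  // Plain concatenation: avoids the interpolator altogether.
  def viaConcat(value: Any): String = value match {
    case s: String => s
    case other     => "\"" + other + "\""
  }

  // Triple-quoted interpolation: the fourth quote on each side
  // is part of the string content, so no escaping is needed.
  def viaTripleQuotes(value: Any): String = value match {
    case s: String => s
    case other     => s""""$other""""
  }
}
```

Both functions wrap a non-string value such as 42 in double quotes and leave a String argument untouched.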

Scala: How can I get an escaped representation of a string?

Basically, what I'd like to do is have:
// in foo.scala
val string = "this is a string\nover two lines"
println(string)
println(foo(string))
Do this:
% scala foo.scala
this is a string
over two lines
"this is a string\nover two lines"
Basically looking for an analog of ruby's String#inspect or haskell's show :: String -> String.
This question is a bit old but I stumbled over it while searching for a solution myself and was dissatisfied with the other answers because they either are not safe (replacing stuff yourself) or require an external library.
I found a safe way to get the escaped representation of a string using only the Scala standard library (>2.10.0). It uses a little trick:
Through runtime reflection you can easily obtain a representation of a literal string expression. The tree of such an expression is returned as (almost) Scala code when calling its toString method. In particular, this means that the literal is represented the way it would be in code, i.e. escaped and double quoted.
def escape(raw: String): String = {
import scala.reflect.runtime.universe._
Literal(Constant(raw)).toString
}
The escape function therefore results in the desired code-representation of the provided raw string (including the surrounding double quotes):
scala> "\bHallo" + '\n' + "\tWelt"
res1: String =
?Hallo
Welt
scala> escape("\bHallo" + '\n' + "\tWelt")
res2: String = "\bHallo\n\tWelt"
This solution is admittedly abusing the reflection api but IMHO still safer and more maintainable than the other proposed solutions.
I'm pretty sure this isn't available in the standard libraries for either Scala or Java, but it is in Apache Commons Lang:
scala> import org.apache.commons.lang.StringEscapeUtils.escapeJava
import org.apache.commons.lang.StringEscapeUtils.escapeJava
scala> escapeJava("this is a string\nover two lines")
res1: java.lang.String = this is a string\nover two lines
You could easily add the quotation marks to the escaped string if you wanted, of course.
The scala.reflect solution actually works fine. When you do not want to pull in that whole library, this is what it seems to do under the hood (Scala 2.11):
def quote (s: String): String = "\"" + escape(s) + "\""
def escape(s: String): String = s.flatMap(escapedChar)
def escapedChar(ch: Char): String = ch match {
case '\b' => "\\b"
case '\t' => "\\t"
case '\n' => "\\n"
case '\f' => "\\f"
case '\r' => "\\r"
case '"' => "\\\""
case '\'' => "\\\'"
case '\\' => "\\\\"
case _ => if (ch.isControl) "\\0" + Integer.toOctalString(ch.toInt)
else String.valueOf(ch)
}
val string = "\"this\" is a string\nover two lines"
println(quote(string)) // ok
If I compile these:
object s1 {
val s1 = "this is a string\nover two lines"
}
object s2 {
val s2 = """this is a string
over two lines"""
}
I don't find a difference in the resulting String, so I guess there is no way to find out whether there was a "\n" in the source.
But maybe I got you wrong, and you would like to get the same result for both?
"\"" + s.replaceAll ("\\n", "\\\\n").replaceAll ("\\t", "\\\\t") + "\""
The second possibility is:
val mask = Array.fill (3)('"').mkString
mask + s + mask
res5: java.lang.String =
"""a
b"""
Test:
scala> val s = "a\n\tb"
s: java.lang.String =
a
b
scala> "\"" + s.replaceAll ("\\n", "\\\\n").replaceAll ("\\t", "\\\\t") + "\""
res7: java.lang.String = "a\n\tb"
scala> mask + s + mask
res8: java.lang.String =
"""a
b"""
You could build your own function pretty easily, if you don't want to use the apache library:
scala> var str = "this is a string\b with some \n escapes \t so we can \r \f \' \" see how they work \\";
str: java.lang.String =
this is a string? with some
escapes so we can
' " see how they work \
scala> print(str.replace("\\","\\\\").replace("\n","\\n").replace("\b","\\b").replace("\r","\\r").replace("\t","\\t").replace("\'","\\'").replace("\f","\\f").replace("\"","\\\""));
this is a string\b with some \n escapes \t so we can \r \f \' \" see how they work \\
Maybe there is a cleaner way to implement this in Scala 3, but this is what I've come up with, following martin-ring's approach:
import scala.quoted.*
inline def escape(inline raw: String): String = ${escapeImpl('{raw})}
def escapeImpl(raw: Expr[String])(using Quotes): Expr[String] =
import quotes.reflect.*
Literal(StringConstant(raw.show)).asExprOf[String]

Short for String.format in Scala

Is there a short syntax for string interpolation in Scala? Something like:
"my name is %s" < "jhonny"
Instead of
"my name is %s" format "jhonny"
No, but you can add it yourself:
scala> implicit def betterString(s:String) = new { def %(as:Any*)=s.format(as:_*) }
betterString: (s: String)java.lang.Object{def %(as: Any*): String}
scala> "%s" % "hello"
res3: String = hello
Note that you can't use <, because that would conflict with a different implicit conversion already defined in Predef.
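The same idea can also be written as an implicit class, which avoids the structural type returned by the anonymous `new { ... }` above. This is a minimal sketch; the names FormatSyntax and FormatOps are made up for illustration.

```scala
object FormatSyntax {
  // Extension method % on String; extending AnyVal avoids
  // allocating a wrapper object for each call.
  implicit class FormatOps(private val template: String) extends AnyVal {
    def %(args: Any*): String = template.format(args: _*)
  }

  val greeting: String = "my name is %s" % "jhonny"
}
```

With FormatOps in scope, greeting evaluates to my name is jhonny.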
In case you are wondering what syntax may be in the works
$ ./scala -nobootcp -Xexperimental
Welcome to Scala version 2.10.0.r25815-b20111011020241
scala> val s = "jhonny"
s: String = jhonny
scala> "my name is \{ s }"
res0: String = my name is jhonny
Playing some more:
scala> "those things \{ "ne\{ "ts".reverse }" }"
res9: String = those things nest
scala> println("Hello \{ readLine("Who am I speaking to?") }")
Who am I speaking to?[typed Bozo here]Hello Bozo
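The experimental \{ } syntax shown above is not what was eventually released; string interpolation arrived in Scala 2.10 as prefixed literals such as s"...". A rough equivalent of the first two examples, with the hypothetical object name InterpolationDemo:

```scala
object InterpolationDemo {
  val name = "jhonny"

  // A simple identifier is spliced with $name.
  val intro: String = s"my name is $name"

  // An arbitrary expression goes inside ${ ... }.
  val nested: String = s"those things ${"ne" + "ts".reverse}"
}
```

intro is my name is jhonny and nested is those things nest.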
I seem to remember Martin Odersky being quoted as saying that string concatenation in the style presented in "Programming in Scala" is a useful approximation to interpolation. The idea is that, without spaces around the + operators, you are only using a few extra characters per substitution. For example:
val x = "Mork"
val y = "Ork"
val intro = "my name is "+x+", I come from "+y
The format method provides a lot more power however. Daniel Sobral has blogged on a regex based technique too.

Resources