I am learning Scala from Scala for the Impatient, and one of the Chapter 1 exercises asks:
What do the take, drop, takeRight, and dropRight string functions
do? What advantage or disadvantage do they have over using substring?
The only advantage I see is that drop (and its variants) will not throw an IndexOutOfBoundsException.
For example:
scala> "Hello World!" dropRight 100
res26: String = ""
scala> "Hello World!" substring 100
java.lang.StringIndexOutOfBoundsException: String index out of range: -88
at java.lang.String.substring(String.java:1919)
... 33 elided
What else? Are they more memory efficient?
The main benefit is that it allows you to treat a String as a sequential collection of characters, much like any other Seq or List instance.
In fact, these methods (and other important transformational functions like map, flatMap and filter) are not implemented in String itself (which is, in fact, simply the Java String class, not a native-Scala class), but in the StringOps class (which extends StringLike -> ... -> SeqLike), and an implicit conversion ensures that a String is converted into a StringOps whenever you need access to these methods.
This means you can pass a String to a list-manipulation function and the function will receive a StringOps instance, work on it like any other SeqLike entity without needing to know it is actually a String, and hand back the results of the manipulation, which StringOps is designed to present back to you as a String.
If you know an entity is a String in a given piece of code, feel free to use the String-specific methods, but the availability of this implicit conversion means that you can also take advantage of a String's "character sequence"-like nature to treat it like any other list in situations where that may be convenient.
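A minimal sketch of what this looks like in practice. The helper countVowels and the object name are hypothetical, made up for this illustration. Note one detail I am adding here: as far as I know, when a String is passed where a Seq[Char] is expected, the wrapper the compiler picks is WrappedString (via Predef.wrapString) rather than StringOps, but the effect for the caller is the same.

```scala
object StringAsSeq {
  private val vowels = Set('a', 'e', 'i', 'o', 'u')

  // Hypothetical helper: it knows nothing about String, only about Seq[Char].
  def countVowels(cs: Seq[Char]): Int = cs.count(vowels)

  // Collection-style methods called directly on a String hand back a String.
  val shouted: String = "hello".map(_.toUpper)
  val noSpaces: String = "a b c".filter(_ != ' ')

  // A String passed where a Seq[Char] is expected is wrapped implicitly.
  val n: Int = countVowels("hello world")
}
```

Here map and filter come from the implicit wrapper and still return a String, while countVowels accepts the string without any explicit conversion.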
It seems you are right. All of these operations use the StringOps.slice method, which delegates to String.substring.
So, apart from the overhead of wrapping the string and clamping the bounds, it ends up being the same call to substring.
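To make the difference concrete, here is a small sketch comparing the four methods with substring on the string from the question. The object and value names are only for illustration.

```scala
object TakeDropDemo {
  val s = "Hello World!" // 12 characters

  val first5: String  = s.take(5)       // first five characters
  val from6: String   = s.drop(6)       // everything after the first six
  val last6: String   = s.takeRight(6)  // last six characters
  val butLast7: String = s.dropRight(7) // everything except the last seven

  // Out-of-range counts are clamped instead of throwing.
  val whole: String = s.take(100)
  val empty: String = s.drop(100)

  // substring performs no clamping, so the same index throws.
  val substringThrew: Boolean =
    try { s.substring(100); false }
    catch { case _: IndexOutOfBoundsException => true }
}
```

The clamping is the practical advantage: take, drop, takeRight and dropRight never need a length check first, at the cost of silently accepting an index that might be a bug.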
After reading the docs and having a look through the source code, I'm a little unsure whether there is any benefit to passing a single plain string to the fmt.Sprint function.
For example, will this:
return fmt.Sprint("this is a string")
be more beneficial than doing this:
return "this is a string"
Am I correct in thinking that this function works better with types that conform to the Stringer interface?
Yes, Sprint is pretty much useless for a single string. It should be used when you have a bunch of objects, Stringers or not; it just concatenates their string representations. From the manual:
Sprint formats using the default formats for its operands and returns
the resulting string. Spaces are added between operands when neither
is a string.
So if you have just one string it's useless. If you have many arguments, especially variadic, it sure beats doing something like fmt.Sprintf("%v %v %v", foo, bar, baz), especially if you don't know how many elements you have.
I know that Java treats String as an immutable type, partly because a string is always initialized with a fixed length. Another reason is thread safety: a string that no caller can modify can be shared safely between threads. I would like a complete list of the reasons why Java's String is immutable, so are there any other reasons besides these?
One other reason is efficiency: If strings were mutable, every call to Class.getName, System.getProperty or basically any method that returns a string would always have to make a fresh copy. If they didn’t, you could do the following:
"".getClass().getName().setCharAt(11, 'p')
And from that point on, the java.lang.String class would be called java.lang.Spring.
In the following code, I would expect 3 total string allocations to be made:
String str = "abc";
String str2 = str*2; //"abcabc"
1 when creating str
another when creating a copy of str to concatenate with itself
a third to hold the concatenation of str with itself (str2)
Are there fewer or more allocations made in this example? I know that strings are immutable in Dart but I'm unsure how these operations work under the hood because of this property.
I have no knowledge about the inner workings of the Dart VM but I would say:
"abc" creates one String object.
String str = "abc"; makes str reference the one created String object ("abc").
str*2; creates a second String object "abcabc" which str2 refers to after the second statement.
All in all two String objects.
With optimising compilers it's difficult to know for sure. If you want to know more you can look at the generated native code with irhydra.
In general, a good approach is to write code that is as readable as possible, then use tools to find the bottlenecks in your code and optimise those.
For example, Observatory can show you which objects are using up the most memory and which methods are running the most.
I've recently been reading about the behavior of GStringImpls vs Strings when used in collections in Groovy.
I understand that the reason this evaluates to false...
"${'test'}".equals("test") == false
is due to the symmetry requirement of the .equals() contract, however I was wondering if there was a reason the GStringImpl couldn't just be evaluated to a String immediately. So when I do something like this...
"${'someString'}"
I don't get a GStringImpl, I just get a plain Java String back, which I can immediately use as the key in a map, for example.
I know there are some workarounds, like
String s = "${'someString'}"
however stuff like this is a bit inconvenient, and the mix-up between GStringImpl and String seems to be a big 'gotcha' for Groovy newbies.
GStrings are not evaluated immediately to a String for a few reasons, mainly related to lazy evaluation (which is quite good for logging) and templating. In Strings and GString you can find a good explanation:
GString can involve lazy evaluation so it's not until the toString()
method is invoked that the GString is evaluated. This lazy evaluation
is useful for things like logging as it allows the calculation of the
string, the calls to toString() on the values, and the concatenation
of the different strings to be done lazily if at all.
GString is pretty handy when you don't want to use a template engine,
or when you really want full lazy evaluation of GStrings. When some
variable embedded in a GString, the toString() is called on that
variable to get a string representation, and it's inserted into the
final string.
Therefore:
GString and String are two distinct classes, and hence use of GString
objects as keys for Map objects or comparisons involving GString
objects, can produce unexpected results when combined with String
objects since a GString and a String won't have the same hashCode nor
will they be equal. There is no automatic coercion between the two
types for comparisons or map keys, so it's sometimes necessary to
explicitly invoke toString() on GString objects.
Unexpected conversion to String can lead to problems when code is
expecting a GString, as for methods in groovy.sql classes.
In Scala, when you write out a string "Hello World" to a file it writes
Hello World
(note: no double quotes).
Lisp has a concept of print and write. One writes without the double quotes, the other includes them to make it easy to write out data structures and read them back later using the standard reader.
Is there any way to do this in Scala?
With one string it is easy enough to format it - but with many deeply nested structures, it is nearly impossible.
For example, say I have
sealed trait PathSegment
case class P(x:String) extends PathSegment
case class V(x:Int) extends PathSegment
To create one does:
P("X")
or
V(0)
a list of these PathSegments prints as:
List(P(paths), P(/pets), P(get), P(responses), V(200))
I want this to print out as:
List(P("paths"), P("/pets"), P("get"), P("responses"), V(200))
In other words, I want strings (and characters), no matter where they occur in a structure, to print out as "foo" or 'c'.
That's what Serialization is about. Also, why JSON is popular.
Check out lift-json ( https://github.com/lift/lift/tree/master/framework/lift-base/lift-json/ ) for writing data out that will be parsed and read by another language. JSON is pretty standard in the web services world for request/response serialization and there are JSON libraries in just about every language.
To literally write out a string including double quotes, you can also do something like this:
"""
The word "apple" is in double quotes.
"""
I find a slightly more structured format like JSON more useful, and a library like lift-json does the right thing in terms of quoting Strings and not quoting Ints, etc.
I think you are looking for something like JavaScript's eval() + JSON, or Python's eval(), str() and repr(). Essentially, you want Lispy symmetric meta-circular evaluation, meaning you can transform data into source code, and evaluating that source code will give you back the same data, right?
AFAIK, there's no equivalent of eval() in Scala. Daniel Spiewak has talked about this here before. However, if you reeeeeeally want to, I suggest the following:
1. Every collection object has 3 methods that let you transform its data into a string representation any way you want: mkString, addString and stringPrefix. Do something clever with them (think "decompiling" your in-memory ADTs back to source-code form) and you will arrive at step 2. Essentially, you can transform a list of integers created by List(1,2,3) back into the string "List(1,2,3)". For more basic literals like a simple string or integer, you'll need to pimp the built-in types using implicits to give them these toString (I'm overloading the term here) helper methods.
2. Now that you have your string representation, you can think about how to "interpret" or "evaluate" it. You will need an eval() function that creates a new instance of a parser combinator that understands Scala's literals and reassembles the data structure for you.
Implementing this actually sounds fun. Don't forget to post back here if you succeed in implementing it. :)
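A rough sketch of the first step (printing data back in source-like form), using the case classes from the question. The object QuotedPrinter and the functions repr and quote are hypothetical names for this illustration, not anything from the standard library. It relies on the fact that case classes are Products, and it assumes that only List needs special handling among collections and that escaping backslashes and double quotes is enough; a real version would need more cases.

```scala
object QuotedPrinter {
  sealed trait PathSegment
  case class P(x: String) extends PathSegment
  case class V(x: Int) extends PathSegment

  // Escapes backslashes and double quotes only; other escapes are left out of this sketch.
  private def quote(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  // Renders a value so that strings and characters keep their quotes at any depth.
  def repr(value: Any): String = value match {
    case s: String  => quote(s)
    case c: Char    => "'" + c + "'"
    // List must be matched before Product, because :: is itself a case class.
    case l: List[_] => l.map(repr).mkString("List(", ", ", ")")
    // Case objects such as None print as their bare name.
    case p: Product if p.productArity == 0 => p.productPrefix
    // Case classes and tuples: name, then each field rendered recursively.
    case p: Product =>
      p.productIterator.map(repr).mkString(p.productPrefix + "(", ", ", ")")
    case other      => other.toString
  }
}
```

Calling QuotedPrinter.repr on the list of path segments from the question should give the quoted form asked for. Reading that text back in (step 2) is a separate problem that this sketch does not attempt.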