How to write read-readable strings in Scala

In Scala, when you write the string "Hello World" to a file, it writes
Hello World
(note: no double quotes).
Lisp has a concept of print and write. One writes without the double quotes, the other includes them to make it easy to write out data structures and read them back later using the standard reader.
Is there any way to do this in Scala?
With one string it is easy enough to format it - but with many deeply nested structures, it is nearly impossible.
For example, say I have
sealed trait PathSegment
case class P(x:String) extends PathSegment
case class V(x:Int) extends PathSegment
To create one, you write:
P("X")
or
V(0)
A list of these PathSegments prints as:
List(P(paths), P(/pets), P(get), P(responses), V(200))
I want this to print out as:
List(P("paths"), P("/pets"), P("get"), P("responses"), V(200))
In other words, I want strings (and characters), no matter where they occur in a structure, to print out as "foo" or 'c'.

That's what serialization is about. It is also why JSON is popular.

Check out lift-json ( https://github.com/lift/lift/tree/master/framework/lift-base/lift-json/ ) for writing data out that will be parsed and read by another language. JSON is pretty standard in the web services world for request/response serialization and there are JSON libraries in just about every language.
To literally write out a string including double quotes, you can also do something like this:
"""
The word "apple" is in double quotes.
"""
I find a slightly more structured format like JSON more useful, and a library like lift-json does the right thing in terms of quoting Strings and not quoting Ints, etc.

I think you are looking for something like JavaScript's eval() + JSON, and Python's eval(), str() and repr(). Essentially, you want Lispy symmetric meta-circular evaluation, meaning you can transform data into source code, and evaluating that source code will give you back the same data, right?
AFAIK, there's no equivalent of eval() in Scala. Daniel Spiewak has talked about this here before. However, if you really want to, I suggest the following:
1. Every collection object has three methods that let you transform its data into a string representation any way you want: mkString, addString and stringPrefix. Do something clever with them (think "decompiling" your in-memory ADTs back to source-code form) and you will arrive at step 2. Essentially, you can transform a list of integers created by List(1,2,3) back to the string "List(1,2,3)". For more basic literals like a simple string or integer, you'll need to pimp the built-in types using implicits to provide them with these toString (I'm overloading the term here) helper methods.
2. Now that you have your string representation, you can think about how to "interpret" or "evaluate" it. You will need an eval() function that creates a new instance of a parser combinator that understands Scala's literals and reassembles the data structure for you.
Implementing this actually sounds fun. Don't forget to post back here if you succeed in implementing it. :)
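A minimal sketch of the "decompile back to source form" idea from the answer above, not a complete solution. The Repr object and its repr method are hypothetical names introduced here. The sketch assumes it is enough to quote String and Char values and to recurse through case classes, tuples and collections; as a shortcut it takes the collection prefix from toString rather than from stringPrefix.

```scala
// The types from the question.
sealed trait PathSegment
case class P(x: String) extends PathSegment
case class V(x: Int) extends PathSegment

// Hypothetical helper: renders a value as text that resembles Scala source.
object Repr {
  // Escape characters that cannot appear raw inside a quoted literal.
  private def escape(c: Char, quote: Char): String = c match {
    case '\\' => "\\\\"
    case '\n' => "\\n"
    case '\r' => "\\r"
    case '\t' => "\\t"
    case q if q == quote => "\\" + q
    case other => other.toString
  }

  def repr(x: Any): String = x match {
    case null => "null"
    case s: String =>
      s.iterator.map(c => escape(c, '"')).mkString("\"", "", "\"")
    case c: Char =>
      "'" + escape(c, '\'') + "'"
    // Collections are checked before Product, because a non-empty List
    // is itself a case class (::) and would otherwise match below.
    case it: Iterable[_] =>
      // Shortcut (assumption): reuse the prefix that toString prints.
      val prefix = it.toString.takeWhile(_ != '(')
      it.iterator.map(repr).mkString(prefix + "(", ", ", ")")
    // Case classes and tuples: recurse into each field.
    case p: Product if p.productArity > 0 =>
      val prefix = if (p.productPrefix.startsWith("Tuple")) "" else p.productPrefix
      p.productIterator.map(repr).mkString(prefix + "(", ", ", ")")
    // Numbers, booleans, case objects and anything else print as-is.
    case other => other.toString
  }
}
```

With the P and V classes from the question, Repr.repr(List(P("paths"), P("/pets"), V(200))) yields List(P("paths"), P("/pets"), V(200)). Reading that text back into values still needs a parser, which this sketch does not attempt.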

Related

How can you dynamically format a string with a user-provided template and slice of parameters in Go?

I have user-provided format strings, and for each, I have a corresponding slice. For instance, I might have Test string {{1}}: {{2}} and ["number 1", "The Bit Afterwards"]. I want to generate Test string number 1: The Bit Afterwards from this.
The format of the user-provided strings is not fixed, and can be changed if need be. However, I cannot guarantee their sanity or safety; neither can I guarantee that any given character will not be used in the string, so any tags (like {} in my example) must be escapable. I also cannot guarantee that the same number of slice values will exist as tags in the template - for example, I might quite reasonably have Test string {{1}} and ["number 1", "another parameter", "yet another parameter"].
How can I efficiently format these strings, in accordance with the input given? They are for use as strings only, and don't require HTML, SQL or any other sort of escaping.
Things I've already considered:
fmt.Sprintf - two issues: 1) using it with user-provided templates is not ideal; 2) Sprintf does not play nicely with a number of parameters that doesn't match its format string, adding %!(EXTRA type=value) to the end.
The text/template library. This would work fine in theory, but I don't want to have to make users type out {{index .arr n}} for each and every one of their tags; in this case, I only ever need slice indexes.
The valyala/fasttemplate library. This is pretty much exactly what I'm looking for, but for the fact that it doesn't currently support escaping the delimiters it uses for its tags, at the time of writing. I've opened an issue for this, but I would have thought that there's already a solution to this problem somewhere - it doesn't feel like it's that unique.
Just writing my own parser for it. This would work... but, as above, I can't be the first person to have come across this!
Any advice or suggestions would be greatly appreciated.

Pass an explicit string to fmt.Sprint in Golang

After reading the docs and having a look through the source code, I'm a little unsure whether there are any benefits to passing an explicit string to the fmt.Sprint function.
For example, will this:
return fmt.Sprint("this is a string")
be more beneficial than doing this:
return "this is a string"
Am I correct in thinking that this function works better with types that conform to the Stringer interface?

Yes, Sprint is pretty much useless for a single string; it should be used when you have a bunch of objects, Stringers or not. It just concatenates their string representations. From the docs:
Sprint formats using the default formats for its operands and returns
the resulting string. Spaces are added between operands when neither
is a string.
So if you have just one string it's useless. If you have many arguments, especially variadic, it sure beats doing something like fmt.Sprintf("%v %v %v", foo, bar, baz), especially if you don't know how many elements you have.

drop,dropRight,take,takeRight vs substring?

I am learning Scala from Scala for the Impatient, and one of the Chapter 1 exercises asks:
What do the take, drop, takeRight, and dropRight string functions
do? What advantage or disadvantage do they have over using substring?
The only advantage I see is that drop (and its variants) will not throw IndexOutOfBoundsException.
For example:
scala> "Hello World!" dropRight 100
res26: String = ""
scala> "Hello World!" substring 100
java.lang.StringIndexOutOfBoundsException: String index out of range: -88
at java.lang.String.substring(String.java:1919)
... 33 elided
What else? Are they more memory efficient?

The main benefit is that it allows you to treat a String as a sequential collection of characters, much like any other Seq or List instance.
In fact, these methods (and other important transformational functions like map, flatMap and filter) are not implemented in String itself (which is, in fact, simply the Java String class, not a native-Scala class), but in the StringOps class (which extends StringLike -> ... -> SeqLike), and an implicit conversion ensures that a String is converted into a StringOps whenever you need access to these methods.
This means you can pass a String to a list-manipulation function and the function will receive a StringOps instance, work on it like any other SeqLike entity without needing to know it is actually a String, and hand back the results of the manipulation, which StringOps is designed to present back to you as a String.
If you know an entity is a String in a given piece of code, feel free to use the String-specific methods, but the availability of this implicit conversion means that you can also take advantage of a String's "character sequence"-like nature to treat it like any other list in situations where that may be convenient.

It seems that you are right. All these operations use the StringOps.slice method, which delegates to String.substring.
So, except for the overhead of wrapping the string and validating the bounds, it is the same call to substring.
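A short sketch of the two points made in these answers: the collection-style methods clamp out-of-range counts where substring throws, and a String can be handed to code written against Seq[Char]. StringSlicing and firstTwo are hypothetical names used only for illustration.

```scala
import scala.util.Try

object StringSlicing {
  val s = "Hello World!"

  // Collection-style methods clamp out-of-range counts instead of throwing.
  val dropped: String = s.dropRight(100) // empty string
  val taken: String = s.take(100)        // the whole string

  // substring validates its index and throws when it is out of range,
  // so the call is wrapped in Try to capture the exception.
  val sub: Try[String] = Try(s.substring(100))

  // A function written against Seq[Char] also accepts a String,
  // thanks to an implicit conversion provided by Predef.
  def firstTwo(xs: Seq[Char]): Seq[Char] = xs.take(2)
  val viaSeq: String = firstTwo(s).mkString
}
```

Here dropped is empty, taken equals the original string, sub is a Failure, and viaSeq is "He".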

Is it a reasonable practice to serialize Haskell data structures to disk just using Show/Read

I've played around with the Text.Show.Pretty module, and it makes it possible to serialize out Haskell data structures like records into a nice human-readable format & still be able to deserialize them easily using read. The output format is even more readable than YAML and JSON.
Example serialized output for a Haskell record using Text.Show.Pretty:
Book
  { author = "Plato"
  , title = "Republic"
  , numbers = [ 123
              , 1234
              ]
  }
Coming from the Ruby world, I know that YAML and JSON are most Rubyists' preferred format for serializing data structures. Are Haskell Show and Read instances used often to achieve the same end in Haskell?

For big structures, I wouldn't recommend it. read is slower than molasses. Anecdote time: I have a program named yeganesh. Conceptually, it's pretty simple: read in a [(String,Double)] with about 2000 elements and dump out the keys sorted by their elements. I used to store this using Show/Read, but found that switching to a custom printer and parser sped up the program by a factor of 8. (Note: it's not that the parsing sped up by a factor of eight. The whole program sped up by a factor of eight. That means the parsing sped up by a bigger factor than that.) That made the difference between uncomfortably long pauses and instant gratification.

I agree with Daniel Wagner, but if you want a file that a user can manipulate with a simple text editor, you could use read/show for a small set of data, e.g. config files.
I don't think that is a common approach among Haskellers, though. I usually use parsec instead of read for config data, and a custom class/instance instead of Show.
If you have a lot of data, one usually uses Data.Binary or Data.Serialize.

Why doesn't a string builder exist everywhere?

I sort of understand the motivation for a String Builder class, but do all languages have one? Should they? I'm thinking specifically of PHP, Perl, Python, and Ruby. I know C# and Java do. If the others don't, why not? Do they not suffer from the same implementation problem? Or do they not care?

Not all languages have a String builder.
C, for example, doesn't even have strings.
In C++, std::strings are mutable -- they can be changed, so there is no real need for a separate string builder class.
In C# (and the rest of .NET), strings are immutable: they cannot be changed, only replaced. This is the problem that creates the need for StringBuilder.
Technically, .NET strings are reference types pretending to be value types. This was done to make them act more like the native types (int, float, decimal).
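To illustrate the immutability problem in a JVM setting, where strings are immutable as they are in .NET, here is a small Scala sketch. BuilderDemo, joinNaive and joinWithBuilder are hypothetical names, and the sketch makes no claim about how large the difference is in practice.

```scala
object BuilderDemo {
  // Each += builds a brand-new String, copying everything accumulated so far.
  def joinNaive(parts: Seq[String]): String = {
    var out = ""
    for (p <- parts) out += p
    out
  }

  // A StringBuilder appends into one mutable buffer and
  // produces a String only once, at the end.
  def joinWithBuilder(parts: Seq[String]): String = {
    val sb = new StringBuilder
    for (p <- parts) sb.append(p)
    sb.toString
  }
}
```

Both functions return the same result; they differ only in how many intermediate strings are allocated along the way.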

There is no need for string builders when string streams exist: file-like objects used to construct strings.
For example, Python has StringIO:
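from io import StringIO  # Python 3; the cStringIO module exists only in Python 2
sio = StringIO()
sio.write("Hello")
sio.write(" world!!")
sio.write(str(111))  # write() accepts only strings, so convert the int first
sio.write('!')
print(sio.getvalue())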
Hello world!!111!
Ruby has its own StringIO too. In C++, the equivalent is std::stringstream.
