Why does the Data.String.IsString typeclass only define one conversion?

Why does the Haskell base package define the IsString class with only a conversion from String to a string-like value, and not the inverse transformation, from a string-like value to String?
The class should be defined as:
class IsString a where
  fromString :: String -> a
  toString :: a -> String
ref: http://hackage.haskell.org/packages/archive/base/4.4.0.0/doc/html/Data-String.html

The reason is, IMHO, that IsString's primary purpose is to support string literals in Haskell source code (or (E)DSLs; see also Paradise: A two-stage DSL embedded in Haskell) via the OverloadedStrings language extension, analogous to how other polymorphic literals work (e.g. fromRational for floating-point literals or fromInteger for integer literals).
The term IsString might be a bit misleading, as it suggests that the type-class represents string-like structures, whereas it's really just to denote types which have a quoted-string-representation in Haskell source code.

If you want toString :: a -> String, I think you're simply forgetting about show, or more properly show :: Show a => a -> String.
If you want to operate on a type that has both an a -> String and a String -> a conversion, you can simply put both type-class constraints on the function:
doubleConstraintedFunction :: (Show a, IsString a) => a -> .. -> .. -> a
Note that we deliberately avoid defining type classes whose functions could just as well be split into two separate classes. Therefore we don't put toString in IsString.
Finally, I must also mention Read, which provides read :: Read a => String -> a. You use read and show for very simple serialization. fromString from IsString has a different purpose: it is useful with the OverloadedStrings language pragma, which lets you conveniently write code like "This is not a string" :: Text. (Text is an efficient data structure for strings.)
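To make the "constrain the function instead of enlarging the class" point concrete, here is a minimal sketch using only base. The Name type and the roundTrip function are made up for illustration; note that show renders Haskell syntax rather than returning the wrapped String, so it is only a rough stand-in for a toString.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.String (IsString (..))

-- Hypothetical string-like wrapper, used only for illustration.
newtype Name = Name String
  deriving (Eq, Show)

instance IsString Name where
  fromString = Name

-- With OverloadedStrings the literal is elaborated to fromString "alice".
alice :: Name
alice = "alice"

-- show produces Haskell syntax for the value, not the wrapped String.
rendered :: String
rendered = show alice

-- Both directions at once: put two constraints on the function
-- instead of adding a toString method to IsString.
roundTrip :: (Show a, IsString a) => a -> a
roundTrip = fromString . show
```

Loading this in GHCi and evaluating rendered shows the constructor name and quotes included in the output, which is why show and fromString are not inverses in general.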

Related

How can I save a variable as a bytestring?

Ik this is a dumb question, but if I have this:
a :: B.ByteString
a = "a"
I get an error that says "Couldn't match type B.ByteString with type [Char]". I know what's the problem but I don't know how to fix it, could you help? thx.
Character string literals in Haskell are, by default, always treated as String, which is equivalent to [Char]. Most string-like data structures define a function called pack to convert from String, and the bytestring package is no exception. (Note that this is pack from Data.ByteString.Char8; the one in Data.ByteString converts from [Word8].)
import qualified Data.ByteString as B
import Data.ByteString.Char8 (pack)
a :: B.ByteString
a = pack "a"
However, GHC also supports an extension called OverloadedStrings. If you're willing to enable this, ByteString implements a typeclass called IsString. With this extension enabled, the type of a string literal like "a" is no longer [Char] and is instead forall a. IsString a => a (similar to how the type of numerical literals like 3 is forall a. Num a => a). This will happily specialize to ByteString if the type is in scope.
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString as B
a :: B.ByteString
a = "a"
If you go this route, make sure you understand the proviso listed in the docs for this instance. For ASCII characters, it won't pose a problem, but if your string has Unicode characters outside the ASCII range, you need to be aware of it.
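A small sketch of that proviso, assuming the bytestring package that ships with GHC. The names ascii and nonAscii are just for illustration. The IsString instance goes through the Char8 packing, which keeps only the low 8 bits of each character.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString.Char8 as B

-- ASCII literals come through unchanged.
ascii :: B.ByteString
ascii = "abc"

-- '\955' is the Greek letter lambda (U+03BB). Only the low byte 0xBB
-- survives, so the literal silently becomes a single different byte.
nonAscii :: B.ByteString
nonAscii = "\955"
```

If you need non-ASCII text in a ByteString, it is usually safer to build a Text value and encode it explicitly (for example as UTF-8) rather than rely on the literal.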

How does OverloadedStrings language extension work?

I am trying to understand the language extension OverloadedStrings from the page https://ocharles.org.uk/posts/2014-12-17-overloaded-strings.html.
When the OverloadedStrings is enabled, then String becomes a type Data.String.IsString a => a:
Prelude Data.String> :t fromString "Foo"
fromString "Foo" :: IsString a => a
In the description, the author has mentioned the following:
By enabling this extension, string literals are now a call to the
fromString function, which belongs to the IsString type class.
What does "string literals are now a call to the fromString function" mean?
and also the author has mentioned:
This polymorphism is extremely powerful, and it allows us to write
embedded domain specific languages in Haskell source code, without
having to introduce new constructs for otherwise normal values.
What does "without having to introduce new constructs for otherwise normal values" mean?
When the OverloadedStrings is enabled, then String becomes a type Data.String.IsString a => a
No, that is incorrect. A String remains a String. The extension only affects string literals, not variables of type String, and a literal can still be used as a String.
What does string literals are now a call to the fromString function?
It means that if you write a string literal, like "foo", the compiler implicitly treats it as fromString "foo", and thus you can use it at any type that has an IsString instance.
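As a rough illustration of that rewriting, the two definitions below are equal. Greeting is a made-up type whose fromString deliberately does something visible, so you can see that the function really is called on the literal.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.String (IsString (..))

-- Hypothetical type whose fromString visibly transforms its input.
newtype Greeting = Greeting String
  deriving (Eq, Show)

instance IsString Greeting where
  fromString s = Greeting ("Hello, " ++ s)

-- A bare literal at type Greeting ...
viaLiteral :: Greeting
viaLiteral = "world"

-- ... means the same as an explicit call on a plain String.
viaCall :: Greeting
viaCall = fromString ("world" :: String)

-- A value annotated as String is still an ordinary String.
plain :: String
plain = "world"
```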
what does without having to introduce new constructs for otherwise normal values mean?
It means that we can make our own types for which we can write some sort of "mini-parser", and thus write these objects as string literals in our code. For example if we make a datatype like:
newtype BoolList = BoolList [Bool] deriving Show
then we can write our own parser:
instance IsString BoolList where
  fromString = BoolList . map toBool
    where toBool '1' = True
          toBool _ = False
Now we can for example define a list of Bools as:
myboollist :: BoolList
myboollist = "10110010001"
So then we get:
Prelude Data.String> myboollist
BoolList [True,False,True,True,False,False,True,False,False,False,True]
Here we wrote the string literal "10110010001", which means that implicitly we wrote fromString "10110010001". Since the type of myboollist is BoolList, it is clear which type the string literal is parsed into.
This can be useful if some data types are complex, or would take a lot of code to construct.
However, the fromString call is only evaluated at runtime, and frequently not all possible strings map to a value of the type (here they do, although it is debatable whether it is good to fill in False for everything other than '1'). It can therefore raise errors at runtime when the string turns out to be "unparsable".
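One way to make that trade-off explicit is sketched below. The parseBoolList function is a hypothetical addition, not part of the original answer: it is a total parser returning Maybe, and the instance is a stricter variant that fails loudly on bad literals instead of mapping them to False.

```haskell
import Data.String (IsString (..))

newtype BoolList = BoolList [Bool]
  deriving (Eq, Show)

-- Total parser: rejects any character other than '0' or '1'.
parseBoolList :: String -> Maybe BoolList
parseBoolList = fmap BoolList . traverse toBool
  where
    toBool '1' = Just True
    toBool '0' = Just False
    toBool _   = Nothing

-- Stricter variant of the instance: a malformed literal raises an
-- error when it is evaluated, since fromString cannot report failure
-- in its type.
instance IsString BoolList where
  fromString s = case parseBoolList s of
    Just bs -> bs
    Nothing -> error ("invalid BoolList literal: " ++ show s)
```

For strings that come from user input rather than from source code, calling parseBoolList directly avoids the runtime error altogether.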
what does without having to introduce new constructs for otherwise normal values mean?
The next sentence says
So why should string literals be any different?
so this one refers primarily to number literals. Consider e.g. a type defining polynomials. Because + and * can only be applied to arguments of the same type, if we want
2*x^3 + 3*x :: Poly Int
to be legal, 2 and 3 have to be of type Poly Int; otherwise you'd need either
a separate operator to multiply a polynomial by a number: 2.*x^3 + 3.*x, or
a constructor for a constant polynomial: (C 2)*x^3 + (C 3)*x
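Here is a minimal sketch of such a polynomial type, assuming a dense list of coefficients in ascending order of degree. The representation and the helper names addCoeffs and mulCoeffs are my own choices for illustration. The point is that fromInteger is what lets the literals 2 and 3 stand for constant polynomials.

```haskell
-- Poly [c0, c1, c2] represents c0 + c1*x + c2*x^2.
newtype Poly a = Poly [a]
  deriving (Eq, Show)

-- Add coefficient lists position by position, keeping the longer tail.
addCoeffs :: Num a => [a] -> [a] -> [a]
addCoeffs (p:ps) (q:qs) = (p + q) : addCoeffs ps qs
addCoeffs ps     []     = ps
addCoeffs []     qs     = qs

-- Schoolbook multiplication: scale by the head, shift the rest by one degree.
mulCoeffs :: Num a => [a] -> [a] -> [a]
mulCoeffs []     _  = []
mulCoeffs (p:ps) qs = addCoeffs (map (p *) qs) (0 : mulCoeffs ps qs)

instance Num a => Num (Poly a) where
  Poly ps + Poly qs = Poly (addCoeffs ps qs)
  Poly ps * Poly qs = Poly (mulCoeffs ps qs)
  negate (Poly ps)  = Poly (map negate ps)
  -- Integer literals become constant polynomials.
  fromInteger n     = Poly [fromInteger n]
  abs    = error "abs: not defined for Poly"
  signum = error "signum: not defined for Poly"

-- The indeterminate x.
x :: Num a => Poly a
x = Poly [0, 1]

-- No extra operator or constructor needed for the constants.
example :: Poly Int
example = 2*x^3 + 3*x
```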
An example for string literals is given at the end:
However, SQL queries are notorious for injection attacks when we concatenate strings. Interestingly, postgresql-simple provides a Query type that only has a IsString instance. This means that it’s very lightweight to write a literal query, but the moment we want to start concatenating strings for our query, we have to be very explicit.

IsString instance not automatically converted to String

I have a type a which is an instance of the IsString typeclass.
If I use something like
"foobar" :: a
everything works fine.
As soon as I use a function that returns a string, as in
("foo" ++ "bar") :: a
I get a compilation error telling me that
Couldn't match expected type ‘a’ with actual type ‘[Char]’
Expected type: a
Actual type: String
Notice that I have the {-# LANGUAGE OverloadedStrings #-} pragma.
Is there something else I should do to solve the compilation error?
The idea of the IsString typeclass is to specify that we can convert a String to a value of such a type (with the fromString :: String -> a function). Furthermore, by enabling the OverloadedStrings pragma, we can also write values of type a as string literals (in that case these string literals are transparently converted to a by calling the fromString function).
Note however that IsString does not provide a way to convert values of type a back to Strings. Furthermore, functions that are defined on Strings cannot be used for such instances (at least not without doing some implementation work).
If you write:
("foo" ++ "bar") :: a
Haskell will infer that you call (++) :: [b] -> [b] -> [b], so it knows that the type of these string literals is IsString [b] => [b]. That means that a ~ [b]. Since your type is probably not a list, there is no way that this can match.
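Two possible ways around this are sketched below, using a made-up Label type. Either concatenate as String and convert once with an explicit fromString, or give the type its own Semigroup instance so the literals can stay at your type. Which fits better depends on what your type actually is.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.String (IsString (..))

-- Hypothetical string-like type standing in for the asker's type.
newtype Label = Label String
  deriving (Eq, Show)

instance IsString Label where
  fromString = Label

-- Option 1: do the (++) on Strings, then convert explicitly.
viaFromString :: Label
viaFromString = fromString ("foo" ++ "bar")

-- Option 2: provide concatenation for the type itself.
instance Semigroup Label where
  Label a <> Label b = Label (a ++ b)

-- Both literals are now Labels, combined with (<>) instead of (++).
viaSemigroup :: Label
viaSemigroup = "foo" <> "bar"
```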

Tagging a string with corresponding symbol

I would like an easy way to create a String tagged with itself. Right now I can
do something like:
data TagString :: Symbol -> * where
  Tag :: String -> TagString s
  deriving Show
tag :: KnownSymbol s => Proxy s -> TagString s
tag s = Tag (symbolVal s)
and use it like
tag (Proxy :: Proxy "blah")
But this is not nice because
The guarantee about the tag is only provided by tag not by the GADT.
Every time I want to create a value I have to provide a type signature, which
gets unwieldy if the value is part of some bigger expression.
Is there any way to improve this, preferably going in the opposite direction, i.e. from String to Symbol? I would like to write Tag "blah" and have ghc infer the type
TagString "blah".
GHC.TypeLits provides the someSymbolVal function which looks somewhat
related but it produces a SomeSymbol, not a Symbol and I can't quite grasp how to use
it.
Is there any way to improve this, preferably going in the opposite direction, i.e. from String to Symbol?
There is no way to go directly from String to Symbol, because Haskell isn't dependently typed, unfortunately. You do have to write out a type annotation every time you want a new value and there isn't an existing tag with the desired symbol already around.
The guarantee about the tag is only provided by tag not by the GADT.
The following should work well (in fact, the same type can be found in the singletons package):
data SSym :: Symbol -> * where
  SSym :: KnownSymbol s => SSym s
-- defining values
sym1 = SSym :: SSym "foo"
sym2 = SSym :: SSym "bar"
This type essentially differs from Proxy only by having the KnownSymbol dictionary in the constructor. The dictionary lets us recover the string contained within even if the symbol is not known statically:
extractString :: SSym s -> String
extractString s@SSym = symbolVal s
We pattern matched on SSym, thereby bringing into scope the implicit KnownSymbol dictionary. The same doesn't work with a mere Proxy:
extractString' :: forall (s :: Symbol). Proxy s -> String
extractString' p@Proxy = symbolVal p
-- type error, we can't recover the string from anywhere
... it produces a SomeSymbol, not a Symbol and I can't quite grasp how to use it.
SomeSymbol is like SSym except it hides the string it carries around so that it doesn't appear in the type. The string can be recovered by pattern matching on the constructor.
extractString'' :: SomeSymbol -> String
extractString'' (SomeSymbol proxy) = symbolVal proxy
It can be useful when you want to manipulate different symbols in bulk, for example you can put them in a list (which you can't do with different SSym-s, because their types differ).
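For completeness, a small sketch of the runtime direction using someSymbolVal from GHC.TypeLits. The helper names wrap, unwrap and roundTripAll are mine. The symbol's name never shows up in a type, which is exactly why a list of them typechecks.

```haskell
import GHC.TypeLits (SomeSymbol (..), someSymbolVal, symbolVal)

-- Turn a runtime String into a symbol whose type-level name is hidden.
wrap :: String -> SomeSymbol
wrap = someSymbolVal

-- Matching on SomeSymbol brings the KnownSymbol dictionary into scope,
-- so symbolVal can recover the String.
unwrap :: SomeSymbol -> String
unwrap (SomeSymbol proxy) = symbolVal proxy

-- Different symbols can share one list because the name is not in the type.
roundTripAll :: [String] -> [String]
roundTripAll = map (unwrap . wrap)
```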

Deserializing an existential data type

I need to write a Serialize instance for the following data type:
data AnyNode = forall n . (Typeable n, Serialize n) => AnyNode n
Serializing this is no problem, but I can't implement deserialization, since the compiler has no way to resolve the specific instance of Serialize n, since the n is isolated from the outer scope.
There's been a related discussion in 2006. I am now wondering whether any sort of solution or a workaround has arrived today.
You just tag the type when you serialize, and use a dictionary to untag the type when you deserialize. Here's some pseudocode omitting error checking etc:
serialAnyNode (AnyNode x) = serialize (typeOf x, serialize x)
deserialAnyNode s = case deserialize s of
  (typ,bs) -> case typ of
    "String" -> AnyNode (deserialize bs :: String)
    "Int" -> AnyNode (deserialize bs :: Int)
    ....
Note that you can only deserialize a closed universe of types with your function. With some extra work, you can also deserialize derived types like tuples, maybes and eithers.
But if I were to declare an entirely new type "Gotcha" deriving Typeable and Serialize, deserialAnyNode of course couldn't deal with it without extension.
You need to have some kind of centralised "registry" of deserialization functions so you can dispatch on the actual type (extracted from the Typeable information). If all types you want to deserialize are in the same module this is pretty easy to set up. If they are in multiple modules you need to have one module that has the mapping.
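A rough sketch of such a registry, using only base. To keep it self-contained, Show and Read stand in for a real Serialize class, and the serialized form is just a pair of a type tag and a payload; both are assumptions of this sketch rather than how cereal or binary would do it.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Data.Typeable (Proxy (..), Typeable, typeOf, typeRep)
import Text.Read (readMaybe)

-- Show/Read stand in for Serialize in this sketch.
data AnyNode = forall n. (Typeable n, Show n, Read n) => AnyNode n

-- Serialized form: (type tag, payload).
serialAnyNode :: AnyNode -> (String, String)
serialAnyNode (AnyNode x) = (show (typeOf x), show x)

-- The central registry: one decoder per supported type, keyed by its tag.
registry :: [(String, String -> Maybe AnyNode)]
registry =
  [ (show (typeRep (Proxy :: Proxy Int)),    \s -> AnyNode <$> (readMaybe s :: Maybe Int))
  , (show (typeRep (Proxy :: Proxy String)), \s -> AnyNode <$> (readMaybe s :: Maybe String))
  ]

-- Dispatch on the tag; unknown tags and malformed payloads give Nothing.
deserialAnyNode :: (String, String) -> Maybe AnyNode
deserialAnyNode (tag, payload) = lookup tag registry >>= ($ payload)
```

Adding a new type means adding one line to registry, which is the "closed universe" limitation described above.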
If your collection of types is more dynamic and not easily available at compile time, you can perhaps use dynamic linking to gain access to the deserializers. For each type that you want to deserialize you export a C callable function with a name derived from the Typeable information (you could use TH to generate these). Then at runtime, when you want to deserialize a type, generate the same name, then use the dynamic linker to get hold of the address of the function, and then an FFI wrapper to get a Haskell callable function. This is a rather involved process, but it can be wrapped up in a library. No, sorry, I don't have such a library.
It's hard to tell what you're asking here, exactly. You can certainly pick a particular type T, deserialize a ByteString to it, and store it in an AnyNode. That doesn't do the user of an AnyNode much good, though -- you still picked T, after all. If it wasn't for the Typeable constraint, the user wouldn't even be able to tell what the type is (so let's get rid of the Typeable constraint because it makes things messier). Maybe what you want is a universal instead of an existential.
Let's split Serialize up into two classes -- call them Read and Show -- and simplify them a bit (so e.g. read can't fail).
So we have
class Show a where show :: a -> String
class Read a where read :: String -> a
We can make an existential container for a Show-able value:
data ShowEx where
  ShowEx :: forall a. Show a => a -> ShowEx
-- non-GADT: data ShowEx = forall a. Show a => ShowEx a
But of course ShowEx is isomorphic to String, so there isn't a whole lot of point to this. But note that an existential for Read has even less point:
data ReadEx where
  ReadEx :: forall a. Read a => a -> ReadEx
-- non-GADT: data ReadEx = forall a. Read a => ReadEx a
When I give you a ReadEx -- i.e. ∃a. Read a *> a -- it means that you have a value of some type, and you don't know what the type is, but you can read a String into another value of the same type. But you can't do anything with it! read only produces values of type a, but that doesn't do you any good when you don't know what a is.
What you might want with Read would be a type that lets the caller choose -- i.e., a universal. Something like
newtype ReadUn where
  ReadUn :: (forall a. Read a => a) -> ReadUn
-- non-GADT: newtype ReadUn = ReadUn (forall a. Read a => a)
(Like ReadEx, you could make ShowUn -- i.e. ∀a. Show a => a -- and it would be just as useless.)
Note that ShowEx is essentially the argument to show -- i.e. show :: (∃a. Show a *> a) -> String -- and ReadUn is essentially the return value of read -- i.e. read :: String -> (∀a. Read a => a).
So what are you asking for, an existential or a universal? You can certainly make something like ∀a. (Show a, Read a) => a or ∃a. (Show a, Read a) *> a, but neither does you much good here. The real issue is the quantifier.
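To see the difference in who picks the type, here is a small sketch using the Prelude's own Show and Read rather than the simplified classes above. The function names render, parse, asInt and asDoubles are made up for this example.

```haskell
{-# LANGUAGE ExistentialQuantification, RankNTypes #-}

-- Existential: the producer picks the type, the consumer only knows Show.
data ShowEx = forall a. Show a => ShowEx a

-- Universal: the consumer picks the type, the producer must work for all.
newtype ReadUn = ReadUn (forall a. Read a => a)

-- All a consumer can do with a ShowEx is show it.
render :: ShowEx -> String
render (ShowEx x) = show x

-- The producer cannot commit to a type here.
parse :: String -> ReadUn
parse s = ReadUn (read s)

-- The consumer chooses Int ...
asInt :: ReadUn -> Int
asInt (ReadUn r) = r

-- ... or something else entirely, from the same kind of value.
asDoubles :: ReadUn -> [Double]
asDoubles (ReadUn r) = r
```

A deserializer for AnyNode has to pick the hidden type itself, which neither of these shapes gives you for free; that is where a type tag or registry comes in.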
(I asked a question a while ago where I talked about some of this in another context.)

Resources