Deserializing many network messages without using an ad-hoc parser implementation

Deserializing many network messages without using an ad-hoc parser implementation - haskell

I have a question pertaining to deserialization. I can envision a solution using Data.Data, Data.Typeable, or with GHC.Generics, but I'm curious if it can be accomplished without generics, SYB, or meta-programming.
Problem Description:
Given a list of [String] that is known to contain the fields of a locally defined algebraic data type, I would like to deserialize the [String] to construct the target data type. I could write a parser to do this, but I'm looking for a generalized solution that will deserialize to an arbitrary number of data types defined within the program without writing a parser for each type. With knowledge of the number and type of value constructors an algebraic type has, it's as simple as performing a read on each string to yield the appropriate values necessary to build up the type. However, I don't want to use generics, reflection, SYB, or meta-programming (unless it's otherwise impossible).
Say I have around 50 types defined similar to this (all simple algebraic types composed of basic primitives (no nested or recursive types, just different combinations and orderings of primitives) :
data NetworkMsg = NetworkMsg { field1 :: Int, field2 :: Int, field3 :: Double}
data NetworkMsg2 = NetworkMsg2 { field1 :: Double, field2 :: Int, field3 :: Double }
I can determine the data-type to be associated with a [String] I've received over the network using a tag id that I parse before each [String].
Possible conjectured solution path:
Since data constructors are first-class values in Haskell, and actually have a type-- Can NetworkMsg constructor be thought of as a function, such as:
NetworkMsg :: Int -> Int -> Double -> NetworkMsg
Could I transform this function into a function on tuples using uncurryN then copy the [String] into a tuple of the same shape the function now takes?
NetworkMsg' :: (Int, Int, Double) -> NetworkMsg
I don't think this would work because I'd need knowledge of the value constructors and type information, which would require Data.Typeable, reflection, or some other metaprogramming technique.
Basically, I'm looking for automatic deserialization of many types without writing type instance declarations or analyzing the type's shape at run-time. If it's not feasible, I'll do it an alternative way.

You are correct in that the constructors are essentially just functions so you can write generic instances for any number of types by just writing instances for the functions. You'll still need to write a separate instance
for all the different numbers of arguments, though.
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
import Text.Read
import Control.Applicative
class FieldParser p r where
parseFields :: p -> [String] -> Maybe r
instance Read a => FieldParser (a -> r) r where
parseFields con [a] = con <$> readMaybe a
parseFields _ _ = Nothing
instance (Read a, Read b) => FieldParser (a -> b -> r) r where
parseFields con [a, b] = con <$> readMaybe a <*> readMaybe b
parseFields _ _ = Nothing
instance (Read a, Read b, Read c) => FieldParser (a -> b -> c -> r) r where
parseFields con [a, b, c] = con <$> readMaybe a <*> readMaybe b <*> readMaybe c
parseFields _ _ = Nothing
{- etc. for as many arguments as you need -}
Now you can use this type class to parse any message based on the constructor as long as the type-checker is able to figure out the resulting message type from context (i.e. it is not able to deduce it simply from the given constructor for these sort of multi-param type class instances).
data Test1 = Test1 {fieldA :: Int} deriving Show
data Test2 = Test2 {fieldB ::Int, fieldC :: Float} deriving Show
test :: String -> [String] -> IO ()
test tag fields = case tag of
"Test1" -> case parseFields Test1 fields of
Just (a :: Test1) -> putStrLn $ "Succesfully parsed " ++ show a
Nothing -> putStrLn "Parse error"
"Test2" -> case parseFields Test2 fields of
Just (a :: Test2) -> putStrLn $ "Succesfully parsed " ++ show a
Nothing -> putStrLn "Parse error"
I'd like to know how exactly you use the message types in the application, though, because having each message as its separate type makes it very difficult to have any sort of generic message handler.
Is there some reason why you don't simply have a single message data type? Such as
data NetworkMsg
= NetworkMsg1 {fieldA :: Int}
| NetworkMsg2 {fieldB :: Int, fieldC :: Float}
Now, while the instances are built in pretty much the same way, you get much better type inference since the result type is always known.
instance Read a => MessageParser (a -> NetworkMsg) where
parseMsg con [a] = con <$> readMaybe a
instance (Read a, Read b) => MessageParser (a -> b -> NetworkMsg) where
parseMsg con [a, b] = con <$> readMaybe a <*> readMaybe b
instance (Read a, Read b, Read c) => MessageParser (a -> b -> c -> NetworkMsg) where
parseMsg con [a, b, c] = con <$> readMaybe a <*> readMaybe b <*> readMaybe c
parseMessage :: String -> [String] -> Maybe NetworkMsg
parseMessage tag fields = case tag of
"NetworkMsg1" -> parseMsg NetworkMsg1 fields
"NetworkMsg2" -> parseMsg NetworkMsg2 fields
_ -> Nothing
I'm also not sure why you want to do type-generic programming specifically without actually using any of the tools meant for generics. GHC.Generics, SYB or Template Haskell is usually the best solution for this kind of problem.

Related

RPC (Or: How do I disambiguate function application based on TypeRep values?)

I'm building some infrastructure for doing remote procedure calls in Haskell, and for reasons that are too long to explain here, I cannot reuse existing libraries.
So here's the setup: I have a type class for serializing and deserializing data:
class Serializable a where
encode :: a -> B.ByteString
decode :: B.ByteString -> Maybe a
maxSize :: a -> Int
where B is Data.ByteString.
I can use this to implement serialization of integers, booleans, lists of serializables, tuples of serializables ect.
Now I want to send some arguments across a network to a server, which then performs a computation based on these arguments, and sends back a result. So I create an existential type representing things that can be serialized:
data SerializableExt = forall t . Serializable t => SerializableExt t
because I want to send something of type [SerializableExt].
So, of course, I need to create an instance Serializable SerializableExt. This is where the problem starts:
In order to implement decode :: B.ByteString -> Maybe SerializableExt I need to know the concrete type that the existential type SerializableExt wraps.
So I implement encode :: SerializableExt -> B.ByteString as serializing the concrete type along with the value:
encode (SerializableExt x) = encode (typeOf x, x)
using typeOf from Data-Typeable. The problem is now the implementation of decode :: B.ByteString -> Maybe SerializableExt:
decode bs =
let (tyenc, xenc) = splitPair bs -- Not really important. It just splits bs into the two components
in case (decode tyenc :: Maybe TypeRep) of
Just ty -> SerializableExt <$> _ -- Somehow invoke decode xenc, where the choice of which decode to execute depends on the value of ty.
_ -> Nothing
But I can't see how to fill in the hole here. Because of Haskell's separation of the value level and the type level I can't use the value of ty to disambiguate the invocation of decode xenc, right?
Is there a way to solve this issue, and actually put something in the hole which will do what I want? Or can you come up with another design?
EDIT: One way of doing it would be the following:
decode bs =
let (tyenc, xenc) = splitPair bs
in SerializableExt <$>
case (decode tyenc :: Maybe TypeRep) of
Just ty
| ty == typeRep (Proxy :: Proxy Int) -> decode xenc :: Maybe Int
| ty = typeRep (Proxy :: Proxy ()) -> decode xenc :: Maybe ()
| ...
_ -> Nothing
but this is bad for several reasons:
It's tedious to extend.
It cannot handle pairs (or generally: tuples) generically; every
combination of types needs to be handled.
It's not very Haskelly

Data.Dynamic lets us put arbitrary Haskell values into a single container, and get them out again in a type-safe way. That's a good start towards inter-process communication; I'll come back to serialization below.
We can write a program that takes a list of Dynamic values, checks for the number & types it needs, and returns a result in the same way.
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
-- | Experiments with type-safe serialization.
module Main where
import Data.Proxy
import Data.Dynamic
import Data.Foldable
import Data.Type.Equality
import Type.Reflection
foo :: Int -> String -> String
foo i s = concat (replicate i s)
actor :: [Dynamic] -> Either String [Dynamic]
actor (di : ds : _) = case (fromDynamic di, fromDynamic ds) of
(Just i, Just s) -> Right [toDyn (foo i s)]
_ -> Left "Wrong types of arguments"
actor _ = Left "Not enough arguments"
caller :: Either String [Dynamic]
caller = actor [ toDyn (3::Int), toDyn "bar" ]
main :: IO ()
main = case caller of
Left err -> putStrLn err
Right dyns -> for_ dyns (\d -> case fromDynamic d of
Just s -> putStrLn s
Nothing -> print d)
We can use a TypeRep to guide selection of a class instance. (For ease of testing my code, I used String.)
class Serial a where
encode :: a -> String
decode :: String -> Maybe a
decodeAs :: Serial a => TypeRep a -> String -> Maybe a
decodeAs _ s = decode s
Finally, we'd like to serialize the TypeRep, and when decoding, check that the encoded type matches the type that we're decoding at.
instance Serial SomeTypeRep
encodeDyn :: (Typeable a, Serial a) => a -> (String, String)
encodeDyn a = (encode (SomeTypeRep (typeOf a)), encode a)
decodeDynamic :: forall a. (Typeable a, Serial a) => String -> String -> Maybe a
decodeDynamic tyStr aStr = case decode tyStr of
Nothing -> Nothing
Just (SomeTypeRep ty) ->
case eqTypeRep ty (typeRep :: TypeRep a) of
Nothing -> Nothing
Just HRefl -> decodeAs ty aStr

Is it possible to ensure that two GADT type variables are the same without dependent types?

I'm writing a compiler where I'm using GADTs for my IR but standard data types for my everything else. I'm having trouble during the conversion from the old data type to the GADT. I've attempted to recreate the situation with a smaller/simplified language below.
To start with, I have the following data types:
data OldLVal = VarOL Int -- The nth variable. Can be used to construct a Temp later.
| LDeref OldLVal
data Exp = Var Int -- See above
| IntT Int32
| Deref Exp
data Statement = AssignStmt OldLVal Exp
| ...
I want to convert these into this intermediate form:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
-- Note: this is a Phantom type
data Temp a = Temp Int
data Type = IntT
| PtrT Type
data Command where
Assign :: NewLVal a -> Pure a -> Command
...
data NewLVal :: Type -> * where
VarNL :: Temp a -> NewLVal a
DerefNL :: NewLVal ('PtrT ('Just a)) -> NewLVal a
data Pure :: Type -> * where
ConstP :: Int32 -> Pure 'IntT
ConstPtrP :: Int32 -> Pure ('PtrT a)
VarP :: Temp a -> Pure a
At this point, I just want to write a conversion from the old data type to the new GADT. For right now, I have something that looks like this.
convert :: Statement -> Either String Command
convert (AssignStmt oldLval exp) = do
newLval <- convertLVal oldLval -- Either String (NewLVal a)
pure <- convertPure exp -- Either String (Pure b)
-- return $ Assign newLval pure -- Obvious failure. Can't ensure a ~ b.
pure' <- matchType newLval pure -- Either String (Pure a)
return $ Assign newLval pure'
-- Converts Pure b into Pure a. Should essentially be a noop, but simply
-- proves that it is possible.
matchType :: NewLVal a -> Pure b -> Either String (Pure a)
matchType = undefined
I realized that I couldn't write convert trivially, so I attempted to solve the problem using this idea of matchType which acts as a proof that these two types are indeed equal. The question is: how do I actually write matchType? This would be much easier if I had fully dependent types (or so I'm told), but can I finish this code here?
An alternative to this would be to somehow provide newLval as an argument to convertPure, but I think that essentially is just attempting to use dependent types.
Any other suggestions are welcome.
If it helps, I also have a function that can convert an Exp or OldLVal to its type:
class Typed a where
typeOf :: a -> Type
instance Typed Exp where
...
instance Typed OldLVal where
...
EDIT:
Thanks to the excellent answers below, I've been able to finish writing this module.
I ended up using the singletons package mentioned below. It was a little strange at first, but I found it pretty reasonable to use after I started understanding what I was doing. However, I did run into one pitfall: The type of convertLVal and convertPure requires an existential to express.
data WrappedPure = forall a. WrappedPure (Pure a, SType a)
data WrappedLVal = forall a. WrappedLVal (NewLVal a, SType a)
convertPure :: Exp -> Either String WrappedPure
convertLVal :: OldLVal -> Either String WrappedLVal
This means that you'll have to unwrap that existential in convert, but otherwise, the answers below show you the way. Thanks so much once again.

You want to perform a comparison at runtime on some type level data (namely the Types by which your values are indexed). But by the time you run your code, and the values start to interact, the types are long gone. They're erased by the compiler, in the name of producing efficient code. So you need to manually reconstruct the type level data that was erased, using a value which reminds you of the type you'd forgotten you were looking at. You need a singleton copy of Type.
data SType t where
SIntT :: SType IntT
SPtrT :: SType t -> SType (PtrT t)
Members of SType look like members of Type - compare the structure of a value like SPtrT (SPtrT SIntT) with that of PtrT (PtrT IntT) - but they're indexed by the (type-level) Types that they resemble. For each t :: Type there's precisely one SType t (hence the name singleton), and because SType is a GADT, pattern matching on an SType t tells the type checker about the t. Singletons span the otherwise strictly-enforced separation between types and values.
So when you're constructing your typed tree, you need to track the runtime STypes of your values and compare them when necessary. (This basically amounts to writing a partially verified type checker.) There's a class in Data.Type.Equality containing a function which compares two singletons and tells you whether their indexes match or not.
instance TestEquality SType where
-- testEquality :: SType t1 -> SType t2 -> Maybe (t1 :~: t2)
testEquality SIntT SIntT = Just Refl
testEquality (SPtrT t1) (SPtrT t2)
| Just Refl <- testEquality t1 t2 = Just Refl
testEquality _ _ = Nothing
Applying this in your convert function looks roughly like this:
convert :: Statement -> Either String Command
convert (AssignStmt oldLval exp) = do
(newLval, newLValSType) <- convertLVal oldLval
(pure, pureSType) <- convertPure exp
case testEquality newLValSType pureSType of
Just Refl -> return $ Assign newLval pure'
Nothing -> Left "type mismatch"
There actually aren't a whole lot of dependently typed programs you can't fake up with TypeInType and singletons (are there any?), but it's a real hassle to duplicate all of your datatypes in both "normal" and "singleton" form. (The duplication gets even worse if you want to pass singletons around implicitly - see Hasochism for the details.) The singletons package can generate much of the boilerplate for you, but it doesn't really alleviate the pain caused by duplicating the concepts themselves. That's why people want to add real dependent types to Haskell, but we're a good few years away from that yet.
The new Type.Reflection module contains a rewritten Typeable class. Its TypeRep is GADT-like and can act as a sort of "universal singleton". But programming with it is even more awkward than programming with singletons, in my opinion.

matchType as written is not possible to implement, but the idea you are going for is definitely possible. Do you know about Data.Typeable? Typeable is a class that provides some basic reflective operations for inspecting types. To use it, you need a Typeable a constraint in scope for any type variable a you want to know about. So for matchType you would have
matchType :: (Typeable a, Typeable b) => NewLVal a -> Pure b -> Either String (Pure a)
It needs also to infect your GADTs any time you want to hide a type variable:
data Command where
Assign :: (Typeable a) => NewLVal a -> Pure a -> Command
...
But if you have the appropriate constraints in scope, you can use eqT to make type-safe runtime type comparisons. For example
-- using ScopedTypeVariables and TypeApplications
matchType :: forall a b. (Typeable a, Typeable b) => NewLVal a -> Pure b -> Either String (Pure b)
matchType = case eqT #a #b of
Nothing -> Left "types are not equal"
Just Refl -> {- in this scope the compiler knows that
a and b are the same type -}

Is there a compiler-extension for untagged union types in Haskell?

In some languages (#racket/typed, for example), the programmer can specify a union type without discriminating against it, for instance, the type (U Integer String) captures integers and strings, without tagging them (I Integer) (S String) in a data IntOrStringUnion = ... form or anything like that.
Is there a way to do the same in Haskell?

Either is what you're looking for... ish.
In Haskell terms, I'd describe what you're looking for as an anonymous sum type. By anonymous, I mean that it doesn't have a defined name (like something with a data declaration). By sum type, I mean a data type that can have one of several (distinguishable) types; a tagged union or such. (If you're not familiar with this terminology, try Wikipedia for starters.)
We have a well-known idiomatic anonymous product type, which is just a tuple. If you want to have both an Int and a String, you just smush them together with a comma: (Int, String). And tuples (seemingly) can go on forever--(Int, String, Double, Word), and you can pattern-match the same way. (There's a limit, but never mind.)
The well-known idiomatic anonymous sum type is Either, from Data.Either (and the Prelude):
data Either a b = Left a | Right b
deriving (Eq, Ord, Read, Show, Typeable)
It has some shortcomings, most prominently a Functor instance that favors Right in a way that's odd in this context. The problem is that chaining it introduces a lot of awkwardness: the type ends up like Either (Int (Either String (Either Double Word))). Pattern matching is even more awkward, as others have noted.
I just want to note that we can get closer to (what I understand to be) the Racket use case. From my extremely brief Googling, it looks like in Racket you can use functions like isNumber? to determine what type is actually in a given value of a union type. In Haskell, we usually do that with case analysis (pattern matching), but that's awkward with Either, and function using simple pattern-matching will likely end up hard-wired to a particular union type. We can do better.
IsNumber?
I'm going to write a function I think is an idiomatic Haskell stand-in for isNumber?. First, we don't like doing Boolean tests and then running functions that assume their result; instead, we tend to just convert to Maybe and go from there. So the function's type will end with -> Maybe Int. (Using Int as a stand-in for now.)
But what's on the left hand of the arrow? "Something that might be an Int -- or a String, or whatever other types we put in the union." Uh, okay. So it's going to be one of a number of types. That sounds like typeclass, so we'll put a constraint and a type variable on the left hand of the arrow: MightBeInt a => a -> Maybe Int. Okay, let's write out the class:
class MightBeInt a where
isInt :: a -> Maybe Int
fromInt :: Int -> a
Okay, now how do we write the instances? Well, we know if the first parameter to Either is Int, we're golden, so let's write that out. (Incidentally, if you want a nice exercise, only look at the instance ... where parts of these next three code blocks, and try to implement that class members yourself.)
instance MightBeInt (Either Int b) where
isInt (Left i) = Just i
isInt _ = Nothing
fromInt = Left
Fine. And ditto if Int is the second parameter:
instance MightBeInt (Either a Int) where
isInt (Right i) = Just i
isInt _ = Nothing
fromInt = Right
But what about Either String (Either Bool Int)? The trick is to recurse on the right hand type: if it's not Int, is it an instance of MightBeInt itself?
instance MightBeInt b => MightBeInt (Either a b) where
isInt (Right xs) = isInt xs
isInt _ = Nothing
fromInt = Right . fromInt
(Note that these all require FlexibleInstances and OverlappingInstances.) It took me a long time to get a feel for writing and reading these class instances; don't worry if this instance is surprising. The punch line is that we can now do this:
anInt1 :: Either Int String
anInt1 = fromInt 1
anInt2 :: Either String (Either Int Double)
anInt2 = fromInt 2
anInt3 :: Either String Int
anInt3 = fromInt 3
notAnInt :: Either String Int
notAnInt = Left "notint"
ghci> isInt anInt3
Just 3
ghci> isInt notAnInt
Nothing
Great!
Generalizing
Okay, but now do we need to write another type class for each type we want to look up? Nope! We can parameterize the class by the type we want to look up! It's a pretty mechanical translation; the only question is how to tell the compiler what type we're looking for, and that's where Proxy comes to the rescue. (If you don't want to install tagged or run base 4.7, just define data Proxy a = Proxy. It's nothing special, but you'll need PolyKinds.)
class MightBeA t a where
isA :: proxy t -> a -> Maybe t
fromA :: t -> a
instance MightBeA t t where
isA _ = Just
fromA = id
instance MightBeA t (Either t b) where
isA _ (Left i) = Just i
isA _ _ = Nothing
fromA = Left
instance MightBeA t b => MightBeA t (Either a b) where
isA p (Right xs) = isA p xs
isA _ _ = Nothing
fromA = Right . fromA
ghci> isA (Proxy :: Proxy Int) anInt3
Just 3
ghci> isA (Proxy :: Proxy String) notAnInt
Just "notint"
Now the usability situation is... better. The main thing we've lost, by the way, is the exhaustiveness checker.
Notational Parity With (U String Int Double)
For fun, in GHC 7.8 we can use DataKinds and TypeFamilies to eliminate the infix type constructors in favor of type-level lists. (In Haskell, you can't have one type constructor--like IO or Either--take a variable number of parameters, but a type-level list is just one parameter.) It's just a few lines, which I'm not really going to explain:
type family OneOf (as :: [*]) :: * where
OneOf '[] = Void
OneOf '[a] = a
OneOf (a ': as) = Either a (OneOf as)
Note that you'll need to import Data.Void. Now we can do this:
anInt4 :: OneOf '[Int, Double, Float, String]
anInt4 = fromInt 4
ghci> :kind! OneOf '[Int, Double, Float, String]
OneOf '[Int, Double, Float, String] :: *
= Either Int (Either Double (Either Float [Char]))
In other words, OneOf '[Int, Double, Float, String] is the same as Either Int (Either Double (Either Float [Char])).

You need some kind of tagging because you need to be able to check if a value is actually an Integer or a String to use it for anything. One way to alleviate having to create a custom ADT for every combination is to use a type such as
{-# LANGUAGE TypeOperators #-}
data a :+: b = L a | R b
infixr 6 :+:
returnsIntOrString :: Integer -> Integer :+: String
returnsIntOrString i
| i `rem` 2 == 0 = R "Even"
| otherwise = L (i * 2)
returnsOneOfThree :: Integer -> Integer :+: String :+: Bool
returnsOneOfThree i
| i `rem` 2 == 0 = (R . L) "Even"
| i `rem` 3 == 0 = (R . R) False
| otherwise = L (i * 2)

Generic data constructor for Data instance

Given a datatype
data Foo = IFoo Int | SFoo String deriving (Data, Typeable)
what is a simple definition of
gconstr :: (Typeable a, Data t) => a -> t
such that
gconstr (5 :: Int) :: Foo == IFoo 5
gconstr "asdf" :: Foo == SFoo "asdf"
gconstr True :: Foo == _|_
It would be essentially the opposite of syb's gfindtype.
Or does such a thing exist already? I've tried hoogle-ing the type and haven't found much, but the syb types are kind of hard to interpret. A function returning Nothing on error is also acceptable.

This seems to be possible, though it's not completely trivial.
Preliminaries:
{-# LANGUAGE DeriveDataTypeable #-}
import Control.Monad ( msum )
import Data.Data
import Data.Maybe
First a helper function gconstrn, which tries to do the same thing as required of gconstr, but for a specific constructor only:
gconstrn :: (Typeable a, Data t) => Constr -> a -> Maybe t
gconstrn constr arg = gunfold addArg Just constr
where
addArg :: Data b => Maybe (b -> r) -> Maybe r
addArg Nothing = Nothing
addArg (Just f) =
case cast arg of
Just v -> Just (f v)
Nothing -> Nothing
The key part is that the addArg function will use arg as an argument to the constructor, if the types match.
Essentially gunfold starts unfolding with Just IFoo or Just SFoo, and then the next step is to try addArg to provide it with its argument.
For multi-argument constructors this would be called repeatedly, so if you defined an IIFoo constructor that took two Ints, it would also get successfully filled in by gconstrn. Obviously with a bit more work you could do something more sophisticated like providing a list of arguments.
Then it's just a question of trying this with all possible constructors. The recursive definition between result and dt is just to get the right type argument for dataTypeOf, the actual value being passed in doesn't matter at all. ScopedTypeVariables would be an alternative for achieving this.
gconstr :: (Typeable a, Data t) => a -> Maybe t
gconstr arg = result
where result = msum [gconstrn constr arg | constr <- dataTypeConstrs dt]
dt = dataTypeOf (fromJust result)
As discussed in the comments, both functions can be simplified with <*> from Control.Applicative to the following, though it's a bit harder to see what's going on in the gunfold:
gconstr :: (Typeable a, Data t) => a -> Maybe t
gconstr arg = result
where
result = msum $ map (gunfold (<*> cast arg) Just) (dataTypeConstrs dt)
dt = dataTypeOf (fromJust result)

Laziness and polymorphic values

(For the following, simplify Show and Read to
class Show a where show :: a -> String
class Read a where read :: String -> a
And assume that read never fails.)
It's well-known that one can make an existential type of the form
data ShowVal where
ShowVal :: forall a. Show a => a -> ShowVal
And then construct a "heterogeneous list" :: [ShowVal], such as
l = [ShowVal 4, ShowVal 'Q', ShowVal True]
It's also well-known that this is relatively useless, because, instead, one can
just construct a list :: [String], such as
l = [show 4, show 'Q', show True]
Which is exactly isomorphic (after all, the only thing one can do with a
ShowVal is show it).
Laziness makes this particularly nice, because for each value in the list, the
result of show is memoized automatically, so no String is computed more than
once (and Strings that aren't used aren't computed at all).
A ShowVal is equivalent to an existential tuple exists a. (a -> String, a),
where the function is the Show dictionary.
A similar construct can be made for Read:
data ReadVal where
ReadVal :: (forall a. Read a => a) -> ReadVal
Note that, because read is polymorphic in its return value, ReadVal is
universal rather than existential (which means that we don't really need it at
all, because Haskell has first-class universals; but we'll use it here to
highlight the similaries to Show).
We can also make a list :: [ReadVal]:
l = [ReadVal (read "4"), ReadVal (read "'Q'"), ReadVal (read "True")]
Just as with Show, a list :: [ReadVal] is isomorphic to a list :: [String],
such as
l = ["4", "'Q'", "True"]
(We can always get the original String back with
newtype Foo = Foo String
instance Read Foo where read = Foo
Because the Read type class is open.)
A ReadVal is equivalent to a universal function forall a. (String -> a) -> a
(a CPS-style representation). Here the Read dictionary is supplied by the user
of the ReadVal rather than by the producer, because the return value is
polymorphic rather than the argument.
However, in neither of these representations do we get the automatic
memoization that we get in the String representation with Show. Let's say that
read for our type is an expensive operation, so we don't want to compute it
on the same String for the same type more than once.
If we had a closed type, we could do something like:
data ReadVal = ReadVal { asInt :: Int, asChar :: Char, asBool :: Bool }
And then use a value
ReadVal { asInt = read s, asChar = read s, asBool = read s }
Or something along those lines.
But in this case -- even if we only ever use the ReadVal as one type -- the
String will be parsed each time the value is used. Is there a simple way to
get memoization while keeping the ReadVal polymorphic?
(Getting GHC to do it automatically, similarly to the Show case, would be
ideal, if it's somehow possible. A more explicit memoization approach --
perhaps by adding a Typeable constraint? -- would also be OK.)

Laziness makes this particularly nice, because for each value in the list, the result of show is memoized automatically, so no String is computed more than once (and Strings that aren't used aren't computed at all).
This premise is incorrect. There is no magical memo table under the hood.
Laziness means things that aren't needed, aren't computed. It does not mean that all computed values are shared. You still have to introduce explicit sharing (via a table of your own).

Here's an implementation of the more explicit approach; it requires Typeable, because otherwise there'd be nothing to key the memo table on. I based the memoisation code on uglymemo; there might be a way to get this to work with pure memoisation, but I'm not sure. It's tricky, because you have to construct the table outside of the implicit function that any forall a. (Read a, Typeable a) => ... creates, otherwise you end up constructing one table per call, which is useless.
{-# LANGUAGE GADTs, RankNTypes #-}
import Data.Dynamic
import Control.Concurrent.MVar
import Data.HashMap.Strict (HashMap)
import qualified Data.HashMap.Strict as HM
import System.IO.Unsafe
data ReadVal where
ReadVal :: { useReadVal :: forall a. (Read a, Typeable a) => a } -> ReadVal
mkReadVal :: String -> ReadVal
mkReadVal s = unsafePerformIO $ do
v <- newMVar HM.empty
return $ ReadVal (readVal v)
where
readVal :: (Read a, Typeable a) => MVar (HashMap TypeRep Dynamic) -> a
readVal v = unsafePerformIO $ do
m <- readMVar v
let r = read s -- not evaluated
let typeRep = typeOf r
case HM.lookup typeRep m of
Nothing -> do
modifyMVar_ v (return . HM.insert typeRep (toDyn r))
return r
Just r' -> return $ fromDyn r' (error "impossible")

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Deserializing many network messages without using an ad-hoc parser implementation - haskell

Related

RPC (Or: How do I disambiguate function application based on TypeRep values?)

Is it possible to ensure that two GADT type variables are the same without dependent types?

Is there a compiler-extension for untagged union types in Haskell?

Generic data constructor for Data instance

Laziness and polymorphic values

Categories

Resources