Data type design in Haskell - haskell

Learning Haskell, I write a formatter of C++ header files. First, I parse all class members into a-collection-of-class-members which is then passed to the formatting routine. To represent class members I have
data ClassMember = CmTypedef Typedef |
CmMethod Method |
CmOperatorOverload OperatorOverload |
CmVariable Variable |
CmFriendClass FriendClass |
CmDestructor Destructor
(I need to classify the class members this way because of some peculiarities of the formatting style.)
The problem that annoys me is that to "drag" any function defined for the class member types to the ClassMember level, I have to write a lot of redundant code. For example,
instance Formattable ClassMember where
format (CmTypedef td) = format td
format (CmMethod m) = format m
format (CmOperatorOverload oo) = format oo
format (CmVariable v) = format v
format (CmFriendClass fc) = format fc
format (CmDestructor d) = format d
instance Prettifyable ClassMember where
-- same story here
On the other hand, I would definitely like to have a list of ClassMember objects (at least, I think so), hence defining it as
data ClassMember a = ClassMember a
instance Formattable ClassMember a
format (ClassMember a) = format a
doesn't seem to be an option.
The alternatives I'm considering are:
Store in ClassMember not object instances themselves, but functions defined on the corresponding types, which are needed by the formatting routine. This approach breaks the modularity, IMO, as the parsing results, represented by [ClassMember], need to be aware of all their usages.
Define ClassMember as an existential type, so [ClassMember] is no longer a problem. I doubt whether this design is strict enough and, again, I need to specify all constraints in the definition, like data ClassMember = forall a . Formattable a => ClassMember a. Also, I would prefer a solution without using extensions.
Is what I'm doing a proper way to do it in Haskell or there is a better way?

First, consider trimming down that ADT a bit. Operator overloads and destructors are special kinds of methods, so it might make more sense to treat all three in CmMethod; Method will then have special ways to separate them. Alternatively, keep all three CmMethod, CmOperatorOverload, and CmDestructor, but let them all contain the same Method type.
But of course, you can reduce the complexity only so much.
As for the specific example of a Show instance: you really don't want to write that yourself except in some special cases. For your case, it's much more reasonable to have the instance derived automatically:
data ClassMember = CmTypedef Typedef
| CmMethod Method
| ...
| CmDestructor Destructor
deriving (Show)
This will give different results from your custom instance – because yours is wrong: showing a contained result should also give information about the constructor.
If you're not really interested in Show but talking about another class C that does something more specific to ClassMembers – well, then you probably shouldn't have defined C in the first place! The purpose of type classes is to express mathematical concepts that hold for a great variety of types.

A possible solution is to use records.
It can be used without extensions and preserves flexibility.
There is still some boilerplate code, but you need to type it only once for all. So if you would need to perform another set of operations over your ClassMember, it would be very easy and quick to do it.
Here is an example for your particular case (template Haskell and Control.Lens makes things easier but are not mandatory):
{-# LANGUAGE TemplateHaskell #-}
module Test.ClassMember
import Control.Lens
-- | The class member as initially defined.
data ClassMember =
CmTypedef Typedef
| CmMethod Method
| CmOperatorOverload OperatorOverload
| CmVariable Variable
| CmFriendClass FriendClass
| CmDestructor Destructor
-- | Some dummy definitions of the data types, so the code will compile.
data Typedef = Typedef
data Method = Method
data OperatorOverload = OperatorOverload
data Variable = Variable
data FriendClass = FriendClass
data Destructor = Destructor
{-|
A data type which defines one function per constructor.
Note the type a, which means that for a given Hanlder "a" all functions
must return "a" (as for a type class!).
-}
data Handler a = Handler
{
_handleType :: Typedef -> a
, _handleMethod :: Method -> a
, _handleOperator :: OperatorOverload -> a
, _handleVariable :: Variable -> a
, _handleFriendClass :: FriendClass -> a
, _handleDestructor :: Destructor -> a
}
{-|
Here I am using lenses. This is not mandatory at all, but makes life easier.
This is also the reason of the TemplateHaskell language pragma above.
-}
makeLenses ''Handler
{-|
A function acting as a dispatcher (the boilerplate code!!!), telling which
function of the handler must be used for a given constructor.
-}
handle :: Handler a -> ClassMember -> a
handle handler member =
case member of
CmTypedef a -> handler^.handleType $ a
CmMethod a -> handler^.handleMethod $ a
CmOperatorOverload a -> handler^.handleOperator $ a
CmVariable a -> handler^.handleVariable $ a
CmFriendClass a -> handler^.handleFriendClass $ a
CmDestructor a) -> handler^.handleDestructor $ a
{-|
A dummy format method.
I kept things simple here, but you could define much more complicated
functions.
You could even define some generic functions separately and... you could define
them with some extra arguments that you would only provide when building
the Handler! An (dummy!) example is the way the destructor function is
constructed.
-}
format :: Handler String
format = Handler
(\x -> "type")
(\x -> "method")
(\x -> "operator")
(\x -> "variable")
(\x -> "Friend")
(destructorFunc $ (++) "format ")
{-|
A dummy function showcasing partial application.
It has one more argument than handleDestructor. In practice you are free
to add as many as you wish as long as it ends with the expected type
(Destructor -> String).
-}
destructorFunc :: (String -> String) -> Destructor -> String
destructorFunc f _ = f "destructor"
{-|
Construction of the pretty handler which illustrates the reason why
using lens by keeping a nice and concise syntax.
The "&" is the backward operator and ".~" is the set operator.
All we do here is to change the functions of the handleType and the
handleDestructor.
-}
pretty :: Handler String
pretty = format & handleType .~ (\x -> "Pretty type")
& handleDestructor .~ (destructorFunc ((++) "Pretty "))
And now we can run some tests:
test1 = handle format (CmDestructor Destructor)
> "format destructor"
test2 = handle pretty (CmDestructor Destructor)
> "Pretty destructor"

Related

Type design for the AST of my language remembering token locations

I wrote a parser and evaluator for a simple programming language. Here is a simplified version of the types for the AST:
data Value = IntV Int | FloatV Float | BoolV Bool
data Expr = IfE Value [Expr] | VarDefE String Value
type Program = [Expr]
I want error messages to tell the line and column of the source code in which the error occured. For example, if the value in an If expression is not a boolean, I want the evaluator to show an error saying "expected boolean at line x, column y", with x and y referring to the location of the value.
So, what I need to do is redefine the previous types so that they can store the relevant locations of different things. One option would be to add a location to each constructor for expressions, like so:
type Location = (Int, Int)
data Expr = IfE Value [Expr] Location | VarDef String Value Location
This clearly isn't optimal, because I have to add those Location fields to every possible expression, and if for example a value contained other values, I would need to add locations to that value too:
{-
this would turn into FunctionCall String [Value] [Location],
with one location for each value in the function call
-}
data Value = ... | FunctionCall String [Value]
I came up with another solution, which allows me to add locations to everything:
data Located a = Located Location a
type LocatedExpr = Located Expr
type LocatedValue = Located Value
data Value = IntV Int | FloatV Float | BoolV Bool | FunctionCall String [LocatedValue]
data Expr = IfE LocatedValue [LocatedExpr] | VarDef String LocatedValue
data Program = [LocatedExpr]
However I don't like this that much. First of all, it clutters the definition of the evaluator and pattern matching has an extra layer every time. Also, I don't think saying that a function call takes located values as arguments is quite right. Function calls should take values as arguments, and locations should be metadata that doesn't interfere with the evaluator.
I need help redefining my types so that the solution is as clean as possible. Maybe there is a language extension or a design pattern I don't know about that could be helpful.
There are many ways to annotate an AST! This is half of what’s known as the AST typing problem, the other half being how you manage an AST that changes over the course of compilation. The problem isn’t exactly “solved”: all of the solutions have tradeoffs, and which one to pick depends on your expected use cases. I’ll go over a few that you might like to investigate at the end.
Whichever method you choose for organising the actual data types, if it makes pattern-matching ugly or unwieldy, the natural solution is PatternSynonyms.
Considering your first example:
{-# Language PatternSynonyms #-}
type Location = (Int, Int)
data Expr
= LocatedIf Value [Expr] Location
| LocatedVarDef String Value Location
-- Unidirectional pattern synonyms which ignore the location:
pattern If :: Value -> [Expr] -> Expr
pattern If val exprs <- LocatedIf val exprs _loc
pattern VarDef :: String -> Value -> Expr
pattern VarDef name expr <- LocatedVarDef name expr _loc
-- Inform GHC that matching ‘If’ and ‘VarDef’ is just as good
-- as matching ‘LocatedIf’ and ‘LocatedVarDef’.
{-# Complete If, VarDef #-}
This may be sufficiently tidy for your purposes already. But here are a few more tips that I find helpful.
Put annotations first: when adding an annotation type to an AST directly, I often prefer to place it as the first parameter of each constructor, so that it can be conveniently partially applied.
data LocatedExpr
= LocatedIf Location Value [Expr]
| LocatedVarDef Location String Value
If the annotation is a location, then this also makes it more convenient to obtain when writing certain kinds of parsers, along the lines of AnnotatedIf <$> (getSourceLocation <* ifKeyword) <*> value <*> many expr in a parser combinator library.
Parameterise your annotations: I often make the annotation type into a type parameter, so that GHC can derive some useful classes for me:
{-# Language
DeriveFoldable,
DeriveFunctor,
DeriveTraversable #-}
data AnnotatedExpr a
= AnnotatedIf a Value [Expr]
| AnnotatedVarDef a String Value
deriving (Functor, Foldable, Traversable)
type LocatedExpr = AnnotatedExpr Location
-- Get the annotation of an expression.
-- (Total as long as every constructor is annotated.)
exprAnnotation :: AnnotatedExpr a -> a
exprAnnotation = head
-- Update annotations purely.
mapAnnotations
:: (a -> b)
-> AnnotatedExpr a -> AnnotatedExpr b
mapAnnotations = fmap
-- traverse, foldMap, &c.
If you want “doesn’t interfere”, use polymorphism: you can enforce that the evaluator can’t inspect the annotation type by being polymorphic over it. Pattern synonyms still let you match on these expressions conveniently:
pattern If :: Value -> [AnnotatedExpr a] -> AnnotatedExpr a
pattern If val exprs <- AnnotatedIf _anno val exprs
-- …
eval :: AnnotatedExpr a -> Value
eval expr = case expr of
If val exprs -> -- …
VarDef name expr -> -- …
Unannotated terms aren’t your enemy: a term without source locations is no good for error reporting, but I think it’s still a good idea to make the pattern synonyms bidirectional for the convenience of constructing unannotated terms with a unit () annotation. (Or something equivalent, if you use e.g. Maybe Location as the annotation type.)
The reason is that this is quite convenient for writing unit tests, where you want to check the output, but want to use Eq instead of pattern matching, and don’t want to have to compare all the source locations in tests that aren’t concerned with them. Using the derived classes, void :: (Functor f) => f a -> f () strips out all the annotations on an AST.
import Control.Monad (void)
type BareExpr = AnnotatedExpr ()
-- One way to define bidirectional synonyms, so e.g.
-- ‘If’ can be used as either a pattern or a constructor.
pattern If :: Value -> [BareExpr] -> BareExpr
pattern If val exprs = AnnotatedIf () val exprs
-- …
stripAnnotations :: AnnotatedExpr a -> BareExpr
stripAnnotations = void
Equivalently, you could use GADTs / ExistentialQuantification to say data AnyExpr where { AnyExpr :: AnnotatedExpr a -> AnyExpr } / data AnyExpr = forall a. AnyExpr (AnnotatedExpr a); that way, the annotations have exactly as much information as (), but you don’t need to fmap over the entire tree with void in order to strip it, just apply the AnyExpr constructor to hide the type.
Finally, here are some brief introductions to a few AST typing solutions.
Annotate each AST node with a tag (e.g. a unique ID), then store all metadata like source locations, types, and whatever else, separately from the AST:
import Data.IntMap (IntMap)
-- More sophisticated/stronglier-typed tags are possible.
newtype Tag = Tag Int
newtype TagMap a = TagMap (IntMap a)
data Expr
= If !Tag Value [Expr]
| VarDef !Tag String Expr
type Span = (Location, Location)
type SourceMap = TagMap Span
type CommentMap = TagMap (Span, String)
parse
:: String -- Input
-> Either ParseError
( Expr -- Parsed expression
, SourceMap -- Source locations of tags
, CommentMap -- Sideband for comments
-- …
)
The advantage is that you can very easily mix in arbitrary new types of annotations anywhere, without affecting the AST itself, and avoid rewriting the AST just to change annotations. You can think of the tree and annotation tables as a kind of database, where the tags are the “foreign keys” relating them. A downside is that you must be careful to maintain these tags when you do rewrite the AST.
I don’t know if this approach has an established name; I think of it as just “tagging” or a “tagged AST”.
recursion-schemes and/or Data Types à la CartePDF: separate out the “recursive” part of an annotated expression tree from the “annotation” part, and use Fix to tie them back together, with Compose (or Cofree) to add annotations in the middle.
data ExprF e
= IfF Value [e]
| VarDefF String e
-- …
deriving (Foldable, Functor, Traversable, …)
-- Unannotated: Expr ~ ExprF (ExprF (ExprF (…)))
type Expr = Fix ExprF
-- With a location at each recursive step:
--
-- LocatedExpr ~ Located (ExprF (Located (ExprF (…))))
type LocatedExpr = Fix (Compose Located ExprF)
data Located a = Located Location a
deriving (Foldable, Functor, Traversable, …)
-- or: type Located = (,) Location
A distinct advantage is that you get a bunch of nice traversal stuff like cata for free-ish, so you can avoid having to write manual traversals over your AST over and over. A downside is that it adds some pattern clutter to clean up, as does the “à la carte” approach, but they do offer a lot of flexibility.
Trees That GrowPDF is overkill for just source locations, but in a serious compiler it’s quite helpful. If you expect to have more than one annotation type (such as inferred types or other analysis results) or an AST that changes over time, then you add a type parameter for the “compilation phase” (parsed, renamed, typechecked, desugared, &c.) and select field types or enable & disable constructors based on that index.
A really unfortunate downside of this is that you often have to rewrite the tree even in places nothing has changed, because everything depends on the “phase”. An alternative that I use is to add one type parameter for each type of phase or annotation that can vary independently, e.g. data Expr annotation termVarName typeVarName, and abstract over that with type and pattern synonyms. This lets you update indices independently and still use classes like Functor and Bitraversable.

Does Haskell support anonymous instances of typeclass?

I have the following code in F# (it's from a book)
open System.Collections.Generic
type Table<'T, 'U> =
abstract Item : 'T -> 'U with get
abstract Discard : unit -> unit
let memoizeAndPermitDiscard f =
let lookasideTable = new Dictionary<_, _>(HashIdentity.Structural)
{new Table<'T, 'U> with
member t.Item
with get(n) =
if lookasideTable.ContainsKey(n) then
lookasideTable.[n]
else
let res = f n
lookasideTable.Add(n, res)
res
member t.Discard() =
lookasideTable.Clear()}
let rec fibFast =
memoizeAndPermitDiscard (fun n ->
printfn "computing fibFast %d" n
if n <= 2 then 1 else fibFast.[n - 1] + fibFast.[n - 2])
As we can see the abstract type Table take it's implementation in the function memoizeAndPermitDiscard. Can Haskell do the same?
Apologies in advance: I'm not an F# expert, so I may be misreading the F# code. But if I'm reading it right it's fairly straightforward to translate to Haskell:
data Table t u = Table { get :: t -> IO u, discard :: IO () }
memoize :: Hashable t => (t -> u) -> IO (Table t u)
memoize f = do
tbl <- newHashTable
return Table
{ get = \t -> do
result <- lookupHashTable t tbl
case result of
Nothing -> let u = f t in writeHashTable t u tbl >> return u
Just u -> return u
, discard = clearHashTable tbl
}
I'm assuming some suitable hash table implementation here that offers newHashTable, lookupHashTable, writeHashTable, and clearHashTable. Implementing these (or suggesting a library that offers them) is sort of beside the point of the question, I think.
I'm not an F# expert either, but I believe what you're describing is where you create a anonymous single-use subclass, by declaring in at the point where you create an object how it implements the methods of a superclass or interface? So it's really an anonymous class, not an anonymous instance (or rather, it's no more anonymous than any other object-oriented instance, which typically don't have names inherently, only variable names storing references to them).
It doesn't really make sense to do that with Haskell type classes/instances. The reason is that a Haskell instance represents something very different from an OO instance.
The instances of OO classes are objects (even the instances of interfaces are objects). All of a class' methods will always be invoked on an instance of that class. So it makes sense to create an anonymous subclass of an existing class or interface at the time you create a new object. You basically say how that object implements the required methods, as an alternative to declaring a whole named class of objects that implement the methods the same way, which you could instantiate in multiple places.
The instances of Haskell classes are types (which is why the're called type classes). All of the methods of a class must involve the type somehow, but there is no guarantee that they take an input of the type. For example, consider the class1:
class Monoid' a
where mempty' :: a
mappend' :: a -> a -> a
It doesn't really make sense to say an object is an instance of Monoid'; if I were to create a new object and I wanted to anonymously instantiate Monoid', how would I define mempty'? mempty' isn't an operation I could invoke on my new object, it's an operation that receives no inputs at all (not even an implicit "this") and produces a value2.
And then there's things like:
class Functor' f
where fmap :: (a -> b) -> (f a -> f b)
Nothing ever takes an input of a type f that is an instance of Functor'; it doesn't even make sense to talk about something that might, since the instances of the class Functor' are type constructors that need a type parameter to result in a type, not types that actually contain values. So again, it just makes no sense at the point that I'm creating a new object to say "and here's how this object implements Functor'").
It could potentially make sense to declare a new anonymous type locally, and declare how it implements some type classes at the same time. But Haskell has no syntax for it, no.
Fortunately, you also don't need to create anonymous classes/instances in order to have a one-off collection of functions that conforms to a known interface. Functions are first-class values too, so you can just have type whose fields are functions. Then anywhere you like you can create a new value of that type by providing a definition for all of the function fields. For example:
data MyInterface = MyInterface
{ foo :: Int -> Bool
, bar :: Int -> String
}
example :: MyInterface -> Int -> (Bool, String)
example impl x
= (foo impl x, bar impl x)
main = do
let impl = MyInterface { foo = even, bar = show }
print $ example impl 7
The above program prints (False,"7").
1 I'm using Monoid' rather than Monoid (and similarly Functor') because I'm using simplifications of the real classes. You can see the real definition with :info Monoid in ghci (or look up the documentation) if you're interested.
2 Or alternatively, the class Monoid' mandates that there simply exists a value of each type that instantiates it, and mempty' just is a reference to it.

Augment a complex data type in a generic way in Haskell

I have been using the abstract syntax tree (AST) of Language.C library to modify C programs using generic transformations of SYB library. This AST has different kind of nodes (data types), each one representing a C construction, i.e. expressions, statements, definitions, etc. I need now to augment somehow the information that statements carry, i.e. annotate them. I have supposed (maybe I'm wrong) that I cannot modify or redefine the original data type, so I would like to have something like this:
annotateAST anns =
everywhere (mkT (annotateAST_ anns))
annotateAST_ astnode anns
| isStmt astnode = AnnStmt astnode (getAnn astnode anns)
| otherwise = astnode
In this way I would have a new ast with annotated statements instead of the original one. Of course, GHC is going to complain because everywhereshould return the same type that it gets, and this is not what it is happening here.
Concluding, I need to generically annotate an AST without modifying the original data types, and in a way that it is easy to return to the original data structure.
I have been thinking in different solutions for this problem, but not convinced of any of them, so I decided to share it here.
P.S. I was told that SYB library is not very efficient. Taking into account that the AST of Language.C only derives Data, do I have a more efficient alternative to do generic traversal and modification of the AST?
I am not an expert of that library, but it seems to be designed so to allow user-defined decorations.
This is because all the main data types are parametrized over NodeInfo, the standard annotation (only carrying location and name information). E.g. the library provides
type CTranslUnit = CTranslationUnit NodeInfo
which allows you to define
type MyTransUnit = CTranslationUnit MyNodeInfo
data MyNodeInfo = MNI NodeInfo AdditionalStuffHere
so to decorate the AST as you wish.
The library provides Functor instances that can affect such decorations, as well as an Annotated typeclass to retrieve the (possibly user-defined) annotation from any AST node.
I'd try to pursue that approach.
The design looks nice. The only drawback I can see is that the annotation type must be the same for all kinds on nodes, which basically forces one to define it as a huge sum of all kinds of annotations one might possibly have inside. For example:
-- AST library for a simple lambda-calculus
data AST n
= Fun n String (AST n)
| Var n String
| App n (AST n) (AST n)
-- user code
data Annotation
= AnnVar ... | AnnFun ... | AnnApp ...
type AnnotatedAST = AST Annotation
and we offer no static guarantees on functions being decorated with AnnFun, only.
One could wish for a more advanced library design exploiting GADTs such as:
-- AST library for a simple lambda-calculus
data Tag = TagFun | TagVar | TagApp
data AST (n :: Tag -> *)
= Fun (n 'TagFun) String (AST n)
| Var (n 'TagVar) String
| App (n 'TagApp) (AST n) (AST n)
-- user code
data Annotation (n :: Tag) where
AnnFun :: String -> Annotation 'TagFun
AnnVar :: Int -> Annotation 'TagVar
AnnApp :: Bool -> Annotation 'TagApp
type AnnotatedAST = AST Annotation
which guarantees a correct annotation in every node. AST would no longer be a Functor, but a Functor-like class could be defined, at least.
Still -- I'd be grateful that at least the library allows some form of user-defined annotations.

How can I perform a scatter/gather operation on types in Haskell?

I have tree that hold contains nodes of different types. These are tagged using a datatype:
data Wrapping = A Int
| B String
I want to write two functions:
scatter :: Wrapping -> a
gather :: a -> Output
The idea is that I can use (scatter.gather) :: Wrapping -> Output. There will of course be several different variations on both the scatter and the gather function (with each scatter variant having a unique Wrappingn datatype, but the set of intermediate types will always be the same) and I want to be able to cleanly compose them.
The issue that I have is that the type parameter a is not really free, it is a small explicit set of types (here it is {Int,String}). If I try to encode what I have so far into Haskell typeclasses then I get to:
{-# LANGUAGE FlexibleInstances #-}
data Wrapping = A Int | B String
class Fanin a where
gather :: a -> String
instance Fanin Int where
gather x = show x
instance Fanin String where
gather x = x
class Fanout a where
scatter :: Fanout a => Wrapping -> a
instance Fanout Int where
scatter (A n) = n
instance Fanout String where
scatter (B x) = x
combined = gather.scatter
The two classes typecheck fine but obviously the final line throws errors because ghc knows that the type parameters do match on every case, only on the two that I have defined. I've tried various combinations of extending one class from the other:
class Fanin a => Fanout a where ...
class Fanout a => Fanin a where ...
Finally I've looked at GADTs and existential types to solve this but I am stumbling around in the dark. I can't find a way to express a legal qualified type signature to GHC, where I've tried combinations of:
{-# LANGUAGE RankNTypes #-}
class (forall a. Fanout a) => Fanin a where
class (forall a. Fanin a) => Fanout a where
Question: how do I express to GHC that I want to restrict a to only the two types in the set?
I get the feeling that the solution lies in one of the techniques that I've looked at but I'm too lost to see what it is...
The idea is that I can use (scatter.gather) :: Wrapping -> Output.
There will of course be several different variations on both the
scatter and the gather function (with each scatter variant having a
unique Wrappingn datatype, but the set of intermediate types will
always be the same) and I want to be able to cleanly compose them.
If I understand correctly, you'd like to have different Wrapping types but the intermediate a type is constantly Either Int String. We can just reflect this information in our classes:
data Wrapping = A Int
| B String
class Fanout wrap where
scatter :: wrap -> Either Int String
instance Fanout Wrapping where
scatter (A n) = Left n
scatter (B str) = Right str
class Fanin output where
gather :: Either Int String -> output
instance Fanin String where
gather = either show id
combined :: Wrapping -> String
combined = gather . scatter
Also, this use case doesn't seem especially amenable to type classes, from what I can glean from the question. In particular, we can get rid of Fanin, and then combined = either show id . scatter looks better to my eyes than the previous definition.
The type class solution makes sense here only if just a single Either Int String -> a or a -> Either Int String function makes sense for each a, and you'd like to enforce this.
If I understand you correctly you need something like the following:
module Main ( main ) where
-- Different kinds of wrapper data types
data WrapperA = A Int | B String
data WrapperB = C Int | D Float
-- A single intermediate data type (with phantom type)
data Intermediate a = E Int | F String
-- Generic scatter and gather functions
class Wrapped a where
scatter :: Wrapped a => a -> Intermediate a
gather :: Wrapped a => Intermediate a -> String
-- Specific scatter and gather implementations
instance Wrapped WrapperA where
scatter (A i) = E i
scatter (B s) = F s
gather (E i) = show i
gather (F s) = s
instance Wrapped WrapperB where
scatter (C i) = E i
scatter (D f) = F $ show f
gather (E i) = show i
gather (F s) = s ++ " was a float"
-- Beautiful composability
combined :: Wrapped a => a -> String
combined = gather . scatter
wrapperAexample1 = A 10
wrapperAexample2 = B "testing"
wrapperBexample1 = C 11
wrapperBexample2 = D 12.4
main :: IO ()
main = do print $ combined wrapperAexample1
print $ combined wrapperAexample2
print $ combined wrapperBexample1
print $ combined wrapperBexample2
The main issue seems to be that you have an intermediate type which can have different kinds of content, but this is constant for different wrappers. Still, depending on the kind of wrapper, you want the gather function to behave differently.
To do this, I would define the Intermediate type to specify the kinds of values that can be held in the intermediate stage, and give it a phantom type parameter (to remember what kind of wrapper it originated from). You can then define a class to hold the scatter and gather functions, and define these differently for different kinds of wrappers.
The code above compiles without errors for me, and gives the following output:
"10"
"testing"
"11"
"12.4 was a float"
As you can see, the WrapperB/D Float input is treated differently than the WrapperA/B String (it is tagged as a float value, even after it has been converted to a String). This is because in the Intermediate representation it is remembered that the origin is a WrapperB: the one is of type Intermediate WrapperA, the other of Intermediate WrapperB.
If, on the other hand, you don't actually want the gather function to behave differently for different wrappers, you can simply take that out of the class and take out the phantom type. The easiest way to let ghc know that the type in the intermediate stage can be Int or String seems to me to still define something like the Intermediate type, rather than use just a.

Showing the type A -> A

data A = Num Int
| Fun (A -> A) String deriving Show
instance Show (Fun (A -> A) String) where
show (Fun f s) = s
I would like to have an attribute for a function A -> A to print it, therefore there is a String type parameter to Fun. When I load this into ghci, I get
/home/kmels/tmp/show-abs.hs:4:16:
Not in scope: type constructor or class `Fun'
I guess this could be achieved by adding a new data type
data FunWithAttribute = FA (A -> A) String
adding data A = Num Int | Fun FunWithAttribute and writing an instance Show FunWithAttribute. Is the additional data type avoidable?
Instances are defined for types as a whole, not individual constructors, which is why it complains about Fun not being a type.
I assume your overall goal is to have a Show instance for A, which can't be derived because functions can't (in general) have a Show instance. You have a couple options here:
Write your own Show instance outright:
That is, something like:
instance Show A where
show (Num n) = "Num " ++ show n
show (Fun _ s) = s
In many cases, this makes the most sense. But sometimes it's nicer to derive Show, especially on complex recursive types where only one case of many is not automatically Show-able.
Make A derivable:
You can only derive Show for types that contain types that themselves have Show instances. There's no instance for A -> A, so deriving doesn't work. But you can write one that uses a placeholder of some sort:
instance Show (A -> A) where
show _ = "(A -> A)"
Or even just an empty string, if you prefer.
Note that this requires the FlexibleInstances language extension; it's one of the most harmless and commonly used extensions, is supported by multiple Haskell implementations, and the restrictions it relaxes are (in my opinion) a bit silly to begin with, so there's little reason to avoid it.
An alternate approach would be to have a wrapper type, as you mention in the question. You could even make this more generic:
data ShowAs a = ShowAs a String
instance Show (ShowAs a) where
show (ShowAs _ s) = s
...and then use (ShowAs (A -> A)) in the Fun constructor. This makes it a bit awkward by forcing you to do extra pattern matching any time you want to use the wrapped type, but it gives you lots of flexibility to "tag" stuff with how it should be displayed, e.g. showId = id `ShowAs` "id" or suchlike.
Perhaps I'm not following what you are asking for. But the above code could be written like this in order to compile:
data A = Num Int
| Fun (A -> A) String
instance Show A where
show (Fun f s) = s
show (Num i) = show i
Some explanation
It looked like you were trying to write a show instance for a constructor (Fun). Class instances are written for the entire data type (there might be exceptions, dunno). So you need to write one show matching on each constructor as part of the instance. Num and Fun are each constructors of the data type A.
Also, deriving can't be used unless each parameter of each constructor is, in turn, member of, in this case, Show. Now, your example is a bit special since it wants to Show (A -> A). How to show a function is somewhat explained in the other responses, although I don't think there is an exhaustive way. The other examples really just "show" the type or some place holder.
A Show instance (or any class instance) needs to be defined for a data type, not for a type constructor. That is, you need simply
instance Show A where
Apparently, you're trying to get this instance with the deriving, but that doesn't work because Haskell doesn't know how to show A->A. Now it seems you don't even want to show that function, but deriving Show instances always show all available information, so you can't use that.
The obvious, and best, solution to your problem is worldsayshi's: don't use deriving at all, but define a proper instance yourself. Alternatively, you can define a pseudo-instance for A->A and then use deriving:
{-# LANGUAGE FlexibleInstances #-}
data A = Num Int | Fun (A->A) String deriving(Show)
instance Show (A->A) where show _ = ""
This works like
Prelude> Fun (const $ Num 3) "bla"
Fun "bla"

Resources