Deriving functor instance, not on last type argument - haskell

Related to this question I asked earlier today.
I have an AST data type with a large number of cases, which is parameterized by an "annotation" type
data Expr ann def var = Plus ann Int Int
                      | ...
                      | Times ann Int Int
  deriving (Data, Typeable, Functor)
I've got concrete instances for def and var, say Def and Var.
What I want is to automatically derive fmap which operates as a functor on the first argument. I want to derive a function that looks like this:
fmap :: (a -> b) -> (Expr a Def Var) -> (Expr b Def Var)
When I use normal fmap, I get a compiler message that indicates fmap is trying to apply its function to the last type argument, not the first.
Is there a way I can derive the function as described, without writing a bunch of boilerplate? I tried doing this:
newtype Expr' a = E (Expr a Def Var)
deriving (Data, Typeable, Functor)
But I get the following error:
Constructor `E' must use the type variable only as the last argument of a data type
I'm working with someone else's code base, so it would be ideal if I don't have to switch the order of the type arguments everywhere.

The short answer is that this isn't possible, because Functor requires the changing type variable to be in the last position. Only type constructors of kind * -> * can have Functor instances; Expr has kind * -> * -> * -> *, and partially applying it (as in Expr ann def) only ever leaves the last argument open.
Do you really need a Functor instance? If you just want to avoid the boilerplate of writing an fmap-like function, something like SYB is a better solution (but really the boilerplate isn't that bad, and you'd only write it once).
If you need Functor for some other reason (perhaps you want to use this data structure in some function with a Functor constraint), you'll have to choose whether you want the instance or the type variables in the current order.
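For a sense of scale, the hand-written boilerplate mentioned above is one clause per constructor. A minimal sketch, assuming only the two constructors shown in the question (the name mapAnn is made up for illustration):

```haskell
-- Original argument order kept; def and var are untouched.
data Expr ann def var
  = Plus ann Int Int
  | Times ann Int Int
  deriving (Eq, Show)

-- Hypothetical helper: an fmap-like function over the FIRST type argument.
mapAnn :: (a -> b) -> Expr a def var -> Expr b def var
mapAnn f (Plus a x y)  = Plus (f a) x y
mapAnn f (Times a x y) = Times (f a) x y
```

This gives no Functor instance, so it will not satisfy a Functor constraint, but it does not require touching the type's parameter order.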

You can exploit a type synonym for minimizing the changes to the original code:
data Expr' def var ann = Plus ann Int Int -- change this to Expr', correct order
                       | ...
                       | Something (Expr ann def var) -- leave this as it is, with the original order
  deriving (Data, Typeable, Functor)
type Expr ann def var = Expr' def var ann
The rest of the code can continue using Expr, unchanged. The only exceptions are class instances such as Functor, which as you noticed require a certain order in the parameters. Hopefully Functor is the only such class you need.
The auto-derived fmap function has type
fmap :: (a -> b) -> Expr' def var a -> Expr' def var b
which can be written as
fmap :: (a -> b) -> Expr a def var -> Expr b def var
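A compilable sketch of the synonym trick, under some assumptions: the constructors Plus and Times come from the question, while Neg, Def and Var are placeholders invented here to show a recursive field and concrete arguments.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Annotation moved to the last position so Functor can be derived.
data Expr' def var ann
  = Plus ann Int Int
  | Times ann Int Int
  | Neg ann (Expr ann def var) -- hypothetical recursive case, written in the OLD order
  deriving (Eq, Show, Functor)

-- Existing code keeps using the original argument order through this synonym.
type Expr ann def var = Expr' def var ann

-- Placeholder instantiations for def and var.
data Def
data Var

-- The derived fmap, viewed through the synonym.
relabel :: (a -> b) -> Expr a Def Var -> Expr b Def Var
relabel = fmap
```

Instance heads still have to mention Expr' directly, since a partially applied type synonym cannot appear there.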

Related

Recursion schemes with several types

Right now, I've got an AST for expression that's polymorphic over the type of recursion:
data Expr a = Const Int
| Add a a
This has been incredibly useful by allowing me to use a type for plain recursion (Fix Expr) and another one when I need to attach extra information (Cofree Expr ann).
The issue occurs when I want to introduce another type into this recursion scheme:
data Stmt a = Compound [a]
| Print (Expr ?)
I'm not sure what to put for the Expr term without introducing additional type variables and breaking compatibility with all the general functions I've already written.
Can this be done, and if so, is it a useful pattern?
The recursion-schemes perspective is to view recursive types as fixed points of functors. The type of expressions is the fixed point of the following functor:
data ExprF expr = Const Int
| Add expr expr
The point of changing the name of the variable is to make explicit the fact that it is a placeholder for the actual type of expressions, that would otherwise be defined as:
data Expr = Const Int | Add Expr Expr
In Stmt, there are two recursive types, Expr and Stmt itself. So we put two holes/unknowns.
data StmtF expr stmt = Compound [stmt]
| Print expr
When we take a fixpoint with Fix or Cofree, we are now solving a system of two equations (one for Expr, one for Stmt), and that comes with some amount of boilerplate.
Instead of applying Fix or Cofree directly, we generalize, taking those fixpoint combinators (Fix, Cofree, Free) as parameters in the construction of expressions and statements:
type Expr_ f = f ExprF
type Stmt_ f = f (StmtF (Expr_ f))
Now we can say Expr_ Fix or Stmt_ Fix for the unannotated trees, and Expr_ (Flip Cofree ann), Stmt_ (Flip Cofree ann). Unfortunately we have to pay another LOC fee to make the kinds match, and the types get ever more convoluted.
newtype Flip cofree a f = Flip (cofree f a)
(This also assumes we want to use the same Fix or Cofree everywhere at the same time.)
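To make the two-hole construction concrete, here is a small sketch for the unannotated case. It assumes a locally defined Fix (the same shape as the one found in recursion-scheme libraries) and two illustrative functions, evalExpr and printed, that are not part of the original answer.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Local stand-in for the usual fixpoint combinator.
newtype Fix f = Fix (f (Fix f))

data ExprF expr = Const Int
                | Add expr expr
  deriving Functor

data StmtF expr stmt = Compound [stmt]
                     | Print expr
  deriving Functor

-- The fixpoint combinator is a parameter, as in the answer.
type Expr_ f = f ExprF
type Stmt_ f = f (StmtF (Expr_ f))

-- Evaluate a plain expression tree.
evalExpr :: Expr_ Fix -> Int
evalExpr (Fix (Const n)) = n
evalExpr (Fix (Add a b)) = evalExpr a + evalExpr b

-- Collect the values that a statement tree would print, in order.
printed :: Stmt_ Fix -> [Int]
printed (Fix (Compound ss)) = concatMap printed ss
printed (Fix (Print e))     = [evalExpr e]
```

Functions written against ExprF alone keep working, because StmtF only refers to expressions through its expr parameter.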
Another representation to consider (nowadays called HKD, for higher-kinded data) is:
data Expr f = Const Int
            | Add (f (Expr f)) (f (Expr f))
data Stmt f = Compound [f (Stmt f)]
            | Print (f (Expr f))
where you only abstract from annotation/no-annotation (f = Identity or (,) ann) and not from recursion.
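A short sketch of how this representation might be used, with the recursive occurrences written fully applied as f (Expr f) and f (Stmt f). The functions eval, stripExpr and stripStmt are illustrative names, not something from the answer.

```haskell
import Data.Functor.Identity (Identity (..))

data Expr f = Const Int
            | Add (f (Expr f)) (f (Expr f))

data Stmt f = Compound [f (Stmt f)]
            | Print (f (Expr f))

-- Evaluate an unannotated tree (f = Identity).
eval :: Expr Identity -> Int
eval (Const n)                       = n
eval (Add (Identity a) (Identity b)) = eval a + eval b

-- Forget annotations (f = (,) ann) to obtain a plain tree.
stripExpr :: Expr ((,) ann) -> Expr Identity
stripExpr (Const n)           = Const n
stripExpr (Add (_, a) (_, b)) = Add (Identity (stripExpr a)) (Identity (stripExpr b))

stripStmt :: Stmt ((,) ann) -> Stmt Identity
stripStmt (Compound ss)  = Compound [Identity (stripStmt s) | (_, s) <- ss]
stripStmt (Print (_, e)) = Print (Identity (stripExpr e))
```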

Typed abstract syntax and DSL design in Haskell

I'm designing a DSL in Haskell and I would like to have an assignment operation. Something like this (the code below is just for explaining my problem in a limited context; I haven't type-checked the Stmt type):
data Stmt = forall a. Assign String (Exp a) -- Assignment operation
          | forall a. Decl String a         -- Variable declaration
data Exp t where
  EBool :: Bool -> Exp Bool
  EInt  :: Int -> Exp Int
  EAdd  :: Exp Int -> Exp Int -> Exp Int
  ENot  :: Exp Bool -> Exp Bool
In the previous code, I'm able to use a GADT to enforce type constraints on expressions. My question is: how can I enforce that 1) the left-hand side of an assignment is defined, i.e., a variable must be declared before it is used, and 2) the right-hand side has the same type as the left-hand side variable?
I know that in a full dependently typed language, I could define statements indexed by some sort of typing context, that is, a list of defined variables and their type. I believe that this would solve my problem. But, I'm wondering if there is some way to achieve this in Haskell.
Any pointer to example code or articles is highly appreciated.
Given that my work focuses on related issues of scope and type safety being encoded at the type-level, I stumbled upon this old-ish question whilst googling around and thought I'd give it a try.
This post provides, I think, an answer quite close to the original specification. The whole thing is surprisingly short once you have the right setup.
First, I'll start with a sample program to give you an idea of what the end result looks like:
program :: Program
program = Program
  $  Declare (Var :: Name "foo") (Of :: Type Int)
  :> Assign (The (Var :: Name "foo")) (EInt 1)
  :> Declare (Var :: Name "bar") (Of :: Type Bool)
  :> increment (The (Var :: Name "foo"))
  :> Assign (The (Var :: Name "bar")) (ENot $ EBool True)
  :> Done
Scoping
In order to ensure that we may only assign values to variables which have been declared before, we need a notion of scope.
GHC.TypeLits provides us with type-level strings (called Symbol), so we can very well use strings as variable names if we want. And because we want to ensure type safety, each variable declaration comes with a type annotation which we will store together with the variable name. Our type of scopes is therefore: [(Symbol, *)].
We can use a type family to test whether a given Symbol is in scope and return its associated type if that is the case:
type family HasSymbol (g :: [(Symbol,*)]) (s :: Symbol) :: Maybe * where
  HasSymbol '[]            s = 'Nothing
  HasSymbol ('(s, a) ': g) s = 'Just a
  HasSymbol ('(t, a) ': g) s = HasSymbol g s
From this definition we can define a notion of variable: a variable of type a in scope g is a symbol s such that HasSymbol g s returns 'Just a. This is what the ScopedSymbol data type represents by using an existential quantification to store the s.
data ScopedSymbol (g :: [(Symbol,*)]) (a :: *) = forall s.
(HasSymbol g s ~ 'Just a) => The (Name s)
data Name (s :: Symbol) = Var
Here I am purposefully abusing notations all over the place: The is the constructor for the type ScopedSymbol and Name is a Proxy type with a nicer name and constructor. This allows us to write such niceties as:
example :: ScopedSymbol ('("foo", Int) ': '("bar", Bool) ': '[]) Bool
example = The (Var :: Name "bar")
Statements
Now that we have a notion of scope and of well-typed variables in that scope, we can start considering the effects Statements should have. Given that new variables can be declared in a Statement, we need to find a way to propagate this information in the scope. The key insight is to have two indices: an input and an output scope.
To Declare a new variable together with its type will expand the current scope with the pair of the variable name and the corresponding type.
Assignments on the other hand do not modify the scope. They merely associate a ScopedSymbol to an expression of the corresponding type.
data Statement (g :: [(Symbol, *)]) (h :: [(Symbol,*)]) where
  Declare :: Name s -> Type a -> Statement g ('(s, a) ': g)
  Assign  :: ScopedSymbol g a -> Exp g a -> Statement g g
data Type (a :: *) = Of
Once again we have introduced a proxy type to have a nicer user-level syntax.
example' :: Statement '[] ('("foo", Int) ': '[])
example' = Declare (Var :: Name "foo") (Of :: Type Int)
example'' :: Statement ('("foo", Int) ': '[]) ('("foo", Int) ': '[])
example'' = Assign (The (Var :: Name "foo")) (EInt 1)
Statements can be chained in a scope-preserving way by defining the following GADT of type-aligned sequences:
infixr 5 :>
data Statements (g :: [(Symbol, *)]) (h :: [(Symbol,*)]) where
  Done :: Statements g g
  (:>) :: Statement g h -> Statements h i -> Statements g i
Expressions
Expressions are mostly unchanged from your original definition except that they are now scoped and a new constructor EVar lets us dereference a previously-declared variable (using ScopedSymbol) giving us an expression of the appropriate type.
data Exp (g :: [(Symbol,*)]) (t :: *) where
  EVar  :: ScopedSymbol g a -> Exp g a
  EBool :: Bool -> Exp g Bool
  EInt  :: Int -> Exp g Int
  EAdd  :: Exp g Int -> Exp g Int -> Exp g Int
  ENot  :: Exp g Bool -> Exp g Bool
Programs
A Program is quite simply a sequence of statements starting in the empty scope. We use, once more, an existential quantification to hide the scope we end up with.
data Program = forall h. Program (Statements '[] h)
It is obviously possible to write subroutines in Haskell and use them in your programs. In the example, I have the very simple increment which can be defined like so:
increment :: ScopedSymbol g Int -> Statement g g
increment v = Assign v (EAdd (EVar v) (EInt 1))
I have uploaded the whole code snippet together with the right LANGUAGE pragmas and the examples listed here in a self-contained gist. I haven't however included any comments there.
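For readers without the gist at hand, the pieces above can be assembled roughly as follows. This is a sketch, not the gist itself: the pragma list is my best guess at what is required, Data.Kind is imported qualified (as K) so that K.Type can stand in for * without clashing with the Type proxy, and statementCount is a hypothetical helper added only to have something to run.

```haskell
{-# LANGUAGE DataKinds, ExistentialQuantification, GADTs, KindSignatures,
             TypeFamilies, TypeOperators #-}

import qualified Data.Kind as K
import GHC.TypeLits (Symbol)

-- Look a symbol up in a scope, returning its declared type if present.
type family HasSymbol (g :: [(Symbol, K.Type)]) (s :: Symbol) :: Maybe K.Type where
  HasSymbol '[]            s = 'Nothing
  HasSymbol ('(s, a) ': g) s = 'Just a
  HasSymbol ('(t, a) ': g) s = HasSymbol g s

data Name (s :: Symbol) = Var
data Type (a :: K.Type) = Of

data ScopedSymbol (g :: [(Symbol, K.Type)]) (a :: K.Type)
  = forall s. (HasSymbol g s ~ 'Just a) => The (Name s)

data Exp (g :: [(Symbol, K.Type)]) (t :: K.Type) where
  EVar  :: ScopedSymbol g a -> Exp g a
  EBool :: Bool -> Exp g Bool
  EInt  :: Int -> Exp g Int
  EAdd  :: Exp g Int -> Exp g Int -> Exp g Int
  ENot  :: Exp g Bool -> Exp g Bool

data Statement (g :: [(Symbol, K.Type)]) (h :: [(Symbol, K.Type)]) where
  Declare :: Name s -> Type a -> Statement g ('(s, a) ': g)
  Assign  :: ScopedSymbol g a -> Exp g a -> Statement g g

infixr 5 :>
data Statements (g :: [(Symbol, K.Type)]) (h :: [(Symbol, K.Type)]) where
  Done :: Statements g g
  (:>) :: Statement g h -> Statements h i -> Statements g i

data Program = forall h. Program (Statements '[] h)

increment :: ScopedSymbol g Int -> Statement g g
increment v = Assign v (EAdd (EVar v) (EInt 1))

program :: Program
program = Program
  $  Declare (Var :: Name "foo") (Of :: Type Int)
  :> Assign (The (Var :: Name "foo")) (EInt 1)
  :> Declare (Var :: Name "bar") (Of :: Type Bool)
  :> increment (The (Var :: Name "foo"))
  :> Assign (The (Var :: Name "bar")) (ENot $ EBool True)
  :> Done

-- Hypothetical helper: count the statements of a program.
statementCount :: Program -> Int
statementCount (Program ss) = go ss
  where
    go :: Statements g h -> Int
    go Done        = 0
    go (_ :> rest) = 1 + go rest
```

Swapping the first two statements of program, so that "foo" is assigned before it is declared, should be rejected by the type checker, because HasSymbol '[] "foo" reduces to 'Nothing.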
You should know that your goals are quite lofty. I don't think you will get very far treating your variables exactly as strings. I'd do something slightly more annoying to use, but more practical. Define a monad for your DSL, which I'll call M:
newtype M a = ...
data Exp a where
... as before ...
data Var a -- a typed variable
assign :: Var a -> Exp a -> M ()
declare :: String -> a -> M (Var a)
I'm not sure why you have Exp a for assignment and just a for declaration, but I reproduced that here. The String in declare is just for cosmetics, if you need it for code generation or error reporting or something -- the identity of the variable should really not be tied to that name. So it's usually used as
myFunc = do
foobar <- declare "foobar" 42
which is the annoying redundant bit. Haskell doesn't really have a good way around this (though depending on what you're doing with your DSL, you may not need the string at all).
As for the implementation, maybe something like
data Stmt = forall a. Assign (Var a) (Exp a)
          | forall a. Declare (Var a) a
data Var a = Var String Integer -- string is auxiliary from before, integer
                                -- stores real identity.
For M, we need a unique supply of names and a list of statements to output.
newtype M a = M { runM :: WriterT [Stmt] (StateT Integer Identity) a }
deriving (Functor, Applicative, Monad)
Then the operations are usually fairly trivial.
assign v a = M $ tell [Assign v a]
declare name a = M $ do
  ident <- lift get
  lift . put $! ident + 1
  let var = Var name ident
  tell [Declare var a]
  return var
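Put together as one module, the design might look like the sketch below. It assumes the transformers package for WriterT and State, a cut-down Exp with only literals, and two hypothetical helpers (collect and describe) for running a block and inspecting the statements it emits.

```haskell
{-# LANGUAGE ExistentialQuantification, GADTs, GeneralizedNewtypeDeriving #-}

import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict (State, get, put, runState)
import Control.Monad.Trans.Writer.Strict (WriterT, runWriterT, tell)

-- Cut-down expression type, enough for the example.
data Exp a where
  EInt  :: Int -> Exp Int
  EBool :: Bool -> Exp Bool

-- The String is cosmetic; the Integer is the variable's real identity.
data Var a = Var String Integer

data Stmt = forall a. Assign (Var a) (Exp a)
          | forall a. Declare (Var a) a

-- Writer collects emitted statements, State supplies fresh identities.
newtype M a = M { runM :: WriterT [Stmt] (State Integer) a }
  deriving (Functor, Applicative, Monad)

assign :: Var a -> Exp a -> M ()
assign v e = M $ tell [Assign v e]

declare :: String -> a -> M (Var a)
declare name a = M $ do
  ident <- lift get
  lift . put $! ident + 1
  let var = Var name ident
  tell [Declare var a]
  return var

-- Hypothetical runner: start the name supply at 0 and return the statements.
collect :: M a -> (a, [Stmt])
collect m = fst (runState (runWriterT (runM m)) 0)

-- Hypothetical pretty-printer, just for inspecting the output.
describe :: Stmt -> String
describe (Assign (Var n i) _)  = "assign " ++ n ++ "#" ++ show i
describe (Declare (Var n i) _) = "declare " ++ n ++ "#" ++ show i

demo :: M ()
demo = do
  x <- declare "x" (0 :: Int)
  y <- declare "y" True
  assign x (EInt 1)
  assign y (EBool False)
```

Writing assign x (EBool False) in demo is a type error, since x has type Var Int; that is the guarantee this design buys without any type-level scopes.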
I've made a fairly large DSL for code generation in another language using a fairly similar design, and it scales well. I find it a good idea to stay "near the ground", just doing solid modeling without using too many fancy type-level magical features, and accepting minor linguistic annoyances. That way Haskell's main strength, its ability to abstract, can still be used for code in your DSL.
One drawback is that everything needs to be defined within a do block, which can be a hindrance to good organization as the amount of code grows. I'll steal declare to show a way around that:
declare :: String -> M a -> M a
used like
foo = declare "foo" $ do
-- actual function body
then your M can have as a component of its state a cache from names to variables, and the first time you use a declaration with a certain name you render it and put it in a variable (this will require a bit more sophisticated monoid than [Stmt] as the target of your Writer). Later times you just look up the variable. It does have a rather floppy dependence on uniqueness of names, unfortunately; an explicit model of namespaces can help with that but never eliminate it entirely.
After seeing all the code by @Cactus and the Haskell suggestions by @luqui, I've managed to get a solution close to what I want in Idris. The complete code is available at the following gist:
(https://gist.github.com/rodrigogribeiro/33356c62e36bff54831d)
Some little things I need to fix in the previous solution:
I don't know (yet) if Idris supports integer literal overloading, which would be quite useful for building my DSL.
I've tried to define a prefix operator for program variables in the DSL syntax, but it didn't work as I would like. I've got a solution (in the previous gist) that uses a keyword, use, for variable access.
I'll check these minor points with people in the Idris channel on Freenode to see if these two things are possible.

Convert from type `T a` to `T b` without boilerplate

So, I have an AST data type with a large number of cases, which is parameterized by an "annotation" type
data Expr a = Plus a Int Int
| ...
| Times a Int Int
I have annotation types S and T, and some function f :: S -> T. I want to take an Expr S and convert it to an Expr T using my conversion f on each S which occurs within an Expr value.
Is there a way to do this using SYB or generics and avoid having to pattern match on every case? It seems like the type of thing that this is suited for. I just am not familiar enough with SYB to know the specific way to do it.
It sounds like you want a Functor instance. This can be automatically derived by GHC using the DeriveFunctor extension.
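A minimal sketch of that suggestion, assuming just the two constructors shown in the question; the definitions of S, T and f below are placeholders, since the question does not say what they are.

```haskell
{-# LANGUAGE DeriveFunctor #-}

data Expr a = Plus a Int Int
            | Times a Int Int
  deriving (Eq, Show, Functor)

-- Placeholder annotation types and conversion.
newtype S = S Int deriving (Eq, Show)
newtype T = T String deriving (Eq, Show)

f :: S -> T
f (S n) = T (show n)

-- The derived fmap applies f to every annotation and leaves the Int fields alone.
convert :: Expr S -> Expr T
convert = fmap f
```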
Based on your follow-up question, it seems that a generics library is more appropriate to your situation than Functor. I'd recommend just using the function given on SYB's wiki page:
{-# LANGUAGE DeriveDataTypeable, ScopedTypeVariables, FlexibleContexts #-}
import Data.Generics
import Unsafe.Coerce
newtype C a = C a deriving (Data,Typeable)
fmapData :: forall t a b. (Typeable a, Data (t (C a)), Data (t a)) =>
    (a -> b) -> t a -> t b
fmapData f input = uc . everywhere (mkT $ \(x::C a) -> uc (f (uc x)))
                    $ (uc input :: t (C a))
    where uc = unsafeCoerce
The reason for the extra C type is to avoid a problematic corner case where there are occurrences of fields at the same type as a (more details on the wiki). The caller of fmapData doesn't need to ever see it.
This function does have a few extra requirements compared to the real fmap: there must be instances of Typeable for a, and Data for t a. In your case t a is Expr a, which means that you'll need to add a deriving Data to the definition of Expr, as well as have a Data instance in scope for whatever a you're using.

Does exporting type constructors make a difference?

Let's say I have an internal data type, T a, that is used in the signature of exported functions:
module A (f, g) where
import System.Random (randomIO)
newtype T a = MkT { unT :: (Int, a) }
  deriving (Functor, Show, Read) -- for internal use
f :: a -> IO (T a)
f a = fmap (\i -> MkT (i, a)) randomIO
g :: T a -> a
g = snd . unT
What is the effect of not exporting the type constructor T? Does it prevent consumers from meddling with values of type T a? In other words, is there a difference between the export list (f, g) and (f, g, T()) here?
Prevented
The first thing a consumer will see is that the type doesn't appear in Haddock documentation. In the documentation for f and g, the type T will not be hyperlinked like an exported type. This may prevent a casual reader from discovering T's class instances.
More importantly, a consumer cannot do anything with T at the type level. Anything that requires writing a type will be impossible. For instance, a consumer cannot write new class instances involving T, or include T in a type family. (I don't think there's a way around this...)
At the value level, however, the main limitation is that a consumer cannot write a type annotation including T:
> :t (f . read) :: Read b => String -> IO (A.T b)
<interactive>:1:39: Not in scope: type constructor or class `A.T'
Not prevented
The restriction on type signatures is not as significant a limitation as it appears. The compiler can still infer such a type:
> :t f . read
f . read :: Read b => String -> IO (A.T b)
Any value expression within the inferrable subset of Haskell may therefore be expressed regardless of the availability of the type constructor T. If, like me, you're addicted to ScopedTypeVariables and extensive annotations, you may be a little surprised by the definition of unT' below.
Furthermore, because typeclass instances have global scope, a consumer can use any available class functions without additional limitation. Depending on the classes involved, this may allow significant manipulation of values of the unexposed type. With classes like Functor, a consumer can also freely manipulate type parameters, because there's an available function of type T a -> T b.
In the example of T, deriving Show of course exposes the "internal" Int, and gives a consumer enough information to hackishly implement unT:
-- :: (Show a, Read a) => T a -> (Int, a)
unT' = (read . strip . show') `asTypeOf` (mkPair . g)
  where
    -- drops the leading "MkT {unT = " (11 characters) and the trailing "}"
    strip = reverse . drop 1 . reverse . drop 11
    -- :: T a -> String
    show' = show `asTypeOf` (mkString . g)
    mkPair :: t -> (Int, t)
    mkPair = undefined
    mkString :: t -> String
    mkString = undefined
> :t unT'
unT' :: (Show b, Read b) => A.T b -> (Int, b)
> x <- f "x"
> unT' x
(-29353, "x")
Implementing mkT' with the Read instance is left as an exercise.
Deriving something like Generic will completely explode any idea of containment, but you'd probably expect that.
Prevented?
In the corners of Haskell where type signatures are necessary or where asTypeOf-style tricks don't work, I guess not exporting the type constructor could actually prevent a consumer from doing something they could with the export list (f, g, T()).
Recommendation
Export all type constructors that are used in the type of any value you export. Here, go ahead and include T() in your export list. Leaving it out doesn't accomplish anything other than muddying the documentation. If you want to expose a purely abstract immutable type, use a newtype with a hidden constructor and no class instances.

Type class definition with functions depending on an additional type

Still new to Haskell, I have hit a wall with the following:
I am trying to define some type classes to generalize a bunch of functions that use gaussian elimination to solve linear systems of equations.
Given a linear system
M x = k
the type a of the elements m(i,j) ∈ M can be different from the type b of x and k. To be able to solve the system, a should be an instance of Num and b should have multiplication/addition operators with b, like in the following:
class MixedRing b where
  (.+.) :: b -> b -> b
  (.*.) :: (Num a) => b -> a -> b
  (./.) :: (Num a) => b -> a -> b
Now, even in the most trivial implementation of these operators, I get errors like "Could not deduce (a ~ Int)" and "a is a rigid type variable". (Let's forget about ./., which requires Fractional.)
data Wrap = W { get :: Int }
instance MixedRing Wrap where
  (.+.) w1 w2 = W $ (get w1) + (get w2)
  (.*.) w s = W $ ((get w) * s)
I have read several tutorials on type classes but I can find no pointer to what actually goes wrong.
Let us have a look at the type of the implementation that you would have to provide for (.*.) to make Wrap an instance of MixedRing. Substituting Wrap for b in the type of the method yields
(.*.) :: Num a => Wrap -> a -> Wrap
As Wrap is isomorphic to Int and to not have to think about wrapping and unwrapping with Wrap and get, let us reduce our goal to finding an implementation of
(.*.) :: Num a => Int -> a -> Int
(You see that this doesn't make the challenge any easier or harder, don't you?)
Now, observe that such an implementation will need to be able to operate on all types a that happen to be in the type class Num. (This is what a type variable in such a type denotes: universal quantification.) Note: this is not the same as (actually, it's the opposite of) saying that your implementation can itself choose what a to operate on; yet that is what you seem to suggest in your question: that your implementation should be allowed to pick Int as a choice for a.
Now, as you want to implement this particular (.*.) in terms of the (*) for values of type Int, we need something of the form
n .*. s = n * f s
with
f :: Num a => a -> Int
I cannot think of a function that converts from an arbitrary Num-type a to Int in a meaningful way. I'd therefore say that there is no meaningful way to make Int (and, hence, Wrap) an instance of MixedRing; that is, not such that the instance behaves as you would probably expect it to do.
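For contrast, here is a sketch of a variant where such a conversion does exist. It assumes the class is changed to demand Integral instead of Num for the scalar (my assumption, not something from the question), which makes fromIntegral available; (./.) is left out.

```haskell
-- Variant of the class with a stronger constraint on the scalar type.
class MixedRing b where
  (.+.) :: b -> b -> b
  (.*.) :: Integral a => b -> a -> b

newtype Wrap = W { get :: Int }

instance MixedRing Wrap where
  W x .+. W y = W (x + y)
  -- The caller picks a; the instance has to cope with every Integral type,
  -- which it can do by converting the scalar to Int first.
  W x .*. s = W (x * fromIntegral s)
```

The instance still never chooses a. It works only because the constraint hands it a conversion that is valid for whatever a the caller chooses.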
How about something like:
class (Num a) => MixedRing a b where
  (.+.) :: b -> b -> b
  (.*.) :: b -> a -> b
  (./.) :: b -> a -> b
You'll need the MultiParamTypeClasses extension.
By the way, it seems to me that the mathematical structure you're trying to model is really a module, not a ring. With the type variables given above, one says that b is an a-module.
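A runnable sketch of the two-parameter class. One addition that is mine rather than the answer's: a functional dependency b -> a, since otherwise the type of (.+.) never mentions a and GHC rejects it as ambiguous. The (./.) method is omitted because it would need Fractional.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}

-- b is a module over the scalar type a; b determines a.
class Num a => MixedRing a b | b -> a where
  (.+.) :: b -> b -> b
  (.*.) :: b -> a -> b

newtype Wrap = W { get :: Int }
  deriving (Eq, Show)

-- Here the instance fixes the scalar type, so using (*) on Int is fine.
instance MixedRing Int Wrap where
  W x .+. W y = W (x + y)
  W x .*. s   = W (x * s)
```

With this instance, W 2 .*. 3 type-checks with the literal 3 resolved to Int through the dependency.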
Your implementation is not polymorphic enough.
The rule is, if you write a in the class definition, you can't use a concrete type in the instance. Because the instance must conform to the class and the class promised to accept any a that is Num.
To put it differently: it is exactly the class variable that must be instantiated with a concrete type in an instance definition.
Have you tried:
data Wrap a = W { get :: a }
Note that once Wrap a is an instance, you can still use it with functions that accept only Wrap Int.

Resources