I can't seem to find any explanation of what lenses are used for in practical examples. This short paragraph from the Hackage page is the closest I've found:
This modules provides a convienient way to access and update the elements of a structure. It is very similar to Data.Accessors, but a bit more generic and has fewer dependencies. I particularly like how cleanly it handles nested structures in state monads.
So, what are they used for? What benefits and disadvantages do they have over other methods? Why are they needed?
They offer a clean abstraction over data updates, and are never really "needed." They just let you reason about a problem in a different way.
In some imperative/"object-oriented" programming languages like C, you have the familiar concept of some collection of values (let's call them "structs") and ways to label each value in the collection (the labels are typically called "fields"). This leads to a definition like this:
typedef struct { /* defining a new struct type */
float x; /* field */
float y; /* field */
} Vec2;
typedef struct {
Vec2 col1; /* nested structs */
Vec2 col2;
} Mat2;
You can then create values of this newly defined type like so:
Vec2 vec = { 2.0f, 3.0f };
/* Reading the components of vec */
float foo = vec.x;
/* Writing to the components of vec */
vec.y = foo;
Mat2 mat = { vec, vec };
/* Changing a nested field in the matrix */
mat.col2.x = 4.0f;
Similarly in Haskell, we have data types:
data Vec2 =
Vec2
{ vecX :: Float
, vecY :: Float
}
data Mat2 =
Mat2
{ matCol1 :: Vec2
, matCol2 :: Vec2
}
This data type is then used like this:
let vec = Vec2 2 3
-- Reading the components of vec
foo = vecX vec
-- Creating a new vector with some component changed.
vec2 = vec { vecY = foo }
mat = Mat2 vec2 vec2
However, in Haskell, there's no easy way of changing nested fields in a data structure. This is because you need to re-create all of the wrapping objects around the value that you are changing, because Haskell values are immutable. If you have a matrix like the above in Haskell, and want to change the upper right cell in the matrix, you have to write this:
mat2 = mat { matCol2 = (matCol2 mat) { vecX = 4 } }
It works, but it looks clumsy. So, what someone came up with, is basically this: If you group two things together: the "getter" of a value (like vecX and matCol2 above) with a corresponding function that, given the data structure that the getter belongs to, can create a new data structure with that value changed, you are able to do a lot of neat stuff. For example:
data Data = Data { member :: Int }
-- The "getter" of the member variable
getMember :: Data -> Int
getMember d = member d
-- The "setter" or more accurately "updater" of the member variable
setMember :: Data -> Int -> Data
setMember d m = d { member = m }
memberLens :: (Data -> Int, Data -> Int -> Data)
memberLens = (getMember, setMember)
There are many ways of implementing lenses; for this text, let's say that a lens is like the above:
type Lens a b = (a -> b, a -> b -> a)
I.e. it is the combination of a getter and a setter for some type a which has a field of type b, so memberLens above would be a Lens Data Int. What does this let us do?
Well, let's first make two simple functions that extract the getters and setters from a lens:
getL :: Lens a b -> a -> b
getL (getter, setter) = getter
setL :: Lens a b -> a -> b -> a
setL (getter, setter) = setter
Now, we can start abstracting over stuff. Let's take the situation above again, that we want to modify a value "two stories deep." We add a data structure with another lens:
data Foo = Foo { subData :: Data }
subDataLens :: Lens Foo Data
subDataLens = (subData, \ f s -> f { subData = s }) -- short lens definition
Now, let's add a function that composes two lenses:
(#) :: Lens a b -> Lens b c -> Lens a c
(#) (getter1, setter1) (getter2, setter2) =
(getter2 . getter1, combinedSetter)
where
combinedSetter a x =
let oldInner = getter1 a
newInner = setter2 oldInner x
in setter1 a newInner
The code is kind of quickly written, but I think it's clear what it does: the getters are simply composed; you get the inner data value, and then you read its field. The setter, when it is supposed to alter some value a with the new inner field value of x, first retrieves the old inner data structure, sets its inner field, and then updates the outer data structure with the new inner data structure.
Now, let's make a function that simply increments the value of a lens:
increment :: Lens a Int -> a -> a
increment l a = setL l a (getL l a + 1)
If we have this code, it becomes clear what it does:
d = Data 3
print $ increment memberLens d -- Prints "Data 4", the inner field is updated.
Now, because we can compose lenses, we can also do this:
f = Foo (Data 5)
print $ increment (subDataLens#memberLens) f
-- Prints "Foo (Data 6)", the innermost field is updated.
What all of the lens packages do is essentially to wrap this concept of lenses - the grouping of a "setter" and a "getter," into a neat package that makes them easy to use. In a particular lens implementation, one would be able to write:
with (Foo (Data 5)) $ do
subDataLens . memberLens $= 7
So, you get very close to the C version of the code; it becomes very easy to modify nested values in a tree of data structures.
Lenses are nothing more than this: an easy way of modifying parts of some data. Because it becomes so much easier to reason about certain concepts because of them, they see a wide use in situations where you have huge sets of data structures that have to interact with one another in various ways.
For the pros and cons of lenses, see a recent question here on SO.
Lenses provide convenient ways to edit data structures, in a uniform, compositional way.
Many programs are built around the following operations:
viewing a component of a (possibly nested) data structure
updating fields of (possibly nested) data structures
Lenses provide language support for viewing and editing structures in a way that ensures your edits are consistent; that edits can be composed easily; and that the same code can be used for viewing parts of a structure, as for updating the parts of the structure.
Lenses thus make it easy to write programs from views onto structures; and from structures back on to views (and editors) for those structures. They clean up a lot of the mess of record accessors and setters.
Pierce et al. popularized lenses, e.g. in their Quotient Lenses paper, and implementations for Haskell are now widely used (e.g. fclabels and data-accessors).
For concrete use cases, consider:
graphical user interfaces, where a user is editing information in a structured way
parsers and pretty printers
compilers
synchronizing updating data structures
databases and schemas
and many other situations where you have a data structure model of the world, and a editable view onto that data.
As an additional note it is often overlooked that lenses implement a very generic notion of "field access and update". Lenses can be written for all kinds of things, including function-like objects. It requires a bit of abstract thinking to appreciate this, so let me show you an example of the power of lenses:
at :: (Eq a) => a -> Lens (a -> b) b
Using at you can actually access and manipulate functions with multiple arguments depending on earlier arguments. Just keep in mind that Lens is a category. This is a very useful idiom for locally adjusting functions or other things.
You can also access data by properties or alternate representations:
polar :: (Floating a, RealFloat a) => Lens (Complex a) (a, a)
mag :: (RealFloat a) => Lens (Complex a) a
You can go further writing lenses to access individual bands of a Fourier-transformed signal and a lot more.
Related
I'm new to Haskell and trying to do something which I'm sure is easy but I'm not seeing the right way to do it.
What I want is a list of values of a particular typeclass, but different types of that typeclass. Eg:
class Shape a where
area :: a -> Double
numVertices :: a -> Integer
data Triangle = Triangle {...}
data Square = Square {...}
instance Shape Triangle where ...
instance Shape Square where ...
x = [Triangle (...), Square (...)]
I'm getting a compiler error because the list has different types. What's the right way to do what I'm trying to do here? The only thing I've been able to come up with is doing something like this:
data WrappedShape = WrappedShape {
getArea :: () -> Double
, getNumVertices :: () -> Integer
}
wrap s = WrappedShape {
getArea = \ () -> area s
, getNumVertices = \ () -> vertices s
}
x = [wrap (Triangle (...)), wrap (Square (...))]
This works, but it's heavy on boilerplate, since I have to effectively define Shape twice and with differently-named members. What's the standard way to do this sort of thing?
If you just need a few different shapes, you can enumerate each shape as a constructor, here is a example:
data SomeShapes = Triangle {...}
| Square {...}
instance Shape SomeShapes where
area (Triangle x) = ...
area (Square x) = ....
now you can put them in a list, because they are same type of SomeShapes
[Triangle {...}, Square {...}]
Your wrapped type is probably the best idea.
It can be improved by noting that, in a lazy language like Haskell, the type () -> T essentially works like the plain T. You probably want to delay computation and write stuff like let f = \ () -> 1+2 which does not perform addition until the function f is called with argument (). However, let f = 1+2 already does not perform addition until f is really needed by some other expression -- this is laziness.
So, we can simply use
data WrappedShape = WrappedShape {
getArea :: Double
, getNumVertices :: Integer
}
wrap s = WrappedShape {
getArea = area s
, getNumVertices = vertices s
}
x = [wrap (Triangle (...)), wrap (Square (...))]
and forget about passing () later on: when we will access a list element, the area/vertices will be computed (whatever we need). That is print (getArea (head x)) will compute the area of the triangle.
The \ () -> ... trick is indeed needed in eager languages, but in Haskell it is an anti-pattern. Roughly, in Haskell everything has a \ () -> ... on top, roughly speaking, s o there's no need to add another one.
These is another solution to your problem, which is called an "existential type". However, this sometimes turns into an anti-pattern as well, so I do not recommend to use it lightly.
It would work as follows
data WrappedShape = forall a. Shape a => WrappedShape a
x = [WrappedShape (Triangle ...), WrappedShape (Square ...)]
exampleUsage = case head x of WrappedShape s -> area s
This is more convenient when the type class has a lots of methods, since we do not have to write a lot of fields in the wrapped type.
The main downside of this technique is that it involves more complex type machinery, for no real gain. I mean a basic list [(Double, Integer)] has the same functionality of [WrappedShape] (list of existentials), so why bother with the latter?
Luke Palmer wrote about this anti-pattern. I do not agree with that post completely, but I think he does have some good points.
I do not have a clear-cut line where I would start using existentials over the basic approach, but these factors are what I would consider:
How many methods does the type class have?
Are there any methods of the type class where the type a (the one related to the class) appears not only as an argument? E.g. a method foo :: a -> (String, a)?
So I want to define multiple data classes for my Asteroids game/assignment:
data One = One {oneVelocity :: Velocity, onePosition :: Position, (((other properties unique to One)))}
data Two = Two {twoVelocity :: Velocity, twoPosition :: Position, (((other properties unique to Two)))}
data Three = Three {threeVelocity :: Velocity, threePosition :: Position, (((other properties unique to Three)))}
As you can see I have multiple data classes with some overlapping properties (velocity, position). That also meant that I had to give them different names per data class ("oneVelocity", "twoVelocity", ...).
Is there a way I can let these types of data extend something? I thought of using one datatype with multiple constructors, but some of these current data classes are very different and I don't thing they should reside in one data class with multiple constructors.
You should probably use just a single data type for all of these, but parameterised on the specific details:
data MovingObj s = MovingObj
{ velocity :: Velocity
, position :: Position
, specifics :: s }
Then you can create e.g. asteroid :: MovingObj AsteroidSpecifics, but you can also write functions that work with any such moving object like
advance :: TimeStep -> MovingObj s -> MovingObj s
advance h (MovingObj v p s) = MovingObj v (p .+^ h*^v) s
There is no inheritance in Haskell (at least, not the kind you associate with object-oriented classes). You just want composition of data types.
data Particle = Particle { velocity :: Velocity
, position :: Position
}
-- Exercise for the reader: research the GHC extension that
-- allows all three of these types to use the same name `p`
-- for the particle field.
data One = One { p1 :: Particle
, ... }
data Two = Two { p2 :: Particle
, ... }
data Three = Three { p3 :: Particle
, ... }
Or, you can define a type that encapsulates the other properties, and let those be added to different kinds of Particles.
data Properties = One { ... }
| Two { ... }
| Three { ... }
data Particle = Particle { velocity :: Velocity
, position :: Position
, properties :: Properties
}
(Or see #leftaroundabout's answer, which is a nicer way of handling this approach.)
I have tree that hold contains nodes of different types. These are tagged using a datatype:
data Wrapping = A Int
| B String
I want to write two functions:
scatter :: Wrapping -> a
gather :: a -> Output
The idea is that I can use (scatter.gather) :: Wrapping -> Output. There will of course be several different variations on both the scatter and the gather function (with each scatter variant having a unique Wrappingn datatype, but the set of intermediate types will always be the same) and I want to be able to cleanly compose them.
The issue that I have is that the type parameter a is not really free, it is a small explicit set of types (here it is {Int,String}). If I try to encode what I have so far into Haskell typeclasses then I get to:
{-# LANGUAGE FlexibleInstances #-}
data Wrapping = A Int | B String
class Fanin a where
gather :: a -> String
instance Fanin Int where
gather x = show x
instance Fanin String where
gather x = x
class Fanout a where
scatter :: Fanout a => Wrapping -> a
instance Fanout Int where
scatter (A n) = n
instance Fanout String where
scatter (B x) = x
combined = gather.scatter
The two classes typecheck fine but obviously the final line throws errors because ghc knows that the type parameters do match on every case, only on the two that I have defined. I've tried various combinations of extending one class from the other:
class Fanin a => Fanout a where ...
class Fanout a => Fanin a where ...
Finally I've looked at GADTs and existential types to solve this but I am stumbling around in the dark. I can't find a way to express a legal qualified type signature to GHC, where I've tried combinations of:
{-# LANGUAGE RankNTypes #-}
class (forall a. Fanout a) => Fanin a where
class (forall a. Fanin a) => Fanout a where
Question: how do I express to GHC that I want to restrict a to only the two types in the set?
I get the feeling that the solution lies in one of the techniques that I've looked at but I'm too lost to see what it is...
The idea is that I can use (scatter.gather) :: Wrapping -> Output.
There will of course be several different variations on both the
scatter and the gather function (with each scatter variant having a
unique Wrappingn datatype, but the set of intermediate types will
always be the same) and I want to be able to cleanly compose them.
If I understand correctly, you'd like to have different Wrapping types but the intermediate a type is constantly Either Int String. We can just reflect this information in our classes:
data Wrapping = A Int
| B String
class Fanout wrap where
scatter :: wrap -> Either Int String
instance Fanout Wrapping where
scatter (A n) = Left n
scatter (B str) = Right str
class Fanin output where
gather :: Either Int String -> output
instance Fanin String where
gather = either show id
combined :: Wrapping -> String
combined = gather . scatter
Also, this use case doesn't seem especially amenable to type classes, from what I can glean from the question. In particular, we can get rid of Fanin, and then combined = either show id . scatter looks better to my eyes than the previous definition.
The type class solution makes sense here only if just a single Either Int String -> a or a -> Either Int String function makes sense for each a, and you'd like to enforce this.
If I understand you correctly you need something like the following:
module Main ( main ) where
-- Different kinds of wrapper data types
data WrapperA = A Int | B String
data WrapperB = C Int | D Float
-- A single intermediate data type (with phantom type)
data Intermediate a = E Int | F String
-- Generic scatter and gather functions
class Wrapped a where
scatter :: Wrapped a => a -> Intermediate a
gather :: Wrapped a => Intermediate a -> String
-- Specific scatter and gather implementations
instance Wrapped WrapperA where
scatter (A i) = E i
scatter (B s) = F s
gather (E i) = show i
gather (F s) = s
instance Wrapped WrapperB where
scatter (C i) = E i
scatter (D f) = F $ show f
gather (E i) = show i
gather (F s) = s ++ " was a float"
-- Beautiful composability
combined :: Wrapped a => a -> String
combined = gather . scatter
wrapperAexample1 = A 10
wrapperAexample2 = B "testing"
wrapperBexample1 = C 11
wrapperBexample2 = D 12.4
main :: IO ()
main = do print $ combined wrapperAexample1
print $ combined wrapperAexample2
print $ combined wrapperBexample1
print $ combined wrapperBexample2
The main issue seems to be that you have an intermediate type which can have different kinds of content, but this is constant for different wrappers. Still, depending on the kind of wrapper, you want the gather function to behave differently.
To do this, I would define the Intermediate type to specify the kinds of values that can be held in the intermediate stage, and give it a phantom type parameter (to remember what kind of wrapper it originated from). You can then define a class to hold the scatter and gather functions, and define these differently for different kinds of wrappers.
The code above compiles without errors for me, and gives the following output:
"10"
"testing"
"11"
"12.4 was a float"
As you can see, the WrapperB/D Float input is treated differently than the WrapperA/B String (it is tagged as a float value, even after it has been converted to a String). This is because in the Intermediate representation it is remembered that the origin is a WrapperB: the one is of type Intermediate WrapperA, the other of Intermediate WrapperB.
If, on the other hand, you don't actually want the gather function to behave differently for different wrappers, you can simply take that out of the class and take out the phantom type. The easiest way to let ghc know that the type in the intermediate stage can be Int or String seems to me to still define something like the Intermediate type, rather than use just a.
Learning Haskell, I write a formatter of C++ header files. First, I parse all class members into a-collection-of-class-members which is then passed to the formatting routine. To represent class members I have
data ClassMember = CmTypedef Typedef |
CmMethod Method |
CmOperatorOverload OperatorOverload |
CmVariable Variable |
CmFriendClass FriendClass |
CmDestructor Destructor
(I need to classify the class members this way because of some peculiarities of the formatting style.)
The problem that annoys me is that to "drag" any function defined for the class member types to the ClassMember level, I have to write a lot of redundant code. For example,
instance Formattable ClassMember where
format (CmTypedef td) = format td
format (CmMethod m) = format m
format (CmOperatorOverload oo) = format oo
format (CmVariable v) = format v
format (CmFriendClass fc) = format fc
format (CmDestructor d) = format d
instance Prettifyable ClassMember where
-- same story here
On the other hand, I would definitely like to have a list of ClassMember objects (at least, I think so), hence defining it as
data ClassMember a = ClassMember a
instance Formattable ClassMember a
format (ClassMember a) = format a
doesn't seem to be an option.
The alternatives I'm considering are:
Store in ClassMember not object instances themselves, but functions defined on the corresponding types, which are needed by the formatting routine. This approach breaks the modularity, IMO, as the parsing results, represented by [ClassMember], need to be aware of all their usages.
Define ClassMember as an existential type, so [ClassMember] is no longer a problem. I doubt whether this design is strict enough and, again, I need to specify all constraints in the definition, like data ClassMember = forall a . Formattable a => ClassMember a. Also, I would prefer a solution without using extensions.
Is what I'm doing a proper way to do it in Haskell or there is a better way?
First, consider trimming down that ADT a bit. Operator overloads and destructors are special kinds of methods, so it might make more sense to treat all three in CmMethod; Method will then have special ways to separate them. Alternatively, keep all three CmMethod, CmOperatorOverload, and CmDestructor, but let them all contain the same Method type.
But of course, you can reduce the complexity only so much.
As for the specific example of a Show instance: you really don't want to write that yourself except in some special cases. For your case, it's much more reasonable to have the instance derived automatically:
data ClassMember = CmTypedef Typedef
| CmMethod Method
| ...
| CmDestructor Destructor
deriving (Show)
This will give different results from your custom instance – because yours is wrong: showing a contained result should also give information about the constructor.
If you're not really interested in Show but talking about another class C that does something more specific to ClassMembers – well, then you probably shouldn't have defined C in the first place! The purpose of type classes is to express mathematical concepts that hold for a great variety of types.
A possible solution is to use records.
It can be used without extensions and preserves flexibility.
There is still some boilerplate code, but you need to type it only once for all. So if you would need to perform another set of operations over your ClassMember, it would be very easy and quick to do it.
Here is an example for your particular case (template Haskell and Control.Lens makes things easier but are not mandatory):
{-# LANGUAGE TemplateHaskell #-}
module Test.ClassMember
import Control.Lens
-- | The class member as initially defined.
data ClassMember =
CmTypedef Typedef
| CmMethod Method
| CmOperatorOverload OperatorOverload
| CmVariable Variable
| CmFriendClass FriendClass
| CmDestructor Destructor
-- | Some dummy definitions of the data types, so the code will compile.
data Typedef = Typedef
data Method = Method
data OperatorOverload = OperatorOverload
data Variable = Variable
data FriendClass = FriendClass
data Destructor = Destructor
{-|
A data type which defines one function per constructor.
Note the type a, which means that for a given Hanlder "a" all functions
must return "a" (as for a type class!).
-}
data Handler a = Handler
{
_handleType :: Typedef -> a
, _handleMethod :: Method -> a
, _handleOperator :: OperatorOverload -> a
, _handleVariable :: Variable -> a
, _handleFriendClass :: FriendClass -> a
, _handleDestructor :: Destructor -> a
}
{-|
Here I am using lenses. This is not mandatory at all, but makes life easier.
This is also the reason of the TemplateHaskell language pragma above.
-}
makeLenses ''Handler
{-|
A function acting as a dispatcher (the boilerplate code!!!), telling which
function of the handler must be used for a given constructor.
-}
handle :: Handler a -> ClassMember -> a
handle handler member =
case member of
CmTypedef a -> handler^.handleType $ a
CmMethod a -> handler^.handleMethod $ a
CmOperatorOverload a -> handler^.handleOperator $ a
CmVariable a -> handler^.handleVariable $ a
CmFriendClass a -> handler^.handleFriendClass $ a
CmDestructor a) -> handler^.handleDestructor $ a
{-|
A dummy format method.
I kept things simple here, but you could define much more complicated
functions.
You could even define some generic functions separately and... you could define
them with some extra arguments that you would only provide when building
the Handler! An (dummy!) example is the way the destructor function is
constructed.
-}
format :: Handler String
format = Handler
(\x -> "type")
(\x -> "method")
(\x -> "operator")
(\x -> "variable")
(\x -> "Friend")
(destructorFunc $ (++) "format ")
{-|
A dummy function showcasing partial application.
It has one more argument than handleDestructor. In practice you are free
to add as many as you wish as long as it ends with the expected type
(Destructor -> String).
-}
destructorFunc :: (String -> String) -> Destructor -> String
destructorFunc f _ = f "destructor"
{-|
Construction of the pretty handler which illustrates the reason why
using lens by keeping a nice and concise syntax.
The "&" is the backward operator and ".~" is the set operator.
All we do here is to change the functions of the handleType and the
handleDestructor.
-}
pretty :: Handler String
pretty = format & handleType .~ (\x -> "Pretty type")
& handleDestructor .~ (destructorFunc ((++) "Pretty "))
And now we can run some tests:
test1 = handle format (CmDestructor Destructor)
> "format destructor"
test2 = handle pretty (CmDestructor Destructor)
> "Pretty destructor"
how can i group getX and putX in a class instance ?
the code below is an answer for this post Class set method in Haskell using State-Monad
import Control.Monad.State
data Point = Point { x :: Int, y :: Int } deriving Show
getX :: State Point Int
getX = get >>= return . x
putX :: Int -> State Point ()
putX newVal = do
pt - get
put (pt { x = newVal })
increaseX :: State Point ()
increaseX = do
x - getX
putX (x + 1)
Later I hope I will implement setters and getters for a hierarchy of 2 classes, but for now i just wanna do something like this:
class A a where
putX :: Int -> State Point ()
instance A (State Point) where
putX newVal = do
pt - get
put (pt { x = newVal })
You seem to be conflating multiple concepts here, to the point that I'm not sure exactly what you're aiming to accomplish. A few thoughts on what you might be after:
Field access, i.e., a way to inspect or replace a piece of a larger data structure. This doesn't really lend itself to a type class, because for many combinations of "inner field" and "data structure" there will be more than one accessor possible. Abstractions along these lines are often called "lenses".
Stateful references in some generic fashion. For the most part in a State monad this amounts to combining something like the aforementioned lenses with the standard get and put. In this case you could have a type class for the combination of a particular monad and the accessor data type, but it wouldn't really do that much.
Overloading access to a particular field, so that functions can work on any data type that contains an "x" field. In this case I assume you'd also want "y", as some sort of type class for 2D points. This is entirely separate from the above issues, however.
Perhaps you could clarify your goal?