Haskell: Generating graphs with QuickCheck properties

Graphs have these properties:
The type 'Edge' represents an edge between two nodes.
data Edge v = Edge {source :: v, target :: v}
    deriving (Show,Eq,Ord)
The 'Graph' type represents a directed graph.
data Graph v = Graph {nodes :: Set v, edges :: Set (Edge v)}
    deriving Show
The function 'isDAG' tests if a graph is acyclic.
isDAG :: Ord v => Graph v -> Bool
isDAG g = isValid g && all nocycle (nodes g)
    where nocycle v = all (\a -> v `notMember` reachable g a) $ Set.map target (adj g v)
The function 'isForest' tests if a valid DAG is a forest (a set of trees).
isForest :: Ord v => DAG v -> Bool
isForest g = isDAG g && all (\v -> length (adj g v) <= 1) (nodes g)
The generators code is:
DAGs generator
dag :: (Ord v, Arbitrary v) => Gen (DAG v)
dag = arbitrary `suchThat` isDAG
Forests generator
forest :: (Ord v, Arbitrary v) => Gen (Forest v)
forest = arbitrary `suchThat` isForest
I want to improve the dag and forest generators so that they build values directly from these properties instead of filtering with 'suchThat'. How can I do that?
Thank you in advance.

I believe the core question is how to generate DAGs and forests.
What is a forest? A forest is a collection of trees. What is a tree? A tree is a graph where every node except the root has exactly one parent. How do we turn that into an algorithm? Generate a list of nodes. Then, going through the list from the left, for every node randomly pick an element to its right as its parent and create an edge to it. Let a node occasionally pick no parent at all, so that the result can contain more than one tree.
What is a DAG? A DAG is a directed acyclic graph. What can we do with DAGs? We can topologically order them, which means we can put the nodes in a sequence where every edge goes from left to right. How do we turn that into an algorithm? Generate a list of nodes. For every node in the list, going from the left, randomly pick a subset of the elements to its right and create an edge to each of them.
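The two recipes above could be sketched roughly as follows. This is only a sketch under assumptions: it reuses the question's Edge and Graph types directly (the question's DAG and Forest types are not shown, so they are not used), genDag and genForest are made-up names, and an edge is read as "child points to parent", which matches the length (adj g v) <= 1 check only if adj returns outgoing edges.

```haskell
import           Data.List       (nub, tails)
import           Data.Set        (Set)
import qualified Data.Set        as Set
import           Test.QuickCheck

data Edge v = Edge { source :: v, target :: v }
    deriving (Show, Eq, Ord)

data Graph v = Graph { nodes :: Set v, edges :: Set (Edge v) }
    deriving Show

-- | DAG by construction: the list order is the topological order,
-- and every edge goes from a node to some node later in the list.
genDag :: (Ord v, Arbitrary v) => Gen (Graph v)
genDag = do
    vs <- nub <$> arbitrary                    -- distinct nodes
    es <- concat <$> mapM pickTargets (tails vs)
    pure (Graph (Set.fromList vs) (Set.fromList es))
  where
    pickTargets (v : rest) = do
        ts <- sublistOf rest                   -- random subset of later nodes
        pure [Edge v t | t <- ts]
    pickTargets [] = pure []

-- | Forest by construction: every node has at most one outgoing edge
-- (to its parent), and the parent is always a later node in the list.
genForest :: (Ord v, Arbitrary v) => Gen (Graph v)
genForest = do
    vs <- nub <$> arbitrary
    es <- concat <$> mapM pickParent (tails vs)
    pure (Graph (Set.fromList vs) (Set.fromList es))
  where
    pickParent (v : rest@(_ : _)) = do
        isRoot <- arbitrary                    -- some nodes stay roots
        if isRoot
            then pure []
            else do
                p <- elements rest
                pure [Edge v p]
    pickParent _ = pure []                     -- last node (or empty list): no parent
```

Because no candidate is ever rejected, these generators do not depend on how rare DAGs or forests are among arbitrary graphs, which is the main cost of the suchThat versions.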

Related

Alpha Beta Pruning with Recursion Schemes

I'm trying to get more proficient with recursion schemes as they have so far been really helpful for turning gnarly explicit recursion code into something less spike-y. One of the other tools I tend to reach for when implementing algorithms that can get really confusing with explicit recursion is monad transformers / mutability. Ideally I'd like to get comfortable enough with recursion schemes such that I can ditch statefulness altogether. An example of an algorithm I'd still reach for the transformers for is minimax with alpha beta pruning. I did normal minimax with a catamorphism and minimax f-algebra (data MinimaxF a f = MMResult a | MMState [f] Bool), but I wasn't sure how I could extend this to do alpha beta pruning. I thought maybe I could use histomorphism, or maybe there was some custom solution with comonads, but I didn't know how to approach trying a solution using either technique.
In addition to a version of alpha beta pruning with recursion schemes any general advice you have about tackling similar problems would be much appreciated. For example I've had trouble applying recursion schemes to algorithms like Dijkstra that usually are implemented in an imperative fashion.
Alpha-beta can be seen as an instance of minimax, where min and max are instantiated using a well-chosen lattice. Full gist.
We represent games as a tree, where each internal node is a position in the game, waiting for a designated player to pick a move to a child node, and each leaf is a final position with its score, or value.
-- | At every step, either the game ended with a value/score,
-- or one of the players is to play.
data GameF a r = Value a | Play Player (NonEmpty r)
    deriving Functor
type Game a = Fix (GameF a)
-- | One player wants to maximize the score,
-- the other wants to minimize the score.
data Player = Mini | Maxi
minimax will work on any lattice, defined by the following class:
class Lattice l where
    inf, sup :: l -> l -> l
The Lattice class is more general than Ord: an Ord instance is a Lattice with decidable equality (Eq). If we could redefine Ord, then it would be appropriate to add Lattice as a superclass. But here a newtype will have to do:
-- The Lattice induced by an Ord
newtype Order a = Order { unOrder :: a }
    deriving (Eq, Ord)
instance Ord a => Lattice (Order a) where
    inf = min
    sup = max
Here's minimax. It is parameterized by an embedding leaf :: a -> l of final values to the chosen lattice. One player maximizes the embedded value, the other player minimizes it.
-- | Generalized minimax
gminimax :: Lattice l => (a -> l) -> Game a -> l
gminimax leaf = cata minimaxF where
    minimaxF (Value x) = leaf x
    minimaxF (Play p xs) = foldr1 (lopti p) xs
lopti :: Lattice l => Player -> l -> l -> l
lopti Mini = inf
lopti Maxi = sup
The "regular" minimax uses the scores of the game directly as the lattice:
minimax :: Ord a => Game a -> a
minimax = unOrder . gminimax Order
For alpha-beta pruning, the idea is that we can keep track of some bounds on the optimal score, and this allows us to short-circuit the search. So the search is to be parameterized by that interval (alpha, beta). This leads us to a lattice of functions Interval a -> a:
newtype Pruning a = Pruning { unPruning :: Interval a -> a }
An interval can be represented by (Maybe a, Maybe a) to allow either side to be unbounded. But we shall use better named types for clarity, and also to leverage a different Ord instance on each side:
type Interval a = (WithBot a, WithTop a)
data WithBot a = Bot | NoBot a deriving (Eq, Ord)
data WithTop a = NoTop a | Top deriving (Eq, Ord)
We will require that we can only construct Pruning f if f satisfies clamp i (f i) = clamp i (f (Bot, Top)), where clamp is defined below. That way, f is a search algorithm which may short-circuit if it learns that its result lies outside of the interval, without having to find the exact result.
clamp :: Ord a => Interval a -> a -> a
clamp (l, r) = clampBot l . clampTop r
clampBot :: Ord a => WithBot a -> a -> a
clampBot Bot x = x
clampBot (NoBot y) x = max y x
clampTop :: Ord a => WithTop a -> a -> a
clampTop Top x = x
clampTop (NoTop y) x = min y x
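To make the behaviour of clamp concrete, here is a small self-contained sketch that repeats the definitions above and evaluates a few cases. The name clampExamples is only illustrative; the interval (NoBot 0, NoTop 10) is an arbitrary choice.

```haskell
type Interval a = (WithBot a, WithTop a)

-- Derived Ord follows constructor order: Bot < NoBot x, and NoTop x < Top.
data WithBot a = Bot | NoBot a deriving (Eq, Ord)
data WithTop a = NoTop a | Top deriving (Eq, Ord)

clamp :: Ord a => Interval a -> a -> a
clamp (l, r) = clampBot l . clampTop r

clampBot :: Ord a => WithBot a -> a -> a
clampBot Bot       x = x
clampBot (NoBot y) x = max y x

clampTop :: Ord a => WithTop a -> a -> a
clampTop Top       x = x
clampTop (NoTop y) x = min y x

clampExamples :: [Int]
clampExamples =
    [ clamp (NoBot 0, NoTop 10) 15    -- above the interval: 10
    , clamp (NoBot 0, NoTop 10) (-3)  -- below the interval: 0
    , clamp (NoBot 0, NoTop 10) 7     -- inside the interval: 7
    , clamp (Bot, Top) 15             -- unbounded interval: 15
    ]
```

Two results that differ only outside the interval become equal after clamping, which is the sense in which a pruned search may return an inexact value.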
Functions form a lattice by pointwise lifting. And when we consider only functions satisfying clamp i (f i) = clamp i (f (Bot, Top)) and equate them modulo a suitable equivalence relation (Pruning f = Pruning g if clamp <*> f = clamp <*> g), a short-circuiting definition of the lattice becomes possible.
The inf of two functions l and r, given an interval i = (alpha, beta), first runs l (alpha, beta) to obtain a value vl.
If vl <= alpha, then it must be clamp i vl == alpha == clamp i (min vl (r i)) so we can stop and return vl without looking at r. Otherwise, we run r, knowing that the final result is not going to be more than vl so we can also update the upper bound passed to r. sup is defined symmetrically.
instance Ord a => Lattice (Pruning a) where
    inf l r = Pruning \(alpha, beta) ->
        let vl = unPruning l (alpha, beta) in
        if NoBot vl <= alpha then vl else min vl (unPruning r (alpha, min (NoTop vl) beta))
    sup l r = Pruning \(alpha, beta) ->
        let vl = unPruning l (alpha, beta) in
        if beta <= NoTop vl then vl else max vl (unPruning r (max (NoBot vl) alpha, beta))
Thus we obtain alpha-beta as an instance of minimax. Once the lattice above is defined, we only need some simple wrapping and unwrapping.
alphabeta :: Ord a => Game a -> a
alphabeta = runPruning . gminimax constPruning
constPruning :: a -> Pruning a
constPruning = Pruning . const
runPruning :: Pruning a -> a
runPruning f = unPruning f (Bot, Top)
If all goes well, alphabeta and minimax should have the same result:
main :: IO ()
main = quickCheck \g -> minimax g === alphabeta (g :: Game Int)
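For readers who want to try the pieces on a concrete tree, here is a condensed, self-contained sketch. It is not the full gist: Fix and cata are defined locally as stand-ins for the ones a recursion-schemes library would provide, the lambdas use $ so that BlockArguments is not needed, and the helper names leafG, nodeG, demo and prunedDemo are made up for this example.

```haskell
{-# LANGUAGE DeriveFunctor #-}
import Data.List.NonEmpty (NonEmpty (..))

-- Local stand-ins for Fix and cata.
newtype Fix f = Fix (f (Fix f))

cata :: Functor f => (f a -> a) -> Fix f -> a
cata alg (Fix x) = alg (fmap (cata alg) x)

data Player = Mini | Maxi
data GameF a r = Value a | Play Player (NonEmpty r)
    deriving Functor
type Game a = Fix (GameF a)

class Lattice l where
    inf, sup :: l -> l -> l

newtype Order a = Order { unOrder :: a } deriving (Eq, Ord)
instance Ord a => Lattice (Order a) where
    inf = min
    sup = max

lopti :: Lattice l => Player -> l -> l -> l
lopti Mini = inf
lopti Maxi = sup

gminimax :: Lattice l => (a -> l) -> Game a -> l
gminimax embed = cata alg
  where
    alg (Value x)   = embed x
    alg (Play p xs) = foldr1 (lopti p) xs

data WithBot a = Bot | NoBot a deriving (Eq, Ord)
data WithTop a = NoTop a | Top deriving (Eq, Ord)
type Interval a = (WithBot a, WithTop a)
newtype Pruning a = Pruning { unPruning :: Interval a -> a }

instance Ord a => Lattice (Pruning a) where
    inf l r = Pruning $ \(alpha, beta) ->
        let vl = unPruning l (alpha, beta)
        in if NoBot vl <= alpha then vl
           else min vl (unPruning r (alpha, min (NoTop vl) beta))
    sup l r = Pruning $ \(alpha, beta) ->
        let vl = unPruning l (alpha, beta)
        in if beta <= NoTop vl then vl
           else max vl (unPruning r (max (NoBot vl) alpha, beta))

minimax, alphabeta :: Ord a => Game a -> a
minimax     = unOrder . gminimax Order
alphabeta g = unPruning (gminimax (Pruning . const) g) (Bot, Top)

-- Hypothetical helpers for writing trees by hand.
leafG :: a -> Game a
leafG = Fix . Value

nodeG :: Player -> NonEmpty (Game a) -> Game a
nodeG p = Fix . Play p

-- Maxi picks between two Mini nodes: min 3 5 = 3 and min 2 9 = 2, so the value is 3.
demo :: Game Int
demo = nodeG Maxi (nodeG Mini (leafG 3 :| [leafG 5]) :| [nodeG Mini (leafG 2 :| [leafG 9])])

-- Same tree, but the last leaf fails if it is ever evaluated.
-- alphabeta never looks at it: after seeing 2 <= alpha = 3 the second Mini node stops.
prunedDemo :: Game Int
prunedDemo =
    nodeG Maxi (nodeG Mini (leafG 3 :| [leafG 5])
        :| [nodeG Mini (leafG 2 :| [leafG (error "pruned leaf was evaluated")])])
```

Here minimax demo and alphabeta demo both evaluate to 3, and alphabeta prunedDemo also evaluates to 3 because the failing leaf is skipped, whereas minimax prunedDemo would hit the error.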

Requiring that a function must be a retraction in Haskell?

I'm envisioning an implementation of a monadic graph. I'll do my best to explain how it is to be constructed here.
The Graph type should be isomorphic to the following:
data Graph e v = Graph{ vertices :: [v], edges :: [(e, (v, v))] }
Where e is the edge type, and v is the vertex type, we include a list of vertices and a list of edges along with the vertices they connect.
What I'm envisioning is a monad instance of this type as follows:
instance Monad (Graph e) where
    return v = Graph [v] [] -- graph with a single vertex and no edges
    m >>= f = {- see below -}
I have an idea of how to implement >>= which basically takes each vertex, maps it to a new graph, and then re-connects the vertex which built each graph correspondingly based on how the original graph was connected.
For example, consider a function f which takes a vertex and produces the complete graph on two vertices (K_2) from it. Then if we bound K_2 itself to f, we'd get something like:
A----B
|    |
C    D
where the graph A----B was the original, and the graphs A----C and B----D were produced from A and B respectively. In the end, A and B need to be connected since they were connected in the original graph. Note that A and B need not be exactly the same, but they need to directly map to something in the new graph. I'm leaving out some information for simplicity (what are the edges of the graph, etc), but the main point I've noticed is that for this to actually work as a Monad instance, A needs to be directly mapped to a vertex in f A, and the same goes for B. In general, each vertex in the original graph needs to be mapped directly to a graph in the graph resulting from f.
If I'm understanding correctly, this means that f must be a retraction for some other morphism g. If it is, we can clearly join the graph by connecting each morphed vertex in its resulting graph to the morphed vertices in the others, producing a new graph of the type we want.
Mostly this is just an idea I had, but I really wanted to know if there is any way, in Haskell, to require that f be a retraction. Is there a way to state this within the confines of the language in order to supply an appropriate instance of Monad for a graph, or must I say "this is really only a monad if the function you're binding to is a retraction"? I suspect the latter, but I just wanted to check.
Alternatively, I may be understanding everything wrong! Feel free to correct me or give me some thoughts of your own.
Like the comments say, you could use a pointed graph:
module PointedGraph where
import Control.Arrow (second)
import Control.Monad (ap, liftM)
data PointedGraph e v = PointedGraph { hops :: [(e, PointedGraph e v)], center :: v }
    deriving (Eq, Show)
-- Functor and Applicative are superclasses of Monad since GHC 7.10,
-- so they are defined here in terms of the Monad instance.
instance Functor (PointedGraph e) where
    fmap = liftM
instance Applicative (PointedGraph e) where
    pure = PointedGraph []
    (<*>) = ap
instance Monad (PointedGraph e) where
    return = pure
    PointedGraph hs c >>= f = PointedGraph (hs' ++ map (second (>>= f)) hs) c'
        where PointedGraph hs' c' = f c
connect :: PointedGraph e v -> e -> PointedGraph e v -> PointedGraph e v
connect g e g' = g { hops = (e,g') : hops g }
k2, ex :: PointedGraph String Int
k2 = connect (return 0) "original" (return 2)
ex = do
    n <- k2
    connect (return n) "derived" (return $ n + 1)
So this makes:
k2: 0 -original-> 2
ex: 0 -original-> 2
    |             |
 derived       derived
    |             |
    v             v
    1             3
Note that we have no checking for uniqueness of the vertex labels (that'd require an Eq constraint or the like) so we could easily have something like
k2 >>= const k2:
    0 -original-> 0
    |             |
 original      original
    |             |
    v             v
    2             2

Creating a directed acyclic graph in haskell with lists and sets

I want to create a DAG in haskell but since I'm new to the whole functional programming thing I would like some directions.
The graph needs to be built with only lists and sets, and the following functions must be implemented:
v = add_vertex(g,w)
A vertex with the specified weight w is added to the DAG g and its unique vertex identifier v is returned.
add_edge(g,a,b,w)
An edge from the vertex with vertex identifier a to the vertex with vertex identifier b is added to the DAG g with weight w.
What I've done so far is creating a data type which looks like this:
data Graph v w = Graph {vertices :: [(v, w)],
                        edges :: [([(v, w)], [(v, w)], w)]} deriving Show
And I guess I need some form of constructor for the graph, it looks like this:
create_graph :: (v,w) -> w -> Graph v w
create_graph v w = Graph [v] [(v, v, w)]
What I would like to do is to create just an empty graph, but now I need to input some starting values if I understand correctly. How can I fix that?
The add_vertex function looks like this:
add_vertex :: Graph v w -> (v, w) -> Graph v w
add_vertex (Graph v w) x = Graph (v ++ [x]) w
But I don't really know how to return a vertex identifier instead of the whole graph. I guess I should also specify that the identifier needs to be a char and the weights can be either floats or ints. Where do I do that?
I would also like to have functions for topological ordering and getting the weight for the longest path. With this in mind, should I define the structure of the graph in a different way?
Thanks
What I would do is define the graph like a tree:
data Graph a = Graph [(a,[a])] -- Graph is a list of origins paired with their edge ends
createGraph :: Eq a => [(a,a)] -> Graph a
createGraph = undefined
empty :: Graph a
empty = Graph []
insertVertex :: Eq a => a -> Graph a -> Graph a
insertVertex = undefined -- insert if not already in the Graph (with empty edges)
insertEdge :: Eq a => (a,a) -> Graph a -> Graph a
insertEdge = undefined -- insert edge in list of origin
                       -- do not forget to add origin and end if they don't exist
Implement these and worry about BFS/topological sort later. Also think about the result of a BFS: what do you want it to return? (The result of a topological sort should be a list, I guess.)
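One possible way to fill in the stubs is sketched below. It is just one reading of the comments in the stubs, not the only sensible one: it ignores weights, keeps vertices in insertion order, silently drops duplicate edges, and does not check for cycles, so keeping the graph acyclic remains the caller's job. A newtype is used instead of data, and the derived Show and Eq instances are an addition.

```haskell
-- Adjacency list: each origin paired with the vertices its edges end in.
newtype Graph a = Graph [(a, [a])]
    deriving (Show, Eq)

empty :: Graph a
empty = Graph []

-- Insert a vertex with no outgoing edges, unless it is already present.
insertVertex :: Eq a => a -> Graph a -> Graph a
insertVertex v g@(Graph adj)
    | any ((== v) . fst) adj = g
    | otherwise              = Graph (adj ++ [(v, [])])

-- Insert an edge, adding both endpoints first if they are missing.
insertEdge :: Eq a => (a, a) -> Graph a -> Graph a
insertEdge (from, to) g = Graph (map addTarget adj)
  where
    Graph adj = insertVertex to (insertVertex from g)
    addTarget (v, ts)
        | v == from && to `notElem` ts = (v, ts ++ [to])
        | otherwise                    = (v, ts)

-- Build a graph from a list of edges.
createGraph :: Eq a => [(a, a)] -> Graph a
createGraph = foldl (flip insertEdge) empty
```

With these definitions, createGraph [('a','b'),('b','c')] evaluates to Graph [('a',"b"),('b',"c"),('c',"")].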

QuickCheck giving up investigating a recursive data structure (rose tree.)

Given an arbitrary tree, I can construct a subtype relation over that tree, using Schubert numbering:
constructH :: Tree a -> Tree (Type a)
where Type nests the original label, and additionally provides the data needed to perform child/parent (or subtype) checks. With Schubert Numbering, the two Int parameters are sufficient for that.
data Type a where !Int -> !Int -> a -> Type a
This leads to the binary predicate
subtypeOf :: Type a -> Type a -> Bool
I now want to test with QuickCheck that this does indeed do what I want it to do. The following property, however, does not work, because QuickCheck just gives up:
subtypeSanity ∷ Tree (Type ()) → Gen Prop
subtypeSanity Node { rootLabel = t, subForest = f } =
    let subtypes = concatMap flatten f
    in (not $ null subtypes) ==> conjoin
        (forAll (elements subtypes) (\x → x `subtypeOf` t):(map subtypeSanity f))
If I leave out the recursive call to subtypeSanity, i.e. the tail of the list I'm passing to conjoin, the property runs fine, but tests just the root node of the tree! How can I descend into my data structure recursively without QuickCheck giving up on generating new test cases?
If needed, I could provide the code to construct the Schubert Hierarchy, and the Arbitrary instance for Tree (Type a), to provide a complete runnable example, but that would be quite a bit of code. I'm convinced that I'm just not "getting" QuickCheck, and using it in the wrong way here.
EDIT: unfortunately, the sized function does not seem to eliminate the problem here. It ends up with the same result (see comment to J. Abrahamson's answer.)
EDIT II: I ended up "fixing" my problem by avoiding the recursive step, and avoiding conjoin. We just make a list of all nodes in the tree, then test the single-node property (which worked fine from the beginning) on those.
allNodes ∷ Tree a → [Tree a]
allNodes n@(Node { subForest = f }) = n:(concatMap allNodes f)
subtypeSanity ∷ Tree (Type ()) → Gen Prop
subtypeSanity tree = forAll (elements $ allNodes tree)
    (\(Node { rootLabel = t, subForest = f }) →
        let subtypes = concatMap flatten f
        in (not $ null subtypes) ==> forAll (elements subtypes) (\x → x `subtypeOf` t))
Tweaking the Arbitrary instance for trees did not work. Here is the arbitrary instance I'm still using:
instance (Arbitrary a, Eq a) ⇒ Arbitrary (Tree (Type a)) where
    arbitrary = liftM (constructH) $ sized arbTree
arbTree ∷ Arbitrary a ⇒ Int → Gen (Tree a)
arbTree n = do
    m ← choose (0,n)
    if m == 0
        then Node <$> arbitrary <*> (return [])
        else do part ← randomPartition n m
                Node <$> arbitrary <*> mapM arbTree part
-- this is a crude way to find a sufficiently random x1,..,xm,
-- such that x1 + .. + xm = n, for any n, m, with 0 < m.
randomPartition ∷ Int → Int → Gen [Int]
randomPartition n m' = do
    let m = m' - 1
    seed ← liftM ((++[n]) . sort) $ replicateM m (choose (0,n))
    return $ zipWith (-) seed (0:seed)
I consider the problem "solved for now," but if someone could explain to me why the recursive step and/or conjoin made QuickCheck give up (after passing "only" 0 tests,) I would be more than grateful.
When generating Arbitrary recursive structures, QuickCheck is often a bit too eager and generates sprawling, enormous random examples. These are undesirable as they usually don't better check the properties of interest and can be very slow. Two solutions are:
Use things like the size parameter (sized function) and frequency function to bias the generator toward small trees.
Use a small-type oriented generator like those in smallcheck. These try to exhaustively generate all "small" examples and thus help to keep the size of the tree down.
To clarify the sized and frequency method of controlling generation size, here's an example RoseTree
data Rose a = It a | Rose [Rose a]
instance Arbitrary a => Arbitrary (Rose a) where
    arbitrary = frequency
        [ (3, It <$> arbitrary) -- The 3-to-1 ratio is chosen, ah,
                                -- arbitrarily...
                                -- you'll want to tune it
        , (1, Rose <$> children)
        ]
      where
        -- Children are generated at half the size so that generation terminates.
        children = sized $ \n -> vectorOf n (resize (n `div` 2) arbitrary)
It can be done even more simply with a different Rose formation by very carefully controlling the size of the child list
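data Rose a = Rose a [Rose a]
instance Arbitrary a => Arbitrary (Rose a) where
    -- resize shrinks the size for the children; without it a node at size >= 3
    -- would always produce children at the same size and never stop.
    arbitrary = Rose <$> arbitrary <*> sized (\n -> vectorOf (tuneUp n) (resize (n `div` 2) arbitrary))
      where tuneUp n = round $ fromIntegral n / 4.0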
You could do this without referencing sized, but that gives the user of your Arbitrary instance a knob to ask for larger trees if needed.
In case it's useful for those stumbling across this issue: when QuickCheck "gives up", it's a sign that your pre-condition (using ==>) is too hard to satisfy.
QuickCheck uses a simple rejection sampling technique: pre-conditions have no effect on the generation of values. QuickCheck generates a bunch of random values like normal. After these are generated, they're sent through the pre-condition: if the result is True, the property is tested with that value; if it's False, that value is discarded. If your pre-condition rejects most of the values QuickCheck has generated, then QuickCheck will "give up" (better to give up completely, than to make statistically dubious pass/fail claims).
In particular, QuickCheck will not attempt to produce values which satisfy a given pre-condition. It's up to you to make sure that the generator you're using (arbitrary or otherwise) produces lots of values which pass your pre-condition.
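A small illustration of the difference, using sorted lists as a stand-in precondition. The property names and isSorted are made up for this sketch and are unrelated to the question's code. The first property filters arbitrary lists through ==>, so every unsorted list is discarded; the second builds sorted lists directly, so nothing is discarded.

```haskell
import Data.List       (insert, sort)
import Test.QuickCheck

isSorted :: [Int] -> Bool
isSorted xs = and (zipWith (<=) xs (drop 1 xs))

-- Rejection-based: QuickCheck generates any list and throws away the
-- unsorted ones. Longer random lists are almost never sorted, so most
-- candidates are discarded.
prop_insertRejecting :: Int -> [Int] -> Property
prop_insertRejecting x xs = isSorted xs ==> isSorted (insert x xs)

-- Generator-based: every generated list already satisfies the
-- precondition, so no test case is wasted.
prop_insertGenerated :: Int -> Property
prop_insertGenerated x =
    forAll (sort <$> arbitrary) $ \xs -> isSorted (insert x xs)
```

Running both with quickCheck and comparing the number of discarded cases in the output shows how much work the precondition throws away.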
Let's see how this is manifesting in your example:
subtypeSanity :: Tree (Type ()) -> Gen Prop
subtypeSanity Node { rootLabel = t, subForest = f } =
    let subtypes = concatMap flatten f
    in (not $ null subtypes) ==> conjoin
        (forAll (elements subtypes) (`subtypeOf` t):(map subtypeSanity f))
There is only one occurrence of ==>, so its precondition (not $ null subtypes) must be too hard to satisfy. This is due to the recursive call map subtypeSanity f: not only are you rejecting any Tree which has an empty subForest, you're also (due to the recursion) rejecting any Tree where the subForest contains Trees with empty subForests, and rejecting any Tree where the subForest contains Trees with subForests containing Trees with empty subForests, and so on.
According to your arbitrary instance, Trees are only nested to finite depth: eventually we will always reach an empty subForest, hence your recursive precondition will always fail, and QuickCheck will give up.

Haskell can't match type, claims rigid variable

I am new to Haskell, and I am playing around with creating a typeclass for graphs and the nodes in them. Since I want both directed and undirected graphs, I have
data Node = Node { label :: Char
                 , index :: Int
                 } deriving (Ord, Eq)
type Graph edgeType = ([Node], [edgeType])
data Edge = DirectedEdge {h :: Node, t :: Node}
          | UndirectedEdge {a :: Node, b :: Node}
instance Show Node where
    show n = ['(', label n, ')']
instance Show Edge where
    show (DirectedEdge h t) = show h ++ "->" ++ show t
    show (UndirectedEdge a b) = show a ++ "-" ++ show b
So I am distinguishing between directed and undirected edges. A graph must only have edges of either type. I also have the following:
nodes :: [Node]
nodes = zipWith Node ['a'..] [0..]
emptyGraph :: [Node] -> Graph edgeType
emptyGraph ns = (ns, [])
So far so good. However, I am writing a function connect, which connects a node to an existing graph. Ideally, I only want it to apply to undirected graphs, but that doesn't seem to be an option. Instead, I have something like this:
connect :: Graph edgeType -> Node -> Graph edgeType
connect (ns, es) n = (n:ns, e:es)
    where e = UndirectedEdge n (head ns)
But this gives the following error:
Couldn't match type `edgeType' with `Edge'
`edgeType' is a rigid type variable bound by
the type signature for
connect :: Graph edgeType -> Node -> Graph edgeType
What is the best way to accomplish what I am trying to achieve?
You probably want to have two separate edge types instead of Edge
data DirectedEdge = DirectedEdge { h :: Node, t :: Node}
data UndirectedEdge = UndirectedEdge { a :: Node, b :: Node}
And you probably want some kind of typeclass that gives you back a (Node, Node) given an arbitrary edge:
class HasNodeEndpoints a where
    endpoints :: a -> (Node, Node)
-- obvious instances for DirectedEdge and UndirectedEdge
Then when you want to talk about arbitrary graphs, you will write functions that work on Graph a, and probably on HasNodeEndpoints a => Graph a. Algorithms that care about the graph kind would work on Graph DirectedEdge and Graph UndirectedEdge for directed and undirected graphs, respectively.
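As a rough sketch of how this could look when put together: the instances below are the "obvious" ones, degreeOf is a made-up example of a function that works for any edge type, and connect is restricted to undirected graphs. Node derives Show here only to keep the sketch short, and connect handles the empty graph explicitly instead of calling head.

```haskell
data Node = Node { label :: Char, index :: Int }
    deriving (Eq, Ord, Show)

data DirectedEdge   = DirectedEdge   { h :: Node, t :: Node } deriving Show
data UndirectedEdge = UndirectedEdge { a :: Node, b :: Node } deriving Show

type Graph edgeType = ([Node], [edgeType])

class HasNodeEndpoints e where
    endpoints :: e -> (Node, Node)

instance HasNodeEndpoints DirectedEdge where
    endpoints (DirectedEdge from to) = (from, to)

instance HasNodeEndpoints UndirectedEdge where
    endpoints (UndirectedEdge x y) = (x, y)

-- Works on any graph whose edges expose their endpoints.
degreeOf :: HasNodeEndpoints e => Node -> Graph e -> Int
degreeOf n (_, es) =
    length [() | e <- es, let (x, y) = endpoints e, x == n || y == n]

-- Only makes sense for undirected graphs, and the type now says so.
connect :: Graph UndirectedEdge -> Node -> Graph UndirectedEdge
connect ([], es) n         = ([n], es)                            -- nothing to connect to yet
connect (ns@(m : _), es) n = (n : ns, UndirectedEdge n m : es)
```

Passing a Graph DirectedEdge to connect is then rejected by the type checker, which is the restriction the question was asking for.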
Another natural extension would be labeled directed and undirected edges.
class HasLabeled a where
    type Label a -- associated type synonym
    label :: a -> Label a
    updateLabel :: a -> (Label a -> Label a) -> a
-- now define data types and instances for labeled directed and undirected edges
Because you choose a specific edge type, namely Edge, when you use UndirectedEdge, the result is that your graph is no longer polymorphic in the edge type. It has to have the type:
connect :: Graph Edge -> Node -> Graph Edge
connect (ns, es) n = (n:ns, e:es)
    where e = UndirectedEdge n (head ns)
There is no other type your edges can be, given that explicit use of UndirectedEdge.
As an aside, I'd use strictness annotations on the nodes, just as a matter of good hygiene:
data Node = Node { label :: !Char
                 , index :: !Int
                 } deriving (Ord, Eq)
