Graphs in Haskell [duplicate] - haskell

I'm struggling to understand how simple graphs are represented in Haskell. My understanding is that a graph is basically a list of vertices and a list of edges (pairs of vertices).
I've been looking at the Data.Graph implementation in order to build a Graph constructor with a type variable "a" for a simple graph, but I don't understand what the constructor should look like and how it would store an Edge and a Vertex.
My initial thinking was to base my constructor on logic similar to that of a tree:
type Vertex = Int
type Edge = (Vertex, Vertex)
data Graph a = Void | Vertex a [Graph a]
But I'm not sure how the edges are then represented.

It's tempting to represent a graph structurally with Haskell ADTs, but this doesn't work the way it does with trees because of the presence of loops. It would be possible to represent only a spanning tree, but then the remaining edges need to be represented as addresses into the tree. That is possible but awkward, and if you need direct addressing anyway, what's the point of having the tree structure at all?† That's why the standard way is to flatten the graph completely into an array of vertices and a list (or another array) of edges.
(The vertices can't reasonably be stored in a list, because direct access would be too slow.)
If you want to add extra data, as you would add data to list nodes, you can just add it to the vertex data.
{-# LANGUAGE DeriveFunctor #-}
-- Qualified import, so that Vector's list-like names do not clash with the Prelude.
import qualified Data.Vector as V
newtype Vertex a = Vertex { getVertexData :: a }
  deriving (Functor, Eq, Show)
type VertexIndex = Int
type Edge = (VertexIndex, VertexIndex)
data Graph a = Graph
  { graphVertices :: V.Vector (Vertex a)
  , graphEdges :: V.Vector Edge
  } deriving (Functor, Show)
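As a rough illustration of how such a flat representation is queried, here is a minimal sketch. It swaps Data.Vector for Data.Array (from the array package that ships with GHC) purely so that it runs without extra dependencies, stores the payload directly instead of wrapping it in Vertex, and treats edges as undirected. The names mkGraph, neighbours, vertexData and triangle are made up for this example.

```haskell
import Data.Array (Array, listArray, (!))

type VertexIndex = Int
type Edge = (VertexIndex, VertexIndex)

-- Flat layout: vertex payloads in an array, edges as index pairs.
data Graph a = Graph
  { graphVertices :: Array VertexIndex a
  , graphEdges    :: [Edge]
  }

-- Build a graph from a list of payloads (indexed from 0) and a list of edges.
mkGraph :: [a] -> [Edge] -> Graph a
mkGraph vs es = Graph (listArray (0, length vs - 1) vs) es

-- Indices adjacent to i, reading every edge as undirected.
-- This scans the whole edge list, so it is linear in the number of edges.
neighbours :: Graph a -> VertexIndex -> [VertexIndex]
neighbours g i =
  [ v | (u, v) <- graphEdges g, u == i ] ++
  [ u | (u, v) <- graphEdges g, v == i, u /= v ]

-- Constant-time lookup of the data stored at a vertex.
vertexData :: Graph a -> VertexIndex -> a
vertexData g i = graphVertices g ! i

-- A three-vertex cycle, which a plain tree type could not express directly.
triangle :: Graph String
triangle = mkGraph ["a", "b", "c"] [(0, 1), (1, 2), (2, 0)]
```

Evaluating map (vertexData triangle) (neighbours triangle 1) shows the point of direct addressing: the edges only carry indices, and the array turns them back into data.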
†IMO there are actually valid reasons to want a tree structure, including support for laziness. This could be used in some kind of comonadic interface; I dabbled with that once, not sure if somebody has done it properly somewhere.
Just for fun, here is a simple (and inefficient) implementation of nonempty connected graphs based on a spanning tree.
data TreeAddress = Here
                 | Up TreeAddress
                 | Down Int TreeAddress
data ConnectedGraph a = Vertex
  { vertexContainedData :: a
  , managedNeighbours :: [ConnectedGraph a]
  , unmanagedNeighbours :: [TreeAddress]
  }
To make it a bit less wasteful, TreeAddress could be condensed down into a single Int if we also keep track of the total number of vertices, or at least Integer if we quotient out the number of managed neighbours at each junction.
It would be a fun exercise to write a Comonad instance for this.
Ah, somebody seems to have done this in Scala.
And they use a library that was itself inspired by the Haskell fgl library! I knew somebody had to have done this already. In fact it's quite old.

Related

Data type for a simple, undirected graph without multiple edges or loops in Haskell

I am new to Haskell and I am trying to come up with a suitable way to represent a graph. First some background for an undirected simple graph. For all vertices u and v, an edge between u and v is the same as an edge between v and u, there is at most one edge between u and v, and there is no edge between u and u.
Later on I want to be able to write functions to 1) check if the graph is empty, 2) add new vertices, 3) add new edges, 4) obtain all neighbors of a vertex and 5) obtain a list of all vertices in a graph.
After doing research, I am a bit confused by all the ways to define a graph data type, which I also hope to get some help clarifying. All seem to agree that you need some way to represent a Vertex/Node and the edges/links to other Vertices/Nodes. However, the implementations differ.
Previously I made a tree with an arbitrary number of branches, following this question: tree-with-an-arbitrary-number-of-branches
data Tree a = Node a [Tree a]
  deriving (Eq, Show)
What's different with a graph, I guess, is that nodes on the same "level" and nodes on different "branches" can be connected with an edge.
What I came up with first was defining a data type using recursive data structures with a type variable where each Vertex/Node has a list with its associated nodes:
data Node a = Node a [a]
  deriving Show
data Graph a = Graph [Node a]
  deriving Show
However, what I am a bit unsure about is whether this representation makes it possible to insert new edges later on. With this definition a graph is just a list of nodes that in turn contain lists of the nodes they link to.
After doing research about how to represent a graph in Haskell I found some interesting ideas. The first was to define a graph just using type synonyms:
-- The identifier type is called NodeId here; the name Node cannot be declared twice.
type NodeId = Int
type Element = String
type Edge = (NodeId, NodeId)
type Node = (NodeId, Element)
type Graph = ([Node], [Edge])
Here we have that a Graph is a list of nodes with an associated list of its connections/links/edges. However, I was not sure what the corresponding data type definition would look like and with a type variable/parameter instead of a concrete type. In this regard, I found this question declare-data-constructor-for-graphs suggesting to represent a graph like this:
type Vertex = Integer
data Graph = Graph [Vertex] [(Vertex, Vertex)]
Which I guess with type parameter instead could be translated to:
data Graph a = Graph [a] [(a, a)]
Is this correct? Would this solution work for creating a simple, undirected graph without multiple edges or loops in Haskell, one that also supports the functions specified above?
In continuation, similar to this representation, I found this question directed-acyclic-graph where a graph is defined as:
data Graph a = Graph [(a,[a])] -- Graph is a list of origins paired with edgeends
Here I guess the author defines a graph as a list of tuples where each tuple consists of one node and a list of its linked nodes.
Another way I found was to use record syntax in the question graph-data-type-representation.
data Node a = Node { value :: a
                   , neighbors :: [Node a]
                   } deriving (Show)
-- or simply,
data Node a = Node a [Node a]
  deriving (Show)
Which I guess is the same reasoning. A Node/Vertex has a value and neighbors, which are just a list of other Vertices. Building on top of this, a graph definition would be:
data Graph a = Graph [Node a]
Or am I wrong? If not, this implementation is similar to my initial thinking, but differs in the definition of Node. I'm not sure which is more correct here.
In summary, I have found many ways to represent a graph data type in Haskell, but I am a bit confused about which one best suits my use case: creating a simple, undirected graph without multiple edges or loops that also supports the functions I would like to implement.
Looking forward to answers and comments that clarify this!
Ended up using
data Graph a = Graph [(a, [a])] deriving (Show)
Since algebraic data types are just a way to represent data, it is mostly about choosing a design that fits your needs. Here is a good source to read more about it: Learn You a Haskell
For example, choosing this data type representation makes adding vertices to a graph possible in this way.
addVertex :: Eq a => Graph a -> a -> Graph a
addVertex (Graph vList) v
  | not (containsVertex v vList) = Graph (vList ++ makeVertex v)
  | otherwise = Graph vList
makeVertex :: a -> [(a, [a])]
makeVertex x = [(x, [])]
containsVertex :: Eq a => a -> [(a, [a])] -> Bool
containsVertex _ [] = False
containsVertex x ((v, _) : ys) = x == v || containsVertex x ys
Thus, the graph is easy to deal with and manage when expressed in this way.
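The remaining operations from the question can be sketched along the same lines. This is only one possible reading, not the poster's actual code: the names emptyGraph, isEmpty, vertices, neighbours and addEdge are assumptions, and addEdge silently ignores self-loops, duplicate edges and unknown endpoints in order to keep the graph simple and undirected.

```haskell
-- Same representation as above: each vertex paired with its neighbour list.
data Graph a = Graph [(a, [a])] deriving (Show, Eq)

emptyGraph :: Graph a
emptyGraph = Graph []

isEmpty :: Graph a -> Bool
isEmpty (Graph vs) = null vs

vertices :: Graph a -> [a]
vertices (Graph vs) = map fst vs

-- Adding an existing vertex leaves the graph unchanged.
addVertex :: Eq a => Graph a -> a -> Graph a
addVertex g@(Graph vs) v
  | v `elem` vertices g = g
  | otherwise           = Graph (vs ++ [(v, [])])

-- Neighbours of a vertex; an unknown vertex has none.
neighbours :: Eq a => Graph a -> a -> [a]
neighbours (Graph vs) v = maybe [] id (lookup v vs)

-- Undirected edge: record v at u and u at v.
addEdge :: Eq a => Graph a -> a -> a -> Graph a
addEdge g@(Graph vs) u v
  | u == v                                           = g  -- no self-loops
  | u `notElem` vertices g || v `notElem` vertices g = g  -- both ends must exist
  | v `elem` neighbours g u                          = g  -- no multiple edges
  | otherwise                                        = Graph (map link vs)
  where
    link (x, ns)
      | x == u    = (x, ns ++ [v])
      | x == v    = (x, ns ++ [u])
      | otherwise = (x, ns)
```

Every operation here walks the list, so the cost is linear in the number of vertices. That is fine for small graphs, and a Data.Map keyed by vertex would be the usual next step if it becomes a bottleneck.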

Haskell: Assigning unique char to matrix values if x > 0

So my goal is for the program to receive an Int matrix as input and convert all numbers > 0 to a unique sequential char, while 0's convert into a '_' (doesn't matter, just any character not in the sequence).
e.g.
main> matrixGroupings [[0,2,1],[2,2,0],[0,0,2]]
[["_ab"],["cd_"],["__e"]]
The best I've been able to achieve is
[["_aa"],["aa_"],["__a"]]
using:
matrixGroupings xss = map (map (\x -> if x > 0 then 'a' else '_')) xss
As far as I can tell, the issue I'm having is getting the program to remember what its last value was, so that when the value check is > 0, it picks the next char in line. I can't for the life of me figure out how to do this though.
Any help would be appreciated.
Your problem is an instance of an ancient art: labelling of various structures with a stream of
labels. It dates back at least to Chris Okasaki, and my favourite treatment is by Jeremy
Gibbons.
As you can see from these two examples, there is some variety to the way a structure may be
labelled. But in this present case, I suppose the most straightforward way will do. And in Haskell
it would be really short. Let us dive in.
The recipe is this:
1. Define a polymorphic type for your matrices. It must be such that a matrix of numbers and a matrix of characters are both rightful members.
2. Provide an instance of the Traversable class. It may in many cases be derived automagically.
3. Pick a monad to your liking. One simple choice is State. (Actually, that is the only choice I can think of.)
4. Create an action in this monad that takes a number to a character.
5. Traverse a matrix with this action.
Let's cook!
A type may be as simple as this:
newtype Matrix a = Matrix [[a]] deriving Show
It is entirely possible that the inner lists will be of unequal length — this type does not
protect us from making a "ragged" matrix. This is poor design. But I am going to skim over
it for now. Haskell provides an endless depth for perfection. This type is good enough for
our needs here.
We can immediately define an example of a matrix:
example :: Matrix Int
example = Matrix [[0,2,1],[2,2,0],[0,0,2]]
How hard is it to define a Traversable? 0 hard.
{-# language DeriveTraversable #-}
...
newtype Matrix a = Matrix [[a]] deriving (Show, Functor, Foldable, Traversable)
Presto.
Where do we get labels from? It is a side effect. The function reaches somewhere, takes a
stream of labels, takes the head, and puts the tail back in the extra-dimensional pocket. A
monad that can do this is State.
It works like this:
-- State comes from the mtl package:
-- import Control.Monad.State (State, evalState, get, put)
label :: Int -> State String Char
label 0 = return '_'
label x = do
  ls <- get
  case ls of
    [] -> error "No more labels!"
    (l : ls') -> do
      put ls'
      return l
I hope the code explains itself. When a function "creates" a monadic value, we call it
"effectful", or an "action" in a given monad. For instance, print is an action that,
well, prints stuff. Which is an effect. label is also an action, though in a different
monad. Compare and see for yourself.
Now we are ready to cook a solution:
matrixGroupings m = evalState (traverse label m) ['a'..'z']
This is it.
λ matrixGroupings example
Matrix ["_ab","cd_","__e"]
Bon appetit!
P.S. I took all the glory from you, which is unfair. To make things fun again, I challenge you to an exercise: can you define a Traversable instance that labels a matrix in another order, by columns first and then rows?
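If mtl is not at hand, the same row-by-row labelling can be sketched with mapAccumL from Data.Traversable in base, which threads the remaining labels through the traversal explicitly. It is meant as an alternative to the State version above, not a replacement for it; labelStep and matrixGroupings' are names invented for this sketch.

```haskell
{-# LANGUAGE DeriveTraversable #-}
import Data.Traversable (mapAccumL)

newtype Matrix a = Matrix [[a]]
  deriving (Show, Eq, Functor, Foldable, Traversable)

-- One step of the traversal: given the labels still unused and a cell,
-- return the labels left over and the character for that cell.
labelStep :: String -> Int -> (String, Char)
labelStep ls       0 = (ls, '_')                  -- zeros consume no label
labelStep (l : ls) _ = (ls, l)                    -- take the next label
labelStep []       _ = error "No more labels!"

-- mapAccumL visits the cells in the order given by the derived Traversable
-- instance (row by row), passing the accumulator from one cell to the next.
matrixGroupings' :: Matrix Int -> Matrix Char
matrixGroupings' = snd . mapAccumL labelStep ['a' .. 'z']
```

The accumulator plays exactly the role of the State monad's hidden state, so the two versions should agree on any matrix with at most 26 non-zero cells.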

Using subclass implementation in the definition of superclass functions

In my Haskell program I have some typeclasses representing abstract notions of "shapes", namely
-- | Class representing shapes.
class Shape a where
  isColliding :: (Shape b) => a -> b -> Bool
  -- The method must mention the class variable, hence a -> Point.
  centroid :: a -> Point
-- | Class representing shapes composed of a finite number of vertices
-- and line segments connecting them.
class (Shape a) => Polygon a where
  vertices :: a -> Vertices
As you can see, Polygon is naturally a subclass of Shape. I also have some data types that are instances of these different typeclasses. For example:
data Box = Box Point Point Angle
instance Shape Box where
  ...
instance Polygon Box where
  ...
---------------------------------
data Circle = Circle Point Radius
instance Shape Circle where
  ...
I have many more possible shapes, such as NGon, RegularNGon, etc. I would like to be able to implement isColliding, but the information required to calculate whether two shapes are colliding is dependent upon the implementation of the specific instance of Shape. For example, to calculate if two boxes are colliding, I need their list of vertices. So I have a few questions:
1. Is there any way to "specialize" my function isColliding so that it is defined in a specific way for collisions of the type isColliding :: (Polygon b) => Box -> b -> Bool?
2. Is the structuring of my datatypes the best way to approach this problem, or am I misusing typeclasses and datatypes when the whole thing could be restructured to eliminate this problem?
I am rather new to Haskell, so if my question is worded poorly or any clarification is needed, please tell me.
Your current Shape class says “isColliding can tell whether this shape intersects another shape using only the methods of Shape on the other shape”, because its signature (Shape b) => a -> b -> Bool only tells you that b has an instance of Shape. So you’re right that this isn’t quite what you want.
One thing you can do is use MultiParamTypeClasses to describe a relationship between two types:
{-# LANGUAGE MultiParamTypeClasses #-}
class Colliding a b where
  collidesWith :: a -> b -> Bool
And then make instances for various concrete combinations of types:
instance Colliding Circle Box where
  Circle p r `collidesWith` Box p1 p2 θ = {- … -}
Here you know the concrete types of both a and b when defining the implementation. That might be good enough for your use case.
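To make this concrete, here is a small sketch with two instances filled in. The types are simplified stand-ins rather than the question's definitions: Point is assumed to be a pair of Doubles, and Box is assumed to be axis-aligned (two corners, no Angle) so that the overlap test stays short. The helper sq is also made up for the example.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}

-- Simplified stand-ins for the question's types (assumptions).
type Point = (Double, Double)
data Circle = Circle Point Double   -- centre and radius
data Box = Box Point Point          -- lower-left and upper-right corner, no rotation

class Colliding a b where
  collidesWith :: a -> b -> Bool

sq :: Double -> Double
sq x = x * x

-- Two circles touch or overlap when the distance between their centres
-- is at most the sum of their radii.
instance Colliding Circle Circle where
  collidesWith (Circle (x1, y1) r1) (Circle (x2, y2) r2) =
    sq (x1 - x2) + sq (y1 - y2) <= sq (r1 + r2)

-- Two axis-aligned boxes overlap when their extents overlap on both axes.
instance Colliding Box Box where
  collidesWith (Box (ax0, ay0) (ax1, ay1)) (Box (bx0, by0) (bx1, by1)) =
    ax0 <= bx1 && bx0 <= ax1 && ay0 <= by1 && by0 <= ay1
```

Each instance sees both concrete types, so it can use whatever data those types carry. A call such as collidesWith (Circle (0, 0) 1) (Box (0, 0) (1, 1)) would be rejected at compile time until a Colliding Circle Box instance is written.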
However, this leaves you with n² instances if you have n types. And you’ll run into problems if you try to define polymorphic instances like this:
instance (HasBoundingBox b) => Colliding Circle b where
  collidesWith = {- … -}
This overlaps with all your other instances for Colliding Circle: b will match any type, and the context only adds the constraint that b must have an instance of HasBoundingBox. That constraint is checked after instance resolution. You can work around this with OverlappingInstances or the newer OVERLAPPABLE/OVERLAPPING/OVERLAPS pragmas to tell GHC to choose the most specific matching instance, but this might be more trouble than it’s worth if you’re just getting familiar with Haskell.
I’d have to think on it more, but there are definitely alternative approaches. In the simplest case, if you only need to deal with a few different kinds of shape, then you can just make them a single sum type instead of separate data types:
data Shape
  = Circle Point Radius
  | Box Point Point Angle
  | …
Then your isColliding function can be of type Shape -> Shape -> Bool and just pattern-match on this type.
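A minimal sketch of that pattern match, under loudly simplified assumptions: Point is taken to be a pair of Doubles, Radius a Double, and Box is axis-aligned with its Angle dropped, since rotated boxes need more geometry than fits here. The helpers sq and clampTo are invented for the example.

```haskell
-- Simplified stand-ins for the question's types (assumptions).
type Point = (Double, Double)
type Radius = Double

data Shape
  = Circle Point Radius
  | Box Point Point      -- lower-left and upper-right corner; Angle omitted

sq :: Double -> Double
sq x = x * x

-- Restrict a value to the interval [lo, hi].
clampTo :: Double -> Double -> Double -> Double
clampTo lo hi = max lo . min hi

isColliding :: Shape -> Shape -> Bool
-- Circles: compare centre distance with the sum of the radii.
isColliding (Circle (x1, y1) r1) (Circle (x2, y2) r2) =
  sq (x1 - x2) + sq (y1 - y2) <= sq (r1 + r2)
-- Axis-aligned boxes: the extents must overlap on both axes.
isColliding (Box (ax0, ay0) (ax1, ay1)) (Box (bx0, by0) (bx1, by1)) =
  ax0 <= bx1 && bx0 <= ax1 && ay0 <= by1 && by0 <= ay1
-- Circle against box: find the box point nearest the centre, then
-- check whether it lies within the radius.
isColliding (Circle (cx, cy) r) (Box (x0, y0) (x1, y1)) =
  let nx = clampTo x0 x1 cx
      ny = clampTo y0 y1 cy
  in sq (cx - nx) + sq (cy - ny) <= sq r
-- Collision is symmetric, so the remaining case just flips the arguments.
isColliding b@(Box _ _) c@(Circle _ _) = isColliding c b
```

With every case in one function, GHC's incomplete-pattern warning (part of -Wall) can point out a missing combination whenever a new constructor is added, which the n² separate instances cannot do.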
Generally speaking, if you’re writing a typeclass, it should come with laws for how instances should behave, like mappend x mempty == mappend mempty x == x from Data.Monoid. If you can’t think of any equations that should always hold for instances of your class, you should prefer to represent things with plain old functions and data types instead.

How do I capture different scopes using the bound library?

I'm trying to use Edward's bound library to model the graph of levels in my game - at least levels as they are stored representationally, before being realised as OpenGL objects.
A level consists of a bunch of vertices, whereby we can form a wall between pairs of vertices. Vertices are also used to create simple polygons - sectors (rooms). A sector owns walls, but also has some material properties. There is a lot of sharing going on between vertices, walls, sectors and materials, so to exploit the graph-like nature of this, I turned to the bound library.
The code so far is,
-- Vertices live alone and aren't influenced by anything else. Perhaps these
-- should still be a functor, but act like Const?
data Vertex = Vertex
  { vertexPos :: V2 CFloat }
-- Textures also live alone, and are simply a wrapper around the path to the
-- texture.
data Texture = Texture
  { texturePath :: FilePath }
-- A Material needs to refer to one (or more) textures.
data Material a = Material
  { materialDiffuseTexture :: a
  , materialNormalMap :: Maybe a
  }
-- A Sector needs to refer to materials *and* vertices. How do I reference two
-- types of variables?
data Sector a = Sector
  { sectorFloorMaterial :: a
  , sectorWallMaterial :: a
  , sectorCeilingMaterial :: a
  , sectorVertices :: Vector a -- How do we guarantee these aren't material ids?
  , sectorFloorLevel :: Double
  , sectorCeilingLevel :: Double
  }
-- A wall points to the sectors on either side of the wall, but also its start
-- and end vertices. The same problem with 'Sector' appears here too.
data Wall a = Wall
  { wallFront :: a -- This should be a sector
  , wallBack :: Maybe a -- This should also be a sector
  , wallV1 :: a -- But this is a vertex
  , wallV2 :: a -- This is also a vertex
  }
-- Level ties this all together, with the various expressions making up the
-- vertices, the walls between them, the sectors, and the materials.
data Level = Level
  { levelVertices :: IntMap.IntMap Vertex
  , levelSectors :: Vector (Scope Int Sector ())
  , levelWalls :: Vector (Scope Int Wall ())
  , levelMaterials :: Vector (Scope Int Material ())
  , levelTextures :: Vector (Scope Int Texture ())
  }
However, I'm not sure if I'm putting the pieces together properly here. For example, I have Sector a, where I am using a to identify both vertices and materials. However, it's important that I use the right identifier in the right place!
I'd love to hear feedback on whether or not I'm going in the right direction by modelling this somewhat constrained AST via bound.

How would you represent a graph (the kind associated with the travelling salesman problem) in Haskell

It's pretty easy to represent a tree in Haskell:
data Tree a = Node (Tree a) a (Tree a) | Leaf a
but that's because it has no need for the concept of an imperative-style "pointer": each Node/Leaf has one and only one parent. I guess I could represent a graph as a list of lists of Maybe Ints, to create a table with Nothing for those nodes without a path between them and Just n for those with one, but that seems really ugly and unwieldy.
You can use a type like
type Graph a = [Node a]
data Node a = Node a [Node a]
The list of nodes holds the outgoing (or incoming if you prefer) edges of that node. Since you can build cyclic data structures, this can represent arbitrary (multi-)graphs. The drawback of this kind of graph structure is that it cannot be modified once you have built it. To do traversals, each node probably needs a unique name (which can be included in the a) so you can keep track of which nodes you have visited.
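A small sketch of why the unique names matter: the graph below is a three-node cycle, so a naive recursive walk would never terminate, while a walk that remembers visited names in a Data.Set (from the containers package that ships with GHC) does. The names cycleGraph and reachable are invented for this example, and the node payload itself is assumed to serve as the unique name.

```haskell
import qualified Data.Set as Set

type Graph a = [Node a]
data Node a = Node a [Node a]

-- Three nodes in a cycle: a -> b -> c -> a. The mutually recursive
-- definitions are what make the structure cyclic.
cycleGraph :: Graph String
cycleGraph = [a, b, c]
  where
    a = Node "a" [b]
    b = Node "b" [c]
    c = Node "c" [a]

-- Names of all nodes reachable from a start node, in depth-first order.
-- The set of names already seen is what stops the walk on a cycle.
reachable :: Ord a => Node a -> [a]
reachable start = reverse (snd (go (Set.empty, []) start))
  where
    go (seen, acc) (Node x ns)
      | x `Set.member` seen = (seen, acc)
      | otherwise           = foldl go (Set.insert x seen, x : acc) ns
```

Note that two distinct nodes carrying the same name would be treated as one by this walk, which is exactly why the names need to be unique.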
Disclaimer: below is a mostly pointless exercise in the "tying the knot" technique. fgl is the way to go if you want to actually use your graphs. However, if you are wondering how it's possible to represent cyclic data structures functionally, read on.
It is pretty easy to represent a graph in Haskell!
-- a directed graph
data Vertex a b = Vertex { vdata :: a, edges :: [Edge a b] }
data Edge a b = Edge { edata :: b, src :: Vertex a b, dst :: Vertex a b }
-- My graph, with vertices labeled with strings, and edges unlabeled
type Myvertex = Vertex String ()
type Myedge = Edge String ()
-- A couple of helpers for brevity
e :: Myvertex -> Myvertex -> Myedge
e = Edge ()
v :: String -> [Myedge] -> Myvertex
v = Vertex
-- This is a full 5-graph
mygraph5 = map vv [ "one", "two", "three", "four", "five" ] where
  vv s = let vk = v s (zipWith e (repeat vk) mygraph5) in vk
This is a cyclic, finite, recursive, purely functional data structure. Not a very efficient or beautiful one, but look, ma, no pointers! Here's an exercise: include incoming edges in the vertex
data Vertex a b = Vertex {vdata::a, outedges::[Edge a b], inedges::[Edge a b]}
It's easy to build a full graph that has two (indistinguishable) copies of each edge:
mygraph5 = map vv [ "one", "two", "three", "four", "five" ] where
  vv s =
    let vks = repeat vk
        vk = v s (zipWith e vks mygraph5)
                 (zipWith e mygraph5 vks)
    in vk
but try to build one that has one copy of each! (Imagine that there's some expensive computation involved in e v1 v2).
The knot-tying techniques that others have outlined can work, but are a bit of a pain, especially when you're trying to construct the graph on the fly. I think the approach you describe is a bit more practical. I would use an array/vector of node types where each node type holds a list/array/vector of neighbors (in addition to any other data you need) represented as ints of the appropriate size, where the int is an index into the node array. I probably wouldn't use Maybe Ints. With Int you can still use -1 or any suitable value as your uninitialized default. Once you have populated all your neighbor lists and know they are good values you won't need the failure machinery provided by Maybe anyway, which as you observed imposes overhead and inconvenience. But your pattern of using Maybe would be the correct thing to do if you needed to make complete use of all possible values the node pointer type could contain.
The simplest way is to give the vertices in the graph unique names (which could be as simple as Ints) and use either the usual adjacency matrix or neighbor list approaches, i.e., if the names are Ints, use either Array (Int,Int) Bool or Array Int [Int].
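Both layouts can be sketched with accumArray from Data.Array (the array package that ships with GHC). The function names adjacencyList and adjacencyMatrix are made up for this example, edges are read as undirected, and the input is assumed to contain no self-loops and only vertices within the given bounds.

```haskell
import Data.Array (Array, accumArray, (!))

type Vertex = Int

-- Neighbour-list form, Array Int [Int]: every edge (u, v) adds v to the
-- list stored at u and u to the list stored at v.
adjacencyList :: (Vertex, Vertex) -> [(Vertex, Vertex)] -> Array Vertex [Vertex]
adjacencyList bnds es = accumArray (flip (:)) [] bnds (concatMap both es)
  where
    both (u, v) = [(u, v), (v, u)]

-- Adjacency-matrix form, Array (Int, Int) Bool: start from all False and
-- set both (u, v) and (v, u) to True for every edge.
adjacencyMatrix :: (Vertex, Vertex) -> [(Vertex, Vertex)] -> Array (Vertex, Vertex) Bool
adjacencyMatrix (lo, hi) es =
  accumArray (\_ b -> b) False ((lo, lo), (hi, hi))
    [ (ix, True) | (u, v) <- es, ix <- [(u, v), (v, u)] ]
```

The list form takes space proportional to vertices plus edges and suits sparse graphs. The matrix form takes space quadratic in the number of vertices but answers "is there an edge between u and v?" with a single (!) lookup.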
Have a look at this knot-tying technique, it is used to create circular structures. You may need it if your graph contains cycles.
Also, you can represent your graph using the adjacency matrix.
Or you can keep maps between each node and the inbound and outbound edges.
In fact, each of them is useful in one context and a pain in others. Depending on your problem, you'll have to choose.

Resources