Data type for a simple, undirected graph without multiple edges or loops in Haskell - haskell

I am new to Haskell and I am trying to come up with a suitable way to represent a graph. First some background for an undirected simple graph. For all vertices u and v, an edge between u and v is the same as an edge between v and u, there is at most one edge between u and v, and there is no edge between u and u.
Later on I want to be able to write functions to 1) check whether the graph is empty, 2) add new vertices, 3) add new edges, 4) obtain all neighbors of a vertex and 5) obtain a list of all vertices in a graph.
After doing some research, I am a bit confused by all the ways to define a graph data type, and I hope to get some help clarifying them. All seem to agree that you need some way to represent a vertex/node and the edges/links to other vertices/nodes. However, the implementations differ.
Previously I wrote a tree with an arbitrary number of branches, following the question tree-with-an-arbitrary-number-of-branches:
data Tree a = Node a [Tree a]
  deriving (Eq, Show)
What is different with a graph, I guess, is that nodes on the same "level" and nodes on different "branches" can be connected with an edge, see figure below.
What I came up with first was defining a data type using recursive data structures with a type variable where each Vertex/Node has a list with its associated nodes:
data Node a = Node a [a]
  deriving Show
data Graph a = Graph [Node a]
  deriving Show
However, I am a bit unsure whether this representation makes it possible to insert new edges later on. With this definition a graph is just a list of nodes, each of which contains a list of the nodes it is linked to.
After doing research about how to represent a graph in Haskell I found some interesting ideas. The first was to define a graph just using type synonyms:
type Node = Int
type Element = String
type Edge = (Node, Node)
type LabeledNode = (Node, Element) -- renamed: the original reused the name Node, which does not compile
type Graph = ([LabeledNode], [Edge])
Here a Graph is a list of nodes together with a list of its connections/links/edges. However, I was not sure what the corresponding data type definition would look like, or how to use a type variable/parameter instead of a concrete type. In this regard, I found the question declare-data-constructor-for-graphs, which suggests representing a graph like this:
type Vertex = Integer
data Graph = Graph [Vertex] [(Vertex, Vertex)]
Which I guess with type parameter instead could be translated to:
data Graph a = Graph [a] [(a, a)]
Is this correct? Would this solution work for creating a simple, undirected graph without multiple edges or loops in Haskell, one that also supports the functions listed above?
In continuation, similar to this representation, I found this question directed-acyclic-graph where a graph is defined as:
data Graph a = Graph [(a,[a])] -- Graph is a list of origins paired with edgeends
Here I guess the author defines a graph as a list of tuples where each tuple consists of one node and a list of its linked nodes.
Another way I found was to use record syntax in the question graph-data-type-representation.
data Node a = Node { value :: a
                   , neighbors :: [Node a]
                   } deriving (Show)
-- or simply,
data Node a = Node a [Node a]
  deriving (Show)
Which I guess follows the same reasoning: a node/vertex has a value and neighbors, which are just a list of other vertices. Building on top of this, a graph definition would be:
data Graph a = Graph [Node a]
Or am I wrong? If so, this implementation is similar to my initial thinking, but differs in the data Node definition. I am not sure which is more correct here.
In summary, I have found many ways to represent a graph data type in Haskell. But I am a bit confused about which way that best suits my use-case, to create a simple, undirected graph without multiple edges or loops that also supports the functions I would like to implement.
Looking forward to answers and comments that clarify this!

Ended up using
data Graph a = Graph [(a, [a])] deriving (Show)
Since algebraic data types are just a way to represent data, it is mostly about choosing a design that fits your needs. Here is a good source to read more about it: Learn You a Haskell
For example, choosing this representation makes it possible to add vertices to a graph in this way:
addVertex :: Eq a => Graph a -> a -> Graph a
addVertex (Graph vList) v
  | not (containsVertex v vList) = Graph (vList ++ makeVertex v)
  | otherwise = Graph vList
makeVertex :: a -> [(a, [a])]
makeVertex x = [(x, [])]
containsVertex :: Eq a => a -> [(a, [a])] -> Bool
containsVertex _ [] = False
containsVertex x ((v, _) : ys) = x == v || containsVertex x ys
Thus, the graph is easy to deal with and manage when expressed in this way.
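The remaining operations from the question (emptiness check, adding edges, neighbors, listing vertices) can be sketched on the same adjacency-list type. This is only a minimal sketch, not the author's code: the names emptyGraph, isEmpty, vertices, neighbours and addEdge are hypothetical, and the choice to silently ignore an edge whose endpoints are missing is an assumption.

```haskell
import Data.Maybe (fromMaybe)

data Graph a = Graph [(a, [a])] deriving (Eq, Show)

emptyGraph :: Graph a
emptyGraph = Graph []

isEmpty :: Graph a -> Bool
isEmpty (Graph adj) = null adj

-- All vertices, in insertion order.
vertices :: Graph a -> [a]
vertices (Graph adj) = map fst adj

-- Neighbours of a vertex; an unknown vertex has none.
neighbours :: Eq a => a -> Graph a -> [a]
neighbours v (Graph adj) = fromMaybe [] (lookup v adj)

-- Same argument order as addVertex in the answer above.
addVertex :: Eq a => Graph a -> a -> Graph a
addVertex g@(Graph adj) v
  | v `elem` vertices g = g
  | otherwise = Graph (adj ++ [(v, [])])

-- Undirected edge: recorded in the neighbour lists of both endpoints.
addEdge :: Eq a => Graph a -> (a, a) -> Graph a
addEdge g@(Graph adj) (u, v)
  | u == v = g                            -- no loops
  | u `notElem` vs || v `notElem` vs = g  -- assumption: both endpoints must already exist
  | v `elem` neighbours u g = g           -- no multiple edges
  | otherwise = Graph (map link adj)
  where
    vs = vertices g
    link (x, ns)
      | x == u = (x, ns ++ [v])
      | x == v = (x, ns ++ [u])
      | otherwise = (x, ns)
```

Because every edge is stored on both endpoints, adding (1,2) and then (2,1) leaves a single edge, which is what keeps the graph simple.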

Related

Graphs in Haskell [duplicate]

I'm struggling to understand how simple graphs are represented in Haskell. My understanding is that a graph is basically a list of vertices and a list of edges (pairs of vertices).
I've been looking at the Data.Graph implementation in order to build a Graph constructor with a type variable "a" for a simple graph, but I don't understand what the constructor should look like and how it will store an Edge and a Vertex.
My initial thinking was to base my constructor on a similar logic as a tree:
type Vertex = Int
type Edge = (Vertex, Vertex)
data Graph a = Void | Vertex a [Graph a]
But I'm not sure how the edges are then represented.
It's tempting to try to represent a graph structurally in Haskell ADTs, but this doesn't really work the way it does with trees, because graphs can contain cycles. It would be possible to represent only a spanning tree, but then the remaining edges need to be represented as addresses into the tree. That is possible but awkward, and if you need direct addressing anyway, what's the point of having the tree structure at all?† That's why the standard way is to flatten the graph completely into an array of vertices and a list (or another array) of edges.
(The vertices can't reasonably be stored in a list because direct access by index would be too slow.)
If you want to add extra data, like you would add data to list nodes, you can just add them to the vertex data.
{-# LANGUAGE DeriveFunctor #-}
import qualified Data.Vector as V
newtype Vertex a = Vertex { getVertexData :: a }
  deriving (Functor, Eq, Show)
type VertexIndex = Int
type Edge = (VertexIndex, VertexIndex)
data Graph a = Graph
  { graphVertices :: V.Vector (Vertex a)
  , graphEdges :: V.Vector Edge
  } deriving (Functor, Show)
†IMO there are actually valid reasons to want a tree structure, including to support laziness. This could be used in some kind of comonadic interface; I dabbled with that once, not sure if somebody has done it properly somewhere.
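For comparison, the Data.Graph module mentioned in the question uses the same flat idea: a graph there is an array indexed by Int vertices whose elements are neighbour lists. A minimal sketch of an undirected graph on top of it follows; the helper names undirected and neighbours are hypothetical, and vertices are assumed to be numbered 0 to n-1.

```haskell
import Data.Array ((!))
import Data.Graph (Graph, Vertex, buildG, vertices)
import Data.List (sort)

-- Build an undirected graph on vertices 0..n-1 by inserting each edge
-- in both directions (Data.Graph itself is directed).
undirected :: Int -> [(Vertex, Vertex)] -> Graph
undirected n es = buildG (0, n - 1) (es ++ [(v, u) | (u, v) <- es])

-- Neighbours by direct array indexing; sorted so the result does not
-- depend on the order in which buildG accumulates edges.
neighbours :: Graph -> Vertex -> [Vertex]
neighbours g v = sort (g ! v)
```

With g = undirected 4 [(0,1),(1,2),(0,2)], neighbours g 0 gives [1,2], neighbours g 3 gives [], and vertices g gives [0,1,2,3].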
Just for fun, here is a simple (and inefficient) implementation of nonempty connected graphs based on a spanning tree.
data TreeAddress = Here
                 | Up TreeAddress
                 | Down Int TreeAddress
data ConnectedGraph a = Vertex
  { vertexContainedData :: a
  , managedNeighbours :: [ConnectedGraph a]
  , unmanagedNeighbours :: [TreeAddress]
  }
To make it a bit less wasteful, TreeAddress could be condensed down into a single Int if we also keep track of the total number of vertices, or at least Integer if we quotient out the number of managed neighbours at each junction.
It would be a fun exercise to write a Comonad instance for this.
Ah, somebody seems to have done this in Scala.
And they use a library that was itself inspired by the Haskell fgl library! I knew somebody had to have done this already. In fact it's quite old.

In Haskell, how to bind one list-like monad to another list-like monad

Say you want to implement very general operations on a directed graph making as few assumptions about the structure as possible.
It is impossible to make absolutely no assumptions, so I am still assuming that I will represent my graph as some sort of adjacency list, but the spirit is to try to be as opaque as possible about the nature of manipulated things.
Assume you have the two following operations: one operation to list all nodes in a graph, and one operation to list all outgoing edges from some vertex.
class List_Nodes graph list vertex where
  list_nodes :: graph -> list vertex
class List_Edges_From graph vertex list edge where
  list_edges_from :: graph -> vertex -> list edge
Then, just for the fun of it, I decided I might want to iterate over all edges:
class List_Edges graph vertex list edge where
  list_edges :: graph -> list edge
No matter what the concrete implementation of a graph will be, I believe I can express very generally that listing edges can be understood as listing nodes, and listing edges from each of them.
So I decided to write an instance as general as possible like this:
instance (
    Monad node_list,
    Monad edge_list,
    List_Nodes graph node_list vertex,
    List_Edges_From graph vertex edge_list edge
  ) => List_Edges graph vertex edge_list edge where
  list_edges graph = (list_nodes graph :: node_list vertex) >>= list_edges_from graph
  -- I added :: node_list vertex to help GHC infer the type.
However, this code does not work as is. It compiles only with the additional constraint edge_list ~ node_list. That's because the bind happens in a single monad, the one being returned: edge_list.
But to be as general as possible I do not want to assume that the way I store nodes, is necessarily the same way I store outgoing edges in a node. For example one might want to use a list to store nodes, and a vector to store edges out of a node.
Question:
How can I express the monadic bind list_nodes graph >>= list_edges_from graph between two possibly different list like containers?
More generally, how can I say convert a list to a vector without being specific about them? I am only assuming they are "list-like" whatever that means. Somehow these list like things are themselves functors, so I'm looking to convert some functor into some other functor. Am I looking for natural transformations of category theory? How can I do this in Haskell?
Language extensions used and imports used:
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE UndecidableInstances #-}
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE ScopedTypeVariables #-}
module Lib () where
import Prelude
import Control.Monad
If you want to be very general about the monad in which your nodes and edges are stored, you can't really do anything. Two monads in general do not compose with each other: what should the return type be if nodes are "stored" as IO String and edges as String -> Maybe String?
I would suggest doing a lot less of this work at the type level. There is little need for type classes: instead, define a concrete type that contains the functions that you need, and a single typeclass for converting to that canonical type. Then the various implementations of your graph type can simply create a "canonical view" of their graph, representing it in the type that you use to implement generic algorithms. This way, you have only one canonical representation to perform these algorithms on, despite having many representations for the graphs themselves.
The graph type can be as simple as
data Graph v e = Graph { nodes :: [v]
                       , edges :: v -> [e]
                       }
class AsGraph t v e where
  asGraph :: t v e -> Graph v e
and you can implement allEdges generically in terms of that quite easily. If you have a graph with vector edges, it can be converted to this generic graph type in order to participate in generic operations like allEdges:
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
import Data.Foldable (toList)
import Data.Vector (Vector) -- needed for the Vector type used below
data Graph v e = Graph { nodes :: [v]
                       , edges :: v -> [e]
                       }
class AsGraph t v e where
  asGraph :: t v e -> Graph v e
data VectorEdges v e = VectorEdges { vs :: [v]
                                   , es :: v -> Vector e
                                   }
instance AsGraph VectorEdges v e where
  asGraph g = Graph (vs g) (toList . es g)
allEdges :: AsGraph t v e => t v e -> [e]
allEdges g = let g' = asGraph g
             in nodes g' >>= edges g'
There does not seem to be anything standard in Haskell for my purpose, so I ended up adding a class specifically for list conversion, which leaves me room to implement it for whatever I believe should be convertible lists.
class Convert_List list1 list2 element where
  convert_list :: list1 element -> list2 element
Then I am free to implement it on my own.
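As an illustration of what such instances might look like, here is a minimal sketch. The two instances are assumptions chosen for the example, not part of the original answer: one flattens any Foldable container into an ordinary list, the other loads a list into a Data.Sequence Seq (from the containers package).

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}

import Data.Foldable (toList)
import qualified Data.Sequence as Seq

class Convert_List list1 list2 element where
  convert_list :: list1 element -> list2 element

-- Any Foldable container can be flattened into an ordinary list.
instance Foldable f => Convert_List f [] element where
  convert_list = toList

-- An ordinary list can be loaded into a Seq.
instance Convert_List [] Seq.Seq element where
  convert_list = Seq.fromList
```

Both type parameters have to be known at the use site, so a call usually needs an annotation on the result, for example convert_list (Seq.fromList "abc") :: String.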
The advantage of having such a class is that you can then write the graph operation like this:
class List_Nodes graph list vertex where
  list_nodes :: graph -> list vertex
class List_Edges_From graph vertex list edge where
  list_edges_from :: graph -> vertex -> list edge
class List_Edges graph vertex list edge where
  list_edges :: graph -> list edge
instance (
    Monad list,
    List_Nodes graph l1 vertex,
    List_Edges_From graph vertex l2 edge,
    Convert_List l1 list vertex,
    Convert_List l2 list edge
  ) => List_Edges graph vertex list edge where
  list_edges graph =
    convert_list (list_nodes graph :: l1 vertex) >>= \u ->
      convert_list (list_edges_from graph u :: l2 edge)
Here you see that I implement list_edges in a very general way, making few assumptions. I'm not even assuming that the returned list type is the same as the graph's internal representation.
This is also why I split each operation into its own class. Although this may seem counterintuitive at first, I believe there is more potential for factorization, as shown here. If I had only one class containing the three operations, I could not implement list_edges alone without enforcing constraints on the other operations as well.
It's only my opinion, but I increasingly believe that this sort of code design has more potential for factoring.

Find a distance in graphs between nodes [closed]

I have a list such as:
[(1,2),(1,4),(2,4),(3,9),(4,7),(7,9)]
I have to implement a function which takes a list of existing relations, a new relation (a pair), and a distance n.
The function should work like this: it computes the distance between the two nodes given in the new relation; if that distance is <= n, the function returns the list including the new relation, otherwise it returns the original list.
For example:
list = [(1,2),(1,4),(2,4),(3,9),(4,7),(7,9)]
new_relation = [(1,3)]
distance_n = 4
It will return [(1,2),(1,3),(1,4),(2,4),(3,9),(4,7),(7,9)]
If distance was 3 it would return the original list
[(1,2),(1,4),(2,4),(3,9),(4,7),(7,9)]
How can I do this? I have problems with graphs.
Note: It should be implemented in Haskell.
Both the containers package and the graphs package have adjacency list representations that are similar to yours.
A Very General Method of Computing Shortest Paths contains a functional implementation of Dijkstra's algorithm for finding graph distances, but it works on an adjacency matrix. Either change the representation or alter the algorithm to work on adjacency lists.
Once you actually have a function distance :: Graph -> Vertex -> Vertex -> Distance, and a function addEdge :: Edge -> Graph -> Graph, you are golden. addEdge should be relatively easy to write independent of representation, but in general adding an edge means you have to throw away any previously cached distance calculations.
As always in Haskell, we start by declaring our types. Here I'm just going to say that a Graph is a list of Edges, and an Edge is a tuple of Nodes, which are just Ints
type Node = Int
type Edge = (Node, Node)
type Graph = [Edge]
Then we can declare our functions' types. First we have a function that solves the specific problem
addNode :: Graph -> Edge -> Int -> Graph
addNode graph newEdge maxDistance = undefined
But we know from the problem statement that we're going to need a helper, namely a function that calculates the distance between two nodes (which can be undefined if the nodes aren't connected). Since this doesn't always have a valid value to return, we'll wrap it in Maybe and return Nothing when the nodes aren't connected
distance :: Graph -> Node -> Node -> Maybe Int
distance graph fromNode toNode = undefined
With this helper function, we can now implement addNode pretty simply
addNode graph newEdge@(fromNode, toNode) maxDistance =
  case distance graph fromNode toNode of
    Nothing -> graph
    Just d ->
      if d <= maxDistance
        then newEdge : graph
        else graph
But it looks like you want to keep the graph sorted, so if you import Data.List you can just toss in sort
addNode graph newEdge@(fromNode, toNode) maxDistance =
  case distance graph fromNode toNode of
    Nothing -> graph
    Just d ->
      if d <= maxDistance
        then sort $ newEdge : graph
        else graph
Now all you have to do is implement distance and you'll be done.
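One possible way to fill in distance is a breadth-first search over the edge list, treating each relation as undirected. This is a minimal sketch under that assumption, not the answerer's implementation; the helper neighbours is hypothetical, and a node is considered to be at distance 0 from itself even if it has no edges.

```haskell
import Data.List (sort)
import qualified Data.Set as Set

type Node = Int
type Edge = (Node, Node)
type Graph = [Edge]

-- Nodes adjacent to n, reading every edge in both directions.
neighbours :: Graph -> Node -> [Node]
neighbours g n = [b | (a, b) <- g, a == n] ++ [a | (a, b) <- g, b == n]

-- Breadth-first search: expand one layer of unseen nodes per step and
-- count the layers until the target shows up in the frontier.
distance :: Graph -> Node -> Node -> Maybe Int
distance g from to = go (Set.singleton from) [from] 0
  where
    go _ [] _ = Nothing
    go seen frontier d
      | to `elem` frontier = Just d
      | otherwise =
          let next = Set.toList (Set.fromList
                       [m | n <- frontier, m <- neighbours g n, m `Set.notMember` seen])
              seen' = foldr Set.insert seen next
          in go seen' next (d + 1)

addNode :: Graph -> Edge -> Int -> Graph
addNode graph newEdge@(a, b) maxDistance =
  case distance graph a b of
    Just d | d <= maxDistance -> sort (newEdge : graph)
    _ -> graph
```

On the list from the question, the shortest path from 1 to 3 is 1, 4, 7, 9, 3, so distance returns Just 4; addNode therefore inserts (1,3) for a limit of 4 and returns the original list for a limit of 3.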

Creating a directed acyclic graph in haskell with lists and sets

I want to create a DAG in Haskell, but since I'm new to the whole functional programming thing I would like some directions.
The graph needs to be built with only lists and sets, and the following functions must be implemented:
v = add_vertex(g,w)
A vertex with the specified weight w is added to the DAG g and its unique vertex identifier v is returned.
add_edge(g,a,b,w)
An edge from the vertex with vertex identifier a to the vertex with vertex identifier b is added to the DAG g with weight w.
What I've done so far is creating a data type which looks like this:
data Graph v w = Graph {vertices :: [(v, w)],
                        edges :: [([(v, w)], [(v, w)], w)]} deriving Show
And I guess I need some form of constructor for the graph, it looks like this:
create_graph :: (v,w) -> w -> Graph v w
create_graph v w = Graph [v] [(v, v, w)]
What I would like to do is to create just an empty graph, but now I need to input some starting values if I understand correctly. How can I fix that?
The add_vertex function looks like this:
add_vertex :: Graph v w -> (v, w) -> Graph v w
add_vertex (Graph v w) x = Graph (v ++ [x]) w
But I don't really know how to return a vertex identifier instead of the whole graph. I guess I should also specify that the identifier needs to be a Char and that the weights can be either floats or ints. Where do I do that?
I would also like to have functions for topological ordering and getting the weight for the longest path. With this in mind, should I define the structure of the graph in a different way?
Thanks
What I would do is define the graph like a tree:
data Graph a = Graph [(a,[a])] -- Graph is a list of origins paired with edge ends
createGraph :: Eq a => [(a,a)] -> Graph a
createGraph = undefined
empty :: Graph a
empty = Graph []
insertVertex :: Eq a => a -> Graph a -> Graph a
insertVertex = undefined -- insert if not already in the Graph (with empty edges)
insertEdge :: Eq a => (a,a) -> Graph a -> Graph a
insertEdge = undefined -- insert edge in list of origin
-- do not forget to add origin and end if they don't exist
Implement these and worry about BFS/topsort later. Also think about the result of a BFS: what do you want as a result? (The result of topsort should be a list, I guess.)
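The stubs above could be filled in roughly as follows. This is a minimal sketch rather than the answerer's solution: it keeps vertices and edge ends in insertion order, ignores weights, and does not check that the result stays acyclic, which a real DAG would need.

```haskell
data Graph a = Graph [(a, [a])] deriving (Eq, Show)

empty :: Graph a
empty = Graph []

-- Insert a vertex with no outgoing edges unless it is already present.
insertVertex :: Eq a => a -> Graph a -> Graph a
insertVertex v g@(Graph adj)
  | any ((== v) . fst) adj = g
  | otherwise = Graph (adj ++ [(v, [])])

-- Insert a directed edge, creating missing endpoints first and
-- skipping the edge if it is already recorded.
insertEdge :: Eq a => (a, a) -> Graph a -> Graph a
insertEdge (from, to) g = Graph (map addTarget adj)
  where
    Graph adj = insertVertex to (insertVertex from g)
    addTarget (v, ts)
      | v == from && to `notElem` ts = (v, ts ++ [to])
      | otherwise = (v, ts)

-- Build a graph from a list of (origin, end) pairs, left to right.
createGraph :: Eq a => [(a, a)] -> Graph a
createGraph = foldl (flip insertEdge) empty
```

For example, createGraph [('a','b'),('a','c'),('b','c')] yields Graph [('a',"bc"),('b',"c"),('c',"")].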

How would you represent a graph (the kind associated with the travelling salesman problem) in Haskell

It's pretty easy to represent a tree in Haskell:
data Tree a = Node (Tree a) a (Tree a) | Leaf a
but that's because it has no need for the concept of an imperative-style "pointer": each Node/Leaf has one, and only one, parent. I guess I could represent a graph as a list of lists of Maybe Ints, creating a table with Nothing for those pairs of nodes without a path between them and Just n for those that have one, but that seems really ugly and unwieldy.
You can use a type like
type Graph a = [Node a]
data Node a = Node a [Node a]
The list of nodes holds the targets of the outgoing (or incoming if you prefer) edges of that node. Since you can build cyclic data structures, this can represent arbitrary (multi-)graphs. The drawback of this kind of graph structure is that it cannot be modified once you have built it. To do traversals, each node probably needs a unique name (which can be included in the a) so you can keep track of which nodes you have visited.
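To illustrate why the unique names matter, here is a minimal sketch of a depth-first traversal over such a cyclic structure. It assumes the node payload itself serves as the unique name; the function reachable and the example nodes are hypothetical.

```haskell
import qualified Data.Set as Set

data Node a = Node a [Node a]

-- Names of all nodes reachable from the start node, in the order they
-- are first visited. The set of seen names stops the traversal from
-- following a cycle forever.
reachable :: Ord a => Node a -> [a]
reachable start = reverse (snd (go (Set.empty, []) start))
  where
    go (seen, acc) (Node name succs)
      | name `Set.member` seen = (seen, acc)
      | otherwise = foldl go (Set.insert name seen, name : acc) succs

-- A small cyclic graph: a -> b, a -> c, b -> c, c -> a.
nodeA, nodeB, nodeC :: Node String
nodeA = Node "a" [nodeB, nodeC]
nodeB = Node "b" [nodeC]
nodeC = Node "c" [nodeA]
```

Here reachable nodeA returns ["a","b","c"]; without the visited set the traversal would not terminate.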
Disclaimer: below is a mostly pointless exercise in "tying the knot" technique. Fgl is the way to go if you want to actually use your graphs. However if you are wondering how it's possible to represent cyclic data structures functionally, read on.
It is pretty easy to represent a graph in Haskell!
-- a directed graph
data Vertex a b = Vertex { vdata :: a, edges :: [Edge a b] }
data Edge a b = Edge { edata :: b, src :: Vertex a b, dst :: Vertex a b }
-- My graph, with vertices labeled with strings, and edges unlabeled
type Myvertex = Vertex String ()
type Myedge = Edge String ()
-- A couple of helpers for brevity
e :: Myvertex -> Myvertex -> Myedge
e = Edge ()
v :: String -> [Myedge] -> Myvertex
v = Vertex
-- This is a full 5-graph
mygraph5 = map vv [ "one", "two", "three", "four", "five" ] where
  vv s = let vk = v s (zipWith e (repeat vk) mygraph5) in vk
This is a cyclic, finite, recursive, purely functional data structure. Not a very efficient or beautiful one, but look, ma, no pointers! Here's an exercise: include incoming edges in the vertex
data Vertex a b = Vertex {vdata::a, outedges::[Edge a b], inedges::[Edge a b]}
It's easy to build a full graph that has two (indistinguishable) copies of each edge:
mygraph5 = map vv [ "one", "two", "three", "four", "five" ] where
  vv s =
    let vks = repeat vk
        vk = v s (zipWith e vks mygraph5)
                 (zipWith e mygraph5 vks)
    in vk
but try to build one that has one copy of each! (Imagine that there's some expensive computation involved in e v1 v2).
The knot-tying techniques that others have outlined can work, but are a bit of a pain, especially when you're trying to construct the graph on the fly. I think the approach you describe is a bit more practical. I would use an array/vector of node types where each node type holds a list/array/vector of neighbors (in addition to any other data you need) represented as ints of the appropriate size, where the int is an index into the node array. I probably wouldn't use Maybe Ints. With Int you can still use -1 or any suitable value as your uninitialized default. Once you have populated all your neighbor lists and know they are good values you won't need the failure machinery provided by Maybe anyway, which as you observed imposes overhead and inconvenience. But your pattern of using Maybe would be the correct thing to do if you needed to make complete use of all possible values the node pointer type could contain.
The simplest way is to give the vertices in the graph unique names (which could be as simple as Ints) and use either the usual adjacency matrix or neighbor list approaches, i.e., if the names are Ints, either use array (Int,Int) Bool, or array Int [Int].
Have a look at this knot-tying technique, which is used to create circular structures. You may need it if your graph contains cycles.
Also, you can represent your graph using the adjacency matrix.
Or you can keep maps between each node and the inbound and outbound edges.
In fact, each of them is useful in one context and a pain in others. Depending on your problem, you'll have to choose.
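As one concrete instance of the map-based option, here is a minimal sketch that keeps one map for outbound and one for inbound edges, using Data.Map and Data.Set from the containers package. All names in it (Graph, emptyGraph, addEdge, successors, predecessors) are hypothetical.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

data Graph a = Graph
  { outbound :: Map.Map a (Set.Set a)  -- node -> nodes it points to
  , inbound  :: Map.Map a (Set.Set a)  -- node -> nodes pointing to it
  } deriving (Eq, Show)

emptyGraph :: Graph a
emptyGraph = Graph Map.empty Map.empty

-- Add a directed edge, making sure both endpoints have an entry in
-- both maps so that isolated ends are still known vertices.
addEdge :: Ord a => a -> a -> Graph a -> Graph a
addEdge from to (Graph out inn) = Graph
  (Map.insertWith Set.union from (Set.singleton to) (touch to out))
  (Map.insertWith Set.union to (Set.singleton from) (touch from inn))
  where
    touch v = Map.insertWith Set.union v Set.empty

successors, predecessors :: Ord a => a -> Graph a -> [a]
successors v = maybe [] Set.toList . Map.lookup v . outbound
predecessors v = maybe [] Set.toList . Map.lookup v . inbound
```

Unlike the knot-tied structure, this one can be updated after construction, at the cost of a logarithmic lookup per access.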
