How would you represent a graph (the kind associated with the travelling salesman problem) in Haskell

It's pretty easy to represent a tree in Haskell:
data Tree a = Node (Tree a) a (Tree a) | Leaf a
but that's because a tree has no need for imperative-style "pointers": each Node/Leaf has one, and only one, parent. For a graph, I guess I could use a list of lists of Maybe Int to create a table, with Nothing for pairs of nodes that have no path between them and Just n for those that do, but that seems really ugly and unwieldy.

You can use a type like
type Graph a = [Node a]
data Node a = Node a [Node a]
The list of nodes holds the outgoing (or incoming, if you prefer) edges of that node. Since you can build cyclic data structures, this can represent arbitrary (multi-)graphs. The drawback of this kind of graph structure is that it cannot be modified once you have built it. To do traversals, each node probably needs a unique name (which can be included in the a) so you can keep track of which nodes you have visited.
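As a minimal sketch of the paragraph above, assuming the node label itself serves as the unique name: the names triangle and reachable are made up for this illustration, and the list of visited labels is what keeps the traversal from looping forever on the cycle.

```haskell
import Data.List (foldl')

-- Same shape as the Node type above; the label doubles as the unique name.
data Node a = Node a [Node a]

-- A directed 3-cycle a -> b -> c -> a, built by letting the
-- bindings refer to each other (no mutation involved).
triangle :: [Node Char]
triangle = [a, b, c]
  where
    a = Node 'a' [b]
    b = Node 'b' [c]
    c = Node 'c' [a]

-- Labels reachable from a node, in depth-first order.
-- A node whose label was already seen is not expanded again.
reachable :: Eq a => Node a -> [a]
reachable start = reverse (go [] start)
  where
    go seen (Node x outs)
      | x `elem` seen = seen
      | otherwise     = foldl' go (x : seen) outs
```

Here reachable (head triangle) yields "abc". Note that nothing can be added to triangle afterwards; a changed graph has to be rebuilt from scratch.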

Disclaimer: below is a mostly pointless exercise in "tying the knot" technique. Fgl is the way to go if you want to actually use your graphs. However if you are wondering how it's possible to represent cyclic data structures functionally, read on.
It is pretty easy to represent a graph in Haskell!
-- a directed graph
data Vertex a b = Vertex { vdata :: a, edges :: [Edge a b] }
data Edge a b = Edge { edata :: b, src :: Vertex a b, dst :: Vertex a b }
-- My graph, with vertices labeled with strings, and edges unlabeled
type Myvertex = Vertex String ()
type Myedge = Edge String ()
-- A couple of helpers for brevity
e :: Myvertex -> Myvertex -> Myedge
e = Edge ()
v :: String -> [Myedge] -> Myvertex
v = Vertex
-- This is a full 5-graph
mygraph5 = map vv [ "one", "two", "three", "four", "five" ] where
    vv s = let vk = v s (zipWith e (repeat vk) mygraph5) in vk
This is a cyclic, finite, recursive, purely functional data structure. Not a very efficient or beautiful one, but look, ma, no pointers! Here's an exercise: include incoming edges in the vertex
data Vertex a b = Vertex {vdata::a, outedges::[Edge a b], inedges::[Edge a b]}
It's easy to build a full graph that has two (indistinguishable) copies of each edge:
mygraph5 = map vv [ "one", "two", "three", "four", "five" ] where
    vv s =
        let vks = repeat vk
            vk = v s (zipWith e vks mygraph5)
                     (zipWith e mygraph5 vks)
        in vk
but try to build one that has one copy of each! (Imagine that there's some expensive computation involved in e v1 v2).

The knot-tying techniques that others have outlined can work, but are a bit of a pain, especially when you're trying to construct the graph on the fly. I think the approach you describe is a bit more practical. I would use an array/vector of node types where each node type holds a list/array/vector of neighbors (in addition to any other data you need) represented as ints of the appropriate size, where the int is an index into the node array. I probably wouldn't use Maybe Ints. With Int you can still use -1 or any suitable value as your uninitialized default. Once you have populated all your neighbor lists and know they are good values you won't need the failure machinery provided by Maybe anyway, which as you observed imposes overhead and inconvenience. But your pattern of using Maybe would be the correct thing to do if you needed to make complete use of all possible values the node pointer type could contain.

The simplest way is to give the vertices in the graph unique names (which could be as simple as Ints) and use either the usual adjacency matrix or neighbor list approaches, i.e., if the names are Ints, use either Array (Int,Int) Bool or Array Int [Int].
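A minimal sketch of the neighbor-list variant, assuming vertices are named 0..n-1 and using Data.Array from the array package. The helper names mkGraph, neighbours and hasEdge are invented for this example.

```haskell
import Data.Array (Array, accumArray, (!))

-- Vertices are named 0..n-1; each slot holds that vertex's neighbour list.
type Graph = Array Int [Int]

-- Build an undirected graph from an edge list by recording
-- every edge in both directions.
mkGraph :: Int -> [(Int, Int)] -> Graph
mkGraph n es =
  accumArray (flip (:)) [] (0, n - 1)
    [ p | (u, v) <- es, p <- [(u, v), (v, u)] ]

neighbours :: Graph -> Int -> [Int]
neighbours g v = g ! v

-- The question an adjacency matrix answers directly,
-- answered here by scanning one neighbour list.
hasEdge :: Graph -> Int -> Int -> Bool
hasEdge g u v = v `elem` neighbours g u
```

The matrix form trades memory for constant-time edge lookup; the list form is usually the better fit for sparse graphs.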

Have a look at this knot-tying technique, it is used to create circular structures. You may need it if your graph contains cycles.
Also, you can represent your graph using the adjacency matrix.
Or you can keep maps between each node and the inbound and outbound edges.
In fact, each of them is useful in one context and a pain in others. Depending on your problem, you'll have to choose.

Related

Data type for a simple, undirected graph without multiple edges or loops in Haskell

I am new to Haskell and I am trying to come up with a suitable way to represent a graph. First some background for an undirected simple graph. For all vertices u and v, an edge between u and v is the same as an edge between v and u, there is at most one edge between u and v, and there is no edge between u and u.
Later on I want to be able to write functions to 1) check if the graph is empty, 2) add new vertices, 3) add new edges, 4) obtain all neighbors of a vertex and 5) obtain a list of all vertices in a graph.
After doing research, I am a bit confused by all the ways to define a graph data type, which I hope to get some help clarifying. All seem to agree that you need some way to represent a Vertex/Node and the edges/links to other Vertices/Nodes. However, the implementations differ.
Previously I defined a tree with an arbitrary number of branches, following the question tree-with-an-arbitrary-number-of-branches:
data Tree a = Node a [Tree a]
    deriving (Eq, Show)
What's different with a graph, I guess, is that nodes on the same "level" and nodes on different "branches" can be connected with an edge, see figure below.
What I came up with first was defining a data type using recursive data structures with a type variable where each Vertex/Node has a list with its associated nodes:
data Node a = Node a [a]
    deriving Show
data Graph a = Graph [Node a]
    deriving Show
However, what I am a bit unsure about is whether this representation makes it possible to insert new edges later on. With this definition a graph is just a list of nodes, each of which in turn contains a list of the nodes it is linked to.
After doing research about how to represent a graph in Haskell I found some interesting ideas. The first was to define a graph just using type synonyms:
type Index = Int
type Element = String
type Edge = (Index, Index)
type Node = (Index, Element)
type Graph = ([Node], [Edge])
Here we have that a Graph is a list of nodes with an associated list of its connections/links/edges. However, I was not sure what the corresponding data type definition would look like and with a type variable/parameter instead of a concrete type. In this regard, I found this question declare-data-constructor-for-graphs suggesting to represent a graph like this:
type Vertex = Integer
data Graph = Graph [Vertex] [(Vertex, Vertex)]
Which I guess with type parameter instead could be translated to:
data Graph a = Graph [a] [(a, a)]
Is this correct? Would this solution work for creating a simple, undirected graph without multiple edges or loops in Haskell, one that also supports the functions specified above?
In continuation, similar to this representation, I found this question directed-acyclic-graph where a graph is defined as:
data Graph a = Graph [(a,[a])] -- Graph is a list of origins paired with edgeends
Here I guess the author defines a graph as a list of tuples where each tuple consists of one node and a list of its linked nodes.
Another way I found was to use record syntax in the question graph-data-type-representation.
data Node a = Node { value :: a
                   , neighbors :: [Node a]
                   } deriving (Show)
-- or simply,
data Node a = Node a [Node a]
    deriving (Show)
Which I guess is the same reasoning. A Node/Vertex has a value, and neighbors that are just a list of other Vertices. Building on top of this, a graph definition would be:
data Graph a = Graph [Node a]
Or am I wrong? If that is right, this implementation is similar to my initial thinking, but differs in the definition of data Node. I'm not sure which is more correct here.
In summary, I have found many ways to represent a graph data type in Haskell. But I am a bit confused about which way best suits my use case: creating a simple, undirected graph without multiple edges or loops that also supports the functions I would like to implement.
Looking forward to answers and comments to clarify this!
I ended up using
data Graph a = Graph [(a, [a])] deriving (Show)
Since algebraic data types are just a way to represent data, it is mostly about choosing a design that fits your needs. Here is a good source to read more about it: Learn You a Haskell
For example, choosing this data type representation makes adding vertices to a graph possible in this way.
addVertex :: Eq a => Graph a -> a -> Graph a
addVertex (Graph vList) v
    | not (containsVertex v vList) = Graph (vList ++ makeVertex v)
    | otherwise = Graph vList
makeVertex :: a -> [(a, [a])]
makeVertex x = [(x, [])]
containsVertex :: Eq a => a -> [(a, [a])] -> Bool
containsVertex _ [] = False
containsVertex x ((v, _) : ys) = x == v || containsVertex x ys
Expressed this way, the graph is easy to work with and manage.
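The remaining operations from the question can be sketched on the same representation. This is one possible reading, not the poster's code: the names emptyGraph, isEmpty, vertices, neighbors and addEdge are assumptions, and addEdge silently ignores loops, duplicate edges and unknown endpoints so the graph stays simple.

```haskell
-- Same representation as above: each vertex paired with its neighbours.
data Graph a = Graph [(a, [a])] deriving (Show)

emptyGraph :: Graph a
emptyGraph = Graph []

isEmpty :: Graph a -> Bool
isEmpty (Graph vs) = null vs

vertices :: Graph a -> [a]
vertices (Graph vs) = map fst vs

-- Neighbours of a vertex; an unknown vertex has none.
neighbors :: Eq a => Graph a -> a -> [a]
neighbors (Graph vs) v = maybe [] id (lookup v vs)

-- Add an undirected edge between two existing, distinct vertices.
-- The edge is stored in both neighbour lists; loops, duplicates and
-- unknown endpoints leave the graph unchanged.
addEdge :: Eq a => Graph a -> a -> a -> Graph a
addEdge g@(Graph vs) u v
  | u == v                     = g
  | u `notElem` vertices g     = g
  | v `notElem` vertices g     = g
  | v `elem` neighbors g u     = g
  | otherwise                  = Graph (map link vs)
  where
    link (x, ns)
      | x == u    = (x, v : ns)
      | x == v    = (x, u : ns)
      | otherwise = (x, ns)
```

Every operation here is linear in the number of vertices, which is fine for small graphs; a Data.Map keyed by vertex would be the usual next step if that becomes a problem.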

Graphs in Haskell [duplicate]

I'm struggling to understand how simple graphs are represented in Haskell. My understanding is that a graph would basically be a list of vertices and edges (pairs of vertices).
I've been looking at the Data.Graph implementation in order to build a Graph constructor with a type variable "a" for a simple graph, but I don't understand what the constructor should look like and how it will store an Edge and a Vertex.
My initial thinking was to base my constructor on a similar logic as a tree:
type Vertex = Int
type Edge = (Vertex, Vertex)
data Graph a = Void | Vertex a [Graph a]
But I'm not sure how the edges are then represented.
It's tempting to try to represent a graph structurally in Haskell ADTs, but this doesn't really work the way it does with trees because of the presence of loops. It would be possible to represent only a spanning tree, but then the remaining edges need to be represented as addresses into the tree. That is possible but awkward, and if you need direct addressing anyway, what's the point of having the tree structure at all?† That's why the standard way is instead to flatten it completely: an array of vertices and a list (or another array) of edges.
(The vertices can't reasonably be stored in a list because direct access would be too slow.)
If you want to add extra data, like you would add data to list nodes, you can just add it to the vertex data.
{-# LANGUAGE DeriveFunctor #-}
import qualified Data.Vector as V
newtype Vertex a = Vertex { getVertexData :: a }
    deriving (Functor, Eq, Show)
type VertexIndex = Int
type Edge = (VertexIndex, VertexIndex)
data Graph a = Graph
    { graphVertices :: V.Vector (Vertex a)
    , graphEdges :: V.Vector Edge
    } deriving (Functor, Show)
†IMO there are actually valid reasons to want a tree structure, including to support laziness. This could be used in some kind of comonadic interface; I dabbled with that once, not sure if somebody has done it properly somewhere.
Just for fun, here is a simple (and inefficient) implementation of nonempty connected graphs based on a spanning tree.
data TreeAddress = Here
                 | Up TreeAddress
                 | Down Int TreeAddress
data ConnectedGraph a = Vertex
    { vertexContainedData :: a
    , managedNeighbours :: [ConnectedGraph a]
    , unmanagedNeighbours :: [TreeAddress]
    }
To make it a bit less wasteful, TreeAddress could be condensed down into a single Int if we also keep track of the total number of vertices, or at least Integer if we quotient out the number of managed neighbours at each junction.
It would be a fun exercise to write a Comonad instance for this.
Ah, somebody seems to have done this in Scala.
And they use a library that was itself inspired by the Haskell fgl library! I knew somebody had to have done this already. In fact it's quite old.

Haskell: Assigning unique char to matrix values if x > 0

So my goal is for the program to receive an Int matrix as input and convert every number > 0 to a unique sequential char, while 0s are converted into '_' (it doesn't matter which, just any character not in the sequence).
e.g.
main> matrixGroupings [[0,2,1],[2,2,0],[0,0,2]]
["_ab","cd_","__e"]
The best I've been able to achieve is
["_aa","aa_","__a"]
using:
matrixGroupings xss = map (map (\x -> if x > 0 then 'a' else '_')) xss
As far as I can tell, the issue I'm having is getting the program to remember what its last value was, so that when the value check is > 0, it picks the next char in line. I can't for the life of me figure out how to do this though.
Any help would be appreciated.
Your problem is an instance of an ancient art: labelling of various structures with a stream of
labels. It dates back at least to Chris Okasaki, and my favourite treatment is by Jeremy
Gibbons.
As you can see from these two examples, there is some variety to the way a structure may be
labelled. But in this present case, I suppose the most straightforward way will do. And in Haskell
it would be really short. Let us dive in.
The recipe is this:
1. Define a polymorphic type for your matrices. It must be such that a matrix of numbers and a
   matrix of characters are both rightful members.
2. Provide an instance of the Traversable class. It may in many cases be derived automagically.
3. Pick a monad to your liking. One simple choice is State. (Actually, that is the only choice I
   can think of.)
4. Create an action in this monad that takes a number to a character.
5. Traverse a matrix with this action.
Let's cook!
A type may be as simple as this:
newtype Matrix a = Matrix [[a]] deriving Show
It is entirely possible that the inner lists will be of unequal length — this type does not
protect us from making a "ragged" matrix. This is poor design. But I am going to skim over
it for now. Haskell provides an endless depth for perfection. This type is good enough for
our needs here.
We can immediately define an example of a matrix:
example :: Matrix Int
example = Matrix [[0,2,1],[2,2,0],[0,0,2]]
How hard is it to define a Traversable? 0 hard.
{-# language DeriveTraversable #-}
...
newtype Matrix a = Matrix [[a]] deriving (Show, Functor, Foldable, Traversable)
Presto.
Where do we get labels from? It is a side effect. The function reaches somewhere, takes a
stream of labels, takes the head, and puts the tail back in the extra-dimensional pocket. A
monad that can do this is State.
It works like this:
import Control.Monad.State
label :: Int -> State String Char
label 0 = return '_'
label x = do
    ls <- get
    case ls of
        [ ] -> error "No more labels!"
        (l: ls') -> do
            put ls'
            return l
I hope the code explains itself. When a function "creates" a monadic value, we call it
"effectful", or an "action" in a given monad. For instance, print is an action that,
well, prints stuff. Which is an effect. label is also an action, though in a different
monad. Compare and see for yourself.
Now we are ready to cook a solution:
matrixGroupings m = evalState (traverse label m) ['a'..'z']
This is it.
λ matrixGroupings example
Matrix ["_ab","cd_","__e"]
Bon appetit!
P.S. I took all glory from you, it is unfair. To make things fun again, I challenge you for an exercise: can you define a Traversable instance that labels a matrix in another order — by columns first, then rows?
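For comparison, the same row-by-row labelling can be sketched without a monad, using mapAccumL from Data.Traversable in base. This is a different technique from the State solution above, shown on plain nested lists; the name labelCells is made up for the example.

```haskell
import Data.Traversable (mapAccumL)

-- Thread the supply of unused labels through every cell,
-- left to right within a row, rows from top to bottom.
labelCells :: [[Int]] -> [String]
labelCells = snd . mapAccumL (mapAccumL step) ['a' .. 'z']
  where
    -- A zero consumes no label.
    step labels 0 = (labels, '_')
    -- Any other number takes the next unused label.
    step (l : rest) _ = (rest, l)
    step [] _ = error "No more labels!"
```

With the input from the question, labelCells [[0,2,1],[2,2,0],[0,0,2]] gives ["_ab","cd_","__e"]. The accumulator plays the role of the State: the inner mapAccumL hands its leftover labels to the next row.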

Apply function to all pairs efficiently

I need a second order function pairApply that applies a binary function f to all unique pairs of a list-like structure and then combines them somehow. An example / sketch:
pairApply (+) f [a, b, c] = f a b + f a c + f b c
Some research leads me to believe that Data.Vector.Unboxed will probably have good performance (I will also need fast access to specific elements); it is also required by Statistics.Sample, which would come in handy further down the line.
With this in mind I have the following, which almost compiles:
import qualified Data.Vector.Unboxed as U
pairElement :: (U.Unbox a, U.Unbox b)
            => (U.Vector a)
            -> (a -> a -> b)
            -> Int
            -> a
            -> (U.Vector b)
pairElement v f idx el =
    U.map (f el) $ U.drop (idx + 1) v
pairUp :: (U.Unbox a, U.Unbox b)
       => (a -> a -> b)
       -> (U.Vector a)
       -> (U.Vector (U.Vector b))
pairUp f v = U.imap (pairElement v f) v
pairApply :: (U.Unbox a, U.Unbox b)
          => (b -> b -> b)
          -> b
          -> (a -> a -> b)
          -> (U.Vector a)
          -> b
pairApply combine neutral f v =
    folder $ U.map folder (pairUp f v) where
    folder = U.foldl combine neutral
The reason this doesn't compile is that there is no Unbox instance for U.Vector (U.Vector a). I have been able to create new unboxed instances in other cases using Data.Vector.Unboxed.Deriving, but I'm not sure it would be so easy in this case (transform it into a tuple where the first element is all the inner vectors concatenated and the second is the lengths of the vectors, to know how to unpack?)
My question can be stated in two parts:
Does the above implementation make sense at all or is there some quick library function magic etc that could do it much easier?
If so, is there a better way to make an unboxed vector of vectors than the one sketched above?
Note that I'm aware that foldl is probably not the best choice; once I've got the implementation sorted I plan to benchmark with a few different folds.
There is no way to define a classical instance for Unbox (U.Vector b), because that would require preallocating a memory area in which each element (i.e. each subvector!) has the same fixed amount of space. But in general, each of them may be arbitrarily big, so that's not feasible at all.
It might in principle be possible to define that instance by storing only a flattened form of the nested vector plus an extra array of indices (where each subvector starts). I once briefly gave this a try; it actually seems somewhat promising as far as immutable vectors are concerned, but a G.Vector instance also requires a mutable implementation, and that's hopeless for such an approach (because any mutation that changes the number of elements in one subvector would require shifting everything behind it).
Usually, it's just not worth it, because if the individual element vectors aren't very small the overhead of boxing them won't matter, i.e. often it makes sense to use B.Vector (U.Vector b).
For your application, however, I would not do that at all: there's no need to ever wrap the upper element-choices in a single triangular array. (And it would be really bad for performance to do that, because it makes the algorithm take O(n²) memory rather than the O(n) which is all that's needed.)
I would just do the following:
pairApply combine neutral f v
    = U.ifoldl' (\acc i p -> U.foldl' (\acc' q -> combine acc' $ f p q)
                                      acc
                                      (U.drop (i+1) v) )
                neutral v
This corresponds pretty much to the obvious nested-loops imperative implementation
pairApply(combine, b, f, v):
    for(i in 0..length(v)-1):
        for(j in i+1..length(v)-1):
            b = combine(b, f(v[i], v[j]));
    return b;
My answer is basically the same as leftaroundabout's nested-loops imperative implementation:
import Data.List (foldl')
import Data.Vector (Vector, (!))
pairApply :: (Int -> Int -> Int) -> Vector Int -> Int
pairApply f v = foldl' (+) 0 [f (v ! i) (v ! j) | i <- [0..(n-1)], j <- [(i+1)..(n-1)]]
    where n = length v
As far as I can tell, there is no performance issue with this implementation.
Non-polymorphic for simplicity.
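The pair enumeration that both answers rely on can also be sketched on plain lists, which needs nothing outside base. This is not the unboxed-vector version and will be slower on large inputs; it may still be handy as a reference to check the vector code against. The names pairs and pairApplyList are invented for this example.

```haskell
import Data.List (foldl', tails)

-- Every unordered pair of distinct positions, each produced exactly once:
-- the head of each suffix is paired with everything after it.
pairs :: [a] -> [(a, a)]
pairs xs = [ (x, y) | (x : ys) <- tails xs, y <- ys ]

-- Apply f to every pair and fold the results with combine,
-- starting from neutral.
pairApplyList :: (b -> b -> b) -> b -> (a -> a -> b) -> [a] -> b
pairApplyList combine neutral f =
  foldl' combine neutral . map (uncurry f) . pairs
```

For example, pairApplyList (+) 0 (*) [1,2,3] computes 1*2 + 1*3 + 2*3 = 11, matching the sketch f a b + f a c + f b c from the question.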

What datatype to choose for a dungeon map

As part of a coding challenge I have to implement a dungeon map.
I have already designed it using Data.Map, because printing the map was not required and sometimes I had to update a map tile, e.g. when an obstacle was destroyed.
type Dungeon = Map Pos Tile
type Pos = (Int,Int) -- cartesian coordinates
data Tile = Wall | Destroyable | ...
But what if I had to print it too? Then I would have to use something like
elaboratePrint . sort $ toList dungeon, where elaboratePrint takes care of the linebreaks and makes nice unicode symbols from the tileset.
Another choice I considered would be a nested list
type Dungeon = [[Tile]]
This would have the disadvantage, that it is hard to update a single element in such a data structure. But printing then would be a simple one liner unlines . map show.
Another structure I considered was Array, but I am not used to arrays. From a short glance at the Hackage docs, I only found a map function that operates on indexes and one that operates on elements, so unless one is willing to work with mutable arrays, updating a single element does not look easy. It is also not clear to me how to print an array quickly and easily.
So now my question: is there a better data structure for representing a dungeon map that has the properties of easy printing and easy updating of single elements?
How about an Array? Haskell has real, 2-d arrays.
import Data.Array.IArray -- Immutable Arrays
Now an Array is indexed by any Ix a => a. And luckily, there is an instance (Ix a, Ix b) => Ix (a, b). So we can have
type Dungeon = Array (Integer, Integer) Tile
Now you construct one of these with any of several functions, the simplest to use being
array :: Ix i => (i, i) -> [(i, a)] -> Array i a
So for you,
startDungeon = array ( (0, 0), (100, 100) )
                     [ ( (x, y), Empty ) | x <- [0..100], y <- [0..100]]
And just replace 100 and Empty with the appropriate values.
If speed becomes a concern, then it's a simple fix to use MArray and ST. I'd suggest not switching unless speed is actually a real concern here.
To address the pretty printing:
import Data.List
import Data.Function
pretty :: Array (Integer, Integer) Tile -> String
pretty = unlines . map (show . map snd) . groupBy ((==) `on` (fst . fst)) . assocs
Here assocs lists the cells in index order, so grouping on the first coordinate yields one row per x. And show can be replaced with however you want to format a [Tile] into a row. If you decide that you really want these to be printed in an awesome and efficient manner (Console game maybe) you should look at a proper pretty printing library, like this one.
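Putting the pieces together, here is a small self-contained sketch of an immutable-array dungeon with a single-tile update and a printer. The Tile constructors beyond those named in the question, the glyph characters, and the helper names blank, setTile and render are all assumptions made for illustration.

```haskell
import Data.Array (Array, listArray, bounds, (!), (//))

data Tile = Wall | Destroyable | Empty deriving (Eq, Show)

type Pos = (Int, Int)           -- (x, y)
type Dungeon = Array Pos Tile

-- Hypothetical one-character rendering of each tile.
glyph :: Tile -> Char
glyph Wall        = '#'
glyph Destroyable = '+'
glyph Empty       = '.'

-- A w-by-h dungeon filled with a single tile.
blank :: Int -> Int -> Tile -> Dungeon
blank w h t = listArray ((0, 0), (w - 1, h - 1)) (replicate (w * h) t)

-- Pure update of one tile; (//) copies the array, so each call is O(n).
setTile :: Pos -> Tile -> Dungeon -> Dungeon
setTile p t d = d // [(p, t)]

-- One output line per y, reading x from left to right.
render :: Dungeon -> String
render d =
  unlines [ [ glyph (d ! (x, y)) | x <- [x0 .. x1] ] | y <- [y0 .. y1] ]
  where
    ((x0, y0), (x1, y1)) = bounds d
```

For instance, putStr (render (setTile (1, 0) Wall (blank 2 2 Empty))) prints ".#" on the first line and ".." on the second. If many tiles change per frame, batching them into one (//) call avoids repeated copying.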
First — tree-likes such as Data.Map and lists remain the natural data structures for functional languages. Map is a bit of an overkill structure-wise if you only need rectangular maps, but [[Tile]] may actually be pretty fine. It has O(√n) for both random-access and updates, that's not too bad.
In particular, it's better than pure-functional updates of a 2D array (O(n))! So if you need really good performance, there's no way around using mutable arrays. Which isn't necessarily bad though, after all a game is intrinsically concerned with IO and state. What is good about Data.Array, as noted by jozefg, is the ability to use tuples as Ix indexes, so I would go with MArray.
Printing is easy with arrays. You probably only need rectangular parts of the whole map, so I'd just extract such slices with a simple list comprehension
[ [ arrayMap ! (x,y) | x<-[21..38] ] | y<-[37..47] ]
You already know how to print lists.

Resources