Haskell - read in graph specification from a file

I'm having a very hard time trying to figure out how to read in (and also how to represent) a graph in Haskell.
The input from the file will look something like
NODES 3
EDGE 1 2
EDGE 1 3
EDGE 2 3
I have figured out how to grab the individual lines of input from the file using:
loadFile :: String -> IO [[String]]
loadFile filename = do
    contents <- readFile filename
    return $ map words $ lines contents
That gives output like:
loadFile "input.txt"
[["NODES","3"],["EDGE","1","2"],["EDGE","1","3"],["EDGE","2","3"]]
What I really need to figure out is how to represent this graph data as a graph, though.
I was thinking of setting it up as a list of edges:
type Edge = (Int,Int)
type Graph = [Edge]
But then I'm not sure how I would even begin to implement the functions I need such as addNode, addEdge, getNodes, getEdges.
Any help or pointing me in the right direction would be awesome! Note: I can't use any already developed graph modules for this.
So, for the tl;dr version:
Am I reading in the data the best way?
How should I represent this data in Haskell?
If I use the data structures I outlined above, how would I go about implementing one of those functions?

There are a lot of interesting concerns going on here. Let me attack them all.
You're reading in the data just fine for a line-oriented format. Later on you'll see Data.ByteString and Data.Text replace String for efficiency. You'll also see Parsec for parsing. Those can wait, though. Revisit them in time.
Your graph representation is fine. Strictly speaking it's an edge list rather than an adjacency list, but it's a common and useful representation, and I'll keep calling it an adjacency list below.
Now, the real trick you have is here. Let's take a look at addNode and addEdge. Each is a somewhat challenging function to produce in a pure functional language because they want to modify a graph... but we don't have state.
The most important way to modify-without-state is to return a new value instead of changing the old one: take the old graph and produce a modified copy. The kind of function you're looking for is thus
addNode :: Node -> Graph -> Graph
where the returned Graph is identical to the input Graph except with one more node (Node is just Int here). You should note immediately that there's something wrong here---adjacency lists assume that there are no orphan nodes. We can't add just a single node to the graph.
There are two solutions. One, we could "link" the node into the graph (which is really addEdge in disguise), or two, we could extend the graph representation to include orphan nodes. Let's do (2).
data Graph = Graph [Edge] [Int] -- orphans
Now let's implement adding an edge. Assuming duplicate edges are allowed, adding an edge to the adjacency list is easy: just cons it onto the front
addEdge0 :: Edge -> Graph -> Graph
addEdge0 e (Graph adj orph) = Graph (e:adj) orph
but that's not good enough---we want our orphan list to only include truly orphaned nodes. We'll filter it.
addEdge :: Edge -> Graph -> Graph
addEdge (n1,n2) (Graph adj orph) =
    Graph ((n1,n2):adj) (filter (/=n1) . filter (/=n2) $ orph)
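With the orphan list in place, addNode becomes possible as well. Here is a minimal sketch, assuming nodes are plain Ints and reusing the Graph shape from above; hasNode is a hypothetical helper introduced only for this example, and treating an already-present node as a no-op is one possible design choice, not the only one.

```haskell
type Edge = (Int, Int)

-- Edge list plus the nodes that have no edges yet (orphans).
data Graph = Graph [Edge] [Int]
  deriving (Show, Eq)

-- Hypothetical helper: is the node mentioned anywhere in the graph?
hasNode :: Int -> Graph -> Bool
hasNode n (Graph adj orph) =
  n `elem` orph || any (\(a, b) -> a == n || b == n) adj

-- A node that is already present leaves the graph unchanged;
-- a genuinely new node has no edges yet, so it starts out as an orphan.
addNode :: Int -> Graph -> Graph
addNode n g@(Graph adj orph)
  | hasNode n g = g
  | otherwise   = Graph adj (n : orph)
```

Checking hasNode first keeps the orphan list honest: a node that already appears in some edge never ends up listed as an orphan.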
getEdges is trivial since we're already storing the list of edges
getEdges :: Graph -> [Edge]
getEdges (Graph edges _) = edges
getNodes just needs to append all of our nodes from the adjacency list to the orphan list. We use Data.List.nub to keep only the unique nodes.
import Data.List (nub)
getNodes :: Graph -> [Int]
getNodes (Graph adj orph) = nub (orph ++ adjNodes adj) where
    adjNodes [] = []
    adjNodes ((n1,n2):rest) = n1 : n2 : adjNodes rest
Hopefully these give you some indication of how to think in a functional language. You'll have to dig into them a little bit to see how they work, but I've introduced a large number of interesting concepts here.
Next steps here might include trying to use the State monad to recapture imperative state modification and to chain these Graph-modifying functions together.
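Before reaching for State, a plain fold is already enough to chain these graph-modifying functions over the parsed input. The sketch below assumes the [[String]] shape that loadFile produces in the question and interprets "NODES k" as declaring nodes 1..k, which is a guess about the file format. The names emptyGraph, addOrphan, applyLine and fromLines are hypothetical.

```haskell
import Data.List (foldl')

type Edge = (Int, Int)

data Graph = Graph [Edge] [Int]   -- edges, orphan nodes
  deriving (Show, Eq)

emptyGraph :: Graph
emptyGraph = Graph [] []

addEdge :: Edge -> Graph -> Graph
addEdge (n1, n2) (Graph adj orph) =
  Graph ((n1, n2) : adj) (filter (\n -> n /= n1 && n /= n2) orph)

-- Record a node as an orphan unless the graph already knows about it.
addOrphan :: Int -> Graph -> Graph
addOrphan n g@(Graph adj orph)
  | n `elem` orph || any (\(a, b) -> a == n || b == n) adj = g
  | otherwise = Graph adj (n : orph)

-- Apply one parsed line to the graph built so far.
-- Assumption: "NODES k" declares the nodes 1..k.
-- Note: read fails at runtime on non-numeric fields; a real parser should validate.
applyLine :: Graph -> [String] -> Graph
applyLine g ["NODES", k]   = foldl' (flip addOrphan) g [1 .. read k]
applyLine g ["EDGE", a, b] = addEdge (read a, read b) g
applyLine g _              = g   -- ignore lines we do not understand

fromLines :: [[String]] -> Graph
fromLines = foldl' applyLine emptyGraph
```

In GHCi, something like fromLines <$> loadFile "input.txt" would then give the graph for the sample file. Each step receives the previous graph and returns a new one, which is exactly the threading of state that the State monad would later hide.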

Related

How to iterate over tree with memory limit in Haskell?

I know that there is a solution for iterating through the Tree using Zippers (see details here). However, it is not clear to me whether it is possible to apply memory constraints to this approach.
Context
I was given the following problem to solve in Haskell:
Design an iterator that will iterate through a binary tree in-order.
Assume the binary tree is stored on disk and can contain up to 10 levels, and therefore can contain up to (2^10 - 1) nodes, and we can store at most 100 nodes in memory at any given time.
The goal of this iterator is to load a small fraction of the binary tree from disk to memory each time it's incremented, so that we don't need to load the entire tree into memory all at once.
I assumed that the memory part is not possible to represent in Haskell, but I was told that it is not true.
Question: what can be used in Haskell to achieve that memory behaviour? Any suggestions, approaches and directions are appreciated. This is just out of curiosity, I've already failed at solving this problem.
If the iterator loads part of the tree each time it is incremented then there are two options:
It exists in the IO monad and works just like in an imperative language.
It is exploiting laziness and interleaved IO. This is the approach taken by functions like readFile which give you the entire contents of a file as one lazy list. The actual file is read on-demand as your application traverses the list.
The latter option is the interesting one here.
The tricky part of lazy lists is retainers. Suppose your file contains a list of numbers. If you compute the sum like this
nums <- map read . lines <$> readFile "numbers.txt"
putStrLn $ "The total is " <> show (sum nums)
then the program will run in constant space. But if you want the average:
putStrLn $ "The average is " <> show (sum nums / fromIntegral (length nums))
then the program will load the entire file into memory. This is because it has to traverse the list twice, once to compute the sum and once to compute the length. It can only do this by holding the entire list.
(The solution is to compute the sum and length in parallel within one pass. But that's beside the point here).
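For completeness, the single-pass version mentioned in the aside can be sketched as follows. The Acc type and the average function are names made up for this example; the strict fields are there so that neither the running sum nor the count builds up a chain of thunks.

```haskell
import Data.List (foldl')

-- Strict accumulator: running sum and element count.
data Acc = Acc !Double !Int

-- Computes the mean in one traversal, so the list can be
-- garbage-collected as it is consumed. Nothing for an empty list.
average :: [Double] -> Maybe Double
average xs = case foldl' step (Acc 0 0) xs of
  Acc _ 0 -> Nothing
  Acc s n -> Just (s / fromIntegral n)
  where
    step (Acc s n) x = Acc (s + x) (n + 1)
```

Because the list is only walked once, nothing needs to hold on to its head while the fold runs.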
The challenge for the tree problem you pose is to come up with an approach to iteration which avoids retaining the tree.
Let's assume that each node in the file contains offsets in the file for the left and right child nodes. We can write a function in the IO monad which seeks to an offset and reads the node there.
data MyNode = MyNode Int Int ..... -- Rest of data to be filled in.
readNodeData :: Handle -> Int -> IO MyNode
From there it would be simple to write a function which traverses the entire file to create a Tree MyNode. If you implement this using unsafeInterleaveIO then you can get a tree which is read lazily as you traverse it.
unsafeInterleaveIO is unsafe because you don't know when the IO will be done. You don't even know what order it will happen in, because it only happens when the value is forced during evaluation. In this way it's like the "promise" structures you get in some other languages. In this particular case this isn't a problem because we can assume the file doesn't change during the evaluation.
Unfortunately this doesn't solve the problem because the entire tree will be held in memory by the time you finish. Your traversal has to retain the root, at least as long as it's traversing the left side, and as long as it does so it will retain the whole of the rest of the tree.
The solution is to rewrite the IO part to return a list instead of a tree, something like this:
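import System.IO (Handle)
import System.IO.Unsafe (unsafeInterleaveIO)
readNode :: Handle -> Int -> IO [MyNode]
readNode _ (-1) = return [] -- Null case for empty child.
readNode h pos = unsafeInterleaveIO $ do
    n <- readNodeData h pos -- Needs to be defined elsewhere.
    lefts <- readNode h (leftChild n) -- leftChild/rightChild: accessors for the child offsets, also defined elsewhere.
    rights <- readNode h (rightChild n)
    return $ lefts ++ [n] ++ rights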
This returns the entire tree as a lazy list. As you traverse the list the relevant nodes will be read on demand. As long as you don't retain the list (see above) your program will not need to hold anything more than the current node and its parents.
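To see the on-demand behaviour without a real file, here is a self-contained sketch that swaps the Handle for an in-memory association list and counts the simulated reads. Everything about the layout is an assumption made for the demo: the MyNode fields, the Disk type, sampleDisk, and the counter are all hypothetical, and inOrder is simply readNode with the handle replaced.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Hypothetical node layout: child positions (-1 means "no child") and a payload.
data MyNode = MyNode { leftChild :: Int, rightChild :: Int, payload :: Char }

-- Stand-in for the file on disk: position -> node.
type Disk = [(Int, MyNode)]

-- Pretend disk read that bumps a counter every time it is called.
readNodeData :: IORef Int -> Disk -> Int -> IO MyNode
readNodeData counter disk pos = do
  modifyIORef' counter (+ 1)
  case lookup pos disk of
    Just n  -> return n
    Nothing -> ioError (userError ("no node at position " ++ show pos))

-- In-order traversal as a lazy list; each node is read only when
-- the part of the list that contains it is forced.
inOrder :: IORef Int -> Disk -> Int -> IO [MyNode]
inOrder _ _ (-1) = return []
inOrder counter disk pos = unsafeInterleaveIO $ do
  n      <- readNodeData counter disk pos
  lefts  <- inOrder counter disk (leftChild n)
  rights <- inOrder counter disk (rightChild n)
  return (lefts ++ [n] ++ rights)

-- A three-node tree: 'b' at the root, 'a' on the left, 'c' on the right.
sampleDisk :: Disk
sampleDisk =
  [ (0, MyNode 1 2 'b')
  , (1, MyNode (-1) (-1) 'a')
  , (2, MyNode (-1) (-1) 'c')
  ]
```

Calling inOrder on sampleDisk performs no reads by itself. Forcing only the head of the result reads the root and its left child, and the right subtree is read only once the traversal gets that far.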

Finding the number of elements in a matrix

I'm a Haskell newcomer, so cut me a bit of slack :P
I need to write a Haskell function that goes through a matrix and outputs a list of all matching elements to a given element (like using filter) and then matches the list against another to check if they are the same.
checkMatrix :: Matrix a -> a -> [a] -> Bool
I have tried variations of using filter, and using the !! operator and I can't figure it out. I don't really want to get the answer handed to me, just need some pointers for getting me on the right path
checkMatrix :: Matrix a -> a -> [a] -> Bool
checkMatrix matr a lst = case matr of
x:xs | [] -> (i don't really know what to put for the base case)
| filter (== True) (x:xs !! 0) -> checkMatrix xs a lst
That's all I've got; I'm really very lost as to what to do next.
tl;dr You want something to the effect of filter someCondition (toList matrix) == otherList, with minor details varying depending on your matrix type and your specific needs.
The Full Answer
I don't know what Matrix type you're using, but the approach is going to be similar for any reasonably defined matrix type.
For this answer, I'll assume you're using the Matrix type from the Data.Matrix module, which lives in the package on Hackage called matrix.
You are right to think you should use filter. Thinking functionally, you want to eliminate some elements from the matrix and keep others, based on a condition. However, a matrix does not provide a natural way to perform filter on it, as the idea is not really well-defined. So, instead, we want to extract the elements from our matrix into a list first. The matrix package provides the following function, which does just that.
toList :: Matrix a -> [a]
Once you have a list representation, you can very easily use filter to get the elements that you want.
A few caveats and notes.
If the matrix package that you're using doesn't define toList itself, check if it defines a Foldable instance for the matrix type. If it does, then Data.Foldable has a general-purpose toList that works for all Foldable types.
Be careful with the ordering here. It's not entirely clear what order the elements should be put into the list in, since matrices are two-dimensional and lists are inherently one-dimensional. If the ordering matters for whatever you're doing, you might have to put some additional effort into guaranteeing the desired order. If it does not matter, consider using Data.Set or some other unordered collection instead of lists.
I don't see any constraints in your checkMatrix implementation. Remember that comparing elements of lists adds an Eq a constraint, and if you want to use an unordered collection then that's going to add Ord a instead.
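Putting the pieces together, the shape of the solution might look like the sketch below. To keep it self-contained it assumes a matrix is simply a list of rows, which is not the matrix package's type; with Data.Matrix you would use its own Matrix and toList instead. The helper name toRowMajor is made up for this example.

```haskell
-- Assumption for this sketch only: a matrix is a list of rows.
type Matrix a = [[a]]

-- Flatten the matrix in row-major order.
toRowMajor :: Matrix a -> [a]
toRowMajor = concat

-- Collect every element equal to x, then compare against the expected list.
-- The Eq constraint is needed both for (== x) and for comparing the lists.
checkMatrix :: Eq a => Matrix a -> a -> [a] -> Bool
checkMatrix matr x lst = filter (== x) (toRowMajor matr) == lst
```

For example, checkMatrix [[1,2],[1,3]] 1 [1,1] holds because the matrix contains exactly two 1s.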

Simple graphs (without parallel edges) in fgl?

Are there any provisions in http://hackage.haskell.org/package/fgl for handling simple graphs (i.e., no parallel edges)? At the moment, we get
import Data.Graph.Inductive
mkGraph [(1,()),(2,())] [(1,2,()),(1,2,())] :: Gr () ()
==> mkGraph [(1,()),(2,())] [(1,2,()),(1,2,())]
containing two edges from 1 to 2. Indeed for this graph,
suc g 1 ==> [2,2]
Just to show that I've done some research: the official documentation http://web.engr.oregonstate.edu/~erwig/fgl/haskell/old/fgl0103.pdf (paper linked from original web page) states in Section 2.1 "we define one type for directed, node-labelled, edge-labelled multi-graphs; other graph types can be obtained as special cases", but then I am seeing just the trivial specialisation (no label = label ()) and nothing on simple graphs.
For simple graphs, the type of Adj b, which occurs in Context a b, should perhaps be Set (b,Node) instead of [(b,Node)] as it is now.
(Or even Map b (Set Node), or Map Node (Set b) ...)
Somewhat related: is there a method that allows me to check for the presence of an edge, or that gives me the set of edges from one node p to another node q? At the moment it seems I have to find q by walking through suc g p. And that's a list, which worries me even more. What if my graph has nodes with large degree?

Tree visualisation and animation

I'm working with multiple search strategies on trees in Haskell. I want to visualize them and also animate the search I'm doing in it. The best I've found so far is graphviz images that I could generate by writing DOT files (like in Land of Lisp) but I doubt that it is the best approach. My trees can get quite big so I don't want to enter the position of each node in my program, I want them to be placed correctly automatically.
I've also looked a little bit at Gephi but I'm not sure if I can input my data in it.
Also my Tree datatype is very basic : data Tree a = Leaf a | Branch (Tree a) (Tree a).
So in short, I'm looking for a way to get tree visualisation and animation of a search strategy on it. I'm not necessarily looking for a Haskell-centric solution, but that would be great. Also, being able to output the images/animation in a standard format such as gif would be a big plus.
I'll expand my comment:
I haven't investigated the pricing policy of Ubigraph, but you can download a free version from their site ("basic" one?). Then you can install the vacuum-ubigraph package (there seems to be a build failure reported on HackageDB under GHC 7.0, but I've just managed to install it under my 7.0.2 without a problem). Once that's done you can just start ubigraph_server and start 'feeding' it with your data structures right from ghci:
import System.Vacuum.Ubigraph
data Tree a = Leaf a | Branch (Tree a) (Tree a)
data Root a = Root a
tree =
    Root
        (Branch
            (Branch
                (Leaf "A")
                (Leaf "B"))
            (Leaf "C"))
Type view tree and you'll get something similar to:
You can zoom in/out and rotate it. Not sure how practical it is (it shows the entire Haskell object graph like it is - note shared []), but there are lots of settings to play with, so you can definitely make it look nicer. Animation seems to be supported as well.
If you go the Ubigraph route you can just use the HUbigraph bindings directly, for example:
import Graphics.Ubigraph
import Control.Monad
main = do
    h <- initHubigraph "http://127.0.0.1:20738/RPC2"
    runHubigraph op h
op = do
    clear
    vs <- mapM (const newVertex) [0..400]
    mapM_ (setVAttr (VShape Sphere)) vs
    let bind i = zipWithM (\a b -> newEdge (a,b)) vs (drop i vs ++ take i vs)
    mapM_ bind [1..15]
    return ()
I just spent some time playing with this - it's fun but don't try to up the value of 15 to, say, 40 or ubigraph gets very upset (constant motion of the vertices)!

How can monads make my job easier? Show me some cool piece of code

I like reading snippets of code about concepts that I don't understand. Are there any snippets that show off monads in all their glory? More importantly, how can I apply monads to make my job easier?
I use jQuery heavily. That's one cool application of monads I know of.
Like others, I think the question is far too general. I think most answers (like mine) will give examples of something neat making use of one specific monad. The real power of monads is that, once you understand them as an abstraction, you can apply that knowledge to any new monads you come across (and in Haskell there are a lot). This in turn means you can easily figure out what new code does and how to use it because you already know the interface and some rules that govern its behavior.
Anyway, here's an example using the List monad from a test-running script I wrote:
runAll :: IO ()
runAll = do
    curdir <- getCurrentDirectory
    sequence $ runTest <$> srcSets <*> optExeFlags <*> optLibFlags
    setCurrentDirectory curdir
Technically I'm using the Applicative interface, but you can just change the <*>'s to ap from Control.Monad if that bothers you.
The cool thing about this is that it calls runTest for every combination of arguments from the lists "srcSets", "optExeFlags", and "optLibFlags" in order to generate profiling data for each of those sets. I think this is much nicer than what I would have done in C (3 nested loops).
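A self-contained sketch of the same pattern, with made-up stand-ins for the three lists and a runTest that only reports what it would run (none of these definitions come from the original script):

```haskell
-- Hypothetical stand-ins for srcSets, optExeFlags and optLibFlags.
sources :: [String]
sources = ["a.hs", "b.hs"]

exeFlags, libFlags :: [String]
exeFlags = ["-O0", "-O2"]
libFlags = ["-prof"]

-- Hypothetical test runner: it just describes the combination it was given.
runTest :: String -> String -> String -> IO String
runTest src exe lib = return (unwords [src, exe, lib])

-- The list Applicative builds one IO action per combination,
-- in the same order as three nested loops; sequence then runs them in order.
runAllCombos :: IO [String]
runAllCombos = sequence (runTest <$> sources <*> exeFlags <*> libFlags)
```

With two sources, two executable flags and one library flag this yields four runs, with the first list varying slowest.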
Your question is really vague -- it's like asking, "show an example of code that uses variables". It's so intrinsic to programming that any code is going to be an example. So, I'll just give you the most-recently-visited Haskell function that's still open in my editor, and explain why I used monadic control flow.
It's a code snippet from my xmonad config file. It is part of the implementation for a layout that behaves in a certain way when there is one window to manage, and in another way for more than one window. This function takes a message and generates a new layout. If we decide that there is no change to be made, however, we return Nothing:
handleMessage' :: AlmostFull a -> SomeMessage -> Int -> Maybe (AlmostFull a)
handleMessage' l@(AlmostFull ratio delta t) m winCount =
    case winCount of
      -- keep existing Tall layout, maybe update ratio
      0 -> finalize (maybeUpdateRatio $ fromMessage m) (Just t)
      1 -> finalize (maybeUpdateRatio $ fromMessage m) (Just t)
      -- keep existing ratio, maybe update Tall layout
      _ -> finalize (Just ratio) (pureMessage t m)
  where
    finalize :: Maybe Rational -> Maybe (Tall a) -> Maybe (AlmostFull a)
    finalize ratio t = ratio >>= \ratio -> t >>= \t ->
      return $ AlmostFull ratio delta t
    maybeUpdateRatio :: Maybe Resize -> Maybe Rational
    maybeUpdateRatio (Just Shrink) = Just (max 0 $ ratio-delta)
    maybeUpdateRatio (Just Expand) = Just (min 1 $ ratio+delta)
    maybeUpdateRatio _ = Nothing
We decide what to return based on the current window manager state (which is determined by a computation in the X monad, whose result we pass to this function to keep the actual logic pure) -- if there are 0 or 1 windows, we pass the message to the AlmostFull layout and let it decide what to do. That's the maybeUpdateRatio function. It returns Just the new ratio if the message changes the ratio, otherwise it returns Nothing. The other half is similar; it passes the message on to Tall's handler if there are 2 or more windows. That returns Just a new Tall layout if that's what the user asked for, otherwise it returns Nothing.
The finalize function is the interesting part; it extracts both ratio (the desired new ratio) and t (the desired new Tall layout) from its Maybe wrapper. This means that both have to be not Nothing, otherwise we automatically return Nothing from our function.
The reason we used the Maybe monad here was so that we could write a function contingent on all results being available, without having to write any code to handle the cases where a Nothing appeared.
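Stripped of the xmonad types, the idea behind finalize can be sketched like this. The Layout record and the combine function are hypothetical simplifications introduced for this example; the do block is just sugar for the >>= chain shown above.

```haskell
-- Hypothetical, simplified stand-in for the layout in the answer.
data Layout = Layout { layoutRatio :: Rational, masterCount :: Int }
  deriving (Show, Eq)

-- Succeeds only if both inputs are available; a Nothing in either
-- position short-circuits the whole computation to Nothing.
combine :: Maybe Rational -> Maybe Int -> Maybe Layout
combine mRatio mCount = do
  r <- mRatio
  n <- mCount
  return (Layout r n)
```

No case analysis on Nothing appears anywhere in combine; the Maybe monad's bind handles it.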
Essentially, monads are "imperative minilanguages". Hence, they enable you to use any imperative construct like exceptions (Maybe), logging (Writer), Input/Output (IO), State (State), non-determinism (lists [a]), parsers (Parsec, ReadP) or combinations thereof.
For more advanced examples, have a look at the example code for my operational package. In particular,
WebSessionState.lhs implements web sessions that are programmed as if the server were a persistent process while they are in fact delivered asynchronously.
TicTacToe.hs shows a game engine where players and AI are written as if they were running in concurrent processes.
I've been looking into Haskell and Information Flow security. This paper is pretty interesting, it uses Monads to enforce confidentiality in Haskell Programs.
http://www.cse.chalmers.se/~russo/seclib.htm
Here is something that I did recently that might show off some of the power of monads. The actual code is not shown here to protect the innocent, this is just a sketch.
Let's say you want to search through some dictionary and depending on what you find you want to do some other search. The searches might return Nothing (the element you are looking for doesn't exist) in which case you might try a different search, and if all searches fail you return Nothing.
The idea is to make our own monad by combining monad transformers, and then we can easily make some combinators for searches. Our monad will be ReaderT Dictionary Maybe. We define three functions: find, which looks up a given key; both, which runs two searches and returns the concatenation of what they found; and oneOf, which takes two searches, tries the first, and if it didn't succeed tries the second. Here is an example of such a search:
import Control.Monad
import Control.Monad.Reader
find a = ReaderT (lookup a)
both a b = liftM2 (++) a b
oneOf = mplus
search = both (find 1) ((find 2) `oneOf` (find 3))
         `oneOf` both (find 4) (find 5)
And running:
(runReaderT search) [(1,"a"),(3,"c"),(4,"d"),(5,"g")] --> Just "ac"
(runReaderT search) [(6,"a")] --> Nothing
The big advantage we gain from this being a monad is that we can bind searches together and lift other functions into this abstraction. Let's say for instance, I have two searches search_a and search_b, and I want to do them and then return them merged:
do a <- search_a
   b <- search_b
   return (merge a b)
or alternatively liftM2 merge search_a search_b.
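To see what ReaderT Dictionary Maybe is doing under the hood, here is the same set of combinators written by hand over plain functions, using only base. This is an unrolled sketch for illustration, not the code from the answer: the Dictionary and Search type synonyms are assumptions, and the dictionary is passed explicitly where ReaderT would thread it for you.

```haskell
import Control.Applicative ((<|>))

type Dictionary = [(Int, String)]

-- A search is a function from the dictionary to an optional result.
type Search = Dictionary -> Maybe String

find :: Int -> Search
find = lookup

-- Run both searches against the same dictionary; fail if either fails.
both :: Search -> Search -> Search
both a b dict = (++) <$> a dict <*> b dict

-- Try the first search, fall back to the second.
oneOf :: Search -> Search -> Search
oneOf a b dict = a dict <|> b dict

search :: Search
search = both (find 1) (find 2 `oneOf` find 3)
         `oneOf` both (find 4) (find 5)
```

Applying search to the two dictionaries from the example gives the same results as the ReaderT version. What the transformer stack buys you is that the dict argument disappears and do-notation and liftM2 work on searches directly.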

Resources