I'm working with multiple search strategies on trees in Haskell. I want to visualize the trees and also animate the searches I run on them. The best I've found so far is Graphviz images generated by writing DOT files (like in Land of Lisp), but I doubt that it is the best approach. My trees can get quite big, so I don't want to enter the position of each node in my program; I want them to be placed correctly automatically.
I've also looked a little bit at Gephi but I'm not sure if I can input my data in it.
Also, my Tree datatype is very basic: data Tree a = Leaf a | Branch (Tree a) (Tree a).
So in short, I'm looking for a way to visualise trees and animate search strategies on them. I'm not necessarily looking for a Haskell-centric solution, but that would be great. Being able to output the images/animation in a standard format such as GIF would also be a big plus.
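For reference, the DOT route mentioned in the question can be sketched in a few lines. This is only a minimal sketch, not a full solution: the names dotLines and toDot are made up for illustration, node ids are assigned in pre-order, and labels are rendered with show, which is fine for ASCII labels but has not been checked against every DOT escaping rule.

```haskell
data Tree a = Leaf a | Branch (Tree a) (Tree a)

-- Walk the tree in pre-order, giving every node a fresh integer id.
-- Returns the next unused id together with the DOT statements produced.
dotLines :: Show a => Int -> Tree a -> (Int, [String])
dotLines n (Leaf x) =
  (n + 1, ["  n" ++ show n ++ " [label=" ++ show (show x) ++ "];"])
dotLines n (Branch l r) =
  let (nl, ls) = dotLines (n + 1) l   -- left subtree starts at id n+1
      (nr, rs) = dotLines nl r        -- right subtree starts at id nl
      self     = "  n" ++ show n ++ " [shape=point];"
      edge c   = "  n" ++ show n ++ " -> n" ++ show c ++ ";"
  in (nr, self : edge (n + 1) : edge nl : ls ++ rs)

-- Render a whole tree as a DOT digraph; Graphviz chooses the layout.
toDot :: Show a => Tree a -> String
toDot t = unlines (["digraph tree {"] ++ snd (dotLines 0 t) ++ ["}"])
```

Writing the result with writeFile "tree.dot" (toDot t) and running dot -Tpng tree.dot -o tree.png gives an automatically laid-out image. For an animation, one option is to emit one DOT file per search step, colouring the visited nodes, and assemble the rendered frames with an external tool.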
I'll expand my comment:
I haven't investigated the pricing policy of Ubigraph, but you can download a free version from their site (the "basic" one?). Then you can install the vacuum-ubigraph package (HackageDB seems to report a build failure under GHC 7.0, but I've just managed to install it under my 7.0.2 without a problem). Once that's done you can just start ubigraph_server and start 'feeding' it with your data structures right from ghci:
import System.Vacuum.Ubigraph
data Tree a = Leaf a | Branch (Tree a) (Tree a)
data Root a = Root a
tree =
  Root
    (Branch
      (Branch
        (Leaf "A")
        (Leaf "B"))
      (Leaf "C"))
Type view tree and the Ubigraph window will display the structure as a graph.
You can zoom in/out and rotate it. I'm not sure how practical it is (it shows the entire Haskell object graph as it is; note the shared []), but there are lots of settings to play with, so you can definitely make it look nicer. Animation seems to be supported as well.
If you go the Ubigraph route you can just use the HUbigraph bindings directly, for example:
import Graphics.Ubigraph
import Control.Monad
main = do
  h <- initHubigraph "http://127.0.0.1:20738/RPC2"
  runHubigraph op h

op = do
  clear
  vs <- mapM (const newVertex) [0..400]
  mapM_ (setVAttr (VShape Sphere)) vs
  let bind i = zipWithM (\a b -> newEdge (a,b)) vs (drop i vs ++ take i vs)
  mapM_ bind [1..15]
  return ()
I just spent some time playing with this. It's fun, but don't try to raise the value of 15 to, say, 40, or Ubigraph gets very upset (constant motion of the vertices)!
Let's say we have existing tree-like data and we would like to add information about depth of each node. How can we easily achieve that?
data Tree = Node Tree Tree | Leaf
For each node we would like to know its depth in constant time. The data comes from an external module, so we only have the type shown above. A real-life example would be an external HTML parser that just provides the XML tree, where we would like to gather data such as how many hyperlinks every node contains.
Functional languages are created for traversing trees and gathering data, there should be an easy solution.
Obvious solution would be creating parallel structure. Can we do better?
The standard trick, which I learned from Chris Okasaki's wonderful Purely Functional Data Structures is to cache the results of expensive operations at each node. (Perhaps this trick was known before Okasaki's thesis; I don't know.) You can provide smart constructors to manage this information for you so that constructing the tree need not be painful. For example, when the expensive operation is depth, you might write:
module SizedTree (SizedTree, sizedTree, node, leaf, depth) where
data SizedTree = Node !Int SizedTree SizedTree | Leaf
node l r = Node (max (depth l) (depth r) + 1) l r
leaf = Leaf
depth (Node d _ _) = d
depth Leaf = 0
-- since we don't expose the constructors, we should
-- provide a replacement for pattern matching
sizedTree f v (Node _ l r) = f l r
sizedTree f v Leaf = v
Constructing SizedTrees costs O(1) extra work at each node (hence it is O(n) work to convert an n-node Tree to a SizedTree), but the payoff is that checking the depth of a SizedTree -- or of any subtree -- is an O(1) operation.
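The O(n) conversion mentioned above can be sketched as follows. This is a minimal sketch rather than part of the module: the function name fromTree is made up, and the annotated constructors are renamed to SNode and SLeaf only so that both types fit in one file.

```haskell
-- Plain tree, as delivered by the external module.
data Tree = Node Tree Tree | Leaf

-- Annotated tree with the depth cached at every node
-- (constructors renamed here to avoid a clash with Tree).
data SizedTree = SNode !Int SizedTree SizedTree | SLeaf

depth :: SizedTree -> Int
depth (SNode d _ _) = d
depth SLeaf         = 0

-- Smart constructor: computes the cached depth from the children in O(1).
node :: SizedTree -> SizedTree -> SizedTree
node l r = SNode (max (depth l) (depth r) + 1) l r

-- One pass over the plain tree, O(1) extra work per node.
fromTree :: Tree -> SizedTree
fromTree Leaf       = SLeaf
fromTree (Node l r) = node (fromTree l) (fromTree r)
```

After the conversion, depth on any subtree is a single pattern match.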
You do need another data type in which to store these Ints. Define Tree as
data Tree a = Node (Tree a) a (Tree a) | Leaf a
and then write a function
annDepth :: Tree a -> Tree (Int, a)
Your original Tree is Tree () and with pattern synonyms you can recover nice constructors.
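One possible definition of annDepth is sketched below. It assumes that "depth" means distance from the root (the root gets 0); if the height of each subtree is wanted instead, the recursion would have to return the value from the children rather than pass it down.

```haskell
data Tree a = Node (Tree a) a (Tree a) | Leaf a

-- Pair every label with the distance of its node from the root.
annDepth :: Tree a -> Tree (Int, a)
annDepth = go 0
  where
    go d (Leaf x)     = Leaf (d, x)
    go d (Node l x r) = Node (go (d + 1) l) (d, x) (go (d + 1) r)
```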
If you want to preserve the original tree for some reason, you can define a view:
{-# LANGUAGE GADTs, DataKinds #-}
data Shape = SNode Shape Shape | SLeaf
data Tree a sh where
  Leaf :: a -> Tree a SLeaf
  Node :: Tree a lsh -> a -> Tree a rsh -> Tree a (SNode lsh rsh)
With this you have a guarantee that an annotated tree has the same shape as the unannotated one. But this doesn't work well without proper dependent types.
Also, have a look at the question Boilerplate-free annotation of ASTs in Haskell?
The standard solution is what @DanielWagner suggested: just extend the data structure. This can be somewhat inconvenient, but that can be solved by using smart constructors for creating values and records for pattern matching.
Perhaps Data types a la carte could help, although I haven't used this approach myself. There is a library compdata based on that.
A completely different approach would be to efficiently memoize the values you need. I was trying to solve a similar problem and one of the solutions is provided by the library stable-memo. Note that this isn't a purely functional approach, as the library is internally based on object identity, but the interface is pure and works perfectly for the purpose.
In Scheme, the primitive eq? tests whether its arguments are the same object. For example, given the following list
(define lst
  (let ((x (list 'a 'b)))
    (cons x x)))
the result of
(eq? (car lst) (cdr lst))
is true, and moreover it is true without having to peer into (car lst) and (cdr lst). This allows you to write efficient equality tests for data structures that have a lot of sharing.
Is the same thing ever possible in Haskell? For example, consider the following binary tree implementation
data Tree a = Tip | Bin a (Tree a) (Tree a)
left (Bin _ l _) = l
right (Bin _ _ r) = r
mkTree :: Int -> Tree Int
mkTree 0 = Tip
mkTree n = let t = mkTree (n-1) in Bin n t t
which has sharing at every level. If I create a tree with let tree = mkTree 30 and I want to see if left tree and right tree are equal, naively I have to traverse over a billion nodes to discover that they are the same tree, which should be obvious because of data sharing.
I don't expect there is a simple way to discover data sharing in Haskell, but I wondered what the typical approaches to dealing with issues like this are, when it would be good to detect sharing for efficiency purposes (or e.g. to detect cyclic data structures).
Are there unsafe primitives that can detect sharing? Is there a well-known way to build data structures with explicit pointers, so that you can compare pointer equality?
There are lots of approaches.
Generate unique IDs and stick everything in a finite map (e.g. IntMap).
The refined version of the last choice is to make an explicit graph, e.g. using fgl.
Use stable names.
Use IORefs, which have an Eq instance regardless of the contained type (note that IORef has no Ord instance).
There are libraries for observable sharing.
As mentioned above, there is reallyUnsafePtrEquality# but you should understand what's really unsafe about it before you use it!
See also this answer about avoiding equality checks altogether.
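To make the stable-names option concrete, here is a minimal sketch using System.Mem.StableName from base. The helper name sameObject is made up for illustration. The documented guarantee only runs one way: equal stable names mean the same object, while a False result proves nothing, so a structural comparison is still needed as a fallback.

```haskell
import System.Mem.StableName (makeStableName)

data Tree a = Tip | Bin a (Tree a) (Tree a)

mkTree :: Int -> Tree Int
mkTree 0 = Tip
mkTree n = let t = mkTree (n - 1) in Bin n t t

-- True means both arguments are the same heap object.
-- False only means "not known to be shared".
sameObject :: a -> a -> IO Bool
sameObject x y = do
  sx <- x `seq` makeStableName x   -- force to WHNF first: a thunk and its
  sy <- y `seq` makeStableName y   -- value may get different stable names
  return (sx == sy)
```

With this, comparing the two children of mkTree 30 is a constant-time check in IO instead of a traversal of about a billion nodes.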
It is not possible in Haskell, the pure language.
But in its implementation in GHC, there are loopholes, such as
the use of reallyUnsafePtrEquality# or
introspection libraries like ghc-heap-view.
In any case, using this in regular code would be very unidiomatic; at most I could imagine that building a highly specialized library for something (memoization, hash tables, whatever) that then provides a sane, pure API might be acceptable.
There is reallyUnsafePtrEquality#.
I'm having a very hard time trying to figure out how to read in (and also how to represent) a graph in Haskell.
The input from the file will look something like
NODES 3
EDGE 1 2
EDGE 1 3
EDGE 2 3
I have figured out how to grab the individual lines of input from the file using:
loadFile :: String -> IO [[String]]
loadFile filename = do
  contents <- readFile filename
  return $ map words $ lines contents
That gives output like:
loadFile "input.txt"
[["NODES","3"],["EDGE","1","2"],["EDGE","1","3"],["EDGE","2","3"]]
What I really need to figure out is how to represent this graph data as a graph, though.
I was thinking of setting it up as a list of edges:
type Edge = (Int,Int)
type Graph = [Edge]
But then I'm not sure how I would even begin to implement the functions I need such as addNode, addEdge, getNodes, getEdges.
Any help or pointing me in the right direction would be awesome! Note: I can't use any already developed graph modules for this.
So, for the tl;dr version:
Am I reading in the data the best way?
How should I represent this data in Haskell?
If I use the data structures I outlined above, how would I go about implementing one of those functions?
There are a lot of interesting concerns going on here. Let me attack them all.
You're reading in the data just fine for a line-oriented language. Later on you'll see Data.ByteString and Data.Text replace String for efficiency. You'll also see Parsec for parsing. Those can wait, though. Revisit them in time.
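As a minimal sketch of the next step, the tokenised lines returned by loadFile can be turned into edges like this. The names parseEdge and parseEdges are made up for illustration, and lines that are not well-formed EDGE lines (including the NODES header) are silently skipped, which may or may not be what the assignment wants.

```haskell
import Data.Maybe (mapMaybe)
import Text.Read (readMaybe)

type Edge = (Int, Int)

-- Parse one tokenised line; Nothing for anything that is not "EDGE a b".
parseEdge :: [String] -> Maybe Edge
parseEdge ["EDGE", a, b] = (,) <$> readMaybe a <*> readMaybe b
parseEdge _              = Nothing

-- Keep only the lines that parse as edges.
parseEdges :: [[String]] -> [Edge]
parseEdges = mapMaybe parseEdge
```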
Your graph representation is fine. Adjacency lists are a common and useful representation.
Now, the real trick you have is here. Let's take a look at addNode and addEdge. Each is a somewhat challenging function to produce in a pure functional language because they want to modify a graph... but we don't have state.
In a pure language you don't mutate: the way to modify without state is to return a new value. The kind of function you're looking for is thus
addNode :: Node -> Graph -> Graph
where the returned Graph is identical to the input Graph except with one more node. You should note immediately that there's something wrong here---adjacency lists assume that there are no orphan nodes. We can't add just a single node to the graph.
There are two solutions. One, we could "link" the node in to the graph (which is really addEdge in disguise) or two we could extend the graph representation to include orphan nodes. Let's do (2).
data Graph = Graph [Edge] [Int] -- orphans
Now let's implement adding an edge. Assuming duplicate edges are allowed, adding an edge to the adjacency list is easy: just prepend it
addEdge0 :: Edge -> Graph -> Graph
addEdge0 e (Graph adj orph) = Graph (e:adj) orph
but that's not good enough---we want our orphan list to only include truly orphaned nodes. We'll filter it.
addEdge :: Edge -> Graph -> Graph
addEdge (n1,n2) (Graph adj orph) =
  Graph ((n1,n2):adj) (filter (/=n1) . filter (/=n2) $ orph)
getEdges is trivial since we're already storing the list of edges
getEdges :: Graph -> [Edge]
getEdges (Graph edges _) = edges
getNodes just needs to append all of our nodes from the adjacency list to the orphan list. We could use Data.List.nub to get only the unique nodes.
-- requires: import Data.List (nub)
getNodes :: Graph -> [Int]
getNodes (Graph adj orph) = nub (orph ++ adjNodes adj) where
  adjNodes [] = []
  adjNodes ((n1,n2):rest) = n1 : n2 : adjNodes rest
Hopefully these give you some indication of how to think in a functional language. You'll have to dig into them a little bit to see how they work, but I've introduced a large number of interesting concepts here.
Next steps here might include trying to use the State monad to recapture imperative state modification and to chain these Graph-modifying functions together.
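Before reaching for State, the functions above can already be chained with plain folds. The sketch below puts the pieces in one file and adds the addNode that was left out. It assumes that "NODES n" means the nodes are numbered 1 to n; emptyGraph and fromInput are made-up names.

```haskell
import Data.List (nub)

type Edge = (Int, Int)
data Graph = Graph [Edge] [Int]  -- edges, orphan nodes

emptyGraph :: Graph
emptyGraph = Graph [] []

getEdges :: Graph -> [Edge]
getEdges (Graph edges _) = edges

getNodes :: Graph -> [Int]
getNodes (Graph adj orph) = nub (orph ++ concatMap (\(a, b) -> [a, b]) adj)

-- A node that is not mentioned by any edge starts out as an orphan.
addNode :: Int -> Graph -> Graph
addNode n g@(Graph adj orph)
  | n `elem` getNodes g = g
  | otherwise           = Graph adj (n : orph)

addEdge :: Edge -> Graph -> Graph
addEdge (n1, n2) (Graph adj orph) =
  Graph ((n1, n2) : adj) (filter (/= n1) . filter (/= n2) $ orph)

-- Build a graph from a node count and a list of edges.
fromInput :: Int -> [Edge] -> Graph
fromInput count es = foldr addEdge (foldr addNode emptyGraph [1 .. count]) es
```

Each step takes a Graph and returns a new one, so the fold threads the graph through all the updates without any mutable state.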
Sometimes I find myself using different types of trees in Haskell and I don't know what they are called, where to get more information on algorithms using them or class instances for them, or whether there is some pre-existing code or library on Hackage.
Examples:
Binary trees where the labels are on the leaves or the branches:
data BinTree1 a = Leaf |
  Branch {label :: a, leftChild :: BinTree1 a, rightChild :: BinTree1 a}
data BinTree2 a = Leaf {label :: a} |
  Branch {leftChild :: BinTree2 a, rightChild :: BinTree2 a}
Similarly trees with the labels for each children node or a general label for all their children:
data Tree1 a = Branch {label :: a, children :: [Tree1 a]}
data Tree2 a = Branch {labelledChildren :: [(a, Tree2 a)]}
Sometimes I start using Tree2 and somehow, in the course of development, it gets refactored into Tree1, which seems simpler to deal with, but I never gave it a lot of thought. Is there some kind of duality here?
Also, if you can post some other different kinds of trees that you think are useful, please do.
In summary: everything you can tell me about those trees will be useful! :)
Thanks.
EDIT:
Clarification: this is not homework. It's just that I usually end up using those data types and creating instances (Functor, Monad, etc.), and maybe if I knew their names I would find libraries with this stuff implemented and more theoretical information on them.
Usually when a library on Hackage has Tree in the name, it implements BinTree2 or some version of a non-binary tree with labels only on the leaves, so it seems to me that maybe Tree2 and BinTree2 have some other name or identifier.
Also, I feel that there may be some kind of duality or isomorphism, or a way of turning code that uses Tree1 into code that uses Tree2 with some transformation. Is there? Maybe it's just an impression.
The names I've heard:
BinTree1 is a binary tree
BinTree2: I don't know a name, but you can use such a tree to represent a prefix-free code such as Huffman coding
Tree1 is a Rose tree
Tree2 is isomorphic to [Tree1] (a forest of Tree1); another way to view it is as a Tree1 without a label for the root.
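The isomorphism between Tree2 and a forest of Tree1 can be written out directly. In this sketch the constructors are renamed to Branch1 and Branch2 so both types fit in one module, and the function names are made up. For library support, the containers package provides a rose tree as Data.Tree, together with a Forest type synonym.

```haskell
data Tree1 a = Branch1 a [Tree1 a]
  deriving (Eq, Show)

newtype Tree2 a = Branch2 [(a, Tree2 a)]
  deriving (Eq, Show)

-- Each labelled child of a Tree2 becomes the root of one Tree1.
toForest :: Tree2 a -> [Tree1 a]
toForest (Branch2 cs) = [Branch1 x (toForest t) | (x, t) <- cs]

-- The inverse: each root label moves onto the edge leading to it.
fromForest :: [Tree1 a] -> Tree2 a
fromForest ts = Branch2 [(x, fromForest cs) | Branch1 x cs <- ts]
```

Because the two functions are mutual inverses, code written against one representation can be transported to the other by converting at the boundaries.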
A binary tree that only has labels in the leaves (BinTree2) is usually used for hash maps, because the tree structure itself doesn't offer any information other than the binary position of the leaves.
So, if you have 4 values with the following hash codes:
...000001 A
...000010 B
...000011 C
...000010 D
... you might store them in a binary tree (an implicit patricia trie) like so:
       +          <- Bit #1 (least significant bit) of hash code
      / \            0 = left, 1 = right
     /   \
 [B, D]   +       <- Bit #2
         / \
        /   \
      [A]   [C]
We see that since the hash codes of B and D "start" with 0, they are stored in the left root child. They have exactly the same hash codes, so no more forks are necessary. The hash codes of A and C both "start" with 1, so another fork is necessary. A has bit 2 as 0, so it goes to the left, and C with 1 goes to the right.
This hash table implementation is kind of bad, because hashes might have to be recomputed when certain elements are inserted, but no matter.
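The lookup described above can be sketched as follows. The type Trie and the function lookupBucket are made-up names, buckets are plain lists, and insertion (the part that may need to split buckets) is left out.

```haskell
import Data.Bits (testBit)

-- Values live only in the leaves; inner nodes just branch on one hash bit.
data Trie v = Bucket [v] | Fork (Trie v) (Trie v)

-- Follow the bits of the hash code, least significant first:
-- 0 goes left, 1 goes right, until a bucket is reached.
lookupBucket :: Int -> Trie v -> [v]
lookupBucket h = go 0
  where
    go _ (Bucket vs) = vs
    go i (Fork l r)
      | testBit h i = go (i + 1) r
      | otherwise   = go (i + 1) l

-- The tree from the diagram above.
example :: Trie Char
example = Fork (Bucket "BD") (Fork (Bucket "A") (Bucket "C"))
```

Here lookupBucket 2 example returns the shared bucket "BD", after which the caller still has to compare keys within the bucket.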
BinTree1 is just an ordinary binary tree, and is used for fast order-based sets. Nothing more to say about it, really.
The only difference between Tree1 and Tree2 is that Tree2 can't have root node labels. This means that if used as a prefix tree, it cannot contain the empty string. It has very limited use, and I haven't seen anything like it in practice. Tree1, however, obviously has a use as a non-binary prefix tree.
I like reading snippets of code about concepts that I don't understand. Are there any snippets that show off monads in all their glory? More importantly how can I apply monads to make my job easier.
I use jQuery heavily. That's one cool application of monads I know of.
Like others, I think the question is far too general. I think most answers (like mine) will give examples of something neat making use of one specific monad. The real power of monads is that, once you understand them as an abstraction, you can apply that knowledge to any new monads you come across (and in Haskell there are a lot). This in turn means you can easily figure out what new code does and how to use it because you already know the interface and some rules that govern its behavior.
Anyway, here's an example using the List monad from a test-running script I wrote:
runAll :: IO ()
runAll = do
  curdir <- getCurrentDirectory
  -- sequence_ runs every action and discards the list of results
  sequence_ $ runTest <$> srcSets <*> optExeFlags <*> optLibFlags
  setCurrentDirectory curdir
Technically I'm using the Applicative interface, but you can just change the <*>'s to ap from Control.Monad if that bothers you.
The cool thing about this is that it calls runTest for every combination of arguments from the lists "srcSets", "optExeFlags", and "optLibFlags" in order to generate profiling data for each of those sets. I think this is much nicer than what I would have done in C (3 nested loops).
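The same pattern can be seen in isolation with pure data. This is only an illustration: the file names and flags below are made up, and a tuple constructor stands in for runTest.

```haskell
-- Every combination of one element from each list, in nested-loop order
-- (the leftmost list varies slowest).
combos :: [(String, String, String)]
combos = (,,) <$> ["a.hs", "b.hs"] <*> ["-O0", "-O2"] <*> ["-prof"]
```

The list has 2 * 2 * 1 = 4 entries, starting with ("a.hs", "-O0", "-prof").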
Your question is really vague -- it's like asking, "show an example of code that uses variables". It's so intrinsic to programming that any code is going to be an example. So, I'll just give you the most-recently-visited Haskell function that's still open in my editor, and explain why I used monadic control flow.
It's a code snippet from my xmonad config file. It is part of the implementation for a layout that behaves in a certain way when there is one window to manage, and in another way for more than one window. This function takes a message and generates a new layout. If we decide that there is no change to be made, however, we return Nothing:
handleMessage' :: AlmostFull a -> SomeMessage -> Int -> Maybe (AlmostFull a)
handleMessage' l@(AlmostFull ratio delta t) m winCount =
  case winCount of
    -- keep existing Tall layout, maybe update ratio
    0 -> finalize (maybeUpdateRatio $ fromMessage m) (Just t)
    1 -> finalize (maybeUpdateRatio $ fromMessage m) (Just t)
    -- keep existing ratio, maybe update Tall layout
    _ -> finalize (Just ratio) (pureMessage t m)
  where
    finalize :: Maybe Rational -> Maybe (Tall a) -> Maybe (AlmostFull a)
    finalize ratio t = ratio >>= \ratio -> t >>= \t ->
      return $ AlmostFull ratio delta t
    -- fromMessage yields a Maybe Resize (Shrink or Expand)
    maybeUpdateRatio :: Maybe Resize -> Maybe Rational
    maybeUpdateRatio (Just Shrink) = Just (max 0 $ ratio-delta)
    maybeUpdateRatio (Just Expand) = Just (min 1 $ ratio+delta)
    maybeUpdateRatio _ = Nothing
We decide what to return based on the current window manager state (which is determined by a computation in the X monad, whose result we pass to this function to keep the actual logic pure). If there are 0 or 1 windows, we pass the message to the AlmostFull layout and let it decide what to do. That's the maybeUpdateRatio function. It returns Just the new ratio if the message changes the ratio, otherwise it returns Nothing. The other half is similar; it passes the message on to Tall's handler if there are 2 or more windows. That returns Just a new Tall layout if that's what the user asked for, otherwise it returns Nothing.
The finalize function is the interesting part; it extracts both ratio (the desired new ratio) and t (the desired new Tall layout) from its Maybe wrapper. This means that both have to be not Nothing, otherwise we automatically return Nothing from our function.
The reason we used the Maybe monad here was so that we could write a function contingent on all results being available, without having to write any code to handle the cases where a Nothing appeared.
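Stripped of the xmonad types, the shape of finalize is the following. This is a hypothetical stand-in, not the config's code: combine is a made-up name and a String takes the place of the Tall layout.

```haskell
-- Succeeds only if both inputs are Just; any Nothing short-circuits.
combine :: Maybe Rational -> Maybe String -> Maybe (Rational, String)
combine mRatio mLayout = do
  ratio  <- mRatio    -- stops with Nothing if mRatio is Nothing
  layout <- mLayout   -- likewise for mLayout
  return (ratio, layout)
```

No explicit case analysis on Nothing appears anywhere; the Maybe monad's bind handles it.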
Essentially, monads are "imperative minilanguages". Hence, they enable you to use any imperative construct like exceptions (Maybe), logging (Writer), Input/Output (IO), State (State), non-determinism (lists [a]), parsers (Parsec, ReadP) or combinations thereof.
For more advanced examples, have a look at the example code for my operational package. In particular,
WebSessionState.lhs implements web sessions that are programmed as if the server were a persistent process while they are in fact delivered asynchronously.
TicTacToe.hs shows a game engine where players and AI are written as if they were running in concurrent processes.
I've been looking into Haskell and Information Flow security. This paper is pretty interesting, it uses Monads to enforce confidentiality in Haskell Programs.
http://www.cse.chalmers.se/~russo/seclib.htm
Here is something that I did recently that might show off some of the power of monads. The actual code is not shown here to protect the innocent, this is just a sketch.
Let's say you want to search through some dictionary and depending on what you find you want to do some other search. The searches might return Nothing (the element you are looking for doesn't exist) in which case you might try a different search, and if all searches fail you return Nothing.
The idea is to make our own monad by combining monad transformers, and then we can easily make some combinators for searches. Our monad will be ReaderT Dictionary Maybe. We define three functions: find, which looks up a given key; both, which runs two searches and returns the concatenation of what they found; and oneOf, which takes two searches, tries the first, and falls back to the second if the first fails. Here is an example of such a search:
import Control.Monad
import Control.Monad.Reader
find a = ReaderT (lookup a)
both a b = liftM2 (++) a b
oneOf = mplus
search = both (find 1) ((find 2) `oneOf` (find 3))
           `oneOf` both (find 4) (find 5)
And running:
(runReaderT search) [(1,"a"),(3,"c"),(4,"d"),(5,"g")] --> Just "ac"
(runReaderT search) [(6,"a")] --> Nothing
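To see what ReaderT Dictionary Maybe amounts to, here is the same search written against plain functions, using only base. This is a sketch for comparison, not the code from the real project: the type synonyms are assumptions, and find is renamed findKey.

```haskell
import Control.Applicative (liftA2, (<|>))

type Dictionary = [(Int, String)]

-- A search reads the dictionary and may fail.
type Search = Dictionary -> Maybe String

findKey :: Int -> Search
findKey = lookup

-- Run both searches on the same dictionary; fail if either fails.
both :: Search -> Search -> Search
both a b dict = liftA2 (++) (a dict) (b dict)

-- Try the first search, fall back to the second.
oneOf :: Search -> Search -> Search
oneOf a b dict = a dict <|> b dict

search :: Search
search = both (findKey 1) (findKey 2 `oneOf` findKey 3)
           `oneOf` both (findKey 4) (findKey 5)
```

The transformer version hides the explicit dict argument and supplies the Monad and MonadPlus instances for free, which is what makes liftM2 and mplus usable directly.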
The big advantage we gain from this being a monad is that we can bind searches together and lift other functions into this abstraction. Let's say for instance, I have two searches search_a and search_b, and I want to do them and then return them merged:
do a <- search_a
   b <- search_b
   return (merge a b)
or alternatively liftM2 merge search_a search_b.