Simple graphs (without parallel edges) in fgl? - haskell

Are there any provisions in http://hackage.haskell.org/package/fgl for handling simple graphs (i.e., no parallel edges)? At the moment, we get
import Data.Graph.Inductive
mkGraph [(1,()),(2,())] [(1,2,()),(1,2,())] :: Gr () ()
==> mkGraph [(1,()),(2,())] [(1,2,()),(1,2,())]
containing two edges from 1 to 2. Indeed for this graph,
suc g 1 ==> [2,2]
Just to show that I've done some research: the official documentation http://web.engr.oregonstate.edu/~erwig/fgl/haskell/old/fgl0103.pdf (paper linked from original web page) states in Section 2.1 "we define one type for directed, node-labelled, edge-labelled multi-graphs; other graph types can be obtained as special cases", but I only see the trivial specialisation (no label = label ()) and nothing on simple graphs.
For simple graphs, the type of Adj b, which occurs in Context a b, should perhaps be Set (b,Node) instead of [(b,Node)] as it is now.
(Or even Map b (Set Node), or Map Node (Set b) ...)
Somewhat related: is there a function that lets me check for the presence of an edge, or that gives me the set of edges from one node p to another node q? At the moment it seems I have to find q by walking through suc g p. And that's a list, which worries me even more. What if my graph has nodes with large degree?
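To make the Map/Set idea above concrete, here is a minimal sketch that does not use fgl at all, only the containers package. The names SimpleGraph, fromEdges, successors and hasEdge are made up for this illustration and are not part of fgl's API. Parallel edges collapse because successors are kept in a Set, and the edge test costs two logarithmic lookups instead of a list walk.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type Node = Int

-- Hypothetical simple-graph type (not an fgl type):
-- each node maps to the set of its successors.
type SimpleGraph = Map.Map Node (Set.Set Node)

-- Build a graph from an edge list; duplicate edges collapse in the Set.
-- Nodes that only appear as targets get no key of their own here.
fromEdges :: [(Node, Node)] -> SimpleGraph
fromEdges es = Map.fromListWith Set.union [ (p, Set.singleton q) | (p, q) <- es ]

-- Successors of a node, each listed once, in ascending order.
successors :: SimpleGraph -> Node -> [Node]
successors g p = maybe [] Set.toAscList (Map.lookup p g)

-- Edge test without walking a successor list.
hasEdge :: SimpleGraph -> Node -> Node -> Bool
hasEdge g p q = maybe False (Set.member q) (Map.lookup p g)
```

With this, successors (fromEdges [(1,2),(1,2)]) 1 gives [2] rather than [2,2].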

Related

How to use category theory diagrams with polyary functions?

So, there's a lot of buzz about categories all around the Haskell ecosystem. But I feel one piece is missing from the common sense I have so far absorbed by osmosis. (I did read the first few pages of Mac Lane's famous introduction as well, but I don't believe I have enough mathematical maturity to carry the wisdom from this text to actual programming I have at hand.) I will now follow with a real world example involving a binary function that I have trouble depicting in categorical terms.
So, I have this function chain that allows me to S -> A, where A is a type synonym for a function, akin to a -> b. Now, I want to depict a process that does S -> a -> b, but I end up with an arrow pointing to another arrow rather than an object. How do I deal with such a predicament?
I did overhear someone talking about a thing called n-category but I don't know if I should even try to understand what it is and how it's useful.
Though I believe my abstraction is accurate, the actual functions are parsePath >>> either error id >>> toAxis :: String -> Text.XML.Cursor.Axis from selectors and Axis = Text.XML.Cursor.Cursor -> [Text.XML.Cursor.Cursor] from xml-conduit.
There are two approaches to modelling binary functions as morphisms in category theory (n-ary functions are dealt with similarly -- no new machinery is needed). One is to consider the uncurried version:
(A * B) -> C
where we take the product of the types A and B as a starting object. For that we need the category to contain such products. (In Haskell, products are written (A, B). Well, technically in Haskell this is not exactly the product as in categories, but let's ignore that.)
Another is to consider the result type (B -> C) as an object in the category. Usually, this is called an exponential object, written as C^B. Assuming our category has such objects, we can write
A -> C^B
These two representations of binary functions are isomorphic: using curry and uncurry we can transform each one into the other.
Indeed, when there is such a (natural) isomorphism, we get a so-called cartesian closed category, which is the simplest form of category that can describe a simply typed lambda calculus -- the core of every typed functional language.
This isomorphism is often cited as an adjunction between two functors
(- * B) -| (- ^ B)
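As a small illustration of the two representations, here is a sketch using the Prelude's curry and uncurry. The function names addPair, addCurried and addPair' are invented for this example.

```haskell
-- Uncurried form: a morphism out of the product, (A * B) -> C.
addPair :: (Int, Int) -> Int
addPair (x, y) = x + y

-- Curried form: a morphism into the exponential, A -> C^B.
addCurried :: Int -> (Int -> Int)
addCurried = curry addPair

-- Going back again recovers a function that agrees with addPair.
addPair' :: (Int, Int) -> Int
addPair' = uncurry addCurried
```

Both addCurried 2 3 and addPair' (2, 3) evaluate to 5, which is the isomorphism at work on one concrete function.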
I can use tuple projections to depict this situation, as follows:
-- Or, in actual Haskell terms:
This diagram features backwards fst & snd arrows in place of a binary function that constructs the tuple from its constituents, and that I can in no way depict directly. The caveat is that, while in this diagram Cursor has only one incoming arrow, I should remember that in actual code some real arrows X -> Axis & Y -> Cursor should go to both of the projections of the tuple, not just the symbolic projecting functions. The flow will then be uniformly left to right.
Pragmatically speaking, I traded an arrow with two sources (that constructs a tuple and isn't a morphism) for two reversed arrows (the tuple's projections that are legal morphisms in all regards).

What about arrows?

Reading through various tutorials about Haskell's various category-themed classes, we find things like Monoid, Functor, Monad and so on - all of which have dozens of instances. But for some reason, when we reach Arrow, there are only two instances: functions and monads. In both cases, using the Arrow instance is less powerful and more difficult than just using the underlying thing directly.
Does anybody have any interesting examples of arrows? I'm sure there must be some, but I've never come across any writing about them...
I like to think of Arrows as composable directed acyclic graphs. For example, an arrow of type:
SomeArrow (a, b, c) (d, e, f)
... you can think of as a graph that has three incoming edges of type a, b, and c and three outgoing edges of type d, e, and f.
Using this interpretation, the category composition operations for Arrows are like horizontal concatenation for graphs, connecting their edges together:
(.) :: SomeArrow b c -> SomeArrow a b -> SomeArrow a c
... where a, b, and c may be themselves tuples. Similarly, id is just the identity graph that forwards all incoming edges to outgoing edges:
id :: SomeArrow a a
The other key operation is (***) which is like vertical concatenation of graphs:
(***) :: SomeArrow a b -> SomeArrow c d -> SomeArrow (a, c) (b, d)
You can think of that as putting two graphs side-by-side, combining their input edges and output edges.
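To see both kinds of concatenation on the simplest Arrow instance, plain functions, here is a short sketch using Control.Arrow from base. The names incThenDouble and sideBySide are made up for the example.

```haskell
import Control.Arrow ((>>>), (***))

-- Horizontal concatenation: the output edge of the first graph
-- becomes the input edge of the second (f >>> g is g . f).
incThenDouble :: Int -> Int
incThenDouble = (+ 1) >>> (* 2)

-- Vertical concatenation: two graphs side by side,
-- with their input and output edges paired up.
sideBySide :: (Int, String) -> (Int, Int)
sideBySide = (+ 1) *** length
```

Here incThenDouble 3 is 8 and sideBySide (3, "abc") is (4, 3).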
So Arrows commonly arise when working with typed directed acyclic graphs. However, the reason you don't see them that often is that most people mentally associate graphs with untyped, high-performance data structures.
HXT, a library used for parsing XML, is a very good example of arrows in use (have a look at how often the word Arrow occurs in the module names of this package!). Have a look at this great tutorial: http://adit.io/posts/2012-04-14-working_with_HTML_in_haskell.html
But it is also good to have the arrow concept for functions. For example the following code
((+1) &&& (*2)) 3 -- result is (4,6)
just works, because (->) is an instance of the Arrow class (the operator &&& is defined in Control.Arrow).
Thanks to the arrow syntax you also have a great tool for writing complex computations in Haskell (it works just as well for functions, monads and XML filters in HXT).

In functional reactive programming, how do you share state between two parts of the application?

I have an application architecture where user inputs flow to an automaton, which runs in the context of the event stream and directs the user to different parts of the application. Each part of the application may run some action based on user inputs. However, two parts of the application share some state and are, conceptually, reading and writing the same state. The caveat is that the two "threads" are not running at the same time: one of them is "paused" while the other one "yields" outputs. What is the canonical way to describe this state-sharing computation without resorting to a global variable? Does it make sense for the two "threads" to keep local states that sync by some form of message passing, even though they are not concurrent by any means?
There is no code sample since the question is more conceptual, but answers with samples in Haskell (using any FRP framework) or some other language are welcome.
I've been working on a solution to this problem. The high-level summary is that you:
A) Distill all your concurrent code into a pure and single-threaded specification
B) The single-threaded specification uses StateT to share common state
The overall architecture is inspired by model-view-controller. You have:
Controllers, which are effectful inputs
Views, which are effectful outputs
A model, which is a pure stream transformation
The model can only interact with one controller and one view. However, both controllers and views are monoids, so you can combine multiple controllers into a single controller and multiple views into a single view. Diagrammatically, it looks like this:
controller1 -                                           -> view1
              \                                        /
controller2 ---> controllerTotal -> model -> viewTotal---> view2
              /                                        \
controller3 -                                           -> view3
        \______ ______/            \__ __/        \___ ___/
               v                      v               v
           Effectful                Pure          Effectful
The model is a pure, single-threaded stream transformer that implements Arrow and ArrowChoice. The reason why is that:
Arrow is the single-threaded equivalent to parallelism
ArrowChoice is the single-threaded equivalent to concurrency
In this case, I use push-based pipes, which appear to have a correct Arrow and ArrowChoice instance, although I'm still working on verifying the laws, so this solution is still experimental until I complete their proofs. For those who are curious, the relevant type and instances are:
newtype Edge m r a b = Edge { unEdge :: a -> Pipe a b m r }
instance (Monad m) => Category (Edge m r) where
    id = Edge push
    (Edge p2) . (Edge p1) = Edge (p1 >~> p2)
instance (Monad m) => Arrow (Edge m r) where
    arr f = Edge (push />/ respond . f)
    first (Edge p) = Edge $ \(b, d) ->
        evalStateP d $ (up \>\ unsafeHoist lift . p />/ dn) b
      where
        up () = do
            (b, d) <- request ()
            lift $ put d
            return b
        dn c = do
            d <- lift get
            respond (c, d)
instance (Monad m) => ArrowChoice (Edge m r) where
    left (Edge k) = Edge (bef >=> (up \>\ (k />/ dn)))
      where
        bef x = case x of
            Left b -> return b
            Right d -> do
                _ <- respond (Right d)
                x2 <- request ()
                bef x2
        up () = do
            x <- request ()
            bef x
        dn c = respond (Left c)
The model also needs to be a monad transformer. The reason why is that we want to embed StateT in the base monad to keep track of shared state. In this case, pipes fits the bill.
The last piece of the puzzle is a sophisticated real-world example of taking a complex concurrent system and distilling it into a pure single-threaded equivalent. For this I use my upcoming rcpl library (short for "read-concurrent-print-loop"). The purpose of the rcpl library is to provide a concurrent interface to the console that lets you read input from the user while concurrently printing to the console, but without the printed output clobbering the user's input. The Github repository for it is here:
Link to Github Repository
My original implementation of this library had pervasive concurrency and message passing, but was plagued by several concurrency bugs which I could not solve. Then when I came up with mvc (the code name for my FRP-like framework, short for "model-view-controller"), I figured that rcpl would be an excellent test case to see if mvc was ready for prime-time.
I took the entire logic of the rcpl and turned it into a single, pure pipe. That's what you will find in this module, and the total logic is contained entirely within the rcplCore pipe.
This is neat, because now that the implementation is pure, I can quickcheck it and verify certain properties! For example, one property I might want to quickcheck is that there is exactly one terminal command per user key press of the x key, which I would specify like this:
>>> quickCheck $ \n -> length ((`evalState` initialStatus) $ P.toListM $ each (replicate n (Key 'x')) >-> runEdge (rcplCore t)) == n || n < 0
n is the number of times that I press the x key. Running that test produces the following output:
*** Failed! Falsifiable (after 17 tests and 6 shrinks):
78
QuickCheck discovered that my property was false! Moreover, because the code is referentially transparent, QuickCheck can narrow down the counterexample to the minimal reproducing violation. After 78 key presses the terminal driver emits a newline because the console is 80 characters wide, and two characters are taken up by the prompt ("> " in this case). That's the kind of property I would have great difficulty verifying if concurrency and IO infected my entire system.
Having a pure setup is great for another reason: everything is completely reproducible! If I store a log of all incoming events, then any time there is a bug I can replay the events and have a perfectly reproducible test case that I can add to my test suite.
However, really the most important benefit of purity is the ability to more easily reason about code, both informally and formally. When you remove Haskell's scheduler from the equation you can prove things statically about your code that you couldn't prove when you have to depend on a concurrent runtime with an informally specified semantics. This actually proved to be really useful even for informal reasoning, because when I transformed my code to use mvc it still had several bugs, but these were far easier to debug and remove than the stubborn concurrency bugs from my first iteration.
The rcpl example uses StateT to share global state between different components, so the long-winded answer to your question is: You can use StateT, but only if you transform your system to a single-threaded version. Fortunately that's possible!
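To show just the state-sharing idea in isolation, here is a much smaller sketch than rcpl. It uses only State from the transformers package, not pipes or mvc, and every name in it (Event, editor, viewer, step, runModel) is invented for the illustration. Two "parts" of an application take turns, one writing and one reading the same state, inside a single pure, single-threaded model.

```haskell
import Control.Monad.Trans.State.Strict (State, evalState, get, modify')

-- Hypothetical event type: each event is routed to exactly one part.
data Event = ToEditor Int | ToViewer

-- Part one writes to the shared state.
editor :: Int -> State Int String
editor n = do
  modify' (+ n)
  return "edited"

-- Part two reads the same shared state.
viewer :: State Int String
viewer = do
  total <- get
  return ("total = " ++ show total)

-- The single-threaded model: dispatch each event to its part, in order.
step :: Event -> State Int String
step (ToEditor n) = editor n
step ToViewer     = viewer

-- Run the whole event log purely, starting from a shared state of 0.
runModel :: [Event] -> [String]
runModel evs = evalState (mapM step evs) 0
```

Running runModel [ToEditor 2, ToViewer, ToEditor 3, ToViewer] yields ["edited","total = 2","edited","total = 5"], and because the model is pure, a stored event log replays to the same result every time.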

Haskell - read in graph specification from a file

I'm having a very hard time trying to figure out how to read in (and also how to represent) a graph in Haskell.
The input from the file will look something like
NODES 3
EDGE 1 2
EDGE 1 3
EDGE 2 3
I have figured out how to grab the individual lines of input from the file using:
loadFile :: String -> IO [[String]]
loadFile filename = do
    contents <- readFile filename
    return $ map words $ lines contents
That gives output like:
loadFile "input.txt"
[["NODES","3"],["EDGE","1","2"],["EDGE","1","3"],["EDGE","2","3"]]
What I really need to figure out is how to represent this graph data as a graph, though.
I was thinking of setting it up as a list of edges:
type Edge = (Int,Int)
type Graph = [Edge]
But then I'm not sure how I would even begin to implement the functions I need such as addNode, addEdge, getNodes, getEdges.
Any help or pointing me in the right direction would be awesome! Note: I can't use any already developed graph modules for this.
So, for the tl;dr version:
Am I reading in the data the best way?
How should I represent this data in Haskell?
If I use the data structures I outlined above, how would I go about implementing one of those functions?
There are a lot of interesting concerns going on here. Let me attack them all.
You're reading in the data just fine for a line-oriented language. Later on you'll see Data.ByteString and Data.Text replace String for efficiency. You'll also see Parsec for parsing. Those can wait, though. Revisit them in time.
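As a possible next step after loadFile, here is a sketch of turning the word lists into typed values using only base. The Line type and the names parseLine and parseSpec are assumptions made for this example; they are not from the question.

```haskell
import Text.Read (readMaybe)

type Edge = (Int, Int)

-- Hypothetical intermediate type for one parsed line of the file.
data Line = Nodes Int | EdgeLine Edge
  deriving (Eq, Show)

-- Parse the words of a single line; Nothing signals a malformed line.
parseLine :: [String] -> Maybe Line
parseLine ["NODES", n]   = Nodes <$> readMaybe n
parseLine ["EDGE", a, b] = EdgeLine <$> ((,) <$> readMaybe a <*> readMaybe b)
parseLine _              = Nothing

-- Parse every line; a single bad line makes the whole result Nothing.
parseSpec :: [[String]] -> Maybe [Line]
parseSpec = traverse parseLine
```

Feeding it the output shown in the question, parseSpec [["NODES","3"],["EDGE","1","2"]] gives Just [Nodes 3,EdgeLine (1,2)].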
Your graph representation is fine. Adjacency lists are a common and useful representation.
Now, the real trick you have is here. Let's take a look at addNode and addEdge. Each is a somewhat challenging function to produce in a pure functional language because they want to modify a graph... but we don't have state.
The most important way to modify without state is to return a modified copy. The kind of function you're looking for is thus
addNode :: Node -> Graph -> Graph
where the returned Graph is identical to the input Graph except with one more node. You should note immediately that there's something wrong here: adjacency lists assume that there are no orphan nodes. We can't add just a single node to the graph.
There are two solutions. One, we could "link" the node in to the graph (which is really addEdge in disguise) or two we could extend the graph representation to include orphan nodes. Let's do (2).
data Graph = Graph [Edge] [Int] -- orphans
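With that representation, addNode could look roughly like the sketch below. It repeats the type definitions so it stands alone, uses Int for nodes, and assumes that adding a node which already exists should leave the graph unchanged; emptyGraph is a helper introduced here for convenience.

```haskell
type Edge = (Int, Int)

data Graph = Graph [Edge] [Int]  -- edges, then orphan nodes
  deriving (Eq, Show)

-- Hypothetical helper: a graph with no edges and no nodes.
emptyGraph :: Graph
emptyGraph = Graph [] []

-- Add a node as an orphan, unless it is already an orphan
-- or already appears as an endpoint of some edge.
addNode :: Int -> Graph -> Graph
addNode n g@(Graph adj orph)
  | n `elem` orph || any touches adj = g
  | otherwise                        = Graph adj (n : orph)
  where
    touches (a, b) = a == n || b == n
```

So addNode 1 emptyGraph is Graph [] [1], and adding node 1 to Graph [(1,2)] [] changes nothing.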
Now let's implement adding an edge. Assuming duplicate edges are allowed, adding an edge to the adjacency list is easy: just put it at the front of the list
addEdge0 :: Edge -> Graph -> Graph
addEdge0 e (Graph adj orph) = Graph (e:adj) orph
but that's not good enough---we want our orphan list to only include truly orphaned nodes. We'll filter it.
addEdge :: Edge -> Graph -> Graph
addEdge (n1,n2) (Graph adj orph) =
    Graph ((n1,n2):adj) (filter (/=n1) . filter (/=n2) $ orph)
getEdges is trivial since we're already storing the list of edges
getEdges :: Graph -> [Edge]
getEdges (Graph edges _) = edges
getNodes just needs to append all of our nodes from the adjacency list to the orphan list. We could use Data.List.nub to get only the unique nodes.
-- requires: import Data.List (nub)
getNodes :: Graph -> [Int]
getNodes (Graph adj orph) = nub (orph ++ adjNodes adj)
  where
    adjNodes [] = []
    adjNodes ((n1,n2):rest) = n1 : n2 : adjNodes rest
Hopefully these give you some indication of how to think in a functional language. You'll have to dig into them a little bit to see how they work, but I've introduced a large number of interesting concepts here.
Next steps here might include trying to use the State monad to recapture imperative state modification and to chain these Graph-modifying functions together.
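As a rough sketch of that next step, the block below wraps addEdge in the State monad from the transformers package so that edge insertions read like imperative updates. The names addEdgeM, buildGraph and example are invented for the illustration, and the types are repeated so the block stands alone.

```haskell
import Control.Monad.Trans.State.Strict (State, execState, modify)

type Edge = (Int, Int)

data Graph = Graph [Edge] [Int]  -- edges, then orphan nodes
  deriving (Eq, Show)

-- Same behaviour as addEdge above: prepend the edge, drop its ends from the orphans.
addEdge :: Edge -> Graph -> Graph
addEdge (n1, n2) (Graph adj orph) =
  Graph ((n1, n2) : adj) (filter (\n -> n /= n1 && n /= n2) orph)

-- Lift the pure function into a stateful action on the current graph.
addEdgeM :: Edge -> State Graph ()
addEdgeM = modify . addEdge

-- Chain the modifications; each step sees the result of the previous one.
buildGraph :: State Graph ()
buildGraph = do
  addEdgeM (1, 2)
  addEdgeM (1, 3)
  addEdgeM (2, 3)

-- Start from three orphan nodes and run the chain.
example :: Graph
example = execState buildGraph (Graph [] [1, 2, 3])
```

Here example evaluates to Graph [(2,3),(1,3),(1,2)] []: the newest edge is at the front and no orphans remain.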

What are the names used in computer science for some of the following tree data types?

Sometimes I get myself using different types of trees in Haskell and I don't know what they are called or where to get more information on algorithms using them or class instances for them, or even some pre-existing code or library on hackage.
Examples:
Binary trees where the labels are on the leaves or the branches:
data BinTree1 a = Leaf |
    Branch {label :: a, leftChild :: BinTree1 a, rightChild :: BinTree1 a}
data BinTree2 a = Leaf {label :: a} |
    Branch {leftChild :: BinTree2 a, rightChild :: BinTree2 a}
Similarly trees with the labels for each children node or a general label for all their children:
data Tree1 a = Branch {label :: a, children :: [Tree1 a]}
data Tree2 a = Branch {labelledChildren :: [(a, Tree2 a)]}
Sometimes I start using Tree2 and somehow on the course of developing it gets refactored into Tree1, which seems simpler to deal with, but I never gave a lot of thought about it. Is there some kind of duality here?
Also, if you can post some other different kinds of trees that you think are useful, please do.
In summary: everything you can tell me about those trees will be useful! :)
Thanks.
EDIT:
Clarification: this is not homework. It's just that I usually end up using those data types and creating instances (Functor, Monad, etc...) and maybe if I knew their names I would find libraries with stuff implemented and more theoretical information on them.
Usually when a library on Hackage has Tree in the name, it implements BinTree2 or some version of a non-binary tree with labels only on the leaves, so it seems to me that maybe Tree2 and BinTree2 have some other name or identifier.
Also I feel that there may be some kind of duality or isomorphism, or a way of turning code that uses Tree1 into code that uses Tree2 with some transformation. Is there? Maybe it's just an impression.
The names I've heard:
BinTree1 is a binary tree
BinTree2: I don't know a name for it, but you can use such a tree to represent a prefix-free code, like Huffman coding for example
Tree1 is a Rose tree
Tree2 is isomorphic to [Tree1] (a forest of Tree1); another way to view it is as a Tree1 without a label for the root.
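The Tree2 / [Tree1] correspondence can be written down directly. In this sketch the constructors are renamed to Branch1 and Branch2 (an assumption made only so both types can live in one module), and toForest and fromForest are names chosen for the example.

```haskell
-- Tree1 from the question: a rose tree with a label on every node.
data Tree1 a = Branch1 { label :: a, children :: [Tree1 a] }
  deriving (Eq, Show)

-- Tree2 from the question: labels sit on the links to the children.
newtype Tree2 a = Branch2 { labelledChildren :: [(a, Tree2 a)] }
  deriving (Eq, Show)

-- Each labelled child of a Tree2 becomes the root of one Tree1.
toForest :: Tree2 a -> [Tree1 a]
toForest (Branch2 cs) = [ Branch1 a (toForest t) | (a, t) <- cs ]

-- The inverse: each root label moves onto the link leading to its subtree.
fromForest :: [Tree1 a] -> Tree2 a
fromForest ts = Branch2 [ (a, fromForest cs) | Branch1 a cs <- ts ]
```

The two functions are mutually inverse, so code written against one type can be transported to the other by converting at the boundaries.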
A binary tree that only has labels in the leaves (BinTree2) is usually used for hash maps, because the tree structure itself doesn't offer any information other than the binary position of the leaves.
So, if you have 4 values with the following hash codes:
...000001 A
...000010 B
...000011 C
...000010 D
... you might store them in a binary tree (an implicit patricia trie) like so:
          +            <- Bit #1 (least significant bit) of hash code
         / \              0 = left, 1 = right
        /   \
  [B, D]     +         <- Bit #2
            / \
           /   \
        [A]     [C]
We see that since the hash codes of B and D "start" with 0, they are stored in the left root child. They have exactly the same hash codes, so no more forks are necessary. The hash codes of A and C both "start" with 1, so another fork is necessary. A has bit 2 as 0, so it goes to the left, and C with 1 goes to the right.
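A lookup in such a tree can be sketched as follows. BitTrie, lookupBucket and exampleTrie are hypothetical names for this illustration; note that Data.Bits numbers bits from 0, so "Bit #1" in the picture is testBit h 0.

```haskell
import Data.Bits (testBit)

-- A leaf-labelled binary tree in the style of BinTree2,
-- where each leaf holds a bucket of values.
data BitTrie a = Bucket [a] | Fork (BitTrie a) (BitTrie a)

-- Walk the hash from its least significant bit: 0 goes left, 1 goes right.
lookupBucket :: Int -> BitTrie a -> [a]
lookupBucket h = go 0
  where
    go _ (Bucket xs) = xs
    go i (Fork l r)
      | testBit h i = go (i + 1) r
      | otherwise   = go (i + 1) l

-- The tree drawn above, with the values A, B, C, D as characters.
exampleTrie :: BitTrie Char
exampleTrie = Fork (Bucket "BD") (Fork (Bucket "A") (Bucket "C"))
```

With the hash codes from the example, lookupBucket 1 exampleTrie is "A", lookupBucket 2 exampleTrie is "BD", and lookupBucket 3 exampleTrie is "C".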
This hash table implementation is kind of bad, because hashes might have to be recomputed when certain elements are inserted, but no matter.
BinTree1 is just an ordinary binary tree, and is used for fast order-based sets. Nothing more to say about it, really.
The only difference between Tree1 and Tree2 is that Tree2 can't have root node labels. This means that if used as a prefix tree, it cannot contain the empty string. It has very limited use, and I haven't seen anything like it in practice. Tree1, however, obviously has a use as a non-binary prefix tree.
