Related
I'm trying to write a function in Haskell to generate multidimensional lists.
(Technically I'm using Curry, but my understanding is that it's mostly a superset of Haskell, and the thing I'm trying to do is common to Haskell as well.)
After a fair bit of head scratching, I realized my initial desired function (m_array generating_function list_of_dimensions, giving a list nested to a depth equal to length list_of_dimensions) was probably at odds with they type system itself, since (AFAICT) the nesting-depth of lists is part of its type, and my function wanted to return values whose nesting-depths differed based on the value of a parameter, meaning it wanted to return values whose types varied based on the value of a parameter, which (AFAICT) isn't supported in Haskell. (If I'm wrong, and this CAN be done, please tell me.) At this point I moved on to the next paragraph, but if there's a workaround I've missed that takes very similar parameters and still outputs a nested list, let me know. Like, maybe if you can encode the indices as some data type that implicitly includes the nesting level in its type, and is instantiated with e.g. dimensions 5 2 6 ..., maybe that'd work? Not sure.
In any case, I thought that perhaps I could encode the nesting-depth by nesting the function itself, while still keeping the parameters manageable. This did work, and I ended up with the following:
ma f (l:ls) idx = [f ls (idx++[i]) | i <- [0..(l-1)]]
However, so far it's still a little clunky to use: you need to nest the calls, like
ma (ma (ma (\_ i -> 0))) [2,2,2] []
(which, btw, gives [[[0,0],[0,0]],[[0,0],[0,0]]]. If you use (\_ i -> i), it fills the array with the indices of the corresponding element, which is a result I'd like to keep available, but could be a confusing example.)
I'd prefer to minimize the boilerplate necessary. If I can't just call
ma (\_ i -> i) [2,2,2]
I'd LIKE to be able to call, at worst,
ma ma ma (\_ i -> i) [2,2,2] []
But if I try that, I get errors. Presumably the list of parameters is being divvied up in a way that doesn't make sense for the function. I've spent about half an hour googling and experimenting, trying to figure out Haskell's mechanism for parsing strings of functions like that, but I haven't found a clear explanation, and understanding eludes me. So, the formal questions:
How does Haskell parse e.g. f1 f2 f3 x y z? How are the arguments assigned? Is it dependent on the signatures of the functions, or does it e.g. just try to call f1 with 5 arguments?
Is there a way of restructuring ma to permit calling it without parentheses? (Adding at most two helper functions would be permissible, e.g. maStart ma ma maStop (\_ i -> i) [1,2,3,4] [], if necessary.)
The function you want in your head-scratching paragraph is possible directly -- though a bit noisily. With GADTs and DataKinds, values can be parameterized by numbers. You won't be able to use lists directly, because they don't mention their length in their type, but a straightforward variant that does works great. Here's how it looks.
{-# Language DataKinds #-}
{-# Language GADTs #-}
{-# Language ScopedTypeVariables #-}
{-# Language StandaloneDeriving #-}
{-# Language TypeOperators #-}
import GHC.TypeLits
infixr 5 :+
data Vec n a where
O :: Vec 0 a -- O is supposed to look a bit like a mix of 0 and []
(:+) :: a -> Vec n a -> Vec (n+1) a
data FullTree n a where
Leaf :: a -> FullTree 0 a
Branch :: [FullTree n a] -> FullTree (n+1) a
deriving instance Show a => Show (Vec n a)
deriving instance Show a => Show (FullTree n a)
ma :: forall n a. ([Int] -> a) -> Vec n Int -> FullTree n a
ma f = go [] where
go :: [Int] -> Vec n' Int -> FullTree n' a
go is O = Leaf (f is)
go is (l :+ ls) = Branch [go (i:is) ls | i <- [0..l-1]]
Try it out in ghci:
> ma (\_ -> 0) (2 :+ 2 :+ 2 :+ O)
Branch [Branch [Branch [Leaf 0,Leaf 0],Branch [Leaf 0,Leaf 0]],Branch [Branch [Leaf 0,Leaf 0],Branch [Leaf 0,Leaf 0]]]
> ma (\i -> i) (2 :+ 2 :+ 2 :+ O)
Branch [Branch [Branch [Leaf [0,0,0],Leaf [1,0,0]],Branch [Leaf [0,1,0],Leaf [1,1,0]]],Branch [Branch [Leaf [0,0,1],Leaf [1,0,1]],Branch [Leaf [0,1,1],Leaf [1,1,1]]]]
A low-tech solution:
In Haskell, you can model multi-level lists by using the so-called free monad.
The base definition is:
data Free ft a = Pure a | Free (ft (Free ft a))
where ft can be any functor, but here we are interested in ft being [], that is the list functor.
So we define our multidimensional list like this:
import Control.Monad
import Control.Monad.Free
type Mll = Free [] -- Multi-Level List
The Mll type transformer happens to be an instance of the Functor, Foldable, Traversable classes, which can come handy.
To make an array of arbitrary dimension, we start with:
the list of dimensions, for example [5,2,6]
the filler function, which returns a value for a given set of indices
We can start by making a “grid” object, whose item at indices say [x,y,z] is precisely the [x,y,z] list. As we have a functor instance, we can complete the process by just applying fmap filler to our grid object.
This gives the following code:
makeNdArray :: ([Int] -> a) -> [Int] -> Mll a
makeNdArray filler dims =
let
addPrefix x (Pure xs) = Pure (x:xs)
addPrefix x (Free xss) = Free $ map (fmap (x:)) xss
makeGrid [] = Pure []
makeGrid (d:ds) = let base = 0
fn k = addPrefix k (makeGrid ds)
in Free $ map fn [base .. (d-1+base)]
grid = makeGrid dims
in
fmap filler grid -- because we are an instance of the Functor class
To visualize the resulting structure, it is handy to be able to remove the constructor names:
displayMll :: Show a => Mll a -> String
displayMll = filter (\ch -> not (elem ch "Pure Free")) . show
The resulting structure can easily be flattened if need be:
toListFromMll :: Mll a -> [a]
toListFromMll xs = foldr (:) [] xs
For numeric base types, we can get a multidimensional sum function “for free”, so to speak:
mllSum :: Num a => (Mll a) -> a
mllSum = sum -- because we are an instance of the Foldable class
-- or manually: foldr (+) 0
Some practice:
We use [5,2,6] as the dimension set. To visualize the structure, we associate a decimal digit to every index. We can pretend to have 1-base indexing by adding 111, because that way all the resulting numbers are 3 digits long, which makes the result easier to check. Extra newlines added manually.
$ ghci
GHCi, version 8.8.4: https://www.haskell.org/ghc/ :? for help
λ>
λ> dims = [5,2,6]
λ> filler = \[x,y,z] -> (100*x + 10*y + z + 111)
λ>
λ> mxs = makeNdArray filler dims
λ>
λ> displayMll mxs
"[[[111,112,113,114,115,116],[121,122,123,124,125,126]],
[[211,212,213,214,215,216],[221,222,223,224,225,226]],
[[311,312,313,314,315,316],[321,322,323,324,325,326]],
[[411,412,413,414,415,416],[421,422,423,424,425,426]],
[[511,512,513,514,515,516],[521,522,523,524,525,526]]]"
λ>
As mentioned above, we can flatten the structure:
λ>
λ> xs = toListFromMll mxs
λ> xs
[111,112,113,114,115,116,121,122,123,124,125,126,211,212,213,214,215,216,221,222,223,224,225,226,311,312,313,314,315,316,321,322,323,324,325,326,411,412,413,414,415,416,421,422,423,424,425,426,511,512,513,514,515,516,521,522,523,524,525,526]
λ>
or take its overall sum:
λ>
λ> sum mxs
19110
λ>
λ> sum xs
19110
λ>
λ>
λ> length mxs
60
λ>
λ> length xs
60
λ>
I have the following Haskell polymorphic data type:
data Tree a = Leaf Int a | Node Int (Tree a) (Tree a)
The tree will be compressed in a bitstring of 0s and 1s. A '0' signifies a Node and it is followed by the encoding of the left subtree, then the encoding of the right subtree. A '1' signifies a Leaf and is followed by 7 bits of information (for example it might be a char). Each node/leaf is supposed to also contain the frequency of the information stored, but this is not important for this problem (so we can put anything there).
For example, starting from this encoded tree
[0,0,0,1,1,1,0,1,0,1,1,1,1,1,1,0,1,0,0,0,0,1,1,1,1,0,0,0,1,1,1,
1,0,0,1,1,1,1,1,1,1,0,0,1,0,0,1,1,1,1,0,1,1,1,1,1,1,0,0,0,0,1]
it is supposed to give back something like this
Node 0 (Node 0 (Node 0 (Leaf 0 'k') (Leaf 0 't'))
(Node 0 (Node 0 (Leaf 0 'q') (Leaf 0 'g')) (Leaf 0 'r')))
(Node 0 (Leaf 0 'w') (Leaf 0 'a'))
(spacing is not important, but it did not fit on one line).
I have little experience working with trees, especially when implementing code. I have a vague idea about how I'd solve this on paper (using something similar to a stack to deal with the depth/levels) but I am still a bit lost.
Any help or ideas are appreciated!
Well, you're trying to parse a tree of bytes from a bit-stream. Parsing's one of those cases where it pays to set up some structure: we're going to write a miniature parser combinator library in the style of How to Replace Failure by a List of Successes, which will allow us to write our code in an idiomatic functional style and delegate a lot of the work to the machine.
Translating the old rhyme into the language of monad transformers, and reading "string" as "bit-string", we have
newtype Parser a = Parser (StateT [Bool] [] a)
deriving (Functor, Applicative, Monad, Alternative)
runParser :: Parser a -> [Bool] -> [(a, [Bool])]
runParser (Parser m) = runStateT m
A parser is a monadic computation which operates statefully on a stream of Booleans, yielding a collection of successfully-parsed as. GHC's GeneralizedNewtypeDeriving superpowers allow me to elide the boilerplate instances of Monad et al.
The goal, then, is to write a Parser (Tree SevenBits) - a parser which returns a tree of septuples of Booleans. (You can turn the 7 bits into a Word8 at your leisure by deriving a Functor instance for Tree and using fmap.) I'm going to use the following definition of Tree because it's simpler - I'm sure you can figure out how to adapt this code to your own ends.
data Tree a = Leaf a | Node (Tree a) (Tree a) deriving Show
type SevenBits = (Bool, Bool, Bool, Bool, Bool, Bool, Bool)
Here's a parser that attempts to consume a single bit from the input stream, failing if it's empty:
one :: Parser Bool
one = Parser $ do
stream <- get
case stream of
[] -> empty
(x:xs) -> put xs *> return x
Here's one which attempts to consume a particular bit from the input stream, failing if it doesn't match:
bit :: Bool -> Parser ()
bit b = do
i <- one
guard (i == b)
Here I'm pulling a sequence of seven Booleans from the input stream using replicateM and packing them into a tuple. We'll be using this to populate Leaf nodes' contents.
sevenBits :: Parser SevenBits
sevenBits = pack7 <$> replicateM 7 one
where pack7 [a,b,c,d,e,f,g] = (a, b, c, d, e, f, g)
Now we can finally write the code which parses the tree structure itself. We'll be choosing between the Node and Leaf alternatives using <|>.
tree :: Parser (Tree SevenBits)
tree = node <|> leaf
where node = bit False *> liftA2 Node tree tree
leaf = bit True *> fmap Leaf sevenBits
If node succeeds in parsing a low bit from the head of the stream, it continues to recursively parse the encoding of the left subtree followed by the right subtree, sequencing the applicative actions with liftA2. The trick is that node fails if it doesn't encounter a low bit at the head of the input stream, which tells <|> to give up on node and try leaf instead.
Note how the structure of tree reflects the structure of the Tree type itself. This is applicative parsing at work. We could alternately have structured this parser monadically, first using one to parse an arbitrary bit and then using a case analysis on the bit to determine whether we should continue to parse a pair of trees or a leaf. In my opinion this version is simpler, more declarative, and less verbose.
Also compare the clarity of this code to the low-level style of #behzad.nouri's foldr-based solution. Rather than building an explicit finite-state machine which switches between parsing nodes and leaves - an imperative-flavoured idea - my design allows you to declaratively describe the grammar to the machine using standard functions like liftA2 and <|> and trust that the abstractions will do the right thing.
Anyway, here I'm parsing a simple tree consisting of a pair of Leafs containing the (binary-encoded) numbers 0 and 1. As you can see, it returns the single successful parse and an empty stream of remaining bits.
ghci> runParser tree $ map (>0) [0, 1, 0,0,0,0,0,0,0, 1, 0,0,0,0,0,0,1]
[(Node (Leaf (False, False, False, False, False, False, False)) (Leaf (False, False, False, False, False, False, True)),[])]
Ok, here's a simple (ad-hoc, but easier to understand) way.
We need to buid a function parse, with the following type:
parse :: [Int] -> Tree Char
The approach you mentioned, with stacks, is the imperative one. Here we just lay on the recursive calls. The stack will be built by the compiler and it will just have each recursive call stored in it (At least you can imagine it that way, if you want, or just ignore all this paragraph).
So, the idea is the following: whenever you find a 0, you need to make two recursive calls to the algorithm. The first recursive call will read one branch (the left one) of the tree. The second one needs to be called with the rest of the list as argument. The rest left by the first recursive call. So, we need a auxiliar function parse' with the following type (now we return a pair, being the second value the rest of list):
parse' :: [Int] -> (Tree Char, [Int])
Next, you can see a piece of code where the 0 case is just as described before.
For the 1 case, we just need to take the next 7 numbers and make them into a char somehow (I leave the definition of toChar for you), then, just return a Leaf and the rest of the list.
parse' (0:xs) = let (l, xs') = parse' xs
(r, xs'') = parse' xs' in (Node 0 l r, xs'') --xs'' should be []
parse' (1:xs) = let w = toChar (take 7 xs) in (Leaf 0 w , drop 7 xs)
Finally, our parse function just calls the auxiliary parse one and returns the first element of the pair.
parse xs = fst $ parse' xs
do a right fold:
import Data.Char (chr)
data Tree a = Leaf a | Node (Tree a) (Tree a)
deriving Show
build :: [Int] -> [Tree Char]
build xs = foldr go (\_ _ -> []) xs 0 0
where
nil = Leaf '?'
go 0 run 0 0 = case run 0 0 of
[] -> Node nil nil:[]
x:[] -> Node x nil:[]
x:y:zs -> Node x y :zs
go 1 run 0 0 = run 0 1
go _ _ _ 0 = error "this should not happen!"
go x run v 7 = (Leaf $ chr (v * 2 + x)): run 0 0
go x run v k = run (v * 2 + x) (k + 1)
then:
\> head $ build [0,0,0,1,1,1,0, ...] -- the list of 01s as in the question
Node (Node (Node (Leaf 'k') (Leaf 't'))
(Node (Node (Leaf 'q') (Leaf 'g')) (Leaf 'r')))
(Node (Leaf 'w') (Leaf 'a'))
I have this data definition for a tree:
data Tree = Leaf Int | Node Tree Int Tree
and I have to make a function, nSatisfy, to check how many items of the tree check some predicate.
Here's what I've done:
nSatisfy :: (Int->Bool) -> Tree -> Int
nSatisfy _ Leaf = 0
nSatisfy y (Node left x right)
|y x = 1 + nSatisfy y (Node left x right)
| otherwise = nSatisfy y (Node left x right)
Is this the right way to solve this problem?
In your nSatisfy function, you should add the number of nodes satisfying the condition in both subtrees with two recursive calls. The last two lines should be like this:
|x y=1+(nSatisfy y left)+(nSatisfy y right)
|otherwise=(nSatisfy y left)+(nSatisfy y right)
This way, it will call itself again on the same node but only on the subtrees.
Also, if a leaf contains an integer, as is implied in the data declaration, you should make it evaluate the condition for a leaf and return 1 if it is true, instead of always returning 0.
In addition to the main answer, I'd like to offer a slightly different way how to generalize your problem and solving it using existing libraries.
The operation you're seeking is common to many data structures - to go through all elements and perform some operation on them. Haskell defines Foldable type-class, which can be implemented by structures like yours.
First let's import some modules we'll need:
import Data.Foldable
import Data.Monoid
In order to use Foldable, we need to generalize the structure a bit, in particular parametrize its content:
data Tree a = Leaf a | Node (Tree a) a (Tree a)
In many cases this is a good idea as it separates the structure from its content and allows it to be easily reused.
Now let's define its Foldable instance. For tree-like structures it's easier to define it using foldMap, which maps each element into a monoid and then combines all values:
instance Foldable Tree where
foldMap f (Leaf x) = f x
foldMap f (Node lt x rt) = foldMap f lt <> f x <> foldMap f rt
This immediately gives us the whole library of functions in the Data.Foldable module, such as searching for an element, different kinds of folds, etc. While a function counting the number of values satisfying some predicate isn't defined there, we can easily define it for any Foldable. The idea is that we'll use the Sum:
nSatisfy :: (Foldable f) => (a -> Bool) -> f a -> Int
nSatisfy p = getSum . foldMap (\x -> Sum $ if p x then 1 else 0)
The idea behind this function is simple: Map each value to 1 if it satisfies the predicate, otherwise to 0. And then folding with the Sum monoid just adds all values up.
Given a tuple of lists, I need to find all unique path from that:
Example I/P: [(1,2),(2,3),(3,4),(9,11),(4,5),(5,6),(6,7),(3,9)]
O/P: [[(1,2),(2,3),(3,4),(4,5),(5,6),(6,7)],[(1,2),(2,3),(3,9),(9,11)]]
Two tuples can connect if the second element of the tuple matches with the first element of the other tuple i.e: One tuple is (_,a) and other tuple is like (a,_).
What is the most efficient implementation for this ? I need to find the best data structure suited for it. Any suggestions ? The number of tuples in which I will execute the algorithm will be like more than 400,000.
{-# LANGUAGE NoMonomorphismRestriction #-}
import Data.List (permutations, nub)
path :: Eq a => [(a, a)] -> [(a, a)]
path [] = []
path [x] = [x]
path (u#(_, a):v#(b, _):xs) = if a == b then u:path (v:xs) else [u]
allPaths = nub . map path . permutations
(you can optimize chain generation but I think this problem has exponential time complexity)
EDITED
In general, you must to define more preciselly what paths you want to return.
Ignoring cycle invariant ([(1,2),(2,3),(3,1)] == [(2,3),(3,1),(1,3)]) you can generate all paths (without using permutations)
{-# LANGUAGE NoMonomorphismRestriction #-}
import Data.List (permutations, nub, sortBy, isInfixOf)
data Tree a = Node a [Tree a] deriving Show
treeFromList :: Eq a => a -> [(a, a)] -> Tree a
treeFromList a [] = Node a []
treeFromList a xs = Node a $ map subTree $ filter ((a==).fst) xs
where subTree v#(_, b) = treeFromList b $ filter (v/=) xs
treesFromList :: Eq a => [(a, a)] -> [Tree a]
treesFromList xs = map (flip treeFromList xs) $ nub $ map fst xs ++ map snd xs
treeToList :: Tree a -> [[a]]
treeToList (Node a []) = [[a]]
treeToList (Node a xs) = [a:ws | ws <- concatMap treeToList xs]
treesToList :: [Tree a] -> [[a]]
treesToList = concatMap treeToList
uniqTrees :: Eq a => [[a]] -> [[a]]
uniqTrees = f . reverse . sortBy ((.length).compare.length)
where f [] = []
f (x:xs) = x: filter (not.flip isInfixOf x) (f xs)
allPaths = uniqTrees . treesToList . treesFromList
then
*Main> allPaths [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 1)]
[[2,4,1,2,3,4],[2,3,4,1,2,4],[1,3,4,1,2,4],[1,3,4,1,2,3],[1,2,4,1,3,4],[1,2,3,4,1,3]]
uniqTrees has poor efficiency and, in general, you can do many optimizations.
If you want to avoid cycle invariant, you can normalize a cycle selecting minimum base10 representation, in previous example ([(1,2),(2,3),(3,1)] == [(2,3),(3,1),(1,3)]) 1231 < 2313 then
normalize [(2,3),(3,1),(1,3)] == [(1,2),(2,3),(3,1)]
you can normalize a path rotating it n-times and taking "head . sortBy toBase10 . rotations".
I think your problem fits on the NP category since:
A Hamiltonian path, also called a Hamilton path, is a path between two
vertices of a graph that visits each vertex exactly once.
In general, the problem of finding a Hamiltonian path is NP-complete
(Garey and Johnson 1983, pp. 199-200), so the only known way to
determine whether a given general graph has a Hamiltonian path is to
undertake an exhaustive search (source)
You problem is even "harder" since you don't know before hand what will be the end node.
In terms of data structure you can try to simulate the hash table structure in Haskell, since this data type is commonly use in graph and you problem can be turn into a graph.
So, I have a function of type:
genTree :: Node -> [Nodes]
Given a node, this function generates the set of children of that node in a tree. The function can be applied again to those children to generate their children, until it eventually generates a node with no children, i.e. a node for which genTree returns [].
What I'm trying to do is, given a starting node, generate the list of all leaf nodes in the tree that has it as the root.
Any advice?
The function from Martijn's answer generates a list of all nodes in the tree. You can use this list and filter out the nodes without children to get the leaves:
nodes root = root : concatMap nodes (genTree root)
leaves root = filter (null . genTree) (nodes root)
You can also combine these two functions into one to directly generate just a list of leaves, if you prefer:
leaves node
| null children = [node]
| otherwise = concatMap leaves children
where children = genTree node
Let's generalize it a bit:
leaves :: (a -> [a]) -> a -> [a]
leaves tree x = case (tree x) of
[] -> [x]
-- the node x has no children and is therefore a leaf
xs -> concatMap (leaves tree) xs
-- otherwise get list of all leaves for each child and concatenate them
Applying static argument transformation (http://hackage.haskell.org/trac/ghc/ticket/888), we get
leaves :: (a -> [a]) -> a -> [a]
leaves tree x = leaves' x where
leaves' x = case (tree x) of
[] -> [x]
xs -> concatMap leaves' xs
Use it as
leaves genTree root
or if you really want it to work only with genTree, inline it into the definition:
leaves1 root = case (genTree x) of
[] -> [x]
xs -> concatMap leaves1 xs
which is morally equivalent to sth's second answer.
(not exactly an answer to the question, but related)
I like to represent trees of a as "ListT [] a". (ListT from the List package in hackage)
Then the answer for this question is just to use the function lastL.
"Monad m => ListT m a" is a monadic list containing "a"s, where trying to get the next list item (which may find out there is no such item) is a monadic action in "m".
A usage example for ListT - a program that reads numbers from the user until the user does not type a number and prints the sum of numbers after each input:
main =
execute . joinM . fmap print .
scanl (+) 0 .
fmap (fst . head) .
takeWhile (not . null) .
fmap reads .
joinM $ (repeat getLine :: ListT IO (IO String))
Where repeat, scanl and takeWhile are from Data.List.Class. They work both for regular lists and monadic lists.
joinM :: List l => l (ItemM l a) -> l a -- (l = ListT IO, ItemM l = IO)
execute :: List l => l a -> ItemM l () -- consume the whole list and run its actions
If you are familiar with Python, python iterators/generators are "ListT IO"s.
When using [] instead of IO as the monad of the monadic list, the result is a tree. Why? Imagine a list where getting the next item is an action in the list monad - the list monad means there are several options, therefore there are several "next items", which makes it a tree.
You can construct monadic lists either with higher-order functions (like the example above), or with cons, or with a python-generator notation (with yield) using the GeneratorT monad transformer from the generator package in hackage.
Disclaimer: ListT and GeneratorT are in no way widely used. I wrote those and I am not aware of any other users except for myself. There are several of users of equivalent ListTs, such as the one from the Haskell wiki, NondetT, and others.
flatten node = node : concatMap flatten (genTree node)