List Comprehension with list of lists - haskell

I have trouble understanding the following piece of code:
treePositions :: Tree a -> [[Int]]
treePositions (Node _ ts) =
  [] : [ (i : is ) | i <- [0..(length ts - 1)],
                     is <- treePositions (index ts i) ]
This function computes every valid path to a position in the given tree, where the outgoing edges of each node are numbered 0..lastOutgoingEdge.
If I understood it right the index function would return the node at Index i in the node list of the tree.
index :: [a] -> Int -> a
index [] i = error "invalid index"
index (x:xs) 0 = x
index (x:xs) i = index xs (i-1)
Now for given trees:
t1 = Node "a" [
       Node "b" [
         Node "c" [],
         Node "d" []
       ]
     ]
t2 = Node "z" []
the function would return:
treePositions t1 == [ [], [0], [0,0], [0,1] ]
treePositions t2 == [ [] ]
What I don't understand is this part:
[] : [ (i : is ) | i <- [0..(length ts - 1)],is <- treePositions (index ts i) ]
My thoughts:
If I have [ x | x <- [0..10], x < 2 ], this would translate to "take every x in [0..10] for which x < 2 holds". So if I take every i in [0..(length ts - 1)], how can the part after the comma return lists if it is a condition?

In your case, the part after the comma is not a condition, but a second generator. Simplified, the notation looks like this:
> [ (a, b) | a <- [1..3], b <- [1..2] ]
> [(1,1),(1,2),(2,1),(2,2),(3,1),(3,2)]
The example comprehension above means the following:
go through list [1..3], and for every element of it,
go through list [1..2], and for every element of it,
produce a tuple
Further, a generator can depend on elements of previous generators, e.g.:
> [ (a, b) | a <- [1..3], b <- [1..a] ]
> [(1,1),(2,1),(2,2),(3,1),(3,2),(3,3)]
So in your case:
[ (i : is ) | i <- [0..(length ts - 1)],is <- treePositions (index ts i) ]
the logic is this:
for every i in 0..(length ts - 1),
go through every is in treePositions (index ts i),
and produce i : is as result
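To make the two-generator reading concrete, here is a small self-contained sketch. The Tree declaration is an assumption (the question does not show it), and the function pairs each child with its position via zip instead of calling index; treePositions' is a hypothetical second name for the same function written with concatMap, which is roughly what the comprehension means.

```haskell
-- Assumed definition; the question only shows the Node constructor in use.
data Tree a = Node a [Tree a]

-- Same comprehension as in the question, but each child is paired with its
-- position by zip, so no separate index function is needed.
treePositions :: Tree a -> [[Int]]
treePositions (Node _ ts) =
  [] : [ i : is | (i, t) <- zip [0 ..] ts, is <- treePositions t ]

-- The comprehension spelled out: for every child, prepend its position to
-- every path of that child, then concatenate the results.
treePositions' :: Tree a -> [[Int]]
treePositions' (Node _ ts) =
  [] : concatMap (\(i, t) -> map (i :) (treePositions' t)) (zip [0 ..] ts)

t1 :: Tree String
t1 = Node "a" [ Node "b" [ Node "c" [], Node "d" [] ] ]

t2 :: Tree String
t2 = Node "z" []
```

Both versions give [[],[0],[0,0],[0,1]] for t1 and [[]] for t2, matching the results listed in the question.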

Related

Getting parse error while doing list comprehensions in haskell

I'm writing a function like this:
testing :: [Int] -> [Int] -> [Int]
testing lst1 lst2 =
let t = [ r | (x,y) <- zip lst1 lst2, let r = if y == 0 && x == 2 then 2 else y ]
let t1 = [ w | (u,v) <- zip t (tail t), let w = if (u == 2) && (v == 0) then 2 else v]
head t : t1
What the first let does is: return a list like this: [2,0,0,0,1,0], from the second let and the following line, I want the output to be like this: [2,2,2,2,1,0]. But, it's not working and giving parse error!!
What am I doing wrong?
There are two kinds of let: the "let/in" kind, which can appear anywhere an expression can, and the "let with no in" kind, which must appear in a comprehension or do block. Since your function definition isn't in either, its lets must use an in, for example:
testing :: [Int] -> [Int] -> [Int]
testing lst1 lst2 =
  let t = [ r | (x,y) <- zip lst1 lst2, let r = if y == 0 && x == 2 then 2 else y ] in
  let t1 = [ w | (u,v) <- zip t (tail t), let w = if (u == 2) && (v == 0) then 2 else v] in
  head t : t1
Alternately, since you can define multiple things in each let, you could consider:
testing :: [Int] -> [Int] -> [Int]
testing lst1 lst2 =
  let t  = [ r | (x,y) <- zip lst1 lst2, let r = if y == 0 && x == 2 then 2 else y ]
      t1 = [ w | (u,v) <- zip t (tail t), let w = if (u == 2) && (v == 0) then 2 else v]
  in head t : t1
The code has other problems, but this should get you to the point where it parses, at least.
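For reference, the two kinds of let described in this answer can be put side by side. This is only an illustrative sketch; circleArea, evenSquares and greet are made-up names, not part of the question.

```haskell
-- "let ... in" is an expression and can appear anywhere an expression can.
circleArea :: Double -> Double
circleArea r = let piApprox = 3.14159 in piApprox * r * r

-- "let" without "in" inside a list comprehension: s is in scope for the
-- qualifiers that follow it and for the result expression.
evenSquares :: [Int] -> [Int]
evenSquares xs = [ s | x <- xs, even x, let s = x * x ]

-- "let" without "in" inside a do block: msg is in scope for the rest of
-- the block.
greet :: String -> IO ()
greet name = do
  let msg = "Hello, " ++ name
  putStrLn msg
```

For example, evenSquares [1,2,3,4] evaluates to [4,16].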
With an expression formed by a let-binding, you generally need
let bindings
in
expressions
(there are exceptions when monads are involved).
So, your code can be rewritten as follows (with simplification of r and w, which were not really necessary):
testing :: [Int] -> [Int] -> [Int]
testing lst1 lst2 =
  let t  = [ if y == 0 && x == 2 then 2 else y | (x,y) <- zip lst1 lst2]
      t1 = [ if (v == 0) && (u == 2) then 2 else v | (u,v) <- zip t (tail t)]
  in
    head t : t1
(Note: I also switched u and v so that t1 and t have similar forms.)
Now given a list like [2,0,0,0,1,0], it appears that your code is trying to replace 0 with 2 if the previous element is 2 (from the pattern of your code), so that eventually, the desired output is [2,2,2,2,1,0].
To achieve this, it is not enough to use two list comprehensions or any fixed number of comprehensions. You need to somehow apply this process recursively (again and again). So instead of only doing 2 steps, we can write out one step, (and apply it repeatedly). Taking your t1 = ... line, the one step function can be:
testing' lst =
  let
    t1 = [ if (u == 2) && (v == 0) then 2 else v | (u,v) <- zip lst (tail lst)]
  in
    head lst : t1
Now this gives:
*Main> testing' [2,0,0,0,1,0]
[2,2,0,0,1,0]
, as expected.
The rest of the job is to apply testing' as many times as necessary. Here applying it (length lst) times should suffice. So, we can first write a helper function to apply another function n times on a parameter, as follows:
apply_n 0 f x = x
apply_n n f x = f $ apply_n (n - 1) f x
This gives you what you expected:
*Main> apply_n (length [2,0,0,0,1,0]) testing' [2,0,0,0,1,0]
[2,2,2,2,1,0]
Of course, you can wrap the above in one function like:
testing'' lst = apply_n (length lst) testing' lst
and in the end:
*Main> testing'' [2,0,0,0,1,0]
[2,2,2,2,1,0]
NOTE: this is not the only way to do the filling, see the fill2 function in my answer to another question for an example of achieving the same thing using a finite state machine.
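As one more alternative (this is not the fill2 function mentioned in the note, just a sketch of the same idea), the propagation can be done in a single pass with scanl1, which hands each step the previous output element. fillAfterTwo is a hypothetical name.

```haskell
-- Single left-to-right pass: a 0 becomes 2 whenever the previous *output*
-- element is 2, so a run of zeros after a 2 is filled in one go.
fillAfterTwo :: [Int] -> [Int]
fillAfterTwo = scanl1 step
  where
    step prev v
      | prev == 2 && v == 0 = 2
      | otherwise           = v
```

fillAfterTwo [2,0,0,0,1,0] gives [2,2,2,2,1,0], and because scanl1 returns [] for an empty list, no call to head is needed.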

Haskell: Generating k-itemsets for apriori

I am trying to generate all k-item sets for use in apriori, I am following this pseudocode:
L1= {frequent items};
for (k= 2; Lk-1 !=∅; k++) do begin
Ck= candidates generated from Lk-1 (that is: cartesian product Lk-1 x Lk-1 and eliminating any
k-1 size itemset that is not frequent);
for each transaction t in database do
increment the count of all candidates in
Ck that are contained in t
Lk = candidates in Ck with min_sup
end
return U_k Lk;
Here is the code I have:
-- d transactions, threshold
kItemSets d thresh = kItemSets' 2 $ frequentItems d thresh
  where
    kItemSets' _ [] = [[]]
    kItemSets' k t = ck ++ (kItemSets' (k+1) ck)
      where
        -- those (k-1) length sets that meet the threshold of being a subset of the transactions in d
        ck = filter (\x->(countSubsets x d) >= thresh) $ combinations k t
-- length n combinations that can be made from xs
combinations 0 _ = [[]]
combinations _ [] = []
combinations n xs@(y:ys)
  | n < 0 = []
  | otherwise = case drop (n-1) xs of
      [ ] -> []
      [_] -> [xs]
      _ -> [y:c | c <- combinations (n-1) ys]
           ++ combinations n ys
-- those items with a frequency of at least o in the dataset
frequentItems xs o = [y| y <- nub cs, x<-[count y cs], x >= o]
  where
    cs = concat xs
isSubset a b = not $ any (`notElem` b) a
-- Count how many times the list y appears as a subset of a list of lists xs
countSubsets y xs = length $ filter (isSubset y ) xs
count :: Eq a => a -> [a] -> Int
count x [] = 0
count x (y:ys) | x == y = 1+(count x ys)
               | otherwise = count x ys
transactions = [["Butter", "Biscuits", "Cream", "Newspaper", "Bread", "Chocolate"],
                ["Cream", "Newspaper", "Tea", "Oil", "Chocolate"] ,
                ["Chocolate", "Cereal", "Bread"],
                ["Chocolate", "Flour", "Biscuits", "Newspaper"],
                ["Chocolate", "Biscuits", "Newspaper"] ]
But when I compile I get the error:
apriori.hs:5:51:
Occurs check: cannot construct the infinite type: a0 = [a0]
Expected type: [a0]
Actual type: [[a0]]
In the second argument of kItemSets', namely `ck'
In the second argument of `(++)', namely `(kItemSets' (k + 1) ck)'
Failed, modules loaded: none.
But when I run from ghci:
*Main> mapM_ print $ filter (\x->(countSubsets x transactions ) >= 2 ) $ combinations 2 $ frequentItems transactions 2
["Biscuits","Newspaper"]
["Biscuits","Chocolate"]
["Cream","Newspaper"]
["Cream","Chocolate"]
["Newspaper","Chocolate"]
["Bread","Chocolate"]
Which is correct, since it's those 2-item sets that meet the occurrence threshold in the set of transactions. But what I need for the 3-item sets is
[["Biscuits", "Chocolate", "Newspaper" ],
["Chocolate", "Cream", "Newspaper"]]
and for this to be appended to the list of 2-item sets. How would I change my current code to achieve this? I know it can be built from the 2-item set, but I'm not sure how to go about it.
Had to use this for line 5:
kItemSets' k t = ck ++ (kItemSets' (k+1) $ nub $ concat ck)
Not the most efficient but it works.
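For anyone who wants to try the fixed recursion in isolation, here is a compact self-contained sketch of the same idea. It is a simplified restatement rather than the exact code from the question: combinations is written in its shortest form, the local helper go is a made-up name, and the base case returns [] instead of [[]] so that no empty itemset is appended at the end.

```haskell
import Data.List (nub)

-- All length-n combinations of a list, preserving element order.
combinations :: Int -> [a] -> [[a]]
combinations 0 _      = [[]]
combinations _ []     = []
combinations n (y:ys) = map (y :) (combinations (n - 1) ys) ++ combinations n ys

isSubset :: Eq a => [a] -> [a] -> Bool
isSubset a b = all (`elem` b) a

-- How many transactions contain every item of the candidate set.
countSubsets :: Eq a => [a] -> [[a]] -> Int
countSubsets y = length . filter (isSubset y)

-- Items that occur at least thresh times across all transactions.
frequentItems :: Eq a => [[a]] -> Int -> [a]
frequentItems d thresh =
  [ y | y <- nub cs, length (filter (== y) cs) >= thresh ]
  where cs = concat d

-- Frequent k-itemsets for k = 2, 3, ...; each round builds its candidates
-- from the items that survived the previous round (nub . concat).
kItemSets :: Eq a => [[a]] -> Int -> [[a]]
kItemSets d thresh = go 2 (frequentItems d thresh)
  where
    go _ []    = []
    go k items = ck ++ go (k + 1) (nub (concat ck))
      where
        ck = filter (\s -> countSubsets s d >= thresh) (combinations k items)

transactions :: [[String]]
transactions =
  [ ["Butter", "Biscuits", "Cream", "Newspaper", "Bread", "Chocolate"]
  , ["Cream", "Newspaper", "Tea", "Oil", "Chocolate"]
  , ["Chocolate", "Cereal", "Bread"]
  , ["Chocolate", "Flour", "Biscuits", "Newspaper"]
  , ["Chocolate", "Biscuits", "Newspaper"]
  ]
```

With a threshold of 2, kItemSets transactions 2 yields the six pairs shown in the ghci session followed by the two three-item sets asked for (the items inside each set may appear in a different order than in the question).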

Haskell Knapsack

I've written an answer to the bounded knapsack problem with one of each item in Scala, and tried transposing it to Haskell with the following result:
knapsack :: [ ( Int, Int ) ] -> [ ( Int, Int ) ] -> Int -> [ ( Int, Int ) ]
knapsack xs [] _ = xs
knapsack xs ys max =
  foldr (maxOf) [ ] [ knapsack ( y : xs ) ( filter (y /=) ys ) max | y <- ys
                    , weightOf( y : xs ) <= max ]
maxOf :: [ ( Int, Int ) ] -> [ ( Int, Int ) ] -> [ ( Int, Int ) ]
maxOf a b = if valueOf a > valueOf b then a else b
valueOf :: [ ( Int, Int ) ] -> Int
valueOf [ ] = 0
valueOf ( x : xs ) = fst x + valueOf xs
weightOf :: [ ( Int, Int ) ] -> Int
weightOf [ ] = 0
weightOf ( x : xs ) = snd x + weightOf xs
I'm not looking for tips on how to clean up the code, just to get it working. To my knowledge it should be doing the following:
For each tuple option (in ys)
if the weight of the current tuple (y) and the running total (xs) combined is less than the capacity
get the optimal knapsack that contains the current tuple and the current total (xs), using the available tuples (in ys) less the current tuple
Finally, get the most valuable of these results and return it
Edit: Sorry, forgot to say what's wrong... So it compiles alright, but it gives the wrong answer. For the following inputs, what I expect and what it produces:
knapsack [] [(1,1),(2,2)] 5
Expect: [(1,1),(2,2)]
Produces: [(1,1),(2,2)]
knapsack [] [(1,1),(2,2),(3,3)] 5
Expect: [(2,2),(3,3)]
Produces: []
knapsack [] [(2,1),(3,2),(4,3),(6,4)] 5
Expect: [(2,1),(6,4)]
Produces: []
So I was wondering what could be the cause of the discrepancy?
The solution, thanks to sepp2k:
ks = knapsack []
knapsack :: [ ( Int, Int ) ] -> [ ( Int, Int ) ] -> Int -> [ ( Int, Int ) ]
knapsack xs [] _ = xs
knapsack xs ys max =
  foldr (maxOf) [ ] ( xs : [ knapsack ( y : xs ) ( ys #- y ) max
                           | y <- ys, weightOf( y : xs ) <= max ] )
(#-) :: [ ( Int, Int ) ] -> ( Int, Int ) -> [ ( Int, Int ) ]
[ ] #- _ = [ ]
( x : xs ) #- y = if x == y then xs else x : ( xs #- y )
maxOf :: [ ( Int, Int ) ] -> [ ( Int, Int ) ] -> [ ( Int, Int ) ]
maxOf a b = if valueOf a > valueOf b then a else b
valueOf :: [ ( Int, Int ) ] -> Int
valueOf [ ] = 0
valueOf ( x : xs ) = fst x + valueOf xs
weightOf :: [ ( Int, Int ) ] -> Int
weightOf [ ] = 0
weightOf ( x : xs ) = snd x + weightOf xs
Which returns the expected results, above.
Your first case fires when ys is empty, so for knapsack [foo,bar] [] 42, you get back [foo, bar], which is what you want. However, it does not fire when ys contains nothing except elements that would put you over the max weight, i.e. knapsack [(x, 20), (y,20)] [(bla, 5)] will return [] and thus discard the previous result. Since this is not what you want, you should adjust your cases so that the second case only fires if there's at least one element in ys that does not put you over the max weight.
One way to do that would be to throw out any elements that put you over the max weight when recursing, so that that scenario simply can't happen.
Another way would be to switch the order of the cases and add a guard to the first case that says that ys must contain at least one element that does not put you over the total weight (and adjust the other case to not require ys to be empty).
PS: Another, unrelated problem with your code is that it ignores duplicates. I.e. if you use it on the list [(2,2), (2,2)] it will act as if the list was just [(2,2)] because filter (y /=) ys will throw out all occurrences of y, not just one.
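The difference described in the PS is easy to see in isolation. In this small sketch, items, withoutAll and withoutOne are made-up names; delete is the standard function from Data.List, which removes only the first matching element.

```haskell
import Data.List (delete)

items :: [(Int, Int)]
items = [(2,2), (2,2), (3,3)]

-- filter drops every element equal to (2,2).
withoutAll :: [(Int, Int)]
withoutAll = filter ((2,2) /=) items

-- delete drops only the first element equal to (2,2).
withoutOne :: [(Int, Int)]
withoutOne = delete (2,2) items
```

Here withoutAll is [(3,3)] while withoutOne is [(2,2),(3,3)], so only the second form keeps the duplicate item available to the recursion.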
Some improvements on your working version:
import Data.List
import Data.Function(on)
ks = knapsack []
knapsack :: [(Int, Int)] -> [(Int, Int)] -> Int -> [(Int, Int)]
knapsack xs [] _ = xs
knapsack xs ys max =
  foldr (maxOf) [] (xs: [knapsack (y:xs) (delete y ys) max
                        | y <- ys, weightOf(y:xs) <= max ] ) where
    weightOf = sum . map snd
maxOf :: [(Int, Int)] -> [(Int, Int)] -> [(Int, Int)]
maxOf a b = maximumBy (compare `on` valueOf) [a,b] where
  valueOf = sum . map fst
Might I suggest using a dynamic programming approach? This way of solving 0-1 knapsack problems is almost painfully slow, at least when the number of variables gets larger than around 20. While it's simple, it's just too inefficient. Here's my shot at it:
import Data.Array
-- creates the dynamic programming table as an array
dynProgTable (var,cap) = a where
  a = array ((0,0),(length var,cap)) [ ((i,j), best i j)
        | i <- [0..length var] , j <- [0..cap] ] where
    best 0 _ = 0
    best _ 0 = 0
    best i j
      | snd (var !! (i-1)) > j = a!decline
      | otherwise = maximum [a!decline,value+a!accept]
      where decline = (i-1,j)
            accept = (i-1,j - snd (var !! (i-1)))
            value = fst (var !! (i-1))
--Backtracks the solution from the dynamic programming table
--Output on the form [Int] where i'th element equals 1 if
--i'th variable was accepted, 0 otherwise.
solve (var,cap) =
  let j = cap
      i = length var
      table = dynProgTable (var,cap)
      step _ 0 _ = []
      step a k 0 = step table (k-1) 0 ++ [0]
      step a k l
        | a!(k,l) == a!(k-1,l) = step a (k-1) l ++ [0]
        | otherwise = step a (k-1) (l - snd (var !! (k-1))) ++ [1]
  in step table i j
In the input (var,cap), var is a list of variables in the form of 2-tuples (c,w), where c is the cost and w is the weight. cap is the maximum weight allowance.
I'm sure above code could be cleaned up to make it more readable and obvious, but that's how it turned out for me :) Where the code snippet by Landei above is short, my computer took ages computing instances with only 20 variables. The dynamic programming approach above gave me a solution for 1000 variables faster.
If you don't know about dynamic programming, you should check out this link: Lecture slides on dynamic programming. It helped me a lot.
For an introduction to arrays, check out Array tutorial.
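As a smaller companion to the table-based answer, here is a hedged sketch that computes only the best achievable value with a lazily filled Data.Array. bestValue and its local names are assumptions made for illustration; items are (value, weight) pairs as in the question, and weights and capacity are assumed to be non-negative. The items are copied into an array first so that looking one up does not walk the list with !!.

```haskell
import Data.Array

-- Best total value of a 0-1 knapsack with capacity cap.
-- Assumes non-negative weights and a non-negative capacity.
bestValue :: [(Int, Int)] -> Int -> Int
bestValue items cap = table ! (n, cap)
  where
    n       = length items
    itemArr = listArray (1, n) items
    -- table ! (i, j) = best value using the first i items within weight j.
    -- Array elements are lazy, so each cell may refer to earlier cells.
    table = array ((0, 0), (n, cap))
      [ ((i, j), cell i j) | i <- [0 .. n], j <- [0 .. cap] ]
    cell 0 _ = 0
    cell i j
      | w > j     = skip
      | otherwise = max skip (v + table ! (i - 1, j - w))
      where
        (v, w) = itemArr ! i
        skip   = table ! (i - 1, j)
```

For the third test case in the question, bestValue [(2,1),(3,2),(4,3),(6,4)] 5 gives 8, the value of the expected selection [(2,1),(6,4)].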

Comparing 3 output lists in haskell

I am doing another Project Euler problem and I need to find when the results of these 3 lists are equal (we are given 40755 as the first time they are equal, and I need to find the next):
hexag n = [ n*(2*n-1) | n <- [40755..]]
penta n = [ n*(3*n-1)/2 | n <- [40755..]]
trian n = [ n*(n+1)/2 | n <- [40755..]]
I tried adding in the other lists as predicates of the first list, but that didn't work:
hexag n = [ n*(2*n-1) | n <- [40755..], penta n == n, trian n == n]
I am stuck as to where to go from here.
I tried graphing the function and even calculus but to no avail, so I must resort to a Haskell solution.
Your functions are weird. They get n and then ignore it?
You also have a confusion between function's inputs and outputs. The 40755th hexagonal number is 3321899295, not 40755.
If you really want a spoiler to the problem (but doesn't that miss the point?):
binarySearch :: Integral a => (a -> Bool) -> a -> a -> a
binarySearch func low high
  | low == high = low
  | func mid = search low mid
  | otherwise = search (mid + 1) high
  where
    search = binarySearch func
    mid = (low+high) `div` 2
infiniteBinarySearch :: Integral a => (a -> Bool) -> a
infiniteBinarySearch func =
  binarySearch func ((lim+1) `div` 2) lim
  where
    lim = head . filter func . lims $ 0
    lims x = x:lims (2*x+1)
inIncreasingSerie :: (Ord a, Integral i) => (i -> a) -> a -> Bool
inIncreasingSerie func val =
  val == func (infiniteBinarySearch ((>= val) . func))
figureNum :: Integer -> Integer -> Integer
figureNum shape index = (index*((shape-2)*index+4-shape)) `div` 2
main :: IO ()
main =
  print . head . filter r $ map (figureNum 6) [144..]
  where
    r x = inIncreasingSerie (figureNum 5) x && inIncreasingSerie (figureNum 3) x
Here's a simple, direct answer to exactly the question you gave:
*Main> take 1 $ filter (\(x,y,z) -> (x == y) && (y == z)) $ zip3 [1,2,3] [4,2,6] [8,2,9]
[(2,2,2)]
Of course, yairchu's answer might be more useful in actually solving the Euler question :)
There's at least a couple ways you can do this.
You could look at the first item, and compare the rest of the items to it:
Prelude> (\x -> all (== (head x)) $ tail x) [ [1,2,3], [1,2,3], [4,5,6] ]
False
Prelude> (\x -> all (== (head x)) $ tail x) [ [1,2,3], [1,2,3], [1,2,3] ]
True
Or you could make an explicitly recursive function similar to the previous:
-- test.hs
f [] = True
f (x:xs) = f' x xs where
  f' orig (y:ys) = if orig == y then f' orig ys else False
  f' _ [] = True
Prelude> :l test.hs
[1 of 1] Compiling Main ( test.hs, interpreted )
Ok, modules loaded: Main.
*Main> f [ [1,2,3], [1,2,3], [1,2,3] ]
True
*Main> f [ [1,2,3], [1,2,3], [4,5,6] ]
False
You could also do a takeWhile and compare the length of the returned list, but that would be neither efficient nor typically Haskell.
Oops, just saw that didn't answer your question at all. Marking this as CW in case anyone stumbles upon your question via Google.
The easiest way is to respecify your problem slightly
Rather than deal with three lists (note the removal of the superfluous n argument):
hexag = [ n*(2*n-1) | n <- [40755..]]
penta = [ n*(3*n-1)/2 | n <- [40755..]]
trian = [ n*(n+1)/2 | n <- [40755..]]
You could, for instance generate one list:
matches :: [Int]
matches = matches' 40755
matches' :: Int -> [Int]
matches' n
  | hex == pen && pen == tri = n : matches' (n + 1)
  | otherwise = matches' (n + 1)
  where
    hex = n*(2*n-1)
    pen = n*(3*n-1) `div` 2
    tri = n*(n+1) `div` 2
Now, you could then try to optimize this for performance by noticing recurrences. For instance when computing the next match at (n + 1):
(n+1)*(n+2)/2 - n*(n+1)/2 = n + 1
so you could just add (n + 1) to the previous tri to obtain the new tri value.
Similar algebraic simplifications can be applied to the other two functions, and you can carry all of them in accumulating parameters to the function matches'.
That said, there are more efficient ways to tackle this problem.
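One possible shape for such an approach, offered only as a sketch: since all three sequences are strictly increasing, the values they share can be found by walking the three lists in lockstep and always discarding everything smaller than the largest current head. common3 and commonFigurates are made-up names, and each list is indexed from n = 1 rather than from 40755, because 40755 is a value in the sequences, not an index.

```haskell
triangulars, pentagonals, hexagonals :: [Integer]
triangulars = [ n * (n + 1) `div` 2     | n <- [1 ..] ]
pentagonals = [ n * (3 * n - 1) `div` 2 | n <- [1 ..] ]
hexagonals  = [ n * (2 * n - 1)         | n <- [1 ..] ]

-- Elements common to three strictly increasing lists.
common3 :: Ord a => [a] -> [a] -> [a] -> [a]
common3 xs@(x:xt) ys@(y:yt) zs@(z:zt)
  | x == y && y == z = x : common3 xt yt zt
  | otherwise        = common3 (skipTo xs) (skipTo ys) (skipTo zs)
  where
    m      = maximum [x, y, z]
    skipTo = dropWhile (< m)   -- nothing below m can be in all three lists
common3 _ _ _ = []

commonFigurates :: [Integer]
commonFigurates = common3 triangulars pentagonals hexagonals
```

take 2 commonFigurates evaluates to [1,40755]; the element after those is the one the problem asks for.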

Detecting cyclic behaviour in Haskell

I am doing yet another Project Euler question in Haskell, where I must find whether the sum of the factorials of each digit in a number is equal to the original number. If not, the process is repeated until the original number is reached. The next part is to find the number of starting numbers below 1 million that have 60 non-repeating terms. I got this far:
prob74 = length [ x | x <- [1..999999], 60 == ((length $ chain74 x)-1)]
factorial n = product [1..n]
factC x = sum $ map factorial (decToList x)
chain74 x | x == 0 = []
          | x == 1 = [1]
          | x /= factC x = x : chain74 (factC x)
But what I don't know how to do is to get it to stop once the value for x has become cyclic. How would I go about stopping chain74 when it gets back to the original number?
When you walk through the list that might contain a cycle your function needs to keep track of the already seen elements to be able to check for repetitions. Every new element is compared against the already seen elements. If the new element has already been seen, the cycle is complete, if it hasn't been seen the next element is inspected.
So this calculates the length of the non-cyclic part of a list:
uniqlength :: (Eq a) => [a] -> Int
uniqlength l = uniqlength_ l []
  where uniqlength_ [] ls = length ls
        uniqlength_ (x:xs) ls
          | x `elem` ls = length ls
          | otherwise = uniqlength_ xs (x:ls)
(Performance might be better when using a set instead of a list, but I haven't tried that.)
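The set-based variant hinted at in the parenthetical could look roughly like this. It is an untimed sketch that assumes the containers package (Data.Set) is available; uniqLengthSet, digits and chainLength are made-up names, and digits stands in for the decToList helper that the question uses but does not show.

```haskell
import qualified Data.Set as Set

-- Length of the prefix of a list before any element repeats,
-- tracking the elements seen so far in a Set instead of a list.
uniqLengthSet :: Ord a => [a] -> Int
uniqLengthSet = go Set.empty
  where
    go seen [] = Set.size seen
    go seen (x:xs)
      | x `Set.member` seen = Set.size seen
      | otherwise           = go (Set.insert x seen) xs

-- Decimal digits of a positive number (assumed stand-in for decToList).
digits :: Int -> [Int]
digits 0 = []
digits n = digits (n `div` 10) ++ [n `mod` 10]

-- Sum of the factorials of the digits.
factC :: Int -> Int
factC = sum . map (\d -> product [1 .. d]) . digits

-- Number of non-repeating terms in the chain starting at x (x > 0).
chainLength :: Int -> Int
chainLength = uniqLengthSet . iterate factC
```

For instance, chainLength 69 is 5 (69, 363600, 1454, 169, 363601) and chainLength 145 is 1, since 145 maps to itself.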
What about passing another argument (y for example) to the chain74 in the list comprehension.
Morning fail so EDIT:
[.. ((length $ chain74 x x False)-1)]
chain74 x y not_first | x == y && not_first = replace_with_stop_value_:-)
                      | x == 0 = []
                      | x == 1 = [1]
                      | x == 2 = [2]
                      | x /= factC x = x : chain74 (factC x) y True
I implemented a cycle-detection algorithm in Haskell on my blog. It should work for you, but there might be a more clever approach for this particular problem:
http://coder.bsimmons.name/blog/2009/04/cycle-detection/
Just change the return type from String to Bool.
EDIT: Here is a modified version of the algorithm I posted about:
cycling :: (Show a, Eq a) => Int -> [a] -> Bool
cycling k [] = False --not cycling
cycling k (a:as) = find 0 a 1 2 as
  where find _ _ c _ [] = False
        find i x c p (x':xs)
          | c > k = False -- no cycles after k elements
          | x == x' = True -- found a cycle
          | c == p = find c x' (c+1) (p*2) xs
          | otherwise = find i x (c+1) p xs
You can remove the 'k' if you know your list will either cycle or terminate soon.
EDIT2: You could change the following function to look something like:
prob74 = length [ x | x <- [1..999999], let chain = chain74 x, not$ cycling 999 chain, 60 == ((length chain)-1)]
Quite a fun problem. I've come up with a corecursive function that returns the list of the "factorial chains" for every number, stopping as soon as they would repeat themselves:
chains = [] : let f x = x : takeWhile (x /=) (chains !! factC x) in (map f [1..])
Giving:
take 4 chains == [[],[1],[2],[3,6,720,5043,151,122,5,120,4,24,26,722,5044,169,363601,1454]]
map head $ filter ((== 60) . length) (take 10000 chains)
is
[1479,1497,1749,1794,1947,1974,4079,4097,4179,4197,4709,4719,4790,4791,4907,4917
,4970,4971,7049,7094,7149,7194,7409,7419,7490,7491,7904,7914,7940,7941,9047,9074
,9147,9174,9407,9417,9470,9471,9704,9714,9740,9741]
It works by calculating the "factC" of its position in the list, then references that position in itself. This would generate an infinite list of infinite lists (using lazy evaluation), but using takeWhile the inner lists only continue until the element occurs again or the list ends (meaning a deeper element in the corecursion has repeated itself).
If you just want to remove cycles from a list you can use:
decycle :: Eq a => [a] -> [a]
decycle = dc []
  where
    dc _ [] = []
    dc xh (x : xs) = if elem x xh then [] else x : dc (x : xh) xs
decycle [1, 2, 3, 4, 5, 3, 2] == [1, 2, 3, 4, 5]
