I'm learning Haskell at the moment and I read a book called "Thinking Functionally With Haskell" and I can't really understand why this expression from the first chapter is true:
sum . map sum = sum . concat
Informally, this is just saying that because if addition is associative, it doesn't matter how you group the numbers you are adding. (a + b) + (c + d) is the same as (a + b + c + d).
Formally, we can use equational reasoning and structure induction to prove this for lists of any size. (See the end for quick definitions of these two processes.)
Assuming the following definitions of map, concat, sum, and (.):
map sum [] = []
map sum (a:as) = sum a : map sum as
concat [] = []
concat (a:as) = a ++ concat as
sum [] = 0
sum (a:as) = a + sum as
(f . g) x = f (g x)
To make the proof below a little simpler, we'll claim without an explicit proof (but see below) that
sum (a ++ b) == sum a + sum b
First we establish that the identity is true for empty lists.
(sum . map sum) [] == sum (map sum []) -- (7)
== sum [] -- (1)
== sum (concat []) -- (3)
== (sum . concat) [] -- (7)
(Note that we don't need definition 5, since an empty list is an empty list.)
Now, add a new definition, for any list as of size k.
(sum . map sum) as == (sum . concat) as
If (9) is true, we can prove the identity for list of size k+1:
(sum . map sum) (a:as) == sum (map sum (a:as)) -- (7)
== sum (sum a : map sum as) -- (2)
== sum a + sum (map sum as) -- (6)
== sum a + (sum . map sum) as -- (7)
== sum a + (sum . concat) as -- (9)
== sum a + sum (concat as) -- (7)
== sum (a ++ concat as) -- (8)
== sum (concat (a:as)) -- (4)
== (sum . concat) (a:as) -- (7)
By induction, we have proved the sum . map sum == sum . concat for lists of any size.
Equational reasoning means that we can use an equality like a = b to replace a with b or b with a at any step of our proofs.
Structural induction on lists is a bootstrapping process. You assume some property is true for lists of size k, then use that to prove it is true for lists of size k+1. Then, if you can prove it is true for k=0, this implies it is true for all k. For example, if it is true for k=0, then it is true for k=1, which means it is true for k=2, etc.
Definition 4 assumes a definition of ++:
[] ++ bs = bs
(a:as) ++ bs = a : (as ++ bs)
With ++ defined, we can prove (8):
A base case: a is empty
sum ([] ++ b) == sum b -- definition of ++
== 0 + sum b -- definition of +
== sum [] + sum b -- definition of sum
Assuming sum (a++b) is true for a of length k,
sum ((a:as) ++ bs) == sum (a : (as ++ bs)) -- definition of ++
== a + sum (as ++ bs) -- definition of sum
== a + sum as + sum bs -- induction
== sum (a:as) + sum bs -- definition of sum
Imagine we have a list:
myList :: [[Int]]
myList = [[1,2],[3,4,5]]
Let's apply sum . map sum:
(sum . map sum) [[1,2],[3,4,5]]
= sum [sum [1,2], sum [3,4,5]]
= sum [1+2,3+4+5]
= 1+2+3+4+5
Now let's apply sum . concat:
(sum . concat) [[1,2],[3,4,5]]
= sum [1,2,3,4,5]
= 1+2+3+4+5
Hopefully you can see now that, because (a+b)+c = a+(b+c), the order in which we add things does not matter, thus summing the inner lists, then summing the entire list produces the same result as simply summing each value of the inner lists.
Related
I am working through some examples and trying to implement a function that counts how many subsets of a list add up to a given number.
In trying to rewrite some implementations in python to Haskell :
test1 :: [Int]
test1 = [2,4,6,10,1,4,5,6,7,8]
countSets1 total input = length [n | n <- subsets $ sort input, sum n == total]
where
subsets [] = [[]]
subsets (x:xs) = map (x:) (subsets xs) ++ subsets xs
countSets2 total input = go (reverse . sort $ input) total
where
go [] _ = 0
go (x:xs) t
| t == 0 = 1
| t < 0 = 0
| t < x = go xs t
| otherwise = go xs (t - x) + go xs t
countSets3 total input = go (sort input) total (length input - 1)
where
go xxs t i
| t == 0 = 1
| t < 0 = 0
| i < 0 = 0
| t < (xxs !! i) = go xxs t (i-1)
| otherwise = go xxs (t - (xxs !! i)) (i-1) + go xxs t (i-1)
I can't figure out why countSets2 does not return the same result as countSets3 (a copy of the python version)
λ: countSets1 16 test1
24
λ: countSets2 16 test1
13
λ: countSets3 16 test1
24
EDIT:
#freestyle pointed out that the order of my conditions was different in the two solutions:
countSets2 total input = go (sortBy (flip compare) input) total
where
go _ 0 = 1
go [] _ = 0
go (x:xs) t
| t < 0 = 0
| t < x = go xs t
| otherwise = go xs (t - x) + go xs t
fixes the problem.
I'm not sure about your logic, but in your second solution I think you need
go [] 0 = 1
otherwise, your code causes go [] 0 = 0 which feels wrong.
I don't treat your error so I don't expect you accept my answer. I only provide a solution:
import Math.Combinat.Sets (sublists)
getSublists :: [Int] -> Int -> [[Int]]
getSublists list total = filter (\x -> sum x == total) (sublists list)
countSublists :: [Int] -> Int -> Int
countSublists list total = length $ getSublists list total
The module Math.Combinat.Sets is from the combinat package.
>>> countSublists [2,4,6,10,1,4,5,6,7,8] 16
24
This problem looks similar to a pearl written by Richard Bird on how many sums and products can make 100. I'll use it as a template here. First, the specification:
subseqn :: (Num a, Eq a) => a -> [a] -> Int
subseqn n = length . filter ((== n) . sum) . subseqs
where
subseqs = foldr prefix [[]]
prefix x xss = map (x:) xss ++ xss
Observe that a lot of work may be wasted in subseqs. Intuitively, we can discard candidates as soon as they exceed n, i.e. use the weaker predicate (<= n) somewhere. Trivially, filtering on it before filtering on the stronger one does not change the outcome. Then you can derive
filter ((== n) . sum) . subseqs
= {- insert weaker predicate -}
filter ((== n) . sum) . filter ((<= n) . sum) . subseqs
= {- definition of subseqs -}
filter ((== n) . sum) . filter ((<= n) . sum) . foldr prefix [[]]
= {- fusion law of foldr -}
filter ((== n) . sum) . foldr prefix' [[]]
The fusion law states that f . foldr g a = foldr h b iff
f is strict
f a = b
f (g x y) = h x (f y)
Here, a = b = [[]], f is filter ((<= n) . sum) and g is prefix. You can derive h (i.e. prefix') by observing that the predicate can be applied before prefixing:
filter ((<= n) . sum) (prefix x xss) =
filter ((<= n) . sum) (prefix x (filter ((<= n) . sum) xss))
which is exactly the third condition; then h is filter ((<= n) . sum) . prefix.
Another observation is that sum is computed too many times. To get around that, we can modify our definition of subseqn so that each candidate carries its own sum. Let's use
(&&&) :: (a -> b) -> (a -> c) -> a -> (b, c)
(&&&) f g x = (f x, g x)
and derive
filter ((== n) . sum) . subseqs
= {- use &&& -}
filter ((== n) . snd) . map (id &&& sum) . subseqs
= {- definition of subseqs -}
filter ((== n) . snd) . map (id &&& sum) . foldr prefix' [[]]
= {- fusion law of foldr -}
filter ((== n) . snd) . foldr prefix'' [[]]
I won't go through the whole derivation of prefix'', it is quite long. The gist is that you can avoid using sum at all by working on pairs, so that the sum is computed iteratively. Initially the sum is 0 for the empty list and all we have to do is add the new candidate to it.
We update our base case from [[]] to [([], 0)] and get:
prefix'' x = filter ((<= n) . snd) . uncurry zip . (prefix x *** add x) . unzip
where
(***) :: (a -> a') -> (b -> b') -> (a, b) -> (a', b')
(***) f g (x, y) = (f x, g y)
add :: Num a => a -> [a] -> [a]
add x xs = map (x+) xs ++ xs
Here is the final version:
subseqn :: (Num a, Ord a) => a -> [a] -> Int
subseqn n = length . filter ((== n) . snd) . foldr expand [([], 0)]
where
expand x = filter ((<= n) . snd) . uncurry zip . (prefix x *** add x) . unzip
prefix x xss = map (x:) xss ++ xss
add x xs = map (x+) xs ++ xs
(*** and &&& are from Control.Arrow)
Here I have a function to generate a stream of random numbers between 0 and 999.
randomHelp :: RandomGen g => g -> [Int]
randomHelp g = zipWith (mod) (map fst $ iterate (next . snd) $ next $ snd $ split g) $ repeat 1000
I would like to select all numbers from the stream defined above and each elem(i) and elem(i + 1) must respect a propriety. For example their gcd have to be one. All I can think is a fold function with because I can start with and accumulator which contains the number 1 (let's assume 1 will be the first element I want to show) then I check the propriety in fold's function and if it is respected i add the element to the accumulator, but the problem is the program blocks because of stackoverflow I think.
Here is the function:
randomFunc :: RandomGen g => g -> [Int]
randomFunc g = foldl (\acc x -> if (gcd x (last acc) == 1) then acc ++ [x] else acc) [1] (randomHelp g)
Note: I don't want to use explicit recursion.
A right fold would probably fit better, something like:
import System.Random (RandomGen, randomRs, mkStdGen)
randomFunc :: RandomGen g => g -> [Int]
randomFunc g = foldr go (const []) (randomRs (1, 20) g) 1
where go x f lst = if gcd x lst == 1 then x: f x else f lst
then
\> take 20 . randomFunc $ mkStdGen 1
[16,7,6,19,8,15,16,1,9,2,15,17,14,3,11,17,15,8,1,5]
Doing so you may build the list using : instead of ++ which may cause quadratic performance cost, and you may bypass the call to last.
List functions allow us to implement arbitrarily-dimensional vector math quite elegantly. For example:
on = (.) . (.)
add = zipWith (+)
sub = zipWith (-)
mul = zipWith (*)
dist = len `on` sub
dot = sum `on` mul
len = sqrt . join dot
And so on.
main = print $ add [1,2,3] [1,1,1] -- [2,3,4]
main = print $ len [1,1,1] -- 1.7320508075688772
main = print $ dot [2,0,0] [2,0,0] -- 4
Of course, this is not the most efficient solution, but is insightful to look at, as one can say map, zipWith and such generalize those vector operations. There is one function I couldn't implement elegantly, though - that is cross products. Since a possible n-dimensional generalization of cross products is the nd matrix determinant, how can I implement matrix multiplication elegantly?
Edit: yes, I asked a completely unrelated question to the problem I set up. Fml.
It just so happens I have some code lying around for doing n-dimensional matrix operations which I thought was quite cute when I wrote it at least:
{-# LANGUAGE NoMonomorphismRestriction #-}
module MultiArray where
import Control.Arrow
import Control.Monad
import Data.Ix
import Data.Maybe
import Data.Array (Array)
import qualified Data.Array as A
-- {{{ from Dmwit.hs
deleteAt n xs = take n xs ++ drop (n + 1) xs
insertAt n x xs = take n xs ++ x : drop n xs
doublify f g xs ys = f (uncurry g) (zip xs ys)
any2 = doublify any
all2 = doublify all
-- }}}
-- makes the most sense when ls and hs have the same length
instance Ix a => Ix [a] where
range = sequence . map range . uncurry zip
inRange = all2 inRange . uncurry zip
rangeSize = product . uncurry (zipWith (curry rangeSize))
index (ls, hs) xs = fst . foldr step (0, 1) $ zip indices sizes where
indices = zipWith index (zip ls hs) xs
sizes = map rangeSize $ zip ls hs
step (i, b) (s, p) = (s + p * i, p * b)
fold :: (Enum i, Ix i) => ([a] -> b) -> Int -> Array [i] a -> Array [i] b
fold f n a = A.array newBound assocs where
(oldLowBound, oldHighBound) = A.bounds a
(newLowBoundBeg , dimLow : newLowBoundEnd ) = splitAt n oldLowBound
(newHighBoundBeg, dimHigh: newHighBoundEnd) = splitAt n oldHighBound
assocs = [(beg ++ end, f [a A.! (beg ++ i : end) | i <- [dimLow..dimHigh]])
| beg <- range (newLowBoundBeg, newHighBoundBeg)
, end <- range (newLowBoundEnd, newHighBoundEnd)
]
newBound = (newLowBoundBeg ++ newLowBoundEnd, newHighBoundBeg ++ newHighBoundEnd)
flatten a = check a >> return value where
check = guard . (1==) . length . fst . A.bounds
value = A.ixmap ((head *** head) . A.bounds $ a) return a
elementWise :: (MonadPlus m, Ix i) => (a -> b -> c) -> Array i a -> Array i b -> m (Array i c)
elementWise f a b = check >> return value where
check = guard $ A.bounds a == A.bounds b
value = A.listArray (A.bounds a) (zipWith f (A.elems a) (A.elems b))
unsafeFlatten a = fromJust $ flatten a
unsafeElementWise f a b = fromJust $ elementWise f a b
matrixMult a b = fold sum 1 $ unsafeElementWise (*) a' b' where
aBounds = (join (***) (!!0)) $ A.bounds a
bBounds = (join (***) (!!1)) $ A.bounds b
a' = copy 2 bBounds a
b' = copy 0 aBounds b
bijection f g a = A.ixmap ((f *** f) . A.bounds $ a) g a
unFlatten = bijection return head
matrixTranspose = bijection reverse reverse
copy n (low, high) a = A.ixmap (newBounds a) (deleteAt n) a where
newBounds = (insertAt n low *** insertAt n high) . A.bounds
The cute bit here is matrixMult, which is one of the only operations that is specialized to two-dimensional arrays. It expands its first argument along one dimension (by putting a copy of the two-dimensional object into each slice of the three-dimensional object); expands its second along another; does pointwise multiplication (now in a three-dimensional array); then collapses the fabricated third dimension by summing. Quite nice.
Reading "Thinking Functionally with Haskell" I came across a part of a program calculation that required that map sum (map (x:) xss) be rewritten as map (x+) (map sum xss)
Intuitively I know that it makes sense ...
if you have some lists that you are going to sum but, before summing, to those same lists you are also going to add one element 'x', then that is the same as taking a list of sums of the origninal lists and adding x's value to each of them.
But I would like to know how to transform one into the other only using equational reasoning. I feel like I'm missing a law or rule that would help me understand.
Using the law
map f (map g list) === map (f . g) list
We can deduce
map sum (map (x:) xss) =
map (sum . (x:)) xss =
eta-expand to give an argument to work with
map (\xs -> sum . (x:) $ xs) xss =
Substituting in for (f . g) x === f (g x)
map (\xs -> sum (x:xs)) xs =
Where
sum (x:xs) = x + sum xs
sum [] = 0
so
map (\xs -> sum (x:xs)) xss =
map (\xs -> x + sum xs) xss =
Substituting f (g x) === (f . g) x
map (\xs -> (x +) . sum $ xs) xss =
eta-reduce the lambda
map ((x +) . sum) xss =
The use the reverse of the first law from above
map (x+) (map sum xss)
I recommend you look at the types and let them guide you through the transformation.
> let xss = [[1], [2], [3]]
> :t xss
xss :: Num t => [[t]]
> map sum xss -- basically compacting the lists
[1,2,3]
> :t map sum xss -- into just a list of numbers
map sum xss :: Num b => [b]
Next we need to do the addition
> :t (+5)
(+5) :: Num a => a -> a
> :t map (+5) -- no magic in this
map (+5) :: Num b => [b] -> [b]
> map (+5) (map sum xss)
[6,7,8]
The bottom line I'd guess is that in the first example you're changing the types in the other way than in the second one. The point where a list of lists becomes just a list changes, and so has the way in which you add the number.
I failed at reading RWH; and not one to quit, I ordered Haskell: The Craft of Functional Programming. Now I'm curious about these functional proofs on page 146. Specifically I'm trying to prove 8.5.1 sum (reverse xs) = sum xs. I can do some of the induction proof but then I get stuck..
HYP:
sum ( reverse xs ) = sum xs
BASE:
sum ( reverse [] ) = sum []
Left = sum ( [] ) (reverse.1)
= 0 (sum.1)
Right = 0 (sum.1)
INDUCTION:
sum ( reverse (x:xs) ) = sum (x:xs)
Left = sum ( reverse xs ++ [x] ) (reverse.2)
Right = sum (x:xs)
= x + sum xs (sum.2)
So now I'm just trying ot prove that Left sum ( reverse xs ++ [x] ) is equal to Right x + sum xs, but that isn't too far off from where I started sum ( reverse (x:xs) ) = sum (x:xs).
I'm not quite sure why this needs to be proved, it seems totally reasonable to use the symbolic proof of reverse x:y:z = z:y:x (by defn), and because + is commutative (arth) then reverse 1+2+3 = 3+2+1,
sum (reverse []) = sum [] -- def reverse
sum (reverse (x:xs)) = sum (reverse xs ++ [x]) -- def reverse
= sum (reverse xs) + sum [x] -- sum lemma below
= sum (reverse xs) + x -- def sum
= x + sum (reverse xs) -- commutativity assumption!
= x + sum xs -- inductive hypothesis
= sum (x:xs) -- definition of sum
However, there are underlying assumptions of associativity and commutativity that are not strictly warranted and this will not work properly for a number of numerical types such as Float and Double where those assumptions are violated.
Lemma: sum (xs ++ ys) == sum xs + sum ys given the associativity of (+)
Proof:
sum ([] ++ ys) = sum ys -- def (++)
= 0 + sum ys -- identity of addition
= sum [] ++ sum ys -- def sum
sum ((x:xs) ++ ys) = sum (x : (xs ++ ys)) -- def (++)
= x + sum (xs ++ ys) -- def sum
= x + (sum xs + sum ys) -- inductive hypothesis
= (x + sum xs) + sum ys -- associativity assumption!
= sum (x:xs) + sum ys -- def sum
Basically you need to show that
sum (reverse xs ++ [x]) = sum (reverse xs) + sum [x]
which then easily leads to
= x + sum (reverse xs)
= x + sum xs -- by inductive hyp.
The problem is to show that sum distributes over list concatenation.
Use the definition of a sum to break up (sum reverse xs ++[x]) into x + sum(reverse(xs)), and using your inductive hypothesis you know sum(reverse(xs)) = sum(xs). But I agree, induction is overkill for a problem like this.
Here's where I think you're stuck. You need to prove a lemma that says
sum (xs ++ ys) == sum xs + sum ys
To prove this law you will have to assume that addition is associative, which is true only for integers and rationals.
Then, you will also need to assume that addition is commutative, which is true for integers and rationals but also for floats.
Digression: The style of your proofs looks very strange to me. I think you will have an easier time writing these kinds of proofs if you use the style in Graham Hutton's book.