What optimization technique does ghci use to speed up recursive map? - haskell

Say I have the following function:
minc = map (+1)
natural = 1:minc natural
It seems like it unfolds like this:
1:minc(1:minc(1:minc(1:minc(1:minc(1:minc(1:minc(1:minc(1:minc(1:minc...
1:2:minc(minc(1:minc(1:minc(1:minc(1:minc(1:minc(1:minc(1:minc(1:minc...
1:2:minc(2:minc(2:minc(2:minc(2:minc(2:minc(2:minc(2:minc(2:minc(minc...
1:2:3:minc(3:minc(3:minc(3:minc(3:minc(3:minc(3:minc(3:minc(minc(minc...
...
Although it's lazily evaluated, to build each new number n in the list it has to unfold an expression n times, which gives us O(N^2) complexity. But from the execution time I can see that the real complexity is still linear!
Which optimization does Haskell use in this case, and how does it unfold this expression?

The list of naturals is being shared between each recursive step. The graph is evaluated like this.
1:map (+1) _
 ^         |
 `---------'
1: (2 : map (+1) _)
      ^          |
      `----------'
1: (2 : (3 : map (+1) _))
           ^          |
           `----------'
This sharing means that the code uses O(n) time rather than the expected O(N^2).
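A minimal sketch of the difference sharing makes. The name naturalUnshared is hypothetical and not from the question: it wraps the same definition in a function of a dummy argument so that, at least without optimisation, each recursive call builds a fresh list instead of pointing back into the existing one.

```haskell
-- Shared: 'natural' is a single heap object, and the recursive
-- reference points back into that same list, so each cell is
-- computed once from its predecessor.
natural :: [Integer]
natural = 1 : map (+1) natural

-- Hypothetical contrast (assumption: no optimisation floats the inner
-- call out of the function). Every recursive call rebuilds the list
-- from scratch, giving the quadratic unfolding the question expected.
naturalUnshared :: () -> [Integer]
naturalUnshared () = 1 : map (+1) (naturalUnshared ())
```

Both produce the same elements. In GHCi, comparing natural !! 5000 with naturalUnshared () !! 5000 after :set +s should make the difference in work visible, though the exact timings depend on the machine and flags.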

"to build each new number n in the list it has to unfold an expression n times, which gives us O(N^2) complexity."
Not quite. I first claimed that the complexity of unfolding the first N numbers this way is indeed O(N^2), but apparently I'm wrong here [1]. In any case, if you request only the N-th number, then it actually evaluates like this:
(!!n) $ 1:minc(1:minc(1:minc(1:minc(1:minc(1:minc(1:minc(1:minc(1:minc...
(!!n-1) $ minc(1:minc(1:minc(1:minc(1:minc(1:minc(1:minc(1:minc(1:minc...
(!!n-1) $ (1+1):minc(minc(1:minc(1:minc(1:minc(1:minc(1:minc(1:minc(1:minc...
-- note that `(1+1)` isn't actually calculated!
(!!n-2) $ minc(minc(1:minc(1:minc(1:minc(1:minc(1:minc(1:minc(1:minc...
(!!n-2) $ ((1+1)+1):minc(minc(minc(1:minc(1:minc(1:minc(1:minc(1:minc(1:minc...
-- again, neither of the additions is actually calculated.
(!!n-3) $ minc(minc(minc(1:minc(1:minc(1:minc(1:minc(1:minc(1:minc...
(!!n-3) $ ((...)+1):minc(minc(minc(minc(1:minc(1:minc(1:minc(1:minc(1:minc...
...
(!!n-n) $ ((...+1)+1) : minc(minc(...minc(minc(1:minc(...
╰─ n ─╯
(!!0) $ (n+1) : _
n+1
This takes a fixed two steps per increase in N, plus N additions once it has reached the index, so it is still O(N) all in all.
The crucial thing here is that basically, map is only applied once to the entire list. It's completely lazy, i.e. to yield a _:_ thunk it only needs to know that the list has at least length 1, but the actual elements don't matter at all.
This way, what we've written as minc(minc(...(minc(1 : ... is replaced by (... + 1) : minc(... in only one step.
[1] Turns out that even if we sum the first N numbers, it's done in O(N). I don't know how.
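One way to read footnote [1] is through the sharing described in the other answer: the recursive reference is the list itself, so each cell is built once from its predecessor. The sketch below makes that self-reference explicit with fix from Data.Function; naturalFix and sumFirst are hypothetical names used only for illustration.

```haskell
import Data.Function (fix)

-- The same list with the self-reference made explicit: 'fix' feeds the
-- result back in as 'xs', so 'xs' and the whole list are one shared object.
naturalFix :: [Integer]
naturalFix = fix (\xs -> 1 : map (+1) xs)

-- Sum of the first n naturals, walking the shared list once.
sumFirst :: Int -> Integer
sumFirst n = sum (take n naturalFix)
```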

Related

Haskell recursion list of self growing lists

I found the following piece of code in Contract.hs line 147 Pricing Financial Contracts with Haskell:
konstSlices :: a -> [[a]]
konstSlices x = nextSlice [x]
  where nextSlice sl = sl : nextSlice (x:sl)
This produces an infinite list of lists:
konstSlices 100 = [[100],[100,100],[100,100,100],...]
I am not sure what is happening inside the where clause. If we take just 3 iterations, what should nextSlice contain at that point?
[100] : [100,100] : nextSlice (100:[100,100]) ?
And how does the terminating [] appear, to pack the lists inside a list as in [100]:[100,100]:[100,100,100]:[] = [[100],[100,100],[100,100,100]]?
The recursive construction is really hard to follow. I am also curious whether there are tools that let you follow such iterations and see how such values are built. In such cases I currently use pen and paper to get a grip on what is happening. Recursive lists are not even the worst case: what brought me to this question was the analysis of the function at t (line 130), with the liftA2 calls inside applicative functions that are built from other smaller functions or data constructors with function types. You rapidly see a big chunk of interrelated computations growing, and you are totally lost, brainwashed.
Here is a much simpler case for you
Prelude> let ones = 1 : ones
Prelude> take 3 ones
[1,1,1]
ones is defined to be an infinite list of 1s. There is no end, so there is no final empty list constructor. take n initiates the generation of the first n elements, here with n=3.
karakfa has a great illustration of what’s going on here, but I’ll expand a bit.
There isn’t any closing ]. A list is a data structure whose head is an item of data, and whose tail is a list. Furthermore, objects in Haskell are lazily evaluated.
Let’s take another look at this example:
konstSlices :: a -> [[a]]
konstSlices x = nextSlice [x]
  where nextSlice sl = sl : nextSlice (x:sl)
Lazy evaluation means that, if you try to use konstSlices 100, the program will only calculate as many items of the list as it needs to. So, if you take 1 (konstSlices 100), the program will compute
konstSlices 100 = [100]:
                  (nextSlice (100:[100]))
The tail of the list, everything after the [100]:, is stored as a thunk. Its value hasn’t been computed yet.
What if you ask for take 2 (konstSlices 100)? Then, the program needs to compute the thunk until it finds the second element. That’s all it needs, so it will stop when it gets to,
konstSlices 100 = [100]:
                  [100,100]:
                  (nextSlice (100:[100,100]))
And so on, for however many entries you need to compute.
There’s never anything corresponding to a closing bracket. There doesn’t need to be. The recursive definition of konstSlices never generates anything like one, just more thunks. And that’s allowed.
On the other hand, if you try to take length (konstSlices 100), the program will attempt to generate an infinite number of nodes, run out of memory, and crash. If you tried to compute the entirety of a circular list, like xs = 1:xs, it wouldn’t need to allocate any new nodes, because it links back to the same ones, and it wouldn’t need to generate new stack frames, because it’s tail-recursive modulo cons, so it would go into an infinite loop.
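The behaviour described above can be tried directly with a small sketch. konstSlices is the definition from the question; firstSlices is a hypothetical helper name added here for illustration.

```haskell
konstSlices :: a -> [[a]]
konstSlices x = nextSlice [x]
  where nextSlice sl = sl : nextSlice (x:sl)

-- Forces only the first n cons cells of the outer list; everything
-- after them stays an unevaluated thunk.
firstSlices :: Int -> a -> [[a]]
firstSlices n = take n . konstSlices
```

Calling firstSlices 2 100 stops after the second slice, and length (firstSlices 1000 'x') terminates because take bounds the outer list before length walks it.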
The logic is simple.
konstSlices :: a -> [[a]]
konstSlices x = nextSlice [x]
  where nextSlice sl = sl : nextSlice (x:sl)
If x = 100, nextSlice first yields [100]. It then calls itself recursively on x:sl, that is, the previous list with x prepended, so sl grows by one element with each call. Listed by index in the result:
0 -> [100] = [100]
1 -> [100,100] = head (nextSlice (100:[100]))
2 -> [100,100,100] = head (nextSlice (100:[100,100]))
3 -> [100,100,100,100] = head (nextSlice (100:[100,100,100]))
and the process continues.
In x:sl, the first element of the new list is x.
Every nextSlice call contributes one more list to the result: sl : (x:sl) : (x:x:sl) : ...
It is the same as this definition, started with acc = 1:
num = 100 : num
slice acc = take acc num :slice (acc+1)
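Since each slice is the previous one with x prepended, the same list can also be sketched with iterate from the Prelude. konstSlicesIter is a hypothetical name, not something from Contract.hs.

```haskell
-- iterate f a = [a, f a, f (f a), ...], so starting from [x] and
-- repeatedly prepending x gives [x], [x,x], [x,x,x], ...
konstSlicesIter :: a -> [[a]]
konstSlicesIter x = iterate (x:) [x]
```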
Your code is meant to be instructional.
The list with one value prepended is displayed every iteration so that you get
[100]
[100,100]
[100,100,100]
however many times you want it. Every list listed is a brand new list. Haskell builds a new list every iteration by prepending one value to the previous list. The use of a previous value in the next value is what recursive functions do. See below, the first example.
In an actual program, you might not be interested in the building of a list but only in the final result.
Haskell has functions that help you see your list as it is being built. These functions generalize primitive recursion and so cover many of the recursive functions you are likely to need.
The functions are foldl/foldr and scanl/scanr. When you want to see your list being built, use a scan; when you want just the final result, use a fold.
You may be interested only in the construction, as in the following, which builds the Fibonacci numbers up to the 12th as the first components of the pairs.
scanl (\(a,b) x -> (a+b,a)) (1,0) [2..12]
[(1,0),(1,1),(2,1),(3,2),(5,3),(8,5),(13,8),(21,13),(34,21),(55,34),(89,55),(144,89)]
in which the previous two values are added to make the next first value and the previous first value becomes the next second value.
In your code with 3 iterations, you can see what happens to each, easily.
take 3 . konstSlices $ 100
[ [100], [100,100], [100,100,100] ]
scanl (\b a -> a : b) [] $ take 3 $ repeat 100
[ [], [100], [100,100], [100,100,100] ]
But this shows more: it includes the initial empty list, to which 100 is prepended to produce the next value.
If you want only the final result,
foldl (\b a -> a : b) [] $ take 3 $ repeat 100
[100,100,100]
It is exactly
100 : [] = [100]
100 : [100] = [100,100]
100 : [100,100] = [100,100,100]
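The scan-based view can also be packaged as definitions. This is only a sketch: konstSlicesScan and fibsUpTo12 are hypothetical names, and the scan runs over repeat x, relying on scanl producing its output lazily.

```haskell
-- scanl yields its results lazily, so it can run over the infinite
-- list 'repeat x'; 'drop 1' discards the initial [] seed.
konstSlicesScan :: a -> [[a]]
konstSlicesScan x = drop 1 (scanl (flip (:)) [] (repeat x))

-- First components of the pairs produced by the Fibonacci scan.
fibsUpTo12 :: [Integer]
fibsUpTo12 = map fst (scanl (\(a, b) _ -> (a + b, a)) (1, 0) [2 .. 12 :: Int])
```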

Build a specific list of string from a string in Haskell

I'm starting to learn Haskell and I'm stuck on a problem.
I read a string like "1234" or "azer" from standard input,
and I want to make a list like ["123", "234", "341", "412"] or ["aze", "zer", "era", "raz"].
I probably have to use map, but I don't know how to proceed.
Can someone help me do that? Thanks.
Let's start with a list, [1..4]. Let's repeat it for eternity:
>>> cycle [1..4]
[1,2,3,4,1,2,3,4,1,2,3,4,...
Now let's take a slice of it, at say, the 2nd index:
>>> take 4 $ drop (2-1) $ cycle [1..4]
[2,3,4,1]
We can generalize this by naming a function:
slice n = take 4 $ drop n $ cycle [1..4]
To obtain all possible cyclic permutations, we only need to sample n from 1 to 4:
>>> map slice [1..4]
[[2,3,4,1],[3,4,1,2],[4,1,2,3],[1,2,3,4]]
Now, how can we make this work with an arbitrary string? Let's redefine slice to accept a string:
slice s n = take (length s) $ drop n $ cycle s
And so our cyclic permutations function can be defined as follows:
cyclicPerms s = map (slice s) [1..(length s)]
Testing:
>>> cyclicPerms "abcde"
["bcdea","cdeab","deabc","eabcd","abcde"]
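The question's examples are windows one element shorter than the input, starting at every position, rather than full rotations. A minimal sketch of that variant, built on the same cycle/drop/take idea; cyclicWindows and shortRotations are hypothetical names.

```haskell
-- All cyclic windows of length k, one starting at each position of s.
-- For an empty s the index range is empty, so 'cycle' is never forced.
cyclicWindows :: Int -> [a] -> [[a]]
cyclicWindows k s = [take k (drop i (cycle s)) | i <- [0 .. length s - 1]]

-- The shape requested in the question: windows of length (length s - 1).
shortRotations :: [a] -> [[a]]
shortRotations s = cyclicWindows (length s - 1) s
```

With these definitions, shortRotations "1234" gives ["123","234","341","412"].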
I had originally posted an answer totally misunderstanding the specification. I, like a Haskell enumeration, read only the first two numbers so I thought it continued as such. Oops. In any event I just adapted a chunks function I wrote to produce the repetitions. When I get home, I think I have another that cycles lists. I'll post it as well if it's not the same. Who knows.
This function allows you to specify the chunk size as well as the list.
cychnks n ls = [take n . drop x $ ls2 | (x, _) <- zip [0..] ls]
  where ls2 = ls ++ ls
cychnks 5 "abcde"
["abcde","bcdea","cdeab","deabc","eabcd"]
cychnks 3 "abcde"
["abc","bcd","cde","dea","eab"]

Iterating through a list to detect prime numbers

I was given a homework assignment in Haskell in which I should write a module that helps detect prime numbers in a list, say:
[2,3,4,5,6,7,8,9,10]
For the homework, I should iterate through every element of this list and eliminate all of its multiples. For example, at number 2 I should eliminate 4, 6, 8 and 10; then go to number 3 and delete 6 and 9, and so on until the end, then return the list with prime numbers only.
I have an idea of using the map function, but I'm stuck at this point (I'm pretty new to Haskell, though).
Yes, it is my homework, but no, I don't have to do it; it's just practice. So I'm thankful for any help.
Instead of using a map (I don't think that's possible without doing some pre-processing), you can roll your own function:
sieveWith _ [] = []
sieveWith ss (x:xs) | any ((==) 0 . mod x) ss = sieveWith ss xs
                    | otherwise = x : (sieveWith (x:ss) xs)
and:
sieve = sieveWith []
Now if you call sieve:
*Main> sieve [2,3,4,5,6,7,8,9,10]
[2,3,5,7]
The function threads an accumulator (its first argument) through the recursive calls; each time a value is kept, it is also added to that accumulator. A value is kept if no modulo operation against the accumulated values yields zero (second guard). If any of them yields zero, the value is simply omitted.
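For comparison, the procedure as the question words it (take the head, delete its multiples from the rest, repeat) can be sketched directly. sieveMultiples is a hypothetical name, and the sketch assumes every element is at least 2, as in the example list.

```haskell
-- Keep the head p, remove every multiple of p from the remainder,
-- then sieve what is left. Assumes all elements are >= 2.
sieveMultiples :: [Int] -> [Int]
sieveMultiples []     = []
sieveMultiples (p:xs) = p : sieveMultiples [x | x <- xs, x `mod` p /= 0]
```

This filters the remaining list once per kept element, whereas sieveWith tests each candidate against the values kept so far; on the example input both return [2,3,5,7].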

How does this haskell code work?

I'm a new student and I'm studying in Computer Sciences. We're tackling Haskell, and while I understand the idea of Haskell, I just can't seem to figure out how exactly the piece of code we're supposed to look at works:
module U1 where
double x = x + x
doubles (d:ds) = (double d):(doubles ds)
ds = doubles [1..]
I admit it seems rather simple for someone who knows what's happening, but I can't wrap my head around it. If I write "take 5 ds", it obviously gives back [2,4,6,8,10]. What I don't get is why.
Here's my train of thought: I call ds, which then looks for doubles. Because I also submit the value [1..], doubles (d:ds) should mean that d = 1 and ds = [2..], correct? I then double the d, which returns 2 and puts it at the start of a list (array?). Then it calls upon itself, transferring ds = [2..] to d = 2 and ds = [3..], which then doubles d again and again calls upon itself and so on and so forth until it can return 5 values, [2,4,6,8,10].
So first of all, is my understanding right? Do I have any grave mistakes in my train of thought?
Second of all, since it seems to save all doubled d into a list to call for later, what's the name of that list? Where exactly did I define it?
Thanks in advance, hope you can help out a student to understand this x)
I think you are right about the recursion/loop part about how doubles goes through each element of the infinite list.
Now regarding
it seems to save all doubled d into a list to call for later, whats
the name of that list? Where did I exactly define it?
This relates to a feature called lazy evaluation in Haskell. The list isn't precomputed and stored anywhere. Instead, you can imagine that a list is like a function object in C++ that can generate elements when needed. (The usual way to put it is that expressions are evaluated on demand.) So when you do
take 5 [1..]
[1..] can be viewed as a function object that generates numbers when used with head, take etc. So,
take 5 [1..] == (1 : take 4 [2..])
Here [2..] is also a "function object" that gives you numbers. Similarly, you can have
take 5 [1..] == (1 : 2 : take 3 [3..]) == ... == (1 : 2 : 3 : 4 : 5 : take 0 [6..])
Now, we don't need to care about [6..], because take 0 xs for any xs is []. Therefore, we can have
take 5 [1..] == (1 : 2 : 3 : 4 : 5 : [])
without needing to store any of the "infinite" lists like [2..]. They may be viewed as function objects/generators if you want to get an idea of how Lazy computation can actually happen.
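The equations above follow from how take is defined. A minimal hand-written version, named myTake here to avoid clashing with the Prelude (the name is hypothetical), behaves the same way on these inputs:

```haskell
-- Each step emits one element and asks for one fewer; when the count
-- reaches 0 the rest of the list is never inspected.
myTake :: Int -> [a] -> [a]
myTake n _ | n <= 0 = []
myTake _ []         = []
myTake n (x:xs)     = x : myTake (n - 1) xs
```

Because the n <= 0 case comes first, myTake 0 applied to any list returns [] without looking at the list at all, which is why [6..] never matters.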
Your train of thought looks correct. The only minor inaccuracy in it lies in describing the computation using expressions such as "it doubles 2 and then calls itself ...". In pure functional programming languages, such as Haskell, there actually is no fixed evaluation order. Specifically, in
double 1 : doubles [2..]
it is left unspecified whether doubling 1 happens before or after doubling the rest of the list. Theoretical results guarantee that order is indeed immaterial, in that, roughly, even if you evaluate your expression in a different order you will get the same result. I would recommend that you see this property at work using the Lambda Bubble Pop website: there you can pop bubbles in a different order to simulate any evaluation order. No matter what you do, you will get the same result.
Note that, because evaluation order does not matter, the Haskell compiler is free to choose any evaluation order it deems to be the most appropriate for your code. For instance, let ds be defined as in the final line in your code, and consider
take 5 (drop 5 ds)
this results in [12,14,16,18,20]. Note that the compiler has no need to double the first 5 numbers, since you are dropping them, so they can be dropped before they are completely computed (!!).
If you want to experiment, define yourself a function which is very expensive to compute (say, write fibonacci following the recursive definition).
fibonacci 0 = 0
fibonacci 1 = 1
fibonacci n = fibonacci (n-1) + fibonacci (n-2)
Then, define
const5 n = 5
and compute
fibonacci 100
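and observe how long that actually takes (with this naive definition it will not finish in any reasonable time, so interrupt it, or try a smaller argument such as 30 first). Then, evaluate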
const5 (fibonacci 100)
and see that the result is immediately reached -- the argument was not even computed (!) since there was no need for it.
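The same two observations can be checked without waiting on an expensive computation. This sketch reuses the definitions from the question, with an added empty-list case and type signatures that are not in the original, and passes undefined to const5 as a stand-in for an argument that must never be evaluated.

```haskell
double :: Integer -> Integer
double x = x + x

doubles :: [Integer] -> [Integer]
doubles (d:rest) = double d : doubles rest
doubles []       = []   -- added so the function is total

ds :: [Integer]
ds = doubles [1..]

-- Ignores its argument, so the argument is never evaluated.
const5 :: a -> Int
const5 _ = 5
```

Here take 5 (drop 5 ds) yields [12,14,16,18,20], and const5 undefined returns 5 instead of raising an error.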

haskell: factors of a natural number

I'm trying to write a function in Haskell that calculates all factors of a given number except itself.
The result should look something like this:
factorlist 15 => [1,3,5]
I'm new to Haskell and to recursion in general, which I'm pretty sure I'm supposed to apply in this example, but I don't know where or how.
My idea was to compare the given number with the first element of a list from 1 to n `div` 2 using the mod function, but somehow recursively, and if the result is 0 then I add the number to a new list. (I hope this makes sense.)
I would appreciate any help on this matter.
Here is my code so far (it doesn't work, but it illustrates my idea):
factorList :: Int -> [Int]
factorList n |n `mod` head [1..n`div`2] == 0 = x:[]
There are several ways to handle this. But first of all, let's write a small helper:
isFactorOf :: Integral a => a -> a -> Bool
isFactorOf x n = n `mod` x == 0
That way we can write 12 `isFactorOf` 24 and get either True or False. For the recursive part, let's assume that we use a function with two arguments: the first is the number we want to factorize, the second is the factor we're currently testing. We only test factors less than or equal to n `div` 2, which leads to:
createList n f | f <= n `div` 2 = if f `isFactorOf` n
                                    then f : next
                                    else next
               | otherwise      = []
  where next = createList n (f + 1)
So if the second parameter is a factor of n, we add it to the list and proceed; otherwise we just proceed. We do this only as long as f <= n `div` 2. Now in order to create factorList, we can simply call createList with a suitable second parameter:
factorList n = createList n 1
The recursion is hidden in createList. As such, createList is a worker, and you could hide it in a where inside of factorList.
Note that one could easily define factorList with filter or list comprehensions:
factorList' n = filter (`isFactorOf` n) [1 .. n `div` 2]
factorList'' n = [ x | x <- [1 .. n`div` 2], x `isFactorOf` n]
But in this case you wouldn't have written the recursion yourself.
Further exercises:
Try to implement the filter function yourself.
Create another function which returns only prime factors. You can either use your previous result and write a prime filter, or write a recursive function which generates them directly (the latter is faster).
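One possible sketch of both exercises, for readers who want to check their own attempt. The names myFilter and primeFactors are hypothetical, and primeFactors uses plain trial division and reports repeated factors with multiplicity, which is only one reading of "prime factors".

```haskell
-- A hand-rolled filter: keep x when the predicate holds, skip it otherwise.
myFilter :: (a -> Bool) -> [a] -> [a]
myFilter _ [] = []
myFilter p (x:xs)
  | p x       = x : myFilter p xs
  | otherwise = myFilter p xs

-- Prime factors by trial division, generated directly.
-- Example: 12 gives [2,2,3]. Inputs below 2 give [].
primeFactors :: Int -> [Int]
primeFactors n = go n 2
  where
    go m f
      | m < 2          = []
      | f * f > m      = [m]                    -- what remains is prime
      | m `mod` f == 0 = f : go (m `div` f) f   -- divide the factor out
      | otherwise      = go m (f + 1)
```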
@Zeta's answer is interesting. But if you're new to Haskell like I am, you may want a "simple" answer to start with. (Just to get the basic recursion pattern...and to understand the indenting, and things like that.)
I'm not going to divide anything by 2 and I will include the number itself. So factorlist 15 => [1,3,5,15] in my example:
factorList :: Int -> [Int]
factorList value = factorsGreaterOrEqual 1
  where
    factorsGreaterOrEqual test
      | (test == value) = [value]
      | (value `mod` test == 0) = test : restOfFactors
      | otherwise = restOfFactors
      where restOfFactors = factorsGreaterOrEqual (test + 1)
The first line is the type signature, which you already knew about. The type signature doesn't have to live right next to the list of pattern definitions for a function, (though the patterns themselves need to be all together on sequential lines).
Then factorList is defined in terms of a helper function. This helper function is defined in a where clause...that means it is local and has access to the value parameter. Were we to define factorsGreaterOrEqual globally, then it would need two parameters as value would not be in scope, e.g.
factorsGreaterOrEqual 4 15 => [5,15]
You might argue that factorsGreaterOrEqual is a useful function in its own right. Maybe it is, maybe it isn't. But in this case we're going to say it isn't of general use besides to help us define factorList...so using the where clause and picking up value implicitly is cleaner.
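For contrast, here is a sketch of what the global, two-parameter version mentioned above might look like. The argument order (test first, then value) is an assumption taken from the example call factorsGreaterOrEqual 4 15.

```haskell
-- Top-level variant: 'value' is no longer in scope from an enclosing
-- definition, so it has to be passed along on every recursive call.
factorsGreaterOrEqual :: Int -> Int -> [Int]
factorsGreaterOrEqual test value
  | test == value         = [value]
  | value `mod` test == 0 = test : rest
  | otherwise             = rest
  where rest = factorsGreaterOrEqual (test + 1) value

factorList :: Int -> [Int]
factorList = factorsGreaterOrEqual 1
```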
The indentation rules of Haskell are (to my tastes) weird, but here they are summarized. I'm indenting with two spaces here because it grows too far right if you use 4.
A list of boolean tests, each with a pipe character in front, is called "guards" in Haskell. I simply establish the terminal condition as being when the test hits the value; so factorsGreaterOrEqual N = [N] if we were doing a call to factorList N. Then we decide whether to prepend the test number to the list by checking whether dividing the value by it leaves no remainder. (otherwise is not a keyword but an ordinary Prelude value equal to True; as the last guard it acts much like default in C-like switch statements for the fall-through case.)
Showing another level of nesting and another implicit parameter demonstration, I added a where clause to locally define a function called restOfFactors. There is no need to pass test as a parameter to restOfFactors because it lives "in the scope" of factorsGreaterOrEqual...and as that lives in the scope of factorList then value is available as well.

Resources