How to optimize a sum over list comprehension - haskell

I have to run a large computation (for statistics), but the following function makes Haskell hang without producing output once n gets close to 100, and I need n = 500. When I remove one of the factors (for example p ** fromIntegral l), things get better. Any idea how to make the sum more efficient? Also, the choose function comes from a library that Hackage describes as optimized. Thanks a lot.
probaMetodo :: Integral b => b -> b -> Double -> b -> Double
probaMetodo i j p n = sum [(p ** fromIntegral l) * (fromIntegral(n `choose` l)) * ((1-p) ** fromIntegral (n-l)) | l <- [i,i+1..j]]
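One possible direction, sketched under assumptions rather than taken from the question: instead of evaluating the binomial coefficient and both powers for every l, compute the first term in log space and obtain each following term from the previous one through the ratio term(l+1)/term(l) = (n-l)/(l+1) * p/(1-p). The name probaMetodo' is hypothetical, the argument types are fixed to Int for simplicity, the library's choose is not used, and the sketch assumes 0 < p < 1 and 0 <= i <= j <= n.

```haskell
-- Hypothetical alternative to probaMetodo (not from the original post).
-- Assumes 0 < p < 1 and 0 <= i <= j <= n.
probaMetodo' :: Int -> Int -> Double -> Int -> Double
probaMetodo' i j p n = sum (take (j - i + 1) terms)
  where
    n' = fromIntegral n :: Double
    -- log (n `choose` i), built from small factors so nothing overflows
    logChoose = sum [log ((n' - k + 1) / k) | k <- map fromIntegral [1 .. i]]
    -- first term: C(n,i) * p^i * (1-p)^(n-i), evaluated in log space
    first = exp (logChoose
                 + fromIntegral i * log p
                 + fromIntegral (n - i) * log (1 - p))
    -- ratio between the term for l+1 and the term for l
    ratio l = (n' - l) / (l + 1) * p / (1 - p)
    -- lazily generated terms for l = i, i+1, ...
    terms = scanl (\t l -> t * ratio l) first (map fromIntegral [i ..])
```

Summing over the full range, for example probaMetodo' 0 500 0.3 500, should give a value very close to 1, which is a convenient sanity check.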

Related

Prime Factorization in Haskell to return a list of tuples giving the number and the power

I have been trying to learn haskell by trying to do some simple problems.
The Problem
Currently, I am trying to implement a function primeFactorization :: Integer -> [(Integer, Integer)] such that the output is a list of tuples containing each prime factor and the power it is raised to in the number.
Example Output
> primeFactorization 120
[(2,3), (3,1), (5,1)] since 120 = 2^3 * 3^1 * 5^1
My (Partial) Solution
primeFactorization :: Integer -> [Integer]
primeFactorization n =
    let
        factors :: Integer -> [Integer]
        factors n = [x | x <- [2..n-1], n `mod` x == 0]
        isPrime :: Integer -> Bool
        isPrime n
            | n `elem` [0, 1] = False
            | n == 2 = True
            | n > 2 = null [ x | x <- [2..(ceiling . sqrt . fromIntegral) n], n `mod` x == 0]
            | otherwise = False
    in
        filter isPrime $ (factors n)
This is a working implementation to get the prime factors of a number. However, as you can see, it only outputs the prime factors. I am not sure how to record how many times each factor occurs. Also, since explicit iteration is unidiomatic in Haskell, I don't know how I would implement the solution. In Python, I would do:
def pf(number):
    factors=[]
    d=2
    while(number>1):
        while(number%d==0):
            factors.append(d)
            number=number//d
        d+=1
    return factors
So, the question: How to implement the powers of the prime factors?
NOTE:
I already saw: Prime factorization of a factorial however that does not answer my question.
This is NOT a homework problem, I am learning independently.
You can always replace imperative-language loops (as long as they don't meddle with any global state) with recursion. That may not be the most elegant approach, but in this case it seems perfectly appropriate to imitate your inner Python loop with a recursive function:
dividerPower :: Integer -> Integer -> Int
dividerPower n d
    | n`rem`d == 0 = 1 + dividerPower (n`quot`d) d
    | otherwise = 0
(This counts “backwards” compared to the Python loop. You could also make it tail-recursive with a helper function and count forwards over an accumulator variable, but that's more awkward and I don't think there's a memory/performance benefit that would justify it in this case.)
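For comparison, the tail-recursive variant mentioned in the parenthesis could look roughly like this. It is only a sketch: the name dividerPowerAcc and the helper go are made up for illustration, and it assumes n >= 1 and d >= 2 (with n = 0 it would loop forever, since 0 is divisible by everything).

```haskell
-- Hypothetical accumulator version of dividerPower.
-- Assumes n >= 1 and d >= 2.
dividerPowerAcc :: Integer -> Integer -> Int
dividerPowerAcc = go 0
  where
    -- acc counts how many times d has been divided out so far
    go acc n d
      | n `rem` d == 0 =
          let acc' = acc + 1
          in acc' `seq` go acc' (n `quot` d) d  -- force the counter, then loop
      | otherwise = acc
```

For example, dividerPowerAcc 120 2 evaluates to 3, because 120 = 2^3 * 15.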
You can either use that together with your Haskell code (for each of the factors you've already found, check how often it occurs), or extend it so the whole thing works like the Python solution (which is actually a lot more efficient, because it avoids checking for every number whether it's prime). For that you just need to return the final n as part of the result. Let's use a where block for handling the pattern matching, and also combine the rem and quot into a single quotRem call:
dividePower :: Integer -> Integer -> (Integer, Int)
dividePower n d
    | r == 0 = (nfin, p'+1)
    | otherwise = (n, 0)
  where (n', r) = n `quotRem` d
        (nfin, p') = dividePower n' d
Then the equivalent to your Python code is
pf :: Integer -> [(Integer, Int)]
pf = go 2
  where go d n
          | n>1 = (d, p) : go (d+1) n'
          | otherwise = []
          where (n', p) = dividePower n d
This actually gives you, unlike the Python version, a list that also includes non-dividers (with power 0). To avoid that, change the list-building to
          | n>1 = (if p>0 then ((d,p):) else id) $ go (d+1) n'
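Put together, the pieces from this answer could be assembled roughly as follows. This is a sketch: the top-level name primeFactorization is borrowed from the question, and the power is kept as Int (as in the answer) rather than the Integer the question's signature asks for.

```haskell
-- Divide n by d as often as possible; return what is left of n
-- together with the number of times d divided it.
dividePower :: Integer -> Integer -> (Integer, Int)
dividePower n d
  | r == 0    = (nfin, p' + 1)
  | otherwise = (n, 0)
  where
    (n', r)    = n `quotRem` d
    (nfin, p') = dividePower n' d

-- Trial division starting at 2, keeping only divisors with power > 0.
primeFactorization :: Integer -> [(Integer, Int)]
primeFactorization = go 2
  where
    go d n
      | n > 1     = (if p > 0 then ((d, p) :) else id) (go (d + 1) n')
      | otherwise = []
      where (n', p) = dividePower n d
```

With this, primeFactorization 120 yields [(2,3),(3,1),(5,1)], matching the example in the question.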

How to create a list of list of numbers in Haskell

I need to create a function that makes a board of size row and column and then populate that with zeros.
mkBoard 2 3 would make
[[0,0,0],[0,0,0]]
I don't really know where to start as I am new to Haskell programming I was thinking that the function would be something like this:
mkBoard m n= [m | ???? take n (repeat 0)]
But I am not sure if this would be the correct approach.
Thank you for your help.
As @arrowd already mentioned, there is a replicate function for take n (repeat x). You can create one row of your board with replicate n 0, and then create a board of such rows like mkBoard m n = replicate m (replicate n 0).
Also you can create more generic version of this function to fill the field with any value:
genericMkBoard :: a -> Int -> Int -> [[a]]
genericMkBoard x m n = replicate m (replicate n x)
And define your mkBoard with partial application:
mkBoard = genericMkBoard 0
UPD: Also you can make it like this to be more consistent with replicate (thanks to @chepner):
replicate2D m n x = replicate m (replicate n x)
mkBoard m n = replicate2D m n 0
There is no need to use list comprehension here. We can make use of replicate :: Int -> a -> [a] that will convert an Int n and an a to a list that repeats that value n times.
We thus can construct a list of three 0s with replicate 3 0. We can then use replicate a second time to construct a list that contains that list two times for example.
mkBoard :: Num n => Int -> Int -> [[n]]
mkBoard m n = replicate m (replicate n 0)

Split a tuple into n parts

I am trying to create a function that receives a range of doubles (Double, Double) and an n (Int), and divides this interval into n equal parts. If the input were a list I would know how to split it, but since it is a tuple of Doubles, I'm not sure what to do.
Thank you for any help
This is similar to @mschmidt's answer, but I think a list comprehension is probably clearest:
intervals :: Int -> (Double,Double) -> [(Double,Double)]
intervals n (a,b) =
    let n' = fromIntegral n
        d = (b - a) / n'
    in [(a + i*d, a + (i+1)*d) | i <- [0..n'-1]]
giving:
> intervals 4 (1,10)
[(1.0,3.25),(3.25,5.5),(5.5,7.75),(7.75,10.0)]
>
If the duplicate calculation of the endpoint offends you, you could write:
intervals' :: Int -> (Double,Double) -> [(Double,Double)]
intervals' n (a,b) =
    let n' = fromIntegral n
        d = (b - a) / n'
        x = [a + i*d | i <- [0..n']]
    in zip x (tail x)
Note that zip x (tail x) is a pretty standard way to get tuples of consecutive pairs of a list:
> let x = [1,2,3,4] in zip x (tail x)
[(1,2),(2,3),(3,4)]
>
A rough sketch, probably not the most elegant solution:
Take the two input doubles (I call them l and u) and compute the width of the input range/interval.
You want to compute n output ranges of equal width w. Compute this w by dividing the input width by n.
Build a list of the n+1 boundary values l+0*w, l+1*w, l+2*w, ..., l+n*w (n intervals need n+1 boundaries; the last one equals u).
Build the list of output tuples by combining the first two items in the list into a tuple. Drop one element of the list. Continue until only one element remains.
Try to catch all possible errors
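The steps of this sketch could be written out as follows. The function name splitRange and the choice of Maybe for error reporting are assumptions made for illustration; "all possible errors" is interpreted here as a non-positive n or an upper bound below the lower bound.

```haskell
-- Hypothetical implementation of the steps above.
-- Returns Nothing for n <= 0 or when the upper bound is below the lower one.
splitRange :: Int -> (Double, Double) -> Maybe [(Double, Double)]
splitRange n (l, u)
  | n <= 0 || u < l = Nothing
  | otherwise       = Just (zip points (drop 1 points))
  where
    -- width of each output interval
    w = (u - l) / fromIntegral n
    -- the n+1 boundary values l, l+w, ..., l+n*w
    points = [l + fromIntegral k * w | k <- [0 .. n]]
```

Pairing each boundary with its successor via zip replaces the "combine the first two, drop one, repeat" loop from the description.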

How can I produce a fixed length of numbers that sum up a given number in Haskell

I'm new to haskell world and wanted to know, given any positive integer and number of digits between 1-9 how can I find the combination of numbers that sum into the positive integer using the provided number of digits in Haskell. For example,
4 using two digits can be represented as a list of [[2,2],[3,1]] using three digits as a list of [[1,1,2]],
5 using two digits can be represented as a list of [[2,3],[4,1]] using three digits as a list of [[1,1,3],[2,2,1]]
Assuming that you want to avoid a brute-force approach, this can be regarded as a typical dynamic-programming problem:
import Data.Array
partitions :: Int -> Int -> [[Int]]
partitions m n = table ! (m, n, 9)
  where
    table = listArray ((1, 1, 1), (m, n, 9)) l
    l = [f i j k | i <- [1 .. m], j <- [1 .. n], k <- [1 .. 9]]
    f i 1 k = if i > k `min` 9 then [] else [[i]]
    f i j k = [d : ds | d <- [1 .. k `min` pred i], ds <- table ! (i - d, j - 1, d)]
The idea is to construct a three-dimensional lazy array table in which a cell with index (i, j, k) contains all partitions ds of the positive integer i into lists of j digits drawn from [1 .. k] such that sum ds == i.
For example:
> partitions 4 2
[[2,2],[3,1]]
> partitions 4 3
[[2,1,1]]
> partitions 5 2
[[3,2],[4,1]]
> partitions 5 3
[[2,2,1],[3,1,1]]
If you really don't want to think about the problem, and you really should because dynamic programming is good brain food, then you can ask the computer to be smart on your behalf. For example, you could use a tool called an SMT solver to which the sbv package gives you easy access.
Encoding Partitioning in SBV
A great advantage of solvers is that you merely need to express the problem, not the solution. In this case let's declare some number of integers (given by len) with values in 1..9 that sum to a known result (sumVal):
intPartitions :: Int -> Int -> IO AllSatResult
intPartitions sumVal len = allSat $ do
    xs <- mapM exists [show i | i <- [1..len]] :: Symbolic [SWord32]
    mapM (constrain . (.< 10)) xs
    mapM (constrain . (.> 0)) xs
    return $ sum xs .== fromIntegral sumVal
Calling this function is rather simple: we just have to import the right libraries and print out what are called the satisfying "models" for our problem:
import Data.SBV
import Data.List (nub,sort)
main = do
    res <- intPartitions 5 3
    print (nub (map sort (extractModels res :: [[Word32]])))
Notice I sorted and eliminated duplicate solutions because you didn't seem to care that [1,1,3], [3,1,1] etc were all solutions - you just want one permutation of the resulting assignments.
For these hard-coded values we have a result of:
[[1,1,3],[1,2,2]]
Well a simple brute force does the trick:
import Data.List
import Control.Monad
sums :: Int -> Int -> [[Int]]
sums number count = nub . map sort . filter ((==number) . sum) $ replicateM count [1..number+1-count]
Note that this is very inefficient. The nub . map sort step only removes duplicate combinations from the result; it does not reduce the number of candidates that are generated.
This is usually solved by using dynamic programming to avoid recomputing common sub-problems. But this is not the most important problem here: you need to start by coming up with the recursive algorithm! You will have plenty of time to think about producing an efficient solution once you've solved that problem. Hence this answer in two steps. The whole gist without comments is available here.
I start off by giving names to types because I'd get confused with all the Ints floating around and I consider types to be documentation. You might be more clever than I am and not need all this extra stuff.
type Target = Int
type Digits = Int
type MaxInt = Int
Now, the bruteforce solution: We're given the number of Digits left to partition a number, the Target number and the MaxInt we may use in this partition.
partitionMaxBrute :: Digits -> Target -> MaxInt -> [[Int]]
partitionMaxBrute d t m
If we have no digits left and the target is zero, we're happy!
    | d == 0 && t == 0 = [[]]
If the product of Digits by MaxInt is smaller than Target, or if MaxInt itself is not positive, there is no way we can succeed in accumulating Digits non-zero numbers! :(
    | d * m < t || m <= 0 = []
If MaxInt is bigger than our Target then we had better decrease MaxInt if we want to have a solution. It does not make sense to decrease it to anything bigger than Target + 1 - Digits.
    | t < m = partitionMaxBrute d t (t + 1 - d)
Finally, we may either lower MaxInt (we are not using that number) or subtract MaxInt from Target and keep going (we are using MaxInt at least once):
    | otherwise = partitionMaxBrute d t (m - 1)
               ++ fmap (m :) (partitionMaxBrute (d - 1) (t - m) m)
Given that solution, we can get our brute force partition: it's the one where the MaxInt we start with is Target + 1 - Digits which makes sense given that we are expecting a list of Digits non-zero numbers.
partitionBrute :: Digits -> Target -> [[Int]]
partitionBrute d t = partitionMaxBrute d t (t + 1 - d)
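For convenience, here is the brute-force version from the steps above collected into one block that can be loaded as is. Nothing is added beyond the guards already shown; only the comments are new.

```haskell
type Target = Int
type Digits = Int
type MaxInt = Int

partitionMaxBrute :: Digits -> Target -> MaxInt -> [[Int]]
partitionMaxBrute d t m
  -- nothing left to place and nothing left to reach: one (empty) solution
  | d == 0 && t == 0      = [[]]
  -- even d copies of m cannot reach t, or no positive number is allowed
  | d * m < t || m <= 0   = []
  -- m is larger than the target: shrink it to the largest useful value
  | t < m                 = partitionMaxBrute d t (t + 1 - d)
  -- either skip m entirely, or use m once and keep going
  | otherwise             = partitionMaxBrute d t (m - 1)
                         ++ fmap (m :) (partitionMaxBrute (d - 1) (t - m) m)

partitionBrute :: Digits -> Target -> [[Int]]
partitionBrute d t = partitionMaxBrute d t (t + 1 - d)
```

Evaluating partitionBrute 3 5 gives [[2,2,1],[3,1,1]], and partitionBrute 2 4 gives [[2,2],[3,1]]. Note that the argument order is digits first, then target.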
Now comes the time of memoization: dynamic programming is taking advantage of the fact that the smaller problems we solve are discovered through a lot of different paths and we do not need to recompute the answer over and over again. Easy caching is made possible by the memoize package. We simply write the same function with its recursive calls abstracted:
partitionMax :: (Digits -> Target -> MaxInt -> [[Int]]) ->
                Digits -> Target -> MaxInt -> [[Int]]
partitionMax rec d t m
    | d == 0 && t == 0 = [[]]
    | d * m < t || m <= 0 = []
    | t < m = rec d t (t + 1 - d)
    | otherwise = rec d t (m - 1)
               ++ fmap (m :) (rec (d - 1) (t - m) m)
And make sure that we cache the values:
partition :: Digits -> Target -> [[Int]]
partition d t = memoPM d t (t + 1 - d)
  where memoPM = memoize3 $ partitionMax memoPM
You can produce all partitions directly:
type Count = Int
type Max = Int
type Target = Int
partitions :: Count -> Max -> Target -> [[Int]]
partitions 0 m 0 = [[]]
partitions k m n = do
    let m' = min m (n - k + 1)
    d <- takeWhile (\d -> n <= k * d) [m', m' - 1 .. 1]
    map (d:) $ partitions (k - 1) d (n - d)
It's easy to check that there are no redundant cases. We just need to replace do with redundant $ do, where redundant is
redundant [] = [[]]
redundant xs = xs
If partitions (k - 1) d (n - d) returned [], then redundant would turn it into [[]], and then map (d:) $ partitions (k - 1) d (n - d) would be equal to [[d]]. But the output doesn't change with the redundant function, so all partitions are generated directly, without dead ends.
The code is pretty simple and fast, since you want to produce partitions, rather than count them.

How to write this as a single function?

I'm having a go at project Euler Q3 and need to get the largest prime factor of a number. So far I've gotten a pair of functions to return a list of all the factors of the given number but it seems like a really bad way to do it (partly because I only need the largest).
get_factors :: (Integral a) => a -> [a] -> [a]
get_factors _ [] = []
get_factors t (x:xs)
    | t `mod` x == 0 = x:get_factors t xs
    | otherwise = get_factors t xs
factors :: (Integral a) => a -> [a]
factors x = get_factors x [x,x-1..1]
> factors 1000
> [1000,500,250,200,125,100,50,40,25,20,10,8,5,4,2,1]
It seems weird to me that I would need to have a "launch" function if you will to start the recursive function off (or have a function where I have to pass it the same value twice, again, seems silly to me).
Can you point me in the right direction of how I should be going about doing this please?
You should try to recognize that what you're doing here, namely picking elements from a list which satisfy some condition, is a very common pattern. This pattern is implemented by the filter function in the Prelude.
Using filter, you can write your function as:
factors n = filter (\d -> n `mod` d == 0) [n, n-1 .. 1]
or, equivalently, you can use a list comprehension:
factors n = [d | d <- [n, n-1 .. 1], n `mod` d == 0]
Using a "launch" function for calling a recursive function is very common in Haskell, so don't be afraid of that. Most often it'd be written as
f = g someArgument
  where
    g = ...
in your case
factors :: (Integral a) => a -> [a]
factors x = get_factors [x,x-1..1]
  where
    get_factors [] = []
    get_factors (y:ys)
        | x `mod` y == 0 = y : get_factors ys
        | otherwise = get_factors ys
This signals readers of your code that get_factors is used only here and nowhere else, and helps you to keep the code clean. Also get_factors has access to x, which simplifies the design.
Some other ideas:
It's inefficient to try dividing by all numbers. In problems like this it's much better to pre-compute a list of primes and factor using that list. There are many ways to compute such a list, but for educational purposes I'd suggest writing your own (this will come in handy for other Project Euler problems). Then you can take the primes less than or equal to x and try dividing by them.
When searching just for the largest factor, you would have to search through all primes between 1 and x. But if x is composite, one of its factors must be <= sqrt(x). You can use this to construct a significantly better algorithm.
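A minimal sketch of both ideas combined, assuming a simple trial-division prime list is good enough for this input size. The names primes, isPrime, primeFactors and largestPrimeFactor are illustrative and not taken from the question.

```haskell
-- Lazy, infinite list of primes by trial division against earlier primes.
primes :: [Integer]
primes = 2 : filter isPrime [3, 5 ..]

-- n is prime if no prime p with p*p <= n divides it.
isPrime :: Integer -> Bool
isPrime n = n > 1 && all (\p -> n `mod` p /= 0) (takeWhile (\p -> p * p <= n) primes)

-- Prime factors in ascending order, with repetition.
primeFactors :: Integer -> [Integer]
primeFactors = go primes
  where
    go [] _ = []                          -- unreachable: primes is infinite
    go (p:ps) m
      | m < 2          = []
      | p * p > m      = [m]              -- no divisor up to sqrt m: m is prime
      | m `mod` p == 0 = p : go (p:ps) (m `div` p)
      | otherwise      = go ps m

-- Assumes n >= 2, so that the factor list is non-empty.
largestPrimeFactor :: Integer -> Integer
largestPrimeFactor = last . primeFactors
```

For instance, primeFactors 13195 gives [5,7,13,29]. Because each found factor is divided out immediately, the search stops as soon as the remaining number is smaller than the square of the current prime.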
I do not think it is a very good idea to go through every number like [n, n-1..], since the problem asks about 600851475143.
largest_factors :: Integer -> Integer
largest_factors n = helper n 2
  where
    helper m p
        | m < p^2 = m
        | m == p = m
        | m `mod` p == 0 = helper (m `div` p) p
        | otherwise = helper m (p+1)
The idea is that once a number p is found to divide the current value, it is divided out immediately, so the remaining number keeps shrinking. This works fine on my computer and gave me the solution within a second.
