I want to add two positive numbers together without the use of any basic operators like + for addition. I've already worked my way around that (in the add''' function), I think; it may not be efficient, but that's not the point right now. I'm getting lots of type errors, however, which I have no idea how to handle. This is very confusing for me, as it works on paper and I've come from Python.
add 1245 7489
--add :: Int -> Int -> Int
add x y = add'' (zip (add' x) (add' y))
  where
    add' :: Int -> [Int]
    add' 0 = []
    add' x = add' (x `div` 10) ++ [x `mod` 10]
    -- conversion: 1245 -> [1,2,4,5], 7489 -> [7,4,8,9], then zipping them
    -- together: [(1,7),(2,4),...]
    add'' :: [(Int,Int)] -> [Int]
    add'' (x:xs) = [(add''' (head x) (last x))] ++ add'' xs
    -- summary: [8,6,...]; what happens when the sum reaches 10 is not
    -- implemented yet
      where
        --add''' :: (Int,Int) -> Int
        add''' x y = last (take (succ y) $ iterate succ x)
        -- adding two numbers together
You can't use head and last on tuples. (Frankly, you should never use these functions at all, because they're unsafe (partial), but they can at least be used on lists.) In Haskell, lists are something completely different from tuples. To get at the elements of a tuple, use pattern matching.
add'' ((x,y):xs) = [add''' x y] ++ add'' xs
(To get at the elements of a list, pattern matching is very often the best tool as well.) Alternatively, you can use fst and snd; these do on 2-tuples what you apparently thought head and last would.
Be clear which functions are curried and which aren't. The way you write add''', its type signature is actually Int -> Int -> Int. That is equivalent to (Int, Int) -> Int, but it's still not the same to the type checker.
The result of add'' is [Int], but you're trying to use this as Int in the result of add. That can't work, you need to translate from digits to numbers again.
add'' doesn't handle the empty case. That's fixed easily enough, but better than doing this recursion at all is using standard combinators. In your case, this is only supposed to work element-wise anyway, so you can simply use map – or do that right in the zipping, with zipWith. Then you also don't need to unwrap any tuples at all, because it works with a curried function.
A clean version of your attempt:
add :: Int -> Int -> Int
add x y = fromDigits 0 $ zipWith addDigits (toDigits x []) (toDigits y [])
  where
    fromDigits :: Int -> [Int] -> Int
    fromDigits acc [] = acc
    fromDigits acc (d:ds)
        = acc `seq`  -- strict accumulator, to avoid thunking
          fromDigits (acc*10 + d) ds
    toDigits :: Int -> [Int] -> [Int]                 -- yield difference-list,
    toDigits 0 = id                                   -- because we're consing
    toDigits x = toDigits (x`div`10) . ((x`mod`10):)  -- left-associatively
    addDigits :: Int -> Int -> Int
    addDigits x y = last $ take (succ x) $ iterate succ y
Note that zipWith requires both numbers to have the same number of digits (as does zip).
Also, yes, I'm using + in fromDigits, making this whole thing pretty futile. In practice you would of course use binary, then it's just a bitwise-or and the multiplication is a left shift. What you actually don't need to do here is take special care with 10-overflow, but that's just because of the cheat of using + in fromDigits.
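To make that concrete, here is the question's own example traced by hand (my annotation, not part of the original answer):

-- add 1245 7489, step by step:
--   toDigits 1245 [] = [1,2,4,5];   toDigits 7489 [] = [7,4,8,9]
--   zipWith addDigits [1,2,4,5] [7,4,8,9] = [8,6,12,14]   -- no carrying
--   fromDigits 0 [8,6,12,14] = ((8*10 + 6)*10 + 12)*10 + 14 = 8734
-- and 1245 + 7489 is indeed 8734: the over-10 "digits" are absorbed only
-- because fromDigits multiplies by 10 and adds, i.e. the cheat above.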
By head and last you meant fst and snd, but you don't need them at all, the components are right there:
add'' :: [(Int, Int)] -> [Int]
add'' (pair : pairs) = [add''' pair] ++ add'' pairs
  where
    add''' :: (Int, Int) -> Int
    add''' (x, y) = last (take (succ y) $ iterate succ x)
                  = iterate succ x !! y
                  = [x ..] !! y    -- nice idea for an exercise!
Now the big question that remains is what to do with those big scary 10-and-over numbers. Here's a thought: produce a digit and a carry with
= ([(d, 0) | d <- [x .. 9]] ++ [(d, 1) | d <- [0 ..]]) !! y
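For example, working the indexing by hand (my trace, not part of the answer):

-- with x = 7 and y = 5:
--   ([(d, 0) | d <- [7 .. 9]] ++ [(d, 1) | d <- [0 ..]]) !! 5
--   = [(7,0),(8,0),(9,0),(0,1),(1,1),(2,1),...] !! 5
--   = (2, 1)    -- digit 2 with carry 1, i.e. 7 + 5 = 12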
Can you take it from here? Hint: reverse order of digits is your friend!
The official answer my professor gave:
It works on positive and negative numbers too.
add 0 y = y
add x y
  | x > 0     = add (pred x) (succ y)
  | otherwise = add (succ x) (pred y)
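A quick hand trace (my check, not part of the professor's answer) showing both signs of the first argument:

-- add (-2) 5 = add (-1) 4 = add 0 3 = 3
-- add 2 (-5) = add 1 (-4) = add 0 (-3) = -3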
The other answers cover what's gone wrong in your approach. From a theoretical perspective, though, they each have some drawbacks: they either land you at [Int] and not Int, or they use (+) in the conversion back from [Int] to Int. What's more, they use mod and div as subroutines in defining addition -- which would be okay, but then to be theoretically sound you would want to make sure that you could define mod and div themselves without using addition as a subroutine!
Since you say efficiency is no concern, I propose using the usual definition of addition that mathematicians give, namely: 0 + y = y, and (x+1) + y = (x + y)+1. Here you should read +1 as a separate operation than addition, a more primitive one: the one that just increments a number. We spell it succ in Haskell (and its "inverse" is pred). With this theoretical definition in mind, the Haskell almost writes itself:
add :: Int -> Int -> Int
add 0 y = y
add x y = succ (add (pred x) y)
So: compared to other answers, we can take an Int and return an Int, and the only subroutines we use are ones that "feel" more primitive: succ, pred, and checking whether a number is zero or nonzero. (And we land at only three short lines of code... about a third as long as the shortest proposed alternative.) Of course the price we pay is very bad performance... try add (2^32) 0!
Like the other answers, this only works for positive numbers. When you are ready for handling negative numbers, we should chat again -- there's some fascinating mathematical tricks to pull.
I'm trying to write a Haskell library for cryptographically secure random numbers. The code follows:
module URandom (URandom, initialize) where

import qualified Data.ByteString.Lazy as B
import System.Random
import Data.Word

newtype URandom = URandom [Word8]

instance RandomGen URandom where
  next (URandom (x : xs)) = (fromIntegral x, URandom xs)
  split (URandom l) = (URandom (evens l), URandom (odds l))
    where evens (x : _ : xs) = x : evens xs
          odds  (_ : x : xs) = x : odds xs
  genRange _ = (fromIntegral (minBound :: Word8), fromIntegral (maxBound :: Word8))

initialize :: IO URandom
initialize = URandom . B.unpack <$> B.readFile "/dev/urandom"
Unfortunately, it's not behaving like I want. In particular, performing
take 10 . randoms <$> initialize
yields (something similar to)
[-4611651379516519433,-4611644973572935887,-31514321567846,9223361179177989878,-4611732094835278236,9223327886739677537,4611709625714976418,37194416358963,4611669560113361421,-4611645373004878170,-9223329383535098640,4611675323959360258,-27021785867556,9223330964083681227,4611705212636167666]
which to my, albeit untrained, eye does not appear very random. A lot of 46... and 92... in there.
What could be going wrong? Why doesn't this produce well-distributed numbers? It's worth noting that even if I concatenate Word8s together to form Ints the distribution does not improve; I didn't think it was worth including that code here.
Edit: here's some evidence that it's not distributed correctly. I've written a function called histogram:
histogram :: ∀ t . (Integral t, Bounded t)
          => [t] -> Int -> S.Seq Int
histogram [] buckets = S.replicate buckets 0
histogram (x : xs) buckets = S.adjust (+ 1) (whichBucket x) (histogram xs buckets)
  where whichBucket x = fromIntegral $ ((fromIntegral x * fromIntegral buckets) :: Integer) `div` fromIntegral (maxBound :: t)
and when I run
g <- initialize
histogram (take 1000000 $ randoms g :: [Word64]) 16
I get back
fromList [128510,0,0,121294,129020,0,0,122090,127873,0,0,120919,128637,0,0,121657]
Some of the buckets are completely empty!
The issue is a bug in random-1.0.1.1 that was fixed in random-1.1. The changelog points to this ticket. In particular, referring to the older version:
It also assumes that all RandomGen implementations produce the same range of random values as StdGen.
Here randomness is produced 8 bits at a time, and that caused the observed behavior.
random-1.1 fixed this:
This implementation also works with any RandomGen, even ones that produce as little as a single bit of entropy per next call or have a minimum bound other than zero.
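So the fix is simply to build against random >= 1.1; URandom itself doesn't need to change. As a sanity check, here is what to expect (my sketch of the expectation, not measured output):

-- with random >= 1.1, the question's own test
--   g <- initialize
--   histogram (take 1000000 $ randoms g :: [Word64]) 16
-- should fill all sixteen buckets with counts near 1000000/16 = 62500,
-- instead of leaving some of them empty.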
I wrote a function for evaluating a polynomial at a given number. The polynomial is represented as a list of coefficients (e.g. [1,2,3] corresponds to x^2+2x+3).
polyEval x p = sum (zipWith (*) (iterate (*x) 1) (reverse p))
As you can see, I first used a lot of parentheses to group which expressions should be evaluated. For better readability I tried to eliminate as many parentheses as possible using . and $. (In my opinion, more than two pairs of nested parentheses make code more and more difficult to read.) I know that function application has the highest priority and is left-associative. The . and $ are both right-associative, but . has precedence 9, while $ has precedence 0.
So it seemed to me that the following expression cannot be written with even fewer parentheses:
polyEval x p = sum $ zipWith (*) (iterate (*x) 1) $ reverse p
I know that we need parentheses for (*) and (*x) to convert them to prefix functions, but is it possible to somehow remove the parentheses around iterate (*x) 1?
Also what version would you prefer for readability?
I know that there are many other ways to achieve the same thing, but I'd like to discuss my particular example, as it has a function applied to two arguments (iterate (*x) 1) as the middle argument of another function that takes three arguments.
As usual with this sort of question I prefer the OP's version to any of the alternatives that have been proposed so far. I would write
polyEval x p = sum $ zipWith (*) (iterate (* x) 1) (reverse p)
and leave it at that. The two arguments of zipWith (*) play symmetric roles in the same way that the two arguments of * do, so eta-reducing is just obfuscation.
The value of $ is that it makes the outermost structure of the computation clear: the evaluation of a polynomial at a point is the sum of something. Eliminating parentheses should not be a goal in itself.
So it might be a little puerile, but I actually really like to think of Haskell's rules in terms of food. I think of Haskell's left-associative function application f x y = (f x) y as a sort of aggressive nom or greedy nom, in that the function f refuses to wait for the y to come around and immediately eats the x, unless you take the time to put these things in parentheses to make a sort of "argument sandwich" f (x y) (at which point the x, being uneaten, becomes hungry and eats the y). The only boundaries are the operators and the special forms.
Then within the boundaries of the special forms, the operators consume whatever is around them; finally the special forms take their time to digest the expressions around them. This is the only reason that . and $ are able to save some parentheses.
Finally this we can see that iterate (* x) 1 is probably going to need to be in a sandwich because we don't want something to just eat iterate and stop. So there is no great way to do that without changing that code, unless we can somehow do away with the third argument to zipWith -- but that argument contains a p so that requires writing something to be more point-free.
So, one solution is to change your approach! It makes a little more sense to store a polynomial as a list of coefficients in the already-reversed direction, so that your x^2 + 2 * x + 3 example is stored as [3, 2, 1]. Then we don't need to perform this complicated reverse operation. It also makes the mathematics a little simpler as the product of two polynomials can be rewritten recursively as (a + x * P(x)) * (b + x * Q(x)) which gives the straightforward algorithm:
newtype Poly f = Poly [f] deriving (Eq, Show)

instance Num f => Num (Poly f) where
  fromInteger n = Poly [fromInteger n]
  negate (Poly ps) = Poly (map negate ps)
  Poly f + Poly g = Poly $ summing f g where
    summing [] g = g
    summing f [] = f
    summing (x:xs) (y:ys) = (x + y) : summing xs ys
  Poly [] * _ = Poly []  -- base cases added here: the recursion in r below
  _ * Poly [] = Poly []  -- always bottoms out at an empty list
  Poly (x : xs) * Poly (y : ys) = prefix (x*y) (y_p + x_q) + r where
    y_p = Poly $ map (y *) xs
    x_q = Poly $ map (x *) ys
    prefix n (Poly m) = Poly (n : m)
    r = prefix 0 . prefix 0 $ Poly xs * Poly ys
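A quick GHCi check of the instance (hand-computed; remember the coefficients are stored lowest-order first):

> Poly [3,2,1] * Poly [1,1]    -- (x^2 + 2x + 3) * (x + 1)
Poly [3,5,3,1]                 -- = x^3 + 3x^2 + 5x + 3
> Poly [3,2,1] + 1             -- fromInteger lifts the literal to a Poly
Poly [4,2,1]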
Then your function
evaluatePoly :: Num f => Poly f -> f -> f
evaluatePoly (Poly p) x = eval p where
  eval = (sum .) . zipWith (*) $ iterate (x *) 1
lacks parentheses around iterate because the eval is written in pointfree style, so $ can be used to consume the rest of the expression. As you can see it unfortunately leaves some new parentheses around (sum .) to do this, though, so it might not be totally worth your while. I find the latter less readable than, say,
evaluatePoly (Poly coeffs) x = sum $ zipWith (*) powersOfX coeffs where
  powersOfX = iterate (x *) 1
I might even prefer to write the latter, if performance on high powers is not super-critical, as powersOfX = [x^n | n <- [0..]] or powersOfX = map (x^) [0..], but I think iterate is not too hard to understand in general.
Perhaps breaking it down to more elementary functions will simplify further. First define a dot product function to multiply two arrays (inner product).
dot x y = sum $ zipWith (*) x y
and change the order of terms in polyEval to minimize the parentheses:
polyEval x p = dot (reverse p) $ iterate (* x) 1
This gets us down to three pairs of parentheses.
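A quick check in GHCi (values worked out by hand):

> dot [1,2,3] [4,5,6]    -- 1*4 + 2*5 + 3*6
32
> polyEval 2 [1,2,3]     -- 2^2 + 2*2 + 3
11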
I'm using Project Euler to learn Haskell. I'm new at Haskell and am having a lot of trouble coming up with an algorithm that doesn't take an absurd amount of time. I'm estimating that the program here would take 14 gigayears to arrive at the solution.
The problem:
Which prime, below one-million, can be written as the sum of the most
consecutive primes?
Here's my source. I've left out isPrime. I've posted it because it's far too inefficient to solve the problem. I think the issue lies with the slicedChains and primeChain calls, but I'm not sure what it is. I've solved this before in C++, but for whatever reason the efficient solution seems beyond me in Haskell.
Edit: I've included isPrime.
import System.Environment (getArgs)
import Data.List (nub, maximumBy)
import Data.Ord (comparing)

isPrime :: Integer -> Bool
isPrime 1 = False
isPrime 2 = True
isPrime x
  | any (== 0) (fmap (x `mod`) [2..x-1]) = False
  | otherwise = True

primeChain :: Integer -> [Integer]
primeChain x = [ n | n <- 1 : 2 : [3,5..x-1], isPrime n ]

slice :: [a] -> [Int] -> [a]
slice xs args = take (to - from + 1) (drop from xs)
  where from = head args
        to   = last args

subsequencesOfSize :: Int -> [a] -> [[a]]
subsequencesOfSize n xs = let l = length xs
                          in if n > l then [] else subsequencesBySize xs !! (l - n)
  where
    subsequencesBySize [] = [[[]]]
    subsequencesBySize (x:xs) = let next = subsequencesBySize xs
                                in zipWith (++) ([]:next) (map (map (x:)) next ++ [[]])

slicedChains :: Int -> [Integer] -> [[Integer]]
slicedChains len xs = nub [x | x <- fmap (xs `slice`) subseqs, length x > 1]
  where subseqs = [x | x <- subsequencesOfSize 2 [1..len], (last x) > (head x)]

primeSums :: Integer -> [[Integer]]
primeSums x = filter (\ns -> sum ns == x) chain
  where xs    = primeChain x
        len   = length xs
        chain = slicedChains len xs

compLength :: [[a]] -> [a]
compLength xs = maximumBy (comparing length) xs

cleanSums :: [Integer] -> [[Integer]]
cleanSums xs = fmap compLength filtered
  where filtered = filter (not . null) (fmap primeSums xs)

main :: IO ()
main = do
  args <- getArgs
  let arg = read (head args) :: Integer
  let xs = primeChain arg
  print $ maximumBy (comparing length) $ cleanSums xs
Your basic problem is that you are not pruning your search space based on the best solution you have found so far.
I can tell this just from the fact that you are using maximumBy to find the longest sequence.
For instance, if during your search you find a consecutive sequence of 4 primes whose sum is a prime < 10^6, you don't have to examine any sequence which begins with a prime greater than 250000.
To do this kind of pruning you have to keep track of the solution found so far and interleave the testing of candidate sequences with their generation so that the best solution found so far can stop the search early.
Update
There are several inefficiencies in slicedChains. Haskell lists are implemented as linked lists. This video is a pretty good overview of linked lists and how they differ from arrays: (link)
The following expressions in your code are going to be problematic w.r.t. efficiency:
* nub has quadratic running time
* length x > 1 - the complexity of length is O(n) where n is the length of the list. A better way to write this is:
lengthGreaterThan1 :: [a] -> Bool
lengthGreaterThan1 (_:_:_) = True
lengthGreaterThan1 _ = False
* subsequencesOfSize 2 [1..len] may be more succinctly written:
[ [a,b] | a <- [1..len], b <- [a+1..len] ]
and this will also ensure that a < b.
* The take and drop calls in slice are also O(n)
* In primeSums the call to primeChain will regenerate essentially the same list over and over again resulting in a lot of multiple calls to isPrime. A better approach is to define primeChain like this:
allPrimes = filter isPrime [1..]
primeChain x = takeWhile (<= x) allPrimes
The list allPrimes will be generated once, and primeChain simply takes prefixes of that list.
* primeSums x is charged with finding sequences whose sum is exactly x, but it looks at a lot of sequences that can't possibly work. For instance, primeSums 31 will examine:
11 + 13 + 17, 11 + 13 + 17 + 23, 11 + 13 + 17 + 23 + 29,
17 + 19, 17 + 19 + 23, 17 + 19 + 23 + 29,
19 + 23, 19 + 23 + 29
23 + 29
even though it's pretty obvious that none of these sums could equal 31.
So the first thing you need is a good data structure: Once you find a sequence of length n you don't care about sequences of shorter length, so your primary needs are: (1) tracking the sum, (2) tracking the primes in the set, (3) removing the least element, (4) adding a new greatest element. The key is amortization, where a big cost is paid infrequently enough that you can pretend it is a small cost per procedure. The data structure looks like this:
data Queue x = Q [x] [x]

q_empty (Q [] []) = True
q_empty _         = False

q_headtails (Q (x:xs) rest) = (x, Q xs rest)
q_headtails (Q [] xs)       = case reverse xs of
                                y:ys -> (y, Q ys [])
                                []   -> error "End of queue."

q_append el (Q beg end) = Q beg (el:end)
So deconstructing the list is possible, but sometimes triggers an O(n) operation, but that's OK because when it does, we won't have to do it for another n steps, so it averages out to one operation per step. (You might also want to do it with a spine-strict list.)
To save on length operations and summing the items of the list you probably want to cache those, too:
type Length = Int
type Sum    = Int
type Prime  = Int

data PrimeSeq = PS Length Sum (Queue Prime)

headTails (PS len sum q) = (x, PS (len - 1) (sum - x) xs)
  where (x, xs) = q_headtails q

append x (PS len sum xs) = PS (len + 1) (sum + x) (q_append x xs)
The algorithm for these looks like:

1. Cache a copy of the PrimeSeq you're starting with.
2. Keep adding primes to it and testing primality until you get to 10^6.
3. If you find a new prime with a longer sequence, replace the cache.
4. Whenever you run into 10^6, revert to the cache, pull a prime off the front of the queue, then repeat as needed.
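Here is a minimal sketch of that loop, assuming the Queue/PrimeSeq code above plus an ascending infinite list primes and a predicate isPrime :: Int -> Bool; the names emptyPS and searchBest are mine, not part of the answer:

emptyPS :: PrimeSeq
emptyPS = PS 0 0 (Q [] [])

-- Longest run of consecutive primes whose sum is itself a prime below bound.
searchBest :: Int -> [Prime] -> PrimeSeq
searchBest bound = go emptyPS emptyPS
  where
    go best cur@(PS _ s _) (p:ps)
      | s + p < bound =
          -- still below the bound: extend the window with p
          let cur'@(PS len' s' _) = append p cur
              best' | isPrime s' && len' > len best = cur'
                    | otherwise                     = best
          in go best' cur' ps
      | otherwise = case cur of
          PS 0 _ _ -> best                     -- window empty, p >= bound: done
          _        -> let (_, cur') = headTails cur
                      in go best cur' (p:ps)   -- drop the least prime, retry p
    go best _ [] = best
    len (PS n _ _) = n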
Your prime generation is quadratic (isPrime 101 tests rem 101 100 == 0 even though 10 is the biggest number by which 101 needs to be tested -- and actually 7 is enough).
Yet even with it, a simple enough list-based code finds the answer in under 2 seconds (on an Intel Core i7 2.5 GHz, interpreted in GHCi). And with the code corrected to take advantage of the above mentioned optimization (and additionally, testing by primes only), it takes 0.1s.
Also, f x | t = False | otherwise = True is the same as f x = not t.
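Applying both remarks to the question's isPrime gives something like this (my sketch, not code from the answer):

isPrime :: Integer -> Bool
isPrime 1 = False
isPrime 2 = True
isPrime x = not (any ((== 0) . (x `mod`)) [2 .. isqrt x])
  where isqrt = floor . sqrt . fromIntegral   -- test only up to the square root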
We are asked by the PE site not to give you even a hint.
But in general, the key to efficiency in Haskell, thanks to its laziness, is being generative with as small a duplication of effort as possible. As one example, instead of calculating each slice of a list in isolation starting anew, we can produce the bunch of them together as part of one process,
slices :: Int -> [a] -> [[a]]
slices n = map (take n) . iterate tail -- sequence of list's slices of length n each
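For example, on an infinite list (a quick GHCi check):

> take 4 (slices 3 [1..])
[[1,2,3],[2,3,4],[3,4,5],[4,5,6]]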
Another principle is, try to solve a more general problem, of which yours is an instance.
Having written such a function, we can play with it by trying out different values for its parameters, from smaller to the bigger ones, for an exploratory style of problem solving. We're told about 21 consecutive primes. What about 22 of them? 27? 1127 of them? ... and I've said enough about this already.
If it starts taking too much time, we can assess the full solution's needed run time by empirical orders of growth analysis.
Though the solution is found quickly enough with your unoptimized isPrime code, the exploratory process can be prohibitively slow with it, but it is fast enough with the optimized code:
primes :: [Int]
primes = 2 : filter isPrime [3,5..]
isPrime n = and [rem n p > 0 | p <- takeWhile ((<= n).(^2)) primes]
I needed an efficient sliding window function in Haskell, so I wrote the following:
windows n xz@(x:xs)
  | length v < n = []
  | otherwise    = v : windows n xs
  where
    v = take n xz
My problem with this is that I think the complexity is O(n*m), where m is the length of the list and n is the window size. You walk down the list once for take, another time for length, and you do this essentially m-n times. It seems like it can be more efficient than this, but I'm at a loss for how to make it more linear. Any takers?
You can't get better than O(m*n), since this is the size of the output data structure.
But you can avoid checking the lengths of the windows if you reverse the order of operations: First create n shifted lists and then just zip them together. Zipping will get rid of those that don't have enough elements automatically.
import Control.Applicative
import Data.Traversable (sequenceA)
import Data.List (tails)
transpose' :: [[a]] -> [[a]]
transpose' = getZipList . sequenceA . map ZipList
Zipping a list of lists is just a transposition, but unlike transpose from Data.List it throws away outputs that would have less than n elements.
Now it's easy to make the window function: Take m lists, each shifted by 1, and just zip them:
windows :: Int -> [a] -> [[a]]
windows m = transpose' . take m . tails
Works also for infinite lists.
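For example (a quick GHCi check):

> windows 3 [1..5]
[[1,2,3],[2,3,4],[3,4,5]]
> take 4 (windows 3 [1..])
[[1,2,3],[2,3,4],[3,4,5],[4,5,6]]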
You can use Seq from Data.Sequence, which has O(1) enqueue and dequeue at both ends:
import Data.Foldable (toList)
import qualified Data.Sequence as Seq
import Data.Sequence ((|>))

windows :: Int -> [a] -> [[a]]
windows n0 = go 0 Seq.empty
  where
    go n s (a:as) | n' <  n0  =              go n' s'  as
                  | n' == n0  = toList s'  : go n' s'  as
                  | otherwise = toList s'' : go n  s'' as
      where
        n'  = n + 1         -- O(1)
        s'  = s |> a        -- O(1)
        s'' = Seq.drop 1 s' -- O(1)
    go _ _ [] = []
Note that if you materialize the entire result your algorithm is necessarily O(N*M) since that is the size of your result. Using Seq just improves performance by a constant factor.
Example use:
>>> windows 3 [1..5]
[[1,2,3],[2,3,4],[3,4,5]]
First let's get the windows without worrying about the short ones at the end:
import Data.List (tails)
windows' :: Int -> [a] -> [[a]]
windows' n = map (take n) . tails
> windows' 3 [1..5]
[[1,2,3],[2,3,4],[3,4,5],[4,5],[5],[]]
Now we want to get rid of the short ones without checking the length of every one.
Since we know they are at the end, we could lose them like this:
windows n xs = take (length xs - n + 1) (windows' n xs)
But that's not great since we still go through xs an extra time to get its length. It also doesn't work on infinite lists, which your original solution did.
Instead let's write a function for using one list as a ruler to measure the amount to take from another:
takeLengthOf :: [a] -> [b] -> [b]
takeLengthOf = zipWith (flip const)
> takeLengthOf ["elements", "get", "ignored"] [1..10]
[1,2,3]
Now we can write this:
windows :: Int -> [a] -> [[a]]
windows n xs = takeLengthOf (drop (n-1) xs) (windows' n xs)
> windows 3 [1..5]
[[1,2,3],[2,3,4],[3,4,5]]
Works on infinite lists too:
> take 5 (windows 3 [1..])
[[1,2,3],[2,3,4],[3,4,5],[4,5,6],[5,6,7]]
As Gabriella Gonzalez says, the time complexity is no better if you want to use the whole result. But if you only use some of the windows, we now manage to avoid doing the work of take and length on the ones you don't use.
If you want O(1) length then why not use a structure that provides O(1) length? Assuming you aren't looking for windows from an infinite list, consider using:
import qualified Data.Vector as V
import Data.Vector (Vector)
import Data.List (unfoldr)

windows :: Int -> [a] -> [[a]]
windows n = map V.toList . unfoldr go . V.fromList
  where
    go xs | V.length xs < n = Nothing
          | otherwise       = let (a, b) = V.splitAt n xs
                              in Just (a, b)
Conversion of each window from a vector to a list might bite you some; I won't hazard an optimistic guess there, but I will bet that the performance is better than the list-only version.
For the sliding window I also used unboxed Vectors, since length, take, drop, and splitAt are all O(1) operations.
The code from Thomas M. DuBuisson produces windows shifted by n rather than a sliding window, except when n = 1. Fixing that requires a (++), which costs O(n+m), so be careful where you put it.
{-# LANGUAGE BangPatterns #-}
import qualified Data.Vector.Unboxed as V
import Data.Vector.Unboxed (Vector)
import Data.List (unfoldr)

windows :: Int -> Vector Double -> [[Double]]
windows n = unfoldr go
  where
    go !xs | V.length xs < n = Nothing
           | otherwise =
               let (a, b) = V.splitAt 1 xs                       -- pop one element
                   c      = V.toList a ++ V.toList (V.take (n-1) b)
               in Just (c, b)                                    -- window of n, advance by 1
I tried it out with +RTS -sstderr and
putStrLn $ show (sum $ concat $ windows 10 (V.fromList [1..1000000]))
and got a real time of 1.051s with 96.9% usage, keeping in mind that after the sliding window two O(m) operations (concat and sum) are performed.