I need to write a function or functions in Haskell that can solve the Chinese Remainder Theorem. It needs to be created with the following definition:
crt :: [(Integer, Integer)] -> (Integer, Integer)
so that the answer looks like this:
>crt [(2,7), (0,3), (1,5)]
(51, 105)
I think I have the overall idea; I just don't have the knowledge to write it. I know that the crt function must be recursive. I have created a helper function that splits the list of tuples into a list of the remainders and the product of the moduli:
crtSplit xs = (map fst xs, product(map snd xs))
Which, in this example, gives me:
([2,0,1],105)
I think what I need to do now is create a list for each of the elements in the first list. How would I begin to do this?
The Chinese remainder theorem has an algebraic solution, based on the fact that x ≡ r1 (mod m1) and x ≡ r2 (mod m2) can be reduced to a single congruence modulo m1 * m2 if m1 and m2 are coprime.
To do so you need to know what a modular inverse is and how it can be calculated efficiently using the extended Euclidean algorithm.
If you put these pieces together, you can solve the Chinese remainder theorem with a right fold:
crt :: (Integral a, Foldable t) => t (a, a) -> (a, a)
crt = foldr go (0, 1)
  where
    go (r1, m1) (r2, m2) = (r `mod` m, m)
      where
        r = r2 + m2 * (r1 - r2) * (m2 `inv` m1)
        m = m2 * m1
    -- Modular Inverse
    a `inv` m = let (_, i, _) = gcd a m in i `mod` m
    -- Extended Euclidean Algorithm (locally shadows Prelude's gcd)
    gcd 0 b = (b, 0, 1)
    gcd a b = (g, t - (b `div` a) * s, s)
      where (g, s, t) = gcd (b `mod` a) a
> crt [(2,7), (0,3), (1,5)]
(51,105)
> crt [(2,3), (3,4), (1,5)] -- wiki example
(11,60)
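A quick way to sanity-check any crt implementation is to test the returned remainder against every input congruence. The sketch below is only an illustration; the names satisfiesAll and crtModulus are made up here and are not part of the answer above.

```haskell
-- Hypothetical helpers for checking a CRT result; not from the original answer.

-- | True when x leaves remainder r modulo m for every (r, m) pair.
satisfiesAll :: [(Integer, Integer)] -> Integer -> Bool
satisfiesAll congruences x = all (\(r, m) -> x `mod` m == r `mod` m) congruences

-- | The combined modulus, assuming the moduli are pairwise coprime.
crtModulus :: [(Integer, Integer)] -> Integer
crtModulus = product . map snd
```

For the example from the question, satisfiesAll [(2,7), (0,3), (1,5)] 51 is True and crtModulus [(2,7), (0,3), (1,5)] is 105.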
Without going into algebra, you can also solve this with brute force. Perhaps that's what you've been asked to do.
For your example, create a list for each modulus, independently of the other two. The upper bound is the least common multiple of the moduli which, assuming they are coprime as a precondition, is their product, i.e. 105. These three lists should have exactly one common element, which satisfies all the constraints.
m3 = [3,6,9,12,15,...,105]
m5 = [1,6,11,16,21,...,101]
m7 = [2,9,16,23,30,...,100]
You can use Data.Set to find the intersection of these lists. Now extend this logic to n terms using recursion or a fold.
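As a rough sketch of the set-intersection idea (the names candidates and crtBySets are invented for this illustration, and the moduli are assumed to be positive and pairwise coprime):

```haskell
import qualified Data.Set as Set

-- | All non-negative x below the bound with x `mod` m == r.
candidates :: Integer -> (Integer, Integer) -> Set.Set Integer
candidates bound (r, m) = Set.fromList [r0, r0 + m .. bound - 1]
  where r0 = r `mod` m

-- | Brute-force CRT by intersecting the candidate sets.
-- Assumes a non-empty list of pairwise coprime, positive moduli.
crtBySets :: [(Integer, Integer)] -> Set.Set Integer
crtBySets pairs = foldr1 Set.intersection (map (candidates bound) pairs)
  where bound = product (map snd pairs)
```

With the example from the question, Set.toList (crtBySets [(2,7), (0,3), (1,5)]) gives [51]. This enumerates every candidate below the product, so it is only practical for small moduli.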
Perhaps an easier approach is to define a filter that creates a sequence with a fixed remainder for a modulus, and to apply it repeatedly for the given pairs:
Prelude> let rm (r,m) = filter (\x -> x `mod` m == r)
Verify that it works:
Prelude> take 10 $ rm (1,5) [1..]
[1,6,11,16,21,26,31,36,41,46]
Now, for the given example, use it repeatedly:
Prelude> take 3 $ rm (1,5) $ rm (0,3) $ rm (2,7) [1..]
[51,156,261]
Of course we just need the first element, so change to head instead:
Prelude> head $ rm (1,5) $ rm (0,3) $ rm (2,7) [1..]
51
which we can generalize with a fold:
Prelude> head $ foldr rm [1..] [(1,5),(0,3),(2,7)]
51
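Packaged as a function with the signature from the question, the filtering approach might look like the sketch below. The name crtBrute is made up for this example, and the search starts at 0 rather than 1 so that a solution of 0 can be found.

```haskell
-- | Keep only the numbers that leave remainder r modulo m.
rm :: (Integer, Integer) -> [Integer] -> [Integer]
rm (r, m) = filter (\x -> x `mod` m == r)

-- | Brute-force CRT: the smallest non-negative solution and the combined modulus.
-- Assumes pairwise coprime moduli and 0 <= r < m in every pair;
-- otherwise the search may never terminate.
crtBrute :: [(Integer, Integer)] -> (Integer, Integer)
crtBrute pairs = (head (foldr rm [0 ..] pairs), product (map snd pairs))
```

Here crtBrute [(2,7), (0,3), (1,5)] evaluates to (51,105), matching the expected answer in the question.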
I said in this question that I didn't understand the source code of findIndices.
In fact I didn't pay enough attention and I didn't see that there are two definitions of this function:
findIndices :: (a -> Bool) -> [a] -> [Int]
#if defined(USE_REPORT_PRELUDE)
findIndices p xs = [ i | (x,i) <- zip xs [0..], p x]
#else
-- Efficient definition, adapted from Data.Sequence
{-# INLINE findIndices #-}
findIndices p ls = build $ \c n ->
  let go x r k | p x       = I# k `c` r (k +# 1#)
               | otherwise = r (k +# 1#)
  in foldr go (\_ -> n) ls 0#
#endif
I understand the first definition, the one I didn't see. I don't understand the second one. I have a couple of questions:
What is #if defined(USE_REPORT_PRELUDE)?
Can someone explain the second definition? What are build, I#, +# and 1#?
Why is the second definition inlined, and not the first one?
The CPP extension enables the C preprocessor, the same one used for the C programming language. Here, it is used to test whether the flag USE_REPORT_PRELUDE was set during compilation. Depending on that flag, the compiler uses either the #if or the #else variant of the code.
build is a function which could be defined as
build f = f (:) []
So, using build (\c n -> ... essentially binds c to the "cons" (:), and n to the "nil" [].
This is not used for convenience: it is not convenient at all! However, the compiler optimizer works great with build and foldr combined, so the code is written here in a weird way to take advantage of that.
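To make the role of c and n concrete, here is a small sketch that uses the real build exported from GHC.Exts. The function upTo is invented for this illustration and is not part of the library code being discussed.

```haskell
import GHC.Exts (build)

-- | The numbers 1..n, written against abstract "cons" and "nil" arguments.
-- A sketch: build then plugs in (:) and [] to produce a real list.
upTo :: Int -> [Int]
upTo n = build (\cons nil ->
  let go i | i > n     = nil
           | otherwise = i `cons` go (i + 1)
  in go 1)
```

Evaluating upTo 3 gives [1,2,3]. When a consumer such as foldr (+) 0 is applied to upTo n and the fusion rule fires, the optimizer can pass (+) and 0 in place of cons and nil directly, so no intermediate list needs to be built.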
Further, I# ... is the low-level constructor for integers. When we normally write
x :: Int
x = 4+2
GHC implements x (very roughly) with a pointer to some memory that reads as unevaluated: 4+2. After x is forced the first time, this memory gets overwritten with evaluated: I# 6#. This is needed to implement laziness.
The "boxing" here refers to the indirection through a pointer.
Instead, the type Int# is a plain machine integer, with no pointers, no indirection, no unevaluated expressions. It is strict (instead of lazy), but being more low-level it is more efficient. One creates a value as in
x' :: Int#
x' = 6#
x :: Int
x = I# x'
Indeed, Int is defined as data Int = I# Int#.
Keep in mind that this is not standard Haskell, but GHC-specific low-level details. In normal code, you should not need to use such unboxed types. In libraries, the authors do that to achieve a little more performance, but that's it.
Sometimes, even if in our code we only use Ints, GHC is smart enough to automatically convert our code to using Int# and achieve more efficiency, avoiding the boxing. This can be observed if we ask GHC to "dump Core" so that we can see the result of the optimization.
For instance, compiling
f :: Int -> Int
f 0 = 0
f n = n + f (n-1)
GHC produces a lower level version (this is GHC Core, not Haskell, but it is similar enough to be understood):
Main.$wf :: GHC.Prim.Int# -> GHC.Prim.Int#
Main.$wf = \ (ww_s4un :: GHC.Prim.Int#) ->
  case ww_s4un of ds_X210 {
    __DEFAULT ->
      case Main.$wf (GHC.Prim.-# ds_X210 1#) of ww1_s4ur { __DEFAULT ->
        GHC.Prim.+# ds_X210 ww1_s4ur
      };
    0# -> 0#
  }
Notice the number of arguments to go. go x r k = ... === go x r = \k -> .... This is the standard trick to arrange for left-to-right information flow while folding the list (go is used as the reducer function, in foldr go (\_ -> n) ls 0#). Here, it's the counting of [0..], explicated as the initial k=0 and the (k + 1) on each step (k is an unfortunate naming choice, i seems better; k is overloaded with the irrelevant "constant" and "continuation", not just "counter" which was probably the intended meaning here).
The foldr/build (sic) fusion (linked to by luqui in the comments) turns foldr c n $ findIndices p [a1,a2,...,an] into a loop, exposing the inner foldr of the findIndices definition, avoiding building the actual list structure of the result of the findIndices call:
build g = g (:) []
foldr c n $ build g = g c n
foldr c n $ findIndices p [a1,a2,...,an]
  == foldr c n $ build g where {g c n = ...}
  == g c n where {g c n = ...}
  == foldr go (const n) [a1,a2,...,an] 0 where {go x r k = ...}
  == go a1 (foldr go (const n) [a2,...,an]) 0
  == let { x=a1, r=foldr go (const n) [a2,...,an], k=0 }
     in if | p x       -> c (I# k) (r (k +# 1#)) -- no 'cons' (`:`), only 'c'
           | otherwise -> r (k +# 1#)
So you see, it's a standard trick to have foldr define a function which expects one more input argument, to arrange the left-to-right information flow while processing the input list.
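The same trick can be written with ordinary boxed Ints and without build, which may make the data flow easier to see. This is only a sketch of the technique, not the library definition, and the name findIndices' is chosen here to avoid clashing with Data.List.

```haskell
-- | A boxed-Int sketch of the same idea: foldr builds a function that
-- still expects the current index, so the index flows left to right.
findIndices' :: (a -> Bool) -> [a] -> [Int]
findIndices' p xs = foldr go (const []) xs 0
  where
    -- x: current element, r: the fold over the rest, i: current index
    go x r i
      | p x       = i : r (i + 1)
      | otherwise = r (i + 1)
```

For example, findIndices' even [1,2,4,5,6] gives [1,2,4].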
All the stuff with the hash sign are "primitive" or "closer-to-machine-level" entities. I# is a primitive Int constructor; 0# is a machine-level 0; etc.. This may or may not be exactly correct, but it should be close enough.
foldr/build fusion seems a particular case of transducers-based code transformation, which is based on the fact that nested folds are fused by composing their reducers' transformers (aka transducers), as in
foldr c n $
  foldr (tr2 c2) n2 $
    foldr (tr3 c3) n3 xs
  ==
foldr (tr2 c) n $ -- fold "replaces" the constructor nodes with its reducer
    foldr (tr3 c3) n3 xs -- so just use the outer reducer in the first place!
  ==
foldr (tr3 (tr2 c)) n xs
  ==
foldr ((tr3 . tr2) c) n xs
and build g === foldr . tr for some appropriate choice of tr for a given g, so that
build g = g c n = (foldr . tr) c n = foldr (tr c) n
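A concrete instance of composing reducer transformers might look like the following. The names mapping, filtering, nested and fused are invented for this sketch; they only illustrate how three nested folds collapse into one.

```haskell
-- Hypothetical reducer transformers ("transducers") for illustration.

-- | Apply f to each element before handing it to the reducer c.
mapping :: (a -> b) -> (b -> r -> r) -> (a -> r -> r)
mapping f c x r = c (f x) r

-- | Pass an element on to the reducer c only when p holds.
filtering :: (a -> Bool) -> (a -> r -> r) -> (a -> r -> r)
filtering p c x r = if p x then c x r else r

-- | Three nested folds: filter, then map, then sum.
nested :: [Int] -> Int
nested xs =
  foldr (+) 0 $
    foldr (mapping (* 2) (:)) [] $
      foldr (filtering even (:)) [] xs

-- | The same computation as one fold with composed transformers.
fused :: [Int] -> Int
fused = foldr ((filtering even . mapping (* 2)) (+)) 0
```

Both nested [1..10] and fused [1..10] evaluate to 60, but fused never materializes the two intermediate lists.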
As for USE_REPORT_PRELUDE, again, I can't say this with any authority, but I always assumed that it is the compilation flag which is enabled when the mock definitions from the Haskell Report are used as actual code, even though they were intended as an executable specification.
I wrote a function for evaluating a polynomial at a given number. The polynomial is represented as a list of coefficients (e.g. [1,2,3] corresponds to x^2+2x+3).
polyEval x p = sum (zipWith (*) (iterate (*x) 1) (reverse p))
As you can see, I first used a lot of parentheses to group which expressions should be evaluated. For better readability I tried to eliminate as many parentheses as possible using . and $. (In my opinion, more than two pairs of nested parentheses make the code more and more difficult to read.) I know that function application has the highest priority and is left associative. The . and $ operators are both right associative, but . has priority 9, while $ has priority 0.
So it seemed to me that the following expression cannot be written with even fewer parentheses:
polyEval x p = sum $ zipWith (*) (iterate (*x) 1) $ reverse p
I know that we need parentheses for (*) and (*x) to convert them to prefix functions, but is it possible to somehow remove the parentheses around iterate (*x) 1?
Also what version would you prefer for readability?
I know that there are many other ways to achieve the same, but I'd like to discuss my particular example, as it has a function evaluated in two arguments (iterate (*x) 1) as middle argument of another function that takes three arguments.
As usual with this sort of question I prefer the OP's version to any of the alternatives that have been proposed so far. I would write
polyEval x p = sum $ zipWith (*) (iterate (* x) 1) (reverse p)
and leave it at that. The two arguments of zipWith (*) play symmetric roles in the same way that the two arguments of * do, so eta-reducing is just obfuscation.
The value of $ is that it makes the outermost structure of the computation clear: the evaluation of a polynomial at a point is the sum of something. Eliminating parentheses should not be a goal in itself.
So it might be a little puerile, but I actually really like to think of Haskell’s rules in terms of food. I think of Haskell’s left-associative function application f x y = (f x) y as a sort of aggressive nom or greedy nom, in that the function f refuses to wait for the y to come around and immediately eats the x, unless you take the time to put these things in parentheses to make a sort of "argument sandwich" f (x y) (at which point the x, being uneaten, becomes hungry and eats the y). The only boundaries are the operators and the special forms.
Then within the boundaries of the special forms, the operators consume whatever is around them; finally the special forms take their time to digest the expressions around them. This is the only reason that . and $ are able to save some parentheses.
From this we can see that iterate (* x) 1 is probably going to need to be in a sandwich, because we don't want something to just eat iterate and stop. So there is no great way to do that without changing that code, unless we can somehow do away with the third argument to zipWith, but that argument contains a p, so that requires writing something more point-free.
So, one solution is to change your approach! It makes a little more sense to store a polynomial as a list of coefficients in the already-reversed direction, so that your x^2 + 2 * x + 3 example is stored as [3, 2, 1]. Then we don't need to perform this complicated reverse operation. It also makes the mathematics a little simpler as the product of two polynomials can be rewritten recursively as (a + x * P(x)) * (b + x * Q(x)) which gives the straightforward algorithm:
newtype Poly f = Poly [f] deriving (Eq, Show)
instance Num f => Num (Poly f) where
    fromInteger n = Poly [fromInteger n]
    negate (Poly ps) = Poly (map negate ps)
    Poly f + Poly g = Poly $ summing f g where
        summing [] g = g
        summing f [] = f
        summing (x:xs) (y:ys) = (x + y) : summing xs ys
    -- Base cases: the recursion below ends at an empty coefficient list.
    Poly [] * _ = Poly []
    _ * Poly [] = Poly []
    Poly (x : xs) * Poly (y : ys) = prefix (x*y) (y_p + x_q) + r where
        y_p = Poly $ map (y *) xs
        x_q = Poly $ map (x *) ys
        prefix n (Poly m) = Poly (n : m)
        r = prefix 0 . prefix 0 $ Poly xs * Poly ys
Then your function
evaluatePoly :: Num f => Poly f -> f -> f
evaluatePoly (Poly p) x = eval p where
    eval = (sum .) . zipWith (*) $ iterate (x *) 1
lacks parentheses around iterate because the eval is written in pointfree style, so $ can be used to consume the rest of the expression. As you can see it unfortunately leaves some new parentheses around (sum .) to do this, though, so it might not be totally worth your while. I find the latter less readable than, say,
evaluatePoly (Poly coeffs) x = sum $ zipWith (*) powersOfX coeffs where
    powersOfX = iterate (x *) 1
I might even prefer to write the latter, if performance on high powers is not super-critical, as powersOfX = [x^n | n <- [0..]] or powersOfX = map (x^) [0..], but I think iterate is not too hard to understand in general.
Perhaps breaking it down into more elementary functions will simplify it further. First define a dot product function to multiply two lists (inner product):
dot x y = sum $ zipWith (*) x y
and change the order of terms in polyEval to minimize the parentheses:
polyEval x p = dot (reverse p) $ iterate (* x) 1
This reduces it to three pairs of parentheses.
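For comparison, here are the variants discussed in this thread side by side, as a sketch you can load into GHCi to confirm that they agree. The suffixed names are invented here so that all three can live in one file.

```haskell
-- | Fully parenthesised version from the question.
polyEvalParens :: Num a => a -> [a] -> a
polyEvalParens x p = sum (zipWith (*) (iterate (* x) 1) (reverse p))

-- | The same expression with the outer parentheses replaced by ($).
polyEvalDollar :: Num a => a -> [a] -> a
polyEvalDollar x p = sum $ zipWith (*) (iterate (* x) 1) $ reverse p

-- | Dot product of two lists, as in the last answer.
dot :: Num a => [a] -> [a] -> a
dot xs ys = sum $ zipWith (*) xs ys

-- | Version built on the named dot product.
polyEvalDot :: Num a => a -> [a] -> a
polyEvalDot x p = dot (reverse p) $ iterate (* x) 1
```

All three return 11 for x = 2 and the coefficient list [1,2,3], i.e. x^2 + 2x + 3 evaluated at 2.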
Let's say I'm given two functions:
f :: [a] -> b
g :: [a] -> c
I want to write a function that is the equivalent of this:
h x = (f x, g x)
But when I do that, for large lists inevitably I run out of memory.
A simple example is the following:
x = [1..100000000::Int]
main = print $ (sum x, product x)
I understand this is the case because the list x is being kept in memory without being garbage collected. It would be better if instead f and g worked on x in, well, "parallel".
Assuming I can't change f and g, nor want to make a separate copy of x (assume x is expensive to produce) how can I write h without running into out of memory issues?
A short answer is you can't. Since you have no control over f and g, you have no guarantee that the functions process their input sequentially. Such a function can as well keep the whole list stored in memory before producing the final result.
However, if your functions are expressed as folds, the situation is different. This means that we know how to incrementally apply each step, so we can parallelize those steps in one run.
There are many resources about this area. For example:
Haskell: Can I perform several folds over the same lazy list without keeping list in memory?
Classic Beautiful folding
More beautiful fold zipping
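For the sum/product example from the question, merging the two folds by hand can be sketched as below. The name sumAndProduct is made up for this illustration; the general technique in the linked resources wraps the same idea in a reusable fold type.

```haskell
import Data.List (foldl')

-- | Sum and product in a single pass over the list.
-- Both accumulators are forced at every step, so no chain of thunks builds up.
sumAndProduct :: Num a => [a] -> (a, a)
sumAndProduct = foldl' step (0, 1)
  where
    step (s, p) x =
      let s' = s + x
          p' = p * x
      in s' `seq` p' `seq` (s', p')
```

For instance, sumAndProduct [1..5] gives (15,120). Because the list is traversed only once, its already-consumed prefix can be garbage collected, provided nothing else holds a reference to the list.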
The pattern of consuming a sequence of values with properly defined space bounds is solved more generally with pipe-like libraries such as conduit, iteratees or pipes. For example, in conduit, you could express the combination of computing sums and products as
import Control.Monad.Identity
import Data.Conduit
import Data.Conduit.List (fold, sourceList)
import Data.Conduit.Internal (zipSinks)
product', sum' :: (Monad m, Num a) => Sink a m a
sum' = fold (+) 0
product' = fold (*) 1
main = print . runIdentity $ sourceList (replicate (10^6) 1) $$
           zipSinks sum' product'
If you can turn your functions into folds, you can then just use them with a scan:
import Data.List (unfoldr)
x = [1..100000000::Int]
main = mapM_ print . tail . scanl foo (a0,b0) . takeWhile (not.null)
         . unfoldr (Just . splitAt 1000) -- adjust the chunk length as needed
         $ x
foo (a,b) x = let a2 = f' a $ f x ; b2 = g' b $ g x
              in a2 `seq` b2 `seq` (a2, b2)
f :: [t] -> a -- e.g. sum
g :: [t] -> b -- (`rem` 10007) . product
f' :: a -> a -> a -- e.g. (+)
g' :: b -> b -> b -- ((`rem` 10007) .) . (*)
We consume the input in chunks for better performance. Compiled with -O2, this should run in constant space. The interim results are printed as an indication of progress.
If you can't turn your function into a fold, this means it has to consume the whole list to produce any output and this trick doesn't apply.
You can use multiple threads to evaluate f x and g x in parallel.
import Control.Parallel (par, pseq)
x :: [Int]
x = [1..10^8]
main = print $ let a = sum x
                   b = product x
               in a `par` b `pseq` (a,b)
It's a nice way to exploit GHC's parallel runtime to prevent a space leak by doing two things at once.
Alternatively, you need to fuse f and g into a single pass.
Is there a standard function to sum all the values in a Haskell map? My Map reads something like [(a,2),(b,4),(c,6)].
Essentially what I am trying to do is a normalized frequency distribution. So the values of the keys in the above map are counts for a,b,c. I need to normalize them as [(a,1/6),(b,1/3),(c,1/2)]
You can simply do Map.foldl' (+) 0 (or M.foldl', if you imported Data.Map as M).
This is just like foldl' (+) 0 . Map.elems, but slightly more efficient. (Don't forget the apostrophe — using foldl or foldr to do sums with the standard numeric types (Int, Integer, Float, Double, etc.) will build up huge thunks, which will use up lots of memory and possibly cause your program to overflow the stack.)
However, only sufficiently recent versions of containers contain Data.Map.foldl', and you shouldn't upgrade containers with cabal install, since it comes with GHC. So unless you're on GHC 7.2 or above, foldl' (+) 0 . Map.elems is the best way to accomplish this.
You could also use Data.Foldable.sum, which works on any instance of the Foldable typeclass, but will still build up large thunks on the common numeric types.
Here's a complete example:
normalize :: (Fractional a) => Map k a -> Map k a
normalize m = Map.map (/ total) m
  where total = foldl' (+) 0 $ Map.elems m
You'll need to import Data.List to use foldl'.
let total = foldr (\(_, n) r -> r + n) 0 l
in map (\(x, y) -> (x, y/total)) l
where l is your map as a list of (key, count) pairs, e.g. the result of Map.toList.
import qualified Data.Map as M
sumMap = M.foldl' (+) 0
normalizeMap m =
  let s = sumMap m in
    M.map (/ s) m
main = do
  let m = M.fromList [("foo", 1), ("bar", 2), ("baz", 6)]
  (print . sumMap) m
  (print . normalizeMap) m
fromList [("bar",0.2222222222222222),("baz",0.6666666666666666),("foo",0.1111111111111111)]
I'm a C++ programmer trying to teach myself Haskell, and it's proving challenging to grasp the basics of using functions as a type of loop. I have a large number, 50!, and I need to find the sum of its digits. It's a relatively easy loop in C++, but I want to learn how to do it in Haskell.
I've read some introductory guides and am able to get 50! with
fac 0 = 1
fac n = n * fac (n-1)
x = fac 50
main = print x
Unfortunately at this point I'm not entirely sure how to approach the problem.
Is it possible to write a function that adds (mod) x 10 to a value and then calls the same function again on x / 10 until x / 10 is less than 10? If that's not possible how should I approach this problem?
sumd 0 = 0
sumd x = (x `mod` 10) + sumd (x `div` 10)
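Then run it:
ghci> sumd 2345
14
This version is tail-recursive and uses an accumulator, which avoids the chain of pending additions (note that without optimization the accumulator itself may still be built up lazily):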
sumd2 0 acc = acc
sumd2 x acc = sumd2 (x `div` 10) (acc + (x `mod` 10))
ghci> sumd2 2345 0
14
Partially applied version in point-free style:
sumd2w = (flip sumd2) 0
ghci> sumd2w 2345
14
I used flip here because, for some reason (probably due to GHC design), the function didn't work with the accumulator as the first parameter.
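If you want the accumulator to be evaluated at every step regardless of optimization settings, one option is to force it with seq. The following is a sketch under the assumption that the argument is non-negative; the name digitSumStrict is invented here.

```haskell
-- | Digit sum with an accumulator that is forced on every step.
-- Assumes a non-negative argument.
digitSumStrict :: Integer -> Integer
digitSumStrict = go 0
  where
    go acc 0 = acc
    go acc n =
      let acc' = acc + n `mod` 10
      in acc' `seq` go acc' (n `div` 10)
```

As with the versions above, digitSumStrict 2345 gives 14. For the number in the question you would call digitSumStrict (product [1..50]).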
Why not just
sumd = sum . map Char.digitToInt . show
This is just a variant of @ony's, but how I'd write it:
import Data.List (unfoldr)
digits :: (Integral a) => a -> [a]
digits = unfoldr step . abs
    where step n = if n==0 then Nothing else let (q,r)=n`divMod`10 in Just (r,q)
This will produce the digits from low to high, which, while unnatural for reading, is generally what you want for mathematical problems involving the digits of a number. (Project Euler, anyone?) Also note that 0 produces [], and negative numbers are accepted, but produce the digits of the absolute value. (I don't want partial functions!)
If, on the other hand, I need the digits of a number as they are commonly written, then I would use @newacct's method, since the problem is essentially one of orthography, not math:
import Data.Char (digitToInt)
writtenDigits :: (Integral a) => a -> [a]
writtenDigits = map (fromIntegral.digitToInt) . show . abs
Compare output:
> digits 123
[3,2,1]
> writtenDigits 123
[1,2,3]
> digits 12300
[0,0,3,2,1]
> writtenDigits 12300
[1,2,3,0,0]
> digits 0
[]
> writtenDigits 0
[0]
In doing Project Euler, I've actually found that some problems call for one, and some call for the other.
About . and "point-free" style
To make this clear for those not familiar with Haskell's . operator, and "point-free" style, these could be rewritten as:
import Data.Char (digitToInt)
import Data.List (unfoldr)
digits :: (Integral a) => a -> [a]
digits i = unfoldr step (abs i)
    where step n = if n==0 then Nothing else let (q,r)=n`divMod`10 in Just (r,q)
writtenDigits :: (Integral a) => a -> [a]
writtenDigits i = map (fromIntegral.digitToInt) (show (abs i))
These are exactly the same as the above. You should learn that these are the same:
f . g
(\a -> f (g a))
And "point-free" means that these are the same:
foo a = bar a
foo = bar
Combining these ideas, these are the same:
foo a = bar (baz a)
foo a = (bar . baz) a
foo = bar . baz
The last is idiomatic Haskell, since once you get used to reading it, you can see that it is very concise.
To sum up all digits of a number:
digitSum = sum . map (read . return) . show
show transforms a number to a string. map iterates over the single elements of the string (i.e. the digits), turns them into a string (e.g. character '1' becomes the string "1") and read turns them back to an integer. sum finally calculates the sum.
Just to make the pool of solutions greater:
miterate :: (a -> Maybe (a, b)) -> a -> [b]
miterate f = go . f where
    go Nothing = []
    go (Just (x, y)) = y : (go (f x))
sumd = sum . miterate f where
    f 0 = Nothing
    f x = Just (x `divMod` 10)
Well, one, your Haskell function misses brackets, you need fac (n - 1). (oh, I see you fixed that now)
Two, the real answer, what you want is first make a list:
listdigits n = if n < 10 then [n] else (listdigits (n `div` 10)) ++ (listdigits (n `mod` 10))
This should just compose a list of all the digits (type: Int -> [Int]).
Then we just make a sum as in sum (listdigits n). And we should be done.
Naturally, you can generalize the example above for the list for many different radices, also, you can easily translate this to products too.
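As a sketch of that generalization (the names digitsInBase and digitProduct are made up for this example, and the base is assumed to be at least 2 with a non-negative number):

```haskell
-- | Digits of a non-negative number in the given base, most significant first.
-- Assumes base >= 2 and n >= 0.
digitsInBase :: Integer -> Integer -> [Integer]
digitsInBase base n
  | n < base  = [n]
  | otherwise = digitsInBase base (n `div` base) ++ [n `mod` base]

-- | Product of the decimal digits, by swapping sum for product.
digitProduct :: Integer -> Integer
digitProduct = product . digitsInBase 10
```

For example, digitsInBase 2 5 gives [1,0,1], digitsInBase 10 2345 gives [2,3,4,5], and digitProduct 234 gives 24. Appending with ++ on every step is quadratic in the number of digits, which is fine for a sketch but worth avoiding for very large inputs.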
Although maybe not as efficient as the other examples, here is a different way of approaching it:
import Data.Char
sumDigits :: Integer -> Int
sumDigits = foldr ((+) . digitToInt) 0 . show
Edit: newacct's method is very similar, and I like it a bit better :-)