I have the following matrix multiplication function:
winograd :: M.Matrix Int -> M.Matrix Int -> M.Matrix Int
winograd a b = c
  where
    ra = M.nrows a
    ca = M.ncols a
    rb = M.nrows b
    cb = M.ncols b
    isEven = even ca
    avs = V.generate ra $ \i -> M.getRow (i+1) a
    bvs = V.generate cb $ \j -> M.getCol (j+1) b
    rows = V.generate ra $ \i -> wgdGroup $ V.unsafeIndex avs i
    cols = V.generate cb $ \j -> wgdGroup $ V.unsafeIndex bvs j
    wgdGroup x = let
        finish = (V.length x - 1)
      in
        forLoopFold 0 (<finish) (+2) 0 $
          \acc i -> acc
            - V.unsafeIndex x i * V.unsafeIndex x (i+1)
    c = if isEven then
          M.matrix ra cb $
            \(i,j) ->
              V.unsafeIndex rows (i-1) +
              V.unsafeIndex cols (j-1) +
              g (V.unsafeIndex avs (i-1)) (V.unsafeIndex bvs (j-1))
        else
          M.matrix ra cb $
            \(i,j) ->
              V.unsafeIndex rows (i-1) +
              V.unsafeIndex cols (j-1) +
              g (V.unsafeIndex avs (i-1)) (V.unsafeIndex bvs (j-1)) +
              V.last (V.unsafeIndex avs (i-1)) * V.last (V.unsafeIndex bvs (j-1))
    g r c = forLoopFold 0 (<(ca-1)) (+2) 0 $ \acc i ->
      let
        x1 = V.unsafeIndex r i
        x2 = V.unsafeIndex r (i+1)
        y1 = V.unsafeIndex c i
        y2 = V.unsafeIndex c (i+1)
      in
        acc + (x1+y2)*(x2+y1)
I measure the execution time of this function as follows, using the clock and formatting packages (yes, I know about Criterion, but for now I need to measure it this way):
main :: IO ()
main = do
  let a = simple 2 (-1) 1000
  let b = simple 2 (-3) 1000
  start <- getTime Realtime
  let
    c = winograd a b
  end <- c `deepseq` getTime Realtime
  fprint (timeSpecs % "\n") start end
where simple is
simple :: Int -> Int -> Int -> M.Matrix Int
simple x y size = M.matrix size size $ \(i,j) -> x*i+y*j
I get a result of about 5 seconds. But when I get rid of the function simple and build the matrices inline like this:
main :: IO ()
main = do
  let a = M.matrix 1000 1000 $ \(i,j) -> 2*i-1*j
  let b = M.matrix 1000 1000 $ \(i,j) -> 2*i-3*j
  start <- getTime Realtime
  let
    c = winograd a b
  end <- c `deepseq` getTime Realtime
  fprint (timeSpecs % "\n") start end
the time increases to 15 seconds!
I would be very interested to know why. Compiled with the -O2 flag.
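One thing that may be worth ruling out first: in both versions of main, a and b appear to be still unevaluated when start is taken, so the measured interval includes building the inputs as well as multiplying them. A minimal sketch of timing only the computation follows, using just base. The names timed and dot are hypothetical, dot is a stand-in workload rather than winograd, and getCPUTime reports CPU time rather than the wall-clock time that getTime Realtime gives.

```haskell
import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)

-- Hypothetical helper (not from the question): force a value to weak head
-- normal form and report the CPU time spent doing so, in picoseconds.
timed :: a -> IO (a, Integer)
timed x = do
  start <- getCPUTime
  r     <- evaluate x
  end   <- getCPUTime
  return (r, end - start)

-- Stand-in workload; the real code would call winograd here.
dot :: [Int] -> [Int] -> Int
dot xs ys = sum (zipWith (*) xs ys)

main :: IO ()
main = do
  let xs = [1 .. 1000000] :: [Int]
      ys = map (* 2) xs
  -- Force every element of both inputs before the clock starts, so that
  -- building them is not attributed to the computation under test.
  _ <- evaluate (sum xs + sum ys)
  (r, ps) <- timed (dot xs ys)
  putStrLn ("result: " ++ show r)
  putStrLn ("cpu time (ps): " ++ show ps)
```

For a result that is a structure rather than an Int, evaluate alone only reaches the outermost constructor, so something like force from deepseq would still be needed around the timed value.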
I implemented the Winograd algorithm in Haskell and, trying to improve it, decided to make it more strict. I did this, but I do not understand why it started to work faster. Yesterday I asked a similar question, but the code I posted was not quite correct.
Since that code does not show the problem, I am posting the code in full.
module Main where
import qualified Data.Vector as V
import qualified Data.Matrix as M
import Control.DeepSeq
import Control.Exception
import System.Clock
import System.Mem
import Data.Time
matrixCtor :: Int -> Int -> Int -> M.Matrix Int
matrixCtor x y size = M.matrix size size $ \(i,j) -> x*i+y*j
winogradLazy :: M.Matrix Int -> M.Matrix Int -> M.Matrix Int
winogradLazy a b = c
  where
    n = M.nrows a
    p = M.ncols a
    m = M.ncols b
    -- Translate into vectors, since indexing in matrices takes longer.
    -- Matrix b is also transposed
    a' = V.generate n $ \i -> M.getRow (i+1) a
    bt' = V.generate m $ \j -> M.getCol (j+1) b
    rows = V.generate n $ \i -> group $ V.unsafeIndex a' i
    cols = V.generate m $ \j -> group $ V.unsafeIndex bt' j
    group x = foldl (groupHelper x) 0 [0,2..p-1]
    groupHelper x acc i = let
        x1 = V.unsafeIndex x (i)
        x2 = V.unsafeIndex x (i+1)
      in
        acc - x1 * x2
    c = M.matrix n m $ \(i,j) ->
      let
        a = V.unsafeIndex rows (i-1) + V.unsafeIndex cols (j-1)
        b = wsum (V.unsafeIndex a' (i-1)) (V.unsafeIndex bt' (j-1))
      in
        a + b
    wsum r c = foldl (wsumHelper r c) 0 [0,2..p-1]
    wsumHelper r c acc i = let
        x1 = V.unsafeIndex r (i)
        x2 = V.unsafeIndex r (i+1)
        y1 = V.unsafeIndex c (i)
        y2 = V.unsafeIndex c (i+1)
      in
        acc +(x1+y2)*(x2+y1)
winogradStrict :: M.Matrix Int -> M.Matrix Int -> M.Matrix Int
winogradStrict a b = c
  where
    n = M.nrows a
    p = M.ncols a
    m = M.ncols b
    -- Translate into vectors, since indexing in matrices takes longer.
    -- Matrix b is also transposed
    a' = V.generate n $ \i -> M.getRow (i+1) a
    bt' = V.generate m $ \j -> M.getCol (j+1) b
    rows = V.generate n $ \i -> group $ V.unsafeIndex a' i
    cols = V.generate m $ \j -> group $ V.unsafeIndex bt' j
    group x = foldl (groupHelper x) 0 [0,2..p-1]
    groupHelper x acc i = let
        x1 = V.unsafeIndex x (i)
        x2 = V.unsafeIndex x (i+1)
      in
        acc - x1 * x2
    c = a' `deepseq` bt' `deepseq` M.matrix n m $ \(i,j) ->
      let
        a = V.unsafeIndex rows (i-1) + V.unsafeIndex cols (j-1)
        b = wsum (V.unsafeIndex a' (i-1)) (V.unsafeIndex bt' (j-1))
      in
        a + b
    wsum r c = foldl (wsumHelper r c) 0 [0,2..p-1]
    wsumHelper r c acc i = let
        x1 = V.unsafeIndex r (i)
        x2 = V.unsafeIndex r (i+1)
        y1 = V.unsafeIndex c (i)
        y2 = V.unsafeIndex c (i+1)
      in
        acc + (x1+y2)*(x2+y1)
lazyTest :: IO ()
lazyTest = do
  let a = matrixCtor 2 (-1) 500
  let b = matrixCtor 2 (-3) 500
  evaluate $ force a
  evaluate $ force b
  start <- getCurrentTime
  let c = winogradLazy a b
  evaluate $ force c
  end <- getCurrentTime
  print (diffUTCTime end start)
strictTest :: IO ()
strictTest = do
  let a = matrixCtor 2 (-1) 500
  let b = matrixCtor 2 (-3) 500
  evaluate $ force a
  evaluate $ force b
  start <- getCurrentTime
  let c = winogradStrict a b
  evaluate $ force c
  end <- getCurrentTime
  print (diffUTCTime end start)
main :: IO ()
main = do
  performMajorGC
  lazyTest
  performMajorGC
  strictTest
In the strict version, before computing the matrix c, I do the following:
a' `deepseq` bt' `deepseq`
and because of this I get the following results:
2.083201s --lazyTest
0.613508s --strictTest
I'm trying to wrap my head around Cont and callCC, by reducing this function:
s0 = (flip runContT) return $ do
    (k, n) <- callCC $ \k -> let f x = k (f, x)
                             in return (f, 0)
    lift $ print n
    if n < 3
        then k (n+1) >> return ()
        else return ()
I've managed to reach this point:
s21 = runContT (let f x = ContT $ \_ -> cc (f, x) in ContT ($(f,0))) cc where
    cc = (\(k,n) -> let
            iff = if n < 3 then k (n+1) else ContT ($())
          in print n >> runContT iff (\_ -> return ()))
At this point I have no idea what to do with the recursive definition of f.
What is the best way to finish this reduction?
You can proceed as follows.
s21 = runContT (let f x = ContT $ \_ -> cc (f, x) in ContT ($(f,0))) cc where
    cc = (\(k,n) -> let
            iff = if n < 3 then k (n+1) else ContT ($())
          in print n >> runContT iff (\_ -> return ()))
-- runContT is the opposite of ContT
s22 = (let f x = ContT $ \_ -> cc (f, x) in ($(f,0))) cc
  where
    cc = (\(k,n) -> let
            iff = if n < 3 then k (n+1) else ContT ($())
          in print n >> runContT iff (\_ -> return ()))
-- reordering
s23 = ($(f,0)) cc
  where
    f x = ContT $ \_ -> cc (f, x)
    cc = (\(k,n) -> let
            iff = if n < 3 then k (n+1) else ContT ($())
          in print n >> runContT iff (\_ -> return ()))
s24 = cc (f,0)
  where ...
-- beta
s25 = let iff = if 0 < 3 then f (0+1) else ContT ($())
      in print 0 >> runContT iff (\_ -> return ())
  where ...
-- if, arithmetics
s26 = let iff = f 1
      in print 0 >> runContT iff (\_ -> return ())
  where ...
s27 = print 0 >> runContT (f 1) (\_ -> return ())
  where ...
s28 = print 0 >> runContT (ContT $ \_ -> cc (f, 1)) (\_ -> return ())
  where ...
s29 = print 0 >> (\_ -> cc (f, 1)) (\_ -> return ())
  where ...
s30 = print 0 >> cc (f, 1)
  where ...
-- repeat all the steps s24..s30
s31 = print 0 >> print 1 >> cc (f, 2)
  where ...
-- etc.
s32 = print 0 >> print 1 >> print 2 >> cc (f, 3)
  where ...
s33 = print 0 >> print 1 >> print 2 >>
      let iff = if 3 < 3 then f (3+1) else ContT ($())
      in print 3 >> runContT iff (\_ -> return ())
  where ...
s34 = print 0 >> print 1 >> print 2 >> print 3 >>
      let iff = ContT ($())
      in runContT iff (\_ -> return ())
  where ...
s35 = print 0 >> print 1 >> print 2 >> print 3 >>
      runContT (ContT ($())) (\_ -> return ())
  where ...
s36 = print 0 >> print 1 >> print 2 >> print 3 >>
      ($()) (\_ -> return ())
  where ...
s37 = print 0 >> print 1 >> print 2 >> print 3 >>
      return ()
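As a cross-check on the unrolling, note that cc (f, n) depends only on n, so the recursion through f can also be fused into an ordinary loop. The sketch below compares the two forms. It is only an illustration: ContT is redefined locally so the code needs nothing beyond base, and viaCont, direct, collect and the out parameter are hypothetical names (the derivation above uses print directly).

```haskell
import Data.IORef (modifyIORef, newIORef, readIORef)

-- Local stand-in for ContT from the transformers package, so that the
-- sketch depends only on base.
newtype ContT r m a = ContT { runContT :: (a -> m r) -> m r }

-- The recursive knot from step s24, with the output action passed in as
-- `out` instead of calling print.
viaCont :: (Int -> IO ()) -> IO ()
viaCont out = cc (f, 0)
  where
    f x = ContT $ \_ -> cc (f, x)
    cc (k, n) =
      let iff = if n < 3 then k (n + 1) else ContT ($ ())
      in out n >> runContT iff (\_ -> return ())

-- The same computation with f and cc fused into one recursive function.
direct :: (Int -> IO ()) -> IO ()
direct out = go 0
  where
    go n = out n >> if n < 3 then go (n + 1) else return ()

-- Run an action and collect the numbers it outputs, in order.
collect :: ((Int -> IO ()) -> IO ()) -> IO [Int]
collect run = do
  ref <- newIORef []
  run (\n -> modifyIORef ref (n :))
  reverse <$> readIORef ref

main :: IO ()
main = do
  a <- collect viaCont
  b <- collect direct
  print (a, b)  -- prints ([0,1,2,3],[0,1,2,3])
```

Both versions emit 0, 1, 2, 3, which matches the final line s37.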
I have a clearly tail-recursive function for finding (choose n k) mod 10007 (with k nonnegative).
Why is this function consuming lots of memory for large inputs (i.e. 100000000 choose 50000000)? I can understand that it might be slow, but it shouldn't use more than constant memory, should it (assuming GHC knows about tail-call optimization)?
GHC version 7.8.3
modulus :: Int
modulus = 10007
choose :: Int -> Int -> Int
choose n1 k1
    | s1 > 0 = 0
    | otherwise = q1
  where
    (q1, s1) = doChoose n1 k1 (1, 0)
doChoose :: Int -> Int -> (Int, Int) -> (Int, Int)
doChoose _ 0 (qr, sr) = (qr, sr)
doChoose n k (qr, sr) =
    doChoose (n `seq` (n-1)) (k-1) (qr `seq` (qn * qr `rem` modulus * inv qk `rem` modulus), sr `seq` (sn + sr - sk))
  where
    (qn, sn) = removePs n
    (qk, sk) = removePs k
removePs :: Int -> (Int, Int)
removePs n =
    case r of
      0 -> (q0, s0 + 1)
      _ -> (n, 0)
  where
    (q, r) = n `quotRem` modulus
    (q0, s0) = removePs q
inv :: Int -> Int
inv = doInv 0 1 modulus . (`mod` modulus)
  where
    doInv x _ 1 0
      | x < 0 = x + modulus
      | otherwise = x
    doInv _ _ _ 0 = error "Not relatively prime"
    doInv x y a b = doInv y (x - q * y) b r
      where
        (q, r) = a `quotRem` b
I was putting the seq in the wrong place.
It needs to be:
n `seq` qr `seq` sr `seq` doChoose (n-1) (k-1) (qn * qr `rem` modulus * inv qk `rem` modulus, sn + sr - sk)
Otherwise the call to seq isn't evaluated until reaching the base-case and a chain of thunks is still built up.
This isn't strictly tail-recursive, but rather it's "mutually" tail-recursive since seq ultimately returns its second argument without modifying it.
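By the way, a helper function such as
force x = x `seq` x
does not simplify this: x `seq` x is equivalent to plain x, so it adds no strictness. The same applies to wrapping an argument in force (no pun intended) from the deepseq package, which only takes effect once the wrapped value is itself demanded. So
doChoose (force n - 1) (k - 1) (qn * force qr * etc.)
still builds up the chain of thunks. The seq has to sit outside the recursive call, as shown above.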
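The placement of seq described in the fix can be sketched on a smaller loop of the same shape. The example below is hypothetical and unrelated to choose: it accumulates the sum and count of 1..n in a pair, once with seq outside the recursive call and once with bang patterns, which are an alternative notation for the same forcing. It assumes n is nonnegative.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Hypothetical example: sum and count of 1..n with a pair accumulator,
-- in the same shape as doChoose.

-- seq placed outside the recursive call: s and c are evaluated before
-- the next iteration starts, so no chain of thunks accumulates.
sumCountSeq :: Int -> (Int, Int)
sumCountSeq n = go n (0, 0)
  where
    go 0 acc    = acc
    go k (s, c) = s `seq` c `seq` go (k - 1) (s + k, c + 1)

-- The same loop with bang patterns on the accumulator components.
sumCountBang :: Int -> (Int, Int)
sumCountBang n = go n (0, 0)
  where
    go 0 acc      = acc
    go k (!s, !c) = go (k - 1) (s + k, c + 1)

main :: IO ()
main = do
  print (sumCountSeq 10)   -- prints (55,10)
  print (sumCountBang 10)  -- prints (55,10)
```

In both versions each iteration forces the components produced by the previous one, so at most one pending addition per component exists at any time.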
Here is the code in which I try to find some prime divisors. I have tried to convert TAOCP algorithms to Haskell programs, but I can't understand when something evaluates lazily or eagerly:
modof2 n = let a0 = shiftR n 1
               a1 = shiftL a0 1
           in n-a1
iseven n = modof2 n == 0
factoringby2 n = let s=(lastf (takeWhile f [1..])) + 1
                     d=n `quot` powerof2 s
                 in (s,d)
  where f s = let d = n `quot` (powerof2 s)
              in if isodd d
                   then False
                   else True
        lastf [] = 0
        lastf xs = last xs
miller_rabin_prime_test n 0 result=return result
miller_rabin_prime_test n k result| (isodd n) && n>3 = do
    a<-randomRIO(2,n-2)
    let z = basic_step n a (fst sd) (snd sd)
    miller_rabin_prime_test n (k-1) z
  where sd=factoringby2 n
basic_step:: Integer->Integer->Int->Integer->Bool
basic_step n a s d =any (\x-> x==1 || x==n-1) (map x (map u [0..s-1]))
  where u j=powerof2(j)*d
        x j=modular_pow a j n 1
isprime n = if n==2 || n==3
              then return True
              else if n<2
                then return False
                else if iseven n
                  then return False
                  else miller_rabin_prime_test n 5 True
x_m :: Double->Integer->Integer
x_m 0 n = 2
x_m m n = f (x_m (m-1) n) `mod` n
  where f x = x^2 +1
l::Double->Double
l m = 2 ^ (floor (log2 m))
  where log2 m = log m / log 2
g m n = let a = x_m m n
            b = x_m ((l m)-1) n
        in gcd (a-b) n
gg n = [g m n|m<-[1..]]
algorithmB n = do
    testprime<-isprime n
    let a = head (filter (1>) (gg n))
    c<-algorithmB (n `div` a)
    if testprime
      then return []
      else return (a:c)
algorithmB does not terminate. Why does this happen? I think that c<-algorithmB (n `div` a) is the reason, because it does not evaluate lazily. Is that true?
Thanks
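algorithmB calls itself in an infinite loop: the recursive call c<-algorithmB (n `div` a) runs unconditionally, before the if testprime check is ever reached, so there is no base case. Of course it doesn't return!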
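A minimal sketch of the usual shape for such a loop is to test the stopping condition first and recurse only in the branch that needs it. The helpers here are hypothetical stand-ins, not the question's code: trial division replaces both the Miller-Rabin test and the gg-based factor search, and a prime n is returned as [n] rather than [] so that the last factor is kept.

```haskell
-- Hypothetical stand-ins for the question's helpers: trial division
-- replaces both isprime and the gg-based factor search.
smallestFactor :: Integer -> Integer
smallestFactor n = head [d | d <- [2 ..], n `mod` d == 0]

isPrimeIO :: Integer -> IO Bool
isPrimeIO n = return (n >= 2 && smallestFactor n == n)

-- The termination tests come first; the recursive call happens only in
-- the branch where n is known to be composite.
factorize :: Integer -> IO [Integer]
factorize n
  | n < 2 = return []
  | otherwise = do
      prime <- isPrimeIO n
      if prime
        then return [n]
        else do
          let a = smallestFactor n
          rest <- factorize (n `div` a)
          return (a : rest)

main :: IO ()
main = factorize 60 >>= print  -- prints [2,2,3,5]
```

Because each statement in an IO do block runs in sequence, the recursive call has to sit inside the else branch; placing it above the if runs it on every call.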
Can someone provide a "for-dummies" example of how to use MonadRandom?
Currently I have code that passes the generator variable around, all the way down from the main function:
main = do
  g <- getStdGen
  r <- myFunc g
  putStrLn "Result is : " ++ show r
--my complicated func
myFunc g x y z = afunc g x y z
afunc g x y z = bfunc g x y
bfunc g x y = cfunc g x
cfunc g x = ret where
  (ret, _ ) = randomR (0.0, 1.0) g
Thanks
Basically all the extra g parameters can just be dropped. You then get random numbers using the functions from Control.Monad.Random (such as getRandomR). Here is your example (I added some args to make it compile):
import Control.Monad.Random
main = do
  g <- getStdGen
  let r = evalRand (myFunc 1 2 3) g :: Double
  -- or use runRand if you want to do more random stuff:
  -- let (r,g') = runRand (myFunc 1 2 3) g :: (Double,StdGen)
  putStrLn $ "Result is : " ++ show r
--my complicated func
myFunc x y z = afunc x y z
afunc x y z = bfunc x y
bfunc x y = cfunc x
cfunc x = do
  ret <- getRandomR (0.0,1.0)
  return ret
You just run something in the RandT monad transformer with runRandT or evalRandT, and for the pure Rand monad, with runRand or evalRand:
import Control.Monad.Random
main = do
  g <- getStdGen
  -- a pure binding inside a do block needs let
  let r = evalRand twoEliteNumbers g
  putStrLn $ "Result is: " ++ show r
twoEliteNumbers :: (RandomGen g) => Rand g (Double, Double)
twoEliteNumbers = do
  -- You can call other functions in the Rand monad
  number1 <- eliteNumber
  number2 <- eliteNumber
  return $ (number1, number2)
eliteNumber :: (RandomGen g) => Rand g Double
eliteNumber = do
  -- When you need random numbers, just call the getRandom* functions
  randomNumber <- getRandomR (0.0, 1.0)
  return $ randomNumber * 1337