Nested triangular loop in Haskell?

I have the following in Java which basically does a nested triangular loop:
int n = 10;
B bs[] = new B[n];
// some initial values, bla bla
double dt = 0.001;
for (int i = 0; i < n; i++) {
bs[i] = new B();
bs[i].x = i * 0.5;
bs[i].v = i * 2.5;
bs[i].m = i * 5.5;
}
for (int i = 0; i < n; i++) {
for (int j = (i+1); j < n; j++) {
double d = bs[i].x - bs[j].x;
double sqr = d * d + 0.01;
double dist = Math.sqrt(sqr);
double mag = dt / (sqr * dist);
bs[i].v -= d * bs[j].m * mag;
bs[j].v += d * bs[i].m * mag;
}
}
// printing out the value v
for (int i = 0; i < n; i++) {
System.out.println(bs[i].v);
}
Class B:
class B {
double x, v, m;
}
In each iteration, the values at indices i and j of the array are both updated, which avoids running a complete nested loop. The following gives the same result but with a complete nested loop (excuse my terminology; it may not be correct, but I hope it makes sense).
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
double d = bs[i].x - bs[j].x;
double sqr = d * d + 0.01;
double dist = Math.sqrt(sqr);
double mag = dt / (sqr * dist);
bs[i].v -= d * bs[j].m * mag;
}
}
NOTE:
the only change from the previous code is int j = 0; NOT int j = (i+1); and removed bs[j].v += d * bs[i].m * mag;
I want to do the same in Haskell but am having difficulty thinking about it properly. I have the following code. The array in the Haskell version is represented as a list (xs), which I've initialised to 0.
n = 20
xs = replicate n 0
update = foldl' (update') xs [0..(n-1)]
  where
    update' i = update'' i (i+1) []
    update'' i j acc
      | j == n = acc
      | otherwise = new_acc
      where
        new_acc = result:acc
        result = ...do something
I am going to have very big values for n, e.g. 1000, 5000, etc.
A complete nested loop over the indices [0..1000] gives length [(i,j)|i<-[0..1000],j<-[0..1000]] = 1002001, but the triangular version gives length [(i,j)|i<-[0..1000],j<-[(i+1)..1000]] = 500500. Using two maps in Haskell makes the complete loops easy, but I want the triangular version. I guess this implies keeping the changes to i and j in a list and then updating the original list at the end? Any ideas would be much appreciated. Thanks

Here's a straightforward translation using unboxed mutable vectors from the vector package. Code is somewhat ugly, but should be very fast:
module Main
       where
import qualified Data.Vector.Unboxed as U
import qualified Data.Vector.Unboxed.Mutable as M
numElts :: Int
numElts = 10
dt :: Double
dt = 0.001
loop :: Int -> M.IOVector Double -> M.IOVector Double
        -> M.IOVector Double -> IO ()
loop n x v m = go 0
  where
    doWork i j = do xI <- M.read x i
                    xJ <- M.read x j
                    vI <- M.read v i
                    vJ <- M.read v j
                    mI <- M.read m i
                    mJ <- M.read m j
                    let d = xI - xJ
                    let sqr = d * d + 0.01
                    let dist = sqrt sqr
                    let mag = dt / (sqr * dist)
                    M.write v i (vI - d * mJ * mag)
                    M.write v j (vJ + d * mI * mag)
    go i | i < n     = do go' (i+1)
                          go (i+1)
         | otherwise = return ()
      where
        go' j | j < n     = do doWork i j
                               go' (j + 1)
              | otherwise = return ()
main :: IO ()
main = do x <- generateVector 0.5
          v <- generateVector 2.5
          m <- generateVector 5.5
          loop numElts x v m
          v' <- U.unsafeFreeze v
          U.forM_ v' print
  where
    generateVector :: Double -> IO (M.IOVector Double)
    generateVector d = do v <- M.new numElts
                          generateVector' numElts d v
                          return v
    generateVector' :: Int -> Double -> M.IOVector Double -> IO ()
    generateVector' n d v = go 0
      where
        go i | i < n     = do M.unsafeWrite v i (fromIntegral i * d)
                              go (i+1)
             | otherwise = return ()
Update: Regarding the "very fast" claim: I benchmarked my solution against the pure one provided by Federico and got the following results (for n = 1000):
benchmarking pureSolution
collecting 100 samples, 1 iterations each, in estimated 334.5483 s
mean: 2.949640 s, lb 2.867693 s, ub 3.005429 s, ci 0.950
std dev: 421.1978 ms, lb 343.8233 ms, ub 539.4906 ms, ci 0.950
found 4 outliers among 100 samples (4.0%)
3 (3.0%) high severe
variance introduced by outliers: 5.997%
variance is slightly inflated by outliers
benchmarking pureVectorSolution
collecting 100 samples, 1 iterations each, in estimated 280.4593 s
mean: 2.747359 s, lb 2.709507 s, ub 2.803392 s, ci 0.950
std dev: 237.7489 ms, lb 179.3110 ms, ub 311.8813 ms, ci 0.950
found 13 outliers among 100 samples (13.0%)
7 (7.0%) high mild
6 (6.0%) high severe
variance introduced by outliers: 2.998%
variance is slightly inflated by outliers
benchmarking imperativeSolution
collecting 100 samples, 1 iterations each, in estimated 5.905104 s
mean: 58.59154 ms, lb 56.79405 ms, ub 60.60033 ms, ci 0.950
std dev: 11.70101 ms, lb 9.120100 ms, ub NaN s, ci 0.950
So the imperative solution is approx. 50 times faster than the functional one (the difference is even more dramatic for smaller n, when everything fits in cache). I tried to make Federico's solution work with unboxed vectors, but apparently it relies on laziness in a crucial way, which makes the unboxed version loop forever. The "pure vector" version uses boxed vectors.
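For readers who want the same imperative loop behind a pure interface, here is a minimal sketch. It swaps in runSTUArray from the array package instead of the vector package used above, so it is not the benchmarked code; the names initial and stepVelocities and the argument order are assumptions made for illustration.

```haskell
import Control.Monad (forM_)
import Data.Array.ST (newListArray, readArray, runSTUArray, writeArray)
import Data.Array.Unboxed (UArray, bounds, elems, listArray, (!))

-- Hypothetical helper: n values i*k, mirroring the question's initial data
-- (x = i*0.5, v = i*2.5, m = i*5.5).
initial :: Int -> Double -> UArray Int Double
initial n k = listArray (0, n - 1) [fromIntegral i * k | i <- [0 .. n - 1]]

-- One triangular pass over all pairs i < j. Positions xs and masses ms are
-- only read; a mutable copy of the velocities vs is updated in place.
stepVelocities
  :: Double -> UArray Int Double -> UArray Int Double -> UArray Int Double
  -> UArray Int Double
stepVelocities dt xs ms vs = runSTUArray $ do
  let (lo, hi) = bounds vs
  v <- newListArray (lo, hi) (elems vs)
  forM_ [lo .. hi] $ \i ->
    forM_ [i + 1 .. hi] $ \j -> do
      let d = xs ! i - xs ! j
          sqr = d * d + 0.01
          mag = dt / (sqr * sqrt sqr)
      vI <- readArray v i
      writeArray v i (vI - d * ms ! j * mag)
      vJ <- readArray v j
      writeArray v j (vJ + d * ms ! i * mag)
  return v
```

With the question's data, stepVelocities 0.001 (initial 10 0.5) (initial 10 5.5) (initial 10 2.5) should give the updated velocities as an immutable array; the mutation stays local to the ST block.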

I'm not sure this solves your problem, because I haven't fully grasped it yet, but the triangular loop itself is very easy to do in Haskell:
import Data.List (tails)
triangularLoop :: (a -> a -> b) -> [a] -> [b]
triangularLoop f xs = do
  (x1 : t) <- tails xs
  x2 <- t
  return $ f x1 x2
Or, written without the monadic syntax,
triangularLoop f = concat . map singlePass . tails
  where
    singlePass [] = []
    singlePass (h:t) = map (f h) t
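As a quick sanity check, the same function can be written as a list comprehension and used to count the pairs it visits. This is only a sketch; pairCount is a hypothetical helper name, not part of the answer.

```haskell
import Data.List (tails)

-- Same behaviour as the two definitions above, as a list comprehension:
-- each element is paired only with the elements that follow it.
triangularLoop :: (a -> a -> b) -> [a] -> [b]
triangularLoop f xs = [f x1 x2 | (x1 : t) <- tails xs, x2 <- t]

-- Hypothetical helper: number of (i, j) pairs with i < j drawn from [0 .. n].
pairCount :: Int -> Int
pairCount n = length (triangularLoop (,) [0 .. n])
```

Here pairCount 1000 evaluates to 500500, the triangular count quoted in the question, and triangularLoop (,) [0, 1, 2] yields [(0,1),(0,2),(1,2)].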

A typical, idiomatic way of writing nested loops in Haskell is using list comprehensions.
Here is how I would translate your code:
import Data.Array
import Data.List (tails)
data Body = Body {x::Double,v::Double,m::Double}
  deriving Show
n::Int
n = 9
dt::Double
dt = 0.001
bs_0 :: Array Int Body
bs_0 = array (0,n) [(i,Body {x = i'*0.5,v = i'*2.5,m = i'*5.5}) |
                    i <- [0..n], let i' = fromIntegral i]
bs :: Array Int Body
bs = accum (\b dv -> b {v = v b + dv}) bs_0 dvs
  where
    dvs :: [(Int,Double)]
    dvs = concat [[(i,dv_i),(j,dv_j)] | (i:is) <- tails [0..n],
                  j <- is,
                  let d = x(bs!i) - x(bs!j)
                      sqr = d * d + 0.01
                      dist = sqrt sqr
                      mag = dt / (sqr * dist)
                      dv_i = -d * m(bs!j) * mag
                      dv_j = d * m(bs!i) * mag]
main :: IO()
main = mapM_ print (assocs bs)
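Since the loop only reads x and m and only writes v, the same accumulation can also be sketched over unboxed arrays by reading positions and masses from the initial data rather than from the result array. The following is a variation under that assumption, not the answer's code; pairDeltas, updateVelocities and example are hypothetical names.

```haskell
import Data.Array.Unboxed (UArray, accum, bounds, listArray, (!))
import Data.List (tails)

-- Velocity increments for every pair i < j, computed from the (unchanging)
-- positions xs and masses ms.
pairDeltas :: Double -> UArray Int Double -> UArray Int Double -> [(Int, Double)]
pairDeltas dt xs ms =
  concat
    [ [(i, negate (d * ms ! j * mag)), (j, d * ms ! i * mag)]
    | (i : js) <- tails [lo .. hi]
    , j <- js
    , let d = xs ! i - xs ! j
          sqr = d * d + 0.01
          mag = dt / (sqr * sqrt sqr)
    ]
  where
    (lo, hi) = bounds xs

-- Add all increments onto the initial velocities vs in one pass.
updateVelocities
  :: Double -> UArray Int Double -> UArray Int Double -> UArray Int Double
  -> UArray Int Double
updateVelocities dt xs ms vs = accum (+) vs (pairDeltas dt xs ms)

-- Two bodies with the question's initial values (x = i*0.5, m = i*5.5, v = i*2.5).
example :: UArray Int Double
example =
  updateVelocities 0.001
    (listArray (0, 1) [0, 0.5])
    (listArray (0, 1) [0, 5.5])
    (listArray (0, 1) [0, 2.5])
```

Because nothing in pairDeltas refers to the array being built, this version does not depend on laziness, which is why unboxed arrays are usable here.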

Related

How to code a function in Haskell which takes every other digit from an Int and adds it to a result Int?

I want to create a function as mentioned in the title. The specific is that it adds the digits in reversed order, you can see that in the test cases: 12 -> 1; 852369 -> 628; 1714 -> 11; 12345 -> 42; 891 -> 9; 448575 -> 784; 4214 -> 14
The main idea is that when the number is bigger than 99 it enters the helper function, which takes i, an indicator of whether the current digit is at an even position, and res, which stores the result. The helper cycles through n, checking whether the current digit is at an even position and, if so, adding it to the result.
So far I've tried the following code:
everyOther :: Int -> Int
everyOther n
  | n < 10 = error "n must be bigger than 10 or equal"
  | n < 100 = div n 10
  | otherwise = helper n 0 0
  where
    helper :: Int -> Int -> Int -> Int
    helper n i res
      | n < 100 = res
      | i == 1 = helper (div n 10) (i - 1) (res + (mod n 10)*10)
      | otherwise = helper (div n 10) i res
Any help would be appreciated!
You can obtain the second-to-last digit of x with mod (div x 10) 10. You can use this with an accumulator that is multiplied by 10 at each step before the new digit is added, so:
everyOther :: Int -> Int
everyOther = go 0
  where go a v
          | v < 10 = a
          | otherwise = go (10*a + mod (div v 10) 10) (div v 100)
If v is less than 10, we return the accumulator, since there is no "other digit" left. Otherwise, we multiply a by 10, add mod (div v 10) 10 to append the other digit, and recurse with the value divided by 100 to move two places to the right.
We can improve this, as @Daniel Wagner says, by making use of quotRem :: Integral a => a -> a -> (a, a):
everyOther :: Int -> Int
everyOther = go 0
  where go a v
          | v < 10 = a
          | otherwise = let (q, r) = v `quotRem` 100 in go (10*a + r `quot` 10) q
Here we work with the remainder of a division by 100, which avoids an extra modulo.
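The test cases listed in the question can be checked against the quotRem version with a small sketch like this one; questionCases and failures are hypothetical names introduced for the check.

```haskell
everyOther :: Int -> Int
everyOther = go 0
  where
    go a v
      | v < 10 = a
      | otherwise = let (q, r) = v `quotRem` 100 in go (10 * a + r `quot` 10) q

-- (input, expected) pairs copied from the question.
questionCases :: [(Int, Int)]
questionCases =
  [ (12, 1), (852369, 628), (1714, 11), (12345, 42)
  , (891, 9), (448575, 784), (4214, 14) ]

-- Every case where the function disagrees with the expected value,
-- as (input, expected, actual).
failures :: [(Int, Int, Int)]
failures =
  [ (n, want, got)
  | (n, want) <- questionCases
  , let got = everyOther n
  , got /= want ]
```

An empty failures list means all seven cases from the question pass.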

Accessing list positions inside functions like map in Haskell

I have just tried rewriting some code, originally a short Javascript function, in Haskell. The original has 2 nested loops and the inner loop contains a check for equality against both loop counters:
function f(x, points){
var i, j;
var n = points.length;
var result = 0;
for(i=0; i<n; i++){
var xprod = 1;
for(j=0; j<n; j++){
if(j != i){
xprod *= (x - points[j][0]);
}
}
result += points[i][1] * xprod;
}
return result;
}
I was hoping to be able to simplify it in Haskell, but I couldn't figure out how to get hold of the i and j values without effectively writing out every step of the original recursively. In Javascript, Array.map passes the list position into the callback function as the second parameter, but it seems that map in Haskell doesn't do this. My current Haskell version looks awful to me, as I'm passing in 2 copies of the array (one for each loop):
xproduct :: Int -> Int -> Double -> [(Double,Double)] -> Double
xproduct _ _ _ [] = 1
xproduct i j x (pt:todo)
  | i == j = (xproduct i (j+1) x todo)
  | otherwise = (xproduct i (j+1) x todo) * (x - (fst pt))
solvestep :: Int -> Double -> [(Double,Double)] -> [(Double,Double)] -> Double
solvestep _ _ _ [] = 0
solvestep i x pts (pt:todo) = ((snd pt) * xprod) + (solvestep (i+1) x pts todo)
  where xprod = xproduct i 0 x pts
solve :: Double -> [(Double,Double)] -> Double
solve x points = solvestep 0 x points points
Is there a better way to do this?
I generally avoid using indices at all, if possible. In this case, what you're really working with is each element of the list paired with all the other elements. There's no need to express that with index comparisons; instead, write a function that gives you a suitable view into the list:
import Control.Arrow (second)
pickouts :: [a] -> [(a,[a])]
pickouts [] = []
pickouts (x:xs) = (x,xs) : (second (x:) <$> pickouts xs)
Then, the actual computation becomes just
f :: Double -> [(Double,Double)] -> Double
f x points = sum [q * product [x-p | (p,_)<-ps] | ((_,q),ps) <- pickouts points]
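If the positions really are wanted, as in the Javascript original, pairing each element with its index via zip [0..] is a common alternative. The sketch below puts that next to the pickouts version for comparison; fIndexed is a hypothetical name, and f and pickouts are repeated so the block is self-contained.

```haskell
import Control.Arrow (second)

pickouts :: [a] -> [(a, [a])]
pickouts [] = []
pickouts (x:xs) = (x, xs) : (second (x:) <$> pickouts xs)

f :: Double -> [(Double, Double)] -> Double
f x points = sum [q * product [x - p | (p, _) <- ps] | ((_, q), ps) <- pickouts points]

-- Index-based variant: tag every point with its position, then skip j == i
-- in the inner product, mirroring the nested Javascript loops.
fIndexed :: Double -> [(Double, Double)] -> Double
fIndexed x points =
  sum [ q * product [x - p | (j, (p, _)) <- indexed, j /= i]
      | (i, (_, q)) <- indexed ]
  where
    indexed = zip [0 :: Int ..] points
```

For example, both f 0 [(1,2),(3,4)] and fIndexed 0 [(1,2),(3,4)] evaluate to -10.0. The index version does the same comparisons as the original loops, while pickouts avoids them.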

What's special about 787?

In ghci, using the arithmoi package:
Math.NumberTheory.Powers.General> :set +s
Math.NumberTheory.Powers.General> integerRoot 786 ((10^32)^786)
100000000000000000000000000000000
(0.04 secs, 227,064 bytes)
Math.NumberTheory.Powers.General> integerRoot 787 ((10^32)^787)
After five minutes, it still hasn't responded. Why is it taking so long?
(From some ad-hoc testing, it appears to be slow for all choices larger than 787 and fast for all choices smaller.)
arithmoi implements integerRoot by getting an initial approximate root and refining its guess with Newton’s method. For (10^32)^786, the second approximation gets a really good starting point:
> appKthRoot 786 ((10^32)^786)
100000000000000005366162204393472
For (10^32)^787, the second approximation gets a really bad starting point. Like, really bad.
> appKthRoot 787 ((10^32)^787)
1797693134862315907729305190789024733617976978942306572734300811577326758055009
6313270847732240753602112011387987139335765878976881441662249284743063947412437
7767893424865485276302219601246094119453082952085005768838150682342462881473913
110540827237163350510684586298239947245938479716304835356329624224137216
It actually gets this approximation for everything starting there.
> length $ nub [appKthRoot x ((10^32)^x) | x <- [787..1000]]
1
Anyway, putting in the important parts of appKthRoot, we get:
> let h = 106; k = 786; n = (10^32)^k; !(I# s) = h * k - k in floor (scaleFloat (h - 1) (fromInteger (n `shiftRInteger` s) ** (1/fromIntegral k) :: Double))
100000000000000005366162204393472
> let h = 106; k = 787; n = (10^32)^k; !(I# s) = h * k - k in floor (scaleFloat (h - 1) (fromInteger (n `shiftRInteger` s) ** (1/fromIntegral k) :: Double))
179769313486231590772930519078902473361797697894230657273430081157732675805500963132708477322407536021120113879871393357658789768814416622492847430639474124377767893424865485276302219601246094119453082952085005768838150682342462881473913110540827237163350510684586298239947245938479716304835356329624224137216
and taking a look at what’s going into scaleFloat:
> let h = 106; k = 786; n = (10^32)^k; !(I# s) = h * k - k in fromInteger (n `shiftRInteger` s) ** (1/fromIntegral k) :: Double
2.465190328815662
> let h = 106; k = 787; n = (10^32)^k; !(I# s) = h * k - k in fromInteger (n `shiftRInteger` s) ** (1/fromIntegral k) :: Double
Infinity
Yeah, that’d do it. (10^32)^786 ÷ 2^82530 ≈ 2^1023.1 fits in a double, but (10^32)^787 ÷ 2^82635 ≈ 2^1024.4 does not.
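The overflow can be reproduced without arithmoi. The sketch below only mimics the shift-then-convert step from the session above, with h = 106 hard-coded; shifted and overflows are hypothetical names, and shiftR from Data.Bits stands in for shiftRInteger.

```haskell
import Data.Bits (shiftR)

-- The Integer that gets converted to Double for exponent k:
-- (10^32)^k shifted right by h*k - k bits, with h = 106.
shifted :: Int -> Integer
shifted k = ((10 ^ (32 :: Int)) ^ k) `shiftR` (106 * k - k)

-- True when that Integer is too large for a finite Double.
overflows :: Int -> Bool
overflows k = isInfinite (fromInteger (shifted k) :: Double)
```

Consistent with the session output, overflows 786 is False and overflows 787 is True: the shifted value crosses the largest finite Double, just under 2^1024, between those two exponents.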

How much space does ridge regression require?

In Haskell, ridge regression can be expressed as:
import Numeric.LinearAlgebra
createReadout :: Matrix Double -> Matrix Double -> Matrix Double
createReadout a b = oA <\> oB
  where
    μ = 1e-4
    oA = (a <> (tr a)) + (μ * (ident $ rows a))
    oB = a <> (tr b)
However, this operation is very memory-intensive. Here is a minimal example that requires more than 2GB on my machine and takes 3 minutes to execute.
import Numeric.LinearAlgebra
import System.Random
createReadout :: Matrix Double -> Matrix Double -> Matrix Double
createReadout a b = oA <\> oB
  where
    mu = 1e-4
    oA = (a <> (tr a)) + (mu * (ident $ rows a))
    oB = a <> (tr b)
teacher :: [Int] -> Int -> Int -> Matrix Double
teacher labelsList cols' correctRow = fromBlocks $ f <$> labelsList
  where ones = konst 1.0 (1, cols')
        zeros = konst 0.0 (1, cols')
        rows' = length labelsList
        f i | i == correctRow = [ones]
            | otherwise = [zeros]
glue :: Element t => [Matrix t] -> Matrix t
glue xs = fromBlocks [xs]
main :: IO ()
main = do
  let n = 1500 -- <- The constant to be increased
      m = 10000
      cols' = 12
  g <- newStdGen
  -- Stub data
  let labels = take m . map (`mod` 10) . randoms $ g :: [Int]
      a = (n >< (cols' * m)) $ take (cols' * m * n) $ randoms g :: Matrix Double
      teachers = zipWith (teacher [0..9]) (repeat cols') labels
      b = glue teachers
  print $ maxElement $ createReadout a b
  return ()
$ cabal exec ghc -- -O2 Test.hs
$ time ./Test
./Test 190.16s user 5.22s system 106% cpu 3:03.93 total
The problem is to increase the constant n to at least n = 4000 while RAM is limited to 5GB. What is the minimal space that the matrix inversion operation requires in theory? How can this operation be optimized in terms of space? Can ridge regression be efficiently replaced with a cheaper method?
Simple Gauss-Jordan elimination only takes space to store the input and output matrices plus constant auxiliary space. If I'm reading correctly, the matrix oA you need to invert is n x n so that's not a problem.
Your memory usage is completely dominated by storing the input matrix a, which uses at least 1500 * 120000 * 8 bytes = 1.34 GiB. n = 4000 would be 4000 * 120000 * 8 bytes = 3.58 GiB, which is over half of your space budget. I don't know what matrix library you are using or how it stores its matrices, but if they are on the Haskell heap then GC effects could easily account for another factor of 2 in space usage.
Well, you can get away with 3*m + n x n space, but I'm not sure how numerically stable this will be.
The basis is the identity
inv(inv(Q) + A'*A) = Q - Q*A'*R*A*Q
where R = inv(I + A*Q*A')
If A is your A matrix and
Q = inv(mu*I*mu*I) = I/(mu*mu)
then the solution to your ridge regression is
inv(inv(Q) + A'*A) * A'*b
A little more algebra shows
inv(inv(Q) + A'*A) = (I - A'*inv(mu2*I + A*A')*A)/mu2
where mu2 = mu*mu
Note that since A is n x m, A*A' is n x n.
So one algorithm would be
Compute C = A*A' + mu2*I
Do a Cholesky decomposition of C, i.e. find upper triangular U so that U'*U = C
Compute the vector y = A'*b
Compute the vector z = A*y
Solve U'*u = z for u, storing the result in z
Solve U*v = z for v, storing the result in z
Compute w = A'*z
Compute x = (y - w)/mu2.
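Written out in standard notation, the steps above amount to the following, where $\mu_2$ stands for the regularization constant the answer calls mu2, $A$ is $n \times m$, and $I_m$, $I_n$ are identity matrices of the matching sizes. This is a restatement of the answer's identity as a sketch, not an independent derivation.

```latex
\begin{aligned}
(\mu_2 I_m + A^\top A)^{-1}
  &= \tfrac{1}{\mu_2}\bigl(I_m - A^\top C^{-1} A\bigr),
  &\qquad C &= A A^\top + \mu_2 I_n = U^\top U,\\
x = (\mu_2 I_m + A^\top A)^{-1} A^\top b
  &= \tfrac{1}{\mu_2}\bigl(y - A^\top C^{-1} A\,y\bigr),
  &\qquad y &= A^\top b.
\end{aligned}
```

The two triangular solves compute $C^{-1} A y = U^{-1} U^{-\top} (A y)$, so the only dense matrix formed besides $A$ is the $n \times n$ matrix $C$.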

Haskell ways to the 3n+1 challenge

Here is a simple programming problem from SPOJ: http://www.spoj.com/problems/PROBTRES/.
Basically, you are asked to output the biggest Collatz cycle for numbers between i and j. (Collatz cycle of a number $n$ is the number of steps to eventually get from $n$ to 1.)
I have been looking for a Haskell way to solve the problem with performance comparable to that of Java or C++ (so that it fits within the allowed run-time limit). A simple Java solution that memoizes the length of any already-computed cycle works, but I haven't been successful at applying the idea to obtain a Haskell solution.
I have tried Data.Function.Memoize, as well as a home-brewed log-time memoization technique using the idea from this post: Memoization in Haskell?. Unfortunately, memoization actually makes the computation of cycle(n) even slower. I believe the slowdown comes from the overhead of the Haskell approach. (I tried running the compiled binary instead of interpreting.)
I also suspect that simply iterating over the numbers from i to j can be costly ($i,j\le10^6$). So I even tried precomputing everything for the range query, using the idea from http://blog.openendings.net/2013/10/range-trees-and-profiling-in-haskell.html. However, this still gives a "Time Limit Exceeded" error.
Can you help me find a neat, competitive Haskell program for this?
Thanks!
Using the approach below, I could submit an accepted answer to SPOJ. You may check the entire code from here.
The problem has bounds 0 < n < 1,000,000. Pre-calculate all of them and store them inside an array; then freeze the array. The array can be used as its own cache / memoization space.
The problem would then reduce to a range query problem over an array, which can be done very efficiently using trees.
With the code below I can get the Collatz cycle lengths of 1..1,000,000 in a fraction of a second:
$ time echo 1000000 | ./collatz
525
real 0m0.177s
user 0m0.173s
sys 0m0.003s
Note that the collatz function below uses a mutable STUArray internally, but is itself a pure function:
import Control.Monad.ST (ST)
import Control.Monad (mapM_)
import Control.Applicative ((<$>))
import Data.Array.Unboxed (UArray, elems)
import Data.Array.ST (STUArray, readArray, writeArray, runSTUArray, newArray)
collatz :: Int -> UArray Int Int
collatz size = out
  where
    next i = if odd i then 3 * i + 1 else i `div` 2
    loop :: STUArray s Int Int -> Int -> ST s Int
    loop arr k
      | size < k = succ <$> loop arr (next k)
      | otherwise = do
          out <- readArray arr k
          if out /= 0 then return out
            else do
              out <- succ <$> loop arr (next k)
              writeArray arr k out
              return out
    out = runSTUArray $ do
      arr <- newArray (1, size) 0
      writeArray arr 1 1
      mapM_ (loop arr) [2..size]
      return arr
main = do
  size <- read <$> getLine
  print . maximum . elems $ collatz size
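For cross-checking the array on small ranges, a naive, unmemoized reference can be handy. The sketch below follows the same convention as the array above (the entry for 1 is 1, so the count includes both endpoints of the sequence); cycleLength and maxCycle are hypothetical names, and inputs are assumed to be at least 1.

```haskell
import Data.List (foldl')

-- Number of terms in the Collatz sequence from n down to 1, inclusive.
-- Assumes n >= 1; it does not terminate for n <= 0.
cycleLength :: Int -> Int
cycleLength n = go n 1
  where
    go 1 acc = acc
    go k acc = acc `seq` go (if odd k then 3 * k + 1 else k `div` 2) (acc + 1)

-- Largest cycle length over the inclusive range between i and j,
-- accepting the bounds in either order.
maxCycle :: Int -> Int -> Int
maxCycle i j = foldl' max 0 (map cycleLength [min i j .. max i j])
```

For instance, cycleLength 22 is 16 and maxCycle 1 10 is 20. This is far too slow for the full problem, but it is useful for testing the fast version on small inputs.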
In order to perform range queries on this array, you can build a simple balanced tree as below:
-- needs (!) added to the Data.Array.Unboxed import list above
type Range = (Int, Int)
data Tree = Leaf Int | Node Tree Tree Range Int
build_tree :: Int -> Tree
build_tree size = loop 1 cnt
  where
    ctz = collatz size
    cnt = head . dropWhile (< size) $ iterate (*2) 1
    (Leaf a) +: (Leaf b) = max a b
    (Node _ _ _ a) +: (Node _ _ _ b) = max a b
    loop lo hi
      | lo == hi = Leaf $ if size < lo then minBound else ctz ! lo
      | otherwise = Node left right (lo, hi) (left +: right)
      where
        i = (lo + hi) `div` 2
        left = loop lo i
        right = loop (i + 1) hi
query_tree :: Tree -> Int -> Int -> Int
query_tree (Leaf x) _ _ = x
query_tree (Node l r (lo, hi) x) i j
  | i <= lo && hi <= j = x
  | mid < i = query_tree r i j
  | j < 1 + mid = query_tree l i j
  | otherwise = max (query_tree l i j) (query_tree r i j)
  where mid = (lo + hi) `div` 2
Here is the same approach as in the other answer, but with an immutable, recursively defined array (it also leaks slightly (can someone say why?) and so is about two times slower):
import Data.Array
upper = 10^6
step :: Integer -> Int
step i = 1 + colAt (if odd i then 3 * i + 1 else i `div` 2)
colAt :: Integer -> Int
colAt i | i > upper = step i
colAt i = col!i
col :: Array Integer Int
col = array (1, upper) $ (1, 1) : [(i, step i) | i <- [2..upper]]
main = print $ maximum $ elems col
