Bad performance for parallel N-Body in Haskell

I wrote an N-body implementation in Haskell and tried to parallelize the updating of position of the particles (outer loop) using parList. However, the performance improvement from 1 to 3 cores is small, and it actually becomes worse for 4 cores. Here is the code, using 5 particles:
import Numeric.LinearAlgebra
import Data.List (zipWith4)
import Data.List.Split
import Control.DeepSeq
import System.Environment
import Control.Parallel.Strategies
type Vec = Vector Double
data Particle = Particle
{ tag :: Int
, position :: !Vec
, velocity :: !Vec
, mass :: Double
}
instance Eq Particle where
p1 == p2 = (tag p1) == (tag p2)
instance Show Particle where
show p = show $ position p
instance NFData Particle where
rnf p = t `seq` ps `seq` vel `seq` m `seq` ()
where
t = tag p
ps = position p
vel = velocity p
m = mass p
data System = System
{ number :: Int
, particles :: [Particle]
}
instance Show System where
show sys = show (particles sys)
instance NFData System where
rnf sys = rnf $ particles sys
g :: Double
g = 6.674e-11
-- | Construct a vector from a list of elements
mkVector :: [Double] -> Vec
mkVector = vector
-- | Construct a system from a given list of
-- position vectors, velocity vectors, and masses
mkSystem :: [(Vec, Vec, Double)] -> System
mkSystem xs = System
{ number = length xs
, particles = map (apply Particle) $ zipWith cons [0..] xs
}
where
cons y (a, b, c) = (y, a, b, c)
apply f (y, a, b, c) = f y a b c
-- | Calculate the gravitational force vector acted on p1 by p2
getForce :: Particle -> Particle -> Vec
getForce p1 p2
| p1 == p2 = scalar 0.0
| otherwise = scalar (g * (m1 * m2 / r^2)) * unit
where
m1 = mass p1
m2 = mass p2
r1 = position p1
r2 = position p2
dr = r2 - r1
r = norm_2 dr
unit = normalize dr
-- | Calculate the net gravitational force acted on the particle by
-- all the other particles
netForce :: Particle -> [Particle] -> Vec
netForce p ps = sum (map (getForce p) ps)
-- | Update the position and velocity of a particle in the system
-- through a given timestep
move :: Double -> System -> Particle -> Particle
move h sys p = let
(pos, vel) = (position p, velocity p)
acc = (netForce p $ particles sys) / (scalar $ mass p)
in
p { position = pos + scalar h * vel
, velocity = vel + scalar h * acc
}
stepSize = 1000
-- | Update the system through one timestep
evolve :: System -> System
evolve sys = sys { particles = ps' }
where
ps' = map (move stepSize sys) (particles sys) `using` parList rdeepseq
-- | Stream of system at different timesteps
evolution :: System -> [System]
evolution = iterate evolve
sys1 :: System
sys1 = mkSystem
[ (mkVector [1.496e11, 0], mkVector [0, 2.98e4], 5.974e24)
, (mkVector [2.279e11, 0], mkVector [0, 2.41e4], 6.42e23)
, (mkVector [5.79e10, 0], mkVector [0, 4.8e4], 3.3e23)
, (mkVector [0, 0], mkVector [0, 0], 1.98e30)
, (mkVector [ 1.08e11, 0], mkVector [0, 3.5e4], 4.87e24)
]
main :: IO ()
main = do
[n] <- fmap (map read) getArgs :: IO [Int]
let states = evolution sys1
finalState = states !! n
finalState `deepseq` (return ())
What is limiting the parallelism?

Two thoughts:
I don't recommend HMatrix. It's a “Matlab in a box” library, which not only goes strongly against Haskell's grain in terms of semantics (dynamical dimension checking† etc.) but also subverts Haskell's better sides in performance. The idea of HMatrix/Matlab is to work with large, monolithic, heap-allocated arrays and do as much as possible with linear algebra primitives (which are implemented in C). This approach makes sense in a slow language like Matlab or Python (you delegate the inner loops to another, faster language), but not really in Haskell: the external calls can't properly be inlined, unlike native Haskell functions, and they add some overhead which though it may be negligible when crunching large vectors, can definitely add up for small ones. Pointer indirection harms cache performance.
The most popular static, native LA library is linear, which takes a much more Haskell-oriented approach (functors everywhere), and has proper tight memory layout. (I'm actually not such a fan of the interface itself, but the data types can also be used with the very nice vector-space operators.)
Switching to the types from linear improves the performance (single-threaded) by a factor of 10 for me.
import Data.VectorSpace
import Data.VectorSpace.Free
import Linear.V2
import Data.List (zipWith4)
import Data.List.Split
import Control.DeepSeq
import System.Environment
import Control.Parallel.Strategies
type Vec = V2 Double -- In case you're wondering: you don't need to
-- hard-code the dimension, you can also make your
-- functions _polymorphic_ on the vector space.
data Particle = Particle
{ tag :: !Int
, position :: !Vec
, velocity :: !Vec
, mass :: !Double
}
instance Eq Particle where
p1 == p2 = (tag p1) == (tag p2)
instance Show Particle where
show p = show $ position p
instance NFData Particle where
rnf p = t `seq` ps `seq` vel `seq` m `seq` ()
where
t = tag p
ps = position p
vel = velocity p
m = mass p
data System = System
{ number :: Int
, particles :: [Particle]
}
instance Show System where
show sys = show (particles sys)
instance NFData System where
rnf sys = rnf $ particles sys
g :: Double
g = 6.674e-11
-- | Construct a vector from a list of elements
mkVector :: [Double] -> Vec
mkVector [x,y] = V2 x y
-- | Construct a system from a given list of
-- position vectors, velocity vectors, and masses
mkSystem :: [(Vec, Vec, Double)] -> System
mkSystem xs = System
{ number = length xs
, particles = map (apply Particle) $ zipWith cons [0..] xs
}
where
cons y (a, b, c) = (y, a, b, c)
apply f (y, a, b, c) = f y a b c
-- | Calculate the gravitational force vector acted on p1 by p2
getForce :: Particle -> Particle -> Vec
getForce p1 p2
| p1 == p2 = zeroV
| otherwise = (g * (m1 * m2 / r^2)) *^ unit
where
m1 = mass p1
m2 = mass p2
r1 = position p1
r2 = position p2
dr = r2 - r1
r = magnitude dr
unit = dr ^/ r
-- | Calculate the net gravitational force acted on the particle by
-- all the other particles
netForce :: Particle -> [Particle] -> Vec
netForce p ps = sum (map (getForce p) ps)
-- | Update the position and velocity of a particle in the system
-- through a given timestep
move :: Double -> System -> Particle -> Particle
move h sys p = let
(pos, vel) = (position p, velocity p)
acc = (netForce p $ particles sys) ^/ mass p
in
p { position = pos ^+^ h *^ vel
, velocity = vel ^+^ h *^ acc
}
stepSize = 1000
-- | Update the system through one timestep
evolve :: System -> System
evolve sys = sys { particles = ps' }
where
ps' = map (move stepSize sys) (particles sys) `using` parList rdeepseq
-- | Stream of system at different timesteps
evolution :: System -> [System]
evolution = iterate evolve
sys1 :: System
sys1 = mkSystem
[ (mkVector [1.496e11, 0], mkVector [0, 2.98e4], 5.974e24)
, (mkVector [2.279e11, 0], mkVector [0, 2.41e4], 6.42e23)
, (mkVector [5.79e10, 0], mkVector [0, 4.8e4], 3.3e23)
, (mkVector [0, 0], mkVector [0, 0], 1.98e30)
, (mkVector [ 1.08e11, 0], mkVector [0, 3.5e4], 4.87e24)
]
main :: IO ()
main = do
[n] <- fmap (map read) getArgs :: IO [Int]
let states = evolution sys1
finalState = states !! n
finalState `deepseq` (return ())
I wouldn't expect to gain much from parallelisation for only five particles anyhow. Each thread has very little it can do before giving back control, and everything interacts. You therefore quickly notice the overhead even of Haskell's light green threads.
To benefit properly from parallelisation, you'll need to either set up the threads as a thread pool (like you would in traditional number-crunching languages), thus avoiding the sparking overhead, or make sure each thread has a lot to do. This works best if you mostly decouple the threads' domains; in truly many-body applications this is done with something like a Barnes-Hut simulation.
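For the second option, a minimal sketch (assuming you keep the list representation; the chunk size 64 is an arbitrary number to tune) is to hand each spark a whole chunk of particles instead of a single one, using parListChunk from Control.Parallel.Strategies:
-- in evolve, instead of parList:
ps' = map (move stepSize sys) (particles sys)
  `using` parListChunk 64 rdeepseq -- one spark per 64 particles
With five particles this changes nothing, but with thousands of particles it amortises the spark bookkeeping over real work.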
Another thing: you're currently using a simple Euler solver for the equations of motion. That's a very poor scheme. Switching to a higher-order Runge-Kutta solver, ideally with adaptive step size, would save you so much in the number of steps that the parallelisation aspect largely pales in comparison.
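For illustration, here is a minimal sketch of a midpoint (second-order Runge-Kutta) step for a generic first-order ODE y' = f t y, written with the vector-space operators used above; rk2Step and its arguments are illustrative names, not part of the code above:
{-# LANGUAGE TypeFamilies #-}
import Data.VectorSpace
rk2Step :: (VectorSpace v, Scalar v ~ Double)
  => (Double -> v -> v) -- derivative f t y
  -> Double -- step size h
  -> (Double, v) -- current (t, y)
  -> (Double, v)
rk2Step f h (t, y) =
  let k1 = f t y -- slope at the start of the step
      k2 = f (t + h/2) (y ^+^ (h/2) *^ k1) -- slope at the midpoint
  in (t + h, y ^+^ h *^ k2)
The error per step is O(h^3) instead of Euler's O(h^2), so you can take far larger steps for the same accuracy.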
†Actually, HMatrix also has had a statically checked module for quite some time now.

Related

Generate a random list of custom data type, where values provided to data constructor are somehow bounded within a range

I have defined a Point data type, with a single value constructor like so:
data Point = Point {
x :: Int,
y :: Int,
color :: Color
} deriving (Show, Eq)
data Color = None
| Black
| Red
| Green
| Blue
deriving (Show, Eq, Enum, Bounded)
I have found an example of making a Bounded Enum an instance of the Random class and have made Color an instance of it like so:
instance Random Color where
random g = case randomR (1, 4) g of
(r, g') -> (toEnum r, g')
randomR (a, b) g = case randomR (fromEnum a, fromEnum b) g of
(r, g') -> (toEnum r, g')
I was then able to find out how to make Point an instance of the Random class also:
instance Random Point where
randomR (Point xl yl cl, Point xr yr cr) g =
let (x, g1) = randomR (xl, xr) g
(y, g2) = randomR (yl, yr) g1
(c, g3) = randomR (cl, cr) g2
in (Point x y c, g3)
random g =
let (x, g1) = random g
(y, g2) = random g1
(c, g3) = random g2
in (Point x y c, g3)
So, this let's me make random point values. But, what I'd like to do is be able to create a list of random Point values, where the x and the y properties are bounded within some range whilst leaving the color property to be an unbounded random value. Is this possible with the way I am currently modelling the code, or do I need to rethink how I construct Point values? For instance, instead of making Point an instance of the Random class, should I just create a random list of Int in the IO monad and then have a pure function that creates n Points, using values from the random list to construct each Point value?
Edit, I think I have found out how to do it:
Without changing the above code, in the IO monad I can do the following:
solved :: IO ()
solved = do
randGen <- getStdGen
let x = 2
let randomPoints = take x $ randomRs (Point 0 0 None, Point 200 200 Blue) randGen
putStrLn $ "Random points: " ++ show randomPoints
This seems to work, randomRs appears to let me specify a range...
Presumably because my Point data type derives Eq?
Or
Is it because my x and y properties are Int (guessing here, but may be "bounded" by default) and I have Color derive bounded?
It works because of the properties of the Int and Color types, not because of the properties of Point. If one removes the Eq clause from Point's deriving list, it still works.
Your code is overall quite good; however, I would mention a few minor caveats.
In the Random instance for Point, you are chaining the generator states manually; this is a bit error-prone, and monadic do notation is meant to make it unnecessary. The Color instance could be simplified.
You are using IO where it is not really required. IO is just one instance of the MonadRandom class. If g is an instance of RandomGen, any Rand g is an instance of MonadRandom.
The random values you're getting are not reproducible from one program execution to the next; this is because getStdGen implicitly uses the launch time as the seed of the random number generator. It can do that because it lives in IO. In many situations this is a problem, as one might want to vary the choice of random sequence and the system parameters independently of each other.
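To illustrate the last two points with a small sketch (roll is an illustrative name): the same MonadRandom action can run in IO or be evaluated purely and reproducibly:
import Control.Monad.Random (MonadRandom, getRandomR, evalRand)
import System.Random (mkStdGen)
roll :: MonadRandom m => m Int
roll = getRandomR (1, 6)
demo :: IO ()
demo = do
  a <- roll -- IO is a MonadRandom instance
  let b = evalRand roll (mkStdGen 0) -- pure, same value on every run
  print (a, b)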
Using monadic style, the basics of your code could be rewritten for example like this:
import System.Random
import System.Random.TF -- Threefish random number generator
import Control.Monad.Random
data Point = Point {
x :: Int,
y :: Int,
color :: Color
} deriving (Show, Eq)
data Color = None
| Black
| Red
| Green
| Blue
deriving (Show, Eq, Enum, Bounded)
instance Random Color where
randomR (a, b) g = let (r,g') = randomR (fromEnum a, fromEnum b) g
in (toEnum r, g')
random g = randomR (minBound::Color, maxBound::Color) g
singleRandomPoint :: -- monadic action for just one random point
MonadRandom mr => Int -> Int -> Color -> Int -> Int -> Color -> mr Point
singleRandomPoint xmin ymin cmin xmax ymax cmax =
do
-- avoid manual chaining of generator states:
x <- getRandomR (xmin, xmax)
y <- getRandomR (ymin, ymax)
c <- getRandomR (cmin, cmax)
return (Point x y c)
And then we can derive an expression returning an unlimited list of random points:
-- monadic action for an unlimited list of random points:
seqRandomPoints :: MonadRandom mr =>
Int -> Int -> Color -> Int -> Int -> Color -> mr [Point]
seqRandomPoints xmin ymin cmin xmax ymax cmax =
sequence (repeat (singleRandomPoint xmin ymin cmin xmax ymax cmax))
-- returns an unlimited list of random points:
randomPoints :: Int -> Int -> Int -> Color -> Int -> Int -> Color -> [Point]
randomPoints seed xmin ymin cmin xmax ymax cmax =
let
-- get random number generator:
-- using Threefish algorithm (TF) for better statistical properties
randGen = mkTFGen seed
action = seqRandomPoints xmin ymin cmin xmax ymax cmax
in
evalRand action randGen
Finally we can print the first few random points on stdout:
-- Small printing utility:
printListAsLines :: Show t => [t] -> IO()
printListAsLines xs = mapM_ (putStrLn . show) xs
solved01 :: IO ()
solved01 = do
let
seed = 42 -- for random number generator setup
-- unlimited list of random points:
allRandomPoints = randomPoints seed 0 0 None 200 200 Blue
count = 5
someRandomPoints = take count allRandomPoints
-- IO not used at all so far
putStrLn $ "Random points: "
printListAsLines someRandomPoints
main = solved01
Program execution (reproducible with constant seed):
$ randomPoints
Random points:
Point {x = 187, y = 56, color = Green}
Point {x = 131, y = 28, color = Black}
Point {x = 89, y = 135, color = Blue}
Point {x = 183, y = 190, color = Red}
Point {x = 27, y = 161, color = Green}
$
Should you prefer to just get a finite number of points and also get back the updated state of your random number generator, you would have to use replicate n instead of repeat, and runRand instead of evalRand.
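A sketch of that variant, reusing singleRandomPoint from above (nRandomPoints is an illustrative name):
import Control.Monad (replicateM)
-- a finite list of n points instead of an unlimited one
nRandomPoints :: MonadRandom mr =>
  Int -> Int -> Int -> Color -> Int -> Int -> Color -> mr [Point]
nRandomPoints n xmin ymin cmin xmax ymax cmax =
  replicateM n (singleRandomPoint xmin ymin cmin xmax ymax cmax)
-- runRand also hands back the updated generator:
-- (points, gen') = runRand (nRandomPoints 5 0 0 None 200 200 Blue) (mkTFGen 42)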
A bit more detail about the monadic approach can be found here.

How to memoize the repeated subtrees of a game tree (a potentially infinite rose tree)?

I am attempting to implement the Negamax algorithm in Haskell.
For this, I am representing the future possibilities a game might take in a rose tree (Data.Tree.Forest (depth, move, position)). However, often there are positions that can be reached with two different sequences of moves. It is a waste (and quickly becomes very slow) to re-evaluate (the subtrees of) repeated positions.
Here is what I tried so far:
Implement a variant of Tying the Knot to share common sub-results. However, I have only been able to find explanations of tying the knot for (potentially infinite) lists, and nothing about re-using subtrees.
Another approach I have considered was to build a tree inside the State monad, where the state to keep would be a Map (depth, position) (Forest (depth, move, position)) to perform explicit memoization but I have so far not been able to set this up properly either.
I think that both approaches might have the problem that a game tree can only be built in a corecursive way: We do not build the tree up to the root from the leaves, but build a (potentially infinite) tree lazily from the root down.
EDIT: To give you an example of the code I am currently using (that is too slow):
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE FlexibleContexts #-}
module ZeroSumGame where
import qualified Control.Arrow
import Data.Tree
import Numeric.Natural (Natural)
(|>) :: a -> (a -> b) -> b
x |> f = f x
infixl 0 |>
{-# INLINE (|>) #-}
class Ord s => Game s where
data Move s
initial :: s -- | Beginning of the game
applyMove :: Natural -> s -> Move s -> s -- | Moving from one game state to the next
possibleMoves :: Natural -> s -> [Move s] -- | Lists moves the current player is able to do.
isGameOver :: s -> Bool -- | True if the game has ended. TODO: Maybe write default implementation using `possibleMoves state == []`?
scorePosition :: Natural -> Move s -> s -> Int -- | Turns a position in an integer, for the Negamax algorithm to decide which position is the best.
type Trimove state = (Natural, Move state, state) -- | Depth since start of game, move to next position, new position
gameforest :: Game s => Natural -> s -> Forest (Trimove s)
gameforest start_depth start_state = unfoldForest buildNode (nextpositions start_depth start_state)
where
buildNode (depth, move, current_state) =
if
isGameOver current_state
then
((depth, move, current_state), [])
else
((depth, move, current_state), nextpositions depth current_state)
nextpositions depth current_state =
current_state
|> possibleMoves depth
|> fmap (\move -> (succ depth, move, applyMove depth current_state move))
scoreTree :: Game s => Ord (Move s) => Natural -> Tree (Trimove s) -> (Move s, Int)
scoreTree depth node =
case (depth, subForest node) of
(0, _) ->
node |> rootLabel |> uncurry3dropFirst scorePosition
(_, []) ->
node |> rootLabel |> uncurry3dropFirst scorePosition
(_, children) ->
children
|> scoreForest (pred depth)
|> map (Control.Arrow.second negate)
|> maximum
uncurry3dropFirst :: (a -> b -> c -> d) -> (a, b, c) -> (b, d)
uncurry3dropFirst fun (a, b, c) = (b, fun a b c)
scoreForest :: Game s => Ord (Move s) => Natural -> Forest (Trimove s) -> [(Move s, Int)]
scoreForest depth forest =
forest
|> fmap (scoreTree depth)
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE FlexibleContexts #-}
module TicTacToe where
import qualified Control.Monad.State
import Control.Monad.State (State)
import qualified Data.Map
import Data.Map (Map)
import qualified Control.Arrow
import Data.Tree
import Data.Array (Array)
import qualified Data.Array
import qualified Data.Maybe
import qualified Data.Foldable
import Numeric.Natural (Natural)
import ZeroSumGame
data CurrentPlayer = First | Second
deriving (Eq, Ord, Show)
instance Enum CurrentPlayer where
fromEnum First = 1
fromEnum Second = -1
toEnum 1 = First
toEnum (-1) = Second
toEnum _ = error "Improper player"
newtype TicTacToe = TicTacToe (Array (Int, Int) (Maybe CurrentPlayer))
deriving (Eq, Ord)
instance Game TicTacToe where
data Move TicTacToe = TicTacToeMove (Int, Int)
deriving (Eq, Ord, Show, Bounded)
initial = TicTacToe initialTicTacToeBoard
possibleMoves _depth = possibleTicTacToeMoves
applyMove depth (TicTacToe board) (TicTacToeMove (x, y)) =
TicTacToe newboard
where
newboard = board Data.Array.// [((x, y), Just player)]
player = case depth `mod` 2 of
0 -> First
_ -> Second
isGameOver state = Data.Maybe.isJust (findFilledLines state)
scorePosition _ _ state =
state
|> findFilledLines
|> fmap fromEnum
|> Data.Maybe.fromMaybe 0
|> (* (-10000))
findFilledLines :: TicTacToe -> Maybe CurrentPlayer
findFilledLines (TicTacToe board) =
(rows ++ columns ++ diagonals)
|> map winner
|> Data.Foldable.asum
where
rows = vals rows_indexes
columns = vals columns_indexes
diagonals = vals diagonals_indexes
rows_indexes = [[(i, j) | i <- [0..2]]| j <- [0..2]]
columns_indexes = [[(i, j) | j <- [0..2]]| i <- [0..2]]
diagonals_indexes = [[(i, i) ]| i <- [0..2]] ++ [[(i, 2 - i) ]| i <- [0..2]]
vals = map (map (\index -> board Data.Array.! index))
winner :: Eq a => [Maybe a] -> Maybe a
winner [x,y,z] =
if x == y && x == z then x else Nothing
winner _ = Nothing
initialTicTacToeBoard :: (Array (Int, Int) (Maybe CurrentPlayer))
initialTicTacToeBoard =
Data.Array.array ((0, 0), (2, 2)) [((i, j), Nothing) | i <- [0..2], j <- [0..2]]
possibleTicTacToeMoves :: TicTacToe -> [Move TicTacToe]
possibleTicTacToeMoves (TicTacToe board) = foldr checkSquareForMove [] (Data.Array.assocs board)
where
checkSquareForMove (index, val) acc = case val of
Nothing -> TicTacToeMove index : acc
Just _ -> acc
printBoard :: TicTacToe -> String
printBoard (TicTacToe board) =
unlines [unwords [showTile (board Data.Array.! (y, x)) | x <- [0..2]] | y <- [0..2]]
where
showTile loc =
case loc of
Nothing -> " "
Just Second -> "X"
Just First -> "O"
(TypeFamilies is used to allow each Game implementation to have its own notion of a Move, and FlexibleContexts is then required so that Move s can be constrained to implement Ord.)
Problem reformulation
If I understand the question correctly, you have a function that returns the possible next moves in a game, and one to take that move:
start :: Position
moves :: Position -> [Move]
act :: Position -> Move -> Position
and how you want to build the infinite tree of states (please allow me to ignore the Depth field, for simplicity. If you consider the depth counter as part of the Position type, you see that no generality is lost here):
states :: Forest (Move, Position)
states = forest start
forest :: Position -> Forest (Move, Position)
forest p = [ Node (m, p') (forest p') | m <- moves p, let p' = act p m ]
but you want to achieve that in a way that identical subtrees of forest are shared.
Towards Memoization
The general technique here is that we want to memoize forest: this way, for identical Positions, we get shared subtrees. So the recipe is:
forest :: Position -> Forest (Move, Position)
forest = memo forest'
forest' :: Position -> Forest (Move, Position)
forest' p = [ Node (m, p') (forest p') | m <- moves p, let p' = act p m ]
And we need a suitable memo-function:
memo :: (Position -> a) -> (Position -> a)
At this point, we need to know more about Position in order to know how to implement that using an equivalent of the “lazy list” trick… But note that you do not need to memoize functions that involve rose trees.
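For instance, if Position can be encoded as a non-negative Int, the “lazy list” trick looks like this (a sketch; encode and decode are assumed conversions, not given here):
memoInt :: (Int -> a) -> (Int -> a)
memoInt f = (map f [0 ..] !!) -- results are shared through the lazy list
memo :: (Position -> a) -> (Position -> a)
memo f = memoInt (f . decode) . encode -- assuming encode :: Position -> Int, decode :: Int -> Position
(For production use you would pick a faster structure than a list, e.g. an infinite binary trie as in the MemoTrie package, but the sharing principle is the same.)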
I would try to do this by normalizing board positions based on some "canonical" sequence of moves to reach that position. Then each child is assigned the value of traversing its individual normalized sequence through the tree. (no code because I'm on my phone and this is a big task.)
How well this works depends on the ease of calculating normalized move sequences in the game you're playing. But it's a way to introduce sharing by tying the knot, making use of a shared reference to the root of the game tree. Maybe it will serve as inspiration for other ideas that fit your specific case.

Haskell: Implementing a design with an interface and a polymorphic function

Again I'm requesting comments about how a given design should be implemented in Haskell.
Thanks in advance to everyone providing helpful comments. Also I hope this could be an aid to other Haskell novices like me, having a practical sample code.
This time, we have a polymorphic function doSampling (in module Samples) that takes a generic function f and a list of reals (indexes) and returns a Samples (indexes, values = f(indexes)). We want to implement doSampling only once, as it doesn't matter whether f is a Polynomial or a Sinus. For that, we have introduced an interface Function, and have the Polynomial and Sinus types implement it.
The following is the design being implemented:
Edit 1:
There is a debate on the Function interface (class in Haskell). It has been suggested it is not actually necessary, as doSampling may take a "nude" function (Double -> Double) instead.
But how would you do that if you need some extra state within the nude function (coeffs for a polynomial; amp, freq, and phase for a sinus)?
Edit 2:
Very good answers by kosmikus and by Chris Taylor. Thanks.
A key idea in both: have
doSampling :: (Double -> Double) -> [Double] -> Samples
This is: it takes a function (Double -> Double) (instead of Function) and list and returns samples.
My intention was to keep the state of Polynomials and Sinuses. That is not addressed in Chris's answer, but it is in kosmikus'. On the other hand, the weak point of kosmikus' version could be how to extend its Function definition if you don't have access to the source code.
I would also point out:
Chris's idea of encapsulating a polynomial or a sinus into a function (Double -> Double) by means of a factory function mkPolynomial or mkSinus that generates (using currying?) the desired function taking the appropriate parameters. (Although you can't consult the parameters later.)
kosmikus' idea of using value to transform (also using currying?) a Function into a (Double -> Double)
Both answers are worth reading as they have other little Haskell tricks to reduce and simplify code.
In sum
Chris's answer does not support keeping the state of a Polynomial or of a Sinus
kosmikus' answer is not extensible: adding a new type of function (Cosinus ...) requires access to the source
my answer (being verbose) does overcome the previous downsides, and it would also allow one (though this is not necessary for the problem) to require Function types to have more associated functions apart from value (in the sense of how Java interfaces work)
My own approach
main (usage)
import Polynomial
import Sinus
import Function
import Samples
-- ...............................................................
p1 = Polynomial [1, 0, 0.5] -- p(x) = 1 + 0.5x^2
s1 = Sinus 2 0.5 3 -- f(x) = 2 sin(0.5x + 3)
-- ...............................................................
-- sample p1 from 0 to 5
m1 = doSampling p1 [0, 0.5 .. 5]
m2 = doSampling s1 [0, 0.5 .. 5]
-- ...............................................................
-- main
-- ...............................................................
main = do
putStrLn "Hello"
print $ value p1 2
print $ value s1 (pi/2)
print $ pairs m1
print $ pairs m2
Function
module Function where
-- ...............................................................
-- "class type" : the types belonging to this family of types
-- must implement the following functions:
-- + value : takes a function and a real and returns a real
-- ...............................................................
class Function f where
value :: f -> Double -> Double
-- f is a type variable, this is:
-- f is a type of the Function "family" not an actual function
Samples
module Samples where
import Function
-- ...............................................................
-- Samples: new data type
-- This is the constructor and says it requires
-- two lists, one for the indexes (xs values) and another
-- for the values ( ys = f (xs) )
-- this constructor should not be used, instead use
-- the "factory" function: new_Samples that performs some checks
-- ...............................................................
data Samples = Samples { indexes :: [Double] , values :: [Double] }
deriving (Show)
-- ...............................................................
-- constructor: it checks lists are equal size, and indexes are sorted
new_Samples :: [Double] -> [Double] -> Samples
new_Samples ind val
| (length ind) /= (length val) = samplesVoid
| not $ isSorted ind = samplesVoid
| otherwise = Samples ind val
-- ...............................................................
-- sample a function
-- it takes a function f and a list of indexes and returns
-- a Samples calculating the values array as f(indexes)
doSampling :: (Function f) => f -> [Double] -> Samples
doSampling f ind = new_Samples ind vals
where
vals = [ value f x | x <- ind ]
-- ...............................................................
-- used as "error" in the construction
samplesVoid = Samples [] []
-- ...............................................................
size :: Samples -> Int
size samples = length (indexes samples)
-- ...............................................................
-- utility function to get a pair (index,value) out of a Samples
pairs :: Samples -> [(Double, Double)]
pairs samples = pairs' (indexes samples) (values samples)
pairs' :: [Double] -> [Double] -> [(Double, Double)]
pairs' [] [] = []
pairs' [i] [v] = [(i,v)]
pairs' (i:is) (v:vs) = (i,v) : pairs' is vs
-- ...............................................................
-- to check whether a list is sorted (<)
isSorted :: (Ord t) => [t] -> Bool
isSorted [] = True
isSorted [e] = True
isSorted (e1:(e2:tail))
| e1 < e2 = isSorted (e2:tail)
| otherwise = False
Sinus
module Sinus where
-- ...............................................................
import Function
-- ...............................................................
-- Sinus: new data type
-- This is the constructor and says it requires
-- three reals
-- ...............................................................
data Sinus = Sinus { amplitude :: Double, frequency :: Double, phase :: Double }
deriving (Show)
-- ...............................................................
-- we say that a Sinus is a Function (member of the class Function)
-- and then, how value is implemented
instance Function Sinus where
value s x = (amplitude s) * sin ( (frequency s)*x + (phase s))
Polynomial
module Polynomial where
-- ...............................................................
import Function
-- ...............................................................
-- Polynomial: new data type
-- This is the constructor and says it requires
-- a list of coefficients
-- ...............................................................
data Polynomial = Polynomial { coeffs :: [Double] }
deriving (Show)
-- ...............................................................
degree :: Polynomial -> Int
degree p = length (coeffs p) - 1
-- ...............................................................
-- we say that a Polynomial is a Function (member of the class Function)
-- and then, how value is implemented
instance Function Polynomial where
value p x = value' (coeffs p) x 1
-- list of coeffs -> x -> pw (power of x) -> Double
value' :: [Double] -> Double -> Double -> Double
value' (c:[]) _ pw = c * pw
value' (c:cs) x pw = (c * pw) + (value' cs x x*pw)
You certainly don't need the Function class. All this heavyweight class, instance, member variable fluff is one of the things that Haskell is designed to avoid. Pure functions can be much more flexible.
Here's a simple way of doing what you want.
type Sample = ([Double], [Double])
newSample :: [Double] -> [Double] -> Sample
newSample xs vs
| isSorted xs && length xs == length vs = (xs, vs)
| otherwise = ([], [])
pairs = uncurry zip
doSampling :: (Double -> Double) -> [Double] -> Sample
doSampling f xs = newSample xs (map f xs)
mkPolynomial :: [Double] -> (Double -> Double)
mkPolynomial coefs x = go coefs
where
go [] = 0
go (c:cs) = c + x * go cs
mkSinus :: Double -> Double -> Double -> (Double -> Double)
mkSinus amp freq phase x = amp * sin (freq * x + phase)
p1 = mkPolynomial [1, 0, 0.5] -- 1 + 0.5x^2
s1 = mkSinus 2 0.5 3 -- 2 sin(0.5x + 3)
m1 = doSampling p1 [0, 0.5 .. 5]
m2 = doSampling s1 [0, 0.5 .. 5]
main :: IO ()
main = do
print $ p1 2
print $ s1 (pi/2)
print $ pairs m1
print $ pairs m2
[Expanded my comment on request.]
I'd probably do this roughly as follows:
import Data.Functor
-- Use a datatype rather than a class. Yes, this makes it harder to
-- add new types of functions later, and in turn easier to define new
-- operations. ("expression problem")
data Function =
Sinus { amplitude :: Double, frequency :: Double, phase :: Double }
| Polynomial { coeffs :: [Double] }
deriving (Show)
-- Interpreting a Function as an actual function.
value :: Function -> (Double -> Double)
value (Sinus amp freq ph) x = amp * sin (freq * x + ph)
value (Polynomial cs) x = value' cs x
-- Rewrite value' to not require non-empty lists. This can also be
-- nicely written as a fold.
value' :: [Double] -> Double -> Double
value' [] _ = 0
value' (c:cs) x = c + x * value' cs x
data Samples = Samples { indexes :: [Double] , values :: [Double] }
deriving (Show)
-- Use Maybe to detect error conditions, instead of strange values
-- such as voidSamples.
newSamples :: [Double] -> [Double] -> Maybe Samples
newSamples ind val
| length ind /= length val = Nothing
| not $ isSorted ind = Nothing
| otherwise = Just (Samples ind val)
doSampling :: (Double -> Double) -> [Double] -> Maybe Samples
doSampling f ind = newSamples ind (map f ind)
isSorted :: (Ord t) => [t] -> Bool
isSorted [] = True
isSorted [e] = True
isSorted (e1:e2:es)
| e1 < e2 = isSorted (e2:es)
| otherwise = False
-- This is just zip.
pairs :: Samples -> [(Double, Double)]
pairs (Samples idxs vals) = zip idxs vals
p1 = Polynomial [1, 0, 0.5] -- p(x) = 1 + 0.5x^2
s1 = Sinus 2 0.5 3 -- f(x) = 2 sin(0.5x + 3)
m1 = doSampling (value p1) [0, 0.5 .. 5]
m2 = doSampling (value s1) [0, 0.5 .. 5]
-- The <$> maps over a Maybe.
main = do
putStrLn "Hello"
print $ value p1 2
print $ value s1 (pi/2)
print $ pairs <$> m1
print $ pairs <$> m2

How do I stop randomness from pervading my code in Haskell?

I am attempting to implement the following algorithm, as detailed here.
Start with a flat terrain (initialize all height values to zero).
Pick a random point on or near the terrain, and a random radius
between some predetermined minimum and maximum. Carefully choosing
this min and max will make a terrain rough and rocky or smooth and
rolling.
Raise a hill on the terrain centered at the point, having the given
radius.
Go back to step 2, and repeat as many times as necessary. The number
of iterations chosen will affect the appearance of the terrain.
However, I start to struggle once I get to the point where I have to select a random point on the terrain. This random point is wrapped in an IO monad, which is then passed up my chain of functions.
Can I cut the IO off at a certain point and, if so, how do I find that point?
The following is my (broken) code. I would appreciate any suggestions on improving it / stopping the randomness from infecting everything.
type Point = (GLfloat, GLfloat, GLfloat)
type Terrain = [Point]
flatTerrain :: Double -> Double -> Double -> Double -> Terrain
flatTerrain width length height spacing =
[(realToFrac x, realToFrac y, realToFrac z)
| x <- [-width,-1+spacing..width], y <- [height], z <- [-length,-1+spacing..length]]
hill :: Terrain -> Terrain
hill terrain = hill' terrain 100
where hill' terrain 0 = terrain
hill' terrain iterations = do
raised <- raise terrain
hill' (raise terrain) (iterations - 1)
raise terrain = do
point <- pick terrain
map (raisePoint 0.1 point) terrain
raisePoint r (cx,cy,cz) (px,py,pz) =
(px, r^2 - ((cx - px)^2 + (cz - pz)^2), pz)
pick :: [a] -> IO a
pick xs = randomRIO (0, (length xs - 1)) >>= return . (xs !!)
The algorithm says that you need to iterate, and in each iteration select a random point and update the terrain. This can be viewed as first generating a list of random points and then using that list to update the terrain, i.e. the iteration that generates random numbers becomes a list of random numbers.
So you can do something like:
selectRandomPoints :: [Point] -> Int -> IO [Point] -- generate Int random points
updateTerrain :: Terrain -> [Point] -> Terrain
-- somewhere in IO
do
pts <- selectRandomPoints allPts iterationCount
let newTerrain = updateTerrain t pts
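A minimal sketch of selectRandomPoints, reusing pick from the question:
import Control.Monad (replicateM)
selectRandomPoints :: [Point] -> Int -> IO [Point]
selectRandomPoints pts n = replicateM n (pick pts)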
One of the most useful features of Haskell is being able to tell that a function is deterministic just from its type - it makes testing much easier. For this reason, I would base my design on limiting randomness as much as possible, and wrapping the core non-random functions with a random variant. This is easily done with the MonadRandom type class, which is the best way of writing code in Haskell that requires random values.
For fun, I wrote a console version of that hill generator. It is pretty basic, with a lot of hard coded constants. However, it does provide a pretty cool ascii terrain generator :)
Note with my solution all of the calculations are isolated in pure, non random functions. This could then be tested easily, as the result is deterministic. As little as possible occurs in the IO monad.
import Control.Monad
import Control.Monad.Random
import Data.List
import Data.Function (on)
type Point = (Double, Double, Double)
type Terrain = [Point]
-- Non random code
flatTerrain :: Double -> Double -> Double -> Double -> Terrain
flatTerrain width length height spacing = [(realToFrac x, realToFrac y, realToFrac z)
| x <- [-width,-width+spacing..width], y <- [height], z <- [-length,-length+spacing..length]]
-- simple terrain displayer, uses ascii to render the area.
-- assumes the terrain points are all separated by the same amount
showTerrain :: Terrain -> String
showTerrain terrain = unlines $ map (concat . map showPoint) pointsByZ where
pointsByZ = groupBy ((==) `on` getZ) $ sortBy (compare `on` getZ) terrain
getZ (_, _, z) = z
getY (_, y, _) = y
largest = getY $ maximumBy (compare `on` getY) terrain
smallest = getY $ minimumBy (compare `on` getY) terrain
atPC percent = (largest - smallest) * percent + smallest
showPoint (_, y, _)
| y < atPC (1/5) = " "
| y < atPC (2/5) = "."
| y < atPC (3/5) = "*"
| y < atPC (4/5) = "^"
| otherwise = "#"
addHill :: Double -- Radius of hill
-> Point -- Position of hill
-> Terrain -> Terrain
addHill radius point = map (raisePoint radius point) where
raisePoint :: Double -> Point -> Point -> Point
-- I had to add max py here, otherwise new hills destroyed the
-- old hills with negative values.
raisePoint r (cx,cy,cz) (px,py,pz) = (px, max py (r^2 - ((cx - px)^2 + (cz - pz)^2)), pz)
-- Some random variants. IO is an instance of MonadRandom, so these function can be run in IO. They
-- can also be run in any other monad that has a MonadRandom instance, so they are pretty flexible.
-- creates a random point. Note that the ranges are hardcoded - an improvement would
-- be to be able to specify them, either through parameters, or through reading from a Reader
-- monad or similar
randomPoint :: (MonadRandom m) => m Point
randomPoint = do
x <- getRandomR (-30, 30)
y <- getRandomR (0,10)
z <- getRandomR (-30, 30)
return (x, y, z)
addRandomHill :: (MonadRandom m) => Terrain -> m Terrain
addRandomHill terrain = do
radius <- getRandomR (0, 8) -- hardcoded again
position <- randomPoint
return $ addHill radius position terrain
-- Add many random hills to the Terrain
addRandomHills :: (MonadRandom m) => Int -> Terrain -> m Terrain
addRandomHills count = foldr (>=>) return $ replicate count addRandomHill
-- testing code
test hillCount = do
let terrain = flatTerrain 30 30 0 2
withHills <- addRandomHills hillCount terrain
-- let oneHill = addHill 8 (0, 3, 0) terrain
-- putStrLn $ showTerrain oneHill
putStrLn $ showTerrain withHills
main = test 200
Example output:
... .. ..*. .***^^^***.
... ... .***. .***^^^*^^*.
... .. .*^**......*^*^^^^.
. .***.***. ..*^^^*.
....*^^***^*. .^##^*.
..*.*^^^*****. .^###^..*
.**^^^^.***... .*^#^*.**
.***^##^**..*^^*.*****..**
....***^^##^*.*^##^****. ..
.......*^###^.*###^****.
.*********^###^**^##^***....
*^^^*^##^^^^###^.^^^*. .****..
*^^^^####*^####^..**. .******.
*^^^*####**^###*. .. .*******
*^#^^^##^***^^*. ...........***
*^^^**^^*..*... ..*******...***
.***..*^^*... ..*^^#^^^*......
...*^##^**. .*^^#####*.
.*^##^**....**^^####*. .***
.. ..*^^^*...*...**^^###^* *^#^
..****^^*. .... ...**###^*.^###
..*******.**. ..**^^^#^^..^###
.*****..*^^* ..**^##^**...*^##
.^^^^....*^^*..*^^^##^* ..**^^^
*###^*. .*^**..^###^^^*...*****
^####*.*..*^^*.^###^**.....*..
*###^**^**^^^*.*###^. .. .
.^^^***^^^^#^*.**^^**.
.....***^##^**^^^*^^*.
.*^^##^*^##^^^^^.
.*^^^^*.^##^*^^*.
Nope, you can't escape IO. Perhaps you can do all your randomness up front and rewrite your functions to take that randomness as a parameter; if not, you can use MonadRandom or similar to track a random seed or just put everything in IO.
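A small sketch of the "randomness up front" option, reusing pick, raisePoint and flatTerrain from the question (hillsPure is an illustrative name); only main touches IO, and the terrain logic stays pure and testable:
import Control.Monad (replicateM)
hillsPure :: [Point] -> Terrain -> Terrain -- pure, deterministic core
hillsPure centres terrain = foldl (\t c -> map (raisePoint 0.1 c) t) terrain centres
main :: IO ()
main = do
  let terrain0 = flatTerrain 10 10 0 1
  centres <- replicateM 100 (pick terrain0) -- all randomness happens here
  print (hillsPure centres terrain0)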

Optimizing Perlin noise in Haskell

(Dependencies for this program: vector --any and JuicyPixels >= 2. Code is available as Gist.)
{-# LANGUAGE Haskell2010 #-}
{-# LANGUAGE BangPatterns #-}
import Control.Arrow
import Data.Bits
import Data.Vector.Unboxed ((!))
import Data.Word
import System.Environment (getArgs)
import qualified Codec.Picture as P
import qualified Data.ByteString as B
import qualified Data.Vector.Unboxed as V
I tried to port Ken Perlin's improved noise
to Haskell, but I'm not entirely sure that my method is correct. The main part
is something that should generalize nicely to higher and lower dimensions, but
that is something for later:
perlin3 :: (Ord a, Num a, RealFrac a, V.Unbox a) => Permutation -> (a, a, a) -> a
perlin3 p (!x', !y', !z')
= let (!xX, !x) = actuallyProperFraction x'
(!yY, !y) = actuallyProperFraction y'
(!zZ, !z) = actuallyProperFraction z'
!u = fade x
!v = fade y
!w = fade z
!h = xX
!a = next p h + yY
!b = next p (h+1) + yY
!aa = next p a + zZ
!ab = next p (a+1) + zZ
!ba = next p b + zZ
!bb = next p (b+1) + zZ
!aaa = next p aa
!aab = next p (aa+1)
!aba = next p ab
!abb = next p (ab+1)
!baa = next p ba
!bab = next p (ba+1)
!bba = next p bb
!bbb = next p (bb+1)
in
lerp w
(lerp v
(lerp u
(grad aaa (x, y, z))
(grad baa (x-1, y, z)))
(lerp u
(grad aba (x, y-1, z))
(grad bba (x-1, y-1, z))))
(lerp v
(lerp u
(grad aab (x, y, z-1))
(grad bab (x-1, y, z-1)))
(lerp u
(grad abb (x, y-1, z-1))
(grad bbb (x-1, y-1, z-1))))
This is of course accompanied by a few functions mentioned in the perlin3
function, of which I hope they are as efficient as possible:
fade :: (Ord a, Num a) => a -> a
fade !t | 0 <= t, t <= 1 = t * t * t * (t * (t * 6 - 15) + 10)
lerp :: (Ord a, Num a) => a -> a -> a -> a
lerp !t !a !b | 0 <= t, t <= 1 = a + t * (b - a)
grad :: (Bits hash, Integral hash, Num a, V.Unbox a) => hash -> (a, a, a) -> a
grad !hash (!x, !y, !z) = dot3 (vks `V.unsafeIndex` fromIntegral (hash .&. 15)) (x, y, z)
where
vks = V.fromList
[ (1,1,0), (-1,1,0), (1,-1,0), (-1,-1,0)
, (1,0,1), (-1,0,1), (1,0,-1), (-1,0,-1)
, (0,1,1), (0,-1,1), (0,1,-1), (0,-1,-1)
, (1,1,0), (-1,1,0), (0,-1,1), (0,-1,-1)
]
dot3 :: Num a => (a, a, a) -> (a, a, a) -> a
dot3 (!x0, !y0, !z0) (!x1, !y1, !z1) = x0 * x1 + y0 * y1 + z0 * z1
-- Unlike `properFraction`, `actuallyProperFraction` rounds as intended.
actuallyProperFraction :: (RealFrac a, Integral b) => a -> (b, a)
actuallyProperFraction x
= let (ipart, fpart) = properFraction x
r = if x >= 0 then (ipart, fpart)
else (ipart-1, 1+fpart)
in r
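For example, the two differ on negative inputs:
-- in GHCi:
-- properFraction (-1.25) == (-1, -0.25) -- truncates towards zero
-- actuallyProperFraction (-1.25) == (-2, 0.75) -- floors; fraction always in [0, 1)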
For the permutation group, I just copied the one Perlin used on his website:
newtype Permutation = Permutation (V.Vector Word8)
mkPermutation :: [Word8] -> Permutation
mkPermutation xs
| length xs >= 256
= Permutation . V.fromList $ xs
permutation :: Permutation
permutation = mkPermutation
[151,160,137,91,90,15,
131,13,201,95,96,53,194,233,7,225,140,36,103,30,69,142,8,99,37,240,21,10,23,
190, 6,148,247,120,234,75,0,26,197,62,94,252,219,203,117,35,11,32,57,177,33,
88,237,149,56,87,174,20,125,136,171,168, 68,175,74,165,71,134,139,48,27,166,
77,146,158,231,83,111,229,122,60,211,133,230,220,105,92,41,55,46,245,40,244,
102,143,54, 65,25,63,161, 1,216,80,73,209,76,132,187,208, 89,18,169,200,196,
135,130,116,188,159,86,164,100,109,198,173,186, 3,64,52,217,226,250,124,123,
5,202,38,147,118,126,255,82,85,212,207,206,59,227,47,16,58,17,182,189,28,42,
223,183,170,213,119,248,152, 2,44,154,163, 70,221,153,101,155,167, 43,172,9,
129,22,39,253, 19,98,108,110,79,113,224,232,178,185, 112,104,218,246,97,228,
251,34,242,193,238,210,144,12,191,179,162,241, 81,51,145,235,249,14,239,107,
49,192,214, 31,181,199,106,157,184, 84,204,176,115,121,50,45,127, 4,150,254,
138,236,205,93,222,114,67,29,24,72,243,141,128,195,78,66,215,61,156,180
]
next :: Permutation -> Word8 -> Word8
next (Permutation !v) !idx'
= v `V.unsafeIndex` (fromIntegral $ idx' .&. 0xFF)
And all this is tied together with JuicyPixels:
main = do
[target] <- getArgs
let image = P.generateImage pixelRenderer 512 512
P.writePng target image
where
pixelRenderer, pixelRenderer' :: Int -> Int -> Word8
pixelRenderer !x !y
= floor $ ((perlin3 permutation ((fromIntegral x - 256) / 32,
(fromIntegral y - 256) / 32, 0 :: Double))+1)/2 * 128
-- This code is much more readable, but also much slower.
pixelRenderer' x y
= (\w -> floor $ ((w+1)/2 * 128)) -- w should be in [-1,+1]
. perlin3 permutation
. (\(x,y,z) -> ((x-256)/32, (y-256)/32, (z-256)/32))
$ (fromIntegral x, fromIntegral y, 0 :: Double)
My problem is that perlin3 seems very slow to me. If I profile it, pixelRenderer
is getting a lot of time as well, but I'll ignore that for now. I don't know
how to optimize perlin3. I tried to hint GHC with bang patterns, which cuts
the execution time in half, so that's nice. Explicitly specializing and inlining
barely helps with ghc -O. Is perlin3 supposed to be this slow?
UPDATE: an earlier version of this question mentioned a bug in my code. This problem has been resolved; it turns out my old version of actuallyProperFraction was buggy. It implicitly rounded the integral part of a floating point number to Word8, and then subtracted it from the floating point number to get the fractional part. Since Word8 can only take values between 0 and 255 inclusive, this won't work properly for numbers outside that range, including negative numbers.
This code appears to be mostly computation-bound. It can be improved a little bit, but not by much unless there's a way to use fewer array lookups and less arithmetic.
There are two useful tools for measuring performance: profiling and code dumps. I added an SCC annotation to perlin3 so that it would show up in the profile. Then I compiled with ghc -O2 -fforce-recomp -ddump-simpl -prof -auto. The -ddump-simpl flag prints the simplified code.
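The annotation itself is a one-liner; a self-contained illustration (expensive is an arbitrary name):
expensive :: Double -> Double
expensive x = {-# SCC "expensive" #-} (sin x * cos x)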
Profiling: On my computer, it takes 0.60 seconds to run the program, and about 20% of execution time (0.12 seconds) is spent in perlin3 according to the profile. Note that the precision of my profile info is about +/-3%.
Simplifier output: The simplifier produces fairly clean code. perlin3 gets inlined into pixelRenderer, so that's the part of the output you want to look at. Most of the code consists of unboxed array reads and unboxed arithmetic. To improve performance, we want to eliminate some of this arithmetic.
An easy change is to eliminate the run-time checks on SomeFraction (which doesn't appear in your question, but is part of the code that you uploaded). This reduces the program's execution time to 0.56 seconds.
-- someFraction t | 0 <= t, t < 1 = SomeFraction t
someFraction t = SomeFraction t
Next, there are several array lookups that show up in the simplifier like this:
case GHC.Prim.indexWord8Array#
ipv3_s23a
(GHC.Prim.+#
ipv1_s21N
(GHC.Prim.word2Int#
(GHC.Prim.and#
(GHC.Prim.narrow8Word#
(GHC.Prim.plusWord# ipv5_s256 (__word 1)))
(__word 255))))
The primitive operation narrow8Word# is for coercing from an Int to a Word8. We can get rid of this coercion by using Int instead of Word8 in the definition of next.
next :: Permutation -> Int -> Int
next (Permutation !v) !idx'
= fromIntegral $ v `V.unsafeIndex` (fromIntegral idx' .&. 0xFF)
This reduces the program's execution time to 0.54 seconds. Considering just the time spent in perlin3, the execution time has fallen (roughly) from 0.12 to 0.06 seconds. Although it's hard to measure where the rest of the time is going, it's most likely spread out among the remaining arithmetic and array accesses.
On my machine the reference code with Heatsink's optimisations takes 0.19 secs.
Firstly, I moved from JuicyPixels to yarr and yarr-image-io with my favourite flags, -Odph -rtsopts -threaded -fno-liberate-case -funbox-strict-fields -fexpose-all-unfoldings -funfolding-keeness-factor=1000 -fsimpl-tick-factor=500 -fllvm -optlo-O3 (they are given here):
import Data.Yarr as Y
import Data.Yarr.IO.Image as Y
...
main = do
[target] <- getArgs
image <- dComputeS $ fromFunction (512, 512) (return . pixelRenderer)
Y.writeImage target (Grey image)
where
pixelRenderer, pixelRenderer' :: Dim2 -> Word8
pixelRenderer (y, x)
= floor $ ((perlin3 permutation ((fromIntegral x - 256) / 32,
(fromIntegral y - 256) / 32, 0 :: Double))+1)/2 * 128
-- This code is much more readable, but also much slower.
pixelRenderer' (y, x)
= (\w -> floor $ ((w+1)/2 * 128)) -- w should be in [-1,+1]
. perlin3 permutation
. (\(x,y,z) -> ((x-256)/32, (y-256)/32, (z-256)/32))
$ (fromIntegral x, fromIntegral y, 0 :: Double)
This makes the program 30% faster: 0.13 seconds.
Secondly, I replaced uses of the standard floor with
doubleToByte :: Double -> Word8
doubleToByte f = fromIntegral (truncate f :: Int)
This is a known issue (google "haskell floor performance"). Execution time is reduced to 52 ms (0.052 secs), almost 3 times faster.
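The call site then changes accordingly (a sketch of the substitution in the renderer above):
pixelRenderer (y, x)
  = doubleToByte $ ((perlin3 permutation ((fromIntegral x - 256) / 32,
    (fromIntegral y - 256) / 32, 0 :: Double)) + 1) / 2 * 128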
Finally, just for fun, I tried to compute the noise in parallel (dComputeP instead of dComputeS, and +RTS -N4 on the command line). The program took 36 ms, including an I/O constant of about 10 ms.
