Writing fusible O(1) update for vector - haskell

It is continuation of this question. Since vector library doesn't seem to have a fusible O(1) update function, I am wondering if it is possible to write a fusible O(1) update function that doesn't involve unsafeFreeze and unsafeThaw. It would use vector stream representation, I guess - I am not familiar with how to write one using stream and unstream - hence, this question. The reason is this will give us the ability to write a cache-friendly update function on vector where only a narrow region of vector is being modified, and so, we don't want to walk through entire vector just to process that narrow region (and this operation can happen billions of times in each function call - so, the motivation to keep the overhead really low). The transformation functions like map process entire vector - so they will be too slow.
I have a toy example of what I want to do, but the upd function below uses unsafeThaw and unsafeFreeze - it doesn't seem to be optimized away in the core, and also breaks the promise of not using the buffer further:
module Main where
import Data.Vector.Unboxed as U
import Data.Vector.Unboxed.Mutable as MU
import Control.Monad.ST
upd :: Vector Int -> Int -> Int -> Vector Int
upd v i x = runST $ do
v' <- U.unsafeThaw v
MU.write v' i x
U.unsafeFreeze v'
sum :: Vector Int -> Int
sum = U.sum . (\x -> upd x 0 73) . (\x -> upd x 1 61)
main = print $ Main.sum $ U.fromList [1..3]
I know how to implement imperative algorithms using STVector. In case you are wondering why this alternative approach, I want to try out this approach of using pure vectors to check how GHC transformation of a particular algorithm differs when written using fusible pure vector streams (with monadic operations under the hood of course).
When the algorithm is written using STVector, it doesn't seem to be as nicely iterative as I would like it to be (I guess it is harder for GHC optimizer to spot loops when there is lot of mutability strewn around). So, I am investigating this alternative approach to see I can get a nicer loop in there.

The upd function you have written does not look correct, let alone fusable. Fusion is a library level optimization and requires you to write your code out of certain primatives. In this case what you want is not just fusion, but recycling which can be easily achieved via the bulk update operations such as // and update. These operations will fuse, and even happen in place much of the time.
If you really want to write your own destructive update based code DO NOT use unsafeThaw--use modify

Any function is a fusible update function; you seem to be trying to escape from the programming model the vector package is trying to get you to use
module Main where
import Data.Vector.Unboxed as U
change :: Int -> Int -> Int
change 0 n = 73
change 1 n = 61
change m n = n
myfun2 = U.sum . U.imap change . U.enumFromStepN 1 1
main = print $ myfun2 30000000
-- this doesn't create any vectors much less 'update' them, as you will see if you study the core.

Related

How to make a custom Attoparsec parser combinator that returns a Vector instead of a list?

{-# LANGUAGE OverloadedStrings #-}
import Data.Attoparsec.Text
import Control.Applicative(many)
import Data.Word
parseManyNumbers :: Parser [Int] -- I'd like many to return a Vector instead
parseManyNumbers = many (decimal <* skipSpace)
main :: IO ()
main = print $ parseOnly parseManyNumbers "131 45 68 214"
The above is just an example, but I need to parse a large amount of primitive values in Haskell and need to use arrays instead of lists. This is something that possible in the F#'s Fparsec, so I've went as far as looking at Attoparsec's source, but I can't figure out a way to do it. In fact, I can't figure out where many from Control.Applicative is defined in the base Haskell library. I thought it would be there as that is where documentation on Hackage points to, but no such luck.
Also, I am having trouble deciding what data structure to use here as I can't find something as convenient as a resizable array in Haskell, but I would rather not use inefficient tree based structures.
An option to me would be to skip Attoparsec and implement an entire parser inside the ST monad, but I would rather avoid it except as a very last resort.
There is a growable vector implementation in Haskell, which is based on the great AMT algorithm: "persistent-vector". Unfortunately, the library isn't that much known in the community so far. However to give you a clue about the performance of the algorithm, I'll say that it is the algorithm that drives the standard vector implementations in Scala and Clojure.
I suggest you implement your parser around that data-structure under the influence of the list-specialized implementations. Here the functions are, btw:
-- | One or more.
some :: f a -> f [a]
some v = some_v
where
many_v = some_v <|> pure []
some_v = (fmap (:) v) <*> many_v
-- | Zero or more.
many :: f a -> f [a]
many v = many_v
where
many_v = some_v <|> pure []
some_v = (fmap (:) v) <*> many_v
Some ideas:
Data Structures
I think the most practical data structure to use for the list of Ints is something like [Vector Int]. If each component Vector is sufficiently long (i.e. has length 1k) you'll get good space economy. You'll have
to write your own "list operations" to traverse it, but you'll avoid re-copying data that you would have to perform to return the data in a single Vector Int.
Also consider using a Dequeue instead of a list.
Stateful Parsing
Unlike Parsec, Attoparsec does not provide for user state. However, you
might be able to make use of the runScanner function (link):
runScanner :: s -> (s -> Word8 -> Maybe s) -> Parser (ByteString, s)
(It also returns the parsed ByteString which in your case may be problematic since it will be very large. Perhaps you can write an alternate version which doesn't do this.)
Using unsafeFreeze and unsafeThaw you can incrementally fill in a Vector. Your s data structure might look
something like:
data MyState = MyState
{ inNumber :: Bool -- True if seen a digit
, val :: Int -- value of int being parsed
, vecs :: [ Vector Int ] -- past parsed vectors
, v :: Vector Int -- current vector we are filling
, vsize :: Int -- number of items filled in current vector
}
Maybe instead of a [Vector Int] you use a Dequeue (Vector Int).
I imagine, however, that this approach will be slow since your parsing function will get called for every single character.
Represent the list as a single token
Parsec can be used to parse a stream of tokens, so how about writing
your own tokenizer and letting Parsec create the AST.
The key idea is to represent these large sequences of Ints as a single token. This gives you a lot more latitude in how you parse them.
Defer Conversion
Instead of converting the numbers to Ints at parse time, just have parseManyNumbers return a ByteString and defer the conversion until
you actually need the values. This much enable you to avoid reifying
the values as an actual list.
Vectors are arrays, under the hood. The tricky thing about arrays is that they are fixed-length. You pre-allocate an array of a certain length, and the only way of extending it is to copy the elements into a larger array.
This makes linked lists simply better at representing variable-length sequences. (It's also why list implementations in imperative languages amortise the cost of copying by allocating arrays with extra space and copying only when the space runs out.) If you don't know in advance how many elements there are going to be, your best bet is to use a list (and perhaps copy the list into a Vector afterwards using fromList, if you need to). That's why many returns a list: it runs the parser as many times as it can with no prior knowledge of how many that'll be.
On the other hand, if you happen to know how many numbers you're parsing, then a Vector could be more efficient. Perhaps you know a priori that there are always n numbers, or perhaps the protocol specifies before the start of the sequence how many numbers there'll be. Then you can use replicateM to allocate and populate the vector efficiently.

linear congruent generator in haskell

This is a very simple linear-congruent pseudo-random number generator. It works fine when I seed it, but I want to make it so that it self-seeds with every produced number. Problem is that I don't know how to do that in Haskell where the notion of variables does not exist. I can feed the produced number recursively, but then my result would be a list of integers instead of a single number.
linCongGen :: Int -> Int
linCongGen seed = ((2*seed) + 3) `mod` 100
I'll summarize the comments a bit more meaningfully. The simplest solution is, like you observed, an infinite list of the sequence of generated elements. Then, every time you want to get a new number, pop off the head of that list.
linCongGen :: Integral a => a -> [a]
linCongGen = iterate $ \x -> ((2*x) + 3) `mod` 100
That said, here is a solution (which I do not agree with), but which does what I think you want. For mutable state, we usually use IORef, which is sort of like a reference or pointer. Here is the code. Please read the disclaimer afterwards though.
import Data.IORef
import System.IO.Unsafe
seed :: IORef Int
seed = unsafePerformIO $ newIORef 71
linCongGen :: IO Int
linCongGen = do previous <- readIORef seed
modifyIORef' seed $ \x -> ((2*x) + 3) `mod` 100
return previous
And here is a sample usage printing out the first hundred numbers generated: main = replicateM_ 100 $ getRandom >>= print (you'll need to have Control.Monad imported too for replicateM_).
DISCLAIMER
This is a bit of a hacky approach described here. As the link says "Maybe the need for global mutable state is a symptom of bad design." The link also has a good description of a more intelligent workaround. Making an IORef is an inherently IO operation, and we really shouldn't be using unsafePerformIO on it. If you find yourself fighting Haskell in this way, it's because Haskell was designed to get in your way when you are doing things you shouldn't.
That said, I find comfort in knowing that this approach is also the one using in System.Random (the standard random number module) to define the initial seed (check out the source).

Procedurally generating large list of values in Haskell -- most idiomatic approach? memory management?

I have a function that takes a series of random numbers/floats, and uses them to generate a value/structure (ie, taking a random velocity and position of the point a ball is thrown from and outputting the coordinates of where it would land). And I need to generate several thousands in succession.
The way I have everything implemented is each calculation takes in an stdGen, uses it to generate several numbers, and passes out a new stdGen to allow it to be chained to another one.
And to do this for 10000 items, I make a sort of list from generate_item n which basically outputs a (value,gen) tuple (the value being the value i'm trying to calculate), where the value of gen is the recursively outputted stdGen from the calculations involved in getting the value from generate_item n-1
However, this program seems to crawl to be impractically slow at around a thousand results or so. And seems to definitely not be scalable. Could it have to do with the fact that I am storing all of the generate_item results in memory?
Or is there a more idomatic way of approaching this problem in Haskell using Monads or something than what I have describe above?
Note that the code to generate the algorithm from the random value generates 10k within seconds even in high-level scripting languages like ruby and python; these calculations are hardly intensive.
Code
-- helper functions that take in StdGen and return (Result,new StdGen)
plum_radius :: StdGen -> (Float,StdGen)
unitpoint :: Float -> StdGen -> ((Float,Float,Float),StdGen)
plum_speed :: Float -> StdGen -> (Float,StdGen)
-- The overall calculation of the value
plum_point :: StdGen -> (((Float,Float,Float),(Float,Float,Float)),StdGen)
plum_point gen = (((px,py,pz),(vx,vy,vz)),gen_out)
where
(r, gen2) = plum_radius gen
((px,py,pz),gen3) = unitpoint r gen2
(s, gen4) = plum_speed r gen3
((vx,vy,vz),gen5) = unitpoint s gen4
gen_out = gen5
-- Turning it into some kind of list
plum_data_list :: StdGen -> Int -> (((Float,Float,Float),(Float,Float,Float)),StdGen)
plum_data_list seed_gen 0 = plum_point seed_gen
plum_data_list seed_gen i = plum_point gen2
where
(_,gen2) = plum_data_list seed_gen (i-1)
-- Getting 100 results
main = do
gen <- getStdGen
let data_list = map (plum_data_list gen) [1..100]
putStrLn List.intercalate " " (map show data_list)
Consider just using the mersenne-twister and the vector-random package , which is specifically optimized to generate large amounts of high-quality random data.
Lists are unsuitable for allocating large amounts of data -- better to use a packed representation -- unless you're streaming.
First of all, the pattern you are describing -- taking an StdGen and then returning a tuple with a value and another StdGen to be chained into the next computation -- is exactly the pattern the State monad encodes. Refactoring your code to use it might be a good way to start to become familiar with monadic patterns.
As for your performance problem, StdGen is notoriously slow. I haven't done a lot with this stuff, but I've heard mersenne twister is faster.
However, you might also want to post your code, since in cases where you are generating large lists, laziness can really work to your advantage or disadvantage depending on how you are doing it. But it is hard to give specific advice without seeing what you are doing. One rule of thumb just in case you are coming from another functional language such as Lisp -- when generating a list (or other lazy data structure -- e.g. a tree, but not a Int), avoid tail recursion. The intuition for it being faster does not transfer to lazy languages. E.g. use (written without the monadic style that I would acutally use in practice)
randoms :: Int -> StdGen -> (StdGen, [Int])
randoms 0 g = (g, [])
randoms n g = let (g', x) = next g
(g'', xs) = randoms (n-1) g'
in (g'', x : xs)
This will allow the result list to be "streamed", so you can access the earlier parts of it before generating the later parts. (In this state case, it's a little subtle because accessing the resulting StdGen will have to generate the whole list, so you'll have to carefully avoid doing that until after you have consumed the list -- I wish there was a fast random generator that supported a good split operation, then you could get around having to return a generator at all).
Oh, just in case you're having trouble getting going with the monads thing, here's the above function written with a state monad:
randomsM :: Int -> State StdGen [Int]
randomsM 0 = return []
randomsM n = do
x <- state next
xs <- randomsM (n-1)
return (x : xs)
See the correspondence?
The other posters have good points, StdGen doesn't perform very well, and you should probably try to use State instead of manually passing the generator along. But I think the biggest problem is your plum_data_list function.
It seems to be intended to be some kind of lookup, but since it's implemented recursively without any memoization, the calls you make have to recurse to the base case. That is, plum_data_list seed_gen 100 needs the random generator from plum_data_list seed_gen 99 and so on, until plum_data_list seed_gen 0. This will give you quadratic performance when you try to generate a list of these values.
Probably the more idiomatic way is to let plum_data_list seed_gen generate an infinite list of points like so:
plum_data_list :: StdGen -> [((Float,Float,Float),(Float,Float,Float))]
plum_data_list seed_gen = first_point : plum_data_list seed_gen'
where
(first_point, seed_gen') = plum_point seed_gen
Then you just need to modify the code in main to something like take 100 $ plum_data_list gen, and you are back to linear performance.

Which container should I use?

I'm new to haskell, and so I'm trying to recreate the following C++ code in haskell.
int main() {
class MyClass {
public:
int a;
std::string s;
float f;
};
std::vector <MyClass> v;
LoadSerialized(&v); // don't need haskell equivalent; just reads a bunch of MyClass's and pushes them back onto v
}
Now, I've looked at the various containers in haskell that might work as the std::vector here: there's list, unboxed vector, boxed vector, and some weird usage of foreign pointers like the following:
data Table = Table { floats :: ForeignPtr CFloat
, ints :: ForeignPtr Int }
newTable :: IO Table
newTable = do
fp <- S.mallocByteString (floatSize * sizeOf (undefined :: CFloat))
ip <- S.mallocByteString (intSize * sizeOf (undefined :: Int ))
withForeignPtr fp $ \p ->
forM_ [0..floatSize-1] $ \n ->
pokeElemOff p n pi
withForeignPtr ip $ \p ->
forM_ [0..intSize-1] $ \n ->
pokeElemOff p n n
return (Table fp ip)
Now, I could implement the C++ code in the way I think is best--being a haskell newbie. Or I could ask people more experienced with the language what the best way is, because to me it looks like there's some nuance going on here that I'm missing. Simply, I want to push a structure containing many datatypes into a haskell container, and I don't care about the order. If it helps, I'm going to read serialized data into the container as you can see with LoadSerialized.
I'm not mixing in C++ code.
(Edit: is it stackoverflow policy to allow censorship of questions through editing (not minor)? It does say "always respect the original author.")
If you are writing the whole program in Haskell, Just use a list unless you have a good reason not to. (If you do have a good reason not to, please say what it is and we can help you choose a more appropriate data structure. e.g. random access to a specific list element is O(n) rather than the O(1) of a C++ vector, and updating values in a data structure is different in Haskell.)
If you are mixing Haskell and C++ in the same program, and you need help calling C++ from Haskell, please say.
Use lists by default. List operations such as map, foldr and filter can be fused together by the compiler, resulting in more efficient code than you would typically get using a C++ vector.
Use an array of some sort if you find yourself needing to lookup an element by index, or wanting to mutate an element at a specific index. See Data.Array, Data.Array.IO, and Data.Array.ST.
Use a sequence if you find yourself needing to insert new elements in the middle of the data structure, or at both ends of the structure. See Data.Sequence.

Short-lived memoization in Haskell?

In an object-oriented language when I need to cache/memoize the results of a function for a known life-time I'll generally follow this pattern:
Create a new class
Add to the class a data member and a method for each function result I want to cache
Implement the method to first check to see if the result has been stored in the data member. If so, return that value; else call the function (with the appropriate arguments) and store the returned result in the data member.
Objects of this class will be initialized with values that are needed for the various function calls.
This object-based approach is very similar to the function-based memoization pattern described here: http://www.bardiak.com/2012/01/javascript-memoization-pattern.html
The main benefit of this approach is that the results are kept around only for the life time of the cache object. A common use case is in the processing of a list of work items. For each work item one creates the cache object for that item, processes the work item with that cache object then discards the work item and cache object before proceeding to the next work item.
What are good ways to implement short-lived memoization in Haskell? And does the answer depend on if the functions to be cached are pure or involve IO?
Just to reiterate - it would be nice to see solutions for functions which involve IO.
Let's use Luke Palmer's memoization library: Data.MemoCombinators
import qualified Data.MemoCombinators as Memo
import Data.Function (fix) -- we'll need this too
I'm going to define things slightly different from how his library does, but it's basically the same (and furthermore, compatible). A "memoizable" thing takes itself as input, and produces the "real" thing.
type Memoizable a = a -> a
A "memoizer" takes a function and produces the memoized version of it.
type Memoizer a b = (a -> b) -> a -> b
Let's write a little function to put these two things together. Given a Memoizable function and a Memoizer, we want the resultant memoized function.
runMemo :: Memoizer a b -> Memoizable (a -> b) -> a -> b
runMemo memo f = fix (f . memo)
This is a little magic using the fixpoint combinator (fix). Never mind that; you can google it if you are interested.
So let's write a Memoizable version of the classic fib example:
fib :: Memoizable (Integer -> Integer)
fib self = go
where go 0 = 1
go 1 = 1
go n = self (n-1) + self (n-2)
Using a self convention makes the code straightforward. Remember, self is what we expect to be the memoized version of this very function, so recursive calls should be on self. Now fire up ghci.
ghci> let fib' = runMemo Memo.integral fib
ghci> fib' 10000
WALL OF NUMBERS CRANKED OUT RIDICULOUSLY FAST
Now, the cool thing about runMemo is you can create more than one freshly memoized version of the same function, and they will not share memory banks. That means that I can write a function that locally creates and uses fib', but then as soon as fib' falls out of scope (or earlier, depending on the intelligence of the compiler), it can be garbage collected. It doesn't have to be memoized at the top level. This may or may not play nicely with memoization techniques that rely on unsafePerformIO. Data.MemoCombinators uses a pure, lazy Trie, which fits perfectly with runMemo. Rather than creating an object which essentially becomes a memoization manager, you can simply create memoized functions on demand. The catch is that if your function is recursive, it must be written as Memoizable. The good news is you can plug in any Memoizer that you wish. You could even use:
noMemo :: Memoizer a b
noMemo f = f
ghci> let fib' = runMemo noMemo fib
ghci> fib' 30 -- wait a while; it's computing stupidly
1346269
Lazy-Haskell programming is, in a way, the memoization paradigm taken to a extreme. Also, whatever you do in an imperative language is possible in Haskell, using either IO monad, the ST monad, monad transformers, arrows, or you name what.
The only problem is that these abstraction devices are much more complicated than the imperative equivalent that you mentioned, and they need a pretty deep mind-rewiring.
I believe the above answers are both more complex than necessary, although they might be more portable than what I'm about to describe.
As I understand it, there is a rule in ghc that each value is computed exactly once when it's enclosing lambda expression is entered. You may thus create exactly your short lived memoization object as follows.
import qualified Data.Vector as V
indexerVector :: (t -> Int) -> V.Vector t -> Int -> [t]
indexerVector idx vec = \e -> tbl ! e
where m = maximum $ map idx $ V.toList vec
tbl = V.accumulate (flip (:)) (V.replicate m [])
(V.map (\v -> (idx v, v)) vec)
What does this do? It groups all the elements in the Data.Vector t passed as it's second argument vec according to integer computed by it's first argument idx, retaining their grouping as a Data.Vector [t]. It returns a function of type Int -> [t] which looks up this grouping by this pre-computed index value.
Our compiler ghc has promised that tbl shall only be thunked once when we invoke indexerVector. We may therefore assign the lambda expression \e -> tbl ! e returned by indexVector to another value, which we may use repeatedly without fear that tbl ever gets recomputed. You may verify this by inserting a trace on tbl.
In short, your caching object is exactly this lambda expression.
I've found that almost anything you can accomplish with a short term object can be better accomplished by returning a lambda expression like this.
You can use very same pattern in haskell too. Lazy evaluation will take care of checking whether value is evaluated already. It has been mentioned mupltiple times already but code example could be useful. In example below memoedValue will calculated only once when it is demanded.
data Memoed = Memoed
{ value :: Int
, memoedValue :: Int
}
memo :: Int -> Memoed
memo i = Memoed
{ value = i
, memoedValue = expensiveComputation i
}
Even better you can memoize values which depend on other memoized values. You shoud avoid dependecy loops. They can lead to nontermination
data Memoed = Memoed
{ value :: Int
, memoedValue1 :: Int
, memoedValue2 :: Int
}
memo :: Int -> Memoed
memo i = r
where
r = Memoed
{ value = i
, memoedValue1 = expensiveComputation i
, memoedValue2 = anotherComputation (memoedValue1 r)
}

Resources