How does this cyclic recursion provide the desired result? - haskell

Consider the following abbreviated code from this excellent blog post:
import System.Random (Random, randomRIO)
newtype Stream m a = Stream { runStream :: m (Maybe (NonEmptyStream m a)) }
type NonEmptyStream m a = (a, Stream m a)
empty :: (Monad m) => Stream m a
empty = Stream $ return Nothing
cons :: (Monad m) => a -> Stream m a -> Stream m a
cons a s = Stream $ return (Just (a, s))
fromList :: (Monad m) => [a] -> NonEmptyStream m a
fromList (x:xs) = (x, foldr cons empty xs)
Not too bad thus far - a monadic, recursive data structure and a way to build one from a list.
Now consider this function that chooses a (uniformly) random element from a stream, using constant memory:
select :: NonEmptyStream IO a -> IO a
select (a, s) = select' (return a) 1 s where
  select' :: IO a -> Int -> Stream IO a -> IO a
  select' a n s = do
    next <- runStream s
    case next of
      Nothing -> a
      Just (a', s') -> select' someA (n + 1) s' where
        someA = do
          i <- randomRIO (0, n)
          case i of
            0 -> return a'
            _ -> a
I'm not grasping the mysterious cyclic well of infinity that's going on in the last four lines; the result a' depends on a recursion on someA, which itself could depend on a', but not necessarily.
I get the vibe that the recursive worker is somehow 'accumulating' potential values in the IO a accumulator, but I obviously can't reason about it well enough.
Could anyone provide an explanation as to how this function produces the behaviour that it does?

That code doesn't actually run in constant space, as it composes a bigger and bigger IO a action which delays all the random choices until it's reached the end of the stream. Only when we reach the Nothing -> a case does the action in a actually get run.
For example, try running it on an infinite, constant space stream made by this function:
repeat' :: a -> NonEmptyStream IO a
repeat' x = let xs = (x, Stream $ return (Just xs)) in xs
Obviously, running select on this stream won't terminate, but you should see the memory usage going up as it allocates a lot of thunks for the delayed actions.
Here's a slightly re-written version of the code which does the choices as it goes along, so it runs in constant space and should hopefully be more clear as well. Note that I've replaced the IO a argument with a plain a which makes it clear that there are no delayed actions being built up here.
select :: NonEmptyStream IO a -> IO a
select (x, xs) = select' x 1 xs where
  select' :: a -> Int -> Stream IO a -> IO a
  select' current n xs = do
    next <- runStream xs
    case next of
      Nothing -> return current
      Just (x, xs') -> do
        i <- randomRIO (0, n)              -- (1)
        case i of
          0 -> select' x (n+1) xs'         -- (2)
          _ -> select' current (n+1) xs'   -- (3)
As the name implies, current stores the currently selected value at each step. Once we've extracted the next item from the stream, we (1) pick a random number and use this to decide whether to (2) replace our selection with the new item or (3) keep our current selection before recursing on the rest of the stream.
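For a quick sanity check (a usage example of my own, assuming the definitions above are in scope), you can run it on a small finite stream built with fromList:

-- Hypothetical usage: pick one of ten numbers uniformly at random.
main :: IO ()
main = select (fromList [1 .. 10 :: Int]) >>= print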

There isn't really anything "cyclic" going on here. In particular, a' does not depend on someA. The a' is bound by pattern matching on the result of next. It is used by someA, which is in turn used on the right-hand side, but this does not constitute a cycle.
What select' does is to traverse the stream. It maintains two accumulating arguments. The first is a random element from the stream (it's not yet selected and still random, hence IO a). The second is the position in the stream (Int).
The invariant being maintained is that the first accumulator selects an element uniformly from the stream we have seen so far, and that the integer represents the number of elements encountered so far.
Now, if we reach the end of the stream (Nothing), we can return the current random element, and it will be ok.
If we see another element (the Just case), then we recurse by calling select' again. Updating the number of elements to n + 1 is trivial. But how do we update the random element someA? Well, the old random element a chooses between the first n positions of the stream with equal probability. If we choose the new element a' with probability 1 / (n + 1) and use the old one in all other cases, then we have a uniform distribution over the whole stream up to this point again.
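To make the invariant concrete with a tiny calculation: after two elements each is held with probability 1/2; when the third arrives it is chosen with probability 1/3 and the old pick survives with probability 2/3, so each old element ends up at 1/2 * 2/3 = 1/3 and the distribution stays uniform. If you want to check this empirically, here is a small tally I added (not part of the answer), assuming select and fromList from above are in scope:

import qualified Data.Map.Strict as Map
import Control.Monad (replicateM)

-- Run select many times over a three-element stream; each of 'a', 'b' and 'c'
-- should come up roughly a third of the time.
checkUniform :: IO ()
checkUniform = do
    picks <- replicateM 3000 (select (fromList "abc"))
    print (Map.fromListWith (+) [ (c, 1 :: Int) | c <- picks ])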

Related

Haskell arrow tutorial loop/state

From this
https://en.wikibooks.org/wiki/Haskell/Arrow_tutorial#Hangman:_Main_program
How is the IO done? Particularly:
main :: IO ()
main = do
  rng <- getStdGen
  interact $ unlines                      -- Concatenate lines of output
    . ("Welcome to Arrow Hangman":)       -- Prepend a greeting to the output
    . concat . map snd . takeWhile fst    -- Take the [String]s as long as the first element of the tuples is True
    . runCircuit (hangman rng)            -- Process the input lazily
    . ("":)                               -- Act as if the user pressed ENTER once at the start
    . lines                               -- Split input into lines
interact appears to be (String -> String) -> IO (), and I had the impression that it runs that function on each line that it reads. What confuses me is: how is the initial line printed, and where is the state of the game stored in between?
runCircuit was used earlier in a way in which it had all the inputs already generated. I'm confused as to how this version runs line by line yet doesn't appear to store any state.
How can a Circuit String (Bool, [String]) be run by runCircuit :: Circuit a b -> [a] -> [b] in a line-by-line fashion, in a way that appears to remember what the previous results were?
interact does not run the function once per line; interact and runCircuit are lazy. Because you split the input into lines and concatenate the output, you'll see the progress of runCircuit as you provide more and more input.
The function runCircuit is defined as follows:
runCircuit :: Circuit a b -> [a] -> [b]
runCircuit _   []     = []
runCircuit cir (x:xs) =
    let (cir', x') = unCircuit cir x
    in  x' : runCircuit cir' xs
There you see that you produce one element in the output list for each element in the input list (each line). Which already indicates that you'll be able to process the list lazily. (For comparison: if it required the length of xs to produce the first output x', then runCircuit would not be lazy.)
Let's put that together with the definition of Circuit:
data Circuit a b = Circuit { unCircuit :: a -> (Circuit a b, b) }
The way you run a circuit is that you provide a first input x of type a and obtain not only a first output x' of type b, but also a continuation Circuit (cir' in runCircuit). This continuation is a new Circuit a b, used by runCircuit in the next iteration. That is how state is kept: the new Circuit will be similar to the original one, but it may have been affected by previous inputs.
For example, you could define a circuit that sums Ints and produces the total sum. There is one example in that article, but to make things super-simple:
mySum :: Circuit Int Int
mySum = mySum' 0

mySum' :: Int -> Circuit Int Int
mySum' acc = Circuit $ \input ->
    let acc' = acc + input
    in  (mySum' acc', acc')
In each iteration, the continuation Circuit returned, mySum' acc', uses acc', the new accumulator, which contains the sum up to that point. So this Circuit keeps state because it remembers or carries forward the sum of all numbers up to that point.
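For example (my addition, assuming the definitions above), running the circuit over a short input list:

runCircuit mySum [1, 2, 3]   -- [1,3,6]

Each output is the running total so far; the "memory" lives entirely in the continuation handed back at every step.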
Back to that article, the slightly more general function:
accum :: acc -> (a -> acc -> (b, acc)) -> Circuit a b
accum acc f = Circuit $ \input ->
    let (output, acc') = input `f` acc
    in  (accum acc' f, output)
returns a continuation Circuit in the first argument of the tuple that is different from itself. It was called as accum acc f, but the continuation is accum acc' f, where acc' depends on the input and acc, so it retains memory in this accumulator.
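As a small cross-check (my addition, not from the article), mySum from above is just accum specialised to addition:

mySum :: Circuit Int Int
mySum = accum 0 (\x acc -> let acc' = x + acc in (acc', acc'))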
Using continuations is very, very common. I think most pipe/stream-processing frameworks and many FRP implementations do this, including Yampa, Varying, Dunai and netwire.

How to use the Select monad to solve n-queens?

I'm trying to understand how the Select monad works. Apparently, it is a cousin of Cont and it can be used for backtracking search.
I have this list-based solution to the n-queens problem:
-- All the ways of extracting an element from a list.
oneOf :: [Int] -> [(Int,[Int])]
oneOf [] = []
oneOf (x:xs) = (x,xs) : map (\(y,ys) -> (y,x:ys)) (oneOf xs)
-- Adding a new queen at col x, is it threatened diagonally by any of the
-- existing queens?
safeDiag :: Int -> [Int] -> Bool
safeDiag x xs = all (\(y,i) -> abs (x-y) /= i) (zip xs [1..])

nqueens :: Int -> [[Int]]
nqueens queenCount = go [] [1..queenCount]
  where
    -- cps = columns of already positioned queens
    -- fps = columns that are still available
    go :: [Int] -> [Int] -> [[Int]]
    go cps [] = [cps]
    go cps fps = [ps | (p,nfps) <- oneOf fps, ps <- go (p:cps) nfps, safeDiag p cps]
I'm struggling to adapt this solution to use Select instead.
It seems that Select lets you abstract over the "evaluation function" that is used to compare answers. That function is passed to runSelect. I have the feeling that something like safeDiag in my solution could work as the evaluation function, but how to structure the Select computation itself?
Also, is it enough to use the Select monad alone, or do I need to use the transformer version over lists?
I realize this question is almost 4 years old and already has an answer, but I wanted to chime in with some additional information for the sake of anyone who comes across it in the future. Specifically, I want to try to answer 2 questions:
how are multiple Selects that return single values combined to create a single Select that returns a sequence of values?
is it possible to return early when a solution path is destined to fail?
Chaining Selects
Select is implemented as a monad transformer in the transformers library (go figure), but let's take a look at how one might implement >>= for Select by itself:
(>>=) :: Select r a -> (a -> Select r b) -> Select r b
Select g >>= f = Select $ \k ->
    let choose x = runSelect (f x) k
    in  choose $ g (k . choose)
We start by defining a new Select which takes an input k of type a -> r (recall that Select wraps a function of type (a -> r) -> a). You can think of k as a function that returns a "score" of type r for a given a, which the Select function may use to determine which a to return.
Inside our new Select, we define a function called choose. This function passes some x to the function f, which is the a -> m b portion of monadic binding: it transforms the result of the m a computation into a new computation m b. So f is going to take that x and return a new Select, which choose then runs using our scoring function k. You can think of choose as a function that asks "what would the final result be if I selected x and passed it downstream?"
On the second line, we return choose $ g (k . choose). The function k . choose is the composition of choose and our original scoring function k: it takes in a value, calculates the downstream result of selecting that value, and returns the score of that downstream result. In other words, we've created a kind of "clairvoyant" scoring function: instead of returning the score of a given value, it returns the score of the final result we would get if we selected that value. By passing in our "clairvoyant" scoring function to g (the original Select that we're binding to), we're able to select the intermediate value that leads to the final result we're looking for. Once we have that intermediate value, we simply pass it back into choose and return the result.
That's how we're able to string together single-value Selects while passing in a scoring function that operates on an array of values: each Select is scoring the hypothetical final result of selecting a value, not necessarily the value itself. The applicative instance follows the same strategy, the only difference being how the downstream Select is computed (instead of passing a candidate value into the a -> m b function, it maps a candidate function over the 2nd Select.)
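To make that concrete, here is a small example of my own (not from the original answer), using Select from Control.Monad.Trans.Select. anyOf picks an element satisfying the predicate if one exists, falling back to the first element, and the Applicative combination uses the "clairvoyant" scoring described above to find a pair summing to 5:

import Control.Monad.Trans.Select

-- Pick an element satisfying the predicate, if any; assumes a non-empty list.
anyOf :: [a] -> Select Bool a
anyOf xs = select $ \p -> case filter p xs of
    (y:_) -> y
    []    -> head xs

pair :: Select Bool (Int, Int)
pair = (,) <$> anyOf [1..3] <*> anyOf [1..3]

-- runSelect pair (\(a, b) -> a + b == 5) should give (2,3): each component is
-- scored by whether choosing it can lead to a satisfying final pair.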
Returning Early
So, how can we use Select while returning early? We need some way of accessing the scoring function within the scope of the code that constructs the Select. One way to do that is to construct each Select within another Select, like so:
sequenceSelect :: Eq a => [a] -> Select Bool [a]
sequenceSelect [] = return []
sequenceSelect domain@(x:xs) = select $ \k ->
    if k [] then runSelect s k else []
  where
    s = do
      choice <- elementSelect (x:|xs)
      fmap (choice:) $ sequenceSelect (filter (/= choice) domain)
This allows us to test the sequence in progress and short-circuit the recursion if it fails. (We can test the sequence by calling k [] because the scoring function includes all of the prepends that we've recursively lined up.)
Here's the whole solution:
import Data.List
import Data.List.NonEmpty (NonEmpty(..))
import Control.Monad.Trans.Select
validBoard :: [Int] -> Bool
validBoard qs = all verify (tails qs)
  where
    verify []     = True
    verify (x:xs) = and $ zipWith (\i y -> x /= y && abs (x - y) /= i) [1..] xs

nqueens :: Int -> [Int]
nqueens boardSize = runSelect (sequenceSelect [1..boardSize]) validBoard

sequenceSelect :: Eq a => [a] -> Select Bool [a]
sequenceSelect [] = return []
sequenceSelect domain@(x:xs) = select $ \k ->
    if k [] then runSelect s k else []
  where
    s = do
      choice <- elementSelect (x:|xs)
      fmap (choice:) $ sequenceSelect (filter (/= choice) domain)
elementSelect :: NonEmpty a -> Select Bool a
elementSelect domain = select $ \p -> epsilon p domain
-- like find, but will always return something
epsilon :: (a -> Bool) -> NonEmpty a -> a
epsilon _ (x:|[]) = x
epsilon p (x:|y:ys) = if p x then x else epsilon p (y:|ys)
In short: we construct a Select recursively, removing elements from the domain as we use them and terminating the recursion if the domain has been exhausted or if we're on the wrong track.
One other addition is the epsilon function (based on Hilbert's epsilon operator). For a domain of size N it will check at most N - 1 items... it might not sound like a huge savings, but as you know from the above explanation, p will usually kick off the remainder of the entire computation, so it's best to keep predicate calls to a minimum.
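For instance (my example): epsilon even (1 :| [2, 3]) returns 2 after testing only 1 and 2, while epsilon even (1 :| [3, 5]) returns 5 without ever testing 5, because the last element is the unconditional fallback.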
The nice thing about sequenceSelect is how generic it is (a small reuse sketch follows this list): it can be used to create any Select Bool [a] where
we're searching within a finite domain of distinct elements
we want to create a sequence that includes every element exactly once (i.e. a permutation of the domain)
we want to test partial sequences and abandon them if they fail the predicate
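As the sketch promised above (my example, not from the answer), the same sequenceSelect can hunt for a permutation of [1..4] in which adjacent entries differ by more than 1; like validBoard, the predicate is happy with any prefix that hasn't gone wrong yet:

noNeighbours :: [Int] -> Bool
noNeighbours xs = and $ zipWith (\a b -> abs (a - b) > 1) xs (drop 1 xs)

-- runSelect (sequenceSelect [1..4]) noNeighbours should produce one of the
-- two valid arrangements, [2,4,1,3] or [3,1,4,2].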
Hope this helps clarify things!
P.S. Here's a link to an Observable notebook in which I implemented the Select monad in Javascript along with a demonstration of the n-queens solver: https://observablehq.com/#mattdiamond/the-select-monad
Select can be viewed as an abstraction of a search in a "compact" space, guided by some predicate. You mentioned SAT in your comments; have you tried modelling the problem as a SAT instance and throwing it at a solver based on Select (in the spirit of this paper)? You can hardwire the N-queens-specific constraints into your search and turn the SAT solver into an N-queens solver.
Inspired by jd823592's answer, and after looking at the SAT example in the paper, I have written this code:
import Data.List
import Control.Monad.Trans.Select
validBoard :: [Int] -> Bool
validBoard qs = all verify (tails qs)
  where
    verify []       = True
    verify (x : xs) = and $ zipWith (\i y -> x /= y && abs (x-y) /= i) [1..] xs

nqueens :: Int -> [Int]
nqueens boardSize = runSelect (traverse selectColumn columns) validBoard
  where
    columns = replicate boardSize [1..boardSize]
    selectColumn candidates = select $ \s -> head $ filter s candidates ++ candidates
It seems to arrive (albeit slowly) at a valid solution:
ghci> nqueens 8
[1,5,8,6,3,7,2,4]
I don't understand it very well, however. In particular, the way sequence works for Select, transmuting a function (validBoard) that works over a whole board into functions that take a single column index, seems quite magical.
The sequence-based solution has the defect that putting a queen in a column doesn't rule out choosing the same column for subsequent queens; we end up unnecessarily exploring doomed branches.
If we want our column choices to be affected by previous decisions, we need to go beyond Applicative and use the power of Monad:
nqueens :: Int -> [Int]
nqueens boardSize = fst $ runSelect (go ([],[1..boardSize])) (validBoard . fst)
  where
    go (cps,[])  = return (cps,[])
    go (cps,fps) = (select $ \s ->
        let candidates = map (\(z,zs) -> (z:cps,zs)) (oneOf fps)
        in  head $ filter s candidates ++ candidates) >>= go
The monadic version still has the problem that it only checks completed boards, whereas the original list-based solution backtracked as soon as a partially completed board was found to have a conflict. I don't know how to do that using Select.

How to generate a list of repeated applications of a function to the previous result of it in IO context

As a part of a solution to the problem I'm trying to solve, I need to generate a list of repeated applications of a function to its previous result. That sounds very much like the iterate function, except that iterate has the signature
iterate :: (a -> a) -> a -> [a]
and my function lives inside IO (I need to generate random numbers), so I'd need something more like:
iterate' :: (a -> IO a) -> a -> [a]
I have looked on Hoogle, but without much success.
You can actually get a lazy iterate that works on infinite lists if you use the pipes library. The definition is really simple:
import Pipes
iterate' :: (a -> IO a) -> a -> Producer a IO r
iterate' f a = do
    yield a
    a2 <- lift (f a)
    iterate' f a2
For example, let's say that our step function is:
step :: Int -> IO Int
step n = do
    m <- readLn
    return (n + m)
Then applying iterate' to step generates a Producer that lazily prompts the user for input and generates the tally of values read so far:
iterate' step 0 :: Producer Int IO ()
The simplest way to read out the value is to loop over the Producer using for:
main = runEffect $
    for (iterate' step 0) $ \n -> do
        lift (print n)
The program then endlessly loops, requesting user input and displaying the current tally:
>>> main
0
10<Enter>
10
14<Enter>
24
5<Enter>
29
...
Notice how this gets two things correct which the other solutions do not:
It works on infinite lists (you don't need a termination condition)
It produces results immediately. It doesn't wait until you run the action on the entire list to start producing usable values.
However, we can easily filter results just like the other two solutions. For example, let's say I want to stop when the tally is greater than 100. I can just write:
import qualified Pipes.Prelude as P
main = runEffect $
    for (iterate' step 0 >-> P.takeWhile (< 100)) $ \n -> do
        lift (print n)
You can read that as saying: "Loop over the iterated values while they are less than 100. Print the output". Let's try it:
>>> main
0
10<Enter>
10
20<Enter>
30
75<Enter>
>>> -- Done!
In fact, pipes has another helper function for printing out values, so you can simplify the above to a pipeline:
main = runEffect $ iterate' step 0 >-> P.takeWhile (< 100) >-> P.print
This gives a clear view of the flow of information. iterate' produces a never-ending stream of Ints, P.takeWhile filters that stream, and P.print prints all values that reach the end.
If you want to learn more about the pipes library, I encourage you to read the pipes tutorial.
Your function lives in IO, so the signature is rather:
iterate'::(a -> IO a) -> a -> IO [a]
The problem is that the original iterate function returns an infinite list, so if you try to do the same in IO you will get an action that never ends. Maybe you should add a condition to end the iteration.
iterate' action value = do
    result <- action value
    if condition result             -- 'condition' is whatever test you use to stop
        then return []
        else do
            rest <- iterate' action result
            return $ result : rest
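To make that sketch concrete, here is a hypothetical instantiation of my own, with the stopping test passed in explicitly:

iterateUntil :: (a -> Bool) -> (a -> IO a) -> a -> IO [a]
iterateUntil stop action value = do
    result <- action value
    if stop result
        then return []
        else do
            rest <- iterateUntil stop action result
            return (result : rest)

-- ghci> iterateUntil (> 100) (\n -> return (n * 2)) 1
-- [2,4,8,16,32,64]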
Firstly, your resulting list must be in the IO monad, so iterate' must produce an IO [a], rather than a plain [a].
Iterate can be defined as:
iterate :: (a -> a) -> a -> [a]
iterate f x = x : iterate f (f x)
so we could make an iterateM quite easily
iterateM :: Monad m => (a -> m a) -> m a -> [m a]
iterateM f x = x : iterateM f (x >>= f)
This still needs your seed value to be in the monad to start though, and also gives you a list of monadic things, rather than a monad of listy things.
So, let's change it a bit.
iterateM :: Monad m => (a -> m a) -> a -> m [a]
iterateM f x = sequence $ go f (return x)
  where
    go f x = x : go f (x >>= f)
However, this doesn't work, because sequence first runs every action and only then returns. (You can see this if you write some safeDivide :: Double -> Double -> Maybe Double and then try something like fmap (take 10) $ iterateM (flip safeDivide 2) 1000; you'll find it doesn't terminate.) I'm not sure how to fix that though.

Compute as much of a list as possible in a fixed time

I want to write a function that takes a time limit (in seconds) and a list, and computes as many elements of the list as possible within the time limit.
My first attempt was to first write the following function, which times a pure computation and returns the time elapsed along with the result:
import Control.DeepSeq
import System.CPUTime
type Time = Double
timed :: (NFData a) => a -> IO (a, Time)
timed x = do
    t1 <- getCPUTime
    r  <- return $!! x
    t2 <- getCPUTime
    let diff = fromIntegral (t2 - t1) / 10^12
    return (r, diff)
I can then define the function I want in terms of this:
timeLimited :: (NFData a) => Time -> [a] -> IO [a]
timeLimited remaining []     = return []
timeLimited remaining (x:xs) = if remaining < 0
    then return []
    else do
        (y,t) <- timed x
        ys    <- timeLimited (remaining - t) xs
        return (y:ys)
This isn't quite right though. Even ignoring timing errors and floating point errors, this approach never stops the computation of an element of the list once it has started, which means that it can (and in fact, normally will) overrun its time limit.
If instead I had a function that could short-circuit evaluation if it had taken too long:
timeOut :: Time -> a -> IO (Maybe (a,t))
timeOut = undefined
then I could write the function that I really want:
timeLimited' :: Time -> [a] -> IO [a]
timeLimited' remaining []     = return []
timeLimited' remaining (x:xs) = do
    result <- timeOut remaining x
    case result of
        Nothing -> return []
        Just (y,t) -> do
            ys <- timeLimited' (remaining - t) xs
            return (y:ys)
My questions are:
How do I write timeOut?
Is there a better way to write the function timeLimited, for example, one that doesn't suffer from accumulated floating point error from adding up time differences multiple times?
Here's an example I was able to cook up using some of the suggestions above. I've not done huge amounts of testing to ensure work is cut off exactly when the timer runs out, but based on the docs for timeout, this should work for all things not using FFI.
import Control.Concurrent.STM
import Control.DeepSeq
import System.Timeout
type Time = Int
-- | Compute as many items of a list as possible in the given timeframe (microseconds).
-- This is done by running a function that computes (with `force`)
-- list items and pushes them onto a `TVar [a]`. When the requested time
-- expires, ghc will terminate the execution of `forceIntoTVar`, and we'll
-- return what has been pushed onto the tvar.
timeLimited :: (NFData a) => Time -> [a] -> IO [a]
timeLimited t xs = do
    v <- newTVarIO []
    _ <- timeout t (forceIntoTVar xs v)
    readTVarIO v
-- | Force computed values into given tvar
forceIntoTVar :: (NFData a) => [a] -> TVar [a] -> IO [()]
forceIntoTVar xs v = mapM (atomically . modifyTVar v . forceCons) xs
-- | Returns function that does actual computation and cons' to tvar value
forceCons :: (NFData a) => a -> [a] -> [a]
forceCons x = (force x:)
Now let's try it on something costly:
main = do
    xs <- timeLimited 100000 expensiveThing  -- run for 100 milliseconds
    print $ length xs                        -- how many did we get?

-- | Some high-cost computation
expensiveThing :: [Integer]
expensiveThing = sieve [2..]
  where
    sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p > 0]
Compiled and run with time, it seems to work (obviously there is some overhead outside the timed portion, but I'm at roughly 100ms):
$ time ./timeLimited
1234
./timeLimited 0.10s user 0.01s system 97% cpu 0.112 total
Also, something to note about this approach: since I'm enclosing the entire operation of running the computations and pushing them onto the tvar inside one call to timeout, some time here is likely lost in creating the return structure, though I'm assuming (if your computations are costly) it won't account for much of your overall time.
Update
Now that I've had some time to think about it, due to Haskell's laziness, I'm not 100% positive the note above (about time-spent creating the return structure) is correct; either way, let me know if this is not precise enough for what you are trying to accomplish.
You can implement timeOut with the type you gave using timeout and evaluate. It looks something like this (I've omitted the part that computes how much time is left -- use getCurrentTime or similar for that):
timeoutPure :: Int -> a -> IO (Maybe a)
timeoutPure t a = timeout t (evaluate a)
If you want more forcing than just weak-head normal form, you can call this with an already-deep-seq'd argument, e.g. timeoutPure t (v `deepseq` v) instead of timeoutPure t v.
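As a rough sketch of the omitted bookkeeping (my own addition; timeLimited'' is a hypothetical name), one could thread the remaining budget through with getCurrentTime, keeping everything in microseconds to match System.Timeout:

import Control.Exception (evaluate)
import Data.Time.Clock (diffUTCTime, getCurrentTime)
import System.Timeout (timeout)

timeLimited'' :: Int -> [a] -> IO [a]
timeLimited'' _         []     = return []
timeLimited'' remaining (x:xs)
    | remaining <= 0 = return []
    | otherwise      = do
        t1 <- getCurrentTime
        r  <- timeout remaining (evaluate x)   -- WHNF only, like timeoutPure above
        t2 <- getCurrentTime
        let spent = ceiling (realToFrac (diffUTCTime t2 t1) * (1e6 :: Double))
        case r of
            Nothing -> return []
            Just y  -> (y :) <$> timeLimited'' (remaining - spent) xs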
I would use two threads together with TVars and raise an exception (that causes every ongoing transaction to be rolled back) in the computation thread when the timeout has been reached:
forceIntoTVar :: (NFData a) => [a] -> TVar [a] -> IO [()]
forceIntoTVar xs v = mapM (atomically . modifyTVar v . forceCons) xs
-- | Returns function that does actual computation and cons' to tvar value
forceCons :: (NFData a) => a -> [a] -> [a]
forceCons x = (force x:)
main = do
    v <- newTVarIO []
    tID <- forkIO $ forceIntoTVar args v
    threadDelay 200
    killThread tID
    readTVarIO v
In this example you may need to adjust forceIntoTVar a bit so that e.g. the list nodes are NOT computed inside the atomic transaction, but are first computed and only then consed onto the list inside an atomic transaction; a sketch of that adjustment follows below.
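Here is one hypothetical shape that adjustment could take (my sketch, not the answer's code, assuming the imports from above plus Control.Exception's evaluate): force each element outside the transaction and only cons it on inside a short atomically block:

forceIntoTVar' :: (NFData a) => [a] -> TVar [a] -> IO ()
forceIntoTVar' xs v = mapM_ step xs
  where
    step x = do
        x' <- evaluate (force x)            -- do the expensive work outside STM
        atomically (modifyTVar v (x' :))    -- then a quick transaction to record it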
In any case, when the exception is raised the ongoing transaction is rolled back or the ongoing computation is stopped before the result is consed to the list and that is what you want.
What you need to consider is that when the individual computations that prepare a node run at high frequency, this approach is probably very costly compared to not using STM.

Resources