Haskell arrow tutorial loop/state


In this tutorial:
https://en.wikibooks.org/wiki/Haskell/Arrow_tutorial#Hangman:_Main_program
how is the IO done? In particular:
main :: IO ()
main = do
    rng <- getStdGen
    interact $ unlines                      -- Concatenate lines of output
        . ("Welcome to Arrow Hangman":)     -- Prepend a greeting to the output
        . concat . map snd . takeWhile fst  -- Take the [String]s as long as the first element of the tuples is True
        . runCircuit (hangman rng)          -- Process the input lazily
        . ("":)                             -- Act as if the user pressed ENTER once at the start
        . lines                             -- Split input into lines
interact appears to have type (String -> String) -> IO (), and my impression was that it runs that function once per line that it reads. What confuses me is: how is the initial line printed, and where is the state of the game stored in between?
runCircuit was used earlier in a way in which all of its inputs had already been generated. I'm confused as to how this version runs line by line yet doesn't appear to store any state.
How can a Circuit String (Bool, [String]) be run by runCircuit :: Circuit a b -> [a] -> [b] in a line-by-line fashion, in a way that appears to remember what the previous results were?

interact does not run the function once per line. Both interact and runCircuit are lazy. Because you split the input into lines and concatenate the output, you see the progress of runCircuit as you provide more and more input.
The function runCircuit is defined as follows:
runCircuit :: Circuit a b -> [a] -> [b]
runCircuit _   []     = []
runCircuit cir (x:xs) =
    let (cir',x') = unCircuit cir x
    in  x' : runCircuit cir' xs
There you can see that one element of the output list is produced for each element of the input list (each line), which already indicates that the list can be processed lazily. (For comparison: if it required the length of xs to produce the first output x', then runCircuit would not be lazy.)
Let's put that together with the definition of Circuit:
data Circuit a b = Circuit { unCircuit :: a -> (Circuit a b, b) }
The way you run a circuit is that you provide a first input x of type a and obtain not only a first output x' of type b, but also a continuation Circuit (cir' in runCircuit). This continuation is a new Circuit a b, used by runCircuit in the next iteration. That is how state is kept: the new Circuit will be similar to the original one, but it may have been affected by previous inputs.
For example, you could define a circuit that sums Ints and produces the total sum. There is one example in that article, but to make things super-simple:
mySum :: Circuit Int Int
mySum = mySum' 0
mySum' :: Int -> Circuit Int Int
mySum' acc = Circuit $ \input ->
    let acc' = acc + input
    in  (mySum' acc', acc')
In each iteration, the continuation Circuit returned, mySum' acc', uses acc', the new accumulator, which contains the sum up to that point. So this Circuit keeps state because it remembers or carries forward the sum of all numbers up to that point.
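To see the state being threaded through, here is a minimal self-contained sketch that repeats the definitions quoted above and runs mySum on a finite and an infinite input. The names finiteTotals and lazyTotals are made up for this illustration.

```haskell
-- Circuit, runCircuit and mySum as quoted in this answer.
data Circuit a b = Circuit { unCircuit :: a -> (Circuit a b, b) }

runCircuit :: Circuit a b -> [a] -> [b]
runCircuit _   []     = []
runCircuit cir (x:xs) =
  let (cir', x') = unCircuit cir x
  in  x' : runCircuit cir' xs

mySum :: Circuit Int Int
mySum = mySum' 0

mySum' :: Int -> Circuit Int Int
mySum' acc = Circuit $ \input ->
  let acc' = acc + input
  in  (mySum' acc', acc')

-- Running totals of a finite input (hypothetical name).
finiteTotals :: [Int]
finiteTotals = runCircuit mySum [1, 2, 3, 4]    -- [1,3,6,10]

-- Only three inputs are demanded from the infinite list (hypothetical name).
lazyTotals :: [Int]
lazyTotals = take 3 (runCircuit mySum [1 ..])   -- [1,3,6]
```

Each output already includes everything seen so far, even though runCircuit itself never stores anything: the accumulator lives inside the continuation Circuit.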
Back to that article, the slightly more general function:
accum :: acc -> (a -> acc -> (b, acc)) -> Circuit a b
accum acc f = Circuit $ \input ->
    let (output, acc') = input `f` acc
    in  (accum acc' f, output)
returns, in the first component of the tuple, a continuation Circuit that is different from itself. It was called as accum acc f, but the continuation is accum acc' f, where acc' depends on the input and acc, so it retains memory in this accumulator.
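As a rough sketch of how this fits the interact pipeline from the question, the block below builds a small stateful circuit with accum and wraps it in the same unlines/lines shape. numberLines and session are hypothetical stand-ins for the hangman circuit and for the function passed to interact; they are not taken from the tutorial.

```haskell
data Circuit a b = Circuit { unCircuit :: a -> (Circuit a b, b) }

runCircuit :: Circuit a b -> [a] -> [b]
runCircuit _   []     = []
runCircuit cir (x:xs) =
  let (cir', x') = unCircuit cir x
  in  x' : runCircuit cir' xs

accum :: acc -> (a -> acc -> (b, acc)) -> Circuit a b
accum acc f = Circuit $ \input ->
  let (output, acc') = input `f` acc
  in  (accum acc' f, output)

-- Hypothetical stand-in for the game: prefixes each line with a counter
-- that is carried from one line to the next.
numberLines :: Circuit String String
numberLines = accum (1 :: Int) $ \line n -> (show n ++ ": " ++ line, n + 1)

-- Same shape as the function given to interact in the question.
session :: String -> String
session = unlines . ("Welcome" :) . runCircuit numberLines . lines
```

Because every stage is lazy, the greeting can be produced before any input has been read, and each further output line needs only the corresponding input line. That is why `interact session` would appear to respond line by line.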
Using continuations is very, very common. I think most pipe/stream-processing frameworks and many FRP implementations do this, including Yampa, Varying, Dunai and netwire.

Related

Understanding non-strictness in Haskell with a recursive example

What is the difference between these two in terms of evaluation?
Why does this one preserve non-strictness
recFilter :: (a -> Bool) -> [a] -> [a]
recFilter _ [] = []
recFilter p (h:tl) = if (p h)
  then h : recFilter p tl
  else recFilter p tl
while this doesn't?
recFilter :: (a -> Bool) -> [a] -> Int -> [a]
recFilter _ xs 0 = xs
recFilter p (h:tl) len
  | p(h)      = recFilter p (tl ++ [h]) (len-1)
  | otherwise = recFilter p tl (len-1)
Is it possible to write a tail-recursive function non-strictly?
To be honest I also don't understand the call stack of the first example, because I can't see where that h: goes. Is there a way to see this in ghci?

The non-tail recursive function roughly consumes a portion of the input (the first element) to produce a portion of the output (well, if it's not filtered out at least). Then recursion handles the next portion of the input, and so on.
Your tail recursive function will recurse until len becomes zero, and only at that point it will output the whole result.
Consider this pseudocode:
def rec1(p,xs):
   case xs:
      []     -> []
      (y:ys) -> if p(y): print y
                rec1(p,ys)
and compare it with this accumulator-based variant. I'm not using len since I use a separate accumulator argument, which I assume to be initially empty.
def rec2(p,xs,acc):
   case xs:
      []     -> print acc
      (y:ys) -> if p(y):
                   rec2(p,ys,acc++[y])
                else:
                   rec2(p,ys,acc)
rec1 prints before recursing: it does not need to inspect the whole input list to start printing its output. It works in a "streaming" fashion, in a sense. By contrast, rec2 will only start to print at the very end, after the input list has been completely scanned.
In your Haskell code there are no prints, of course, but you can think of returning x : function call as "printing x", since x is made available to the caller of our function before function call is actually made. (Well, to be pedantic this depends on how the caller will consume the output list, but I'll neglect this.)
Hence the non-tail recursive code can also work on infinite lists. Even on finite inputs, performance is improved: if we call head (rec1 p xs), we only evaluate xs until the first non-discarded element. By contrast, head (rec2 p xs) would filter the whole list xs, even if we don't need that.
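The difference can be checked directly in Haskell. This is a small sketch with made-up names (streamFilter, accFilter, firstEvens): the first function has the shape of rec1, the second the shape of rec2.

```haskell
-- Shape of rec1: each kept element is returned before the recursive call
-- is evaluated.
streamFilter :: (a -> Bool) -> [a] -> [a]
streamFilter _ [] = []
streamFilter p (h:tl)
  | p h       = h : streamFilter p tl
  | otherwise = streamFilter p tl

-- Shape of rec2: nothing is returned until the whole input has been scanned.
accFilter :: (a -> Bool) -> [a] -> [a]
accFilter p = go []
  where
    go acc []     = acc
    go acc (y:ys)
      | p y       = go (acc ++ [y]) ys
      | otherwise = go acc ys

-- Works on an infinite list because only a prefix is ever demanded.
firstEvens :: [Int]
firstEvens = take 3 (streamFilter even [1 ..])   -- [2,4,6]
```

By contrast, `take 3 (accFilter even [1 ..])` never returns, because go only yields its accumulator once it reaches the end of the input.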

The second implementation does not make much sense: a variable named len does not necessarily contain the length of the list, so the caller has to pass it in. For infinite lists this would not work, since there is no length at all.
You likely want to produce something like:
recFilter :: (a -> Bool) -> [a] -> [a]
recFilter p = go []
    where go ys [] = ys  -- (1)
          go ys (x:xs) | p x = go (ys ++ [x]) xs
                       | otherwise = go ys xs
where we thus have an accumulator to which we append the items in the list, and then eventually return the accumulator.
The problem with the second approach is that, as long as the accumulator is not returned, Haskell needs to keep recursing before the result reaches weak head normal form (WHNF). This means that if we pattern match the result against [] or (_:_), we have to recurse at least until case (1), since the other cases only produce a new call to go and thus do not yield a data constructor on which we can pattern match.
This is in contrast to the first filter, where if we pattern match on [] or (_:_) it is sufficient to stop at case (1) or at case (2), where the expression produces a value with a list data constructor. Only if we need more elements to pattern match, for example (_:_:_), do we have to evaluate recFilter p tl in case (2) of the first implementation:
recFilter :: (a -> Bool) -> [a] -> [a]
recFilter _ [] = []  -- (1)
recFilter p (h:tl) = if (p h)
  then h : recFilter p tl  -- (2)
  else recFilter p tl
For more information, see the Laziness section of the Wikibook on Haskell that describes how laziness works with thunks.

How to use the Select monad to solve n-queens?

I'm trying to understand how the Select monad works. Apparently, it is a cousin of Cont and it can be used for backtracking search.
I have this list-based solution to the n-queens problem:
-- All the ways of extracting an element from a list.
oneOf :: [Int] -> [(Int,[Int])]
oneOf [] = []
oneOf (x:xs) = (x,xs) : map (\(y,ys) -> (y,x:ys)) (oneOf xs)
-- Adding a new queen at col x, is it threatened diagonally by any of the
-- existing queens?
safeDiag :: Int -> [Int] -> Bool
safeDiag x xs = all (\(y,i) -> abs (x-y) /= i) (zip xs [1..])
nqueens :: Int -> [[Int]]
nqueens queenCount = go [] [1..queenCount]
  where
    -- cps = columns of already positioned queens.
    -- fps = columns that are still available
    go :: [Int] -> [Int] -> [[Int]]
    go cps [] = [cps]
    go cps fps = [ps | (p,nfps) <- oneOf fps, ps <- go (p:cps) nfps, safeDiag p cps]
I'm struggling to adapt this solution to use Select instead.
It seems that Select lets you abstract over the "evaluation function" that is used to compare answers. That function is passed to runSelect. I have the feeling that something like safeDiag in my solution could work as the evaluation function, but how to structure the Select computation itself?
Also, is it enough to use the Select monad alone, or do I need to use the transformer version over lists?

I realize this question is almost 4 years old and already has an answer, but I wanted to chime in with some additional information for the sake of anyone who comes across this question in the future. Specifically, I want to try to answer 2 questions:
1. How are multiple Selects that return single values combined to create a single Select that returns a sequence of values?
2. Is it possible to return early when a solution path is destined to fail?
Chaining Selects
Select is implemented as a monad transformer in the transformers library (go figure), but let's take a look at how one might implement >>= for Select by itself:
(>>=) :: Select r a -> (a -> Select r b) -> Select r b
Select g >>= f = Select $ \k ->
  let choose x = runSelect (f x) k
  in  choose $ g (k . choose)
We start by defining a new Select, which takes an input k of type b -> r (recall that Select r a wraps a function of type (a -> r) -> a, so the Select r b we are building wraps one of type (b -> r) -> b). You can think of k as a function that returns a "score" of type r for a given final result, which the Select function may use to determine which value to return.
Inside our new Select, we define a function called choose. This function passes some x to the function f, which is the a -> m b portion of monadic binding: it transforms the result of the m a computation into a new computation m b. So f is going to take that x and return a new Select, which choose then runs using our scoring function k. You can think of choose as a function that asks "what would the final result be if I selected x and passed it downstream?"
On the second line, we return choose $ g (k . choose). The function k . choose is the composition of choose and our original scoring function k: it takes in a value, calculates the downstream result of selecting that value, and returns the score of that downstream result. In other words, we've created a kind of "clairvoyant" scoring function: instead of returning the score of a given value, it returns the score of the final result we would get if we selected that value. By passing in our "clairvoyant" scoring function to g (the original Select that we're binding to), we're able to select the intermediate value that leads to the final result we're looking for. Once we have that intermediate value, we simply pass it back into choose and return the result.
That's how we're able to string together single-value Selects while passing in a scoring function that operates on an array of values: each Select is scoring the hypothetical final result of selecting a value, not necessarily the value itself. The applicative instance follows the same strategy, the only difference being how the downstream Select is computed (instead of passing a candidate value into the a -> m b function, it maps a candidate function over the 2nd Select.)
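To make the "clairvoyant" scoring concrete, here is a minimal standalone sketch. It assumes a hand-rolled Select newtype rather than the transformers one (which is a transformer over Identity), and pick and pairSummingTo are hypothetical names introduced only for this example.

```haskell
-- Standalone Select, written out for illustration only.
newtype Select r a = Select { runSelect :: (a -> r) -> a }

instance Functor (Select r) where
  fmap f (Select g) = Select $ \k -> f (g (k . f))

instance Applicative (Select r) where
  pure x    = Select (const x)
  sf <*> sx = sf >>= \f -> fmap f sx

instance Monad (Select r) where
  Select g >>= f = Select $ \k ->
    let choose x = runSelect (f x) k
    in  choose (g (k . choose))

-- Hypothetical helper: the first candidate that satisfies the predicate,
-- or the first candidate overall if none does. Assumes a non-empty list.
pick :: [a] -> Select Bool a
pick candidates = Select $ \p -> head (filter p candidates ++ candidates)

-- The predicate scores a whole pair, yet each pick chooses a single number.
pairSummingTo :: Int -> (Int, Int)
pairSummingTo target = runSelect search (\(x, y) -> x + y == target)
  where
    search = do
      x <- pick [1 .. 5]
      y <- pick [1 .. 5]
      return (x, y)
```

With a target of 9, the first pick tries x = 1, 2, 3 and sees that no downstream choice of y can reach the target; it settles on x = 4, after which the second pick selects y = 5.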
Returning Early
So, how can we use Select while returning early? We need some way of accessing the scoring function within the scope of the code that constructs the Select. One way to do that is to construct each Select within another Select, like so:
sequenceSelect :: Eq a => [a] -> Select Bool [a]
sequenceSelect [] = return []
sequenceSelect domain@(x:xs) = select $ \k ->
  if k [] then runSelect s k else []
  where
    s = do
      choice <- elementSelect (x:|xs)
      fmap (choice:) $ sequenceSelect (filter (/= choice) domain)
This allows us to test the sequence in progress and short-circuit the recursion if it fails. (We can test the sequence by calling k [] because the scoring function includes all of the prepends that we've recursively lined up.)
Here's the whole solution:
import Data.List
import Data.List.NonEmpty (NonEmpty(..))
import Control.Monad.Trans.Select
validBoard :: [Int] -> Bool
validBoard qs = all verify (tails qs)
  where
    verify [] = True
    verify (x:xs) = and $ zipWith (\i y -> x /= y && abs (x - y) /= i) [1..] xs
nqueens :: Int -> [Int]
nqueens boardSize = runSelect (sequenceSelect [1..boardSize]) validBoard
sequenceSelect :: Eq a => [a] -> Select Bool [a]
sequenceSelect [] = return []
sequenceSelect domain@(x:xs) = select $ \k ->
  if k [] then runSelect s k else []
  where
    s = do
      choice <- elementSelect (x:|xs)
      fmap (choice:) $ sequenceSelect (filter (/= choice) domain)
elementSelect :: NonEmpty a -> Select Bool a
elementSelect domain = select $ \p -> epsilon p domain
-- like find, but will always return something
epsilon :: (a -> Bool) -> NonEmpty a -> a
epsilon _ (x:|[]) = x
epsilon p (x:|y:ys) = if p x then x else epsilon p (y:|ys)
In short: we construct a Select recursively, removing elements from the domain as we use them and terminating the recursion if the domain has been exhausted or if we're on the wrong track.
One other addition is the epsilon function (based on Hilbert's epsilon operator). For a domain of size N it will check at most N - 1 items... it might not sound like a huge savings, but as you know from the above explanation, p will usually kick off the remainder of the entire computation, so it's best to keep predicate calls to a minimum.
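The "at most N - 1" behaviour of epsilon can be checked in isolation. The sketch below repeats epsilon from the solution above; firstEven, fallback, untestedLast and explosive are hypothetical names used only for this check.

```haskell
import Data.List.NonEmpty (NonEmpty (..))

-- epsilon as defined in the solution above.
epsilon :: (a -> Bool) -> NonEmpty a -> a
epsilon _ (x :| [])     = x
epsilon p (x :| y : ys) = if p x then x else epsilon p (y :| ys)

-- The first element satisfying the predicate is returned.
firstEven :: Int
firstEven = epsilon even (1 :| [2, 3])          -- 2

-- If nothing matches, the last element is returned as a fallback.
fallback :: Int
fallback = epsilon (> 10) (1 :| [2, 3])         -- 3

-- The last candidate is returned without being tested: this predicate
-- would crash if it were ever applied to 3.
untestedLast :: Int
untestedLast = epsilon explosive (1 :| [2, 3])  -- 3
  where
    explosive :: Int -> Bool
    explosive 3 = error "predicate applied to the last candidate"
    explosive _ = False
```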
The nice thing about sequenceSelect is how generic it is: it can be used to create any Select Bool [a] where
we're searching within a finite domain of distinct elements
we want to create a sequence that includes every element exactly once (i.e. a permutation of the domain)
we want to test partial sequences and abandon them if they fail the predicate
Hope this helps clarify things!
P.S. Here's a link to an Observable notebook in which I implemented the Select monad in JavaScript along with a demonstration of the n-queens solver: https://observablehq.com/@mattdiamond/the-select-monad

Select can be viewed as an abstraction of a search in a "compact" space, guided by some predicate. You mentioned SAT in your comments; have you tried modelling the problem as a SAT instance and throwing it at a solver based on Select (in the spirit of this paper)? You can specialise the search to hardwire the N-queens specific constraints inside your and turn the SAT solver into an N-queens solver.

Inspired by jd823592's answer, and after looking at the SAT example in the paper, I have written this code:
import Data.List
import Control.Monad.Trans.Select
validBoard :: [Int] -> Bool
validBoard qs = all verify (tails qs)
  where
    verify [] = True
    verify (x : xs) = and $ zipWith (\i y -> x /= y && abs (x-y) /= i) [1..] xs
nqueens :: Int -> [Int]
nqueens boardSize = runSelect (traverse selectColumn columns) validBoard
  where
    columns = replicate boardSize [1..boardSize]
    selectColumn candidates = select $ \s -> head $ filter s candidates ++ candidates
It seems to arrive (albeit slowly) at a valid solution:
ghci> nqueens 8
[1,5,8,6,3,7,2,4]
I don't understand it very well, however. In particular, the way sequence works for Select, transmuting a function (validBoard) that works over a whole board into functions that take a single column index, seems quite magical.
The sequence-based solution has the defect that putting a queen in a column doesn't rule out the possibility of choosing the same column for subsequent queens; we end up unnecessarily exploring doomed branches.
If we want our column choices to be affected by previous decisions, we need to go beyond Applicative and use the power of Monad:
nqueens :: Int -> [Int]
nqueens boardSize = fst $ runSelect (go ([],[1..boardSize])) (validBoard . fst)
  where
    go (cps,[]) = return (cps,[])
    go (cps,fps) = (select $ \s ->
      let candidates = map (\(z,zs) -> (z:cps,zs)) (oneOf fps)
      in  head $ filter s candidates ++ candidates) >>= go
The monadic version still has the problem that it only checks completed boards, whereas the original list-based solution backtracked as soon as a partially completed board was found to have a conflict. I don't know how to do that using Select.

Haskell foldl implementation with foldr

I'm having trouble understanding the implementation of the foldl function using foldr. I have read this question (Writing foldl using foldr) and I still have some things I don't understand in the following example:
fun :: [Int] -> [Int]
fun l = foldr g (const []) l 1
  where g x f lst = if gcd x lst == 1 then x : f x else f lst
The function takes a list as a parameter and returns another list where gcd(l[i], l[i + 1]) = 1.
My questions are the following:
1. What are x, f and lst?
2. What is const [] and why can't I use the id function?

foldr is one of those weird tools like bicycles that are really easy to use once you get the hang of them but hard to learn from the start. After several years of experience, I've gotten really good at spotting problems I can solve with foldr, and solving them with it immediately and correctly, but it could easily take me a while to figure out just what I've done in enough detail to explain!
From a practical standpoint, I usually think of foldr in vaguely continuation-passing language. Ignoring the "simple" case where foldr is only applied to three arguments, an application of foldr looks like this:
foldr go finish xs acc1 acc2 ... where
  finish acc1 acc2 ... = ?
  go x cont acc1 acc2 ... = ?
acc1, etc., are accumulators passed "from left to right". The result consists, conceptually, of a single value passed "from right to left".
finish gets the final values of the accumulators and produces something of the result type. It's usually the easiest part to write because
foldr go finish [] acc1 acc2 ...
=
finish acc1 acc2 ...
So once you figure out just what you want your fold to produce, writing finish is fairly mechanical.
go gets a single container element, a "continuation", and the accumulators. It passes modified values of those accumulators "forward" to the continuation to get a result, and uses that result to construct its own.
foldl is a particularly simple case because its go function just returns the result it gets from folding the rest of the container with a new accumulator argument. I think it's a bit more instructive to look at an example that does a bit more. Here's one that takes a container of numbers and produces a list of pairs representing a running sum and a running product.
sumsProducts :: (Num n, Foldable f) => f n -> [(n, n)]
sumsProducts xs = foldr go finish xs 0 1
  where
    finish total prod = [(total, prod)]
    go x cont total prod =
      (total, prod) : cont (x + total) (x * prod)
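For comparison, the foldl case mentioned above can be written in the same finish/go shape. This is a sketch under that reading; foldlViaFoldr is a name chosen here, not a library function.

```haskell
-- foldl expressed with foldr, using one accumulator passed left to right.
foldlViaFoldr :: Foldable f => (b -> a -> b) -> b -> f a -> b
foldlViaFoldr step z xs = foldr go finish xs z
  where
    -- At the end of the container, the accumulator is the result.
    finish acc = acc
    -- Update the accumulator and hand it to the rest of the fold;
    -- the result of the continuation is returned unchanged.
    go x cont acc = cont (step acc x)
```

For example, `foldlViaFoldr (-) 10 [1, 2, 3]` evaluates as ((10 - 1) - 2) - 3, the same as `foldl (-) 10 [1, 2, 3]`.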

foldr's type signature is this
foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b
This means your foldr applied to its 3 arguments must return a function that takes the 1 as an argument.
So you can specialise your foldr to this
foldr :: (Int -> (Int -> [Int]) -> (Int -> [Int]))
      -> (Int -> [Int])
      -> [Int]
      -> (Int -> [Int])
This means your g function must have the following type
g :: Int -> (Int -> [Int]) -> Int -> [Int]
So your parameters have the type
x :: Int
f :: Int -> [Int]
lst :: Int
And foldr requires an Int -> [Int] as its 2nd argument rather than a plain [Int], so you can't pass it the value [].
Fortunately, const returns a function that ignores its argument and always returns a constant value:
const [] :: a -> [b]
In your case f is indeed some kind of accumulator. But instead of reducing e.g. a list of values to some number, you are chaining functions here. By passing 1 to this function chain in the end, it gets evaluated and is then building the actual list you return in fun.
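It may help to compare the fold with an explicitly recursive version. The block below is a sketch: fun is copied from the question, while funExplicit is a hypothetical rewrite added for comparison.

```haskell
-- The function from the question.
fun :: [Int] -> [Int]
fun l = foldr g (const []) l 1
  where
    g x f lst = if gcd x lst == 1 then x : f x else f lst

-- Hypothetical explicit-recursion equivalent: lst is the last element
-- that was kept (initially 1), and go plays the role of f.
funExplicit :: [Int] -> [Int]
funExplicit l = go l 1
  where
    go []     _   = []          -- corresponds to const []
    go (x:xs) lst
      | gcd x lst == 1 = x : go xs x
      | otherwise      = go xs lst
```

On [2, 4, 3, 9, 5] both keep 2 (gcd 2 1 = 1), drop 4 (gcd 4 2 = 2), keep 3, drop 9 (gcd 9 3 = 3) and keep 5, giving [2, 3, 5]. This also shows why id does not fit as the base case: at the end of the list the function must turn the final Int into a [Int], and id would hand back the Int itself.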

How to generate a list of repeated applications of a function to the previous result of it in IO context

As part of a solution to a problem I'm trying to solve, I need to generate a list of repeated applications of a function to its previous result. This sounds very much like the iterate function, except that iterate has the signature
iterate :: (a -> a) -> a -> [a]
and my function lives inside IO (I need to generate random numbers), so I'd need something more like:
iterate' :: (a -> IO a) -> a -> [a]
I have searched Hoogle, but without much success.

You can actually get a lazy iterate that works on infinite lists if you use the pipes library. The definition is really simple:
import Pipes
iterate' :: (a -> IO a) -> a -> Producer a IO r
iterate' f a = do
    yield a
    a2 <- lift (f a)
    iterate' f a2
For example, let's say that our step function is:
step :: Int -> IO Int
step n = do
    m <- readLn
    return (n + m)
Then applying iterate' to step generates a Producer that lazily prompts the user for input and generates the tally of values read so far:
iterate' step 0 :: Producer Int IO ()
The simplest way to read out the value is to loop over the Producer using for:
main = runEffect $
    for (iterate' step 0) $ \n -> do
        lift (print n)
The program then endlessly loops, requesting user input and displaying the current tally:
>>> main
0
10<Enter>
10
14<Enter>
24
5<Enter>
29
...
Notice how this gets two things correct which the other solutions do not:
It works on infinite lists (you don't need a termination condition)
It produces results immediately. It doesn't wait until you run the action on the entire list to start producing usable values.
However, we can easily filter results just like the other two solutions. For example, let's say I want to stop when the tally is greater than 100. I can just write:
import qualified Pipes.Prelude as P
main = runEffect $
    for (iterate' step 0 >-> P.takeWhile (< 100)) $ \n -> do
        lift (print n)
You can read that as saying: "Loop over the iterated values while they are less than 100. Print the output". Let's try it:
>>> main
0
10<Enter>
10
20<Enter>
30
75<Enter>
>>> -- Done!
In fact, pipes has another helper function for printing out values, so you can simplify the above to a pipeline:
main = runEffect $ iterate' step 0 >-> P.takeWhile (< 100) >-> P.print
This gives a clear view of the flow of information. iterate' produces a never-ending stream of Ints, P.takeWhile filters that stream, and P.print prints all values that reach the end.
If you want to learn more about the pipes library, I encourage you to read the pipes tutorial.

Your function lives in IO, so the signature is rather:
iterate' :: (a -> IO a) -> a -> IO [a]
The problem is that the original iterate function returns an infinite list, so if you try to do the same in IO you will get an action that never ends. Maybe you should add a condition to end the iteration.
iterate' action value = do
    result <- action value
    if condition result          -- condition :: a -> Bool is a stopping test you supply
        then return []
        else do
            rest <- iterate' action result
            return $ result : rest

Firstly, your resulting list must be in the IO monad, so iterate' must produce an IO [a] rather than an [a].
iterate can be defined as:
iterate :: (a -> a) -> a -> [a]
iterate f x = x : iterate f (f x)
so we could make an iterateM quite easily
iterateM :: Monad m => (a -> m a) -> m a -> [m a]
iterateM f x = x : iterateM f (x >>= f)
This still needs your seed value to be in the monad to start though, and also gives you a list of monadic things, rather than a monad of listy things.
So, let's change it a bit.
iterateM :: Monad m => (a -> m a) -> a -> m [a]
iterateM f x = sequence $ go f (return x)
  where
    go f x = x : go f (x >>= f)
However, this doesn't work, because sequence first runs every action and only then returns. (You can see this if you write some safeDivide :: Double -> Double -> Maybe Double and then try something like fmap (take 10) $ iterateM (flip safeDivide 2) 1000: you'll find it doesn't terminate.) I'm not sure how to fix that, though.
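One possible way around the non-termination, if a bounded number of results is acceptable, is to pass the count into the recursion instead of applying take afterwards. The sketch below is only that: iterateN is a hypothetical name, and it does not give a lazily produced infinite list.

```haskell
-- Collect the seed and the results of repeatedly applying f, n values in
-- total, running the effects in order. Stops after n values, so the
-- monadic action itself terminates.
iterateN :: Monad m => Int -> (a -> m a) -> a -> m [a]
iterateN n f x
  | n <= 0    = return []
  | n == 1    = return [x]          -- no further call to f is needed
  | otherwise = do
      x'   <- f x
      rest <- iterateN (n - 1) f x'
      return (x : rest)
```

In Maybe, `iterateN 4 (\v -> Just (v / 2)) 1000` gives `Just [1000.0,500.0,250.0,125.0]`, and any step that returns Nothing makes the whole result Nothing.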

How does this cyclic recursion provide the desired result?

Consider the following abbreviated code from this excellent blog post:
import System.Random (Random, randomRIO)
newtype Stream m a = Stream { runStream :: m (Maybe (NonEmptyStream m a)) }
type NonEmptyStream m a = (a, Stream m a)
empty :: (Monad m) => Stream m a
empty = Stream $ return Nothing
cons :: (Monad m) => a -> Stream m a -> Stream m a
cons a s = Stream $ return (Just (a, s))
fromList :: (Monad m) => [a] -> NonEmptyStream m a
fromList (x:xs) = (x, foldr cons empty xs)
Not too bad thus far - a monadic, recursive data structure and a way to build one from a list.
Now consider this function that chooses a (uniformly) random element from a stream, using constant memory:
select :: NonEmptyStream IO a -> IO a
select (a, s) = select' (return a) 1 s where
    select' :: IO a -> Int -> Stream IO a -> IO a
    select' a n s = do
      next <- runStream s
      case next of
        Nothing -> a
        Just (a', s') -> select' someA (n + 1) s' where
          someA = do i <- randomRIO (0, n)
                     case i of 0 -> return a'
                               _ -> a
I'm not grasping the mysterious cyclic well of infinity that's going on in the last four lines; the result a' depends on a recursion on someA, which itself could depend on a', but not necessarily.
I get the vibe that the recursive worker is somehow 'accumulating' potential values in the IO a accumulator, but I obviously can't reason about it well enough.
Could anyone provide an explanation as to how this function produces the behaviour that it does?

That code doesn't actually run in constant space, as it composes a bigger and bigger IO a action which delays all the random choices until it's reached the end of the stream. Only when we reach the Nothing -> a case does the action in a actually get run.
For example, try running it on an infinite, constant space stream made by this function:
repeat' :: a -> NonEmptyStream IO a
repeat' x = let xs = (x, Stream $ return (Just xs)) in xs
Obviously, running select on this stream won't terminate, but you should see the memory usage going up as it allocates a lot of thunks for the delayed actions.
Here's a slightly re-written version of the code which does the choices as it goes along, so it runs in constant space and should hopefully be more clear as well. Note that I've replaced the IO a argument with a plain a which makes it clear that there are no delayed actions being built up here.
select :: NonEmptyStream IO a -> IO a
select (x, xs) = select' x 1 xs where
    select' :: a -> Int -> Stream IO a -> IO a
    select' current n xs = do
      next <- runStream xs
      case next of
        Nothing -> return current
        Just (x, xs') -> do
          i <- randomRIO (0, n)                   -- (1)
          case i of
            0 -> select' x (n+1) xs'              -- (2)
            _ -> select' current (n+1) xs'        -- (3)
As the name implies, current stores the currently selected value at each step. Once we've extracted the next item from the stream, we (1) pick a random number and use this to decide whether to (2) replace our selection with the new item or (3) keep our current selection before recursing on the rest of the stream.

There doesn't seem to be anything "cyclic" going on here. In particular, a' does not depend on someA. a' is bound by pattern matching on the result of next. It is used by someA, which is in turn used on the right-hand side, but this does not constitute a cycle.
What select' does is to traverse the stream. It maintains two accumulating arguments. The first is a random element from the stream (it's not yet selected and still random, hence IO a). The second is the position in the stream (Int).
The invariant being maintained is that the first accumulator selects an element uniformly from the stream we have seen so far, and that the integer represents the number of elements encountered so far.
Now, if we reach the end of the stream (Nothing), we can return the current random element, and it will be ok.
If we see another element (the Just case), then we recurse by calling select' again. Updating the number of elements to n + 1 is trivial. But how do we update the random element someA? Well, the old random element a chooses between the first n positions of the stream with equal probability. If we choose the new element a' with probability 1 / (n + 1) and use the old one in all other cases, then we have a uniform distribution over the whole stream up to this point again.
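The uniformity argument can be checked by brute force on a small stream. The sketch below is a simplification made for this purpose: it works on a plain list instead of a Stream and takes the random draws as an argument instead of calling randomRIO, and all names in it (selectWith, allDraws, selectionCounts) are hypothetical.

```haskell
import Data.List (group, sort)

-- Same replacement rule as select', but with the draws supplied up front:
-- the n-th draw is meant to come from the range 0..n, and a draw of 0
-- replaces the current selection with the new element.
selectWith :: [Int] -> a -> [a] -> a
selectWith draws x0 rest = go x0 (zip draws rest)
  where
    go current [] = current
    go current ((i, x) : more)
      | i == 0    = go x more
      | otherwise = go current more

-- Every equally likely sequence of draws for a 4-element stream.
allDraws :: [[Int]]
allDraws = sequence [[0 .. 1], [0 .. 2], [0 .. 3]]

-- How often each element ends up selected across all 24 draw sequences.
selectionCounts :: [(Char, Int)]
selectionCounts =
  [ (c, length g)
  | g@(c : _) <- group (sort [selectWith ds 'a' "bcd" | ds <- allDraws]) ]
```

Each of the four elements is selected in 6 of the 24 cases, which matches the claim that choosing the new element with probability 1 / (n + 1) keeps the distribution uniform.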
