Random walk on a pointed container - haskell

Let us consider a dwarf wandering in a tunnel. I will define a type that represents this
situation thusly:
data X a = X { xs :: [a], i :: Int }
display :: X Bool -> IO ()
display X{..} = putStrLn (concatMap f xs) where { f True = "*" ; f False = "-" }
Here you see a dwarf in a section of a tunnel:
λ display x
-*---
It is discovered that a pointed container is an instance of Comonad. I can use this
instance here to define a function that simulates my dwarf moving right:
-- (isInRange is not shown in the question; presumably something like
--  n `isInRange` X{..} = n >= 0 && n < length xs.)
shiftRight :: X Bool -> Bool
shiftRight x@X{..} | let i' = i - 1 in i' `isInRange` x && xs !! i' = True
                   | otherwise = False
See:
λ traverse_ display $ scanl (&) x (replicate 4 (extend shiftRight))
-*---
--*--
---*-
----*
-----
Spectacularly, this same operation works with any number of dwarves, in any pointed container,
and so can be extended to a whole dwarf fortress if desired. I can similarly define a function
that moves a dwarf leftwards, or in any other deterministic fashion.
But now what if I want my dwarf to wander around aimlessly? Now my "shift randomly" must
only place a dwarf to the right if the same dwarf is not being placed to the left (for that would
make two dwarves out of one), and also it must never place two dwarves in the same place (which
would make one dwarf out of two). In other words, "shift randomly" must be linear (as in
"linear logic") when applied over a comonadic fortress.
One approach I have in mind is to assign some sort of state to dwarves that tracks the available
moves for a dwarf, removing moves from every relevant dwarf when we decide that the location is
taken by one of them. This way, the remaining dwarves will not be able to take that move. Or we
may track availability of locations. I am thinking that some sort of a "monadic" extendM
might be useful. (It would compare to the usual extend as traverse compares to fmap.)
But I am not aware of any prior art.

The easiest way to solve this is by using the MonadRandom library, which introduces a new monad for random computations. So let’s set up a computation using random numbers:
-- normal comonadic computation
type CoKleisli w a b = w a -> b
-- randomised comonadic computation
-- (Rand here abbreviates MonadRandom's Rand g for some generator g,
--  e.g. Rand StdGen)
type RCoKleisli w a b = w a -> Rand b
Now, how to apply this thing? It’s easy enough to extend it:
halfApply :: Comonad w => (w a -> Rand b) -> (w a -> w (Rand b))
halfApply = extend
But this doesn’t quite work: it gives us a container of randomised values, whereas we want a randomised container of values. In other words, we need to find something which can do w (Rand b) -> Rand (w b). And in fact there does exist such a function: sequenceA! As the documentation states, if we apply sequenceA to a w (Rand b), it will run each Rand computation, then accumulate the results to get a Rand (w b) — which is exactly what we want! So:
fullApply :: (Comonad w, Traversable w, Applicative f)
          => (w a -> f b) -> (w a -> f (w b))
fullApply c = sequenceA . extend c
As you can see from the type signature above, this actually works for any Applicative (because all we require is that each applicative computation can be run in turn), but requires w to be Traversable (so we can traverse over each value in w).
(For more on this sort of thing, I recommend this blog post, plus its second part. If you want to see the above technique in action, I recommend my own probabilistic cellular automata library, back when it still used comonads instead of my own typeclass.)
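To make this concrete, here is a minimal usage sketch (mine, not the original answer's). It assumes the X type from the question has been given Comonad and Traversable instances, and uses MonadRandom's getRandom; flicker and step are hypothetical names.
import Control.Comonad (Comonad, extract, extend)
import Control.Monad.Random (MonadRandom, getRandom)
-- A probabilistic local rule (assumed for illustration): the focused cell
-- survives only if a random Bool comes up True.
flicker :: (Comonad w, MonadRandom m) => w Bool -> m Bool
flicker w = do
  keep <- getRandom
  pure (extract w && keep)
-- Extend the rule over the whole container, then pull the randomness outside,
-- exactly as fullApply does.
step :: (Comonad w, Traversable w, MonadRandom m) => w Bool -> m (w Bool)
step = sequenceA . extend flicker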
So that answers one half of your question; that is, how to get probabilistic behaviour using comonads. The second half is:
… and also it must never place two dwarves in the same place …
This I’m not too sure about, but one solution could be to split your comonadic computation into three stages:
Convert every dwarf probabilistically to a diff stating whether that dwarf will move left, right, or stay. Type for this operation: mkDiffs :: X Dwarf -> Rand (X DwarfDiff)
Execute each diff, but keeping the original dwarf positions. Type for this operation: execDiffs :: X DwarfDiff -> X (DwarfDiff, [DwarfDiffed]).
Resolve situations where dwarfs have collided. Type for this operation: resolve :: X (DwarfDiff, [DwarfDiffed]) -> Rand (X Dwarf).
Types used above:
data Dwarf = Dwarf | NoDwarf
data DwarfDiff = MoveLeft | MoveRight | DontMove | NoDiff
data DwarfDiffed = MovedFromLeft | MovedFromRight | NothingMoved
Example of what I’m talking about (eliding the Rand wrapper for readability):
myDwarfs = X [NoDwarf ,Dwarf ,NoDwarf ,Dwarf ,Dwarf ,Dwarf ] 0
mkDiffs myDwarfs
= X [NoDiff ,MoveRight ,NoDiff ,MoveLeft ,MoveRight ,DontMove ] 0
execDiffs (mkDiffs myDwarfs)
= X [(NoDiff,[NothingMoved]),(MoveRight,[NothingMoved]),(NoDiff,[MovedFromRight,MovedFromLeft]),(MoveLeft,[NothingMoved]),(MoveRight,[NothingMoved]),(DontMove,[MovedFromLeft])] 0
resolve (execDiffs (mkDiffs myDwarfs))
= X [NoDwarf ,NoDwarf ,Dwarf ,Dwarf ,Dwarf , Dwarf ] 0
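For concreteness, here is one possible sketch of stage 1 (my own guess, not the answer's implementation). It only needs Traversable, not the comonad, and assumes a derived Traversable instance for X together with MonadRandom's uniform:
mkDiffs :: MonadRandom m => X Dwarf -> m (X DwarfDiff)
mkDiffs = traverse pick
  where
    -- an empty cell stays empty; a dwarf picks a move uniformly at random
    pick NoDwarf = pure NoDiff
    pick Dwarf   = uniform [MoveLeft, MoveRight, DontMove]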
As you can see, the above solution is pretty complicated. I have an alternate recommendation: don’t use comonads for this problem! Comonads are great for when you need to update one value based on its context, but are awful at updating multiple values simultaneously. The issue is that comonads such as your X are zippers, which store a data structure as a single ‘focused’ value plus a surrounding ‘context’. As I said, this is great for updating a focused value based on its context, but if you need to update multiple values, you have to shoehorn your computation into this value+context mould… which, as we saw above, can be pretty tricky. So possibly comonads aren’t the best choice for this application.

Related

What are the benefits of replacing Haskell record with a function

I was reading this interesting article about continuations and I discovered this clever trick. Where I would naturally have used a record, the author uses instead a function with a sum type as the first argument.
So for example, instead of doing this
data Processor = Processor { processString :: String -> IO ()
                           , processInt    :: Int -> IO ()
                           }
processor = Processor (\s -> print $ "Hello " ++ s)
                      (\x -> print $ "value" ++ (show x))
We can do this:
data Arg = ArgString String | ArgInt Int
processor :: Arg -> IO ()
processor (ArgString s) = print $ "Hello " ++ s
processor (ArgInt x) = print $ "value" ++ (show x)
Apart from being clever, what are the benefits of it over a simple record?
Is it a common pattern and does it have a name?
Well, it's just a simple isomorphism. In the algebra of ADTs:
(IO ())^String × (IO ())^Int ≅ (IO ())^(String + Int)
The obvious benefit of the RHS is perhaps that it only contains IO () once – DRY FTW.
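To spell the isomorphism out in terms of the question's own types, here is a rough sketch (toFn and toRec are my names, not from the question):
toFn :: Processor -> (Arg -> IO ())
toFn p (ArgString s) = processString p s
toFn p (ArgInt n)    = processInt p n
toRec :: (Arg -> IO ()) -> Processor
toRec f = Processor (f . ArgString) (f . ArgInt)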
This is a very loose example but you can see the Arg method as being an initial encoding and the Processor method as being a final encoding. They are, as others have noted, of equal power when viewed in many lights; however, there are some differences.
Initial encodings enable us to examine the "commands" being executed. In some sense, it means we've sliced the operation so that the input and the output are separated. This lets us choose many different outputs given the same input.
Final encodings enable us to abstract over implementations more easily. For instance, if we have two values of type Processor then we can treat them identically even if the two have different effects or achieve their effects by different means. This kind of abstraction is popularized in OO languages.
Initial encodings enable (in some sense) an easier time adding new functions since we just have to add a new branch to the Arg type. If we had many different ways of building Processors then we'd have to update each of these mechanisms.
Honestly, what I've described above is rather stretched. It is the case that Arg and Processor fit these patterns somewhat, but they do not do so in such a significant way as to really benefit from the distinction. It may be worth studying more examples if you're interested—a good search term is the "expression problem" which emphasizes the distinction in points (2) and (3) above.
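As a small illustration of point (1): with the initial encoding nothing forces us to interpret an Arg as an IO action, so we can just as easily pick a pure interpretation, say for testing (describe is a name I made up):
describe :: Arg -> String
describe (ArgString s) = "Hello " ++ s
describe (ArgInt n)    = "value " ++ show n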
To expand a bit on leftroundabout's response, there is a way of writing function types as Output^Input, because of cardinality (how many things there are). So for example if you think about all of the mappings of the set {0, 1, 2} of cardinality 3 to the set {0, 1} of cardinality 2, you see that 0 can map to 0 or 1, independent of 1 mapping to 0 or 1, independent of 2 mapping to 0 or 1. When counting the total number of functions we get 2 * 2 * 2, or 2^3.
In this same way of writing, sum types are written with + and product types are written with *, and there is a cute way to phrase this as Out^(In1 + In2) = Out^In1 * Out^In2; we could write the isomorphism as:
combiner :: (a -> z, b -> z) -> Either a b -> z
combiner (za, zb) e_ab = case e_ab of Left a -> za a; Right b -> zb b
splitter :: (Either a b -> z) -> (a -> z, b -> z)
splitter z_eab = (\a -> z_eab $ Left a, \b -> z_eab $ Right b)
and we can reify it in your code with:
type Processor = Either String Int -> IO ()
So what's the difference? There aren't many:
The combined form requires both things to have the exact same tail-end. You can't apply combiner to something of type a -> b -> z since that parses as a -> (b -> z) and b -> z is not unifiable with z. If you wanted to unify a -> b -> z with c -> z then you have to first uncurry the function to (a, b) -> z, which looks like a bit of work -- it's just not an issue when you use the record version.
The split form is also a little more concise for application; you just write fst split a instead of combined $ Left a. But this also means that you can't quite do something like yz . combined (whose equivalent is (yz . fst split, yz . snd split)) so easily. When you've actually got the Processor record defined it might be worth it to extend its kind to * -> * and make it a Functor.
The record can in general participate in type classes more easily than the sum-type-function.
Sum types will look more imperative, so they'll probably be clearer to read. For example, if I hand you the pattern withProcState p () [Read path1, Apply (map toUpper), Write path2] it's pretty easy to see that this feeds the processor with commands to uppercase path1 into path2. The equivalent of defining processors would look like procWrite p path2 $ procApply p (map toUpper) $ procRead p path1 () which is still pretty clear but not quite as awesome as the previous case.

How to work around F#'s type system

In Haskell, you can use unsafeCoerce to override the type system. How to do the same in F#?
For example, to implement the Y-combinator.
I'd like to offer a different solution, based on embedding the untyped lambda calculus in a typed functional language. The idea is to create a data type that allows us to change between types α and α → α, which subsequently allows us to escape the restrictions of the type system. I'm not very familiar with F#, so I'll give my answer in Haskell, but I believe it could be adapted easily (perhaps the only complication could be F#'s strictness).
-- | Roughly represents a morphism between @a@ and @a -> a@.
-- Therefore we can embed an arbitrary closed λ-term into @Any a@. Any time we
-- need to create a λ-abstraction, we just nest into one @Any@ constructor.
--
-- The type parameter allows us to embed ordinary values into the type and
-- retrieve results of computations.
data Any a = Any (Any a -> a)
Note that the type parameter isn't significant for combining terms. It just allows us to embed values into our representation and extract them later. All terms of a particular type Any a can be combined freely without restrictions.
-- | Embed a value into a λ-term. If viewed as a function, it ignores its
-- input and produces the value.
embed :: a -> Any a
embed = Any . const
-- | Extract a value from a λ-term, assuming it's a valid value (otherwise it'd
-- loop forever).
extract :: Any a -> a
extract x@(Any x') = x' x
We can use this data type to represent arbitrary untyped λ-terms. If we want to interpret a value of Any a as a function, we just unwrap its constructor.
First let's define function application:
-- | Applies a term to another term.
($$) :: Any a -> Any a -> Any a
(Any x) $$ y = embed $ x y
And λ abstraction:
-- | Represents a lambda abstraction
l :: (Any a -> Any a) -> Any a
l x = Any $ extract . x
Now we have everything we need for creating complex λ-terms. Our definitions mimic the classical λ-term syntax; all we do is use l to construct λ-abstractions.
Let's define the Y combinator:
-- λf.(λx.f(xx))(λx.f(xx))
y :: Any a
y = l (\f -> let t = l (\x -> f $$ (x $$ x))
             in  t $$ t)
And we can use it to implement Haskell's classical fix. First we'll need to be able to embed a function of a -> a into Any a:
embed2 :: (a -> a) -> Any a
embed2 f = Any (f . extract)
Now it's straightforward to define
fix :: (a -> a) -> a
fix f = extract (y $$ embed2 f)
and subsequently a recursively defined function:
fact :: Int -> Int
fact = fix f
  where
    f _ 0 = 1
    f r n = n * r (n - 1)
Note that in the above text there is no recursive function. The only recursion is in the Any data type, which allows us to define y (which is also defined non-recursively).
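A quick usage sketch (not part of the original answer) of the factorial defined through the embedded Y combinator:
main :: IO ()
main = print (fact 5)   -- prints 120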
In Haskell, unsafeCoerce has the type a -> b and is generally used to assert to the compiler that the thing being coerced actually has the destination type and it's just that the type-checker doesn't know it.
Another, less common use, is to reinterpret a pattern of bits as another type. For example an unboxed Double# could be reinterpreted as an unboxed Int64#. You have to be sure about the underlying representations for this to be safe.
In F#, the first application can be achieved with box |> unbox as John Palmer said in a comment on the question. If possible use explicit type arguments to make sure that you don't accidentally have the wrong coercion inferred, e.g. box<'a> |> unbox<'b> where 'a and 'b are type variables or concrete types that are already in scope in your code.
For the second application, look at the BitConverter class for specific conversions of bit-patterns. In theory you could also do something like interfacing with unmanaged code to achieve this, but that seems very heavyweight.
These techniques won't work for implementing the Y combinator because the cast is only valid if the runtime objects actually do have the target type, but with the Y combinator you actually need to call the same function again but with a different type. For this you need the kinds of encoding tricks mentioned in the question John Palmer linked to.

Haskell monad return arbitrary data type

I am having trouble defining the return over a custom defined recursive data type.
The data type is as follows:
data A a = B a | C (A a) (A a)
However, I don't know how to define the return statement, since I can't figure out when to return a B value and when to recursively return a C.
Any help is appreciated!
One way to define a Monad instance for this type is to treat it as a free monad. In effect, this takes A a to be a little syntax with one binary operator C, and variables represented by values of type a embedded by the B constructor. That makes return the B constructor, embedding variables, and >>= the operator which performs substitution.
instance Monad A where
  return = B
  B x >>= f = f x
  C l r >>= f = C (l >>= f) (r >>= f)
-- (On GHC 7.10 and later you also need Functor and Applicative instances,
--  e.g. a derived Functor plus pure = B and (<*>) = ap.)
It's not hard to see that (>>= B) performs the identity substitution, and that composition of substitutions is associative.
Another, more "imperative" way to see this monad is that it captures the idea of computations that can flip coins (or read a bitstream or otherwise have some access to a sequence of binary choices).
data Coin = Heads | Tails
Any computation which can flip coins must either stop flipping and be a value (with B), or flip a coin and carry on (with C) in one way if the coin comes up Heads and another if Tails. The monadic operation which flips a coin and tells you what came up is
coin :: A Coin
coin = C (B Heads) (B Tails)
The >>= of A can now be seen as sequencing coin-flipping computations, allowing the choice of a subsequent computation to depend on the value delivered by an earlier computation.
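As a hedged illustration (bothFlips is my name, not from the answer): sequencing two flips grafts the second coin's tree onto each leaf of the first.
bothFlips :: A (Coin, Coin)
bothFlips = coin >>= \c1 -> coin >>= \c2 -> return (c1, c2)
-- bothFlips = C (C (B (Heads,Heads)) (B (Heads,Tails)))
--               (C (B (Tails,Heads)) (B (Tails,Tails)))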
If you have an infinite stream of coins, then (apart from your extraordinary good fortune) you're also lucky enough to be able to run any A-computation to its value, as follows
data Stream x = x :> Stream x -- actually, I mean "codata"
flipping :: Stream Coin -> A v -> v
flipping _ (B v) = v
flipping (Heads :> cs) (C h t) = flipping cs h
flipping (Tails :> cs) (C h t) = flipping cs t
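A small usage sketch (alternating and firstFlip are my names): an infinite stream of alternating coins drives a computation to its value.
alternating :: Stream Coin
alternating = Heads :> Tails :> alternating
firstFlip :: Coin
firstFlip = flipping alternating coin   -- Heads, since the stream starts with Heads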
The general pattern in this sort of monad is to have one constructor for returning a value (B here) and a bunch of others which represent the choice of possible operations and the different ways computations can continue given the result of an operation. Here C has no non-recursive parameters and two subtrees, so I could tell that there must be just one operation and that it must have just two possible outcomes, hence flipping a coin.
So, it's substitution for a syntax with variables and one binary operator, or it's a way of sequencing computations that flip coins. Which view is better? Well... they're two sides of the same coin.
A good rule of thumb for return is to make it the simplest possible thing which could work (of course, any definition that satisfies the monad laws is fine, but usually you want something with minimal structure). In this case it's as simple as return = B (now write a (>>=) to match!).
By the way, this is an example of a free monad -- in fact, it's the example given in the documentation, so I'll let the documentation speak for itself.
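For reference, a hedged sketch (names are mine) of how A lines up with Free from the free package, with a two-hole functor playing the role of C:
{-# LANGUAGE DeriveFunctor #-}
import Control.Monad.Free (Free (..))
data Pair x = Pair x x
  deriving Functor
toFree :: A a -> Free Pair a
toFree (B x)   = Pure x
toFree (C l r) = Free (Pair (toFree l) (toFree r))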

Are newtypes faster than enumerations?

According to this article,
Enumerations don't count as single-constructor types as far as GHC is concerned, so they don't benefit from unpacking when used as strict constructor fields, or strict function arguments. This is a deficiency in GHC, but it can be worked around.
And instead the use of newtypes is recommended. However, I cannot verify this with the following code:
{-# LANGUAGE MagicHash,BangPatterns #-}
{-# OPTIONS_GHC -O2 -funbox-strict-fields -rtsopts -fllvm -optlc --x86-asm-syntax=intel #-}
module Main(main,f,g)
where

import GHC.Base
import Criterion.Main

data D = A | B | C
newtype E = E Int deriving(Eq)

f :: D -> Int#
f z | z `seq` False = 3422#
f z = case z of
        A -> 1234#
        B -> 5678#
        C -> 9012#

g :: E -> Int#
g z | z `seq` False = 7432#
g z = case z of
        (E 0) -> 2345#
        (E 1) -> 6789#
        (E 2) -> 3535#

f' x = I# (f x)
g' x = I# (g x)

main :: IO ()
main = defaultMain [ bench "f" (whnf f' A)
                   , bench "g" (whnf g' (E 0))
                   ]
Looking at the assembly, the tags for each constructor of the enumeration D are actually unpacked and directly hard-coded in the instruction. Furthermore, the function f lacks error-handling code and is more than 10% faster than g. In a more realistic case I have also experienced a slowdown after converting an enumeration to a newtype. Can anyone give me some insight about this? Thanks.
It depends on the use case. For the functions you have, it's expected that the enumeration performs better. Basically, the three constructors of D become Ints resp. Int#s when the strictness analysis allows that, and GHC knows it's statically checked that the argument can only have one of the three values 0#, 1#, 2#, so it need not insert error-handling code for f. For E, the static guarantee of only one of three values being possible isn't given, so it needs to add error-handling code for g, which slows things down significantly. If you change the definition of g so that the last case becomes
E _ -> 3535#
the difference vanishes completely or almost completely (I get a 1% - 2% better benchmark for f still, but I haven't done enough testing to be sure whether that's a real difference or an artifact of benchmarking).
But this is not the use case the wiki page is talking about. What it's talking about is unpacking the constructors into other constructors when the type is a component of other data, e.g.
data FooD = FD !D !D !D
data FooE = FE !E !E !E
Then, if compiled with -funbox-strict-fields, the three Int#s can be unpacked into the constructor of FooE, so you'd basically get the equivalent of
struct FooE {
    long x, y, z;
};
while the fields of FooD have the multi-constructor type D and cannot be unpacked into the constructor FD(1), so that would basically give you
struct FooD {
    long *px, *py, *pz;
};
That can obviously have significant impact.
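The same unpacking can also be requested field by field with UNPACK pragmas instead of -funbox-strict-fields; a sketch using the E newtype from above (FooE' is my name):
data FooE' = FE' {-# UNPACK #-} !E {-# UNPACK #-} !E {-# UNPACK #-} !E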
I'm not sure about the case of single-constructor function arguments. That has obvious advantages for types with contained data, like tuples, but I don't see how that would apply to plain enumerations, where you just have a case and splitting off a worker and a wrapper makes no sense (to me).
Anyway, the worker/wrapper transformation isn't so much a single-constructor thing, constructor specialisation can give the same benefit to types with few constructors. (For how many constructors specialisations would be created depends on the value of -fspec-constr-count.)
(1) That might have changed, but I doubt it. I haven't checked it though, so it's possible the page is out of date.
I would guess that GHC has changed quite a bit since that page was last updated in 2008. Also, you're using the LLVM backend, so that's likely to have some effect on performance as well. GHC can (and will, since you've used -O2) strip any error handling code from f, because it knows statically that f is total. The same cannot be said for g. I would guess that it's the LLVM backend that then unpacks the constructor tags in f, because it can easily see that there is nothing else used by the branching condition. I'm not sure of that, though.

How can iterative deepening search be implemented efficiently in Haskell?

I have an optimization problem I want to solve. You have some kind of data-structure:
data Foo = Foo
  { fooA :: Int
  , fooB :: Int
  , fooC :: Int
  , fooD :: Int
  , fooE :: Int
  }
and a rating function:
rateFoo :: Foo -> Int
I have to optimize the result of rateFoo by changing the values in the struct. In this specific case, I decided to use iterative deepening search to solve the problem. The (infinite) search tree for the best optimization is created by another function, which simply applies all possible changes recursively to the tree:
fooTree :: Foo -> Tree
My searching function looks something like this:
optimize :: Int -> Foo -> Foo
optimize threshold foo = undefined
The question I had, before I start is this:
As the tree can be generated from the data at each point, is it possible to generate only the parts of the tree which are currently needed by the algorithm? Is it possible to have the memory freed and the tree regenerated if needed in order to save memory (a leaf at level n can be generated in O(n) and n remains small, but not small enough to have the whole tree in memory over time)?
Is this something I can expect from the runtime? Can the runtime unevaluate expressions (turn an evaluated expression into an unevaluated one)? Or what is the dirty hack I have to do for this?
The runtime does not unevaluate expressions.
There's a straightforward way to get what you want however.
Consider a zipper-like structure for your tree. Each node holds a value and a thunk representing down, up, etc. When you move to the next node, you can either move normally (placing the previous node value in the corresponding slot) or forgetfully (placing an expression which evaluates to the previous node in the right slot). Then you have control over how much "history" you hang on to.
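A very rough sketch of that idea, with all names mine rather than the answer's; the point is only that the context slot can hold either the parent node we already built or an expression that rebuilds it:
data Tree a = Node a [Tree a]
-- the context is a stack of parent nodes
type Loc a = (Tree a, [Tree a])
-- normal move: remember the parent node we already have in hand
down :: Int -> Loc a -> Loc a
down k (t@(Node _ cs), ctx) = (cs !! k, t : ctx)
-- forgetful move: keep only a thunk that regenerates the parent from its
-- label (regen would be the same generator that builds the search tree),
-- so the previously built parent can be garbage-collected and redone later
downForgetful :: (a -> Tree a) -> Int -> Loc a -> Loc a
downForgetful regen k (Node x cs, ctx) = (cs !! k, regen x : ctx)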
Here's my advice:
1. Just implement your algorithm in the most straightforward way possible.
2. Profile.
3. Optimize for speed or memory use if necessary.
I very quickly learned that I'm not smart and/or experienced enough to reason about what GHC will do or how garbage collection will work. Sometimes things that I'm sure will be disastrously memory-inefficient work smoothly the first time around, and–less often–things that seem simple require lots of fussing with strictness annotations, etc.
The Real World Haskell chapter on profiling and optimization is incredibly helpful once you get to steps 2 and 3.
For example, here's a very simple implementation of IDDFS, where f expands children, p is the search predicate, and x is the starting point.
search :: (a -> [a]) -> (a -> Bool) -> a -> Bool
search f p x = any (\d -> searchTo f p d x) [1..]
  where
    searchTo f p d x
      | d == 0    = False
      | p x       = True
      | otherwise = any (searchTo f p $ d - 1) (f x)
I tested by searching for "abbaaaaaacccaaaaabbaaccc" with children x = [x ++ "a", x ++ "bb", x ++ "ccc"] as f. It seems reasonably fast and requires very little memory (linear with the depth, I think). Why not try something like this first and then move to a more complicated data structure if it isn't good enough?
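For reference, a sketch of that test (the empty starting string and the main wrapper are my assumptions):
main :: IO ()
main = print (search children (== "abbaaaaaacccaaaaabbaaccc") "")
  where
    children x = [x ++ "a", x ++ "bb", x ++ "ccc"]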
