Chaining Haskell Functions in data types

Let's say I have the following:
data FuncAndValue v res = FuncAndValue (v -> res) v
chain :: (res -> new_res) -> FuncAndValue v res -> FuncAndValue v new_res
chain new_f (FuncAndValue old_f v) = FuncAndValue (new_f . old_f) v
Is GHC likely to be able to combine functions new_f and old_f into a single function through inlining?
Basically, does storing functions in data types in any way inhibit optimizations?
I'd like GHC to be able to compose chains of functions into one easily (i.e. so that a "sum" over my structure doesn't involve repeated calls through a thunk representing (+), but instead inlines the (+) so it runs like a for loop). I'm hoping that storing functions in data types and then accessing them later doesn't inhibit this.

Is GHC likely to be able to combine functions new_f and old_f into a single function through inlining?
Yes, if it could do the same without the intervening FuncAndValue. Of course the unfoldings of the functions need to be available, or there wouldn't be any chance of inlining anyway. But if there is a chance, wrapping the function(s) in a FuncAndValue makes little difference if any.
But let's ask GHC itself. First the type and a very simple chaining:
module FuncAndValue where
data FuncAndValue v res = FuncAndValue (v -> res) v
infixr 7 `chain`
chain :: (res -> new_res) -> FuncAndValue v res -> FuncAndValue v new_res
chain new_f (FuncAndValue old_f v) = FuncAndValue (new_f . old_f) v
apply :: FuncAndValue v res -> res
apply (FuncAndValue f x) = f x
trivia :: FuncAndValue Int (Int,Int)
trivia = FuncAndValue (\x -> (2*x - 1, 3*x + 2)) 1
composed :: FuncAndValue Int Int
composed = chain (uncurry (+)) trivia
and (the interesting part of) the core we get for trivia and composed:
FuncAndValue.trivia1 =
  \ (x_af2 :: GHC.Types.Int) ->
    (case x_af2 of _ { GHC.Types.I# y_agp ->
       GHC.Types.I# (GHC.Prim.-# (GHC.Prim.*# 2 y_agp) 1)
     },
     case x_af2 of _ { GHC.Types.I# y_agp ->
       GHC.Types.I# (GHC.Prim.+# (GHC.Prim.*# 3 y_agp) 2)
     })
FuncAndValue.composed2 =
  \ (x_agg :: GHC.Types.Int) ->
    case x_agg of _ { GHC.Types.I# y_agp ->
      GHC.Types.I#
        (GHC.Prim.+#
           (GHC.Prim.-# (GHC.Prim.*# 2 y_agp) 1)
           (GHC.Prim.+# (GHC.Prim.*# 3 y_agp) 2))
    }
Inlined, fair enough: no (.) to be seen. The two cases from trivia have been joined, so we have only one in composed. Unless somebody teaches GHC enough algebra to simplify \x -> (2*x-1) + (3*x+2) to \x -> 5*x + 1, that's as good as you can hope for. apply composed is reduced to 6 at compile time, even from a separate module.
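Core output like the above is typically obtained by compiling with optimisations and a dump flag (something along the lines of ghc -O2 -ddump-simpl; the exact flags are my assumption about how this dump was produced). A minimal separate module to confirm the constant-folding claim, reusing the definitions above (the Main wrapper is mine):
module Main where
import FuncAndValue (apply, composed)
-- With -O2 the call below is folded at compile time, so this just prints 6.
main :: IO ()
main = print (apply composed)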
But that was very simple, let's give it a somewhat harder nut to crack.
An inlinable version of until (the current definition of until is recursive, so GHC doesn't inline it),
module WWUntil where
wwUntil :: (a -> Bool) -> (a -> a) -> a -> a
wwUntil p f = recur
  where
    recur x
      | p x = x
      | otherwise = recur (f x)
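For reference, the directly recursive shape that parenthetical refers to looks roughly like the sketch below (my paraphrase, not necessarily the actual library source); a top-level binding that calls itself is exactly what GHC declines to inline:
untilRec :: (a -> Bool) -> (a -> a) -> a -> a
untilRec p f x
  | p x = x
  | otherwise = untilRec p f (f x)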
Another simple function in its own module,
module CollatzStep where
import Data.Bits ((.&.), unsafeShiftR)
collatzStep :: Int -> Int
collatzStep n
  | n .&. 1 == 0 = n `unsafeShiftR` 1
  | otherwise = 3*n + 1
and finally, the nut
module Hailstone (collatzLength, hailstone) where
import FuncAndValue
import CollatzStep
import WWUntil
data P = P {-# UNPACK #-} !Int {-# UNPACK #-} !Int
fstP :: P -> Int
fstP (P x _) = x
sndP :: P -> Int
sndP (P _ y) = y
hailstone :: Int -> FuncAndValue Int Int
hailstone n = sndP `chain` wwUntil ((== 1) . fstP) (\(P n k) -> P (collatzStep n) (k+1))
              `chain` FuncAndValue (\x -> P x 0) n
collatzLength :: Int -> Int
collatzLength = apply . hailstone
I have helped the strictness analyser a bit by using a strict pair. With the vanilla (,) the second component would be unboxed and reboxed after adding 1 in each step, and I just can't bear such waste ;) But otherwise there's no relevant difference.
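For comparison, here is a sketch (mine; the name hailstoneLazy is made up) of the same pipeline with the vanilla pair mentioned above, i.e. the version in which the counter would be unboxed and reboxed on every step:
hailstoneLazy :: Int -> FuncAndValue Int Int
hailstoneLazy n = snd `chain` wwUntil ((== 1) . fst) (\(m, k) -> (collatzStep m, k + 1))
                      `chain` FuncAndValue (\x -> (x, 0)) n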
And (the interesting part of) the core GHC generates:
Rec {
Hailstone.$wrecur [Occ=LoopBreaker]
  :: GHC.Prim.Int#
     -> GHC.Prim.Int# -> (# GHC.Prim.Int#, GHC.Prim.Int# #)
[GblId, Arity=2, Caf=NoCafRefs, Str=DmdType LL]
Hailstone.$wrecur =
  \ (ww_sqq :: GHC.Prim.Int#) (ww1_sqr :: GHC.Prim.Int#) ->
    case ww_sqq of wild_Xm {
      __DEFAULT ->
        case GHC.Prim.word2Int#
               (GHC.Prim.and# (GHC.Prim.int2Word# wild_Xm) (__word 1))
        of _ {
          __DEFAULT ->
            Hailstone.$wrecur
              (GHC.Prim.+# (GHC.Prim.*# 3 wild_Xm) 1) (GHC.Prim.+# ww1_sqr 1);
          0 ->
            Hailstone.$wrecur
              (GHC.Prim.uncheckedIShiftRA# wild_Xm 1) (GHC.Prim.+# ww1_sqr 1)
        };
      1 -> (# 1, ww1_sqr #)
    }
end Rec }
lvl_rsz :: GHC.Types.Int -> GHC.Types.Int
[GblId, Arity=1, Caf=NoCafRefs]
lvl_rsz =
  \ (x_iog :: GHC.Types.Int) ->
    case x_iog of _ { GHC.Types.I# tpl1_B4 ->
      case Hailstone.$wrecur tpl1_B4 0 of _ { (# _, ww2_sqH #) ->
        GHC.Types.I# ww2_sqH
      }
    }
and that's exactly what you get without FuncAndValue. Everything inlined nicely, a beautiful tight loop.
Basically, does storing functions in data types in any way inhibit optimizations?
If you wrap the function under enough layers, yes. But it's the same with other values.
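A sketch of what "enough layers" can mean in practice (the example and names are mine): once the function to call is selected by data GHC cannot see through, such as a run-time index into a list of functions, there is no single known function left to inline at the call site.
steps :: [Int -> Int]
steps = [(+ 1), (* 2), subtract 3]
applyNth :: Int -> Int -> Int
applyNth i x = (steps !! i) x -- which function runs is only known at run time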

Pattern matching in `Alternative`

I have a function that pattern matches on its arguments to produce a computation in StateT () Maybe (). This computation can fail when run, in which case I want the current pattern match branch to fail, so to speak.
I highly doubt it's possible to have something like
compute :: Int -> StateT () Maybe Int
compute = return
f :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f (Just n1) (Just n2) = do
  m <- compute (n1 + n2)
  guard (m == 42)
f (Just n) _ = do
  m <- compute n
  guard (m == 42)
f _ (Just n) = do
  m <- compute n
  guard (m == 42)
behave in the way I want it to: When the first computation fails due to the guard or somewhere in compute, I want f to try the next pattern.
Obviously the above can't work, because StateT (as any other monad might) involves an additional parameter when expanded, so I probably can't formulate this as simple pattern guards.
The following does what I want, but it's ugly:
f' :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f' a b = asum (map (\f -> f a b) [f1, f2, f3])
  where
    f1 a b = do
      Just n1 <- pure a
      Just n2 <- pure b
      m <- compute (n1 + n2)
      guard (m == 42)
    f2 a _ = do
      Just n <- pure a
      m <- compute n
      guard (m == 42)
    f3 _ b = do
      Just n <- pure b
      m <- compute n
      guard (m == 42)
A call like execStateT (f (Just 42) (Just 1)) () would fail for f but return Just () for f', because it matches f2.
How do I get the behavior of f' while having elegant pattern matching with as few auxiliary definitions as possible, as in f? Are there other, more elegant ways to formulate this?
Complete runnable example:
#! /usr/bin/env stack
-- stack --resolver=lts-11.1 script
import Control.Monad.Trans.State
import Control.Applicative
import Control.Monad
import Data.Foldable
compute :: Int -> StateT () Maybe Int
compute = return
f :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f (Just n1) (Just n2) = do
  m <- compute (n1 + n2)
  guard (m == 42)
f (Just n) _ = do
  m <- compute n
  guard (m == 42)
f _ (Just n) = do
  m <- compute n
  guard (m == 42)
f' :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f' a b = asum (map (\f -> f a b) [f1, f2, f3])
  where
    f1 a b = do
      Just n1 <- pure a
      Just n2 <- pure b
      m <- compute (n1 + n2)
      guard (m == 42)
    f2 a _ = do
      Just n <- pure a
      m <- compute n
      guard (m == 42)
    f3 _ b = do
      Just n <- pure b
      m <- compute n
      guard (m == 42)
main = do
  print $ execStateT (f (Just 42) (Just 1)) () -- Nothing
  print $ execStateT (f' (Just 42) (Just 1)) () -- Just (), because `f2` succeeded
Edit: I've elicited quite a few clever answers with this question so far, thanks! Unfortunately, they mostly suffer from overfitting to the particular code example I've given. In reality, I need something like this for unifying two expressions (let-bindings, to be precise), where I want to try unifying the RHSs of two simultaneous lets if possible, and fall through to the cases where I handle let-bindings one side at a time by floating them. So there is actually no clever structure on the Maybe arguments to exploit, and I'm not actually computing on Ints.
The answers so far might benefit others beyond the enlightenment they brought me though, so thanks!
Edit 2: Here's some compiling example code with probably bogus semantics:
module Unify (unify) where
import Control.Applicative
import Control.Monad.Trans.State.Strict
data Expr
  = Var String -- meta, free and bound vars
  | Let String Expr Expr
  -- ... more cases
  -- no Eq instance, fwiw
-- | If the two terms unify, return the most general unifier, e.g.
-- a substitution (`Map`) of meta variables for terms as association
-- list.
unify :: [String] -> Expr -> Expr -> Maybe [(String, Expr)]
unify metaVars l r = execStateT (go [] [] l r) [] -- threads the current substitution as state
  where
    go locals floats (Var x) (Var y)
      | x == y = return ()
    go locals floats (Var x) (Var y)
      | lookup x locals == Just y = return ()
    go locals floats (Var x) e
      | x `elem` metaVars = tryAddSubstitution locals floats x e
    go locals floats e (Var y)
      | y `elem` metaVars = tryAddSubstitution locals floats y e
    -- case in point:
    go locals floats (Let x lrhs lbody) (Let y rrhs rbody) = do
      go locals floats lrhs rrhs -- try this one, fail current pattern branch if rhss don't unify
      -- if we get past the last statement, commit to this branch, no matter
      -- whether the next statement fails or not
      go ((x,y):locals) floats lbody rbody
    -- try to float the let binding. terms mentioning a floated var might still
    -- unify with a meta var
    go locals floats (Let x rhs body) e = do
      go locals (Left (x,rhs):floats) body e
    go locals floats e (Let y rhs body) = do
      go locals (Right (y,rhs):floats) body e
    go _ _ _ _ = empty
    tryAddSubstitution = undefined -- magic
When I need something like this, I just use asum with the blocks inlined. Here I also condensed the multiple patterns Just n1 <- pure a; Just n2 <- pure b into one, (Just n1, Just n2) <- pure (a, b).
f :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f a b = asum
  [ do
      (Just n1, Just n2) <- pure (a, b)
      m <- compute (n1 + n2)
      guard (m == 42)
  , do
      Just n <- pure a
      m <- compute n
      guard (m == 42)
  , do
      Just n <- pure b
      m <- compute n
      guard (m == 42)
  ]
You can also use chains of <|>, if you prefer:
f :: Maybe Int -> Maybe Int -> StateT () Maybe ()
f a b
  = do
      (Just n1, Just n2) <- pure (a, b)
      m <- compute (n1 + n2)
      guard (m == 42)
  <|> do
      Just n <- pure a
      m <- compute n
      guard (m == 42)
  <|> do
      Just n <- pure b
      m <- compute n
      guard (m == 42)
This is about as minimal as you can get for this kind of “fallthrough”.
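As a quick check (mine), reusing compute and the imports from the question's runnable example:
main :: IO ()
main = do
  print $ execStateT (f (Just 42) (Just 1)) () -- Just (): the guard fails in the first block, the second matches
  print $ execStateT (f Nothing (Just 42)) ()  -- Just (): falls through to the third block
  print $ execStateT (f Nothing (Just 1)) ()   -- Nothing: every block fails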
If you were using Maybe alone, you would be able to do this with pattern guards:
import Control.Monad
import Control.Applicative
ensure :: Alternative f => (a -> Bool) -> a -> f a
ensure p a = a <$ guard (p a)
compute :: Int -> Maybe Int
compute = return
f :: Maybe Int -> Maybe Int -> Maybe Int
f (Just m) (Just n)
  | Just x <- ensure (== 42) =<< compute (m + n)
  = return x
f (Just m) _
  | Just x <- ensure (== 42) =<< compute m
  = return x
f _ (Just n)
  | Just x <- ensure (== 42) =<< compute n
  = return x
f _ _ = empty
(ensure is a general purpose combinator. Cf. Lift to Maybe using a predicate)
As you have StateT on the top, though, you would have to supply a state in order to pattern match on Maybe, which would foul up everything. That being so, you are probably better off with something in the vein of your "ugly" solution. Here is a whimsical attempt at improving its looks:
import Control.Monad
import Control.Applicative
import Control.Monad.State
import Control.Monad.Trans
import Data.Foldable
ensure :: Alternative f => (a -> Bool) -> a -> f a
ensure p a = a <$ guard (p a)
compute :: Int -> StateT () Maybe Int
compute = return
f :: Maybe Int -> Maybe Int -> StateT () Maybe Int
f a b = asum (map (\c -> f' (c a b)) [liftA2 (+), const, flip const])
  where
    f' = ensure (== 42) <=< compute <=< lift
While this is an answer specific to the snippet I've given, the refactorings only apply in a limited way to the code I was actually facing.
Perhaps it's not that far-fetched of an idea to extract the skeleton of the asum expression above to a more general combinator:
-- A better name would be welcome.
selector :: Alternative f => (a -> a -> a) -> (a -> f b) -> a -> a -> f b
selector g k x y = asum (fmap (\sel -> k (sel x y)) [g, const, flip const])
f :: Maybe Int -> Maybe Int -> StateT () Maybe Int
f = selector (liftA2 (+)) (ensure (== 42) <=< compute <=< lift)
Though it is perhaps a bit awkward of a combinator, selector does show the approach is more general than it might appear at first: the only significant restriction is that k has to produce results in some Alternative context.
P.S.: While writing selector with (<|>) instead of asum is arguably more tasteful...
selector g k x y = k (g x y) <|> k x <|> k y
... the asum version straightforwardly generalises to an arbitrary number of pseudo-patterns:
selector :: Alternative f => [a -> a -> a] -> (a -> f b) -> a -> a -> f b
selector gs k x y = asum (fmap (\g -> k (g x y)) gs)
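For instance, the earlier definition of f can be recovered from this list-based version (the usage snippet is mine; ensure, compute and lift are as in the snippets above):
f :: Maybe Int -> Maybe Int -> StateT () Maybe Int
f = selector [liftA2 (+), const, flip const] (ensure (== 42) <=< compute <=< lift)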
It looks like you could get rid of the whole pattern match by relying on the fact that Int forms a Monoid under addition with 0 as the identity element (via the Sum newtype from Data.Monoid), and that Maybe a forms a Monoid if a does (Nothing is the identity). Then your function becomes:
f :: Maybe Int -> Maybe Int -> StateT () Maybe Int
f a b = mfilter (== 42) . compute =<< lift (getSum <$> (Sum <$> a) <> (Sum <$> b))
You could generalise by passing the predicate as an argument:
f :: (Int -> Bool) -> Maybe Int -> Maybe Int -> StateT () Maybe Int
f p a b = mfilter p . compute =<< lift (getSum <$> (Sum <$> a) <> (Sum <$> b))
The only remaining wrinkle is that the combined value is a Maybe Int, so it has to be lifted into StateT before being handed to compute: if both arguments are Nothing the whole computation fails, and a single missing argument is simply treated as 0.
Edit: Taking into account your last edit, I find that if you spread your pattern matches into separate computations that may fail, then you can just write
f a b = f1 a b <|> f2 a b <|> f3 a b
  where f1 (Just a) (Just b) = compute (a + b) >>= check
        f1 _ _ = empty
        f2 (Just a) _ = compute a >>= check
        f2 _ _ = empty
        f3 _ (Just b) = compute b >>= check
        f3 _ _ = empty
        check x = guard (x == 42)

How to know in Haskell in what row and column of a table ([[a]]) you are

I want to make a sudoku solver in Haskell (as an exercise). My idea is:
I have t :: [[Int]] representing a 9x9 grid so that it contains 0 in an empty field and 1-9 in a solved field.
A function solve :: [[Int]] -> [[Int]] returns the solved sudoku.
Here is a rough sketch of it (I'd like to point out I'm a beginner; I know it is not the most optimal code):
solve :: [[Int]] -> [[Int]]
solve t
  | null (filter (elem 0) t) = t
  | t /= beSmart t = solve (beSmart t)
  | otherwise = guess t
The function beSmart :: [[Int]] -> [[Int]] tries to solve it by applying some solving algorithms, but if the methodical approach fails (beSmart returns the unchanged sudoku table in that case) it should try to guess some numbers (and I'll think about that function later). In order to fill in an empty field, I have to find it first. And here's the problem:
beSmart :: [[Int]] -> [[Int]]
beSmart t = map f t
  where f row
          | elem 0 row = map unsolvedRow row
          | otherwise = row
          where unsolvedRow a
                  | a == 0 = tryToDo t r c --?!?!?!?! skip
                  | otherwise = a
The function tryToDo :: [[Int]] -> Int -> Int -> Int needs the row and column of the field I'm trying to change, but I have no idea how to get that information. How do I find out from map which element of the list I am at at the moment? Or is there a better way to move around in the table? I come from iterative and procedural programming, and I understand that perhaps my approach to the problem is wrong when it comes to functional programming.
I know this is not really an answer to your question, but I would argue that you usually want a different representation, one that keeps a more detailed view of what you know about the puzzle (in your attempted solution you can only distinguish a solved cell from a cell that is free to assume any value). Sudoku is a classical instance of a CSP, where modern approaches offer fairly general propagation rules, such as unit propagation (blocking a digit in neighboring cells once it is used somewhere); see AC-3 for further details. Other related topics include SAT/SMT, and you might also find the DPLL algorithm interesting. At the heart of most solvers there is usually some kind of search engine to deal with non-determinism (not every instance has a single solution that is directly derivable from the initial configuration by application of inference rules). There are also techniques such as CDCL to direct the search.
To address the question in the title: to know where you are, it's probably best to abstract the traversal of your table so that each step has access to the coordinates. You can, for example, zip the list of rows with [0..] (zip [0..] rows) to number the rows; when you then map a function over the zipped list, you have access to (index, row) pairs, and the same applies to columns. Just a sketch of the idea:
mapTable :: (Int -> Int -> a -> b) -> [[a]] -> [[b]]
mapTable f rows = map (\(r, rs) -> mapRow (f r) rs) $ zip [0..] rows
mapRow :: (Int -> a -> b) -> [a] -> [b]
mapRow f cols = map (uncurry f) $ zip [0..] cols
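Hypothetical usage (mine): tag every cell of a small table with its coordinates.
example :: [[(Int, Int, Char)]]
example = mapTable (\r c x -> (r, c, x)) ["ab", "cd"]
-- [[(0,0,'a'),(0,1,'b')],[(1,0,'c'),(1,1,'d')]]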
or use fold to turn your table into something else (for example to search for a unit cell):
foldrTable :: (Int -> Int -> a -> b -> b) -> b -> [[a]] -> b
foldrTable f z rows = foldr (\(r, rs) b -> foldrRow (f r) b rs) z $ zip [0..] rows
foldrRow :: (Int -> a -> b -> b) -> b -> [a] -> b
foldrRow f z cols = foldr (uncurry f) z $ zip [0..] cols
to find which cell is unital:
foldrTable
  (\x y v acc -> if length v == 1 then Just (x, y) else acc)
  Nothing
  [[[1..9],[1..9],[1..9]],[[1..9],[1..9],[1..9]],[[1..9],[1],[1..9]]]
by using Monoid you can refactor it:
import Data.Monoid
foldrTable' :: Monoid b => (Int -> Int -> a -> b) -> [[a]] -> b
foldrTable' f rows = foldrTable (\r c a b -> f r c a <> b) mempty rows
unit :: Int -> Int -> [a] -> Maybe (Int, Int)
unit x y c | length c == 1 = Just (x, y)
           | otherwise = Nothing
firstUnit :: [[[a]]] -> Maybe (Int, Int)
firstUnit = getFirst . foldrTable' (\r c v -> First $ unit r c v)
so now you would do
firstUnit [[[1..9],[1..9],[1..9]],[[1,2],[3,4],[5]]]
to obtain
Just (1, 2)
correctly determining that the first unit cell is at position 1,2 in the table.
[[Int]] is a good type for a sudoku. But map does not give any info regarding the place you are in; that is one of the ideas behind map.
You could zip together the index with the value. But a better idea would be to pass the whole [[Int]] and the indexes to the function. So its type would become:
f :: [[Int]] -> Int -> Int -> [[Int]]
inside the function you can now access the current element by
t !! x !! y
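A minimal sketch (the names are mine, not the answerer's) of driving such a function over the whole grid, looking up the current cell with t !! x !! y as suggested:
fillAll :: ([[Int]] -> Int -> Int -> [[Int]]) -> [[Int]] -> [[Int]]
fillAll f t0 = foldl step t0 coords
  where
    coords = [ (x, y) | x <- [0 .. length t0 - 1]
                      , y <- [0 .. length (t0 !! x) - 1] ]
    step t (x, y)
      | t !! x !! y == 0 = f t x y   -- only visit cells that are still empty
      | otherwise = t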
Already did this a while ago as a learning example. It is definitely not the nicest solution, but it worked for me.
import Data.List
import Data.Maybe
import Data.Char
sodoku="\
\-9-----1-\
\8-4-2-3-7\
\-6-9-7-2-\
\--5-3-1--\
\-7-5-1-3-\
\--3-9-8--\
\-2-8-5-6-\
\1-7-6-4-9\
\-3-----8-"
sodoku2="\
\----13---\
\7-5------\
\1----547-\
\--418----\
\951-67843\
\-2---4--1\
\-6235-9-7\
\--7-98--4\
\89----1-5"
data Position = Position (Int, Int) deriving (Show)
data Sodoku = Sodoku [Int]
insertAtN :: Int -> a -> [a] -> [a]
insertAtN n y xs = intercalate [y] . groups n $ xs
  where
    groups n xs = takeWhile (not.null) . unfoldr (Just . splitAt n) $ xs
instance Show Sodoku where
  show (Sodoku s) = (insertAtN 9 '\n' $ map intToDigit s) ++ "\n"
convertDigit :: Char -> Int
convertDigit x = case x of
    '-' -> 0
    x -> if digit>=1 && digit<=9 then
           digit
         else
           0
  where digit=digitToInt x
convertSodoku :: String -> Sodoku
convertSodoku x = Sodoku $ map convertDigit x
adjacentFields :: Position -> [Position]
adjacentFields (Position (x,y)) =
    [Position (i,y) | i<-[0..8]] ++
    [Position (x,j) | j<-[0..8]] ++
    [Position (u+i,v+j) | i<-[0..2], j<-[0..2]]
  where
    u=3*(x `div` 3)
    v=3*(y `div` 3)
positionToField :: Position -> Int
positionToField (Position (x,y)) = x+y*9
fieldToPosition :: Int -> Position
fieldToPosition x = Position (x `mod` 9, x `div` 9)
getDigit :: Sodoku -> Position -> Int
getDigit (Sodoku x) pos = x !! (positionToField pos)
getAdjacentDigits :: Sodoku -> Position -> [Int]
getAdjacentDigits s p = nub digitList
  where
    digitList=filter (\x->x/=0) $ map (getDigit s) (adjacentFields p)
getFreePositions :: Sodoku -> [Position]
getFreePositions (Sodoku x) = map fieldToPosition $ elemIndices 0 x
isSolved :: Sodoku -> Bool
isSolved s = (length $ getFreePositions s)==0
isDeadEnd :: Sodoku -> Bool
isDeadEnd s = any (\x->x==0) $ map length $ map (getValidDigits s) $ getFreePositions s
setDigit :: Sodoku -> Position -> Int -> Sodoku
setDigit (Sodoku x) pos digit = Sodoku $ h ++ [digit] ++ t
  where
    field=positionToField pos
    h=fst $ splitAt field x
    t=tail $ snd $ splitAt field x
getValidDigits :: Sodoku -> Position -> [Int]
getValidDigits s p = [1..9] \\ (getAdjacentDigits s p)
-- Select positions with few possible choices first to reduce execution time
sortImpl :: (Position, [Int]) -> (Position, [Int]) -> Ordering
sortImpl (_, i1) (_, i2)
  | length(i1)<length(i2) = LT
  | length(i1)>length(i2) = GT
  | length(i1)==length(i2) = EQ
selectMoves :: Sodoku -> Maybe (Position, [Int])
selectMoves s
  | length(posDigitList)>0 = Just (head posDigitList)
  | otherwise = Nothing
  where
    posDigitList=sortBy sortImpl $ zip freePos validDigits
    validDigits=map (getValidDigits s) freePos
    freePos=getFreePositions s
createMoves :: Sodoku -> [Sodoku]
createMoves s =
  case selectMoves s of
    Nothing -> []
    (Just (pos, digits)) -> [setDigit s pos d | d<-digits]
solveStep :: Sodoku -> [Sodoku]
solveStep s
  | (isSolved s) = [s]
  | (isDeadEnd s)==True = []
  | otherwise = createMoves s
solve :: Sodoku -> [Sodoku]
solve s
  | (isSolved s) = [s]
  | (isDeadEnd s)==True = []
  | otherwise = concat $ map solve (solveStep s)
s = convertSodoku sodoku2
readSodoku :: String -> Sodoku
readSodoku x = Sodoku []

Memoisation with auxiliary parameter in Haskell

I have a recursive function f that takes two parameters x and y. The function is uniquely determined by the first parameter; the second one merely makes things easier.
I now want to memoise that function w.r.t. its first parameter while ignoring the second one. (I.e. f is evaluated at most once for every value of x.)
What is the easiest way to do that? At the moment, I simply define an array containing all values recursively, but that is a somewhat ad-hoc solution. I would prefer some kind of memoisation combinator that I can just throw at my function.
EDIT: to clarify, the function f takes a pair of integers and a list. The first integer is some parameter value, the second one denotes the index of an element in some global list xs to consume.
To avoid indexing the list, I pass the partially consumed list to f as well, but obviously, the invariant is that if the first parameter is (m, n), the second one will always be drop n xs, so the result is uniquely determined by the first parameter.
Just using a memoisation combinator on the partially applied function will not work, since that will leave an unevaluated thunk \xs -> … lying around. I could probably wrap the two parameters in a datatype whose Eq instance ignores the second value (and similarly for other instances), but that seems like a very ad-hoc solution. Is there not an easier way?
EDIT2: The concrete function I want to memoise:
g :: [(Int, Int)] -> Int -> Int
g xs n = f 0 n
  where f :: Int -> Int -> Int
        f _ 0 = 0
        f m n
          | m == length xs = 0
          | w > n = f (m + 1) n
          | otherwise = maximum [f (m + 1) n, v + f (m + 1) (n - w)]
          where (w, v) = xs !! m
To avoid the expensive indexing operation, I instead pass the partially-consumed list to f as well:
g' :: [(Int, Int)] -> Int -> Int
g' xs n = f xs 0 n
  where f :: [(Int, Int)] -> Int -> Int -> Int
        f [] _ _ = 0
        f _ _ 0 = 0
        f ((w,v) : xs) m n
          | w > n = f xs (m + 1) n
          | otherwise = maximum [f xs (m + 1) n, v + f xs (m + 1) (n - w)]
Memoisation of f w.r.t. the list parameter is, of course, unnecessary, since the list does not (morally) influence the result. I would therefore like the memoisation to simply ignore the list parameter.
Your function is unnecessarily complicated. You don't need the index m at all:
foo :: [(Int, Int)] -> Int -> Int
foo [] _ = 0
foo _ 0 = 0
foo ((w,v):xs) n
  | w > n = foo xs n
  | otherwise = foo xs n `max` (foo xs (n - w) + v)
Now if you want to memoize foo then both the arguments must be considered (as it should be).
We'll use the monadic memoization mixin method to memoize foo:
First, we create an uncurried version of foo (because we want to memoize both arguments):
foo' :: ([(Int, Int)], Int) -> Int
foo' ([], _) = 0
foo' (_, 0) = 0
foo' ((w,v):xs, n)
  | w > n = foo' (xs, n)
  | otherwise = foo' (xs, n) `max` (foo' (xs, n - w) + v)
Next, we monadify the function foo' (because we want to thread a memo table in the function):
foo' :: Monad m => ([(Int, Int)], Int) -> m Int
foo' ([], _) = return 0
foo' (_, 0) = return 0
foo' ((w,v):xs, n)
  | w > n = foo' (xs, n)
  | otherwise = do
      a <- foo' (xs, n)
      b <- foo' (xs, n - w)
      return (a `max` (b + v))
Then, we open the self-reference in foo' (because we want to call the memoized function):
type Endo a = a -> a
foo' :: Monad m => Endo (([(Int, Int)], Int) -> m Int)
foo' _ ([], _) = return 0
foo' _ (_, 0) = return 0
foo' self ((w,v):xs, n)
  | w > n = self (xs, n)
  | otherwise = do
      a <- self (xs, n)
      b <- self (xs, n - w)
      return (a `max` (b + v))
We'll use the following memoization mixin to memoize our function foo':
type Dict a b m = (a -> m (Maybe b), a -> b -> m ())
memo :: Monad m => Dict a b m -> Endo (a -> m b)
memo (check, store) super a = do
  b <- check a
  case b of
    Just b -> return b
    Nothing -> do
      b <- super a
      store a b
      return b
Our dictionary (memo table) will use the State monad and a Map data structure:
import Prelude hiding (lookup)
import Control.Monad.State
import Data.Map.Strict
mapDict :: Ord a => Dict a b (State (Map a b))
mapDict = (check, store) where
  check a = gets (lookup a)
  store a b = modify (insert a b)
Finally, we combine everything to create a memoized function memoFoo:
import Data.Function (fix)
type MapMemoized a b = a -> State (Map a b) b
memoFoo :: MapMemoized ([(Int, Int)], Int) Int
memoFoo = fix (memo mapDict . foo')
We can recover the original function foo as follows:
foo :: [(Int, Int)] -> Int -> Int
foo xs n = evalState (memoFoo (xs, n)) empty
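A quick usage sketch (mine): a tiny knapsack-style instance, with the items given as (weight, value) pairs and 5 as the capacity.
main :: IO ()
main = print (foo [(2, 3), (3, 4)] 5) -- 7: both items fit within the capacity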
Hope that helps.

What is the type of the variable in do-notation here in Haskell?

The code below looks quite clear:
do
  x <- Just 3
  y <- Just "!"
  Just (show x ++ y)
Here the type of x is a Num type and y is String. (<- here is used to take the actual value out of the Monad.)
However, this snippet looks not so clear to me:
import Control.Monad.Instances
addStuff :: Int -> Int
addStuff = do
  a <- (* 2)
  b <- (+ 10)
  return (a + b)
What are the types of a and b here? It seems they act like a Num, but a <- (* 2) and b <- (+ 10) look cryptic here...
Does anyone have ideas about this?
Well, you've stumbled upon a kind of weird monad.
The monad in question is the Monad ((->) r). Now, what does that mean? Well, it's the monad of functions of the form r -> *. I.e., of functions that take the same type of input.
You asked what the type of a and b are in this instance. Well, they are both Num a => a, but that doesn't really explain much.
Intuitively, we can understand the monad like this: A monadic value is a function that takes a value of type r as input. Whenever we bind in the monad, we take that value and pass it to the bound function.
I.e., in our addStuff example, if we call addStuff 5, then a is bound to (*2) 5 (which is 10), and b is bound to (+10) 5 (which is 15).
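A quick check of those numbers (the main wrapper is mine; addStuff is the question's definition):
main :: IO ()
main = print (addStuff 5) -- 25: a is bound to 10, b to 15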
Let's see a simpler example from this monad to try to understand how it works precisely:
mutate = do a <- (*2)
            return (a + 5)
If we desugar this to a bind, we get:
mutate = (*2) >>= (\a -> return (a + 5))
Now, this doesn't help much, so let's use the definition of bind for this monad:
mutate = \ r -> (\a -> return (a + 5)) ((*2) r) r
This reduces to
mutate = \ r -> return ((r*2) + 5) r
Which, using the definition that return is const, we can reduce to
mutate = \ r -> (r*2) + 5
Which is a function that multiplies a number by 2 and then adds 5.
That's one weird monad.
Given addStuff
addStuff :: Int -> Int
addStuff = do
  a<-(*2)
  b<-(+10)
  return (a+b)
the definition desugars into
addStuff =
  (* 2) >>= \a ->
  (+ 10) >>= \b ->
  return (a + b)
Hovering over the >>= in the FP Complete online editor shows
:: Monad m => forall a b.
   (m a)         -> (a -> m b)          -> (m b)
:: (Int -> a)    -> (a -> Int -> b)     -> (Int -> b)
:: (Int -> Int)  -> (Int -> Int -> Int) -> (Int -> Int)
That leads us to believe we use a Monad instance for functions. Indeed if we look at the source code, we see
instance Monad ((->) r) where
  return = const
  f >>= k = \ r -> k (f r) r
Using this newly obtained information we can evaluate the addStuff function ourselves.
Given the initial expression
(* 2) >>= ( \a -> (+10) >>= \b -> return (a + b) )
we substitute using the >>= definition, giving us (in the following, {}, [] and () just illustrate different nesting depths)
\r1 -> {\a -> (+10) >>= \b -> return (a + b)} {(* 2) r1} r1
simplify the second-to-last term inside the outermost lambda
\r1 -> {\a -> (+10) >>= \b -> return (a + b)} {r1 * 2} r1
apply {r1 * 2} to {\a -> ...}
\r1 -> {(+10) >>= \b -> return ((r1 * 2) + b)} r1
substitute remaining >>= with its definition again
\r1 -> {\r2 -> [\b -> return (r1 * 2 + b)] [(+10) r2] r2} r1
simplify second-to-last term inside inner lambda
\r1 -> {\r2 -> [\b -> return (r1 * 2 + b)] [r2 + 10] r2} r1
apply [r2 + 10] to {\b -> ...}
\r1 -> {\r2 -> [return (r1 * 2 + (r2 + 10))] r2} r1
apply r1 to {\r2 -> ...}
\r1 -> {return (r1 * 2 + r1 + 10) r1}
substitute return with its definition
\r1 -> {const (r1 * 2 + r1 + 10) r1}
evaluate const x _ = x
\r1 -> {r1 * 2 + r1 + 10}
prettify
\x -> 3 * x + 10
finally we get
addStuff :: Int -> Int
addStuff = (+ 10) . (* 3)
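As a small sanity check (this test is mine), the derived form agrees with the original do-notation definition:
addStuffDerived :: Int -> Int
addStuffDerived = (+ 10) . (* 3)
main :: IO ()
main = print [ (addStuff x, addStuffDerived x) | x <- [0, 1, 5] ] -- [(10,10),(13,13),(25,25)]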

Converting `do` Notation `addStuff` to `>>=`

Learn You a Haskell presents the addStuff function:
import Control.Monad.Instances
addStuff :: Int -> Int
addStuff = do
  a <- (*2) -- binds (*2) to a
  b <- (+10) -- binds (+10) to b
  return (a+b) -- return has type sig: 'Monad m => a -> m a'
Are the types of a, b, and return (a+b) all Int -> Int? I think so, but I'm not sure how bind-ing plays a role.
I tried to implement it using >>=, but I'm not sure how to complete it (hence ...).
addStuff' :: Int -> Int
addStuff' = (*2) >>= (+10) >>= ...
Please give me a hint to complete it, as well as edit my understanding of the do notation version.
As I understand, the ... needs to include a type of Int -> Int. In the do version, I could use a and b, but I'm not sure how to add them with the >>= version.
When working with the reader monad (a.k.a. the function monad), you have the type a -> b, which can be rewritten as (->) a b. The actual monad instance here is
instance Monad ((->) r) where
  return x = const x
  f >>= g = \r -> g (f r) r
Notice that here the type of >>= is
(>>=) :: ((->) r a) -> (a -> ((->) r b)) -> ((->) r b)
Which can be rewritten as
(>>=) :: (r -> a) -> (a -> (r -> b)) -> (r -> b)
Or even
(>>=) :: (r -> a) -> (a -> r -> b) -> (r -> b)
So as you can see, what >>= does here is take a single input r, apply f to it, and then apply g to that result and to r again, producing a new function r -> b. So for your example, you could use:
addStuff' :: Int -> Int
addStuff' = (*2) >>= (+)
And so addStuff' 10 == 30, since it performs the computation (10 * 2) + (10). Note how 10 is fed both to (*2) and (+), and the result of (10*2) is fed to (+) as well. It might make things a little more clear to see it as
test :: Int -> (Int, Int, Int)
test = do
  x <- (*2)
  y <- (*3)
  z <- (*5)
  return (x, y, z)
And it's result would be
> test 1
(2, 3, 5)
> test 10
(20, 30, 50)
What this essentially is doing is taking the argument to test "before" it's been applied, feeding it to each of the functions on the right-hand side of the <-s, and then combining those results in the return.
So how can you write these without do notation? You could do something like
test :: Int -> (Int, Int, Int)
test =
  (\r -> r * 2) >>= (\x ->
  (\r -> r * 3) >>= (\y ->
  (\r -> r * 5) >>= (\z ->
  return (x, y, z))))
Which, admittedly, is not very readable, even with formatting, but the gist is basically that r gets fed to each intermediate function, which produces a result, and a couple nested lambda expressions later you return all three of those results in a tuple.
With a bit of simplification, you can also make the innermost of those nested lambdas a two-argument lambda (the outer ones have to stay nested so that x and y remain in scope for the final tuple):
test =
  (\r -> r * 2) >>= (\x ->
  (\r -> r * 3) >>= (\y ->
  (\r -> r * 5) >>= (\z r ->
  const (x, y, z) r)))
Here the last \z -> return (x, y, z) has been replaced with its equivalent \z r -> const (x, y, z) r, which is just return (x, y, z) with the extra reader argument written out explicitly.
As a rough rule if you want to manually desugar do-notation, first erase the do at the top and flip the bind arrow (<-) on the left-hand-side to a (>>=) on the right-hand-side with the variable on the left as a lambda variable on the right. So:
addStuff :: Int -> Int
addStuff = do
  a <- (*2)
  ... rest ...
Becomes:
addStuff :: Int -> Int
addStuff =
  (*2) >>= (\a ->
    ... rest ...
  )
This is recursive, so the next term in the do-notation then becomes nested in the lambda of the desugared term above it, all the way down to the last expression which is just the body of the nested lambda expression.
The desugaring is quite mechanical, it's defined by the following rewrites, where ; denotes a newline.
do { a <- f ; m } ≡ f >>= \a -> do { m }
do { f ; m } ≡ f >> do { m }
do { m } ≡ m
Both a and b are of type Int, while return (a+b) has type Int -> Int; it is the last term in the do-notation, so it has to be identical to the top-level signature. Using -XScopedTypeVariables we can manually annotate the subterms:
{-# LANGUAGE ScopedTypeVariables #-}
import Control.Monad.Instances
addStuff :: Int -> Int
addStuff = do
  (a :: Int) <- (*2)
  (b :: Int) <- (+10)
  (return (a+b)) :: Int -> Int
Thanks to bheklilr.
I wrote my own code.
addStuff :: Int -> Int
addStuff = (\r -> r * 2) >>= (\x ->
            (\r -> r + 10) >>= (\y ->
            return (x + y)))
