Constant space short circuiting `foldM` over `Maybe` - haskell

Let's say I have the following:
f :: b -> a -> b
x :: b
l :: [a]
and
foldl' f x l
runs in constant space. That is, f is suitably strict.
Now consider if I have:
f2 :: b -> a -> Maybe b
f2 x y = if (pred x y) then Just $! (f x y) else Nothing
will
foldM f2 x l
reliably run in constant space? Or is there something else I need to do to ensure I get constant space while still keeping the short-circuiting behaviour of Maybe?
(Note whilst I've asked this question about Maybe, I actually want to do this with Either, but I suspect the approach is similar)

In the library source code foldM is defined as foldlM, which in turn is defined as
foldlM :: (Foldable t, Monad m) => (b -> a -> m b) -> b -> t a -> m b
foldlM f z0 xs = foldr c return xs z0
where c x k z = f z x >>= k
Assuming c x k z = f2 z x >>= k, let's see what happens when we call it. To see whether it runs in constant space or not, we will only reduce the expressions by applying the topmost function, without reducing the subexpressions.
foldlM f2 z0 (x:xs)
=
foldr c return (x:xs) z0
=
c x (foldr c return xs) z0
=
f2 z0 x >>= foldr c return xs
Since >>= is strict on the first arg, we evaluate f2 z0 x first. If that returns Nothing, we ignore the rest (short-circuiting, as you mentioned). If that returns Just y, we have
Just y >>= foldr c return xs
=
foldr c return xs y
and we are ready for the next loop.
This did not cause our term to grow, so it looks like it runs in constant space (provided f2 keeps the size of y constant, of course).
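For the Either case mentioned in the question, the same reasoning applies, since Either's >>= also pattern-matches its first argument. A minimal runnable sketch (the predicate, threshold and error message here are made up for illustration):
import Data.Foldable (foldlM)

-- Right $! plays the role of Just $!: force the accumulator before wrapping it.
step :: Int -> Int -> Either String Int
step acc x
  | acc < 10 ^ 9 = Right $! acc + x
  | otherwise    = Left "accumulator too large"

main :: IO ()
main = print (foldlM step 0 [1 .. 1000000 :: Int])
-- prints Left "accumulator too large": it stops early, and because the
-- accumulator is forced at every step it runs in constant space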

Related

Generalizing fold such that it becomes expressive enough to define any finite recursion?

So, there is something known as the "universal property of fold", stating exactly the following:
g [] = i; g (x:xs) = f x (g xs) <=> g = fold f i
However, as you probably know, there are rare cases like dropWhile which cannot be redefined as fold f i unless you generalize it.
The simplest and most obvious way to generalize it is to redefine the universal property:
g' y [] = j y; g' y (x:xs) = h y x xs (g' y xs) <=> g' y = fold (?) l
At this point I can make my assumption: I assume the existence of some function p :: a -> b -> b which would satisfy the equation g' y = fold p l. Let's try to solve the given equation with the help of the universal property mentioned at the very beginning:
g' y [] = j y = fold p l [] = l => j y = l
g' y (x:xs) = h y x xs (g' y xs) = fold p l (x:xs) = p x (fold p l xs) = p x (g' y xs) => letting rs = (g' y xs), h y x xs rs = p x rs, which is wrong: xs occurs free on the left-hand side and thus the equality can't hold.
Now let me try to interpret the result I've come up with and ask my question.
I see that the problem is xs emerging as an unbound variable; this happens in various situations, including the above-mentioned dropWhile. Does it mean that the only way the equation can be solved is by "extending" rs to a pair (rs, xs)? In other words, that fold accumulates into a tuple rather than a single value (ignoring the fact that a tuple itself is a single type)? Is there any other way to generalize that bypasses pairing?
It is as you say. The universal property says that g [] = i; g (x:xs) = f x (g xs) iff g = fold f i. This can't apply for a straightforward definition of dropWhile, as the would-be f :: a -> [a] -> [a] depends not just on the element and accumulated value at the current fold step, but also on the whole list suffix left to process (in your words, "xs emerg[es] as an unbound variable"). What can be done is twisting dropWhile so that this dependency on the list suffix becomes manifest in the accumulated value, be it through a tuple -- cf. dropWhilePair from this question, with f :: a -> ([a], [a]) -> ([a], [a]) -- or a function -- as in chi's implementation...
dropWhileFun = foldr (\x k -> \p -> if p x then k p else x : k (const False)) (const [])
... with f :: a -> ((a -> Bool) -> [a]) -> ((a -> Bool) -> [a]).
At the end of the day, the universal property is what it is -- a fundamental fact about foldr. It is no accident that not all recursive functions are immediately expressible through foldr. In fact, the tupling workaround your question brings to the table directly reflects the notion of paramorphism (for an explanation of them, see What are paramorphisms? and its exquisite answer by Conor McBride). At face value, paramorphisms are generalisations of catamorphisms (i.e. a straightforward fold); however, it only takes a slight contortion to implement paramorphisms in terms of catamorphisms. (Additional technical commentary on that might be found, for instance, in Chapter 3 of Categorical Programming With Inductive and Coinductive Types, Varmo Vene's PhD thesis.)
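For concreteness, here is one possible shape of the tupling workaround alluded to above (a sketch of mine; the dropWhilePair in the linked question may be spelled differently):
-- dropWhile via a pair-valued foldr: the second component carries the raw
-- suffix that a plain fold cannot see, which is exactly the paramorphism trick.
dropWhilePair :: (a -> Bool) -> [a] -> [a]
dropWhilePair p = fst . foldr step ([], [])
  where
    step x (dropped, suffix) =
      ( if p x then dropped else x : suffix  -- either keep dropping or stop here
      , x : suffix )                         -- the untouched suffix, always

-- e.g. dropWhilePair (< 3) [1,2,3,4,1] == [3,4,1]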

Laziness of (>>=) in folding

Consider the following 2 expressions in Haskell:
foldl' (>>=) Nothing (repeat (\y -> Just (y+1)))
foldM (\x y -> if x==0 then Nothing else Just (x+y)) (-10) (repeat 1)
The first one takes forever, because it's trying to evaluate the infinite expression
...(((Nothing >>= f) >>= f) >>= f)...
and Haskell will just try to evaluate it inside out.
The second expression, however, gives Nothing right away. I've always thought foldM was just doing fold using (>>=), but then it would run into the same problem. So it's doing something more clever here - once it hits Nothing it knows to stop. How does foldM actually work?
foldM can't be implemented using foldl. It needs the power of foldr to be able to stop short. Before we get there, here's a version without anything fancy.
foldM f b [] = return b
foldM f b (x : xs) = f b x >>= \q -> foldM f q xs
We can transform this into a version that uses foldr. First we flip it around:
foldM f b0 xs = foldM' xs b0 where
foldM' [] b = return b
foldM' (x : xs) b = f b x >>= foldM' xs
Then move the last argument over:
foldM' [] = return
foldM' (x : xs) = \b -> f b x >>= foldM' xs
And then recognize the foldr pattern:
foldM' = foldr go return where
go x r = \b -> f b x >>= r
Finally, we can inline foldM' and move b back to the left:
foldM f b0 xs = foldr go return xs b0 where
go x r b = f b x >>= r
This same general approach works for all sorts of situations where you want to pass an accumulator from left to right within a right fold. You first shift the accumulator all the way over to the right so you can use foldr to build a function that takes an accumulator, instead of trying to build the final result directly. Joachim Breitner did a lot of work to create the Call Arity compiler analysis for GHC 7.10 that helps GHC optimize functions written this way. The main reason to want to do so is that it allows them to participate in the GHC list libraries' fusion framework.
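As a quick sanity check (mine, not part of the original answer), the derived definition stops short on the question's infinite-list example:
-- The foldr-based definition, under a fresh name to avoid clashing with
-- Control.Monad.foldM.
foldM' :: Monad m => (b -> a -> m b) -> b -> [a] -> m b
foldM' f b0 xs = foldr go return xs b0
  where go x r b = f b x >>= r

main :: IO ()
main = print $ foldM' (\x y -> if x == 0 then Nothing else Just (x + y))
                      (-10 :: Int) (repeat 1)
-- prints Nothing right away instead of looping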
One way to define foldl in terms of foldr is:
foldl f z xn = foldr (\ x g y -> g (f y x)) id xn z
It's probably worth working out why that is for yourself. It can be re-written using >>> from Control.Arrow as
foldl f z xn = foldr (>>>) id (map (flip f) xn) z
The monadic equivalent of >>> is
f >=> g = \ x -> f x >>= \ y -> g y
which allows us to guess that foldM might be
foldM f z xn = foldr (>=>) return (map (flip f) xn) z
which turns out to be the correct definition. It can be re-written using foldr/map as
foldM f z xn = foldr (\ x g y -> f y x >>= g) return xn z
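A small check (again mine, not the answer's) that this (>=>)-based definition also stops early, this time with Either and an infinite list:
import Control.Monad ((>=>))

foldME :: Monad m => (b -> a -> m b) -> b -> [a] -> m b
foldME f z xn = foldr (>=>) return (map (flip f) xn) z

main :: IO ()
main = print $ foldME (\acc x -> if acc > 5 then Left "too big" else Right (acc + x))
                      (0 :: Int) (repeat (1 :: Int))
-- prints Left "too big"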

Defining foldl in terms of foldr in Standard ML

The given definition is
fun foldl f e l = let
fun g(x, f'') = fn y => f''(f(x, y))
in foldr g (fn x => x) l e end
I don't understand how this works;
what is the purpose of g(x, f'')?
I also found a similar example in Haskell; the definition is quite short:
myFoldl f z xs = foldr step id xs z
where
step x g a = g (f a x)
Let's dissect the Haskell implementation of myFoldl and then take a look at the SML code. First, we'll look at some type signatures:
foldr :: (a -> b -> b) -- the step function
-> b -- the initial value of the accumulator
-> [a] -- the list to fold
-> b -- the result
It should be noted that although the foldr function accepts only three arguments, we are applying it to four arguments:
foldr step id xs z
However, as you can see, the second argument to foldr (i.e. the initial value of the accumulator) is id, which is a function of type x -> x. Therefore, the result is also of type x -> x. Hence, it accepts four arguments.
Similarly, the step function is now of the type a -> (x -> x) -> x -> x. Hence, it accepts three arguments instead of two. The accumulator is an endofunction (i.e. a function whose domain and codomain is the same).
Endofunctions have a special property, they are composed from left to right instead of from right to left. For example, let's compose a bunch of Int -> Int functions:
inc :: Int -> Int
inc n = n + 1
dbl :: Int -> Int
dbl n = n * 2
The normal way to compose these functions is to use the function composition operator as follows:
incDbl :: Int -> Int
incDbl = inc . dbl
The incDbl function first doubles a number and then increments it. Note that this reads from right to left.
Another way to compose them is to use continuations (denoted by k):
inc' :: (Int -> Int) -> Int -> Int
inc' k n = k (n + 1)
dbl' :: (Int -> Int) -> Int -> Int
dbl' k n = k (n * 2)
Notice that the first argument is a continuation. If we want to recover the original functions then we can do:
inc :: Int -> Int
inc = inc' id
dbl :: Int -> Int
dbl = dbl' id
However, if we want to compose them then we do it as follows:
incDbl' :: (Int -> Int) -> Int -> Int
incDbl' = dbl' . inc'
incDbl :: Int -> Int
incDbl = incDbl' id
Notice that although we are still using the dot operator to compose the functions, it now reads from left to right.
This is the key behind making foldr behave as foldl. We fold the list from right to left but instead of folding it into a value, we fold it into an endofunction which when applied to an initial accumulator value actually folds the list from left to right.
Consider our incDbl function:
incDbl = incDbl' id
= (dbl' . inc') id
= dbl' (inc' id)
Now consider the definition of foldr:
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr _ acc [] = acc
foldr fun acc (y:ys) = fun y (foldr fun acc ys)
In the base case we simply return the accumulated value. However, in the inductive case we return fun y (foldr fun acc ys). Our step function is defined as follows:
step :: a -> (x -> x) -> x -> x
step x g a = g (f a x)
Here f is the reducer function of foldl and is of the type x -> a -> x. Notice that step x is an endofunction of the type (x -> x) -> x -> x which we know can be composed left to right.
Hence the folding operation (i.e. foldr step id) on a list [y1,y2..yn] looks like:
step y1 (step y2 (... (step yn id)))
-- or
(step y1 . step y2 . ... . step yn) id
Each step yx is an endofunction. Hence, this is equivalent to composing the endofunctions from left to right.
When this result is applied to an initial accumulator value then the list folds from left to right. Hence, myFoldl f z xs = foldr step id xs z.
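A quick check of this identity (the test values are mine), using subtraction so that folding from the wrong end would give a different answer:
myFoldl :: (b -> a -> b) -> b -> [a] -> b
myFoldl f z xs = foldr step id xs z
  where step x g a = g (f a x)

main :: IO ()
main = print (myFoldl (-) 100 [1, 2, 3], foldl (-) 100 [1, 2, 3])
-- prints (94,94): both compute ((100 - 1) - 2) - 3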
Now consider the foldl function (which is written in Standard ML and not OCaml). It is defined as:
fun foldl f e l = let fun g (x, f'') = fn y => f'' (f (x, y))
in foldr g (fn x => x) l e end
The biggest differences between the foldr functions of Haskell and SML are:
In Haskell the reducer function has the type a -> b -> b.
In SML the reducer function has the type (a, b) -> b.
Both are correct. It's only a matter of preference. In SML instead of passing two separate arguments, you pass one single tuple which contains both arguments.
Now, the similarities:
The id function in Haskell is the anonymous fn x => x function in SML.
The step function in Haskell is the function g in SML which takes a tuple containing the first two arguments.
The step function in Haskell, step x g a, has been split into two functions in SML, g (x, f'') = fn y => f'' (f (x, y)), for more clarity.
If we rewrite the SML function to use the same names as in Haskell then we have:
fun myFoldl f z xs = let fun step (x, g) = fn a => g (f (a, x))
in foldr step (fn x => x) xs z end
Hence, they are exactly the same function. The expression g (x, f'') simply applies the function g to the tuple (x, f''). Here f'' is a valid identifier.
Intuition
The foldl function traverses the list head to tail while operating elements with an accumulator:
(...(a⊗x1)⊗...⊗xn-1)⊗xn
And you want to define it via a foldr:
x1⊕(x2⊕...⊕(xn⊕e)...)
Rather unintuitive. The trick is that your foldr will not produce a value, but rather a function. The list traversal will operate the elements as to produce a function that, when applied to the accumulator, performs the computation you desire.
Let's see a simple example to illustrate how this works. Consider the sum foldl (+) 0 [1,2,3] = ((0+1)+2)+3. We may calculate it via foldr as follows.
foldr ⊕ id [1,2,3]
-> 1⊕(2⊕(3⊕id))
-> 1⊕(2⊕(id.(+3)))
-> 1⊕(id.(+3).(+2))
-> (id.(+3).(+2).(+1))
So when we apply this function to 0 we get
(id.(+3).(+2).(+1)) 0
= ((0+1)+2)+3
We began with the identity function and successively changed it as we traversed the list, using ⊕ where,
n ⊕ g = g . (+n)
Using this intuition, it isn't hard to define a sum with an accumulator via foldr. We built the computation for a given list via foldr ⊕ id xs. Then to calculate the sum we applied it to 0, foldr ⊕ id xs 0. So we have,
foldl (+) 0 xs = foldr ⊕ id xs 0
where n ⊕ g = g . (+n)
or equivalently, denoting n ⊕ g in prefix form by (⊕) n g and noting that (⊕) n g a = (g . (+n)) a = g (a+n),
foldl (+) 0 xs = foldr ⊕ id xs 0
where (⊕) n g a = g (a+n)
Note that the ⊕ is your step function, and that you can obtain the generic result you're looking for by substituting a function f for +, and accumulator a for 0.
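Written out as Haskell (a small sketch of mine, since ⊕ above is pseudo-notation):
-- Each element turns the accumulated endofunction g into g . (+ n),
-- mirroring n ⊕ g = g . (+n) above.
sumViaFoldr :: [Int] -> Int
sumViaFoldr xs = foldr (\n g -> g . (+ n)) id xs 0

-- sumViaFoldr [1,2,3] == 6 == foldl (+) 0 [1,2,3]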
Next let us show that the above really is correct.
Formal derivation
Moving on to a more formal approach. It is useful, for simplicity, to be aware of the following universal property of foldr.
h [] = e
h (x:xs) = f x (h xs)
iff
h = foldr f e
This means that rather than defining h as a foldr directly, we may instead, and more simply, define it by equations of the form above.
We want to define such an h so that,
h xs a = foldl f a xs
or equivalently,
h xs = \a -> foldl f a xs
So let's determine h. The empty case is simple:
h [] = \a -> foldl f a []
= \a -> a
= id
The non-empty case results in:
h (x:xs) = \a -> foldl f a (x:xs)
= \a -> foldl f (f a x) xs
= \a -> h xs (f a x)
= step x (h xs) where step x g = \a -> g (f a x)
= step x (h xs) where step x g a = g (f a x)
So we conclude that,
h [] = id
h (x:xs) = step x (h xs) where step x g a = g (f a x)
satisfies h xs a = foldl f a xs
And by the universal property above (noting that the f in the universal property formula corresponds to step here, and e to id) we know that h = foldr step id. Therefore,
h = foldr step id
h xs a = foldl f a xs
-----------------------
foldl f a xs = foldr step id xs a
where step x g a = g (f a x)
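If you want to convince yourself mechanically, here is a property-test sketch of the derived identity (the test is mine and assumes the QuickCheck package; subtraction is used so that fold direction matters):
import Test.QuickCheck (quickCheck)

prop_foldlViaFoldr :: Int -> [Int] -> Bool
prop_foldlViaFoldr a xs = foldl (-) a xs == foldr step id xs a
  where step x g acc = g (acc - x)

main :: IO ()
main = quickCheck prop_foldlViaFoldr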

Is there a way to elegantly represent this pattern in Haskell?

Mind the pure function below, in an imperative language:
def foo(x,y):
    x = f(x) if a(x)
    if c(x):
        x = g(x)
    else:
        x = h(x)
    x = f(x)
    y = f(y) if a(y)
    x = g(x) if b(y)
    return [x,y]
That function represents a style where you have to incrementally update variables. It can be avoided in most cases, but there are situations where that pattern is unavoidable - for example, writing a cooking procedure for a robot, which inherently requires a series of steps and decisions. Now, imagine we were trying to represent foo in Haskell.
foo x0 y0 =
let x1 = if a x0 then f x0 else x0 in
let x2 = if c x1 then g x1 else h x1 in
let x3 = f x2 in
let y1 = if a y0 then f y0 else y0 in
let x4 = if b y1 then g x3 else x3 in
[x4,y1]
That code works, but it is too complicated and error prone due to the need for manually managing the numeric tags. Notice that, after x1 is set, x0's value should never be used again, but it still can. If you accidentally use it, that will be an undetected error.
I've managed to solve this problem using the State monad:
fooSt x y = execState (do
(x,y) <- get
when (a x) (put (f x, y))
(x,y) <- get
if c x
then put (g x, y)
else put (h x, y)
(x,y) <- get
put (f x, y)
(x,y) <- get
when (a y) (put (x, f y))
(x,y) <- get
when (b y) (put (g x, x))) (x,y)
This way, need for tag-tracking goes away, as well as the risk of accidentally using an outdated variable. But now the code is verbose and much harder to understand, mainly due to the repetition of (x,y) <- get.
So: what is a more readable, elegant and safe way to express this pattern?
Full code for testing.
Your goals
While the direct transformation of imperative code would usually lead to the ST monad and STRef, let's think about what you actually want to do:
You want to manipulate values conditionally.
You want to return that value.
You want to sequence the steps of your manipulation.
Requirements
Now this indeed looks at first like the ST monad. However, if we follow the simple monad laws, together with do notation, we see that
do
x <- return $ if somePredicate x then g x
else h x
x <- return $ if someOtherPredicate x then a x
else b x
is exactly what you want. Since you need only the most basic functions of a monad (return and >>=), you can use the simplest:
The Identity monad
foo x y = runIdentity $ do
x <- return $ if a x then f x
else x
x <- return $ if c x then g x
else h x
x <- return $ f x
y <- return $ if a y then f y
else y
x <- return $ if b y then g x
else x
return (x,y)
Note that you cannot use let x = if a x then f x else x, because in this case the x would be the same on both sides, whereas
x <- return $ if a x then f x
else x
is the same as
(return $ if a x then (f x) else x) >>= \x -> ...
and the x in the if expression is clearly not the same as the resulting one, which is going to be used in the lambda on the right hand side.
Helpers
In order to make this more clear, you can add helpers like
condM :: Monad m => Bool -> a -> a -> m a
condM p a b = return $ if p then a else b
to get an even more concise version:
foo x y = runIdentity $ do
x <- condM (a x) (f x) x
x <- fmap f $ condM (c x) (g x) (h x)
y <- condM (a y) (f y) y
x <- condM (b y) (g x) x
return (x , y)
Ternary craziness
And while we're at it, let's crank up the craziness and introduce a ternary operator:
(?) :: Bool -> (a, a) -> a
b ? ie = if b then fst ie else snd ie
(??) :: Monad m => Bool -> (a, a) -> m a
(??) p = return . (?) p
(#) :: a -> a -> (a, a)
(#) = (,)
infixr 2 ??
infixr 2 #
infixr 2 ?
foo x y = runIdentity $ do
x <- a x ?? f x # x
x <- fmap f $ c x ?? g x # h x
y <- a y ?? f y # y
x <- b y ?? g x # x
return (x , y)
But the bottomline is, that the Identity monad has everything you need for this task.
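To make that concrete, here is a self-contained, runnable version of the Identity style; the particular a, b, c, f, g, h are placeholders of my own choosing, not from the question:
import Data.Functor.Identity (runIdentity)

a, b, c :: Int -> Bool
a = (< 10)
b = even
c = odd

f, g, h :: Int -> Int
f n = n + 3
g n = n * 2
h n = n - 1

foo :: Int -> Int -> (Int, Int)
foo x y = runIdentity $ do
  x <- return $ if a x then f x else x
  x <- return $ if c x then g x else h x
  x <- return $ f x
  y <- return $ if a y then f y else y
  x <- return $ if b y then g x else x
  return (x, y)

main :: IO ()
main = print (foo 0 0)  -- (9,3) with these placeholder functions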
Imperative or non-imperative
One might argue whether this style is imperative. It's definitely a sequence of actions. But there's no state, unless you count the bound variables. However, then a pack of let … in … declarations also gives an implicit sequence: you expect the first let to bind first.
Using Identity is purely functional
Either way, the code above doesn't introduce mutability. x doesn't get modified; instead you have a new x or y shadowing the last one. This becomes clear if you desugar the do expression as noted above:
foo x y = runIdentity $
a x ?? f x # x >>= \x ->
c x ?? g x # h x >>= \x ->
return (f x) >>= \x ->
a y ?? f y # y >>= \y ->
b y ?? g x # x >>= \x ->
return (x , y)
Getting rid of the simplest monad
However, if we use (?) on the left-hand side and remove the returns, we can replace (>>=) :: m a -> (a -> m b) -> m b by something of type a -> (a -> b) -> b. This just happens to be flip ($). We end up with:
($>) :: a -> (a -> b) -> b
($>) = flip ($)
infixr 0 $> -- same infix as ($)
foo x y = a x ? f x # x $> \x ->
c x ? g x # h x $> \x ->
f x $> \x ->
a y ? f y # y $> \y ->
b y ? g x # x $> \x ->
(x, y)
This is very similar to the desugared do expression above. Note that any usage of Identity can be transformed into this style, and vice-versa.
The problem you state looks like a nice application for arrows:
import Control.Arrow
if' :: (a -> Bool) -> (a -> a) -> (a -> a) -> a -> a
if' p f g x = if p x then f x else g x
foo2 :: (Int,Int) -> (Int,Int)
foo2 = first (if' c g h . if' a f id) >>>
first f >>>
second (if' a f id) >>>
(\(x,y) -> (if b y then g x else x , y))
in particular, first lifts a function a -> b to (a,c) -> (b,c), which is more idiomatic.
Edit: if' can itself be written as a lift:
import Control.Applicative (liftA3)
-- a functional if for lifting
if'' b x y = if b then x else y
if' :: (a -> Bool) -> (a -> a) -> (a -> a) -> a -> a
if' = liftA3 if''
I'd probably do something like this:
foo x y = ( x', y' )
where x' = bg y' . f . cgh . af $ x
y' = af y
af z = (if a z then f else id) z
cgh z = (if c z then g else h) z
bg y x = (if b y then g else id) x
For something more complicated, you may want to consider using lens:
import Control.Lens
import Control.Monad (when)
import Control.Monad.State

whenM :: Monad m => m Bool -> m () -> m ()
whenM c a = c >>= \res -> when res a
ifM :: Monad m => m Bool -> m a -> m a -> m a
ifM mb ml mr = mb >>= \b -> if b then ml else mr
foo :: Int -> Int -> (Int, Int)
foo = curry . execState $ do
whenM (uses _1 a) $
_1 %= f
ifM (uses _1 c)
(_1 %= g)
(_1 %= h)
_1 %= f
whenM (uses _2 a) $
_2 %= f
whenM (uses _2 b) $ do
_1 %= g
And there's nothing stopping you from using more descriptive variable names:
foo :: Int -> Int -> (Int, Int)
foo = curry . execState $ do
let x :: Lens (a, c) (b, c) a b
x = _1
y :: Lens (c, a) (c, b) a b
y = _2
whenM (uses x a) $
x %= f
ifM (uses x c)
(x %= g)
(x %= h)
x %= f
whenM (uses y a) $
y %= f
whenM (uses y b) $ do
x %= g
This is a job for the ST (state transformer) library.
ST provides:
Stateful computations in the form of the ST type. These look like ST s a for a computation that results in a value of type a, and may be run with runST to obtain a pure value of type a.
First-class mutable references in the form of the STRef type. The newSTRef a action creates a new STRef s a reference with an initial value of a; it can be read with readSTRef ref and written with writeSTRef ref a. A single ST computation can use any number of STRef references internally.
Together, these let you express the same mutable variable functionality as in your imperative example.
To use ST and STRef, we need to import:
{-# LANGUAGE NoMonomorphismRestriction #-}
import Control.Monad.ST.Safe
import Data.STRef
Instead of using the low-level readSTRef and writeSTRef all over the place, we can define the following helpers to match the imperative operations that the Python-style foo example uses:
-- STRef assignment.
(=:) :: STRef s a -> ST s a -> ST s ()
ref =: x = writeSTRef ref =<< x
-- STRef function application.
($:) :: (a -> b) -> STRef s a -> ST s b
f $: ref = f `fmap` readSTRef ref
-- Postfix guard syntax.
if_ :: Monad m => m () -> m Bool -> m ()
action `if_` guard = act' =<< guard
where act' b = if b then action
else return ()
This lets us write:
ref =: x to assign the value of ST computation x to the STRef ref.
(f $: ref) to apply a pure function f to the STRef ref.
action `if_` guard to execute action only if guard results in True.
With these helpers in place, we can faithfully translate the original imperative definition of foo into Haskell:
a = (< 10)
b = even
c = odd
f x = x + 3
g x = x * 2
h x = x - 1
f3 x = x + 2
-- A stateful computation that takes two integer STRefs and results in a final [x,y].
fooST :: Integral n => STRef s n -> STRef s n -> ST s [n]
fooST x y = do
x =: (f $: x) `if_` (a $: x)
x' <- readSTRef x
if c x' then
x =: (g $: x)
else
x =: (h $: x)
x =: (f $: x)
y =: (f $: y) `if_` (a $: y)
x =: (g $: x) `if_` (b $: y)
sequence [readSTRef x, readSTRef y]
-- Pure wrapper: simply call fooST with two fresh references, and run it.
foo :: Integral n => n -> n -> [n]
foo x y = runST $ do
x' <- newSTRef x
y' <- newSTRef y
fooST x' y'
-- This will print "[9,3]".
main = print (foo 0 0)
Points to note:
Although we first had to define some syntactical helpers (=:, $:, if_) before translating foo, this demonstrates how you can use ST and STRef as a foundation to grow your own little imperative language that's directly suited to the problem at hand.
Syntax aside, this matches the structure of the original imperative definition exactly, without any error-prone restructuring. Any minor changes to the original example can be mirrored directly to Haskell. (The addition of the temporary x' <- readSTRef x binding in the Haskell code is only in order to use it with the native if/else syntax: if desired, this can be replaced with an appropriate ST-based if/else construct.)
The above code demonstrates giving both pure and stateful interfaces to the same computation: pure callers can use foo without knowing that it uses mutable state internally, while ST callers can directly use fooST (and for example provide it with existing STRefs to modify).
@Sibi said it best in his comment:
I would suggest you to stop thinking imperatively and rather think in a functional way. I agree that it will take some time getting used to the new pattern, but trying to translate imperative ideas to functional languages isn't a great approach.
Practically speaking, your chain of let can be a good starting point:
foo x0 y0 =
let x1 = if a x0 then f x0 else x0 in
let x2 = if c x1 then g x1 else h x1 in
let x3 = f x2 in
let y1 = if a y0 then f y0 else y0 in
let x4 = if b y1 then g x3 else x3 in
[x4,y1]
But I would suggest using a single let and giving descriptive names to the intermediate stages.
In this example unfortunately I don't have a clue what the various x's and y's do, so I cannot suggest meaningful names. In real code you would use names such as x_normalized, x_translated, or such, instead of x1 and x2, to describe what those values really are.
In fact, in a let or where you don't really have variables: they're just shorthand names you give to intermediate results, to make it easy to compose the final expression (the one after in or before the where.)
This is the spirit behind the x_bar and x_baz below. Try to come up with names that are reasonably descriptive, given the context of your code.
foo x y =
let x_bar = if a x then f x else x
x_baz = f (if c x_bar then g x_bar else h x_bar)
y_bar = if a y then f y else y
x_there = if b y_bar then g x_baz else x_baz
in [x_there, y_bar]
Then you can start recognizing patterns that were hidden in the imperative code. For example, x_bar and y_bar are basically the same transformation, applied respectively to x and y: that's why they have the same suffix "_bar" in this nonsensical example; then your x2 probably doesn't need an intermediate name, since you can just apply f to the result of the entire "if c then g else h".
Going on with the pattern recognition, you should factor out the transformations that you are applying to variables into sub-lambdas (or whatever you call the auxiliary functions defined in a where clause.)
Again, I don't have a clue what the original code did, so I cannot suggest meaningful names for the auxiliary functions. In a real application, f_if_a would be called normalize_if_needed or thaw_if_frozen or mow_if_overgrown... you get the idea:
foo x y =
let x_bar = f_if_a x
y_bar = f_if_a y
x_baz = f (g_if_c_else_h x_bar)
x_there = g_if_b x_baz y_bar
in [x_there, y_bar]
where
f_if_a x
| a x = f x
| otherwise = x
g_if_c_else_h x
| c x = g x
| otherwise = h x
g_if_b x y
| b y = g x
| otherwise = x
Don't disregard this naming business.
The whole point of Haskell and other pure functional languages is to express algorithms without the assignment operator, meaning the tool that can modify the value of an existing variable.
The names you give to things inside a function definition, whether introduced as arguments, let, or where, can only refer to one value (or auxiliary function) throughout the entire definition, so that your code can be more easily reasoned about and proven correct.
If you don't give them meaningful names (and conversely giving your code a meaningful structure) then you're missing out on the entire purpose of Haskell.
(IMHO the other answers so far, citing monads and other shenanigans, are barking up the wrong tree.)
I always prefer layering state transformers to using a single state over a tuple: it definitely declutters things by letting you "focus" on a specific layer (representations of the x and y variables in our case):
import Control.Monad (when)
import Control.Monad.Trans.Class
import Control.Monad.Trans.State
foo :: x -> y -> (x, y)
foo x y =
(flip runState) y $ (flip execStateT) x $ do
get >>= \v -> when (a v) (put (f v))
get >>= \v -> put ((if c v then g else h) v)
modify f
lift $ get >>= \v -> when (a v) (put (f v))
lift get >>= \v -> when (b v) (modify g)
The lift function allows us to focus on the inner state layer, which is y.

Mapping which holds and passes previous result

When solving a system of linear equations with the tridiagonal matrix algorithm in Haskell, I ran into the following problem.
We have three vectors a, b and c, and we want to build a new vector c' which is a combination of them:
c'[i] = c[i] / b[i], i = 0
c'[i] = c[i] / (b[i] - a[i] * c'[i-1]), 0 < i < n - 1
c'[i] = undefined, i = n - 1
A naive implementation of the formula above in Haskell is as follows:
calcC' a b c = Data.Vector.generate n f
where
n = Data.Vector.length a
f i
| i == 0 = c!0 / b!0
| i == n - 1 = 0
| otherwise = c!i / (b!i - a!i * f (i - 1))
It looks like this function calcC' has complexity O(n²) due to the recurrence. But all we actually need is to pass the inner function f one more parameter carrying the previously generated value.
I wrote my own version of generate with complexity O(n) and a helper function mapP:
mapP f xs = mapP' xs Nothing
where
mapP' [] _ = []
mapP' (x:xs) xp = xn : mapP' xs (Just xn)
where
xn = f x xp
generateP n f = Data.Vector.fromList $ mapP f [0 .. n-1]
As one can see, mapP acts like a standard map, but it also passes the previously generated value (or Nothing on the first call) to the mapping function.
My question: are there any standard ways to do this in Haskell? Am I reinventing the wheel?
Thanks.
There are two standard functions called mapAccumL and mapAccumR that do precisely what you want.
mapAccumL :: (acc -> x -> (acc, y)) -> acc -> [x] -> (acc, [y])
mapAccumR :: (acc -> x -> (acc, y)) -> acc -> [x] -> (acc, [y])
Basically, they behave like a combination of fold and map.
map f = snd . mapAccumL (\_ x -> ((), f x)) ()
foldl f b = fst . mapAccumL (\b x -> (f b x, ())) b
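Applied to the question's recurrence, a list-based sketch (mine; the question itself uses Data.Vector, and starting the accumulator at 0 makes the first step collapse to c!0 / b!0, just as in the postscanl' version further below):
import Data.List (mapAccumL)

-- c'[0] = c[0] / b[0];  c'[i] = c[i] / (b[i] - a[i] * c'[i-1])
calcC'List :: [Double] -> [Double] -> [Double] -> [Double]
calcC'List as bs cs = snd $ mapAccumL step 0 (zip3 as bs cs)
  where
    step prev (a, b, c) = let c' = c / (b - a * prev) in (c', c')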
If you use Data.Array, which is lazy, you can express the recurrence directly by referring to c' while defining c'.
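For instance (a sketch of mine, not the answerer's code):
import Data.Array

-- The knot-tying version: elements of the boxed array are lazy, so c' can
-- refer to its own previous element while being constructed.
calcC' :: Array Int Double -> Array Int Double -> Array Int Double -> Array Int Double
calcC' a b c = c'
  where
    (lo, hi) = bounds a
    c' = listArray (lo, hi) [ go i | i <- [lo .. hi] ]
    go i
      | i == lo   = c ! lo / b ! lo
      | otherwise = c ! i / (b ! i - a ! i * c' ! (i - 1))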
The following code seems to be the simplest implementation of the formula above in my case:
import qualified Data.Vector.Generic as V
calcC' a b c = V.postscanl' f 0.0 $ V.zip3 a b c
where
f c' (a, b, c) = c / (b - a * c')
Thanks to the authors of vector for adding the helpful postscanl' function.

Resources