I've been reading some Haskell code and keep seeing functions that look something like this:
ok :: a -> Result i w e a
ok a =
  Result $ \i w _ good ->
    good i w a
Why is a lambda used? Why wouldn't you just write the following?:
ok :: a -> Result i w e a
ok a =
  Result $ good i w a
This is continuation passing style or "CPS".
So first off, your alternative example doesn't make sense. good, i, and w are not known at the point they are used, and you will get an error.
The basic idea of continuation passing style is that instead of returning the relevant information, you instead call a function that you are given (in this case good), passing it your intended result as an argument. Presumably (based on the naming) the ignored argument _ would have been called bad, and it is a function that you would call in the case of failure.
If you are the ok function, it's like the difference between asking you to
Bake me a batch of cookies.
(where I have the intention of giving the cookies to Dave), and
Bake a batch of cookies and then give it to Dave.
which accomplishes the same thing, but now I don't have to be a middleman anymore. There are often performance advantages to cutting me out as a middleman, and it also means you can do more things. For example, if the batch of cookies is really good, you might decide to give it to your mom instead of Dave (thus aborting whatever Dave would have done with them), or bake two batches and give them both to Dave (duplicating what Dave would have done). Sometimes you want this ability and other times you don't; it depends on context. (N.B. In the examples below, the types are sufficiently general to disallow these possibilities.)
Here is a very simple example of continuation passing style. Say you have a program
import Prelude hiding (pred)  -- our pred would otherwise clash with Prelude's
pred :: Integer -> Maybe Integer
pred n = if n > 0 then Just (n-1) else Nothing
which subtracts 1 from a number and returns the result (in a Just constructor), unless the result would be negative, in which case it returns Nothing. You might use it like this:
main = do
  x <- readLn
  case pred x of
    Just predx -> putStrLn $ "The predecessor is " ++ show predx
    Nothing    -> putStrLn "Can't take the predecessor"
We can encode this in continuation passing style by having pred, instead of returning a Maybe, take an argument for what to do in each case:
pred :: Integer -> (Integer -> r) -> r -> r
--                 ^^^^^^^^^^^^^^    ^
--                 Just case         |
--                                   Nothing case
pred n ifPositive ifNegative =
  if n > 0
    then ifPositive (n-1)
    else ifNegative
And the usage becomes:
main = do
  x <- readLn
  pred x (\predx -> putStrLn $ "The predecessor is " ++ show predx)
         (putStrLn "Can't take the predecessor")
See how that works? -- doing it the first way we got the result and then did case analysis; in the second way each case became an argument to the function. And in the process the call to pred became a tail call, eliminating the need for a stack frame and the intermediate Maybe data structure.
The only remaining problem is that pred's signature is kind of confusing. We can make it a bit clearer by wrapping the CPS stuff in its own type constructor:
-- The forall inside the newtype needs the RankNTypes extension.
{-# LANGUAGE RankNTypes #-}
newtype CPSMaybe a = CPSMaybe (forall r. (a -> r) -> r -> r)
pred :: Integer -> CPSMaybe Integer
pred n = CPSMaybe $ \ifPositive ifNegative ->
  if n > 0
    then ifPositive (n-1)
    else ifNegative
which has a signature that looks more like the first one but with code that looks like the second (except for the CPSMaybe newtype wrapper, which has no effect at runtime). And now maybe you can see the connection to the code in your question.
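To make the wrapper a bit more concrete, here is a minimal sketch of how a CPSMaybe value might be consumed. The names runCPSMaybe, toMaybe, fromMaybe' and predCPS are made up for this illustration (predCPS is just the pred above, renamed so it doesn't clash with Prelude's pred); none of them come from a library.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Same shape as the CPSMaybe in the text: a value is "what to do with a
-- result" plus "what to do when there is none".
newtype CPSMaybe a = CPSMaybe (forall r. (a -> r) -> r -> r)

-- Hypothetical helper: run a CPSMaybe by supplying both continuations.
runCPSMaybe :: CPSMaybe a -> (a -> r) -> r -> r
runCPSMaybe (CPSMaybe k) = k

-- Hypothetical helpers: convert to and from the ordinary Maybe.
toMaybe :: CPSMaybe a -> Maybe a
toMaybe m = runCPSMaybe m Just Nothing

fromMaybe' :: Maybe a -> CPSMaybe a
fromMaybe' (Just a) = CPSMaybe (\ifJust _ -> ifJust a)
fromMaybe' Nothing  = CPSMaybe (\_ ifNothing -> ifNothing)

-- The text's pred under a different name.
predCPS :: Integer -> CPSMaybe Integer
predCPS n = CPSMaybe (\ifPositive ifNegative ->
  if n > 0
    then ifPositive (n - 1)
    else ifNegative)
```

Going through toMaybe gives back exactly the data structure the CPS version avoids building, which is a quick way to convince yourself that the two encodings carry the same information.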
Well, the Result type apparently wraps a function, so a lambda is the natural thing to use here. If you wanted to avoid a lambda, you could use a local definition instead using let or where, e.g.:
ok a = let
    proceed i w _ good = good i w a
  in Result proceed
-- or --
ok a = Result proceed
  where
    proceed i w _ good = good i w a
Writing this won’t work because the variables i, w, and good are not in scope:
ok :: a -> Result i w e a
ok a =
  Result $ good i w a
I wonder if the source of your confusion is the fact that i and w are also used as type variables in the signature of ok, but they’re different variables that happen to have the same names. It’s just as if you’d written something like this:
ok :: a -> Result i w e a
ok value =
  Result $ continue index writer value
Here it should be obvious that the continue, index, and writer variables aren’t defined.
In the first sample, good, w and i are locally defined parameters to the lambda expression. In the second sample, they are free variables. I would expect the second sample to fail with an error saying that those identifiers are not in scope. Result apparently is a type that contains information about how to use given data and handlers. ok says to take the data and apply the handler indicating a good outcome to it. In the second sample, it is not clear that one is even referring to the arguments available to what Result wraps, or which names refer to which arguments.
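Since the question doesn't show the definition of Result, here is a guess at what it might look like, just to show the lambda's parameters lining up with the wrapped function. The field layout, the err function and the toEither runner are all assumptions made for this sketch, not the real library's definitions.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Assumed shape of Result (not taken from any real library): it wraps a
-- function that receives an input i, an accumulator w, a failure
-- continuation, and a success continuation.
newtype Result i w e a = Result
  { runResult :: forall r.
                 i -> w
              -> (i -> w -> e -> r)   -- called on failure ("bad")
              -> (i -> w -> a -> r)   -- called on success ("good")
              -> r
  }

-- The ok from the question: ignore the failure continuation, call good.
ok :: a -> Result i w e a
ok a = Result (\i w _ good -> good i w a)

-- Hypothetical counterpart to ok: ignore good, call the failure continuation.
err :: e -> Result i w e a
err e = Result (\i w bad _ -> bad i w e)

-- Hypothetical runner: collapse both outcomes into an Either.
toEither :: i -> w -> Result i w e a -> Either e a
toEither i w r = runResult r i w (\_ _ e -> Left e) (\_ _ a -> Right a)
```

With a definition along these lines, the i, w and good in the lambda are ordinary value-level parameters, unrelated to the type variables i and w in the signature.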
In Haskell, this is a simple (naive) definition of a fixed point
fix :: (a -> a) -> a
fix f = f (fix f)
But, here is how Haskell actually implements it (more efficient)
fix f = let x = f x in x
My question is why is the second one more efficient than the first?
The slow fix calls f on each step of recursion, while the fast one calls f exactly once. It can be visualized with tracing:
import Debug.Trace
fix f = f (fix f)
fix' f = let x = f x in x
facf :: (Int -> Int) -> Int -> Int
facf f 0 = 1
facf f n = n * f (n - 1)
tracedFacf x = trace "called" facf x
fac = fix tracedFacf
fac' = fix' tracedFacf
Now try some running:
> fac 3
called
called
called
called
6
> fac' 3
called
6
In more detail, let x = f x in x results in a lazy thunk being allocated for x, and a pointer to this thunk is passed to f. On first evaluating fix' f, the thunk is evaluated and all references to it (here specifically: the one we pass to f) are redirected to the resulting value. It just happens that x is given a value that also contains a reference to x.
I admit this can be rather mind-bending. It's something that one should get used to when working with laziness.
I don't think this always (or necessarily ever) helps when you're calling fix with a function that takes two arguments to produce a function taking one argument. You'd have to run some benchmarks to see. But you can also call it with a function taking one argument!
fix (1 :)
is a circular linked list. Using the naive definition of fix, it would instead be an infinite list, with new pieces built lazily as the structure is forced.
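A small sketch of both uses, with the library fix from Data.Function (which is defined with the let form). The names ones and fac are just for this example.

```haskell
import Data.Function (fix)

-- A list of ones defined without explicit recursion; with the let-based
-- fix, the tail of the single cons cell points back at the cell itself.
ones :: [Int]
ones = fix (1 :)

-- Factorial via fix: the first parameter stands for "the function itself".
fac :: Integer -> Integer
fac = fix (\self n -> if n <= 0 then 1 else n * self (n - 1))
```

You can't observe the sharing from pure code (take 3 ones is [1,1,1] either way); the difference only shows up in allocation.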
I believe this has been asked already, but I couldn't find the answer. The reason is that the first version
fix f = f (fix f)
is a recursive function, so it can't be inlined and then optimized. From the GHC manual:
For example, for a self-recursive function, the loop breaker can only be the function itself, so an INLINE pragma is always ignored.
But
fix f = let x = f x in x
isn't recursive: the recursion is moved into the let binding, so it's possible to inline it.
Update: I did some tests, and while the former version doesn't get inlined and the latter does, that doesn't seem to be crucial for performance. So the other explanations (a single object on the heap vs. creating one every iteration) seem to be more accurate.
I'm not entirely clear on how seq works in Haskell.
It seems like there are lots of cases where it would be useful to write
seq x x
and maybe even define a function:
strict x = seq x x
but such a function doesn't already exist so I'm guessing this approach is somehow wrongheaded. Could someone tell me if this is meaningful or useful?
seq a b returns the value of b, but makes that value depend on the evaluation of a. Thus, seq a a is exactly the same thing as a.
I think the misunderstanding here is that seq doesn't take any action, because pure functions don't take actions, it just introduces a dependency.
There is a function evaluate :: a -> IO a in Control.Exception that does what you want (note that it's in IO). It lives in the exception module because it's useful for checking whether evaluating an expression would throw and, if so, handling the exception.
The expression x = seq a b means that if x is evaluated, then a will also be evaluated (but x will be equal to b).
It does not mean "evaluate a now".
Notice that if x is being evaluated, then since x equals b, then b will also be evaluated.
And hence, if I write x = seq a a, I am saying "if x is evaluated then evaluate a". But if I just do x = a, that would achieve exactly the same thing.
When you say seq a b what you are telling the computer is,
Whenever you need to evaluate b, evaluate a for me too, please.
If we replace both a and b with x you can see why it's useless to write seq x x:
Whenever you need to evaluate x, evaluate x for me too, please.
Asking the computer to evaluate x when it needs to evaluate x is just a useless thing to do – it was going to evaluate x anyway!
seq does not evaluate anything – it simply tells the computer that when you need the second argument, also evaluate the first argument. Understanding this is actually really important, because it allows you to understand the behaviour of your programs much better.
seq x x would be entirely, trivially redundant.
Remember, seq is not a command. The presence of a seq a b in your program does not force evaluation of a or b. What it does do is make the evaluation of the result artificially dependent on the evaluation of a, even though the result itself is b. If you print out seq a b, a will be evaluated and its result discarded. Since x already depends on itself, seq x x is silly.
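A minimal sketch of the "dependency, not command" idea, using only the Prelude. The names neverForced and sumStrict are made up for this example.

```haskell
-- seq only ties the evaluation of its first argument to the evaluation of
-- the whole expression. Nothing happens until that expression is demanded.

-- The seq expression sits unevaluated inside the list; length never looks
-- at the elements, so the undefined is never forced.
neverForced :: Int
neverForced = length [undefined `seq` (1 :: Int)]

-- A typical useful seq: force the accumulator before recursing, so a long
-- chain of pending additions does not build up.
sumStrict :: [Int] -> Int
sumStrict = go 0
  where
    go acc []       = acc
    go acc (x : xs) =
      let acc' = acc + x
      in acc' `seq` go acc' xs
```

Note that in sumStrict the two arguments of seq are different expressions: evaluating the recursive call is made to depend on acc' having been evaluated first. That is the situation seq x x can never express.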
Close! deepseq (which is the "more thorough" seq -- see the docs for a full description) has the type NFData a => a -> b -> b, and force (with type NFData a => a -> a) is defined simply as
force :: (NFData a) => a -> a
force x = x `deepseq` x
What is the best monadic type to explain the Monad typeclass to people who don't know anything about monads? Should I use something from the standard Haskell library, or should I make up some new type?
I think the best way to motivate monads is to show how many embedded domain specific languages have the structure of a monad:
List comprehensions are the obvious example.
JS Promises are a monad with .then serving as the bind operation
Groovy's ?. operator
many "fluent" interfaces in O-O languages are monadic
Using monads you can embed an assembler for the 6502 into your program, or even BASIC code
The monad pattern allows you to drive out the inessential complexity from your code and concentrate on the important details of the computation.
Understanding the monad pattern will serve you well when you want to create your own EDSLs.
The Maybe monad is (in my opinion) the easiest to understand. Once you get past the concept of the (simple) algebraic type, understanding how the Maybe monad works is fairly straightforward.
If someone is having trouble understanding the constructors for Maybe, you could write them a class that does essentially the same thing:
class Maybe(object):
    def __init__(self, value=None):
        self.__just = value

    def just(self):
        if self.isJust():
            return self.__just
        else:
            raise ValueError('None')

    def isJust(self):
        return self.__just is not None

    def fmap(self, f):
        if self.isJust():
            return Maybe(f(self.just()))
        else:
            return Maybe()

    def bind(self, fM):
        """fM must return a value of type Maybe"""
        if self.isJust():
            return fM(self.just())
        else:
            return Maybe()

    def __repr__(self):
        if self.isJust():
            return 'Just ({})'.format(self.just())
        else:
            return 'Nothing'

def head(some_list):
    if len(some_list) == 0:
        return Maybe()
    else:
        return Maybe(some_list[0])

def idx(some_list, i):
    if i < len(some_list):
        return Maybe(some_list[i])
    else:
        return Maybe()

print(head([1, 2, 3]).bind(
    lambda x: Maybe(2 * x)).bind(
    lambda x: Maybe(x + 1)).bind(
    lambda x: Maybe(x + 3)))
# head of the outer list, then element 0 of the inner list, then double it
print(head([[1, 2, 3]]).bind(
    lambda xs: idx(xs, 0)).bind(
    lambda x: Maybe(2 * x)))
# index 3 is out of range, so everything after it is skipped
print(head([[1, 2, 3]]).bind(
    lambda xs: idx(xs, 3)).bind(
    lambda x: Maybe(2 * x)))
This code will print out
Just (6)
Just (2)
Nothing
And this has the same functionality (more or less) as the Maybe monad in Haskell, just re-implemented in Python using a class. The return function in Haskell is replaced by the constructor and >>= is replaced by .bind.
I think it's important to let monadic patterns arise from actual use. It can be enlightening to pick a few types and direct someone toward problems which are naturally expressed by a monadic pattern.
Off the top of my head, it's very easy to argue for the benefits of Maybe, have someone get concerned about the nested error handling that obviously results, and then talk about how
case f x of
  Nothing -> Nothing
  Just y -> case g y of
    Nothing -> Nothing
    Just z -> case h z of
      Nothing -> Nothing
      Just q -> case r q of
        Nothing -> Nothing
        Just end -> end
is actually a very, very common pattern that Haskell lets you abstract over.
Then talk about configuration and how it's useful to pass a Config data type to many functions for them to operate. It's easy to end up writing code like
go config input =
  let (x, y) = f config $ g config $ h config input
  in finally config x (we'reDone config y)
but this is again a very common pattern in Haskell that gets annoying but has a common strategy for mitigating the verbosity.
Finally, talk about mutation of state as chaining endomorphisms like
let state4 = (modify4 . modify3 . modify2 . modify1 :: State -> State) state0
and how that's pretty annoying as well while also fixing your "modification chain" ahead of time without allowing you to get any information out of the intermediate steps (without, at least, threading it along with your state as well).
And again, this can be solved very uniformly in Haskell by a common abstraction pattern with a weird name. You've heard stories about Monads, right?
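As a rough sketch of where that leads for the first two patterns: the nested case staircase collapses into >>= (or do) in the Maybe monad, and config passing can ride on the function monad that base already provides. Every name here (parseInt, positive, halve, Config, greeting) is made up for the illustration.

```haskell
import Text.Read (readMaybe)

-- Hypothetical steps, each of which may fail.
parseInt :: String -> Maybe Int
parseInt = readMaybe

positive :: Int -> Maybe Int
positive n = if n > 0 then Just n else Nothing

halve :: Int -> Maybe Int
halve n = if even n then Just (n `div` 2) else Nothing

-- The staircase of cases, written with >>= ...
pipeline :: String -> Maybe Int
pipeline s = parseInt s >>= positive >>= halve

-- ... and with do notation; both mean the same thing.
pipeline' :: String -> Maybe Int
pipeline' s = do
  n <- parseInt s
  p <- positive n
  halve p

-- Config passing: functions from a shared Config form a monad too,
-- so the config is threaded implicitly instead of by hand.
data Config = Config { verbose :: Bool, name :: String }

greeting :: Config -> String
greeting = do
  n <- name
  v <- verbose
  pure (if v then "Hello, " ++ n ++ "!" else n)
```

The Reader and State types from the usual libraries package up the second and third patterns more explicitly; the point here is only that the same >>= shape shows up each time.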
One of the more profound insights I had, was that monads can be seen as imperative programming languages, which you can compose. So maybe you should build a "language" with them, so they get a grip of how powerful the abstraction is.
I think creating a programming language for them would be a good perspective.
For example, first you add state
import Data.Map as M
import Control.Monad
import Control.Monad.State
data Variable = IntV Int | StringV String ..
type Context = M.Map String Variable
type Program a = State Context a
Then you add logging:
type Program a = WriterT [String] (State Context) a
log x = tell [x]
Then you add exceptions:
type Program a = WriterT [String] (StateT Context (Either String)) a
Then you add continuations etc.
This way you can show them that you can use monads to build an environment which is precisely good for your problem. After this if they are interested you show them the anatomy of a monad and how they are built.
For example first you show the Maybe monad. Give them first the pure lambda version of it:
data Perhaps a = Sure a | Nope
next (Sure a) f = f a
next (Nope) f = Nope
Show them how you can chain computations with lambda's:
small x | x < 100   = Sure x
        | otherwise = Nope
between x y z | x < z && z < y = Sure z
              | otherwise      = Nope
small 10 `next` (\b -> between 5 20 b)
Then show them how you can transform this to do notation:
small x `next` (\b -> between 5 20 b
`next` (\c -> between 10 14 c))
Suggest it would be handy if we could write it like this:
small x -> \b ->
between 5 20 b -> \c ->
between 10 14 c
Then introduce do notation:
do
  b <- small x
  c <- between 5 20 b
  between 10 14 c
Now they have invented do notation with you. From there you can explain some other monads.
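For the do block to actually compile, Perhaps needs Functor, Applicative and Monad instances. A minimal sketch follows; the instances are the obvious ones, with >>= being exactly the next function, and checked is a name made up for this example.

```haskell
data Perhaps a = Sure a | Nope
  deriving (Eq, Show)

instance Functor Perhaps where
  fmap f (Sure a) = Sure (f a)
  fmap _ Nope     = Nope

instance Applicative Perhaps where
  pure = Sure
  Sure f <*> p = fmap f p
  Nope   <*> _ = Nope

-- (>>=) is the next function from the text under its standard name.
instance Monad Perhaps where
  Sure a >>= f = f a
  Nope   >>= _ = Nope

small :: Int -> Perhaps Int
small x | x < 100   = Sure x
        | otherwise = Nope

between :: Int -> Int -> Int -> Perhaps Int
between lo hi z | lo < z && z < hi = Sure z
                | otherwise        = Nope

-- The chain from the text, in do notation.
checked :: Int -> Perhaps Int
checked x = do
  b <- small x
  c <- between 5 20 b
  between 10 14 c
```

Here checked 12 passes all three tests, while a value such as 18 passes the first two and is turned into Nope by the last one.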
Here's a simple, barebones example of how the code that I'm trying to do would look in C++.
while (state == true) {
    a = function1();
    b = function2();
    state = function3();
}
In the program I'm working on, I have some functions that I need to loop through until bool state equals false (or until one of the variables, let's say variable b, equals 0).
How would this code be done in Haskell? I've searched through here, Google, and even Bing and haven't been able to find any clear, straightforward explanations of how to do repetitive actions with functions.
Any help would be appreciated.
Taking Daniel's comment into account, it could look something like this:
f = loop init_a init_b True
  where
    loop a b True = loop a' b' (fun3 a' b')
      where
        a' = fun1 ....
        b' = fun2 .....
    loop a b False = (a,b)
Well, here's a suggestion of how to map the concepts here:
A C++ loop is some form of list operation in Haskell.
One iteration of the loop = handling one element of the list.
Looping until a certain condition becomes true = base case of a function that recurses on a list.
But there is something that is critically different between imperative loops and functional list functions: loops describe how to iterate; higher-order list functions describe the structure of the computation. So for example, map f [a0, a1, ..., an] can be described by this diagram:
[  a0,   a1, ...,   an]
   |     |          |
   f     f          f
   |     |          |
   v     v          v
[f a0, f a1, ..., f an]
Note that this describes how the result is related to the arguments f and [a0, a1, ..., an], not how the iteration is performed step by step.
Likewise, foldr f z [a0, a1, ..., an] corresponds to this:
f a0 (f a1 (... (f an z)))
filter doesn't quite lend itself to diagramming, but it's easy to state many rules that it satisfies:
length (filter pred xs) <= length xs
For every element x of filter pred xs, pred x is True.
If x is an element of filter pred xs, then x is an element of xs
If x is not an element of xs, then x is not an element of filter pred xs
If x appears before x' in filter pred xs, then x appears before x' in xs
If x appears before x' in xs, and both x and x' appear in filter pred xs, then x appears before x' in filter pred xs
In a classic imperative program, all three of these cases are written as loops, and the difference between them comes down to what the loop body does. Functional programming, on the contrary, insists that this sort of structural pattern does not belong in "loop bodies" (the functions f and pred in these examples); rather, these patterns are best abstracted out into higher-order functions like map, foldr and filter. Thus, every time you see one of these list functions you instantly know some important facts about how the arguments and the result are related, without having to read any code; whereas in a typical imperative program, you must read the bodies of loops to figure this stuff out.
So the real answer to your question is that it's impossible to offer an idiomatic translation of an imperative loop into functional terms without knowing what the loop body is doing—what are the preconditions supposed to be before the loop runs, and what the postconditions are supposed to be when the loop finishes. Because that loop body that you only described vaguely is going to determine what the structure of the computation is, and different such structures will call for different higher-order functions in Haskell.
First of all, let's think about a few things.
Does function1 have side effects?
Does function2 have side effects?
Does function3 have side effects?
The answer to all of these is a resoundingly obvious YES, because they take no inputs, and presumably there are circumstances which cause you to go around the while loop more than once (rather than def function3(): return false). Now let's remodel these functions with explicit state.
s = initialState
sentinel = true
while(sentinel):
    a,b,s,sentinel = function1(a,b,s,sentinel)
    a,b,s,sentinel = function2(a,b,s,sentinel)
    a,b,s,sentinel = function3(a,b,s,sentinel)
return a,b,s
Well that's rather ugly. We know absolutely nothing about what inputs each function draws from, nor do we know anything about how these functions might affect the variables a, b, and sentinel, nor "any other state" which I have simply modeled as s.
So let's make a few assumptions. Firstly, I am going to assume that these functions do not directly depend on nor affect in any way the values of a, b, and sentinel. They might, however, change the "other state". So here's what we get:
s = initState
sentinel = true
while (sentinel):
    a,s2 = function1(s)
    b,s3 = function2(s2)
    sentinel,s4 = function3(s3)
    s = s4
return a,b,s
Notice I've used temporary variables s2, s3, and s4 to indicate the changes that the "other state" goes through. Haskell time. We need a control function to behave like a while loop.
myWhile :: s                   -- an initial state
        -> (s -> (Bool, a, s)) -- given a state, produces a sentinel, a current result, and the next state
        -> (a, s)              -- the result, plus resultant state
myWhile s f = case f s of
  (False, a, s') -> (a, s')
  (True, _, s')  -> myWhile s' f
Now how would one use such a function? Well, given we have the functions:
function1 :: MyState -> (AType, MyState)
function2 :: MyState -> (BType, MyState)
function3 :: MyState -> (Bool, MyState)
We would construct the desired code as follows:
thatCodeBlockWeAreTryingToSimulate :: MyState -> ((AType, BType), MyState)
thatCodeBlockWeAreTryingToSimulate initState = myWhile initState f
  where f :: MyState -> (Bool, (AType, BType), MyState)
        f s = let (a, s2) = function1 s
                  (b, s3) = function2 s2
                  (sentinel, s4) = function3 s3
              in (sentinel, (a, b), s4)
Notice how similar this is to the non-ugly python-like code given above.
You can verify that the code I have presented is well-typed by adding function1 = undefined etc for the three functions, as well as the following at the top of the file:
{-# LANGUAGE EmptyDataDecls #-}
data MyState
data AType
data BType
So the takeaway message is this: in Haskell, you must explicitly model the changes in state. You can use the "State Monad" to make things a little prettier, but you should first understand the idea of passing state around.
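To give a feel for the "a little prettier" version, here is a sketch with a tiny hand-rolled State type (the real one lives in the mtl/transformers packages; this one is only meant to show the mechanics). The three functions are invented stand-ins that use an Int counter as the "other state".

```haskell
-- A minimal State type: it only wraps the s -> (a, s) functions used above.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in (h a, s2)

instance Monad (State s) where
  State g >>= f = State $ \s ->
    let (a, s') = g s
    in runState (f a) s'

-- Hypothetical stand-ins for function1..function3.
function1 :: State Int String
function1 = State $ \s -> ("a" ++ show s, s + 1)

function2 :: State Int Int
function2 = State $ \s -> (s * 10, s + 1)

function3 :: State Int Bool
function3 = State $ \s -> (s < 6, s)

-- The loop body reads like the imperative version; >>= threads the state.
loop :: State Int (String, Int)
loop = do
  a <- function1
  b <- function2
  continue <- function3
  if continue then loop else pure (a, b)
```

The s2, s3, s4 bookkeeping hasn't gone anywhere; it has just moved into the definition of >>=, which is why it's worth understanding the explicit version first.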
Let's take a look at your C++ loop:
while (state == true) {
    a = function1();
    b = function2();
    state = function3();
}
Haskell is a pure functional language, so it won't fight us as much (and the resulting code will be more useful, both in itself and as an exercise to learn Haskell) if we try to do this without side effects, and without using monads to make it look like we're using side effects either.
Let's start with this structure
while (state == true) {
    <<do stuff that updates state>>
}
In Haskell we're obviously not going to be checking a variable against true as the loop condition, because it can't change its value[1] and we'd either evaluate the loop body forever or never. So instead, we'll want to be evaluating a function that returns a boolean value on some argument:
while (check something == True) {
    <<do stuff that updates state>>
}
Well, now we don't have a state variable, so that "do stuff that updates state" is looking pretty pointless. And we don't have a something to pass to check. Let's think about this a bit more. We want the something to be checked to depend on what the "do stuff" bit is doing. We don't have side effects, so that means something has to be (or be derived from) what is returned from the "do stuff". "do stuff" also needs to take something that varies as an argument, or it'll just keep returning the same thing forever, which is also pointless. We also need to return a value out of all this, otherwise we're just burning CPU cycles (again, with no side effects there's no point running a function if we don't use its output in some way, and there's even less point running a function repeatedly if we never use its output).
So how about something like this:
while check func state =
  let next_state = func state in
    if check next_state
      then while check func next_state
      else next_state
Let's try it in GHCi:
*Main> while (<20) (+1) 0
20
This is the result of applying (+1) repeatedly while the result is less than 20, starting from 0.
*Main> while ((<20) . length) (++ "zob") ""
"zobzobzobzobzobzobzob"
This is the result of concatenating "zob" repeatedly while the result's length is less than 20, starting from the empty string.
So you can see I've defined a function that is (sort of a bit) analogous to a while loop from imperative languages. We didn't even need dedicated loop syntax for it! (which is the real reason Haskell has no such syntax; if you need this kind of thing you can express it as a function). It's not the only way to do so, and experienced Haskell programmers would probably use other standard library functions to do this kind of job, rather than writing while.
But I think it's useful to see how you can express this kind of thing in Haskell. It does show that you can't translate things like imperative loops directly into Haskell; I didn't end up translating your loop in terms of my while because it ends up pretty pointless; you never use the result of function1 or function2, they're called with no arguments so they'd always return the same thing in every iteration, and function3 likewise always returns the same thing, and can only return true or false to either cause while to keep looping or stop, with no information resulting.
Presumably in the C++ program they're all using side effects to actually get some work done. If they operate on in-memory things then you need to translate a bigger chunk of your program at once to Haskell for the translation of this loop to make any sense. If those functions are doing IO then you'll need to do this in the IO monad in Haskell, for which my while function doesn't work, but you can do something similar.
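Two sketches to go with the above, both only illustrations. The first shows the kind of standard functions one might reach for instead of my while: Prelude's until, and iterate with takeWhile. The second is one possible monadic analogue of while for use in IO; whileM and demo are names invented here, not library functions, and the IORef counter is just a stand-in for real IO work.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)

-- until p f x applies f until p holds. Unlike the while above, it tests
-- the starting value before applying f for the first time.
countUp :: Int
countUp = until (>= 20) (+ 1) 0

-- iterate plus takeWhile exposes every intermediate state as a lazy list.
steps :: [String]
steps = takeWhile ((< 20) . length) (iterate (++ "zob") "")

-- Hypothetical monadic version of while: the step may perform effects.
whileM :: Monad m => (s -> Bool) -> (s -> m s) -> s -> m s
whileM check step state = do
  next <- step state
  if check next
    then whileM check step next
    else pure next

-- Count how many times the step ran, using an IORef as the side effect.
demo :: IO (Int, Int)
demo = do
  calls <- newIORef (0 :: Int)
  final <- whileM (< 5) (\n -> modifyIORef' calls (+ 1) >> pure (n + 1)) 0
  n <- readIORef calls
  pure (final, n)
```

In the IO version, the loop state is still passed along explicitly as an argument; only the effects of each step are handled by the monad.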
[1] As an aside, it's worth trying to understand that "you can't change variables" in Haskell isn't just an arbitrary restriction, nor is it just an acceptable trade off for the benefits of purity, it is a concept that doesn't make sense the way Haskell wants you to think about Haskell code. You're writing down expressions that result from evaluating functions on certain arguments: in f x = x + 1 you're saying that f x is x + 1. If you really think of it that way rather than thinking "f takes x, then adds one to it, then returns the result" then the concept of "having side effects" doesn't even apply; how could something existing and being equal to something else somehow change a variable, or have some other side effect?
You should write a solution to your problem in a more functional approach.
However, some code in Haskell works a lot like imperative looping: take for example state monads, tail recursion, until, foldr, etc.
A simple example is the factorial. In C, you would write a loop where in haskell you can for example write fact n = foldr (*) 1 [2..n].
If you've two functions f :: a -> b and g :: b -> c where a, b, and c are types like String or [Int] then you can compose them simply by writing g . f.
If you wish them to loop over a list or vector you could write map (g . f) or V.map (g . f), assuming you've done import qualified Data.Vector as V.
Example: I wish to print a list of markdown headings like ## <number>. <heading> ##, numbered with Roman numerals starting from 1, and my list of headings has type [(String,Double)], where the Double is irrelevant.
import Data.List
import Text.Numeral.Roman
let fun = zipWith (\a b -> "## " ++ a ++ ". " ++ b ++ " ##\n") (map toRoman [1..]) . map fst
fun [("Foo",3.5),("Bar",7.1)]
What the hell does this do?
toRoman turns a number into a string containing the Roman numeral. map toRoman does this to every element of a list. map toRoman [1..] does it to every element of the lazy infinite list [1,2,3,4,..], yielding a lazy infinite list of Roman numeral strings.
fst :: (a,b) -> a simply extracts the first element of a tuple. map fst throws away the irrelevant Double along the entire list.
\a b -> "## " ++ a ++ ". " ++ b ++ " ##\n" is a lambda expression that takes two strings and concatenates them together within the desired formatting strings.
zipWith :: (a -> b -> c) -> [a] -> [b] -> [c] takes a two-argument function like our lambda expression and feeds it pairs of elements from its own second and third arguments.
You'll observe that zip, zipWith, etc. only read as much of the lazy infinite list of Roman numerals as needed for the list of headings, meaning I've numbered my headings without maintaining any counter variable.
Finally, I have declared fun without naming its argument because the compiler can figure it out from the fact that map fst requires one argument. You'll notice that I put a . before my second map too. I could've written (map fst h) or $ map fst h instead if I'd written fun h = ..., but leaving the argument off fun meant I needed to compose it with zipWith after applying zipWith to two of the three arguments zipWith wants.
I'd hope the compiler combines the zipWith and maps into one single loop via inlining.
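If you want to try this without installing a Roman numeral package, here is a self-contained sketch. The toRoman below is a quick stand-in written for this example (positive numbers only), not the library's function, and headings plays the role of fun.

```haskell
-- Stand-in for a library toRoman; only meant for positive numbers.
toRoman :: Int -> String
toRoman 0 = ""
toRoman n = sym ++ toRoman (n - val)
  where
    -- Largest table entry that still fits into n.
    (val, sym) = head (filter ((<= n) . fst) table)
    table =
      [ (1000, "M"), (900, "CM"), (500, "D"), (400, "CD")
      , (100, "C"), (90, "XC"), (50, "L"), (40, "XL")
      , (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I") ]

-- Number the headings by zipping them with an infinite list of numerals.
headings :: [(String, Double)] -> [String]
headings =
  zipWith (\a b -> "## " ++ a ++ ". " ++ b ++ " ##") (map toRoman [1 ..]) . map fst
```

For example, headings [("Foo",3.5),("Bar",7.1)] gives ["## I. Foo ##","## II. Bar ##"]; use mapM_ putStrLn on the result to print one heading per line.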