I'm new to Haskell. I understand that functions are curried to become functions that take one parameter. What I don't understand is how pattern matching against multiple values can be achieved when this is the case. For example:
Suppose we have the following completely arbitrary function definition:
myFunc :: Int -> Int -> Int
myFunc 0 0 = 0
myFunc 1 1 = 1
myFunc x y = x `someoperation` y
Is the partially applied function returned by myFunc 0 essentially:
partiallyAppliedMyFunc :: Int -> Int
partiallyAppliedMyFunc 0 = 0
partiallyAppliedMyFunc y = 0 `someoperation` y
Thus removing the extraneous pattern that can't possibly match? Or.... what's going on here?
Actually, this question is more subtle than it may appear on the surface, and involves learning a little bit about compiler internals to really answer properly. The reason is that we sort of take for granted that we can have nested patterns and patterns over more than one term, when in reality for the purposes of a compiler the only thing you can do is branch on the top-level constructor of a single value. So the first stage of the compiler is to turn nested patterns (and patterns over more than one value) into simpler patterns. For example, a naive algorithm might transform your function into something like this:
myFunc = \x y -> case x of
  0 -> case y of
    0 -> 0
    _ -> x `someoperation` y
  1 -> case y of
    1 -> 1
    _ -> x `someoperation` y
  _ -> x `someoperation` y
You can already see there are several suboptimal things here: the someoperation term is repeated a lot, and the function demands both arguments before it will even start to do a case at all; see A Term Pattern-Match Compiler Inspired by Finite Automata Theory for a discussion of how you might improve on this.
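One such improvement, as a sketch: bind the repeated default branch once and jump to it (roughly what GHC's core-level "join points" do). Since someoperation is left abstract in the original, I've instantiated it with (+) here just so the code runs; that choice is my own:

```haskell
-- Sketch: the repeated default branch is bound once and reused.
-- `someoperation` is abstract in the original; (+) is a stand-in.
myFunc :: Int -> Int -> Int
myFunc = \x y ->
  let def = x `someoperation` y   -- shared default branch
  in case x of
       0 -> case y of
              0 -> 0
              _ -> def
       1 -> case y of
              1 -> 1
              _ -> def
       _ -> def
  where someoperation = (+)
```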
Anyway, in this naively-compiled form, it should be a bit clearer how the currying step happens. We can substitute 0 for x directly in this expression to see what myFunc 0 does:
myFunc 0 = \y -> case 0 of
  0 -> case y of
    0 -> 0
    _ -> 0 `someoperation` y
  1 -> case y of
    1 -> 1
    _ -> 0 `someoperation` y
  _ -> 0 `someoperation` y
Since this is still a lambda, no further reduction is done. You might hope that a sufficiently smart compiler could do a bit more, but GHC explicitly does not do more; if you want more computation to be done after supplying only one argument, you have to change your definition. (There's a time/space tradeoff here, and choosing correctly is too hard to do reliably. So GHC leaves it in the programmer's hands to make this choice.) For example, you could explicitly write
myFunc 0 = \y -> case y of
  0 -> 0
  _ -> 0 `someoperation` y
myFunc 1 = \y -> case y of
  1 -> 1
  _ -> 1 `someoperation` y
myFunc x = \y -> x `someoperation` y
and then myFunc 0 would reduce to a much smaller expression.
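To make the payoff concrete: with this definition, the match on the first argument happens once, when the partial application is bound. Again, someoperation is abstract in the original, so (+) below is a stand-in of my choosing:

```haskell
-- The explicit-lambda style: matching on the first argument
-- is finished as soon as you apply one argument.
myFunc :: Int -> Int -> Int
myFunc 0 = \y -> case y of
  0 -> 0
  _ -> 0 + y
myFunc 1 = \y -> case y of
  1 -> 1
  _ -> 1 + y
myFunc x = \y -> x + y

main :: IO ()
main = do
  let g = myFunc 1         -- the outer pattern match happens here, once
  print (map g [0, 1, 2])  -- [1,1,3]: the special case fires only for y == 1
```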
Related
I'm just getting started with Haskell. I'm trying to create a function that imitates the standard replicate function in Haskell, but using recursion. For example,
Prelude> replicate 3 "Ha!"
["Ha!","Ha!","Ha!"]
It should be of type Int -> a -> [a]. So far I have:
myReplicate :: Int -> a -> [a]
myReplicate x y = y : myReplicate (x-1) y
myReplicate 0 y = [ ]
However, my function always generates infinite lists:
Prelude> myReplicate 3 "Ha!"
["Ha!","Ha!","Ha!","Ha!","Ha!","Ha!","Ha!",...
You have to put the second case before the first; otherwise, it will never reach the second case.
myReplicate :: Int -> a -> [a]
myReplicate 0 y = [ ]
myReplicate x y = y : myReplicate (x-1) y
Your code should generate a warning reading (in GHC, at least):
Pattern match(es) are overlapped
In an equation for 'myReplicate': myReplicate 0 y = ...
What is happening is that the code tries to match your input against each equation you've written, in the order you wrote them (top to bottom). When you write f x = ..., the x variable will match any value of its type. If all the patterns in an equation match, that equation is used.
In your case, the first equation is myReplicate x y = y : myReplicate (x-1) y. As I said, x and y will match any value you pass, including 0 for the x binding. The solution proposed by @Alec shows how you can avoid this problem: write the most specific pattern first and the catch-all pattern last.
Another solution is using guards:
myReplicate :: Int -> a -> [a]
myReplicate x y
  | x > 0     = y : myReplicate (x-1) y
  | x == 0    = []
  | otherwise = [] -- or throw an exception, or use Maybe
This way you can write the expressions in any order, as long as you write the conditions properly (in other words, as long as the conditions are mutually exclusive). Note that the conditions are still evaluated from the top down, until one of them is true, much like an if ... else if ... else if ... else ... chain in an imperative language.
You can use map:
myReplicate :: Int -> a -> [a]
myReplicate n x = map (const x) [1..n]
You can also use $> from Data.Functor:
import Data.Functor
myReplicate :: Int -> a -> [a]
myReplicate n x = [1..n] $> x
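As a quick sanity check (the names below are my own renamings, so the three variants can coexist in one file), all of the above agree with the Prelude's replicate:

```haskell
import Data.Functor (($>))

myReplicateG, myReplicateM, myReplicateF :: Int -> a -> [a]
myReplicateG x y                            -- guarded version
  | x > 0     = y : myReplicateG (x - 1) y
  | otherwise = []
myReplicateM n x = map (const x) [1 .. n]   -- map version
myReplicateF n x = [1 .. n] $> x            -- Data.Functor version

main :: IO ()
main = print [ f 3 "Ha!" == replicate 3 "Ha!"
             | f <- [myReplicateG, myReplicateM, myReplicateF] ]
```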
I want to define a function that considers its equally-typed arguments without considering their order. For example:
weirdCommutative :: Int -> Int -> Int
weirdCommutative 0 1 = 999
weirdCommutative x y = x + y
I would like this function to actually be commutative.
One option is adding the pattern:
weirdCommutative 1 0 = 999
or even better:
weirdCommutative 1 0 = weirdCommutative 0 1
Now let's look at the general case: there could be more than two arguments, and/or more than two values that need to be considered without order, so enumerating all possible cases becomes tricky.
Does anyone know of a clean, natural way to define commutative functions in Haskell?
I want to emphasize that the solution I am looking for is general and cannot assume anything about the type (nor its deconstruction type-set) except that values can be compared using == (meaning that the type is in the Eq typeclass but not necessarily in the Ord typeclass).
There is actually a package that provides a monad and some scaffolding for defining and using commutative functions. Also see this blog post.
In a case like Int, you can simply order the arguments and feed them to a (local) partial function that only accepts the arguments in that canonically ordered form:
weirdCommutative x y
  | x > y     = f y x
  | otherwise = f x y
  where f 0 1   = 999
        f x' y' = x' + y'
Now, obviously most types aren't in the Ord class – but if you're deconstructing the arguments by pattern-matching, chances are you can define at least some partial ordering. It doesn't really need to be >.
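For the fully general Eq-only case, one sketch (all names here are hypothetical, not from any library) is to keep the special cases in an association list and look each pair up in both orders before falling back:

```haskell
-- Sketch: special cases are tried in both argument orders,
-- so only Eq is required. `commutatively` is a made-up helper.
commutatively :: Eq a => [((a, a), b)] -> (a -> a -> b) -> a -> a -> b
commutatively special fallback x y =
  case lookup (x, y) special of
    Just r  -> r
    Nothing -> case lookup (y, x) special of
      Just r  -> r
      Nothing -> fallback x y

weirdCommutative :: Int -> Int -> Int
weirdCommutative = commutatively [((0, 1), 999)] (+)
```

With this, weirdCommutative 0 1 and weirdCommutative 1 0 both give 999, and every other pair falls through to (+). Note the fallback itself must be commutative for the whole function to be.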
How to indent correctly a nested case expression in haskell that would act like a nested loop in imperative programming ?
f x y = case x of
  1 -> case y of
    1 -> ...
    2 -> ...
    ...
  2 -> case y of
    ...
The compiler gives me an indentation error at the start of the second x case, so I'm guessing it doesn't understand that the first x case is over.
Not directly an answer, but maybe helpful nevertheless:
In this particular case, you could also write:
f 1 1 = ...
f 1 2 = ...
f 2 2 = ...
or, as a case expression:
f x y = case (x, y) of
  (1,1) -> ...
  (1,2) -> ...
  (2,1) -> ...
Your code seems OK. Haskell has a very simple rule of indentation, as explained in the wikibook:
Code which is part of some expression should be indented further in
than the beginning of that expression.
This works for me:
f x y = case x of
  1 -> case y of
    1 -> undefined
    2 -> undefined
  2 -> case y of
    1 -> undefined
You may want to check your editor to see if it is doing proper indentation. As #Tarmil suggested, always use spaces for indentation. More details on that here.
I had the same problem, and it was because I was using tabs for indentation. When I indented the code with spaces, it worked!
I'm a little new to Haskell, but this behavior is bizarre to me. If I have a simple function defined as follows:
foobar :: Integer -> [Integer] -> Integer
foobar x y = case y of
  (a:x:b) -> x
  _ -> -1
I'm basically expecting the function to evaluate to the first argument of foobar if y contains at least two elements and its second element equals the first argument of foobar; otherwise, I get -1. But in ghci:
foobar 5 [6,7]
gives me 7, not -1.
How do I make sense of this behavior?
What you are doing here is not "updating" the x variable but shadowing it: you are creating a new variable called x in the scope of the first branch of your case expression.

You cannot use a case expression to compare for equality the way I believe you are trying to. If that is your goal, you will need to do something like
foobar :: Integer -> [Integer] -> Integer
foobar x y = case y of
  (a:x':b) | x == x' -> x
  _ -> -1
You can tell that x is not destructively updated by adjusting your code like so:
foobar :: Integer -> [Integer] -> Integer
foobar x y = (case y of
    (a:x:b) -> x
    _ -> -1
  ) + x
The x at the end will use the original x value; it is not destroyed, rather, the x binding inside the case expression is shadowed. Calling foobar 5 [6,7] will produce 12, not 14.
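Putting the two variants side by side (the names are my own, so both can live in one file):

```haskell
-- Equality check via a pattern guard: no shadowing.
foobarEq :: Integer -> [Integer] -> Integer
foobarEq x y = case y of
  (_ : x' : _) | x == x' -> x
  _ -> -1

-- The shadowing version: the inner binding hides the outer x
-- only inside the case expression.
foobarShadow :: Integer -> [Integer] -> Integer
foobarShadow x y =
  (case y of
     (_ : x2 : _) -> x2  -- x2 plays the role of the shadowing x
     _ -> -1)
  + x

main :: IO ()
main = do
  print (foobarEq 5 [6, 7])      -- -1: second element 7 /= 5
  print (foobarEq 5 [6, 5])      -- 5: second element matches
  print (foobarShadow 5 [6, 7])  -- 12 = 7 + 5
```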
Consider an haskell-expression like the following: (Trivial example, don't tell me what the obvious way is! ;)
toBits :: Integral a => a -> [Bool]
toBits 0 = []
toBits n = x : toBits m where
  (m,y) = n `divMod` 2
  x = y /= 0
Because this function is not tail-recursive, one could also write:
toBits :: Integral a => a -> [Bool]
toBits = toBits' [] where
  toBits' l 0 = l
  toBits' l n = toBits (x : l) m where
    (m,y) = n `divMod` 2
    x = y /= 0
(I hope there is nothing wrong within this expression)
What I want to ask is which of these solutions is better. The advantage of the first one is that it can be evaluated partially very easily (because Haskell stops at the first `:` that is not needed), but the second solution is (obviously) tail-recursive; in my opinion, though, it needs to be evaluated completely before you can get anything out of it.
The background for this is my Brainfuck parser (see my optimization question), which is implemented very uglily (various reverse instructions... ooh), but can be implemented easily in the first - let's call it "semi-tail-recursion" - way.
I think you've got it all just right. The first form is in general better because useful output can be obtained from it before it has completed computation. That means that if 'toBits' is used in another computation, the compiler can likely combine them, and the list that is the output of 'toBits' may never exist at all, or perhaps just one cons cell at a time. It's nice that the first version is also clearer to read!
In Haskell, your first choice would typically be preferred (I would say "always," but you're always wrong when you use that word). The accumulator pattern is appropriate for when the output cannot be consumed incrementally (e.g. incrementing a counter).
Let me rename the second version and fix a few typos so that you can test the functions.
toBits :: Integral a => a -> [Bool]
toBits 0 = []
toBits n = x : toBits m where
  (m,y) = n `divMod` 2
  x = y /= 0

toBits2 :: Integral a => a -> [Bool]
toBits2 = toBits' [] where
  toBits' l 0 = l
  toBits' l n = toBits' (x : l) m where
    (m,y) = n `divMod` 2
    x = y /= 0
These functions don't actually compute the same thing; the first one produces a list starting with the least significant bit, while the second one starts with the most significant bit. In other words toBits2 = reverse . toBits, and in fact reverse can be implemented with exactly the same kind of accumulator that you use in toBits2.
If you want a list starting from the least significant bit, toBits is good Haskell style. It won't produce a stack overflow because the recursive call is contained inside the (:) constructor which is lazy. (Also, you can't cause a thunk buildup in the argument of toBits by forcing the value of a late element of the result list early, because the argument needs to be evaluated in the first case toBits 0 = [] to determine whether the list is empty.)
If you want a list starting from the most significant bit, either writing toBits2 directly or defining toBits and using reverse . toBits is acceptable. I would probably prefer the latter since it's easier to understand in my opinion and you don't have to worry about whether your reimplementation of reverse will cause a stack overflow.
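A quick property check of the toBits2 = reverse . toBits claim (using the renamed, corrected definitions from above):

```haskell
toBits :: Integral a => a -> [Bool]
toBits 0 = []
toBits n = x : toBits m where
  (m, y) = n `divMod` 2
  x = y /= 0

toBits2 :: Integral a => a -> [Bool]
toBits2 = toBits' [] where
  toBits' l 0 = l
  toBits' l n = toBits' (x : l) m where
    (m, y) = n `divMod` 2
    x = y /= 0

main :: IO ()
main = print (all (\n -> toBits2 n == reverse (toBits n)) [0 .. 1000 :: Integer])
```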