Let's say I have a function which does some computation over several cases, implemented in the form of pattern matching.
Most of these patterns perform (along with other things that differ from one to another) a treatment on a parameter, for which I use an intermediary variable in a let expression. But I find it really redundant to have the same let in many patterns, and I wonder if there is a way to define a single let for several patterns.
Here is an example of my duplicated let:
data MyType a = Something a | Another Int [a]
myFunc (Something x) = -- return something, this isn't the point here
myFunc (Another 0 xs) =
let intermediary = some $ treatment xs
in doSthg intermediary 1
myFunc (Another 1 (x:xs)) =
let intermediary = some $ treatment xs
in doSthg1 intermediary 1 x
myFunc (Another 2 (x:x':xs)) =
let intermediary = some $ treatment xs
in doSthg2 intermediary 2 x x'
You can see that the parameter xs is always present when I use it for intermediary, and this could be factorised.
It could easily be achieved by using a helper function but I was wondering if what I am asking is possible without one. Please try to keep it simple for a beginner, and I hope my example is clear enough.
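For reference, the helper-function version I have in mind would look something like this sketch (go is just a placeholder name, and drop n ys reproduces the leftover list that each original pattern calls xs):
myFunc' (Another n ys) = go n ys (some $ treatment (drop n ys))
  where
    go 0 _        i = doSthg i 1
    go 1 (x:_)    i = doSthg1 i 1 x
    go 2 (x:x':_) i = doSthg2 i 2 x x'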
This particular problem can be worked around as follows:
myFunc2 (Something x) = returnSomething x
myFunc2 (Another n ys) =
let xs = drop n ys
x = head ys
x' = head (tail ys)
intermediate = some $ treatment xs
in case n of
0 -> doSomething intermediate n
1 -> doSomething1 intermediate n x
2 -> doSomething2 intermediate n x x'
Thanks to lazy evaluation, x and x' will only be evaluated if their values are needed.
However - and this is a big however! - your code will give a runtime error when you try to call myFunc2 (Another 2 []) (if doSomething2 actually uses x!), because to find out what x is, we need to evaluate head ys - and that'll crash for an empty list. The code you gave as an example also won't work (another runtime error) for Another 2 [], since there's no matching pattern, but there it's easier to supply a fall-back case.
This might not be a problem if you control the input and always make sure that the list in Another is long enough, but it's important to be aware of this issue!
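A sketch of a safer variant (same names as above, with fallback as a placeholder for whatever default or error you want): pattern match on the list inside the case instead of using the partial head and tail, so that myFunc3 (Another 2 []) falls through to the catch-all instead of crashing.
myFunc3 (Something x) = returnSomething x
myFunc3 (Another n ys) =
  let intermediate = some $ treatment (drop n ys)
  in case (n, ys) of
       (0, _)      -> doSomething intermediate n
       (1, x:_)    -> doSomething1 intermediate n x
       (2, x:x':_) -> doSomething2 intermediate n x x'
       _           -> fallback -- placeholder for a default value or error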
Related
I found this statement while studying Functional Reactive Programming, in "Plugging a Space Leak with an Arrow" by Hai Liu and Paul Hudak (page 5):
Suppose we wish to define a function that repeats its argument indefinitely:
repeat x = x : repeat x
or, in lambdas:
repeat = λx → x : repeat x
This requires O(n) space. But we can achieve O(1) space by writing instead:
repeat = λx → let xs = x : xs
in xs
The difference here seems small, but it hugely improves the space efficiency. Why and how does that happen? The best guess I've made is to evaluate them by hand:
r = \x -> x: r x
r 3
-> 3: r 3
-> 3: 3: 3: ........
-> [3,3,3,......]
As above, we need to create infinitely many new thunks for this recursion. Then I try to evaluate the second one:
r = \x -> let xs = x:xs in xs
r 3
-> let xs = 3:xs in xs
-> xs, according to the definition above:
-> 3:xs, where xs = 3:xs
-> 3:xs:xs, where xs = 3:xs
In the second form, xs appears once and can be shared among all the places where it occurs, so I guess that's why it requires only O(1) space rather than O(n). But I'm not sure whether I'm right or not.
BTW: the term "shared" comes from page 4 of the same paper:
The problem here is that the standard call-by-need evaluation rules
are unable to recognize that the function:
f = λdt → integralC (1 + dt) (f dt)
is the same as:
f = λdt → let x = integralC (1 + dt) x in x
The former definition causes work to be repeated in the recursive call
to f, whereas in the latter case the computation is shared.
It's easiest to understand if you picture the structures in memory:
The first version
repeat x = x : repeat x
creates a chain of (:) constructors ending in a thunk which will replace itself with more constructors as you demand them. Thus, O(n) space.
The second version
repeat x = let xs = x : xs in xs
uses let to "tie the knot", creating a single (:) constructor which refers to itself. Thus, O(1) space.
Put simply, variables are shared, but function applications are not. In
repeat x = x : repeat x
it is a coincidence (from the language's perspective) that the (co)recursive call to repeat is with the same argument. So, without additional optimization (which is called static argument transformation), the function will be called again and again.
But when you write
repeat x = let xs = x : xs in xs
there are no recursive function calls. You take an x, and construct a cyclic value xs using it. All sharing is explicit.
If you want to understand it more formally, you need to familiarize yourself with the semantics of lazy evaluation, such as A Natural Semantics for Lazy Evaluation.
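A minimal sketch contrasting the two definitions (repeatApp and repeatKnot are names I made up for the comparison):
repeatApp, repeatKnot :: a -> [a]
repeatApp x = x : repeatApp x         -- a fresh (:) cell for every element demanded
repeatKnot x = let xs = x : xs in xs  -- a single (:) cell that refers to itself
Both produce the same infinite list - take 5 (repeatKnot 3) gives [3,3,3,3,3] - but only the second lives in memory as a one-cell cycle.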
Your intuition about xs being shared is correct. To restate the author's example in terms of repeat, instead of integral, when you write:
repeat x = x : repeat x
the language does not recognize that the repeat x on the right is the same as the value produced by the expression x : repeat x. Whereas if you write
repeat x = let xs = x : xs in xs
you're explicitly creating a structure that when evaluated looks like this:
{hd: x, tl: *}
 ^          |
 \__________/
I was trying to implement a Haskell function that takes as input an array of integers A and produces another array B = [A[0], A[0]+A[1], A[0]+A[1]+A[2], ...]. I know that scanl from Data.List can be used for this with the function (+). I wrote the second implementation (which performs faster) after seeing the source code of scanl. I want to know why the first implementation is slower than the second one, despite being tail-recursive.
-- This function is slow.
ps s x [] = x
ps s x y = ps s' x' y'
where
s' = s + head y
x' = x ++ [s']
y' = tail y
-- This function is fast.
ps' s [] = []
ps' s y = [s'] ++ (ps' s' y')
where
s' = s + head y
y' = tail y
Some details about the above code:
Implementation 1: It should be called as
ps 0 [] a
where 'a' is your array.
Implementation 2: It should be called as
ps' 0 a
where 'a' is your array.
You are changing the way that ++ associates. In your first function you are computing ((([a0] ++ [a1]) ++ [a2]) ++ ...) whereas in the second function you are computing [a0] ++ ([a1] ++ ([a2] ++ ..)). Appending a few elements to the start of the list is O(1), whereas appending a few elements to the end of a list is O(n) in the length of the list. This leads to a linear versus quadratic algorithm overall.
You can fix the first example by building the list up in reverse order and then reversing it again at the end, or by using something like dlist. However, the second will still be better for most purposes. While tail calls do exist and can be important in Haskell, if you are familiar with a strict functional language like Scheme or ML, your intuition about how and when to use them is completely wrong.
The second example is better, in large part, because it's incremental; it immediately starts returning data that the consumer might be interested in. If you just fixed the first example using the double-reverse or dlist tricks, your function will traverse the entire list before it returns anything at all.
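For illustration, the double-reverse fix mentioned above could look like this (psRev is a hypothetical name; it builds the result backwards with O(1) conses and reverses once at the end):
psRev :: Num a => a -> [a] -> [a]
psRev s0 = reverse . go s0 []
  where
    go _ acc []     = acc
    go s acc (x:xs) = let s' = s + x in go s' (s' : acc) xs
As noted, this still traverses the whole input before returning anything, so the incremental ps' remains preferable for most uses.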
I would like to mention that your function can be more easily expressed as
drop 1 . scanl (+) 0
Usually, it is a good idea to use predefined combinators like scanl in favour of writing your own recursion schemes; it improves readability and makes it less likely that you needlessly squander performance.
However, in this case, both my scanl version and your original ps and ps' can sometimes lead to stack overflows due to lazy evaluation: Haskell does not necessarily immediately evaluate the additions (depends on strictness analysis).
One case where you can see this is if you do last (ps' 0 [1..100000000]). That leads to a stack overflow. You can solve that problem by forcing Haskell to evaluate the additions immediately, for instance by defining your own, strict scanl:
myscanl :: (b -> a -> b) -> b -> [a] -> [b]
myscanl f q [] = []
myscanl f q (x:xs) = q `seq` let q' = f q x in q' : myscanl f q' xs
ps' = myscanl (+) 0
Then, calling last (ps' [1..100000000]) works.
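Alternatively, recent versions of base (4.8 and later, I believe) export a strict left scan, scanl', from Data.List, so you may not need to write your own:
import Data.List (scanl')

ps'' :: Num a => [a] -> [a]  -- ps'' is just a name for this variant
ps'' = drop 1 . scanl' (+) 0
Then last (ps'' [1..100000000]) also works without a stack overflow.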
I need to get the nth element of a list but without using the !! operator. I am extremely new to Haskell, so I'd appreciate it if you can answer in more detail and not just one line of code. This is what I'm trying at the moment:
nthel:: Int -> [Int] -> Int
nthel n xs = 0
let xsxs = take n xs
nthel n xs = last xsxs
But I get: parse error (possibly incorrect indentation)
There's a lot that's a bit off here,
nthel :: Int -> [Int] -> Int
is technically correct, really we want
nthel :: Int -> [a] -> a
So we can use this on lists of anything (Optional)
nthel n xs = 0
What you just said is "no matter what you give to nthel, return 0", which is clearly wrong.
let xsxs = ...
This is just not legal Haskell. let ... in ... is an expression; it can't be used at the top level.
From there I'm not really sure what that's supposed to do.
Maybe this will help put you on the right track
nthelem n [] = <???> -- error case, empty list
nthelem 0 xs = head xs
nthelem n xs = <???> -- recursive case
Try filling in the <???> with your best guess and I'm happy to help from there.
Alternatively you can use Haskell's "pattern matching" syntax for lists.
That changes our above to
nthelem n [] = <???> -- error case, empty list
nthelem 0 (x:xs) = x --bind x to the first element, xs to the rest of the list
nthelem n (x:xs) = <???> -- recursive case
Doing this is handy since it removes the need for explicit head and tail calls.
I think you meant this:
nthel n xs = last xsxs
where xsxs = take n xs
... which you can simplify as:
nthel n xs = last (take n xs)
I think you should avoid using last whenever possible - lists are made to be used from the "front end", not from the back. What you want is to get rid of the first n elements and then take the head of the remaining list (of course, you get an error if the rest is empty). Note that this counts positions from 0, whereas last (take n xs) counts from 1; pick whichever convention you meant. You can express it quite directly as:
nthel n xs = head (drop n xs)
Or shorter:
nthel n = head . drop n
Or slightly crazy:
nthel = (head .) . drop
As you know, lists aren't naturally indexed, but this can be overcome using a common trick.
Try, in GHCi, zip [0..] "hello". What about zip [0,1,2] "hello" or zip [0..10] "hello"?
Starting from this observation, we can now easily obtain a way to index our list.
Moreover, it is a good illustration of the use of laziness, a good hint for your learning process.
Then, based on this and using pattern matching, we can provide an efficient algorithm:
Manage the boundary cases (empty list, negative index).
Replace the list by an indexed version using zip.
Call a helper function designed to process our indexed list recursively.
Now for the helper function: the list can't be empty, so we can pattern match naively, and,
if our index is equal to n, we have a winner;
else, if the rest of the list is empty, it's over;
else, call the helper function on the rest of the list.
As an additional note, since our function can fail (empty list, ...), it would be a good idea to wrap our result in the Maybe type.
Putting this all together, we end up with:
nth :: Int -> [a] -> Maybe a
nth n xs
| null xs || n < 0 = Nothing
| otherwise = helper n zs
where
zs = zip [0..] xs
helper n ((i,c):zs)
| i == n = Just c
| null zs = Nothing
| otherwise = helper n zs
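A quick sanity check of nth in GHCi (outputs shown as I'd expect them):
ghci> nth 0 "hello"
Just 'h'
ghci> nth 4 "hello"
Just 'o'
ghci> nth 5 "hello"
Nothing
ghci> nth (-1) "hello"
Nothing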
I am trying to write a function to find the index of a given element using tail recursion. Let's say the list contains the numbers 1 through 10, and I am searching for 5; then the output should be 4. The problem I am having is 'counting' using tail recursion. However, I am not even sure if I need to manually 'count' the number of recursive calls in this case. I tried using !!, which does not help because it returns the element at a particular position. I need the function to return the position of a particular element (the exact opposite).
I have been trying to figure this one out for hours now.
Code:
whatIndex a [] = error "cannot search empty list"
whatIndex a (x:xs) = foo a as
where
foo m [] = error "empty list"
foo m (y:ys) = if m==y then --get index of y
else foo m ys
Note: I am trying to implement this without using library functions
Your helper function needs an additional parameter for the count.
whatIndex a as = foo as 0
where
foo [] _ = error "empty list"
foo (y:ys) c
| a == y = c
| otherwise = foo ys (c+1)
BTW, it's better form to give this function a Maybe return type instead of using errors. That's how elemIndex works too, for good reason. This would look like
whatIndex a as = foo as 0
where
foo [] _ = Nothing
foo (y:ys) c
| a == y = Just c
| otherwise = foo ys (c+1)
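With the Maybe version, the example from the question works out like this in GHCi:
ghci> whatIndex 5 [1..10]
Just 4
ghci> whatIndex 11 [1..10]
Nothing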
Note: I am trying to implement this without using library functions
This is not a good idea in general. A better exercise is this:
1. Figure out how to implement it using library functions.
2. Figure out how to implement, on your own, whichever library functions you used in step 1.
This way you're learning three key skills:
What the standard library functions are, and examples of when they are useful.
How to break problems into smaller pieces.
How to write basic functions like the ones in the libraries.
In this case, however, your whatIndex is more or less the same function as elemIndex in Data.List, so your problem reduces to writing your own version of this library function.
The trick here is that you want to increment a counter while you recurse down the list. There is a standard technique for writing tail recursive functions, which is called an accumulating parameter. It works like this:
You write an auxiliary function that, compared to the "front-end" function, takes an extra parameter (or more) to keep track of the extra information.
You then define the "real" function as a call to the auxiliary one.
So for elemIndex, the auxiliary function would be something like this (with i as the accumulating parameter for the current element index):
-- I'll leave the blanks for you to fill.
elemIndex' i x [] = ...
elemIndex' i x (x':xs) = ...
Then the "driver" function is this:
elemIndex x xs = elemIndex' 0 x xs
But there is a serious problem here that I must mention: getting this function to perform well in Haskell is tricky. Tail recursion is a useful trick in strict (non-lazy) functional languages, but not so much in Haskell, because:
A tail-recursive function in Haskell can still blow the stack,
A non-tail-recursive function can run in constant space.
This older answer of mine shows an example of the second point.
So in your case, a non-tail-recursive solution is probably the easiest one you can give that will run in constant space (i.e., not blow the stack on a long list):
elemIndex x xs = elemIndex' x (zip xs [0..])
elemIndex' x pairs = fmap snd (find (\(x', _) -> x == x') pairs)
-- | Combine two lists by pairing together their first elements, their second
-- elements, etc., until one of the lists runs out.
--
-- EXERCISE: write this function on your own!
zip :: [a] -> [b] -> [(a, b)]
zip xs ys = ...
-- | Return the first element x of xs such that pred x == True. Returns Nothing if
-- there isn't one, Just x if there is one.
--
-- EXERCISE: write this function on your own!
find :: (a -> Bool) -> [a] -> Maybe a
find pred xs = ...
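As a quick check, using the zip and find that already ship with the Prelude and Data.List (instead of the versions from the exercise), the sketch behaves like this in GHCi (note the fmap snd above, since find returns a Maybe):
ghci> import Data.List (find)
ghci> let elemIndex' x pairs = fmap snd (find (\(x', _) -> x == x') pairs)
ghci> let elemIndex x xs = elemIndex' x (zip xs [0..])
ghci> elemIndex 5 [1..10]
Just 4
ghci> elemIndex 42 [1..10]
Nothing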
I'm just wondering about a recursion function I'm laying out in Haskell. Is it generally better to use guards than patterns for recursion functions?
I'm just not sure what the best layout is, but I do know that patterns are better when defining functions such as this:
units :: Int -> String
units 0 = "zero"
units 1 = "one"
is much preferred to
units n
| n == 0 = "zero"
| n == 1 = "one"
I'm just not sure, though, whether when it comes to recursion this is the same or different.
I'm not quite sure on the terminology; I'm using something like this:
f y [] = []
f y (x:xs)
| y == 0 = ......
| otherwise = ......
or would this be better?
f y [] = []
f 0 (x:xs) =
f y (x:xs) =
My general rule of thumb would be this:
Use pattern matching when the guard would be a simple == check.
With recursion, you usually are checking for a base case. So if your base case is a simple == check, then use pattern matching.
So I'd generally do this:
map f [] = []
map f (x:xs) = f x : map f xs
Instead of this (null simply checks if a list is empty. It's basically == []):
map f xs | null xs = []
| otherwise = f (head xs) : map f (tail xs)
Pattern matching is meant to make your life easier, imho, so in the end you should do what makes sense to you. If you work with a group, then do what makes sense to the group.
[update]
For your particular case, I'd do something like this:
f _ [] = []
f 0 _ = ...
f y (x:xs) = ...
Pattern matches, like guards, are tried from top to bottom, stopping at the first definition that matches the input. I used the underscore symbol to indicate that for the first pattern match, I didn't care what the y argument was, and for the second pattern match, I didn't care what the list argument was (although, if you do use the list in that computation, then you should not use the underscore). Since these are still fairly simple ==-like checks, I'd personally stick with pattern matching.
But I think it's a matter of personal preference; your code is perfectly readable and correct as it is. If I'm not mistaken, when the code is compiled, both guards and pattern matches get turned into case statements in the end.
A simple rule
If you are recursing on a data structure, use pattern matching
If your recursive condition is more complex, use guards.
Discussion
Fundamentally, it depends on the test you wish to do to guard the recursion. If it is a test on the structure of a data type, use pattern matching, as it will be more efficient than redundant testing for equality.
For your example, pattern matching on the integers is obviously cleaner and more efficient:
units 0 = "zero"
units 1 = "one"
The same goes for recursive calls on any data type, where you distinguish cases via the shape of the data.
Now, if you had more complicated logical conditions, then guards would make sense.
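For example (a made-up function, purely to illustrate), range tests and arithmetic conditions cannot be expressed as patterns, so guards are the natural fit:
classify :: Int -> String
classify n
  | n < 0     = "negative"
  | n == 0    = "zero"
  | n < 10    = "small"
  | even n    = "big and even"
  | otherwise = "big and odd"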
There aren't really hard and fast rules on this, which is why the answers you've gotten were a bit hazy. Some decisions are easy, like pattern matching on [] instead of guarding with f xs | null xs = ... or, heaven forbid, f xs | length xs == 0 = ... which is terrible in multiple ways. But when there's no compelling practical issue, just use whichever makes the code clearer.
As an example, consider these functions (that aren't really doing anything useful, just serving as illustrations):
f1 _ [] = []
f1 0 (x:xs) = [[x], xs]
f1 y (x:xs) = [x] : f1 (y - 1) xs
f2 _ [] = []
f2 y (x:xs) | y == 0 = calc 1 : f2 (- x) xs
| otherwise = calc (1 / y) : f2 (y * x) xs
where calc z = x * ...
In f1, the separate patterns emphasize that the recursion has two base cases. In f2, the guards emphasize that 0 is merely a special case for some calculations (most of which are done by calc, defined in a where clause shared by both branches of the guard) and doesn't change the structure of the computation.
@Dan is correct: it's basically a matter of personal preference and doesn't affect the generated code. This module:
module Test where
units :: Int -> String
units 0 = "zero"
units 1 = "one"
unitGuarded :: Int -> String
unitGuarded n
| n == 0 = "zero"
| n == 1 = "one"
produced the following core:
Test.units =
\ (ds_dkU :: GHC.Types.Int) ->
case ds_dkU of _ { GHC.Types.I# ds1_dkV ->
case ds1_dkV of _ {
__DEFAULT -> Test.units3;
0 -> Test.unitGuarded2;
1 -> Test.unitGuarded1
}
}
Test.unitGuarded =
\ (n_abw :: GHC.Types.Int) ->
case n_abw of _ { GHC.Types.I# x_ald ->
case x_ald of _ {
__DEFAULT -> Test.unitGuarded3;
0 -> Test.unitGuarded2;
1 -> Test.unitGuarded1
}
}
Exactly the same, except for the different default case, which in both instances is a pattern match error. GHC even commoned-up the strings for the matched cases.
The answers so far do not mention the advantage of pattern matching that is the most important one for me: the ability to safely implement total functions.
When doing pattern matching you can safely access the internal structure of the object without the fear of this object being something else. In case you forget some of the patterns, the compiler can warn you (unfortunately this warning is off by default in GHC).
For example, when writing this:
map f xs | null xs = []
| otherwise = f (head xs) : map f (tail xs)
You are forced to use non-total functions head and tail, thus risking the life of your program. If you make a mistake in guard conditions, the compiler can't help you.
On the other hand, if you make an error with pattern matching, the compiler can give you an error or a warning depending on how bad your error was.
Some examples:
-- compiles, crashes at runtime
map f xs | not (null xs) = []
| otherwise = f (head xs) : map f (tail xs)
-- does not compile at all
map f (h:t) = []
map f [] = f h : map f t
-- does not give any warnings
map f xs = f (head xs) : map f (tail xs)
-- can give a warning of non-exhaustive pattern match
map f (h:t) = f h : map f t
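As a final note, you have to ask GHC for that warning. A minimal sketch (the flag is spelled -Wincomplete-patterns in current GHC and -fwarn-incomplete-patterns in older versions; both are also implied by -Wall):
{-# OPTIONS_GHC -Wincomplete-patterns #-}
module Example where

-- GHC now warns that the pattern match is non-exhaustive: map' has no case for []
map' :: (a -> b) -> [a] -> [b]
map' f (h:t) = f h : map' f t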