Programming style in OCaml - haskell

I have a question about the correct way to write efficient functional programs. Suppose I'm given a list s of positive ints, and I want to find the minimum element (or just 0 if empty). Then a generic functional program for doing this would look like
let rec minList = function
| [] -> failwith "minList: empty list"
| [x] -> x
| x :: t -> min x (minList t)
In a lazy language one can make this more efficient by adding an extra clause which terminates the recursion if a zero is found - this way s is only computed up to the first zero
let rec minList = function
| [] -> failwith "minList: empty list"
| [x] -> x
| x :: t -> if x = 0 then 0 else min x (minList t)
However, am I correct in believing that this sort of trick would not work in a strict evaluation language like OCaml, which would evaluate the whole of s before running minList? If so, what would be the correct way to optimize this in OCaml?
ADDITIONAL QUESTION: OK, so I understand that if expressions are always lazy. But what about the following, for example: I have a function on int lists again which first checks whether or not the ith element is zero, i.e.
f s = if s(i)==0 then 0 else g s
Here the input sequence s is present in both clauses of the if statement, but clearly for an efficient computation you would only want to evaluate s(i) in the first case. Here, would OCaml always evaluate all of s, even if the first case succeeds?

if expressions in OCaml don't follow the strict evaluation rule.
Like || and &&, they are evaluated lazily.
See this link: if expressions

In a strictly evaluated language, the whole list s would be evaluated. Still,
let rec minList = function
| [] -> 0
| x :: t -> if x = 0 then 0 else min x (minList t)
would not scan the whole list if a 0 is found.
The if construct has a "non-strict" semantics, in that it will evaluate only one branch, and not both. This holds in both strict and non strict languages.
An actual difference would be when calling a "user defined if" such as (using Haskell syntax):
myIf :: Bool -> a -> a -> a
myIf b x y = if b then x else y
In a non strict language, calling myIf True 3 (nonTerminatingFunction ()) would yield 3, while in a strict language the same expression would loop forever.
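A minimal Haskell sketch of the lazy case follows. Here undefined stands in for the non-terminating argument (an assumption made so that the example can finish), and lazyResult is a name introduced only for illustration.

```haskell
-- A user-defined conditional. Under lazy evaluation the unused branch is
-- never forced, so passing a diverging value there is harmless.
myIf :: Bool -> a -> a -> a
myIf b x y = if b then x else y

-- 'undefined' stands in for a computation that never finishes.
lazyResult :: Int
lazyResult = myIf True 3 undefined
```

In a strict language the third argument would be evaluated before myIf is entered, so the equivalent call would fail or loop before the condition is ever examined.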

First of all the minimum of an empty list is undefined, not 0. This makes sense, otherwise minList [1,2,3] would be 0 which is clearly not true. This is what ghci has to say:
Prelude> minimum []
*** Exception: Prelude.minimum: empty list
Hence your function should be written as:
let rec minList (x::t) = min x (minList t)
There are some problems with this definition though:
It will still give an error because there's no pattern match for the empty list.
It is not tail recursive.
It doesn't stop if the head is 0.
So here's a better solution:
let rec minimum x xs = match x, xs with
| 0, _ -> 0
| x, [] -> x
| x, (y :: ys) -> minimum (min x y) ys
let minList = function
| [] -> raise (Failure "No minimum of empty list")
| x :: t -> minimum x t
The advantage of writing it like this is that minimum is tail recursive, so it will not grow the stack. In addition, if the head is 0 it will immediately return 0.
Lazy evaluation plays no role here.

In almost every modern programming language:
for expr1 && expr2, if expr1 is already false, then expr2 won't be evaluated.
for expr1 || expr2, if expr1 is already true, then expr2 won't be evaluated.
OCaml does this too.
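The same short-circuit behaviour can be sketched in Haskell, where && and || are ordinary lazy functions rather than built-in operators as in OCaml. The names dividesEvenly and isZeroOrCrash are hypothetical and exist only for this illustration.

```haskell
-- The right operand is only evaluated when the left operand is True,
-- so a zero divisor never reaches `mod`.
dividesEvenly :: Int -> Int -> Bool
dividesEvenly d n = d /= 0 && n `mod` d == 0

-- The right operand is skipped entirely when the left one is already True.
isZeroOrCrash :: Int -> Bool
isZeroOrCrash n = n == 0 || error "right operand was evaluated"
```

Calling dividesEvenly 0 10 returns False without a division-by-zero error, and isZeroOrCrash 0 returns True without reaching the call to error.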

Related

Function with strict arguments

A corrected quiz in my textbook is asking me how many of f's arguments are strict, f being:
f x 0 z = x == z
f x y z = x
My initial thought was that all of f's arguments are to be considered strict, since y is being evaluated to check if its equal to 0, and x and z are compared to see that they're both equal.
And yet the answer is that only x and y are strict.
Any clues as to why?
First of all, you need a very precise definition of "strict" in order for this to make sense. A function f is strict iff evaluating f x to weak head normal form (WHNF) causes x to be evaluated to WHNF. The interaction this has with currying is a bit awkward, and I'm going to ignore some of the potential weirdness that introduces.
Assuming the type here is f :: Bool -> Int -> Bool -> Bool, your analysis of the behavior with respect to y is correct: evaluating f x y z to WHNF will always require evaluating y to determine which equation to choose. As that is the only factor determining which equation to use, we have to split the analysis for x and z. In the first equation, evaluating the result to WHNF results in both x and z being evaluated. In the second equation, evaluating the result to WHNF results in evaluating x to WHNF.
Since x is evaluated in both branches, this function is strict in x. This is a little bit amusing - it's strict in the way id is strict. But that's still valid! z, however, is a different story. Only one of the branches causes z to be evaluated, so it's not evaluated strictly - it's only evaluated on demand. Usually we talk about this happening where evaluation is guarded behind a constructor or when a function is applied and the result isn't evaluated, but being conditionally evaluated is sufficient. f True 1 undefined evaluates to True. If f was strict in z, that would have to evaluate to undefined.
It turns out that whether f is strict in its second argument depends on what type it gets resolved to.
Here's proof:
data ModOne = Zero
instance Eq ModOne where
    _ == _ = True -- after all, they're both Zero, right?
instance Num ModOne -- the method implementations literally don't matter
f x 0 z = x == z
f x y z = x
Now in ghci:
> f True (undefined :: ModOne) True
True
> f True (undefined :: Int) True
*** Exception: Prelude.undefined
And, in a related way, whether f is strict in its third argument depends on what values you pick for the first two. Proof, again:
> f True 1 undefined
True
> f True 0 undefined
*** Exception: Prelude.undefined
So, there isn't really a simple answer to this question! f is definitely strict in its first argument; but the other two are conditionally one or the other depending on circumstances.
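For reference, here is a small self-contained sketch of the case where the third argument is never inspected. It fixes the type to Bool -> Int -> Bool -> Bool as the first answer assumes; ignoresThird is a name introduced only for this example.

```haskell
-- The function from the question, with the type fixed to the one assumed in
-- the first answer.
f :: Bool -> Int -> Bool -> Bool
f x 0 z = x == z
f x y z = x

-- The second equation is chosen (y is 1), so z is never inspected.
ignoresThird :: Bool
ignoresThird = f True 1 undefined
```

Replacing the 1 with 0 selects the first equation, which compares x with z and therefore forces the undefined.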

correction to the code of mergesort and merge in haskell

merge ::(ord a) => [a] ->[a] -> [a]
merge [][] = []
merge [a][b] = [[a,b]|a<-mergesort [a],b<- mergesort [b]]
mergesort ::(a -> Bool) -> [a] -> [a]
mergesort [] = []
mergesort (x:xs) = if xs >=2 then mergesort xs else mergesort(x:xs)
| comparision > 0 = x:xs
| comparision <= 0 = xs:x
where comparision = x-xs
That is the code I've written for merge and mergesort, and of course it is not right.
Could you give me some advice on how to correct the code?
Please don't give me the answers....
As it stands, the only error the compiler gives is this one:
test.hs:8:34: error: parse error on input ‘|’
|
8 | | comparision > 0 = x:xs
| ^
Therefore I will assume this is the one you are stuck on, and talk about how to fix this problem.
The basic issue is that guards (that is, the syntactic form where there's a pipe character | followed by a condition) are only allowed at binding sites (like function equations or inside case statements). You've included yours at an incorrect position. It's not completely clear what you mean those guards to do, so I'm not too certain how to help you put them in the correct position.
Perhaps the best help I can give is to describe what they mean with some abstract examples, and let you figure out where you want them to go instead. Take this as an example:
f x y | cond1 = val1
      | cond2 = val2
      | cond3 = val3
This defines a new function named f, which takes two arguments. It names the arguments x and y. Then, to decide what value to return, it checks the guards, looking for the first one that evaluates to True. So if cond1 evaluates to True, the function returns val1; if cond1 evaluates to False but cond2 evaluates to True, the function returns val2; if cond1 and cond2 evaluate to False but cond3 evaluates to True, the function returns val3. (...and if all three conditions evaluate to False, it throws a runtime exception.)
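As a concrete instance of the abstract shape above, here is a small sketch that is deliberately unrelated to merge sort. The function describeSign is hypothetical, and the final otherwise guard means the runtime-exception case cannot occur.

```haskell
-- Guards are tried top to bottom; the first condition that is True wins.
describeSign :: Int -> String
describeSign n
  | n < 0     = "negative"
  | n == 0    = "zero"
  | otherwise = "positive"
```

Note that the guards sit directly after the function's arguments, before any = sign. That is the position the parser expects them in.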
Now let's look at the syntax you used:
mergesort (x:xs) = if xs >=2 then mergesort xs else mergesort(x:xs)
| comparision > 0 = x:xs
| comparision <= 0 = xs:x
Here you are defining a new function named mergesort that accepts one argument. It matches that argument against the pattern x:xs. Then it appears to return the result of if xs >=2 then mergesort xs else mergesort(x:xs). But what are these two extra guards doing? I don't know. Perhaps you are imagining that if comparision > 0, then the function will return x:xs instead of if xs >=2 then ... else .... If so, you should write it this way:
mergesort (x:xs) | comparision > 0 = x:xs
                 | comparision <= 0 = xs:x
                 | otherwise = if x >=2 then mergesort xs else mergesort(x:xs)
Or perhaps you are imagining that if comparision > 0, then the recursive call will use x:xs instead of (x:xs). If so, you should write it this way:
mergesort (x:xs) | comparision > 0 = if x >=2 then mergesort xs else mergesort(x:xs)
                 | comparision <= 0 = if x >=2 then mergesort xs else mergesort(xs:x)
I'm not sure what was intended.
Anyway, hopefully this helps you resolve your parse error, and gets you to the point where you can look at the next compiler errors and take a stab at fixing those yourself.
A couple questions to ask yourself:
What, specifically, are you trying to do with the functions merge and mergesort? Suppose I have a list I want sorted -- just one list, not yet split up. Which function should I call to use your implementation of the merge sort?
Based on the type signature mergesort :: (a -> Bool) -> [a] -> [a], how many arguments should mergesort have?
What are the types of x and xs? Are all functions you're applying to variables defined for those types? I see the functions -, :, >=, and mergesort. The types given for those functions (except for mergesort) by ghci's :t are as follows:
(-) :: Num a => a -> a -> a. This means:
(-) takes two arguments of the same type. This type must belong to the typeclass Num -- that is, it must be some kind of number.
(-) returns a single value of that type.
(:) :: a -> [a] -> [a]. This means:
(:) takes an argument of one type, and another argument that is a list of values of that type.
(:) returns a list of values of that type.
(>=) :: Ord a => a -> a -> Bool. This means:
(>=) takes two arguments of the same type. This type must belong to the Ord typeclass -- that is, it must be possible to compare two values of that type and say that one value is greater than, less than, or equal to the other value. (In Prelude, this includes all members of typeclass Num.)
(>=) returns a Bool.
You should also think about how you want to structure your implementation. Think about how the merge sort works. Look at the steps, think about how you can implement those steps as functions, and how you can connect those steps together with one mergesort function. Once you have a design planned out, then you should worry about the syntax.

Why is this tail-recursive Haskell function slower ?

I was trying to implement a Haskell function that takes as input an array of integers A and produces another array B = [A[0], A[0]+A[1], A[0]+A[1]+A[2], ...]. I know that scanl from Data.List can be used for this with the function (+). I wrote the second implementation (which performs faster) after seeing the source code of scanl. I want to know why the first implementation is slower than the second one, despite being tail-recursive.
-- This function works slow.
ps s x [] = x
ps s x y = ps s' x' y'
  where
    s' = s + head y
    x' = x ++ [s']
    y' = tail y
-- This function works fast.
ps' s [] = []
ps' s y = [s'] ++ (ps' s' y')
  where
    s' = s + head y
    y' = tail y
Some details about the above code:
Implementation 1 : It should be called as
ps 0 [] a
where 'a' is your array.
Implementation 2: It should be called as
ps' 0 a
where 'a' is your array.
You are changing the way that ++ associates. In your first function you are computing ((([a0] ++ [a1]) ++ [a2]) ++ ...) whereas in the second function you are computing [a0] ++ ([a1] ++ ([a2] ++ ..)). Appending a few elements to the start of the list is O(1), whereas appending a few elements to the end of a list is O(n) in the length of the list. This leads to a linear versus quadratic algorithm overall.
You can fix the first example by building the list up in reverse order, and then reversing again at the end, or by using something like dlist. However the second will still be better for most purposes. While tail calls do exist and can be important in Haskell, if you are familiar with a strict functional language like Scheme or ML your intuition about how and when to use them is completely wrong.
The second example is better, in large part, because it's incremental; it immediately starts returning data that the consumer might be interested in. If you just fixed the first example using the double-reverse or dlist tricks, your function will traverse the entire list before it returns anything at all.
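The two approaches discussed above can be sketched side by side. This is an illustrative rewrite, not the asker's code: the names prefixSumsAcc and prefixSumsLazy are hypothetical, and the element type is fixed to Int for simplicity.

```haskell
-- Tail-recursive variant: cons each running total onto the front of the
-- accumulator (O(1) per step) and reverse once at the end.
prefixSumsAcc :: [Int] -> [Int]
prefixSumsAcc = go 0 []
  where
    go _ acc []     = reverse acc
    go s acc (y:ys) = let s' = s + y in s' `seq` go s' (s' : acc) ys

-- Incremental variant: each element is produced before the rest of the
-- input is touched, so it also works on an infinite list.
prefixSumsLazy :: [Int] -> [Int]
prefixSumsLazy = go 0
  where
    go _ []     = []
    go s (y:ys) = let s' = s + y in s' : go s' ys
```

Both run in linear time, but only the second can answer take 3 (prefixSumsLazy [1..]); the first has to reach the end of its input before it returns anything.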
I would like to mention that your function can be more easily expressed as
drop 1 . scanl (+) 0
Usually, it is a good idea to use predefined combinators like scanl in favour of writing your own recursion schemes; it improves readability and makes it less likely that you needlessly squander performance.
However, in this case, both my scanl version and your original ps and ps' can sometimes lead to stack overflows due to lazy evaluation: Haskell does not necessarily immediately evaluate the additions (depends on strictness analysis).
One case where you can see this is if you do last (ps' 0 [1..100000000]). That leads to a stack overflow. You can solve that problem by forcing Haskell to evaluate the additions immediately, for instance by defining your own, strict scanl:
myscanl :: (b -> a -> b) -> b -> [a] -> [b]
myscanl f q [] = []
myscanl f q (x:xs) = q `seq` let q' = f q x in q' : myscanl f q' xs
ps' = myscanl (+) 0
Then, calling last (ps' [1..100000000]) works.

How can I replace generators if I need only one result?

I'm playing with Haskell for first time.
I've created function that returns first precise enough result. It works as expected, but I'm using generator for this. How can I replace generator in this task?
integrateWithPrecision precision =
    (take 1 $ preciseIntegrals precision) !! 0
preciseIntegrals :: Double -> [Double]
preciseIntegrals precision =
    [
        integrate (2 ^ power) pi | power <- [0..],
        enoughPowerForPrecision power precision
    ]
You can use the beautiful until function. Here it is:
-- | @'until' p f@ yields the result of applying @f@ until @p@ holds.
until :: (a -> Bool) -> (a -> a) -> a -> a
until p f x | p x = x
            | otherwise = until p f (f x)
So, you can write your function like this:
integrateWithPrecision precision = integrate (2 ^ pow) pi
  where
    pow = until done succ 0
    done pow = enoughPowerForPrecision pow precision
In your case, you do all the iteration and then compute a result just once. But until is useful even when you need to compute a result at each step - just use an (iter, result) tuple and then just extract the result at the end with snd.
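The tuple variant might look like the following sketch. It is unrelated to the integration code: firstPowerOfTwoAtLeast is a hypothetical function chosen only because it is easy to check by hand.

```haskell
-- Track (exponent, value) together; 'until' stops at the first pair whose
-- value reaches the limit, and 'snd' extracts the value at the end.
firstPowerOfTwoAtLeast :: Integer -> Integer
firstPowerOfTwoAtLeast limit = snd (until done step (0 :: Int, 1))
  where
    done (_, value) = value >= limit
    step (n, value) = (n + 1, value * 2)
```

The first component plays the role of the iteration counter and the second carries the result computed at each step, so nothing has to be recomputed once the predicate holds.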
It seems like you want to check higher and higher powers until you get one that satisfies a requirement. This is what you could do: First you define a function to get enough power, and then you integrate using that.
find gets the first element of a list that satisfies a condition – like being enough of a power! Then we need a fromJust to get the actual value from that. Please note that almost always, fromJust is a terrible idea to have in your code. However, in this case the list is infinite, so we will have troubles with infinite loops long before fromJust is able to crash the program.
enoughPower :: Double -> Int
enoughPower precision =
    fromJust $ find (flip enoughPowerForPrecision precision) [0..]
preciseIntegrals :: Double -> Double
preciseIntegrals precision = integrate (2^(enoughPower precision)) pi
The function
\xs -> take 1 xs !! 0
is called head
head [] = error "Cannot take head of empty list"
head (x:xs) = x
Its use is somewhat unsafe, as shown it can throw an error if you pass it an empty list, but in this case since you can be certain your list is non-empty it's fine.
Also, we tend not to call these "generators" in Haskell as they're not a special form but are instead a simple consequence of lazy evaluation. In this case, preciseIntegrals is called a "list comprehension" and [0..] is nothing more than a lazily generated list.
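To illustrate that such a list is only evaluated as far as it is demanded, here is a minimal sketch. The function smallestWithSquareAbove is hypothetical and stands in for the precision check in the question.

```haskell
-- The comprehension ranges over an infinite list, but 'head' demands only
-- the first match, so evaluation stops there.
smallestWithSquareAbove :: Int -> Int
smallestWithSquareAbove limit = head [n | n <- [0 ..], n * n > limit]
```

As with the original, head is safe here only because a match is guaranteed to exist; on a comprehension that never matches, the call would not terminate.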

Finding the index of a given element using tail recursion

I am trying to write a function to find the index of a given element using tail recursion. Let's say the list contains the numbers 1 through 10, and I am searching for 5; then the output should be 4. The problem I am having is 'counting' using tail recursion. However, I am not even sure if I need to manually 'count' the number of recursive calls in this case. I tried using !!, which does not help because it returns the element at a particular position. I need the function to return the position of a particular element (the exact opposite).
I have been trying to figure this one out for hours now.
Code:
whatIndex a [] = error "cannot search empty list"
whatIndex a (x:xs) = foo a as
    where
        foo m [] = error "empty list"
        foo m (y:ys) = if m==y then --get index of y
                       else foo m ys
Note: I am trying to implement this without using library functions
Your helper function needs an additional parameter for the count.
whatIndex a as = foo as 0
  where
    foo [] _ = error "empty list"
    foo (y:ys) c
      | a == y = c
      | otherwise = foo ys (c+1)
BTW, it's better form to give this function a Maybe return type instead of using errors. That's how elemIndex works too, for good reason. This would look like
whatIndex a as = foo as 0
  where
    foo [] _ = Nothing
    foo (y:ys) c
      | a == y = Just c
      | otherwise = foo ys (c+1)
Note: I am trying to implement this without using library functions
This is not a good idea in general. A better exercise is this:
Figure out how to implement it using library functions.
Figure out how to implement whichever library functions you used in step 1 on your own.
This way you're learning three key skills:
What are the standard library functions, and examples of when they are useful.
How to break problems into smaller pieces
How to write basic functions like the ones in the libraries.
In this case, however, your whatIndex is more or less the same function as elemIndex in Data.List, so your problem reduces to writing your own version of this library function.
The trick here is that you want to increment a counter while you recurse down the list. There is a standard technique for writing tail recursive functions, which is called an accumulating parameter. It works like this:
You write an auxiliary function that, compared to the "front-end" function, takes an extra parameter (or more) to keep track of the extra information.
You then define the "real" function as a call to the auxiliary one.
So for elemIndex, the auxiliary function would be something like this (with i as the accumulating parameter for the current element index):
-- I'll leave the blanks for you to fill.
elemIndex' i x [] = ...
elemIndex' i x (x':xs) = ...
Then the "driver" function is this:
elemIndex x xs = elemIndex' 0 x xs
But there is a serious problem here that I must mention: getting this function to perform well in Haskell is tricky. Tail recursion is a useful trick in strict (non-lazy) functional languages, but not so much in Haskell, because:
A tail-recursive function in Haskell can still blow the stack,
A non-tail-recursive function can run in constant space.
This older answer of mine shows an example of the second point.
So in your case, a non-tail-recursive solution is probably the easiest one you can give that will run in constant space (i.e., not blow the stack on a long list):
elemIndex x xs = elemIndex' x (zip xs [0..])
elemIndex' x pairs = fmap snd (find (\(x', i) -> x == x') pairs)
-- | Combine two lists by pairing together their first elements, their second
-- elements, etc., until one of the lists runs out.
--
-- EXERCISE: write this function on your own!
zip :: [a] -> [b] -> [(a, b)]
zip xs ys = ...
-- | Return the first element x of xs such that pred x == True. Returns Nothing if
-- there isn't one, Just x if there is one.
--
-- EXERCISE: write this function on your own!
find :: (a -> Bool) -> [a] -> Maybe a
find pred xs = ...
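To check the overall shape before writing zip and find by hand, the same pipeline can be assembled from the library versions. This is only a sketch: indexOf is a hypothetical name chosen to avoid clashing with Data.List.elemIndex, and it leans on the library functions that the exercise asks you to reimplement.

```haskell
import Data.List (find)

-- Pair each element with its position, search for the first matching pair,
-- and keep only the position. 'fmap' is needed because 'find' returns a Maybe.
indexOf :: Eq a => a -> [a] -> Maybe Int
indexOf x xs = fmap snd (find (\(x', _) -> x == x') (zip xs [0 ..]))
```

Because zip and find are both lazy, the search stops at the first match and never builds the full list of pairs.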

Resources