Correct way to define a function in Haskell - haskell

I'm new to Haskell and I'm trying out a few tutorials.
I wrote this script:
lucky::(Integral a)=> a-> String
lucky 7 = "LUCKY NUMBER 7"
lucky x = "Bad luck"
I saved this as lucky.hs and ran it in the interpreter and it works fine.
But I am unsure about function definitions. It seems from the little I have read that I could equally define the function lucky as follows (function name is lucky2):
lucky2::(Integral a)=> a-> String
lucky2 x=(if x== 7 then "LUCKY NUMBER 7" else "Bad luck")
Both seem to work equally well. Clearly function lucky is clearer to read but is the lucky2 a correct way to write a function?

They are both correct. Arguably, the first one is more idiomatic Haskell because it uses its very important feature called pattern matching. In this form, it would usually be written as:
lucky::(Integral a)=> a-> String
lucky 7 = "LUCKY NUMBER 7"
lucky _ = "Bad luck"
The underscore signifies the fact that you are ignoring the exact form (value) of your parameter. You only care that it is different than 7, which was the pattern captured by your previous declaration.
The importance of pattern matching is best illustrated by function that operates on more complicated data, such as lists. If you were to write a function that computes a length of list, for example, you would likely start by providing a variant for empty lists:
len [] = 0
The [] clause is a pattern, which is set to match empty lists. Empty lists obviously have length of 0, so that's what we are having our function return.
The other part of len would be the following:
len (x:xs) = 1 + len xs
Here, you are matching on the pattern (x:xs). Colon : is the so-called cons operator: it is appending a value to list. An expression x:xs is therefore a pattern which matches some element (x) being appended to some list (xs). As a whole, it matches a list which has at least one element, since xs can also be an empty list ([]).
This second definition of len is also pretty straightforward. You compute the length of remaining list (len xs) and at 1 to it, which corresponds to the first element (x).
(The usual way to write the above definition would be:
len (_:xs) = 1 + len xs
which again signifies that you do not care what the first element is, only that it exists).

A 3rd way to write this would be using guards:
lucky n
| n == 7 = "lucky"
| otherwise = "unlucky"
There is no reason to be confused about that. There is always more than 1 way to do it. Note that this would be true even if there were no pattern matching or guards and you had to use the if.
All of the forms we've covered so far use so-called syntactic sugar provided by Haskell. Pattern guards are transformed to ordinary case expressions, as well as multiple function clauses and if expressions. Hence the most low-level, unsugared way to write this would be perhaps:
lucky n = case n of
7 -> "lucky"
_ -> "unlucky"
While it is good that you check for idiomatic ways I'd recommend to a beginner that he uses whatever works for him best, whatever he understands best. For example, if one does (not yet) understand points free style, there is no reason to force it. It will come to you sooner or later.

Related

List comprehension in haskell with let and show, what is it for?

I'm studying project euler solutions and this is the solution of problem 4, which asks to
Find the largest palindrome made from the product of two 3-digit
numbers
problem_4 =
maximum [x | y<-[100..999], z<-[y..999], let x=y*z, let s=show x, s==reverse s]
I understand that this code creates a list such that x is a product of all possible z and y.
However I'm having a problem understanding what does s do here. Looks like everything after | is going to be executed everytime a new element from this list is needed, right?
I don't think I understand what's happening here. Shouldn't everything to the right of | be constraints?
A list comprehension is a rather thin wrapper around a do expression:
problem_4 = maximum $ do
y <- [100..999]
z <- [y..999]
let x = y*z
let s = show x
guard $ s == reverse s
return x
Most pieces translate directly; pieces that aren't iterators (<-) or let expressions are treated as arguments to the guard function found in Control.Monad. The effect of guard is to short-circuit the evaluation; for the list monad, this means not executing return x for the particular value of x that led to the false argument.
I don't think I understand what's happening here. Shouldn't everything to the right of | be constraints?
No, at the right part you see an expression that is a comma-separated (,) list of "parts", and every part is one of the following tree:
an "generator" of the form somevar <- somelist;
a let statement which is an expression that can be used to for instance introduce a variable that stores a subresult; and
expressions of the type boolean that act like a filter.
So it is not some sort of "constraint programming" where one simply can list some constraints and hope that Haskell figures it out (in fact personally that is the difference between a "programming language" and a "specification language": in a programming language you have "control" how the data flows, in a specification language, that is handled by a system that reads your specifications)
Basically an iterator can be compared to a "foreach" loop in many imperative programming languages. A "let" statement can be seen as introducing a temprary variable (but note that in Haskell you do not assign variable, you declare them, so you can not reassign values). The filter can be seen as an if statement.
So the list comprehension would be equivalent to something in Python like:
for y in range(100, 1000):
for z in range(y, 1000):
x = y * z
s = str(x)
if x == x[::-1]:
yield x
We thus first iterate over two ranges in a nested way, then we declare x to be the multiplication of y and z, with let s = show x, we basically convert a number (for example 15129) to its string counterpart (for example "15129"). Finally we use s == reverse s to reverse the string and check if it is equal to the original string.
Note that there are more efficient ways to test Palindromes, especially for multiplications of two numbers.

haskell: factors of a natural number

I'm trying to write a function in Haskell that calculates all factors of a given number except itself.
The result should look something like this:
factorlist 15 => [1,3,5]
I'm new to Haskell and the whole recursion subject, which I'm pretty sure I'm suppoused to apply in this example but I don't know where or how.
My idea was to compare the given number with the first element of a list from 1 to n div2
with the mod function but somehow recursively and if the result is 0 then I add the number on a new list. (I hope this make sense)
I would appreciate any help on this matter
Here is my code until now: (it doesn't work.. but somehow to illustrate my idea)
factorList :: Int -> [Int]
factorList n |n `mod` head [1..n`div`2] == 0 = x:[]
There are several ways to handle this. But first of all, lets write a small little helper:
isFactorOf :: Integral a => a -> a -> Bool
isFactorOf x n = n `mod` x == 0
That way we can write 12 `isFactorOf` 24 and get either True or False. For the recursive part, lets assume that we use a function with two arguments: one being the number we want to factorize, the second the factor, which we're currently testing. We're only testing factors lesser or equal to n `div` 2, and this leads to:
createList n f | f <= n `div` 2 = if f `isFactorOf` n
then f : next
else next
| otherwise = []
where next = createList n (f + 1)
So if the second parameter is a factor of n, we add it onto the list and proceed, otherwise we just proceed. We do this only as long as f <= n `div` 2. Now in order to create factorList, we can simply use createList with a sufficient second parameter:
factorList n = createList n 1
The recursion is hidden in createList. As such, createList is a worker, and you could hide it in a where inside of factorList.
Note that one could easily define factorList with filter or list comprehensions:
factorList' n = filter (`isFactorOf` n) [1 .. n `div` 2]
factorList'' n = [ x | x <- [1 .. n`div` 2], x `isFactorOf` n]
But in this case you wouldn't have written the recursion yourself.
Further exercises:
Try to implement the filter function yourself.
Create another function, which returns only prime factors. You can either use your previous result and write a prime filter, or write a recursive function which generates them directly (latter is faster).
#Zeta's answer is interesting. But if you're new to Haskell like I am, you may want a "simple" answer to start with. (Just to get the basic recursion pattern...and to understand the indenting, and things like that.)
I'm not going to divide anything by 2 and I will include the number itself. So factorlist 15 => [1,3,5,15] in my example:
factorList :: Int -> [Int]
factorList value = factorsGreaterOrEqual 1
where
factorsGreaterOrEqual test
| (test == value) = [value]
| (value `mod` test == 0) = test : restOfFactors
| otherwise = restOfFactors
where restOfFactors = factorsGreaterOrEqual (test + 1)
The first line is the type signature, which you already knew about. The type signature doesn't have to live right next to the list of pattern definitions for a function, (though the patterns themselves need to be all together on sequential lines).
Then factorList is defined in terms of a helper function. This helper function is defined in a where clause...that means it is local and has access to the value parameter. Were we to define factorsGreaterOrEqual globally, then it would need two parameters as value would not be in scope, e.g.
factorsGreaterOrEqual 4 15 => [5,15]
You might argue that factorsGreaterOrEqual is a useful function in its own right. Maybe it is, maybe it isn't. But in this case we're going to say it isn't of general use besides to help us define factorList...so using the where clause and picking up value implicitly is cleaner.
The indentation rules of Haskell are (to my tastes) weird, but here they are summarized. I'm indenting with two spaces here because it grows too far right if you use 4.
Having a list of boolean tests with that pipe character in front are called "guards" in Haskell. I simply establish the terminal condition as being when the test hits the value; so factorsGreaterOrEqual N = [N] if we were doing a call to factorList N. Then we decide whether to concatenate the test number into the list by whether dividing the value by it has no remainder. (otherwise is a Haskell keyword, kind of like default in C-like switch statements for the fall-through case)
Showing another level of nesting and another implicit parameter demonstration, I added a where clause to locally define a function called restOfFactors. There is no need to pass test as a parameter to restOfFactors because it lives "in the scope" of factorsGreaterOrEqual...and as that lives in the scope of factorList then value is available as well.

why is function name repeated in haskell? (newbie)

Why is the function name repeated in
example:
lucky :: (Integral a) => a -> String
lucky 7 = "LUCKY NUMBER SEVEN!"
lucky x = "Sorry, you're out of luck, pal!"
when should I not be repeating function name? what is the meaning of it?
thanks
What you are seeing is pattern match in action.
I will show you another example:
test 1 = "one"
test 2 = "two"
test 3 = "three"
Demo in ghci:
ghci> test 1
"one"
ghci> test 2
"two"
ghci> test 3
"three"
ghci> test 4
"*** Exception: Non-exhaustive patterns in function test
So, when you call any function, the runtime system will try to match
the input with the defined function. So a call to test 3 will
initially check test 1 and since 1 is not equal to 3, it will
move on to the next definition. Again since 2 is not equal to 3,
it will move to the next defintion. In the next definiton since 3 is
equal to 3 it will return "three" String back. When you try to
pattern match something, which doesn't exist at all, the program
throws the exception.
This kind of pattern matching can be transformed to a case statement (and indeed, that's what compilers will normally do!):
lucky' n = case n of
7 -> "LUCKY NUMBER SEVEN!"
x -> "Sorry, you're out of luck, pal!"
Because the x isn't really used, you'd normally write _ -> "Sorry, ..." instead.
Note that this is not2 the same as
lucky'' n = if n==7 then ...
Equality comparison with (==) is in general more expensive1 than pattern matching, and also comes out uglier.
1 Why it's more expensive: suppose we have a big data structure. To determine that they are equal, the program will need to dig through both entire structures, make sure really all branches are equal. However, if you pattern match, you will just compare a small part you're interested in right now.
2 Actually, it is the same in the case, but just because the compiler has a particular trick for pattern matching on numbers: it rewrites it with (==). This is really special for Num types and not true for anything else. (Except if you use the OverloadedStrings extension.)
That definition of lucky uses "pattern matching", and equals (in this case)
lucky :: (Integral a) => a -> String
lucky a = if a == 7
then "LUCKY NUMBER SEVEN!"
else "Sorry, you're out of luck, pal!"
I assume you're looking at learn you a haskell. After that example, it says that
When you call lucky, the patterns will be checked from top to bottom and when it conforms to a pattern, the corresponding function body will be used.
So the first line indicates the type of the function, and later lines are patterns to check. Each line has the function name so the compiler knows you're still talking about the same function.
Think of it this way: When you write the expression lucky (a+b) or whatever, the compiler will attempt to replace lucky (a+b) with the first thing before the = in the function definition that "fits." So if a=3 and b=4, you get this series of replacements:
lucky (a+b) =
lucky (3+4) =
--pattern matching occurs...
lucky 7 =
"LUCKY NUMBER SEVEN!"
This is part of what makes Haskell so easy to reason about in practice; you get a system that works similarly to math.

Writing a function to get all subsequences of size n in Haskell?

I was trying to write a function to get all subsequences of a list that are of size n, but I'm not sure how to go about it.
I was thinking that I could probably use the built-in Data.List.subsequences and just filter out the lists that are not of size n, but it seems like a rather roundabout and inefficient way of doing it, and I'd rather not do that if I can avoid it, so I'm wondering if you have any ideas?
I want it to be something like this type
subseqofsize :: Int -> [a] -> [[a]]
For further clarification, here's an example of what I'm looking for:
subseqofsize 2 [1,2,3,3]
[[1,2],[1,3],[2,3],[1,3],[2,3],[3,3]]
Also, I don't care about the order of anything.
I'm assuming that this is homework, or that you are otherwise doing this as an exercise to learn, so I'll give you an outline of what the solution looks like instead of spoon-feeding you the correct answer.
Anyway, this is a recursion question.
There are two base cases:
sublistofsize 0 _ = ...
sublistofsize _ [] = ...
Then there are two recursive steps:
sublistofsize n (x : xs) = sublistsThatStartWithX ++ sublistsThatDon'tStartWithX
where sublistsThatStartWithX = ...
sublistsThatDon'tStartWithX = ...
Remember that the definitions of the base cases need to work appropriately with the definitions in the recursive cases. Think carefully: don't just assume that the base cases both result in an empty list.
Does this help?
You can think about this mathematically: to compute the sublists of size k, we can look at one element x of the list; either the sublists contain x, or they don't. In the former case, the sublist consists of x and then k-1 elements chosen from the remaining elements. In the latter case, the sublists consist of k elements chosen from the elements that aren't x. This lends itself to a (fairly) simple recursive definition.
(There are very very strong similarities to the recursive formula for binomial coefficients, which is expected.)
(Edit: removed code, per dave4420's reasons :) )

Haskell - Find the longest word in a text

I've got a problem about to write a function to find the longest word in a text.
Input: A string with a lot of word. Ex: "I am a young man, and I have a big house."
The result will be 5 because the longest words in the text have 5 letters (young and house).
I've just started to learn Haskell. I've tried:
import Char
import List
maxord' (str:strs) m n =
if isAlpha str == True
then maxord'(strs m+1 n)
else if m >= n
then maxord'(strs 0 m)
else maxord'(strs 0 n)
maxord (str:strs) = maxord' (str:strs) 0 0
I want to return n as the result but I don't know how to do it, and it seems there is also something wrong with the code.
Any help? Thanks
Try to split your task into several subtasks. I would suggest splitting it like this:
Turn the string into a list of words. For instance, your example string becomes
["I","am","a","young","man","and","I","have","a","big","house"]
map length over the list. This calculates the word lengths. For example, the list in step 1 becomes
[1,2,1,5,3,3,1,4,1,3,5]
Find the word with the highest number of characters. You could use maximum for this.
You can compose those steps using the operator (.) which pipes two functions together. For instance, if the function to perform step 1 is called toWords, you can perform the whole task in one line:
maxord = maximum . map length . toWords
The implementation of toWords is left as an excercise to the reader. If you need help, feel free to write a comment.
There are several issues here. Let's start with the syntax.
Your else parts should be indented the same or more as the if they belong to, for example like this:
if ...
then ...
else if ...
then ...
else ...
Next, your function applications. Unlike many other languages, in Haskell, parentheses are only used for grouping and tuples. Since function application is so common in Haskell, we use the most lightweight syntax possible for it, namely whitespace. So to apply the function maxord' to the arguments strs, m+1 and n, we write maxord' strs (m+1) n. Note that since function application has the highest precedence, we have to add parentheses around m+1, otherwise it would be interpreted as (maxord' strs m) + (1 n).
That's it for the syntax. The next problem is a semantic one, namely that you have recursion without a base case. Using the pattern (str:strs), you have specified what to do when you have some characters left, but you've not specified what to do when you reach the end of the string. In this case, we want to return n, so we add a case for that.
maxord' [] m n = n
The fixed maxord' is thus
maxord' [] m n = n
maxord' (str:strs) m n =
if isAlpha str == True
then maxord' strs (m+1) n
else if m >= n
then maxord' strs 0 m
else maxord' strs 0 n
However, note that this solution is not very idiomatic. It uses explicit recursion, if expressions instead of guards, comparing booleans to True and has a very imperative feel to it. A more idiomatic solution would be something like this.
maxord = maximum . map length . words
This is a simple function chain where words splits up the input into a list of words, map length replaces each word with its length, and maximum returns the maximum of those lengths.
Although, note that it's not the exact same as your code, since the words function uses slightly different criteria when splitting the input.
There are a couple of problems
There is no termination for you recursion. You want to return n when you processed the whole input.
maxord' [] _ n = n
Syntax:
maxord'(strs 0 m)
this means that call apply strs with parameters 0 and m, and then use that as an argument to maxord. What you wan't to do is this:
maxord' strs 0 m
m+1 should be (m+1).
You might want to process empty strings, but maxord doesn't allow it.
maxord s = maxord' s 0 0
That should do it. There are a couple of subtleties. maxord' shoudln't leak out to the namespace, use where. (max m n) is a lot more concise than the if-then-else you use. And check the other answers to see how you can build your solution by wiring builtin things together. Recursions are a lot harder to read.

Resources