Haskell - Find the longest word in a text

I'm having trouble writing a function to find the longest word in a text.
Input: A string with a lot of words. Ex: "I am a young man, and I have a big house."
The result will be 5 because the longest words in the text have 5 letters (young and house).
I've just started to learn Haskell. I've tried:
import Char
import List
maxord' (str:strs) m n =
if isAlpha str == True
then maxord'(strs m+1 n)
else if m >= n
then maxord'(strs 0 m)
else maxord'(strs 0 n)
maxord (str:strs) = maxord' (str:strs) 0 0
I want to return n as the result but I don't know how to do it, and it seems there is also something wrong with the code.
Any help? Thanks

Try to split your task into several subtasks. I would suggest splitting it like this:
1. Turn the string into a list of words. For instance, your example string becomes
["I","am","a","young","man","and","I","have","a","big","house"]
2. map length over the list. This calculates the word lengths. For example, the list in step 1 becomes
[1,2,1,5,3,3,1,4,1,3,5]
3. Find the largest of those lengths. You could use maximum for this.
You can compose those steps using the operator (.) which pipes two functions together. For instance, if the function to perform step 1 is called toWords, you can perform the whole task in one line:
maxord = maximum . map length . toWords
The implementation of toWords is left as an exercise to the reader. If you need help, feel free to write a comment.
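For readers who want to check their own attempt, here is one possible sketch of toWords. It assumes that a "word" is a maximal run of letters (as judged by isAlpha), which is my reading of the question rather than something the answer specifies.

```haskell
import Data.Char (isAlpha)

-- Hypothetical toWords: split the input into maximal runs of letters,
-- treating every non-letter (spaces, punctuation, digits) as a separator.
toWords :: String -> [String]
toWords s = case dropWhile (not . isAlpha) s of
  []   -> []
  rest -> let (w, rest') = span isAlpha rest
          in w : toWords rest'

-- Note: maximum fails on an empty list, so this assumes at least one word.
maxord :: String -> Int
maxord = maximum . map length . toWords
```

With this definition, the example sentence from the question yields the word list shown in step 1, and maxord returns 5.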

There are several issues here. Let's start with the syntax.
Your else parts should be indented the same or more as the if they belong to, for example like this:
if ...
  then ...
  else if ...
    then ...
    else ...
Next, your function applications. Unlike many other languages, in Haskell, parentheses are only used for grouping and tuples. Since function application is so common in Haskell, we use the most lightweight syntax possible for it, namely whitespace. So to apply the function maxord' to the arguments strs, m+1 and n, we write maxord' strs (m+1) n. Note that since function application has the highest precedence, we have to add parentheses around m+1, otherwise it would be interpreted as (maxord' strs m) + (1 n).
That's it for the syntax. The next problem is a semantic one, namely that you have recursion without a base case. Using the pattern (str:strs), you have specified what to do when you have some characters left, but you've not specified what to do when you reach the end of the string. In this case, we want to return n, so we add a case for that.
maxord' [] m n = n
The fixed maxord' is thus
maxord' [] m n = n
maxord' (str:strs) m n =
  if isAlpha str == True
    then maxord' strs (m+1) n
    else if m >= n
      then maxord' strs 0 m
      else maxord' strs 0 n
However, note that this solution is not very idiomatic. It uses explicit recursion, uses if expressions instead of guards, compares booleans to True, and has a very imperative feel to it. A more idiomatic solution would be something like this.
maxord = maximum . map length . words
This is a simple function chain where words splits up the input into a list of words, map length replaces each word with its length, and maximum returns the maximum of those lengths.
Note, however, that this is not exactly equivalent to your code: the words function splits only on whitespace, so punctuation stays attached to the words. For your example input, "house." counts as 6 characters.
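To make the difference in splitting criteria concrete, here is a small sketch. The names maxordWords and maxordLetters are made up for this illustration, and the second variant is only one assumed way of matching the isAlpha-based behaviour of the original code.

```haskell
import Data.Char (isAlpha)

-- The one-liner from this answer: splits on whitespace only.
maxordWords :: String -> Int
maxordWords = maximum . map length . words

-- Assumed variant: turn every non-letter into a space first, so that
-- punctuation separates words instead of sticking to them.
maxordLetters :: String -> Int
maxordLetters = maximum . map length . words . map toSpace
  where
    toSpace c = if isAlpha c then c else ' '
```

Both functions fail on input without any words, because maximum is undefined for an empty list.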

There are a couple of problems.
There is no termination for your recursion. You want to return n when you have processed the whole input.
maxord' [] _ n = n
Syntax:
maxord'(strs 0 m)
This applies strs to the arguments 0 and m, and then passes the result as the single argument to maxord'. What you want is this:
maxord' strs 0 m
m+1 should be (m+1).
You probably want to handle empty strings too, but the pattern (str:strs) in maxord doesn't match them.
maxord s = maxord' s 0 0
That should do it. There are a couple of subtleties. maxord' shouldn't leak out into the namespace; define it in a where clause. (max m n) is a lot more concise than the if-then-else you use. And check the other answers to see how you can build your solution by wiring built-in functions together. Explicit recursion is a lot harder to read.
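Putting these suggestions together, a minimal sketch might look like the following. The helper name go is my own choice, and returning max m n in the base case is an addition that is not in the answers above: it makes sure a final word that runs to the end of the string is counted as well.

```haskell
import Data.Char (isAlpha)

maxord :: String -> Int
maxord s = go s 0 0
  where
    -- go rest current best: current is the length of the word being read,
    -- best is the longest word length seen so far.
    go [] m n = max m n              -- also count a word that ends the input
    go (c:cs) m n
      | isAlpha c = go cs (m + 1) n  -- still inside a word
      | otherwise = go cs 0 (max m n) -- word ended, keep the larger length
```

Because go is local to maxord, it does not leak into the module's namespace, and the empty string simply yields 0.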

Related

Permutation to disjoint cycles in Haskell

I was trying to convert a permutation to cycles in Haskell without using monads. The problem is as follows: given a permutation of the numbers [1..n], output the corresponding disjoint cycles. The function is defined like
permToCycles :: [Int] -> [[Int]]
For the input:
permToCycles [3,5,4,1,2]
The output should be
[[3,4,1],[5,2]]
By the definition of cyclic permutation, the algorithm itself is straightforward. Since [3,5,4,1,2] is a permutation of [1,2,3,4,5], we start from the first element 3 and follow the orbit until we get back to 3. In this example, we have two cycles: 3 -> 4 -> 1 -> 3 and 5 -> 2 -> 5. We continue until all elements have been traversed. Thus the output is [[3,4,1],[5,2]].
Using this idea, it is fairly easy to implement in any imperative language, but I have trouble doing it in Haskell. I found something similar in the module Math.Combinat.Permutations, but the implementation of the function permutationToDisjointCycles uses a monad, which is not easy to understand as I'm a beginner.
I was wondering if I could implement it without Monad. Any help is appreciated.
UPDATE: Here is the function implemented in Python.
def permToCycles(perm):
    pi_dict = {i+1: perm[i]
               for i in range(len(perm))}  # permutation as a dictionary
    cycles = []
    while pi_dict:
        first_index = next(iter(pi_dict))  # take the first key
        this_elem = pi_dict[first_index]  # the first element in perm
        next_elem = pi_dict[this_elem]  # next element according to the orbit
        cycle = []
        while True:
            cycle.append(this_elem)
            # delete the item in the dict when adding to cycle
            del pi_dict[this_elem]
            this_elem = next_elem
            if next_elem in pi_dict:
                # continue the cycle
                next_elem = pi_dict[next_elem]
            else:
                # end the cycle
                break
        cycles.append(cycle)
    return cycles
print(permToCycles([3, 5, 4, 1, 2]))
The output is
[[3,4,1],[5,2]]
I think the main obstacle when implementing it in Haskell is how to keep track of the marked (or unmarked) elements. In Python, it can easily be done using a dictionary as I showed above. Also, in functional programming we tend to use recursion to replace loops, but here I have trouble working out the recursive structure of this problem.
Let's start with the basics. You hopefully started with something like this:
permutationToDisjointCycles :: [Int] -> [[Int]]
permutationToDisjointCycles perm = ...
We don't actually want to recur on the input list so much as we want to use an index counter. In this case, we'll want a recursive helper function, and the next step is to just go ahead and call it, providing whatever arguments you think you'll need. How about something like this:
permutationToDisjointCycles perm = cycles [] 0
  where
    cycles :: [Int] -> Int -> [[Int]]
    cycles seen ix = ...
Instead of declaring a pi_dict variable like in Python, we'll start with a seen list as an argument (I flipped it around to keeping track of what's been seen because that ends up being a little easier). We do the same with the counting index, which I here called ix. Let's consider the cases:
    cycles seen ix
      | ix >= length perm = -- we've reached the end of the list
      | ix `elem` seen    = -- we've already seen this index
      | otherwise         = -- we need to generate a cycle.
That last case is the interesting one and corresponds to the inner while loop of the Python code. Another while loop means, you guessed it, more recursion! Let's make up another function that we think will be useful, passing along as arguments what would have been variables in Python:
      | otherwise = let c = makeCycle ix ix in c : cycles (c ++ seen) (ix+1)
    makeCycle :: Int -> Int -> [Int]
    makeCycle startIx currentIx = ...
Because it's recursive, we'll need a base case and recursive case (which corresponds to the if statement in the Python code which either breaks the loop or continues it). Rather than use the seen list, it's a little simpler to just check if the next element equals the starting index:
    makeCycle startIx currentIx =
      if next == startIx
        then -- base case
        else -- recursive call, where we attach an index onto the cycle and recur
      where next = perm !! currentIx
I left a couple holes that need to be filled in as an exercise, and this version works on 0-indexed lists rather than 1-indexed ones like your example, but the general shape of the algorithm is there.
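For reference, here is one way the holes could be filled in. It is a sketch rather than the definitive solution, and it departs from the skeleton in two labelled ways: it works with 1-indexed permutations like the example in the question, and makeCycle collects the image of each index (next) instead of the index itself, so that the cycles start with the same elements as in the expected output. The helper name image is made up for this illustration.

```haskell
permToCycles :: [Int] -> [[Int]]
permToCycles perm = cycles [] 1
  where
    n = length perm

    -- 1-indexed lookup into the permutation (assumption: perm holds 1..n).
    image i = perm !! (i - 1)

    cycles :: [Int] -> Int -> [[Int]]
    cycles seen ix
      | ix > n         = []                    -- reached the end of the list
      | ix `elem` seen = cycles seen (ix + 1)  -- already part of some cycle
      | otherwise      = let c = makeCycle ix ix
                         in c : cycles (c ++ seen) (ix + 1)

    makeCycle :: Int -> Int -> [Int]
    makeCycle startIx currentIx
      | next == startIx = [next]                        -- orbit is closed
      | otherwise       = next : makeCycle startIx next -- keep following it
      where next = image currentIx
```

For the input [3,5,4,1,2] this produces [[3,4,1],[5,2]], and a fixed point such as the single element of [1] becomes the one-element cycle [1].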
As a side note, the above algorithm is not super efficient. It uses lists for both the input list and the "seen" list, and lookups in lists take O(n) time. One very simple performance improvement is to immediately convert the input list perm into an array/vector, which has constant time lookups, and then use that instead of the (!!) list indexing in makeCycle.
The next improvement is to change the "seen" list into something more efficient. To match the idea of your Python code, you could change it to a Set (or even a HashSet), which has logarithmic time lookups (or constant with a hashset).
The code you found in Math.Combinat.Permutations actually uses an array of Booleans for the "seen" list, and then uses the ST monad to do imperative-like mutation on that array. This is probably even faster than using Set or HashSet, but as you yourself could tell, readability of the code suffers a bit.
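The first two improvements can be sketched without any monads. The version below assumes the array and containers packages that ship with GHC, takes a 1-indexed permutation like the one in the question, and uses made-up names (permToCyclesFast, go, orbit); it illustrates the idea and is not the code from Math.Combinat.Permutations.

```haskell
import Data.Array (Array, listArray, (!))
import qualified Data.Set as Set

permToCyclesFast :: [Int] -> [[Int]]
permToCyclesFast perm = go Set.empty [1 .. n]
  where
    n = length perm

    -- Constant-time lookups instead of (!!) on a list.
    arr :: Array Int Int
    arr = listArray (1, n) perm

    -- Walk over the candidate start points, skipping those already seen.
    go _ [] = []
    go seen (i:is)
      | i `Set.member` seen = go seen is
      | otherwise =
          let c = orbit i i
          in c : go (foldr Set.insert seen c) is

    -- Follow the orbit of start until it closes.
    orbit start cur
      | next == start = [next]
      | otherwise     = next : orbit start next
      where next = arr ! cur
```

The set membership test is logarithmic, so the whole conversion runs in O(n log n) rather than the quadratic time of the list-based version.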

Sequence of legal pairs of parentheses using recursion - Python

I have a problem to solve using recursion in Python.
I'm simply bad at recursion and don't know how to start, so please guide me.
We will say that a string contains n legal pairs of parentheses if the string contains only the chars '(' and ')' and the sequence of parentheses could be written as in a mathematical formula (that is, every opened parenthesis is closed, and no parenthesis is closed before it is opened). More precisely: in every prefix of the string, the number of '(' is greater than or equal to the number of ')', and in the whole string the two counts are equal. Implement a function that receives a positive integer n and returns a list containing every legal string with n pairs of parentheses.
I have tried to start at least, think of a base case, but what's my base case at all?
I tried to think of a base case when I am given the minimal n which is 1 and then I think I have to return a list ['(', ')']. But to do that I have also a difficulty...
def parentheses(n):
    if n == 1:
        return combine_parent(n)
def combine_parent(n):
    parenth_lst = []
    for i in range(n):
        parenth_lst +=
Please explain to me how to solve problems recursively.
Thank you!
Maybe it's helpful to look in a simple case of the problem:
n = 2
(())
()()
We start with n=2 and produce ( n times followed by ) n times, and return that in a list. Then we recursively do the same with n-1. When we reach 1, we have hit the base case, which returns the string () repeated n times (using the original n=2, not 1).
n = 3
((()))
(())()
()()()
Same pattern for n=3.
The above examples are helpful to understand how the problem can be solved recursively.
def legal_parentheses(n, nn=None):
    if nn is None:
        # First call: start at the full count; n itself is never modified.
        nn = n
    if nn == 1:
        return ["()" * n]
    # This will produce nn ( followed by nn ) (i.e. nn=2 -> (()) )
    string = "(" * nn + ")" * nn
    if nn < n:
        # Then here we want to produce () n-nn times.
        string += "()" * (n - nn)
    return [string] + legal_parentheses(n, nn - 1)
print(legal_parentheses(3))
print(legal_parentheses(4))
print(legal_parentheses(5))
For n = 3:
['((()))', '(())()', '()()()']
For n = 4:
['(((())))', '((()))()', '(())()()', '()()()()']
For n = 5:
['((((()))))', '(((())))()', '((()))()()', '(())()()()', '()()()()()']
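This is one way of approaching the problem. Note, though, that it returns only n of the legal strings: for n = 3, for example, (()()) and ()(()) are also legal and are not produced.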
The way to think about solving a problem recursively, in my opinion, is to first pick the simplest example of your problem (in your case, n=2) and then write down what you expect as a result. In this case, you are expecting the following output:
"(())", "()()"
Now, you are trying to find a strategy to break down the problem such that you can produce each of the strings. I start by thinking of the base case. I say the trivial case is when the result is ()(); I know that this element of the result is just () repeated n times. If n=2, I should expect ()(), and when n=3 I should expect ()()(). There should be only one such element in the sequence (so it should be produced only once), hence it becomes the base case. The question is how we calculate the (()) part of the result. The pattern shows that we just have to put n ( followed by n ), giving (()) for n=2. This looks like a good strategy. Now you need to think about a slightly harder problem and see if the strategy still holds.
So let's think of n=3. What do we expect as a result?
'((()))', '(())()', '()()()'
OK, we see that the base case should still produce the ()()() part, all good, and the same strategy should also produce the ((())) part. What about the (())() part? It looks like we need a slightly different approach. We need to generate n ( followed by n ), then n-1 ( followed by n-1 ), then n-2 ( followed by n-2 ), and so on, padding each with () until it has n pairs, until we reach the base case, where we just produce () n times. But if we called the function with n-1 each time, then by the time we reached the base case we would no longer know the original value of n, and so could not tell how many () to produce (if the original n was 3 we need ()()()). Because of that, my solution introduces a second variable called nn, which is the one that is reduced on each call, while n is left unmodified so we always know the original value.
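If every legal string is required, a different recursion is needed. The following is a minimal sketch, written in Haskell like the other sketches on this page, with the made-up names parens and go; the same structure carries over to a Python function with two counters. It builds the strings character by character and relies directly on the prefix rule from the question: a ( may be added while fewer than n have been used, and a ) may be added only while it would not outnumber the ( so far.

```haskell
-- All strings consisting of n correctly nested pairs of parentheses.
parens :: Int -> [String]
parens n = go 0 0
  where
    -- go open close: open and close count the '(' and ')' used so far.
    go open close
      | open == n && close == n = [""]  -- base case: nothing left to add
      | otherwise =
          [ '(' : rest | open < n,     rest <- go (open + 1) close ] ++
          [ ')' : rest | close < open, rest <- go open (close + 1) ]
```

Here the base case is the point where all n pairs have been used, and each recursive call makes progress by consuming one character.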

haskell: factors of a natural number

I'm trying to write a function in Haskell that calculates all factors of a given number except itself.
The result should look something like this:
factorlist 15 => [1,3,5]
I'm new to Haskell and the whole subject of recursion, which I'm pretty sure I'm supposed to apply in this example, but I don't know where or how.
My idea was to compare the given number with the first element of a list from 1 to n `div` 2 using the mod function, but somehow recursively, and if the result is 0 then I add the number to a new list. (I hope this makes sense.)
I would appreciate any help on this matter
Here is my code until now: (it doesn't work.. but somehow to illustrate my idea)
factorList :: Int -> [Int]
factorList n |n `mod` head [1..n`div`2] == 0 = x:[]
There are several ways to handle this. But first of all, let's write a small helper:
isFactorOf :: Integral a => a -> a -> Bool
isFactorOf x n = n `mod` x == 0
That way we can write 12 `isFactorOf` 24 and get either True or False. For the recursive part, let's assume that we use a function with two arguments: the first being the number we want to factorize, the second the factor we're currently testing. We only test factors less than or equal to n `div` 2, and this leads to:
createList n f | f <= n `div` 2 = if f `isFactorOf` n
                                    then f : next
                                    else next
               | otherwise      = []
  where next = createList n (f + 1)
So if the second parameter is a factor of n, we add it onto the list and proceed, otherwise we just proceed. We do this only as long as f <= n `div` 2. Now in order to create factorList, we can simply use createList with a suitable second parameter:
factorList n = createList n 1
The recursion is hidden in createList. As such, createList is a worker, and you could hide it in a where inside of factorList.
Note that one could easily define factorList with filter or list comprehensions:
factorList' n = filter (`isFactorOf` n) [1 .. n `div` 2]
factorList'' n = [ x | x <- [1 .. n`div` 2], x `isFactorOf` n]
But in this case you wouldn't have written the recursion yourself.
Further exercises:
Try to implement the filter function yourself.
Create another function, which returns only prime factors. You can either use your previous result and write a prime filter, or write a recursive function which generates them directly (latter is faster).
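For readers who want to compare notes after trying the exercises, here is one possible sketch of both. The names myFilter and primeFactors are made up, and primeFactors is the assumed "direct" variant using trial division; it returns prime factors with multiplicity, which is only one reading of the exercise.

```haskell
-- A hand-written filter: keep the elements that satisfy the predicate.
myFilter :: (a -> Bool) -> [a] -> [a]
myFilter _ [] = []
myFilter p (x:xs)
  | p x       = x : myFilter p xs
  | otherwise = myFilter p xs

-- Prime factors by trial division, smallest first, with multiplicity.
primeFactors :: Int -> [Int]
primeFactors n = go n 2
  where
    go m d
      | m < 2          = []                    -- nothing left to factor
      | d * d > m      = [m]                   -- remaining m must be prime
      | m `mod` d == 0 = d : go (m `div` d) d  -- d divides m, try d again
      | otherwise      = go m (d + 1)
```

The direct version stops testing at the square root of the remaining number, which is why it tends to be faster than filtering a full factor list for primes.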
@Zeta's answer is interesting. But if you're new to Haskell like I am, you may want a "simple" answer to start with. (Just to get the basic recursion pattern...and to understand the indenting, and things like that.)
I'm not going to divide anything by 2 and I will include the number itself. So factorlist 15 => [1,3,5,15] in my example:
factorList :: Int -> [Int]
factorList value = factorsGreaterOrEqual 1
  where
    factorsGreaterOrEqual test
      | (test == value) = [value]
      | (value `mod` test == 0) = test : restOfFactors
      | otherwise = restOfFactors
      where restOfFactors = factorsGreaterOrEqual (test + 1)
The first line is the type signature, which you already knew about. The type signature doesn't have to live right next to the list of pattern definitions for a function, (though the patterns themselves need to be all together on sequential lines).
Then factorList is defined in terms of a helper function. This helper function is defined in a where clause...that means it is local and has access to the value parameter. Were we to define factorsGreaterOrEqual globally, then it would need two parameters as value would not be in scope, e.g.
factorsGreaterOrEqual 4 15 => [5,15]
You might argue that factorsGreaterOrEqual is a useful function in its own right. Maybe it is, maybe it isn't. But in this case we're going to say it isn't of general use besides to help us define factorList...so using the where clause and picking up value implicitly is cleaner.
The indentation rules of Haskell are (to my tastes) weird, but here they are summarized. I'm indenting with two spaces here because it grows too far right if you use 4.
A list of boolean tests, each with a pipe character in front, is called "guards" in Haskell. I simply establish the terminal condition as being when the test hits the value; so factorsGreaterOrEqual N = [N] if we were doing a call to factorList N. Then we decide whether to prepend the test number to the list by whether dividing the value by it leaves no remainder. (otherwise is not a keyword but an ordinary Prelude value defined as True; it plays a role similar to default in C-like switch statements for the fall-through case.)
Showing another level of nesting and another implicit parameter demonstration, I added a where clause to locally define a function called restOfFactors. There is no need to pass test as a parameter to restOfFactors because it lives "in the scope" of factorsGreaterOrEqual...and as that lives in the scope of factorList then value is available as well.

Correct way to define a function in Haskell

I'm new to Haskell and I'm trying out a few tutorials.
I wrote this script:
lucky::(Integral a)=> a-> String
lucky 7 = "LUCKY NUMBER 7"
lucky x = "Bad luck"
I saved this as lucky.hs and ran it in the interpreter and it works fine.
But I am unsure about function definitions. It seems from the little I have read that I could equally define the function lucky as follows (function name is lucky2):
lucky2::(Integral a)=> a-> String
lucky2 x=(if x== 7 then "LUCKY NUMBER 7" else "Bad luck")
Both seem to work equally well. Clearly function lucky is clearer to read but is the lucky2 a correct way to write a function?
They are both correct. Arguably, the first one is more idiomatic Haskell because it uses its very important feature called pattern matching. In this form, it would usually be written as:
lucky::(Integral a)=> a-> String
lucky 7 = "LUCKY NUMBER 7"
lucky _ = "Bad luck"
The underscore signifies the fact that you are ignoring the exact form (value) of your parameter. You only care that it is different than 7, which was the pattern captured by your previous declaration.
The importance of pattern matching is best illustrated by a function that operates on more complicated data, such as lists. If you were to write a function that computes the length of a list, for example, you would likely start by providing a variant for empty lists:
len [] = 0
The [] clause is a pattern, which is set to match empty lists. Empty lists obviously have length of 0, so that's what we are having our function return.
The other part of len would be the following:
len (x:xs) = 1 + len xs
Here, you are matching on the pattern (x:xs). Colon : is the so-called cons operator: it prepends a value to a list. x:xs is therefore a pattern which matches some element (x) prepended to some list (xs). As a whole, it matches any list which has at least one element, since xs can also be an empty list ([]).
This second definition of len is also pretty straightforward. You compute the length of the remaining list (len xs) and add 1 to it, which corresponds to the first element (x).
(The usual way to write the above definition would be:
len (_:xs) = 1 + len xs
which again signifies that you do not care what the first element is, only that it exists).
A 3rd way to write this would be using guards:
lucky n
  | n == 7    = "lucky"
  | otherwise = "unlucky"
There is no reason to be confused about that. There is always more than 1 way to do it. Note that this would be true even if there were no pattern matching or guards and you had to use the if.
All of the forms we've covered so far use so-called syntactic sugar provided by Haskell. Guards are transformed to ordinary case expressions, as are multiple function clauses and if expressions. Hence the most low-level, unsugared way to write this would perhaps be:
lucky n = case n of
  7 -> "lucky"
  _ -> "unlucky"
While it is good that you check for idiomatic ways, I'd recommend that a beginner use whatever works best for them, whatever they understand best. For example, if you do not (yet) understand point-free style, there is no reason to force it. It will come to you sooner or later.
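To see that these styles really are interchangeable, the sketch below puts the four variants side by side and checks that they agree. The function names and the allAgree helper are made up for this comparison, and the type is fixed to Int for simplicity instead of the Integral constraint used in the question.

```haskell
luckyPattern, luckyIf, luckyGuard, luckyCase :: Int -> String

-- Pattern matching on the argument.
luckyPattern 7 = "lucky"
luckyPattern _ = "unlucky"

-- A plain if expression.
luckyIf n = if n == 7 then "lucky" else "unlucky"

-- Guards.
luckyGuard n
  | n == 7    = "lucky"
  | otherwise = "unlucky"

-- An explicit case expression.
luckyCase n = case n of
  7 -> "lucky"
  _ -> "unlucky"

-- True when all four definitions agree on every input in the given list.
allAgree :: [Int] -> Bool
allAgree = all (\n -> all (== luckyPattern n) [luckyIf n, luckyGuard n, luckyCase n])
```

Which form to use is then purely a question of readability.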

Why doesn't my Haskell function accept negative numbers?

I am fairly new to Haskell but do get most of the basics. However there is one thing that I just cannot figure out. Consider my example below:
example :: Int -> Int
example (n+1) = .....
The (n+1) part of this example somehow prevents the input of negative numbers, but I cannot understand how. For example, if the input were (-5), I would expect n to just be (-6), since (-6 + 1) is (-5). The output when testing is as follows:
Program error: pattern match failure: example (-5)
Can anyone explain to me why this does not accept negative numbers?
That's just how n+k patterns are defined to work:
Matching an n+k pattern (where n is a variable and k is a positive integer literal) against a value v succeeds if x >= k, resulting in the binding of n to x - k, and fails otherwise.
The point of n+k patterns is to perform induction, so you need to complete the example with a base case (k-1, or 0 in this case), and decide whether a parameter less than that would be an error or not. Like this:
example (n+1) = ...
example 0 = ...
The semantics that you're essentially asking for would be fairly pointless and redundant — you could just say
example n = let n' = n-1 in ...
to achieve the same effect. The point of a pattern is to fail sometimes.
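Note that n+k patterns were removed from the language in Haskell 2010, and GHC only accepts them when the NPlusKPatterns extension is enabled. The same "fail for anything below the base case" behaviour can be written with an ordinary guard. In the sketch below, the body (a factorial) and the error message are assumptions chosen purely for illustration, since the original example elides what the function computes.

```haskell
-- Hypothetical stand-in for the elided function body: a factorial.
example :: Int -> Int
example 0 = 1
example m
  | m >= 1    = let n = m - 1        -- plays the role of n in (n+1)
                in (n + 1) * example n
  | otherwise = error "example: negative argument"
```

The guard m >= 1 is exactly the condition under which the pattern (n+1) would have matched.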

Resources