Many times I see functions which operate on the head of a list, e.g:
trimHead ('\n':xs) = xs
trimHead xs = xs
then I see the definition:
trimTail = reverse . trimHead . reverse
then I see:
trimBoth = trimHead . trimTail
They are clean, but are trimTail and trimBoth efficient? Is there a better way?
Consider this alternative implementation:
trimTail2 [] = []
trimTail2 ['\n'] = []
trimTail2 (x:xs) = x : trimTail2 xs
trimBoth2 = trimHead . trimTail2
It's easy to confirm that trimTail and trimBoth require that the entire list be evaluated, while trimTail2 and trimBoth2 only evaluate as much of the list as is necessary.
*Main> head $ trimTail ('h':undefined)
*** Exception: Prelude.undefined
*Main> head $ trimBoth ('h':undefined)
*** Exception: Prelude.undefined
*Main> head $ trimTail2 ('h':undefined)
'h'
*Main> head $ trimBoth2 ('h':undefined)
'h'
This implies that your version is going to be less efficient if the whole result is not needed.
Assuming the whole list is to be evaluated (if you don't need the whole list, why are you trimming the end?), it's about half as efficient as you can get out of immutable lists, but it has the same asymptotic complexity O(n).
The new list requires at least:
You have to find the end: n pointer traversals.
You have to modify the end, and thus what points to the end, etc.: n cons of existing data with new pointers.
reverse . trimHead . reverse performs roughly twice this:
The first reverse performs n pointer traversals and n cons.
trimHead possibly performs 1 pointer traversal.
The second reverse performs n pointer traversals and n cons.
Is this worth worrying about? In some circumstances, maybe. Is the code too slow, and is this called a lot? In others, maybe not. Benchmark! The implementation with reverse is nice and easy to understand, and that's important.
There is a fairly natural recursive step-through-the-list solution, which will only evaluate as much of the output as is consumed, so in the case that you don't know whether you need the whole string, you can possibly save some evaluation.
It isn't efficient in the sense that streaming is impossible: the whole list has to be evaluated to get even a single element. A better solution is difficult, because you need to evaluate the rest of the list to know whether a line break is to be trimmed or not. A slightly more efficient way is to look ahead to see whether the line break is to be trimmed and react appropriately:
trimTail, trimHead, trimBoth :: String -> String
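trimTail "" = ""  -- base case: without it, any string not ending in '\n' hits a pattern-match failure
trimTail ('\n':xs) | all (=='\n') xs = ""
trimTail (x:xs) = x : trimTail xs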
trimHead = dropWhile (=='\n')
trimBoth = trimTail . trimHead
The solution above evaluates only as much of the string as is needed to know whether the line break is to be trimmed. An even better method would incorporate the knowledge that the next n chars are not to be trimmed. Implementing this is left as an exercise for the reader.
An even better (and shorter) way to write trimTail is this way (by rotsor):
trimTail = foldr step [] where
step '\n' [] = []
step x xs = x:xs
Generally, try to avoid reverse. Usually there is a better way to solve the problem.
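As a quick check of how the foldr version behaves, here is a small sketch. The name trimTailFold is only a renamed copy of the definition above, introduced so that it can sit next to the other variants. Two things are worth noticing: it stays lazy for every character that is not a newline, and it removes all trailing newlines, whereas trimTail2 removes at most one.

```haskell
-- Renamed copy of the foldr-based trimTail (the name is an assumption).
trimTailFold :: String -> String
trimTailFold = foldr step []
  where
    -- A newline is dropped only when nothing survives after it.
    step '\n' [] = []
    -- Any other character is emitted without inspecting the rest.
    step x xs = x : xs

-- Laziness check in the style of the GHCi session above.
firstChar :: Char
firstChar = head (trimTailFold ('h' : undefined))
```

Evaluating firstChar yields 'h' without touching the undefined tail, because the first clause of step fails on the character alone and the second clause never forces xs.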
Are trimTail and trimBoth efficient?
They both take O(n) time (time directly proportional to the size of the list) since the entire list must be traversed twice in order to perform the two reverses.
Is there a better way?
Well, do you have to use lists? With Data.Sequence you can modify either end of the list in constant time. If you're stuck with lists, then check out the other solutions suggested here. If you can use Sequences instead, then just modify FUZxxl's answer to use dropWhileR.
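A minimal sketch of the Data.Sequence route, assuming the containers package is available (it ships with GHC). The function names are made up for illustration. Like FUZxxl's dropWhile-based answer, this removes every leading or trailing newline, not just one.

```haskell
import Data.Foldable (toList)
import Data.Sequence (Seq)
import qualified Data.Sequence as Seq

-- dropWhileR removes matching elements starting from the right end.
trimTailSeq :: Seq Char -> Seq Char
trimTailSeq = Seq.dropWhileR (== '\n')

-- dropWhileL does the same from the left end.
trimHeadSeq :: Seq Char -> Seq Char
trimHeadSeq = Seq.dropWhileL (== '\n')

trimBothSeq :: Seq Char -> Seq Char
trimBothSeq = trimHeadSeq . trimTailSeq

-- Wrapper for callers that still hold a String. Both conversions
-- are O(n), so this pays off only if the data stays in a Seq.
trimBothViaSeq :: String -> String
trimBothViaSeq = toList . trimBothSeq . Seq.fromList
```

Neither direction needs a reverse, and the cost of each trim depends on how many newlines are dropped rather than on the length of the whole sequence.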
I have two arrays of equal size and I want to combine them element-wise. What is the best way to do this? The array package doesn't seem to provide a zipWith equivalent function.
I'm reluctant to make my own function because the main way I can think of doing this is to convert back and forth with lists. I care about speed and I assume this way is not the most efficient way.
Option 1: use repa. Might get some performance benefit from the parallelism as well, should you care about it.
Option 2: just get indices using bounds. I would suggest avoiding list comprehensions in general; though the other answer uses them correctly in this case, it may be worthwhile to get in the habit of doing the right thing.
zipWithArr f xs ys = listArray (bounds xs) $ fmap (liftA2 f (xs !) (ys !)) (range (bounds xs))
The reason that this works is that Haskell's lists are lazy, and we can generally treat them as control structures (due to various optimizations), though we can not treat them as containers. Moreover, GHC evaluates an expression at most once per lambda, so in general you do not need to worry that this will be done inefficiently.
It's pretty easy to cook one up yourself:
zipWithA f xs ys = listArray (bounds xs) [f (xs ! i) (ys ! i) | i <- range (bounds xs)]
I am trying to solve one of the problems in H99:
Split a list into two parts; the length of the first part is given.
Do not use any predefined predicates.
Example:
> (split '(a b c d e f g h i k) 3)
( (A B C) (D E F G H I K))
And I can quickly come up with a solution:
split'::[a]->Int->Int->[a]->[[a]]
split' [] _ _ _ = []
split' (x:xs) y z w = if y == z then [w,x:xs] else split' xs y (z+1) (w++[x])
split::[a]->Int->[[a]]
split x y = split' x y 0 []
My question is: what I am doing is more or less rewriting the loop version in recursive form. Is this the right way to do things in Haskell? Isn't it just the same as imperative programming?
EDIT: Also, how do you generally avoid the extra function here?
It's convenient that you can often convert an imperative solution to Haskell, but you're right, you do usually want to find a more natural recursive statement. For this one in particular, reasoning in terms of base case and inductive case can be very helpful. So what's your base case? Why, when the split location is 0:
split x 0 = ([], x)
The inductive case can be built on that by prepending the first element of the list onto the result of splitting with n-1:
split (x:xs) n = (x:left, right)
  where (left, right) = split xs (n-1)
This may not perform wonderfully (it's probably not as bad as you'd think) but it illustrates my thought process when I first encounter a problem and want to approach it functionally.
Edit: Another solution relying more heavily on the Prelude might be:
split l n = (take n l, drop n l)
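For comparison, here is a small sketch placing the take/drop version next to Prelude's splitAt, which the Haskell report specifies as equivalent to that pair. The names splitTakeDrop and splitPrelude are hypothetical; they only keep the list-first argument order used in this answer. Note that the H99 exercise asks you not to lean on predefined functions, so this is for reference rather than as a solution.

```haskell
-- The take/drop pair from the answer, with the list as first argument.
splitTakeDrop :: [a] -> Int -> ([a], [a])
splitTakeDrop l n = (take n l, drop n l)

-- Prelude's splitAt produces the same pair; it takes the count first.
splitPrelude :: [a] -> Int -> ([a], [a])
splitPrelude l n = splitAt n l
```

Both versions are total: a count larger than the list length simply gives an empty second part, and a count of zero or less gives an empty first part.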
It's not really the same as imperative programming: each function call avoids any side effects, and they're just simple expressions. But I have a suggestion for your code:
split :: Int -> [a] -> ([a], [a])
split p xs = go p ([], xs)
  where go 0 (xs, ys) = (reverse xs, ys)
        go _ (xs, []) = (reverse xs, [])  -- list shorter than p: stop early
        go n (xs, y:ys) = go (n-1) (y : xs, ys)
Notice how we've declared that we're returning exactly two things, ([a], [a]), instead of a list of things (which is a bit misleading), and that we've confined the tail-recursive helper to local scope.
I'm also using pattern matching, which is a more idiomatic way to write recursive functions in Haskell: when go is called with a zero, the first case is run. It's generally more pleasant to write recursive functions that count down rather than up, since you can use pattern matching rather than if statements.
Finally this is more efficient since ++ is linear in the length of the first list, which means that the complexity of your function is quadratic rather than linear. This method is also tail recursive unlike Daniel's solution, which is important for handling any large lists.
TLDR: Both versions are functional style, avoiding mutation, using recursion instead of loops. But the version I've presented is a little more Haskell-ish and slightly faster.
A word on tail recursion
This solution uses tail recursion, which isn't always essential in Haskell. In this case it helps when you use the whole of the resulting lists, but at other times it is actually a bad thing. For example, map isn't tail recursive, but if it were you couldn't use it over infinite lists!
In this case, we can use tail recursion, since an integer is always finite. But, if we only use the first element of the list, Daniel's solution is much faster, since it produces the list lazily. On the other hand, if we use the whole list, my solution is much faster.
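The trade-off can be seen in a small sketch. The names splitLazy and splitAcc are hypothetical stand-ins for the two styles discussed here, written to be total so they can be run on any input; they are not the exact code from either answer.

```haskell
-- Lazy style: each cell of the first part is produced on demand.
splitLazy :: Int -> [a] -> ([a], [a])
splitLazy n xs | n <= 0 = ([], xs)
splitLazy _ [] = ([], [])
splitLazy n (x:xs) = (x : left, right)
  where (left, right) = splitLazy (n - 1) xs

-- Accumulating style: collect the first part reversed, reverse once.
splitAcc :: Int -> [a] -> ([a], [a])
splitAcc n = go n []
  where
    go k acc (y:ys) | k > 0 = go (k - 1) (y : acc) ys
    go _ acc rest = (reverse acc, rest)

-- Only three elements of the first part are demanded here.
firstThree :: [Integer]
firstThree = take 3 (fst (splitLazy 1000000 [1 ..]))
```

Evaluating firstThree inspects only the first three elements of the input, whereas the same expression written with splitAcc has to walk a million elements before the first one becomes available. When the whole result is consumed, the accumulating version avoids building a chain of pending thunks instead.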
split'::[a]->Int->([a],[a])
split' [] _ = ([],[])
split' xs 0 = ([],xs)
split' (x:xs) n = (x : fst splitResult, snd splitResult)
  where splitResult = split' xs (n-1)
It seems you have already shown an example of a better solution.
I would recommend you read SICP. You will then come to the conclusion that the extra function is normal. There's also a widely used approach of hiding such helper functions in a local scope. The book may seem boring to you, but its early chapters will get you used to the functional approach to solving problems.
There are tasks for which the recursive approach is more necessary. But if, for example, you use tail recursion (which is so often praised without cause), you will notice that it is just ordinary iteration, often with an "extra function" that hides the iteration variable (the word "variable" is not very appropriate here; "argument" is closer).
So, I'm new here, and I would like to ask 2 questions about some code:
Duplicate each element in list by n times. For example, duplicate [1,2,3] should give [1,2,2,3,3,3]
duplicate1 xs = x*x ++ duplicate1 xs
What is wrong in here?
Take positive numbers from list and find the minimum positive subtraction. For example, [-2,-1,0,1,3] should give 1 because (1-0) is the lowest difference above 0.
For your first part, there are a few issues: you forgot the pattern in the first argument, you are trying to square the first element rather than replicate it, and there is no second case to end your recursion (it will crash). To help, here is a type signature:
replicate :: Int -> a -> [a]
For your second part, if it has been covered in your course, you could try a list comprehension to get all differences of the numbers, and then you can apply the minimum function. If you don't know list comprehensions, you can do something similar with concatMap.
Don't forget that you can check functions on http://www.haskell.org/hoogle/ (Hoogle) or similar search engines.
Tell me if you need a more thorough answer.
To your first question:
Use pattern matching. You can write something like duplicate (x:xs). This will deconstruct the first cell of the parameter list. If the list is empty, the next pattern is tried:
duplicate (x:xs) = ... -- list is not empty
duplicate [] = ... -- list is empty
The function replicate n x creates a list that contains n copies of x. For instance, replicate 3 'a' yields ['a','a','a'].
Use recursion. To understand, how recursion works, it is important to understand the concept of recursion first ;)
1)
dupe :: [Int] -> [Int]
dupe l = concat [replicate i i | i<-l]
There are a few problems with yours, one being that you are squaring each term, not creating a new list. In addition, your pattern matching is off and you would create an infinite recursion: note how you recurse on the exact same list that was passed in. I think you mean something along the lines of duplicate1 (x:xs) = (replicate x x) ++ duplicate1 xs, and that would be fine, so long as you write a proper base case as well.
2)
This is pretty straightforward from your problem description, but probably not too efficient. It first filters out negatives, then checks all subtractions with non-negative results. The answer is the minimum of these:
p2 l = let l2 = filter (\x -> x >= 0) l
       in minimum [i-j | i<-l2, j<-l2, i >= j]
The problem here is that it allows a number to be checked against itself, which will lead to an answer of zero every time. Any ideas? I'd like to leave it to you; the commenter has a point about spoon-feeding.
1) You can use the fact that list is a monad:
dup = (=<<) (\x -> replicate x x)
Or in do-notation:
dup xs = do x <- xs; replicate x x; return x
2) For getting only the non-negative numbers from a list, you can use filter:
filter (>= 0) [1,-1,0,-5,3]
-- [1,0,3]
To get all possible "pairings" you can use either monads or applicative functors:
import Control.Applicative
(,) <$> [1,2,3] <*> [1,2,3]
[(1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3)]
Of course instead of creating pairs you can generate directly differences when replacing (,) by (-). Now you need to filter again, discarding all zero or negative differences. Then you only need to find the minimum of the list, but I think you can guess the name of that function.
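Putting those steps together might look like the sketch below. The name minPositiveDiff and the choice to return Maybe are assumptions made for illustration; the Maybe avoids calling minimum on an empty list when no strictly positive difference exists.

```haskell
-- Hypothetical helper: smallest strictly positive difference between
-- any two non-negative elements, or Nothing if there is none.
minPositiveDiff :: [Int] -> Maybe Int
minPositiveDiff xs =
  case filter (> 0) ((-) <$> nonNeg <*> nonNeg) of
    [] -> Nothing
    diffs -> Just (minimum diffs)
  where
    -- Keep zero, as in the example from the question.
    nonNeg = filter (>= 0) xs
```

For the list from the question, [-2,-1,0,1,3], the non-negative elements are [0,1,3], the positive differences are 1, 3 and 2, and the result is Just 1. Filtering with (> 0) also discards the zero that appears when a number is paired with itself.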
Here, this should do the trick:
dup [] = []
dup (x:xs) = (replicate x x) ++ (dup xs)
We define dup recursively: for empty list it is just an empty list, for a non empty list, it is a list in which the first x elements are equal to x (the head of the initial list), and the rest is the list generated by recursively applying the dup function. It is easy to prove the correctness of this solution by induction (do it as an exercise).
Now, let's analyze your initial solution:
duplicate1 xs = x*x ++ duplicate1 xs
The first mistake: you did not define the list pattern properly. According to your definition, the function has just one argument - xs. To achieve the desired effect, you should use the correct pattern for matching the list's head and tail (x:xs, see my previous example). Read up on pattern matching.
But that's not all. Second mistake: x*x is actually x squared, not a list of two values. Which brings us to the third mistake: ++ expects both of its operands to be lists of values of the same type, while in your code you're trying to apply ++ to two values of types Int and [Int].
As for the second task, the solution has already been given.
HTH
Struggling to learn Haskell: how does one take the head of a string and compare it with the next character until the comparison is no longer true?
In pseudo code I'm trying to:
while x == 'next char in string' put in new list to be returned
The general approach would be to create a function that recursively evaluates the head of the string until it finds the false value or reaches the end.
To do that, you would need to
understand recursion (prerequisite: understand recursion) and how to write recursive functions in Haskell
know how to use the head function
quite possibly know how to use list comprehension in Haskell
I have notes on Haskell that you may find useful, but you may well find Yet Another Haskell Tutorial more comprehensive (Sections 3.3 Lists; 3.5 Functions; and 7.8 More Lists would probably be good places to start in order to address the bullet points I mention)
EDIT0:
An example using guards to test the head element and continue only if it is the same as the second element:
someFun :: String -> String
someFun [] = []
someFun [_] = []  -- a single character has nothing to compare against
someFun (x:y:xs)
  | x == y = someFun (y:xs)
  | otherwise = []
EDIT1:
I sort of want to say x = (newlist) and then rather than otherwise = [] have otherwise = [newlist] if that makes any sense?
It makes sense in an imperative programming paradigm (e.g. C or Java), less so for functional approaches
Here is a concrete example to, hopefully, highlight the difference between the if/then/else concept the quote suggests and what is happening in the someFun function:
When we call someFun "aabb", it matches someFun (x:y:xs) with x = 'a' and y = 'a'. Since x == y, someFun "aabb" = someFun "abb". That again matches someFun (x:y:xs), but this time the condition x == y is false, so we use the otherwise guard and get someFun "aabb" = someFun "abb" = []. Hence the result of someFun "aabb" is [].
So where did the data go? Well, I'll hold my hands up and admit to a bug in the code, which is now a feature I'm using to explain how Haskell functions work.
I find it helpful to think more in terms of constructing mathematical expressions rather than programming operations. So the expression on the right of the = is your result, not an assignment in the imperative sense (e.g. as in Java or C).
I hope the concrete example has shown that Haskell evaluates expressions using substitution, so if you don't want something in your result, then don't include it in that expression. Conversely, if you do want something in the result, then put it in the expression.
Since your pseudocode is
while x == 'next char in string' put in new list to be returned
I'll modify the someFun function to do the opposite and let you figure out how it needs to be modified to work as you desire.
someFun2 :: String -> String
someFun2 [] = []
someFun2 [x] = [x]  -- last character: nothing left to compare it with
someFun2 (x:y:xs)
  | x == y = []
  | otherwise = x : someFun2 (y:xs)
Example Output:
someFun2 "aabb" = ""
someFun2 "abbab" = "a"
someFun2 "ababbab" = "aba"
someFun2 "abab" = "abab"
(I'd like to add at this point, that these various code snippets aren't tested as I don't have a compiler to hand, so please point out any errors so I can fix them, thanks)
There are two typical ways to get the head of a string. head, and pattern matching (x:xs).
In fact, the source for the head function shows that it is simply defined with pattern matching:
head (x:_) = x
head _ = badHead
I highly recommend you check out Learn You a Haskell # Pattern Matching. It gives this example, which might help:
tell (x:y:[]) = "The list has two elements: " ++ show x ++ " and " ++ show y
Notice how it pattern matched against (x:y:[]), meaning the list must have two elements, and no more. To match the first two elements in a longer list, just swap [] for a variable: (x:y:xs).
If you choose the pattern matching approach, you will need to use recursion.
Another approach is the zip xs (drop 1 xs). This little idiom creates tuples from adjacent pairs in your list.
ghci> let xs = [1,2,3,4,5]
ghci> zip xs (drop 1 xs)
[(1,2),(2,3),(3,4),(4,5)]
You could then write a function that looks at these tuples one by one. It would also be recursive, but it could be written as a foldl or foldr.
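One possible reading of the question is "return the run of equal characters at the start of the string". Under that assumption, the adjacent-pairs idiom can be sketched as follows. The name leadingRun is hypothetical, and takeWhile is used here in place of an explicit fold or hand-written recursion.

```haskell
-- Hypothetical helper: the run of identical characters that opens
-- the string.
leadingRun :: String -> String
leadingRun [] = []
leadingRun s@(x:xs) =
  -- zip s xs pairs each character with its successor; keep the pairs
  -- while both halves agree, then take the successor of each pair.
  x : map snd (takeWhile (uncurry (==)) (zip s xs))
```

For "aaabba" the pairs are ('a','a'), ('a','a'), ('a','b') and so on; takeWhile keeps the first two, so the result is "aaa". Because takeWhile stops at the first mismatch, the rest of the string is never examined.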
For understanding recursion in Haskell, LYAH is again highly recommended:
Learn You a Haskell # Recursion
Let's say I have a list of values to which I want to apply a sequence of operations until I get a final result:
[0, 1, 2]
firstOperation xs = map (+1) xs
secondOperation xs = filter even xs
thirdOperation xs = sum xs
Although I am sure there are other better ways to handle this, the only one I currently know is to define a function that calls all these functions nested one inside another:
runAllOperations xs = thirdOperation (secondOperation (firstOperation xs))
but this is ugly, and if I have 10 operations it turns this bit of code into a maintenance nightmare.
What is the correct way of implementing something of this kind? Keep in mind the example I gave above is just an oversimplification of what I am facing on my current project.
. or $ are way more readable than ( and )
runAllOperations xs = thirdOperation $ secondOperation $ firstOperation xs
or
runAllOperations = thirdOperation . secondOperation . firstOperation
If you can make a list of all the operations, you can then fold the composition operator over that list:
foldr (.) id fns
Then you can apply the result of that to the initial values.
Though you might need to apply a final reduction step separately.
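A sketch of the fold-based approach using the three operations from the question. The name pipeline is made up for illustration. The key constraint is that every function in the list must have the same type, which is why the final sum is kept out of the list.

```haskell
-- The three steps from the question, with explicit types.
firstOperation, secondOperation :: [Int] -> [Int]
firstOperation = map (+ 1)
secondOperation = filter even

thirdOperation :: [Int] -> Int
thirdOperation = sum

-- All elements share the type [Int] -> [Int].
-- foldr (.) id applies the rightmost element first, so the list
-- reads in the same order as a chain written with (.).
pipeline :: [[Int] -> [Int]]
pipeline = [secondOperation, firstOperation]

-- The reduction has a different type, so it is applied separately.
runAllOperations :: [Int] -> Int
runAllOperations = thirdOperation . foldr (.) id pipeline
```

With the input [0, 1, 2] from the question, the steps give [1,2,3], then [2], then 2. If you prefer the list to read in execution order, one option is to fold over reverse pipeline instead.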