Haskell maximumBy for Ord instances - haskell

I keep finding myself wanting a function for this, and I always need to implement it myself, but I feel like there must be a built-in for it.
Basically, what I want is sort of halfway between maximum and maximumBy.
maximum takes a list of Ord values and returns the maximum. This is O(n).
maximum :: (Ord a) => [a] -> a
maximumBy takes a list of non-Ord values and a sort function that provides an ordering for them, and computes the maximum using that function. This is O(nlogn) (or whatever the complexity is for a sort).
maximumBy :: (a -> a -> Ordering) -> [a] -> a
I want a maximumBy that takes a list of values of type a and a function that returns Ord values for a values, and computes the maximum a using the Ord values. This would be O(n) because it only needs to keep track of the maximum as it goes along. This is what that signature would be:
maximumBy' :: (Ord b) => (a -> b) -> [a] -> a
And this is my rough implementation:
maximumBy' func list = foldl1' (\max i -> let (maxby, iby) = (func max, func i) in if iby > maxby then i else max) list
Is there such a function built-in, or perhaps an easy way to write this using existing built-ins?

First of all you make a claim that is wrong:
maximumBy takes a list of non-Ord values and a sort function that provides an ordering for them, and computes the maximum using that function. This is O(nlogn).
No sorting has to be done to calculate the maximum element. An ordering relation has to be reflexive, anti-symmetric and transitive, and therefore one can simply always hold the maximum so far and use the custom comparison to check whether the new item is greater.
Furthermore, you can construct your maximumBy' function as:
maximumBy' f = maximumBy (compare `on` f)
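To make the single-pass claim concrete, here is a minimal sketch of a maximumBy-style function written as a strict left fold. The name maximumByFold is hypothetical, and this is not the actual source of Data.List.maximumBy. It only illustrates that no sorting is needed: each element is compared once against the best one seen so far.

```haskell
import Data.List (foldl1')

-- Hypothetical stand-in for maximumBy: keep the best element seen so far
-- and compare each new element against it exactly once, so the whole
-- traversal is O(n) comparisons.
maximumByFold :: (a -> a -> Ordering) -> [a] -> a
maximumByFold cmp = foldl1' pick
  where
    pick best x = case cmp best x of
      GT -> best  -- the current maximum is strictly greater, keep it
      _  -> x     -- otherwise the new element takes over
```

Like maximumBy itself, this sketch is partial: it fails on an empty list.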

I think what you are looking for is easily expressed using Data.Ord.comparing:
Instead of using your maximumBy' function:
maximumBy' head ["abc", "def", "geh"]
you can build a comparator with comparing and give that to the usual maximumBy:
maximumBy (comparing head) ["abc", "def", "geh"]
And as Willem Van Onsem says in another answer, you are incorrect to be concerned about the cost of maximumBy: it does not sort the list.
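As a small self-contained sketch, the comparing form and the on form from the other answers can be put side by side. The names viaComparing and viaOn are made up for this illustration; both are expected to pick the same element.

```haskell
import Data.Function (on)
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Both definitions build the same comparator: compare two strings
-- by their first character.
viaComparing :: [String] -> String
viaComparing = maximumBy (comparing head)

viaOn :: [String] -> String
viaOn = maximumBy (compare `on` head)
```

For the list ["abc", "def", "geh"] both return "geh". Note that head is partial, so either version fails if the list contains an empty string.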

That function is defined in the tip-lib package as
maximumOn :: (Foldable f, Ord b) => (a -> b) -> f a -> b
The general idea has also been adopted in recent base, with
sortOn :: Ord b => (a -> b) -> [a] -> [a]
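Note that thanks to laziness, you can use (with Down from Data.Ord, so that the element with the largest key is sorted to the front)
maximumOn :: Ord b => (a -> b) -> [a] -> a
maximumOn f = head . sortOn (Down . f)
and get the same asymptotic performance as a manual implementation.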

I don't know any built-in function that does what you want, but you can really simplify your implementation of maximumBy'.
You could use on for example:
import Data.Function (on)
maximumBy' :: Ord b => (a -> b) -> [a] -> a
maximumBy' f = maximumBy (compare `on` f)
In fact, this definition is so simple you could use it as-is, without defining maximumBy' at all.

Related

Maybe monad and a list

Ok, so I am trying to learn how to use monads, starting out with maybe. I've come up with an example that I can't figure out how to apply it to in a nice way, so I was hoping someone else could:
I have a list containing a bunch of values. Depending on these values, my function should return the list itself, or a Nothing. In other words, I want to do a sort of filter, but with the consequence of a hit being the function failing.
The only way I can think of is to use a filter, then comparing the size of the list I get back to zero. Is there a better way?
This looks like a good fit for traverse:
traverse :: (Traversable t, Applicative f) => (a -> f b) -> t a -> f (t b)
That's a bit of a mouthful, so let's specialise it to your use case, with lists and Maybe:
GHCi> :set -XTypeApplications
GHCi> :t traverse @[] @Maybe
traverse @[] @Maybe :: (a -> Maybe b) -> [a] -> Maybe [b]
It works like this: you give it an a -> Maybe b function, which is applied to all elements of the list, just like fmap does. The twist is that the Maybe b values are then combined in a way that only gives you a modified list if there aren't any Nothings; otherwise, the overall result is Nothing. That fits your requirements like a glove:
noneOrNothing :: (a -> Bool) -> [a] -> Maybe [a]
noneOrNothing p = traverse (\x -> if p x then Nothing else Just x)
(allOrNothing would have been a more euphonic name, but then I'd have to flip the test with respect to your description.)
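A short usage sketch of noneOrNothing may help here. The definition is repeated so that the block stands on its own, and the sample inputs in the comments are made up for illustration.

```haskell
-- Same definition as above: fail with Nothing as soon as any element
-- satisfies the predicate, otherwise return the list unchanged in a Just.
noneOrNothing :: (a -> Bool) -> [a] -> Maybe [a]
noneOrNothing p = traverse (\x -> if p x then Nothing else Just x)

-- noneOrNothing even [1, 3, 5]  gives  Just [1,3,5]
-- noneOrNothing even [1, 2, 5]  gives  Nothing
-- noneOrNothing even []         gives  Just []
```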
There are a lot of things we might discuss about the Traversable and Applicative classes. For now, I will talk a bit more about Applicative, in case you haven't met it yet. Applicative is a superclass of Monad with two essential methods: pure, which is the same thing as return, and (<*>), which is not entirely unlike (>>=) but crucially different from it. For the Maybe example...
GHCi> :t (>>=) @Maybe
(>>=) @Maybe :: Maybe a -> (a -> Maybe b) -> Maybe b
GHCi> :t (<*>) @Maybe
(<*>) @Maybe :: Maybe (a -> b) -> Maybe a -> Maybe b
... we can describe the difference like this: in mx >>= f, if mx is a Just-value, (>>=) reaches inside of it to apply f and produce a result, which, depending on what was inside mx, will turn out to be a Just-value or a Nothing. In mf <*> mx, though, if mf and mx are Just-values you are guaranteed to get a Just value, which will hold the result of applying the function from mf to the value from mx. (By the way: what will happen if mf or mx are Nothing?)
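The contrast can be tried out with a few concrete values. In this sketch, halveEven is a hypothetical helper invented for the example, and the expected results are given in the comments.

```haskell
-- Hypothetical helper: succeeds only for even numbers.
halveEven :: Int -> Maybe Int
halveEven n = if even n then Just (n `div` 2) else Nothing

-- With (>>=), the function sees the value and decides between Just and Nothing.
bindExamples :: [Maybe Int]
bindExamples =
  [ Just 4 >>= halveEven   -- Just 2
  , Just 3 >>= halveEven   -- Nothing, decided by halveEven
  , Nothing >>= halveEven  -- Nothing, halveEven is never called
  ]

-- With (<*>), two Just-values always give a Just-value.
apExamples :: [Maybe Int]
apExamples =
  [ Just (+ 1) <*> Just 2                        -- Just 3
  , (Nothing :: Maybe (Int -> Int)) <*> Just 2   -- Nothing
  , Just (+ 1) <*> Nothing                       -- Nothing
  ]
```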
traverse involves Applicative because the combining of values I mentioned at the beginning (which, in your example, turns a number of Maybe a values into a Maybe [a]) is done using (<*>). As your question was originally about monads, it is worth noting that it is possible to define traverse using Monad rather than Applicative. This variation goes by the name mapM:
mapM :: (Traversable t, Monad m) => (a -> m b) -> t a -> m (t b)
We prefer traverse to mapM because it is more general -- as mentioned above, Applicative is a superclass of Monad.
On a closing note, your intuition about this being "a sort of filter" makes a lot of sense. In particular, one way to think about Maybe a is that it is what you get when you pick booleans and attach values of type a to True. From that vantage point, (<*>) works as an && for these weird booleans, which combines the attached values if you happen to supply two of them (cf. DarthFennec's suggestion of an implementation using any). Once you get used to Traversable, you might enjoy having a look at the Filterable and Witherable classes, which play with this relationship between Maybe and Bool.
duplode's answer is a good one, but I think it is also helpful to learn to operate within a monad in a more basic way. It can be a challenge to learn every little monad-general function, and see how they could fit together to solve a specific problem. So, here's a DIY solution that shows how to use do notation and recursion, tools which can help you with any monadic question.
forbid :: (a -> Bool) -> [a] -> Maybe [a]
forbid _ [] = Just []
forbid p (x:xs) = if p x
  then Nothing
  else do
    remainder <- forbid p xs
    Just (x : remainder)
Compare this to an implementation of remove, the opposite of filter:
remove :: (a -> Bool) -> [a] -> [a]
remove _ [] = []
remove p (x:xs) = if p x
  then remove p xs
  else
    let remainder = remove p xs
    in x : remainder
The structure is the same, with just a couple differences: what you want to do when the predicate returns true, and how you get access to the value returned by the recursive call. For remove, the returned value is a list, and so you can just let-bind it and cons to it. With forbid, the returned value is only maybe a list, and so you need to use <- to bind to that monadic value. If the return value was Nothing, bind will short-circuit the computation and return Nothing; if it was Just a list, the do block will continue, and cons a value to the front of that list. Then you wrap it back up in a Just.
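To see the short-circuiting more directly, the do block can be written with an explicit (>>=). This is a sketch under the assumption that it behaves like the do-notation version. The name forbidBind is made up to keep it apart from forbid.

```haskell
-- The same recursion as forbid, with the do block written as an explicit bind.
forbidBind :: (a -> Bool) -> [a] -> Maybe [a]
forbidBind _ [] = Just []
forbidBind p (x:xs)
  | p x       = Nothing
  | otherwise =
      -- If the recursive call yields Nothing, (>>=) returns Nothing without
      -- running the lambda; otherwise the lambda conses x onto the result.
      forbidBind p xs >>= \remainder -> Just (x : remainder)
```

For example, forbidBind even [1, 3, 5] gives Just [1,3,5], while forbidBind even [1, 4, 5] gives Nothing.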

Point Free Style Required for Optimized Curry

Say we have a (contrived) function like so:
import Data.List (sort)
contrived :: Ord a => [a] -> [a] -> [a]
contrived a b = (sort a) ++ b
And we partially apply it to use elsewhere, eg:
map (contrived [3,2,1]) [[4],[5],[6]]
On the surface, this works as one would expect:
[[1,2,3,4],[1,2,3,5],[1,2,3,6]]
However, if we throw some traces in:
import Debug.Trace (trace)
contrived :: Ord a => [a] -> [a] -> [a]
contrived a b = (trace "sorted" $ sort a) ++ b
map (contrived $ trace "a value" [3,2,1]) [[4],[5],[6]]
We see that the first list passed into contrived is evaluated only once, but it is sorted for each item in [[4],[5],[6]]:
[sorted
a value
[1,2,3,4],sorted
[1,2,3,5],sorted
[1,2,3,6]]
Now, contrived can be rather simply translated to point-free style:
contrived :: Ord a => [a] -> [a] -> [a]
contrived a = (++) (sort a)
Which when partially applied:
map (contrived [3,2,1]) [[4],[5],[6]]
Still works as we expect:
[[1,2,3,4],[1,2,3,5],[1,2,3,6]]
But if we again add traces:
contrived :: Ord a => [a] -> [a] -> [a]
contrived a = (++) (trace "sorted" $ sort a)
map (contrived $ trace "a value" [3,2,1]) [[4],[5],[6]]
We see that now the first list passed into contrived is evaluated and sorted only once:
[sorted
a value
[1,2,3,4],[1,2,3,5],[1,2,3,6]]
Why is this so? Since the translation into pointfree style is so trivial, why can't GHC deduce that it only needs to sort a once in the first version of contrived?
Note: I know that for this rather trivial example, it's probably preferable to use pointfree style. This is a contrived example that I've simplified quite a bit. The real function that I'm having the issue with is less clear (in my opinion) when expressed in pointfree style:
realFunction a b = conditionOne && conditionTwo
  where conditionOne = map (something a) b
        conditionTwo = somethingElse a b
In pointfree style, this requires writing an ugly wrapper (both) around (&&):
realFunction a = both conditionOne conditionTwo
  where conditionOne = map (something a)
        conditionTwo = somethingElse a
both f g x = (f x) && (g x)
As an aside, I'm also not sure why the both wrapper works; the pointfree style of realFunction behaves like the pointfree style version of contrived in that the partial application is only evaluated once (ie. if something sorted a it would only do so once). It appears that since both is not pointfree, Haskell should have the same issue that it had with the non-pointfree contrived.
If I understand correctly, you are looking for this:
contrived :: Ord a => [a] -> [a] -> [a]
contrived a = let a' = sort a in \b -> a' ++ b
-- or ... in (a' ++)
If you want the sort to be computed only once, it has to be done before the \b.
You are correct in that a compiler could optimize this. This is known as the "full laziness" optimization.
If I remember correctly, GHC does not always do it because it's not always an actual optimization, in the general case. Consider the contrived example
foo :: Int -> Int -> Int
foo x y = let a = [1..x] in length a + y
When passing both arguments, the above code works in constant space: the list elements are immediately garbage collected as they are produced.
When partially applying x, the closure for foo x only requires O(1) memory, since the list is not yet generated. Code like
let f = foo 1000 in f 10 + f 20 -- (*)
still runs in constant space.
Instead, if we wrote
foo :: Int -> Int -> Int
foo x = let a = [1..x] in (length a +)
then (*) would no longer run in constant space. The first call f 10 would allocate a 1000-long list, and keep it in memory for the second call f 20.
Note that your partial application
... = (++) (sort a)
essentially means
... = let a' = sort a in \b -> a' ++ b
since argument passing involves a binding, as in let. So, the result of your sort a is kept around for all the future calls.
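The sharing can be observed with the same trace trick as in the question. The following is a minimal sketch, with contrivedShared and demo as names invented for the illustration. With GHC, evaluating demo would be expected to print "sorted" to stderr only once, although trace output in general depends on evaluation order and optimisation settings.

```haskell
import Data.List (sort)
import Debug.Trace (trace)

-- The sort is bound before the lambda, so the closure returned by a partial
-- application holds on to a single (lazily evaluated) sorted list.
contrivedShared :: Ord a => [a] -> [a] -> [a]
contrivedShared a = let a' = trace "sorted" (sort a) in \b -> a' ++ b

-- One partial application, reused for every element by map.
demo :: [[Int]]
demo = map (contrivedShared [3, 2, 1]) [[4], [5], [6]]
```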

Cleanest way to apply a list of boolean functions to a list?

Consider this:
ruleset = [rule0, rule1, rule2, rule3, rule4, rule5]
where rule0, rule1, etc. are boolean functions that take one argument. What is the cleanest way to find if all elements of a particular list satisfy all the rules in the ruleset?
Obviously, a loop would work, but Haskell folks always seem to have clever one-liners for these types of problems.
The all function seems appropriate (eg. all (== check_one_element) ruleset) or nested maps. Also, map ($ anElement) ruleset is roughly what I want, but for all elements.
I'm a novice at Haskell and the many ways one could approach this problem are overwhelming.
If you require all the functions to be true for each argument, then it's just
and (ruleset <*> list)
(You'll need to import Control.Applicative to use <*>.)
Explanation:
When <*> is given a pair of lists, it applies each function from the list on the left to each argument from the list on the right, and gives back a list containing all the results.
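A small sketch of this behaviour, using made-up rules and data (ruleset here contains only two sample predicates, and pairs and allPass are hypothetical names):

```haskell
-- Every function on the left is applied to every argument on the right,
-- with the results grouped by function.
pairs :: [Int]
pairs = [(+ 1), (* 2)] <*> [10, 20]   -- [11,21,20,40]

-- Two sample rules standing in for rule0, rule1, ...
ruleset :: [Int -> Bool]
ruleset = [(> 0), even]

-- True only if every rule holds for every element.
allPass :: [Int] -> Bool
allPass xs = and (ruleset <*> xs)
```

Here allPass [2, 4] is True, while allPass [2, 3] is False because 3 is not even.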
A one-liner:
import Control.Monad.Reader
-- sample data
rulesetL = [ (== 1), (>= 2), (<= 3) ]
list = [1..10]
result = and $ concatMap (sequence rulesetL) list
(The type we're working on here is Integer, but it could be anything else.)
Let me explain what's happening: rulesetL is of type [Integer -> Bool]. By realizing that (->) e is a monad, we can use
sequence :: Monad m => [m a] -> m [a]
which in our case will get specialized to type [Integer -> Bool] -> (Integer -> [Bool]). So
sequence rulesetL :: Integer -> [Bool]
will pass a value to all the rules in the list. Next, we use concatMap to apply this function to list and collect all results into a single list. Finally, calling
and :: [Bool] -> Bool
will check that all combinations returned True.
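The intermediate step can be inspected on its own. In this sketch, checks is a hypothetical name for sequence rulesetL. On recent versions of GHC the Monad instance for functions is available from the Prelude, so no extra import should be needed.

```haskell
rulesetL :: [Integer -> Bool]
rulesetL = [(== 1), (>= 2), (<= 3)]

-- sequence in the function monad: feed one value to every rule.
checks :: Integer -> [Bool]
checks = sequence rulesetL

-- checks 2                        gives  [False,True,True]
-- and (concatMap checks [1..10])  gives  False
```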
Edit: Check out dave4420's answer, it's nicer and more concise. My answer could help if you need to combine rules and apply them later to some lists. In particular
liftM and . sequence :: [a -> Bool] -> (a -> Bool)
combines several rules into one. You can also extend it to other similar combinators like using or etc. Realizing that rules are values of (->) a monad can give you other useful combinators, such as:
andRules = liftM2 (&&) :: (a -> Bool) -> (a -> Bool) -> (a -> Bool)
orRules = liftM2 (||) :: (a -> Bool) -> (a -> Bool) -> (a -> Bool)
notRule = liftM not :: (a -> Bool) -> (a -> Bool)
-- or just (not .)
etc. (don't forget to import Control.Monad.Reader).
An easier-to-understand version (without using Control.Applicative):
satisfyAll elems ruleset = and $ map (\x -> all ($ x) ruleset) elems
Personally, I like this way of writing the function, as the only combinator it uses explicitly is and:
allOkay ruleset items = and [rule item | rule <- ruleset, item <- items]

What are the alternatives to prelude's iterate if the "output" values are not the same as those being iterated on?

I have come across a pattern where, I start with a seed value x and at each step generate a new seed value and a value to be output. My desired final result is a list of the output values. This can be represented by the following function:
my_iter :: (a -> (a, b)) -> a -> [b]
my_iter f x = y : my_iter f x'
  where (x',y) = f x
And a contrived example of using this would be generating the Fibonacci numbers:
fibs :: [Integer]
fibs = my_iter (\(a,b) -> let c = a+b in ((b, c), c)) (0,1)
-- [1, 2, 3, 5, 8...
My problem is that I have this feeling that there is very likely a more idiomatic way to do this kind of stuff. What are the idiomatic alternatives to my function?
The only ones I can think of right now involve iterate from the Prelude, but they have some shortcomings.
One way is to iterate first and map after
my_iter f x = map f2 $ iterate f1 x
  where f1 = fst . f
        f2 = snd . f
However, this can look ugly if there is no natural way to split f into the separate f1 and f2 functions. (In the contrived Fibonacci case this is easy to do, but there are some situations where the generated value is not an "independent" function of the seed, so it's not so simple to split things.)
The other way is to tuple the "output" values together with the seeds, and use a separate step to separate them (kind of like the "Schwartzian transform" for sorting things):
my_iter f x = map snd . tail $ iterate (f.fst) (x, undefined)
But this seems weird, since we have to remember to ignore the generated values in order to get to the seed (the (f.fst) bit), and we need an "undefined" value for the first, dummy generated value.
As already noted, the function you want is unfoldr. As the name suggests, it's the opposite of foldr, but it might be instructive to see exactly why that's true. Here's the type of foldr:
(a -> b -> b) -> b -> [a] -> b
The first two arguments are ways of obtaining something of type b, and correspond to the two data constructors for lists:
[] :: [a]
(:) :: a -> [a] -> [a]
...where each occurrence of [a] is replaced by b. Noting that the [] case produces a b with no input, we can consolidate the two as a function taking Maybe (a, b) as input.
(Maybe (a, b) -> b) -> ([a] -> b)
The extra parentheses show that this is essentially a function that turns one kind of transformation into another.
Now, simply reverse the direction of both transformations:
(b -> Maybe (a, b)) -> (b -> [a])
The result is exactly the type of unfoldr.
The underlying idea this demonstrates can be applied similarly to other recursive data types, as well.
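Applied to the question, my_iter could be written on top of unfoldr roughly as follows. This is a sketch: the only adaptation is swapping the pair around and wrapping it in Just, since my_iter never terminates the list.

```haskell
import Data.List (unfoldr)

-- my_iter from the question, expressed with unfoldr. unfoldr expects
-- Maybe (output, nextSeed), so the pair from f is flipped and wrapped in Just.
my_iter :: (a -> (a, b)) -> a -> [b]
my_iter f = unfoldr (\x -> let (x', y) = f x in Just (y, x'))

-- The Fibonacci example from the question, unchanged.
fibs :: [Integer]
fibs = my_iter (\(a, b) -> let c = a + b in ((b, c), c)) (0, 1)
```

With this definition, take 5 fibs gives [1,2,3,5,8], matching the output shown in the question.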
The standard function you're looking for is called unfoldr.
Hoogle is a very useful tool in this case, since it doesn't only support searching functions by name, but also by type.
In your case, you came up with the desired type (a -> (a, b)) -> a -> [b]. Entering it yields no results - hmm.
Well, maybe there's a standard function with a slightly different syntax. For example, the standard function might have its arguments flipped; let's look for something with (a -> (a, b)) in its type signature somewhere. This time we're lucky as there are plenty of results, but all of them are in exotic packages and none of them seems very helpful.
Maybe the second part of your function is a better match, you want to generate a list out of some initial element after all - so type in a -> [b] and hit search. First result: unfoldr - bingo!
Another possibility is iterateM in State monad:
iterateM :: Monad m => m a -> m [a]
iterateM = sequence . repeat
It is not in the standard library, but it's easy to build.
So your my_iter is
evalState . sequence . repeat :: State s a -> s -> [a]

What to call a function that splits lists?

I want to write a function that splits lists into sublists according to what items satisfy a given property p. My question is what to call the function. I'll give examples in Haskell, but the same problem would come up in F# or ML.
split :: (a -> Bool) -> [a] -> [[a]] --- split lists into list of sublists
The sublists, concatenated, are the original list:
concat (split p xs) == xs
Every sublist satisfies the initial_p_only p property, which is to say (A) the sublist begins with an element satisfying p—and is therefore not empty, and (B) no other elements satisfy p:
initial_p_only :: (a -> Bool) -> [a] -> Bool
initial_p_only p [] = False
initial_p_only p (x:xs) = p x && all (not . p) xs
So to be precise about it,
all (initial_p_only p) (split p xs)
If the very first element in the original list does not satisfy p, split fails.
This function needs to be called something other than split. What should I call it??
I believe the function you're describing is breakBefore from the list-grouping package.
Data.List.Grouping: http://hackage.haskell.org/packages/archive/list-grouping/0.1.1/doc/html/Data-List-Grouping.html
ghci> breakBefore even [3,1,4,1,5,9,2,6,5,3,5,8,9,7,9,3,2,3,8,4,6,2,6]
[[3,1],[4,1,5,9],[2],[6,5,3,5],[8,9,7,9,3],[2,3],[8],[4],[6],[2],[6]]
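For illustration, a function with this behaviour can be sketched with break from the Prelude. The name breakBeforeSketch is hypothetical and this is not the list-grouping package's source. It is only one way to reproduce the output shown above.

```haskell
-- Start a new sublist at every element that satisfies p. The first sublist
-- is allowed to start with an element that does not satisfy p.
breakBeforeSketch :: (a -> Bool) -> [a] -> [[a]]
breakBeforeSketch _ [] = []
breakBeforeSketch p (x:xs) = (x : run) : breakBeforeSketch p rest
  where
    -- run is the longest prefix of xs with no element satisfying p
    (run, rest) = break p xs
```

For example, breakBeforeSketch even [3, 1, 4, 1, 5, 9, 2, 6] gives [[3,1],[4,1,5,9],[2],[6]]. Unlike the split described in the question, it does not fail when the first element does not satisfy p.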
I quite like some name based on the term "break" as adamse suggests. There are quite a few possible variants of the function. Here is what I'd expect (based on the naming used in F# libraries).
A function named just breakBefore would take an element before which it should break:
breakBefore :: Eq a => a -> [a] -> [[a]]
A function with the With suffix would take some kind of function that directly specifies when to break. In the case of breaking, this is the a -> Bool function that you wanted:
breakBeforeWith :: (a -> Bool) -> [a] -> [[a]]
You could also imagine a function with the By suffix that takes a key selector and breaks when the key changes (which is a bit like group by, but you can have multiple groups with the same key):
breakBeforeBy :: Eq k => (a -> k) -> [a] -> [[a]]
I admit that the names are getting a bit long, and maybe the only function that is really useful is the one you wanted. However, F# libraries seem to use this pattern quite consistently (e.g. there is sort, sortBy taking a key selector, and sortWith taking a comparer function).
Perhaps it is possible to have these three variants for more of the list processing functions (and it's quite a good idea to have a consistent naming pattern for these three kinds of functions).
