Dot Operator in Haskell: need more explanation - haskell

I'm trying to understand what the dot operator is doing in this Haskell code:
sumEuler = sum . (map euler) . mkList
The entire source code is below.
My understanding
The dot operator is taking the two functions sum and the result of map euler and the result of mkList as the input.
But, sum isn't a function it is the argument of the function, right? So what is going on here?
Also, what is (map euler) doing?
Code
mkList :: Int -> [Int]
mkList n = [1..n-1]
euler :: Int -> Int
euler n = length (filter (relprime n) (mkList n))
sumEuler :: Int -> Int
sumEuler = sum . (map euler) . mkList

Put simply, . is function composition, just like in math:
f (g x) = (f . g) x
In your case, you are creating a new function, sumEuler that could also be defined like this:
sumEuler x = sum (map euler (mkList x))
The style in your example is called "point-free" style -- the arguments to the function are omitted. This makes for clearer code in many cases. (It can be hard to grok the first time you see it, but you will get used to it after a while. It is a common Haskell idiom.)
If you are still confused, it may help to relate . to something like a UNIX pipe. If f's output becomes g's input, whose output becomes h's input, you'd write that on the command-line like f < x | g | h. In Haskell, . works like the UNIX |, but "backwards" -- h . g . f $ x. I find this notation to be quite helpful when, say, processing a list. Instead of some unwieldy construction like map (\x -> x * 2 + 10) [1..10], you could just write (+10) . (*2) <$> [1..10]. (And if you only want to apply that function to a single value, it's (+10) . (*2) $ 10. Consistent!)
The Haskell wiki has a good article with some more detail: http://www.haskell.org/haskellwiki/Pointfree
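To see that the point-free and the explicit definitions really agree, here is a minimal runnable sketch. The question does not show relprime, so the gcd-based definition below is an assumption, and sumEuler' is just a hypothetical name for the version with the argument written out.

```haskell
-- relprime is not shown in the question; this gcd-based version is an assumption.
relprime :: Int -> Int -> Bool
relprime x y = gcd x y == 1

mkList :: Int -> [Int]
mkList n = [1 .. n - 1]

euler :: Int -> Int
euler n = length (filter (relprime n) (mkList n))

-- Point-free: a composition of three functions, read right to left.
sumEuler :: Int -> Int
sumEuler = sum . map euler . mkList

-- The same function with its argument written out (hypothetical name).
sumEuler' :: Int -> Int
sumEuler' x = sum (map euler (mkList x))

main :: IO ()
main = do
  print (sumEuler 10)                                       -- 27
  print (all (\n -> sumEuler n == sumEuler' n) [1 .. 50])   -- True
```

The parentheses around map euler are dropped here only to show that they are optional; function application binds tighter than the dot.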

The . operator composes functions. For example,
a . b
where a and b are functions, is a new function that runs b on its argument and then runs a on that result. Your code
sumEuler = sum . (map euler) . mkList
is exactly the same as:
sumEuler myArgument = sum (map euler (mkList myArgument))
but hopefully easier to read. The reason there are parens around map euler is because it makes it clearer that there are 3 functions being composed: sum, map euler and mkList - map euler is a single function.

sum is a function in the Haskell Prelude, not an argument to sumEuler. It has the type
Num a => [a] -> a
The function composition operator . has type
(b -> c) -> (a -> b) -> a -> c
So we have
euler :: Int -> Int
map :: (a -> b ) -> [a ] -> [b ]
(map euler) :: [Int] -> [Int]
mkList :: Int -> [Int]
(map euler) . mkList :: Int -> [Int]
sum :: Num a => [a ] -> a
sum . (map euler) . mkList :: Int -> Int
Note that Int is indeed an instance of the Num typeclass.

The . operator is used for function composition. Just like in math, if you have two functions f(x) and g(x), then (f . g)(x) is f(g(x)).
map is a built-in function which applies a function to every element of a list. In (map euler), map has been given only its first argument, the function euler. The term for this is partial application, and it works because Haskell functions are curried. You should look that up.
What it does is take a function of, say, two arguments (map) and apply it to just one argument (euler). The result, (map euler), is a new function which takes only one argument, the list.
sum . (map euler) . mkList is basically a fancy way of putting all that together. I must say, my Haskell is a bit rusty but maybe you can put that last function together yourself?
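A small sketch of the "map given only one argument" idea may help. The names double and doubleAll are hypothetical stand-ins (double plays the role of euler); nothing here comes from the question's code.

```haskell
-- double is a hypothetical stand-in for euler.
double :: Int -> Int
double = (* 2)

-- map takes two arguments; supplying only the first one
-- leaves a function that is still waiting for the list.
doubleAll :: [Int] -> [Int]
doubleAll = map double

main :: IO ()
main = do
  print (doubleAll [1, 2, 3])          -- [2,4,6]
  print ((sum . doubleAll) [1, 2, 3])  -- 12
```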

Dot Operator in Haskell
I'm trying to understand what the dot operator is doing in this Haskell code:
sumEuler = sum . (map euler) . mkList
Short answer
The equivalent code without dots is just
sumEuler = \x -> sum ((map euler) (mkList x))
or without the lambda
sumEuler x = sum ((map euler) (mkList x))
because the dot (.) indicates function composition.
Longer answer
First, let's simplify the partial application of euler to map:
map_euler = map euler
sumEuler = sum . map_euler . mkList
Now we just have the dots. What is indicated by these dots?
From the source:
(.) :: (b -> c) -> (a -> b) -> a -> c
(.) f g = \x -> f (g x)
Thus (.) is the compose operator.
Compose
In math, we might write the composition of functions, f(x) and g(x), that is, f(g(x)), as
(f ∘ g)(x)
which can be read "f composed with g".
So in Haskell, f ∘ g, or f composed with g, can be written:
f . g
Composition is associative, which means that f(g(h(x))), written with the composition operator, can leave out the parentheses without any ambiguity.
That is, since (f ∘ g) ∘ h is equivalent to f ∘ (g ∘ h), we can simply write f ∘ g ∘ h.
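Associativity is easy to spot-check on a few inputs. This is only a sketch with three hypothetical example functions f, g and h; it tests a handful of values and is not a proof.

```haskell
-- Three hypothetical example functions.
f, g, h :: Int -> Int
f = (+ 1)
g = (* 2)
h = subtract 3

main :: IO ()
main = do
  let xs = [-5 .. 5]
  -- Both groupings give the same results on every sample input.
  print (map ((f . g) . h) xs == map (f . (g . h)) xs)  -- True
  -- Without parentheses: f (g (h 10)) = (7 * 2) + 1
  print ((f . g . h) 10)                                -- 15
```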
Circling back
Circling back to our earlier simplification, this:
sumEuler = sum . map_euler . mkList
just means that sumEuler is an unapplied composition of those functions:
sumEuler = \x -> sum (map_euler (mkList x))

The dot operator applies the function on the left (sum) to the output of the function on the right. In your case, you're chaining several functions together - you're passing the result of mkList to (map euler), and then passing the result of that to sum.
This site has a good introduction to several of the concepts.

Related

Function Composition Do Notation

Is there a "do notation" syntactic sugar for simple function composition?
(i.e. (.) :: (b -> c) -> (a -> b) -> a -> c)
I'd like to be able to store the results of some compositions for later (while still continuing the chain).
I'd rather not use the RebindableSyntax extension if possible.
I'm looking for something like this:
composed :: [String] -> [String]
composed = do
    fmap (++ "!!!")
    maxLength <- maximum . fmap length
    filter ((== maxLength) . length)
composed ["alice", "bob", "david"]
-- outputs: ["alice!!!", "david!!!"]
I'm not sure something like this is possible, since the result of the earlier function essentially has to pass "through" the bind of maxLength, but I'm open to hearing of any other similarly expressive options. Basically I need to collect information as I go through the composition in order to use it later.
Perhaps I could do something like this with a state monad?
Thanks for your help!
Edit
This sort of thing kinda works:
split :: (a -> b) -> (b -> a -> c) -> a -> c
split ab bac a = bac (ab a) a
composed :: [String] -> [String]
composed = do
    fmap (++ "!!!")
    split
        (maximum . fmap length)
        (\maxLength -> (filter ((== maxLength) . length)))
One possible way to achieve something like that is to use arrows. Basically, in “storing interstitial results” you're just splitting up the information flow through the composition chain. That's what the &&& (fanout) combinator does.
import Control.Arrow
composed = fmap (++ "!!!")
       >>> ((. length) . (==) . maximum . fmap length &&& id)
       >>> uncurry filter
This definitely isn't good human-comprehensible code though.
A state monad would seem to allow something related too, but the problem is that the state type is fixed through the do block's monadic chain. That's not really flexible enough to pick up different-typed values throughout the composition chain. While there are certainly ways to circumvent this (among them, indeed, RebindableSyntax), this too isn't a good idea IMO.
The type of (<*>) specialised to the function instance of Applicative is:
(<*>) :: (r -> a -> b) -> (r -> a) -> (r -> b)
The resulting r -> b function passes its argument to both the r -> a -> b and the r -> a functions, and then uses the a value produced by the r -> a function as the second argument of the r -> a -> b one.
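In other words, for functions (f <*> g) r = f r (g r). A tiny sketch with made-up examples (addDouble and withLength are hypothetical names) shows the argument being handed to both sides:

```haskell
-- (f <*> g) r = f r (g r) when f and g are functions.
addDouble :: Int -> Int
addDouble = (+) <*> (* 2)   -- \r -> r + (r * 2)

-- Pair a string with its own length.
withLength :: String -> (String, Int)
withLength = (,) <*> length -- \r -> (r, length r)

main :: IO ()
main = do
  print (addDouble 5)        -- 15
  print (withLength "abc")   -- ("abc",3)
```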
What does this have to do with your function? filter is a function of two arguments, a predicate and a list. Now, a key aspect of what you are trying to do is that the predicate is generated from the list. That means the core of your function can be expressed in terms of (<*>):
-- Using the predicate-generating function from leftaroundabout's answer.
maxLengthOnly :: Foldable t => [t a] -> [t a]
maxLengthOnly = flip filter <*> ((. length) . (==) . maximum . fmap length)
composed :: [String] -> [String]
composed = maxLengthOnly . fmap (++ "!!!")
This maxLengthOnly definition would be a quite nice one-liner if the pointfree predicate-generating function weren't so clunky.
Since the Applicative instance of functions is equivalent in power to the Monad one, maxLengthOnly can also be phrased as:
maxLengthOnly = (. length) . (==) . maximum . fmap length >>= filter
(The split you added to your question, by the way, is (>>=) for functions.)
A different way of writing it with Applicative is:
maxLengthOnly = filter <$> ((. length) . (==) . maximum . fmap length) <*> id
It is no coincidence that this looks a lot like leftaroundabout's solution: for functions, (,) <$> f <*> g = liftA2 (,) f g = f &&& g.
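That three-way equality can be checked directly on an example. This is a sketch using minimum and maximum as arbitrary stand-ins for f and g; pairA, pairB and pairC are hypothetical names.

```haskell
import Control.Applicative (liftA2)
import Control.Arrow ((&&&))

-- Three spellings of "run both functions on the same input and pair the results".
pairA, pairB, pairC :: [Int] -> (Int, Int)
pairA = (,) <$> minimum <*> maximum
pairB = liftA2 (,) minimum maximum
pairC = minimum &&& maximum

main :: IO ()
main = do
  let xs = [3, 1, 4, 1, 5]
  print (pairA xs, pairB xs, pairC xs)  -- ((1,5),(1,5),(1,5))
```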
Finally, it is also worth noting that, while it is tempting to replace id in the latest version of maxLengthOnly with fmap (++ "!!!"), that won't work because fmap (++ "!!!") changes the length of the strings, and therefore affects the result of the predicate. With a function that doesn't invalidate the predicate, though, it would work pretty well:
nicerComposed = filter
    <$> ((. length) . (==) . maximum . fmap length) <*> fmap reverse
GHCi> nicerComposed ["alice","bob","david"]
["ecila","divad"]
As leftaroundabout mentioned, you can use Arrows to write your function. But there is a feature in the GHC Haskell compiler, proc-notation for Arrows. It is very similar to the well-known do-notation but, unfortunately, not many people are aware of it.
With proc-notation you can write your desired function in the following more readable and elegant way:
{-# LANGUAGE Arrows #-}
import Control.Arrow (returnA)
import Data.List (maximum)
composed :: [String] -> [String]
composed = proc l -> do
    bangedL <- fmap (++"!!!") -< l
    maxLen <- maximum . fmap length -< bangedL
    returnA -< filter ((== maxLen) . length) bangedL
And this works in ghci as expected:
ghci> composed ["alice", "bob", "david"]
["alice!!!","david!!!"]
If you are interested, you can read some tutorials with nice pictures to understand what an arrow is and how this powerful feature works, so you can dive deeper into it:
https://www.haskell.org/arrows/index.html
https://en.wikibooks.org/wiki/Haskell/Understanding_arrows
What you have is essentially a filter, but one where the filtering function changes as you iterate over the list. I would model this not as a "forked" composition, but as a fold using the following function f :: String -> (Int, [String]) -> (Int, [String]):
The return value maintains the current maximum and all strings of that length.
If the first argument is shorter than the current maximum, drop it.
If the first argument is the same as the current maximum, add it to the list.
If the first argument is longer, make its length the new maximum, and replace the current output list with a new list.
Once the fold is complete, you just extract the list from the tuple.
-- Not really a suitable name anymore, but...
composed :: [String] -> [String]
composed = snd . foldr f (0, [])
  where f curr (maxLen, result) = let currLen = length curr
                                  in case compare currLen maxLen of
                                       LT -> (maxLen, result)       -- drop
                                       EQ -> (maxLen, curr:result)  -- keep
                                       GT -> (length curr, [curr])  -- reset

Filter a list of tuples by fst

What I'm trying to do is not really solve a problem, but more to learn how to write Haskell code that composes/utilizes basic functions to do it.
I have a function that takes a list of tuples (String, Int) and a String, and returns a tuple whose fst matches the given String.
This was fairly easy to do with filter and lambda, but what I want to do now, is remove the rightmost argument, ie. I want to refactor the function to be a composition of partially applied functions that'll do the same functionality.
Original code was:
getstat :: Player -> String -> Stat
getstat p n = head $ filter (\(n', v) -> n' == n) $ stats p
New code is:
getstat :: Player -> String -> Stat
getstat p = head . (flip filter $ stats p) . cmpfst
  where cmpfst = (==) . fst . (flip (,)) 0 -- Wrong :-\
The idea is to flip the filter and partially apply by giving in the list of tuples (stats p) and then compose cmpfst.
cmpfst should be String -> (String, Int) -> Bool so that when String argument is applied, it becomes a -> Bool which is good for the filter to pass in tuples, but as you can see - I have problems composing (==) so that only fst's of given tuples are compared.
P.S. I know that the first code is likely cleaner; the point of this task was not to write clean code but to learn how to solve the problem through composition.
Edit:
I understand well that calling head on a possibly empty list is bad programming that will result in a crash. As an earlier poster mentioned, it is very simply and elegantly resolved with the Maybe monad - a task I've done before and am familiar with.
What I'd like the focus to be on, is how to make cmpfst composed primarily of basic functions.
So far, the furthest I got is this:
getstat :: Player -> String -> Stat
getstat p = head . (flip filter $ stats p) . (\n' -> (==(fst n')) . fst) . (flip (,)) 0
I can't get rid of the (a -> Bool) lambda by composing and partially applying around (==). This signals, to me, that I either don't understand what I'm doing, or it's impossible using (==) operator in the way I imagined.
Furthermore, if there's no exact solution, I'll accept a signature-changing solution as the correct one. I'd rather not change the signature of the function, simply because it's a mental exercise for me, not production code.
If I were writing this function, I'd probably have given it this type signature:
getstat :: String -> Player -> Stat
This makes it easy to eta-reduce the definition to
getstat n = head . filter ((== n) . fst) . stats
In a comment, you reached
getstat p = head . (flip filter $ stats p) . (\n (n', v) -> n' == n)
I wonder if there's a nicer composition that can eliminate the anon f.
Well, here it is
\n (n', v) -> n' == n
-- for convenience, we flip the ==
\n (n', v) -> n == n'
-- prefix notation
\n (n', v) -> (==) n n'
-- let's remove pattern matching over (n', v)
\n (n', v) -> (==) n $ fst (n', v)
\n x -> (==) n $ fst x
-- composition, eta
\n -> (==) n . fst
-- prefix
\n -> (.) ((==) n) fst
-- composition
\n -> ((.) . (==) $ n) fst
-- let's force the application to be of the form (f n (g n))
\n -> ((.) . (==) $ n) (const fst $ n)
-- exploit f <*> g = \n -> f n (g n) -- AKA the S combinator
((.) . (==)) <*> (const fst)
-- remove unneeded parentheses
(.) . (==) <*> const fst
Removing p is left as an exercise.
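If you want to convince yourself that the end result of that derivation still behaves like the original lambda, here is a small check. The stats list is invented sample data, and cmpfstLambda is a hypothetical name for the lambda written as an ordinary function.

```haskell
-- The point-free result of the derivation above.
cmpfst :: Eq a => a -> (a, b) -> Bool
cmpfst = (.) . (==) <*> const fst

-- The lambda we started from, written as an ordinary function (hypothetical name).
cmpfstLambda :: Eq a => a -> (a, b) -> Bool
cmpfstLambda n (n', _) = n' == n

main :: IO ()
main = do
  -- Invented sample data standing in for (stats p).
  let stats = [("str", 10), ("dex", 7)] :: [(String, Int)]
  print (filter (cmpfst "dex") stats)                                 -- [("dex",7)]
  print (map (cmpfst "str") stats == map (cmpfstLambda "str") stats)  -- True
```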

Project Euler 3 - Haskell

I'm working my way through the Project Euler problems in Haskell. I have a solution for Problem 3 below. I have tested it on small numbers and it works; however, due to the brute-force implementation, which derives all the prime numbers first, it is exponentially slow for larger numbers.
-- Project Euler 3
module Main
    where
import System.IO
import Data.List
main = do
    hSetBuffering stdin LineBuffering
    putStrLn "This program returns the prime factors of a given integer"
    putStrLn "Please enter a number"
    nums <- getPrimes
    putStrLn "The prime factors are: "
    print (sort nums)
getPrimes = do
    userNum <- getLine
    let n = read userNum :: Int
    let xs = [2..n]
    return $ getFactors n (primeGen xs)
--primeGen :: (Integral a) => [a] -> [a]
primeGen [] = []
primeGen (x:xs) =
    if x >= 2
    then x:primeGen (filter (\n->n`mod` x/=0) xs)
    else 1:[2]
--getFactors
getFactors :: (Integral a) => a -> [a] -> [a]
getFactors n xs = [ x | x <- xs, n `mod` x == 0]
I have looked at the solution here and can see how it is optimised by the first guard in factor. What I don't understand is this:
primes = 2 : filter ((==1) . length . primeFactors) [3,5..]
Specifically the first argument of filter.
((==1) . length . primeFactors)
As primeFactors is itself a function I don't understand how it is used in this context. Could somebody explain what is happening here please?
If you were to open ghci on the command line and type
Prelude> :t filter
You would get an output of
filter :: (a -> Bool) -> [a] -> [a]
What this means is that filter takes 2 arguments.
(a -> Bool) is a function that takes a single input, and returns a Bool.
[a] is a list of any type, as long as its element type matches the argument type of the function.
filter goes over every element of the list that is its second argument and applies the function that is its first argument to it. If the function returns True, the element is added to the resulting list.
Again, in ghci, if you were to type
Prelude> :t (((==1) . length . primeFactors))
You should get
(((==1) . length . primeFactors)) :: a -> Bool
(==1) is a partially applied function.
Prelude> :t (==)
(==) :: Eq a => a -> a -> Bool
Prelude> :t (==1)
(==1) :: (Eq a, Num a) => a -> Bool
It only needs to take a single argument instead of two.
Meaning that together, it will take a single argument, and return a Boolean.
The way it works is as follows.
primeFactors will take a single argument, and calculate the results, which is a [Int].
length will take this list, and calculate the length of the list, and return an Int
(==1) will look to see if the value returned by length is equal to 1.
If the length of the list is 1, that means it is a prime number.
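The linked solution is not reproduced in the question, so the following is only a sketch of how such a pair of definitions can fit together. The body of primeFactors (trial division by the primes list, stopping once p * p exceeds what is left) is an assumption about what that solution looks like, not a quote from it.

```haskell
-- primes and primeFactors refer to each other; laziness makes this work,
-- because testing a candidate only needs primes that are already known.
primes :: [Integer]
primes = 2 : filter ((== 1) . length . primeFactors) [3, 5 ..]

-- Assumed implementation: trial division by successive primes.
primeFactors :: Integer -> [Integer]
primeFactors n = factor n primes
  where
    factor m (p : ps)
      | p * p > m      = [m]                              -- what is left is prime
      | m `mod` p == 0 = p : factor (m `div` p) (p : ps)  -- p divides m, keep p
      | otherwise      = factor m ps                      -- try the next prime
    factor m []        = [m]                              -- unreachable: primes is infinite

main :: IO ()
main = do
  print (take 10 primes)   -- [2,3,5,7,11,13,17,19,23,29]
  print (primeFactors 84)  -- [2,2,3,7]
```

A number whose list of prime factors has length 1 is its own only factor, which is exactly what the (==1) . length . primeFactors predicate tests.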
(.) :: (b -> c) -> (a -> b) -> a -> c is the composition function, so
f . g = \x -> f (g x)
We can chain more than two functions together with this operator
f . g . h === \x -> f (g (h x))
This is what is happening in the expression ((==1) . length . primeFactors).
The expression
filter ((==1) . length . primeFactors) [3,5..]
is filtering the list [3, 5..] using the function (==1) . length . primeFactors. This notation is usually called point free, not because it doesn't have . points, but because it doesn't have any explicit arguments (called "points" in some mathematical contexts).
The . is actually a function, and in particular it performs function composition. If you have two functions f and g, then f . g = \x -> f (g x), that's all there is to it! The precedence of this operator lets you chain together many functions quite smoothly, so if you have f . g . h, this is the same as \x -> f (g (h x)). When you have many functions to chain together, the composition operator is very useful.
So in this case, you have the functions (==1), length, and primeFactors being composed together. (==1) is a function through what is called operator sections, meaning that you provide an argument to one side of an operator, and it results in a function that takes one argument and applies it to the other side. Other examples and their equivalent lambda forms are
(+1) => \x -> x + 1
(==1) => \x -> x == 1
(++"world") => \x -> x ++ "world"
("hello"++) => \x -> "hello" ++ x
If you wanted, you could re-write this expression using a lambda:
(==1) . length . primeFactors => (\x0 -> x0 == 1) . length . primeFactors
=> (\x1 -> (\x0 -> x0 == 1) (length (primeFactors x1)))
Or a bit cleaner using the $ operator:
(\x1 -> (\x0 -> x0 == 1) $ length $ primeFactors x1)
But this is still a lot more "wordy" than simply
(==1) . length . primeFactors
One thing to keep in mind is the type signature for .:
(.) :: (b -> c) -> (a -> b) -> a -> c
But I think it looks better with some extra parentheses:
(.) :: (b -> c) -> (a -> b) -> (a -> c)
This makes it more clear that this function takes two other functions and returns a third one. Pay close attention to the order of the type variables in this function. The first argument to . is a function (b -> c), and the second is a function (a -> b). You can think of it as going right to left, rather than the left to right behavior that we're used to in most OOP languages (something like myObj.someProperty.getSomeList().length()). We can get this functionality by defining a new operator that has the reverse order of arguments. If we use the F# convention, our operator is called |>:
(|>) :: (a -> b) -> (b -> c) -> (a -> c)
(|>) = flip (.)
Then we could have written this as
filter (primeFactors |> length |> (==1)) [3, 5..]
And you can think of |> as an arrow "feeding" the result of one function into the next.
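Here is that operator in a self-contained sketch. Since primeFactors is not defined in this answer, the example uses words and length as stand-ins, and threeWords / threeWords' are hypothetical names. No fixity is declared, so |> gets the default infixl 9, which is enough for this chain.

```haskell
-- Left-to-right composition: run the left function first, then the right one.
(|>) :: (a -> b) -> (b -> c) -> (a -> c)
(|>) = flip (.)

-- Read left to right: split into words, count them, compare with 3.
threeWords :: String -> Bool
threeWords = words |> length |> (== 3)

-- The same thing with the ordinary right-to-left dot.
threeWords' :: String -> Bool
threeWords' = (== 3) . length . words

main :: IO ()
main = do
  print (threeWords "one two three")  -- True
  print (threeWords "one two")        -- False
  print (map threeWords ["a b c", "a"] == map threeWords' ["a b c", "a"])  -- True
```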
This simply means, keep only the odd numbers that have only one prime factor.
In pseudo-code: filter(x -> length(primeFactors(x)) == 1) for any x in [3,5,..]

infix operator precedence in Haskell

For the following Haskell expression
return a >>= f
Should it be read as
(return a) >>= f
or
return (a >>= f)?
what are the related rules here?
The rule is always that function application has higher precedence than any operator, so
return a >>= f
Is parsed as
(return a) >>= f
no matter what functions or operators are being used instead of return, f, and >>=.
That means things like
divide :: Int -> Int -> Double
divide x y = (fromIntegral x) / (fromIntegral y)
Are equivalent to
divide :: Int -> Int -> Double
divide x y = fromIntegral x / fromIntegral y
Another example where this is even more useful is in function composition:
something :: [Int] -> [Int]
something xs = filter even . map (+1) . zipWith (*) [1..] . take 200 . cycle $ xs
As you can see here, we even have zipWith taking two arguments composed with several other functions. This is equivalent to having put parentheses around every component of the composition.
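Both claims can be checked by comparing against fully parenthesized versions. In this sketch, somethingExplicit and the Maybe example are hypothetical additions used only to make the grouping visible.

```haskell
something :: [Int] -> [Int]
something xs = filter even . map (+ 1) . zipWith (*) [1 ..] . take 200 . cycle $ xs

-- Same pipeline with every function application parenthesized (hypothetical name).
somethingExplicit :: [Int] -> [Int]
somethingExplicit xs =
  ((filter even) . (map (+ 1)) . (zipWith (*) [1 ..]) . (take 200) . cycle) xs

-- return a >>= f groups as (return a) >>= f; shown here in the Maybe monad.
bumped :: Maybe Int
bumped = return 3 >>= \x -> Just (x + 1)

main :: IO ()
main = do
  print (something [1, 2, 3] == somethingExplicit [1, 2, 3])  -- True
  print bumped                                                -- Just 4
```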

What does a fullstop or period or dot (.) mean in Haskell?

I really wish that Google was better at searching for syntax:
decades :: (RealFrac a) => a -> a -> [a] -> Array Int Int
decades a b = hist (0,9) . map decade
  where decade x = floor ((x - a) * s)
        s = 10 / (b - a)
f(g(x)) is
in mathematics: (f ∘ g)(x)
in Haskell: (f . g) x
It means function composition.
See this question.
Note also that f.g.h x is not equivalent to (f.g.h) x, because it is interpreted as f.g.(h x), which won't typecheck unless (h x) returns a function.
This is where the $ operator can come in handy: f.g.h $ x turns x from being a parameter to h to being a parameter to the whole expression. And so it becomes equivalent to f(g(h x)) and the pipe works again.
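A short sketch of the three spellings that do work, using show, length and words as arbitrary example functions (countWordsA/B/C are hypothetical names). The form without $ or parentheses is left as a comment because it does not compile.

```haskell
-- Three equivalent ways to apply a composed pipeline to an argument.
countWordsA, countWordsB, countWordsC :: String -> String
countWordsA s = (show . length . words) s
countWordsB s = show . length . words $ s
countWordsC s = show (length (words s))

-- Rejected by the type checker: parsed as show . length . (words s),
-- and (words s) is a list, not a function.
-- countWordsBad s = show . length . words s

main :: IO ()
main = do
  putStrLn (countWordsA "a b c")  -- 3
  putStrLn (countWordsB "a b c")  -- 3
  putStrLn (countWordsC "a b c")  -- 3
```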
. is a higher order function for function composition.
Prelude> :type (.)
(.) :: (b -> c) -> (a -> b) -> a -> c
Prelude> (*2) . (+1) $ 1
4
Prelude> ((*2) . (+1)) 1
4
"The period is a function composition operator. In general terms, where f and g are functions, (f . g) x means the same as f (g x). In other words, the period is used to take the result from the function on the right, feed it as a parameter to the function on the left, and return a new function that represents this computation."
It is a function composition: link
Function composition (the page is pretty long, use search)

Resources