Most efficient way of building a list in a left fold? - haskell

When building lists, I usually use a right fold, as that lets me use the right-associative : operator without affecting the order of the resulting list. In a left fold, I could use ++, but I understand that this will entail repeatedly copying the list while it is being produced, giving O(N^2) operations for an N-element list, which is generally unacceptable.
So, when I have to use a left fold (in my specific case this is because I'm using foldlM and the monadic actions produced must be performed in left-to-right order), is there any better way of reconciling this other than building the list in the fold using : and reversing the result?

when I have to use a left fold (... because ... the monadic actions produced must be performed in left-to-right order)
Right fold can lean so far right that it comes back left again. For example, you can print (i.e. monadic action) each number and calculate partial sums of a list from left to right using a right fold:
fun :: [Int] -> IO [Int]
fun xs = foldr go (const $ return []) xs 0
where go x f a = let a' = a + x in print x >> (a' :) <$> f a'
then:
\> fun [1..5]
1
2
3
4
5
[1,3,6,10,15]
note that the output list is built using (a' :) and the monadic action is performed left to right, even though it is a right fold.

Since you mention foldlM, likely this blog post of mine answers your question in depth:
Constructing a list in a Monad
The bottom line is: Use difference lists; i.e. in your left-fold, you accumulate a value of type [a] -> [a], where you can append a value using . (x:) efficiently, and at the end you apply this function to [] to obtain your list.

Related

Functional Programming-Style Map Function that adds elements?

I know and love my filter, map and reduce, which happen to be part of more and more languages that are not really purely functional.
I found myself needing a similar function though: something like map, but instead of one to one it would be one to many.
I.e. one element of the original list might be mapped to multiple elements in the target list.
Is there already something like this out there or do I have to roll my own?
This is exactly what >>= specialized to lists does.
> [1..6] >>= \x -> take (x `mod` 3) [1..]
[1,1,2,1,1,2]
It's concatenating together the results of
> map (\x -> take (x `mod` 3) [1..]) [1..6]
[[1],[1,2],[],[1],[1,2],[]]
You do not have to roll your own. There are many relevant functions here, but I'll highlight three.
First of all, there is the concat function, which already comes in the Prelude (the standard library that's loaded by default). What this function does, when applied to a list of lists, is return the list that contains concatenated contents of the sublists.
EXERCISE: Write your own version of concat :: [[a]] -> [a].
So using concat together with map, you could write this function:
concatMap :: (a -> [b]) -> [a] -> [b]
concatMap f = concat . map f
...except that you don't actually need to write it, because it's such a common pattern that the Prelude already has it (at a more general type than what I show here—the library version takes any Foldable, not just lists).
Finally, there is also the Monad instance for list, which can be defined this way:
instance Monad [] where
return a = [a]
as >>= f = concatMap f as
So the >>= operator (the centerpiece of the Monad class), when working with lists, is exactly the same thing as concatMap.
EXERCISE: Skim through the documentation of the Data.List module. Figure out how to import the module into your code and play around with some of the functions.

How to modify this Haskell square root function to take an array

I have a function that will take and int and return its square root. However now i want to modify it so that it takes an array of integers and gives back an array with the square roots of the elements of the first array. I know Haskell does not use loops so how can this modification be done? Thanks.
intSquareRoot :: Int -> Int
intSquareRoot n = try n where
try i | i*i > n = try (i - 1)
| i*i <= n = i
Don't.
The idea of “looping through some collection”, putting each result in the corresponding slot of its input, is a somewhat trivial, extremely common pattern. Patterns are for OO programmers. In Haskell, when there's a pattern, we want to abstract over it, i.e. give it a simple name that we can always re-use without extra boilerplate.
This particular “pattern” is the functor operation1. For lists it's called
map :: (a->b) -> [a]->[b]
more generally (e.g. it'll also work with real arrays; lists aren't actually arrays),
class Functor f where
fmap :: (a->b) -> f a->f b
So instead of defining an extra function
intListSquareRoot :: [Int] -> [Int]
intListSquareRoot = ...
you simply use map intSquareRoot right where you wanted to use that function.
Of course, you could also define that “lifted” version of intSquareRoot,
intListSquareRoot = map intSquareRoot
but that gains you practically nothing over simply inlining the map call right where you need it.
If you insist
That said... it's of course valid to wonder how map itself works. Well, you can manually “loop” through a list by recursion:
map' :: (a->b) -> [a]->[b]
map' _ [] = []
map' f (x:xs) = f x : map' f xs
Now, you could inline your specific function here
intListSquareRoot' :: [Int] -> [Int]
intListSquareRoot' [] = []
intListSquareRoot' (x:xs) = intSquareRoot x : intListSquareRoot' xs
This is not only much more clunky and awkward than quickly inserting the map magic word, it will also often be slower: compilers such as GHC can make better optimisations when they work on higher-level concepts2 such as folds, than when they have to work again and again with manually defined recursion.
1Not to be confused what many C++ programmers call a “functor”. Haskell uses the word in the correct mathematical sense, which comes from category theory.
2This is why languages such as Matlab and APL actually achieve decent performance for special applications, although they are dynamically-typed, interpreted languages: they have this special case of “vector looping” hard-coded into their very syntax. (Unfortunately, this is pretty much the only thing they can do well...)
You can use map:
arraySquareRoot = map intSquareRoot

Folding across Maybes in Haskell

In an attempt to learn Haskell, I have come across a situation in which I wish to do a fold over a list but my accumulator is a Maybe. The function I'm folding with however takes in the "extracted" value in the Maybe and if one fails they all fail. I have a solution I find kludgy, but knowing as little Haskell as I do, I believe there should be a better way. Say we have the following toy problem: we want to sum a list, but fours for some reason are bad, so if we attempt to sum in a four at any time we want to return Nothing. My current solution is as follows:
import Maybe
explodingFourSum :: [Int] -> Maybe Int
explodingFourSum numberList =
foldl explodingFourMonAdd (Just 0) numberList
where explodingFourMonAdd =
(\x y -> if isNothing x
then Nothing
else explodingFourAdd (fromJust x) y)
explodingFourAdd :: Int -> Int -> Maybe Int
explodingFourAdd _ 4 = Nothing
explodingFourAdd x y = Just(x + y)
So basically, is there a way to clean up, or eliminate, the lambda in the explodingFourMonAdd using some kind of Monad fold? Or somehow currying in the >>=
operator so that the fold behaves like a list of functions chained by >>=?
I think you can use foldM
explodingFourSum numberList = foldM explodingFourAdd 0 numberList
This lets you get rid of the extra lambda and that (Just 0) in the beggining.
BTW, check out hoogle to search around for functions you don't really remember the name for.
So basically, is there a way to clean up, or eliminate, the lambda in the explodingFourMonAdd using some kind of Monad fold?
Yapp. In Control.Monad there's the foldM function, which is exactly what you want here. So you can replace your call to foldl with foldM explodingFourAdd 0 numberList.
You can exploit the fact, that Maybe is a monad. The function sequence :: [m a] -> m [a] has the following effect, if m is Maybe: If all elements in the list are Just x for some x, the result is a list of all those justs. Otherwise, the result is Nothing.
So you first decide for all elements, whether it is a failure. For instance, take your example:
foursToNothing :: [Int] -> [Maybe Int]
foursToNothing = map go where
go 4 = Nothing
go x = Just x
Then you run sequence and fmap the fold:
explodingFourSum = fmap (foldl' (+) 0) . sequence . foursToNothing
Of course you have to adapt this to your specific case.
Here's another possibility not mentioned by other people. You can separately check for fours and do the sum:
import Control.Monad
explodingFourSum xs = guard (all (/=4) xs) >> return (sum xs)
That's the entire source. This solution is beautiful in a lot of ways: it reuses a lot of already-written code, and it nicely expresses the two important facts about the function (whereas the other solutions posted here mix those two facts up together).
Of course, there is at least one good reason not to use this implementation, as well. The other solutions mentioned here traverse the input list only once; this interacts nicely with the garbage collector, allowing only small portions of the list to be in memory at any given time. This solution, on the other hand, traverses xs twice, which will prevent the garbage collector from collecting the list during the first pass.
You can solve your toy example that way, too:
import Data.Traversable
explodingFour 4 = Nothing
explodingFour x = Just x
explodingFourSum = fmap sum . traverse explodingFour
Of course this works only because one value is enough to know when the calculation fails. If the failure condition depends on both values x and y in explodingFourSum, you need to use foldM.
BTW: A fancy way to write explodingFour would be
import Control.Monad
explodingFour x = mfilter (/=4) (Just x)
This trick works for explodingFourAdd as well, but is less readable:
explodingFourAdd x y = Just (x+) `ap` mfilter (/=4) (Just y)

Does there exist something like (xs:x)

I'm new to Haskell. I know I can create a reverse function by doing this:
reverse :: [a] -> [a]
reverse [] = []
reverse (x:xs) = (Main.reverse xs) ++ [x]
Is there such a thing as (xs:x) (a list concatenated with an element, i.e. x is the last element in the list) so that I put the last list element at the front of the list?
rotate :: [a] -> [a]
rotate [] = []
rotate (xs:x) = [x] ++ xs
I get these errors when I try to compile a program containing this function:
Occurs check: cannot construct the infinite type: a = [a]
When generalising the type(s) for `rotate'
I'm also new to Haskell, so my answer is not authoritative. Anyway, I would do it using last and init:
Prelude> last [1..10] : init [1..10]
[10,1,2,3,4,5,6,7,8,9]
or
Prelude> [ last [1..10] ] ++ init [1..10]
[10,1,2,3,4,5,6,7,8,9]
The short answer is: this is not possible with pattern matching, you have to use a function.
The long answer is: it's not in standard Haskell, but it is if you are willing to use an extension called View Patterns, and also if you have no problem with your pattern matching eventually taking longer than constant time.
The reason is that pattern matching is based on how the structure is constructed in the first place. A list is an abstract type, which have the following structure:
data List a = Empty | Cons a (List a)
deriving (Show) -- this is just so you can print the List
When you declare a type like that you generate three objects: a type constructor List, and two data constructors: Empty and Cons. The type constructor takes types and turns them into other types, i.e., List takes a type a and creates another type List a. The data constructor works like a function that returns something of type List a. In this case you have:
Empty :: List a
representing an empty list and
Cons :: a -> List a -> List a
which takes a value of type a and a list and appends the value to the head of the list, returning another list. So you can build your lists like this:
empty = Empty -- similar to []
list1 = Cons 1 Empty -- similar to 1:[] = [1]
list2 = Cons 2 list1 -- similar to 2:(1:[]) = 2:[1] = [2,1]
This is more or less how lists work, but in the place of Empty you have [] and in the place of Cons you have (:). When you type something like [1,2,3] this is just syntactic sugar for 1:2:3:[] or Cons 1 (Cons 2 (Cons 3 Empty)).
When you do pattern matching, you are "de-constructing" the type. Having knowledge of how the type is structured allows you to uniquely disassemble it. Consider the function:
head :: List a -> a
head (Empty) = error " the empty list have no head"
head (Cons x xs) = x
What happens on the type matching is that the data constructor is matched to some structure you give. If it matches Empty, than you have an empty list. If if matches Const x xs then x must have type a and must be the head of the list and xs must have type List a and be the tail of the list, cause that's the type of the data constructor:
Cons :: a -> List a -> List a
If Cons x xs is of type List a than x must be a and xs must be List a. The same is true for (x:xs). If you look to the type of (:) in GHCi:
> :t (:)
(:) :: a -> [a] -> [a]
So, if (x:xs) is of type [a], x must be a and xs must be [a] . The error message you get when you try to do (xs:x) and then treat xs like a list, is exactly because of this. By your use of (:) the compiler infers that xs have type a, and by your use of
++, it infers that xs must be [a]. Then it freaks out cause there's no finite type a for which a = [a] - this is what he's trying to tell you with that error message.
If you need to disassemble the structure in other ways that don't match the way the data constructor builds the structure, than you have to write your own function. There are two functions in the standard library that do what you want: last returns the last element of a list, and init returns all-but-the-last elements of the list.
But note that pattern matching happens in constant time. To find out the head and the tail of a list, it doesn't matter how long the list is, you just have to look to the outermost data constructor. Finding the last element is O(N): you have to dig until you find the innermost Cons or the innermost (:), and this requires you to "peel" the structure N times, where N is the size of the list.
If you frequently have to look for the last element in long lists, you might consider if using a list is a good idea after all. You can go after Data.Sequence (constant time access to first and last elements), Data.Map (log(N) time access to any element if you know its key), Data.Array (constant time access to an element if you know its index), Data.Vector or other data structures that match your needs better than lists.
Ok. That was the short answer (:P). The long one you'll have to lookup a bit by yourself, but here's an intro.
You can have this working with a syntax very close to pattern matching by using view patterns. View Patterns are an extension that you can use by having this as the first line of your code:
{-# Language ViewPatterns #-}
The instructions of how to use it are here: http://hackage.haskell.org/trac/ghc/wiki/ViewPatterns
With view patterns you could do something like:
view :: [a] -> (a, [a])
view xs = (last xs, init xs)
someFunction :: [a] -> ...
someFunction (view -> (x,xs)) = ...
than x and xs will be bound to the last and the init of the list you provide to someFunction. Syntactically it feels like pattern matching, but it is really just applying last and init to the given list.
If you're willing to use something different from plain lists, you could have a look at the Seq type in the containers package, as documented here. This has O(1) cons (element at the front) and snoc (element at the back), and allows pattern matching the element from the front and the back, through use of Views.
"Is there such a thing as (xs:x) (a list concatenated with an element, i.e. x is the last element in the list) so that I put the last list element at the front of the list?"
No, not in the sense that you mean. These "patterns" on the left-hand side of a function definition are a reflection of how a data structure is defined by the programmer and stored in memory. Haskell's built-in list implementation is a singly-linked list, ordered from the beginning - so the pattern available for function definitions reflects exactly that, exposing the very first element plus the rest of the list (or alternatively, the empty list).
For a list constructed in this way, the last element is not immediately available as one of the stored components of the list's top-most node. So instead of that value being present in pattern on the left-hand side of the function definition, it's calculated by the function body onthe right-hand side.
Of course, you can define new data structures, so if you want a new list that makes the last element available through pattern-matching, you could build that. But there's be some cost: Maybe you'd just be storing the list backwards, so that it's now the first element which is not available by pattern matching, and requires computation. Maybe you're storing both the first and last value in the structures, which would require additional storage space and bookkeeping.
It's perfectly reasonable to think about multiple implementations of a single data structure concept - to look forward a little bit, this is one use of Haskell's class/instance definitions.
Reversing as you suggested might be much less efficient. Last is not O(1) operation, but is O(N) and that mean that rotating as you suggested becomes O(N^2) alghorhim.
Source:
http://www.haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/src/GHC-List.html#last
Your first version has O(n) complexity. Well it is not, becuase ++ is also O(N) operation
you should do this like
rotate l = rev l []
where
rev [] a = a
rev (x:xs) a = rev xs (x:a)
source : http://www.haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/src/GHC-List.html#reverse
In your latter example, x is in fact a list. [x] becomes a list of lists, e.g. [[1,2], [3,4]].
(++) wants a list of the same type on both sides. When you are using it, you're doing [[a]] ++ [a] which is why the compiler is complaining. According to your code a would be the same type as [a], which is impossible.
In (x:xs), x is the first item of the list (the head) and xs is everything but the head, i.e., the tail. The names are irrelevant here, you might as well call them (head:tail).
If you really want to take the last item of the input list and put that in the front of the result list, you could do something like:
rotate :: [a] -> [a]
rotate [] = []
rotate lst = (last lst):(rotate $ init lst)
N.B. I haven't tested this code at all as I don't have a Haskell environment available at the moment.

How would you (re)implement iterate in Haskell?

iterate :: (a -> a) -> a -> [a]
(As you probably know) iterate is a function that takes a function and starting value. Then it applies the function to the starting value, then it applies the same function to the last result, and so on.
Prelude> take 5 $ iterate (^2) 2
[2,4,16,256,65536]
Prelude>
The result is an infinite list. (that's why I use take).
My question how would you implement your own iterate' function in Haskell, using only the basics ((:) (++) lambdas, pattern mataching, guards, etc.) ?
(Haskell beginner here)
Well, iterate constructs an infinite list of values a incremented by f. So I would start by writing a function that prepended some value a to the list constructed by recursively calling iterate with f a:
iterate :: (a -> a) -> a -> [a]
iterate f a = a : iterate f (f a)
Thanks to lazy evaluation, only that portion of the constructed list necessary to compute the value of my function will be evaluated.
Also note that you can find concise definitions for the range of basic Haskell functions in the report's Standard Prelude.
Reading through this list of straightforward definitions that essentially bootstrap a rich library out of raw primitives can be very educational and eye-opening in terms of providing a window onto the "haskell way".
I remember a very early aha moment on reading: data Bool = False | True.

Resources