Efficient string swapping in Haskell

I'm trying to solve a problem for a functional programming exercise in Haskell. I have to implement a function such that, given a string with an even number of characters, the function returns the same string with character pairs swapped.
Like this:
"helloworld" -> "ehllworodl"
This is my current implementation:
swap :: String -> String
swap s = swapRec s ""
  where
    swapRec :: String -> String -> String
    swapRec [] result = result
    swapRec (x:y:xs) result = swapRec xs (result++[y]++[x])
My function returns the correct results. However, the programming exercise is timed, and it seems like my code is running too slowly.
Is there something I could do to make my code run faster, or am I following the wrong approach to the problem?

Yes. (++) :: [a] -> [a] -> [a] takes time linear in the number of elements of its first (left) argument. Since result grows with every step, each recursive call copies everything accumulated so far, which is inefficient: the algorithm is then O(n^2).
You however do not need to construct the result with an accumulator. You can return a list, and do the processing of the remaining elements with a recursive call, like:
swap :: [a] -> [a]
swap [] = []
swap [x] = [x]
swap (x:y:xs) = y : x : swap xs
The above also uncovered a problem with the implementation: if the list had an odd length, then the function would have crashed. Here in the second case, we handle a list with one element by returning that list (perhaps you need to modify this according to the specifications).
Furthermore, here we can benefit from Haskell's laziness: if we have a large list, want to pass it through the swap function, but are only interested in the first five elements, then we will not calculate the entire list.
We can also process all kinds of list with the above function: a list of numbers, of strings, etc.
Note that (++) itself is not inherently bad: if you need to concatenate, it is of course the most efficient way to do this. The problem is that here you concatenate again in every recursive step, and the left list grows each time.
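As a small illustration of the laziness point above, the sketch below runs the same pair-swapping recursion on an infinite list. The names swapPairs and firstFive are made up for this example and are not part of the original answer.

```haskell
-- Same recursion as swap above, under a hypothetical name.
swapPairs :: [a] -> [a]
swapPairs (x:y:rest) = y : x : swapPairs rest
swapPairs rest       = rest  -- empty list or a single leftover element

-- Only the first five elements of an infinite input are ever computed.
firstFive :: String
firstFive = take 5 (swapPairs (cycle "ab"))
```

Because each step produces two output elements before recursing, take can stop as soon as it has what it needs.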

Affixing something at the end of the accumulator passed into a recursive call
swapRec (x:y:xs) resultSoFar = swapRec xs
                                 (resultSoFar ++ [y] ++ [x])
is the same as prepending it at the start of the result returned from the recursive call:
swapRec (x:y:xs) = [y] ++ [x] ++ swapRec xs
You will have to amend your function accordingly throughout.
This is known as guarded recursion. What you were using is known as tail recursion (a left fold).
The added benefit is that it will now be on-line (i.e., taking O(1) time per processed element). You were creating the (++) nesting on the left, which leads to quadratic behaviour, as discussed e.g. here.
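If you would rather keep an accumulator, a different technique from the guarded recursion described above is to prepend to the accumulator and reverse it once at the end. The following is only a sketch of that alternative; swapAcc and go are hypothetical names. It runs in linear time, but unlike the guarded version it cannot produce any output before the whole input has been consumed.

```haskell
-- Accumulator variant: build the result in reverse, then reverse once.
swapAcc :: [a] -> [a]
swapAcc = go []
  where
    -- acc holds the output so far in reverse order, so pushing x on top of y
    -- means y comes before x after the final reverse.
    go acc (x:y:rest) = go (x : y : acc) rest
    -- End of input (possibly with one leftover element): undo the reversal.
    go acc rest       = reverse acc ++ rest
```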

Related

Permutations of a list in Haskell without using Data.Set

I have to write a Haskell function that gives all possible permutations of a given list.
The type signature has to be:
permutations :: [a] -> [[a]]
For example, an acceptable result is (under ghci):
λ>
λ> permutations [1,2,3]
[[1,2,3],[2,1,3],[2,3,1],[1,3,2],[3,1,2],[3,2,1]]
λ>
The following restrictions apply:
Plain lists are the sole authorized data structure, so no Set or Vector.
The permutations may be produced in any order.
The function must work for any element type, i.e. no Ord a or Eq a instances.
Only library functions from the standard Prelude may be used.
Does anyone know how I could do it?
There are several ways to approach this problem. jpmarinier suggests one possible way in the comments, but I think a recursive approach following the structure of the input list is more natural in Haskell.
For that recursive approach you have to implement two cases: what happens when the list is empty, and what happens when the list contains at least one element. In the second case you can also use the function recursively on the rest of the list. The general structure is:
permutations [] = _
permutations (x:xs) = let xs' = permutations xs in _
The case with the empty list is pretty simple, but there are a few different choices that make the compiler happy, so it might not be immediately clear which one you should choose.
For the case with at least one element I would use a second helper function called splits :: [a] -> [([a],[a])], which computes all possible splits of the input list into two lists.
Here is an example input and output that might make it clearer what I mean:
splits [1,2,3] == [([],[1,2,3]),([1],[2,3]),([1,2],[3]),([1,2,3],[])]
The implementation of this function is also recursive and follows the same pattern:
splits [] = _
splits (x:xs) = let xs' = splits xs in _
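One possible way to fill in the holes above is sketched here. This is an assumption about what the answer has in mind, not its stated solution; permutationsViaSplits is a hypothetical name chosen to avoid clashing with other definitions, and only Prelude functions are used.

```haskell
-- All ways to cut a list into a prefix and a suffix.
splits :: [a] -> [([a], [a])]
splits []     = [([], [])]
splits (x:xs) = ([], x:xs) : [ (x:l, r) | (l, r) <- splits xs ]

-- Insert the head at every split point of every permutation of the tail.
permutationsViaSplits :: [a] -> [[a]]
permutationsViaSplits []     = [[]]
permutationsViaSplits (x:xs) =
  [ l ++ x : r | p <- permutationsViaSplits xs, (l, r) <- splits p ]
```

Note that the empty list has exactly one permutation (the empty list itself), which is why the base case is [[]] rather than [].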
The Wikipedia article on permutations leads us to, among many other things, the Steinhaus–Johnson–Trotter algorithm, which seems well suited to linked lists.
For this algorithm, an essential building block is a function we could declare as:
spread :: a -> [a] -> [[a]]
For example, the expression spread 4 [1,2,3] has to put 4 at all possible positions within [1,2,3], thus evaluating to: [[4,1,2,3],[1,4,2,3],[1,2,4,3],[1,2,3,4]]. To get all permutations of [1,2,3,4], you just need to apply spread 4 to all permutations of [1,2,3]. And it is easy to write spread in recursive fashion:
spread :: a -> [a] -> [[a]]
spread x [] = [[x]]
spread x (y:ys) = (x:y:ys) : (map (y:) (spread x ys))
And permutations can thus be obtained like this:
permutations :: [a] -> [[a]]
permutations [] = [[]]
permutations (x:xs) = concat (map (spread x) (permutations xs))

Haskell: Parse error in pattern x ++ xs

Doing the third of the 99-Haskell problems (I am currently trying to learn the language) I tried to incorporate pattern matching as well as recursion into my function which now looks like this:
myElementAt :: [a] -> Int -> a
myElementAt (x ++ xs) i =
  if length (x ++ xs) == i && length xs == 1 then xs!!0
  else myElementAt x i
Which gives me Parse error in pattern: x ++ xs. The questions:
Why does this give me a parse error? Is it because Haskell has no idea where to cut my list (which is my best guess)?
How could I reframe my function so that it works? The algorithmic idea is to check whether the list has the same length as the specified index; if yes, return the last element; if not, cut away one element at the end of the list and then recurse.
Note: I know that this is a really bad algorithm, but I've set myself the challenge to write that function using recursion and pattern matching. I also tried not to use the !! operator, but that is fine for me since the only thing it really does (or should do if it compiled) is to convert a one-element list into that element.
Haskell has two different kinds of value-level entities: variables (this also includes functions, infix operators like ++ etc.) and constructors. Both can be used in expressions, but only constructors can also be used in patterns.
In either case, it's easy to tell whether you're dealing with a variable or constructor: a constructor always starts with an uppercase letter (e.g. Nothing, True or StateT) or, if it's an infix, with a colon (:, :+). Everything else is a variable. Fundamentally, the difference is that a constructor is always a unique, immediately matchable value from a predefined collection (namely, the alternatives of a data definition), whereas a variable can have any value, and often it is in principle not possible to uniquely distinguish different variables, in particular if they have a function type.
Yours is actually a good example for this: for the pattern match x ++ xs to make sense, there would have to be one unique way in which the input list could be written in the form x ++ xs. Well, but for, say [0,1,2,3], there are multiple different ways in which this can be done:
[] ++ [0,1,2,3]
[0] ++ [1,2,3]
[0,1] ++ [2,3]
[0,1,2] ++ [3]
[0,1,2,3] ++ []
Which one should the runtime choose?
Presumably, you're trying to match the head and tail part of a list. Let's step through it:
myElementAt (x:_) 0 = x
This means that if the head is x, the tail is something, and the index is 0, return the head. Note that your x ++ xs is a concatenation of two lists, not the head and tail parts.
Then you can have
myElementAt (_:tl) i = myElementAt tl (i - 1)
which means that if the previous pattern was not matched, ignore the head, and take the i - 1 element of the tail.
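Putting the two clauses together, a total version might look like the sketch below. It keeps the 0-based indexing used in the clauses above (the original exercise may count from 1), and the name elementAt0 and the Maybe return type are assumptions added here so that the empty list and out-of-range indices do not crash.

```haskell
-- 0-based lookup that returns Nothing instead of failing.
elementAt0 :: [a] -> Int -> Maybe a
elementAt0 []     _ = Nothing          -- ran out of elements
elementAt0 (x:_)  0 = Just x           -- index 0: the head
elementAt0 (_:tl) i
  | i < 0     = Nothing                -- negative index
  | otherwise = elementAt0 tl (i - 1)  -- skip the head, look in the tail
```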
In patterns, you can only use constructors like : and []. The append operator (++) is a non-constructor function.
So, try something like:
myElementAt :: [a] -> Int -> a
myElementAt (x:xs) i = ...
There are more issues in your code, but at least this fixes your first problem.
In standard Haskell, pattern matches like this:
f :: Int -> Int
f (g n 1) = n
g :: Int -> Int -> Int
g a b = a+b
are illegal because function calls aren't allowed in patterns. Your case is just a special case of this, as the operator ++ is just a function.
To pattern match on lists you can do it like this:
myElementAt :: [a] -> Int -> a
myElementAt (x:xs) i = -- result
But in this case x is of type a, not [a]: it is the head of the list and xs is its tail. You'll need to change your function implementation to accommodate this fact. Also, this function will fail on the empty list []. However, that's the idiomatic Haskell way to pattern match against lists.
I should mention that when I said "illegal" I meant in standard Haskell. There is a GHC extension that gives something similar; it's called ViewPatterns. But I don't think you need it, especially while you're still learning.
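For the curious, here is a minimal sketch of what the ViewPatterns extension allows. The function lastElem is a hypothetical example, not something from the question: the pattern applies reverse to the argument and then matches the result against (x:_).

```haskell
{-# LANGUAGE ViewPatterns #-}

-- A view pattern has the form (function -> pattern): the function is applied
-- to the argument and its result is matched against the pattern.
lastElem :: [a] -> Maybe a
lastElem (reverse -> (x:_)) = Just x
lastElem _                  = Nothing
```

Even with the extension, the thing on the right of the arrow must still be an ordinary constructor pattern, so x ++ xs remains invalid as a pattern.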

The length of a list without the "length" function in Haskell

I want to see how long a list is, but without using the function length. I wrote this program and it does not work. Maybe you can tell me why? Thanks!
let y = 0
main = do
list (x:xs) = list (xs)
y++
list :: [Integer] -> Integer
list [] = y
Your program looks quite "imperative": you define a variable y, and then somehow write a do, that calls (?) the list function (?) that automagically seems to "return y" and then you want to increment y.
That's not how Haskell (and most functional and declarative languages) work:
in a declarative language, you define a variable only once, after the value is set, there is usually no way to alter its value,
in Haskell a do usually is used for monads, whereas the length is a pure function,
the let is a syntax construction to define a variable within the scope of an expression,
...
In order to program Haskell (or any functional language), you need to "think functional": think how you would solve the problem in a mathematical way using only functions.
In mathematics, you would say that the empty list [] clearly has length 0. Furthermore, in case the list is not empty, there is a first element (the "head") and there are remaining elements (the "tail"). In that case the result is one plus the length of the tail. We can convert that into a mathematical expression, like:
len([]) = 0
len(x:xs) = 1 + len(xs)
Now we can easily translate that function into the following Haskell code:
ownLength :: [a] -> Int
ownLength [] = 0
ownLength (_:xs) = 1 + ownLength xs
Now in Haskell, one usually also uses accumulators in order to perform tail recursion: you pass a parameter through the recursive calls and each time you update the variable. When you reach the end of your recursion, you return - sometimes after some post-processing - the accumulator.
In this case the accumulator would be the so far seen length, so you could write:
ownLength :: [a] -> Int
ownLength = ownLength' 0
  where ownLength' a [] = a
        ownLength' a (_:xs) = ownLength' (a+1) xs
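One caveat with accumulators in a lazy language: unless the accumulator is forced, it can build up a chain of unevaluated (a+1) additions (an optimising compiler may or may not remove it). The sketch below is one possible refinement, not part of the original answer; it uses seq from the Prelude to evaluate the counter at each step, and lengthStrict is a hypothetical name.

```haskell
-- Tail-recursive length with an explicitly forced accumulator.
lengthStrict :: [a] -> Int
lengthStrict = go 0
  where
    go acc []     = acc
    go acc (_:xs) =
      let acc' = acc + 1
      in acc' `seq` go acc' xs  -- evaluate the counter before recursing
```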
It looks like you still think in an imperative way (not the functional way). For example:
you try to change the value of a "variable" (i.e. y++)
you try to use "global variable" (i.e. y) in the body of the list function
Here is a possible solution to your problem:
main = print $ my_length [1..10]
my_length :: [Integer] -> Integer
my_length [] = 0
my_length (_:xs) = 1 + my_length xs
You can also run this code here: http://ideone.com/mjUwL9.
Please also note that there is no need to require that your list consists of Integer values. In fact, you can create a much more "agnostic" version of your function by using the following declaration:
my_length :: [a] -> Integer
The implementation of this function doesn't rely on the type of the items in the list, so you can use it for a list of any type. In contrast, you couldn't be that liberal with, for example, a my_sum function (a hypothetical function that calculates the sum of the elements of a given list). In that situation, you would have to state that your list consists of items of some numeric type.
Finally, I'd like to recommend a fantastic book about Haskell programming: http://learnyouahaskell.com/chapters.
Other answers have already beautifully explained the proper functional approach. It may look like overkill, but here is another way of implementing the length function using only the available higher-order functions.
my_length :: [a] -> Integer
my_length = foldr (flip $ const . (+1)) 0
I found this solution in Learn You a Haskell.
length' xs = sum [1 | _ <- xs]
It replaces every element of the list with 1 and sums it up.
Probably the simplest way is to convert all elements to 1 and then to sum the new elements:
sum . map (const 1)
For added speed (foldl' is exported by Data.List):
foldl' (+) 0 . map (const 1)

Haskell, Monads, Stack Space, Laziness -- how to structure code to be lazy?

A contrived example, but the below code demonstrates a class of problems I keep running into while learning Haskell.
import Control.Monad.Error
import Data.Char (isDigit)
countDigitsForList [] = return []
countDigitsForList (x:xs) = do
  q <- countDigits x
  qs <- countDigitsForList xs
  return (q:qs)
countDigits x = do
  if all isDigit x
    then return $ length x
    else throwError $ "Bad number: " ++ x
t1 = countDigitsForList ["1", "23", "456", "7890"] :: Either String [Int]
t2 = countDigitsForList ["1", "23", "4S6", "7890"] :: Either String [Int]
t1 gives me the right answer and t2 correctly identifies the error.
Seems to me that, for a sufficiently long list, this code is going to run out of stack space because it runs inside of a monad and at each step it tries to process the rest of the list before returning the result.
An accumulator and tail recursion seems like it may solve the problem but I repeatedly read that neither are necessary in Haskell because of lazy evaluation.
How do I structure this kind of code into one which won't have a stack space problem and/or be lazy?
> How do I structure this kind of code into one which won't have a stack space problem and/or be lazy?
You can't make this function process the list lazily, monads or no. Here's a direct translation of countDigitsForList to use pattern matching instead of do notation:
countDigitsForList [] = return []
countDigitsForList (x:xs) = case countDigits x of
    Left e  -> Left e
    Right q -> case countDigitsForList xs of
        Left e   -> Left e
        Right qs -> Right (q:qs)
It should be easier to see here that, because a Left at any point in the list makes the whole thing return that value, in order to determine the outermost constructor of the result, the entire list must be traversed and processed; likewise for processing each element. Because the final result potentially depends on the last character in the last string, this function as written is inherently strict, much like summing a list of numbers.
Given that, the thing to do is ensure that the function is strict enough to avoid building up a huge unevaluated expression. A good place to start for information on that is discussions on the difference between foldr, foldl and foldl'.
> An accumulator and tail recursion seems like it may solve the problem but I repeatedly read that neither are necessary in Haskell because of lazy evaluation.
Both are unnecessary when you can instead generate, process, and consume a list lazily; the simplest example here being map. For a function where that's not possible, strictly-evaluated tail recursion is precisely what you want.
camccann is right that the function is inherently strict. But that doesn't mean that it can't run in constant stack!
countDigitsForList xss = go xss []
  where go (x:xs) acc = case countDigits x of
            Left e  -> Left e
            Right q -> go xs (q:acc)
        go [] acc = Right (reverse acc)
This accumulating-parameter version is a partial CPS transform of camccann's code, and I bet that you could get the same result by working over a CPS-transformed Either monad as well.
Edited to take into account jwodder's correction regarding reverse. oops. As John L notes an implicit or explicit difference list would work as well...
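For reference, here is a self-contained sketch of the accumulator approach that uses plain Either from the Prelude rather than the mtl-based throwError of the question (that substitution is an assumption made so the example has no external dependencies). The type signatures are added for clarity.

```haskell
import Data.Char (isDigit)

-- Count the digits of one string, or report it as bad.
countDigits :: String -> Either String Int
countDigits x
  | all isDigit x = Right (length x)
  | otherwise     = Left ("Bad number: " ++ x)

-- Tail-recursive traversal: stop at the first Left, otherwise collect
-- results in reverse and restore the order once at the end.
countDigitsForList :: [String] -> Either String [Int]
countDigitsForList = go []
  where
    go acc []     = Right (reverse acc)
    go acc (x:xs) = case countDigits x of
      Left e  -> Left e
      Right q -> go (q : acc) xs
```

On the two inputs from the question this gives Right [1,2,3,4] and Left "Bad number: 4S6", the same results as mapM countDigits.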

Haskell guards not being met

test :: [String] -> [String]
test = foldr step []
  where step x ys
          | elem x ys = x : ys
          | otherwise = ys
I am trying to build a new list consisting of all the distinct strings being input. My test data is:
test ["one", "one", "two", "two", "three"]
expected result:
["one", "two", "three"]
I am new to Haskell, and I am sure that I am missing something very fundamental and obvious, but have run out of ways to explore this. Could you provide pointers to where my thinking is deficient?
The actual response is []. It seems that the first guard condition is never met (if I replace it with True, the original list is replicated), so the output list is never built.
My understanding was that the fold would accumulate the result of step on each item of the list, adding it to the empty list. I anticipated that step would test each item for its inclusion in the output list (the first element tested not being there) and would add anything that was not already included to the output list. Obviously not :-)
Your reasoning is correct: you just need to switch = x : ys and = ys so that you add the x when it's not an element of ys. Also, Data.List.nub does this exact thing.
Think about it: your code is saying "when x is in the remainder, prepend x to the result", i.e. creating a duplicate. You just need to change it to "when x is not in the remainder, prepend x to the result" and you get the correct function.
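With the two guard results swapped as described, the function would read as follows. It is renamed distinct here purely for illustration.

```haskell
-- Keep x only when it does not already occur in the rest of the result.
distinct :: [String] -> [String]
distinct = foldr step []
  where
    step x ys
      | x `elem` ys = ys      -- already present further right: drop it
      | otherwise   = x : ys  -- not seen yet: keep it
```

Since foldr works from the right, it is the last occurrence of each string that survives, which still yields ["one", "two", "three"] for the test data in the question.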
This function differs from Data.List.nub in an important way: this function is more strict. Thus:
test [1..] = _|_ -- infinite loop (try it)
nub [1..] = [1..]
nub gives the answer correctly for infinite lists -- this means that it doesn't need the whole list to start computing results, and thus it is a nice player in the stream processing game.
The reason your function is strict is that elem is strict: it searches the whole list (presuming it doesn't find a match) before it returns a result. You could write nub like this:
nub :: (Eq a) => [a] -> [a]
nub = go []
  where
    go seen [] = []
    go seen (x:xs) | x `elem` seen = go seen xs
                   | otherwise     = x : go (x:seen) xs
Notice how seen grows like the output so far, whereas yours grows like the remainder of the output. The former is always finite (starting at [] and adding one at a time), whereas the latter may be infinite (eg. [1..]). So this variant can yield elements more lazily.
This would be faster (O(n log n) instead of O(n^2)) if you used a Data.Set instead of a list for seen. But it adds an Ord constraint.
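A sketch of that Set-based variant, assuming the containers package is available (it ships with GHC). The name nubSet is made up for this example.

```haskell
import qualified Data.Set as Set

-- Like the list-based version above, but membership tests are logarithmic.
nubSet :: Ord a => [a] -> [a]
nubSet = go Set.empty
  where
    go _    []     = []
    go seen (x:xs)
      | x `Set.member` seen = go seen xs
      | otherwise           = x : go (Set.insert x seen) xs
```

It stays lazy in the same way: each new element is emitted before the rest of the input is examined, so it also works on infinite lists.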

Resources