merge finite sorted list in Haskell - haskell

I am new to Haskell. I am wondering how to write a function in Haskell that accepts finite sorted lists of integers and merges them into a single sorted list. Any code is appreciated!

If your goal is just to merge two lists, this is not so complicated:
merge :: Ord a => [a] -> [a] -> [a]
This says that merge takes two lists and produces a list, for any type with a defined ordering relation.
merge [] x = x
merge x [] = x
This says that if you merge the empty list with anything, you get that anything back.
merge (x:xs) (y:ys) | y < x = y : merge (x:xs) ys
merge (x:xs) (y:ys) | otherwise = x : merge xs (y:ys)
This says that when you merge two lists, if the first element of the second list is lower, it should go on the front of the new list; otherwise you should use the first element of the first list.
EDIT: Note that unlike some of the other solutions the merge above is both O(n) and stable. Wikipedia it if you don't know what that means.
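To make the stability point concrete, here is a minimal sketch. mergeBy is a hypothetical generalisation of the merge above (it is not part of the original answer) that takes the comparison as an argument, so that elements with equal keys can be told apart in the output.

```haskell
import Data.Ord (comparing)

-- Hypothetical variant of the merge above: same structure, but the
-- comparison function is passed in explicitly.
mergeBy :: (a -> a -> Ordering) -> [a] -> [a] -> [a]
mergeBy _ [] ys = ys
mergeBy _ xs [] = xs
mergeBy cmp (x:xs) (y:ys)
  | cmp y x == LT = y : mergeBy cmp (x:xs) ys  -- take from the right only if strictly smaller
  | otherwise     = x : mergeBy cmp xs (y:ys)  -- on ties the left element goes first

-- Elements are compared by their Int key only; the Char tags which list they came from.
example :: [(Int, Char)]
example = mergeBy (comparing fst) [(1,'a'),(2,'a')] [(1,'b'),(2,'b')]
-- [(1,'a'),(1,'b'),(2,'a'),(2,'b')]
```

For equal keys the element from the first list always comes out first, which is what stable means here. Each step emits one element, so the work is linear in the total length.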
If your goal is to merge a list of lists you generally want to do this bottom up by merging two lists at a time
mergePairs :: Ord a => [[a]] -> [[a]]
mergePairs [] = []
mergePairs [ls] = [ls]
mergePairs (x:y:ls) = (merge x y):mergePairs ls
merges :: Ord a => [[a]] -> [a]
merges [] = []
merges [x] = x
merges ls = merges $ mergePairs ls
It can be shown that this is asymptotically optimal if all the initial lists are the same length: O(m n log n), where m is the length of each sorted list and n is the number of sorted lists.
This can lead to an asymptotically efficient merge sort:
mergeSort :: Ord a => [a] -> [a]
mergeSort ls = merges $ map (\x -> [x]) ls

This should do it, without requiring that the lists be finite:
merge :: Ord a => [a] -> [a] -> [a]
merge (x:xs) (y:ys) = if x < y
    then x:(merge xs (y:ys))
    else y:(merge (x:xs) ys)
merge [] xs = xs
merge xs [] = xs
In English: check the first element of each list, make the lesser one the next element of the result, then merge the lists that remain.
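A quick way to see that this works on infinite lists is to merge two infinite sorted lists and take only a prefix. The sketch below repeats the same logic with guards so that it is self-contained; firstTen is just an illustrative name.

```haskell
merge :: Ord a => [a] -> [a] -> [a]
merge (x:xs) (y:ys)
  | x < y     = x : merge xs (y:ys)
  | otherwise = y : merge (x:xs) ys
merge [] ys = ys
merge xs [] = xs

-- Odd and even numbers are both infinite; laziness means only the
-- first ten merged elements are ever computed.
firstTen :: [Integer]
firstTen = take 10 (merge [1,3..] [2,4..])
-- [1,2,3,4,5,6,7,8,9,10]
```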

(sort . concat) [[30..32],[1..3]] == [1,2,3,30,31,32]

Related

More efficient powerset algorithm haskell

I have a powerset function that creates a list [[a]], but the largest [a] is worked out first, meaning the whole algorithm has to run before I can get the smaller values.
I need a function which returns a powerset, in ascending order, so I could take the first n values of the function and the whole algorithm would not need to run.
Current simple algorithm
powerset :: [a] -> [[a]]
powerset [] = [[]]
powerset (x:xs) = [x:ps | ps <- powerset xs] ++ powerset xs
I don't understand what you mean by ascending order, but consider this solution:
powerset' :: [a] -> [[a]]
powerset' = loop [[]]
  where
    loop :: [[a]] -> [a] -> [[a]]
    loop acc [] = acc
    loop acc (x:xs) = loop (acc ++ fmap (\e -> e ++ [x]) acc) xs
We start with the powerset of the empty list, which is [[]], and expand it for each new element we encounter in the input list. The expansion is by appending the new element in each sublist we already emitted.
It requires that we append elements to the sublists exponentially many times, so I also considered using Data.DList from the dlist package that provides an efficient snoc operator that appends new elements to the end of the list:
import Data.DList
powerset :: [a] -> [[a]]
powerset xs = toList <$> loop [empty] xs
  where
    loop :: [DList a] -> [a] -> [DList a]
    loop acc [] = acc
    loop acc (y:ys) = loop (acc ++ fmap (`snoc` y) acc) ys
In my (rough) experiments, though, the first solution uses way less memory in the REPL and thus finishes faster for bigger input lists.
In both cases, this is what you get at the end:
$> powerset [1,2,3]
[[],[1],[2],[1,2],[3],[1,3],[2,3],[1,2,3]]
$> powerset_original [1,2,3]
[[1,2,3],[1,2],[1,3],[1],[2,3],[2],[3],[]]
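Regarding taking only the first n values: the order shown for powerset [1,2,3] above is the same order that subsequences from Data.List produces, and subsequences is lazy, so a prefix can be taken without computing the rest. The sketch below assumes imports are acceptable; firstSubsets and bySize are illustrative names, not part of the question or the answer.

```haskell
import Data.List (sortOn, subsequences)

-- Take the first n subsets lazily; this works even on an infinite input list.
firstSubsets :: Int -> [a] -> [[a]]
firstSubsets n = take n . subsequences

-- If "ascending" means ordered by subset size, a stable sort on length
-- gives that, at the cost of building the whole powerset first (finite input only).
bySize :: [a] -> [[a]]
bySize = sortOn length . subsequences
```

For example, firstSubsets 4 [1..] gives [[],[1],[2],[1,2]], and bySize "abc" gives ["","a","b","c","ab","ac","bc","abc"].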

Nested Loop Formatting

I am trying to produce an output where the input list is split each time f x is true. I use two variables to keep track of the substring and the final list, and this function will be called by another that provides empty lists for two tracking variables. Example desired output:
separate odd [1,2,3,4] = [[2],[4]]
Below is what I have so far - although I keep running into type errors because lists all have to be of the same type - can anyone advise what changes need to be made to produce the desired output?
separate f (x:xs) sublist finalstr
  | f x = (finalstr ++ sublist) : separate f xs sublist finalstr
  | otherwise = (sublist ++ x) : separate f xs sublist finalstr
separate f [] sublist finalstr = []
You could divide your problem into the following sub-problems:
Group each element in the list by what f returns:
groupOn :: Eq b => (a -> b) -> [a] -> [[a]]
groupOn f = ...
For example,
> groupOn odd [1,3,3,4,5,6,8]
[[1,3,3],[4],[5],[6,8]]
Filter out sub-lists in which the first element satisfies f:
separate :: (a -> Bool) -> [a] -> [[a]]
separate f xs = filter (\ys -> ...) (groupOn f xs)
where ys would be [1,3,3], [4], [5] and [6,8] in the above example.
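One possible way to fill in the two gaps, as a sketch rather than the only solution: it assumes groupBy from Data.List and on from Data.Function are allowed, and keep is a helper name introduced here for illustration.

```haskell
import Data.Function (on)
import Data.List (groupBy)

-- Group consecutive elements for which f returns the same value.
groupOn :: Eq b => (a -> b) -> [a] -> [[a]]
groupOn f = groupBy ((==) `on` f)

-- Drop every group whose first element satisfies f.
separate :: (a -> Bool) -> [a] -> [[a]]
separate f xs = filter keep (groupOn f xs)
  where
    keep (y:_) = not (f y)
    keep []    = False  -- groupBy never yields empty groups; kept for totality
```

With these definitions, separate odd [1,2,3,4] evaluates to [[2],[4]], matching the desired output in the question.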

Sorting a list of lists in Haskell

I want to write a function that takes a list of sorted lists, then merges everything together and sorts them again.
I managed to write this so far:
merge_:: Ord a => [[a]] -> [a] --takes in the list and merges it
merge_ [] = []
merge_ (x:xs) = x ++ merge_ xs
isort:: Ord a => [a] -> [a] --Sorts a list
isort [] = []
isort (a:x) = ins a (isort x)
  where
    ins a [] = [a]
    ins a (b:y) | a <= b = a:(b:y)
                | otherwise = b: (ins a y)
I haven't been able to find a way to combine these two in one function in a way that makes sense. Note that I'm not allowed to use things such as ('.', '$'..etc) (homework)
We start simple. How do we merge two sorted lists?
mergeTwo :: Ord a => [a] -> [a] -> [a]
mergeTwo [] ys = ys
mergeTwo xs [] = xs
mergeTwo (x:xs) (y:ys)
  | x <= y = x : mergeTwo xs (y:ys)
  | otherwise = y : mergeTwo (x:xs) ys
How do we merge multiple? Well, we start with the first and the second and merge them together. Then we merge the new one and the third together:
mergeAll :: Ord a => [[a]] -> [a]
mergeAll (x:y:xs) = mergeAll ((mergeTwo x y) : xs)
mergeAll [x] = x
mergeAll _ = []
All right. Now, to sort all elements, we need to create a list from every element, and then merge them back. Let's write a function that creates a list for a single item:
toList :: a -> [a]
toList x = -- exercise
And now a function to wrap all elements in lists:
allToList :: [a] -> [[a]]
allToList = -- exercise
And now we're done. We simply need to use allToList and then mergeAll:
isort :: Ord a => [a] -> [a]
isort xs = mergeAll (allToList xs)
Note that this exercise got a lot easier since we've split it into four functions.
Exercises (some of which might not be allowed for your homework)
Write toList and allToList.
Try a list comprehension for allToList. Try a higher order function for allToList.
Write isort point-free (with (.)).
Check whether there is already a toList function with the same type. Use that one.
Rewrite mergeAll using foldr
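For readers who want to check their work on the last exercise, here is one possible sketch. It repeats mergeTwo so that it stands alone, and it uses a list comprehension in place of allToList; treat it as one option among several, not as the intended solution.

```haskell
mergeTwo :: Ord a => [a] -> [a] -> [a]
mergeTwo [] ys = ys
mergeTwo xs [] = xs
mergeTwo (x:xs) (y:ys)
  | x <= y    = x : mergeTwo xs (y:ys)
  | otherwise = y : mergeTwo (x:xs) ys

-- foldr merges each list into the already-merged rest, starting from [].
mergeAll :: Ord a => [[a]] -> [a]
mergeAll = foldr mergeTwo []

-- Wrap every element in a singleton list, then merge everything back.
isort :: Ord a => [a] -> [a]
isort xs = mergeAll [[x] | x <- xs]
```

Note that merging singletons one at a time with foldr behaves like insertion sort, so this is quadratic in the worst case.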
Try this (not tested):
merge :: Ord a => [a] -> [a] -> [a]
merge [] l1 = l1
merge l1 [] = l1
merge (e1:l1) (e2:l2)
  | e1<e2 = e1:merge l1 (e2:l2)
  | otherwise = e2:merge (e1:l1) l2

Haskell Merge Sort

This is an implementation of merge sort using higher-order functions, guards, where and recursion.
However, I am getting an error from the compiler: 6:26: parse error on input ‘=’
mergeSort :: ([a] -> [a] -> [a]) -> [a] -> [a]
mergeSort merge xs
| length xs < 2 = xs
| otherwise = merge (mergeSort merge first) (mergeSort merge second)
where first = take half xs
second = drop half xs
half = (length xs) `div` 2
I can't see what's wrong, or rather I don't understand the compiler.
Halving a list is not an O(1) operation but O(n), so the given solutions introduce additional costs compared to the imperative version of merge sort. One way to avoid halving is to simply start merging directly by making singletons and then merging every two consecutive lists:
sort :: (Ord a) => [a] -> [a]
sort = mergeAll . map (:[])
  where
    mergeAll [] = []
    mergeAll [t] = t
    mergeAll xs = mergeAll (mergePairs xs)
    mergePairs (x:y:xs) = merge x y:mergePairs xs
    mergePairs xs = xs
where merge is already given by others.
Another msort implementation in Haskell;
merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys) | x < y = x:merge xs (y:ys)
                    | otherwise = y:merge (x:xs) ys
halve :: [a] -> ([a],[a])
halve xs = (take lhx xs, drop lhx xs)
  where lhx = length xs `div` 2
msort :: Ord a => [a] -> [a]
msort [] = []
msort [x] = [x]
msort xs = merge (msort left) (msort right)
  where (left,right) = halve xs
Haskell is an indentation-sensitive language; you simply need to fix the indentation so that the definitions under where line up (by the way, if you are using tabs, switch to spaces).
mergeSort :: ([a] -> [a] -> [a]) -> [a] -> [a]
mergeSort merge xs
    | length xs < 2 = xs
    | otherwise = merge (mergeSort merge first) (mergeSort merge second)
    where first = take half xs
          second = drop half xs
          half = length xs `div` 2
None of these solutions is as smart as Haskell's own solution, which starts from the observation that the algorithms proposed here still run in Theta(n log n) even if the list to be sorted is already sorted.
Haskell's solution is to split the input into runs of increasing (and strictly decreasing) values and merge those runs. The simplified code looks like:
mergesort :: Ord a => [a] -> [a]
mergesort xs = unwrap (until single (pairWith merge) (runs xs))
runs :: Ord a => [a] -> [[a]]
runs = foldr op []
  where op x [] = [[x]]
        op x ((y:xs):xss) | x <= y = (x:y:xs):xss
                          | otherwise = [x]:(y:xs):xss
This runs in Theta(n) when the input is already sorted.
Haskell's version is smarter still because it detects both ascending runs and descending runs.
As usual I am in awe with the cleverness of Haskell!
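The simplified code above leaves unwrap, single, pairWith and merge undefined. Below is a self-contained sketch with guessed definitions for those helpers; they are assumptions made here to get something that compiles, not the definitions from the actual library source.

```haskell
mergesort :: Ord a => [a] -> [a]
mergesort xs = unwrap (until single (pairWith merge) (runs xs))

-- Split the input into maximal non-decreasing runs.
runs :: Ord a => [a] -> [[a]]
runs = foldr op []
  where
    op x [] = [[x]]
    op x ((y:ys):yss)
      | x <= y    = (x:y:ys) : yss
      | otherwise = [x] : (y:ys) : yss
    op x ([]:yss) = [x] : yss  -- cannot occur (runs are never empty); kept for totality

-- Assumed helper: True once at most one run is left (also covers empty input).
single :: [a] -> Bool
single = null . drop 1

-- Assumed helper: extract the final run, or [] for empty input.
unwrap :: [[a]] -> [a]
unwrap []     = []
unwrap (ys:_) = ys

-- Assumed helper: combine neighbouring elements pairwise.
pairWith :: (a -> a -> a) -> [a] -> [a]
pairWith f (x:y:rest) = f x y : pairWith f rest
pairWith _ rest       = rest

merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys)
  | y < x     = y : merge (x:xs) ys
  | otherwise = x : merge xs (y:ys)
```

On sorted input runs returns a single run, so until stops immediately and no merging happens at all. For instance, runs [1,2,3,2,5] is [[1,2,3],[2,5]].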

How do I make a list of substrings?

I am trying to make a list of substrings, starting with the original string, where each substring has one less element than the one before it.
e.g. "1234" would result in ["1234","123","12","1"]
I would like to achieve this using only the Prelude (no imports), so I can't use subsequences.
I am new to Haskell, and I know some of the problems with my code but don't currently know how to fix them.
slist :: String -> [String]
slist (x:xs) = (take (length (x:xs)) (x:xs)) ++ slist xs
How can I do this recursively using
Edit: I would like to do this by using init recursively.
slist :: String -> [String]
slist [] = []
-- slist xs = [xs] ++ (slist $ init xs)
slist xs = xs : (slist $ init xs)
main = do
  print $ slist "1234"
Here's a very lazy version suitable for working on infinite lists. Each element of each resulting list after the first only requires O(1) amortized time to compute it no matter how far into the list we look.
The general idea is: for each length n we intend to drop off the end we split the list into a queue of items of length n and the remainder of the list. To yield results, we first check there's another item in the list that can take a place in the queue, then yield the first item in the queue. When we reach the end of the list we discard the remaining items from the queue.
import Data.Sequence (Seq, empty, fromList, ViewL (..), viewl, (|>))
starts :: [a] -> [[a]]
starts = map (uncurry shiftThrough) . splits
shiftThrough :: Seq a -> [a] -> [a]
shiftThrough queue [] = []
shiftThrough queue (x:xs) = q1:shiftThrough qs xs
  where
    (q1 :< qs) = viewl (queue |> x)
splits finds all the initial sequences of a list together with the tailing list.
splits :: [a] -> [(Seq a, [a])]
splits = go empty
  where
    go s [] = []
    go s (x:xs) = (s,x:xs):go (s |> x) xs
We can write dropping from the end of a list in terms of the same strategy.
dropEnd :: Int -> [a] -> [a]
dropEnd n = uncurry (shiftThrough . fromList) . splitAt n
These use Data.Sequence's amortized O(n) construction of a sequence fromList, O(1) appending to the end of sequence with |> and O(1) examining the start of a sequence with viewl.
This is fast enough to query things like (starts [1..]) !! 80000 very quickly and (starts [1..]) !! 8000000 in a few seconds.
Look ma, no imports
A simple purely functional implementation of a queue is a pair of lists, one containing the things to output next in order and one containing the most recent things added. Whenever something is added it's added to the beginning of the added list. When something is needed the item is removed from the beginning of the next list. When there are no more items left to remove from the next list it is replaced by the added list in reverse order, and the added list is set to []. This has amortized O(1) running time since each item will be added once, removed once, and reversed once, however many of the reversals will happen all at once.
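The two-list queue described above can be sketched on its own as follows. The names Queue, emptyQueue, push, pop and drain are made up for this illustration and are not used in the rest of the answer.

```haskell
-- First list: items to output next, in order.
-- Second list: recently added items, newest first.
data Queue a = Queue [a] [a]

emptyQueue :: Queue a
emptyQueue = Queue [] []

-- Adding is O(1): cons onto the "added" list.
push :: a -> Queue a -> Queue a
push x (Queue next added) = Queue next (x : added)

-- Removing takes from the "next" list; when it is empty, the "added"
-- list is reversed once and becomes the new "next" list.
pop :: Queue a -> Maybe (a, Queue a)
pop (Queue (y:next) added) = Just (y, Queue next added)
pop (Queue [] [])          = Nothing
pop (Queue [] added)       = pop (Queue (reverse added) [])

-- Remove everything, in first-in-first-out order.
drain :: Queue a -> [a]
drain q = case pop q of
  Nothing      -> []
  Just (x, q') -> x : drain q'
```

Pushing 1, 2 and 3 and then draining yields [1,2,3]; each element is consed once, reversed once and removed once, which is where the amortized O(1) bound comes from.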
delay uses the queue logic described above to implement the same thing as shiftThrough from the previous section. xs is the list of things that were recently added and ys is the list of things to use next.
delay :: [a] -> [a] -> [a]
delay ys = traverse' step ([],ys)
  where
    step (xs, ys) x = step' (x:xs) ys
    step' xs [] = step' [] (reverse xs)
    step' xs (y:ys) = (y, (xs, ys))
traverse' is almost a scan (it is named with a prime here because the Prelude now exports its own traverse)
traverse' :: (s -> a -> (b, s)) -> s -> [a] -> [b]
traverse' f = go
  where
    go _ [] = []
    go s (x:xs) = y : go s' xs
      where (y, s') = f s x
We can define starts in terms of delay and another version of splits that returns lists.
starts :: [a] -> [[a]]
starts = map (uncurry delay) . splits
splits :: [a] -> [([a], [a])]
splits = go []
  where
    go s [] = []
    go s (x:xs) = (reverse s, x:xs):go (x:s) xs
This has very similar performance to the implementation using Seq.
Here's a somewhat convoluted version:
slist xs = go (zip (repeat xs) [lenxs, lenxs - 1..1])
  where lenxs = length xs
        go [] = []
        go (x:xs) = (take (snd x) (fst x)) : go xs
main = do
  print $ slist "1234"
Updated answer to list all possible substrings (not just starting from the root).
slist :: [t] -> [[t]]
slist [] = []
slist xs = xs : (slist $ init xs) -- taken from Pratik Deoghare's post
all_substrings :: [t] -> [[t]]
all_substrings (x:[]) = [[x]]
all_substrings (x:xs) = slist z ++ all_substrings xs
  where z = x:xs
λ> all_substrings "1234"
["1234","123","12","1","234","23","2","34","3","4"]
