Haskell: split a string at every character into a list of n-size strings [duplicate]

For example, I need a function:
gather :: Int -> [a] -> [[a]]
gather n list = ???
where gather 3 "Hello!" == ["Hel","ell","llo","lo!"].
I have a working implementation:
import Data.List (unfoldr)
gather :: Int -> [a] -> [[a]]
gather n list =
  unfoldr
    (\x ->
       if fst x + n > length (snd x)
         then Nothing
         else Just (take n (drop (fst x) (snd x)), (fst x + 1, snd x)))
    (0, list)
but I am wondering if there is something already built into the language for this? I scanned Data.List but didn't see anything.

You could use tails:
gather n l = filter ((== n) . length) $ map (take n) $ tails l
or using takeWhile instead of filter:
gather n l = takeWhile ((== n) . length) $ map (take n) $ tails l
EDIT: You can remove the filter step by dropping the last n elements of the list returned from tails as suggested in the comments:
gather n = map (take n) . dropLast n . tails
  where dropLast n xs = zipWith const xs (drop n xs)
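The dropLast trick works because zipWith stops at the end of its shorter argument: drop n xs is exactly n elements shorter than xs, so pairing the two and keeping the left component discards the last n elements without ever calling length. A minimal, self-contained sketch of the version above, using the same names as the answer:

```haskell
import Data.List (tails)

-- Drop the last n elements of a list without computing its length.
-- zipWith stops when (drop n xs) runs out, which is n elements early.
dropLast :: Int -> [a] -> [a]
dropLast n xs = zipWith const xs (drop n xs)

-- All contiguous windows of length n.
-- tails yields (length xs + 1) suffixes; the last n of them are shorter than n.
gather :: Int -> [a] -> [[a]]
gather n = map (take n) . dropLast n . tails
```

With these definitions, gather 3 "Hello!" evaluates to ["Hel","ell","llo","lo!"], and a list shorter than n yields [].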

The dropping of tails can be arranged for automagically, thanks to the properties of zipping,
import Data.List (tails)
g :: Int -> [a] -> [[a]]
g n = foldr (zipWith (:)) (repeat []) . take n . tails
or else a simple transpose . take n . tails would suffice. Testing:
Prelude Data.List> g 3 [1..10]
[[1,2,3],[2,3,4],[3,4,5],[4,5,6],[5,6,7],[6,7,8],[7,8,9],[8,9,10]]
Prelude Data.List> transpose . take 3 . tails $ [1..10]
[[1,2,3],[2,3,4],[3,4,5],[4,5,6],[5,6,7],[6,7,8],[7,8,9],[8,9,10],[9,10],[10]]
(edit 2018-09-16:) The use of zipping can be expressed on a higher level, with traverse ZipList:
import Control.Applicative (ZipList(..))
import Data.List (tails)
g :: Int -> [a] -> [[a]]
g n = getZipList . traverse ZipList . take n . tails
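The difference between the zipping variants and plain transpose comes down to how ragged rows are handled: transpose keeps the leftover elements of longer rows, whereas zipping stops at the shortest row. A small sketch of that contrast, applied directly to rows of unequal length (the names zipRows, raggedRows, zipped and transposed are made up for this illustration):

```haskell
import Control.Applicative (ZipList (..))
import Data.List (transpose)

-- Zip any number of rows together column by column,
-- stopping as soon as the shortest row is exhausted.
zipRows :: [[a]] -> [[a]]
zipRows = getZipList . traverse ZipList

-- Hypothetical sample input: what take 3 . tails produces for [8,9,10].
raggedRows :: [[Int]]
raggedRows = [[8, 9, 10], [9, 10], [10]]

zipped, transposed :: [[Int]]
zipped = zipRows raggedRows         -- truncates at the shortest row
transposed = transpose raggedRows   -- keeps the leftover elements
```

Here zipped is [[8,9,10]] while transposed is [[8,9,10],[9,10],[10]], which is exactly the trailing [9,10],[10] seen in the transpose test output.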

Related

Get all rotations for a string in haskell

So I'm trying to make a function "rot" which takes a string and returns a list of strings with all possible rotations, e.g. rot "abc" returns ["abc", "bca", "cab"]. It seems very simple to do in other languages, but I'm a newbie at Haskell so I can't think of a way to do it. This is what I have so far:
rot :: [Char] -> [[Char]]
rot word =
  let
    lst = [tail word ++ [head word]]
  in
    lst
main = do
  print (rot "abc")
It returns me "bca" as expected, but I would like a way to find all rotations and store it in a list.
Here's an example in Python:
def rot(word):
    lst = []
    for i in range(len(word)):
        newWord1 = word[0:i]
        newWord2 = word[i:]
        newWordResult = newWord2 + newWord1
        lst.append(newWordResult)
    return lst
Well, you can more or less directly translate your Python code. Recursion is customarily used in functional programming instead of iteration, and it's more convenient to count from length word down to zero. Other than that, it's pretty much the same:
rot word =
  let loop 0 lst = lst
      loop i lst =
        let newWord1 = take (i-1) word
            newWord2 = drop (i-1) word
            newWordResult = newWord2 ++ newWord1
        in loop (i-1) (newWordResult : lst)
  in loop (length word) []
One can make use of the tails and inits of a list:
Prelude Data.List> tails "abc"
["abc","bc","c",""]
Prelude Data.List> inits "abc"
["","a","ab","abc"]
we thus can use this with:
import Data.List(inits, tails)
rotated :: [a] -> [[a]]
rotated xs = [x ++ y | (x@(_:_), y) <- zip (tails xs) (inits xs)]
This produces:
Prelude Data.List> rotated "abc"
["abc","bca","cab"]
Prelude Data.List> rotated [1,4,2,5]
[[1,4,2,5],[4,2,5,1],[2,5,1,4],[5,1,4,2]]
Prelude Data.List> rotated [1.0,3.0,0.0,2.0]
[[1.0,3.0,0.0,2.0],[3.0,0.0,2.0,1.0],[0.0,2.0,1.0,3.0],[2.0,1.0,3.0,0.0]]
or as @Iceland_jack says, we can use the ParallelListComp extension to allow iterating over two lists in parallel in list comprehension without the explicit use of zip:
{-# LANGUAGE ParallelListComp #-}
import Data.List(inits, tails)
rotated :: [a] -> [[a]]
rotated xs = [x ++ y | x@(_:_) <- tails xs | y <- inits xs]
This is, simply,
rotations xs = map (take n) . take n
             . tails $ xs ++ xs
  where
    n = length xs
It is customary to avoid length where possible. Here that leads to slightly more convoluted code(*), though more often than not it gives simpler, cleaner code that is truer to the nature of the problem:
rotations2 xs = map (zipWith (\a b -> b) xs)
              . zipWith (\a b -> b) xs
              . tails $ xs ++ xs
Testing, we get
> rotations "abcd"
["abcd","bcda","cdab","dabc"]
> rotations2 "abcd"
["abcd","bcda","cdab","dabc"]
> take 4 . map (take 4) $ rotations2 [1..]
[[1,2,3,4],[2,3,4,5],[3,4,5,6],[4,5,6,7]]
(*) edit Actually, it probably merits its own name,
takeLength :: [a] -> [b] -> [b]
takeLength = zipWith (\a b -> b)
rotations2 xs = map (takeLength xs)
              . takeLength xs
              . tails $ xs ++ xs

Pass output from a function as input in another function in Haskell

How can one pass the output of one function as the input to another function?
For example, I have these two functions:
collatz :: (Integral a) => a -> [a]
collatz 1 = [1]
collatz n
  | even n = n:collatz (n `div` 2)
  | odd n = n:collatz (n*3 + 1)
And my other function:
length' [] = 0
length' (x:xs) = 1 + length' xs
I want to calculate the length of the list that my collatz function outputs.
Finally, I want to compute this in full:
numLongChains :: Int
numLongChains = length (filter isLong (map collatz [1..100]))
  where isLong xs = length xs > 15
But step by step.
You are talking about function composition:
-- assuming signature length' :: [a] -> Int
composed :: (Integral a) => a -> Int
composed = length' . collatz
I am not sure what you mean by 'step by step', but for your second question, it would be:
numLongChains :: (Integral a) => [a] -> Int
numLongChains = length' . (filter isLong) . (map collatz)
  where isLong xs = length' xs > 15
I made it a unary function so it can work with any list, not just [1..100].
The length' of the collatz sequence for n is \n -> length' (collatz n). This pattern is so common that we gave it a name: f . g = \x -> f (g x). We call this "function composition". Replacing f, g, and x above with length', collatz, and n, we get length' . collatz = \n -> length' (collatz n). So:
collatzLength = length' . collatz
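As a self-contained sketch of the point above, here are the composed and the explicit-argument forms side by side. The explicit signature on length' and the name collatzLengthExplicit are additions made for this illustration, and otherwise replaces the odd n guard so that the guards are visibly exhaustive:

```haskell
-- The Collatz chain starting at n (assumes n >= 1).
collatz :: Integral a => a -> [a]
collatz 1 = [1]
collatz n
  | even n    = n : collatz (n `div` 2)
  | otherwise = n : collatz (n * 3 + 1)

-- Hand-written list length, as in the question.
length' :: [a] -> Int
length' []       = 0
length' (_ : xs) = 1 + length' xs

-- Composition: feed the output of collatz straight into length'.
collatzLength :: Integral a => a -> Int
collatzLength = length' . collatz

-- The same function written with an explicit argument.
collatzLengthExplicit :: Integral a => a -> Int
collatzLengthExplicit n = length' (collatz n)
```

For example, collatz 6 is [6,3,10,5,16,8,4,2,1], so collatzLength 6 is 9 under either definition.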
As an aside, instead of writing isLong as you do, we can use function composition here too, albeit in a slightly less obvious way:
isLong xs = length' xs > 15
isLong = \xs -> length' xs > 15
isLong = \xs -> (>) (length' xs) 15
isLong = \xs -> (> 15) (length' xs) -- this is the less obvious bit
isLong = (> 15) . length'
So the question "is the collatz sequence for the number n longer than 15?" can now be written as:
isLongSequence = isLong . collatz
So the numbers in [1..100] that have long sequences are
filter isLongSequence [1..100]
or with
longSequences = filter isLongSequence
it's
longSequences [1..100]
now, we can write numLongChains as
numLongChains = length' . longSequences
The number of long chains is the number (length) of (the list of) long sequences.
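Now, to show that this is equivalent to your definition, you need to know a rule (theorem) about map and filter:
filter p . map f = map f . filter (p . f)
together with the fact that map does not change the length of a list: length' . map f = length'.
So
\xs -> length' (filter isLong (map collatz xs))
is the same as
length' . filter isLong . map collatz
is the same as
length' . map collatz . filter (isLong . collatz)
is the same as
length' . filter isLongSequence
is the same as
length' . longSequences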
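A quick way to convince yourself of the equivalence is to run both definitions side by side. The sketch below uses the Prelude's length instead of length', and the names numLongChainsA, numLongChainsB and lawHolds are invented here. Note that one pipeline filters the chains and the other filters the starting numbers, so what the two share is the length of the result; the check uses the map/filter law in the form filter p . map f = map f . filter (p . f).

```haskell
collatz :: Integral a => a -> [a]
collatz 1 = [1]
collatz n
  | even n    = n : collatz (n `div` 2)
  | otherwise = n : collatz (n * 3 + 1)

isLong :: [a] -> Bool
isLong xs = length xs > 15

-- Definition from the question: filter the chains themselves.
numLongChainsA :: [Integer] -> Int
numLongChainsA = length . filter isLong . map collatz

-- Composed definition: filter the starting numbers instead.
numLongChainsB :: [Integer] -> Int
numLongChainsB = length . filter (isLong . collatz)

-- The map/filter law, specialised to this example.
lawHolds :: [Integer] -> Bool
lawHolds xs =
  (filter isLong . map collatz) xs
    == (map collatz . filter (isLong . collatz)) xs
```

Evaluating numLongChainsA [1..100] == numLongChainsB [1..100] and lawHolds [1..100] in GHCi should both give True.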

Haskell: find subsets of list that add up to given number

I am working through some examples and trying to implement a function that counts how many subsets of a list add up to a given number.
I am trying to rewrite some Python implementations in Haskell:
import Data.List (sort)
test1 :: [Int]
test1 = [2,4,6,10,1,4,5,6,7,8]
countSets1 total input = length [n | n <- subsets $ sort input, sum n == total]
  where
    subsets [] = [[]]
    subsets (x:xs) = map (x:) (subsets xs) ++ subsets xs
countSets2 total input = go (reverse . sort $ input) total
  where
    go [] _ = 0
    go (x:xs) t
      | t == 0 = 1
      | t < 0 = 0
      | t < x = go xs t
      | otherwise = go xs (t - x) + go xs t
countSets3 total input = go (sort input) total (length input - 1)
  where
    go xxs t i
      | t == 0 = 1
      | t < 0 = 0
      | i < 0 = 0
      | t < (xxs !! i) = go xxs t (i-1)
      | otherwise = go xxs (t - (xxs !! i)) (i-1) + go xxs t (i-1)
I can't figure out why countSets2 does not return the same result as countSets3 (a copy of the Python version):
λ: countSets1 16 test1
24
λ: countSets2 16 test1
13
λ: countSets3 16 test1
24
EDIT:
@freestyle pointed out that the order of my conditions was different in the two solutions:
countSets2 total input = go (sortBy (flip compare) input) total
  where
    go _ 0 = 1
    go [] _ = 0
    go (x:xs) t
      | t < 0 = 0
      | t < x = go xs t
      | otherwise = go xs (t - x) + go xs t
fixes the problem.
I'm not sure about your logic, but in your second solution I think you need
go [] 0 = 1
otherwise, your code causes go [] 0 = 0 which feels wrong.
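The reason the clause order matters is that a subset which uses the very last element and hits the target exactly ends up at go [] 0. If the empty-list clause is tried first, that subset is counted as 0 instead of 1. Here is a minimal sketch that checks the reordered recursion against a brute-force count; the names countBrute and countRec are made up for this example, and the recursion assumes all inputs are positive integers:

```haskell
import Data.List (sortBy, subsequences)

-- Brute-force reference: enumerate every subsequence, keep those with the right sum.
countBrute :: Int -> [Int] -> Int
countBrute total = length . filter ((== total) . sum) . subsequences

-- Recursive count with "target reached" checked before "list exhausted".
-- Assumption: every input is a positive integer.
countRec :: Int -> [Int] -> Int
countRec total input = go (sortBy (flip compare) input) total
  where
    go _ 0 = 1            -- target hit exactly: one way to finish
    go [] _ = 0           -- nothing left and target not reached
    go (x : xs) t
      | t < 0     = 0
      | t < x     = go xs t                    -- x is too big, skip it
      | otherwise = go xs (t - x) + go xs t    -- take x, or skip x
```

For small inputs the two can be compared directly, e.g. countRec 3 [1,2,3] and countBrute 3 [1,2,3] both give 2 (the subsets [3] and [1,2]).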
I don't address your error, so I don't expect you to accept my answer; I only offer an alternative solution:
import Math.Combinat.Sets (sublists)
getSublists :: [Int] -> Int -> [[Int]]
getSublists list total = filter (\x -> sum x == total) (sublists list)
countSublists :: [Int] -> Int -> Int
countSublists list total = length $ getSublists list total
The module Math.Combinat.Sets is from the combinat package.
>>> countSublists [2,4,6,10,1,4,5,6,7,8] 16
24
This problem looks similar to a pearl written by Richard Bird on how many sums and products can make 100. I'll use it as a template here. First, the specification:
subseqn :: (Num a, Eq a) => a -> [a] -> Int
subseqn n = length . filter ((== n) . sum) . subseqs
  where
    subseqs = foldr prefix [[]]
    prefix x xss = map (x:) xss ++ xss
Observe that a lot of work may be wasted in subseqs. Intuitively, we can discard candidates as soon as they exceed n, i.e. use the weaker predicate (<= n) somewhere. Trivially, filtering on it before filtering on the stronger one does not change the outcome. Then you can derive
filter ((== n) . sum) . subseqs
= {- insert weaker predicate -}
filter ((== n) . sum) . filter ((<= n) . sum) . subseqs
= {- definition of subseqs -}
filter ((== n) . sum) . filter ((<= n) . sum) . foldr prefix [[]]
= {- fusion law of foldr -}
filter ((== n) . sum) . foldr prefix' [[]]
The fusion law states that f . foldr g a = foldr h b provided that
f is strict
f a = b
f (g x y) = h x (f y)
Here, a = b = [[]], f is filter ((<= n) . sum) and g is prefix. You can derive h (i.e. prefix') by observing that the predicate can be applied before prefixing:
filter ((<= n) . sum) (prefix x xss) =
filter ((<= n) . sum) (prefix x (filter ((<= n) . sum) xss))
which is exactly the third condition; then h x is filter ((<= n) . sum) . prefix x.
Another observation is that sum is computed too many times. To get around that, we can modify our definition of subseqn so that each candidate carries its own sum. Let's use
(&&&) :: (a -> b) -> (a -> c) -> a -> (b, c)
(&&&) f g x = (f x, g x)
and derive
filter ((== n) . sum) . subseqs
= {- use &&& -}
filter ((== n) . snd) . map (id &&& sum) . subseqs
= {- definition of subseqs -}
filter ((== n) . snd) . map (id &&& sum) . foldr prefix' [[]]
= {- fusion law of foldr -}
filter ((== n) . snd) . foldr prefix'' [[]]
I won't go through the whole derivation of prefix'', it is quite long. The gist is that you can avoid using sum at all by working on pairs, so that the sum is computed iteratively. Initially the sum is 0 for the empty list and all we have to do is add the new candidate to it.
We update our base case from [[]] to [([], 0)] and get:
prefix'' x = filter ((<= n) . snd) . uncurry zip . (prefix x *** add x) . unzip
  where
    (***) :: (a -> a') -> (b -> b') -> (a, b) -> (a', b')
    (***) f g (x, y) = (f x, g y)
    add :: Num a => a -> [a] -> [a]
    add x xs = map (x+) xs ++ xs
Here is the final version:
subseqn :: (Num a, Ord a) => a -> [a] -> Int
subseqn n = length . filter ((== n) . snd) . foldr expand [([], 0)]
  where
    expand x = filter ((<= n) . snd) . uncurry zip . (prefix x *** add x) . unzip
    prefix x xss = map (x:) xss ++ xss
    add x xs = map (x+) xs ++ xs
(*** and &&& are from Control.Arrow)
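As a sanity check, the specification and the final version can be placed next to each other and compared on sample inputs. This is only a sketch: the names subseqnSpec and subseqnFast are invented here to keep the two apart, and the pruning step (<= n) is only valid when every element is non-negative, since otherwise a candidate that overshoots n could still come back down.

```haskell
import Control.Arrow ((***))

-- Specification: enumerate all subsequences, then filter by sum.
subseqnSpec :: (Num a, Eq a) => a -> [a] -> Int
subseqnSpec n = length . filter ((== n) . sum) . subseqs
  where
    subseqs = foldr prefix [[]]
    prefix x xss = map (x :) xss ++ xss

-- Optimised version: each candidate carries its running sum, and candidates
-- whose sum already exceeds n are pruned.
-- Assumption: all inputs are non-negative, otherwise pruning is unsound.
subseqnFast :: (Num a, Ord a) => a -> [a] -> Int
subseqnFast n = length . filter ((== n) . snd) . foldr expand [([], 0)]
  where
    expand x = filter ((<= n) . snd) . uncurry zip . (prefix x *** add x) . unzip
    prefix x xss = map (x :) xss ++ xss
    add x xs = map (x +) xs ++ xs
```

On [1,2,3] with target 3 both return 2, and they can be compared in the same way on the list from the question.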

Recursively sort non-contiguous list to list of contiguous lists

I've been trying to learn a bit of functional programming (with Haskell & Erlang) lately and I'm always amazed at the succinct solutions people can come up with when they can think recursively and know the tools.
I want a function to convert a list of sorted, unique, non-contiguous integers into a list of contiguous lists, i.e:
[1,2,3,6,7,8,10,11]
to:
[[1,2,3], [6,7,8], [10,11]]
This was the best I could come up with in Haskell (two functions):
make_ranges :: [[Int]] -> [Int] -> [[Int]]
make_ranges ranges [] = ranges
make_ranges [] (x:xs)
  | null xs = [[x]]
  | otherwise = make_ranges [[x]] xs
make_ranges ranges (x:xs)
  | (last (last ranges)) + 1 == x =
      make_ranges ((init ranges) ++ [(last ranges ++ [x])]) xs
  | otherwise = make_ranges (ranges ++ [[x]]) xs
rangify :: [Int] -> [[Int]]
rangify lst = make_ranges [] lst
It might be a bit subjective but I'd be interested to see a better, more elegant, solution to this in either Erlang or Haskell (other functional languages too but I might not understand it.) Otherwise, points for just fixing my crappy beginner's Haskell style!
Most straightforward way in my mind is a foldr:
ranges = foldr step []
  where step x [] = [[x]]
        step x acc@((y:ys):zs) | y == x + 1 = (x:y:ys):zs
                               | otherwise = [x]:acc
Or, more concisely:
ranges = foldr step []
  where step x ((y:ys):zs) | y == x + 1 = (x:y:ys):zs
        step x acc = [x]:acc
But wait, there's more!
abstractRanges f = foldr step []
  where step x ((y:ys):zs) | f x y = (x:y:ys):zs
        step x acc = [x]:acc
ranges = abstractRanges (\x y -> y == x + 1)
powerRanges = abstractRanges (\x y -> y == x*x) -- mighty morphin
By turning the guard function into a parameter, you can group more interesting things than just +1 sequences.
*Main> powerRanges [1,1,1,2,4,16,3,9,81,5,25]
[[1,1,1],[2,4,16],[3,9,81],[5,25]]
The utility of this particular function is questionable...but fun!
I can't believe I got the shortest solution. I know this is no code golf, but I think it is still quite readable:
import GHC.Exts
range xs = map (map fst) $ groupWith snd $ zipWith (\a b -> (a, a-b)) xs [0..]
or pointfree
range = map (map snd) . groupWith fst . zipWith (\a b -> (b-a, b)) [0..]
BTW, groupWith snd can be replaced with groupBy (\a b -> snd a == snd b) if you prefer Data.List over GHC.Exts
[Edit]
BTW: Is there a nicer way to get rid of the lambda (\a b -> (b-a, b)) than (curry $ (,) <$> ((-) <$> snd <*> fst) <*> snd) ?
[Edit 2]
Yeah, I forgot (,) is a functor. So here is the obfuscated version:
range = map (map fst) . groupWith snd . (flip $ zipWith $ curry $ fmap <$> (-).fst <*> id) [0..]
Suggestions are welcome...
import Data.List (groupBy)
ranges xs = (map.map) snd
          . groupBy (const fst)
          . zip (True : zipWith ((==) . succ) xs (tail xs))
          $ xs
As to how to come up with such a thing: I started with the zipWith f xs (tail xs), which is a common idiom when you want to do something on consecutive elements of a list. Likewise is zipping up a list with information about the list, and then acting (groupBy) upon it. The rest is plumbing.
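To make the two idioms concrete, here is a sketch that splits the pipeline into its stages. The helper name flags is made up for this illustration, and drop 1 is used in place of tail so that the helper is also defined on the empty list:

```haskell
import Data.List (groupBy)

-- Stage 1: compare each element with its successor (zipWith f xs (tail xs)).
-- Each flag says whether that element continues the run of its predecessor.
-- The first element has no predecessor; it gets True, which groupBy never inspects.
flags :: (Eq a, Enum a) => [a] -> [Bool]
flags xs = True : zipWith (\a b -> succ a == b) xs (drop 1 xs)

-- Stage 2: zip the flags onto the list, group on them, then drop the flags.
-- The grouping predicate ignores the group's first element and only asks
-- whether the next element is flagged as a continuation.
ranges :: (Eq a, Enum a) => [a] -> [[a]]
ranges xs = map (map snd) . groupBy (\_ next -> fst next) $ zip (flags xs) xs
```

For [1,2,3,6,7,8,10,11] the flags are [True,True,True,False,True,True,False,True], and ranges returns [[1,2,3],[6,7,8],[10,11]].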
Then, of course, you can feed it through @pl and get:
import Data.List (groupBy)
import Control.Monad (ap)
import Control.Monad.Instances()
ranges = (((map.map) snd)
           . groupBy (const fst))
         .) =<< zip
                . (True:)
                . ((zipWith ((==) . succ)) `ap` tail)
, which, by my authoritative definition, is evil due to Monad ((->) a). Twice, even. The data flow is meandering too much to lay it out in any sensible way. zipaptail is an Aztec god, and Aztec gods aren't to be messed with.
Another version in Erlang:
part(List) -> part(List,[]).
part([H1,H2|T],Acc) when H1 =:= H2 - 1 ->
part([H2|T],[H1|Acc]);
part([H1|T],Acc) ->
[lists:reverse([H1|Acc]) | part(T,[])];
part([],Acc) -> Acc.
k z = map (fst <$>) . groupBy (const snd) .
      zip z . (False:) . (zipWith ((==) . succ) <*> tail) $ z
Try reusing standard functions.
import Data.List (groupBy)
rangeify :: (Eq a, Num a) => [a] -> [[a]]
rangeify l = map (map fst) $ groupBy (const snd) $ zip l contigPoints
  where contigPoints = False : zipWith (==) (map (+1) l) (drop 1 l)
Or, following (mixed) advice to use unfoldr, stop abusing groupBy, and be happy using partial functions when it doesn't matter:
import Control.Arrow ((***))
import Data.List (unfoldr)
spanContig :: (Eq a, Num a) => [a] -> ([a], [a])
spanContig l =
  map fst *** map fst $ span (\(a, b) -> a == b + 1) $ zip l (head l - 1 : l)
rangeify :: (Eq a, Num a) => [a] -> [[a]]
rangeify = unfoldr $ \l -> if null l then Nothing else Just $ spanContig l
Erlang using foldr:
ranges(List) ->
    lists:foldr(fun (X, [[Y | Ys] | Acc]) when Y == X + 1 ->
                        [[X, Y | Ys] | Acc];
                    (X, Acc) ->
                        [[X] | Acc]
                end, [], List).
This is my v0.1 and I can probably make it better:
makeCont :: [Int] -> [[Int]]
makeCont [] = []
makeCont [a] = [[a]]
makeCont (a:b:xs) = if b - a == 1
                      then (a : head next) : tail next
                      else [a] : next
  where
    next :: [[Int]]
    next = makeCont (b:xs)
And I will try and make it better. Edits coming I think.
As a comparison, here's an implementation in Erlang:
partition(L) -> [lists:reverse(T) || T <- lists:reverse(partition(L, {[], []}))].
partition([E|L], {R, [EL|_] = T}) when E == EL + 1 -> partition(L, {R, [E|T]});
partition([E|L], {R, []}) -> partition(L, {R, [E]});
partition([E|L], {R, T}) -> partition(L, {[T|R], [E]});
partition([], {R, []}) -> R;
partition([], {R, T}) -> [T|R].
The standard paramorphism recursion scheme isn't in Haskell's Data.List module, though I think it should be. Here's a solution using a paramorphism. Because you are building a list of lists from a list, the cons-ing is a little tricksy:
contig :: (Eq a, Num a) => [a] -> [[a]]
contig = para phi [] where
  phi x ((y:_),(a:acc)) | x + 1 == y = (x:a):acc
  phi x (_, acc) = [x]:acc
A paramorphism captures primitive recursion: it is a fold with lookahead at the rest of the list:
para :: (a -> ([a], b) -> b) -> b -> [a] -> b
para phi b [] = b
para phi b (x:xs) = phi x (xs, para phi b xs)
It can be pretty clear and simple in Erlang:
partition([]) -> [];
partition([A|T]) -> partition(T, [A]).
partition([A|T], [B|_]=R) when A =:= B+1 -> partition(T, [A|R]);
partition(L, P) -> [lists:reverse(P)|partition(L)].
Edit: Out of curiosity, I compared my version with Lukas's. Mine seems about 10% faster, both in native code and in bytecode, on a test set generated by lists:usort([random:uniform(1000000)||_<-lists:seq(1,1000000)]) on the 64-bit version of R14B01 on my notebook. (The test set is 669462 elements long and was partitioned into 232451 sublists.)
Edit2: Another test data set, lists:usort([random:uniform(1000000)||_<-lists:seq(1,10000000)]), of length 999963 with 38 partitions, makes a bigger difference in native code: my version finishes in less than half the time. In bytecode it is only about 20% faster.
Edit3: Some micro-optimizations that provide additional performance but lead to uglier, less maintainable code:
part4([]) -> [];
part4([A|T]) -> part4(T, A, []).
part4([A|T], B, R) when A =:= B+1 -> part4(T, A, [B|R]);
part4([A|T], B, []) -> [[B]|part4(T, A, [])];
part4([A|T], B, R) -> [lists:reverse(R, [B])|part4(T, A, [])];
part4([], B, R) -> [lists:reverse(R,[B])].
Here's an attempt from a Haskell noob:
ranges ls = let (a, r) = foldl (\(r, a@(h:t)) e -> if h + 1 == e then (r, e:a) else (a:r, [e])) ([], [head ls]) (tail ls)
            in reverse . map reverse $ r : a

Haskell pair and unpair functions

I have the following two functions written.
pair :: [a] -> [(a, a)]
pair [] = []
pair [x] = []
pair (x1:x2:xs) = (x1, x2) : pair xs
unpair :: [(a, a)] -> [a]
unpair [] = []
unpair ((x1, x2):xs) = x1 : x2 : unpair xs
Pair will take pairs of elements and make 2-tuples of them. If the list has an odd number of elements, discard the last one. Unpair is the reverse of pair.
These work, but I'm wondering whether there is a more succinct way to write them.
One-liners:
pair xs = map snd . filter fst . zip (iterate not True) $ zip xs (drop 1 xs)
unpair = concatMap (\(x,y) -> [x,y])
You could also abbreviate your definition of pair a little:
pair (x1:x2:xs) = (x1, x2) : pair xs
pair _ = []
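The claim that unpair is the reverse of pair holds exactly in one direction and only up to the discarded element in the other. A small sketch that states both directions as checkable properties (the property names roundTripPairs and roundTripList are invented for this illustration):

```haskell
pair :: [a] -> [(a, a)]
pair (x1 : x2 : xs) = (x1, x2) : pair xs
pair _              = []

unpair :: [(a, a)] -> [a]
unpair = concatMap (\(x, y) -> [x, y])

-- pair undoes unpair for every list of pairs.
roundTripPairs :: Eq a => [(a, a)] -> Bool
roundTripPairs ps = pair (unpair ps) == ps

-- unpair undoes pair only up to the dropped trailing element:
-- the result is the longest even-length prefix of the input.
roundTripList :: Eq a => [a] -> Bool
roundTripList xs = unpair (pair xs) == take (2 * (length xs `div` 2)) xs
```

For instance, unpair (pair [1,2,3,4,5]) gives [1,2,3,4], while pair (unpair [(1,2),(3,4)]) gives back [(1,2),(3,4)].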
It's not any more concise, but for the sake of clarity I'd use splitEvery from Data.List.Split (called chunksOf in recent versions of the split package) for pair:
pair = map tuplify . filter ((>1) . length) . splitEvery 2
  where
    tuplify [x, y] = (x, y)
This is off the top of my head; it would be nicer to check the length of the last list only.
For unpair I'd use foldr to avoid the explicit recursion:
unpair = foldr (\(x, y) -> (x:) . (y:)) []
This is just a matter of taste.
So many possibilities. How about these?
unpair' = concatMap (\(x,y) -> [x,y])
pair' xs = map snd . filter fst . zip (cycle [True, False]) $ zip xs (tail xs)
pair'' xs = [(x,y) | (True,x,y) <- zip3 (cycle [True,False]) xs (tail xs)]
The two versions of pair should be the same.
Edit: Regarding my comment above, one can use the split package from Hackage to write:
pair xs = map head . splitEvery 2 $ zip xs (tail xs)
which is closer to the desired
pair xs = everyOther $ zip xs (tail xs)
But, in the spirit of pointlessness, I think we should probably all agree on writing it,
pair = map head . splitEvery 2 . (zip <$> id <*> tail)
to ensure confusion.
pair s = dropEven $ zip s (tail s)
  where dropEven s = map fst $ filter snd $ zip s (cycle [True, False])
unpair = concatMap (\(a, b) -> [a, b])
Though I definitely prefer your definition of pair.
This is a nice use for view patterns:
{-# LANGUAGE ViewPatterns #-}
pair :: [a] -> [(a,a)]
pair (splitAt 2 -> ([x,y],ys)) = (x,y) : pair ys
pair _ = []
unpair :: [(a,a)] -> [a]
unpair = (>>= \(x,y) -> [x,y])

Resources