Dynamic List Comprehension in Haskell - haskell

Suppose I have a list comprehension that returns a list of sequences, where the elements chosen depend on each other (see example below). Is there a way to (conveniently) program the number of elements and their associated conditions based on an earlier computation? For example, return type [[a,b,c]] or [[a,b,c,d,e]] depending on another value in the program? Also, are there other/better ways than a list comprehension to formulate the same idea?
(I thought possible, although cumbersome and limited, to write out a larger list comprehension to start with and trim it by adding to s a parameter and helper functions that could make one or more of the elements a value that could easily be filtered later, and the associated conditions True by default.)
s = [[a, b, c, d] | a <- list, someCondition a,
b <- list, b /= a, not (someCondition b),
otherCondition a b,
c <- list, c /= a, c /= b, not (someCondition c),
otherCondition b c,
d <- list, d /= a, d /= b, d /= c,
someCondition d, someCondition (last d),
otherCondition c d]

The question is incredibly difficult to understand.
Is there a way to (conveniently) program the number of elements and their associated conditions based on an earlier computation?
The problem is "program" is not really an understandable verb in this sentence, because a human programs a computer, or programs a VCR, but you can't "program a number". So I don't understand what you are trying to say here.
But I can give you code review, and maybe through code review I can understand the question you are asking.
Unsolicited code review
It sounds like you are trying to solve a maze by eliminating dead ends, maybe.
What your code actually does is:
Generate a list of cells that are not dead ends or adjacent to dead ends, called filtered
Generate a sequence of adjacent cells from step 1, sequences
Concatenate four such adjacent sequences into a route.
Major problem: this only works if a correct route is exactly eight tiles long! Try to solve this maze:
[E]-[ ]-[ ]-[ ]
|
[ ]-[ ]-[ ]-[ ]
|
[ ]-[ ]-[ ]-[ ]
|
[ ]-[ ]-[ ]-[ ]
|
[ ]-[ ]-[ ]-[E]
So, working backwards from the code review, it sounds like your question is:
How do I generate a list if I don't know how long it is beforehand?
Solutions
You can solve a maze with a search (DFS, BFS, A*).
import Control.Monad
-- | Maze cells are identified by integers
type Cell = Int
-- | A maze is a map from cells to adjacent cells
type Maze = Cell -> [Cell]
maze :: Maze
maze = ([[1], [0,2,5], [1,3], [2],
[5], [4,6,1,9], [5,7], [6,11],
[12], [5,13], [9], [7,15],
[8,16], [14,9,17], [13,15], [14,11],
[12,17], [13,16,18], [17,19], [18]] !!)
-- | Find paths from the given start to the end
solve :: Maze -> Cell -> Cell -> [[Cell]]
solve maze start = solve' [] where
solve' path end =
let path' = end : path
in if start == end
then return path'
else do neighbor <- maze end
guard (neighbor `notElem` path)
solve' path' neighbor
The function solve works by depth-first search. Rather than putting everything in a single list comprehension, it works recursively.
In order to find a path from start to end, if start /= end,
Look at all cells adjacent to the end, neighbor <- maze end,
Make sure that we're not backtracking over a cell guard (negihbor `notElem` path),
Try to find a path from start to neighbor.
Don't try to understand the whole function at once, just understand the bit about recursion.
Summary
If you want to find the route from cell 0 to cell 19, recurse: We know that cell 18 and 19 are connected (because they are directly connected), so we can instead try to solve the problem of finding a route from cell 0 to cell 18.
This is recursion.
Footnotes
The guard,
someCondition a == True
Is equivalent to,
someCondition a
And therefore also equivalent to,
(someCondition a == True) == True
Or,
(someCondition a == (True == True)) == (True == (True == True))
Or,
someCondition a == (someCondition a == someCondition a)
The first one, someCondition a, is fine.
Footnote about do notation
The do notation in the above example is equivalent to list comprehension,
do neighbor <- maze end
guard (neighbor `notElem` path)
solve' path' neighbor
The equivalent code in list comprehension syntax is,
[result | neighbor <- maze end,
neighbor `notElem` path,
result <- solve' path' neighbor]

Is there a way to (conveniently) program the number of elements and their associated conditions based on an earlier computation? For example, return type [[a,b,c]] or [[a,b,c,d,e]] depending on another value in the program?
I suppose you want to encode the length of the list (or vector) statically in the type signature. Length of the standard lists cannot be checked on type level.
One approach to do that is to use phantom types, and introduce dummy data types which will encode different sizes:
newtype Vector d = Vector { vecArray :: UArray Int Float }
-- using EmptyDataDecls extension too
data D1
data D2
data D3
Now you can create vectors of different length which will have distinct types:
vector2d :: Float -> Float -> Vector D2
vector2d x y = Vector $ listArray (1,2) [x,y]
vector3d :: Float -> Float -> Float -> Vector D3
vector3d x y z = Vector $ listArray (1,3) [x,y,z]
If the length of the output depends on the length of the input, then consider using type-level arithmetics to parametrize the output.
You can find more by googling for "Haskell statically sized vectors".
A simpler solution is to use tuples, which are fixed length. If your function can produce either a 3-tuple, or a 5-tuple, wrap them with an Either data type: `Either (a,b,c) (a,b,c,d,e).

Looks like you're trying to solve some logic puzzle by unique selection from finite domain. Consult these:
Euler 43 - is there a monad to help write this list comprehension?
Splitting list into a list of possible tuples
The way this helps us is, we carry our domain around while we're making picks from it; and the next pick is made from the narrowed domain containing what's left after the previous pick, so a chain is naturally formed. E.g.
p43 = sum [ fromDigits [v0,v1,v2,v3,v4,v5,v6,v7,v8,v9]
| (dom5,v5) <- one_of [0,5] [0..9] -- [0..9] is the
, (dom6,v6) <- pick_any dom5 -- initial domain
, (dom7,v7) <- pick_any dom6
, rem (100*d5+10*d6+d7) 11 == 0
....
-- all possibilities of picking one elt from a domain
pick_any :: [a] -> [([a], a)]
pick_any [] = []
pick_any (x:xs) = (xs,x) : [ (x:dom,y) | (dom,y) <- pick_any xs]
-- all possibilities of picking one of provided elts from a domain
-- (assume unique domains, i.e. no repetitions)
one_of :: (Eq a) => [a] -> [a] -> [([a], a)]
one_of ns xs = [ (ys,y) | let choices = pick_any xs, n <- ns,
(ys,y) <- take 1 $ filter ((==n).snd) choices ]
You can trivially check a number of elements in your answer as a part of your list comprehension:
s = [answer | a <- .... , let answer=[....] , length answer==4 ]
or just create different answers based on a condition,
s = [answer | a <- .... , let answer=if condition then [a,b,c] else [a]]

You have Data.List.subsequences
You can write your list comprehension in monadic form (see guards in Monad Comprehensions):
(Explanation: The monad must be an instance of MonadPlus which supports failure.
guard False makes the monad fail evaluating to mzero., subsequent results are appended with mplus = (++) for the List monad.)
import Control.Monad (guard)
myDomain = [1..9] -- or whatever
validCombinations :: [a] -> [[a]]
validCombinations domainList = do
combi <- List.subsequences domainList
case combi of
[a,b] -> do
guard (propertyA a && propertyB b)
return combi
[a,b,c] -> do
guard (propertyA a && propertyB b && propertyC c)
return combi
_ -> guard False
main = do
forM_ (validCombinations myDomain) print
Update again, obtaining elements recursively, saving combinations and checks
import Control.Monad
validCombinations :: Eq a => Int -> Int -> [a] -> [(a -> Bool)] -> [a] -> [[a]]
validCombinations indx size domainList propList accum = do
elt <- domainList -- try all domain elements
let prop = propList!!indx
guard $ prop elt -- some property
guard $ elt `notElem` accum -- not repeated
{-
case accum of
prevElt : _ -> guard $ some_combined_check_with_previous elt prevElt
_ -> guard True
-}
if size > 1 then do
-- append recursively subsequent positions
other <- validCombinations (indx+1) (size-1) domainList propList (elt : accum)
return $ elt : other
else
return [elt]
myDomain = [1..3] :: [Int]
myProps = repeat (>1)
main = do
forM_ (validCombinations 0 size myDomain myProps []) print
where
size = 2
result for size 2 with non trivial result:
[2,3]
[3,2]

Related

How to split a [String] in to [[String]] based on length

I'm trying to split a list of Strings in to a List of Lists of Strings
so like in the title [String] -> [[String]]
This has to be done based on length of characters, so that the Lists in the output are no longer than 10. So if input was length 20 this would be broken down in to 2 lists and if length 21 in to 3 lists.
I'm not sure what to use to do this, I don't even know how to brake down a list in to a list of lists never mind based on certain length.
For example if the limit was 5 and the input was:
["abc","cd","abcd","ab"]
The output would be:
[["abc","cd"],["abcd"],["ab"]]
I'd like to be pointed in the right direction and what methods to use, list comprehension? recursion?
Here's an intuitive solution:
import Data.List (foldl')
breakup :: Int -> [[a]] -> [[[a]]]
breakup size = foldl' accumulate [[]]
where accumulate broken l
| length l > size = error "Breakup size too small."
| sum (map length (last broken ++ [l])) <= size
= init broken ++ [last broken ++ [l]]
| otherwise = broken ++ [[l]]
Now, let's go through it line-by-line:
breakup :: Int -> [[a]] -> [[[a]]]
Since you hinted that you may want to generalize the function to accept different size limits, our type signature reflects this. We also generalize beyond [String] (that is, [[Char]]), since our problem is not specific to [[Char]], and could equally apply to any [[a]].
breakup size = foldl' accumulate [[]]
We're using a left fold because we want to transform a list, left-to-right, into our target, which will be a list of sub-lists. Even though we're not concerned with efficiency, we're using Data.List.foldl' instead of Prelude's own foldl because this is standard practice. You can read more about foldl vs. foldl' here.
Our folding function is called accumulate. It will consider a new item and decide whether to place it in the last-created sub-list or to start a new sub-list. To make that judgment, it uses the size we passed in. We start with an initial value of [[]], that is, a list with one empty sub-list.
Now the question is, how should you accumulate your target?
where accumulate broken l
We're using broken to refer to our constructed target so far, and l (for "list") to refer to the next item to process. We'll use guards for the different cases:
| length l > size = error "Breakup size too small."
We need to raise an error if the item surpasses the size limit on its own, since there's no way to place it in a sub-list that satisfies the size limit. (Alternatively, we could build a safe function by wrapping our return value in the Maybe monad, and that's something you should definitely try out on your own.)
| sum (map length (last broken ++ [l])) <= size
= init broken ++ [last broken ++ [l]]
The guard condition is sum (map length (last broken ++ [l])) <= size, and the return value for this guard is init broken ++ [last broken ++ [l]]. Translated into plain English, we might say, "If the item can fit in the last sub-list without going over the size limit, append it there."
| otherwise = broken ++ [[l]]
On the other hand, if there isn't enough "room" in the last sub-list for this item, we start a new sub-list, containing only this item. When the accumulate helper is applied to the next item in the input list, it will decide whether to place that item in this sub-list or start yet another sub-list, following the same logic.
There you have it. Don't forget to import Data.List (foldl') up at the top. As another answer points out, this is not a performant solution if you plan to process 100,000 strings. However, I believe this solution is easier to read and understand. In many cases, readability is the more important optimization.
Thanks for the fun question. Good luck with Haskell, and happy coding!
You can do something like this:
splitByLen :: Int -> [String] -> [[String]]
splitByLen n s = go (zip s $ scanl1 (+) $ map length s) 0
where go [] _ = []
go xs prev = let (lst, rest) = span (\ (x, c) -> c - prev <= n) xs
in (map fst lst) : go rest (snd $ last lst)
And then:
*Main> splitByLen 5 ["abc","cd","abcd","ab"]
[["abc","cd"],["abcd"],["ab"]]
In case there is a string longer than n, this function will fail. Now, what you want to do in those cases depends on your requirements and that was not specified in your question.
[Update]
As requested by #amar47shah, I made a benchmark comparing his solution (breakup) with mine (splitByLen):
import Data.List
import Data.Time.Clock
import Control.DeepSeq
import System.Random
main :: IO ()
main = do
s <- mapM (\ _ -> randomString 10) [1..10000]
test "breakup 10000" $ breakup 10 s
test "splitByLen 10000" $ splitByLen 10 s
putStrLn ""
r <- mapM (\ _ -> randomString 10) [1..100000]
test "breakup 100000" $ breakup 10 r
test "splitByLen 100000" $ splitByLen 10 r
test :: (NFData a) => String -> a -> IO ()
test s a = do time1 <- getCurrentTime
time2 <- a `deepseq` getCurrentTime
putStrLn $ s ++ ": " ++ show (diffUTCTime time2 time1)
randomString :: Int -> IO String
randomString n = do
l <- randomRIO (1,n)
mapM (\ _ -> randomRIO ('a', 'z')) [1..l]
Here are the results:
breakup 10000: 0.904012s
splitByLen 10000: 0.005966s
breakup 100000: 150.945322s
splitByLen 100000: 0.058658s
Here is another approach. It is clear from the problem that the result is a list of lists and we need a running length and an inner list to keep track of how much we have accumulated (We use foldl' with these two as input). We then describe what we want which is basically:
If the length of the current input string itself exceeds the input length, we ignore that string (you may change this if you want a different behavior).
If the new length after we have added the length of the current string is within our input length, we add it to the current result list.
If the new length exceeds the input length, we add the result so far to the output and start a new result list.
chunks len = reverse . map reverse . snd . foldl' f (0, [[]]) where
f (resSoFar#(lenSoFar, (currRes: acc)) curr
| currLength > len = resSoFar -- ignore
| newLen <= len = (newLen, (curr: currRes):acc)
| otherwise = (currLength, [curr]:currRes:acc)
where
newLen = lenSoFar + currLength
currLength = length curr
Every time we add a result to the output list, we add it to the front hence we need reverse . map reverse at the end.
> chunks 5 ["abc","cd","abcd","ab"]
[["abc","cd"],["abcd"],["ab"]]
> chunks 5 ["abc","cd","abcdef","ab"]
[["abc","cd"],["ab"]]
Here is an elementary approach. First, the type String doesn't matter, so we can define our function in terms of a general type a:
breakup :: [a] -> [[a]]
I'll illustrate with a limit of 3 instead of 10. It'll be obvious how to implement it with another limit.
The first pattern will handle lists which are of size >= 3 and the the second pattern handles all of the other cases:
breakup (a1 : a2 : a3 : as) = [a1, a2, a3] : breakup as
breakup as = [ as ]
It is important to have the patterns in this order. That way the second pattern will only be used when the first pattern does not match, i.e. when there are less than 3 elements in the list.
Examples of running this on some inputs:
breakup [1..5] -> [ [1,2,3], [4,5] ]
breakup [1..4] -> [ [1,2,3], [4] ]
breakup [1..2] -> [ [1,2] ]
breakup [1..3] -> [ [1,2,3], [] ]
We see these is an extra [] when we run the function on [1..3]. Fortunately this is easy to fix by inserting another rule before the last one:
breakup [] = []
The complete definition is:
breakup :: [a] -> [[a]]
breakup [] = []
breakup (a1 : a2 : a3 : as) = [a1, a2, a3] : breakup as
breakup as = [ as ]

Convert a function returning a list to a list of functions

I have a function f :: Int -> [a] which always returns a list of size n, like so:
f 0 = [a_0_1, a_0_2, ..., a_0_n]
f 1 = [a_1_1, a_1_2, ..., a_1_n]
.
.
.
f k = [a_k_1, a_k_2, ..., a_k_n]
I want to transform this into a list of functions:
[f_1, f_2, ..., f_n] :: [Int -> a]
where
f_k i = a_i_k
I hope the notation I've used is clear enough to convey what I want.
Use lambdas
fn = [(\k -> (f k)!!i) | i <- [0..n - 1]]
then
[f1, f2, f3, ...] = fn
but you should analyze your general problem (this approach is slow).
Complete sandbox code:
n = 5
f k = [2 * k + i | i <- [1..n]]
fn = [(\k -> (f k)!!i) | i <- [0..n - 1]]
(f1:f2:_) = fn
main = do
print $ f 4
print $ (fn!!3) 4
print $ f2 4
Let's start by actually applying the function to all these values:
listf = map f [0..k]
So
listf = [[a_0_1, a_0_2, ..., a_0_n]
,[a_1_1, a_1_2, ..., a_1_n]
,...
,[a_k_1, a_k_2, ..., a_k_n]]
Now we have a list of lists, but it's the wrong way around. So let's take the transpose:
listft = transpose listf
listft = [[a_0_1, a_1_1, ..., a_k_1]
,[a_0_2, a_1_2, ..., a_k_2]
,...
,[a_0_n, a_1_n, ..., a_k_n]]
Now we have a list of lists, and each of the inner lists represents an f_i for some i. But we don't want to stop with that representation, because actually calculating the jth element of a list is O(j). So let's use Vector, from Data.Vector, instead:
listmapsf = map fromList listft
Now we have a list of Vectors, each of which represents an f_i. We can turn those into functions using !:
functions = map (!) listmapsf
Note: this does absolutely nothing to verify that the input lists are all the same length. It's probably also not the most efficient way to produce the list of functions, but it's not a bad one.
Edit: per user3237465's suggestion, I've replaced the IntMap representation with a Vector one.
This is a good question, and deserves to be applied to an arbitrary function of general type a -> [b].
I actually made a related question about the monadic sequence function, regarding making an opposite, an unsequence function. It was shown to be impossible. Here's why:
When a function returns a list, it may be of an arbitrary length. This means that we can only determine the length of the returned list when we call the function. As a result, we cannot make a list of fixed length without calling the function.
A simpler way of understanding this is imagining that we've got values trapped in an IO monad. of type IO [Int]. This is perfectly comparable to a -> [Int] because both values can only be manipulated while they're kept inside their monadic type. This is different for the sequence :: Monad m => [m a] -> m [a] because the [a] monad can be deconstructed, i.e: it's a comonad too.
To put it simply, 'pure' monads can only be constructed, and thus you cannot take a list's length, which is fundamental if we are to construct a list, from inside a function monad, or and IO monad. I hope this helps you!
If you have
f 0 = [a_0_0, a_0_1 ... a_0_m]
f 1 = [a_1_0, a_1_1 ... a_1_m]
...
f n = [a_n_0, a_n_1 ... a_n_m]
then
matr = transpose $ map f [0..n]
gives you
[ [a_0_0, a_1_0 ... a_n_0]
, [a_0_1, a_1_1 ... a_n_1]
...
, [a_0_m, a_1_m ... a_n_m]
]
Your equation is f_j i = a_i_j, where i ranges over [0..n] and j ranges over [0..m].
Since, matr is transposed, the equation becomes f_j i = matr_j_i, which can be reflected as follows:
map (\j i -> matr !! j !! i) [0..]
The whole function is
transform f n = map (\j i -> matr !! j !! i) [0..] where
matr = transpose $ map f [0..n]
Or just
transform f n = map (!!) $ transpose $ map f [0..n]
EDIT
As #josejuan pointed out, this code is not efficient, since, while evaluating
(transform f n !! j) i
transpose forces f i' for all i' <= i.
If you don't need all of the values don't compute them until they are needed. You just need to define indexed access, i.e. let frc r c = f!!r!!c where r is the row and c is the column.

mapping over entire data set to get results

Suppose I have the arrays:
A = "ABACUS"
B = "YELLOW"
And they are zipped so: Pairing = zip A B
I also have a function Connect :: Char -> [(Char,Char)] -> [(Char,Char,Int)]
What I want to do is given a char such as A, find the indices of where it is present in the first string and return the character in the same positions in the second string, as well as the position e.g. if I did Connect 'A' Pairing I'd want (A,Y,0) and (A,L,2) as results.
I know I can do
pos = x!!map fst pairing
to retrieve the positions. And fnd = findIndices (==pos) map snd pairing to get what's in this position in the second string but in Haskell how would I do this over the whole set of data (as if I were using a for loop) and how would I get my outputs?
To do exactly as you asked (but correct the initial letter of function names to be lowercase), I could define
connect :: Char -> [(Char,Char)] -> [(Char,Char,Int)]
connect c pairs = [(a,b,n)|((a,b),n) <- zip pairs [0..], a == c]
so if
pairing = zip "ABACUS" "YELLOW"
we get
ghci> connect 'A' pairing
[('A','Y',0),('A','L',2)]
However, I think it'd be neater to zip once, not twice, using zip3:
connect3 :: Char -> String -> String -> [(Char,Char,Int)]
connect3 c xs ys = filter (\(a,_,_) -> a==c) (zip3 xs ys [0..])
which is equivalent to
connect3' c xs ys = [(a,b,n)| (a,b,n) <- zip3 xs ys [0..], a==c]
they all work as you wanted:
ghci> connect3 'A' "ABACUS" "YELLOW"
[('A','Y',0),('A','L',2)]
ghci> connect3' 'A' "ABACUS" "AQUAMARINE"
[('A','A',0),('A','U',2)]
In comments, you said you'd like to get pairs for matches the other way round.
This time, it'd be most convenient to use the monadic do notation, since lists are an example of a monad.
connectEither :: (Char,Char) -> String -> String -> [(Char,Char,Int)]
connectEither (c1,c2) xs ys = do
(a,b,n) <- zip3 xs ys [0..]
if a == c1 then return (a,b,n) else
if b == c2 then return (b,a,n) else
fail "Doesn't match - leave it out"
I've used the fail function to leave out ones that don't match. The three lines starting if, if and fail are increasingly indented because they're actually one line from Haskell's point of view.
ghci> connectEither ('a','n') "abacus" "banana"
[('a','b',0),('a','n',2),('n','u',4)]
In this case, it hasn't included ('n','a',2) because it's only checking one way.
We can allow both ways by reusing existing functions:
connectBoth :: (Char,Char) -> String -> String -> [(Char,Char,Int)]
connectBoth (c1,c2) xs ys = lefts ++ rights where
lefts = connect3 c1 xs ys
rights = connect3 c2 ys xs
which gives us everything we want to get:
ghci> connectBoth ('a','n') "abacus" "banana"
[('a','b',0),('a','n',2),('n','a',2),('n','u',4)]
but unfortunately things more than once:
ghci> connectBoth ('A','A') "Austria" "Antwerp"
[('A','A',0),('A','A',0)]
So we can get rid of that using nub from Data.List. (Add import Data.List at the top of your file.)
connectBothOnce (c1,c2) xs ys = nub $ connectBoth (c1,c2) xs ys
giving
ghci> connectBothOnce ('A','A') "ABACUS" "Antwerp"
[('A','A',0),('A','t',2)]
I would recommend not zipping the lists together, since that'd just make it more difficult to use the function elemIndices from Data.List. You then have a list of the indices that you can use directly to get the values out of the second list.
You can add indices with another zip, then filter on the given character and convert tuples to triples. Especially because of this repackaging, a list comprehension seems appropriate:
connect c pairs = [(a, b, idx) | ((a, b), idx) <- zip pairs [0..], a == c]

Dovetail iteration over infinite lists in Haskell

I want to iterate 2 (or 3) infinite lists and find the "smallest" pair that satisfies a condition, like so:
until pred [(a,b,c) | a<-as, b<-bs, c<-cs]
where pred (a,b,c) = a*a + b*b == c*c
as = [1..]
bs = [1..]
cs = [1..]
The above wouldn't get very far, as a == b == 1 throughout the run of the program.
Is there a nice way to dovetail the problem, e.g. build the infinite sequence [(1,1,1),(1,2,1),(2,1,1),(2,1,2),(2,2,1),(2,2,2),(2,2,3),(2,3,2),..] ?
Bonus: is it possible to generalize to n-tuples?
There's a monad for that, Omega.
Prelude> let as = each [1..]
Prelude> let x = liftA3 (,,) as as as
Prelude> let x' = mfilter (\(a,b,c) -> a*a + b*b == c*c) x
Prelude> take 10 $ runOmega x'
[(3,4,5),(4,3,5),(6,8,10),(8,6,10),(5,12,13),(12,5,13),(9,12,15),(12,9,15),(8,15,17),(15,8,17)]
Using it's applicative features, you can generalize to arbitrary tuples:
quadrupels = (,,,) <$> as <*> as <*> as <*> as -- or call it liftA4
But: this alone does not eliminate duplication, of course. It only gives you proper diagonalization. Maybe you could use monad comprehensions together with an approach like Thomas's, or just another mfilter pass (restricting to b /= c, in this case).
List comprehensions are great (and concise) ways to solve such problems. First, you know you want all combinations of (a,b,c) that might satisfy a^2 + b^2 = c^2 - a helpful observation is that (considering only positive numbers) it will always be the case that a <= c && b <= c.
To generate our list of candidates we can thus say c ranges from 1 to infinity while a and b range from one to c.
[(a,b,c) | c <- [1..], a <- [1..c], b <- [1..c]]
To get to the solution we just need to add your desired equation as a guard:
[(a,b,c) | c <- [1..], a <- [1..c], b <- [1..c], a*a+b*b == c*c]
This is inefficient, but the output is correct:
[(3,4,5),(4,3,5),(6,8,10),(8,6,10),(5,12,13),(12,5,13),(9,12,15)...
There are more principled methods than blind testing that can solve this problem.
{- It depends on what is "smallest". But here is a solution for a concept of "smallest" if tuples were compared first by their max. number and then by their total sum. (You can just copy and paste my whole answer into a file as I write the text in comments.)
We will need nub later. -}
import Data.List (nub)
{- Just for illustration: the easy case with 2-tuples. -}
-- all the two-tuples where 'snd' is 'n'
tuples n = [(i, n) | i <- [1..n]]
-- all the two-tuples where 'snd' is in '1..n'
tuplesUpTo n = concat [tuples i | i <- [1..n]]
{-
To get all results, you will need to insert the flip of each tuple into the stream. But let's do that later and generalize first.
Building tuples of arbitrary length is somewhat difficult, so we will work on lists. I call them 'kList's, if they have a length 'k'.
-}
-- just copied from the tuples case, only we need a base case for k=1 and
-- we can combine all results utilizing the list monad.
kLists 1 n = [[n]]
kLists k n = do
rest <- kLists (k-1) n
add <- [1..head rest]
return (add:rest)
-- same as above. all the klists with length k and max number of n
kListsUpTo k n = concat [kLists k i | i <- [1..n]]
-- we can do that unbounded as well, creating an infinite list.
kListsInf k = concat [kLists k i | i <- [1..]]
{-
The next step is rotating these lists around, because until now the largest number is always in the last place. So we just look at all rotations to get all the results. Using nub here is admittedly awkward, you can improve that. But without it, lists where all elements are the same are repeated k times.
-}
rotate n l = let (init, end) = splitAt n l
in end ++ init
rotations k l = nub [rotate i l | i <- [0..k-1]]
rotatedKListsInf k = concatMap (rotations k) $ kListsInf k
{- What remains is to convert these lists into tuples. This is a bit awkward, because every n-tuple is a separate type. But it's straightforward, of course. -}
kListToTuple2 [x,y] = (x,y)
kListToTuple3 [x,y,z] = (x,y,z)
kListToTuple4 [x,y,z,t] = (x,y,z,t)
kListToTuple5 [x,y,z,t,u] = (x,y,z,t,u)
kListToTuple6 [x,y,z,t,u,v] = (x,y,z,t,u,v)
{- Some tests:
*Main> take 30 . map kListToTuple2 $ rotatedKListsInf 2
[(1,1),(1,2),(2,1),(2,2),(1,3),(3,1),(2,3),(3,2),(3,3),(1,4),(4,1),(2,4),(4,2),(3,4),
(4,3),(4,4),(1,5),(5,1),(2,5),(5,2),(3,5),(5,3),(4,5),(5,4),(5,5),(1,6),(6,1),
(2,6), (6,2), (3,6)]
*Main> take 30 . map kListToTuple3 $ rotatedKListsInf 3
[(1,1,1),(1,1,2),(1,2,1),(2,1,1),(1,2,2),(2,2,1),(2,1,2),(2,2,2),(1,1,3),(1,3,1),
(3,1,1),(1,2,3),(2,3,1),(3,1,2),(2,2,3),(2,3,2),(3,2,2),(1,3,3),(3,3,1),(3,1,3),
(2,3,3),(3,3,2),(3,2,3),(3,3,3),(1,1,4),(1,4,1),(4,1,1),(1,2,4),(2,4,1),(4,1,2)]
Edit:
I realized there is a bug: Just rotating the ordered lists isn't enough of course. The solution must be somewhere along the lines of having
rest <- concat . map (rotations (k-1)) $ kLists (k-1) n
in kLists, but then some issues with repeated outputs arise. You can figure that out, I guess. ;-)
-}
It really depends on what you mean by "smallest", but I assume you want to find a tuple of numbers with respect to its maximal element - so (2,2) is less than (1,3) (while standard Haskell ordering is lexicographic).
There is package data-ordlist, which is aimed precisely at working with ordered lists. It's function mergeAll (and mergeAllBy) allows you to combine a 2-dimensional matrix ordered in each direction into an ordered list.
First let's create a desired comparing function on tuples:
import Data.List (find)
import Data.List.Ordered
compare2 :: (Ord a) => (a, a) -> (a, a) -> Ordering
compare2 x y = compare (max2 x, x) (max2 y, y)
where
max2 :: Ord a => (a, a) -> a
max2 (x, y) = max x y
Then using mergeAll we create a function that takes a comparator, a combining function (which must be monotonic in both arguments) and two sorted lists. It combines all possible elements from the two lists using the function and produces a result sorted list:
mergeWith :: (b -> b -> Ordering) -> (a -> a -> b) -> [a] -> [a] -> [b]
mergeWith cmp f xs ys = mergeAllBy cmp $ map (\x -> map (f x) xs) ys
With this function, it's very simple to produce tuples ordered according to their maximum:
incPairs :: [(Int,Int)]
incPairs = mergeWith compare2 (,) [1..] [1..]
Its first 10 elements are:
> take 10 incPairs
[(1,1),(1,2),(2,1),(2,2),(1,3),(2,3),(3,1),(3,2),(3,3),(1,4)]
and when we (for example) look for the first pair whose sum of squares is equal to 65:
find (\(x,y) -> x^2+y^2 == 65) incPairs
we get the correct result (4,7) (as opposed to (1,8) if lexicographic ordering were used).
This answer is for a more general problem for a unknown predicate. If the predicate is known, more efficient solutions are possible, like others have listed solutions based on knowledge that you don't need to iterate for all Ints for a given c.
When dealing with infinite lists, you need to perform breadth-first search for solution. The list comprehension only affords depth-first search, that is why you never arrive at a solution in your original code.
counters 0 xs = [[]]
counters n xs = concat $ foldr f [] gens where
gens = [[x:t | t <- counters (n-1) xs] | x <- xs]
f ys n = cat ys ([]:n)
cat (y:ys) (x:xs) = (y:x): cat ys xs
cat [] xs = xs
cat xs [] = [xs]
main = print $ take 10 $ filter p $ counters 3 [1..] where
p [a,b,c] = a*a + b*b == c*c
counters generates all possible counters for values from the specified range of digits, including a infinite range.
First, we obtain a list of generators of valid combinations of counters - for each permitted digit, combine it with all permitted combinations for counters of smaller size. This may result in a generator that produces a infinite number of combinations. So, we need to borrow from each generator evenly.
So gens is a list of generators. Think of this as a list of all counters starting with one digit: gens !! 0 is a list of all counters starting with 1, gens !! 1 is a list of all counters starting with 2, etc.
In order to borrow from each generator evenly, we could transpose the list of generators - that way we would get a list of first elements of the generators, followed by a list of second elements of the generators, etc.
Since the list of generators may be infinite, we cannot afford to transpose the list of generators, because we may never get to look at the second element of any generator (for a infinite number of digits we'd have a infinite number of generators). So, we enumerate the elements from the generators "diagonally" - take first element from the first generator; then take the second element from the first generator and the first from the second generator; then take the third element from the first generator, the second from the second, and the first element from the third generator, etc. This can be done by folding the list of generators with a function f, which zips together two lists - one list is the generator, the other is the already-zipped generators -, the beginning of one of them being offset by one step by adding []: to the head. This is almost zipWith (:) ys ([]:n) - the difference is that if n or ys is shorter than the other one, we don't drop the remainder of the other list. Note that folding with zipWith (:) ys n would be a transpose.
For this answer I will take "smallest" to refer to the sum of the numbers in the tuple.
To list all possible pairs in order, you can first list all of the pairs with a sum of 2, then all pairs with a sum of 3 and so on. In code
pairsWithSum n = [(i, n-i) | i <- [1..n-1]]
xs = concatMap pairsWithSum [2..]
Haskell doesn't have facilities for dealing with n-tuples without using Template Haskell, so to generalize this you will have to switch to lists.
ntuplesWithSum 1 s = [[s]]
ntuplesWithSum n s = concatMap (\i -> map (i:) (ntuplesWithSum (n-1) (s-i))) [1..s-n+1]
nums n = concatMap (ntuplesWithSum n) [n..]
Here's another solution, with probably another slightly different idea of "smallest". My order is just "all tuples with max element N come before all tuples with max element N+1". I wrote the versions for pairs and triples:
gen2_step :: Int -> [(Int, Int)]
gen2_step s = [(x, y) | x <- [1..s], y <- [1..s], (x == s || y == s)]
gen2 :: Int -> [(Int, Int)]
gen2 n = concatMap gen2_step [1..n]
gen2inf :: [(Int, Int)]
gen2inf = concatMap gen2_step [1..]
gen3_step :: Int -> [(Int, Int, Int)]
gen3_step s = [(x, y, z) | x <- [1..s], y <- [1..s], z <- [1..s], (x == s || y == s || z == s)]
gen3 :: Int -> [(Int, Int, Int)]
gen3 n = concatMap gen3_step [1..n]
gen3inf :: [(Int, Int, Int)]
gen3inf = concatMap gen3_step [1..]
You can't really generalize it to N-tuples, though as long as you stay homogeneous, you may be able to generalize it if you use arrays. But I don't want to tie my brain into that knot.
I think this is the simplest solution if "smallest" is defined as x+y+z because after you find your first solution in the space of Integral valued pythagorean triangles, your next solutions from the infinite list are bigger.
take 1 [(x,y,z) | y <- [1..], x <- [1..y], z <- [1..x], z*z + x*x == y*y]
-> [(4,5,3)]
It has the nice property that it returns each symmetrically unique solution only once. x and z are also infinite, because y is infinite.
This does not work, because the sequence for x never finishes, and thus you never get a value for y, not to mention z. The rightmost generator is the innermost loop.
take 1 [(z,y,x)|z <- [1..],y <- [1..],x <- [1..],x*x + y*y == z*z]
Sry, it's quite a while since I did haskell, so I'm going to describe it with words.
As I pointed out in my comment. It is not possible to find the smallest anything in an infinite list, since there could always be a smaller one.
What you can do is, have a stream based approach that takes the lists and returns a list with only 'valid' elements, i. e. where the condition is met. Lets call this function triangle
You can then compute the triangle list to some extent with take n (triangle ...) and from this n elements you can find the minium.

List processing in Haskell

I am teaching myself Haskell and have run into a problem and need help.
Background:
type AInfo = (Char, Int)
type AList = [AInfo] (let’s say [(‘a’, 2), (‘b’,5), (‘a’, 1), (‘w’, 21)]
type BInfo = Char
type BList = [BInfo] (let’s say [‘a’, ‘a’, ‘c’, ‘g’, ‘a’, ‘w’, ‘b’]
One quick edit: The above information is for illustrative purposes only. The actual elements of the lists are a bit more complex. Also, the lists are not static; they are dynamic (hence the uses of the IO monad) and I need to keep/pass/"return"/have access to and change the lists during the running of the program.
I am looking to do the following:
For all elements of AList check against all elements of BList and where the character of the AList element (pair) is equal to the character in the Blist add one to the Int value of the AList element (pair) and remove the character from BList.
So what this means is after the first element of AList is checked against all elements of BList the values of the lists should be:
AList [(‘a’, 5), (‘b’,5), (‘a’, 1), (‘w’, 21)]
BList [‘c’, ‘g’, ‘w’, ‘b’]
And in the end, the lists values should be:
AList [(‘a’, 5), (‘b’,6), (‘a’, 1), (‘w’, 22)]
BList [‘c’, ‘g’]
Of course, all of this is happening in an IO monad.
Things I have tried:
Using mapM and a recursive helper function. I have looked at both:
Every element of AList checked against every element of bList -- mapM (myHelpF1 alist) blist and
Every element of BList checked against every element of AList – mapM (myHelpF2 alist) blist
Passing both lists to a function and using a complicated
if/then/else & helper function calls (feels like I am forcing
Haskell to be iterative; Messy convoluted code, Does not feel
right.)
I have thought about using filter, the character value of AList
element and Blist to create a third list of Bool and the count the
number of True values. Update the Int value. Then use filter on
BList to remove the BList elements that …… (again Does not feel
right, not very Haskell-like.)
Things I think I know about the problem:
The solution may be exceeding trivial. So much so, the more experienced Haskellers will be muttering under their breath “what a noob” as they type their response.
Any pointers would be greatly appreciated. (mutter away….)
A few pointers:
Don't use [(Char, Int)] for "AList". The data structure you are looking for is a finite map: Map Char Int. Particularly look at member and insertWith. toList and fromList convert from the representation you currently have for AList, so even if you are stuck with that representation, you can convert to a Map for this algorithm and convert back at the end. (This will be more efficient than staying in a list because you are doing so many lookups, and the finite map API is easier to work with than lists)
I'd approach the problem as two phases: (1) partition out the elements of blist by whether they are in the map, (2) insertWith the elements which are already in the map. Then you can return the resulting map and the other partition.
I would also get rid of the meaningless assumptions such as that keys are Char -- you can just say they are any type k (for "key") that satisfies the necessary constraints (that you can put it in a Map, which requires that it is Orderable). You do this with lowercase type variables:
import qualified Data.Map as Map
sieveList :: (Ord k) => Map.Map k Int -> [k] -> (Map.Map k Int, [k])
Writing algorithms in greater generality helps catch bugs, because it makes sure that you don't use any assumptions you don't need.
Oh, also this program has no business being in the IO monad. This is pure code.
import Data.List
type AInfo = (Char, Int)
type AList = [AInfo]
type BInfo = Char
type BList = [BInfo]
process :: AList -> BList -> AList
process [] _ = []
process (a:as) b = if is_in a b then (fst a,snd a + 1):(process as (delete (fst a) b)) else a:process as b where
is_in f [] = False
is_in f (s:ss) = if fst f == s then True else is_in f ss
*Main> process [('a',5),('b',5),('a',1),('b',21)] ['c','b','g','w','b']
[('a',5),('b',6),('a',1),('b',22)]
*Main> process [('a',5),('b',5),('a',1),('w',21)] ['c','g','w','b']
[('a',5),('b',6),('a',1),('w',22)]
Probably an important disclaimer: I'm rusty at Haskell to the point of ineptness, but as a relaxing midnight exercise I wrote this thing. It should do what you want, although it doesn't return a BList. With a bit of modification, you can get it to return an (AList,BList) tuple, but methinks you'd be better off using an imperative language if that kind of manipulation is required.
Alternately, there's an elegant solution and I'm too ignorant of Haskell to know it.
While I am by no means a Haskell expert, I have a partial attempt that returns that result of an operation once. Maybe you can find out how to map it over the rest to get your solution. The addwhile is clever, since you only want to update the first occurrence of an element in lista, if it exists twice, it will just add 0 to it. Code critiques are more than welcome.
import Data.List
type AInfo = (Char, Int)
type AList = [AInfo]
type BInfo = Char
type BList = [BInfo]
lista = ([('a', 2), ('b',5), ('a', 1), ('w', 21)] :: AList)
listb = ['a','a','c','g','a','w','b']
--step one, get the head, and its occurrences
items list = (eleA, eleB) where
eleA = length $ filter (\x -> x == (head list)) list
eleB = head list
getRidOfIt list ele = (dropWhile (\x -> x == ele) list) --drop like its hot
--add to lista
addWhile :: [(Char, Int)] -> Char -> Int -> [(Char,Int)]
addWhile [] _ _ = []
addWhile ((x,y):xs) letter times = if x == letter then (x,y+times) : addWhile xs letter times
else (x,y) : addWhile xs letter 0
--first answer
firstAnswer = addWhile lista (snd $ items listb) (fst $ items listb)
--[('a',5),('b',5),('a',1),('w',21)]
The operation you describe is pure, as #luqui points out, so we just define it as a pure Haskell function. It can be used inside a monad (including IO) by means of fmap (or do).
import Data.List
combine alist blist = (reverse a, b4) where
First we sort and count the B list:
b = map (\g->(head g,length g)) . group . sort $ blist
We need the import for group and sort to be available. Next, we roll along the alist and do our thing:
(a,b2) = foldl g ([],b) alist
g (acc,b) e#(x,c) = case pick x b of
Nothing -> (e:acc,b)
Just (n,b2) -> ((x,c+n):acc,b2)
b3 = map fst b2
b4 = [ c | c <- blist, elem c b3 ]
Now pick, as used, must be
pick x [] = Nothing
pick x ((y,n):t)
| x==y = Just (n,t)
| otherwise = case pick x t of Nothing -> Nothing
Just (k,r) -> Just (k, (y,n):r)
Of course pick performs a linear search, so if performance (speed) becomes a problem, b should be changed to allow for binary search (tree etc, like Map). The calculation of b4 which is filter (`elem` b3) blist is another potential performance problem with its repeated checks for presence in b3. Again, checking for presence in trees is faster than in lists, in general.
Test run:
> combine [('a', 2), ('b',5), ('a', 1), ('w', 21)] "aacgawb"
([('a',5),('b',6),('a',1),('w',22)],"cg")
edit: you probably want it the other way around, rolling along the blist while updating the alist and producing (or not) the elements of blist in the result (b4 in my code). That way the algorithm will operate in a more local manner on long input streams (that assuming your blist is long, though you didn't say that). As written above, it will have a space problem, consuming the input stream blist several times over. I'll keep it as is as an illustration, a food for thought.
So if you decide to go the 2nd route, first convert your alist into a Map (beware the duplicates!). Then, scanning (with scanl) over blist, make use of updateLookupWithKey to update the counts map and at the same time decide for each member of blist, one by one, whether to output it or not. The type of the accumulator will thus have to be (Map a Int, Maybe a), with a your element type (blist :: [a]):
scanl :: (acc -> a -> acc) -> acc -> [a] -> [acc]
scanning = tail $ scanl g (Nothing, fromList $ reverse alist) blist
g (_,cmap) a = case updateLookupWithKey (\_ c->Just(c+1)) a cmap of
(Just _, m2) -> (Nothing, m2) -- seen before
_ -> (Just a, cmap) -- not present in counts
new_b_list = [ a | (Just a,_) <- scanning ]
last_counts = snd $ last scanning
You will have to combine the toList last_counts with the original alist if you have to preserve the old duplicates there (why would you?).

Resources