Inspired by the recent question about 2d grids in Haskell, I'm wondering if it would be possible to create a two-dimensional zipper to keep track of a position in a list of lists. A one-dimensional zipper on a list allows us to really efficiently move locally in a large list (the common example being a text editor). But lets say we have a second dimension like this:
grid =
[[ 1, 2, 3, 4, 5]
,[ 6, 7, 8, 9,10]
,[11,12,13,14,15]
,[16,17,18,19,20]
,[21,22,23,24,25]]
Can we create some kind of zipper data structure to efficiently move not only left and right but up and down in the grid here? If so, what if we replace the list of lists with an infinite list of infinite lists, can we still get efficient movement?
Not quite, no. One of the key aspects of how zippers work is that they represent a location in a structure by a path used to reach it, plus extra fragments created along the way, with the end result that you can backtrack along that path and rebuild the structure as you go. The nature of the paths available through the data structure thus constrains the zipper.
Because locations are identified by paths, each distinct path represents a different location, so any data structure with multiple paths to the same value can't be used with a zipper--for example, consider a cyclic list, or any other structure with looping paths.
Arbitrary movement in 2D space doesn't really fit the above requirements, so we can deduce that a 2D zipper would necessarily be somewhat limited. Perhaps you'd start from the origin, walk a path through the structure, and then backtrack along that path some distance in order to reach other points, for example. This also implies that for any point in the structure, there are other points that can only be reached via the origin.
What you can do is build some notion of 2D distance into the data structure, so that as you follow a path down through the structure, the points "below" you are close to each other; the idea is to minimize the amount of backtracking needed on average to move a short distance in 2D space. This ends up being roughly the same approach needed to search 2D space by distance--nearest neighbor searches, efficient geometric intersection, that sort of thing--and can be done with the same kind of data structure, namely space partitioning to create a higher-dimensional search tree. Implementing a zipper for a quadtree, a kd-tree, or similar structures is straightforward, just like any other tree.
Well you can use something simple like the following code. We represent a table by the top rows of the selected element, the bottom rows of the selected element, plus the elements to the left of the selected one, and the elements to the right of the selected one.
The top rows and the left elements are stored in a reverse order to enable efficient movement.
I'm not sure if this qualifies as a zipper though, because even though we hold a "location" in the data structure, it is not a "path".
-- Table sel left right top bottom
data Table a = Table a [a] [a] [[a]] [[a]] deriving Show
left :: Table a -> Table a
left tab#(Table _ [] _ _ _) = tab
left (Table sel (l:ls) rs ts bs) = Table l ls (sel:rs) ts bs
right :: Table a -> Table a
right tab#(Table _ _ [] _ _) = tab
right (Table sel ls (r:rs) ts bs) = Table r (sel:ls) rs ts bs
up :: Table a -> Table a
up tab#(Table _ _ _ [] _) = tab
up (Table sel ls rs (t:ts) bs) = Table sel' ls' rs' ts (b:bs)
where
(ls',(sel':rs')) = splitAt (length ls) t
b = ls ++ (sel:rs)
down :: Table a -> Table a
down tab#(Table _ _ _ _ []) = tab
down (Table sel ls rs ts (b:bs)) = Table sel' ls' rs' (t:ts) bs
where
(ls',(sel':rs')) = splitAt (length ls) b
t = ls ++ (sel:rs)
tableToList :: Table a -> [[a]]
tableToList (Table sel ls rs ts bs) = (reverse ts) ++ [ls ++ (sel:rs)] ++ bs
listToTable :: [[a]] -> Table a
listToTable [] = error "cannot make empty table"
listToTable ([]:_) = error "cannot make empty table"
listToTable ((t:tr):ts) = Table t [] tr [] ts
This even works for infinite lists -
selected :: Table a -> a
selected (Table sel _ _ _ _) = sel
a :: Table Int
a = listToTable $ replicate 10 [1..]
selected a #=> 1
selected $ down a #=> 1
selected $ right $ down a #=> 2
I was looking for something similar: a way to cheaply and easily navigate (which includes going “backwards”) a doubly-infinite list of lists. Here's my take at it.
If I read the others answers carefully, what I'm presenting here isn't really a zipper: while navigation is amortized O(1), the memory used by the zipper structure network is never released. On the other hand, it ought to tie the knot enough for “cells” to be shared no matter the path we take to get to them, which is the kind of topology we'd want on a 2D list of lists.
To compensate, the list of lists used to generate it ought to eventually go unreferenced and garbage-collected.
data FakeZip2D a = Z2 { z2Val :: a
, goUp :: !( Maybe (FakeZip2D a) )
, goDown :: Maybe (FakeZip2D a)
, goLeft :: !( Maybe (FakeZip2D a) )
, goRight :: Maybe (FakeZip2D a)
}
fromList2 :: [[a]] -> Maybe (FakeZip2D a)
fromList2 xss = head (head zss) where
extended = [ repeat Nothing ] ++
map (\z -> [Nothing] ++ z ++ repeat Nothing) zss ++
[ repeat Nothing ]
zss = zipWith4' row xss extended (drop 1 extended) (drop 2 extended)
row xs prev cur next = Just <$> zipWith5' Z2 xs (tail prev) (tail next)
cur (drop 2 cur)
-- totally inspired by https://stackoverflow.com/a/54096748/12274
zipWith4' f (a:as) (b:bs) ~(c:cs) ~(d:ds) =
f a b c d : zipWith4' f as bs cs ds
zipWith5' f (a:as) (b:bs) ~(c:cs) (d:ds) ~(e:es) =
f a b c d e : zipWith5' f as bs cs ds es
The data structure ought to be self-explanatory. Up and left can afford to be strict because we're building from singly-linked lists. AFAIK, there's no point in maing them lazy in Haskell, as they wouldn't let anything go out of scope anyway.
The lattice is built recursively, expanding the borders of the provided input with Nothing. The lazy-enough variants of zipWith I needed are inspired from answers to another series of questions of mine on the topic.
Here it is in action:
demo :: IO ()
demo = do
let multList2 = [[ i*j | j <- [0..] ] | i <- [0..] ]
multZ2 = fromList2 multList2
let rows = iterate (>>= goDown) multZ2
cols = map (iterate (>>= goRight)) rows
putStrLn "Multiplication table"
mapM_ (print . map z2Val) $ take 5 $ map (catMaybes . take 5) cols
putStrLn "List of squares"
let goDiag = goRight >=> goDown
print $ map z2Val $ take 25 $ catMaybes $ iterate (>>= goDiag) multZ2
putStrLn "Convoluted list of squares"
let goDiag' = goDown >=> goRight >=> goUp >=> goLeft >=> goDiag
print $ map z2Val $ take 25 $ catMaybes $ iterate (>>= goDiag') multZ2
The interface can likely be made even easier to use by dropping the Maybes. At your own risk, naturally.
This might be slightly off-topic as it's not a real zipper either, but it solved my problem; and since this is the question that came up when I first looked for a solution, I'm posting it here with the intent it helps someone else.
Related
My aim: given a list of numbers, find the indices of the first pair of numbers that add up to a target number. This is part of the problem given in an old google-code-jam challenge. https://code.google.com/codejam/contest/351101/dashboard#s=p0
λ: check 14 [2,4,6,8,10]
(3,4)
NB indexed from 1
My solution is to create a map then iterate over the list of numbers; on each iteration, check if complement (ie target - current value) exists in the map. If it does I return the current index and index of complement from map. If it does not exist, recurse with current index inserted into map and rest of item list.
1st attempt:
import qualified Data.HashMap as M
check :: Int -> [Int] -> (Int, Int)
check target items = go M.empty (zip [1..] items)
where
go m ((i,v):vs) = case M.lookup (target - v) m of
Just x -> (x,i)
Nothing -> go (M.insert v i m) vs
Is this a correct implementation of my algorithm? It works on my test cases, but I feel that it is not correct Haskell way to write it.
Is HashMap the correct data structure to use? or would something like Vector be better?
Also, is this actually a good algorithm for the problem?
Another approach which does not need map (though still asymptotically equivalent to the OP's version) is to traverse sorted list from both ends moving left or right "pointer" depending on the current sum:
import Data.List (sort)
check t xs =
let sxs = sort xs
in go sxs (reverse sxs)
where
go ls#(l:ls') rs#(r:rs') | l < r =
case compare (l + r) t of
LT -> go ls' rs
EQ -> Just (l,r)
GT -> go ls rs'
go _ _ = Nothing
For simplicity assuming that the numbers in list are all distinct.
UPDATE: Missed the fact that we are after indexes, not values.
UPDATE2: And handle non-distinct case:
check t xs =
let sxs = sort $ zip xs [1..]
in go sxs (reverse sxs)
where
go ls#((l,li):ls') rs#((r,ri):rs') | li /= ri =
case compare (l + r) t of
LT -> go ls' rs
EQ -> Just (li,ri)
GT -> go ls rs'
go _ _ = Nothing
Or we can just find all of them:
checkAll t xs =
let sxs = sort $ zip xs [1..]
in go sxs (reverse sxs) []
where
go ls#((l,li):ls') rs#((r,ri):rs') ms | li /= ri =
case compare (l + r) t of
LT -> go ls' rs ms
EQ -> go ls' rs ((li,ri):ms)
GT -> go ls rs' ms
go _ _ ms = ms
It looks okay. The fact that go is partial is a bit disturbing, but the problem statement does say that a solution is supposed to exist.
IntMap should be preferred when the keys are Int. Then Map for other keys that are small in size. HashMap for larger keys (like String).
An immutable Vector would not be right because updating it is too expensive. It is possible to solve the problem with a mutable MVector, because the values of the input are quite small (<= 2000), but that would be more awkward than your current solution in Haskell.
I'm trying to split a list of Strings in to a List of Lists of Strings
so like in the title [String] -> [[String]]
This has to be done based on length of characters, so that the Lists in the output are no longer than 10. So if input was length 20 this would be broken down in to 2 lists and if length 21 in to 3 lists.
I'm not sure what to use to do this, I don't even know how to brake down a list in to a list of lists never mind based on certain length.
For example if the limit was 5 and the input was:
["abc","cd","abcd","ab"]
The output would be:
[["abc","cd"],["abcd"],["ab"]]
I'd like to be pointed in the right direction and what methods to use, list comprehension? recursion?
Here's an intuitive solution:
import Data.List (foldl')
breakup :: Int -> [[a]] -> [[[a]]]
breakup size = foldl' accumulate [[]]
where accumulate broken l
| length l > size = error "Breakup size too small."
| sum (map length (last broken ++ [l])) <= size
= init broken ++ [last broken ++ [l]]
| otherwise = broken ++ [[l]]
Now, let's go through it line-by-line:
breakup :: Int -> [[a]] -> [[[a]]]
Since you hinted that you may want to generalize the function to accept different size limits, our type signature reflects this. We also generalize beyond [String] (that is, [[Char]]), since our problem is not specific to [[Char]], and could equally apply to any [[a]].
breakup size = foldl' accumulate [[]]
We're using a left fold because we want to transform a list, left-to-right, into our target, which will be a list of sub-lists. Even though we're not concerned with efficiency, we're using Data.List.foldl' instead of Prelude's own foldl because this is standard practice. You can read more about foldl vs. foldl' here.
Our folding function is called accumulate. It will consider a new item and decide whether to place it in the last-created sub-list or to start a new sub-list. To make that judgment, it uses the size we passed in. We start with an initial value of [[]], that is, a list with one empty sub-list.
Now the question is, how should you accumulate your target?
where accumulate broken l
We're using broken to refer to our constructed target so far, and l (for "list") to refer to the next item to process. We'll use guards for the different cases:
| length l > size = error "Breakup size too small."
We need to raise an error if the item surpasses the size limit on its own, since there's no way to place it in a sub-list that satisfies the size limit. (Alternatively, we could build a safe function by wrapping our return value in the Maybe monad, and that's something you should definitely try out on your own.)
| sum (map length (last broken ++ [l])) <= size
= init broken ++ [last broken ++ [l]]
The guard condition is sum (map length (last broken ++ [l])) <= size, and the return value for this guard is init broken ++ [last broken ++ [l]]. Translated into plain English, we might say, "If the item can fit in the last sub-list without going over the size limit, append it there."
| otherwise = broken ++ [[l]]
On the other hand, if there isn't enough "room" in the last sub-list for this item, we start a new sub-list, containing only this item. When the accumulate helper is applied to the next item in the input list, it will decide whether to place that item in this sub-list or start yet another sub-list, following the same logic.
There you have it. Don't forget to import Data.List (foldl') up at the top. As another answer points out, this is not a performant solution if you plan to process 100,000 strings. However, I believe this solution is easier to read and understand. In many cases, readability is the more important optimization.
Thanks for the fun question. Good luck with Haskell, and happy coding!
You can do something like this:
splitByLen :: Int -> [String] -> [[String]]
splitByLen n s = go (zip s $ scanl1 (+) $ map length s) 0
where go [] _ = []
go xs prev = let (lst, rest) = span (\ (x, c) -> c - prev <= n) xs
in (map fst lst) : go rest (snd $ last lst)
And then:
*Main> splitByLen 5 ["abc","cd","abcd","ab"]
[["abc","cd"],["abcd"],["ab"]]
In case there is a string longer than n, this function will fail. Now, what you want to do in those cases depends on your requirements and that was not specified in your question.
[Update]
As requested by #amar47shah, I made a benchmark comparing his solution (breakup) with mine (splitByLen):
import Data.List
import Data.Time.Clock
import Control.DeepSeq
import System.Random
main :: IO ()
main = do
s <- mapM (\ _ -> randomString 10) [1..10000]
test "breakup 10000" $ breakup 10 s
test "splitByLen 10000" $ splitByLen 10 s
putStrLn ""
r <- mapM (\ _ -> randomString 10) [1..100000]
test "breakup 100000" $ breakup 10 r
test "splitByLen 100000" $ splitByLen 10 r
test :: (NFData a) => String -> a -> IO ()
test s a = do time1 <- getCurrentTime
time2 <- a `deepseq` getCurrentTime
putStrLn $ s ++ ": " ++ show (diffUTCTime time2 time1)
randomString :: Int -> IO String
randomString n = do
l <- randomRIO (1,n)
mapM (\ _ -> randomRIO ('a', 'z')) [1..l]
Here are the results:
breakup 10000: 0.904012s
splitByLen 10000: 0.005966s
breakup 100000: 150.945322s
splitByLen 100000: 0.058658s
Here is another approach. It is clear from the problem that the result is a list of lists and we need a running length and an inner list to keep track of how much we have accumulated (We use foldl' with these two as input). We then describe what we want which is basically:
If the length of the current input string itself exceeds the input length, we ignore that string (you may change this if you want a different behavior).
If the new length after we have added the length of the current string is within our input length, we add it to the current result list.
If the new length exceeds the input length, we add the result so far to the output and start a new result list.
chunks len = reverse . map reverse . snd . foldl' f (0, [[]]) where
f (resSoFar#(lenSoFar, (currRes: acc)) curr
| currLength > len = resSoFar -- ignore
| newLen <= len = (newLen, (curr: currRes):acc)
| otherwise = (currLength, [curr]:currRes:acc)
where
newLen = lenSoFar + currLength
currLength = length curr
Every time we add a result to the output list, we add it to the front hence we need reverse . map reverse at the end.
> chunks 5 ["abc","cd","abcd","ab"]
[["abc","cd"],["abcd"],["ab"]]
> chunks 5 ["abc","cd","abcdef","ab"]
[["abc","cd"],["ab"]]
Here is an elementary approach. First, the type String doesn't matter, so we can define our function in terms of a general type a:
breakup :: [a] -> [[a]]
I'll illustrate with a limit of 3 instead of 10. It'll be obvious how to implement it with another limit.
The first pattern will handle lists which are of size >= 3 and the the second pattern handles all of the other cases:
breakup (a1 : a2 : a3 : as) = [a1, a2, a3] : breakup as
breakup as = [ as ]
It is important to have the patterns in this order. That way the second pattern will only be used when the first pattern does not match, i.e. when there are less than 3 elements in the list.
Examples of running this on some inputs:
breakup [1..5] -> [ [1,2,3], [4,5] ]
breakup [1..4] -> [ [1,2,3], [4] ]
breakup [1..2] -> [ [1,2] ]
breakup [1..3] -> [ [1,2,3], [] ]
We see these is an extra [] when we run the function on [1..3]. Fortunately this is easy to fix by inserting another rule before the last one:
breakup [] = []
The complete definition is:
breakup :: [a] -> [[a]]
breakup [] = []
breakup (a1 : a2 : a3 : as) = [a1, a2, a3] : breakup as
breakup as = [ as ]
I want to write program that takes array of Ints and length and returns array that consist in position i all elements, that equals i, for example
[0,0,0,1,3,5,3,2,2,4,4,4] 6 -> [[0,0,0],[1],[2,2],[3,3],[4,4,4],[5]]
[0,0,4] 7 -> [[0,0],[],[],[],[4],[],[]]
[] 3 -> [[],[],[]]
[2,2] 3 -> [[],[],[2,2]]
So, that's my solution
import Data.List
import Data.Function
f :: [Int] -> Int -> [[Int]]
f ls len = g 0 ls' [] where
ls' = group . sort $ ls
g :: Int -> [[Int]] -> [[Int]] -> [[Int]]
g val [] accum
| len == val = accum
| otherwise = g (val+1) [] (accum ++ [[]])
g val (x:xs) accum
| len == val = accum
| val == head x = g (val+1) xs (accum ++ [x])
| otherwise = g (val+1) (x:xs) (accum ++ [[]])
But query f [] 1000000 works really long, why?
I see we're accumulating over some data structure. I think foldMap. I ask "Which Monoid"? It's some kind of lists of accumulations. Like this
newtype Bunch x = Bunch {bunch :: [x]}
instance Semigroup x => Monoid (Bunch x) where
mempty = Bunch []
mappend (Bunch xss) (Bunch yss) = Bunch (glom xss yss) where
glom [] yss = yss
glom xss [] = xss
glom (xs : xss) (ys : yss) = (xs <> ys) : glom xss yss
Our underlying elements have some associative operator <>, and we can thus apply that operator pointwise to a pair of lists, just like zipWith does, except that when we run out of one of the lists, we don't truncate, rather we just take the other. Note that Bunch is a name I'm introducing for purposes of this answer, but it's not that unusual a thing to want. I'm sure I've used it before and will again.
If we can translate
0 -> Bunch [[0]] -- single 0 in place 0
1 -> Bunch [[],[1]] -- single 1 in place 1
2 -> Bunch [[],[],[2]] -- single 2 in place 2
3 -> Bunch [[],[],[],[3]] -- single 3 in place 3
...
and foldMap across the input, then we'll get the right number of each in each place. There should be no need for an upper bound on the numbers in the input to get a sensible output, as long as you are willing to interpret [] as "the rest is silence". Otherwise, like Procrustes, you can pad or chop to the length you need.
Note, by the way, that when mappend's first argument comes from our translation, we do a bunch of ([]++) operations, a.k.a. ids, then a single ([i]++), a.k.a. (i:), so if foldMap is right-nested (which it is for lists), then we will always be doing cheap operations at the left end of our lists.
Now, as the question works with lists, we might want to introduce the Bunch structure only when it's useful. That's what Control.Newtype is for. We just need to tell it about Bunch.
instance Newtype (Bunch x) [x] where
pack = Bunch
unpack = bunch
And then it's
groupInts :: [Int] -> [[Int]]
groupInts = ala' Bunch foldMap (basis !!) where
basis = ala' Bunch foldMap id [iterate ([]:) [], [[[i]] | i <- [0..]]]
What? Well, without going to town on what ala' is in general, its impact here is as follows:
ala' Bunch foldMap f = bunch . foldMap (Bunch . f)
meaning that, although f is a function to lists, we accumulate as if f were a function to Bunches: the role of ala' is to insert the correct pack and unpack operations to make that just happen.
We need (basis !!) :: Int -> [[Int]] to be our translation. Hence basis :: [[[Int]]] is the list of images of our translation, computed on demand at most once each (i.e., the translation, memoized).
For this basis, observe that we need these two infinite lists
[ [] [ [[0]]
, [[]] , [[1]]
, [[],[]] , [[2]]
, [[],[],[]] , [[3]]
... ...
combined Bunchwise. As both lists have the same length (infinity), I could also have written
basis = zipWith (++) (iterate ([]:) []) [[[i]] | i <- [0..]]
but I thought it was worth observing that this also is an example of Bunch structure.
Of course, it's very nice when something like accumArray hands you exactly the sort of accumulation you need, neatly packaging a bunch of grungy behind-the-scenes mutation. But the general recipe for an accumulation is to think "What's the Monoid?" and "What do I do with each element?". That's what foldMap asks you.
The (++) operator copies the left-hand list. For this reason, adding to the beginning of a list is quite fast, but adding to the end of a list is very slow.
In summary, avoid adding things to the end of a list. Try to always add to the beginning instead. One simple way to do that is to build the list backwards, and then reverse it at the end. A more devious trick is to use "difference lists" (Google it). Another possibility is to use Data.Sequence rather than a list.
The first thing that should be noted is the most obvious way to implement this is use a data structure that allows random access, an array is an obviously choice. Note that you need to add the elements to the array multiple times and somehow "join them".
accumArray is perfect for this.
So we get:
f l i = elems $ accumArray (\l e -> e:l) [] (0,i-1) (map (\e -> (e,e)) l)
And we're good to go (see full code here).
This approach does involve converting the final array back into a list, but that step is very likely faster than say sorting the list, which often involves scanning the list at least a few times for a list of decent size.
Whenever you use ++ you have to recreate the entire list, since lists are immutable.
A simple solution would be to use :, but that builds a reversed list. However that can be fixed using reverse, which results in only building two lists (instead of 1 million in your case).
Your concept of glomming things onto an accumulator is a very useful one, and both MathematicalOrchid and Guvante show how you can use that concept reasonably efficiently. But in this case, there is a simpler approach that is likely also faster. You started with
group . sort $ ls
and this was a very good place to start! You get a list that's almost the one you want, except that you need to fill in some blanks. How can we figure those out? The simplest way, though probably not quite the most efficient, is to work with a list of all the numbers you want to count up to: [0 .. len-1].
So we start with
f ls len = g [0 .. len-1] (group . sort $ ls)
where
?
How do we define g? By pattern matching!
f ls len = g [0 .. len-1] (group . sort $ ls)
where
-- We may or may not have some lists left,
-- but we counted as high as we decided we
-- would
g [] _ = []
-- We have no lists left, so the rest of the
-- numbers are not represented
g ns [] = map (const []) ns
-- This shouldn't be possible, because group
-- doesn't make empty lists.
g _ ([]:_) = error "group isn't working!"
-- Finally, we have some work to do!
g (n:ns) xls#(xl#(x:_):xls')
| n == x = xl : g ns xls'
| otherwise = [] : g ns xls
That was nice, but making the list of numbers isn't free, so you might be wondering how you can optimize it. One method I invite you to try is using your original technique of keeping a separate counter, but following this same sort of structure.
I have a list of lists of strings and I need to format them into a triangle using periods such that each list is on its own line, and the strings of each list are separated by at least one period. Additionally, each character must have dots above and below. This is probably best explained by example:
> dots [["test"], ["hello", "world"], ["some", "random", "words"], ["even", "more", "random", "words"]]
should return
............test.....................
.......hello.......world.............
...some......random......words.......
not....really......random.....anymore
Finally, it should use the least amount of periods possible, i.e. the approach of padding out every word to the length of the maximum word is too wasteful; that is the above example should not return
.....................test........................
..............hello.........world................
.......some..........random........words.........
not...........really........random........anymore
I can easily write the a function that does the put the periods either side to make it into a triangle shape, my problem is with the periods in between words.
I have a function that works as long as the word length is 1, which is obviously pretty useless for the task. Nevertheless, my function dots:
dots :: [[String]] -> [[String]]
dots xss = map dots' xss
where dots' (x:[]) = [x]
dots' (x:xs) = [x] ++ ["."] ++ dots' xs
This is a homework exercise, so hints would be preferred, but I've been trying to do this for hours with no luck.
First you need a function, that adds placeholders to a list like this:
addPlaceholders [ ["test"]
, ["hello", "world"]
, ["some", "random", "words"]
, ["not", "really", "random", "anymore"]
]
==> [ ["" , "" , "" , "test" , "" , "" , "" ]
, ["" , "" , "hello" , "" , "world" , "" , "" ]
, ["" , "some", "" , "random", "" , "words", "" ]
, ["not", "" , "really", "" , "random", "" , "anymore"]
]
Now you need to fill these "" with dots. So you can write an auxiliary function, that adds dots to a list:
addDots ["test", "", "random", ""]
==> ["test..","......","random","......"]
and then fill is simply
fill = transpose . map addDots . transpose
And your function is just
triangle = map concat . fill . addPlaceholders
First, some terminology: for a given row, e.g. ["some", "more", "random", "words"], we will call a given word's index in the row its "logical column". Thus "more" has logical column 1 in that row; and "words" has logical column 3. Once we have chosen a position for each word, we will also have a "physical column" that says how many characters (dots or other word characters) should appear before it when rendering the row.
Let's make a simplifying assumption (the problem is hard enough even simplified): in the final layout, a word at row r, logical column c must be in between the words at row r+1, logical columns c and c+1.
One idea for tackling this problem is to add a third kind of column, let's call it a "checkerboard column", as an intermediate step. Rows an even number of steps from the bottom will have all their words in even checkerboard columns, and rows an odd number of steps from the bottom will have all their words in odd checkerboard columns. One can then choose a width for each checkerboard column, and set the physical column of a word to be the sum of the widths of the checkerboard columns smaller than it.
However, this has a slight problem; consider this checkerboard, where I've explicitly marked out the checkerboard column boundaries:
| | |aa| | |
| | b| |c| |
|d| |e | |f|
g| |hh| |i| |j
Because we have chosen a width for each checkerboard column, words in different checkerboard columns can never overlap. This rules out solutions like the following one, which are slightly narrower:
aa
b c
d e f
g hh i j
Note that aa and hh overlap -- though they are not on adjacent rows, so this is okay.
Another solution is to lay out the words in this order:
4
3 7
2 6 9
1 5 8 10
When laying out a given word, we can then simply choose the smallest physical column for it that doesn't violate the rules by looking at the position and length of the words above/left and below/left of it (which will already have been calculated). I have an implementation of this algorithm, which I will add to this answer in a few days (per the site guidelines about homework), but this hint should be enough for you to reproduce something very like it yourself. The interesting algorithmic bit of my implementation is a ten-line function of type Map (Row, LogicalColumn) String -> Map (Row, PhysicalColumn) String, and I recommend you make an attempt at a similarly typed function. It should be possible to do this with a clever traversal of the input lists directly (hence eliminating any map indexing costs), but I couldn't quite wrap my head around it. We can prove by induction (where the variable we are inducting on is the order we lay out words) that this approach produces the solution with minimal width.
As promised, the code I came up with:
import Control.Applicative
import Data.List
import Data.Map hiding (empty, map)
import Data.Ord
type Row = Int
type PhysicalColumn = Int
type LogicalColumn = Int
layout :: Map (Row, LogicalColumn) [a] -> Map (Row, PhysicalColumn) [a]
layout m = munge answer where
answer = mapWithKey positionFor m
positionFor (r, c) as = maximumBy (comparing snd) . concat $
[ [(as, 0)]
, addLength as <$> lookup (r+1, c ) answer
, addLength as <$> lookup (r-1, c-1) answer
]
addLength as (v, p) = (as, p + length v)
lookup k m = maybe empty pure (Data.Map.lookup k m)
munge = fromAscList . map (\((r, _), (w, c)) -> ((r, c), w)) . toAscList
parse :: String -> Map (Row, LogicalColumn) String
parse = fromList
. enumerate
. map words
. lines
enumerate :: [[a]] -> [((Row, LogicalColumn), a)]
enumerate xss = concat . zipWith (\i xs -> [((i, j), x) | (j, x) <- xs]) [0..] . map (zip [0..]) $ xss
groups :: Eq b => (a -> (b, c)) -> [a] -> [(b, [c])]
groups f
= map (\pairs -> (fst . head $ pairs, map snd pairs))
. groupBy ((==) `on` fst)
. map f
flatten :: Map (Int, Int) [a] -> [(Int, [(Int, [a])])]
flatten
= map (\(r, pairs) -> (r, map (concat <$>) (groups id pairs)))
. groups (\((r, c), a) -> (r, (c, a)))
. toAscList
pad :: a -> [(Int, [a])] -> [a]
pad blank = go 0 where
go n ((target, v):rest) = replicate (target-n) blank ++ v ++ go (target+length v) rest
go _ [] = []
pprint = unlines . map (pad ' ' . snd) . flatten
allTogetherNow = putStr . pprint . layout . parse
I would approach the problem as follows.
First, neglect the length of each word. Think of a n by m chessboard, and place each word in a square so that words end up only in black squares. Now, the number of rows n is the number of lists of words you have. Figure out what m should be.
You may wish to "center" the allocation in each row, so to obtain a triangle in the end.
Then, consider word length. How wide (in characters) should each of the m column be? For each column, compute its width. Fill each square with dots so to reach the intended width.
I make no claim this is the simpler approach -- it's just the first one that came up to me :)
I thought I was smooth sailing in my Haskell studies, until...
I have a [[Int]]
tiles = [[1,0,0]
,[0,1,0]
,[0,1,0]
]
and a data type:
data Coord = Coord
{ x :: Int
, y :: Int
} deriving (Eq)
Based on the input tiles, I've been trying to output a [Coord], such that a Coord is only generated when the value of tiles is 1, and the Coord will store it's position in the 2d list:
blackBox :: [[Int]] -> [Coord]
blackBox tiles = <magic>
-- given the above example I would expect:
-- [(Coord 0 0),(Coord 1 1),(Coord 1 2)]
I have tried things like first converting [[Int]] to a [Int], via:
foldTiles :: [[Int]] -> [Int]
foldTiles tiles = foldr (++) [] tiles
but after that I'm not really sure how to pass the indices along. I suppose if I could map over the "folded tiles", outputting a tuple (value, index), I could easily figure out the rest.
update In case anyone's interested, I got it working and here is a demo of it (with source code and link to GitHub)! I will have to take more time to understand each of the answers as this is my first time programming a game using FP. Thanks a lot!
http://kennycason.com/posts/2013-10-10-haskell-sdl-gameboy-boxxle.html
This is a place where list comprehensions shine.
blackBox tiles =
[Coord x y -- generate a Coord pair
| (y, row) <- enumerate tiles -- for each row with its coordinate
, (x, tile) <- enumerate row -- for each tile in the row (with coordinate)
, tile == 1] -- if the tile is 1
Or you could go for the equivalent do notation (since list is a monad), which requires importing Control.Monad (for guard.)
blackBox tiles = do
(y, row) <- enumerate tiles -- for each row with its coordinate
(x, tile) <- enumerate row -- for each tile in the row (with coordinate)
guard (tile == 1) -- as long as the tile is 1
return (Coord x y) -- return a coord pair
To aid with understanding, this latter function works like the following Python function.
def black_box(tiles):
for y, row in enumerate(tiles):
for x, tile in enumerate(row):
if tile == 1:
yield Coord(x, y)
do notation for the list monad is incredibly handy for processing lists, I think, so it's worth wrapping your head around!
In both of these examples I have used the definition
enumerate = zip [0..]
Here's a simple solution (not guarantee that it's viable for tiles of size 10000x10000, that's something for you to check ;)
The approach is, as usual in Haskell, a top-down development. You think: what should blackBox do? For every row of tiles it should collect the Coords of the tiles with 1 for that row, and concatenate them.
This gives you another function, blackBoxRow, for rows only. What should it do? Remove zeros from the row, and wrap the rest in Coords, so there's filter and then map. Also you want to keep the row and column numbers, so you map tiles joined with their respective coordinates.
This gives you:
tiles :: [[Int]]
tiles = [[1,0,0]
,[0,1,0]
,[0,1,0]
]
data Coord = Coord {
x :: Int
,y :: Int
} deriving (Eq, Show)
blackBox :: [[Int]] -> [Coord]
blackBox tiles2d = concat (map blackBoxRow (zip [0..] tiles2d))
blackBoxRow :: (Int, [Int]) -> [Coord]
blackBoxRow (row, tiles1d) = map toCoord $ filter pickOnes (zip [0..] tiles1d) where
pickOnes (_, value) = value == 1
toCoord (col, _) = Coord {x=col, y=row}
main = print $ blackBox tiles
Results in:
~> runhaskell t.hs
[Coord {x = 0, y = 0},Coord {x = 1, y = 1},Coord {x = 1, y = 2}]
The way I see it, you could put your 2D list through a series of transformations. The first one we'll need is one that can replace the 1 in your list with something more useful, such as its row:
assignRow :: Int -> [Int] -> [Int]
assignRow n xs = map (\x -> if x == 1 then n else x) xs
We can now use zipWith and [1..] to perform the first step:
assignRows :: [[Int]] -> [[Int]]
assignRows matrix = zipWith assignRow [1..] matrix
What's handy about this is that it'll work even if the matrix isn't square, and it terminates as soon as the matrix does.
Next we need to assign the column number, and here I'll do a few steps at once. This makes the tuples of the coordinates, but there are invalid ones where r == 0 (this is why I used [1..], otherwise, you'll lose the first row), so we filter them out. Next, we uncurry Coord to make a function that takes a tuple instead, and then we use flip on it, then map this thing over the list of tuples.
assignCol :: [Int] -> [Coord]
assignCol xs = map (uncurry (flip Coord)) $ filter (\(c, r) -> r /= 0) $ zip [1..] xs
And we can build our assignCols:
assignCols :: [[Int]] -> [Coord]
assignCols matrix = concatMap assignCol matrix
which allows us to build the final function
assignCoords :: [[Int]] -> [Coord]
assignCoords = assignCols . assignRows
You could compress this quite a bit with some eta reduction, too.
If you want 0-indexed coordinates, I'll leave you to modify this solution to do so.
Quick and dirty solution:
import Data.Maybe (mapMaybe)
data Coord = Coord {
x :: Int
,y :: Int
} deriving (Eq, Show)
blackBox :: [[Int]] -> [Coord]
blackBox = concatMap (\(y, xks) -> mapMaybe (toMaybeCoord y) xks)
. zip [0..] . map (zip [0..])
where
toMaybeCoord :: Int -> (Int, Int) -> Maybe Coord
toMaybeCoord y (x, k) = if k == 1
then Just (Coord x y)
else Nothing
The zips pair the the tile values (which I am referring to as k) with the x and y coordinates (we are dealing with lists, so we have to add the indices if we need them). mapMaybe is convenient so that we can map (in order to construct the Coords) and filter (to remove the zero tiles) in a single step. concatMap also does two things here: it maps a function (the anonymous function within the parentheses) generating a list of lists and then flattens it. Be sure to check the types of the intermediate functions and results to get a clearer picture of the transformations.
Here it is, using list comprehensions.
blackBox :: [[Integer]] -> [Coord]
blackBox ts = [Coord x y | (t,y) <- zip ts [0..], (e,x) <- zip t [0..], e == 1]
As long as we're collecting answers, here's another:
blackBox :: [[Int]] -> [Coord]
blackBox ts = map (uncurry Coord) xsAndYs
where
xsAndYs = concat $ zipWith applyYs [0..] x1s
applyYs i = map (flip (,) i)
x1s = map (map fst . filter ((==1) . snd)) xs
xs = map (zip [0..]) ts
Explanation:
This assigns the x indexes within each row:
xs = map (zip [0..]) ts
Then I filter each row to keep only the elements with a 1, and then I drop the 1 (since it's no longer useful):
x1s = map (map fst . filter ((==1) . snd)) xs
Which results in something of type [[Int]], which are the rows with xs where 1s used to be. Then I map the ys within each row, flipping the pairs so I'm left with (x,y) instead of (y,x). As a final step, I flatten the rows into a single list, since I don't need to keep them separate anymore:
xsAndYs = concat $ zipWith applyYs [0..] x1s
applyYs i = map (flip (,) i)
Finally I convert each element by mapping Coord over it. uncurry is necessary because Coord doesn't take a tuple as argument.