Haskell: Assigning unique char to matrix values if x > 0 - haskell

So my goal for the program is for it to receive an Int matrix for input, and program converts all numbers > 0 to a unique sequential char, while 0's convert into a '_' (doesn't matter, just any character not in the sequence).
eg.
main> matrixGroupings [[0,2,1],[2,2,0],[[0,0,2]]
[["_ab"],["cd_"],["__e"]]
The best I've been able to achieve is
[["_aa"],["aa_"],["__a"]]
using:
matrixGroupings xss = map (map (\x -> if x > 0 then 'a' else '_')) xss
As far as I can tell, the issue I'm having is getting the program to remember what its last value was, so that when the value check is > 0, it picks the next char in line. I can't for the life of me figure out how to do this though.
Any help would be appreciated.

Your problem is an instance of an ancient art: labelling of various structures with a stream of
labels. It dates back at least to Chris Okasaki, and my favourite treatment is by Jeremy
Gibbons.
As you can see from these two examples, there is some variety to the way a structure may be
labelled. But in this present case, I suppose the most straightforward way will do. And in Haskell
it would be really short. Let us dive in.
The recipe is this:
Define a polymorphic type for your matrices. It must be such that a matrix of numbers and a
matrix of characters are both rightful members.
Provide an instance of Traversable class. It may in many cases be derived automagically.
Pick a monad to your liking. One simple choice is State. (Actually, that is the only choice I
can think of.)
Create an action in this monad that takes a number to a character.
Traverse a matrix with this action.
Let's cook!
A type may be as simple as this:
newtype Matrix a = Matrix [[a]] deriving Show
It is entirely possible that the inner lists will be of unequal length — this type does not
protect us from making a "ragged" matrix. This is poor design. But I am going to skim over
it for now. Haskell provides an endless depth for perfection. This type is good enough for
our needs here.
We can immediately define an example of a matrix:
example :: Matrix Int
example = Matrix [[0,2,1],[2,2,0],[0,0,2]]
How hard is it to define a Traversable? 0 hard.
{-# language DeriveTraversable #-}
...
newtype Matrix a = Matrix [[a]] deriving (Show, Functor, Foldable, Traversable)
Presto.
Where do we get labels from? It is a side effect. The function reaches somewhere, takes a
stream of labels, takes the head, and puts the tail back in the extra-dimensional pocket. A
monad that can do this is State.
It works like this:
label :: Int -> State String Char
label 0 = return '_'
label x = do
ls <- get
case ls of
[ ] -> error "No more labels!"
(l: ls') -> do
put ls'
return l
I hope the code explains itself. When a function "creates" a monadic value, we call it
"effectful", or an "action" in a given monad. For instance, print is an action that,
well, prints stuff. Which is an effect. label is also an action, though in a different
monad. Compare and see for youself.
Now we are ready to cook a solution:
matrixGroupings m = evalState (traverse label m) ['a'..'z']
This is it.
λ matrixGroupings example
Matrix ["_ab","cd_","__e"]
Bon appetit!
P.S. I took all glory from you, it is unfair. To make things fun again, I challenge you for an exercise: can you define a Traversable instance that labels a matrix in another order — by columns first, then rows?

Related

How do we overcome the compile time and runtime gap when programming in a Dependently Typed Language?

I'm told that in dependent type system, "types" and "values" is mixed, and we can treat both of them as "terms" instead.
But there is something I can't understand: in a strongly typed programming language without Dependent Type (like Haskell), Types is decided (infered or checked) at compile time, but values is decided (computed or inputed) at runtime.
I think there must be a gap between these two stages. Just think that if a value is interactively read from STDIN, how can we reference this value in a type which must be decided AOT?
e.g. There is a natural number n and a list of natural number xs (which contains n elements) which I need to read from STDIN, how can I put them into a data structure Vect n Nat?
Suppose we input n :: Int at runtime from STDIN. We then read n strings, and store them into vn :: Vect n String (pretend for the moment this can be done).
Similarly, we can read m :: Int and vm :: Vect m String. Finally, we concatenate the two vectors: vn ++ vm (simplifying a bit here). This can be type checked, and will have type Vect (n+m) String.
Now it is true that the type checker runs at compile time, before the values n,m are known, and also before vn,vm are known. But this does not matter: we can still reason symbolically on the unknowns n,m and argue that vn ++ vm has that type, involving n+m, even if we do not yet know what n+m actually is.
It is not that different from doing math, where we manipulate symbolic expressions involving unknown variables according to some rules, even if we do not know the values of the variables. We don't need to know what number is n to see that n+n = 2*n.
Similarly, the type checker can type check
-- pseudocode
readNStrings :: (n :: Int) -> IO (Vect n String)
readNStrings O = return Vect.empty
readNStrings (S p) = do
s <- getLine
vp <- readNStrings p
return (Vect.cons s vp)
(Well, actually some more help from the programmer could be needed to typecheck this, since it involves dependent matching and recursion. But I'll neglect this.)
Importantly, the type checker can check that without knowing what n is.
Note that the same issue actually already arises with polymorphic functions.
fst :: forall a b. (a, b) -> a
fst (x, y) = x
test1 = fst # Int # Float (2, 3.5)
test2 = fst # String # Bool ("hi!", True)
...
One might wonder "how can the typechecker check fst without knowing what types a and b will be at runtime?". Again, by reasoning symbolically.
With type arguments this is arguably more obvious since we usually run the programs after type erasure, unlike value parameters like our n :: Int above, which can not be erased. Still, there is some similarity between universally quantifying over types or over Int.
It seems to me that there are two questions here:
Given that some values are unknown during compile-time (e.g., values read from STDIN), how can we make use of them in types? (Note that chi has already given an excellent answer to this.)
Some operations (e.g., getLine) seem to make absolutely no sense at compile-time; how could we possibly talk about them in types?
The answer to (1), as chi has said, is symbolic or abstract reasoning. You can read in a number n, and then have a procedure that builds a Vect n Nat by reading from the command line n times, making use of arithmetic properties such as the fact that 1+(n-1) = n for nonzero natural numbers.
The answer to (2) is a bit more subtle. Naively, you might want to say "this function returns a vector of length n, where n is read from the command line". There are two types you might try to give this (apologies if I'm getting Haskell notation wrong)
unsafePerformIO (do n <- getLine; return (IO (Vect (read n :: Int) Nat)))
or (in pseudo-Coq notation, since I'm not sure what Haskell's notation for existential types is)
IO (exists n, Vect n Nat)
These two types can actually both be made sense of, and say different things. The first type, to me, says "at compile time, read n from the command line, and return a function which, at runtime, gives a vector of length n by performing IO". The second type says "at runtime, perform IO to get a natural number n and a vector of length n".
The way I like looking at this is that all side effects (other than, perhaps, non-termination) are monad transformers, and there is only one monad: the "real-world" monad. Monad transformers work just as well at the type level as at the term level; the one thing which is special is run :: M a -> a which executes the monad (or stack of monad transformers) in the "real world". There are two points in time at which you can invoke run: one is at compile time, where you invoke any instance of run which shows up at the type level. Another is at runtime, where you invoke any instance of run which shows up at the value level. Note that run only makes sense if you specify an evaluation order; if your language does not specify whether it is call-by-value or call-by-name (or call-by-push-value or call-by-need or call-by-something-else), you can get incoherencies when you try to compute a type.

Using different Ordering for Sets

I was reading a Chapter 2 of Purely Functional Data Structures, which talks about unordered sets implemented as binary search trees. The code is written in ML, and ends up showing a signature ORDERED and a functor UnbalancedSet(Element: ORDERED): SET. Coming from more of a C++ background, this makes sense to me; custom comparison function objects form part of the type and can be passed in at construction time, and this seems fairly analogous to the ML functor way of doing things.
When it comes to Haskell, it seems the behavior depends only on the Ord instance, so if I wanted to have a set that had its order reversed, it seems like I'd have to use a newtype instance, e.g.
newtype ReverseInt = ReverseInt Int deriving (Eq, Show)
instance Ord ReverseInt where
compare (ReverseInt a) (ReverseInt b)
| a == b = EQ
| a < b = GT
| a > b = LT
which I could then use in a set:
let x = Set.fromList $ map ReverseInt [1..5]
Is there any better way of doing this sort of thing that doesn't resort to using newtype to create a different Ord instance?
No, this is really the way to go. Yes, having a newtype is sometimes annoying but you get some big benefits:
When you see a Set a and you know a, you immediately know what type of comparison it uses (sort of the same way that purity makes code more readable by not making you have to trace execution). You don't have to know where that Set a comes from.
For many cases, you can coerce your way through multiple newtypes at once. For example, I can turn xs = [1,2,3] :: Int into ys = [ReverseInt 1, ReverseInt 2, ReverseInt 3] :: [ReverseInt] just using ys = coerce xs :: [ReverseInt]. Unfortunately, that isn't the case for Set (and it shouldn't - you'd need the coercion function to be monotonic to not screw up the data structure invariants, and there is not yet a way to express that in the type system).
newtypes end up being more composable than you expect. For example, the ReverseInt type you made already exists in a form that generalizes to reversing any type with an Ord constraint: it is called Down. To be explicit, you could use Down Int instead of ReversedInt, and you get the instance you wrote out for free!
Of course, if you still feel very strongly about this, nothing is stopping you from writing your version of Set which has to have a field which is the comparison function it uses. Something like
data Set a = Set { comparisionKey :: a -> a -> Ordering
, ...
}
Then, every time you make a Set, you would have to pass in the comparison key.

How to make a custom Attoparsec parser combinator that returns a Vector instead of a list?

{-# LANGUAGE OverloadedStrings #-}
import Data.Attoparsec.Text
import Control.Applicative(many)
import Data.Word
parseManyNumbers :: Parser [Int] -- I'd like many to return a Vector instead
parseManyNumbers = many (decimal <* skipSpace)
main :: IO ()
main = print $ parseOnly parseManyNumbers "131 45 68 214"
The above is just an example, but I need to parse a large amount of primitive values in Haskell and need to use arrays instead of lists. This is something that possible in the F#'s Fparsec, so I've went as far as looking at Attoparsec's source, but I can't figure out a way to do it. In fact, I can't figure out where many from Control.Applicative is defined in the base Haskell library. I thought it would be there as that is where documentation on Hackage points to, but no such luck.
Also, I am having trouble deciding what data structure to use here as I can't find something as convenient as a resizable array in Haskell, but I would rather not use inefficient tree based structures.
An option to me would be to skip Attoparsec and implement an entire parser inside the ST monad, but I would rather avoid it except as a very last resort.
There is a growable vector implementation in Haskell, which is based on the great AMT algorithm: "persistent-vector". Unfortunately, the library isn't that much known in the community so far. However to give you a clue about the performance of the algorithm, I'll say that it is the algorithm that drives the standard vector implementations in Scala and Clojure.
I suggest you implement your parser around that data-structure under the influence of the list-specialized implementations. Here the functions are, btw:
-- | One or more.
some :: f a -> f [a]
some v = some_v
where
many_v = some_v <|> pure []
some_v = (fmap (:) v) <*> many_v
-- | Zero or more.
many :: f a -> f [a]
many v = many_v
where
many_v = some_v <|> pure []
some_v = (fmap (:) v) <*> many_v
Some ideas:
Data Structures
I think the most practical data structure to use for the list of Ints is something like [Vector Int]. If each component Vector is sufficiently long (i.e. has length 1k) you'll get good space economy. You'll have
to write your own "list operations" to traverse it, but you'll avoid re-copying data that you would have to perform to return the data in a single Vector Int.
Also consider using a Dequeue instead of a list.
Stateful Parsing
Unlike Parsec, Attoparsec does not provide for user state. However, you
might be able to make use of the runScanner function (link):
runScanner :: s -> (s -> Word8 -> Maybe s) -> Parser (ByteString, s)
(It also returns the parsed ByteString which in your case may be problematic since it will be very large. Perhaps you can write an alternate version which doesn't do this.)
Using unsafeFreeze and unsafeThaw you can incrementally fill in a Vector. Your s data structure might look
something like:
data MyState = MyState
{ inNumber :: Bool -- True if seen a digit
, val :: Int -- value of int being parsed
, vecs :: [ Vector Int ] -- past parsed vectors
, v :: Vector Int -- current vector we are filling
, vsize :: Int -- number of items filled in current vector
}
Maybe instead of a [Vector Int] you use a Dequeue (Vector Int).
I imagine, however, that this approach will be slow since your parsing function will get called for every single character.
Represent the list as a single token
Parsec can be used to parse a stream of tokens, so how about writing
your own tokenizer and letting Parsec create the AST.
The key idea is to represent these large sequences of Ints as a single token. This gives you a lot more latitude in how you parse them.
Defer Conversion
Instead of converting the numbers to Ints at parse time, just have parseManyNumbers return a ByteString and defer the conversion until
you actually need the values. This much enable you to avoid reifying
the values as an actual list.
Vectors are arrays, under the hood. The tricky thing about arrays is that they are fixed-length. You pre-allocate an array of a certain length, and the only way of extending it is to copy the elements into a larger array.
This makes linked lists simply better at representing variable-length sequences. (It's also why list implementations in imperative languages amortise the cost of copying by allocating arrays with extra space and copying only when the space runs out.) If you don't know in advance how many elements there are going to be, your best bet is to use a list (and perhaps copy the list into a Vector afterwards using fromList, if you need to). That's why many returns a list: it runs the parser as many times as it can with no prior knowledge of how many that'll be.
On the other hand, if you happen to know how many numbers you're parsing, then a Vector could be more efficient. Perhaps you know a priori that there are always n numbers, or perhaps the protocol specifies before the start of the sequence how many numbers there'll be. Then you can use replicateM to allocate and populate the vector efficiently.

Haskell--Manipulating data within a tuple

I'm attempting to simulate a checkers game using haskell. I am given a 4-tuple named, checkersState, that I would like to manipulate with a few different functions. So far, I have a function, oneMove, that receives input from checkerState and should return a tuple of the modified data:
The Input Tuple:
(
3600,
"",
[
"----------",
"------r---",
"----------",
"----------",
"---r-r----",
"------r---",
"---w---w-w",
"----------",
"----------",
"------w---"
],
(
49
,
43
)
)
So far I have something similar to below defining my function but am unsure how to access the individual members within the tuple checkerState. This method will take a time, array of captured pieces, board, and move to make, and return a time, array of captured pieces, and board. Currently, I would like to modify the time (INT) in the tuple depending on the state of the board:
onemove :: (Int,[Char],[[Char]],(Int,Int)) -> (Int,[Char],[[Char]])
Thanks in advance!
You can use pattern-matching to pull out the elements, do whatever changes need to be made, and pack them back into a tuple. For example, if you wanted to increment the first value, you could:
onemove (a,b,c,d) = (a + 1,b,c,d)
If you find yourself doing this a lot, you might reconsider using a tuple and instead use a data type:
data CheckersState = CheckersState { time :: Int -- field names are just
, steps :: [Char] -- guesses; change them
, board :: [[Char]] -- to something that
, pos :: (Int, Int) -- makes sense
} deriving (Eq, Read, Show)
Then you can update it with a much more convenient syntax:
onemove state = state { time = time state + 1 }
If you want to stick with tuples and you happen to be using lenses, there’s another easy way to update your tuple:
onemove = over _1 (+1)
Or if you’re using lenses and your own data type (with an appropriately-defined accessor like the one provided), you can do something similar:
_time :: Lens' CheckersState Int
_time f state = (\newTime -> state { time = newTime }) <$> f (time state)
onemove = over _time (+1)
So there’s plenty of fancy ways to do it. But the most general way is to use pattern-matching.
As icktoofay is saying, using tuples is a code smell, and records with named components is way better.
Also, using Char (and String) is a code smell. To repair it, define a data type that precisely describes what you expect in a cell of the board, like data Colour = None | Red | Black, but see next item.
And, using Lists is also a code smell. You actually want something like type Board = Data.Map.Map Pos Colour or Data.Map.Map Pos (Maybe Colour') with data Colour' = Red | Black.
Oh, and Int is also a code smell. You could define newtype Row = Row Int ; newtype Col = Col Int ; type Pos = (Row,Col). Possibly deriving Num for the newtypes, but it's not clear, e.g., you don't want to multiply row numbers. Perhaps deriving (Eq,Ord,Enum) is enough, with Enum you get pred and succ.
(Ah - this Pos is using a tuple, thus it`s smelly? Well, no, 2-tuples is allowed, sometimes.)
You use pattern matching to decompose the tuple into variables.
onemove (i, c, board, (x, y)) = <do something> (i, c, board)
However, you should define a separate data structure for the board to make your intention clear. I don't know what the meaning of the first two values. See: http://learnyouahaskell.com/making-our-own-types-and-typeclasses

What datatype to choose for a dungeon map

As part of a coding challenge I have to implement a dungeon map.
I have already designed it using Data.Map as a design choice because printing the map was not required and sometimes I had to update an map tile, e.g. when an obstacle was destroyed.
type Dungeon = Map Pos Tile
type Pos = (Int,Int) -- cartesian coordinates
data Tile = Wall | Destroyable | ...
But what if I had to print it too - then I would have to use something like
elaboratePrint . sort $ fromList dungeon where elaboratePrint takes care of the linebreaks and makes nice unicode symbols from the tileset.
Another choice I considered would be a nested list
type Dungeon = [[Tile]]
This would have the disadvantage, that it is hard to update a single element in such a data structure. But printing then would be a simple one liner unlines . map show.
Another structure I considered was Array, but as I am not used to arrays a short glance at the hackage docs - i only found a map function that operated on indexes and one that worked on elements, unless one is willing to work with mutable arrays updating one element is not easy at first glance. And printing an array is also not clear how to do that fast and easily.
So now my question - is there a better data structure for representing a dungeon map that has the property of easy printing and easy updating single elements.
How about an Array? Haskell has real, 2-d arrays.
import Data.Array.IArray -- Immutable Arrays
Now an Array is indexed by any Ix a => a. And luckily, there is an instance (Ix a, Ix b) => Ix (a, b). So we can have
type Dungeon = Array (Integer, Integer) Tile
Now you construct one of these with any of several functions, the simplest to use being
array :: Ix i => (i, i) -> [(i, a)] -> Array i a
So for you,
startDungeon = array ( (0, 0), (100, 100) )
[ ( (x, y), Empty ) | x <- [0..100], y <- [0..100]]
And just substitute 100 and Empty for the appropriate values.
If speed becomes a concern, then it's a simple fix to use MArray and ST. I'd suggest not switching unless speed is actually a real concern here.
To address the pretty printing
import Data.List
import Data.Function
pretty :: Array (Integer, Integer) Tile -> String
pretty = unlines . map show . groupBy ((==) `on` snd.fst) . assoc
And map show can be turned in to however you want to format [Tile] into a row. If you decide that you really want these to be printed in an awesome and efficient manner (Console game maybe) you should look at a proper pretty printing library, like this one.
First — tree-likes such as Data.Map and lists remain the natural data structures for functional languages. Map is a bit of an overkill structure-wise if you only need rectangular maps, but [[Tile]] may actually be pretty fine. It has O(√n) for both random-access and updates, that's not too bad.
In particular, it's better than pure-functional updates of a 2D array (O(n))! So if you need really good performance, there's no way around using mutable arrays. Which isn't necessarily bad though, after all a game is intrinsically concerned with IO and state. What is good about Data.Array, as noted by jozefg, is the ability to use tuples as Ix indexes, so I would go with MArray.
Printing is easy with arrays. You probably only need rectangular parts of the whole map, so I'd just extract such slices with a simple list comprehension
[ [ arrayMap ! (x,y) | x<-[21..38] ] | y<-[37..47] ]
You already know how to print lists.

Resources