How to return a polymorphic type in Haskell based on the results of string parsing?

TL;DR:
How can I write a function which is polymorphic in its return type? I'm working on an exercise where the task is to write a function which is capable of analyzing a String and, depending on its contents, generate either a Vector [Int], Vector [Char] or Vector [String].
Longer version:
Here are a few examples of how the intended function would behave:
The string "1 2\n3 4" would generate a Vector [Int] that's made up of two lists: [1,2] and [3,4].
The string "'t' 'i' 'c'\n't' 'a' 'c'\n't' 'o' 'e'" would generate a Vector [Char] (i.e., made up of the lists "tic", "tac" and "toe").
The string "\"hello\" \"world\"\n\"monad\" \"party\"" would generate a Vector [String] (i.e., ["hello","world"] and ["monad","party"]).
Error-checking/exception handling is not a concern for this particular exercise. At this stage, all testing is done purely, i.e., this isn't in the realm of the IO monad.
What I have so far:
I have a function (and new datatype) which is capable of classifying a string. I also have functions (one for each Int, Char and String) which can convert the string into the necessary Vector.
My question: how can I combine these three conversion functions into a single function?
What I've tried:
It obviously doesn't typecheck if I stuff the three conversion functions into a single function (i.e., using a case..of structure to pattern match on the VectorType of the string).
I tried making a Vectorable class and defining a separate instance for each type; I quickly realized that this approach only works if the functions' arguments vary by type. In our case, the type of the argument doesn't vary (i.e., it's always a String).
My code:
A few comments
Parsing: the mySplitter object and the mySplit function handle the parsing. It's admittedly a crude parser based on the Splitter type and the split function from Data.List.Split.Internals.
Classifying: The classify function is capable of determining the final VectorType based on the string.
Converting: The toVectorNumber, toVectorChar and toVectorString functions are able to convert a string to type Vector [Int], Vector [Char] and Vector [String], respectively.
As a side note, I'm trying out CorePrelude based on a recommendation from a mentor. That's why you'll see me use the generalized versions of the normal Prelude functions.
Code:
import qualified Prelude
import CorePrelude
import Data.Foldable (concat, elem, any)
import Control.Monad (mfilter)
import Text.Read (read)
import Data.Char (isAlpha, isSpace)
import Data.List.Split (split)
import Data.List.Split.Internals (Splitter(..), DelimPolicy(..), CondensePolicy(..), EndPolicy(..), Delimiter(..))
import Data.Vector ()
import qualified Data.Vector as V
data VectorType = Number | Character | TextString deriving (Show)
mySplitter :: [Char] -> Splitter Char
mySplitter elts = Splitter { delimiter = Delimiter [(`elem` elts)]
                           , delimPolicy = Drop
                           , condensePolicy = Condense
                           , initBlankPolicy = DropBlank
                           , finalBlankPolicy = DropBlank }
mySplit :: [Char] -> [Char] -> [[Char]]
mySplit delims = split (mySplitter delims)
classify :: String -> VectorType
classify xs
  | '\"' `elem` cs = TextString
  | hasAlpha cs = Character
  | otherwise = Number
  where
    cs = concat $ split (mySplitter "\n") xs
    hasAlpha = any isAlpha . mfilter (/=' ')
toRows :: [Char] -> [[Char]]
toRows = mySplit "\n"
toVectorChar :: [Char] -> Vector [Char]
toVectorChar = let toChar = concat . mySplit " \'"
               in V.fromList . fmap (toChar) . toRows
toVectorNumber :: [Char] -> Vector [Int]
toVectorNumber = let toNumber = fmap (\x -> read x :: Int) . mySplit " "
                 in V.fromList . fmap toNumber . toRows
toVectorString :: [Char] -> Vector [[Char]]
toVectorString = let toString = mfilter (/= " ") . mySplit "\""
                 in V.fromList . fmap toString . toRows

You can't.
Covariant polymorphism is not supported in Haskell, and wouldn't be useful if it were.
That's basically all there is to answer. Now as to why this is so.
It's no good "returning a polymorphic value" as OO languages so like to do, because the only reason to return any value at all is to use it in other functions. Now, in OO languages you don't have functions but methods that come with the object, so it's quite easy to "return different types": each will have its suitable methods built in, and they can vary per instance. (Whether that's a good idea is another question.)
But in Haskell, the functions come from elsewhere. They don't know about implementation changes for a particular instance, so the only way such functions can safely be defined is to know every possible implementation. But if your return type is really polymorphic, that's not possible, because polymorphism is an "open" concept (it allows new implementation varieties to be added any time later).
Instead, Haskell has a very convenient and totally safe mechanism of describing a closed set of "instances" – you've actually used it yourself already! ADTs.
data PolyVector = NumbersVector (Vector [Int])
                | CharsVector (Vector [Char])
                | StringsVector (Vector [String])
That's the return type you want. The function won't be polymorphic as such, it'll simply return a more versatile type.
If you insist it should be polymorphic
Now... actually, Haskell does have a way to sort-of deal with "polymorphic returns". As in OO when you declare that you return a subclass of a specified class. Well, you can't "return a class" at all in Haskell, you can only return types. But those can be made to express "any instance of...". It's called existential quantification.
{-# LANGUAGE GADTs, FlexibleInstances #-}
data PolyVector' where
  PolyVector :: YourVElemClass e => Vector [e] -> PolyVector'
class YourVElemClass e where
  ...?
instance YourVElemClass Int
instance YourVElemClass Char
instance YourVElemClass String -- String is a type synonym, hence FlexibleInstances
I don't know if that looks intriguing to you. Truth is, it's much more complicated and rather harder to use; you can't just use any of the possible results directly but can only make use of the elements through methods of YourVElemClass. GADTs can in some applications be extremely useful, but these usually involve classes with very deep mathematical motivation. YourVElemClass doesn't seem to have such a motivation, so you'll be much better off with the simple ADT alternative than with existential quantification.
There's a famous rant against existentials by Luke Palmer (note he uses another syntax, existential-specific, which I consider obsolete, as GADTs are strictly more general).
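To make the limitation above concrete, here is a minimal sketch of the existential approach. It is not the answer's code: `AnyRows` and `renderRows` are hypothetical names, plain lists stand in for `Vector` so that only base is needed, and `Show` plays the role of `YourVElemClass`.

```haskell
{-# LANGUAGE GADTs #-}

-- Hypothetical existential wrapper: the element type e is hidden,
-- and only the Show dictionary travels with the value.
data AnyRows where
  AnyRows :: Show e => [[e]] -> AnyRows

-- The only thing a consumer can do with the hidden elements is
-- whatever the class offers, here: show them.
renderRows :: AnyRows -> [String]
renderRows (AnyRows rows) = map show rows
```

Once a value is wrapped in `AnyRows` there is no way to get the `[[Int]]` back as such, which is why the closed ADT is usually the more practical choice here.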

Easy, use a sum type!
data ParsedVector = NumberVector (Vector [Int]) | CharacterVector (Vector [Char]) | TextStringVector (Vector [String]) deriving (Show)
parse :: [Char] -> ParsedVector
parse cs = case classify cs of
  Number -> NumberVector $ toVectorNumber cs
  Character -> CharacterVector $ toVectorChar cs
  TextString -> TextStringVector $ toVectorString cs
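For readers who want to try the sum-type idea without the split and vector packages, the following is a rough, base-only sketch. It is an assumption-laden simplification rather than the asker's code: `ParsedRows`, `parseRows` and `rowCount` are made-up names, plain lists replace `Vector`, and cells are split with `words` and converted with `read`, so it only handles inputs shaped like the three examples in the question (no spaces inside quoted strings, no error handling).

```haskell
import Data.Char (isAlpha)

-- One constructor per possible result type (hypothetical names).
data ParsedRows
  = NumberRows [[Int]]
  | CharacterRows [String]
  | TextRows [[String]]
  deriving (Eq, Show)

-- Classify the input, then convert with the matching element type.
parseRows :: String -> ParsedRows
parseRows input
  | '"' `elem` input = TextRows (cells input)
  | any isAlpha input = CharacterRows (cells input)
  | otherwise = NumberRows (cells input)
  where
    -- Split into lines, then into whitespace-separated cells,
    -- and read each cell at whatever type the caller needs.
    cells :: Read a => String -> [[a]]
    cells = map (map read . words) . lines

-- Consumers recover the concrete type by pattern matching.
rowCount :: ParsedRows -> Int
rowCount parsed = case parsed of
  NumberRows rows -> length rows
  CharacterRows rows -> length rows
  TextRows rows -> length rows
```

The caller always gets a single type back and decides what to do per constructor, which is the closed-world counterpart of a "polymorphic return".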

Related

Haskell: Show instance of IOArray

I'm learning Haskell and trying to use mutable arrays (in particular IOArray). I wrote a pretty printer that has the following type:
disp :: Show a => IOArray Int a -> IO String
I didn't manage to get rid of the IO part because of a subcall to
getBounds :: (MArray a e m, Ix i) => a i e -> m (i, i)
Now I'm trying to use disp to define a Show instance for my IOArray type but the IO gets in the way.
Is it possible to create a Show instance for IOArray ?
An IOArray is not really an array. It's just a reference to an array. Absolutely everything interesting you can do with an IOArray produces an action in IO. Why is that? Suppose you could index into an IOArray in pure code:
(!) :: IOArray Int a -> Int -> a
Consider the following:
f :: IO (Char, Char)
f = do
  ar <- newArray (0 :: Int, 10 :: Int) 'a'
  let x = ar ! 3
  writeArray ar 3 'b'
  let y = ar ! 3
  return (x, y)
What should f produce? One answer might be that it should produce ('a', 'b'), because the third element of ar started out as 'a' and then was changed to 'b'. But that's deeply troubling! How can ar ! 3 have one value at one time and another later? That violates the fundamental idea of referential transparency that purely functional languages are built on. So you just can't do that.
AFAIK, getting rid of a monad in Haskell is in general neither possible nor correct. Some monads (like Maybe and Either) have special functions to unwrap their values, but (over)using them is discouraged. If you have a Haskell value wrapped in a monadic context, you must work with it inside that context rather than unwrapping it. In your case, no function (the methods of Show included) can turn a value inside the IO monad into one outside it. One solution is to use Haskell's rich collection of monadic functions and operators to convert the elements (e.g. Int) to text inside IO; you then have an IO String instead of an IOArray, which in turn you can print out.
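As an illustration of staying inside IO, here is a small sketch using the array package that ships with GHC. The output format of `disp` and the helper name `snapshot` are my own choices, not the asker's code. `disp` builds the string in IO, while `snapshot` uses `freeze` to copy the contents into an immutable `Array`, which does have a `Show` instance.

```haskell
import Data.Array (Array)
import Data.Array.IO (IOArray, freeze, getBounds, getElems)

-- Pretty-print inside IO: reading a mutable array is an effect.
disp :: Show a => IOArray Int a -> IO String
disp arr = do
  bounds <- getBounds arr
  xs <- getElems arr
  return (show bounds ++ " " ++ show xs)

-- Copy the current contents into an immutable array (hypothetical helper).
-- The copy is an ordinary pure value, so show works on it.
snapshot :: IOArray Int a -> IO (Array Int a)
snapshot = freeze
```

Either way the IO cannot be removed from the type; the pure `Show` instance only becomes available after the contents have been read out at one particular moment.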

Parsing to Free Monads

Say I have the following free monad:
data ExampleF a
  = Foo Int a
  | Bar String (Int -> a)
  deriving Functor
type Example = Free ExampleF -- this is the free monad I want to discuss
I know how I can work with this monad, eg. I could write some nice helpers:
foo :: Int -> Example ()
foo i = liftF $ Foo i ()
bar :: String -> Example Int
bar s = liftF $ Bar s id
So I can write programs in haskell like:
fooThenBar :: Example Int
fooThenBar =
  do
    foo 10
    bar "nice"
I know how to print it, interpret it, etc. But what about parsing it?
Would it be possible to write a parser that could parse arbitrary programs like:
foo 12
bar nice
foo 11
foo 42
So I can store them, serialize them, use them in cli programs etc.
The problem I keep running into is that the type of the program depends on which program is being parsed. If the program ends with a foo it's of type Example (); if it ends with a bar it's of type Example Int.
I do not feel like writing parsers for every possible permutation (it's simple here because there are only two possibilities, but imagine we add Baz Int (String -> a), Doo (Int -> a), Moz Int a, Foz String a, .... This gets tedious and error-prone).
Perhaps I'm solving the wrong problem?
Boilerplate
To run the above examples, you need to add this to the beginning of the file:
{-# LANGUAGE DeriveFunctor #-}
import Control.Monad.Free
import Text.ParserCombinators.Parsec
Note: I put up a gist containing this code.
Not every Example value can be represented on the page without reimplementing some portion of Haskell. For example, return putStrLn has a type of Example (String -> IO ()), but I don't think it makes sense to attempt to parse that sort of Example value out of a file.
So let's restrict ourselves to parsing the examples you've given, which consist only of calls to foo and bar sequenced with >> (that is, no variable bindings and no arbitrary computations)*. The Backus-Naur form for our grammar looks approximately like this:
<program> ::= "" | <expr> "\n" <program>
<expr> ::= "foo " <integer> | "bar " <string>
It's straightforward enough to parse our two types of expression...
type Parser = Parsec String ()
int :: Parser Int
int = fmap read (many1 digit)
parseFoo :: Parser (Example ())
parseFoo = string "foo " *> fmap foo int
parseBar :: Parser (Example Int)
parseBar = string "bar " *> fmap bar (many1 alphaNum)
... but how can we give a type to the composition of these two parsers?
parseExpr :: Parser (Example ???)
parseExpr = parseFoo <|> parseBar
parseFoo and parseBar have different types, so we can't compose them with <|> :: Alternative f => f a -> f a -> f a. Moreover, there's no way to know ahead of time which type the program we're given will be: as you point out, the type of the parsed program depends on the value of the input string. "Types depending on values" is called dependent types; Haskell doesn't feature a proper dependent type system, but it comes close enough for us to have a stab at making this example work.
Let's start by forcing the expressions on either side of <|> to have the same type. This involves erasing Example's type parameter using existential quantification.†
data Ex a = forall i. Wrap (a i)
parseExpr :: Parser (Ex Example)
parseExpr = fmap Wrap parseFoo <|> fmap Wrap parseBar
This typechecks, but the parser now returns an Example containing a value of an unknown type. A value of unknown type is of course useless - but we do know something about Example's parameter: it must be either () or Int because those are the return types of parseFoo and parseBar. Programming is about getting knowledge out of your brain and onto the page, so we're going to wrap up the Example value with a bit of GADT evidence which, when unwrapped, will tell you whether a was Int or ().
data Ty a where
IntTy :: Ty Int
UnitTy :: Ty ()
data (a :*: b) i = a i :&: b i
type Sig a b = Ex (a :*: b)
pattern Sig x y = Wrap (x :&: y)
parseExpr :: Parser (Sig Ty Example)
parseExpr = fmap (\x -> Sig UnitTy x) parseFoo <|>
            fmap (\x -> Sig IntTy x) parseBar
Ty is (something like) a runtime "singleton" representative of Example's type parameter. When you pattern match on IntTy, you learn that a ~ Int; when you pattern match on UnitTy you learn that a ~ (). (Information can be made to flow the other way, from types to values, using classes.) :*:, the functor product, pairs up two type constructors ensuring that their parameters are equal; thus, pattern matching on the Ty tells you about its accompanying Example.
Sig is therefore called a dependent pair or sigma type - the type of the second component of the pair depends on the value of the first. This is a common technique: when you erase a type parameter by existential quantification, it usually pays to make it recoverable by bundling up a runtime representative of that parameter.
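The way matching on the witness recovers the hidden type can be seen in isolation with the sketch below. It is a simplified stand-in, not the answer's machinery: `Typed` and `describe` are hypothetical names, the pair is written as a single existential constructor instead of `Ex` and `:*:`, and `Maybe` takes the place of `Example` so that the block needs only base.

```haskell
{-# LANGUAGE ExistentialQuantification, GADTs #-}

-- Runtime witness of the hidden type parameter.
data Ty a where
  IntTy :: Ty Int
  UnitTy :: Ty ()

-- A dependent pair: the witness and the payload share the hidden index i.
data Typed f = forall i. Typed (Ty i) (f i)

-- Matching on the witness tells the type checker what i is,
-- so the payload can be used at its real type in each branch.
describe :: Typed Maybe -> String
describe (Typed IntTy (Just n)) = "Int result: " ++ show (n + 1)
describe (Typed UnitTy (Just ())) = "unit result"
describe (Typed _ Nothing) = "no result"
```

In the `IntTy` branch the compiler accepts `n + 1` because the match has established that the payload is a `Maybe Int`; without the witness the payload would be opaque.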
Note that this use of Sig is equivalent to Either (Example Int) (Example ()) - a sigma type is a sum, after all - but this version scales better when you're summing over a large (or possibly infinite) set.
Now it's easy to build our expression parser into a program parser. We just have to repeatedly apply the expression parser, and then manipulate the dependent pairs in the list.
parseProgram :: Parser (Sig Ty Example)
parseProgram = fmap (foldr1 combine) $ parseExpr `sepBy1` (char '\n')
  where combine (Sig _ val) (Sig ty acc) = Sig ty (val >> acc)
The code I've shown you is not exemplary. It doesn't separate the concerns of parsing and typechecking. In production code I would modularise this design by first parsing the data into an untyped syntax tree - a separate data type which doesn't enforce the typing invariant - then transform that into a typed version by type-checking it. The dependent pair technique would still be necessary to give a type to the output of the type-checker, but it wouldn't be tangled up in the parser.
*If binding is not a requirement, have you thought about using a free applicative to represent your data?
†Ex and :*: are reusable bits of machinery which I lifted from the Hasochism paper
So, I worry that this is the same sort of premature abstraction that you see in object-oriented languages, getting in the way of things. For example, I am not 100% sure that you are using the structure of the free monad -- your helpers for example simply seem to use id and () in a rather boring way, in fact I'm not sure if your Int -> x is ever anything other than either Pure :: Int -> Free ExampleF Int or const (something :: Free ExampleF Int).
The free monad for a functor F can basically be described as a tree whose data is stored in leaves and whose branching factor is controlled by the recursion in each constructor of the functor F. So for example Free Identity has no branching, hence only one leaf, and thus has the same structure as the monad:
data MonoidalFree m x = MF m x deriving (Functor)
instance Monoid m => Monad (MonoidalFree m) where
  return x = MF mempty x
  MF m x >>= my_x = case my_x x of MF n y -> MF (mappend m n) y
In fact Free Identity is isomorphic to MonoidalFree (Sum Integer), the difference is just that instead of MF (Sum 3) "Hello" you see Free . Identity . Free . Identity . Free . Identity $ Pure "Hello" as the means of tracking this integer. On the other hand if you have data E x = L x | R x deriving (Functor) then you get a sort of "path" of Ls and Rs before you hit this one leaf, Free E is going to be isomorphic to MonoidalFree [Bool].
The reason I'm going through this is that when you combine Free with an Integer -> x functor, you get an infinitely branching tree, and when I'm looking through your code to figure out how you're actually using this tree, all I see is that you use the id function with it. As far as I can tell, that restricts the recursion to either have the form Free (Bar "string" Pure) or else Free (Bar "string" (const subExpression)), in which case the system would seem to reduce completely to the MonoidalFree [Either Int String] monad.
(At this point I should pause to ask: Is that correct as far as you know? Was this what was intended?)
Anyway. Aside from my problems with your premature abstraction, the specific problem that you're citing with your monad (you can't tell the difference between () and Int) has a bunch of really complicated solutions, but one really easy one. The really easy solution is to yield a value of type Example (Either () Int): if you have a () you can fmap Left onto it, and if you have an Int you can fmap Right onto it.
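A rough, self-contained sketch of that easy solution follows. To avoid depending on the free and parsec packages it defines a minimal `Free` itself and parses with `words` and `readMaybe`; `Result`, `parseLine`, `parseProgram` and the logging interpreter `run` (which answers every `Bar` with the length of its string) are all illustrative assumptions, not part of the question's code.

```haskell
{-# LANGUAGE DeriveFunctor #-}

import Text.Read (readMaybe)

-- Minimal free monad, standing in for Control.Monad.Free.
data Free f a = Pure a | Free (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a) = Pure (g a)
  fmap g (Free fa) = Free (fmap (fmap g) fa)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure g <*> x = fmap g x
  Free fg <*> x = Free (fmap (<*> x) fg)

instance Functor f => Monad (Free f) where
  Pure a >>= k = k a
  Free fa >>= k = Free (fmap (>>= k) fa)

data ExampleF a = Foo Int a | Bar String (Int -> a) deriving Functor

type Example = Free ExampleF

foo :: Int -> Example ()
foo i = Free (Foo i (Pure ()))

bar :: String -> Example Int
bar s = Free (Bar s Pure)

-- Every statement is given the same result type.
type Result = Either () Int

parseLine :: String -> Maybe (Example Result)
parseLine l = case words l of
  ["foo", n] -> fmap (fmap Left . foo) (readMaybe n)
  ["bar", s] -> Just (fmap Right (bar s))
  _ -> Nothing

-- Sequence all statements; the last one determines the result.
parseProgram :: String -> Maybe (Example Result)
parseProgram src = do
  stmts <- mapM parseLine (lines src)
  if null stmts then Nothing else Just (foldr1 (>>) stmts)

-- Toy interpreter: logs each step, answers Bar with the string length.
run :: Example a -> ([String], a)
run (Pure a) = ([], a)
run (Free (Foo i k)) = let (l, a) = run k in (("foo " ++ show i) : l, a)
run (Free (Bar s k)) = let (l, a) = run (k (length s)) in (("bar " ++ s) : l, a)
```

Because every parsed statement has type `Example Result`, the statements fit in one list and can be chained with `>>`; the price is that the caller has to inspect the `Either` at the end.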
Without a much better understanding of how you're using this thing over TCP/IP we can't recommend a better structure for you than the generic free monads that you seem to be finding -- in particular we'd need to know how you're planning on using the infinite-branching of Int -> x options in practice.

How to store arbitrary values in a recursive structure or how to build an extensible software architecture?

I'm working on a basic UI toolkit and am trying to figure out the overall architecture.
I am considering using WAI's structure for extensibility. A reduced example of the core structure for my UI:
run :: Application -> IO ()
type Application = Event -> UI -> (Picture, UI)
type Middleware = Application -> Application
In WAI, arbitrary values for Middleware are saved in the vault. I think that this is a bad hack for saving arbitrary values, because it isn't transparent, but I can't think of a sufficiently simple structure to replace this vault and give every Middleware a place to save arbitrary values.
I considered recursively storing tuples in tuples:
run :: (Application, x) -> IO ()
type Application = Event -> UI -> (Picture, UI)
type Middleware y x = (Application, x) -> (Application, (y,x))
Or only using lazy lists, to provide a level at which there is no need to separate values (which provides more freedom, but also has more problems):
run :: Application -> IO ()
type Application = [Event -> UI -> (Picture, UI)]
type Middleware = Application -> Application
Actually, I would use a modified lazy list solution. Which other solutions might work?
Note that:
I prefer not to use lens at all.
I know UI -> (Picture, UI) could be defined as State UI Picture .
I'm not aware of a solution regarding monads, transformers or FRP. It would be great to see one.
Lenses provide a general way to reference data type fields so that you can extend or refactor your data set without breaking backwards compatibility. I'll use the lens-family and lens-family-th libraries to illustrate this, since they are lighter dependencies than lens.
Let's begin with a simple record with two fields:
{-# LANGUAGE TemplateHaskell #-}
import Lens.Family2
import Lens.Family2.TH
data Example = Example
    { _int :: Int
    , _str :: String
    }
makeLenses ''Example
-- This creates these lenses:
int :: Lens' Example Int
str :: Lens' Example String
Now you can write Stateful code that references fields of your data structure. You can use Lens.Family2.State.Strict for this purpose:
import Lens.Family2.State.Strict
-- Everything here also works for `StateT Example IO`
example :: State Example Int
example = do
    s <- use str -- Read the `String`
    str .= s ++ "!" -- Set the `String`
    int += 2 -- Modify the `Int`
    zoom int $ do -- This sub-`do` block has type: `State Int Int`
        m <- get
        return (m + 1)
The key thing to note is that I can update my data type, and the above code will still compile. Add a new field to Example and everything will still work:
data Example = Example
    { _int :: Int
    , _str :: String
    , _char :: Char
    }
makeLenses ''Example
int :: Lens' Example Int
str :: Lens' Example String
char :: Lens' Example Char
However, we can actually go a step further and completely refactor our Example type like this:
data Example = Example
    { _example2 :: Example2
    , _char :: Char
    }
data Example2 = Example2
    { _int2 :: Int
    , _str2 :: String
    }
makeLenses ''Example
char :: Lens' Example Char
example2 :: Lens' Example Example2
makeLenses ''Example2
int2 :: Lens' Example2 Int
str2 :: Lens' Example2 String
Do we have to break our old code? No! All we have to do is add the following two lenses to support backwards compatibility:
int :: Lens' Example Int
int = example2 . int2
str :: Lens' Example String
str = example2 . str2
Now all the old code still works without any changes, despite the intrusive refactoring of our Example type.
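The refactoring above can be tried without any lens library using the sketch below. It hand-rolls the van Laarhoven encoding that lens-family also uses; the `view` and `over` helpers and the `bump` function are written here for illustration only, and the lenses are defined by hand instead of being generated by `makeLenses`.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

-- Van Laarhoven lens: works for any Functor.
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

view :: Lens' s a -> s -> a
view l = getConst . l Const

over :: Lens' s a -> (a -> a) -> s -> s
over l g = runIdentity . l (Identity . g)

-- The refactored types.
data Example2 = Example2 { _int2 :: Int, _str2 :: String } deriving (Eq, Show)

data Example = Example { _example2 :: Example2, _char :: Char } deriving (Eq, Show)

example2 :: Lens' Example Example2
example2 k s = fmap (\e -> s { _example2 = e }) (k (_example2 s))

int2 :: Lens' Example2 Int
int2 k s = fmap (\n -> s { _int2 = n }) (k (_int2 s))

str2 :: Lens' Example2 String
str2 k s = fmap (\t -> s { _str2 = t }) (k (_str2 s))

-- Backwards-compatible lenses: plain function composition.
int :: Lens' Example Int
int = example2 . int2

str :: Lens' Example String
str = example2 . str2

-- "Old" client code, written only against int and str.
bump :: Example -> Example
bump = over int (+ 2) . over str (++ "!")
```

`bump` never mentions `Example2`, so it would compile unchanged against either layout of `Example` as long as `int` and `str` keep their types.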
In fact, this works for more than just records. You can do the exact same thing for sum types, too (a.k.a. algebraic data types or enums). For example, suppose we have this type:
data Example3 = A String | B Int
makeTraversals ''Example3
-- This creates these `Traversals'`:
_A :: Traversal' Example3 String
_B :: Traversal' Example3 Int
Many of the things that we did with sum types can similarly be re-expressed in terms of Traversal's. There's a notable exception of pattern matching: it's actually possible to implement pattern matching with totality checking with Traversals, but it's currently verbose.
However, the same point holds: if you express all your sum type operations in terms of Traversal's, then you can greatly refactor your sum type and just update the appropriate Traversal's to preserve backwards compatibility.
Finally: note that the true analog of sum type constructors are Prisms (which let you build values using the constructors in addition to pattern matching). Those are not supported by the lens-family family of libraries, but they are provided by lens and you can implement them yourself using just a profunctors dependency if you want.
Also, if you're wondering what the lens analog of a newtype is, it's an Iso', and that also minimally requires a profunctors dependency.
Also, everything I've said works for referencing multiple fields of recursive types (using Folds). Literally anything you can imagine wanting to reference in a data type in a backwards-compatible way is encompassed by the lens library.

How do I let a function in Haskell depend on the type of its argument?

I tried to write a variation on show that treats strings differently from other instances of Show, by not including the " and returning the string directly. But I don't know how to do it. Pattern matching? Guards? I couldn't find anything about that in any documentation.
Here is what I tried, which doesn't compile:
show_ :: Show a => a -> String
show_ (x :: String) = x
show_ x = show x
If possible, you should wrap your values of type String up in a newtype as @wowofbob suggests.
However, sometimes this isn't feasible, in which case there are two general approaches to making something recognise String specifically.
The first way, which is the natural Haskell approach, is to use a type class just like Show to get different behaviour for different types. So you might write
class Show_ a where
  show_ :: a -> String
and then
instance Show_ String where -- needs FlexibleInstances, since String is a type synonym
  show_ x = x
instance Show_ Int where
  show_ x = show x
and so on for any other type you want to use. This has the disadvantage that you need to explicitly write out Show_ instances for all the types you want.
@AndrewC shows how you can cut each instance down to a single line, but you'll still have to list them all explicitly. You can in theory work around this, as detailed in this question, but it's not pleasant.
The second option is to get true runtime type information with the Typeable class, which is quite short and simple in this particular situation:
import Data.Typeable
[...]
show_ :: (Typeable a, Show a) => a -> String
show_ x =
  case cast x :: Maybe String of
    Just s -> s
    Nothing -> show x
This is not a natural Haskell-ish approach because it means callers can't tell much about what the function will do from the type.
Type classes in general give constrained polymorphism in the sense that the only variations in behaviour of a particular function must come from the variations in the relevant type class instances. The Show_ class gives some indication what it's about from its name, and it might be documented.
However Typeable is a very general class. You are delegating everything to the specific function you are calling; a function with a Typeable constraint might have completely different implementations for lots of different concrete types.
Finally, a further elaboration on the Typeable solution which gets closer to your original code is to use a couple of extensions:
{-# LANGUAGE ViewPatterns, ScopedTypeVariables #-}
import Data.Typeable
[...]
show_ :: (Typeable a, Show a) => a -> String
show_ (cast -> Just (s :: String)) = s
show_ x = show x
The use of ViewPatterns allows us to write the cast inside a pattern, which may fit in more nicely with more complicated examples. In fact we can omit the :: String type constraint because the body of this case forces s to be the result type of show_, i.e. String, anyway. But that's a little obscure so I think it's better to be explicit.
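One consequence of the Typeable approach that is easy to miss: the cast only inspects the type of the whole argument. The sketch below repeats the first Typeable version of `show_` so that it is self-contained and can be loaded into GHCi to check this.

```haskell
import Data.Typeable (Typeable, cast)

-- Same definition as the first Typeable version above.
show_ :: (Typeable a, Show a) => a -> String
show_ x =
  case cast x :: Maybe String of
    Just s -> s
    Nothing -> show x
```

With this definition `show_ "hi"` is `hi` without quotes and `show_ (3 :: Int)` is `3`, but `show_ ["a", "b"]` still goes through the ordinary `show` and keeps the quotes around each element, because a `[String]` is not itself a `String`.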
You can wrap it into newtype and make custom Show instance for it:
newtype PrettyString = PrettyString { toString :: String }
instance Show PrettyString where
  show (PrettyString s) = "$$" ++ s ++ "$$" -- for example
And then use it like below:
main = getLine >>= print . PrettyString
TL;DR:
Copy the prelude's way and use showList_ as a class function to generate instances for lists so you can override the definition for String.
Caveat
For the record, wowofbob's answer using a newtype wrapper is the simple, clean solution I would use in real life, but I felt it was instructive to also look at some of how the Prelude does this.
intercalate commas by default
The way this is done in the prelude is to make the Show class have a function for showing lists, with a default definition that you can override.
import Data.List (intercalate)
I'll use intercalate :: [a] -> [[a]] -> [a] to put commas in between stuff:
ghci> intercalate "_._" ["intercalate","works","like","this"]
"intercalate_._works_._like_._this"
Make a showList_ class function, and default to show and comma-separated lists.
So now the class with the default implementation of the showList function and, importantly, a default show_ implementation that just uses the ordinary show function. To be able to use that, we have to insist that the type is already in the Show typeclass, but that's OK as far as I understand you.
class Show a => Show_ a where
  show_ :: a -> String
  showList_ :: [a] -> String
  show_ = show
  showList_ xs = '[' : intercalate ", " (map show_ xs) ++ "]"
The real Show class uses functions of type String -> String instead of String directly for efficiency reasons, and a precedence argument to control the use of brackets, but I'll skip all that for simplicity.
Automatically make instances for lists
Now we can use the showList function to provide an instance for lists:
instance Show_ a => Show_ [a] where
  show_ xs = showList_ xs
The Show a => superclass makes instances super-easy
Now we come to some instances. Because of our default show_ implementation, we don't need to do any actual programming unless we want to override the default, which we'll do for Char, because String ~ [Char].
instance Show_ Int
instance Show_ Integer
instance Show_ Double
instance Show_ Char where
  show_ c = [c] -- so show_ 'd' = "d". You can put show_ = show if you want "'d'"
  showList_ = id -- just return the string
In practice:
Now that's not much use to hide " from your output in ghci, because the default show function is used for that, but if we use putStrLn, the quotes disappear:
put :: Show_ a => a -> IO ()
put = putStrLn . show_
ghci> show "hello"
"\"hello\""
ghci> show_ "hello"
"hello"
ghci> put "hello"
hello
ghci> put [2,3,4]
[2, 3, 4]
ghci>

Best way to implement ad-hoc polymorphism in Haskell?

I have a polymorphic function like:
convert :: (Show a) => a -> String
convert a = " [label=" ++ (show a) ++ "]"
But sometimes I want to pass it a Data.Map and do some more fancy key value conversion. I know I can't pattern match here because Data.Map is an abstract data type (according to this similar SO question), but I have been unsuccessful using guards to this end, and I'm not sure if ViewPatterns would help here (and would rather avoid them for portability).
This is more what I want:
import qualified Data.Map as M
convert :: (Show a) => a -> String
convert a
  | M.size a /= 0 = processMap2FancyKVString a -- Here's a Data.Map
  | otherwise = " [label=" ++ (show a) ++ "]" -- Probably a string
But this doesn't work because M.size can't take anything other than a Data.Map.
Specifically, I am trying to modify the sl utility function in the Functional Graph Library in order to handle coloring and other attributes of edges in GraphViz output.
Update
I wish I could accept all three answers by TomMD, Antal S-Z, and luqui to this question as they all understood what I really was asking. I would say:
Antal S-Z gave the most 'elegant' solution as applied to the FGL, but it would also require the most rewriting and rethinking to implement in my personal problem.
TomMD gave a great answer that lies somewhere between Antal S-Z's and luqui's in terms of applicability vs. correctness. It also is direct and to the point which I appreciate greatly and why I chose his answer.
luqui gave the best 'get it working quickly' answer which I will probably be using in practice (as I'm a grad student, and this is just some throwaway code to test some ideas). The reason I didn't accept was because TomMD's answer will probably help other people in more general situations better.
With that said, they are all excellent answers and the above classification is a gross simplification. I've also updated the question title to better represent my question. Thanks again for broadening my horizons, everyone!
What you just explained is you want a function that behaves differently based on the type of the input. While you could use a data wrapper, thus closing the function for all time:
data Convertable k a = ConvMap (Map k a) | ConvOther a
convert (ConvMap m) = ...
convert (ConvOther o) = ...
A better way is to use type classes, thus leaving the convert function open and extensible while preventing users from inputting non-sensical combinations (ex: ConvOther M.empty).
class (Show a) => Convertable a where
  convert :: a -> String
instance (Show k, Show a) => Convertable (M.Map k a) where
  convert m = processMap2FancyKVString m
newtype ConvWrapper a = CW a deriving (Show) -- the Show superclass needs this instance
instance (Show a) => Convertable (ConvWrapper a) where
  convert (CW a) = " [label=" ++ (show a) ++ "]"
In this manner you can have the instances you want used for each different data type and every time a new specialization is needed you can extend the definition of convert simply by adding another instance Convertable NewDataType where ....
Some people might frown at the newtype wrapper and suggest an instance like:
instance Convertable a where
  convert ...
But this will require the strongly discouraged overlapping and undecidable instances extensions for very little programmer convenience.
You may not be asking the right thing. I'm going to assume that you either have a graph whose nodes are all Maps or you have a graph whose nodes are all something else. If you need a graph where Maps and non-maps coexist, then there is more to your problem (but this solution will still help). See the end of my answer in that case.
The cleanest answer here is simply to use different convert functions for different types, and have any type that depends on convert take it as an argument (a higher order function).
So in GraphViz (avoiding redesigning this crappy code) I would modify the graphviz function to look like:
graphvizWithLabeler :: (a -> String) -> ... -> String
graphvizWithLabeler labeler ... =
    ...
  where sa = labeler a
And then have graphviz trivially delegate to it:
graphviz = graphvizWithLabeler sl
Then graphviz continues to work as before, and you have graphvizWithLabeler when you need the more powerful version.
So for graphs whose nodes are Maps, use graphvizWithLabeler processMap2FancyKVString, otherwise use graphviz. This decision can be postponed as long as possible by taking relevant things as higher order functions or typeclass methods.
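Since the real graphviz signature is elided above, here is a minimal sketch of the same delegation pattern on a hypothetical toy graph (just a list of numbered nodes). The names Nodes, renderNodes, renderNodesWithLabeler, and mapLabeler are made up for illustration and are not part of any library.

```haskell
import qualified Data.Map as M
import Data.List (intercalate)

-- Hypothetical stand-in for the real graph type: (node id, payload) pairs.
type Nodes a = [(Int, a)]

-- The general version takes the labeling function as an argument.
renderNodesWithLabeler :: (a -> String) -> Nodes a -> String
renderNodesWithLabeler labeler nodes = unlines (map renderNode nodes)
  where
    renderNode (n, a) = show n ++ " [label=" ++ labeler a ++ "]"

-- The default version trivially delegates, labeling with show.
renderNodes :: Show a => Nodes a -> String
renderNodes = renderNodesWithLabeler show

-- A labeler specialised to Maps (illustrative formatting only).
mapLabeler :: (Show k, Show v) => M.Map k v -> String
mapLabeler m = intercalate "," [show k ++ "=" ++ show v | (k, v) <- M.toList m]
```

Callers with Map payloads pass mapLabeler explicitly; everyone else keeps calling renderNodes and never sees the extra parameter.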
If you need to have Maps and other things coexisting in the same graph, then you need to find a single type inhabited by everything a node could be. This is similar to TomMD's suggestion. For example:
data NodeType
    = MapNode (Map.Map Foo Bar)
    | IntNode Int
Parameterized to the level of genericity you need, of course. Then your labeler function should decide what to do in each of those cases.
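A sketch of what such a labeler could look like, assuming String and Int in place of the placeholder Foo and Bar types; labelNode is a hypothetical name and the output format is only illustrative.

```haskell
import qualified Data.Map as Map
import Data.List (intercalate)

-- Foo and Bar from the text are replaced by String and Int for illustration.
data NodeType
  = MapNode (Map.Map String Int)
  | IntNode Int

-- One labeler covers every constructor by pattern matching.
labelNode :: NodeType -> String
labelNode (MapNode m) =
  intercalate "," [k ++ "=" ++ show v | (k, v) <- Map.toList m]
labelNode (IntNode n) = show n
```

Because the set of constructors is closed, the compiler can warn when a new node kind is added but labelNode is not updated.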
A key point to remember is that Haskell has no downcasting. A function of type foo :: a -> a has no way of knowing anything about what was passed to it (within reason, cool your jets pedants). So the function you were trying to write is impossible to express in Haskell. But as you can see, there are other ways to get the job done, and they turn out to be more modular.
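To make the "within reason" caveat concrete, here is a small sketch contrasting a fully parametric function with one whose caller supplies runtime type information through a Typeable constraint. The function describe is a hypothetical example; cast comes from the standard Data.Typeable module.

```haskell
import Data.Typeable (Typeable, cast)

-- Fully parametric: nothing is known about a, so (ignoring non-termination)
-- the only thing this function can do is return its argument.
foo :: a -> a
foo x = x

-- The Typeable constraint shows up in the type, so inspecting the argument
-- is something the caller has explicitly opted into.
describe :: Typeable a => a -> String
describe x = case cast x :: Maybe Int of
  Just n  -> "an Int: " ++ show n
  Nothing -> "something else"
```

The important part is that the extra capability is visible in the signature; a plain a -> a can never do this.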
Did that tell you what you needed to know to accomplish what you wanted?
Your problem isn't actually the same as in that question. In the question you linked to, Derek Thurn had a function which he knew took a Set a, but couldn't pattern-match. In your case, you're writing a function which will take any a which has an instance of Show; you can't tell at runtime what type you're looking at, and can only rely on the functions which are available to any Showable type.
If you want to have a function do different things for different data types, this is known as ad-hoc polymorphism, and is supported in Haskell with type classes like Show. (This is as opposed to parametric polymorphism, which is when you write a function like head (x:_) = x which has type head :: [a] -> a; the unconstrained universal a is what makes that parametric instead.)
So to do what you want, you'll have to create your own type class, and instantiate it when you need it. However, it's a little more complicated than usual, because you want to make everything that's part of Show implicitly part of your new type class. This requires some potentially dangerous and probably unnecessarily powerful GHC extensions. Instead, why not simplify things? You can probably figure out the subset of types which you actually need to print in this manner. Once you do that, you can write the code as follows:
{-# LANGUAGE TypeSynonymInstances, FlexibleInstances #-}
module GraphvizTypeclass where
import qualified Data.Map as M
import Data.Map (Map)
import Data.List (intercalate) -- For output formatting
surround :: String -> String -> String -> String
surround before after = (before ++) . (++ after)
squareBrackets :: String -> String
squareBrackets = surround "[" "]"
quoted :: String -> String
quoted = let replace '"' = "\\\""
             replace c   = [c]
         in surround "\"" "\"" . concatMap replace
class GraphvizLabel a where
    toGVItem :: a -> String
    toGVLabel :: a -> String
    toGVLabel = squareBrackets . ("label=" ++) . toGVItem
-- We only need to print Strings, Ints, Chars, and Maps.
instance GraphvizLabel String where
    toGVItem = quoted
instance GraphvizLabel Int where
    toGVItem = quoted . show
instance GraphvizLabel Char where
    toGVItem = toGVItem . (: []) -- Custom behavior: no single quotes.
instance (GraphvizLabel k, GraphvizLabel v) => GraphvizLabel (Map k v) where
    toGVItem = let kvfn k v = ((toGVItem k ++ "=" ++ toGVItem v) :)
               in intercalate "," . M.foldrWithKey kvfn []
    toGVLabel = squareBrackets . toGVItem
In this setup, everything which we can output to Graphviz is an instance of GraphvizLabel; the toGVItem function quotes things, and toGVLabel puts the whole thing in square brackets for immediate use. (I might have screwed some of the formatting you want up, but that part's just an example.) You then declare what's an instance of GraphvizLabel, and how to turn it into an item. The TypeSynonymInstances and FlexibleInstances flags just let us write instance GraphvizLabel String, that is, an instance for the synonym String, which stands for the concrete type [Char]; they're harmless.
Now, if you really need everything with a Show instance to be an instance of GraphvizLabel as well, there is a way. If you don't really need this, then don't use this code! If you do need to do this, you have to bring to bear the scarily-named UndecidableInstances and OverlappingInstances language extensions (and the less scarily named FlexibleInstances). The reason for this is that you have to assert that everything which is Showable is a GraphvizLabel—but this is hard for the compiler to tell. For instance, if you use this code and write toGVLabel [1,2,3] at the GHCi prompt, you'll get an error, since 1 has type Num a => a, and Char might be an instance of Num! You have to explicitly specify toGVLabel ([1,2,3] :: [Int]) to get it to work. Again, this is probably unnecessarily heavy machinery to bring to bear on your problem. Instead, if you can limit the things you think will be converted to labels, which is very likely, you can just specify those things instead! But if you really want Showability to imply GraphvizLabelability, this is what you need:
{-# LANGUAGE TypeSynonymInstances, FlexibleInstances
           , UndecidableInstances, OverlappingInstances #-}
-- Leave the module declaration, imports, formatting code, and class declaration
-- the same.
instance GraphvizLabel String where
    toGVItem = quoted
instance Show a => GraphvizLabel a where
    toGVItem = quoted . show
instance (GraphvizLabel k, GraphvizLabel v) => GraphvizLabel (Map k v) where
    toGVItem = let kvfn k v = ((toGVItem k ++ "=" ++ toGVItem v) :)
               in intercalate "," . M.foldrWithKey kvfn []
    toGVLabel = squareBrackets . toGVItem
Notice that your specific cases (GraphvizLabel String and GraphvizLabel (Map k v)) stay the same; you've just collapsed the Int and Char cases into the GraphvizLabel a case. Remember, UndecidableInstances means exactly what it says: the compiler cannot tell if instances are checkable or will instead make the typechecker loop! In this case, I am reasonably sure that everything here is in fact decidable (but if anybody notices where I'm wrong, please let me know). Nevertheless, using UndecidableInstances should always be approached with caution.
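On GHC 7.10 and later, the module-wide OverlappingInstances extension is deprecated in favor of per-instance pragmas. The following is a cut-down sketch of the same catch-all idea in that style; it keeps only toGVItem and uses a simplified quoted without escaping, so it is an illustration of the pragmas rather than a drop-in replacement for the code above.

```haskell
{-# LANGUAGE FlexibleInstances, UndecidableInstances #-}

-- Simplified helper: wraps in double quotes, with no escaping.
quoted :: String -> String
quoted s = "\"" ++ s ++ "\""

class GraphvizLabel a where
  toGVItem :: a -> String

-- Catch-all: anything Showable. OVERLAPPABLE lets more specific instances win.
instance {-# OVERLAPPABLE #-} Show a => GraphvizLabel a where
  toGVItem = quoted . show

-- More specific instance for String, chosen over the catch-all so that
-- strings are not passed through show (which would add a second set of quotes).
instance {-# OVERLAPPING #-} GraphvizLabel String where
  toGVItem = quoted
```

As with the original version, a literal such as [1,2,3] still needs an explicit annotation like ([1,2,3] :: [Int]) before the compiler can pick an instance.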
