Haskell: Runtime Data Type Iteration? - haskell

A friend and I have been working on a system for automatically importing C functions into GNU Guile, but need to use Haskell's C parser because no other parser seems sufficient or as accessible (let me know if we're wrong about that).
The trouble comes up when we try to produce Scheme data from the parsed AST. We need to produce text that can be directly imported by Scheme (S-expressions, not M-expressions), so below is an example...
toscm $ Just (SomeType (AnotherType "test" 5) (YetAnother "hello!"))
=> "(Just (SomeType (AnotherType \"test\" 5) (YetAnother \"hello!\")))"
The output of Haskell's C parser has type Either ParseError CTranslUnit. CTranslUnit is a type synonym for CTranslationUnit NodeInfo. The contents of a CTranslationUnit nest many more types, going far deeper than is any fun to handle by hand.
Before realizing that, I tried the following...
class Schemable a where
    toscm :: a -> String
{- and then an (omitted) ridiculously long (1000+ lines) chain
   of horrible instance declarations for every single type
   that can be part of the C AST -}
I figured there really must be a better way to do this, but I haven't been able to find one I understand. GHC.Generics seems like it might have a solution, but the semantics of some of its internal types baffles me.
What should I do??
Update: I've been looking into Scrap Your Boilerplate, and while it definitely looks good for finding substructures, I haven't found a way to use it to generically convert data to strings. Ideally, a value such as Just (CoolType "awesome" (Something "funny")) could be processed by a function that gives me access to the name of each constructor and lets me recurse on the arguments to that constructor, e.g...
data Constructorized = Constructed String [Constructorized]
                     | RawValue <anything>
constructorize :: <anything> -> Constructorized
constructorize a = <???>
toscm :: Constructorized -> String
toscm (Constructed c v) = "(" ++ c ++ " " ++ (intercalate " " (map toscm v)) ++ ")"
toscm (RawValue v) = show v
I guess the train of thought I'm on is: Show seems to be able to recurse into every single type that derives it, no problem. How are the functions for that generated? Shouldn't we be able to make our own Show that generates similar functions, with a slightly different output?
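One way to get the behaviour sketched above is the Data class from Data.Data in base, which is the machinery underneath Scrap Your Boilerplate. The following is a minimal sketch, not a finished converter: the three data types are stand-ins invented to match the example at the top of the question, and it assumes every type in the real AST has a Data instance. toConstr and showConstr give the constructor name, and gmapQ applies a function to each immediate argument.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
module GenericScm where

import Data.Data (Data, gmapQ, showConstr, toConstr)
import Data.Typeable (cast)

-- Stand-in types mirroring the example in the question (hypothetical).
data AnotherType = AnotherType String Int deriving (Data)
data YetAnother  = YetAnother String deriving (Data)
data SomeType    = SomeType AnotherType YetAnother deriving (Data)

-- Render any value with a Data instance as an S-expression.
toscm :: Data a => a -> String
toscm x
  -- Special case: show Strings as quoted literals instead of
  -- walking them as lists of characters.
  | Just s <- cast x = show (s :: String)
  | otherwise =
      case gmapQ toscm x of
        -- No arguments: just the constructor name (or the literal).
        []   -> showConstr (toConstr x)
        -- Otherwise: (Constructor arg1 arg2 ...)
        args -> "(" ++ unwords (showConstr (toConstr x) : args) ++ ")"
```

Lists other than String fall through to the generic case and come out as nested (:) applications, so a real converter would probably want another special case for them.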

Related

Indent all lines in a string

I have some types with custom Show instances defined. They are structured like this:
data TopLevel = TopLevel SndLevel
data SndLevel = SndLevel Int
instance Show SndLevel where
    show (SndLevel i) = "SndLevel: \n\t" ++ (show i)
My Show instance for SndLevel produces nice looking strings that look like the following when they appear in my output:
SndLevel:
    5
I would like to create a Show instance for TopLevel that causes TopLevel (SndLevel 5) to look like this when printed to the terminal:
TopLevel
    SndLevel
        5
I was hoping to find a function built into Haskell that would add "\t" at the front of a string and after each location where "\n" appears in that string.
The best solution I found would go along the lines of the answer in this post. In this case, I would replace "\n" with "\n\t".
I assume I'm not the first person to need Show instances for hierarchically organized data in Haskell, so I would like to know if there is a more idiomatic way to get this done. Is there some better solution to my problem?
p.s: I realize this kind of printing is not the best for the example datatypes I use above. The real datatypes I want to write instances for are product types, and so they don't read well when stretched out on one line. With that in mind, if there is a popular way to deal with this kind of problem without newlines and tabs, that could also solve my problem.
We can solve this by using lines :: String -> [String] and unlines :: [String] -> String to move from a String to a list of Strings and back.
In between, we can use map :: (a -> b) -> [a] -> [b] to prepend a tab to every line (a String is a list of Chars, so a character can be consed onto the front), like:
indent :: String -> String
indent = unlines . map ('\t' :) . lines
For example:
Prelude> indent (show (SndLevel 5))
"\tSndLevel: \n\t\t5\n"
We can use this in our definition of Show for both SndLevel and TopLevel like:
instance Show SndLevel where
    show (SndLevel n) = "SndLevel:" ++ '\n' : indent (show n)
instance Show TopLevel where
    show (TopLevel n) = "TopLevel:" ++ '\n' : indent (show n)
This gives us:
Prelude> print (TopLevel (SndLevel 5))
TopLevel:
    SndLevel:
        5
That being said, Show is usually used to produce a representation of a value that can be pasted back into the compiler/interpreter. The idea of using indentation is not bad at all, but perhaps it makes sense to define your own typeclass for that. You could make that typeclass more efficient by passing along a parameter that keeps track of the indentation level, instead of re-splitting and re-joining the string at every level.
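A minimal sketch of such a typeclass follows. The class name Pretty, the method prettyLines, and the helper indentTo are all made up for illustration; the two data types are the ones from the question. Each instance receives the current depth and hands depth + 1 to its children, so every line is indented exactly once.

```haskell
module PrettyIndent where

data SndLevel = SndLevel Int
data TopLevel = TopLevel SndLevel

-- Hypothetical class: render a value as a list of lines,
-- starting at the given indentation depth.
class Pretty a where
  prettyLines :: Int -> a -> [String]

-- Prefix a single line with one tab per level of depth.
indentTo :: Int -> String -> String
indentTo depth s = replicate depth '\t' ++ s

instance Pretty Int where
  prettyLines depth n = [indentTo depth (show n)]

instance Pretty SndLevel where
  prettyLines depth (SndLevel n) =
    indentTo depth "SndLevel:" : prettyLines (depth + 1) n

instance Pretty TopLevel where
  prettyLines depth (TopLevel s) =
    indentTo depth "TopLevel:" : prettyLines (depth + 1) s

-- Entry point: start at depth 0 and join the lines.
pretty :: Pretty a => a -> String
pretty = unlines . prettyLines 0
```

With this, putStr (pretty (TopLevel (SndLevel 5))) prints the three-line nested layout, and the derived or default Show instances stay available for debugging.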
There are furthermore several "pretty printing" libraries [Reddit] that can print the structure of an object nicely. So instead of "reinventing the wheel", it might be worth using one of the packages listed on the Reddit page.

Extracting Information from Haskell Object

I'm new to Haskell and I'm confused on how to get values out of function results. In my particular case, I am trying to parse Haskell files and see which AST nodes appear on which lines. This is the code I have so far:
import Language.Haskell.Parser
import Language.Haskell.Syntax
getTree :: String -> IO (ParseResult HsModule)
getTree path = do
    file <- readFile path
    let tree = parseModuleWithMode (ParseMode path) file
    return tree
main :: IO ()
main = do
    tree <- getTree "ex.hs"
    -- <do something with the tree other than print it>
    print tree
So on the line where I have the comment, I have a syntax tree as tree. It appears to have type ParseResult HsModule. What I want is just HsModule. I guess what I'm looking for is a function as follows:
extract :: ParseResult a -> a
Or better yet, a general Haskell function
extract :: AnyType a -> a
Maybe I'm missing a major concept about Haskell here?
p.s. I understand that thinking of these things as "Objects" and trying to access "Fields" from them is wrong, but I'd like an explanation of how to deal with this type of thing in general.
Looking for a general function of type
extract :: AnyType a -> a
does indeed show a big misunderstanding about Haskell. Consider the many things AnyType might be, and how you might extract exactly one object from it. What about Maybe Int? You can easily enough convert Just 5 to 5, but what number should you return for Nothing?
Or what if AnyType is [], so that you have [String]? What should be the result of
extract ["help", "i'm", "trapped"]
or of
extract []
?
ParseResult has a similar "problem", in that it uses ParseOk to contain results indicating that everything was fine, and ParseFailed to indicate an error. Your incomplete pattern match successfully gets the result if the parse succeeded, but will crash your program if in fact the parse failed. By using ParseResult, Haskell is encouraging you to consider what you should do if the code you are analyzing did not parse correctly, rather than to just blithely assume it will come out fine.
The definition of ParseResult is:
data ParseResult a = ParseOk a | ParseFailed SrcLoc String
(obtained from source code)
So there are two possibilities: either the parsing succeeded, and you get a value built with the ParseOk constructor, or something went wrong during parsing, in which case you get a value built with the ParseFailed constructor that carries the location of the error and an error message.
So you can define a function:
getData :: ParseResult a -> a
getData (ParseOk x) = x
getData (ParseFailed _ s) = error s
It is better to raise an explicit error in the failure case than to leave the pattern match incomplete, since it is always possible that your compiler/interpreter/analyzer/... is given a Haskell program containing syntax errors.
I just figured out how to do this. It seems that when I was trying to define
extract :: ParseResult a -> a
extract (ParseResult a) = a
I actually needed to use
extract :: ParseResult a -> a
extract (ParseOk a) = a
instead. I'm not 100% sure why this is.
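The reason is that ParseResult is the name of the type, while ParseOk and ParseFailed are its constructors, and only constructors can appear in a pattern. The sketch below handles both constructors instead of just one. To stay self-contained it declares local stand-ins for ParseResult and SrcLoc that mirror the definitions in the library; toEither is a made-up helper name, not something the library provides.

```haskell
module ParseResultDemo where

-- Local stand-ins with the same shape as the library's types.
data SrcLoc = SrcLoc
  { srcFilename :: String
  , srcLine     :: Int
  , srcColumn   :: Int
  }

data ParseResult a = ParseOk a | ParseFailed SrcLoc String

-- Hypothetical helper: turn a parse result into an Either so the
-- caller must decide what to do about failure.
toEither :: ParseResult a -> Either String a
toEither (ParseOk x) = Right x
toEither (ParseFailed loc msg) =
  Left (srcFilename loc ++ ":" ++ show (srcLine loc) ++ ":"
        ++ show (srcColumn loc) ++ ": " ++ msg)
```

In main, a case expression on the result can then print the error message for Left and carry on with the module for Right.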

Getting 'a' value from 'Maybe a' return type in Haskell

I have a Haskell function eval :: WExp -> Memory -> WValue with a bunch of different equations for different cases. For now, knowledge about WExp, Memory, and WValue is not relevant. My problem is that, for a specific case of eval, I am using a lookup function, which takes the parameter of eval (a string in this case) and searches a list of key-value pairs for that string. Note that this lookup function is not the one included in the Prelude; it is self-defined within the .hs file. If the string is found, the value associated with it is returned, but if it is not found, Nothing is returned. Because of the Nothing case, the result type of lookup is actually Maybe a, where a would be a WValue in this case. Because eval would then return a Maybe WValue, the compiler obviously complains that the type is not WValue.
I thought that there might be some kind of general method to extract the a value from any function that returns Maybe a.
Do this
do
    input <- getUserInput
    let result = lookup input structure
    case result of
        Just a -> putStrLn $ "I'm so happy you chose "++show a++"."
        Nothing -> putStrLn $ "So sorry; "++input++" is not a valid option."
Don't do this
do
    input <- getUserInput
    let result = lookup input structure
    case result of
        Just a -> putStrLn $ "I'm so happy you chose "++show a++"."
        Nothing -> error $ input ++ " is not a valid option."
This is bad because your program just goes splat if the user input is wrong.
Really don't do this
There is a function called fromJust that attempts to pull a value out of a Maybe and throws an error if it finds Nothing. It looks like
fromJust :: Maybe a -> a
fromJust (Just a) = a
fromJust Nothing = error "Oops, you goofed up, fool."
This makes it hard to see what went wrong.
And really, really don't do this
But if you want to play with fire, you can try it just for fun. This will attempt to get a value out of a Maybe and crash real hard if it finds Nothing. By "crash real hard" I mean you'll get a segmentation fault if you're lucky, and you'll publish your private keys on the web if you're not.
{-# LANGUAGE GADTs, DataKinds, KindSignatures #-}
{-# OPTIONS_GHC -fno-warn-unused-binds #-}
module Unsafe.FromJust (unsafeFromJust) where
-- Clear sign of bad news
import Unsafe.Coerce (unsafeCoerce)
-- This creates a "closed kind" with types
-- 'JustType and 'NothingType. You could just
-- define datatypes called JustType and NothingType,
-- but this makes the intent clearer.
data MaybeType = JustType | NothingType
data M (t::MaybeType) a where
  -- The order of these constructors must not
  -- be changed, because this type must look,
  -- at runtime, exactly like a Maybe
  N :: M 'NothingType a
  J :: a -> M 'JustType a
-- A safe sort of fromJust for M.
fromJ :: M 'JustType a -> a
fromJ (J a) = a
-- Really, seriously unsafe.
unsafeFromJust :: Maybe a -> a
unsafeFromJust m = fromJ (unsafeCoerce m)
The function you are looking for is maybe, which is defined in the Prelude.
You need to decide what to return if the expression is Nothing. Let's say you want to get the empty string "" for Nothing. Then the following will let you get out of Maybe boxes.
Prelude> maybe "" id (Just "hello")
"hello"
Prelude> maybe "" id (Nothing)
""
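Applied to a lookup, the same idea might look like the sketch below. It uses the Prelude's lookup and an invented Memory type of string/Int pairs purely for illustration (the question's own Memory, WValue and lookup are not shown); lookupOr and describe are hypothetical names. fromMaybe from Data.Maybe is the shorthand for maybe def id.

```haskell
module MaybeDefault where

import Data.Maybe (fromMaybe)

-- Hypothetical stand-in for the question's Memory type.
type Memory = [(String, Int)]

-- Fall back to a default when the key is missing.
lookupOr :: Int -> String -> Memory -> Int
lookupOr def key mem = fromMaybe def (lookup key mem)

-- Handle both outcomes explicitly with `maybe`:
-- first argument for Nothing, second for the value inside Just.
describe :: String -> Memory -> String
describe key mem =
  maybe (key ++ " is unbound")
        (\v -> key ++ " = " ++ show v)
        (lookup key mem)
```

Whether a default value makes sense depends on the evaluator; if an unbound variable is a genuine error, returning Maybe WValue or Either String WValue from eval is the more honest type.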
If you know that the lookup is successful, and that the Maybe a is actually Just a, you can simply pattern match:
let (Just val) = lookup ...
and there you have your val::a out of your Maybe a. Note that this is unsafe code which will ungracefully throw an error if lookup returns a Nothing.
Well, you got yourself into a quagmire because the type of your lookup says that it could fail. Haskell forces you in this case to deal with the possibility that such a failure will occur. This is the case if lookup returns Nothing.
If you are really sure that lookup never fails (maybe because you preprocessed and type-checked the program, or you really trust it :) ) you could use fromJust from Data.Maybe. Note that this is really just a band-aid solution, because fromJust will produce a (Haskell) runtime error on its own if called with Nothing.

Why can't I use the type `Show a => [Something -> a]`?

I have a record type say
data Rec = Rec {
    recNumber :: Int
  , recName :: String
    -- more fields of various types
  }
And I want to write a toString function for Rec :
recToString :: Rec -> String
recToString r = intercalate "\t" $ map ($ r) fields
  where fields = [show . recNumber, show . recName]
This works. fields has type [Rec -> String]. But I'm lazy and I would prefer writing
recToString r = intercalate "\t" $ map (\f -> show $ f r) fields
  where fields = [recNumber, recName]
But this doesn't work. Intuitively I would say fields has type Show a => [Rec -> a] and this should be ok. But Haskell doesn't allow it.
I'd like to understand what is going on here. Would I be right if I said that in the first case I get a list of functions such that the 2 instances of show are actually not the same function, but Haskell is able to determine which is which at compile time (which is why it's ok).
[show . recNumber, show . recName]
 ^-- This is show in instance Show Int
                   ^-- This is show in instance Show String
Whereas in the second case, I only have one literal use of show in the code, and that would have to refer to multiple instances, not determined at compile time ?
map (\f -> show $ f r) fields
           ^-- Must be both instances at the same time
Can someone help me understand this ? And also are there workarounds or type system expansions that allow this ?
The type signature doesn't say what you think it says.
This seems to be a common misunderstanding. Consider the function
foo :: Show a => Rec -> a
People frequently seem to think this means that "foo can return any type that it wants to, so long as that type supports Show". It doesn't.
What it actually means is that foo must be able to return any type that has a Show instance, because the caller gets to choose what the return type should be.
A few moments' thought will reveal that foo actually cannot exist (other than by never returning). There is no way to turn a Rec into every showable type that can ever exist. It can't be done.
People often try to do something like Show a => [a] to mean "a list of mixed types but they all have Show". That obviously doesn't work; this type actually means that the list elements can be any type, but they still have to be all the same.
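For contrast, here is a small sketch of a function whose constrained result type really is chosen by the caller: the Prelude's read :: Read a => String -> a. The module and binding names are made up for illustration. The same call produces an Int or a Double depending only on the type the caller asks for, which is exactly what a signature like Show a => Rec -> a would also have to promise.

```haskell
module CallerChooses where

-- The caller fixes the result type; `read` must cope with either.
asInt :: Int
asInt = read "3"

asDouble :: Double
asDouble = read "3"

-- A list has one element type, so every element is shown with
-- the same Show instance, resolved at compile time.
shownInts :: [String]
shownInts = map show [1, 2, 3 :: Int]
```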
What you're trying to do seems reasonable enough. Unfortunately, I think your first example is about as close as you can get. You could try using tuples and lenses to get around this. You could try using Template Haskell instead. But unless you've got a hell of a lot of fields, it's probably not even worth the effort.
The type you actually want is not:
Show a => [Rec -> a]
Any type declaration with unbound type variables has an implicit forall. The above is equivalent to:
forall a. Show a => [Rec -> a]
This isn't what you want, because the a must be specialized to a single type for the entire list. (By the caller, to any one type they choose, as MathematicalOrchid points out.) Because you want the a of each element in the list to be able to be instantiated differently... what you are actually seeking is an existential type.
[exists a. Show a => Rec -> a]
You are wishing for a form of subtyping that Haskell does not support very well. The above syntax is not supported at all by GHC. You can use a wrapper type to sort of accomplish this (it has to be declared with data, since GHC does not allow existential quantification on a newtype):
{-# LANGUAGE ExistentialQuantification #-}
data Showy = forall a. Show a => Showy a
fields :: [Rec -> Showy]
fields = [Showy . recNumber, Showy . recName]
But unfortunately, that is just as tedious as converting directly to strings, isn't it?
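For completeness, here is a minimal runnable sketch of the existential wrapper in use. The Rec type is cut down to the two fields from the question, and the Show instance for Showy is an addition that simply delegates to the wrapped value. The wrapper is declared with data because GHC only accepts existential quantification on data declarations.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
module ShowyFields where

import Data.List (intercalate)

-- Two-field version of the record from the question.
data Rec = Rec
  { recNumber :: Int
  , recName   :: String
  }

-- Hides the field type; all we remember is that it can be shown.
data Showy = forall a. Show a => Showy a

-- Delegate to the Show instance packed inside the wrapper.
instance Show Showy where
  showsPrec d (Showy a) = showsPrec d a

fields :: [Rec -> Showy]
fields = [Showy . recNumber, Showy . recName]

-- A single literal use of show, dispatched per element at runtime.
recToString :: Rec -> String
recToString r = intercalate "\t" (map (\f -> show (f r)) fields)
```

Each Showy value carries its own Show dictionary, which is what allows the one call to show in recToString to behave differently per field.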
I don't believe that lens is capable of getting around this particular weakness of the Haskell type system:
recToString :: Rec -> String
recToString r = intercalate "\t" $ toListOf (each . to fieldShown) fields
  where fields = (recNumber, recName)
        fieldShown f = show (f r)
-- error: Couldn't match type Int with [Char]
Suppose the fields do have the same type:
fields = [recNumber, recNumber]
Then it works, and Haskell figures out which show function instance to use at compile time; it doesn't have to look it up dynamically.
If you manually write out show each time, as in your original example, then Haskell can determine the correct instance for each call to show at compile time.
As for existentials... it depends on implementation, but presumably, the compiler cannot determine which instance to use statically, so a dynamic lookup will be used instead.
I'd like to suggest something very simple instead:
recToString r = intercalate "\t" [s recNumber, s recName]
  where s f = show (f r)
All the elements of a list in Haskell must have the same type, so a list containing one Int and one String simply cannot exist. It is possible to get around this in GHC using existential types, but you probably shouldn't (this use of existentials is widely considered an anti-pattern, and it doesn't tend to perform terribly well). Another option would be to switch from a list to a tuple, and use some weird stuff from the lens package to map over both parts. It might even work.

Best way to implement ad-hoc polymorphism in Haskell?

I have a polymorphic function like:
convert :: (Show a) => a -> String
convert a = " [label=" ++ (show a) ++ "]"
But sometimes I want to pass it a Data.Map and do some more fancy key value conversion. I know I can't pattern match here because Data.Map is an abstract data type (according to this similar SO question), but I have been unsuccessful using guards to this end, and I'm not sure if ViewPatterns would help here (and would rather avoid them for portability).
This is more what I want:
import qualified Data.Map as M
convert :: (Show a) => a -> String
convert a
    | M.size a /= 0 = processMap2FancyKVString a -- Here's a Data.Map
    | otherwise = " [label=" ++ (show a) ++ "]" -- Probably a string
But this doesn't work because M.size can't take anything other than a Data.Map.
Specifically, I am trying to modify the sl utility function in the Functional Graph Library in order to handle coloring and other attributes of edges in GraphViz output.
Update
I wish I could accept all three answers by TomMD, Antal S-Z, and luqui to this question as they all understood what I really was asking. I would say:
Antal S-Z gave the most 'elegant' solution as applied to the FGL, but it would also require the most rewriting and rethinking to implement in my own problem.
TomMD gave a great answer that lies somewhere between Antal S-Z's and luqui's in terms of applicability vs. correctness. It also is direct and to the point which I appreciate greatly and why I chose his answer.
luqui gave the best 'get it working quickly' answer which I will probably be using in practice (as I'm a grad student, and this is just some throwaway code to test some ideas). The reason I didn't accept was because TomMD's answer will probably help other people in more general situations better.
With that said, they are all excellent answers and the above classification is a gross simplification. I've also updated the question title to better represent my question. Thanks again for broadening my horizons, everyone!
What you just explained is that you want a function that behaves differently based on the type of its input. You could use a data wrapper, which closes the function to new cases for all time:
data Convertable k a = ConvMap (Map k a) | ConvOther a
convert (ConvMap m) = ...
convert (ConvOther o) = ...
A better way is to use type classes, thus leaving the convert function open and extensible while preventing users from inputting nonsensical combinations (ex: ConvOther M.empty).
class Convertable a where
    convert :: a -> String
instance (Show k, Show a) => Convertable (M.Map k a) where
    convert m = processMap2FancyKVString m
newtype ConvWrapper a = CW a
instance Show a => Convertable (ConvWrapper a) where
    convert (CW a) = " [label=" ++ (show a) ++ "]"
In this manner you can have the instances you want used for each different data type and every time a new specialization is needed you can extend the definition of convert simply by adding another instance Convertable NewDataType where ....
Some people might frown at the newtype wrapper and suggest an instance like:
instance Convertable a where
    convert ...
But this will require the strongly discouraged overlapping and undecidable instances extensions for very little programmer convenience.
You may not be asking the right thing. I'm going to assume that you either have a graph whose nodes are all Maps or you have a graph whose nodes are all something else. If you need a graph where Maps and non-maps coexist, then there is more to your problem (but this solution will still help). See the end of my answer in that case.
The cleanest answer here is simply to use different convert functions for different types, and have any type that depends on convert take it as an argument (a higher order function).
So in GraphViz (avoiding redesigning this crappy code) I would modify the graphviz function to look like:
graphvizWithLabeler :: (a -> String) -> ... -> String
graphvizWithLabeler labeler ... =
    ...
  where sa = labeler a
And then have graphviz trivially delegate to it:
graphviz = graphvizWithLabeler sl
Then graphviz continues to work as before, and you have graphvizWithLabeler when you need the more powerful version.
So for graphs whose nodes are Maps, use graphvizWithLabeler processMap2FancyKVString, otherwise use graphviz. This decision can be postponed as long as possible by taking relevant things as higher order functions or typeclass methods.
If you need to have Maps and other things coexisting in the same graph, then you need to find a single type inhabited by everything a node could be. This is similar to TomMD's suggestion. For example:
data NodeType
    = MapNode (Map.Map Foo Bar)
    | IntNode Int
Parameterized to the level of genericity you need, of course. Then your labeler function should decide what to do in each of those cases.
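The two ideas above (passing the labeler in as an argument, and a single node type covering every case) can be sketched together as follows. This is a simplified illustration, not FGL's actual code: a graph is reduced to a list of numbered nodes, renderNodesWith and labelNode are made-up names, and the map's key and value types are fixed to String as an assumption.

```haskell
module NodeLabels where

import Data.List (intercalate)
import qualified Data.Map as M

-- One type inhabited by everything a node could be
-- (key/value types chosen arbitrarily for the sketch).
data NodeType
  = MapNode (M.Map String String)
  | IntNode Int

-- The labeler decides what to do in each case.
labelNode :: NodeType -> String
labelNode (IntNode n) = " [label=" ++ show n ++ "]"
labelNode (MapNode m) =
  " [" ++ intercalate "," [k ++ "=" ++ show v | (k, v) <- M.toList m] ++ "]"

-- Hypothetical stand-in for graphvizWithLabeler: the rendering
-- code never inspects the node, it just calls the labeler.
renderNodesWith :: (a -> String) -> [(Int, a)] -> String
renderNodesWith labeler nodes =
  unlines [show i ++ labeler a | (i, a) <- nodes]
```

A Show-based default is then just renderNodesWith (\a -> " [label=" ++ show a ++ "]"), mirroring how graphviz delegates to graphvizWithLabeler sl.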
A key point to remember is that Haskell has no downcasting. A function of type foo :: a -> a has no way of knowing anything about what was passed to it (within reason, cool your jets pedants). So the function you were trying to write is impossible to express in Haskell. But as you can see, there are other ways to get the job done, and they turn out to be more modular.
Did that tell you what you needed to know to accomplish what you wanted?
Your problem isn't actually the same as in that question. In the question you linked to, Derek Thurn had a function which he knew took a Set a, but couldn't pattern-match. In your case, you're writing a function which will take any a which has an instance of Show; you can't tell what type you're looking at at runtime, and can only rely on the functions which are available to any Showable type.
If you want to have a function do different things for different data types, this is known as ad-hoc polymorphism, and is supported in Haskell with type classes like Show. (This is as opposed to parametric polymorphism, which is when you write a function like head (x:_) = x which has type head :: [a] -> a; the unconstrained universal a is what makes that parametric instead.) So to do what you want, you'll have to create your own type class, and instantiate it when you need it.
However, it's a little more complicated than usual, because you want to make everything that's part of Show implicitly part of your new type class. This requires some potentially dangerous and probably unnecessarily powerful GHC extensions. Instead, why not simplify things? You can probably figure out the subset of types which you actually need to print in this manner. Once you do that, you can write the code as follows:
{-# LANGUAGE TypeSynonymInstances, FlexibleInstances #-}
module GraphvizTypeclass where
import qualified Data.Map as M
import Data.Map (Map)
import Data.List (intercalate) -- For output formatting
surround :: String -> String -> String -> String
surround before after = (before ++) . (++ after)
squareBrackets :: String -> String
squareBrackets = surround "[" "]"
quoted :: String -> String
quoted = let replace '"' = "\\\""
             replace c   = [c]
         in  surround "\"" "\"" . concatMap replace
class GraphvizLabel a where
    toGVItem :: a -> String
    toGVLabel :: a -> String
    toGVLabel = squareBrackets . ("label=" ++) . toGVItem
-- We only need to print Strings, Ints, Chars, and Maps.
instance GraphvizLabel String where
    toGVItem = quoted
instance GraphvizLabel Int where
    toGVItem = quoted . show
instance GraphvizLabel Char where
    toGVItem = toGVItem . (: []) -- Custom behavior: no single quotes.
instance (GraphvizLabel k, GraphvizLabel v) => GraphvizLabel (Map k v) where
    toGVItem = let kvfn k v = ((toGVItem k ++ "=" ++ toGVItem v) :)
               in  intercalate "," . M.foldrWithKey kvfn []
    toGVLabel = squareBrackets . toGVItem
In this setup, everything which we can output to Graphviz is an instance of GraphvizLabel; the toGVItem function quotes things, and toGVLabel puts the whole thing in square brackets for immediate use. (I might have screwed some of the formatting you want up, but that part's just an example.) You then declare what's an instance of GraphvizLabel, and how to turn it into an item. The TypeSynonymInstances flag lets us write instance GraphvizLabel String instead of instance GraphvizLabel [Char]; GHC also needs FlexibleInstances to accept an instance for a specific type like [Char]. Both are harmless.
Now, if you really need everything with a Show instance to be an instance of GraphvizLabel as well, there is a way. If you don't really need this, then don't use this code! If you do need to do this, you have to bring to bear the scarily-named UndecidableInstances and OverlappingInstances language extensions (and the less scarily named FlexibleInstances). The reason for this is that you have to assert that everything which is Showable is a GraphvizLabel—but this is hard for the compiler to tell. For instance, if you use this code and write toGVLabel [1,2,3] at the GHCi prompt, you'll get an error, since 1 has type Num a => a, and Char might be an instance of Num! You have to explicitly specify toGVLabel ([1,2,3] :: [Int]) to get it to work. Again, this is probably unnecessarily heavy machinery to bring to bear on your problem. Instead, if you can limit the things you think will be converted to labels, which is very likely, you can just specify those things instead! But if you really want Showability to imply GraphvizLabelability, this is what you need:
{-# LANGUAGE TypeSynonymInstances, FlexibleInstances
           , UndecidableInstances, OverlappingInstances #-}
-- Leave the module declaration, imports, formatting code, and class declaration
-- the same.
instance GraphvizLabel String where
    toGVItem = quoted
instance Show a => GraphvizLabel a where
    toGVItem = quoted . show
instance (GraphvizLabel k, GraphvizLabel v) => GraphvizLabel (Map k v) where
    toGVItem = let kvfn k v = ((toGVItem k ++ "=" ++ toGVItem v) :)
               in  intercalate "," . M.foldrWithKey kvfn []
    toGVLabel = squareBrackets . toGVItem
Notice that your specific cases (GraphvizLabel String and GraphvizLabel (Map k v)) stay the same; you've just collapsed the Int and Char cases into the GraphvizLabel a case. Remember, UndecidableInstances means exactly what it says: the compiler cannot tell if instances are checkable or will instead make the typechecker loop! In this case, I am reasonably sure that everything here is in fact decidable (but if anybody notices where I'm wrong, please let me know). Nevertheless, using UndecidableInstances should always be approached with caution.
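On GHC 7.10 and later, the module-wide OverlappingInstances extension is deprecated in favour of per-instance pragmas. The following is a cut-down sketch of the same catch-all idea in that style; it keeps only toGVItem and two instances, and inlines a simplified quoting function, so it is an illustration rather than a drop-in replacement for the code above.

```haskell
{-# LANGUAGE FlexibleInstances, UndecidableInstances #-}
module GVOverlap where

class GraphvizLabel a where
  toGVItem :: a -> String

-- Simplified quoting helper (escapes embedded double quotes).
quoted :: String -> String
quoted s = "\"" ++ concatMap escape s ++ "\""
  where
    escape '"' = "\\\""
    escape c   = [c]

-- Catch-all: anything showable. Marked OVERLAPPABLE so that
-- more specific instances are allowed to take precedence.
instance {-# OVERLAPPABLE #-} Show a => GraphvizLabel a where
  toGVItem = quoted . show

-- Specific case: Strings are quoted directly, without show.
instance {-# OVERLAPPING #-} GraphvizLabel String where
  toGVItem = quoted
```

The same caveat as before applies: a polymorphic literal such as 1 still needs a type annotation, because GHC will not commit to the catch-all instance while a more specific one might match.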
