Haskell introspecting a record's field names and types

Haskell introspecting a record's field names and types - haskell

Based on a recent exchange, I've been convinced to use Template Haskell to generate some code to ensure compile-time type safety.
I need to introspect record field names and types. I understand I can get field names by using constrFields . toConstr :: Data a => a -> [String]. But I need more than the field names, I need to know their type. For example, I need to know the names of fields that are of type Bool.
How do I construct a function f :: a -> [(String, xx)] where a is the record, String is the field name and xx is the field type?

The type should be available, along with everything else, in the Info value provided by reify. Specifically, you should get a TyConI, which contains a Dec value, from which you can get the list of Con values specifying the constructors. A record type should then use RecC, which will give you a list of fields described by a tuple containing the field name, whether the field is strict, and the type.
Where you go from there depends on what you want to do with all this.
Edit: For the sake of actually demonstrating the above, here's a really terrible quick and dirty function that finds record fields:
import Language.Haskell.TH
test :: Name -> Q Exp
test n = do rfs <- fmap getRecordFields $ reify n
litE . stringL $ show rfs
getRecordFields :: Info -> [(String, [(String, String)])]
getRecordFields (TyConI (DataD _ _ _ cons _)) = concatMap getRF' cons
getRecordFields _ = []
getRF' :: Con -> [(String, [(String, String)])]
getRF' (RecC name fields) = [(nameBase name, map getFieldInfo fields)]
getRF' _ = []
getFieldInfo :: (Name, Strict, Type) -> (String, String)
getFieldInfo (name, _, ty) = (nameBase name, show ty)
Importing that in another module, we can use it like so:
data Foo = Foo { foo1 :: Int, foo2 :: Bool }
foo = $(test ''Foo)
Loading that in GHCi, the value in foo is [("Foo",[("foo1","ConT GHC.Types.Int"),("foo2","ConT GHC.Types.Bool")])].
Does that give you the rough idea?

Related

Using data constructor as a function parameter

I am making my way through "Haskell Programming..." and, in Chapter 10, have been working with a toy database. The database is defined as:
data DatabaseItem = DBString String
| DBNumber Integer
| DBDate UTCTime
deriving (Eq, Ord, Show)
and, given a database of the form [databaseItem], I am asked to write a function
dbNumberFilter :: [DatabaseItem] -> [Integer]
that takes a list of DatabaseItems, filters them for DBNumbers, and returns a list the of Integer values stored in them.
I solved that with:
dbNumberFilter db = foldr selectDBNumber [] db
where
selectDBNumber (DBNumber a) b = a : b
selectDBNumber _ b = b
Obviously, I can write an almost identical to extract Strings or UTCTTimes, but I am wondering if there is a way to create a generic filter that can extract a list of Integers, Strings, by passing the filter a chosen data constructor. Something like:
dbGenericFilter :: (a -> DataBaseItem) -> [DatabaseItem] -> [a]
dbGenericFilter DBICon db = foldr selectDBDate [] db
where
selectDBDate (DBICon a) b = a : b
selectDBDate _ b = b
where by passing DBString, DBNumber, or DBDate in the DBICon parameter, will return a list of Strings, Integers, or UTCTimes respectively.
I can't get the above, or any variation of it that I can think of, to work. But is there a way of achieving this effect?

You can't write a function so generic that it just takes a constructor as its first argument and then does what you want. Pattern matches are not first class in Haskell - you can't pass them around as arguments. But there are things you could do to write this more simply.
One approach that isn't really any more generic, but is certainly shorter, is to make use of the fact that a failed pattern match in a list comprehension skips the item:
dbNumberFilter db = [n | DBNumber n <- db]
If you prefer to write something generic, such that dbNUmberFilter = genericFilter x for some x, you can extract the concept of "try to match a DBNumber" into a function:
import Data.Maybe (mapMaybe)
genericFilter :: (DatabaseItem -> Maybe a) -> [DatabaseItem] -> [a]
genericFilter = mapMaybe
dbNumberFilter = genericFilter getNumber
where getNumber (DBNumber n) = Just n
getNumber _ = Nothing
Another somewhat relevant generic thing you could do would be to define the catamorphism for your type, which is a way of abstracting all possible pattern matches for your type into a single function:
dbCata :: (String -> a)
-> (Integer -> a)
-> (UTCTime -> a)
-> DatabaseItem -> a
dbCata s i t (DBString x) = s x
dbCata s i t (DBNumber x) = i x
dbCata s i t (DBDate x) = t x
Then you can write dbNumberFilter with three function arguments instead of a pattern match:
dbNumberFilter :: [DatabaseItem] -> [Integer]
dbNumberFilter = (>>= dbCata mempty pure mempty)

List of a Type Classe instance

I've been playing around with Haskell type classes and I am facing a problem I hope someone could help me to solve. Consider that I come from a Swift background and "trying" to port some of protocol oriented knowledge to Haskell code.
Initially I declared a bunch of JSON parsers which had the same structure, just a different implementation:
data Candle = Candle {
mts :: Integer,
open :: Double,
close :: Double
}
data Bar = Bar {
mts :: Integer,
min :: Double,
max :: Double
}
Then I decided to create a "Class" that would define their basic operations:
class GenericData a where
dataName :: a -> String
dataIdentifier :: a -> Double
dataParsing :: a -> String -> Maybe a
dataEmptyInstance :: a
instance GenericData Candle where
dataName _ = "Candle"
dataIdentifier = fromInteger . mts
dataParsing _ = candleParsing
dataEmptyInstance = emptyCandle
instance GenericData Bar where
dataName _ = "Bar"
dataIdentifier = fromInteger . mts
dataParsing _ = barParsing
dataEmptyInstance = emptyBar
My first code smell was the need to include "a" when it was not needed (dataName or dataParsing) but then I proceded.
analyzeArguments :: GenericData a => [] -> [String] -> Maybe (a, [String])
analyzeArguments [] _ = Nothing
analyzeArguments _ [] = Nothing
analyzeArguments name data
| name == "Candles" = Just (head possibleCandidates, data)
| name == "Bar" = Just (last possibleRecordCandidates, data)
| otherwise = Nothing
possibleCandidates :: GenericData a => [a]
possibleCandidates = [emptyCandle, emptyBar]
Now, when I want to select if either instance should be selected to perform parsing, I always get the following error
• Couldn't match expected type ‘a’ with actual type ‘Candle’
‘a’ is a rigid type variable bound by
the type signature for:
possibleCandidates :: forall a. GenericData a => [a]
at src/GenericRecords.hs:42:29
My objective was to create a list of instances of GenericData because other functions depend on that being selected to execute the correct dataParser. I understand this has something to do with the type class checker, the * -> Constraint, but still not finding a way to solve this conflict. I have used several GHC language extensions but none has solved the problem.

You have a type signature:
possibleCandidates :: GenericData a => [a]
Which you might thing implies that you can put anything in that list as long as it is GenericData. But that is not the way Haskell's type system actually works. The value possibleCandidates can be a list of any type which has a GenericData class but every element of the list must be of the same type.
What the GHC error message is telling you (in its own special way) is that the first element of the list is a Candle so it thinks that the rest of the list should also be of type Candle but the second element is actually a Bar.
Now there are ways to make heterogeneous lists (and other collections) in Haskell, but it is almost never the right thing to do.
One typical solution to this problem is to just merge everything down into one sum data type:
data GenericData = GenericCandle Candle | GenericBar Bar
You could even forgo the step of indirection and just put the Candle and Bar data directly into the data structure.
Now instead f a class you just have a datatype and your class functions become normal functions:
dataName :: GenericData -> String
dataIdentifier :: GenericData -> Double
dataParsing :: GenericData -> String -> Maybe a
dataEmptyInstance :: String -> GenericData
There are some other more complex ways to make this work, but if a sum data type fits the bill, use it. It is very common for parsers in Haskell to have a large sum data type (usually also recursive) as their result. Take a look at the Value type in Aeson the standard JSON library for an example.

Haskell read function overloading with two inputs

In my haskell program I have a list that represents a database in [(key, value)] format. For example this is a valid database: [("key1", "value1"), ("key2", "value2"), ("key3", "value3")]. The key and value data will always have String type.
My question is: is it possible to code the reading operation by overloading the read function and using that in this way: read dbList "key1"? If yes how can I solve this problem? The output needs to be ("not found","value data for key not exists") or ("found", "value1").
I`ve looked up how can I solve this but all that I found is how to use read function to one input parameter and how to define a new type in order to create an instance of the read for that particular type if it is needed. But I am still curious if I can overload somehow the read function with two input paramaters.

The function you want is lookup, which is part of the Prelude.
> :t lookup
lookup :: Eq a => a -> [(a, b)] -> Maybe b
> let dbList = [("key1", "value1")]
> lookup "key1" dbList
Just "value1"
> lookup "key2" dbList
Nothing
If you really need the output in the tuple form you show, you can pattern-match on the result.
case lookup dbList someKey of
Just x -> ("found", x)
Nothing -> ("not found", "data for " ++ key ++ " does not exist")

For completeness, I will present a way to do it using read. However, this is very unusual and I would consider it a bad idea, since you can just use lookup.
{-# LANGUAGE FlexibleContexts, FlexibleInstances #-}
type DB = [(String, String)]
instance Read (DB -> (String, String)) where
readsPrec _ = \key -> let
f db = case lookup key db of
Just x -> ("found", x)
Nothing -> ("not found", "data for " ++ key ++ " does not exist")
in [(f, "")]
Here we define an instance of Read for the type DB -> (String, String). Recall that the read function has type Read a => String -> a, so this instance gives us an overload of read of type String -> DB -> (String, String).
Our instance defines the readsPrec function, which has type Read a => Int => ReadS a, where ReadS a is an alias for String -> [(a, String)]. So our readsPrec implementation must have type Int -> String -> [(DB -> (String, String), String)]. We don’t care about the “precedence” argument, so we ignore it with _. And we only care about returning a single result in this instance, so we simply return [(f, "")] where f is a function of type DB -> (String, String) that performs the lookup of the key in its argument db.
Now we can use this instance like so:
> read "foo" [("foo", "bar")] :: (String, String)
("found","bar")
> read "baz" [("foo", "bar")] :: (String, String)
("not found","data for baz does not exist")
Again I would stress that this is an unusual instance that may cause confusion in real code—you should just use lookup directly.

Deserializing many network messages without using an ad-hoc parser implementation

I have a question pertaining to deserialization. I can envision a solution using Data.Data, Data.Typeable, or with GHC.Generics, but I'm curious if it can be accomplished without generics, SYB, or meta-programming.
Problem Description:
Given a list of [String] that is known to contain the fields of a locally defined algebraic data type, I would like to deserialize the [String] to construct the target data type. I could write a parser to do this, but I'm looking for a generalized solution that will deserialize to an arbitrary number of data types defined within the program without writing a parser for each type. With knowledge of the number and type of value constructors an algebraic type has, it's as simple as performing a read on each string to yield the appropriate values necessary to build up the type. However, I don't want to use generics, reflection, SYB, or meta-programming (unless it's otherwise impossible).
Say I have around 50 types defined similar to this (all simple algebraic types composed of basic primitives (no nested or recursive types, just different combinations and orderings of primitives) :
data NetworkMsg = NetworkMsg { field1 :: Int, field2 :: Int, field3 :: Double}
data NetworkMsg2 = NetworkMsg2 { field1 :: Double, field2 :: Int, field3 :: Double }
I can determine the data-type to be associated with a [String] I've received over the network using a tag id that I parse before each [String].
Possible conjectured solution path:
Since data constructors are first-class values in Haskell, and actually have a type-- Can NetworkMsg constructor be thought of as a function, such as:
NetworkMsg :: Int -> Int -> Double -> NetworkMsg
Could I transform this function into a function on tuples using uncurryN then copy the [String] into a tuple of the same shape the function now takes?
NetworkMsg' :: (Int, Int, Double) -> NetworkMsg
I don't think this would work because I'd need knowledge of the value constructors and type information, which would require Data.Typeable, reflection, or some other metaprogramming technique.
Basically, I'm looking for automatic deserialization of many types without writing type instance declarations or analyzing the type's shape at run-time. If it's not feasible, I'll do it an alternative way.

You are correct in that the constructors are essentially just functions so you can write generic instances for any number of types by just writing instances for the functions. You'll still need to write a separate instance
for all the different numbers of arguments, though.
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
import Text.Read
import Control.Applicative
class FieldParser p r where
parseFields :: p -> [String] -> Maybe r
instance Read a => FieldParser (a -> r) r where
parseFields con [a] = con <$> readMaybe a
parseFields _ _ = Nothing
instance (Read a, Read b) => FieldParser (a -> b -> r) r where
parseFields con [a, b] = con <$> readMaybe a <*> readMaybe b
parseFields _ _ = Nothing
instance (Read a, Read b, Read c) => FieldParser (a -> b -> c -> r) r where
parseFields con [a, b, c] = con <$> readMaybe a <*> readMaybe b <*> readMaybe c
parseFields _ _ = Nothing
{- etc. for as many arguments as you need -}
Now you can use this type class to parse any message based on the constructor as long as the type-checker is able to figure out the resulting message type from context (i.e. it is not able to deduce it simply from the given constructor for these sort of multi-param type class instances).
data Test1 = Test1 {fieldA :: Int} deriving Show
data Test2 = Test2 {fieldB ::Int, fieldC :: Float} deriving Show
test :: String -> [String] -> IO ()
test tag fields = case tag of
"Test1" -> case parseFields Test1 fields of
Just (a :: Test1) -> putStrLn $ "Succesfully parsed " ++ show a
Nothing -> putStrLn "Parse error"
"Test2" -> case parseFields Test2 fields of
Just (a :: Test2) -> putStrLn $ "Succesfully parsed " ++ show a
Nothing -> putStrLn "Parse error"
I'd like to know how exactly you use the message types in the application, though, because having each message as its separate type makes it very difficult to have any sort of generic message handler.
Is there some reason why you don't simply have a single message data type? Such as
data NetworkMsg
= NetworkMsg1 {fieldA :: Int}
| NetworkMsg2 {fieldB :: Int, fieldC :: Float}
Now, while the instances are built in pretty much the same way, you get much better type inference since the result type is always known.
instance Read a => MessageParser (a -> NetworkMsg) where
parseMsg con [a] = con <$> readMaybe a
instance (Read a, Read b) => MessageParser (a -> b -> NetworkMsg) where
parseMsg con [a, b] = con <$> readMaybe a <*> readMaybe b
instance (Read a, Read b, Read c) => MessageParser (a -> b -> c -> NetworkMsg) where
parseMsg con [a, b, c] = con <$> readMaybe a <*> readMaybe b <*> readMaybe c
parseMessage :: String -> [String] -> Maybe NetworkMsg
parseMessage tag fields = case tag of
"NetworkMsg1" -> parseMsg NetworkMsg1 fields
"NetworkMsg2" -> parseMsg NetworkMsg2 fields
_ -> Nothing
I'm also not sure why you want to do type-generic programming specifically without actually using any of the tools meant for generics. GHC.Generics, SYB or Template Haskell is usually the best solution for this kind of problem.

Haskell data serialization of some data implementing a common type class

Let's start with the following
data A = A String deriving Show
data B = B String deriving Show
class X a where
spooge :: a -> Q
[ Some implementations of X for A and B ]
Now let's say we have custom implementations of show and read, named show' and read' respectively which utilize Show as a serialization mechanism. I want show' and read' to have types
show' :: X a => a -> String
read' :: X a => String -> a
So I can do things like
f :: String -> [Q]
f d = map (\x -> spooge $ read' x) d
Where data could have been
[show' (A "foo"), show' (B "bar")]
In summary, I wanna serialize stuff of various types which share a common typeclass so I can call their separate implementations on the deserialized stuff automatically.
Now, I realize you could write some template haskell which would generate a wrapper type, like
data XWrap = AWrap A | BWrap B deriving (Show)
and serialize the wrapped type which would guarantee that the type info would be stored with it, and that we'd be able to get ourselves back at least an XWrap... but is there a better way using haskell ninja-ery?
EDIT
Okay I need to be more application specific. This is an API. Users will define their As, and Bs and fs as they see fit. I don't ever want them hacking through the rest of the code updating their XWraps, or switches or anything. The most i'm willing to compromise is one list somewhere of all the A, B, etc. in some format. Why?
Here's the application. A is "Download a file from an FTP server." B is "convert from flac to mp3". A contains username, password, port, etc. information. B contains file path information. There could be MANY As and Bs. Hundreds. As many as people are willing to compile into the program. Two was just an example. A and B are Xs, and Xs shall be called "Tickets." Q is IO (). Spooge is runTicket. I want to read the tickets off into their relevant data types and then write generic code that will runTicket on the stuff read' from the stuff on disk. At some point I have to jam type information into the serialized data.

I'd first like to stress for all our happy listeners out there that XWrap is a very good way, and a lot of the time you can write one yourself faster than writing it using Template Haskell.
You say you can get back "at least an XWrap", as if that meant you couldn't recover the types A and B from XWrap or you couldn't use your typeclass on them. Not true! You can even define
separateAB :: [XWrap] -> ([A],[B])
If you didn't want them mixed together, you should serialise them seperately!
This is nicer than haskell ninja-ery; maybe you don't need to handle arbitrary instances, maybe just the ones you made.
Do you really need your original types back? If you feel like using existential types because you just want to spooge your deserialised data, why not either serialise the Q itself, or have some intermediate data type PoisedToSpooge that you serialise, which can deserialise to give you all the data you need for a really good spooging. Why not make it an instance of X too?
You could add a method to your X class that converts to PoisedToSpooge.
You could call it something fun like toPoisedToSpooge, which trips nicely off the tongue, don't you think? :)
Anyway this would remove your typesystem complexity at the same time as resolving the annoying ambiguous type in
f d = map (\x -> spooge $ read' x) d -- oops, the type of read' x depends on the String
You can replace read' with
stringToPoisedToSpoogeToDeserialise :: String -> PoisedToSpooge -- use to deserialise
and define
f d = map (\x -> spooge $ stringToPoisedToSpoogeToDeserialise x) -- no ambiguous type
which we could of course write more succincly as
f = map (spooge.stringToPoisedToSpoogeToDeserialise)
although I recognise the irony here in suggesting making your code more succinct. :)

If what you really want is a heterogeneous list then use existential types. If you want serialization then use Cereal + ByteString. If you want dynamic typing, which is what I think your actual goal is, then use Data.Dynamic. If none of this is what you want, or you want me to expand please press the pound key.
Based on your edit, I don't see any reason a list of thunks won't work. In what way does IO () fail to represent both the operations of "Download a file from an FTP server" and "convert from flac to MP3"?

I'll assume you want to do more things with deserialised Tickets
than run them, because if not you may as well ask the user to supply a bunch of String -> IO()
or similar, nothing clever needed at all.
If so, hooray! It's not often I feel it's appropriate to recommend advanced language features like this.
class Ticketable a where
show' :: a -> String
read' :: String -> Maybe a
runTicket :: a -> IO ()
-- other useful things to do with tickets
This all hinges on the type of read'. read' :: Ticket a => String -> a isn't very useful,
because the only thing it can do with invalid data is crash.
If we change the type to read' :: Ticket a => String -> Maybe a this can allow us to read from disk and
try all the possibilities or fail altogether.
(Alternatively you could use a parser: parse :: Ticket a => String -> Maybe (a,String).)
Let's use a GADT to give us ExistentialQuantification without the syntax and with nicer error messages:
{-# LANGUAGE GADTs #-}
data Ticket where
MkTicket :: Ticketable a => a -> Ticket
showT :: Ticket -> String
showT (MkTicket a) = show' a
runT :: Ticket -> IO()
runT (MkTicket a) = runTicket a
Notice how the MkTicket contstuctor supplies the context Ticketable a for free! GADTs are great.
It would be nice to make Ticket and instance of Ticketable, but that won't work, because there would be
an ambiguous type a hidden in it. Let's take functions that read Ticketable types and make them read
Tickets.
ticketize :: Ticketable a => (String -> Maybe a) -> (String -> Maybe Ticket)
ticketize = ((.).fmap) MkTicket -- a little pointfree fun
You could use some unusual sentinel string such as
"\n-+-+-+-+-+-Ticket-+-+-+-Border-+-+-+-+-+-+-+-\n" to separate your serialised data or better, use separate files
altogether. For this example, I'll just use "\n" as the separator.
readTickets :: [String -> Maybe Ticket] -> String -> [Maybe Ticket]
readTickets readers xs = map (foldr orelse (const Nothing) readers) (lines xs)
orelse :: (a -> Maybe b) -> (a -> Maybe b) -> (a -> Maybe b)
(f `orelse` g) x = case f x of
Nothing -> g x
just_y -> just_y
Now let's get rid of the Justs and ignore the Nothings:
runAll :: [String -> Maybe Ticket] -> String -> IO ()
runAll ps xs = mapM_ runT . catMaybes $ readTickets ps xs
Let's make a trivial ticket that just prints the contents of some directory
newtype Dir = Dir {unDir :: FilePath} deriving Show
readDir xs = let (front,back) = splitAt 4 xs in
if front == "dir:" then Just $ Dir back else Nothing
instance Ticketable Dir where
show' (Dir p) = "dir:"++show p
read' = readDir
runTicket (Dir p) = doesDirectoryExist p >>= flip when
(getDirectoryContents >=> mapM_ putStrLn $ p)
and an even more trivial ticket
data HelloWorld = HelloWorld deriving Show
readHW "HelloWorld" = Just HelloWorld
readHW _ = Nothing
instance Ticketable HelloWorld where
show' HelloWorld = "HelloWorld"
read' = readHW
runTicket HelloWorld = putStrLn "Hello World!"
and then put it all together:
myreaders = [ticketize readDir,ticketize readHW]
main = runAll myreaders $ unlines ["HelloWorld",".","HelloWorld","..",",HelloWorld"]

Just use Either. Your users don't even have to wrap it themselves. You have your deserializer wrap it in the Either for you. I don't know exactly what your serialization protocol is, but I assume that you have some way to detect which kind of request, and the following example assumes the first byte distinguishes the two requests:
deserializeRequest :: IO (Either A B)
deserializeRequest = do
byte <- get1stByte
case byte of
0 -> do
...
return $ Left $ A <A's fields>
1 -> do
...
return $ Right $ B <B's fields>
Then you don't even need to type-class spooge. Just make it a function of Either A B:
spooge :: Either A B -> Q

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string