Is there a possibility to construct data in runtime ? I mean something like "read" function, but which applies [(field name, value)]. Let's say I have
data Street = Street String
data City = City String
data ZipCode = ZipCode String
data Address = Address {
street :: Street,
city :: City,
zipCode :: ZipCode
}
I want to have a function like:
genericConstructor :: (DataConstructable a) => String -> [(String, a)] -> a
So I can use it like this:
genericConstructor "Address" [("street", Street "Baker"),
("city", City "London"),
("zipCode", ZipCode "12345")] :: Address
I don't want any boilerplate code, looking for anything similar to Reflection API for Java.
Currently looking at Data.Data and Data.Typeable modules though don't see how I can achieve it.
The purpose of all of this is to create a binding between some data format and haskell data structures.
Here's something close to what you're asking for.
import Data.Data
import Data.Dynamic
import Data.Maybe
data City = City String deriving (Data, Typeable, Show, Eq)
data Street = Street String deriving (Data, Typeable, Show, Eq)
data Addr = Addr {
city :: City
,street :: Street} deriving (Show, Eq, Data, Typeable)
class Foo a where
genericConstr :: [(String,Dynamic)] -> a
instance Foo Addr where
genericConstr = buildAddr
lf ls nm = fromMaybe (error $ nm ++ " not found") (lookup nm ls >>= fromDynamic)
buildAddr ls = Addr {city = lf ls "city", street = lf ls "street"}
Load this, and in ghci:
*Foo> genericConstr [("street", toDyn (Street "Baker")), ("city", toDyn (City "London"))] :: Addr
Addr {city = City "London", street = Street "Baker"}
This seems like a lot of work to me, though. This is tricky because Haskell requires all types to be resolved at compile time; in this case you're trying to create types with run-time information (e.g. the string "Address"). This is possible, but you'll be fighting the type system at every step. I agree with Jason that using a parser is probably a better approach.
I think one problem with your sceme is that with
genericConstructor :: (DataConstructable a) => String -> [(String, a)] -> a
all the 'arguments' would have to be of the same constructable type. Therefore, you would need something along the lines of
genericConstructor :: (DataConstructable a) => String -> [forall b. DataConstructable b => (String, b)] -> a
which I'm not entirely sure how to do, I must admit.
Would it not be easier just to parse everything from the data format string?
Related
Lets say I have a basic model in my application for handling users:
User
username Text
password Text
email Text
Based on these types I can store whatever I like (as long as it is Text obviously) to any of these fields. Say I want to begin enforcing some rules around what can and cannot be considered an email. I create this email module:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module Model.EmailAddress
( EmailAddress
, parseEmailAddress
) where
import Data.Aeson
import Data.Text (Text)
import qualified Data.Text.Encoding as T
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Builder as TL
import Database.Persist
import Database.Persist.Sql
import Text.Shakespeare.Text (ToText (..))
newtype EmailAddress = EmailAddress (Either InvalidEmailAddress ValidEmailAddress)
deriving (Show, Eq)
instance ToJSON EmailAddress where
toJSON = String . toStrictText
instance FromJSON EmailAddress where
parseJSON = withText "EmailAddress" (pure . parseEmailAddress)
instance ToText EmailAddress where
toText (EmailAddress e) = either toText toText e
newtype ValidEmailAddress = ValidEmailAddress Text
deriving (Show, Eq)
instance ToText ValidEmailAddress where
toText (ValidEmailAddress t) = toText t
data InvalidEmailAddress = InvalidEmailAddress
{ invalidEmailAddressValue :: Text
, invalidEmailAddressReason :: Text
} deriving (Show, Eq)
instance ToText InvalidEmailAddress where
toText = toText . invalidEmailAddressValue
parseEmailAddress :: Text -> EmailAddress
parseEmailAddress t =
-- super basic validation example
EmailAddress $ case T.count "#" t of
0 -> Left (InvalidEmailAddress t "Must contain # symbol")
1 -> Right (ValidEmailAddress t)
_ -> Left (InvalidEmailAddress t "Contains more than one # symbol")
instance PersistFieldSql EmailAddress where
sqlType _ = SqlString
instance PersistField EmailAddress where
fromPersistValue =
decodePersistTextValue parseEmailAddress
toPersistValue =
PersistText . toStrictText
decodePersistTextValue :: PersistField a => (Text -> a) -> PersistValue -> Either Text a
decodePersistTextValue f = \case
PersistByteString bs ->
fromPersistValue $ PersistText (T.decodeUtf8 bs)
v ->
f <$> fromPersistValue v
toStrictText :: ToText t => t -> Text
toStrictText = builderToStrictText . toText
builderToStrictText :: TL.Builder -> Text
builderToStrictText = TL.toStrict . TL.toLazyText
I could now define my model like so:
User
username Text
password Text
email EmailAddress
However, this does not stop me from writing a bad email address to the database. If the rules around what constitutes a valid email address changes, I can read them from the DB still, and previously valid emails will come out of the DB as invalid, and this is fine, I like this. However I cannot enforce (at the persistent level anyway) preventing writing an invalid email to the database, because toPersistValue has the type a -> PersistValue, I suppose I would have to enforce this a layer up (when data is coming into the app, form decoders, json decoders etc) to ensure when a Left InvalidEmailAddress appears the user is given an error.
This leads me to thinking, is this the best practice around enforcing data integrity when using persistent? is there a better approach? Should persistent not be involved at all? ie the field remains as a Text and it is down to the app to convert these fields when they're being accessed?
I would really like a way of writing my entities to the database in a type safe way that allows me to determine what to do in the situation where an invalid email is used to try and save my User type, I could throw an error in the toPersistValue implementation but this feels a bit dirty to me, and not very Haskell'y.
I have no experience in Haskell. I'm trying to parse many .json files to a data structure in Haskell using aeson. However, by reasons beyond my control, I need to store the name of the file from where the data was parsed as one of the fields in my data. A simple example of what I have so far is:
data Observation = Observation { id :: Integer
, value :: Integer
, filename :: String}
instance FromJSON Observation where
parseJson (Object v) =
Observation <$> (read <$> v .: "id")
<*> v .: "value"
<*> ????
My question is: what is a smart way to be able to serialize my data when parsing a json file having access to the name of the file?
What comes in my mind is to define another data like NotNamedObservation, initialize it and then having a function that converts NotNamedObservation -> String -> Observation (where String is the filename) but that sounds like a very poor approach.
Thanks.
Just make your instance a function from file path to Observation:
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson
import qualified Data.ByteString.Lazy as LBS
import System.Environment
data Observation = Observation { ident :: Integer
, value :: Integer
, filename :: FilePath
} deriving (Show)
instance FromJSON (FilePath -> Observation) where
parseJSON (Object v) =
do i <- read <$> v .: "id"
l <- v .: "value"
pure $ Observation i l
main :: IO ()
main = do
files <- getArgs
fileContents <- traverse LBS.readFile files
print fileContents
let fs = map (maybe (error "Invalid json") id . decode) fileContents
jsons :: [Observation]
jsons = zipWith ($) fs files
print jsons
When you don't control the data definition and you have strict requirements about the format to parse, it's better to write the (de)serializer explicitly.
If external information is required to fully construct values, avoid the FromJSON/ToJSON type classes, and just write standalone parsers.
aeson's deriving mechanism is more suited to applications that talk to themselves (and thus only care about round-tripping between parseJSON and toJSON), or where there is flexibility either in defining the JSON format or the Haskell types.
If you still have to use these classes for some reason, one option is of course to just put undefined in those missing fields. To rely on the type system more, you can also parameterize types by a "phase" (that assumes again you can tweak the data type), which is a type constructor that wraps some fields.
Functor functors
One way to keep that style compatible with regular records is to use this HKD/defunctionalization design pattern.
data Observation' p = Observation
{ id :: Integer
, value :: Integer
, filename :: p String }
-- This is isomorphic to the original Observation data type
type Observation = Observation Identity
-- When we don't have the filename available, we keep the field empty with Proxy
instance FromJSON (Observation' Proxy) where
...
mkObservation :: FileName -> Observation' Proxy -> Observation
I am new to Haskell, so maybe I am missing some fundamental concepts here (or maybe failed to find the appropriate extension). I was wondering if there was a way to optimize or further abstract the following scenario. This code seems very redundant.
Let's say I have the following data classes:
data Person = Person
{ personName :: !String
, personAge :: !Int
} deriving Show
data Dog = Dog
{ dogName :: !String
, dogAge :: !Int
} deriving Show
Let's say I have a service and I'm only concerned with outputing records as strings. In reality, the strings will probably be JSON and the records fetched from the DB, but let's take a simpler case. I basically need a URL token to fetch an appropriate object (say, the string "dog" will get me a Dog, or even just the Haskell "show" string, without expressly declaring it as (value)::Dog).
I have attempted to implement this in several ways...the only thing that seems to work is the following:
data Creature = DogC Dog
| PersonC Person
deriving Show
fromString :: String -> Maybe Creature
fromString "dog" = Just $ DogC $ Dog "muffin" 8
fromString "person" = Just $ PersonC $ Person "John" 22
fromString _ = Nothing
main :: IO ()
main = do
putStrLn $ show $ fromString "dog"
I'm not entirely fond of the new type, nor the list of fromString declarations. And to benefit from the original data declarations, I would probably need to write a similarly tedious expression (eg, "fromCreature") to revert Creature back into my original types. This information might change, so I would probably need TH for a few of the declarations...
Is there a way around some of this? I fiddled with GADTs and classes, but both seemed to be dependent on type- rather than value- based polymorphism (A string identifier tends to cause issues with ambiguous instances). It would be nice to map the constructor to a string (Say, with Data.Map), but constructors often have different kinds.
Update
So, I went with an approach that isn't exactly relevant to the question I had asked, but it may be useful to someone. I did want to maintain some record types, but most didn't add much value and were getting in my way. The steps I had followed went something like:
Use a different/lower-level DB driver, that returns workable types (eg, [ColumnDef] and [[SQLValue]] instead of tuples and records...).
Create ToJSON instances for SQLValue -- most of the types were covered, except a few ByteString types, and I had to handle the conversion of SQLNull to Null. To maintain compatibility with some record types, my default handler looked like: toJSON = genericToJSON defaultOptions { sumEncoding = UnTaggedValue} The untagged value should allow one to read the JSON into defined data types (eg, Dog / Person ) if desired....
Given that column name is accessible from ColumnDef, I wrote an expression that zips [ColumnDef] and [SqlValue] to a list of Aeson-compatible key-value pairs, eg: toJsPairs :: [ColumnDef] -> [SqlValue] -> [(Text,Value)]
Then, I wrote an expression to fetch the JSON from a table name, which more or less serves as my "universal dispatcher." It references a list of authorized tables, so it's less crazy than it might sound.
The code looked a bit like this (using mysql-haskell).
{-# LANGUAGE OverloadedStrings #-}
import qualified Control.Applicative as App
import Database.MySQL.Base
import qualified System.IO.Streams as Streams
import Data.Aeson (FromJSON, ToJSON)
import Data.Aeson.Encode.Pretty (encodePretty)
import Data.Aeson.Types
import Data.Text.Encoding
import Data.String (fromString)
import Data.ByteString.Internal
import qualified Data.ByteString.Lazy.Internal as BLI
import Data.HashMap.Strict (fromList)
appConnectInfo = defaultConnectInfo {
ciUser = "some_user"
, ciPassword = "some_password"
, ciDatabase = "some_db"
}
instance FromJSON ByteString where
parseJSON (String s) = pure $ encodeUtf8 s
parseJSON _ = App.empty
instance ToJSON ByteString where
toJSON = String . decodeUtf8
instance ToJSON MySQLValue where
toJSON (MySQLNull) = Null
toJSON x = genericToJSON defaultOptions
{ sumEncoding = UntaggedValue } x
-- This expression should fail on dimensional mismatch.
-- It's stupidly lenient, but really dimensional mismatch should
-- never occur...
toJsPairs :: [ColumnDef] -> [MySQLValue] -> [(Text,Value)]
toJsPairs [] _ = []
toJsPairs _ [] = []
toJsPairs (x:xs) (y:ys) = (txt x, toJSON y):toJsPairs xs ys
where
-- Implement any modifications to the key names here
txt = decodeUtf8.columnName
listRecords :: String -> IO BLI.ByteString
listRecords tbl = do
conn <- connect appConnectInfo
-- This is clearly an injection vulnerability.
-- Implemented, however, the values for 'tbl' are intensely
-- vetted. This is just an example.
(defs, is) <- query_ conn $ fromString ( "SELECT * FROM `" ++ tbl ++ "` LIMIT 100")
rcrds <- Streams.toList is
return $ encodePretty $ map (jsnobj defs) rcrds
where
jsnobj :: [ColumnDef] -> [MySQLValue] -> Value
jsnobj defs x = Object $ fromList $ toJsPairs defs x
If what you want to consume at the end is json value - it might make sense to
represent result as json value using aeson library:
{-# LANGUAGE DeriveGeneric #-}
import Data.Aeson
import GHC.Generics
data Dog = Dog Int String deriving (Show, Generic)
data Cat = Cat Int String deriving (Show, Generic)
-- here I'm using instance derived with generics, but you can write one by
-- hands
instance ToJSON Dog
instance ToJSON Cat
-- actions to get stuff from db
getDog :: Monad m => Int -> m Dog
getDog i = return (Dog i (show i))
getCat :: Monad m => Int -> m Cat
getCat i = return (Cat i (show i))
-- dispatcher - picks which action to use
getAnimal :: Monad m => String -> Int -> m (Maybe Value)
getAnimal "dog" i = Just . toJSON <$> getDog i
getAnimal "cat" i = Just . toJSON <$> getCat i
getAnimal _ _ = return Nothing
main :: IO ()
main = do
getAnimal "dog" 2 >>= print
getAnimal "cat" 3 >>= print
getAnimal "chupakabra" 12 >>= print
High energy magic version
class Monad m => MonadAnimal m where
-- basically you want something that fetches extra argumets from HTTP or
-- whatevere, perform DB query and so on.
class Animal a where
animalName :: Proxy a -> String
animalGetter :: MonadAnimal m => m a
locateAnimals :: MonadAnimal m => Q [(String, m Value)]
locateAnimals -- implement using TH (reify function is your friend). It should look for
-- all the animal instances in scope and make a list from them with serialized
-- fetcher.
-- with that in place dispatcher should be easy to implement
I have a data type where one of the fields is a list of one of n other data types (n is small and the types are known in advance). I would like to make a JSON parser but I can't quite figure it out. I've tried creating a Pet type class and making them both instances of it, but it seemed to be a dead end. Any help would be appreciated!
As a simplified example - I have a Person data type who can have a list of pets, either dogs or cats - but not a mix of both.
Here's the example:
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative
import Data.Aeson
import Data.ByteString.Lazy as L
import Data.Aeson.Types (Parser)
import Control.Monad (mzero)
data Person = Person {
name :: String,
species :: String,
pets :: [?] -- a list of dogs OR cats
} deriving Show
instance FromJSON (Person a) where
parseJSON (Object v) = ???
data Dog = Dog {
dogField :: String
} deriving Show
instance FromJSON Dog where
parseJSON (Object v) = Dog <$>
v .: "dogField"
data Cat = Cat {
catField :: String
} deriving Show
instance FromJSON Cat where
parseJSON (Object v) = Cat <$>
v .: "catField"
A standard way of representing either one type or another is to use the Either type, e.g.:
data Person { ..., pets :: Either [Dog] [Cat] }
Also, you might be interested in use GHC Generics to auto-derive the To/FromJSON instances.
An example with a data structure that uses Either:
{-# LANGUAGE DeriveGeneric #-}
import Data.Aeson
import GHC.Generics
data Person = Person {
name :: String,
species :: String,
pets :: Either [String] [Int]
} deriving (Show,Generic)
instance ToJSON Person -- instances are auto-derived
instance FromJSON Person
doit = do
let me = Person "user5402" "Human" (Right [42])
print $ encode me
If you have more than two alternatives you can easily create your own sum type like this:
-- assume the possible pet types are: Dog, Cat, Rodent, Reptile
data Pets = Dogs [Dog] | Cats [Cat] | Rodents [Rodent] | Reptiles [Reptile]
deriving (Show, Generic)
data Person { ..., pets :: Pets }
deriving (Show, Generic)
doit = do
let me = Person "user5402" "Human" (Rodents [agerbil, amouse])
print $ encode me
where agerbil and amouse are Rodent values.
I'm modifying #user5402's answer because I don't like the "tag" and "contents" fields that Generics added. Also accepting his answer since he gave me the key insight of how to structure the sum type
instance FromJSON Pets where
parseJSON (Object o) = (parsePets o "pets")
parseJSON _ = mzero
parsePets :: Object -> T.Text -> Parser Pets
parsePets o key = case H.lookup key o of
Nothing -> fail $ "key " ++ show key ++ " not present"
Just v -> parseToCatsOrDogs (o .: "species") v
{-# INLINE parsePets #-}
parseToCatsOrDogs :: Parser String -> Value -> Parser Pets
parseToCatsOrDogs speciesParser (Array v) = speciesParser >>= \species -> case species of
"dog" -> (V.mapM (\x -> parseJSON $ x) v) >>= \ dogVector -> return $ Dogs (V.toList dogVector)
"cat" -> (V.mapM (\x -> parseJSON $ x) v) >>= \ catVector -> return $ Cats (V.toList catVector)
_ -> mzero
parseToCatsOrDogs _ _ = mzero
I am trying to serialize a Contacts type but I am stuck at defining put and get?
import Control.Monad
import Data.Binary
type Name = String
type Address = String
data Contacts = Contacts [(Name, Address)] deriving (Show)
instance Binary Contacts where
put (Contacts [(n,a)]) = do ...
get = do ...
main :: IO ()
main = do
let c = Contacts [("gert","home")]
let e = encode c
let d = decode e
print d
Yes, you are stuck defining put and get. Does that answer your question?
type Name = String
type Address = String
data Contacts = Contacts [(Name, Address)] deriving (Show)
instance Binary Contacts
put (Contacts [(n,a)]) = do ...
get = do ...
Since there are already instances:
instance (Binary a) => Binary [a]
instance (Binary a, Binary b) => Binary (a,b)
instance Binary Char
You should just be able to trivially lift the underlying put and get routines:
instance Binary Contacts where
put (Contacts set) = put set
get = fmap Contacts get
So when you put contacts you just tell it to put the list of pairs of strings. When you want to deserialize the contacts you just get the underlying list and use the Contacts constructor.
Adding more simple examples to prevent other noobs from suffering like me :)
{-# LANGUAGE RecordWildCards #-}
import Data.Binary
type Name = String
type Address = String
type Phone = String
data Contacts = Contacts [(Name, Address)] deriving (Show)
instance Binary Contacts where
put (Contacts set) = put set
get = fmap Contacts get
data Contact = Contact { name :: Name, address :: Address, phone :: Phone } deriving (Show)
instance Binary Contact where
put Contact{..} = do put name; put address; put phone
get = do name <- get; address <- get; phone <- get; return Contact{..}
main :: IO ()
main = do
let c = Contacts [("gert","home"),("gert2","home2")]
let e = encode c
print e
let d = decode e
print (d:: Contacts)
let c' = Contact{name="gert",address="home",phone="test"}
let e' = encode c'
print e'
let d' = decode e'
print (d':: Contact)