Safe modelling of relational data in Haskell - haskell

I find it very common to want to model relational data in my functional programs. For example, when developing a web-site I may want to have the following data structure to store info about my users:
data User = User
{ name :: String
, birthDate :: Date
}
Next, I want to store data about the messages users post on my site:
data Message = Message
{ user :: User
, timestamp :: Date
, content :: String
}
There are multiple problems associated with this data structure:
We don't have any way of distinguishing users with similar names and birth dates.
The user data will be duplicated on serialisation/deserialisation
Comparing the users requires comparing their data which may be a costly operation.
Updates to the fields of User are fragile -- you can forget to update all the occurences of User in your data structure.
These problems are manageble while our data can be represented as a tree. For example, you can refactor like this:
data User = User
{ name :: String
, birthDate :: Date
, messages :: [(String, Date)] -- you get the idea
}
However, it is possible to have your data shaped as a DAG (imagine any many-to-many relation), or even as a general graph (OK, maybe not). In this case, I tend to simulate the relational database by storing my data in Maps:
newtype Id a = Id Integer
type Table a = Map (Id a) a
This kind of works, but is unsafe and ugly for multiple reasons:
You are just an Id constructor call away from nonsensical lookups.
On lookup you get Maybe a, but often the database structurally ensures that there is a value.
It is clumsy.
It is hard to ensure referential integrity of your data.
Managing indices (which are very much necessary for performance) and ensuring their integrity is even harder and clumsier.
Is there existing work on overcoming these problems?
It looks like Template Haskell could solve them (as it usually does), but I would like not to reinvent the wheel.

The ixset library (or ixset-typed, a more type-safe version) will help you with this. It's the library that backs the relational part of acid-state, which also handles versioned serialization of your data and/or concurrency guarantees, in case you need it.
The Happstack Book has an IxSet tutorial.
The thing about ixset is that it manages "keys" for your data entries automatically.
For your example, one would create one-to-many relationships for your data types like this:
data User =
User
{ name :: String
, birthDate :: Date
} deriving (Ord, Typeable)
data Message =
Message
{ user :: User
, timestamp :: Date
, content :: String
} deriving (Ord, Typeable)
instance Indexable Message where
empty = ixSet [ ixGen (Proxy :: Proxy User) ]
You can then find the message of a particular user. If you have built up an IxSet like this:
user1 = User "John Doe" undefined
user2 = User "John Smith" undefined
messageSet =
foldr insert empty
[ Message user1 undefined "bla"
, Message user2 undefined "blu"
]
... you can then find messages by user1 with:
user1Messages = toList $ messageSet #= user1
If you need to find the user of a message, just use the user function like normal. This models a one-to-many relationship.
Now, for many-to-many relations, with a situation like this:
data User =
User
{ name :: String
, birthDate :: Date
, messages :: [Message]
} deriving (Ord, Typeable)
data Message =
Message
{ users :: [User]
, timestamp :: Date
, content :: String
} deriving (Ord, Typeable)
... you create an index with ixFun, which can be used with lists of indexes. Like so:
instance Indexable Message where
empty = ixSet [ ixFun users ]
instance Indexable User where
empty = ixSet [ ixFun messages ]
To find all the messages by an user, you still use the same function:
user1Messages = toList $ messageSet #= user1
Additionally, provided that you have an index of users:
userSet =
foldr insert empty
[ User "John Doe" undefined [ messageFoo, messageBar ]
, User "John Smith" undefined [ messageBar ]
]
... you can find all the users for a message:
messageFooUsers = toList $ userSet #= messageFoo
If you don't want to have to update the users of a message or the messages of a user when adding a new user/message, you should instead create an intermediary data type that models the relation between users and messages, just like in SQL (and remove the users and messages fields):
data UserMessage = UserMessage { umUser :: User, umMessage :: Message }
instance Indexable UserMessage where
empty = ixSet [ ixGen (Proxy :: Proxy User), ixGen (Proxy :: Proxy Message) ]
Creating a set of these relations would then let you query for users by messages and messages for users without having to update anything.
The library has a very simple interface considering what it does!
EDIT: Regarding your "costly data that needs to be compared": ixset only compares the fields that you specify in your index (so to find all the messages by a user in the first example, it compares "the whole user").
You regulate which parts of the indexed field it compares by altering the Ord instance. So, if comparing users is costly for you, you can add an userId field and modify the instance Ord User to only compare this field, for example.
This can also be used to solve the chicken-and-egg problem: what if you have an id, but neither a User, nor a Message?
You could then simply create an explicit index for the id, find the user by that id (with userSet #= (12423 :: Id)) and then do the search.

IxSet is the ticket. To help others who might stumble on this post here's a more fully expressed example,
{-# LANGUAGE OverloadedStrings, DeriveDataTypeable, TypeFamilies, TemplateHaskell #-}
module Main (main) where
import Data.Int
import Data.Data
import Data.IxSet
import Data.Typeable
-- use newtype for everything on which you want to query;
-- IxSet only distinguishes indexes by type
data User = User
{ userId :: UserId
, userName :: UserName }
deriving (Eq, Typeable, Show, Data)
newtype UserId = UserId Int64
deriving (Eq, Ord, Typeable, Show, Data)
newtype UserName = UserName String
deriving (Eq, Ord, Typeable, Show, Data)
-- define the indexes, each of a distinct type
instance Indexable User where
empty = ixSet
[ ixFun $ \ u -> [userId u]
, ixFun $ \ u -> [userName u]
]
-- this effectively defines userId as the PK
instance Ord User where
compare p q = compare (userId p) (userId q)
-- make a user set
userSet :: IxSet User
userSet = foldr insert empty $ fmap (\ (i,n) -> User (UserId i) (UserName n)) $
zip [1..] ["Bob", "Carol", "Ted", "Alice"]
main :: IO ()
main = do
-- Here, it's obvious why IxSet needs distinct types.
showMe "user 1" $ userSet #= (UserId 1)
showMe "user Carol" $ userSet #= (UserName "Carol")
showMe "users with ids > 2" $ userSet #> (UserId 2)
where
showMe :: (Show a, Ord a) => String -> IxSet a -> IO ()
showMe msg items = do
putStr $ "-- " ++ msg
let xs = toList items
putStrLn $ " [" ++ (show $ length xs) ++ "]"
sequence_ $ fmap (putStrLn . show) xs

I've been asked to write an answer using Opaleye. In fact there's not an awful lot to say, as the Opaleye code is fairly standard once you have a database schema. Anyway, here it is, assuming there is a user_table with columns user_id, name and birthdate, and a message_table with columns user_id, time_stamp and content.
This sort of design is explained in more detail in the Opaleye Basic Tutorial.
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE Arrows #-}
import Opaleye
import Data.Profunctor.Product (p2, p3)
import Data.Profunctor.Product.TH (makeAdaptorAndInstance)
import Control.Arrow (returnA)
data UserId a = UserId { unUserId :: a }
$(makeAdaptorAndInstance "pUserId" ''UserId)
data User' a b c = User { userId :: a
, name :: b
, birthDate :: c }
$(makeAdaptorAndInstance "pUser" ''User')
type User = User' (UserId (Column PGInt4))
(Column PGText)
(Column PGDate)
data Message' a b c = Message { user :: a
, timestamp :: b
, content :: c }
$(makeAdaptorAndInstance "pMessage" ''Message')
type Message = Message' (UserId (Column PGInt4))
(Column PGDate)
(Column PGText)
userTable :: Table User User
userTable = Table "user_table" (pUser User
{ userId = pUserId (UserId (required "user_id"))
, name = required "name"
, birthDate = required "birthdate" })
messageTable :: Table Message Message
messageTable = Table "message_table" (pMessage Message
{ user = pUserId (UserId (required "user_id"))
, timestamp = required "timestamp"
, content = required "content" })
An example query which joins the user table to the message table on the user_id field:
usersJoinMessages :: Query (User, Message)
usersJoinMessages = proc () -> do
aUser <- queryTable userTable -< ()
aMessage <- queryTable messageTable -< ()
restrict -< unUserId (userId aUser) .== unUserId (user aMessage)
returnA -< (aUser, aMessage)

Another radically different approach to representing relational data is used by the database package haskelldb. It doesn't work quite like the types you describe in your example, but it is designed to allow a type-safe interface to SQL queries. It has tools for generating data types from a database schema and vice versa. Data types such as the ones you describe work well if you always want to work with whole rows. But they don't work in situations where you want to optimize your queries by only selecting certain columns. This is where the HaskellDB approach can be useful.

I don't have a complete solution, but I suggest taking a look at the ixset package; it provides a set type with an arbitrary number of indices that lookups can be performed with. (It's intended to be used with acid-state for persistence.)
You do still need to manually maintain a "primary key" for each table, but you could make it significantly easier in a few ways:
Adding a type parameter to Id, so that, for instance, a User contains an Id User rather than just an Id. This ensures you don't mix up Ids for separate types.
Making the Id type abstract, and offering a safe interface to generating new ones in some context (like a State monad that keeps track of the relevant IxSet and the current highest Id).
Writing wrapper functions that let you, for example, supply a User where an Id User is expected in queries, and that enforce invariants (for example, if every Message holds a key to a valid User, it could allow you to look up the corresponding User without handling a Maybe value; the "unsafety" is contained within this helper function).
As an additional note, you don't actually need a tree structure for regular data types to work, since they can represent arbitrary graphs; however, this makes simple operations like updating a user's name impossible.

Related

How to elegantly avoid "Ambiguous type variable" when using Haskell type classes

I want to write a simple framework that deals with persisting entities.
The idea is to have an Entity type class and to provide generic persistence operations like
storeEntity :: (Entity a) => a -> IO ()
retrieveEntity :: (Entity a) => Integer -> IO a
publishEntity :: (Entity a) => a -> IO ()
Actual data types are instance of that Entity type class.
Even though the persistence operations are generic and don't need any information about the concrete data types You have to provide a type annotation at the call site to make GHC happy, like in:
main = do
let user1 = User 1 "Thomas" "Meier" "tm#meier.com"
storeEntity user1
user2 <- retrieveEntity 1 :: IO User -- how to avoid this type annotation?
publishEntity user2
Is there any way to avoid this kind of call site annotations?
I know that I don't need these annotations if the compiler can deduce the actual type from the context of the usage. So for example the following code works fine:
main = do
let user1 = User 1 "Thomas" "Meier" "tm#meier.com"
storeEntity user1
user2 <- retrieveEntity 1
if user1 == user2
then publishEntity user2
else fail "retrieve of data failed"
But I would like to be able to chain the polymorphic actions like so:
main = do
let user1 = User 1 "Heinz" "Meier" "hm#meier.com"
storeEntity user1
-- unfortunately the next line does not compile
retrieveEntity 1 >>= publishEntity
-- but with a type annotation it works:
(retrieveEntity 1 :: IO User) >>= publishEntity
But having a type annotation here breaks the elegance of the polymorphism...
For completeness sake I've included the full source code:
{-# LANGUAGE DeriveGeneric, DeriveAnyClass #-}
module Example where
import GHC.Generics
import Data.Aeson
-- | Entity type class
class (ToJSON e, FromJSON e, Eq e, Show e) => Entity e where
getId :: e -> Integer
-- | a user entity
data User = User {
userId :: Integer
, firstName :: String
, lastName :: String
, email :: String
} deriving (Show, Eq, Generic, ToJSON, FromJSON)
instance Entity User where
getId = userId
-- | load persistent entity of type a and identified by id
retrieveEntity :: (Entity a) => Integer -> IO a
retrieveEntity id = do
-- compute file path based on id
let jsonFileName = getPath id
-- parse entity from JSON file
eitherEntity <- eitherDecodeFileStrict jsonFileName
case eitherEntity of
Left msg -> fail msg
Right e -> return e
-- | store persistent entity of type a to a json file
storeEntity :: (Entity a) => a -> IO ()
storeEntity entity = do
-- compute file path based on entity id
let jsonFileName = getPath (getId entity)
-- serialize entity as JSON and write to file
encodeFile jsonFileName entity
-- | compute path of data file based on id
getPath :: Integer -> String
getPath id = ".stack-work/" ++ show id ++ ".json"
publishEntity :: (Entity a) => a -> IO ()
publishEntity = print
main = do
let user1 = User 1 "Thomas" "Meier" "tm#meier.com"
storeEntity user1
user2 <- retrieveEntity 1 :: IO User
print user2
You can tie the types of storeEntity and retrieveEntity together by adding a type-level tag to your entity's identifier Integer. I think your API design also has a small infelicity that isn't critical, but I'll fix it along the way anyway. Namely: Users shouldn't store their identifier. Instead have a single top-level type wrapper for identified things. This lets you write code once and for all that munges identifiers -- e.g. a function that takes an entity that doesn't yet have an ID (how would you even represent this with your definition of User?) and allocates a fresh ID for it -- without going back and modifying your Entity class and all its implementations. Also storing first and last names separately is wrong. So:
import Data.Tagged
data User = User
{ name :: String
, email :: String
} deriving (Eq, Ord, Read, Show)
type Identifier a = Tagged a Integer
data Identified a = Identified
{ ident :: Identifier a
, val :: a
} deriving (Eq, Ord, Read, Show)
Here my Identified User corresponds to your User, and my User doesn't have an analog in your version. The Entity class might look like this:
class Entity a where
store :: Identified a -> IO ()
retrieve :: Identifier a -> IO a
publish :: a -> IO () -- or maybe Identified a -> IO ()?
instance Entity User -- stub
As an example of the "write it once and for all" principle above, you may find it convenient for retrieve to actually associate the entity it returns with its identifier. This can be done uniformly for all entities now:
retrieveIDd :: Entity a => Identifier a -> IO (Identified a)
retrieveIDd id = Identified id <$> retrieve id
Now we can write an action that ties together the types of its store and retrieve actions:
storeRetrievePublish :: Entity a => Identified a -> IO ()
storeRetrievePublish e = do
store e
e' <- retrieve (ident e)
publish e'
Here ident e has rich enough type information that we know that e' must be an a, even though we don't have an explicit type signature for it. (The signature on storeRetrievePublish is also optional; the one given here is the one inferred by GHC.) Finishing touches:
main :: IO ()
main = storeRetrievePublish (Identified 1 (User "Thomas Meier" "tm#meier.com"))
If you don't want to define storeRetrievePublish explicitly, you can get away with this:
main :: IO ()
main = do
let user = Identified 1 (User "Thomas Meier" "tm#meier.com")
store user
user' <- retrieve (ident user)
publish user'
...but you can't unfold definitions much further: if you reduce ident user to just 1, you will have lost the tie between the type tag on the identifier used for store and for retrieve, and be back to your ambiguous type situation.

Haskell get types of Data Constructor

I was wondering if given a constructor, such as:
data UserType = User
{ username :: String
, password :: String
} -- deriving whatever necessary
What the easiest way is for me to get something on the lines of [("username", String), ("password", String)], short of just manually writing it. Now for this specific example it is fine to just write it but for a complex database model with lots of different fields it would be pretty annoying.
So far I have looked through Typeable and Data but so far the closest thing I have found is:
user = User "admin" "pass"
constrFields (toConstr user)
But that doesn't tell me the types, it just returns ["username", "password"] and it also requires that I create an instance of User.
I just knocked out a function using Data.Typeable that lets you turn a constructor into a list of the TypeReps of its arguments. In conjunction with the constrFields you found you can zip them together to get your desired result:
{-# LANGUAGE DeriveDataTypeable #-}
module Foo where
import Data.Typeable
import Data.Typeable.Internal(funTc)
getConsArguments c = go (typeOf c)
where go x = let (con, rest) = splitTyConApp x
in if con == funTc
then case rest of (c:cs:[]) -> c : go cs
_ -> error "arrows always take two arguments"
else []
given data Foo = Foo {a :: String, b :: Int} deriving Typeable, we get
*> getConsArguments Foo
[[Char],Int]
As one would hope.
On how to get the field names without using a populated data type value itself, here is a solution:
constrFields . head . dataTypeConstrs $ dataTypeOf (undefined :: Foo)

Writing an OOP-style "setter" function in Haskell using record-syntax

I'm reading a tutorial on lenses and, in the introduction, the author motivates the lens concept by showing a few examples of how we might implement OOP-style "setter"/"getter" using standard Haskell. I'm confused by the following example.
Let's say we define a User algebraic data types as per Figure 1 (below). The tutorial states (correctly) that we can implement "setter" functionality via the NaiveLens data type and the nameLens function (also in Figure 1). An example usage is given in Figure 2.
I'm perplexed as to why we need such an elaborate construct (i.e., a NaiveLens datatype and a nameLens function) in order to implement "setter" functionality, when the following (somewhat obvious) function seems to do the job equally well: set' a s = s {name = a}.
HOWEVER, given that my "obvious" function is none other than the lambda function that's part of nameLens, I suspect there is indeed an advantage to using the construct below but that I'm too dense to see what that advantage is. Am hoping one of the Haskell wizards can help me understand.
Figure 1 (definitions):
data User = User { name :: String
, age :: Int
} deriving Show
data NaiveLens s a = NaiveLens { view :: s -> a
, set :: a -> s -> s
}
nameLens :: NaiveLens User String
nameLens = NaiveLens name (\a s -> s {name = a})
Figure 2 (example usage):
λ: let john = User {name="John",age=30}
john :: User
λ: set nameLens "Bob" john
User {name = "Bob", age = 30}
it :: User
The main advantage of lenses is that they compose, so they can be used for accessing and updating fields in nested records. Writing this sort of nested update manually using record update syntax gets tedious quite quickly.
Say you added an Email data type:
data Email = Email
{ _handle :: String
, _domain :: String
} deriving (Eq, Show)
handle :: NaiveLens Email String
handle = NaiveLens _handle (\h e -> e { _handle = h })
And added this as a field to your User type:
data User = User
{ _name :: String
, _age :: Int
, _userEmail :: Email
} deriving (Eq, Show)
email :: NaiveLens User Email
email = NaiveLens _userEmail (\e u -> u { _userEmail = e })
The real power of lenses comes from being able to compose them, but this is a bit of a tricky step. We would like some function that looks like
(...) :: NaiveLens s b -> NaiveLens b a -> NaiveLens s a
NaiveLens viewA setA ... NaiveLens viewB setB
= NaiveLens (viewB . viewA) (\c a -> setA (setB c (viewA a)) a)
For an explanation of how this was written, I'll defer to this post, where I shamelessly lifted it from. The resulting set field of this new lens can be thought of as taking a new value and a top-level record, looking up the lower record and setting its value to c, then setting that new record for the top-level record.
Now we have a convenient function for composing our lenses:
> let bob = User "Bob" 30 (Email "bob" "gmail")
> view (email...handle) bob
"bob"
> set (email...handle) "NOTBOB" bob
User {_name = "Bob", _age = 30, _userEmail = Email {_handle = "NOTBOB", _domain = "gmail"}}
I've used ... as the composition operator here because I think it's rather easy to type and still is similar to the . operator. This now gives us a way to drill down into a structure, getting and setting values fairly arbitrarily. If we had a domain lens written similarly, we could get and set that value in much the same way. This is what makes it look like it's OOP member access, even when it's simply fancy function composition.
If you look at the lens library (my choice for lenses), you get some nice tools to automatically build the lenses for you using template haskell, and there's some extra stuff going on behind the scenes that lets you use the normal function composition operator . instead of a custom one.

Haskell: Create a list of only certain "kind" of type?

I've been working through both Learn You a Haskell and Beginning Haskell and have come on an interesting problem. To preface, I'm normally a C++ programmer, so forgive me if I have no idea what I'm talking about.
One of the exercises in Beginning Haskell has me create a type Client, which can be a Government organization, Company, or Individual. I decided to try out record syntax for this.
data Client = GovOrg { name :: String }
| Company { name :: String,
id :: Integer,
contact :: String,
position :: String
}
| Individual { fullName :: Person,
offers :: Bool
}
deriving Show
data Person = Person { firstName :: String,
lastName :: String,
gender :: Gender
}
deriving Show
data Gender = Male | Female | Unknown
deriving Show
This is used for an exercise where given a list of Clients, I have to find how many of each gender are in the list. I started by filtering to get a list of just Individuals since only they have the Gender type, but my method seems to be completely wrong:
listIndividuals :: [Client] -> [Client]
listIndividuals xs = filter (\x -> x == Individual) xs
How would I get this functionality where I can check what "kind" of Client something is. Also for the record syntax, how is my coding style? Too inconsistent?
First of all, I would recommend not using record types with algebraic types, because you end up with partial accessor functions. For example, it is perfectly legal to have the code position (Individual (Person "John" "Doe" Male) True), but it will throw a runtime error. Instead, consider something more like
data GovClient = GovClient {
govName :: String
} deriving Show
data CompanyClient = CompanyClient {
companyName :: String,
companyID :: Integer, -- Also, don't overwrite existing names, `id` is built-in function
companyContact :: String,
companyPosition :: String
} deriving Show
data IndividualClient = IndividualClient {
indvFullName :: Person,
indvOffers :: Bool
} deriving Show
Then you can have
data Client
= GovOrg GovClient
| Company CompanyClient
| Individual IndividualClient
deriving (Show)
Now you can also define your function as
isIndividualClient :: Client -> Bool
isIndividualClient (Individual _) = True
isIndividualClient _ = False
listIndividuals :: [Client] -> [IndividualClient]
listIndividuals clients = filter isIndividualClient clients
Or the more point-free form of
listIndividuals = filter isIndividualClient
Here, in order to make the decision I've simply used pattern matching in a separate function to determine which of Client's constructors was used. Now you get the full power of record and algebraic types, with just a hair more code to worry about, but a lot more safety. You'll never accidentally call a function expecting a government client on an individual client, for example, because it wouldn't type check, whereas with your current implementation it would be more than possible.
If you're concerned with the longer names, I would recommend eventually looking into the lens library that is designed to help you manipulate complex trees of record types with relative ease.
With your current implementation, you could also do something pretty similar to the final solution:
isIndividualClient :: Client -> Bool
isIndividualClient (Individual _ _) = True
isIndividualClient _ = False
listIndividuals :: [Client] -> [Client]
listIndividuals clients = filter isIndividualClient clients
The main difference here is that Individual takes two fields, so I have two _ wildcard matches in the pattern, and the type of listIndividuals is now [Client] -> [Client].

Why doesn't Haskell/GHC support record name overloading

I am a Haskell newbie. I have noticed that Haskell does not support record name overloading:
-- Records.hs
data Employee = Employee
{ firstName :: String
, lastName :: String
, ssn :: String
} deriving (Show, Eq)
data Manager = Manager
{ firstName :: String
, lastName :: String
, ssn :: String
, subordinates :: [Employee]
} deriving (Show, Eq)
When I compile this I get:
[1 of 1] Compiling Main ( Records.hs, Records.o )
Records.hs:10:5:
Multiple declarations of `firstName'
Declared at: Records.hs:4:5
Records.hs:10:5
Records.hs:11:5:
Multiple declarations of `lastName'
Declared at: Records.hs:5:5
Records.hs:11:5
Records.hs:12:5:
Multiple declarations of `ssn'
Declared at: Records.hs:6:5
Records.hs:12:5
Given the "strength" of the Haskell type system, it seems like it should be easy for the compiler to determine which field to access in
emp = Employee "Joe" "Smith" "111-22-3333"
man = Manager "Mary" "Jones" "333-22-1111" [emp]
firstName man
firstName emp
Is there some issue that I am not seeing. I know that the Haskell Report does not allow this, but why not?
Historical reasons. There have been many competing designs for better record systems for Haskell -- so many in fact, that no consensus could be reached. Yet.
The current record system is not very sophisticated. It's mostly some syntactic sugar for things you could do with boilerplate if there was no record syntax.
In particular, this:
data Employee = Employee
{ firstName :: String
, lastName :: String
, ssn :: String
} deriving (Show, Eq)
generates (among other things) a function firstName :: Employee -> String.
If you also allow in the same module this type:
data Manager = Manager
{ firstName :: String
, lastName :: String
, ssn :: String
, subordinates :: [Employee]
} deriving (Show, Eq)
then what would be the type of the firstName function?
It would have to be two separate functions overloading the same name, which Haskell does not allow. Unless you imagine that this would implicitly generate a typeclass and make instances of it for everything with a field named firstName (gets messy in the general case, when the fields could have different types), then Haskell's current record system isn't going to be able to support multiple fields with the same name in the same module. Haskell doesn't even attempt to do any such thing at present.
It could, of course, be done better. But there are some tricky problems to solve, and essentially no one's come up with solutions to them that have convinced everyone that there is a most promising direction to move in yet.
One option to avoid this is to put your data types in different modules and use qualified imports. In that way you can use the same field accessors on different data records and keep you code clean and more readable.
You can create one module for the employee, for example
module Model.Employee where
data Employee = Employee
{ firstName :: String
, lastName :: String
, ssn :: String
} deriving (Show, Eq)
And one module for the Manager, for example:
module Model.Manager where
import Model.Employee (Employee)
data Manager = Manager
{ firstName :: String
, lastName :: String
, ssn :: String
, subordinates :: [Employee]
} deriving (Show, Eq)
And then wherever you want to use these two data types you can import them qualified and access them as follows:
import Model.Employee (Employee)
import qualified Model.Employee as Employee
import Model.Manager (Manager)
import qualified Model.Manager as Manager
emp = Employee "Joe" "Smith" "111-22-3333"
man = Manager "Mary" "Jones" "333-22-1111" [emp]
name1 = Manager.firstName man
name2 = Employee.firstName emp
Keep in mind that after all you are using two different data types and thus Manger.firstName is another function than Employee.firstName, even when you know that both data types represent a person and each person has a first name. But it is up to you how far you go to abstract data types, for example to create a Person data type from those "attribute collections" as well.

Resources