I've decided to try functional programming and PureScript. After reading "Learn You a Haskell for Great Good!" and "PureScript by Example" and playing with some code, I think I can say that I understand the basics, but one thing bothers me a lot: my code looks very coupled. I tend to change libraries quite often, and in OOP I can use an onion architecture to decouple my own code from the library-specific code, but I have no idea how to do this in PureScript.
I've tried to find out how people do this in Haskell, but all I could find were answers like "No one has ever made complex apps in Haskell, so no one knows how to do it" or "You have input and you have output, everything in between are just pure functions". But at the moment I have a toy app that uses virtual DOM, signals, web storage and router libs, and each of them has its own effects and data structures, so it doesn't sound like one input and one output.
So my question is: how should I structure my code, or what techniques should I use, so that I can change my libs without rewriting half of my app?
Update:
The suggestion to use several layers and keep effects in the main module is quite common too, and I understand why I should do so.
Here is a simple example that will hopefully illustrate the problem I'm talking about:
btnHandler :: forall ev eff. (MouseEvent ev) => ev -> Eff (dom :: DOM, webStorage :: WebStorage, trace :: Trace | eff) Unit
btnHandler e = do
btn <- getTarget e
Just btnId <- getAttribute "id" btn
Right clicks <- (getItem localStorage btnId) >>= readNumber
let newClicks = clicks + 1
trace $ "Button #" ++ btnId ++ " has been clicked " ++ (show newClicks) ++ " times"
setText (show newClicks) btn
setItem localStorage btnId $ show newClicks
-- ... maybe some other actions
return unit
-- ... other handlers for different controllers
btnController :: forall e. Node -> _ -> Eff (dom :: DOM, webStorage :: WebStorage, trace :: Trace | e) Unit
btnController mainEl _ = do
delegateEventListener mainEl "click" "#btn1" btnHandler
delegateEventListener mainEl "click" "#btn2" btnHandler
delegateEventListener mainEl "click" "#btn3" btnHandler
-- ... render buttons
return unit
-- ... other controllers
main :: forall e. Eff (dom :: DOM, webStorage :: WebStorage, trace :: Trace, router :: Router | e) Unit
main = do
Just mainEl <- body >>= querySelector "#wrapper"
handleRoute "/" $ btnController mainEl
-- ... other routes, each with its own controller
return unit
Here we have a simple counter app with routing, web storage, DOM manipulation and console logging. As you can see, there is no single input and single output. We can get input from the router or from event listeners and use the console or the DOM as output, so it becomes a little more complicated.
Having all this effectful code in the main module feels wrong to me for two reasons:
If I keep adding routes and controllers, this module will quickly turn into a thousand-line mess.
Keeping routing, DOM manipulation and data storage in the same module violates the single responsibility principle (and I assume that matters in FP too).
We could split this module into several, for example one module per controller, and create some kind of effectful layer. But then, when I have ten controller modules and I want to change my DOM-specific lib, I have to edit them all.
Both of these approaches are far from ideal, so the question is: which one should I choose? Or maybe there is some other way to go?
There's no reason you can't have a middle layer for abstracting over dependencies. Let's say you want to use a router for your application. You can define a "router abstraction" library that would look like the following:
module App.Router where
import SomeRouterLib
-- Type synonym to make it easy to change later
type Route = SomeLibraryRouteType
-- Just an alias to the Router library
makeRoute :: String -> Route -> Route
makeRoute = libMakeRoute
Then the new shiny thing comes out and you want to switch routing libraries. You'll need to make a new module that conforms to the same API -- same types, same functions, but backed by the new library -- an adapter, if you will.
module App.RouterAlt where
import AnotherRouterLib
type Route = SomeOtherLibraryType
makeRoute :: String -> Route -> Route
makeRoute = otherLibMakeRoute
In your main app, you can now swap the imports, and everything should work alright. There will likely be more massaging that needs to happen to get the types and functions working as you'd expect them, but that's the general idea.
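For concreteness, a minimal consumer module (hypothetical, just to show the shape) would only ever import App.Router, so switching libraries means editing that one abstraction module rather than every call site:
module App.Main where
-- Point this import at App.RouterAlt instead (or rename that module to
-- App.Router) to switch routing libraries; nothing below needs to change.
import App.Router (Route, makeRoute)
homeRoute :: Route -> Route
homeRoute = makeRoute "/"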
Your example code is very imperative in nature. It's not idiomatic functional code, and I think you're correct in noting that it's not sustainable. For more idiomatic functional approaches, have a look at purescript-halogen and purescript-thermite.
Consider the UI as a pure function of current application state. In other words, given the current value of things, what does my app look like? Also, consider that the current state of the application can be derived from applying a series of pure functions to some initial state.
What is your application state?
data AppState = AppState { buttons :: [Button] }
data Button = Button { numClicks :: Integer }
What kind of events are you looking at?
data Event = ButtonClick { buttonId :: Integer }
How do we handle that Event?
handleEvent :: AppState -> Event -> AppState
handleEvent state (ButtonClick id) =
let newButtons = incrementButton id (buttons state)
in AppState { buttons = newButtons }
incrementButton :: Integer -> [Button] -> [Button]
incrementButton _ [] = []
incrementButton 0 (b:bs) = Button (1 + numClicks b) : bs
incrementButton i (b:bs) = b : incrementButton (i - 1) bs
How do you render the application, based on the current state?
render :: AppState -> Html
render state =
let currentButtons = buttons state
btnList = concatMap renderButton currentButtons
renderButton btn = "<li><button>" ++ show (numClicks btn) ++ "</button></li>"
in "<div><ul>" ++ btnList ++ "</ul></div>"
This is a bit of an open ended question, so it's hard to answer specifically without concrete examples.
You have input and you have output, everything in between are just pure functions
Statements like this are actually pretty close to the truth. Since there are no stateful objects in Haskell and PureScript, the majority of the code in an app will be based around pure functions and simple data types (or records), and therefore it is not tightly coupled to any particular library (aside from things like Maybe, Either, Tuple, and so on, which aren't really libraries in the sense you're talking about).
As much as possible you should try to push code that uses effects to the “outside”. This is where you interleave the various libraries you require to process whatever inputs and produce whatever outputs your app requires. This layering makes it easy to switch libraries in and out, as here you'll mostly be lifting your core pure code into the Eff monad to “wire it up” to the external inputs and outputs.
One way of looking at it, is if you find yourself using Eff much outside of the main module or top layer of your app, you're probably “doing it wrong”.
If you're writing Haskell, substitute anywhere I mention Eff with IO.
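As a minimal sketch of that layering (in Haskell, per the note above): the counting and message-formatting logic from the question can be ordinary pure functions, and only a thin wrapper touches IO. Here readCount and writeCount are hypothetical stand-ins for whatever storage library happens to be in use:
-- Pure core: no DOM, no storage, no console.
bumpCount :: Maybe Int -> Int
bumpCount = maybe 1 (+ 1)
clickMessage :: String -> Int -> String
clickMessage btnId n = "Button #" ++ btnId ++ " has been clicked " ++ show n ++ " times"
-- Thin effectful edge: swap the storage library and only this part changes.
onClick :: (String -> IO (Maybe Int)) -> (String -> Int -> IO ()) -> String -> IO ()
onClick readCount writeCount btnId = do
  n <- bumpCount <$> readCount btnId
  putStrLn (clickMessage btnId n)
  writeCount btnId n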
Related
I'm writing a sort of scraper or data miner in Haskell. It consists of a main loop and other shared logic, along with a number of "adapters" each of which is designed to scrape a particular type of resource (not only web pages but possibly filesystem objects as well). The adapters all produce the same type of result but I would like them to be independent otherwise. Also, I would like them to share the main loop and other logic.
This is what I have so far, using ExistentialQuantification to hide the adapter affiliation of a scraping job. My idea is that the main loop processes a sequence of jobs, dispatching on the "process" method to find the right adapter implementation.
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE ExistentialQuantification #-}
import Control.Monad.Trans.Except
import Data.Text (Text)
import qualified Data.Text as T
-- Metadata about an article
data Article = Article
{ articleSource :: Text
, articleTitle :: Text
, articleUrl :: Text
, articleText :: Text
, articleDate :: Maybe Text
}
deriving (Show)
-- Adapter for scraping a certain type of resource.
--
-- The adapter provides a seed job constructor which samples the current config
-- and creates a seed job to be run first. The seed job then generates all
-- other jobs, e.g. by scraping an index page.
--
-- Each adapter module exports a value of this type, and nothing else.
data Adapter = Adapter
{ adapterName :: Text
, adapterSeedJob :: ScraperConfig -> AnyJob
}
-- Each scrape operation produces zero or more articles, and zero or more
-- new scrape jobs.
type ScrapeResult j = ExceptT AdapterErr IO ([Article], [j])
-- Specification of a resource to be scraped by a certain adapter.
--
-- Each adapter defines its own job type, containing the necessary information.
class ScrapeJob j where
jobAdapter :: j -> Text
jobDesc :: j -> Text
jobProcess :: j -> ScrapeResult j
-- Opaque type encapsulating any adapter's job type, used outside the adapter.
data AnyJob = forall j. ScrapeJob j => AnyJob j
instance ScrapeJob AnyJob where
jobAdapter (AnyJob j) = jobAdapter j
jobDesc (AnyJob j) = jobDesc j
jobProcess (AnyJob j) = wrap <$> jobProcess j
where wrap (as, js) = (as, AnyJob <$> js)
-- Global configuration, such as which passwords to use
data ScraperConfig = ScraperConfig
{ -- ...
}
My problem is that each adapter also has some context attached to it. One example would be that most adapters need to perform some kind of login procedure before they can access any data. I would like this to be handled separately from the scraping itself if possible, but I can't use the AnyJob trick for an "AnyContext" (I think), since there would be no way for the main loop to guarantee that the types of the AnyJob and AnyContext match up correctly.
My current ideas, neither of which I'm really satisfied with, are:
Fix the type of the context to a "bag of data" such as Map Text Text and add a special setup method to each adapter which creates this context.
Add the required context as a field of every ScrapeJob instance, and copy it along explicitly whenever a new job is created (sketched after this list).
Turn the design upside down, letting each adapter run its own main loop using shared functions defined in a utility module.
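To make the second idea concrete, here is a rough sketch (everything named Fancy* is hypothetical and just stands in for one adapter; it reuses the extensions and declarations above):
-- Sketch of idea 2: each job carries its own context, so the context type
-- can never get out of sync with the adapter that understands it.
data FancySession = FancySession { fancyAuthToken :: Text }
data FancyJob = FancyJob
  { fancySession :: FancySession  -- per-adapter context, copied into new jobs
  , fancyUrl     :: Text
  }
instance ScrapeJob FancyJob where
  jobAdapter _ = "fancy"
  jobDesc j    = fancyUrl j
  jobProcess j = do
    (articles, urls) <- fancyScrape (fancySession j) (fancyUrl j)
    -- new jobs copy the session along explicitly
    pure (articles, [FancyJob (fancySession j) u | u <- urls])
-- Hypothetical scraping primitive for this adapter.
fancyScrape :: FancySession -> Text -> ExceptT AdapterErr IO ([Article], [Text])
fancyScrape = error "placeholder"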
Is there something I'm missing here? Any advice on how to improve on this design would be appreciated.
Thanks!
I'm new to Haskell and I like its programming approach a lot!
I've been running into this problem for the past 2 days, and no matter what I try, it refuses to work. I think I am confusing something about the nature of Haskell.
Here is my code (that doesn't work):
data Part = Part {partName :: String, events :: [Event]}
class Event e where
getChildren :: e -> [e]
This is for a music application. Event can be a NoteEvent, a DecrescendoEvent, a VolumeEvent, a KeyEvent, or anything that would "happen" in a piece of music.
A NoteEvent would have no children while a DecrescendoEvent would have child events that it would gradually reduce the volume of.
What am I doing wrong?
The problem you're having is that a sequence like [a] can only contain one type. There are two ways to deal with this.
1) Combine all your events into a single algebraic datatype, which might look something like this:
data Event = NoteEvent {note :: Int, duration :: Int}
| DecrescendoEvent {getChildren :: [Event]}
| VolumeEvent {change :: Double, getChildren :: [Event]}
...and so on. Now you can work with data of type [Event], and the children are included directly as record fields (the getChildren fields above).
2) Wrap the Event type in some sort of "wrapper" type, and create collections of that type. See http://www.haskell.org/haskellwiki/Heterogenous_collections for more information on this.
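For the second option, a minimal sketch of the wrapper type (keeping the original class) could look like this; the existential hides which concrete event type is inside, so one list can mix note events, decrescendo events, and so on:
{-# LANGUAGE ExistentialQuantification #-}
class Event e where
  getChildren :: e -> [e]
-- Wrapper that can hold any instance of Event.
data AnyEvent = forall e. Event e => AnyEvent e
-- The wrapper is itself an Event, so nested children keep working.
instance Event AnyEvent where
  getChildren (AnyEvent e) = map AnyEvent (getChildren e)
data Part = Part { partName :: String, events :: [AnyEvent] }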
Motivation
I'm currently working on a little hobby project to try and implement something like TaskJuggler in Haskell, mostly as an experiment to play with writing domain specific languages.
My current goal is to have a small DSL for building up the description of a Project, along with its associated Tasks. There is no hierarchy yet, though that'll be my next extension. Currently, I have the following data types:
data Project = Project { projectName :: Text
, projectStart :: Day
, projectEnd :: Day
, projectMaxHoursPerDay :: Int
, projectTasks :: [Task]
}
deriving (Eq, Show)
data Task = Task { taskName :: Text }
deriving (Eq, Show)
Nothing too crazy there, I'm sure you will agree.
Now I want to create a DSL to build up projects/tasks. I can use Writer [Task] monad to build up tasks, but this won't scale well. We might be able to do the following now:
project "LambdaBook" startDate endDate $ do
task "Web site"
task "Marketing"
Where project :: Text -> Day -> Day -> Writer [Task] a -> Project, which runs the Writer to get a list of tasks, and chooses a default value such as 8 for projectMaxHoursPerDay.
But I will later want to be able to do something like:
project "LambdaBook" $ do
maxHoursPerDay 4
task "Web site"
task "Marketing"
So I'm using maxHoursPerDay to specify a (future) property about a Project. I can no longer use a Writer for this, because [Task] isn't able to capture everything I need.
I see two possibilities for solving this problem:
Separate "optional" properties into their own monoid
I could split Project into:
data Project = Project { projectName, projectStart, projectEnd, projectProperties }
data ProjectProperties = ProjectProperties { projectMaxHoursPerDay :: Maybe Int
, projectTasks :: [Task]
}
Now I can have an instance Monoid ProjectProperties. When I run Writer ProjectProperties I can do all the defaulting I need to build a Project. I suppose there's no reason that Project needs to embed ProjectProperties - it could even have the same definition as above.
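A sketch of what this first option might look like (the builder names are mine; it targets the original five-field Project record and does the 8-hour defaulting mentioned earlier):
import Control.Monad.Writer (Writer, execWriter, tell)
import Data.Maybe (fromMaybe)
import Data.Text (Text)
import Data.Time.Calendar (Day)
instance Semigroup ProjectProperties where
  ProjectProperties h1 t1 <> ProjectProperties h2 t2 =
    ProjectProperties (maybe h1 Just h2) (t1 ++ t2)  -- the later setting wins
instance Monoid ProjectProperties where
  mempty = ProjectProperties Nothing []
type ProjectBuilder = Writer ProjectProperties
task :: Text -> ProjectBuilder ()
task name = tell (ProjectProperties Nothing [Task name])
maxHoursPerDay :: Int -> ProjectBuilder ()
maxHoursPerDay h = tell (ProjectProperties (Just h) [])
-- Run the builder and default the optional properties.
project :: Text -> Day -> Day -> ProjectBuilder a -> Project
project name start end body =
  let ProjectProperties mh ts = execWriter body
  in Project name start end (fromMaybe 8 mh) ts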
Use the bindable functor Semigroup m => Writer m
While Project isn't a Monoid, it can certainly be made into a Semigroup. Name/start/end are First, maxHoursPerDay is Last, and projectTasks is [Task]. We can't have a Writer monad over a Semigroup, but we can have a Writer bindable functor.
The Actual Question
With the first solution - a dedicated 'properties' Monoid - we can use the full power of a monad, at a choice of costs. I could duplicate the overridable properties in Project and ProjectProperties, where the latter wraps each property in an appropriate monoid. Or I could just write the monoid once and embed it inside the Project - though I give up type safety (maxHoursPerDay must be Just when I actually produce the project plan!).
A bindable functor removes both the code duplication and retains type safety, but at the immediate cost of giving up syntax sugar, and the potentially longer term cost of being a pain to work with (due to lack of return/pure).
I have examples of both approaches at http://hpaste.org/82024 (for bindable functors) and http://hpaste.org/82025 (for the monad approach). These examples go a little beyond what's in this SO post (which was big enough already), and have Resource along with Task. Hopefully this will indicate why I need to go as far as Bind (or Monad) in the DSL.
I'm excited to have even found an applicable use for bindable functors, so I'm happy to hear any thoughts or experience you might have.
data Project maxHours = Project {tasks :: [Task], maxHourLimit :: maxHours}
defProject = Project [] ()
setMaxHours :: Double -> Project () -> Project Double
setMaxHours = ...
addTask :: Task -> Project a -> Project a
type CompleteProject = Project Double...
runProject :: CompleteProject -> ...
storeProject :: CompleteProject -> ...
You need function composition now, instead of actions in a writer, but this pattern lets you start with a partially populated record, and set those things that need to be set once and only once with plenty of type safety. It even lets you impose constraints on the relationship between various set and unset values in the final result.
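Hypothetical usage of the pattern above (assuming setMaxHours also takes the limit and addTask takes the Task, as in the signatures sketched, plus OverloadedStrings for the Text literals):
myProject :: Project Double
myProject = setMaxHours 4
          . addTask (Task "Marketing")
          . addTask (Task "Web site")
          $ defProject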
An interesting solution that was proposed on Google+ was to use a normal Writer monad, but using the Endo Project monoid. Along with lens, this yields a very nice DSL:
data Project = Project { _projectName :: String
, _projectStart :: Day
, _projectEnd :: Day
, _projectTasks :: [Task]
}
deriving (Eq, Show)
makeLenses ''Project
Along with the operation
task :: String -> ProjectBuilder Task
task name = t <$ mapProject (projectTasks <>~ [t])
where t = Task name []
Which can be used with the original DSL. This is probably the best solution for what I want (though maybe using a monad is just too much of an abuse of syntax anyway).
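For completeness, ProjectBuilder and mapProject here are roughly the Writer-over-Endo machinery, along the lines of this sketch (buildProject is just an illustrative name):
import Control.Monad.Writer (Writer, execWriter, tell)
import Data.Monoid (Endo(..))
type ProjectBuilder = Writer (Endo Project)
-- Record one update to the project being built.
mapProject :: (Project -> Project) -> ProjectBuilder ()
mapProject = tell . Endo
-- Run a builder by applying the accumulated updates to a starting project.
buildProject :: Project -> ProjectBuilder a -> Project
buildProject start builder = appEndo (execWriter builder) start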
This is sort of a non-answer, but I feel it should be said.
Isn't record syntax good enough? Do you really need a DSL for marginally improved syntax?
defaultProject
{ projectName = "Lambdabook"
, projectStart = startDate
, projectEnd = endDate
, projectTasks =
[ Task "Web site"
, Task "Marketing"
]
}
Tangentially, a Racketeer once told me that Haskell only has one macro: do syntax. So Haskellers shoehorn everything into monads whenever they want to manipulate syntax.
I'm sorry this problem description is so abstract: it's for my job, and for commercial confidentiality reasons I can't give the real-world problem, just an abstraction.
I've got an application that receives messages containing key-value pairs. The keys are from a defined set of keywords, and each keyword has a fixed data type. So if "Foo" is an Integer and "Bar" is a date you might get a message like:
Foo: 234
Bar: 24 September 2011
A message may have any subset of keys in it. The number of keys is fairly large (several dozen). But let's stick with Foo and Bar for now.
Obviously there is a record like this corresponding to the messages:
data MyRecord = MyRecord {
foo :: Maybe Integer,
bar :: Maybe UTCTime
-- ... and so on for several dozen fields.
}
The record uses "Maybe" types because that field may not have been received yet.
I also have many derived values that I need to compute from the current values (if they exist). For instance I want to have
baz :: MyRecord -> Maybe String
baz r = do -- Maybe monad
f <- foo r
b <- bar r
return $ show f ++ " " ++ show b
Some of these functions are slow, so I don't want to repeat them unnecessarily. I could recompute baz for each new message and memo it in the original structure, but if a message leaves the foo and bar fields unchanged then that is wasted CPU time. Conversely I could recompute baz every time I want it, but again that would waste CPU time if the underlying arguments have not changed since last time.
What I want is some kind of smart memoisation or push-based recomputation that only recomputes baz when the arguments change. I could detect this manually by noting that baz depends only on foo and bar, and so only recomputing it on messages that change those values, but for complicated functions that is error-prone.
An added wrinkle is that some of these functions may have multiple strategies. For instance you might have a value that can be computed from either Foo or Bar using 'mplus'.
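For example, such a fallback might look like this (quux is just an illustrative name), using the Maybe MonadPlus:
import Control.Monad (mplus)
-- Derive a string from Foo if present, otherwise fall back to Bar.
quux :: MyRecord -> Maybe String
quux r = (show <$> foo r) `mplus` (show <$> bar r)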
Does anyone know of an existing solution to this? If not, how should I go about it?
I'll assume that you have one "state" record and these messages all involve updating it. So if Foo is 12, it may later be 23, and therefore the output of baz would change. If any of this is not the case, then the answer becomes pretty trivial.
Let's start with the "core" of baz -- a function not on the record, but on the values you want.
baz :: Int -> Int -> String
Now let's transform it:
data Cached a b = Cached (Maybe (a,b)) (a -> b)
getCached :: Eq a => Cached a b -> a -> (b,Cached a b)
getCached c@(Cached (Just (arg,res)) f) x | x == arg = (res,c)
getCached (Cached _ f) x = let ans = f x in (ans, Cached (Just (x,ans)) f)
bazC :: Cached (Int,Int) String
bazC = Cached Nothing (uncurry baz)
Now whenever you would use the normal function, you use the cache-transformed function instead, substituting the updated Cached value back into your record. This is essentially a manual memotable of size one.
For the basic case you describe, this should be fine.
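A sketch of how that substitution might be threaded through whatever state you carry between messages (the names are mine, and I'm using the simplified Int fields from above for foo and bar):
data ProcState = ProcState
  { theRecord :: MyRecord                 -- foo, bar :: Maybe Int here
  , bazCache  :: Cached (Int, Int) String
  }
bazOf :: ProcState -> (Maybe String, ProcState)
bazOf st = case (,) <$> foo (theRecord st) <*> bar (theRecord st) of
  Nothing   -> (Nothing, st)
  Just args -> let (res, c') = getCached (bazCache st) args
               in (Just res, st { bazCache = c' })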
A fancier and more generalized solution involving a dynamic graph of dependencies goes under the name "incremental computation" but I've seen research papers for it more than serious production implementations. You can take a look at these for starters, and follow the reference trail forward:
http://www.carlssonia.org/ogi/Adaptive/
http://www.andres-loeh.de/Incrementalization/paper_final.pdf
Incremental computation is actually also very related to functional reactive programming, so you can take a look at conal's papers on that, or play with Heinrich Apfelmus' reactive-banana library: http://www.haskell.org/haskellwiki/Reactive-banana
In imperative languages, take a look at trellis in python: http://pypi.python.org/pypi/Trellis or Cells in lisp: http://common-lisp.net/project/cells/
You can build a stateful graph that corresponds to computations you need to do. When new values appear you push these into the graph and recompute, updating the graph until you reach the outputs. (Or you can store the value at the input and recompute on demand.) This is a very stateful solution but it works.
Are you perhaps creating market data, like yield curves, from live inputs of rates etc.?
What I want is some kind of smart memoisation or push-based recomputation that only recomputes baz when the arguments change.
It sounds to me like you want a variable that is sort of immutable, but allows a one-time mutation from "nothing computed yet" to "computed". Well, you're in luck: this is exactly what lazy evaluation gives you! So my proposed solution is quite simple: just extend your record with fields for each of the things you want to compute. Here's an example of such a thing, where the CPU-intensive task we're doing is breaking some encryption scheme:
data Foo = Foo
{ ciphertext :: String
, plaintext :: String
}
-- a smart constructor for Foo's
foo c = Foo { ciphertext = c, plaintext = crack c }
The point here is that calls to foo have expenses like this:
If you never ask for the plaintext of the result, it's cheap.
On the first call to plaintext, the CPU churns a long time.
On subsequent calls to plaintext, the previously computed answer is returned immediately.
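Applied to the record from the question, the same trick is just an extra lazy field plus a smart constructor (my sketch; the derived computation is the baz from the question):
import Data.Time.Clock (UTCTime)
data MyRecord = MyRecord
  { foo :: Maybe Integer
  , bar :: Maybe UTCTime
  , baz :: Maybe String   -- derived, but stored as an ordinary lazy field
  }
mkRecord :: Maybe Integer -> Maybe UTCTime -> MyRecord
mkRecord f b = MyRecord
  { foo = f
  , bar = b
  , baz = (\x y -> show x ++ " " ++ show y) <$> f <*> b
  }
The caveat is that the cached result is only shared as long as you keep the same record value around: if a message changes neither foo nor bar, reuse the old record (or copy its baz field over) rather than rebuilding with mkRecord, otherwise the thunk is recreated.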
Setup:
I have several collections of various data structures which represent the state of simulated objects in a virtual system. I also have a number of functions that transform these objects (that is, create a new copy of the object based on the original and zero or more parameters).
The goal is to allow a user to select some objects to apply transformations to (within the rules of the simulation), apply those functions to those objects, and update the collections by replacing the old objects with the new ones.
I would like to be able to build up a function of this type by combining smaller transformations into larger ones. Then evaluate this combined function.
Questions:
How do I structure my program to make this possible?
What kind of combinator do I use to build up a transaction like this?
Ideas:
Put all the collections into one enormous structure and pass this structure around.
Use a state monad to accomplish basically the same thing
Use IORef (or one of its more potent cousins like MVar) and build up an IO action
Use a Functional Reactive Programing Framework
1 and 2 seem like they carry a lot of baggage around, especially if I envision eventually moving some of the collections into a database. (Darn IO monad)
3 seems to work well but starts to look a lot like recreating OOP. I'm also not sure at what level to use the IORef (e.g. IORef (Collection Obj), Collection (IORef Obj), or data Obj { field :: IORef Type }).
4 feels the most functional in style, but it also seems to create a lot of code complexity without much payoff in terms of expressiveness.
Example
I have a web store front. I maintain a collections of products with (among other things) the quantity in stock and a price. I also have a collection of users who have credit with the store.
A user comes along and selects 3 products to buy and goes to check out using store credit. I need to create a new products collection that has the amount in stock for the 3 products reduced, and create a new user collection with the user's account debited.
This means I get the following:
checkout :: Cart -> ProductsCol -> UserCol -> (ProductsCol, UserCol)
But then life gets more complicated and I need to deal with taxes:
checkout :: Cart -> ProductsCol -> UserCol -> TaxCol
-> (ProductsCol, UserCol, TaxCol)
And then I need to be sure to add the order to the shipping queue:
checkout :: Cart
-> ProductsCol
-> UserCol
-> TaxCol
-> ShipList
-> (ProductsCol, UserCol, TaxCol, ShipList)
And so forth...
What I would like to write is something like
checkout = updateStockAmount <*> applyUserCredit <*> payTaxes <*> shipProducts
applyUserCredit = debitUser <*> creditBalanceSheet
but the type-checker would go apoplectic on me. How do I structure this store such that the checkout or applyUserCredit functions remain modular and abstract? I cannot be the only one to have this problem, right?
Okay, let's break this down.
You have "update" functions with types like A -> A for various specific types A, which may be derived from partial application, that specify a new value of some type in terms of a previous value. Each such type A should be specific to what that function does, and it should be easy to change those types as the program develops.
You also have some sort of shared state, which presumably contains all the information used by any of the aforementioned update functions. Further, it should be possible to change what the state contains, without significantly impacting anything other than the functions acting directly on it.
Additionally, you want to be able to abstractly combine update functions, without compromising the above.
We can deduce a few necessary features of a straightforward design:
An intermediate layer will be necessary, between the full shared state and the specifics needed by each function, allowing pieces of the state to be projected out and replaced independently of the rest.
The types of the update functions themselves are mutually incompatible by definition, with no real shared structure, so to compose them you'll first need to combine each with its portion of the intermediate layer. This will give you updates acting on the entire state, which can then be composed in the obvious way.
The only operations needed on the shared state as a whole are to interface with the intermediate layer, and whatever may be necessary to maintain the changes made.
This breakdown allows each entire layer to be modular to a large extent; in particular, type classes can be defined to describe the necessary functionality, allowing any relevant instance to be swapped in.
In particular, this essentially unifies your ideas 2 and 3. There's an inherent monadic context of some sort here, and the type class interface suggested would allow multiple approaches, such as:
Make the shared state a record type, store it in a State monad, and use lenses to provide the interface layer.
Make the shared state a record type containing something like an STRef for each piece, and combine field selectors with ST monad update actions to provide the interface layer.
Make the shared state a collection of TChans, with separate threads to read/write them as appropriate to communicate asynchronously with an external data store.
Or any number of other variations.
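Whichever instance you pick, the "lift through the intermediate layer, then compose" step boils down to something like this sketch (names are mine; plain getter/setter pairs stand in for lenses or whatever interface you choose):
-- An update on the whole shared state; these compose with plain (.).
type Update s = s -> s
-- Lift an update on one piece of the state to a whole-state update,
-- given a projection and an injection for that piece.
onPiece :: (s -> a) -> (a -> s -> s) -> (a -> a) -> Update s
onPiece get put f s = put (f (get s)) s
-- A checkout-like pipeline is then just: shipStep . taxStep . debitStep . stockStep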
You can store your state in a record, and use lenses to update pieces of state. This lets you write the individual state updating components as simple, focused functions that may be composed to build more complex checkout functions.
{-# LANGUAGE TemplateHaskell #-}
import Data.Lens.Template
import Data.Lens.Common
import Data.List (foldl')
import Data.Map ((!), Map, adjust, fromList)
type User = String
type Item = String
type Money = Int -- money in pennies
type Prices = Map Item Money
type Cart = (User, [(Item,Int)])
type ProductsCol = Map Item Int
type UserCol = Map User Money
data StoreState = Store { _stock :: ProductsCol
, _users :: UserCol
, msrp :: Prices }
deriving Show
makeLens ''StoreState
updateProducts :: Cart -> ProductsCol -> ProductsCol
updateProducts (_,c) = flip (foldl' destock) c
where destock p' (item,count) = adjust (subtract count) item p'
updateUsers :: Cart -> Prices -> UserCol -> UserCol
updateUsers (name,c) p = adjust (subtract (sum prices)) name
where prices = map (\(itemName, itemCount) -> (p ! itemName) * itemCount) c
checkout :: Cart -> StoreState -> StoreState
checkout c s = (users ^%= updateUsers c (msrp s))
. (stock ^%= updateProducts c)
$ s
test = checkout cart store
where cart = ("Bob", [("Apples", 2), ("Bananas", 6)])
store = Store initialStock initialUsers prices
initialStock = fromList
[("Apples", 20), ("Bananas", 10), ("Lambdas", 1000)]
initialUsers = fromList [("Bob", 20000), ("Mary", 40000)]
prices = fromList [("Apples", 100), ("Bananas", 50), ("Lambdas", 0)]
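Working through test by hand (assuming the lens plumbing behaves as written): Bob's cart costs 2*100 + 6*50 = 500 pennies, so his balance drops from 20000 to 19500 and the stock falls to 18 Apples and 4 Bananas, with Mary and the Lambdas untouched.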