Thinking Functionally: Building a New Array in Haskell / PureScript

I'm new to functional programming, and I've decided to build an app in Purescript. I've hit my first hurdle, and I'm not sure how to think about this conceptually.
I'm not looking for code as much as a way to think functionally about this problem.
I have a list of data. Specifically, something like
[ {a :: String, b :: String, c :: String} ]
I would like to create a list of Html (which is a purescript-halogen type) by using the record provided (with a list of the above types).
So, I would have a function
buildElements :: forall p i. MyRecordObject -> Array (HTML p i)
Now, I think I'm going to need to give this function's result type a monadic computational context (PureScript's Eff is roughly analogous to Haskell's IO).
So something like:
buildElements :: forall e p i. MyRecordObject -> Eff e (Array (HTML p i))
My first idea was vaguely around creating a list with something like
take (length xs) $ repeat ARecordObject
and then map the record over that list, but I wasn't really sure how to translate that into code. It seemed wrong anyway, since my plan involved mutating the state of ARecordObject, which is a no-no.
So then I found this function:
forEach :: forall e a. Array a -> (a -> Eff e Unit) -> Eff e Unit
which looks almost perfect! I get an array, I give it a function that somehow assigns the properties in the record to this new array...but no, wait...I'm thinking non-functionally again.
I'm really at a bit of a loss here. Basically, I want to create something like a list of <li></li> elements, where I assign properties to each item.
E.g
I'm provided a record with:
[ { id: "id1", name: "name1", class: "class1", content: "content1" }
, { id: "id2", name: "name2", class: "class2", content: "content2" } ]
And I would like a function foo that returns an array:
[ li [ id_ rec.id, name_ rec.name, class_ rec.class ] [ text rec.content ]
, li [ id_ rec.id, name_ rec.name, class_ rec.class ] [ text rec.content ] ]
where rec is the name of the recordObject (and obviously the two arrays are not identical, but actually mapped over the initial record).
(the dot syntax is PureScript's record accessor notation, similar to standard getter notation)

My first idea was vaguely around creating a list with something like
take (length xs) $ repeat ARecordObject
and then map the record over that list, but I wasn't really sure how to translate that into code. It seemed wrong anyway, since my plan involved mutating the state of ARecordObject, which is a no-no.
Functional programmers don't avoid mutation just because it's a no-no (indeed, many functional programs make careful use of a controlled dose of mutability) - we avoid it because it produces safer, simpler code.
To wit: you're thinking in what I call "alloc-init mode", wherein you create some sort of "empty" value and then go about calculating its properties. Forgive my vehemence, but that's a fundamentally broken programming model, left over from the days of manual memory management; code which uses it will never be safe, and abstractions relying on it will forever be leaky. The idiom doesn't fit into any language that's higher-level than C, and yet, if I had a pound for every time I see code like this...
var foo = new Foo();
foo.Bar = new Bar();
foo.Bar.Baz = new Baz();
...I would be a rich man (na na na). The default should be to create objects after you know what they're going to look like:
var foo = new Foo(new Bar(new Baz()));
This is simpler - you're just calculating a value, rather than reaching into the memory referenced by a pointer to update its contents - and more importantly it's safer because the type-checker ensures that you haven't forgotten a property and it allows you to make Foo immutable. The cleanest imperative code is functional code - you should only be imperative where necessary for performance (or when the language forces your hand).
Anyway, rant over. The point is that you're making life harder for yourself than necessary by thinking imperatively. Just write a function that calculates a single <li> from a single object...
toLi :: forall p i. MyRecord -> HTML p i
toLi x = li [ id_ x.id, name_ x.name, class_ x.class ] [ text x.content ]
... (note that I'm not somehow creating an "empty" li and then populating its values), and then map it over your input list.
toLis :: forall p i. Array MyRecord -> Array (HTML p i)
toLis = map toLi
This is how I'd do it in JS, too, even though I'm not required to by the language. No side-effects, no mutation, no need for Eff - just simple, safe, purely functional code.

Related

Haskell conversion between types

Again I'm stuck on something that's probably theoretical. There are many libraries in Haskell; I'd like to use as few as possible. If I have a type like this:
data Note = Note { _noteID :: Int
                 , _noteTitle :: String
                 , _noteBody :: String
                 , _noteSubmit :: String
                 } deriving Show
And I use that to create a list like [Note {_noteID=1...}, Note {_noteID=2...}] et cetera, so I now have a list of Notes. Now I want to write it to a file using writeFile. GHC will probably not allow that, considering writeFile has type FilePath -> String -> IO (). But I also want to avoid deconstructing (for writeFile) and reconstructing (for readFile) the values all the time, assuming I will not leave the Haskell 'realm'. Is there a way to do that without using special libs? Again: thanks a lot. Books on Haskell are good, but Stack Overflow is the glue between the books and the real world.
If you're looking for a "quick fix", for a one-off script or something like that, you can derive Read in addition to Show, and then you'll be able to use show to convert to String and read to convert back, for example:
data D = D { x :: Int, y :: Bool }
  deriving (Show, Read)
d1 = D 42 True
s = show d1
-- s == "D {x = 42, y = True}"
d2 :: D
d2 = read s
-- d2 == d1
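For the file part of the question, the round trip is then just show on the way out and read on the way back in. A minimal sketch, assuming the Note type from the question also derives Read (saveNotes and loadNotes are names made up here):

-- assumes: data Note = Note {...} deriving (Show, Read)
saveNotes :: FilePath -> [Note] -> IO ()
saveNotes path notes = writeFile path (show notes)

loadNotes :: FilePath -> IO [Note]
loadNotes path = do
  contents <- readFile path
  return (read contents)  -- partial: crashes if the file doesn't parse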
However, please, please don't put this in production code. First, you're implicitly relying on how the record is coded, and there are no checks to protect from subtle changes. Second, the read function is partial - that is, it will crash if it can't parse the input. And finally, if you persist your data this way, you'll be stuck with this record format and can never change it.
For a production-quality solution, I'm sorry, but you'll have to come up with an explicit, documented serialization format. No way around it - in any language.
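Purely for illustration, such a format could be as simple as one tab-separated line per Note. The helper names below are assumptions, not an established library, and the layout is naive (it breaks if a field itself contains a tab or newline):

import Data.List (intercalate)
import Text.Read (readMaybe)

noteToLine :: Note -> String
noteToLine n = intercalate "\t" [show (_noteID n), _noteTitle n, _noteBody n, _noteSubmit n]

noteFromLine :: String -> Maybe Note
noteFromLine line = case splitTabs line of
  [i, t, b, s] -> (\i' -> Note i' t b s) <$> readMaybe i
  _            -> Nothing
  where
    -- split a line on tab characters
    splitTabs s = case break (== '\t') s of
      (field, "")       -> [field]
      (field, _ : rest) -> field : splitTabs rest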

Growing a list in Haskell

I'm learning Haskell by writing an OSC musical sequencer to use it with SuperCollider. But because I'd like to make fairly complex stuff with it, it will work like a programming language where you can declare variables and define functions so you can write music in an algorithmic way. The grammar is unusual in that we're coding sequences and sometimes a bar will reference the last bar (something like "play that last chord again but a fifth above").
I don't feel satisfied with my own explanation, but that's the best I can without getting too technical.
Anyway, what I'm coding now is the parser for that language, stateless so far, but now I need some way to keep a growing list of the declared variables and the like, using a dictionary in the [("key","value")] fashion, so I can add new values as I go, parsing bar by bar.
I know this involves monads, which I don't really understand yet, but I need something meaningful enough to start toying with them or else I find the raw theory a bit too raw.
So what would be a clean and simple way to start?
Thanks and sorry if the question was too long.
Edit on how the thing works:
we input a string to the main parsing function, say
"afunction(3) ; anotherone(1) + [3,2,1]"
we identify closures first, then kinds of chars (letters, nums, etc) and group them together, so we get a list like:
[("word","afunction"),("parenth","(3)"),("space"," "),("semicolon",";"),("space"," "),("word","anotherone"),("parenth","(1)"),("space"," "),("opadd","+"),("space"," "),("bracket","[3,2,1]")]
then we use a function that tags all those tuples with the indices of the original string they occupy, like:
[("word","afunction",(0,8)),("parenth","(3)",(9,11)),("space"," ",(12,13)) ...]
then cut it in a list of bars, which in my language are separated using a semicolon, and then in notes, using commas.
And now I'm at the stage where those functions should be executed sequentially, but because some of them are reading or modifying previously declared values, I need to keep track of that change. For example, let's say the function f(x) moves the pitch of the last note by x semitones, so
f(9), -- from an original base value of 0 (say that's an A440) we go to 9
f(-2), -- 9-2 = 7, so a fifth from A
f(-3); -- 9-2-3, a minor third down from the last value.
etc
But sometimes it can get a bit more complicated than that, don't make me explain how cause I could bore you to death.
Adding an item to a list
You can make a new list that contains one more item than an existing list with the : constructor.
("key", "value") : existing
Where existing is a list you've already made
Keeping track of changing state
You can keep track of changing state between functions by passing the state from each function to the next. This is all the State monad is doing. State s a is a value of type a that depends on (and changes) a state s.
-- 's' is the type of the state and 'a' is the type of the value.
-- runState is a function that takes a state and returns a value
-- (which may depend on that state) together with a new state.
data State s a = State { runState :: s -> (a, s) }
The bind operation >>= for State takes a value that depends on (and changes) the state and a function to compute another value that depends on (and changes) the state and combines them to make a new value that depends on (and changes) the state.
m >>= k = State $ \s ->
  let ~(a, s') = runState m s
  in runState (k a) s'
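To connect this back to the question, here is a small sketch of threading a growing [(String, String)] environment through a computation, using Control.Monad.State from the mtl package (the names Env, declare and lookupVar are made up for illustration):

import Control.Monad.State

type Env = [(String, String)]

-- add a newly declared variable to the front of the environment
declare :: String -> String -> State Env ()
declare k v = modify ((k, v) :)

-- look up a previously declared variable
lookupVar :: String -> State Env (Maybe String)
lookupVar k = gets (lookup k)

example :: (Maybe String, Env)
example = runState go []
  where
    go = do
      declare "tempo" "120"
      declare "root" "A"
      lookupVar "root"
-- example == (Just "A", [("root","A"),("tempo","120")])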

Data value dependencies, updates and memoisation

I'm sorry this problem description is so abstract: its for my job, and for commercial confidentiality reasons I can't give the real-world problem, just an abstraction.
I've got an application that receives messages containing key-value pairs. The keys are from a defined set of keywords, and each keyword has a fixed data type. So if "Foo" is an Integer and "Bar" is a date you might get a message like:
Foo: 234
Bar: 24 September 2011
A message may have any subset of keys in it. The number of keys is fairly large (several dozen). But lets stick with Foo and Bar for now.
Obviously there is a record like this corresponding to the messages:
data MyRecord = MyRecord
  { foo :: Maybe Integer
  , bar :: Maybe UTCTime
  -- ... and so on for several dozen fields.
  }
The record uses "Maybe" types because that field may not have been received yet.
I also have many derived values that I need to compute from the current values (if they exist). For instance I want to have
baz :: MyRecord -> Maybe String
baz r = do  -- Maybe monad
  f <- foo r
  b <- bar r
  return $ show f ++ " " ++ show b
Some of these functions are slow, so I don't want to repeat them unnecessarily. I could recompute baz for each new message and memo it in the original structure, but if a message leaves the foo and bar fields unchanged then that is wasted CPU time. Conversely I could recompute baz every time I want it, but again that would waste CPU time if the underlying arguments have not changed since last time.
What I want is some kind of smart memoisation or push-based recomputation that only recomputes baz when the arguments change. I could detect this manually by noting that baz depends only on foo and bar, and so only recomputing it on messages that change those values, but for complicated functions that is error-prone.
An added wrinkle is that some of these functions may have multiple strategies. For instance you might have a value that can be computed from either Foo or Bar using 'mplus'.
Does anyone know of an existing solution to this? If not, how should I go about it?
I'll assume that you have one "state" record and that these messages all involve updating it as well as setting it. So if Foo is 12, it may later be 23, and therefore the output of baz would change. If any of this is not the case, then the answer becomes pretty trivial.
Let's start with the "core" of baz -- a function not on a record, but the values you want.
baz :: Int -> Int -> String
Now let's transform it:
data Cached a b = Cached (Maybe (a, b)) (a -> b)

getCached :: Eq a => Cached a b -> a -> (b, Cached a b)
getCached c@(Cached (Just (arg, res)) _) x | x == arg = (res, c)
getCached (Cached _ f) x = let ans = f x in (ans, Cached (Just (x, ans)) f)

bazC :: Cached (Int, Int) String
bazC = Cached Nothing (uncurry baz)
Now whenever you would use a normal function, you use a cache-transformed function instead, substituting the resulting cache-transformed function back into your record. This is essentially a manual memotable of size one.
For the basic case you describe, this should be fine.
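To see the cache in action, here is a small usage sketch, giving the core baz from above a toy body that stands in for the slow real computation:

-- toy stand-in for the expensive computation
baz :: Int -> Int -> String
baz a b = show a ++ " " ++ show b

demo :: (String, String)
demo =
  let (r1, c1) = getCached bazC (2, 3)  -- first call: computed and cached
      (r2, _)  = getCached c1 (2, 3)    -- same arguments: served from the cache
  in (r1, r2)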
A fancier and more generalized solution involving a dynamic graph of dependencies goes under the name "incremental computation" but I've seen research papers for it more than serious production implementations. You can take a look at these for starters, and follow the reference trail forward:
http://www.carlssonia.org/ogi/Adaptive/
http://www.andres-loeh.de/Incrementalization/paper_final.pdf
Incremental computation is actually also very related to functional reactive programming, so you can take a look at Conal Elliott's papers on that, or play with Heinrich Apfelmus' reactive-banana library: http://www.haskell.org/haskellwiki/Reactive-banana
In imperative languages, take a look at trellis in python: http://pypi.python.org/pypi/Trellis or Cells in lisp: http://common-lisp.net/project/cells/
You can build a stateful graph that corresponds to computations you need to do. When new values appear you push these into the graph and recompute, updating the graph until you reach the outputs. (Or you can store the value at the input and recompute on demand.) This is a very stateful solution but it works.
Are you perhaps creating market data, like yield curves, from live inputs of rates etc.?
What I want is some kind of smart memoisation or push-based recomputation that only recomputes baz when the arguments change.
It sounds to me like you want a variable that is sort of immutable, but allows a one-time mutation from "nothing computed yet" to "computed". Well, you're in luck: this is exactly what lazy evaluation gives you! So my proposed solution is quite simple: just extend your record with fields for each of the things you want to compute. Here's an example of such a thing, where the CPU-intensive task we're doing is breaking some encryption scheme:
data Foo = Foo
  { ciphertext :: String
  , plaintext  :: String
  }

-- a smart constructor for Foo's
foo c = Foo { ciphertext = c, plaintext = crack c }
The point here is that calls to foo have expenses like this:
If you never ask for the plaintext of the result, it's cheap.
On the first call to plaintext, the CPU churns a long time.
On subsequent calls to plaintext, the previously computed answer is returned immediately.
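Applied to the record from the question, the same trick might look like the following sketch (mkMyRecord is a name I'm assuming here; the point is that baz becomes a lazily evaluated field filled in by the smart constructor, computed at most once and only if demanded):

import Data.Time.Clock (UTCTime)

data MyRecord = MyRecord
  { foo :: Maybe Integer
  , bar :: Maybe UTCTime
  , baz :: Maybe String  -- derived field, never set by hand
  }

-- smart constructor: baz is a thunk over foo and bar
mkMyRecord :: Maybe Integer -> Maybe UTCTime -> MyRecord
mkMyRecord f b = MyRecord { foo = f, bar = b, baz = slowBaz }
  where
    slowBaz = do
      f' <- f
      b' <- b
      return (show f' ++ " " ++ show b')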

Any nice record handling tricks in Haskell?

I'm aware of partial updates for records, like:
data A a b = A { a :: a, b :: b }
x = A { a=1,b=2 :: Int }
y = x { b = toRational (a x) + 4.5 }
Are there any tricks for doing only partial initialization, creating a subrecord type, or doing (de)serialization on subrecord?
In particular, I found that the first of these lines works but the second does not:
read "A {a=1,b=()}" :: A Int ()
read "A {a=1}" :: A Int ()
You could always massage such input using a regular expression, but I'm curious what Haskell-like options exist.
Partial initialisation works fine: A {a=1} is a valid expression of type A Int (); the Read instance just doesn't bother parsing anything the Show instance doesn't output. The b field is initialised to error "...", where the string contains file/line information to help with debugging.
You generally shouldn't be using Read for any real-world parsing situations; it's there for toy programs that have really simple serialisation needs and debugging.
I'm not sure what you mean by "subrecord", but if you want serialisation/deserialisation that can cope with "upgrades" to the record format to contain more information while still being able to process old (now "partial") serialisations, then the safecopy library does just that.
You cannot leave some value in Haskell "uninitialized" (it would not be possible to "initialize" it later anyway, since Haskell is pure). If you want to provide "default" values for the fields, then you can make some "default" value for your record type, and then do a partial update on that default value, setting only the fields you care about. I don't know how you would implement read for this in a simple way, however.
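As a small sketch of that last suggestion (defaultA and fromPartial are names made up here, reusing the A type from the question):

defaultA :: A Int ()
defaultA = A { a = 0, b = () }

-- "partial initialisation": only override the fields you care about
fromPartial :: Int -> A Int ()
fromPartial n = defaultA { a = n }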

How to store recursive datatype with Data.Binary

Data.Binary is great. There is just one question I have. Let's imagine I've got a datatype like this:
import Control.Monad (liftM2)
import Data.Binary

data Ref = Ref
  { refName :: String
  , refRefs :: [(String, Ref)]
  }

instance Binary Ref where
  put a = put (refName a) >> put (refRefs a)
  get = liftM2 Ref get get
It's easy to see that this is a recursive datatype, which works because Haskell is lazy. Since Haskell as a language uses neither references nor pointers, but presents the data as-is, I am not sure how this is going to be saved. I have a strong suspicion that this naive approach will lead to an infinite bytestring...
So how can this type be safely saved?
If your data has no cycles you'll be fine. But a cycle, like
r = Ref "a" [("b", r)]
is indeed going to generate an infinite result. The only way around this is for you to give unique labels to all nodes and use those to avoid cycles when converting to binary.
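A sketch of that labelling idea (the FlatRef type and its field names are assumptions, not part of Data.Binary): instead of serialising the recursive structure directly, give each node an Int label and serialise a flat table in which nodes refer to one another by label.

import Control.Monad (liftM2)
import Data.Binary

data FlatRef = FlatRef
  { flatName :: String
  , flatRefs :: [(String, Int)]  -- children referenced by label, not by value
  }

instance Binary FlatRef where
  put a = put (flatName a) >> put (flatRefs a)
  get = liftM2 FlatRef get get

-- the whole (possibly cyclic) graph is then a finite table of labelled nodes
type RefTable = [(Int, FlatRef)]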

Resources