Separation of data loading/unloading and processing logic - haskell

Sometimes it is necessary to perform some complex routines in order to retrieve or save data, which is being processed. In this case one wants to separate data generation and data processing logic. The common way is to use iteratee-like functionality. There are lots of decent libraries: pipes, conduit, etc. In most cases they will do the thing. But AFAIK they are (except, maybe, pipes) limited by the order of processing.
But consider a log viewer example: human may desire to ramble back and forth randomly. He also may zoom in and out. I fear iteratees can't help here.
A straightforward solution may look like this:
-- True is for 'right', 'up', etc. and vice versa
type Direction = Bool
class Frame (f :: * -> *) where
type Dimension f :: *
type Origin f :: * -> *
grow', shrink' move' :: Monad m => Dimension f -> Direction -> f a -> m (f a)
move' dim dir f = grow' dim dir f >>= shrink' dim (not dir)
liftF' :: (Origin f a -> b) -> f a -> b
class Frame f => MFrame f where
liftMF' :: (Origin f a -> (b, Origin f a)) -> f a -> (b, f a)
-- Example instance: infinite stream.
data LF a = LF [a] [a] [a]
instance Frame LF where
type Dimension LF = () -- We have only one dimension to move in...
type Origin LF = [] -- User see piece of stream as a plain list
liftF' f (LF _ m _) = f m
grow' () True (LF l m (h:r)) = return $ LF l (m++[h]) r
...
Then one may wrap this into StateT and so on. So, the questions:
0) Did I miss the point of iteratees completely, and they are applicable here?
1) Did I just reinvent a well-known wheel?
2) It is obvious, that grow and shrink operations are pretty uneffective, as their complexity is proportional to the frame size. Is there a better way to extend zippers like this?

You want lenses, specifically the sequenceOf function. Here is an example of targeted loading of a 3-tuple:
sequenceOf _2 :: (IO a, IO b, IO c) -> IO (IO a, b, IO c)
sequenceOf takes a lens to a polymorphic field that contains a loading action, runs the action, then replaces the field with the result of the action. You can use sequenceOf on your own custom types by just making your type polymorphic in the fields you want to load, like this:
data Asset a b = Asset
{ _art :: a
, _sound :: b
}
... and also making your lenses use the full four type parameters (this is one reason why they exist):
art :: Lens (Asset a1 b) (Asset a2 b) a1 a2
art k (Asset x y) = fmap (\x' -> Asset x' y) (k x)
sound :: Lens (Asset a b1) (Asset a b2) b1 b2
sound k (Asset x y) = fmap (\y' -> Asset x y') (k y)
... or you can auto generate lenses using makeLenses and they will be sufficiently general.
Then you can just write:
sequenceOf art :: Asset (IO Art) b -> IO (Asset Art b)
... and loading multiple assets is as simple as composing Kleisli arrows::
sequenceOf art >=> sequenceOf sound
:: Asset (IO Art) (IO Sound) -> IO (Asset Art Sound)
... and of course you can nest assets and compose lenses to reach nested assets and everything still "just works".
Now you have a pure Asset type that you can process using pure functions, and all the loading logic is factored out into lenses.
I wrote this on my phone so there may be several errors, but I will fix them later.

Related

The Monad Challenges - A Missed Generalization

I'm going through The Monad Challenges.
At the section A Missed Generalization, I'm supposed to have at least this code (I've removed parts not relevant to the question), where Gen looks a lot like the State Monad,
-- Imports and other stuff that hide Prelude
-- For instance, the do notation is NOT available at this stage of the challenges
type Gen a = Seed -> (a,Seed)
genTwo :: Gen a -> (a -> Gen b) -> Gen b
genTwo g f s = let (a,s') = g s
in f a s'
mkGen :: a -> Gen a
mkGen a s = (a,s)
generalB :: (a -> b -> c) -> Gen a -> Gen b -> Gen c
-- I've implemented it as follows and it works
generalB f a b s = let (x,s') = a s
(y,s'') = b s'
in (f x y,s'')
The text of the "assignment" reads
[…] you might not have implemented generalB in terms of genTwo. Go back and look at your generalB implementation and if you didn’t write it in terms of genTwo, do that now and call it generalB2. Doing this should get rid of the state threading between generators.
Is unclear to me what the solution to this should be, especially in view of the fact that the paragraph above doesn't mention mkGen. Assuming I'm able to apply f to the inside of a and b, I would still get something of type c, which I have to shove in a Gen, and I don't see how I can do that without mkGen or, alternatively, without using (,) explicitly (as I did in the above implementation).
Even assuming that the text implies that I should use mkGen, how should I go about it to get rid of the state threading?
With some editing I was able to come up with this
generalB2' f a b = genTwo a (genTwo b . (mkGen .) . f)
but I hardly believe this is the intended solution, because it's far from being readable, in my opinion. Also, getting to this form was a bit harder than anything else so far in the challenges, but it was after all just mechanical, so it didn't really pose a difficulty from the point of view of understanding monads, I believe, so I really think I took a wrong turn here, and I'd like some help.
I wonder whether the authors of the challenges hang out here on StackOverflow.
Your solution is probably close to the intended solution, although you might be able to make it more readable by eta-expanding it. You might even consider writing it using do notation, but still use genTwo and mkGen.
As far as I can tell, mkGen is a 'disguised' return function, and genTwo likewise is a 'disguised' monadic bind (i.e. >>=).
The type of generalB (and generalB2) is equivalent to liftM2, which is implemented like this:
liftM2 :: (Monad m) => (a1 -> a2 -> r) -> m a1 -> m a2 -> m r
liftM2 f m1 m2 = do { x1 <- m1; x2 <- m2; return (f x1 x2) }
That is, in terms of return and >>= (which you don't see, because it's using do syntax).

Avoiding thunks in sparsely evaluated list generated by monadic unfold

I have a simulation library that uses the FFI wrapped in a monad M, carrying a context. All the foreign functions are pure, so I've decided to make the monad lazy, which is normally convenient for flow-control. I represent my simulation as a list of simulation-frames, that I can consume by either writing to a file, or by displaying the frame graphically.
simulation :: [(Frame -> M Frame)] -> Frame -> M [Frame]
simulation [] frame = return [frame]
simulation (step:steps) frame
= step frame >>= fmap (frame:) . simulation steps
Each frame consists of a tuple of newtype-wrapped ForeignPtrs that I can lift to my Haskell representation with
lift :: Frame -> M HFrame
Since the time-steps in my simulation are quite short, I only want to look at every n frames, for which I use
takeEvery n l = foldr cons nil l 0 where
nil _ = []
cons x rest 0 = x : rest n
cons x rest n = rest (n-1)
So my code looks something like
main = consume
$ takeEvery n
$ runM
$ simulation steps initialFrame >>= mapM lift
Now, the problem is that as I increase n, a thunk builds up. I've tried a couple of different ways to try to strictly evaluate each frame in simulation, but I have yet to figure out how to do so. ForeignPtr doesn't appear to have a NFData instance, so I can't use deepseq, but all my attempts with seq, including using seq on each element in the tuple, have been without noticeable effect.
EDIT:
Upon request, I have included more specifics, that I initially excluded since I think they are probably mostly noise for this question.
The monad
newtype FT c a = FT (Context -> a)
instance Functor (FT c) where
fmap f (FT a) = FT (f.a)
instance Applicative (FT c) where
pure a = FT (\_ -> a)
(<*>) (FT a) (FT b) = FT (\c -> a c $ b c)
instance Monad (FT c) where
return = pure
(>>=) (FT a) f = FT (\c -> (\(FT b) -> b c) $ f $ a c)
runFTIn :: Context -> (forall c. FT c a) -> a
runFTIn context (FT a) = a context
runFTWith :: [ContextOption] -> (forall c. FT c a) -> a
runFTWith options a
= unsafePerformIO
$ getContext options >>= \c -> return $ runFTIn c a
runFT = runFTWith []
unsafeLiftFromIO :: (Context -> IO a) -> FT c a
unsafeLiftFromIO a = FT (\c -> unsafePerformIO $ a c)
All the foreign functions are lifted from IO with unsafeLiftFromIO
newtype Box c = Box (ForeignPtr RawBox)
newtype Coordinates c = Coordinates (ForeignPtr RawCoordinates)
type Frame c = (Box c, Coordinates c)
liftBox :: Box c -> FT c HBox
liftCoordinates :: Coordinates c -> FT c HCoordinates
liftFrame (box, coordinates) = do
box' <- liftBox box
coordinates' <- liftCoordinates coordinates
return (box', coordinates')
The steps themselves are supposed to be arbitrary (Frame c -> FT c (Frame c)), so strictness should preferably be in the higher level code.
EDIT2:
I have now tried to use Streamly, however the problem persists, so I think the issue really is finding a way to strictly evaluate ForeignPtrs.
current implementations:
import Streamly
import qualified Streamly.Prelude as S
import qualified Streamly.Internal.Data.Stream.Serial as Serial
takeEvery n = Serial.unfoldrM ((fmap.fmap) (\(h, t) -> (h, S.drop (n-1) t)) . S.uncons)
(#) = flip ($)
simulation
:: (IsStream t)
=> Frame c
-> t (FT c) (Frame c -> FT c (Frame c))
-> t (FT c) (Frame c)
simulation frame = S.scanlM' (#) frame
EDIT3:
To clarify the symptoms and how I have diagnosed the problem.
The library calls OpenCL functions running on a GPU. I am sure that the freeing of the pointers is handled correctly - the ForeignPtrs have the correct freeing functions, and memory use is independent of total number of steps as long as this number is larger than n. What I find is that memory use on the GPU is basically linearly correlated to n. The consumer I've been using for this testing is
import qualified Data.ByteString.Lazy as BL
import Data.Binary
import Data.Binary.Put
writeTrajectory fn = fmap (BL.writeFile fn . runPut) . S.foldr ((>>).putFrame) (pure ()) . serially
For my streamly implementation, and
writeTrajectory fn = BL.writeFile fn . runPut . MapM_ putFrame
For the original implementation. Both should consume the stream continuously. I've generated the steps for testing with replicate.
I am unsure of how to more precisely analyze the memory-use on the GPU. System memory use is not an issue here.
Update:
I am starting to think it's not a matter of strictness, but of GC-problems. The run-time system does not know the size of the memory allocated on the GPU and so does not know to collect the pointers, this is less of an issue when there is stuff going on CPU-side as well, as that will produce allocations too, activating the GC. This would explain the slightly non-determinstic memory usage, but linear correlation to n that I've seen. How too solve this nicely is another issue, but I suspect there will be a substantial overhaul to my code.
I think the issue really is finding a way to strictly evaluate ForeignPtrs
If that is really the issue, one way to do that is to change the second clause of simulation:
{-# LANGUAGE BangPatterns #-}
simulation :: [(Frame -> M Frame)] -> Frame -> M [Frame]
simulation [] frame = return [frame]
simulation (step:steps) frame#(!_, !_) -- Evaluate both components of the pair
= step frame >>= fmap (frame:) . simulation steps

How to use the same record selector two ways within a function? Lenses?

I have some data that have different representations based on a type parameter, a la Sandy Maguire's Higher Kinded Data. Here are two examples:
wholeMyData :: MyData Z
wholeMyData = MyData 1 'w'
deltaMyData :: MyData Delta
deltaMyData = MyData Nothing (Just $ Left 'b')
I give some of the implementation details below, but first the actual question.
I often want to get a field of the data, usually via a local definition like:
let x = either (Just . Left . myDataChar) myDataChar -- myDataChar a record of MyData
It happens so often I would like to make a standard combinator,
getSubDelta :: ( _ -> _ ) -> Either a b -> Maybe (Either c d)
getSubDelta f = either (Just . Left . f) f
but filling in that signature is problematic. The easy solution is to just supply the record selector function twice,
getSubDelta :: (a->c) -> (b->d) -> Either a b -> Maybe (Either c d)
getSubDelta f g = either (Just . Left . f) g
but that is unseemly. So my question. Is there a way I can fill in the signature above? I'm assuming there is probably a lens based solution, what would that look like? Would it help with deeply nested data? I can't rely on the data types always being single constructor, so prisms? Traversals? My lens game is weak, so I was hoping to get some advice before I proceed.
Thanks!
Some background. I defined a generic method of performing "deltas", via a mix of GHC.Generics and type families. The gist is to use a type family in the definition of the data type. Then, depending how the type is parameterized, the records will either represent whole data or a change to existing data.
For instance, I define the business data using DeltaPoints.
MyData f = MyData { myDataInt :: DeltaPoint f Int
, myDataChar :: DeltaPoint f Char} deriving Generic
The DeltaPoints are implemented in the library, and have different forms for Delta and Z states.
data DeltaState = Z | Delta deriving (Show,Eq,Read)
type family DeltaPoint (st :: DeltaState) a where
DeltaPoint Z a = a
DeltaPoint Delta a = Maybe (Either a (DeltaOf a))
So a DeltaPoint Z a is just the original data, a, and a DeltaPoint Delta a, may or may not be present, and if it is present will either be a replacement of the original (Left) or an update (DeltaOf a).
The runtime delta functionality is encapsulated in a type class.
class HasDelta a where
type DeltaOf a
delta :: a -> a -> Maybe (Either a (DeltaOf a))
applyDeltaOf :: a -> DeltaOf a -> Maybe a
And with the use of Generics, I can usually get the delta capabilities with something like:
instance HasDelta (MyData Z) where
type (DeltaOf (MyData Z)) = MyData Delta
I think you probably want:
{-# LANGUAGE RankNTypes #-}
getSubDelta :: (forall f . (dat f -> DeltaPoint f fld))
-> Either (dat Z) (dat Delta)
-> Maybe (Either (DeltaPoint Z fld) (DeltaOf fld))
getSubDelta sel = either (Just . Left . sel) sel
giving:
x :: Either (MyData Z) (MyData Delta)
-> Maybe (Either (DeltaPoint Z Char) (DeltaOf Char))
x = getSubDelta myDataChar
-- same as: x = either (Just . Left . myDataChar) myDataChar

How do I extract information from inner parameters in Haskell?

In most of programming languages that support mutable variables, one can easily implement something like this Java example:
interface Accepter<T> {
void accept(T t);
}
<T> T getFromDoubleAccepter(Accepter<Accepter<T>> acc){
final List<T> l = new ArrayList<T>();
acc.accept(new Accepter<T>(){
#Override
public void accept(T t) {
l.add(t);
}
});
return l.get(0); //Not being called? Exception!
}
Just for those do not understand Java, the above code receives something can can be provided a function that takes one parameter, and it supposed to grape this parameter as the final result.
This is not like callCC: there is no control flow alternation. Only the inner function's parameter is concerned.
I think the equivalent type signature in Haskell should be
getFromDoubleAccepter :: (forall b. (a -> b) -> b) -> a
So, if someone can gives you a function (a -> b) -> b for a type of your choice, he MUST already have an a in hand. So your job is to give them a "callback", and than keep whatever they sends you in mind, once they returned to you, return that value to your caller.
But I have no idea how to implement this. There are several possible solutions I can think of. Although I don't know how each of them would work, I can rate and order them by prospected difficulties:
Cont or ContT monad. This I consider to be easiest.
RWS monad or similar.
Any other monads. Pure monads like Maybe I consider harder.
Use only standard pure functional features like lazy evaluation, pattern-matching, the fixed point contaminator, etc. This I consider the hardest (or even impossible).
I would like to see answers using any of the above techniques (and prefer harder ways).
Note: There should not be any modification of the type signature, and the solution should do the same thing that the Java code does.
UPDATE
Once I seen somebody commented out getFromDoubleAccepter f = f id I realize that I have made something wrong. Basically I use forall just to make the game easier but it looks like this twist makes it too easy. Actually, the above type signature forces the caller to pass back whatever we gave them, so if we choose a as b then that implementation gives the same expected result, but it is just... not expected.
Actually what came up to my mind is a type signature like:
getFromDoubleAccepter :: ((a -> ()) -> ()) -> a
And this time it is harder.
Another comment writer asks for reasoning. Let's look at a similar function
getFunctionFromAccepter :: (((a -> b) -> b) -> b) -> a -> b
This one have an naive solution:
getFunctionFromAccepter f = \a -> f $ \x -> x a
But in the following test code it fails on the third:
exeMain = do
print $ getFunctionFromAccepter (\f -> f (\x -> 10)) "Example 1" -- 10
print $ getFunctionFromAccepter (\f -> 20) "Example 2" -- 20
print $ getFunctionFromAccepter (\f -> 10 + f (\x -> 30)) "Example 3" --40, should be 30
In the failing case, we pass a function that returns 30, and we expect to get that function back. However the final result is in turn 40, so it fails. Are there any way to implement doing Just that thing I wanted?
If this can be done in Haskell there are a lot of interesting sequences. For example, tuples (or other "algebraic" types) can be defined as functions as well, since we can say something like type (a,b) = (a->b->())->() and implement fst and snd in term of this. And this, is the way I used in a couple of other languages that do not have native "tuple" support but features "closure".
The type of accept is void accept(T) so the equivalent Haskell type is t -> IO () (since every function in Java is essentially IO). Thus getFromDoubleAccepted can be directly translated as
import Data.IORef
type Accepter t = t -> IO ()
getFromDoubleAccepter :: Accepter (Accepter a) -> IO a
getFromDoubleAccepter acc = do
l <- newIORef $ error "Not called"
acc $ writeIORef l
readIORef l
If you want an idiomatic, non-IO solution in Haskell, you need to be more specific about what your actual end goal is besides trying to imitate some Java-pattern.
EDIT: regarding the update
getFromDoubleAccepter :: ((a -> ()) -> ()) -> a
I'm sorry, but this signature is in no way equal to the Java version. What you are saying is that for any a, given a function that takes a function that takes an a but doesn't return anything or do any kind of side effects, you want to somehow conjure up a value of type a. The only implementation that satisfies the given signature is essentially:
getFromDoubleAccepter :: ((a -> ()) -> ()) -> a
getFromDoubleAccepter f = getFromDoubleAccepter f
First, I'll transliterate as much as I can. I'm going to lift these computations to a monad because accept returns void (read () in Haskell-land), which is useless unless there is some effect.
type Accepter m t = t -> m ()
getFromDoubleAccepter :: (MonadSomething m) => Accepter m (Accepter m t) -> m t
getFromDoubleAccepter acc = do
l <- {- new mutable list -}
acc $ \t -> add l t
return (head l)
Of course, we can't make a mutable list like that, so we'll have to use some intuitive sparks here. When an action just adds an element to some accumulator, I think of the Writer monad. So maybe that line should be:
acc $ \t -> tell [t]
Since you are simply returning the head of the list at the end, which doesn't have any effects, I think the signature should become:
getFromDoubleAccepter :: Accepter M (Accepter M t) -> t
where M is an appropriate monad. It needs to be able to write [t]s, so that gives us:
type M t = Writer [t]
getFromDoubleAccepter :: Accepter (M t) (Accepter (M t) t) -> t
And now the type of this function informs us how to write the rest of it:
getFromDoubleAccepter acc =
head . execWriter . acc $ \t -> tell [t]
We can check that it does something...
ghci> getFromDoubleAccepter $ \acc -> acc 42
42
So that seems right, I guess. I'm still a bit unclear on what this code is supposed to mean.
The explicit M t in the type signature is a bit aesthetically bothersome to me. If I knew what problem I was solving I would look at that carefully. If you mean that the argument can be a sequence of commands, but otherwise has no computational features available, then you could specialize the type signature to:
getFromDoubleAccepter :: (forall m. (Monad m) => Accepter m (Accepter m t)) -> t
which still works with our example. Of course, this is all a bit silly. Consider
forall m. (Monad m) => Accepter m (Accepter m t))
= forall m. (Monad m) => (t -> m ()) -> m ()
The only thing a function with this type can do is call its argument with various ts in order and then return (). The information in such a function is completely characterized[1] by those ts, so we could just as easily have used
getFromDoubleAccepter :: [t] -> t
getFromDoubleAccepter = head
[1] As long as I'm going on about nothing, I might as well say that that is not quite accurate in the face of infinity. The computation
crazy :: Integer -> Accepter m (Accepter m Integer)
crazy n acc = crazy (n+1) >> acc n
can be used to form the infinite sequence
... >> acc 3 >> acc 2 >> acc 1 >> acc 0
which has no first element. If we tried to interpret this as a list, we would get an infinite loop when trying to find the first element. However this computation has more information than an infinite loop -- if instead of a list, we used the Last monoid to interpret it, we would be able to extract 0 off the end. So really
forall m. (Monad m) => Accepter m (Accepter m t)
is isomorphic to something slightly more general than a list; specifically a free monoid.
Thanks to the above answers, I finally concluded that in Haskell we can do some different things than other languages.
Actually, the motivation of this post is to translate the famous "single axiom classical logic reduction system". I have implemented this in some other languages. It should be no problem to implement the
Axiom: (a|(b|c)) | ((d|(d|d)) | ((e|b) | ((a|e) | (a|e))))
However, since the reduction rule looks like
Rule: a|(b|c), a |-- c
It is necessary to extract the inner parameter as the final result. In other languages, this is done by using side-effects like mutable slots. However, in Haskell we do not have mutable slots and involving IO will be ugly so I keep looking for solutions.
In the first glance (as show in my question), the getFromDoubleAccepter f = f id seems nonsense, but I realise that it actually work in this case! For example:
rule :: (forall r.a -> (b -> c -> r) -> r) -> a -> c
rule abc a = abc a $ flip const
The trick is still the same: since the existential qualification hides r from the caller, and it is up to the callee to pick up a type for it, we can specify c to be r, so we simply apply the given function to get the result. On the other hand, the given function has to use our input to produce the final answer, so it effectively limiting the implementation to what we exactally want!
Putting them together, let's see what we can do with it:
newtype I r a b = I { runI :: a -> b -> r }
rule :: (forall r. I r a (I r b c)) -> a -> c
rule (I abc) a = abc a (I (\b c -> c))
axiom :: I r0 (I r1 a (I r2 b c))
(I r0 (I r3 d (I r3 d d))
(I r4 (I r2 e b) (I r4 (I r1 a e) (I r1 a e))))
axiom = let
a1 (I eb) e = I $ \b c -> eb e b
a2 = I $ \d (I dd) -> dd d d
a3 (I abc) eb = I $ \a e -> abc a (a1 eb e)
a4 abc = I $ \eb aeae -> runI a2 (a3 abc eb) aeae
in I $ \abc (I dddebaeae) -> dddebaeae a2 (a4 abc)
Here I use a naming convention to trace the type signatures: a variable name is combinded by the "effective" type varialbes (means it is not result type - all r* type variable).
I wouldn't repeat the prove represented in the sited essay, but I want to show something. In the above definition of axiom we use some let bindings variables to construct the result. Not surprisingly, those variables themselves can be extracted by using rule and axiom. let's see how:
--Equal to a4
t4 :: I r0 a (I r1 b c) -> I r2 (I r1 d b) (I r2 (I r0 a d) (I r0 a d))
t4 abc = rule axiom abc
--Equal to a3
t3 :: I r0 a (I r1 b c) -> I r1 d b -> I r0 a d
t3 abc eb = rule (t4 abc) eb
--Equal to a2
t2 :: I r a (I r a a)
t2 = rule (t3 axiom (t3 (t4 axiom) axiom)) axiom
--Equal to a1
t1 :: I r a b -> a -> I r b c
t1 ab a = rule (t3 t2 (t3 (t3 t2 t2) ab)) a
One thing left to be proved is that we can use t1 to t4 only to prove all tautologies. I feel it is the case but have not yet proved it.
Compare to other languages, the Haskell salutation seems more effective and expressive.

Is there an elegant way to have functions return functions of the same type (in a tuple)

I'm using haskell to implement a pattern involving functions that return a value, and themselves (or a function of the same type). Right now I've implemented this like so:
newtype R a = R (a , a -> R a)
-- some toy functions to demonstrate
alpha :: String -> R String
alpha str
| str == reverse str = R (str , omega)
| otherwise = R (reverse str , alpha)
omega :: String -> R String
omega (s:t:r)
| s == t = R (s:t:r , alpha)
| otherwise = R (s:s:t:r , omega)
The driving force for these types of functions is a function called cascade:
cascade :: (a -> R a) -> [a] -> [a]
cascade _ [] = []
cascade f (l:ls) = el : cascade g ls where
R (el , g) = f l
Which takes a seed function and a list, and returns a list created by applying the seed function to the first element of the list, applying the function returned by that to the second element of the list, and so on and so forth.
This works--however, in the process of using this for slightly more useful things, I noticed that a lot of times I had the basic units of which are functions that returned functions other than themselves only rarely; and explicitly declaring a function to return itself was becoming somewhat tedious. I'd rather be able to use something like a Monad's return function, however, I have no idea what bind would do for functions of these types, especially since I never intended these to be linked with anything other than the function they return in the first place.
Trying to shoehorn this into a Monad started worrying me about whether or not what I was doing was useful, so, in short, what I want to know is:
Is what I'm doing a Bad Thing? if not,
Has what I'm doing been done before/am I reinventing the wheel here? if not,
Is there an elegant way to do this, or have I already reached this and am being greedy by wanting some kind of return analogue?
(Incidentally, besides, 'functions that return themeselves' or 'recursive data structure (of functions)', I'm not quite sure what this kind of pattern is called, and has made trying to do effective research in it difficult--if anyone could give me a name for this pattern (if it indeed has one), that alone would be very helpful)
As a high-level consideration, I'd say that your type represents a stateful stream transformer. What's a bit confusing here is that your type is defined as
newtype R a = R (a , a -> R a)
instead of
newtype R a = R (a -> (R a, a))
which would be a bit more natural in the streaming context because you typically don't "produce" something if you haven't received anything yet. Your functions would then have simpler types too:
alpha, omage :: R String
cascade :: R a -> [a] -> [a]
If we try to generalize this idea of a stream transformer, we soon realize that the case where we transform a list of as into a list of as is just a special case. With the proper infrastructure in place we could just as well produce a list of bs. So we try to generalize the type R:
newtype R a b = R (a -> (R a b, b))
I've seen this kind of structure being called a Circuit, which happens to be a full-blown arrow. Arrows are a generalization of the concept of functions and are an even more powerful construct than monads. I can't pretend to understand the category-theoretical background, but it's definitely interesting to play with them. For example, the trivial transformation is just Cat.id:
import Control.Category
import Control.Arrow
import Prelude hiding ((.), id)
import qualified Data.List as L
-- ... Definition of Circuit and instances
cascade :: Circuit a b -> [a] -> [b]
cascade cir = snd . L.mapAccumL unCircuit cir
--
ghci> cascade (Cat.id) [1,2,3,4]
[1,2,3,4]
We can also simulate state by parameterizing the circuit we return as the continuation:
countingCircuit :: (a -> b) -> Circuit a (Int, b)
countingCircuit f = cir 0
where cir i = Circuit $ \x -> (cir (i+1), (i, f x))
--
ghci> cascade (countingCircuit (+5)) [10,3,2,11]
[(0,15),(1,8),(2,7),(3,16)]
And the fact that our circuit type is a category gives us a nice way to compose circuits:
ghci> cascade (countingCircuit (+5) . arr (*2)) [10,3,2,11]
[(0,25),(1,11),(2,9),(3,27)]
It looks like what you have is a simplified version of a stream. That is to
say, a representation of an infinite stream of values. I don't think you can
easily define this as a monad, because you use the same type for your seed as
for your elements, which makes defining fmap difficult (it seems that you
would need to invert the function provided to fmap so as to be able to
recover the seed). You can make this a monad by making the seed type
independent of the element type like so
{-# LANGUAGE ExistentialQuantification #-}
data Stream a = forall s. Stream a s (s -> Stream a)
This will allow you to define a Functor and Monad instance as follows
unfold :: (b -> (a, b)) -> b -> Stream a
unfold f b = Stream a b' (unfold f)
where (a, b') = f b
shead :: Stream a -> a
shead (Stream a _ _) = a
stail :: Stream a -> Stream a
stail (Stream _ b f) = f b
diag :: Stream (Stream a) -> Stream a
diag = unfold f
where f str = (shead $ shead str, stail $ fmap stail str)
sjoin :: Stream (Stream a) -> Stream a
sjoin = diag
instance Functor Stream where
fmap f (Stream a b g) = Stream (f a) b (fmap f . g)
instance Monad Stream where
return = unfold (\x -> (x, x))
xs >>= f = diag $ fmap f xs
Note that this only obeys the Monad laws when viewed as a set, as it does not
preserve element ordering.
This explanation
of the stream monad uses infinite lists, which works just as well in Haskell
since they can be generated in a lazy fashion. If you check out the
documentation for the Stream type in the vector library, you will
find a more complicated version, so that it can be used in efficient stream fusion.
I don't have much to add, except to note that your cascade function can be written as a left fold (and hence also as a right fold, though I haven't done the transformation.)
cascade f = reverse . fst . foldl func ([], f)
where
func (rs,g) s = let R (r,h) = g s in (r:rs,h)

Resources