Passing list of different typed elements to a C function - haskell

I have a function written in C I’d like to call from a Haskell program. The function type is:
foo :: Int -> Ptr a -> IO ()
It takes a size and a pointer to anything and puts the whole thing somewhere in memory. It’s intended to be used with mixed types: you can put n floats, then m bools, and so on (in C).
The most convenient way to represent such a situation in Haskell would be – in my opinion – something like ([a],[b]) for instance. But, I need the whole thing to fit in a Ptr a (it’s actually a void* in C). I can try to write a function like ([a],[b]) -> Ptr c, but I need some help around it. The desired final function would be:
withArrayLen magicArray foo

Things that can be stored in memory are instances of type class Storable (in Foreign.Storable). So, given the raw FFI prototype
foreign import ccall "foo" c_foo :: CInt -> Ptr a -> IO ()
you could write something like this for homogeneous lists:
homfoo :: Storable a => [a] -> IO ()
homfoo items = withArray items $ \ptr -> c_foo (fromIntegral len) ptr
  where len = length items * sizeOf (head items)
But you've said the function is intended to work with mixed types, so we need some kind of type-constrained heterogeneous list for the nice Haskell wrapper. Here is one way to do this:
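{-# LANGUAGE GADTs #-}
import Control.Monad (zipWithM_)
import Data.List (mapAccumL)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Storable (Storable (..))
-- Existential wrapper: any Storable value, with its concrete type hidden
data DynStorable where
  MkStorable :: Storable a => a -> DynStorable
foo :: [DynStorable] -> IO ()
foo items =
  let (requiredSize, offsets) = mapAccumL sizeFold 0 items in
  allocaBytes requiredSize $ \ptr -> do
    -- write each item at its aligned offset
    zipWithM_
      (\offset (MkStorable x) -> pokeByteOff ptr offset x)
      offsets items
    c_foo (fromIntegral requiredSize) ptr
  where
    -- round the running offset up to the alignment of the next item
    sizeFold offset (MkStorable x) =
      let unalignment = offset `mod` alignment x
          offset' = if unalignment /= 0
                      then offset + alignment x - unalignment
                      else offset
      in (offset' + sizeOf x, offset')
main :: IO ()
main = do
  foo [MkStorable (2 :: Int), MkStorable (3.0 :: Double), MkStorable True]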
The C function has no means to distinguish item boundaries in the received chunk of data, but it wouldn't be hard to include length prefixes or type codes if required.
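The offset calculation can be tried out without any C code. The following is a minimal sketch, not part of the original answer: layout and roundTrip are hypothetical names, and the buffer is read back with peekByteOff instead of being handed to c_foo.

```haskell
{-# LANGUAGE GADTs #-}
import Control.Monad (zipWithM_)
import Data.Int (Int32)
import Data.List (mapAccumL)
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Storable (Storable (..))

-- Same existential wrapper as in the answer.
data DynStorable where
  MkStorable :: Storable a => a -> DynStorable

-- Hypothetical helper: total buffer size plus one aligned offset per item.
layout :: [DynStorable] -> (Int, [Int])
layout = mapAccumL step 0
  where
    step offset (MkStorable x) =
      let a = alignment x
          aligned = ((offset + a - 1) `div` a) * a -- round up to a multiple of a
      in (aligned + sizeOf x, aligned)

-- Hypothetical check: pack three values of different types, then read them back.
roundTrip :: IO (Word8, Int32, Double)
roundTrip = do
  let items = [MkStorable (7 :: Word8), MkStorable (42 :: Int32), MkStorable (2.5 :: Double)]
      (size, offsets) = layout items
  allocaBytes size $ \ptr -> do
    zipWithM_ (\o (MkStorable x) -> pokeByteOff ptr o x) offsets items
    case offsets of
      [o1, o2, o3] -> do
        a <- peekByteOff ptr o1
        b <- peekByteOff ptr o2
        c <- peekByteOff ptr o3
        return (a, b, c)
      _ -> error "layout returned an unexpected number of offsets"
```

Reading the values back requires knowing the types and their order, which is exactly the information the C side lacks unless type codes or length prefixes are added.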

Related

How can I find the definition of a Prelude function?

I'm currently trying to find the definition of the words function to help get an idea for a similar function I'm writing. So I was wondering is there somewhere that has all the definitions of the Prelude functions? Maybe a GHCi command to show the definition of one, or something on the Haskell wiki, I'm not sure.
Or if there isn't somewhere I can find that, do any of y'all know what the definition of words is?
Most packages on Hackage come with documentation that also includes a link to the source code for every function. You can usually find the function via Hoogle.
In the case of words, the Prelude documentation is found here, and the source is found at https://hackage.haskell.org/package/base-4.16.3.0/docs/src/Data.OldList.html#words:
-- | 'words' breaks a string up into a list of words, which were delimited
-- by white space.
--
-- >>> words "Lorem ipsum\ndolor"
-- ["Lorem","ipsum","dolor"]
words :: String -> [String]
{-# NOINLINE [1] words #-}
words s = case dropWhile {-partain:Char.-}isSpace s of
            "" -> []
            s' -> w : words s''
                  where (w, s'') =
                         break {-partain:Char.-}isSpace s'
This should in general also work with local documentation.
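Since the question is about writing a similar function, here is a minimal sketch of how the definition of words might be adapted to split on an arbitrary predicate. The name wordsBy and the helper words' are hypothetical and not taken from base.

```haskell
import Data.Char (isSpace)

-- Hypothetical generalisation of 'words': split on any predicate
-- instead of the hard-coded isSpace.
wordsBy :: (Char -> Bool) -> String -> [String]
wordsBy isDelim s = case dropWhile isDelim s of
  "" -> []
  s' -> w : wordsBy isDelim s''
    where (w, s'') = break isDelim s'

-- With isSpace as the predicate this behaves like the Prelude's 'words'.
words' :: String -> [String]
words' = wordsBy isSpace
```

As with words, runs of consecutive delimiters are collapsed, so no empty strings appear in the result.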
For ghci specifically, there are :info and :list commands, but :list words only produces an error message ("cannot list source code for words: module base-…:Data.OldList is not interpreted") for me.
Documentation of a function.
To search for the definition, look at hackage. In the case of words: https://hackage.haskell.org/package/base-4.16.3.0/docs/Prelude.html#v:words
Finding a function
To find a function, use hoogle. It is the de-facto tool for finding a function.
Example:
This is run in the shell. Hoogle can be installed at the same time as GHC.
Find a function like so:
> hoogle "subtract"
Prelude subtract :: Num a => a -> a -> a
GHC.Num subtract :: Num a => a -> a -> a
Distribution.Compat.Prelude.Internal subtract :: Num a => a -> a -> a
GHC.Prelude subtract :: Num a => a -> a -> a
Hedgehog.Internal.Prelude subtract :: Num a => a -> a -> a
BasePrelude subtract :: Num a => a -> a -> a
RIO.Prelude subtract :: Num a => a -> a -> a
System.Metrics.Gauge subtract :: Gauge -> Int64 -> IO ()
ClassyPrelude subtract :: Num a => a -> a -> a
Algebra.Additive subtract :: C a => a -> a -> a
-- plus more results not shown, pass --count=20 to see more
The cool thing is, you can also search for types and functions with certain types. For example, here is the list of functions which will take an Int and return an Int.
hoogle "Int -> Int"
GHC.Unicode wgencat :: Int -> Int
System.Win32.DebugApi dr :: Int -> Int
Codec.Picture.Jpg.Internal.Common toBlockSize :: Int -> Int
Statistics.Function nextHighestPowerOfTwo :: Int -> Int
Numeric.SpecFunctions log2 :: Int -> Int
Math.NumberTheory.Logarithms intLog2 :: Int -> Int
Math.NumberTheory.Logarithms intLog2' :: Int -> Int
Streamly.Internal.Data.Array.Foreign.Mut.Type roundUpToPower2 :: Int -> Int
Streamly.Internal.System.IO arrayPayloadSize :: Int -> Int
Data.Array.Comfort.Shape triangleSize :: Int -> Int
-- plus more results not shown, pass --count=20 to see more

Besides as-pattern, what else can @ mean in Haskell?

I am currently studying Haskell and trying to understand a project that uses Haskell to implement cryptographic algorithms. After reading Learn You a Haskell for Great Good online, I began to understand the code in that project. Then I got stuck on the following code with the "@" symbol:
-- | Generate an @n@-dimensional secret key over @rq@.
genKey :: forall rq rnd n . (MonadRandom rnd, Random rq, Reflects n Int)
       => rnd (PRFKey n rq)
genKey = fmap Key $ randomMtx 1 $ value @n
Here the randomMtx is defined as follows:
-- | A random matrix having a given number of rows and columns.
randomMtx :: (MonadRandom rnd, Random a) => Int -> Int -> rnd (Matrix a)
randomMtx r c = M.fromList r c <$> replicateM (r*c) getRandom
And PRFKey is defined below:
-- | A PRF secret key of dimension @n@ over ring @a@.
newtype PRFKey n a = Key { key :: Matrix a }
All information sources I can find say that @ is the as-pattern, but this piece of code is apparently not that case. I have checked the online tutorial, blogs and even the Haskell 2010 language report at https://www.haskell.org/definition/haskell2010.pdf. There is simply no answer to this question.
More code snippets can be found in this project using @ in this way too:
-- | Generate public parameters (\( \mathbf{A}_0 \) and \(
-- \mathbf{A}_1 \)) for @n@-dimensional secret keys over a ring @rq@
-- for gadget indicated by @gad@.
genParams :: forall gad rq rnd n .
            (MonadRandom rnd, Random rq, Reflects n Int, Gadget gad rq)
          => rnd (PRFParams n gad rq)
genParams = let len = length $ gadget @gad @rq
                n = value @n
            in Params <$> (randomMtx n (n*len)) <*> (randomMtx n (n*len))
I deeply appreciate any help on this.
That @n is an advanced feature of modern Haskell, which is usually not covered by tutorials like LYAH, nor can it be found in the Report.
It's called a type application and is a GHC language extension. To understand it, consider this simple polymorphic function
dup :: forall a . a -> (a, a)
dup x = (x, x)
Intuitively calling dup works as follows:
the caller chooses a type a
the caller chooses a value x of the previously chosen type a
dup then answers with a value of type (a,a)
In a sense, dup takes two arguments: the type a and the value x :: a. However, GHC is usually able to infer the type a (e.g. from x, or from the context where we are using dup), so we usually pass only one argument to dup, namely x. For instance, we have
dup True :: (Bool, Bool)
dup "hello" :: (String, String)
...
Now, what if we want to pass a explicitly? Well, in that case we can turn on the TypeApplications extension, and write
dup @Bool True :: (Bool, Bool)
dup @String "hello" :: (String, String)
...
Note the @... arguments carrying types (not values). Those exist only at compile time; at runtime the argument does not exist.
Why do we want that? Well, sometimes there is no x around, and we want to prod the compiler to choose the right a. E.g.
dup @Bool :: Bool -> (Bool, Bool)
dup @String :: String -> (String, String)
...
Type applications are often useful in combination with some other extensions which make type inference unfeasible for GHC, like ambiguous types or type families. I won't discuss those, but you can simply understand that sometimes you really need to help the compiler, especially when using powerful type-level features.
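A small runnable sketch of the "no x around" situation, using only Prelude functions. The names normalizeInt and byteRange are made up for illustration; in both definitions the type in question appears nowhere in the signature, so without the type application the code would be ambiguous.

```haskell
{-# LANGUAGE TypeApplications #-}
import Data.Word (Word8)

-- 'show . read' on its own is ambiguous: nothing says which type to parse.
-- The type application fixes the intermediate type to Int.
normalizeInt :: String -> String
normalizeInt = show . read @Int

-- maxBound and minBound take no value argument at all, so the type
-- has to be supplied explicitly here.
byteRange :: Integer
byteRange = fromIntegral (maxBound @Word8) - fromIntegral (minBound @Word8) + 1
```

The same effect could be had with inline annotations such as (read s :: Int); the type application is just a more direct way of passing the type argument.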
Now, about your specific case. I don't have all the details, I don't know the library, but it's very likely that your n represents a kind of natural-number value at the type level. Here we are diving in rather advanced extensions, like the above-mentioned ones plus DataKinds, maybe GADTs, and some typeclass machinery. While I can't explain everything, hopefully I can provide some basic insight. Intuitively,
foo :: forall n . some type using n
takes as argument @n, a kind-of compile-time natural, which is not passed at runtime. Instead,
foo :: forall n . C n => some type using n
takes @n (compile-time), together with a proof that n satisfies constraint C n. The latter is a run-time argument, which might expose the actual value of n. Indeed, in your case, I guess you have something vaguely resembling
value :: forall n . Reflects n Int => Int
which essentially allows the code to bring the type-level natural to the term-level, essentially accessing the "type" as a "value". (The above type is considered an "ambiguous" one, by the way -- you really need @n to disambiguate.)
Finally: why should one want to pass n at the type level if we then later on convert that to the term level? Wouldn't be easier to simply write out functions like
foo :: Int -> ...
foo n ... = ... use n
instead of the more cumbersome
foo :: forall n . Reflects n Int => ...
foo ... = ... use (value @n)
The honest answer is: yes, it would be easier. However, having n at the type level allows the compiler to perform more static checks. For instance, you might want a type to represent "integers modulo n", and allow adding those. Having
data Mod = Mod Int -- Int modulo some n
foo :: Int -> Mod -> Mod -> Mod
foo n (Mod x) (Mod y) = Mod ((x+y) `mod` n)
works, but there is no check that x and y are of the same modulus. We might add apples and oranges, if we are not careful. We could instead write
data Mod n = Mod Int -- Int modulo n
foo :: Int -> Mod n -> Mod n -> Mod n
foo n (Mod x) (Mod y) = Mod ((x+y) `mod` n)
which is better, but still allows calling foo 5 x y even when n is not 5. Not good. Instead,
data Mod n = Mod Int -- Int modulo n
-- a lot of type machinery omitted here
foo :: forall n . SomeConstraint n => Mod n -> Mod n -> Mod n
foo (Mod x) (Mod y) = Mod ((x+y) `mod` (value @n))
prevents things from going wrong. The compiler statically checks everything. The code is harder to use, yes, but in a sense making it harder to use is the whole point: we want to make it impossible for the user to try adding something of the wrong modulus.
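To make the omitted type machinery concrete, here is a minimal sketch that uses GHC's built-in KnownNat class from GHC.TypeLits in place of the project's Reflects class. This is an assumption for illustration only: the library in the question defines its own machinery, and addMod is a made-up name.

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeApplications #-}
import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownNat, Nat, natVal)

-- Bring a type-level natural down to the term level.
-- The type is ambiguous, so callers must write value @n.
value :: forall n. KnownNat n => Int
value = fromIntegral (natVal (Proxy :: Proxy n))

-- Int modulo n, with n tracked only in the type.
newtype Mod (n :: Nat) = Mod Int
  deriving (Eq, Show)

-- Both arguments must share the same modulus n, checked at compile time.
addMod :: forall n. KnownNat n => Mod n -> Mod n -> Mod n
addMod (Mod x) (Mod y) = Mod ((x + y) `mod` value @n)
```

With these definitions, addMod (Mod 3 :: Mod 5) (Mod 4) type-checks, while mixing a Mod 5 with a Mod 7 is rejected by the compiler.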
Concluding: these are very advanced extensions. If you're a beginner, you will need to slowly progress towards these techniques. Don't be discouraged if you can't grasp them after only a short study, it does take some time. Make a small step at a time, solve some exercises for each feature to understand the point of it. And you'll always have StackOverflow when you are stuck :-)

Do notation for monad in function returning a different type

Is there a way to write do notation for a monad in a function whose return type isn't that monad?
I have a main function doing most of the logic of the code, supplemented by another function which does some calculations for it in the middle. The supplementary function might fail, which is why it is returning a Maybe value. I'm looking to use the do notation for the returned values in the main function. Giving a generic example:
-- does some computation to two Ints which might fail
compute :: Int -> Int -> Maybe Int
-- actual logic
main :: Int -> Int -> Int
main x y = do
  first <- compute x y
  second <- compute (x+2) (y+2)
  third <- compute (x+4) (y+4)
  -- does some Int calculation to first, second and third
What I intend is for first, second, and third to have the actual Int values, taken out of the Maybe context, but doing the way above makes Haskell complain about not being able to match types of Maybe Int with Int.
Is there a way to do this? Or am I heading towards the wrong direction?
Pardon me if some terminology is wrongly used, I'm new to Haskell and still trying to wrap my head around everything.
EDIT
main has to return an Int, without being wrapped in Maybe, as there is another part of the code using the result of main as Int. The results of a single compute might fail, but they should collectively pass (i.e. at least one would pass) in main, and what I'm looking for is a way to use do notation to take them out of Maybe, do some simple Int calculations to them (e.g. possibly treating any Nothing returned as 0), and return the final value as just Int.
Well the signature is in essence wrong. The result should be a Maybe Int:
main :: Int -> Int -> Maybe Int
main x y = do
  first <- compute x y
  second <- compute (x+2) (y+2)
  third <- compute (x+4) (y+4)
  return (first + second + third)
For example here we return (first + second + third), and the return will wrap these in a Just data constructor.
This is because your do block implicitly uses the >>= of the Monad instance for Maybe, which is defined as:
instance Monad Maybe where
    Nothing >>= _ = Nothing
    (Just x) >>= f = f x
    return = Just
So that means that it will indeed "unpack" values out of a Just data constructor, but in case a Nothing comes out of it, then this means that the result of the entire do block will be Nothing.
This is more or less the convenience the Monad instance for Maybe offers: you can make computations as a chain of successful actions, and in case one of these fails, the result will be Nothing, otherwise it will be Just result.
You can thus not at the end return an Int instead of a Maybe Int, since it is definitely possible - from the perspective of the types - that one or more computations can return a Nothing.
You can however "post" process the result of the do block, if you for example add a "default" value that will be used in case one of the computations is Nothing, like:
import Data.Maybe(fromMaybe)
main :: Int -> Int -> Int
main x y = fromMaybe 0 $ do
  first <- compute x y
  second <- compute (x+2) (y+2)
  third <- compute (x+4) (y+4)
  return (first + second + third)
Here in case the do-block thus returns a Nothing, we replace it with 0 (you can of course add another value in the fromMaybe :: a -> Maybe a -> a as a value in case the computation "fails").
If you want to return the first element in a list of Maybes that is Just, then you can use asum :: (Foldable t, Alternative f) => t (f a) -> f a, so then you can write your main like:
-- first non-failing computation
import Data.Foldable(asum)
import Data.Maybe(fromMaybe)
main :: Int -> Int -> Int
main x y = fromMaybe 0 $ asum [
    compute x y,
    compute (x+2) (y+2),
    compute (x+4) (y+4)
  ]
Note that the asum can still contain only Nothings, so you still need to do some post-processing.
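Both variants can be tried side by side. In this sketch compute is a stand-in (safe integer division), since the question does not say what the real compute does, and sumAll and firstOk are hypothetical names used instead of main.

```haskell
import Data.Foldable (asum)
import Data.Maybe (fromMaybe)

-- Stand-in for the asker's compute: division that fails on a zero divisor.
compute :: Int -> Int -> Maybe Int
compute _ 0 = Nothing
compute a b = Just (a `div` b)

-- All three computations must succeed; otherwise fall back to 0.
sumAll :: Int -> Int -> Int
sumAll x y = fromMaybe 0 $ do
  first <- compute x y
  second <- compute (x + 2) (y + 2)
  third <- compute (x + 4) (y + 4)
  return (first + second + third)

-- Take the first computation that succeeds; fall back to 0 if none does.
firstOk :: Int -> Int -> Int
firstOk x y = fromMaybe 0 $ asum
  [ compute x y
  , compute (x + 2) (y + 2)
  , compute (x + 4) (y + 4)
  ]
```

Note that a result of 0 from sumAll does not distinguish "a computation failed" from "the sum really is 0", which is the information lost by leaving Maybe.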
Willem's answer is basically perfect, but just to really drive the point home, let's think about what would happen if you could write something that allows you to return an int.
So you have the main function with type Int -> Int -> Int, let's assume an implementation of your compute function as follows:
compute :: Int -> Int -> Maybe Int
compute a 0 = Nothing
compute a b = Just (a `div` b)
Now this is basically a safe version of the integer division function div :: Int -> Int -> Int that returns a Nothing if the divisor is 0.
If you could write a main function as you like that returns an Int, you'd be able to write the following:
unsafe :: Int
unsafe = main 10 (-2)
This would make the second <- compute ... fail and return a Nothing but now you have to interpret your Nothing as a number which is not good. It defeats the whole purpose of using Maybe monad which captures failure safely. You can, of course, give a default value to Nothing as Willem described, but that's not always appropriate.
More generally, when you're inside a do block you should just think inside "the box" that is the monad and don't try to escape. In some cases like Maybe you might be able to do unMaybe with something like fromMaybe or maybe functions, but not in general.
I have two interpretations of your question, so to answer both of them:
Sum the Maybe Int values that are Just n to get an Int
To sum Maybe Ints while throwing out Nothing values, you can use sum with Data.Maybe.catMaybes :: [Maybe a] -> [a] to throw out Nothing values from a list:
sum . catMaybes $ [compute x y, compute (x+2) (y+2), compute (x+4) (y+4)]
Get the first Maybe Int value that's Just n as an Int
To get the first non-Nothing value, you can use catMaybes combined with listToMaybe :: [a] -> Maybe a to get Just the first value if there is one or Nothing if there isn't and fromMaybe :: a -> Maybe a -> a to convert Nothing to a default value:
fromMaybe 0 . listToMaybe . catMaybes $ [compute x y, compute (x+2) (y+2), compute (x+4) (y+4)]
If you're guaranteed to have at least one succeed, use head instead:
head . catMaybes $ [compute x y, compute (x+2) (y+2), compute (x+4) (y+4)]
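A runnable sketch of both interpretations follows. As before, compute is a stand-in (safe integer division) and the names attempts, sumSuccesses and firstSuccess are hypothetical.

```haskell
import Data.Maybe (catMaybes, fromMaybe, listToMaybe)

-- Stand-in for the asker's compute: division that fails on a zero divisor.
compute :: Int -> Int -> Maybe Int
compute _ 0 = Nothing
compute a b = Just (a `div` b)

-- The three attempts from the question, collected in a list.
attempts :: Int -> Int -> [Maybe Int]
attempts x y = [compute x y, compute (x + 2) (y + 2), compute (x + 4) (y + 4)]

-- Interpretation 1: sum whatever succeeded, ignoring failures.
sumSuccesses :: Int -> Int -> Int
sumSuccesses x y = sum (catMaybes (attempts x y))

-- Interpretation 2: first success, or 0 if everything failed.
firstSuccess :: Int -> Int -> Int
firstSuccess x y = fromMaybe 0 (listToMaybe (catMaybes (attempts x y)))
```

The listToMaybe version is preferable to head unless the guarantee of at least one success is enforced somewhere, because head throws an exception on an empty list.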

Pattern matching on a private data constructor

I'm writing a simple ADT for a grid axis. In my application a grid may be either regular (with a constant step between coordinates) or irregular (otherwise). Of course, the regular grid is just a special case of the irregular one, but it may be worth differentiating between them in some situations (for example, to perform some optimizations). So, I declare my ADT as follows:
data GridAxis = RegularAxis (Float, Float) Float -- (min, max) delta
| IrregularAxis [Float] -- [xs]
But I don't want user to create malformed axes with max < min or with unordered xs list. So, I add "smarter" construction functions which perform some basic checks:
regularAxis :: (Float, Float) -> Float -> GridAxis
regularAxis (a, b) dx = RegularAxis (min a b, max a b) (abs dx)
irregularAxis :: [Float] -> GridAxis
irregularAxis xs = IrregularAxis (sort xs)
I don't want user to create grids directly, so I don't add GridAxis data constructors into module export list:
module GridAxis (
GridAxis,
regularAxis,
irregularAxis,
) where
But it turned out that after having this done I cannot use pattern matching on GridAxis anymore. Trying to use it
import qualified GridAxis as GA
test :: GA.GridAxis -> Bool
test axis = case axis of
GA.RegularAxis -> True
GA.IrregularAxis -> False
gives the following compiler error:
src/Physics/ImplicitEMC.hs:7:15:
Not in scope: data constructor `GA.RegularAxis'
src/Physics/ImplicitEMC.hs:8:15:
Not in scope: data constructor `GA.IrregularAxis'
Is there a way to work around this?
You can define constructor pattern synonyms. This lets you use the same name for smart construction and "dumb" pattern matching.
{-# LANGUAGE PatternSynonyms #-}
module GridAxis (GridAxis, pattern RegularAxis, pattern IrregularAxis) where
import Data.List
data GridAxis = RegularAxis_ (Float, Float) Float -- (min, max) delta
              | IrregularAxis_ [Float] -- [xs]
-- The line with "<-" defines the matching behavior
-- The line with "=" defines the constructor behavior
pattern RegularAxis minmax delta <- RegularAxis_ minmax delta where
  RegularAxis (a, b) dx = RegularAxis_ (min a b, max a b) (abs dx)
pattern IrregularAxis xs <- IrregularAxis_ xs where
  IrregularAxis xs = IrregularAxis_ (sort xs)
Now you can do:
module Foo where
import GridAxis
foo :: GridAxis -> a
foo (RegularAxis (a, b) d) = ...
foo (IrregularAxis xs) = ...
And also use RegularAxis and IrregularAxis as smart constructors.
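The behaviour can be checked in a single file. This is a sketch under some assumptions: the module export list is left out, so the hiding of the underscore constructors is not demonstrated; the pattern type signatures and the COMPLETE pragma are additions that are not in the answer; and axisMin is a made-up example function.

```haskell
{-# LANGUAGE PatternSynonyms #-}
import Data.List (sort)
import Data.Maybe (listToMaybe)

data GridAxis
  = RegularAxis_ (Float, Float) Float -- (min, max) delta
  | IrregularAxis_ [Float]            -- sorted coordinates

-- Matching sees the stored fields; construction normalises the arguments.
pattern RegularAxis :: (Float, Float) -> Float -> GridAxis
pattern RegularAxis minmax delta <- RegularAxis_ minmax delta where
  RegularAxis (a, b) dx = RegularAxis_ (min a b, max a b) (abs dx)

pattern IrregularAxis :: [Float] -> GridAxis
pattern IrregularAxis xs <- IrregularAxis_ xs where
  IrregularAxis xs = IrregularAxis_ (sort xs)

-- Tells the exhaustiveness checker that these two synonyms cover GridAxis.
{-# COMPLETE RegularAxis, IrregularAxis #-}

-- Example consumer: smallest coordinate of an axis, if there is one.
axisMin :: GridAxis -> Maybe Float
axisMin (RegularAxis (lo, _) _) = Just lo
axisMin (IrregularAxis xs) = listToMaybe xs -- xs was sorted on construction
```

Because construction goes through the synonym, axisMin (RegularAxis (5, 1) 0.5) sees the normalised bounds rather than the ones passed in.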
This looks like a use case for pattern synonyms.
Basically you don't export the real constructor, but only a "smart" one
{-# LANGUAGE PatternSynonyms #-}
module M(T(), pattern SmartCons, smartCons) where
data T = RealCons Int
-- the users will construct T using this
smartCons :: Int -> T
smartCons n = if even n then RealCons n else error "wrong!"
-- ... and destruct T using this
pattern SmartCons n <- RealCons n
Another module importing M can then use
case someTvalue of
SmartCons n -> use n
and e.g.
let value = smartCons 23 in ...
but can not use the RealCons directly.
If you prefer to stay in basic Haskell, without extensions, you can use a "view type"
module M(T(), smartCons, Tview(..), toView) where
data T = RealCons Int
-- the users will construct T using this
smartCons :: Int -> T
smartCons n = if even n then RealCons n else error "wrong!"
-- ... and destruct T using this
data Tview = Tview Int
toView :: T -> Tview
toView (RealCons n) = Tview n
Here, users have full access to the view type, which can be constructed/destructed freely, but have only a restricted smart constructor for the actual type T. Destructing the actual type T is possible by moving to the view type
case toView someTvalue of
Tview n -> use n
For nested patterns, things become more cumbersome, unless you enable other extensions such as ViewPatterns.
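As a rough illustration of the nested case, the sketch below matches on a pair of T values, once with ViewPatterns and once without. It is written as a single file for brevity, so the export restrictions of module M are not shown, and sumPair and sumPair' are hypothetical names.

```haskell
{-# LANGUAGE ViewPatterns #-}

data T = RealCons Int

-- Smart constructor from the answer: only even numbers are accepted.
smartCons :: Int -> T
smartCons n = if even n then RealCons n else error "wrong!"

data Tview = Tview Int

toView :: T -> Tview
toView (RealCons n) = Tview n

-- With ViewPatterns, toView is applied inside the nested pattern.
sumPair :: (T, T) -> Int
sumPair (toView -> Tview a, toView -> Tview b) = a + b

-- Without the extension, each component is converted by hand first.
sumPair' :: (T, T) -> Int
sumPair' (t1, t2) = case (toView t1, toView t2) of
  (Tview a, Tview b) -> a + b
```

The two functions behave identically; the extension only removes the intermediate case expression.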

State Monad, sequences of random numbers and monadic code

I'm trying to grasp the State Monad and with this purpose I wanted to write a monadic code that would generate a sequence of random numbers using a Linear Congruential Generator (probably not good, but my intention is just to learn the State Monad, not build a good RNG library).
The generator is just this (I want to generate a sequence of Bools for simplicity):
type Seed = Int
random :: Seed -> (Bool, Seed)
random seed = let (a, c, m) = (1664525, 1013904223, 2^32) -- some params for the LCG
                  seed' = (a*seed + c) `mod` m
              in (even seed', seed') -- return True/False if seed' is even/odd
Don't worry about the numbers, this is just an update rule for the seed that (according to Numerical Recipes) should generate a pseudo-random sequence of Ints. Now, if I want to generate random numbers sequentially I'd do:
rand3Bools :: Seed -> ([Bool], Seed)
rand3Bools seed0 = let (b1, seed1) = random seed0
                       (b2, seed2) = random seed1
                       (b3, seed3) = random seed2
                   in ([b1,b2,b3], seed3)
Ok, so I could avoid this boilerplate by using a State Monad:
import Control.Monad.State
data Random = Random {seed :: Seed, value :: Bool}
nextVal :: State Random Bool
nextVal = do
  Random seed val <- get
  let seed' = updateSeed seed
      val' = even seed'
  put (Random seed' val')
  return val'
updateSeed :: Seed -> Seed
updateSeed seed = let (a,c,m) = (1664525, 1013904223, 2^32) in (a*seed + c) `mod` m
And finally:
getNRandSt :: Int -> State Random [Bool]
getNRandSt n = replicateM n nextVal
getNRand :: Int -> Seed -> [Bool]
getNRand n seed = evalState (getNRandSt n) (Random seed True)
Ok, this works fine and give me a list of n pseudo-random Bools for each given seed. But...
I can read what I've done (mainly based on this example: http://www.haskell.org/pipermail/beginners/2008-September/000275.html ) and replicate it to do other things. But I don't think I can understand what's really happening behind the do-notation and monadic functions (like replicateM).
Can anyone help me with some of these doubts?
1 - I've tried to desugar the nextVal function to understand what it does, but I couldn't. I can guess it extracts the current state, updates it and then passes the state ahead to the next computation, but this is just based on reading this do-sugar as if it were English.
How do I really desugar this function to the original >>= and return functions step-by-step?
2 - I couldn't grasp what exactly the put and get functions do. I can guess that they "pack" and "unpack" the state. But the mechanics behind the do-sugar is still elusive to me.
Well, any other general remarks about this code are very welcome. I sometimes feel with Haskell that I can create code that works and does what I expect it to do, but I can't "follow the evaluation" as I'm accustomed to doing with imperative programs.
The State monad does look kind of confusing at first; let's do as Norman Ramsey suggested, and walk through how to implement from scratch. Warning, this is pretty lengthy!
First, State has two type parameters: the type of the contained state data and the type of the final result of the computation. We'll use stateData and result respectively as type variables for them here. This makes sense if you think about it; the defining characteristic of a State-based computation is that it modifies a state while producing an output.
Less obvious is that the type constructor takes a function from a state to a modified state and result, like so:
newtype State stateData result = State (stateData -> (result, stateData))
So while the monad is called "State", the actual value wrapped by the monad is that of a State-based computation, not the actual value of the contained state.
Keeping that in mind, we shouldn't be surprised to find that the function runState used to execute a computation in the State monad is actually nothing more than an accessor for the wrapped function itself, and could be defined like this:
runState (State f) = f
So what does it mean when you define a function that returns a State value? Let's ignore for a moment the fact that State is a monad, and just look at the underlying types. First, consider this function (which doesn't actually do anything with the state):
len2State :: String -> State Int Bool
len2State s = return ((length s) == 2)
If you look at the definition of State, we can see that here the stateData type is Int, and the result type is Bool, so the function wrapped by the data constructor must have the type Int -> (Bool, Int). Now, imagine a State-less version of len2State--obviously, it would have type String -> Bool. So how would you go about converting such a function into one returning a value that fits into a State wrapper?
Well, obviously, the converted function will need to take a second parameter, an Int representing the state value. It also needs to return a state value, another Int. Since we're not actually doing anything with the state in this function, let's just do the obvious thing--pass that int right on through. Here's a State-shaped function, defined in terms of the State-less version:
len2 :: String -> Bool
len2 s = ((length s) == 2)
len2State :: String -> (Int -> (Bool, Int))
len2State s i = (len2 s, i)
But that's kind of silly and redundant. Let's generalize the conversion so that we can pass in the result value, and turn anything into a State-like function.
convert :: Bool -> (Int -> (Bool, Int))
convert r d = (r, d)
len2 s = ((length s) == 2)
len2State :: String -> (Int -> (Bool, Int))
len2State s = convert (len2 s)
What if we want a function that changes the state? Obviously we can't build one with convert, since we wrote that to pass the state through. Let's keep it simple, and write a function to overwrite the state with a new value. What kind of type would it need? It'll need an Int for the new state value, and of course will have to return a function stateData -> (result, stateData), because that's what our State wrapper needs. Overwriting the state value doesn't really have a sensible result value outside the State computation, so our result here will just be (), the zero-element tuple that represents "no value" in Haskell.
overwriteState :: Int -> (Int -> ((), Int))
overwriteState newState _ = ((), newState)
That was easy! Now, let's actually do something with that state data. Let's rewrite len2State from above into something more sensible: we'll compare the string length to the current state value.
lenState :: String -> (Int -> (Bool, Int))
lenState s i = ((length s) == i, i)
Can we generalize this into a converter and a State-less function, like we did before? Not quite as easily. Our len function will need to take the state as an argument, but we don't want it to "know about" state. Awkward, indeed. However, we can write a quick helper function that handles everything for us: we'll give it a function that needs to use the state value, and it'll pass the value in and then package everything back up into a State-shaped function leaving len none the wiser.
useState :: (Int -> Bool) -> Int -> (Bool, Int)
useState f d = (f d, d)
len :: String -> Int -> Bool
len s i = (length s) == i
lenState :: String -> (Int -> (Bool, Int))
lenState s = useState (len s)
Now, the tricky part--what if we want to string these functions together? Let's say we want to use lenState on a string, then double the state value if the result is false, then check the string again, and finally return true if either check did. We have all the parts we need for this task, but writing it all out would be a pain. Can we make a function that automatically chains together two functions that each return State-like functions? Sure thing! We just need to make sure it takes as arguments two things: the State function returned by the first function, and a function that takes the prior function's result type as an argument. Let's see how it turns out:
chainStates :: (Int -> (result1, Int)) -> (result1 -> (Int -> (result2, Int))) -> (Int -> (result2, Int))
chainStates prev f d = let (r, d') = prev d
                       in f r d'
All this is doing is applying the first state function to some state data, then applying the second function to the result and the modified state data. Simple, right?
Now, the interesting part: Between chainStates and convert, we should almost be able to turn any combination of State-less functions into a State-enabled function! The only thing we need now is a replacement for useState that returns the state data as its result, so that chainStates can pass it along to the functions that don't know anything about the trick we're pulling on them. Also, we'll use lambdas to accept the result from the previous functions and give them temporary names. Okay, let's make this happen:
extractState :: Int -> (Int, Int)
extractState d = (d, d)
chained :: String -> (Int -> (Bool, Int))
chained str = chainStates extractState $ \state1 ->
              let check1 = (len str state1) in
              chainStates (overwriteState (
                  if check1
                  then state1
                  else state1 * 2)) $ \ _ ->
              chainStates extractState $ \state2 ->
              let check2 = (len str state2) in
              convert (check1 || check2)
And try it out:
> chained "abcd" 2
(True, 4)
> chained "abcd" 3
(False, 6)
> chained "abcd" 4
(True, 4)
> chained "abcdef" 5
(False, 10)
Of course, we can't forget that State is actually a monad that wraps the State-like functions and keeps us away from them, so none of our nifty functions that we've built will help us with the real thing. Or will they? In a shocking twist, it turns out that the real State monad provides all the same functions, under different names:
runState (State s) = s
return r = State (convert r)
(>>=) s f = State (\d -> let (r, d') = (runState s) d in
                         runState (f r) d')
get = State extractState
put d = State (overwriteState d)
Note that >>= is almost identical to chainStates, but there was no good way to define it using chainStates. So, to wrap things up, we can rewrite the final example using the real State:
chained str = get >>= \state1 ->
              let check1 = (len str state1) in
              put (if check1
                   then state1 else state1 * 2) >>= \ _ ->
              get >>= \state2 ->
              let check2 = (len str state2) in
              return (check1 || check2)
Or, all candied up with the equivalent do notation:
chained str = do
  state1 <- get
  let check1 = len str state1
  _ <- put (if check1 then state1 else state1 * 2)
  state2 <- get
  let check2 = (len str state2)
  return (check1 || check2)
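For anyone who wants to run the final version without a library, here is a self-contained sketch of a hand-rolled State. The Functor and Applicative instances are additions that the walkthrough above does not mention; current GHC versions require them before a Monad instance can be declared.

```haskell
-- A hand-rolled State, mirroring the definitions in the walkthrough.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s) -- same role as 'convert'
  State f <*> State g = State $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in (h a, s2)

instance Monad (State s) where
  -- same role as 'chainStates'
  State g >>= f = State $ \s -> let (a, s') = g s in runState (f a) s'

get :: State s s
get = State $ \s -> (s, s) -- same role as 'extractState'

put :: s -> State s ()
put s = State $ \_ -> ((), s) -- same role as 'overwriteState'

len :: String -> Int -> Bool
len s i = length s == i

chained :: String -> State Int Bool
chained str = do
  state1 <- get
  let check1 = len str state1
  put (if check1 then state1 else state1 * 2)
  state2 <- get
  let check2 = len str state2
  return (check1 || check2)
```

Running runState (chained "abcd") 2 should reproduce the first result of the earlier session, a True result with final state 4.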
First of all, your example is overly complicated because it doesn't need to store the val in the state monad; only the seed is the persistent state. Second, I think you will have better luck if instead of using the standard state monad, you re-implement all of the state monad and its operations yourself, with their types. I think you will learn more this way. Here are a couple of declarations to get you started:
data MyState s a = MyState (s -> (s, a))
get :: MyState s s
put :: s -> MyState s ()
Then you can write your own connectives:
unit :: a -> MyState s a
bind :: MyState s a -> (a -> MyState s b) -> MyState s b
Finally
data Seed = Seed Int
nextVal :: MyState Seed Bool
As for your trouble desugaring, the do notation you are using is pretty sophisticated.
But desugaring is a line-at-a-time mechanical procedure. As near as I can make out, your code should desugar like this (going back to your original types and code, which I disagree with):
nextVal = get >>= \(Random seed val) ->
  let seed' = updateSeed seed
      val' = even seed'
  in put (Random seed' val') >>= \ _ -> return val'
In order to make the nesting structure a bit clearer, I've taken major liberties with the indentation.
You've got a couple great responses. What I do when working with the State monad is in my mind replace State s a with s -> (s,a) (after all, that's really what it is).
You then get a type for bind that looks like:
(>>=) :: (s -> (s,a)) ->
(a -> s -> (s,b)) ->
(s -> (s,b))
and you see that bind is just a specialized kind of function composition operator, like (.)
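The "just functions" view can be run directly. The sketch below uses plain functions of type s -> (s, a), with the tuple order of this answer, and applies them to the generator from the question. The names St, unit, bind, nextBool, threeBools and threeBoolsManual are made up for illustration, and the arithmetic assumes a 64-bit Int.

```haskell
-- State as a plain function; no newtype, no type class.
type St s a = s -> (s, a)

unit :: a -> St s a
unit a s = (s, a)

-- Run the first step, then feed its result and the new state to the next.
bind :: St s a -> (a -> St s b) -> St s b
bind m f s = let (s', a) = m s in f a s'

-- One step of the LCG from the question; the seed is the only state.
nextBool :: St Int Bool
nextBool seed = (seed', even seed')
  where seed' = (1664525 * seed + 1013904223) `mod` (2 ^ (32 :: Int))

-- Three steps chained with bind.
threeBools :: St Int [Bool]
threeBools =
  nextBool `bind` \b1 ->
  nextBool `bind` \b2 ->
  nextBool `bind` \b3 ->
  unit [b1, b2, b3]

-- The same three steps with the seed threaded by hand.
threeBoolsManual :: St Int [Bool]
threeBoolsManual seed0 =
  let (seed1, b1) = nextBool seed0
      (seed2, b2) = nextBool seed1
      (seed3, b3) = nextBool seed2
  in (seed3, [b1, b2, b3])
```

For any starting seed, threeBools and threeBoolsManual return the same pair, which is all that bind is doing behind the do notation.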
I wrote a blog/tutorial on the state monad here. It's probably not particularly good, but helped me grok things a little better by writing it.
