Haskell function that works with STUArray

Haskell function that works with STUArray - haskell

I have a small Haskell function that is supposed to accept an STUArray, modify some of the elements, and then return the changed array. It will be called from another function working in the ST s (STUArray s Int Word32) monad. It part of a fast PBKDF2 function that I am trying to write. This function does SHA-1 padding for a fixed sized message (160-bits).
Here is my code:
padFixed :: STUArray s Int Word32 -> ST s (STUArray s Int Word32)
padFixed block = do
unsafeWrite block 5 0x80000000
unsafeWrite block 15 160
return block
The array will contain the 20 bytes from a previous SHA-1 run, plus 44 bytes of zeros. It will add the required padding as per RFC 3174.
How can I rewrite it so is "takes" the array out of the monad, works on it, and then puts it back? The signature should be padFixed :: ST s (STUArray s Int Word32), without the block parameter.
Is this possible? I could not find any functions in the library that let me extract the array from the monad, but maybe I missed something.
Are there any good tutorials on the STArray?

No, it's not possible; ST doesn't have those semantics. The monad is ST s, and not ST s (STUArray s a). ST s is just a monad for keeping track of mutable state; which structures you choose to allocate and use inside a single ST region are up to you. If you have a bunch of computations which all operate on the same STUArray, you can use ReaderT:
type Hasher s = ReaderT (STUArray s Int Word32) (ST s)
padFixed :: Hasher ()
padFixed = do
block <- ask
unsafeWrite block 5 0x80000000
unsafeWrite block 15 160
The Reader r monad is just a wrapper around r ->; a value of type Reader r a is just a function r -> a. This is essentially a way to compute a while having access to a value of type r. The ReaderT r monad transformer just allows you to provide access to a variable of type r to an arbitrary monadic computation; thus, ReaderT (STUArray s Int Word32) (ST s) is an ST s computation which has access to some array. Note that you don't need to return the array from padFixed; the monad bind will handle all of that.
This'll be a little bit of a pain to write, since we'll have to keep asking for the array. Luckily, we can write some combinators to handle this for us:
{-# LANGUAGE RankNTypes, GeneralizedNewtypeDeriving #-}
import Data.Word
import Control.Applicative
import Control.Monad.Reader
import Control.Monad.ST
import Data.Array.ST (STUArray, runSTUArray)
import qualified Data.Array.Base as A
import Data.Array.Unboxed (UArray)
newtype Hasher s a =
Hasher { getHasher :: ReaderT (STUArray s Int Word32) (ST s) a }
deriving (Functor, Applicative, Monad, MonadReader (A.STUArray s Int Word32))
hasherToST :: Hasher s () -> (Int,Int) -> ST s (STUArray s Int Word32)
hasherToST (Hasher r) bounds = do
block <- A.newArray bounds 0
runReaderT r block
return block
runHasher :: (forall s. Hasher s ()) -> (Int,Int) -> UArray Int Word32
runHasher h bounds = runSTUArray $ hasherToST h bounds
-- Perhaps private to this module, perhaps not
liftST :: ST s a -> Hasher s a
liftST = Hasher . lift
----- We can lift the functions which act on an STUArray -----
getBounds :: Hasher s (Int,Int)
getBounds = liftST . A.getBounds =<< ask
-- I'd recommend against removing the `unsafe` from the name; this function
-- could segfault, after all.
unsafeReadBlock :: Int -> Hasher s Word32
unsafeReadBlock i = do
block <- ask
liftST $ A.unsafeRead block i
unsafeWriteBlock :: Int -> Word32 -> Hasher s ()
unsafeWriteBlock i x = do
block <- ask
liftST $ A.unsafeWrite block i x
----- And then, perhaps in a separate module: -----
padFixed :: Hasher s ()
padFixed = do
unsafeWriteBlock 5 0x80000000
unsafeWriteBlock 15 160
(Note that I couldn't inline hasherToST inside of runHasher, probably because of the higher-rank types blocking inference.)
Basically, we wrap the ReaderT (STUArray s Int Word32) (ST s) into a newtype instead of a type synonym, and lift some basic array primitives up to work on the always-available block. You don't even need to derive MonadReader for the Hasher type if you don't want, as long as you lift all the necessary functions. But once you've done this, your hashing code can talk about the array implicitly.

No, you're confused; that's not possible. Think of STUArray s i e as being a pointer to the beginning of a block of memory. You have to pass that pointer around to anything that needs to modify that block of memory; you can't just conjure it up out of thin air.
But you don't need to return it. Presumably the caller already has the pointer.

You can use the freeze and thaw functions to convert to and from a UArray.
However either this will incur a performance penalty, or you need to use the "unsafe" variants. Since you're already doing unsafe writing, that's probably OK.

Related

Lifting a value in the State monad in Haskell

I am writing a Sudoku generator/solver in Haskell as a learning exercise.
My solve function takes in a UArray but returns a State Int (UArray ...) so that it can also return the maximum difficulty level that it found while solving.
This is my function so far (still in the very experimental early stage):
import Control.Monad.State (State, put)
import Control.Monad.Trans.Class (lift)
import Data.Array.MArray (thaw)
import Data.Array.ST (runSTUArray)
import Data.Array.Unboxed (UArray)
-- ...
type Cell = Word16
solve :: UArray (Int, Int) Cell -> State Int (UArray (Int, Int) Cell)
solve grid = do
return $ runSTUArray $ do
arr <- thaw grid
lift $ put 42
return arr
It does not really do anything with the mutable array yet. I am simply trying to get it to type check with the put 42, but currently get the following error:
• Couldn't match kind ‘*’ with ‘* -> *’
When matching the kind of ‘ST’
• In a stmt of a 'do' block: lift $ put 42
In the second argument of ‘($)’, namely
‘do arr <- thaw grid
lift $ put 42
return arr’
In the second argument of ‘($)’, namely
‘runSTUArray
$ do arr <- thaw grid
lift $ put 42
return arr’
|
128 | lift $ put 42
| ^^^^^^^^^^^^^

runSTUArray ... is a pure value, it does not know anything about "outer monad". And State cares about how you use it, you cannot pass it opaquely into ST.
What you could do:
Option1: change the whole program to move more logic to ST side. Instead of State you'd use STRef then:
solve :: ST s (STRef Int) -> ST s (UArray (Int, Int) Cell) -> ST s ()
...
Option2: manually extract it and pass it to ST, then get back and put explicitly. But there is complication. runSTUArray does not allow getting another value together with the array. I don't know how it can be done safely with current array functions. Unsafely you could re-implement better runSTUArray which can pass another value. You could also add fake cells and encode the new state there.
The way to export another value exists in the vector package, there is (in new versions) createT function which can take not bare vector but a structure containing it (or even several vectors). So, overall, your example would be like:
import Control.Monad.State (State, put, get)
import Data.Word (Word16)
import qualified Data.Vector.Unboxed as DVU
type Cell = Word16
solve :: DVU.Vector Cell -> State Int (DVU.Vector Cell)
solve grid = do
oldState <- get
let (newState, newGrid) = DVU.createT (do
arr <- DVU.thaw grid
pure (oldState + 42, arr))
put newState
pure newGrid
vectors are one-dimensional only, unfortunately

solve grid has form return $ .... This means that State Int (UArray (Int, Int) Cell) is just specialized Monad m => m (UArray (Int, Int) Cell) - the ... does not have access to the features of this specific monad, it's just a UArray (Int, Int) Cell value that you return.

I was able to get a slight variation to compile and run after changing the State monad to a tuple (Int, Grid):
import Control.Monad.ST (ST, runST)
import Data.Array.MArray (freeze, thaw, writeArray)
import Data.Array.ST (STUArray)
import Data.Array.Unboxed (UArray)
import Data.Word (Word16)
type Cell = Word16
type Grid = UArray (Int, Int) Cell
solve :: Grid -> (Int, Grid)
solve grid =
runST $ do
mut <- thaw grid :: ST s (STUArray s (Int, Int) Cell)
writeArray mut (0, 0) 0 -- test that I can actually write
frozen <- freeze mut
return (42, frozen)
This works fine for my application.

Difficulty with zoom and free monads

I am mucking around with free monads and lens, using the free monad to create my own version of the IO monad:
data MyIO next
= LogMsg String next
| GetInput (String -> next)
deriving (Functor)
I am stacking this on top of a state monad like so: FreeT MyIO (State GameState) a where GameState is:
data GameState = GameState { _players :: [PlayerState] }
Now, what I would like to have is a way to "zoom-into" a PlayerState from a GameState context. Something like this:
zoomPlayer :: Int -> FreeT MyIO (State PlayerState) a -> FreeT MyIO (State GameState) a
zoomPlayer i prog = hoistFreeT (zoom (players . element i)) prog
But I'm getting this error:
No instance for (Data.Monoid.Monoid a1)
arising from a use of ‘_head’
This error seems related to the fact that players . element i is a traversal; if I remove the list aspect from _players and use normal lens then the code works.
Any ideas on how to write this function?

If you are sure you'll never index into a non-existing player and don't mind a little unsafety, you can use the unsafeSingular combinator to turn a Traversal into a Lens, like this:
zoomPlayer :: Int -> FreeT MyIO (State PlayerState) a -> FreeT MyIO (State GameState) a
zoomPlayer i prog = hoistFreeT (zoom (players . unsafeSingular (element i))) prog
(Also, perhaps I would use ix instead of element, but that's unrelated to the problem.)
We can also construct safe indexing lenses for always-infinite sequences, like streams defined using Cofree from the free package:
import Control.Lens (Lens', _Wrapped')
import Control.Comonad.Cofree (Cofree, telescoped)
import Data.Functor.Identity
import Control
sureIx :: Int -> Lens' (Cofree Identity a) a
sureIx i = telescoped $ replicate i _Wrapped'
But a game is unlikely to have infinite players.

"Persistently" Impure (IO) Vectors in Haskell, with database-like persistent interface

I have a computation that is best described as iterative mutations on a vector; the final result is the final state of the vector.
The "idiomatic" approach to making this functional, I think, is to simply pass on a new vector object along whenever it is "modified". So your iterative method would be operate_on_vector :: Vector -> Vector, which takes in a vector and outputs the modified vector, which is then fed through the method again.
This method is pretty straightforward and I had no problems implementing it, even being new to Haskell.
Alternatively, one could encapsulate all of this in a State monad and pass along a constantly re-created and modified vector as the state value.
However, I suffer a huge, huge performance cost, as these calculations are pretty intensive, the iterations many (on the order of millions) and the data vectors can get pretty large (on the order of at least thousands of primitives). Re-creating a new vector in memory at every step of the iteration seems pretty costly, data collection or not.
Then I considered how IO works -- it can be seen as basically like State, except the state value is the "World", which is constantly changing.
Maybe I could use something that is like IO to "operate" on a "world"? And the "world" would be the vector in-memory? Sort of like a database query, but everything is in memory.
For example with io you could do
do
putStrLn "enter something"
something <- getLine
putStrLine $ "you entered " ++ something
which can be seen as "performing" putStrLn and "modifying" the World object, returning a new World object and feeding it into the next function, which queryies the world object for a string that is the result of the modification, and then returns another world object after another modification.
Is there anything like that that can do this for mutable vectors?
do
putInVec 0 9 -- index 0, value 9
val <- getFromVec 0
putInVec 0 (val + 1)
, with "impure" "mutable" vectors, instead of passing along a new modified vector at each step.

I believe you can do this using mutable vector and a thin wrapper over Reader + ST (or IO) monad.
It can look like this:
type MyVector = IOVector $x -- Use your own elements type here instead of $x
newtype VectorIO a = VectorIO (ReaderT MyVector IO a) deriving (Monad, MonadReader, MonadIO)
-- You will need GeneralizedNewtypeDeriving extension here
-- Run your computation over an existing vector
runComputation :: MyVector -> VectorIO a -> IO MyVector
runComputation vector (VectorIO action) = runReaderT action vector >> return vector
-- Run your computation over a new vector of the specified length
runNewComputation :: Int -> VectorIO a -> IO MyVector
runNewComputation n action = do
vector <- new n
runComputation vector action
putInVec :: Int -> $x -> VectorIO ()
putInVec idx val = do
v <- ask
liftIO $ write v idx val
getFromVec :: Int -> VectorIO $x
getFromVec idx = do
v <- ask
liftIO $ read v idx
That's really all. You can use VectorIO monad to perform your computations, just like you wanted in your example. If you do not want IO but want pure computations, you can use ST monad; modifications to the code above will be trivial.
Update
Here is an ST-based version:
{-# LANGUAGE GeneralizedNewtypeDeriving, FlexibleInstances, MultiParamTypeClasses, Rank2Types #-}
module Main where
import Control.Monad
import Control.Monad.Trans.Class
import Control.Monad.Reader
import Control.Monad.Reader.Class
import Control.Monad.ST
import Data.Vector as V
import Data.Vector.Mutable as MV
-- Your type of the elements
type E = Int
-- Mutable vector which will be used as a context
type MyVector s = MV.STVector s E
-- Immutable vector compatible with MyVector in its type
type MyPureVector = V.Vector E
-- Simple monad stack consisting of a reader with the mutable vector as a context
-- and of an ST action
newtype VectorST s a = VectorST (ReaderT (MyVector s) (ST s) a) deriving Monad
-- Make the VectorST a reader monad
instance MonadReader (MyVector s) (VectorST s) where
ask = VectorST $ ask
local f (VectorST a) = VectorST $ local f a
reader = VectorST . reader
-- Lift an ST action to a VectorST action
liftST :: ST s a -> VectorST s a
liftST = VectorST . lift
-- Run your computation over an existing vector
runComputation :: MyVector s -> VectorST s a -> ST s (MyVector s)
runComputation vector (VectorST action) = runReaderT action vector >> return vector
-- Run your computation over a new vector of the specified length
runNewComputation :: Int -> VectorST s a -> ST s (MyVector s)
runNewComputation n action = do
vector <- MV.new n
runComputation vector action
-- Run a computation on a new mutable vector and then freeze it to an immutable one
runComputationPure :: Int -> (forall s. VectorST s a) -> MyPureVector
runComputationPure n action = runST $ do
vector <- runNewComputation n action
V.unsafeFreeze vector
-- Put an element into the current vector
putInVec :: Int -> E -> VectorST s ()
putInVec idx val = do
v <- ask
liftST $ MV.write v idx val
-- Retrieve an element from the current vector
getFromVec :: Int -> VectorST s E
getFromVec idx = do
v <- ask
liftST $ MV.read v idx

How to correctly qualify types for working with the (transformed) ST and random monads

Here is my code:
...
import System.Random ( RandomGen, next, split )
import qualified Data.Array.MArray as MAI
import Data.Array.ST.Safe( STUArray )
import Control.Monad.ST.Safe(ST)
import qualified Control.Monad.Random as CMR
import Control.Monad.Trans.Class( lift )
data GraphEdgeYaml = GraphEdgeYaml {
specie1:: NodeName,
specie2 :: NodeName,
sign :: Int,
speed :: Int
}
type LinksSTA s = STUArray s Int GraphEdgeYaml
-- Change a simple link
swapLink :: RandomGen g =>
LinksSTA s
-> g
-> ST s g
swapLink graph generator =
let
swap_op :: CMR.RandT g (ST s) ()
swap_op = do
(low_limit, high_limit) <- lift $ MAI.getBounds graph
idx_value <- CMR.getRandomR (low_limit, high_limit)
return ()
in do
(_, new_generator) <- CMR.runRandT swap_op generator
return new_generator
and here is the error message that I get:
hs/SignMatrixBuild/Randomize.hs:43:26:
Could not deduce (RandomGen g1)
arising from a use of `CMR.getRandomR'
from the context (RandomGen g)
bound by the type signature for
swapLink :: RandomGen g => LinksSTA s -> g -> ST s g
at hs/SignMatrixBuild/Randomize.hs:(38,1)-(47,28)
Possible fix:
add (RandomGen g1) to the context of
the type signature for swap_op :: CMR.RandT g1 (ST s1) ()
or the type signature for
swapLink :: RandomGen g => LinksSTA s -> g -> ST s g
In a stmt of a 'do' block:
idx_value <- CMR.getRandomR (low_limit, high_limit)
In the expression:
do { (low_limit, high_limit) <- lift $ MAI.getBounds graph;
idx_value <- CMR.getRandomR (low_limit, high_limit);
return () }
In an equation for `swap_op':
swap_op
= do { (low_limit, high_limit) <- lift $ MAI.getBounds graph;
idx_value <- CMR.getRandomR (low_limit, high_limit);
return () }
How do I fix this?

One way to fix it is to bring the type variables s and g into scope using the ScopedTypeVariables extension, the other is to simply omit the local type signature on swap_op.
If the local signature is omitted, the type can be inferred -- that leaves, however, the problem of the constraint
MAI.MArray (STUArray s) GraphEdgeYaml (ST s)
that is needed. There are two options,
change the array type to STArray
use STUArrays indeed
If you change the array type to STArray, no constraint is needed (since there is an instance MArray (STArray s) e (ST s) that covers all element types), and without the local type signature it compiles without problems.
If you want to keep the STUArrays, the function can only be used where an instance is in scope. It is best, usually, to provide such instances either where the class is defined (not an option here) or where the type is defined (that would be this module).
So then you should write an
instance MAI.MArray (STUArray s) GraphEdgeYaml (ST s)
in that module, with that instance, the constraint would be fulfilled and need not be placed on the function. Note, however, that writing such an instance is not trivial.
Alternatively, you could add the constraint to the signature and load off the burden of defining an (orphan) instance to the user of swapLink.
I don't know what a NodeName is, but whether GraphEdgeYaml is an unboxable type seems doubtful. I would therefore recommend switching to STArrays.

Monadic creation of vectors (or: can someone type annotate this for me?)

I came across the following piece of code as part of this Redddit sub-thread discussing an implementation of the Fisher-Yates shuffle:
randomIs g n = fill g 0
where
v = enumFromN 0 n
fill g i = when (i < n) $ do
let (x,g') = randomR (i, n-1) g
G.swap v i x
fill g' (i+1)
(I guess G refers to Data.Vector.Generic.Mutable... right?). Having never created vectors monadically before, I'm struggling to grasp this, especially with no type annotations. Doesn't v have type Data.Vector Int? How come one can pass it to G.swap then? Won't it have to be thawed first?
I might have just misunderstood Data.Vector.Generic, but if someone could clarify the above (by adding type annotations, perhaps?), I'd appreciate it.
Addendum: Here's my own attempt at adding type annotations:
import qualified Data.Vector.Unboxed as UVect
import qualified Data.Vector.Unboxed.Mutable as UMVect
import qualified System.Random as R
import Control.Monad
import Control.Monad.ST
randomPermutation :: forall a. (R.RandomGen a) => a -> Int -> UVect.Vector Int
randomPermutation g n = runST newVect
where
newVect :: ST s (UVect.Vector Int)
newVect = UVect.unsafeThaw (UVect.enumFromN 0 n) >>= \v ->
fill v 0 g >>
UVect.unsafeFreeze v
fill x i gen = when (i < n) $
let (j, gen') = R.randomR (i, n-1) gen in
UMVect.unsafeSwap x i j >>
fill x (i+1) gen'
As you can see, I'm avoiding Data.Vector.Generic to rule out the error source caused by perhaps not understanding it right. I'm also doing things in the ST monad.
In my head, the type of fill should be
UMVect.MVector (ST s (UVect.Vector Int)) Int -> Int -> a -> ST s ()
but GHC objects. Any hints? Again: It typechecks if I don't annotate fill.
Sidenote: I'd also like randomPermutation to return the updated random number generator. Thus, I'd need fill to also handle the generator's state. With my current type confusion, I don't see how to do that neatly. Any hints?

The compile error is telling us:
Expected type: ST s (UMVect.MVector (ST s (UVect.Vector Int)) Int)
Actual type: ST s (UMVect.MVector (Control.Monad.Primitive.PrimState (ST s)) Int)
So, changing the type signature of fill to UMVect.MVector (PrimState (ST s)) Int -> Int -> a -> ST s () (adding import Control.Monad.Primitive too) solves the problem!

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string