Generate stateful function pointer for FFI testing

Generate stateful function pointer for FFI testing - haskell

I want to generate stateful functions (C signature T f()) with QuickCheck as arguments for foreign functions. Preferably I also want to make them (or their inner s->(T,s)) showable.
I know that for stateless functions I can write something like
type Compare = CInt -> CInt -> CBool
foreign import ccall "wrapper"
mkCompare :: Compare -> IO (FunPtr Compare)
but I got stuck trying this approach for the stateful functions, as I don't see how I can translate the monad hiding the state in Haskell to the function hiding the state in C.
Example of a stateful function f
static int i = 0;
int f() {
return i++;
}
In Haskell I would represent this function as State (\s -> (s,s+1)).
What does QuickCheck have to do with it?
If I have a C function that takes a stateful function as argument, e.g.
int twice(int (*f)()) {
f();
return f();
}
then I can test the function with QuickCheck, where QuickCheck can generate different implementations for f, which would usually look similar to
prop_total (Fun f) xs = total $ g f xs
but these generated functions are stateless, not stateful like the example C function above.

Thanks to Daniel Wagner's suggestion in the comments, I could figure it out. The Show instance comes for free. Here's a minimal example that I ran with
gcc -fPIC -shared -o libstateful.dylib stateful.c && ghc -L. Stateful.hs -lstateful && ./Stateful
As expected it will output a distribution of about 50% of 1 (True) and 50% of 0 (False).
Stateful.hs
import Data.IORef
import Foreign.C
import Foreign.Ptr
import System.IO.Unsafe
import Test.QuickCheck
instance CoArbitrary CInt where
coarbitrary = coarbitraryIntegral
instance Arbitrary CBool where
arbitrary = chooseEnum (0,1)
shrink 1 = [0]
shrink 0 = []
instance Function CInt where
function = functionIntegral
type Generator = IO CBool
foreign import ccall "wrapper" mkGenerator :: Generator -> IO (FunPtr Generator)
foreign import ccall "changes" changes :: FunPtr Generator -> IO CBool
type StateFn = CInt -> (CBool,CInt)
stateFnToIORef :: StateFn -> IORef CInt -> IO CBool
stateFnToIORef f s_ref = do
s <- readIORef s_ref
let (a,s') = f s
writeIORef s_ref s'
pure a
prop_changes :: Fun CInt (CBool,CInt) -> Property
prop_changes (Fn f) = unsafePerformIO (do
x_ref <- newIORef 0
f' <- mkGenerator $ stateFnToIORef f x_ref
res <- changes f'
pure $ collect res (total res))
main :: IO ()
main = quickCheck prop_changes
stateful.c
_Bool changes(_Bool (*f)()) {
_Bool x = f();
_Bool y = f();
return x != y;
}

Related

How do I pass Haskell data through a C FFI as an opaque data type?

I'm trying to pass some data through a C library that doesn't read or modify that data.
foreign import ccall "lua.h lua_pushlightuserdata"
c_lua_pushlightuserdata :: LuaState -> Ptr a -> IO ()
foreign import ccall "lua.h lua_touserdata"
c_lua_touserdata :: LuaState -> CInt -> IO (Ptr a)
data MyData =
MyData
{ dataIds = TVar [Int]
, dataSomethingElse = [String]
}
calledFromRunLuaState :: LuaState -> IO ()
calledFromRunLuaState luaState = do
dataPtr <- c_lua_touserdata luaState (-1)
myData <- dataFromPtr dataPtr
doSomethingWith myData
main = do
luaState <- Lua.newstate
ids <- atomically $ newTVar []
c_lua_pushlightuserdata luaState (dataToPtr (MyData ids []))
runLuaState luaState
I'm trying to figure out how to define dataFromPtr and dataToPtr.

This is what StablePtr is for. So you have to use newStablePtr in place of your dataFromPtr and deRefStablePtr in place of your dataToPtr. Note that deRefStablePtr operates in the IO monad, so you would have to ajust your code accordingly. Also, you would have to adjust the foreign imports to use stable pointers, e.g.:
foreign import ccall "lua.h lua_pushlightuserdata"
c_lua_pushlightuserdata :: LuaState -> StablePtr MyData -> IO ()
and similarly for lua_touserdata.
Finally, when you create a stable pointer with newStablePtr, the garbage collector will not deallocate that value automatically. So it is your responsibility to deallocate it by calling freeStablePtr.

What are good Haskell conventions for managing deeply nested bracket patterns?

I am currently working with Haskell bindings to a HDF5 C library. Like many C libraries, this one uses many pointers in its functions calls.
The usual "best practice" Haskell functions for allocating and releasing C resources follow the bracket pattern, like alloca, withArray, etc. In using them, I often enter several nested brackets. For instance, here is a small excerpt for HDF5 bindings:
selectHyperslab rID dName = withDataset rID dName $ \dID -> do
v <- withDataspace 10 $ \dstDS -> do
srcDS <- c'H5Dget_space dID
dat <- alloca3 (0, 1, 10) $ \(start, stride, count) -> do
err <- c'H5Sselect_hyperslab srcDS c'H5S_SELECT_SET start stride count nullPtr
-- do some work ...
return value
alloca3 (a, b, c) action =
alloca $ \aP -> do
poke aP a
alloca $ \bP -> do
poke bP b
alloca $ \cP -> do
poke cP c
action (aP, bP, cP)
In the code above, the nested brackets are bracket functions I wrote withDataset, withDataspace, and alloca3, which I wrote to prevent the bracket nesting from going another 3 levels deep in the code. For C libraries with lots of resource acquisition calls and pointer arguments, coding with the standard bracket primitives can get unmanageable (which is why I wrote alloca3 to reduce the nesting.)
So generally, are there any best practices or coding techniques to help reduce the nesting of brackets when needing to allocate and deallocate many resources (such as with C calls)? The only alternative I have found is the ResourceT transformer, which from the tutorial looks like it is designed to make interleaving resource acquire/release possible, and not to simplify the bracket pattern.

Recently I was investigating this problem in Scala. The recurring pattern is (a -> IO r) -> IO r, where a given function is executed within some resource allocation context given a value of type a. And this is just ContT r IO a, which is readily available in Haskell. So we can write:
import Control.Monad
import Control.Monad.Cont
import Control.Monad.IO.Class
import Control.Exception (bracket)
import Foreign.Ptr (Ptr)
import Foreign.Storable (Storable)
import Foreign.Marshal.Alloc (alloca)
allocaC :: Storable a => ContT r IO (Ptr a)
allocaC = ContT alloca
bracketC :: IO a -> (a -> IO b) -> ContT r IO a
bracketC start end = ContT (bracket start end)
bracketC_ :: IO a -> IO b -> ContT r IO a
bracketC_ start end = ContT (bracket start (const end))
-- ...etc...
-- | Example:
main :: IO ()
main = flip runContT return $ do
bracketC_ (putStrLn "begin1") (putStrLn "end1")
bracketC_ (putStrLn "begin2") (putStrLn "end2")
liftIO $ putStrLn "..."
The standard monad/applicative functions allow you to simplify a lot of your code, for example:
allocAndPoke :: (Storable a) => a -> ContT r IO (Ptr a)
allocAndPoke x = allocaC >>= \ptr -> liftIO (poke ptr x) >> return ptr
-- With the monad alloca3 won't be probably needed, just as an example:
alloca3C (a, b, c) =
(,,) <$> allocAndPoke a <*> allocAndPoke b <*> allocAndPoke c
allocaManyC :: (Storable a) => [a] -> ContT r IO [Ptr a]
allocaManyC = mapM allocAndPoke

"Persistently" Impure (IO) Vectors in Haskell, with database-like persistent interface

I have a computation that is best described as iterative mutations on a vector; the final result is the final state of the vector.
The "idiomatic" approach to making this functional, I think, is to simply pass on a new vector object along whenever it is "modified". So your iterative method would be operate_on_vector :: Vector -> Vector, which takes in a vector and outputs the modified vector, which is then fed through the method again.
This method is pretty straightforward and I had no problems implementing it, even being new to Haskell.
Alternatively, one could encapsulate all of this in a State monad and pass along a constantly re-created and modified vector as the state value.
However, I suffer a huge, huge performance cost, as these calculations are pretty intensive, the iterations many (on the order of millions) and the data vectors can get pretty large (on the order of at least thousands of primitives). Re-creating a new vector in memory at every step of the iteration seems pretty costly, data collection or not.
Then I considered how IO works -- it can be seen as basically like State, except the state value is the "World", which is constantly changing.
Maybe I could use something that is like IO to "operate" on a "world"? And the "world" would be the vector in-memory? Sort of like a database query, but everything is in memory.
For example with io you could do
do
putStrLn "enter something"
something <- getLine
putStrLine $ "you entered " ++ something
which can be seen as "performing" putStrLn and "modifying" the World object, returning a new World object and feeding it into the next function, which queryies the world object for a string that is the result of the modification, and then returns another world object after another modification.
Is there anything like that that can do this for mutable vectors?
do
putInVec 0 9 -- index 0, value 9
val <- getFromVec 0
putInVec 0 (val + 1)
, with "impure" "mutable" vectors, instead of passing along a new modified vector at each step.

I believe you can do this using mutable vector and a thin wrapper over Reader + ST (or IO) monad.
It can look like this:
type MyVector = IOVector $x -- Use your own elements type here instead of $x
newtype VectorIO a = VectorIO (ReaderT MyVector IO a) deriving (Monad, MonadReader, MonadIO)
-- You will need GeneralizedNewtypeDeriving extension here
-- Run your computation over an existing vector
runComputation :: MyVector -> VectorIO a -> IO MyVector
runComputation vector (VectorIO action) = runReaderT action vector >> return vector
-- Run your computation over a new vector of the specified length
runNewComputation :: Int -> VectorIO a -> IO MyVector
runNewComputation n action = do
vector <- new n
runComputation vector action
putInVec :: Int -> $x -> VectorIO ()
putInVec idx val = do
v <- ask
liftIO $ write v idx val
getFromVec :: Int -> VectorIO $x
getFromVec idx = do
v <- ask
liftIO $ read v idx
That's really all. You can use VectorIO monad to perform your computations, just like you wanted in your example. If you do not want IO but want pure computations, you can use ST monad; modifications to the code above will be trivial.
Update
Here is an ST-based version:
{-# LANGUAGE GeneralizedNewtypeDeriving, FlexibleInstances, MultiParamTypeClasses, Rank2Types #-}
module Main where
import Control.Monad
import Control.Monad.Trans.Class
import Control.Monad.Reader
import Control.Monad.Reader.Class
import Control.Monad.ST
import Data.Vector as V
import Data.Vector.Mutable as MV
-- Your type of the elements
type E = Int
-- Mutable vector which will be used as a context
type MyVector s = MV.STVector s E
-- Immutable vector compatible with MyVector in its type
type MyPureVector = V.Vector E
-- Simple monad stack consisting of a reader with the mutable vector as a context
-- and of an ST action
newtype VectorST s a = VectorST (ReaderT (MyVector s) (ST s) a) deriving Monad
-- Make the VectorST a reader monad
instance MonadReader (MyVector s) (VectorST s) where
ask = VectorST $ ask
local f (VectorST a) = VectorST $ local f a
reader = VectorST . reader
-- Lift an ST action to a VectorST action
liftST :: ST s a -> VectorST s a
liftST = VectorST . lift
-- Run your computation over an existing vector
runComputation :: MyVector s -> VectorST s a -> ST s (MyVector s)
runComputation vector (VectorST action) = runReaderT action vector >> return vector
-- Run your computation over a new vector of the specified length
runNewComputation :: Int -> VectorST s a -> ST s (MyVector s)
runNewComputation n action = do
vector <- MV.new n
runComputation vector action
-- Run a computation on a new mutable vector and then freeze it to an immutable one
runComputationPure :: Int -> (forall s. VectorST s a) -> MyPureVector
runComputationPure n action = runST $ do
vector <- runNewComputation n action
V.unsafeFreeze vector
-- Put an element into the current vector
putInVec :: Int -> E -> VectorST s ()
putInVec idx val = do
v <- ask
liftST $ MV.write v idx val
-- Retrieve an element from the current vector
getFromVec :: Int -> VectorST s E
getFromVec idx = do
v <- ask
liftST $ MV.read v idx

FFI Haskell Callback with State

My question is about how to write friendly Haskell Interfaces that model callbacks which can be invoked from C code. Callbacks are addressed here (HaskellWiki), however, I believe this question is more complex than the example from that link.
Suppose we have C code, requiring callbacks and the header looks like the following:
typedef int CallbackType(char* input, char* output, int outputMaxSize, void* userData)
int execution(CallbackType* caller);
In this case the function execution takes a callback function and will use that to process new data, essentially a closure. The call back expects an input string, an output buffer which has been allocated with size outputMaxSize and the userData pointer, which can be casted however inside the callback.
We do similar things in haskell, when we pass around closures with MVars, so we can still communicate. Therefore when we write the Foreign interface, we'd like to keep this sort of type.
Specifically here is what the FFI Code might look like:
type Callback = CString -> CString -> CInt -> Ptr () -> IO CInt
foreign import ccall safe "wrapper"
wrap_callBack :: Callback -> IO (FunPtr Callback)
foreign import ccall safe "execution"
execute :: FunPtr Callback -> IO CInt
Users should be able to do this sort of thing, but it feels like a poor interface since
they need to write callbacks with type Ptr (). Rather we'd like to replace this with MVars
which feel more natural. So we'd like to write a function:
myCallback :: String -> Int -> MVar a -> (Int, String)
myCallback input maxOutLength data = ...
In order to convert to C, we'd like to have a function like:
castCallback :: ( String -> Int -> MVar a -> (Int, String) )
-> ( CString -> CString -> CInt -> Ptr () -> IO CInt )
main = wrap_callBack (castCallback myCallback) >>= execute
In this case castCallback is for the most part not hard to implement,
convert string -> cstring, Int -> CInt, and copy over the output string.
The hard part however is resolving the MVar to Ptr, which is not necessarily storable.
My Question is what is the best way to go about writing callback code in Haskell, which can still be communicated with.

If you want to access a Haskell structure like MVar which doesn't have a library function to convert it to a pointer representation (meaning it is not supposed to be passed to C), then you need to do partial function application.
In the partial function application, the trick is to build a partial function with MVar already applied, and pass the pointer to that function to C. C will then call it back with the object to put in MVar. An example code below (all the code below is derived from something I did before - I modified it for examples here but haven't tested the modifications):
-- this is the function that C will call back
syncWithC :: MVar CInt -> CInt -> IO ()
syncWithC m x = do
putMVar m x
return ()
foreign import ccall "wrapper"
syncWithCWrap :: (CInt -> IO ()) -> IO (FunPtr (CInt -> IO ()))
main = do
m <- newEmptyMVar
-- create a partial function with mvar m already applied. Pass to C. C will back with CInt
f <- syncWithCWrap $ syncWithC m
What if your MVar object is more complex? Then you need to build a Storable instance of the MVar object if it doesn't exist. For example, if I want to use an MVar with array of pair of Ints, then first define a Storable instance of Int pairs (SV is Storable Vector, MSV is Storable Mutable Vector):
data VCInt2 = IV2 {-# UNPACK #-} !CInt
{-# UNPACK #-} !CInt
instance SV.Storable VCInt2 where
sizeOf _ = sizeOf (undefined :: CInt) * 2
alignment _ = alignment (undefined :: CInt)
peek p = do
a <- peekElemOff q 0
b <- peekElemOff q 1
return (IV2 a b)
where q = castPtr p
{-# INLINE peek #-}
poke p (IV2 a b) = do
pokeElemOff q 0 a
pokeElemOff q 1 b
where q = castPtr p
{-# INLINE poke #-}
Now, you can just pass a pointer to the vector to C, have it update the vector, and call back the void function with no arguments (since C is already filling the vector). This also avoid expensive data marshalling by sharing memory between Haskell and C.
-- a "wrapper" import is a converter for converting a Haskell function to a foreign function pointer
foreign import ccall "wrapper"
syncWithCWrap :: IO () -> IO (FunPtr (IO ()))
-- call syncWithCWrap on syncWithC with both arguments applied
-- the result is a function with no arguments. Pass the function, and
-- pointer to x to C. Have C fill in x first, and then call back syncWithC
-- with no arguments
syncWithC :: MVar (SV.Vector VCInt2) -> MSV.IOVector VCInt2 -> IO ()
syncWithC m1 x = do
SV.unsafeFreeze x >>= putMVar m1
return ()
On C side, you will need a struct declaration for VCInt2 so that it knows how to parse it:
/** Haskell Storable Vector element with two int members **/
typedef struct vcint2{
int a;
int b;
} vcint2;
So, on C side, you are passing it vcint2 pointer for MVar object.

Converting a monadic function to an IO monadic function

parseSource :: String -> Either ParserError Mod.Module
parseSource src = do
(imports, rest) <- parseImports (Lex.lexSource src)
bindings <- mapM parseBinding rest
buildModule imports bindings
I need to make the above return an IO (Either ParserError Mod.Module) as the buildModule statement at the end will need to perform some IO functions (reading files). The problem i have is that when i make it an IO function, i can no longer do the bind(wrong term?) <- operations.
What is the simplest way to make this work?

Take a look at defining your problem in terms of ErrorT ParseError IO.

I couldn't find a combinator to lift a pure Either computation into the ErrorT monad, so I wrote one called liftError. I fleshed out your example with dummy types and implementations. The main runs the parser twice, once with input that throws a ParserError, and once which succeeds with an IO side-effect. In order for ErrorT ParserError IO to be a Monad, ParserError must be an instance of Error (so that it is possible to implement fail).
import Control.Monad.Error
type ParserMonad = ErrorT ParserError IO
data ParserError = ParserError1 | ParserError2 | ParserError3
deriving(Show)
data Module = Module
deriving(Show)
data Import = Import
deriving(Show)
data Binding = Binding
deriving(Show)
instance Error ParserError where
noMsg = undefined
-- lift a pure Either into the ErrorT monad
liftError :: Monad m => Either e a -> ErrorT e m a
liftError = ErrorT . return
parseSource :: String -> ParserMonad Module
parseSource src = do
(imports, rest) <- liftError $ parseImports (lexSource src)
bindings <- liftError $ mapM parseBinding rest
buildModule imports bindings
lexSource :: String -> [String]
lexSource = return
parseImports :: [String] -> Either ParserError ([Import], [String])
parseImports toks = do{ when (null toks) $ throwError ParserError1
; return ([Import], toks)
}
parseBinding :: String -> Either ParserError Binding
parseBinding b = do{ when (b == "hello") $ throwError ParserError2
; return Binding
}
buildModule :: [Import] -> [Binding] -> ParserMonad Module
buildModule i b = do{ liftIO $ print "hello"
; when (null b) $ throwError ParserError3
; return Module
}
main = mapM (runErrorT . parseSource) ["hello", "world"]

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string