Haskell FFI - return updated structure


I have the following C function that I want to call from Haskell:
void read_params_for (property_list_t *props);
The function is supposed to receive some property_list_t and populate some values within it, so the caller then has an updated structure.
I have all the necessary wrappers for property_list_t (like Storable, etc.), but I can't figure out how to wrap this function into something like
readParamsFor :: ForeignPtr PropertyListT -> IO (ForeignPtr PropertyListT)
I tried using C2HS, and I also tried writing FFI bindings manually like:
foreign import ccall "read_params_for"
    readParamsFor' :: Ptr PropertyListT -> IO ()
readParamsFor :: ForeignPtr PropertyListT -> IO (ForeignPtr PropertyListT)
readParamsFor ps = do
    withForeignPtr ps $ \ps' -> do
        res <- readParamsFor' ps'
        pl <- newForeignPtr propertyListDestroy ps'
        return pl
But in both cases, I get back my original "underpopulated" list.
How do I get an updated structure back to Haskell?

I realised that there was a bug in a C library that I wanted to use and that indeed, a simple withForeignPtr is sufficient if the bug is not there.
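To illustrate: the C function writes through the pointer it is given, so the ForeignPtr the caller already holds sees the update, and nothing new has to be wrapped or returned. A minimal sketch, assuming a CInt in place of property_list_t and a hypothetical Haskell stand-in for the imported read_params_for (demo is likewise a made-up name):

```haskell
import Foreign.C.Types (CInt)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtr, withForeignPtr)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek, poke)

-- Hypothetical stand-in for the imported C function: it mutates the pointee in place.
readParamsFor' :: Ptr CInt -> IO ()
readParamsFor' p = poke p 42

-- The callee updates memory the ForeignPtr already owns, so the wrapper returns nothing.
readParamsFor :: ForeignPtr CInt -> IO ()
readParamsFor ps = withForeignPtr ps readParamsFor'

-- Start from an "underpopulated" value, call the wrapper, then read the same ForeignPtr.
demo :: IO CInt
demo = do
  ps <- mallocForeignPtr
  withForeignPtr ps (\p -> poke p 0)
  readParamsFor ps
  withForeignPtr ps peek
```

Wrapping ps' in a second newForeignPtr, as in the question, would register propertyListDestroy a second time for the same memory if the original ForeignPtr already carries that finalizer, which risks a double free.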

Related

Garbage collector issues in Haskell runtime when (de)allocations are managed in C

I would like to share data (in the simplest case an array of integers) between C and Haskell using Haskell's FFI functionality. The C side creates the data (allocating memory accordingly), but never modifies it until it is freed, so I thought the following method would be "safe":
1. After the data is created, the C function passes the length of the array and a pointer to its start.
2. On the Haskell side, we create a ForeignPtr, setting up a finalizer which calls a C function that frees the pointer.
3. We build a Vector using that foreign pointer which can be (immutably) used in Haskell code.
However, using this approach causes rather non-deterministic crashes. Small examples tend to work, but "once the GC kicks in", I start to get various errors, from segmentation faults to "barf"s in the "evacuation" part of GHC's GC.
What am I doing wrong here? What would be the "right way" of doing something like this?
An Example
I have a C header with the following declarations:
typedef struct CVector {
    const int32_t *pointer;
    size_t length;
} Vector;
void create_c_vector(struct CVector *vector);
void free_buffer(void *buff);
The Haskell code is generated from the following .chs file using c2hs:
import Foreign.C.Types
import Foreign.Concurrent
import Foreign.Marshal.Alloc
import Foreign.Ptr
import Foreign.Storable
import qualified Data.Vector.Storable as V
#include <cvector.h>
data ForeignVector = ForeignVector
    { pointerFV :: Ptr CInt
    , lengthFV  :: CULong
    }
instance Storable ForeignVector where
    sizeOf _ = {#sizeof CVector #}
    alignment _ = {#alignof CVector #}
    peek p =
        ForeignVector
        <$> {#get CVector->pointer #} p
        <*> {#get CVector->length #} p
    poke p (ForeignVector vecP l) =
        do {#set CVector.pointer #} p (castPtr vecP)
           {#set CVector.length #} p l
peekUnit :: Storable a => Ptr () -> IO a
peekUnit = peek . castPtr
{#fun create_c_vector as ^ { alloca- `ForeignVector' peekUnit*} -> `()' #}
{#fun free_buffer as ^ { `Ptr ()' } -> `()' #}
fromForeign :: ForeignVector -> IO (V.Vector CInt)
fromForeign (ForeignVector p l) =
    V.unsafeFromForeignPtr0
    <$> newForeignPtr p (freeBuffer . castPtr $ p)
    <*> pure (fromIntegral l)
createVector :: IO (V.Vector CInt)
createVector = fromForeign =<< createCVector
One particular test I did yielded internal error: evacuate: strange closure type 177 after a few thousand calls to createVector.
PS: Here is why I would like to use Foreign.Concurrent.newForeignPtr instead of the more "standard" Foreign.ForeignPtr.newForeignPtr: In some more complicated cases I am anticipating, while freeing the pointer one should also clean up other things which can potentially depend on parameters that are passed from Haskell. Therefore I would like to have a "finalizer with multiple arguments" and pass a partial application as the actual finalizer. This means that I can't use a pointer to a C function as the finalizer. While I've read that one can cook up the FinalizerPtr required for the finalizer from Haskell functions using a "wrapping" mechanism, according to the documentation, function pointers obtained this way need to be explicitly deallocated with freeHaskellFunPtr and I don't want to do bookkeeping for that.
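The "finalizer with multiple arguments" idea from the PS can be sketched with base alone. This is only an illustration of the finalizer shape, not of the crash: mallocArray and free stand in for the C allocator and free_buffer, and the names release, newTrackedBuffer and demo are hypothetical.

```haskell
import qualified Foreign.Concurrent as FC
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import Foreign.C.Types (CInt)
import Foreign.ForeignPtr (ForeignPtr, finalizeForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (free)
import Foreign.Marshal.Array (mallocArray, peekArray, pokeArray)
import Foreign.Ptr (Ptr)

-- A finalizer with extra arguments: frees the buffer and records the release
-- in a log that was supplied from the Haskell side.
release :: IORef [String] -> String -> Ptr CInt -> IO ()
release logRef label p = do
  free p
  modifyIORef' logRef (label :)

-- Allocate a buffer (stand-in for the C side) and attach the partially
-- applied finalizer as a plain IO action.
newTrackedBuffer :: IORef [String] -> String -> [CInt] -> IO (ForeignPtr CInt)
newTrackedBuffer logRef label xs = do
  p <- mallocArray (length xs)
  pokeArray p xs
  FC.newForeignPtr p (release logRef label p)

demo :: IO ([CInt], [String])
demo = do
  logRef <- newIORef []
  fp <- newTrackedBuffer logRef "buffer-1" [1, 2, 3]
  xs <- withForeignPtr fp (peekArray 3)
  finalizeForeignPtr fp -- run the finalizer now instead of waiting for the GC
  released <- readIORef logRef
  pure (xs, released)
```

Because the finalizer is an ordinary IO action, no FunPtr has to be created or freed for it.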
PPS: Here is a base64-encoded tarball with the complete source code of the example above (including code for an executable that reproduces the aforementioned error):
H4sIAAAAAAAAA+1Ze1PbOhbv3/oUZ0JnSQAb50VmeM1QKNvMwIUpLZ2dbjdRbDnx4liuZAO5vXz3
PUd+hAS4LC2XbmejYYitc3Se0u9Isu8HVqB1KqzxlbAcu27X13kcr796xuZg67Tb5hfb/K95rrec
dqPptFtO85VTb9abzVfQfk4jHmqpTrhCU35Uzrxzv0jz78m/GwaD55wAT8j/xkazTflvdFqL/L9E
uy//Wrk/Z/03Njr4Y9Z/e7H+X6Q9lP/jT2+fbQ781/lvNTbwj/K/4Szy/yLtz/J/KJUIhtG5cBOp
bHekv1MHxWOj1Xoo/41mcz7/HdwIvIIXCeL/ef7H0ktDAZhueybdcDUSSjAWjGOpEshp9r79YRIL
fadbRm6qlIgSZlkwR8x/TxM1P+yYKz3iob0XhtKdJ97Df4aG8UE4NetrysPAD4QHBzzhdj5TCzbg
Gs4ZWwoiN0w9AdvuZcYw2mWMeTgCZn3emX1nAN8glkGUCHV4DrC5CWgU7HfRTYA1CEU0TEZEIdL+
xyMZDZFwg+ZFOKsiV0Bpyn3BBdDB7+LEhx5q/rZEL9KH/Zxn6QYZ0L1hNMa45jzmfZ4pFuICYtjB
R7jjAbXt17s4diiSYpy1m7uFAiAuuFbucGUeFkyxvBCopzrrC8b0FMJart6T5MlUhn1bEVRdrhOK
IQ2q5XrnBtzSCSFj5NzHKEgoxNPEws6uyUW1BtYudE+ATxl3soDYkCtj7NuSn0bgKsET0XN72Syg
2fEvTDCnycct6M+4tQyFvJUbUtGv1pYp2pkoXwnRG6S+L0ox/cycZZhhZ34mtFgTrovqoKITD/fY
9gj+RpIqGAj3EB/Ix8M0MpLoH8+dq9ZqKEnJcW6i4ZtJQs53ni8BM0drM0PmshYXKTu300hzXxxO
eVG1w4p5E4mraTelkCx+k7lehhheQ4yZsECqzbkRmWPZHKMZFqdKkBA5RhPFUPEQLWEsS05uHLp3
jzczLDtw27md7e08vfksYj8bV3+Vdl/9p/P/MQ8i+7sr/mx7pP7X662N4vxXbzjIV281286i/r9E
K+o/pnuu5N/ZEpQUrPaJkqF9IER8Jr7Odx/LiHtFp6nLR4FOHtlJnE10IsZ29+RptZ3pIAwnZ+nY
YAaWyQwITYkuSZEBOl+GXrgM1X/CNUyIr3oNqzCpQR9j0Idmu1OvgUOYZ7BKiTgMXESUYxyPUFQM
X82Z4DYcIYCNKYI5cNWyN9KK9fAStq0Z7qzujU7T5CxRRxFgNRCKMLSyb7g8yCrUJlRgdRX0SF7B
5QODKr8h0QgPoiFg9ZOXVJMiL38Y0jq27Uo2XIuvqcB9Sa8ovZ/JQE0GTqNV0LIR0PcwzTiwj8oV
1vJCc8n3BwYIBXyuO84aNBwHbBvw2flSMHxZoPH/bLsP/2f6bJcPePhDOh7D/412vTz/tZuE/3j8
by7w/yWaya6FUKEDGW0Cpr/B6Az3YRRo8AME2hEi7UCICIYiEsrAE229IObuBR8Ke8LHIQwmMKIO
yCWBYzdbdstGUSRNC7EJoySJ9eb6+jBIRunAduV4Xctw3YxjLOJj5Jm2mUnISgPzZiYq05NIxjrQ
RfcejIMoGOOW8kqqCwJEcc3HMTrh034/gsPDLhihDLFdRHqq8TQdYNeBJOBmgzQIPSvB+pTRzwIS
wsR1orilZapcYVFs9KaBuPdv9w6O39pjz7yZ2/PypHm3y2WofKC4miBNXMdSC8/KynAuD+6pvQAy
wfI8z3jKk5HuYax6xq0exQqrhC6s9AJV8mrl4tNw5FoyTjCYGDbrpAHWJzqSWMOGMYc8zMwLplpy
0/Etj4yU4ZTYGOmS4olYRF5JG3AtzOMalCI84fM0TKyQR8MUJ9AmvOP6QoRhw6k7DIMs3DQxJX4W
mwYBVRzKj0UZv7VJfVpwHg4AWCrRSNJ34vL9bufFM3+bndRPCsxftP7vw/9yPj+Tjkf3/5329PtP
q2Puf7EkLPD/BdqSwcQuzQA4zsHzUw6ebzPwZMwUg1jJf+NUxWk6xqWTUCXQwHHvyfUIcLf793f7
yxqGXA1w7oIrwzA7qSMJuYRKaEuMyphG5EV4kTbrJtPqgtKotFxhcYCzJKstcPThDOodrEk2Y0tL
8IYWG1rG2EkkIBLC05DIbA0CIgDsQw6tKJNuXjIS+ULUfDlh5VLJGp52AncELke4F7gLjsSaqRJ5
xVijkqbSKCJ1/X6fDV0XLD3iCo20JOkpAF1LsPzT7j5MEZ4GoLektE/g3wcEYkOc2GS8K+bMxfji
Ht6brM0amoccjUSHjDMJpJr8qdAFo8eVxwwqVVDShcgCB3RjQmU9C9r73AnWjZCCW3cTMjxAEcTi
IzpplCK+kiX9O6jbXwN5O9xxmtAANu8ZFB6/PjroHXXfvN97/4/e6d6Hd30Q0WWg8HhobjIvMfmk
3F4cC35+e/D7f7mD+XEdj+C/06yX3/+x1NUR/9vIv8D/l2jTjyN4rMfTPX0bmekz99S7jNFuGHco
ePpXqVve1sO3bINLFQHXf9Js9BJYye/8twyNPmtgZ3atv8VuIBu5xdilDLz5W/nqnPyVrLu2lXHf
univmo4VekHqz47jr9oeXf/uj+t4bP1v4Jm/uP/tdOj83647ncX6f4lWrvVKCfiVp61MAwEzixx2
oNWgxV8CAi1S7K2WHTU8yNF3t2qrASuQffksqLWaGetD1aztHGJyDQGKcTJgod1adQt7tgtwgWB1
teCno5nvfw6+4IAgG3Bj/l/OfQHdMYxbM7TSjwK1cDCIUItc+F0Zv308OnpAhjH3ht2wP4UwI5lo
1RzRbhaYtmiLtmh/afsPAHfp2gAuAAA=

Copied and extended from my earlier comment.
You may have a faulty cast or poke. One thing I make a point of doing, both as a defensive guideline and when debugging, is this:
Explicitly annotate the type of everything that can undermine types. That way, you always know what you’re getting. Even if a poke, castPtr, or unsafeCoerce has my intended type now, that may not be stable under code motion. And even if this doesn’t identify the issue, it can at least help think through it.
For example, I was once writing a null terminator into a byte buffer…which corrupted adjacent memory by writing beyond the end, because I was using '\NUL', which is not a char, but a Char—32 bits! The reason was that pokeByteOff is polymorphic: it has type (Storable a) => Ptr b -> Int -> a -> IO (), not … => Ptr a -> ….
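A small sketch of that pitfall, using only base. The helper names (bytesAfter, wideWrite, narrowWrite) are made up for illustration. It fills an 8-byte buffer with 0xFF and then shows how many bytes each write overwrites:

```haskell
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray, pokeArray)
import Foreign.Ptr (Ptr)
import Foreign.Storable (pokeByteOff)

-- Fill an 8-byte buffer with 0xFF, run a write against it, and return the bytes.
bytesAfter :: (Ptr Word8 -> IO ()) -> IO [Word8]
bytesAfter write = allocaBytes 8 $ \buf -> do
  pokeArray buf (replicate 8 0xFF)
  write buf
  peekArray 8 buf

-- Unannotated: '\NUL' is a Char, whose Storable instance is 4 bytes wide in GHC,
-- so four bytes are zeroed even though the buffer is a Ptr Word8.
wideWrite :: IO [Word8]
wideWrite = bytesAfter $ \buf -> pokeByteOff buf 0 '\NUL'

-- Annotated: the value is a Word8, so exactly one byte is written.
narrowWrite :: IO [Word8]
narrowWrite = bytesAfter $ \buf -> pokeByteOff buf 0 (0 :: Word8)
```

Here wideWrite yields [0,0,0,0,255,255,255,255] while narrowWrite yields [0,255,255,255,255,255,255,255]. The only difference is the type annotation on the value being poked.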
This turned out to be the case in your code! Quoth @aclow:
The createVector generated by c2hs was equivalent to something like alloca $ \ ptr -> createCVector'_ ptr >> peek ptr, where createCVector'_ :: Ptr () -> IO (), which meant that alloca allocated only enough space to hold a unit. Changing the in-marshaller to alloca' f = alloca $ f . (castPtr :: Ptr ForeignVector -> Ptr ()) seems to solve the issue.
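The effect described in the quote can be checked without c2hs: alloca sizes its buffer from the pointee type, and the Storable instance for () occupies zero bytes. A rough sketch, with fillSlot as a hypothetical stand-in for a binding that takes Ptr () and a CLong standing in for the struct:

```haskell
import Foreign.C.Types (CLong)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr, castPtr)
import Foreign.Storable (peek, poke, sizeOf)

-- Hypothetical stand-in for a binding that takes an untyped pointer and writes
-- a whole value through it (like createCVector'_ in the quote).
fillSlot :: Ptr () -> IO ()
fillSlot p = poke (castPtr p) (42 :: CLong)

-- Bytes alloca would reserve if its callback received a Ptr ().
unitBytes :: Int
unitBytes = sizeOf ()

-- Bytes alloca reserves when the callback receives a Ptr CLong.
slotBytes :: Int
slotBytes = sizeOf (0 :: CLong)

-- Safe variant: alloca is used at type Ptr CLong (fixed by 'peek p' and the
-- result type), and the cast to Ptr () happens only at the call site.
readSlot :: IO CLong
readSlot = alloca $ \p -> do
  fillSlot (castPtr p)
  peek p
```

Writing alloca $ \p -> fillSlot p >> peek (castPtr p) instead would typecheck just as well, but p would then be a Ptr () and the write would land outside the reserved space.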
Things that turned out not to be the case, but could’ve been:
I’ve encountered a similar crash when a closure was getting corrupted by somebody (read: me) writing beyond an array. If you’re doing any writes without bounds checking, it may be helpful to replace them with checked versions to see if you can get an exception rather than heap corruption. In a way this is what was happening here, except that the write was to the alloca-allocated region, not the array.
Alternatively, consider lifetime issues: whether the ForeignPtr could be getting dropped & freeing the buffer earlier than you expect, giving you a use-after-free. In a particularly frustrating case, I’ve had to use touchForeignPtr to keep a ForeignPtr alive for that reason.
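For the lifetime case, this is roughly what the touchForeignPtr pattern looks like. It is a sketch with a made-up function name (sumWithTouch) and a Haskell-allocated buffer rather than one coming from C:

```haskell
import Foreign.C.Types (CInt)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrArray, touchForeignPtr)
import Foreign.ForeignPtr.Unsafe (unsafeForeignPtrToPtr)
import Foreign.Marshal.Array (peekArray, pokeArray)

sumWithTouch :: IO CInt
sumWithTouch = do
  fp <- mallocForeignPtrArray 4 :: IO (ForeignPtr CInt)
  -- The raw pointer does not keep the ForeignPtr alive by itself.
  let p = unsafeForeignPtrToPtr fp
  pokeArray p [1, 2, 3, 4]
  xs <- peekArray 4 p
  -- Keep fp reachable until here, so its finalizer cannot run before
  -- the reads and writes above have finished.
  touchForeignPtr fp
  pure (sum xs)
```

withForeignPtr wraps the same idea and is usually the simpler choice. The explicit touch is only needed when the raw pointer has to escape that scope.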

Haskell - FFI and Pointers

I'm using the FFI in order to use a function in C that takes a struct and returns the same struct. The references I saw say I have to use pointers to these structures in order to be able to import it into Haskell. So, for example:
data Bar = Bar { a :: Int, b :: Int }
type BarPtr = Ptr (Bar)
foreign import ccall "static foo.h foo"
    f_foo :: BarPtr -> BarPtr
Now I have the problem that I have to be able to use the function. The references I saw had functions of type BarPtr -> IO () and used with, which has signature Storable a => a -> (Ptr a -> IO b) -> IO b, which was OK because they were calling the function inside main.
However, I would like to wrap this function in a library, getting a function of type Bar -> Bar without IO. Is it possible to do this without unsafePerformIO? What's the procedure?

It's not possible to remove IO from the type without using unsafePerformIO. However, it is possible to get a function with the type you want in this case, with some caveats. Specifically the C function "foo" cannot depend upon any global variables, thread-local state, or anything besides the single argument. Also, calling foo(bar) should always provide the same result when bar is unchanged.
I expect that trying to import the C function
bar foo(bar input);
with this call
f_foo :: BarPtr -> BarPtr
will result in a compiler error due to the result type. I think you may need to write a wrapper function (in C):
void wrap_foo(bar *barPtr) {
    bar outp = foo(*barPtr);
    *barPtr = outp;
}
and import it as
f_wrap_foo :: BarPtr -> IO ()
Finally, you would call this imported function with:
fooBar :: Bar -> Bar
fooBar bar = unsafePerformIO $ with bar $ \barPtr -> do
    f_wrap_foo barPtr
    peek barPtr
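For completeness, here is a self-contained sketch of the same pattern that runs without any C code. The Storable instance and wrapFooStandIn are assumptions made for illustration: the instance lays out two Haskell Ints back to back (a real binding must match the C struct's field types and offsets), and the stand-in swaps the two fields where the imported f_wrap_foo would call into C.

```haskell
import Foreign.Marshal.Utils (with)
import Foreign.Ptr (Ptr)
import Foreign.Storable (Storable (..))
import System.IO.Unsafe (unsafePerformIO)

data Bar = Bar { a :: Int, b :: Int } deriving (Eq, Show)

-- Assumed layout: two Ints stored back to back.
instance Storable Bar where
  sizeOf _ = 2 * sizeOf (0 :: Int)
  alignment _ = alignment (0 :: Int)
  peek p = Bar <$> peekByteOff p 0 <*> peekByteOff p (sizeOf (0 :: Int))
  poke p (Bar x y) = pokeByteOff p 0 x >> pokeByteOff p (sizeOf (0 :: Int)) y

-- Hypothetical stand-in for the imported wrap_foo: swaps the fields in place.
wrapFooStandIn :: Ptr Bar -> IO ()
wrapFooStandIn p = do
  Bar x y <- peek p
  poke p (Bar y x)

-- Pure wrapper: marshal the argument to temporary memory, let the callee
-- update it in place, and read the result back.
fooBar :: Bar -> Bar
fooBar bar = unsafePerformIO $ with bar $ \barPtr -> do
  wrapFooStandIn barPtr
  peek barPtr
```

With this stand-in, fooBar (Bar 1 2) evaluates to Bar 2 1. The purity caveats above still apply to whatever real function replaces it.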

Segfault in Haskell LLVM-General code generation

I'm trying to follow along with the LLVM bindings tutorial here, and running into a segfault. The following code works in the sense that it prints a module header to output.ll, but it also segfaults somewhere.
module Main where
import Control.Monad.Error
import LLVM.General.Module
import LLVM.General.Context
import qualified LLVM.General.AST as AST
--Create and write out an empty LLVM module
main :: IO ()
main = writeModule (AST.defaultModule { AST.moduleName = "myModule" })
outputFile :: File
outputFile = File "output.ll"
writeModule :: AST.Module -> IO ()
writeModule mod = withContext $ (\context ->
  liftError $ withModuleFromAST context mod (\m ->
    liftError $ writeLLVMAssemblyToFile outputFile m))
--perform the action, or fail on an error
liftError :: ErrorT String IO a -> IO a
liftError = runErrorT >=> either fail return
I suspect this is related to the following hint from the linked tutorial:
It is very important to remember not to pass or attempt to use resources outside of the bracket as this will lead to undefined behavior and/or segfaults.
I think in this context the "bracket" is implemented by the withContext function, which makes it seem like everything should be handled.
If I change the definition of writeModule to
writeModule mod = do
  assembly <- (withContext $ (\context ->
    liftError $ withModuleFromAST context mod moduleLLVMAssembly))
  putStrLn assembly
that is, instead of writing to a file I just print out the string representation of the LLVM assembly, no segfault occurs.
Does anyone have experience with these bindings? I'm also interested to know about the failure cases for the warning I quoted. That is, how would one "forget" not to use resources outside the bracket? All of the functions that seem to require a Context, well, require one. Isn't this kind of resource scoping issue exactly what Haskell is good at handling for you?
Version information:
llvm-general-3.4.3.0
LLVM version 3.4
Default target: x86_64-apple-darwin13.2.0

It would help if you shared your LLVM and cabal environment. LLVM is notorious for being backwards incompatible with itself, so there might be an issue with using the latest versions of the bindings.
Behind the scenes writeLLVMAssemblyToFile is using a C++ call to do the file IO operation and I speculate that it's holding a reference to the LLVM module as a result of finalizing the file resource.
Try rendering the module to a String using moduleString and only then lifting into the IO monad to call writeFile from Haskell, instead of going through C++ to do the write.
import LLVM.General.Context
import LLVM.General.Module as Mod
import qualified LLVM.General.AST as AST
import Control.Monad.Error
main :: IO ()
main = do
  writeModule (AST.defaultModule { AST.moduleName = "myModule" })
  return ()
writeModule :: AST.Module -> IO (Either String ())
writeModule ast =
  withContext $ \ctx ->
    runErrorT $ withModuleFromAST ctx ast $ \m -> do
      asm <- moduleString m
      liftIO $ writeFile "output.ll" asm
The bindings can still be rather brittle in my experience; you should ask on the issue tracker if the problem persists.
EDIT: This is a workaround for an old version that has been subsequently fixed. See: https://github.com/bscarlet/llvm-general/issues/109

Memoized IO function?

Just curious: how can I rewrite the following function so that its file read happens only once during the program's lifetime?
getHeader :: FilePath -> IO String
getHeader fn = readFile fn >>= return . take 13
The above function is called several times from various functions.
How can I prevent the file from being reopened when the function is called with the same parameter, i.e. the same file name?

I would encourage you to seek a more functional solution, for example by loading the headers you need up front and passing them around in some data structure such as a Map. If explicitly passing it around is inconvenient, you can use a Reader or State monad transformer to handle that for you.
That said, you can accomplish this the way you wanted by using unsafePerformIO to create a global mutable reference to hold your data structure.
import Control.Concurrent.MVar
import qualified Data.Map as Map
import System.IO.Unsafe (unsafePerformIO)
memo :: MVar (Map.Map FilePath String)
memo = unsafePerformIO (newMVar Map.empty)
{-# NOINLINE memo #-}
getHeader :: FilePath -> IO String
getHeader fn = modifyMVar memo $ \m -> do
  case Map.lookup fn m of
    Just header -> return (m, header)
    Nothing     -> do header <- take 13 `fmap` readFile fn
                      return (Map.insert fn header m, header)
I used an MVar here for thread safety. If you don't need that, you might be able to get away with using an IORef instead.
Also, note the NOINLINE pragma on memo to ensure that the reference is only created once. Without this, the compiler might inline it into getHeader, giving you a new reference each time.

The simplest thing is to just call it once at the beginning of main and pass the resulting String around to all the other functions that need it:
main = do
  header <- getHeader "header.dat"  -- placeholder path; getHeader needs its FilePath argument
  bigOldThingOne header
  bigOldThingTwo header

You can use the monad-memo package to wrap any monad in the MemoT transformer. The memo table will be passed implicitly throughout your monadic functions. Then use startEvalMemoT to convert the memoized monad into ordinary IO:
{-# LANGUAGE NoMonomorphismRestriction #-}
import Control.Monad.Memo
getHeader :: FilePath -> IO String
getHeader fn = readFile fn >>= return . take 13
-- | 'memoized' version of getHeader
getHeaderm :: FilePath -> MemoT String String IO String
getHeaderm fn = memo (lift . getHeader) fn
-- | 'memoized' version of Prelude.print
printm a = memo (lift . print) a
-- | This will not print the last "Hello"
test = do
  printm "Hello"
  printm "World"
  printm "Hello"
main :: IO ()
main = startEvalMemoT test

You should not use unsafePerformIO to solve this. The correct way to do exactly what you describe is to create an IORef that holds a Maybe, initially containing Nothing. Then you create an IO function which checks the value, and performs the computation if it is Nothing and stores the result as a Just. If it finds a Just it reuses the value.
All of this requires passing around the IORef reference, which is just as cumbersome as passing around the string itself, which is why everybody directly recommends just passing around the string itself, either explicitly or implicitly using the Reader monad.
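A minimal sketch of the IORef approach just described, kept generic so it runs without a file on disk. The names cachedIn and demo are made up, and the counter only exists to show that the expensive action runs once. Unlike an MVar, this is not safe against concurrent callers.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)

-- Run the action only if the cache is empty; otherwise reuse the stored value.
cachedIn :: IORef (Maybe a) -> IO a -> IO a
cachedIn ref action = do
  cached <- readIORef ref
  case cached of
    Just v -> pure v
    Nothing -> do
      v <- action
      writeIORef ref (Just v)
      pure v

-- Call the cached action twice and report how often it really ran.
demo :: IO (String, String, Int)
demo = do
  runs <- newIORef (0 :: Int)
  cache <- newIORef Nothing
  let expensive = modifyIORef' runs (+ 1) >> pure "header bytes"
  h1 <- cachedIn cache expensive
  h2 <- cachedIn cache expensive
  n <- readIORef runs
  pure (h1, h2, n)
```

For the question's case the action would be getHeader applied to one fixed path, with the cache created once in main and passed to whoever needs the header.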
There are incredibly few legitimate uses for unsafePerformIO and this is not one of them. Don't go down that path, otherwise you'll find yourself fighting Haskell when it keeps doing unexpected things. Every solution that uses unsafePerformIO as a "clever trick" always ends catastrophically (and that includes readFile).
Side note - you can simplify your getHeader function:
getHeader path = fmap (take 13) (readFile path)
Or
getHeader path = take 13 <$> readFile path

Haskell FFI: Top-level FunPtr to a top-level function?

It seems desirable to create a FunPtr to a top-level function just once instead of creating a new one (to the same function) whenever it's needed and dealing with its deallocation.
Am I overlooking some way to obtain the FunPtr other than foreign import ccall "wrapper"? If not, my workaround would be as in the code below. Is that safe?
type SomeCallback = CInt -> IO ()
foreign import ccall "wrapper" mkSomeCallback :: SomeCallback -> IO (FunPtr SomeCallback)
f :: SomeCallback
f i = putStrLn ("It is: "++show i)
{-# NOINLINE f_FunPtr #-}
f_FunPtr :: FunPtr SomeCallback
f_FunPtr = unsafePerformIO (mkSomeCallback f)
Edit: Verified that the "creating a new one every time" variant (main = forever (mkSomeCallback f)) does in fact leak memory if one doesn't freeHaskellFunPtr it.

This should, in principle, be safe - GHC internal code uses a similar pattern to initialize singletons such as the IO watched-handles queues. Just keep in mind that you have no control over when mkSomeCallback runs, and don't forget the NOINLINE.
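A runnable sketch of the pattern from the question, for anyone who wants to try it without a C library. The "dynamic" import (callSomeCallback) is an addition that stands in for the C code that would normally invoke the callback, and demo is a made-up name.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CInt)
import Foreign.Ptr (FunPtr, nullFunPtr)
import System.IO.Unsafe (unsafePerformIO)

type SomeCallback = CInt -> IO ()

foreign import ccall "wrapper"
  mkSomeCallback :: SomeCallback -> IO (FunPtr SomeCallback)

-- Turns a FunPtr back into a callable function; used here in place of C code.
foreign import ccall "dynamic"
  callSomeCallback :: FunPtr SomeCallback -> SomeCallback

f :: SomeCallback
f i = putStrLn ("It is: " ++ show i)

-- Created at most once, on first use; NOINLINE keeps GHC from duplicating it.
{-# NOINLINE fFunPtr #-}
fFunPtr :: FunPtr SomeCallback
fFunPtr = unsafePerformIO (mkSomeCallback f)

-- Invoke the callback twice through the same top-level pointer.
demo :: IO Bool
demo = do
  callSomeCallback fFunPtr 1
  callSomeCallback fFunPtr 2
  pure (fFunPtr /= nullFunPtr)
```

The pointer is never passed to freeHaskellFunPtr, which is acceptable only because it is meant to live for the whole run of the program.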
