Using alloca- with c2hs

Consider from the c2hs documentation that this:
{#fun notebook_query_tab_label_packing as ^
  `(NotebookClass nb, WidgetClass cld)' =>
  {notebook `nb'                ,
   widget   `cld'               ,
   alloca-  `Bool'     peekBool*,
   alloca-  `Bool'     peekBool*,
   alloca-  `PackType' peekEnum*} -> `()'#}
generates in Haskell
notebookQueryTabLabelPacking :: (NotebookClass nb, WidgetClass cld)
                             => nb -> cld -> IO (Bool, Bool, PackType)
which binds the following C function:
void gtk_notebook_query_tab_label_packing (GtkNotebook *notebook,
                                           GtkWidget   *child,
                                           gboolean    *expand,
                                           gboolean    *fill,
                                           GtkPackType *pack_type);
Problem: I'm confused about what effect alloca- has when it appears to the left of a type, i.e., `Bool'.
Now I know that, in practice, what is happening is that alloca is somehow generating a Ptr Bool that peekBool can convert into an output argument. But what I'm terribly confused about is how alloca is doing this, given its type signature alloca :: Storable a => (Ptr a -> IO b) -> IO b. More specifically:
Question 1. When someone calls notebookQueryTabLabelPacking in Haskell, what does c2hs provide as argument for alloca's first parameter (Ptr a -> IO b)?
Question 2. In this case, what is the concrete type signature for alloca's first parameter (Ptr a -> IO b)? Is it (Ptr CBool -> IO CBool)?

c2hs will call alloca 3 times to allocate space for the 3 reference arguments. Each call will be provided a lambda that binds the pointer to a name, and returns an IO action that either allocates the space for the next reference argument, or, when all space has been allocated, calls the underlying function, reads the values of the reference arguments, and packages them into a tuple. Something like:
notebookQueryTabLabelPacking nb cld =
  alloca $ \a3 -> do
    alloca $ \a4 -> do
      alloca $ \a5 -> do
        gtk_notebookQueryTabLabelPacking nb cld a3 a4 a5
        a3' <- peekRep a3
        a4' <- peekRep a4
        a5' <- peekRep a5
        return (a3', a4', a5')
Each alloca is nested in the IO action of the previous one.
Note that the lambdas all take a pointer to their allocated storage, and return an IO (Bool, Bool, PackType).
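The nesting can be tried without GTK. The following is a minimal sketch, not the code c2hs actually emits: fakeQueryPacking is a hypothetical Haskell stand-in for the imported C function, gboolean is modelled as CInt, and the pack type is read back as a plain Int rather than an enum. It shows that every lambda passed to alloca has type Ptr CInt -> IO (Bool, Bool, Int), so b is the type of the final tuple, not the type of the pointee.

```haskell
import Foreign.C.Types (CInt)
import Foreign.Marshal.Alloc (alloca)
import Foreign.Marshal.Utils (fromBool, toBool)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek, poke)

-- Hypothetical stand-in for the C function: it only writes through
-- its three out-pointers, as the real function would.
fakeQueryPacking :: Ptr CInt -> Ptr CInt -> Ptr CInt -> IO ()
fakeQueryPacking expand fill packType = do
  poke expand (fromBool True)
  poke fill (fromBool False)
  poke packType 1

-- Each alloca receives a lambda of type Ptr CInt -> IO (Bool, Bool, Int).
-- The innermost lambda calls the function and peeks the results.
queryPacking :: IO (Bool, Bool, Int)
queryPacking =
  alloca $ \pExpand ->
  alloca $ \pFill ->
  alloca $ \pPack -> do
    fakeQueryPacking pExpand pFill pPack
    expand <- toBool <$> peek pExpand
    fill   <- toBool <$> peek pFill
    pack   <- fromIntegral <$> peek pPack
    return (expand, fill, pack)
```

Running queryPacking in GHCi returns the tuple built from the values the stand-in wrote. The storage for each pointer is released when its alloca returns, which is why the peeks happen inside the innermost lambda.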

Related

What is the IO type in Haskell

I am new to the Haskell programming language, and I keep stumbling on the IO type, either as a function parameter or as a return type.
playGame :: Screen -> IO ()
OR
gameRunner :: IO String -> (String -> IO ()) -> Screen -> IO ()
How does this work? I am a bit confused, because I know a String expects words and an Int expects numbers. What does the IO used in functions expect or return?

IO is how Haskell differentiates between code that is referentially transparent and code that is not. IO a is the type of an IO action that returns an a.
You can think of an IO action as a piece of code with some effect on the real world that waits to get executed. Because of this side effect, an IO action is not referentially transparent; therefore, execution order matters. It is the task of the main function of a Haskell program to properly sequence and execute all IO actions. Thus, when you write a function that returns IO a, what you are actually doing is writing a function that returns an action that eventually - when executed by main - performs the action and returns an a.
Some more explanation:
Referential transparency means that you can replace a function by its value. A referentially transparent function cannot have any side effects; in particular, a referentially transparent function cannot access any hardware resources like files, network, or keyboard, because the function value would depend on something else than its parameters.
Referentially transparent functions in a functional language like Haskell are like math functions (mappings between domain and codomain), much more than a sequence of imperative instructions on how to compute the function's value. Therefore, Haskell code tells the compiler that a function is applied to its arguments, but it does not say that a function is called and thus actually computed.
Therefore, referentially transparent functions do not imply the order of execution. The Haskell compiler is free to evaluate functions in any way it sees fit - or not evaluate them at all if it is not necessary (called lazy evaluation). The only ordering arises from data dependencies, when one function requires the output of another function as input.
Real-world side effects are not referentially transparent. You can think of the real world as some sort of implicit global state that effectful functions mutate. Because of this state, the order of execution matters: It makes a difference if you first read from a database and then update it, or vice versa.
Haskell is a pure functional language: all its functions are referentially transparent, and compilation rests on this guarantee. How, then, can we deal with effectful functions that manipulate some global real-world state and that need to be executed in a certain order? By introducing data dependency between those functions.
This is exactly what IO does: Under the hood, the IO type wraps an effectful function together with a dummy state parameter. Each IO action takes this dummy state as input and provides it as output. Passing this dummy state parameter from one IO action to the next creates a data dependency and thus tells the Haskell compiler how to properly sequence all the IO actions.
You don't see the dummy state parameter because it is hidden behind some syntactic sugar: the do notation in main and other IO actions, and inside the IO type.
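The state-passing idea can be sketched with a toy type. This is only a model under stated assumptions: World, Action, andThen and tick are hypothetical names invented here, and GHC's real IO uses an unboxed State# RealWorld token rather than a counter. The point is that the second action cannot run until it receives the World produced by the first, which is the data dependency described above.

```haskell
-- Toy model only: a token that happens to count how many steps ran.
newtype World = World Int

-- An action takes the current token and returns the next one plus a result.
newtype Action a = Action { runAction :: World -> (World, a) }

-- Sequencing threads the token: the continuation is run on the World
-- output by the first action, so it depends on that action's output.
andThen :: Action a -> (a -> Action b) -> Action b
andThen (Action f) k = Action $ \w0 ->
  let (w1, x) = f w0
  in runAction (k x) w1

-- A sample effect: report the current step number and advance the token.
tick :: Action Int
tick = Action $ \(World n) -> (World (n + 1), n)

-- Two ticks in sequence; the results reflect the order of execution.
twoTicks :: Action (Int, Int)
twoTicks =
  tick `andThen` \a ->
  tick `andThen` \b ->
  Action (\w -> (w, (a, b)))
```

Evaluating runAction twoTicks (World 0) yields the pair (0, 1) together with a token that has been advanced twice. The do notation hides exactly this plumbing.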

Briefly put:
f1 :: A -> B -> C
is a function which takes two arguments of type A and B and returns a C. It does not perform any IO.
f2 :: A -> B -> IO C
is similar to f1, but can also perform IO.
f3 :: (A -> B) -> IO C
takes as an argument a function A -> B (which does not perform IO) and produces a C, possibly performing IO.
f4 :: (A -> IO B) -> IO C
takes as an argument a function A -> IO B (which can perform IO) and produces a C, possibly performing IO.
f5 :: A -> IO B -> IO C
takes as an argument a value of type A, an IO action of type IO B, and returns a value of type C, possibly performing IO (e.g. by running the IO action argument one or more times).
Example:
f6 :: IO Int -> IO Int
f6 action = do
  x1 <- action
  x2 <- action
  putStrLn "hello!"
  x3 <- action
  return (x1+x2+x3)
When a function returns IO (), it returns no useful value, but can perform IO. Similar to, say, returning void in C or Java. Your
gameRunner :: IO String -> (String -> IO ()) -> Screen -> IO ()
function can be called with the following arguments:
arg1 :: IO String
arg1 = do
  putStrLn "hello"
  s <- getLine
  return ("here: " ++ s)
arg2 :: String -> IO ()
arg2 str = do
  putStrLn "hello"
  putStrLn str
  putStrLn "hello again"
arg3 :: Screen
arg3 = ... -- I don't know what's a Screen in your context

Let's try answering some simpler questions first:
What is the Maybe type in Haskell?
From chapter 21 (page 205) of the Haskell 2010 Report:
data Maybe a = Nothing | Just a
it's a simple partial type - you have a value (conveyed via Just) or you don't (Nothing).
How does this work?
Let's look at one possible Monad instance for Maybe:
instance Monad Maybe where
  return = Just
  Just x  >>= k = k x
  Nothing >>= _ = Nothing
This monadic interface simplifies the use of values based on Maybe constructors e.g.
instead of:
\f ox oy -> case ox of
              Nothing -> Nothing
              Just x  -> case oy of
                           Nothing -> Nothing
                           Just y  -> Just (f x y)
you can simply write this:
\f ox oy -> ox >>= \x -> oy >>= \y -> return (f x y)
The monadic interface is widely applicable: from parsing to encapsulated state, and so much more.
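The two forms above can be compared directly. The following sketch fixes f to (+) on Int so the results can be tested for equality; the names addCase, addBind and addDo are made up for this illustration.

```haskell
-- Explicit pattern matching, as in the first form above.
addCase :: Maybe Int -> Maybe Int -> Maybe Int
addCase ox oy = case ox of
  Nothing -> Nothing
  Just x  -> case oy of
    Nothing -> Nothing
    Just y  -> Just (x + y)

-- The same logic through the monadic interface.
addBind :: Maybe Int -> Maybe Int -> Maybe Int
addBind ox oy = ox >>= \x -> oy >>= \y -> return (x + y)

-- do notation is sugar for the same chain of binds.
addDo :: Maybe Int -> Maybe Int -> Maybe Int
addDo ox oy = do
  x <- ox
  y <- oy
  return (x + y)
```

All three agree on every input: a Nothing on either side gives Nothing, and two Just values give the Just of their sum.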
What does the Maybe type used in functions expect or return?
For a function expecting a Maybe-based value e.g:
maybe :: b -> (a -> b) -> Maybe a -> b
maybe _ f (Just x) = f x
maybe d _ Nothing = d
if its contents are being used in the function, then the function may have to deal with not receiving a value it can use i.e. Nothing.
For a function returning a Maybe-based value e.g:
invert :: Double -> Maybe Double
invert 0.0 = Nothing
invert d = Just (1/d)
it just needs to use the appropriate constructors.
One last point: observe how Maybe-based values are used - from starting simply (e.g. invert 0.5 or Just "here") to then define other, possibly more-elaborate Maybe-based values (with (>>=), (>>), etc) to ultimately be examined directly by pattern-matching, or abstractly by a suitable definition (maybe, fromJust et al).
Time for the original questions:
What is the IO type in Haskell?
From section 6.1.7 (page 75) of the Report:
The IO type serves as a tag for operations (actions) that interact with the outside world. The IO type is abstract: no constructors are visible to the user. IO is an instance of the Monad and Functor classes.
the crucial point being:
The IO type is abstract: no constructors are visible to the user.
No constructors? That raises the next question:
How does this work?
This is where the versatility of the monadic interface steps in: the flexibility of its two key operatives - return and (>>=) in Haskell - substantially makes up for IO-based values being abstract.
Remember that observation about how Maybe-based values are used? Well, IO-based values are used in similar fashion - starting simply (e.g. return 1, getChar or putStrLn "Hello, there!") to defining other IO-based values (with (>>=), (>>), catch, etc) to ultimately form Main.main.
But instead of pattern-matching or calling another function to extract its contents, Main.main is processed directly by the Haskell implementation.
What does the IO used in functions expect or return?
For a function expecting an IO-based value e.g:
echo :: IO ()
echo = getChar >>= \c -> if c == '\n'
                         then return ()
                         else putChar c >> echo
if its contents are being used in the function, then the function usually returns an IO-based value.
For a function returning an IO-based value e.g:
newLine :: IO ()
newLine = putChar '\n'
it just needs to use the appropriate definitions.

Copying GHC ByteArray# to Ptr

I am trying to write the following function:
memcpyByteArrayToPtr ::
     ByteArray# -- ^ source
  -> Int        -- ^ start
  -> Int        -- ^ length
  -> Ptr a      -- ^ destination
  -> IO ()
The behavior should be to internally use memcpy to copy the contents of a ByteArray# to the Ptr. There are two techniques I have seen for doing something like this, but it's difficult for me to reason about their safety.
The first is found in the memory package. There is an auxiliary function withPtr defined as:
data Bytes = Bytes (MutableByteArray# RealWorld)
withPtr :: Bytes -> (Ptr p -> IO a) -> IO a
withPtr b@(Bytes mba) f = do
  a <- f (Ptr (byteArrayContents# (unsafeCoerce# mba)))
  touchBytes b
  return a
But, I'm pretty sure that this is only safe because the only way to construct Bytes is by using a smart constructor that calls newAlignedPinnedByteArray#. An answer given to a similar question and the docs for byteArrayContents# indicate that it is only safe when dealing with pinned ByteArray#s. In my situation, I'm dealing with the ByteArray#s that the text library uses internally, and they are not pinned, so I believe this would be unsafe.
The second possibility I've stumbled across is in text itself. At the bottom of the Data.Text.Array source code, there is an ffi function memcpyI:
foreign import ccall unsafe "_hs_text_memcpy" memcpyI
    :: MutableByteArray# s -> CSize -> ByteArray# -> CSize -> CSize -> IO ()
This is backed by the following c code:
void _hs_text_memcpy(void *dest, size_t doff, const void *src, size_t soff, size_t n)
{
  memcpy(dest + (doff<<1), src + (soff<<1), n<<1);
}
Because it's a part of text, I trust that this is safe. It looks dangerous because it's getting a memory location from an unpinned ByteArray#, the very thing that the byteArrayContents# documentation warns against. I suspect that it's ok because the ffi call is marked as unsafe, which I think prevents the GC from moving the ByteArray# during the ffi call.
That's the research I've done so far. My best guess is that I can just copy what's been done in text. The big difference would be that, instead of passing in MutableByteArray# and ByteArray# as the two pointers, I would be passing in ByteArray# and Ptr a (or maybe Addr#, I'm not sure which of those you typically use with the ffi).
Is what I have suggested safe? Is there a better way that would allow me to avoid using the ffi? Is there something in base that does this? Feel free to correct any incorrect assumptions I've made, and thanks for any suggestions or guidance.

copyByteArrayToAddr# :: ByteArray# -> Int# -> Addr# -> Int# -> State# s -> State# s
looks like the right primop. You just need to be sure not to try to copy it into memory it occupies. So you should probably be safe with
copyByteArrayToPtr :: ByteArray# -> Int -> Ptr a -> Int -> ST s ()
copyByteArrayToPtr ba (I# x) (Ptr p) (I# y) = ST $ \ s ->
  (# copyByteArrayToAddr# ba x p y s, () #)
Unfortunately, the documentation gives me no clue what each Int# is supposed to mean, but I imagine you can figure that out through trial and segfault.
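A small experiment can replace the trial and segfault. The sketch below assumes, as I understand the primop, that the first Int# is the source offset in bytes and the second is the number of bytes to copy; the Bytes wrapper and the names abcd, copyToPtr and demo are hypothetical. It builds the four-byte array "abcd", copies two bytes starting at offset 1 into a temporary buffer, and reads them back.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Storable (peekByteOff)
import GHC.Exts
import GHC.IO (IO (..))

-- Boxed wrapper so the unlifted array can be returned from IO.
data Bytes = Bytes ByteArray#

-- Build the (unpinned) four-byte array 'a','b','c','d'.
abcd :: IO Bytes
abcd = IO $ \s0 ->
  case newByteArray# 4# s0 of
    (# s1, mba #) ->
      case writeCharArray# mba 0# 'a'# s1 of
        s2 -> case writeCharArray# mba 1# 'b'# s2 of
          s3 -> case writeCharArray# mba 2# 'c'# s3 of
            s4 -> case writeCharArray# mba 3# 'd'# s4 of
              s5 -> case unsafeFreezeByteArray# mba s5 of
                (# s6, ba #) -> (# s6, Bytes ba #)

-- Assumed argument order: source, byte offset, byte count, destination.
-- The destination must not overlap the source and must be large enough;
-- neither condition is checked.
copyToPtr :: Bytes -> Int -> Int -> Ptr a -> IO ()
copyToPtr (Bytes ba) (I# off) (I# len) (Ptr dst) = IO $ \s ->
  (# copyByteArrayToAddr# ba off dst len s, () #)

-- Copy bytes 1 and 2 into a fresh buffer and read them back.
demo :: IO [Word8]
demo = do
  src <- abcd
  allocaBytes 2 $ \p -> do
    copyToPtr src 1 2 p
    mapM (peekByteOff p) [0, 1]
```

Under those assumptions demo returns [98,99], the codes of 'b' and 'c'. No FFI call is involved, so the question of whether the array is pinned does not arise here.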

Passing list of different typed elements to a C function

I have a function written in C I’d like to call from a Haskell program. The function type is:
foo :: Int -> Ptr a -> IO ()
It takes a size and a pointer to anything and puts the whole thing somewhere in memory. It’s intended to be used with mixed types. You can put n floats then m bools and so on (in C).
The most convenient way to represent such a situation in Haskell would be – in my opinion – something like ([a],[b]) for instance. But, I need the whole thing to fit in a Ptr a (it’s actually a void* in C). I can try to write a function like ([a],[b]) -> Ptr c, but I need some help around it. The desired final function would be:
withArrayLen magicArray foo

Things that can be stored in memory are instances of type class Storable (in Foreign.Storable). So, given the raw FFI prototype
foreign import ccall "foo" c_foo :: CInt -> Ptr a -> IO ()
you could write something like this for homogeneous lists:
homfoo :: Storable a => [a] -> IO ()
homfoo items = withArray items $ \ptr -> c_foo (fromIntegral len) ptr
  where len = length items * sizeOf (head items)
But you've said the function is intended to work with mixed types, so we need some kind of type-constrained heterogeneous list for the nice Haskell wrapper. Here is one way to do this:
{-# LANGUAGE GADTs #-}
import Control.Monad (zipWithM_)
import Data.List (mapAccumL)
import Foreign
data DynStorable where
  MkStorable :: Storable a => a -> DynStorable
foo :: [DynStorable] -> IO ()
foo items =
  let (requiredSize, offsets) = mapAccumL sizeFold 0 items in
  allocaBytes requiredSize $ \ptr -> do
    -- write each item at its aligned offset
    zipWithM_
      (\offset (MkStorable x) -> pokeByteOff ptr offset x)
      offsets items
    c_foo (fromIntegral requiredSize) ptr
  where
    -- round the running offset up to the item's alignment
    sizeFold offset (MkStorable x) =
      let unalignment = offset `mod` alignment x
          offset' = if unalignment /= 0
                      then offset + alignment x - unalignment
                      else offset
      in (offset' + sizeOf x, offset')
main :: IO ()
main = do
  foo [MkStorable (2 :: Int), MkStorable (3.0 :: Double), MkStorable True]
The C function has no means to distinguish item boundaries in the received chunk of data, but it wouldn't be hard to include length prefixes or type codes if required.
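The offset computation can be checked on its own, without calling into C. This is a sketch under the same assumptions as the answer's code: layout is a hypothetical helper name, and the rounding is written with integer division rather than mod, which gives the same result.

```haskell
{-# LANGUAGE GADTs #-}
import Data.Int (Int32)
import Data.List (mapAccumL)
import Data.Word (Word8)
import Foreign.Storable (Storable, alignment, sizeOf)

-- Existential wrapper: any Storable value, with its type hidden.
data DynStorable where
  MkStorable :: Storable a => a -> DynStorable

-- Returns the total buffer size and the aligned byte offset of each item.
layout :: [DynStorable] -> (Int, [Int])
layout = mapAccumL step 0
  where
    step offset (MkStorable x) =
      let a       = alignment x
          offset' = ((offset + a - 1) `div` a) * a  -- round up to a multiple of a
      in (offset' + sizeOf x, offset')

-- A Word8, an Int32 and another Word8: the Int32 is pushed to offset 4.
sample :: (Int, [Int])
sample = layout [ MkStorable (1 :: Word8)
                , MkStorable (2 :: Int32)
                , MkStorable (3 :: Word8) ]
```

Here sample evaluates to (9,[0,4,8]): three bytes of padding are inserted after the first Word8 so that the Int32 starts on a 4-byte boundary.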

Signature of IO in Haskell (is this class or data?)

The question is not what IO does, but how is it defined, its signature. Specifically, is this data or class, is "a" its type parameter then? I didn't find it anywhere. Also, I don't understand the syntactic meaning of this:
f :: IO a

You asked whether IO a is a data type: it is. And you asked whether the a is its type parameter: it is. You said you couldn't find its definition. Let me show you how to find it:
localhost:~ gareth.rowlands$ ghci
GHCi, version 7.6.3: http://www.haskell.org/ghc/ :? for help
Prelude> :i IO
newtype IO a
  = GHC.Types.IO (GHC.Prim.State# GHC.Prim.RealWorld
                  -> (# GHC.Prim.State# GHC.Prim.RealWorld, a #))
        -- Defined in `GHC.Types'
instance Monad IO -- Defined in `GHC.Base'
instance Functor IO -- Defined in `GHC.Base'
Prelude>
In ghci, :i or :info tells you about a type. It shows the type declaration and where it's defined. You can see that IO is a Monad and a Functor too.
This technique is more useful on normal Haskell types - as others have noted, IO is magic in Haskell. In a typical Haskell type, the type signature is very revealing but the important thing to know about IO is not its type declaration, rather that IO actions actually perform IO. They do this in a pretty conventional way, typically by calling the underlying C or OS routine. For example, Haskell's putChar action might call C's putchar function.

IO is a polymorphic type (which happens to be an instance of Monad, irrelevant here).
Consider the humble list. If we were to write our own list of Ints, we might do this:
data IntList = Nil | Cons { listHead :: Int, listRest :: IntList }
If you then abstract over what element type it is, you get this:
data List a = Nil | Cons { listHead :: a, listRest :: List a }
As you can see, the return value of listRest is List a. List is a polymorphic type of kind * -> *, which is to say that it takes one type argument to create a concrete type.
In a similar way, IO is a polymorphic type with kind * -> *, which again means it takes one type argument. If you were to define it yourself, it might look like this:
data IO a = IO (RealWorld -> (a, RealWorld))
(definition courtesy of this answer)

The amount of magic in IO is grossly overestimated: it has some support from compiler and runtime system, but much less than newbies usually expect.
Here is the source file where it is defined:
http://www.haskell.org/ghc/docs/latest/html/libraries/ghc-prim-0.3.0.0/src/GHC-Types.html
newtype IO a
  = IO (State# RealWorld -> (# State# RealWorld, a #))
It is just an optimized version of a state monad. If we remove the optimization annotations we will see:
data IO a = IO (RealWorld -> (RealWorld, a))
So basically IO a is a data structure storing a function that takes the old real world and returns a new real world, with the IO operation performed, together with an a.
Some compiler tricks are necessary, mostly to remove the RealWorld dummy value efficiently.
The IO type is an abstract newtype: its constructors are not exported, so you cannot bypass library functions, work with it directly, and do nasty things such as duplicating RealWorld, creating a RealWorld out of nothing, or escaping the monad (writing a function of type IO a -> a).

Since IO can be applied to objects of any type a, as it is a polymorphic monad, a is not specified.
If you have some object with type a, then it can be 'wrapped' as an object of type IO a, which you can think of as being an action that gives an object of type a. For example, getChar is of type IO Char, and so when it is run, it has the side effect of (from the program's perspective) generating a character, which comes from stdin.
As another example, putChar has type Char -> IO (), meaning that it takes a char, and then performs some action that gives no output (in the context of the program, though it will print the char given to stdout).
Edit: More explanation of monads:
A monad can be thought of as a 'wrapper type' M, and has two associated functions:
return and >>=.
Given a type a, it is possible to create objects of type M a (IO a in the case of the IO monad), using the return function.
return, therefore, has type a -> M a. Moreover, return attempts not to change the element that it is passed -- if you call return x, you will get a wrapped version of x that contains all of the information of x (theoretically, at least; this doesn't happen with, for example, the empty monad).
For example, return 'x' will yield an M Char. getChar likewise yields an IO Char, which is then pulled out of its wrapper with <-.
>>=, read as 'bind', is more complicated. It has type M a -> (a -> M b) -> M b, and its role is to take a 'wrapped' object, and a function from the underlying type of that object to another 'wrapped' object, and apply that function to the underlying value in the first input.
For example, (return 5) >>= (return . (+ 3)) will yield an M Int, which will be the same M Int that would be given by return 8. In this way, any function that can be applied outside of a monad can also be applied inside of it.
To do this, one could take an arbitrary function f :: a -> b, and give the new function g :: M a -> M b as follows:
g x = x >>= (return . f)
Now, for something to be a monad, these operations must also have certain relations -- their definitions as above aren't quite enough.
First: (return x) >>= f must be equivalent to f x. That is, it must be equivalent to perform an operation on x whether it is 'wrapped' in the monad or not.
Second: x >>= return must be equivalent to x. That is, if an object is unwrapped by bind, and then rewrapped by return, it must return to its same state, unchanged.
Third, and finally (x >>= f) >>= g must be equivalent to x >>= (\y -> (f y >>= g) ). That is, function binding is associative (sort of). More accurately, if two functions are bound successively, this must be equivalent to binding the combination thereof.
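The three laws can be spot-checked for a concrete monad. The sketch below uses Maybe, with two made-up functions, half and decrement, standing in for f and g. Checking a few values is an illustration, not a proof.

```haskell
-- Two sample functions of type a -> M b, with M = Maybe.
half :: Int -> Maybe Int
half n = if even n then Just (n `div` 2) else Nothing

decrement :: Int -> Maybe Int
decrement n = if n > 0 then Just (n - 1) else Nothing

-- First law: (return x) >>= f is equivalent to f x.
leftIdentity :: Int -> Bool
leftIdentity x = (return x >>= half) == half x

-- Second law: x >>= return is equivalent to x.
rightIdentity :: Maybe Int -> Bool
rightIdentity m = (m >>= return) == m

-- Third law: (x >>= f) >>= g is equivalent to x >>= (\y -> f y >>= g).
associativity :: Maybe Int -> Bool
associativity m =
  ((m >>= half) >>= decrement) == (m >>= (\y -> half y >>= decrement))
```

For example, leftIdentity 4, rightIdentity (Just 3) and associativity (Just 8) all evaluate to True, as do the same checks on Nothing.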
Now, while this is how monads work, it's not how they are most commonly used, because of the syntactic sugar of do and <-.
Essentially, do begins a long chain of binds, and each <- sort of creates a lambda function that gets bound.
For example,
a = do x <- something
       y <- function x
       return y
is equivalent to
a = something >>= (\x -> (function x) >>= (\y -> return y))
In both cases, something is bound to x, function x is bound to y, and then y is returned to a in the wrapper of the relevant monad.
Sorry for the wall of text, and I hope it explains something. If there's more you need cleared up about this, or something in this explanation is confusing, just ask.

This is a very good question, if you ask me. I remember being very confused about this too, maybe this will help...
'IO' is a type constructor, 'IO a' is a type, the 'a' (in 'IO a') is a type variable. The letter 'a' carries no significance, the letter 'b' or 't1' could have been used just as well.
If you look at the definition of the IO type constructor you will see that it is a newtype defined as: GHC.Types.IO (GHC.Prim.State# GHC.Prim.RealWorld -> (# GHC.Prim.State# GHC.Prim.RealWorld, a #))
'f :: IO a' is the type of a function called 'f' of apparently no arguments that returns a result of some unconstrained type in the IO monad. 'in the IO monad' means that f can do some IO (i.e. change the 'RealWorld', where 'change' means replace the provided RealWorld with a new one) while computing its result. The result of f is polymorphic (that's a type variable 'a' not a type constant like 'Int'). A polymorphic result means that in your program it's the caller that determines the type of the result, so used in one place f could return an Int, used in another place it could return a String. 'Unconstrained' means that there's no type class restricting what type can be returned and so any type can be returned.
Why is 'f' a function and not a constant since there are no parameters and Haskell is pure? Because the definition of IO means that 'f :: IO a' could have been written 'f :: GHC.Prim.State# GHC.Prim.RealWorld -> (# GHC.Prim.State# GHC.Prim.RealWorld, a #)' and so in fact has a parameter -- the 'state of the real world'.

In IO a, the a means essentially the same thing as in Maybe a.
But we can't strip off the constructor, as in:
fromIO :: IO a -> a
fromIO (IO a) = a
Fortunately, we can get at the value inside a monadic do block, like:
{-# LANGUAGE ScopedTypeVariables #-}
foo = do
  (fromIO :: a) <- (dataIO :: IO a)
  ...

Is there any hope to cast ForeignPtr to ByteArray# (for a function :: ByteString -> Vector)

For performance reasons I would like a zero-copy cast of ByteString (strict, for now) to a Vector. Since Vector is just a ByteArray# under the hood, and ByteString is a ForeignPtr this might look something like:
caseBStoVector :: ByteString -> Vector a
caseBStoVector (BS fptr off len) =
  withForeignPtr fptr $ \ptr -> do
    let ptr' = plusPtr ptr off
        p = alignPtr ptr' (alignment (undefined :: a))
        barr = ptrToByteArray# p len -- I want this function, or something similar
        barr' = ByteArray barr
        alignI = minusPtr p ptr
        size = (len-alignI) `div` sizeOf (undefined :: a)
    return (Vector 0 size barr')
That certainly isn't right. Even with the missing function ptrToByteArray#, this seems to need the ptr to escape the withForeignPtr scope. So my questions are:
This post probably advertises my primitive understanding of ByteArray#. If anyone can talk a bit about ByteArray#, its representation, how it is managed (GCed), etc., I'd be grateful.
The fact that ByteArray# lives on the GCed heap and ForeignPtr is external seems to be a fundamental issue - all the access operations are different. Perhaps I should look at redefining Vector from = ByteArray !Int !Int to something with another indirection? Something like = Location !Int !Int where data Location = LocBA ByteArray | LocFPtr ForeignPtr and provide wrapping operations for both those types? This indirection might hurt performance too much though.
Failing to marry these two together, maybe I can just access arbitrary element types in a ForeignPtr in a more efficient manner. Does anyone know of a library that treats ForeignPtr (or ByteString) as an array of arbitrary Storable or Primitive types? This would still lose me the stream fusion and tuning from the Vector package.

Disclaimer: everything here is an implementation detail and specific to GHC and the internal representations of the libraries in question at the time of posting.
This response is a couple years after the fact, but it is indeed possible to get a pointer to bytearray contents. It's problematic as the GC likes to move data in the heap around, and things outside of the GC heap can leak, which isn't necessarily ideal. GHC solves this with:
newPinnedByteArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
Primitive bytearrays (internally typedef'd C char arrays) can be statically pinned to an address. The GC guarantees not to move them. You can convert a bytearray reference to a pointer with this function:
byteArrayContents# :: ByteArray# -> Addr#
The address type forms the basis of Ptr and ForeignPtr types. Ptrs are addresses marked with a phantom type and ForeignPtrs are that plus optional references to GHC memory and IORef finalizers.
Disclaimer: This will only work if your ByteString was built in Haskell. Otherwise, you can't get a reference to the bytearray. You cannot dereference an arbitrary addr. Don't try to cast or coerce your way to a bytearray; that way lies segfaults. Example:
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.IO
import GHC.Prim
import GHC.Types
main :: IO()
main = test
test :: IO () -- Create the test array.
test = IO $ \s0 -> case newPinnedByteArray# 8# s0 of {(# s1, mbarr# #) ->
-- Write something and read it back as baseline.
case writeInt64Array# mbarr# 0# 1# s1 of {s2 ->
case readInt64Array# mbarr# 0# s2 of {(# s3, x# #) ->
-- Print it. Should match what was written.
case unIO (print (I# x#)) s3 of {(# s4, _ #) ->
-- Convert bytearray to pointer.
case byteArrayContents# (unsafeCoerce# mbarr#) of {addr# ->
-- Dereference the pointer.
case readInt64OffAddr# addr# 0# s4 of {(# s5, x'# #) ->
-- Print what's read. Should match the above.
case unIO (print (I# x'#)) s5 of {(# s6, _ #) ->
-- Coerce the pointer into an array and try to read.
case readInt64Array# (unsafeCoerce# addr#) 0# s6 of {(# s7, y# #) ->
-- Haskell is not C. Arrays are not pointers.
-- This won't match. It might segfault. At best, it's garbage.
case unIO (print (I# y#)) s7 of (# s8, _ #) -> (# s8, () #)}}}}}}}}
Output:
1
1
(some garbage value)
To get the bytearray from a ByteString, you need to import the constructor from Data.ByteString.Internal and pattern match.
data ByteString = PS !(ForeignPtr Word8) !Int !Int
(\(PS foreignPointer offset length) -> foreignPointer)
Now we need to rip the goods out of the ForeignPtr. This part is entirely implementation-specific. For GHC, import from GHC.ForeignPtr.
data ForeignPtr a = ForeignPtr Addr# ForeignPtrContents
(\(ForeignPtr addr# foreignPointerContents) -> foreignPointerContents)
data ForeignPtrContents = PlainForeignPtr !(IORef (Finalizers, [IO ()]))
                        | MallocPtr (MutableByteArray# RealWorld) !(IORef (Finalizers, [IO ()]))
                        | PlainPtr (MutableByteArray# RealWorld)
In GHC, ByteString is built with PlainPtrs which are wrapped around pinned byte arrays. They carry no finalizers. They are GC'd like regular Haskell data when they fall out of scope. Addrs don't count, though. GHC assumes they point to things outside of the GC heap. If the bytearray itself falls out of the scope, you're left with a dangling pointer.
PlainPtr :: MutableByteArray# RealWorld -> ForeignPtrContents
(\(PlainPtr mutableByteArray#) -> mutableByteArray#)
MutableByteArray#s have the same representation as ByteArray#s. If you want true zero-copy construction, make sure you either unsafeCoerce# or unsafeFreezeByteArray# to a bytearray. Otherwise, GHC creates a duplicate.
mbarrTobarr :: MutableByteArray# s -> ByteArray#
mbarrTobarr = unsafeCoerce#
And now you have the raw contents of the ByteString ready to be turned into a vector.
Best Wishes,

You might be able to hack together something :: ForeignPtr -> Maybe ByteArray#, but there is nothing you can do in general.
You should look at the Data.Vector.Storable module. It includes a function unsafeFromForeignPtr :: ForeignPtr a -> Int -> Int -> Vector a. It sounds like what you want.
There is also a Data.Vector.Storable.Mutable variant.
