Haskell FFI - C struct array data fields - haskell

I'm in the process of working on Haskell bindings for a native library with a pretty complex interface. It has a lot of structs as part of its interface, and I've been building bindings to them with hsc2hs and the bindings-DSL package to help automate the struct bindings.
One problem I've run into, though, is with structs that contain multidimensional arrays. The bindings-DSL documentation describes macros for binding to a structure like
struct with_array {
    char v[5];
    struct test *array_pointer;
    struct test proper_array[10];
};
with macros like
#starttype struct with_array
#array_field v , CChar
#field array_pointer , Ptr <test>
#array_field proper_array , <test>
#stoptype
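As I understand it, these macros expand to roughly the following record (plus a Storable instance and pointer accessors); the names follow bindings-DSL's c'/C' conventions, and this is an approximation rather than the literal generated code:
data C'with_array = C'with_array
  { c'with_array'v             :: [CChar]
  , c'with_array'array_pointer :: Ptr C'test
  , c'with_array'proper_array  :: [C'test]
  }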
But this library has many structs with multidimensional arrays as fields, more like
struct with_multidimensional_array {
    int whatever;
    struct something big_array[10][25][500];
};
The #array_field macro seems to only handle the first dimension of the array. Is it the case that bindings-DSL just doesn't have a macro for handling multidimensional arrays?
I'd really like a macro for binding a (possibly multidimensional) array to a StorableArray with arbitrary indexes. It seems like the necessary information is available to the macros bindings-DSL provides - there's just no macro for this.
Has anyone extended bindings-DSL with extra macros? Has anyone added a macro for this in particular? Or am I way past what I should be doing with hsc2hs, and is there some other tool that would let me do what I want more succinctly?

Well, no one's come up with anything else, so I'll go with the idea from my comment: I'll use the #field macro instead of the #array_field macro, and specify a wrapper type around StorableArray that marshals correctly.
Since I was thinking about this quite a bit, I realized that it was possible to abstract out the wrapper entirely, using the new type-level numbers that GHC 7.6+ support. I put together a package called storable-static-array that takes dimensions on the type level and provides a proper Storable instance for working with native arrays, even multidimensional ones.
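The core trick looks roughly like this (a minimal two-dimensional sketch of the idea; the FixedArray2D name and its flat-list representation are illustrative only, not the actual storable-static-array API):
{-# LANGUAGE DataKinds, KindSignatures, ScopedTypeVariables #-}
import Data.Proxy (Proxy (..))
import Foreign.Ptr (castPtr)
import Foreign.Storable (Storable (..))
import GHC.TypeLits (KnownNat, Nat, natVal)

-- A flat, row-major image of a C field "elem field[n][m]",
-- with both dimensions carried at the type level.
newtype FixedArray2D (n :: Nat) (m :: Nat) a = FixedArray2D [a]

instance (KnownNat n, KnownNat m, Storable a)
      => Storable (FixedArray2D n m a) where
  sizeOf _    = count (Proxy :: Proxy n) (Proxy :: Proxy m)
              * sizeOf (undefined :: a)
  alignment _ = alignment (undefined :: a)
  peek p      = FixedArray2D <$>
    mapM (peekElemOff (castPtr p))
         [0 .. count (Proxy :: Proxy n) (Proxy :: Proxy m) - 1]
  -- Only the first n*m list elements are written, mirroring pokeArray-style use.
  poke p (FixedArray2D xs) =
    mapM_ (uncurry (pokeElemOff (castPtr p)))
          (zip [0 ..] (take (count (Proxy :: Proxy n) (Proxy :: Proxy m)) xs))

-- Total element count n * m, read off the type-level dimensions.
count :: (KnownNat n, KnownNat m) => Proxy n -> Proxy m -> Int
count pn pm = fromInteger (natVal pn * natVal pm)
The package generalizes this to arbitrary dimensions; with a wrapper like this in place, the struct field can be declared with a plain #field line instead of #array_field.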
One thing that's still missing, and that I would greatly like, is a way to write a bindings-DSL-compatible macro that automatically extracts the dimensions and takes care of generating them properly. A short glance at the macros in bindings-DSL, though, convinced me that I don't know nearly enough to manage it myself.

The #array_field macro handles arrays of any dimension. The documentation has been updated to say so explicitly.
The equivalent Haskell record field will be a list. When peeking and poking, the length and order of the elements of that list correspond to the C array treated as a one-dimensional array. So a field int example[2][3] corresponds to a list of 6 elements, ordered as example[0][0], example[0][1], example[0][2], example[1][0], example[1][1], example[1][2]. When poking, if the list has more than 6 elements, only the first 6 are used.
This design was chosen for consistency with peekArray and pokeArray from the FFI standard library. Before version 1.0.17 of bindings-DSL there was a bug that caused the size of that list to be underestimated when array fields had more than one dimension.
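So the struct from the question can be bound directly with the macros already shown; a sketch, assuming struct something has its own #starttype binding elsewhere:
#starttype struct with_multidimensional_array
#field whatever , CInt
#array_field big_array , <something>
#stoptype
On the Haskell side, big_array then appears as a single flat list of 10 * 25 * 500 elements, in the row-major order described above.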

Related

Create a map using type as key

I need a HashMap<K,V> where V is a trait object (it will likely be a Box or an Rc or something; that's not important), and I need to ensure that the map stores at most one value of a given struct type and, more importantly, that I can query the presence of (and retrieve/insert) items by their type. K can be anything that is unique to each type (a uint would be nice, but a String or even some large struct holding type information would be sufficient, as long as it can implement Eq and Hash).
This is occurring in a library, so I cannot use an enum or such since new types can be added by external code.
I looked into std::any::TypeId, but besides not working for non-'static types, it seems TypeIds aren't even guaranteed to be unique (allegedly collisions have been produced accidentally with a rather small number of types), so I'd prefer to avoid them if feasible, since the number of types I'll have may be very large. (Hence this is not a duplicate of this, IMO.)
I'd like something along the lines of a macro to ensure uniqueness, but I can't figure out how to have some kind of global compile-time counter. I could use a proper UUID, but it'd be nice to have guaranteed uniqueness, since this is, in theory at least, statically determinable.
It is safe to assume that all relevant types are defined either in this lib or in a singular crate that directly depends on it, if that allows for a solution that might be otherwise impossible.
For example, my thought is to generate IDs for the types in the lib and also export the counter as a constant that the consumer of the lib could use in the same macro (or a very similar one), but I don't see a way to have such a const value modified by const code in multiple places.
Is this possible or do I need some kind of build script that provides values before compile time?

How do you get struct-copy to create a struct of the same type as the original?

To illustrate, here's a little immutable struct and a function to update it:
(struct timeseries (variable observations) #:transparent)

(define (add-observation ts t v)
  (struct-copy timeseries ts
               [observations (cons `(,t ,v) (timeseries-observations ts))]))
My question is: If I make a struct that inherits from timeseries, then add-observation will return a timeseries struct rather than a struct of the type that it was passed. How do you update a struct and retain its type?
By the way, if the above code is just not how things are done in Racket, please let me know the conventional way. The fact that I haven't found a function in the Racket libraries that works like struct-copy but retains the type of the original struct makes me suspect I'm going about this the wrong way. Is there some ordinary way to accomplish the same purpose without ending up with a struct of a different type than you started with?
Unfortunately this is one of struct-copy's well-known limitations, most of which stem from it being implemented by what Sam Tobin-Hochstadt has aptly described as "unhygenically pasting bits of structs together" (rather than a low-level notion of copying structs), and is part of the reason that "struct-copy is hopeless and can't be fixed without major changes to how structs work." Matthias Felleisen described this as "an Achilles' heel in our world." There is definitely a desire in the Racket community to improve this situation, but for a number of reasons it seems daunting. I'm not aware of anyone actively working on it, and what a principled solution would look like seems to be an open question.
Structs are in many ways very fundamental to Racket. Conceptually, every value in Racket could be an instance of some struct type, though in reality the runtime system has specialized representations for certain built-ins. In fact, I think the ongoing work of replacing C with Chez Scheme in the Racket implementation may use structs for some things that are built-ins in the legacy Racket VM. This is possible because structs offer strong encapsulation capabilities, especially through inspectors. Improving the way structs work would touch essentially all of Racket and involve many disparate considerations, especially around backwards-compatibility.
Here are a few pointers for further reading about the issues:
This message from Sam sketches out how struct could start providing more static information while preserving backwards compatibility.
This GitHub issue, particularly the last comment from Alexis, points out (a) that it isn't obvious what the right static information to add would be and (b) that adding static information about the field names wouldn't be enough to solve the subtyping issues.
This thread points out a different limitation of struct-copy and includes a good summary from Alexis:
… struct-copy is irreparably broken and cannot be fixed without
fundamental changes to Racket’s struct system. Namely, it has the
“slicing” problem familiar to C++ programmers when using struct
inheritance, and the way it synthesizes field accessors from the
provided field names is unhygienic and can be easily thwarted.
The good news is that, while figuring out The Right Thing to do the general case is hard, Racket's languages-as-libraries approach makes it possible for all programmers and library-writers to try different approaches in their own code. There are various Racket packages to help with functional update and other features. Alexis' struct-update provides a macro to synthesize functions like timeseries-observations-update. Jay McCarthy has also experimented with enhancements to struct in library code. You can also implement a solution tailored to your specific use-case, ranging from implementing a consistent copy method (with racket/generic or racket/class) to creating a domain-specific language that can more naturally express your problem domain. This mailing-list thread, despite the subject line, covers a lot of approaches to functional update in Racket (including some thoughts from me about DSLs).
Answer from 2022: the quotes "unhygenically pasting bits of structs together" and "struct-copy is hopeless and can't be fixed without major changes to how structs work", which referred to the hygiene problems, are no longer accurate. Those problems have mostly been fixed.
For the original post's problem (which is not really related to the hygiene issues), it remains difficult to solve this problem with function abstraction. However, if we are allowed to use syntactic abstraction, we can do the following:
#lang racket

(struct timeseries (variable observations) #:transparent)

(define-syntax-rule (add-observation id ts t v)
  (struct-copy id ts
               [observations #:parent timeseries
                (cons (list t v) (timeseries-observations ts))]))

(struct good-timeseries timeseries (goodness) #:transparent)

(define val (good-timeseries 'id (list (list 1 2)) "goodness value"))

(add-observation good-timeseries val 10 11)
;=> (good-timeseries 'id '((10 11) (1 2)) "goodness value")
Here, add-observation is changed to be a macro, and it accepts a struct transformation binding as the first argument, which allows struct-copy to produce a struct of the desired type.

Can associated constants be used to initialize the length of fixed size arrays?

In C++, you can pass integral values as template parameters:
std::array<int, 3> arr; // fixed-size array of 3 ints
I know that Rust has built-in support for fixed-size arrays, but what if I wanted to create something like a linear algebra vector library?
struct Vec<T, size: usize> {
    data: [T; size],
}

type Vec3f = Vec<f32, 3>;
type Vec4f = Vec<f32, 4>;
This is currently what I do in D. I have heard that Rust now has Associated Constants.
I haven't used Rust in a long time, but this doesn't seem to address the problem at all, or have I missed something?
As far as I can see, associated constants are only available in traits, and that would mean I would still have to create N vector types by hand.
No, associated constants don't help here and aren't intended to. Associated items are outputs, while use cases such as the one in the question want inputs. One could in principle construct something out of type parameters and a trait with associated constants (at least, once you can use associated constants of type parameters; sadly that doesn't work yet). But that has terrible ergonomics, not much better than existing hacks like typenum.
Integer type parameters are highly desired since, as you noticed, they enable numerous things that aren't really feasible in current Rust. People talk about this and plan for it but it's not there yet.
Integer type parameters are not supported as of now; however, there's an RFC for that IIRC, and a long-standing discussion.
You could use the typenum crate in the meantime.

Is there a way to use the namelist I/O feature to read in a derived type with allocatable components?

Is there a way to use the namelist I/O feature to read in a derived type with allocatable components?
The only thing I've been able to find about it is https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/269585, which ended on a fairly unhelpful note.
Edit:
I have user-defined derived types that need to be filled with information from an input file, so I'm trying to find a convenient way of doing that. Namelist seems like a good route because it is so succinct: basically two lines, one to declare the namelist and one to read it. Namelist also seems like a good choice because the text file forces you to show very clearly which value goes where, which I find highly preferable to a plain list of values whose exact order only the compiler knows. A plain list means much more work whenever I or anyone else needs to know which value corresponds to which variable, and much more work to keep clean when, inevitably, a new value is needed.
I'm trying to do something of the basic form:
! where myType_T is a type that has at least one allocatable array in it
type(myType_T) :: thing
namelist /nmlThing/ thing
open(1, file="input.txt")
read(1, nml=nmlThing)
I may be misunderstanding user-defined I/O procedures, but they don't seem to be a very generic solution. It seems like I would need to write a new one any time I need to do this action, and they don't seem to natively support the
&nmlThing
thing%name = "thing1"
thing%siblings(1) = "thing2"
thing%siblings(2) = "thing3"
thing%siblings(3) = "thing4"
!siblings is an allocatable array
/
syntax that I find desirable.
There are a few solutions I've found to this problem, but none seem very succinct or elegant. Currently, I have a dummy user-defined type whose arrays are oversized rather than allocatable, and I write a function to copy the information from the dummy, namelist-friendly type into the type with allocatable components. It works just fine, but it is ugly, and I'm up to about 4 places where I need to do this same kind of operation in the code.
Hence trying to find a good solution.
If you want to use allocatable components, then you need an accessible generic interface for a user-defined derived type input/output procedure (typically via the type having a generic binding for such a procedure). The thread you link to has an example of such a procedure.
Once invoked, that user defined derived type input/output procedure is then responsible for reading and writing the data. That can include invoking namelist input/output on the components of the derived type.
Fortran 2003 also offers derived types with length parameters. These may offer a solution without the need for a user defined derived type input/output procedure. However, use of derived types with length parameters, in combination with namelist, will put you firmly in the "highly experimental" category with respect to the current compiler implementation.

Tuples in .NET 4.0. When should I use them?

I have come across tuples in .NET 4.0. I have seen a few examples on MSDN, but it's still not clear to me what they are for or when to use them.
Is the idea that if I want to create a collection of mixed types I should use a tuple?
Are there any clear examples out there that I can relate to?
When did you last use them?
Thanks for any suggestions.
Tuples are just a convenience for the developer while coding. If you want to return two pieces of information instead of one, you can use a Tuple for quick coding, but I recommend you make yourself a type that contains both properties, with appropriate naming and documentation.
Tuples are not for mixing types in the way you imagine; they are used to make compositions of other types. For example, a type that holds both an int and a string can be represented by a tuple: Tuple<int,string>.
Tuples exist in many sizes, not only two.
I don't recommend using tuples in your final code, since their meaning is not clear.
Tuples can be used as multipart keys for dictionaries or grouping statements, because they implement structural equality. Avoid using them to move data around, because they have poor language support in C# and named values (classes, structs) are better (simpler) than ordered values (tuples).
