Problem implementing mutually recursive structs in Julia

I am trying to define two structs, Node and Edge.
A node contains an array of edges, while an edge contains the destination node and the probability of reaching that node. Because of how the rest of the problem is structured, I cannot avoid having the Edge object.
struct Node
    edges::Vector{Edge}
end

struct Edge
    next::Node
    probability::Float64
end
Whenever I try to run the whole script I get "UndefVarError: Edge not defined".
If I try to run only the Edge part I get "UndefVarError: Node not defined".
Is there a way (as in C) to forward-declare the structs, or to tell Julia to process the two definitions together?

What about using abstract types:
abstract type AbstractEdge end

struct Node{T <: AbstractEdge}
    edges::Vector{T}
end

struct Edge <: AbstractEdge
    next::Node{Edge}
    probability::Float64
end

Node() = Node{Edge}(Edge[])
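For example, values can then be built bottom-up, starting from nodes that have no outgoing edges (a small usage sketch assuming the definitions above; note that immutable structs still cannot form an actual two-way cycle):
leaf = Node{Edge}(Edge[])   # a node with no outgoing edges
e = Edge(leaf, 0.5)         # an edge that reaches `leaf` with probability 0.5
root = Node{Edge}([e])      # a node whose only edge leads to `leaf`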
If you do any graph computations, consider using LightGraphs.jl; it has weighted graphs that might suit your needs.

As far as I know this is not possible yet; see https://github.com/JuliaLang/julia/issues/269.
You can define an in-between abstract type in such cases until the issue is resolved:
abstract type AbstractEdge end

struct Node{E<:AbstractEdge}
    edges::Vector{E}
end

struct Edge <: AbstractEdge
    next::Node{Edge}
    probability::Float64
end

Related

Why do I need "use rand::Rng" to call gen() on rand::thread_rng()?

When I'm using Rust's rand crate, if I want to produce a random number, I would write:
use rand::{self, Rng};
let rand = rand::thread_rng().gen::<usize>();
If I don't use rand::Rng, an error occurs:
no method named gen found for struct rand::prelude::ThreadRng in the current scope
That's quite different from what I'm used to. Usually I treat mods like:
import rand from "path";
rand.generate();
Once I import the module I don't need to import anything else, and I can use every method it exports.
Why must I use rand::Rng to enable the gen method on rand::thread_rng()?
That's quite different from what I used to know.
It feels different because it is indeed different. You are probably used to dynamic dispatch via some kind of virtual method table (as in e.g. C++), or, in case of JS, to dynamic dispatch by looking up either the own properties of the receiver object, or its ancestors via the __proto__-chain. In any case, the object on which you are invoking a method carries around some data that tells it how to get the method that you're invoking. Given the signature of the invoked method, the receiver object itself knows how to get the method with that signature.
That's not the only way, though. For example,
modules / functors in OCaml or SML
typeclasses in Haskell
implicits / givens in Scala
traits in Rust
work on a rather different principle: the methods are not tied to the receiver, but to the module / typeclass / given / trait instances. In each case, those are entities that are separate from the receiver of the method call. It opens some new possibilities, e.g. it allows you to do some ad-hoc polymorphism (i.e. to define instances of traits after the fact, for types that are not necessarily under your control). At the same time, the compiler typically requires a bit more information from you in order to be able to select the correct instances: it behaves somewhat like a little type-directed search engine, or even a little "theorem prover", and for this to work, you have to tell the compiler where to look for the suitable building blocks for those synthetically generated instances.
If you've never worked with a language whose compiler has a subsystem that is "searching for instances" based on type information, this should indeed feel quite foreign. The error messages and the solution approaches do feel rather different, because instead of comparing your implementation against an interface and looking for conflicts, you have to guide this instance-searching mechanism by providing more hints (e.g. by importing more traits).
In your particular case, rand::thread_rng returns a struct ThreadRng. On its own, the struct knows nothing about the gen method, because this method is not tied directly to the struct. Instead, it's defined in the Rng trait. But at the same time, it could be defined in some entirely unrelated trait, and have some completely different meaning. In order to disambiguate the intended meaning, you therefore have to explicitly specify that you want to work with the Rng trait. This is why you have to mention it in the use-clause.
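As an illustration, here is a minimal, self-contained sketch (the module, trait, and struct names are made up for this example and are not part of rand) showing that a trait method only resolves when the trait itself is in scope:
mod shapes {
    // The trait that defines the method, analogous to rand::Rng defining gen().
    pub trait Describe {
        fn describe(&self) -> String;
    }

    // A plain struct, analogous to ThreadRng; it has no inherent describe() method.
    pub struct Thing;

    impl Describe for Thing {
        fn describe(&self) -> String {
            String::from("a thing")
        }
    }

    pub fn make_thing() -> Thing {
        Thing
    }
}

// Referring to the module is not enough; the trait itself must be in scope.
use shapes::Describe;

fn main() {
    let t = shapes::make_thing();
    // This compiles only because Describe is in scope. Remove the use line
    // above and the compiler reports: no method named `describe` found.
    println!("{}", t.describe());
}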
I don't know the specific library you're using, but I can guess at the problem. I would guess that Rng is a trait which defines gen. Traits can be thought of as somewhat like Java's interfaces: they enable ad-hoc polymorphism by allowing you to define different behaviors for the same function on different datatypes.
However, Rust's traits fix one major problem with Java's interfaces (well, they fix several major problems, but one is relevant here). In Java, if you define an interface, then anyone writing a class can implement it, but you cannot implement an existing interface for someone else's type. In particular, the built-in types String and int and the like can never implement any new interfaces downstream. In Rust, either the trait writer or the struct/enum writer can implement the trait.
But this poses another issue. Now, if I have a value foo of type Foo and I write foo.bar(), then bar might not be a method defined on Foo; it might be something some trait writer implemented in some other file. We can't go search every Rust file on your computer for possible matching traits, so Rust makes the logical decision to restrict this search to traits that are in scope. If you want to call foo.bar() and bar is a method on trait Bar, then trait Bar has to be in scope when you call it. Otherwise, Rust won't see it.
So, in your case, thread_rng() returns a rand::prelude::ThreadRng. The method gen is not defined on rand::prelude::ThreadRng. Instead, it's defined on a trait called rand::Rng, which is implemented by ThreadRng. That trait has to be in scope for you to use the method.
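Concretely, assuming a recent version of the rand crate, importing the trait alongside the crate is all it takes for the snippet from the question to compile:
use rand::{self, Rng}; // Rng must be in scope so that gen() resolves on ThreadRng

fn main() {
    let value = rand::thread_rng().gen::<usize>();
    println!("{}", value);
}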

Representing and building a cyclic abstract syntax tree

I'm a newbie Haskell programmer with an imperative background.
I'm writing a program that parses an abstract syntax tree (or rather a graph) that has cycles. (This is actually GCC's Generic AST). I'm designing data types for representing this graph, but I'm facing difficulties:
Firstly, there are lots of nasty cycles in the GCC AST. For example, type declarations refer to the actual type descriptor, which refers back to the type's declaration (and this declaration may sometimes be different from the original; sometimes it is just a reference to an identifier node). All tree nodes reference a so-called context node (a form of parent reference). However, this context node (for some node X) often is not the same as the node that originally referred to X. For example, a declaration for some builtin function may be found from a C++ namespace node, but the function declaration's context points to a translation unit declaration. Perhaps this makes using zippers impossible? Also, I can't just ignore these context nodes, because they are useful for finding out the scope of declarations.
So, when designing the data structure I should take into account the fact that sometimes I don't know what kind of node I will be dealing with at run time. However, when I'm certain that a node will have a known type, I would like to reflect this at the type level and take advantage of static type checking (for example, a function's result type is always a type, not an integer constant, but it might be an integer or pointer type, etc.).
Secondly, the GCC tree is a hierarchical data type in the object-oriented sense. All nodes have common information, such as what kind of node they are. All declarations have a name and many flags, and variable declarations have type information in addition. Many nodes have so much data that it would be inconvenient to access this information through pattern matching only. So I most likely will be using accessor functions (and type classes to provide a uniform interface regardless of the node type).
Thirdly, I would like my graph to be purely functional, but I don't know how to build it. My input is text, which has a section for every node. Nodes are identified by unique ids, and there are lots of forward- and self-references.
So, with my background, I sense that I'm trying to force my Haskell interface to an imperative form. So I'm asking you for some concrete advice to guide me in designing my data types, their interface and how to build the graph.
So far I have decided that I will not be tying the knot. It would prevent me from doing transformations on the tree (which, IMHO, this tree deserves). It also would make it hard to write this tree back to disk, if I want to do that some day.
EDIT:
Here is a sample of my input. It is in YAML, but the format is not yet set in stone (I'm generating this data myself from within GCC).
http://sange.fi/~aura/test.yaml The example contains the global namespace with one function declaration for (int main (int argc, char *argv[])).
Thanks in advance!

Haskell FFI - C struct array data fields

I'm in the process of working on Haskell bindings for a native library with a pretty complex interface. It has a lot of structs as part of its interface, and I've been building interfaces to them with hsc2hs and the bindings-DSL package, which helps automate struct bindings.
One problem I've run into, though, is with structs that contain multidimensional arrays. The bindings-DSL documentation describes macros for binding to a structure like
struct with_array {
    char v[5];
    struct test *array_pointer;
    struct test proper_array[10];
};
with macros like
#starttype struct with_array
#array_field v , CChar
#field array_pointer , Ptr <test>
#array_field proper_array , <test>
#stoptype
But this library has many structs with multidimensional arrays as fields, more like
struct with_multidimensional_array {
    int whatever;
    struct something big_array[10][25][500];
};
The #array_field macro seems to only handle the first dimension of the array. Is it the case that bindings-DSL just doesn't have a macro for handling multidimensional arrays?
I'd really like a macro for binding a (possibly multidimensional) array to a StorableArray of arbitrary indexes. It seems like the necessary information is available to the macros bindings-DSL provides; there's just no macro for this.
Has anyone added macros to bindings-DSL? Has anyone added a macro for this to bindings-DSL? Am I way past what I should be doing with hsc2hs, and there's some other tool that would help me do what I want in a more succinct way?
Well, no one's come up with anything else, so I'll go with the idea in my comment. I'll use the #field macro instead of the #array_field macro, and specify a type that wraps StorableArray to work correctly.
Since I was thinking about this quite a bit, I realized that it was possible to abstract out the wrapper entirely, using the new type-level numbers that GHC 7.6+ support. I put together a package called storable-static-array that takes dimensions on the type level and provides a proper Storable instance for working with native arrays, even multidimensional ones.
One thing that's still missing, that I would like greatly, is to find a way to write a bindings-DSL compatible macro that automatically extracts dimensions and takes care of generating them properly. A short glance at the macros in bindings-DSL, though, convinced me that I don't know nearly enough to manage it myself.
The #array_field macro handles arrays with any dimension. Documentation has been updated to show that explicitly.
The Haskell equivalent record will be a list. When peeking and poking, the length and order of the elements of that list will correspond to the array as if it were a one-dimensional array in C. So, a field int example[2][3] would correspond to a list with 6 elements ordered as example[0][0], example[0][1], example[0][2], example[1][0], example[1][1], example[1][2]. When poking, if the list has more than 6 elements, only the first 6 will be used.
This design was chosen for consistency with peekArray and pokeArray from the FFI standard library. Before version 1.0.17 of bindings-DSL there was a bug that caused the size of that list to be underestimated when array fields had more than one dimension.
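As an illustration of that ordering, here is a hand-written sketch (not what bindings-DSL actually generates) of a Storable instance for a struct whose only field is int example[2][3]:
import Foreign
import Foreign.C.Types

-- The 6 list elements correspond to example[0][0] .. example[1][2],
-- in the row-major order described above.
newtype Example = Example { example :: [CInt] }

instance Storable Example where
    sizeOf _    = 6 * sizeOf (undefined :: CInt)
    alignment _ = alignment (undefined :: CInt)
    peek p      = Example <$> peekArray 6 (castPtr p)
    poke p (Example xs) = pokeArray (castPtr p) (take 6 xs)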

Atomic Compare And Swap with struct in Go

I am trying to create a non-blocking queue package for concurrent applications, using the algorithm by Maged M. Michael and Michael L. Scott as described here.
This requires the use of atomic CompareAndSwap which is offered by the "sync/atomic" package.
I am, however, not sure what the Go equivalent of the following pseudocode would be:
E9: if CAS(&tail.ptr->next, next, <node, next.count+1>)
where tail and next are of type:
type pointer_t struct {
    ptr   *node_t
    count uint
}
and node is of type:
type node_t struct {
    value interface{}
    next  pointer_t
}
If I understood it correctly, it seems that I need to do a CAS with a struct (both a pointer and a uint). Is this even possible with the atomic-package?
Thanks for help!
If I understood it correctly, it seems that I need to do a CAS with a struct (both a pointer and a uint). Is this even possible with the atomic-package?
No, that is not possible. Most architectures only support atomic operations on a single word. A lot of academic papers, however, use more powerful CAS statements (e.g. double compare-and-swap) that are not available today. Luckily there are a few tricks that are commonly used in such situations:
You could, for example, steal a couple of bits from the pointer (especially on 64-bit systems) and use them to encode your counter. Then you could simply use Go's CompareAndSwapPointer, but you would need to mask the relevant bits of the pointer before you try to dereference it.
The other possibility is to work with pointers to your (immutable!) pointer_t struct. Whenever you want to modify an element of your pointer_t struct, you would have to create a copy, modify the copy, and atomically replace the pointer to the struct. This idiom is called COW (copy on write) and works with arbitrarily large structures. If you want to use this technique, you would have to change the next attribute to next *pointer_t.
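A minimal sketch of that copy-on-write idiom (the names follow the question, with next changed to an atomically replaced pointer; this is only the CAS helper, not a complete queue):
package queue

import (
    "sync/atomic"
    "unsafe"
)

type node_t struct {
    value interface{}
    next  unsafe.Pointer // always holds a *pointer_t; replaced atomically, never mutated in place
}

type pointer_t struct {
    ptr   *node_t
    count uint
}

// casNext installs a freshly allocated pointer_t (pointing at newNode, with an
// incremented counter), but only if n.next still refers to old.
func casNext(n *node_t, old *pointer_t, newNode *node_t) bool {
    fresh := &pointer_t{ptr: newNode, count: old.count + 1}
    return atomic.CompareAndSwapPointer(&n.next, unsafe.Pointer(old), unsafe.Pointer(fresh))
}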
I have recently written a lock-free list in Go for educational reasons. You can find the (imho well documented) source here: https://github.com/tux21b/goco/blob/master/list.go
This rather short example uses atomic.CompareAndSwapPointer extensively and also introduces an atomic type for marked pointers (the MarkAndRef struct). This type is very similar to your pointer_t struct (except that it stores a bool+pointer instead of an int+pointer). It's used to ensure that a node has not been marked as deleted while you are trying to insert an element directly afterwards. Feel free to use this source as a starting point for your own projects.
You can do something like this:
if atomic.CompareAndSwapPointer(
    (*unsafe.Pointer)(unsafe.Pointer(&tail.ptr.next)),
    unsafe.Pointer(&next),
    unsafe.Pointer(&pointer_t{&node, next.count + 1}),
) {
    // ...
}

What is typestate?

What does TypeState refer to with respect to language design? I saw it mentioned in some discussions regarding a new language by Mozilla called Rust.
Note: Typestate was dropped from Rust; only a limited version (tracking uninitialized and moved-from variables) is left. See my note at the end.
The motivation behind TypeState is that types are immutable; however, some of their properties are dynamic, on a per-variable basis.
The idea is therefore to create simple predicates about a type, and to use the control-flow analysis that the compiler executes for many other reasons to statically decorate the type with those predicates.
Those predicates are not actually checked by the compiler itself (that could be too onerous); instead, the compiler simply reasons in terms of a graph.
As a simple example, you create a predicate even, which returns true if a number is even.
Now, you create two functions:
halve, which only acts on even numbers
double, which takes any number and returns an even number.
Note that the type number is not changed; you do not create an evennumber type and duplicate all those functions that previously acted on number. You just compose number with a predicate called even.
Now, let's build some graphs:
a: number -> halve(a) #! error: `a` is not `even`
a: number, even -> halve(a) # ok
a: number -> b = double(a) -> b: number, even
Simple, isn't it?
Of course it gets a bit more complicated when you have several possible paths:
a: number -> a = double(a) -> a: number, even -> halve(a) #! error: `a` is not `even`
\___________________________________/
(the arc denotes a second path that skips the `double` call)
This shows that you reason in terms of sets of predicates:
when joining two paths, the new set of predicates is the intersection of the sets of predicates given by those two paths
This can be augmented by the generic rule of a function:
to call a function, the set of predicates it requires must be satisfied
after a function is called, only the set of predicates it established is satisfied (note: arguments taken by value are not affected)
And thus the building block of TypeState in Rust:
check: checks that the predicate holds; if it does not, it fails; otherwise, it adds the predicate to the set of predicates
Note that since Rust requires that predicates are pure functions, it can eliminate redundant check calls if it can prove that the predicate already holds at this point.
What Typestate lacks is simple: composability.
If you read the description carefully, you will note this:
after a function is called, only the set of predicates it established is satisfied (note: arguments taken by value are not affected)
This means that predicates for a type are useless in themselves; the utility comes from annotating functions. Therefore, introducing a new predicate in an existing codebase is a bore, as the existing functions need to be reviewed and tweaked to state whether or not they need/preserve the invariant.
And this may lead to duplicating functions at an exponential rate when new predicates pop up: predicates are not, unfortunately, composable. The very design issue they were meant to address (the proliferation of types, and thus of functions) does not seem to be solved.
It's basically an extension of types, where you don't just check whether some operation is allowed in general, but in this specific context. All that at compile time.
The original paper is actually quite readable.
There's a typestate checker written for Java, and Adam Warski's explanatory page gives some useful information. I'm only just figuring this material out myself, but if you are familiar with QuickCheck for Haskell, the application of QuickCheck to monadic state seems similar: categorise the states and explain how they change when they are mutated through the interface.
Typestate is explained as:
Leverage the type system to encode state changes
Implemented by creating a type for each state
Use move semantics to invalidate a state
Return the next state from the previous state
Optionally drop the state (close file, connections, ...)
Compile-time enforcement of logic
struct Data;
struct Signed;

impl Data {
    // sign() takes self by value, so it consumes (moves) the Data.
    fn sign(self) -> Signed {
        Signed
    }
}

let data = Data;
let signed = data.sign(); // data is moved here
data.sign(); // compile error: use of moved value `data`

Resources