I made up this exercise to help me understand signatures, structures, and functors in Standard ML, but I can't seem to get it to work. For reference, I'm using
Standard ML of New Jersey v110.75 [built: Sun Jan 20 21:55:21 2013]
I have the following ML signature for an "object for which you can compute magnitude":
signature MAG_OBJ =
sig
type object
val mag : object -> int
end
If I want a structure of "int sets with magnitude," I might first define a structure of ordered ints matching ORD_KEY, for use with the standard library's ORD_SET machinery, as follows:
structure OrderedInt : ORD_KEY =
struct
type ord_key = int
val compare = Int.compare
end
Then I can create a functor to give me a structure with the desired types and properties:
functor MakeMagSet(structure ELT : ORD_KEY) : MAG_OBJ =
struct
structure Set : ORD_SET = RedBlackSetFn(ELT)
type object = Set.set
val mag = Set.numItems
end
So far so good (everything compiles, at least). Now I create an instance of the structure using the OrderedInt structure I defined above:
structure IntMagSet = MakeMagSet(structure ELT = OrderedInt)
But when I try to use it (create a set and compute its magnitude), I get an error:
val X = IntMagSet.Set.addList(IntMagSet.Set.empty, [0,1,2,3,4,5,6,7,8,9])
gives the error:
Error: unbound structure: Set in path IntMagSet.Set.empty.addList
From what I understand, ascribing a signature opaquely using :> makes it so one can't access any structure internals which are not defined explicitly in the signature, but I ascribed MAG_OBJ transparently, so I should be able to access the Set structure, right? What am I missing here?
[EDIT]
Even rewriting the functor to specifically bind the functions I want to the struct is no good:
functor MakeMagSet(structure ELT: ORD_KEY) : MAG_OBJ =
struct
structure Set : ORD_SET = RedBlackSetFn(ELT)
type object = Set.set
val mag = Set.numItems
val empty = Set.empty
val addList = Set.addList
end
Trying to access "empty" and "addList" gives unbound variable errors.
On the other hand, explicitly defining the Set structure outside the functor and using its functions gives a type error when calling mag:
Error: operator and operand don't agree [tycon mismatch]
operator domain: IntMagSet.object
operand: Set.set
in expression:
IntMagSet.mag X
I think it's because you explicitly said that MakeMagSet produces a MAG_OBJ, and that signature does not mention Set (or empty, or addList), so those components are hidden. Transparent ascription still restricts the visible components to exactly those listed in the signature; it only differs from opaque ascription (:>) in that the types it does expose remain equal to their implementations. If you get rid of the : MAG_OBJ, or make the result signature include the Set structure, it will work.
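For example, here is a minimal sketch of the second suggestion, using a new signature name MAG_SET (made up here for illustration) that keeps the Set substructure and the type sharing visible:

signature MAG_SET =
sig
  structure Set : ORD_SET
  type object = Set.set          (* keep object = Set.set visible *)
  val mag : object -> int
end

functor MakeMagSet(structure ELT : ORD_KEY) : MAG_SET =
struct
  structure Set = RedBlackSetFn(ELT)
  type object = Set.set
  val mag = Set.numItems
end

structure IntMagSet = MakeMagSet(structure ELT = OrderedInt)

val X = IntMagSet.Set.addList (IntMagSet.Set.empty, [0,1,2,3,4,5,6,7,8,9])
val n = IntMagSet.mag X        (* evaluates to 10 *)

Alternatively, dropping the : MAG_OBJ ascription from the functor result leaves every component of the body visible, at the cost of not enforcing any interface on MakeMagSet.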
Related
I am trying to design an API for a database system in Haskell, and I would like to model the columns of the database so that columns from different tables cannot be mixed up in operations.
More precisely, imagine that you have a type to represent a table in a database, associated to some type:
type Table a = ...
and that you can extract the columns of the table, along with the type of the column:
type Column col = ...
Finally, there are various extractors. For example, if your table contains descriptions of frogs, a function would let you extract the column containing the weight of the frog:
extractCol :: Table Frog -> Column Weight
Here is the question: I would like to distinguish the origin of the columns so that users cannot mix columns from different tables in operations. For example:
bullfrogTable = undefined :: Table Frog
toadTable = undefined :: Table Frog
bullfrogWeights = extractCol bullfrogTable
toadWeights = extractCol toadTable
-- Or some other columns from the toad table
toadWeights' = extractCol toadTable
-- This should compile
addWeights toadWeights' toadWeights
-- This should trigger a type error
addWeights bullfrogWeights toadWeights
I know how to achieve this in Scala (using path-dependent types, see [1]), and I have been thinking of 3 options in Haskell:
not using types, and just doing a check at runtime (the current solution)
the TypeInType extension to add a phantom type on the Table type itself, and pass this extra type to the columns. I am not keen on it, because the construction of such a type would be very complicated (tables are generated through complex DAG operations) and probably slow to compile in this context.
wrapping the operations using a forall construct similar to the ST monad, but in my case, I would like the extra tagging type to actually escape the construction.
I am happy to have a very limited valid scoping for the construction of the same columns (i.e. columns from table and (id table) not being mixable), and I mostly care about the DSL feel of the API rather than the safety.
[1] What is meant by Scala's path-dependent types?
My current solution
Here is what I ended up doing, using RankNTypes.
I still want to give users the ability to use columns as they see fit without strong type checks, and to opt in if they want stronger type guarantees: this is a DSL for data scientists who will not know the power of the Haskell side.
Tables are still tagged by their content:
type Table a = ...
and columns are now tagged with some extra reference types, on top of the type of the data they contain:
type Column ref col = ...
Projections from tables to columns are either tagged or untagged. In practice, this is hidden behind a lens-like DSL.
extractCol :: Table Frog -> Column Frog Weight
data TaggedTable ref a = TaggedTable { _ttTable :: Table a }
extractColTagged :: TaggedTable ref Frog -> Column ref Weight
withTag :: Table a -> (forall ref. TaggedTable ref a -> b) -> b
withTag tb f = f (TaggedTable tb)
Now I can write some code as follows:
let doubleToadWeights = withTag toadTable $ \ttoadTable ->
      let toadWeights = extractColTagged ttoadTable
      in  addWeights toadWeights toadWeights
and this will not compile, as desired:
let doubleToadWeights =
      toadTable `withTag` \ttoads ->
        bullfrogTable `withTag` \tbullfrogs ->
          let toadWeights     = extractColTagged ttoads
              bullfrogWeights = extractColTagged tbullfrogs
          in  addWeights toadWeights bullfrogWeights -- Type error
From a DSL perspective, I believe it is not as straightforward as what one could achieve with Scala, but the type error message is understandable, which is paramount for me.
Haskell does not (as far as I know) have path-dependent types, but you can get some of the way there by using rank-2 types. For instance, the ST monad has a dummy type parameter s that is used to prevent leakage between invocations of runST:
runST :: (forall s . ST s a) -> a
Within an ST action you can have an STRef:
newSTRef :: a -> ST s (STRef s a)
But the STRef you get carries the s type parameter, so it isn't allowed to escape from the runST.
I have strategies expressed as generics in Nim:
proc fooStrategy[T](t: T, ...)
proc barStrategy[T](t: T, ...)
I would like to create a lookup table for the strategies by name... so I tried:
type
  Strategy*[T] = proc[T](t: T, ...)

let strategies* = toTable[string, Strategy[T]]([
  ("foo", fooStrategy), ("bar", barStrategy)])
This doesn't work -- the type declaration fails. Even if I got past that, I suspect the table of strategies would have problems too. Is there another way to do this? "T" is supposed to be "some 1D collection type" -- it could be a sequence, an array, a vector from BLAS, etc. I could add concrete strategies to the table for common collections, but I still have the problem with the function pointer, as
type
  Strategy* = proc(t: any, ...)

let strategies* = toTable[string, Strategy]([
  ("foo-seq[int]", fooStrategy[int]), ...])
still has problems. Any suggestions?
There are multiple issues with your code:
Firstly, initTable does not take a list of items for the table. It only takes an initial size. You want to use toTable instead.
Secondly, you must explicitly set a value for the generic parameter T when creating a table, because at runtime, all generic parameters must be bound to a type.
Thirdly, the proc types have to exactly match, including pragmas on the proc. This one's tricky.
Here is a working example:
import tables

type
  Strategy*[T] = proc(t: T) {.gcsafe, locks: 0.}

proc fooStrategy[T](t: T) = echo "foo"
proc barStrategy[T](t: T) = echo "bar"

let strategies* = toTable[string, Strategy[int]]([
  ("foo", fooStrategy[int]), ("bar", barStrategy[int])
])
For this example, I create a table with Strategy[int] values (you cannot have a table with Strategy[T] values as this is not a concrete type). I instantiate both fooStrategy and barStrategy with [int] to match the table type. I added {.gcsafe, locks: 0.} to the type definition. If this is omitted, you will get a compiler error:
test.nim(9, 49) Error: type mismatch: got (Array constructor[0..1, (string, proc (t: int){.gcsafe, locks: 0.})])
but expected one of:
proc (pairs: openarray[(string, Strategy[system.int])]): Table[system.string, Strategy[system.int]]{.gcsafe, locks: 0.}
As you can see, the compiler tells you in the first line what it sees and in the third line what it expects. It sees procs with {.gcsafe, locks: 0.} because those pragmas are implicitly assigned to the procs defined above. The pragmas change the type, so to be able to assign those procs to Strategy[T], you have to put the same pragmas on Strategy[T].
I'm a beginner in Haskell and I wonder about the right way to define a new type. Suppose I want to define a Point type. Coming from an imperative language, I would expect it to be the equivalent of:
data Point = Int Int
However in haskell I usually see definitions such as:
data Point = Point Int Int
What are the differences and when should each approach be used?
In OO languages you can define a class with something like this
class Point {
    int x, y;
    Point(int x, int y) { ... }
}
It's similar:
data Point = ...
is the type definition (similar to class Point above), and
... = Point Int Int
is the constructor. You can also give the constructor a different name, but you need a name regardless:
data Point = P Int Int
The data definitions are, ultimately, tagged unions. For example:
data Maybe a = Nothing | Just a
Now how would you write this type using your syntax?
Moreover, in Haskell you can pattern match on these values and see which constructor was used to build a value. The name of the constructor is needed for pattern matching, and if the type has just one constructor, it often reuses the same name as the type.
For example:
let x = someOperationReturningMaybe
in case x of
     Nothing -> 0
     Just y  -> y + 5
This is different from a plain union type, such as C's union, where you can say "this thing is either an int or a float" but you have no way to know which one it actually is (except by keeping track of it by hand). If you wrote the code above using a C union, you would have no way to use a case expression to perform different actions depending on the constructor used; you would have to keep track explicitly of what type is contained in x and use an if.
What are the various use cases for union types and intersection types? There has lately been a lot of buzz about these type system features, yet somehow I have never felt the need for either of them!
Union Types
To quote Robert Harper, "Practical Foundations for Programming Languages", ch. 15:
Most data structures involve alternatives such as the distinction between a leaf and an interior node in a tree, or a choice in the outermost form of a piece of abstract syntax. Importantly, the choice determines the structure of the value. For example, nodes have children, but leaves do not, and so forth. These concepts are expressed by sum types, specifically the binary sum, which offers a choice of two things, and the nullary sum, which offers a choice of no things.
Booleans
The simplest sum type is the Boolean,
data Bool = True
          | False
Booleans have only two valid values, True and False. So instead of representing them as numbers, we can use a sum type to more accurately encode the fact that there are only two possible values.
Enumerations
Enumerations are examples of more general sum types: ones with many, but finite, alternative values.
Sum types and null pointers
The most practical motivating example for sum types is distinguishing between valid results and error values returned by functions, by making the failure case explicit.
For example, null pointers and end-of-file characters are hackish encodings of the sum type:
data Maybe a = Nothing
             | Just a
where we can distinguish between valid and invalid values by using the Nothing or Just tag to annotate each value with its status.
By using sum types in this way we can rule out null pointer errors entirely, which is a pretty decent motivating example. Null pointers are entirely due to the inability of older languages to express sum types easily.
Intersection Types
Intersection types are much newer, and their applications are not as widely understood. However, Benjamin Pierce's thesis ("Programming with Intersection Types and Bounded Polymorphism") gives a good overview:
The most intriguing and potentially useful property of intersection types is their ability to express an essentially unbounded (though of course finite) amount of information about the components of a program. For example, the addition function (+) can be given the type (Int -> Int -> Int) ^ (Real -> Real -> Real), capturing both the general fact that the sum of two real numbers is always a real and the more specialized fact that the sum of two integers is always an integer. A compiler for a language with intersection types might even provide two different object-code sequences for the two versions of (+), one using a floating point addition instruction and one using integer addition. For each instance of (+) in a program, the compiler can decide whether both arguments are integers and generate the more efficient object code sequence in this case.
This kind of finitary polymorphism or coherent overloading is so expressive, that ... the set of all valid typings for a program amounts to a complete characterization of the program's behavior.
They let us encode a lot of information in the type: explaining via type theory what multiple inheritance means, giving types to type classes, and so on.
Union types are useful for typing dynamic languages or otherwise allowing more flexibility in the types passed around than most static languages allow. For example, consider this:
var a;
if (condition) {
    a = "string";
} else {
    a = 123;
}
If you have union types, it's easy to type a as int | string.
One use for intersection types is to describe an object that implements multiple interfaces. For example, C# allows multiple interface constraints on generics:
interface IFoo {
    void Foo();
}

interface IBar {
    void Bar();
}

void Method<T>(T arg) where T : IFoo, IBar {
    arg.Foo();
    arg.Bar();
}
Here, arg's type is the intersection of IFoo and IBar. Using that, the type-checker knows both Foo() and Bar() are valid methods on it.
If you want a more practice-oriented answer:
With union and recursive types you can encode regular tree types and therefore XML types.
With intersection types you can type both overloaded functions and refinement types (what a previous answer calls coherent overloading).
So, for instance, you can write the function add (which overloads integer sum and string concatenation) as follows:
let add ( (Int,Int)->Int ; (String,String)->String )
| (x & Int, y & Int) -> x+y
| (x & String, y & String) -> x#y ;;
Which has the intersection type
(Int,Int)->Int & (String,String)->String
But you can also refine the type above, typing the function as
(Pos,Pos) -> Pos &
(Neg,Neg) -> Neg &
(Int,Int)->Int &
(String,String)->String.
where Pos and Neg are positive and negative integer types.
The code above is executable in the language CDuce ( http://www.cduce.org ), whose type system includes union, intersection, and negation types (it is mainly targeted at XML transformations).
If you want to try it and you are on Linux, it is probably included in your distribution (apt-get install cduce or yum install cduce should do the trick), and you can use its toplevel (a la OCaml) to play with union and intersection types. On the CDuce site you will find a lot of practical examples of the use of union and intersection types. And since there is complete integration with OCaml libraries (you can import OCaml libraries in CDuce and export CDuce modules to OCaml), you can also check the correspondence with ML sum types.
Here is a more complex example that mixes union and intersection types (explained on the page "http://www.cduce.org/tutorial_overloading.html#val"), but to understand it you need to understand regular expression pattern matching, which requires some effort.
type Person = FPerson | MPerson
type FPerson = <person gender = "F">[ Name Children ]
type MPerson = <person gender = "M">[ Name Children ]
type Children = <children>[ Person* ]
type Name = <name>[ PCDATA ]
type Man = <man name=String>[ Sons Daughters ]
type Woman = <woman name=String>[ Sons Daughters ]
type Sons = <sons>[ Man* ]
type Daughters = <daughters>[ Woman* ]
let fun split (MPerson -> Man ; FPerson -> Woman)
<person gender=g>[ <name>n <children>[(mc::MPerson | fc::FPerson)*] ] ->
(* the above pattern collects all the MPerson in mc, and all the FPerson in fc *)
let tag = match g with "F" -> `woman | "M" -> `man in
let s = map mc with x -> split x in
let d = map fc with x -> split x in
<(tag) name=n>[ <sons>s <daughters>d ] ;;
In a nutshell, it transforms values of type Person into values of type (Man | Woman) (where the vertical bar denotes a union type) while keeping the correspondence between genders: split is a function with the intersection type
MPerson -> Man & FPerson -> Woman
For instance, with union types one could describe a JSON domain model without introducing actual new classes, using only type aliases.
type JObject = Map[String, JValue]
type JArray = List[JValue]
type JValue = String | Number | Bool | Null | JObject | JArray
type Json = JObject | JArray
def stringify(json: JValue): String = json match {
  case String | Number | Bool | Null => json.toString()
  case JObject => "{" + json.map((k, v) => k + ": " + stringify(v)).mkString(", ") + "}"
  case JArray  => "[" + json.map(stringify).mkString(", ") + "]"
}
In my ML program I am using nested structures to organize my code. I'm defining signatures for these structures, but I can't figure out how to nest the signatures.
Example:
structure Example =
struct
structure Code =
struct
datatype mytype = Mycons of string
end
end
For this I'd like to do something like the following:
signature EXAMPLE =
sig
signature CODE = (* or structure Code - doesn't matter *)
sig
datatype mytype
end
end
Now this doesn't work; I get syntax errors. My questions:
Is this a bad idea? If so, why?
How do I do it? How do I apply the nested signature to the nested structure?
The syntax for nested structures in signatures requires some getting used to.
When trying to specify the signature of a structure within a signature, you do it like this:
signature JSON =
sig
  type t
  (* ... some signature stuff ... *)
  structure Converter : sig
    type json
    type 'a t
    (* ... Converter specification stuff,
       using type json as the parent signature's type t ... *)
  end where type json = t
end
See these Hoffman .sml and .sig files for a simple example of this, and have a look at the Tree .sig file for a slightly more complex example.
Remember that you actually need to ascribe the signature to your structure, or it will be pointless to define the signature in the first place.
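Applied to the Example structure from the question, that looks roughly like this. The nested component is specified as structure Code : sig ... end (you cannot declare a signature inside a signature), and the datatype specification has to repeat its constructors; the bare datatype mytype is part of what caused the syntax errors:

signature EXAMPLE =
sig
  structure Code : sig
    datatype mytype = Mycons of string
  end
end

structure Example : EXAMPLE =
struct
  structure Code =
  struct
    datatype mytype = Mycons of string
  end
end

After this, Example.Code.Mycons "hi" still works. If you would rather hide the Mycons constructor, specify type mytype in the signature instead and ascribe the structure opaquely with :>.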