Functional languages targeting the LLVM

Are there any languages that target the LLVM that:
Are statically typed
Use type inference
Are functional (i.e. lambda expressions, closures, list primitives, list comprehensions, etc.)
Have first class object-oriented features (inheritance, polymorphism, mixins, etc.)
Have a sophisticated type system (generics, covariance and contravariance, etc.)
Scala is all of these, but only targets the JVM. F# (and to some extent C#) has most if not all of these, but only targets .NET. What similar language targets the LLVM?

There's a Haskell (GHC) backend targeting the LLVM.
You could also try using F# through Mono-LLVM.
Also, the VMKit project is implementing both the JVM and the .NET CLI on top of LLVM; it's still in its early stages but once it matures you could use it with F#, or any JVM-targeting functional languages (Scala, Clojure, etc.)

I'm not sure how far these have progressed, but they may be worth adding to the list:
Scala for LLVM - https://github.com/greedy/scala/
Timber for LLVM - https://bitbucket.org/capitrane/timber-llvm
Mono for LLVM - http://www.mono-project.com/Mono_LLVM

Yes... clang. C++ has everything on your list except for list comprehensions. It is also the flagship LLVM language.
"Are statically typed"
Yup
"Use type inference"
// local type inference
auto var = 10;

// type inference on parameters to generic functions
template <typename T>
void my_function(T arg) {
    // ...
}
my_function(1); // infers that T = int

// correctly handles more complicated cases where the type is only partially specified
template <typename T>
void my_function(std::vector<T> arg) {
    // ...
}
std::vector<int> my_vec = {1, 2, 3, 4};
my_function(my_vec); // infers that T = int
"Are functional (i.e. lambda expressions, closures, list primitives, list comprehensions, etc.)"
Lambdas in C++ look like [capture_spec](arglist...) { body }. You can capture closed-over variables by reference (similar to Lisp) with [&], or by value with [=].
int local = 10;
auto my_closure = [&]() { return local; };
my_closure(); // returns 10.
In C++, map and zip are both spelled std::transform (the two-range overload acts like zip), and reduce is std::accumulate.
std::vector<int> vec = {1, 2, 3, 4};
int sum = std::accumulate(vec.begin(), vec.end(), 0, [](int x, int y) { return x + y; });
You can also rig up list comprehensions using a macro and a wrapper around std::transform if you really want...
"Have first class object-oriented features (inheritance, polymorphism, mixins, etc.)"
Of course. C++ allows virtual dispatch + multiple inheritance + implementation inheritance. Note: mixins are just implementation inheritance. You only need a special "mixin" mechanism if your language prohibits multiple inheritance.
"Have a sophisticated type system (generics, covariance and contravariance, etc.)"
C++ templates are the most powerful generics system in any language as far as I know.

Related

Understanding recursive let expressions in lambda calculus with Haskell, OCaml and the Nix language

I'm trying to understand how recursive sets operate internally by comparing them with similar features in other functional programming languages and concepts.
I can find it in the wiki, which says I need to know about the Y combinator and fixed points; I can get the gist of those from the wiki as well.
Now I'm starting to apply this in Haskell.
Haskell
It is easy, but I want to know what happens behind the scenes.
*Main> let x = y; y = 10; in x
10
When you write a = f b in a lazy functional language like Haskell or Nix, the meaning is stronger than just assignment. a and f b will be the same thing. This is usually called a binding.
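Because a binding is an equation rather than an assignment, the bound names can refer to each other in any order, and even to themselves. Two small Haskell examples (names are mine, purely for illustration):
example :: Int
example =
  let x = y + 1   -- refers to y, which is bound on the next line
      y = 10
  in x            -- forcing x forces y, giving 11

ones :: [Int]
ones = 1 : ones   -- a self-referential binding; take 3 ones == [1,1,1]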
I'll focus on a Nix example, because you're asking about recursive sets specifically.
A simple attribute set
Let's look at the initialization of an attribute set first. When the Nix interpreter is asked to evaluate this file
{ a = 1 + 1; b = true; }
it parses it and returns a data structure like this
{ a = <thunk 1>; b = <thunk 2>; }
where a thunk is a reference to the relevant syntax tree node and a reference to the "environment", which behaves like a dictionary from identifiers to their values, although implemented more efficiently.
Perhaps the reason we're evaluating this file is because you requested nix-build, which will not just ask for the value of a file, but also traverse the attribute set when it sees that it is one. So nix-build will ask for the value of a, which will be computed from its thunk. When the computation is complete, the memory that held the thunk is assigned the actual value, type = tInt, value.integer = 2.
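To make the thunk idea concrete, here is a toy sketch in Haskell of how such an evaluator might represent suspended computations. The names and the miniature expression type are mine for illustration; Nix's real evaluator is written in C++ and differs in many details.
import Data.IORef
import qualified Data.Map as Map

data Expr = ExprInt Integer | ExprVar String | ExprAdd Expr Expr  -- tiny stand-in for the real syntax tree

type Env = Map.Map String ThunkRef

data Value = VInt Integer

-- A thunk is an unevaluated (environment, expression) pair until it is
-- forced; after that, the reference is overwritten with the computed value.
data Thunk = Suspended Env Expr | Forced Value

type ThunkRef = IORef Thunk

force :: ThunkRef -> IO Value
force ref = do
  t <- readIORef ref
  case t of
    Forced v        -> return v        -- already computed; reuse it
    Suspended env e -> do
      v <- eval env e                  -- compute it once...
      writeIORef ref (Forced v)        -- ...and memoize the result
      return v

eval :: Env -> Expr -> IO Value
eval _   (ExprInt n)   = return (VInt n)
eval env (ExprVar x)   = force (env Map.! x)      -- variable lookup forces the bound thunk
eval env (ExprAdd a b) = do
  VInt x <- eval env a
  VInt y <- eval env b
  return (VInt (x + y))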
A recursive attribute set
Nix has a special syntax that combines the functionality of attribute set construction syntax ({ }) and let-binding syntax. This avoids some repetition when you're constructing attribute sets with some shared values.
For example
let b = 1 + 1;
in { b = b; a = b + 5; }
can be expressed as
rec { b = 1 + 1; a = b + 5; }
Evaluation works in a similar manner.
At first the evaluator returns a representation of the attribute set with all thunks, but this time the thunks reference a new environment that includes all the attributes, on top of the existing lexical scope.
Note that all these representations can be constructed while performing a minimal amount of work.
nix-build traverses attrsets in alphabetic order, so it will evaluate a first. It's a thunk that references the b + 5 syntax node and an environment with b in it. Evaluating this requires evaluating the b syntax node (an ExprVar), which references the environment, where we find the 1 + 1 thunk, which is changed to a tInt of 2 as before.
As you can see, this process of creating thunks but only evaluating them when needed is indeed lazy and allows us to have various language constructs with their own scoping rules.
Haskell implementations usually follow a similar pattern, but may compile the code rather than interpret a syntax tree, and resolve all variable references to constant memory offsets completely. Nix tries to do this to some degree, but it must be able to fall back on strings because of the inadvisable with keyword that makes the scope dynamic.
I have worked out several things by myself.
In an eagerly evaluated language, I must declare a variable before using it, so the order of declaration is simple.
int x = 10;
int y = x;
Just for the Nix language:
In the wiki, there isn't any comparison of these concepts with Haskell, although let ... in is compared with Haskell.
lexical scope
all variables are lexically scoped.
mutual recursion
https://en.wikipedia.org/wiki/Let_expression#Mutually_recursive_let_expression

How to understand functors in the Nix expression language?

I'm having a bit of trouble parsing this. But as I write it out, I think I may have it.
let add = { __functor = self: x: x + self.x; };
inc = add // { x = 1; };
in inc 1
First, is self a keyword like in many OO languages or is this just a regular name?
Secondly, I'm trying to understand what the multiple : are doing in the definition of __functor. This is probably a failing of my basic familiarity with Nix expressions, but I guess what is happening is that both self and x are arguments to __functor, i.e., it looks like it is probably a curried function.
So really, __functor here is what fmap would be in Haskell, I think, and self (add) is the functor itself, and x: x + self.x is what the function mapped by fmap would be in Haskell.
self is not a keyword, just an ordinary parameter name. You are correct that the right-hand side of __functor is a curried function of two arguments. The Nix interpreter ensures that __functor is passed the appropriate value for self, at the call site inc 1; __functor is handled specially, even though it's not a keyword per se.
Your example is nearly the same as:
let add = a: b: a + b;
    inc = add 1;
in inc 1
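The same shape written as Haskell, where currying and partial application may be more familiar (purely illustrative):
add :: Int -> Int -> Int
add = \a b -> a + b   -- the curried two-argument function, like a: b: a + b

inc :: Int -> Int
inc = add 1           -- partial application plays the role of add // { x = 1; }

-- inc 1 == 2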
In a larger program it might be useful to be able to override add.x elsewhere.
As noted in the comments, Nix uses "functor" in the sense of an object (here, set) that can be used syntactically like a function.
Passing self this way is Nix's version of "Objects are closures". The technique is used many places in Nixpkgs, with & without the __functor feature, to get the usual benefits of Objects, including extension (~ structural subtyping) & late binding.
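To see the "objects are closures" and late-binding idea outside Nix, here is a small Haskell sketch of open recursion: each piece takes self explicitly and the knot is tied at the end, roughly the pattern behind fixed points and overlays in Nixpkgs. All names below are made up for illustration.
data Shape = Shape { area :: Double, describe :: String }

baseShape :: Double -> Shape -> Shape
baseShape r self = Shape
  { area     = pi * r * r
  , describe = "area = " ++ show (area self)   -- late-bound: reads area from the final self
  }

doubleArea :: (Shape -> Shape) -> Shape -> Shape
doubleArea base self = (base self) { area = 2 * area (base self) }

tie :: (Shape -> Shape) -> Shape
tie f = let self = f self in self

-- describe (tie (baseShape 1.0))                == "area = 3.141592653589793"
-- describe (tie (doubleArea (baseShape 1.0)))   == "area = 6.283185307179586"
Because describe reads area off the final self, the doubled override is visible to it; that is the late-binding behaviour the Nixpkgs pattern relies on.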

Interning strings in declarative programming

The following scenario shows an abstraction that seems to me to be impossible to implement declaratively.
Suppose that I want to create a Symbol object which allows you to create objects with strings that can be compared, like Symbol.for() in JavaScript. A simple implementation in JS might look like this:
function MySymbol(text) { // Comparable symbol object "class"
    this.text = text;
    this.equals = function(other) { // Method to compare to another MySymbol
        return this.text == other.text;
    };
}
I could easily write this in a declarative language like Haskell:
data MySymbol = MySymbol String
makeSymbol :: String -> MySymbol
makeSymbol s = MySymbol s
compareSymbol :: MySymbol -> MySymbol -> Bool
compareSymbol (MySymbol s1) (MySymbol s2) = s1 == s2
However, maybe in the future I want to improve efficiency by using a global registry without changing the interface to the MySymbol objects. (The user of my class doesn't need to know that I've changed it to use a registry)
For example, this is easily done in Javascript:
function MySymbol(text) {
    if (MySymbol.registry.has(text)) { // check if symbol is already in the registry
        this.id = MySymbol.registry.get(text); // reuse its id
    } else {
        this.id = MySymbol.nextId++;
        MySymbol.registry.set(text, this.id); // add new symbol with nextId
    }
    this.equals = function(other) { // to compare, simply compare ids
        return this.id == other.id;
    };
}
// Set up the initially empty registry
MySymbol.registry = new Map(); // a map from strings to numbers
MySymbol.nextId = 0;
However, it is impossible to create a mutable global registry in Haskell. (I can create a registry, but not without changing the interface to my functions.)
Specifically, these three possible Haskell solutions all have problems:
Force the user to pass a registry argument or equivalent, making the interface implementation dependent
Use some fancy monad machinery like Haskell's Control.Monad.Random, which would require either foreseeing the optimization from the start or changing the interface, as sketched after this list (and is basically just adding the concept of state into your program and therefore breaks referential transparency, etc.)
Have a slow implementation which might not be practical in a given application
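To make the second option concrete, here is a minimal sketch (illustrative names, no particular library) of what threading the registry through a State monad does to the signatures:
import qualified Data.Map.Strict as Map
import Control.Monad.State

data MySymbol = MySymbol Int

type Registry = (Map.Map String Int, Int)         -- interned strings and the next free id

makeSymbol :: String -> State Registry MySymbol   -- no longer String -> MySymbol
makeSymbol s = do
  (m, next) <- get
  case Map.lookup s m of
    Just i  -> return (MySymbol i)
    Nothing -> do
      put (Map.insert s next m, next + 1)
      return (MySymbol next)

compareSymbol :: MySymbol -> MySymbol -> Bool
compareSymbol (MySymbol a) (MySymbol b) = a == b
Every caller of makeSymbol now has to run in State Registry (or pass the registry around), which is exactly the interface change described above.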
None of these solutions allow me to sufficiently abstract away implementation from my Haskell interface.
So, my question is: Is there a way to implement this optimization to a Symbol object in Haskell (or any declarative language) without causing one of the three problems listed above,
and are there any other situations where an imperative language can express an abstraction (for example an optimization like above) that a declarative language can't?
The intern package shows how. As discussed by @luqui, it uses unsafePerformIO at a few key moments, and is careful to hide the identifiers produced during interning.
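A minimal sketch of that technique, with my own names; the real intern package is considerably more careful (concurrency, hashing, not growing the table forever), so treat this only as an illustration of the shape:
module MySymbol (MySymbol, makeSymbol, compareSymbol) where
-- the constructor stays hidden, so the public interface is unchanged

import qualified Data.Map.Strict as Map
import Data.IORef
import System.IO.Unsafe (unsafePerformIO)

newtype MySymbol = MySymbol Int

-- A process-global registry created once; NOINLINE keeps the optimizer
-- from duplicating the IORef.
{-# NOINLINE registry #-}
registry :: IORef (Map.Map String Int, Int)
registry = unsafePerformIO (newIORef (Map.empty, 0))

{-# NOINLINE makeSymbol #-}
makeSymbol :: String -> MySymbol      -- same pure type as before
makeSymbol s = unsafePerformIO $
  atomicModifyIORef' registry $ \(m, next) ->
    case Map.lookup s m of
      Just i  -> ((m, next), MySymbol i)                          -- already interned
      Nothing -> ((Map.insert s next m, next + 1), MySymbol next)

compareSymbol :: MySymbol -> MySymbol -> Bool
compareSymbol (MySymbol a) (MySymbol b) = a == b                  -- now an O(1) comparison
Callers still see only makeSymbol :: String -> MySymbol and compareSymbol; the registry is invisible, which is the point.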

Static languages that can manipulate object properties?

Something that comes up a fair amount when dealing with heterogeneous data is the need to partially change simple objects that hold data. For instance, you might want to add, drop, or rename a property, or concatenate two objects. This is easy enough in dynamic languages, but I'm wondering if there are any clever solutions proposed by static languages?
To fix ideas, are there any languages that enable, perhaps through some sort of static mixin syntax, something like this (C#):
var hello = new { Hello = "Hello" };
var world = new { World = "World" };
var helloWorld = hello + world;
Console.WriteLine(helloWorld.ToString());
//outputs {Hello = Hello, World = World}
This certainly seems possible, since no runtime information is used. Are there static languages that have this capability?
Added:
A limited version of what I'm considering is F#'s copy-and-update record expression:
let myRecord3 = { myRecord2 with Y = 100; Z = 2 }
What you're describing is known in programming language research as record concatenation. There has been some work on static type systems for record concatenation, mostly in the context of automatic type inference a la Haskell or ML. To the best of my knowledge it has not yet made an impact on any mainstream programming languages.

Is there any object-oriented statically typed language with variables with a few types?

I like reading about programming theories, so could you tell me if there is any object-oriented statically typed language that allows a variable to have one of a few types?
Example in pseudocode:
var value: BigInteger | Double | Nil
I'm thinking about how calling methods on such an object would work. If the value has type BigInteger | Double, the language could allow the user to call only the shared methods (like plus and minus), but when the type is BigInteger | Double | Nil, then Nil has no plus or minus methods, so we can't do anything useful with the object, because it has only a few shared methods (like toString).
So is there any idea of how calling methods on a variable with several possible types should work in a statically typed object-oriented language?
What you are describing is an intersection type. They do exist in Java, for example, but they only arise within the type-checker as the result of capture conversion and type-inference. You cannot write one yourself.
I don't know of any language which uses them directly, but they are often used to describe or analyze type systems of languages, especially languages which don't actually have a type system. For example, Diamondback Ruby, which is a static type system and type-inferencer for the dynamically typed Ruby programming language, uses both union and intersection types.
Note that the syntax you are using is generally used to denote union types, which are the dual of intersection types. Intersection types are generally written A & B & C.
I am not aware of any language that does this... sadly, I'd love to play around with it (but first, they should adopt type inference and parametric polymorphism ;) ).
Although it is already possible: relatively elegantly in a structural type system (type a is a subtype of type b if a has everything b has), simply by specifying a type for value that is a structural subtype of BigInteger, of Double, and of Nil; and slightly less elegantly in a nominative type system (type a is a subtype of type b if and only if it inherits from it, directly or indirectly), by specifying a common ancestor of all three (if all else fails, object). Of course we'd need to go recursive: what is the type of toString? And what's the type of (Integer | Double | BigInteger).+?!? This is far from trivial (in fact, looking for a solution made my head hurt a bit). I can't say if it is impossible, but no mainstream OO language's type system is anywhere near sophisticated enough for a possible solution.
The bottom line is: it'd be really cool if some whizz came along and sorted out the issues it raises. Probably not worth the effort...
Edit: Do you know algebraic data types? They are similar to your idea (but much older ;) ) in that an algebraic data type is composed of several types and can therefore contain e.g. a BigInteger, a Double, or Nil; the actual value is one of these, and a tag (as in tagged union) says which. But to use the value stored in an algebraic data type, you have to use pattern matching to extract it safely. This concept is very powerful, and still "simple" enough to be understood by tools: type inference and static typechecking work.
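For example, a rough Haskell rendering of the questioner's BigInteger | Double | Nil as an algebraic data type (constructor names are mine):
data Value
  = BigInt Integer
  | Dbl Double
  | Nil

-- Pattern matching forces every case, including Nil, to be handled,
-- so "call plus on Nil" can't slip past the type checker.
plus :: Value -> Value -> Maybe Value
plus (BigInt a) (BigInt b) = Just (BigInt (a + b))
plus (Dbl a)    (Dbl b)    = Just (Dbl (a + b))
plus (BigInt a) (Dbl b)    = Just (Dbl (fromInteger a + b))
plus (Dbl a)    (BigInt b) = Just (Dbl (a + fromInteger b))
plus _          _          = Nothing       -- anything involving Nil has no sum

toString :: Value -> String
toString (BigInt a) = show a
toString (Dbl a)    = show a
toString Nil        = "nil"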
It doesn't have much to do with OO, but (as far as I understand it) what you describe looks a lot like polymorphism as implemented in C++.
Yes, OCaml has these in the form of variants (including polymorphic variants):
type my_var = Integer of int | Float of float;;
let x = Integer(10);;
let y = Float(3.14);;
Pike has them, as does Magpie, an optionally typed language I'm working on. Google's Closure Compiler for JavaScript allows you to annotate types in JavaScript using |.
They crop up frequently in languages that bridge static and dynamic typing because a lot of expressions in a dynamic language can yield one of a couple of types:
var a = 123;
if (foo) { a = "string"; }
bar(a);
The statically-determined type being passed to bar() is Number | String.
I'm not so sure that we really have a complete definition of what a statically typed language is, but I also hope that the language you describe wouldn't qualify as one.
One of my concerns is that if you add type T1 and T2 to be a part of your BigInteger | Double | Nil, how would they know about each other and how to handle the operations you defined? Now I realize you never said that the language would allow expanding the "implicit" conversion definition.
Come to think of it, C# does something that resembles this in its string handling
string s = -42 + '+' + "+" + -0.1 / -0.1 + "=" + (7 ^ 5) +
" is " + true + " and not " + AddressFamily.Unknown;
=> "1+1=2 is True and not Unknown"
string str = 1 + 2 + "!=" + 1 + 2;
=> "3!=12"
And I do not like it.
