What is a type inference? - programming-languages

Does it only exist in statically typed languages? And is it only there when the language is not strongly typed (i.e., does Java have one)? Also, where does it belong - in the compilation phase assuming it's a compiled language?
In general, are the rules when the type is ambiguous dictated by the language specification or left up to the implementation?

Type inference is a feature of some statically-typed languages. It is done by the compiler to assign types to entities that otherwise lack any type annotations. The compiler effectively just 'fills in' the static type information on behalf of the programmer.
Type inference tends to work more poorly in languages with many implicit coercions and ambiguities, so most type inferenced languages are functional languages with little in the way of coercions, overloading, etc.
Type inference is part of the language specification. For example, the F# spec goes into great detail about the type inference algorithm and rules, as this effectively determines 'what is a legal program'.
Though some (most?) languages support some limited forms of type inference (e.g. 'var' in C#), for the most part people use 'type inference' to refer to languages where the vast majority of types are inferred rather than explicit (e.g. in F#, function and method signatures, in addition to local variables, are typically inferred; contrast to C# where 'var' allows inference of local variables but method declarations require full type information).
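As a small illustration of the compiler 'filling in' types, here is a sketch in Haskell (used here only as an example of a language with pervasive inference; the answer above talks about F#). The function names are made up for illustration. None of the functions carries an annotation; the types in the comments are what GHC reconstructs.

```haskell
-- Hypothetical helpers with no type annotations at all.
-- The comments show the types GHC infers for them.

pairUp x y = (x, y)        -- inferred: a -> b -> (a, b)

compose f g x = f (g x)    -- inferred: (b -> c) -> (a -> b) -> a -> c

-- inferred: Foldable t => t a -> Int
countAndBump xs = compose (+ 1) length xs

main :: IO ()
main = do
  print (countAndBump "abc")   -- prints 4
  print (pairUp 'x' True)      -- prints ('x',True)
```

In GHCi, `:type compose` shows the inferred signature, which is a quick way to see what the compiler worked out.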

A type inferencer determines what type a variable is from the context. It relies on strong typing to do so. For example, functional languages are very strongly, statically typed but completely rely on type inference.
C# and VB.Net are other examples of statically typed languages with type inference (they provide it to make generics usable, and it is required for queries in LINQ, specifically to support projections).
Dynamic languages do not infer types; the types are discovered at runtime.

Type inferencing is a bit of a compromise found in some static languages. You can declare variables without specifying the type, provided that the type can be inferred at compile time. It doesn't offer the flexibility of latent typing, but you do get type safety and you don't have to write as much.
See the Wikipedia article.

A type inferencer is anything which deduces types statically, using a type inference algorithm. As such, it is not just a feature of static languages.
You may build a static analysis tool for dynamic languages, or those with unsafe or implicit type conversions, and type inference will be a major part of its job. However, type inference for languages with unsafe or dynamic type systems, or which include implicit conversions, can not be used to prove the type safety of a program in the general case.
As such, type inference is used:
to avoid type annotations in static languages,
in optimizing compilers for dynamic languages (e.g. for Scheme, Self and Python),
in bug checking tools, compilers and security analysis for dynamic languages.

Related

In what way is haskells type system more helpful than the type system of another statically typed language

I have been using Haskell for a while now. I understand most/some of the concepts, but I still do not understand what exactly Haskell's type system allows me to do that I cannot do in another statically typed language. I just intuitively know that Haskell's type system is better in every imaginable way compared to the type system in C, C++ or Java, but I can't explain it logically, primarily because of a lack of in-depth knowledge about the differences in type systems between Haskell and other statically typed languages.
Could someone give me examples of how haskells type system is more helpful compared to a language with a static type system. Examples, that are terse and can be succinctly expressed would be nice.
The Haskell type system has a number of features which all exist in other languages, but are rarely combined within a single, consistent language:
it is a sound, static type system, meaning that a number of errors are guaranteed not to happen at runtime without needing runtime type checks (this is also the case in Caml, SML and almost the case in Java, but not in, say, Lisp, Python, C, or C++);
it performs static type reconstruction, meaning that the programmer doesn't need to write types unless he wants to, the compiler will reconstruct them on its own (this is also the case in Caml and SML, but not in Java or C);
it supports impredicative polymorphism (type variables), even at higher kinds (unlike Caml and SML, or any other production-ready language known to me);
it has good support for overloading (type classes) (unlike Caml and SML).
Whether any of those make Haskell a better language is open to discussion — for example, while I happen to like type classes a lot, I know quite a few Caml programmers who strongly dislike overloading and prefer to use the module system.
On the other hand, the Haskell type system lacks a few features that other languages support elegantly:
it has no support for runtime dispatch (unlike Java, Lisp, and Julia);
it has no support for existential types and GADTs (these are both GHC extensions);
it has no support for dependent types (unlike Coq, Agda and Idris).
Again, whether any of these are desirable features in a general-purpose programming language is open to discussion.
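To give a flavour of one of the GHC extensions mentioned in the list above, here is a minimal sketch of an existential type. It assumes GHC with the `ExistentialQuantification` extension; the names `Showable`, `MkShowable`, `describe` and `items` are invented for the example.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- A hypothetical wrapper that hides the concrete type and keeps only
-- the fact that the wrapped value has a Show instance.
data Showable = forall a. Show a => MkShowable a

describe :: Showable -> String
describe (MkShowable x) = show x

-- A heterogeneous list: an Int, a Bool and a String behind one type.
items :: [Showable]
items = [MkShowable (1 :: Int), MkShowable True, MkShowable "hi"]

main :: IO ()
main = mapM_ (putStrLn . describe) items
```

The `show` that runs for each element is picked from the instance packed into the constructor, which is about as close as this sketch gets to the runtime dispatch discussed above.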
In addition to what others have answered, it is also Haskell's type system that makes the language pure, i.e. which distinguishes between values of a certain type and effectful computations that produce a result of that type.
One major difference between Haskell's type system and that of most OO languages is that the ability for a function to have side effects is represented by a data type (a monad such as IO). This allows you to write pure functions that the compiler can verify are side-effect-free and referentially transparent, which generally means that they're easier to understand and less prone to bugs. It's possible to write side-effect-free code in other languages, but you don't have the compiler's help in doing so. Haskell makes you think more carefully about which parts of your program need to have side effects (such as I/O or mutable variables) and which parts should be pure.
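A minimal sketch of that distinction, with made-up function names: the pure function and the effectful one differ in their types, and the compiler refuses to let the pure one run an effect.

```haskell
-- Pure: the type promises a result that depends only on the argument.
double :: Int -> Int
double n = n * 2

-- Effectful: the IO in the result type marks the side effect.
announce :: Int -> IO Int
announce n = do
  putStrLn ("got " ++ show n)
  return (double n)

-- Rejected at compile time if uncommented: a function of type
-- Int -> Int cannot run an IO action.
-- bad :: Int -> Int
-- bad n = announce n + 1

main :: IO ()
main = do
  r <- announce 21
  print r   -- prints 42
```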
Also, although it's not quite part of the type system itself, the fact that function definitions in Haskell are expressions (rather than lists of statements) means that more of the code is subject to type-checking. In languages like C++ and Java, it's often possible to introduce logic errors by writing statements in the wrong order, since the compiler doesn't have a way to determine that one statement must precede another. For example, you might have one line that modifies an object's state, and another line that does something important based on that state, and it's up to you to ensure that these things happen in the correct order. In Haskell, this kind of ordering dependency tends to be expressed through function composition — e.g. f (g x) means that g must run first — and the compiler can check the return type of g against the argument type of f to make sure you haven't composed them the wrong way.
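The `f (g x)` case can be sketched as follows. The bodies chosen for `f` and `g` are arbitrary stand-ins, picked only so that the two functions have different argument and result types.

```haskell
-- Stand-in definitions: g turns a String into an Int,
-- f turns an Int into a Bool.
g :: String -> Int
g = length

f :: Int -> Bool
f = even

-- g runs first; its Int result is exactly what f expects.
checked :: String -> Bool
checked x = f (g x)

-- Rejected at compile time if uncommented: f returns a Bool,
-- but g expects a String.
-- wrongOrder x = g (f x)

main :: IO ()
main = print (checked "abcd")   -- prints True
```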

The nature of Haskell type system: static/dynamic, manual/inferred?

I'm learning Haskell and trying to grasp how exactly Haskell type system works re working out what is the type of the thing: dynamic, static, set manually, inferred?
Languages I know a bit:
C, Java: set manually by a programmer, verified at compile time, like int i;, strong typing (subtracting integer from a string is a compile error). Typical static type system.
Python: types inferred automatically by runtime (dynamic typing), strong typing (subtracting int from a str raises exception).
Perl, PHP: types inferred automatically at runtime (dynamic typing), weak typing.
Haskell: types often inferred automatically at compile time (either this or type is set explicitly by a programmer before compile time), strong typing.
Does Haskell's type system really deserve description "static"? I mean automatic type inference is not (classic) static typing.
Type inference is done at compile time. All types are checked at compile time. Haskell implementations may erase types at runtime, as they have a compile-time proof of type safety.
So it is correct to say that Haskell has a "static" type system. "Static" refers to one side of the phase distinction between compile-time and runtime.
To quote Robert Harper:
"Most programming languages exhibit a phase distinction between the static and dynamic phases of processing. The static phase consists of parsing and type checking to ensure that the program is well-formed; the dynamic phase consists of execution of well-formed programs. A language is said to be safe exactly when well-formed programs are well behaved when executed."
From Practical Foundations for Programming Languages, 2014.
Under this description Haskell is a safe language with a static type system.
As a side note, I'd strongly recommend the above book for those interested in learning the essential skills for understanding about programming languages and their features.
The static-dynamic axis and the manual-inferred (or manifest-inferred) scales are not orthogonal. A static type system can be manifest or inferred, the distinction doesn't apply to dynamic typing. Python, Perl, PHP don't infer types because type inference is the deduction of static types via static analysis (i.e., at compile time).
Dynamic languages don't deduce types like that, they just compute the types of values alongside the actual computation. Haskell does deduce types statically, it is statically typed, you just don't have to write the static types manually. Its type system indeed differs from mainstream type systems, but it differs by being inferred rather than manifest (and many other features), not by being not static.
As for strong/weak typing: Stop using that term, it is overloaded to the point of being useless. If by "the type system is strong/weak" you mean "the type system allows/forbids X", then say that, because if you call it strong/weak typing a large portion of your audience will have a different definition and disagree with your use of the terms. Moreover, as you see, for most choices of X it's rather independent of the two distinctions you mention in the title (but then there's the sizable group that uses strong as synonym for static and weak as synonym for dynamic, oh my!).
Before going to SO it would be good to check e.g. Wikipedia, which says that "Haskell has a strong, static type system based on Hindley–Milner type inference." Static refers to when type checking is done, so whether types are inferred or not doesn't matter.

Does haskell erase types?

Does Haskell erase types, and if so, in what ways is this similar/dissimilar to the type erasure that occurs in Java?
Warning: experience+inference. Consult someone who works on both compilers for The Truth.
In the sense that type checking is done at compile time, and several complex features of the type system are reduced to much simpler language constructs, yes, but in a rather different way to Java.
A type signature creates no runtime overhead. The Haskell compiler is good at program transformation (it has more leeway, because the running order is in many cases not specified by the programmer), and automatically inlines appropriate definitions and specialises Haskell-polymorphic (= Java-generic) functions to a particular type etc., as it sees fit, if it helps. In that respect it is similar to Java type erasure, but more so.
There are in essence no type casts needed in Haskell to ensure type safety, because Haskell is designed to be type-safe from the ground up. We don't resort to turning everything into an Object, and we don't cast them back, because a polymorphic(generic) function genuinely does work on any data type, no matter what, pointer types or unboxed integers, it just works, without trickery. So unlike Java, casting is not a feature of compiling polymorphic(generic) code. Haskell folk tend to feel that if you're doing type casting, you said goodbye to type safety anyway.
For a lovely example of how ensuring the code's static type-correctness at compile time can avoid runtime overhead, there's a newtype construct in Haskell which is a type-safe wrapper for an existing type, and it's completely compiled away - all the construction and destruction simply doesn't happen at runtime. The type system ensures at compile time it's used correctly, it can't be got at at runtime except using (type-checked) accessor functions.
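A minimal sketch of the newtype idea, using invented unit types: `Meters` and `Seconds` are both plain `Double`s at runtime, but the type checker keeps them apart.

```haskell
-- Hypothetical unit wrappers. Each has the same runtime
-- representation as a bare Double.
newtype Meters  = Meters Double
newtype Seconds = Seconds Double

speed :: Meters -> Seconds -> Double
speed (Meters m) (Seconds s) = m / s

-- Rejected at compile time if uncommented: arguments swapped.
-- oops = speed (Seconds 10) (Meters 100)

main :: IO ()
main = print (speed (Meters 100) (Seconds 10))   -- prints 10.0
```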
Polymorphic(generic) functions don't have polymorphic overheads. Haskell-overloaded functions (Java-interface-instance methods) have a data overhead in the sense that there's an implicit dictionary of functions used for what appears to be late binding to Java programmers, but is in fact, again, determined at compile time.
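The implicit dictionary can be modelled by hand. The sketch below is only a rough model under invented names (`Shape`, `ShapeDict` and so on), not what GHC actually emits: the class-based `totalArea` and the explicit `totalArea'` do the same job, except that in the second the dictionary is an ordinary argument.

```haskell
-- An overloaded function, the way one would normally write it.
class Shape a where
  area :: a -> Double

newtype Circle = Circle Double
newtype Square = Square Double

instance Shape Circle where
  area (Circle r) = pi * r * r

instance Shape Square where
  area (Square s) = s * s

totalArea :: Shape a => [a] -> Double
totalArea = sum . map area

-- Hand-written model of the dictionary: a record of the class methods.
data ShapeDict a = ShapeDict { areaOf :: a -> Double }

circleDict :: ShapeDict Circle
circleDict = ShapeDict { areaOf = \(Circle r) -> pi * r * r }

-- The constraint "Shape a =>" becomes an explicit argument.
totalArea' :: ShapeDict a -> [a] -> Double
totalArea' dict = sum . map (areaOf dict)

main :: IO ()
main = do
  print (totalArea [Square 2, Square 3])   -- prints 13.0
  print (totalArea' circleDict [Circle 1] == totalArea [Circle 1])
```

Which dictionary to pass is decided by the type checker at each call site, which is the sense in which the choice is made at compile time.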
Summary: yes, even more so than in Java, and no, they were never there at runtime to erase anyway.
C and Pascal have type erasure. Java lets you inspect classes at run-time - even dynamically loaded ones!
What Haskell does is much closer to Pascal than to Java.

Is there any link between functional programming and strong typing?

All of the "pure" functional languages are strongly typed. Is there any link between those?
Non-trivial functional programming techniques make heavy use of first-class and higher-order functions. First-class functions are implemented as closures. Non-trivial use of first-class functions and closures is only sane when you have garbage collection. Efficient and reliable garbage collection requires memory safety (which I assume you mean by "strongly typed"). So there you go.
Purity doesn't really matter for that.
"Pure" functional languages are those which enforce referential transparency. The enforcement could be static (via the type system), or it could be dynamic (e.g. a runtime failure). I'm guessing you mean "statically typed" when you say "strongly typed"...
Since the community from which typed, pure functional programming emerged is separately interested in reducing runtime failures and making programming safer, adding purity without type enforcement -- such that runtime failure is still an option -- is incongruous.
So it's no surprise you see types and effect typing going together with purity-by-default: it is all about reducing runtime failures.
Mercury (in which you can do functional programming, but is more of a pure logic programming language) actually has an explicit static purity system. Every predicate or function is statically known to be pure or impure (or semipure, but I'm not going to go into that in detail). Putting a call to an impure function inside a pure function (pure is the default) will result in an error detected at compile time.
It also has a static type system, in which the type of every expression/variable is statically known by the compiler, and type errors are detected at compile time. But the type system is completely independent of the purity system (in that you can have pure, impure, and semipure functions of any given type).
So we can imagine a different language with the same static purity system but in which the types of expressions/variables are not statically known, and may vary dynamically at runtime. One could even imagine such a language having "weak types" in the sense of PHP (i.e. the language will try to convert values such that operations that don't make sense on the value's type can actually be performed), or in the sense of C (i.e. you can convince the language to store values of one type in a variable the language will treat as if it were a different type).
One could also imagine a language in which the purity was not statically known but still enforced at runtime. The language would have to do something such as keeping track of whether it was in a pure call, and if so rejecting calls to impure primitive operations.
So in that sense, no there's no link between strong typing and pure programming.
However, languages which actually enforce purity (rather than merely encouraging it, as in Scala) have traditionally achieved this by static analysis. Indeed, one of the motivations for pure code is that it is much more susceptible to static analysis than code which is impure in arbitrary ways. A contrived example is that a function which takes a boolean argument and returns something can be known to return one of at most two results if it is pure; if it is not known to be pure then the language has to assume it might return something different at every single invocation. And if you're interested in doing static analysis of your code and you have this static analysis system for enforcing purity, you might as well make it enforce type safety as well. So there's just "not that much call" for languages which enforce purity but don't have strong static type systems. I'm not aware of any that actually exist (there's not all that many languages that enforce purity at all, as far as I know).

What makes Haskell's type system more "powerful" than other languages' type systems?

Reading Disadvantages of Scala type system versus Haskell?, I have to ask: what is it, specifically, that makes Haskell's type system more powerful than other languages' type systems (C, C++, Java). Apparently, even Scala can't perform some of the same powers as Haskell's type system. What is it, specifically, that makes Haskell's type system (Hindley–Milner type inference) so powerful? Can you give an example?
What is it, specifically, that makes Haskell's type system
It has been engineered for the past decade to be both flexible -- as a logic for property verification -- and powerful.
Haskell's type system has been developed over the years to encourage a relatively flexible, expressive static checking discipline, with several groups of researchers identifying type system techniques that enable powerful new classes of compile-time verification. Scala's is relatively undeveloped in that area.
That is, Haskell/GHC provides a logic that is both powerful and designed to encourage type level programming. Something fairly unique in the world of functional programming.
Some papers that give a flavor of the direction the engineering effort on Haskell's type system has taken:
Fun with type functions
Associated types with class
Fun with functional dependencies
Hindley-Milner is not a type system, but a type inference algorithm. Haskell's type system, back in the day, used to be able to be fully inferred using HM, but that ship has long sailed for modern Haskell with extensions. (ML remains capable of being fully inferred).
Arguably, the ability to mainly or entirely infer all types yields power in terms of expressiveness.
But that's largely not what I think the question is really about.
The papers that dons linked point to the other aspect -- that the extensions to Haskell's type system make it turing complete (and that modern type families make that turing complete language much more closely resemble value-level programming). Another nice paper on this topic is McBride's Faking It: Simulating Dependent Types in Haskell.
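For a taste of what a type function looks like, here is a small sketch of an associated type, assuming GHC with the `TypeFamilies` extension. The `Container` class and its method names are invented for this example and are not taken from the papers.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Elem is a function from types to types: given a container type f,
-- it computes the type of the elements.
class Container f where
  type Elem f
  empty   :: f
  insert  :: Elem f -> f -> f
  toList' :: f -> [Elem f]

instance Container [a] where
  type Elem [a] = a
  empty   = []
  insert  = (:)
  toList' = id

-- Generic code written against the type function.
fromL :: Container f => [Elem f] -> f
fromL = foldr insert empty

main :: IO ()
main = print (toList' (fromL [1, 2, 3] :: [Int]))   -- prints [1,2,3]
```

The type checker has to evaluate `Elem [Int]` to `Int` to accept the last line, which is the small-scale version of the type-level computation discussed above.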
The paper in the other thread on Scala: "Type Classes as Objects and Implicits" goes into why you can in fact do most of this in Scala as well, although with a bit more explicitness. I tend to feel, but this is more a gut sense than from real Scala experience, that its more ad-hoc and explicit approach (what the C++ discussion called "nominal") is ultimately a bit messier.
Let's go with a very simple example: Haskell's Maybe.
data Maybe a = Nothing | Just a
In C++:
template <typename T>
struct Maybe {
    bool isJust;
    T value; // IMPORTANT: must ignore when !isJust
};
Let's consider these two function signatures, in Haskell:
sumJusts :: Num a => [Maybe a] -> a
and C++:
template <typename T> T sumJusts(std::vector<Maybe<T>>);
Differences:
In C++ there are more possible mistakes to make. The compiler doesn't check the usage rule of Maybe.
The C++ type of sumJusts does not specify that it requires + and cast from 0. The error messages that show up when things do not work are cryptic and odd. In Haskell the compiler will just complain that the type is not an instance of Num, very straightforward.
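The answer gives only the signature of sumJusts. One possible body, offered as an assumption about what was intended, drops the Nothing entries and sums the rest:

```haskell
import Data.Maybe (catMaybes)

-- The Num constraint is all the signature asks of the element type.
sumJusts :: Num a => [Maybe a] -> a
sumJusts = sum . catMaybes

main :: IO ()
main = do
  print (sumJusts [Just 1, Nothing, Just (2 :: Int)])        -- prints 3
  print (sumJusts [Just 1.5, Nothing, Just (2.5 :: Double)]) -- prints 4.0
  -- Rejected at compile time if uncommented, because String is not
  -- an instance of Num:
  -- print (sumJusts [Just "a"])
```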
In short, Haskell has:
ADTs
Type-classes
A very friendly syntax and good support for generics (which in C++ people try to avoid because of all their cryptickynessishisms)
The Haskell language allows you to write safer code without giving up functionality. Most languages nowadays trade features for safety: Haskell is there to show that it's possible to have both.
We can live without null pointers, explicit castings, loose typing and still have a perfectly expressive language, able to produce efficient final code.
Moreover, the Haskell type system, along with its lazy-by-default and purity approach to coding, gives you a boost in complicated but important matters like parallelism and concurrency.
Just my two cents.
One thing I really like and miss in other languages is the support of typeclasses, which are an elegant solution for many problems (including for instance polyvariadic functions).
Using typeclasses, it's extremely easy to define very abstract functions, which are still completely type-safe - like for instance this Fibonacci-function:
fibs :: Num a => [a]
fibs@(_:xs) = 0:1:zipWith (+) fibs xs
For instance:
map (`div` 2) fibs -- integral context
(fibs !! 10) + 1.234 -- rational context
map (:+ 1.0) fibs -- Complex context
You may even define your own numeric type for this.
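As a sketch of the "own numeric type" remark: the `Mod7` type below (integers modulo 7) is invented for the example, and `fibs` is restated in a plain recursive form so that the block stands on its own.

```haskell
-- A hypothetical numeric type: integers modulo 7.
newtype Mod7 = Mod7 Integer deriving (Eq, Show)

toMod7 :: Integer -> Mod7
toMod7 n = Mod7 (n `mod` 7)

instance Num Mod7 where
  Mod7 a + Mod7 b = toMod7 (a + b)
  Mod7 a * Mod7 b = toMod7 (a * b)
  negate (Mod7 a) = toMod7 (negate a)
  abs             = id
  signum (Mod7 0) = Mod7 0
  signum _        = Mod7 1
  fromInteger     = toMod7

-- The same overloaded Fibonacci list, written with tail.
fibs :: Num a => [a]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

main :: IO ()
main = print (take 10 fibs :: [Mod7])
```

Nothing in `fibs` had to change: the literals 0 and 1 go through `fromInteger`, and `(+)` is the one from the `Mod7` instance.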
What is expressiveness? To my understanding, it is what constraints the type system allows us to put on our code, or in other words which properties of the code we can prove. The more expressive a type system is, the more information we can embed at the type level (which can be used at compile time by the type-checker to check our code).
Here are some properties of Haskell's type system that other languages don't have.
Purity.
Purity allows Haskell to distinguish pure code from IO-capable code.
Parametricity.
Haskell enforces parametricity for parametrically polymorphic functions, so they must obey some laws. (Some languages let you express polymorphic function types but don't enforce parametricity; for example, Scala lets you pattern match on a specific type even if the argument is polymorphic.)
ADT
Extensions
Haskell's base type system is a weaker version of λ2, which itself isn't really impressive. But with these extensions it becomes really powerful (even able to express dependent types with singletons):
existential types
rank-n types (full λ2)
type families
data kinds (allows "typed" programming at type level)
GADT
...
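Two of the listed extensions can be combined in a short sketch: a length-indexed vector, assuming GHC with `DataKinds`, `GADTs` and `KindSignatures`. The names `Nat`, `Vec`, `vhead` and `vtoList` are chosen for the example.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- With DataKinds, Z and S are also usable at the type level.
data Nat = Z | S Nat

-- A list whose length is part of its type.
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Only accepts vectors the type checker knows to be non-empty,
-- so no case for VNil is needed.
vhead :: Vec ('S n) a -> a
vhead (VCons x _) = x

vtoList :: Vec n a -> [a]
vtoList VNil         = []
vtoList (VCons x xs) = x : vtoList xs

example :: Vec ('S ('S 'Z)) Int
example = VCons 1 (VCons 2 VNil)

-- Rejected at compile time if uncommented: VNil has length 'Z.
-- bad = vhead VNil

main :: IO ()
main = print (vhead example, vtoList example)   -- prints (1,[1,2])
```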

Resources