Basic Concepts of Language Type Systems - programming-languages

Could someone please explain clearly and succinctly the concepts of language type systems?
I've read a post or two here on type systems, but have trouble finding one that answers all my questions below.
I've heard/read that there are 3 type categorizations: dynamic vs static, strong vs weak, safe vs unsafe.
Some questions:
Are there any others?
What do each of these mean?
If a language allows you to change the type of a variable in runtime (e.g. a variable that used to store an int is later used to store a string), what category does that fall in?
How does Python fit into each of these categories?
Is there anything else I should know about type systems?
Thanks very much!

1) Apparently, there are others: http://en.wikipedia.org/wiki/Type_system
2)
Dynamic => Type checking is done during runtime (program execution) e.g. Python.
Static (as opposed to Dynamic) => Type checking is done during compile time e.g. C++
Strong => Once the type system decides that a particular object is of a type, it doesn't allow it to be used as another type. e.g. Python
Weak (as opposed to Strong) => The type system allows an object's type to change implicitly. e.g. Perl lets you read a number as a string, then use it again as a number
Type safety => I can only best describe with a 'C' statement like:
x = (int *) malloc (...);
malloc returns a (void *) and we simply type-cast it to (int *). At compile time there is no check that the memory malloc returns is actually large enough to hold an int => Some C operations aren't type safe.
I am told that some 'purely functional' languages are inherently type safe, but I do not know any of these languages. I think Standard ML or Haskell would be type safe.
3) "If a language allows you to change the type of a variable in runtime (e.g. a variable that used to store an int is later used to store a string), what category does that fall in?":
This may be dynamic - variables are untyped, values may carry implicit or explicit type information; alternatively, the type system may be able to cope with variables that change type, and be a static type system.
4) Python: It's dynamically and strongly typed. I don't know Python (or type safety itself) well enough to say anything about its type safety.
5) "Is there anything else I should know about type systems?": Maybe read the book @BasileStarynkevitch suggests?
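To make points 3) and 4) concrete, here is a minimal Python sketch (the variable names are only for illustration). It shows a name being rebound to values of different types, which is the dynamic part, and an operation on mismatched types being refused, which is the strong part.

```python
# A name can be rebound to values of different types (dynamic typing).
x = 42
first_type = type(x).__name__
x = "forty-two"
second_type = type(x).__name__

# Values keep their type; mixing them without an explicit conversion
# is refused (strong typing).
try:
    result = 1 + "1"
except TypeError as exc:
    result = None
    error_message = str(exc)

# An explicit conversion is needed to combine the two.
explicit = 1 + int("1")

print(first_type, second_type, explicit)
```

The variable itself never had a type here; only the values it was bound to did.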

You are asking a lot here :) Type systems are a dedicated field of computer science!
Starting from the beginning, "a type system is a method for proving the absence of certain program behavior" (see B. Pierce's Types and Programming Languages, also referred to in the other answer). The programs that pass type checking are a subset of all the programs that would be valid. For instance, the method
int answer() {
    if (true) { return 42; } else { return "wrong"; }
}
would actually behave well at run-time. The else branch is never executed, and answer always returns 42. The static type system is a conservative analysis that will reject this program, because it cannot prove the absence of a type error, that is, that "wrong" is never returned.
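For comparison, here is a rough Python transcription of the same method (a sketch, not part of the original answer). Python does no static type checking of its own, so the interpreter accepts it, and it behaves well at run time exactly as described.

```python
def answer():
    # The else branch can never run, so the string is never returned.
    if True:
        return 42
    else:
        return "wrong"


value = answer()
print(value, type(value).__name__)
```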
Of course, you could improve the type system to actually detect that the else branch never happens. You want to improve the type system to reject as few programs as possible. This is why type systems have been enriched over the years to support more and more refinements (e.g. generics, etc.)
The point of a type system is to prove the absence of type errors. In practice, type systems support operations like downcasting that inherently imply run-time type checks, and might lead to type errors. Again, the goal is to make the type system as flexible as possible, so that we don't need to resort to these operations that weaken type safety (e.g. generics).
You can read chapter 1 of the aforementioned book for a really nice introduction. For the rest, I will refer you to What To Know Before Debating Type Systems, which is an awesome blog post about the basic concepts.
Is there anything else I should know about type systems?
Oh, yes! :)
Happy immersion in the world of type systems!

I suggest reading B. Pierce's Types and Programming Languages book. I also suggest learning a bit of a statically typed language with type inference, like OCaml or Haskell.

A type system is a mechanism which controls the functions which access values. Compile time checking is one aspect of this, which rejects programs during compilation if an attempt is made to use a function on values it is not designed to handle. However another aspect is the converse, the selection of functions to handle some values, for example overloading. Another example is specialisation of polymorphic functions (e.g. templates in C++). Inference and deduction are other aspects where the type of functions is deduced by usage rather than specified by the programmer.
Parts of the checking and selection can be deferred until run time. Dispatch of methods based on variant tags or by indirection or specialised tables as for C++ virtual functions or Haskell typeclass dictionaries are two examples provided even in extremely strongly typed languages.
The key concept of type systems is called soundness. A type system is sound if it guarantees no value can be used by an inappropriate function. Roughly speaking an unsound type system has "holes" and is useless. The type system of ISO C89 is sound if you remove casts (and void* conversions), and unsound if you allow them. The type system of ISO C++ is unsound.
A second vital concept of type systems is called expressiveness. Sound type systems for polymorphic programming prevent programmers writing valid code: they're universally too restrictive (and I believe inescapably so). Making type systems more expressive so they allow a wider set of valid programs is the key academic challenge.
Another concept of typing is strength. A strong type system can find more errors earlier. For example many languages have type systems too weak to detect array bounds violations using the type system and have to resort to run time checks. Somehow strength is the opposite of expressiveness: we want to allow more valid programs (expressiveness) but also catch even more invalid ones (strength).
Here's a key question: explain why OO typing is too weak to permit OO to be used as a general development paradigm. [Hint: OO cannot handle relations]

Related

Is Haskell a strongly typed programming language?

Is Haskell strongly typed? I.e. is it possible to change the type of a variable after you assigned one? I can't seem to find the answer on the internet.
Static — types are known at compile time. Java and Haskell have static typing. Also C/C++, C#, Go, Scala, Rust, Kotlin, Pascal to list a few more.
A statically typed language might or might not have type inference. Java almost completely lacks type inference (but it's very slowly changing just a little bit); Haskell has full type inference (except with certain very advanced extensions).
(Type inference is when you only have to declare a minimal amount of types by hand, e.g. var isFoo = true and var person = new Person(), instead of bool isFoo = ... and Person person = ....)
Dynamic — Python, JavaScript, Ruby, PHP, Clojure (and Lisps in general), Prolog, Erlang, Groovy etc. Can also be called "unityped"; dynamic typing can be "emulated" in a static setting, but the reverse is not true except by using external static analysis tools. Some languages make it possible to mix dynamic and static (see gradual typing, e.g. https://typedclojure.org/).
Some languages enable static typing for one or more modules, applied at import time, for example: Python+Mypy, Typed Clojure, JavaScript+Flow, PHP+Hack to name a few.
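As a small illustration of the Python+Mypy case, here is a sketch (the function name is made up for the example). Plain Python stores the annotations but does not enforce them at run time. An external checker such as Mypy is what would report the second call, before the program runs.

```python
def double(x: int) -> int:
    # The annotation documents intent; the interpreter does not check it.
    return x * 2


print(double(21))    # an int goes in, as annotated
print(double("ab"))  # a str goes in: runs anyway, a static checker would object
```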
Strong — values that are intended to be treated as Cat always are; trying to treat them like a Dog will cause a loud meeewww... I mean error.
Weak — this effectively boils down to 2 similar but distinct things: type coercion (e.g. "5"+3 equals 8 in PHP — or does it!) and memory reinterpretation (e.g. (int) someCharValue or (bool) somePtr in C, and C++ as well, but C++ wants you to explicitly say reinterpret_cast). So there's really coercion-weak and reinterpretation-weak, and different languages are weak in one or both of these ways.
Interestingly, note that coercion is implicit by nature and memory reinterpretation is explicit (except in Assembly) — so weak typing consists of an implicit and an explicit behavior. Maybe that's even more of a reason to refer to 2 distinct subcategories under weak typing.
There are languages with all 4 possible combinations, and variations/gradations thereof.
Haskell is static+strong; of course it has unsafeCoerce so it can be static+a bit reinterpret-weak at times, but unsafeCoerce is very much frowned upon except in extreme situations where you are sure about something being the case but just can't seem to persuade the compiler without going all the way back and retelling the entire story in a different way.
C is static+weak because all memory can freely be reinterpreted as something it originally was not meant to contain, hence weak. But all of those reinterpretations are kept track of by the type checker, so still fully static too. C's implicit conversions are mostly confined to the arithmetic types (and to and from void*), so it's mainly reinterpret-weak.
Python is dynamic+almost entirely strong: there are no types known on any given line of code prior to reaching that line during execution, however values that live at runtime do have types associated with them and it's impossible to reinterpret memory. Implicit coercions are also kept to a meaningful minimum, so one might say Python is 99.9% strong and 0.1% coercion-weak.
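A short sketch of where that small coercion-weak part of Python shows up, and where it stops (variable names are only illustrative):

```python
# Implicit conversions Python does perform, all among numeric types.
int_plus_float = 1 + 2.5   # the int is widened to a float
bool_plus_int = True + 1   # bool is a subclass of int

# A conversion it refuses to guess at.
try:
    "5" + 3
    string_plus_int_ok = True
except TypeError:
    string_plus_int_ok = False

# Truthiness is another implicit conversion: any object works in a condition.
empty_list_as_bool = bool([])
```

Compare this with the PHP example above, where "5"+3 is silently coerced.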
PHP and JavaScript are dynamic+mostly weak — dynamic, in that nothing has type until you execute and introspect its contents, and also weak in that coercions happen all the time and with things you'd never really expect to be coerced, unless you are only calling methods and functions and not using built-in operations. These coercions are a source of a lot of humor on the internet. There are no memory reinterpretations so PHP and JS are coercion-weak.
Furthermore, some people like to think that static typing is about variables having type, and strong typing is about values having type — this is a very useful way to go about understanding the full picture, but it's not quite true: some dynamically typed languages also allow variables/parameters to be annotated with types/constraints that are enforced at runtime.
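Here is a minimal sketch of what "annotations enforced at runtime" can look like in a dynamically typed language. The decorator enforce_types is a hypothetical helper written for this example, not a standard library feature, and it only handles annotations that are plain classes.

```python
import functools
import inspect


def enforce_types(func):
    """Hypothetical helper: check annotated parameters on every call."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = signature.parameters[name].annotation
            # Skip unannotated parameters and anything that is not a plain class.
            if expected is inspect.Parameter.empty or not isinstance(expected, type):
                continue
            if not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@enforce_types
def repeat(text: str, times: int) -> str:
    return text * times
```

With this in place, repeat("ab", 2) works, while repeat(3, 2) raises a TypeError at the call instead of quietly returning 6. The parameter still has no statically known type; the constraint is checked only when the call happens.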
In static typing, it's expressions that have a type; the fact of variables having type is only a consequence of variables being used as a means to glue bigger expressions together from smaller ones, so it's not variables per se that have types.
Similarly, in dynamic typing, it's not the variables that lack statically known type — it's all expressions! Variables lacking type is merely a consequence of the expressions they store lacking type.
One final illustration
In dynamic typing, all the cats, dogs and even elephants (in fact entire zoos!) are packaged up in identically sized boxes.
In static typing these boxes look different and have stickers on them saying what's inside.
Some people like it because they can just use a single box form factor and don't have to put any labels on the boxes — it's only the arrangement of boxes with regards to each other that implicitly (and hopefully) provides type sanity.
Some people also like it because it allows them to do all sorts of tricks with tigers temporarily being transported in boxes that smell like lions, and bears put in the same array of interconnected boxes as wolves or deer.
In such a label-free setting of transport boxes, all the possible logistics scenarios need to be played or simulated in order to detect misalignment in the implicit arrangement, like in a stage performance. No reliable guarantees can be given based on reasoning only, generally speaking. (ad-hoc test cases that need the entire system to be started up before any partial conclusions about its soundness can be obtained)
With labels and explicit rules on how to deal with boxes of various labels, automated/mechanized logical reasoning can be used to draw conclusions on what the logistics system won't do or will do for sure (static verification, formal proof, or at least pseudo-proof like QuickCheck). Some aspects of the logistics still need to be verified with trial runs, such as whether the logistics team even got the client right (integration testing, acceptance testing, end user sanity checks).
Moreover, in weak typing dogs can be sliced up and reassembled as Frankenstein cats, whether they like it or not, and whether the result is ugly or not. (weak typing)
But if you add labels to the boxes, it still matters that Frankenstein cats be put in cat boxes. (static+weak typing)
In strong typing, you can put a cat in the box of a dog, but you can only keep pretending it's a dog until you try to humiliate it by feeding it something only dogs would eat. If that happens, it will scream out loud, but until that time, if you're in dynamic typing, it will silently accept its place (in a static world it would refuse to be put in a dog's box before you can say "kitty").
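The cat-in-a-dog's-box scenario can be sketched in Python, which is dynamic+strong (the class and function names are invented for the illustration):

```python
class Dog:
    def fetch(self):
        return "stick"


class Cat:
    def purr(self):
        return "purr"


def play_fetch(dog):
    # Nothing checks that the argument really is a Dog until fetch() is called.
    return dog.fetch()


dog_box = Cat()  # a cat in the dog's box: accepted silently

try:
    play_fetch(dog_box)
    outcome = "no complaint"
except AttributeError as exc:
    outcome = f"loud error: {exc}"

print(outcome)
```

The error only appears at the moment the cat is asked to do something only dogs do.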
You seem to mix up dynamic/static and weak/strong typing.
Dynamic or static typing is about whether the type of a variable can be changed during execution.
Weak or strong typing is about being able to predict type errors just from function signatures.
Haskell is both statically and strongly typed.
However, there is no such thing as variable in Haskell so talking about dynamic or static typing makes no sense since every identifier assigned with a value cannot be changed at execution.
EDIT: But like goldenbull said, those typing notions are not clearly defined.
It is strongly typed. See section 2.3 here: Why Haskell matters
I think you are talking about two different things.
First, Haskell, and most functional programming (FP) languages, do NOT have the concept of a "variable". Instead, they use the concepts of "name" and "value": they just "bind" a value to a name. Once the value is bound, you cannot bind another value to the same name; this is the key feature of FP.
Strong typing is another topic. Yes, Haskell is strongly typed, and so are most FP languages. Strong typing gives FP the ability of "type inference", which is powerful for eliminating hidden bugs at compile time and helps reduce the size of the source code.
Maybe you are comparing Haskell with Python? Python is also strongly typed. The difference between Haskell and Python is "statically typed" versus "dynamically typed". The actual meanings of the terms "strong type" and "weak type" are ambiguous and fuzzy. That is another long story...

How does one avoid creating an ad-hoc type system in dynamically typed languages?

In every project I've started in languages without type systems, I eventually begin to invent a runtime type system. Maybe the term "type system" is too strong; at the very least, I create a set of type/value-range validators when I'm working with complex data types, and then I feel the need to be paranoid about where data types can be created and modified.
I hadn't thought twice about it until now. As an independent developer, my methods have been working in practice on a number of small projects, and there's no reason they'd stop working now.
Nonetheless, this must be wrong. I feel as if I'm not using dynamically-typed languages "correctly". If I must invent a type system and enforce it myself, I may as well use a language that has types to begin with.
So, my questions are:
Are there existing programming paradigms (for languages without types) that avoid the necessity of using or inventing type systems?
Are there otherwise common recommendations on how to solve the problems that static typing solves in dynamically-typed languages (without sheepishly reinventing types)?
Here is a concrete example for you to consider. I'm working with datetimes and timezones in Erlang (a dynamic, strongly typed language). This is a common datatype I work with:
{{Y,M,D},{tztime, {time, HH,MM,SS}, Flag}}
... where {Y,M,D} is a tuple representing a valid date (all entries are integers), tztime and time are atoms, HH,MM,SS are integers representing a sane 24-hr time, and Flag is one of the atoms u,d,z,s,w.
This datatype is commonly parsed from input, so to ensure valid input and a correct parser, the values need to be checked for type correctness, and for valid ranges. Later on, instances of this datatype are compared to each other, making the type of their values all the more important, since all terms compare. From the Erlang reference manual:
number < atom < reference < fun < port < pid < tuple < list < bit string
Aside from the confusion of static vs. dynamic and strong vs. weak typing:
What you want to implement in your example isn't really solved by most existing static typing systems. Range checks and complications like February 31st, and especially parsed input, are usually checked during runtime no matter what type system you have.
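As an illustration of a range check that happens at run time regardless of the type system, here is a small Python sketch using the standard datetime module (parse_date is a hypothetical wrapper name):

```python
from datetime import date


def parse_date(year, month, day):
    """Return a date, or None when the combination does not exist."""
    try:
        return date(year, month, day)
    except ValueError:
        return None


print(parse_date(2024, 2, 29))
print(parse_date(2023, 2, 31))
```

All three arguments are integers in both calls, so a type check alone cannot tell the valid date from the invalid one.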
Your example being in Erlang I have a few recommendations:
Use records. Besides being useful and helpful for a whole bunch of reasons, they give you easy runtime type checking without a lot of effort, e.g.:
is_same_day(#datetime{year=Y1, month=M1, day=D1},
#datetime{year=Y2, month=M2, day=D2}) -> ...
This effortlessly matches only two datetime records. You could even add guards to check for ranges if the source is untrusted. And it conforms to Erlang's let-it-crash method of error handling: if no clause matches you get a function_clause error, and can handle this on the level where it is appropriate (usually the supervisor level).
Generally, write your code so that it crashes when the assumptions are not valid.
If this doesn't feel statically checked enough: use typer and dialyzer to find the kind of errors that can be found statically; whatever remains will be checked at runtime.
Don't be too restrictive in your functions about what "types" you accept; sometimes the added functionality of just doing something useful even for different inputs is worth more than checking the types and ranges in every function. If you do it where it matters, you will usually catch the error early enough for it to be easily fixable. This is especially true for a functional language, where you always know where every value comes from.
A lot of good answers, let me add:
Are there existing programming paradigms (for languages without types) that avoid the necessity of using or inventing type systems?
The most important paradigm, especially in Erlang, is this: Assume the type is right, otherwise let it crash. Don't write excessively checking paranoid code, but assume that the input you get is of the right type or the right pattern. Don't write (there are exceptions to this rule, but in general)
foo({tag, ...}) -> do_something(..);
foo({tag2, ...}) -> do_something_else(..);
foo(Otherwise) ->
    report_error(Otherwise),
    try to fix problem here...
Kill the last clause and have it crash right away. Let a supervisor and other processes do the cleanup (you can use monitors() for janitorial processes to know when a crash has occurred).
Do be precise however. Write
bar(N) when is_integer(N) -> ...
baz([]) -> ...
baz(L) when is_list(L) -> ...
if the function is known only to work with integers or lists respectively. Yes, it is a runtime check but the goal is to convey information to the programmer. Also, HiPE tends to use the hint for optimization and eliminate the type check if possible. Hence, the price may be less than what you think it is.
You choose an untyped/dynamically-typed language so the price you have to pay is that type checking and errors from clashes will happen at runtime. As other posts hint, a statically typed language is not exempt from doing some checks as well - the type system is (usually) an approximation of a proof of correctness. In most static languages you often get input which you can't trust. This input is transformed at the "border" of the application and then converted to an internal format. The conversion serves to mark trust: From now on, the thing has been validated and we can assume certain things about it. The power and correctness of this assumption is directly tied to its type signature and how good the programmer is with juggling the static types of the language.
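The "validate at the border, then trust the internal format" idea can be sketched in Python as follows. This is only one way to do it: TzDateTime and parse_tz_datetime are hypothetical names, and the input shape is a simplified version of the Erlang tuple from the question.

```python
from dataclasses import dataclass
from datetime import date, time

VALID_FLAGS = {"u", "d", "z", "s", "w"}


@dataclass(frozen=True)
class TzDateTime:
    """Validated, immutable internal form (hypothetical name)."""
    day: date
    clock: time
    flag: str


def parse_tz_datetime(raw):
    """Convert untrusted input shaped like ((Y, M, D), (HH, MM, SS), flag).

    Raises ValueError or TypeError on bad input, so the rest of the
    program only ever sees a valid TzDateTime.
    """
    (year, month, day), (hour, minute, second), flag = raw
    if flag not in VALID_FLAGS:
        raise ValueError(f"unknown flag: {flag!r}")
    # date() and time() perform the range checks.
    return TzDateTime(date(year, month, day), time(hour, minute, second), flag)
```

Because the dataclass is frozen, code that receives a TzDateTime can assume it was validated once at the border and has not been modified since.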
Are there otherwise common recommendations on how to solve the problems that static typing solves in dynamically-typed languages (without sheepishly reinventing types)?
Erlang has the dialyzer which can be used to statically analyze and infer types of your programs. It will not come up with as many type errors as a type checker in e.g. OCaml, but it won't "cry wolf" either: An error from the dialyzer is provably an error in the program. And it won't reject a program which may be working ok. A simple example is:
and(true, true) -> true;
and(true, _) -> false;
and(false, _) -> false.
The invocation and(true, greatmistake) will return false, yet a static type system will reject the program because it will infer from the first line that the type signature takes a boolean() value as the 2nd parameter. The dialyzer will accept this function in contrast and give it the signature (boolean(), term()) -> boolean(). It can do this, because there is no need to protect a priori for an error. If there is a mistake, the runtime system has a type check that will capture it.
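The same behaviour can be reproduced in another dynamically typed language. Below is a rough Python transcription of the three clauses (named and_ because and is a keyword in Python); it is a sketch for illustration, not how one would normally write a boolean conjunction.

```python
def and_(left, right):
    """Mirror of the three Erlang clauses; right is only inspected in the first."""
    if left is True and right is True:
        return True
    if left is True:
        return False
    if left is False:
        return False
    # No clause matches: fail loudly, as the Erlang version would.
    raise TypeError(f"no clause matches {left!r}")


print(and_(True, "greatmistake"))
```

The call with "greatmistake" falls through to the second clause and returns False, because the second argument is never required to be a boolean there.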
In order for a statically-typed language to match the flexibility of a dynamically-typed one, I think it would need a lot, perhaps infinitely many, features.
In the Haskell world, one hears a lot of sophisticated, sometimes to the point of being scary, terminology. Type classes. Parametric polymorphism. Generalized algebraic data types. Type families. Functional dependencies. The Ωmega programming language takes it even further, with the website listing "type-level functions" and "level polymorphism", among others.
What are all these? Features added to static typing to make it more flexible. These features can be really cool, and tend to be elegant and mind-blowing, but are often difficult to understand. Learning curve aside, type systems often fail to model real-world problems elegantly. A particularly good example of this is interacting with other languages (a major motivation for C# 4's dynamic feature).
Dynamically-typed languages give you the flexibility to implement your own framework of rules and assumptions about data, rather than be constrained by the ever-limited static type system. However, "your own framework" won't be machine-checked, meaning the onus is on you to ensure your "type system" is safe and your code is well-"typed".
One thing I've found from learning Haskell is that I can carry lessons learned about strong typing and sound reasoning over to weaker-typed languages, such as C and even assembly, and do the "type checking" myself. Namely, I can prove that sections of code are correct in and of themselves, by bearing in mind the rules my functions and values are supposed to follow, and the assumptions I am allowed to make about other functions and values. When debugging, I go through and check things again, and think through whether or not my approach is sound.
The bottom line: dynamic typing puts more flexibility at your fingertips. On the other hand, statically-typed languages tend to be more efficient (by orders of magnitude), and good static type systems drastically cut down on debugging time by letting the computer do much of it for you. If you want the benefits of both, install a static type checker in your brain by learning decent, strongly-typed languages.
Sometimes data need validation. Validating any data received from the network is almost always a good idea — especially data from a public network. Being paranoid here is only good. If something resembling a static type system helps this in the least painful way, so be it. There's a reason why Erlang allows type annotations. Even pattern matching can be seen as just a kind of dynamic type checking; nevertheless, it's a central feature of the language. The very structure of data is its 'type' in Erlang.
The good thing is that you can custom-tailor your 'type system' to your needs, make it flexible and smart, while type systems of OO languages typically have fixed features. When data structures you use are immutable, once you've validated such a structure, you're safe to assume it conforms to your restrictions, just like with static typing.
There's no point in being ready to process any kind of data at any point of a program, dynamically-typed or not. A 'dynamic type' is essentially a union of all possible types; limiting it to a useful subset is a valid way to program.
A statically typed language detects type errors at compile time. A dynamically typed language detects them at runtime. There are some modest restrictions on what one can write in a statically typed language such that all type errors can be caught at compile time.
But yes, you still have types even in a dynamically typed language, and that's a good thing. The problem is you wander into lots of runtime checks to ensure that you have the types you think you do, since the compiler hasn't taken care of that for you.
Erlang has a very nice tool for specifying and statically verifying lots of types: dialyzer. See the Erlang type system documentation for reference.
So don't reinvent types, use the typing tools that Erlang already provides, to handle the types that already exist in your program (but which you haven't yet specified).
And this on its own won't eliminate range checks, unfortunately. Without lots of special sauce you really have to enforce this on your own by convention (and smart constructors, etc. to help), or fall back to runtime checks, or both.

Does case sensitivity have anything to do with strongly typed languages (or loosely typed languages)?

(I admit this may be a n00b question - I know very little about CS theory, mostly a hands-on/hobby sort.)
I was googling strongly-typed language for the official definition, and one of the top links I found was from Yahoo Answers, which suggested that case sensitivity was a part of whether a language is loosely/strongly typed.
I had always thought the simple answer to the difference between a strongly typed/weakly typed language is that the first requires explicit type declarations, while the latter is more open, even "dynamic".
The two S/O threads (here and here) I found so far seem to suggest that (more or less), but they don't mention anything about case sensitivity. Is there a relation at all between case sensitivity and strong/weak typing?
A couple of clarifications:
Case sensitivity has nothing to do with strong vs. weak typing, static vs. dynamic typing or any other property of the type system. I don't know why the answer on Yahoo Answers has gotten its one upvote, but it's completely wrong. Just ignore it.
Strong typing isn't a well-defined term, but it is often used to refer to languages with few implicit type conversions, i.e. languages where it is an error to perform operations on types that do not support that operation.
As an example, multiplying the strings "foo" and "bar" gives 0 as the result in Perl, while it causes a type error in Ruby, Python, Java, Haskell, ML and many other languages. Thus those languages are more strongly typed than Perl.
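For the Python side of that comparison, a minimal check looks like this (variable names are only for illustration):

```python
try:
    product = "foo" * "bar"
except TypeError as exc:
    product = None
    reason = str(exc)

# Multiplying a string by an int is a defined operation (repetition),
# so this is not an implicit conversion of the string to a number.
repeated = "foo" * 2

print(product, repeated)
```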
Strong typing is also sometimes used as a synonym for static typing.
A statically typed language is a language in which the types of variables, functions and expressions are known at compile time (or before runtime anyway - a statically typed language need not be compiled per se, though in practice it usually is). This means that if a statically typed program contains a type error, it will not run. If a dynamically typed program contains a type error it will run up to the point where the error happens and then crash.
Whether a language requires type annotations is (somewhat) independent of whether its type system is strong or weak or static or dynamic. In theory a dynamically typed language could require (or at least allow) type annotations and then throw runtime errors when those annotations are broken (though I don't know of any dynamically typed language that actually does this).
More importantly there are many statically and strongly typed languages (e.g. Haskell, ML) that don't require type annotations, but instead use type inference algorithms to infer the types.
In theory, case sensitivity is completely unrelated to type strictness. Case sensitivity is about whether the identifiers foo, FOO, and fOo refer to the same variable, function, or what-have-you. Type strictness is about whether variables have types or just values do, how easy it is to convert among types, and so on.
In practice, there might be a correlation between case sensitivity and type strictness, but I can't think of enough case-insensitive languages right now to make an assessment. My impression is that most languages commonly used today are case sensitive — possibly because C was case sensitive and very influential, possibly because it was the only way to force people to stop PROGRAMMING IN ALL CAPS after a couple decades of FORTRAN, COBOL, and BASIC.
No - they're not connected. Strongly typed languages force you to specify the type of data that a variable may hold - such as a real number, an integer, a textual string, or some programmer-defined object. You can't accidentally assign another type of data to that variable unless it is implicitly convertible: for example, you can generally put an integer into a real number (i.e. double x = 3.14; x = 3; is ok but int x = 3; x = 3.14; might not be, depending on how strongly typed the language is). Weakly typed languages just store whatever they're asked to without doing these sanity checks. In strongly typed languages like C++, you can still create a type that can store data of any of a specific set of types (e.g. C++'s boost::variant), but sometimes they're a bit more limited in how much you can do and how convenient it is to use.
Case insensitivity means that the uppercase and lowercase versions of the same letter are considered equivalent for some purposes... normally in a string comparison or regular expression match. It is unusual but not unheard of for modern computer languages to ignore the case of letters in variable names (identifiers).
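To see that the two properties are independent, here is a small Python sketch. Python identifiers are case sensitive, and that fact says nothing about what types the names may hold.

```python
foo = 1
FOO = "one"
fOo = [1]

# Three distinct names, each bound to a value of a different type.
names = {"foo": foo, "FOO": FOO, "fOo": fOo}
types = {name: type(value).__name__ for name, value in names.items()}

print(types)
```

Case sensitivity decided that these are three names rather than one; the type system had no say in it.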

What is the difference between statically typed and dynamically typed languages?

What does it mean when we say a language is dynamically typed versus statically typed?
Statically typed languages
A language is statically typed if the type of a variable is known at compile time. For some languages this means that you as the programmer must specify what type each variable is (e.g.: Java, C, C++); other languages offer some form of type inference, the capability of the type system to deduce the type of a variable (e.g.: OCaml, Haskell, Scala, Kotlin).
The main advantage here is that all kinds of checking can be done by the compiler, and therefore a lot of trivial bugs are caught at a very early stage.
Examples: C, C++, Java, Rust, Go, Scala
Dynamically typed languages
A language is dynamically typed if the type is associated with run-time values, and not named variables/fields/etc. This means that you as a programmer can write a little quicker because you do not have to specify types every time (unless using a statically-typed language with type inference).
Examples: Perl, Ruby, Python, PHP, JavaScript, Erlang
Most scripting languages have this feature as there is no compiler to do static type-checking anyway, but you may find yourself searching for a bug that is due to the interpreter misinterpreting the type of a variable. Luckily, scripts tend to be small so bugs have not so many places to hide.
Most dynamically typed languages do allow you to provide type information, but do not require it. One language that is currently being developed, Rascal, takes a hybrid approach allowing dynamic typing within functions but enforcing static typing for the function signature.
Type checking is the process of verifying and enforcing the constraints of types.
Statically typed programming languages do type checking at compile-time.
Examples: Java, C, C++.
Dynamically typed programming languages do type checking at run-time.
Examples:
Perl, Ruby, Python, PHP, JavaScript.
Here is an example contrasting how Python (dynamically typed) and Go (statically typed) handle a type error:
def silly(a):
    if a > 0:
        print('Hi')
    else:
        print(5 + '3')
Python does type checking at run time, and therefore:
silly(2)
Runs perfectly fine, and produces the expected output Hi. Error is only raised if the problematic line is hit:
silly(-1)
Produces
TypeError: unsupported operand type(s) for +: 'int' and 'str'
because the relevant line was actually executed.
Go on the other hand does type-checking at compile time:
package main

import "fmt"

func silly(a int) {
    if a > 0 {
        fmt.Println("Hi")
    } else {
        fmt.Println("3" + 5)
    }
}

func main() {
    silly(2)
}
The above will not compile, with the following error:
invalid operation: "3" + 5 (mismatched types string and int)
Simply put it this way: in a statically typed language variables' types are static, meaning once you set a variable to a type, you cannot change it. That is because typing is associated with the variable rather than the value it refers to.
For example in Java:
String str = "Hello"; // variable str is statically typed as String
str = 5;              // compile-time error: str can only hold
                      // a String
Where on the other hand: in a dynamically typed language variables' types are dynamic, meaning after you set a variable to a type, you CAN change it. That is because typing is associated with the value it assumes rather than the variable itself.
For example in Python:
some_str = "Hello" # variable some_str is linked to a string value
some_str = 5 # now it is linked to an integer value; perfectly OK
So, it is best to think of variables in dynamically typed languages as just generic pointers to typed values.
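A short Python sketch of that "pointer to a typed value" picture follows. The variable names are arbitrary; the point is that type() reports the type of whatever value a name currently refers to, and that two names can refer to the same value.

```python
value = "Hello"
first = type(value)    # the type belongs to the str value
value = 5              # rebinding the name to a different value
second = type(value)   # now the name refers to an int value
print(first.__name__, second.__name__)  # str int

# Two names can point at the same typed value.
a = [1, 2]
b = a
b.append(3)
print(a)        # [1, 2, 3]
print(a is b)   # True
```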
To sum up, static versus dynamic typing describes (or should have described) the variables in the language rather than the language itself. It would have been clearer to speak of a language with statically typed variables versus a language with dynamically typed variables, IMHO.
Statically typed languages are generally compiled languages, so the compiler checks the types (which makes perfect sense, as types are not allowed to change later at run time).
Dynamically typed languages are generally interpreted, so type checking (if any) happens at run time when values are used. This of course brings some performance cost, and is one of the reasons dynamic languages (e.g., Python, Ruby, PHP) do not scale as well as statically typed ones (Java, C#, etc.). From another perspective, statically typed languages have more of a start-up cost: they usually make you write more code, and harder code. But that pays off later.
The good thing is that both sides are borrowing features from the other. Statically typed languages are incorporating more dynamic features, e.g., generics and dynamic libraries in C#, and dynamic languages are including more type checking, e.g., type annotations in Python, or the Hack variant of PHP, which are usually not core to the language and are usable on demand.
When it comes to technology selection, neither side has an intrinsic superiority over the other. It is just a matter of preference whether you want more control to begin with or more flexibility. Just pick the right tool for the job, and make sure to check what is available in terms of the opposite before considering a switch.
http://en.wikipedia.org/wiki/Type_system
Static typing
A programming language is said to use
static typing when type checking is
performed during compile-time as
opposed to run-time. In static typing,
types are associated with variables
not values. Statically typed languages
include Ada, C, C++, C#, JADE, Java,
Fortran, Haskell, ML, Pascal, Perl
(with respect to distinguishing
scalars, arrays, hashes and
subroutines) and Scala. Static typing
is a limited form of program
verification (see type safety):
accordingly, it allows many type
errors to be caught early in the
development cycle. Static type
checkers evaluate only the type
information that can be determined at
compile time, but are able to verify
that the checked conditions hold for
all possible executions of the
program, which eliminates the need to
repeat type checks every time the
program is executed. Program execution
may also be made more efficient (i.e.
faster or taking reduced memory) by
omitting runtime type checks and
enabling other optimizations.
Because they evaluate type information
during compilation, and therefore lack
type information that is only
available at run-time, static type
checkers are conservative. They will
reject some programs that may be
well-behaved at run-time, but that
cannot be statically determined to be
well-typed. For example, even if an
expression <complex test> always
evaluates to true at run-time, a
program containing the code
if <complex test> then 42 else <type error>
will be rejected as ill-typed, because
a static analysis cannot determine
that the else branch won't be
taken.[1] The conservative behaviour
of static type checkers is
advantageous when <complex test>
evaluates to false infrequently: A
static type checker can detect type
errors in rarely used code paths.
Without static type checking, even
code coverage tests with 100% code
coverage may be unable to find such
type errors. Code coverage tests may
fail to detect such type errors
because the combination of all places
where values are created and all
places where a certain value is used
must be taken into account.
The most widely used statically typed
languages are not formally type safe.
They have "loopholes" in the
programming language specification
enabling programmers to write code
that circumvents the verification
performed by a static type checker and
so address a wider range of problems.
For example, Java and most C-style
languages have type punning, and
Haskell has such features as
unsafePerformIO: such operations may
be unsafe at runtime, in that they can
cause unwanted behaviour due to
incorrect typing of values when the
program runs.
Dynamic typing
A programming language is said to be
dynamically typed, or just 'dynamic',
when the majority of its type checking
is performed at run-time as opposed to
at compile-time. In dynamic typing,
types are associated with values not
variables. Dynamically typed languages
include Groovy, JavaScript, Lisp, Lua,
Objective-C, Perl (with respect to
user-defined types but not built-in
types), PHP, Prolog, Python, Ruby,
Smalltalk and Tcl. Compared to static
typing, dynamic typing can be more
flexible (e.g. by allowing programs to
generate types and functionality based
on run-time data), though at the
expense of fewer a priori guarantees.
This is because a dynamically typed
language accepts and attempts to
execute some programs which may be
ruled as invalid by a static type
checker.
Dynamic typing may result in runtime
type errors—that is, at runtime, a
value may have an unexpected type, and
an operation nonsensical for that type
is applied. This operation may occur
long after the place where the
programming mistake was made—that is,
the place where the wrong type of data
passed into a place it should not
have. This makes the bug difficult to
locate.
Dynamically typed language systems,
compared to their statically typed
cousins, make fewer "compile-time"
checks on the source code (but will
check, for example, that the program
is syntactically correct). Run-time
checks can potentially be more
sophisticated, since they can use
dynamic information as well as any
information that was present during
compilation. On the other hand,
runtime checks only assert that
conditions hold in a particular
execution of the program, and these
checks are repeated for every
execution of the program.
Development in dynamically typed
languages is often supported by
programming practices such as unit
testing. Testing is a key practice in
professional software development, and
is particularly important in
dynamically typed languages. In
practice, the testing done to ensure
correct program operation can detect a
much wider range of errors than static
type-checking, but conversely cannot
search as comprehensively for the
errors that both testing and static
type checking are able to detect.
Testing can be incorporated into the
software build cycle, in which case it
can be thought of as a "compile-time"
check, in that the program user will
not have to manually run such tests.
References
Pierce, Benjamin (2002). Types and Programming Languages. MIT Press.
ISBN 0-262-16209-1.
Compiled vs. Interpreted
"When source code is translated"
Source Code: Original code (usually typed by a human into a computer)
Translation: Converting source code into something a computer can read (i.e. machine code)
Run-Time: Period when program is executing commands (after compilation, if compiled)
Compiled Language: Code translated before run-time
Interpreted Language: Code translated on the fly, during execution
Typing
"When types are checked"
5 + '3' is an example of a type error in strongly typed languages such as Go and Python, because they don't allow for "type coercion" -> the ability for a value to change type in certain contexts such as merging two types. Weakly typed languages, such as JavaScript, won't throw a type error (results in '53').
Static: Types checked before run-time
Dynamic: Types checked on the fly, during execution
The definitions of "Static & Compiled" and "Dynamic & Interpreted" are quite similar...but remember it's "when types are checked" vs. "when source code is translated".
You'll get the same type errors irrespective of whether the language is compiled or interpreted! You need to separate these terms conceptually.
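As a small sketch of the 5 + '3' case in Python: the mixed expression raises a TypeError, and the programmer has to choose a conversion explicitly. The variable names here are just for illustration.

```python
try:
    5 + '3'                       # no implicit coercion between int and str
except TypeError as exc:
    message = str(exc)

as_number = 5 + int('3')          # explicit conversion to int
as_text = str(5) + '3'            # explicit conversion to str

print(message)
print(as_number, as_text)         # 8 53
```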
Python Example
Dynamic, Interpreted
def silly(a):
    if a > 0:
        print('Hi')
    else:
        print(5 + '3')

silly(2)
Because Python is both interpreted and dynamically typed, it only translates and type-checks the code it is actually executing. The else block never executes, so 5 + '3' is never even looked at!
What if it was statically typed?
A type error would be thrown before the code is even run. It still performs type-checking before run-time even though it is interpreted.
What if it was compiled?
The else block would be translated/looked at before run-time, but because it's dynamically typed it wouldn't throw an error! Dynamically typed languages don't check types until execution, and that line never executes.
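The "what if it was compiled?" scenario can be approximated with Python's built-in compile(), which translates a whole piece of source up front. This is only a sketch: the source string and the "<silly>" filename label are made up, and silly returns its result instead of printing so it is easy to check.

```python
source = '''
def silly(a):
    if a > 0:
        return 'Hi'
    else:
        return 5 + '3'
'''

# Translate the whole source, including the else branch, before running it.
# No type error is reported at this stage.
code = compile(source, "<silly>", "exec")

namespace = {}
exec(code, namespace)
silly = namespace["silly"]

print(silly(2))   # Hi
try:
    silly(-1)     # the bad line finally executes
except TypeError as exc:
    print("TypeError:", exc)
```

Translation ahead of time does not change when the types are checked: the error still appears only when the else branch runs.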
Go Example
Static, Compiled
package main

import "fmt"

func silly(a int) {
    if a > 0 {
        fmt.Println("Hi")
    } else {
        fmt.Println("3" + 5)
    }
}

func main() {
    silly(2)
}
The types are checked before running (static) and the type error is immediately caught! The types would still be checked before run-time if it was interpreted, having the same result. If it was dynamic, it wouldn't throw any errors even though the code would be looked at during compilation.
Performance
A compiled language will have better performance at run-time if it's statically typed (vs. dynamically); knowledge of types allows for machine code optimization.
Statically typed languages have better performance at run-time intrinsically due to not needing to check types dynamically while executing (it checks before running).
Similarly, compiled languages are faster at run time as the code has already been translated instead of needing to "interpret"/translate it on the fly.
Note that both compiled and statically typed languages will have a delay before running for translation and type-checking, respectively.
More Differences
Static typing catches errors early, instead of finding them during execution (especially useful for long programs). It's more "strict" in that it won't allow for type errors anywhere in your program and often prevents variables from changing types, which further defends against unintended errors.
num = 2
num = '3' // ERROR
Dynamic typing is more flexible, which some appreciate. It typically allows for variables to change types, which can result in unexpected errors.
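Here is a hedged Python sketch of how a variable changing type can lead to unexpected behaviour. The function total_price and the values are invented for the example; the string "3" stands in for something like unconverted user input.

```python
def total_price(quantity, unit_price):
    # Hypothetical helper: works for numbers, but nothing enforces that.
    return quantity * unit_price

quantity = 3
print(total_price(quantity, 2.5))   # 7.5

quantity = "3"                      # same variable, now holding a str
print(total_price(quantity, 2))     # 33 (string repetition, not a price)

try:
    total_price(quantity, 2.5)      # str * float is not defined
except TypeError as exc:
    print("TypeError:", exc)
```

Note that the second call does not fail at all; it quietly produces a wrong result, and only the third call raises an error.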
The terminology "dynamically typed" is unfortunately misleading. All languages are statically typed, and types are properties of expressions (not of values as some think). However, some languages have only one type. These are called uni-typed languages. One example of such a language is the untyped lambda calculus.
In the untyped lambda calculus, all terms are lambda terms, and the only operation that can be performed on a term is applying it to another term. Hence all operations always result in either infinite recursion or a lambda term, but never signal an error.
However, were we to augment the untyped lambda calculus with primitive numbers and arithmetic operations, then we could perform nonsensical operations, such as adding two lambda terms together: (λx.x) + (λy.y). One could argue that the only sane thing to do is to signal an error when this happens, but to be able to do this, each value has to be tagged with an indicator that says whether the term is a lambda term or a number. The addition operator will then check that both arguments are indeed tagged as numbers, and if they aren't, signal an error. Note that these tags are not types, because types are properties of programs, not of values produced by those programs.
A uni-typed language that does this is called dynamically typed.
Languages such as JavaScript, Python, and Ruby are all uni-typed. Again, the typeof operator in JavaScript and the type function in Python have misleading names; they return the tags associated with the operands, not their types. Similarly, dynamic_cast in C++ and instanceof in Java do not do type checks.
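The tagging scheme described above can be sketched in a few lines of Python. Everything here (the NUM and LAM tags, and the num, lam, add and apply helpers) is hypothetical and exists only to show values carrying tags that operations check at run time.

```python
NUM, LAM = "num", "lam"

def num(n):
    return (NUM, n)          # a number value carrying its tag

def lam(f):
    return (LAM, f)          # a lambda value carrying its tag

def add(left, right):
    # The run-time tag check that addition performs.
    if left[0] != NUM or right[0] != NUM:
        raise TypeError(f"cannot add {left[0]} and {right[0]}")
    return num(left[1] + right[1])

def apply(fn, arg):
    if fn[0] != LAM:
        raise TypeError(f"cannot apply {fn[0]}")
    return fn[1](arg)

identity = lam(lambda x: x)
print(add(num(2), num(3)))        # ('num', 5)
print(apply(identity, num(7)))    # ('num', 7)
try:
    add(identity, identity)       # (λx.x) + (λy.y)
except TypeError as exc:
    print("error:", exc)          # error: cannot add lam and lam
```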
Statically typed languages: the type of each variable and expression is already known at compile time.
(int a; a can take only integer type values at runtime)
Examples: C, C++, Java
Dynamically typed languages: variables can receive different values at runtime and their type is defined at run time.
(var a; a can take any kind of values at runtime)
Examples: Ruby, Python.
In programming, a data type is a classification that tells what kind of value a variable will hold and which mathematical, relational and logical operations can be performed on those values without getting an error.
In each programming language, to minimize the chance of errors, type checking is done either before or during program execution. Depending on the timing of type checking, programming languages are of two kinds: statically typed and dynamically typed languages.
Also, depending on whether implicit type conversion happens or not, programming languages are of two kinds: strongly typed and weakly typed languages.
Statically Typed :
Type checking is done at compile time
In source code, at the time of variable declaration, the data type of that variable must be explicitly specified. Because the data type is specified in the source code, type checking can happen at compile time, when the source code is converted to machine code
Here the data type is associated with the variable, as in int count, and this association is static (fixed)
If we try to change the data type of an already declared variable (int count) by assigning a value of another data type (count = "Hello"), we get an error
If we try to change the data type by redeclaring an already declared variable (int count) with another data type (boolean count), we also get an error
int count; /* count is int type, association between data type
and variable is static or fixed */
count = 10; // no error
count = "Hello"; // error
boolean count; // error
Because type checking and type error detection are done at compile time, no further type checking is needed at runtime. The program can therefore be better optimized, which results in faster execution
If we want more rigid code, choosing this kind of language is the better option
Example : Java, C, C++, Go, Swift etc.
Dynamically Typed :
Type checking is done at runtime
In source code, at the time of variable declaration, there is no need to explicitly specify the data type of that variable, because during type checking at runtime the language system determines the type from the value assigned to the variable
Here the data type is associated with the value assigned to the variable: in var foo = 10, 10 is a Number, so foo now holds a Number. This association is dynamic (flexible)
We can easily change the data type of an already declared variable (var foo = 10) by assigning a value of another data type (foo = "Hi"), with no error
We can easily change the data type of an already declared variable (var foo = 10) by redeclaring it with a value of another data type (var foo = true), with no error
var foo; // without assigned value, variable holds undefined data type
var foo = 10; // foo is Number type now, association between data
// type and value is dynamic / flexible
foo = 'Hi'; // foo is String type now, no error
var foo = true; // foo is Boolean type now, no error
Because type checking and type error detection are done at runtime, the program is less optimized, which results in slower execution. Execution of these languages can be faster, though, if the implementation uses a JIT compiler
If we want to write and execute code easily, this kind of language is the better option, but here we can get runtime errors
Example : Python, JavaScript, PHP, Ruby etc.
Statically typed languages type-check at compile time and the type can NOT change. (Don't get cute with type-casting comments, a new variable/reference is created).
Dynamically typed languages type-check at run-time and the type of a variable CAN be changed at run-time.
Sweet and simple definitions, but fitting the need:
Statically typed languages bind the type to a variable for its entire scope (e.g. Scala)
Dynamically typed languages bind the type to the actual value referenced by a variable.
In a statically typed language, a variable is associated with a type which is known at compile time, and that type remains unchanged throughout the execution of a program. Equivalently, the variable can only be assigned a value which is an instance of the known/specified type.
In a dynamically typed language, a variable has no type, and its value during execution can be anything of any shape and form.
Static typed languages (compiler resolves method calls and compile references):
usually better performance
faster compile error feedback
better IDE support
not suited for working with undefined data formats
harder to start development when the model is not yet defined
longer compilation time
in many cases requires to write more code
Dynamic typed languages (decisions taken in running program):
lower performance
faster development
some bugs might be detected only later in run-time
good for undefined data formats (meta programming)
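As one illustration of the "undefined data formats" point, here is a Python sketch that handles JSON whose shape is only known at run time. The describe function and the sample inputs are invented for the example; json.loads is the standard library parser.

```python
import json

def describe(raw: str) -> str:
    # The shape of the parsed data is only discovered while running.
    data = json.loads(raw)
    if isinstance(data, dict):
        return "object with keys " + ", ".join(sorted(data))
    if isinstance(data, list):
        return f"array of {len(data)} items"
    return f"scalar of type {type(data).__name__}"

print(describe('{"name": "Ada", "age": 36}'))  # object with keys age, name
print(describe('[1, "two", 3.0]'))             # array of 3 items
print(describe('42'))                          # scalar of type int
```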
Static Type: Type checking performed at compile time.
What is actually meant by a statically typed language:
the type of a variable must be specified
a variable can reference only a particular type of object*
type checking for the value is performed at compile time, and any type error is reported at that time
the memory needed to store a value of that particular type can be determined at compile time
Example of static type language are C, C++, Java.
Dynamic Type: Type checking performed at runtime.
What is actually meant by a dynamically typed language:
no need to specify the type of a variable
the same variable can reference different types of objects
Python, Ruby are examples of dynamic type language.
* Some objects can be assigned to different type of variables by typecasting it (a very common practice in languages like C and C++)
Statically typed languages like C++ and Java and dynamically typed languages like Python differ only in when the type of a variable is checked.
Statically typed languages have a static data type for each variable; the data type is checked during compilation, so debugging is much simpler. Dynamically typed languages don't do this; the data type is checked while executing the program, and hence debugging is a bit more difficult.
Moreover, the distinction is a small one and is related to the one between strongly typed and weakly typed languages. A strongly typed language doesn't allow you to use one type as another (e.g. Python), whereas a weakly typed language does (e.g. JavaScript).
Statically Typed
The types are checked before run-time so mistakes can be caught earlier.
Examples = c++
Dynamically Typed
The types are checked during execution.
Examples = Python
Dynamic typing allows the program to change the type of a variable at runtime.
dynamic typing languages : Perl, Ruby, Python, PHP, JavaScript, Erlang
Statically typed, means if you try to store a string in an integer variable, it would not accept it.
Statically typed languages :C, C++, Java, Rust, Go, Scala, Dart
A dynamically typed language helps you quickly prototype algorithm concepts without the overhead of thinking about which variable types need to be used (which is a necessity in a statically typed language).
Static Typing:
The languages such as Java and Scala are static typed.
The variables have to be defined and initialized before they are used in a code.
for ex. int x; x = 10;
System.out.println(x);
Dynamic Typing:
Perl is a dynamically typed language.
Variables need not be initialized before they are used in code.
y=10; use this variable in the later part of code

What is a universal type?

I have heard the term "universal type" thrown around in the context of programming language type systems; does anybody know what this means? Is it something to do with objects like a String, where two instances of "foo" are identical even though ("foo"=="foo") may be false?
A quick Wikipedia search turns up: Top Type: "The top type in type theory, commonly abbreviated as top or by the down tack symbol (⊤) is the universal type--that type which contains every possible object in the type system of interest." In other words, it's the "Object" class, which is (directly or indirectly) a superclass of every other class. As the page points out, C++ is unusual among OO languages since it doesn't have a universal type.
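Python happens to have such a top type as well, its built-in object class, so the idea can be sketched briefly. The Point class and the sample values below are arbitrary choices for illustration.

```python
class Point:
    pass

# Very different values: an int, a str, None, a list, a function, a class, an instance.
samples = [42, "foo", None, [1, 2], len, Point, Point()]

# Every one of them is an instance of the top type.
print(all(isinstance(s, object) for s in samples))          # True

# Every class has object as a (direct or indirect) base class.
print(issubclass(Point, object), issubclass(int, object))   # True True
print(Point.__mro__[-1] is object)                          # True
```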
Russell's Paradox lurks in the wings. Just as you can break your mathematical system when you start getting into things like "the set of all sets", you can also break your type system if you are a little too blasé about a type of all types. Designing type systems requires a little bit of care.

Resources