Could anyone please tell me whether there is a language that refuses to compile if you pass an argument to a function that is not an exact type match (but has either a trivial or a user-defined conversion to the needed type)? For example, if you have:
void f(int value);
// and elsewhere in your code you pass:
bool a = false;
f(a);
This is a strictly theoretical question.
This is a vague question, but all the same: Haskell, OCaml, etc. have this sort of behavior. If a function requires an Int, it has to be given an Int. You may be able to write functions that convert Bools to Ints, but that doesn't change anything, i.e. you still get a type error if you pass a Bool directly. Of course, there are languages with far more demanding type systems and complex proof obligations than Haskell and OCaml.
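For instance, here is a minimal Haskell sketch of that behavior (the function and names are just illustrative, not from the question):
f :: Int -> Int
f value = value + 1

a :: Bool
a = False

-- f a              -- rejected at compile time:
--                  --   Couldn't match expected type 'Int' with actual type 'Bool'
g :: Int
g = f (fromEnum a)  -- any conversion has to be written explicitly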
Scala is an interesting language in this respect: if there is an unambiguous user-defined coercion from one type to the other in scope, the compiler will insert it for you. For example, people sometimes use it to coerce data types like (Int, (Int, Int)) to ((Int, Int), Int), which is handy.
When you create a type synonym with type, GHC/GHCi will use it instead of the original type whenever it is used explicitly, but will never attempt to work backwards from an inferred type to a matching synonym. Getting the most "abstract" synonym for a type would be pretty handy for learning complicated applications and libraries, which often define synonyms for monad stacks and possibly synonyms of synonyms.
Has anybody ever written such a piece of code? I imagine it would involve backtracking and would also generate some spurious candidates (e.g. if two types are aliases of String, then they will both be candidates whenever a String must be resolved), but it could be useful in certain situations.
Not an answer, but a question. Type synonyms are often used to nicely name types in "high-level code", but as soon as you pass those types into lower-level/helper code (which is defined in terms of more concrete types), how should the system keep track of which synonym applies? Consider the following:
type Title = String
type Name = String
capitalise :: String -> String
my_title = "Mayor" :: Title
shouted_title = capitalise my_title :: ???
How does the typechecker know that the String going into the helper function capitalise is conceptually the same type as the String coming out of capitalise? In the presence of multiple type aliases, how should the type checker choose which one to use?
The Frege compiler, IDE and REPL try to do this for type applications (except applications of (->)) when they are asked to show "nice" types, and it does work in most cases. Here is an example online session snippet:
frege> type Flubber = (Int, Double)
frege> x = (42, 3.0)
frege> :t x
Flubber
frege> y = [x,x,x,x]
frege> :t y
[Flubber]
So, in principle, it should also work in Haskell (perhaps modulo certain extensions in the type system relative to Haskell 2010).
Keep in mind, though, that (as @Thomas pointed out in his answer) there may be multiple ways to unsubstitute type aliases, and hence the output may actually be misleading.
What is the difference, at various stages of the read-compile-run pipeline, between a type declaration and a newtype declaration?
My assumption was that they compile down to the same machine instructions, and that the only difference is at typechecking time, where for example
type Name = String
newtype Name_ = N String
You can use a Name anywhere a String is required, but the typechecker will call you out if you use a Name_ where a String is expected, even though they encode the same information.
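A small sketch of that difference (repeating the two declarations so it stands alone; greetS is a made-up helper):
type Name = String
newtype Name_ = N String

greetS :: String -> String
greetS s = "Hello, " ++ s

okSynonym :: String
okSynonym = greetS ("Alice" :: Name)   -- accepted: Name really is String

okNewtype :: String
okNewtype = greetS (unName (N "Bob"))  -- must unwrap explicitly
  where unName (N s) = s
-- greetS (N "Bob")                    -- rejected: Name_ is not String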
I'm asking the question because, if this is the case, I don't see any reason why the following declarations shouldn't be valid:
type List a = Either () (a, List a)
newtype List_ a = L (Either () (a, List_ a))
However, the type checker accepts the second one but rejects the first. Why is that?
Luqui's comment should be an answer. Type synonyms in Haskell are, to a first approximation, nothing more than macros. That is, they are expanded by the type checker into fully evaluated types. The type checker cannot handle infinite types, so Haskell does not have equi-recursive types.
Newtypes provide you with iso-recursive types that, in GHC, essentially compile down to equi-recursive types in the core language. Haskell is not GHC Core, so you don't have access to such types. Equi-recursive types are just a bit harder to work with, both for type checkers and for humans, while iso-recursive types have equivalent power.
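For example, here is a minimal sketch of how the iso-recursive newtype version is used: the L constructor is the explicit "roll" and pattern matching is the "unroll" (the helper names are mine):
newtype List_ a = L (Either () (a, List_ a))

nil :: List_ a
nil = L (Left ())

cons :: a -> List_ a -> List_ a
cons x xs = L (Right (x, xs))

toList :: List_ a -> [a]
toList (L (Left ()))       = []
toList (L (Right (x, xs))) = x : toList xs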
When I first learned Haskell, I very quickly came to love parametric polymorphism. It's a delightfully simple idea that works astonishingly well. The whole "if it compiles it usually works right" thing is mostly due to parametric polymorphism, IMHO.
But the other day, something occurred to me. I can write foo as a polymorphic function. But when bar calls foo, it will do so with a specific set of argument types. Or, if bar itself is polymorphic, then its caller will assign definite types. By induction, it seems that if you were to take any valid Haskell program and analyse the entire codebase, you can statically determine the type of every single thing in the entire program.
This, in a sense, is a bit like C++ templates. There is no run-time polymorphism, only compile-time polymorphism. A Haskell compiler could choose to generate separate machine code for every type at which each polymorphic function is called. Most Haskell compilers don't, but you could implement one if you wanted to.
Only if you start adding Haskell extensions (ExistentialQuantification is the obvious one) do you start to get real run-time polymorphism, where you have values whose type cannot be statically computed.
Oh, yeah, my question?
Are the statements above actually correct?
Is there a widely-used name for this property?
Haskell (with no extensions) permits polymorphic recursion, and this feature alone makes it impossible to statically specialize a program to a completely monomorphic one. Here is a program that will print an N-deep nested list, where N is a command-line parameter:
import System.Environment (getArgs)

foo :: Show a => Int -> a -> IO ()
foo 0 x = print x
foo n x = foo (n - 1) [x]

main :: IO ()
main = do
  [num_lists] <- getArgs
  foo (read num_lists) 0
In the first call to foo, a has type Int. In the next recursive call, it has type [Int], then [[Int]], and so forth.
If polymorphic recursion is prohibited, then I believe it's possible to statically specialize a program.
Yep, I've thought about this too. Basically, the idea is that it seems like you could implement Haskell 98, but not some of the language extensions to it, using polymorphism-by-multiinstantiation instead of polymorphism-by-boxing.
You can get some insight into this by trying to implement some Haskell features as C++ libraries (as you note, C++ does polymorphism-by-multiinstantiation). What you find is that you can do everything that Haskell can do, except that it's impossible to have polymorphic values, which includes references to polymorphic functions.
What this looks like is that if you have
template<typename T>
void f(T); // f :: a -> IO ()
you can take the address of a particular instantiation to pass around as a function pointer at runtime:
&f<int>
but you cannot take the address of a template (&f). This makes sense: templates are a purely compile-time construct. It also makes sense that if you're doing polymorphism by multiinstantiation, you can have a pointer to any particular instantiation, but you cannot have a pointer to the polymorphic function itself, because at the machine code level, there isn't one.
So where does Haskell use polymorphic values? At first glance it seems like a good rule of thumb is "anywhere you have to write an explicit forall". So PolymorphicComponents, Rank2Types, RankNTypes, and ImpredicativeTypes are obvious no-nos. You can't translate this to C++:
data MkList = MkList (forall a. a -> [a])
singleton = MkList (\x -> [x])
On the other hand, ExistentialQuantification is doable in at least some cases: it means having a non-template class with a template constructor (or more generally, a class whose constructor is templated on more things than the class itself).
If in Haskell you have:
data SomeShow = forall a. Show a => SomeShow a
instance Show SomeShow where show (SomeShow a) = show a
you can implement this in C++ as:
// a function which takes a void*, casts it to the given type, and
// calls the appropriate show() function (statically selected based
// on overload resolution rules)
template<typename T>
String showVoid(void *x)
{
    return show(*(T*)x);
}

class SomeShow
{
private:
    void *m_data;
    String (*m_show)(void*); // m_show :: Any -> String

public:
    template<typename T>
    SomeShow(T x)
        : m_data(new T(x)) // memory management issues here, but that's orthogonal
        , m_show(&showVoid<T>)
    {
    }

    String show()
    {
        // alternately we could declare the top-level show() as a friend and
        // put this there
        return m_show(m_data);
    }
};

// C++ doesn't have type classes per se, but it has overloading, which means
// that interfaces are implicit: where in Haskell you would write a class and
// instances, in C++ you just write a function with the same name for each type
String show(SomeShow x)
{
    return x.show();
}
In both languages you have a non-polymorphic type with a polymorphic constructor.
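On the Haskell side, here is a small usage sketch of SomeShow (repeating its definition so it stands alone, and assuming the ExistentialQuantification extension) showing the run-time dispatch through the stored function pointer-equivalent:
{-# LANGUAGE ExistentialQuantification #-}

data SomeShow = forall a. Show a => SomeShow a
instance Show SomeShow where show (SomeShow a) = show a

things :: [SomeShow]
things = [SomeShow (1 :: Int), SomeShow "hello", SomeShow True]

main :: IO ()
main = mapM_ print things  -- each element is shown via its own Show instance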
So we have shown that there are some language extensions you can implement and some you can't, but what about the other side of the coin: is there anything in Haskell 98 that you can't implement? Judging by the fact that you need a language extension (ExplicitForAll) to even write a forall, you would think that the answer is no. And you would almost be right, but there's two wrinkles: type classes and polymorphic recursion. Type classes are typically implemented using dictionary passing: each instance declaration results in a record of functions, which are implicitly passed around wherever they're needed.
So for Monad for example you would have:
data MonadDict m = MonadDict {
return :: forall a. a -> m a,
(>>=) :: forall a b. m a -> (a -> m b) -> m b
}
Well would you look at those foralls! You can't write them explicitly, but in dictionary-passing implementations, even in Haskell 98, classes with polymorphic methods result in records containing polymorphic functions. Which, if you're trying to implement the whole thing using multiinstantiation, is obviously going to be a problem.
You can almost get away without dictionary passing because, if you stick to Haskell 98, instances are almost always global and statically known. Each instance results in some polymorphic functions, but because which one to call is almost always known at compile time, you almost never need to pass references to them around at runtime (which is good, because you can't). The tradeoff is that you need to do whole-program compilation, because otherwise instances are no longer statically known: they might be in a different module.
And the exception is polymorphic recursion, which practically requires you to build up a dictionary at runtime. See the other answer for more details on that. Polymorphic recursion kills the multiinstantiation approach even without type classes: see the comment about BTrees. (Also, ExistentialQuantification *plus* classes with polymorphic methods is no longer doable, because you would have to again start storing pointers to polymorphic functions.)
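To make that last point concrete, here is a hand-written sketch of the dictionary-passing translation of the earlier foo example; ShowDict and the helper names are made up for illustration:
data ShowDict a = ShowDict { showIt :: a -> String }

showListDict :: ShowDict a -> ShowDict [a]
showListDict d = ShowDict (\xs -> "[" ++ concatMap (showIt d) xs ++ "]")

fooD :: ShowDict a -> Int -> a -> IO ()
fooD d 0 x = putStrLn (showIt d x)
fooD d n x = fooD (showListDict d) (n - 1) [x]
-- each recursive call builds a new, bigger dictionary at run time,
-- so there is no finite set of instantiations to generate ahead of time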
Whole-program compilers take advantage of global access to type information to make very aggressive optimizations, as you describe above. Examples include JHC and MLton. GHC with inlining is partially "whole program" as well, for similar reasons. Other techniques that take advantage of global information include supercompilation.
Note that you can massively increase code size by specializing polymorphic functions at all the types they're used at; this then needs heavy inlining to bring the code size back to normal. Managing this is a challenge.
I'm currently writing an expression parser. I've done the lexical and syntactic analysis and now I'm checking the types. I have the expression in a data structure like this (simplified version):
data Expr = EBinaryOp String Expr Expr
          | EInt Int
          | EFloat Float
And now I need a function which would convert this to a new type, say TypedExpr, which would also contain type information. My main problem is what this type should look like. I have two ideas. With a type parameter:
data TypedExpr t = TEBinaryOp (TBinaryOp a b t) (TExpr a) (TExpr b)
                 | TEConstant t
addTypes :: (ExprType t) => Expr -> TypedExpr t
or without:
data TypedExpr = TEBinaryOp Type BinaryOp TypedExpr TypedExpr
               | TEConstant Type Dynamic
addTypes :: Expr -> TypedExpr
I started with the first option, but I ran into problems, because this approach assumes that you know the type of the expression before parsing it (for me, that's true in most cases, but not always). However, I like it, because it lets me use Haskell's type system and check for most errors at compile time.
Is it possible to do it with the first option?
Which one would you choose? Why?
What problems should I expect with each option?
The type of your function
addTypes :: Expr -> TypedExpr t
is wrong, because it would mean that you get a TypedExpr t for any t you like. In contrast, what you actually want is one particular t that is determined by the argument of type Expr.
This reasoning already shows that you are going beyond the capabilities of the Hindley-Milner type system. After all, the return type of addTypes should depend on the value of the argument, but in plain Haskell 2010, types may not depend on values. Hence, you need an extension of the type system that brings you closer to dependent types. In Haskell, generalized algebraic data types (GADTs) can do that.
For a first introduction to GADTs, see also my video on GADTs.
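As a taste, here is a minimal GADT sketch of a typed expression (the constructor names are made up, not taken from your code); the index t records what the expression evaluates to:
{-# LANGUAGE GADTs #-}

data TExpr t where
  TInt   :: Int   -> TExpr Int
  TFloat :: Float -> TExpr Float
  TAdd   :: Num t => TExpr t -> TExpr t -> TExpr t

eval :: TExpr t -> t
eval (TInt n)   = n
eval (TFloat x) = x
eval (TAdd a b) = eval a + eval b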
However, after becoming familiar with GADTs, you still have the problem of parsing an untyped expression into a typed one, i.e. to write a function
addTypes :: Expr -> (exists t. TypedExpr t)
Of course, you have to perform some type checking yourself, but even then, it is not easy to convince the Haskell compiler that your type checks (which happen on the value level) can be lifted to the type level. Fortunately, other people have already thought about it, see for example the following message in the haskell-cafe mailing list:
Edward Kmett.
Re: Manual Type-Checking to provide Read instances for GADTs.
(was Re: [Haskell-cafe] Read instance for GATD)
http://article.gmane.org/gmane.comp.lang.haskell.cafe/76466
(Does anyone know of a formally published / nicely written up reference?)
I have recently started using tagless-final syntax for embedded DSLs, and I've found it to be much nicer than the standard GADT method (which you're heading towards, and which Apfelmus describes).
The key to tagless-final syntax is that instead of using an expression data type, you represent operations with a type class. For functions like your eBinaryOp, I've found it best to use two classes:
class Repr repr where
  eInt :: repr Int
  eFloat :: repr Float

class Repr repr => BinaryOp repr a b c where
  eBinaryOp :: String -> repr a -> repr b -> repr c
I would make separate BinaryOp functions rather than use a String though.
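For concreteness, here is a small, hypothetical interpreter for those classes (repeated here so the sketch stands alone): a pretty-printing representation, with names of my own invention.
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}

class Repr repr where
  eInt :: repr Int
  eFloat :: repr Float

class Repr repr => BinaryOp repr a b c where
  eBinaryOp :: String -> repr a -> repr b -> repr c

newtype Pretty a = Pretty { unPretty :: String }

instance Repr Pretty where
  eInt   = Pretty "<int literal>"
  eFloat = Pretty "<float literal>"

instance BinaryOp Pretty a b c where
  eBinaryOp op (Pretty x) (Pretty y) =
    Pretty ("(" ++ x ++ " " ++ op ++ " " ++ y ++ ")")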
There's a lot more information on Oleg's web page, including a parser that uses Haskell's type system.
Since you're doing the parsing at runtime, not at compile time, you can't piggyback on Haskell's type system (unless you import the relevant modules and manually call it yourself).
You may want to turn to TAPL’s ML examples of type checkers for a simple lambda calculus for inspiration. http://www.cis.upenn.edu/~bcpierce/tapl/ (under implementations). They do a bit more than your expression parser, since you don’t support lambdas.
I read William Cook's "On Data Abstraction, Revisited" and re-read Ralf Laemmel's "The expression lemma" to try to understand how to apply the former paper's ideas in Haskell. So, I'm trying to understand how you could implement, e.g., a set union function, in Haskell without specifying the types.
There are multiple ways, depending on which version of "abstract data types" you're after.
Concrete but opaque types: It's been a little while since I read Cook's lovely paper, but glancing back over it I think this is closest to what he's talking about as ADTs. The standard way to do this in Haskell is to export a type without its constructors; what this means in Haskell:
No pattern matching on values of the abstracted type
No constructing values of the type, except using functions exported from its module
How this relates to Cook's paper:
Representation independence: From the outside, the representation is inaccessible.
Inspection of multiple representations: Inside the ADT's module, representations may be inspected freely.
Unique implementations/modules: Different implementations can be provided by different modules, but the types cannot interoperate except by normal means. You can't use Data.IntMap.null to see whether a Data.Map.Map Int a is empty.
This technique is used extensively in the Haskell standard libraries, particularly for data types that need to maintain some sort of invariant or otherwise restrict the ability to construct values. So in this case, the best way to implement the set ADT from the paper is the following code:
import qualified Data.Set as S
Although this is perhaps not as powerful a means of abstraction as it could be in a language with a more expressive module system.
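For concreteness, here is a minimal sketch of rolling such an opaque type yourself; the module and function names are illustrative and the list representation is deliberately naive:
module IntSet (IntSet, empty, insert, member) where

-- the constructor is not exported, so clients can neither pattern match
-- on an IntSet nor build one except through the functions below
newtype IntSet = IntSet [Int]

empty :: IntSet
empty = IntSet []

insert :: Int -> IntSet -> IntSet
insert x s@(IntSet xs)
  | x `elem` xs = s
  | otherwise   = IntSet (x : xs)

member :: Int -> IntSet -> Bool
member x (IntSet xs) = x `elem` xs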
Existential quantification and interface: Haskell doesn't actually have an exists keyword as such, but the term "existential" is used in various circumstances to describe certain kinds of polymorphic types. The general idea in each case is to combine a value with a collection of functions operating on it, such that the result is polymorphic in the value's type. Consider this function signature:
foo :: (a, a -> Bool) -> Bool
Although it receives a value of type a, because a is fully polymorphic the only thing it can possibly do with that value is apply the function to it. So in a sense, within this function, the first half of the tuple is an "abstract data type", while the second half is an "interface" for working with that type. We can make this idea explicit, and apply it outside a single function, using an existential data type:
data FooADT = forall a. FooADT a (a -> Bool)
foo :: FooADT -> Bool
Now, any time we have a value of type FooADT, all we know is that there exists some type a such that we can apply FooADT's second argument to its first.
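A small usage sketch (requires ExistentialQuantification; the helper names are made up):
{-# LANGUAGE ExistentialQuantification #-}

data FooADT = forall a. FooADT a (a -> Bool)

runFoo :: FooADT -> Bool
runFoo (FooADT x p) = p x

examples :: [FooADT]
examples = [FooADT (5 :: Int) even, FooADT "hello" null]

-- map runFoo examples == [False, False]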
The same idea applies to polymorphic types with class constraints; the only difference is that the functions operating on the type are provided implicitly by the type class, rather than explicitly bundled with the value.
Now, what does this mean in terms of Cook's paper?
Representation independence still applies.
Total isolation: Unlike before, knowledge of the existentially quantified type is forever lost. Nothing can inspect the representation except the interface it itself provides.
Arbitrary implementations: Not only are implementations not necessarily unique, there's no way to limit them at all! Anything that can provide the same interface can be wrapped up inside an existential and be indistinguishable from other values.
In short, this is very similar to Cook's description of objects. For more on existential ADTs, the paper Unfolding Abstract Datatypes isn't a bad place to start; but keep in mind that what it discusses is fundamentally not what Cook is calling an ADT.
And a short addendum: Having gone to all the trouble above to describe existential type abstractions, I'd like to highlight something about the FooADT type: Because all you can do with it is apply the function to get a Bool result, there is fundamentally no difference between FooADT and Bool, except that the former obfuscates your code and requires GHC extensions. I strongly encourage reading this blog post before setting out to use existential types in Haskell code.
You can either require a comparison function to be provided or require the types to be instances of Eq. See nub and nubBy for examples of this technique:
nub :: (Eq a) => [a] -> [a]
nubBy :: (a -> a -> Bool) -> [a] -> [a]
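Applied to the set-union example from the question, a minimal sketch using the same two styles (sets are assumed here to be duplicate-free lists, and unionBy' mirrors nubBy):
import Data.List (nub, nubBy)

union' :: Eq a => [a] -> [a] -> [a]
union' xs ys = nub (xs ++ ys)

unionBy' :: (a -> a -> Bool) -> [a] -> [a] -> [a]
unionBy' eq xs ys = nubBy eq (xs ++ ys)
Data.List in fact already exports union and unionBy with exactly this pairing of signatures.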