When I try to compile this with GHC, it complains that the left-hand sides of the function definition have different numbers of parameters.
module Example where
import Data.Maybe
from_maybe :: a -> Maybe a -> a
from_maybe a Nothing = a
from_maybe _ = Data.Maybe.fromJust
I'm wondering whether this is a ghc restriction. I tried to see if I
could find anything about the number of parameters in the Haskell 2010
Report but I wasn't successful.
Is this legal Haskell or isn't it? If not, where is this parameter count
restriction listed?
It's not legal. The restriction is described in the
Haskell 2010 Report:
4.4.3.1 Function bindings
[...]
Note that all clauses defining a function must be contiguous, and the number of patterns in each clause must be the same.
The answer from melpomene summarizes the syntax rule. In this answer I want to try to describe why this has to be the case.
From the point of view of the programmer, the syntax in the question seems quite reasonable. There is a first specific case which is marked by a value in the second parameter. Then there is a general solution which reduces to an existing function. Either pattern on its own is valid so why can they not co-exist?
The syntax rule that all clauses must take exactly the same number of parameters is a necessary result of Haskell being a lazy language. Being lazy means that the values of parameters are not evaluated until they are needed.
Let's look at what happens when we partially apply a function, that is, provide it with fewer parameters than appear in the declaration. When that happens Haskell produces an anonymous function* that will complete the function, and stores the first parameters in the scope of that function without evaluating them.
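As a small illustration of that behaviour (the names `second` and `partial` are made up for this sketch), an argument supplied in a partial application is stored unevaluated and is only forced if the function body actually needs it:

```haskell
-- Hypothetical helper: ignores its first argument entirely.
second :: Int -> Int -> Int
second _ y = y

-- Partial application: the first argument is captured as an unevaluated
-- thunk, so the call to `error` is never run.
partial :: Int -> Int
partial = second (error "never evaluated")

result :: Int
result = partial 5
```

Here `result` evaluates to 5 without the error ever being raised, because `second` never inspects its first argument.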
That last part is important. For functions with different numbers of parameters, it would be necessary to evaluate some of them to choose which pattern to match, but then not evaluate them.
In other words the compiler needs to know exactly how many parameters it needs before it evaluates them to choose among the patterns. So while it seems like we should not need to put an extra _ in, the compiler needs it to be there.
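For reference, a sketch of the function from the question with the extra parameter added to the second clause, so that both clauses have two patterns (the variable name `m` is just illustrative):

```haskell
import Data.Maybe (fromJust)

from_maybe :: a -> Maybe a -> a
from_maybe a Nothing = a
-- The second clause now names its second argument instead of
-- returning fromJust partially applied.
from_maybe _ m       = fromJust m
```

With the clause counts matching, GHC accepts the definition.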
As an aside, the order in which parameters occur can make a big difference in how easy a function is to implement and use. See the answers to my question Ordering of parameters to make use of currying for suggestions. If the order is reversed on the example function, it can be implemented with only the one named point.
from_maybe :: Maybe a -> a -> a
from_maybe Nothing = id
from_maybe (Just x) = const x
*: What an implementation like GHC actually does in this situation is subject to optimization and may not work exactly this way, but the end result must be the same as if it did.
Would it be possible to have a function that takes some value and a pattern in order to check if both match?
Let's call this hypothetical function matches and with it we could rewrite the following function...
isSingleton :: [a] -> Bool
isSingleton [_] = True
isSingleton _ = False
...like so...
isSingleton xs = xs `matches` [_]
Would this be theoretically possible? If yes, how? And if not, why?
Well, you can't really use [_] as an expression – the compiler doesn't allow it. There are various bodges that could be applied to make it kind-of-possible:
-fdefer-typed-holes makes GHC ignore the _ during compilation. That doesn't really make it a legal expression, it just means the error will be raised at runtime instead. (Generally a very bad idea, should only be used for trying out something while another piece of your code isn't complete yet.)
You could define _' = undefined to have similar syntax that is accepted as an expression. Of course, that still means there will be a runtime error.
A runtime error could be caught, but only in the IO monad. That means basically don't do anything that requires it. But of course, technically speaking you could, and then wrap it in unsafePerformIO. Disgusting, but I won't say there aren't situations where this sort of thing is a necessary evil.
With that, you could already implement a function that would be able to determine that either it definitely doesn't match [_] (if == returns False without an error being raised), or possibly matches (in case the error is raised before a constructor discrepancy is found). That would be enough to determine that [] does not match [_], but not to determine that [1,_,3] does not match [1,_,0].
Again this could be bodged around with a type class that first converts the values to a proper ⊥-less tree structure (using the unsafePerformIO catch to determine how deep to descend), but... please, let's stop the thought process here. We're clearly working against the language instead of with it with all of that.
What could of course be done is to do it in Template Haskell, but that would fix you to compile-time pattern expressions, I doubt that's in the spirit of the question. Or you could generate a type that explicitly expresses “original type with holes in it”, so the holes could properly be distinguished without any unsafe IO shenanigans. That would become quite verbose though.
In practice, I would just not try passing around first-class patterns, but instead pass around boolean-valued functions. Granted, functions are more general than patterns, but not in a way that would pose any practical problems.
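A minimal sketch of that approach, assuming a hypothetical consumer `countMatching` that would otherwise have taken a pattern:

```haskell
-- Predicate standing in for the pattern [_].
isSingleton :: [a] -> Bool
isSingleton [_] = True
isSingleton _   = False

-- Hypothetical function that takes a predicate where a first-class
-- pattern might have been wanted.
countMatching :: (a -> Bool) -> [a] -> Int
countMatching p = length . filter p

singletons :: Int
singletons = countMatching isSingleton ["a", "", "bc", "d"]
```

Only "a" and "d" have exactly one element, so `singletons` is 2.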
Something that's more closely analogous to patterns are Prisms in the lens ecosystem. Indeed, there is the is combinator that does what you're asking for. However, lenses/prisms can be again fiddly to construct, I'm not even sure a prism corresponding to [_] can be built using the _Cons and _Empty primitives.
tl;dr it's not worth it.
Patterns aren't first-class in Haskell; you cannot write a function that receives a pattern as an argument.
However, anywhere you would call your hypothetical matches function, you can instead make a bespoke function that tests the particular pattern you're interested in, and use that function instead. So you don't really need first class patterns, they just might save a bit of boilerplate.
The extension LambdaCase more or less allows you to write a function that just does a pattern match with minimal syntactic overhead, although there is no special-case syntax available for the specific purpose of returning a Bool saying whether a pattern matches (such special-purpose syntax would allow you to avoid explicitly writing that you want to map the pattern to True and having to add a catch-all alternative case to map to False).
For example:
{-# LANGUAGE LambdaCase #-}
isSingleton = \case
  [_] -> True
  _   -> False
For something like isSingleton (which already is a single pattern match encapsulated into a function) there's not much benefit in doing this over just implementing it directly. But in a more complex function where you want to call x `matches` <pattern> (or pass (`matches` <pattern>) to another function), it might be an alternative that keeps the pattern inline.
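As a rough sketch of the inline use, here the LambdaCase expression is passed straight to `filter` (the name `onlySingletons` is invented for the example):

```haskell
{-# LANGUAGE LambdaCase #-}

-- Keep only the one-element lists; the pattern stays inline at the
-- use site instead of living in a separately named function.
onlySingletons :: [[a]] -> [[a]]
onlySingletons = filter (\case [_] -> True; _ -> False)
```

For instance, `onlySingletons [[1], [], [2, 3], [4]]` keeps `[1]` and `[4]`.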
Honestly I'd probably just define functions like isSingleton for each pattern I wanted this for though (possibly just in a where clause).
My primary question is: is there, within some Haskell AST, a way I can determine a list of the available declarations, and their types? I'm trying to build an editor that allows for the user to be shown all the appropriate edits available, such as inserting functions and/or other declared values that can be used or inserted at any point. It'll also disallow syntax errors as well as type errors. (That is, it'll be a semantic structural editor, which will also use the typechecker to make sure the edited pieces make sense in, in this case, Haskell.)
The second part of my question is: once I have that list, given a particular expression or function or focussed-on piece of AST (using Lens), how could I filter the list based on what could possibly replace or fit that particular focussed-on AST piece (whether by providing arguments to a function, or if it's a value, just "as-is"). Perhaps I need to add some concrete example here... something like: "Haskell, which declarations could possibly be applied (for functions) and/or placed into the hole at yay x y z = (x + y - z) * _?" then if there was an expression number2 :: Num a => a ; number2 = 23 it would put this in the list, as well as the functions available in the context, as well as those from Num itself such as (+) :: Num a => a -> a -> a, (*) :: Num a => a -> a -> a, and any other declarations that resulted in a type that would match such as Num a => a etc. etc.
More details follow:
I’ve done a fair bit of research into this area over quite a long time: looked at and used hint, Language.Haskell.Exts and Control.Lens a fair bit. Also had a look into Dynamic. Control.Lens is relevant for the second half of my question. I've also looked at quite a few projects along the way including Conal Elliott's "Semantic Editing Combinators", Paul Chiusano's Unison system and quite a few things in Clojure and Lisp as well.
So, I know I can get a list of the exports of a module with hint as [String], and I could coerce that to [Dynamic], I think (possibly?), but I’m not sure how I’d get sub-function declarations and their types. (Maybe I could take the declarations within that scope with AST and put them in their own modules in a String and pull them in by getting the top level declarations with hint? that would work but feels hacky and cumbersome)
I can use (:~:) from Data.Typeable to do "propositional equality" (ie typechecking?) on two terms, but what I actually need to do is see if a term could be matched into a position in the source/AST (I'm using lenses and prisms to focus on those parts of the AST) given some number of arguments. Some kind of partial type-checking, or result type-checking? Because the thing I might be focussing on could very well be a function, and I might need to keep the same arity.
I feel like perhaps this is very similar to Idris' term-searching, though I haven't looked into the source for that and I'm not sure if that's something only possible in a dependently typed language.
Any help would be great.
Looks like I kind of answered my own questions, so I'm going to do so formally here.
The answer to the first part of my question can be found in the Reflection module of the hint library. I knew I could get the exports of these modules as a [String], but there's a function in there with type getModuleExports :: MonadInterpreter m => ModuleName -> m [ModuleElem] that is most likely the sort of thing I'm after. This is because hint provides access to a large part of the GHC API. It also provides some lookup functions which I can then use to get the types of these top level terms.
https://github.com/mvdan/hint/blob/master/src/Hint/Reflection.hs#L30
Also, Template Haskell provides some of the functionality I'm interested in, and I'll probably end up using quite a bit of that to build my functions, or at least a set of lenses for whatever syntax is being used by the code (/text) under consideration.
In terms of the second part of the question, I still don't have a particularly good answer, so my first attempt will be to use some String munging on the output of the lookup functions and see what I can do.
I'm new to Haskell and understand that it is (basically) a pure functional language, which has the advantage that results of functions will not change across multiple evaluations. Given this, I'm puzzled by why I can't easily mark a function in such a way that it remembers the results of its first evaluation, and does not have to be evaluated again each time its value is required.
In Mathematica, for example, there is a simple idiom for accomplishing this:
f[x_]:=f[x]= ...
but in Haskell, the closest thing I've found is something like
f' = (map f [0 ..] !!)
  where f 0 = ...
        f n = f' ...
which in addition to being far less clear (and apparently limited to Int arguments?) does not (seem to) preserve results within an interactive session.
Admittedly (and clearly), I don't understand exactly what's going on here; but naively, it seems like Haskell should have some way, at the function definition level, of
taking advantage of the fact that its functions are functions and skipping re-computation of their results once they have been computed, and
indicating a desire to do this at the function definition level with a simple and clean idiom.
Is there a way to accomplish this in Haskell that I'm missing? I understand (sort of) that Haskell can't store the evaluations as "state", but why can't it simply (in effect) redefine evaluated functions to be their computed value?
This grows out of this question, in which lack of this feature results in terrible performance.
Use a suitable library, such as MemoTrie.
import Data.MemoTrie
f' = memo f
  where f 0 = ...
        f n = f' ...
That's hardly less nice than the Mathematica version, is it?
Regarding
“why can't it simply (in effect) redefine evaluated functions to be their computed value?”
Well, it's not so easy in general. These values have to be stored somewhere. Even for an Int-valued function, you can't just allocate an array with all possible values – it wouldn't fit in memory. The list solution only works because Haskell is lazy and therefore allows infinite lists, but that's not particularly satisfying since lookup is O(n).
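To make the list-based idiom concrete, here is a minimal sketch that uses Fibonacci numbers as a stand-in for the elided function body (`memoFib` and `fib` are illustrative names, not from the question). The lazy list holds one thunk per index, so each value is computed at most once, but every lookup still walks the list from the front:

```haskell
-- The list `map fib [0 ..]` is built once and shared by all calls,
-- because memoFib is defined without an explicit argument.
memoFib :: Int -> Integer
memoFib = (map fib [0 ..] !!)
  where
    fib 0 = 0
    fib 1 = 1
    -- Recursive calls go back through the shared list.
    fib n = memoFib (n - 1) + memoFib (n - 2)
```

With this definition `memoFib 50` returns promptly, whereas the naive doubly recursive version would take exponentially many steps.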
For other types it's simply hopeless – you'd need to somehow diagonalise an uncountably infinite domain.
You need some cleverer organisation. I don't know how Mathematica does this, but it probably uses a lot of “proprietary magic”. I wouldn't be so sure that it does really work the way you'd like, for any inputs.
Haskell fortunately has type classes, and these allow you to express exactly what a type needs in order to be quickly memoisable. HasTrie is such a class.
Suppose I was to define (+) on Strings but not by giving an instance of Num String.
Why does Haskell now hide Num's (+) function? After all, the function I have provided:
(+) :: String -> String -> String
can be distinguished by the compiler from Prelude's (+). Why can't both functions exist in the same namespace, but with different, non-overlapping type signatures?
As long as there is no call to the function in the code, Haskell doesn't need to care that there's an ambiguity. Placing a call to the function with arguments will then determine the types, such that the appropriate implementation can be chosen.
Of course, once there is an instance Num String, there would actually be a conflict, because at that point Haskell couldn't decide based upon the parameter type which implementation to choose, if the function were actually called.
In that case, an error should be raised.
Wouldn't this allow function overloading without pitfalls/ambiguities?
Note: I am not talking about dynamic binding.
Haskell simply does not support function overloading (except via typeclasses). One reason for that is that function overloading doesn't work well with type inference. If you had code like f x y = x + y, how would Haskell know whether x and y are Nums or Strings, i.e. whether the type of f should be f :: Num a => a -> a -> a or f :: String -> String -> String?
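For comparison, a sketch of what the typeclass route looks like. The class `Addable`, the operator `<+>`, and the function `addBoth` are hypothetical names chosen for this example; the point is that the type of `addBoth` can still be inferred, as a constrained polymorphic type:

```haskell
-- Hypothetical class providing one overloaded operator.
class Addable a where
  (<+>) :: a -> a -> a

instance Addable Int where
  (<+>) = (+)

-- Lists (and therefore Strings) "add" by concatenation.
instance Addable [c] where
  (<+>) = (++)

-- No signature needed: the inferred type is Addable a => a -> a -> a.
addBoth x y = x <+> y
```

Both `addBoth (1 :: Int) 2` and `addBoth "ab" "cd"` type-check, and the instance is selected from the argument types at each call.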
PS: This isn't really relevant to your question, but the types aren't strictly non-overlapping if you assume an open world, i.e. in some module somewhere there might be an instance for Num String, which, when imported, would break your code. So Haskell never makes any decisions based on the fact that a given type does not have an instance for a given typeclass. Of course, function definitions hide other function definitions with the same name even if there are no typeclasses involved, so as I said: not really relevant to your question.
Regarding why it's necessary for a function's type to be known at the definition site as opposed to being inferred at the call-site: First of all the call-site of a function may be in a different module than the function definition (or in multiple different modules), so if we had to look at the call site to infer a function's type, we'd have to perform type checking across module boundaries. That is when type checking a module, we'd also have to go all through the modules that import this module, so in the worst case we have to recompile all modules every time we change a single module. This would greatly complicate and slow down the compilation process. More importantly it would make it impossible to compile libraries because it's the nature of libraries that their functions will be used by other code bases that the compiler does not have access to when compiling the library.
“As long as the function isn't called”
“At some point, when using the function”
No, no, no. In Haskell you don't think of "before" or "the minute you do...", but define stuff once and for all time. That's most apparent in the runtime behaviour of variables, but also translates to function signatures and class instances. This way, you don't have to do all the tedious thinking about compilation order and are safe from the many ways e.g. C++ templates/overloads often break horribly because of one tiny change in the program.
Also, I don't think you quite understand how Hindley-Milner works.
“Before you call the function, at which time you know the type of the argument, it doesn't need to know.”
Well, you normally don't know the type of the argument! It may sometimes be explicitly given, but usually it's deduced from the other argument or the return type. For instance, in
map (+3) [5,6,7]
the compiler doesn't know what types the numeric literals have, it only knows that they are numbers. This way, you can evaluate the result as whatever you like, and that allows for things you could only dream of in other languages, for instance a symbolic type where
> map (+3) [5,6,7] :: SymbolicNum
[SymbolicPlus 5 3, SymbolicPlus 6 3, SymbolicPlus 7 3]
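A minimal sketch of what such a symbolic type could look like. Everything except the names `SymbolicNum` and `SymbolicPlus` is an assumption made for this example; in particular the constructor `SymbolicLit` is invented to hold literals:

```haskell
-- Expression tree that records arithmetic instead of performing it.
data SymbolicNum
  = SymbolicLit Integer
  | SymbolicPlus SymbolicNum SymbolicNum
  | SymbolicTimes SymbolicNum SymbolicNum
  | SymbolicNegate SymbolicNum
  | SymbolicAbs SymbolicNum
  | SymbolicSignum SymbolicNum
  deriving (Eq, Show)

instance Num SymbolicNum where
  fromInteger = SymbolicLit   -- numeric literals become SymbolicLit nodes
  (+)         = SymbolicPlus
  (*)         = SymbolicTimes
  negate      = SymbolicNegate
  abs         = SymbolicAbs
  signum      = SymbolicSignum

example :: [SymbolicNum]
example = map (+3) [5, 6, 7]
```

With the derived Show instance the literals print wrapped in `SymbolicLit`, so the first element is shown as `SymbolicPlus (SymbolicLit 5) (SymbolicLit 3)` rather than the abbreviated form above.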
I'm curious as to how often experienced Haskell programmers really use type inference in practice. I often see it praised as an advantage over the always-explicit declarations needed in certain other languages, but for some reason (perhaps just because I'm new) it "feels" right to write a type signature just about all the time... and I'm sure in some cases it really is required.
Can some experienced Haskellers (Haskellites? Haskellizers?) provide some input?
It's still an advantage, even if you write type signatures, because the compiler will catch type errors in your functions. I usually write type signatures too, but omit them in places like where or let clauses where you actually define new symbols but don't feel the need to specify a type signature.
Stupid example with a strange way to calculate squares of numbers:
squares :: [Int]
squares = sums 0 odds
  where
    odds = filter odd [1..]
    sums s (a:as) = s : sums (s+a) as

square :: Int -> Int
square n = squares !! n
odds and sums are definitions that would need a type signature if the compiler didn't infer their types automatically.
Also if you use generic functions, like you usually do, type inference is what ensures that you really combine all those generic functions together in a valid way. If you, in the above example, say
squares :: [a]
squares = ...
The compiler can deduce that this isn't valid this way, because one of the used functions (the odd function from the standard library), needs a to be in the type class Integral. In other languages you usually only recognize this at a later point.
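One way to keep the definition generic is to state the constraint explicitly. A sketch, assuming the same body as the earlier example with only the signature changed:

```haskell
-- The Integral constraint is what `odd` requires of the element type.
squares :: Integral a => [a]
squares = sums 0 odds
  where
    odds = filter odd [1 ..]
    sums s (a : as) = s : sums (s + a) as
```

The list can then be used at any Integral type, for example `take 5 (squares :: [Int])` or `take 5 (squares :: [Integer])`.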
If you write this as a template in C++, you get a compiler error when you use the function on a non-Integral type, but not when you define the template. This can be quite confusing, because it's not immediately clear where you've gone wrong and you might have to look through a long chain of error messages to find the real source of the problem. And in something like Python you get the error at runtime at some unexpected point, because something didn't have the expected member functions. And in even more loosely typed languages you might not get any error, but just unexpected results.
In Haskell the compiler can ensure that the function can be called with all the types specified in its signature, even if it's a generic function that is valid for all types that fulfill some constraints (aka type classes). This makes it easy to program in a generic way and use generic libraries, something much harder to get right in other languages. Even if you specify a generic type signature, there is still a lot of type inference going on in the compiler to find out what specific type is used in each call and if this type fulfills all the requirements of the function.
I always write the type signature for top-level functions and values, but not for stuff in "where", "let" or "do" clauses.
First, top level functions are generally exported, and Haddock needs a type declaration to generate the documentation.
Second, when you make a mistake the compiler errors are a lot easier to decode if the compiler has type information available. In fact sometimes in a complicated "where" clause I get an incomprehensible type error so I add temporary type declarations to find the problem, a bit like the type-level equivalent of printf debugging.
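As an illustration of those temporary declarations, here is a hypothetical `average` function whose local bindings carry their own signatures. If one of the bindings had the wrong type, the error would then point at that binding rather than at the place where it is used:

```haskell
average :: [Double] -> Double
average xs = total / count
  where
    -- Local signatures like these can be added while debugging
    -- and removed again afterwards.
    total :: Double
    total = sum xs

    count :: Double
    count = fromIntegral (length xs)
```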
So to answer the original question, I use type inference a lot but not 100% of the time.
You have good instincts. Because they are checked by the compiler, type signatures for top-level values provide invaluable documentation.
Like others, I almost always put a type signature for a top-level function, and almost never for any other declaration.
The other place type inference is invaluable is at the interactive loop (e.g., with GHCi). This technique is most helpful when I'm designing and debugging some fancy new higher-order function or some such.
When you are faced with a type-check error, although the Haskell compiler does provide information on the error, this information can be hard to decode. To make it easier, you can comment out the function's type signature and then see what the compiler has inferred about the type and see how it differs from your intended type.
Another use is when you are constructing an 'inner function' inside a top level function but you are not sure how to build the inner function or even what its type should be. What you can do is to pass the inner function in as an argument to the top level function and then ask GHCi for the type of the top level function. This will include the type of the inner function. You can then use a tool like Hoogle to see if this function already exists in a library.