I'm still new to Haskell (learning it on and off). I'm wondering why Haskell doesn't have a literal Data.Map constructor syntax, like the Map/Hash constructor syntax in Clojure or Ruby. Is there a reason? I thought that since Haskell does have a literal constructor syntax for Data.List, there should be one for Data.Map.
This question is not meant to be critical at all. I would just like to learn more about Haskell through the answers.
Unlike Clojure and Ruby, Haskell's finite maps are provided as libraries. This has tradeoffs: for example, as you noticed, there's no built-in syntax for finite maps; however, because it's a library, we can (and do) have many alternative implementations, and you as a programmer can choose the one that's most appropriate for your uses.
In addition to the answers already given (the "historical accident" one notwithstanding), I think there's also something to be said for the use of Data.Map in Haskell compared to Hash in Ruby or similar things; map-like objects in other languages tend to see a lot more use for general ad-hoc storage.
Whereas in Haskell you'd whip up a data definition in no time at all, creating a class in other languages tends to be somewhat heavy-weight, and so we find that even for data with a well-known structure, we'll just use a Hash or dict or similar. The fact that we have a direct syntax for doing so makes it all the more attractive an option.
Contrast to Lisp: using MAKE-HASH-TABLE and then repeatedly SETFing it is relatively annoying (similar to using Data.Map), so everything gets thrown into nested lists, instead—because it's what's convenient.
Similarly, I'm happy that the most convenient choice for storing data is creating new types to suit, and then I leave Data.Map to when I'm actually constructing a map or hash-table as an intrinsic component. There are a few cases where I think the syntax would be nice (usually just for smaller throw-away programs), but in general I don't miss it.
Actually I'm not sure why nobody has pointed it out in an answer (there's only sam boosalis' comment) but with OverloadedLists you can pretty much get literal syntax for Map and Set:
{-# LANGUAGE OverloadedLists #-}
import Data.Map (Map)
import Data.Set (Set)
foo :: Map Int Int
foo = [(1,2)]
bar :: Set Int
bar = [1]
From there it's only one more step to get even nicer-looking maps, for example:
a =: b = (a,b)
ages :: Map String Int
ages = [ "erik" =: 30
, "john" =: 45
, "peter" =: 21 ]
Although I personally prefer explicit over implicit, so unless I'm building a DSL, I'd still stick to fromList and (foo, bar); Haskell is about the big wins, not the small ones.
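To compare the two styles side by side, here is a minimal sketch that puts the OverloadedLists literal next to the explicit fromList form. The names evens and agesExplicit are made up for illustration; the extension and the fromList functions are the real ones from GHC and containers.

```haskell
{-# LANGUAGE OverloadedLists #-}
import Data.Map (Map)
import qualified Data.Map as Map
import Data.Set (Set)
import qualified Data.Set as Set

-- Purely cosmetic pairing operator, as in the answer above.
(=:) :: a -> b -> (a, b)
a =: b = (a, b)

-- List literal interpreted as a Map through the IsList class.
ages :: Map String Int
ages = [ "erik" =: 30, "john" =: 45, "peter" =: 21 ]

-- List literal interpreted as a Set; the duplicate collapses.
evens :: Set Int
evens = [2, 4, 2]

-- The explicit spelling, which needs no extension to read.
agesExplicit :: Map String Int
agesExplicit = Map.fromList [("erik", 30), ("john", 45), ("peter", 21)]
```

Both ages and agesExplicit denote the same map; the literal form only saves the call to fromList.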
Haskell has special syntax for lists because in a lazy functional language they more or less take the place of loop control structures in imperative languages. So they're much more important than Map in the grand scheme.
Also, I know you were referring to [1,2,3] when you said "list syntax", but I wanted to add that the list constructor syntax could almost be implemented in Haskell 98, in that data constructors can be infix operators when they start with :, e.g.
data Pair = Int :-- Int
So the list constructor : is just a slight special case of this general syntax rule, which is pretty elegant. Some people miss that.
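As a small sketch of that rule, a user-defined list type can get its own infix cons. The names List, :>, toBuiltin and example are hypothetical, chosen only for this illustration.

```haskell
-- A hand-rolled list whose cons is an infix data constructor.
-- Any operator symbol starting with ':' may be used as a constructor.
infixr 5 :>
data List a = Nil | a :> List a

-- Convert to the built-in list to show the two have the same shape.
toBuiltin :: List a -> [a]
toBuiltin Nil       = []
toBuiltin (x :> xs) = x : toBuiltin xs

example :: List Int
example = 1 :> 2 :> 3 :> Nil
```

The only thing the built-in list adds on top of this is the bracket sugar [1,2,3] and the reserved name : itself.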
Haskell does have a Map constructor, but it is "hidden" (like a private method in an object oriented paradigm). You are encouraged to use "public" constructors, such as empty, singleton or fromList.
However, if you inspect the code, available at https://hackage.haskell.org/package/containers-0.4.0.0/docs/src/Data-Map.html , you get the following definition
data Map k a = Tip
| Bin {-# UNPACK #-} !Size !k a !(Map k a) !(Map k a)
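The Tip and Bin constructors are not exported from Data.Map, so ordinary code cannot use them directly. Newer versions of containers expose them through Data.Map.Internal, but relying on that is not recommended.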
When learning about Control.Arrow and Haskell's built-in proc notation, I had the idea that this language might prove very useful as an eDSL for general monoidal categories (using *** for tensor and >>> for composition), if only the Arrow typeclass were generalized to allow a general tens :: * -> * -> * operation rather than Arrow's (,) :: * -> * -> *.
After doing some research, I found GArrows, which seem to fit my needs. However, the linked GArrow typeclass comes bundled with the so-called "HetMet" GHC extensions, and support for other features that (for the time being, anyway) I don't have much use for, such as "modal types".
Given that I would like to be able to use such a GArrow typeclass without having to install non-standard GHC extensions:
Is there an actual (somewhat standardized) library on Hackage that meets my needs for such a generalized arrow typeclass?
Given such a library, is there any way to use such a GArrow type class with a "generalized proc" notation without having to cook up my own GHC extension? (With RebindableSyntax perhaps?)
Note: Also, I'm fine with using quasiquotation for a generalized proc notation. So perhaps it wouldn't be too difficult to modify something like this to suit my needs.
I've wondered about that before, too. But – proc notation is so widely considered a silly oddball that there's probably not much interest in generalisation either (though I daresay this is what would make it actually useful!)
However, special syntax isn't actually necessary. The primary reference here is Conal Elliott's work on compiling lambda notation to bicartesian closed categories, which I thought would have caught on in the Haskell community by now but somehow hasn't. It is available as a GHC plugin, at any rate.
Even that isn't always needed. For some category combinators, you can just wrap a value that's universally quantified in the argument, and treat that as a pseudo-return-value. I call those Agent in constrained-categories. I'm not sure whether that's usable for your application, but at any rate several things you'd do with arrow-like categories can be done this way. (In constrained-categories the tensor product is fixed to (,), however, so it is probably not what you want. Could you explain what tensor product you need?)
As Nikita Volkov mentioned in his question Data.Text vs String, I also wondered why I have to deal with the different string implementations type String = [Char] and Data.Text in Haskell. In my code I use the pack and unpack functions really often.
My question: Is there a way to have an automatic conversion between both string types so that I can avoid writing pack and unpack so often?
In other programming languages like Python or JavaScript there is, for example, an automatic conversion between integers and floats when it is needed. Can I get something like this in Haskell too? I know that the mentioned languages are weakly typed, but I heard that C++ has a similar feature.
Note: I already know the language extension {-# LANGUAGE OverloadedStrings #-}. But as I understand it, this extension only applies to strings written as literals "...". I want an automatic conversion for strings that I get from other functions or that arrive as arguments in function definitions.
Extended question: Haskell. Text or Bytestring covers also the difference between Data.Text and Data.ByteString. Is there a way to have an automatic conversion between the three strings String, Data.Text and Data.ByteString?
No.
Haskell doesn't have implicit coercions for technical, philosophical, and almost religious reasons.
As a comment, converting between these representations isn't free and most people don't like the idea that you have hidden and potentially expensive computations lurking around. Additionally, with strings as lazy lists, coercing them to a Text value might not terminate.
We can convert literals to Text automatically with OverloadedStrings, which desugars a string literal "foo" to fromString "foo"; fromString for Text just calls pack.
The real question might be why you're coercing so much. Why do you need to unpack Text values so often? If you are constantly changing them back to Strings, it defeats the purpose a bit.
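A minimal sketch of what OverloadedStrings does and does not cover, assuming the text package is available. The names greeting, greetingDesugared, shout and fromUser are invented for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.String (fromString)
import Data.Text (Text)
import qualified Data.Text as T

-- A literal: the compiler inserts fromString for us.
greeting :: Text
greeting = "hello"

-- Roughly what the line above is desugared to.
greetingDesugared :: Text
greetingDesugared = fromString "hello"

shout :: Text -> Text
shout = T.toUpper

-- A String that is NOT a literal still needs an explicit pack;
-- the extension does nothing for values arriving at run time.
fromUser :: String -> Text
fromUser s = shout (T.pack s)
```

The explicit T.pack in fromUser is exactly the conversion the question wants to avoid, and it stays visible because it costs a traversal of the list.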
Almost Yes: Data.String.Conversions
Haskell libraries make use of different types, so there are many situations in which there is no choice but to heavily use conversion, distasteful as it is - rewriting libraries doesn't count as a real choice.
I see two concrete problems, either of which could be a significant obstacle to Haskell adoption:
Coding ends up requiring specific implementation knowledge of the libraries you want to use. This is a big issue for a high-level language.
Performance on simple tasks is bad, which is a big issue for a generalist language.
Abstracting from the specific types
In my experience, the first problem is the time spent guessing the package name holding the right function for plumbing between libraries that basically operate on the same data.
To that problem there is a really handy solution: the string-conversions package (module Data.String.Conversions), provided you are comfortable with UTF-8 as your default encoding.
This package provides a single cs conversion function between a number of different types.
String
Data.ByteString.ByteString
Data.ByteString.Lazy.ByteString
Data.Text.Text
Data.Text.Lazy.Text
So you just import Data.String.Conversions, and use cs which will infer the right version of the conversion function according to input and output types.
Example:
import Data.Aeson (decode)
import Data.Text (Text)
import Data.ByteString.Lazy (ByteString)
import Data.String.Conversions (cs)
-- MyStructure is assumed to be defined elsewhere with a FromJSON instance.
-- cs converts the Text to the lazy ByteString that decode expects.
decodeTextStoredJson' :: Text -> Maybe MyStructure
decodeTextStoredJson' x = decode (cs x)
NB: In GHCi you generally do not have a context that gives the target type, so you direct the conversion by explicitly stating the type of the result, as you would for read:
let z = cs x :: ByteString
Performance and the cry for a "true" solution
I am not aware of any true solution as of yet - but we can already guess the direction
it is illegitimate to require conversion when the data itself does not change ;
best performance is achieved by not converting data from one type to another for administrative purposes ;
coercion is evil - coercitive, even.
So the direction must be to make these types not different, i.e. to reconcile them under (or over) an archetype from which they would all derive, allowing composition of functions using different derivations, without the need to convert.
Note: I absolutely cannot evaluate the feasibility or potential drawbacks of this idea. There may be some very sound stoppers.
I'd like to write a Haskell program that uses GADTs interactively on a platform not supported by GHCi (namely, GNU/Linux on mipsel). The problem is, the construct that can be used to define a GADT in GHC, for example:
data Term a where
Lit :: Int -> Term Int
Pair :: Term a -> Term b -> Term (a,b)
...
doesn't seem to work in Hugs.
Is it really impossible to define GADTs in Hugs? My TA at a Haskell class said it was possible in Hugs, but he seemed unsure.
If not, can GADTs be encoded using other syntax or semantics supported by Hugs, just as GADTs can be encoded in OCaml?
GADTs are not implemented in Hugs.
Instead, you should use a port of GHC to mips if you are attempting to run code using GADTs. Note that you won't be able to use ghci on all platforms, due to lack of bytecode loading on more exotic architectures.
Regarding your question 2 (how to encode GADT use cases in Haskell 98), you may want to look at this 2006 paper by Sulzmann and Wang: GADTless programming in Haskell 98.
Like the OCaml work you're referring to, this works by factoring GADTs through an equality type. There are various ways to define an equality type; they use a form of Leibniz equality, as in the OCaml encoding, which allows substitution through any application of a type operator at kind * -> *.
Depending on how a given type checker reasons about GADT equalities, this may not be expressive enough to cover all examples of GADTs: the checker may implement equality reasoning rules that are not necessarily captured by this definition. For example, a*b = c*d implies a = c and b = d: this form of decomposition is not available if you only apply type constructors at kind * -> *. Later, in 2010, Oleg discussed how you can use type families to apply "type deconstructors" through Leibniz equality, gaining decomposition properties for this definition, but of course this is again outside Haskell 98.
That's something to keep in mind for type system designers: is your language complete for Leibniz equality, in the sense that it can express what a specialized equality solver can do?
Even if you find an encoding of the equality type that is expressive enough, you will have very practical convenience issues: when you use GADTs, all uses of equality witness are inferred from type annotations. With this explicit encoding you'll have much more work to do.
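To make the extra work concrete, here is a rough sketch of the Term type from the question encoded through a Leibniz equality witness. It is not the exact encoding from the paper and it is not Haskell 98: it assumes rank-2 types and existential quantification, spelled here with the GHC extension names. All helper names (Equal, refl, castWith, lit, pair, eval) are chosen for this illustration.

```haskell
{-# LANGUAGE RankNTypes, ExistentialQuantification #-}

-- Leibniz equality: a and b are equal if they can be exchanged
-- under every type constructor f of kind * -> *.
newtype Equal a b = Equal { subst :: forall f. f a -> f b }

refl :: Equal a a
refl = Equal id

newtype Id a = Id { unId :: a }

-- Use a witness to convert a value, by substituting under Id.
castWith :: Equal a b -> a -> b
castWith eq x = unId (subst eq (Id x))

-- Each constructor carries the equality a GADT would record implicitly.
data Term a
  = Lit (Equal Int a) Int
  | forall b c. Pair (Equal (b, c) a) (Term b) (Term c)

-- Smart constructors recover the GADT-style signatures.
lit :: Int -> Term Int
lit = Lit refl

pair :: Term b -> Term c -> Term (b, c)
pair = Pair refl

-- The witness must be applied by hand at every use site.
eval :: Term a -> a
eval (Lit eq n)    = castWith eq n
eval (Pair eq x y) = castWith eq (eval x, eval y)
```

With real GADTs the calls to castWith disappear, because the type checker applies the equalities itself.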
Finally (no pun intended), a lot of use cases of GADTs can equally be expressed by tagless-final embeddings (again by Oleg), which IIRC can often be done in Haskell 98. The blog post by Martin Van Steenbergen that dons points to in a comment on his reply is in this spirit, but Oleg has considerably improved this technique.
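A minimal tagless-final sketch of the same Term language, using only a constructor class and no extensions. The class name TermSym and the two interpreters Eval and Pretty are assumptions made for this example, not names from Oleg's papers.

```haskell
-- The term language as a class: each GADT constructor becomes a method
-- whose result type carries the same index the GADT would.
class TermSym repr where
  lit  :: Int -> repr Int
  pair :: repr a -> repr b -> repr (a, b)

-- Interpreter 1: evaluate to a Haskell value.
newtype Eval a = Eval { runEval :: a }

instance TermSym Eval where
  lit = Eval
  pair (Eval x) (Eval y) = Eval (x, y)

-- Interpreter 2: pretty-print, ignoring the type index.
newtype Pretty a = Pretty { runPretty :: String }

instance TermSym Pretty where
  lit n = Pretty (show n)
  pair (Pretty x) (Pretty y) = Pretty ("(" ++ x ++ ", " ++ y ++ ")")

-- One term, usable with any interpreter.
example :: TermSym repr => repr (Int, (Int, Int))
example = pair (lit 1) (pair (lit 2) (lit 3))
```

Here an ill-typed term is rejected by the ordinary class machinery, so no equality witness is needed at all.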
A classic programming exercise is to write a Lisp/Scheme interpreter in Lisp/Scheme. The power of the full language can be leveraged to produce an interpreter for a subset of the language.
Is there a similar exercise for Haskell? I'd like to implement a subset of Haskell using Haskell as the engine. Of course it can be done, but are there any online resources available to look at?
Here's the backstory.
I am exploring the idea of using Haskell as a language to explore some of the concepts in a Discrete Structures course I am teaching. For this semester I have settled on Miranda, a smaller language that inspired Haskell. Miranda does about 90% of what I'd like it to do, but Haskell does about 2000%. :)
So my idea is to create a language that has exactly the features of Haskell that I'd like and disallows everything else. As the students progress, I can selectively "turn on" various features once they've mastered the basics.
Pedagogical "language levels" have been used successfully to teach Java and Scheme. By limiting what they can do, you can prevent them from shooting themselves in the foot while they are still mastering the syntax and concepts you are trying to teach. And you can offer better error messages.
I love your goal, but it's a big job. A couple of hints:
I've worked on GHC, and you don't want any part of the sources. Hugs is a much simpler, cleaner implementation but unfortunately it's in C.
It's a small piece of the puzzle, but Mark Jones wrote a beautiful paper called Typing Haskell in Haskell which would be a great starting point for your front end.
Good luck! Identifying language levels for Haskell, with supporting evidence from the classroom, would be of great benefit to the community and definitely a publishable result!
There is a complete Haskell parser: http://hackage.haskell.org/package/haskell-src-exts
Once you've parsed it, stripping out or disallowing certain things is easy. I did this for tryhaskell.org to disallow import statements, to support top-level definitions, etc.
Just parse the module:
parseModule :: String -> ParseResult Module
Then you have an AST for a module:
Module SrcLoc ModuleName [ModulePragma] (Maybe WarningText) (Maybe [ExportSpec]) [ImportDecl] [Decl]
The Decl type is extensive: http://hackage.haskell.org/packages/archive/haskell-src-exts/1.9.0/doc/html/Language-Haskell-Exts-Syntax.html#t%3ADecl
All you need to do is define a whitelist of which declarations, imports, symbols, and syntax are available, then walk the AST and throw a "parse error" on anything you don't want them to be aware of yet. You can use the SrcLoc value attached to every node in the AST:
data SrcLoc = SrcLoc
{ srcFilename :: String
, srcLine :: Int
, srcColumn :: Int
}
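The whitelist walk could look roughly like the sketch below. To stay self-contained it uses a toy Decl type rather than the real one from haskell-src-exts, so Feature, Decl, featureOf and checkLevel are all hypothetical stand-ins; only the SrcLoc record mirrors the definition above.

```haskell
data SrcLoc = SrcLoc
  { srcFilename :: String
  , srcLine     :: Int
  , srcColumn   :: Int
  }

-- Language features a course level may switch on (toy list).
data Feature = Functions | TypeSignatures | Imports | Classes | Instances
  deriving (Eq, Show)

-- Toy declarations; the real Decl type has many more constructors.
data Decl
  = FunBind    SrcLoc String
  | TypeSig    SrcLoc String
  | ImportDecl SrcLoc String
  | ClassDecl  SrcLoc String
  | InstDecl   SrcLoc String

-- Classify each declaration and recover its source position.
featureOf :: Decl -> (Feature, SrcLoc)
featureOf (FunBind loc _)    = (Functions, loc)
featureOf (TypeSig loc _)    = (TypeSignatures, loc)
featureOf (ImportDecl loc _) = (Imports, loc)
featureOf (ClassDecl loc _)  = (Classes, loc)
featureOf (InstDecl loc _)   = (Instances, loc)

-- Report every declaration whose feature is not on the whitelist.
checkLevel :: [Feature] -> [Decl] -> [String]
checkLevel allowed decls =
  [ srcFilename loc ++ ":" ++ show (srcLine loc) ++ ":" ++ show (srcColumn loc)
      ++ ": " ++ show feat ++ " are not enabled at this level"
  | decl <- decls
  , let (feat, loc) = featureOf decl
  , feat `notElem` allowed
  ]
```

An empty result means the module can be handed to the real compiler unchanged; a real version would pattern match on the parser's own Decl constructors instead.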
There's no need to re-implement Haskell. If you want to provide more friendly compile errors, just parse the code, filter it, send it to the compiler, and parse the compiler output. If it's a "couldn't match expected type a against inferred a -> b" then you know it's probably too few arguments to a function.
Unless you really really want to spend time implementing Haskell from scratch or messing with the internals of Hugs, or some dumb implementation, I think you should just filter what gets passed to GHC. That way, if your students want to take their code-base and take it to the next step and write some real fully fledged Haskell code, the transition is transparent.
Do you want to build your interpreter from scratch? Begin by implementing an easier functional language like the lambda calculus or a Lisp variant. For the latter there is quite a nice wikibook called Write yourself a Scheme in 48 hours, which gives a cool and pragmatic introduction to parsing and interpretation techniques.
Interpreting Haskell by hand will be much more complex since you'll have to deal with highly complex features like typeclasses, an extremely powerful type system (type-inference!) and lazy-evaluation (reduction techniques).
So you should define a quite little subset of Haskell to work with and then maybe start by extending the Scheme-example step by step.
Addition:
Note that in Haskell, you have full access to the interpreter's API (at least under GHC), including parsers, compilers and of course interpreters.
The package to use is hint (Language.Haskell.*). Unfortunately I have neither found online tutorials on this nor tried it out myself, but it looks quite promising.
create a language that has exactly the features of Haskell that I'd like and disallows everything else. As the students progress, I can selectively "turn on" various features once they've mastered the basics.
I suggest a simpler (as in less work involved) solution to this problem. Instead of creating a Haskell implementation where you can turn features off, wrap a Haskell compiler with a program that first checks that the code doesn't use any feature you disallow, and then uses the ready-made compiler to compile it.
That would be similar to HLint (and also kind of its opposite):
HLint (formerly Dr. Haskell) reads Haskell programs and suggests changes that hopefully make them easier to read. HLint also makes it easy to disable unwanted suggestions, and to add your own custom suggestions.
Implement your own HLint "suggestions" to not use the features you don't allow
Disable all the standard HLint suggestions.
Make your wrapper run your modified HLint as a first step
Treat HLint suggestions as errors. That is, if HLint "complained" then the program doesn't proceed to the compilation stage.
Baskell is a teaching implementation, http://hackage.haskell.org/package/baskell
You might start by picking just, say, the type system to implement. That's about as complicated as an interpreter for Scheme, http://hackage.haskell.org/package/thih
The EHC series of compilers is probably the best bet: it's actively developed and seems to be exactly what you want - a series of small lambda calculi compilers/interpreters culminating in Haskell '98.
But you could also look at the various languages developed in Pierce's Types and Programming Languages, or the Helium interpreter (a crippled Haskell intended for students http://en.wikipedia.org/wiki/Helium_(Haskell)).
If you're looking for a subset of Haskell that's easy to implement, you can do away with type classes and type checking. Without type classes, you don't need type inference to evaluate Haskell code.
I wrote a self-compiling Haskell subset compiler for a Code Golf challenge. It takes Haskell subset code on input and produces C code on output. I'm sorry there isn't a more readable version available; I lifted nested definitions by hand in the process of making it self-compiling.
For a student interested in implementing an interpreter for a subset of Haskell, I would recommend starting with the following features:
Lazy evaluation. If the interpreter is in Haskell, you might not have to do anything for this.
Function definitions with pattern-matched arguments and guards. Only worry about variable, cons, nil, and _ patterns.
Simple expression syntax:
Integer literals
Character literals
[] (nil)
Function application (left associative)
Infix : (cons, right associative)
Parentheses
Variable names
Function names
More concretely, write an interpreter that can run this:
-- tail :: [a] -> [a]
tail (_:xs) = xs
-- append :: [a] -> [a] -> [a]
append [] ys = ys
append (x:xs) ys = x : append xs ys
-- zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
zipWith f (a:as) (b:bs) = f a b : zipWith f as bs
zipWith _ _ _ = []
-- showList :: (a -> String) -> [a] -> String
showList _ [] = '[' : ']' : []
showList show (x:xs) = '[' : append (show x) (showItems show xs)
-- showItems :: (a -> String) -> [a] -> String
showItems show [] = ']' : []
showItems show (x:xs) = ',' : append (show x) (showItems show xs)
-- fibs :: [Int]
fibs = 0 : 1 : zipWith add fibs (tail fibs)
-- main :: String
main = showList showInt (take 40 fibs)
Type checking is a crucial feature of Haskell. However, going from nothing to a type-checking Haskell compiler is very difficult. If you start by writing an interpreter for the above, adding type checking to it should be less daunting.
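As a starting point, the evaluator for such a subset might be sketched as follows, with no parser and no type checker. Everything here (Pat, Expr, Value, match, eval, defun, and the tiny program table) is a made-up illustration rather than the code-golf compiler mentioned above. Laziness is inherited from the host language, as the feature list suggests.

```haskell
import qualified Data.Map as M
import Data.Maybe (fromMaybe)

data Pat   = PVar String | PWild | PNil | PCons Pat Pat
data Expr  = Lit Integer | Nil | Var String | App Expr Expr | Cons Expr Expr
data Value = VInt Integer | VNil | VCons Value Value | VFun (Value -> Value)

type Env    = M.Map String Value
type Clause = ([Pat], Expr)   -- argument patterns and a body

-- Match one pattern, returning the variables it binds.
match :: Pat -> Value -> Maybe Env
match (PVar x)    v           = Just (M.singleton x v)
match PWild       _           = Just M.empty
match PNil        VNil        = Just M.empty
match (PCons p q) (VCons a b) = M.union <$> match p a <*> match q b
match _           _           = Nothing

eval :: Env -> Expr -> Value
eval _   (Lit n)    = VInt n
eval _   Nil        = VNil
eval env (Var x)    = fromMaybe (error ("unbound: " ++ x)) (M.lookup x env)
eval env (Cons a b) = VCons (eval env a) (eval env b)   -- fields stay lazy
eval env (App f a)  = case eval env f of
  VFun g -> g (eval env a)
  _      -> error "not a function"

-- Turn the clauses of one definition into a curried function value.
defun :: Env -> [Clause] -> Value
defun env clauses = go (length (fst (head clauses))) []
  where
    go 0 args = tryClauses (reverse args) clauses
    go n args = VFun (\v -> go (n - 1 :: Int) (v : args))
    tryClauses _ [] = error "pattern match failure"
    tryClauses args ((ps, body) : rest) =
      case M.unions <$> mapM (uncurry match) (zip ps args) of
        Just binds -> eval (M.union binds env) body
        Nothing    -> tryClauses args rest

-- Two definitions from the subset, written directly as syntax trees.
program :: [(String, [Clause])]
program =
  [ ("append", [ ([PNil, PVar "ys"], Var "ys")
               , ([PCons (PVar "x") (PVar "xs"), PVar "ys"],
                  Cons (Var "x") (App (App (Var "append") (Var "xs")) (Var "ys"))) ])
  , ("ones",   [ ([], Cons (Lit 1) (Var "ones")) ])
  ]

-- Global environment, tied recursively so definitions can see each other.
globals :: Env
globals = M.fromList [ (name, defun globals cls) | (name, cls) <- program ]

-- Read back a list of integers from an interpreted value.
toList :: Value -> [Integer]
toList (VCons (VInt n) rest) = n : toList rest
toList _                     = []
```

Guards, character literals, and primitives such as add are left out; they would slot into Expr and the global environment in the same way.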
You might look at Happy (a yacc-like parser in Haskell) which has a Haskell parser.
This might be a good idea - make a tiny version of NetLogo in Haskell. Here is the tiny interpreter.
See if Helium would make a better base to build upon than standard Haskell.
Uhc/Ehc is a series of compilers enabling/disabling various Haskell features.
http://www.cs.uu.nl/wiki/Ehc/WebHome#What_is_UHC_And_EHC
I've been told that Idris has a fairly compact parser, not sure if it's really suitable for alteration, but it's written in Haskell.
Andrej Bauer's Programming Language Zoo has a small implementation of a purely functional programming language somewhat cheekily named "minihaskell". It is about 700 lines of OCaml, so very easy to digest.
The site also contains toy versions of ML-style, Prolog-style and OO programming languages.
Don't you think it would be easier to take the GHC sources and strip out what you don't want, than it would be to write your own Haskell interpreter from scratch? Generally speaking, there should be a lot less effort involved in removing features as opposed to creating/adding features.
GHC is written in Haskell anyway, so technically that stays with your question of a Haskell interpreter written in Haskell.
It probably wouldn't be too hard to make the whole thing statically linked and then only distribute your customized GHCi, so that the students can't load other Haskell source modules. As to how much work it would take to prevent them from loading other Haskell object files, I have no idea. You might want to disable FFI too, if you have a bunch of cheaters in your classes :)
The reason why there are so many LISP interpreters is that LISP is basically a predecessor of JSON: a simple format to encode data. This makes the frontend part quite easy to handle. Compared to that, Haskell, especially with Language Extensions, is not the easiest language to parse.
These are some syntactical constructs that sound tricky to get right:
operators with configurable precedence, associativity, and fixity
nested comments
the layout rule
pattern syntax
do-blocks and desugaring to monadic code
Each of these, except maybe the operators, could be tackled by students after their Compiler Construction Course, but it would take the focus away from how Haskell actually works. In addition to that, you might not want to implement all syntactical constructs of Haskell directly, but instead implement passes to get rid of them. Which brings us to the literal core of the issue, pun fully intended.
My suggestion is to implement typechecking and an interpreter for Core instead of full Haskell. Both of these tasks are quite intricate by themselves already.
This language, while still a strongly typed functional language, is way less complicated to deal with in terms of optimization and code generation.
However, it is still independent from the underlying machine.
Therefore, GHC uses it as an intermediate language and translates most syntactic constructs of Haskell into it.
Additionally, you should not shy away from using GHC's (or another compiler's) frontend.
I'd not consider that as cheating since custom LISPs use the host LISP system's parser (at least during bootstrapping). Cleaning up Core snippets and presenting them to students, along with the original code, should allow you to give an overview of what the frontend does, and why it is preferable to not reimplement it.
Here are a few links to the documentation of Core as used in GHC:
System FC: equality constraints and coercions
GHC/As a library
The Core type
Coming from C++, I find generic programming indispensable. I wonder how people approach that in Haskell?
Say, how do I write a generic swap function in Haskell?
Is there an equivalent concept of partial specialization in Haskell?
In C++, I can partially specialize the generic swap function with a special one for a generic map/hash_map container that has a special swap method for O(1) container swap. How do you do that in Haskell or what's the canonical example of generic programming in Haskell?
This is closely related to your other question about Haskell and quicksort. I think you probably need to read at least the introduction of a book about Haskell. It sounds as if you haven't yet grasped the key point about it which is that it bans you from modifying the values of existing variables.
Swap (as understood and used in C++) is, by its very nature, all about modifying existing values. It's so we can use a name to refer to a container, and replace that container with completely different contents, and specialize that operation to be fast (and exception-free) for specific containers, allowing us to implement a modify-and-publish approach (crucial for writing exception-safe code or attempting to write lock-free code).
You can write a generic swap in Haskell, but it would probably take a pair of values and return a new pair containing the same values with their positions reversed, or something like that. Not really the same thing, and not having the same uses. It wouldn't make any sense to try and specialise it for a map by digging inside that map and swapping its individual member variables, because you're just not allowed to do things like that in Haskell (you can do the specialization, but not the modifying of variables).
Suppose we wanted to "measure" a list in Haskell:
measure :: [a] -> Integer
That's a type declaration. It means that the function measure takes a list of anything (a is a generic type parameter because it starts with a lowercase letter) and returns an Integer. So this works for a list of any element type - it's what would be called a function template in C++, or a polymorphic function in Haskell (not the same as a polymorphic class in C++).
We can now define that by providing specializations for each interesting case:
measure [] = 0
i.e. measure the empty list and you get zero.
Here's a very general definition that covers all other cases:
measure (h:r) = 1 + measure r
The bit in parentheses on the LHS is a pattern. It means: take a list, break off the head and call it h, call the remaining part r. Those names are then parameters we can use. This will match any list with at least one item on it.
If you've tried template metaprogramming in C++ this will all be old hat to you, because it involves exactly the same style - recursion to do loops, specialization to make the recursion terminate. Except that in Haskell it works at runtime (specialization of the function for particular values or patterns of values).
As Earwicker says, the example is not as meaningful in Haskell. If you absolutely want to have it anyway, here is something similar (swapping the two parts of a pair), copied and pasted from an interactive session:
GHCi, version 6.8.2: http://www.haskell.org/ghc/ :? for help
Loading package base ... linking ... done.
Prelude> let swap (a,b) = (b,a)
Prelude> swap("hello", "world")
("world","hello")
Prelude> swap(1,2)
(2,1)
Prelude> swap("hello",2)
(2,"hello")
In Haskell, functions are as generic (polymorphic) as possible - the compiler will infer the "Most general type". For example, TheMarko's example swap is polymorphic by default in the absence of a type signature:
*Main> let swap (a,b) = (b,a)
*Main> :t swap
swap :: (t, t1) -> (t1, t)
As for partial specialization, GHC has a non-Haskell-98 extension, the SPECIALIZE pragma:
file:///C:/ghc/ghc-6.10.1/doc/users_guide/pragmas.html#specialize-pragma
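For illustration, a minimal use of the pragma might look like this; sumSquares is a made-up function. Note that, unlike a C++ specialization, the pragma cannot supply a different implementation: it only asks GHC to compile a dedicated copy of the same code for the given type.

```haskell
-- Polymorphic over any numeric element type.
sumSquares :: Num a => [a] -> a
sumSquares = sum . map (\x -> x * x)

-- Ask GHC to also generate a copy specialised to Int, which avoids
-- passing the Num dictionary at run time for that type.
{-# SPECIALIZE sumSquares :: [Int] -> Int #-}
```

Callers are unaffected: they keep calling sumSquares, and the compiler picks the specialised copy where the types match.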
Also, note that there's a mismatch in terminology. What's called generic in C++, Java, and C# is called polymorphic in Haskell. "Generic" in Haskell usually means polytypic:
http://haskell.readscheme.org/generic.html
But above I use the C++ meaning of generic.
In Haskell you would create type classes. Type classes are not like classes in OO languages. Take the Num type class: it says that anything that is an instance of the class supports certain operations (+, -, *). So Integer is an instance of Num, provides implementations of the functions necessary to be considered a Num, and can be used anywhere a Num is expected.
Say you want to be able to foo Ints and Strings. Then you would declare Int and String to be instances of the type class Foo. Now anywhere you see the constraint Foo a you can use Int or String for a.
The reason why you can't add Ints and Floats directly is that (+) has the type Num a => a -> a -> a. Here a is a type variable, and just like a regular variable it can only be bound once, so as soon as you bind it to Int, every a in the signature must be Int.
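A small sketch of the Foo class described above. The method name foo, the helper fooTwice, and addMixed are invented for this example, and the String instance assumes the FlexibleInstances extension, because String is a synonym for [Char].

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- Anything that can be "foo'd" into a description.
class Foo a where
  foo :: a -> String

instance Foo Int where
  foo n = "Int " ++ show n

instance Foo String where
  foo s = "String " ++ s

-- Works for every instance of Foo, present or future.
fooTwice :: Foo a => a -> String
fooTwice x = foo x ++ ", " ++ foo x

-- (+) needs both arguments at the same type, so the Int has to be
-- converted explicitly before it can be added to a Double.
addMixed :: Int -> Double -> Double
addMixed n x = fromIntegral n + x
```

The instance is selected at compile time from the argument type, which is the closest everyday analogue of picking a C++ specialization.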
After reading enough in a Haskell book to really understand Earwicker's answer I'd suggest you also read about type classes. I'm not sure what “partial specialization” means, but it sounds like they could come close.