Unsure of how to design a useful library using combinators

I've been reading about combinators and seen how useful they are (for example, in Haskell's Parsec). My problem is that I'm not quite sure how to use them practically.
Here's an outline of the problem: distributions can be generated, filtered, and modified. Distributions may be combined to create new distributions.
The basic interfaces are (in pseudo-Haskell type terminology):
generator:: parameters -> distribution
selector:: parameters -> (distribution -> distribution)
modifier:: parameters -> (distribution -> distribution)
Now, I think that I see three combinators:
combine:: generator -> generator -> generator
filter:: generator -> selector -> generator
modify:: generator -> modifier -> generator
Are these actually combinators? Do the combinators make sense/are there any other obvious combinators that I'm missing?
Thanks for any advice.

The selector and modifier functions are already perfectly good combinators! Along with generator and combine you can do stuff like (I'm going to assume statistical distributions for concreteness and just make things up!):
modifier (Scale 3.0) $ generator StandardGaussian `combine` selector (LargerThan 10) . modifier (Shift 7) $ generator (Binomial 30 0.2)
You may have to mess around a bit with the priority of the combine operator for this to work smoothly :)
In general, when I'm trying to design a combinator library for values of type A, I like to keep my A's "at the end", so that the partially applied combinators (your selector and modifier) can be chained together with . instead of having to flip through hoops.
Here's a nice blog article that can help you design combinators; it influenced a lot of my thinking: Semantic Editor Combinators.
EDIT: I may have misread your question, given the type signature of combine. Maybe I'm missing something, but wouldn't the distributions be the more natural objects your combinator should work on?
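To make the "keep the distribution at the end" idea concrete, here is a minimal sketch. It assumes a discrete distribution represented as a list of weighted outcomes; the Dist type, uniform, normalize and the equal-weight meaning of combine are all made up for illustration and are not taken from the question.

```haskell
-- Hypothetical representation: a discrete distribution as weighted outcomes.
newtype Dist a = Dist { outcomes :: [(a, Double)] }
  deriving Show

-- Generator: parameters -> distribution.
uniform :: [a] -> Dist a
uniform xs = Dist [(x, 1 / fromIntegral (length xs)) | x <- xs]

-- Rescale the weights so they sum to 1 (left unchanged if they sum to 0).
normalize :: Dist a -> Dist a
normalize (Dist ps)
  | total == 0 = Dist ps
  | otherwise  = Dist [(x, p / total) | (x, p) <- ps]
  where
    total = sum (map snd ps)

-- Selector: parameters -> (distribution -> distribution).
selector :: (a -> Bool) -> Dist a -> Dist a
selector keep = normalize . Dist . filter (keep . fst) . outcomes

-- Modifier: parameters -> (distribution -> distribution).
modifier :: (a -> b) -> Dist a -> Dist b
modifier f (Dist ps) = Dist [(f x, p) | (x, p) <- ps]

-- Combine two distributions into an equal-weight mixture (an assumption).
combine :: Dist a -> Dist a -> Dist a
combine (Dist ps) (Dist qs) =
  Dist ([(x, p / 2) | (x, p) <- ps] ++ [(x, q / 2) | (x, q) <- qs])

-- Partially applied selectors and modifiers chain with (.) because the
-- distribution is always the last argument.
example :: Dist Int
example =
  modifier (* 3) (uniform [1, 2])
    `combine` (selector (> 4) . modifier (+ 2) $ uniform [1 .. 6])
```

Because combine works on distributions directly rather than on generators, no extra filter or modify combinator is needed: ordinary function composition does that job.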

Related

how to structure monads in haskell programs which compute on random values?

I have a program that's almost pure mathematical computation. The problem is that some of those computations operate on monte carlo generated values.
It seems like I have two design options:
Either all my computation functions take an additional parameter which contains a pre-generated monte carlo chain. This lets me keep pure functions everywhere, but since there are functions that call other functions, this adds a lot of line noise to the code base.
The other option is to make all the computation functions monadic. This seems unfortunate, since some of the functions aren't even using those random values; they're just calling a function which calls a function which needs the random values.
Is there any guidance regarding the preferred design here? Specifically, the separation of monadic / non-monadic functions in the code where monte carlo values are concerned?

The other option is to make all the computation functions monadic. This seems unfortunate, since some of the functions aren't even using those random values; they're just calling a function which calls a function which needs the random values.
I would suggest following this approach, and I disagree with your assessment that it's "unfortunate." What monads are good at is precisely separating your pure code from your side-effecting code. Your pure functions can just have pure types, and the Functor/Applicative/Monad methods serve to "hook them up" with the random generation parts. Meditate on the signatures of the standard operations (here specialized to some idealized Random monad type):
-- Apply a pure function to a randomly selected value.
fmap :: (a -> b) -> Random a -> Random b
-- Apply a randomly selected function to a randomly selected argument.
-- The two random choices are independent.
(<*>) :: Random (a -> b) -> Random a -> Random b
-- Apply a two-argument function to two randomly selected arguments.
-- The two random choices are independent.
liftA2 :: (a -> b -> c) -> Random a -> Random b -> Random c
-- Make a `Random b` choice whose distribution depends on the value
-- sampled from the `Random a`.
(>>=) :: Random a -> (a -> Random b) -> Random b
So the reformulated version of your approach is:
Write pure functions wherever you can.
Adapt these pure functions to work on the random values by using the Functor/Applicative/Monad class operations.
Wherever you spot a function that's mentioning the Random type superfluously, figure out how to factor the Random part out using those classes' operations (or the copious utility functions that exist for them).
This is not specific to random number generation, by the way, but applies to any monad.
You might enjoy reading this article, and might want to check out the author's random generation monad library:
"Encoding Statistical Independence, Statically"
https://hackage.haskell.org/package/mwc-probability
I doubt you need to follow the article's approach of using free monads for modeling, but the conceptual bits about probability distribution monads will likely be of some help.
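As a rough illustration of the three steps above, here is a self-contained sketch. The Random type below is a hypothetical toy (a function threading an Int seed through a small linear congruential generator), not the type from mwc-probability or any other library, and below, die, twoDice and dependent are names invented for this example.

```haskell
import Control.Applicative (liftA2)

-- Hypothetical toy monad: a computation that threads an Int seed.
newtype Random a = Random { runRandom :: Int -> (a, Int) }

instance Functor Random where
  fmap f (Random g) = Random $ \s -> let (a, s') = g s in (f a, s')

instance Applicative Random where
  pure a = Random $ \s -> (a, s)
  Random f <*> Random g = Random $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad Random where
  Random g >>= k = Random $ \s -> let (a, s') = g s in runRandom (k a) s'

-- The only primitive that touches the seed: a value in [0, n).
-- The constants are a toy LCG, not a good generator.
below :: Int -> Random Int
below n = Random $ \s ->
  let s' = (s * 1103515245 + 12345) `mod` 2147483648
  in ((s' `div` 65536) `mod` n, s')

-- A pure function adapted with fmap: a die roll in [1, 6].
die :: Random Int
die = fmap (+ 1) (below 6)

-- A pure two-argument function adapted with liftA2.
twoDice :: Random Int
twoDice = liftA2 (+) die die

-- Only this one needs (>>=): the second choice depends on the first.
dependent :: Random Int
dependent = die >>= \n -> below n
```

Note that (+ 1) and (+) never mention Random; only below and the final wiring do.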

tl;dr:
Consider abstracting the random generator and passing it as an argument. Haskell's type classes should help you hide that abstraction as much as possible.
Unfortunately, there is no silver bullet here. Since you are using side effects, your "functions" simply aren't functions in the proper sense. Haskell does not allow you to hide that fact (which makes up the largest part of its safety guarantees). So in some way you will need to express this fact. You also seem to confuse monadic operations with (plain) functions: A function that (indirectly) uses random values is implicitly monadic. A non-monadic function can always be used inside a monadic operation. So you should probably implement all truly non-monadic functions as such and see how far that carries.
As a completely unrelated side note: If laziness is not a requirement and Haskell's strong safety is too much of a burden for you, but you still want to write (mostly) functional code, you could give OCaml a try (or any other ML dialect for that matter).

Sequence of Haskell's Random Generators

I'd like to write the following Haskell function that will provide me with a list of unique random generators:
randomGenerators :: RandomGen g => g -> [g]
Is the following a reasonable solution that won't create a situation where the "same" sequences are repeated?
randomGenerators g = iterate (fst . split) g
I am obviously throwing away half of all the generators but will this be a problem?

This will work, providing that split is implemented correctly (that is, if it produces uncorrelated generators). The System.Random one is believed to be robust (although its implementation of split contains a comment -- no statistical foundation for this!, so use it at your own risk and test for correlations).
Alternatively, you can use an RNG specifically designed to be used in parallel batches. For example, I have a package Random123 which implements counter-based generators (not very well optimized for performance right now, but may suit your purposes). There may also be bindings for the DCMT library out there, or you can write your own.
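If throwing away half of the generators is a concern, one common alternative is to emit one half of each split and keep splitting the other half. The sketch below is written against an arbitrary splitting function so that it needs nothing beyond base; generatorsBy and toySplit are names invented here, and toySplit is only a stand-in to show the shape of the result. With System.Random you would presumably pass its split, and whether the resulting generators are uncorrelated still depends entirely on that split.

```haskell
import Data.List (unfoldr)

-- Turn any splitting function into an infinite list of generators:
-- each step emits the left half and keeps splitting the right half.
generatorsBy :: (g -> (g, g)) -> g -> [g]
generatorsBy splitGen = unfoldr (Just . splitGen)

-- Hypothetical stand-in for a real split, used only to make the
-- structure visible: it labels the two halves as 2n and 2n + 1.
toySplit :: Integer -> (Integer, Integer)
toySplit n = (2 * n, 2 * n + 1)

-- The first few "generators" produced from the label 1.
firstThree :: [Integer]
firstThree = take 3 (generatorsBy toySplit 1)
```

Unlike iterate (fst . split) g, this never returns a generator that is also split again later, so no generator in the list is an ancestor of another.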

Haskell Random Generator ... how to make easier to use?

I am having trouble with Haskell's random generator. At university I have to deal with Java all the time, so now I'm corrupted by it.
I am developing a game in Haskell, and now I face something like 'chance to do something', and that chance needs to be like Int -> Bool. In Java, I would have done
new Random().nextInt(100)
and there, problem solved!
In Haskell I have to choose between something in the IO monad or something with a seed. Neither of these does what I want. I don't really want to use the IO monad in my pure model, and the seed is awkward to use because I need to remember my new seed every time...
Is there something simple like Java's Random?

Believe it or not, you'll have to use different approaches in Haskell than you did in Java. There are a couple packages that can help you, but you will have to get a different attitude in your head to use them successfully. Here are some pointers:
http://hackage.haskell.org/package/MonadRandom to painlessly track seeds
http://hackage.haskell.org/package/mwc-random for high-speed, high-quality randomness
http://hackage.haskell.org/package/comonad-random for another approach to wrapping up seeds, in case you would prefer comonads to monads
Searching for the word "random" on Hackage's package list will turn up many, many more specific packages for more specific needs.

Sorry, but you will have to live with that.
How can there be a function in a pure functional language that gives you different values on each call? The answer is: there cannot be. Such a thing can exist only in the IO monad or something similar, like the State monad, where you pass your seed around (and so don't have the same input every time).
You may also have a look at the question "How can a time function exist in functional programming?", as it goes in the same direction as yours.

I think, "you will have to live with that", is neither useful nor correct. It really depends on the abstractions you are using. If your application is naturally bound to a monad, then it makes sense to use a monadic random number generator, which is just as convenient as Java's random number generator.
In the case of a game using modern abstractions your application is naturally bound to functional reactive programming (FRP), where generating random numbers is no problem at all and doesn't require you to pass around generators explicitly. Example using the netwire library:
movingPoint :: MonadIO m => (Double, Double) -> Wire m a (Double, Double)
movingPoint x0 =
    proc _ -> do
        -- Randomly fades in and out of existence.
        visible <- wackelkontakt -< ()
        require -< (visible, ())
        -- 'rnd' is a random value between -1 and 1.
        rnd <- noise1 -< ()
        -- dx is the velocity.
        let dx = (sin &&& cos) (rnd * pi)
        -- Integration of dx over time gives us the point's position.
        -- x0 is the starting point.
        integral x0 -< dx
Is there any way to express this more easily and more concisely? I guess not. FRP also proves Zhen's comment wrong. It can handle user input purely.

It is somewhat unintuitive that something that neither does input nor output needs to be handled as if it had. Let's say you defined it as follows:
random100 = unsafePerformIO $ randomRIO (1, 100) -- This will not work!
That would indeed give you a random number - in a way. What you truly need is a way to encode that you want a new pseudo-random number every time. This means information needs to go from one random number generation to the next. Most languages just ignore this "minor detail", but Haskell forces you to pay attention. You might thank Haskell when you find yourself in the spot to properly reproduce your pseudo-random result in a multi-threaded context.
There are a number of ways you can make these connections, most of which have been mentioned already. If you are reluctant to use a monad: Note that it might generally be a good thing to have your code in a monadic form (but not using IO!). Down the road, you might well come into situations where you want more monad features, such as a reader for configuration; then all the groundwork would be done already.
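For the "chance to do something" case from the question, the information that has to flow from one random number to the next can be made explicit without any monad. This is only a sketch: Seed, nextInt, chance and chances are invented names, and the constants form a toy linear congruential generator rather than anything suitable for real use.

```haskell
import Data.List (mapAccumL)

-- Hypothetical seed type; the state that must flow between calls.
newtype Seed = Seed Int
  deriving (Eq, Show)

-- Toy LCG step: a value in [0, bound) plus the next seed.
nextInt :: Int -> Seed -> (Int, Seed)
nextInt bound (Seed s) = ((s' `div` 65536) `mod` bound, Seed s')
  where
    s' = (s * 1103515245 + 12345) `mod` 2147483648

-- True with roughly the given percent probability.
chance :: Int -> Seed -> (Bool, Seed)
chance percent seed = (roll < percent, seed')
  where
    (roll, seed') = nextInt 100 seed

-- Several independent-looking chances in a row; mapAccumL does the
-- "remember my new seed every time" bookkeeping.
chances :: [Int] -> Seed -> (Seed, [Bool])
chances percents seed = mapAccumL step seed percents
  where
    step s p = let (hit, s') = chance p s in (s', hit)
```

The same seed always gives the same results, which is what makes a pseudo-random run reproducible; wrapping the Seed -> (a, Seed) shape in a State-like monad only hides the plumbing.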

Write a Haskell interpreter in Haskell

A classic programming exercise is to write a Lisp/Scheme interpreter in Lisp/Scheme. The power of the full language can be leveraged to produce an interpreter for a subset of the language.
Is there a similar exercise for Haskell? I'd like to implement a subset of Haskell using Haskell as the engine. Of course it can be done, but are there any online resources available to look at?
Here's the backstory.
I am exploring the idea of using Haskell as a language to explore some of the concepts in a Discrete Structures course I am teaching. For this semester I have settled on Miranda, a smaller language that inspired Haskell. Miranda does about 90% of what I'd like it to do, but Haskell does about 2000%. :)
So my idea is to create a language that has exactly the features of Haskell that I'd like and disallows everything else. As the students progress, I can selectively "turn on" various features once they've mastered the basics.
Pedagogical "language levels" have been used successfully to teach Java and Scheme. By limiting what they can do, you can prevent them from shooting themselves in the foot while they are still mastering the syntax and concepts you are trying to teach. And you can offer better error messages.

I love your goal, but it's a big job. A couple of hints:
I've worked on GHC, and you don't want any part of the sources. Hugs is a much simpler, cleaner implementation but unfortunately it's in C.
It's a small piece of the puzzle, but Mark Jones wrote a beautiful paper called Typing Haskell in Haskell which would be a great starting point for your front end.
Good luck! Identifying language levels for Haskell, with supporting evidence from the classroom, would be of great benefit to the community and definitely a publishable result!

There is a complete Haskell parser: http://hackage.haskell.org/package/haskell-src-exts
Once you've parsed it, stripping out or disallowing certain things is easy. I did this for tryhaskell.org to disallow import statements, to support top-level definitions, etc.
Just parse the module:
parseModule :: String -> ParseResult Module
Then you have an AST for a module:
Module SrcLoc ModuleName [ModulePragma] (Maybe WarningText) (Maybe [ExportSpec]) [ImportDecl] [Decl]
The Decl type is extensive: http://hackage.haskell.org/packages/archive/haskell-src-exts/1.9.0/doc/html/Language-Haskell-Exts-Syntax.html#t%3ADecl
All you need to do is define a white-list -- of what declarations, imports, symbols, syntax is available, then walk the AST and throw a "parse error" on anything you don't want them to be aware of yet. You can use the SrcLoc value attached to every node in the AST:
data SrcLoc = SrcLoc
    { srcFilename :: String
    , srcLine :: Int
    , srcColumn :: Int
    }
There's no need to re-implement Haskell. If you want to provide more friendly compile errors, just parse the code, filter it, send it to the compiler, and parse the compiler output. If it's a "couldn't match expected type a against inferred a -> b" then you know it's probably too few arguments to a function.
Unless you really really want to spend time implementing Haskell from scratch or messing with the internals of Hugs, or some dumb implementation, I think you should just filter what gets passed to GHC. That way, if your students want to take their code-base and take it to the next step and write some real fully fledged Haskell code, the transition is transparent.
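The white-list walk can be sketched as follows. To stay self-contained, this uses a hypothetical, heavily simplified declaration type instead of the real haskell-src-exts AST; Loc, Decl, Feature, classify and checkLevel are all invented for illustration, and a real implementation would pattern match on the library's own Decl constructors and SrcLoc values.

```haskell
-- Hypothetical stand-in for a source location.
data Loc = Loc { locLine :: Int, locColumn :: Int }
  deriving (Eq, Show)

-- Hypothetical stand-in for a parsed top-level declaration.
data Decl
  = FunBind Loc String
  | TypeSig Loc String
  | ClassDecl Loc String
  | InstDecl Loc String
  deriving (Eq, Show)

-- The features a language level can switch on.
data Feature = Functions | Signatures | Classes | Instances
  deriving (Eq, Show)

-- Which feature a declaration uses, where it is, and what it is called.
classify :: Decl -> (Feature, Loc, String)
classify (FunBind l n)   = (Functions, l, n)
classify (TypeSig l n)   = (Signatures, l, n)
classify (ClassDecl l n) = (Classes, l, n)
classify (InstDecl l n)  = (Instances, l, n)

-- Accept the module only if every declaration uses an allowed feature;
-- otherwise report the first offender as if it were a parse error.
checkLevel :: [Feature] -> [Decl] -> Either String [Decl]
checkLevel allowed decls =
  case [c | c@(feature, _, _) <- map classify decls, feature `notElem` allowed] of
    [] -> Right decls
    (feature, Loc line col, name) : _ ->
      Left (show line ++ ":" ++ show col ++ ": parse error: "
            ++ show feature ++ " are not enabled yet (" ++ name ++ ")")
```

Turning on a feature for a later part of the course is then just a matter of adding it to the allowed list.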

Do you want to build your interpreter from scratch? Begin with implementing an easier functional language like the lambda calculus or a lisp variant. For the latter there is a quite nice wikibook called Write yourself a Scheme in 48 hours giving a cool and pragmatic introduction into parsing and interpretation techniques.
Interpreting Haskell by hand will be much more complex since you'll have to deal with highly complex features like typeclasses, an extremely powerful type system (type-inference!) and lazy-evaluation (reduction techniques).
So you should define a quite little subset of Haskell to work with and then maybe start by extending the Scheme-example step by step.
Addition:
Note that in Haskell, you have full access to the interpreter's API (at least under GHC), including parsers, compilers and of course interpreters.
The package to use is hint (Language.Haskell.*). Unfortunately, I have neither found online tutorials for it nor tried it out myself, but it looks quite promising.

create a language that has exactly the features of Haskell that I'd like and disallows everything else. As the students progress, I can selectively "turn on" various features once they've mastered the basics.
I suggest a simpler (as in less work involved) solution to this problem. Instead of creating a Haskell implementation where you can turn features off, wrap a Haskell compiler with a program that first checks that the code doesn't use any feature you disallow, and then uses the ready-made compiler to compile it.
That would be similar to HLint (and also kind of its opposite):
HLint (formerly Dr. Haskell) reads Haskell programs and suggests changes that hopefully make them easier to read. HLint also makes it easy to disable unwanted suggestions, and to add your own custom suggestions.
Implement your own HLint "suggestions" to not use the features you don't allow
Disable all the standard HLint suggestions.
Make your wrapper run your modified HLint as a first step
Treat HLint suggestions as errors. That is, if HLint "complained" then the program doesn't proceed to compilation stage

Baskell is a teaching implementation, http://hackage.haskell.org/package/baskell
You might start by picking just, say, the type system to implement. That's about as complicated as an interpreter for Scheme, http://hackage.haskell.org/package/thih

The EHC series of compilers is probably the best bet: it's actively developed and seems to be exactly what you want - a series of small lambda calculi compilers/interpreters culminating in Haskell '98.
But you could also look at the various languages developed in Pierce's Types and Programming Languages, or the Helium interpreter (a crippled Haskell intended for students http://en.wikipedia.org/wiki/Helium_(Haskell)).

If you're looking for a subset of Haskell that's easy to implement, you can do away with type classes and type checking. Without type classes, you don't need type inference to evaluate Haskell code.
I wrote a self-compiling Haskell subset compiler for a Code Golf challenge. It takes Haskell subset code on input and produces C code on output. I'm sorry there isn't a more readable version available; I lifted nested definitions by hand in the process of making it self-compiling.
For a student interested in implementing an interpreter for a subset of Haskell, I would recommend starting with the following features:
Lazy evaluation. If the interpreter is in Haskell, you might not have to do anything for this.
Function definitions with pattern-matched arguments and guards. Only worry about variable, cons, nil, and _ patterns.
Simple expression syntax:
Integer literals
Character literals
[] (nil)
Function application (left associative)
Infix : (cons, right associative)
Parenthesis
Variable names
Function names
More concretely, write an interpreter that can run this:
-- tail :: [a] -> [a]
tail (_:xs) = xs
-- append :: [a] -> [a] -> [a]
append [] ys = ys
append (x:xs) ys = x : append xs ys
-- zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
zipWith f (a:as) (b:bs) = f a b : zipWith f as bs
zipWith _ _ _ = []
-- showList :: (a -> String) -> [a] -> String
showList _ [] = '[' : ']' : []
showList show (x:xs) = '[' : append (show x) (showItems show xs)
-- showItems :: (a -> String) -> [a] -> String
showItems show [] = ']' : []
showItems show (x:xs) = ',' : append (show x) (showItems show xs)
-- fibs :: [Int]
fibs = 0 : 1 : zipWith add fibs (tail fibs)
-- main :: String
main = showList showInt (take 40 fibs)
Type checking is a crucial feature of Haskell. However, going from nothing to a type-checking Haskell compiler is very difficult. If you start by writing an interpreter for the above, adding type checking to it should be less daunting.
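As a starting point for the evaluation half only (no parser, no guards, no characters), here is a minimal sketch of an untyped evaluator for a few of the listed constructs. All names (Expr, Value, eval, load, takeInts) are invented for this example, and Add stands in for a built-in add primitive. Lazy evaluation and recursive top-level definitions come for free from the host language, as suggested above.

```haskell
-- Abstract syntax for a tiny fragment of the subset.
data Expr
  = Lit Integer
  | Var String
  | Lam String Expr
  | App Expr Expr
  | Nil
  | Cons Expr Expr
  | CaseList Expr Expr (String, String, Expr) -- scrutinee, [] branch, (h:t) branch
  | Add Expr Expr

-- Runtime values; the fields are lazy, so lists may be infinite.
data Value
  = VInt Integer
  | VNil
  | VCons Value Value
  | VFun (Value -> Value)

type Env = [(String, Value)]

eval :: Env -> Expr -> Value
eval _   (Lit n)   = VInt n
eval env (Var x)   = maybe (error ("unbound variable: " ++ x)) id (lookup x env)
eval env (Lam x b) = VFun (\v -> eval ((x, v) : env) b)
eval env (App f a) = case eval env f of
  VFun g -> g (eval env a)
  _      -> error "applying a non-function"
eval _   Nil        = VNil
eval env (Cons h t) = VCons (eval env h) (eval env t)
eval env (CaseList s onNil (h, t, onCons)) = case eval env s of
  VNil        -> eval env onNil
  VCons hv tv -> eval ((h, hv) : (t, tv) : env) onCons
  _           -> error "case on a non-list"
eval env (Add a b) = case (eval env a, eval env b) of
  (VInt x, VInt y) -> VInt (x + y)
  _                -> error "adding non-integers"

-- Top-level definitions may refer to each other (and to themselves):
-- the environment is defined in terms of itself and laziness ties the knot.
load :: [(String, Expr)] -> Env
load defs = env
  where
    env = [(name, eval env body) | (name, body) <- defs]

-- from n = n : from (n + 1); tail (_:rest) = rest; nats = from 0
-- (unlike the real tail, this one returns [] for an empty list).
program :: [(String, Expr)]
program =
  [ ("from", Lam "n" (Cons (Var "n") (App (Var "from") (Add (Var "n") (Lit 1)))))
  , ("tail", Lam "xs" (CaseList (Var "xs") Nil ("_", "rest", Var "rest")))
  , ("nats", App (Var "from") (Lit 0))
  ]

-- Observe a finite prefix of a (possibly infinite) list of integers.
takeInts :: Int -> Value -> [Integer]
takeInts k (VCons (VInt n) rest) | k > 0 = n : takeInts (k - 1) rest
takeInts _ _ = []
```

Multi-clause definitions with patterns, as in the program above, would be desugared into Lam and CaseList by a front end before reaching eval.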

You might look at Happy (a yacc-like parser in Haskell) which has a Haskell parser.

This might be a good idea - make a tiny version of NetLogo in Haskell. Here is the tiny interpreter.

See if Helium would make a better base to build upon than standard Haskell.

Uhc/Ehc is a series of compilers enabling/disabling various Haskell features.
http://www.cs.uu.nl/wiki/Ehc/WebHome#What_is_UHC_And_EHC

I've been told that Idris has a fairly compact parser, not sure if it's really suitable for alteration, but it's written in Haskell.

Andrej Bauer's Programming Language Zoo has a small implementation of a purely functional programming language somewhat cheekily named "minihaskell". It is about 700 lines of OCaml, so very easy to digest.
The site also contains toy versions of ML-style, Prolog-style and OO programming languages.

Don't you think it would be easier to take the GHC sources and strip out what you don't want, than it would be to write your own Haskell interpreter from scratch? Generally speaking, there should be a lot less effort involved in removing features as opposed to creating/adding features.
GHC is written in Haskell anyway, so technically that stays with your question of a Haskell interpreter written in Haskell.
It probably wouldn't be too hard to make the whole thing statically linked and then only distribute your customized GHCi, so that the students can't load other Haskell source modules. As to how much work it would take to prevent them from loading other Haskell object files, I have no idea. You might want to disable FFI too, if you have a bunch of cheaters in your classes :)

The reason why there are so many LISP interpreters is that LISP is basically a predecessor of JSON: a simple format to encode data. This makes the frontend part quite easy to handle. Compared to that, Haskell, especially with Language Extensions, is not the easiest language to parse.
These are some syntactical constructs that sound tricky to get right:
operators with configurable precedence, associativity, and fixity,
nested comments
layout rule
pattern syntax
do- blocks and desugaring to monadic code
Each of these, except maybe the operators, could be tackled by students after their Compiler Construction Course, but it would take the focus away from how Haskell actually works. In addition to that, you might not want to implement all syntactical constructs of Haskell directly, but instead implement passes to get rid of them. Which brings us to the literal core of the issue, pun fully intended.
My suggestion is to implement typechecking and an interpreter for Core instead of full Haskell. Both of these tasks are quite intricate by themselves already.
This language, while still a strongly typed functional language, is way less complicated to deal with in terms of optimization and code generation.
However, it is still independent from the underlying machine.
Therefore, GHC uses it as an intermediate language and translates most syntactic constructs of Haskell into it.
Additionally, you should not shy away from using GHC's (or another compiler's) frontend.
I'd not consider that as cheating since custom LISPs use the host LISP system's parser (at least during bootstrapping). Cleaning up Core snippets and presenting them to students, along with the original code, should allow you to give an overview of what the frontend does, and why it is preferable to not reimplement it.
Here are a few links to the documentation of Core as used in GHC:
System FC: equality constraints and coercions
GHC/As a library
The Core type

Good Haskell coding standards

Could someone provide a link to a good coding standard for Haskell? I've found this and this, but they are far from comprehensive. Not to mention that the HaskellWiki one includes such "gems" as "use classes with care" and "defining symbolic infix identifiers should be left to library writers only."

Really hard question. I hope your answers turn up something good. Meanwhile, here is a catalog of mistakes or other annoying things that I have found in beginners' code. There is some overlap with the Cal Tech style page that Kornel Kisielewicz points to. Some of my advice is every bit as vague and useless as the HaskellWiki "gems", but I hope at least it is better advice :-)
Format your code so it fits in 80 columns. (Advanced users may prefer 87 or 88; beyond that is pushing it.)
Don't forget that let bindings and where clauses create a mutually recursive nest of definitions, not a sequence of definitions.
Take advantage of where clauses, especially their ability to see function parameters that are already in scope (nice vague advice). If you are really grokking Haskell, your code should have a lot more where-bindings than let-bindings. Too many let-bindings is a sign of an unreconstructed ML programmer or Lisp programmer.
Avoid redundant parentheses. Some places where redundant parentheses are particularly offensive are
Around the condition in an if expression (brands you as an unreconstructed C programmer)
Around a function application which is itself the argument of an infix operator (Function application binds tighter than any infix operator. This fact should be burned into every Haskeller's brain, in much the same way that us dinosaurs had APL's right-to-left scan rule burned in.)
Put spaces around infix operators. Put a space following each comma in a tuple literal.
Prefer a space between a function and its argument, even if the argument is parenthesized.
Use the $ operator judiciously to cut down on parentheses. Be aware of the close relationship between $ and infix .:
f $ g $ h x == (f . g . h) x == f . g . h $ x
Don't overlook the built-in Maybe and Either types.
Never write if <expression> then True else False; the correct phrase is simply <expression>.
Don't use head or tail when you could use pattern matching.
Don't overlook function composition with the infix dot operator.
Use line breaks carefully. Line breaks can increase readability, but there is a tradeoff: Your editor may display only 40–50 lines at once. If you need to read and understand a large function all at once, you mustn't overuse line breaks.
Almost always prefer the -- comments which run to end of line over the {- ... -} comments. The braced comments may be appropriate for large headers—that's it.
Give each top-level function an explicit type signature.
When possible, align -- lines, = signs, and even parentheses and commas that occur in adjacent lines.
Influenced as I am by GHC central, I have a very mild preference to use camelCase for exported identifiers and short_name with underscores for local where-bound or let-bound variables.
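A few of the points above, shown on small made-up functions (the names isAdult, sumPairs, variance and countLongWords are only for illustration; the commented-out lines show the style being advised against):

```haskell
-- Not: isAdult age = if (age >= 18) then True else False
isAdult :: Int -> Bool
isAdult age = age >= 18

-- Pattern matching instead of head and tail; no parentheses are needed
-- around x + y because (+) binds tighter than (:).
sumPairs :: [Int] -> [Int]
sumPairs (x : y : rest) = x + y : sumPairs rest
sumPairs rest           = rest

-- A where clause can see the function's parameters, and its bindings
-- form one mutually recursive nest: mu is defined after n but uses it.
variance :: [Double] -> Double
variance xs = sum [(x - mu) ^ 2 | x <- xs] / n
  where
    n  = fromIntegral (length xs)
    mu = sum xs / n

-- Function composition instead of nested parentheses.
-- Not: countLongWords minLen s = length (filter (\w -> length w >= minLen) (words s))
countLongWords :: Int -> String -> Int
countLongWords minLen = length . filter ((>= minLen) . length) . words
```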

Some good rules of thumb imho:
Consult with HLint to make sure you don't have redundant braces and that your code isn't pointlessly point-full.
Avoid recreating existing library functions. Hoogle can help you find them.
Often times existing library functions are more general than what one was going to make. For example if you want Maybe (Maybe a) -> Maybe a, then join does that among other things.
Argument naming and documentation is important sometimes.
For a function like replicate :: Int -> a -> [a], it's pretty obvious what each of the arguments does, from their types alone.
For a function that takes several arguments of the same type, like isPrefixOf :: (Eq a) => [a] -> [a] -> Bool, naming/documentation of arguments is more important.
If one function exists only to serve another function, and isn't otherwise useful, and/or it's hard to think of a good name for it, then it probably should exist in its caller's where clause instead of in the module's scope.
DRY
Use Template-Haskell when appropriate.
Bundles of functions like zip3, zipWith3, zip4, zipWith4, etc are very meh. Use Applicative style with ZipLists instead. You probably never really need functions like those.
Derive instances automatically. The derive package can help you derive instances for type-classes such as Functor (there is only one correct way to make a type an instance of Functor).
Code that is more general has several benefits:
It's more useful and reusable.
It is less prone to bugs because there are more constraints.
For example, suppose you want to write concat :: [[a]] -> [a] and notice that it can be generalized to join :: Monad m => m (m a) -> m a. There is less room for error when writing join: when writing concat you can reverse the lists by mistake, whereas in join there are very few things you can do.
When using the same stack of monad transformers in many places in your code, make a type synonym for it. This will make the types shorter, more concise, and easier to modify in bulk.
Beware of "lazy IO". For example, readFile doesn't really read the file's contents at the moment it is called; the contents are read lazily, as they are demanded.
Avoid indenting so much that I can't find the code.
If your type is logically an instance of a type-class, make it an instance.
The instance can replace other interface functions you may have considered with familiar ones.
Note: If there is more than one logical instance, create newtype-wrappers for the instances.
Make the different instances consistent. It would have been very confusing/bad if the list Applicative behaved like ZipList.
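Some of these rules are easier to see in code. The sketch below uses only base; flatten, zipWith4', total and productOf are names made up for the example.

```haskell
import Control.Applicative (ZipList (..))
import Control.Monad (join)
import Data.Monoid (Product (..), Sum (..))

-- The general library function already covers the special case.
flatten :: Maybe (Maybe a) -> Maybe a
flatten = join

-- Applicative style with ZipList scales to any number of lists,
-- so a dedicated zipWithN family is not needed.
zipWith4' :: (a -> b -> c -> d -> e) -> [a] -> [b] -> [c] -> [d] -> [e]
zipWith4' f as bs cs ds =
  getZipList (f <$> ZipList as <*> ZipList bs <*> ZipList cs <*> ZipList ds)

-- Numbers have more than one sensible Monoid instance, so base wraps
-- each one in a newtype (Sum and Product) instead of picking one.
total :: Num a => [a] -> a
total = getSum . foldMap Sum

productOf :: Num a => [a] -> a
productOf = getProduct . foldMap Product
```

ZipList is itself an instance of the newtype-wrapper rule: lists have two lawful Applicative instances, and the zipping one lives behind a wrapper so that the default stays consistent with the list Monad.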

I like to try to organize functions as point-free style compositions as much as possible by doing things like:
func = boo . boppity . bippity . snd
    where boo = ...
          boppity = ...
          bippity = ...
I like using ($) only to avoid nested parens or long parenthesized expressions
... I thought I had a few more in me, oh well

I'd suggest taking a look at this style checker.

I found a good Markdown file covering almost every aspect of Haskell code style. It can be used as a cheat sheet. You can find it here: link
