Haskell for Mac: why use the 'runhaskell' command?

Reading http://blog.haskellformac.com/blog/running-command-line-programs :
This requires installing the Haskell for Mac command line tools as
outlined in a previous article. Those tools include a command named
runhaskell, which runs a Haskell program in "script mode": that is, the
program is interpreted instead of compiled (much like, say, the Python
interpreter runs a Python script).
Why provide a tool to run Haskell in script mode?
As the code is being interpreted, does this mean it will run slower in script mode?

Yes, it will run slower, but depending on the application this may not matter at all. Many interesting tasks don't in fact require much computation, so you wouldn't even notice the runtime difference between, say, Java and Ruby, even though the latter is considered to have much worse performance.
For such quick-run applications, what matters rather more is startup time. With interpreted languages this is often close to immediate, whereas recompiling a script can take considerable time. So interpreting can indeed be faster than compiling, in practice!
Furthermore, just because the script is interpreted doesn't mean every single computation is. In fact, most of the critical stuff is often defined in libraries which are compiled and only called from the interpreted code. This is the main reason why languages like Python or Matlab can be competitive in scientific computing: the computationally intensive routines are actually written in compiled C or Fortran, not in the top-level language itself!
Haskell gives you the advantages of both worlds (fast raw performance of a compiled language; quick usage and conciseness of an interpreted one), but without the need to actually have two different languages – you can simply choose which parts to run compiled and which to merely interpret!
(This is not to say that this is unique to Haskell: there exist in fact interpreters for pretty much all compiled languages. It's just normally not that common to run code interpreted except for debugging. But Haskell turns out to be well suited even for scripting tasks that might normally be written in Python or Bash, but for which nobody would bother to set up an entire Java or C++ project.)
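To make the two workflows concrete, here is a minimal sketch (the file name and contents are illustrative, not from the article). The same source file can either be run directly with runhaskell or compiled to a native binary with GHC, and note that any imported library code runs compiled in both cases:

    -- Hello.hs (illustrative)
    import Data.List (sort)  -- library code runs compiled even in script mode

    main :: IO ()
    main = print (sort [3, 1, 2])

    -- Script mode (interpreted, no binary left behind):
    --   $ runhaskell Hello.hs
    -- Compiled (native binary, faster at runtime):
    --   $ ghc -O2 Hello.hs && ./Hello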

Related

Why is there such a clear cut between interpreted and compiled languages?

When learning a compiled language like C or C++, you get to know the compiler. In order to run your code, you have to compile it first. Compiling your code translates it from a textual representation into something that can be executed. The resulting code is very fast and can make use of preprocessors and the like.
When learning a dynamic language like Python, Matlab, or Ruby, you get to know the interpreter. In order to run your code, you just type it into the interpreter. Thus, you can play with your code at runtime and change the behavior of your program on the fly. The downside of this seems to be that interpreted languages are rather slow, and the lack of a clear compilation time seems to make preprocessors impossible.
Then there are just-in-time compilers which are used like interpreted languages but with less of a performance deficit compared to compiled languages. But they generally do not sport preprocessors and do not output ready-to-run binaries.
And then I learned Lisp, which can be compiled, interpreted and what have you, all the while being both fast and having a powerful preprocessing system (macros). This seems to be common sense in the Lisp world, but not anywhere else.
Why are there no popular interpreters for C or compilers for Python? Why the strong divide between interpreted and compiled languages? (I know some projects exist that can compile Python or interpret C, but in general they seem to not be very popular).
Most popular compiled languages were designed from the ground up to be compiled: they tend to avoid features that would make it difficult to produce efficient compiled code. These language features include the convenient "dynamic" ones such as dynamic typing, nonuniform containers, and ad hoc object namespaces.
So, an interpreter for a compiled language can't take advantage of the dynamic features that are available in an interpreted language, yet it lacks the performance advantages of a compiled implementation.
Conversely, a compiler must duplicate all features and behavior of an interpreted language, regardless of expense. In general, this means that the compiled program for an interpreted language will carry much of the overhead of the interpreter. As one example, any kind of eval() functionality effectively requires inclusion of the interpreter.
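Haskell makes this concrete: getting eval() means embedding the interpreter in your program. A minimal sketch, assuming the third-party hint package (which wraps GHC's own API; the expression being evaluated is illustrative):

    import Language.Haskell.Interpreter

    main :: IO ()
    main = do
      -- Evaluating a string of Haskell at runtime means shipping
      -- (much of) the compiler/interpreter along with the program.
      result <- runInterpreter $ do
        setImports ["Prelude"]
        eval "map (*2) [1,2,3]"
      print result  -- Right "[2,4,6]" on success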
Finally, these effects are amplified by the mutually reinforcing advantages of large user base, good support, and robust implementation.

Will Haskell be a good choice for my task?

I'm starting a new project and don't know which language to use.
My 'must have' requirements are:
Be able to run on Windows/Linux/Mac OS natively (native executable): the user should be able to just run the .exe (when on Windows, for example) and see the results.
No runtimes/interpreters (no JVM, CLR etc.) – one file download should be enough to run the application.
Full Unicode support.
Be able to manipulate OS threads (create them, run multiple tasks in parallel on multi-core CPUs, etc.)
Be reasonably fast (Python-level performance or better).
Have some kind of standard library that does low-level, mundane tasks.
Not be too niche, and have some community behind it, so I can ask questions.
My 'nice to have' requirements are:
Language should be functional.
It should have good string manipulation capabilities (not necessarily regex).
Not extremely hard to learn.
I'm thinking about Haskell now, but keeping in mind OCaml as well.
Update:
This application is intended to be a simple language parsing and manipulation utility.
Please advise whether my choice is correct.
Haskell:
1: It runs on Linux, Windows and OS X, in many cases without changes to source code.
2: Native binaries generated. No VM.
3: Full Unicode support. All UTF variants supported.
4: Full threading support; and if you only want parallelisation, you can use "par" with a 100% guarantee that it only affects the time taken, never the semantics (see the sketch after this list).
5: Can be as fast as C, although some tweaking may be required; the skills involved are currently rather obscure, and seemingly minor tweaks can change performance by orders of magnitude.
6: Standard library included, and "Hackage" has lots more packages including a range of parser libraries.
7: Friendly community on IRC (#haskell) and here.
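On point 4, here is a minimal sketch of "par" in action, assuming the standard parallel package (the function and inputs are illustrative):

    import Control.Parallel (par, pseq)

    fib :: Int -> Int
    fib n | n < 2     = n
          | otherwise = fib (n - 1) + fib (n - 2)

    main :: IO ()
    main =
      let a = fib 32
          b = fib 33
      -- `par` merely hints that `a` may be evaluated on another core;
      -- removing it can change the timing, but never the result.
      in a `par` (b `pseq` print (a + b))

    -- Build with: ghc -O2 -threaded Par.hs
    -- Run with:   ./Par +RTS -N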
Edit: On the "nice to have" points:
1: Haskell is an uncompromisingly pure functional language.
2: It has generally good string manipulation, with regexes if you want them. As someone said in a later comment, beware the efficiency of the built-in "String" type (it represents a string as a linked list of characters), but the ByteString and Text libraries will solve that for you (see the sketch after this list).
3: Is it hard to learn? It's nowhere near as complicated as C++, and probably a lot simpler than Java or maybe even Python. But its purely functional nature means that it is very different from imperative languages. The problem is not so much learning Haskell as unlearning imperative thought patterns.
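On point 2, a small sketch contrasting the default String with the packed Text type, assuming the text package (the literal is illustrative):

    {-# LANGUAGE OverloadedStrings #-}
    import qualified Data.Text as T

    main :: IO ()
    main = do
      let s = "hello world" :: String  -- a linked list of Char
          t = "hello world" :: T.Text  -- a packed array, far cheaper to traverse
      print (length s)
      print (T.toUpper t)  -- "HELLO WORLD"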
Haskell sounds like it fits the bill perfectly. GHC produces native code on OS X, Linux and Windows just fine, and in general has performance that is much better than Python's (for many things, not everything).
The only strange request is the need for OS threads. Programs produced by GHC use lightweight threads, which perform much better than OS threads and are much easier to work with than pthreads.
Haskell is also excellent for language parsing, using libraries like Parsec.
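For example, a minimal Parsec sketch, assuming the parsec package (the grammar here is illustrative): parse a comma-separated list of integers.

    import Text.Parsec
    import Text.Parsec.String (Parser)

    -- "1, 2, 3" ==> [1,2,3]
    integerList :: Parser [Int]
    integerList = (read <$> many1 digit) `sepBy` (char ',' >> spaces)

    main :: IO ()
    main = print (parse integerList "<input>" "1, 2, 3")
    -- Right [1,2,3]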
We're also quite well known for how strong and helpful the community around Haskell is.
On your third 'nice to have': have a look at Real World Haskell; it's free and a very good introduction, covering all the points you need (such as parallel computing, string parsing, etc.).
On the 'nice to have' requirements:
Yes: purely functional, with lazy evaluation.
Yes (as said before).
Depends on you. I think it's hard to learn, but it gives you some great benefits.

Which languages are dynamically typed and compiled (and which are statically typed and interpreted)?

In my reading on dynamic and static typing, I keep coming up against the assumption that statically typed languages are compiled, while dynamically typed languages are interpreted. I know that in general this is true, but I'm interested in the exceptions.
I'd really like someone to not only give some examples of these exceptions, but try to explain why it was decided that these languages should work in this way.
Here's a list of a few interesting systems. It is not exhaustive!
Dynamically typed and compiled
The Gambit Scheme compiler, Chez Scheme, Will Clinger's Larceny Scheme compiler, the Bigloo Scheme compiler, and probably many others.
Why?
Lots of people really like Scheme. Programs as data, good macro system, 35 years of development, big community. But they want performance. Hence, a number of good native-code compilers; Chez Scheme is even a successful commercial product (the interpreted bytecode version is free; for native code you pay).
The LuaJIT just-in-time compiler for Lua.
Why?
To show it could be done. And then, people started to like getting 3x speedup on their Lua programs. Lua is in a lot of games, where performance matters, plus it's creeping into other products too. 70% of the code in Adobe Lightroom is Lua.
The iconc Icon-to-C compiler.
Why?
The fifty people who used it loved Icon. Totally unusual evaluation model, the most innovative (and in my opinion, best) string-processing system ever designed. But that evaluation model was really expensive, especially on late-1980s computers. By compiling Icon to C, the Icon Project made it possible for big Icon programs to run in fewer hours.
Conclusion: people first develop an attachment to a dynamically typed language, and probably a significant code base. Eventually, the community spits out a native-code compiler so that you can get better performance and solve bigger problems.
Statically Typed and Interpreted
This category is less common, but...
Objective Caml. Dialect of ML, vehicle for lots of innovative experiments in language design.
Why?
Very portable system and very fast compilation times. People like both properties, so new language-design ideas are disseminated widely.
Moscow ML. Standard ML with a few extra features of the modules system.
Why?
Portable, fast compilation times, easy to make an interactive read/eval/print loop. Became a popular teaching compiler.
C-Terp. An old product, I think maybe from Gimpel Software. Saber C—a product I don't think you can buy any more.
Why?
Debugging. Especially, debugging on 1980s hardware under MS-DOS. For very little resources, you could get really good help debugging C code on very limited hardware (think: 4.77MHz processor with an 8-bit bus, 640K of RAM fully loaded). Nearly impossible to get a good visual debugger for native-compiled code, but with the interpreter, fairly easy.
UCSD Pascal—the system that made "P-code" a household word.
Why?
Teachers liked Niklaus Wirth's language design, and the compiler could run on very small machines. Wirth's clean design and the UCSD P-system made an unbeatable combination, and Pascal was the standard teaching language of the 1970s. Younger people may find it hard to appreciate that in the 1970s there was no debate over what language to teach in the first course. Today I know of programs using C, C++, Haskell, Java, ML, and Scheme. In the 1970s it was always Pascal, and the UCSD P-system was a big reason why.
In case you are wondering, P stood for portable.
Summary: Interpreting a statically typed language is a great way to get an implementation into everybody's hands quickly. (It also had advantages for debugging on Bronze Age hardware.)
Objective-C is compiled and supports dynamic typing (certainly when calling methods via [target doSomething] syntax). That is, you can send any message to a target using ordinary language syntax, without programming against a reflection API. At compile time you receive only a warning that the message might not be handled, and at runtime an exception if the target doesn't respond to that selector (which is like a method signature). You can also ask any object (which can be of static type id, if your code doesn't know any better or doesn't care) whether it respondsToSelector:, to probe its capabilities.
Java (a statically typed language) is compiled to JVM bytecode, which was interpreted on older versions of the JVM but is now run with just-in-time (JIT) compilation, meaning machine code is generated at runtime. I also believe ML and its dialects can be interpreted, and ML is definitely statically typed.
Python is a dynamic language that has compilers.
See this SO question - Python - why compile?, for instance.
In general, compiling makes the program run much faster.
ActionScript has dynamic typing and compiles to bytecode.
And it even compiles right down to native machine code if you want to release a Flash app on the iPhone.

Interactive programming language?

Is there a programming language that can be programmed entirely in interactive mode, without needing to write files which are then interpreted or compiled? Think of something like IRB for Ruby, but a system designed to let you write the whole program from the command line.
I assume you are looking for something similar to how BASIC used to work (boot up to a BASIC prompt and start coding).
IPython allows you to do this quite intuitively. Unix shells such as Bash use the same concept, but you cannot re-use and save your work nearly as intuitively as with IPython. Python is also a far better general-purpose language.
Edit: I was going to type up some examples and provide some links, but the IPython interactive tutorial seems to do this a lot better than I could. Good starting points for what you are looking for are the sections on source code handling tips and lightweight version control. Note this tutorial doesn't spell out how to do everything you are looking for precisely, but it does provide a jumping off point to understand the interactive features on the IPython shell.
Also take a look at the IPython "magic" reference, as it provides a lot of utilities that do things specific to what you want to do, and allows you to easily define your own. This is very "meta", but the example that shows how to create an IPython magic function is probably the most concise example of a "complete application" built in IPython.
Smalltalk can be programmed entirely interactively, but I wouldn't call the Smalltalk prompt a "command line". Most Lisp environments are like this as well. Also PostScript (as in printers), if memory serves.
Are you saying that you want to write a program while never seeing more code than what fits in the scrollback buffer of your command window?
There's always lisp, the original alternative to Smalltalk with this characteristic.
The only way to avoid writing any files is to move completely to a running interactive environment. When you program this way (that is, interactively, as in IRB or F# Interactive), how do you distribute your programs? When you exit IRB or the F# Interactive console, you lose all the code you wrote interactively.
Smalltalk (see a modern implementation such as Squeak) solves this, and I'm not aware of any other environment where you could fully avoid files. The solution is to distribute an image of the running environment (which includes your interactively created program); in Smalltalk these are called images.
Any Unix shell conforms to your question. This goes from bash, sh, csh, and ksh to tclsh for Tcl, or wish for Tk GUI writing.
As already mentioned, Python has a few good interactive shells. I would recommend bpython for starters instead of IPython; the advantages of bpython here are its autocompletion and its help dialogs that show what arguments a function accepts and what it does (if it has docstrings).
Screenshots: http://bpython-interpreter.org/screenshots/
This is really a question about implementations, not languages, but
Smalltalk (try out the Squeak version) keeps all your work in an "interactive workspace", but it is graphical and not oriented toward the command line.
APL, which was first deployed on IBM 360 and 370 systems, was entirely interactive, using a command line on a modified IBM Selectric typewriter! Your APL functions were kept in a "workspace" which did not at all resemble an ordinary file.
Many, many language implementations come with pure command-line interactive interpreters, like say Standard ML of New Jersey, but because they don't offer any sort of persistent namespace (i.e., when you exit the program, all your work is lost), I don't think they should really count.
Interestingly, the prime movers behind Smalltalk and APL (Kay and Iverson respectively) both won Turing Awards. (Iverson got his Turing award after being denied tenure at Harvard.)
Tcl can be programmed entirely interactively, and you can certainly define new Tcl procs (or redefine existing ones) without saving to a file.
Of course, if you are developing an entire application, at some point you do want to save to a file, or else you lose everything. Using Tcl's introspective abilities it's relatively easy to dump some or all of the current interpreter state into a Tcl file (I've written a proc to make this easier before; mostly, though, I would just develop in the file in the first place, and have a function in the application to re-source itself if its source changes).
Not sure about that, but this system is impressively interactive: http://rigsomelight.com/2014/05/01/interactive-programming-flappy-bird-clojurescript.html
Most variations of Lisp make it easy to save your interactive work product as program files, since code is just data.
Charles Simonyi's Intentional Programming concept might be part way there, too, but it's not like you can go and buy that yet. The Intentional Workbench project may be worth exploring.
Many Forths can be used like this.
Someone already mentioned Forth, but I would like to elaborate a bit on its history. Traditionally, Forth is a programming language which is its own operating system. Traditional Forth saves the program directly onto disk sectors without using a "real" filesystem. It could afford to do that because it ran directly on the CPU without an operating system, so it didn't need to play nice with anyone.
Indeed, some implementations have Forth as not only the operating system but also the CPU (a number of more modern stack-based CPUs are in fact designed as Forth machines).
In the original implementation of Forth, code was compiled each time a line was entered, then saved on disk. This is feasible because Forth is very easy to compile. You just start the interpreter, play around with Forth, defining functions as necessary, then simply quit the interpreter. The next time you start the interpreter, all your previous functions are still there. Of course, not all modern implementations of Forth work this way.
Clojure
It's a functional Lisp on the JVM. You can connect to a REPL server called nREPL, and from there you can write code in a text file and load it up interactively as you go.
Clojure gives you something akin to interactive unit testing.
I think Clojure is more interactive than other Lisps because of its strong emphasis on the functional paradigm. It's easier to hot-swap functions when they are pure.
The best way to try it out is here: http://web.clojurerepl.com/
Elm
Elm is probably the most interactive language I know of. It's a very pure functional language with syntax close to Haskell's. What makes it special is that it's designed around a reactive model that allows hot-swapping: modifying functions or values in running code. The reactive part means that whenever you change one thing, everything is re-evaluated.
Elm compiles to HTML, CSS, and JavaScript, so you won't be able to use it for everything.
Elm gives you something akin to interactive integration testing.
The best way to try it out is here: http://elm-lang.org/try

Bytecode vs. Interpreted

I remember a professor once saying that interpreted code was about 10 times slower than compiled. What's the speed difference between interpreted and bytecode? (assuming that the bytecode isn't JIT compiled)
I ask because some folks have been kicking around the idea of compiling Vim script into bytecode, and I just wonder what kind of performance boost that would get.
When you compile things down to bytecode, you have the opportunity to first perform a bunch of expensive high-level optimizations. You design the byte-code to be very easily compiled to machine code and run all the optimizations and flow analysis ahead of time.
The speed increase is thus potentially quite substantial: not only do you skip the lexing/parsing stages entirely at runtime, but you also have more opportunity to apply optimizations and generate better machine code.
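As a toy illustration (a sketch in Haskell, not Vim's actual design): parse once into a flat "bytecode", then execute that form repeatedly without ever touching the program text again.

    -- All names here are illustrative.
    data Expr = Lit Int | Add Expr Expr | Mul Expr Expr

    data Op = Push Int | OpAdd | OpMul  -- the "bytecode"

    compile :: Expr -> [Op]
    compile (Lit n)   = [Push n]
    compile (Add a b) = compile a ++ compile b ++ [OpAdd]
    compile (Mul a b) = compile a ++ compile b ++ [OpMul]

    -- A tiny stack machine: no lexing or parsing happens here.
    run :: [Op] -> [Int] -> Int
    run []             (x : _)      = x
    run (Push n : ops) st           = run ops (n : st)
    run (OpAdd : ops)  (y : x : st) = run ops (x + y : st)
    run (OpMul : ops)  (y : x : st) = run ops (x * y : st)
    run _              _            = error "malformed bytecode"

    main :: IO ()
    main = print (run (compile (Add (Lit 2) (Mul (Lit 3) (Lit 4)))) [])
    -- prints 14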
You could see a pretty good boost. However, there are a lot of factors. You can't just say that compiled code is always about 10 times faster than interpreted code, or that bytecode is n times faster than interpreted code.
Factors include the complexity and verbosity of the language, for example. If a keyword in the language is several characters and the bytecode is one byte, it should be quite a bit faster to load the bytecode and jump to the routine that handles it than to read the keyword string and then figure out where to go. But if you're interpreting one of the exotic languages with one-byte keywords, the difference might be less noticeable.
I've seen this performance boost in practice, so it might be worth it for you. Besides, it's fun to write such a thing; it gives you a feel for how language interpreters and compilers work, and that will make you a better coder.
Are there actually any mainstream "interpreters" these days that don't compile their code? (Either to bytecode or something similar.)
For instance, when you run a Perl program directly from its source code, the first thing it does is compile the source into a syntax tree, which it then optimizes and uses to execute the program. In normal situations the time spent compiling is tiny compared to the time actually running the program.
Sticking with this example: obviously Perl cannot be faster than well-optimized C code, as it is itself written in C. In practice, for most things you would normally do with Perl (like text processing), it will be as fast as you could reasonably code it in C, and orders of magnitude easier to write. On the other hand, I certainly wouldn't try to write a high-performance math routine directly in Perl.
A lot of "classic" interpreters also include the lex/parse phase along with execution.
For example, consider executing a Python script. When you do that, you have all the costs associated with converting the program text in to the internal interpreter data structures, which are then executed.
Now contrast that with executing a compiled Python script, a .pyc file. Here, the lex and parse phase is done, and you have just the runtime of the inner interpreter.
But if you consider, say, a classic BASIC interpreter, these typically never store the raw text; rather, they store a tokenized form and recreate the program text when you do "LIST". Here the bytecode is much cruder (you don't really have a virtual machine), but execution gets to skip some of the text processing. That's all done when you enter the line and hit ENTER.
It depends on your virtual machine. Some of the faster virtual machines (the JVM, say) are approaching the speed of C code. So how fast is your interpreted code running compared to C?
Don't assume that converting your interpreted code into bytecode will make it run as fast as Java (which approaches C speeds): years of performance work have gone into that. But you should see a significant speed boost.
Emacs byte-compiles its Lisp, with increased performance. It might be worth a look for you.
I've never come across a Vim script that was slow enough to notice. Assuming a script primarily calls built-in, native-code operations (regexes, block operations, etc.) that are implemented in the editor's core, even a 10x speed-up of the scripting "glue logic" would be insignificant.
Still, profiling is the only way to be really sure.
