I remember a professor once saying that interpreted code was about 10 times slower than compiled. What's the speed difference between interpreted and bytecode? (assuming that the bytecode isn't JIT compiled)
I ask because some folks have been kicking around the idea of compiling vim script into bytecode and I just wonder what kind of performance boost that will get.

When you compile things down to bytecode, you have the opportunity to first perform a bunch of expensive high-level optimizations. You design the byte-code to be very easily compiled to machine code and run all the optimizations and flow analysis ahead of time.
The speed-increase is thus potentially quite substantial - not only do you skip the whole lexing/parsing stages at runtime, but you also have more opportunity to apply optimizations and generate better machine code.

You could see a pretty good boost. However, there are a lot of factors. You can't just say that compiled code is always about 10 times faster than interpreted code, or that bytecode is n times faster than interpreted code.
Factors include the complexity and verbosity of the language for example. If a keyword in the language is several characters, and the bytecode is one, it should be quite a bit faster to load the bytecode, and jump to the routine that handles that bytecode, than it is to read the keyword string, then figure out where to go. But, if you're interpreting one of the exotic languages that has a one-byte keyword, the difference might be less noticeable.
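To make the keyword-versus-opcode point concrete, here is a minimal Python sketch of a toy stack language. The keywords, opcode numbers and function names are all invented for this illustration. The first runner splits and compares keyword strings on every run; the second translates the text once into small integer opcodes and dispatches on those.

```python
# Toy stack language; keywords and opcode numbers are made up for illustration.
PUSH, ADD, MUL = 0, 1, 2
KEYWORDS = {"push": PUSH, "add": ADD, "multiply": MUL}


def run_source(source):
    """Interpret the text directly: split and compare strings on every run."""
    stack = []
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "push":
            stack.append(int(parts[1]))
        elif parts[0] == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif parts[0] == "multiply":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown keyword: {parts[0]}")
    return stack.pop()


def compile_source(source):
    """Translate the text once into (opcode, operand) pairs."""
    code = []
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue
        op = KEYWORDS[parts[0]]
        operand = int(parts[1]) if op == PUSH else None
        code.append((op, operand))
    return code


def run_bytecode(code):
    """Execute precompiled opcodes: no string handling inside the loop."""
    stack = []
    for op, operand in code:
        if op == PUSH:
            stack.append(operand)
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return stack.pop()


PROGRAM = "push 2\npush 3\nadd\npush 4\nmultiply"
```

Both runners compute the same value for PROGRAM. How much faster the second one is depends on how often the code is re-run and on how expensive the text handling is relative to the actual work, so measure before assuming a particular factor.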
I've seen this performance boost in practice, so it might be worth it for you. Besides, it's fun to write such a thing, it gives you a feel for how language interpreters and compilers work, and that will make you a better coder.

Are there actually any mainstream "interpreters" these days that don't actually compile their code? (Either to bytecode or something similar.)
For instance, when you run a Perl program directly from its source code, the first thing Perl does is compile the source into a syntax tree, which it then optimizes and uses to execute the program. In normal situations the time spent compiling is tiny compared to the time actually running the program.
Sticking to this example, obviously Perl cannot be faster than well-optimized C code, as it is written in C. In practice, for most things you would normally do with Perl (like text processing), it will be as fast as you could reasonably code it in C, and orders of magnitude easier to write. On the other hand, I certainly wouldn't try to write a high performance math routine directly in Perl.

Also, a lot of "classic" interpreters also include the lex/parse phase along with execution.
For example, consider executing a Python script. When you do that, you have all the costs associated with converting the program text into the internal interpreter data structures, which are then executed.
Now contrast that with executing a compiled Python script, a .pyc file. Here, the lex and parse phase is done, and you have just the runtime of the inner interpreter.
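A rough way to see that split from inside Python itself is the built-in compile() function, which does the lex/parse/compile step once and hands back a code object (essentially what a .pyc file stores). The sketch below assumes CPython; SOURCE and the two function names are made up for the example.

```python
import timeit

SOURCE = "total = sum(i * i for i in range(10))"

# Lex, parse and compile once; the result is a reusable code object.
code_obj = compile(SOURCE, "<example>", "exec")


def run_from_text():
    """Pay the lex/parse/compile cost on every call."""
    namespace = {}
    exec(SOURCE, namespace)
    return namespace["total"]


def run_precompiled():
    """Skip straight to the bytecode interpreter loop."""
    namespace = {}
    exec(code_obj, namespace)
    return namespace["total"]


if __name__ == "__main__":
    # Timings vary by machine and Python version, so none are quoted here.
    print("from text:  ", timeit.timeit(run_from_text, number=10_000))
    print("precompiled:", timeit.timeit(run_precompiled, number=10_000))
```

For a one-line snippet like this, parsing is a large share of the total cost. For a long-running script it usually is not, which is why a .pyc mostly helps startup time rather than steady-state speed.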
But if you consider, say, a classic BASIC interpreter, these typically never store the raw text, rather they store a tokenized form and recreate the program text when you do "LIST". Here the byte code is much cruder (you don't really have a virtual machine here), but your execution gets to skip some of the text processing. That's all done when you enter the line and hit ENTER.
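The tokenizing trick can be sketched in a few lines of Python. The token table below is hypothetical (real BASICs each used their own byte values), and for simplicity the sketch does not protect keywords that appear inside string literals.

```python
# Hypothetical token table; real BASIC dialects used their own byte values.
TOKENS = {"PRINT": 0x80, "GOTO": 0x81, "IF": 0x82, "THEN": 0x83}
KEYWORDS = {value: word for word, value in TOKENS.items()}


def tokenize(line):
    """Done once, when the user types the line and hits ENTER."""
    parts = []
    for word in line.split(" "):
        token = TOKENS.get(word.upper())
        if token is not None:
            parts.append(bytes([token]))        # keyword becomes one byte
        else:
            parts.append(word.encode("ascii"))  # everything else stays text
    return b" ".join(parts)


def detokenize(data):
    """What LIST does: expand each token byte back into its keyword."""
    return "".join(KEYWORDS.get(byte, chr(byte)) for byte in data)


stored = tokenize("IF X THEN GOTO 10")
listing = detokenize(stored)
```

At run time the interpreter can switch on the single token byte instead of matching the keyword text again, and the stored program is shorter than what was typed.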

It depends on your virtual machine. Some of the faster virtual machines (the JVM, for example) are approaching the speed of C code. So how fast is your interpreted code running compared to C?
Don't expect that converting your interpreted code to bytecode will make it run as fast as Java (near C speeds): years of performance work have gone into the JVM. But you should see a significant speed boost.
Emacs Lisp can be byte-compiled, with increased performance. It might be worth a look.

I've never come across a Vim script that was noticeably slow. Assuming a script primarily calls built-in, native-code operations (regexes, block operations, etc.) that are implemented in the editor's core, even a 10x speed-up of the 'glue logic' in scripting would be insignificant.
Still, profiling is the only way to be really sure.


haskellformac , why use 'runhaskell' command?

Reading :
This requires installing the Haskell for Mac command line tools as
outlined in a previous article. Those tools include a command named
runhaskell, it runs a Haskell program in "script mode" — i.e., it is
being interpreted, instead of compiled (much like, say, the Python
interpreter runs a Python script).
Why provide a tool to run Haskell in script mode?
As the code is being interpreted, does this mean it will run slower in script mode?
Yes, it will run slower, but depending on the application this may not matter at all. Many interesting tasks don't in fact require a lot of computations, so you wouldn't even notice the runtime difference between, say, Java and Ruby, though the latter is considered to have much worse performance.
For such quick-run applications, what's rather more important is the startup time. With interpreted languages, this is often pretty immediate, whereas recompiling a script can take considerable time. So, interpreting can indeed be faster than compiling, in practice!
Furthermore, just because the script is interpreted doesn't mean every single computation is. In fact, most of the critical stuff is often defined in libraries which are compiled and only called from the interpreted code – this is the single reason why languages like Python or Matlab can be competitive in scientific computing: the computationally intensive routines are actually written in compiled C or Fortran, not the top-level language itself!
Haskell gives you the advantages of both worlds (fast raw performance of a compiled language; quick usage and conciseness of an interpreted one), but without the need to actually have two different languages – you can simply choose which parts to run compiled and which to merely interpret!
(This is not to say that this is a unique thing about Haskell – there exist in fact interpreters for pretty much all compiled languages. Only, it's normally not that common to run code interpreted except for debugging. But Haskell turns out to be well-suited even for scripting tasks that might normally be written in Python or Bash, but which nobody would bother to procure an entire Java or C++ project for.)

Why do pre-defined functions always perform faster than user-defined functions for the same functionality? Apologies if it's too basic

In every programming language, why do pre-defined functions always perform faster than user-defined functions for the same functionality, even when we code with minimum time complexity?
Is it related to a difference in resource utilization between the two?
This has been a long-standing question of mine. I need at least one good answer.
What you stated is simply not true. The claim that in any language user-defined functions perform worse than builtins is a very bold one, and it would need to be backed by hard data acquired through profiling.
Being more down-to-earth though, we might indeed observe this behaviour from time to time with various platforms, languages, runtimes, libraries and so on.
Mostly this is due to these two reasons:
Builtin functions are sometimes allowed to not be functions at all: the compiler is free to remove a function call completely (and replace it with something else like a bunch of machine instructions) if the behaviour of the code does not change.
In C, functions in the memcpy() family are a good example of this: in "release" builds, a good compiler for popular platforms usually replaces calls to them with inline CPU instructions which copy ranges of memory.
Builtin functions are often coded in platform-dependent assembly, so that when applicable, the function's body will be a finely tuned highly optimized low level code written for the target CPU's extended instruction set (such as SSE and its later iterations on Intel platforms).
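If you meant not low-level languages like C but rather higher-level ones that target a virtual machine or are interpreted, then the answer should be more obvious: the builtins are typically implemented in native code, while user-defined functions have to be executed by the VM or interpreter.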
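For the virtual-machine case, a small measurement makes the gap visible. The sketch below assumes CPython, where the built-in sum() runs its loop in compiled C while a hand-written loop is executed one bytecode at a time. my_sum and DATA are names invented for the example.

```python
import timeit


def my_sum(values):
    """User-defined: every iteration goes through the bytecode interpreter."""
    total = 0
    for v in values:
        total += v
    return total


DATA = list(range(100_000))

if __name__ == "__main__":
    # Same algorithm and same time complexity; only the implementation
    # language of the loop differs. Timings depend on machine and version.
    print("builtin sum:", timeit.timeit(lambda: sum(DATA), number=100))
    print("my_sum:     ", timeit.timeit(lambda: my_sum(DATA), number=100))
```

Both functions are O(n) and return the same result, so any difference you measure comes from how the loop is executed, not from the algorithm.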

Haskell for mission-critical systems [duplicate]

I've been curious to understand if it is possible to apply the power of Haskell to the embedded realtime world, and in googling have found the Atom package. I'd assume that in the complex case the code might have all the classical C bugs - crashes, memory corruptions, etc., which would then need to be traced to the original Haskell code that caused them.
So, this is the first part of the question: "If you have experience with Atom, how did you deal with the task of debugging the low-level bugs in the compiled C code and fixing them in the original Haskell code?"
I searched for some more examples for Atom, this blog post mentions the resulting C code 22KLOC (and obviously no code:), the included example is a toy. This and this references have a bit more practical code, but this is where this ends. And the reason I put "sizable" in the subject is, I'm most interested if you might share your experiences of working with the generated C code in the range of 300KLOC+.
As I am a Haskell newbie, obviously there may be other ways that I did not find due to my unknown unknowns, so any other pointers for self-education in this area would be greatly appreciated - and this is the second part of the question - "what would be some other practical methods (if) of doing real-time development in Haskell?". If the multicore is also in the picture, that's an extra plus :-)
(About usage of Haskell itself for this purpose: from what I read in this blog post, the garbage collection and laziness in Haskell makes it rather nondeterministic scheduling-wise, but maybe in two years something has changed. Real world Haskell programming question on SO was the closest that I could find to this topic)
Note: "real-time" above would be closer to "hard realtime" - I'm curious if it is possible to ensure that the pause time when the main task is not executing is under 0.5ms.
At Galois we use Haskell for two things:
Soft real time (OS device layers, networking), where 1-5 ms response times are plausible. GHC generates fast code, and has plenty of support for tuning the garbage collector and scheduler to get the right timings.
True real time systems, where EDSLs are used to generate code for other languages that provide stronger timing guarantees. E.g. Cryptol, Atom and Copilot.
So be careful to distinguish the EDSL (Copilot or Atom) from the host language (Haskell).
Some examples of critical systems, and in some cases, real-time systems, either written or generated from Haskell, produced by Galois.
Copilot: A Hard Real-Time Runtime Monitor -- a DSL for real-time avionics monitoring
Equivalence and Safety Checking in Cryptol -- a DSL for cryptographic components of critical systems
HaLVM -- a lightweight microkernel for embedded and mobile applications
TSE -- a cross-domain (security level) network appliance
It will be a long time before there is a Haskell system that fits in small memory and can guarantee sub-millisecond pause times. The community of Haskell implementors just doesn't seem to be interested in this kind of target.
There is healthy interest in using Haskell or something Haskell-like to compile down to something very efficient; for example, Bluespec compiles to hardware.
I don't think it will meet your needs, but if you're interested in functional programming and embedded systems you should learn about Erlang.
Yes, it can be tricky to debug problems through the generated code back to the original source. One thing Atom provides is a means to probe internal expressions, and it then leaves it up to the user how to handle these probes. For vehicle testing, we build a transmitter (in Atom) and stream the probes out over a CAN bus. We can then capture this data, format it, then view it with tools like GTKWave, either in post-processing or realtime. For software simulation, probes are handled differently. Instead of getting probe data from a CAN protocol, hooks are made to the C code to lift the probe values directly. The probe values are then used in the unit testing framework (distributed with Atom) to determine if a test passes or fails and to calculate simulation coverage.
I don't think Haskell, or other Garbage Collected languages are very well-suited to hard-realtime systems, as GC's tend to amortize their runtimes into short pauses.
Writing in Atom is not exactly programming in Haskell, as Haskell here can be seen as purely a preprocessor for the actual program you are writing.
I think Haskell is an awesome preprocessor, and using DSEL's like Atom is probably a great way to create sizable hard-realtime systems, but I don't know if Atom fits the bill or not. If it doesn't, I'm pretty sure it is possible (and I encourage anyone who does!) to implement a DSEL that does.
Having a very strong pre-processor like Haskell for a low-level language opens up a huge window of opportunity to implement abstractions through code-generation that are much more clumsy when implemented as C code text generators.
I've been fooling around with Atom. It is pretty cool, but I think it is best for small systems. Yes it runs in trucks and buses and implements real-world, critical applications, but that doesn't mean those applications are necessarily large or complex. It really is for hard-real-time apps and goes to great lengths to make every operation take the exact same amount of time. For example, instead of an if/else statement that conditionally executes one of two code branches that might differ in running time, it has a "mux" statement that always executes both branches before conditionally selecting one of the two computed values (so the total execution time is the same whichever value is selected). It doesn't have any significant type system other than built-in types (comparable to C's) that are enforced through GADT values passed through the Atom monad. The author is working on a static verification tool that analyzes the output C code, which is pretty cool (it uses an SMT solver), but I think Atom would benefit from more source-level features and checks. Even in my toy-sized app (LED flashlight controller), I've made a number of newbie errors that someone more experienced with the package might avoid, but that resulted in buggy output code that I'd rather have been caught by the compiler instead of through testing. On the other hand, it's still at version 0.1.something so improvements are undoubtedly coming.
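To illustrate the difference between if/else and a mux-style select, here is a small Python sketch. It is not Atom's API and Python gives no timing guarantees; the function names and the arithmetic are invented purely to show the shape of the idea.

```python
def mux(condition, if_true, if_false):
    """Select one of two already-computed values.

    Both arguments are evaluated by the caller before this function runs,
    so the same work is done whichever value is selected.
    """
    return if_true if condition else if_false


def control_branching(error):
    # Conventional if/else: only one branch's computation is executed,
    # so the running time can differ between the two paths.
    if error > 0:
        return error * 3 + 1
    else:
        return -error // 2


def control_mux(error):
    # mux style: compute both candidates unconditionally, then pick one.
    boosted = error * 3 + 1
    damped = -error // 2
    return mux(error > 0, boosted, damped)
```

The two control functions return the same values. The mux version simply trades extra work for a running time that does not depend on which value is selected, which is the property a hard-real-time code generator cares about.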

Which languages are dynamically typed and compiled (and which are statically typed and interpreted)?

In my reading on dynamic and static typing, I keep coming up against the assumption that statically typed languages are compiled, while dynamically typed languages are interpreted. I know that in general this is true, but I'm interested in the exceptions.
I'd really like someone to not only give some examples of these exceptions, but try to explain why it was decided that these languages should work in this way.
Here's a list of a few interesting systems. It is not exhaustive!
Dynamically typed and compiled
The Gambit Scheme compiler, Chez Scheme, Will Clinger's Larceny Scheme compiler, the Bigloo Scheme compiler, and probably many others.
Lots of people really like Scheme. Programs as data, good macro system, 35 years of development, big community. But they want performance. Hence, a number of good native-code compilers—Chez Scheme is even a successful commercial product (interpreted bytecodes are free; native codes you pay for).
The LuaJIT just-in-time compiler for Lua.
To show it could be done. And then, people started to like getting 3x speedup on their Lua programs. Lua is in a lot of games, where performance matters, plus it's creeping into other products too. 70% of the code in Adobe Lightroom is Lua.
The iconc Icon-to-C compiler.
The fifty people who used it loved Icon. Totally unusual evaluation model, the most innovative (and in my opinion, best) string-processing system ever designed. But that evaluation model was really expensive, especially on late-1980s computers. By compiling Icon to C, the Icon Project made it possible for big Icon programs to run in fewer hours.
Conclusion: people first develop an attachment to a dynamically typed language, and probably a significant code base. Eventually, the community spits out a native-code compiler so that you can get better performance and solve bigger problems.
Statically Typed and Interpreted
This category is less common, but...
Objective Caml. Dialect of ML, vehicle for lots of innovative experiments in language design.
Very portable system and very fast compilation times. People like both properties, so the new language-design ideas are disseminated widely.
Moscow ML. Standard ML with a few extra features of the modules system.
Portable, fast compilation times, easy to make an interactive read/eval/print loop. Became a popular teaching compiler.
C-Terp. An old product, I think maybe from Gimpel Software. Saber C—a product I don't think you can buy any more.
Debugging. Especially, debugging on 1980s hardware under MS-DOS. For very little resources, you could get really good help debugging C code on very limited hardware (think: 4.77MHz processor with an 8-bit bus, 640K of RAM fully loaded). Nearly impossible to get a good visual debugger for native-compiled code, but with the interpreter, fairly easy.
UCSD Pascal—the system that made "P-code" a household word.
Teachers liked Niklaus Wirth's language design, and the compiler could run on very small machines. Wirth's clean design and the UCSD P-system made an unbeatable combination, and Pascal was the standard teaching language of the 1970s. Younger people may find it hard to appreciate that in the 1970s there was no debate over what language to teach in the first course. Today I know of programs using C, C++, Haskell, Java, ML, and Scheme. In the 1970s it was always Pascal, and the UCSD P-system was a big reason why.
In case you are wondering, P stood for portable.
Summary: Interpreting a statically typed language is a great way to get an implementation into everybody's hands quickly. (It also had advantages for debugging on Bronze Age hardware.)
Objective-C is compiled and supports dynamic typing (certainly when calling methods via [target doSomething] syntax). That is, you can send any message to a target (using ordinary language syntax, without programming against a reflection API), receive only a warning at compile time that it might not be handled, and receive an exception only at runtime if the target doesn't respond to that selector (which is like a method signature); and you can ask any object (which can all be of static type id if your code doesn't know any better or doesn't care) whether it respondsToSelector: to probe its capabilities.
Java (a statically typed language) is compiled to JVM bytecode, which was interpreted on older versions of the JVM, whereas it now uses Just In Time (JIT) compilation, meaning machine code is generated at runtime. I also believe ML and its dialects can be interpreted, and ML is definitely statically typed.
Python is a dynamic language that has compilers.
See this SO question - Python - why compile?, for instance.
In general, compiling makes the program run much faster.
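You can see both halves of "dynamic but compiled" in a few lines, assuming CPython: the function below is compiled to bytecode when it is defined, yet the operand types are only checked when that bytecode runs. The function name add is just an example.

```python
import dis


def add(a, b):
    return a + b


# The function body already exists as compiled bytecode...
instructions = [ins.opname for ins in dis.get_instructions(add)]
raw_bytecode = add.__code__.co_code

# ...yet the same bytecode works for any operand types that support "+".
int_result = add(1, 2)
str_result = add("ab", "cd")

# A type mismatch is only detected at run time, not at compile time.
try:
    add(1, "x")
    mismatch_detected = False
except TypeError:
    mismatch_detected = True

if __name__ == "__main__":
    dis.dis(add)  # exact opcode names differ between Python versions
```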
Actionscript has dynamic typing and compiles to bytecode.
And it even compiles right down to native machine code if you want to release a Flash app on the iPhone.

Trivial mathematical problems as language benchmarks

Why do people insist on using trivial mathematical problems like finding numbers in the Fibonacci sequence for language benchmarks? Don't these usually get optimized to relativistic speeds? Isn't the brunt of the bottlenecks usually in I/O, system API calls, operations on strings and structures, processing large quantities of data, abstract object-oriented stuff, etc?
It is a throwback to the old days, when compiler technology for what we would now call basic math was still evolving rapidly.
Now, compiler evolution is more focused on exploiting new instructions for niche operations, 64-bit math, and so on.
Micro-benchmarks such as the ones you mention were useful, though, when evaluating the efficiency of the hotspot compiler when Java was first launched, and in evaluating the efficiency of .NET versus C/C++.
Your suggestion that I/O and system calls are the likely bottlenecks is correct, at least for some space of problems. But I notice you suggested string operations. One person's irrelevant micro-benchmark is another person's critical performance metric.
EDIT: ps, I also remember using linpack and other micro-benchmarks to compare versions of the JVM, and to compare vendors of the JVM. From v4 to v5 there was a big jump in perf, I guess the JIT compiler got more effective. Also, IBM's JVM was ahead of Sun's at that time, on Windows-x86.
Because if you want to benchmark the language/compiler, these "math problems" are good indicators of the "bare speed" of the generated code. Either they use the iterative solution, which is a tight loop and indicates how well the compiler can push instructions to the processor, or they use the recursive solution, which indicates how it handles recursive calls of short functions (inlining, tail-recursion etc.), although the Ackermann function is often used for that too.
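The two flavours of the Fibonacci benchmark look like this in Python. It is a minimal sketch using timeit from the standard library, and the function names and repetition counts are arbitrary choices for the example.

```python
import timeit


def fib_iterative(n):
    """Tight loop: stresses basic arithmetic and loop overhead."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_recursive(n):
    """Naive recursion: stresses the cost of many short function calls."""
    return n if n < 2 else fib_recursive(n - 1) + fib_recursive(n - 2)


if __name__ == "__main__":
    # No timings are quoted here; they depend on machine and implementation.
    print("iterative:", timeit.timeit(lambda: fib_iterative(20), number=1000))
    print("recursive:", timeit.timeit(lambda: fib_recursive(20), number=10))
```

The two functions compute the same numbers but exercise different parts of an implementation, which is why benchmark suites often include both.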
Usually, the benchmark suite for a language contains tests for other areas as well, e.g. gzip compression, text searching, object creation, virtual function calls, and exception throw/catch.
The other things you've noticed, syscalls and IO are usually not included because
syscalls are in fact not that slow - applications don't spend a significant portion of their time in the kernel, except for tests specifically targeted at them or when something is seriously wrong with the program
syscall and IO performance does not depend on the language, but rather on the OS & hardware
I'd think a simple, well-established algorithm would remove the possibility that the benchmark is biased (whether through ignorance or malice) to favor one language. It is very difficult to write a complex program in two different languages exactly the same. Testing something like the efficiency of a multithreaded application in C# vs Java, for example, would require developers skilled in multithreaded development in both languages, and there would still be questions as to whether the benchmark app properly represents the general case, or if it is misrepresenting a special case that only one language handles well.
Back when the sieve of Eratosthenes was a popular benchmark for C compilers, I thought it would be funny if one of the compiler authors would recognize the sieve code and replace it with a pre-computed lookup.
