yacc/lex or hand-coding? - programming-languages

I am working on new programming language, but I was always puzzled by the fact that everyone is using yaxx/lex to parse the code, but I am not.
My compiler (which is already working) is handcoded in C++/STL, and I cannot say it's complex or took too much time. It has both some kind of lexer and parser, but they are not autogenerated.
Earlier, I wrote a C compiler(not full spec) the same way - it was able to compile the program in 1 pass, with all these back references resolving & preprocessing - this is definitely impossible with yacc/lex.
I just cannot convince myself to scrap all this, and start diving into yaxx/lex - which might need quite an effort to implement and might possibly introduce some grammar limitations.
Is there something I miss when not using yacc/lex? Do I do an evil thing?

The main advantages of using any kind of lexer/parser generator is that it gives you a lot more flexibility if your language evolves. In a hand-coded lexer/parser (especially if you've mixed in a lot of functionality in a single pass!), changes to the language get nasty fairly quickly, whereas with a parser generator you make the change, re-run the generator, and move on with your life. There are certainly no inherent technical limitations to always just writing everything by hand, but I think the evolvability and maintainability of automating away the boring bits is worth it!

Yacc is inflexible in some ways:
good error handling is hard (basically, its algorithm is only defined to parse a correct string correctly, otherwise, all bets are off; this is one of the reasons that GCC moved to a hand-written parser)
context-dependency is hard to express, whereas with a hand-written recursive descent parser you can simply add a parameter to the functions
Furthermore, I have noticed that lex/yacc object code is often bigger than a hand-written recursive descent parser (source code tends to be the other way round).
I have not used ANTLR so I cannot say if that is better at these points.

The other huge advantage of using generators is that they are guaranteed to process exactly and only the language you specified in the grammar. You can't say that of any hand-written code. The LR/LALR variants are also guaranteed to be O(N), which again you can't assert about any hand coding, at least not without a lot of effort in constructing the proof.
I've written both and lived with both and I would never hand-code again. I only did that one because I didn't have yacc on the platform at the time.

Maybe you are missing out on ANTLR, which is good for languages that can be defined with a recursive-descent parsing strategy.
There are potentially some advantages to using Yacc/Lex, but it is not mandatory to use them. There are some downsides to using Yacc/Lex too, but the advantages usually outweigh the disadvantages. In particular, it is often easier to maintain a Yacc-driven grammar than a hand-coded one, and you benefit from the automation that Yacc provides.
However, writing your own parser from scratch is not an evil thing to do. It may make it harder to maintain in future, but it may make it easier, too.

It certainly depends on the complexity of your language grammar. An easy grammar means that there is an easy implementation and you can just do it yourself.
Take a look at maybe the worst possible example at all: C++ :) (Does anybody knows another language, besides natural languages, which are more difficult to parse correctly?) Even with tools like Antlr, is it quite difficult to get it right, though it is manageable. Whereby on the other side, even while being much harder, it seems that some of the best C++ parsers, e.g. GCC and LLVM, are also mostly handwritten.
If you don't need too much flexibility and your language is not too trivial, you will certainly safe some work/time by using Antlr.

Related

Expression trees vs IL.Emit for runtime code specialization

I recently learned that it is possible to generate C# code at runtime and I would like to put this feature to use. I have code that does some very basic geometric calculations like computing line-plane intersections and I think I could gain some performance benefits by generating specialized code for some of the methods because many of the calculations are performed for the same plane or the same line over and over again. By specializing the code that computes the intersections I think I should be able to gain some performance benefits.
The problem is that I'm not sure where to begin. From reading a few blog posts and browsing MSDN documentation I've come across two possible strategies for generating code at runtime: Expression trees and IL.Emit. Using expression trees seems much easier because there is no need to learn anything about OpCodes and various other MSIL related intricacies but I'm not sure if expression trees are as fast as manually generated MSIL. So are there any suggestions on which method I should go with?
The performance of both is generally same, as expression trees internally are traversed and emitted as IL using the same underlying system functions that you would be using yourself. It is theoretically possible to emit a more efficient IL using low-level functions, but I doubt that there would be any practically important performance gain. That would depend on the task, but I have not come of any practical optimisation of emitted IL, compared to one emitted by expression trees.
I highly suggest getting the tool called ILSpy that reverse-compiles CLR assemblies. With that you can look at the code actually traversing the expression trees and actually emitting IL.
Finally, a caveat. I have used expression trees in a language parser, where function calls are bound to grammar rules that are compiled from a file at runtime. Compiled is a key here. For many problems I came across, when what you want to achieve is known at compile time, then you would not gain much performance by runtime code generation. Some CLR JIT optimizations might be also unavailable to dynamic code. This is only an opinion from my practice, and your domain would be different, but if performance is critical, I would rather look at native code, highly optimized libraries. Some of the work I have done would be snail slow if not using LAPACK/MKL. But that is only a piece of the advice not asked for, so take it with a grain of salt.
If I were in your situation, I would try alternatives from high level to low level, in increasing "needed time & effort" and decreasing reusability order, and I would stop as soon as the performance is good enough for the time being, i.e.:
first, I'd check to see if Math.NET, LAPACK or some similar numeric library already has similar functionality, or I can adapt/extend the code to my needs;
second, I'd try Expression Trees;
third, I'd check Roslyn Project (even though it is in prerelease version);
fourth, I'd think about writing common routines with unsafe C code;
[fifth, I'd think about quitting and starting a new career in a different profession :) ],
and only if none of these work out, would I be so hopeless to try emitting IL at run time.
But perhaps I'm biased against low level approaches; your expertise, experience and point of view might be different.

Is there a suitable replacement for C++, when I would like to write video processing applications?

I want to write a video editing software, and the "logical" conclusion is that the language I must to use is C++... But I don't like it (sorry c++ fans)
I would like to write it with something cool, like Lisp or Haskell or Erlang... But I don't know if the open source implementation of those languages (I don't have money to buy licenses) let me made a competitive software (in the performance area)
What do you think? what do you recommend?
I can't speak to Lisp, but both Erlang and Haskell are capable of the performance necessary for video processing. Achieving that performance is likely to be more difficult than with C++ because there are fewer existing libraries in the domain, so you'll have to implement more yourself. Which means you'll have to be capable of writing high-performance code yourself. In Haskell I expect this would require a significant investment of time (6 months minimum) to become proficient.
Which language you choose should depend a great deal upon the goals of the project. If it's a hobby project, or you want to learn a lot about processing algorithms (and therefore don't mind having to do a lot of low-level coding yourself), there's nothing wrong with using an out-of-mainstream language. Haskell has bindings to a lot of things you would probably want to use eventually, such as a wrapper for GLSL.
As somebody working with audio processing (including real-time), I can say that Haskell's performance hasn't been a problem for me. For a recent project I did write some functions in C, but that was necessary to implement a custom vectorization scheme. Doing high-level work in Haskell and calling out to C when necessary is a perfectly valid approach, although thankfully it's less necessary now than in the past.
Of course, this presumes a few things about the nature of your project. If you want something you can use right away, Haskell, Lisp, and Erlang are probably not the languages for you because there are fewer resources. Have you considered Processing? It's Java, I don't know if you consider that better than C++ or worse.
I had motivations besides productivity for working in Haskell (and my productivity took a big hit for a while), without those other goals I wouldn't have persevered. If you want to write something to use it, stick with what's going to be most productive. If you have other motivations, tell us what they are and it's more likely people will make helpful suggestions.
For what it's worth, Wings3D is written in Erlang.
You could always try D, if you want something somewhat similar to C++ but not C++. Also, D could use some love.
For both Haskell and Erlang, the open source implementation is the standard, most efficient available implementation available. There's no reason that Haskell shouldn't be performant enough for your needs -- for video stuff I assume you'll be using matrices and such. There are quality bindings available for BLAS & co for Haskell. I don't know of a great deal of existing video editing work, but Alberto Ruiz (the author of HMatrix) has done work with Haskell and computer vision: http://dis.um.es/profesores/alberto/research.html
There's also a great deal of work on sound libraries and processing in Haskell.
I'd use the language that gives me the best coverage by third-party libraries for what I'm trying to do; for manipulating video data that's probably going to be a mainstream language like C++.
If this project is for fun/to learn a new language then by all means, take the road less traveled. But if this is something you need to ship in a reasonable amount of time, avoiding the best tools for the job because you don't like them is unsound strategy.
That depends at least on what's your goal with the project. If it's a hobby project and you want to learn a different language, then you should choose that language. In this case, however, I assume you're familiar with video processing. On the other hand, if you want to learn about video processing, I'd recommend using a language you are already comfortable with.
Now, if it's a professional project of a decent size (video processing software can be huge) you should probably consider using different languages for different things. The kind of systems I work with usually require writing some code in C (for efficiency reasons), but we always try to keep that to de indispensable minimum and use a higher level language for most of the system behaviour (we use erlang, but that applies to any other higher level language).
IMO, writing big systems in C or C++ is almost a suicide. There are projects that succeed, but I find that much harder than complementing the C part with higher level languages.
There is already some video streaming server written in Erlang http://erlyvideo.org/. You can look for some inspiration https://github.com/erlyvideo/erlyvideo.

Are Scala "continuations" just a funky syntax for defining and using Callback Functions?

And I mean that in the same sense that a C/Java for is just a funky syntax for a while loop.
I still remember when first learning about the for loop in C, the mental effort that had to go into understanding the execution sequence of the three control expressions relative to the loop statement. Seems to me the same sort of effort has to be applied to understand Continuations (in Scala and I guess probably other languages).
And then there's the obvious follow-up question... if so, then what's the point? It seems like a lot of pain (language complexity, programmer errors, unreadable programs, etc) for no gain.
In some sense, yes, continuations are funky syntax for using callbacks. You can manually perform a very complex global transformation on your code (the so called continuation-passing-style transformation), and you will get continuations on your hands without direct language support.
However, transforming your entire codebase is probably not very practical, and the resulting code is hard to read, so having the compiler do it for you behind the scenes is MUCH better.

How to create a language these days?

I need to get around to writing that programming language I've been meaning to write. How do you kids do it these days? I've been out of the loop for over a decade; are you doing it any differently now than we did back in the pre-internet, pre-windows days? You know, back when "real" coders coded in C, used the command line, and quibbled over which shell was superior?
Just to clarify, I mean, not how do you DESIGN a language (that I can figure out fairly easily) but how do you build the compiler and standard libraries and so forth? What tools do you kids use these days?
One consideration that's new since the punched card era is the existence of virtual machines already bountifully provided with "standard libraries." Targeting the JVM or the .NET CLR instead of ye olde "language walled garden" saves you a lot of bootstrapping. If you're creating a compiled language, you may also find Java byte code or MSIL an easier compile target than machine code (of course, if you're in this for the fun of creating a tight optimising compiler then you'll see this as a bug rather than a feature).
On the negative side, the idioms of the JVM or CLR may not be what you want for your language. So you may still end up building "standard libraries" just to provide idiomatic interfaces over the platform facility. (An example is that every languages and its dog seems to provide its own method for writing to the console, rather than leaving users to manually call System.out.println or Console.WriteLine.) Nevertheless, it enables an incremental development of the idiomatic libraries, and means that the more obscure libraries for which you never get round to building idiomatic interfaces are still accessible even if in an ugly way.
If you're considering an interpreted language, .NET also has support for efficient interpretation via the Dynamic Language Runtime (DLR). (I don't know if there's an equivalent for the JVM.) This should help free you up to focus on the language design without having to worry so much about the optimisation of the interpreter.
I've written two compilers now in Haskell for small domain-specific languages, and have found it to be an incredibly productive experience. The parsec library makes playing with syntax easy, and interpreters are very simple to write over a Haskell data structure. There is a description of writing a Lisp interpreter in Haskell that I found helpful.
If you are interested in a high-performance backend, I recommend LLVM. It has a concise and elegant byte-code and the best x86/amd64 generating backend you can find. There is an optional garbage collector, and some experimental backends that target the JVM and CLR.
You can write a compiler in any language that produces LLVM bytecode. If you are adventurous enough to learn Haskell but want LLVM, there are a set of Haskell-LLVM bindings.
What has changed considerably but hasn't been mentioned yet is IDE support and interoperability:
Nowadays we pretty much expect Intellisense, step-by-step execution and state inspection "right in the editor window", new types that tell the debugger how to treat them and rather helpful diagnostic messages. The old "compile .x -> .y" executable is not enough to create a language anymore. The environment is nothing to focus on first, but affects willingness to adopt.
Also, libraries have become much more powerful, noone wants to implement all that in yet another language. Try to borrow, make it easy to call existing code, and make it easy to be called by other code.
Targeting a VM - as itowlson suggested - is probably a good way to get started. If that turns out a problem, it can still be replaced by native compilers.
I'm pretty sure you do what's always been done.
Write some code, and show your results to the world.
As compared to the olden times, there are some tools to make your job easier though. Might I suggest ANTLR for parsing your language grammar?
Speaking as someone who just built a very simple assembly like language and interpreter, I'd start out with the .NET framework or similar. Nothing can beat the powerful syntax of C# + the backing of the entire .NET community when attempting to write most things. From here i designed a simple bytecode format and assembly syntax and proceeeded to write my interpreter + assembler.
Like i said, it was a very simple language.
You should not accept wimpy solutions like using the latest tools. You should bootstrap the language by writing a minimal compiler in Visual Basic for Applications or a similar language, then write all the compilation tools in your new language and then self-compile it using only the language itself.
Also, what is the proposed name of the language?
I think recently there have not been languages with ALL CAPITAL LETTER names like COBOL and FORTRAN, so I hope you will call it something like MIKELANG with all capital letters.
Not so much an implementation but a design decision which effects implementation - if you make every statement of your language have a unique parse tree without context, you'll get something that it's easy to hand-code a parser, and that doesn't require large amounts of work to provide syntax highlighting for. Similarly simple things like using a different symbol for module namespaces and object namespaces ( unlike Java which uses . for both package and class namespaces ) means you can parse the code without loading every module that it refers to.
Standard libraries - include the equivalent of everything in C99 standard libraries other than setjmp. Add whatever else you need for your domain. Work out an easy way to do this, either something like SWIG or an in-line FFI such as Ruby's [can't remember module name] and Python's ctypes.
Building as much of the language in the language is an option, but projects which start out doing either give up (rubinius moved to using C++ for parts of its standard library), or is only for research purposes (Mozilla Narcissus)
I am actually a kid, haha. I've never written an actual compiler before or designed a language, but I have finished The Red Dragon Book, so I suppose I have somewhat of an idea (I hope).
It would depend firstly on the grammar. If it's LR or LALR I suppose tools like Bison/Flex would work well. If it's more LL, I'd use Spirit, which is a component of Boost. It allows you to write the language's grammar in C++ in an EBNF-like syntax, so no muddling around with code generators; the C++ compiler compiles the grammar for you. If any of these fail, I'd write an EBNF grammar on paper, and then proceed to do some heavy recursive descent parsing, which seems to work; if C++ can be parsed pretty well using RDP (as GCC does it), then I suppose with enough unit tests and patience you could write entire compilers using RDP.
Once I have a parser running and some sort of intermediate representation, it then depends on how it runs. If it's some bytecode or native code compiler, I'll use LLVM or libJIT to process it. LLVM is more suited for general compilation, but I like the libJIT API and documentation better. Alternatively, if I'm really lazy, I'll generate C code and let GCC do the actual compilation. Another alternative, is to target an existing VM, like Parrot or the JVM or the CLR. Parrot is the VM being designed for Perl. If it's just an interpreter, I'll walk the syntax tree.
A radical alternative is to use Prolog, which has syntax features which remarkably simulate EBNF. I have no experience with it though, and if I am not wrong (which I am almost certainly going to be), Prolog would be quite slow if used to parse heavy duty programming languages with a lot of syntactical constructs and quirks (read: C++ and Perl).
All this I'll do in C++, if only because I am more used to writing in it than C. I'd stay away from Java/Python or anything of that sort for the actual production code (writing compilers in C/C++ help to make it portable), but I could see myself using them as a prototyping language, especially Python, which I am partial towards. Of course, I've never actually done any of this before, so I'm not one to say.
On lambda-the-ultimate there's a link to Create Your Own Programming Language by Marc-André Cournoyer, which appears to describe how to leverage some modern tools for creating little languages.
Just to clarify, I mean, not how do you DESIGN a language (that I can figure out fairly easily)
Just a hint: Look at some quite different languages first, before designing a new languge (i.e. languages with a very different evaluation strategy). Haskell and Oz come to mind. Though you should also know Prolog and Scheme. A year ago I also was like "hey, let's design a language that behaves exactly as I want", but fortunatly I looked at those other languages first (or you could also say unfortunatly, because now I don't know how I want a language to behave anymore...).
Before you start creating a language you should read this:
Hanspeter Moessenboeck, The Art of Niklaus Wirth
ftp://ftp.ssw.uni-linz.ac.at/pub/Papers/Moe00b.pdf
There's a big shortcut to implementing a language that I don't see in the other answers here. If you use one of Lukasiewicz's "unparenthesized" forms (ie. Forward Polish or Reverse Polish) you don't need a parser at all! With reverse polish, the dependencies go right-to-left so you simply execute each token as it's scanned. With forward polish, it's the reverse of that, so you actually execute the program "backwards", simplifying subexpressions until reaching the starting token.
To understand why this works, you should investigate the 3 primary tree-traversal algorithms: pre-order, in-order, post-order. These three traversals are the inverse of the parsing task that a language reader (i. parser) has to perform. Only the in-order notation "requires" a recursive decent to re-construct the expression tree. With the other two, you can get away with just a stack.
This may require more "thinking' and less "implementing".
BTW, if you've already found an answer (this question is a year old), you can post that and accept it.
Real coders still code in C. Just that it's a litte sharper.
Hmmm... language design? or writing a compiler?
If you want to write a compiler, you'd use Flex + Bison. (google)
Not an easy answer, but..
You essentially want to define a set of rules written in text (tokens) and then some parser that checks these rules and assembles them into fragments.
http://www.mactech.com/articles/mactech/Vol.16/16.07/UsingFlexandBison/
People can spend years on this, The above article talks about using two tools (Flex and Bison) That can be used to turn text into code you can feed to a compiler.
First I spent a year or so to actually think how the language should look like. At the same time I helped in developing Ioke (www.ioke.org) to learn language internals.
I have chosen Objective-C as implementation platform as it's fast (enough), simple and rich language. It also provides test framework so agile approach is a go. It also has a rich standard library I can build upon.
Since my language is simple on syntactic level (no keywords, only literals, operators and messages) I could go with Ragel (http://www.complang.org/ragel/) for building scanner. It's fast as hell and simple to use.
Now I have a working object model, scanner and simple operator shuffling plus standard library bootstrap code. I can even run a simple programs - as long as they fit in one file that is :)
Of course older techniques are still common (e.g. using Flex and Bison) many newer language implementations combine the lexing and parsing phase, by using a parser based on a parsing expression grammar (PEG). This works for recursive descent parsers created using combinators, or memoizing Packrat parsers. Many compilers are built using the Antlr framework also.
Use bison/flex which is the gnu version of yacc/lex. This book is extremely helpful.
The reason to use bison is it catches any conflicts in the language. I used it and it made my life many years easier (ok so i'm on my 2nd year but the first 6months was a few years ago writing it in C++ and the parsing/conflicts/results were terrible! :(.)
If you want to write a compiler obviously you need to read the Dragon Book ;)
Here is another good book that I have just read. It is practical and easier to understand than the Dragon Book:
http://www.amazon.co.uk/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=language+implementation+patterns&x=0&y=0
Mike --
If you're interested in an efficient native-code-generating compiler for Windows so you can get your bearings -- without wading through all the unnecessary widgets, gadgets, and other nonsense that clutter today's machines -- I recommend the Osmosian Order's Plain English development system. It includes a unique interface, a simplified file manager, a friendly text editor, a handy hexadecimal dumper, the compiler/linker (of course), and a wysiwyg page-layout application for documentation. Written entirely in Plain English, it is a quick download (less than a megabyte), small enough to understand in short order (about 25,000 lines of Plain English code, with just 4,000 in the compiler/linker), yet powerful enough to reproduce itself on a bottom-of-the-line Dell in less than three seconds. Really: three seconds. And it's free to all who write and ask for a copy, including the source code and and a rather humorous tongue-in-cheek 100-page manual. See www.osmosian.com for details on how to get a copy, or write to me directly with questions or comments: Gerry.Rzeppa#pobox.com

What do people find so appealing about dynamic languages? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
It seems that everybody is jumping on the dynamic, non-compiled bandwagon lately. I've mostly only worked in compiled, static typed languages (C, Java, .Net). The experience I have with dynamic languages is stuff like ASP (Vb Script), JavaScript, and PHP. Using these technologies has left a bad taste in my mouth when thinking about dynamic languages. Things that usually would have been caught by the compiler such as misspelled variable names and assigning an value of the wrong type to a variable don't occur until runtime. And even then, you may not notice an error, as it just creates a new variable, and assigns some default value. I've also never seen intellisense work well in a dynamic language, since, well, variables don't have any explicit type.
What I want to know is, what people find so appealing about dynamic languages? What are the main advantages in terms of things that dynamic languages allow you to do that can't be done, or are difficult to do in compiled languages. It seems to me that we decided a long time ago, that things like uncompiled asp pages throwing runtime exceptions was a bad idea. Why is there is a resurgence of this type of code? And why does it seem to me at least, that Ruby on Rails doesn't really look like anything you couldn't have done with ASP 10 years ago?
I think the reason is that people are used to statically typed languages that have very limited and inexpressive type systems. These are languages like Java, C++, Pascal, etc. Instead of going in the direction of more expressive type systems and better type inference, (as in Haskell, for example, and even SQL to some extent), some people like to just keep all the "type" information in their head (and in their tests) and do away with static typechecking altogether.
What this buys you in the end is unclear. There are many misconceived notions about typechecking, the ones I most commonly come across are these two.
Fallacy: Dynamic languages are less verbose. The misconception is that type information equals type annotation. This is totally untrue. We all know that type annotation is annoying. The machine should be able to figure that stuff out. And in fact, it does in modern compilers. Here is a statically typed QuickSort in two lines of Haskell (from haskell.org):
qsort [] = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++ qsort (filter (>= x) xs)
And here is a dynamically typed QuickSort in LISP (from swisspig.net):
(defun quicksort (lis) (if (null lis) nil
(let* ((x (car lis)) (r (cdr lis)) (fn (lambda (a) (< a x))))
(append (quicksort (remove-if-not fn r)) (list x)
(quicksort (remove-if fn r))))))
The Haskell example falsifies the hypothesis statically typed, therefore verbose. The LISP example falsifies the hypothesis verbose, therefore statically typed. There is no implication in either direction between typing and verbosity. You can safely put that out of your mind.
Fallacy: Statically typed languages have to be compiled, not interpreted. Again, not true. Many statically typed languages have interpreters. There's the Scala interpreter, The GHCi and Hugs interpreters for Haskell, and of course SQL has been both statically typed and interpreted for longer than I've been alive.
You know, maybe the dynamic crowd just wants freedom to not have to think as carefully about what they're doing. The software might not be correct or robust, but maybe it doesn't have to be.
Personally, I think that those who would give up type safety to purchase a little temporary liberty, deserve neither liberty nor type safety.
Don't forget that you need to write 10x code coverage in unit tests to replace what your compiler does :D
I've been there, done that with dynamic languages, and I see absolutely no advantage.
When reading other people's responses, it seems that there are more or less three arguments for dynamic languages:
1) The code is less verbose.
I don't find this valid. Some dynamic languages are less verbose than some static ones. But F# is statically typed, but the static typing there does not add much, if any, code. It is implicitly typed, though, but that is a different thing.
2) "My favorite dynamic language X has my favorite functional feature Y, so therefore dynamic is better". Don't mix up functional and dynamic (I can't understand why this has to be said).
3) In dynamic languages you can see your results immediately. News: You can do that with C# in Visual Studio (since 2005) too. Just set a breakpoint, run the program in the debugger and modify the program while debbuging. I do this all the time and it works perfectly.
Myself, I'm a strong advocate for static typing, for one primary reason: maintainability. I have a system with a couple 10k lines of JavaScript in it, and any refactoring I want to do will take like half a day since the (non-existent) compiler will not tell me what that variable renaming messed up. And that's code I wrote myself, IMO well structured, too. I wouldn't want the task of being put in charge of an equivalent dynamic system that someone else wrote.
I guess I will be massively downvoted for this, but I'll take the chance.
VBScript sucks, unless you're comparing it to another flavor of VB.
PHP is ok, so long as you keep in mind that it's an overgrown templating language.
Modern Javascript is great. Really. Tons of fun. Just stay away from any scripts tagged "DHTML".
I've never used a language that didn't allow runtime errors. IMHO, that's largely a red-herring: compilers don't catch all typos, nor do they validate intent. Explicit typing is great when you need explicit types, but most of the time, you don't. Search for the questions here on generics or the one about whether or not using unsigned types was a good choice for index variables - much of the time, this stuff just gets in the way, and gives folks knobs to twiddle when they have time on their hands.
But, i haven't really answered your question. Why are dynamic languages appealing? Because after a while, writing code gets dull and you just want to implement the algorithm. You've already sat and worked it all out in pen, diagrammed potential problem scenarios and proved them solvable, and the only thing left to do is code up the twenty lines of implementation... and two hundred lines of boilerplate to make it compile. Then you realize that the type system you work with doesn't reflect what you're actually doing, but someone else's ultra-abstract idea of what you might be doing, and you've long ago abandoned programming for a life of knicknack tweaking so obsessive-compulsive that it would shame even fictional detective Adrian Monk.
That's when you go get plastered start looking seriously at dynamic languages.
I am a full-time .Net programmer fully entrenched in the throes of statically-typed C#. However, I love modern JavaScript.
Generally speaking, I think dynamic languages allow you to express your intent more succinctly than statically typed languages as you spend less time and space defining what the building blocks are of what you are trying to express when in many cases they are self evident.
I think there are multiple classes of dynamic languages, too. I have no desire to go back to writing classic ASP pages in VBScript. To be useful, I think a dynamic language needs to support some sort of collection, list or associative construct at its core so that objects (or what pass for objects) can be expressed and allow you to build more complex constructs. (Maybe we should all just code in LISP ... it's a joke ...)
I think in .Net circles, dynamic languages get a bad rap because they are associated with VBScript and/or JavaScript. VBScript is just a recalled as a nightmare for many of the reasons Kibbee stated -- anybody remember enforcing type in VBScript using CLng to make sure you got enough bits for a 32-bit integer. Also, I think JavaScript is still viewed as the browser language for drop-down menus that is written a different way for all browsers. In that case, the issue is not language, but the various browser object models. What's interesting is that the more C# matures, the more dynamic it starts to look. I love Lambda expressions, anonymous objects and type inference. It feels more like JavaScript everyday.
Here is a statically typed QuickSort in two lines of Haskell (from haskell.org):
qsort [] = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++ qsort (filter (>= x) xs)
And here is a dynamically typed QuickSort in LISP (from swisspig.net):
(defun quicksort (lis) (if (null lis) nil
(let* ((x (car lis)) (r (cdr lis)) (fn (lambda (a) (< a x))))
(append (quicksort (remove-if-not fn r)) (list x)
(quicksort (remove-if fn r))))))
I think you're biasing things with your choice of language here. Lisp is notoriously paren-heavy. A closer equivelent to Haskell would be Python.
if len(L) <= 1: return L
return qsort([lt for lt in L[1:] if lt < L[0]]) + [L[0]] + qsort([ge for ge in L[1:] if ge >= L[0]])
Python code from here
For me, the advantage of dynamic languages is how much more readable the code becomes due to less code and functional techniques like Ruby's block and Python's list comprehension.
But then I kind of miss the compile time checking (typo does happen) and IDE auto complete. Overall, the lesser amount of code and readability pays off for me.
Another advantage is the usually interpreted/non compiled nature of the language. Change some code and see the result immediately. It's really a time saver during development.
Last but not least, I like the fact that you can fire up a console and try out something you're not sure of, like a class or method that you've never used before and see how it behaves. There are many uses for the console and I'll just leave that for you to figure out.
Your arguments against dynamic languages are perfectly valid. However, consider the following:
Dynamic languages don't need to be compiled: just run them. You can even reload the files at run time without restarting the application in most cases.
Dynamic languages are generally less verbose and more readable: have you ever looked at a given algorithm or program implemented in a static language, then compared it to the Ruby or Python equivalent? In general, you're looking at a reduction in lines of code by a factor of 3. A lot of scaffolding code is unnecessary in dynamic languages, and that means the end result is more readable and more focused on the actual problem at hand.
Don't worry about typing issues: the general approach when programming in dynamic languages is not to worry about typing: most of the time, the right kind of argument will be passed to your methods. And once in a while, someone may use a different kind of argument that just happens to work as well. When things go wrong, your program may be stopped, but this rarely happens if you've done a few tests.
I too found it a bit scary to step away from the safe world of static typing at first, but for me the advantages by far outweigh the disadvantages, and I've never looked back.
I believe that the "new found love" for dynamically-typed languages have less to do with whether statically-typed languages are better or worst - in the absolute sense - than the rise in popularity of certain dynamic languages. Ruby on Rails was obviously a big phenomenon that cause the resurgence of dynamic languages. The thing that made rails so popular and created so many converts from the static camp was mainly: very terse and DRY code and configuration. This is especially true when compared to Java web frameworks which required mountains of XML configuration. Many Java programmers - smart ones too - converted over, and some even evangelized ruby and other dynamic languages. For me, three distinct features allow dynamic languages like Ruby or Python to be more terse:
Minimalist syntax - the big one is that type annotations are not required, but also the the language designer designed the language from the start to be terse
inline function syntax(or the lambda) - the ability to write inline functions and pass them around as variables makes many kinds of code more brief. In particular this is true for list/array operations. The roots of this ideas was obviously - LISP.
Metaprogramming - metaprogramming is a big part of what makes rails tick. It gave rise to a new way of refactoring code that allowed the client code of your library to be much more succinct. This also originate from LISP.
All three of these features are not exclusive to dynamic languages, but they certainly are not present in the popular static languages of today: Java and C#. You might argue C# has #2 in delegates, but I would argue that it's not widely used at all - such as with list operations.
As for more advanced static languages... Haskell is a wonderful language, it has #1 and #2, and although it doesn't have #3, it's type system is so flexible that you will probably not find the lack of meta to be limiting. I believe you can do metaprogramming in OCaml at compile time with a language extension. Scala is a very recent addition and is very promising. F# for the .NET camp. But, users of these languages are in the minority, and so they didn't really contribute to this change in the programming languages landscape. In fact, I very much believe the popularity of Ruby affected the popularity of languages like Haskell, OCaml, Scala, and F# in a positive way, in addition to the other dynamic languages.
Personally, I think it's just that most of the "dynamic" languages you have used just happen to be poor examples of languages in general.
I am way more productive in Python than in C or Java, and not just because you have to do the edit-compile-link-run dance. I'm getting more productive in Objective-C, but that's probably more due to the framework.
Needless to say, I am more productive in any of these languages than PHP. Hell, I'd rather code in Scheme or Prolog than PHP. (But lately I've actually been doing more Prolog than anything else, so take that with a grain of salt!)
My appreciation for dynamic languages is very much tied to how functional they are. Python's list comprehensions, Ruby's closures, and JavaScript's prototyped objects are all very appealing facets of those languages. All also feature first-class functions--something I can't see living without ever again.
I wouldn't categorize PHP and VB (script) in the same way. To me, those are mostly imperative languages with all of the dynamic-typing drawbacks that you suggest.
Sure, you don't get the same level of compile-time checks (since there ain't a compile time), but I would expect static syntax-checking tools to evolve over time to at least partially address that issue.
One of the advantages pointed out for dynamic languages is to just be able to change the code and continue running. No need to recompile. In VS.Net 2008, when debugging, you can actually change the code, and continue running, without a recompile. With advances in compilers and IDEs, is it possible that this and other advantages of using dynamic languages will go away.
Ah, I didn't see this topic when I posted similar question
Aside from good features the rest of the folks mentioned here about dynamic languages, I think everybody forget one, the most basic thing: metaprogramming.
Programming the program.
Its pretty hard to do in compiled languages, generally, take for example .Net. To make it work you have to make all kind of mambo jumbo and it usualy ends with code that runs around 100 times slower.
Most dynamic languages have a way to do metaprogramming and that is something that keeps me there - ability to create any kind of code in memory and perfectly integrate it into my applicaiton.
For instance to create calculator in Lua, all I have to do is:
print( loadstring( "return " .. io.read() )() )
Now, try to do that in .Net.
My main reason for liking dynamic (typed, since that seems to be the focus of the thread) languages is that the ones I've used (in a work environment) are far superior to the non-dynamic languages I've used. C, C++, Java, etc... they're all horrible languages for getting actual work done in. I'd love to see an implicitly typed language that's as natural to program in as many of the dynamically typed ones.
That being said, there's certain constructs that are just amazing in dynamically typed languages. For example, in Tcl
lindex $mylist end-2
The fact that you pass in "end-2" to indicate the index you want is incredibly concise and obvious to the reader. I have yet to see a statically typed language that accomplishes such.
I think this kind of argument is a bit stupid: "Things that usually would have been caught by the compiler such as misspelled variable names and assigning an value of the wrong type to a variable don't occur until runtime" yes thats right as a PHP developer I don't see things like mistyped variables until runtime, BUT runtime is step 2 for me, in C++ (Which is the only compiled language I have any experience) it is step 3, after linking, and compiling.
Not to mention that it takes all of a few seconds after I hit save to when my code is ready to run, unlike in compiled languages where it can take literally hours. I'm sorry if this sounds a bit angry, but I'm kind of tired of people treating me as a second rate programmer because I don't have to compile my code.
The argument is more complex than this (read Yegge's article "Is Weak Typing Strong Enough" for an interesting overview).
Dynamic languages don't necessarily lack error checking either - C#'s type inference is possibly one example. In the same way, C and C++ have terrible compile checks and they are statically typed.
The main advantages of dynamic languages are a) capability (which doesn't necessarily have to be used all the time) and b) Boyd's Law of Iteration.
The latter reason is massive.
Although I'm not a big fan of Ruby yet, I find dynamic languages to be really wonderful and powerful tools.
The idea that there is no type checking and variable declaration is not too big an issue really. Admittedly, you can't catch these errors until run time, but for experienced developers this is not really an issue, and when you do make mistakes, they're usually easily fixed.
It also forces novices to read what they're writing more carefully. I know learning PHP taught me to be more attentive to what I was actually typing, which has improved my programming even in compiled languages.
Good IDEs will give enough intellisense for you to know whether a variable has been "declared" and they also try to do some type inference for you so that you can tell what a variable is.
The power of what can be done with dynamic languages is really what makes them so much fun to work with in my opinion. Sure, you could do the same things in a compiled language, but it would take more code. Languages like Python and PHP let you develop in less time and get a functional codebase faster most of the time.
And for the record, I'm a full-time .NET developer, and I love compiled languages. I only use dynamic languages in my free time to learn more about them and better myself as a developer..
I think that we need the different types of languages depending on what we are trying to achieve, or solve with them. If we want an application that creates, retrieves, updates and deletes records from the database over the internet, we are better off doing it with one line of ROR code (using the scaffold) than writing it from scratch in a statically typed language. Using dynamic languages frees up the minds from wondering about
which variable has which type
how to grow a string dynamically as needs be
how to write code so that if i change type of one variable, i dont have to rewrite all the function that interact with it
to problems that are closer to business needs like
data is saving/updating etc in the database, how do i use it to drive traffic to my site
Anyway, one advantage of loosely typed languages is that we dont really care what type it is, if it behaves like what it is supposed to. That is the reason we have duck-typing in dynamically typed languages. it is a great feature and i can use the same variable names to store different types of data as the need arises. also, statically typed languages force you to think like a machine (how does the compiler interact with your code, etc etc) whereas dynamically typed languages, especially ruby/ror, force the machine to think like a human.
These are some of the arguments i use to justify my job and experience in dynamic languages!
I think both styles have their strengths. This either/or thinking is kind of crippling to our community in my opinion. I've worked in architectures that were statically-typed from top to bottom and it was fine. My favorite architecture is for dynamically-typed at the UI level and statically-typed at the functional level. This also encourages a language barrier that enforces the separation of UI and function.
To be a cynic, it may be simply that dynamic languages allow the developer to be lazier and to get things done knowing less about the fundamentals of computing. Whether this is a good or bad thing is up to the reader :)
FWIW, Compiling on most applications shouldn't take hours. I have worked with applications that are between 200-500k lines that take minutes to compile. Certainly not hours.
I prefer compiled languages myself. I feel as though the debugging tools (in my experience, which might not be true for everything) are better and the IDE tools are better.
I like being able to attach my Visual Studio to a running process. Can other IDEs do that? Maybe, but I don't know about them. I have been doing some PHP development work lately and to be honest it isn't all that bad. However, I much prefer C# and the VS IDE. I feel like I work faster and debug problems faster.
So maybe it is more a toolset thing for me than the dynamic/static language issue?
One last comment... if you are developing with a local server saving is faster than compiling, but often times I don't have access to everything on my local machine. Databases and fileshares live elsewhere. It is easier to FTP to the web server and then run my PHP code only to find the error and have to fix and re-ftp.
Productivity in a certain context. But that is just one environment I know, compared to some others I know or have seen used.
Smalltalk on Squeak/Pharo with Seaside is a much more effective and efficient web platform than ASP.Net(/MVC), RoR or Wicket, for complex applications. Until you need to interface with something that has libraries in one of those but not smalltalk.
Misspelled variable names are red in the IDE, IntelliSense works but is not as specific. Run-time errors on webpages are not an issue but a feature, one click to bring up the debugger, one click to my IDE, fix the bug in the debugger, save, continue. For simple bugs, the round-trip time for this cycle is less than 20 seconds.
Dynamic Languages Strike Back
http://www.youtube.com/watch?v=tz-Bb-D6teE
A talk discussing Dynamic Languages, what some of the positives are, and how many of the negatives aren't really true.
Because I consider stupid having to declare the type of the box.
The type stays with the entity, not with the container. Static typing had a sense when the type of the box had a direct consequence on how the bits in memory were interpreted.
If you take a look at the design patterns in the GoF, you will realize that a good part of them are there just to fight with the static nature of the language, and they have no reason whatsoever to exist in a dynamic language.
Also, I'm tired of having to write stuff like MyFancyObjectInterface f = new MyFancyObject(). DRY principle anyone ?
Put yourself in the place of a brand new programmer selecting a language to start out with, who doesn't care about dynamic versus staic versus lambdas versus this versus that etc.; which language would YOU choose?
C#
using System;
class MyProgram
{
public static void Main(string[] args)
{
foreach (string s in args)
{
Console.WriteLine(s);
}
}
}
Lua:
function printStuff(args)
for key,value in pairs(args) do
print value .. " "
end
end
strings = {
"hello",
"world",
"from lua"
}
printStuff(strings)
This all comes down to partially what's appropriate for the particular goals and what's a common personal preference. (E.G. Is this going to be a huge code base maintained by more people than can conduct a reasonable meeting together? You want type checking.)
The personal part is about trading off some checks and other steps for development and testing speed (while likely giving up some cpu performance). There's some people for which this is liberating and a performance boost, and there's some for which this is quite the opposite, and yes it does sort of depend on the particular flavor of your language too. I mean no one here is saying Java rocks for speedy, terse development, or that PHP is a solid language where you'll rarely make a hard to spot typo.
I have love for both static and dynamic languages. Every project that I've been involved in since about 2002 has been a C/C++ application with an embedded Python interpret. This gives me the best of both worlds:
The components and frameworks that make up the application are, for a given release of an application, immutable. They must also be very stable, and hence, well tested. A Statically typed language is the right choice for building these parts.
The wiring up of components, loading of component DLLs, artwork, most of the GUI, etc... can vary greatly (say, to customise the application for a client) with no need to change any framework or components code. A dynamic language is perfect for this.
I find that the mix of a statically typed language to build the system and a dynamically type language to configure it gives me flexibility, stability and productivity.
To answer the question of "What's with the love of dynamic languages?" For me it's the ability to completely re-wire a system at runtime in any way imaginable. I see the scripting language as "running the show", therefore the executing application may do anything you desire.
I don't have much experience with dynamic languages in general, but the one dynamic language I do know, JavaScript(aka ECMAScript), I absolutely love.
Well, wait, what's the discussion here? Dynamic compilation? Or dynamic typing? JavaScript covers both bases so I guess I'll talk about both:
Dynamic compilation:
To begin, dynamic languages are compiled, the compilation is simply put off until later. And Java and .NET really are compiled twice. Once to their respective intermediate languages, and again, dynamically, to machine code.
But when compilation is put off you can see results faster. That's one advantage. I do enjoy simply saving the file and seeing my program in action fairly quick.
Another advantage is that you can write and compile code at runtime. Whether this is possible in statically compiled code, I don't know. I imagine it must be, since whatever compiles JavaScript is ultimately machine code and statically compiled. But in a dynamic language this is a trivial thing to do. Code can write and run itself. (And I'm pretty sure .NET can do this, but the CIL that .NET compiles to is dynamically compiled on the fly anyways, and it's not so trivial in C#)
Dynamic typing:
I think dynamic typing is more expressive than static typing. Note that I'm using the term expressive informally to say that dynamic typing can say more with less. Here's some JavaScript code:
var Person = {};
Do you know what Person is now? It's a generic dictionary. I can do this:
Person["First_Name"] = "John";
Person["Last_Name"] = "Smith";
But it's also an object. I could refer to any of those "keys" like this:
Person.First_Name
And add any methods I deem necessary:
Person.changeFirstName = function(newName) {
this.First_Name = newName;
};
Sure, there might be problems if newName isn't a string. It won't be caught right away, if ever, but you can check yourself. It's a matter of trading expressive power and flexibility for safety. I don't mind adding code to check types, etc, myself, and I've yet to run into a type bug that gave me much grief (and I know that isn't saying much. It could be a matter of time :) ). I very much enjoy, however, that ability to adapt on the fly.
Nice blog post on the same topic: Python Makes Me Nervous
Method signatures are virtually
useless in Python. In Java, static
typing makes the method signature into
a recipe: it's all the shit you need
to make this method work. Not so in
Python. Here, a method signature will
only tell you one thing: how many
arguments you need to make it work.
Sometimes, it won't even do that, if
you start fucking around with
**kwargs.
Because it's fun fun fun. It's fun to not worry about memory allocation, for one. It's fun not waiting for compilation. etc etc etc
Weakly typed languages allow flexibility in how you manage your data.
I used VHDL last spring for several classes, and I like their method of representing bits/bytes, and how the compiler catches errors if you try to assign a 6-bit bus to a 9-bit bus. I tried to recreate it in C++, and I'm having a fair struggle to neatly get the typing to work smoothly with existing types. Steve Yegge does a very nice job of describing the issues involved with strong type systems, I think.
Regarding verbosity: I find Java and C# to be quite verbose in the large(let's not cherry-pick small algorithms to "prove" a point). And, yes, I've written in both. C++ struggles in the same area as well; VHDL succumbs here.
Parsimony appears to be a virtue of the dynamic languages in general(I present Perl and F# as examples).

Resources