How reasonable/possible/difficult is it to write a tsocks clone in Haskell - haskell

I'm a reasonably competent programmer who knows haskell, but who hasn't used it in any major projects. I know enough about c and systems and network programming that I believe I can pick apart tsocks from the source code.
I don't have any experience with the low-level systems interfaces haskell provides. I'm looking for any advice people can offer me on the topic, including, "Don't do it; you'll hate yourself for it," provided there is an explanation.

I really wouldn't do this, except as an experiment. I'm a Haskell guy, but not a deep systems guy, so there's a caveat there. But nonetheless, I see the following on the tsocks page:
tsocks is based on the 'shared library
interceptor' concept. Through use of
the LD_PRELOAD environment variable or
the /etc/ld.so.preload file tsocks is
automatically loaded into the process
space of every executed program. From
there it overrides the normal
connect() function by providing its
own. Thus when an application calls
connect() to establish a TCP
connection it instead passes control
to tsocks. tsocks determines if the
connection needs to be made via a
SOCKS server (by checking
/etc/tsocks.conf) and negotiates the
connection if so (through use of the
real connect() function )
It is possible to call Haskell from C, and vice-versa. And its relatively easy, in fact. For shared libraries, see this: http://www.haskell.org/ghc/docs/6.12.1/html/users_guide/using-shared-libs.html.
But when you invoke Haskell from C, you need to A) link in the runtime and B) invoke the runtime.
So that works when the C knows that its calling Haskell. But its relatively trickier when the C doesn't know that it's calling Haskell, and so you'd need to wrap the Haskell shared library with a C library that invoked and managed the runtime transparently to the program that is preloading the haskell-tsocks library to intercept its normal connect functions.
So I'm sure this can be done -- but it sounds rather painful and complicated, and somewhat expensive in terms of having to link the whole ghc runtime in for this one feature. And frankly, I imagine the code you'd be writing (I haven't inspected the tsocks code itself yet) would largely be FFI calls anyway.
So a Haskell implementation of some element of socks -- a proxy, a client, etc. sounds interesting and potentially useful. But the exact preload magic that tsocks does sounds like a perhaps poor fit.
Bear in mind that there are Haskell hackers that are much better at this stuff than me, more knowledgeable, and more experienced. So they might say otherwise.

(Posting as a separate answer, since this is advice unrelated to the FFI)
You probably know this stuff, but in case its useful for anyone...
Read up on the Network.Socket module
Search the Haskell Wiki for pages that might help you (like Applications and libraries/Network)
Check out System Programming in Haskell and other chapters from RWH
Ignore the people that say "Haskell is terrible for I/O" - protip: you can just scare them away by saying fancy words like "endofunctor"

This may not be exactly the answer you were looking for, but instead of re-writing it in Haskell, you could just use the Foreign Function Interface to wrap the already-existing C implementation in Haskell types.
Note, one of the few major changes in Haskell 2010 was to officially include the FFI as a language feature. Link: Haskell 2010 FFI

Related

Elixir and Haskell interoperability

As a platform for handling concurrent problems, Elixir/OTP seems to be the best suited solution.
When writing an application with a web interface, consider the case in which I want to reason about, and decouple, the application logic using another functional language - namely haskell (due to benefits like its advanced detection of errors at compile time, static typing, etc). I would then handle concurrency using GenServers, and attach a web interface using Phoenix.Channels.
Is this setup even possible using NIFs? Also, would true concurrency be maintained? I'm not sure that I'm following the correct line of reasoning here, but would a new haskell process be able to be spawned in line with GenServer demands, and would the two be able communicate efficiently?
This setup is certainly possible using NIFs and GHC's FFI with a small amount of boilerplate written in C. But NIFs are best used for short synchronous computations with no side effects and I get the feeling that that isn't what these operations are.
You'd probably be better off with C Nodes for the Haskell parts of the application. Most of the documentation you'll find for that will be for Erlang and not Elixir, but given Elixir's easy interop with Erlang, it should be pretty straight forward (someone's even written an example). Most of the hard work will be to do with writing a Haskell "C Node", for which a cursory glance at hackage and github turns up nothing.

What would be involved in calling ARPACK++ (a C++ library) from Haskell?

I've spent a couple of days developing a program in Haskell, while learning the language. Now I realize that I'll need to call Arpack (a Fortran library) or Arpack++ (a C++ wrapper to Arpack) -- I can't find a good implementation of Lanczos method with Haskell bindings. Do any more experienced Haskell programers have an opinion of how difficult this would be?
I've been able to get ".so" ("shared object") versions of libarpack and libarpack++ installed through Ubuntu's repository, but I'm not sure that will suffice. I suspect I'm going to ultimately need to build Arpack++ from source code, which is possible, but I'm getting a lot of build errors, so it will take time. Is there any way to use just the ".so" files, without knowing exactly which version of the header files were used to generate them?
I'm considering using GreenCard, because it looks like the most well maintained Haskell/C bridge. I can't find much documentation though, so I'm wondering whether it will support C++ too.
I'm also starting to wonder whether I should rewrite my program in Python, and use scipy to call Arpack, but I've already sunk a couple of days into writing Haskell. I really like Haskell too, so I'm hoping I can make this work. I guess my overall question is this: What would be involved in making this work with Haskell?
Thanks much.
ELF format is standard format of executables and shared libraries, so accessing the code in these compiled modules is only a matter of knowing function names. If I understand correctly, Fortran is interoperable with C. As a consequence, Fortran should be interoperable with any language which can use C bindings, including Haskell. FYI, you can find all names exported by a module (executable or shared object or simple object archive) using nm tool (it is usually available in all linux distros by default). This of course would work if the binary file was not "stripped", but AFAIK it is not common practice.
However, Haskell cannot use C++ bindings in sane way, since C++ polymorphic features require name mangling, and the method of this name transformation is highly compiler-dependent. It is well-known problem which is not specific to Haskell. Of course, you could try to get a list of exported symbols from C++ shared object and then bind them using FFI, but... It isn't worth it.
As dsign said, you can use Foreign Function Interface GHC feature to create bindings to foreign code. All you would require is library headers (and the library itself of course). In case of C language that would be header files (*.h), but since your library is written in Fortran, you have to find header files analogue in library sources, refere to this page to match Fortran and C types, and then use this information to write FFI bindings. It would be helpful first to write C bindings, i.e. write C header. Then you can even use automatic FFI binding programs like c2hs.
It maybe also helpful to look through C++ bindings. It is possible that it has the header file I've described above. If it has one, then writing FFI bindings will be no more difficult than writing them for any other library.
So, it is not entirely impossible, but it may require some thorough work. Writing bindings to scientific/pure computational libraries is way easier than writing them for some system library which does a lot of IO and keeps its own internal state, but since this library is written not in C... Well, it may be advisable to invest your time in easier alternatives. I cannot say anythin about scipy, I've never used it, but since Python as a language is much more simpler than Haskell, it may be good alternative.
I can tell you that using a C/Fortran library from Haskell, with the help of the Foreign Function Interface would be certainly possible and not terribly complicated. Here is an introduction. In my understanding, you should be able to call anything with a C calling convention, and perhaps even Fortran, without need of recompiling the code. The only exception is with things that look like function calls but are indeed macros, in which case you will have to figure out what the macros do and reproduce them in Haskell.
As of greencard, I have never used it, so I can not vouch for it.
Your second idea of using Python could potentially save you more than a couple of days. Sad as it is, I have never managed Haskell code to easily adapt to my changing requirements, while I find that trivial in Python. Of course, that could be a limitation on my skills with Haskell or my thinking process rather that something to blame to the language.

Safe execution of untrusted Haskell code

I'm looking for a way to run an arbitrary Haskell code safely (or refuse to run unsafe code).
Must have:
module/function whitelist
timeout on execution
memory usage restriction
Capabilities I would like to see:
ability to kill thread
compiling the modules to native code
caching of compiled code
running several interpreters concurrently
complex datatype for compiler errors (insted of simple message in String)
With that sort of functionality it would be possible to implement a browser plugin capable of running arbitrary Haskell code, which is the idea I have in mind.
EDIT: I've got two answers, both great. Thanks! The sad part is that there doesn't seem to be ready-to-go library, just a similar program. It's a useful resource though. Anyway I think I'll wait for 7.2.1 to be released and try to use SafeHaskell in my own program.
We've been doing this for about 8 years now in lambdabot, which supports:
a controlled namespace
OS-enforced timeouts
native code modules
caching
concurrent interactive top-levels
custom error message returns.
This series of rules is documented, see:
Safely running untrusted Haskell code
mueval, an alternative implementation based on ghc-api
The approach to safety taken in lambdabot inspired the Safe Haskell language extension work.
For approaches to dynamic extension of compiled Haskell applications, in Haskell, see the two papers:
Dynamic Extension of Typed Functional Languages, and
Dynamic applications from the ground up.
GHC 7.2.1 will likely have a new facility called SafeHaskell which covers some of what you want. SafeHaskell ensures type-safety (so things like unsafePerformIO are outlawed), and establishes a trust mechanism, so that a library with a safe API but implemented using unsafe features can be trusted. It is designed exactly for running untrusted code.
For the other practical aspects (timeouts and so on), lambdabot as Don says would be a great place to look.

How to create a language these days?

I need to get around to writing that programming language I've been meaning to write. How do you kids do it these days? I've been out of the loop for over a decade; are you doing it any differently now than we did back in the pre-internet, pre-windows days? You know, back when "real" coders coded in C, used the command line, and quibbled over which shell was superior?
Just to clarify, I mean, not how do you DESIGN a language (that I can figure out fairly easily) but how do you build the compiler and standard libraries and so forth? What tools do you kids use these days?
One consideration that's new since the punched card era is the existence of virtual machines already bountifully provided with "standard libraries." Targeting the JVM or the .NET CLR instead of ye olde "language walled garden" saves you a lot of bootstrapping. If you're creating a compiled language, you may also find Java byte code or MSIL an easier compile target than machine code (of course, if you're in this for the fun of creating a tight optimising compiler then you'll see this as a bug rather than a feature).
On the negative side, the idioms of the JVM or CLR may not be what you want for your language. So you may still end up building "standard libraries" just to provide idiomatic interfaces over the platform facility. (An example is that every languages and its dog seems to provide its own method for writing to the console, rather than leaving users to manually call System.out.println or Console.WriteLine.) Nevertheless, it enables an incremental development of the idiomatic libraries, and means that the more obscure libraries for which you never get round to building idiomatic interfaces are still accessible even if in an ugly way.
If you're considering an interpreted language, .NET also has support for efficient interpretation via the Dynamic Language Runtime (DLR). (I don't know if there's an equivalent for the JVM.) This should help free you up to focus on the language design without having to worry so much about the optimisation of the interpreter.
I've written two compilers now in Haskell for small domain-specific languages, and have found it to be an incredibly productive experience. The parsec library makes playing with syntax easy, and interpreters are very simple to write over a Haskell data structure. There is a description of writing a Lisp interpreter in Haskell that I found helpful.
If you are interested in a high-performance backend, I recommend LLVM. It has a concise and elegant byte-code and the best x86/amd64 generating backend you can find. There is an optional garbage collector, and some experimental backends that target the JVM and CLR.
You can write a compiler in any language that produces LLVM bytecode. If you are adventurous enough to learn Haskell but want LLVM, there are a set of Haskell-LLVM bindings.
What has changed considerably but hasn't been mentioned yet is IDE support and interoperability:
Nowadays we pretty much expect Intellisense, step-by-step execution and state inspection "right in the editor window", new types that tell the debugger how to treat them and rather helpful diagnostic messages. The old "compile .x -> .y" executable is not enough to create a language anymore. The environment is nothing to focus on first, but affects willingness to adopt.
Also, libraries have become much more powerful, noone wants to implement all that in yet another language. Try to borrow, make it easy to call existing code, and make it easy to be called by other code.
Targeting a VM - as itowlson suggested - is probably a good way to get started. If that turns out a problem, it can still be replaced by native compilers.
I'm pretty sure you do what's always been done.
Write some code, and show your results to the world.
As compared to the olden times, there are some tools to make your job easier though. Might I suggest ANTLR for parsing your language grammar?
Speaking as someone who just built a very simple assembly like language and interpreter, I'd start out with the .NET framework or similar. Nothing can beat the powerful syntax of C# + the backing of the entire .NET community when attempting to write most things. From here i designed a simple bytecode format and assembly syntax and proceeeded to write my interpreter + assembler.
Like i said, it was a very simple language.
You should not accept wimpy solutions like using the latest tools. You should bootstrap the language by writing a minimal compiler in Visual Basic for Applications or a similar language, then write all the compilation tools in your new language and then self-compile it using only the language itself.
Also, what is the proposed name of the language?
I think recently there have not been languages with ALL CAPITAL LETTER names like COBOL and FORTRAN, so I hope you will call it something like MIKELANG with all capital letters.
Not so much an implementation but a design decision which effects implementation - if you make every statement of your language have a unique parse tree without context, you'll get something that it's easy to hand-code a parser, and that doesn't require large amounts of work to provide syntax highlighting for. Similarly simple things like using a different symbol for module namespaces and object namespaces ( unlike Java which uses . for both package and class namespaces ) means you can parse the code without loading every module that it refers to.
Standard libraries - include the equivalent of everything in C99 standard libraries other than setjmp. Add whatever else you need for your domain. Work out an easy way to do this, either something like SWIG or an in-line FFI such as Ruby's [can't remember module name] and Python's ctypes.
Building as much of the language in the language is an option, but projects which start out doing either give up (rubinius moved to using C++ for parts of its standard library), or is only for research purposes (Mozilla Narcissus)
I am actually a kid, haha. I've never written an actual compiler before or designed a language, but I have finished The Red Dragon Book, so I suppose I have somewhat of an idea (I hope).
It would depend firstly on the grammar. If it's LR or LALR I suppose tools like Bison/Flex would work well. If it's more LL, I'd use Spirit, which is a component of Boost. It allows you to write the language's grammar in C++ in an EBNF-like syntax, so no muddling around with code generators; the C++ compiler compiles the grammar for you. If any of these fail, I'd write an EBNF grammar on paper, and then proceed to do some heavy recursive descent parsing, which seems to work; if C++ can be parsed pretty well using RDP (as GCC does it), then I suppose with enough unit tests and patience you could write entire compilers using RDP.
Once I have a parser running and some sort of intermediate representation, it then depends on how it runs. If it's some bytecode or native code compiler, I'll use LLVM or libJIT to process it. LLVM is more suited for general compilation, but I like the libJIT API and documentation better. Alternatively, if I'm really lazy, I'll generate C code and let GCC do the actual compilation. Another alternative, is to target an existing VM, like Parrot or the JVM or the CLR. Parrot is the VM being designed for Perl. If it's just an interpreter, I'll walk the syntax tree.
A radical alternative is to use Prolog, which has syntax features which remarkably simulate EBNF. I have no experience with it though, and if I am not wrong (which I am almost certainly going to be), Prolog would be quite slow if used to parse heavy duty programming languages with a lot of syntactical constructs and quirks (read: C++ and Perl).
All this I'll do in C++, if only because I am more used to writing in it than C. I'd stay away from Java/Python or anything of that sort for the actual production code (writing compilers in C/C++ help to make it portable), but I could see myself using them as a prototyping language, especially Python, which I am partial towards. Of course, I've never actually done any of this before, so I'm not one to say.
On lambda-the-ultimate there's a link to Create Your Own Programming Language by Marc-André Cournoyer, which appears to describe how to leverage some modern tools for creating little languages.
Just to clarify, I mean, not how do you DESIGN a language (that I can figure out fairly easily)
Just a hint: Look at some quite different languages first, before designing a new languge (i.e. languages with a very different evaluation strategy). Haskell and Oz come to mind. Though you should also know Prolog and Scheme. A year ago I also was like "hey, let's design a language that behaves exactly as I want", but fortunatly I looked at those other languages first (or you could also say unfortunatly, because now I don't know how I want a language to behave anymore...).
Before you start creating a language you should read this:
Hanspeter Moessenboeck, The Art of Niklaus Wirth
ftp://ftp.ssw.uni-linz.ac.at/pub/Papers/Moe00b.pdf
There's a big shortcut to implementing a language that I don't see in the other answers here. If you use one of Lukasiewicz's "unparenthesized" forms (ie. Forward Polish or Reverse Polish) you don't need a parser at all! With reverse polish, the dependencies go right-to-left so you simply execute each token as it's scanned. With forward polish, it's the reverse of that, so you actually execute the program "backwards", simplifying subexpressions until reaching the starting token.
To understand why this works, you should investigate the 3 primary tree-traversal algorithms: pre-order, in-order, post-order. These three traversals are the inverse of the parsing task that a language reader (i. parser) has to perform. Only the in-order notation "requires" a recursive decent to re-construct the expression tree. With the other two, you can get away with just a stack.
This may require more "thinking' and less "implementing".
BTW, if you've already found an answer (this question is a year old), you can post that and accept it.
Real coders still code in C. Just that it's a litte sharper.
Hmmm... language design? or writing a compiler?
If you want to write a compiler, you'd use Flex + Bison. (google)
Not an easy answer, but..
You essentially want to define a set of rules written in text (tokens) and then some parser that checks these rules and assembles them into fragments.
http://www.mactech.com/articles/mactech/Vol.16/16.07/UsingFlexandBison/
People can spend years on this, The above article talks about using two tools (Flex and Bison) That can be used to turn text into code you can feed to a compiler.
First I spent a year or so to actually think how the language should look like. At the same time I helped in developing Ioke (www.ioke.org) to learn language internals.
I have chosen Objective-C as implementation platform as it's fast (enough), simple and rich language. It also provides test framework so agile approach is a go. It also has a rich standard library I can build upon.
Since my language is simple on syntactic level (no keywords, only literals, operators and messages) I could go with Ragel (http://www.complang.org/ragel/) for building scanner. It's fast as hell and simple to use.
Now I have a working object model, scanner and simple operator shuffling plus standard library bootstrap code. I can even run a simple programs - as long as they fit in one file that is :)
Of course older techniques are still common (e.g. using Flex and Bison) many newer language implementations combine the lexing and parsing phase, by using a parser based on a parsing expression grammar (PEG). This works for recursive descent parsers created using combinators, or memoizing Packrat parsers. Many compilers are built using the Antlr framework also.
Use bison/flex which is the gnu version of yacc/lex. This book is extremely helpful.
The reason to use bison is it catches any conflicts in the language. I used it and it made my life many years easier (ok so i'm on my 2nd year but the first 6months was a few years ago writing it in C++ and the parsing/conflicts/results were terrible! :(.)
If you want to write a compiler obviously you need to read the Dragon Book ;)
Here is another good book that I have just read. It is practical and easier to understand than the Dragon Book:
http://www.amazon.co.uk/s/ref=nb_sb_noss?url=search-alias%3Daps&field-keywords=language+implementation+patterns&x=0&y=0
Mike --
If you're interested in an efficient native-code-generating compiler for Windows so you can get your bearings -- without wading through all the unnecessary widgets, gadgets, and other nonsense that clutter today's machines -- I recommend the Osmosian Order's Plain English development system. It includes a unique interface, a simplified file manager, a friendly text editor, a handy hexadecimal dumper, the compiler/linker (of course), and a wysiwyg page-layout application for documentation. Written entirely in Plain English, it is a quick download (less than a megabyte), small enough to understand in short order (about 25,000 lines of Plain English code, with just 4,000 in the compiler/linker), yet powerful enough to reproduce itself on a bottom-of-the-line Dell in less than three seconds. Really: three seconds. And it's free to all who write and ask for a copy, including the source code and and a rather humorous tongue-in-cheek 100-page manual. See www.osmosian.com for details on how to get a copy, or write to me directly with questions or comments: Gerry.Rzeppa#pobox.com

What do people find so appealing about dynamic languages? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
It seems that everybody is jumping on the dynamic, non-compiled bandwagon lately. I've mostly only worked in compiled, static typed languages (C, Java, .Net). The experience I have with dynamic languages is stuff like ASP (Vb Script), JavaScript, and PHP. Using these technologies has left a bad taste in my mouth when thinking about dynamic languages. Things that usually would have been caught by the compiler such as misspelled variable names and assigning an value of the wrong type to a variable don't occur until runtime. And even then, you may not notice an error, as it just creates a new variable, and assigns some default value. I've also never seen intellisense work well in a dynamic language, since, well, variables don't have any explicit type.
What I want to know is, what people find so appealing about dynamic languages? What are the main advantages in terms of things that dynamic languages allow you to do that can't be done, or are difficult to do in compiled languages. It seems to me that we decided a long time ago, that things like uncompiled asp pages throwing runtime exceptions was a bad idea. Why is there is a resurgence of this type of code? And why does it seem to me at least, that Ruby on Rails doesn't really look like anything you couldn't have done with ASP 10 years ago?
I think the reason is that people are used to statically typed languages that have very limited and inexpressive type systems. These are languages like Java, C++, Pascal, etc. Instead of going in the direction of more expressive type systems and better type inference, (as in Haskell, for example, and even SQL to some extent), some people like to just keep all the "type" information in their head (and in their tests) and do away with static typechecking altogether.
What this buys you in the end is unclear. There are many misconceived notions about typechecking, the ones I most commonly come across are these two.
Fallacy: Dynamic languages are less verbose. The misconception is that type information equals type annotation. This is totally untrue. We all know that type annotation is annoying. The machine should be able to figure that stuff out. And in fact, it does in modern compilers. Here is a statically typed QuickSort in two lines of Haskell (from haskell.org):
qsort [] = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++ qsort (filter (>= x) xs)
And here is a dynamically typed QuickSort in LISP (from swisspig.net):
(defun quicksort (lis) (if (null lis) nil
(let* ((x (car lis)) (r (cdr lis)) (fn (lambda (a) (< a x))))
(append (quicksort (remove-if-not fn r)) (list x)
(quicksort (remove-if fn r))))))
The Haskell example falsifies the hypothesis statically typed, therefore verbose. The LISP example falsifies the hypothesis verbose, therefore statically typed. There is no implication in either direction between typing and verbosity. You can safely put that out of your mind.
Fallacy: Statically typed languages have to be compiled, not interpreted. Again, not true. Many statically typed languages have interpreters. There's the Scala interpreter, The GHCi and Hugs interpreters for Haskell, and of course SQL has been both statically typed and interpreted for longer than I've been alive.
You know, maybe the dynamic crowd just wants freedom to not have to think as carefully about what they're doing. The software might not be correct or robust, but maybe it doesn't have to be.
Personally, I think that those who would give up type safety to purchase a little temporary liberty, deserve neither liberty nor type safety.
Don't forget that you need to write 10x code coverage in unit tests to replace what your compiler does :D
I've been there, done that with dynamic languages, and I see absolutely no advantage.
When reading other people's responses, it seems that there are more or less three arguments for dynamic languages:
1) The code is less verbose.
I don't find this valid. Some dynamic languages are less verbose than some static ones. But F# is statically typed, but the static typing there does not add much, if any, code. It is implicitly typed, though, but that is a different thing.
2) "My favorite dynamic language X has my favorite functional feature Y, so therefore dynamic is better". Don't mix up functional and dynamic (I can't understand why this has to be said).
3) In dynamic languages you can see your results immediately. News: You can do that with C# in Visual Studio (since 2005) too. Just set a breakpoint, run the program in the debugger and modify the program while debbuging. I do this all the time and it works perfectly.
Myself, I'm a strong advocate for static typing, for one primary reason: maintainability. I have a system with a couple 10k lines of JavaScript in it, and any refactoring I want to do will take like half a day since the (non-existent) compiler will not tell me what that variable renaming messed up. And that's code I wrote myself, IMO well structured, too. I wouldn't want the task of being put in charge of an equivalent dynamic system that someone else wrote.
I guess I will be massively downvoted for this, but I'll take the chance.
VBScript sucks, unless you're comparing it to another flavor of VB.
PHP is ok, so long as you keep in mind that it's an overgrown templating language.
Modern Javascript is great. Really. Tons of fun. Just stay away from any scripts tagged "DHTML".
I've never used a language that didn't allow runtime errors. IMHO, that's largely a red-herring: compilers don't catch all typos, nor do they validate intent. Explicit typing is great when you need explicit types, but most of the time, you don't. Search for the questions here on generics or the one about whether or not using unsigned types was a good choice for index variables - much of the time, this stuff just gets in the way, and gives folks knobs to twiddle when they have time on their hands.
But, i haven't really answered your question. Why are dynamic languages appealing? Because after a while, writing code gets dull and you just want to implement the algorithm. You've already sat and worked it all out in pen, diagrammed potential problem scenarios and proved them solvable, and the only thing left to do is code up the twenty lines of implementation... and two hundred lines of boilerplate to make it compile. Then you realize that the type system you work with doesn't reflect what you're actually doing, but someone else's ultra-abstract idea of what you might be doing, and you've long ago abandoned programming for a life of knicknack tweaking so obsessive-compulsive that it would shame even fictional detective Adrian Monk.
That's when you go get plastered start looking seriously at dynamic languages.
I am a full-time .Net programmer fully entrenched in the throes of statically-typed C#. However, I love modern JavaScript.
Generally speaking, I think dynamic languages allow you to express your intent more succinctly than statically typed languages as you spend less time and space defining what the building blocks are of what you are trying to express when in many cases they are self evident.
I think there are multiple classes of dynamic languages, too. I have no desire to go back to writing classic ASP pages in VBScript. To be useful, I think a dynamic language needs to support some sort of collection, list or associative construct at its core so that objects (or what pass for objects) can be expressed and allow you to build more complex constructs. (Maybe we should all just code in LISP ... it's a joke ...)
I think in .Net circles, dynamic languages get a bad rap because they are associated with VBScript and/or JavaScript. VBScript is just a recalled as a nightmare for many of the reasons Kibbee stated -- anybody remember enforcing type in VBScript using CLng to make sure you got enough bits for a 32-bit integer. Also, I think JavaScript is still viewed as the browser language for drop-down menus that is written a different way for all browsers. In that case, the issue is not language, but the various browser object models. What's interesting is that the more C# matures, the more dynamic it starts to look. I love Lambda expressions, anonymous objects and type inference. It feels more like JavaScript everyday.
Here is a statically typed QuickSort in two lines of Haskell (from haskell.org):
qsort [] = []
qsort (x:xs) = qsort (filter (< x) xs) ++ [x] ++ qsort (filter (>= x) xs)
And here is a dynamically typed QuickSort in LISP (from swisspig.net):
(defun quicksort (lis) (if (null lis) nil
(let* ((x (car lis)) (r (cdr lis)) (fn (lambda (a) (< a x))))
(append (quicksort (remove-if-not fn r)) (list x)
(quicksort (remove-if fn r))))))
I think you're biasing things with your choice of language here. Lisp is notoriously paren-heavy. A closer equivelent to Haskell would be Python.
if len(L) <= 1: return L
return qsort([lt for lt in L[1:] if lt < L[0]]) + [L[0]] + qsort([ge for ge in L[1:] if ge >= L[0]])
Python code from here
For me, the advantage of dynamic languages is how much more readable the code becomes due to less code and functional techniques like Ruby's block and Python's list comprehension.
But then I kind of miss the compile time checking (typo does happen) and IDE auto complete. Overall, the lesser amount of code and readability pays off for me.
Another advantage is the usually interpreted/non compiled nature of the language. Change some code and see the result immediately. It's really a time saver during development.
Last but not least, I like the fact that you can fire up a console and try out something you're not sure of, like a class or method that you've never used before and see how it behaves. There are many uses for the console and I'll just leave that for you to figure out.
Your arguments against dynamic languages are perfectly valid. However, consider the following:
Dynamic languages don't need to be compiled: just run them. You can even reload the files at run time without restarting the application in most cases.
Dynamic languages are generally less verbose and more readable: have you ever looked at a given algorithm or program implemented in a static language, then compared it to the Ruby or Python equivalent? In general, you're looking at a reduction in lines of code by a factor of 3. A lot of scaffolding code is unnecessary in dynamic languages, and that means the end result is more readable and more focused on the actual problem at hand.
Don't worry about typing issues: the general approach when programming in dynamic languages is not to worry about typing: most of the time, the right kind of argument will be passed to your methods. And once in a while, someone may use a different kind of argument that just happens to work as well. When things go wrong, your program may be stopped, but this rarely happens if you've done a few tests.
I too found it a bit scary to step away from the safe world of static typing at first, but for me the advantages by far outweigh the disadvantages, and I've never looked back.
I believe that the "new found love" for dynamically-typed languages have less to do with whether statically-typed languages are better or worst - in the absolute sense - than the rise in popularity of certain dynamic languages. Ruby on Rails was obviously a big phenomenon that cause the resurgence of dynamic languages. The thing that made rails so popular and created so many converts from the static camp was mainly: very terse and DRY code and configuration. This is especially true when compared to Java web frameworks which required mountains of XML configuration. Many Java programmers - smart ones too - converted over, and some even evangelized ruby and other dynamic languages. For me, three distinct features allow dynamic languages like Ruby or Python to be more terse:
Minimalist syntax - the big one is that type annotations are not required, but also the the language designer designed the language from the start to be terse
inline function syntax(or the lambda) - the ability to write inline functions and pass them around as variables makes many kinds of code more brief. In particular this is true for list/array operations. The roots of this ideas was obviously - LISP.
Metaprogramming - metaprogramming is a big part of what makes rails tick. It gave rise to a new way of refactoring code that allowed the client code of your library to be much more succinct. This also originate from LISP.
All three of these features are not exclusive to dynamic languages, but they certainly are not present in the popular static languages of today: Java and C#. You might argue C# has #2 in delegates, but I would argue that it's not widely used at all - such as with list operations.
As for more advanced static languages... Haskell is a wonderful language, it has #1 and #2, and although it doesn't have #3, it's type system is so flexible that you will probably not find the lack of meta to be limiting. I believe you can do metaprogramming in OCaml at compile time with a language extension. Scala is a very recent addition and is very promising. F# for the .NET camp. But, users of these languages are in the minority, and so they didn't really contribute to this change in the programming languages landscape. In fact, I very much believe the popularity of Ruby affected the popularity of languages like Haskell, OCaml, Scala, and F# in a positive way, in addition to the other dynamic languages.
Personally, I think it's just that most of the "dynamic" languages you have used just happen to be poor examples of languages in general.
I am way more productive in Python than in C or Java, and not just because you have to do the edit-compile-link-run dance. I'm getting more productive in Objective-C, but that's probably more due to the framework.
Needless to say, I am more productive in any of these languages than PHP. Hell, I'd rather code in Scheme or Prolog than PHP. (But lately I've actually been doing more Prolog than anything else, so take that with a grain of salt!)
My appreciation for dynamic languages is very much tied to how functional they are. Python's list comprehensions, Ruby's closures, and JavaScript's prototyped objects are all very appealing facets of those languages. All also feature first-class functions--something I can't see living without ever again.
I wouldn't categorize PHP and VB (script) in the same way. To me, those are mostly imperative languages with all of the dynamic-typing drawbacks that you suggest.
Sure, you don't get the same level of compile-time checks (since there ain't a compile time), but I would expect static syntax-checking tools to evolve over time to at least partially address that issue.
One of the advantages pointed out for dynamic languages is to just be able to change the code and continue running. No need to recompile. In VS.Net 2008, when debugging, you can actually change the code, and continue running, without a recompile. With advances in compilers and IDEs, is it possible that this and other advantages of using dynamic languages will go away.
Ah, I didn't see this topic when I posted similar question
Aside from good features the rest of the folks mentioned here about dynamic languages, I think everybody forget one, the most basic thing: metaprogramming.
Programming the program.
Its pretty hard to do in compiled languages, generally, take for example .Net. To make it work you have to make all kind of mambo jumbo and it usualy ends with code that runs around 100 times slower.
Most dynamic languages have a way to do metaprogramming and that is something that keeps me there - ability to create any kind of code in memory and perfectly integrate it into my applicaiton.
For instance to create calculator in Lua, all I have to do is:
print( loadstring( "return " .. io.read() )() )
Now, try to do that in .Net.
My main reason for liking dynamic (typed, since that seems to be the focus of the thread) languages is that the ones I've used (in a work environment) are far superior to the non-dynamic languages I've used. C, C++, Java, etc... they're all horrible languages for getting actual work done in. I'd love to see an implicitly typed language that's as natural to program in as many of the dynamically typed ones.
That being said, there's certain constructs that are just amazing in dynamically typed languages. For example, in Tcl
lindex $mylist end-2
The fact that you pass in "end-2" to indicate the index you want is incredibly concise and obvious to the reader. I have yet to see a statically typed language that accomplishes such.
I think this kind of argument is a bit stupid: "Things that usually would have been caught by the compiler such as misspelled variable names and assigning an value of the wrong type to a variable don't occur until runtime" yes thats right as a PHP developer I don't see things like mistyped variables until runtime, BUT runtime is step 2 for me, in C++ (Which is the only compiled language I have any experience) it is step 3, after linking, and compiling.
Not to mention that it takes all of a few seconds after I hit save to when my code is ready to run, unlike in compiled languages where it can take literally hours. I'm sorry if this sounds a bit angry, but I'm kind of tired of people treating me as a second rate programmer because I don't have to compile my code.
The argument is more complex than this (read Yegge's article "Is Weak Typing Strong Enough" for an interesting overview).
Dynamic languages don't necessarily lack error checking either - C#'s type inference is possibly one example. In the same way, C and C++ have terrible compile checks and they are statically typed.
The main advantages of dynamic languages are a) capability (which doesn't necessarily have to be used all the time) and b) Boyd's Law of Iteration.
The latter reason is massive.
Although I'm not a big fan of Ruby yet, I find dynamic languages to be really wonderful and powerful tools.
The idea that there is no type checking and variable declaration is not too big an issue really. Admittedly, you can't catch these errors until run time, but for experienced developers this is not really an issue, and when you do make mistakes, they're usually easily fixed.
It also forces novices to read what they're writing more carefully. I know learning PHP taught me to be more attentive to what I was actually typing, which has improved my programming even in compiled languages.
Good IDEs will give enough intellisense for you to know whether a variable has been "declared" and they also try to do some type inference for you so that you can tell what a variable is.
The power of what can be done with dynamic languages is really what makes them so much fun to work with in my opinion. Sure, you could do the same things in a compiled language, but it would take more code. Languages like Python and PHP let you develop in less time and get a functional codebase faster most of the time.
And for the record, I'm a full-time .NET developer, and I love compiled languages. I only use dynamic languages in my free time to learn more about them and better myself as a developer..
I think that we need the different types of languages depending on what we are trying to achieve, or solve with them. If we want an application that creates, retrieves, updates and deletes records from the database over the internet, we are better off doing it with one line of ROR code (using the scaffold) than writing it from scratch in a statically typed language. Using dynamic languages frees up the minds from wondering about
which variable has which type
how to grow a string dynamically as needs be
how to write code so that if i change type of one variable, i dont have to rewrite all the function that interact with it
to problems that are closer to business needs like
data is saving/updating etc in the database, how do i use it to drive traffic to my site
Anyway, one advantage of loosely typed languages is that we dont really care what type it is, if it behaves like what it is supposed to. That is the reason we have duck-typing in dynamically typed languages. it is a great feature and i can use the same variable names to store different types of data as the need arises. also, statically typed languages force you to think like a machine (how does the compiler interact with your code, etc etc) whereas dynamically typed languages, especially ruby/ror, force the machine to think like a human.
These are some of the arguments i use to justify my job and experience in dynamic languages!
I think both styles have their strengths. This either/or thinking is kind of crippling to our community in my opinion. I've worked in architectures that were statically-typed from top to bottom and it was fine. My favorite architecture is for dynamically-typed at the UI level and statically-typed at the functional level. This also encourages a language barrier that enforces the separation of UI and function.
To be a cynic, it may be simply that dynamic languages allow the developer to be lazier and to get things done knowing less about the fundamentals of computing. Whether this is a good or bad thing is up to the reader :)
FWIW, Compiling on most applications shouldn't take hours. I have worked with applications that are between 200-500k lines that take minutes to compile. Certainly not hours.
I prefer compiled languages myself. I feel as though the debugging tools (in my experience, which might not be true for everything) are better and the IDE tools are better.
I like being able to attach my Visual Studio to a running process. Can other IDEs do that? Maybe, but I don't know about them. I have been doing some PHP development work lately and to be honest it isn't all that bad. However, I much prefer C# and the VS IDE. I feel like I work faster and debug problems faster.
So maybe it is more a toolset thing for me than the dynamic/static language issue?
One last comment... if you are developing with a local server saving is faster than compiling, but often times I don't have access to everything on my local machine. Databases and fileshares live elsewhere. It is easier to FTP to the web server and then run my PHP code only to find the error and have to fix and re-ftp.
Productivity in a certain context. But that is just one environment I know, compared to some others I know or have seen used.
Smalltalk on Squeak/Pharo with Seaside is a much more effective and efficient web platform than ASP.Net(/MVC), RoR or Wicket, for complex applications. Until you need to interface with something that has libraries in one of those but not smalltalk.
Misspelled variable names are red in the IDE, IntelliSense works but is not as specific. Run-time errors on webpages are not an issue but a feature, one click to bring up the debugger, one click to my IDE, fix the bug in the debugger, save, continue. For simple bugs, the round-trip time for this cycle is less than 20 seconds.
Dynamic Languages Strike Back
http://www.youtube.com/watch?v=tz-Bb-D6teE
A talk discussing Dynamic Languages, what some of the positives are, and how many of the negatives aren't really true.
Because I consider stupid having to declare the type of the box.
The type stays with the entity, not with the container. Static typing had a sense when the type of the box had a direct consequence on how the bits in memory were interpreted.
If you take a look at the design patterns in the GoF, you will realize that a good part of them are there just to fight with the static nature of the language, and they have no reason whatsoever to exist in a dynamic language.
Also, I'm tired of having to write stuff like MyFancyObjectInterface f = new MyFancyObject(). DRY principle anyone ?
Put yourself in the place of a brand new programmer selecting a language to start out with, who doesn't care about dynamic versus staic versus lambdas versus this versus that etc.; which language would YOU choose?
C#
using System;
class MyProgram
{
public static void Main(string[] args)
{
foreach (string s in args)
{
Console.WriteLine(s);
}
}
}
Lua:
function printStuff(args)
for key,value in pairs(args) do
print value .. " "
end
end
strings = {
"hello",
"world",
"from lua"
}
printStuff(strings)
This all comes down to partially what's appropriate for the particular goals and what's a common personal preference. (E.G. Is this going to be a huge code base maintained by more people than can conduct a reasonable meeting together? You want type checking.)
The personal part is about trading off some checks and other steps for development and testing speed (while likely giving up some cpu performance). There's some people for which this is liberating and a performance boost, and there's some for which this is quite the opposite, and yes it does sort of depend on the particular flavor of your language too. I mean no one here is saying Java rocks for speedy, terse development, or that PHP is a solid language where you'll rarely make a hard to spot typo.
I have love for both static and dynamic languages. Every project that I've been involved in since about 2002 has been a C/C++ application with an embedded Python interpret. This gives me the best of both worlds:
The components and frameworks that make up the application are, for a given release of an application, immutable. They must also be very stable, and hence, well tested. A Statically typed language is the right choice for building these parts.
The wiring up of components, loading of component DLLs, artwork, most of the GUI, etc... can vary greatly (say, to customise the application for a client) with no need to change any framework or components code. A dynamic language is perfect for this.
I find that the mix of a statically typed language to build the system and a dynamically type language to configure it gives me flexibility, stability and productivity.
To answer the question of "What's with the love of dynamic languages?" For me it's the ability to completely re-wire a system at runtime in any way imaginable. I see the scripting language as "running the show", therefore the executing application may do anything you desire.
I don't have much experience with dynamic languages in general, but the one dynamic language I do know, JavaScript(aka ECMAScript), I absolutely love.
Well, wait, what's the discussion here? Dynamic compilation? Or dynamic typing? JavaScript covers both bases so I guess I'll talk about both:
Dynamic compilation:
To begin, dynamic languages are compiled, the compilation is simply put off until later. And Java and .NET really are compiled twice. Once to their respective intermediate languages, and again, dynamically, to machine code.
But when compilation is put off you can see results faster. That's one advantage. I do enjoy simply saving the file and seeing my program in action fairly quick.
Another advantage is that you can write and compile code at runtime. Whether this is possible in statically compiled code, I don't know. I imagine it must be, since whatever compiles JavaScript is ultimately machine code and statically compiled. But in a dynamic language this is a trivial thing to do. Code can write and run itself. (And I'm pretty sure .NET can do this, but the CIL that .NET compiles to is dynamically compiled on the fly anyways, and it's not so trivial in C#)
Dynamic typing:
I think dynamic typing is more expressive than static typing. Note that I'm using the term expressive informally to say that dynamic typing can say more with less. Here's some JavaScript code:
var Person = {};
Do you know what Person is now? It's a generic dictionary. I can do this:
Person["First_Name"] = "John";
Person["Last_Name"] = "Smith";
But it's also an object. I could refer to any of those "keys" like this:
Person.First_Name
And add any methods I deem necessary:
Person.changeFirstName = function(newName) {
this.First_Name = newName;
};
Sure, there might be problems if newName isn't a string. It won't be caught right away, if ever, but you can check yourself. It's a matter of trading expressive power and flexibility for safety. I don't mind adding code to check types, etc, myself, and I've yet to run into a type bug that gave me much grief (and I know that isn't saying much. It could be a matter of time :) ). I very much enjoy, however, that ability to adapt on the fly.
Nice blog post on the same topic: Python Makes Me Nervous
Method signatures are virtually
useless in Python. In Java, static
typing makes the method signature into
a recipe: it's all the shit you need
to make this method work. Not so in
Python. Here, a method signature will
only tell you one thing: how many
arguments you need to make it work.
Sometimes, it won't even do that, if
you start fucking around with
**kwargs.
Because it's fun fun fun. It's fun to not worry about memory allocation, for one. It's fun not waiting for compilation. etc etc etc
Weakly typed languages allow flexibility in how you manage your data.
I used VHDL last spring for several classes, and I like their method of representing bits/bytes, and how the compiler catches errors if you try to assign a 6-bit bus to a 9-bit bus. I tried to recreate it in C++, and I'm having a fair struggle to neatly get the typing to work smoothly with existing types. Steve Yegge does a very nice job of describing the issues involved with strong type systems, I think.
Regarding verbosity: I find Java and C# to be quite verbose in the large(let's not cherry-pick small algorithms to "prove" a point). And, yes, I've written in both. C++ struggles in the same area as well; VHDL succumbs here.
Parsimony appears to be a virtue of the dynamic languages in general(I present Perl and F# as examples).

Resources