Strings and Strands in MoarVM

When running Raku code on Rakudo with the MoarVM backend, is there any way to print information about how a given Str is stored in memory from inside the running program? In particular, I am curious whether there's a way to see how many Strands currently make up the Str (whether via Raku introspection, NQP, or something that accesses the MoarVM level; does such a thing even exist at runtime?).
If there isn't any way to access this info at runtime, is there a way to get at it through output from one of Rakudo's command-line flags, such as --target, or --tracing? Or through a debugger?
Finally, does MoarVM manage the number of Strands in a given Str? I often hear (or say) that one of Raku's super powers is that it can index into Unicode strings in O(1) time, but I've been thinking about the pathological case, and it feels like it would be O(n). For example,
(^$n).map({~rand}).join
seems like it would create a Str with a length proportional to $n that consists of $n Strands and, if I'm understanding the data structure correctly, that means that indexing into this Str would require checking the length of each Strand, for a time complexity of O(n). But I know that it's possible to flatten a Strand-ed Str; would MoarVM do something like that in this case? Or have I misunderstood something more basic?

When running Raku code on Rakudo with the MoarVM backend, is there any way to print information about how a given Str is stored in memory from inside the running program?
My educated guess is yes, as described below for App::MoarVM modules. That said, my education came from a degree I started at the Unseen University, and a wizard had me expelled for guessing too much, so...
In particular, I am curious whether there's a way to see how many Strands currently make up the Str (whether via Raku introspection, NQP, or something that accesses the MoarVM level; does such a thing even exist at runtime?).
I'm 99.99% sure strands are purely an implementation detail of the backend, and there'll be no Raku or NQP access to that information without MoarVM specific tricks. That said, read on.
If there isn't any way to access this info at runtime
I can see there is access at runtime via MoarVM.
is there a way to get at it through output from one of Rakudo's command-line flags, such as --target, or --tracing? Or through a debugger?
I'm 99.99% sure there are multiple ways.
For example, there's a bunch of strand debugging code in MoarVM's ops.c file starting with #define MVM_DEBUG_STRANDS ....
Perhaps more interesting is what appears to be a veritable goldmine of sophisticated debugging and profiling features built into MoarVM, plus what appear to be Rakudo-specific modules that drive those features, presumably via Raku code. For a dozen or so articles discussing some aspects of those features, I suggest reading timotimo's blog. Browsing GitHub, I see ongoing commits related to MoarVM's debugging features for years and on into 2021.
Finally, does MoarVM manage the number of Strands in a given Str?
Yes. I can see that the string handling code (some links are below), which was written by samcv (extremely smart and careful) and, I believe, reviewed by jnthn, has logic limiting the number of strands.
I often hear (or say) that one of Raku's super powers is that it can index into Unicode strings in O(1) time, but I've been thinking about the pathological case, and it feels like it would be O(n).
Yes, if a backend that supported strands did not manage the number of strands.
But for MoarVM I think the intent is to set an absolute upper bound with #define MVM_STRING_MAX_STRANDS 64 in MoarVM's MVMString.h file, and logic that checks against that (and other characteristics of strings; see this else if statement as an exemplar). But the logic is sufficiently complex, and my C chops sufficiently meagre, that I am nowhere near being able to express confidence in that, even if I can say that that appears to be the intent.
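To make the argument concrete, here is a toy Python model of a strand-ed string with a cap on the number of strands. Only the value 64 comes from the constant named above; the class name StrandString, its methods, and the "flatten everything once the cap is exceeded" policy are assumptions made for illustration, and MoarVM's real logic is more involved than this sketch.

```python
import bisect

MAX_STRANDS = 64  # same value as the constant named above; the policy below is assumed


class StrandString:
    """Toy model: a list of flat pieces plus their cumulative end offsets."""

    def __init__(self, pieces=()):
        self.pieces = [p for p in pieces if p]  # drop empty pieces
        self.ends = []
        total = 0
        for piece in self.pieces:
            total += len(piece)
            self.ends.append(total)

    def __len__(self):
        return self.ends[-1] if self.ends else 0

    def concat(self, other):
        """Concatenate by sharing pieces; flatten if the strand cap is exceeded."""
        pieces = self.pieces + other.pieces
        if len(pieces) > MAX_STRANDS:
            pieces = ["".join(pieces)]  # hypothetical policy: collapse to one strand
        return StrandString(pieces)

    def char_at(self, index):
        """Find the strand holding `index` by binary search over at most 64 offsets."""
        if not 0 <= index < len(self):
            raise IndexError(index)
        k = bisect.bisect_right(self.ends, index)
        start = self.ends[k - 1] if k else 0
        return self.pieces[k][index - start]


# Join 1000 one-character strings, as in the pathological example.
s = StrandString()
for n in range(1000):
    s = s.concat(StrandString([str(n % 10)]))
```

Because the number of strands never exceeds the cap, the lookup cost is bounded by a constant regardless of the string's length, which is the property the question is about.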
For example, (^$n).map({~rand}).join seems like it would create a Str with a length proportional to $n that consists of $n Strands
I'm 95% confident that indexing into strings constructed by simple joins like that will be O(1).
This is based on me thinking that a Raku/NQP level string join operation is handled by MVM_string_join, and my attempts to understand what that code does.
But I know that it's possible to flatten a Strand-ed Str; would MoarVM do something like that in this case?
If you read the code you will find it's doing very sophisticated handling.
Or have I misunderstood something more basic?
I'm pretty sure I will have misunderstood something basic so I sure ain't gonna comment on whether you have. :)

As far as I understand it, the fact that MoarVM implements strands (that is, concatenating two strings only creates a strand that consists of "references" to the original strings) is really just that: an implementation detail.
You can implement the Raku Programming Language without needing to implement strands. Therefore there is no way to introspect this, at least to my knowledge.
There has been a PR to expose the nqp:: op that would actually concatenate strands into a single string, but that has been refused / closed: https://github.com/rakudo/rakudo/pull/3975

Related

Tools for Domain Specific Language/Functions

Our users can enter questions that get answered by students. Our users need an extensible, flexible way to define the correct answers to these questions (which are stored as a simple string).
I would like to expose a library of domain specific functions that users can call on to describe the correct answer. Eg:
exact_match("puppy") // means the correct answer is the string 'puppy'
or
contains("yesterday") // means any answer with the word 'yesterday' is correct
The naive implementation would involve eval'ing user supplied strings in a sandboxed runtime (like a javascript vm or ruby vm). But I'd like to go further and only allow specific functions to be called. Any other scripting would be discarded. Such that:
puts("foo"); contains("yesterday")
would be illegal, since we don't expose or allow puts().
How can I constrain the execution environment to only run a whitelist of functions? Or is there a different approach to build this kind of external-facing DSL instead of trying to constrain an existing language to a subset of functions?
I would check out MPS by JetBrains if I were you; it's an open-source DSL creation tool. I have never used it myself, but from everything I have seen of it, it's very intuitive, and all of their other products are incredibly powerful.
Just because you're creating a DSL, that doesn't necessarily mean that you have to give the user the ability to enter the code in text.
The key to this is providing a list of method names and your special keyword for each of them (the funCode tag in the example below).
Create a mapping from keyword to code, let them define everything they need, and then use it. I would actually build my own XML parser so that it's not hackable, or at least not hackable through a known list of zero-day exploits:
<strDefs>
<strDef><strNam>sickStr</strNam>
<strText>sick</strText><strNum>01</strNum></strDef>
<strDef><strNam>pupStr</strNam>
<strText>puppy</strText><strNum>02</strNum></strDef>
</strDefs>
<funDefs>
<funDef><funCode>pfContainsStr</funCode><funLabel>contains</funLabel>
<funNum>01</funNum></funDef>
<funDef><funCode>pfXact</funCode><funLabel>exact_match</funLabel>
<funNum>02</funNum></funDef>
</funDefs>
<queries>
<query><fun>01</fun><str>02</str>
</query>
</queries>
The XML above represents the idea and the structure of the data rather than something users would type. They would enter it through a user interface, so the user is constrained. The user interface code that allows the data entry of the above data should be running on your server, and they only interact with it. Any code that runs in their browser is hackable, because they can just save the page, edit the HTML (and/or JavaScript), and run that, which is their code now, not yours anymore.
You can't really open the door (pandora's box) and allow just anyone to write just any code and have it evaluated / interpreted by the language parser, because some hacker is going to exploit it. You must lock down the strings, probably by having them enter them into your database in an earlier step, and each string gets its own token that YOU generate (a SQL Server primary key is very simple, usable, and secure), but give them a display representation so it's readable to them.
Then give them a list of methods / functions they can use, along with a token (a primary key can also serve here, perhaps with a kind of table prefix) and also a display representation (label).
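As a rough illustration of the "list of allowed functions" idea, here is a minimal Python sketch that parses a rule with the standard ast module and never evaluates it. The names ALLOWED and compile_rule are hypothetical, and the two checkers simply mirror the exact_match and contains examples from the question; a real system would add its own functions and argument rules.

```python
import ast

# Hypothetical whitelist: DSL label -> factory that builds an answer checker.
ALLOWED = {
    "exact_match": lambda expected: (lambda answer: answer == expected),
    "contains": lambda word: (lambda answer: word in answer),
}


def compile_rule(source):
    """Turn e.g. 'contains("yesterday")' into a predicate, or raise ValueError.

    The source is only parsed, never executed: anything other than one
    whitelisted call with a single string literal is rejected.
    """
    try:
        tree = ast.parse(source.strip(), mode="eval")
    except SyntaxError as exc:
        raise ValueError(f"not a single expression: {exc}") from None
    call = tree.body
    if not (isinstance(call, ast.Call) and isinstance(call.func, ast.Name)):
        raise ValueError("rule must be a single function call")
    if call.func.id not in ALLOWED:
        raise ValueError(f"function not allowed: {call.func.id}")
    if call.keywords or len(call.args) != 1:
        raise ValueError("exactly one positional argument expected")
    arg = call.args[0]
    if not (isinstance(arg, ast.Constant) and isinstance(arg.value, str)):
        raise ValueError("argument must be a string literal")
    return ALLOWED[call.func.id](arg.value)


is_correct = compile_rule('contains("yesterday")')
```

With this shape, puts("foo"); contains("yesterday") fails at the parse step because it is two statements, and puts("foo") alone fails the whitelist check.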
If you have them put all of their labels into yet another table, you can have SQL make sure that all of their labels are unique to each other in the whole "language", and then you can allow them to try to define their expressions in the language they want to use. This has the advantage that foreign languages can be used, but you don't have to do anything terribly special.
An important piece would be the verify button, that would translate their expression into unique tokens and back again, checking that the round-trip was successful. If it wasn't successful, there's some kind of ambiguity, and you might be able to allow them an option to use the list of tokens as the source in that case.
If you heavily rely on set-based logic for the underlying foundation of the language and your tables, you should be able to produce a coherent DSL that works. Many DSL creation problems are ones of integrity, where there are underlying assumptions that are contradictory, unintentionally mutually exclusive, or nonsensical. Truth is an unshakeable foundation. Anything else has a lie somewhere -- that you're trying to build on.
Sudoku is illustrative here. When you screw up a Sudoku, you often don't know that you have done so, and you keep building on that false foundation, until you get to the completion of the puzzle, and one whole string of assumptions disagrees with a different string of assumptions. They can't both be true. But you can't tell where you went wrong because you're too far away from the mistake and can not work backwards (easily). All steps taken look correct. A DSL, a database schema, and code, are all this way. Baby steps, that are double- and even triple-checked, and hopefully "correct by inspection", are the best way to "grow" a DSL, slowly, piece-by-piece. The best way to not have flaws is to not add them in the first place.
You don't want bugs in your DSL. Keep it spartan. KISS: Keep it simple, Spartacus! And I have personally found that keeping it set-based, if not overtly, then under the covers, accomplishes this very well.
Finally, to be able to think this way, I've studied languages for a long time, and have cultivated a curiosity about how languages have come to be. Books are a good quality source of information, as they have a higher quality level than the internet, which is nevertheless also an indispensable source. Some of my favorite languages: Forth, Factor, SETL, F#, C#, Visual FoxPro (especially for its embedded SQL), T-SQL, Common LISP, Clojure, and probably my favorite, Dylan, an INFIX Lisp without parentheses that Apple experimented with and abandoned, with a syntax that seems to me reminiscent of Pascal, which I sort of liked. The language list is actually much longer than that (and I haven't written code for many of them -- just studied them or their genesis), but that's enough for now.
One of my favorite books, and immensely interesting for the "people" side of it, is "Masterminds of Programming: Conversations with the Creators of Major Programming Languages" (O'Reilly, Theory in Practice series, 1st edition) by Federico Biancuzzi and Chromatic.
By the way, don't let them compromise the integrity of your DSL -- require that it is expressible set-based, and things should go well (IMHO). I hope it works out well for you. Add a comment to my answer telling me how it worked out, if you think of it. And don't forget to choose my answer if you think it's the best! We work hard for the money! ;-)

How should I make my parser concurrent?

I'm working on implementing a music programming language parser in Clojure. The idea is that you run the parser program with a text file as a command-line argument; the text file contains code in this music language I'm developing; the parser interprets the code and figures out what "instrument instances" have been declared, and for each instrument instance, it parses the code and returns a sequence of musical "events" (notes, chords, rests, etc.) that the instrument does. So before that last step, we have multiple strings of "music code," one string per instrument instance.
I'm somewhat new to Clojure and still learning the nuances of how to use reference types and threads/concurrency. My parser is going to be doing some complex parsing, so I figured it would benefit from using concurrency to boost performance. Here are my questions:
The simplest way to do this, it seems, would be to save the concurrency for after the instruments are "split up" by the initial parse (a single-thread operation), then parse each instrument's code on a different thread at the same time (rather than wait for each instrument to finish parsing before moving onto the next). Am I on the right track, or is there a more efficient and/or logical way to structure my "concurrency plan"?
What options do I have for how to implement this concurrent parsing, and which one might work the best, either from a performance or a code maintenance standpoint? It seems like it could be as simple as: (map #(future (process-music-code %)) instrument-instances), but I'm not sure if there is a better way to do it like with an agent, or manual threads via Java interop, or what. I'm new to concurrent programming, so any input on different ways to do this would be great.
From what I've read, it seems that Clojure's reference types play an important role in concurrent programming, and I can see why, but is it always necessary to use them when working with multiple threads? Should I worry about making some of my data mutable? If so, what in particular should be mutable in the code for the parser I'm writing? and what reference type(s) would be best suited for what I'm doing? The nature of the way my program will work (user runs the program with a text file as an argument -- program processes it and turns it into audio) makes it seem like I don't need anything to be mutable, since the input data never changes, so my gut tells me I won't need to use any reference types, but then again, I might not fully understand the relationship between reference types and concurrency in Clojure.
I would suggest that you might be distracting yourself from more important things (like working out the details of your music language) by premature optimization. It would be better to write the simplest, easiest-to-code parser which you can first, to get up and running. If you find it too slow, then you can look at how to optimize for better performance.
The parser should be fairly self-contained, and will probably not take a whole lot of code anyways, so even if you later throw it out and rewrite it, it will not be a big loss. And the experience of writing the first parser will help if and when you write the second one.
Other points:
You are absolutely right about reference types -- you probably won't need any. Your program is a compiler -- it takes input, transforms it, writes output, then exits. That is the ideal situation for pure functional programming, with nothing mutable and all flow of data going purely through function arguments and return values.
Using a parser generator is usually the quickest way to get a working parser, but I haven't found a really good parser generator for Clojure. Parsley has a really nice API, but it generates LR(0) parsers, which are almost useless for anything which does not have clear, unambiguous markers for the beginning/end of each "section". (Like the way S-expressions open and close with parens.) There are a couple parser combinator libraries out there, like squarepeg, but I don't like their APIs and prefer to write my own hand-coded, recursive-descent parsers using my own implementation of something like parser combinators. (They're not fast, but the code reads really well.)
I can only support Alex D's point that writing parsers is an excellent exercise. You should definitely do it in C one time. From my own experience, it's a lot of debugging training at least.
Aside from that, given that you are in the beautiful world of Clojure notice the following:
Your parser will transform ordinary strings to data structures, like
{:command :declare,
:args {:name "bazooka-violin",
...},
...}
In Clojure you can read such data structures easily from EDN files. Possibly it would be a more valuable approach to play around with finding suitable structures directly before you constrain the syntax of your language too much for it to be flexible for later changes in the way your language works.
Don't ever think about writing for performance. Unless your user describes the collected works of Bach in a file, it's unlikely that it will take more than a second to parse.
If you write your interpreter in a functional, modular and concise way, it should be easy to decompose it into steps that can be parallelized using various techniques from pmap to core.reducers. The same of course goes for all other code and your parser as well (if multi-threading is a necessity there).
Even Clojure is not compiled in parallel. However it supports recompilation (on the JVM) which in contrast is a way more valuable feature to think about.
As an aside, I've been reading The Joy of Clojure, and I just learned that there is a nifty clojure.core function called pmap (parallel map) that provides a nice, easy way to perform an operation in parallel on a sequence of data. Its syntax is just like map, but the difference is that it performs the function on each item of the sequence in parallel and returns a lazy sequence of the results! This can give a performance boost, but there is an inherent cost to coordinating the results, so whether or not pmap helps will depend on the situation.
At this stage in my MPL parser, my plan is to map a function over a sequence of instruments/music data, transforming each instrument's music data from a parse tree into audio. I have no idea how costly this transformation will be, but if it turns out that it takes a while to generate the audio for each instrument individually, I suppose I could try changing my map to pmap and see if that improves performance.
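For readers who don't know Clojure, the "parse each instrument independently, then collect the results" plan can be sketched in Python with concurrent.futures. This is only an analogue of the map/pmap idea discussed above, not Clojure code: process_music_code is a hypothetical stand-in for the real per-instrument parser, and parse_all is a name invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def process_music_code(code):
    """Hypothetical stand-in for the per-instrument parse step."""
    return code.split()


def parse_all(instrument_code):
    """Parse every instrument's code independently; results keep input order."""
    names = list(instrument_code)
    with ThreadPoolExecutor() as pool:
        # pool.map yields results in the same order as its inputs.
        events = pool.map(process_music_code, (instrument_code[n] for n in names))
        return dict(zip(names, events))


score = parse_all({"piano": "c d e", "cello": "g a"})
```

No shared mutable state is needed here: each task takes a string and returns a value. In CPython, threads will not speed up CPU-bound parsing; swapping in ProcessPoolExecutor is the usual option if measurement shows it matters.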

Clearing memory in different languages for security

When studying Java I learned that Strings were not safe for storing passwords, since you can't manually clear the memory associated with them (you can't be sure they will eventually be gc'ed, interned strings may never be, and even after gc you can't be sure the physical memory contents were really wiped). Instead, I was told to use char arrays, so I can zero them out after use. I've tried to search for similar practices in other languages and platforms, but so far I couldn't find the relevant info (usually all I see are code examples of passwords stored in strings with no mention of any security issue).
I'm particularly interested in the situation with browsers. I use jQuery a lot, and my usual approach is just to set the value of a password field to an empty string and forget about it:
$(myPasswordField).val("");
But I'm not 100% convinced it is enough. I also have no idea whether or not the strings used for intermediate access are safe (for instance, when I use $.ajax to send the password to the server). As for other languages, usually I see no mention of this issue (another language I'm interested in particular is Python).
I know questions attempting to build lists are controversial, but since this deals with a common security issue that is largely overlooked, IMHO it's worth it. If I'm mistaken, I'd be happy to know just for JavaScript (in browsers) and Python then. I was also unsure whether to ask here, at security.SE or at programmers.SE, but since it involves the actual code to safely perform the task (not a conceptual question) I believe this site is the best option.
Note: in low-level languages, or languages that unambiguously support characters as primitive types, the answer should be obvious (Edit: not really obvious, as @Gabe showed in his answer below). I'm asking about those high-level languages in which "everything is an object" or something like that, and also about those that perform automatic string interning behind the scenes (so you may create a security hole without realizing it, even if you're reasonably careful).
Update: according to an answer in a related question, even using char[] in Java is not guaranteed to be bulletproof (or .NET SecureString, for that matter), since the gc might move the array around so its contents might stick in the memory even after clearing (SecureString at least sticks in the same RAM address, guaranteeing clearing, but its consumers/producers might still leave traces).
I guess @NiklasB. is right: even though the vulnerability exists, the likelihood of an exploit is low and the difficulty of preventing it is high, which might be the reason this issue is mostly ignored. I wish I could find at least some reference to this problem concerning browsers, but googling for it has been fruitless so far (does this scenario at least have a name?).
The .NET solution to this is SecureString.
A SecureString object is similar to a String object in that it has a text value. However, the value of a SecureString object is automatically encrypted, can be modified until your application marks it as read-only, and can be deleted from computer memory by either your application or the .NET Framework garbage collector.
Note that even for low-level languages like C, the answer isn't as obvious as it seems. Modern compilers can determine that you are writing to the string (zeroing it out) but never reading the values you wrote, and just optimize away the zeroing. In order to prevent the security measure from being optimized away, Windows provides SecureZeroMemory.
For Python, there's no way to do that, according to this answer. A possibility would be using lists of characters (as length-1 strings or maybe code units as integers) instead of strings, so you can overwrite that list after use, but that would require all code that touches it to support this format (if even a single piece of it creates a string with its contents, it's over).
There is also a mention of a method using ctypes, but the link is broken, so I'm unaware of its contents. This other answer also refers to it, but there's not a lot of detail.
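The overwrite-after-use idea from the list-of-characters suggestion above can be sketched with a bytearray, which is mutable in place. This is only an illustration under stated assumptions: wipe is a hypothetical helper, and zeroing the buffer does nothing about copies made elsewhere (for example by calling decode(), which creates an immutable str), so it is not a guarantee that the secret is gone from memory.

```python
def wipe(buf):
    """Overwrite a mutable byte buffer in place with zeros."""
    for i in range(len(buf)):
        buf[i] = 0


secret = bytearray(b"hunter2")  # mutable, unlike str or bytes
alias = secret                  # a second reference to the same buffer
wipe(secret)
# Both names now see zeros, because the buffer was changed in place
# rather than replaced by a new object.
```

The caveat from the answer still applies: every piece of code that touches the password has to work with this buffer directly.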

Can anyone explain the design decisions behind Autolisp/visual lisp to me?

I wonder can anyone explain the design rationale behind the following features of autolisp / visual lisp? To me they seem to fly in the face of accepted software practice ... am I missing something?
All variables are global by default (ie unless placed after a / in the function arguments)
Reading/writing data from autocad requires putting stuff into an association list with lots of magic numbers. 10 means x/y coordinates, 90 means length of the coordinate list, 63 means colour, etc. Ok you could store these in some constants but that would mean yet more globals, and the documentation encourages you to use the magic numbers directly.
Lisp is a functional-style language, which encourages programming by recursion over iteration, but tail recursion is afaik not optimised in visual lisp leading to horrendous call stacks - unless, of course you iterate. But loop syntax is very restrictive; e.g. you can't break out of or return a value from a loop unless you put some kind of flag in the termination condition. Result, ugly code.
Generally you are forced to declare variables all over the place which flies in the face of functional programming - so why use a functional(-ish) language?
Lisp isn't a language, it's a group of sometimes surprisingly different languages. Scheme and Clojure are the functional members of the family. Common Lisp, and the more specialized breeds like Elisp aren't particularly functional and don't inherently encourage functional programming or recursion. CL in fact includes a very flexible object system, an extremely flexible iteration DSL, and doesn't guarantee optimized tail calls (Scheme dialects do, but not Lisps in general; that's the pitfall in thinking of "Lisp" as a single language).
Now that we have that cleared up, AutoLisp is an implementation from 1986 based on an early version of XLISP (the earliest of which was published in 1983).
The reason that it might fly in the face of currently accepted programming practice is that it predates currently accepted programming practice. Another thing to keep in mind is that the cheapest netbook available today is several hundred times more powerful than what a programmer could expect to have access to back in the mid 80s. Meaning that even if a given feature was accepted to be excellent, CPU or memory constraints may have prevented its implementation in a commercial language.
I've never programmed in Autolisp/Visual Lisp specifically, and the stuff you cite sounds bloody annoying, but it may have had some performance/memory advantage that justified it at the time.
If I remember correctly, AutoLisp is a fork from an early version of XLisp (some sources claim it was XLisp 1.0; see this C2 article).
XLisp 1.0 is a 1-cell lisp (functions and variables share the same name-space) with some rather odd oddities to it.
You can add dynamic scoping into the mix btw, and if you don't know what it is consider yourself lucky. But actually not all of your four points are that big of a deal IMO:
"Undeclared vars are created automatically as global." Same as in CL is it not (via setq)? The other option is to fail, and that's not a very attractive one for a language which is supposed to be used for quick-n-dirty scripting.
"magic numbers" are DXF codes, which, you're right, are a major inconvenience, as they sometimes change between ACAD versions (thankfully, rarely). That's just how it is. Fixing it would require a major overhaul, introducing some "schemas" and what not, and why would "they" bother? AutoLISP was left in its state as of approximately 1992, and never bothered with since. Visual LISP itself is an entirely different and much more capable system, but it is all locked away from the regular user, and only made to serve one goal: to emulate the old AutoLISP as faithfully as possible (except where it added new VBA-related features in the latter half of the 1990s, and was locked since then too).
(while (not done) ...) is not that ugly. There's no tail optimization guarantee, yes, just as there isn't one in CL and Haskell (that last one really stumps me: there's no guaranteed way to encode a loop in Haskell in constant space without monads, how about that?).
"you're forced to declare vars all over the place": here I do not follow you. You declare them where you're supposed to declare them, in the function's internal arguments list. What other places do you mean? I don't know of any.
In reality the biggest stumbling block of AutoLISP is its dynamic name resolution IMO, but that's how it was in XLisp, only a few years after Scheme first came out. Then there's also its immutable data store, but that was done mainly for simplicity of implementation, and to prevent too much confusion, and hence questions, from the user base, I guess.

What is the worst programming language you ever worked with? [closed]

If you have an interesting story to share, please post an answer, but do not abuse this question for bashing a language.
We are programmers, and our primary tool is the programming language we use.
While there is a lot of discussion about the best one, I'd like to hear your stories about the worst programming languages you ever worked with and I'd like to know exactly what annoyed you.
I'd like to collect these stories partly to avoid common pitfalls while designing a language (especially a DSL) and partly to avoid quirky languages in the future in general.
This question is not subjective. If a language supports only single character identifiers (see my own answer) this is bad in a non-debatable way.
EDIT
Some people have raised concerns that this question attracts trolls.
Wading through all your answers made one thing clear.
The large majority of answers is appropriate, useful and well written.
UPDATE 2009-07-01 19:15 GMT
The language overview is now complete, covering 103 different languages from 102 answers.
I decided to be lax about what counts as a programming language and included anything reasonable. Thank you David for your comments on this.
Here are all programming languages covered so far (alphabetical order, linked with answer, new entries in bold):
ABAP, all 20th century languages, all drag and drop languages, all proprietary languages, APF, APL (1), AS400, Authorware, Autohotkey, BancaStar, BASIC, Bourne Shell, Brainfuck, C++, Centura Team Developer, Cobol (1), Cold Fusion, Coldfusion, CRM114, Crystal Syntax, CSS, Dataflex 2.3, DB/c DX, dbase II, DCL, Delphi IDE, Doors DXL, DOS batch (1), Excel Macro language, FileMaker, FOCUS, Forth, FORTRAN, FORTRAN 77, HTML, Illustra web blade, Informix 4th Generation Language, Informix Universal Server web blade, INTERCAL, Java, JavaScript (1), JCL (1), karol, LabTalk, Labview, Lingo, LISP, Logo, LOLCODE, LotusScript, m4, Magic II, Makefiles, MapBasic, MaxScript, Meditech Magic, MEL, mIRC Script, MS Access, MUMPS, Oberon, object extensions to C, Objective-C, OPS5, Oz, Perl (1), PHP, PL/SQL, PowerDynamo, PROGRESS 4GL, prova, PS-FOCUS, Python, Regular Expressions, RPG, RPG II, Scheme, ScriptMaker, sendmail.conf, Smalltalk, Smalltalk, SNOBOL, SpeedScript, Sybase PowerBuilder, Symbian C++, System RPL, TCL, TECO, The Visual Software Environment, Tiny praat, TransCAD, troff, uBasic, VB6 (1), VBScript (1), VDF4, Vimscript, Visual Basic (1), Visual C++, Visual Foxpro, VSE, Webspeed, XSLT
The answers covering 80386 assembler, VB6 and VBScript have been removed.
PHP (In no particular order)
Inconsistent function names and argument orders
There are a zillion functions, each one of which seems to use a different naming convention and argument order. "Let's see... is it foo_bar or foobar or fooBar... and is it needle, haystack or haystack, needle?" The PHP string functions are a perfect example of this. Half of them use str_foo and the other half use strfoo.
Non-standard date format characters
Take j for example
In UNIX (which, by the way, is what everyone else uses as a guide for date string formats) %j returns the day of the year with leading zeros.
In PHP's date function j returns the day of the month without leading zeros.
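The clash is easy to see with Python's strftime, which follows the UNIX convention. The sketch below uses an arbitrary example date; the second value is computed directly from the day field, since it only illustrates what PHP's j denotes and does not call PHP.

```python
from datetime import date

d = date(2009, 2, 5)

# strftime convention: %j is the zero-padded day of the year.
day_of_year = d.strftime("%j")   # "036"

# What PHP's date('j') denotes: day of the month without leading zeros.
day_of_month = str(d.day)        # "5"
```

Same letter, two unrelated meanings, which is exactly the kind of surprise being complained about.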
Still No Support for Apache 2.0 MPM
It's not recommended.
Why isn't this supported? "When you make the underlying framework more complex by not having completely separate execution threads, completely separate memory segments and a strong sandbox for each request to play in, feet of clay are introduced into PHP's system." (link) So... it's not supported 'cause it makes things harder? 'Cause only the things that are easy are worth doing right? (To be fair, as Emil H pointed out, this is generally attributed to bad 3rd-party libs not being thread-safe, whereas the core of PHP is.)
No native Unicode support
Native Unicode support is slated for PHP6
I'm sure glad that we haven't lived in a global environment where we might have need to speak to people in other languages for the past, oh 18 years. Oh wait. (To be fair, the fact that everything doesn't use Unicode in this day and age really annoys me. My point is I shouldn't have to do any extra work to make Unicode happen. This isn't only a PHP problem.)
I have other beefs with the language. These are just some.
Jeff Atwood has an old post about why PHP sucks. He also says it doesn't matter. I don't agree but there we are.
XSLT.
XSLT is baffling, to begin with. The metaphor is completely different from anything else I know.
The thing was designed by a committee so deep in angle brackets that it comes off as a bizarre frankenstein.
The weird incantations required to specify the output format.
The built-in, invisible rules.
The odd bolt-on stuff, like scripts.
The dependency on XPath.
The tools support has been pretty slim, until lately. Debugging XSLT in the early days was an exercise in navigating in complete darkness. The tools change that but, still XSLT tops my list.
XSLT is weird enough that most people just ignore it. If you must use it, you need an XSLT Shaman to give you the magic incantations to make things go.
DOS Batch files. Not sure if this qualifies as programming language at all.
It's not that you can't solve your problems, but if you are used to bash...
Just my two cents.
Not sure if it's a true language, but I hate Makefiles.
Makefiles have meaningful differences between space and TAB, so even if two lines appear identical, they do not run the same.
Make also relies on a complex set of implicit rules for many languages, which are difficult to learn, but then are frequently overridden by the make file.
A Makefile system is typically spread over many, many files, across many directories.
With virtually no scoping or abstraction, a change to a makefile several directories away can prevent my source from building. Yet the error message is invariably a compilation error, not a meaningful error about make, or the makefiles.
Any environment I've worked in that uses makefiles successfully has a full-time Make expert. And all this to shave a few minutes off compilation??
The worst language I've ever seen comes from the tool praat, which is a good audio analysis tool. It does a pretty good job until you use the script language. Sigh, bad memories.
Tiny praat script tutorial for beginners
Function call
There are at least 3 different function calling syntaxes:
The regular one
string = selected("Strings")
Nothing special here, you assign to the variable string the result of the selected function. Not really scary... yet.
The "I'm invoking some GUI command with parameters"
Create Strings as file list... liste 'path$'/'type$'
As you can see, the function name starts at "Create" and finishes with the "...". The command "Create Strings as file list" is the text displayed on a button or a menu (I'm too scared to check) in praat. This command takes 2 parameters: liste and an expression. I'm going to look deeper into the expression 'path$'/'type$'
Hmm. Yep. No spaces. If spaces were introduced, it would be separate arguments. As you can imagine, parentheses don't work. At this point of the description I would like to point out the suffix of the variable names. I won't develop it in this paragraph, I'm just teasing.
The "Oh, but I want to get the result of the GUI command in my variable"
noliftt = Get number of strings
Yes we can see a pattern here, long and weird function name, it must be a GUI calling. But there's no '...' so no parameters. I don't want to see what the parser looks like.
The incredible type system (AKA Haskell and OCaml, praat is coming to you)
Simple native types
windowname$ = left$(line$,length(line$)-4)
So, what's going on there?
It's now time to look at the conventions and types of expressions, so here we have:
left$ :: (String, Int) -> String
length :: (String) -> Int
windowname$ :: String
line$ :: String
As you can see, variable names and function names are suffixed with their type or return type. If the suffix is a '$', then it returns a string or is a string. If there is nothing, it's a number. I can see the point of prefixing the type to a variable to ease implementation, but to suffix, no sorry, I can't.
Array type
To show the array type, let me introduce a 'tiny' loop :
for i from 1 to 4
Select... time time
bandwidth'i'$ = Get bandwidth... i
forhertz'i'$ = Get formant... i
endfor
We have i, which is a number, and... (no, it's not a function)
bandwidth'i'$
What it does is create string variables: bandwidth1$, bandwidth2$, bandwidth3$, bandwidth4$ and give them values. As you can expect, you can't create a two-dimensional array this way; you must do something like this:
band2D__'i'__'j'$
The special string invocation
outline$ = "'time'#F'i':'forhertznum'Hz,'bandnum'Hz, 'spec''newline$'"
outline$ >> 'outfile$'
Strings are handled weirdly (at least) in the language. The '' is used to insert the value of a variable inside the enclosing "" string. This is _weird_. It goes against all the conventions built into many languages, from bash to PHP by way of PowerShell. And look, it even has redirection. Don't be fooled, it doesn't work like in your beloved shell. No, you have to get the variable value with the ''.
Da Wonderderderfulful execution model
I'm going to put the final touch to this wonderderderfulful presentation by talking to you about the execution model. As in every procedural language, instructions are executed from top to bottom; there are the variables, and there is the praat GUI. That is, you code everything against the praat GUI: you invoke the commands written on its menus and buttons.
The main window of praat contains a list of items which can be:
files
list of files (created by a function with a wonderderfulful long long name)
Spectrogram
Strings (don't ask)
So if you want to perform an operation on a given file, you must select the file in the list programmatically and then push the different buttons to take some actions. If you want to pass parameters to a GUI action, you have to follow the GUI layout of the form for your arguments. For example, "To Spectrogram... 0.005 5000 0.002 20 Gaussian" is written that way because it follows the layout of the corresponding GUI form.
Needless to say, my nightmares are filled with praat scripts dancing around me and shouting "DEBUG MEEEE!!".
More information at the praat site, under the well-named section "easy programmable scripting language"
Well since this question refuses to die and since the OP did prod me into answering...
I humbly proffer for your consideration Authorware (AW) as the worst language it is possible to create. (n.b. I'm going off recollection here, it's been ~6 years since I used AW, which of course means there's a number of awful things I can't even remember)
the horror, the horror http://img.brothersoft.com/screenshots/softimage/a/adobe_authorware-67096-1.jpeg
Let's start with the fact that it's a Macromedia product (-10 points), a proprietary language (-50 more) primarily intended for creating e-learning software and moreover software that could be created by non-programmers and programmers alike implemented as an iconic language AND a text language (-100).
Now if that last statement didn't scare you then you haven't had to fix WYSIWYG generated code before (hello Dreamweaver and Frontpage devs!), but the salient point is that AW had a library of about 12 or so elements which could be dragged into a flow. Like "Page" elements, Animations, IFELSE, and GOTO (-100). Of course removing objects from the flow created any number of broken connections and artifacts which the IDE had variable levels of success coping with. Naturally the built in wizards (-10) were a major source of these.
Fortunately you could always step into a code view, and eventually you'd have to because with a limited set of iconic elements some things just weren't possible otherwise. The language itself was based on TUTOR (-50) - a candidate for worst language itself if only it had the ambition and scope to reach the depths AW would strive for - about which wikipedia says:
...the TUTOR language was not easy to
learn. In fact, it was even suggested
that several years of experience with
the language would be required before
programmers could build programs worth
keeping.
An excellent foundation then, which was built upon in the years before the rise of the internet with exactly nothing. Absolutely no form of data structure beyond an array (-100), certainly no sugar (real men don't use switch statements?) (-10), and a large splash of syntactic vinegar ("--" was the comment indicator so no decrement operator for you!) (-10). Language reference documentation was provided in paper or zip file formats (-100), but at least you had the support of the developer run usegroup and could quickly establish the solution to your problem was to use the DLL or SWF importing features of AW to enable you to do the actual coding in a real language.
AW was driven by a flow (with necessary PAUSE commands) and therefore has all the attendant problems of a linear rather than event based system (-50), and despite the outright marketing lies of the documentation it was not object oriented (-50) either. All code reuse was achieved through GOTO. No scope, lots of globals (-50).
It's not the language's fault directly, but obviously no source control integration was possible, and certainly no TDD, documentation generation or any other add-on tool you might like.
Of course Macromedia met the challenge of the internet head on with a stubborn refusal to engage for years, eventually producing the buggy, hard to use, security nightmare which is Shockwave (-100) to essentially serialise desktop versions of the software through a required plugin (-10). As HTML rose, AW stagnated, still persisting with its Shockwave delivery even in the face of IEEE SCORM javascript standards.
Ultimately after years of begging and promises Macromedia announced a radical new version of AW in development to address these issues, and a few years later offshored the development and then cancelled the project. Although of course Macromedia are still selling it (EVIL BONUS -500).
If anything else needs to be said, this is the language which allows spaces in variable names (-10000).
If you ever want to experience true pain, try reading somebody else's uncommented hungarian notation in a language which isn't case sensitive and allows variable name spaces.
Total Annakata Arbitrary Score (AAS): -11300
Adjusted for personal experience: OutOfRangeException
(apologies for length, but it was cathartic)
Seriously: Perl.
It's just a pain in the ass to code with, for beginners and even for semi-professionals who work with Perl on a daily basis. I constantly see my colleagues struggle with the language, building the worst scripts, like 2000 lines with no regard for any well-accepted coding standard. It's the worst mess I've ever seen in programming.
Now, you can always say that those people are bad at coding (despite the fact that some of them have used Perl for a lot of years now), but the language just encourages all that freaking shit that makes me scream when I have to read a script by some other guy.
MS Access Visual Basic for Applications (VBA) was also pretty bad. Access was bad altogether in that it forced you down a weak paradigm and was deceptively simple to get started, but a nightmare to finish.
No answer about Cobol yet? :O
Old-skool BASICs with line numbers would be my choice. When you had no space between line numbers to add new lines, you had to run a renumber utility, which caused you to lose any mental anchors you had to what was where.
As a result, you ended up squeezing in too many statements on a single line (separated by colons), or you did a goto or gosub somewhere else to do the work you couldn't cram in.
MUMPS
I worked in it for a couple years, but have done a complete brain dump since then. All I can really remember was no documentation (at my location) and cryptic commands.
It was horrible. Horrible! HORRIBLE!!!
There are just two kinds of languages: the ones everybody complains about and the ones nobody uses.
Bjarne Stroustrup
I haven't yet worked with many languages and deal mostly with scripting languages; out of these VBScript is the one I like least. Although it has some handy features, some things really piss me off:
Object assignments are made using the Set keyword:
Set foo = Nothing
Omitting Set is one of the most common causes of run-time errors.
No such thing as structured exception handling. Error checking is like this:
On Error Resume Next
' Do something
If Err.Number <> 0 Then
' Handle error
Err.Clear
End If
' And so on
Enclosing the procedure call parameters in parentheses requires using the Call keyword:
Call Foo (a, b)
Its English-like syntax is way too verbose. (I'm a fan of curly braces.)
Logical operators do not short-circuit. If you need to test a compound condition where the subsequent condition relies on the success of the previous one, you need to put the conditions into separate If statements.
Lack of parameterized class constructors.
To wrap a statement into several lines, you have to use an underscore:
str = "Hello, " & _
"world!"
Lack of multiline comments.
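The complaint about logical operators above can be illustrated with a short sketch. It is written in Python, not VBScript, and the two function names are made up for this example. The first relies on short-circuit evaluation; the second emulates a language that evaluates both operands before combining them, which is the behaviour the list attributes to VBScript.

```python
def first_is_positive(items):
    # `and` short-circuits: items[0] is only evaluated when the list is non-empty.
    return len(items) > 0 and items[0] > 0


def first_is_positive_eager(items):
    # Emulates non-short-circuit evaluation: both operands are computed up front.
    left = len(items) > 0
    right = items[0] > 0  # raises IndexError on an empty list
    return left and right


print(first_is_positive([]))  # False

try:
    first_is_positive_eager([])
except IndexError:
    print("the eager version fails on an empty list")
```

Without short-circuiting, the only safe option is the nested form: test the first condition in one If, and the dependent condition in a second If inside it.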
Edit: found this article: The Flangy Guide to Hating VBScript. The author sums up his complaints as "VBS isn't Python" :)
Objective-C.
The annotations are confusing, using brackets to call methods still does not compute in my brain, and what is worse is that all of the library functions from C are called using the standard operators in C, -> and ., and it seems like the only company that is driving this language is Apple.
I admit I have only used the language when programming for the iPhone (and looking into programming for OS X), but it feels as if C++ were merely forked, adding in annotations and forcing the implementation and the header files to be separate would make much more sense.
PROGRESS 4GL (apparently now known as "OpenEdge Advanced Business Language").
PROGRESS is both a language and a database system. The whole language is designed to make it easy to write crappy green-screen data-entry screens. (So start by imagining how well this translates to Windows.) Anything fancier than that, whether pretty screens, program logic, or batch processing... not so much.
I last used version 7, back in the late '90s, so it's vaguely possible that some of this is out-of-date, but I wouldn't bet on it.
It was originally designed for text-mode data-entry screens, so on Windows, all screen coordinates are in "character" units, which are some weird number of pixels wide and a different number of pixels high. But of course they default to a proportional font, so the number of "character units" doesn't correspond to the actual number of characters that will fit in a given space.
No classes or objects.
No language support for arrays or dynamic memory allocation. If you want something resembling an array, you create a temporary in-memory database table, define its schema, and then get a cursor on it. (I saw a bit of code from a later version, where they actually built and shipped a primitive object-oriented system on top of these in-memory tables. Scary.)
ISAM database access is built in. (But not SQL. Who needs it?) If you want to increment the Counter field in the current record in the State table, you just say State.Counter = State.Counter + 1. Which isn't so bad, except...
When you use a table directly in code, then behind the scenes, they create something resembling an invisible, magic local variable to hold the current cursor position in that table. They guess at which containing block this cursor will be scoped to. If you're not careful, your cursor will vanish when you exit a block, and reset itself later, with no warning. Or you'll start working with a table and find that you're not starting at the first record, because you're reusing the cursor from some other block (or even your own, because your scope was expanded when you didn't expect it).
Transactions operate on these wild-guess scopes. Are we having fun yet?
Everything can be abbreviated. For some of the offensively long keywords, this might not seem so bad at first. But if you have a variable named Index, you can refer to it as Index or as Ind or even as I. (Typos can have very interesting results.) And if you want to access a database field, not only can you abbreviate the field name, but you don't even have to qualify it with the table name; they'll guess the table too. For truly frightening results, combine this with:
Unless otherwise specified, they assume everything is a database access. If you access a variable you haven't declared yet (or, more likely, if you mistype the variable name), there's no compiler error: instead, it goes looking for a database field with that name... or a field that abbreviates to that name.
The guessing is the worst. Between the abbreviations and the field-by-default, you could get some nasty stuff if you weren't careful. (Forgot to declare I as a local variable before using it as a loop variable? No problem, we'll just randomly pick a table, grab its current record, and completely trash an arbitrarily-chosen field whose name starts with I!)
Then add in the fact that an accidental field-by-default access could change the scope it guessed for your tables, thus breaking some completely unrelated piece of code. Fun, yes?
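To show why abbreviation plus field-by-default is so dangerous, here is a toy resolver in Python. It is purely an illustration of the behaviour described above, not PROGRESS's actual lookup algorithm; the resolve function, the variable set, and the field names are all hypothetical.

```python
def resolve(name, variables, fields):
    """Hypothetical resolver: a declared variable wins; otherwise fall back
    to any database field whose name merely starts with `name`."""
    if name in variables:
        return ("variable", name)
    matches = sorted(f for f in fields if f.lower().startswith(name.lower()))
    if matches:
        # No error and no warning: the first candidate is silently chosen.
        return ("field", matches[0])
    raise NameError(name)


fields = ["Invoice", "Id", "Counter"]

print(resolve("I", {"I"}, fields))  # ('variable', 'I')
print(resolve("I", set(), fields))  # ('field', 'Id')
```

Forgetting the declaration changes the meaning of I from a local loop counter to a write into a database field, with nothing to flag the difference.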
They also have a reporting system built into the language, but I have apparently repressed all memories of it.
When I got another job working with Netscape LiveWire (an ill-fated attempt at server-side JavaScript) and classic ASP (VBScript), I was in heaven.
The worst language? BancStar, hands down.
3,000 predefined variables, all numbered, all global. No variable declaration, no initialization. Half of them, scattered over the range, reserved for system use, but you can use them at your peril. A hundred or so are automatically filled in as a result of various operations, and no list of which ones those are. They all fit in 38k bytes, and there is no protection whatsoever for buffer overflow. The system will cheerfully let users put 20 bytes in a ten byte field if you declared the length of an input field incorrectly. The effects are unpredictable, to say the least.
This is a language that will let you declare a calculated gosub or goto; due to its limitations, this is frequently necessary. Conditionals can be declared forward or reverse. Picture an "If" statement that terminates 20 lines before it begins.
The return stack is very shallow, (20 Gosubs or so) and since a user's press of any function key kicks off a different subroutine, you can overrun the stack easily. The designers thoughtfully included a "Clear Gosubs" command to nuke the stack completely in order to fix that problem and to make sure you would never know exactly what the program would do next.
There is much more. Tens of thousands of lines of this Lovecraftian horror.
The .bat file scripting language on DOS/Windows. God only knows how un-powerful this one is, especially if you compare it to the Unix shell languages (that aren't so powerful either, but way better nonetheless).
Just try to concatenate two strings or make a for loop. Nah.
VSE, The Visual Software Environment.
This is a language that a prof of mine (Dr. Henry Ledgard) tried to sell us on back in undergrad/grad school. (I don't feel bad about giving his name because, as far as I can tell, he's still a big proponent and would welcome the chance to convince some folks it's the best thing since sliced bread). When describing it to people, my best analogy is that it's sort of a bastard child of FORTRAN and COBOL, with some extra bad thrown in. From the only really accessible folder I've found with this material (there's lots more in there that I'm not going to link specifically here):
VSE Overview (pdf)
Chapter 3: The VSE Language (pdf) (Not really an overview of the language at all)
Appendix: On Strings and Characters (pdf)
The Software Survivors (pdf) (Fevered ramblings attempting to justify this turd)
VSE is built around what they call "The Separation Principle". The idea is that Data and Behavior must be completely segregated. Imagine C's requirement that all variables/data must be declared at the beginning of the function, except now move that declaration into a separate file that other functions can use as well. When other functions use it, they're using the same data, not a local copy of data with the same layout.
Why do things this way? We learn from The Software Survivors that Variable Scope Rules Are Hard. I'd include a quote but, like most fools, it takes these guys forever to say anything. Search that PDF for "Quagmire Of Scope" and you'll discover some true enlightenment.
They go on to claim that this somehow makes it more suitable for multi-proc environments because it more closely models the underlying hardware implementation. Riiiight.
Another choice theme that comes up frequently:
INCREMENT DAY COUNT BY 7 (or DAY COUNT = DAY COUNT + 7)
DECREMENT TOTAL LOSS BY GROUND_LOSS
ADD 100.3 TO TOTAL LOSS(LINK_POINTER)
SET AIRCRAFT STATE TO ON_THE_GROUND
PERCENT BUSY = (TOTAL BUSY CALLS * 100)/TOTAL CALLS
Although not earthshaking, the style
of arithmetic reflects ordinary usage,
i.e., anyone can read and understand
it - without knowing a programming
language. In fact, VisiSoft arithmetic
is virtually identical to FORTRAN,
including embedded complex arithmetic.
This puts programmers concerned with
their professional status and
corresponding job security ill at
ease.
Ummm, not that concerned at all, really. One of the key selling points that Bill Cave uses to try to sell VSE is the democratization of programming so that business people don't need to indenture themselves to programmers who use crazy, arcane tools for the sole purpose of job security. He leverages this irrational fear to sell his tool. (And it works-- the federal gov't is his biggest customer). I counted 17 uses of the phrase "job security" in the document. Examples:
... and fit only for those desiring artificial job security.
More false job security?
Is job security dependent upon ensuring the other guy can't figure out what was done?
Is job security dependent upon complex code...?
One of the strongest forces affecting the acceptance of new technology is the perception of one's job security.
He uses this paranoia to drive a wedge between the managers holding the purse strings and the technical people who have the knowledge to recognize VSE for the turd that it is. This is how he squeezes it into companies-- "Your technical people are only saying it sucks because they're afraid it will make them obsolete!"
A few additional choice quotes from the overview documentation:
Another consequence of this approach
is that data is mapped into memory
on a "What You See Is What You Get"
basis, and maintained throughout.
This allows users to move a complete
structure as a string of characters
into a template that describes each
individual field. Multiple templates
can be redefined for a given storage
area. Unlike C and other languages,
substructures can be moved without the problems of misalignment due to
word boundary alignment standards.
Now, I don't know about you, but I know that a WYSIWYG approach to memory layout is at the top of my priority list when it comes to language choice! Basically, they ignore alignment issues because only old languages that were designed in the '60's and '70's care about word alignment. Or something like that. The reasoning is bogus. It made so little sense to me that I proceeded to forget it almost immediately.
There are no user-defined types in VSE. This is a far-reaching
decision that greatly simplifies the
language. The gain from a practical
point of view is also great. VSE
allows the designer and programmer to
organize a program along the same
lines as a physical system being
modeled. VSE allows structures to be
built in an easy-to-read, logical
attribute hierarchy.
Awesome! User-defined types are lame. Why would I want something like an InputMessage object when I can have:
LINKS_IN_USE INTEGER
INPUT_MESSAGE
1 ORIGIN INTEGER
1 DESTINATION INTEGER
1 MESSAGE
2 MESSAGE_HEADER CHAR 10
2 MESSAGE_BODY CHAR 24
2 MESSAGE_TRAILER CHAR 10
1 ARRIVAL_TIME INTEGER
1 DURATION INTEGER
1 TYPE CHAR 5
OUTPUT_MESSAGE CHARACTER 50
You might look at that and think, "Oh, that's pretty nicely formatted, if a bit old-school." Old-school is right. Whitespace is significant-- very significant. And redundant! The 1's must be in column 3. The 1 indicates that it's at the first level of the hierarchy. The Symbol name must be in column 5. Your hierarchies are limited to a depth of 9.
Well, ok, but is that so awful? Just wait:
It is well known that for reading
text, use of conventional upper/lower
case is more readable. VSE uses all
upper case (except for comments). Why?
The literature in psychology is based
on prose. Programs, simply, are not
prose. Programs are more like math,
accounting, tables. Program fonts
(usually Courier) are almost
universally fixed-pitch, and for good
reason – vertical alignment among
related lines of code. Programs in
upper case are nicely readable, and,
after a time, much better in our
opinion
Nothing like enforcing your opinion at the language level! That's right, you cannot use any lower case in VSE unless it's in a comment. Just keep your CAPSLOCK on, it's gonna be stuck there for a while.
VSE subprocedures are called processes. This code sample contains three processes:
PROCESS_MUSIC
EXECUTE INITIALIZE_THE_SCENE
EXECUTE PROCESS_PANEL_WIDGET
INITIALIZE_THE_SCENE
SET TEST_BUTTON PANEL_BUTTON_STATUS TO ON
MOVE ' ' TO TEST_INPUT PANEL_INPUT_TEXT
DISPLAY PANEL PANEL_MUSIC
PROCESS_PANEL_WIDGET
ACCEPT PANEL PANEL_MUSIC
*** CHECK FOR BUTTON CLICK
IF RTG_PANEL_WIDGET_NAME IS EQUAL TO 'TEST_BUTTON'
MOVE 'I LIKE THE BEATLES!' TO TEST_INPUT PANEL_INPUT_TEXT.
DISPLAY PANEL PANEL_MUSIC
All caps as expected. After all, that's easier to read. Note the whitespace. It's significant again. All process names must start in column 0. The initial level of instructions must start on column 4. Deeper levels must be indented exactly 3 spaces. This isn't a big deal, though, because you aren't allowed to do things like nest conditionals. You want a nested conditional? Well just make another process and call it. And note the delicious COBOL-esque syntax!
You want loops? Easy:
EXECUTE NEXT_CALL
EXECUTE NEXT_CALL 5 TIMES
EXECUTE NEXT_CALL TOTAL CALL TIMES
EXECUTE NEXT_CALL UNTIL NO LINES ARE AVAILABLE
EXECUTE NEXT_CALL UNTIL CALLS_ANSWERED ARE EQUAL TO CALLS_WAITING
EXECUTE READ_MESSAGE UNTIL LEAD_CHARACTER IS A DELIMITER
Ugh.
Here is the contribution to my own question:
Origin LabTalk
My all-time favourite in this regard is Origin LabTalk.
In LabTalk the maximum length of a string variable identifier is one character.
That is, there are only 26 string variables at all. Even worse, some of them are used by Origin itself, and it is not clear which ones.
From the manual:
LabTalk uses the % notation to define
a string variable. A legal string
variable name must be a % character
followed by a single alphabetic
character (a letter from A to Z).
String variable names are
case-insensitive. Of all the 26 string
variables that exist, Origin itself
uses 14.
Doors DXL
The second worst, in my opinion, is Doors DXL.
Programming languages can be divided into two groups:
Those with manual memory management (e.g. delete, free) and those with a garbage collector.
Some languages offer both, but DXL is probably the only language in the world that
supports neither. OK, to be honest this is only true for strings, but hey, strings aren't exactly
the most rarely used data type in requirements engineering software.
The consequence is that memory used by a string can never be reclaimed and
DOORS DXL leaks like a sieve.
There are countless other quirks in DXL, just to name a few:
DXL function syntax
DXL arrays
Cold Fusion
I guess it's good for designers but as a programmer I always felt like one hand was tied behind my back.
The worst two languages I've worked with were APL, which is relatively well known for languages of its age, and TECO, the language in which the original Emacs was written. Both are notable for their terse, inscrutable syntax.
APL is an array processing language; it's extremely powerful, but nearly impossible to read, since every character is an operator, and many don't appear on standard keyboards.
TECO had a similar look, and for a similar reason. Most characters are operators, and this special purpose language was devoted to editing text files. It was a little better, since it used the standard character set. And it did have the ability to define functions, which was what gave life to emacs--people wrote macros, and only invoked those after a while. But figuring out what a program did or writing a new one was quite a challenge.
LOLCODE:
HAI
CAN HAS STDIO?
VISIBLE "HAI WORLD!"
KTHXBYE
Seriously, the worst programming language ever is that of Makefiles. Totally incomprehensible, tabs have a syntactic meaning and not even a debugger to find out what's going on.
I'm not sure if you meant to include scripting languages, but I've seen TCL (which is also annoying), but... the mIRC scripting language annoys me to no end.
Because of some oversight in the parsing, it's whitespace significant when it's not supposed to be. Conditional statements will sometimes be executed when they're supposed to be skipped because of this. Opening a block statement cannot be done on a separate line, etc.
Other than that it's just full of messy, inconsistent syntax that was probably designed that way to make very basic stuff easy, but at the same time makes anything a little more complex barely readable.
I lost most of my mIRC scripts, or I could have probably found some good examples of what a horrible mess it forces you to create :(
Regular expressions
It's a write only language, and it's hard to verify if it works correctly for the right inputs.
Visual Foxpro
I can't believe nobody has said this one:
LotusScript
I think it is far worse than PHP, at least.
It is not about the language itself, which follows a syntax similar to Visual Basic; it is the fact that it seems to have a lot of functions for extremely useless things that you will never (or one in a million times) use, but lacks support for things you will use every day.
I don't remember any concrete example but they were like:
"Ok, I have an event to check whether the mouse pointer is in the upper corner of the form and I don't have a double-click event for the Form!!?? WTF??"
Twice I've had to work in 'languages' where you drag-n-dropped modules onto the page and linked them together with lines to show data flow. (One claimed to be a RDBMs, and the other a general purpose data acquisition and number crunching language.)
Just thinking of it makes me want to throttle someone. Or puke. Or both.
Worse, neither exposed a text language that you could hack directly.
I find myself avoiding VBScript/Visual Basic 6 the most.
I use primarily C++, javascript, Java for most tasks and dabble in ruby, scala, erlang, python, assembler, perl when the need arises.
I, like most other reasonably minded polyglots/programmers, strongly feel that you have to use the right tool for the job - this requires you to understand your domain and to understand your tools.
My issue with VBScript and VB6 is that when I use them to script Windows or Office applications (the right domain for them), I find myself struggling with the language (they fall short of being the right tool).
VBScript's lack of easy-to-use native data structures (such as associative containers/maps) and other quirks (such as Set for assignment to objects) is a needless and frustrating annoyance, especially for a scripting language. Contrast it with JavaScript (which I now use to program wscript/cscript Windows and ActiveX automation scripts), which is far more expressive. While there are certain things that work better with VBScript (passing arrays back and forth from COM objects is slightly easier, although it is easier to pass event handlers into COM components with JScript), I am still surprised by the number of coders that still use VBScript to script Windows. I bet if they wrote the same program in both languages they would find that JScript works with you much more than VBScript, because of JScript's native hash data types and closures.
Vb6/VBA, though a little better than vbscript in general, still has many similar issues where (for their domain) they require much more boiler plate to do simple tasks than what I would like and have seen in other scripting languages.
In 25+ years of computer programming, by far the worst thing I've ever experienced was a derivative of MUMPS called Meditech Magic. It's much more evil than PHP could ever hope to be.
It doesn't even use '=' for assignment! 100^b assigns a value of 100 to b and is read as "100 goes to b". Basically, this language invented its own syntax from top to bottom. So no matter how many programming languages you know, Magic will be a complete mystery to you.
Here is 100 bottles of beer on the wall written in this abomination of a language:
BEERv1.1,
100^b,T("")^#,DO{b'<1 NN(b,"bottle"_IF{b=1 " ";"s "}_"of beer on the wall")^#,
N(b,"bottle"_IF{b=1 " ";"s "}_"of beer!")^#,
N("You take one down, pass it around,")^#,b-1^b,
N(b,"bottle"_IF{b=1 " ";"s "}_"of beer on the wall!")^#},
END;
TCL. It only compiles code right before it executes, so it's possible that if your code never went down branch A while testing, and one day, in the field it goes down branch A, it could have a SYNTAX ERROR!
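The failure mode is easy to imitate. The sketch below is Python, not Tcl, and it does not model Tcl's interpreter; the branches table and the run function are invented for this illustration. Each branch's source is only compiled at the moment it is reached, so a syntax error in a rarely taken branch stays hidden until that branch finally runs.

```python
# Source text for each branch is stored as a string and compiled lazily.
branches = {
    "common": "result = 1 + 1",
    "rare": "result = (1 + ",  # syntax error, only noticed if this branch runs
}


def run(branch):
    scope = {}
    # Compilation happens here, right before execution, not at load time.
    exec(compile(branches[branch], f"<{branch}>", "exec"), scope)
    return scope["result"]


print(run("common"))  # 2

try:
    run("rare")
except SyntaxError:
    print("syntax error discovered only when the rare branch finally runs")
```

Testing that never exercises the rare branch will pass cleanly, which is exactly the trap described above.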

Resources