How should I manage side effects in a new language design?


So I'm currently working on a new programming language. Inspired by ideas from concurrent programming and Haskell, one of the primary goals of the language is management of side effects. More or less, each module will be required to specify which side effects it allows. So, if I were making a game, the graphics module would have no ability to do IO. The input module would have no ability to draw to the screen. The AI module would be required to be totally pure. Scripts and plugins for the game would have access to a very restricted subset of IO for reading configuration files. Et cetera.
However, what constitutes a side effect isn't clear cut. I'm looking for any thoughts or suggestions on the subject that I might want to consider in my language. Here are my current thoughts.
Some side effects are blatant. Whether it's printing to the user's console or launching your missiles, any action that reads or writes a user-owned file or interacts with external hardware is a side effect.
Others are more subtle and these are the ones I'm really interested in. These would be things like getting a random number, getting the system time, sleeping a thread, implementing software transactional memory, or even something very fundamental such as allocating memory.
Unlike other languages built to control side effects (looking at you Haskell), I want to design my language to be pragmatic and practical. The restrictions on side effects should serve two purposes:
To aid in the separation of concerns. (No one module can do everything).
To sandbox each module in the application. (Any module could be used as a plugin)
With that in mind, how should I handle "pseudo"-side effects, like random numbers and sleeping, as I mention above? What else might I have missed? In what ways might I manage memory usage and time as resources?

The problem of how to describe and control effects is currently occupying some of the best scientific minds in programming languages, including people like Greg Morrisett of Harvard University. To my knowledge, the most ambitious pioneering work in this area was done by David Gifford and Pierre Jouvelot in the FX programming language started in 1987. The language definition is online, but you may get more insight into the ideas by reading their 1991 POPL paper.

This is a really interesting question, and it represents one of the stages I've gone through and, frankly, moved beyond.
I remember seminars in which Carl Hewitt, in talking about his Actors formalism, discussed this. He defined it in terms of a method giving a response that was solely a function of its arguments, or that could give different answers at different times.
I say I moved beyond this because it makes the language itself (or the computational model) the main subject, as opposed to the problem(s) it is supposed to solve. It is based on the idea that the language should have a formal underlying model so that its properties are easy to verify. That is fine, but still remains a distant goal, because there is still no language (to my knowledge) in which the correctness of something as simple as bubble sort is easy to prove, let alone more complex systems.
The above is a fine goal, but the direction I went was to look at information systems in terms of information theory. Specifically, assuming a system starts with a corpus of requirements (on paper or in somebody's head), those requirements can be transmitted to a program-writing machine (whether automatic or human) to generate source code for a working implementation. THEN, as changes occur to the requirements, the changes are processed through as delta changes to the implementation source code.
Then the question is: What properties of the source code (and the language it is encoded in) facilitate this process? Clearly it depends on the type of problem being solved, what kinds of information go in and out (and when), how long the information has to be retained, and what kind of processing needs to be done on it. From this one can determine the formal level of the language needed for that problem.
I realized the process of cranking through delta changes of requirements to source code is made easier as the format of the code comes more to resemble the requirements, and there is a nice quantitative way to measure this resemblance, not in terms of superficial resemblance, but in terms of editing actions. The well-known technology that best expresses this is domain specific languages (DSL). So I came to realize that what I look for most in a general-purpose language is the ability to create special-purpose languages.
Depending on the application, such special-purpose languages may or may not need specific formal features like functional notation, side-effect control, parallelism, etc. In fact, there are many ways to make a special-purpose language, from parsing, interpreting, compiling, down to just macros in an existing language, down to simply defining classes, variables, and methods in an existing language. As soon as you declare a variable or subroutine you've created new vocabulary and thus, a new language in which to solve your problem. In fact, in this broad sense, I don't think you can solve any programming problem without being, at some level, a language designer.
So best of luck, and I hope it opens up new vistas for you.

A side effect is having any effect on anything in the world other than returning a value, i.e. mutating something that could be visible in some way outside the function.
A pure function neither depends on nor affects any mutable state outside the scope of that invocation of the function, which means that the function's output depends only on constants and its inputs. This implies that if you call a function twice with the same arguments, you are guaranteed to get the same result both times, regardless of how the function is written.
If you have a function that modifies a variable that it has been passed, that modification is a side effect because it's visible output from the function other than the return value. A void function that is not a no-op must have side effects, because it has no other way of affecting the world.
The function could have a private variable only visible to that function that it reads and modifies, and calling it would still have the side effect of changing the way the function behaves in the future. Being pure means having exactly one channel for output of any kind: the return value.
It is possible to generate random numbers purely, but you have to pass around the random seed manually. Most random functions keep a private seed value that is updated each time it's called so that you get a different random number each time. Here's a Haskell snippet using System.Random:
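import System.Random (StdGen, random, randomR)
-- Assumes Color has a Random instance.
randomColor :: StdGen -> (Color, Int, StdGen)
randomColor gen1 = (color, intensity, gen3)  -- return the newest generator
  where (color, gen2)     = random gen1
        (intensity, gen3) = randomR (1, 100) gen2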
The random functions each return the randomized value and a new generator with a new seed (based on the previous one). To get a new value each time, the chain of new generators (gen1, gen2, gen3) has to be passed along. Implicit generators just use an internal variable to store the gen1.. values in the background.
Doing this manually is a pain, and in Haskell you can use a state monad to make it a lot easier. You'll want to implement something less pure or use a facility like monads, arrows or uniqueness types to abstract it away.
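The same seed-threading idea can be sketched outside Haskell. The Python below is only an illustration, not a port of System.Random: next_int and random_color are hypothetical names, and the generator is a toy linear congruential one whose constants were picked for the example.

```python
from typing import Tuple

# Toy linear congruential generator constants (an assumption for this sketch).
_A, _C, _M = 1664525, 1013904223, 2**32

COLORS = ("red", "green", "blue")

def next_int(seed: int, lo: int, hi: int) -> Tuple[int, int]:
    """Pure: return a value in [lo, hi] plus the next seed; no hidden state."""
    new_seed = (_A * seed + _C) % _M
    return lo + new_seed % (hi - lo + 1), new_seed

def random_color(seed: int) -> Tuple[str, int, int]:
    """Thread the seed through both draws and hand back the last one."""
    index, seed2 = next_int(seed, 0, len(COLORS) - 1)
    intensity, seed3 = next_int(seed2, 1, 100)
    return COLORS[index], intensity, seed3
```

Because nothing is stored between calls, calling random_color twice with the same seed gives the same triple; to get a fresh value the caller has to pass along the returned seed, which is exactly the bookkeeping a state monad hides.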
Getting the system time is impure because the time could be different each time you ask.
Sleeping is fuzzier because sleep doesn't affect the result of the function, and you could always delay execution with a busy loop, and that wouldn't affect purity. The thing is that sleeping is done for the sake of something else, which IS a side effect.
Memory allocation in pure languages has to happen implicitly, because explicitly allocating and freeing memory are side effects if you can do any kind of pointer comparisons. Otherwise, creating two new objects with the same parameters would still produce different values because they would have different identities (e.g. not be equal by Java's == operator).
I know I've rambled on a bit, but hopefully that explains what side effects are.

Give a serious look to Clojure, and their use of software transactional memory, agents, and atoms to keep side effects under control.

Related

Can anyone explain the design decisions behind Autolisp/visual lisp to me?

I wonder can anyone explain the design rationale behind the following features of autolisp / visual lisp? To me they seem to fly in the face of accepted software practice ... am I missing something?
All variables are global by default (ie unless placed after a / in the function arguments)
Reading/writing data from autocad requires putting stuff into an association list with lots of magic numbers. 10 means x/y coordinates, 90 means length of the coordinate list, 63 means colour, etc. Ok you could store these in some constants but that would mean yet more globals, and the documentation encourages you to use the magic numbers directly.
Lisp is a functional-style language, which encourages programming by recursion over iteration, but tail recursion is afaik not optimised in visual lisp leading to horrendous call stacks - unless, of course you iterate. But loop syntax is very restrictive; e.g. you can't break out of or return a value from a loop unless you put some kind of flag in the termination condition. Result, ugly code.
Generally you are forced to declare variables all over the place which flies in the face of functional programming - so why use a functional(-ish) language?

Lisp isn't a language, it's a group of sometimes surprisingly different languages. Scheme and Clojure are the functional members of the family. Common Lisp, and the more specialized breeds like Elisp aren't particularly functional and don't inherently encourage functional programming or recursion. CL in fact includes a very flexible object system, an extremely flexible iteration DSL, and doesn't guarantee optimized tail calls (Scheme dialects do, but not Lisps in general; that's the pitfall in thinking of "Lisp" as a single language).
Now that we have that cleared up, AutoLisp is an implementation from 1986 based on an early version of XLISP (the earliest of which was published in 1983).
The reason that it might fly in the face of currently accepted programming practice is that it predates currently accepted programming practice. Another thing to keep in mind is that the cheapest netbook available today is several hundred times more powerful than what a programmer could expect to have access to back in the mid 80s. Meaning that even if a given feature was accepted to be excellent, CPU or memory constraints may have prevented its implementation in a commercial language.
I've never programmed in Autolisp/Visual Lisp specifically, and the stuff you cite sounds bloody annoying, but it may have had some performance/memory advantage that justified it at the time.

If I remember correctly, AutoLisp is a fork from an early version of XLisp (some sources claim it was XLisp 1.0; see this C2 article).
XLisp 1.0 is a 1-cell lisp (functions and variables share the same name-space) with some rather odd oddities to it.

You can add dynamic scoping into the mix btw, and if you don't know what it is consider yourself lucky. But actually not all your four points are that big of a deal IMO:
"Undeclared vars are created automatically as global." Same as in CL is it not (via setq)? The other option is to fail, and that's not a very attractive one for the language which is supposed to be used for quick-n-dirty scripting.
The "magic numbers" are DXF codes, which, you're right, are a major inconvenience, as they sometimes change between ACAD versions (thankfully, rarely). That's just how it is. Fixing it would require a major overhaul, introducing some "schemas" and what not, and why would "they" bother? AutoLISP was left in its state as of 1992 approximately, and never bothered with since. Visual LISP itself is an entirely different and much more capable system, but it is all locked out for the regular user, and only made to serve one goal - to emulate the old AutoLISP as faithfully as possible (except where it added new VBA-related features in the later half of the 1990s, and was locked since then too).
(while (not done) ...) is not that ugly. There's no tail optimization guarantee, yes, just as there isn't one in CL and Haskell (that last one really stumps me - there's no guaranteed way to encode a loop in Haskell in constant space without monads - how about that?).
"you're forced to declare vars all over the place" here I do not follow you. You declare them where you're supposed to declare them - in the function's internal arguments list. What other places do you mean? I don't know of any.
In reality the biggest stumbling block of AutoLISP is its dynamic name resolution IMO, but that's how it was in Xlisp, only a few years after Scheme first came out. Then there's also its immutable data store, but that was done mainly for simplicity of implementation, and to prevent too much confusion, and hence questions, from the user base, I guess.

Haskell for mission-critical systems [duplicate]

I've been curious to understand if it is possible to apply the power of Haskell to the embedded realtime world, and in googling have found the Atom package. I'd assume that in the complex case the code might have all the classical C bugs - crashes, memory corruptions, etc, which would then need to be traced to the original Haskell code that caused them.
So, this is the first part of the question: "If you have experience with Atom, how did you deal with the task of debugging the low-level bugs in compiled C code and fixing them in the original Haskell code?"
I searched for some more examples for Atom, this blog post mentions the resulting C code 22KLOC (and obviously no code:), the included example is a toy. This and this references have a bit more practical code, but this is where this ends. And the reason I put "sizable" in the subject is, I'm most interested if you might share your experiences of working with the generated C code in the range of 300KLOC+.
As I am a Haskell newbie, obviously there may be other ways that I did not find due to my unknown unknowns, so any other pointers for self-education in this area would be greatly appreciated - and this is the second part of the question - "what would be some other practical methods (if) of doing real-time development in Haskell?". If the multicore is also in the picture, that's an extra plus :-)
(About usage of Haskell itself for this purpose: from what I read in this blog post, the garbage collection and laziness in Haskell makes it rather nondeterministic scheduling-wise, but maybe in two years something has changed. Real world Haskell programming question on SO was the closest that I could find to this topic)
Note: "real-time" above would be closer to "hard realtime" - I'm curious if it is possible to ensure that the pause time when the main task is not executing is under 0.5ms.

At Galois we use Haskell for two things:
Soft real time (OS device layers, networking), where 1-5 ms response times are plausible. GHC generates fast code, and has plenty of support for tuning the garbage collector and scheduler to get the right timings.
True real time, where we use EDSLs (e.g. Cryptol, Atom and Copilot) to generate code in other languages that provide stronger timing guarantees.
So be careful to distinguish the EDSL (Copilot or Atom) from the host language (Haskell).
Some examples of critical systems, and in some cases, real-time systems, either written or generated from Haskell, produced by Galois.
EDSLs
Copilot: A Hard Real-Time Runtime Monitor -- a DSL for real-time avionics monitoring
Equivalence and Safety Checking in Cryptol -- a DSL for cryptographic components of critical systems
Systems
HaLVM -- a lightweight microkernel for embedded and mobile applications
TSE -- a cross-domain (security level) network appliance

It will be a long time before there is a Haskell system that fits in small memory and can guarantee sub-millisecond pause times. The community of Haskell implementors just doesn't seem to be interested in this kind of target.
There is healthy interest in using Haskell or something Haskell-like to compile down to something very efficient; for example, Bluespec compiles to hardware.
I don't think it will meet your needs, but if you're interested in functional programming and embedded systems you should learn about Erlang.

Andrew,
Yes, it can be tricky to debug problems through the generated code back to the original source. One thing Atom provides is a means to probe internal expressions; it then leaves it up to the user how to handle these probes. For vehicle testing, we build a transmitter (in Atom) and stream the probes out over a CAN bus. We can then capture this data, format it, then view it with tools like GTKWave, either in post-processing or realtime. For software simulation, probes are handled differently. Instead of getting probe data from a CAN protocol, hooks are made to the C code to lift the probe values directly. The probe values are then used in the unit testing framework (distributed with Atom) to determine if a test passes or fails and to calculate simulation coverage.

I don't think Haskell, or other Garbage Collected languages are very well-suited to hard-realtime systems, as GC's tend to amortize their runtimes into short pauses.
Writing in Atom is not exactly programming in Haskell, as Haskell here can be seen as purely a preprocessor for the actual program you are writing.
I think Haskell is an awesome preprocessor, and using DSEL's like Atom is probably a great way to create sizable hard-realtime systems, but I don't know if Atom fits the bill or not. If it doesn't, I'm pretty sure it is possible (and I encourage anyone who does!) to implement a DSEL that does.
Having a very strong pre-processor like Haskell for a low-level language opens up a huge window of opportunity to implement abstractions through code-generation that are much more clumsy when implemented as C code text generators.

I've been fooling around with Atom. It is pretty cool, but I think it is best for small systems. Yes it runs in trucks and buses and implements real-world, critical applications, but that doesn't mean those applications are necessarily large or complex. It really is for hard-real-time apps and goes to great lengths to make every operation take the exact same amount of time. For example, instead of an if/else statement that conditionally executes one of two code branches that might differ in running time, it has a "mux" statement that always executes both branches before conditionally selecting one of the two computed values (so the total execution time is the same whichever value is selected). It doesn't have any significant type system other than built-in types (comparable to C's) that are enforced through GADT values passed through the Atom monad. The author is working on a static verification tool that analyzes the output C code, which is pretty cool (it uses an SMT solver), but I think Atom would benefit from more source-level features and checks. Even in my toy-sized app (LED flashlight controller), I've made a number of newbie errors that someone more experienced with the package might avoid, but that resulted in buggy output code that I'd rather have been caught by the compiler instead of through testing. On the other hand, it's still at version 0.1.something so improvements are undoubtedly coming.
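To make the "mux" idea concrete, here is a small Python sketch. It is not Atom's API (mux and controller_step are hypothetical names), and Python itself gives no timing guarantees; the point is only the evaluation order: both candidate values are computed first, then one is selected without a branch.

```python
def mux(condition: bool, if_true: int, if_false: int) -> int:
    """Select between two already-computed integers without an if/else."""
    mask = -int(condition)  # True -> -1 (all bits set), False -> 0
    return (if_true & mask) | (if_false & ~mask)

def controller_step(temperature: int) -> int:
    # Both candidate outputs are computed on every call...
    cooling = temperature - 5
    heating = temperature + 5
    # ...and only afterwards is one of them selected.
    return mux(temperature > 20, cooling, heating)
```

In generated C for a real target, this shape is what keeps the work done per step independent of which value ends up being used.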

Why are functional languages considered a boon for multi threaded environments?

I hear a lot about functional languages, and how they scale well because there is no state around a function; and therefore that function can be massively parallelized.
However, this makes little sense to me because almost all real-world practical programs need/have state to take care of. I also find it interesting that most major scaling libraries, i.e. MapReduce, are typically written in imperative languages like C or C++.
I'd like to hear from the functional camp where this hype I'm hearing is coming from..

It's important to add one word: "there's no shared state".
Any meaningful program (in any language) changes the state of the world. But (some) functional languages make it impossible to access the same resource from multiple threads simultaneously. The absence of shared state makes multithreading safe.

Functional languages such as Haskell, Scheme and others have what are called "pure functions". A pure function is a function with no side effects. It doesn't modify any other state in the program. This is by definition threadsafe.
Of course you can write pure functions in imperative languages. You also find multi-paradigm languages like Python, Ruby and even C# where you can do imperative programming, functional programming or both.
But the point of Haskell (etc) is that you can't write a non-pure function. Well that's not strictly true but it's mostly true.
Similarly, many imperative languages have immutable objects for much the same reason. An immutable object is one whose state doesn't change once created. Again by definition an immutable object is threadsafe.

You're talking about two different things and don't realize it.
Yes, most real-world programs have state somewhere, but if you want to do multithreading, that state should not be everywhere, and in fact, the fewer places it's in, the better. In functional programs, the default is not to have state, and you can introduce state exactly where you need it and nowhere else. Those parts that are dealing with state will not be as easily multithreaded, but since all the rest of your program is free of side-effects and thus it doesn't matter what order those parts are executed in, it removes a huge barrier to parallelization.

"However, this makes little sense to me because almost all real-world practical programs need/have state to take care of."
You'd be surprised! Yes, all programs need some state (I/O in particular) but often you don't need much more. Just because most programs have heaps of state doesn't mean they need it.
Programming in a functional language encourages you to use less state, and thus your programs become easier to parallelise.
Many functional languages are "impure" which means they allow some state. Haskell doesn't, but Haskell has monads which basically let you get something from nothing: you get state using stateless constructs. Monads are a bit fiddly to work with which is why Haskell gives you a strong incentive to restrict state to as small a part of your program as possible.
"I also find it interesting that most major scaling libraries, i.e. MapReduce, are typically written in imperative languages like C or C++."
Programming concurrent applications is "hard" in C/C++. That's why it's best to do all the dangerous stuff in a library which is heavily tested and inspected. But you still get the flexibility and performance of C/C++.

Higher order functions. Consider a simple reduction operation, summing the elements of an array. In an imperative language, programmers typically write themselves a loop and perform reductions one element at a time.
But that code isn't easy to make multi-threaded. When you write a loop you're assuming an order of operations and you have to spell out how to get from one element to the next. You'd really like to just say "sum the array" and have the compiler, or runtime, or whatever, make the decision about how to work through the array, dividing up the task as necessary between multiple cores, and combining those results together. So instead of writing a loop, with some addition code embedded inside it, an alternative is to pass something representing "addition" into a function that can do the divvying. As soon as you do that, you're writing functionally. You're passing a function (addition) into another function (the reducer). If you write this way then it not only makes more readable code, but when you change architecture, or want to write for heterogeneous architecture, you don't have to change the summer, just the reducer. In practice you might have many different algorithms that all share one reducer so this is a big payoff.
This is just a simple example. You may want to build on this. Functions to apply other functions on 2D arrays, functions to apply functions to tree structures, functions to combine functions to apply functions (eg. if you have a hierarchical structure with trees above and arrays below) and so on.
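The "pass addition into a reducer" idea above can be sketched as follows. This is a minimal illustration in Python, with parallel_reduce as a hypothetical name; it assumes the combining function is associative and that identity really is its identity element. Note that CPython threads will not speed up plain arithmetic, so the sketch shows the structure rather than a performance win.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from operator import add
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")

def parallel_reduce(combine: Callable[[T, T], T], items: Sequence[T],
                    identity: T, workers: int = 4) -> T:
    """Reduce items chunk by chunk, then combine the partial results."""
    if not items:
        return identity
    size = -(-len(items) // workers)  # ceiling division
    chunks = [items[i:i + size] for i in range(0, len(items), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps chunk order, so non-commutative combines still work.
        partials = list(pool.map(
            lambda chunk: reduce(combine, chunk, identity), chunks))
    return reduce(combine, partials, identity)

total = parallel_reduce(add, list(range(1, 101)), 0)
```

The caller only says what to combine (add); how the work is split lives entirely in the reducer, so swapping the thread pool for processes or another backend would not touch the calling code.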

For reliable code, NModel, Spec Explorer, F# or other?

I've got a business app in C#, with unit tests. Can I increase the reliability and cut down on my testing time and expense by using NModel or Spec Explorer? Alternately, if I were to rewrite it in F# (or even Haskell), what kinds (if any) of reliability increase might I see?
Code Contracts? ASML?
I realize this is subjective, and possibly argumentative, so please back up your answers with data, if possible. :) Or maybe a worked example, such as Eric Evans' Cargo Shipping System?
If we consider
Unit tests to be specific and strong theorems, checked quasi-statically on particular “interesting instances” and Types to be general but weak theorems (usually checked statically), and contracts to be general and strong theorems, checked dynamically for particular instances that occur during regular program operation.
(from B. Pierce's Types Considered Harmful),
where do these other tools fit?
We could pose the analogous question for Java, using Java PathFinder, Scala, etc.

Reliability is a function of several variables, including the general architecture of the software, the capability of the programmers, the quality of the requirements and the maturity of your configuration management and general QA processes. All these will affect the reliability of a rewrite.
Having said that, language certainly has a significant impact. All other things being equal:
Defects are roughly proportional to SLOC count. Languages that are terser see fewer coding errors. Haskell seems to require about 10% of the SLOC required by C++, Erlang about 14%, Java around 50%. I guess C# probably fits alongside Java on this scale.
Type systems are not born equal. Languages with type inference (e.g. Haskell and to a lesser extent OCaml) will have fewer defects. Haskell in particular will allow you to encode invariants in the type system so that a program will only compile if they can be proven true. Doing so requires extra work, so consider the trade-off on a case-by-case basis.
Managing state is a source of many defects. Functional languages, and especially pure functional languages, avoid this problem.
QuickCheck and its relatives allow you to write unit and system tests that verify general properties rather than individual test cases. This can greatly reduce the work required to test the code, especially if you are aiming for high test coverage metrics. A set of QuickCheck properties resembles a formal specification, and this concept fits nicely with Test Driven Development (write your tests first, and when the code passes them you are done).
Put all of these things together and you should have a powerful toolkit for driving quality through the development lifecycle. Unfortunately I'm not aware of any robust studies that actually prove this. All the factors I listed at the start would confound any real study, and you would need a lot of data before an unambiguous pattern showed itself.
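As a rough illustration of the QuickCheck style mentioned above, here is a hand-rolled Python sketch. It is not QuickCheck and does no shrinking of failing inputs; check_property and the two prop_ functions are hypothetical names for the example.

```python
import random
from typing import Callable, List

def check_property(prop: Callable[[List[int]], bool],
                   trials: int = 200, seed: int = 0) -> None:
    """Run prop on many random integer lists; raise with the failing input."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 20))]
        if not prop(xs):
            raise AssertionError(f"property failed for input {xs!r}")

# General properties rather than hand-picked cases:
def prop_reverse_twice_is_identity(xs: List[int]) -> bool:
    return list(reversed(list(reversed(xs)))) == xs

def prop_sort_is_idempotent(xs: List[int]) -> bool:
    return sorted(sorted(xs)) == sorted(xs)

check_property(prop_reverse_twice_is_identity)
check_property(prop_sort_is_idempotent)
```

Each property reads like a line of a specification ("sorting twice is the same as sorting once"), and the generator, not the programmer, supplies the instances.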

Some comments on the quote, in the context of C# which is my "first" language:
"Unit tests to be specific and strong theorems,"
Yes, but they might not give you first order logic checks, like "for all x there exists a y where f(y)", more like "there exists a y, here it is (!), f(y)", aka setup, act, assert. ;)*
"checked quasi-statically on particular “interesting instances” and Types to be general but weak theorems (usually checked statically),"
Types are not necessarily that weak**.
"and contracts to be general and strong theorems, checked dynamically for particular instances that occur during regular program operation." (from B. Pierce's Types Considered Harmful),
Unit Testing
Pex + Moles, I think, is getting closer to the first-order logic type of checking, as it generates the edge cases and uses the Z3 solver for integer constraint solving. I would really like to see more Moles tutorials (Moles is for replacing implementations), specifically together with some sort of inversion of control container that can leverage what stub and real implementations of abstract classes and interfaces already exist.
Weak Types
In C# they are fairly weak, sure: generic typing/types allows you to add protocol semantics for one operation -- i.e. constraining types to be on interfaces, which are in some sense protocols which implementing classes agree to. However, the static typing of the protocol is just for one operation.
Example: Reactive Extensions API
Let's take Reactive Extensions as a discussion topic.
The contract required by the consumer, implemented by the observable.
// As declared in System; the IDisposable comes from IObservable<T>.Subscribe.
interface IObserver<in T> {
    void OnNext(T value);
    void OnCompleted();
    void OnError(System.Exception error);
}
There is more to the protocol than this interface shows: methods called on an IObserver<in T> instance must follow this protocol:
Ordering:
OnNext{0,n} (OnCompleted | OnError){0, 1}
Furthermore, on another axis; time-dimension:
Time:
for all t|-> t:(method -> time). t(OnNext) < t(OnCompleted)
for all t|-> t:(method -> time). t(OnNext) < t(OnError)
i.e. no invocation to OnNext may be done after one to OnCompleted xor OnError.
Furthermore, the axis of parallelism:
Parallelism:
no invocation to OnNext may be done in parallel
i.e. there's a scheduling constraint that needs to be followed from implementers of IObservable. No IObservable may push from multiple threads at the same time, without first synchronizing the invocation around a context.
How do you test that this contract holds in an easy way? With C#, I don't know.
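One partial answer is to record the calls an observer receives and validate the trace afterwards. The sketch below does this in Python rather than C#, covers only the ordering rule (OnNext any number of times, then at most one OnCompleted or OnError), and says nothing about the parallelism constraint; check_observer_calls and ProtocolViolation are hypothetical names.

```python
from typing import Iterable

class ProtocolViolation(Exception):
    """Raised when a recorded call trace breaks the observer grammar."""

def check_observer_calls(calls: Iterable[str]) -> None:
    """Validate a trace against OnNext* (OnCompleted | OnError)?."""
    terminated = False
    for position, name in enumerate(calls):
        if name not in ("OnNext", "OnCompleted", "OnError"):
            raise ProtocolViolation(
                f"unknown call {name!r} at position {position}")
        if terminated:
            raise ProtocolViolation(
                f"{name} at position {position} after termination")
        if name in ("OnCompleted", "OnError"):
            terminated = True

check_observer_calls(["OnNext", "OnNext", "OnCompleted"])  # accepted
check_observer_calls([])                                    # accepted: empty stream
```

A test double that appends each method name to a list, followed by this check, turns the hidden ordering protocol into an ordinary assertion, although it still only checks the particular runs you happen to exercise.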
Consumer of API
From the consuming side of the application, there might be interactions between different contexts, such as Dispatcher, Background/other threads, and preferably we'd like to give guarantees that we don't end up in a deadlock.
Further, there is the requirement to handle deterministic disposing of the observables. It might not be clear all the time when an extension method's returned IObservable instance takes care of the method's arguments' IObservable instances and dispose those, so there's a requirement to know about the inner workings of the black box (alternatively you can let the references go in a "reasonable way" and the GC will take them at some point)
<<< Without Reactive Extensions, it's not necessarily easier:
There is the task pool, on top of which the TPL is implemented. In the task pool we have a work-stealing queue of delegates to invoke on the worker threads.
Using the APM/begin/end or the async pattern (which queues to the task pool) could leave us open to callback-ordering bugs if we are mutating state. Also, the protocol of begin-invocations and their callbacks might be too convoluted and hence impossible to follow. I read a post-mortem the other day about a Silverlight project having problems seeing the business logic-forest for all the callback-trees. Then there's the possibility of implementing the poor-man's async monad, the IEnumerable with an async 'manager' iterating through it and calling MoveNext() every time a yielded IAsyncResult completes.
...and don't get me started on the nuuuumerous hidden protocols in IAsyncResult.
Another problem, without using Reactive extensions is the turtles problem - once you decide that you want an IO-blocking operation to be async, there need to be turtles all the way down to the p/invoke call that places the associated Win32-thread on an IO-completion port! If you have three layers and then some logic as well inside of your topmost layer, you need to make all three layers implement the APM pattern; and fulfil the numerous contract obligations of IAsyncResult (or leave it partially broken) -- and there's no default public AsyncResult implementation in the base class library.
>>>
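To make the "poor man's async monad" from the aside above a bit more concrete, here is a minimal sketch in Python rather than C#, with a generator playing the role of the IEnumerable. PendingResult, run and add_both are hypothetical names invented for this illustration (PendingResult is only a stand-in for IAsyncResult, not a real API), and everything runs on a single thread with no real IO.

```python
class PendingResult:
    """Hypothetical stand-in for IAsyncResult: a value that arrives later."""

    def __init__(self):
        self.done = False
        self.value = None
        self._callbacks = []

    def on_complete(self, callback):
        # Run the callback right away if the value is already available.
        if self.done:
            callback(self.value)
        else:
            self._callbacks.append(callback)

    def complete(self, value):
        self.done = True
        self.value = value
        for callback in self._callbacks:
            callback(value)


def run(workflow):
    """The 'manager': resume the generator each time a yielded result completes."""
    outcome = PendingResult()

    def step(value):
        try:
            pending = workflow.send(value)  # the analogue of MoveNext()
        except StopIteration as stop:
            outcome.complete(stop.value)    # the workflow has finished
            return
        pending.on_complete(step)

    step(None)
    return outcome


def add_both(first, second):
    # Reads top to bottom like blocking code, but never blocks a thread.
    x = yield first
    y = yield second
    return x + y


a, b = PendingResult(), PendingResult()
result = run(add_both(a, b))
b.complete(2)  # arrives "out of order"; the workflow is still waiting on a
a.complete(1)  # the workflow resumes, sees b is already done, and finishes
```

The workflow body stays sequential, while the callback plumbing lives in one place (run) instead of being spread across every layer.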
Working with exceptions from the interface
Even with the above memory-management + parallelism + contract + protocol items covered, a good, reliable application still has exceptions to handle (not just receive and forget about). Let me give an example.
Context
Let's say that we find ourselves catching an exception from the contract/interface (not necessarily from Reactive Extensions' IObservable implementations, which have monadic rather than stack-frame-based exception handling).
Hopefully the programmer was diligent and documented the possible exceptions, but there might be exception possibilities all the way down. If everything is correctly defined with code contracts, we can at least be sure we are capable of catching a few of the exceptions, but many different causes may be lumped together inside one exception type, and once an exception is thrown, how do we ensure that the smallest possible unit of work is rectified?
Aim
Say that we are pushing data records from a message-bus consumer in our application, and receiving them on a background thread which decides what to do with them.
Example
A real-life example here could be Spotify, which I'm using every day.
My $100 router/access point throws in the towel at random times. I guess it has a cache-bug or some sort of stack overflow bug, as it happens every time I push more than 2 MB/s LAN/WAN data through it.
I have two NICs up: the wifi and the ethernet card. The ethernet connection goes down. The sockets of Spotify's event-handler loop return an invalid code (I think it's C or C++) or throw exceptions. Spotify has to handle it, but it doesn't know what my network topology looks like (and there is no code to try all routes/update the routing table and hence the interface to be used); I still have a route to the internet, just not on the same interface. Spotify crashes.
A thesis
Exceptions are simply not semantic enough. I believe one can look at exceptions from the perspective of the Error monad in Haskell. We either continue or break: unwinding the stack, executing the catches, executing the finallys and praying we don't end up with race conditions on either other exception handlers or the GC, or async exceptions for outstanding IO completion ports.
But when one of my interfaces' connection/route goes down, Spotify crashes or freezes.
Now we have SEH (Structured Exception Handling), but I think we will have SEH2 in the future, where each source of exception gives, along with the actual exception, a discriminated union of possible compensating actions (i.e. it should be statically typed to the linked library/assembly). In this example, I could imagine Windows' network API telling the application to execute a compensating action to open the same socket on another interface, or to handle it on its own (like now), or to retry the socket with some kernel-managed retry policy. Each of these options is a case of a discriminated union type, so the implementer must use one of them.
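A rough sketch of what such a failure value might look like, in Python. All the names here (ReopenOnInterface, RetryWithPolicy, HandleYourself, SocketFailure, choose) are made up for illustration and do not correspond to any real Windows or .NET API. Python also only checks the union with an external type checker, so the static guarantee described above is merely approximated.

```python
from dataclasses import dataclass
from typing import Tuple, Union


# Hypothetical compensating actions the failing layer could offer.
@dataclass(frozen=True)
class ReopenOnInterface:
    interface: str


@dataclass(frozen=True)
class RetryWithPolicy:
    max_attempts: int


@dataclass(frozen=True)
class HandleYourself:
    reason: str


# The "discriminated union": a failure must offer one of these cases.
Compensation = Union[ReopenOnInterface, RetryWithPolicy, HandleYourself]


@dataclass(frozen=True)
class SocketFailure:
    message: str
    options: Tuple[Compensation, ...]


def choose(failure: SocketFailure) -> Compensation:
    """Pick an offered action, preferring reopen, then retry, then manual handling."""
    for preferred in (ReopenOnInterface, RetryWithPolicy, HandleYourself):
        for option in failure.options:
            if isinstance(option, preferred):
                return option
    raise ValueError("the failure offered no compensating action")


failure = SocketFailure(
    "ethernet link lost",
    (HandleYourself("link lost"), ReopenOnInterface("wifi")),
)
action = choose(failure)  # the ReopenOnInterface("wifi") option
```

The point is that the caller receives a closed set of recovery choices together with the failure, instead of a bare exception type it has to interpret by itself.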
I think that, when we have SEH2, it won't be called exceptions anymore.
^^
Anyway, I have digressed too much already.
Instead of reading my thoughts, listen to some of Erik Meijer's -- this is a very good round-table discussion between him and Joe Duffy. They discuss handling side-effects of calls. Or have a look at this search listing.
I'm finding myself in a position, today, as a consultant, of maintaining a system where stronger static semantics could be good, and I'm looking at tools which can give me the speed of programming + the correctness verification on a level which is accurate and precise. I haven't found it yet.
I simply think we are another 20 years, if not more, away from developer-oriented reliable computing. There are just too many languages, frameworks, marketing BS and concepts in the air right now for the ordinary developer to stay on top of things.
Why is this under the heading of "weak types"?
Because I find that the type system will be part of the solution; types need not be weak! Terse code and strong type systems (think Haskell) help programmers build reliable software.

Using Polymorphic Code for Legitimate Purposes?

I recently came across the term Polymorphic Code, and was wondering if anyone could suggest a legitimate (i.e. in legal and business appropriate software) reason to use it in a computer program? Links to real world examples would be appreciated!
Before someone answers, telling us all about the benefits of polymorphism in object oriented programming, please read the following definition for polymorphic code (taken from Wikipedia):
"Polymorphic code is code that uses a polymorphic engine to mutate while keeping the original algorithm intact. That is, the code changes itself each time it runs, but the function of the code in whole will not change at all."
Thanks, MagicAndi.
Update
Summary of answers so far:
Runtime optimization of the original code
Assigning a "DNA fingerprint" to each individual copy of an application
Obfuscate a program to prevent reverse-engineering
I was also introduced to the term 'metamorphic code'.

Runtime optimization of the original code, based on actual performance statistics gathered when running the application in its real environment and real inputs.

Digitally watermarking music is something often done to determine who was responsible for leaking a track, for example. It makes each copy of the music unique so that copies can be traced back to the original owner, but doesn't affect the audible qualities of the track.
Something similar could be done for compiled software by running each individual copy through a polymorphic engine before distributing it. Then if a cracked version of this software is released onto the Internet, the developer might be able to tell who cracked it by looking for specific variations produced by the polymorphic engine (a sort of DNA test). As far as I know, this technique has never been used in practice.
It's not exactly what you were looking for I guess, since the polymorphic engine is not distributed with the code, but I think it's the closest to a legitimate business use you will find for this kind of technique.

Polymorphic code is a nice thing, but metamorphic is even nicer. As for legitimate uses: well, I can't think of anything other than anti-cracking and copy protection. Look at vx.org.ua if you want real-world uses (not that legitimate, though).

As Sami notes, on-the-fly optimisation is an excellent application of polymorphic code. A great example of this is the Fastest Fourier Transform in the West. It has a number of solvers at its disposal, which it combines with self-profiling to adjust the code path and solver parameters on subsequent executions. The result is the program optimises itself for your computing environment, getting faster with subsequent runs!
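As a loose illustration of the self-profiling idea (not of FFTW's actual planner, and without any code rewriting itself), here is a small Python sketch that times a few interchangeable implementations on the first call and sticks with the fastest afterwards. SelfTuning, sum_loop and sum_builtin are hypothetical names chosen for this example.

```python
import time


def sum_loop(values):
    total = 0
    for v in values:
        total += v
    return total


def sum_builtin(values):
    return sum(values)


class SelfTuning:
    """Profiles the candidates on the first call, then reuses the fastest one."""

    def __init__(self, candidates, timer=time.perf_counter):
        self.candidates = list(candidates)
        self.timer = timer    # injectable so the choice can be tested
        self.chosen = None

    def __call__(self, values):
        if self.chosen is None:
            self.chosen = self._profile(values)
        return self.chosen(values)

    def _profile(self, values):
        best, best_time = None, None
        for candidate in self.candidates:
            start = self.timer()
            candidate(values)
            elapsed = self.timer() - start
            if best_time is None or elapsed < best_time:
                best, best_time = candidate, elapsed
        return best


tuned_sum = SelfTuning([sum_loop, sum_builtin])
total = tuned_sum(list(range(1000)))  # the first call profiles, later calls do not
```

Which candidate wins depends on the machine and the input, which is exactly the point: the algorithm's result is unchanged, only the code path taken adapts to the environment.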
A related idea that may possibly be of interest is computational steering. This is the practice of altering the execution path of large simulations as the run proceeds, to focus on areas of interest to the researcher. The overall purpose of the simulation is not changed, but the feedback cycle acts to optimise the calculation. In this case the executable code is not being explicitly rewritten, but the effect from a user perspective is similar.

Polymorphic code can be used to obfuscate weak or proprietary algorithms, e.g. ones that use encryption. There are many "legitimate" uses for that. The term legitimate these days is kind of narrow-minded when it comes to IT. The core paradigms of IT include security. Whether you use polymorphic shellcode in exploits or detect such code with an AV scanner, you have to know about it.

Obfuscate a program i.e. prevent reverse-engineering: goal being to protect IP (Intellectual Property).
