Control structures beyond standard conditionals and loops?

Structured programming languages typically have a few control structures, like while, if, for, do, switch, break, and continue that are used to express high-level structures in source code.
However, there are many other control structures that have been proposed over the years that haven't made their way into modern programming languages. For example, in Knuth's paper "Structured Programming with Go To Statements," page 275, he references a control structure that looks like a stripped-down version of exception handling:
loop until event1 or event2 or ... eventN
    /* ... */
    leave with event1;
    /* ... */
repeat;
then event1 -> /* ... code if event1 occurred ... */
     event2 -> /* ... code if event2 occurred ... */
     /* ... */
     eventN -> /* ... code if eventN occurred ... */
fi;
This seems like a useful structure, but I haven't seen any languages that actually implement it except as a special case of standard exception handling.
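A rough emulation of Knuth's construct in Python, using one exception subclass per event; this is just an illustrative sketch (the item list and conditions are invented), not a feature of any mainstream language:

class Event(Exception):
    """One subclass per event the loop can leave with."""

class Event1(Event): pass
class Event2(Event): pass

def search(items):
    try:
        for item in items:           # the "loop ... repeat" body
            if item < 0:
                raise Event1()       # "leave with event1"
            if item > 100:
                raise Event2()       # "leave with event2"
    except Event1:
        print("event1: found a negative item")
    except Event2:
        print("event2: found an oversized item")

search([3, 7, -1, 12])               # takes the event1 branch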
Similarly, Edsger Dijkstra often used a control structure in which one of many pieces of code is executed nondeterministically based on a set of conditions that may be true. You can see this on page 10 of his paper on smoothsort, among other places. Sample code might look like this:
do
    /* Either of these may be chosen if x == 5 */
    if x <= 5 then y = 5;
    if x >= 5 then y = 137;
od;
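One can emulate this nondeterministic choice in Python by collecting the bodies of all true guards and picking one at random; a toy sketch (guards and bodies taken from the sample above):

import random

def guarded_choice(x):
    # Collect the body of every guard that currently holds...
    branches = []
    if x <= 5:
        branches.append(lambda: 5)
    if x >= 5:
        branches.append(lambda: 137)
    if not branches:
        raise RuntimeError("no guard is true")  # Dijkstra's if aborts in this case
    # ...then execute one of them, chosen arbitrarily.
    return random.choice(branches)()

print(guarded_choice(5))  # prints 5 or 137, nondeterministically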
I understand that historically C influenced many modern languages like C++, C#, and Java, and so many control structures we use today are based on the small set offered by C. However, as evidenced by this other SO question, we programmers like to think about alternative control structures that we'd love to have but aren't supported by many programming languages.
My question is this - are there common languages in use today that support control structures radically different from the C-style control structures I mentioned above? Such a control structure doesn't have to be something that can't be represented using the standard C structures - pretty much anything can be encoded that way - but ideally I'd like an example of something that lets you approach certain programming tasks in a fundamentally different way than the C model allows.
And no, "functional programming" isn't really a control structure.

Since Haskell is lazy, every function call is essentially a control structure.
Pattern-matching in ML-derived languages merges branching, variable binding, and destructuring objects into a single control structure.
Common Lisp's conditions are like exceptions that can be restarted.
Scheme and other languages support continuations which let you pause and resume or restart a program at any point.
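Python has no first-class continuations, but generators give a limited pause-and-resume flavor of that last idea; a toy sketch:

def task():
    print("step 1")
    yield             # pause here; the caller decides when to resume
    print("step 2")
    yield
    print("step 3")

t = task()
next(t)               # runs up to the first yield: prints "step 1"
next(t)               # resumes where it left off: prints "step 2"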

Perhaps not "radically different" but "asynchronous" control structures are fairly new.
Async allows non-blocking code to be executed in parallel, with control returning to the main program flow once completed. The same could be achieved with nested callbacks, but doing anything non-trivial that way leads to ugly code very quickly.
For example, in the upcoming versions of C#/VB, async allows calling into asynchronous APIs without having to split your code across multiple methods or lambda expressions - i.e. no more callbacks. The "await" and "async" keywords enable you to write asynchronous methods that can pause execution without consuming a thread, and then resume later where they left off.
// C#
async Task<int> SumPageSizesAsync(IList<Uri> uris)
{
    int total = 0;
    var statusText = new TextBox();
    foreach (var uri in uris)
    {
        statusText.Text = string.Format("Found {0} bytes ...", total);
        var data = await new WebClient().DownloadDataTaskAsync(uri);
        total += data.Length;
    }
    statusText.Text = string.Format("Found {0} bytes total", total);
    return total;
}
(pinched from http://blogs.msdn.com/b/visualstudio/archive/2011/04/13/async-ctp-refresh.aspx)
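For comparison, the same construct now exists in Python's asyncio; a minimal sketch, with asyncio.sleep standing in for a real non-blocking download:

import asyncio

async def fetch_size(uri):
    await asyncio.sleep(0.1)            # stand-in for a real non-blocking download
    return len(uri)                     # pretend the page size equals the URI length

async def sum_page_sizes(uris):
    total = 0
    for uri in uris:
        total += await fetch_size(uri)  # suspends here without blocking a thread
    return total

print(asyncio.run(sum_page_sizes(["http://a.example", "http://b.example/x"])))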
For Javascript, there's http://tamejs.org/ that allows you to write code like this:
var res1, res2;
await {
    doOneThing(defer(res1));
    andAnother(defer(res2));
}
thenDoSomethingWith(res1, res2);

C#/Python iterators/generators
def integers():
    i = 0
    while True:
        yield i
        i += 1
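Since the stream is infinite, a consumer takes only as much as it needs, e.g. with itertools.islice:

from itertools import islice

print(list(islice(integers(), 5)))  # [0, 1, 2, 3, 4]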

(I don't know a lot about the subject so I marked this a wiki)
Haskell's pattern matching and guards.
Plain example:
sign x | x > 0  = 1
       | x == 0 = 0
       | x < 0  = -1
or, say, Fibonacci, which looks almost identical to the math equation:
fib x | x < 2  = 1
      | x >= 2 = fib (x - 1) + fib (x - 2)

Related

Understanding recursive let expressions in lambda calculus with Haskell, OCaml and the Nix language

I'm trying to understand how recursive sets operate internally by comparing them with similar features in other functional programming languages and concepts.
The wiki explains them in terms of the Y combinator and fixed points, which I understand at a basic level.
Now I want to apply this in Haskell.
Haskell
This is easy, but I want to know what happens behind the scenes.
*Main> let x = y; y = 10 in x
10
When you write a = f b in a lazy functional language like Haskell or Nix, the meaning is stronger than just assignment. a and f b will be the same thing. This is usually called a binding.
I'll focus on a Nix example, because you're asking about recursive sets specifically.
A simple attribute set
Let's look at the initialization of an attribute set first. When the Nix interpreter is asked to evaluate this file
{ a = 1 + 1; b = true; }
it parses it and returns a data structure like this
{ a = <thunk 1>; b = <thunk 2>; }
where a thunk is a reference to the relevant syntax tree node and a reference to the "environment", which behaves like a dictionary from identifiers to their values, although implemented more efficiently.
Perhaps the reason we're evaluating this file is because you requested nix-build, which will not just ask for the value of a file, but also traverse the attribute set when it sees that it is one. So nix-build will ask for the value of a, which will be computed from its thunk. When the computation is complete, the memory that held the thunk is assigned the actual value, type = tInt, value.integer = 2.
A recursive attribute set
Nix has a special syntax that combines the functionality of attribute set construction syntax ({ }) and let-binding syntax. This avoids some repetition when you're constructing attribute sets with some shared values.
For example
let b = 1 + 1;
in { b = b; a = b + 5; }
can be expressed as
rec { b = 1 + 1; a = b + 5; }
Evaluation works in a similar manner.
At first the evaluator returns a representation of the attribute set with all thunks, but this time the thunks reference a new environment that includes all the attributes, on top of the existing lexical scope.
Note that all these representations can be constructed while performing a minimal amount of work.
nix-build traverses attrsets in alphabetic order, so it will evaluate a first. It's a thunk that references the b + 5 syntax node and an environment with b in it. Evaluating this requires evaluating the b syntax node (an ExprVar), which references the environment, where we find the 1 + 1 thunk, which is changed to a tInt of 2 as before.
As you can see, this process of creating thunks but only evaluating them when needed is indeed lazy and allows us to have various language constructs with their own scoping rules.
Haskell implementations usually follow a similar pattern, but may compile the code rather than interpret a syntax tree, and may resolve all variable references to constant memory offsets. Nix tries to do this to some degree, but it must be able to fall back on strings because of the inadvisable with keyword, which makes the scope dynamic.
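A tiny Python model of this thunk machinery (my own sketch, nothing like Nix's actual implementation) may make the laziness concrete:

class Thunk:
    """An unevaluated expression plus the environment to evaluate it in."""
    _UNSET = object()

    def __init__(self, expr, env):
        self.expr, self.env = expr, env
        self.value = Thunk._UNSET

    def force(self):
        if self.value is Thunk._UNSET:
            self.value = self.expr(self.env)   # evaluate once...
        return self.value                      # ...then memoize

# rec { b = 1 + 1; a = b + 5; }: both thunks share one environment
# that already contains the attributes themselves.
env = {}
env["b"] = Thunk(lambda e: 1 + 1, env)
env["a"] = Thunk(lambda e: e["b"].force() + 5, env)
print(env["a"].force())  # 7 -- forcing a forces b on demand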
I can guess a few things myself. In an eagerly evaluated language, a variable must be declared before it is used, so the order of declarations is straightforward:
int x = 10;
int y = x;
Specific to the Nix language: the wiki compares let ... in with Haskell, but doesn't compare the other concepts.
Lexical scope: all variables are lexically scoped.
Mutual recursion: https://en.wikipedia.org/wiki/Let_expression#Mutually_recursive_let_expression

What is process interleaving? (in the realm of Concurrency)

I'm not quite sure what this term means. I saw it during a course where we are learning about concurrency. I've seen a lot of definitions for data interleaving, but I couldn't find anything about process interleaving.
When looking at the term, my instinct says it is the use of threads to run more than one process simultaneously - is that correct?
If you imagine a process as a (possibly infinite) sequence/trace of statements (e.g. obtained by loop unfolding), then the set of possible interleavings of several processes consists of all merges of those sequences, i.e. all sequences of their statements that preserve each process's internal order.
Consider for example the processes
int i;

proctype A() {
    i = 1;
}

proctype B() {
    i = 2;
}
Then the possible interleavings are i = 1; i = 2 and i = 2; i = 1, i.e. the possible final values for i are 1 and 2. This can of course be more complex, for instance in the presence of guarded statements: then the next possible statements in an interleaving sequence are not necessarily those at the position of the next program counter, but only those that are allowed by the guard. Consider for example the proctype
proctype B() {
    if
    :: i == 0 -> i = 2
    :: else -> skip
    fi
}
Then the possible interleavings (given A() as before) are i = 1; skip and i = 2; i = 1, so there is only one possible final value for i.
Indeed the notion of interleavings is crucial for Spin's view of concurrency. In a trace semantics, the set of possible traces of concurrent processes is the set of possible interleavings of the traces of the individual processes.
It simply means performing things (data accesses, executions, ...) in an arbitrary order (see the note below). In the case of concurrency, it usually refers to action interleaving.
If processes P and Q are composed in parallel (P||Q), then their actions will be interleaved. Consider the following processes:
PLAYING = (play_music -> stop_music -> STOP).
PERFORMING = (dance -> STOP).
||PLAY_PERFORM = (PLAYING || PERFORMING).
Each primitive process can be drawn as a state machine (the diagrams, generated by the LTSA model-checking tool, are omitted here).
Then the possible traces as the result of action interleaving will be:
dance -> play_music -> stop_music
play_music -> dance -> stop_music
play_music -> stop_music -> dance
(The LTSA-generated diagram for this example is likewise omitted here.)
Note: "arbitrary" here means an arbitrary choice of which process executes next, not an arbitrary order within a process. The code in each process always executes sequentially.
If you're still not comfortable with it, you can take a look at: https://www.doc.ic.ac.uk/~jnm/book/firstbook/pdf/ch3.pdf
Hope it helps! :)
Operating Systems support Tasks (or Processes). But for now let's think of "Activities".
Activities can be executed in parallel. Here are two activities, P and Q:
P: abc
Q: def
a, b, c, d, e, f are operations.*
* Each operation always has the same effect independent of what other operations may be executing at the same time (atomicity).
What is the effect of executing the two activities concurrently? We do not know for sure, but we know that it will be the same as obtained by executing sequentially an INTERLEAVING of the two activities [interleavings are also called SCHEDULES]. Here are the possible interleavings of these two activities:
abcdef
abdcef
abdecf
abdefc
adbcef
......
defabc
That is, the operations of the two activities are sequenced in all possible ways that preserve the order in which the operations appeared in the two activities. A serial interleaving [serial schedule] of two activities is one where all the operations of one activity precede all the operations of the other activity.
The importance of the concept of interleaving is that it allows us to express the meaning of concurrent programs: The parallel execution of activities is equivalent to the sequential execution of one of the interleavings of these activities.
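A short Python sketch that enumerates the interleavings of two activities, preserving each one's internal order (the first three results match the list above):

def interleavings(p, q):
    """All merges of sequences p and q that keep each one's order."""
    if not p:
        yield q
        return
    if not q:
        yield p
        return
    for rest in interleavings(p[1:], q):
        yield p[0] + rest
    for rest in interleavings(p, q[1:]):
        yield q[0] + rest

result = list(interleavings("abc", "def"))
print(len(result))   # 20 possible schedules (6 choose 3)
print(result[:3])    # ['abcdef', 'abdcef', 'abdecf']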
For detailed information: https://cis.temple.edu/~ingargio/cis307/readings/interleave.html

Multithread+Recursion strategies

I am just starting to learn the ins and outs of multithreaded programming and have a few basic questions that, once answered, should keep me occupied for quite some time. I understand that multithreading loses its effectiveness once you have created more threads than there are cores (due to context switching and cache flushing). With that understood, I can think of two ways to employ multithreading of a recursive function... but am not quite sure what the common way to approach the problem is. One seems much more complicated, perhaps with a higher payoff... but that's what I hope you will be able to tell me.
Below is pseudo-code for two different methods of multithreading a recursive function. I have used the terminology of merge sort for simplicity, but it's not that important. It is easy to see how to generalize the methods to other problems. Also, I will personally be employing these methods using the pthreads library in C, so the thread syntax mildly reflects this.
Method 1:
main ()
{
    A = array of length N
    NUM_CORES = get number of functional cores
    chunk[NUM_CORES] = array of indices partitioning A into (N / NUM_CORES) sized chunks
    thread_id[NUM_CORES] = array of thread ids
    thread[NUM_CORES] = array of thread type
    // start NUM_CORES threads, each working on one chunk of A
    for i = 0 to (NUM_CORES - 1) {
        thread_id[i] = thread_start(thread[i], MergeSort, chunk[i])
    }
    // wait for all threads to finish
    // merge chunks appropriately
    exit
}
MergeSort ( chunk )
{
    if chunk is trivially small, sort it directly and return   // base case
    MergeSort ( lowerSubChunk )
    MergeSort ( higherSubChunk )
    Merge ( lowerSubChunk, higherSubChunk )
}
//Merge(,) not shown
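For concreteness, here is a runnable Python sketch of Method 1 (an illustration only; note that in CPython a process pool, rather than pthread-style threads, is needed for a CPU-bound speedup):

import heapq
import os
from concurrent.futures import ProcessPoolExecutor

def parallel_sort(a):
    cores = os.cpu_count() or 2
    size = -(-len(a) // cores)                    # ceil(N / NUM_CORES)
    chunks = [a[i:i + size] for i in range(0, len(a), size)]
    # start one worker per chunk, each sorting independently...
    with ProcessPoolExecutor(max_workers=cores) as pool:
        sorted_chunks = list(pool.map(sorted, chunks))
    # ...then merge the sorted chunks appropriately
    return list(heapq.merge(*sorted_chunks))

if __name__ == "__main__":
    print(parallel_sort([5, 3, 8, 1, 9, 2, 7, 4, 6, 0]))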
Method 2:
main ()
{
    A = array of length N
    NUM_CORES = get number of functional cores
    chunk = indices 0 and N
    thread_id[NUM_CORES] = array of thread ids
    thread[NUM_CORES] = array of thread type
    // lock variable aka mutex
    THREADS_IN_USE = 1
    MergeSort( chunk )
    exit
}
MergeSort ( chunk )
{
    lock THREADS_IN_USE
    if ( THREADS_IN_USE < NUM_CORES ) {
        FREE_CORE = find index of unused core
        thread_id[FREE_CORE] = thread_start(thread[FREE_CORE], MergeSort, lowerSubChunk)
        THREADS_IN_USE++
        unlock THREADS_IN_USE
        MergeSort( higherSubChunk )
        // wait for thread_id[FREE_CORE] and current thread to finish
        lock THREADS_IN_USE
        THREADS_IN_USE--
        unlock THREADS_IN_USE
        Merge(lowerSubChunk, higherSubChunk)
    }
    else {
        unlock THREADS_IN_USE
        MergeSort( lowerSubChunk )
        MergeSort( higherSubChunk )
        Merge(lowerSubChunk, higherSubChunk)
    }
}
//Merge(,) not shown
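And a corresponding Python sketch of Method 2 (again just an illustration), where a semaphore plays the role of the THREADS_IN_USE counter and bounds how many helper threads exist at once:

import os
import threading

# one permit per core beyond the one the main thread occupies
helpers = threading.BoundedSemaphore(max((os.cpu_count() or 2) - 1, 1))

def merge(left, right):
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

def merge_sort(a):
    if len(a) <= 1:
        return a                                   # base case
    mid = len(a) // 2
    if helpers.acquire(blocking=False):            # a "core" is free
        box = {}
        t = threading.Thread(target=lambda: box.update(left=merge_sort(a[:mid])))
        t.start()
        right = merge_sort(a[mid:])                # current thread takes the right half
        t.join()                                   # wait for the helper...
        helpers.release()                          # ...and mark its core free again
        return merge(box["left"], right)
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))

print(merge_sort([5, 3, 8, 1, 9, 2, 7, 4, 6, 0]))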
Visually, one can think of the differences between these two methods as follows:
Method 1: creates NUM_CORES separate recursion trees, each one having a single core traversing it.
Method 2: creates a single recursion tree but has all cores traversing it. In particular, whenever there is a free core, it is set to work on the "left child subtree" of the first node where MergeSort is called after the core is freed.
The problem with Method 1 is that if the running time of the recursive function varies with the distribution of values within each initial subchunk (i.e. the chunk[i]), one thread could finish much faster than the others, leaving a core sitting idle while they finish. With merge sort this is not likely to be the case, since the work of MergeSort happens in Merge, whose runtime isn't affected much by the distribution of values in the (sorted) subchunks. However, with a more involved recursive function, the running time on one subchunk could be much longer!
With Method 2 it is possible to have the same problem. Again, with merge sort it's not clear, since the running time for each subchunk is likely to be similar, and the line //wait for thread_id[FREE_CORE] and current thread to finish would also require one core to wait for the other. However, with Method 2, all calls to Merge run as soon as possible, as opposed to Method 1, where one must wait for NUM_CORES calls to MergeSort to finish and then do NUM_CORES - 1 merges afterward (although you can multithread this as well... to an extent).
(though the syntax might not be completely correct)
Are both of these methods used in practice? Are there situations where one is more beneficial than the other? Is this the correct way to implement Method 2? (In this case, is THREADS_IN_USE effectively a semaphore?)
Thanks so much for your help!

How to emulate a GOTO statement in Groovy?

I saw this nice blog post about Scala continuations that 'emulates' a GOTO statement in the Scala language. (Read more about continuations here.)
I would like to have the same in Groovy. I think it's possible with a Groovy compiler phase transformation.
I'm working on a domain-specific language (DSL), preferably embedded in Groovy. I would like to have the GOTO statement because the DSL is an unstructured language (it is generated from workflow diagrams). I need a goto to labels, not to line numbers.
The DSL is a language for workflow definitions, and because there are no restrictions on the arrows between nodes, a goto is needed (the alternative is unreadable code with whiles etc.).
As a beginner in Groovy and Scala, I don't know if I can translate the Scala solution to Groovy, but I don't think Groovy has continuations.
I'm looking for an algorithm/code for emulating labeled gotos in Groovy. One algorithm I had in mind is to use eval repeatedly, performing the eval whenever a goto is reached.
The DSL is evaluated with an eval already.
I'm not looking for a 'while' loop or something, but rather translating this code so that it works (some other syntax is no problem)
label1:
a();
b();
goto label1;
PS:
Please spare me the discussion of whether I should really use/want the GOTO statement. The DSL is a specification language and probably won't deal with variables, efficiency, etc.
PS2: A keyword other than GOTO can be used.
You might want to tell a little bit more about the language you are trying to build, perhaps it's simple enough that dealing with transformations would be overengineering.
Playing with the AST is something groovy people have been doing for years and it's really powerful.
The Spock framework guys rewrite the tests you create, annotating the code with labels. http://code.google.com/p/spock/
Hamlet D'Arcy has given several presentations on the matter. Several posts can also be found on his blog. http://hamletdarcy.blogspot.com/
Cedric Champeau describes an interesting transformation he built and its evolution http://www.jroller.com/melix/
Probably missing lots of other guys but those I remember.
Possible starting points that you probably already know, but which are really useful: http://groovy.codehaus.org/Compile-time+Metaprogramming+-+AST+Transformations
http://groovy.codehaus.org/Building+AST+Guide
Long story short, I'd say it's quite possible.
You won't get anywhere trying this, as goto is a reserved word in Groovy (as it is in Java), so using it in your DSL will be problematic.
It's not a reserved word in Scala, so this isn't an issue
You can emulate if and goto with while loops. It won't be pretty, and it will introduce lots of unnecessary code blocks, but it should work for any function. The structured program theorem (Böhm–Jacopini) proves it is always possible to rewrite code this way, though of course being possible does not mean it's nice or easy.
Basically, you move all local variables to the beginning of the function and add a boolean takeJump local variable. Then, for each goto+label pair, you wrap the code between the label and the goto in a while (takeJump) { ... } loop, setting the flag before the while and at the bottom of the loop body to control whether the jump is taken.
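A tiny Python illustration of that rewrite, applied to the label1/a()/b() example from the question (the loop bound is invented so the program terminates):

def a(): print("a")
def b(): print("b")

count = 0
take_jump = True
while take_jump:            # label1:
    a()
    b()
    count += 1
    take_jump = count < 3   # stay in the loop while the goto would be taken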
But to be honest I don't recommend that approach. I'd rather use a library that allows me to build an AST with labels and gotos and then translates that directly to byte-code.
Or use some other language built on the java vm that does support goto. I'm sure there is such a language.
Just throwing this out there: perhaps you could use a scoped switch/case.
So if your DSL says this:
def foo() {
    def x = x()
    def y
    def z
label a:
    y = y(x)
    if (y < someConst) goto a
label b:
    z = y(z)
    if (z > someConst) goto c
    x = y(y(z + x))
    z = y(x)
label c:
    return z;
}
Your "compiler" can turn it into this:
def foo() {
    def x
    def y
    def z
    def retval
    String currentLABEL = "NO_LABEL"
    boolean SCOPED_INTO_BLOCK_0143 = true
    while (SCOPED_INTO_BLOCK_0143) {
        switch (currentLABEL) {
            case "NO_LABEL":
                x = x()
                // fall through
            case "LABEL_A":
                y = y(x)
                if (y < someConst) {
                    currentLABEL = "LABEL_A"
                    break
                }
                // fall through
            case "LABEL_B":
                z = y(z)
                if (z > someConst) {
                    currentLABEL = "LABEL_C"
                    break
                }
                x = y(y(z + x))
                z = y(x)
                // fall through
            case "LABEL_C":
                SCOPED_INTO_BLOCK_0143 = false
                retval = z
        }
    }
    return retval
}

Are there any programming languages where variables are really functions?

For example, I would write:
x = 2
y = x + 4
print(y)
x = 5
print(y)
And it would output:
6 (=2+4)
9 (=5+4)
Also, are there any cases where this could actually be useful?
Clarification: Yes, lambdas etc. solve this problem (they were how I arrived at this idea); I was wondering if there were specific languages where this was the default: no function or lambda keywords required or needed.
Haskell will meet you halfway, because essentially everything is a function, but variables are only bound once (meaning you cannot reassign x in the same scope).
It's easy to consider y = x + 4 a variable assignment, but when you look at y = map (+4) [1..] (which means add 4 to every number in the infinite list from 1 upwards), what is y now? Is it an infinite list, or is it a function that returns an infinite list? (Hint: it's the second one.) In this case, treating variables as functions can be extremely beneficial, if not an absolute necessity, when taking advantage of laziness.
Really, in Haskell, your definition of y is a function accepting no arguments and returning x+4, where x is also a function that takes no arguments, but returns the value 2.
In any language with first-class functions, it's trivial to assign anonymous functions to variables, but in most languages you'll have to add the parentheses to indicate a function call.
Example Lua code:
x = function() return 2 end
y = function() return x() + 4 end
print(y())
x = function() return 5 end
print(y())
$ lua x.lua
6
9
Or the same thing in Python (sticking with functions, though we could have just used a plain integer for x, since y closes over it):
x = lambda: 2
y = lambda: x() + 4
print(y())
x = lambda: 5
print(y())
$ python x.py
6
9
You can use Func delegates in C#:
Func<int, int> y = (x) => x + 5;
Console.WriteLine(y(5)); // 10
Console.WriteLine(y(3)); // 8
... or ...
int x = 0;
Func<int> y = () => x + 5;
x = 5;
Console.WriteLine(y()); // 10
x = 3;
Console.WriteLine(y()); // 8
... if you really want to program in a functional style, the first option would probably be best: it looks more like the notation you saw in math class, and you don't have to worry about external state.
Check out various functional languages like F#, Haskell, and Scala. Scala treats functions as objects that have an apply() method, and you can store them in variables and pass them around like you can any other kind of object. I don't know that you can print out the definition of a Scala function as code though.
Update: I seem to recall that at least some Lisps allow you to pretty-print a function as code (eg, Scheme's pretty-print function).
This is the way spreadsheets work.
It is also related to call-by-name semantics for evaluating function arguments. Algol 60 had that, but it didn't catch on; it was too complicated to implement.
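A toy Python sketch of the spreadsheet model, where reading a cell re-evaluates its formula (a hypothetical Cell class, not a real library):

class Cell:
    """A spreadsheet-style cell holding a zero-argument formula."""
    def __init__(self, formula):
        self.formula = formula

    @property
    def value(self):
        return self.formula()   # recomputed on every read

x = Cell(lambda: 2)
y = Cell(lambda: x.value + 4)
print(y.value)   # 6
x.formula = lambda: 5
print(y.value)   # 9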
The programming language Lucid does this, although it calls x and y "streams" rather than functions.
The program would be written:
y
  where
    y = x + 4
  end
And then you'd input:
x(0): 2
y = 6
x(1): 5
y = 9
Of course, Lucid (like most interesting programming languages) is fairly obscure, so I'm not surprised that nobody else found it. (or looked for it)
Try checking out F# here and the Wikipedia article on functional programming languages.
I haven't worked with these kinds of languages myself yet, since I've been concentrating on OOP, but I'll be delving in soon once F# is out.
Hope this helps!
The closest I've seen to this has been in technical-analysis systems in charting components (TradeStation, MetaStock, etc.), but mainly they focus on returning multiple sets of metadata (e.g. buy/sell signals) which can then be fed into other functions that accept either metadata or financial data, or be plotted directly.
My 2c:
I'd say a language like you suggest would be highly confusing, to say the least. Functions are generally r-values for good reason. This code (JavaScript) shows how enforcing functions as r-values increases readability (and therefore maintainability) n-fold:
var x = 2;
var y = function() { return x + 4; };
alert(y()); // 6
x = 5;
alert(y()); // 9
Self makes no distinction between fields and methods, both are slots and can be accessed in exactly the same way. A slot can contain a value or a function (so those two are still separate entities), but the distinction doesn't matter to the user of the slot.
In Scala, you have lazy values and call-by-name arguments in functions.
def foo(x: => Int) {
    println(x)
    println(x) // x is evaluated again!
}
In some ways, this can have the effect you're looking for.
I believe the mathematically oriented languages like Octave, R and Maxima do that. I could be wrong, but no one else has mentioned them, so I thought I would.
