Related
How can I know if I actually need to return an l-value when using FALLBACK?
I'm using return-rw but I'd like to only use return where possible. I want to track if I've actually modified %!attrs or have only just read the value when FALLBACK was called.
Or (alternate plan B) can I attach a callback or something similar to my %!attrs to monitor for changes?
class Foo {
has %.attrs;
submethod BUILD { %!attrs{'bar'} = 'bar' }
# multi method FALLBACK(Str:D $name, *#rest) {
# say 'read-only';
# return %!attrs{$name} if %!attrs«$name»:exists;
# }
multi method FALLBACK(Str:D $name, *#rest) {
say 'read-write';
return-rw %!attrs{$name} if %!attrs«$name»:exists;
}
}
my $foo = Foo.new;
say $foo.bar;
$foo.bar = 'baz';
say $foo.bar;
This feels a bit like a X-Y question, so let's simplify the example, and see if that answers helps in your decisions.
First of all: if you return the "value" of a non-existing key in a hash, you are in fact returning a container that will auto-vivify the key in the hash when assigned to:
my %hash;
sub get($key) { return-rw %hash{$key} }
get("foo") = 42;
dd %hash; # Hash %hash = {:foo(42)}
Please note that you need to use return-rw here to ensure the actual container is returned, rather than just the value in the container. Alternately, you can use the is raw trait, which allows you to just set the last value:
my %hash;
sub get($key) is raw { %hash{$key} }
get("foo") = 42;
dd %hash; # Hash %hash = {:foo(42)}
Note that you should not use return in that case, as that will still de-containerize again.
To get back to your question:
I want to track if I've actually modified %!attrs or have only just read the value when FALLBACK was called.
class Foo {
has %!attrs;
has %!unexpected;
method TWEAK() { %!attrs<bar> = 'bar' }
method FALLBACK(Str:D $name, *#rest) is raw {
if %!attrs{$name}:exists {
%!attrs{$name}
}
else {
%!unexpected{$name}++;
Any
}
}
}
This would either return the container found in the hash, or record the access to the unknown key and return an immutable Any.
Regarding plan B, recording changes: for that you could use a Proxy object for that.
Hope this helps in your quest.
Liz's answer is full of useful info and you've accepted it but I thought the following might still be of interest.
How to know if returning an l-value ... ?
Let's start by ignoring the FALLBACK clause.
You would have to test the value. To deal with Scalars, you must test the .VAR of the value. (For non-Scalar values the .VAR acts like a "no op".) I think (but don't quote me) that Scalar|Array|Hash covers all the l-value super-types:
my \value = 42; # Int is an l-value is False
my \l-value-one = $; # Scalar is an l-value is True
my \l-value-too = #; # Array is an l-value is True
say "{.VAR.^name} is an l-value is {.VAR ~~ Scalar|Array|Hash}"
for value, l-value-one, l-value-too
How to know if returning an l-value when using FALLBACK?
Adding "when using FALLBACK" makes no difference to the answer.
How can I know if I actually need to return an l-value ... ?
Again, let's start by ignoring the FALLBACK clause.
This is a completely different question than "How to know if returning an l-value ... ?". I think it's the core of your question.
Afaik, the answer is, you need to anticipate how the returned value will be used. If there's any chance it'll be used as an l-value, and you want that usage to work, then you need to return an l-value. The language/compiler can't (or at least doesn't) help you make that decision.
Consider some related scenarios:
my $baz := foo.bar;
... (100s of lines of code) ...
$baz = 42;
Unless the first line returns an l-value, the second line will fail.
But the situation is actually much more immediate than that:
routine-foo = 42;
routine-foo is evaluated first, in its entirety, before the lhs = rhs expression is evaluated.
Unless the compiler's resolution of the routine-foo call somehow incorporated the fact that the very next thing to happen would be that the lhs will be assigned to, then there would be no way for a singly or multiply dispatched routine-foo to know whether it can safely return an r-value or must return an l-value.
And the compiler's resolution does not incorporate that. Thus, for example:
multi term:<bar> is rw { ... }
multi term:<bar> { ... }
bar = 99; # Ambiguous call to 'term:<bar>(...)'
I can imagine this one day (N years from now) being solved by a combination of allowing = to be an overloadable operator, robust macros that allow overloading of = being available, and routine resolution being modified so the above ambiguous call could do something equivalent to resolving to the is rw multi. But I doubt it will actually come to pass even with N=10. Perhaps there is another way but I can't think of one at the moment.
How can I know if I actually need to return an l-value when using FALLBACK?
Again, adding "when using FALLBACK" makes no difference to the answer.
I want to track if I've actually modified %!attrs or have only just read the value when FALLBACK was called.
When FALLBACK is called it doesn't know what context it's being called in -- r-value or l-value. Any modification comes after it has already returned.
In other words, whatever solution you come up with will being nothing to do per se with FALLBACK (even if you have to use it to implement some other aspect of whatever it is you're trying to do).
(Even if it were, I suspect trying to solve it via FALLBACK itself would just make matters worse. One can imagine writing two FALLBACK multis, one with an is rw trait, but, as explained above, my imagination doesn't stretch to that making any difference any time soon, if ever, and could only happen if the above imaginary things happened (the macros etc.) and the compiler was also modified to pay attention to the two FALLBACK multi variants, and I'm not at all meaning to suggest that that even makes sense.)
Plan B
Or (alternate plan B) can I attach a callback or something similar to my %!attrs to monitor for changes?
As Lizmat notes, that's the realm of Proxys. And thus your next SO question... :)
im am writing a simple emulator in go (should i? or should i go back to c?).
anyway, i am fetching the instruction and decoding it. at this point i have a byte like 0x81, and i have to execute the right function.
should i have something like this
func (sys *cpu) eval() {
switch opcode {
case 0x80:
sys.add(sys.b)
case 0x81:
sys.add(sys.c)
etc
}
}
or something like this
var fnTable = []func(*cpu) {
0x80: func(sys *cpu) {
sys.add(sys.b)
},
0x81: func(sys *cpu) {
sys.add(sys.c)
}
}
func (sys *cpu) eval() {
return fnTable[opcode](sys)
}
1.which one is better?
2.which one is faster?
also
3.can i declare a function inline?
4.i have a cpu struct in which i have the registers etc. would it be faster if i have the registers and all as globals? (without the struct)
thank you very much.
I did some benchmarks and the table version is faster than the switch version once you have more than about 4 cases.
I was surprised to discover that the Go compiler (gc, anyway; not sure about gccgo) doesn't seem to be smart enough to turn a dense switch into a jump table.
Update:
Ken Thompson posted on the Go mailing list describing the difficulties of optimizing switch.
The first version looks better to me, YMMV.
Benchmark it. Depends how good is the compiler at optimizing. The "jump table" version might be faster if the compiler doesn't try hard enough to optimize.
Depends on your definition of what is "to declare function inline". Go can declare and define functions/methods at the top level only. But functions are first class citizens in Go, so one can have variables/parameters/return values and structured types of function type. In all this places a function literal can [also] be assigned to the variable/field/element...
Possibly. Still I would suggest to not keep the cpu state in a global variable. Once you possibly decide to go emulating multicore, it will be welcome ;-)
If you have the ast of some expression, and you want to eval it for a big amount of data rows, then you may only once compile it into the tree of lambdas, and do not calculate any switches on each iteration at all;
For example, given such ast: {* (a, {+ (b, c)})}
Compile function (in very rough pseudo language) will be something like this:
func (e *evaluator) compile(brunch ast) {
switch brunch.type {
case binaryOperator:
switch brunch.op {
case *: return func() {compile(brunch.arg0) * compile(brunch.arg1)}
case +: return func() {compile(brunch.arg0) + compile(brunch.arg1)}
}
case BasicLit: return func() {return brunch.arg0}
case Ident: return func(){return e.GetIdent(brunch.arg0)}
}
}
So eventually compile returns the func, that must be called on each row of your data and there will be no switches or other calculation stuff at all.
There remains the question about operations with data of different types, that is for your own research ;)
This is an interesting approach, in situations, when there is no jump-table mechanism available :) but sure, func call is more complex operation then jump.
In some dynamic languages I have seen this kind of syntax:
myValue = if (this.IsValidObject)
{
UpdateGraph();
UpdateCount();
this.Name;
}
else
{
Debug.Log (Exceptions.UninitializedObject);
3;
}
Basically being able to return the last statement in a branch as the return value for a variable, not necessarily only for method returns, but they could be achieved as well.
What's the name of this feature?
Can this also be achieved in staticly typed languages such as C#? I know C# has ternary operator, but I mean using if statements, switch statements as shown above.
It is called "conditional-branches-are-expressions" or "death to the statement/expression divide".
See Conditional If Expressions:
Many languages support if expressions, which are similar to if statements, but return a value as a result. Thus, they are true expressions (which evaluate to a value), not statements (which just perform an action).
That is, if (expr) { ... } is an expression (could possible be an expression or a statement depending upon context) in the language grammar just as ?: is an expression in languages like C, C# or Java.
This form is common in functional programming languages (which eschew side-effects) -- however, it is not "functional programming" per se and exists in other language that accept/allow a "functional like syntax" while still utilizing heavy side-effects and other paradigms (e.g. Ruby).
Some languages like Perl allow this behavior to be simulated. That is, $x = eval { if (true) { "hello world!" } else { "goodbye" } }; print $x will display "hello world!" because the eval expression evaluates to the last value evaluated inside even though the if grammar production itself is not an expression. ($x = if ... is a syntax error in Perl).
Happy coding.
To answer your other question:
Can this also be achieved in staticly typed languages such as C#?
Is it a thing the language supports? No. Can it be achieved? Kind of.
C# --like C++, Java, and all that ilk-- has expressions and statements. Statements, like if-then and switch-case, don't return values and there fore can't be used as expressions. Also, as a slight aside, your example assigns myValue to either a string or an integer, which C# can't do because it is strongly typed. You'd either have to use object myValue and then accept the casting and boxing costs, use var myValue (which is still static typed, just inferred), or some other bizarre cleverness.
Anyway, so if if-then is a statement, how do you do that in C#? You'd have to build a method to accomplish the goal of if-then-else. You could use a static method as an extension to bools, to model the Smalltalk way of doing it:
public static T IfTrue(this bool value, Action doThen, Action doElse )
{
if(value)
return doThen();
else
return doElse();
}
To use this, you'd do something like
var myVal = (6 < 7).IfTrue(() => return "Less than", () => return "Greater than");
Disclaimer: I tested none of that, so it may not quite work due to typos, but I think the principle is correct.
The new IfTrue() function checks the boolean it is attached to and executes one of two delegates passed into it. They must have the same return type, and neither accepts arguments (use closures, so it won't matter).
Now, should you do that? No, almost certainly not. Its not the proper C# way of doing things so it's confusing, and its much less efficient than using an if-then. You're trading off something like 1 IL instruction for a complex mess of classes and method calls that .NET will build behind the scenes to support that.
It is a ternary conditional.
In C you can use, for example:
printf("Debug? %s\n", debug?"yes":"no");
Edited:
A compound statement list can be evaluated as a expression in C. The last statement should be a expression and the whole compound statement surrounded by braces.
For example:
#include <stdio.h>
int main(void)
{
int a=0, b=1;
a=({
printf("testing compound statement\n");
if(b==a)
printf("equals\n");
b+1;
});
printf("a=%d\n", a);
return 0;
}
So the name of the characteristic you are doing is assigning to a (local) variable a compound statement. Now I think this helps you a little bit more. For more, please visit this source:
http://www.chemie.fu-berlin.de/chemnet/use/info/gcc/gcc_8.html
Take care,
Beco.
PS. This example makes more sense in the context of your question:
a=({
int c;
if(b==a)
c=b+1;
else
c=a-1;
c;
});
In addition to returning the value of the last expression in a branch, it's likely (depending on the language) that myValue is being assigned to an anonymous function -- or in Smalltalk / Ruby, code blocks:
A block of code (an anonymous function) can be expressed as a literal value (which is an object, since all values are objects.)
In this case, since myValue is actually pointing to a function that gets invoked only when myValue is used, the language probably implements them as closures, which are originally a feature of functional languages.
Because closures are first-class functions with free variables, closures exist in C#. However, the implicit return does not occur; in C# they're simply anonymous delegates! Consider:
Func<Object> myValue = delegate()
{
if (this.IsValidObject)
{
UpdateGraph();
UpdateCount();
return this.Name;
}
else
{
Debug.Log (Exceptions.UninitializedObject);
return 3;
}
};
This can also be done in C# using lambda expressions:
Func<Object> myValue = () =>
{
if (this.IsValidObject) { ... }
else { ... }
};
I realize your question is asking about the implicit return value, but I am trying to illustrate that there is more than just "conditional branches are expressions" going on here.
Can this also be achieved in staticly
typed languages?
Sure, the types of the involved expressions can be statically and strictly checked. There seems to be nothing dependent on dynamic typing in the "if-as-expression" approach.
For example, Haskell--a strict statically typed language with a rich system of types:
$ ghci
Prelude> let x = if True then "a" else "b" in x
"a"
(the example expression could be simpler, I just wanted to reflect the assignment from your question, but the expression to demonstrate the feature could be simlpler:
Prelude> if True then "a" else "b"
"a"
.)
Every so often when programmers are complaining about null errors/exceptions someone asks what we do without null.
I have some basic idea of the coolness of option types, but I don't have the knowledge or languages skill to best express it. What is a great explanation of the following written in a way approachable to the average programmer that we could point that person towards?
The undesirability of having references/pointers be nullable by default
How option types work including strategies to ease checking null cases such as
pattern matching and
monadic comprehensions
Alternative solution such as message eating nil
(other aspects I missed)
I think the succinct summary of why null is undesirable is that meaningless states should not be representable.
Suppose I'm modeling a door. It can be in one of three states: open, shut but unlocked, and shut and locked. Now I could model it along the lines of
class Door
private bool isShut
private bool isLocked
and it is clear how to map my three states into these two boolean variables. But this leaves a fourth, undesired state available: isShut==false && isLocked==true. Because the types I have selected as my representation admit this state, I must expend mental effort to ensure that the class never gets into this state (perhaps by explicitly coding an invariant). In contrast, if I were using a language with algebraic data types or checked enumerations that lets me define
type DoorState =
| Open | ShutAndUnlocked | ShutAndLocked
then I could define
class Door
private DoorState state
and there are no more worries. The type system will ensure that there are only three possible states for an instance of class Door to be in. This is what type systems are good at - explicitly ruling out a whole class of errors at compile-time.
The problem with null is that every reference type gets this extra state in its space that is typically undesired. A string variable could be any sequence of characters, or it could be this crazy extra null value that doesn't map into my problem domain. A Triangle object has three Points, which themselves have X and Y values, but unfortunately the Points or the Triangle itself might be this crazy null value that is meaningless to the graphing domain I'm working in. Etc.
When you do intend to model a possibly-non-existent value, then you should opt into it explicitly. If the way I intend to model people is that every Person has a FirstName and a LastName, but only some people have MiddleNames, then I would like to say something like
class Person
private string FirstName
private Option<string> MiddleName
private string LastName
where string here is assumed to be a non-nullable type. Then there are no tricky invariants to establish and no unexpected NullReferenceExceptions when trying to compute the length of someone's name. The type system ensures that any code dealing with the MiddleName accounts for the possibility of it being None, whereas any code dealing with the FirstName can safely assume there is a value there.
So for example, using the type above, we could author this silly function:
let TotalNumCharsInPersonsName(p:Person) =
let middleLen = match p.MiddleName with
| None -> 0
| Some(s) -> s.Length
p.FirstName.Length + middleLen + p.LastName.Length
with no worries. In contrast, in a language with nullable references for types like string, then assuming
class Person
private string FirstName
private string MiddleName
private string LastName
you end up authoring stuff like
let TotalNumCharsInPersonsName(p:Person) =
p.FirstName.Length + p.MiddleName.Length + p.LastName.Length
which blows up if the incoming Person object does not have the invariant of everything being non-null, or
let TotalNumCharsInPersonsName(p:Person) =
(if p.FirstName=null then 0 else p.FirstName.Length)
+ (if p.MiddleName=null then 0 else p.MiddleName.Length)
+ (if p.LastName=null then 0 else p.LastName.Length)
or maybe
let TotalNumCharsInPersonsName(p:Person) =
p.FirstName.Length
+ (if p.MiddleName=null then 0 else p.MiddleName.Length)
+ p.LastName.Length
assuming that p ensures first/last are there but middle can be null, or maybe you do checks that throw different types of exceptions, or who knows what. All these crazy implementation choices and things to think about crop up because there's this stupid representable-value that you don't want or need.
Null typically adds needless complexity. Complexity is the enemy of all software, and you should strive to reduce complexity whenever reasonable.
(Note well that there is more complexity to even these simple examples. Even if a FirstName cannot be null, a string can represent "" (the empty string), which is probably also not a person name that we intend to model. As such, even with non-nullable strings, it still might be the case that we are "representing meaningless values". Again, you could choose to battle this either via invariants and conditional code at runtime, or by using the type system (e.g. to have a NonEmptyString type). The latter is perhaps ill-advised ("good" types are often "closed" over a set of common operations, and e.g. NonEmptyString is not closed over .SubString(0,0)), but it demonstrates more points in the design space. At the end of the day, in any given type system, there is some complexity it will be very good at getting rid of, and other complexity that is just intrinsically harder to get rid of. The key for this topic is that in nearly every type system, the change from "nullable references by default" to "non-nullable references by default" is nearly always a simple change that makes the type system a great deal better at battling complexity and ruling out certain types of errors and meaningless states. So it is pretty crazy that so many languages keep repeating this error again and again.)
The nice thing about option types isn't that they're optional. It is that all other types aren't.
Sometimes, we need to be able to represent a kind of "null" state. Sometimes we have to represent a "no value" option as well as the other possible values a variable may take. So a language that flat out disallows this is going to be a bit crippled.
But often, we don't need it, and allowing such a "null" state only leads to ambiguity and confusion: every time I access a reference type variable in .NET, I have to consider that it might be null.
Often, it will never actually be null, because the programmer structures the code so that it can never happen. But the compiler can't verify that, and every single time you see it, you have to ask yourself "can this be null? Do I need to check for null here?"
Ideally, in the many cases where null doesn't make sense, it shouldn't be allowed.
That's tricky to achieve in .NET, where nearly everything can be null. You have to rely on the author of the code you're calling to be 100% disciplined and consistent and have clearly documented what can and cannot be null, or you have to be paranoid and check everything.
However, if types aren't nullable by default, then you don't need to check whether or not they're null. You know they can never be null, because the compiler/type checker enforces that for you.
And then we just need a back door for the rare cases where we do need to handle a null state. Then an "option" type can be used. Then we allow null in the cases where we've made a conscious decision that we need to be able to represent the "no value" case, and in every other case, we know that the value will never be null.
As others have mentioned, in C# or Java for example, null can mean one of two things:
the variable is uninitialized. This should, ideally, never happen. A variable shouldn't exist unless it is initialized.
the variable contains some "optional" data: it needs to be able to represent the case where there is no data. This is sometimes necessary. Perhaps you're trying to find an object in a list, and you don't know in advance whether or not it's there. Then we need to be able to represent that "no object was found".
The second meaning has to be preserved, but the first one should be eliminated entirely. And even the second meaning should not be the default. It's something we can opt in to if and when we need it. But when we don't need something to be optional, we want the type checker to guarantee that it will never be null.
All of the answers so far focus on why null is a bad thing, and how it's kinda handy if a language can guarantee that certain values will never be null.
They then go on to suggest that it would be a pretty neat idea if you enforce non-nullability for all values, which can be done if you add a concept like Option or Maybe to represent types that may not always have a defined value. This is the approach taken by Haskell.
It's all good stuff! But it doesn't preclude the use of explicitly nullable / non-null types to achieve the same effect. Why, then, is Option still a good thing? After all, Scala supports nullable values (is has to, so it can work with Java libraries) but supports Options as well.
Q. So what are the benefits beyond being able to remove nulls from a language entirely?
A. Composition
If you make a naive translation from null-aware code
def fullNameLength(p:Person) = {
val middleLen =
if (null == p.middleName)
p.middleName.length
else
0
p.firstName.length + middleLen + p.lastName.length
}
to option-aware code
def fullNameLength(p:Person) = {
val middleLen = p.middleName match {
case Some(x) => x.length
case _ => 0
}
p.firstName.length + middleLen + p.lastName.length
}
there's not much difference! But it's also a terrible way to use Options... This approach is much cleaner:
def fullNameLength(p:Person) = {
val middleLen = p.middleName map {_.length} getOrElse 0
p.firstName.length + middleLen + p.lastName.length
}
Or even:
def fullNameLength(p:Person) =
p.firstName.length +
p.middleName.map{length}.getOrElse(0) +
p.lastName.length
When you start dealing with List of Options, it gets even better. Imagine that the List people is itself optional:
people flatMap(_ find (_.firstName == "joe")) map (fullNameLength)
How does this work?
//convert an Option[List[Person]] to an Option[S]
//where the function f takes a List[Person] and returns an S
people map f
//find a person named "Joe" in a List[Person].
//returns Some[Person], or None if "Joe" isn't in the list
validPeopleList find (_.firstName == "joe")
//returns None if people is None
//Some(None) if people is valid but doesn't contain Joe
//Some[Some[Person]] if Joe is found
people map (_ find (_.firstName == "joe"))
//flatten it to return None if people is None or Joe isn't found
//Some[Person] if Joe is found
people flatMap (_ find (_.firstName == "joe"))
//return Some(length) if the list isn't None and Joe is found
//otherwise return None
people flatMap (_ find (_.firstName == "joe")) map (fullNameLength)
The corresponding code with null checks (or even elvis ?: operators) would be painfully long. The real trick here is the flatMap operation, which allows for the nested comprehension of Options and collections in a way that nullable values can never achieve.
Since people seem to be missing it: null is ambiguous.
Alice's date-of-birth is null. What does it mean?
Bob's date-of-death is null. What does that mean?
A "reasonable" interpretation might be that Alice's date-of-birth exists but is unknown, whereas Bob's date-of-death does not exist (Bob is still alive). But why did we get to different answers?
Another problem: null is an edge case.
Is null = null?
Is nan = nan?
Is inf = inf?
Is +0 = -0?
Is +0/0 = -0/0?
The answers are usually "yes", "no", "yes", "yes", "no", "yes" respectively. Crazy "mathematicians" call NaN "nullity" and say it compares equal to itself. SQL treats nulls as not equal to anything (so they behave like NaNs). One wonders what happens when you try to store ±∞, ±0, and NaNs into the same database column (there are 253 NaNs, half of which are "negative").
To make matters worse, databases differ in how they treat NULL, and most of them aren't consistent (see NULL Handling in SQLite for an overview). It's pretty horrible.
And now for the obligatory story:
I recently designed a (sqlite3) database table with five columns a NOT NULL, b, id_a, id_b NOT NULL, timestamp. Because it's a generic schema designed to solve a generic problem for fairly arbitrary apps, there are two uniqueness constraints:
UNIQUE(a, b, id_a)
UNIQUE(a, b, id_b)
id_a only exists for compatibility with an existing app design (partly because I haven't come up with a better solution), and is not used in the new app. Because of the way NULL works in SQL, I can insert (1, 2, NULL, 3, t) and (1, 2, NULL, 4, t) and not violate the first uniqueness constraint (because (1, 2, NULL) != (1, 2, NULL)).
This works specifically because of how NULL works in a uniqueness constraint on most databases (presumably so it's easier to model "real-world" situations, e.g. no two people can have the same Social Security Number, but not all people have one).
FWIW, without first invoking undefined behaviour, C++ references cannot "point to" null, and it's not possible to construct a class with uninitialized reference member variables (if an exception is thrown, construction fails).
Sidenote: Occasionally you might want mutually-exclusive pointers (i.e. only one of them can be non-NULL), e.g. in a hypothetical iOS type DialogState = NotShown | ShowingActionSheet UIActionSheet | ShowingAlertView UIAlertView | Dismissed. Instead, I'm forced to do stuff like assert((bool)actionSheet + (bool)alertView == 1).
The undesirability of having having references/pointers be nullable by default.
I don't think this is the main issue with nulls, the main issue with nulls is that they can mean two things:
The reference/pointer is uninitialized: the problem here is the same as mutability in general. For one, it makes it more difficult to analyze your code.
The variable being null actually means something: this is the case which Option types actually formalize.
Languages which support Option types typically also forbid or discourage the use of uninitialized variables as well.
How option types work including strategies to ease checking null cases such as pattern matching.
In order to be effective, Option types need to be supported directly in the language. Otherwise it takes a lot of boiler-plate code to simulate them. Pattern-matching and type-inference are two keys language features making Option types easy to work with. For example:
In F#:
//first we create the option list, and then filter out all None Option types and
//map all Some Option types to their values. See how type-inference shines.
let optionList = [Some(1); Some(2); None; Some(3); None]
optionList |> List.choose id //evaluates to [1;2;3]
//here is a simple pattern-matching example
//which prints "1;2;None;3;None;".
//notice how value is extracted from op during the match
optionList
|> List.iter (function Some(value) -> printf "%i;" value | None -> printf "None;")
However, in a language like Java without direct support for Option types, we'd have something like:
//here we perform the same filter/map operation as in the F# example.
List<Option<Integer>> optionList = Arrays.asList(new Some<Integer>(1),new Some<Integer>(2),new None<Integer>(),new Some<Integer>(3),new None<Integer>());
List<Integer> filteredList = new ArrayList<Integer>();
for(Option<Integer> op : list)
if(op instanceof Some)
filteredList.add(((Some<Integer>)op).getValue());
Alternative solution such as message eating nil
Objective-C's "message eating nil" is not so much a solution as an attempt to lighten the head-ache of null checking. Basically, instead of throwing a runtime exception when trying to invoke a method on a null object, the expression instead evaluates to null itself. Suspending disbelief, it's as if each instance method begins with if (this == null) return null;. But then there is information loss: you don't know whether the method returned null because it is valid return value, or because the object is actually null. It's a lot like exception swallowing, and doesn't make any progress addressing the issues with null outlined before.
Assembly brought us addresses also known as untyped pointers. C mapped them directly as typed pointers but introduced Algol's null as a unique pointer value, compatible with all typed pointers. The big issue with null in C is that since every pointer can be null, one never can use a pointer safely without a manual check.
In higher-level languages, having null is awkward since it really conveys two distinct notions:
Telling that something is undefined.
Telling that something is optional.
Having undefined variables is pretty much useless, and yields to undefined behavior whenever they occur. I suppose everybody will agree that having things undefined should be avoided at all costs.
The second case is optionality and is best provided explicitly, for instance with an option type.
Let's say we're in a transport company and we need to create an application to help create a schedule for our drivers. For each driver, we store a few informations such as: the driving licences they have and the phone number to call in case of emergency.
In C we could have:
struct PhoneNumber { ... };
struct MotorbikeLicence { ... };
struct CarLicence { ... };
struct TruckLicence { ... };
struct Driver {
char name[32]; /* Null terminated */
struct PhoneNumber * emergency_phone_number;
struct MotorbikeLicence * motorbike_licence;
struct CarLicence * car_licence;
struct TruckLicence * truck_licence;
};
As you observe, in any processing over our list of drivers we'll have to check for null pointers. The compiler won't help you, the safety of the program relies on your shoulders.
In OCaml, the same code would look like this:
type phone_number = { ... }
type motorbike_licence = { ... }
type car_licence = { ... }
type truck_licence = { ... }
type driver = {
name: string;
emergency_phone_number: phone_number option;
motorbike_licence: motorbike_licence option;
car_licence: car_licence option;
truck_licence: truck_licence option;
}
Let's now say that we want to print the names of all the drivers along with their truck licence numbers.
In C:
#include <stdio.h>
void print_driver_with_truck_licence_number(struct Driver * driver) {
/* Check may be redundant but better be safe than sorry */
if (driver != NULL) {
printf("driver %s has ", driver->name);
if (driver->truck_licence != NULL) {
printf("truck licence %04d-%04d-%08d\n",
driver->truck_licence->area_code
driver->truck_licence->year
driver->truck_licence->num_in_year);
} else {
printf("no truck licence\n");
}
}
}
void print_drivers_with_truck_licence_numbers(struct Driver ** drivers, int nb) {
if (drivers != NULL && nb >= 0) {
int i;
for (i = 0; i < nb; ++i) {
struct Driver * driver = drivers[i];
if (driver) {
print_driver_with_truck_licence_number(driver);
} else {
/* Huh ? We got a null inside the array, meaning it probably got
corrupt somehow, what do we do ? Ignore ? Assert ? */
}
}
} else {
/* Caller provided us with erroneous input, what do we do ?
Ignore ? Assert ? */
}
}
In OCaml that would be:
open Printf
(* Here we are guaranteed to have a driver instance *)
let print_driver_with_truck_licence_number driver =
printf "driver %s has " driver.name;
match driver.truck_licence with
| None ->
printf "no truck licence\n"
| Some licence ->
(* Here we are guaranteed to have a licence *)
printf "truck licence %04d-%04d-%08d\n"
licence.area_code
licence.year
licence.num_in_year
(* Here we are guaranteed to have a valid list of drivers *)
let print_drivers_with_truck_licence_numbers drivers =
List.iter print_driver_with_truck_licence_number drivers
As you can see in this trivial example, there is nothing complicated in the safe version:
It's terser.
You get much better guarantees and no null check is required at all.
The compiler ensured that you correctly dealt with the option
Whereas in C, you could just have forgotten a null check and boom...
Note : these code samples where not compiled, but I hope you got the ideas.
Microsoft Research has a intersting project called
Spec#
It is a C# extension with not-null type and some mechanism to check your objects against not being null, although, IMHO, applying the design by contract principle may be more appropriate and more helpful for many troublesome situations caused by null references.
Robert Nystrom offers a nice article here:
http://journal.stuffwithstuff.com/2010/08/23/void-null-maybe-and-nothing/
describing his thought process when adding support for absence and failure to his Magpie programming language.
Coming from .NET background, I always thought null had a point, its useful. Until I came to know of structs and how easy it was working with them avoiding a lot of boilerplate code. Tony Hoare speaking at QCon London in 2009, apologized for inventing the null reference. To quote him:
I call it my billion-dollar mistake. It was the invention of the null
reference in 1965. At that time, I was designing the first
comprehensive type system for references in an object oriented
language (ALGOL W). My goal was to ensure that all use of references
should be absolutely safe, with checking performed automatically by
the compiler. But I couldn't resist the temptation to put in a null
reference, simply because it was so easy to implement. This has led to
innumerable errors, vulnerabilities, and system crashes, which have
probably caused a billion dollars of pain and damage in the last forty
years. In recent years, a number of program analysers like PREfix and
PREfast in Microsoft have been used to check references, and give
warnings if there is a risk they may be non-null. More recent
programming languages like Spec# have introduced declarations for
non-null references. This is the solution, which I rejected in 1965.
See this question too at programmers
I've always looked at Null (or nil) as being the absence of a value.
Sometimes you want this, sometimes you don't. It depends on the domain you are working with. If the absence is meaningful: no middle name, then your application can act accordingly. On the other hand if the null value should not be there: The first name is null, then the developer gets the proverbial 2 a.m. phone call.
I've also seen code overloaded and over-complicated with checks for null. To me this means one of two things:
a) a bug higher up in the application tree
b) bad/incomplete design
On the positive side - Null is probably one of the more useful notions for checking if something is absent, and languages without the concept of null will endup over-complicating things when it's time to do data validation. In this case, if a new variable is not initialized, said languagues will usually set variables to an empty string, 0, or an empty collection. However, if an empty string or 0 or empty collection are valid values for your application -- then you have a problem.
Sometimes this circumvented by inventing special/weird values for fields to represent an uninitialized state. But then what happens when the special value is entered by a well-intentioned user? And let's not get into the mess this will make of data validation routines.
If the language supported the null concept all the concerns would vanish.
Vector languages can sometimes get away with not having a null.
The empty vector serves as a typed null in this case.
I have read quite a few articles on closures, and, embarassingly enough, I still don't understand this concept! Articles explain how to create a closure with a few examples, but I don't see any point in paying much attention to them, as they largely look contrived examples. I am not saying all of them are contrived, just that the ones I found looked contrived, and I dint see how even after understanding them, I will be able to use them. So in order to understand closures, I am looking at a few real problems, that can be solved very naturally using closures.
For instance, a natural way to explain recursion to a person could be to explain the computation of n!. It is very natural to understand a problem like computing the factorial of a number using recursion. Similarly, it is almost a no-brainer to find an element in an unsorted array by reading each element, and comparing with the number in question. Also, at a different level, doing Object-oriented programming also makes sense.
So I am trying to find a number of problems that could be solved with or without closures, but using closures makes thinking about them and solving them easier. Also, there are two types to closures, where each call to a closure can create a copy of the environment variables, or reference the same variables. So what sort of problems can be solved more naturally in which of the closure implementations?
Callbacks are a great example. Let's take JavaScript.
Imagine you have a news site, with headlines and short blurbs and "Read More..." buttons next to each one. When the user clicks the button, you want to asynchronously load the content corresponding to the clicked button, and present the user with an indicator next to the requested headline so that the user can "see the page working at it".
function handleClickOnReadMore(element) {
// the user clicked on one of our 17 "request more info" buttons.
// we'll put a whirly gif next to the clicked one so the user knows
// what he's waiting for...
spinner = makeSpinnerNextTo(element);
// now get the data from the server
performAJAXRequest('http://example.com/',
function(response) {
handleResponse(response);
// this is where the closure comes in
// the identity of the spinner did not
// go through the response, but the handler
// still knows it, even seconds later,
// because the closure remembers it.
stopSpinner(spinner);
}
);
}
Alright, say you're generating a menu in, say, javascript.
var menuItems = [
{ id: 1, text: 'Home' },
{ id: 2, text: 'About' },
{ id: 3, text: 'Contact' }
];
And your creating them in a loop, like this:
for(var i = 0; i < menuItems.length; i++) {
var menuItem = menuItems[i];
var a = document.createElement('a');
a.href = '#';
a.innerHTML = menuItem.text;
// assign onclick listener
myMenu.appendChild(a);
}
Now, for the onclick listener, you might attempt something like this:
a.addEventListener('click', function() {
alert( menuItem.id );
}, false);
But you'll find that this will have every link alert '3'. Because at the time of the click, your onclick code is executed, and the value of menuItem is evaluated to the last item, since that was the value it was last assigned, upon the last iteration of the for loop.
Instead, what you can do is to create a new closure for each link, using the value of menuItem as it is at that point in execution
a.addEventListener('click', (function(item) {
return function() {
alert( item.id );
}
})( menuItem ), false);
So what's going on here? We're actually creating a function that accepts item and then immediately call that function, passing menuItem. So that 'wrapper' function is not what will be executed when you click the link. We're executing it immediately, and assigning the returned function as the click handler.
What happens here is that, for each iteration, the function is called with a new value of menuItem, and a function is returned which has access to that value, which is effectively creating a new closure.
Hope that cleared things up =)
Personally I hated to write sort routines when I had a list of objects.
Usually you had the sort function and a separate function to compare the two objects.
Now you can do it in one statement
List<Person> fList = new List<Person>();
fList.Sort((a, b) => a.Age.CompareTo(b.Age));
Well, after some time, spent with Scala, I can not imagine a code, operating on some list without closures. Example:
val multiplier = (i:Int) => i * 2
val list1 = List(1, 2, 3, 4, 5) map multiplier
val divider = (i:Int) => i / 2
val list2 = List(1, 2, 3, 4, 5) map divider
val map = list1 zip list2
println(map)
The output would be
List((2,0), (4,1), (6,1), (8,2), (10,2))
I'm not sure, if it is the example, you are looking for, but I personally think, that best examples of closures's real power can be seen on lists-examples: all kinds of sortings, searchings, iterating, etc.
I personally find massively helpful presentation by Stuart Langridge on Closures in JavaScript (slides in pdf). It's full of good examples and a bit of sense of humour as well.
In C#, a function can implement the concept of filtering by proximity to a point:
IEnumerable<Point> WithinRadius(this IEnumerable<Point> points, Point c, double r)
{
return points.Where(p => (p - c).Length <= r);
}
The lambda (closure) captures the supplied parameters and binds them up into a computation that will performed at some later time.
Well, I use Closures on a daily basis in the form of a function composition operator and a curry operator, both are implemented in Scheme by closures:
Quicksort in Scheme for instance:
(define (qsort lst cmp)
(if (null? lst) lst
(let* ((pivot (car lst))
(stacks (split-by (cdr lst) (curry cmp pivot))))
(append (qsort (cadr stacks) cmp)
(cons pivot
(qsort (car stacks) cmp))))))
Where I cmp is a binary function usually applied as (cmp one two), I curried it in this case to split my stack in two by creating a unitary predicate if you like, my curry operators:
(define (curry f . args)
(lambda lst (apply f (append args lst))))
(define (curryl f . args)
(lambda lst (apply f (append lst args))))
Which respectively curry righthanded and lefthanded.
So currying is a good example of closures, another example is functional composition. Or for instance a function which takes list and makes from that a predicate that tests if its argument is a member of that list or not, that's a closure too.
Closure is one of the power strengths of JavaScript, because JavaScript is a lambda language. For more on this subject go here:
Trying to simplify some Javascript with closures