Best way to build dynamic SQL involving an optional limit? - jooq

What is the best way to optionally apply a LIMIT to a query in jOOQ? I want to run:
SelectSeekStepN<Record> readyToFetch = dslContext.select(selectFields)
    .from(derivedTable)
    .where(conditions)
    .orderBy(orderForward);
if (length != Integer.MAX_VALUE)
    readyToFetch = readyToFetch.limit(length);
limit() returns SelectLimitPercentStep<Record>, which is not a subclass of SelectSeekStepN<Record>, so I get a compiler error.
If, on the other hand, I change the declared type of readyToFetch from SelectSeekStepN<Record> to Select<Record>, which is compatible with the return type of limit(), then I cannot invoke limit() on Select<Record>; I would need to explicitly cast it to SelectSeekStepN<Record>.
Is there a better way to do this?
Maybe jOOQ should treat Integer.MAX_VALUE as a special value (no limit) to make this kind of code easier to write...

Offering a no-op to pass to clauses like LIMIT
There is an occasional feature request asking for such no-op clauses in the DSL API, which would obviously be very helpful, especially in the case of LIMIT, where there is currently no non-hacky workaround. Unfortunately, there's no good solution yet, other than the one you've already mentioned in your question, to dynamically construct your SQL query.
For most clauses where optionality is required, something like a DSL.noCondition() exists. A DSL.noTable() has been requested, but not yet implemented (as of jOOQ 3.14). Same with a "no-op" for LIMIT: https://github.com/jOOQ/jOOQ/issues/11551
Getting the types right with dynamic SQL
Your own question already contains the solution. It's just a minor typing problem. You probably chose to assign your intermediate step to SelectSeekStepN because your IDE suggested this type. But you can use any supertype instead.
Select<Record> readyToFetch;
SelectLimitStep<Record> readyToLimit;
readyToFetch = readyToLimit = dslContext.select(selectFields)
    .from(derivedTable)
    .where(conditions)
    .orderBy(orderForward);
if (length != Integer.MAX_VALUE)
    readyToFetch = readyToLimit.limit(length);
readyToFetch.fetch();
You can take some inspiration from the ParserImpl logic. It does this all over the place. Assignment expressions are a blessing!
Alternative using type inference on conditional expressions:
SelectLimitStep<Record> limit = dslContext.select(selectFields)
    .from(derivedTable)
    .where(conditions)
    .orderBy(orderForward);
Result<?> result = (length != Integer.MAX_VALUE ? limit.limit(length) : limit).fetch();
Using null as a way to explicitly indicate the absence of LIMIT
Using null as a way to indicate the absence of a LIMIT is a very bad idea for at least 3 reasons:
Most of the jOOQ API interprets (Field<?>) null as a NULL bind value or NULL literal, never as an absent value. It would be very surprising if, suddenly, we used null for that purpose only in LIMIT.
Even if we did, we'd have to start distinguishing between null (the internal interpretation of an absent value) and null (the value you as a user provide jOOQ with explicitly). So, we'd need some noLimit() object anyway internally to make the distinction, in which case, why not just expose that as API instead of letting you hack around?
Some dialects support NULL limits. PostgreSQL interprets it as an absent LIMIT (which I find very confusing, the LIMIT being "unknown"). Oracle interprets it as LIMIT 0, which is much more reasonable. Other dialects (e.g. MySQL) reject LIMIT NULL as bad syntax, which is also reasonable. You're suggesting jOOQ overrides this behaviour and cleverly re-interprets it. I'd rather not!

I dug into the implementation and discovered that there is another method limit(Number) that treats null values as no limit. Consequently, the code could be written as:
Select<Record> readyToFetch = dslContext.select(selectFields).
from(derivedTable).
where(conditions).
orderBy(orderForward).
limit(length == Integer.MAX_VALUE ? null : length);

Related

What is the difference between Option::None in Rust and null in other languages?

I want to know what the main differences are between Rust's Option::None and "null" in other programming languages. And why is None considered to be better?
A brief history lesson in how to not have a value!
In the beginning, there was C. And in C, we had these things called pointers. Oftentimes, it was useful for a pointer to be initialized later, or to be optional, so the C community decided that there would be a special place in memory, usually memory address zero, which we would all just agree meant "there's no value here". Functions which returned T* could return NULL (written in all-caps in C, since it's a macro) to indicate failure, or lack of value. In a low-level language like C, in that day and age, this worked fairly well. We were only just rising out of the primordial ooze that is assembly language and into the realm of typed (and comparatively safe) languages.
C++ and later Java basically aped this approach. In C++, every pointer could be NULL (later, nullptr was added, which is not a macro and is a slightly more type-safe NULL). In Java, the problem becomes more clear: Every non-primitive value (which means, basically, every value that isn't a number or character) could be null. If my function takes a String as argument, then I always have to be ready to handle the case where some naive young programmer passes me a null. If I get a String as a result from a function, then I have to check the docs to see if it might not actually exist.
That gave us NullPointerException, which even today crowds this very website with dozens of new questions every day from young programmers falling into its traps.
It is clear, in the modern day, that "every value might be null" is not a sustainable approach in a statically typed language. (Dynamically typed languages tend to be more prepared to deal with failure at every point anyway, by their very nature, so the existence of, say, nil in Ruby or None in Python is somewhat less egregious.)
Kotlin, which is often lauded as a "better Java", approaches this problem by introducing nullable types. Now, not every type is nullable. If I have a String, then I actually have a String. If I intend that my String be nullable, then I can opt-in to nullability with String?, which is a type that's either a string or null. Crucially, this is type-safe. If I have a value of type String?, then I can't call String methods on it until I do a null check or make an assertion. So if x has type String?, then I can't do x.toLowerCase() unless I first do one of the following.
1. Put it inside an if (x != null), to make sure that x is not null (or some other form of control flow that proves my case).
2. Use the ?. null-safe call operator to do x?.toLowerCase(). This will compile to an if (x != null) check and will return a String? which is null if the original string was null.
3. Use !! to assert that the value is not null. The assertion is checked and will throw an exception if I turn out to be wrong.
Note that (3) is what Java does by default at every turn. The difference is that, in Kotlin, the "I'm asserting that I know better than the type checker" case is opt-in, and you have to go out of your way to get into the situation where you can get a null pointer exception. (I'm glossing over platform types, which are a convenient hack in the type system to interoperate with Java. They're not really germane here.)
Nullable types are how Kotlin solves the problem, and it's how TypeScript (with --strict mode) and Scala 3 (with null checks turned on) handle the problem as well. However, it's not really viable in Rust without significant changes to the compiler. That's because nullable types require the language to support subtyping. In a language like Java or Kotlin, which is built using object-oriented principles first, it's easy to introduce new subtypes. String is a subtype of String?, and also of Any (essentially java.lang.Object), and also of Any? (any value and also null). So a string like "A" has a principal type of String, but it also has all of those other types, by subtyping.
Rust doesn't really have this concept. Rust has a bit of type coercion available with trait objects, but that's not really subtyping. It's not correct to say that String is a subtype of dyn Display, only that a String (in an unsized context) can be coerced automatically into a dyn Display.
So we need to go back to the drawing board. Nullable types don't really work here. However, fortunately, there's another way to handle the problem of "this value may or may not exist", and it's called optional types. I wouldn't hazard a guess as to what language first tried this idea, but it was definitely popularized by Haskell and is common in more functional languages.
In functional languages, we often have a notion of principal types, similar to the one in OOP languages. That is, a value x has a "best" type T. In OOP languages with subtyping, x might have other types which are supertypes of T. However, in functional languages without subtyping, x truly only has one type. There are other types that can unify with T, such as (written in Haskell's notation) forall a. a. But it's not really correct to say that the type of x is forall a. a.
The whole nullable type trick in Kotlin relied on the fact that "abc" was both a String and a String?, while null was only a String?. Since we don't have subtyping, we'll need two separate values for the String and String? case.
If we have a structure like this in Rust
struct Foo(i32);
Then Foo(0) is a value of type Foo. Period. End of story. It's not a Foo?, or an optional Foo, or anything like that. It has one type.
However, there's this related value called Some(Foo(0)) which is an Option<Foo>. Note that Foo(0) and Some(Foo(0)) are not the same value; they just happen to be related in a fairly natural way. The difference is that while a Foo must exist, an Option<Foo> could be Some(Foo(0)) or it could be None, which is kind of like our null in Kotlin. We still have to check whether or not the value is None before we can do anything. In Rust, we generally do that with pattern matching, or by using one of the several built-in functions that do the pattern matching for us. It's the same idea, just implemented using techniques from functional programming.
So if we want to get the value out of an Option, or a default if it doesn't exist, we can write
my_option.unwrap_or(0)
If we want to do something to the inside if it exists, or null out if it doesn't, then we write
my_option.and_then(|inner_value| ...)
This is basically what ?. does in Kotlin. If we want to assert that a value is present and panic otherwise, we can write
my_option.unwrap()
Finally, our general purpose Swiss army knife for dealing with this is pattern matching.
match my_option {
None => {
...
}
Some(value) => {
...
}
}
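Putting those fragments together into one runnable sketch (the values here are arbitrary, and and_then is kept only to mirror the text above; map would be the more usual choice for this particular transformation):
fn main() {
    let my_option: Option<i32> = Some(41);

    // Default if the value is absent (unwrap_or).
    let with_default = my_option.unwrap_or(0);

    // Transform the inner value if present, stay None otherwise.
    // and_then expects the closure to return another Option.
    let plus_one = my_option.and_then(|inner_value| Some(inner_value + 1));

    // Assert presence (panics on None), like unwrap() in the text.
    let asserted = my_option.unwrap();

    // Pattern matching: the general-purpose way to branch on presence.
    match my_option {
        None => println!("nothing here"),
        Some(value) => println!("got {}", value),
    }

    println!("{} {:?} {}", with_default, plus_one, asserted);
}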
So we have two different approaches to dealing with this problem: nullable types based on subtyping, and optional types based on composition. "Which is better" is getting into opinion a bit, but I'll try to summarize the arguments I see on both sides.
Advocates of nullable types tend to focus on ergonomics. It's very convenient to be able to do a quick "null check" and then use the same value, rather than having to jump through hoops of unwrapping and wrapping values constantly. It's also nice being able to pass literal values to functions expecting String? or Int? without worrying about bundling them or constantly checking whether they're in a Some or not.
On the other hand, optional types have the benefit of being less "magic". If nullable types didn't exist in Kotlin, we'd be somewhat out of luck. But if Option didn't exist in Rust, someone could write it themselves in a few lines of code. It's not special syntax, it's just a type that exists and has some (ordinary) functions defined on it. It's also built up of composition, which means (naturally) that it composes better. That is, (T?)? is equivalent to T? (the former still only has one null value), while Option<Option<T>> is distinct from Option<T>. I don't recommend writing functions that return Option<Option<T>> directly, but you can end up getting bitten by this when writing generic functions (i.e. your function returns S? and the caller happens to instantiate S with Int?).
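As a small sketch of that generic-function situation (the helper first below is made up purely for illustration), Some(None) and None stay distinct when the type parameter happens to be an Option itself:
// first is a made-up generic helper standing in for "your function returns S?".
fn first<S: Clone>(items: &[S]) -> Option<S> {
    items.first().cloned()
}

fn main() {
    // The caller happens to instantiate S with Option<i32> (Kotlin's Int?).
    let values: Vec<Option<i32>> = vec![None, Some(3)];
    let empty: Vec<Option<i32>> = Vec::new();

    let a = first(&values); // Some(None): an entry exists, and it holds None
    let b = first(&empty);  // None: there was no entry at all

    assert_eq!(a, Some(None));
    assert_eq!(b, None); // the two cases do not collapse into one
}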
I go a bit more into the differences in this post, where someone asked basically the same question but in Elm rather than Rust.
There are two implicit assumptions in the question: first, that other languages' "nulls" (and nils and undefs and nones) are all the same. They aren't.
The second assumption is that "null" and "none" provide similar functionality. There are many different uses of null: the value is unknown (SQL trinary logic), the value is a sentinel (C's null pointer and null byte), there was an error (Perl's undef), and to indicate no value (Rust's Option::None).
Null itself comes in at least three different flavors: an invalid value, a keyword, and special objects or types.
Keyword as null
Many languages opt for a specific keyword to indicate null. Perl has undef, Go and Ruby have nil, Python has None. These are useful in that they are distinct from false or 0.
Invalid value as null
Unlike having a keyword, these are things within the language which mean a specific value, but are used as null anyway. The best examples are C's null pointer and null bytes.
Special objects and types as null
Increasingly, languages will use special objects and types for null. These have all the advantages of a keyword, but rather than a generic "something went wrong" they can have very specific meaning per domain. They can also be user-defined offering flexibility beyond what the language designer intended. And, if you use objects, they can have additional information attached.
For example, std::ptr::null in Rust indicates a null raw pointer. Go has the error interface. Rust has Result::Err and Option::None.
Null as unknown
Most programming languages use two-value or binary logic: true or false. But some, particularly SQL, use three-value or trinary logic: true, false, and unknown. Here null means "we do not know what the value is". It's not true, it's not false, it's not an error, it's not empty, it's not zero: the value is unknown. This changes how the logic works.
If you compare unknown to anything, even itself, the result is unknown. This allows you to draw logical conclusions based on incomplete data. For example, Alice is 7'4" tall, we don't know how tall Bob is. Is Alice taller than Bob? Unknown.
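This is not SQL, but as a rough model of the same idea (the function taller and the Option<bool> encoding are invented for illustration, with None standing in for "unknown"):
// If either height is unknown, the comparison result is unknown.
fn taller(a: Option<u32>, b: Option<u32>) -> Option<bool> {
    match (a, b) {
        (Some(a), Some(b)) => Some(a > b),
        _ => None,
    }
}

fn main() {
    let alice = Some(88); // Alice is 7'4" (88 inches) tall
    let bob = None;       // we don't know how tall Bob is

    assert_eq!(taller(alice, bob), None);            // Is Alice taller than Bob? Unknown.
    assert_eq!(taller(alice, Some(70)), Some(true)); // Taller than someone 5'10"? Yes.
}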
Null as uninitialized
When you declare a variable, it has to contain some value. Even in C, which famously does not initialize variables for you, a variable contains whatever value happened to be sitting in memory. Modern languages will initialize variables for you, but they have to initialize them to something. Often that something is null.
Null as sentinel
When you have a list of things and you need to know when the list is done, you can use a sentinel value. This is anything which is not a valid value for the list. For example, if you have a list of positive numbers you can use -1.
The classic example is C. In C, you don't know how long an array is. You need something to tell you to stop. You can pass around the size of the array, or you can use a sentinel value. You read the array until you see the sentinel. A string in C is just an array of characters which ends with a "null byte", that's just a 0 which is an invalid character. If you have an array of pointers, you can end it with a null pointer. The disadvantages are 1) there's not always a truly invalid value and 2) if you forget the sentinel you walk off the end of the array and bad stuff happens.
A more modern example is how an iterator signals that there are no more items. In many libraries, the "give me the next item" function simply returns nil (or null, or None) when the sequence is exhausted, and the loop exits when it sees that value. While this is more flexible than a terminator stored in the data itself, it has the same problem as the classic sentinel: you need a value which will never appear in the list. What if null is a valid value?
Languages such as Python and Ruby choose to solve this problem by raising a special exception when the iteration is done. Both will raise StopIteration, which their loops will catch and then exit the loop. This avoids the problem of choosing a sentinel value: Python and Ruby iterators can return anything.
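For comparison, Rust's own iterator protocol avoids the collision by moving the "done" signal outside the item type: Iterator::next() returns Option<Item>, so None always means "finished" and a None item cannot be mistaken for the end. A minimal illustration:
fn main() {
    let things = vec![Some(1), None, Some(3)];
    let mut iter = things.into_iter();

    // next() yields Some(item) while items remain and None once iteration
    // is done. A None *item* (the second element) arrives as Some(None).
    assert_eq!(iter.next(), Some(Some(1)));
    assert_eq!(iter.next(), Some(None));
    assert_eq!(iter.next(), Some(Some(3)));
    assert_eq!(iter.next(), None); // end of iteration
}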
Null as error
While many languages use exceptions for error handling, some languages do not. Instead, they return a special value to indicate an error, often null. C, Go, Perl, and Rust are good examples.
This has the same problem as a sentinel: you need to use a value which is not a valid return value. Sometimes this is not possible. For example, functions in C can only return a single value of a given type. If a function returns a pointer, it can return a null pointer to indicate an error. But if it returns a number, you have to pick an otherwise valid number as the error value. Conflating the error and return values like this is a problem.
Go works around this by allowing functions to return multiple values, typically the return value and an error. Then the two values can be checked independently. Rust can only return a single value, so it works around this by returning the special type Result. This contains either an Ok with the return value or an Err with an error.
In both Rust and Go, these are not just bare values: they can have methods called on them, expanding their functionality. Rust's Result::Err has the additional advantage of being a special type: you can't accidentally use a Result::Err as anything else.
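A small sketch of that shape in Rust (parse_port is a made-up example, not a standard library function):
fn parse_port(s: &str) -> Result<u16, String> {
    s.parse::<u16>()
        .map_err(|e| format!("invalid port {:?}: {}", s, e))
}

fn main() {
    // The caller has to look inside the Result before using the value.
    match parse_port("8080") {
        Ok(port) => println!("using port {}", port),
        Err(msg) => eprintln!("error: {}", msg),
    }

    // An Err can never be accidentally used as a port number.
    assert!(parse_port("not a number").is_err());
}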
Null as no value
Finally, we have "none of the given options". Quite simply, there's a set of valid options and the result is none of them. This is distinct from "unknown" in that we know the value is not in the valid set of values. For example, if I asked "which fruit is this car" the result would be null meaning "the car is not any fruit".
When asking for the value of a key in a collection, and that key does not exist, you will get "no value". For example, Rust's HashMap get will return None if the key does not exist.
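For example, a lookup against a std::collections::HashMap (a small illustration; the map contents are arbitrary):
use std::collections::HashMap;

fn main() {
    let mut ages: HashMap<&str, u32> = HashMap::new();
    ages.insert("alice", 30);

    // get returns Option<&V>: Some(&value) if the key exists, None otherwise.
    assert_eq!(ages.get("alice"), Some(&30));
    assert_eq!(ages.get("bob"), None); // no value for this key; not an error
}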
This is not an error, though the two often get confused. Ruby, for instance, will raise ArgumentError if you pass nonsense into a function: array.first(-2) asks for the first -2 values, which is nonsense and raises an ArgumentError.
Option vs Result
Which finally brings us back to Option::None. It is the "special type" version of null which has many advantages.
Rust uses Option for many things: to indicate an uninitialized value, to indicate simple errors, as no value, and for Rust-specific things. The documentation gives numerous examples.
Initial values
Return values for functions that are not defined over their entire input range (partial functions)
Return value for otherwise reporting simple errors, where None is returned on error
Optional struct fields
Struct fields that can be loaned or “taken”
Optional function arguments
Nullable pointers
Swapping things out of difficult situations
To use it in so many places dilutes its value as a special type. It also overlaps with Result which is what you're supposed to use to return results from functions.
The best advice I've seen is to use Result<Option<T>, E> to separate the three possibilities: a valid result, no result, and an error. Jeremy Fitzhardinge gives a good example of what to return from querying a key/value store.
...I'm a big advocate of returning Result<Option<T>, E> where Ok(Some(value)) means "here's the thing you asked for", Ok(None) means "the thing you asked for definitely doesn't exist", and Err(...) means "I can't say whether the thing you asked for exists or not because something went wrong".
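A minimal sketch of that convention, assuming a toy in-memory store (the Store type and its failure mode are invented for illustration, standing in for a real key/value client):
use std::collections::HashMap;

struct Store {
    connected: bool,
    data: HashMap<String, String>,
}

impl Store {
    fn get(&self, key: &str) -> Result<Option<String>, String> {
        if !self.connected {
            // "I can't say whether the thing exists": something went wrong.
            return Err("connection lost".to_string());
        }
        // Ok(Some(..)) = here it is; Ok(None) = it definitely doesn't exist.
        Ok(self.data.get(key).cloned())
    }
}

fn main() {
    let store = Store { connected: true, data: HashMap::new() };
    match store.get("missing-key") {
        Ok(Some(value)) => println!("found: {}", value),
        Ok(None) => println!("definitely not there"),
        Err(e) => println!("lookup failed: {}", e),
    }
}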

Advantage to a certain string comparison order

Looking at some pieces of code around the internet, I've noticed some authors tend to write string comparisons like
if("String"==$variable)
in PHP, or
if("String".equals(variable))
Whereas my preference is:
if(variable.equals("String"))
I realize these are effectively equal: they compare two strings for equality. But I was curious if there was an advantage to one over the other in terms of performance or something else.
Thank you for the help!
One argument for using an equality function or writing if( constant == variable ) rather than if( variable == constant ) is that it prevents you from accidentally making a typo and writing an assignment instead of a comparison, for instance:
if( s = "test" )
Will assign "test" to s, which will result in undesired behaviour which may potentially cause a hard-to-find bug. However:
if( "test" = s )
Will in most languages (that I'm aware of) result in some form of warning or compiler error, helping to avoid a bug later on.
With a simple int example, this prevents accidental writes of
if (a=5)
which would be a compile error if written as
if (5=a)
I sure don't know about all languages, but decent C compilers warn you about if (a=b). Perhaps whatever language your question is written in doesn't have such a feature, so to be able to generate an error in such a case, they have reversed the order of the comparison arguments.
Yoda conditions call these some.
The kind of syntax a language uses has nothing to do with efficiency. It is all about how the comparison algorithm works.
In the examples you mentioned, this:
if("String".equals(variable))
and this:
if(variable.equals("String"))
would be exactly the same, because the string literal "String" is itself a String object.
Languages that provide a comparison method for Strings will use the fastest method, so you shouldn't care about it unless you want to implement the method yourself ;)

Misuse of a variable's value?

I came across an instance where the solution to a particular problem was to use a variable whose value, when zero or above, meant the system would use that value in a calculation, but when less than zero indicated that the value should not be used at all.
My initial thought was that I didn't like the multipurpose use of the value of the variable: a.) as a range to be using in a formula; b.) as a form of control logic.
What is this kind of misuse of a variable called? Meta-'something' or is there a classic antipattern that this fits?
Sort of feels like when a database field is set to null to represent not using a value and if it's not null then use the value in that field.
Update:
An example would be that if a variable's value is > 0, I use the value; if it's <= 0, I don't use the value and perform some other logic instead.
Values such as these are often called "distinguished values". By far the most common distinguished value is null for reference types. A close second is the use of distinguished values to indicate unusual conditions (e.g. error return codes or search failures).
The problem with distinguished values is that all client code must be aware of the existence of such values and their associated semantics. In practical terms, this usually means that some kind of conditional logic must be wrapped around each call site that obtains such a value. It is far too easy to forget to add that logic, obtaining incorrect results. It also promotes copy-and-paste code as the boilerplate code required to deal with the distinguished values is often very similar throughout the application but difficult to encapsulate.
Common alternatives to the use of distinguished values are exceptions, or distinctly typed values that cannot be accidentally confused with one another (e.g. Maybe or Option types).
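For illustration, a hedged sketch in Rust of replacing a -1 style distinguished value with an Option return (find_index is a made-up helper; the standard library's position method does the same job):
// With a distinguished value the signature would be `fn(...) -> i32`, with -1
// meaning "not found", and nothing would force callers to check for it.
fn find_index(haystack: &[i32], needle: i32) -> Option<usize> {
    haystack.iter().position(|&x| x == needle)
}

fn main() {
    let data = [10, 20, 30];

    // The "not found" case has its own type, so it cannot be forgotten:
    // the value must be unwrapped or matched before it can be used.
    match find_index(&data, 20) {
        Some(i) => println!("found at index {}", i),
        None => println!("not found"),
    }
}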
Having said all that, distinguished values may still play a valuable role in environments with extremely tight memory availability or other stringent performance constraints.
I don't think what you're describing is a pure magic number, but it's kind of close. It's similar to the situation in pre-.NET 2.0 code where you'd use Int32.MinValue to indicate a null value. .NET 2.0 introduced Nullable<T> and kind of alleviated this issue.
So you're describing the use of a variable whose value really means something other than its value -- -1 means essentially the same as the use of Int32.MinValue as I described above.
I'd call it a magic number.
Hope this helps.
Using different ranges of the possible values of a variable to invoke different functionality was very common when RAM and disk space for data and program code were scarce. Nowadays, you would use a function or an additional, accompanying value (boolean, or enumeration) to determine the action to take.
Current OSes suggest 1GiB of RAM to operate correctly, when 256KiB was a lot not so many years ago. Cheap disk space has gone from hundreds of MiB to multiples of TiB in a matter of months. Not too long ago I wrote programs for 640KiB of RAM and 10MiB of disk, and you would probably hate them.
I think it would be good to cope with code like that if it's just a few years old (refactor it!), and denounce it as bad practice if it's recent.

Core Data - Optional attributes and performance

Per the Core Data Programming Guide:
You can specify that an attribute is optional—that is, it is not required to have a value. In general, however, you are discouraged from doing so—especially for numeric values (typically you can get better results using a mandatory attribute with a default value—in the model—of 0). The reason for this is that SQL has special comparison behavior for NULL that is unlike Objective-C's nil. NULL in a database is not the same as 0, and searches for 0 will not match columns with NULL.
I have always made numeric values non-optional, but have not for dates and strings. It is convenient in my code to base logic on dates and/or strings being nil.
Based on the above recommendations, I am considering making everything in my database non-optional. For dates I could set the model default to a value of 0 and for strings a model default of nothing (""). Then, in my code I could test dates for [date timeIntervalSince1970] != 0 and strings for string.length != 0.
The question is, for a relatively small database, does this really matter from a Core Data performance standpoint? And what is the tradeoff if the attribute in question will never be directly queried via a predicate?
I have not seen any performance problems on small to medium sized data sets. I suspect that this is something you would deal with in the performance stage of your application.
Personally, I use the same logic of non-numerics being optional when it makes sense, as it does indeed make the code easier, which in turn gives me more time to optimize later.

Implications of not including NULL in a language?

I know that NULL isn't necessary in a programming language, and I recently made the decision not to include NULL in my programming language. Declaration is done by initialization, so it is impossible to have an uninitialized variable. My hope is that this will eliminate the NullPointerException in favor of more meaningful exceptions or simply not having certain kinds of bugs.
Of course, since the language is implemented in C, there will be NULLs used under the covers.
My question is, besides using NULL as an error flag (this is handled with exceptions) or as an endpoint for data structures such as linked lists and binary trees (this is handled with discriminated unions) are there any other use-cases for NULL for which I should have a solution? Are there any really important implications of not having NULL which could cause me problems?
There's a recent article referenced on LtU by Tony Hoare titled Null References: The Billion Dollar Mistake which describes a method to allow the presence of NULLs in a programming language, but also eliminates the risk of referencing such a NULL reference. It seems so simple yet it's such a powerful idea.
Update: here's a link to the actual paper that I read, which talks about the implementation in Eiffel: http://docs.eiffel.com/book/papers/void-safety-how-eiffel-removes-null-pointer-dereferencing
Borrowing a page from Haskell's Maybe monad, how will you handle the case of a return value that may or may not exist? For instance, if you tried to allocate memory but none was available. Or maybe you've created an array to hold 50 foos, but none of the foos have been instantiated yet -- you need some way to be able to check for these kinds of things.
I guess you can use exceptions to cover all these cases, but does that mean that a programmer will have to wrap all of those in a try-catch block? That would be annoying at best. Or everything would have to return its own value plus a boolean indicating whether the value was valid, which is certainly not better.
FWIW, I'm not aware of any language that doesn't have some sort of notion of NULL -- you've got null in all the C-style languages and Java; Python has None; Scheme, Lisp, Smalltalk, Lua, and Ruby all have nil; VB uses Nothing; and Haskell has a different kind of nothing.
That doesn't mean a language absolutely has to have some kind of null, but if all of the other big languages out there use it, surely there was some sound reasoning behind it.
On the other hand, if you're only making a lightweight DSL or some other non-general language, you could probably get by without null if none of your native data types require it.
The one that immediately comes to mind is pass-by-reference parameters. I'm primarily an Objective-C coder, so I'm used to seeing things kind of like this:
NSError *error;
[anObject doSomething:anArgumentObject error:&error];
// Error-handling code follows...
After this code executes, the error object has details about the error that was encountered, if any. But say I don't care if an error happens:
[anObject doSomething:anArgumentObject error:nil];
Since I don't pass in any actual value for the error object, I get no results back, and I don't really worry about parsing an error (since I don't care in the first place if it occurs).
You've already mentioned you're handling errors a different way, so this specific example doesn't really apply, but the point stands: what do you do when you pass something back by reference? Or does your language just not do that?
I think it's useful for a method to return NULL - for example, a search method that is supposed to return some object can return the found object, or NULL if it wasn't found.
I'm starting to learn Ruby, and Ruby has a very interesting concept for NULL; maybe you could consider implementing something similar. In Ruby, NULL is called nil, and it's an actual object just like any other object. It happens to be implemented as a global singleton object. Also in Ruby, there is an object false, and both nil and false evaluate to false in boolean expressions, while everything else evaluates to true (even 0, for example, evaluates to true).
In my mind there are two use cases for which NULL is generally used:
The variable in question doesn't have a value (Nothing)
We don't know the value of the variable in question (Unknown)
Both are common occurrences and, honestly, using NULL for both can cause confusion.
Worth noting is that some languages that don't support NULL do support the notion of Nothing/Unknown. Haskell, for instance, supports "Maybe <T>", which can contain either a value of <T> or Nothing. Thus, commands can return (and accept) a type that they know will always have a value, or they can return/accept "Maybe <T>" to indicate that there may not be a value.
I prefer the concept of having non-nullable pointers be the default, with nullable pointers a possibility. You can almost do this with C++ through references (&) rather than pointers, but it can get quite gnarly and irksome in some cases.
A language can do without null in the Java/C sense; for instance, Haskell (and most other functional languages) has a "Maybe" type, which is effectively a construct that just provides the concept of an optional value (what a nullable pointer is usually used for).
It's not clear to me why you would want to eliminate the concept of 'null' from a language. What would you do if your app requires you to do some initialization 'lazily' - that is, you don't perform the operation until the data is needed? Ex:
public class ImLazy {
    public ImLazy() {
        //I can't initialize resources in my constructor, because I'm lazy.
        //Maybe I don't have a network connection available yet, or maybe I'm
        //just not motivated enough.
    }

    private ResourceObject lazyObject;

    public ResourceObject getLazyObject() { //initialize on first use, then return
        if (lazyObject == null) {
            lazyObject = new DatabaseNetworkResourceThatTakesForeverToLoad();
        }
        return lazyObject;
    }

    public boolean isObjectLoaded() { //just report whether the object exists yet
        return (lazyObject != null);
    }
}
In a case like this, how could we return a value for getLazyObject()? We could come up with one of two things:
-require the user to initialize lazyObject in the declaration. The user would then have to fill in some dummy object (UselessResourceObject), which requires them to write all of the same error-checking code (if (lazyObject.equals(UselessResourceObject)) ...), or:
-come up with some other value, which works the same as null, but has a different name
For any complex/OO language you need this functionality, or something like it, as far as I can see. It may be valuable to have a non-null reference type (for example, in a method signature, so that you don't have to do a null check in the method code), but the null functionality should be available for cases where you do use it.
Interesting discussion happening here.
If I was building a language, I really don't know if I would have the concept of null. I guess it depends on how I want the language to look. Case in point: I wrote a simple templating language whose main strength is nested tokens and ease of making a token a list of values. It doesn't have the concept of null, but then it doesn't really have the concept of any types other than string.
By comparison, the language it is built in, Icon, uses null extensively. Probably the best thing the language designers for Icon did with null is make it synonymous with an uninitialized variable (i.e. you can't tell the difference between a variable that doesn't exist and one that currently holds the value null). They then created two prefix operators to check null and not-null.
In PHP, I sometimes use null as a 'third' boolean value. This is good in "black-box" type classes (e.g. ORM core) where a state can be True, False or I Don't Know. Null is used for the third value.
Of course, both of these languages do not have pointers in the same way C does, so null pointers do not exist.
We use nulls all the time in our application to represent the "nothing" case. For example, if you are asked to look up some data in the database given an id, and no record matches that id: return null. This is very handy because we can store nulls in our cache, which means we don't have to go back to the database if someone asks for that id again in a few seconds.
The cache itself has two different kinds of responses: null, meaning there was no such entry in the cache, or an entry object. The entry object might have a null value, which is the case when we cached a null db lookup.
Our app is written in Java, but even with unchecked exceptions doing this with exceptions would be incredibly annoying.
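Not the poster's Java code, but a rough sketch of those two layers of "nothing", using the Option<Option<T>> shape an earlier answer mentioned (the Cache type here is invented for illustration):
use std::collections::HashMap;

// The outer Option answers "was anything cached for this id?";
// the inner Option answers "what did the database say when we cached it?".
struct Cache {
    entries: HashMap<u64, Option<String>>,
}

impl Cache {
    fn lookup(&self, id: u64) -> Option<Option<String>> {
        self.entries.get(&id).cloned()
    }
}

fn main() {
    let mut entries = HashMap::new();
    entries.insert(1, Some("alice".to_string()));
    entries.insert(2, None); // we already asked the database; record 2 does not exist

    let cache = Cache { entries };
    assert_eq!(cache.lookup(1), Some(Some("alice".to_string())));
    assert_eq!(cache.lookup(2), Some(None)); // cached "no such record"
    assert_eq!(cache.lookup(3), None);       // nothing cached; go ask the database
}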
Suppose one accepts a few propositions: a powerful language should have some sort of pointer or reference type (something which can hold a reference to data which does not exist at compile time); it should have some form of array type (or another means of having a collection of storage slots addressable sequentially via an integer index); slots of the latter should be able to hold the former; and one may have to read some slots of an array of pointers/references before sensible values exist for all of them. Then there will be programs which, from the compiler's perspective, read an array slot before a sensible value has been written to it (trying to ascertain in the general case whether an array slot could be read before it is written would be equivalent to the Halting Problem).
While it would be possible for a language to require that all array slots be initialized with some non-null reference before any of them could be read, in many situations there isn't really anything that could be stored which would be better than null. If an attempt is made to read an as-yet-unwritten array slot and dereference the (non)item contained there, that represents an error, and it would be better to have the system trap that condition than to access some arbitrary object whose sole purpose for existence is to give the array slots some non-null thing they can reference.
