What do creating an object "explicitly" and "implicitly as a result of elaboration" mean? - programming-languages

From p498 of Programming Language Pragmatics, by Scott
With a reference model for variables, every object is created
explicitly, and it is easy to ensure that an appropriate constructor
is called.
With a value model for variables, object creation can happen
implicitly as a result of elaboration. In Ada, which doesn’t provide
automatic calls to constructors by default, elaborated objects begin
life uninitialized, and it is possible to accidentally attempt to use
a variable before it has a value. In C++, the compiler ensures that an
appropriate constructor is called for every elaborated object, but the
rules it uses to identify constructors and their arguments can
sometimes be confusing.
What does it mean to create an object "explicitly" as opposed to "implicitly as a result of elaboration"?
What does "elaboration" mean?
Thanks.
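A minimal C++ sketch of the distinction the quoted passage draws, assuming "elaboration" is used in the usual sense of a declaration taking effect when control enters its scope. The Widget type and both function names are hypothetical, made up for illustration. In the first function the object comes into existence simply because its declaration is elaborated, and the compiler inserts the constructor call; in the second, the variable is only a reference (here a pointer), so the program has to create the object itself.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical type that records which constructor ran.
struct Widget {
    std::string how;
    Widget() : how("default constructor") {}
    explicit Widget(int) : how("int constructor") {}
};

// Value model: elaborating the declaration creates the object implicitly.
// No `new` and no visible constructor call appear in the source.
std::string created_by_elaboration() {
    Widget w;                       // the compiler calls Widget() here
    return w.how;
}

// Reference model (imitated with a pointer): the variable only refers to
// an object, so the object must be created explicitly with `new`.
std::string created_explicitly() {
    std::unique_ptr<Widget> p(new Widget(42));  // constructor chosen in plain sight
    assert(p != nullptr);
    return p->how;
}
```

Calling created_by_elaboration() yields "default constructor" and created_explicitly() yields "int constructor".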

Related

Is it allowed to modify value of the Value Object on construction

Suppose I want the following Value Object to always contain a capitalized String value. Is it acceptable to do it like this, with toUpperCase() in the constructor?
class CapitalizedId(value: String) {
    val value: String = value.toUpperCase()
    // getters
    // equals and hashCode
}
In general, I do not see a problem with performing such a simple transformation in a value object's constructor. There should of course be no surprises for the user of a constructor, but since the name CapitalizedId already tells you that whatever is created will be capitalized, there is no surprise from my point of view. I also perform validity checks in constructors to ensure business invariants are adhered to.
If you would rather not perform operations in a constructor, or if the operations and validations become too complex, you can always provide factory methods instead (or, in Kotlin, use a companion object, I guess; I'm not a Kotlin expert) containing all the heavy lifting (think of LocalDateTime.of()) and validation logic, and use them like this:
CapitalizedId.of("abc5464g");
Note: when implementing a factory method, the constructor should be made private.
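A rough sketch of that factory-method variant, written in C++ rather than Kotlin. Only the names CapitalizedId and of come from the text above; the value() accessor, the equality operator, and the capitalized_id_example() helper are assumptions added for illustration. The private constructor means every instance has to go through of(), which is where the capitalization happens.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <utility>

class CapitalizedId {
public:
    // Factory method: the only public way to obtain an instance.
    static CapitalizedId of(std::string raw) {
        std::transform(raw.begin(), raw.end(), raw.begin(), [](unsigned char c) {
            return static_cast<char>(std::toupper(c));
        });
        return CapitalizedId(std::move(raw));
    }

    const std::string& value() const { return value_; }

    // Value objects compare by value, not by identity.
    bool operator==(const CapitalizedId& other) const {
        return value_ == other.value_;
    }

private:
    // Private, so callers cannot bypass the factory.
    explicit CapitalizedId(std::string v) : value_(std::move(v)) {}
    std::string value_;
};

// Usage example; call it from main().
inline void capitalized_id_example() {
    assert(CapitalizedId::of("abc5464g").value() == "ABC5464G");
    assert(CapitalizedId::of("abc") == CapitalizedId::of("ABC"));
}
```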
Is it eligible to do it like this with toUpperCase() in constructor?
Yes, in the sense that what you end up with is still an expression of the ValueObject pattern.
It's not consistent with the idea that initializers should initialize, and not also include other responsibilities. See Misko Hevery 2008.
Will this specific implementation be an expensive mistake? Probably not.

How does VBA determine the lifetime of non-IUnknown reference types?

I'm trying to understand what dictates the lifetime of different bits of data; how the VBA interpreter knows it is safe to release the associated memory of a given variable. Here's what I've found so far:
Value types
Simple value types use scope to determine lifetime; e.g. a Long that is Dimmed inside a Function will hang around until an Exit/End * statement is hit. At this point the exact mechanism is unclear to me, but I imagine the VBA interpreter keeps a list of all variables in a given scope, and uses VarPtr and LenB to release all memory associated with them.
Object types
Objects meanwhile all derive from the IDispatch interface which is based on the COM IUnknown interface. As a result, objects use reference counting to determine lifetime. So a variable which holds a reference to an object (basically a LongPtr) will be overwritten when it falls out of scope (similar to value types), but just before that happens, the VBA interpreter calls IUnknown_Release on the IUnknown interface of the object (or more precisely, whatever interface is being held in the variable/With block).
As such, it is the responsibility of the COM object to clean up and release its own instance memory whenever the reference count drops to zero.
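To make the reference-counting idea concrete, here is a minimal C++ sketch. Counted is a hypothetical stand-in, not the real IUnknown interface: it only borrows the AddRef/Release naming, it is not thread-safe, and the destroyed flag exists purely so the example can observe when the object frees itself.

```cpp
#include <cassert>

// Hypothetical stand-in for a COM object (NOT the real IUnknown).
class Counted {
public:
    explicit Counted(bool* destroyed) : refs_(1), destroyed_(destroyed) {}

    unsigned long AddRef() { return ++refs_; }

    unsigned long Release() {
        unsigned long remaining = --refs_;
        if (remaining == 0) {
            delete this;            // the object cleans up its own memory
        }
        return remaining;
    }

private:
    ~Counted() { *destroyed_ = true; }  // private: only Release() may destroy
    unsigned long refs_;
    bool* destroyed_;
};

// Mimics two object variables going out of scope one after the other.
inline bool release_example() {
    bool destroyed = false;
    Counted* first = new Counted(&destroyed);  // count = 1
    Counted* second = first;
    second->AddRef();                          // count = 2
    first->Release();                          // first variable leaves scope
    assert(!destroyed);                        // still alive: one reference left
    second->Release();                         // last reference gone
    return destroyed;
}
```

release_example() returns true only if the object survived the first Release and was destroyed by the second.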
Other Reference types
There are other reference types though (that is, types which are not stored as raw data like Doubles and Longs, but instead as pointers to the full data elsewhere), e.g.:
Strings
Arrays
UDTs
Anything passed ByRef
Now since these are not strictly value types, VBA can't just use scope to determine lifetime; let's suppose some function Dims an array, which falls out of scope when the function Ends. Well if it is VBA's default behaviour to call SafeArrayDestroy on the array pointer then the memory would be freed as soon as the variable is out of scope. However what happens if the array is the return value of the function - now VBA can't free the underlying data when the variable is out of scope or things will break. If the array doesn't use reference counting, then how does VBA know to release it or not?
Similarly for the other types: any clue what exactly determines the lifetime of these half-value, half-reference types? (They are all assigned without the Set operator, so I guess they are not strictly reference types.)

MSVC How does ios_base::Init work?

The documentation states that:
The nested class describes an object whose construction ensures that
the standard iostreams objects are properly constructed, even before
the execution of a constructor for an arbitrary static object.
As seen at:
https://msdn.microsoft.com/en-gb/library/fbyc90zw.aspx
But since static objects have an undefined init ordering how does ios_base::Init ensure that it runs before them?
I would hazard that the order of initialization of static variables is undefined, but the order of constructor calls for creation of an object/variable (static or not) is well defined.
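The ios_base::Init class is nested in the ios_base class, but nesting by itself does not run any constructor; an ios_base::Init object has to be constructed somewhere. The C++ standard requires that including <iostream> behave as if the header defined an ios_base::Init object with static storage duration, so within a translation unit that includes <iostream>, that object is constructed before any static object defined later in the same file.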
It doesn't matter which of the possible static instances of an object using ios_base exist or what order they run in. All that matters is that the ios_base::Init gets to run and initialize the standard streams first (all other constructor calls to ios_base::Init likely do nothing since the work is already done by the first constructor call).
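A minimal sketch of one common way to get this "first constructor does the work" behaviour, often called the nifty-counter (or Schwarz counter) idiom. All names here (Stream, out_stream, Init, init_guard, UsesStream, user_object) are hypothetical, and this is not taken from MSVC's headers; it only illustrates the counting trick in a single file, where a real library would put the guard object in a header.

```cpp
#include <cassert>

// Stand-in for a standard stream object. As a namespace-scope object with
// no constructor, it is zero-initialized before any dynamic initialization.
struct Stream {
    bool ready;
    int writes;
    void write() {
        assert(ready);   // using the stream before setup would be a bug
        ++writes;
    }
};
Stream out_stream;

class Init {
public:
    Init()  { if (count_++ == 0) out_stream.ready = true; }   // first one sets up
    ~Init() { if (--count_ == 0) out_stream.ready = false; }  // last one tears down
private:
    static int count_;
};
int Init::count_ = 0;

// What a header would define in every translation unit that includes it.
static Init init_guard;

// A static object defined later in the same translation unit can safely
// use the stream from its constructor.
struct UsesStream {
    UsesStream() { out_stream.write(); }
};
static UsesStream user_object;
```

Because objects within one translation unit are initialized in definition order, init_guard always runs before user_object, and any further Init objects only bump the counter.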

How should I use storage class specifiers like ref, in, out, etc. in function arguments in D?

There are comparatively many storage class specifiers for functions arguments in D, which are:
none
in (which is equivalent to const scope)
out
ref
scope
lazy
const
immutable
shared
inout
What's the rationale behind them? Their names already put forth the obvious use. However, there are some open questions:
Should I use ref combined with in for struct type function arguments by default?
Does out imply ref implicitly?
When should I use none?
Does ref on classes and/or interfaces make sense? (Class types are references by default.)
How about ref on array slices?
Should I use const for built-in arithmetic types, whenever possible?
More generally put: When and why should I use which storage class specifier for function argument types in case of built-in types, arrays, structs, classes and interfaces?
(In order to isolate the scope of the question a little bit, please don't discuss shared, since it has its own isolated meaning.)
I wouldn't use either ref or in by default. ref parameters only take lvalues, and it implies that you're going to be altering the argument that's being passed in. If you want to avoid copying, then use const ref or auto ref. But const ref still requires an lvalue, so unless you want to duplicate your functions, it's frequently more annoying than it's worth. And while auto ref will avoid copying lvalues (it basically makes it so that there's a version of the function which takes lvalues by ref and one which takes rvalues without ref), it only works with templates, limiting its usefulness. And using const can have far-reaching consequences due to the fact that D's const is transitive and the fact that it's undefined behavior to cast away const from a variable and modify it. So, while it's often useful, using it by default is likely to get you into trouble.
Using in gives you scope in addition to const, which I'd generally advise against. scope on function parameters is supposed to make it so that no reference to that data can escape the function, but the checks for it aren't properly implemented yet, so you can actually use it in a lot more situations than are supposed to be legal. There are some cases where scope is invaluable (e.g. with delegates, since it makes it so that the compiler doesn't have to allocate a closure for it), but for other types, it can be annoying (e.g. if you pass an array by scope, then you couldn't return a slice to that array from the function). And any structs with any arrays or reference types would be affected. And while you won't get many complaints about incorrectly using scope right now, if you've been using it all over the place, you're bound to get a lot of errors once it's fixed. Also, it's utterly pointless for value types, since they have no references to escape. So, using const and in on a value type (including structs which are value types) are effectively identical.
out is the same as ref except that it resets the parameter to its init value so that you always get the same value passed in regardless of what the previous state of the variable being passed in was.
Use none almost always, as far as function arguments go. You use const or scope or whatnot when you have a specific need for it, but I wouldn't advise using any of them by default.
Of course ref on classes makes sense. ref is separate from the concept of class references. It's a reference to the variable being passed in. If I do
void func(ref MyClass obj)
{
    obj = new MyClass(7);
}
auto var = new MyClass(5);
func(var);
then var will refer to the newly constructed new MyClass(7) after the call to func rather than the new MyClass(5). You're passing the reference by ref. It's just like how taking the address of a reference (like var) gives you a pointer to a reference and not a pointer to a class object.
MyClass* p = &var; //points to var, _not_ to the object that var refers to.
For array slices, it's the same deal as with classes. ref makes the parameter refer to the variable passed in, e.g.
void func(ref int[] arr)
{
    arr ~= 5;
}
auto var = [1, 2, 3];
func(var);
assert(var == [1, 2, 3, 5]);
If func didn't take its argument by ref, then var would have been sliced, and appending to arr would not have affected var. But since the parameter was ref, anything done to arr is done to var.
Whether to use const for built-in arithmetic types is totally up to you. Making it const makes it so that you can't mutate it, which means that you're protected from accidentally mutating it if you don't intend to ever mutate it. It might also enable some optimizations, but if you never write to the variable, and it's a built-in arithmetic type, then the compiler knows that it's never altered and the optimizer should be able to do those optimizations anyway (though whether it does or not depends on the compiler's implementation).
immutable and const are effectively identical for the built-in arithmetic types in almost all cases, so personally, I'd just use immutable if I want to guarantee that such a variable doesn't change. In general, using immutable instead of const if you can gives you better optimizations and better guarantees, since it allows the variable to be implicitly shared across threads (if applicable) and it always guarantees that the variable can't be mutated (whereas for reference types, const only means that that reference can't mutate the object, not that the object can't be mutated).
Certainly, if you mark your variables const and immutable as much as possible, then it does help the compiler with optimizations at least some of the time, and it makes it easier to catch bugs where you mutated something when you didn't mean to. It also can make your code easier to understand, since you know that the variable is not going to be mutated. So, using them liberally can be valuable. But again, using const or immutable can be overly restrictive depending on the type (though that isn't a problem with the built-in integral types), so just automatically marking everything as const or immutable can cause problems.

About first-, second- and third-class values

A first-class value can be
passed as an argument
returned from a subroutine
assigned into a variable.
A second-class value can only be passed as an argument.
A third-class value can't even be passed as an argument.
Why are these things defined like that? As I understand it, "can be passed as an argument" means it can be pushed onto the runtime stack; "can be assigned into a variable" means it can be moved to a different location in memory; "can be returned from a subroutine" means almost the same as "can be assigned into a variable", since the returned value is always put at a known address. So a first-class value is totally "movable" or "dynamic", a second-class value is half "movable", and a third-class value is just "static", such as labels in C/C++, which can only be addressed by a goto statement; you can't do anything with that address except "goto". Does my understanding make any sense? Or what do these three kinds of values mean exactly?
Oh no, I may have to go edit Wikipedia again.
There are really only two distinctions worth making: first-class and not first-class. If Michael Scott talks about a third-class anything, I'll be very depressed.
Ok, so what is "first-class," anyway? Well, it is a term that barely has a technical meaning. The meaning, when present, is usually comparative, and it applies to a thing in a language (I'm being deliberately vague here) that has more privileges than a comparable thing. That's all people mean by it.
Let's look at some examples:
Function pointers in C are first-class values because they can be passed to functions, returned from functions, and stored in heap-allocated data structures just like any other value. Functions in Pascal and Ada are not first-class values because although they can be passed as arguments, they cannot be returned as results or stored in heap-allocated data structures.
Struct types are second-class types in C, because there are no literal expressions of struct type. (Since C99 there are literal initializers with named fields, but this is still not as general as having a literal anywhere you can use an expression.)
Polymorphic values are second-class values in ML because although they can be let-bound to names, they cannot be lambda-bound. Therefore they cannot be passed as arguments. But in Haskell, because Haskell supports higher-rank polymorphism, polymorphic values are first-class. (They can even be stored in data structures!)
In Java, the type int is second class because you can't inherit from it. Type Integer is first class.
In C, labels are second class, because they don't have values and you can't compute with them. In FORTRAN, line numbers have values and so are first class. There is a GNU extension to C that allows you to define first-class labels, and it is jolly useful. What does first-class mean in this case? It means the labels have values, can be stored in data structures, and can be used in goto. But those values are second class in another sense, because a label from one procedure can't meaningfully be used in a goto that belongs to another procedure.
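The function-pointer example above can be sketched in a few lines of C++ (the same pointer syntax works in C, apart from the std::vector). The names BinaryOp, add, mul, apply, choose, apply_all, and first_class_example are hypothetical, chosen only for illustration. The sketch shows the three privileges mentioned: being passed to a function, being returned from one, and being stored in a heap-allocated data structure.

```cpp
#include <cassert>
#include <vector>

using BinaryOp = int (*)(int, int);

int add(int a, int b) { return a + b; }
int mul(int a, int b) { return a * b; }

// Passed as an argument, like any other value.
int apply(BinaryOp op, int a, int b) { return op(a, b); }

// Returned from a function.
BinaryOp choose(bool want_sum) { return want_sum ? &add : &mul; }

// Stored in a heap-allocated data structure (the vector's buffer).
int apply_all(int a, int b) {
    std::vector<BinaryOp> ops = {&add, &mul};
    int total = 0;
    for (BinaryOp op : ops) {
        total += op(a, b);
    }
    return total;
}

// Usage example; call it from main().
inline void first_class_example() {
    assert(apply(&add, 2, 3) == 5);
    assert(choose(false)(2, 3) == 6);
    assert(apply_all(2, 3) == 11);   // (2 + 3) + (2 * 3)
}
```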
Are we getting an idea how useless this terminology is?
I hope these examples convince you that the idea of "first-class" is not a very useful idea in thinking about programming languages overall. When you're talking about a particular feature of a particular language or language family, it can be a useful shorthand ("a language isn't functional unless it has first-class, nested functions") but by and large you're better off saying just what you mean instead of talking about "first-class" or "not first-class" things.
As for "third class", just say no.
Something is first-class if it is explicitly manipulable in the code. In other words, something is first-class if it can be programmatically manipulated at run-time.
This closely relates to meta-programming in the sense that what you describe in the code (at development time) is one meta-level, and what exists at run-time is another meta-level. But the barrier between these two meta-levels can be blurred, for instance with reflection. When something is reified at run-time, it becomes explicitly manipulable.
We speak of first-class object, because objects can be manipulated programmatically at run-time (that's the very purpose).
In Java, you have classes, but they are not first-class, because the code can normally not manipulate a class unless you use reflection. But in Smalltalk, classes are first-class: the code can manipulate a class like a regular object.
In Java, you have packages (modules), but they are not first-class, because the code does not manipulate packages at run-time. But in Newspeak, packages (modules) are first-class: you can instantiate a module and pass it to another module to specify the modularity at run-time.
In C#, you have closures, which are first-class functions. They exist and can be manipulated at run-time programmatically. Such things do not exist (yet) in Java.
To me, the boundary between first-class and not first-class is not exactly strict. It is sometimes hard to decide for some language constructs, e.g. Java primitive types. We could say a primitive is not first-class because it's not an object and is not manipulable through a reference that can be passed along, but the primitive value does still exist and can be manipulated at run-time.
PS: I agree with Norman Ramsey: 2nd-class and 3rd-class values make no sense to me.
First-class: A first-class construct is one which is an intrinsic element of a language. The following properties must hold.
It must form part of the lexical syntax of the language
It may have operators applied to it
It must be referenceable (for example stored in a variable)
Second-class: A second-class construct is one which is an intrinsic element of the language with the following properties.
It must form part of the lexical syntax of the language
It may have operators applied to it
Third-class: A third-class construct is one which forms part of the syntax of a language.
in
Roger Keays and Andry Rakotonirainy. Context-oriented programming. In Proceedings of the 3rd ACM International Workshop on Data Engineering for Wireless and Mobile Access, MobiDe ’03, pages 9–16, New York, NY, USA, 2003. ACM.
Those terms are very broad and not really globally well defined, but here are the most logical definitions for them:
First-class values are the ones that have actual, tangible values, and so can be operated on and go around, as variables, arguments, return values or whatever.
This doesn't really need a thorough example, does it? In C, an int is first-class.
Second-class values are more limited. They have values, but they can't be used directly, so the compiler deliberately limits what you can do with it. You can reference them, so you can still have a first-class value representing them.
For example, in C, a function is a second-class value. It can't be altered, but it can be called and referenced.
Third-class values are even more limited. They not only don't have values, but interaction is completely absent, and often it only exists to be used as compile-time attributes.
For example, in Rust, a lifetime is a third-class value. You can't use the lifetime at all. You can only receive it as a generic parameter, you can only use it as a generic parameter (only when creating a new variable), and that's all you can do with it.
Another example, in C++, a struct or a class is a third-class value. This doesn't need much explanation.
