How is assignment implemented in Pharo? - pharo

Poking around the image and searching for ":=" produces no relevant results that quickly pop out.
Where exactly in the Pharo image is assignment implemented and how?

Assignment is translated to byte code instructions. Assignment to temporary variables translates to popIntoTemp:, which pops the current top off the stack and stores it in the slot assigned to the temporary variable described by the index (the argument to the instruction).
Other assignments work in a similar manner, hence assignment to an instance variable translates to popIntoRcvr: ("pop into receiver"), where the index (the argument to the instruction) designates the index of the instance variable.
Class variable assignment translates to popIntoLit: (here the argument to the instruction is the literal itself, i.e. the class variable in this case), class instance variable assignment to popIntoRcvr: and global assignment into popIntoLit: (the argument is the literal itself, i.e. the global).
The names used for the instructions are taken from the byte code view in Pharo. The Blue Book defines these instructions in terms of bytes (which is what the virtual machine uses of course) and descriptive names. Here are the bytes associated with the instructions mentioned above:
popIntoTemp:: <68>
popIntoRcvr:: <60>
popIntoLit:: <82 C0>
Also note, that there may be additional instructions for special cases, e.g. storing into a temporary variable at some large offset.

Related

Temporary materialization conversion - Confusion about terminology and concepts

Hi stackoverflow community,
I'm a few months into C++ and recently I've been trying to grasp the concepts revolving around the "new" value categories, move semantics, and especially temporary materialization.
First of all, it's not straightforward to me how to interpret the term "temporary materialization conversion". The conversion part is clear to me (prvalue -> xvalue). But how exactly is a "temporary" defined in this context? I used to think that temporaries were unnamed objects that only exist - from a language point of view - until the last step in the evaluation of the expression they were created in. But this conception doesn't seem to match what temporaries actually seem to be in the broader context of temporary materialization, the new value categories, etc.
The lack of clarity about the term "temporary" results in me not being able to tell if a "temporary materialization" is a temporary that gets materialized or a materialization that is temporary. I assume it's the former, but I'm not sure. Also: Is the term temporary only used for class types?
This directly brings me to the next point of confusion: What roles do prvalues and xvalues play regarding temporaries? Suppose I have an rvalue expression that needs to be evaluated in such a way that it has to be converted into an xvalue, e.g. by performing member access.
What will exactly happen? Is the the prvalue something that is actually existent (in memory or elsewhere) and is the prvalue already the temporary? Now, the "temporary materialization conversion" described as "A prvalue of any complete type T can be converted to an xvalue of the same type T. This conversion initializes a temporary object of type T from the prvalue by evaluating the prvalue with the temporary object as its result object, and produces an xvalue denoting the temporary object" at cppreference.com (https://en.cppreference.com/w/cpp/language/implicit_conversion) converts the prvalue into an xvalue. This extract makes me think that a prvalue is something that is not existent anywhere in memory or a register up until it gets "materialized" by such conversion. (Also, I'm not sure if a temporary object is the same as a temporary.) So, as far as I understand, this conversion is done by the evaluation of the prvalue expression which has "real" object as a result. This object is then REPRESENTED (= denoted?) by the xvalue expression. What happens in memory? Where has the rvalue been, where is the xvalue now?
My next question is more specific question about a certain part of temporary materialization. In the talk "Understanding value categories in C++" by Kris van Rens on YouTube (https://www.youtube.com/watch?v=liAnuOfc66o&t=3576s) at ~56:30 he shows this slide:
Based on what cppreference.com says about temporary materialization numbers 1 and 2 are clear cases (1: member access on a class pravlue, 2: binding a reference to a prvalue (as in the std::string +operator).
I'm not too sure about number 3, though. Cppreference says: "Note that temporary materialization does not occur when initializing an object from a prvalue of the same type (by direct-initialization or copy-initialization): such object is initialized directly from the initializer. This ensures "guaranteed copy elision"."
The +operator returns a prvalue. Now, this prvalue of type std::string is used to initialize an auto (which should resolve to std::string as well) variable. This sounds like the case that is discussed in the prior cppreference excerpt. So does temporary materialization really occur here? And what happens to the objects (1 and 2) that were "denoted" by the xvalue expressions in between? When do they get destroyed? And if the +operator is returning an prvalue, does it even "exist" somewhere? And how is the object auto x " initialized directly from the initializer" (the prvalue) if the prvalue is not even a real (materialized?) object?
In the talk "Nothing is better than copy or move - Roger Orr [ACCU 2018]" on YouTube (https://www.youtube.com/watch?v=-dc5vqt2tgA&t=2557s) at ~ 40:00 there is nearly the same example:
This slide even says that temporary materialization occurs when initializing a variable which clearly contradicts the exception from cppference from above. So what's true?
As you can see, I'm pretty confused about this whole topic. For me, it's especially hard to grasp these concepts as I cannot find any clear definitions of various term that are used in a uniform way online. I'd appreciate any help a lot!
Best regards,
Ruperrrt
TL;DR: What is a temporary in the temporary materialization conversion context? Does that mean that a temporary gets materialized or that it is materialization that is temporary? Also temporary = temporary object?
In the slides, is 3 (first slide) respectively 1 (second slide) really a point where temporary materialization occurs (conflicts with what cppreference says about initialization from pravlues of the same type)?
107 views, 6 months and no answer nor comments. Interesting. Here's my take on your question.
Temporary materialization should mean "temporary that gets materialized". I don't even know what "materialization that is temporary" would even mean to be honest.
The term temporary is not only used for class types.
Prvalues, loosely speaking, don't exist in memory unlike xvalues. The thing you should care about is the context. Let's say you have defined a structure
struct S { int m; };.
In the expression S x = S();, subexpression S() denotes a prvalue. The compiler with treat it just as if you have written S x{}; (note that I've put curly brackets on purpose because S x(); is actually a declaration of a function). On the other hand in expression like int i = S().m;, subexpression S() is a prvalue that will be converted to xvalue that is, S() will denote something that will exist in the memory.
Regarding your second question, the thing you need to know about is that with the C++-17, the circumstances in which temporaries are going to be created were brought down to minimum (cppreference describes it very well). However, the expression
auto x = std::string("Guaca") + std::string("mole").c_str();
will require two temporary objects to be created before assignment. Firstly, you are doing member access with c_str() method so a temporary std::string will be created. Secondly, the operator + will bind one a reference to the std::string("Guaca") (new temporary). and one to the result object of c_str(), but without creating additional temporary because of:. That's pretty much it. It's worth to note that the order of creation temporary objects isn't known - it's totally implementation-defined.
After that, we're calling the operator + which probably constructs another std::string which technically isn't a temporary object because that's a part of the implementation. That object might or might not be constructed into the memory location of x depending on the NRVO. In any case, whatever value does the prvalue expression std::string("Guaca") + std::string("mole").c_str() denote will be the same value (of the same object) denoted by the expression x because of cpp.ref:
Note that temporary materialization does not occur when initializing an object from a prvalue of the same type (by direct-initialization or copy-initialization): such object is initialized directly from the initializer. This ensures "guaranteed copy elision".
This quote isn't really precise and might confuse you so I also suggest reading copy elision (the first part about mandatory elision).
I'm not a C++ expert so take all of this with a grain of salt.

How to deal with CHARACTER variables, longer than the allowed maximum of 2000?

I'm working with Progress-4GL, appBuilder and procedure editor, release 11.6.
I've just found a CHARACTER type global variable (DEF VAR global_variable AS CHAR NO-UNDO.), containing up to 12901 characters. The variable is only used for passing information within the application, the information will never be stored as one tuple within a table.
The information in that variable seems to be handled well: the content is correct.
Yet, as this URL mentions, the maximum length of a character variable in Progress being 2000 characters, and this makes me worry: I'm afraid that one day, another limit may be crossed and from that moment on, we'll need to rethink the whole idea, and I'd like to be prepared for that day.
Therefore, does anybody know the "next" length limit of a character variable in Progress?
That reference you mention points to SQL limitations.
In the ABL, a CHARACTER variable can hold ~ 32 k
DEFINE VARIABLE c AS CHARACTER NO-UNDO.
ASSIGN c = FILL ("*", 31000) .
MESSAGE LENGTH (c)
VIEW-AS ALERT-BOX INFORMATION BUTTONS OK.
Beyond that you have to use LONGCHAR with it's limitations:
slightly slower
cannot be indexed in temp-tables or database tables.
CHARACTER variables are always stored in the CPINTERNAL codepage. LONGCHAR's can use a different codepage through the FIX-CODEPAGE statement.

Why a function can be a literal and a expression can't?

I understand that the concept literal is applied to whenever you represent a fixed value in source code, exactly as it is meant to be interpreted, vs. a variable or a constant, which are names for several of a class or one of them respectively.
But they are also opposed to expressions. I thought it was because they could incorporate variables. But even expressions like 1+2 are not (see first answer in What does the word "literal" mean?).
So, when I define a variable this way:
var=1+2
1+2 is not a literal even though it is not a name and evaluates to a single value. I could then guess that it is because it doesn't represent the target value directly; in other words, a literal represents a value "exactly as it is".
But then how is it possible that a function like this one is a literal (as pointed it the same linked answer)?
(x) => x*x
Only anonymous functions can be literal because they are not bound to an identifier
so (x)=>x*x is a literal because it is a anonymous function,or function literal
but a
void my_func()
{#something}
is not a literal cause it is bound to an identifier;
read these,
https://en.wikipedia.org/wiki/Literal_(computer_programming)
https://en.wikipedia.org/wiki/Anonymous_function
Expressions can be divided into two general types: atomic expressions and composite expressions.
Composite expressions can be divided by operator, and so on; atomic expressions can be divided into variables, constants, and literals. I guess different authors might use other categories or boundaries here, so it might not be universal. But I will argue why this categorization might make sense.
It's fairly obvious why strings or numbers are literals, or why a sum isn't. A function call can be considered composite, as it operates on subexpressions - its parameters. A function definition does not operate on subexpressions. Only when the so defined function is called, that call passes parameters into the function. In a compiled language, the anonymous function will likely be replaced by a target address where the corresponding code is located - that memory location is obviously not dependent on any subexpression.
#rdRahul's answer references this Wikipedia article, which says that object literals, such as {"cat", "dog"} can be considered literals. This can be easily argued by pointing out that the object which is the value of the expression is just one opaque thing, such as a pointer to the memory location of the object.

How is a syscall is defined in linux kernel? What's the relation between compat_sys_xxx and sys_xxx?

In /include/linux/compat.h, I see a lot of compat_sys_xxx. Also, there is sys_xxx defined somewhere else. What's the relation between compat_sys_xxx and sys_xxx?
If there's a compat entry, it almost certainly means that the system call prototype was changed and a version of the previous prototype was maintained for compatibility. Often you'll see that compat_sys_xxx just calls sys_xxx with the arguments converted appropriately (or both call a common function with slightly different conversions).
As a more or less random example, compat_sys_msgsnd takes three "int" arguments followed by a pointer to a compat_msgbuf structure (wherein the first - ostensibly "long" - field is forced to a 32-bit size). OTOH, sys_msgsnd lists the arguments in a different order and with argument types selected to morph appropriately for the architecture (i.e. long floats according to the natural long integer size, size_t replaces int in one place, etc).
No doubt the syscall interface was changed because the original interface was ambiguous in some way, when moved to a different (non-i386) architecture. The compat_ version allows existing binaries to continue working without modification.

Implementing pass-by-reference argument semantics in an interpreter

Pass-by-value semantics are easy to implement in an interpreter (for, say, your run-of-the-mill imperative language). For each scope, we maintain an environment that maps identifiers to their values. Processing a function call involves creating a new environment and populating it with copies of the arguments.
This won't work if we allow arguments that are passed by reference. How is this case typically handled?
First, your interpreter must check that the argument is something that can be passed by reference – that the argument is something that is legal in the left-hand side of an assignment statement. For example, if f has a single pass-by-reference parameter, f(x) is okay (since x := y makes sense) but f(1+1) is not (1+1 := y makes no sense). Typical qualifying arguments are variables and variable-like constructs like array indexing (if a is an array for which 5 is a legal index, f(a[5]) is okay, since a[5] = y makes sense).
If the argument passes that check, it will be possible for your interpreter to determine while processing the function call which precise memory location it refers to. When you construct the new environment, you put a reference to that memory location as the value of the pass-by-reference parameter. What that reference concretely looks like depends on the design of your interpreter, particularly on how you represent variables: you could simply use a pointer if your implementation language supports it, but it can be more complex if your design calls for it (the important thing is that the reference must make it possible for you to retrieve and modify the value contained in the memory location being referred to).
while your interpreter is interpreting the body of a function, it may have to treat pass-by-referece parameters specially, since the enviroment does not contain a proper value for it, just a reference. Your interpreter must recognize this and go look what the reference points to. For example, if x is a local variable and y is a pass-by-reference parameter, computing x+1 and y+1 may (depending on the details of your interpreter) work differently: in the former, you just look up the value of x, and then add one to it; in the latter, you must look up the reference that y happens to be bound to in the environment and go look what value is stored in the variable on the far side of the reference, and then you add one to it. Similarly, x = 1 and y = 1 are likely to work differently: the former just goes to modify the value of x, while the latter must first see where the reference points to and modify whatever variable or variable-like thing (such as an array element) it finds there.
You could simplify this by having all variables in the environment be bound to references instead of values; then looking up the value of a variable is the same process as looking up the value of a pass-by-reference parameter. However, this creates other issues, and it depends on your interpreter design and on the details of the language whether that's worth the hassle.

Resources