Suppose I have an algol-like language, with static types and the following piece of code:
a := b + c * d;
where a is a float, b an integer, c a double and d a long. The language will then convert d to double to operate with c, and b to double to operate with the result of c*d. After that, the double result of b+c*d will be converted to float so it can be assigned to a. But when does this happen? I mean, do all the conversions happen at run time or at compile time?
And if I have:
int x; //READ FROM USER KEYBOARD.
if (x > 5) {
a:= b + c * d;
}
else {
a := b + c;
}
The above code has conditionals. If the compiler converts this at compile time, some portion may never run. Is this correct?
You cannot do a conversion at compile-time any more than you can do an addition at compile time (unless the compiler can determine the value of the variable, perhaps because it is actually constant).
The compiler can (and does) emit a program with instructions which add and multiply the value of variables. It also emits instructions which convert the type of a stored value into a different type prior to computation, if that is necessary.
Languages which do not have variable types fixed at compile-time do have to perform checks at runtime and conditionally convert values to different types. But I don't believe that is the case with any of the languages included in the general category of "Algol-like".
I'm puzzled as to how arguments are passed into a cppFunction when we use Rcpp. In particular, I wonder if someone can explain the result of the following code.
library(Rcpp)
cppFunction("void test(double &x, NumericVector y) {
x = 2016;
y[0] = 2016;
}")
a = 1L
b = 1L
c = 1
d = 1
test(a,b)
test(c,d)
cat(a,b,c,d) #this prints "1 1 1 2016"
As stated before in other areas, Rcpp establishes convenient classes around R's SEXP objects.
For the first parameter, the double type does not have a default SEXP object. This is because within R, there is no such thing as a scalar. Thus, new memory is allocated, which makes the & reference incompatible. Hence, the scope of the modification is limited to the function and the result is never propagated back. As a result, for both test cases, you will see 1.
For the second parameter, there is a mismatch between object classes. In the first call, the object supplied is of type integer because of the L suffix, which conflicts with the numeric type the C++ function expects, so an intermediate copy of the correct type is created and the modification is made to that copy. The issue is resolved once the L is dropped, as the object is then instantiated as a numeric. In that case no intermediate memory location of the correct type needs to be created to receive the value. Hence, the modification in the second call is able to propagate back to R.
e.g.
a = 1L
class(a)
# "integer"
a = 1
class(a)
# "numeric"
Could you please explain differences between and definition of call by value, call by reference, call by name and call by need?
Call by value
Call-by-value evaluation is the most common evaluation strategy, used in languages as different as C and Scheme. In call-by-value, the argument expression is evaluated, and the resulting value is bound to the corresponding variable in the function (frequently by copying the value into a new memory region). If the function or procedure is able to assign values to its parameters, only its local copy is assigned — that is, anything passed into a function call is unchanged in the caller's scope when the function returns.
Call by reference
In call-by-reference evaluation (also referred to as pass-by-reference), a function receives an implicit reference to a variable used as argument, rather than a copy of its value. This typically means that the function can modify (i.e. assign to) the variable used as argument—something that will be seen by its caller. Call-by-reference can therefore be used to provide an additional channel of communication between the called function and the calling function. A call-by-reference language makes it more difficult for a programmer to track the effects of a function call, and may introduce subtle bugs.
differences
call by value example
If data is passed by value, the data is copied from the variable in the caller (for example, main()) to a variable used by the function. So if the data stored in the function's variable is modified inside the function, the value changes only in that local variable. Let’s take a look at a call by value example:
#include <stdio.h>
void call_by_value(int x) {
printf("Inside call_by_value x = %d before adding 10.\n", x);
x += 10;
printf("Inside call_by_value x = %d after adding 10.\n", x);
}
int main() {
int a=10;
printf("a = %d before function call_by_value.\n", a);
call_by_value(a);
printf("a = %d after function call_by_value.\n", a);
return 0;
}
The output of this call by value code example will look like this:
a = 10 before function call_by_value.
Inside call_by_value x = 10 before adding 10.
Inside call_by_value x = 20 after adding 10.
a = 10 after function call_by_value.
call by reference example
If data is passed by reference, a pointer to the data is copied instead of the actual variable, as is done in a call by value. Because a pointer is copied, if the value at that pointer's address is changed in the function, the value is also changed in main(). Let’s take a look at a code example:
#include <stdio.h>
void call_by_reference(int *y) {
printf("Inside call_by_reference y = %d before adding 10.\n", *y);
(*y) += 10;
printf("Inside call_by_reference y = %d after adding 10.\n", *y);
}
int main() {
int b=10;
printf("b = %d before function call_by_reference.\n", b);
call_by_reference(&b);
printf("b = %d after function call_by_reference.\n", b);
return 0;
}
The output of this call by reference source code example will look like this:
b = 10 before function call_by_reference.
Inside call_by_reference y = 10 before adding 10.
Inside call_by_reference y = 20 after adding 10.
b = 20 after function call_by_reference.
when to use which
One advantage of call by reference is that it uses pointers, so the memory used by the variables is not doubled (as it is with the copy made by call by value). Lowering the memory footprint is a good thing, so why don’t we just make all parameters call by reference?
There are two reasons why this is not a good idea and why you (the programmer) need to choose between call by value and call by reference: side effects and privacy. Unwanted side effects are usually caused by inadvertent changes made to a call-by-reference parameter. Also, in most cases you want the data to be private, so that someone calling a function can change it only if you want them to. So it is better to use call by value by default and to use call by reference only if data changes are expected.
call by name
In call-by-name evaluation, the arguments to a function are not evaluated before the function is called — rather, they are substituted directly into the function body (using capture-avoiding substitution) and then left to be evaluated whenever they appear in the function.
call by need
Lazy evaluation, or call-by-need, is an evaluation strategy that delays the evaluation of an expression until its value is needed (non-strict evaluation) and that also avoids repeated evaluations.
So I have this problem where I have to figure out the output using two different scoping rules. I know the output using lexical scoping is a=3 and b=1, but I am having a hard time figuring out the output using dynamic scoping.
Note: the code example that follows uses C syntax, but let's just treat it as pseudo-code.
int a,b;
int p() {
int a, p;
a = 0; b = 1; p = 2;
return p;
}
void print() {
printf("%d\n%d\n",a,b);
}
void q () {
int b;
a = 3; b = 4;
print();
}
main() {
a = p();
q();
}
Here is what I come up with.
Using Dynamic scoping, the nonlocal references to a and b can change. So I have a=2 ( return from p() ), then b=4 ( inside q() ).
So the output is 2 4?
As we know, C doesn't have dynamic scoping, but assuming it did, the program would print 3 4.
In main, a and b are the global ones. a will be set to 2, as we will see that this is what p will return.
In p, called from main, b is still the global one, but a is the one local in p. The local a is set to 0, but will soon disappear. The global b is set to 1. The local p is set to 2, and 2 will be returned. Now the global b is 1.
In q, called from main, a is the global one, but b is the one local in q. Here the global a is set to 3, and the local b is set to 4.
In print, called from q, a is the global one (which has the value 3), and b is the one local in q (which has the value 4).
It is in this last step, inside the function print, that we see a difference from static scoping. With static scoping a and b would be the global ones. With dynamic scoping, we have to look at the chain of calling functions, and in q we find a variable b, which will be the b used inside print.
C is not a dynamically scoped language. If you want to experiment in order to understand the difference, you're better off with a language like Perl, which lets you choose between the two.
I want to know what is call-by-need.
Though I searched in wikipedia and found it here: http://en.wikipedia.org/wiki/Evaluation_strategy,
but could not understand properly.
If anyone can explain with an example and point out the difference with call-by-value, it would be a great help.
Suppose we have the function
square(x) = x * x
and we want to evaluate square(1+2).
In call-by-value, we do
square(1+2)
square(3)
3*3
9
In call-by-name, we do
square(1+2)
(1+2)*(1+2)
3*(1+2)
3*3
9
Notice that since we use the argument twice, we evaluate it twice. That would be wasteful if the argument evaluation took a long time. That's the issue that call-by-need fixes.
In call-by-need, we do something like the following:
square(1+2)
let x = 1+2 in x*x
let x = 3 in x*x
3*3
9
In step 2, instead of copying the argument (like in call-by-name), we give it a name. Then in step 3, when we notice that we need the value of x, we evaluate the expression for x. Only then do we substitute.
BTW, if the argument expression produced something more complicated, like a closure, there might be more shuffling of lets around to eliminate the possibility of copying. The formal rules are somewhat complicated to write down.
Notice that we "need" values for the arguments to primitive operations like + and *, but for other functions we take the "name, wait, and see" approach. We would say that the primitive arithmetic operations are "strict". It depends on the language, but usually most primitive operations are strict.
Notice also that "evaluation" still means to reduce to a value. A function call always returns a value, not an expression. (One of the other answers got this wrong.) OTOH, lazy languages usually have lazy data constructors, which can have components that are evaluated on need, i.e., when extracted. That's how you can have an "infinite" list: the value you return is a lazy data structure. But call-by-need vs call-by-value is a separate issue from lazy vs strict data structures. Scheme has lazy data constructors (streams), although since Scheme is call-by-value, the constructors are syntactic forms, not ordinary functions. And Haskell is call-by-need, but it has ways of defining strict data types.
If it helps to think about implementations, then one implementation of call-by-name is to wrap every argument in a thunk; when the argument is needed, you call the thunk and use the value. One implementation of call-by-need is similar, but the thunk is memoizing; it only runs the computation once, then it saves it and just returns the saved answer after that.
Imagine a function:
fun add(a, b) {
return a + b
}
And then we call it:
add(3 * 2, 4 / 2)
In a call-by-value language this will be evaluated so:
a = 3 * 2 = 6
b = 4 / 2 = 2
return a + b = 6 + 2 = 8
The function will return the value 8.
In a call-by-need language (also called a lazy language) this is evaluated like so:
a = 3 * 2
b = 4 / 2
return a + b = 3 * 2 + 4 / 2
The function will return the expression 3 * 2 + 4 / 2. So far almost no computational resources have been spent. The whole expression will be computed only if its value is needed - say we wanted to print the result.
Why is this useful? Two reasons. First, if you accidentally include dead code, it doesn't weigh your program down, so the program can be a lot more efficient. Second, it allows you to do very cool things like efficiently calculating with infinite lists:
fun takeFirstThree(list) {
return [list[0], list[1], list[2]]
}
takeFirstThree([0 ... infinity])
A call-by-value language would hang there trying to create a list from 0 to infinity. A lazy language will simply return [0,1,2].
A simple, yet illustrative example:
function choose(cond, arg1, arg2) {
if (cond)
do_something(arg1);
else
do_something(arg2);
}
choose(true, 7*0, 7/0);
Now let's say we're using an eager evaluation strategy: it would calculate both 7*0 and 7/0 eagerly. If it is a lazy evaluation strategy (call-by-need), then it would just send the expressions 7*0 and 7/0 through to the function without evaluating them.
The difference? You would expect do_something(0) to be executed because the first argument gets used, but what actually happens depends on the evaluation strategy:
If the language evaluates eagerly, then it will, as stated, evaluate 7*0 and 7/0 first, and what's 7/0? Divide-by-zero error.
But if the evaluation strategy is lazy, it will see that it doesn't need to calculate the division, it will call do_something(0) as we were expecting, with no errors.
In this example, the lazy evaluation strategy can save the execution from producing errors. In a similar manner, it can save the execution from performing unnecessary evaluation that it won't use (the same way it didn't use 7/0 here).
Here's a concrete example for a bunch of different evaluation strategies written in C. I'll specifically go over the difference between call-by-name, call-by-value, and call-by-need, which is kind of a combination of the previous two, as suggested by Ryan's answer.
#include<stdio.h>
int x = 1;
int y[3]= {1, 2, 3};
int i = 0;
int k = 0;
int j = 0;
int foo(int a, int b, int c) {
i = i + 1;
// 2 for call-by-name
// 1 for call-by-value, call-by-value-result, and call-by-reference
// unsure what call-by-need will do here; will likely be 2, but could have evaluated earlier than needed
printf("a is %i\n", a);
b = 2;
// 1 for call-by-value and call-by-value-result
// 2 for call-by-reference, call-by-need, and call-by-name
printf("x is %i\n", x);
// this triggers multiple increments of k for call-by-name
j = c + c;
// we don't actually care what j is, we just don't want it to be optimized out by the compiler
printf("j is %i\n", j);
// 2 for call-by-name
// 1 for call-by-need, call-by-value, call-by-value-result, and call-by-reference
printf("k is %i\n", k);
// foo is declared to return int and main uses the result, so return something
return j;
}
int main() {
int ans = foo(y[i], x, k++);
// 2 for call-by-value-result, call-by-name, call-by-reference, and call-by-need
// 1 for call-by-value
printf("x is %i\n", x);
return 0;
}
The part we're most interested in is the fact that foo is called with k++ as the actual parameter for the formal parameter c.
Note that how the ++ postfix operator works is that k++ returns k at first, and then increments k by 1. That is, the result of k++ is just k. (But, then after that result is returned, k will be incremented by 1.)
We can ignore all of the code inside foo up until the line j = c + c (the second section).
Here's what happens for this line under call-by-value:
When the function is first called, before it encounters the line j = c + c, because we're doing call-by-value, c will have the value of evaluating k++. Since evaluating k++ returns k, and k is 0 (from the top of the program), c will be 0. However, we did evaluate k++ once, which will set k to 1.
The line becomes j = 0 + 0, which behaves exactly like how you'd expect, by setting j to 0 and leaving c at 0.
Then, when we run printf("k is %i\n", k); we get that k is 1, because we evaluated k++ once.
Here's what happens for the line under call-by-name:
Since the line contains c and we're using call-by-name, we replace the text c with the text of the actual argument, k++. Thus, the line becomes j = (k++) + (k++).
We then run j = (k++) + (k++). One of the (k++)s will be evaluated first, returning 0 and setting k to 1. Then, the second (k++) will be evaluated, returning 1 (because k was set to 1 by the first evaluation of k++), and setting k to 2. Thus, we end up with j = 0 + 1 and k set to 2.
Then, when we run printf("k is %i\n", k);, we get that k is 2 because we evaluated k++ twice.
Finally, here's what happens for the line under call-by-need:
When we encounter j = c + c; we recognize that this is the first time the parameter c is evaluated. Thus we need to evaluate its actual argument (once) and store that value to be the evaluation of c. Thus, we evaluate the actual argument k++, which will return k, which is 0, and therefore the evaluation of c will be 0. Then, since we evaluated k++, k will be set to 1. We then use this stored evaluation as the evaluation for the second c. That is, unlike call-by-name, we do not re-evaluate k++. Instead, we reuse the previously evaluated initial value for c, which is 0. Thus, we get j = 0 + 0; just as if c was pass-by-value. And, since we only evaluated k++ once, k is 1.
As explained in the previous step, j = c + c is j = 0 + 0 under call-by-need, and it runs exactly as you'd expect.
When we run printf("k is %i\n", k);, we get that k is 1 because we only evaluated k++ once.
Hopefully this helps to differentiate how call-by-value, call-by-name, and call-by-need work. If it would be helpful to differentiate call-by-value and call-by-need more clearly, let me know in a comment and I'll explain the code earlier on in foo and why it works the way it does.
I think this line from Wikipedia sums things up nicely:
Call by need is a memoized variant of call by name, where, if the function argument is evaluated, that value is stored for subsequent use. If the argument is pure (i.e., free of side effects), this produces the same results as call by name, saving the cost of recomputing the argument.