Variable inside for loop not re-initialized on each iteration? - string

I have a C++/CLI project that declares a String^ variable inside a for loop but does not initialize it. On the first iteration, the variable is set to some value. On each subsequent iteration, it appears to retain the previous value. Shouldn't a variable in local scope be initialized to null (or equivalent) each time through the loop? This happens with an int as well. Also, the compiler does not warn of a potentially uninitialized value unless I set the warning level to W4, and even then it only warns for the int and not the String^.
Here is sample code that shows the behavior:
#include "stdafx.h"
using namespace System;
int main(array<System::String ^> ^args)
{
for(int n = 0; n < 10; n++)
{
String^ variable;
int x;
switch(n)
{
case 1:
variable = "One";
x = 1;
break;
case 5:
variable = "Five";
x = 5;
break;
}
Console::WriteLine("{0}{1}", variable, x);
}
}
The output of this is:
One, 1
One, 1
One, 1
One, 1
Five, 5
Five, 5
Five, 5
Five, 5
Five, 5
Am I completely misunderstanding how locally scoped variables are supposed to be initialized? Is this a "feature" unique to managed C++? If I convert this to C#, the compiler warns about both variables, even at the base warning level.

Disclaimer: I know C and C++ pretty well; C++/CLI, not so much. But the behavior you're seeing is essentially the same that I'd expect for a similar program in C or C++.
String^ is a handle to a String, similar to a pointer in C or C++.
Unless C++/CLI adds new rules for initialization of handles, a block-scope variable of type String^ with no explicit initialization will initially have a garbage value, consisting of whatever happened to be in that chunk of memory.
Each iteration of the loop conceptually creates and destroys any variables defined between the { and }. And each iteration probably allocates its local variables in the same memory location (this isn't required, but there's no real reason for it not to do so). The compiler could even generate code that allocates the memory on entry to the function.
So on the iteration where n == 1, variable is set to "One" (or rather, to a handle that refers to "One"), and that's the value Console::WriteLine prints. No problem there.
On the next iteration, variable is allocated in the same memory location that was used for it previously. No new value is assigned to it, so it retains the value stored in that memory location by the earlier iteration. The same thing happens with x.
You cannot count on the previous value being retained, and your program's behavior is undefined. If your goal were to write a correctly working program, rather than to understand how this incorrect program behaves, the solution would be to ensure that all your variables are properly initialized before they're used.
On the iterations before any assignment has happened (n == 0 here), variable holds a garbage handle, and passing that to Console::WriteLine could well crash the program -- though even that's not guaranteed.
As for why the compiler doesn't warn about this, I don't know. I hesitate to suggest a compiler bug, but this could be one.
Also, even with high warning levels enabled, warning about uninitialized variables requires control flow analysis that may not be done by default. Enabling both warnings and a high level of optimization might give the compiler enough information to warn about both variable and x.
It still seems odd that it warns about x and not about variable with W4.

C++/CLI is an extension/superset of standard C++, so it follows most of the C++ specification, extending it only where needed to fit the CLI (.NET) requirements.
Shouldn't a variable in local scope be initialized to null (or
equivalent) each time thru the loop?
AFAIK the C++ standard does not require local variables to be initialized; an uninitialized local simply holds an indeterminate value.
So, to avoid any overhead, compilers usually don't do any special local memory management for loops; see this SO question:
Is there any overhead to declaring a variable within a loop? (C++)
Am I completely misunderstanding how locally scoped variables are supposed to be initialized?
Is this a "feature" unique to managed C++
So no, this is not a feature or special behavior: your C++/CLI compiler is just following standard C++ practice.
If I convert this to C# the compiler will warn about both variables,
even at the base warning level.
C# and, AFAIK, Java try hard to avoid undefined behavior, so they force you to initialize local variables before they are used.
Here is the CIL resulting from the compilation (I've done some formatting and commenting to make this wall of text understandable :)):
.locals init (int32 V_0, int32 V_1, string V_2, int32 V_3)
//                  ^          ^           ^          ^
//                  n          x       variable      tmp
// initialization of "n"
IL_0000: ldc.i4.0
IL_0001: stloc.0
IL_0002: br.s IL_0008
// loop starts here
// post iteration processing
IL_0004: ldloc.0
IL_0005: ldc.i4.1
IL_0006: add
IL_0007: stloc.0
// stop condition check
IL_0008: ldloc.0
IL_0009: ldc.i4.s 10
IL_000b: bge.s IL_003e
// initialization of temporary "tmp" variable for switch
IL_000d: ldloc.0
IL_000e: stloc.3
// check if "tmp" is 3
IL_000f: ldloc.3
IL_0010: ldc.i4.1
// if so, go to the "variable" initialization for case 1
IL_0011: beq.s IL_0019
// check if "tmp" is 5
IL_0013: ldloc.3
IL_0014: ldc.i4.5
IL_0015: beq.s IL_0023
// go to display
IL_0017: br.s IL_002b
// initialization of "variable"
IL_0019: ldstr "One"
IL_001e: stloc.2
...
So variable is indeed never implicitly initialized or otherwise touched by the code the compiler generates.

Related

Point a pointer to local variable created within a function

Here is the code:
package main

import "time"

var timePointer *[]time.Time

func UpdateHolidayList() error {
    // updating logic: pulling the list of holidays from an API
    holidaySlice := make([]time.Time, 0)
    // append some holidays of type time.Time to holidaySlice
    // now holidaySlice contains a few time.Time values
    timePointer = &holidaySlice
    return nil
}

func main() {
    // run UpdateHolidayList every 7 days
    go func() {
        for {
            UpdateHolidayList()
            time.Sleep(7 * 24 * time.Hour)
        }
    }()
}
I have 4 questions:
1. holidaySlice is a local variable; is it safe to point a (global) pointer at it?
2. Is this whole code multi-thread safe?
3. After pointing timePointer to holidaySlice, can I access the values via timePointer?
4. (If the answer to 3 is "yes") The holiday list is constantly changing, so holidaySlice will be different after each update. Will the values accessed via timePointer change accordingly?
1. holidaySlice is a local variable allocated on the heap. Any variable pointing to the same heap location can access the data structure stored there; whether it is safe depends on how you access it. Even if holidaySlice were not explicitly allocated on the heap, once you make a global variable point to it, the Go compiler would detect that it "escapes" and would allocate it on the heap anyway.
2. The code is not thread-safe. You are modifying a shared variable (the global timePointer) without any explicit synchronization, so there is no guarantee on when, or if, other goroutines will see updates to it.
3./4. If you update the contents behind timePointer without explicit synchronization, there is no guarantee on when or if other goroutines will see those updates. You have to use synchronization primitives like sync.Mutex to delimit read/write access to data structures that are updated and read by multiple goroutines.
1. As long as your main does not end, timePointer keeps the slice created in UpdateHolidayList reachable; since it is heap-allocated (the compiler detects the escape), the memory will not be freed.
2. Nope, absolutely not. Look at the sync package; a sketch follows below.
3. Yes, you can. Just remember to dereference it, using *timePointer instead of timePointer.
4. It will change, but not "accordingly": since you have not done any synchronization, you have no defined way of knowing what data the slice pointed to by timePointer holds when you read it.

Safely zeroing buffers after working with crypto/*

Is there a way to zero buffers containing e.g. private keys after using them, and to make sure that compilers don't delete the zeroing code as unused? Something tells me that a simple
copy(privateKey, make([]byte, keySize))
is not guaranteed to stay there.
Sounds like you want to prevent sensitive data from remaining in memory. But have you considered that the data might have been replicated, or swapped to disk?
For these reasons I use the https://github.com/awnumar/memguard package.
It provides features to destroy the data when no longer required, while keeping it safe in the meantime.
You can read about its background here: https://spacetime.dev/memory-security-go
How about checking (some of) the content of the buffer after zeroing it and passing it to another function? For example:
copy(privateKey, make([]byte, keySize))
if privateKey[0] != 0 {
    // If you pass the buffer to another function,
    // this check and the copy() above can't be optimized away:
    fmt.Println("Zeroing failed", privateKey[0])
}
To be absolutely safe, you could XOR the passed buffer content with random bytes, but if (and since) the zeroing is not optimized away, the if body is never reached.
You might think a very intelligent compiler could deduce that the copy() above zeros privateKey[0], conclude that the condition is always false, and still optimize the whole thing away (although this is very unlikely). The solution is not to use make([]byte, keySize) but, for example, a slice coming from a global variable or a function argument (whose value can only be determined at runtime), so the compiler can't deduce at compile time that the condition is always false.

Incorrect synchronization in go lang

While taking a look at the Go memory model document (link), I found a weird behavior in Go. The document says that with the code below, it can happen that g prints 2 and then 0.
package main

var a, b int

func f() {
    a = 1
    b = 2
}

func g() {
    print(b)
    print(a)
}

func main() {
    go f()
    g()
}
Is this an issue only with goroutines? I am curious why the assignment to 'b' can happen before the assignment to 'a'. Even if the assignments to 'a' and 'b' happen in a different thread (not the main thread), doesn't it have to be ensured that 'a' is assigned before 'b' within that thread (because the assignment to 'a' comes first in the source)? Can anyone explain this issue clearly?
Variables a and b are allocated and initialized with the zero values of their respective type (which is 0 in case of int) before any of the functions start to execute, at this line:
var a, b int
What may change is the order new values are assigned to them in the f() function.
Quoting from that page: Happens Before:
Within a single goroutine, reads and writes must behave as if they executed in the order specified by the program. That is, compilers and processors may reorder the reads and writes executed within a single goroutine only when the reordering does not change the behavior within that goroutine as defined by the language specification. Because of this reordering, the execution order observed by one goroutine may differ from the order perceived by another. For example, if one goroutine executes a = 1; b = 2;, another might observe the updated value of b before the updated value of a.
Assignment to a and b may not happen in the order you write them if reordering them does not make a difference in the same goroutine. The compiler may reorder them for example if first changing the value of b is more efficient (e.g. because its address is already loaded in a register). If changing the assignment order would (or may) cause issue in the same goroutine, then obviously the compiler is not allowed to change the order. Since the goroutine of the f() function does nothing with the variables a and b after the assignment, the compiler is free to carry out the assignments in whatever order.
Since there is no synchronization between the 2 goroutines in the above example, the compiler makes no effort to check whether reordering would cause any issues in the other goroutine. It doesn't have to.
But if you synchronize your goroutines, the compiler will make sure that at the "synchronization point" there are no inconsistencies: you have a guarantee that at that point both assignments will be "completed". So if the "synchronization point" is before the print() calls, you will see the assigned new values printed: 2 and 1.

c++ multi threading - lock one pointer assignment?

I have a method as below:

SomeStruct* abc;

void NullABC()
{
    abc = NULL;
}
This is just an example and not very interesting.
Many threads could call this method at the same time.
Do I need to lock the "abc = NULL" line?
I think it is just a pointer, so the write happens in one shot and there isn't really a need for a lock, but I wanted to make sure.
Thanks
It depends on the platform on which you are running. On many platforms, as long as abc is correctly aligned, the write will be atomic.
However, if your platform does not have such a guarantee, you need to synchronize access to the variable, using a lock, an atomic variable, or an interlocked operation.
No, you do not need a lock, at least not on x86. A memory barrier is required in many real-world situations though, and locking is one way to get one (the other would be an explicit barrier). You may also consider using an interlocked operation, like Visual C++'s InterlockedExchangePointer, if you need access to the original pointer. There are equivalent intrinsics supported by most compilers.
If no other threads are ever using abc for any other purpose, then the code as shown is fine... but of course it's a bit silly to have a pointer that never gets used except to set it to NULL.
If there is some other code somewhere that does something like this, OTOH:
if (abc != NULL)
{
    abc->DoSomething();
}
Then in this case both the code that uses the abc pointer (above) and the code that changes it (the code you posted) need to lock a mutex before accessing abc. Otherwise the code above risks crashing if the value of abc is set to NULL after the if statement but before the DoSomething() call.
A borderline case would be if the other code does this:
SomeStruct * my_abc = abc;
if (my_abc != NULL)
{
    my_abc->DoSomething();
}
That will probably work, because at the time the abc pointer's value is copied over to my_abc, the value of abc is either NULL or it isn't... and my_abc is a local variable, so other threads won't be able to change it before DoSomething() is called. The above could theoretically break on platforms where copying a pointer isn't atomic (in which case my_abc might end up as an invalid pointer, with half of abc's bits and half of NULL's bits)... but common PC hardware copies pointers atomically, so it shouldn't be an issue there. It might be worthwhile to use a mutex anyway, just for paranoia's sake.

Verilog automatic task

What does it mean if a task is declared with the automatic keyword in Verilog?
task automatic do_things;
    input [31:0] number_of_things;
    reg   [31:0] tmp_thing;
    begin
        // ...
    end
endtask
Note: This question is mostly because I'm curious if there are any hardware programmers on the site. :)
"automatic" does in fact mean "re-entrant". The term itself is stolen from software languages -- for example, C has the "auto" keyword for declaring variables as being allocated on the stack when the scope it's in is executed, and deallocated afterwards, so that multiple invocations of the same scope do not see persistent values of that variable. The reason you may not have heard of this keyword in C is that it is the default storage class for all types :-) The alternatives are "static", which means "allocate this variable statically (to a single global location in memory), and refer to this same memory location throughout the execution of the program, regardless of how many times the function is invoked", and "volatile", which means "this is a register elsewhere on my SoC or something on another device which I have no control over; compiler, please don't optimize reads to me away, even when you think you know my value from previous reads with no intermediate writes in the code".
"automatic" is intended for recursive functions, but also for running the same function in different threads of execution concurrently. For instance, if you "fork" off N different blocks (using Verilog's fork->join statement), and have them all call the same function at the same time, the same problems arise as a function calling itself recursively.
In many cases, your code will be just fine without declaring the task or function as "automatic", but it's good practice to put it in there unless you specifically need it to be otherwise.
It means that the task is re-entrant - items declared within the task are dynamically allocated rather than shared between different invocations of the task.
You see - some of us do Verilog... (ugh)
The "automatic" keyword also allows you to write recursive functions (since verilog 2001). I believe they should be synthesisable if they bottom out, but I'm not sure if they have tool support.
I, too, do Verilog!
As Will and Marty say, the automatic was intended for recursive functions.
If a normal (i.e. not automatic) function is called with different values and processed by the simulator in the same time slice, the returned value is indeterminate. That can be quite a tricky bug to spot! This is only a simulation issue, when synthesised the logic will be correct.
Making the function automatic fixes this.
In computing, a computer program or subroutine is called re-entrant if multiple invocations can safely run concurrently (Wikipedia).
In simple words, the keyword automatic makes it safe for multiple instances of a task to run at the same time. :D
Automatic is simply the opposite of static in ordinary programming, and the same holds in Verilog. Think of static variables: they are not re-initialized on each entry. See the Verilog example below:
for (int i = 0; i < 3; i++) begin
    static int f = 0;
    f = f + 1;
end
The result of the above program will be f = 3. Now see the program below:
for (int i = 0; i < 3; i++) begin
    int f = 0;
    f = f + 1;
end
The result of the above program is f = 1. The static keyword is what makes the difference.
The conclusion is that tasks in Verilog should be automatic because they are invoked (called) many times. If they are static (and if not declared explicitly, they are static), a call can pick up results left over from a previous call, which is often not what we want.
