Avoiding race conditions without using program blocks in SystemVerilog


We know that program blocks are used in SystemVerilog to avoid race conditions between the DUT and the testbench. What did verification engineers do before SystemVerilog came into the picture? I can only think of using handshake signals.

You use the same semantics that designers use to prevent race conditions in RTL: Non-blocking assignments, or alternative clock edges.
Program blocks are an unnecessary construct in SystemVerilog. See http://go.mentor.com/programblocks

You can avoid race conditions without using a program block.
A race condition arises when several expressions or assignments access the same signal at the same simulation time.
If the accesses are moved to different time stamps, or to different scheduling regions within the same time stamp, the race disappears.
Code written in Verilog or SystemVerilog does not execute all at once within a time step; the simulator processes it in distinct scheduling regions, such as the Active, NBA, and Reactive regions.
Race conditions can be removed using the following:
(1) Program block
(2) Clocking block
(3) Non-blocking assignment
Before program blocks and clocking blocks existed, race conditions were removed using non-blocking assignments.
The two regions that matter most for this are the Active and NBA regions.
Active region: continuous assignments and blocking assignments are executed, and the right-hand sides of non-blocking assignments are evaluated.
NBA region: the left-hand sides of non-blocking assignments are updated.
The Active region is processed first, then the NBA region.
So before program blocks, verification engineers avoided races by paying attention to these regions of execution.
SystemVerilog defines further regions, such as the Preponed, Observed, Reactive, and Postponed regions. Code in a program block executes in the Reactive region, after the design's Active and NBA activity for that time step.

Related

Grayzone between blocking and non-blocking I/O?

I am familiar with programming according to the two paradigms, blocking and non-blocking, on the JVM (Java/nio, Scala/Akka).
However, I see a kind of grayzone in between that confuses me.
Look at any non-blocking program of your choice: it is full of blocking statements!
For example, each assignment of a variable is a blocking operation that waits for CPU-registers and memory-reads to succeed.
Moreover, non-blocking programs even contain blocking statements that carry out computations on complex in-memory-collections, without violating the non-blocking paradigm.
In contrast to that, the non-blocking paradigm would clearly be violated if we would call some external web-service in a blocking way to receive its result.
But what is in between these extremes? What about reading/writing a tiny file, a local socket, or making an API-call to an embedded data storage engine (such as SQLite, RocksDb, etc.). Is it ok to do blocking reads/writes to these APIs? They usually give strong timing guarantees in practice (say << 1ms as long as the OS is not stalled), so there is almost no practical difference to pure in-memory-access. As a precise example: is calling RocksDBs get/put within an Akka Actor considered to be an inadvisable blocking I/O?
So, my question is whether there are rules of thumb or precise criteria that help me in deciding whether I may stick to a simple blocking statement in my non-blocking program, or whether I shall wrap such a statement into non-blocking boilerplate (framework-depending, e.g., outsourcing such calls to a separate thread-pool, nesting one step deeper in a Future or Monad, etc.).

for example, each assignment of a variable is a blocking operation that waits for CPU-registers and memory-reads to succeed
That's not really what is considered "blocking". Those operations are constant time, and that constant is very low (a few cycles in general) compared to the latency of any IO operations (anywhere between thousands and billions of cycles) - except for page faults due to swapped memory, but if those happen regularly you have a problem anyway.
And if we want to get all nitpicky, individual instructions do not fully block a CPU thread as modern CPUs can reorder instructions and execute ones that have no data dependencies out of order while waiting for memory/caches or other more expensive instructions to finish.
Moreover, non-blocking programs even contain blocking statements that carry out computations on complex in-memory-collections, without violating the non-blocking paradigm.
Those are not considered as blocking the CPU from doing work. They should not even block user interactivity if they are correctly designed to present the results to the user when they are done without blocking the UI.
Is it ok to do blocking reads/writes to these APIs?
That always depends on why you are using non-blocking approaches in the first place. What problem are you trying to solve? Maybe one API warrants a non-blocking approach while the other does not.
For example, most file IO methods are nominally blocking, but writes without fsync can be very cheap, especially if you're not writing to spinning rust, so it can be overkill to avoid those methods on your compute threadpool. On the other hand, one usually does not want to block a thread in a fixed threadpool while waiting for a multi-second database query.
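
One common way to handle the second case is to move the slow call onto its own thread pool. The following is a minimal sketch, not a recommendation for any particular framework: slowBlockingQuery is a hypothetical stand-in for a long-running blocking call, and the pool size of 4 is an arbitrary assumption.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class BlockingOffload {
    // Hypothetical stand-in for a slow blocking call (e.g. a long database query).
    static String slowBlockingQuery(String key) {
        try {
            Thread.sleep(50); // simulates waiting on I/O
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return "value-for-" + key;
    }

    // Runs the blocking call on a separate pool so that compute threads stay free.
    static CompletableFuture<String> queryAsync(String key, ExecutorService blockingPool) {
        return CompletableFuture.supplyAsync(() -> slowBlockingQuery(key), blockingPool);
    }

    public static void main(String[] args) {
        ExecutorService blockingPool = Executors.newFixedThreadPool(4); // size is an assumption
        try {
            CompletableFuture<String> result = queryAsync("user42", blockingPool);
            // The caller could continue with other work; join() here only ends the demo.
            System.out.println(result.join());
        } finally {
            blockingPool.shutdown();
        }
    }
}
```

Cheap calls such as small buffered writes can stay inline; only calls whose latency is long or unbounded need this kind of detour.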

What word refers to a code segment/function that can be/is executed concurrently/in parallel by two different threads?

I came across this term while studying threads, synchronization, and writing multi-threaded programs. If I remember correctly, it refers to a section of code that two threads execute in parallel.
If I remember incorrectly, it might actually refer to a section of code that can run simultaneously. Then again, I might be off entirely (sorry).
The term is on the tip of my tongue and I (desperately) want to google it.

REENTRANT and THREAD-SAFE. Both are necessary.
See this Wiki entry on "reentrant":
In computing, a computer program or subroutine is called reentrant if
it can be interrupted in the middle of its execution and then safely
called again ("re-entered") before its previous invocations complete
execution. The interruption could be caused by an internal action such
as a jump or call, or by an external action such as a hardware
interrupt or signal. Once the reentered invocation completes, the
previous invocations will resume correct execution.
This definition originates from single-threaded programming
environments where the flow of control could be interrupted by a
hardware interrupt and transferred to an interrupt service routine
(ISR). Any subroutine used by the ISR that could potentially have been
executing when the interrupt was triggered should be reentrant. Often,
subroutines accessible via the operating system kernel are not
reentrant. Hence, interrupt service routines are limited in the
actions they can perform; for instance, they are usually restricted
from accessing the file system and sometimes even from allocating
memory.
A subroutine that is directly or indirectly recursive should be
reentrant. This policy is partially enforced by structured programming
languages. However a subroutine can fail to be
reentrant if it relies on a global variable to remain unchanged but
that variable is modified when the subroutine is recursively invoked.
This definition of reentrancy differs from that of thread-safety in
multi-threaded environments. A reentrant subroutine can achieve
thread-safety, but being reentrant alone might not be sufficient to
be thread-safe in all situations. Conversely, thread-safe code does
not necessarily have to be reentrant (see below for examples).
...

I think the term you're looking for is a Critical Section - a piece of code whose function is critically important when dealing with multiple threads.
However, your question posits a block of code that can run simultaneously on multiple threads, which is different than a critical section - a critical section is specifically a block of code that must run on only one thread at a time, for instance, incrementing a bank balance. It's the type of code where one would expect that multiple threads could try to run it, but specifically requires that only one thread actually be allowed to run it at one time.
There is no name, to the best of my knowledge, for a block of code that could be executed simultaneously on multiple threads, because lots of code does that innocuously.
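
As an illustration of the critical-section idea, here is a minimal Java sketch of the bank-balance case. The Account class and its methods are hypothetical names chosen for this example; synchronized is only one of several ways to get mutual exclusion.

```java
final class Account {
    private long balance;

    // Critical section: only one thread at a time may run this on a given Account.
    synchronized void deposit(long amount) {
        balance += amount; // read-modify-write, unsafe without mutual exclusion
    }

    synchronized long balance() {
        return balance;
    }

    // Deposits 1 unit perThread times from each thread and returns the final balance.
    static long runDeposits(int threads, int perThread) {
        Account account = new Account();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    account.deposit(1);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return account.balance();
    }

    public static void main(String[] args) {
        System.out.println(runDeposits(4, 10_000)); // 40000
    }
}
```

Without the synchronized keyword on deposit, some increments could be lost and the final balance could come out lower.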

Why is using a zero delay (#0) in Verilog not good practice?

Why is using a zero delay not good?
I can't find details about what problems will arise if we use that method.

Designers are often tempted to use #0 to avoid race conditions between two procedural blocks. A #0 in a procedural block forces that block to stop and be rescheduled after all other blocks. The problem happens when you have multiple blocks that all want to execute last. Who should win?
This itself can become a new race condition and its resolution could vary from run to run and from simulator to simulator. In short, multiple threads using #0 delays can cause non-deterministic execution behavior.
Besides, it makes your code hard to read and also non-synthesizable. SystemVerilog has provided new constructs for avoiding #0 in a more predictable and readable way. Here is one example (See 7.2 Event trigger race conditions).
Note that there are cases other than the classic usage of #0 where you may actually need #0 in SystemVerilog. For example, deferred assertions.

Are "data races" and "race condition" actually the same thing in the context of concurrent programming?

I often find these terms being used in the context of concurrent programming. Are they the same thing or different?

No, they are not the same thing. They are not a subset of one another. They are also neither the necessary, nor the sufficient condition for one another.
The definition of a data race is pretty clear, and therefore, its discovery can be automated. A data race occurs when 2 instructions from different threads access the same memory location, at least one of these accesses is a write and there is no synchronization that is mandating any particular order among these accesses.
A race condition is a semantic error. It is a flaw that occurs in the timing or the ordering of events that leads to erroneous program behavior. Many race conditions can be caused by data races, but this is not necessary.
Consider the following simple example where x is a shared variable:
Thread 1        Thread 2
lock(l)         lock(l)
x=1             x=2
unlock(l)       unlock(l)
In this example, the writes to x from thread 1 and 2 are protected by locks, therefore they always happen in some order enforced by the order in which the locks are acquired at runtime. That is, the writes' atomicity cannot be broken; there is always a happens-before relationship between the two writes in any execution. We just cannot know a priori which write happens before the other.
There is no fixed ordering between the writes, because locks cannot provide this. If the programs' correctness is compromised, say when the write to x by thread 2 is followed by the write to x in thread 1, we say there is a race condition, although technically there is no data race.
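
The two-thread example can be sketched in Java roughly as follows. The class and method names are made up for illustration, and ReentrantLock plays the role of the lock l.

```java
import java.util.concurrent.locks.ReentrantLock;

final class LockedWrites {
    private static final ReentrantLock lock = new ReentrantLock();
    private static int x; // shared variable, only touched while holding the lock

    static void write(int value) {
        lock.lock();
        try {
            x = value;
        } finally {
            lock.unlock();
        }
    }

    static int read() {
        lock.lock();
        try {
            return x;
        } finally {
            lock.unlock();
        }
    }

    // Starts two threads that write 1 and 2 under the lock and returns the surviving value.
    static int runOnce() {
        Thread t1 = new Thread(() -> write(1));
        Thread t2 = new Thread(() -> write(2));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return read();
    }

    public static void main(String[] args) {
        // No data race, yet the printed value may be 1 or 2 depending on lock acquisition order.
        System.out.println("x = " + runOnce());
    }
}
```

Every access to x is synchronized, so a data race detector has nothing to report, but the final value still depends on which thread acquires the lock last.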
It is far more useful to detect race conditions than data races; however this is also very difficult to achieve.
Constructing the reverse example is also trivial. This blog post also explains the difference very well, with a simple bank transaction example.

According to Wikipedia, the term "race condition" has been in use since the days of the first electronic logic gates. In the context of Java, a race condition can pertain to any resource, such as a file, network connection, a thread from a thread pool, etc.
The term "data race" is best reserved for its specific meaning defined by the JLS.
The most interesting case is a race condition that is very similar to a data race, but still isn't one, like in this simple example:
class Race {
    static volatile int i;
    static int uniqueInt() { return i++; } // not atomic: reads i, then writes i+1
}
Since i is volatile, there is no data race; however, from the program correctness standpoint there is a race condition due to the non-atomicity of the two operations: read i, write i+1. Multiple threads may receive the same value from uniqueInt.
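
One way to remove this particular race condition is to make the read and the write a single atomic step. The sketch below uses AtomicInteger for that; the UniqueIds class and the countDistinct helper are hypothetical names added only to demonstrate the effect.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class UniqueIds {
    private static final AtomicInteger next = new AtomicInteger();

    // getAndIncrement performs the read and the write as one atomic step.
    static int uniqueInt() {
        return next.getAndIncrement();
    }

    // Calls uniqueInt() from several threads and returns how many distinct values were handed out.
    static int countDistinct(int threads, int perThread) {
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    seen.add(uniqueInt());
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinct(4, 10_000)); // 40000: no value was handed out twice
    }
}
```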

TL;DR: The distinction between data race and race condition depends on the nature of problem formulation, and where to draw the boundary between undefined behavior and well-defined but indeterminate behavior. The current distinction is conventional and best reflects the interface between processor architect and programming language.
1. Semantics
Data race specifically refers to the non-synchronized conflicting "memory accesses" (or actions, or operations) to the same memory location. If there is no conflict in the memory accesses, while there is still indeterminate behavior caused by operation ordering, that is a race condition.
Note that "memory accesses" here have a specific meaning. They refer to "pure" memory load or store actions, without any additional semantics applied. For example, a memory store from one thread does not (necessarily) know how long it takes for the data to be written into memory and finally propagated to another thread. As another example, a memory store to one location followed by a store to another location by the same thread does not (necessarily) guarantee that the first datum reaches memory ahead of the second. As a result, the order of those pure memory accesses cannot (necessarily) be "reasoned" about, and anything could happen unless otherwise well defined.
When the "memory accesses" are well defined in terms of ordering through synchronization, additional semantics can ensure that, even if the timing of the memory accesses is indeterminate, their order can be "reasoned" about through the synchronizations. Note that although the ordering between the memory accesses can be reasoned about, it is not necessarily determinate, hence the race condition.
2. Why the difference?
But if the order is still indeterminate in a race condition, why bother to distinguish it from a data race? The reason is practical rather than theoretical: the distinction exists in the interface between the programming language and the processor architecture.
A memory load/store instruction in a modern architecture is usually implemented as a "pure" memory access, due to out-of-order pipelines, speculation, multi-level caches, the CPU-RAM interconnect, and especially multi-core designs. Many factors lead to indeterminate timing and ordering. Enforcing ordering for every memory instruction incurs a huge penalty, especially in a processor design that supports multiple cores. So the ordering semantics are provided by additional instructions such as various barriers (or fences).
A data race is the situation in which processor instructions execute without additional fences to help reason about the ordering of conflicting memory accesses. The result is not only indeterminate but possibly very weird, e.g., two writes to the same word by different threads may each end up writing half of the word, or may only operate on their locally cached values. These are undefined behaviors from the programmer's point of view, but they are (usually) well defined from the processor architect's point of view.
Programmers need a way to reason about how their code executes. A data race is something they cannot make sense of, and therefore should (normally) always avoid. That is why language specifications that are low level enough usually define a data race as undefined behavior, in contrast to the well-defined memory behavior of a race condition.
3. Language memory models
Different processors may have different memory access behavior, i.e., processor memory model. It is awkward for programmers to study the memory model of every modern processor and then develop programs that can benefit from them. It is desirable if the language can define a memory model so that the programs of that language always behave as expected as the memory model defines. That is why Java and C++ have their memory models defined. It is the burden of the compiler/runtime developers to ensure the language memory models are enforced across different processor architectures.
That said, if a language does not want to expose the low-level behavior of the processor (and is willing to sacrifice certain performance benefits of modern architectures), it can define a memory model that completely hides the details of "pure" memory accesses and applies ordering semantics to all memory operations. The compiler/runtime developers may then choose to treat every memory variable as volatile on all processor architectures. For such languages (that support shared memory across threads), there are no data races, but there may still be race conditions, even in a language with complete sequential consistency.
On the other hand, the processor memory model can be stricter (or less relaxed, or at a higher level), e.g., implementing sequential consistency as early processors did. Then all memory operations are ordered, and no data race exists for any language running on that processor.
4. Conclusion
Back to the original question: IMHO it is fine to define a data race as a special case of a race condition, and a race condition at one level may become a data race at a higher level. It depends on the nature of the problem formulation and on where to draw the boundary between undefined behavior and well-defined but indeterminate behavior. That the current convention draws the boundary at the language-processor interface does not necessarily mean it always must be so, but the current convention probably best reflects the state-of-the-art interface (and wisdom) between processor architects and programming languages.

No, they are different, and neither is a subset of the other.
The term race condition is often confused with the related term data
race, which arises when synchronization is not used to coordinate all
access to a shared nonfinal field. You risk a data race whenever a
thread writes a variable that might next be read by another thread or
reads a variable that might have last been written by another thread
if both threads do not use synchronization; code with data races has
no useful defined semantics under the Java Memory Model. Not all race
conditions are data races, and not all data races are race conditions,
but they both can cause concurrent programs to fail in unpredictable
ways.
Taken from the excellent book - Java Concurrency in Practice by Brian Goetz & Co.

Data races and Race condition
[Atomicity, Visibility, Ordering]
In my opinion these are definitely two different things.
A data race is a situation in which the same memory is shared between several threads, at least one of which writes to it, without synchronization.
A race condition is a situation in which unsynchronized blocks of code (possibly the same block) that use the same shared resource run simultaneously on different threads, so that the result is unpredictable.
Race condition examples:
//increment variable
1. read variable
2. change variable
3. write variable
//cache mechanism
1. check whether the value exists in the cache and, if not,
2. load
3. cache
Solution:
Data races and race conditions are atomicity problems, and both can be solved by a synchronization mechanism.
Data race - solved when access to the shared variable is synchronized
Race condition - solved when the block of code runs as an atomic operation
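
For the cache case (check, load, cache), a minimal Java sketch of running the three steps as one atomic operation could look like this. LoadOnceCache and countLoads are hypothetical names for this example; the atomicity comes from ConcurrentHashMap.computeIfAbsent, which applies the mapping function at most once per key.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class LoadOnceCache<K, V> {
    private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    LoadOnceCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Check, load, and store happen as one atomic step per key.
    V get(K key) {
        return cache.computeIfAbsent(key, loader);
    }

    // Requests the same key from several threads and returns how often the loader ran.
    static int countLoads(int threads) {
        AtomicInteger loads = new AtomicInteger();
        LoadOnceCache<String, String> cache = new LoadOnceCache<String, String>(key -> {
            loads.incrementAndGet();
            return key.toUpperCase();
        });
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> cache.get("config"));
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return loads.get();
    }

    public static void main(String[] args) {
        System.out.println(countLoads(8)); // 1: the value was loaded only once
    }
}
```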

what is the difference between semaphore and critical region?

The only thing I understood is:
a semaphore is a primitive mechanism;
a critical region has a GUARD variable (a semaphore has one too, but it is not called GUARD!).
So what's the difference?

Generally, a critical region is a place where, if two separate threads of execution were to be present, a race condition or some other undesirable effect would occur. Semaphores are one way of preventing two threads from being in the critical region at the same point in time.

The GUARD would only allow 1 thread to enter the critical region at a time, whereas the semaphore can allow n threads (you specify n) to concurrently enter the critical region.
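
The "n threads at once" behavior can be observed with java.util.concurrent.Semaphore. This is a small sketch under assumed names (BoundedRegion, maxConcurrent); it records the highest number of threads seen inside the guarded region at the same time.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

final class BoundedRegion {
    // Sends `threads` workers through a region guarded by `permits` permits and
    // returns the highest number of workers observed inside the region at once.
    static int maxConcurrent(int threads, int permits) {
        Semaphore semaphore = new Semaphore(permits);
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                semaphore.acquireUninterruptibly(); // blocks once all permits are taken
                try {
                    int now = inside.incrementAndGet();
                    maxSeen.accumulateAndGet(now, Math::max);
                    Thread.yield(); // give other workers a chance to enter
                    inside.decrementAndGet();
                } finally {
                    semaphore.release();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return maxSeen.get();
    }

    public static void main(String[] args) {
        System.out.println(maxConcurrent(8, 3)); // never more than 3
        System.out.println(maxConcurrent(8, 1)); // 1: behaves like a mutual-exclusion guard
    }
}
```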

When a process executes code that manipulates shared data (or a shared resource), we say that the process is in its critical section (CS) for that shared data.
A semaphore is a non-negative integer variable used as a flag that signals if and when a resource is free.

There are two interpretations of "critical region":
1. A region of code that will produce undefined results if executed simultaneously by two threads.
2. A region of code that is isolated from all executors except for the current thread. An example of this would be an interrupt handler. These regions are more commonly called "critical sections". On Intel CPUs you can begin/end a critical section with the CLI/STI instructions.
