Verilog doesn't have something like main()?

I understand that modules are essentially like C++ functions. However, I didn't find anything like a main() section that calls those functions. How does it work without a main() section?

Trying to find (or conceptually force) a main() equivalent in HDL is the wrong way to go about learning HDL -- it will prevent you from making progress. For synthesisable descriptions you need to make the leap from sequential thinking (one instruction running after another) to "parallel" thinking (everything is running all the time). Mentally, look at your code from left to right instead of top to bottom, and you may realize that the concept of main() isn't all that meaningful.
In HDL, we don't "call" functions, we instantiate modules and connect their ports to nets; again, you'll need to change your mental view of the process.
Once you get it, it all becomes much smoother...
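As a rough illustration of "instantiate and connect" rather than "call", here is a minimal sketch. The module names half_adder and top, and all port and net names, are made up for this example and do not come from the question.

```systemverilog
// Hypothetical leaf module: pure combinational logic.
module half_adder (
  input  wire a,
  input  wire b,
  output wire sum,
  output wire carry
);
  assign sum   = a ^ b;
  assign carry = a & b;
endmodule

// Hypothetical top level: there is no main(). Both instances exist and
// operate all the time, wired together by nets.
module top (
  input  wire x,
  input  wire y,
  input  wire z,
  output wire s,
  output wire c
);
  wire s0, c0, c1;   // nets connecting the two instances

  half_adder u0 (.a(x),  .b(y), .sum(s0), .carry(c0));
  half_adder u1 (.a(s0), .b(z), .sum(s),  .carry(c1));

  assign c = c0 | c1;   // together the two instances form a full adder
endmodule
```

Nothing in top runs "first": u0, u1 and the final assign are all active at once, and a change on x, y or z simply ripples through the nets.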

Keep in mind that the normal use of Verilog is modeling/describing circuits. When you apply power, all the circuits start to run, so you need to write your reset logic to get each piece into a stable, usable operating state. Typically you'll include a reset line and do your initialization in response to that.

Verilog has initial blocks, which are kind of like main() in C. These are lists of statements that are scheduled to run from time 0. Verilog can have multiple initial blocks, though, and they are executed concurrently.
always blocks will also work like main() if they have no sensitivity list, with the difference that they loop forever:
always begin // no sensitivity list
s = 4;
#10; // delay statements, or sim will infinite loop
s = 8;
#10;
end
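To make the point about several initial blocks concrete, here is a small testbench-style sketch. The module name tb, the signals clk and rst_n, and the delay values are assumptions chosen for illustration only.

```systemverilog
module tb;
  reg clk;
  reg rst_n;

  // First initial block: starts at time 0 and drives a reset pulse.
  initial begin
    rst_n = 1'b0;
    #25 rst_n = 1'b1;
  end

  // Second initial block: also starts at time 0, concurrently with the first.
  initial begin
    clk = 1'b0;
    #100 $finish;   // stop the simulation; the always block below never ends
  end

  // always with no sensitivity list: loops forever, so it needs a delay.
  always #5 clk = ~clk;
endmodule
```

Neither initial block "calls" the other; each is its own process, and each runs through its statements exactly once.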

Related

Execution order of initial and always blocks in Verilog

I'm new to Verilog programming and would like to know how the Verilog program is executed. Does all initial and always block execution begin at time t = 0, or does initial block execution begin at time t = 0 and all always blocks begin after initial block execution? I examined the Verilog program's abstract syntax tree, and all initial and always blocks begin at the same hierarchical level. Thank you very much.
All initial and all always blocks throughout your design create concurrent processes that start at time 0. The ordering is indeterminate as far as the LRM is concerned, although it may be repeatable for debug purposes when executing the same version of the same simulation tool. In other words, never rely on the simulation ordering to make your code execute properly.
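A minimal sketch of the kind of code that depends on this ordering, and should therefore be avoided, might look like the following. The module name race_demo and the variable a are hypothetical.

```systemverilog
module race_demo;
  reg a;

  // Two processes write the same variable at time 0.
  initial a = 1'b0;
  initial a = 1'b1;

  // The value seen here depends on which of the two blocks above the
  // simulator happened to run last; the language does not define it.
  initial #1 $display("a = %b", a);
endmodule
```

A given simulator will probably print the same value on every run, but another tool (or another version) is free to print the other one.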
Verilog requires event-driven simulation. As such, the order of execution of all 'always' blocks and 'assign' statements depends on the flow of those events. A signal updated in one block will cause execution of all other blocks that depend on that signal.
The difference between always blocks and initial blocks is that the latter are executed unconditionally at time 0 and usually produce some initial events, such as clock generation and/or scheduling of reset signals. So, in a sense, initial blocks are executed first, before other blocks react to the events which are produced by them.
But, there is no execution order across multiple initial blocks or across initial blocks and always blocks which were forced into execution by other initial blocks.
In addition, there are other ways to generate events besides initial blocks.
In practice, nobody cares, and you shouldn't either.
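On actual hardware, the chip is very unstable immediately after power-up because of the transient states of the power supply circuit, so its initial states are untrustworthy.
The usual way to ensure a known initial state in practice is to set it in the reset branch of a clocked block, for example:
always @(posedge clk or negedge n_reset) begin  // clk is an assumed clock name
    if (~n_reset) begin
        initial_state <= 0;   // known state while reset is asserted
    end else begin
        do_something();       // placeholder for normal operation
    end
end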

Behaviour of Blocking Assignments inside Tasks called from within always_ff blocks

Have looked for an answer to this question online everywhere but I haven't managed to find an answer yet.
I've got a SystemVerilog project at the moment where I've implemented a circular buffer in a separate module from the main module. The queue module itself has a synchronous portion that acquires data from a set of signals, but it also has a combinatorial section that responds to an input. Now when I want to query the state of this queue in my main module, a task inside an always_ff block sets the input using a blocking assignment, and then the next statement reads the output and acts on that.
An example would look something like this in almost SystemVerilog:
module foo(clk, ...)
    queue = queue(clk, ...)
    always_ff @(posedge clk)
    begin
        check_queue(...)
    end
    task check_queue();
    begin
        query_in = 3;
        if (query_out == 5)
        begin
            <<THINGS HAPPEN>>
        end
    end
    endtask
endmodule
module queue(clk, query_in, query_out)
    always_comb
    begin
        query_out = query_in + 2;
    end
endmodule
My question essentially comes down to: does this idea work? In my head, because the queue is combinatorial, it should respond as soon as the input stimulus is applied, so it should be fine. But because it's within a task within an always_ff block, I'm a bit concerned about the use of blocking assignments.
Can someone help? If you need more information then let me know and I can give some clarifications.
This creates a race condition and most likely will not work. It has nothing to do with your use of a task. You are trying to read the value of a signal (query_out) that is being assigned in another concurrent process. Whether it gets updated or not by the time you get to the if statement is a race. Use a non-blocking assignment for all variables that go outside of the always_ff block, and it guarantees you get the previous value.
To figure this out, you can just mentally inline the task inside the always_ff. BTW, it really looks like a function in your case. Now, remember that execution of any always block must finish before any other is executed. So, the following will never evaluate to '5' at the same clock edge:
query_in = 3;
if (query_out == 5)
query_out will become 5 after this block (your task) is evaluated and will be ready at the next clock edge only. So, you are supposed to get a one cycle delay.
You need to split it into several always blocks.
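One possible way to split it up is sketched below. It reuses query_in, query_out and the "+ 2" logic from the question; the module name foo_split, the 4-bit widths and the output hit are assumptions made only for this illustration.

```systemverilog
module foo_split (
  input  logic clk,
  output logic hit
);
  logic [3:0] query_in;
  logic [3:0] query_out;

  // Stand-in for the combinational part of the queue from the question.
  always_comb query_out = query_in + 4'd2;

  // Drive the query with a non-blocking assignment.
  always_ff @(posedge clk) query_in <= 4'd3;

  // Sample the response in a separate clocked block. At each edge this
  // sees the query_out that settled after the previous edge, which is
  // the one-cycle delay described above.
  always_ff @(posedge clk) hit <= (query_out == 4'd5);
endmodule
```

With this structure no process reads a signal in the same evaluation in which it expects another process to have updated it, so the result no longer depends on simulator ordering.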

How does SystemVerilog `force` work?

I have a hierarchy of modules where I am trying to use force to get different values at different module interfaces. I am working on a component whose task is to inject transactions into a module down the hierarchy, bypassing the drivers in the modules higher up in the hierarchy. I thought I could use force on the control signals in order to disengage the drivers in the higher-up modules and start driving into the module of interest. So I have been trying to see how force will work. The full code is at http://www.edaplayground.com/x/69PB.
In particular, I am trying to understand effect of these two statements within initial block:
force u_DataReceiveTop.u_DataReceiveWrap.DataReceiveIfWrp_inst.valid = 1'b0;
force u_DataReceiveTop.valid = 1'b1;
what I expected the values to be is:
u_DataReceiveTop.u_DataReceiveWrap.DataReceiveIfWrp_inst.valid == 0
u_DataReceiveTop.valid == 1
but I see from waves:
u_DataReceiveTop.u_DataReceiveWrap.DataReceiveIfWrp_inst.valid == 1
u_DataReceiveTop.valid == 1
It is as if the second force statement force u_DataReceiveTop.valid = 1'b1; has propagated down the hierarchy even though there is another force. What is happening here?
A wire in Verilog is a network of drivers and receivers all connected to the same signal. The value of that signal is some resolution function of all the drivers and the type of the wire. When you connect two wires through a port, the two wires get collapsed into a single signal, but you still have two different names for the same signal.
When you use the force statement on a wire, that overrides all the drivers on the network until encountering another force or release statement. In your example, the second force statement replaces the first force. It doesn't matter which hierarchical reference you use in the force because they all refer to the same signal.
If you want the behavior you are expecting, you need to use variables instead of wires. When you connect a variable to a port, SystemVerilog creates an implicit continuous assignment, depending on the direction of the port. SystemVerilog does not allow more than one continuous assignment to a variable, which is why you can't use variables with an inout port. So you will need to be more careful about the port directions then.
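A stripped-down sketch for experimenting with the wire case is shown here. The module names child and top and the instance name u_child are hypothetical stand-ins for the much deeper hierarchy in the question.

```systemverilog
// Hypothetical child with a single wire input port.
module child (input wire valid);
endmodule

module top;
  wire valid;

  // Connecting a wire to a wire port: per the explanation above, the two
  // names are expected to end up as one collapsed signal.
  child u_child (.valid(valid));

  initial begin
    force u_child.valid = 1'b0;
    force valid         = 1'b1;   // second force on what should be the same net
    #1 $display("top.valid=%b u_child.valid=%b", valid, u_child.valid);
  end
endmodule
```

Running this in your simulator and comparing the two printed values is a quick way to check whether both hierarchical names really track the same signal.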

Is it ok to have multiple threads writing the same values to the same variables?

I understand about race conditions and how with multiple threads accessing the same variable, updates made by one can be ignored and overwritten by others, but what if each thread is writing the same value (not different values) to the same variable; can even this cause problems? Could this code:
GlobalVar.property = 11;
(assuming that property will never be assigned anything other than 11), cause problems if multiple threads execute it at the same time?
The problem comes when you read that state back, and do something about it. Writing is a red herring - it is true that as long as this is a single word most environments guarantee the write will be atomic, but that doesn't mean that a larger piece of code that includes this fragment is thread-safe. Firstly, presumably your global variable contained a different value to begin with - otherwise if you know it's always the same, why is it a variable? Second, presumably you eventually read this value back again?
The issue is that presumably, you are writing to this bit of shared state for a reason - to signal that something has occurred? This is where it falls down: when you have no locking constructs, there is no implied order of memory accesses at all. It's hard to point to what's wrong here because your example doesn't actually contain the use of the variable, so here's a trivialish example in neutral C-like syntax:
int x = 0, y = 0;
//thread A does:
x = 1;
y = 2;
if (y == 2)
print(x);
//thread B does, at the same time:
if (y == 2)
print(x);
Thread A will always print 1, but it's completely valid for thread B to print 0. The order of operations in thread A is only required to be observable from code executing in thread A - thread B is allowed to see any combination of the state. The writes to x and y may not actually happen in order.
This can happen even on single-processor systems, where most people do not expect this kind of reordering - your compiler may reorder it for you. On SMP even if the compiler doesn't reorder things, the memory writes may be reordered between the caches of the separate processors.
If that doesn't seem to answer it for you, include more detail of your example in the question. Without the use of the variable it's impossible to definitively say whether such a usage is safe or not.
It depends on the work actually done by that statement. There can still be some cases where Something Bad happens - for example, if a C++ class has overloaded the = operator, and does anything nontrivial within that statement.
I have accidentally written code that did something like this with POD types (builtin primitive types), and it worked fine -- however, it's definitely not good practice, and I'm not confident that it's dependable.
Why not just lock the memory around this variable when you use it? In fact, if you somehow "know" this is the only write statement that can occur at some point in your code, why not just use the value 11 directly, instead of writing it to a shared variable?
(edit: I guess it's better to use a constant name instead of the magic number 11 directly in the code, btw.)
If you're using this to figure out when at least one thread has reached this statement, you could use a semaphore that starts at 1, and is decremented by the first thread that hits it.
I would expect the result to be undetermined; that is, it would vary from compiler to compiler, language to language, OS to OS, etc. So no, it is not safe.
Why would you want to do this, though? Adding a mutex lock is only one or two lines of code (in most languages), and would remove any possibility of a problem. If this is going to be too expensive, then you need to find an alternate way of solving the problem.
In general, this is not considered a safe thing to do unless your system provides for atomic operations (operations that are guaranteed to execute indivisibly).
The reason is that while the "C" statement looks simple, often there are a number of underlying assembly operations taking place.
Depending on your OS, there are a few things you could do:
Take a mutual exclusion semaphore (mutex) to protect access
In some OSes, you can temporarily disable preemption, which guarantees your thread will not be swapped out.
Some OS provide a writer or reader semaphore which is more performant than a plain old mutex.
Here's my take on the question.
You have two or more threads running that write to a variable, like a status flag or something, where you only want to know if one or more of them was true. Then in another part of the code (after the threads complete) you want to check and see if at least one thread set that status, for example:
bool flag = false
threadContainer tc
threadInputs inputs
check(input)
{
    ...do stuff to input
    if(success)
        flag = true
}
start multiple threads
foreach(i in inputs)
    t = startthread(check, i)
    tc.add(t) // Keep track of all the threads started
foreach(t in tc)
    t.join( ) // Wait until each thread is done
if(flag)
    print "One of the threads were successful"
else
    print "None of the threads were successful"
I believe the above code would be OK, assuming you're fine with not knowing which thread set the status to true, and you can wait for all the multi-threaded stuff to finish before reading that flag. I could be wrong though.
If the operation is atomic, you should be able to get by just fine. But I wouldn't do that in practice. It is better just to acquire a lock on the object and write the value.
Assuming that property will never be assigned anything other than 11, then I don't see a reason for assignment in the first place. Just make it a constant then.
Assignment only makes sense when you intend to change the value, unless the act of assignment itself has other side effects, like volatile writes having memory visibility side effects in Java. And if you change state shared between multiple threads, then you need to synchronize or otherwise "handle" the problem of concurrency.
When you assign a value, without proper synchronization, to some state shared between multiple threads, then there are no guarantees for when the other threads will see that change. And no visibility guarantees means that it is possible that the other threads will never see the assignment.
Compilers, JITs, CPU caches: they're all trying to make your code run as fast as possible, and if you don't make any explicit requirements for memory visibility, then they will take advantage of that. If not on your machine, then on somebody else's.

Verilog automatic task

What does it mean if a task is declared with the automatic keyword in Verilog?
task automatic do_things;
input [31:0] number_of_things;
reg [31:0] tmp_thing;
begin
// ...
end
endtask
Note: This question is mostly because I'm curious if there are any hardware programmers on the site. :)
"automatic" does in fact mean "re-entrant". The term itself is borrowed from software languages: for example, C has the "auto" keyword for declaring variables as being allocated on the stack when the scope they are in is executed, and deallocated afterwards, so that multiple invocations of the same scope do not see persistent values of that variable. The reason you may not have heard of this keyword in C is that it is the default storage class for local variables :-) The alternative is "static", which means "allocate this variable statically (to a single global location in memory), and refer to this same memory location throughout the execution of the program, regardless of how many times the function is invoked". There is also "volatile" (strictly a type qualifier rather than a storage class), which means "this is a register elsewhere on my SoC or something on another device which I have no control over; compiler, please don't optimize reads to me away, even when you think you know my value from previous reads with no intermediate writes in the code".
"automatic" is intended for recursive functions, but also for running the same function in different threads of execution concurrently. For instance, if you "fork" off N different blocks (using Verilog's fork...join statement), and have them all call the same function at the same time, the same problems arise as with a function calling itself recursively.
In many cases, your code will be just fine without declaring the task or function as "automatic", but it's good practice to put it in there unless you specifically need it to be otherwise.
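The fork/join situation described above can be sketched roughly as follows. The task name report_after, its arguments and the delay values are invented for this example.

```systemverilog
module auto_demo;
  // Hypothetical task. With "automatic", id, delay and seen get fresh
  // storage for every call, so concurrent calls do not overwrite each other.
  task automatic report_after(input int id, input int delay);
    int seen;
    seen = id;
    #delay;
    $display("t=%0t id=%0d seen=%0d", $time, id, seen);
  endtask

  initial begin
    // Both calls are active at the same time.
    fork
      report_after(1, 10);
      report_after(2, 5);
    join
  end
endmodule
```

If the automatic keyword is removed, both calls share one copy of id, delay and seen, so the call that finishes later can report values written by the other call.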
It means that the task is re-entrant - items declared within the task are dynamically allocated rather than shared between different invocations of the task.
You see - some of us do Verilog... (ugh)
The "automatic" keyword also allows you to write recursive functions (since Verilog-2001). I believe they should be synthesisable if they bottom out, but I'm not sure whether they have tool support.
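A small simulation-only sketch of such a recursive function is given here, written in SystemVerilog syntax. The module name recursion_demo and the function name factorial are made up for illustration, and nothing is claimed about synthesis support.

```systemverilog
module recursion_demo;
  // "automatic" gives every level of the recursion its own copy of n.
  function automatic int factorial(input int n);
    if (n <= 1)
      return 1;                     // the recursion bottoms out here
    return n * factorial(n - 1);
  endfunction

  initial $display("factorial(5) = %0d", factorial(5));   // prints factorial(5) = 120
endmodule
```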
I too, do verilog!
As Will and Marty say, the automatic was intended for recursive functions.
If a normal (i.e. not automatic) function is called with different values and processed by the simulator in the same time slice, the returned value is indeterminate. That can be quite a tricky bug to spot! This is only a simulation issue, when synthesised the logic will be correct.
Making the function automatic fixes this.
In computing, a computer program or subroutine is called re-entrant if multiple invocations can safely run concurrently (Wikipedia).
In simple words, the keyword automatic makes it safe for multiple instances of a task to run at the same time.
:D
Automatic is just the opposite of static in usual programming, and the same is true in Verilog. Think of static variables: they are initialized only once, not on every pass. See the SystemVerilog fragment below:
for (int i = 0; i < 3; i++) begin
    static int f = 0;
    f = f + 1;
end
The result of the above fragment is f = 3. Now see the fragment below:
for (int i = 0; i < 3; i++) begin
    automatic int f = 0;
    f = f + 1;
end
The result of the above fragment is f = 1, because f is re-initialized on every iteration. What makes the difference is the static keyword versus the automatic keyword (in a static context such as a module, SystemVerilog expects the lifetime to be written explicitly when the declaration has an initializer).
The conclusion is that tasks in Verilog should usually be automatic, because they are invoked many times. If they are static (the default when not declared explicitly), they can pick up the result of a previous call, which is often not what we want.

Resources