Execution in Verilog: sequentially or concurrently

I am new to Verilog and am finding its execution model tricky. How does execution occur in a Verilog program? Say I have 2 modules and a testbench:
module module1(clock, A, B, C);
  input A, B, clock;
  output C;
  assign C = A + B;
endmodule

module module2(clock, A, B, C);
  input A, B, clock;
  output C;
  assign C = A - B;
endmodule

module testbench;
  reg A, B, clock;
  wire C;
  module1 m1(clock, A, B, C);
  module2 m2(clock, A, B, C);

  initial
    clock = 1'b0;
  always
    #4 clock = ~clock;
endmodule
I understand that all initial blocks start at time 0. But are the statements inside an initial block then executed sequentially or concurrently, i.e. if an initial block has more than one statement, do they execute one after another? Also, how does module execution take place? Will module1 start first because it appears before module2 in the testbench, finish completely, and only then module2 start, or do both run concurrently? And what happens when the clock changes after #4: will a module that is running stop in the middle when the clock changes, or will it complete its previous execution and then start again with the new clock?

In Verilog, instantiating a module means adding physical hardware to your board.
Modules are nothing but small hardware blocks that work concurrently. Every module can contain procedural blocks, continuous assignment statements, or both.
All procedural blocks execute concurrently, and the same applies to continuous assignment statements.
Here I use the terms as follows:
Procedural blocks: initial, always etc. blocks.
Continuous assignment: assign, force etc.
So, no matter in what sequence you instantiate modules, all are going to work in parallel.
Here comes the concept of the simulation time step. Each time step contains active, inactive and NBA event regions (the Verilog LRM has a figure of these regions).
For each time step, all the instances are evaluated in every region. If there is work to be done in, say, module1, it is done, and in parallel the other module, say module2, is also evaluated. If there is some dependency between the modules, they are evaluated again.
Here, in your example, C is a single wire driven as the output of both modules. This creates a race/contention between the modules, which is of course not good.
Think from a hardware perspective: two or more different hardware blocks can have the same inputs but cannot drive the same output, so the output wires must be different.
module testbench;
  reg A, B, clock;
  wire C1, C2; // different wires
  module1 m1(clock, A, B, C1);
  module2 m2(clock, A, B, C2);

  initial clock = 1'b0;
  always #4 clock = ~clock;
endmodule
Also, the modules here contain only continuous assignments, so the clock has no effect on them. The modules are "running" in between the clock edges as well; it's just that no events are scheduled in those time steps.
As we now know, all procedural blocks are executed in parallel, but the contents of a procedural block are executed sequentially. To make the contents concurrent, the fork..join construct is used. For example:
initial begin
  a <= 0;
  #5;
  b <= 1; // b is assigned at 5ns
end

initial fork
  a <= 0;
  #5;
  b <= 1; // b is assigned at 0ns
join
Refer to Verilog Procedural Blocks, Concurrent and Sequential Statements sites for further information.

Another way to think about this from a simulation point of view
All of the initial, always, and continuous assign statements in your design execute concurrently starting at time 0. It doesn't matter whether they are in different modules or not - they are all equally concurrent. The elaboration step flattens out all of your module instances. All that is left are hierarchical names for things that were inside those modules.
Now, unless you are running the simulation on massively parallel CPUs (essentially what running on the real synthesized hardware does), there is no way to actually run all of these processes concurrently. A software simulator has to choose one process to go first, and you just can't rely on which one it chooses.
That is what the Verilog scheduling algorithm does. It puts everything scheduled to run at time 0 into an event queue (the active queue) and starts executing each process one at a time. It executes each process until it finishes or has to block, waiting for some delay or for a signal to change. If the process has to block, it gets suspended and put onto another queue. Then the next process in the current queue starts executing, and these steps keep repeating until the current queue is empty.
Then the scheduling algorithm picks another queue to become the active queue, and advances time if that queue is scheduled at some later delay.
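As a rough illustration of that suspend/resume behaviour, here is a minimal sketch of my own (not from the question): each initial block below runs until it hits its delay, is suspended, and is resumed when simulation time reaches that point; the order of the two time-0 prints is tool-dependent.
module scheduler_demo;
  initial begin
    $display("[%0t] P1 starts", $time);       // runs at time 0
    #10 $display("[%0t] P1 resumes", $time);  // suspended until time 10
  end
  initial begin
    $display("[%0t] P2 starts", $time);       // also runs at time 0; order vs P1 is indeterminate
    #5 $display("[%0t] P2 resumes", $time);   // suspended until time 5
  end
endmodule
Running this, both "starts" messages appear at time 0 (in either order), P2 resumes at time 5, and P1 resumes at time 10, which is exactly the queue-and-advance-time behaviour described above.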

Related

Execution order of initial and always blocks in Verilog

I'm new to Verilog programming and would like to know how the Verilog program is executed. Does all initial and always block execution begin at time t = 0, or does initial block execution begin at time t = 0 and all always blocks begin after initial block execution? I examined the Verilog program's abstract syntax tree, and all initial and always blocks begin at the same hierarchical level. Thank you very much.
All initial and all always blocks throughout your design create concurrent processes that start at time 0. The ordering is indeterminate as far as the LRM is concerned, but it may be repeatable for debug purposes when executing the same version of the same simulation tool. In other words, never rely on the simulation ordering to make your code execute properly.
Verilog requires event-driven simulation. As such, the order of execution of all 'always' blocks and 'assign' statements depends on the flow of those events. A signal updated in one block will cause execution of all other blocks which depend on that signal.
The difference between always blocks and initial blocks is that the latter are executed unconditionally at time 0 and usually produce some initial events, like clock generation and/or scheduling of reset signals. So, in a sense, initial blocks are executed first, before other blocks react to the events they produce.
But there is no defined execution order across multiple initial blocks, or between initial blocks and the always blocks which were triggered into execution by other initial blocks.
In addition, there are other ways to generate events besides initial blocks.
In practice, nobody cares, and you shouldn't either.
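To make this concrete, here is a small sketch of my own (the names are not from the question): whether the always block observes the time-0 change on rst_n depends on whether it reached its event control before the initial block ran, which is exactly the ordering you must not rely on.
module init_vs_always;
  reg rst_n;
  initial begin
    rst_n = 1'b0;      // produces the first event at time 0 (x -> 0)
    #10 rst_n = 1'b1;  // and a second event at time 10
  end
  always @(rst_n)
    $display("[%0t] rst_n changed to %b", $time, rst_n);
endmodule
The time-10 change is always reported; the time-0 change is reported only if the always block happened to be waiting at @(rst_n) before the initial block executed its first assignment.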
On actual hardware, the chip immediately after power-up is very unstable because of the transient states of the power supply circuit, hence its initial state is untrustworthy.
The method to ensure a known initial state in practice is to set it under a reset condition, as in:
always @(posedge clk) begin  // or whatever event the block is sensitive to
  if (~n_reset)
    initial_state <= 0;
  else
    do_something();
end

Understanding Verilog Code with two Clocks

I am pretty new to Verilog and I use it to verify some code from a simulation program.
Right now I am struggling with a Verilog code snippet, because the simulation program uses 2 clocks (one system clock and a PLL-derived clock) where two hardware components work together and thus synchronize each other:
module something (input data);
  reg vid;

  always @(posedge sys_clk)
    vid <= data;

  always @(posedge pll_clk)
    if (vid)
      // do something
endmodule
When reading about non-blocking assignments, it says the update of the left-hand side is postponed until the other evaluations in the current time step are completed.
Intuitively I took this to mean they are updated at the end of the time step. So if data changes from 0 to 1 during sys_clk tick "A", then at the end of "A" / the beginning of the next sys_clk tick this value is in vid, and only after "A" can the second always block (on pll_clk) read vid = 1.
Is this how it works, or did I miss something?
Thank you :)
In this particular case it means that:
If posedge sys_clk and posedge pll_clk happen simultaneously, then vid will not have a chance to update before it gets used in the pll_clk block. So, if vid was '0' before the clock edges (and is updated to '1' in the first block), it will still be '0' in the if statement of the second block. This ordering is guaranteed by the use of the non-blocking assignment in the first block.
If the posedges do not happen at the same time, then the value of vid will be updated at the posedge of sys_clk and picked up later at the following posedge of pll_clk.
In simulation, a non-blocking assignment guarantees that the update itself happens after all the blocks have been evaluated in the current time step. It has nothing to do with the next clock cycle. However, the latter is often used in tutorials to illustrate a particular single-clock situation, which creates confusion.
Also, being simultaneous is a simulation abstraction, meaning that both edges happen in the same simulation time step (or within a certain small time interval in hardware).
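A minimal sketch of the simultaneous-edge case (the clock generation here is my own; only the signal names follow the question): both clocks rise in the same time step, and because of the non-blocking assignment the pll_clk block samples the old value of vid on that edge.
module nba_two_clocks;
  reg sys_clk = 0, pll_clk = 0;
  reg data = 1, vid = 0;

  always #5 sys_clk = ~sys_clk;
  always #5 pll_clk = ~pll_clk;   // its edges coincide with sys_clk in simulation

  always @(posedge sys_clk)
    vid <= data;                  // update is deferred to the NBA region

  always @(posedge pll_clk)
    $display("[%0t] vid seen as %b", $time, vid);  // prints 0 on the first edge, 1 afterwards

  initial #40 $finish;
endmodule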

Determinism in Verilog: event controls

Consider the below example:
module test;
  reg a;

  initial begin
    a = 1'b1;
  end

  initial begin
    wait(a) $display("wait(a): %b", a);
    $display("wait(a)");
  end

  initial begin
    #a $display("#a: %b", a);
    $display("#a");
  end
endmodule
When I run it, I always get this output:
wait(a): 1
wait(a)
Which confuses me. My understanding of the code is like this:
1. The default value for a is x.
2. All initial blocks start in parallel at time 0.
3. The first initial block does a blocking assignment while, at the same time, the wait and #a event controls read a.
4. Since there is no determinism in the sequencing among the three, the blocking assignment a=1'b1 may be executed before or after the wait(a) or #a.
5. If the blocking assignment is executed before wait(a) and #a, no output should be displayed, since wait(a) and #a will not detect any change in a.
6. If, however, the blocking assignment a=1'b1 is executed after the read of a in wait(a) and #a, which both will read as x, then the output from both should be displayed after the blocking assignment completes.
But, as I pointed out above, the output I see is always the output from wait(a). Can someone please explain to me:
What is going on and the defect in my understanding?
And more generally, and outside of the example above:
What exactly happens when the simulator encounters a wait(a) and #a?
wait(a) and #a detect level and edge changes (in the example, level and edge change are identical). When we say "change" in this case, does it mean a change after the last read of the variables involved in the event controls (in this example a)?
You are correct up until point 5.
If a=1'b1 gets executed before the wait(a), then the wait(a) has no effect: it does not suspend the process. In the reverse order, the wait(a) suspends the process and resumes it after the assignment a=1'b1. In your example, you always see the output from the second initial block regardless of the order.
But the ordering is very important for the third initial block. The #a must execute before any change to a, otherwise it suspends until it sees another change. Although you should not rely on it, most tools execute initial blocks in source-code order. But optimizations, and initial blocks spread across different modules, mean the ordering can never be guaranteed.

Behaviour of Blocking Assignments inside Tasks called from within always_ff blocks

Have looked for an answer to this question online everywhere but I haven't managed to find an answer yet.
I've got a SystemVerilog project at the moment where I've implemented a circular buffer in a module separate from the main module. The queue module itself has a synchronous portion that acquires data from a set of signals, but it also has a combinational section that responds to an input. Now, when I want to query the state of this queue from my main module, a task inside an always_ff block sets the input using a blocking assignment, and the next statement reads the output and acts on it.
An example would look something like this in almost SystemVerilog:
module foo(clk, ...);

  queue q(clk, ...);

  always_ff @(posedge clk)
  begin
    check_queue(...);
  end

  task check_queue();
  begin
    query_in = 3;
    if (query_out == 5)
    begin
      <<THINGS HAPPEN>>
    end
  end
  endtask
endmodule

module queue(clk, query_in, query_out);
  always_comb
  begin
    query_out = query_in + 2;
  end
endmodule
My question essentially comes down to: does this idea work? In my head, because the queue is combinational it should respond as soon as the input stimulus is applied, so it should be fine; but because it's within a task within an always_ff block, I'm a bit concerned about the use of blocking assignments.
Can someone help? If you need more information then let me know and I can give some clarifications.
This creates a race condition and most likely will not work. It has nothing to do with your use of a task. You are trying to read the value of a signal (query_out) that is being assigned in another concurrent process. Whether or not it gets updated by the time you get to the if statement is a race. Use a non-blocking assignment for all variables that go outside of the always_ff block, and it guarantees you get the previous value.
In order to figure this out, you can just mentally inline the task inside the always_ff (by the way, it really looks like a function in your case). Now, remember that the execution of any always block must finish before any other is executed. So, the following will never evaluate to '5' at the same clock edge:
query_in = 3;
if (query_out == 5)
query_out will become 5 after this block (your task) has been evaluated, and will only be ready at the next clock edge. So you should expect a one-cycle delay.
You need to split it into several always blocks.
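A rough sketch of that split, reusing the queue module from the question (the port widths, the instance name and the placeholder flag things_happen are my assumptions): query_in is driven with a non-blocking assignment in one clocked block and query_out is sampled in another, accepting the one-cycle delay.
module foo(input logic clk);
  logic [3:0] query_in;
  logic [3:0] query_out;
  logic       things_happen;

  queue q(.clk(clk), .query_in(query_in), .query_out(query_out));

  always_ff @(posedge clk)
    query_in <= 4'd3;              // non-blocking: the queue sees it after this edge

  always_ff @(posedge clk)
    if (query_out == 4'd5)
      things_happen <= 1'b1;       // <<THINGS HAPPEN>>, one cycle after query_in was set
endmodule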

SystemVerilog : fork - join and writing parallel testbenches

I am following the testbench example at this link:
http://www.verificationguide.com/p/systemverilog-testbench-example-00.html
I have two questions regarding fork-join statements. The test environment has the following tasks for initiating the test:
task test();
  fork
    gen.main();
    driv.main();
  join_any
endtask

task post_test();
  wait(gen.ended.triggered);
  wait(gen.repeat_count == driv.no_transactions);
endtask

task run;
  pre_test();
  test();
  post_test();
  $finish;
endtask
My first question is: why do we wait for the generator event to be triggered in the post_test() task? Why not instead use a regular fork-join, which, as far as I understand, will wait for both threads to finish before continuing?
I read another Stack Overflow question (System Verilog fork join - Not actually parallel?) that said these threads are not actually executed in parallel in the CPU sense, but only in the simulation sense.
My second question is: what is the point of fork-join if the threads are not actually executed in parallel? There would be no performance benefit, so why not follow a sequential algorithm like:
while true:
Create new input
Feed input to module
Check output
To me this seems much simpler than the testbench example.
Thanks for your help!
Without having the code for gen and driv it is difficult to say. However, most likely driv and gen communicate with each other in some manner, i.e. gen produces data which driv consumes and drives onto something else.
If gen and driv were written in a generate-input/consume-input fashion, then your loop would make sense. However, most likely they generate and consume data based on some events and cannot easily be split into such functions. Something like the following is usually much cleaner:
gen:
  while() begin
    wait(some event);
    generateData;
    prepareForTheNextEvent;
  end

driv:
  while() begin
    wait(gen ready);
    driveData;
  end
So, for the above reason, you cannot run them sequentially; they must run in parallel. For all programming purposes they are running in parallel. In more detail, they run in the same single simulator thread, but Verilog schedules their execution based on events generated in the simulation. So, you need fork.
As for the join_any: I think the test in your case is supposed to finish when either of the threads is done. However, the driver also has to finish all outstanding jobs before it can exit; that is why there are those wait statements in the post_test() task.
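A self-contained sketch of that pattern (gen_main, driv_main and the counters are my own stand-ins, not the verificationguide code): the generator and driver must run as parallel threads because the driver blocks on events from the generator; join_any returns as soon as the finite generator thread is done, and a wait then lets the driver drain its outstanding work, much like post_test().
module tb;
  event gen_ready;
  int   data;
  int   generated = 0;
  int   driven    = 0;

  task automatic gen_main();
    repeat (4) begin
      #10 data = $urandom_range(15);  // produce a new value every 10 time units
      generated++;
      -> gen_ready;                   // signal the driver
    end
  endtask

  task automatic driv_main();
    forever begin
      @(gen_ready);                   // block until the generator fires
      $display("[%0t] driving %0d", $time, data);
      driven++;
    end
  endtask

  initial begin
    fork
      gen_main();
      driv_main();
    join_any                          // unblocks when gen_main (the finite thread) finishes
    wait (driven == generated);       // like post_test: let the driver finish outstanding work
    $finish;
  end
endmodule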
