Trouble understanding simulation/module behavior - verilog

I implemented a very simple counter with preset functionality (code reproduced below).
module counter
#(
parameter mod = 4
) (
input wire clk,
input wire rst,
input wire pst,
input wire en,
input wire [mod - 1:0] data,
output reg [mod - 1:0] out,
output reg rco
);
parameter max = (2 ** mod) - 1;
always @* begin
if(out == max) begin
rco = 1;
end else begin
rco = 0;
end
end
always @(posedge clk) begin
if(rst) begin
out <= 0;
end else if(pst) begin
out <= data;
end else if(en) begin
out <= out + 1;
end else begin
out <= out;
end
end
endmodule
I am having trouble understanding the following simulation result. With pst asserted and data set to 7 on a rising clock edge, the counter's out is set to data, as expected (first image below. out is the last signal, data is the signal just above, and above that is pst.). On the next rising edge, I kept preset asserted and set data to 0. However, out does not follow data this time. What is the cause of this behavior?
My thoughts
On the rising clock edge where I set data to 0, I notice that out stays at 7, and doesn't increment to 8. So I believe that the counter is presetting, but with the value 7, not 0. If I move the data transition from 7 to 0 up in time, out gets set to 0 as expected (image below). Am I encountering a race condition?
Testbenches
My initial testbench code that produced the first image is reproduced below. I show the changes I made to get coherent results as comments.
parameter mod = 4;
// ...
reg pst;
reg [mod - 1:0] data;
// ...
@(posedge clk); // ==> @(negedge clk)
data = 7;
pst = 1;
@(posedge clk); // ==> @(negedge clk)
data = 0;
pst = 1;
@(posedge clk); // ==> @(negedge clk)
pst = 0;
@(posedge clk);
// ...

You have a race condition in your test bench. The Verilog scheduler is allowed to evaluate any @ triggered in the same time step in any order it chooses. All code after the triggered @ executes until it hits another time-blocking statement. In your waveform it looks like data and pst from the test bench are sometimes assigned before the design samples them and sometimes after.
The solution is simple, use non-blocking assignments (<=). Refer to What is the difference between = and <= in Verilog?
@(posedge clk);
data <= 7;
pst <= 1;
@(posedge clk);
data <= 0;
pst <= 1;
@(posedge clk);
pst <= 0;
@(posedge clk);

I am able to obtain correct, predictable behavior if I modify my testbench to only modify input signals to my counter on falling clock edges rather than on rising clock edges (as it should be anyways). My best guess as to why the above behavior was occurring is that changing input signals at the same time the counter module is programmed to sample its inputs leads to undefined simulator behavior.
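The falling-edge approach can be sketched as a complete testbench. This is only an illustration: the module name counter_tb, the 10-unit clock period, and the $display calls are assumptions, and a compact copy of the counter (without rco) is included so the example stands on its own.

```verilog
// Compact copy of the counter from the question (rco omitted for brevity).
module counter #(parameter mod = 4) (
  input  wire           clk, rst, pst, en,
  input  wire [mod-1:0] data,
  output reg  [mod-1:0] out
);
  always @(posedge clk) begin
    if (rst)      out <= 0;
    else if (pst) out <= data;
    else if (en)  out <= out + 1;
  end
endmodule

// Hypothetical testbench: inputs change only on falling edges,
// half a period away from the edge on which the counter samples them.
module counter_tb;
  localparam mod = 4;
  reg            clk = 0, rst = 0, pst = 0, en = 0;
  reg  [mod-1:0] data = 0;
  wire [mod-1:0] out;

  counter #(.mod(mod)) dut (.clk(clk), .rst(rst), .pst(pst), .en(en),
                            .data(data), .out(out));

  always #5 clk = ~clk;  // assumed 10 time-unit period

  initial begin
    @(negedge clk); data = 7; pst = 1;
    @(negedge clk); $display("out = %0d", out); data = 0;  // after first preset
    @(negedge clk); $display("out = %0d", out); pst = 0;   // after second preset
    @(negedge clk); $finish;
  end
endmodule
```

Because every stimulus change lands between two rising edges, the result no longer depends on the order in which the simulator schedules the testbench and the design within one time step.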

Related

How does verilog treat input values to if statements in always_ff blocks

I'm currently working on a pipelined MIPS cpu using Icarus Verilog and have come across some very strange behaviour when using an if statement within an always_ff loop. I'm currently testing this implementation of a PC block:
module PC (
input logic clk,
input logic rst,
input logic[31:0] PC_JVal,
input logic jump_en,
input logic branch_en,
input logic PC_Stall,
output logic [31:0] PC_Out,
output logic fetch_stall,
output logic active,
output logic [2:0] check
);
// Active is completely dependent on the value of the PC.
// JUMP_EN --> PC = JVAL
// BRANCH_EN --> PC = PC + JVAL
// PC_Stall --> PC = PC
reg [31:0] PC;
logic [31:0] branchSignExt = (PC_JVal[15] == 1) ? {16'hFFFF, PC_JVal[15:0]} : {16'h0000, PC_JVal[15:0]};
logic start;
assign fetch_stall = PC_Stall;
assign active = (PC != 0) ? 1 : 0;
assign PC_Out = (active == 0) ? 0 : ( (PC_Stall == 1) ? PC + 4 : ( (jump_en == 1) ? PC_JVal : ( (branch_en == 1) ? PC + branchSignExt : PC + 4 ) ) );
initial begin
PC = 0;
start = 0;
check = 0;
end
always_ff @ (posedge clk) begin
check[1] <= ~check[1];
if (rst) begin
start <= 1;
end
else if (active) begin
if (PC_Stall) begin
PC <= PC;
check[0] <= ~check[0];
end
else if (jump_en) begin
PC <= PC_JVal;
end
else if (branch_en) begin
PC <= PC + branchSignExt;
end
else begin
PC <= PC + 4;
end
end
end
always_ff @ (negedge rst) begin
if (start) begin
PC <= 32'hBFBFFFFC;
start <= 0;
end
end
endmodule
And am running the following testbench:
module PC_TB ();
logic clk;
logic rst;
logic[31:0] PC_JVal;
logic jump_en;
logic branch_en;
logic PC_Stall;
logic [31:0] PC_Out;
logic fetch_stall;
logic active;
logic [2:0] check;
initial begin
$dumpfile("PC_TB.vcd");
$dumpvars(0, PC_TB);
clk = 0;
jump_en = 0;
PC_Stall = 0;
branch_en = 0;
rst = 0;
repeat(100) begin
#50; clk = ~clk;
end
$fatal(1, "Timeout");
end
initial begin
@ (posedge clk);
@ (posedge clk);
@ (posedge clk);
rst = 1;
@ (posedge clk);
@ (posedge clk);
@ (posedge clk);
rst = 0;
@ (posedge clk);
@ (posedge clk);
@ (posedge clk);
PC_Stall = 1;
@ (posedge clk);
PC_Stall = 0;
@ (posedge clk);
@ (posedge clk);
end
PC PC(.clk(clk), .rst(rst), .PC_JVal(PC_JVal), .jump_en(jump_en), .branch_en(branch_en), .PC_Stall(PC_Stall), .PC_Out(PC_Out), .fetch_stall(fetch_stall), .active(active), .check(check));
endmodule
The issue I'm having is that how the if statement checking for PC_Stall is evaluated seems to alternate between clock cycles and I have no clue why.
I get the following VCD output when running it with the test bench as is (not the desired output), with the PC Stall not really happening (the PC value should remain for 2 cycles, but here it is only for one.)
Stall lasts 1 Cycle
Then, just shifting the point at which PC_Stall is asserted forward by one cycle results in the stall lasting 3 cycles, even though it's only asserted for 1.
Stall lasts 3 cycles
I've been really stuck on this and genuinely have no idea what is wrong, and I would appreciate the help.
iverilog does not have very good support for SystemVerilog features yet. If you compile your code on other simulators, such as VCS on edaplayground, you will get compile errors. For example:
Error-[ICPD] Illegal combination of drivers
Illegal combination of procedural drivers
Variable "check" is driven by an invalid combination of procedural drivers.
Variables written on left-hand of "always_ff" cannot be written to by any
other processes, including other "always_ff" processes.
This variable is declared at : logic [2:0] check;
The first driver is at : always_ff @(posedge clk) begin
check[1] <= (~check[1]);
...
The second driver is at : check = 0;
You must fix all such errors.
Note, several simulators are available on edaplayground if you sign up for a free account.
So it appears to be a compiler issue in how conditionals are treated when the inputs to those conditionals change on the same positive clock edge on which the conditionals are evaluated.
I fixed the issue by adding a small delay just before the conditional, presumably to give the values time to update. I am not sure why this works, and it seems like quite a botched solution, but it does work.
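The VCS message quoted above says that a variable written in an always_ff block may not be written by any other process. A minimal sketch of a PC register that respects this is shown below. It is a simplified, hypothetical module (pc_reg, stall, and jump_target are illustrative names, and branch handling is left out), not a drop-in replacement for the PC block in the question. The reset value is the one used in the question.

```verilog
// Hypothetical simplified PC register: a single always_ff is the only writer
// of pc, and reset is synchronous, so no initial block or second always_ff
// is needed.
module pc_reg (
  input  logic        clk,
  input  logic        rst,
  input  logic        stall,
  input  logic        jump_en,
  input  logic [31:0] jump_target,
  output logic [31:0] pc
);
  always_ff @(posedge clk) begin
    if (rst)          pc <= 32'hBFBFFFFC;   // reset value taken from the question
    else if (stall)   pc <= pc;             // hold while stalled
    else if (jump_en) pc <= jump_target;
    else              pc <= pc + 32'd4;
  end
endmodule
```

If the testbench also drives stall and jump_en with non-blocking assignments (or on the falling clock edge), the design samples stable values at each rising edge and no extra delay should be needed.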

Why I can not copy a content of register to another one in "always" block in Verilog?

well, I have this code, that is working perfectly:
module syncRX(clk, signal, detect);
input clk, signal;
output reg [7:0] detect = 0;
reg [7:0] delay = 0;
//wire clk_1khz;
freq_div div(.clk(clk), .clk_1khz(clk_1khz));
always @(posedge signal)
begin
detect <= detect + 1;
delay <= 0;
end
always @(posedge clk_1khz)
begin
delay <= delay + 1;
end
endmodule // top
module freq_div(input clk, output reg clk_1khz);
reg [12:0] count = 0;
always @(posedge clk)
begin
if(count == 6000)
begin
clk_1khz <= ~clk_1khz;
count <= 0;
end
else
count <= count + 1;
end
endmodule
The problem appears when I change the line "detect <= detect + 1;" to "detect <= delay;".
The intention is to calculate the period of the signal, but I get this warning message from Icestorm:
Warning: No clocks found in design
And the FPGA stops working...
Does anyone have an idea what is going wrong?
Thanks to all!
Judging by the votes, this question was not well received, perhaps because the community considers the topic already documented, but I still cannot find a solution. I have made some improvements and will try again here. I now have this code, which synthesizes without problems:
module syncRX(clk, signal, detect);
input clk, signal;
output [7:0] detect;
reg [7:0] detect_aux = 8'b0;
reg rst;
assign detect = detect_aux & ~rst;
freq_div div(.clk(clk), .clk_1khz(clk_1khz));
always @(posedge signal)
rst <= 1;
always @(posedge clk_1khz)
detect_aux <= detect_aux + 1;
endmodule // top
module freq_div(input clk, output reg clk_1khz);
reg [12:0] count = 0;
always @(posedge clk)
begin
if(count == 6000)
begin
clk_1khz <= ~clk_1khz;
count <= 0;
end
else
count <= count + 1;
end
endmodule
The problem is that
reg rst;
assign detect = detect_aux & ~rst;
seems to do nothing. Any suggestions?
Thanks
The problem is that delay is multiply driven (driving from multiple always blocks is not allowed in synthesis) which is undefined behaviour (in this case I believe the constant '0' will be used). It should also be at least a warning.
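One way to avoid the multiple drivers is to keep everything in the clk domain and write each register from exactly one always block. The sketch below is an assumption-laden illustration, not the original design: the module name, the two-flop synchronizer, and the tick input (assumed to be a one-clk-wide enable pulse at the 1 kHz rate, replacing the derived clk_1khz clock) are all hypothetical.

```verilog
// Hypothetical single-driver rewrite: delay and detect are each written
// by one always block only.
module syncRX_single_driver (
  input  wire       clk,
  input  wire       signal,   // asynchronous input to be measured
  input  wire       tick,     // assumed one-cycle enable pulse at 1 kHz
  output reg  [7:0] detect = 8'd0
);
  reg [1:0] sync  = 2'b00;    // two-flop synchronizer for signal
  reg       prev  = 1'b0;     // synchronized value one cycle earlier
  reg [7:0] delay = 8'd0;     // ticks counted since the last rising edge

  wire rising = sync[1] & ~prev;

  always @(posedge clk) begin
    sync <= {sync[0], signal};
    prev <= sync[1];
    if (rising) begin
      detect <= delay;        // latch the measured period
      delay  <= 8'd0;         // restart the measurement
    end else if (tick) begin
      delay <= delay + 8'd1;
    end
  end
endmodule
```

Here signal is treated as data sampled by clk rather than as a clock, which also keeps the design in a single clock domain.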

why does my output signal have 2 clock cycles delay?

For each bit in a 32-bit vector, capture when the input signal changes from 1 in one clock cycle to 0 the next. "Capture" means that the output will remain 1 until the register is reset (synchronous reset).
Each output bit behaves like a SR flip-flop: The output bit should be set (to 1) the cycle after a 1 to 0 transition occurs. The output bit should be reset (to 0) at the positive clock edge when reset is high. If both of the above events occur at the same time, reset has precedence. In the last 4 cycles of the example waveform below, the 'reset' event occurs one cycle earlier than the 'set' event, so there is no conflict here.
In the example waveform below, reset, in1 and out1 are shown again separately for clarity.
my code:
module top_module (
input clk,
input reset,
input [31:0] in,
output [31:0] out );
integer i;
reg [31:0] in_del;
reg [31:0] out_del;
always @ (posedge clk)
begin
in_del<=in;
out_del<=~in & in_del;
if (reset)
out=0;
else
begin
for (i=0; i<32;i=i+1) begin
if (out_del[i])
out[i]=out_del[i];
end
end
end
endmodule
my output
First about your code.
it cannot be compiled. The out must be a reg in order to be assignable within the always block.
using non-blocking assignment in out_del <= in & in_del will cause a one-cycle delay for the if (out_del) comparison. Non-blocking assignments schedule lhs assignment after the block gets evaluated. The rule of thumb is to always use blocking assignments for intermediate signals in the sequential block.
because of the above and because of the in & in_del, this cannot be synthesized, or at least it cannot be synthesized correctly.
you violate industry practices by using the blocking assignment on the out signal. The rule of thumb is to always use non-blocking assignments for the outputs of the sequential blocks.
the code just does not work :-(
If I understood your requirement correctly the following code does it:
module top_module (
input clk,
input reset,
input [31:0] in,
output reg [31:0] out );
reg lock;
always @ (posedge clk)
begin
if (reset) begin
lock <= 0;
out <= 0;
end
else if (lock == 0)
begin
out <= in;
lock <= 1;
end
end
endmodule
Just use the lock signal to allow updates. And yes, here is a simple test bench to check it:
module real_top();
reg clk, reset;
reg [31:0] in;
reg [31:0] out;
top_module tm(clk, reset, in, out);
initial begin
clk = 0;
forever #5 clk = ~clk;
end
integer i;
initial begin
in = 0;
reset = 1;
#7 reset = 0;
for (i = 1; i < 5; i++) begin
#10 in = i;
#10 reset = 1;
#10 reset = 0;
end
$finish;
end
initial
$monitor(clk, reset, in, out);
endmodule
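For comparison, the per-bit behavior described in the question (set an output bit the cycle after a 1 to 0 transition, hold it until a synchronous reset) can also be sketched with non-blocking assignments only. This is a different technique from the lock-based code above, and the module name capture_falling is hypothetical.

```verilog
// Hypothetical sketch of per-bit falling-edge capture with synchronous reset.
module capture_falling (
  input             clk,
  input             reset,
  input      [31:0] in,
  output reg [31:0] out
);
  reg [31:0] in_prev;  // value of in at the previous clock edge

  always @(posedge clk) begin
    in_prev <= in;
    if (reset)
      out <= 32'b0;                      // reset has precedence
    else
      out <= out | (in_prev & ~in);      // set bits that fell, keep bits already set
  end
endmodule
```

Because in_prev & ~in is computed directly from the registered previous input, there is no intermediate registered stage, so the output should change one cycle after the transition rather than two.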

Verilog: wait for module logic evaluation in an always block

I want to use the output of another module inside an always block.
Currently the only way to make this code work is by adding #1 after the pi_in assignment so that enough time has passed to allow Pi to finish.
Relevant part from module pLayer.v:
Pi pi(pi_in,pi_out);
always @(*)
begin
for(i=0; i<constants.nSBox; i++) begin
for(j=0; j<8; j++) begin
x = (state_value[(constants.nSBox-1)-i]>>j) & 1'b1;
pi_in = 8*i+j;#1; /* wait for pi to finish */
PermutedBitNo = pi_out;
y = PermutedBitNo>>3;
tmp[(constants.nSBox-1)-y] ^= x<<(PermutedBitNo-8*y);
end
end
state_out = tmp;
end
Module Pi.v
`include "constants.v"
module Pi(in, out);
input [31:0] in;
output [31:0] out;
reg [31:0] out;
always @* begin
if (in != constants.nBits-1) begin
out = (in*constants.nBits/4)%(constants.nBits-1);
end else begin
out = constants.nBits-1;
end
end
endmodule
Delays should not be used in the final implementation, so is there another way without using #1?
In essence I want PermutedBitNo = pi_out to be evaluated only after the Pi module has finished its job with pi_in (=8*i+j) as input.
How can I block this line until Pi has finished?
Do I have to use a clock? If that's the case, please give me a hint.
update:
Based on Krouitch suggestions i modified my modules. Here is the updated version:
From pLayer.v:
Pi pi(.clk (clk),
.rst (rst),
.in (pi_in),
.out (pi_out));
counter c_i (clk, rst, stp_i, lmt_i, i);
counter c_j (clk, rst, stp_j, lmt_j, j);
always @(posedge clk)
begin
if (rst) begin
state_out = 0;
end else begin
if (c_j.count == lmt_j) begin
stp_i = 1;
end else begin
stp_i = 0;
end
// here, the logic starts
x = (state_value[(constants.nSBox-1)-i]>>j) & 1'b1;
pi_in = 8*i+j;
PermutedBitNo = pi_out;
y = PermutedBitNo>>3;
tmp[(constants.nSBox-1)-y] ^= x<<(PermutedBitNo-8*y);
// at end
if (i == lmt_i-1)
if (j == lmt_j) begin
state_out = tmp;
end
end
end
endmodule
module counter(
input wire clk,
input wire rst,
input wire stp,
input wire [32:0] lmt,
output reg [32:0] count
);
always@(posedge clk or posedge rst)
if(rst)
count <= 0;
else if (count >= lmt)
count <= 0;
else if (stp)
count <= count + 1;
endmodule
From Pi.v:
always @* begin
if (rst == 1'b1) begin
out_comb = 0;
end
if (in != constants.nBits-1) begin
out_comb = (in*constants.nBits/4)%(constants.nBits-1);
end else begin
out_comb = constants.nBits-1;
end
end
always@(posedge clk) begin
if (rst)
out <= 0;
else
out <= out_comb;
end
That's a nice piece of software you have here...
The fact that this language describes hardware is not helping then.
In Verilog, what you write will simulate in zero time. It means that your loop on i and j will be completely done in zero time too. That is why you see something when you force the loop to wait for 1 time unit with #1.
So yes, you have to use a clock.
For your system to work you will have to implement counters for i and j, as I see things.
A counter with reset can be written like this:
`define SIZE 10
module counter(
input wire clk,
input wire rst_n,
output reg [`SIZE-1:0] count
);
always@(posedge clk or negedge rst_n)
if(~rst_n)
count <= `SIZE'd0;
else
count <= count + `SIZE'd1;
endmodule
You specify that you want to sample pi_out only when pi_in is processed.
In a digital design it means that you want to wait one clock cycle between the moment when you are sending pi_in and the moment when you are reading pi_out.
The best solution, in my opinion, is to make your pi module sequential and then consider pi_out as a register.
To do that I would do the following:
module Pi(clk, in, out);
input clk;
input [31:0] in;
output [31:0] out;
reg [31:0] out;
reg [31:0] out_comb; // assigned inside an always block, so it must be a reg
always @* begin
if (in != constants.nBits-1) begin
out_comb = (in*constants.nBits/4)%(constants.nBits-1);
end else begin
out_comb = constants.nBits-1;
end
end
always@(posedge clk)
out <= out_comb;
endmodule
Quickly if you use counters for i and j and this last pi module this is what will happen:
at a new clock cycle, i and j will change --> pi_in will change accordingly at the same time(in simulation)
at the next clock cycle out_comb will be stored in out and then you will have the new value of pi_out one clock cycle later than pi_in
EDIT
First of all, when writing (synchronous) processes, I would advise you to deal only with 1 register by process. It will make your code clearer and easier to understand/debug.
Another tip would be to separate combinatorial circuitry from sequential. It will also make your code clearer and more understandable.
If I take the example of the counter I wrote previously it would look like :
`define SIZE 10
module counter(
input wire clk,
input wire rst_n,
output reg [`SIZE-1:0] count
);
//Two way to do the combinatorial function
//First one
wire [`SIZE-1:0] count_next;
assign count_next = count + `SIZE'd1;
//Second one
reg [`SIZE-1:0] count_next;
always@*
count_next = count + `SIZE'd1;
always@(posedge clk or negedge rst_n)
if(~rst_n)
count <= `SIZE'd0;
else
count <= count_next;
endmodule
Here I see why you have one more cycle than expected: it is because you put the combinatorial circuitry that controls your pi module in your synchronous process. It means that the following will happen:
first clk positive edge i and j will be evaluated
next cycle, the pi_in is evaluated
next cycle, pi_out is captured
So it makes sense that it takes 2 cycles.
To correct that, you should take the 'logic' part out of the synchronous process. As you stated in your comments, it is logic, so it should not be in the synchronous process.
Hope it helps
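A different route from the clocked design discussed above, offered only as a sketch: if the design can stay purely combinational, the arithmetic of Pi can be written as a Verilog function, which returns its result within the same always block and so needs neither #1 nor a clock. The localparam NBITS = 64 is an assumption standing in for constants.nBits, and the names pi_function_demo and pi_perm are hypothetical.

```verilog
// Hypothetical demo: the Pi arithmetic as a function instead of a module.
module pi_function_demo;
  localparam NBITS = 64;  // assumption; stands in for constants.nBits

  // Same expression as in Pi.v; a function call is evaluated immediately,
  // so the caller can use the result on the very next line.
  function [31:0] pi_perm;
    input [31:0] in;
    begin
      if (in != NBITS - 1)
        pi_perm = (in * NBITS / 4) % (NBITS - 1);
      else
        pi_perm = NBITS - 1;
    end
  endfunction

  integer k;
  initial begin
    for (k = 0; k < 4; k = k + 1)
      $display("pi_perm(%0d) = %0d", k, pi_perm(k));
  end
endmodule
```

Inside the loop in pLayer.v, this would correspond to writing PermutedBitNo = pi_perm(8*i+j); in place of the pi_in assignment, the #1, and the read of pi_out.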

4 bit counter using verilog not incrementing

Sir,
I have built a 4-bit up counter in Verilog, but it does not increment during simulation. A frequency divider circuit is used to provide the necessary clock to the counter. Please help me solve this. The code is given below.
module my_upcount(
input clk,
input clr,
output [3:0] y
);
reg [26:0] temp1;
wire clk_1;
always @(posedge clk or posedge clr)
begin
temp1 <= ( (clr) ? 4'b0 : temp1 + 1'b1 );
end
assign clk_1 = temp1[26];
reg [3:0] temp;
always @(posedge clk_1 or posedge clr)
begin
temp <= ( (clr) ? 4'b0 : temp + 1'b1 );
end
assign y = temp;
endmodule
Did you run your simulation for at least (2^27) / 2 + 1 iterations? If not then your clk_1 signal will never rise to 1, and your counter will never increment. Try using 4 bits for the divisor counter so you won't have to run the simulation for so long. Also, the clk_1 signal should activate when divisor counter reaches its max value, not when the MSB bit is one.
Apart from that, there are couple of other issues with your code:
Drive all registers with a single clock - Using different clocks within a single hardware module is a very bad idea as it violates the principles of synchronous design. All registers should be driven by the same clock signal otherwise you're looking for trouble.
Separate current and next register value - It is a good practice to separate current register value from the next register value. The next register value will then be assigned in a combinational portion of the circuit (not driven by the clock) and stored in the register on the beginning of the next clock cycle (check code below for example). This makes the code much more clear and understandable and minimises the probability of race conditions and unwanted inferred memory.
Define all signals at the beginning of the module - All signals should be defined at the beginning of the module. This helps to keep the module logic as clean as possible.
Here's your example rewritten according to my suggestions:
module my_counter
(
input wire clk, clr,
output [3:0] y
);
reg [3:0] dvsr_reg, counter_reg;
wire [3:0] dvsr_next, counter_next;
wire dvsr_tick;
always @(posedge clk, posedge clr)
if (clr)
begin
counter_reg <= 4'b0000;
dvsr_reg <= 4'b0000;
end
else
begin
counter_reg <= counter_next;
dvsr_reg <= dvsr_next;
end
/// Combinational next-state logic
assign dvsr_next = dvsr_reg + 4'b0001;
assign counter_next = (dvsr_reg == 4'b1111) ? counter_reg + 4'b0001 : counter_reg;
/// Set the output signals
assign y = counter_reg;
endmodule
And here's the simple testbench to verify its operation:
module my_counter_tb;
localparam
T = 20;
reg clk, clr;
wire [3:0] y;
my_counter uut(.clk(clk), .clr(clr), .y(y));
always
begin
clk = 1'b1;
#(T/2);
clk = 1'b0;
#(T/2);
end
initial
begin
clr = 1'b1;
@(negedge clk);
clr = 1'b0;
repeat(50) @(negedge clk);
$stop;
end
endmodule

Resources