I am currently developing a Verilog-based testbench model for a DUT.
I have experience with SystemVerilog testbenches and verification IPs, but this is my first time developing a pure Verilog TB.
I have completed the basic blocks for running the simulation, and it is working as expected.
But I am stuck at implementing the functional coverage (which I want to do in the Sample Monitor block). I have extracted the functional coverage from the specifications, but how do I implement it in Verilog code?
SystemVerilog has the support below for functional coverage (example code to show the syntax):
covergroup example_group @ (posedge en);
parity : coverpoint par {
bins even = {0};
bins odd = {1};
}
endgroup
Is there a way to implement functional coverage as bins, points and groups (as in SystemVerilog) to track overall functional coverage in Verilog-based code?
Edit :
I understand that there is no alternative syntax for coverage in Verilog, and I don't want to complicate things and spend more time implementing coverage counters. Also, I can't convert my Verilog TB to SystemVerilog due to some internal agreement issues.
Yes, the covergroup is equivalent to this Verilog code:
// bin counters, declared at module level so they can be initialized in plain Verilog
integer even = 0;
integer odd = 0;
always @(posedge en) begin : example_group
if (par == 0) even = even + 1;
if (par == 1) odd = odd + 1;
end
But the really time-consuming part is writing the code that collects all these counters, merges the data from different tests, and generates the reports. It seems silly to reinvent this. Most tools give you this capability in SystemVerilog.
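A minimal sketch of how the counters plus a very basic end-of-test report might look in plain Verilog. The module name `parity_cov`, the `report` task and the "one hit means covered" rule are illustrative assumptions, not a standard API, and merging results across tests would still need extra scripting outside the simulator.

```verilog
// Hypothetical coverage monitor; all names here are illustrative only.
module parity_cov (input wire en, input wire par);
  integer even = 0;  // hits for bin "even" (par == 0)
  integer odd  = 0;  // hits for bin "odd"  (par == 1)

  // Sample on the same event as the covergroup: posedge en.
  always @(posedge en) begin
    if (par === 1'b0) even = even + 1;
    if (par === 1'b1) odd  = odd + 1;
  end

  // Print a simple report; a bin counts as covered after one hit.
  task report;
    integer covered;
    begin
      covered = 0;
      if (even > 0) covered = covered + 1;
      if (odd  > 0) covered = covered + 1;
      $display("parity: even=%0d odd=%0d coverage=%0d%%",
               even, odd, covered * 100 / 2);
    end
  endtask
endmodule
```

Assuming the monitor is instantiated in the testbench as `parity_cov cov (.en(en), .par(par));`, the report could be printed with `cov.report;` just before `$finish`.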
I'm creating a sudoku game in Verilog (2001) to eventually be put onto an FPGA. I found code for it in Java and have been trying to convert it, but I have run into some errors.
Here's the link for the Java code:
www.geeksforgeeks.org/program-sudoku-generator
I have very little Verilog experience and am learning as I go.
task automatic removeKDigits()
reg count = K;
while (count != 0)
begin
integer cellId = randomGenerator(N*N-1);
// System.out.println(cellId);
// extract coordinates i and j
i = (cellId/N);
j = cellId%9;
// System.out.println(i+" "+j);
if (mat[i][j] != 0)
begin
count = count-1;
mat[i][j] = 0;
end
else
count=count;
end
endtask
K is the number of digits to be removed from the mat[i][j] board, and N=9 since it's a 9x9 sudoku board. For the lines containing "count=count-1" and "count=count" I'm getting the error
syntax error, unexpected '=', expecting IDENTIFIER
What does it mean? How do I fix it?
Unfortunately, it's unlikely you'll be able to port Java code to synthesizable Verilog code without at least a decent knowledge of the principles behind RTLs (register transfer languages).
Programming languages like Java are high-level descriptions of some logic that get converted into machine instructions and run on a processor. They operate sequentially, one line at a time, in a particular order.
RTLs on the other hand, describe actual hardware. They tend to operate in parallel, on a trigger, typically a clock. Instead of 'variables', you tend to work with 'registers' representing actual flip flops, and the Verilog programme will describe the transfer of data between these registers.
As for the actual issues with your code, it's impossible to point out the errors, because it simply isn't Verilog.
I recommend this answer: https://stackoverflow.com/a/5121853/10719567, for a more eloquent description of the differences between programming languages and RTLs, and why it's not that easy to port between the two.
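For simulation only, the task could be restated in legal Verilog-2001 roughly as below. This is a sketch under assumptions: the value of `K`, the placeholder board contents and the use of `$random` in place of `randomGenerator` are all introduced here, and the result is not synthesizable. The syntax changes are the ones most likely behind the reported error: the task header ends with a semicolon, and all declarations sit at the top of the task without initializers.

```verilog
// Simulation-only sketch; not synthesizable as written.
module sudoku_remove_sketch;
  parameter N = 9;   // board is N x N
  parameter K = 20;  // hypothetical number of digits to remove

  reg [3:0] mat [0:N-1][0:N-1];
  integer r, c;

  // Declarations come first, without initializers; statements follow
  // inside a single begin/end.
  task automatic removeKDigits;
    integer count;
    integer cellId;
    integer i, j;
    begin
      count = K;
      while (count != 0) begin
        cellId = {$random} % (N*N);  // {} makes the value unsigned: 0..N*N-1
        i = cellId / N;
        j = cellId % N;
        if (mat[i][j] != 0) begin
          count = count - 1;
          mat[i][j] = 0;
        end
      end
    end
  endtask

  initial begin
    // Fill the board with a nonzero placeholder so the loop can terminate.
    for (r = 0; r < N; r = r + 1)
      for (c = 0; c < N; c = c + 1)
        mat[r][c] = 4'd1;
    removeKDigits;
    $display("removed %0d digits", K);
  end
endmodule
```

The loop only terminates if the board holds at least K nonzero cells, which is one more reason this style suits a testbench rather than hardware.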
I have written the code below for a simple multiplication of two n-bit numbers (here n=16). It simulates with the desired output waveform, but the problem is that it does not synthesize in Vivado 17.2, even though I have written a static 'for loop' (i.e., the loop iteration count is constant). I am getting the error below.
[Synth 8-3380] loop condition does not converge after 2000 iterations
Note: I have written
for(i=i;i<n;i=i+1)
instead of
for(i=0;i<n;i=i+1)
because the latter one was executing once again after i reached n. So that is not a mistake. Could someone please help? Thank you for your time.
//unsigned integer multiplier
module multiplication(product,multiplier,multiplicand,clk,rset);
parameter n = 16;
output reg [(n<<1)-1:0]product;
input [n-1:0]multiplier, multiplicand;
input clk,rset;
reg [n:0]i='d0;
always @( posedge clk or posedge rset)
begin
if (rset) product <= 'd0;
else
begin
for(i=i;i<n;i=i+1)
begin
product =(multiplier[i] == 1'b1)? product + ( multiplicand << i ): product;
$display("product =%d,i=%d",product,i);
end
end
end
endmodule
First of all, it is not good practice to use for or while loops if you really want to implement your design on an FPGA (Vivado is optimized for implementing designs on FPGAs). Even if you can successfully synthesize your design, you might face timing problems or unexpected bugs.
I think you can find your answer here.
Edit: I just wanted to point out that controlling timing is generally very important in hardware design, especially when you want to integrate your design with another system, and loops can be a nightmare for that.
Go back to using your original for (i=0; ...) loop.
Your error is that you assume i=0 because of reg [n:0]i='d0; That is only true the very first time, i.e. only once, at the start of the simulation.
because the latter one was executing once again after i reached n.
Yes, the loop will repeat again and again for every clock cycle. That is what @( posedge clk ...) does.
More errors:
You are using blocking assignment in the clock section, use non-blocking:
product <=(multiplier[i] == 1'b1)? product + ( multiplicand << i ): product;
Your product is only correct the first time after a reset (when it starts at zero). The second clock cycle after a reset you do the multiplication again but start with the previous value of product.
Your i is a bit big: you use 17 bits to count to 16. Also, global loop variables have pitfalls. I suggest you use the SystemVerilog syntax: for (int i=0; ....)
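The points above could be combined roughly as in the sketch below. The temporary `acc` and the module name are introduced here for illustration. Note one assumption that goes slightly beyond the list: a non-blocking assignment placed directly inside the loop would keep only the last iteration's update, so this version accumulates in a temporary with blocking assignments and writes the register once with a non-blocking assignment.

```verilog
// Sketch only: `acc` and the module name are introduced for this example.
module multiplication_sketch #(parameter n = 16) (
  output reg  [2*n-1:0] product,
  input  wire [n-1:0]   multiplier,
  input  wire [n-1:0]   multiplicand,
  input  wire           clk,
  input  wire           rset
);
  reg [2*n-1:0] acc;  // temporary, never read outside this block
  integer i;

  always @(posedge clk or posedge rset) begin
    if (rset)
      product <= 0;
    else begin
      acc = 0;  // restart from zero on every clock, not from the old product
      for (i = 0; i < n; i = i + 1)
        if (multiplier[i])
          acc = acc + (multiplicand << i);
      product <= acc;  // single non-blocking update of the register
    end
  end
endmodule
```

The loop starts at `i = 0` on every clock edge, so the result no longer depends on the value `i` happened to hold after the previous edge.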
The error message is not about the iterations of your loop, but about the iterations of the internal algorithm of your synthesis program. It tells you that Vivado failed to create a circuit that could actually be implemented on your FPGA of choice at your clock speed of choice and which actually does what you're asking for.
Let me elaborate, after I mention two items of general import. First, don't use blocking assignments (=) inside always @(posedge clk) blocks. They almost never do what you want. Only non-blocking assignments (<=) should be clocked. Secondly, the correct way to synthesize a for loop is with generate, even though Vivado seems to accept the simple for.
You are building a fairly large block of combinatorial logic here. Remember that when you synthesize combinatorial logic you are asking for a circuit that can evaluate the expression that you've written within a single clock cycle. The actual expression taken into consideration is the one after the for loop has been unrolled. I.e., you are (conditionally) adding a 16-bit number to a 32-bit number 16 times, each time shifted one bit further to the left. Each of these additions has a carry bit. So each of these additions will actually have to look at all of the upper 16 bits of the result of the previous addition, and each will depend on all but the lowest bit of the previous addition. An adder for one bit plus carry needs about 5 gates. Each addition is conditional, which adds at least one more gate for each bit. So you are asking for a minimum of 16*16*6 = 1536 interdependent gates that all have to stabilize within a single clock cycle. Vivado is telling you that it cannot fulfill these requirements.
One way of mitigating this problem will be to synthesize for a lower clock frequency, where the gates have more time to stabilize, and you can thus build longer logic chains. Another option would be to pipeline the operation, say by only evaluating what corresponds to four iterations of your loop within a single clock cycle and then building the result over several clock cycles. This will add some bookkeeping logic to your code, but it is inevitable if one wants to evaluate complex expressions with finite resources at high clock frequencies. It would also introduce you to synchronous logic, which you will have to learn anyway if you want to do anything non-trivial with an FPGA. Note that this kind of pipelining wouldn't affect throughput significantly, because your FPGA would then be doing several multiplications in parallel.
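As a rough illustration of spreading the work over several clock cycles, the sketch below handles four multiplier bits per clock. It is the simpler iterative variant, not a true pipeline: one four-bit stage is reused for n/4 clocks, so it trades throughput for area, whereas a pipeline would replicate the stage. The `start`/`done` handshake, the module name and the requirement that n be a multiple of 4 are assumptions made for this example.

```verilog
// Sketch: multi-cycle shift-and-add multiplier, 4 multiplier bits per clock.
// Assumes n is a multiple of 4; start/done are hypothetical handshake ports.
module mult_iter #(parameter n = 16) (
  input  wire           clk,
  input  wire           rset,
  input  wire           start,         // pulse to begin a multiplication
  input  wire [n-1:0]   multiplier,
  input  wire [n-1:0]   multiplicand,
  output reg  [2*n-1:0] product,
  output reg            done           // high for one clock when product is valid
);
  reg [2*n-1:0] acc, mcand;  // running sum and shifted multiplicand
  reg [n-1:0]   mplier;      // multiplier bits still to be processed
  reg [7:0]     steps;       // 4-bit slices still to be processed
  reg [2*n-1:0] sum;         // temporary for this clock's partial sum
  reg           busy;
  integer       k;

  always @(posedge clk or posedge rset) begin
    if (rset) begin
      busy <= 1'b0; done <= 1'b0; product <= 0;
      acc <= 0; mcand <= 0; mplier <= 0; steps <= 0;
    end else begin
      done <= 1'b0;
      if (start && !busy) begin
        acc    <= 0;
        mcand  <= multiplicand;
        mplier <= multiplier;
        steps  <= n / 4;
        busy   <= 1'b1;
      end else if (busy) begin
        sum = acc;
        for (k = 0; k < 4; k = k + 1)   // only four additions per clock
          if (mplier[k]) sum = sum + (mcand << k);
        acc    <= sum;
        mcand  <= mcand << 4;
        mplier <= mplier >> 4;
        steps  <= steps - 1;
        if (steps == 1) begin           // last slice: publish the result
          product <= sum;
          done    <= 1'b1;
          busy    <= 1'b0;
        end
      end
    end
  end
endmodule
```

With n = 16 the longest adder chain per clock is four additions instead of sixteen, at the cost of four clocks per result.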
You may also be able to rewrite the expression to handle the carry bits and interdependencies in a smarter way that allows Vivado to find its way through the expression (there probably is such a way, I assume it synthesizes if you simply write the multiplication operator?).
Finally, many FPGAs come with dedicated multiplier units because multiplication is a common operation, but implementing it in logic gates wastes a lot of resources. As you found out.
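For comparison, a registered multiplier written with the plain `*` operator might look like the sketch below. Whether the tool maps it onto a dedicated multiplier block depends on the device and the synthesis settings, so treat that as an assumption to check in the synthesis report; the module name is illustrative.

```verilog
// Sketch: let the synthesis tool choose the multiplier implementation.
module mult_op #(parameter n = 16) (
  input  wire           clk,
  input  wire [n-1:0]   multiplier,
  input  wire [n-1:0]   multiplicand,
  output reg  [2*n-1:0] product
);
  // The 2n-bit left-hand side makes the operands extend to 2n bits,
  // so the full-width product is kept.
  always @(posedge clk)
    product <= multiplier * multiplicand;
endmodule
```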
Here is an example that uses nested conditional operators to map a register address to its value.
reg [4:0] mux;
reg [1:0] addr;
mux = (addr == 2'b00) ? i0 :
((addr == 2'b01) ? i1 :
((addr == 2'b10) ? i2 :
((addr == 2'b11) ? i3 :
4'bz)));
In my application, there are about one hundred registers, so the nesting level is very deep. If the expression were C code executed by a CPU, it would be very slow.
How about FPGA?
From my experience, it depends on the synthesizer as well as the options used. It can use your code as a functional guideline to generate equivalent logic. Or it can use it as a structural guideline, in which case each ?: conditional operator is mapped to a 2:1 mux. You can experiment with your synthesizer to figure out how it generates the gate-level equivalent, and read about the synthesis options in the manual.
Generally I use the ?: conditional operator when I intentionally want a 2:1 mux (or tri-state driver). For more complex conditional multiplexing, I prefer using case statements or if-else statements. This strategy usually fulfills timing and area requirements.
With large multiplexing (you mentioned "about one hundred registers"), meeting timing and area requirements can be difficult. Sometimes the synthesizer can handle this, other times it needs more guidance. Synthesizer directives (refer to manual) and splitting the multiplexer into chunks is one way to deal with it. Your FPGA may have a macro module of dedicated logic (RAMs, complex arithmetic logic, etc) you could instantiate to substitute for portions of your code.
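A minimal sketch of the same four-entry read mux written as a case statement, which describes a parallel selection rather than a chain of 2:1 muxes. The module name is illustrative, and the default value is an assumption (the question used a high-impedance default).

```verilog
// Sketch: address-to-register read mux as a case statement.
module reg_read_mux (
  input  wire [1:0] addr,
  input  wire [4:0] i0, i1, i2, i3,
  output reg  [4:0] mux
);
  always @* begin
    case (addr)
      2'b00:   mux = i0;
      2'b01:   mux = i1;
      2'b10:   mux = i2;
      2'b11:   mux = i3;
      default: mux = 5'bx;  // only reached in simulation when addr has X/Z bits
    endcase
  end
endmodule
```

For around one hundred registers the same pattern extends with a wider `addr`, and splitting it into smaller muxes that are selected in a second stage is one way to apply the chunking mentioned above.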
In this specific case of your example most synthesizers will interpret this as a series of 2-to-1 multiplexers.
In a more general case such as
output = (one_condition)? a : (another_condition)? b : (other_condition)? c : ...
It will use a multiplexer for each condition, extending the asynchronous (combinational) path. A longer path means a longer settling time and a lower maximum clock frequency.
I am trying to debug my code shown below. I am fairly new to SystemVerilog and hopefully I can learn from this. Let me know of any suggestions.
The errors I am receiving are:
Error-[ICPD] Invalid procedural driver combination
"divide.v", 2
Variable "Q" is driven by an invalid combination of procedural drivers.
Variables written on left-hand of "always_comb" cannot be written to by any
other processes, including other "always_comb" processes.
"divide.v", 2: logic [7:0] Q;
"divide.v", 8: always_comb begin
if (x <= R) begin
...
"divide.v", 5: Q = 8'b0;
Error-[ICPD] Invalid procedural driver combination
"divide.v", 2
Variable "R" is driven by an invalid combination of procedural drivers.
Variables written on left-hand of "always_comb" cannot be written to by any
other processes, including other "always_comb" processes.
"divide.v", 2: logic [7:0] R;
"divide.v", 8: always_comb begin
if (x <= R) begin
...
"divide.v",6: R = y;
My SystemVerilog Code is:
module divider(input logic [7:0] x,y,
output logic [7:0] Q,R);
initial
begin
Q = 8'd0;
R = y;
end
always_comb
begin
if (x<=R)
begin R <= R - x; Q <= Q + 8'd1; end
end
endmodule
module test1;
logic [7:0] x,y,Q,R;
divider Divider1 (x,y,Q,R);
initial
begin
x = 8'd2;
y = 8'd8;
end
endmodule
Generally, in Verilog/SystemVerilog you cannot assign to a variable from two parallel blocks (with some exceptions). You are assigning to R and Q from two places: the initial block and the always_comb block.
Although the initial block only runs once, it runs in parallel with the always_comb block at the beginning of the simulation, which is a violation of this rule.
Why don't you get rid of the initial block and do everything in always_comb?
always_comb
begin
Q = 8'd0; // set initial value of Q
R = y; // set initial value of R
.... //THE REST OF THE ALGORITHM
end
Also, your algorithm is missing a loop!
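One way the single-block version with a loop might look is sketched below. It is written with `always @*` so it also compiles as plain Verilog; `always_comb` would work the same way in SystemVerilog. The fixed bound of 255 iterations, the handling of `x == 0` and the module name are assumptions made for this sketch.

```verilog
// Sketch: combinational division of y by x using repeated subtraction.
module divider_sketch (
  input  wire [7:0] x,   // divisor
  input  wire [7:0] y,   // dividend
  output reg  [7:0] Q,   // quotient
  output reg  [7:0] R    // remainder
);
  integer k;
  always @* begin
    Q = 8'd0;  // initial value of Q
    R = y;     // initial value of R
    // At most 255 subtractions are ever needed for 8-bit operands.
    // Assumption: for x == 0 nothing is subtracted, so Q = 0 and R = y.
    for (k = 0; k < 255; k = k + 1)
      if (x != 8'd0 && x <= R) begin
        R = R - x;
        Q = Q + 8'd1;
      end
  end
endmodule

module test1;
  reg  [7:0] x, y;
  wire [7:0] Q, R;
  divider_sketch Divider1 (x, y, Q, R);
  initial begin
    x = 8'd2;
    y = 8'd8;
    #1 $display("Q=%0d R=%0d", Q, R);  // result of 8 / 2
  end
endmodule
```

Q and R are now driven from one block only, which removes the ICPD error. As hardware, the unrolled loop is a long chain of subtractors, so a clocked version that does one subtraction per cycle would usually be more practical.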
An important distinction between writing System Verilog (or any HDL) and writing in any software language (C/C++, Java, etc) is that System Verilog is designed to facilitate describing hardware structures while allowing for software-like testbenches, while software languages are designed to allow users to give instructions to an interpreter, VM or actual hardware. That being said, you need to think first about the hardware you are trying to describe and then write the associated HDL code. There are numerous posts describing the differences between HDLs and software languages (ex: Can Verilog/Systemverilog/VHDL be considered actor oriented programming languages?).
Looking at your code and the flow chart you were given, it appears you are trying to use SystemVerilog as a programming language rather than an HDL. For example, initial blocks are generally only used in testbenches and not in modules themselves. Also, as your algorithm is sequential, it is likely you will need a clock signal and registers, but your code lacks both.
Ideally, you should have a good idea of how to design digital hardware systems before trying to code any HDL, as this is the mentality you should have when using HDLs. The goal is generally to translate a hardware design into HDL, so understanding digital design will help clarify a lot.
I'm by no means a Verilog expert, and I was wondering if someone knew which of these ways to increment a value was better. Sorry if this is too simple a question.
Way A:
In a combinational logic block, probably in a state machine:
//some condition
count_next = count + 1;
And then somewhere in a sequential block:
count <= count_next;
Or Way B:
Combinational block:
//some condition
count_en = 1;
Sequential block:
if (count_en == 1)
count <= count + 1;
I have seen Way A more often. One potential benefit of Way B is that if you are incrementing the same variable in many places in your state machine, perhaps it would use only one adder instead of many; or is that false?
Which method is preferred and why? Do either have a significant drawback?
Thank you.
One potential benefit of Way B is that if you are incrementing the same variable in many places in your state machine, perhaps it would use only one adder instead of many; or is that false?
Any synthesis tool will attempt automatic resource sharing. How well they do so depends on the tool and code written. Here is a document that describes some features of Design Compiler. Notice that in some cases, less area means worse timing.
Which method is preferred and why? Do either have a significant drawback?
It depends. Verilog(for synthesis) is a means to implement some logic circuit but the spec does not specify exactly how this is done. Way A may be the same as Way B on an FPGA but Way A is not consistent with low power design on an ASIC due to the unconditional sequential assignment. Using reset nets is almost a requirement on an ASIC but since many FPGAs start in a known state, you can save quite a bit of resources by not having them.
I use Way A in my Verilog code. My sequential blocks have almost no logic in them; they just assign registers based on the values of the "wire regs" computed in the combinational always blocks. There is just less to go wrong this way. And with Verilog we need all the help we can get.
What is your definition of "better" ?
It can be better performance (faster maximum frequency of the synthesized circuit), smaller area (fewer logic gates), or faster simulation execution.
Let's consider the smaller-area case for Xilinx and Altera FPGAs. Registers in those FPGA families have an enable input. In your "Way B", *count_en* will be mapped directly onto that register enable input, which will result in fewer logic gates. Essentially, "Way B" gives the synthesis tool more "hints" on how to synthesize that circuit well. It's also possible that most FPGA synthesis tools (I'm talking about Xilinx XST, Altera MAP, Mentor Precision, and Synopsys Synplify) will correctly infer the register enable input from "Way A".
If *count_en* is synthesized as the register enable input, that will result in better performance of the circuit, because your counter increment logic will have fewer logic levels.
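For reference, the two styles written out as complete modules might look like the sketch below. The port names `rst` and `inc`, the 8-bit width and the module names are assumptions for illustration; `inc` stands in for "some condition" from the state machine.

```verilog
// Way A: the next value is computed combinationally and registered
// unconditionally.
module counter_a (
  input  wire       clk,
  input  wire       rst,
  input  wire       inc,        // stands in for "some condition"
  output reg  [7:0] count
);
  reg [7:0] count_next;
  always @* begin
    count_next = count;         // default: hold, which avoids a latch
    if (inc) count_next = count + 8'd1;
  end
  always @(posedge clk)
    if (rst) count <= 8'd0;
    else     count <= count_next;
endmodule

// Way B: the combinational block only produces an enable, and the
// increment lives in the sequential block.
module counter_b (
  input  wire       clk,
  input  wire       rst,
  input  wire       inc,
  output reg  [7:0] count
);
  reg count_en;
  always @* begin
    count_en = 1'b0;            // default: disabled, which avoids a latch
    if (inc) count_en = 1'b1;
  end
  always @(posedge clk)
    if (rst)           count <= 8'd0;
    else if (count_en) count <= count + 8'd1;
endmodule
```

Both modules behave identically in simulation; whether they also synthesize to the same netlist is tool dependent, as discussed above.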
Thanks