verilog synthesis not converging after 2000 iterations

I have written the code below for a simple multiplication of two n-bit numbers (here n = 16). It simulates with the desired output waveform, but it does not synthesize in Vivado 17.2, even though I have written a static 'for' loop (i.e., the number of loop iterations is constant). I am getting the error below.
[Synth 8-3380] loop condition does not converge after 2000 iterations
Note: I have written
for(i=i;i<n;i=i+1)
instead of
for(i=0;i<n;i=i+1)
because the latter one was executing once again after i reached n. So that is not a mistake. Could someone please help? Thank you for your time.
//unsigned integer multiplier
module multiplication(product,multiplier,multiplicand,clk,rset);
  parameter n = 16;
  output reg [(n<<1)-1:0]product;
  input [n-1:0]multiplier, multiplicand;
  input clk,rset;
  reg [n:0]i='d0;
  always @( posedge clk or posedge rset)
  begin
    if (rset) product <= 'd0;
    else
    begin
      for(i=i;i<n;i=i+1)
      begin
        product =(multiplier[i] == 1'b1)? product + ( multiplicand << i ): product;
        $display("product =%d,i=%d",product,i);
      end
    end
  end
endmodule

First of all, it is not good practice to use for or while loops if you really want to implement your design on an FPGA (Vivado is optimized for implementing designs on FPGAs). Even if you can synthesize your design successfully, you might face timing problems or unexpected bugs.
I think you can find your answer here.
Edit: I just wanted to point out that controlling timing is generally very important in hardware design, especially when you want to integrate your design with another system, and loops can be a nightmare for that.

Go back to using your original for(i=0;i<n;i=i+1) loop.
Your error is that you assume i=0 because of reg [n:0]i='d0; That is only true the very first time, i.e. only once, at the start of the simulation.
because the latter one was executing once again after i reached n.
Yes, the loop will repeat again and again for every clock cycle. That is what @( posedge clk ...) does.
More errors:
You are using a blocking assignment in the clocked section; use non-blocking:
product <=(multiplier[i] == 1'b1)? product + ( multiplicand << i ): product;
Your product is only correct the first time after a reset (when it starts at zero). On the second clock cycle after a reset you do the multiplication again, but start with the previous value of product.
Your i is a bit big: you use 17 bits to count to 16. Also, global loop variables have pitfalls. I suggest you use the SystemVerilog syntax: for (int i=0; ....)
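The points above can be combined into one possible rewrite. This is only a sketch, not the poster's code: the module name multiplication_sketch and the temporary acc are made up for illustration. Because a non-blocking assignment inside the loop would make every iteration read the same old value of product (only the last assignment would take effect), the sketch accumulates into a temporary with blocking assignments in a combinational block, starting from zero each time, and updates the register once per clock with a non-blocking assignment.

```verilog
// Sketch only: unsigned shift-and-add multiplier with a constant-bound loop.
// "multiplication_sketch" and "acc" are hypothetical names.
module multiplication_sketch #(parameter n = 16) (
    output reg  [2*n-1:0] product,
    input  wire [n-1:0]   multiplier,
    input  wire [n-1:0]   multiplicand,
    input  wire           clk,
    input  wire           rset
);
    integer i;             // loop index, always starts at 0
    reg [2*n-1:0] acc;     // combinational accumulator, not a flip-flop

    // Combinational part: the loop is unrolled into n conditional adders.
    always @* begin
        acc = {(2*n){1'b0}};                        // start from zero every time
        for (i = 0; i < n; i = i + 1) begin
            if (multiplier[i])
                acc = acc + ({{n{1'b0}}, multiplicand} << i);
        end
    end

    // Sequential part: a single non-blocking update of the output register.
    always @(posedge clk or posedge rset) begin
        if (rset)
            product <= {(2*n){1'b0}};
        else
            product <= acc;
    end
endmodule
```

With this structure the product no longer depends on its own previous value, so it is recomputed correctly on every clock cycle.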

The error message is not about the iterations of your loop, but about the iterations of the internal algorithm of your synthesis program. It tells you that Vivado failed to create a circuit that could actually be implemented on your FPGA of choice at your clock speed of choice and which actually does what you're asking for.
Let me elaborate, after I mention two items of general import: don't use blocking assignments (=) inside always @(posedge clk) blocks. They almost never do what you want. Only non-blocking assignments (<=) should be clocked. Secondly, the correct way to synthesize a for loop is with generate, even though Vivado seems to accept the simple for.
You are building a fairly large block of combinatorial logic here. Remember that when you synthesize combinatorial logic you are asking for a circuit that can evaluate the expression that you've written within a single clock cycle. The actual expression taken into consideration is the one after the for loop has been unrolled. I.e., you are (conditionally) adding a 16-bit number to a 32-bit number 16 times, each time shifted one bit further to the left. Each of these additions has a carry bit. So each of these additions will actually have to look at all of the upper 16 bits of the result of the previous addition, and each will depend on all but the lowest bit of the previous addition. An adder for one bit + carry needs about 5 gates. Each addition is conditional, which adds at least one more gate for each bit. So you are asking for a minimum of 16*16*6 = 1536 interdependent gates that all have to stabilize within a single clock cycle. Vivado is telling you that it cannot fulfill these requirements.
One way of mitigating this problem will be to synthesize for a lower clock frequency, where the gates have more time to stabilize, and you can thus build longer logic chains. Another option would be to pipeline the operation, say by only evaluating what corresponds to four iterations of your loop within a single clock cycle and then building the result over several clock cycles. This will add some bookkeeping logic to your code, but it is inevitable if one wants to evaluate complex expressions with finite resources at high clock frequencies. It would also introduce you to synchronous logic, which you will have to learn anyway if you want to do anything non-trivial with an FPGA. Note that this kind of pipelining wouldn't affect throughput significantly, because your FPGA would then be doing several multiplications in parallel.
You may also be able to rewrite the expression to handle the carry bits and interdependencies in a smarter way that allows Vivado to find its way through the expression (there probably is such a way, I assume it synthesizes if you simply write the multiplication operator?).
Finally, many FPGAs come with dedicated multiplier units because multiplication is a common operation, but implementing it in logic gates wastes a lot of resources. As you found out.
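As a rough illustration of spreading the work over several clock cycles, here is a minimal sketch of the simplest multi-cycle variant: it handles one multiplier bit per clock instead of four, and it is iterative rather than a true pipeline, so it produces one result every n cycles. The module name and the start and done handshake signals are assumptions made for this example, not part of the original design.

```verilog
// Sketch only: iterative shift-and-add multiplier, one bit per clock cycle.
// "seq_multiplier_sketch", "start" and "done" are hypothetical names.
module seq_multiplier_sketch #(parameter n = 16) (
    input  wire           clk,
    input  wire           rset,
    input  wire           start,         // pulse high for one cycle to begin
    input  wire [n-1:0]   multiplier,
    input  wire [n-1:0]   multiplicand,
    output reg  [2*n-1:0] product,
    output reg            done           // high once product is valid
);
    reg [2*n-1:0]         shifted;       // multiplicand << (current bit index)
    reg [n-1:0]           remaining;     // multiplier bits not yet processed
    reg [$clog2(n+1)-1:0] count;         // iterations left

    always @(posedge clk or posedge rset) begin
        if (rset) begin
            product   <= {(2*n){1'b0}};
            shifted   <= {(2*n){1'b0}};
            remaining <= {n{1'b0}};
            count     <= 0;
            done      <= 1'b0;
        end else if (start) begin
            // Load the operands and clear the accumulator.
            product   <= {(2*n){1'b0}};
            shifted   <= {{n{1'b0}}, multiplicand};
            remaining <= multiplier;
            count     <= n;
            done      <= 1'b0;
        end else if (count != 0) begin
            // One conditional addition per clock cycle.
            if (remaining[0])
                product <= product + shifted;
            shifted   <= shifted << 1;
            remaining <= remaining >> 1;
            count     <= count - 1'b1;
            done      <= (count == 1);   // the last addition happens on this edge
        end
    end
endmodule
```

Each clock cycle now contains only a single 2n-bit adder, so the logic chain per cycle is much shorter than in the fully unrolled loop.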

Related

Cannot match operand(s) in the condition to the corresponding edges in the enclosing event control of the always construct

This code fails to compile and gives the error as in the title at the "if(overflow)" line.
always @(posedge clk or negedge overflow) begin
  if(overflow)
    count_posedge = count_posedge + 1;
  else
    count_posedge = 0;
end
I've read somewhere on the internet that I must change it like this:
always @(posedge clk or negedge overflow) begin
  if(~overflow)
    count_posedge = 0;
  else
    count_posedge = count_posedge + 1;
end
...and it works perfectly.
From my understanding, the 2 code should behave the same. What's the problem with the first one?

This is more likely an issue with your synthesizer than with your simulator. The likely problem is that the first code does not match any of its templates for a synchronous flip-flop with asynchronous reset.
The common coding practice is to assign your reset logic before any other logic. This coding practice has been around for decades. I assume the rationale behind this particular coding practice stems from:
Reset logic is critical for many designs, especially as designs get larger and more complex. It is put at the top because of its importance and the fact that it is usually fewer lines of code than the synchronous logic.
Early synthesizers were very limited and could only synthesize specific code structures.
The coding style has become a precedent. No one is going to change it unless you can convince people and prove that something else is superior or has specific advantages.
In your case, your synthesizer is doing a lint check and has determined that your code is not following the conventional coding practices. The creator of your synthesizer has decided to only support the common coding structure, and there is little incentive to change.
FYI: you should be assigning synchronous logic with non-blocking assignments (<=). Blocking assignments (=) should be used for combinational logic. If you do not follow proper coding practices you increase the risk of introducing race conditions, RTL vs gate mismatch, and other bugs.
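For reference, the conventional template described above might look like the following sketch, with the reset branch first and non-blocking assignments throughout. The module name and the WIDTH parameter are assumptions for illustration; overflow is treated as an active-low asynchronous reset, as in the question.

```verilog
// Sketch only: flip-flops with an asynchronous, active-low reset.
// "posedge_counter_sketch" and "WIDTH" are hypothetical.
module posedge_counter_sketch #(parameter WIDTH = 8) (
    input  wire             clk,
    input  wire             overflow,       // acts as active-low async reset
    output reg  [WIDTH-1:0] count_posedge
);
    always @(posedge clk or negedge overflow) begin
        if (!overflow)                       // reset branch tested first, and
            count_posedge <= {WIDTH{1'b0}};  // with the same polarity as negedge
        else
            count_posedge <= count_posedge + 1'b1;
    end
endmodule
```

The test in the first if matches the edge in the sensitivity list (negedge with an active-low check), which is the pattern synthesis tools typically expect.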

Will latches occur in sequential logic in verilog

This is a basic question, but I have not found a clear explanation for it.
In many code examples, one style of writing an FSM output is
assign a = (current_state==DONE)?1:0;
I am confusing this with the definition of a latch. Will this combinational logic infer a latch, since "a" holds its previous value if current_state != DONE? I get no warnings from my compiler.
Sometimes I would have
always @(posedge clk)
begin
  if(reset)
    a<= 1'b0;
  else
    if(current_state == DONE)
      a <=1'b1;
end
This is certainly sequential logic (though my output does not depend on a chain of past inputs), and a keeps its previous value until my control signal current_state == DONE. I would guess this logic synthesizes to a mux at the input of a flip-flop.
So in the second case, where I actually have a clocked FSM, I would have a mux with the FSM state output as the select signal.
So far, can I say that anything that is not combinational logic will not generate latches?
However, when I have a structure like the following,
always @(posedge DCO or posedge reset or posedge enable)
begin
  if(reset)
  begin
  end
  else if(enable)
  begin
  end
  else
  begin
  end
end
I get a warning from my FPGA tool that I have inferred a latch with control signal enable.
Why?
My enable changes based on another state machine, for example:
assign enable = (pcurrent_state == START)?1:0;
Moreover, there are unintentional latches and intentional latches, but design rules basically say to avoid latches in order to avoid timing arcs. Can someone give some examples, other than clock gating, of where intentional latches should be used in a design?
Plus,
The output of all the storage elements (flip-flops) in the circuit at any given time, the binary data they contain, is called the state of the circuit. The state of a synchronous circuit only changes on clock pulses. At each cycle, the next state is determined by the current state and the value of the input signals when the clock pulse occurs.(from https://en.wikipedia.org/wiki/Sequential_logic)
This sounds to me like a description of a Mealy machine rather than of typical sequential logic. In my simplest sequential logic, the output change does not need to be determined by my current_state.
Thank you for any help. I write this kind of code every day and read the definitions, but it seems that I am confusing myself without discussing it with others.

To answer your question in parts:
The given assign statement will not infer a latch, as a does not retain its value. It will be 1 if current_state == DONE and otherwise be 0. So it's pure combinational logic.
The second block of code implements a flip-flop with synchronous reset and only loads itself with 1'b1 if current_state == DONE, so there is retention in that code. This code shouldn't generate a latch due to the edge sensitivity on a single clock.
The last block would be difficult for any synthesis tool to handle due to the sensitivity on several signals, which is not common in hardware. Moreover, if, say, enable is asserted but not at an edge when a positive edge of DCO comes along, the code would have the body of the else if (enable) run, thus simulating some sort of latching behavior. Synthesis tools generally allow for a single clock and a single reset to be specified in the sensitivity list of an always block to indicate a flip-flop with asynchronous reset. While Verilog certainly allows for more complicated sensitivity lists, their physical meaning gets complicated quickly, thus inferring latches. In most designs, you shouldn't ever need these complex sensitivity lists, as you are then getting into asynchronous design, for which most synthesis tools are not well suited on a behavioral level. FPGA tools especially are poor at asynchronous elements and even latches, as the logic cells in the FPGA to which the design must be mapped are designed specifically for synchronous designs using flip-flops; that's how FPGAs are implemented.
Finally, in non-fpga designs, it is sometimes desirable to use latches if edge sensitivity isn't required, as latches are physically smaller and can allow a design to be smaller, faster and more power efficient in some cases. However, you need to have a firm grasp on what you are designing and understand potential trade offs and timing requirements when doing so. Here's an example of when a latch is a useful element: https://electronics.stackexchange.com/questions/255009/what-is-application-of-latch-in-vlsi-design
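To make the first two parts of the answer concrete, here is a small sketch that puts the three cases side by side. The module and signal names are made up for illustration, and the input done stands in for the comparison (current_state == DONE).

```verilog
// Sketch only: combinational output, inferred latch, and enabled flip-flop.
// All names here are hypothetical.
module latch_vs_ff_sketch (
    input  wire clk,
    input  wire reset,
    input  wire done,      // stands in for (current_state == DONE)
    output wire a_comb,
    output reg  a_latch,
    output reg  a_ff
);
    // Pure combinational logic: a_comb is fully defined for every input
    // value, so nothing has to be remembered and no latch is inferred.
    assign a_comb = done ? 1'b1 : 1'b0;

    // Incomplete combinational assignment: there is no else branch, so
    // a_latch must hold its value when done is 0. Tools infer a latch here.
    always @* begin
        if (done)
            a_latch = 1'b1;
    end

    // Clocked block: the value is also retained when done is 0, but the
    // storage element is an edge-triggered flip-flop, not a latch.
    always @(posedge clk) begin
        if (reset)
            a_ff <= 1'b0;
        else if (done)
            a_ff <= 1'b1;
    end
endmodule
```

The difference between the last two blocks is only where the retention lives: in a level-sensitive latch for the combinational block, and in a flip-flop for the clocked one.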

Accessing register values in verilog

I initialize a register
reg[1:0] yreg;
and manipulate it a bit, i.e., value from prev. iteration of program is shifted to 1 spot when I add in the new value in the 0 spot
yreg = SIGNAL; //here SIGNAL is an input to the program
And then I want to access the values at the 0 and 1 spots in the register later for a calculation. How can I do this? My initial reaction was yreg[0] and yreg[1] (I normally program in python =) but this is producing an error (line 35 is the line of code that has yreg[0] and yreg[1] in it):
ERROR:HDLCompiler:806 - "/home/ise/FPGA/trapezoid/trapverilog.v" Line 35: Syntax error near "[".
My assumption when I saw this was that it's not the right syntax to use the brackets to access a certain index of the register. How do you properly access the information in an index of a register? I'm having trouble finding information on this.
Sorry for the probably ridiculous question, this is my first time ever using verilog or FPGAs in general.
Full Code
module trapverilog(
  input CLK,
  input SIGNAL,
  input x,
  output OUT
  );
  reg[1:0] yreg;
  float sum = 0;
  always @(posedge CLK)
  begin
    yreg = SIGNAL; //should shift automatically...?
    sum = ((reg[0] + reg[1])*x/2) + sum; //this is effectively trapezoidal integration
    OUT = sum;
  end
endmodule

You have a fundamental misunderstanding of how Verilog signals work.
By default, all Verilog signals are single bits. For example, in your code, SIGNAL, x, and OUT are all one bit wide. They cannot store a number (other than 0 or 1).
Specifying a width when you define a signal (like reg [1:0] yreg) controls how many bits are used to represent that signal. For instance, yreg is a two-bit signal. This does not mean that it "shifts automatically"; it just means that the signal is two bits wide, allowing it to be used to represent numbers from 0 to 3 (among other things).
I would strongly advise that you work through a course in digital electronics design. Verilog programming is very different from procedural programming languages (such as Python), and you will have a hard time making sense of it without a solid understanding of what you are actually building here.
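Since a two-bit reg does not shift on its own, the shift has to be written out explicitly. A minimal sketch, assuming the intent is to keep the current and the previous one-bit sample of SIGNAL (the module name is hypothetical):

```verilog
// Sketch only: explicit two-stage shift register for a one-bit input.
// "shift2_sketch" is a hypothetical name.
module shift2_sketch (
    input  wire       CLK,
    input  wire       SIGNAL,
    output reg  [1:0] yreg
);
    always @(posedge CLK) begin
        // After each clock edge, yreg[1] holds the previous sample and
        // yreg[0] holds the new one.
        yreg <= {yreg[0], SIGNAL};
    end
endmodule
```

The individual bits can then be read as yreg[0] and yreg[1].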

Apparently as per this answer using brackets to get a certain index of a register is correct. I just forgot to call the variable properly - I was calling reg[0] instead of yreg[0]. Changing this fixed the error.

optimization choices with slice LUT and slice registers in Xilinx FPGA

Does anybody have any idea when Slice LUTs are used and when Slice Registers are used in Xilinx FPGAs? What design choices can one make to explicitly target one of these particular resources?

LUTs have no state and are used to implement combinatorial logic:
assign x = a + b;
Registers are just elements that hold state, and do not implement any logic:
always @(posedge clk) state_f <= state_nxt;
If you want to target one or the other, then you have to choose your algorithms to either minimize combinatorial logic or minimize state.
I think this is the question you are asking, my apologies if it is too simple of an answer.

Verilog Best Practice - Incrementing a variable

I'm by no means a Verilog expert, and I was wondering if someone knew which of these ways to increment a value was better. Sorry if this is too simple a question.
Way A:
In a combinational logic block, probably in a state machine:
//some condition
count_next = count + 1;
And then somewhere in a sequential block:
count <= count_next;
Or Way B:
Combinational block:
//some condition
count_en = 1;
Sequential block:
if (count_en == 1)
count <= count + 1;
I have seen Way A more often. One potential benefit of Way B is that if you are incrementing the same variable in many places in your state machine, perhaps it would use only one adder instead of many; or is that false?
Which method is preferred and why? Do either have a significant drawback?
Thank you.

One potential benefit of Way B is that if you are incrementing the same variable in many places in your state machine, perhaps it would use only one adder instead of many; or is that false?
Any synthesis tool will attempt automatic resource sharing. How well they do so depends on the tool and code written. Here is a document that describes some features of Design Compiler. Notice that in some cases, less area means worse timing.
Which method is preferred and why? Do either have a significant drawback?
It depends. Verilog (for synthesis) is a means to implement some logic circuit, but the spec does not specify exactly how this is done. Way A may be the same as Way B on an FPGA, but Way A is not consistent with low-power design on an ASIC due to the unconditional sequential assignment. Using reset nets is almost a requirement on an ASIC, but since many FPGAs start in a known state, you can save quite a bit of resources by not having them.

I use Way A in my Verilog code. My sequential blocks have almost no logic in them; they just assign registers based on the values of the "wire regs" computed in the combinational always blocks. There is just less to go wrong this way. And with Verilog we need all the help we can get.

What is your definition of "better" ?
It can be better performance (faster maximum frequency of the synthesized circuit), smaller area (less logic gates), or faster simulation execution.
Let's consider the smaller-area case for Xilinx and Altera FPGAs. Registers in those FPGA families have an enable input. In your "Way B", count_en will be directly mapped onto that register enable input, which will result in fewer logic gates. Essentially, "Way B" gives the synthesis tool more "hints" on how to synthesize that circuit well. It's also possible that most FPGA synthesis tools (I'm talking about Xilinx XST, Altera MAP, Mentor Precision, and Synopsys Synplify) will correctly infer the register enable input from "Way A".
If count_en is synthesized as the register enable input, that will result in better performance of the circuit, because your counter increment logic will have fewer logic levels.
Thanks
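For comparison, the two styles from the question could be written out as complete modules, as in the sketch below. The module names, the WIDTH parameter, and the inc and rst inputs (inc stands in for "some condition") are assumptions for illustration only.

```verilog
// Sketch only: the two counter styles side by side. All names hypothetical.

// Way A: next value computed combinationally, registered unconditionally.
module count_way_a #(parameter WIDTH = 8) (
    input  wire             clk,
    input  wire             rst,
    input  wire             inc,     // stands in for "some condition"
    output reg  [WIDTH-1:0] count
);
    reg [WIDTH-1:0] count_next;

    always @* begin
        count_next = count;           // default: hold the current value
        if (inc)
            count_next = count + 1'b1;
    end

    always @(posedge clk) begin
        if (rst) count <= {WIDTH{1'b0}};
        else     count <= count_next;
    end
endmodule

// Way B: an enable computed combinationally, increment in the clocked block.
module count_way_b #(parameter WIDTH = 8) (
    input  wire             clk,
    input  wire             rst,
    input  wire             inc,
    output reg  [WIDTH-1:0] count
);
    reg count_en;

    always @* begin
        count_en = 1'b0;              // default: do not count
        if (inc)
            count_en = 1'b1;
    end

    always @(posedge clk) begin
        if (rst)           count <= {WIDTH{1'b0}};
        else if (count_en) count <= count + 1'b1;
    end
endmodule
```

Both modules describe the same behavior. Whether they map to the same hardware depends on the synthesis tool and target, as the answers above point out.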
