Using blocking assignments to infer flip-flops in Verilog

I have read "Nonblocking Assignments in Verilog Synthesis, Coding Styles that Kill!" by Clifford Cummings. He says that the following code (page 12, simplified) is a correct implementation of a flip-flop often used in textbooks, even if not exactly the kind that anyone should use. The document won a best paper award, so I assume the claim is true.
module ff (q, d, clk);
  output q;
  input  d, clk;
  reg    q;

  always @(posedge clk)
    q = d;
endmodule
I would like to know why this would continue to work correctly if two or more of these flip-flops were connected in series. Say
module two_ffs (q, d, clk);
  input  d, clk;
  output q;
  wire   tmp;

  ff firstff  (tmp, d, clk);
  ff secondff (q, tmp, clk);
endmodule
The way I see it, it's possible that the value of tmp is updated before it is used by secondff, thus resulting in one flip-flop rather than two. Can someone please tell me what part of the standard says that cannot happen? Many thanks.
[not that I would ever contemplate writing code like that, I just want to understand the blocking/nonblocking behavior even in cases when poor coding style makes the meaning non-obvious]
Added later:
I now think the paper is unlikely to be correct. Section 5 "Scheduling Semantics" of the 1364-2001 Verilog standard explains what happens. In particular, section 5.6.6 "Port connections" on page 68 says that unidirectional ports are just like continuous assignments. In turn, a continuous assignment is just an always block sensitive to everything. So the bottom line is that the two instantiations of an ff in my example above are equivalent to a module with multiple always clauses, which everyone would agree is broken.
Added after Clive Cummings answered the question:
I am grateful to CC for pointing out that the statements above, taken from section 5 of the standard, only refer to the timing of update events, and do not imply literal equivalence between e.g. some continuous assignments and always blocks. Nevertheless, I think they explain why some simulators (e.g. Icarus Verilog) produce different simulation results with a blocking and a nonblocking assignment in the "flip-flop". [On a larger example, I got 2 apparent ffs with a blocking assignment, and the correct five with a nonblocking one.] Other simulators (e.g. Modelsim with default options, or Cver) seem to produce the same result no matter which form of assignment is used.
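For anyone who wants to try this, here is a minimal sketch of such a chained-instance test (module and signal names are invented; ff is the blocking-assignment module quoted above):

// Hypothetical 5-stage chain built from the blocking-assignment ff above.
// On some simulators this collapses to fewer apparent stages; changing
// "q = d" to "q <= d" inside ff gives the expected 5-cycle delay everywhere.
module chain5 (q, d, clk);
  output q;
  input  d, clk;
  wire   t1, t2, t3, t4;

  ff s1 (t1, d,  clk);
  ff s2 (t2, t1, clk);
  ff s3 (t3, t2, clk);
  ff s4 (t4, t3, clk);
  ff s5 (q,  t4, clk);
endmodule

module chain5_tb;
  reg  clk, d;
  wire q;

  chain5 dut (q, d, clk);

  always #5 clk = ~clk;

  // $strobe samples at the end of the time step, avoiding printout races
  always @(posedge clk)
    $strobe("%0t: d=%b q=%b", $time, d, q);

  initial begin
    clk = 0;
    d   = 0;
    #12 d = 1;    // launch a single 1 into the chain
    #10 d = 0;
    #100 $finish; // count how many clock edges later the 1 reaches q
  end
endmodule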

All -
A few corrections and updates. Section 5.6.6 of the 2001 Verilog Standard does not say that "unidirectional ports are just like continuous assignments"; it says "Ports connect processes through implicit continuous assignment statements." There is a difference that I will note below.
Second, "a continuous assignment is just an always block sensitive to everything" is not true. Continuous assignments Drive values onto nets that can be driven by other sources with pre-defined resolution functions as described in the Verilog Standard. Always blocks Change values of variables and last procedural change wins (no resolution).
Regarding my description of the 1-always block flip-flop, my description in the paper is not 100% accurate (but is usually accurate). The 2-instantiated flip-flop model in theory does have a race condition, though it is rarely seen. The race is rarely seen because when you make an always block assignment to a variable that is declared as an output, Verilog compilers automatically throw in an "implicit continuous assignment statement" (IEEE-1364-2001, Section 5.6.6, 1st paragraph) to convert the procedural variable into a net-Driving assignment (you never see this happen!). This conversion is typically sufficient to introduce the equivalent of a nonblocking assignment delay on the port, so the simulation works. I have experimented in the past with compiler optimization switches that effectively remove the module ports between the flip-flops and have observed the unwanted race conditions, so technically, my description of an okay 1-always, blocking-assignment flip-flop is not 100% correct; hence, you should still use the nonblocking assignments described in the paper.
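For reference, the nonblocking-assignment style the paper recommends amounts to a minimal change to the model quoted in the question; a sketch:

module ff (q, d, clk);
  output reg q;
  input      d, clk;

  // Nonblocking assignment: the right-hand side is sampled at the clock
  // edge, but q is not updated until the end of the time step, so chained
  // instances always see the old value on their d inputs.
  always @(posedge clk)
    q <= d;
endmodule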
The 2-always blocking-assignment example in the same module has a definite race condition. As written, it will probably work because most compilers execute the code top-down, but if you reverse the order of the always blocks, you will probably see a race.
Regards - Cliff Cummings -
Verilog & SystemVerilog Guru

Reading Version 1.3 of the paper, Section 9, Example 13: the text under it explains that this style is OK if the module only contains a single always block. My current understanding is that it is not an issue between separate modules, which allows your example to work. However, if a module contains multiple always blocks, then the order of execution is undefined and this will lead to the race conditions discussed in Section 2 of the paper.
The example below is almost the same as the two-flop example in the question, except that it is in one module and so has an undefined order of execution; this will likely not work.
module ff (q, d, clk);
  output reg q;
  input      d, clk;
  reg        d_delay;

  always @(posedge clk)
    d_delay = d;

  always @(posedge clk)
    q = d_delay;
endmodule
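For comparison, here is a sketch of the same two-stage register using nonblocking assignments, which removes the dependence on always-block ordering (the module name is just illustrative):

module ff2 (q, d, clk);
  output reg q;
  input      d, clk;
  reg        d_delay;

  // With nonblocking assignments both right-hand sides are sampled before
  // either left-hand side updates, so the order in which the simulator
  // runs the two always blocks no longer matters.
  always @(posedge clk)
    d_delay <= d;

  always @(posedge clk)
    q <= d_delay;
endmodule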

Related

Incomplete assignment and latches

When I assign a value incompletely, I get a latch. But why do I get a latch in the example below? I think there is no need for a latch on the F output because it is defined for all values of SEL.
Verilog code:
always @(ENB or D or A or B or SEL)
  if (ENB)
    begin
      Q = D;
      if (SEL)
        F = A;
      else
        F = B;
    end
Inferred logic:
Although F is defined for all values of SEL, it is not defined for all values of ENB. If ENB = 0, your code says that both Q and F should hold their previous values. This is also what is inferred in the image you are linking: Q and F are only updated when ENB = 1.
If you want Q to be a latch and F not, you can do this:
always @(ENB or D or A or B or SEL)
begin
  if (ENB)
    Q = D;
  if (SEL)
    F = A;
  else
    F = B;
end
Edit: additional information
As pointed out in the comments, I only showed how you could realize combinational logic and a latch, without modifying your code too much. There are, however, some things which could be done better. So, a non-TL;DR version:
Although it is possible to put combinational logic and latches in one procedural block, it is better to split them into two blocks. You are designing two different kinds of hardware, so it is also better to separate them in Verilog.
Use nonblocking assignments instead of blocking assignments when modeling latches. Clifford E. Cummings wrote an excellent paper on the difference between blocking and nonblocking assignments and why it is important to know the difference. I am also going to use this paper as source here: Nonblocking Assignments in Verilog Synthesis, Coding Styles That Kill!
First, it is important to understand what a race condition in Verilog is (Cummings):
A Verilog race condition occurs when two or more statements that are scheduled to execute in the same simulation time-step, would give different results when the order of statement execution is changed, as permitted by the IEEE Verilog Standard.
Simply put: always blocks may be executed in an arbitrary order, which could cause race conditions and thus unexpected behaviour.
To understand how to prevent this, it is important to understand the difference between blocking and nonblocking assignments. When you use a blocking assignment (=), the evaluation of the right-hand side (in your code A, B, and D) and assignment of the left-hand side (in your code Q and F) is done without interruption from any other Verilog statement (i.e., "it happens immediately"). When using a nonblocking assignment (<=), however, the left-hand side is only updated at the end of a timestep.
As you can imagine, the latter assignment type helps to prevent race conditions, because you know for sure at what moment the left-hand side of your assignment will be updated.
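A classic illustration of the difference (the signal names here are made up, not from your code): two registers that exchange values every clock edge. With nonblocking assignments they genuinely swap; with blocking assignments the result depends on which always block the simulator happens to run first.

module swap_demo (clk);
  input clk;
  reg   a, b;

  initial begin
    a = 1'b0;
    b = 1'b1;
  end

  // Nonblocking: both right-hand sides are sampled first, both left-hand
  // sides update at the end of the time step, so a and b swap reliably.
  always @(posedge clk) a <= b;
  always @(posedge clk) b <= a;

  // Race-prone version, for contrast (do not use):
  //   always @(posedge clk) a = b;
  //   always @(posedge clk) b = a;
endmodule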
After an analysis of the matter, Cummings concludes, i.a., the following:
Guideline #1: When modeling sequential logic, use nonblocking assignments.
Guideline #2: When modeling latches, use nonblocking assignments.
Guideline #3: When modeling combinational logic with an always block, use blocking assignments.
A last point which I want to highlight from the aforementioned paper is the "why". Apart from making sure the right hardware is inferred, following these guidelines also helps when correlating pre-synthesis simulations with the behaviour of your actual hardware:
But why? In general, the answer is simulation related. Ignoring the above guidelines [about using blocking or nonblocking assignments on page 2 of the paper] can still infer the correct synthesized logic, but the pre-synthesis simulation might not match the behavior of the synthesized circuit.
This last tip is not possible if you want to strictly adhere to Verilog-2001, but if you are free to choose your Verilog version, try to use always_comb for combinational logic and always_latch for latches. Both keywords automatically infer the sensitivity list, and they make it easier for tools to check whether you actually coded the logic you intended to design.
Quoting from the SystemVerilog LRM:
The always_latch construct is identical to the always_comb construct except that software tools should perform additional checks and warn if the behavior in an always_latch construct does not represent latched logic, whereas in an always_comb construct, tools should check and warn if the behavior does not represent combinational logic.
With these tips, your logic would look like this:
always_latch
begin
  if (ENB)
    Q <= D;
end

always_comb
begin
  if (SEL)
    F = A;
  else
    F = B;
end

How does a sensitivity list work at the circuit level?

Let's say there's code like this:
reg [4:0] data;

always @(posedge clk, posedge clr)
begin
  if (clr)
    data <= 0;
  else
    data <= data + 1;
end
What would this look like at the circuit level? My guess is roughly this:
but then that wouldn't help if Clk goes from 0 to 1 while Clr is 1.
Also, is it good practice to have multiple elements in the sensitivity list? From what I see, there's some overhead going on here.
Your Verilog excerpt will infer a DFF (D flip-flop) with an asynchronous reset. This happens because the reset signal is part of the sensitivity list.
NOTE 1: as per the Verilog LRM, adding the reset to the sensitivity list is what makes the reset asynchronous.
NOTE 2: each Verilog procedural block should model only one type of flip-flop. In other words, a designer should not mix resettable (sync or async) flip-flops with follower flip-flops (flops with no resets) in the same procedural block.
Your diagram is incorrect: the 'clr' signal will be connected to an extra input of the DFF called CLEAR (it is basically an asynchronous reset). I suggest starting with some sort of Verilog tutorial; this is a very basic thing and it is well explained in materials that are generally available. To grasp the concept of reset in HDL code I recommend the following material:
http://www.sunburst-design.com/papers/CummingsSNUG2003Boston_Resets.pdf
The schematic is not accurate. A D-FF will be implemented for each bit of data declared (five for reg [4:0]).
Adding the reset (i.e. clr) to the sensitivity list makes the reset asynchronous (Verilog LRM).
The D-FF will have an additional clear pin; there will be NO bubble on this pin, as your reset (i.e. clr) is active high.
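To make the synchronous/asynchronous distinction concrete, here is a sketch of the counter coded both ways (output names are illustrative); the data path is identical and only the reset handling differs:

module counters (clk, clr, data_async, data_sync);
  input            clk, clr;
  output reg [4:0] data_async, data_sync;

  // Asynchronous reset: clr is in the sensitivity list, so the flops clear
  // as soon as clr rises, without waiting for a clock edge.
  always @(posedge clk or posedge clr)
    if (clr)
      data_async <= 5'd0;
    else
      data_async <= data_async + 5'd1;

  // Synchronous reset: clr is not in the sensitivity list; it is sampled
  // only on the rising clock edge, like any other data input.
  always @(posedge clk)
    if (clr)
      data_sync <= 5'd0;
    else
      data_sync <= data_sync + 5'd1;
endmodule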

Verilog design - input is "unused" warning

When attempting to synthesize a Verilog design (I want to generate a schematic), I get the following warning:
Synthesizing Unit <rising>.
Related source file is "C:\PPM\PPM_encoder\detectors.v".
WARNING:Xst:647 - Input <in> is never used. This port will be preserved and left unconnected if it belongs to a top-level block or it belongs to a sub-block and the hierarchy of this sub-block is preserved.
Summary:
no macro.
Unit <rising> synthesized.
The relevant module is simply:
module rising (in, out);
output out;
input in;
not #(2,3) (ininv, in);
and #(2,3) (out, in, ininv);
endmodule
And I call it in several different locations, including:
rising startdetect(
.in(start),
.out(or01a));
When I complete the synthesis and then choose to "View schematic", only one component is actually present. Expanding that component, I see only the output being connected to ground, which is the initial condition. Nothing else is present. This is with my testbench as my "top module".
When I select my actual main project (below the testbench, it's called ppmencode) as the top module, I get those same warnings, plus additional warnings for every single module instance:
WARNING:Xst:1290 - Hierarchical block <startdetect> is unconnected in block <ppmencode>.
It will be removed from the design.
What is the cause of these two warnings, and how can I fix them and be able to generate a correct schematic?
Edited to add: The whole thing simulates perfectly; it's just when trying to make a schematic (to explain this thing I just made to my team) that I run into problems. This image shows the schematic that I get.
It's not enough to have a signal named as an input to a module... it needs to actually be connected to a pin on the FPGA. On the other hand, your rising module is taking the AND of the input and its complement... the synthesizer might have simplified that logic in a way that is contrary to your wishes.
Synthesis is optimizing all the logic out because it ignores the delays. Functionally you have in & ~in, which is always 0. What you intend is a pulse generator. One way to achieve this is to use the dont_touch attribute, which tells the synthesizer that it must keep a particular module instantiation in the design. See this Properties Reference Guide for more.
module rising (in, out);
output out;
input in;
(* DONT_TOUCH = "TRUE" *)
not #(2,3) (ininv, in);
and #(2,3) (out, in, ininv);
endmodule
Be warned that even with the dont_touch, your synthesis result may not match simulation. Synthesis ignores the artificial timing in your netlist. The actual pulse width could be longer, but more likely shorter or too small to be registered. Check your standard cell library and look for a delay cell to apply to ininv; this will increase the pulse width. The library may already have a pulse generator cell defined.
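If a clock is available in your design (an assumption; the rising module shown is purely combinational), a common synthesis-friendly alternative is a registered edge detector: it produces a pulse one clock period wide and does not rely on gate delays that synthesis ignores. A sketch:

module rising_sync (clk, in, out);
  input  clk, in;
  output out;
  reg    in_q;

  // Remember the previous value of in.
  always @(posedge clk)
    in_q <= in;

  // A rising edge is "high now, low on the previous clock".
  assign out = in & ~in_q;
endmodule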

Precise rules to determine what Verilog is synthesisable

I am reading the IEEE Standard Verilog Hardware Description Language (specifically IEEE Std 1364-2001) which unambiguously defines and discusses simulatable Verilog. Unfortunately, the document does not touch upon the notion of synthesis.
I haven't been able to find a similar reference for synthesisable Verilog. All I find is vague rules, or unnecessarily restrictive ones.
Where can I learn the formal language of synthesisable Verilog?
IEEE 1364.1 is an adjunct to the 1364 Verilog standard titled Verilog Register Transfer Level Synthesis, which attempts to define a common synthesizable subset. However, as Jerry points out, different tools support different constructs, and to determine tool-specific behavior, you need to consult the tool documentation.
There isn't a formal (BNF-style) syntax definition for synthesizable Verilog. Whether code is synthesizable depends on usage as well as syntax. For example, the behavior described by an always construct with incomplete sensitivity, like always @(a) o = a || b, isn't synthesizable. (Most tools will synthesize that code as if the sensitivity list were complete, resulting in a possible simulation/synthesis mismatch.)
Circuit constructs like latches and multiply-driven nets can be synthesized from a Verilog description, but are disallowed or discouraged under most design rules. There are also synthesizable constructs that are unsupported or inadvisable given the choice of target library. For example, describing a RAM that's larger than the maximum supported by a chosen FPGA technology, or describing tri-state drivers when they aren't present in the target library.
The general constructs to stick to for synthesizable Verilog are:
Combinational logic modeled with continuous assignments (assign statements)
Combinational logic modeled with always blocks, which should use blocking assignments, and either have a complete sensitivity list or use always @* (a short sketch of these first two styles follows this list)
Sequential logic (flip-flops) modeled with always blocks, which should use non-blocking assignments, and have either posedge clock alone (for sync reset or non-reset flops) or posedge clock and posedge or negedge reset (for async reset flops) in the sensitivity list.
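As a sketch of the first two combinational styles mentioned above (the 2-to-1 mux and its signal names are invented for illustration):

module mux2 (a, b, sel, y_assign, y_always);
  input      a, b, sel;
  output     y_assign;
  output reg y_always;

  // Style 1: combinational logic as a continuous assignment.
  assign y_assign = sel ? b : a;

  // Style 2: combinational logic in an always block, using blocking
  // assignments and @* for a complete sensitivity list.
  always @*
    if (sel)
      y_always = b;
    else
      y_always = a;
endmodule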
The safest coding style for sequential logic is to code only reset logic in the sequential always block:
always @(posedge clk or posedge reset)
  if (reset)
    q <= reset_value;
  else
    q <= next_value;
However, if you're careful, you can code additional combinational logic in the sequential block. A common case where it may make sense to do this is a mux in front of a flop:
always @(posedge clk)
  if (!sel)
    q <= sel0_value;
  else if (sel)
    q <= sel1_value;
  else
    q <= 'bx;

What's included in a Verilog always @* sensitivity list?

I'm a bit confused about what is considered an input when you use the wildcard @* in an always block sensitivity list. For instance, in the following example, which signals are interpreted as inputs that cause the always block to be reevaluated? From what I understand, clk and reset aren't included because they don't appear on the right-hand side of any procedural statement in the always block. a and b are included because they both appear on the right-hand side of procedural statements in the always block. But what I'm really confused about is en and mux. Because they are used as test conditions in the if and case statements, are they considered inputs? Is the always block reevaluated each time en and mux change value? I'm pretty much a noob, and in the 3 Verilog books I have I haven't found a satisfactory explanation. I've always found the explanations here to be really helpful. Thanks!
module example
(
input wire clk, reset, en, a, b,
input wire [1:0] mux,
output reg x,y, z
);
always @*
begin
  x = a & b;
  if (en)
    y = a | b;
  case (mux)
    2'b00: z = 0;
    2'b01: z = 1;
    2'b10: z = 1;
    2'b11: z = 0;
  endcase
end
endmodule
Any signal that is read inside a block, and so may cause the result of the block to change if its value changes, will be included by @*. Any change on a signal that is read must cause the block to be re-evaluated, as it could cause the outputs of the block to change. As I'm sure you know, if you hadn't used @* you'd be listing those signals out by hand.
In the case of the code you've provided it's any signal that is:
Evaluated on the right hand side of an assignment (a and b)
Evaluated as part of a conditional (en and mux)
...but it's any signal that would be evaluated for any reason. (I can't think of any other reasons right now, but maybe someone else can)
clk and reset aren't on the sensitivity list because they aren't used. Simple as that. There's nothing special about them; they're signals like any other.
In your example, the following signals are included in the implicit sensitivity list:
a
b
en
mux
clk and reset are not part of the sensitivity list.
This is described completely in the IEEE Std for Verilog (1800-2009, for example). The IEEE spec is the best source of detailed information on Verilog. The documentation for your simulator may also describe how @* works.
The simplest answer depends on whether you are writing RTL or a testbench. If you are writing RTL, then you should try to forget about the concept of sensitivity lists, as they don't really exist. There is no logic that only updates when an item on the list is triggered. All sensitivity lists can do in RTL is cause your simulation and actual circuit to differ; they don't do anything good.
So, always use "always @*" or, better yet, "always_comb" and forget about the concept of sensitivity lists. If an item in the code is evaluated, it will trigger the process. Simple as that. If an item appears in an if/else, a case, on the right-hand side of an assignment, or anywhere else it is read, it will be "evaluated" and thus cause the process to be triggered.
But, just remember, in digital circuits, there is no sensitivity list.
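To illustrate that last suggestion, here is the example module from the question rewritten with always_comb (SystemVerilog); the behaviour is unchanged, and the tool, rather than a hand-written list, determines what the block is sensitive to:

module example
(
  input  wire       clk, reset, en, a, b,
  input  wire [1:0] mux,
  output reg        x, y, z
);
  // always_comb is sensitive to everything read inside the block:
  // a, b, en and mux. clk and reset are never read, so they play no part.
  always_comb
  begin
    x = a & b;
    if (en)        // note: tools may warn that this incomplete if describes
      y = a | b;   // latched behaviour for y, which @* would accept silently
    case (mux)
      2'b00: z = 1'b0;
      2'b01: z = 1'b1;
      2'b10: z = 1'b1;
      2'b11: z = 1'b0;
    endcase
  end
endmodule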
