Let's say I have this code:
wire clk1;
wire clk2;
assign clk1 = Clk;
assign clk2 = Clk;
Now clk1 and clk2 are used to clock various modules and traverse the hierarchy of the design. Somewhere deep in the hierarchy, if one module is clocked by clk1, does its output remain synchronous with the output of another module clocked by clk2?
e.g.
reg r1;
always @ (posedge clk1)
r1 <= rSomething;
reg r2;
always @ (posedge clk2)
r2 <= r1;
Is this code valid? Will the synthesis tools (Altera tool chain) maintain the skew across these two clocks? Or will it maintain the skew only on clocks that are named the same and clk1 and clk2 will cease to be synchronous despite their common source?
Thanks
EDIT1: This is for synthesis, not simulation.
EDIT2: Changed the second code example. I was trying to assign r2 <= r1, not the other way round as was the case earlier.
A synthesizer will transform your design input into an internal netlist that represents the logic structure. This is typically done in two stages: first to a high-level behavioral form that represents abstract operations, and then to a technology-mapped form that directly implements logic primitives of the target architecture. In this transformation process clk1 and clk2 will be seen as topologically equivalent to Clk and they will be treated as one combined net.
The normal clock buffer insertion process will account for the skew across all leaf nodes in the unified net. Any timing constraints would need to be put on Clk. An attempt to constrain clk1 and clk2 independently could have unpredictable results.
Renamed clocks remain synchronous. An explicit continuous assignment is similar to passing a signal through a port where the connecting signal name and the port name are different.
However, no synthesis tool that I am aware of will let you make assignments to the same variable from multiple processes.
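To illustrate the comparison with port connections, here is a minimal sketch. The module names stage and alias_top and the signal names d, mid and q are made up for this example; only Clk, clk1 and clk2 come from the question. Both the continuous assignments and the port connections just rename the same clock net, so a synthesizer should see a single clock driving both flops.

```verilog
// Hypothetical leaf module: one flop clocked by whatever net is
// connected to clk_in.
module stage (
  input  wire clk_in,
  input  wire d,
  output reg  q
);
  always @(posedge clk_in)
    q <= d;
endmodule

// Hypothetical top level: clk1 and clk2 are aliases of Clk.
module alias_top (
  input  wire Clk,
  input  wire d,
  output wire q
);
  wire clk1, clk2, mid;

  assign clk1 = Clk;  // renaming by continuous assignment
  assign clk2 = Clk;

  // Renaming again through the port: clk_in inside each instance
  // is still the same net as Clk.
  stage s1 (.clk_in(clk1), .d(d),   .q(mid));
  stage s2 (.clk_in(clk2), .d(mid), .q(q));
endmodule
```

Timing constraints for this sketch would go on Clk, as described above.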
Can a posedge be detected on variables that aren't a clock?
For example I have a reset button R, which should reset the machine to the starting state whenever it is pressed.
always @ (posedge clk, posedge R)
if(R)
reset_the_machine();
else
use_the_next_state();
You have two questions in your question:
Can posedge in Verilog be used only on a clock?
The answer is no.
Can a posedge be detected on variables that aren't a clock?
The answer is yes.
There are no clocks in verilog language. Every signal is equal. Edges could be detected in simulation on any variable. Detection of edges itself is a simulation artifact.
Clock is only a modern hardware artifact. A verilog program reflects hardware behavior and therefore it needs to program clocks in a specific hardware-related way. But this is just a programming trick.
As for your example, see dave-59's answer.
The use of an edge on a signal like a set or reset is an artifact of the template pattern that synthesis tools chose to represent asynchronous logic in a single always block. If you didn't qualify R with posedge, then the negative edge (negedge) of R would execute the use_the_next_state() branch. Thus simulation would be interpreting the release of reset as a posedge of clock.
When Verilog was first developed, the intent was to use separate always blocks for the synchronous and asynchronous behaviors using a procedural continuous assignment
always @(R)
if (R)
assign state = initial_state;
else
deassign state;
always @(posedge clk)
state <= next_state;
But today's synthesis tools no longer support this template, probably because of confusion with the fully continuous assign statement.
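For comparison, here is a sketch of the single-block template that synthesis tools do recognize for a register with an asynchronous reset. The module name, the width parameter W and the all-zero INITIAL_STATE are assumptions made for this example; clk, R, state and next_state are the names used above.

```verilog
// Hypothetical wrapper around the state register only.
module async_reset_state_reg #(
  parameter W = 2                       // assumed state width
) (
  input  wire         clk,
  input  wire         R,                // asynchronous, active-high reset
  input  wire [W-1:0] next_state,
  output reg  [W-1:0] state
);
  localparam [W-1:0] INITIAL_STATE = {W{1'b0}};  // assumed reset value

  // posedge R makes assertion of reset take effect immediately.
  // Releasing R produces no event here, so the release is not
  // mistaken for a clock edge.
  always @(posedge clk or posedge R)
    if (R)
      state <= INITIAL_STATE;
    else
      state <= next_state;
endmodule
```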
I'm connecting a DDR model in the test bench using wires with net delay to model the trace delay on the board. The trace can hold 1 or 2 bits in its transmission line, but because the simulators model the net delay as inertial delay, all the bits get filtered out as glitches. I don't even get the clock this way. The SystemVerilog spec is not explicit on this subject. So, I'm guessing that the simulators don't want to incur the cost on performance and storage to model it as transport delay. However, I strongly believe transport delay is the right model for net delay, because otherwise the hassle of modeling each bidir signal with its own delay is huge. What do you think?
Here's my test case.
`timescale 1ps/1ps
module sim_top ();
parameter realtime NET_DELAY = 80ps;
parameter realtime PULSE_DELAY = 50ps;
logic driver;
initial begin
driver = 0;
forever #PULSE_DELAY driver = !driver;
end
wire #NET_DELAY a;
assign a = driver;
always @(posedge a or negedge a) begin
$display("a=%b @ %t", a, $time);
end
logic b;
always @(*) begin
b <= #NET_DELAY driver;
end
always @(posedge b or negedge b) begin
$display("b=%b @ %t", b, $time);
end
initial begin
#1ns;
$finish;
end
endmodule
Transport delay can be simulated by using # delays with non-blocking assignments:
always @*
out <= #delay in;
This is close to an inertial delay:
assign #delay out = in;
The rest are mostly behavioral simulation artifacts which should be used in testbench only.
I had found this paper very informative for this issue:
http://www-inst.eecs.berkeley.edu/~cs152/fa06/handouts/CummingsHDLCON1999_BehavioralDelays_Rev1_1.pdf
So, I'm guessing that the simulators don't want to incur the cost on performance and storage to model it as transport delay.
I can imagine how you arrived at that reasoning. From what I have learned, your assumption is incorrect.
Real logic gates have inertial delay. This delay is caused by RC (resistance and capacitance) that is inherent to the device, comes from routing, or is intentionally added. Verilog uses inertial delay to emulate the delay behavior of real logic.
I cannot find a citation, but from discussions with engineers with pre-Verilog experience, early Verilog only modeled gates (think Verilog primitives and, or, nand, bufif0, etc.). Assign statements and always blocks were added some time later along with the possibility of synthesis. Non-blocking (<=) assignments were added much later (and copied from VHDL from what I've heard), but still before Verilog was standardized by the IEEE.
Assign statements handle delay the same way as gates. See IEEE 1800-2012 § 10.3.3, Continuous assignment delays:
A delay given to a continuous assignment shall specify the time duration between a right-hand operand value change and the assignment made to the left-hand side. If the left-hand references a scalar net, then the delay shall be treated in the same way as for gate delays; that is, different delays can be given for the output rising, falling, and changing to high impedance (see 28.16).
I'm guessing assign statements use the same type of delay as gates to reuse existing code (inside the simulator, not Verilog code) for managing delay, and to maintain continuity in the definition of delay. It also makes mapping easier, e.g. assign #1 out = in; is the same as buf #1 (out,in);.
With real circuits, when you want a delay of 10 with a filter of 0.5, you need a chain of 20 delay cells with small RC. If you want a delay and filter of 10, you use one delay cell with large RC. In theory synthesizers could use this delay information for target delay and filtration control (I cannot think of any synthesizer that actually does this; all synthesizers I know of ignore RTL delays).
Generally non-blocking (<=) assignments are used for synchronous logic (edge-sensitive flops and level-sensitive latches). Transport delay is the main exception but should only be used for behavioral modeling that will not be synthesized. If you know the amount of filtering your real delay path will have, I suggest coding it as follows:
reg out;
wire #(FILTER) in_filtered = in;
always @* out <= #(TOTAL_DELAY-FILTER) in_filtered; // delay must be >=0.0
Which is cleaner and less CPU intensive than the pure inertial delay approach:
localparam LENGTH = TOTAL_DELAY/FILTER; // must be an integer >=2
wire out;
wire [LENGTH-2 : 0] delay_chain;
assign #(FILTER) {out,delay_chain} = {delay_chain,in};
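To try the filtered transport delay, a small test bench along these lines could be used. This is only a sketch: the module name, the values FILTER = 20 and TOTAL_DELAY = 80, and the stimulus times are all made up for illustration. I would expect the 50 ps pulse to appear on out delayed by TOTAL_DELAY and the 10 ps glitch to be removed by the inertial stage, but check it in your own simulator.

```verilog
`timescale 1ps/1ps
// Hypothetical test bench for the filter-plus-transport-delay coding.
module filtered_transport_tb;
  localparam FILTER      = 20;  // assumed: pulses narrower than this are dropped
  localparam TOTAL_DELAY = 80;  // assumed: overall path delay

  reg in;
  reg out;

  // Inertial stage: the delay belongs to the continuous assignment,
  // so pulses shorter than FILTER are filtered out.
  wire #(FILTER) in_filtered = in;

  // Transport stage: intra-assignment delay on a non-blocking
  // assignment keeps every scheduled transition.
  always @* out <= #(TOTAL_DELAY - FILTER) in_filtered;

  initial begin
    $monitor("%0t in=%b in_filtered=%b out=%b", $time, in, in_filtered, out);
    in = 1'b0;
    #100 in = 1'b1;   // start of a 50 ps pulse (wider than FILTER)
    #50  in = 1'b0;
    #100 in = 1'b1;   // start of a 10 ps glitch (narrower than FILTER)
    #10  in = 1'b0;
    #200 $finish;
  end
endmodule
```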
I'm trying to design this state machine in verilog:
Here is what I have:
`timescale 1ns/1ns
module labEightMachine(y, x,clk,clr)
output y;
input [1:2] x;
input clk, clr;
reg [1:2] q;
reg y;
reg nX;
always @ (posedge clk)
begin
if(clr)
{q}<=2'b00;
else
q<= {nX}
end
always @(y,x)
begin
if(q==2'b00)
if(x==2'b00)
q<=2'b00;
else
q<=2'b01;
y<=1;
if(q==2'b01)
if((x==2'b00)||(x==2'b01))
q<=2'b00;
y<=0;
else
q<=2'b11;
y<=0;
if(q==2'b11)
if(x==2'b10)
q<=2'b10;
y<=1;
else
q<=2'b00;
y<=0;
if(q==2'b10)
q<=2'b00;
y<=0;
end
endmodule
If any one could help by telling me where it is incorrect, that would be greatly appreciated. The state machines confuse me and I'm not sure that I am reassigning everything correctly.
Applying stimulus is always a better way to check your code. Leaving the syntax errors of semi-colons/begin-end and all that, a couple of immediate logical errors I can see are as below.
First, the declaration reg nX declares a variable that is a single bit wide. In contrast, q is declared as reg [1:2] q, which is two bits wide.
Secondly, q is driven from two always blocks, which is a multiple-driver issue. If clr is LOW, then q is driven by nX, while nX is never driven by any signal. So the output will be x most of the time (leaving aside the race conditions).
Thirdly, it would be better to use an if--else if--else ladder instead of multiple ifs. This will make the next_state logic clear.
A better FSM might have two always blocks and one output logic block: one always block for sequential logic and the other for combinational logic. The sequential block is used to update the current_state value with the next_state value, while the combinational block is used to update the next_state value according to the inputs. The output logic must either have a separate block of continuous assignments or a procedural block.
Also, it might be convenient to use a case statement for the next_state logic. This will be useful when too many states are interacting with each other in a single FSM. Using a default branch in the case statement is essential.
For further information on efficient FSM coding styles, refer to CummingsSNUG1998SJ_FSM paper and CummingsSNUG2000Boston_FSM paper.
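As a rough sketch of that structure, here is one way the question's machine could be laid out. The state diagram itself is not available here, so the transitions and outputs below are only my reading of the question's attempted code and may not match the intended diagram. The module name, the state names S0 to S3 and the signals state and next_state are invented for this example.

```verilog
`timescale 1ns/1ns
// Hypothetical rewrite; transitions are a guess taken from the
// question's code, not from the original state diagram.
module labEightMachine_sketch (
  output reg        y,
  input  wire [1:2] x,
  input  wire       clk,
  input  wire       clr
);
  localparam [1:2] S0 = 2'b00, S1 = 2'b01, S2 = 2'b11, S3 = 2'b10;

  reg [1:2] state, next_state;

  // Sequential block: the only place where state is assigned.
  always @(posedge clk)
    if (clr)
      state <= S0;
    else
      state <= next_state;

  // Combinational block: next-state and output logic.
  // Defaults first, so every path assigns both signals (no latches).
  always @* begin
    next_state = S0;
    y          = 1'b0;
    case (state)
      S0: if (x != 2'b00) begin
            next_state = S1;
            y          = 1'b1;
          end
      S1: if (x == 2'b10 || x == 2'b11)
            next_state = S2;
      S2: if (x == 2'b10) begin
            next_state = S3;
            y          = 1'b1;
          end
      S3: next_state = S0;
      default: begin
        next_state = S0;
        y          = 1'b0;
      end
    endcase
  end
endmodule
```

Here the output logic is folded into the combinational block; it could equally be moved to its own block or to continuous assignments.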
I've got an fpga design that utilizes synchronous resets (I prefer synchronous resets to asynchronous for reasons discussed elsewhere). I have four different clock domains in the design and I utilize a single button to generate my reset signal, which is of course totally asynchronous to everything (save my finger). I debounce the button signal in each of the four clock domains to generate synchronous resets for the four domains from a single source. My debounce module basically counts N clock cycles of the reset button being asserted. If more than N cycles have passed with reset asserted then I generate my reset signal (code for this module pasted below).
First question -- are there better ways of generating the reset(s) than this method?
Second (more interesting question): when I look at the timing reports (using Xilinx tools) I see that consistently the limiting signals are all reset related. For example, the limiting path is from the reset generator (debouncer) to some state machine's state register. The reset signals have very high fanout (they touch everything in their respective clock domains). I'm a little surprised, though, that my speed is limited by the reset. I'm finding that I'm limited to something like 8.5 ns where ~50% is routing and ~50% of that is logic. Any suggestions on how to do this a little better? How do you go about dealing with synchronous reset generation in FPGA designs?
Here's the code for reset generation. Note that the reset signal is the debounced output (i.e. when I instantiate the module, the debounced output is the reset for that particular clock domain).
module button_debouncer(/*AUTOARG*/
// Outputs
debounced,
// Inputs
clk, button
);
/* Parameters */
parameter WIDTH = 1;
parameter NUM_CLKS_HIGH = 12000000;
parameter log2_NUM_CLKS = 24;
/* Inputs */
input clk;
input [WIDTH-1:0] button;
/* Outputs */
output [WIDTH-1:0] debounced;
/* Regs and Wires */
reg [WIDTH-1:0] b1, b2;
reg [log2_NUM_CLKS-1:0] counter;
/* Synched to clock domain */
always @(posedge clk) begin
b1 <= button;
b2 <= b1;
end
/* Debounce the button */
always @(posedge clk) begin
if(~b2)
counter <= 0;
else if(counter < {log2_NUM_CLKS{1'b1}})
counter <= counter + 1;
end
/* Assign the output */
//wire [WIDTH-1:0] debounced = counter > NUM_CLKS_HIGH;
reg [WIDTH-1:0] debounced;
always @(posedge clk) begin
debounced <= counter > NUM_CLKS_HIGH;
end
endmodule //button_debouncer
A very good way to improve timing scores while working with resets is to cap the maximum fanout. The tools will then buffer the signal so that there is not a single LUT being routed to drive every register. This can be accomplished in this way:
(* max_fanout = <arbitrary_value> *)
wire reset;
What we have here is a constraint used by the Vivado synthesis tool (or by ISE, if you are still using that). It should also be noted that this only affects the next declaration of a net, so other nets (wires, regs, etc.) declared before or after this are unaffected.
There is a good constraint user guide on xilinx's website. There are a few other ones that you may want to look into as well and they are: IBUF or BUFG.
You don't need four instances of the debouncer. Put in one debouncer off your main clock and then use three metastable filters to sync its output to the other three domains.
Also when you distribute your reset you should use what Cliff Cummings calls a "synchronous reset distribution tree". Check his website for some papers on that.
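A metastability filter of the kind mentioned above is usually just two flops in the destination clock domain. A minimal sketch, with made-up module and signal names:

```verilog
// Hypothetical two-flop synchronizer: brings the single debounced
// reset into one of the other clock domains.
module reset_sync (
  input  wire clk,        // destination-domain clock
  input  wire reset_in,   // debounced reset from the main clock domain
  output wire reset_out   // reset synchronous to clk
);
  reg meta, sync;

  always @(posedge clk) begin
    meta <= reset_in;     // first flop may go metastable
    sync <= meta;         // second flop gives it a cycle to settle
  end

  assign reset_out = sync;
endmodule
```

One instance per extra clock domain would replace the three additional debouncers, each with its own 24-bit counter.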
I just want to know the difference between these two statements:
always @(posedge CLK)
begin
state <= next_state;
end
AND:
always @(CLK)
begin
case(CLK)
1'b1:
state <= next_state;
1'b0:
state <= state;
endcase
end
Is there a difference between both ?
Thanks
Not quite. posedge detects these transitions (from the LRM):
Table 43—Detecting posedge and negedge
From \ To   0         1         x         z
0           No edge   posedge   posedge   posedge
1           negedge   No edge   negedge   negedge
x           negedge   posedge   No edge   No edge
z           negedge   posedge   No edge   No edge
So, 0->x is a posedge, for example. Your second example only detects cases where CLK ends up as 1, so misses 0->x and 0->z.
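A small simulation sketch of that 0->x case follows. The module name, the two separate state registers and the stimulus times are made up so that both styles can be watched side by side.

```verilog
// Hypothetical test bench: the same 0 -> x transition on CLK is
// applied to both coding styles.
module edge_vs_level_tb;
  reg CLK;
  reg next_state;
  reg state_edge;    // updated by the posedge style
  reg state_level;   // updated by the case(CLK) style

  always @(posedge CLK)
    state_edge <= next_state;

  always @(CLK)
    case (CLK)
      1'b1: state_level <= next_state;
      1'b0: state_level <= state_level;
    endcase

  initial begin
    CLK        = 1'b0;
    next_state = 1'b1;
    #1 state_edge  = 1'b0;   // known starting values, set after any
       state_level = 1'b0;   // time-0 activity has settled
    #1 CLK = 1'bx;           // 0 -> x: a posedge per the table
    #1 $display("state_edge=%b state_level=%b", state_edge, state_level);
    // prints: state_edge=1 state_level=0
    $finish;
  end
endmodule
```

The posedge block fires on 0->x and captures next_state, while in the second block x matches neither case item, so nothing is assigned.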
IEEE Std. 1364.1(E):2002 (IEC 62142(E):2005), the Verilog register transfer level synthesis standard, states in Sec. 5.1 that an always block without any posedge/negedge events in the sensitivity list is combinational logic. I.e. the signals in the event list are ignored and the block is synthesized as if an implicit expression list (@(*), @*) was used. The following example is given in the standard ("Example 4" on page 14):
always @ (in)
if (ena)
out = in;
else
out = 1'b1;
// Supported, but simulation mismatch might occur.
// To assure the simulation will match the synthesized logic, add ena
// to the event list so the event list reads: always @ (in or ena)
(the comment is also copied from the standard document)
I.e. for a synthesis tool your second block is effectively:
always @*
begin
case(CLK)
1'b1:
state <= next_state;
1'b0:
state <= state;
endcase
end
which is just a multiplexer with CLK as select input, next_state as active-1 input and the output (state) fed back as active-0 input. A smart synthesis tool might detect that this is identical to a D-type latch with CLK as enable input and create a D-type latch instead of a combinational loop. Note that the synthesis tool is not required to detect this latch because the code explicitly assigns state in all branches (compare Sec. 5.3 of the standard).
Either way this is different from the D-type flip-flop your first code example would synthesize to. This is one of many cases where Verilog code has a different meaning in simulation and synthesis. Therefore it is important to (1) write synthesizable Verilog code in a way that avoids these cases and (2) always run post-synthesis simulations of your design (even if you are also using formal verification!) to make sure you have successfully avoided these pitfalls.
Functionally, those two circuits describe the same behavior in verilog, so I think there should be no difference.
However, you should generally use the first style, as that is the standard one for writing synthesizable code and the one most understandable to anyone else reading your code. The latter style, while describing the correct behavior, may confuse some synthesizers that don't expect to see blocks that are sensitive to both the positive and the negative edge of a clock.
The two blocks are VERY different.
The top one gives you a flip-flop while the bottom one gives you a latch with a multiplexer with the CLK as the select signal.
The critical difference between the two blocks is that the top one is a synchronous block i.e. the posedge clk part while the bottom one is asynchronous with the CLK level, not edge.
A Verilog simulator could do left-hand sampling of CLK, effectively making the case(CLK) version the same as a negedge CLK flop. Otherwise the simulator will treat it like a posedge CLK flop. It really depends on how it is handled in the scheduler of the specific simulator (or how a particular synthesizer will process it).
The most common coding styles all use the first form. It is explicitly clear to the synthesizer and to anyone reading the code that state is intended to be a flip-flop triggered on the positive clock edge.
There is also a simulation performance difference. The posedge CLK version performs 2 CPU operations every clock period, while the case(CLK) version performs 6 CPU operations every clock period. Granted, in this example the difference is insignificant, but in large designs poor coding used everywhere will add up to hours of wasted simulation time.