vivado simulation error: Iteration limit 10000 is reached - verilog

While I was trying to run the simulation in vivado, I got:
ERROR: Iteration limit 10000 is reached. Possible zero delay
oscillation detected where simulation time can not advance. Please
check your source code. Note that the iteration limit can be changed
using switch -maxdeltaid. Time: 10 ns Iteration: 10000
I don't have any initial statement in my module being tested.
Could anybody point out where the problem could be?
`timescale 1ns / 1ps
module mulp(
input clk,
input rst,
input start,
input [4:0] mplier, // -13
input [4:0] mplcant, // -9
output reg done,
output [9:0] product
);
parameter N = 6;
parameter Idle = 2'b00;
parameter Load = 2'b01;
parameter Oper = 2'b10;
parameter Finish = 2'b11;
reg done_r;
reg [N-1:0] A, A_r, B, B_r;
reg [1:0] state, state_r;
reg [2:0] count, count_r;
wire [N-2:0] C, C_comp;
reg [N-2:0] C_r;
assign C = mplcant; assign C_comp = {~C + 1};
assign product = {A_r[N-2:0], B_r[N-2:0]};
always #(posedge clk) begin
if (rst) begin
state_r <= Idle;
count_r <= 0;
done_r <= 0;
A_r <= 0;
B_r <= 0;
end else begin
state_r <= state;
count_r <= count;
done_r <= done;
A_r <= A;
B_r <= B;
end // if
end // always
always #(*) begin
state = state_r;
count = count_r - 1; // count: 6
done = done_r;
A = A_r;
B = B_r;
case (state)
Idle: begin
if (start) begin
state <= Load;
end // if
end
Load: begin
A = 0; B = {mplier, 1'b0}; count = N; // start at 6
state = Oper;
end
Oper: begin
if (count == 0)
state = Finish;
else begin
case (B[1:0])
2'b01: begin
// add C to A
A = A_r + {C[N-2], C[N-2:0]};
// shift A and B
A = {A_r[N-1], A_r[N-1:1]};
B = {A_r[0], B_r[N-1:1]};
end
2'b10: begin
A = A_r + {C_comp[N-2], C_comp[N-2:0]};
A = {A_r[N-1], A[N-1:1]};
B = {A_r[0], B_r[N-1:1]};
end
(2'b00 | 2'b11): begin
A = {A_r[N-1], A[N-1:1]};
B = {A_r[0], B_r[N-1:1]};
end
default: begin
state = Idle; done = 1'bx; // error
end
endcase
end // else
end // Oper
Finish: begin
done = 1;
state = Idle;
end // Finish
default: begin
done = 1'bx;
state = Idle;
end
endcase
end // always
endmodule

You have a combinational loop. You are sampling and driving the state signal in the combinational always block. Typically, you sample the registered state variable (state_r in your code) in an FSM. Change:
case (state)
to:
case (state_r)
Unrelated, but you should use all blocking assignments in the combo block (not a mixture). Change:
state <= Load;
to:
state = Load;

Related

FSM is double incrementing

I have a newbie FSM Verilog question. In the below code, around line 96, I am incrementing led_num with the non blocking led_num <= led_num+1; However, because of the way my code is setup I have a non blocking curr_state <= next_state at line 82, and then I am doing a next_state <= "NEXT STATE" in every one of my case statements. This is causing me to be in each state 2 clocks and is causing my led_num <= led_num+1 to execute twice, and I want just a single time.
How do I go about fixing this so I am in each state a single clock or how do I prevent the led_num being incremented by two each time? I had one person tell me to non blocking all of the time, but some examples mix them and I am pretty sure I need to move my nextState to use a blocking assignment but I have tried several things so far and I keep messing it up.
`timescale 1ns / 1ps
module ws2812_b(clk, rst, send_the_data, led_data_out);
input clk;
input rst;
input send_the_data;
output reg led_data_out;
parameter NUMBER_LEDS = 5;
parameter ADDRESS_BITS =8;
parameter NUM_BITS = 24;
parameter STATE_BITS = 4;
parameter T0_HIGH_0_COUNT = 6; // .35us (System clock = 12Mhz, 83.3 ns/cycle.
parameter T0_HIGH_1_COUNT = 16; // .9us
parameter CYCLE_COUNT = (T0_HIGH_0_COUNT+T0_HIGH_1_COUNT); //.9+.35
parameter RESET_COUNT = 1500; // 50us
localparam ST_IDLE = 4'd0,
ST_GET_WORD = 4'd1,
ST_SEND_START = 4'd2,
ST_UPDATE_WORD = 4'd3,
ST_DELAY_0 = 4'd4,
ST_DRIVE_0 = 4'd5,
ST_DELAY_1 = 4'd6,
ST_CHECK_WORD_DONE = 4'd7,
ST_SEND_RESET = 4'd8,
ST_NDEF_9 = 4'd9,
ST_NDEF_10 = 4'd10,
ST_NDEF_11 = 4'd11,
ST_DELAY_1S = 4'd12,
ST_NDEF_13 = 4'd13,
ST_NDEF_14 = 4'd14,
ST_NDEF_15 = 4'd15;
reg [STATE_BITS-1:0]currState;
reg [STATE_BITS-1:0]nextState;
reg [14:0] delay_value;
reg [NUM_BITS-1:0] led_word;
reg [5:0] bit_num;
reg [13:0] delay0, delay1;
reg [8:0] led_num;
`define rundelay currState <= ST_DELAY_1S
`define returndelay currState <= nextState
always #(posedge clk, posedge rst)
begin
if (rst) begin
currState <= ST_IDLE;
nextState <= ST_IDLE;
led_data_out <=0;
led_word <= 0;
bit_num <= 0;
delay0 <= 0;
delay1 <=0;
delay_value <=0;
led_num <= 0;
end
else
begin
currState <= nextState;
case (currState)
ST_IDLE: //0
begin
led_data_out <= 0;
nextState <= ST_GET_WORD;
led_num <= 0;
end
ST_GET_WORD: //1
begin
led_word <= 24'h335599;
bit_num <= 0;
nextState <= ST_SEND_START;
led_num <= led_num + 1 ; // ***** DOUBLE INCREMENTS ****
end
ST_SEND_START: //2
begin
led_data_out <= 1;
if (led_word[0]==1)
begin
delay0 <= T0_HIGH_1_COUNT;
delay1 <= (CYCLE_COUNT - T0_HIGH_1_COUNT + 1);
end else
begin
delay0 <= T0_HIGH_0_COUNT;
delay1 <= (CYCLE_COUNT - T0_HIGH_0_COUNT + 1);
end
nextState <= ST_UPDATE_WORD;
end
ST_UPDATE_WORD: //3
begin
led_word[23:0] <= {1'b0,led_word[23:1]};
bit_num <= bit_num+1;
nextState <= ST_DRIVE_0;
delay_value <= delay0;
`rundelay;
end
ST_DRIVE_0: //5
begin
led_data_out <= 0;
nextState <= ST_CHECK_WORD_DONE;
delay_value <= delay1;
`rundelay;
end
ST_CHECK_WORD_DONE: //4
begin
if (bit_num<24)
begin
nextState <= ST_SEND_START;
end else
begin
if (led_num<3)
begin
nextState <= ST_GET_WORD;
end else
begin
nextState <= ST_SEND_RESET;
end
end
end
ST_SEND_RESET:
begin
led_data_out <= 0;
nextState <= ST_IDLE;
delay_value <= 2000;
`rundelay;
end
ST_DELAY_1S: // Delay of ~1sec. Note that the total delay is one clock cycle more because of the state transition.
begin
if (delay_value)
begin
delay_value <= delay_value -1; // as long as delay_value is larger than zero ( bitwise or)
// decrement (decrementing counters synthesize smaller than
// incrementing counters as no comparator is needed, only a
// zero check which is a bitwise or.
currState <= currState;
end
else
begin
//currState <= nextState; // if it hits zero ; throw the target state into currentstate.
`returndelay;
end
end
default:
begin
currState <= ST_IDLE;
end
endcase
end
end
endmodule
As Greg pointed out, FSMs are implemented one of two ways:
Single always block (like you've implemented) In this case, every clock runs through the "always block" once and you only need one "State" variable. All variables are "non-blocking" ("<=" symbol) which implies that all statements are executed in parallel.
Double always blocks. In this case, the first always block is purely combinational, which is fully "blocking" assigns so that you can have sequential arithmetic to calcuate your final value. The second always block behaves like a set of registers to let the whole state-machine "step" to the next state on the clock edge. This setup requires "currState" and "nextState". The first combinational always block calculates "nextState" from "currState". The second (sequential) always block simply transfers nextState to currState on a clock edge.
In your case, it looks like you're trying to jump to a "delay" state for a certain number of cycles and then jump back to the next state in the sequence like a subroutine in C code that delays.
While I haven't run it, I think the following is closer to what you're trying to do. When you jump to ST_DELAY_1S, you need to save the state you want to return to in a separate variable (I used afterDelayState) so that you can spin in the delay state and then return (much like the instruction pointer on the stack) to the proper place.
`timescale 1ns / 1ps
module ws2812_b(clk, rst, send_the_data, led_data_out);
input clk;
input rst;
input send_the_data;
output reg led_data_out;
parameter NUMBER_LEDS = 5;
parameter ADDRESS_BITS =8;
parameter NUM_BITS = 24;
parameter STATE_BITS = 4;
parameter T0_HIGH_0_COUNT = 6; // .35us (System clock = 12Mhz, 83.3 ns/cycle.
parameter T0_HIGH_1_COUNT = 16; // .9us
parameter CYCLE_COUNT = (T0_HIGH_0_COUNT+T0_HIGH_1_COUNT); //.9+.35
parameter RESET_COUNT = 1500; // 50us
localparam ST_IDLE = 4'd0,
ST_GET_WORD = 4'd1,
ST_SEND_START = 4'd2,
ST_UPDATE_WORD = 4'd3,
ST_DELAY_0 = 4'd4,
ST_DRIVE_0 = 4'd5,
ST_DELAY_1 = 4'd6,
ST_CHECK_WORD_DONE = 4'd7,
ST_SEND_RESET = 4'd8,
ST_NDEF_9 = 4'd9,
ST_NDEF_10 = 4'd10,
ST_NDEF_11 = 4'd11,
ST_DELAY_1S = 4'd12,
ST_NDEF_13 = 4'd13,
ST_NDEF_14 = 4'd14,
ST_NDEF_15 = 4'd15;
reg [STATE_BITS-1:0] State;
reg [STATE_BITS-1:0] afterDelayState;
reg [14:0] delay_value;
reg [NUM_BITS-1:0] led_word;
reg [5:0] bit_num;
reg [13:0] delay0, delay1;
reg [8:0] led_num;
//`define rundelay currState <= ST_DELAY_1S
//`define returndelay currState <= nextState
always #(posedge clk, posedge rst)
begin
if (rst) begin
State <= ST_IDLE;
afterDelayState <= ST_IDLE;
led_data_out <=0;
led_word <= 0;
bit_num <= 0;
delay0 <= 0;
delay1 <=0;
delay_value <=0;
led_num <= 0;
end
else
begin
case (State)
ST_IDLE: //0
begin
led_data_out <= 0;
State <= ST_GET_WORD;
led_num <= 0;
end
ST_GET_WORD: //1
begin
led_word <= 24'h335599;
bit_num <= 0;
State <= ST_SEND_START;
led_num <= led_num + 1 ; // ***** DOUBLE INCREMENTS ****
end
ST_SEND_START: //2
begin
led_data_out <= 1;
if (led_word[0]==1)
begin
delay0 <= T0_HIGH_1_COUNT;
delay1 <= (CYCLE_COUNT - T0_HIGH_1_COUNT + 1);
end else
begin
delay0 <= T0_HIGH_0_COUNT;
delay1 <= (CYCLE_COUNT - T0_HIGH_0_COUNT + 1);
end
State <= ST_UPDATE_WORD;
end
ST_UPDATE_WORD: //3
begin
led_word[23:0] <= {1'b0,led_word[23:1]};
bit_num <= bit_num+1;
State <= ST_DRIVE_0;
delay_value <= delay0;
// delay "delay0" cycles and then jump to ST_DRIVE_0
delay_value <= delay0;
State <= ST_DELAY_1S;
afterDelayState <= ST_DRIVE_0;
end
ST_DRIVE_0: //5
begin
led_data_out <= 0;
// delay "delay1" cycles and then jump to ST_CHECK_WORD_DONE
delay_value <= delay1;
State <= ST_DELAY_1S;
afterDelayState <= ST_CHECK_WORD_DONE;
end
ST_CHECK_WORD_DONE: //4
begin
if (bit_num<24)
begin
State <= ST_SEND_START;
end else
begin
if (led_num<3)
begin
State <= ST_GET_WORD;
end else
begin
State <= ST_SEND_RESET;
end
end
end
ST_SEND_RESET:
begin
led_data_out <= 0;
// delay 2000 cycles and then jump to ST_IDLE
delay_value <= 2000;
State <= ST_DELAY_1S;
afterDelayState <= ST_IDLE;
end
// This is a special "delay" loop that jumps to a specified "afterDelayState" when the delay_value runs to zero
ST_DELAY_1S: // Delay of ~1sec. Note that the total delay is one clock cycle more because of the state transition.
begin
if (delay_value)
begin
// every clock cycle stay in the ST_DELAY_1S state until the "delay_value" runs to zero
delay_value <= delay_value -1; // as long as delay_value is larger than zero ( bitwise or)
// decrement (decrementing counters synthesize smaller than
// incrementing counters as no comparator is needed, only a
// zero check which is a bitwise or.
//
// no change in State, so we stay here
// State <= State
end
else
begin
//after it hits zero, jump to the state we stored in afterDelayState
State <= afterDelayState;
end
end
default:
begin
State <= ST_IDLE;
end
endcase
end
end
endmodule
In the code above, each state has only one State <= (new state) command.
We only assign the afterDelayState variable when jumping to the ST_DELAY_1S "subroutine" so we know where to go back to.
I took out the `defines to make all assignments more explicit.
One side recommendation is to install VSCode and iverilog so that you can do syntax checking, auto-formatting and compile checking inside the editor. This quickly helped me clean things up.

Cause of inferred latches (not else or default statement) in Verilog

Quartus is telling me that I have inferred latches for input_val, ii, output_val, delayed, and addr_to_rom. I have looked at previous posts and made changes so that all my if statements have an else component and that my case statement has a default option.
What else could be causing all of these inferred latches?
(updated code below)
module simple_fir(clk, reset_n, output_val);
parameter data_width = 8;
parameter size = 1024;
input wire clk;
input wire reset_n;
//input reg [(data_width):0] input_val;
reg [(data_width):0] input_val;
output reg [data_width:0] output_val;
reg [(data_width):0] delayed;
reg [data_width:0] to_avg;
reg [9:0] ii ;
reg [9:0] i ;
reg [data_width:0] val;
reg [data_width:0] output_arr [(size-1):0];
logic [(data_width):0] data_from_rom; //precalculated 1 Hz sine wave
logic [9:0] addr_to_rom;
initial delayed = 0;
initial i = 0;
initial ii = 0;
//port map to ROM
rom_data input_data(
.clk(clk),
.addr(addr_to_rom), //text file?
.data(data_from_rom)
);
//Moore FSM
localparam [3:0]
s0=0, s1 = 1, s2 = 2, s3 = 3, s4 = 4, s5 = 5, s6 = 6;
reg [3:0] state_reg, state_next;
initial state_next = 0;
always #(posedge clk, negedge reset_n) begin //posedge clk, reset_n
if (reset_n == 'b0) begin //reset is active low
state_reg <= s0;
end else begin
state_reg <= state_next;
end
end
always #* begin
state_next = state_reg; // default state_next
case (state_reg)
s0 : begin //initial state, reset state
if (!reset_n) begin
output_val = 0;
delayed = 0;
to_avg= 0;
i = 0;
ii = 0;
end else begin
state_next = s1;
end
end
s1 : begin
if (ii>(size-2)) begin
ii = 0;
end else begin
addr_to_rom = ii;
end
input_val = input_val;
delayed = delayed;
state_next = s2;
end
s2 : begin
input_val = data_from_rom;
ii = ii+1;
delayed = delayed;
state_next = s3;
end
s3 : begin
delayed = input_val;
state_next = s4;
end
s4 : begin
addr_to_rom = ii;
input_val = input_val;
delayed = delayed;
state_next = s5;
end
s5 : begin
input_val = data_from_rom;
delayed = delayed;
//i=i+1;
//ii <= ii+1;
state_next = s6;
end
s6 : begin
to_avg = input_val + delayed; //summing two values
val = (to_avg >> 1); //taking the average
output_arr[ii-1] = val; //indexing starts on [2]
output_val = val;
state_next = s0;
input_val = input_val;
end
default: begin
addr_to_rom = addr_to_rom;
data_from_rom = data_from_rom;
delayed = delayed;
input_val = input_val;
to_avg = to_avg;
val = val;
output_val = output_val;
ii = ii;
end
endcase
end
endmodule
You have 6 states in your combo always block, but you are assigning to delayed in only 3 of them. This implies that you want to retain state, which infers latches. You should make sure delayed is assigned some value in all states (and in all branches of if/else, if applicable.
Also, you have an incomplete sensitivity list. You likely want to change:
always #(state_reg) begin
to:
always #* begin
#* is the implicit sensitivity list, which triggers the always block whenever a change occurs to any signal on the RHS of an assignment.
you have to look at the ways state machines are programmed. Usually states are calculated using flop and the final assignment is a combinatorial. You have it vice versa. Therefore you have issues. Something like the following should work for you. Sorry, i did not test it. The code needs to use non-blocking assignments except in the place where you caculate an intermediate value.
always #*
state_reg = state_next;
always #(posedge clk, negedge reset_n) begin //posedge clk, reset_n
if (reset_n == 'b0) begin //reset is active low
state_next <= s0;
end
else begin
case (state_reg)
S0: begin
output_val <= 0;
delayed <= 0;
i <= 0;
ii <= 0;
end
s1 : begin
if (ii>(size-2)) begin
ii <= 0;
end else begin
addr_to_rom <= ii;
end
input_val <= input_val;
delayed <= delayed;
state_next <= s2;
end
s2 : begin
input_val <= data_from_rom;
ii <= ii+1;
state_next <= s3;
end
s3 : begin
delayed <= input_val;
state_next <= s4;
end
s4 : begin
addr_to_rom <= ii;
state_next <= s5;
end
s5 : begin
input_val <= data_from_rom;
state_next <= s6;
end
s6 : begin
// calculating intermediate value, need blocking assignments here
to_avg = input_val + delayed; //need blocking here
val = (to_avg >> 1); //need blocking here
output_arr[ii-1] <= val; //indexing starts on [2]
output_val <= val;
state_next <= s0;
end
endcase
end // else
end // always
I noticed you used a coding style that a default assignment state_next = state_reg is placed at the very top of the always block. That makes state_next always have a legal driver if it is not explicitly assigned. This is also required for other combinational signals. Otherwise, latch inferred. Besides, the signal data_from_rom, which I assume is an output from module rom_data, has mutiple drivers.

how to properly connect receiver, sender and top and make them dependent from each other - RS232

I've got a simple project which requires me to write a code for RS232 receiver and sender, then put them together and, finally, test if it works properly. I've prepared code for both sender and receiver (and also connecting block - top). My problem is that I don't know how to connect them, so they could work with each other properly.
The main issue is that I can't "transfer" data from data_o to data_i because of the fact that one is reg and second - wire. I wouldn't like to use inout for these purposes. I can't figure out any possible modifications to make it work.
Another issue is putting some flags that could kind of follow idea like this: if receiving -> not sending, if sending -> not receiving.
Here's my code:
top.v
`timescale 1ns / 1ps
module top (
clk_i,
rst_i,
RXD_i,
data_i,
TXD_o,
data_o
);
input clk_i;
input rst_i;
input RXD_i;
output TXD_o;
//the problem is here, can't data_i <= data_o because output is reg
input [7:0] data_i;
output [7:0] data_o;
receiver r1(clk_i, RXD_i, data_o);
sender s1(clk_i, data_i, TXD_o);
endmodule
receiver.v
`timescale 1ns / 1ps
module receiver (
clk_i,
RXD_i,
data_o
);
//inputs and outputs
input clk_i;
input RXD_i;
output reg [7:0] data_o;
//counter values
parameter received_bit_period = 5208;
parameter half_received_bit_period = 2604;
//state definitions
parameter ready = 2'b00;
parameter start_bit = 2'b01;
parameter data_bits = 2'b10;
parameter stop_bit = 2'b11;
//operational regs
reg [12:0] counter = 0; //9765.625Hz
reg [7:0] received_data = 8'b00000000;
reg [3:0] data_bit_count = 0;
reg [1:0] state = ready;
//latching part
reg internal_RXD;
always #(posedge clk_i) //latch RXD_i value to internal_RXD
begin
internal_RXD = RXD_i;
end
always #(clk_i) //receiving process
begin
case (state)
ready :
begin
if (internal_RXD == 0)
begin
state <= start_bit;
counter <= counter + 1;
end
else
begin
state <= ready;
counter <= 0;
data_bit_count <= 0;
end
end
start_bit :
begin
if (counter == half_received_bit_period)
begin
if (internal_RXD == 0)
begin
state <= data_bits;
counter <= 0;
end
end
else
begin
state <= start_bit;
counter <= counter + 1;
end
end
data_bits :
begin
if (counter == received_bit_period)
begin
received_data[data_bit_count] <= internal_RXD;
data_bit_count <= data_bit_count + 1;
counter <= 0;
if (data_bit_count == 8)
state <= stop_bit;
end
else
counter <= counter + 1;
end
stop_bit:
begin
counter <= counter + 1;
if (counter == received_bit_period)
begin
state <= ready;
data_o <= received_data;
end
end
endcase
end
endmodule
sender.v
`timescale 1ns / 1ps
module sender (
clk_i,
data_i,
TXD_o
);
//inputs and outputs
input clk_i;
input [7:0] data_i;
output reg TXD_o;
//counter values
parameter received_bit_period = 5208;
parameter half_received_bit_period = 2604;
//state definitions
parameter ready = 1'b0;
parameter data_bits = 1'b1;
//operational regs
reg [12:0] counter = 0; //9765.625Hz
reg [9:0] framed_data = 0;
reg [3:0] data_bit_count = 0;
reg state = ready;
always #(posedge clk_i) //sending process
begin
case (state)
ready :
begin // flag needed?
state <= data_bits;
TXD_o <= 1;
framed_data[0] <= 1'b0;
framed_data[8:1] <= data_i;
framed_data[9] <= 1'b1;
counter <= 0;
end
data_bits :
begin
counter <= counter + 1;
if (data_bit_count == 10)
begin // flag needed?
state <= ready;
data_bit_count <= 0;
TXD_o <= 1;
end
else
begin
if (counter == received_bit_period)
begin
data_bit_count <= data_bit_count + 1;
end
TXD_o <= framed_data[data_bit_count];
end
end
endcase
end
endmodule
You don't!
In all CPU's and FPGA nowadays the read and write data path are separate buses. You will find that also with all CPU cores. Have a look at AXI or AHB bus protocols from ARM.
What is more worrying is the way you have implemented your functions. You would at least need some 'data valid' signal for the transmitter to know when there is valid data to transmit and for the receive when valid data has arrived.
Even that is not enough because for the TX the connecting logic would need to know when the data has been send and the next byte can go out.
You need to make a (preferably standard) CPU interface which talks to your UART. (For beginner I would not use AXI.)
As to your flags: they would come from within the CPU interface.
Last: a UART should be capable of transmitting and receiving at the same time.

Verilog : uart on FPGA and simulation behavioural differences

EDIT: removed some redundancies, moved all assignments to non-blocking, inserted a reset mapped as one of the input buttons of my FPGA... but when I implement the code, it starts transmitting the same wrong character and gets stuck in a single state of my machine.
Post Synthesis and Post-Implementation simulations are identical,$time-wise
module UART (reset_button, sysclk_p, sysclk_n,TxD, Tx_busy, Tx_state_scope_external);
input reset_button, sysclk_p, sysclk_n;
output wire TxD, Tx_busy;
output wire [1:0]Tx_state_scope_external;
//internal communications signals
wire clk_internal;
//buffer unit control signals
wire [7:0]TxD_data_internal;
wire Tx_start_internal;
wire Tx_busy_internal;
wire reset_flag;
reset_buf RESET_BUFF (.reset_internal (reset_flag), .reset (reset_button));
differential_CK CK_GENERATION (.sysclk_p (sysclk_p), .sysclk_n(sysclk_n), .clk(clk_internal));
output_Dbuffer OB1 (.reset (reset_flag), .RTS_n (Tx_busy_internal), .clk(clk_internal), .TX_trigger (Tx_start_internal), .TX_data(TxD_data_internal));
async_transmitter TX1 (.reset (reset_flag), .clk (clk_internal), .TxD_data(TxD_data_internal), .Tx_start (Tx_start_internal), .TxD(TxD), .Tx_busy_flag(Tx_busy_internal), .Tx_state_scope(Tx_state_scope_external));
obuf_TX O_TX1( .Tx_busy(Tx_busy), .Tx_busy_flag(Tx_busy_internal));
endmodule
module reset_buf (
output reset_internal,
input reset
);
// IBUF: Single-ended Input Buffer
// 7 Series
// Xilinx HDL Libraries Guide, version 14.7
IBUF #(
.IBUF_LOW_PWR("TRUE"), // Low power (TRUE) vs. performance (FALSE) setting for referenced I/O standards
.IOSTANDARD("DEFAULT") // Specify the input I/O standard
) IBUF_inst (
.O(reset_internal), // Buffer output
.I(reset) // Buffer input (connect directly to top-level port)
);
// End of IBUF_inst instantiation
endmodule
module differential_CK(
input sysclk_p,
input sysclk_n,
output clk
);
// IBUFGDS: Differential Global Clock Input Buffer
// 7 Series
// Xilinx HDL Libraries Guide, version 14.7
IBUFGDS #(
.DIFF_TERM("FALSE"), // Differential Termination
.IBUF_LOW_PWR("TRUE"), // Low power="TRUE", Highest performance="FALSE"
.IOSTANDARD("DEFAULT") // Specify the input I/O standard
) IBUFGDS_inst (
.O(clk), // Clock buffer output
.I(sysclk_p), // Diff_p clock buffer input (connect directly to top-level port)
.IB(sysclk_n) // Diff_n clock buffer input (connect directly to top-level port)
);
// End of IBUFGDS_inst instantiation
endmodule
module output_Dbuffer (
input reset,
input RTS_n, //TX_BUSY flag of the transmitter is my ready to send flag
input clk, //ck needed for the FSM
output wire TX_trigger, //TX_START flag of the transmitter now comes from THIS unit instead of Receiver
output wire [7:0]TX_data //byte for transmission
);
//internal variables
reg [7:0] mem [0:9]; //memory init, 10 * 8 bit locations
integer m, n, i, j, k ; //M = row [a.k.a. bytes], N = column [a.k.a. single bits]
reg TX_trigger_int;
reg [7:0] TX_data_int, TX_complete;
//reg sum256_ok;
reg [7:0]checksum_buff ;
//buffer FSM required variables
localparam //state enumeration declaration
BUF_IDLE = 3'b000,
BUF_START = 3'b001,
BUF_BYTES = 3'b010,
BUF_BUSY = 3'b011,
BUF_TX_CHECKSUM = 3'b100;
reg [2:0] buf_state; //2 bits for 4 states
//static assignments of OUTPUTS : Transmission Flag and Transmission Data (content)
assign TX_trigger = TX_trigger_int;
assign TX_data = TX_data_int;
//Block for transmitting [here I manage the TX_Data and TX_Trigger functionality]
always #(posedge clk)
begin
if (reset)
begin
buf_state <= BUF_IDLE;
TX_trigger_int <= 0;
TX_data_int <= 8'b00000000;
end
else case (buf_state)
BUF_IDLE:
begin
TX_trigger_int <= 0;
TX_data_int <= 8'b00000000;
m <=0;
n <=0;
i <=0;
j <=0;
mem[9] <= 8'b01010001; //81
mem[8] <= 8'b01000000; //64
mem[7] <= 8'b00110001; //49
mem[6] <= 8'b00100100; //36
mem[5] <= 8'b00011001; //25
mem[4] <= 8'b00010000; //16
mem[3] <= 8'b00001001; //9
mem[2] <= 8'b00000100; //4
mem[1] <= 8'b00000001; //1
mem[0] <= 8'b00000010;//2
checksum_buff <= 8'd31;
//check if the TX is not busy
if (RTS_n == 0) buf_state <= BUF_START;
end
BUF_START:
begin
TX_trigger_int <= 0;
if ((i == 0) || ( (j - i) > 1 )) buf_state <= BUF_BYTES;
else begin
$display ("BUFFER BUSY #time:", $time);
buf_state <= BUF_BUSY;
end
end
BUF_BYTES:
begin
//check if the TX is busy
if (RTS_n==0)
begin
// TX_trigger_int = 1; 21.09 MOVED THE TRIGGER INSIDE THE ELSE N LINE 498
if (j > 9)
begin
TX_trigger_int <= 0;
buf_state <= BUF_TX_CHECKSUM;
end
else begin
TX_data_int <= mem[j];
TX_trigger_int <= 1;
j <= j+1;
//TX_trigger_int =0;
buf_state <= BUF_START;
end
end
else buf_state <= BUF_BYTES;
end
BUF_BUSY:
begin
if (RTS_n == 0)
begin
$display ("BUFFER AVAILABLE AGAIN #time:", $time);
buf_state <= BUF_START;
end
end
BUF_TX_CHECKSUM:
begin
if (RTS_n==0) begin
TX_data_int <= checksum_buff;
// sum256_ok = 0;
TX_trigger_int <= 1;
buf_state <= BUF_IDLE;
end
end
//default: buf_state <= BUF_IDLE;
endcase
end
endmodule
module async_transmitter(
input clk,
input reset,
//differential clock pair
input [7:0] TxD_data,
input Tx_start, // it is ==TX_TRIGGER
output wire TxD, //bit being sent to the USB
output reg Tx_busy_flag,
output wire [1:0]Tx_state_scope
);
localparam //state enumeration declaration
TX_IDLE = 2'b00,
TX_START_BIT = 2'b01,
TX_BITS = 2'b10,
TX_STOP_BIT = 2'b11;
parameter ClkFrequencyTx = 200000000; // 200MHz
parameter BaudTx = 9600;
reg [1:0] Tx_state; //2 bits for 4 states
integer bit_counter; //bit counter variable
reg [7:0]TxD_data_int, TxD_int;
integer i; //vector index for output data
wire TXSTART_Trigger;
StartDetectionUnitTX SDU_TX (.clk(clk), .state (Tx_state), .signal_in (Tx_start), . trigger (TXSTART_Trigger));
wire BitTick;
BaudTickGen #(ClkFrequencyTx, BaudTx) as (.clk(clk), .trigger (TXSTART_Trigger), .tick(BitTick));
//BitTick is 16times the frequency generated during the RX portion
assign TxD = TxD_int;
always #(posedge clk) begin
if (reset)
begin
Tx_state <= TX_IDLE;
TxD_int <= 1;
Tx_busy_flag <=0;
end
else case (Tx_state)
TX_IDLE:
begin //reinitialization and check on the trigger condition
bit_counter <= 0;
TxD_data_int <= 8'b00000000;
i <= 0;
TxD_int <= 1; //idle state
Tx_busy_flag <= 0;
if (TXSTART_Trigger) begin
Tx_state <= TX_START_BIT;
TxD_data_int <= TxD_data;
Tx_busy_flag <= 1;
bit_counter <= 8;
end
end
TX_START_BIT:
begin
if (BitTick)
begin
TxD_int <= 0 ; //start bit is a ZERO logical value
Tx_state <= TX_BITS;
end
end
TX_BITS:
begin
if (BitTick)
begin
bit_counter <= bit_counter -1;
TxD_int <= TxD_data_int[i];
// $display ("ho trasmesso dalla UART un bit di valore %b al tempo: ", TxD, $time);
i <= i+1;
if (bit_counter < 1) Tx_state <= TX_STOP_BIT;
end
end
TX_STOP_BIT:
begin
if (BitTick) begin
TxD_int <= 1; //STOP BIT is a logical '1'
Tx_busy_flag <= 0;
Tx_state <= TX_IDLE;
end
end
// default: Tx_state <= TX_IDLE;
endcase
end
assign Tx_state_scope = Tx_state;
endmodule
module obuf_TX (
output Tx_busy,
input Tx_busy_flag
);
// OBUF: Single-ended Output Buffer
// 7 Series
// Xilinx HDL Libraries Guide, version 14.7
OBUF #(
.DRIVE(12), // Specify the output drive strength
.IOSTANDARD("DEFAULT"), // Specify the output I/O standard
.SLEW("SLOW") // Specify the output slew rate
) OBUF_inst (
.O(Tx_busy), // Buffer output (connect directly to top-level port)
.I(Tx_busy_flag) // Buffer input
);
// End of OBUF_inst instantiation
endmodule
module StartDetectionUnitTX ( //detects a rising edge of the start bit == TRANSMISSION START, during the IDLE state = 0000
input clk, [1:0]state,
input signal_in,
output trigger
);
reg signal_d;
always #(posedge clk)
begin
signal_d <= signal_in;
end
assign trigger = signal_in & (!signal_d) & (!state);
endmodule
module BaudTickGen (
input clk, trigger,
output tick //generates a tick at a specified baud rate *oversampling
);
parameter ClkFrequency = 200000000; //sysclk at 200Mhz
parameter Baud = 9600;
parameter Oversampling = 1;
//20832 almost= ClkFrequency / Baud, to make it an integer number
integer counter = (20833/Oversampling)-1; //-1 so counter can get to 0
reg out;
always #(posedge clk)
begin
if (trigger)
begin
counter <= (20833/Oversampling)-1; //-1 so counter can get to 0
out <= 1;
end
if (counter == 0)
begin
counter <= (20833/Oversampling)-1; //-1 so counter can get to 0
out <= 1;
end
else begin
counter <= counter-1;
out <= 0;
end
end
assign tick = out;
endmodule
My FPGA is a Virtex-7 VC707 and I'm using Vivado for my design flow.
Here I am attaching an image of my looping error.
error image
What have you done? Have you just simulated the code? Are you saying that it fails on the board, but the post-implementation sim is Ok?
A difference between pre- and post-implementation sim could point to a race condition. Get rid of all your blocking assignments, replace with NBAs (why did you use blocking assignments?)
Don't go to Chipscope - it's just a red flag that you don't know what you're doing
The code is a mess - simplify it. The Xilinx-specific stuff is irrelevant - get rid of it if you want anyone to look at it, fix comments (2-bit state?!), fix your statement about getting stuck in '10', etc
Have you run this through Vivado? Seriously? You have multiple drivers on various signals. Get rid of the initial block, use a reset. Initialise the RAM in a way which is understood by the tools. Even if Vivado is capable of initialising stuff using a separate initial block, don't do it
Get rid of statements like 'else Tx_state = TX_IDLE' in the TX_IDLE branch - they're redundant, and just add verbosity
Write something which fails stand-alone, and post it again.

Behavioral algorithms (GCD) in Verilog - possible?

I want to write a module for GCD computing, using extended Euclidean algorithm. But the main problem is that I completely don't know how to do that without getting to the lowest (RTL) level. What I mean is to have FSM with three states:
IDLE (waiting for input)
COMPUTING (as many clock cycles as needed)
FINISHED (ready to read output).
However, when I try to separate FSM & computations into separate processes, like this:
module modinv(clk, reset, number, prime, finished, gcd, inverse_fail, inverse);
input [31:0] number, prime;
input wire clk, reset;
output integer gcd, inverse;
output reg finished, inverse_fail;
parameter [2:0] IDLE = 3'b001, COMPUTING = 3'b010, END = 3'b100;
reg [2:0] state, state_next;
integer a, b, c, q, p, r;
always # (posedge clk, posedge reset)
begin
if (reset == 1)
begin
state <= IDLE;
end
else
begin
state <= state_next;
end
end
always #(state or b)
begin
finished <= 0;
inverse_fail <= 0;
case (state)
IDLE:
begin
a <= number;
b <= prime;
p <= 1;
r <= 0;
state_next <= COMPUTING;
end
COMPUTING:
begin
c = a % b;
q = a / b;
a = b;
b = c;
r = p - q * r;
p = r;
if (b == 0)
begin
state_next <= END;
end
else
begin
state_next <= COMPUTING;
end
end
END:
begin
gcd <= a;
inverse <= p;
finished <= 1;
if (gcd != 1)
begin
inverse_fail <= 1;
end
end
endcase
end
endmodule
And when I try to put computation in the second process, in COMPUTING state case, it only works once - what is correct in means of verilog, because until computing is done, state doesn't change, so the process isn't called again.
However, when I put computation in the first process, there isn't any non-ugly way to limit the computations only to correct STATE, which results in wrong output (as soon as FSM is in FINISHED state, the output is already incorrect - one step further).
So, my question is - how to do it correctly without using FSM + datapath (low-level RTL) solution?
My current waveform:
You seem to be missing some clocked elements in your design.
From what I understand of your design, you seem to expect once the state goes to COMPUTING, it should keep iterating the values of a and b until b reaches 0. But the only thing you're actually clocking on a clock edge is the state variable, so there's no memory of a and b from one state to the next. If you want variables like a and b to have memory from one clock cycle to the next, then you need to latch these variables as well:
I made some modifications to your program, it might not be 100% correct, but you should see what I'm getting at. See if this makes sense how you do the combinational logic in the second block, but you register the values on the posedge so that you can use them at the start of the next clock cycle.
module modinv(clk, reset, number, prime, finished, gcd, inverse_fail, inverse);
input [31:0] number, prime;
input wire clk, reset;
output integer gcd, inverse;
output reg finished, inverse_fail;
parameter [2:0] IDLE = 3'b001, COMPUTING = 3'b010, END = 3'b100;
reg [2:0] state, state_next;
integer a, b, c, q, p, r;
integer a_next, b_next, p_next, r_next;
always # (posedge clk, posedge reset)
begin
if (reset == 1)
begin
state <= IDLE;
a <= 0;
b <= 0;
p <= 0;
r <= 0;
end
else
begin
state <= state_next;
a <= a_next;
b <= b_next;
p <= p_next;
r <= r_next;
end
end
always #* //just use the auto-triggered '#*' operator
begin
finished <= 0;
inverse_fail <= 0;
case (state)
IDLE:
begin
a_next <= number;
b_next <= prime;
p_next <= 1;
r_next <= 0;
state_next <= COMPUTING;
end
COMPUTING:
begin
c = a % b;
q = a / b;
a_next = b;
b_next = c;
r_next = p - q * r;
p_next = r;
if (b == 0)
begin
state_next <= END;
end
else
begin
state_next <= COMPUTING;
end
end
END:
begin
gcd <= a;
inverse <= p;
finished <= 1;
if (gcd != 1)
begin
inverse_fail <= 1;
end
end
endcase
end
endmodule

Resources