FIFO buffer in Verilog: generate with always blocks

I'm trying to write a universal FIFO buffer.
To make it universal, I used code like this:
genvar i;
generate
for(i=0;i<BusWidthIn;i=i+1) begin: i_buffin
always @(negedge clkin) begin
if (!full)
Buffer[wr_ptr+i] <= datain[i*BitPerWord+BitPerWord-1:i*BitPerWord];
end
end
endgenerate
In simulation it works properly, but in Quartus it gives
Error (10028): Can't resolve multiple constant drivers for net "Buffer[30][6]" at fifo.v(33) and so on.
All Code
module fifo_m(clkin,datain,clkout,dataout,full,empty);
parameter BusWidthIn = 3, //in 10*bits
BusWidthOut = 1, //in 10*bits
BufferLen = 4, // in power of 2 , e.g. 4 will be 2^4=16 bytes
BitPerWord = 10;
input clkin;
input [BusWidthIn*BitPerWord-1:0] datain;
input clkout;
output [BusWidthOut*BitPerWord-1:0] dataout;
output full;
output empty;
reg [BusWidthOut*BitPerWord-1:0] dataout;
reg [BitPerWord-1:0] Buffer [(1 << BufferLen)-1 : 0];
wire [BusWidthIn*BitPerWord-1:0] tbuff;
reg [BufferLen - 1 : 0] rd_ptr, wr_ptr;
wire [BufferLen - 1 : 0] cnt_buff;
wire full;
wire empty;
assign cnt_buff = wr_ptr > rd_ptr ? wr_ptr - rd_ptr : (1 << BufferLen) - rd_ptr + wr_ptr;
assign full = cnt_buff > (1 << BufferLen) - BusWidthIn;
assign empty = cnt_buff < BusWidthOut;
initial begin
rd_ptr = 0;
wr_ptr = 0;
end
genvar i;
generate
for(i=0;i<BusWidthIn;i=i+1) begin: i_buffin
always @(negedge clkin) begin
if (!full)
Buffer[wr_ptr+i] <= datain[i*BitPerWord+BitPerWord-1:i*BitPerWord];
end
end
endgenerate
always @(negedge clkin)
begin
if (!full)
wr_ptr = wr_ptr + BusWidthIn;
end
genvar j;
generate
for(j=0;j<BusWidthOut;j=j+1) begin : i_buffout
always @(posedge clkout) begin
dataout[j*BitPerWord+BitPerWord-1:j*BitPerWord] <= Buffer[rd_ptr+j];
end
end
endgenerate
always @(posedge clkout)
begin
if (!empty)
rd_ptr = rd_ptr + BusWidthOut;
end
endmodule
To solve this problem I think I must put the for loop inside the always block, but how can I do that?

I think the issue is that synthesis doesn't know that wr_ptr is always a multiple of 3, hence from the synthesis point of view 3 different always blocks can assign to each Buffer entry. I think you can recode your logic to assign just a single Buffer entry per always block.
genvar i, j;
generate
for(i=0;i < (1<<(BufferLen)); i=i+1) begin: i_buffin
for(j = (i%BusWidthIn);j == (i%BusWidthIn); j++) begin // a long way to write 'j = (i%BusWidthIn);'
always @(negedge clkin) begin
if (!full) begin
if (wr_ptr*BusWidthIn + j == i) begin
Buffer[i] <= datain[j*BitPerWord+BitPerWord-1:j*BitPerWord];
end
end
end
end
end
endgenerate
Also at http://www.edaplayground.com/x/23L (based off of Morgan's copy).
And, don't you need to add a valid signal into the fifo, or is the data actually always available to be pushed in ?

Other than the fact that the *_ptr registers in your code should be assigned with non-blocking assignments (<=), there really isn't anything wrong with your code.
If you want to assign Buffer with a for-loop inside of an always block, you can use the following:
integer i;
always @(negedge clkin) begin
if (!full) begin
for (i=0;i<BusWidthIn;i=i+1) begin: i_buffin
Buffer[wr_ptr+i] <= datain[i*BitPerWord +: BitPerWord];
end
wr_ptr <= wr_ptr + BusWidthIn;
end
end
datain[i*BitPerWord+BitPerWord-1:i*BitPerWord] will not compile in Verilog inside a procedural for loop because the MSB and LSB select bits are variables; Verilog requires a part-select whose width is constant. The +: operator is an indexed part-select (also known as a slice): it allows a variable base index with a constant width. It was introduced in IEEE Std 1364-2001 § 4.2.1. You can also read more about it in IEEE Std 1800-2012 § 11.5.1, or refer to previously asked questions: What is `+:` and `-:`? and Indexing vectors and arrays with +:.
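To make the indexed part-select concrete, here is a minimal simulation-only sketch. The module name part_select_demo and the constants BITS and WORDS are illustrative and not taken from the question; it only shows that a variable base with a constant width is accepted inside a procedural loop.

```verilog
// Simulation-only sketch of the indexed part-select operators.
// part_select_demo, BITS and WORDS are hypothetical names for illustration.
module part_select_demo;
    localparam BITS  = 10;
    localparam WORDS = 3;

    reg [WORDS*BITS-1:0] bus;
    reg [BITS-1:0]       word;
    integer i;

    initial begin
        // Pack three words: word 2 in the top bits, word 0 in the bottom bits.
        bus = {10'd3, 10'd2, 10'd1};

        // base +: width  selects bits [base+width-1 : base];
        // the base may be a variable, the width must be a constant.
        for (i = 0; i < WORDS; i = i + 1) begin
            word = bus[i*BITS +: BITS];
            $display("word %0d = %0d", i, word);   // 1, then 2, then 3
        end

        // base -: width  selects bits [base : base-width+1].
        word = bus[2*BITS-1 -: BITS];               // same bits as bus[19:10]
        $display("word 1 again = %0d", word);       // 2
        $finish;
    end
endmodule
```

The same expression form works on the right-hand side of the non-blocking assignment in the always block above.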

Related

Verilog cannot synthesize when using external counter inside generate block

I cannot synthesize this Verilog code in Vivado, although simulation runs correctly. I declare a localparam array and use an external counter variable, cnt1, inside a generate block to compute the address into that parameter. When I remove the cnt1 variable from the module1 instantiation, the design synthesizes. Any suggestions for solving this problem would be much appreciated.
module multiply_s1(
input clk,
input rst,
input [9:0]in,
input ena,
output [9:0]out);
localparam [0:24] pi_values = {5'h4, 5'h5, 5'h6, 5'h7, 5'h8};
reg [1:0] cnt1;//count CC times of GG coeffcient
always @(posedge clk or negedge rst) begin
if(rst == 0) begin
cnt1 <= 0;
end
else if(ena == 0) begin
cnt1 <= 0;
end
else begin
if (cnt1 == 3)
cnt1 <= 0;
else
cnt1 <= cnt1 + 1;
end
end
genvar i;
generate
for (i=0; i<2; i=i+1)
begin: mod1
module1 mod1(.clk(clk),
.rst(rst),
.multiplier(in[i*5 +: 5]),
.multiplicand(pi_values[(i + cnt1)*5 +: 5]),
.result(out[i*5 +: 5]));
end
endgenerate
endmodule
Without knowing what Vivado told you, I guess the error may be here:
[(i + cnt1)*5 +: 5]
cnt1 is a register whose value is only known at "runtime", so Vivado cannot know which value to use when bit-slicing the pi_values vector.
You would need something like this:
localparam [0:24] pi_values = {5'h4, 5'h5, 5'h6, 5'h7, 5'h8};
reg [1:0] cnt1;//count CC times of GG coeffcient
always @(posedge clk or negedge rst) begin
if(rst == 0)
cnt1 <= 0;
else if(ena == 0)
cnt1 <= 0;
else
cnt1 <= cnt1 + 1;
end
reg [0:24] pi_values_rotated;
always @* begin
case (cnt1)
0: pi_values_rotated = pi_values;
1: pi_values_rotated = {pi_values[5:24], pi_values[0:4]};
2: pi_values_rotated = {pi_values[10:24], pi_values[0:9]};
3: pi_values_rotated = {pi_values[15:24], pi_values[0:14]};
default: pi_values_rotated = pi_values;
endcase
end
genvar i;
generate
for (i=0; i<2; i=i+1)
begin: mod1
module1 mod1(.clk(clk),
.rst(rst),
.multiplier(in[i*5 +: 5]),
.multiplicand(pi_values_rotated[i*5 +: 5]),
.result(out[i*5 +: 5]));
end
endgenerate
pi_values_rotated would be the pi_values vector, as seen after the current value of cnt1 is applied. Then, you can use i as the sole value to generate your instances, which should be accepted now.
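Another way to express the same selection, sketched here under the assumption that each instance only needs one 5-bit coefficient, is an explicit multiplexer with constant slices. The module name coeff_mux and its ports are hypothetical; the constant is the pi_values vector from the question.

```verilog
// Hypothetical helper: returns coefficient number idx (0..4) from the
// packed constant. Every slice has constant bounds, so nothing depends
// on a run-time part-select of a parameter.
module coeff_mux (
    input  wire [2:0] idx,    // which coefficient to pick
    output reg  [4:0] coeff   // selected 5-bit coefficient
);
    localparam [0:24] pi_values = {5'h4, 5'h5, 5'h6, 5'h7, 5'h8};

    always @* begin
        case (idx)
            3'd0:    coeff = pi_values[0:4];    // 5'h4
            3'd1:    coeff = pi_values[5:9];    // 5'h5
            3'd2:    coeff = pi_values[10:14];  // 5'h6
            3'd3:    coeff = pi_values[15:19];  // 5'h7
            3'd4:    coeff = pi_values[20:24];  // 5'h8
            default: coeff = 5'h0;              // idx 5..7 is never used
        endcase
    end
endmodule
```

Inside the generate loop, each iteration could then declare `wire [2:0] idx = i + cnt1;`, instantiate coeff_mux, and connect its coeff output to the multiplicand port.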
Notice here:
else begin
if (cnt1 == 3)
cnt1 <= 0;
else
cnt1 <= cnt1 + 1;
end
Here cnt1 can be 0, 1, 2 or 3. This works fine in simulation. In synthesis, however, the value of cnt1 changes at run time while the tool is trying to build the logic for mod1, and the generate block uses that changing value. This is a conflict for synthesizing logic gates.
Synthesis can't build gates for your generate block because it is building actual hardware and needs a deterministic value of cnt1 in order to construct the gates accordingly.
I believe you need to develop an architecture that can handle the largest value of cnt1.

Verilog : uart on FPGA and simulation behavioural differences

EDIT: removed some redundancies, moved all assignments to non-blocking, inserted a reset mapped as one of the input buttons of my FPGA... but when I implement the code, it starts transmitting the same wrong character and gets stuck in a single state of my machine.
Post Synthesis and Post-Implementation simulations are identical,$time-wise
module UART (reset_button, sysclk_p, sysclk_n,TxD, Tx_busy, Tx_state_scope_external);
input reset_button, sysclk_p, sysclk_n;
output wire TxD, Tx_busy;
output wire [1:0]Tx_state_scope_external;
//internal communications signals
wire clk_internal;
//buffer unit control signals
wire [7:0]TxD_data_internal;
wire Tx_start_internal;
wire Tx_busy_internal;
wire reset_flag;
reset_buf RESET_BUFF (.reset_internal (reset_flag), .reset (reset_button));
differential_CK CK_GENERATION (.sysclk_p (sysclk_p), .sysclk_n(sysclk_n), .clk(clk_internal));
output_Dbuffer OB1 (.reset (reset_flag), .RTS_n (Tx_busy_internal), .clk(clk_internal), .TX_trigger (Tx_start_internal), .TX_data(TxD_data_internal));
async_transmitter TX1 (.reset (reset_flag), .clk (clk_internal), .TxD_data(TxD_data_internal), .Tx_start (Tx_start_internal), .TxD(TxD), .Tx_busy_flag(Tx_busy_internal), .Tx_state_scope(Tx_state_scope_external));
obuf_TX O_TX1( .Tx_busy(Tx_busy), .Tx_busy_flag(Tx_busy_internal));
endmodule
module reset_buf (
output reset_internal,
input reset
);
// IBUF: Single-ended Input Buffer
// 7 Series
// Xilinx HDL Libraries Guide, version 14.7
IBUF #(
.IBUF_LOW_PWR("TRUE"), // Low power (TRUE) vs. performance (FALSE) setting for referenced I/O standards
.IOSTANDARD("DEFAULT") // Specify the input I/O standard
) IBUF_inst (
.O(reset_internal), // Buffer output
.I(reset) // Buffer input (connect directly to top-level port)
);
// End of IBUF_inst instantiation
endmodule
module differential_CK(
input sysclk_p,
input sysclk_n,
output clk
);
// IBUFGDS: Differential Global Clock Input Buffer
// 7 Series
// Xilinx HDL Libraries Guide, version 14.7
IBUFGDS #(
.DIFF_TERM("FALSE"), // Differential Termination
.IBUF_LOW_PWR("TRUE"), // Low power="TRUE", Highest performance="FALSE"
.IOSTANDARD("DEFAULT") // Specify the input I/O standard
) IBUFGDS_inst (
.O(clk), // Clock buffer output
.I(sysclk_p), // Diff_p clock buffer input (connect directly to top-level port)
.IB(sysclk_n) // Diff_n clock buffer input (connect directly to top-level port)
);
// End of IBUFGDS_inst instantiation
endmodule
module output_Dbuffer (
input reset,
input RTS_n, //TX_BUSY flag of the transmitter is my ready to send flag
input clk, //ck needed for the FSM
output wire TX_trigger, //TX_START flag of the transmitter now comes from THIS unit instead of Receiver
output wire [7:0]TX_data //byte for transmission
);
//internal variables
reg [7:0] mem [0:9]; //memory init, 10 * 8 bit locations
integer m, n, i, j, k ; //M = row [a.k.a. bytes], N = column [a.k.a. single bits]
reg TX_trigger_int;
reg [7:0] TX_data_int, TX_complete;
//reg sum256_ok;
reg [7:0]checksum_buff ;
//buffer FSM required variables
localparam //state enumeration declaration
BUF_IDLE = 3'b000,
BUF_START = 3'b001,
BUF_BYTES = 3'b010,
BUF_BUSY = 3'b011,
BUF_TX_CHECKSUM = 3'b100;
reg [2:0] buf_state; //2 bits for 4 states
//static assignments of OUTPUTS : Transmission Flag and Transmission Data (content)
assign TX_trigger = TX_trigger_int;
assign TX_data = TX_data_int;
//Block for transmitting [here I manage the TX_Data and TX_Trigger functionality]
always @(posedge clk)
begin
if (reset)
begin
buf_state <= BUF_IDLE;
TX_trigger_int <= 0;
TX_data_int <= 8'b00000000;
end
else case (buf_state)
BUF_IDLE:
begin
TX_trigger_int <= 0;
TX_data_int <= 8'b00000000;
m <=0;
n <=0;
i <=0;
j <=0;
mem[9] <= 8'b01010001; //81
mem[8] <= 8'b01000000; //64
mem[7] <= 8'b00110001; //49
mem[6] <= 8'b00100100; //36
mem[5] <= 8'b00011001; //25
mem[4] <= 8'b00010000; //16
mem[3] <= 8'b00001001; //9
mem[2] <= 8'b00000100; //4
mem[1] <= 8'b00000001; //1
mem[0] <= 8'b00000010;//2
checksum_buff <= 8'd31;
//check if the TX is not busy
if (RTS_n == 0) buf_state <= BUF_START;
end
BUF_START:
begin
TX_trigger_int <= 0;
if ((i == 0) || ( (j - i) > 1 )) buf_state <= BUF_BYTES;
else begin
$display ("BUFFER BUSY @time:", $time);
buf_state <= BUF_BUSY;
end
end
BUF_BYTES:
begin
//check if the TX is busy
if (RTS_n==0)
begin
// TX_trigger_int = 1; 21.09 MOVED THE TRIGGER INSIDE THE ELSE N LINE 498
if (j > 9)
begin
TX_trigger_int <= 0;
buf_state <= BUF_TX_CHECKSUM;
end
else begin
TX_data_int <= mem[j];
TX_trigger_int <= 1;
j <= j+1;
//TX_trigger_int =0;
buf_state <= BUF_START;
end
end
else buf_state <= BUF_BYTES;
end
BUF_BUSY:
begin
if (RTS_n == 0)
begin
$display ("BUFFER AVAILABLE AGAIN @time:", $time);
buf_state <= BUF_START;
end
end
BUF_TX_CHECKSUM:
begin
if (RTS_n==0) begin
TX_data_int <= checksum_buff;
// sum256_ok = 0;
TX_trigger_int <= 1;
buf_state <= BUF_IDLE;
end
end
//default: buf_state <= BUF_IDLE;
endcase
end
endmodule
module async_transmitter(
input clk,
input reset,
//differential clock pair
input [7:0] TxD_data,
input Tx_start, // it is ==TX_TRIGGER
output wire TxD, //bit being sent to the USB
output reg Tx_busy_flag,
output wire [1:0]Tx_state_scope
);
localparam //state enumeration declaration
TX_IDLE = 2'b00,
TX_START_BIT = 2'b01,
TX_BITS = 2'b10,
TX_STOP_BIT = 2'b11;
parameter ClkFrequencyTx = 200000000; // 200MHz
parameter BaudTx = 9600;
reg [1:0] Tx_state; //2 bits for 4 states
integer bit_counter; //bit counter variable
reg [7:0]TxD_data_int, TxD_int;
integer i; //vector index for output data
wire TXSTART_Trigger;
StartDetectionUnitTX SDU_TX (.clk(clk), .state (Tx_state), .signal_in (Tx_start), . trigger (TXSTART_Trigger));
wire BitTick;
BaudTickGen #(ClkFrequencyTx, BaudTx) as (.clk(clk), .trigger (TXSTART_Trigger), .tick(BitTick));
//BitTick is 16times the frequency generated during the RX portion
assign TxD = TxD_int;
always @(posedge clk) begin
if (reset)
begin
Tx_state <= TX_IDLE;
TxD_int <= 1;
Tx_busy_flag <=0;
end
else case (Tx_state)
TX_IDLE:
begin //reinitialization and check on the trigger condition
bit_counter <= 0;
TxD_data_int <= 8'b00000000;
i <= 0;
TxD_int <= 1; //idle state
Tx_busy_flag <= 0;
if (TXSTART_Trigger) begin
Tx_state <= TX_START_BIT;
TxD_data_int <= TxD_data;
Tx_busy_flag <= 1;
bit_counter <= 8;
end
end
TX_START_BIT:
begin
if (BitTick)
begin
TxD_int <= 0 ; //start bit is a ZERO logical value
Tx_state <= TX_BITS;
end
end
TX_BITS:
begin
if (BitTick)
begin
bit_counter <= bit_counter -1;
TxD_int <= TxD_data_int[i];
// $display ("ho trasmesso dalla UART un bit di valore %b al tempo: ", TxD, $time);
i <= i+1;
if (bit_counter < 1) Tx_state <= TX_STOP_BIT;
end
end
TX_STOP_BIT:
begin
if (BitTick) begin
TxD_int <= 1; //STOP BIT is a logical '1'
Tx_busy_flag <= 0;
Tx_state <= TX_IDLE;
end
end
// default: Tx_state <= TX_IDLE;
endcase
end
assign Tx_state_scope = Tx_state;
endmodule
module obuf_TX (
output Tx_busy,
input Tx_busy_flag
);
// OBUF: Single-ended Output Buffer
// 7 Series
// Xilinx HDL Libraries Guide, version 14.7
OBUF #(
.DRIVE(12), // Specify the output drive strength
.IOSTANDARD("DEFAULT"), // Specify the output I/O standard
.SLEW("SLOW") // Specify the output slew rate
) OBUF_inst (
.O(Tx_busy), // Buffer output (connect directly to top-level port)
.I(Tx_busy_flag) // Buffer input
);
// End of OBUF_inst instantiation
endmodule
module StartDetectionUnitTX ( //detects a rising edge of the start bit == TRANSMISSION START, during the IDLE state = 0000
input clk, [1:0]state,
input signal_in,
output trigger
);
reg signal_d;
always @(posedge clk)
begin
signal_d <= signal_in;
end
assign trigger = signal_in & (!signal_d) & (!state);
endmodule
module BaudTickGen (
input clk, trigger,
output tick //generates a tick at a specified baud rate *oversampling
);
parameter ClkFrequency = 200000000; //sysclk at 200Mhz
parameter Baud = 9600;
parameter Oversampling = 1;
//20832 almost= ClkFrequency / Baud, to make it an integer number
integer counter = (20833/Oversampling)-1; //-1 so counter can get to 0
reg out;
always @(posedge clk)
begin
if (trigger)
begin
counter <= (20833/Oversampling)-1; //-1 so counter can get to 0
out <= 1;
end
if (counter == 0)
begin
counter <= (20833/Oversampling)-1; //-1 so counter can get to 0
out <= 1;
end
else begin
counter <= counter-1;
out <= 0;
end
end
assign tick = out;
endmodule
My FPGA is a Virtex-7 VC707 and I'm using Vivado for my design flow.
Here I am attaching an image of my looping error.
[Image: looping error]
What have you done? Have you just simulated the code? Are you saying that it fails on the board, but the post-implementation sim is Ok?
A difference between pre- and post-implementation sim could point to a race condition. Get rid of all your blocking assignments, replace with NBAs (why did you use blocking assignments?)
Don't go to Chipscope - it's just a red flag that you don't know what you're doing
The code is a mess - simplify it. The Xilinx-specific stuff is irrelevant - get rid of it if you want anyone to look at it, fix comments (2-bit state?!), fix your statement about getting stuck in '10', etc
Have you run this through Vivado? Seriously? You have multiple drivers on various signals. Get rid of the initial block, use a reset. Initialise the RAM in a way which is understood by the tools. Even if Vivado is capable of initialising stuff using a separate initial block, don't do it
Get rid of statements like 'else Tx_state = TX_IDLE' in the TX_IDLE branch - they're redundant, and just add verbosity
Write something which fails stand-alone, and post it again.
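One way to follow the advice about memory initialisation is to hold the message bytes in a constant lookup instead of assigning them inside the state machine. The sketch below is only an illustration: the module name msg_rom and its ports are hypothetical, the byte values are copied from mem[0]..mem[9] in the question, and whether this is the style the answer had in mind is an assumption.

```verilog
// Hypothetical constant lookup for the ten message bytes.
// A case statement with a default maps to a small ROM or LUTs,
// with no run-time initialisation needed.
module msg_rom (
    input  wire [3:0] addr,   // byte index, 0..9
    output reg  [7:0] data    // byte to transmit
);
    always @* begin
        case (addr)
            4'd0:    data = 8'd2;
            4'd1:    data = 8'd1;
            4'd2:    data = 8'd4;
            4'd3:    data = 8'd9;
            4'd4:    data = 8'd16;
            4'd5:    data = 8'd25;
            4'd6:    data = 8'd36;
            4'd7:    data = 8'd49;
            4'd8:    data = 8'd64;
            4'd9:    data = 8'd81;
            default: data = 8'd0;   // unused addresses
        endcase
    end
endmodule
```

The buffer state machine would then only need to keep the index j and read the byte for that index, rather than rewriting all ten locations in BUF_IDLE.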

Synthesizable Verilog modular shift register

I'm doing a lot of pipelining with signals of varying width and want a synthesizable module to which I can pass two parameters: 1) the number of pipeline stages (L) and 2) the width of the signal (W).
That way I only have to instantiate the module and pass two values, which is much simpler and more robust than typing out signal propagation through dummy registers by hand, which is error-prone.
I have half-written the Verilog code; please correct me if I am wrong.
I am getting a compile error; see the comments.
*****************************************************************
PARTIAL VERILOG CODE FOR SERIAL IN SERIAL OUT SHIFT REGISTER WITH
1) Varying number of shifts / stages : L
2) Varying number of signal / register width : W
*****************************************************************
module SISO (clk, rst, Serial_in, Serial_out); // sIn -> [0|1|2|3|...|L-1] -> sOut
parameter L = 60; // Number of stages
parameter W = 60; // Width of Serial_in / Serial_out
input clk,rst;
input reg Serial_in;
output reg Serial_out;
// reg [L-1:0][W-1:0] R;
reg [L-1:0] R; // Declare a register which is L bit long
always @(posedge clk or posedge rst)
begin
if (rst) // Reset = active high
//**********************
begin
R[0] <= 'b0; // Exceptional case : feeding input to pipe
Serial_out <= 'b0; // Exceptional case : vomiting output from pipe
genvar j;
for(j = 1; j<= L; j=j+1) // Ensuring ALL registers are reset when rst = 1
begin : rst_regs // Block name = reset_the_registers
R[L] <= 'b0; // Verilog automatically assumes destination width @ just using 'b0
end
end
else
//**********************
begin
generate
genvar i;
for(i = 1; i< L; i=i+1)
begin : declare_reg
R[0] <= Serial_in; // <---- COMPILE ERROR POINTED HERE
R[L] <= R[L-1];
Serial_out <= R[L-1];
end
endgenerate;
end
//**********************
endmodule
//**********************
Why so complicated? The following code would be much simpler and easier to understand:
module SISO #(
parameter L = 60, // Number of stages (1 = this is a simple FF)
parameter W = 60 // Width of Serial_in / Serial_out
) (
input clk, rst,
input [W-1:0] Serial_in,
output [W-1:0] Serial_out
);
reg [L*W-1:0] shreg;
always @(posedge clk) begin
if (rst)
shreg <= 0;
else
shreg <= {shreg, Serial_in};
end
assign Serial_out = shreg[L*W-1:(L-1)*W];
endmodule
However, looking at your code there are the following problems:
You declare Serial_in as input reg. This is not possible; an input cannot be a reg.
You are using generate..endgenerate within an always block. A generate block is a module item and cannot be used in an always block. Simply remove the generate and endgenerate statements and declare i as integer.
Serial_in and Serial_out must be declared as vectors of size [W-1:0].
You are using R as a memory. Declare it as such: reg [W-1:0] R [0:L-1].
You are not using i in your for loop. Presumably you meant to chain all the elements of R together, but you are only accessing the 0th, (L-1)th and Lth elements. (The Lth element does not exist: the array runs from 0 to L-1.)
I'm stopping this list here because, I'm sorry, I don't think there is much to gain by improving the code you have posted.
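For reference, the points in the list above can be combined into a memory-style version of the shift register. This is a sketch rather than a drop-in replacement: the module name SISO_array and the small default parameter values are illustrative, and the asynchronous active-high reset is carried over from the question.

```verilog
// Sketch of the array-based variant described in the list above.
// SISO_array and the default L/W values are hypothetical.
module SISO_array #(
    parameter L = 4,   // number of stages
    parameter W = 8    // width of Serial_in / Serial_out
) (
    input  wire         clk,
    input  wire         rst,         // active-high asynchronous reset
    input  wire [W-1:0] Serial_in,
    output wire [W-1:0] Serial_out
);
    reg [W-1:0] R [0:L-1];  // L stages, each W bits wide
    integer i;              // procedural loop index (not a genvar)

    always @(posedge clk or posedge rst) begin
        if (rst) begin
            for (i = 0; i < L; i = i + 1)
                R[i] <= {W{1'b0}};
        end else begin
            R[0] <= Serial_in;           // feed the first stage
            for (i = 1; i < L; i = i + 1)
                R[i] <= R[i-1];          // chain each stage to the previous one
        end
    end

    assign Serial_out = R[L-1];          // output of the last stage
endmodule
```

The loops are unrolled by synthesis, so the result is the same chain of L registers as the packed shreg version, with a latency of L clock cycles.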

Serializer 32 to 8 - Verilog HDL

I am trying to make a serializer from 32 bits to 8 bits. I am just starting with Verilog and have run into a problem. I would like to take in 32 bits on every 4th clock cycle and then send 8 bits on every clock cycle. How can I take just part of my dataIn? I wrote the code below, but the assignment expression is not working. Sorry if the question is basic. Thank you in advance for any answer.
module ser32to8(clk, dataIn, dataOut);
input clk;
input [32:0] dataIn;
output [7:0] dataOut;
always @(posedge clk)
begin
dataOut <= dataIn << 8;
end
endmodule
The reason why the assignment failed (besides your code not doing any serialization) is because you didn't declare dataOut as a reg, and so you cannot assign to it inside an always block.
Here's how you do it correctly. (Since you didn't say in which order you wanted to serialize, I chose to go for lowest byte first, highest byte last. To reverse the order, exchange >> by << and tmp[7:0] by tmp[31:24].)
module ser32to8(
input clk,
input [31:0] dataIn,
output [7:0] dataOut
);
// count: 0, 1, 2, 3, 0, ... (wraps automatically)
reg [1:0] count;
always @(posedge clk) begin
count <= count + 2'd1;
end
reg [31:0] tmp;
always @(posedge clk) begin
if (count == 2'd0)
tmp <= dataIn;
else
tmp <= (tmp >> 8);
end
assign dataOut = tmp[7:0];
endmodule
How can you just take part of your dataIn data? By using the [] notation. dataIn[7:0] takes the 8 least significant bits, dataIn[15:8] takes the next 8 bits, and so on up to dataIn[31:24] which would take the 8 most significant bits.
To apply this to your problem, you can do like this (take into account that this is not an optimal solution, as outputs are not registered and hence, glitches may occur)
module ser32to8(
input wire clk,
input wire [31:0] dataIn,
output reg [7:0] dataOut
);
reg [1:0] cnt = 2'b00;
always @(posedge clk)
cnt <= cnt + 1;
always @* begin
case (cnt)
2'd0: dataOut = dataIn[7:0];
2'd1: dataOut = dataIn[15:8];
2'd2: dataOut = dataIn[23:16];
2'd3: dataOut = dataIn[31:24];
default: dataOut = 8'h00;
endcase
end
endmodule
You must declare dataOut as a reg, since you are assigning it in an always block. Also, you are trying to assign the 32-bit dataIn to the 8-bit dataOut, which is not logically correct.
The idea behind the question is not entirely clear, but my guess is that you want to wait for 4 clock cycles before you send the data. If that is the case, the snippet below could help; a counter that waits for 4 clock cycles will do the trick.
module top (input clk,rst,
input [31:0] dataIn,
output [7:0] dataOut
);
reg [31:0] tmp;
reg [31:0] inter;
integer count;
always @(posedge clk)
begin
if (rst) begin
count <= 0;
tmp <= '0;
end
else
begin
if (count < 3) begin
tmp <= dataIn << 4;
count <= count +1; end
else if (count == 3)
begin
inter <= tmp;
count <= 0;
end
else
begin
tmp <= dataIn;
end
end
end
assign dataOut = inter[7:0];
endmodule
But there are some limitations; tested with the testbench at http://www.edaplayground.com/x/4Cg
Note: Please ignore the previous code; it won't work (I was not clear, so I tried it differently).
EDIT:
If I understand your question correctly a simple way to do it is
a)
module top ( input rst,clk,
input [31:0] dataIn,
output [7:0] dataOut);
reg [1:0] cnt;
always @(posedge clk) begin
if (rst) cnt <= 'b0;
else cnt <= cnt + 1;
end
assign dataOut = (cnt == 0) ? dataIn [7:0] :
(cnt == 1) ? dataIn [15:8] :
(cnt == 2) ? dataIn [23:16] :
(cnt == 3) ? dataIn [31:24] :
'0;
endmodule
In case you don't want to write each case out separately, a for loop will come in handy to make it simpler:
b)
module top ( input rst,clk,
input [31:0] dataIn,
output reg [7:0] dataOut);
reg [1:0] cnt;
integer i;
always @(posedge clk) begin
if (rst) cnt <= 'b0;
else cnt <= cnt + 1;
end
always @* begin
dataOut = dataIn[7:0]; // default assignment keeps the block combinational (no latch)
for ( i =0;i < 4 ; i=i+1) begin
if (cnt == i) dataOut = dataIn[(i*8)+:8]; end
end
endmodule
I have tried both with test cases and found them to be working; the test cases are at:
a) http://www.edaplayground.com/x/VCF
b) http://www.edaplayground.com/x/4Cg
You may want to give it a try
You can follow the figure below to design your circuit. I hope it is useful to you. If you need the code, feel free to contact me.
[Figure: SER 112 bits with 8 outputs in parallel]

Verilog generate/genvar in an always block

I'm trying to get a module to pass the syntax check in ISE 12.4, and it gives me an error I don't understand. First a code snippet:
parameter ROWBITS = 4;
reg [ROWBITS-1:0] temp;
genvar c;
generate
always @(posedge sysclk) begin
for (c = 0; c < ROWBITS; c = c + 1) begin: test
temp[c] <= 1'b0;
end
end
endgenerate
When I try a syntax check, I get the following error message:
ERROR:HDLCompiler:731 - "test.v" Line 46: Procedural assignment to a
non-register <c> is not permitted.
I really don't understand why it's complaining. "c" isn't a wire, it's a genvar. This should be the equivalent of the completely legal syntax:
reg [3:0] temp;
always @(posedge sysclk) begin
temp[0] <= 1'b0;
temp[1] <= 1'b0;
temp[2] <= 1'b0;
temp[3] <= 1'b0;
end
Please, no comments about how it'd be easier to write this without the generate. This is a reduced example of a much more complex piece of code involving multiple ifs and non-blocking assignments to "temp". Also, don't just tell me there are newer versions of ISE, I already know that. OTOH, if you know it's fixed in a later version of ISE, please let me know which version you know works.
You need to reverse the nesting inside the generate block:
genvar c;
generate
for (c = 0; c < ROWBITS; c = c + 1) begin: test
always @(posedge sysclk) begin
temp[c] <= 1'b0;
end
end
endgenerate
Technically, this generates four always blocks:
always @(posedge sysclk) temp[0] <= 1'b0;
always @(posedge sysclk) temp[1] <= 1'b0;
always @(posedge sysclk) temp[2] <= 1'b0;
always @(posedge sysclk) temp[3] <= 1'b0;
In this simple example, there's no difference in behavior between the four always blocks and a single always block containing four assignments, but in other cases there could be.
The genvar-dependent operation needs to be resolved when constructing the in-memory representation of the design (in the case of a simulator) or when mapping to logic gates (in the case of a synthesis tool). The always @(posedge ...) block has no meaning until the design is operating.
Subject to certain restrictions, you can put a for loop inside the always block, even for synthesizable code. For synthesis, the loop will be unrolled. However, in that case, the for loop needs to work with a reg, integer, or similar. It can't use a genvar, because having the for loop inside the always block describes an operation that occurs at each edge of the clock, not an operation that can be expanded statically during elaboration of the design.
You don't need a generate block if you want all the bits of temp assigned in the same always block.
parameter ROWBITS = 4;
reg [ROWBITS-1:0] temp;
always @(posedge sysclk) begin
for (integer c=0; c<ROWBITS; c=c+1) begin: test
temp[c] <= 1'b0;
end
end
Alternatively, if your simulator supports IEEE 1800 (SystemVerilog), then
parameter ROWBITS = 4;
reg [ROWBITS-1:0] temp;
always @(posedge sysclk) begin
temp <= '0; // fill with 0
end
If you do not mind having to compile/generate the file then you could use a pre processing technique. This gives you the power of the generate but results in a clean Verilog file which is often easier to debug and leads to less simulator issues.
I use RubyIt to generate verilog files from templates using ERB (Embedded Ruby).
parameter ROWBITS = <%= ROWBITS %> ;
always @(posedge sysclk) begin
<% (0...ROWBITS).each do |addr| -%>
temp[<%= addr %>] <= 1'b0;
<% end -%>
end
Generating the module_name.v file with :
$ ruby_it --parameter ROWBITS=4 --outpath ./ --file ./module_name.rv
The generated module_name.v
parameter ROWBITS = 4 ;
always @(posedge sysclk) begin
temp[0] <= 1'b0;
temp[1] <= 1'b0;
temp[2] <= 1'b0;
temp[3] <= 1'b0;
end
Within a module, Verilog contains essentially two constructs: items and statements. Statements are always found in procedural contexts, which include anything in between begin..end, functions, tasks, always blocks and initial blocks. Items, such as generate constructs, are listed directly in the module. For loops and most variable/constant declarations can exist in both contexts.
In your code, it appears that you want the for loop to be evaluated as a generate item, but the loop is actually part of the procedural context of the always block. For a for loop to be treated as a generate loop, it must be in the module context. The generate..endgenerate keywords are entirely optional (although some tools require them) and have no effect. See this answer for an example of how generate loops are evaluated.
//Compiler sees this
parameter ROWBITS = 4;
reg [ROWBITS-1:0] temp;
genvar c;
always @(posedge sysclk) //Procedural context starts here
begin
for (c = 0; c < ROWBITS; c = c + 1) begin: test
temp[c] <= 1'b0; //Still a genvar
end
end
For plain Verilog, just do:
parameter ROWBITS = 4;
reg [ROWBITS-1:0] temp;
always @(posedge sysclk) begin
temp <= {ROWBITS{1'b0}}; // fill with 0
end
To put it simply, you don't use generate inside an always process; you use generate to create a parametrized process or to instantiate particular modules, where you can combine if-else or case. So you can move this generate outside and create a particular process or an instantiation, e.g.:
// The module name was missing in the original; adder_select is a placeholder.
module adder_select #(
parameter XLEN = 64,
parameter USEIP = 0
)
(
input clk,
input rstn,
input [XLEN-1:0] opA,
input [XLEN-1:0] opB,
output logic [XLEN-1:0] opR, // driven below, so it must be an output (SystemVerilog logic)
input en
);
generate
case(USEIP)
0:begin
always @(posedge clk or negedge rstn)
begin
if(!rstn)
begin
opR <= '{default:0};
end
else
begin
if(en)
opR <= opA+opB;
else
opR <= '{default:0};
end
end
end
1:begin
superAdder #(.XLEN(XLEN)) _adder(.clk(clk),.rstm(rstn), .opA(opA), .opB(opB), .opR(opR), .en(en));
end
endcase
endgenerate
endmodule

Resources