I want to expand each bit n times.
For example,
// n = 2
5'b10101 -> 10'b1100110011
// n = 3
5'b10101 -> 15'b111000111000111
Is there any simple way (i.e., not using generate block) in Verilog or SystemVerilog?
EDIT 19.02.21
Actually, I'm doing 64bit mask to 512bit mask conversion, but it is different from {8{something}}. My current code is the following:
logic [63 : 0] x;
logic [511 : 0] y;
genvar i;
for (i = 0; i < 64; i = i + 1) begin
always_comb begin
y[(i + 1) * 8 - 1 : i * 8] = x[i] ? 8'hFF : 8'h00;
end
end
I just wonder whether there is a more "beautiful" way.
I think that your method is a good one. You cannot do it without some kind of a loop (unless you want to type all the iterations manually). There might be several variants for implementing it.
For example, using '+:' operator instead of an expression, which simplifies it a bit.
genvar i;
for (i = 0; i < 64; i = i + 1) begin
always_comb begin
y[i * 8 +: 8] = x[i] ? 8'hFF : 8'h00;
end
end
The above method actually generates 64 always blocks (as in your original one), though the sensitivity list of each block will be just a single bit of 'x'.
You can move the for loop inside an always block:
always @* begin
for (int j = 0; j < 64; j++) begin
y3[j * 8 +: 8] = x[j] ? 8'hFF : 8'h00;
end
end
This will end up as a single always block, but its sensitivity list will include all bits of 'x'.
If this operation is used multiple times, you can use a function:
function logic [511 : 0] transform(input logic [63 : 0] x);
for (int j = 0; j < 64; j++) begin
transform[j * 8 +: 8] = x[j] ? 8'hFF : 8'h00;
end
endfunction
...
always @* begin
y = transform(x);
end
If n is a parameter you can do:
always_comb begin
y = '0;
for(int idx=0; idx<($bits(y)/n) && idx<$bits(x); idx++) begin
y[idx*n +: n] = {n{x[idx]}};
end
end
If n is a signal you have to assign each bit:
always_comb begin
y = '0;
foreach(y[idx]) begin
y[idx] = x[ idx/n ];
end
end
A variable divisor will add timing and area overhead. Depending on your design target, it may or may not be an issue (synthesis optimization or simulation only).
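If the replication factor is a compile-time constant, the loop can also be wrapped in a small reusable module. The following is only a sketch: the module name `bit_expand` and the parameter names `W` and `N` are made up for illustration, and the testbench simply replays the two examples from the question.

```systemverilog
// Hypothetical helper: repeats every bit of 'x' N times.
// W and N are illustrative parameter names, not from the original posts.
module bit_expand #(
  parameter int W = 64,  // input width
  parameter int N = 8    // replication factor (compile-time constant)
) (
  input  logic [W-1:0]   x,
  output logic [W*N-1:0] y
);
  always_comb begin
    for (int i = 0; i < W; i++) begin
      // Variable base, constant width: legal for an indexed part-select.
      y[i*N +: N] = {N{x[i]}};
    end
  end
endmodule

// Minimal testbench using the examples from the question.
module bit_expand_tb;
  logic [4:0]  x;
  logic [9:0]  y2;
  logic [14:0] y3;

  bit_expand #(.W(5), .N(2)) u2 (.x(x), .y(y2));
  bit_expand #(.W(5), .N(3)) u3 (.x(x), .y(y3));

  initial begin
    x = 5'b10101;
    #1;
    $display("n=2: %b", y2);  // question's example: 1100110011
    $display("n=3: %b", y3);  // question's example: 111000111000111
    $finish;
  end
endmodule
```

For the 64-bit to 512-bit mask case, the defaults (W = 64, N = 8) already match.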
My answer might not be the best of the answers, but if I were you, I would do something as below (assuming x and y are registers in your module that will be used in a synchronous design):
// your module name and ports
reg [63:0] x;
reg [511:0] y;
// your initializations
always @(posedge clk) begin
y[0+:8] <= x[0] ? 8'hff : 8'h00;
y[8+:8] <= x[1] ? 8'hff : 8'h00;
y[16+:8] <= x[2] ? 8'hff : 8'h00;
y[24+:8] <= x[3] ? 8'hff : 8'h00;
y[32+:8] <= x[4] ? 8'hff : 8'h00;
*
*
*
y[504+:8] <= x[63] ? 8'hff : 8'h00;
end
For different always conditions:
// your module name and ports
reg [63:0] x;
reg [511:0] y;
// your initializations
always @('some sensitivity conditions') begin
y[0+:8] <= x[0] ? 8'hff : 8'h00;
y[8+:8] <= x[1] ? 8'hff : 8'h00;
y[16+:8] <= x[2] ? 8'hff : 8'h00;
y[24+:8] <= x[3] ? 8'hff : 8'h00;
y[32+:8] <= x[4] ? 8'hff : 8'h00;
*
*
*
y[504+:8] <= x[63] ? 8'hff : 8'h00;
end
However, if I wanted a separate module that inputs x and outputs y, I would do something as below:
module mask_conversion(
input [63:0] x,
output [511:0] y
);
assign y[0+:8] = x[0] ? 8'hff : 8'h00;
assign y[8+:8] = x[1] ? 8'hff : 8'h00;
assign y[16+:8] = x[2] ? 8'hff : 8'h00;
assign y[24+:8] = x[3] ? 8'hff : 8'h00;
assign y[32+:8] = x[4] ? 8'hff : 8'h00;
*
*
*
assign y[504+:8] = x[63] ? 8'hff : 8'h00;
endmodule
It is not that difficult to type all these, you just need to copy and paste, and change numbers manually. As a result you will get guaranteed code that does what you want.
I am facing an interesting issue in SystemVerilog where the comparison with a register isn't working.
module VGA_Colours
(
input wire clk, reset,
// input wire [3:0] swred, swgreen,
// input wire [1:0] swblue,
output wire hsync, vsync,
output wire [3:0] r, g, b
);
// constant declarations for VGA sync parameters
localparam H_DISPLAY = 640; // horizontal display area
localparam H_L_BORDER = 48; // horizontal left border
localparam H_R_BORDER = 16; // horizontal right border
localparam H_RETRACE = 96; // horizontal retrace
localparam H_MAX = H_DISPLAY + H_L_BORDER + H_R_BORDER + H_RETRACE - 1;
localparam START_H_RETRACE = H_DISPLAY + H_R_BORDER;
localparam END_H_RETRACE = H_DISPLAY + H_R_BORDER + H_RETRACE - 1;
localparam V_DISPLAY = 480; // vertical display area
localparam V_T_BORDER = 10; // vertical top border
localparam V_B_BORDER = 33; // vertical bottom border
localparam V_RETRACE = 2; // vertical retrace
localparam V_MAX = V_DISPLAY + V_T_BORDER + V_B_BORDER + V_RETRACE - 1;
localparam START_V_RETRACE = V_DISPLAY + V_B_BORDER;
localparam END_V_RETRACE = V_DISPLAY + V_B_BORDER + V_RETRACE - 1;
wire video_on, p_tick;
reg [9:0] ii;
reg j;
reg [3:0] red_reg, green_reg, blue_reg;
reg [11:0] rbg;
// mod-2 counter to generate 25 MHz pixel tick
reg pixel_reg = 0;
wire pixel_next;
wire pixel_tick;
always @(posedge clk)
pixel_reg <= pixel_next;
assign pixel_next = ~pixel_reg; // next state is complement of current
assign pixel_tick = (pixel_reg == 0); // assert tick half of the time
// registers to keep track of current pixel location
reg [9:0] h_count_reg, h_count_next, v_count_reg, v_count_next;
// register to keep track of vsync and hsync signal states
reg vsync_reg, hsync_reg;
wire vsync_next, hsync_next;
// infer registers
always @(posedge clk)
if(~reset)
begin
v_count_reg <= 0;
h_count_reg <= 0;
vsync_reg <= 0;
hsync_reg <= 0;
end
else
begin
v_count_reg <= v_count_next;
h_count_reg <= h_count_next;
vsync_reg <= vsync_next;
hsync_reg <= hsync_next;
end
// next-state logic of horizontal vertical sync counters
always @*
begin
h_count_next = pixel_tick ?
h_count_reg == H_MAX ? 0 : h_count_reg + 1
: h_count_reg;
v_count_next = pixel_tick && h_count_reg == H_MAX ?
(v_count_reg == V_MAX ? 0 : v_count_reg + 1)
: v_count_reg;
end
// hsync and vsync are active low signals
// hsync signal asserted during horizontal retrace
assign hsync_next = h_count_reg >= START_H_RETRACE
&& h_count_reg <= END_H_RETRACE;
// vsync signal asserted during vertical retrace
assign vsync_next = v_count_reg >= START_V_RETRACE
&& v_count_reg <= END_V_RETRACE;
// video only on when pixels are in both horizontal and vertical display region
assign video_on = (h_count_reg < H_DISPLAY)
&& (v_count_reg < V_DISPLAY);
// output signals
assign hsync = hsync_reg;
assign vsync = vsync_reg;
assign p_tick = pixel_tick;
always @(posedge p_tick) begin
if (~reset) begin
rbg <= 12'b000000000000;
ii <= 9'b0;
end else begin
if (h_count_reg == 0) begin
rbg <= 12'b000000000000;
ii <= 9'b0;
end else if (h_count_reg == ii) begin
ii <= ii + 9'b001010000;
rbg <= rbg + 12'b000010000000;
end
end
end
// output
assign r = (video_on) ? rbg[11:8] : 4'b0;
assign g = (video_on) ? rbg[7:4] : 4'b0;
assign b = (video_on) ? rbg[3:0] : 4'b0;
endmodule
In the above code h_count_reg is 0 works fine. If I change 0 to any different number, it will work as expected. However, if I replace that number with a variable (which is "ii", declared on top of my module as reg[9:0] ii), the code seems to ignore it, which is weird. Replacing the ii variable with any number will work. Why?
TestBench file:
module VGA_Colours_tb ();
logic clk;
reg reset;
wire hsync, vsync;
wire [3:0] r, g, b;
VGA_Colours scr0 (
.clk (clk),
.reset (reset),
.hsync (hsync),
.vsync (vsync),
.r (r),
.b (b),
.g (g)
);
initial begin
clk = 0;
forever #10 clk = ~clk;
end
always @(posedge clk) begin
#20
reset <= 1'b0;
#20
reset <= 1'b1;
#100000
$finish;
end
endmodule
Simulation wave:
As you can see from the code, when h_count_reg is == to ii, increment the rbg and the value of ii. However, based on the simulation waves, it is not doing that as if the value of h_count_reg is not equal to ii while it actually is.
You have a logic error in the VGA_Colours module.
Here is your code with more consistent indentation:
always @(posedge p_tick) begin
if (~reset) begin
rbg <= 12'b000000000000;
ii <= 9'b0;
end else begin
if (h_count_reg == 0) begin
rbg <= 12'b000000000000;
ii <= 9'b0;
end else if (h_count_reg == ii) begin
rbg <= rbg + 12'b000010000000;
ii <= ii + 9'b001010000;
end
end
end
When I run your simulation, I observe ii is always 0 after the initial reset.
The code has 3 if statements. The 1st if statement is true at the beginning of the simulation, when reset=0. This sets ii to 0.
After reset, I see h_count_reg=0 4 times. This means the 2nd if statement is true 4 times. This keeps ii = 0.
The 3rd if statement is evaluated only when h_count_reg is not 0. It should be clear now that the 3rd if statement can never be true. This means that ii will not be incremented and it will remain at 0. For example, when h_count_reg=1, then (h_count_reg == ii) is false because ii is always 0.
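One possible way out, assuming the intent was to step the colour every 80 pixels along a line, is to reload the comparison value with the first non-zero boundary instead of 0. This is only a sketch: the module name `stripe_gen`, the register name `next_edge` (which plays the role of `ii`), and the use of `pixel_tick` as a clock enable rather than as a clock are my assumptions, not part of the original code.

```verilog
// Hypothetical stand-alone version of the stripe logic.
module stripe_gen (
  input  wire        clk,
  input  wire        reset_n,     // active low, as in the question
  input  wire        pixel_tick,  // used as a clock enable (assumption)
  input  wire [9:0]  h_count,
  output reg  [11:0] rbg
);
  localparam [9:0] STRIPE_W = 10'd80;  // same step as 9'b001010000

  reg [9:0] next_edge;  // plays the role of 'ii'

  always @(posedge clk) begin
    if (!reset_n) begin
      rbg       <= 12'd0;
      next_edge <= STRIPE_W;
    end else if (pixel_tick) begin
      if (h_count == 10'd0) begin
        // Start of line: restart the colour and arm the first boundary.
        // Loading a non-zero value keeps the next branch reachable.
        rbg       <= 12'd0;
        next_edge <= STRIPE_W;
      end else if (h_count == next_edge) begin
        rbg       <= rbg + 12'b000010000000;
        next_edge <= next_edge + STRIPE_W;
      end
    end
  end
endmodule
```

With `next_edge` starting at 80, the third branch fires at h_count = 80, 160, and so on, because the comparison value is no longer stuck at the one value (0) that the second branch always catches first.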
I have a following code :
`timescale 1ns / 1ps
//////////////////////////////////////////////////////////////////////////////////
// Company:
// Engineer:
//
// Create Date: 04/07/2019 01:20:06 PM
// Design Name:
// Module Name: data_generator_v1
// Project Name:
// Target Devices:
// Tool Versions:
// Description:
//
// Dependencies:
//
// Revision:
// Revision 0.01 - File Created
// Additional Comments:
//
//////////////////////////////////////////////////////////////////////////////////
module data_generator_v1 #(
// Define parameters
parameter integer MAPPING_NUMBER = 196 // MAPPING NUMBER IS USED TO SET A SPECIFIC PROBABILITY (16 BIT SCALING --> MAX VALUE = 65535 --> MAPPING NUMBER = 65535 * 0.03 == 196)
)
(
input S_AXI_ACLK , // Input clock
input S_AXI_ARESETN, // RESET signal (active low )
input start_twister,
output reg [1022:0] rec_vector = 1023'd0,
output reg start_decoding = 1'b0 ,
output integer random_vector_bit_errors = 0
);
// Mersenne Twister signals ----------------------------------------------------------------------
wire [63:0] output_axis_tdata ;
wire output_axis_tvalid ;
wire output_axis_tready ;
wire busy ;
wire [63:0] seed_val ;
wire seed_start ;
//--------------------------------------------------------------------------------------------------
// Signals ----------------------------------------------------------------------------------------
wire [3:0] random_nibble ;
integer nibble_count = 256 ; // initialize to 256
reg [1023:0] random_vector = 1024'd0;
reg sample_random_vector = 1'b0;
reg [9:0] bit_errors = 10'd0 ;
// -------------------------------------------------------------------------------------------------
// Generate numbers with a specific probability
assign random_nibble[0] = (output_axis_tdata[15:0] < MAPPING_NUMBER) ? 1 : 0 ;
assign random_nibble[1] = (output_axis_tdata[31:16] < MAPPING_NUMBER) ? 1 : 0 ;
assign random_nibble[2] = (output_axis_tdata[47:32] < MAPPING_NUMBER) ? 1 : 0 ;
assign random_nibble[3] = (output_axis_tdata[63:48] < MAPPING_NUMBER) ? 1 : 0 ;
// Generate a random vector ------------------------------------------------------------------------
always @(posedge S_AXI_ACLK) begin
if(S_AXI_ARESETN == 1'b0 ) begin
random_vector <= 1024'd0 ;
sample_random_vector <= 1'b0 ;
nibble_count <= 256 ;
random_vector_bit_errors <= 0 ;
bit_errors <= 0 ;
end
else begin
if(output_axis_tvalid == 1'b1) begin
if(nibble_count == 0 ) begin
random_vector <= random_vector ;
sample_random_vector <= 1'b1 ;
nibble_count <= 256 ;
random_vector_bit_errors <= bit_errors ;
bit_errors <= 0 ;
end
else begin
nibble_count <= nibble_count - 1 ; // 256*4 == 1024 bit vector
sample_random_vector <= 1'b0 ;
random_vector <= (random_vector << 4) ^ random_nibble ;
random_vector_bit_errors <= random_vector_bit_errors ;
if(nibble_count == 256) begin
case(random_nibble[2:0])
3'b000 : bit_errors <= bit_errors ;
3'b001 : bit_errors <= bit_errors + 1 ;
3'b010 : bit_errors <= bit_errors + 1 ;
3'b011 : bit_errors <= bit_errors + 2 ;
3'b100 : bit_errors <= bit_errors + 1 ;
3'b101 : bit_errors <= bit_errors + 2 ;
3'b110 : bit_errors <= bit_errors + 2 ;
3'b111 : bit_errors <= bit_errors + 3 ;
endcase
end
else begin
case (random_nibble)
4'b0000 : bit_errors <= bit_errors ;
4'b0001 : bit_errors <= bit_errors + 1 ;
4'b0010 : bit_errors <= bit_errors + 1 ;
4'b0011 : bit_errors <= bit_errors + 2 ;
4'b0100 : bit_errors <= bit_errors + 1 ;
4'b0101 : bit_errors <= bit_errors + 2 ;
4'b0110 : bit_errors <= bit_errors + 2 ;
4'b0111 : bit_errors <= bit_errors + 3 ;
4'b1000 : bit_errors <= bit_errors + 1 ;
4'b1001 : bit_errors <= bit_errors + 2 ;
4'b1010 : bit_errors <= bit_errors + 2 ;
4'b1011 : bit_errors <= bit_errors + 3 ;
4'b1100 : bit_errors <= bit_errors + 2 ;
4'b1101 : bit_errors <= bit_errors + 3 ;
4'b1110 : bit_errors <= bit_errors + 3 ;
4'b1111 : bit_errors <= bit_errors + 4 ;
endcase
end
end
end
end
end
// Sample output for the next block
always @(posedge S_AXI_ACLK) begin
if(S_AXI_ARESETN == 1'b0) begin
rec_vector <= 1023'd0 ;
start_decoding <= 1'b0 ;
end
else begin
if(sample_random_vector) begin
rec_vector <= random_vector[1022:0] ;
start_decoding <= 1'b1 ;
end
else begin
rec_vector <= rec_vector ;
start_decoding <= 1'b0 ;
end
end
end
//---------------------------------------------------------------------------------------------------
// //-------------------------------------------------------------------------------------------------------------------------------------
// // STANDARD CLOCK AND RESET
// //output_axis_tdata contains valid data when output_axis_tvalid is asserted
// // output_axis_tready is input into the mersenne twister and we can use this to accept or stop the generation of new data streams
// // busy is asserted when the mersenne twister is performing some computations
// // seed val is not used . It will start will default seed
// // seed start --> not used
// Mersenne twister signal assignment
assign seed_val = 64'd0 ; // used for seeding purposes
assign seed_start = 1'b0 ; // We do not want to assign a new seed so we proceed with the default one
assign output_axis_tready = (S_AXI_ARESETN == 1'b0 || start_twister == 0 ) ? 1'b0 : 1'b1 ; // knob to turn the twister on and off
// MODULE INSTANTIATION
axis_mt19937_64 AMT19937(S_AXI_ACLK,S_AXI_ARESETN,output_axis_tdata,output_axis_tvalid,output_axis_tready,busy,seed_val,seed_start) ;
// //-------------------------------------------------------------------------------------------------------------------------------------
endmodule
The focus of this question is the variable: output reg [1022:0] rec_vector = 1023'd0
I am loading this vector using a Mersenne Twister random number generator. The mersenne twister provides a 64 bit number that is then mapped into a 4 bit number. 256 such 4 bit numbers are generated to fill up one row in the rec_vector variable.
Now, I need to select each row in this 2-d array and send it for decoding. This is simple. I can write something like rec_vector[row_index] to get a specific row.
After I run an operation on each of the rows, I need to perform the same operation on the columns as well. How do I get the columns out of this 2-d array?
Please note that a simple approach like creating wires and assigning them like:
codeword_column[0] = {rec_vector[0][0], rec_vector[1][0] ....., rec_vector[1022][0]} does not work. If I do this, the utilization blows up, since I am now doing an asynchronous read on the 2-d array and that 2-d array can no longer be inferred as block RAM, because block RAMs only support synchronous reads.
I would really appreciate any input regarding this. Thanks for taking the time to read this.
I'll give this as complete answer and not as a comment as a similar question popped-up a short while ago: Accessing a million bits
In fact what you are asking is "How can I access a 2d-array in row and in column mode".
This is only possible if you make the array completely out of registers.
As soon as you have a lot of bits, too many to store in registers, you have to fall back on memories. So how do you access rows and columns in a memory?
And the answer is the very unsatisfactory: "You can't."
Unfortunately memories are implemented as long rows of bits, and the hardware allows you to select only one row at a time. To access a column you have to work your way through the addresses, reading one row at a time and picking out the column bit(s) you want. This means it costs one clock cycle to read each column element.
The first way to speed things up is to use dual-ported memories. The memories on the FPGAs I know are all dual-ported, so you can do two reads from different addresses at a time.
You can also speed up the access by storing two rows at a time, e.g. an array of 8x8 bytes can be stored as 16x4, so each read gives you access to two rows at once and thus the first two column elements. (But that has diminishing returns: you end up with one huge row of registers again.)
Combining this with dual-ported access gives you four column elements per clock cycle.
Just as a last warning, which is also mentioned in the above link: FPGAs have two types of memories:
Synchronous write and asynchronous read, for which they have to use LUTs.
Synchronous write and read, for which they can use the internal memory banks.
The latter have the largest amount of storage. Thus if you write your code to use the former, you can quickly find yourself out of resources.
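The "one row per clock" column read described above could look roughly like the sketch below. Everything here is illustrative: the module name `column_reader`, its ports, and the parameters are made up, only a single read port is used, and whether the memory actually maps to block RAM depends on the tool and the device.

```verilog
// Hypothetical sketch: collect one column from a row-organised memory
// that has a synchronous read port, one row (one column bit) per clock.
module column_reader #(
  parameter ROWS = 1023,
  parameter COLS = 1023,
  parameter AW   = 10            // address width, 2**AW >= ROWS and COLS
) (
  input  wire            clk,
  input  wire            we,         // row write port
  input  wire [AW-1:0]   waddr,
  input  wire [COLS-1:0] wdata,
  input  wire            start,      // pulse to begin a column read
  input  wire [AW-1:0]   col_index,  // must stay stable until 'done'
  output reg  [ROWS-1:0] column,
  output reg             done        // one-cycle pulse when 'column' is full
);
  reg [COLS-1:0] mem [0:ROWS-1];
  reg [COLS-1:0] row_q;              // registered read data
  reg [AW-1:0]   rd_addr = 0, addr_q = 0;
  reg            busy = 1'b0, valid_q = 1'b0;

  // Synchronous write and synchronous read.
  always @(posedge clk) begin
    if (we) mem[waddr] <= wdata;
    row_q <= mem[rd_addr];
  end

  always @(posedge clk) begin
    done    <= 1'b0;
    valid_q <= busy;      // row_q holds mem[addr_q] one cycle after the read
    addr_q  <= rd_addr;

    if (start && !busy) begin
      busy    <= 1'b1;
      rd_addr <= 0;
    end else if (busy) begin
      if (rd_addr == ROWS-1) busy    <= 1'b0;
      else                   rd_addr <= rd_addr + 1'b1;
    end

    // Pick the wanted bit out of the row that was read last cycle.
    if (valid_q) begin
      column[addr_q] <= row_q[col_index];
      done           <= (addr_q == ROWS-1);
    end
  end
endmodule
```

A full column therefore takes roughly ROWS clock cycles; a second read port or wider words would shorten that, as discussed above.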
I am trying to do a VGA output using verilog but I can't seem to figure out why r_hcount stays X. The simulation waveforms show that r_vcount is being reset to 0 properly but for some reason r_hcount never gets reset to 0. I can't figure out why...
Verilog code:
module m_VGA640x480(
input wire iw_clock,
input wire iw_pix_stb,
input wire iw_rst,
output wire ow_hs,
output wire ow_vs,
output wire ow_blanking,
output wire ow_active,
output wire ow_screenend,
output wire ow_animate,
output wire [9:0] ow_x,
output wire [9:0] ow_y
);
localparam HS_STA = 16;
localparam HS_END = 16 + 96;
localparam HA_STA = 16 + 96 + 48;
localparam VS_STA = 480 + 11;
localparam VS_END = 400 + 11 + 2;
localparam VA_END = 480;
localparam LINE = 800;
localparam SCREEN = 524;
reg [9:0] r_hcount;
reg [9:0] r_vcount;
assign ow_hs = ~((r_hcount >= HS_STA) & (r_hcount < HS_END));
assign ow_vs = ~((r_vcount >= VS_STA) & (r_vcount < VS_END));
assign ow_x = (r_hcount < HA_STA) ? 0 : (r_hcount - HA_STA);
assign ow_y = (r_vcount >= VA_END) ? (VA_END - 1) : (r_vcount);
assign ow_blanking = ((r_hcount < HA_STA) | (r_vcount > VA_END - 1));
assign ow_active = ~((r_hcount < HA_STA) | (r_vcount > VA_END - 1));
assign ow_screenend = ((r_vcount == SCREEN - 1) & (r_hcount == LINE));
assign ow_animate = ((r_vcount ==VA_END - 1) & (r_hcount == LINE));
always @(posedge iw_clock)
begin
if (iw_rst)
begin
r_hcount <= 0;
r_vcount <= 0;
end
if (iw_pix_stb)
begin
if (r_hcount == LINE)
begin
r_hcount <= 0;
r_vcount <= r_vcount + 1;
end
else
r_hcount <= r_hcount + 1;
if (r_vcount == SCREEN)
r_vcount <= 0;
end
end
endmodule
Here is the result of the simulation. r_hcount is bugged... The code is supposed to set both counters to 0 when reset is 1 but for some reason it's not getting reset to 0. Please help.
Waveform
Looking at your code, I noticed one point that may cause the issue:
always @(posedge iw_clock)
begin
if (iw_rst)
//you define r_hcount <= 0 here
.....
if (iw_pix_stb) //<== another condition
// r_hcount <= 0 is also defined here
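If iw_rst and iw_pix_stb are both high on the same clock edge, both if blocks execute, and the later nonblocking assignment (r_hcount <= r_hcount + 1) overrides r_hcount <= 0. Since r_hcount is still X, X + 1 stays X.
I suggest chaining the two conditions so that reset takes priority:
else if (iw_pix_stb) // <== else if here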
Good luck.
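For reference, the counter block with the suggested `else if` could look like the sketch below. The module name `hv_counter` and the shortened port names are placeholders; the counting logic itself is kept as in the question.

```verilog
// Hypothetical stand-alone version of the two counters, with the
// reset given priority over the pixel strobe.
module hv_counter #(
  parameter LINE   = 800,
  parameter SCREEN = 524
) (
  input  wire       clk,
  input  wire       rst,      // synchronous, active high
  input  wire       pix_stb,
  output reg  [9:0] hcount,
  output reg  [9:0] vcount
);
  always @(posedge clk) begin
    if (rst) begin
      hcount <= 10'd0;
      vcount <= 10'd0;
    end else if (pix_stb) begin   // only reached when rst is low
      if (hcount == LINE) begin
        hcount <= 10'd0;
        vcount <= vcount + 10'd1;
      end else begin
        hcount <= hcount + 10'd1;
      end
      if (vcount == SCREEN)
        vcount <= 10'd0;
    end
  end
endmodule
```

Because the strobe branch is skipped while `rst` is high, the X + 1 assignment can no longer win over the reset.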
After tinkering a bit more with the code, I found out that it was because r_hcount <= 0 was getting overridden by r_hcount <= r_hcount + 1 which will set r_hcount to X. This was caused because the two clock inputs were both the same frequency.
I should be more careful in the future...
When I try to compile the following Verilog RTL, the Cadence simulator throws an "illegal operand for constant expression" error.
RTL is:
module selection_logic( data_out, data_in , valid_info);
input [(number_of_channel * per_channel_data) - 1 : 0] data_in;
input [number_of_channel - 1: 0] valid_info;
output reg [number_of_channel - 1 : 0] data_in;
integer i;
always @(*)
begin
for (i = 0; i < number_of_channel; i = i + 1)
begin
if (valid_info[i])
data_out[(per_channel_data*(i+1)) - 1: per_channel_data*i] = data_in[[(per_channel_data*(i+1)) - 1: per_channel_data*i]
else
data_out[(per_channel_data*(i+1)) - 1: per_channel_data*i] = {per_channel_data{1'b0};
end
end
endmodule
Array slicing with arrayName[MSB:LSB] requires MSB and LSB to be constants. Instead, use arrayName[start_bit +: WIDTH], where WIDTH is a constant and start_bit can be a variable. Refer to
"Indexing vectors and arrays with +:" and "What is `+:` and `-:`?"
data_out[per_channel_data*i +: per_channel_data] = data_in[per_channel_data*i +: per_channel_data];
If stuck with Verilog-1995, add a second for-loop and assign each bit individually (j must be declared as a second integer):
for(i=0; i<number_of_channel; i=i+1) begin
for(j=0; j<per_channel_data; j=j+1) begin
if (valid_info[i])
data_out[per_channel_data*i+j] = data_in[per_channel_data*i+j];
else
data_out[per_channel_data*i+j] = 1'b0;
end
end
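Put together, a complete module using the `+:` form might look like this sketch. The parameter defaults (4 channels of 8 bits) are assumptions, since the question does not show where `number_of_channel` and `per_channel_data` are defined, and I have assumed that `data_out` is meant to be as wide as `data_in`.

```verilog
// Sketch of the whole module with indexed part-selects (Verilog-2001).
module selection_logic #(
  parameter number_of_channel = 4,  // assumed default
  parameter per_channel_data  = 8   // assumed default
) (
  output reg  [(number_of_channel * per_channel_data) - 1 : 0] data_out,
  input  wire [(number_of_channel * per_channel_data) - 1 : 0] data_in,
  input  wire [number_of_channel - 1 : 0]                      valid_info
);
  integer i;

  always @(*) begin
    for (i = 0; i < number_of_channel; i = i + 1) begin
      if (valid_info[i])
        // Pass the channel through when its valid bit is set.
        data_out[per_channel_data*i +: per_channel_data] =
          data_in[per_channel_data*i +: per_channel_data];
      else
        // Otherwise drive that channel's slice to zero.
        data_out[per_channel_data*i +: per_channel_data] =
          {per_channel_data{1'b0}};
    end
  end
endmodule
```

The width after `+:` is a parameter, so it is a constant expression, while the base `per_channel_data*i` is free to change with the loop variable.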
I'm making an 8x32-bit register file. Below is my Verilog code:
module register_file(clk, reset, dstW, valW, write, srcA, srcB, valA, valB );
input clk;
input reset;
input[2:0] dstW;
input[31:0] valW;
input write;
input[2:0] srcA;
input[2:0] srcB;
output[31:0] valA;
output[31:0] valB;
reg[31:0] r0eax, r1ecx, r2edx, r3ebx, r4esi, r5edi, r6esp, r7edi;
wire[31:0] reg_input_0, reg_input_1, reg_input_2, reg_input3, reg_input4,
reg_input5, reg_input6, reg_input7;
wire[7:0] decoder_out, select;
assign valA =
(srcA == 3'b000) ? r0eax:
(srcA == 3'b001) ? r1ecx:
(srcA == 3'b010) ? r2edx:
(srcA == 3'b011) ? r3ebx:
(srcA == 3'b100) ? r4esi:
(srcA == 3'b101) ? r5edi:
(srcA == 3'b110) ? r6esp:
(srcA == 3'b111) ? r7edi: 32'bx;
assign valB =
(srcB == 3'b000) ? r0eax:
(srcB == 3'b001) ? r1ecx:
(srcB == 3'b010) ? r2edx:
(srcB == 3'b011) ? r3ebx:
(srcB == 3'b100) ? r4esi:
(srcB == 3'b101) ? r5edi:
(srcB == 3'b110) ? r6esp:
(srcB == 3'b111) ? r7edi: 32'bx;
assign decoder_out[0] = (dstW == 3'b000)? 1'b1 : 1'b0;
assign decoder_out[1] = (dstW == 3'b001)? 1'b1 : 1'b0;
assign decoder_out[2] = (dstW == 3'b010)? 1'b1 : 1'b0;
assign decoder_out[3] = (dstW == 3'b011)? 1'b1 : 1'b0;
assign decoder_out[4] = (dstW == 3'b100)? 1'b1 : 1'b0;
assign decoder_out[5] = (dstW == 3'b101)? 1'b1 : 1'b0;
assign decoder_out[6] = (dstW == 3'b110)? 1'b1 : 1'b0;
assign decoder_out[7] = (dstW == 3'b111)? 1'b1 : 1'b0;
and(select[0], write, decoder_out[0]);
and(select[1], write, decoder_out[1]);
and(select[2], write, decoder_out[2]);
and(select[3], write, decoder_out[3]);
and(select[4], write, decoder_out[4]);
and(select[5], write, decoder_out[5]);
and(select[6], write, decoder_out[6]);
and(select[7], write, decoder_out[7]);
assign reg_input_0 = select[0] ? valW : r0eax;
assign reg_input_1 = select[1] ? valW : r1ecx;
assign reg_input_2 = select[2] ? valW : r2edx;
assign reg_input_3 = select[3] ? valW : r3ebx;
assign reg_input_4 = select[4] ? valW : r4esi;
assign reg_input_5 = select[5] ? valW : r5edi;
assign reg_input_6 = select[6] ? valW : r6esp;
assign reg_input_7 = select[7] ? valW : r7edi;
always @(posedge clk or negedge reset)
begin
if(!reset) begin
r0eax <= 32'b0;
r1ecx <= 32'b0;
r2edx <= 32'b0;
r3ebx <= 32'b0;
r4esi <= 32'b0;
r5edi <= 32'b0;
r6esp <= 32'b0;
r7edi <= 32'b0;
end
else begin
r0eax <= reg_input_0;
r1ecx <= reg_input_1;
r2edx <= reg_input_2;
r3ebx <= reg_input_3;
r4esi <= reg_input_4;
r5edi <= reg_input_5;
r6esp <= reg_input_6;
r7edi <= reg_input_7;
end
end
endmodule
The testbench is as follows:
module tttt;
// Inputs
reg clk;
reg reset;
reg [2:0] dstW;
reg [31:0] valW;
reg write;
reg [2:0] srcA;
reg [2:0] srcB;
// Outputs
wire [31:0] valA;
wire [31:0] valB;
integer i;
// Instantiate the Unit Under Test (UUT)
register_file uut (
.clk(clk),
.reset(reset),
.dstW(dstW),
.valW(valW),
.write(write),
.srcA(srcA),
.srcB(srcB),
.valA(valA),
.valB(valB)
);
initial begin
// Initialize Inputs
clk = 0;
reset = 1;
dstW = 0;
valW = 0;
write = 0;
srcA = 0;
srcB = 0;
i =0;
#10
reset = 0;
#10
reset = 1;
// Wait 100 ns for global reset to finish
#100;
clk=1;
valW = 100;
write = 1;
for(i=0; i<8; i = i+1) begin
clk =0;
#10;
dstW = i;
clk = 1;
#10;
clk =0;
#10;
valW = valW + 10;
clk =1;
#10;
end
#100;
write =0;
for(i=0; i<8; i=i+1 ) begin
clk = 0;
#10;
srcA = i;
srcB = i;
#10;
clk=1;
#10;
end
clk =0;
#10;
clk = 1;
// Add stimulus here
end
endmodule
And the result:
The outputs read back as 0 after the third value of i.
I marked it with a red rectangle. Could you give me some advice? Thanks in advance.
When you synthesize your design, these warnings show up:
WARNING:Xst:1780 - Signal <reg_input7> is never used or assigned. This unconnected signal will be trimmed during the optimization process.
WARNING:Xst:1780 - Signal <reg_input6> is never used or assigned. This unconnected signal will be trimmed during the optimization process.
WARNING:Xst:1780 - Signal <reg_input5> is never used or assigned. This unconnected signal will be trimmed during the optimization process.
WARNING:Xst:1780 - Signal <reg_input4> is never used or assigned. This unconnected signal will be trimmed during the optimization process.
WARNING:Xst:1780 - Signal <reg_input3> is never used or assigned. This unconnected signal will be trimmed during the optimization process.
If the synthesizer detects that those signals are not being used, it discards them. Note that these signals affect the selection of registers 3 to 7, and because of that, you cannot see the loaded value when you read them.
But... your code assigns and uses these signals, doesn't it?
assign reg_input_0 = select[0] ? valW : r0eax;
assign reg_input_1 = select[1] ? valW : r1ecx;
assign reg_input_2 = select[2] ? valW : r2edx;
assign reg_input_3 = select[3] ? valW : r3ebx;
assign reg_input_4 = select[4] ? valW : r4esi;
assign reg_input_5 = select[5] ? valW : r5edi;
assign reg_input_6 = select[6] ? valW : r6esp;
assign reg_input_7 = select[7] ? valW : r7edi;
What makes reg_input_0,1 and 2 different from reg_input_3,4,5,6 and 7 ? This:
wire[31:0] reg_input_0, reg_input_1, reg_input_2, reg_input3, reg_input4,
reg_input5, reg_input6, reg_input7;
Look: reg_input_0, reg_input_1 and reg_input_2. Then, reg_input3 (where's the underscore??)
As reg_input_3 to reg_input_7 are not declared, they default to implicit 1-bit nets instead of 32-bit ones. When you use the multiplexer, at reg_input_3 for instance, to define its value...
assign reg_input_3 = select[3] ? valW : r3ebx;
You are actually synthesizing this:
assign reg_input_3 = select[3] ? valW[0] : r3ebx[0];
And in your clocked always block, the actual register assignment is not this:
r3ebx <= reg_input_3;
But as this:
r3ebx[0] <= reg_input_3;
This description of yours causes feedback from the register output to its input through the mentioned multiplexer. While this is OK when there is a clock-triggered register in the loop, if the synthesizer doesn't detect it, you will end up generating a lot of unnecessary multiplexers. Look at this schematic generated from the results of the synthesis process (the synthesizer is XST)
The eight squares at the right are your eight registers. At the left, there is a massive number of multiplexers. I cannot show you all of them, because the generated schematic is too large for a screen capture.
I suggest not using an explicit loopback path through the multiplexer to decide when to write a new value to the register. Instead, modify your synchronous always block to load a new value into a register only if that register is selected for writing:
always @(posedge clk or negedge reset)
begin
if(!reset) begin
r0eax <= 32'b0;
r1ecx <= 32'b0;
r2edx <= 32'b0;
r3ebx <= 32'b0;
r4esi <= 32'b0;
r5edi <= 32'b0;
r6esp <= 32'b0;
r7edi <= 32'b0;
end
else begin
if (select[0])
r0eax <= valW;
if (select[1])
r1ecx <= valW;
if (select[2])
r2edx <= valW;
if (select[3])
r3ebx <= valW;
if (select[4])
r4esi <= valW;
if (select[5])
r5edi <= valW;
if (select[6])
r6esp <= valW;
if (select[7])
r7edi <= valW;
end
end
This description allows the synthesizer to infer a register with CLK and CE inputs: the register accepts a new value from its D input only when CE is asserted; otherwise its value doesn't change. Your description makes the register load a value on every clock cycle, whether it is needed or not.
Now the circuit inferred is as this (it actually fits on screen!):
With this proposed solution, the first block, where the different reg_input_X signals are assigned, can be eliminated.
Tested using ISIM with ISE Webpack 12.4 and works :)
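As a side note, this kind of typo can be caught at compile time by disabling implicit nets with the standard `default_nettype none` directive: an undeclared name such as reg_input_3 then becomes an error instead of a silent 1-bit wire. The sketch below combines that with a more compact, array-based description of the same register file. The array form is my own variation and not the code from the question, and I have not run it through XST.

```verilog
`default_nettype none   // undeclared identifiers are now compile errors

// Array-based variant of the 8x32-bit register file (illustrative sketch).
module register_file (
  input  wire        clk,
  input  wire        reset,   // active low, asynchronous (as in the question)
  input  wire [2:0]  dstW,
  input  wire [31:0] valW,
  input  wire        write,
  input  wire [2:0]  srcA,
  input  wire [2:0]  srcB,
  output wire [31:0] valA,
  output wire [31:0] valB
);
  reg [31:0] regs [0:7];
  integer k;

  // Two asynchronous read ports.
  assign valA = regs[srcA];
  assign valB = regs[srcB];

  // One write port: only the addressed register is loaded.
  always @(posedge clk or negedge reset) begin
    if (!reset) begin
      for (k = 0; k < 8; k = k + 1)
        regs[k] <= 32'b0;
    end else if (write) begin
      regs[dstW] <= valW;
    end
  end
endmodule

`default_nettype wire   // restore the default for files compiled afterwards
```

The port list is the same as in the question, so the existing testbench should be able to instantiate it unchanged.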