< bit half-byte byte ... > memory access in 32-bit memory using verilog - verilog

This is my synthesizable memory model in Verilog.
module memory(
output reg [31:0] data_out,
input [31:0] address,
input [31:0] data_in,
input write_enable,
input clk
);
reg [31:0] memory [0:255];
always #(posedge clk) begin
if (write_enable) begin
memory[address] <= data_in;
end
data_out <= memory[address];
end
endmodule
For example:
memory[32'h10] contains 0xAAAAAAAA
I just want to write one byte of data 0xFF in memory address 0x10 so that
memory[32'h10] contains 0xFFAAAAAA
Can you recommend a good way to change my code so that I can access only one bit, half-byte, byte, halfword, or word in my memory module?

You only declared 256 words of 32-bits, but your address bus is 32-bits wide, allowing up to 2^32 words of 32-bits. You might want to reduce your address bus width to 8-bits to match the number of words you declared.
For Xilinx FPGAs I use the CORE Generator tool to instantiate one or more BlockRAMs of the right width and depth. BlockRAMs have an option to support individual byte enables.
This code might work, but I haven't tried it
module memory (
output reg [31:0] data_out,
input [7:0] address,
input [31:0] data_in,
input [3:0] write_enable,
input clk
);
reg [31:0] memory [0:255];
reg [31:0] memory_in = 0; // wire reg
always #* begin : combinational_logic
memory_in = memory[address];
if (write_enable[3])
memory_in[31:24] = data_in[31:24];
if (write_enable[2])
memory_in[23:16] = data_in[23:16];
if (write_enable[1])
memory_in[15:8] = data_in[15:8];
if (write_enable[0])
memory_in[7:0] = data_in[7:0];
end
always #(posedge clk) begin : sequential_logic
if (|write_enable) begin
memory[address] <= memory_in;
end
data_out <= memory[address];
end
endmodule

What 'a good way' is depends on your synthesis target. If it's an FPGA you should consider that bit-wise write access for large memories is generally not a good idea. This will possibly prevent the memory from mapping to RAM resources, dramatically increasing routing costs.
Byte enables are generally directly supported. You can view the Xilinx coding guidelines here where it describes byte enables on page 159.

Related

Writing random data to a RAM in a testbench

I am working with RAM in Verilog, and I need to implement a test bench where I will confirm the correct operation of the three memory processes (write data, read data and read commands). I have written a testbench where it seems to be writing and reading some integer numbers, but is there any way to fill the memory with words or strings to be more clear randomly?
This is my testbench:
module ramtest();
parameter WORD_SIZE=8;
parameter ADDR_WIDTH=8;
parameter RAM_SIZE=1<<ADDR_WIDTH;
reg we;
reg re;
reg [ADDR_WIDTH-1:0] addr;
reg [ADDR_WIDTH-1:0] instraddr;
reg [WORD_SIZE-1:0] datawr;
reg Clk;
reg [WORD_SIZE-1:0] mem[RAM_SIZE-1:0];
wire [WORD_SIZE-1:0] datard;
wire [WORD_SIZE-1:0] instrrd;
MCPU_RAMController raminst (.we(we),.datawr(datawr),.re(re),.addr(addr),.datard(datard),.instraddr(instraddr),.instrrd(instrrd));
integer i;
initial begin
we=0;
datawr=0;
instraddr=0;
addr=1;
#20;
for(i=0;i<RAM_SIZE;i=i+1) begin
datawr=i;
addr=i-1;
#10;
end
we=0;
addr=1;
instraddr=0;
for(i=0;i<RAM_SIZE;i=i+1) begin
addr=i-1;
#10;
end
end
endmodule
And here is the RAM controller code where I need to test:
module MCPU_RAMController(we, datawr, re, addr, datard, instraddr, instrrd);
parameter WORD_SIZE=8;
parameter ADDR_WIDTH=8;
parameter RAM_SIZE=1<<ADDR_WIDTH;
input we, re;
input [WORD_SIZE-1:0] datawr;
input [ADDR_WIDTH-1:0] addr;
input [ADDR_WIDTH-1:0] instraddr;
output [WORD_SIZE-1:0] datard;
output [WORD_SIZE-1:0] instrrd;
reg [WORD_SIZE-1:0] mem[RAM_SIZE-1:0];
reg [WORD_SIZE-1:0] datard;
reg [WORD_SIZE-1:0] instrrd;
always # (addr or we or re or datawr)
begin
if(we)begin
mem[addr]=datawr;
end
if(re) begin
datard=mem[addr];
end
end
always # (instraddr)
begin
instrrd=mem[instraddr];
end
endmodule
Currently, you are filling the memory with incrementing values (0, 1, 2, etc.) at incrementing addresses. One way to fill the memory with random data values is to use the $random system function.
In the testbench, change:
datawr=i;
to:
datawr=$random;
See also IEEE Std 1800-2017, section 18.13 Random number system functions and methods for more modern random functions ($urandom, etc.).

Quartus does not allow using a Generate block in Verilog

Pretty simple problem. Given the following code:
module main(
output reg [1:0][DATA_WIDTH-1:0] dOut,
input wire [1:0][DATA_WIDTH-1:0] dIn,
input wire [1:0][ADDR_WIDTH-1:0] addr,
input wire [1:0] wren,
input wire clk
);
parameter DATA_WIDTH = 16;
parameter ADDR_WIDTH = 6;
reg [DATA_WIDTH-1:0] ram [2**ADDR_WIDTH-1:0];
generate
genvar k;
for(k=0; k<2; k=k+1) begin: m
always #(posedge clk) begin
if(wren[k])
ram[addr[k]] <= dIn[k];
dOut[k] <= ram[addr[k]];
end
end
endgenerate
endmodule
quarus 13.0sp1 gives this error (and its 20 other ill-begotten fraternally equivalent siblings):
Error (10028): Can't resolve multiple constant drivers for net "ram[63][14]" at main.v(42)
But if I manually un-roll the generate loop:
module main(
output reg [1:0][DATA_WIDTH-1:0] dOut,
input wire [1:0][DATA_WIDTH-1:0] dIn,
input wire [1:0][ADDR_WIDTH-1:0] addr,
input wire [1:0] wren,
input wire clk
);
parameter DATA_WIDTH = 16;
parameter ADDR_WIDTH = 6;
reg [DATA_WIDTH-1:0] ram [2**ADDR_WIDTH-1:0];
always #(posedge clk) begin
if(wren[0])
ram[addr[0]] <= dIn[0];
dOut[0] <= ram[addr[0]];
end
always #(posedge clk) begin
if(wren[1])
ram[addr[1]] <= dIn[1];
dOut[1] <= ram[addr[1]];
end
endmodule
It all becomes okay with the analysis & synthesis step.
What's the cure to get the generate loop running?
I think the correct way is in the lines of what it's explained in this question: Using a generate with for loop in verilog
Which would be transferred to your code as this:
module main(
output reg [1:0][DATA_WIDTH-1:0] dOut,
input wire [1:0][DATA_WIDTH-1:0] dIn,
input wire [1:0][ADDR_WIDTH-1:0] addr,
input wire [1:0] wren,
input wire clk
);
parameter DATA_WIDTH = 16;
parameter ADDR_WIDTH = 6;
reg [DATA_WIDTH-1:0] ram [2**ADDR_WIDTH-1:0];
integer k;
always #(posedge clk) begin
for(k=0; k<2; k=k+1) begin:
if(wren[k])
ram[addr[k]] <= dIn[k];
dOut[k] <= ram[addr[k]];
end
end
endmodule
Keeping all accesses to your dual port RAM in one always block is convenient so the synthesizer can safely detect that you are efefctively using a dual port RAM at register ram.
Both the generate loop and unrolled versions should not have passed synthesis. In both cases the same address in ram can be assigned by both always blocks. Worse, if both bits of wren are high with both addresses being the same and data being different, then the result is indeterminable. The Verilog LRM states last assignment on a register wins and always blocks with the same trigger could be evaluated in any order.
Synthesis requires assignments to registers to be deterministic. Two (or more) always blocks having write access to the same bit is illegal because nondeterministic. If the unrolled is synthesizing correctly, then that means there are constants on wren and addr outside of the shown module that make it logically impossible for write conflict; for some reason the generate loop version is not getting the same optimization. Example of constraints that would allow optimization to prevent multi-always block write access:
One wren is hard coded to 0. Therefore only one block has exclusive access
Address have non overlapping sets of possible values. Ex addr[0] can only be even while addr[1] can only be odd, or addr[0] < 2**(ADDR_WIDTH/2) and addr[1] >= 2**(ADDR_WIDTH/2).
Synthesis is okay with dOut being assigned by two always blocks because each block has exclusive write access to its target bits (non overlapping sets of possible address values).
The single always block in mcleod_ideafix answer is the preferred solution. If both bits of wren are high with both addresses being the same, then wren[1] will always win. If wren[0] should have priority, then make the for-loop a count down.

Data memory unit

I started Verilog a few weeks ago and now I'm implementing MIPS pipelining on an FPGA board and I'm on the MEM part of the pipelining stage. I'm trying to code the Data memory unit (in picture -> Data memory Unit).
I don't understand the use of memread. I understand that if memwrite is 1, the contents of the current address is passed to read data.
So far, this is my code:
module data_memory (
input wire [31:0] addr, // Memory Address
input wire [31:0] write_data, // Memory Address Contents
input wire memwrite, memread,
output reg [31:0] read_data // Output of Memory Address Contents
);
reg [31:0] MEMO[0:255]; // 256 words of 32-bit memory
integer i;
initial begin
read_data <= 0;
for (i = 0; i < 256; i = i + 1)
MEMO[i] = i;
end
always # (addr) begin
//**I don't understand the use of memread**//
if (memwrite == 1'b1)
MEMO[addr] <= write_data;
end
end
assign read_data = MEMO[addr];
endmodule
Do I need another if statement for the memread? Any help is greatly appreciated. Thanks
In the design you have coded above, you dont use memread, instead choosing to combinationally read from the memory via the last line of your module. And without more details on how exactly the memory in your diagram is suppose to function, its difficult to say the exact usage of memread. Typical memories only have a memwrite and assume that if an address is supplied and memwrite is deasserted, the access is a read. In this case, I can only assuming memread should be asserted to read from the memory. Also, I would suggest a few edits to your code to make it work better and follow a better synchronous design style (this will incorporate memread so you can see how it can be used):
module data_memory (
input wire [31:0] addr, // Memory Address
input wire [31:0] write_data, // Memory Address Contents
input wire memwrite, memread,
input wire clk, // All synchronous elements, including memories, should have a clock signal
output reg [31:0] read_data // Output of Memory Address Contents
);
reg [31:0] MEMO[0:255]; // 256 words of 32-bit memory
integer i;
initial begin
read_data <= 0;
for (i = 0; i < 256; i = i + 1) begin
MEMO[i] = i;
end
end
// Using #(addr) will lead to unexpected behavior as memories are synchronous elements like registers
always #(posedge clk) begin
if (memwrite == 1'b1) begin
MEMO[addr] <= write_data;
end
// Use memread to indicate a valid address is on the line and read the memory into a register at that address when memread is asserted
if (memread == 1'b1) begin
read_data <= MEMO[addr];
end
end
endmodule
Important to note also the need for a clock in your design. Most block diagrams at that level will omit the clock as it is assumed but all synchronous elements (memories and registers) will be synchronized to a common clock (or multiple clocks in some cases).
#Unn gives excellent answer, moreover I just want add that, if you not use read_enable, Then it may unsynchronised data read operation, It is also preferred to flop the output read_data on read_clk.
Here with see below templent for reference.
parameter RAM_WIDTH = <ram_width>;
parameter RAM_ADDR_BITS = <ram_addr_bits>;
(* RAM_STYLE="{AUTO | BLOCK | BLOCK_POWER1 | BLOCK_POWER2}" *)
reg [RAM_WIDTH-1:0] <ram_name> [(2**RAM_ADDR_BITS)-1:0];
reg [RAM_WIDTH-1:0] <output_dataB>;
<reg_or_wire> [RAM_ADDR_BITS-1:0] <addressA>, <addressB>;
<reg_or_wire> [RAM_WIDTH-1:0] <input_dataA>;
// The forllowing code is only necessary if you wish to initialize the RAM
// contents via an external file (use $readmemb for binary data)
initial
$readmemh("<data_file_name>", <ram_name>, <begin_address>, <end_address>);
always #(posedge <clockA>)
if (<enableA>)
if (<write_enableA>)
<ram_name>[<addressA>] <= <input_dataA>;
always #(posedge <clockB>)
if (<enableB>)
<output_dataB> <= <ram_name>[<addressB>];

Default values of RAM

I writing in Verilog HDL for synthesis and I want to instantiate a DUAL PORT RAM with default values (zeros), how can do it?
Thanks, Netanel
As you mentioned Virtex-7 - look in the Xilinx Synthesis manual for examples of how to write Verilog that infers a memory block.
In Appendix C you can find this code:
// Dual-Port Block RAM with Two Write Ports
// File: HDL_Coding_Techniques/rams/rams_16.v
module v_rams_16 (clka,clkb,ena,enb,wea,web,addra,addrb,dia,dib,doa,dob);
input clka,clkb,ena,enb,wea,web;
input [9:0] addra,addrb;
input [15:0] dia,dib;
output [15:0] doa,dob;
reg [15:0] ram [1023:0];
reg [15:0] doa,dob;
always #(posedge clka) begin if (ena)
begin
if (wea)
ram[addra] <= dia;
doa <= ram[addra];
end
end
always #(posedge clkb) begin if (enb)
begin
if (web)
ram[addrb] <= dib;
dob <= ram[addrb];
end
end
endmodule
I'm not a Verilogger, but I'm sure you can tweak the ram declaration to make it initialise with all zeros.

Accessing instance of one module from inside of several other modules? (verilog)

I am writing a simple system in which there is a memory module (simple reg with a read and write signals). Now this memory has to be accessed by several other modules (not simultaneously). So I create an instance of this memory and feed data to it. But I can't figure out how will my other modules access the same instance of the memory module. Any help?
Edit
Let me clarify a bit by some code. This is my memory module, simple signals.
module rom(
input [15:0] addr,
input [15:0] data_in,
input rd,
input wr,
input cs,
output reg [15:0] data_out
);
reg [15:0] mem[255:0];
integer k;
initial begin
for(k = 0;k<256;k=k+2)
mem[k] = 16'h0011;
for(k = 1;k<256;k=k+2)
mem[k] = 16'h0101;
end
always #(cs)begin
if(wr)
mem[addr] <= data_in;
if(rd)
data_out <= mem[addr];
end
endmodule
This will be instantiated in my top module, something like this
module Top;
// Inputs
reg [15:0] addr;
reg [15:0] data_in;
reg rd;
reg wr;
reg cs;
// Outputs
wire [15:0] data_out;
// Instantiate the Unit Under Test (UUT)
rom uut (
.addr(addr),
.data_in(data_in),
.rd(rd),
.wr(wr),
.cs(cs),
.data_out(data_out)
);
....
....
....
endmodule
Now this top module will also contain some other modules which would want to connect to memory. I don't really understand how I would connect them. Suppose there is one module like this
module IF_stage(
input clk,
input rst,
output reg [15:0] pc,
output [15:0] instruction
);
//pc control
always#(posedge clk or posedge rst)
begin
if(rst)
pc <= 16'hFFFF;
else
pc <= pc+1;
end
....
How would I access the memory module from here?
Answering your comment, you don't instance the memory more than once. You create one instance of the memory at some level of the hierarchy, and then connect all the consumers to it via ports/wires. So in the top level you might have 3 modules that want to access memory, and 1 memory module. The three accessors each connect to the single instance, they don't instance their own memory.
The memory should be parallel to the other modules, not inside of them, if that makes sense.
You need to modify IF_stage to add an interface that can communicate with the memory, something like this:
module IF_stage(
input clk,
input rst,
input [15:0] read_data_from_memory, //new
input read_data_from_memory_valid, //new
output reg [15:0] pc,
output [15:0] instruction
output do_memory_write //new
output do_memory_read //new
output [15:0] memory_write_data //new
output [15:0] addr //new
);
Then when an IF_stage wants to read or write memory, it puts addr/data on it's output port to issue command to the memory module, and then waits for read_data_from_memory(_valid) to be asserted on it's input port. These outputs and inputs get connected to the memory module at the top level.
You'll also have to deal with bus contention here, for instance if two instances of IF_stage try to read/write at the same time, you'll need some kind of arbiter module to acknowledge both requests, and then forward them one at a time to the memory, and return the valid data to the proper module.
First of all, the name of your memory is "rom", which is read only. I assume it is a typo, otherwise there is no need to have wr port and you can simply implement separate roms in clients and let synthesizer to optimize the design.
For your question, basically you need an arbiter to handle the contention between multiple clients. All clients can assume they occupy the memory exclusively but the memory is shared by all clients and cannot be accessed simultaneously.
Tim is right about the IF_stage. Every client must have a separate memory interface
output [15:0] addr;
output [15:0] data_out;
input [15:0] data_in;
output wr, rd, cs;
input rdy; // only when rdy == 1, the memory operation is finished
You will need a memory controller/arbiter, which behaves as a memory to all clients but actually handle the contention among clients. Assuming there are three clients and all clients access the memory less than once per three cycles, you can simply have something as follow:
module mem_ctl(
addr_c1, dw_c1, dr_c1, wr_c1, rd_c1, cs_c1,
addr_c2, dw_c2, dr_c2, wr_c2, rd_c2, cs_c2,
addr_c3, dw_c3, dr_c3, wr_c3, rd_c3, cs_c3,
addr_m, dw_m, dr_m, wr_m, rd_m, cs_m,
rdy_c1, rdy_c2, rdy_c3,
rst_n, clk
);
input clk, rst_n;
input [15:0] addr_c1, addr_c2, addr_c3, dw_c1, dw_c2, dw_c3; // addr and data_write from clients
output [15:0] dr_c1, dr_c2, dr_c3; // data read from clients
input wr_c1, wr_c2, wr_c3, rd_c1, rd_c2, rd_c3, cs_c1, cs_c2, cs_c3; // control from clients
output [15:0] addr_m, dw_m; // addr and data write to memory
input [15:0] dr_m;
output wr_m, rd_m, cs_m; // control the memory
output rdy_c1, rdy_c2, rdy_c3;
reg [15:0] dr_c1, dr_c2, dr_c3, dw_m, addr_m;
reg wr_m, rd_m, cs_m;
reg [1:0] cnt;
always #(posedge clk or negedge rst_n)
if (~rst_n)
cnt <= 0;
else if(cnt == 2'd2)
cnt <= 0;
else
cnt <= cnt + 1;
always #(*) // Verilog 2001, if not recognizable, fill in yourself
begin
case(cnt)
0: begin
dw_m = dw_c1;
wr_m = wr_c1;
cs_m = cs_c1;
rd_m = rd_c1;
dr_c1 = dr_m;
end
1: begin
dw_m = dw_c2;
wr_m = wr_c2;
cs_m = cs_c2;
rd_m = rd_c2;
dr_c2 = dr_m;
end
default: begin
dw_m = dw_c3;
wr_m = wr_c3;
cs_m = cs_c3;
rd_m = rd_c3;
dr_c3 = dr_m;
end
endcase
end
assign rdy_c1 = (cnt == 0) & cs_c1;
assign rdy_c2 = (cnt == 1) & cs_c2;
assign rdy_c3 = (cnt == 2) & cs_c3;
endmodule
However, this is only OK when the access rates of all clients are lower than once per three cycles. If the access rate is variant and higher than that, you will need a real arbiter in the mem_ctl module. I think a round-robin arbiter will be fine.
The final comment, if the accumulative access rates of all clients is larger than once per cycle, it is impossible to handle in hardware. In that case, you will need to do it in other ways.

Resources