The Verilog code below uses a multi-dimensional register array to store data.
parameter DSIZE = 8;
parameter ASIZE = 4;
input [DSIZE-1:0] wdata;
input wclk,wen;
reg [ASIZE:0] wptr;
parameter MEMDEPTH = 1<<ASIZE;
reg [DSIZE-1:0] ex_mem [0:MEMDEPTH-1];
always @(posedge wclk)
  if (wen)
    ex_mem[wptr[ASIZE-1:0]] <= wdata;
I do not properly understand what happens in the last assignment statement in which ex_mem is assigned the value in wdata. What does the part in the brackets (wptr[ASIZE-1:0]) associated with ex_mem return and to what location of ex_mem does wdata get stored into?
In the code, ex_mem is a memory that has 16 (MEMDEPTH) slots. Each slot has 8 (DSIZE) bits. 16 slots can be addressed by 4 (ASIZE) bits, but wptr is a 5-bit signal for some reason, so its most significant bit (MSB) is not used for addressing the memory.
ex_mem[wptr[ASIZE-1:0]] <= wdata;
Since wptr[ASIZE-1:0] is a 4-bit signal (for ASIZE=4), the assignment above may write to a slot between ex_mem[0] and ex_mem[15].
'wptr' is just a one-dimensional register.
So, first of all, Verilog extracts an index into ex_mem from 'wptr'. It uses the part-select range ASIZE-1:0 to do so.
If ASIZE is 4, as in your example, the index can take values from 0 to 15. For example,
reg [4:0] wptr = 5'h1B;
wptr[3:0] will give you 4'hB (11).
Now this index value will be applied to the ex_mem array to write your data.
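To make the part-select concrete, here is a minimal simulation sketch. The module name wptr_index_demo is made up for illustration; the parameters and the 5'h1B value mirror the ones discussed above.

```verilog
// Hypothetical demo module (not part of the original design).
module wptr_index_demo;
    parameter ASIZE = 4;
    reg [ASIZE:0] wptr;              // 5 bits, as in the question

    initial begin
        wptr = 5'h1B;                // binary 1_1011, decimal 27
        // The part-select drops the MSB and keeps the low 4 bits (1011).
        $display("wptr = %0d, wptr[ASIZE-1:0] = %0d", wptr, wptr[ASIZE-1:0]);
        // prints: wptr = 27, wptr[ASIZE-1:0] = 11
    end
endmodule
```

So with this pointer value the write in the question would land in ex_mem[11].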
This is the simple code I wrote. From 'outt' I get 122116, but if I change the width of 'outt' to 33 bits ([32:0]), the code seems to work and gives the correct answer, -140028. What is the reason for this behaviour?
`timescale 1ns / 1ps
module valu_parser(clk,outt);
input clk;
reg signed [31:0] r_1;
reg signed [31:0] r_2;
output reg signed [31:0] outt;
initial begin
r_1 = -47938;
r_2 = -150096;
end
always @ (posedge clk) begin
outt <= ((r_1 + r_2)* 11585 + 8192)>>>14;
end
endmodule
You are performing an operation that needs at least 33 bits (the intermediate result before the right shift needs 33 bits). In general it could need about 32 bits plus the width of the constant multiplier, assuming that r_1 and r_2 are not constants.
If you think of the hardware your code will generate, these bits need to be stored somewhere temporarily so that the hardware can perform the additions and the multiplication before the right shift.
This will do the trick, but will also generate more registers than you wanted initially. If you are using this module to generate a constant, I would recommend hard-coding the constant.
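module valu_parser(clk,outt);
  input clk;
  reg signed [31:0] r_1;
  reg signed [31:0] r_2;
  reg signed [32:0] temp;        // 33 bits: wide enough for the intermediate result
  output signed [31:0] outt;     // a net, because it is driven by a continuous assign
  initial begin
    r_1 = -47938;
    r_2 = -150096;
  end
  always @ (posedge clk) begin
    temp <= ((r_1 + r_2)* 11585 + 8192);
  end
  assign outt = temp>>>14;       // arithmetic shift on the 33-bit signed value
endmodule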
The concept can be seen here: https://www.edaplayground.com/x/3BXy.
In an expression Verilog needs to decide how many bits to use in the calculation.
The + and * operators result in what are called context-determined expressions. With the expression F = A + B; the number of bits used is the maximum of the widths of F, A, and B. This usually works fine, because normally you would ensure that F was wide enough to store the result of adding A and B. Likewise, the expression F = A * B; usually works fine, because normally you would ensure that F was wide enough to store the result of multiplying A and B.
However, by adding the shift right operator you have been able to make the variable being assigned narrower than the number of bits actually needed to calculate the expression on the left of the shift operator. The number of bits Verilog uses in the calculation is the maximum of the width of outt, r_1, r_2, 11585 and 8192. All of these are 32 bits wide (including 11585 and 8192), so 32 bits are used in the calculation. As you have discovered, 32 bits is not enough, but, with the values you have chosen, 33 bits is. With other values, 33 bits wouldn't be enough either. For a completely flexible solution, you should be using 66 bits (32 + 32 + 1 + 1) - 32 bits + 32 bits for the multiplication plus 1 more bit for each addition.
The solution to your problem is to make r_1 and/or r_2 wider or to use an intermediate value (as suggested by Hida's answer here).
When you multiply -198034 (r_1 + r_2) by 11585, the highest bit of the 32-bit result (outt[31]) is 0, so as a 32-bit signed value it is treated as positive and you get a positive answer. When you change the width to 33 bits, the highest bit (outt[32]) is 1, so the value is negative and you get the correct answer.
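The width effect described in these answers can be reproduced in a small simulation. This is only a sketch: the module name width_demo and the variables narrow and wide are made up for illustration, while the operands and the expression are the ones from the question.

```verilog
// Hypothetical demo module: same expression, two different target widths.
module width_demo;
    reg signed [31:0] r_1, r_2;
    reg signed [31:0] narrow;   // 32-bit target: expression evaluated in 32 bits
    reg signed [32:0] wide;     // 33-bit target: expression evaluated in 33 bits

    initial begin
        r_1 = -47938;
        r_2 = -150096;
        // The widest operand or target sets the width of the whole expression,
        // including the part to the left of the shift.
        narrow = ((r_1 + r_2) * 11585 + 8192) >>> 14;
        wide   = ((r_1 + r_2) * 11585 + 8192) >>> 14;
        $display("narrow = %0d, wide = %0d", narrow, wide);
        // prints: narrow = 122116, wide = -140028
    end
endmodule
```

The intermediate value is -2294215698, which is below the 32-bit signed minimum of -2147483648, so in 32 bits it wraps around to a positive number before the shift is applied.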
In the following Verilog code snippet for implementing an input buffer for a router, in second line, what is the role of 1<<`BUF_WIDTH? I understand that << is the left shift operator, but what happens by left shifting 1 by `BUF_WIDTH? Or is there some other function of << operator?
`define BUF_WIDTH 3 // BUF_SIZE = 16 -> BUF_WIDTH = 4, no. of bits to be used in pointer
`define BUF_SIZE ( 1<<`BUF_WIDTH )
module fifo13( clk, rst, buf_in, buf_out, wr_en, rd_en, buf_empty, buf_full, fifo_counter );
input rst, clk, wr_en, rd_en;
input [7:0] buf_in; // data input to be pushed to buffer
output[7:0] buf_out;// port to output the data using pop.
output buf_empty, buf_full; // buffer empty and full indication
output[`BUF_WIDTH :0] fifo_counter; // number of data pushed in to buffer
reg[7:0] buf_out;
reg buf_empty, buf_full;
reg[`BUF_WIDTH :0] fifo_counter;
reg[`BUF_WIDTH -1:0] rd_ptr, wr_ptr; // pointer to read and write addresses
reg[7:0] buf_mem[`BUF_SIZE -1 : 0];
.
.
.
The entire code is available on http://electrosofts.com/verilog/fifo.html
You assume correctly that << is the left-shift operator, it has no other special meaning.
Shifting the binary representation of a number left by one position is equivalent to multiplying the number by 2. So, by shifting 1 to the left N times, you get 2 to the power of N as a result.
The way this is used in the code sample ensures that the buffer has exactly as many entries (BUF_SIZE) as can be uniquely addressed by a pointer of size BUF_WIDTH.
It is the bit shift operator. Think what it does: it shifts bits left. You have a definition of BUF_WIDTH being 3. Then you take 1, shift it by that many places and you get 8 for BUF_SIZE. With three bits you can have 8 different values.
So this is a way to define these two constants so that you only have to change one value. If they were two independent constants, someone might accidentally change one and not the other, and this would cause problems.
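As a quick check of the shift arithmetic, the following sketch prints a few powers of two and the resulting BUF_SIZE. The module name shift_demo is made up for illustration; the two macros are copied from the question.

```verilog
`define BUF_WIDTH 3
`define BUF_SIZE  ( 1<<`BUF_WIDTH )

// Hypothetical demo module (not part of the FIFO).
module shift_demo;
    integer n;
    initial begin
        // 1 << n is 2 to the power of n: 1, 2, 4, 8, 16
        for (n = 0; n <= 4; n = n + 1)
            $display("1 << %0d = %0d", n, 1 << n);
        // With BUF_WIDTH = 3 the buffer gets 8 entries, addressed by 3-bit pointers.
        $display("BUF_SIZE = %0d", `BUF_SIZE);   // prints: BUF_SIZE = 8
    end
endmodule
```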
I'm new to Verilog, ISE, and FPGAs. I'm trying to implement a simple design in an FPGA, but the entire design is being optimized away. It is basically a 2D array with some arbitrary values. Here is the code:
module top(
output reg out
);
integer i;
integer j;
reg [5:0] array [0:99][0:31];
initial begin
for(i=0;i<100;i=i+1) begin
for(j=0;j<32;j=j+1) begin
array[i][j] = j;
out = array[i][j];
end
end
end
endmodule
It passes XST Synthesis fine, but it fails MAP in the Implementation process. Two Errors are given:
ERROR:Map:116 - The design is empty. No processing will be done.
ERROR:Map:52 - Problem encountered processing RPMs.
The entire code is being optimized away in XST. Why? What am I doing wrong?
The reason your design is being synthesized away is because you have not described any logic in your module.
The only block in your design is an initial block, which is typically not used in synthesis except in limited cases; the construct is mainly used for testbenches in simulation (running the Verilog through ModelSim or another simulator).
What you want is to use always blocks or assign statements to describe logic for XST to synthesize into a netlist for the FPGA to emulate. As the module you provided has neither of these constructs, no netlist can be generated, thus nothing synthesized!
In your case, it is not entirely clear what logic you want to describe as the result of your module will always have out equal to 31. If you want out to cycle through the values 0 to 31, you'll need to add some sequential logic to implement that. Search around the net for some tutorials on digital design so you have the fundamentals down (combinational logic, gates, registers, etc). Then, think about what you want the design to do and map it to those components. Then, write the Verilog that describes that design.
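As a rough sketch of what such sequential logic could look like, the module below steps a 5-bit output through 0 to 31 on each clock edge. The module name and the clk and rst ports are assumptions added for illustration; they are not part of the original design.

```verilog
// Hypothetical example: a free-running 5-bit counter.
module cycle_counter (
    input            clk,   // assumed clock input
    input            rst,   // assumed synchronous, active-high reset
    output reg [4:0] out    // 5 bits, enough to hold 0..31
);
    always @(posedge clk) begin
        if (rst)
            out <= 5'd0;
        else
            out <= out + 5'd1;   // wraps from 31 back to 0
    end
endmodule
```

Because out now depends on a clock and changes over time, the synthesis tool has real logic (five flip-flops and an incrementer) to keep.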
EDIT IN LIGHT OF COMMENTS:
The reason you get no LUT/FF usage in the report is that the FPGA doesn't need to use any resources (or none of those resources) to implement your module. As out is tied to the constant 31, it will always have the value 1, so the FPGA only needs to tie out to Vdd (NOTE that out is not 31, because it is only a 1-bit reg). The other array values are never used or accessed, so the tool synthesized them away (i.e., no output needs to know the value of array[0][1], as out is a constant and no other ports exist in the design). In order to preserve the array, you only need to use it to drive some output. Here's a basic example to show you:
module top( input [6:0] i_in, // Used to index the array like i
input [4:0] j_in, // Used to index the array like j
output reg [5:0] out // Note, out is now big enough to store all the bits in array
);
integer i;
integer j;
reg [5:0] array[0:99][0:31];
always @(*) begin
// Set up the array, not necessarily optimal, but it works
for (i = 0; i < 100; i = i + 1) begin
for (j = 0; j < 32; j = j + 1) begin
array[i][j] = j;
end
end
// Assign the output to value in the array at position i_in, j_in
out = array[i_in][j_in];
end
endmodule
If you connect the inputs i_in and j_in to switches or something and out to 6 LEDs, you should be able to index the array with the switches and get the output on the LEDs to confirm your design.
module dut ( a,b_out,array,c);
input [2:0] a;
input [3:0] array;
input c;
output reg b_out;
always @(a or c or array) begin
if(c)
b_out = 1'b0;
else
b_out = array[a];
end
endmodule
There is a possible range overflow in the above RTL. How exactly does it affect simulation and synthesis?
When a > 3 and c is 0, b_out will be undefined in simulation, because an out-of-bounds access to a vector returns x (i.e. 1'bx). See 5.2.1 in IEEE Std 1364-2005:
A part-select of any type that addresses a range of bits that are
completely out of the address bounds of the net, reg, integer, time
variable, or parameter or a part-select that is x or z shall yield the
value x when read and shall have no effect on the data stored when
written. Part-selects that are partially out of range shall, when
read, return x for the bits that are out of range and shall, when
written, only affect the bits that are in range.
In synthesis this don't care will be transformed into whatever the synthesis tool deems the most efficient. Very likely that means that only the bottom two bits of a are used in array[a], i.e. it is identical to array[a[1:0]]. But there is no guarantee for that whatsoever and it would be equally correct to create a circuit that for example always returns 1 or 0 when a[2] is high.
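The simulation behaviour can be observed with a short testbench. This is a sketch under assumptions: the module name oob_demo and the signals raw and guarded are made up, and the guarded expression is just one possible way to make the out-of-range case explicit, not something the original RTL does.

```verilog
// Hypothetical testbench comparing a raw and a range-guarded bit-select.
module oob_demo;
    reg [3:0] array;
    reg [2:0] a;
    reg       raw, guarded;

    initial begin
        array = 4'b1010;
        for (a = 0; a < 6; a = a + 1) begin
            raw     = array[a];                        // x when a > 3
            guarded = (a < 4) ? array[a[1:0]] : 1'b0;  // assumed choice: 0 when out of range
            $display("a=%0d raw=%b guarded=%b", a, raw, guarded);
        end
    end
endmodule
```

For a = 4 and a = 5 the raw value is x, while the guarded value is 0. Writing the guard explicitly means simulation and synthesis agree on the out-of-range result instead of leaving it to the tool.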
I can't understand the two lines at the end of this code:
input [15:0] offset ;
output [31:0] pc;
output [31:0] pc_plus_4;
reg [31:0] pc;
wire [31:0] pcinc ;
assign pcinc = pc +4 ;
assign pc_plus_4 = {pc[31],pcinc};
assign branch_aadr = {0,pcinc + {{13{offset[15]}},offset[15:0],2'b00}};
If you are unfamiliar with curly braces {}, they are concatenation operators. You can read about them in the IEEE Std for Verilog (for example, 1800-2009, Section 11.4.12).
assign pc_plus_4 = {pc[31],pcinc};
This concatenates the MSB of pc with all bits of pcinc to assemble the pc_plus_4 signal. However, in this case, since pcinc and pc_plus_4 are both 32 bits wide, pc[31] is ignored. A good linting tool will notify you that the RHS is 33 bits and the LHS is 32 bits, and that the most significant bit will be lost. The line can be more simply coded as:
assign pc_plus_4 = pcinc;
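The truncation can be checked in simulation. In this sketch the module name concat_trunc_demo and the test value are made up; the signal names follow the question.

```verilog
// Hypothetical demo: a 33-bit concatenation assigned to a 32-bit net.
module concat_trunc_demo;
    reg  [31:0] pc;
    wire [31:0] pcinc     = pc + 4;
    wire [31:0] pc_plus_4 = {pc[31], pcinc};   // 33-bit RHS, MSB is dropped

    initial begin
        pc = 32'h8000_0000;                    // pc[31] is 1 here
        #1 $display("pcinc = %h, pc_plus_4 = %h", pcinc, pc_plus_4);
        // prints: pcinc = 80000004, pc_plus_4 = 80000004
    end
endmodule
```

Even with pc[31] set, the two values match, because the extra bit falls outside the 32-bit target.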
The last line is a compile error for one simulator I'm using. You did not explicitly declare the width of the branch_aadr signal, and the width of the 0 constant is unspecified.
The last line also contains a replication operator, which uses two sets of curly braces.
{13{offset[15]}}
This replicates the bit offset[15] thirteen times. It looks like the author is doing a sign extension on offset before adding it to pcinc. A better way might be to declare offset as signed.
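To illustrate the two approaches, here is a sketch that sign-extends a 16-bit value to 32 bits, once with the replication operator and once through a signed declaration. The module name sext_demo and the variables offset_s, by_replication and by_signed are made up for illustration.

```verilog
// Hypothetical demo: two ways to sign-extend a 16-bit offset to 32 bits.
module sext_demo;
    reg        [15:0] offset;
    reg signed [15:0] offset_s;        // same bits, declared signed
    reg        [31:0] by_replication;
    reg        [31:0] by_signed;

    initial begin
        offset   = 16'hFFFC;           // -4 in 16-bit two's complement
        offset_s = offset;
        by_replication = {{16{offset[15]}}, offset};  // copy the sign bit 16 times
        by_signed      = offset_s;     // a signed right-hand side is sign-extended
        $display("by_replication = %h, by_signed = %h", by_replication, by_signed);
        // prints: by_replication = fffffffc, by_signed = fffffffc
    end
endmodule
```

Note that the signed form only sign-extends when every operand in the expression is signed; mixing in an unsigned operand makes the whole expression unsigned.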
//Three ways to replicate bits
wire [3:0] repeated;
wire value;
//These two assignments have the same effect
assign repeated = {4{value}}; //Replication operator
assign repeated = {value,value,value,value}; //Concatenation operator
//These four taken together have the same effect as the above two
assign repeated[3] = value; //Bit selects
assign repeated[2] = value;
assign repeated[1] = value;
assign repeated[0] = value;