How to prevent ISE compiler from optmizing away my array? - verilog

I'm new to Verilog, ISE, FPGAs. I'm trying to implement a simple design into an FPGA, but the entire design is being optimized away. It is basically an 2D array with some arbitrary values. Here is the code:
module top(
output reg out
);
integer i;
integer j;
reg [5:0] array [0:99][0:31];
initial begin
for(i=0;i<100;i=i+1) begin
for(j=0;j<32;j=j+1) begin
array[i][j] = j;
out = array[i][j];
end
end
end
endmodule
It passes XST Synthesis fine, but it fails MAP in the Implementation process. Two Errors are given:
ERROR:Map:116 - The design is empty. No processing will be done.
ERROR:Map:52 - Problem encountered processing RPMs.
The entire code is being optimized away in XST. Why? What am I doing wrong?

The reason your design is being synthesized away is because you have not described any logic in your module.
The only block in your design is an initial block which is typically not used in synthesis except in limited cases; the construct mainly used for testbenches in simulation (running the Verilog through ModelSim or another simluator).
What you want is to use always blocks or assign statements to describe logic for XST to synthesize into a netlist for the FPGA to emulate. As the module you provided has neither of these constructs, no netlist can be generated, thus nothing synthesized!
In your case, it is not entirely clear what logic you want to describe as the result of your module will always have out equal to 31. If you want out to cycle through the values 0 to 31, you'll need to add some sequential logic to implement that. Search around the net for some tutorials on digital design so you have the fundamentals down (combinational logic, gates, registers, etc). Then, think about what you want the design to do and map it to those components. Then, write the Verilog that describes that design.
EDIT IN LIGHT OF COMMENTS:
The reason you are get no LUT/FF usage on the report is because the FPGA doesn't need to use any resources (or none of those resources) to implement your module. As out is tied to constant 31, it will always have the value of 1, so the FPGA only needs to tie out to Vdd (NOTE that out is not 31 because it is only a 1-bit reg). The other array values are never used nor accesses, so the FPGA synthesized them away (ie, not output needs to know the value of array[0][1] as out is a constant and no other ports exist in the design). In order to preserve the array, you need only use it to drive some output somehow. Heres a basic example to show you:
module top( input [6:0] i_in, // Used to index the array like i
input [4:0] j_in, // Used to index the array like j
output reg [5:0] out // Note, out is now big enough to store all the bits in array
);
integer i;
integer j;
reg [5:0] array[0:99][0:31];
always #(*) begin
// Set up the array, not necessarily optimal, but it works
for (i = 0; i < 100; i = i + 1) begin
for (j = 0; j < 32; j = j + 1) begin
array[i][j] = j;
end
end
// Assign the output to value in the array at position i_in, j_in
out = array[i_in][j_in];
end
endmodule
If you connect the inputs i_in and j_in to switches or something and out to 6 LEDs, you should be able to index the array with the switches and get the output on the LEDs to confirm your design.

Related

Combinational way of implementing a CAM in verilog

I'm trying to implement a cache and index lookup memory in SystemVerilog. It's a simple CAM + circular buffer. The interface is:
input rst_n;
input clk;
input [WORD_BITS-1:0] inp;
input rd_en;
input wr_en;
output logic [DEPTH_BITS-1:0] index;
output logic index_valid;
reg [WORD_BITS-1:0] buffer[$pow(2, DEPTH_BITS)];
reg [DEPTH_BITS-1:0] next;
There's basic async reset code. There's a synchronous block that stores inp in buffer and advances next whenever wr_en is high.
Now I'm trying to come up with an efficient and readable way of finding the index of inp when rd_en is high. It seems this could be completely combinational except when clocking the result into the index output. The way I'm visualizing it in my head is to xor inp with all of the buffer locations (it will be fairly small, perhaps 64 entries) then if that is equal to 0 the entry was found. Then a block to arbitrarily choose one of the indices with a 0 value. This is where is differs from a traditional CAM, there could be multiple entries for the same value but I really only need the index of one of those and it doesn't matter which one.
Any thoughts on how to do this in System Verilog (2012)? I know I can loop through all the memory locations synchronously and save a bunch of area but I'd rather it be fast than small. I'm targeting FPGAs. (initially inexpensive Lattice and maybe Xilinx parts) I know a few of the Lattice parts actually have CAM blocks but this is for cases where that isn't available.
Following up on a suggestion from another forum, the following seems to work well.
always_comb begin
index_valid = 0;
for (int i=0; i < 64; i=i+1) begin
if (rd_en) begin
if (inp == buffer[i]) begin
index = i;
index_valid = 1'b1;
end
end
end
end

call by reference in verilog code

I am trying to change a c++ code into verilog HDL.
I want to write a module that changes one of its inputs. (some how like call by reference in c++)
as I know there is no way to write a call by reference module in verilog (I can't use systemverilog)
Here is a code that I wrote and it works. are there any better ways to do this?!
my problme is that the register I want to be call by reference is a big array. this way duplicates the registers and has a lot of cost.
module testbench();
reg a;
wire b;
reg clk;
initial begin
a = 0;
clk = 0;
#10
clk = 1;
end
test test_instance(
.a(a),
.clk(clk),
.aOut(b)
);
always#(*)begin
a = b;
end
endmodule
module test(
input a,
input clk,
output reg aOut
);
always #(posedge clk) begin
if (a == 0)begin
a = 1;
aOut = a;
end
end
endmodule
Verilog is not a software programming language; it is a hardware description language. The inputs to a module are pieces of metal (wires, tracks, pins); the outputs from a module are pieces of metal. If you want a port that is both an input and an output you can use an inout. However, inout ports are best avoided; it is usually much better to use separate inputs and outputs.
A Verilog module is not a software function. Nothing is copied to the inputs; nothing is copied from the outputs. A Verilog module is a lump of hardware: it has inputs (pieces of metal carrying information in) and outputs (pieces of metal carrying information out).
Your are right to say that you can use either pass-by-copy or pass-by-reference in SystemVerilog. If you wish to pass a large data structure into a function or into/out of a task, then passing by reference may save simulation time.
By reference means by address, so to translate this to hdl directly you would either need to provide a way for the module to get on that bus and perform transactions based on that address.
Or better, if you need this as an input take each of the items in the struct and make individual inputs from them. If it is pass by reference because it is an output or is also an output, then you create individual outputs for each of the items in the struct. The module then distinguishes between the input version of that sub item and output version of that sub item.
my.thing.x = my.thing.x + 1;
becomes something like
my_thing_x_output = my_thing_x_input + 1;

Getting strange error in verilog (vcs) when trying to use if/else blocks

I am trying to write an "inverter" function for a 2's compliment adder. My instructor wants me to use if/else statements in order to implement it. The module is supposed to take an 8 bit number and flip the bits (so zero to ones/ones to zeros). I wrote this module:
module inverter(b, bnot);
input [7:0] b;
output [7:0]bnot;
if (b[0] == 0) begin
assign bnot[0] = 1;
end else begin
assign bnot[0] = 0;
end
//repeat for bits 1-7
When I try and compile and compile it using this command I got the following errors:
vcs +v2k inverter.v
Error-[V2005S] Verilog 2005 IEEE 1364-2005 syntax used.
inverter.v, 16
Please compile with -sverilog or -v2005 to support this construct: generate
blocks without generate/endgenerate keywords.
So I added the -v2005 argument and then I get this error:
vcs +v2k -v2005 inverter.v
Elaboration time unknown or bad value encountered for generate if-statement
condition expression.
Please make sure it is elaboration time constant.
Someone mind explaining to me what I am doing wrong? Very new to all of this, and very confused :). Thanks!
assign statements like this declare combinatorial hardware which drive the assigned wire. Since you have put if/else around it it looks like you are generating hardware on the fly as required, which you can not do. Generate statements are away of paramertising code with variable instance based on constant parameters which is why in this situation you get that quite confusing error.
Two solutions:
Use a ternary operator to select the value.
assign bnot[0] = b[0] ? 1'b0 : 1'b1;
Which is the same as assign bnot[0] = ~b[0].
Or use a combinatorial always block, output must be declared as reg.
module inverter(
input [7:0] b,
output reg [7:0] bnot
);
always #* begin
if (b[0] == 0) begin
bnot[0] = 1;
end else begin
bnot[0] = 0;
end
end
Note in the above example the output is declared as reg not wire, we wrap code with an always #* and we do not use assign keyword.
Verliog reg vs wire is a simulator optimisation and you just need to use the correct one, further answers which elaborate on this are Verilog Input Output types, SystemVerilog datatypes.

Why is adding one operation causing my number of logic elements to skyrocket?

I'm designing a 464 order FIR filter in Verilog for use on the Altera DE0 FPGA. I've got (what I believe to be) a working implementation; however, there's one small issue that's really actually given me quite a headache. The basic operation works like this: A 10 bit number is sent from a micro controller and stored in datastore. The FPGA then filters the data, and lights LED1 if the data is near 100, and off if it's near 50. LED2 is on when the data is neither 100 nor 50, or the filter hasn't filled the buffer yet.
In the specification, the coefficients (which have been pre provided), have been multiplied by 2^15 in order to represent them as integers. Therefore, I need to divide my final output Y by 2^15. I have implemented this using a shift, since it should be (?) the most efficient way. However, this single line causes my number of logic elements to jump from ~11,000 without it, to over 35,000. The Altera DE0 uses a Cyclone III FPGA which only has room for about 15k logic elements. I've tried doing it inside both combinational and sequential logic blocks, both of which have the same exact issue.
Why is this single, seemingly simple operation causing such an inflation elements? I'll include my code, which I'm sure isn't the most efficient, nor the cleanest. I don't care about optimizing this design for performance or area/density at all. I just want to be able to fit it onto the FPGA so it'll run. I'm not very experienced in HDL design, and this is by far the most complex project I've needed to tackle. It's worth noting that I do not remove y completely, I replace the "bad" line with assign YY = y;.
Just as a note: I haven't included all of the coefficients, for sanity's sake. I know there might be a better way to do it than using case statements, but it's the way that it came and I don't really want to relocate 464 elements to a parameter declaration, etc.
module lab5 (LED1, LED2, handshake, reset, data_clock, datastore, bit_out, clk);
// NUMBER OF COEFFICIENTS (465)
// (Change this to a small value for initial testing and debugging,
// otherwise it will take ~4 minutes to load your program on the FPGA.)
parameter NUMCOEFFICIENTS = 465;
// DEFINE ALL REGISTERS AND WIRES HERE
reg [11:0] coeffIndex; // Coefficient index of FIR filter
reg signed [16:0] coefficient; // Coefficient of FIR filter for index coeffIndex
reg signed [16:0] out; // Register used for coefficient calculation
reg signed [31:0] y;
wire signed [7:0] YY;
reg [9:0] xn [0:464]; // Integer array for holding x
integer i;
output reg LED1, LED2;
// Added values from part 1
input reset, handshake, clk, data_clock, bit_out;
output reg [9:0] datastore;
integer k;
reg sent;
initial
begin
sent = 0;
i=0;
datastore = 10'b0000000000;
y=0;
LED1 = 0;
LED2 = 0;
for (i=0; i<NUMCOEFFICIENTS; i=i+1)
begin
xn[i] = 0;
end
end
always#(posedge data_clock)
begin
if(handshake)
begin
if(bit_out)
begin
datastore = datastore >> 1;
datastore [9] = 1;
end
else
begin
datastore = datastore >> 1;
datastore [9] = 0;
end
end
end
always#(negedge clk)
begin
if (!handshake )
begin
if(!sent)
begin
y=0;
for (i=NUMCOEFFICIENTS-1; i > 0; i=i-1) //shifts coeffecients
begin
xn[i] = xn[i-1];
end
xn[0] = datastore;
for (i=0; i<NUMCOEFFICIENTS; i=i+1)
begin
// Calculate coefficient based on the coeffIndex value. Note that coeffIndex is a signed value!
// (Note: These don't necessarily have to be blocking statements.)
case ( 464-i )
12'd0: out = 17'd442; // This coefficient should be multiplied with the oldest input value
12'd1: out = -17'd373;
12'd2: out = -17'd169;
...
12'd463: out = -17'd373; //-17'd373
12'd464: out = 17'd442; //17'd442
// This coefficient should be multiplied with the most recent data input
// This should never occur.
default: out = 17'h0000;
endcase
y = y + (out * xn[i]);
end
sent = 1;
end
end
else if (handshake)
begin
sent = 0;
end
end
assign YY = (y>>>15); //THIS IS THE LINE THAT IS CAUSING THE ISSUE!
always #(YY)
begin
LED1 = 0;
LED2 = 1;
if ((YY >= 40) && (YY <= 60))
begin
LED1 <= 0;
LED2 <= 0;
end
if ((YY >= 90) && (YY <= 110))
begin
LED1 <= 1;
LED2 <= 0;
end
end
endmodule
You're almost certainly seeing the effects of synthesis optimisation.
The following line is the only place that uses y:
assign YY = (y>>>15); //THIS IS THE LINE THAT IS CAUSING THE ISSUE!
If you remove this line, all the logic that feeds into y (including out and xn) will be removed. On Altera you want to look carefully through your map report which will contain (buried amongst a million other things) information about all the logic that Quartus has removed and the reason behind it.
Good places to start are the Port Connectivity Checks which will tell you if any inputs or outputs are stuck high or low or are dangling. The look through the Registers Removed During Synthesis section and Removed Registers Triggering Further Register Optimizations.
You can try to force Quartus not to remove redundant logic by using the following in your QSF:
set_instance_assignment -name preserve_fanout_free_node on -to reg
set_instance_assignment -name preserve_register on -to foo
In your case however it sounds like the correct solution is to re-factor the code rather than try to preserve redundant logic. I suspect you want to investigate using an embedded RAM to store the coefficients.
(In addition to Chiggs' answer, assuming that you are hooking up YY correctly ....)
I would add that, you don't need >>>. It would be simpler to write :
assign YY = y[22:15];
And BTW, initial blocks are ignored for synthesis. So, you want to move that initialization to the respective always blocks in a if (reset) or if (handshake) section.

Generate statement inside verilog task

I want to use generate statement inside a task. The following code is giving compile errors (iverilog).
task write_mem; //for generic use with 8, 16 and 32 bit mem writes
input [WIDTH-1:0] data;
input [WIDTH-1:0] addr;
output [WIDTH-1:0] MEM;
integer i;
begin
generate
genvar j;
for(j=0; j<i;j++)
MEM[addr+(i-j-1)] = data[(j*8):((j*8) + 8)-1];
endgenerate
end
endtask // write_mem
I also tried putting generate just after the line integer i, but still its producing errors. Any thoughts?
EDIT: I also tried putting genvar declaration between begin and generate statement in the above code. Its still producing compiler errors
Thanks in advance,
Jay Aurabind
What you are trying is not possible - a generate region (generate..endgenerate block) is only allowed in the module description (aka "top level"), i.e. the same level where you have parameters, wires, always- and inital-regions, etc. (see Syntax 12-5 in the IEEE Std. 1364-2005). Within a task a generate region is e.g. as invalid as an assign statement.
However, you can use a non-generate for-loop in a task (this is also synthesizeable).
Either way, you can not count from 0 to i-1 in synthesizeable code as 'i' is not constant. Also note that j++ is not valid verilog, you must write j=j+1 instead. Finally you probably want to use a nonblocking assignment (<=) instead of a blocking assignment (=), but this depends on how you intent to use this task.
genvars should be defined before the generate statement:
genvar j;
generate
for(j=0; j<i;j++)
MEM[addr+(i-j-1)] = data[(j*8):((j*8) + 8)-1];
endgenerate
Although your usage here does not look like it needs a generate statement a for loop would have done.
As pointed out by #CliffordVienna generate statements are for building hierarchy and wiring based on compile time constants. ie parameters can be changed for reusable code but are constant in a given simulation. Tasks do not contain hierarchy and therefore the use of a generate is invalid.
Any for loop that can be unrolled is synthesizable, some thing like:
task write_mem; //for generic use with 8, 16 and 32 bit mem writes
input [WIDTH-1:0] data;
input [WIDTH-1:0] addr;
output [WIDTH-1:0] mem;
integer i = WIDTH / 8; // CONSTANT
begin
for(j=0; j<i;j++) begin
mem[addr+(i-j-1)] = data[(j*8):((j*8) + 8)-1];
end
end
endtask // write_mem
Tasks are synthesizable as long as they do not contain any timing control, which yours does not. From the information given this should be synthesizable.
NB: I would separate data width and addr width, they might be the same now but you might want to move to 8 bit addressing and 16 bit data.

Resources