Find Maximum Number present in Verilog array - verilog

I have tried writing a small verilog module that will find the maximum of 10 numbers in an array. At the moment I am just trying to verify the correctness of the module without going into specific RTL methods that will to do such a task.
I am just seeing a a couple of registers when I am synthesizing this module. Nothing more that that. Ideally the output should be 7 which is at index 4 but I am seeing nothing neither on FPGA board or in the test bench. What I am doing wrong with this ?
module findmaximum(input clk,rst,output reg[3:0]max, output reg[3:0]index);
reg [3:0]corr_Output[0:9];
always#(posedge clk or posedge rst)
if(rst)
begin
corr_Output[0]=0;
corr_Output[1]=0;
corr_Output[2]=0;
corr_Output[3]=0;
corr_Output[4]=0;
corr_Output[5]=0;
corr_Output[6]=0;
corr_Output[7]=0;
corr_Output[8]=0;
corr_Output[9]=0;
end
else
begin
corr_Output[0]=0;
corr_Output[1]=0;
corr_Output[2]=0;
corr_Output[3]=0;
corr_Output[4]=7;
corr_Output[5]=0;
corr_Output[6]=0;
corr_Output[7]=0;
corr_Output[8]=0;
corr_Output[9]=0;
end
integer i;
always#(posedge clk or posedge rst)
if(rst)
begin
max=0;
index=0;
end
else
begin
max = corr_Output[0];
for (i = 0; i <= 9; i=i+1)
begin
if (corr_Output[i] > max)
begin
max = corr_Output[i];
index = i;
end
end
end
endmodule

Looking are your code, the only possible outputs are max=0,index=0 and a clock or two after reset max=7,index=4. Therefore, your synthesizer is likely optimizing the code with equivalent behavior with simpler logic.
For your find max logic to be meaningful, you need to change the values of corr_Output periodically. This can be done via input writes, LFSR (aka pseudo random number generator), and or other logic.
Other issues:
Synchronous logic (updated on a clock edge) should be assigned by with non-blocking (<=). Combinational logic should be assigned with blocking (=). When this guideline is not followed there is a risk of behavior differences between simulation and synthesis. In the event you need to compare with intermediate values (like your original max and index), then you need to separate the logic into two always blocks like bellow. See code bellow.
Also, FPGAs tend to have limited asynchronous reset support. Use synchronous reset instead by removing the reset from the sensitivity list.
always#(posedge clk) begin
if (rst) begin
max <= 4'h0;
index <= 4'h0;
end
else begin
max <= next_max;
index <= next_index;
end
always #* begin
next_max = corr_Output[0];
next_index = 4'h0;
for (i = 1; i <= 9; i=i+1) begin // <-- start at 1, not 0 (0 is same a default)
if (corr_Output[i] > next_max) begin
next_max = corr_Output[i];
next_index = i;
end
end
end

Related

Blocking assignments in always block verilog?

now I know in Verilog, to make a sequential logic you would almost always have use the non-blocking assignment (<=) in an always block. But does this rule also apply to internal variables? If blocking assignments were to be used for internal variables in an always block would it make it comb or seq logic?
So, for example, I'm trying to code a sequential prescaler module. It's output will only be a positive pulse of one clk period duration. It'll have a parameter value that will be the prescaler (how many clock cycles to divide the clk) and a counter variable to keep track of it.
I have count's assignments to be blocking assignments but the output, q to be non-blocking. For simulation purposes, the code works; the output of q is just the way I want it to be. If I change the assignments to be non-blocking, the output of q only works correctly for the 1st cycle of the parameter length, and then stays 0 forever for some reason (this might be because of the way its coded but, I can't seem to think of another way to code it). So is the way the code is right now behaving as a combinational or sequential logic? And, is this an acceptable thing to do in the industry? And is this synthesizable?
```
module scan_rate2(q, clk, reset_bar);
//I/O's
input clk;
input reset_bar;
output reg q;
//internal constants/variables
parameter prescaler = 8;
integer count = prescaler;
always #(posedge clk) begin
if(reset_bar == 0)
q <= 1'b0;
else begin
if (count == 0) begin
q <= 1'b1;
count = prescaler;
end
else
q <= 1'b0;
end
count = count - 1;
end
endmodule
```
You should follow the industry practice which tells you to use non-blocking assignments for all outputs of the sequential logic. The only exclusion are temporary vars which are used to help in evaluation of complex expressions in sequential logic, provided that they are used only in a single block.
In you case using 'blocking' for the 'counter' will cause mismatch in synthesis behavior. Synthesis will create flops for both q and count. However, in your case with blocking assignment the count will be decremented immediately after it is being assigned the prescaled value, whether after synthesis, it will happen next cycle only.
So, you need a non-blocking. BTW initializing 'count' within declaration might work in fpga synthesis, but does not work in schematic synthesis, so it is better to initialize it differently. Unless I misinterpreted your intent, it should look like the following.
integer count;
always #(posedge clk) begin
if(reset_bar == 0) begin
q <= 1'b0;
counter <= prescaler - 1;
end
else begin
if (count == 0) begin
q <= 1'b1;
count <= prescaler -1;
end
else begin
q <= 1'b0;
count <= count - 1;
end
end
end
You do not need temp vars there, but you for the illustration it can be done as the following:
...
integer tmp;
always ...
else begin
q <= 1'b0;
tmp = count - 1; // you should use blocking here
count <= tmp; // but here you should still use NBA
end

Can you use/manipulate same output/reg variable in multiple always blocks?

always # (posedge clk) begin
if (x) begin
count <= count + 1'b1;
end
end
always # (posedge clk) begin
if (y) begin
count <= count - 2'b10;
end
end
always # (negedge clk) begin
if (x) begin
count <= count - 1'b1;
end
end
always # ( count ) begin
...do something... ;
end
Can I us the variable count inside multiple always block?
Is this a good design practice?
Why/Where should/should not use this method?
How does the simulator/synthesizer do the calculations for that variable 'count'?
Does the compiler throw error if I do this?
Can I us the variable count inside multiple always block?
Not in RTL code NO.
Is this a good design practice?
"good design practice" is not a well defined term. You might use it in a test-bench but not in the format you use. In that case you must make sure that all always conditions are mutual exclusive.
Why/Where should/should not use this method?
You could use it if you have about 10 years experience in writing code. Otherwise don't. As to "should" never!
How does the simulator/synthesizer do the calculations for that variable 'count'?
The synthesizer will refuse your code. The simulator will assign a value just as you described. Which in your code means: you have no idea which assignment is executed last so the result is unpredictable.
Does the compiler throw error if I do this?
Why ask if you can try?
I'm not a hardware designer, but this is not good. Your 3 always blocks will all infer a register and they will all drive the count signals.
You can read signals in multiple blocks, but you should only write to them in a single block.
In most cases you don't want to have multi-drivers. If you have something like a bus with multiple possible masters then you will want multi-drivers, but they need to drive the bus through tri-states and you need to ensure that the master has exclusive access.
Mixing posedge and negedge is not a good idea.
With a single block you might write something like this (which appropriate macros or parameter for UP1, DOWN1 and DOWN2).
always #(posedge clk or negedge reset_n)
begin
if (reset_n == 1'b0)
begin
count <= 32'b0;
end
else
begin
case (count_control)
UP1: count <= count + 1'b1;
DOWN2: count <= count - 2'b10;
DOWN1: count <= count - 1'b1;
endcase
end
end
No. You can't have assignments to a net from multiple always block.
Here is the synthesis result of 2 implementation in Synopsys Design Compiler
ASSIGNMENTS FROM MULTIPLE ALWAYS BLOCK.
module temp(clk, rst, x, y, op);
input logic clk, rst;
logic [1:0] count;
input logic x, y;
output logic [1:0] op;
assign op = count;
always # (posedge clk) begin
if (x) begin
count <= count + 2'd1;
end
end
always # (posedge clk) begin
if (y) begin
count <= count - 2'd2;
end
end
always # (negedge clk) begin
if (x) begin
count <= count - 2'd1;
end
end
endmodule
// Synthesis Result of elaborate command -
Error: /afs/asu.edu/users/k/m/s/kmshah4/temp/a.sv:16: Net 'count[1]' or a directly connected net is driven by more than one source, and not all drivers are three-state. (ELAB-366)
Error: /afs/asu.edu/users/k/m/s/kmshah4/temp/a.sv:16: Net 'count[0]' or a directly connected net is driven by more than one source, and not all drivers are three-state. (ELAB-366)
ASSIGNMENTS WITH SINGLE ALWAYS BLOCK.
module temp(clk, rst, x, y, op);
input logic clk, rst;
logic [1:0] count;
input logic x, y;
output logic [1:0] op;
assign op = count;
always # (clk)
begin
if (clk)
begin
case ({x, y})
2'b01 : count <= count - 2'd2;
2'b10 : count <= count + 2'd1;
default : count <= count;
endcase
end
else
begin
count <= (x) ? (count - 2'd1) : count;
end
end
endmodule
// Synthesis Result of elaborate command -
Elaborated 1 design.
Current design is now 'temp'.
1

how to average the mem values in verilog

I am trying to read the values of memory after 5 cycles into an output register in verilog. How do I do that?
For example if i have a code which looks like this,
reg[31:0] mem[0:5];
if(high==1)
begin
newcount1<=count2;
mem[i]<=newcount1;
i<=i+1;
count2=0;
end
After the 5 cycles of operation whatever mem values i get, how do i read them in another output register? and can i perform averaging operation on those 5 cycles? and get a nominal value?
Say Your Memory Data Out is mem_data and you want it Read out in mem_data_out with Latency of 5 Cycles.
parameter MDP_Latency = 4;
reg [31:0] mem_data_out;
reg [31:0] [MDP_Latency - 1 : 0] mem_data_out_temp;
always#(posedge clk) begin
if(!reset) begin
for(int i = 0; i < MDP_Latency; i ++)
begin
mem_data_out <= 'd0;
mem_data_out_temp[i] <= 'd0;
end
end
else
begin
for(int i = 0; i < MDP_Latency; i ++)
begin
if(i == 0)
begin
mem_data_out_temp[i] <= mem_data;
end
else
begin
mem_data_out_temp[i] <= mem_data_out_temp[i - 1];
end
end
mem_data_out <= mem_data_out_temp[MDP_Latency];
end
end
`
Lets have a look at the posted code:
reg[31:0] mem[0:5];
if(high==1)
begin
newcount1<=count2;
mem[i]<=newcount1;
i<=i+1;
count2=0;
end
The lack of indentation makes it hard to read, i is not declared. The memory is actually 6 locations, 0 to 5.
You have conditionals and assignment not insides and initial or always block.
I am not sure what you are doing with count2 but mixing blocking and non-blocking is considered bad practise, it can be done but you must be really careful to not cause RTL to gates mismatch.
User1932872 Has posted an answer using for loops it looks like a valid answer but I think loops at this stage over complicate learning and understanding what you are creating in HDLs. While learning I would avoid such features and only uses them once comfortable with the whole flow.
reg[31:0] mem[0:4]; //5 Locations
always #(posedge clk) begin //Clked process assignmnets use non-blocking(<=)
mem[0]<=newcount1;
mem[1]<=mem[0];
mem[2]<=mem[1];
mem[3]<=mem[2];
mem[4]<=mem[3];
end
With this structure we can see the pipeline of data though mem[0] to mem[4]. We are implying 5 flip-flops where the output of the first drives data into the next. A combinatorial sum of them all could be:
reg [31+4:0] sum; //The +4 allows for bitgrowth you may need to truncate or limit.
always #* begin // Combinatorial use blocking (=)
sum = mem[0] + mem[1] + mem[2] + mem[3] + mem[4];
end

Is there a way to sum multi-dimensional arrays in verilog?

This is something that I think should be doable, but I am failing at how to do it in the HDL world. Currently I have a design I inherited that is summing a multidimensional array, but we have to pre-write the addition block because one of the dimensions is a synthesize-time option, and we cater the addition to that.
If I have something like reg tap_out[src][dst][tap], where src and dst is set to 4 and tap can be between 0 and 15 (16 possibilities), I want to be able to assign output[dst] be the sum of all the tap_out for that particular dst.
Right now our summation block takes all the combinations of tap_out for each src and tap and sums them in pairs for each dst:
tap_out[0][dst][0]
tap_out[1][dst][0]
tap_out[2][dst][0]
tap_out[3][dst][0]
tap_out[0][dst][1]
....
tap_out[3][dst][15]
Is there a way to do this better in Verilog? In C I would use some for-loops, but that doesn't seem possible here.
for-loops work perfectly fine in this situation
integer src_idx, tap_idx;
always #* begin
sum = 0;
for (scr_idx=0; src_idx<4; src_idx=scr_idx+1) begin
for (tap_idx=0; tap_idx<16; tap_idx=tap_idx+1) begin
sum = sum + tap_out[src_idx][dst][tap_idx];
end
end
end
It does unroll into a large combinational logic during synthesis and the results should be the same adding up the bits line by line.
Propagation delay from a large summing logic could have a timing issue. A good synthesizer should find the optimum timing/area when told the clocking constraint. If logic is too complex for the synthesizer, then add your own partial sum logic that can run in parallel
reg [`WIDHT-1:0] /*keep*/ partial_sum [3:0]; // tell synthesis to preserve these nets
integer src_idx, tap_idx;
always #* begin
sum = 0;
for (scr_idx=0; src_idx<4; src_idx=scr_idx+1) begin
partial_sum[scr_idx] = 0;
// partial sums are independent of each other so the can run in parallel
for (tap_idx=0; tap_idx<16; tap_idx=tap_idx+1) begin
partial_sum[scr_idx] = partial_sum[scr_idx] + tap_out[src_idx][dst][tap_idx];
end
sum = sum + partial_sum[scr_idx]; // sum the partial sums
end
end
If timing is still an issue, then you have must treat the logic as multi-cycle and sample the value some clock cycles after the input changed.
In RTL (the level of abstraction you are likely modelling with your HDL), you have to balance parallelism with time. By doing things in parallel, you save time (typically) but the logic takes up a lot of space. Conversely, you can make the adds completely serial (do one add at one time) and store the results in a register (it sounds like you want to accumulate the total sum, so I will explain that).
It sounds like the fully parallel is not practical for your uses (if it is and you want to rewrite it, look up generate statements). So, you'll need to create a small FSM and accumulate the sums into a register. Here's a basic example, which sums an array of 16-bit numbers (assume they are set somewhere else):
reg [15:0] arr[0:9]; // numbers
reg [31:0] result; // accumulated sum
reg load_result; // load signal for register containing result
reg clk, rst_L; // These are the clock and reset signals (reset asserted low)
/* This is a register for storing the result */
always #(posedge clk, negedge rst_L) begin
if (~rst_L) begin
result <= 32'd0;
end
else begin
if (load_result) begin
result <= next_result;
end
end
end
/* A counter for knowing which element of the array we are adding
reg [3:0] counter, next_counter;
reg load_counter;
always #(posedge clk, negedge rst_L) begin
if (~rst_L) begin
counter <= 4'd0;
end
else begin
if (load_counter) begin
counter <= counter + 4'd1;
end
end
end
/* Perform the addition */
assign next_result = result + arr[counter];
/* Define the state machine states and state variable */
localparam IDLE = 2'd0;
localparam ADDING = 2'd1;
localparam DONE = 2'd2;
reg [1:0] state, next_state;
/* A register for holding the current state */
always #(posedge clk, negedge rst_L) begin
if (~rst_L) begin
state <= IDLE;
end
else begin
state <= next_state;
end
end
/* The next state and output logic, this will control the addition */
always #(*) begin
/* Defaults */
next_state = IDLE;
load_result = 1'b0;
load_counter = 1'b0;
case (state)
IDLE: begin
next_state = ADDING; // Start adding now (right away)
end
ADDING: begin
load_result = 1'b1; // Load in the result
if (counter == 3'd9) begin // If we're on the last element, stop incrementing counter, we are done
load_counter = 1'b0;
next_state = DONE;
end
else begin // Otherwise, keep adding
load_counter = 1'b1;
next_state = ADDING;
end
end
DONE: begin // finished adding, result is in result!
next_state = DONE;
end
endcase
end
There are lots of resources on the web explaining FSMs if you are having trouble with the concept, but they can be used to implement your basic C-style for loop.

Verilog Loop Condition

I am completely new to verilog and I have to know quite a bit of it fairly soon for a course I am taking in university. So I am play around with my altera DE2 board and quartis2 and learning the ins and outs.
I am trying to make a counter which is turned on and off by a switch.
So far the counter counts and resets based on a key press.
This is my error:
Error (10119): Verilog HDL Loop Statement error at my_first_counter_enable.v(19): loop with non-constant loop condition must terminate within 250 iterations
I understand I am being asked to provide a loop variable, but even doing so I get an error.
This is my code:
module my_first_counter_enable(SW,CLOCK_50,LEDR,KEY);
input CLOCK_50;
input [17:0] SW;
input KEY;
output [17:0] LEDR;
reg [32:0] count;
wire reset_n;
wire enable;
assign reset_n = KEY;
assign enable = SW[0];
assign LEDR = count[27:24];
always# (posedge CLOCK_50 or negedge reset_n) begin
while(enable) begin
if(!reset_n)
count = 0;
else
count = count + 1;
end
end
endmodule
I hope someone can point out my error in my loop and allow me to continue.
Thank you!
I don't think you want to use a while loop there. How about:
always# (posedge CLOCK_50 or negedge reset_n) begin
if(!reset_n)
count <= 0;
else if (enable)
count <= count + 1;
end
I also added non-blocking assignments <=, which are more appropriate for synchronous logic.
The block will trigger every time there is a positive edge of the clock. Where you had a while loop does not mean anything in hardware, it would still need a clock to drive the flip flops.
While loops can be used in testbeches to drive stimulus
integer x;
initial begin
x = 0;
while (x<1000) begin
data_in = 2**x ; //or stimulus read from file etc ...
x=x+1;
end
end
I find for loops or repeat to be of more use though:
integer x;
initial begin
for (x=0; x<1000; x=x+1) begin
data_in = 2**x ; //or stimulus read from file etc ...
end
end
initial begin
repeat(1000) begin
data_in = 'z; //stimulus read from file etc (no loop variable)...
end
end
NB: personally I would also add begin end to every thing to avoid adding extra lines later and wondering why they always or never get executed, especially while new to the language. It also has the added benefit of making the indenting look a little nicer.
always# (posedge CLOCK_50 or negedge reset_n) begin
if(!reset_n) begin
count <= 'b0;
end
else if (enable) begin
count <= count + 1;
end
end
Title
Error (10119): Verilog HDL Loop Statement error at : loop with non-constant loop condition must terminate within iterations
Description
This error may appear in the Quartus® II software when synthesis iterates through a loop in Verilog HDL for more than the synthesis loop limit. This limit prevents synthesis from potentially running into an infinite loop. By default, this loop limit is set to 250 iterations.
Workaround / Fix
To work around this error, the loop limit can be set using the VERILOG_NON_CONSTANT_LOOP_LIMIT option in the Quartus II Settings File (.qsf). For example:
set_global_assignment -name VERILOG_NON_CONSTANT_LOOP_LIMIT 300

Resources