32-bit OR operation in Verilog: unexpected result at most significant bit

[Image: test module of Or32]
[Image: GTKWave result]
My question is: why does the most significant bit get an unexpected x?
module Or32(//32-Bit-Or
input [31:0] a,
input [31:0] b,
output [31:0] o
);
integer i;
reg [31:0]mid;
always @* begin
for (i = 0;i < 31; i++) begin
mid[i] = a[i] || b[i];
end
end
assign o = mid;
endmodule

The most significant bit is x because of your loop condition:
for (i = 0;i < 31; i++)
The last value that satisfies i < 31 is 30, so mid[31], the most significant bit, is never assigned and keeps its initial value x. Change the condition to i < 32.
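For reference, a minimal sketch of the module with the loop bound changed to 32. It also swaps the logical || for the bitwise | and writes the increment as i = i + 1 so that it stays within plain Verilog; both of those changes are my own assumptions about the intent, not part of the answer above.

```verilog
// Sketch: 32-bit OR with the loop covering bits 0..31.
module Or32 (
    input  [31:0] a,
    input  [31:0] b,
    output [31:0] o
);
    integer i;
    reg [31:0] mid;

    always @* begin
        // i < 32 so that mid[31] is assigned as well
        for (i = 0; i < 32; i = i + 1)
            mid[i] = a[i] | b[i];
    end

    assign o = mid;
endmodule
```

The same result can be had without a loop by writing assign o = a | b; since | already operates on every bit of the vectors.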

Related

Verilog expression evaluates to 'x'

I am writing a matrix multiplication module in Verilog and I encountered an issue where an expression evaluates to a bunch of 'xxxx':
// multiplies 5x32 matrix by 32x5 matrix
module matmul(input [4959:0] A, input [4959:0] B, output reg [799:0] out);
integer i,j,k;
integer start = 0;
reg [31:0] placeholder_A [4:0][31:0];
reg [31:0] placeholder_B [31:0][4:0];
reg [31:0] placeholder_out [4:0][4:0];
always @(A or B) begin
// initialize output to zeros
for (i=0; i<800; i=i+1)
out[i] = 0;
// initialize placeholder output to zeros
for (i=0; i<5; i=i+1)
for(j=0; j<5; j=j+1)
placeholder_out[i][j] = 32'd0;
// turn flat vector A array into matrix
for (i=0; i<5; i=i+1)
for(j=0; j<32; j=j+1) begin
placeholder_A[i][j] = A[start +: 31];
start = start + 32;
end
start = 0;
// turn flat vector B array into matrix
for (i=0; i<32; i=i+1)
for(j=0; j<5; j=j+1) begin
placeholder_B[i][j] = B[start +: 31];
start = start + 32;
end
start = 0;
// do the matrix multiplication
for (i=0; i<5; i=i+1) // A.shape[0]
for(j=0; j<5; j=j+1) // B.shape[1]
for(k=0; k<32; k=k+1) // B.shape[0] or A.shape[1]
placeholder_out[i][j] = placeholder_out[i][j] + (placeholder_A[i][k]*placeholder_B[k][j]); // this is where I am having problems
start = 0;
// flatten the output
for (i=0; i<5; i=i+1)
for(j=0; j<5; j=j+1) begin
out[start] = placeholder_out[i][j];
start = start + 1;
end
end
endmodule
The placeholder_out variable (and therefore the out output) evaluates to 'xx...xxx' and I cannot understand why. When I check the signals in the testbench, both placeholder_A and placeholder_B contain valid values. Any help would be appreciated.
You can run the testbench here: https://www.edaplayground.com/x/2P7m
A couple of things that I observed in the code snippet. First of all, the inputs are not wide enough. Each matrix holds 5*32 = 160 elements of 32 bits, so the required width is 32*5*32 = 5120 bits, and the input vectors need to be declared as input [5119:0] A, input [5119:0] B. A linting tool might have caught this issue.
Secondly, start needs to be set to zero at the beginning of the computation. This avoids inferring a latch on start, makes the unpacking begin at index zero of A, and keeps X's from propagating further.
always @(A or B) begin
//...
start=0;
I'd advise using always_comb instead of a manual sensitivity list, but that is an entirely different topic.
As a side note, as far as I understand, the given code snippet will create a large amount of combinational hardware. You may want to check the synthesis results for timing violations on different nets and consider alternative logic.
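One more detail worth checking in the question's code: an indexed part-select base +: width takes the width on the right, so A[start +: 31] picks 31 bits rather than 32. The sketch below shows how a single 32-bit element can be pulled out of a flat 5120-bit vector. The module name flat_elem, its ports, and the row-major layout are assumptions made for illustration only.

```verilog
// Sketch: select element (row, col) from a flattened ROWS x COLS matrix.
// Assumes row-major packing with element 0 in the least significant bits.
module flat_elem #(
    parameter ROWS = 5,
    parameter COLS = 32,
    parameter W    = 32
) (
    input  [ROWS*COLS*W-1:0] flat,  // 5*32*32 = 5120 bits by default
    input  [7:0]             row,
    input  [7:0]             col,
    output [W-1:0]           elem
);
    // "+: W" selects W bits starting at the computed base index
    assign elem = flat[(row*COLS + col)*W +: W];
endmodule
```

If the base index runs past the end of the vector, which is what happens with a 4960-bit input, the out-of-range bits read back as x.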

Syntax errors in the verilog code

I want to convert this C code to a Verilog module, but I am having some difficulty:
void window_averaging(void) {
register unsigned int i, k;
for (i = 0; i < 128; i++) {
// Copying first 128 output samples to the Window 0 and so on till Window 7.
W[count][i] = O[i];
}
for (i = 0; i < 128; i++) {
for (k = 0; k< 8; k++) {
O[i] += W[k][i];
}
O[i] /= 8; // Averaging over 8 window
}
count = (count++)%8; // Count = 0 after all the window elements are filled.
}
Verilog:
module window_averaging(
input [16:0]in_noise, //input from noise cancellation
input clk,
output reg [16:0]window_average // output after window averaging
);
integer i;
integer k;
integer count = 0;
reg [16:0] store_elements[0:7][0:128]; // 2-D array for window averaging
reg [16:0] temp;
always @(posedge clk)
begin
// Copying first 128 output samples to the Window 0 and so on till Window 7
for(i=0 ; i < 128 ; i = 1+1)
begin
store_elements[count][i] = in_noise;
end
for(i=0; i<128 ; i=i+1)
begin
for(k=0;k<8;k = k+1)
begin
temp = temp + store_elements[i][k];
end
window_average = temp/8;
count = (count+1)%8;
end
end
endmodule
The errors I am getting are syntax errors near "(" and "=". I am fairly new to Verilog; can anyone help me with how to proceed?
First, you are trying to drive a wire from inside an always block, which is not allowed. If you convert the wires to regs, it will work:
module window_averaging(
input [16:0]in_noise, //input from noise cancellation
input clk,
output reg [16:0]window_average // output after window averaging
);
integer i;
integer k;
integer count = 0;
reg [16:0] store_elements[0:7][0:128]; // 2-D array for window averaging
reg [16:0] temp;
...
Also, I believe that to be consistent with your C code, the line count = (count+1)%8; should be outside the for loop, like so:
window_average = temp/8;
end
count = (count+1)%8;
end
endmodule
I don't know what you are using to compile, but I think the following should give you errors:
For the first loop:
for(i=0 ; i < 128 ; i = 1+1)
change to i = i+1
Also, in the line:
temp = temp + store_elements[i][k];
remember the declaration store_elements[0:7][0:128], so maybe switch i and k?
This isn't an answer really. Sorry, I don't have comment privilege yet.
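Putting the suggestions from both answers together gives roughly the following sketch. Beyond what the answers state, it makes three assumptions of my own: the second array dimension is shrunk to [0:127] to match the 128 samples in the C code, temp is widened to 20 bits so the sum of eight 17-bit values cannot overflow, and temp is cleared before each inner loop.

```verilog
// Sketch: window averaging with the loop increment, index order and
// count update fixed as suggested in the answers.
module window_averaging (
    input      [16:0] in_noise,       // input from noise cancellation
    input             clk,
    output reg [16:0] window_average  // output after window averaging
);
    integer i;
    integer k;
    integer count = 0;
    reg [16:0] store_elements [0:7][0:127]; // 8 windows of 128 samples
    reg [19:0] temp;                        // 17 bits + 3 bits of headroom

    always @(posedge clk) begin
        // Copy the current sample into the window selected by count
        for (i = 0; i < 128; i = i + 1)
            store_elements[count][i] = in_noise;

        for (i = 0; i < 128; i = i + 1) begin
            temp = 0;                                // assumption: restart the sum per sample
            for (k = 0; k < 8; k = k + 1)
                temp = temp + store_elements[k][i];  // window index first
            window_average = temp / 8;
        end

        count = (count + 1) % 8;                     // once per clock, outside the loops
    end
endmodule
```

Until all eight windows have been written once, some entries of store_elements are still x in simulation, so window_average only becomes valid after the eighth clock edge.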

Reduce array to sum of elements

I am trying to reduce a vector to the sum of all its elements. Is there an easy way to do this in Verilog?
Similar to the systemverilog .sum method.
Thanks
My combinational solution for this problem:
//example array
parameter cells = 8;
//note: this aggregate initializer requires SystemVerilog; in plain Verilog, fill the array in an initial block
reg [7:0]array[cells-1:0] = {1,2,3,4,5,1,1,1};
//###############################################
genvar i;
wire [7:0] summation_steps [cells-2 : 0];//container for all summation steps
generate
assign summation_steps[0] = array[0] + array[1];//to save logic, start with the first sum (not array[0])
for(i=0; i<cells-2; i=i+1) begin
assign summation_steps[i+1] = summation_steps[i] + array[i+2];
end
endgenerate
wire [7:0] result;
assign result = summation_steps[cells-2];
Verilog doesn't have any built-in array methods like SV. Therefore, a for-loop can be used to perform the desired functionality. Example:
parameter N = 64;
integer i;
reg [7:0] array [0:N-1];
reg [N+6:0] sum; // enough bits to handle overflow
always @*
begin
sum = {(N+7){1'b0}}; // all zero
for(i = 0; i < N; i=i+1)
sum = sum + array[i];
end
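The for-loop approach above can be wrapped into a self-contained module along these lines. Because plain Verilog does not allow arrays as ports, this sketch takes the elements as one flattened vector. The module name vec_sum, the flat port, and the accumulator sizing of W + $clog2(N) bits are my own assumptions, and $clog2 requires Verilog-2005 or later.

```verilog
// Sketch: sum N elements of W bits each, passed in as one flat vector.
module vec_sum #(
    parameter N = 8,
    parameter W = 8
) (
    input      [N*W-1:0]          flat,  // element i sits in bits [i*W +: W]
    output reg [W+$clog2(N)-1:0]  sum    // wide enough for N maximum-valued elements
);
    integer i;

    always @* begin
        sum = 0;                          // start from zero on every evaluation
        for (i = 0; i < N; i = i + 1)
            sum = sum + flat[i*W +: W];
    end
endmodule
```

Clearing sum at the top of the block means every evaluation starts from a known value, so the blocking assignments in the loop simply unroll into a chain of adders.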
In critiquing the other answers delivered here, there are some comments to make.
The first important thing is to provide space for the sum to be accumulated. Statements such as the following, in RTL, won't do that:
sum = sum + array[i]
because each of the unique nets created on the right-hand side (RHS) of the expression is assigned back to the same signal, "sum", which makes it ambiguous which of those nets is actually the driver (a multiple-driver hazard). To compound the problem, this statement also creates a combinational loop, because sum is used combinationally to drive itself. It would be better if something different could be used as the load and as the driver on each successive iteration of the loop.
Back to the argument, though: in the above situation, most simulators will drive the signal to an unknown value (which driver should they pick? Either none of them is right or all of them are, hence unknown). That is, if it gets through the compiler at all, which is unlikely; it doesn't, at least in Cadence IEV.
The right way to do it would be to set up the following. Say you were summing bytes:
parameter NUM_BYTES = 4;
reg [7:0] array_of_bytes [NUM_BYTES-1:0];
reg [8+$clog2(NUM_BYTES):0] sum [NUM_BYTES-1:1];
always @* begin
for (int i=1; i<NUM_BYTES; i+=1) begin
if (i == 1) begin
sum[i] = array_of_bytes[i] + array_of_bytes[i-1];
end
else begin
sum[i] = sum[i-1] + array_of_bytes[i];
end
end
end
// The accumulated value is indexed at sum[NUM_BYTES-1]
Here is a module that works for arbitrarily sized arrays and does not require extra storage:
module arrsum(input clk,
input rst,
input go,
output reg [7:0] cnt,
input wire [7:0] buf_,
input wire [7:0] n,
output reg [7:0] sum);
always @(posedge clk, posedge rst) begin
if (rst) begin
cnt <= 0;
sum <= 0;
end else begin
if (cnt == 0) begin
if (go == 1) begin
cnt <= n;
sum <= 0;
end
end else begin
cnt <= cnt - 1;
sum <= sum + buf_;
end
end
end
endmodule
module arrsum_tb();
localparam N = 6;
reg clk = 0, rst = 0, go = 0;
wire [7:0] cnt;
reg [7:0] buf_, n;
wire [7:0] sum;
reg [7:0] arr[9:0];
integer i;
arrsum dut(clk, rst, go, cnt, buf_, n, sum);
initial begin
$display("time clk rst sum cnt");
$monitor("%4g %b %b %d %d",
$time, clk, rst, sum, cnt);
arr[0] = 5;
arr[1] = 6;
arr[2] = 7;
arr[3] = 10;
arr[4] = 2;
arr[5] = 2;
#5 clk = !clk;
#5 rst = 1;
#5 rst = 0;
#5 clk = !clk;
go = 1;
n = N;
#5 clk = !clk;
#5 clk = !clk;
for (i = 0; i < N; i++) begin
buf_ = arr[i];
#5 clk = !clk;
#5 clk = !clk;
go = 0;
end
#5 clk = !clk;
$finish;
end
endmodule
I designed it for 8-bit numbers but it can easily be adapted for other kinds of numbers too.

How to design a 64 x 64 bit array multiplier in Verilog?

I know how to design a 4x4 array multiplier, but if I follow the same logic, the coding becomes tedious.
4 x 4 - 16 partial products
64 x 64 - 4096 partial products.
A 4 x 4 multiplier needs 8 full adders and 4 half adders; how many full adders and half adders do I need for 64 x 64 bits? How do I reduce the number of partial products? Is there a simple way to solve this?
Whenever tediously coding a repetitive pattern you should use a generate statement instead:
module array_multiplier(a, b, y);
parameter width = 8;
input [width-1:0] a, b;
output [width-1:0] y;
wire [width*width-1:0] partials;
genvar i;
assign partials[width-1 : 0] = a[0] ? b : 0;
generate for (i = 1; i < width; i = i+1) begin:gen
assign partials[width*(i+1)-1 : width*i] = (a[i] ? b << i : 0) +
partials[width*i-1 : width*(i-1)];
end endgenerate
assign y = partials[width*width-1 : width*(width-1)];
endmodule
I've verified this module using the following test-bench:
http://svn.clifford.at/handicraft/2013/array_multiplier/array_multiplier_tb.v
EDIT:
As @Debian has asked for a pipelined version, here it is. This time it uses a for loop in an always block for the array part.
module array_multiplier_pipeline(clk, a, b, y);
parameter width = 8;
input clk;
input [width-1:0] a, b;
output [width-1:0] y;
reg [width-1:0] a_pipeline [0:width-2];
reg [width-1:0] b_pipeline [0:width-2];
reg [width-1:0] partials [0:width-1];
integer i;
always @(posedge clk) begin
a_pipeline[0] <= a;
b_pipeline[0] <= b;
for (i = 1; i < width-1; i = i+1) begin
a_pipeline[i] <= a_pipeline[i-1];
b_pipeline[i] <= b_pipeline[i-1];
end
partials[0] <= a[0] ? b : 0;
for (i = 1; i < width; i = i+1)
partials[i] <= (a_pipeline[i-1][i] ? b_pipeline[i-1] << i : 0) +
partials[i-1];
end
assign y = partials[width-1];
endmodule
Note that with many synthesis tools it's also possible to just add (width) register stages after the non-pipelined adder and let the tool's register balancing pass do the pipelining.
[how to] reduce the number of partial products?
A method that used to be fairly common is modified Booth encoding:
At the cost of more complicated addend selection, it almost halves the number of partial products.
In its simplest form, it considers groups of three adjacent bits (overlapping by one) from one of the operands, say b, and selects 0, a, 2a, -2a or -a as an addend.
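A behavioural sketch of that selection, under assumptions of my own: both operands are signed two's-complement values, the width W is even, and the module name booth_radix4 is made up for illustration. It is meant to show the encoding, not an optimized adder array.

```verilog
// Sketch: radix-4 (modified) Booth multiplication, W/2 addends instead of W.
// Assumes signed operands and an even W.
module booth_radix4 #(
    parameter W = 8
) (
    input  signed [W-1:0]       a,
    input  signed [W-1:0]       b,
    output reg signed [2*W-1:0] y
);
    integer i;
    reg        [W:0]     b_ext;   // b with a 0 appended below bit 0
    reg signed [2*W-1:0] addend;

    always @* begin
        b_ext = {b, 1'b0};
        y = 0;
        for (i = 0; i < W/2; i = i + 1) begin
            // Three adjacent bits of b, overlapping the previous group by one
            case ({b_ext[2*i+2], b_ext[2*i+1], b_ext[2*i]})
                3'b001, 3'b010: addend = a;       // +a
                3'b011:         addend = 2 * a;   // +2a
                3'b100:         addend = -2 * a;  // -2a
                3'b101, 3'b110: addend = -a;      // -a
                default:        addend = 0;       // 000 and 111 select 0
            endcase
            y = y + (addend <<< (2 * i));         // each group has weight 4^i
        end
    end
endmodule
```

For W = 8 the loop produces four addends rather than eight, which is where the "almost halves" figure comes from; unsigned operands would need one extra group for the top bit.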
The code below generates only half of the expected output.
module arr_multi(a, b, y);
parameter w = 8;
input [w-1:0] a, b; // w-width
output [(2*w)-1:0] y; // p-partials
wire [(2*w*w)-1:0] p; //assign width as input bits multiplied by output bits
genvar i;
assign p[(2*w)-1 : 0] = a[0] ? b : 0; //first output size bits
generate
for (i = 1; i < w; i = i+1)
begin
assign p[(w*(4+(2*(i-1))))-1 : (w*2)*i] = (a[i]?b<<i :0) + p[(w*(4+(2*(i-2))))-1 :(w*2)*(i-1)];
end
endgenerate
assign y=p[(2*w*w)-1:(2*w)*(w-1)]; //taking last output size bits
endmodule

verilog debugging

I don't know what is wrong with the code below. Can someone help me debug?
module iloop(z,a);
input [31:0] a;
output z;
reg [4:0] i;
reg s, z;
initial begin
s = 0;
for(i=0; i<32; i=i+1) s = s | a[i];
z = !s;
end
endmodule
Your code has an infinite loop. You have declared i as a 5-bit reg, which means its range of values is (decimal) 0 to 31. But, your for loop checks if i < 32, which is always true.
Once i=31, i is incremented and rolls over to 0.
$display is your friend. If you add it to your for loop, you will see the problem:
for(i=0; i<32; i=i+1) begin $display(i); s = s | a[i]; end
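I think you want i<31, although that stops before a[31]; declaring i wider (for example, as an integer) lets the loop cover all 32 bits and still terminate.
Or, maybe you want to OR all the bits of a together, using the reduction OR operator: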
s = |a;
You should explain in words what you are trying to achieve.
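As a sketch of one way to make the loop terminate while still visiting every bit, the counter can be declared as an integer. This version also replaces the initial block with always @* so that z follows a; that is my assumption about what was intended, since the question does not say.

```verilog
// Sketch: z is 1 when no bit of a is set.
module iloop (z, a);
    input  [31:0] a;
    output        z;

    integer i;      // wide enough to reach 32, so i < 32 can become false
    reg     s, z;

    always @* begin
        s = 0;
        for (i = 0; i < 32; i = i + 1)
            s = s | a[i];
        z = !s;
    end
endmodule
```

With the reduction operator the whole body collapses to z = ~|a; which gives the same value without any loop.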
