Subtraction in Verilog - verilog

I am working on a Verilog fixed-point adder, which I will also use for subtraction. The subtraction does not always give the correct result.
For example, 1-1 should be 0, but I get -0.
Please have a look at the code below:
`timescale 1ns/1ps
module adder #(
//Parameterized values
parameter Q = 27,
parameter N = 32
)
(
input [N-1:0] a,
input [N-1:0] b,
output [N-1:0] c
);
reg [N-1:0] res;
assign c = res;
always @(a,b) begin
// both negative or both positive
if(a[N-1] == b[N-1]) begin //Since they have the same sign, absolute magnitude increases
res[N-2:0] = a[N-2:0] + b[N-2:0]; //So just the two numbers are added
res[N-1] = a[N-1]; //and the sign is set appropriately...
end
// one of them is negative...
else if(a[N-1] == 0 && b[N-1] == 1) begin // subtracts a-b
if( a[N-2:0] > b[N-2:0] ) begin // if a is greater than b,
res[N-2:0] = a[N-2:0] - b[N-2:0];
res[N-1] = 0; // manually the sign is set to positive
end
else begin // if a is less than b,
res[N-2:0] = b[N-2:0] - a[N-2:0]; // subtracting a from b to avoid a 2's complement answer
if (res[N-2:0] == 0)
res[N-1] = 0; // To remove negative zero....
else
res[N-1] = 1; // and manually the sign is set to negative
end
end
else begin // subtract b-a (a negative, b positive)
if( a[N-2:0] > b[N-2:0] ) begin // if a is greater than b,
res[N-2:0] = a[N-2:0] - b[N-2:0]; // subtracting b from a to avoid a 2's complement answer
if (res[N-2:0] == 0)
res[N-1] = 0;
else
res[N-1] = 1; // and manually the sign is set to negative
end
else begin // if a is less than b,
res[N-2:0] = b[N-2:0] - a[N-2:0];
res[N-1] = 0;
end
end
end
endmodule
Testbench for the adder is below:
`timescale 1ns/1ps
module tb_adder (
);
reg clk;
reg [ 31 : 0 ] a;
reg [ 31 : 0 ] b;
wire [ 31: 0 ] c;
adder adder_i (
.a(a),
.b(b),
.c(c)
);
parameter CLKPERIODE = 100;
initial clk = 1'b1;
always #(CLKPERIODE/2) clk = !clk;
initial begin
$monitor ("adder=%h", c);
#1
a = 32'h08000000;
b = 32'hF8000000;
#(CLKPERIODE)
$finish();
end
endmodule
I am having a hard time finding where I went wrong, as I am a newbie in Verilog. I am using this module to calculate a Taylor series in fixed-point arithmetic. Any suggestions?

The only case I could find where your code produces the dirty zero is when both inputs are themselves the dirty zero, i.e.
a = b = 32'h80000000 = "-0"
It looks like this happens because in this case your code takes the branch at
if(a[N-1] == b[N-1]) begin //Since they have the same sign, absolute magnitude increases
and this branch doesn't have the check that the other branches use to avoid a negative zero. You could fix this by moving that check to the end of the always block so that it runs no matter which branch was taken earlier.
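The suggestion above can be sketched as follows. This is only an illustration, not the asker's final code: the module name sm_adder is made up, the unused Q parameter is dropped, and the two mixed-sign branches are folded into one magnitude comparison. The point is the single zero check at the end of the always block.

```verilog
// Hypothetical sign-magnitude adder; bit N-1 is the sign, bits N-2:0 the magnitude.
module sm_adder #(
    parameter N = 32
)(
    input  [N-1:0] a,
    input  [N-1:0] b,
    output [N-1:0] c
);
    reg [N-1:0] res;
    assign c = res;

    always @* begin
        if (a[N-1] == b[N-1]) begin
            // Same sign: add magnitudes, keep the common sign
            res[N-2:0] = a[N-2:0] + b[N-2:0];
            res[N-1]   = a[N-1];
        end else if (a[N-2:0] >= b[N-2:0]) begin
            // Different signs, |a| >= |b|: result takes the sign of a
            res[N-2:0] = a[N-2:0] - b[N-2:0];
            res[N-1]   = a[N-1];
        end else begin
            // Different signs, |a| < |b|: result takes the sign of b
            res[N-2:0] = b[N-2:0] - a[N-2:0];
            res[N-1]   = b[N-1];
        end
        // One check for every branch: never emit a negative zero
        if (res[N-2:0] == 0)
            res[N-1] = 1'b0;
    end
endmodule
```

With this structure, a = b = 32'h80000000 gives 32'h00000000 instead of 32'h80000000. Magnitude overflow in the same-sign branch is still not handled here.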

Related

8 bit sequential multiplier using add and shift

I'm designing an 8-bit signed sequential multiplier using Verilog. The inputs are clk (clock), rst (reset), a (8-bit multiplier), b (8-bit multiplicand), and the outputs are p (product) and rdy (ready signal, indicating that the multiplication is over). For negative inputs, I do a sign extension and save the result in the 16-bit register variables multiplier and multiplicand. Here's my code:
module seq_mult (p, rdy, clk, reset, a, b);
input clk, reset;
input [7:0] a, b;
output [15:0] p;
output rdy;
reg [15:0] p;
reg [15:0] multiplier;
reg [15:0] multiplicand;
reg rdy;
reg [4:0] ctr;
always @(posedge clk or posedge reset) begin
if (reset)
begin
rdy <= 0;
p <= 0;
ctr <= 0;
multiplier <= {{8{a[7]}}, a};
multiplicand <= {{8{b[7]}}, b};
end
else
begin
if(ctr < 16)
begin
if(multiplier[ctr]==1)
begin
multiplicand = multiplicand<<ctr;
p <= p + multiplicand;
end
ctr <= ctr+1;
end
else
begin
rdy <= 1;
end
end
end //End of always block
endmodule
And here's my testbench:
`timescale 1ns/1ns
`define width 8
`define TESTFILE "test_in.dat"
module seq_mult_tb () ;
reg signed [`width-1:0] a, b;
reg clk, reset;
wire signed [2*`width-1:0] p;
wire rdy;
integer total, err;
integer i, s, fp, numtests;
// Golden reference - can be automatically generated in this case
// otherwise store and read from a file
wire signed [2*`width-1:0] ans = a*b;
// Device under test - always use named mapping of signals to ports
seq_mult dut( .clk(clk),
.reset(reset),
.a(a),
.b(b),
.p(p),
.rdy(rdy));
// Set up 10ns clock
always #5 clk = !clk;
// A task to automatically run till the rdy signal comes back from DUT
task apply_and_check;
input [`width-1:0] ain;
input [`width-1:0] bin;
begin
// Set the inputs
a = ain;
b = bin;
// Reset the DUT for one clock cycle
reset = 1;
@(posedge clk);
// Remove reset
#1 reset = 0;
// Loop until the DUT indicates 'rdy'
while (rdy == 0) begin
@(posedge clk); // Wait for one clock cycle
end
if (p == ans) begin
$display($time, " Passed %d * %d = %d", a, b, p);
end else begin
$display($time, " Fail %d * %d: %d instead of %d", a, b, p, ans);
err = err + 1;
end
total = total + 1;
end
endtask // apply_and_check
initial begin
// Initialize the clock
clk = 1;
// Counters to track progress
total = 0;
err = 0;
// Get all inputs from file: 1st line has number of inputs
fp = $fopen(`TESTFILE, "r");
s = $fscanf(fp, "%d\n", numtests);
// Sequences of values pumped through DUT
for (i=0; i<numtests; i=i+1) begin
s = $fscanf(fp, "%d %d\n", a, b);
apply_and_check(a, b);
end
if (err > 0) begin
$display("FAIL %d out of %d", err, total);
end else begin
$display("PASS %d tests", total);
end
$finish;
end
endmodule // seq_mult_tb
I also created a file called test_in.dat in which the test cases are stored (first line indicates number of test cases):
10
5 5
2 3
10 1
10 2
20 20
-128 2
10 -128
-1 -1
10 0
0 2
Now the problem is: the code works for only the first two inputs and for the last two inputs. For the remaining inputs, I get a different number than is expected. Can someone point out any logical error in my code that is causing this? Or if there's a much simpler strategy for doing the same, please let me know of that as well.
multiplicand is shifted to the left by ctr in each iteration in which multiplier[ctr] is 1.
But multiplicand has already been shifted in earlier iterations, so the shifts accumulate and you are shifting too far.
You should just shift multiplicand by 1 in every iteration, unconditionally:
multiplicand <= multiplicand << 1;
if (multiplier[ctr] == 1)
begin
p <= p + multiplicand;
end
ctr <= ctr + 1;
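You should also use a nonblocking assignment for multiplicand. With nonblocking assignments the order of the two statements does not matter, because p is updated from the value multiplicand had before the clock edge; with a blocking assignment you would need to move the shift to after the addition to p.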
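Putting the fix into the original module gives roughly the following. This is a sketch that keeps the asker's port names and reset behaviour and changes only the shift-and-add step.

```verilog
// Shift-and-add multiplier: one partial product per clock, 16 clocks per result.
module seq_mult (p, rdy, clk, reset, a, b);
    input         clk, reset;
    input  [7:0]  a, b;
    output [15:0] p;
    output        rdy;

    reg [15:0] p;
    reg [15:0] multiplier;
    reg [15:0] multiplicand;
    reg        rdy;
    reg [4:0]  ctr;

    always @(posedge clk or posedge reset) begin
        if (reset) begin
            rdy          <= 1'b0;
            p            <= 16'd0;
            ctr          <= 5'd0;
            multiplier   <= {{8{a[7]}}, a};   // sign-extend to 16 bits
            multiplicand <= {{8{b[7]}}, b};
        end else if (ctr < 16) begin
            // Add the current (already shifted) multiplicand if this bit is set
            if (multiplier[ctr])
                p <= p + multiplicand;
            // Shift by exactly one position every cycle
            multiplicand <= multiplicand << 1;
            ctr          <= ctr + 1'b1;
        end else begin
            rdy <= 1'b1;
        end
    end
endmodule
```

Because both operands are sign-extended to 16 bits and all arithmetic is modulo 2^16, the low 16 bits equal the signed product for every pair of 8-bit inputs.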

Implementing one-bit flags in a 32Bit ALU using Verilog

I am working on an assignment and am a little lost and don't really know how to get started. I need to implement the following flags in a 32-bit ALU:
• Z ("Zero"): Set to 1 ("True") if the result of the operation is zero
• N ("Negative"): Set to 1 ("True") if the first bit of the result is 1, which indicates a negative number
• O ("Overflow"): Set to 1 ("True") to indicate that the operation overflowed the bus width.
Additionally, a comparison function that compares input a to input b and then set one of three flags:
• LT if input a is less than input b
• GT if input a is greater than input b
• EQ if input a is equal to input b
I need to modify this ALU to include the three flags and comparison outputs then change the test bench to test for all of these modifications.
This was all the information I received for this assignment and there is no textbook or any other resources really. It's an online class, and I cannot get a response from my instructor. So I am a little confused as to how to get started. I am still a total newbie when it comes to digital logic so please bear with me. I just need some help understanding how these flags and comparison works. If any one can explain this a little better to me as far as how they work and what they do, and possibly how I would implement them into the ALU and testbench, I would really appreciate it.
I don't expect anyone to do my assignment, I really just need help understanding it.
ALU
module alu32 (a, b, out, sel);
input [31:0] a, b;
input [3:0] sel;
output [31:0] out;
reg [31:0] out;
//Code starts here
always @(a, b, sel)
begin
case (sel)
//Arithmetic Functions
0 : out <= a + b;
1 : out <= a - b;
2 : out <= b - a;
3 : out <= a * b;
4 : out <= a / b;
5 : out <= b % a;
//Bit-wise Logic Functions
6 : out <= ~a; //Not
7 : out <= a & b; //And
8 : out <= a | b; //Or
9 : out <= a ^ b; //XOR
10 : out <= a ^~ b; //XNOR
//Logic Functions
11 : out <= !a;
12 : out <= a && b;
13 : out <= a || b;
default: out <= a + b;
endcase
end
endmodule
ALU Testbench
module alu32_tb();
reg [31:0] a, b;
reg [3:0] sel;
wire [31:0] out;
initial begin
$monitor("sel=%d a=%d b=%d out=%d", sel,a,b,out);
//Fundamental tests - all a+b
#0 sel=4'd0; a = 8'd0; b = 8'd0;
#1 sel=4'd0; a = 8'd0; b = 8'd25;
#1 sel=4'd0; a = 8'd37; b = 8'd0;
#1 sel=4'd0; a = 8'd45; b = 8'd75;
//Arithmetic
#1 sel=4'd1; a = 8'd120; b = 8'd25; //a-b
#1 sel=4'd2; a = 8'd30; b = 8'd120; //b-a
#1 sel=4'd3; a = 8'd75; b = 8'd3; //a*b
#1 sel=4'd4; a = 8'd75; b = 8'd3; //a/b
#1 sel=4'd5; a = 8'd74; b = 8'd3; //b%a
//Bit-wise Logic Functions
#1 sel=4'd6; a = 8'd31; //Not
#1 sel=4'd7; a = 8'd31; b = 8'd31; //And
#1 sel=4'd8; a = 8'd30; b = 8'd1; //Or
#1 sel=4'd9; a = 8'd30; b = 8'd1; //XOR
#1 sel=4'd10; a = 8'd30; b = 8'd1; //XNOR
//Logic Functions
#1 sel=4'd11; a = 8'd25; //Not
#1 sel=4'd12; a = 8'd30; b = 8'd0; //And
#1 sel=4'd13; a = 8'd0; b = 8'd30; //Or
#1 $finish;
end
alu32 myalu (.a(a), .b(b), .out(out), .sel(sel));
endmodule
You can add these flag outputs to the design, as in the following, and simply connect them in the testbench.
// In design:
output zero;
output overflow;
output negative;
// In testbench:
wire zero,overflow,negative;
alu32 myalu (.a(a), .b(b), .out(out), .sel(sel), .zero(zero), .overflow(overflow),.negative(negative));
For the logic, you can use continuous assignments. You may need to add some logic so that these flags are used only for certain values of sel.
Z ("Zero"): Set to 1 ("True") if the result of the operation is zero
So the condition is that all the bits of out are zero. This can be written in several ways, for example:
// NOR-reduce out: the result is 1 only when every bit is 0
assign zero = ~(|out);
O ("Overflow"): Set to 1 ("True") to indicate that the operation overflowed the bus width.
According to this description and the code shown, you may simply want the carry flag here, that is, one extra result bit for the addition. Refer to this page on Wikipedia for the overflow condition.
But the overflow condition is not the same as the carry bit. Overflow represents data loss in signed arithmetic, while carry represents a bit used for calculation in the next stage.
So, doing something like the following may be useful:
// Extend the result for capturing carry bit
// Simply use this bit if you want result > bus width
{carry,out} <= a+b;
// overflow in signed addition: both operands have the same sign,
// but the sign of the result differs from it
assign overflow = (a[31] == b[31]) && (out[31] != a[31]);
N ("Negative"): Set to 1 ("True") if the first bit of the result is 1, which indicates a negative number
Again, this is simply the MSB of the out register. The underflow condition, however, is an entirely different thing.
// Depending on sel, subtraction must be performed here
assign negative = (out[31] == 1 && (sel == 1 || sel == 2));
Also, simple conditions like assign lt = (a<b) ? 1 : 0; (and similar ones for the others) can detect the LT, GT and EQ conditions of the inputs.
Refer to the answer here for an understanding of the overflow/underflow flag. The Overflow-Carry link may also be useful.
Refer to Carryout-Overflow, ALU in Verilog and ALU PDF for further information about ALU implementation.
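As a rough sketch of how the pieces could fit together, the flag logic can live in a small combinational block next to the ALU. The module name alu32_flags is hypothetical, and the sketch assumes sel == 0 is a + b and sel == 1 is a - b, as in the ALU from the question; the other operations (including b - a) are left without an overflow flag, and the comparison is unsigned.

```verilog
// Hypothetical flag logic for the 32-bit ALU result 'out'.
module alu32_flags (
    input  [31:0] a,
    input  [31:0] b,
    input  [31:0] out,
    input  [3:0]  sel,
    output        zero,
    output        negative,
    output        overflow,
    output        lt,
    output        gt,
    output        eq
);
    wire is_add = (sel == 4'd0);   // out = a + b
    wire is_sub = (sel == 4'd1);   // out = a - b

    assign zero     = (out == 32'd0);
    assign negative = out[31];

    // Signed overflow:
    //   add: operands share a sign, result has the opposite sign
    //   sub: operands differ in sign, result sign differs from a
    assign overflow = (is_add & (a[31] == b[31]) & (out[31] != a[31]))
                    | (is_sub & (a[31] != b[31]) & (out[31] != a[31]));

    // Unsigned comparison of the raw inputs
    assign lt = (a <  b);
    assign gt = (a >  b);
    assign eq = (a == b);
endmodule
```

In the testbench, the six outputs would be declared as wires and connected by name, in the same way as out.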

Verilog comparator

I'm a newbie to Verilog.
I did a lot of research, and finally wrote this code, but it seems to not work.
Can anyone fix it for me?
module comparator();
reg[3:0] a, b;
wire[1:0] equal, lower, greater;
if (a<b) begin
equal = 0;
lower = 1;
greater = 0;
end
else if (a==b) begin
equal = 1;
lower = 0;
greater = 0;
end
else begin
equal = 0;
lower = 0;
greater = 1;
end;
initial begin
$monitor($time,
"a=%b, b=%b, greater=%b, equals=%b, lower=%b",
a, b, greater, equal, lower);
a=9; b=10;
#100 $display ("\n", $time, "\n");
end
endmodule
Behavioural procedures must be enclosed within an always block.
Also, your module needs inputs and outputs. A more correct version would be like this:
module comparator (
input wire [3:0] a,
input wire [3:0] b,
output reg equal,
output reg lower,
output reg greater
);
always @* begin
if (a<b) begin
equal = 0;
lower = 1;
greater = 0;
end
else if (a==b) begin
equal = 1;
lower = 0;
greater = 0;
end
else begin
equal = 0;
lower = 0;
greater = 1;
end
end
endmodule
I suggest reading a tutorial about behavioral modelling with Verilog, because you missed a lot of points:
How to correctly define inputs and outputs in a module
What things can be wires and what things should be regs
The use of always @* to model combinational logic
And most important: how to write a test bench. Test benches are written as a module (with no inputs and outputs) that instantiates your UUT (unit under test), provides inputs, reads outputs and checks whether they are valid.
module testcomp;
reg [3:0] a, b;
wire eq, lw, gr;
comparator uut (
.a(a),
.b(b),
.equal(eq),
.lower(lw),
.greater(gr)
);
initial begin
a = 0;
repeat (16) begin
b = 0;
repeat (16) begin
#10;
$display ("TESTING %d and %d yields eq=%d lw=%d gr=%d", a, b, eq, lw, gr);
if (a==b && (eq!=1'b1 || gr!=1'b0 || lw!=1'b0)) begin
$display ("ERROR!");
$finish;
end
if (a>b && (eq!=1'b0 || gr!=1'b1 || lw!=1'b0)) begin
$display ("ERROR!");
$finish;
end
if (a<b && (eq!=1'b0 || gr!=1'b0 || lw!=1'b1)) begin
$display ("ERROR!");
$finish;
end
b = b + 1;
end
a = a + 1;
end
$display ("PASSED!");
$finish;
end
endmodule
You can play with this example at EDAPlayGround using this link:
http://www.edaplayground.com/x/CPq
Without always block:
module comparator (
input wire [3:0] a,
input wire [3:0] b,
output wire equal,
output wire lower,
output wire greater
);
assign equal = (a===b);
assign lower = (a<b)?1'b1:1'b0;
assign greater = (a>b)?1'b1:1'b0;
endmodule
Be careful: you need to consider 'X' and 'Z'. With "==" the result can be X when an operand contains X or Z bits, so use "===" if those states must be compared literally.
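A small simulation-only sketch makes the difference visible. The module name eq_demo and the test values are made up for illustration.

```verilog
// Simulation-only demo: logical equality vs. case equality with an X bit.
module eq_demo;
    reg [3:0] a, b;
    initial begin
        a = 4'b10x1;
        b = 4'b10x1;
        // '==' cannot decide because of the unknown bit, so it yields x
        $display("==  gives %b", a == b);
        // '===' compares x and z literally, so it yields 1
        $display("=== gives %b", a === b);
        $finish;
    end
endmodule
```

Note that === is mainly a simulation and testbench operator; how a synthesis tool treats it is tool-dependent, so it is worth checking before relying on it in RTL.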

Priority encoder in verilog

I am somewhat new to verilog, I tried running this code but it gives me an error:
module enc(in,out);
input [7:0] in;
output [3:0] out;
reg i;
reg [3:0] out;
always @*
begin
for (i=0;i<7;i=i+1)
begin
if ((in[i]==1) && (in[7:i+1]==0))
out = i;
else
out = 0;
end
end
endmodule
I think it complains about in[7:i+1] but i don't understand why ?
Can someone please advise..
EDIT
OK, so I am reluctant to use the X values due to their numerous problems. I was thinking of modifying the code to something like this:
module enc(in,out);
input [7:0] in;
output [2:0] out;
reg i;
reg [2:0] out,temp;
always @*
begin
temp = 0;
for (i=0;i<8;i=i+1)
begin
if (in[i]==1)
temp = i;
end
out = temp;
end
endmodule
Do you think that will do the trick ? I currently don't have access to a simulator..
A priority encoder means giving priority to one bit if two or more bits meet the criteria. Looking at your code, it appears you wanted to give priority to the LSB while using an up counter. out is assigned in every loop iteration, so even if your code could compile, the final result would be 6 or 0.
For an LSB priority encoder, first start with a default value for out and use a down counter:
module enc (
input wire [7:0] in,
output reg [2:0] out
);
integer i;
always @* begin
out = 0; // default value if 'in' is all 0's
for (i=7; i>=0; i=i-1)
if (in[i]) out = i;
end
endmodule
If you are only interested in simulation, then your linear loop approach should be fine, something like
out = 0;
for (i = W - 1; i > 0; i = i - 1) begin
if (in[i] && !out)
out = i;
end
If you also care about performance, the question becomes more interesting. I once experimented with different approaches to writing parameterized priority encoders here. It turned out that Synopsys can generate an efficient implementation even from the brain-dead loop above, but other toolchains needed explicit generate magic. Here is an excerpt from the link:
output [WIDTH_LOG - 1:0] msb;
wire [WIDTH_LOG*WIDTH - 1:0] ors;
assign ors[WIDTH_LOG*WIDTH - 1:(WIDTH_LOG - 1)*WIDTH] = x;
genvar w, i;
integer j;
generate
for (w = WIDTH_LOG - 1; w >= 0; w = w - 1) begin
assign msb[w] = |ors[w*WIDTH + 2*(1 << w) - 1:w*WIDTH + (1 << w)];
if (w > 0) begin
assign ors[(w - 1)*WIDTH + (1 << w) - 1:(w - 1)*WIDTH] = msb[w] ? ors[w*WIDTH + 2*(1 << w) - 1:w*WIDTH + (1 << w)] : ors[w*WIDTH + (1 << w) - 1:w*WIDTH];
end
end
endgenerate
So my Edited solution worked... how silly !! I forgot to declare reg [2:0] i; and instead wrote reg i;
Thanks everybody
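For reference, here is a sketch of the edited module with the loop index declared as an integer. The module name enc_msb is made up. An integer index is the safer choice for a bound like i < 8, since a 3-bit reg can never reach 8 and wraps from 7 back to 0.

```verilog
// MSB-priority encoder: the last (highest) set bit seen by the loop wins.
module enc_msb (
    input  wire [7:0] in,
    output reg  [2:0] out
);
    integer i;
    always @* begin
        out = 3'd0;                 // default when 'in' is all zeros
        for (i = 0; i < 8; i = i + 1)
            if (in[i])
                out = i[2:0];       // later iterations override earlier ones
    end
endmodule
```

Note that in = 8'b00000001 and in = 8'b00000000 both give out = 0, so a separate valid output would be needed to tell them apart.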
Hunks, I have to tell you, all your solutions are either too complex or non-synthesizable, or implement into slow multiplexors. Alexej Bolshakov at OpenCores uploaded an outstandin' parametrizable encoder on Aug 23, 2015, based on OR elements. No muxes, 100% synthesizable. His code (with my tiny formatting):
module encoder #(
parameter LINES = 16,
parameter WIDTH = $clog2(LINES)
)(
input [LINES-1:0] unitary_in,
output wor [WIDTH-1:0] binary_out
);
genvar i, j;
generate
for (i = 0; i < LINES; i = i + 1)
begin: loop_i
for (j = 0; j < WIDTH; j = j + 1)
begin: loop_j
if (i[j])
assign binary_out[j] = unitary_in[i];
end
end
endgenerate
endmodule
This solution divides the input into four blocks and checks for the first nonzero block. This block is further subdivided in the same way. It is reasonably efficient.
// find position of most significant 1 bit in 64 bits input
// (system verilog)
module bitscan(
input logic [63:0] in, // number input
output logic [5:0] out, // bit position output
output logic zeroout // indicates if input is zero
);
logic [63:0] m0; // intermediates
logic [15:0] m1;
logic [3:0] m2;
logic [5:0] r;
always_comb begin
m0 = in;
// choose between four 16-bit blocks
if (|m0[63:48]) begin
m1 = m0[63:48];
r[5:4] = 3;
end else if (|m0[47:32]) begin
m1 = m0[47:32];
r[5:4] = 2;
end else if (|m0[31:16]) begin
m1 = m0[31:16];
r[5:4] = 1;
end else begin
m1 = m0[15:0];
r[5:4] = 0;
end
// choose between four 4-bit blocks
if (|m1[15:12]) begin
m2 = m1[15:12];
r[3:2] = 3;
end else if (|m1[11:8]) begin
m2 = m1[11:8];
r[3:2] = 2;
end else if (|m1[7:4]) begin
m2 = m1[7:4];
r[3:2] = 1;
end else begin
m2 = m1[3:0];
r[3:2] = 0;
end
// choose between four remaining bits
if (m2[3]) r[1:0] = 3;
else if (m2[2]) r[1:0] = 2;
else if (m2[1]) r[1:0] = 1;
else r[1:0] = 0;
out = r;
zeroout = ~|m2;
end
endmodule
Here is another solution that uses slightly fewer resources:
module bitscan4 (
input logic [63:0] in,
output logic [5:0] out,
output logic zout
);
logic [63:0] m0;
logic [3:0] m1;
logic [3:0] m2;
logic [5:0] r;
always_comb begin
r = 0;
m0 = in;
if (|m0[63:48]) begin
r[5:4] = 3;
m1[3] = |m0[63:60];
m1[2] = |m0[59:56];
m1[1] = |m0[55:52];
m1[0] = |m0[51:48];
end else if (|m0[47:32]) begin
r[5:4] = 2;
m1[3] = |m0[47:44];
m1[2] = |m0[43:40];
m1[1] = |m0[39:36];
m1[0] = |m0[35:32];
end else if (|m0[31:16]) begin
r[5:4] = 1;
m1[3] = |m0[31:28];
m1[2] = |m0[27:24];
m1[1] = |m0[23:20];
m1[0] = |m0[19:16];
end else begin
r[5:4] = 0;
m1[3] = |m0[15:12];
m1[2] = |m0[11:8];
m1[1] = |m0[7:4];
m1[0] = |m0[3:0];
end
if (m1[3]) begin
r[3:2] = 3;
end else if (m1[2]) begin
r[3:2] = 2;
end else if (m1[1]) begin
r[3:2] = 1;
end else begin
r[3:2] = 0;
end
m2 = m0[{r[5:2],2'b0}+: 4];
if (m2[3]) r[1:0] = 3;
else if (m2[2]) r[1:0] = 2;
else if (m2[1]) r[1:0] = 1;
else r[1:0] = 0;
zout = ~|m2;
out = r;
end
endmodule
To be able to use variable indexes in part-select expressions, you must enclose the for block in a generate block, like this:
genvar i;
generate
for (i=0;i<7;i=i+1) begin :gen_slices
always @* begin
... do whatever with in[7:i+1]
end
end
endgenerate
The problem is that applying this to your module, the way it's written, leads to other errors. Your rewritten module would look like this (be warned: this won't work either):
module enc (
input wire [7:0] in,
output reg [2:0] out // I believe you wanted this to be 3 bits width, not 4.
);
genvar i; //a generate block needs a genvar
generate
for (i=0;i<7;i=i+1) begin :gen_block
always @* begin
if (in[i]==1'b1 && in[7:i+1]=='b0) // now this IS allowed :)
out = i;
else
out = 3'b0;
end
end
endgenerate
endmodule
This will throw a synthesis error about out being driven from more than one source. This means that the value assigned to out comes from several sources at the same time, and that is not allowed.
This is because the for block unrolls to something like this:
always @* begin
if (in[0]==1'b1 && in[7:1]=='b0)
out = 0;
else
out = 3'b0;
end
always @* begin
if (in[1]==1'b1 && in[7:2]=='b0)
out = 1;
else
out = 3'b0;
end
always @* begin
if (in[2]==1'b1 && in[7:3]=='b0)
out = 2;
else
out = 3'b0;
end
.... and so on...
So now you have multiple combinational blocks (always @*) trying to set a value to out. All of them will work at the same time, and all of them will try to put a specific value on out whether the if condition evaluates as true or false. Recall that the condition of each if statement is mutually exclusive with respect to the other if conditions (i.e. at most one if can evaluate to true).
So a quick and dirty way to avoid this multisource situation (I'm sure there are more elegant ways to solve this) is to let out be high impedance if the if block is not going to assign it a value. Something like this:
module enc (
input wire [7:0] in,
output reg [2:0] out // I believe you wanted this to be 3 bits width, not 4.
);
genvar i; //a generate block needs a genvar
generate
for (i=0;i<7;i=i+1) begin :gen_block
always @* begin
if (in[i]==1'b1 && in[7:i+1]=='b0) // now this IS allowed :)
out = i;
else
out = 3'bZZZ;
end
end
endgenerate
always @* begin
if (in[7]) // you missed the case in which in[7] is high
out = 3'd7;
else
out = 3'bZZZ;
end
endmodule
On the other hand, if you just need a priority encoder and your design uses fixed and small widths for inputs and outputs, you may write your encoder like this:
module enc (
input wire [7:0] in,
output reg [2:0] out
);
always @* begin
casex (in)
8'b1xxxxxxx : out = 3'd7;
8'b01xxxxxx : out = 3'd6;
8'b001xxxxx : out = 3'd5;
8'b0001xxxx : out = 3'd4;
8'b00001xxx : out = 3'd3;
8'b000001xx : out = 3'd2;
8'b0000001x : out = 3'd1;
8'b00000001 : out = 3'd0;
default : out = 3'd0;
endcase
end
endmodule
(although there seem to be reasons not to use casex in a design. Read the comment @Tim posted about it in this other question: How can I assign a "don't care" value to an output in a combinational module in Verilog )
In conclusion: I'm afraid that I don't have a bullet-proof design for your requirements (if we take into account the contents of the paper Tim linked in his comment), but at least you now know why i was not allowed inside a part-select expression.
On the other hand, you can have half of the work done by studying this code I gave as an answer to another SO question. In this case, the module works like a priority encoder, parameterized and without casex statements, only the output is not binary, but one-hot encoded.
How to parameterize a case statement with don't cares?
out = in&(~(in-1))
gives you a one-hot result that marks the first 1 found when scanning from LSB to MSB.
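Wrapped into a module, the expression might look like the sketch below. The module name lowest_one and the parameter W are made up for illustration.

```verilog
// Isolate the lowest set bit of 'in' as a one-hot vector.
// Subtracting 1 flips the lowest 1 and all zeros below it; inverting and
// AND-ing with the original leaves only that lowest 1.
module lowest_one #(
    parameter W = 8
)(
    input  wire [W-1:0] in,
    output wire [W-1:0] onehot
);
    assign onehot = in & ~(in - 1'b1);
endmodule
```

For example, in = 8'b01101000 gives in - 1 = 8'b01100111, its inverse is 8'b10011000, and the AND leaves 8'b00001000. An all-zero input gives an all-zero output.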

How to design a 64 x 64 bit array multiplier in Verilog?

I know how to design a 4x4 array multiplier , but if I follow the same logic , the coding becomes tedious.
4 x 4 - 16 partial products
64 x 64 - 4096 partial products.
The 4 x 4 multiplier needs 8 full adders and 4 half adders. How many full adders and half adders do I need for 64 x 64 bits? How do I reduce the number of partial products? Is there any simple way to solve this?
Whenever tediously coding a repetitive pattern you should use a generate statement instead:
module array_multiplier(a, b, y);
parameter width = 8;
input [width-1:0] a, b;
output [width-1:0] y;
wire [width*width-1:0] partials;
genvar i;
assign partials[width-1 : 0] = a[0] ? b : 0;
generate for (i = 1; i < width; i = i+1) begin:gen
assign partials[width*(i+1)-1 : width*i] = (a[i] ? b << i : 0) +
partials[width*i-1 : width*(i-1)];
end endgenerate
assign y = partials[width*width-1 : width*(width-1)];
endmodule
I've verified this module using the following test-bench:
http://svn.clifford.at/handicraft/2013/array_multiplier/array_multiplier_tb.v
EDIT:
As @Debian has asked for a pipelined version, here it is. This time using a for loop in an always block for the array part.
module array_multiplier_pipeline(clk, a, b, y);
parameter width = 8;
input clk;
input [width-1:0] a, b;
output [width-1:0] y;
reg [width-1:0] a_pipeline [0:width-2];
reg [width-1:0] b_pipeline [0:width-2];
reg [width-1:0] partials [0:width-1];
integer i;
always @(posedge clk) begin
a_pipeline[0] <= a;
b_pipeline[0] <= b;
for (i = 1; i < width-1; i = i+1) begin
a_pipeline[i] <= a_pipeline[i-1];
b_pipeline[i] <= b_pipeline[i-1];
end
partials[0] <= a[0] ? b : 0;
for (i = 1; i < width; i = i+1)
partials[i] <= (a_pipeline[i-1][i] ? b_pipeline[i-1] << i : 0) +
partials[i-1];
end
assign y = partials[width-1];
endmodule
Note that with many synthesis tools it's also possible to just add (width) register stages after the non-pipelined adder and let the tool's register-balancing pass do the pipelining.
[how to] reduce the number of partial products?
A method that used to be fairly common is modified Booth encoding:
At the cost of more complicated addend selection, it almost halves their number.
In its simplest form, it considers groups of three adjacent bits (overlapping by one) from one of the operands, say b, and selects 0, a, 2a, -2a or -a as the addend for each group.
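The recoding described above can be sketched behaviourally as follows. This is an illustration under stated assumptions rather than an optimized design: the module name booth_r4_mult is made up, both operands are treated as signed, W is assumed to be even, and the addends are summed in a simple loop instead of an adder array.

```verilog
// Radix-4 (modified) Booth multiplier sketch: W/2 addends instead of W.
module booth_r4_mult #(
    parameter W = 8                        // assumed even
)(
    input  signed [W-1:0]       a,
    input  signed [W-1:0]       b,
    output reg signed [2*W-1:0] y
);
    integer k;
    reg        [W:0]     b_ext;            // b with a 0 appended below the LSB
    reg        [2:0]     grp;              // {b[2k+1], b[2k], b[2k-1]}
    reg signed [2*W-1:0] a_ext, addend;

    always @* begin
        b_ext = {b, 1'b0};
        a_ext = a;                         // sign-extended, since a is signed
        y     = 0;
        for (k = 0; k < W/2; k = k + 1) begin
            grp = b_ext[2*k +: 3];         // overlapping group of three bits
            case (grp)
                3'b000, 3'b111: addend = 0;
                3'b001, 3'b010: addend = a_ext;            // +a
                3'b011:         addend = a_ext <<< 1;      // +2a
                3'b100:         addend = -(a_ext <<< 1);   // -2a
                default:        addend = -a_ext;           // -a (101, 110)
            endcase
            y = y + (addend <<< (2*k));    // weight of group k is 4^k
        end
    end
endmodule
```

Each group encodes the digit -2*b[2k+1] + b[2k] + b[2k-1], so an 8-bit operand produces four addends rather than eight partial products.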
The code below generates only half of the expected output.
module arr_multi(a, b, y);
parameter w = 8;
input [w-1:0] a, b; // w-width
output [(2*w)-1:0] y; // p-partials
wire [(2*w*w)-1:0] p; //assign width as input bits multiplied by output bits
genvar i;
assign p[(2*w)-1 : 0] = a[0] ? b : 0; //first output size bits
generate
for (i = 1; i < w; i = i+1)
begin
assign p[(w*(4+(2*(i-1))))-1 : (w*2)*i] = (a[i]?b<<i :0) + p[(w*(4+(2*(i-2))))-1 :(w*2)*(i-1)];
end
endgenerate
assign y=p[(2*w*w)-1:(2*w)*(w-1)]; //taking last output size bits
endmodule
