I am designing a processor in Verilog. I'm working on the ALU, specifically the multiplier for the ALU. I can get the correct results when performing multiplication with small, positive numbers, but if I try to multiply signed numbers I get issues. When a positive number multiplies a negative the result will not sign extend all the way to 64 bits, and if two negative numbers are multiplied the number is incorrect altogether (sign and value). Can anyone see where the issue lies? I assumed I was not performing an arithmetic shift but I adjusted that and am still getting the wrong results.
module multiplier(
input[31:0] operand1,
input[31:0] operand2,
output reg [63:0] product
);
reg [64:0] prod;
reg [31:0] mcand;
reg [31:0] sum;
integer i = 0;
always #* begin
prod = {32'b0,operand1};
mcand = operand2;
for(i=0;i<32;i=i+1) begin
//test 0 bit of product
case(prod[0])
1'b0:begin //if prod[0] == 0, arithmetic shift right
prod = prod>>>1;
end
1'b1:begin //if prod[0] == 1, add multiplicand to upper 32
//bits and arithmetic shift right
prod = {(prod[63:32]+mcand[31:0]),prod[31:0]};
prod = prod>>>1;
end
endcase
end
product = prod[63:0];
end
endmodule
For >>> to perform signed shift the variable must be declared as signed.
reg signed [64:0] prod;
Short example on eda playground.
Also note that prod = {32'b0,operand1}; is not a sign extension. You should probably be using:
prod = { {32{operand[31]}}, operand1 };
Related
Tried implementing Karatsuba multiplier for multiplying two binary numbers, the logic below works well for unsigned numbers, but getting incorrect answer when I change one of the inputs to a negative. In the example below a=1010011111000000(equals -88.25) and b= 0001010001000000(equals 20.25). The answer should be 11111001000001001111000000000000(equals:-1787.0625)but I end up getting the incorect answer. Have used fixed point logic, with inputs 16 bits and fraction point 8 bits, output being 32 bits with fraction bits 16.
module karatsuba( input signed [15:0] a,
input signed [15:0] b,
output signed [31:0] out
);
reg [15:0] ac,bd;
reg [31:0] t1;
reg [31:0]t2;
reg [24:0] psum;
initial begin
assign ac = a[15:8]*b[15:8];
assign bd = a[7:0]*b[7:0];
assign t2= bd;
assign t1={ac,16'b0000000000000000};
assign psum = {(a[15:8]+a[7:0])*(b[15:8]+b[7:0])-ac-bd,8'b00000000};
end
assign out= t1+psum+t2;
endmodule
module karatsuba_tb();
reg signed [15:0]a,b;
wire signed [31:0]out;
karatsuba uut(.a(a),.b(b),.out(out));
initial begin
a=16'b0101100001000000;
b=16'b0001010001000000;
end
endmodule
enter image description here
enter image description here
There are two issues pertaining to the signed multiply in the post:
Slices of vector variables (even slices of signed vectors) are treated as unsigned in Verilog. This is because when a slice is taken there is no way to identify the original sign bit, therefore it must be treated as unsigned
The solution is to cast the slices to signed so that Verilog treats them as signed. Like this:
assign ac = signed'(a[3:2]) * signed'(b[3:2]);
Make the line that defines the variable ac,bd to be treated as signed using the signed keyword (default is unsigned).
You will need to propagate these changes to other places in the posted code which have the same issues.
Here is a simplified version of the post using small numbers to illustrate the cast and keyword use:
module karatsuba(
input signed [3:0] a,
input signed [3:0] b
);
reg signed [3:0] ac;
assign ac = signed'(a[3:2]) * signed'(b[3:2]);
endmodule
module karatsuba_tb();
reg signed [3:0]a,b;
karatsuba uut(.a(a),.b(b));
initial begin
a = 4'b1110;
b = 4'b1111;
#1;
//
$display("---------------");
$display("uut.a[3:2] = %b",uut.a[3:2]);
$display("uut.b[3:2] = %b",uut.b[3:2]);
$display("uut.ac = %b",uut.ac);
$display("---------------");
//
$display("uut.a[3:2] = %d",signed'(uut.a[3:2]));
$display("uut.b[3:2] = %d",signed'(uut.b[3:2]));
$display("uut.ac = %d",uut.ac);
$display("---------------");
end
endmodule
The example displays this message at runtime:
---------------
uut.a[3:2] = 11
uut.b[3:2] = 11
uut.ac = 0001
---------------
uut.a[3:2] = -1
uut.b[3:2] = -1
uut.ac = 1
---------------
If I reduce the number of bits after arithmetic right shift in verilog, do I still get the correct signed number? Is this valid?
number of bits reduced = shift value
A = 1110_1110
A>>>1
new A = 111_0111
Yes, but you should use three '>' not four and of course the new variable should be big enough:
wire signed [7:0] A,B;
wire signed [6:0] just_fits;
wire signed [5:0] oops;
assign B = A >>> 1; // Signed divide by two
assign just_fits = A >>> 1; // Signed divide by two
assign oops = A >>> 1; // Goes wrong
I am designing a signed verilog multiplier which I intend to use multiple times in another module.
My two inputs will be always s4.27 format. 1 bit signed, 4 bits of integer and 27 bits of fraction. My output has to be also in s4.27 format and I have to get the most accurate result out of it.
In C, the following not so perfect code snippet did the job.
int32_t mul(int32_t x, int32_t y)
{
int64_t mul = x;
mul *= y;
mul >>= 27;
return (int32_t) mul;
}
In verilog my simple version of code is given below,
`timescale 1ns/1ps
module fixed_multiplier (
i_a,
i_b,
o_p,
clk
);
input clk;
input signed [31:0] i_a;
input signed [31:0] i_b;
output signed [31:0] o_p;
wire signed [63:0] out;
assign out = i_a*i_b;
assign o_p = out;
endmodule
The above mentioned code has bugs that I know because I am not getting the desired results at all.
So my questions are,
(1) As this line "assign o_result = out;" seems crucial to me, how shall I do my assignments to my final output so that I get the correct and most accurate s4.27 format output? Please note, this output will be fed to an adder and the adder output will be again an input for the multiplier.
Above question being asked, I also tried with xor-ing of sign bits of both inputs and assigning [57:27] bits to final output. Did not suit me and resulted in overflow, while in C same inputs did not give any overflow error.
(2) With C I did not have any problem with fixed-point multiplication while in verilog I guess I am struggling as I am quite a newbie. Any suggestions what things to keep in mind while dealing with signed multiplication/addition?
Below is the testbench code,
`timescale 1ns / 1ps
module tb_mul;
// Inputs
reg clk;
reg [31:0] a;
reg [31:0] b;
// Outputs
wire [31:0] c;
fixed_multiplier mul_i (
.clk(clk),
.i_a(a),
.i_b(b),
.o_p(c)
);
initial begin
$dumpfile("test_mul.vcd");
$dumpvars(1);
$monitor ("a=%h,\tb=%h,\tc=%h",a,b,c);
a = 32'h10000000;
b = 32'h10000000;
$finish();
end
endmodule
Thank you in advance.
Your multiplier does not work like you think it does. Verilog will assume your multiplication will be unsigned, and will compute it as such. You might want to do something like the following:
wire [61:0] temp_out;
assign temp_out = i_multiplicand[30:0] * i_multiplier[30:0];
assign sign = i_multiplicand[31] ^ i_multiplier[31];
assign out = {sign, temp_out[57:37]};
I have tried to write the code in Verilog to multiply two 32 bit binary numbers using a 32 bit carry look ahead adder but my program fails to compile. the generate if condition must be a constant expression error keeps on coming in Modelsim for the part 'if(store[0]==1)' and 'if(C[32]==1)'
This is the algorithm that I followed:
Begin Program
Multiplier = 32 bits
Multiplicand = 32 bits
Register = 64 bits
Put the multiplier in the least significant half and clear
the most significant half
For i = 1 to 32
Begin Loop
If the least significant bit of the 64-bit register
contains binary ‘1’
Begin If
Add the Multiplicand to the Most Significant
Half using the CLAA
Begin Adder
C[0 ] = ’0’
For j = 0 to 31
Begin Loop
Calculate Propagate P[j] = Multiplicand[j]^ Most Significant Half[j]
Calculate Generate G[j] =
Multiplicand[j]·Most Significant Half[j]
Calculate Carries C[i + 1] = G[i] + P[i] ·
C[i]
Calculate Sum S[i] = P[i] Å C[i]
End Loop
End Adder
Shift the 64-bit Register one bit to the right
throwing away the least significant bit
Else
Only Shift the 64-bit Register one bit to the
right throwing away the least significant bit
End If
End Loop
Register = Sum of Partial Products
End Program
Code:
module Multiplier_32(multiplier,multiplicand,store);
output store;
input [31:0]multiplier,multiplicand;
wire [63:0]store;
genvar i,j;
wire g=32;
wire [31:0]P,G,sum;
wire [32:0]C;
assign store[31:0]=multiplier;
generate for(i=0;i<32;i=i+1)
begin
if(store[0]==1)
begin
assign C[0]=0;
for(j=0;j<32;j=j+1)
begin
assign P[j]= multiplicand[j]^store[g];
assign G[j]=multiplicand[j]&store[g];
assign C[j+1]=G[i]|(P[i]&C[j]);
assign sum[j]=P[i]^C[j];
assign g=g-1;
end
assign store[63:32]=sum[31:0];
if(C[32]==1)
begin
assign store[62:0]=store[63:1];
assign store[63]=1;
end
else
begin
assign store[62:0]=store[63:1];
assign store[63]=0;
end
end
else
begin
assign store[62:0]=store[63:1];
assign store[63]=0;
end
end
endgenerate
endmodule
A generate block is evaluated at compile/elaboration time. They are used to construct hardware from patterns and not to evaluate logic. The value of store[0], C[32], and all other signals are unknown at this time. The only know values are parameters and genvars.
In this case, a combinational block (always #*) will fulfill your functionality requirements. Replace all your wire with reg, but all your assignments inside a always #*, and remove all the assign keywords (assign should not be used inside an always block).
module Multiplier_32(
input [31:0] multiplier, multiplicand,
output reg [63:0] store
);
integer i,j;
integer g;
reg [31:0] P,G,sum;
reg [32:0] C;
always #* begin
g = 32;
store[31:0]=multiplier;
for(i=0;i<32;i=i+1) begin
// your code here, do not use 'assign'
end
end
endmodule
I'm trying to implement as follows to multiplying by 15.
module mul15(
output [10:0] result,
input [3:0] a
);
assign result = a*15;
endmodule
But is there any improve way to multiplying to a by 15?
I think there are 2 ways like this
1.result = a<<4 -1;
2.result = {a,3'b1111_1111};
Ans I think the best way is 2.
but I'm not sure also with aspect to synthesis.
update:
What if I am multiplying 0 at {a,3'b1111_1111}? This is 255 not 0.
Does anyone know the best way?
Update
How about this way?
Case1
result = {a,8'b0}+ {a,7'b0}+ {a,6'b0}+ {a,5'b0}+ {a,4'b0}+ {a,7'b0}+ {a,3'b0}+ {a,2'b0}+ {a,1'b0}+ a;
But it looks 8 adder used.
Case2
result = a<<8 -1
I'm not sure what is the best way else.
There is always a*16 - a. Static multiplications of power of 2 are basically free in hardware; it is just hard-coded 0s to the LSB. So you just need one 11-bit full-subtracter, which is a full adder and some inverters.
other forms:
result = a<<4 - a;
result = {a,4'b0} - a; // unsigned full-subtractor
result = {a,4'b0} + ~a + 1'b1; // unsigned full-adder w/ carry in, 2's complement
result = {{3{a[3]}},a,4'b0} + ~{ {7{a[3]}}, a} + 1'b1; // signed full-adder w/ carry in, 2's complement
The cleanest RTL version is as you have stated in the question:
module mul15(
input [3:0] a
output reg [7:0] result,
);
always #* begin
result = a * 4'd15;
end
endmodule
The Multiplicand 15 in binary is 4'b1111; That is 8 + 4 + 2 + 1.
Instead of a multiplier it could be broken down into the sum of these powers of 2. Powers of 2 are just barrel shifts. This is how a shift and add multiplier would work.
module mul15(
input [3:0] a
output reg [7:0] result,
);
always #* begin
// 8 4 2 1 =>15
result = (a<<3) + (a<<2) + (a<<1) + a;
end
endmodule
To minimise the number of adders required a CSD could be used. making 15 out of 16-1:
module mul15(
input [3:0] a
output reg [7:0] result,
);
always #* begin
// 16 - 1 =>15
result = (a<<4) - a;
end
endmodule
With a modern synthesis tool these should all result in same the thing. Therefore having more readable code which gives a clear instruction to the tool as to what you intended gives it the free rein to optimise as required.