Permutation in SystemVerilog using genvar - verilog

I am working on a piece of code for permutation. I could not find a nice way of assigning signals indexed with a genvar and an input.
I have an input i, an array of 4, and an output o of the same size, plus a permutation index. Depending on the permutation index, I want to assign outputs to inputs as follows:
perm_index 0 | o[0] <- i[0] | o[1] <- i[1] | o[2] <- i[2] | o[3] <- i[3]
perm_index 1 | o[0] <- i[1] | o[1] <- i[0] | o[2] <- i[3] | o[3] <- i[2]
perm_index 2 | o[0] <- i[2] | o[1] <- i[3] | o[2] <- i[0] | o[3] <- i[1]
perm_index 3 | o[0] <- i[3] | o[1] <- i[2] | o[2] <- i[1] | o[3] <- i[0]
This is the permutation that I want to code, and I want to use genvar for this. In my coding with genvar g and input [1:0] perm_index, this line o[g].addr <= i[g^perm_index].addr; causes an error "perm_index is not a constant".
Does anyone know a better way to write this code?
First I coded my permutation using a generate block, such as:
genvar g;
generate
for (g=0; g<4; g++)
always_comb
o[g].addr <= i[g^perm_index].addr;
endgenerate
However, Vivado does not accept this code, throwing out an error "perm_index is not a constant".
I could solve it by writing the code as shown below.
genvar g;
generate
for (g=0; g<4; g++)
always_comb
case (g[1:0])
(2'd0^perm_index) : o[g].addr <= i[0].addr;
(2'd1^perm_index) : o[g].addr <= i[1].addr;
(2'd2^perm_index) : o[g].addr <= i[2].addr;
(2'd3^perm_index) : o[g].addr <= i[3].addr;
endcase
endgenerate
Although this is a solution, I am not happy with this coding style. Does anyone know a better way to write this code?

It looks like you are trying to implement some unusual bit swapping. I can roughly guess what it is, but for this example a straightforward implementation is much easier than an attempt at a clever algorithm.
In any case, you do not need any generate statement at all. It is a special construct and it is supposed to be used for configuration purposes or algorithmic instantiation of multiple design elements. You do not have either.
So, I tried to follow your picture and created a possible implementation with a 'case' statement as the following:
module perm(input logic[1:0] perm_index, input logic[3:0]i, output logic [3:0] o);
always_comb begin
case(perm_index)
0: o = i;
1: o = {i[2], i[3], i[0], i[1]}; // concatenation is MSB first: {o[3], o[2], o[1], o[0]}
2: o = {i[1], i[0], i[3], i[2]};
3: o = {i[0], i[1], i[2], i[3]};
endcase
end
endmodule
I do not think that you can get it easier than that, unless you have much bigger vectors.
And here is a simple test bench
module top;
logic [1:0] perm_index; // matches the 2-bit port of perm; the loop below wraps around
logic [3:0] i, o;
perm perm(perm_index, i, o);
initial begin
$monitor("%0t> %0d: %b --> %b", $time, perm_index, i, o);
i = 4'b1100;
for (int pi = 0; pi < 7; pi++) begin
perm_index = pi;
#2;
end
#2 $finish;
end
endmodule
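For a plain packed vector like the one above (as opposed to an array of interface instances, which needs a constant index), a variable bit-select is legal inside a procedural block, so the XOR indexing from the question can also be written as a loop. This is only a minimal sketch: the module name perm_xor and the parameter N are hypothetical, and N is assumed to be a power of two so that k ^ perm_index stays in range.

```systemverilog
// Sketch: XOR-indexed permutation of a packed vector.
// perm_xor and N are illustrative names, not taken from the original posts.
module perm_xor #(parameter int N = 4) (
  input  logic [$clog2(N)-1:0] perm_index,
  input  logic [N-1:0]         i,
  output logic [N-1:0]         o
);
  always_comb begin
    for (int k = 0; k < N; k++)
      o[k] = i[k ^ perm_index]; // o[k] <- i[k XOR perm_index], as in the question's table
  end
endmodule
```

This scales to wider vectors without listing every case by hand, at the cost of leaving the mux structure to the synthesis tool.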
As it turned out, OP used System Verilog interfaces. Below is an example with interfaces.
First is a straight forward one with concats:
module perm(input logic[1:0] perm_index, I i[4], I o[4]);
always_comb begin
case(perm_index)
0: {o[0].addr, o[1].addr, o[2].addr, o[3].addr} =
{i[0].addr, i[1].addr, i[2].addr, i[3].addr};
1: {o[0].addr, o[1].addr, o[2].addr, o[3].addr} =
{i[1].addr, i[0].addr, i[3].addr, i[2].addr};
2: {o[0].addr, o[1].addr, o[2].addr, o[3].addr} =
{i[2].addr, i[3].addr, i[0].addr, i[1].addr};
3: {o[0].addr, o[1].addr, o[2].addr, o[3].addr} =
{i[3].addr, i[2].addr, i[1].addr, i[0].addr};
endcase
end
endmodule
Here is one with genblocks:
module perm(input logic[1:0] perm_index, I i[4], I o[4]);
parameter int perm_i[4][4] = '{
'{0, 1, 2, 3},
'{1, 0, 3, 2},
'{2, 3, 0, 1},
'{3, 2, 1, 0}};
for (genvar v = 0; v < 4; v++) begin
always_comb
case (perm_index)
0: o[v].addr = i[perm_i[0][v]].addr;
1: o[v].addr = i[perm_i[1][v]].addr;
2: o[v].addr = i[perm_i[2][v]].addr;
3: o[v].addr = i[perm_i[3][v]].addr;
endcase
end
endmodule
And an updated testbench
interface I;
logic[3:0] addr;
endinterface
module top;
I i[4]();
I o[4]();
logic [1:0] perm_index; // matches the 2-bit port of perm; the loop below wraps around
perm perm(perm_index, i, o);
initial begin
$monitor("%2t> [%0d], i(%0d,%0d,%0d,%0d) --> o(%0d,%0d,%0d,%0d)",
$time, perm_index,
i[0].addr, i[1].addr, i[2].addr, i[3].addr,
o[0].addr, o[1].addr, o[2].addr, o[3].addr);
{i[0].addr, i[1].addr, i[2].addr, i[3].addr} = {4'd1, 4'd2, 4'd3, 4'd4};
for (int pi = 0; pi < 7; pi++) begin
perm_index = pi;
#2;
end
#2 $finish;
end
endmodule

Permutation is possible with a genvar. What you are trying to achieve is actually a rotation. You can obtain it with a modulo operation.
It is possible provided that perm_index is a constant value, e.g. a parameter.
It is good practice to use a constant for the array size. Here is a full example:
module top;
parameter logic[1:0] PERM_INDEX=0;
parameter ARRAY_W= 4;
typedef struct{
int addr;
} my_t;
my_t i[ARRAY_W];
my_t o[ARRAY_W];
initial begin;
for(int j=0; j<ARRAY_W; j++) begin
i[j].addr = j+ 10;
$display("i[%d].addr = %d", j, i[j].addr);
end
#10;
for(int j=0; j<ARRAY_W; j++) begin
$display("o[%d].addr = %d", j, o[j].addr);
end
end
genvar g;
generate
for (g=0; g<ARRAY_W; g++)
assign o[g].addr = i[(g+PERM_INDEX)%ARRAY_W].addr;
endgenerate
endmodule
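Note that (g+PERM_INDEX)%ARRAY_W rotates the array, whereas the table in the question corresponds to the index g ^ perm_index (for perm_index 1 it swaps neighbours rather than shifting everything by one). With a constant index the same generate pattern works with XOR. A minimal sketch, assuming 8-bit elements; the module name perm_const and the port shapes are hypothetical:

```systemverilog
// Sketch: constant (elaboration-time) XOR permutation using a generate loop.
// perm_const, the 8-bit element width, and the port names are assumptions.
module perm_const #(
  parameter int PERM_INDEX = 1,
  parameter int ARRAY_W    = 4   // assumed to be a power of two
) (
  input  logic [7:0] i [ARRAY_W],
  output logic [7:0] o [ARRAY_W]
);
  for (genvar g = 0; g < ARRAY_W; g++) begin : gen_perm
    // g and PERM_INDEX are both constants, so the index is legal here
    assign o[g] = i[g ^ PERM_INDEX];
  end
endmodule
```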
------------------ EDIT ------------
To have perm_index change over time, the best approach is to use a barrel shifter.
https://en.wikipedia.org/wiki/Barrel_shifter
Already answered here:
How can i make my verilog shifter more general?

Related

Systemverilog recursion update value for next stage

I am trying to create a recursive logic in Systemverilog but I seem to be missing the right logic to carry the output of one iteration to the next.
Here is an example of the problem:
parameter WIDTH=4;
module test_ckt #(parameter WIDTH = 4)(CK, K, Z);
input CK;
input [WIDTH-1:0] K;
output reg Z;
wire [WIDTH/2-1:0] tt;
wire [WIDTH-1:0] tempin;
assign tempin = K;
genvar i,j;
generate
for (j=$clog2(WIDTH); j>0; j=j-1)
begin: outer
wire [(2**(j-1))-1:0] tt;
for (i=(2**j)-1; i>0; i=i-2)
begin
glitchy_ckt #(.WIDTH(1)) gckt (tempin[i:i], tempin[(i-1):i-1], tt[((i+1)/2)-1]);
end
// How do I save the value for the next iteration?
wire [(2**(j-1))-1:0] tempin;
assign outer[j].tempin = outer[j].tt;
end
endgenerate
always @(posedge CK)
begin
// How do I use the final output here?
Z <= tt[0];
end
endmodule
module glitchy_ckt #(parameter WIDTH = 1)(A1, B1, Z1);
input [WIDTH-1:0] A1,B1;
output Z1;
assign Z1 = ~A1[0] ^ B1[0];
endmodule
Expected topology:
S1 S2
K3--<inv>--|==
|XOR]---<inv>----|
K2---------|== |
|==
<--gckt---> |XOR]
|==
K1--<inv>--|== |
|XOR]------------|
K0---------|== <-----gckt---->
Example input and expected outputs:
Expected output:
A - 1010
----
S1 0 0 <- j=2 and i=3,1.
S2 1 <- j=1 and i=1.
Actual output:
A - 1010
----
S1 0 0 <- j=2 and i=3,1.
S2 0 <- j=1 and i=1. Here, because tempin is not updated, inputs are same as (j=2 & i=1).
Test-bench:
`timescale 1 ps / 1 ps
`include "test_ckt.v"
module mytb;
reg CK;
reg [WIDTH-1:0] A;
wire Z;
test_ckt #(.WIDTH(WIDTH)) dut(.CK(CK), .K(A), .Z(Z));
always #200 CK = ~CK;
integer i;
initial begin
$display($time, "Starting simulation");
#0 CK = 0;
A = 4'b1010;
#500 $finish;
end
initial begin
//dump waveform
$dumpfile("test_ckt.vcd");
$dumpvars(0,dut);
end
endmodule
How do I make sure that tempin and tt get updated as I go from one stage to the next?
Your code does not have any recursion in it. You were trying to solve it using loops, but generate blocks are very limited constructs and, for example, you cannot access parameters defined in other generate iterations (but you can access variables or module instances).
So, the idea is to use a real recursive instantiation of the module. In the following implementation the module rec is the one which is instantiated recursively. It actually builds the hierarchy from your example (I hope correctly).
Since you tagged it as system verilog, I used the system verilog syntax.
module rec#(WIDTH=1) (input logic [WIDTH-1:0]source, output logic result);
if (WIDTH <= 1) begin
always_comb
result = source; // << a single bit is left: it is the result, exit the recursion.
end
else begin:blk
localparam REC_WDT = WIDTH / 2;
logic [REC_WDT-1:0] newSource;
always_comb // << calculation of your expression
for (int i = 0; i < REC_WDT; i++)
newSource[i] = source[i*2] ^ ~source[(i*2)+1];
rec #(REC_WDT) rec(newSource, result); // << recursive instantiation with WIDTH/2
end // else: !if(WIDTH <= 1)
initial $display("%m: W=%0d", WIDTH); // just my testing leftover
endmodule
The module is instantiated first time from the test_ckt:
module test_ckt #(parameter WIDTH = 4)(input logic CK, input logic [WIDTH-1:0] K, output logic Z);
logic result;
rec#(WIDTH) rec(K, result); // first (top-level) instantiation
always_ff @(posedge CK)
Z <= result; // assign the results
endmodule // test_ckt
And your testbench, a bit changed:
module mytb;
reg CK;
reg [WIDTH-1:0] A;
wire Z;
test_ckt #(.WIDTH(WIDTH)) dut(.CK(CK), .K(A), .Z(Z));
always #200 CK = ~CK;
integer i;
initial begin
$display($time, "Starting simulation");
CK = 0;
A = 4'b1010;
#500
A = 4'b1000;
#500 $finish;
end
initial begin
$monitor("Z=%b", Z);
end
endmodule // mytb
Use of $display/$monitor is more convenient than dumping traces for such small examples.
I did not do much testing of what I created, so there could be issues, but you can get basic ideas from it in any case. I assume it should work with any WIDTH which is power of 2.
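Since variables, unlike parameters, can be shared between generate iterations, the "carry the value to the next stage" problem can also be approached without recursion by giving every level its own row in an array. The following is only a sketch under that assumption: the names xnor_tree, stage and LEVELS are hypothetical, WIDTH is assumed to be a power of 2, and the gate expression mirrors glitchy_ckt from the question.

```systemverilog
// Sketch: loop-based reduction tree; stage[l] feeds stage[l+1].
// xnor_tree, stage and LEVELS are illustrative names.
module xnor_tree #(parameter int WIDTH = 4) (
  input  logic [WIDTH-1:0] k,
  output logic             z
);
  localparam int LEVELS = $clog2(WIDTH);
  // stage[0] is the input; only the low (WIDTH >> l) bits of stage[l] are used.
  logic [WIDTH-1:0] stage [0:LEVELS];
  assign stage[0] = k;
  for (genvar l = 0; l < LEVELS; l++) begin : level
    for (genvar n = 0; n < (WIDTH >> (l + 1)); n++) begin : node
      // same function as glitchy_ckt: Z1 = ~A1 ^ B1
      assign stage[l+1][n] = ~stage[l][2*n+1] ^ stage[l][2*n];
    end
  end
  assign z = stage[LEVELS][0]; // combinational result; register it with CK if needed
endmodule
```

For k = 4'b1010 the first level produces 0 0 and the second level produces 1, which agrees with the expected output listed in the question.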

Very new to verilog. Can't assign values in a loop

I'm very new to verilog and not sure what data types to use. I'm trying to iterate over a binary number and xor each bit. I can do this manually, but I can't store it into a different reg.
module iterator(a, b);
input [3:0] a;
input [3:0] b;
integer i = 0;
reg [3:0] c = 4'b0000;
always @ (a or b) begin
$display("p = %b", p);
for(i = 0; i < 4; i=i+1)
c[i] = a[i] ^ b[i];
$display("i = %d, a[i] = %b, b[i] = %b, a[i] ^ b[i] = %b", i, a[i], b[i], c[i]);
end
endmodule
module Testbench;
reg [3:0] a = 4'b1001;
reg [3:0] b = 4'b0110;
iterator it(a, b);
endmodule
Just to point out:
1: your display statement is misaligned. It is not inside the for loop. (This is not python thank god!) Use:
for(i = 0; i < 4; i=i+1)
begin
c[i] = a[i] ^ b[i];
$display("i = %d, a[i] = %b, b[i] = %b, a[i] ^ b[i] = %b", i, a[i], b[i], c[i]);
end
2: there is no variable 'p', which suggests this is not the code you used (syntax error).
3: Your code has no output. You XOR bits but return no result, as you have no output port.
4: initialising c to 4'b0000 is not always synthesizable. It is a bad habit and should be avoided unless it is strictly necessary (which it is not here).
But all that goes away if you use toolic's code.
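As an illustration of the four points above (and not the code referred to in the previous sentence, which is not shown here), a version with an output port, no initialiser and no stray display could look like the sketch below. It keeps the module name from the question and uses SystemVerilog syntax.

```systemverilog
// Sketch: bitwise XOR of two 4-bit inputs, written as a loop.
module iterator (
  input  logic [3:0] a,
  input  logic [3:0] b,
  output logic [3:0] c   // the result is now visible outside the module
);
  always_comb begin
    for (int i = 0; i < 4; i++)
      c[i] = a[i] ^ b[i];
  end
endmodule
```

The loop is equivalent to the single statement c = a ^ b; it is spelled out only to match the original attempt.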
In order for the code inside an always block to be executed, you need to make sure that the values of the variables in sensitivity list are changed. In your example you need a test bench to do so.
In your always block a and b are the signals which trigger execution of the code inside the block, if any of them changes.
always @ (a or b) begin
$display("p = %b", p);
To provide such a change you can do something like the following:
module Testbench;
reg [3:0] a;
reg [3:0] b;
iterator it(a, b);
initial begin
a = 4'b1001;
b = 4'b0110;
#2
a = 4'b0110;
b = 4'b1111;
#2
...
end
endmodule

Bit slicing in verilog

How can I write wdata[((8*j)+7) : (8*i)] = $random; in Verilog, where i and j are reg type variables? ModelSim reports an error that the range must be constant. How could I write it properly?
You should think about the solution from a hardware perspective.
Here is one solution. Hope that it will help you.
module temp(clk);
input clk;
reg i, j;
reg [23:0] register, select;
wire [23:0] temp;
initial
begin
i = 'd1;
j = 'd1;
end
generate
for(genvar i = 0; i<24; i++)
begin
assign temp[i] = select[i] ? $random : register[i];
end
endgenerate
always @ (posedge clk)
begin
register <= temp;
end
always @ *
begin
select = (32'hffff_ffff << ((j<<3)+8)) ^ (32'hffff_ffff << (i<<3));
end
endmodule
Use the array slicing construct (indexed part-select). You can find a more detailed explanation at Array slicing Q&A.
bit [7:0] PA, PB;
int loc;
initial begin
loc = 3;
PA = PB; // Read/Write
PA[7:4] = 'hA; // Read/Write of a slice
PA[loc -:4] = PA[loc+1 +:4]; // Read/Write of a variable slice equivalent to PA[3:0] = PA[7:4];
end
Verilog 2001 Syntax
[M -: N] // negative offset from bit index M, N bit result
[M +: N] // positive offset from bit index M, N bit result
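The width N in these part-selects must still be a constant, so a range whose width depends on both i and j cannot be written as one slice. One possible workaround is to loop over byte lanes i through j and write a fixed 8-bit slice per lane. The sketch below assumes a 32-bit wdata; the package name slice_pkg, the function write_bytes and its fill argument are hypothetical.

```systemverilog
// Sketch: overwrite byte lanes i..j (inclusive) of a 32-bit word.
// slice_pkg, write_bytes and fill are illustrative names.
package slice_pkg;
  function automatic logic [31:0] write_bytes(
      input logic [31:0] wdata,  // current value
      input logic [31:0] fill,   // new data, e.g. $random
      input int          i,      // lowest byte lane to overwrite
      input int          j);     // highest byte lane to overwrite
    for (int k = i; k <= j; k++)
      wdata[8*k +: 8] = fill[8*k +: 8]; // variable base, constant width of 8
    return wdata;
  endfunction
endpackage
```

A call such as wdata = slice_pkg::write_bytes(wdata, $random, i, j); would then stand in for the statement from the question.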

How to design a 64 x 64 bit array multiplier in Verilog?

I know how to design a 4x4 array multiplier, but if I follow the same logic, the coding becomes tedious.
4 x 4 - 16 partial products
64 x 64 - 4096 partial products.
A 4 x 4 multiplier needs 8 full adders and 4 half adders; how many full adders and half adders do I need for 64 x 64 bits? How do I reduce the number of partial products? Is there any simple way to solve this?
Whenever tediously coding a repetitive pattern you should use a generate statement instead:
module array_multiplier(a, b, y);
parameter width = 8;
input [width-1:0] a, b;
output [width-1:0] y;
wire [width*width-1:0] partials;
genvar i;
assign partials[width-1 : 0] = a[0] ? b : 0;
generate for (i = 1; i < width; i = i+1) begin:gen
assign partials[width*(i+1)-1 : width*i] = (a[i] ? b << i : 0) +
partials[width*i-1 : width*(i-1)];
end endgenerate
assign y = partials[width*width-1 : width*(width-1)];
endmodule
I've verified this module using the following test-bench:
http://svn.clifford.at/handicraft/2013/array_multiplier/array_multiplier_tb.v
EDIT:
As @Debian has asked for a pipelined version - here it is. This time using a for loop in an always-region for the array part.
module array_multiplier_pipeline(clk, a, b, y);
parameter width = 8;
input clk;
input [width-1:0] a, b;
output [width-1:0] y;
reg [width-1:0] a_pipeline [0:width-2];
reg [width-1:0] b_pipeline [0:width-2];
reg [width-1:0] partials [0:width-1];
integer i;
always @(posedge clk) begin
a_pipeline[0] <= a;
b_pipeline[0] <= b;
for (i = 1; i < width-1; i = i+1) begin
a_pipeline[i] <= a_pipeline[i-1];
b_pipeline[i] <= b_pipeline[i-1];
end
partials[0] <= a[0] ? b : 0;
for (i = 1; i < width; i = i+1)
partials[i] <= (a_pipeline[i-1][i] ? b_pipeline[i-1] << i : 0) +
partials[i-1];
end
assign y = partials[width-1];
endmodule
Note that with many synthesis tools it's also possible to just add (width) register stages after the non-pipelined adder and let the tool's register balancing pass do the pipelining.
[how to] reduce the number of partial products?
A method that used to be fairly common is modified Booth encoding:
At the cost of more complicated addend selection, it almost halves the number of partial products.
In its simplest form, it considers groups of three adjacent bits (overlapping by one) from one of the operands, say b, and selects 0, a, 2a, -2a or -a as the addend for each group.
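A behavioural sketch of that selection for signed operands might look as follows. It is written as a loop for readability rather than as an adder array, and it assumes an even width W; the module name booth_mult and the signals b_ext and addend are hypothetical.

```systemverilog
// Sketch: radix-4 (modified Booth) multiplication of signed operands.
// booth_mult, b_ext and addend are illustrative names; W is assumed even.
module booth_mult #(parameter int W = 8) (
  input  logic signed [W-1:0]   a,
  input  logic signed [W-1:0]   b,
  output logic signed [2*W-1:0] y
);
  logic        [W:0]     b_ext;   // b with a 0 appended below bit 0
  logic signed [2*W-1:0] addend;
  always_comb begin
    b_ext = {b, 1'b0};
    y     = '0;
    for (int k = 0; k < W/2; k++) begin   // W/2 addends instead of W
      case (b_ext[2*k +: 3])              // {b[2k+1], b[2k], b[2k-1]}
        3'b001, 3'b010: addend =  a;
        3'b011:         addend =  2*a;
        3'b100:         addend = -2*a;
        3'b101, 3'b110: addend = -a;
        default:        addend = '0;      // 000 and 111 select 0
      endcase
      y += addend <<< (2*k);              // each group has weight 4**k
    end
  end
endmodule
```

Each group of three bits encodes the digit -2*b[2k+1] + b[2k] + b[2k-1], so the sum over all groups reproduces the signed value of b.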
The code below generates only half of the expected output.
module arr_multi(a, b, y);
parameter w = 8;
input [w-1:0] a, b; // w: operand width
output [(2*w)-1:0] y; // product, 2*w bits
wire [(2*w*w)-1:0] p; // p: partials; width is input bits multiplied by output bits
genvar i;
assign p[(2*w)-1 : 0] = a[0] ? b : 0; //first output size bits
generate
for (i = 1; i < w; i = i+1)
begin
assign p[(w*(4+(2*(i-1))))-1 : (w*2)*i] = (a[i] ? b<<i : 0) +
p[(w*(4+(2*(i-2))))-1 : (w*2)*(i-1)];
end
endgenerate
assign y=p[(2*w*w)-1:(2*w)*(w-1)]; //taking last output size bits
endmodule

Generate If Statements in Verilog

I'm trying to create a synthesizable, parametrized priority encoder in Verilog. Specifically, I want to find the least significant 1 in a vector and return a vector containing just that 1. For example:
IN[3:0] | OUT[4:0]
--------+---------
1010 | 00010
1111 | 00001
0100 | 00100
0000 | 10000 (special case)
So if the vectors are four bits wide, the code is:
if (in[0]==1'b1) least_one = 1;
else if (in[1]==1'b1) least_one = 2;
else if (in[2]==1'b1) least_one = 4;
else if (in[3]==1'b1) least_one = 8;
else least_one = 16; // special case in==0, set carry bit
I need a general, scalable way to do this because the input/output vector length is parametrized. My current code is:
module least_one_onehot
#(parameter ADDR_WIDTH=4)
(output reg [ADDR_WIDTH:0] least_one,
input [ADDR_WIDTH-1:0] in);
genvar i;
always @(in) begin
if (in[0]==1'b1) least_one = 1;
generate for (i=1; i<ADDR_WIDTH; i=i+1) begin : U
else if (in[i]==1'b1) least_one = 2**i;
end
endgenerate
else least_one = 2**ADDR_WIDTH;
end
endmodule
When I try to compile this, I receive the following errors:
file: least_one_onehot.v
generate for (i=1; i<ADDR_WIDTH; i=i+1) begin : U
|
ncvlog: *E,GIWSCP (least_one_onehot.v,10|8): Generated instantiation can only be valid within a module scope [12.1.3(IEEE 2001)].
else if (in[i]==1'b1) least_one = 2**i;
|
ncvlog: *E,NOTSTT (least_one_onehot.v,11|6): expecting a statement [9(IEEE)].
endgenerate
|
ncvlog: *E,GIWSCP (least_one_onehot.v,13|12): Generated instantiation can only be valid within a module scope [12.1.3(IEEE 2001)].
else least_one = 2**ADDR_WIDTH;
|
ncvlog: *E,NOTSTT (least_one_onehot.v,14|5): expecting a statement [9(IEEE)]
I've tried various arrangements of the generate, if, and always statements, all without success. Anyone know the proper syntax for this? Case-statement implementation or other alternatives would also be fine. Thanks.
I think you misunderstand how generate works. It isn't a text pre-processor that emits the code between the generate/endgenerate pair with appropriate substitutions. You have to have complete syntactic entities within the pair. I don't have access to a simulator right this minute, but this might do the trick for you (totally untested):
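// least_one must be declared as a wire (not reg) for these continuous assignments
genvar i;
generate
for (i = 1; i < ADDR_WIDTH; i = i + 1) begin : U
assign least_one[i] = in[i] & ~|in[i - 1:0];
end
endgenerate
assign least_one[0] = in[0];
assign least_one[ADDR_WIDTH] = ~|in;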
Ordinarily Verilog would complain about the non-constant bit slice width but since it's within a generate loop it might work.
Failing something like the above you just test for the first set bit in a for-loop and then decode that result.
You do not need a generate block. You could use:
integer i;
reg found;
always @(in) begin
least_one = {(ADDR_WIDTH+1){1'b0}};
found = 1'b0;
for (i=0; i<ADDR_WIDTH; i=i+1) begin
if (in[i]==1'b1 && found==1'b0) begin
least_one[i] = 1'b1;
found = 1'b1;
end
end
least_one[ADDR_WIDTH] = (found==1'b0);
end
If you really want to use a generate block, then you need to assign each bit.
assign least_one[0] = in[0];
assign least_one[ADDR_WIDTH] = (in == {ADDR_WIDTH{1'b0}});
genvar i;
generate
for (i=1; i<ADDR_WIDTH; i=i+1) begin : U
assign least_one[i] = in[i] && (in[i - 1:0] == {i{1'b0}});
end
endgenerate
This simulates the way you want it to, but it is not synthesizable (you didn't specify if that was a requirement):
module least_one_onehot #(parameter ADDR_WIDTH=4) (
output reg [ADDR_WIDTH-1:0] least_one,
input [ADDR_WIDTH-1:0] in
);
always @* begin
least_one = '0;
for (int i=ADDR_WIDTH-1; i>=0; i--) begin
if (in[i]) least_one = 2**i;
end
end
endmodule
Note that it uses SystemVerilog constructs.
Personally, I like the following block of code for what you need:
assign out = {1'b1,in} & ((~{1'b1,in})+1);
You could try this (dropping the extra high bit for legibility), but I like to explicitly do the two's complement to avoid any potential compatibility problems.
assign out = in & (-1*in);
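The expression works because ANDing a value with its two's complement keeps only the lowest set bit, and the prepended 1 supplies the special-case bit when in is all zeros. Wrapped into a parameterised module with the port names from the question, a minimal sketch could be as follows; the internal signal ext is a hypothetical name.

```systemverilog
// Sketch: isolate the least significant 1 of {1'b1, in} as a one-hot vector.
// ext is an illustrative helper signal.
module least_one_onehot #(parameter int ADDR_WIDTH = 4) (
  input  logic [ADDR_WIDTH-1:0] in,
  output logic [ADDR_WIDTH:0]   least_one
);
  logic [ADDR_WIDTH:0] ext;
  assign ext       = {1'b1, in};            // guard bit covers in == 0
  assign least_one = ext & (~ext + 1'b1);   // x & (-x) keeps the lowest set bit
endmodule
```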
