Double square braces - verilog

I'm trying to decipher the following line of code in Verilog:
assign ASIC_error_flag = (StartTransfer & ~Bank_Slct[IO_Config_P2[13:12]]);
I suspect it might be a compare between the negated bus "Bank_Slct" and bits 13 through 12 of the bus IO_Config_P2, but I've never seen a bus inside of a bus like that before. What is this supposed to equate to?

The inner square brackets are used to select a portion of the IO_Config_P2 signal, and the outer brackets are in turn used to select a portion of the Bank_Slct signal.
Let's assume you declared Bank_Slct like a memory of 4 bytes:
reg [7:0] Bank_Slct [0:3];
In this case, you need a 2-bit signal to select one of the 4 bytes (like a memory address). The expression, IO_Config_P2[13:12], is the 2-bit select signal.
When IO_Config_P2[13:12] is equal to 2'b00, you are selecting byte Bank_Slct[0].
When IO_Config_P2[13:12] is equal to 2'b01, you are selecting byte Bank_Slct[1], etc.
An alternate approach would have been to create a separate signal (sel), then use that:
wire [1:0] sel = IO_Config_P2[13:12];
assign ASIC_error_flag = (StartTransfer & ~Bank_Slct[sel]);
Refer to IEEE Std 1800-2017, section 7.4.4 Memories.
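The selection described above can be tried out in a small testbench. This is only a sketch under the same assumption as the answer (Bank_Slct declared as a four-entry memory of bytes); the module name, the stored values, and the 16-bit width of IO_Config_P2 are made up for illustration.

```systemverilog
module bank_select_tb;
  reg [7:0]  Bank_Slct [0:3];   // assumed declaration: four 8-bit entries
  reg [15:0] IO_Config_P2;      // assumed width; only bits 13:12 are used

  initial begin
    // Give each entry a recognisable value.
    Bank_Slct[0] = 8'hA0;
    Bank_Slct[1] = 8'hA1;
    Bank_Slct[2] = 8'hA2;
    Bank_Slct[3] = 8'hA3;

    IO_Config_P2 = 16'h0000;    // bits 13:12 = 2'b00, selects entry 0
    if (Bank_Slct[IO_Config_P2[13:12]] !== 8'hA0)
      $display("FAIL: expected entry 0");

    IO_Config_P2 = 16'h2000;    // bits 13:12 = 2'b10, selects entry 2
    if (Bank_Slct[IO_Config_P2[13:12]] !== 8'hA2)
      $display("FAIL: expected entry 2");

    $display("bank_select_tb finished");
  end
endmodule
```

If Bank_Slct is instead declared as a plain vector, for example reg [3:0] Bank_Slct, the same syntax is a bit-select and picks a single bit rather than a byte.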

Related

Delay associated with xor of 1023 10 bit vectors in Verilog

I am somewhat new to Verilog and have a question that is confusing me.
I have a large number of constant parameters, 1023 of them (c0, c1, c2, ..., c1022), each 10 bits wide. I also have a vector r[1022:0], which is 1023 bits wide. My task is to compute ci*r[i] for i from 0 to 1022 and then take the XOR of the resulting 1023 10-bit vectors. When I do this in simulation, Verilog generates the output at time 0 for the assign statement. How can Verilog generate the output at time 0? Will there be no delay associated with these 1023 XORs?
Also, is there a succinct, synthesizable way to write this, or do I need to manually write out c0*r[0] ^ c1*r[1] ^ ... ^ c1022*r[1022]?
A Verilog simulator will execute whatever legal syntax you give it—the tool knows nothing about what the implementation eventually looks like. It's up to you to feed timing constraints to the synthesis tool and it tells you if it can fit the logic to meet the constraints (or you might have to run another tool to see if it meets timing constraints).
Since you named your parameters c0, c1, c2, ..., you might as well have named them czero, cone, ctwo, ...; separately named parameters cannot be indexed, which leaves you no options for shortcuts.
If your tool supports SystemVerilog, you can write your parameter as an array and then use the array xor reduction method:
parameter [9:0] C[1023] = {10'h123, 10'h234, ...};
assign out = C.xor() with (item*r[item.index]);
If your synthesis tool does not support this SystemVerilog syntax, you can pack the parameter values into a single vector and use an indexed part-select in Verilog.
parameter [10230-1:0] C = {10'h123, 10'h234, ...}; // 1023 constants x 10 bits = 10230 bits
function [9:0] xor_reduction (input [1022:0] r);
integer I;
begin
xor_reduction = 0;
for(I=0;I<1023;I=I+1)
xor_reduction = xor_reduction ^ (r[1022-I]*C[I*10 +: 10]); // slice I occupies bits [I*10+9 : I*10]
end
endfunction
assign out = xor_reduction(r);
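The packed-vector approach is easier to check at a small scale. The sketch below assumes only four constants instead of 1023; the module name, the constant values, and the localparam N are illustrative and not from the original post. It keeps the same pairing as the function above: slice i is counted from the least significant end of C, so it is matched with r[N-1-i].

```systemverilog
module xor_reduce_tb;
  localparam N = 4;  // scaled down from 1023 for illustration

  // Packed constants: the leftmost literal lands in the most significant bits.
  localparam [N*10-1:0] C = {10'h123, 10'h234, 10'h345, 10'h056};

  function [9:0] xor_reduction (input [N-1:0] r);
    integer i;
    begin
      xor_reduction = 10'd0;
      for (i = 0; i < N; i = i + 1)
        // A 1-bit r multiplied by a 10-bit constant acts as a mask.
        xor_reduction = xor_reduction ^ (r[N-1-i] * C[i*10 +: 10]);
    end
  endfunction

  initial begin
    // r[0] pairs with the first literal in the concatenation.
    if (xor_reduction(4'b0001) !== 10'h123)
      $display("FAIL: r[0] should select 10'h123");
    // r[N-1] pairs with the last literal.
    if (xor_reduction(4'b1000) !== 10'h056)
      $display("FAIL: r[3] should select 10'h056");
    // Two bits set: XOR of the two selected constants.
    if (xor_reduction(4'b0011) !== (10'h123 ^ 10'h234))
      $display("FAIL: r[1:0] should XOR the first two constants");
    $display("xor_reduce_tb finished");
  end
endmodule
```

In simulation the function returns in zero time, as the answer explains; real propagation delay only appears after synthesis and timing analysis.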

unpacked union in systemverilog

typedef union {
logic [1:0] c3;
bit [3:0] a3;
byte b3;
} pack3;
pack3 p3;
According to the LRM, default initialization follows the first member of the union, i.e. the logic member in the example above. I therefore expected c3 to be initialized to X and the rest to 0, but when I compile in ModelSim and check the Objects window, a3 and b3 show a different result. Also, when I assign p3.a3 = 4'b0010; the values of a3 and b3 change but c3 does not. Please explain. As I understand it, all members share a single storage location, so an update to any member should be reflected in all of them.
There are no guarantees if you write to one member of an unpacked union and try to read another member (except for one special provision mentioned at the end of section 7.3 Unions in the 1800-2012 LRM). You need to use a packed union if you want a guarantee in the layout of overlapping members.
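As a rough illustration of the packed alternative, the sketch below uses a hypothetical type pack8_t, not the pack3 type from the question. An untagged packed union requires all members to have the same width, so the 2-bit, 4-bit, and 8-bit members of the original could not be packed as they stand; here both members are 8 bits wide.

```systemverilog
module packed_union_tb;
  // Both members are 8 bits wide, as an untagged packed union requires.
  typedef union packed {
    logic [7:0]      whole;   // view as one byte
    logic [1:0][3:0] nibble;  // view as two 4-bit nibbles
  } pack8_t;

  pack8_t p;

  initial begin
    // Write through one member, read through the other.
    p.whole = 8'hA5;
    if (p.nibble[1] !== 4'hA || p.nibble[0] !== 4'h5)
      $error("nibble view does not match the byte that was written");

    // Writing the low nibble changes the low four bits of the byte view.
    p.nibble[0] = 4'hF;
    if (p.whole !== 8'hAF)
      $error("byte view does not reflect the nibble write");

    $display("packed_union_tb finished");
  end
endmodule
```

Because the layout of a packed union is defined bit for bit, the cross-member reads here are guaranteed, unlike the unpacked case in the question.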

Eliminating unused bits: creating synthesisable multidimensional arrays with different dimensions

This is a follow-on question from How can I iteratively create buses of parameterized size to connect modules also iteratively created?. The answer is too complex to fit in a comment, and the solution may be helpful to other SO users. This question follows the self-answer format. Additional answers are encouraged.
The following code works and uses a two-dimensional array.
module Multiplier #(parameter M = 4, parameter N = 4)(
input [M-1:0] A, //Input A, size M
input [N-1:0] B, //Input B, size N
output [M+N-1:0] P ); //Output P (product), size M+N
wire [M+N-1:0] PP [N-1:0]; // Partial Product array
assign PP[0] = { {N{1'b0}} , { A & {M{B[0]}} } }; // Pad upper bits with 0s
assign P = PP[N-1]; // Product
genvar i;
generate
for (i=1; i < N; i=i+1)
begin: addPartialProduct
wire [M+i-1:0] gA,gB,gS; wire Cout;
assign gA = { A & {M{B[i]}} , {i{1'b0}} };
assign gB = PP[i-1][M+i-1:0];
assign PP[i] = { {(N-i){1'b0}}, Cout, gS}; // Pad upper bits with 0s
RippleCarryAdder#(M+i) adder( .A(gA), .B(gB), .S(gS), .Cin(1'b0), .* );
end
endgenerate
endmodule
Some of the bits are never used, such as PP[0][M+N-1:M+1]. A synthesizer will usually remove these bits during optimization and possibly give a warning. Some synthesizers are not advanced enough to do this correctly. To work around this, the designer must implement extra logic. In this example the parameter for all the RippleCarryAdder instances would be set to M+N. The extra logic wastes area and potentially degrades performance.
How can the unused bits be safely eliminated? Can multidimensional arrays with different dimensions be used? Will the end code be readable and debug-able?
Can multidimensional arrays with different dimensions be used?
Short answer, NO.
Verilog does not support multidimensional arrays whose elements have different sizes. SystemVerilog does support dynamic arrays; however, these cannot be connected to module ports and cannot be synthesized.
Embedded code (such as Perl's EP3, Ruby's eRuby/ruby_it, Python's prepro, etc.) can generate custom-dimensioned arrays and code iterations, but the parameters must be hard-coded before compile. The final value of any parameter of a given instance is discovered during compile time, well after the embedded script is run. The parameter must be treated as a global constant, so Multiplier#(4,4) and Multiplier#(8,8) cannot exist in the same project unless you teach the script how to extract the full hierarchy and parameters of the project. (Good luck coding and maintaining that.)
How can the unused bits be safely eliminated?
If the synthesizer is not advanced enough to exclude unused bits on its own, then the bits can be optimized by flattening the multidimensional array into a one-dimensional array with intelligent part-selects. The trick is finding the equation, which can be achieved by following these steps:
Find the pattern of the lsb index for each part-select:
Assume M is 4, the lsb for each part-select are 0, 5, 11, 18, 26, 35, .... Plug this pattern into WolframAlpha to find the equation a(n) = (n-1)*(n+8)/2.
Repeat with M equal to 3 for the pattern 0, 4, 9, 15, ... to get equation a(n)=(n-1)*(n+6)/2
Repeat again with M equal to 5 for the pattern 0, 6, 13, 21, 30, ... to get equation a(n)=(n-1)*(n+10)/2.
Since the relation of M and N is linear (i.e. a multiple; no exponential, logarithmic, etc.), only two equations are needed to derive an equation with M as a variable. For non-linear relations, more data-point equations are recommended. In this case, note that M=3,4,5 gives the factors (n+6), (n+8), (n+10), i.e. (n+2*M), so the generic equation is: lsb(n)=(n-1)*(n+2*M)/2
Find the pattern of the msb index for each part-select:
Use the same process as for finding the lsb (it ends up being msb(n)=(n**2+(M*2+1)*n-2)/2). Or define the msb in terms of the lsb: msb(n)=lsb(n+1)-1
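The two equations can be sanity-checked in simulation before they are turned into macros. The following is a minimal sketch; the function names pp_lsb and pp_msb and the testbench itself are illustrative, not part of the original answer. It checks the M=4 pattern quoted in the steps above and confirms that each slice is M+n bits wide.

```systemverilog
module pp_index_tb;
  // lsb(n) = (n-1)*(n+2*M)/2
  function integer pp_lsb (input integer n, input integer m);
    pp_lsb = ((n - 1) * (n + 2*m)) / 2;
  endfunction

  // msb(n) = lsb(n+1) - 1
  function integer pp_msb (input integer n, input integer m);
    pp_msb = pp_lsb(n + 1, m) - 1;
  endfunction

  integer n;

  initial begin
    // Spot checks against the M=4 sequence 0, 5, 11, 18, 26, 35.
    if (pp_lsb(1, 4) !== 0)  $error("lsb(1) mismatch");
    if (pp_lsb(2, 4) !== 5)  $error("lsb(2) mismatch");
    if (pp_lsb(6, 4) !== 35) $error("lsb(6) mismatch");

    // Each part-select must be exactly M+n bits wide.
    for (n = 1; n <= 6; n = n + 1) begin
      if (pp_msb(n, 4) - pp_lsb(n, 4) + 1 !== 4 + n)
        $error("slice %0d has the wrong width", n);
      $display("n=%0d lsb=%0d msb=%0d", n, pp_lsb(n, 4), pp_msb(n, 4));
    end
  end
endmodule
```

The width check works because msb(n)-lsb(n)+1 equals lsb(n+1)-lsb(n), which simplifies to M+n.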
IEEE std 1364-2001 (Verilog 2001) introduced macros with arguments and indexed part-select; see § 19.3.1 '`define' and § 4.2.1 'Vector bit-select and part-select addressing' respectively. Or see IEEE std 1800-2012 § 22.5.1 '`define' and § 11.5.1 'Vector bit-select and part-select addressing' respectively. This answer assumes that these features are supported by the SO user's simulator and synthesizer, since the generate keyword was also introduced in IEEE std 1364-2001; see § 12.1.3 'Generated instantiation' (and IEEE std 1800-2012 § 27 'Generate constructs'). For tools that do not fully support IEEE std 1364-2001, see the `ifdef examples provided here.
Since the functions to calculate the part-select ranges are used frequently, use `define macros with arguments. This helps prevent copy/paste bugs. The extra sets of () in the macro definitions ensure the proper order of operations. It is also a good idea to `undef the macros at the end of the module definition, which keeps the global space from getting polluted. A flattened array can be challenging to debug. Defining pass-through connections within the generate block's for-loop keeps the signals readable and lets them be probed in a waveform viewer.
module Multiplier #(parameter M = 4, parameter N = 4)(
input [M-1:0] A, //Input A, size M
input [N-1:0] B, //Input B, size N
output [M+N-1:0] P ); //Output P (product), size M+N
// global space macros
`define calc_pp_lsb(n) (((n)-1)*((n)+2*M)/2)
`define calc_pp_msb(n) (`calc_pp_lsb(n+1)-1)
`define calc_pp_range(n) `calc_pp_lsb(n) +: (M+n)
wire [`calc_pp_msb(N):0] PP; // Partial Product
assign PP[`calc_pp_range(1)] = { 1'b0 , { A & {M{B[0]}} } };
assign P = PP[`calc_pp_range(N)]; // Product
genvar i;
generate
for (i=1; i < N; i=i+1)
begin: addPartialProduct
wire [M+i-1:0] gA,gB,gS; wire Cout;
assign gA = PP[`calc_pp_range(i)];
assign gB = { A & {M{B[i]}} , {i{1'b0}} };
assign PP[`calc_pp_range(i+1)] = {Cout,gS};
RippleCarryAdder#(M+i) adder( .A(gA), .B(gB), .S(gS), .Cin (1'b0), .* );
end
endgenerate
// Cleanup global space
`undef calc_pp_range
`undef calc_pp_msb
`undef calc_pp_lsb
endmodule
Working example with side-by-side and test bench: http://www.edaplayground.com/s/6/591
Will the end code be readable and debug-able?
Yes, for anyone who has already learned how to properly use the generate construct. The generate block's for-loop defines local wires which are confined to the scope of the loop index. gA from loop-0 and gA from loop-1 are unique signals and cannot interact with each other. The local signals can be probed in a waveform viewer, which is great for debugging.

How to define and initialize a vector containing only ones in Verilog?

If I want to declare a 128 bit vector of all ones, which one of these methods is always correct?
wire [127:0] mywire;
assign mywire = 128'b1;
assign mywire = {128{1'b1}};
assign mywire = 128'hFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF;
As a quick simulation would prove, assign mywire = 128'b1; does not assign all bits of mywire to 1. Only bit 0 is assigned 1.
Both of the following always assign all 128 bits to 1:
assign mywire = {128{1'b1}};
assign mywire = 128'hFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF;
One advantage of the 1st line is that it is more easily scalable to widths greater than and less than 128.
With SystemVerilog, the following syntax also always assigns all 128 bits to 1:
assign mywire = '1;
I would use the following statement instead:
assign mywire = ~0;
In a simple expression like this, the width of the left-hand side of the assignment sets the width of the expression on the right-hand side. So 0, which is an unsized constant (at least 32 bits wide), is first extended to the full 128 bits of mywire, then all the bits are flipped and the resulting all-ones vector is assigned.
I prefer this version because it does not require you to specify the width of mywire anywhere in the assignment.
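The variants discussed in this thread can be compared side by side. This is a small self-checking sketch; the module and wire names are made up for illustration, and the '1 line needs a SystemVerilog-capable tool.

```systemverilog
module all_ones_tb;
  wire [127:0] w_one  = 128'b1;       // sized literal: only bit 0 is set
  wire [127:0] w_rep  = {128{1'b1}};  // replication: all 128 bits set
  wire [127:0] w_inv  = ~0;           // 0 is widened to 128 bits, then inverted
  wire [127:0] w_fill = '1;           // SystemVerilog fill literal

  initial begin
    #1;  // let the continuous assignments settle
    if (w_one !== 128'd1)     $error("128'b1 should set only bit 0");
    if ((&w_rep)  !== 1'b1)   $error("replication should set every bit");
    if ((&w_inv)  !== 1'b1)   $error("~0 should set every bit");
    if ((&w_fill) !== 1'b1)   $error("'1 should set every bit");
    $display("all_ones_tb finished");
  end
endmodule
```

The reduction AND (&) returns 1 only when every bit of the vector is 1, which makes it a convenient all-ones check that does not depend on the width.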

Using parameter for continuous assignment in verilog?

Can you use a parameter value for assignment in verilog? Can I somehow define the width of a parameter variable?
Ex:
module mymodule #(parameter type =2)
(...
output [(3+type)-1:0] out);
wire [2:0] rate;
...
assign out = {rate, {1'b0{type}} };
endmodule
Let's say type=2. Then I want out to be 5 bits wide. rate is still 3 bits wide (say it is 3'b100), so when I assign out I want it to be 100 00.
Similarly, if type=6, I want out to be 9 bits wide. rate is still 3 bits wide (again, say it is 3'b100), so when I assign out I want it to be 100 000000.
I don't get any syntax errors but when I try to simulate it I get:
"error: Concatenation operand "type" has indefinite width"
How would you guys approach a design problem like this one?
You have the replication operator backward. It should be
{type{1'b0}}, not {1'b0{type}}
I'm surprised you don't see any syntax error from that.
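A complete version with the corrected replication might look like the sketch below. Several details are assumptions made for illustration: the parameter is renamed from type to SHIFT because type is a reserved word in SystemVerilog, rate is turned into an input so the testbench can drive it, and the module names are invented.

```systemverilog
module shifter #(parameter SHIFT = 2) (
  input  [2:0]             rate,
  output [(3+SHIFT)-1:0]   out
);
  // Replication count comes first: {count{value}}.
  assign out = {rate, {SHIFT{1'b0}}};
endmodule

module shifter_tb;
  reg  [2:0] rate;
  wire [4:0] out2;  // 3 + 2 bits
  wire [8:0] out6;  // 3 + 6 bits

  shifter #(.SHIFT(2)) u2 (.rate(rate), .out(out2));
  shifter #(.SHIFT(6)) u6 (.rate(rate), .out(out6));

  initial begin
    rate = 3'b100;
    #1;  // let the continuous assignments settle
    if (out2 !== 5'b100_00)     $error("SHIFT=2 result is wrong");
    if (out6 !== 9'b100_000000) $error("SHIFT=6 result is wrong");
    $display("shifter_tb finished");
  end
endmodule
```

The replication count must be a constant expression, which a parameter satisfies, so the same module serves any width chosen at instantiation.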

Resources