Same random number sequence generated by Verilog-A code when running the code consecutively - verilog

Recently I had to use Verilog-A to generate a set of random numbers (sigmaX, sigmaY, sigmaZ). The components are drawn from Gaussian distributions with mean=0 and std=1 and then normalized so that sigmaX^2+sigmaY^2+sigmaZ^2=1. The following code in the test_solver.va file is written in Verilog-A to generate such a random number set at each time step:
`include "disciplines.h"
`include "constants.h"
module test_va(p,n,mb,mc,md,me,mf,mg);
inout p,n;
output mb,mc,md,me,mf,mg;
electrical p,n,mb,mc,md,me,mf,mg;
real randomX,randomY,randomZ; // Gaussian random variables with mean = 0, stdev = 1
real sigmaX,sigmaY,sigmaZ; // Normalized thermal noise vector components
integer seedX,seedY,seedZ; // Seed variables for RNG
integer random_seed;
//------------------------------------------------------------------//
// Define mag(x, y, z)
//------------------------------------------------------------------//
analog function real mag;
input x, y, z;
real x, y, z;
begin
mag = sqrt(pow(x,2)+pow(y,2)+pow(z,2));
end
endfunction
analog begin
random_seed = 1;
seedX = $random+random_seed;
seedY = $random+random_seed;
seedZ = $random+random_seed;
randomX = $rdist_normal(seedX, 0.0, 1.0);
randomY = $rdist_normal(seedY, 0.0, 1.0);
randomZ = $rdist_normal(seedZ, 0.0, 1.0);
sigmaX = randomX/mag(randomX, randomY, randomZ);
sigmaY = randomY/mag(randomX, randomY, randomZ);
sigmaZ = randomZ/mag(randomX, randomY, randomZ);
V(mb) <+ randomX;
V(mc) <+ randomY;
V(md) <+ randomZ;
V(me) <+ sigmaX;
V(mf) <+ sigmaY;
V(mg) <+ sigmaZ;
end
endmodule
I used HSPICE 2019 to test the random number output at each simulation step by running the following test_solver.sp file:
Title Simple
.option post=1
.option probe=0
*.option runlvl=4
.option ingold=2
*.option accurate=1
*.option method=bdf
*.option bdfrtol=1e-5
*.option bdfatol=1e-5
.option numdgt=4
.option brief
.option measfile=1
.option lis_new=1
.option vaopts=str('-G')
.save
.hdl ./test_solver.va
vin 1 0 PULSE(0 0.5 2NS 1NS 1NS 10NS 20NS)
X 1 0 2 3 4 5 6 7 test_va
.tran 0.01n 20.0n 1E-10 uic
.print tran V(1) V(2) V(3) V(4) V(5) V(6) V(7)
.end
However, I noticed that it always generates an identical random number set (sigmaX, sigmaY, sigmaZ) when I run it in HSPICE consecutively. My requirement is to get a different random number set each time the same code is run.
I also noticed that if I change random_seed=1 in the test_solver.va file to, for example, random_seed=2 (or 3 or 4 ...) and run it in HSPICE, it generates a different random number set than before. But it still generates the same set when the same code is run consecutively.
So I wonder whether there is anything wrong with my test_solver.va code, or whether we have to change "random_seed=1" before every run. The latter would be hard to manage once I integrate this code into other models and run it many times.

First of all, pseudo-random number generators are deterministic. That means if you start with the same seed you will always get the same result.
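The determinism is easy to see outside the simulator. The following is a minimal Python sketch, not Verilog-A: the helper name first_draws is made up for illustration, and Python's generator is not the one HSPICE uses, but the seed-to-sequence behavior is the same in kind.

```python
import random

def first_draws(seed, n=3):
    """Return the first n Gaussian draws from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

# Two "runs" with the same seed reproduce the same sequence.
print(first_draws(1) == first_draws(1))
# Changing the seed changes the sequence.
print(first_draws(1) == first_draws(2))
```

This mirrors what the question observes: a fixed random_seed gives the same set on every run, and only a different seed gives a different set.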
I'm not aware of any way to do what you want directly in Verilog-A. I think that you will need to write your own function in 'C'. One technique that is often used is to call a high resolution timer and assume that the time in micro- or nanoseconds is essentially random. Alternatively you can call a function like getrandom().
The next problem is getting the 'C' random value back to your Verilog-A. I'm not familiar with HSPICE, but this can be done with Verilog PLI on some other simulators.
Alternatively, you could wrap your simulation in a shell script and do something like this:
script: read /dev/urandom and write a random number to a file
run hspice
in your Verilog-A, use a system task like $fread to read the file that the script produced
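The first step of that wrapper could look roughly like the sketch below. It is written in Python rather than shell, the function name write_seed_file and the file name seed.txt are assumptions for illustration, and os.urandom stands in for reading /dev/urandom directly. Launching HSPICE and reading the file back from Verilog-A are left out because they depend on the installation.

```python
import os

def write_seed_file(path):
    """Draw a non-negative 31-bit seed from the OS entropy source and store it as text."""
    seed = int.from_bytes(os.urandom(4), "big") & 0x7FFFFFFF
    with open(path, "w") as f:
        f.write(f"{seed}\n")
    return seed

if __name__ == "__main__":
    # Hypothetical file name; the Verilog-A side would have to read the same file.
    print(write_seed_file("seed.txt"))
```

Running this before every simulation gives each run its own seed, so random_seed no longer has to be edited by hand.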

Related

Error due to delay operator in algorithm section

I have implemented a delay operator in the algorithm section of a class, as shown in the test case below, but when I run the code in OpenModelica I get an error. How can I fix the problem?
model test3
Real x=sin(377*time);
Real z;
parameter Real tau[:]={0.01,0.02};
equation
algorithm
for k in 1: 2 loop
z:=delay(x,tau[k]);
end for;
end test3;
Looks like a tool issue. On the other hand, why would you calculate z for k=1 and never use it?
As tbeu said in his answer, it's an issue in OpenModelica. In Dymola your example simulates as expected. So please report the issue here.
Investigating a bit I realized that the following combination prevents your model from translating:
usage of delay
in an algorithm section
inside a for loop
Hence, you have to get rid of at least one of those.
Workarounds
If you know the size of tau[:] in advance, use separate lines for the computation of z instead of the for loop:
model sep_lines
Real x = sin(8 * time);
Real z[2];
parameter Real tau[2] = {0.02, 0.01};
equation
algorithm
z[1] := delay(x, tau[1]);
z[2] := delay(x, tau[2]);
end sep_lines;
Maybe you can use an equation section instead of the algorithm section. Then you don't need the for loop at all, since function calls (in this case the delay) are vectorized automatically if needed. Your code will reduce to:
model eqs
Real x = sin(8 * time);
Real z[3];
parameter Real tau[3] = {0.03, 0.02, 0.01};
equation
z = delay(x, tau);
end eqs;
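As a cross-check on what the vectorized equation should produce, here is a small Python sketch of the expected values. It assumes the usual Modelica semantics of delay (the expression evaluated at time - delayTime, held at its start value before that), and the names delay, x and t_start are illustrative only.

```python
import math

def delay(signal, t, tau, t_start=0.0):
    """Value of `signal` delayed by tau; before t_start + tau the start value is held."""
    return signal(max(t - tau, t_start))

def x(t):
    return math.sin(8 * t)

tau = [0.03, 0.02, 0.01]
t = 0.5
# One delayed copy of x per entry of tau, like z = delay(x, tau) in the model.
z = [delay(x, t, tk) for tk in tau]
print(z)
```

Each z[k] is just x shifted by tau[k], so at any frequency the amplitude of z should match that of x.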
Extra issue
If the for loop is replaced with a while loop, the model translates and simulates. But the delayed signal z[1] is not correct, and the error increases with the frequency of x. At low frequencies like 1 Hz it's barely visible, but at a frequency of e.g. 20 Hz the amplitude is considerably wrong. So don't get fooled by this solution.
model while_ "Compiles, but the amplitude of z[1] is wrong"
Real x = sin(2*Modelica.Constants.pi * 20 * time);
Real z[size(tau, 1)];
parameter Real tau[:] = {0.02, 0.01};
protected
Integer i;
algorithm
i := 0;
while i < size(tau, 1) loop
i := i + 1;
z[i] := delay(x, tau[i]);
end while;
end while_;
This is a bug in OpenModelica. A ticket for it was created: https://trac.openmodelica.org/OpenModelica/ticket/5572

Delay associated with xor of 1023 10 bit vectors in Verilog

I am somewhat new to Verilog and I have a question that is confusing me.
I have a number of constant parameters, specifically 1023 of them (c0, c1, c2, ..., c1022), each one 10 bits wide. I also have a vector r[1022:0], which is 1023 bits wide. My task is to compute ci*r[i] for i from 0 to 1022 and finally take the xor of the 1023 10-bit vectors that I get. When I do this in simulation, Verilog generates the output at time 0 for the assign statement. How can Verilog generate the output at time 0? Is there no delay associated with these 1023 xors?
Also, if I need to do this succinctly, is there a synthesizable short form that I can use, or do I need to manually write c0*r[0] ^ c1*r[1] ^ ... ^ c1022*r[1022]?
A Verilog simulator will execute whatever legal syntax you give it—the tool knows nothing about what the implementation eventually looks like. It's up to you to feed timing constraints to the synthesis tool and it tells you if it can fit the logic to meet the constraints (or you might have to run another tool to see if it meets timing constraints).
Since you named your parameters c0, c1, c2, ..., you might as well have named them czero, cone, ctwo, ..., which gives you no options for shortcuts.
If your tool supports SystemVerilog, you can write your parameter as an array and then use the array xor reduction method:
parameter [9:0] C[1023] = {10'h123, 10'h234, ...};
assign out = C.xor() with (item*r[item.index]);
If your synthesis tool does not support this SystemVerilog syntax, you can pack the parameter values into a single vector and use an indexed part select in Verilog:
parameter [10230-1:0] C = {10'h123, 10'h234, ...}; // 1023 values x 10 bits
function [9:0] xor_reduction (input [1022:0] r);
integer I;
begin
xor_reduction = 0;
// c0 is leftmost in the concatenation, so slice I (counted from the LSB) holds c(1022-I)
for(I=0;I<1023;I=I+1)
xor_reduction = xor_reduction ^ (r[1022-I]*C[I*10 +: 10]);
end
endfunction
assign out = xor_reduction(r);
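To check either version against known values, a small reference model can help. The following Python sketch is not part of the original answer; the name xor_reduction is reused only for readability and the three constants are made-up test data.

```python
from functools import reduce

def xor_reduction(c, r_bits):
    """XOR together every constant c[i] whose selector bit r_bits[i] is 1."""
    terms = (ci if ri else 0 for ci, ri in zip(c, r_bits))
    return reduce(lambda acc, term: acc ^ term, terms, 0)

c = [0x123, 0x234, 0x0FF]   # 10-bit constants (illustrative values)
r = [1, 0, 1]               # r[i] selects whether c[i] takes part
print(hex(xor_reduction(c, r)))  # same as 0x123 ^ 0x0FF
```

Multiplying a 10-bit constant by a single bit is the same as gating it, which is why the model uses a conditional instead of a product.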

Using real parameter to determine counter sizes

I am trying to make my debounce code more modular by passing in parameters that are the frequency and the desired bounce time to eliminate button/switch bounce. This is how I approached it:
module debounceCounter
#(
parameter CLOCK_FREQUENCY_Hz = 50_000_000,
parameter BOUNCE_TIME_s = 0.003
)
(
input wire sysClk, reset,
input wire i_async,
output reg o_sync
);
/* include tasks/functions */
`include "clog2.v"
/* constants */
parameter [(clog2(BOUNCE_TIME_s * CLOCK_FREQUENCY_Hz + 0.5) - 1) : 0]
MAX_COUNT = BOUNCE_TIME_s * CLOCK_FREQUENCY_Hz;
Synthesis using Xilinx ISE 14.7 throws this error:
Xst:850 - "../../rtl/verilog/debounceCounter.v" line0: Unsupported real constant
How can I get around this issue so that I can determine the counter size and max count value based on parameters being passed in from code above this module in the hierarchy? A majority of my code has sizes of variables and such determined by frequency generics, so not being able to use the methods I would use in VHDL has created problems in my designs.
Seems to work fine on Vivado 2016.3 (the oldest I have available). I think the problem is that ISE 14.7 is too old to support this. You didn't show the contents of the `include, but I'm assuming it's the one from AR# 44586. If so, it should take and return integers, and the real values are converted to integers (rounded to the nearest integer) when they are passed in. Floating point arithmetic is fine to use in Verilog/SystemVerilog testbenches and parameters.
How can I get around this issue so that I can determine the counter
size and max count value based on parameters being passed in from code
above this module in the hierarchy?
Update to a recent version. 2017.1 and 2017.3 are working well for me. I tested the following on 2016.3 and it also worked fine.
Try using SystemVerilog (.sv) which supports the $clog2() system function natively without the `include. Not sure when .sv started working, but probably needs 2015+.
Verify that your version of clog2 in the clog2.v header matches the following
NOTE: There is another pretty serious bug in the code you posted.
When you want to get the MSB required to hold a constant expression "x", the pattern should be $clog2((x)+1)-1. You have only added 0.5 instead of 1. This leaves one bit too few whenever the result of the floating point expression "x" falls between (2^n - 0.5) and 2^n, because x then rounds up to 2^n while the width is still computed for n bits. For example, what you have erroneously computes the constant as 18'h0 instead of 19'h4_0000 for the frequency 87381333, but it still appears to work for your example at 50 MHz. Murphy's law says you will accidentally fall into this narrow bad range at the worst possible time, but never during testing :).
For reference, this is what I tested, with the `include expanded inline:
`timescale 1ns / 1ps
module debounceCounter
#(
//parameter CLOCK_FREQUENCY_Hz = 50_000_000,
parameter CLOCK_FREQUENCY_Hz = 87381333, // whoops
parameter BOUNCE_TIME_s = 0.003
)
(
input wire sysClk, reset,
input wire i_async,
output reg o_sync
);
/* include tasks/functions */
//`include "clog2.v"
function integer clog2;
input integer value;
begin
value = value-1;
for (clog2=0; value>0; clog2=clog2+1)
value = value>>1;
end
endfunction
/* constants */
//parameter [(clog2(BOUNCE_TIME_s * CLOCK_FREQUENCY_Hz + 0.5) - 1) : 0] // <- BUG!!! 0.5 should be 1
parameter [(clog2(BOUNCE_TIME_s * CLOCK_FREQUENCY_Hz + 1) - 1) : 0]
MAX_COUNT = BOUNCE_TIME_s * CLOCK_FREQUENCY_Hz;
initial
$display("MAX_COUNT %d", MAX_COUNT);
endmodule
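The width arithmetic can be checked without a synthesis run. Below is a rough Python model of it, under the assumption that real values are rounded to the nearest integer when converted (as the Verilog standard specifies); the helper names verilog_round and stored_max_count are made up for this sketch.

```python
import math

def clog2(value):
    """Ceiling of log2(value), using the same loop as the Verilog function."""
    value -= 1
    bits = 0
    while value > 0:
        value >>= 1
        bits += 1
    return bits

def verilog_round(x):
    """Real-to-integer conversion for x >= 0: nearest integer, ties rounded up."""
    return math.floor(x + 0.5)

def stored_max_count(freq_hz, bounce_s, margin):
    """Return (width in bits, value that survives in a parameter of that width)."""
    x = bounce_s * freq_hz
    width = clog2(verilog_round(x + margin))  # declared range is [width-1:0]
    value = verilog_round(x)
    return width, value & ((1 << width) - 1)

print(stored_max_count(87381333, 0.003, 0.5))  # width too small, value wraps
print(stored_max_count(87381333, 0.003, 1))    # one more bit, value preserved
print(stored_max_count(50_000_000, 0.003, 0.5))
```

With the 0.5 margin, the 87381333 Hz case gets an 18-bit parameter that cannot hold 262144, so the stored value wraps to zero, while the 50 MHz case happens to fit.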
Type Real is not synthesizable. Draw/Create your design before you translate into/write HDL and you will realize this. Ask yourself, "What does a real synthesize to in gates?"
For those tools (e.g. Synplify) that do "support" Type Real, it is just a vendor interpretation, and as such is impossible to "support" since it is not defined as part of any HDL standard. The implication: If you had a simulator that interprets Type Real one way, and your synthesizer (likely) interprets it another way, you will get sim/syn mismatches. You may get away with them, depending on what you are trying to accomplish, but, it would still be considered poor design practice.
Behavioral code for modeling and use in testbenches is, as stated above, a different story, since it is not synthesized.

verilog changing random seed

How do I change the seed for $urandom_range every time I start a new simulation? I tried many things, but none worked.
always @(posedge tb_rd_clkh)
begin
$random(9);
tbo9_ready_toggle_q <= $urandom_range(0, 1);
end
You can change the seed using a flag like this:
irun -seed seed_number
Or you can use a random seed:
irun -seed random
I'm pretty sure every tool (Questa and VCS) has an option to do this. If you don't set a seed, it will default to 1.
Set the seed in the conventional way before drawing from the range of random numbers with $urandom_range:
seed = 2;
void'($urandom(seed));
The code snippet above sets the seed to 2 for $urandom_range as well. Every time you run, the random number generator creates the same sequence as long as the seed is the same. You can find a working example on EDA Playground.
UPDATE:
As for your question of how to set the seed for $urandom_range inside an always block, a more general way is described in the SystemVerilog LRM (IEEE 1800-2012, Section 18.13.3, srandom()):
The srandom() method allows manually seeding the RNG of objects or
threads.
Making use of it, I created a simplified, self-contained example that shows how to set the seed inside an always block:
module dut(input clk,output reg [31:0] out);
integer seed;
assign seed = 10;
always @ (posedge clk)
begin
$srandom(seed);
out <= $urandom_range (10,1);
$display ("out = %d",out);
end
endmodule
You may want to try this out, the above example with tb can be found in the link.
Solution to your question
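In your code snippet you have to change $random(9) to $srandom(9), where 9 is the seed value. Note that the LRM defines srandom() as a method rather than a system task, so if your simulator rejects $srandom, process::self().srandom(9) is the standard form.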

How do you move non-zero elements in an array to the top in a single cycle?

I have the following 8-bit array:
0
4
0
0
5
0
2
0
How do I make it to the following in a single cycle (without iterating the element one by one)?
4
5
2
0
0
0
0
0
I know how to do it in software (MATLAB), but I'm not sure how to do it with combinational logic.
% initialise temporary vectors
TempType = zeros(maxType,1);
TempStart = zeros(maxType,1);
TempStop = zeros(maxType,1);
index = 1;
% remove zero elements from the middle
for j = 1:maxType
if (PreType(j) > 0 && PreStart(j) > 0 && PreStop(j) > 0)
TempType(index) = PreType(j);
TempStart(index) = PreStart(j);
TempStop(index) = PreStop(j);
index = index + 1;
end
end
I think any simplified sorting algorithm can do the job. For example, here is a modified bubble sort solution implemented in a single cycle:
module MoveZeros;
parameter W1 = 8;
parameter W2 = 10;
integer i, j;
logic [W1-1:0] array[W2-1:0] = {0,4,0,0,5,0,2,0,0,1};
logic [W1-1:0] temp;
always_comb begin
for (i=W2-1 ; i >=0 ; i=i-1)
for (j=W2-1 ; j >= 1 ; j=j-1) // stop at 1 so that array[j-1] stays in range
begin
if (array[j]==0 && array[j-1] != 0) begin
temp = array[j];
array[j] = array [j-1];
array[j-1] = temp;
end
end
end
endmodule
output:
# array = '{4, 5, 2, 1, 0, 0, 0, 0, 0, 0}
Working example on edaplayground. Depending on your cycle time and the width of your input array (W2), you may want to break this algorithm into multiple cycles.
Synthesis tools unroll loops; therefore, the synthesized circuit will have O(W2^2) comparators and multiplexers, which can explode in size. Hence, for bigger arrays, a multi-cycle solution is the way to go.
This is not an answer, which would take several hours of work, but SO's comments are not up to this sort of question. You should ask on comp.arch.fpga, if it's still alive.
Start by finding a datasheet for one of the old asynchronous fall-through FIFOs; these will include a circuit diagram. You don't really want to do anything like this, because the stage-to-stage handshaking is hairy, and you can't apply all 8 values simultaneously, but it'll give you ideas for a more synchronous implementation. Adapting a fall-through FIFO to do what you want is trivial - just ignore zero inputs.
If you can go up to 8 clock cycles, a more synchronous implementation is easy, with relatively limited hardware.
One cycle doesn't look too difficult, but will use more hardware. How sure are you that you must do it in one cycle? How much hardware can you use? If you've got a free PLL/DLL I'd be inclined to use that to get an 8x clock.
EDIT
Actually, with the benefit of more than 2 minutes thought, this seems pretty easy, even in one cycle.
Say you've got 8 registers with your 8 inputs (I0-I7), and 8 output registers (Q0-Q7). Each output register has associated logic which selects an input register for source data. The Q0 selector finds the lowest-numbered I register which contains non-zero data. The Q1 selector finds the next highest I register which contains non-zero data, and so on. Each selector drives a mux which loads the corresponding output register. Q0 requires an 8-1 mux (eight 8-bit inputs from I0-I7, one 8-bit output which goes to the input of Q0). Q1 requires a 7-1 mux (the inputs can only be I1-I7), and so on, until Q7, which doesn't require a mux at all (it can only be driven by I7).
The only smarts are in the selectors which find the source data for each output register. The Q7 selector is trivial; Q7 can only select I7, and only if all of I0-I7 contain non-zero data. Q6 is a bit more complicated, and so on.
If you can't see how to code a selector, ask specifically about that one in a new question, to avoid all the comments.
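The selector idea described above can be modeled in software before writing any HDL. The Python sketch below is only a behavioral model under the assumption that "the k-th selector picks the k-th non-zero input"; the function name compact and the rank list are illustrative, and in hardware every selector would evaluate in parallel rather than in a loop.

```python
def compact(inputs):
    """Behavioral model: output k takes the k-th non-zero input, else 0."""
    n = len(inputs)
    # rank[i] = number of non-zero inputs strictly before position i
    rank = []
    count = 0
    for value in inputs:
        rank.append(count)
        if value != 0:
            count += 1
    outputs = []
    for k in range(n):
        selected = 0
        # Q[k] can only be fed from I[k]..I[n-1], matching the shrinking mux sizes.
        for i in range(k, n):
            if inputs[i] != 0 and rank[i] == k:
                selected = inputs[i]
        outputs.append(selected)
    return outputs

print(compact([0, 4, 0, 0, 5, 0, 2, 0]))  # [4, 5, 2, 0, 0, 0, 0, 0]
```

The rank of each input is a count of non-zero flags below it, which is plain combinational logic, so the whole structure fits in one cycle at the cost of the mux hardware mentioned above.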
