Is there a method in verilog to start reading ROM data from a specific address? - verilog

I've designed a ROM for coefficients and an up-down counter to read these coefficients one by one, but there are two cases for the starting point: one set of coefficients belongs to type 1 and another set to type 2. For example, for type 1 I want to start from address zero, and for type 2 from address 30. I remember that someone told me it is possible using some # or something, but I don't remember the actual way to do this.
This is my counter code:
module UDcounter(input clk,rst,up,GItype,
output reg [5:0]addr);
always @(posedge clk,posedge rst)
if (rst)
addr<=6'b0;
else
begin
if (GItype) //assume 1 is a long GI type
begin
// addr=6'b000000;
if (up)
addr=addr+1;
else addr=addr-1;
end
else //for short GI
begin
//addr=6'b100000;
if (up)
addr=addr+1;
else addr=addr-1;
end
end
endmodule
The error here is that on every clock cycle it starts addressing from addr=0, for example, so the output address is always 1 (because of the +1 line).

What I understand from your question is that you want to design a ROM that stores coefficients.
I assume that you have two types of coefficients, type a and type b, stored in the ROM; say the starting address for type a is 0 and for type b is 30. To access the ROM you would want two counters, addr_ptr_a and addr_ptr_b, which act as address pointers. Assuming the ROM has about 60 address locations, addr_ptr_a will count from 0 to 29 and addr_ptr_b will count from 30 to 59.
The GItype signal can be used to determine which counter to enable.
I am assuming a sequential read operation, for a random read operation you would need a separate logic to generate the read address.
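The two-pointer idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the module name, the parameter names, the wrap-around behaviour, and the mapping of GItype = 1 to type a are not from the question.

```verilog
// Sketch only: coeff_addr_gen, TYPE_A_BASE, TYPE_B_BASE and TYPE_B_LAST
// are assumed names and values, not part of the original design.
module coeff_addr_gen #(
    parameter [5:0] TYPE_A_BASE = 6'd0,   // first address of type a
    parameter [5:0] TYPE_B_BASE = 6'd30,  // first address of type b
    parameter [5:0] TYPE_B_LAST = 6'd59   // last address of type b
) (
    input  wire       clk,
    input  wire       rst,
    input  wire       GItype,             // assumed: 1 selects type a, 0 selects type b
    output wire [5:0] addr
);
    reg [5:0] addr_ptr_a;
    reg [5:0] addr_ptr_b;

    always @(posedge clk or posedge rst) begin
        if (rst) begin
            // Each pointer starts at the base of its own region.
            addr_ptr_a <= TYPE_A_BASE;
            addr_ptr_b <= TYPE_B_BASE;
        end else if (GItype) begin
            // Advance only the selected pointer; wrap at the end of region a.
            addr_ptr_a <= (addr_ptr_a == TYPE_B_BASE - 6'd1) ? TYPE_A_BASE
                                                             : addr_ptr_a + 6'd1;
        end else begin
            // Wrap at the end of region b.
            addr_ptr_b <= (addr_ptr_b == TYPE_B_LAST) ? TYPE_B_BASE
                                                      : addr_ptr_b + 6'd1;
        end
    end

    // GItype chooses which pointer addresses the ROM.
    assign addr = GItype ? addr_ptr_a : addr_ptr_b;
endmodule
```

Because the start address is loaded only on reset, the pointer is not forced back to its base on every clock edge, which is the symptom described in the question.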

Related

How to print system verilog coverage bin value at end of simulation?

I want to find out how many times the state machine went through the following sequence of states by displaying the count at the end of the simulation.
I could not find a way to dump the value of the bin "b" in the code below.
interface i;
typedef enum { S0, S1, S2, S3} state_e;
state_e state;
assign state = dut1.sm_state;
covergroup my_cg @(state);
coverpoint state {
bins b = (S0 => S1 => S2 => S3);
}
endgroup
my_cg cg1 = new();
final begin
$display("COVERAGE:CG1.state:%0d", cg1.state.get_coverage());
end
endinterface
Currently the output gives 100 if the sm went through the arc even once. I would instead like the count how many times it went through the arc.
get_coverage does not give you a count of bin hits—it only gives you the percent of bins hit, or the ratio of bins hit to the total number of bins. For performance, most tools stop counting after the bin has met its required minimum hits, just 1 hit by default. This saves not only the counting, but also evaluating the set of selection expressions that determine which bin to hit.
For debug, most tools give you a way of reporting the actual bin hit counts for the entire simulation.
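If the tool's bin-count report is not convenient, one possible workaround is to count the arc by hand next to the covergroup. The following is a minimal sketch, not a feature of the coverage API: arc_counter_sketch, arc_count, and n_seen are hypothetical names, and connecting dut1.sm_state to the enum-typed port may need a cast depending on how the DUT declares it.

```systemverilog
// Sketch only: a hand-written counter for the S0 => S1 => S2 => S3 arc.
typedef enum { S0, S1, S2, S3 } state_e;

module arc_counter_sketch (input state_e state);
    state_e      prev1, prev2, prev3;   // last three sampled states, newest first
    int unsigned n_seen    = 0;         // how many history slots are valid (max 3)
    int unsigned arc_count = 0;         // number of complete arcs observed

    // Runs on every change of state, like the covergroup's @(state) event.
    always @(state) begin
        if (n_seen == 3 &&
            prev3 == S0 && prev2 == S1 && prev1 == S2 && state == S3)
            arc_count++;
        // Shift the history and record the new sample.
        prev3 = prev2;
        prev2 = prev1;
        prev1 = state;
        if (n_seen < 3) n_seen++;
    end

    final $display("ARC_COUNT:S0=>S1=>S2=>S3:%0d", arc_count);
endmodule
```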

The difference between x and z

While reading the syntax of Verilog, I came across the four logic values: 0 1 x z.
After searching the web for the difference between x and z, I found only that x is an unknown value and z is high impedance (tristate). I think I understand the definition of x, but I didn't quite understand the definition of z: what does "high impedance (tristate)" mean?
I would like to see an example for each of the two logic values, x and z.
Z means the signal is in a high-impedance state also called tri-state. Another signal connected to it can change the value: a 0 will pull it low, a 1 will pull it high.
To understand impedance (and thus high impedance) you should have some understanding of resistance, voltage and current and their relations as defined by Ohm's law.
I can't give you an example of 'X' or 'Z', just as I can't give you an example of '1' or '0'. These are just definitions of signal states. In fact, in Verilog there are more than four states. There are seven strengths.
(See this webpage).
Here is a principle diagram of how a chip output port makes a zero, one or Z. In reality the switches are MOSFETs.
Tri-state signals are no longer used inside chips or inside FPGAs. They are only used outside for connecting signals together.
x, as you have already found, describes an unknown state. By default, Verilog simulation starts with all variables initialized to this value. One of the tasks of the designer is to provide correct reset sequences to bring the model into a known state, without 'x', e.g.
always @(posedge clk)
if (rst)
q <= 0;
In the above example initial value of q which was x is replaced by a known value of 0.
The difference between 'x' and 'z' is that 'z' is a known state of high impedance, meaning actually disconnected. As such, it could be driven to any other value with some other driver. It is used to express tri-state buses or some other logic.
wire bus;
assign bus = en1 ? value1 : 1'bz;
...
assign bus = en2 ? value2 : 1'bz;
In the above example the bus is driven by 2 different drivers. If 'en1' or 'en2' is high, the bus is driven with a real 'value1' or 'value2'. Otherwise its state is 'z'.
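To see both values in a simulator, a small testbench along these lines can help. It is only a sketch: tb_xz and the register q are assumed names, and the stimulus values are arbitrary.

```verilog
// Sketch only: shows z (no driver), a driven value, and x (driver conflict).
module tb_xz;
    reg  en1, en2, value1, value2;
    reg  q;                         // never assigned, so it stays x
    wire bus;

    assign bus = en1 ? value1 : 1'bz;
    assign bus = en2 ? value2 : 1'bz;

    initial begin
        en1 = 0; en2 = 0; value1 = 1; value2 = 0;
        #1 $display("no driver:   bus=%b q=%b", bus, q);  // bus=z q=x
        en1 = 1;
        #1 $display("one driver:  bus=%b", bus);          // bus=1
        en2 = 1;
        #1 $display("two drivers: bus=%b", bus);          // bus=x (1 against 0)
        $finish;
    end
endmodule
```

The last case is the other common source of x: two drivers of equal strength forcing opposite values onto the same net.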
Verilog has truth tables for every operator covering all four values, so you can check how they are used. For example, for '&':
&  0  1  x  z
0  0  0  0  0
1  0  1  x  x
x  0  x  x  x
z  0  x  x  x
You can find similar tables for every other gate as well. Note that there is no 'z' in the result, only 'x'.
In SystemVerilog, X is an unknown value and Z is a high-impedance (undriven) state, like an unconnected wire.
Suppose you have a wire connecting two modules, m1 and m2.
If m1 drives Z onto that wire, m2 can pull the wire down by driving it to zero.
As far as I can tell, "tristate" or "high impedance" occurs in transistors when nothing drives the output.
For example, take an nMOS transistor, T1, whose gate value is 0. T1 does not conduct, so there is no conduction path between the supply (probably 0) and the drain (the output). That leaves the output at "Z", or tristate.
The same can happen for a pMOS transistor whose gate value is 1.

Counting of different channels diverge and jumps

I'm trying to realize a counting module. My basic setup:
FPGA (Digilent's Arty with Xilinx Artix-35T) with two BNC cables attached to IO ports connected to a signal generator and via USB/UART to the PC for reading out. My signal generator produces with, say, 1 Hz some TTL signal.
I now want to count the amount of events in channel 1, in channel 2 and the coincidences of channels 1 and 2. While the basic principle works, I see channels 1 and 2 separate, even though they have the same input (via BNC-T connector). Also, sometimes one of the output channels jumps - in either direction, see figure.
The violet channel ("Channel 1") has a different slope than green ("Channel 2"). Also the coincidences make here two little lossy jumps.
My sequential counting code looks like
reg [15:0] coinciInt [(numCoincidences -1):0]; // internally store events
always @(posedge clk or posedge reset) // every time the clock rises...
begin
signalDelay <= signal; // delayed signal for not counting the same event twice
if(reset) // reset
begin
for(i=0;i<numCoincidences;i=i+1)
coinciInt[i] <= 16'b0;
end
else // No reset
begin
for(i=1;i<numCoincidences;i=i+1) // loop through all coincidence possibilities:
begin
if( ((signal & i) == i) && ((signalDelay & i) != i) ) // only if signal give coincidence, but did not give before, it's a coincidence
begin // "(signal & i) == i" means that "signal" is checked if bitmask of "i" is contained:
// ((0011 & 0010) == 0010) is true, since 0011 & 0010 = 0010 == 0010
coinciInt[i] <= coinciInt[i] + 1'b1; // the i-th coincidence triggered, store it
end
end
end
end // end of always
assign coinci = coinciInt; // the output variable is called coinci, so assign to this one
Please note that all events are in the register coinci - coincidences as well as 'single events'. Ideally, coinci[1] should store events of channel 1, coinci[2] these of channel 2 and coinci[3] coincidences between 1 and 2, since channels are labelled by 1,2,4,8,...,2^n and coincidences by the respective sum. coinci[0] is used for some kind of checksum, but that's off-topic now.
Are there any ideas for the missing counts? For the different slopes?
Thank you very much
Edit 1
@Brian Magnuson pointed to the metastability issue. Using multi-buffered inputs solved the issue of diverging channels. That works nicely. Although I don't fully understand the reason for this, I also have not seen any jumps in the coincidence channel so far. You probably saved me a lot of time, thanks!
I would suspect a meta-stability problem. Your incoming pulses on ch1/ch2 are probably not synchronized with the system clock you are using. See here.
Because of this you are probably sometimes catching the counter updates 'mid-stride' so to speak which will cause unexpected behavior.
To fix this you can flop the inputs twice (called a dual-rank synchronizer) before feeding them into the rest of your logic. Usually multi-bit synchronization requires a bit more careful handling but in your case each bit can be treated independently.
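A dual-rank synchronizer of the kind described above might look like this. It is a minimal sketch: sync_2ff, WIDTH, async_in, and sync_out are assumed names, and each bit is treated as an independent signal, as the answer suggests is acceptable here.

```verilog
// Sketch only: two flip-flops in series per input bit.
module sync_2ff #(
    parameter WIDTH = 2                 // assumed: one bit per input channel
) (
    input  wire             clk,
    input  wire [WIDTH-1:0] async_in,   // raw pins, not aligned to clk
    output wire [WIDTH-1:0] sync_out    // safe to use in clk-domain logic
);
    reg [WIDTH-1:0] stage1;
    reg [WIDTH-1:0] stage2;

    always @(posedge clk) begin
        stage1 <= async_in;   // may go metastable
        stage2 <= stage1;     // gets a full clock period to settle
    end

    assign sync_out = stage2;
endmodule
```

The counting logic would then use sync_out in place of the raw signal, so every counter sees the same settled value on a given clock edge.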

How do you move non-zero elements in an array to the top in a single cycle?

I have the following 8-bit array:
0
4
0
0
5
0
2
0
How do I make it to the following in a single cycle (without iterating the element one by one)?
4
5
2
0
0
0
0
0
I know how to do it in software (MATLAB), but I'm not sure how to do it with combinational logic.
% initialise temporary vectors
TempType = zeros(maxType,1);
TempStart = zeros(maxType,1);
TempStop = zeros(maxType,1);
index = 1;
% remove zero elements from the middle
for j = 1:maxType
if (PreType(j) > 0 && PreStart(j) > 0 && PreStop(j) > 0)
TempType(index) = PreType(j);
TempStart(index) = PreStart(j);
TempStop(index) = PreStop(j);
index = index + 1;
end
end
I think any simplified sorting algorithm can do the job. For example, here is a modified bubble sort solution implemented in a single cycle:
module MoveZeros;
parameter W1 = 8;
parameter W2 = 10;
integer i, j;
logic [W1-1:0] array[W2-1:0] = {0,4,0,0,5,0,2,0,0,1};
logic [W1-1:0] temp;
always_comb begin
for (i=W2-1 ; i >=0 ; i=i-1)
for (j=W2-1 ; j > 0 ; j=j-1) // stop at 1 so array[j-1] stays in range
begin
if (array[j]==0 && array[j-1] != 0) begin
temp = array[j];
array[j] = array [j-1];
array[j-1] = temp;
end
end
end
endmodule
output:
# array = '{4, 5, 2, 1, 0, 0, 0, 0, 0, 0}
Working example on edaplayground. Depending on your cycle time and the width of your input array (W2), you may want to break this algorithm into multiple cycles.
Synthesis tools unroll loops, therefore, the synthesized circuit will have O(W2^2) comparators and multiplexers, which can explode. Hence for bigger arrays, a multi-cycle solution is the way to go.
This is not an answer, which would take several hours of work, but SO's comments are not up to this sort of question. You should ask on comp.arch.fpga, if it's still alive.
Start by finding a datasheet for one of the old asynchronous fall-through FIFOs; these will include a circuit diagram. You don't really want to do anything like this, because the stage-to-stage handshaking is hairy, and you can't apply all 8 values simultaneously, but it'll give you ideas for a more synchronous implementation. Adapting a fall-through FIFO to do what you want is trivial - just ignore zero inputs.
If you can go up to 8 clock cycles, a more synchronous implementation is easy, with relatively limited hardware.
One cycle doesn't look too difficult, but will use more hardware. How sure are you that you must do it in one cycle? How much hardware can you use? If you've got a free PLL/DLL I'd be inclined to use that to get an 8x clock.
EDIT
Actually, with the benefit of more than 2 minutes thought, this seems pretty easy, even in one cycle.
Say you've got 8 registers with your 8 inputs (I0-I7), and 8 output registers (Q0-Q7). Each output register has associated logic which selects an input register for source data. The Q0 selector finds the lowest-numbered I register which contains non-zero data. The Q1 selector finds the next highest I register which contains non-zero data, and so on. Each selector drives a mux which loads the corresponding output register. Q0 requires an 8-1 mux (eight 8-bit inputs from I0-I7, one 8-bit output which goes to the input of Q0). Q1 requires a 7-1 mux (the inputs can only be I1-I7), and so on, until Q7, which doesn't require a mux at all (it can only be driven by I7).
The only smarts are in the selectors which find the source data for each output register. The Q7 selector is trivial; Q7 can only select I7, and only if all of I0-I7 contain non-zero data. Q6 is a bit more complicated, and so on.
If you can't see how to code a selector, ask specifically about that one in a new question, to avoid all the comments.
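For reference, here is one way the single-cycle version could be written behaviourally. It is a sketch under assumptions, and it uses a different formulation from the per-output selectors above: instead of each output choosing its source, each non-zero input is written to the next free output slot, and synthesis unrolls the loop into the equivalent mux network. compact_nonzero, N, W, and dst are hypothetical names.

```systemverilog
// Sketch only: combinational compaction of non-zero entries to the front.
module compact_nonzero #(
    parameter int N = 8,    // number of elements
    parameter int W = 8     // bits per element
) (
    input  logic [W-1:0] in  [N],
    output logic [W-1:0] out [N]
);
    always_comb begin
        int dst;            // next free output slot
        dst = 0;

        // Default every output to zero so unused slots are filled.
        for (int k = 0; k < N; k++)
            out[k] = '0;

        // Copy each non-zero input to the next free slot, preserving order.
        for (int k = 0; k < N; k++) begin
            if (in[k] != '0) begin
                out[dst] = in[k];
                dst++;
            end
        end
    end
endmodule
```

With the question's input 0, 4, 0, 0, 5, 0, 2, 0 on in[0] to in[7], out[0] to out[2] hold 4, 5, 2 and the remaining outputs are zero. The logic depth grows with N, so timing should be checked for wider arrays.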

Optimize this comparator for better synthesis

I have a module which is basically a LUT whose input is 64 bits. The LUT always block consists of a case statement which compares the input to over 200 different integers. The default case in the case statement checks if the input is > 100 or not before assigning the output a default value.
My problem is that when I synthesize, it leads to a 65 bit comparator, and I was wondering if there are better ways of doing it so that a large comparator isn't synthesized.
Here's my code snippet:
always @(in)
begin
case (in)
-100: out <= 495050;
-99: out <= 500000;
...
99: out <= 99500000;
100: out <= 99504950;
default:
begin
if (in > 100)
out <= 99504950;
else
out <= 495050;
end
endcase
end
Thanks,
Faisal
Assuming that in is a 64 bit number, what you can do is to chop it off such that you only have to 'compare' the lowest few bits, and then you can do quick checks to see if the number is outside of the range needed.
For example, let's just chop off in at 8 bits, and assign it to an 8 bit signed register. This should allow you to represent between -128 and 127.
You can test if the full number is larger than 127 by: !in[63] && (|in[62:7]) (check if any upper bit is 1, and the MSB is not set). Bit 7 has to be included, otherwise 128 to 255 would slip through.
You can test if the full number is less than -128 by: in[63] && !(&in[62:7]) (check if any upper bit is 0, and the MSB is set).
Now you know three things:
if the number is larger than 127
if the number is between 127 and -128
and if the number is less than -128.
You should be able to use a small 8-bit LUT for the inbetween case, or use your default values if it's in either of the upper ranges.
Note I might expect a good synthesizer to do this automatically for you, but if you look at the generated netlist and it's too large you can try this to see if it gives you a better result.
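Put together, the range check and the small LUT could be sketched as below. The module name, the 27-bit output width, and the signal names above, below, and idx are assumptions; the table entries between -99 and 99 are left out just as they are in the question. The sketch tests bits [62:7] so that every value outside -128..127 is caught before the 8-bit lookup.

```verilog
// Sketch only: narrow the 64-bit input to 8 bits before the case statement.
module lut_range_sketch (
    input  wire signed [63:0] in,
    output reg         [26:0] out    // assumed width; 99504950 fits in 27 bits
);
    // Outside the 8-bit signed range: decided from the sign and upper bits only.
    wire above = !in[63] &&  (|in[62:7]);   // in > 127
    wire below =  in[63] && !(&in[62:7]);   // in < -128

    // Low byte reinterpreted as a signed 8-bit value.
    wire signed [7:0] idx = in[7:0];

    always @* begin
        if (above)
            out = 27'd99504950;
        else if (below)
            out = 27'd495050;
        else begin
            case (idx)
                -8'sd100: out = 27'd495050;
                -8'sd99:  out = 27'd500000;
                // ... entries for -98 to 98 omitted, as in the question
                8'sd99:   out = 27'd99500000;
                8'sd100:  out = 27'd99504950;
                // 8-bit values beyond +/-100 saturate like the original default.
                default:  out = (idx > 8'sd100) ? 27'd99504950 : 27'd495050;
            endcase
        end
    end
endmodule
```

The only magnitude comparison left is 8 bits wide; the 64-bit part reduces to an OR and an AND over the upper bits.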
It seems like you have a precalculated table of some function's values for input x in [-100; 100]. If so, it would be better to store them in memory one after another, starting from some base address. To read them, you can put base + x + 100 on the address bus and obtain the value you need.
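A memory-based version of this idea might be sketched as follows. Everything named here is an assumption: the module name, a base address of 0, the 27-bit data width, and the file coeffs.hex, which is a hypothetical file holding the 201 table values in order. The input is assumed to be already limited to -100..100.

```verilog
// Sketch only: table stored in a ROM and addressed with x + 100.
module coeff_rom_sketch (
    input  wire              clk,
    input  wire signed [7:0] x,       // assumed already limited to -100..100
    output reg         [26:0] value
);
    reg [26:0] rom [0:200];           // one entry per x in -100..100

    // Hypothetical file with 201 hex words, entry 0 being f(-100).
    initial $readmemh("coeffs.hex", rom);

    // Offset by 100 so that x = -100 maps to address 0 and x = 100 to 200.
    wire [7:0] addr = x + 8'sd100;

    always @(posedge clk)
        value <= rom[addr];           // registered read, typical for block RAM
endmodule
```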
In case you need a gigantic multiplexer, you may want to try using a "parallel" case directive.
As for comparator in "default" - I have the same problem, so I am waiting for an answer.
I wanted to write this as a comment but I have no such privilege
