Counting of different channels diverge and jumps - verilog

I'm trying to realize a counting module. My basic setup:
FPGA (Digilent's Arty with Xilinx Artix-35T) with two BNC cables attached to IO ports connected to a signal generator and via USB/UART to the PC for reading out. My signal generator produces with, say, 1 Hz some TTL signal.
I now want to count the amount of events in channel 1, in channel 2 and the coincidences of channels 1 and 2. While the basic principle works, I see channels 1 and 2 separate, even though they have the same input (via BNC-T connector). Also, sometimes one of the output channels jumps - in either direction, see figure.
The violet channel ("Channel 1") has a different slope than green ("Channel 2"). Also the coincidences make here two little lossy jumps.
My sequential counting code looks like
reg [15:0] coinciInt [(numCoincidences -1):0]; // internally store events
always #(posedge clk or posedge reset) // every time the clock rises...
begin
signalDelay <= signal; // delayed signal for not counting the same event twice
if(reset) // reset
begin
for(i=0;i<numCoincidences;i=i+1)
coinciInt[i] <= 16'b0;
end
else // No reset
begin
for(i=1;i<numCoincidences;i=i+1) // loop through all coincidence possibilities:
begin
if( ((signal & i) == i) && ((signalDelay & i) != i) ) // only if signal give coincidence, but did not give before, it's a coincidence
begin // "(signal & i) == i" means that "signal" is checked if bitmask of "i" is contained:
// ((0011 & 0010) == 0010) is true, since 0011 & 0010 = 0010 == 0010
coinciInt[i] <= coinciInt[i] + 1'b1; // the i-th coincidence triggered, store it
end
end
end
end // end of always
assign coinci = coinciInt; // the output variable is called coinci, so assign to this one
Please note that all events are in the register coinci - coincidences as well as 'single events'. Ideally, coinci[1] should store events of channel 1, coinci[2] these of channel 2 and coinci[3] coincidences between 1 and 2, since channels are labelled by 1,2,4,8,...,2^n and coincidences by the respective sum. coinci[0] is used for some kind of checksum, but that's off-topic now.
Are there any ideas for the missing counts? For the different slopes?
Thank you very much
Edit 1
#Brian Magnuson pointed to the meta stability issue. Using multi-buffered inputs solved the issue of diverging channels. That works nicely. Although I don't fully understand the reason for this, I also did not see any jumps in the coincidence channel so far. You probably save me a lot of time, thanks!

I would suspect a meta-stability problem. Your incoming pulses on ch1/ch2 are probably not synchronized with the system clock you are using. See here.
Because of this you are probably sometimes catching the counter updates 'mid-stride' so to speak which will cause unexpected behavior.
To fix this you can flop the inputs twice (called a dual-rank synchronizer) before feeding them into the rest of your logic. Usually multi-bit synchronization requires a bit more careful handling but in your case each bit can be treated independently.

Related

Is there a method in verilog to start reading ROM data from a specific address?

I've designed a ROM for coefficients and an up-down counter to read these coefficients one by one but there are two cases for the starting point where a specific number of coefficients for type1 and another set of coefficients for type 2 ... so for example for type 1 I want to start from address zero and for type 2 start from address 30 ... I remember that someone told me it is possible using some # or something but I don't remember what is the actual way to do this
this for my counter code
module UDcounter(input clk,rst,up,GItype,
output reg [5:0]addr);
always #(posedge clk,posedge rst)
if (rst)
addr<=6'b0;
else
begin
if (GItype) //assume 1 is a long GI type
begin
// addr=6'b000000;
if (up)
addr=addr+1;
else addr=addr-1;
end
else //for short GI
begin
//addr=6'b100000;
if (up)
addr=addr+1;
else addr=addr-1;
end
end
endmodule
the error here is that every clock cycle it start addressing from addr=0 for example and the output address is always 1 (for the +1) line
So what I understood from your question is that you want to design a ROM which will store coefficients.
Going by your question I assume that you have two types of coefficient viz type a & type b stored in the ROM, say the starting address for type a is 0 and for type b is 30. To go about accessing the ROM you would want two counters viz addr_ptr_a and addr_ptr_b which will act as address pointers, lets assume that the ROM has about 60 address locations then addr_ptr_a will count from 0 to 29 and addr_ptr_b will count from 30 to 60.
The GItype signal can be used to determine which counter to enable.
I am assuming a sequential read operation, for a random read operation you would need a separate logic to generate the read address.

Loop Convergence - Verilog Synthesis

I am trying to successively subtract a particular number to get the last digit of the number (without division). For example when q=54, we get q=4 after the loop. Same goes for q=205, output is q=5.
if(q>10)
while(q>10)
begin
q=q-10;
end
The iteration should converge logically. However, I am getting an error:
"[Synth 8-3380] loop condition does not converge after 2000 iterations"
I checked the post - Use of For loop in always block. It says that the number of iterations in a loop must be fixed.
Then I tried to implement this loop with fixed iterations as well like below (just for checking if this atleast synthesizes):
if(q>10)
while(loopco<9)
begin
q=q-10;
loopco=loopco-1;
end
But the above does not work too. Getting the same error "[Synth 8-3380] loop condition does not converge after 2000 iterations". Logically, it should be 10 iterations as I had declared the value of loopco=8.
Any suggestions on how to implement the above functionality in verilog will be helpful.
That code can not be synthesized. For synthesis the loop has to have a compile time known number of iterations. Thus it has to know how many subtractions to make. In this case it can't.
Never forget that for synthesis you are converting a language to hardware. In this case the tool needs to generate the code for N subtractions but the value of N is not known.
You are already stating that you are trying to avoid division. That suggest to me you know the generic division operator can not be synthesized. Trying to work around that using repeated subtract will not work. You should have been suspicious: If it was the easy it would have been done by now.
You could build it yourself if you know the upper limit of q (which you do from the number of bits):
wire [5:0] q;
reg [3:0] rem;
always #( * )
if (q<6'd10)
rem = q;
else if (q<6'd20)
rem = q - 6'd10;
else if (q<6'd30)
rem = q - 6'd20;
etc.
else
rem = q - 6'd60;
Just noticed this link which pops up next to your question which shows it has been asked in the past:
How to NOT use while() loops in verilog (for synthesis)?

Output port missing in generated Verilog code from MyHDL

I am trying to generate a verilog module from the following MyHDL module:
top.py:
from myhdl import *
from counter import Counter
def Top(clkIn, leds):
counter = Counter(clkIn, leds)
return counter
clkIn = Signal(bool(0))
leds = intbv(0)[8:0]
toVerilog(Top, clkIn, leds)
and,
counter.py:
from myhdl import *
def Counter(clk, count):
c = Signal(modbv(0)[8:0])
#always(clk.posedge)
def logic():
c.next = c + 1
#always_comb
def outputs():
count.next = c
return logic, outputs
However, in the generated file's module definition, (lines 1-3)
top.v:
module top (
clkIn
);
input clkIn;
reg [7:0] counter_c;
always #(posedge clkIn) begin: TOP_COUNTER_LOGIC
counter_c <= (counter_c + 1);
end
assign count = counter_c;
endmodule
leds[7:0] are missing. Even though these LEDs are unused I need them for my synthesizer to assign them to the proper pins on the development board. Why is MyHDL omitting them? and how can I make it include them?
Change leds = intbv(0)[8:0] into leds = Signal(intbv(0)[8:0]).
Module (output) ports need to be declared as Signal.
In your module top design, you didn't declare leds as an output. On clkIn is defined and it is an input. Most synthesizers will check that logic is driving outputs or some other visible, or kept logic. If the synthesizer determines that there is no possible way for you tell that leds is present in the design externally, it may just optimize it away as well as any dedicated logic driving it, away.
If this is Altera, there is a qsf assignment called virtual pins which could be assigned to leds, to keep it. But the easy thing to do is add leds to the module top pin definition and assign it as an out.
Per the comment from Alper, you don't assign Count to anything. That needs to be fixed.
Also, you don't initialize counter in the Counter definition. This might work in synthesis because the logic will either init to zeros, or some other definitive value, but in simulation, the value may (probably/will) remain unknown. Get a reset signal if you can.

Verilog: detect pulses larger than tmax

I have a question and I hope somebody can give me a hint to solve the problem.
I need a verilog code to make a signal "reset" goes high immediately if the period of input signal "in" is larger than tmax.
Signal "reset" should go low again at the next positive edge of "in" (if there is a next positive edge)
If the period of input signal "in" is smaller than tmax then signal "reset" should remain low.
Example 1.
tmax=100ns
period(in) = 80ns
reset remains low all the time
Example 2.
tmax=100ns
period(in) = 130ns
reset goes high 100ns after the first positive edge of "in"
reset goes low at the next positive edge of "in", if there is a second pulse
Where should I start?
How about this. You will still have to do your check to see if Tmax is exceed, but you get a 1 timescale resolution between results.
always # (*)
begin
while(1)
begin
current_time = $realtime;
if (last_time > 0.0) freq = 1.0e9 / (current_time - last_time);
last_time = current_time;
# 1; // This allows for 1 timescale between loops. If you leave
// this out, most simulators become non-responsive in the loop.
end
end

How do you move non-zero elements in an array to the top in a single cycle?

I have the following 8-bit array:
0
4
0
0
5
0
2
0
How do I make it to the following in a single cycle (without iterating the element one by one)?
4
5
2
0
0
0
0
0
I know how to do it in software (MATLAB), but I'm not sure how to do it with combinational logic.
% initialise temporary vectors
TempType = zeros(maxType,1);
TempStart = zeros(maxType,1);
TempStop = zeros(maxType,1);
index = 1;
% remove zero elements from the middle
for j = 1:maxType
if (PreType(j) > 0 && PreStart(j) > 0 && PreStop(j) > 0)
TempType(index) = PreType(j);
TempStart(index) = PreStart(j);
TempStop(index) = PreStop(j);
index = index + 1;
end
end
I think any simplified sorting algorithm can do the job. For example, here is a modified bubble sort solution implemented in a single cycle:
module MoveZeros;
parameter W1 = 8;
parameter W2 = 10;
integer i, j;
logic [W1-1:0] array[W2-1:0] = {0,4,0,0,5,0,2,0,0,1};
logic [W1-1:0] temp;
always_comb begin
for (i=W2-1 ; i >=0 ; i=i-1)
for (j=W2-1 ; j >= 0 ; j=j-1)
begin
if (array[j]==0 && array[j-1] != 0) begin
temp = array[j];
array[j] = array [j-1];
array[j-1] = temp;
end
end
end
endmodule
output:
# array = '{4, 5, 2, 1, 0, 0, 0, 0, 0, 0}
Working example on edaplayground. Depending on your cycle time and the width of your input array (W2), you may want to break this algorithm into multiple cycles.
Synthesis tools unroll loops, therefore, the synthesized circuit will have O(W2^2) comparators and multiplexers, which can explode. Hence for bigger arrays, a multi-cycle solution is the way to go.
This is not an answer, which would take several hours of work, but SO's comments are not up to this sort of question. You should ask on comp.arch.fpga, if it's still alive.
Start by finding a datasheet for one of the old asynchronous fall-through FIFOs; these will include a circuit diagram. You don't really want to do anything like this, because the stage-to-stage handshaking is hairy, and you can't apply all 8 values simultaneously, but it'll give you ideas for a more synchronous implementation. Adapting a fall-through FIFO to do what you want is trivial - just ignore zero inputs.
If you can go up to 8 clock cycles, a more synchronous implementation is easy, with relatively limited hardware.
One cycle doesn't look too difficult, but will use more hardware. How sure are you that you must do it in one cycle? How much hardware can you use? If you've got a free PLL/DLL I'd be inclined to use that to get an 8x clock.
EDIT
Actually, with the benefit of more than 2 minutes thought, this seems pretty easy, even in one cycle.
Say you've got 8 registers with your 8 inputs (I0-I7), and 8 output registers (Q0-Q7). Each output register has associated logic which selects an input register for source data. The Q0 selector finds the lowest-numbered I register which contains non-zero data. The Q1 selector finds the next highest I register which contains non-zero data, and so on. Each selector drives a mux which loads the corresponding output register. Q0 requires an 8-1 mux (eight 8-bit inputs from I0-I7, one 8-bit output which goes to the input of Q0). Q1 requires a 7-1 mux (the inputs can only be I1-I7), and so on, until Q7, which doesn't require a mux at all (it can only be driven by I7).
The only smarts are in the selectors which find the source data for each output register. The Q7 selector is trivial; Q7 can only select I7, and only if all of I0-I7 contain non-zero data. Q6 is a bit more complicated, and so on.
If you can't see how to code a selector, ask specifically about that one in a new question, to avoid all the comments.

Resources