How to delay a signal for a given number of cycles in VHDL?
The number of cycles is given as a generic.
Are there any other options instead of
process(CLK) is
begin
if rising_edge(CLK) then
a_q <= a;
a_q_q <= a_q;
a_q_q_q <= a_q_q;
-- etc
end if;
end process;
?
Create a 1-D array (let's call it a_store) of the appropriate signal type, with the length of the array set by the number of cycles. This may mean you have to create a new type for the array unless there's already a vector type you can use, e.g. std_logic_vector or integer_vector (the latter is standard only in VHDL-2008).
Then shuffle the array along each tick:
if rising_edge(clk) then
a_store <= a_store(a_store'high-1 downto 0) & a;
a_out <= a_store(a_store'high);
end if;
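As a rough sketch of the idea above, the shift array can be wrapped in an entity whose length comes from a generic. The entity name delay_line, the generic CYCLES and the single-bit std_logic data type are assumptions made for illustration; they are not from the original post. Here a_out is read from the last register outside the clocked process, so the signal passes through exactly CYCLES registers, like the a_q, a_q_q chain in the question. Assigning a_out inside the clocked process, as in the snippet above, adds one more register stage.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: name, generic and port names are illustrative only.
entity delay_line is
  generic (
    CYCLES : positive := 4  -- number of register stages (clock cycles of delay)
  );
  port (
    clk   : in  std_logic;
    a     : in  std_logic;
    a_out : out std_logic
  );
end entity delay_line;

architecture rtl of delay_line is
  -- One register per cycle of delay.
  signal a_store : std_logic_vector(CYCLES - 1 downto 0) := (others => '0');
begin
  process (clk) is
  begin
    if rising_edge(clk) then
      a_store(0) <= a;
      -- Shift everything along by one; the loop range is empty when CYCLES = 1.
      for i in 1 to CYCLES - 1 loop
        a_store(i) <= a_store(i - 1);
      end loop;
    end if;
  end process;

  -- Combinational tap of the last stage, so no extra register is added.
  a_out <= a_store(CYCLES - 1);
end architecture rtl;
```

The loop is used instead of the slice-and-concatenate form only so that CYCLES = 1 does not produce a null slice; for larger values both forms describe the same shift register.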
I am working on a Verilog design where I use SRAM inside an FSM. I need to synthesize it later on, since I want to fabricate the IC. I have fully working code that uses reg registers, with blocking assignments for concurrent operation. Since there is no clock in this system, it works fine. Now I want to replace these registers with SRAM-based memory, which brings a clock into the system. My first thought is to use non-blocking assignments and change the sensitivity list from always @(*) to always @(negedge clk).
In the code snippet below, I want to read 5 sets of data from the SRAM (SR4), so I add a counter (wait_var) that counts to 5 while this happens. With this additional counter, the code enters the counter on the 1st clock edge, and on the subsequent clock edges the five sets of data are read from the SRAM. This technique works for simple logic such as this.
S_INIT_MEM: begin
// ******Off-Chip (External) Controller will write the data into SR4. Once data is written, init_data will be raised to 1.******
if (init_data == 1'b0) begin
CEN4 <= CEN;
WEN4 <= WEN;
RETN4 <= RETN;
EMA4 <= EMA;
A4 <= A_in;
D4 <= D_in;
end
else begin
CEN4 <= 1'b0; //SR4 is enabled
EMA4 <= 3'b0; //EMA set to 0
WEN4 <= 1'b1; //SR4 set to read mode
RETN4 <= 1'b1; //SR4 RETN is turned ON
A4 <= 8'b0000_0000;
if (wait_var < 6) begin
if (A4 == 8'b0000_0000 ) begin
NUM_DIMENSIONS <= Q4;
A4 <= 8'b0000_0001;
end
if (A4 == 8'b0000_0001 ) begin
NUM_PARTICLES <= Q4;
A4 <= 8'b0000_0010;
end
if (A4 == 8'b0000_0010 ) begin
n_gd_iterations <= Q4;
A4 <= 8'b0000_0011;
end
if (A4 == 8'b0000_0011 ) begin
iterations <= Q4;
A4 <= 8'b0000_0100;
end
if (A4 == 8'b0000_0100 ) begin
threshold_val <= Q4;
A4 <= 8'b0000_0101;
end
wait_var <= wait_var + 1;
end
//Variables have been read from SR4
if(wait_var == 6) begin
CEN4 <= 1'b1;
next_state <= S_INIT_PRNG;
wait_var <= 0;
end
else begin
next_state <= S_INIT_MEM;
end
end
end
However, when I need to write more complex logic in a similar fashion, the counter-based delay method gets too complicated. For example, say I want to read data from one SRAM (SR1) and write it to another SRAM (SR3).
CEN1 = 1'b0;
A1 = ((particle_count-1)*NUM_DIMENSIONS) + (dimension_count-1);
if (CEN1 == 1'b0) begin
CEN3 = 1'b0;
WEN3 = 1'b0;
A3 = ((particle_count-1)*NUM_DIMENSIONS) + (dimension_count-1);
if(WEN3 == 1'b0) begin
D3 = Q1;
WEN3 = 1'b1;
CEN3 = 1'b1;
end
CEN1 = 1'b1;
end
I know this still uses blocking assignments and that I need to convert them to non-blocking assignments, but if I do that without manually introducing a 1-clock-cycle delay using a counter, it will not work as desired. Is there a simpler way to get around this?
Any help would be highly appreciated.
The main point is that non-blocking assignments are a simulation-only artifact: they give simulation a way to match hardware behavior. If you use them incorrectly, you may end up with simulation-time races and a mismatch with the hardware, in which case your verification effort is wasted.
There is a set of common practices used in the industry to handle this situation. One is to use non-blocking assignments for the outputs of all sequential devices. This avoids races and makes sure that the sequential flops and latches pipe data the same way as in real hardware.
Hence, the one-cycle delay caused by non-blocking assignments is a myth. If you design sequential flops where the second one latches the data from the first, then the data will move across the flops sequentially, one flop per cycle:
clk ------v----------------v
in1 -> [flop1] -> out1 -> [flop2] -> out2
        in1   out1   out2
clk 1    1     1      0
clk 3    1     1      1
clk 4    0     0      1
clk 5    0     0      0
In the above example, data is propagated from out1 to out2 on the next clock cycle, which can be expressed in Verilog as
always @(posedge clk)
out1 <= in1;
always @(posedge clk)
out2 <= out1;
Or you can combine those
always @(posedge clk) begin
out1 <= in1;
out2 <= out1;
end
So, the task of your design is to cleanly separate sequential logic from combinatorial logic, and therefore to separate blocks with blocking assignments from blocks with non-blocking assignments.
There are cases where blocking assignments can and must be used inside sequential blocks, as mentioned in the comments: when you use temporary variables to simplify expressions inside a sequential block, provided those variables are never used anywhere else.
Other than that, never mix blocking and non-blocking assignments in a single always block.
Also, the use of 'negedge' is usually discouraged because of synthesis methodologies. Avoid it unless your synthesis methodology does not care.
You should browse around for more information and examples of blocking and non-blocking assignments and their use.
I have a module that does convolution. It takes a data input and a filter input, where the input is an array of 9 numbers. On every posedge of clk the two inputs are multiplied and the product is accumulated, i.e. I add every new product into a register. After every 9 iterations I have to save the result and reset the accumulator, but I have to do it in one clock cycle, since my new data arrives on the next posedge. So the issue I am facing is how to save the result and reset the output without losing data. Please help if you have any suggestions. I should also mention that my conv_module is a sub-module that I will instantiate in a top module, so I have to access all the inputs and outputs from the top level.
This is the code that I've written so far, but it does not work the way I want, because I cannot tap the array of output numbers from the top module.
module mult_conv( input clk,
input rst,
input signed [4:0] a,
input signed[2:0] b,
output reg signed[7:0] out
);
wire signed [7:0] mult;
reg signed [7:0] sum;
reg [3:0] counter;
reg do_write;
reg [7:0] out_top;
assign mult = {{3{a[4]}},a} * {{5{b[2]}},b};
always @(posedge clk or posedge rst)
begin
if (rst)
begin
counter <= 4'h0;
addr <= 'h0;
sum <= 0;
do_write <= 1'b0;
end // rst
else
begin
if (counter == 4'h8)
begin // we have gathered 9 samples
counter <= 4'h0;
// start again so ignore old sum
sum <= mult;
out <= sum;
out_top <= out;
end
else
begin
counter <= counter + 4'h1;
// Add results
sum <= sum + mult;
out <= 0;
out_top <= out_top;
end
// Write signal has to be set one cycle early
do_write = (counter==4'h7);
end // clocked
end // always
endmodule
You have a plethora of errors in that code.
Apart from that, you have a 3-megabit memory of which you use only 1 in 9 locations.
You write out in two places. That does not work.
You use a %9. That cannot be mapped onto hardware.
You have a sel signal which somehow controls your sum.
On top of that, I understand you want to bring the whole memory out.
Your code needs to be drastically rewritten.
But your biggest problem is that you definitely can't make the memory come out. Whatever post-processing you want to do, you have two choices:
1. Process the output data as it appears.
2. Store the data outside the module in a memory and have another process read that memory.
I think only (1) is the correct way because your signal can have infinite length.
As to fixing this code a bit:
Replace the %9 with a counter that counts from 0 to 8.
Process out in a clocked section. See below.
Move the addr and sel generating logic in here. Keep it all together.
Below is the basic code for a 9-sample convolution. I have to ignore 'sel' as I have no idea of its timing. I have also added address generation and a write signal so the result can be stored in an external memory. But I still think you should process the result on the fly.
always @(posedge clk or posedge rst)
begin
if (rst)
begin
counter <= 4'h0;
addr <= 'h0;
sum <= 0;
do_write <= 1'b0;
end // rst
else
begin
if (counter == 4'h8)
begin // we have gathered 9 samples
counter <= 4'h0;
addr <= addr + 1;
// start again so ignore old sum
sum <= mult;
end
else
begin
counter <= counter + 4'h1;
// Add results
sum <= sum + mult;
end
// Write signal has to be set one cycle early
do_write <= (counter==4'h7);
end // clocked
end // always
(Code above was entered on the fly, may contain syntax, typing or other errors!!)
As you can see, the trick is to know when to add to the old result and when to ignore the old sum and start again.
(I spent about 3/4 of an hour on that, so on my normal tariff you would have to pay me $93.75 :-)
I provided the basic code to let you work out the specifics. I did nothing with out but left that to you.
do_write and addr were for a possible memory to pick up the result. Without a memory you can drop addr, but do_write should tell you when a new convolution result is available, in which case you might want to give it a different name, e.g. 'sum_valid'.
I want to control the value of a variable using two switches: one for incrementing the value and the other for decrementing it. How should I change this code?
The error says that the variable counting is unsynthesisable.
I have tried a lot but could not figure out what exactly the problem is.
ERROR:Xst:827 - line 34: Signal counting0 cannot be synthesized, bad synchronous description. The description style you are using to describe a synchronous element (register, memory, etc.) is not supported in the current software release.
library IEEE;
use IEEE.std_logic_1164.ALL;
use IEEE.numeric_std.ALL;
entity counts is
port(
btn_up : in std_logic;
reset : IN STD_LOGIC;
btn_dn : in std_logic;
counted : out std_logic_vector(8 downto 0)
);
end entity counts;
architecture behaviour of counts is
signal counter : std_logic_vector(8 downto 0);
begin
btupprocess : process(btn_up,reset,counter)
variable counting : unsigned(8 downto 0);
begin
counting := unsigned(counter);
if(reset = '1') then
counting := (others => '0');
elsif (rising_edge(btn_up)) then
if(counting > 399) then
counting := counting - 1;
else
counting := counting + 1;
end if;
end if;
counter <= std_logic_vector(counting);
end process;
btndnprocess : process(btn_dn,counter)
variable counting : unsigned(8 downto 0);
begin
counting := unsigned(counter);
if (falling_edge(btn_dn)) then
if(counting < 200) then
counting := counting + 1;
else
counting := counting - 1;
end if;
end if;
counter <= std_logic_vector(counting);
end process;
counted <= counter;
end behaviour;
Although in some cases it is possible to drive a signal from two different processes, there are better approaches in this case.
A possible solution to your problem is:
- add a clock input to your entity; you should probably use a synchronous design
- rewrite your architecture to use three processes, with each process driving a single signal:
  - one process will debounce and detect a rising edge on btn_up; this process will generate the signal btn_up_rising_edge
  - one process will debounce and detect a rising edge on btn_dn; this process will generate the signal btn_dn_rising_edge
  - a third process will read btn_up_rising_edge and btn_dn_rising_edge, and increment or decrement the count as appropriate
- in all three processes, your sensitivity list should contain clock and reset only
You can find an example of an edge detector with a debouncer here: https://electronics.stackexchange.com/questions/32260/vhdl-debouncer-circuit
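The three-process structure described above could be sketched roughly as follows. This is only an outline under stated assumptions: the entity name counts_sync and the clk port are hypothetical, the buttons are assumed to be active high, the 200/399 limits from the question are left out, and the debouncing itself is omitted (each button is only synchronised and edge-detected), so a real design would still need a debouncer such as the one linked above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity; clk is an assumed free-running system clock.
entity counts_sync is
  port (
    clk     : in  std_logic;
    reset   : in  std_logic;
    btn_up  : in  std_logic;
    btn_dn  : in  std_logic;
    counted : out std_logic_vector(8 downto 0)
  );
end entity counts_sync;

architecture rtl of counts_sync is
  signal up_sync, dn_sync   : std_logic_vector(2 downto 0);
  signal btn_up_rising_edge : std_logic;
  signal btn_dn_rising_edge : std_logic;
  signal counter            : unsigned(8 downto 0);
begin
  -- Process 1: synchronise btn_up and emit a one-cycle pulse on its rising edge.
  up_edge : process (clk, reset) is
  begin
    if reset = '1' then
      up_sync            <= (others => '0');
      btn_up_rising_edge <= '0';
    elsif rising_edge(clk) then
      up_sync <= up_sync(1 downto 0) & btn_up;
      if up_sync(2) = '0' and up_sync(1) = '1' then
        btn_up_rising_edge <= '1';
      else
        btn_up_rising_edge <= '0';
      end if;
    end if;
  end process up_edge;

  -- Process 2: the same for btn_dn.
  dn_edge : process (clk, reset) is
  begin
    if reset = '1' then
      dn_sync            <= (others => '0');
      btn_dn_rising_edge <= '0';
    elsif rising_edge(clk) then
      dn_sync <= dn_sync(1 downto 0) & btn_dn;
      if dn_sync(2) = '0' and dn_sync(1) = '1' then
        btn_dn_rising_edge <= '1';
      else
        btn_dn_rising_edge <= '0';
      end if;
    end if;
  end process dn_edge;

  -- Process 3: the only driver of the count.
  count : process (clk, reset) is
  begin
    if reset = '1' then
      counter <= (others => '0');
    elsif rising_edge(clk) then
      if btn_up_rising_edge = '1' and btn_dn_rising_edge = '0' then
        counter <= counter + 1;
      elsif btn_dn_rising_edge = '1' and btn_up_rising_edge = '0' then
        counter <= counter - 1;
      end if;
    end if;
  end process count;

  counted <= std_logic_vector(counter);
end architecture rtl;
```

Because rising_edge is applied only to clk and each signal has a single driver, this style avoids the "bad synchronous description" pattern of using the buttons themselves as clocks.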
I have the following button press logic in my code. I have tried debouncing it using a wait delay, but the compiler will not allow this. I have four push buttons on my FPGA, which the "key" array below reflects:
process(clock)
begin
if rising_edge(clock) then
if(key(3)/='1' or key(2)/='1' or key(1)/='1' or key(0)/='1') then --MY ATTEMPT AT DEBOUNCING
wait for 200 ns; ----MY ATTEMPT AT DEBOUNCING
if (key(3)='1' and key(2)='1' and key(1)='0' and last_key_state="1111" and key(0)='1') then
...
elsif (key(3)='1' and key(2)='1' and key(1)='1' and key(0)='0' and last_key_state="1111") then
...
elsif (key(3)='0' and key(2)='1' and key(1)='1' and key(0)='1' and last_key_state="1111") then
...
elsif (key(3)='1' and key(2)='0' and key(1)='1' and key(0)='1' and last_key_state="1111") then
...
end if;
last_key_state<=key;
end if;
end if;
end process;
Can anyone give some really simple example code showing how I could debounce a setup like the one I have above?
Well, if you think about how you would do this with real electronics, you would probably use a capacitor, which has a charging time. The same idea applies here: figure out how long your switch bounces (in clock cycles, so it depends on your clock speed) and only then set the register.
Simple Example With a 4-Bit Shift Register
So you'd put this between your switch and any other logic blocks
process(clock)
begin
if rising_edge(clock) then --Your clock
SHIFT_PB(2 Downto 0) <= SHIFT_PB(3 Downto 1); --Shifting each cycle
SHIFT_PB(3) <= NOT PB; --PB is the raw, not yet debounced signal
If SHIFT_PB(3 Downto 0)="0000" THEN --once the bounce has settled set the debounced value
PB_DEBOUNCED <= '1';
ELSE
PB_DEBOUNCED <= '0';
End if;
end if;
end process;
It's basically delaying your signal by 4 clock cycles (what you were trying to do with the wait).
Others have shown the way with counters... you also need to synchronise the signal to the clock before feeding it to the counter, otherwise occasionally, the signal will get to different parts of the counter at different times, and the counter will count incorrectly.
Whether this matters depends on the application - if correct operation is important, it is important to synchronise correctly!
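A common way to do the synchronisation mentioned above is to pass each button through two back-to-back registers before any other logic sees it. The sketch below assumes the four active-low keys from the question; the entity name key_synchroniser, the port key_synced and the internal signal names are made up for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two-flop synchroniser for the four push buttons.
entity key_synchroniser is
  port (
    clock      : in  std_logic;
    key        : in  std_logic_vector(3 downto 0);  -- asynchronous, active-low buttons
    key_synced : out std_logic_vector(3 downto 0)   -- safe to use in clocked logic
  );
end entity key_synchroniser;

architecture rtl of key_synchroniser is
  -- Initialised to the idle (released) level of the active-low buttons.
  signal key_meta   : std_logic_vector(3 downto 0) := (others => '1');
  signal key_stable : std_logic_vector(3 downto 0) := (others => '1');
begin
  process (clock) is
  begin
    if rising_edge(clock) then
      key_meta   <= key;       -- first stage: may go metastable
      key_stable <= key_meta;  -- second stage: gives the first stage a cycle to settle
    end if;
  end process;

  key_synced <= key_stable;
end architecture rtl;
```

The debouncing counter or shift register would then read key_synced instead of key, so every part of the debounce logic sees the same value in a given clock cycle.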
You get the error because of the wait: wait for a time is not synthesizable.
I would do it with a simple counter. So you can use the same code for different clock speeds by adjusting the counter.
-- adjust the counter to your specific needs
-- depending on how well your buttons are debounced in hardware
-- you can easily think in ms
signal counter : std_logic_vector(7 DOWNTO 0) := "10000000";
process(clock)
begin
if rising_edge(clock) then --Your clock
if(key(3) = '0') or (key(2) = '0') or (key(1) = '0') or (key(0) = '0') then
start_debouncing <= '1';
key_vector_out <= key(3) & key(2) & key(1) & key(0);
end if;
if(start_debouncing = '1') then
key_vector_out <= "0000";
counter <= std_logic_vector(unsigned(counter) - 1);
end if;
if(counter = "00000000") then
counter <= "10000000";
start_debouncing <= '0';
end if;
end if;
end process;
Your code can produce another problem.
What will happen if your button is released, so that your input is key = "0000"? Right, you never get your output. Perhaps it will work 99 out of 100 times, but you can get a really hard-to-find error.
I would like to delay a signal by several cycles in VHDL, but I have problems using the approach from the question "How to delay a signal for several cycles in VHDL".
Wouldn't I need a registered signal? I mean, something like:
a_store and a_store_registered would be std_logic_vector(cycles_delayed-1 downto 0)
process(clk)
begin
if rising_edge(clk) then
a_store_registered <= a_store;
end if;
end process;
a_out <= a_store_registered(cycles_delayed-1);
process(a_store_registered, a)
begin
a_store <= a_store_registered(size-2 downto 0) & a;
end process;
The solution you link to is a registered signal - the very act of writing to a signal inside a process with a rising_edge(clk) qualifier creates registers.
An even simpler implementation of a delay-line can be had in one line of code (+ another one if you want to copy the high bit to an output)
a_store <= (a_store(a_store'high-1 downto 0) & a) when rising_edge(clk);
a_out <= a_store(a_store'high);
Not sure why I didn't mention this in my answer to the linked question!
I am not sure why you are approaching the problem as you are; there is no need for a second process here. What is wrong with the method suggested in the linked question?
if rising_edge(clk) then
a_store <= a_store(a_store'high-1 downto 0) & a;
a_out <= a_store(a_store'high);
end if;
In this case your input is a and your output is a_out. If you want to make the delay longer, increase the size of a_store by resizing the signal declaration.
If you want to access the intermediate signal for other reasons, you could do this:
a_store <= a_store_registered(cycles_delayed-2 downto 0) & a;
process(clk)
begin
if rising_edge(clk) then
a_store_registered <= a_store;
end if;
end process;
a_out <= a_store_registered(cycles_delayed-1);
Remember that you can use the foo'delayed(N ns) attribute or foo <= sig after N ns in simulations.
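For simulation only, the two forms mentioned above can be tried in a small testbench like the following. The testbench name, the signal names and the 20 ns delay are arbitrary choices for illustration, and none of this is synthesizable.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical simulation-only testbench.
entity sim_delay_tb is
end entity sim_delay_tb;

architecture sim of sim_delay_tb is
  signal sig       : std_logic := '0';
  signal foo_after : std_logic;
  signal foo_attr  : std_logic;
begin
  -- Stimulus: a single pulse that is wide compared with the delay.
  sig <= '1' after 50 ns, '0' after 150 ns;

  -- Delayed copy using "after"; transport keeps even very short pulses.
  foo_after <= transport sig after 20 ns;

  -- Delayed copy using the 'delayed attribute.
  foo_attr <= sig'delayed(20 ns);
end architecture sim;
```

Without the transport keyword the after form uses inertial delay, which swallows pulses shorter than the delay itself.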