Lattice iCE40-HX8K Board - UART - verilog

I have the following Verilog code for my Lattice iCE40-HX8K Board:
uart.v:
module uart(input clk, output TXD);
reg [3:0] count;
reg [9:0] data;
reg z;
initial begin
data[9:0] = 10'b1000000000; // Startbit = 1, Stopbit = 0
z = 0;
end
always @(posedge clk)
begin
if(count == 1250) //9600x per Second (1250) = Baudrate
begin
count <= 0;
TXD = data[z];
z = z + 1;
if(z == 10)
begin
z = 0;
end
else
begin
end
end
else
begin
count <= count + 1;
end
end
endmodule
For receiving the UART-Data I use gtkterm under Ubuntu 14.04.
I have set the baudrate in gtkterm to 9600.
When I program my FPGA with this code, I receive a single hex "00" each time I program it
(regardless of the 8 data bits).
Can anybody give me a hint what is wrong?
Thank you for your support.

There are at least two obvious problems with your design:
Your count is only 4 bits wide, thus it cannot count to 1250. It must be at least 11 bits wide to be able to count to 1250.
Also your z is only 1 bit wide, so it can only hold the values 0 and 1. It must be at least 4 bits wide to be able to count to 10.
You should also read up on blocking vs. non-blocking assignments. The way you use blocking assignments in sequential logic can lead to race conditions in the Verilog simulation model.
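As a rough sketch of those fixes (not a drop-in replacement), the transmitter could look like the following. It assumes a 12 MHz clock, which is what makes 1250 clocks per bit equal 9600 baud, and standard UART framing in which the line idles high, the start bit is 0 and the stop bit is 1. The module name uart_tx_sketch, the CLKS_PER_BIT parameter and the 8'h41 payload are placeholders, not part of the original code.

```verilog
// Sketch only. Assumption: 12 MHz clock, so 12_000_000 / 9600 = 1250 clocks per bit.
module uart_tx_sketch(input clk, output reg TXD);
    localparam CLKS_PER_BIT = 1250;

    reg [10:0] count = 0;   // 11 bits: 2^11 = 2048 > 1250
    reg [3:0]  z     = 0;   // 4 bits: indexes frame bits 0..9

    // Frame is sent from bit 0 upwards: start bit (0), 8 data bits LSB first, stop bit (1).
    // 8'h41 ("A") is a placeholder payload.
    wire [9:0] frame = {1'b1, 8'h41, 1'b0};

    // Note: TXD has no defined value until the first bit period has elapsed,
    // so the receiver may misread the very first frame.
    always @(posedge clk) begin
        if (count == CLKS_PER_BIT - 1) begin
            count <= 0;
            TXD   <= frame[z];                     // non-blocking in clocked logic
            z     <= (z == 4'd9) ? 4'd0 : z + 4'd1;
        end else begin
            count <= count + 11'd1;
        end
    end
endmodule
```

Comparing against CLKS_PER_BIT - 1 rather than 1250 keeps each bit exactly 1250 clocks long, because the counter starts at 0.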
You should always write testbenches for your HDL code and simulate it before trying to run it in hardware. A testbench for your design would be as easy as:
module uart_testbench;
reg clk = 1;
always #5 clk = ~clk;
uart uut (
.clk(clk)
);
initial begin
$dumpfile("uart.vcd");
$dumpvars(0, uut);
repeat (1000000) @(posedge clk);
$finish;
end
endmodule

Related

Verilog code that copies an input square wave signal to an output signal?

I was wondering if someone may be able to help me? I was not sure how to word the question, but I am basically trying to write a program that generates a square wave output signal from a square wave input signal, matching the duty cycle and frequency of the input signal. Basically, the output just copies the input. To summarize what I am saying graphically, here is a picture I made:
Link to diagram
It is not my final goal, but it would be enough to get me going. I am having a very hard time figuring out how to work with inputs. I have a signal generator making the input square wave signal, and am sending it into an input pin. I've tried calculating the duty cycle mathematically, and then just trying to assign the output to a reg that is set equal to the input on every rising edge of the clock signal but it didn't work.
Here's my code. It has extra functionality of generating a 1 Hz signal, but that is only from learning earlier how to create the pwm. You can ignore "pwm_reg" and the "pwm" output. The "pwm2" output is intended to copy "apwm" input:
`timescale 1ns / 1ps
module duty_cycle_gen(
input clk,
input rst_n,
input apwm,
output pwm,
output pwm2
);
// Input clock is 250MHz
localparam CLOCK_FREQUENCY = 250000000;
// Counter for toggling of clock
integer counter = 0;
reg pwm_reg = 0;
assign pwm = pwm_reg;
reg apwm_val;
always @(posedge clk) begin
if (!rst_n) begin
counter <= 8'h00;
pwm_reg <= 1'b0;
end
else begin
apwm_val <= apwm;
// If counter is zero, toggle pwm_reg
if (counter == 8'h00) begin
pwm_reg <= ~pwm_reg;
// Generate 1Hz Frequency
counter <= CLOCK_FREQUENCY/2 - 1;
end
// Else count down
else
counter <= counter - 1;
end
$display("counter : %d", counter);
end
assign pwm2 = apwm_val;
endmodule
Here is a simple example with a test bench. (I like to use a small delay when assigning, #1, to help capture causality for debugging purposes):
module example(
input wire clk,
input wire in,
output reg out);
always @(posedge clk)
begin
out <= #1 in;
end
endmodule // example
module example_test();
reg chip__clk;
reg chip__in;
wire chip__out;
reg [10:0] count;
example ex(
chip__clk,
chip__in,
chip__out);
initial
begin
$dumpvars();
count <= #1 0;
end
always @(count)
begin
count <= #1 (count + 1);
if (count == 1000)
begin
$display("RESULT=PASS:0 # done");
$finish_and_return(0);
end
if ((count == 60) & (chip__out != 1))
begin
$display("RESULT=FAIL:1 # chip.out not raised");
$finish_and_return(1);
end
if ((count == 30) & (chip__out != 0))
begin
$display("RESULT=FAIL:1 # chip.out not lowered");
$finish_and_return(1);
end
chip__in <= #1 count[5];
chip__clk <= #1 count[1];
end
endmodule // example_test
It works by treating the in signal as something that can be thought of as constant over the timescale of the higher-frequency clk.
If your in signal is an external signal that might be noisy, you can attempt to stabilize it, at the cost of a small latency, by passing it through a small FIFO (a two-stage shift register) clocked by the high-frequency clk:
module example(
input wire clk,
input wire in,
output reg out);
reg [1:0] buffer;
always @(posedge clk)
begin
out <= #1 buffer[1];
buffer[1] <= #1 buffer[0];
buffer[0] <= #1 in;
end
endmodule // example

Segmentation fault: 11 while compiling with yosys

I'm trying to implement a Verilog module that writes to a Lattice UP5K SPRAM hardware core using the Yosys SB_SPRAM256KA block. Note that there is little or no documentation, and there are few examples, on how to use this black-box block. The main purpose is to implement an echo or delay in a digital audio system.
I have two clocks, the frame clock lrclk and the bit clock bclk; each period of the frame clock contains 64 bit-clock periods.
Using a sensitivity list on bclk, I tried to cycle through a read/write process on the SPRAM. I implemented a state machine that does the following:
S1: Puts the input data on the input port of the RAM, asserts the write_enable signal and sets the RAM address to the write pointer.
S2: (Data is supposed to be written.) Deasserts the write_enable signal and sets the RAM address to the read pointer.
S3: (Data is supposed to be loaded into the output buffer of the RAM.) Sets the module output from the RAM output buffer and resets the state machine.
This is the module code:
module echo(
input wire bclk,
input wire lrclk,
input wire [DATALEN-1:0] right_in,
output reg [DATALEN-1:0] right_out,
);
localparam ADDRLEN = 14;
localparam DATALEN = 16;
reg [ADDRLEN-1:0] rd_ptr = 0;
reg [ADDRLEN-1:0] wr_ptr = (2**ADDRLEN)/2;
reg [2:0] sm = 0;
reg wren;
reg [ADDRLEN-1:0] memaddr;
reg [DATALEN-1:0] datain;
reg [DATALEN-1:0] dataout;
SB_SPRAM256KA M1 (
.ADDRESS(memaddr),
.DATAIN(datain),
.MASKWREN(4'b1111),
.WREN(wren),
.CHIPSELECT(1'b1),
.CLOCK(bclk),
.STANDBY(1'b0),
.SLEEP(1'b0),
.POWEROFF(1'b0),
.DATAOUT(dataout)
);
always @(posedge lrclk) begin
sm <= 1;
end
always @(posedge bclk) begin
if (sm === 1) begin
datain <= right_in;
wren <= 1;
memaddr <= wr_ptr;
sm <= 2;
end else if (sm === 2) begin
wren <= 0;
memaddr <= rd_ptr;
sm <= 3;
end else if (sm === 3) begin
right_out <= dataout;
wr_ptr <= (wr_ptr + 1);
rd_ptr <= (rd_ptr + 1);
sm <= 0;
end
end
endmodule
I expected errors at synthesis time or a malfunctioning implementation, but instead I get this Yosys error:
5.11. Executing WREDUCE pass (reducing word size of cells).
Removed top 31 bits (of 32) from port B of cell main.$add$main.v:70$2 ($add).
Removed top 1 bits (of 32) from port Y of cell main.$add$main.v:70$2 ($add).
Removed top 2 bits (of 3) from FF cell main.$techmap\E1.$procdff$117 ($dff).
make: *** [main.bin] Segmentation fault: 11
.POWEROFF(1'b0) should be 1'b1 right?
See the "iCE40 SPRAM Usage Guide" for more information.
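A minimal instantiation sketch with that change applied is shown below. It assumes the port names used in the question's SB_SPRAM256KA instance and that POWEROFF is active low (so 1'b1 keeps the memory powered), which should be checked against the usage guide. The wrapper module spram_wrapper_sketch and its port names are hypothetical. DATAOUT is connected to a wire here, since the output of an instantiated cell cannot drive a reg.

```verilog
// Sketch only: thin wrapper around the iCE40 UltraPlus SPRAM primitive.
// Wrapper name and port names are placeholders, not from the original post.
module spram_wrapper_sketch(
    input  wire        clk,
    input  wire        wren,
    input  wire [13:0] addr,   // 16K words of 16 bits each
    input  wire [15:0] din,
    output wire [15:0] dout    // wire, because it is driven by the cell
);
    SB_SPRAM256KA ram (
        .ADDRESS(addr),
        .DATAIN(din),
        .MASKWREN(4'b1111),    // write all four nibbles
        .WREN(wren),
        .CHIPSELECT(1'b1),
        .CLOCK(clk),
        .STANDBY(1'b0),
        .SLEEP(1'b0),
        .POWEROFF(1'b1),       // assumption: active low, 1 = powered on
        .DATAOUT(dout)
    );
endmodule
```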

Verilog: wait for module logic evaluation in an always block

I want to use the output of another module inside an always block.
Currently the only way to make this code work is by adding #1 after the pi_in assignment so that enough time has passed to allow Pi to finish.
Relevant part from module pLayer.v:
Pi pi(pi_in,pi_out);
always @(*)
begin
for(i=0; i<constants.nSBox; i++) begin
for(j=0; j<8; j++) begin
x = (state_value[(constants.nSBox-1)-i]>>j) & 1'b1;
pi_in = 8*i+j;#1; /* wait for pi to finish */
PermutedBitNo = pi_out;
y = PermutedBitNo>>3;
tmp[(constants.nSBox-1)-y] ^= x<<(PermutedBitNo-8*y);
end
end
state_out = tmp;
end
Module Pi.v:
`include "constants.v"
module Pi(in, out);
input [31:0] in;
output [31:0] out;
reg [31:0] out;
always @* begin
if (in != constants.nBits-1) begin
out = (in*constants.nBits/4)%(constants.nBits-1);
end else begin
out = constants.nBits-1;
end
end
endmodule
Delays should not be used in the final implementation, so is there another way to do this without using #1?
In essence, I want PermutedBitNo = pi_out to be evaluated only after the Pi module has finished its job with pi_in (= 8*i+j) as input.
How can I block this line until Pi has finished?
Do I have to use a clock? If that's the case, please give me a hint.
update:
Based on Krouitch's suggestions, I modified my modules. Here is the updated version:
From pLayer.v:
Pi pi(.clk (clk),
.rst (rst),
.in (pi_in),
.out (pi_out));
counter c_i (clk, rst, stp_i, lmt_i, i);
counter c_j (clk, rst, stp_j, lmt_j, j);
always @(posedge clk)
begin
if (rst) begin
state_out = 0;
end else begin
if (c_j.count == lmt_j) begin
stp_i = 1;
end else begin
stp_i = 0;
end
// here, the logic starts
x = (state_value[(constants.nSBox-1)-i]>>j) & 1'b1;
pi_in = 8*i+j;
PermutedBitNo = pi_out;
y = PermutedBitNo>>3;
tmp[(constants.nSBox-1)-y] ^= x<<(PermutedBitNo-8*y);
// at end
if (i == lmt_i-1)
if (j == lmt_j) begin
state_out = tmp;
end
end
end
endmodule
module counter(
input wire clk,
input wire rst,
input wire stp,
input wire [32:0] lmt,
output reg [32:0] count
);
always @(posedge clk or posedge rst)
if(rst)
count <= 0;
else if (count >= lmt)
count <= 0;
else if (stp)
count <= count + 1;
endmodule
From Pi.v:
always @* begin
if (rst == 1'b1) begin
out_comb = 0;
end
if (in != constants.nBits-1) begin
out_comb = (in*constants.nBits/4)%(constants.nBits-1);
end else begin
out_comb = constants.nBits-1;
end
end
always @(posedge clk) begin
if (rst)
out <= 0;
else
out <= out_comb;
end
That's a nice piece of software you have here...
The fact that this language describes hardware is not helping then.
In Verilog, what you write simulates in zero time, which means that your loop over i and j also completes in zero time. That is why you only see a result when you force the loop to wait for 1 time unit with #1.
So yes, you have to use a clock.
As I see it, for your system to work you will have to implement counters for i and j.
A synchronous counter with an asynchronous, active-low reset can be written like this:
`define SIZE 10
module counter(
input wire clk,
input wire rst_n,
output reg [`SIZE-1:0] count
);
always @(posedge clk or negedge rst_n)
if(~rst_n)
count <= `SIZE'd0;
else
count <= count + `SIZE'd1;
endmodule
You specify that you want to sample pi_out only when pi_in is processed.
In a digital design it means that you want to wait one clock cycle between the moment when you are sending pi_in and the moment when you are reading pi_out.
The best solution, in my opinion, is to make your pi module sequential and then consider pi_out as a register.
To do that I would do the following:
module Pi(clk, in, out);
input clk;
input [31:0] in;
output [31:0] out;
reg [31:0] out;
wire clk;
// out_comb is assigned inside an always block, so it must be a reg
reg [31:0] out_comb;
// Combinatorial part
always @* begin
if (in != constants.nBits-1) begin
out_comb = (in*constants.nBits/4)%(constants.nBits-1);
end else begin
out_comb = constants.nBits-1;
end
end
// Sequential part: out lags out_comb by one clock cycle
always @(posedge clk)
out <= out_comb;
endmodule
In short, if you use counters for i and j together with this last Pi module, this is what will happen:
At a new clock cycle, i and j change, and pi_in changes accordingly at the same time (in simulation).
At the next clock cycle, out_comb is stored in out, so the new value of pi_out appears one clock cycle after pi_in.
EDIT
First of all, when writing (synchronous) processes, I would advise you to deal with only one register per process. It will make your code clearer and easier to understand and debug.
Another tip is to separate combinatorial circuitry from sequential circuitry. It will also make your code clearer and easier to understand.
If I take the example of the counter I wrote previously, it would look like this:
`define SIZE 10
module counter(
input wire clk,
input wire rst_n,
output reg [`SIZE-1:0] count
);
//Two ways to write the combinatorial function (use only one of them)
//First one
wire [`SIZE-1:0] count_next;
assign count_next = count + `SIZE'd1;
//Second one (replaces the two lines above)
//reg [`SIZE-1:0] count_next;
//always @*
//count_next = count + `SIZE'd1;
always @(posedge clk or negedge rst_n)
if(~rst_n)
count <= `SIZE'd0;
else
count <= count_next;
endmodule
Here I see why you get one more cycle than expected: you put the combinatorial circuitry that controls your Pi module inside your synchronous process. This means the following happens:
On the first clk positive edge, i and j are evaluated.
On the next cycle, pi_in is evaluated.
On the next cycle, pi_out is captured.
So it makes sense that it takes 2 cycles.
To correct that, you should take the 'logic' part out of the synchronous process. As you stated in your comments, it is logic, so it should not be in the synchronous process.
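One way this separation might look is sketched below, assuming i and j come from the counters and pi_out comes from the registered Pi module. The module name player_index_sketch and the 32-bit widths are placeholders chosen for illustration, not taken from the question.

```verilog
// Sketch only: the index computation is combinatorial, the capture is clocked.
module player_index_sketch(
    input  wire        clk,
    input  wire        rst,
    input  wire [31:0] i,             // value of the i counter
    input  wire [31:0] j,             // value of the j counter
    input  wire [31:0] pi_out,        // registered output of Pi
    output wire [31:0] pi_in,         // combinatorial input to Pi
    output reg  [31:0] PermutedBitNo
);
    // Combinatorial: pi_in follows i and j within the same clock cycle.
    assign pi_in = 8*i + j;

    // Sequential: one register per process, non-blocking assignment.
    always @(posedge clk)
        if (rst)
            PermutedBitNo <= 32'd0;
        else
            PermutedBitNo <= pi_out;
endmodule
```

With pi_in outside the clocked process, the only register between the counters and PermutedBitNo's capture is the one inside Pi.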
Hope it helps

CPLD Breathing LED with flexible set points

This is my first post and my first attempt at using a PLD.
I have written some code to make a breathing LED with 7 set points. The code produces a pwm output according to the first set point. It then slowly increases/decreases the pwm towards the next set point (7 in total).
The code works but I think it can be done better as I need to put 16 instantiations of this into a Lattice 4256 CPLD (not possible with my code).
I am keen to learn how a professional Verilog programmer would tackle this.
Many thanks in advance for your support.
PWM Generation
module LED_breath (led, tmr_clk);
output reg led;
input tmr_clk;
reg [7:0] cnt;
reg [6:0] pwm_cnt;
reg [6:0] pwm_val;
reg [2:0] pat_cnt;
reg [9:0] delay_cnt;
reg [6:0] cur_pat;
reg [6:0] nxt_pat;
parameter pattern = {7'h00, 7'h00, 7'h00, 7'h00, 7'h00, 7'h00, 7'h00, 7'h00};
always @(posedge tmr_clk)
begin
pwm_cnt = cnt[7] ? ~cnt[6:0] : cnt[6:0]; //Generate triangle wave
if(pwm_cnt > pwm_val) //Generate pwm
led <= 1'b0;
if(pwm_cnt < pwm_val)
led <= 1'b1;
cnt = cnt + 1;
end
always @(posedge tmr_clk) //breathing pattern
begin
if(!delay_cnt) //Add delay
begin
cur_pat <= ((pattern >> (7*pat_cnt)) & 7'b1111111); //Find correct pattern No. from parameter list
if((pat_cnt+1) == 8) //Check for last pattern - overflow, set to 0
nxt_pat <= (pattern & 7'b1111111);
else
nxt_pat <= ((pattern >> (7*(pat_cnt+1))) & 7'b1111111);
if(pwm_val == nxt_pat) //If pwm is at max or min increment count to get next pattern
pat_cnt <= pat_cnt + 1;
if(cur_pat <= nxt_pat) //Current pattern < next pattern, count up
pwm_val <= pwm_val + 1;
if(cur_pat >= nxt_pat) //Current pattern < next pattern, count down
pwm_val <= pwm_val - 1;
end
delay_cnt <= delay_cnt + 1;
end
endmodule
module top (led_0, led_1, led_2, led_3);
output led_0;
output led_1;
output led_2;
output led_3;
defparam I1.TIMER_DIV = "128";
OSCTIMER I1 (.DYNOSCDIS(1'b0), .TIMERRES(1'b0), .OSCOUT(osc_clk), .TIMEROUT(tmr_clk));
LED_breath #(.pattern({7'h20, 7'h70, 7'h50, 7'h70, 7'h40, 7'h10, 7'h60, 7'h10})) led_A(
.led (led_0),
.tmr_clk (tmr_clk)
);
LED_breath #(.pattern({7'h70, 7'h10, 7'h30, 7'h20, 7'h60, 7'h40, 7'h70, 7'h10})) led_B(
.led (led_1),
.tmr_clk (tmr_clk)
);
LED_breath #(.pattern({7'h10, 7'h30, 7'h10, 7'h18, 7'h40, 7'h50, 7'h30, 7'h60})) led_C(
.led (led_2),
.tmr_clk (tmr_clk)
);
LED_breath #(.pattern({7'h50, 7'h70, 7'h40, 7'h50, 7'h40, 7'h70, 7'h60, 7'h70})) led_D(
.led (led_3),
.tmr_clk (tmr_clk)
);
endmodule
Can you explain a bit what you are trying to achieve in this always block?
always @(posedge tmr_clk)
I think you're using a fixed frequency and changing the duty cycle to get the desired breathing effect.
1) Is my thinking correct?
2) If yes, how do you decide when to change the pattern?

Verilog Inter-FPGA SPI Communication

I am trying to communicate between two Xilinx Spartan 3e FPGAs using SPI communication and GPIO pins. The goal is to have a master-slave communication working but for now I am just sending data from Master to Slave and trying to see if the data received is correct.
This is the Master code that sends 16 bits of data to Slave in serial format. After checking on the scope numerous times it seems correct.
module SPI_MASTER_SEND(
input CLK_50MHZ,
input [1:0] ID_user,
input [15:0] DATA_TO_SEND,
output reg SData,
output SCLK,
output notCS
);
parameter max = 20; // max-counter size
reg [6:0]div_counter;
wire [6:0] data_count;
assign data_count[6:0] = div_counter[6:0];
reg CLOCK;
reg Clk_out;
reg CompleteB;
//have the notCS be low for 20 pulses, and hi for 20 pulses.
//sends 16 bits of data during low pulse
always @(posedge CLOCK) begin
if (div_counter == max-1)
begin
div_counter <= 0;
Clk_out <= ~Clk_out;
end
else
begin
div_counter <= div_counter + 1;
end
end
assign notCS = Clk_out;
reg flag;
assign SCLK = flag&&CLOCK; //Clock when notCS is down for 16 pulses
always @(posedge CLOCK) // Parallel to Serial
begin
if (data_count >= 7'd3 && data_count < 7'd18 && notCS ==0)
begin
SData <= DATA_TO_SEND[18-data_count];
flag <=1;
CompleteB<=0;
end
else if (data_count == 7'd18 && notCS ==0)
begin
flag <=1;
SData<=DATA_TO_SEND[0];
CompleteB<=1;
end
else
begin
CompleteB<=0;
flag<=0;
SData <= 0;
end
end
endmodule
This is the code on the Slave receiving end, I check the data on the falling edge of the clock (have tried posedge too) to avoid any timing conflicts.
The Clock,notCS, and SI (serial in) are all coming from the master via gpio pins
module SPI_COMM_SLAVE(CLK,SI,notCS,outputPO,ID_user);
input CLK,SI,notCS;
input [1:0] ID_user;
reg [15:0] PO;
output reg [15:0] outputPO;
reg CompleteB;
reg C;
reg [5:0] cnt;
initial cnt[5:0] = 6'b000000;
always @(negedge CLK)
begin
if (cnt < 6'd15)
begin
PO[15-cnt] <= SI;
cnt <= cnt + 1'b1;
CompleteB<=0;
end
else if (cnt == 6'd15)
begin
PO[0] <= SI;
cnt<=6'b000000;
CompleteB <=1;
end
else
begin
cnt <= cnt;
CompleteB<=0;
end
end
always @(*) begin
if(CompleteB == 1)
outputPO[15:0] <= PO[15:0];
else
outputPO[15:0]<=outputPO[15:0];
end
endmodule
After outputting the "outputPO" to the DAC it gives a bunch of garbage and is clearly not a single value.
Thank you
To debug an FPGA problem like this you should absolutely simulate the design. If you have not already, create a testbench that initiates a write in the master module and connects the slave module as it would be connected in the system. Check that the waveforms match the behavior you expect. Debugging in hardware is not effective until this simulation is working. If you do not have access to a paid simulator, there are free Verilog simulators available. One suggestion is to build this simulation environment in EDA Playground, so that you can share it here as part of the problem description.
Secondly, I noticed a number of things that could improve the quality and readability of your code, which in turn makes debugging easier:
Indent code inside blocks (begin/end pairs, etc).
Always use non-blocking assignments inside clocked processes and blocking assignments in combinatorial blocks. For example, the non-blocking statements in your combinatorial process assigning outputPO in SPI_COMM_SLAVE are wrong. This can lead to simulation not matching synthesized results.
Latches are not recommended for FPGA designs. SPI_COMM_SLAVE will synthesize a 16-bit latch for outputPO. Consider making this signal a register.
Your master architecture looks more complex than it needs to be. Consider separating the functionality that initiates the SPI transactions (div_counter) from the logic that does the actual SPI transaction.
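To illustrate the points about non-blocking assignments and avoiding the latch, here is a rough sketch of a receiver in which outputPO is a clocked register. It assumes the master sends exactly 16 bits per frame, MSB first, while notCS is low, and that SI is sampled on the falling edge of CLK as in the question. The module name spi_slave_rx_sketch and the internal names shift and cnt are placeholders.

```verilog
// Sketch only: 16-bit SPI receiver with a registered output (no latch).
module spi_slave_rx_sketch(
    input  wire        CLK,      // SPI clock from the master
    input  wire        SI,       // serial data from the master
    input  wire        notCS,    // active-low chip select
    output reg  [15:0] outputPO
);
    reg [14:0] shift;            // holds the first 15 bits of the frame
    reg [3:0]  cnt;              // counts received bits, 0..15

    // Bit counter: cleared asynchronously while the slave is deselected,
    // so every frame starts from bit 0 even if the clock is gated off.
    always @(negedge CLK or posedge notCS)
        if (notCS)
            cnt <= 4'd0;
        else
            cnt <= cnt + 4'd1;

    // Shift register and output register: plain clocked registers.
    always @(negedge CLK)
        if (!notCS) begin
            shift <= {shift[13:0], SI};
            if (cnt == 4'd15)
                outputPO <= {shift, SI};   // the 16th bit completes the word
        end
endmodule
```

Because outputPO only changes on a clock edge when the 16th bit arrives, it holds the previous word for the rest of the time without needing a latch.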
