I have code similar to the following:
module testModule(
input Clk,
input [2:0] Counter,
output logic [1:0] OutVar1,
output logic [1:0] OutVar2
);
localparam logic [7:0] mask = 8'h50;
// CODE 1
always_ff @(posedge Clk) begin
case (mask[{Counter[1:0], 1'b0} +: 2])
2'h0 : OutVar1 <= 2'h0;
2'h1 : OutVar1 <= 2'h1;
2'h2 : OutVar1 <= 2'h2;
2'h3 : OutVar1 <= 2'h3;
default: OutVar1 <= 2'hX;
endcase
end
// CODE 2
always_ff @(posedge Clk) begin
case (mask[(Counter[1:0]<<1) +: 2])
2'h0 : OutVar2 <= 2'h0;
2'h1 : OutVar2 <= 2'h1;
2'h2 : OutVar2 <= 2'h2;
2'h3 : OutVar2 <= 2'h3;
default: OutVar2 <= 2'hX;
endcase
end
endmodule
Counter is an input that goes 0, 2, 4, 6, 0, 2, 4, etc.
I expected CODE 1 and CODE 2 to behave the same, but when Counter is 2 or 6 (Counter[1:0] is 2) I hit case 2'h1 in CODE 1 (correct) and case 2'h0 in CODE 2 (wrong).
I have not yet checked the behaviour when Counter goes 0, 1, ..., 7, 0, 1, etc.
I do not have a standalone testbench because this code is part of a large project. I saw the problem in the waveforms after simulation.
What am I missing?
I suspect you're missing that only 2 bits are used to calculate the index in "CODE 2", because the index is a so-called self-determined expression. Verilog takes the expression:
Counter[1:0]<<1
and needs to decide how many bits to use for the result. Because nothing outside the expression influences its width, it looks at how many bits there are on the left-hand side of the shift operator (2) and uses that width for the result. How could it do anything else? The number of bits on the right-hand side (32) is basically irrelevant (unless you think Verilog should use 2^31-1 bits for the result!). So you get an overflow: the most significant bit of the shift result is truncated, and 2'b10 << 1 evaluates to 2'b00.
See this answer here.
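The width rule can be checked in isolation with a few lines of simulation-only code. This is a minimal sketch, not the asker's module: the module name and the sel_* registers are hypothetical, while mask and Counter mirror the question. The third expression shows one possible fix, widening the left operand to 3 bits before the shift.

```verilog
// Simulation-only sketch; shift_width_demo and sel_* are hypothetical names.
module shift_width_demo;
  reg [7:0] mask;
  reg [2:0] Counter;
  reg [1:0] sel_shift, sel_concat, sel_wide;

  initial begin
    mask    = 8'h50;
    Counter = 3'd2;
    // Self-determined index: the shift result is 2 bits wide, so 2'b10 << 1 is 2'b00.
    sel_shift  = mask[(Counter[1:0] << 1) +: 2];
    // The concatenation is 3 bits wide, so {2'b10, 1'b0} is 3'b100 and no bit is lost.
    sel_concat = mask[{Counter[1:0], 1'b0} +: 2];
    // Widening the left operand to 3 bits before shifting also keeps the top bit.
    sel_wide   = mask[({1'b0, Counter[1:0]} << 1) +: 2];
    $display("shift=%0d concat=%0d widened=%0d", sel_shift, sel_concat, sel_wide);
  end
endmodule
```

With Counter[1:0] equal to 2, the shifted index selects mask[1:0] while the other two select mask[5:4], which matches the mismatch described in the question.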
I learned two ways of writing a pipeline (unblocking and blocking), and I wonder which is better.
My personal opinion is that the second one is tedious, and I don't understand why so many wires are needed.
Also, is there a standard style (template) for writing a pipeline in Verilog, like there is for an FSM?
Thanks in advance.
module simplePipeline
#(
parameter WIDTH = 100
)
(
input clk,
input [WIDTH - 1 : 0] datain,
output [WIDTH - 1: 0] dataout
);
reg [WIDTH - 1 : 0] piprData1;
reg [WIDTH - 1 : 0] piprData2;
reg [WIDTH - 1 : 0] piprData3;
always @(posedge clk) begin
piprData1 <= datain;
end
always @(posedge clk) begin
piprData2 <= piprData1;
end
always @(posedge clk) begin
piprData3 <= piprData2;
end
assign dataout = piprData3;
endmodule
module blockingPipeline
#(
parameter WIDTH = 100
)
(
input clk,
input rst,
input validIn,
input [WIDTH - 1 : 0] dataIn,
input outAllow,
output wire validOut,
output wire [WIDTH - 1 : 0] dataOut
);
reg pipe1Valid;
reg [WIDTH - 1 : 0] pipe1Data;
reg pipe2Valid;
reg [WIDTH - 1 : 0] pipe2Data;
reg pipe3Valid;
reg [WIDTH - 1 : 0] pipe3Data;
/*------------------------PIPE1 LOGIC------------------------*/
wire pipe1AllowIn;
wire pipe1ReadyGo;
wire pipe1ToPipe2Valid;
assign pipe1ReadyGo = 1'b1;
assign pipe1AllowIn = ! pipe1Valid || pipe1ReadyGo && pipe2AllowIn;
assign pipe1ToPipe2Valid = pipe1Valid && pipe1ReadyGo;
always @(posedge clk)begin
if( rst ) begin
pipe1Valid <= 1'b0;
end
else if(pipe1AllowIn)begin
pipe1Valid <= validIn;
end
if(validIn && pipe1AllowIn)begin
pipe1Data <= dataIn;
end
end
/*------------------------PIPE2 LOGIC------------------------*/
wire pipe2AllowIn;
wire pipe2ReadyGo;
wire pipe2ToPipe3Valid;
assign pipe2ReadyGo = 1'b1;
assign pipe2AllowIn = ! pipe2Valid || pipe2ReadyGo && pipe3AllowIn;
assign pipe2ToPipe3Valid = pipe2Valid && pipe2ReadyGo;
always @(posedge clk)begin
if( rst ) begin
pipe2Valid <= 1'b0;
end
else if(pipe2AllowIn)begin
pipe2Valid <= pipe1ToPipe2Valid;
end
if(pipe1ToPipe2Valid && pipe2AllowIn)begin
pipe2Data <= pipe1Data;
end
end
/*------------------------PIPE3 LOGIC------------------------*/
wire pipe3AllowIn;
wire pipe3ReadyGo;
assign pipe3ReadyGo = 1'b1;
assign pipe3AllowIn = ! pipe3Valid || pipe3ReadyGo && outAllow;
always @(posedge clk)begin
if( rst ) begin
pipe3Valid <= 1'b0;
end
else if(pipe3AllowIn)begin
pipe3Valid <= pipe2ToPipe3Valid;
end
if(pipe2ToPipe3Valid && pipe3AllowIn)begin
pipe3Data <= pipe2Data;
end
end
assign validOut = pipe3Valid && pipe3ReadyGo;
assign dataOut = pipe3Data;
endmodule
The problem with the first version is that it has no clock gating or enable at all. Unless your clock is gated at a higher level or the pipeline is used every cycle, you will waste a lot of power by (unnecessarily) toggling each stage of the pipeline every cycle.
As good practice, the second one seems "better" in the sense that when you design a piece of hardware you usually want to offer plenty of control. Generally speaking, since your code will be implemented in silicon forever (or in your FPGA with painful reconfiguration), not having enough control signals would be a real problem, because you can't really add them afterwards.
As an example, you already mentioned in the comments that you'll have to stall the pipeline, so of course you need more wires to do it. You will also need to reset the pipeline sometimes; this is the purpose of the rst signal.
However, more signals mean more silicon area (or more FPGA resources), which generally comes at a higher price. One can argue that one or two extra wires will not make much difference for a chip, which is true, but if you reuse your pipeline a thousand times in the circuit the cost becomes much more significant.
For me, the first implementation is relevant only under really strong optimization requirements at circuit level, as for an ASIC, where you know exactly the overall behavior you want.
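A possible middle ground between the two versions, which is not taken from the thread, is a single global enable that stalls every stage at once. The sketch below assumes a hypothetical en input and module name; it trades the per-stage handshake of the second version for far fewer wires, and synthesis tools can often map such an enable onto clock-enable pins or clock gates.

```verilog
// Sketch only: enabledPipeline and en are hypothetical, not from the question.
module enabledPipeline
#(
  parameter WIDTH = 100
)
(
  input  wire               clk,
  input  wire               rst,
  input  wire               en,       // advance all stages when high, hold when low
  input  wire               validIn,
  input  wire [WIDTH-1:0]   dataIn,
  output wire               validOut,
  output wire [WIDTH-1:0]   dataOut
);
  reg [2:0]       valid;              // one valid bit per stage
  reg [WIDTH-1:0] data [0:2];         // one data register per stage

  always @(posedge clk) begin
    // Only the valid bits are reset; the data is ignored while valid is low.
    if (rst)
      valid <= 3'b000;
    else if (en)
      valid <= {valid[1:0], validIn};

    // Data registers toggle only when the pipeline advances.
    if (en) begin
      data[0] <= dataIn;
      data[1] <= data[0];
      data[2] <= data[1];
    end
  end

  assign validOut = valid[2];
  assign dataOut  = data[2];
endmodule
```

Unlike the second version, this cannot absorb bubbles: a stalled output freezes every stage, including empty ones.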
I wrote a program to blink 8 LEDs. The code has no errors and loads properly, but the LEDs do not blink properly.
I checked the pin planner and it was correct, and the clock I used is 50 MHz. I am using the DE10-Lite board.
module LED_blink(clk,led);
input clk;
output reg[7:0] led;
reg[31:0] count = 0;
always @(posedge clk)
begin
count <= count + 1;
led <= (count<50000000) ? 8'b11111111 : 0;
count <= (count<50000000) ? count : 0;
end
endmodule
There are no error messages, but it does not work the way I expected.
You should get into the habit of simulating your code with a testbench before running it on an FPGA. But in this case even simple rubber duck debugging shows the errors.
Because count<=(count<50000000)?count:0; overrides the earlier count<=count+1;, count never increments.
This can be easily fixed by replacing the two lines with:
count<=(count<50000000)?count+1:0;
The second problem is that led will be 0 for only one clock cycle (when count is equal to 50000000), so in practice you won't see any blinking.
This can be easily fixed by changing the line to:
led<=(count<25000000)?8'b11111111:0;
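The advice to simulate first can be followed with a scaled-down counter, so that a full blink period fits in a few clock cycles. This is a sketch under assumptions: the HALF_PERIOD parameter and both module names are hypothetical, and the counter here wraps one count earlier than in the fix above so that the on and off phases are exactly equal.

```verilog
// Sketch only: led_blink_sketch, led_blink_tb and HALF_PERIOD are hypothetical names.
module led_blink_sketch #(parameter HALF_PERIOD = 25_000_000) (
  input            clk,
  output reg [7:0] led
);
  reg [31:0] count = 0;
  always @(posedge clk) begin
    // A single assignment to count: increment, or wrap after a full period.
    count <= (count < 2*HALF_PERIOD - 1) ? count + 1 : 0;
    // LEDs are on for the first half of the period and off for the second half.
    led   <= (count < HALF_PERIOD) ? 8'hFF : 8'h00;
  end
endmodule

module led_blink_tb;
  reg        clk = 0;
  wire [7:0] led;

  // HALF_PERIOD is shrunk to 4 so the simulation finishes quickly.
  led_blink_sketch #(.HALF_PERIOD(4)) dut (.clk(clk), .led(led));

  always #5 clk = ~clk;

  initial begin
    $monitor("t=%0t led=%b", $time, led);
    #200 $finish;
  end
endmodule
```

In the printed trace, led should alternate between all ones and all zeros every four clock cycles; on the board the same module would be instantiated with the default parameter.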
The problem is not syntactic, but there are some other problems in your code.
The first is the way you use nonblocking assignments (<=). In your code, all right-hand sides are evaluated at the clock edge and the variables are only updated afterwards. Thus, the first line of your always block (count<=count+1) does not affect the last line of your always block (count<=(count<50000000)?count:0). Also, when two nonblocking assignments target the same variable, only the last one takes effect.
The second problem is that you don't have a reset state for count. Apart from the initializer in the declaration, which not every synthesis target honors, the value is never explicitly set to 0, so you cannot rely on its state.
If you fix these problems, there will be a third problem in your logic. You describe that your variable led should be 8'b11111111 if count is smaller than 50000000. However, as soon as the counter reaches this value, you reset it to 0. In other words, your counter would almost always be smaller than 50000000, and led would be 0 for at most a single clock cycle.
I wrote down a small fix:
module LED_blink(clk,rst_n,led);
input clk;
input rst_n;
output reg[7:0] led;
reg[31:0]count=0;
always @(posedge clk or negedge rst_n)
begin
if (~rst_n) begin
count <= 32'b0;
led <= 8'b0;
end
else begin
count <= (count < 100000000) ? count + 1 : 0;
led <= (count < 50000000) ? 8'b11111111 : 0;
end
end
endmodule
Alternatively, you could write in your always block:
count <= (count < 50000000) ? count + 1 : 0;
led <= (count == 50000000) ? ~led : led;
In the following Verilog module, I'd like to understand why the blocking assignment using concatenation doesn't give the same result as the 2 commented out blocking assignments.
When I run the program on the FPGA, it gives the expected result with the 2 blocking assignments (the leds blink), but not with the blocking assignment using concatenation (the leds stay off).
Bonus points for answers pointing to the Verilog specification explaining what is at play here!
/* Every second, the set of leds that are lit will change */
module blinky(
input clk,
output [3:0] led
);
reg [3:0] count = 0;
reg [27:0] i = 0;
localparam [27:0] nTicksPerSecond = 100000000;
assign led = {count[3],count[2],count[1],count[0]};
always @ (posedge(clk)) begin
// This works:
//count = i==nTicksPerSecond ? (count + 1) : count;
//i = i==nTicksPerSecond ? 0 : i+1;
// But this doesn't:
{count,i} = i==nTicksPerSecond ?
{count+1, 28'b0 } :
{count , i+1};
end
endmodule
PS: I use Vivado 2018.2
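The reason is that count+1 and i+1 are both 32 bits wide. An unsized number is at least 32 bits wide (1800-2017 LRM section 5.7.1), and the width of an addition is the width of its largest operand (LRM section 11.6.1). The concatenations on the right-hand side are therefore wider than the 32-bit {count,i}, and the assignment keeps only the low 32 bits; in the {count, i+1} branch the four bits that land in count are the zero upper bits of i+1, so count is cleared on almost every clock. To make your code work, give your numeric literals an explicit size: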
{count,i} = i==nTicksPerSecond ?
{count+4'd1, 28'b0 } :
{count , i+28'd1};
A simpler way to write this code is
always @ (posedge clk)
if (i== nTicksPerSecond)
begin
count <= count + 1;
i <= 0;
end
else
begin
i <= i + 1;
end
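To see the width rule from this answer in isolation, the two forms of the concatenation can be compared in a small simulation-only sketch. The module name and the *_bad / *_good registers are hypothetical; count and i mirror the question, and the sketch assumes a simulator where an unsized literal is 32 bits wide.

```verilog
// Simulation-only sketch; concat_width_demo and the *_bad / *_good names are hypothetical.
module concat_width_demo;
  reg [3:0]  count;
  reg [27:0] i;
  reg [3:0]  c_bad, c_good;
  reg [27:0] i_bad, i_good;

  initial begin
    count = 4'd5;
    i     = 28'd7;
    // i + 1 is 32 bits wide, so {count, i + 1} is 36 bits. Only the low 32 bits
    // are assigned, and the 4 bits landing in c_bad are the zero upper bits of i + 1.
    {c_bad, i_bad}   = {count, i + 1};
    // i + 28'd1 is 28 bits wide, so the concatenation is exactly 32 bits.
    {c_good, i_good} = {count, i + 28'd1};
    $display("unsized literal: count=%0d i=%0d", c_bad, i_bad);
    $display("sized literal  : count=%0d i=%0d", c_good, i_good);
  end
endmodule
```

The first line should report count as 0 and the second as 5, which is why the LEDs in the question stay off with the unsized literals.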
I have a 1023-bit vector in Verilog. All I want to do is check whether the ith bit is 1 and, if it is, add 'i' to another variable.
In C, it would be something like:
int sum = 0;
int i = 0;
for (i = 0; i < 1023; i++) {
if (a[i] == 1) {
sum = sum + i;
}
}
Of course, the addition that I am doing is over a Galois field, so I have a module called Galois_Field_Adder to do the computation.
So my question now is: how do I conditionally check whether a specific bit is 1 and, if so, call my module to do that specific addition?
NOTE: The 1023-bit vector is declared as an input.
It's hard to answer your question without seeing your module, as we can't gauge how far along you are with Verilog. You always have to think about how your code translates into gates. If we want to translate your C code into synthesizable logic, we can take the same algorithm, go through each bit one after the other, and add to the sum depending on each bit. You would use something like this:
module gallois (
input wire clk,
input wire rst,
input wire [1022:0] a,
input wire a_valid,
output reg [18:0] sum,
output reg sum_valid
);
reg [9:0] cnt;
reg [1021:0] shift_a;
always @(posedge clk)
if (rst)
begin
sum[18:0] <= {19{1'bx}};
sum_valid <= 1'b0;
cnt[9:0] <= 10'd0;
shift_a[1021:0] <= {1022{1'bx}};
end
else
if (a_valid)
begin
sum[18:0] <= 19'd0;
sum_valid <= 1'b0;
cnt[9:0] <= 10'd1;
shift_a[1021:0] <= a[1022:1];
end
else if (cnt[9:0])
begin
if (cnt[9:0] == 10'd1022)
begin
sum_valid <= 1'b1;
cnt[9:0] <= 10'd0;
end
else
cnt[9:0] <= cnt[9:0] + 10'd1;
if (shift_a[0])
sum[18:0] <= sum[18:0] + cnt[9:0];
shift_a[1021:0] <= {1'bx, shift_a[1021:1]};
end
endmodule
You will get your result after 1023 clock cycles. This code needs to be modified depending on what goes around it, what interface you want etc...
Of importance here is that we use a shift register to test each bit, so that the logic adding your sum only takes shift_a[0], sum and cnt as an input.
Code based on the following would also work in simulation:
if (a[cnt[9:0]])
sum[18:0] <= sum[18:0] + cnt[9:0];
but the logic adding to sum would in effect take all 1023 bits of a[] as an input. This would be quite hard to turn into actual lookup tables.
In simulation, you can also implement something very crude such as this:
reg [1022:0]a;
reg [18:0] sum; // 19 bits: the indices 0..1022 sum to at most 522753
integer i;
always @(a)
begin
sum[18:0] = 19'd0;
for (i=0; i < 1023; i=i+1)
if (a[i])
sum[18:0] = sum[18:0] + i;
end
If you were to try to synthesize this, sum would actually turn into a chunk of combinatorial logic, as the 'always' block doesn't rely on a clock. This code is in fact equivalent to this:
always @(a)
case (a)
1023'd0: sum[18:0] = 19'd0; // no bit set
1023'd1: sum[18:0] = 19'd0; // bit 0 set: adds index 0
1023'd2: sum[18:0] = 19'd1; // bit 1 set: adds index 1
// etc...
endcase
Needless to say that a lookup table with 1023 input bits is a VERY big memory...
Then if you want to improve your code, and use your FPGA as an FPGA and not like a CPU, you need to start thinking about parallelism, for instance working in parallel on different ranges of your input a. But this is another thread...
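One assumption changes the picture considerably. If the field is a binary extension field GF(2^m), as is common, field addition is a bitwise XOR, and the conditional accumulation collapses into a plain XOR tree with no carries. The sketch below is not the asker's Galois_Field_Adder: the module name and the parameters N and W are hypothetical, and it assumes each index i is used directly as the field element to add.

```verilog
// Sketch only: assumes addition over GF(2^W) is bitwise XOR and that the
// index i itself is the element to add. All names here are hypothetical.
module gf_index_sum_sketch #(
  parameter N = 1023,   // number of bits in the input vector
  parameter W = 10      // width of one field element
) (
  input  wire [N-1:0] a,
  output reg  [W-1:0] sum
);
  integer i;

  always @* begin
    sum = {W{1'b0}};
    for (i = 0; i < N; i = i + 1)
      if (a[i])
        sum = sum ^ i[W-1:0];   // XOR in the index of every set bit
  end
endmodule
```

Because XOR has no carry chain, each output bit is just the parity of a fixed subset of the input bits, which synthesizes far more gracefully than the arithmetic sum.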
Hi everyone,
I am a newbie at programming FPGAs in Verilog. At present, I am trying to design firmware that calculates the sum of the ADC data over 3 samples. First, I will explain how my code handles one ADC sample. On a rising edge of the clkr clock with adcIfEnb == 1, adc_data takes the value of adcIfData; this is the data for one sample. On the next rising edge of clkr with adcIfEnb == 1, this data is stored in iradcTrg. In the end, iradcTrg holds the adc_data values of 3 samples, and I then sum these 3 values.
wire adcIfData[79:0];
reg
always @(posedge clkr) begin
if(adcIfEnb) begin
adc_data[9:0] <= adcIfData[9:0];
end
end
reg [29:0] iradcTrg;
reg [9:0] adcTrg;
always @(posedge clkr) begin
if (adcIfEnb) begin
iradcTrg[29:0] <= {adc_trg[19:10],adc_trg[9:0],adc_data[9:0]};
adcTrg[9:0] <= adc_trg[29:20] + adc_trg[19:10] + adc_trg[9:0];
end
end
However, there are 2 problems which I do not know how to solve.
Firstly, at the beginning, when the first adc_data value is stored in iradcTrg, adcTrg also takes the sum. This means that adcTrg = 0 + 0 + first_adc_data, and this sum needs to be avoided.
Secondly, with my design, adc_data is shifted serially into iradcTrg. This means the samples are grouped like this:
[1 2 3] 4 5 6 => 1 [2 3 4] 5 6=> 1 2 [3 4 5] 6
But in my case, I would like the samples to be grouped like this to get the sum:
[1 2 3] 4 5 6 => 1 2 3 [4 5 6]
Therefore, how should I change my code to get the result I expect, or are there any documents that can help me in this case?
To start: make sure your code is correctly indented when you put it on stackexchange. Secondly: I assume you have edited the code before posting it here, because that code will not compile; e.g. there is a floating 'reg' at the top and no module declaration. Thirdly: you have defined a wire adcIfData[79:0]; I am going to assume you meant that to be [9:0]. Fourthly: you use variables which are not defined: adc_data, adc_trg.
Fifthly: I suggest you give your variables more meaningful names like gather_samples and sum_of_samples.
Now let's look at the core of the code. You want to take samples and shift them into a 30-bit register. There is no need to write "adc_trg[19:10],adc_trg[9:0]"; adc_trg[19:0] will suffice. Also, there is no need to put the sample in a different register beforehand. I would just use:
always @(posedge clkr)
if (adcIfEnb)
iradcTrg[29:0] <= {iradcTrg[19:0],adcIfData[9:0]};
As to your basic problem of gathering samples and not using the first two: all you have to do is add a counter which counts to three, then compute the sum on the third count. You will need a reset to give the counter a known value at startup, but I don't see a reset signal. I always try to use minimal logic, so I would make the gathering register 20 bits wide to store only the first two samples and, at the count of three, add them to the latest sample. That saves another 10 registers. Here is some code. I wrote this without simulating or compiling it; it is just a guide to what it should look like.
reg [ 1:0] count;
reg [19:0] gather_samples;
reg [11:0] sum_of_samples; // 12 bits: three 10-bit samples can sum to 3069
reg sum_valid;
always @(posedge clkr)
begin
if (some_reset)
count <= 2'd0;
else
if (adcIfEnb)
begin
if (count==2'd2)
begin // third sample arriving, add it to the previous 2
sum_of_samples <= gather_samples[19:10] + gather_samples[9:0] + adcIfData;
count <= 2'd0;
end
else
begin // intermediate: gather samples
gather_samples <= {gather_samples[9:0],adcIfData};
count <= count + 2'd1;
end
sum_valid <= (count==2'd2);
end // if (adcIfEnb)
end // clocked
Your job will be much easier if you use a state machine. Here's a small (and incomplete) example of a state machine.
parameter FIRST_DATA=0, SECOND_DATA=1, THIRD_DATA=2, OUTPUT=3;
reg [2:0] current_state = FIRST_DATA;
reg [9:0] adc_data1;
reg [9:0] adc_data2;
reg [9:0] adc_data3;
reg [11:0] adc_data_sum;
always @ (posedge clk)
begin
// TODO: use proper reset
case (current_state)
FIRST_DATA:
if (adcIfEnb)
current_state <= SECOND_DATA;
SECOND_DATA:
if (adcIfEnb)
current_state <= THIRD_DATA;
THIRD_DATA:
if (adcIfEnb)
current_state <= OUTPUT;
OUTPUT:
if (adcIfEnb)
current_state <= FIRST_DATA;
endcase
end
always @ (negedge clk)
begin
if (current_state == FIRST_DATA && adcIfEnb)
adc_data1 <= adcIfData;
end
always @ (negedge clk)
begin
if (current_state == SECOND_DATA && adcIfEnb)
adc_data2 <= adcIfData;
end
always @ (negedge clk)
begin
if (current_state == THIRD_DATA && adcIfEnb)
adc_data3 <= adcIfData;
end
always @ (negedge clk)
begin
if (current_state == OUTPUT)
adc_data_sum <= adc_data1 + adc_data2 + adc_data3;
end
A few comments first:
1) I don't know why you introduce so many variables with strange names. You only need an adc_buffer and an adc_sum. Is iradcTrg the same as adc_trg? Why is there an empty reg statement? Why does adcIfData have 80 bits when you only use the 10 LSBs? I'm confused.
2) Since adc_sum (adcTrg in your case) will be a sum of 3 samples, think about possible overflow. How wide must adc_sum be to hold the sum of three 10-bit numbers?
3) Shouldn't you first reset your design to a known state using an asynchronous or synchronous reset?
You can use a 2-bit counter with an async reset and logic for wrapping back to 0:
reg [1:0] adc_buffer_counter_reg;
always @(posedge clkr or negedge rst_n) begin
if (!rst_n)
adc_buffer_counter_reg <= 2'd0;
else if (adcIfEnb) begin
if (adc_buffer_counter_reg == 2'd2) //trigger calc of the sum here
adc_buffer_counter_reg <= 2'd0;
else
adc_buffer_counter_reg <= adc_buffer_counter_reg + 2'd1;
end
end
You can use this counter to trigger a calculation of the sum on every third valid sample.
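Putting the counter together with an accumulator gives one way to produce a sum for each non-overlapping group of three samples. This is a sketch under assumptions, not any answerer's final code: the module name, cnt, acc, adc_sum and adc_sum_valid are hypothetical, adcIfData is assumed to be 10 bits wide, and the sum is 12 bits so that three 10-bit samples cannot overflow.

```verilog
// Sketch only: adc_sum3_sketch, cnt, acc, adc_sum and adc_sum_valid are hypothetical names.
module adc_sum3_sketch (
  input  wire        clkr,
  input  wire        rst_n,
  input  wire        adcIfEnb,
  input  wire [9:0]  adcIfData,      // assumed 10 bits wide
  output reg  [11:0] adc_sum,        // 3 * 1023 = 3069 fits in 12 bits
  output reg         adc_sum_valid   // one-clock pulse when adc_sum is updated
);
  reg [1:0]  cnt;   // position of the incoming sample within its group (0, 1, 2)
  reg [10:0] acc;   // running sum of the first two samples of the group

  always @(posedge clkr or negedge rst_n) begin
    if (!rst_n) begin
      cnt           <= 2'd0;
      acc           <= 11'd0;
      adc_sum       <= 12'd0;
      adc_sum_valid <= 1'b0;
    end else begin
      adc_sum_valid <= 1'b0;                 // default; overridden below on the third sample
      if (adcIfEnb) begin
        if (cnt == 2'd2) begin
          adc_sum       <= acc + adcIfData;  // third sample completes the group
          adc_sum_valid <= 1'b1;
          acc           <= 11'd0;            // start the next group from zero
          cnt           <= 2'd0;
        end else begin
          acc <= acc + adcIfData;            // first or second sample of the group
          cnt <= cnt + 2'd1;
        end
      end
    end
  end
endmodule
```

Because the accumulator is cleared after every third sample, the groups are [1 2 3], [4 5 6], and so on, and no partial sum is ever flagged as valid.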