Want help in understanding Verilog constructs - verilog

I am new to Verilog, and this language requires thinking in terms of synthesis.
While working on a program I found that:
begin
  buf_inm[row][col] = temp_data;
  #1 mux_data = buf_inm[row][col];
end
gave correct results, while
begin
  buf_inm[row][col] = temp_data;
  mux_data = buf_inm[row][col];
end
did not, in terms of the variable assignments.
Can anybody explain what is the difference between these two?
In most other higher-level languages, construct 2 (without the delay) would have given the correct assignments.
Thanking you,
Yours sincerely,
R. Ganesan.

First of all, #1 is a one-timescale-unit delay that you can use in simulation, but it is not synthesizable. When you say "gave correct results", it might help if you explained what results you are seeing. My guess is that you assigned something to the two-dimensional array, then assigned that to mux_data, and are expecting mux_data to hold what temp_data was. Is it retaining old data? Is it undefined?
That said, I would say the difference is a synthesizable assignment versus a non-synthesizable one. In the first case you should see two state changes separated by one timescale unit. In the second case, because the assignments are blocking, both values should update in zero time, in the order written, top to bottom. If you aren't seeing that, something else is potentially afoot here. What tool are you using? Which version? Maybe you can share your full code and testbench so we can see why your results are unexpected.
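For reference, here is a minimal sketch (a hypothetical standalone testbench fragment; the widths, indices, and values are assumptions, not your code) of the behaviour blocking assignments should give without any delay:

// hypothetical testbench fragment; widths and values are illustrative only
reg [7:0] buf_inm [0:3][0:3];
reg [7:0] mux_data, temp_data;

initial begin
  temp_data = 8'hA5;
  buf_inm[0][0] = temp_data;            // blocking: buf_inm updates immediately
  mux_data = buf_inm[0][0];             // so this should read 8'hA5 in zero time
  $display("mux_data = %h", mux_data);  // expected: a5
end

If a sketch like this does not print the value just written, the problem is somewhere other than the blocking-assignment ordering.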

Related

Why would more array accesses perform better?

I'm taking a course on Coursera that uses MiniZinc. In one of the assignments, I was spinning my wheels forever because my model was not performing well enough on a hidden test case. I finally solved it by changing the following type of access in my model
from
constraint sum(neg1,neg2 in party where neg1 < neg2)(joint[neg1,neg2]) >= m;
to
constraint sum(i,j in 1..u where i < j)(joint[party[i],party[j]]) >= m;
I don't know what I'm missing, but why would these two perform any differently from each other? It seems like they should perform similarly, with the former being maybe slightly faster, but the performance difference was dramatic. I'm guessing there is some sort of optimization that the former misses out on? Or am I really missing something, and do those lines actually result in different behavior? My intention is to sum the strength of every element in raid.
Misc. Details:
party is an array of enum vars
party's index set is 1..real_u
every element in party should be unique except for a dummy variable.
solver was Gecode
verification of my model was done on a Coursera server, so I don't know what optimization level their compiler used.
edit: Since MiniZinc (mz) is a declarative language, I'm realizing that "array accesses" in mz don't necessarily have a direct counterpart in an imperative language. However, to me, these two lines mean the same thing semantically. So I guess my question is more "Why are the above lines semantically different in mz?"
edit2: I had to change the example in question; I was toeing the line of violating Coursera's honor code.
The difference stems from the way in which the where-clause "a < b" is evaluated. When "a" and "b" are parameters, the compiler can already exclude the irrelevant parts of the sum during compilation. If "a" or "b" is a variable, this usually cannot be decided at compile time, and the solver will receive a more complex constraint.
In this case the solver would have received a sum over "array[int] of var opt int", meaning that some variables in the array might not actually be present. For most solvers this is rewritten as a sum in which every variable is multiplied by a Boolean variable that is true iff the variable is present. You can see how this is less efficient than a normal sum without the multiplications.

assume() does not work for initial statement

For https://i.imgur.com/NCUjYmr.png , why isn't the signal "reset" assumed to be '1' initially? Does anyone have any idea why the assume does not work?
I found the solution. I am using temporal induction, which starts from a state that is not the initial state of the system. Therefore, the signal "reset" is not assumed to be '1' initially.
Assumes work as assumptions only in a formal verification environment. In simulation-based verification, they behave only as assert statements.
As per the LRM:
The immediate assume statement specifies that its expression is assumed to hold. For example, immediate assume statements can be used with formal verification tools to specify assumptions on design inputs that constrain the verification computation. When used in this way, they specify the expected behavior of the environment of the design as opposed to that of the design itself. In simulation, an immediate assume may behave as an immediate assert to verify that the environment behaves as assumed. A simulation tool shall provide the capability to check the immediate assume statement in this way.
Because of this, in simulation your design won't actually assume the value; it will only check whether the proper value is being driven.
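As a minimal sketch (the module and signal names here are illustrative, not taken from your design), an immediate assume on a reset input looks like this; a formal tool treats it as a constraint on the environment, while a simulator checks it like an assert:

module reset_env_check (input logic clk, input logic reset);
  always @(posedge clk) begin
    // formal: constrains reset to be 1; simulation: flags an error if it is not
    assume_reset: assume (reset == 1'b1)
      else $error("environment violated the reset assumption");
  end
endmodule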

Carry/auxiliary flag functions in x86 ALU

I'm trying to create an 8086 processor in Verilog. I have a better-than-average fundamental understanding of most of the architecture (and can get along happily once I get past this point), but I can't seem to wrap my head around how the Carry and Auxiliary flags function in the ALU.
I understand that CF is set by an addition or subtraction (in which case it's called a borrow) that would cause the result to be wider than the bit width of the ALU.
But, how would I write Verilog code for addition and subtraction that would allow me to write to the FLAGS[0] (CF) bit and then re-access it to continue the operation? Can anyone give me examples that I can deconstruct?
Also, this is even more of a n00b question, but how is it that an ALU with a carry operation can support the creation of a 17-bit number if the SI and DI registers are only 16 bits wide? Where does that extra bit go, or what gets done with it? What happens if multiplication creates that same bit overflow?
Many apologies for the novice-level questions. I almost feel like I'm set to get yelled at for some obvious ignorance or lack of understanding with regards to this. Many thanks to anyone who can help and give code lines for understanding to elucidate this for me.
I don't quite know what you mean by "and then re-access it to continue the operation", but if you're just asking how you can generate a carry bit from a 16-bit addition/subtraction, this is one way to do it (use concatenation to write the result to two different registers):
always @(posedge clk) begin
  if (add_with_carry)
    {CF[0], result[15:0]} <= a[15:0] + b[15:0];
  else if (sub_with_carry)
    {CF[0], result[15:0]} <= a[15:0] - b[15:0];
  else if (add_without_carry)
    result[15:0] <= a[15:0] + b[15:0];
  else if (sub_without_carry)
    result[15:0] <= a[15:0] - b[15:0];
end
This is also basically the same thing as writing the result to a 17-bit register and then just designating result[16] as the carry flag.
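As for re-accessing the flag to continue an operation, here is a hedged sketch (same illustrative names as above, with adc_op as an assumed decode signal) of an ADC-style add that consumes the stored carry and produces a new one:

always @(posedge clk) begin
  if (adc_op)
    // previous CF feeds the carry-in; the new carry-out is captured again
    {CF[0], result[15:0]} <= a[15:0] + b[15:0] + CF[0];
end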

Verilog Best Practice - Incrementing a variable

I'm by no means a Verilog expert, and I was wondering if someone knew which of these ways to increment a value was better. Sorry if this is too simple a question.
Way A:
In a combinational logic block, probably in a state machine:
//some condition
count_next = count + 1;
And then somewhere in a sequential block:
count <= count_next;
Or Way B:
Combinational block:
//some condition
count_en = 1;
Sequential block:
if (count_en == 1)
count <= count + 1;
I have seen Way A more often. One potential benefit of Way B is that if you are incrementing the same variable in many places in your state machine, perhaps it would use only one adder instead of many; or is that false?
Which method is preferred and why? Do either have a significant drawback?
Thank you.
One potential benefit of Way B is that if you are incrementing the same variable in many places in your state machine, perhaps it would use only one adder instead of many; or is that false?
Any synthesis tool will attempt automatic resource sharing. How well they do so depends on the tool and code written. Here is a document that describes some features of Design Compiler. Notice that in some cases, less area means worse timing.
Which method is preferred and why? Do either have a significant drawback?
It depends. Verilog (for synthesis) is a means to implement some logic circuit, but the spec does not specify exactly how this is done. Way A may be the same as Way B on an FPGA, but Way A is not consistent with low-power design on an ASIC due to the unconditional sequential assignment. Using reset nets is almost a requirement on an ASIC, but since many FPGAs start in a known state, you can save quite a bit of resources by not having them.
I use Way A in my Verilog code. My sequential blocks have almost no logic in them; they just assign registers based on the values of the "wire regs" computed in the combinational always blocks. There is just less to go wrong this way. And with Verilog we need all the help we can get.
What is your definition of "better"?
It can be better performance (faster maximum frequency of the synthesized circuit), smaller area (fewer logic gates), or faster simulation execution.
Let's consider the smaller-area case for Xilinx and Altera FPGAs. Registers in those FPGA families have an enable input. In your "Way B", count_en will be directly mapped onto that register enable input, which will result in fewer logic gates. Essentially, "Way B" provides more "hints" to a synthesis tool on how to better synthesize that circuit. That said, it's also possible that most FPGA synthesis tools (I'm talking about Xilinx XST, Altera MAP, Mentor Precision, and Synopsys Synplify) will correctly infer the register enable input from "Way A".
If count_en is synthesized as a register enable input, that will result in better performance of the circuit, because your counter increment logic will have fewer logic levels.
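To make that concrete, here is a minimal sketch of "Way B" as a complete module (the parameter, reset, and start condition are illustrative assumptions, not from the question):

module counter_en #(parameter W = 8) (
  input  wire         clk,
  input  wire         rst_n,
  input  wire         start,     // stand-in for "some condition" in the FSM
  output reg  [W-1:0] count
);
  reg count_en;

  // combinational block: decide whether to count this cycle
  always @* begin
    count_en = 1'b0;
    if (start)
      count_en = 1'b1;
  end

  // sequential block: count_en maps naturally onto the register enable input
  always @(posedge clk or negedge rst_n) begin
    if (!rst_n)
      count <= {W{1'b0}};
    else if (count_en)
      count <= count + 1'b1;
  end
endmodule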
Thanks

Does Verilog support short circuit evaluation?

If I have an if statement like:
if(risingEdge && cnt == 3'b111)
begin
...
end
Will it check on cnt if risingEdge is not true?
Does this even matter inside of an HDL?
For simulation it is undefined whether short-circuited expressions are evaluated or not. In the above example it makes no difference, but if you have a function call on the right-hand side then you may run into problems with undefined side effects.
See Gotcha #52 in "Verilog and SystemVerilog Gotchas: 101 Common Coding Errors and How to Avoid Them" by Stuart Sutherland and Don Mills.
In terms of whether it matters in an HDL, I presume you are asking whether it would matter when synthesized. The short answer is that it would. For example, the following code is synthesizable SystemVerilog:
if(risingEdge && cnt++ == 3'b111)
begin
...
end
In Verilog (not SV), the post-increment could be replaced with a Verilog function that makes other assignments to module variables to show the same thing. So yes, it is a relevant question.
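A hedged Verilog-2001 sketch of that idea (the function name and surrounding signals are illustrative): a function with a side effect on a module variable, called on the right-hand side of &&:

reg [2:0] cnt;

// increments the module variable cnt as a side effect, then tests it
function check_and_count;
  input unused;
  begin
    cnt = cnt + 1;
    check_and_count = (cnt == 3'b111);
  end
endfunction

always @(posedge clk) begin
  if (risingEdge && check_and_count(1'b0)) begin
    // whether cnt was incremented when risingEdge is 0 is undefined
    // in Verilog simulation, per the short-circuit ambiguity above
  end
end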
Paul R is generally correct, and the Sutherland reference is a great one (lots of good things like that are described in those Gotchas). For reference, though, this has changed with SystemVerilog, at least as far as the specification is concerned. While Verilog specifies that short-circuited operations may or may not be evaluated, SV disambiguates this by indicating that implementations shall not evaluate the short-circuited operands (similar to C++, Java, etc.). See IEEE 1800-2009 section 11.3.5 if you are interested. While this is great, the track record for adhering to the SV spec is not stellar across all tool providers, so use care when relying on it in SV.
