Verilog Multiplier/Divider Propagation Delay

I'm about to start coding a basic shift multiplier and a shift divider structurally in Verilog, but I wanted to first figure out what the expected propagation delays should be. Does anyone know the propagation delay equations for basic shift multipliers and dividers?

Maybe being more specific would help us answer your question more accurately. The expected delays and the actual hardware depend on the method you use to implement your circuit.
Maybe this PDF can provide some help regarding simulation and timing.

Not only does this depend on the multiplier and divider architecture used, but also on the process and voltage you run the circuit at.
For example, at 350 nm and 1.3 V you will struggle to meet timing at 100 MHz, while at 14 nm and 1.0 V @ 1 GHz you will not have a problem.
If you have the manual for your standard cell library, it should list propagation delays for each cell at a given voltage.
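To make the dependence on architecture concrete, here is a minimal Python sketch for one common case: a sequential shift-and-add multiplier that handles one multiplier bit per clock cycle. It assumes the minimum clock period is roughly clock-to-Q delay plus the adder/mux logic delay plus setup time. The function names and all delay numbers are hypothetical, chosen only for illustration; real values come from your cell library or FPGA datasheet.

```python
def shift_add_multiply(a, b, width):
    """Unsigned shift-and-add multiply, one multiplier bit per iteration.

    Returns (product, cycles), where cycles models the number of clock
    cycles a one-bit-per-cycle sequential implementation would need.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    product = 0
    cycles = 0
    for i in range(width):
        if (b >> i) & 1:          # add the shifted multiplicand when bit i is set
            product += a << i
        cycles += 1               # one iteration per multiplier bit
    return product, cycles


def min_clock_period_ns(t_clk_to_q, t_logic, t_setup):
    """Rough lower bound on the clock period for a register-to-register path."""
    return t_clk_to_q + t_logic + t_setup


# Hypothetical delays in nanoseconds, for illustration only.
product, cycles = shift_add_multiply(13, 11, width=8)
period = min_clock_period_ns(t_clk_to_q=0.5, t_logic=3.0, t_setup=0.5)
latency_ns = cycles * period
print(product, cycles, latency_ns)
```

With a different architecture (for example a fully combinational array multiplier) the cycle count and the logic delay per cycle change, so the same two-step estimate gives a different answer.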


What are flip-flops&latches and transmission&switch gates in Verilog?

I've looked everywhere to figure out what flip-flops and latches are. Could you give me a brief description of them in the simplest possible way (as if to a child)?
Also, could you tell me the functionalities, input and output of transmission primitives (buf, bufif0, bufif1, notif0, notif1) and switch primitives (pmos, rpmos, nmos, rnmos, cmos, rcmos, tranif1, tranif0, rtranif1, rtranif0, tran, rtran, pullup, pulldown)?
It is a lot of stuff I am requesting, so if you have a URL with comprehensive description of the given primitives, and any other ones with other introductory level information on Verilog, I would be very grateful. (NOT because that is precisely where I'm learning this from and I don't understand them)
PS: I'd like to be familiar with more primitives to prepare for my exam.
"Could you give me a brief description of them in the simplest possible way (as if to a child)?"
Once upon a time a wise genie named Lindley invented a magical box called a Flip-Flop. Whenever the enchanted Clock strikes a chime, if a peasant farmer Mr. D. is feeling high, his neighbor Mrs. Q will see this and suddenly do a "flip", landing upon the rail. And there she stays until the next time the clock chimes.
But instead, if Mr. D. is feeling low when the enchanted clock chimes, Mrs. Q will "flop", quite dejectedly, ending up flat on the ground.
And so this continues on forever as long as the enchanted clock continues to chime, with Q flipping and flopping as to however D feels. But only each time the clock chimes, much to the bemusement of the genie. If the clock were to stop, Q will freeze right where she is, and no matter what is happening to D, there she will stay frozen.
This may seem a singularly odd amusement, but it happens that with enough enchanted boxes and flipping peasants, a wise fellow named Turing is said to be able to compute most nearly anything.
Now a latch is something, not quite different, but also very much not quite the same. It involves some different peasants and in lieu of the magical clock, another peasant simply tells the others when to freeze. For this reason it is not as interesting as the famous flip-flop of Antioch.
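For those who prefer code to fairy tales, here is a small behavioral sketch in Python (not Verilog) of the difference. It assumes a rising-edge-triggered D flip-flop and an active-high D latch; the class names and waveforms are made up for illustration.

```python
class DFlipFlop:
    """Rising-edge-triggered D flip-flop: Q copies D only when clk goes 0 -> 1."""

    def __init__(self):
        self.q = 0
        self._prev_clk = 0

    def step(self, clk, d):
        if self._prev_clk == 0 and clk == 1:   # the clock "chimes"
            self.q = d
        self._prev_clk = clk
        return self.q


class DLatch:
    """Level-sensitive D latch: Q follows D while enable is 1, else it holds."""

    def __init__(self):
        self.q = 0

    def step(self, enable, d):
        if enable == 1:                        # transparent while enabled
            self.q = d
        return self.q


# Hypothetical waveforms, one sample per time step.
clk_wave = [0, 1, 1, 0, 0, 1]
d_wave   = [1, 1, 0, 0, 0, 0]

ff, latch = DFlipFlop(), DLatch()
ff_q = [ff.step(c, d) for c, d in zip(clk_wave, d_wave)]
latch_q = [latch.step(c, d) for c, d in zip(clk_wave, d_wave)]
print(ff_q)
print(latch_q)
```

The two outputs differ at the third sample: D drops while the clock is still high, so the latch follows it immediately, while the flip-flop keeps its value until the next rising edge.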
"if you have a URL with comprehensive description of the given primitives"
Please see the following freely downloadable reference. It contains comprehensive and authoritative descriptions of all of the primitives you request.
IEEE 1800-2012 SystemVerilog
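As a quick taste of what the standard describes, the four tri-state primitives can be sketched in Python as below. This is a simplified model that only covers data and control values of 0 and 1; the standard also defines the results for x and z inputs, which are left out here. The string "z" is just this sketch's stand-in for high impedance.

```python
Z = "z"  # stand-in for the high-impedance state


def bufif1(data, ctrl):
    """Buffer, enabled when ctrl is 1."""
    return data if ctrl == 1 else Z


def bufif0(data, ctrl):
    """Buffer, enabled when ctrl is 0."""
    return data if ctrl == 0 else Z


def notif1(data, ctrl):
    """Inverter, enabled when ctrl is 1."""
    return 1 - data if ctrl == 1 else Z


def notif0(data, ctrl):
    """Inverter, enabled when ctrl is 0."""
    return 1 - data if ctrl == 0 else Z


# Print a small truth table for 0/1 inputs only.
for data in (0, 1):
    for ctrl in (0, 1):
        print(data, ctrl, bufif1(data, ctrl), bufif0(data, ctrl),
              notif1(data, ctrl), notif0(data, ctrl))
```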

How to get a power estimate using XPower

I have been working on a class project using Verilog. I had to create a circuit and then calculate the power that the circuit uses. I have been trying to do this using XPower Analyzer. I followed the instructions to create the VCD file, then compiled and synthesized the code using Xilinx ISE 14.7. Everything goes well until the result shows up: I get 0 power consumption from the clock. I tried constraining the clock, and that only increased the dynamic power from 0 to 0.009, with no change for the clock. I also tried XPower on both my personal computer and a computer in my university lab, so I don't think it is a software bug.
Moreover, I have tried different designs, such as a simple ALU, a register, etc. Nonetheless, I still get the same power result.
More information:
The testbench runs well and does what I want.
I declare the clock like this: module toptrafficlight(clock, rst, output);
I have constrained the clock to 20 ns.
Timing phase = 0. After synthesis (not sure what this means)
Warnings:
HDLCompiler:413 - Line 86: Result of 5-bit expression is truncated to fit in 4-bit target.
PhysDesignRules:372 - Gated clock. Clock net main_gated_clk is sourced by a combinatorial pin. This is not good design practice. Use the CE pin to control the loading of data into the flip-flop.
[Screenshot: power result from XPower Analyzer]
My questions are:
Is there a particular way to set up the clock? I think this might be the cause of the problem.
Is there anything else that needs to be done besides getting the VCD file and synthesizing the code?
Any other ideas, examples, or tutorials?
The screenshot shows that the design is very small, so it's not a big surprise for clock power to be smaller than 1mW. Xilinx also provides an Excel sheet for power estimation. It can be used for a quick tryout to see what circumstances make the clock power significant.
Xilinx Power Estimator (XPE)

How to calculate propagation delay of a combinational circuit?

I have googled this question, but I can't seem to find a proper answer. Is there a certain equation for it? Or do I calculate the delay by looking at the longest path in the circuit?
You calculate path delays using static timing analysis tools, which are usually provided by the vendor of the target technology (e.g. the FPGA vendor).
Quick and dirty method - Open the synthesis report, and look for maximum combinational delay. This should give you an approximate value.
The right way - If the block is purely combinational, register the inputs and outputs using a clock of ANY frequency. After implementation (place and route), generate timing from the input register to the output register using the Tcl command report_timing -from {Flop1} -to {Flop2}. This will give you a detailed breakdown of cell and wire delays, and this analysis will be accurate.
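To answer the "longest path" part of the question directly: yes, the combinational delay is the delay of the slowest input-to-output path, and that is essentially what the timing tools compute for you. Here is a minimal Python sketch of the idea on a made-up netlist. The gate names, connections, and delays (in picoseconds) are all hypothetical, and wire delays are ignored, which a real tool would not do.

```python
# Hypothetical gate delays in picoseconds.
GATE_DELAY_PS = {"and1": 300, "not1": 100, "or1": 400, "xor1": 500}

# Hypothetical netlist: each gate lists the signals driving its inputs.
# Names that do not appear as keys (a, b, c) are primary inputs.
FANIN = {
    "and1": ["a", "b"],
    "not1": ["c"],
    "or1": ["and1", "not1"],
    "xor1": ["or1", "a"],
}


def longest_path(node):
    """Return (delay_ps, path) of the slowest path ending at node."""
    if node not in FANIN:                      # primary input: arrives at t = 0
        return 0, [node]
    # Pick the latest-arriving input, then add this gate's own delay.
    delay, path = max((longest_path(src) for src in FANIN[node]),
                      key=lambda result: result[0])
    return delay + GATE_DELAY_PS[node], path + [node]


critical_delay_ps, critical_path = longest_path("xor1")
print(critical_delay_ps, " -> ".join(critical_path))
```

On a large netlist you would memoize the recursion or walk the gates in topological order, but the principle is the same.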

Adjusting the operating frequency of a module in Verilog

I am creating a fairly complicated module which involves timing analysis of 2 Modules, each having its own algorithm, but both taking in 2 signed numbers as inputs and outputting a signed number.
I am designing this module for an FPGA in Verilog using Xilinx as my synthesis tool. Now I understand that Xilinx usually gives the worst case timing analysis for any module. This means that if I have a range of numbers which take 250 picoseconds, from input to output including the routing time, if there is even a single set of inputs that takes 400 picoseconds, the timing analysis shown by Xilinx would be 400 picoseconds.
My goal is to find:
1) If Module 1 is faster than Module 2 for any set of numbers.
2) Range of numbers for which Module 1 is faster than Module 2.
The only logical approach I can think of is, by increasing the operating frequency of the module. That is to force both the Modules to give their outputs after say 300 picoseconds rather than 400 picoseconds.
Obviously, if I increase the operating frequency, some of the inputs in the testbench will give erroneous outputs. My hypothesis is that the module that starts giving erroneous answers first has the slower algorithm.
So my doubts are:
1) Is it possible to increase the operating frequency of a Module in Verilog using Xilinx (some settings that I must enforce during synthesis or analysis)? If not, is there a better tool that will do my timing analysis?
2) Is this approach viable? Short of doing a gate-level synthesis using Cadence, is there any way I can find the actual time delay for each set of signed numbers for each gate using Verilog?
You are right in assuming that Xilinx always reports worst-case timing for the whole design, where clock rates are concerned. Don't take the synthesis results as being very accurate - they can vary by quite a lot once you've placed and routed the design.
I guess you could take the post-PAR Verilog netlist and simulate that with a variety of inputs using different simulated clock speeds - if there are slow paths which are not used for certain inputs, you should be able to run the simulated clock faster for those inputs.
Sounds like a very time consuming task, and I'm not sure what the point is. Where I come from (automotive) "worst-case" is the only number we can look at with any level of confidence!

What is the point of "create_clock" command in FPGA design?

In FPGA programming, what is the point of using the create_clock command in the XDC (or UCF) file? Let's say I have a clock port CLK that is assigned to a physical pin (which is my clock), in the XDC (or UCF) file. Why can't I just go ahead and use this CLK pin in my top level HDL? Why do I need to add something like this:
create_clock -name sys_clk_pin -period "XXX" [get_ports "CLK"]
Also, let's say I have a main clock "CLK" and some other clocks which I generate in HDL. Do I have to use "create_clock" for all the minor clocks in the XDC too?
I don't get this whole "create_clock" thing. Any help or direction is much appreciated.
Design constraints, as the name suggests, are used to define additional constraints on your design which can't be captured from the HDL description.
Let's take the create_clock command as an example. You specified the clock pin in your HDL description, so why isn't this enough? The reason is that a clock signal is not an ordinary signal - it is used as a reference signal by synchronous logic (flip-flops).
I suppose you're familiar with the concept of propagation delay (through logic gates). You want to make sure that every signal launched from one flop and sampled at another can propagate within a single clock cycle. The total propagation delay of a path can be estimated right after synthesis, because each logic element in the FPGA has an associated propagation delay (just sum these up along the path). But how do your analysis tools know the maximum allowed propagation delay? You do not specify this constraint in HDL, right? This is one of the cases where the period (or frequency, which the tool converts to a period) you specified with the create_clock command is used: the analysis tool will warn you if any combinational path in your design takes longer to propagate than the clock's period.
The above example describes one of the actions performed by Static Timing Analysis (STA) tools in which "design constraints" are employed.
Another kind of tool that makes extensive use of design constraints is the Clock Domain Crossing (CDC) tool. These tools are employed in designs containing more than one clock. The CDC concepts are described brilliantly here
If you take one clock and generate another one from it (a clock divider, for example), you want to make the CDC tool aware of this, because the fact that these clocks are related is important. The way to inform the CDC tool that the clocks are related is to use the create_generated_clock constraint.
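The check described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how any particular tool is implemented: the path names and all delay values (in picoseconds) are hypothetical, and real STA also accounts for clock skew, jitter, and hold checks.

```python
def setup_slack_ps(clock_period_ps, clk_to_q_ps, path_delay_ps, setup_ps):
    """Positive slack: the path fits in one clock cycle. Negative: violation."""
    return clock_period_ps - (clk_to_q_ps + path_delay_ps + setup_ps)


# Hypothetical numbers. 10_000 ps corresponds to a 100 MHz clock, which is
# the kind of value create_clock -period supplies to the tool (there in ns).
CLOCK_PERIOD_PS = 10_000
CLK_TO_Q_PS = 300
SETUP_PS = 200

# Hypothetical combinational delays between pairs of flops.
path_delays_ps = {
    "ff_a -> ff_b": 7_200,
    "ff_c -> ff_d": 9_800,
}

slacks = {
    name: setup_slack_ps(CLOCK_PERIOD_PS, CLK_TO_Q_PS, delay, SETUP_PS)
    for name, delay in path_delays_ps.items()
}
violations = [name for name, slack in slacks.items() if slack < 0]

for name, slack in slacks.items():
    print(name, slack, "VIOLATION" if slack < 0 else "ok")
```

Without a create_clock constraint the tool has no value to use for the clock period, so it cannot tell which of these paths is a problem.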
NOTE: the above examples are basic and by no means comprehensive.
