My class is using NASM assembly and I was trying to figure out different ways to shift. We know the instructions shr/sar, shl/sal, ror, rcr, rol, rcl. But how would I shift right and set the leftmost bit to whatever I want?
For example:
I have 11010011, and shifting right would produce _1101001 cf=1,
is there a shift in which I can carry in a number to the leftmost bit?
Thanks!
edit:
My only thoughts are using bit-wise operations and if the leftmost bit isn't what I want I can flip it using the not operator.
For example the number ends up as 1 1101001 and I wanted 0 1101001,
1 1101001 & 01101001 = 01101001
or,
0 1101001 | 11101001 = 11101001
The easiest way would be to simply set the bit to what you want using AND or OR operations.
If you want the high bit set to 1, use input OR 10000000.
If you want it set to 0, use input AND 01111111.
The remaining bits will be unchanged.
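As a rough illustration of the shift-then-mask idea, here is a small C model (C rather than NASM, just to make the bit manipulation easy to check). The function name shift_right_carry_in and its parameters are made up for this sketch.

```c
#include <stdint.h>

/* Shift an 8-bit value right by one, then force the vacated top bit
   to carry_in (0 or 1). Hypothetical helper, for illustration only. */
uint8_t shift_right_carry_in(uint8_t value, unsigned carry_in)
{
    uint8_t shifted = (uint8_t)(value >> 1);  /* top bit is 0 after the shift */
    if (carry_in)
        shifted |= 0x80;                      /* OR 10000000 sets the high bit */
    else
        shifted &= 0x7F;                      /* AND 01111111 clears the high bit */
    return shifted;
}
```

With the example from the question, 11010011 shifted right becomes 01101001 with a 0 carried in, or 11101001 with a 1 carried in.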
So I am working with 64 bit floating point numbers in Verilog for synthesis, ideally I would like to do -A*B, where A and B are the two numbers. I have got past doing A*B, so is it okay now if I just change the value of the first bit 0 to 1 or 1 to 0 to make it represent -A*B.
kinda like,
A[0]=~A[0];
Thanks in advance for any suggestion.
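Yes! Flipping the sign bit is all there is to it. Just make sure the bit you flip really is the sign bit, which is the most significant bit of the 64-bit word (A[63] if A is declared as [63:0]).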
Keep in mind that negating 0 will give you -0. (They're different floating-point bit patterns.) Whether this matters to you will depend on your application.
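The same idea can be checked on a PC with a short C sketch. It assumes double is the 64-bit IEEE 754 format, and the helper names flip_sign and double_bits are invented for this example.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw 64-bit pattern of a double (assumes IEEE 754 binary64). */
uint64_t double_bits(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Negate a double by toggling only bit 63, the sign bit. */
double flip_sign(double x)
{
    uint64_t bits = double_bits(x);
    bits ^= (uint64_t)1 << 63;      /* exponent and mantissa are untouched */
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

Flipping the sign of 0.0 this way gives the pattern 0x8000000000000000, which is -0: it compares equal to 0.0 but is a different bit pattern.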
I'm trying to create a 8086 processor in Verilog, and I have a better-than-average fundamental understanding of most of the architecture (and can get along happily once I get past this point), but I can't seem to wrap my head around how the Carry and Auxiliary flags function in the ALU.
I understand that CF is triggered upon an addition or subtraction (in which case it's called borrow) that would cause the result to be larger than the bit width of the ALU.
But, how would I write Verilog code for addition and subtraction that would allow me to write to the FLAGS[0] (CF) bit and then re-access it to continue the operation? Can anyone give me examples that I can deconstruct?
Also, this is even more of a n00b question, but how is it that an ALU with carry operation can support creation of a 17-bit number if the SI and DI registers are only 16 bits in width? Where does that extra bit go or what gets done with it? What happens if multiplication creates that same bit overflow?
Many apologies for the novice-level questions. I almost feel like I'm set to get yelled at for some obvious ignorance or lack of understanding with regards to this. Many thanks to anyone who can help and give code lines for understanding to elucidate this for me.
I don't quite know what you mean by "and then re-access it to continue the operation", but if you're just asking how you can generate a carry bit from a 16 bit addition/subtraction, this is one way to do it (use concatenation to write the result to two different registers):
always @(posedge clk) begin
    if (add_with_carry)
        {CF[0], result[15:0]} <= a[15:0] + b[15:0];  // bit 16 of the sum goes to CF
    else if (sub_with_carry)
        {CF[0], result[15:0]} <= a[15:0] - b[15:0];  // bit 16 is the borrow
    else if (add_without_carry)
        result[15:0] <= a[15:0] + b[15:0];
    else if (sub_without_carry)
        result[15:0] <= a[15:0] - b[15:0];
end
This is also basically the same thing as writing the result to a 17 bit register, and then just designating result[16] as the carry flag.
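If it helps to see the "17th bit" idea outside of Verilog, here is a small C model of the same thing. It is only a sketch: the struct alu_out and the functions add16 and sub16 are names I made up, and it models the arithmetic, not the clocked register.

```c
#include <stdint.h>

/* Hypothetical container for a 16-bit result plus a 1-bit carry flag. */
typedef struct {
    uint16_t result;
    unsigned carry;
} alu_out;

/* Add in a wider type so that bit 16 of the sum is still visible. */
alu_out add16(uint16_t a, uint16_t b)
{
    uint32_t wide = (uint32_t)a + (uint32_t)b;
    alu_out out = { (uint16_t)(wide & 0xFFFFu), (unsigned)((wide >> 16) & 1u) };
    return out;
}

/* Subtract in a wider type; bit 16 ends up set exactly when a < b (borrow). */
alu_out sub16(uint16_t a, uint16_t b)
{
    uint32_t wide = (uint32_t)a - (uint32_t)b;  /* wraps modulo 2^32 */
    alu_out out = { (uint16_t)(wide & 0xFFFFu), (unsigned)((wide >> 16) & 1u) };
    return out;
}
```

So the extra bit never has to fit in the 16-bit register: the low 16 bits go to the destination and bit 16 goes to the flag.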
I am teaching myself verilog. The book I am following stated in the introduction chapters that to perform division we use the '/' operator or '%' operator. In later chapters it's saying that division is too complex for verilog and cannot be synthesized, so to perform division it introduces a long algorithm.
So I am confused, can't verilog handle simple division? is the / operator useless?
It all depends what type of code you're writing.
If you're writing code that you intend to be synthesised, that you intend to go into an FPGA or ASIC, then you probably don't want to use the division or modulo operators. When you put any arithmetic operator in RTL the synthesiser instantiates a circuit to do the job: an adder for + and -, a multiplier for *. When you write / you're asking for a divider circuit, but a divider circuit is a very complex thing. It often takes multiple clock cycles, and may use look-up tables. It's asking a lot of a synthesis tool to infer what you want when you write a / b.
(Obviously dividing by powers of 2 is simple, but normally you'd use the shift operators)
If you're writing code that you don't want to be synthesised, that is part of a test bench for example, then you can use division all you want.
So to answer your question, the / operator isn't useless, but you have to be conscious of where and why you're using it. The same is true of *, but to a lesser degree. Multipliers are quite expensive, but most synthesisers are able to infer them.
You have to think in hardware.
When you write a <= b/c you are saying to the synthesis tool "I want a divider that can provide a result every clock cycle and has no intermediate pipeline registers".
If you work out the logic circuit required to create that, it's very complex, especially for higher bit counts. Generally FPGAs won't have specialist hardware blocks for division, so it would have to be implemented out of generic logic resources. It's likely to be both big (lots of LUTs) and slow (low fmax).
Some synthesisers may implement it anyway (from a quick search it seems quartus will), others won't bother because they don't think it's very useful in practice.
If you are dividing by a constant and can live with an approximate result then you can do tricks with multipliers. Take the reciprocal of what you wanted to divide by, multiply it by a power of two and round to the nearest integer.
Then in your Verilog you can implement your approximate divide as a multiply (which is not too expensive on modern FPGAs) followed by a shift (shifting by a fixed number of bits is essentially free in hardware). Make sure you allow enough bits for the intermediate result.
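To make the reciprocal trick concrete, here is a sketch in C for dividing a 16-bit value by the constant 10. The constants are my own choice for the example: 2^19 / 10 is 52428.8, which rounds to 52429, and the names RECIP_10, RECIP_SHIFT and div10_approx are hypothetical.

```c
#include <stdint.h>

#define RECIP_10    52429u   /* round(2^19 / 10), assumed for this example */
#define RECIP_SHIFT 19

/* Approximate x / 10 as (x * reciprocal) >> shift. */
uint16_t div10_approx(uint16_t x)
{
    /* The product can need up to 32 bits, so the intermediate must be wide. */
    uint32_t product = (uint32_t)x * RECIP_10;
    return (uint16_t)(product >> RECIP_SHIFT);
}
```

With this particular constant and shift the result happens to agree with integer division for every 16-bit input, but other divisors, widths or shift amounts can be off by one, so it is worth checking the whole input range in simulation.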
If you need an exact answer or if you need to divide by something that is not a pre-defined constant you will have to decide what kind of divider you want. If your throughput is low then you can use a state machine based approach which does one division every n clock cycles. If your throughput is high and you can afford the device area then a pipelined approach which does a division per clock cycle (but requires multiple cycles for the result to flow through) may be more appropriate.
Often tool vendors will provide pre-made blocks (Altera calls them megafunctions) for this kind of stuff. The advantage of these is that the tool vendor will likely have carefully optimised them for the device. The downside is they can bring vendor lock-in: if you want to move to a different device vendor you will most likely have to swap out the block, and the block you swap it with may have different characteristics.
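One common way to build the state machine variant is shift-and-subtract (restoring) division, which produces one quotient bit per step. The C model below is only a sketch of that algorithm, with each loop iteration standing in for one clock cycle; the names divmod16 and divide_shift_subtract are invented here, and the caller is assumed to pass a non-zero divisor.

```c
#include <stdint.h>

/* Hypothetical result type: quotient and remainder of a 16-bit division. */
typedef struct {
    uint16_t quotient;
    uint16_t remainder;
} divmod16;

/* Restoring division, one quotient bit per iteration (MSB first).
   Assumes divisor != 0. */
divmod16 divide_shift_subtract(uint16_t dividend, uint16_t divisor)
{
    uint32_t rem = 0;
    uint16_t quot = 0;
    for (int step = 15; step >= 0; --step) {
        /* Bring down the next dividend bit into the partial remainder. */
        rem = (rem << 1) | ((uint32_t)(dividend >> step) & 1u);
        quot = (uint16_t)(quot << 1);
        if (rem >= divisor) {       /* the divisor fits: subtract, emit a 1 bit */
            rem -= divisor;
            quot = (uint16_t)(quot | 1u);
        }
    }
    divmod16 out = { quot, (uint16_t)rem };
    return out;
}
```

In hardware this needs only a subtractor, a comparator and a couple of shift registers, at the cost of 16 cycles per 16-bit division.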
So im confused. cant verilog handle simple division? is the / operator useless?
The Verilog synthesis spec (IEEE 1364.1) actually indicates all arithmetic operators with integer operands should be supported, but nobody follows this spec. Some synthesis tools can do integer division but others will reject it (I think XST still does) because combinational division is typically very area inefficient. Multicycle implementations are the norm but these cannot be synthesized from '/'.
Division and modulo are never "simple". Avoid them if you can do so, e.g. through bit masks or shift operations. Especially a variable divisor is really complicated to implement in hardware.
"Verilog the language" handles division and modulo just fine - when you are using a computer to simulate your code you have full access to all its abilities.
When you are synthesising your code to a particular chip, there are limitations. The limitations tend to be based on what the tool-vendor thinks is "sensible" rather than what is feasible.
In the old days, division by anything other than a power-of-two was deemed to be non-sensible for silicon as it took up a lot of space and ran very slowly. At the moment, some synthesisers will create "divide by a constant" circuits for you.
In future, I see no reason why the synthesiser shouldn't create you a divider (or make use of one that is in the DSP blocks of a potential future architecture). Whether it will or not remains to be seen, but witness the progression of multipliers (from "only powers of two" to "one input constant" to "full implementation" in just a few years)
For circuits that only divide by 2, just shift the bits. :)
For anything other than 2, always think at the circuit level: Verilog is NOT C or C++.
/ and % are not synthesizable, and even if they become so in newer tool versions, I believe you should keep your own division circuit, because the IP they provide will be general purpose (most probably built for floating point, not fixed point).
I bet you have gone through Morris Mano's computer architecture book. Its last chapters give the whole flow along with the algorithm; go through it, follow it, and build your own.
If your work is only logic verification and no real circuit is needed, go ahead and use / and %. They will work in simulation.
Division using '/' is possible in Verilog. But it is not a synthesizable operator. Same is the case for multiplication using '*'. There are certain algorithms to perform these operations in Verilog, and they are used if the code needs to be synthesizable, i.e. if you require equivalent hardware for it.
I am not aware of any algorithms for division, but for multiplication, I have used Booth's algorithm.
If you want synthesizable code you can use the Divison_IP, or you can use the right-shift operator for divisions by a power of two: 64/8 = 8 is the same as 64 >> 3 = 8.
Division isn't simple in hardware; people have spent a lot of time just on efficient and fast multipliers, for example. However, you can divide by 2 easily in hardware by right shifting one bit.
Actually your point is very valid and I was also confused in my initial days of learning HDLs.
When you synthesise a division operator, it consumes a lot of resources on an FPGA or during logic synthesis for an ASIC. Try the following instead.
You can also perform division (and multiplication) by shifting a vector (right = division, left = multiplication). But that will be division (and multiplication) by 2.
Example 0100 = 4
Shift right 0010 = 2(which is 4/2)
Shift left 1000 = 8(which is 4*2).
We use >> operator for shift right, and << for shift left.
But we can also produce variations out of it.
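For example, multiplication by 3:
if we have 0100 (4 dec), shift left once and add the original value: ((0100 << 1) + 0100) = 1100 (12 dec).
Division by 3 is not that simple: there is no single shift-and-subtract that is exact for every input.
((0100 >> 1) - 1) happens to give 1 for 4/3, but the same formula gives 5 for 12/3.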
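For what it's worth, here is how the shift tricks look as a quick C sketch you can test on a PC. The function names are made up for the example, and multiplication by 3 is written as shift-and-add of the original value, x*3 = (x << 1) + x.

```c
#include <stdint.h>

/* Multiply by 2: shift left by one bit. */
uint32_t times2(uint32_t x) { return x << 1; }

/* Divide by 2 (unsigned, rounds down): shift right by one bit. */
uint32_t half(uint32_t x) { return x >> 1; }

/* Multiply by 3: one shift plus one add of the original value. */
uint32_t times3(uint32_t x) { return (x << 1) + x; }
```

In hardware the shifts are just wiring, so times3 costs a single adder.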
These methods were made because to be honest, resources in FPGA are limited, and when it comes to ASICs, your manager tries to kill you for any additional logic. :)
The division operator / is not useless in Verilog/SystemVerilog. In simulation it works as the usual mathematical operator.
Some synthesis tools like Xilinx Vivado also synthesize the division operator because they have a pre-built algorithm for it (though it takes more hardware gates).
In simple words, you can do division in Verilog but have to take care of tools and simulators.
Using result <= a/b works perfectly.
Remember that when using the <= operator, the answer is calculated immediately but is stored in the "result" register at the next positive clock edge.
If you don't want to wait until the next positive clock edge, use result = a/b.
Remember, any arithmetic circuit needs some time to finish the operation, and during this time its outputs are not yet valid and toggle through what look like random numbers (bits).
It's like when an A-10 Warthog attack airplane attacks a tank: it shoots lots of bullets. That's how the divider circuit acts while dividing, it spits out random bits. After a couple of nanoseconds it will finish dividing and return a stable, good result.
This is why we wait until the next clock cycle for the "result" register. We try to protect it from random garbage numbers.
Division is the most complex operation, so it will have a delay in calculation. For 16-bit division the result will be calculated in approximately 6 nanoseconds.
I am trying to implement floating point operations in a microcontroller and so far I have had ample success.
The problem lies in multiplication. This is how I do it on my computer, where it works fine:
unsigned long long gig,mm1,mm2;
unsigned long m,m1,m2;
mm1 = f1.float_parts.mantissa;
mm2 = f2.float_parts.mantissa;
m1 = f1.float_parts.mantissa;
m2 = f2.float_parts.mantissa;
gig = mm1*mm2; //this works fine I get all the bits I need since they are all long long, but won't work in the mcu
gig = m1*m2; //this does not work, to be precise it gives only the 32 least significant bits, but works on the mcu
So you can see that my problem is that the microcontroller will throw an undefined reference to __muldi3 if I try the gig = mm1*mm2 there.
And if I try with the smaller data types, it only keeps the least significant bits, which I don't want it to. I need the 23 msb bits of the product.
Does anyone have any ideas as to how I can do this?
Apologies for the short answer, I hope that someone else will take the time to write a fuller explanation, but basically you do exactly as when you multiply two big numbers by hand on paper! It's just that instead of working in base 10, you work in base 256. That is, treat your numbers as byte vectors, and do with each byte what you do to a digit when you "hand multiply".
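Here is a sketch of that hand multiplication in C. To keep it short it uses 16-bit "digits" (base 65536) instead of bytes, which assumes your compiler can do a 32-bit multiply of two 16-bit values without calling __muldi3. The function name mul32x32 and its out-parameters are made up for this example.

```c
#include <stdint.h>

/* 32 x 32 -> 64 bit multiply using only 32-bit arithmetic.
   The 64-bit product is returned as two halves: *hi (upper 32) and *lo. */
void mul32x32(uint32_t a, uint32_t b, uint32_t *hi, uint32_t *lo)
{
    /* Split each operand into two 16-bit "digits". */
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    /* Four partial products, each fits in 32 bits. */
    uint32_t p0 = a_lo * b_lo;   /* weight 2^0  */
    uint32_t p1 = a_lo * b_hi;   /* weight 2^16 */
    uint32_t p2 = a_hi * b_lo;   /* weight 2^16 */
    uint32_t p3 = a_hi * b_hi;   /* weight 2^32 */

    /* Sum of the 2^16 column: three 16-bit values, so no overflow here. */
    uint32_t mid = (p0 >> 16) + (p1 & 0xFFFFu) + (p2 & 0xFFFFu);

    *lo = (p0 & 0xFFFFu) | (mid << 16);
    *hi = p3 + (p1 >> 16) + (p2 >> 16) + (mid >> 16);  /* carries propagate up */
}
```

For two 24-bit mantissas (23 bits plus the hidden one) the product is 48 bits wide, so the most significant bits you are after end up in *hi and the top of *lo, and you can shift them into place from there.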
The comments in the FreeBSD implementation of __muldi3() have a good explanation of the required procedure, see muldi3.c. If you want to go straight to the source (always a good idea!), according to the comments this code was based on an algorithm described in Knuth's The Art of Computer Programming vol. 2 (2nd ed), section 4.3.3, p. 278. (N.B. the link is for the 3rd edition.)
Back on the Intel 8088 (the original PC CPU and the last CPU I wrote assembly code for) when you multiplied two 16 bit numbers (32 bits? whoow) the CPU would return 2 16 bit numbers in two different registers - one with the 16 msb and one with the lsb.
You should check the hardware capabilities of your microcontroller, maybe it has a similar setup (obviously you'll need to code this in assembly if it does).
Otherwise you'll have to implement multiplication on your own.