I am just learning NASM and I am kind of struggling to figure this out. How do you declare variables in NASM? For example, how would you declare unsigned int i in NASM? Thanks
There is no such thing as unsigned int in assembly language (as far as I know).
In NASM you can only declare memory locations and put contents in them.
Example:
section .data
abyte: db 15
aword: dw 452
adword: dd 478569
; etc etc see Nasm manual for more 'types'
How you treat the variables determines whether they behave as signed or unsigned values. When you need signed values, keep in mind that div and mul work only on unsigned values (they do not treat the MSB as a sign bit). In that case you should use idiv and imul (signed division and signed multiplication).
Also keep in mind that the negative of a value will be shown as two's complement. You will see for 5 (in AX as example) : 0000000000000101 binary but for -5 you will see 1111111111111011 which is the two's complement of 5.
Adding both gives 5 + (-5), or 0000000000000101 + 1111111111111011 = 0000000000000000. The carry flag will be set, because the addition overflows when both numbers are treated as unsigned; for signed arithmetic you can ignore the carry and check the overflow flag instead, which stays clear here. A good practice is to debug and check the flag status often.
To check whether AX is negative, you can execute and ax, ax (or test ax, ax): the sign flag will be 1 if the MSB is 1, otherwise 0. You can then branch with the js and jns instructions.
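The point that one bit pattern can be read as signed or unsigned can be illustrated with a small model of a 16-bit register. This is only a sketch in Python, not NASM; the helper names to_signed16 and add16 are made up for illustration, and the flags are computed the way the x86 ADD instruction is documented to set CF, OF and SF.

```python
MASK16 = 0xFFFF  # models a 16-bit register such as AX


def to_signed16(bits):
    """Read a 16-bit pattern as a two's-complement signed value."""
    bits &= MASK16
    return bits - 0x10000 if bits & 0x8000 else bits


def add16(a, b):
    """Add two 16-bit patterns; return (result, carry, overflow, sign)."""
    a &= MASK16
    b &= MASK16
    total = a + b
    result = total & MASK16
    carry = total > MASK16  # unsigned overflow (CF)
    # Signed overflow (OF): both operands share a sign that the result lacks.
    overflow = bool(~(a ^ b) & (a ^ result) & 0x8000)
    sign = bool(result & 0x8000)  # SF mirrors the MSB of the result
    return result, carry, overflow, sign


minus_five = -5 & MASK16
print(format(minus_five, "016b"))  # 1111111111111011
print(to_signed16(minus_five))     # -5
print(add16(5, minus_five))        # (0, True, False, False)
```

In the last line the carry is set (the sum does not fit in 16 unsigned bits) while the signed overflow is clear, which is why the result 0 is correct when the operands are read as 5 and -5.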
The answer is a bit late but for those who have the same question.....
On 64-bit RISC-V, when a 32-bit operand is loaded into a register, it is necessary to decide whether to sign-extend or zero-extend to 64 bits, and the architectural decision was made to prefer the former, presumably on the grounds that the most common int type in C family languages is a signed 32-bit integer. So sign extension is slightly faster than zero extension.
Is the same true of 8-bit operands? In other words, is signed char more efficient than unsigned char?
If you’re going to be widening a lot of 8-bit values to wchar_t, unsigned char is what you want, because that’s a no-op rather than a bitmask. If your char format is UTF-8, you also want to be able to use unsigned math for your shifts. If you’re using library functions, it’s most convenient to use the types your library expects.
The RISC-V architecture has both a LB instruction that loads a sign-extended 8-bit value into a register, and a LBU instruction that zero-extends. Both are equally efficient. In C, any signed char used in an arithmetic operation is widened to int, and the C standard library functions specify widening char to int, so this puts the variable in the correct format to use.
Storing is a matter of truncation, and converting from any integral type to unsigned char is trivial (bitmask by 0xff). Converting from an unsigned char to a signed value can be done in no more than two instructions, without conditionals or register pressure (SLLI to put the sign bit of the char into the sign bit of the machine register, followed by SRAI, the arithmetic right shift, to sign-extend the upper bits; a logical SRLI would zero-extend instead).
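The two-shift conversion can be modelled on a 64-bit register as follows. This is a rough Python sketch rather than RISC-V code; the functions slli, srli and srai are hypothetical stand-ins named after the RV64I shift instructions, and registers are represented as unsigned integers masked to 64 bits.

```python
XLEN = 64
MASK = (1 << XLEN) - 1  # models one 64-bit register


def slli(x, shamt):
    """Logical left shift; bits shifted past bit 63 are lost."""
    return (x << shamt) & MASK


def srli(x, shamt):
    """Logical right shift; zeros enter from the top."""
    return (x & MASK) >> shamt


def srai(x, shamt):
    """Arithmetic right shift; copies of the sign bit enter from the top."""
    x &= MASK
    if x >> (XLEN - 1):   # sign bit set: reinterpret as a negative number
        x -= 1 << XLEN
    return (x >> shamt) & MASK


def sign_extend_byte(x):
    """Move bit 7 into bit 63, then shift back arithmetically."""
    return srai(slli(x, XLEN - 8), XLEN - 8)


print(hex(sign_extend_byte(0xFB)))          # 0xfffffffffffffffb
print(hex(srli(slli(0xFB, XLEN - 8), XLEN - 8)))  # 0xfb
```

The second line shows that pairing the left shift with a logical right shift only zero-extends, so the arithmetic shift is what performs the sign extension.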
There is therefore no additional overhead in this architecture to working with either. The ABI specifies sign-extension rather than zero-extension of signed quantities.
Incidentally, RV64I does not architecturally prefer sign-extension. That is the ABI convention, but the instruction set adds a LWU instruction to load a 32-bit value from memory with zero-extension and an ADDIW that can sign-extend a zero-extended 32-bit result. (There is no corresponding ADDIB for 8-bit or ADDIH for 16-bit quantities.)
I have this table of S-format instructions. Can you explain to me what imm[11:5] and funct3 are? I know funct indicates its size in bits and sometimes it is 000 or 010. I don't know exactly why it's there. Also, imm[11:5] is also 7-bits of all 0s.
Please help!
imm[4:0] and imm[11:5] denote closed-intervals into the bit-representation of the immediate operand.
The S-format is used to encode store instructions, i.e.:
sX rs2, offset(rs1)
There are different types of store instructions, e.g. store-byte (sb), store-half-word (sh), store-word (sw), etc. The funct3 part is used to encode the type (i.e. 0b000 -> sb, 0b001 -> sh, 0b010 -> sw, 0b011 -> sd). This allows using just one (major) opcode while still having multiple types of store instructions, instead of having to waste several (major) opcodes. IOW, funct3 encodes the minor opcode of the instruction.
The immediate operand encodes the offset. If you ask yourself why it's split like this: the split keeps the remaining parts of the encoding in the same positions as in other instruction formats. For example, the opcode, rs1, and funct3 parts are located at the exact same place in the R-type, I-type and B-type instruction formats. The rs2 part placement is shared with the R-type and B-type instruction formats. Those similarities help to simplify the instruction decoder.
That means the offset is 12 bits wide, and in pseudo-code:
offset = sign_ext(imm[11:5] << 5 | imm[4:0])
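The pseudo-code above can be turned into a small decoder. The following is a minimal Python sketch, assuming the standard 32-bit S-type layout (imm[11:5] in bits 31:25, rs2 in 24:20, rs1 in 19:15, funct3 in 14:12, imm[4:0] in 11:7, opcode in 6:0); the names decode_s_type, sign_extend and STORE_NAMES are made up for illustration.

```python
# funct3 values of the RV64I store instructions
STORE_NAMES = {0b000: "sb", 0b001: "sh", 0b010: "sw", 0b011: "sd"}


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def decode_s_type(word):
    """Split a 32-bit S-type instruction word into its fields."""
    imm_lo = (word >> 7) & 0x1F    # imm[4:0]
    funct3 = (word >> 12) & 0x7
    imm_hi = (word >> 25) & 0x7F   # imm[11:5]
    return {
        "opcode": word & 0x7F,
        "funct3": funct3,
        "name": STORE_NAMES.get(funct3, "unknown"),
        "rs1": (word >> 15) & 0x1F,
        "rs2": (word >> 20) & 0x1F,
        "offset": sign_extend((imm_hi << 5) | imm_lo, 12),
    }


fields = decode_s_type(0x00512423)  # encoding of: sw x5, 8(x2)
print(fields["name"], fields["rs2"], fields["rs1"], fields["offset"])  # sw 5 2 8
```

For small non-negative offsets such as 8, imm[11:5] is all zeros, which matches what the question observed in its table; the upper bits only become non-zero for offsets of 32 or more, or for negative offsets.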
See also the first figure in Section 2.6 (Load and Store Instruction) of the RISC-V Base specification (2019-06-08 ratified).
I have a small problem: I want to split a float variable into parts and then compute with these parts (add, subtract, etc.). My main problem is that I don't know how to get those split parts/variables out of the float-type variable. I want to operate on those parts using the rax / eax registers and b, c, d, etc.
Is there somebody who can help me to acquire some knowledge about this and eventually lead me to some code that can do the trick? One restriction of mine is: I can't operate on FPU commands.
Please take a look at the following image…
There are two symbols in this image.
I learned from Wikipedia's “List of logic symbols” the symbol “⊕” stands for “XOR”, but what does that cross in square symbol mean? Does that mean “XOR” too?
XOR
Means: combine the two inputs using XOR. So, this symbol indeed can be read as “⊕”.
Addition
Means: combine the two inputs using addition. This symbol indeed can be read as “+”.
Nota Bene
In the image you're asking about, it is noted that the S-boxes take 8 bit (= unsigned char) input and return 32 bits (= unsigned int)… which means the cipher expects you to do the addition and XOR on unsigned integers.
The plus in a box is addition mod 2^32 (actually, I don't remember for sure -- it could be mod 2^32-1, but it's addition in any case).
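As a rough illustration of the two operations on 32-bit unsigned words, here is a short Python sketch. It assumes the modulus is 2^32, as the answer above suggests without being certain; add32 and xor32 are hypothetical helper names, not part of any cipher library.

```python
MASK32 = 0xFFFFFFFF  # keeps values in the range of a 32-bit unsigned int


def add32(a, b):
    """The boxed plus: addition that wraps around modulo 2**32 (assumed)."""
    return (a + b) & MASK32


def xor32(a, b):
    """The circled plus: bitwise XOR of two 32-bit words."""
    return (a ^ b) & MASK32


print(hex(add32(0xFFFFFFFF, 2)))           # 0x1
print(hex(xor32(0xF0F0F0F0, 0x0F0F0F0F)))  # 0xffffffff
```

In C the same wrap-around happens automatically when both operands are unsigned 32-bit integers, which is why the unsigned types matter here.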
I am trying to learn how to convert a string to an integer. I think I am pretty close. My code works for numbers under 260. Once the numbers entered are greater than or equal to 260, then it just converts them to 0. I think it might have something to do with the size of a BYTE, but I'm not sure how to fix it. Any suggestions?
Some Irvine functions are included, but I'm trying to write my own ReadInt function.
I can see the problem. Rather than giving away the answer completely, here's a hint:
The lodsb instruction loads one byte into al (which is the low 8 bits of eax). The rest of eax is unchanged. What might cause eax to contain extra bits that aren't changed by lodsb?