What does mov cx, 02001Q mean in assembly - nasm

I'm doing an analysis of the shellcode found at http://www.shell-storm.org/shellcode/files/shellcode-211.php
I was wondering what this particular instruction does:
mov cx, 02001Q
I know it moves a value into cx, but I'm not sure what the Q stands for.

From the NASM docs:
NASM allows you to specify numbers in a variety of number bases, in a variety of ways: you can suffix H or X, D or T, Q or O, and B or Y for hexadecimal, decimal, octal and binary respectively
In other words, 02001Q means 2001 octal.
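To see the actual value, the octal digits can be converted with the C standard library. This is only a sketch: the helper name octal_digits_value is made up for illustration, and it assumes the trailing Q (or O) has already been stripped from the NASM literal.

```c
#include <stdlib.h>

/* Hypothetical helper: value of a string of octal digits, i.e. the
   digits of a NASM literal with the trailing Q or O already removed. */
static long octal_digits_value(const char *digits)
{
    return strtol(digits, NULL, 8);   /* base 8 = octal */
}
```

octal_digits_value("02001") yields 1025, which is 0x401, so the instruction loads 0x0401 into cx.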


RISC-V Assembly Program: Unable to load the floating-point registers f0, f1 with the floating-point instructions. Why is the value not getting loaded?

Code to convert from Fahrenheit to Celsius.
Input 100.0f Stored in the x8 integer register.
Output in the f0 register.
What is the error in the code?
Trying to implement the formula: Output = (100.0-32.0)*(5.0/9.0)
_start:
li x5,0x40a00000 # floating point representation of 5 in hex as per IEEE 754 notation. Storing in x5.
li x6, 0x41100000 # floating point representation of 9 in hex as per IEEE 754 notation. Storing in x6.
li x7, 0x42000000 # floating point representation of 32 in hex as per IEEE 754 notation. Storing in x7.
li x8, 0x42c80000 # floating point representation of 100 in hex as per IEEE 754 notation. Storing in x8. This is the input.
flw f0,0(x5) # Storing the value 5.0 in the floating point register f0
flw f1,0(x6) # Storing the value 9.0 in the floating point register f1
flw f2,0(x8) # Storing the value 100.0 in the floating point register f2
fdiv.s f0, f0, f1 # storing the value 5.0/9.0 in the register f0
flw f1, 0(x7) # storing the value 32.0 in the register f1.
fsub.s f2, f2, f1 # f2 = 100.0f – 32.0f = 68.0f
fmul.s f0, f0, f2 # f0 = 68.0f*(5.0f/9.0f) This is the output.
Compile Command:
~/spiking$ riscv64-unknown-elf-gcc -nostdlib -nostartfiles -T spike.lds faren.S -o faren.elf
The flw instruction is used to load a floating-point value from memory. In your case, you use x5 as a pointer (which isn't what you intended).
To move the bits untouched from an X register to a floating-point register you can use fmv.w.x.
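The difference between "load from the address in x5" and "move the bits of x5" can be modelled in C. The sketch below is not RISC-V code: bits_to_float is a hypothetical helper that copies a 32-bit pattern into a float unchanged, which is the effect fmv.w.x has, and fahrenheit_bits_to_celsius applies the formula from the question to the same constants.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as an IEEE 754 single, bits untouched.
   This models the effect of fmv.w.x (x register -> f register). */
static float bits_to_float(uint32_t bits)
{
    float value;
    assert(sizeof value == sizeof bits);   /* assumes a 32-bit float */
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Same formula as the question: (input - 32.0) * (5.0 / 9.0). */
static float fahrenheit_bits_to_celsius(uint32_t fahrenheit_bits)
{
    float five       = bits_to_float(0x40a00000u);
    float nine       = bits_to_float(0x41100000u);
    float thirty_two = bits_to_float(0x42000000u);
    float input      = bits_to_float(fahrenheit_bits);
    return (input - thirty_two) * (five / nine);
}
```

Calling fahrenheit_bits_to_celsius(0x42c80000u) gives roughly 37.78, the Celsius value for 100.0 Fahrenheit.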

FPU Anomaly In NASM

I'm writing a program that computes the sum of the squares of the floating-point numbers in an array.
I initialized ST1 and ST to +0.0 by executing FLDZ twice. Then, in a loop, I load the number pointed to by RSI into ST, multiply it by itself, and add the result to ST1.
The array is array dd 15.0,7.0,9.0
The block which performs given operation is
mov rsi,array
fldz
fldz
xor rcx,rcx
mov cl,3
variance:
fld dword[rsi]
fmul dword[rsi]
fadd st1
add rsi,4
loop variance
fld st1
call display
mov rax,60
syscall
Consider display as a procedure which prints Floating Point Numbers stored in ST
Expected Output : 355.0000
Actual Output : 274.0000

Find point along line where normal extends through another point

Given the line segment AB, how do I find the point Pn where the normal to AB passes through the point P? I also need to know if there is no normal that passes through the point (e.g. the point Q).
If R is any point on the normal line passing through P (different from P), then Pn is the point where AB and PR intersect.
One way to generate point R is to rotate segment AB by 90 degrees and then translate it so that A coincides with P. Then the translated B is your R:
Rx = Px + (By - Ay)
Ry = Py - (Bx - Ax)
Once you have your point R, it becomes a simple line-line intersection problem, which will give you your Pn (the formulas can be simplified for a specific case of perpendicular lines).
Then you can easily check whether Pn lies between A and B or not.
P.S. Note that the solution provided in @MBo's answer is a more direct and efficient one (and if you combine and simplify/optimize the formulas required for my answer, you will eventually arrive at the same thing). What I describe above might make good sense if you already have a primitive function that calculates the intersection of two lines, say,
find_intersection(Ax, Ay, Bx, By, Cx, Cy, Dx, Dy)
// Intersect AB and CD
In that case, finding Pn becomes a simple one-liner:
Pn = find_intersection(Ax, Ay, Bx, By, Px, Py, Px + (By - Ay), Py - (Bx - Ax))
But if you don't have such a primitive function at your disposal and/or care about making your code more efficient, then you might want to opt for a more direct, dedicated sequence of calculations like the one in @MBo's answer.
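For readers without such a primitive, here is one possible shape for it in C. This is a sketch under assumptions: the Vec2 type, the extra t output (position along AB, 0 at A and 1 at B) and the normal_foot wrapper are all invented for illustration, and find_intersection uses the standard determinant formula for two infinite lines.

```c
#include <stdbool.h>

typedef struct { double x, y; } Vec2;   /* hypothetical point type */

/* Intersects the infinite lines AB and CD. Returns false when they are
   parallel or degenerate. *t is the position along AB (0 at A, 1 at B). */
static bool find_intersection(Vec2 a, Vec2 b, Vec2 c, Vec2 d,
                              Vec2 *out, double *t)
{
    double denom = (a.x - b.x) * (c.y - d.y) - (a.y - b.y) * (c.x - d.x);
    if (denom == 0.0)
        return false;
    *t = ((a.x - c.x) * (c.y - d.y) - (a.y - c.y) * (c.x - d.x)) / denom;
    out->x = a.x + *t * (b.x - a.x);
    out->y = a.y + *t * (b.y - a.y);
    return true;
}

/* Builds R by rotating AB 90 degrees and moving it to P, then intersects
   AB with PR. Returns true only when Pn lies between A and B. */
static bool normal_foot(Vec2 a, Vec2 b, Vec2 p, Vec2 *pn)
{
    Vec2 r = { p.x + (b.y - a.y), p.y - (b.x - a.x) };
    double t;
    if (!find_intersection(a, b, p, r, pn, &t))
        return false;                    /* happens when A equals B */
    return t >= 0.0 && t <= 1.0;
}
```

The return value of normal_foot answers the second half of the question: it is false when no normal to the segment passes through the point.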
Find vectors
AB = (B.X-A.X, B.Y-A.Y)
AP = (P.X-A.X, P.Y-A.Y)
Projection of P to AB is:
APn = AB * (AB.dot.AP) / (AB.dot.AB);
where .dot. is scalar product
In coordinates:
cf = ((B.X-A.X)*(P.X-A.X)+(B.Y-A.Y)*(P.Y-A.Y))/((B.X-A.X)^2+(B.Y-A.Y)^2)
if cf < 0 or cf > 1 then projection lies outside AB segment
Pn.X = A.X + (B.X-A.X) * cf
Pn.Y = A.Y + (B.Y-A.Y) * cf
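The projection formulas above translate almost line for line into C. In this sketch the Point type and the function name project_onto_segment are assumptions made for the example; the arithmetic is the cf computation from the answer.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { double x, y; } Point;   /* hypothetical point type */

/* Projects p onto the line through a and b and stores the foot in *pn.
   Returns true when the foot lies on segment AB (0 <= cf <= 1). */
static bool project_onto_segment(Point a, Point b, Point p, Point *pn)
{
    double abx = b.x - a.x, aby = b.y - a.y;   /* vector AB */
    double apx = p.x - a.x, apy = p.y - a.y;   /* vector AP */
    double len2 = abx * abx + aby * aby;       /* AB.dot.AB */
    assert(len2 > 0.0);                        /* A and B must differ */
    double cf = (abx * apx + aby * apy) / len2;
    pn->x = a.x + abx * cf;
    pn->y = a.y + aby * cf;
    return cf >= 0.0 && cf <= 1.0;
}
```

The foot of the normal is written to pn even when the function returns false, so the caller can still see where the projection lands outside the segment.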

ASM question, two's complement

so this book "assembly language step by step" is really awesome, but it was sort of cryptic about how two's complement works when working on actual memory and register data. along with that, i'm not sure how signed values are represented in memory either, which i feel might be what's keeping me confused. anywho...
it says: "-1 = $FF, -2 = $FE and so on". now i understand that the two's complement of a number is itself multiplied by -1 and when added to the original will give you 0. so, FF is the hex equivalent of 11111111 in binary, and 255 in decimal. so my question is: what's the book saying when it says "-1 = $FF"? does it mean that -255 + -1 will give you 0 but also, which it didn't explicitly, set the OF flag?
so in practice... let's say we have 11h, which is 17 in decimal, and 00010001 in binary. and this value is in AL.
so then we NEG AL, and this will set the CF and SF, and change the value in AL to... 239 in decimal, 11101111 in binary, or EFh? i just don't see how that would be 17 * -1? or is that just a poorly worded explanation by the book, where it really means that it gives you the value you would need to cause an overflow?
thanks!
In two's complement, for bytes, (-x) == (256 - x) == (~x + 1). (~ is C'ish for the NOT operator, which flips all the bits in its operand.)
Let's say we have 11h.
100h - 11h == EFh
(256 - 17 == 239)
Note that the 256 works for bytes because they're 8 bits in size. For 16-bit words you'd use 2^16 (65536), for dwords 2^32. Also note that all math is mod 256 for bytes, mod 65536 for 16-bit words, etc.
Or, using not/+1,
~11h = EEh
+1... EFh
This method works for words of all sizes.
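Both recipes can be checked in C on 8-bit values. This is a small sketch with made-up function names; the casts to uint8_t do the "mod 256" step.

```c
#include <stdint.h>

/* Negate a byte by subtracting it from 100h and keeping the low 8 bits. */
static uint8_t negate_by_subtraction(uint8_t x)
{
    return (uint8_t)(0x100u - x);
}

/* Negate a byte by flipping every bit and adding 1. */
static uint8_t negate_by_not_plus_one(uint8_t x)
{
    return (uint8_t)(~x + 1u);
}
```

Both functions turn 11h into EFh, 1 into FFh and 2 into FEh, which matches the book's "-1 = $FF, -2 = $FE".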
what's the book saying when it says "-1 = $FF"?
If considering a byte only, the two's complement of 1 is 0xff (or $FF if using that format for hex numbers).
To break it down, the complement (or one's complement) of 1 is 0xfe, then you add 1 to get the two's complement: 0xff
Similarly for 2: the complement is 0xfd, add 1 to get the two's complement: 0xfe
Now let's look at 17 decimal. As you say, that's 0x11. The complement is 0xee, and the two's complement is 0xef - all that agrees with what you stated in your question.
Now, experiment with what happens when you add the numbers together. First in decimal:
17 + (-17) == 0
Now in hex:
0x11 + 0xef == 0x100
Since we're dealing with numeric objects that are only a byte in size, the 1 in 0x100 is discarded (some hand waving here...), and we result in:
0x11 + 0xef == 0x00
To deal with the 'hand waving' (I probably won't do this in an understandable manner, unfortunately): the overflow flag (OF, or sometimes called V for reasons that I don't know) is clear here because the carry into the sign bit matches the carry out of it (the carry flag, C), so the carry can be ignored (it's an indication that signed arithmetic occurred correctly). One way to think of it that's probably not very precise, but I find useful, is that leading ones in a negative two's complement number are 'the same as' leading zeros in a non-negative two's complement number.
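The byte addition can be replayed in C to watch the discarded carry and the two flags separately. This is a sketch, not a model of any particular CPU: the Add8 struct and add8 function are invented here, with carry meaning a carry out of bit 7 and overflow meaning the signed result does not fit in a byte.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t result;   /* low 8 bits of the sum */
    bool carry;       /* carry out of bit 7 */
    bool overflow;    /* signed result does not fit in a byte */
} Add8;               /* hypothetical result type */

static Add8 add8(uint8_t a, uint8_t b)
{
    unsigned wide = (unsigned)a + b;   /* 9-bit sum, e.g. 0x100 */
    Add8 r;
    r.result = (uint8_t)wide;          /* the extra 1 is dropped here */
    r.carry = wide > 0xFFu;
    /* Overflow: operands have the same sign, result has the other one. */
    r.overflow = ((~(a ^ b)) & (a ^ r.result) & 0x80) != 0;
    return r;
}
```

add8(0x11, 0xEF) gives result 0x00 with the carry set and the overflow clear, which is the 17 + (-17) == 0 case. add8(0x7F, 0x01) gives 0x80 with the carry clear and the overflow set, because 127 + 1 does not fit in a signed byte.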

Unsigned 128-bit division on 64-bit machine

I have a 128-bit number stored as 2 64-bit numbers ("Hi" and "Lo"). I need only to divide it by a 32-bit number. How could I do it, using the native 64-bit operations from CPU?
(Please, note that I DO NOT need an arbitrary precision library. Just need to know how to make this simple division using native operations. Thank you).
If you are storing the value (128-bits) using the largest possible native representation your architecture can handle (64-bits) you will have problems handling the intermediate results of the division (as you already found :) ).
But you always can use a SMALLER representation. What about FOUR numbers of 32-bits? This way you could use the native 64-bits operations without overflow problems.
A simple implementation (in Delphi) can be found here.
I have a DECIMAL structure which consists of three 32-bit values: Lo32, Mid32 and Hi32 = 96 bit totally.
You can easily expand my C code for 128-bit, 256-bit, 512-bit or even 1024-bit division.
// in-place divide Dividend / Divisor including previous rest and returning new rest
static void Divide32(DWORD* pu32_Dividend, DWORD u32_Divisor, DWORD* pu32_Rest)
{
    ULONGLONG u64_Dividend = *pu32_Rest;
    u64_Dividend <<= 32;
    u64_Dividend |= *pu32_Dividend;
    *pu32_Dividend = (DWORD)(u64_Dividend / u32_Divisor);
    *pu32_Rest     = (DWORD)(u64_Dividend % u32_Divisor);
}

// in-place divide 96 bit DECIMAL structure
static bool DivideByDword(DECIMAL* pk_Decimal, DWORD u32_Divisor)
{
    if (u32_Divisor == 0)
        return false;
    if (u32_Divisor > 1)
    {
        DWORD u32_Rest = 0;
        Divide32(&pk_Decimal->Hi32,  u32_Divisor, &u32_Rest); // Hi FIRST!
        Divide32(&pk_Decimal->Mid32, u32_Divisor, &u32_Rest);
        Divide32(&pk_Decimal->Lo32,  u32_Divisor, &u32_Rest);
    }
    return true;
}
The subtitle to volume two of The Art of Computer Programming is Seminumerical Algorithms. It's appropriate, as the solution is fairly straight-forward when you think of the number as an equation instead of as a number.
Think of the number as Hx + L, where x is 2^64. If we are dividing by, call it, Y, it is then trivially true that Hx = (N + M)x, where N is divisible by Y and M is less than Y. Why would I do this? (Hx + L) / Y can now be expressed as (N / Y)x + (Mx + L) / Y. The values N, N / Y, and M are integers: N / Y is just the integer quotient H / Y, and M is H % Y. However, as x is 2^64, this still brings us to a 128 by something divide, which will raise a hardware fault (as people have noted) should Y be 1.
So, what you can do is reformulate the problem as (Ax^3 + Bx^2 + Cx + D) / Y, with x being 2^32. You can now go down: (A / Y)x^3 + (((A % Y)x + B) / Y)x^2 + (((((A % Y)x + B) % Y)x + C) / Y)x + ((((((A % Y)x + B) % Y)x + C) % Y)x + D) / Y. If you only have 64-bit divides: you do four divides, and in the first three, you take the remainder, shift it up 32 bits, and OR in the next coefficient for the next division.
This is the math behind the solution that has already been given twice.
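Applied to the Hi/Lo layout from the question, the four-coefficient scheme might look like the sketch below. The function name div128_by_u32 and its in-place interface are assumptions for illustration; each step is one 64-bit divide whose quotient fits in 32 bits because the carried remainder is always smaller than the divisor.

```c
#include <assert.h>
#include <stdint.h>

/* Divides the 128-bit value hi:lo by a 32-bit divisor in place and
   returns the remainder. Only native 64-bit divides are used. */
static uint32_t div128_by_u32(uint64_t *hi, uint64_t *lo, uint32_t divisor)
{
    assert(divisor != 0);
    /* Coefficients A, B, C, D of the base-2^32 polynomial, high first. */
    uint32_t limbs[4] = {
        (uint32_t)(*hi >> 32), (uint32_t)*hi,
        (uint32_t)(*lo >> 32), (uint32_t)*lo
    };
    uint64_t rest = 0;
    for (int i = 0; i < 4; i++) {
        /* Shift the previous remainder up 32 bits and OR in the limb. */
        uint64_t part = (rest << 32) | limbs[i];
        limbs[i] = (uint32_t)(part / divisor);
        rest = part % divisor;
    }
    *hi = ((uint64_t)limbs[0] << 32) | limbs[1];
    *lo = ((uint64_t)limbs[2] << 32) | limbs[3];
    return (uint32_t)rest;
}
```

For example, dividing hi = 1, lo = 0 (that is, 2^64) by 2 leaves hi = 0 and lo = 0x8000000000000000 with remainder 0.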
How could I do it, using the native 64-bit operations from CPU?
Since you want native operations, you'll have to use some built-in types or intrinsic functions. All the above answers will only give you general C solutions, which won't be compiled to the division instruction.
Most modern 64-bit compilers have some way to do a 128-by-64 division. In MSVC, use _div128() and _udiv128(), so you'll just need to call _udiv128(hi, lo, divisor, &remainder).
The _div128 intrinsic divides a 128-bit integer by a 64-bit integer. The return value holds the quotient, and the intrinsic returns the remainder through a pointer parameter. _div128 is Microsoft specific.
In Clang, GCC and ICC there's an __int128 type, and you can use that directly:
#include <stdint.h>

unsigned __int128 div128by32(unsigned __int128 x, uint64_t y)
{
    return x / y;
}
