What will be the physical address of 0000H and 29H? - memory-address

What will be the physical address if the segment address is 0000H and the offset address is 29H?
I am reading Assembly Language Step by Step, 3rd Edition. The book's answer to this question is 2DH bytes, but how?
I tried to solve it myself, and here is my result:
0000 0000 0000 0010 1001 // segment address + offset address (in binary)
That comes to 41 in decimal, which converts back to 29H in hex. Yet the answer in the book is 2DH. How?
There is a similar problem where the segment address is 0001H and the offset address is 19H. The answer in the book is 1DH. How?
Thanks :)
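For reference, here is a minimal sketch of the usual 8086 real-mode rule, assuming that is the addressing scheme the question is about: the 16-bit segment is shifted left by 4 bits (multiplied by 16) and the offset is added, giving a 20-bit physical address. The function name physical_address is made up for illustration. Under this rule both pairs from the question come out as 29H, so the sketch only shows the arithmetic and does not by itself account for the 2DH and 1DH figures quoted from the book.

```python
def physical_address(segment: int, offset: int) -> int:
    """8086 real-mode rule: (segment * 16 + offset), wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


# The two segment:offset pairs from the question.
for seg, off in [(0x0000, 0x29), (0x0001, 0x19)]:
    print(f"{seg:04X}:{off:04X} -> {physical_address(seg, off):05X}H")
```

Both lines print 00029H, which illustrates that different segment:offset pairs can name the same physical byte.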

Related

What does addr mean in the opcode 0x3a LDA addr on the Intel 8080?

LDA is a simple opcode that loads the pointed-to data into the accumulator (register A) on the Intel 8080 processor. In this case (0x3a LDA addr) the description says that the op loads addr into the accumulator, but I can't work out what addr specifies.
A <- (adr) is the operation that 0x3a performs, and the instruction uses 3 bytes of memory. I could store the data in the last 2 bytes of the op, as hi addr and low addr, in a stack, but the accumulator is only 1 byte, so I can't. Thanks.
The LDA a16 instruction reads a byte from address a16 (the 8080 has a 16-bit address bus) and stores that value in the A register.
This instruction is encoded as three bytes: 0x3a lo hi, where lo and hi are the two bytes that make up the address.
If you want to store an immediate (constant) value in A, use the instruction MVI A, x instead, where x is the constant value. This instruction is encoded as 0x3e x, only two bytes, as you seem to expect.
It looks like you are confusing memory address and memory content. The 8080 has an address bus of 16 bits and a data bus of 8 bits. That means it can access memory from address 0x0000 up to 0xffff (16 full bits), or 65536 different addresses, but each of these addresses stores a single byte, with a value from 0x00 to 0xff (8 bits). That adds up to 64 kilobytes of memory.
Now, when you want to read a value from memory you need to specify the address of the value you are reading (remember, the address is 16 bits, the value is 8 bits). So the address has to be encoded into the instruction using 2 bytes. Intel CPUs use the little-endian scheme, so to encode an address the lower 8 bits are stored in the first byte and the higher 8 bits in the second one. That is what follows the LDA opcode, and that is why the instruction is 3 bytes long.
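To make the address/content distinction concrete, here is a small sketch that executes just these two instructions over a 64 KiB byte array. It is not a full 8080 emulator; the function name step, the sample address 0x1234 and the sample values are all made up for illustration.

```python
def step(memory: bytearray, pc: int, a: int) -> tuple[int, int]:
    """Execute one LDA a16 or MVI A,d8 instruction; return the new (pc, a)."""
    opcode = memory[pc]
    if opcode == 0x3A:
        # LDA a16: the 16-bit address follows the opcode, low byte first.
        addr = memory[pc + 1] | (memory[pc + 2] << 8)
        return pc + 3, memory[addr]      # A receives the byte stored AT addr
    if opcode == 0x3E:
        # MVI A, d8: the constant itself follows the opcode.
        return pc + 2, memory[pc + 1]
    raise ValueError(f"opcode {opcode:#04x} is not handled in this sketch")


memory = bytearray(0x10000)                  # 65536 addresses, one byte each
memory[0:3] = bytes([0x3A, 0x34, 0x12])      # LDA 1234h  (lo=0x34, hi=0x12)
memory[3:5] = bytes([0x3E, 0x42])            # MVI A, 42h
memory[0x1234] = 0x99                        # the byte LDA will fetch

pc, a = step(memory, 0, 0)
print(f"after LDA: pc={pc}, A={a:#04x}")
pc, a = step(memory, pc, a)
print(f"after MVI: pc={pc}, A={a:#04x}")
```

After the LDA, A holds 0x99 (the content of address 0x1234), not any part of the address itself; after the MVI, A holds the constant 0x42.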

RISCV resolving opcode

I need help with understanding how to solve this problem in RISCV.
Provide the assembly language instruction for the following hex values:
Address 1000: b3
Address 1001: 0b
Address 1002: 9c
Address 1003: 41
I know I have to convert to binary and that RISC-V is little-endian, but beyond that I don't know how to proceed. I have several problems like this, but I want to do the rest myself.
As you said, RISC-V is little-endian, so the word at address 1000 through 1003 is
0x419c0bb3, in binary:
01000001100111000000101110110011
The first thing to notice is that the instruction ends in 0110011 (the opcode field, bits 6-0). This matches several instructions; see pages 104 and 105 in riscv-spec-v2.2.pdf. To decode further, I examine the FUNC3 field in bits 14-12, which is 000. That narrows it down to a few possible instructions: ADD, SUB or MUL. I then examine the most significant 7 bits of the instruction (FUNC7), which are 0100000. The instruction is SUB. The full decoding of the instruction is:
FUNC7   rs2   rs1   FUNC3 rd    OPCODE
0100000 11001 11000 000   10111 0110011
In assembler this should be sub x23,x24,x25.
To check the answer it is best to use an assembler/emulator.
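Short of a full assembler/emulator, the field extraction above can be sketched in a few lines. This assumes the R-type layout shown in the table and only knows the three candidate mnemonics named in the answer; decode_r_type and MNEMONICS are illustrative names, not part of any RISC-V tool.

```python
def decode_r_type(word: int) -> dict:
    """Split a 32-bit RISC-V word into its R-type fields."""
    return {
        "opcode": word & 0x7F,          # bits 6-0
        "rd":     (word >> 7) & 0x1F,   # bits 11-7
        "funct3": (word >> 12) & 0x7,   # bits 14-12
        "rs1":    (word >> 15) & 0x1F,  # bits 19-15
        "rs2":    (word >> 20) & 0x1F,  # bits 24-20
        "funct7": (word >> 25) & 0x7F,  # bits 31-25
    }


# Only the three opcode-0110011 candidates discussed above.
MNEMONICS = {
    (0b0000000, 0b000): "add",
    (0b0100000, 0b000): "sub",
    (0b0000001, 0b000): "mul",
}

# Bytes at addresses 1000..1003, lowest address first (little-endian).
word = int.from_bytes(bytes([0xB3, 0x0B, 0x9C, 0x41]), "little")
f = decode_r_type(word)
name = MNEMONICS[(f["funct7"], f["funct3"])]
text = f"{name} x{f['rd']},x{f['rs1']},x{f['rs2']}"
print(f"{word:#010x} -> {text}")
```

For the bytes in the question this prints 0x419c0bb3 -> sub x23,x24,x25, matching the hand decoding.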

Why did the Linux/GNU linker choose address 0x400000?

I'm experimenting with ELF executables and the gnu toolchain on Linux x86_64:
I've linked and stripped (by hand) a "Hello World" test.s:
.global _start
.text
_start:
mov $1, %rax
...
into a 267 byte ELF64 executable...
0000000: 7f45 4c46 0201 0100 0000 0000 0000 0000 .ELF............
0000010: 0200 3e00 0100 0000 d400 4000 0000 0000 ..>.......@.....
0000020: 4000 0000 0000 0000 0000 0000 0000 0000 @...............
0000030: 0000 0000 4000 3800 0100 4000 0000 0000 ....@.8...@.....
0000040: 0100 0000 0500 0000 0000 0000 0000 0000 ................
0000050: 0000 4000 0000 0000 0000 4000 0000 0000 ..@.......@.....
0000060: 0b01 0000 0000 0000 0b01 0000 0000 0000 ................
0000070: 0000 2000 0000 0000 0000 0000 0000 0000 .. .............
0000080: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0000090: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000a0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000b0: 0400 0000 1400 0000 0300 0000 474e 5500 ............GNU.
00000c0: c3b0 cbbd 0abf a73c 26ef e960 fc64 4026 .......<&..`.d@&
00000d0: e242 8bc7 48c7 c001 0000 0048 c7c7 0100 .B..H......H....
00000e0: 0000 48c7 c6fe 0040 0048 c7c2 0d00 0000 ..H....@.H......
00000f0: 0f05 48c7 c03c 0000 0048 31ff 0f05 4865 ..H..<...H1...He
0000100: 6c6c 6f2c 2057 6f72 6c64 0a llo, World.
It has one program header (LOAD) and no sections:
There are 1 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x000000000000010b 0x000000000000010b R E 200000
This seems to load the entire file (file offset 0 thru 0x10b - elf header and all) at address 0x400000.
The entry point is:
Entry point address: 0x4000d4
This corresponds to offset 0xd4 in the file, and as we can see, that is where the machine code starts (mov $1, %rax).
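As a cross-check of the readelf output, the same fields can be pulled straight out of the first 0x78 bytes of the hexdump with Python's struct module. This is only a sketch: it assumes a little-endian ELF64 file with exactly one program header, as here, and the variable names simply mirror the ELF field names.

```python
import struct

# First 0x78 bytes of the hexdump: 64-byte ELF64 header + one 56-byte
# program header. bytes.fromhex() ignores the spaces between groups.
image = bytes.fromhex(
    "7f45 4c46 0201 0100 0000 0000 0000 0000"
    "0200 3e00 0100 0000 d400 4000 0000 0000"
    "4000 0000 0000 0000 0000 0000 0000 0000"
    "0000 0000 4000 3800 0100 4000 0000 0000"
    "0100 0000 0500 0000 0000 0000 0000 0000"
    "0000 4000 0000 0000 0000 4000 0000 0000"
    "0b01 0000 0000 0000 0b01 0000 0000 0000"
    "0000 2000 0000 0000"
)

# Header fields after the 16-byte e_ident: e_type, e_machine, e_version,
# e_entry, e_phoff, e_shoff, e_flags, then six 16-bit size/count fields.
header = struct.unpack_from("<HHIQQQIHHHHHH", image, 16)
e_entry, e_phoff = header[3], header[4]

# ELF64 program header (p_flags comes before p_offset in the 64-bit layout).
(p_type, p_flags, p_offset, p_vaddr,
 p_paddr, p_filesz, p_memsz, p_align) = struct.unpack_from("<IIQQQQQQ", image, e_phoff)

# Where in the file the entry point lives, given how the segment is mapped.
entry_file_offset = e_entry - p_vaddr + p_offset
print(hex(e_entry), hex(p_vaddr), hex(p_filesz), hex(entry_file_offset))
```

This prints 0x4000d4 0x400000 0x10b 0xd4, agreeing with the program header listing and the entry point shown above.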
My question is why (how) did the gnu linker choose address 0x400000 to map the file to?
The start address is usually set by a linker script.
For example, on GNU/Linux, looking at /usr/lib/ldscripts/elf_x86_64.x we see:
...
PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x400000)); \
. = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS;
The value 0x400000 is the default value for the SEGMENT_START() function on this platform.
You can find out more about linker scripts by browsing the linker manual:
% info ld Scripts
ld's default linker script has that 0x400000 value baked in for non-PIE executables.
PIEs (Position Independent Executables) don't have a default base address; they're always relocated by the kernel, with the kernel's default being 0x0000555... plus some ASLR offset unless ASLR is disabled for this process or system-wide. ld has no control over this. Note that most modern systems configure GCC to use -fPIE -pie by default, so it passes -pie to ld, and turns C into asm that's position-independent. Hand-written asm has to follow the same rules if you link it that way.
But what makes 0x400000 (4 MiB) a good default?
It has to be above mmap_min_addr = 65536 = 64K by default.
Being far away from 0 also gives more room to guard against a NULL dereference with an offset reading .text or .data/.bss memory (array[i] where array is NULL). Even without increasing mmap_min_addr (which this leaves room for without breaking executables), mmap usually picks high addresses at random, so in practice we have at least 4MiB of guard against NULL deref.
2M-aligned is good
This puts it at the start of a page directory in the next level up of the page tables, which means the same number of 4K page-table entries will be split across fewer 2M page-directory entries, saving kernel page-table memory and helping page-walk hardware cache better. For big static arrays, being close to the start of a 1G subtree of the next level up is also good.
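To see what "2M-aligned" means in page-table terms, here is a sketch that splits a virtual address into the standard x86-64 4-level paging indices (9 bits per level, 12-bit page offset). The function name split_va and the dictionary keys are made up for illustration.

```python
def split_va(va: int) -> dict:
    """Split a virtual address into x86-64 4-level page-table indices."""
    return {
        "pml4":   (va >> 39) & 0x1FF,  # bits 47-39
        "pdpt":   (va >> 30) & 0x1FF,  # bits 38-30, each entry spans 1 GiB
        "pd":     (va >> 21) & 0x1FF,  # bits 29-21, each entry spans 2 MiB
        "pt":     (va >> 12) & 0x1FF,  # bits 20-12, each entry spans 4 KiB
        "offset": va & 0xFFF,          # bits 11-0
    }


parts = split_va(0x400000)
print(parts)
print("2 MiB aligned:", 0x400000 % (2 << 20) == 0)
```

For 0x400000 the pt index and offset are both 0 and the pd index is 2, so the mapping begins at the first entry of a fresh page table (the third 2 MiB slot) rather than part-way through one.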
IDK why 4MiB instead of 2MiB, or what the developers' reasoning was. 4MiB is the 32-bit largepage size without PAE (4-byte PTEs so 10 bits per level instead of 9), but a CPU has to be using x86-64 page tables to be in 64-bit mode.
A low start address allows nearly 2 GiB of static arrays
(Without using a larger code model, where at least large arrays have to be addressed in ways that are sometimes less efficient. See section 3.5.1 Architectural Constraints in the x86-64 System V ABI document for details on code models.)
The default code model for non-PIE executables ("small") lets programs assume that any static address is in the low 2GiB of virtual address space. So any absolute address in .text/.rodata, .data, .bss can be used as a 32-bit sign-extended immediate in the machine code where that's more efficient.
(This is not the case in a PIE or shared library: see 32-bit absolute addresses no longer allowed in x86-64 Linux? for the things you / the compiler can't do in x86-64 asm as a result, notably addss xmm0, [foo + rdi*4] instead requires a RIP-relative LEA to get the array start address into a register. x86-64's only RIP-relative addressing mode is [RIP+rel32], without any general-purpose registers.)
Starting the executable's sections/segments near the bottom of virtual address space leaves almost the whole 2GiB available for text+data+bss to be that big. (It might have been possible to have a higher default, and have large executables make ld choose a lower address to make them fit, but that would be a more complicated linker script.)
This includes zero-initialized arrays in the .bss, which don't make the executable file huge, just the process image in memory. In practice, Fortran programmers tend to run into this more than C and C++ programmers, since static arrays are popular there. For example, gfortran for dummies: What does mcmodel=medium do exactly? has a good explanation of a build error with the default small model, and the resulting x86-64 asm difference for medium (where objects above a certain size threshold are not assumed to be in the low 2G or within +-2G of the code. But code and smaller static data still are, so the speed penalty is minor.)
For example static float arr[1UL<<28]; is a 1 GiB array. If you had 3 of them, they couldn't all start inside the low 2 GiB (which may be all you need for hand-written asm), let alone have each element accessible.
gcc -fno-pie expects to be able to compile float *p = &arr[size-1]; to mov $arr+1073741820, %edi, a 5-byte mov $imm32. RIP-relative won't work either if the target address is more than 2GiB away from the code generating the address (or loading from it with movss arr+1073741820(%rip), %xmm0; RIP-relative is the normal way to load/store static data even in a non-PIE, when there's no runtime-variable index.) That's why the small-PIC model also has a 2GiB size limit on text+data+bss (plus gaps between segments): all static data and code needs to be within 2GiB of any other that might want to reach it.
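The numbers in this example can be checked with a little arithmetic. The sketch below assumes, purely for illustration, that the first array starts right at 0x400000 and that the arrays are laid out back to back; real layouts have headers, code and other data in between.

```python
ELEMENT_SIZE = 4                  # sizeof(float)
COUNT = 1 << 28                   # static float arr[1UL<<28]
ARRAY_BYTES = ELEMENT_SIZE * COUNT
LAST_OFFSET = (COUNT - 1) * ELEMENT_SIZE   # byte offset of arr[size-1]


def fits_signed_imm32(value: int) -> bool:
    """True if value can be encoded as a sign-extended 32-bit constant."""
    return -(1 << 31) <= value < (1 << 31)


BASE = 0x400000                   # hypothetical address of the first array
print("array size in GiB:", ARRAY_BYTES / (1 << 30))
print("last element offset:", LAST_OFFSET)

# Start address of each of three back-to-back 1 GiB arrays.
for i in range(3):
    start = BASE + i * ARRAY_BYTES
    print(f"array {i}: start={start:#x}, fits imm32: {fits_signed_imm32(start)}")
```

The last-element offset comes out as 1073741820, the constant in the mov above, and under these assumptions the third array's start address no longer fits in a sign-extended 32-bit immediate.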
If your code only ever accesses high elements or their addresses via runtime-variable indices, you only need the start of each array, the symbol itself, to be in the low 2 GiB. I forget if the linker enforces having the end-of-bss within the low 2GiB; it might since the linker script puts a symbol there that some CRT startup code might reference.
Footnote 1: There aren't any useful sizes for a code model smaller than 2GiB. x86-64 machine code uses either 8-bit or 32-bit constants for immediates and addressing-mode displacements. 8-bit (256 bytes) is too small to be usable, and many important instructions like call rel32, mov r32, imm32, and [rip+rel32] addressing are only available with 4-byte, not 1-byte, constants anyway.
Limiting to the low 2 GiB (instead of 4) means that addresses can safely be zero-extended as with mov edi, OFFSET arr, or sign-extended, as with mov eax, [arr + rdi*4]. Remember that addresses aren't the only use-case for [reg + disp32] addressing modes; [rbp - 256] can often make sense, so it's good that x86-64 machine code sign-extends disp8 and disp32 to 64-bit, not zero-extends.
Implicit zero-extension to 64-bit happens when writing a 32-bit register, as with mov-immediate to put an address in a register, where 32-bit operand-size is a smaller machine-code instruction than 64-bit operand-size. See How to load address of function or label into register (which also covers RIP-relative LEA).
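The reason for the 2 GiB (rather than 4 GiB) limit can be illustrated by comparing the two kinds of extension. This is a sketch in plain integer arithmetic; the helper names and the two sample addresses are made up.

```python
MASK64 = (1 << 64) - 1


def zero_extend32(value: int) -> int:
    """What writing a 32-bit register does: the upper 32 bits become 0."""
    return value & 0xFFFFFFFF


def sign_extend32(value: int) -> int:
    """What a disp32 or sign-extended imm32 does: bit 31 is copied upward."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value -= 1 << 32
    return value & MASK64          # show the result as a 64-bit pattern


low = 0x403FFFFC                   # below 2 GiB
high = 0x80400000                  # between 2 GiB and 4 GiB

for addr in (low, high):
    z, s = zero_extend32(addr), sign_extend32(addr)
    print(f"{addr:#x}: zero-ext={z:#x} sign-ext={s:#x} agree={z == s}")
```

For the address below 2 GiB both extensions give the same 64-bit value, so either encoding is safe; for the one above 2 GiB the sign-extended form points at 0xffffffff80400000 instead.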
Related for 32-bit Windows
Raymond Chen wrote an article about why the same 0x400000 base address is the default for 32-bit Windows.
He mentions that DLLs get loaded at high addresses by default, and a low address is far from that. x86-64 SysV shared objects can get loaded anywhere there's a large enough gap of address space, with the kernel defaulting to near the top of user-space virtual address-space, i.e. the top of the canonical range. But ELF shared objects are required to be fully relocatable so would work fine anywhere.
The 4MiB choice for 32-bit Windows was also motivated by avoiding the low 64K (NULL deref), and by picking the start of a page directory for legacy 32-bit page tables (where the "largepage" size is 4M, not 2M as for x86-64 or PAE). There were also a bunch of Win95 and Win3.1 legacy memory-map reasons why at least 1MiB or 4MiB was partially necessary, and stuff like working around CPU bugs.
Page zero of a task's virtual address space is kept unmapped so that null-pointer references can be caught through a page-fault exception leading to SIGSEGV. 4 MB fits with "big page" granularity (as opposed to the "normal page" granularity of 4 KB), so on settings with 4 MB page granularity, the address range 0x000000 to 0x3FFFFF is unmapped, making 0x400000 the first valid address in the task's virtual address space.

rfid tags: same tag, different codes on different readers

Is anyone familiar with RFID codes here?
I have an EM4102 type tag here. My handheld reader says on its display:
EM4102 tag, ID 04178649C1
The same tag, when read on a Gigatek/Promag PCR125 CF-card reader gives me the exact same code:
04178649C1
However, an ACG RF PC CF-card reader gives me the code
20E8619283
This reader is capable of reading different types of tags and also reports the correct type (EM4x02, length 5 bytes).
I have tried a few readers of the same model and they all give me the same code.
I guess that reader just reports the code in a different way. Perhaps I have to shift some bits around (wouldn't be the first time) or there are error correction bits still included in the code?
FYI, the reader is documented here. The section regarding this type of tags just states:
The EM4x02 label only provides a 5 bytes serial number. The label
starts to send its response immediately after entering an energizing
field. Each transponder has its own unique serial number, which cannot
be changed.
Any clue what the reader is doing?
I figured it out myself.
20E8619283 in binary is:
0010 0000 1110 1000 0110 0001 1001 0010 1000 0011
These are five bytes, two nibbles each. Mirroring the bit order of each byte (bit 0 becomes bit 7, bit 1 becomes bit 6, etc.) I get:
0000 0100 0001 0111 1000 0110 0100 1001 1100 0001
which in hexadecimal notation is 04178649C1, the correct code.
So apparently the reader is not interpreting/reporting the bits in the right order...
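The per-byte mirroring is easy to script if there are many tags to convert. A minimal sketch, with made-up function names:

```python
def reverse_bits(byte: int) -> int:
    """Mirror the 8 bits of one byte (bit 0 <-> bit 7, bit 1 <-> bit 6, ...)."""
    return int(f"{byte:08b}"[::-1], 2)


def mirror_id(hex_id: str) -> str:
    """Mirror the bit order inside each byte of a hex-encoded tag ID."""
    return bytes(reverse_bits(b) for b in bytes.fromhex(hex_id)).hex().upper()


print(mirror_id("20E8619283"))
```

This prints 04178649C1. The byte order is left unchanged; only the bits within each byte are reversed, and applying the function twice gives back the original code.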

How to calculate my GPIO port address

I have a Jetway NF81-T56 motherboard which has a header providing 8 I/O lines labeled GPIO30-GPIO37. There is no GPIO driver in my CentOS6 install, and I am attempting to write a driver. A Fintek F71869 Super IO chip provides the GPIO and other I/O functions. I can access and modify the GPIO3 registers through the 0x2e/0x2f ports, but haven't been able to access the data port using the GPIO BASE_ADDR set in the F71869 GPIO registers. I have read those registers, and the GPIO BASE_ADDR is set to 0x0a00. The manual page for the chip states:
The index port is BASE_ADDR[15:2] + 5 and the data port is BASE_ADDR[15:2] + 6
I have set the data port to 0x0f (as displayed by connected LEDs) and tried reading ports 0x0a00-0x0a7f. All returned 0xff, not 0x0f. Does anyone know how to interpret the "BASE_ADDR[15:2]" notation? I have tried searching the Internet and contacting the manufacturer, to no avail.
This is hardware-style vector (bitfield) notation: [15:2] selects bits 15 down to 2, so basically you right-shift two places.
0x0a00 becomes 0x0280 if I'm doing it right in my head.
To draw it out here are the bits [15:0]
0x0a00 = 0000 1010 0000 0000
now select bits [15:2]
0000 1010 0000 00
which we re-format as [13:0] of a new vector
00 0010 1000 0000 = 0x0280
So it looks like you should be accessing at 0x0285 and 0x0286. However, you need to be sure you've got the board configured correctly, and more important you need to be sure that nothing else is also located at those addresses.
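The same bit selection, written out as a sketch that follows this answer's reading of the notation (select bits 15 down to 2 and renumber them from 0). The helper bit_slice is a made-up name, and the resulting port numbers are only as good as that interpretation.

```python
def bit_slice(value: int, high: int, low: int) -> int:
    """Return bits [high:low] of value, renumbered so that bit `low` is bit 0."""
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


BASE_ADDR = 0x0A00                       # value read from the F71869 registers
base = bit_slice(BASE_ADDR, 15, 2)       # BASE_ADDR[15:2]
index_port = base + 5
data_port = base + 6
print(f"base={base:#06x} index={index_port:#06x} data={data_port:#06x}")
```

This prints base=0x0280 index=0x0285 data=0x0286, the two ports named above.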

Resources