Include binary file in as86/bin86 - linux

I have written a bit of code in i8086 assembler that is supposed to put an 80x25 image into the VRAM and show it on screen.
entry start
start:
mov di,#0xb800 ; Point ES:DI at VRAM
mov es,di
mov di,#0x0000
mov si,#image ; And DS:SI at Image
mov cx,#0x03e8 ; Image is 1000 bytes
mov bl,#0x20 ; Print spaces
; How BX is used:
; |XXXX XXXX XXXXXXXX|
;            ^^^^^^^^ BL contains ascii whitespace
;  ^^^^ BH higher 4 bits contain background color
;       ^^^^ BH lower 4 bits contain unused foreground color
img_loop:
seg ds ; Load color
mov bh,[si]
seg es ; Write a whitespace and color to VRAM
mov [di],bx
add di,#2 ; Advance one 'pixel'
sal bh,#4 ; Shift the unused lower 4-bits so that they become background color for the 2nd pixel
seg es
mov [di],bx
add di,#2
add si,#1
sub cx,#1 ; Repeat until all 1000 bytes are read
jnz img_loop
endless:
jmp endless
image:
GET splash.bin
The problem is that I cannot get the as86 assembler to include the binary data from the image file. I have looked at the man page, but I could not find anything that works.
If I try to build the above code, it gives me no error, but the output file produced by the linker is only 44 bytes in size, so clearly it did not bother to put in the 1000-byte image.
Can anybody help me with that? What am I doing wrong?

I am not certain that this will help you, as I have never tried it for 8086 code. But you might be able to make it work.
The objcopy program can convert binary objects to various different formats. Like this example from the man objcopy page:
objcopy -I binary -O <output_format> -B <architecture> \
--rename-section .data=.rodata,alloc,load,readonly,data,contents \
<input_binary_file> <output_object_file>
So from that you'd have an object file with your <input_binary_file> in a section named .rodata. But you could name it whatever you wanted. Then use a linker to link your machine code to the image data.
The symbol names are created for you too. Also from the man page:
-B
--binary-architecture=bfdarch
Useful when transforming an architecture-less input file into an object file. In this case the output architecture can be set to bfdarch. This option will be ignored if the input file has a known bfdarch. You can access this binary data inside a program by referencing the special symbols that are created by the conversion process. These symbols are called _binary_objfile_start, _binary_objfile_end and _binary_objfile_size. e.g. you can transform a picture file into an object file and then access it in your code using these symbols.
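To make the symbol naming concrete, here is a minimal Python sketch that predicts the names for a given input file. It assumes (as far as I know, but check with nm on your object file) that objcopy builds the objfile part from the input path exactly as given on the command line, with every character that is not a letter or digit replaced by an underscore. The helper name objcopy_symbols is made up for this example.

```python
import re


def objcopy_symbols(input_path: str) -> dict:
    """Predict the symbols 'objcopy -I binary' creates for input_path.

    Assumption: every character that is not an ASCII letter or digit
    is replaced by '_' when the name is built.
    """
    stem = re.sub(r"[^0-9A-Za-z]", "_", input_path)
    return {kind: f"_binary_{stem}_{kind}" for kind in ("start", "end", "size")}


if __name__ == "__main__":
    for name in objcopy_symbols("splash.bin").values():
        print(name)
```

Under that assumption splash.bin would be reachable as _binary_splash_bin_start, and a path such as img/splash.bin would carry the directory in the name as well.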

If your whole code is pure code (no executable headers, no relocation...) you can just manually concatenate the image at the end of the code (and of course remove GET splash.bin). In Linux for example you can do cat code-binary image-binary > final-binary.

Thank you everybody else trying to help me. Unfortunately I did not get the objcopy to work (maybe I am just too stupid, who knows) and while I actually used cat at first, I had to include multiple binary files soon, which should still be accessible via labels in my assembler code, so that was not a solution either.
What I ended up doing was the following: You reserve the exact amount of bytes in your assembler source code directly after the label you wanna put in your binary file, i.e.:
splash_img:
.SPACE 1000
snake_pit:
.SPACE 2000
Then you assemble your source code creating a symbol table by adding the -s option, i.e. -s snake.symbols to your call to as86. The linker call does not change. Now you have a binary file that has a bunch of zeroes at the position you wanna have your binary data, and you have a symbol table that should look similar to this:
0 00000762 ---R- snake_pit
0 0000037A ---R- splash_img
All you gotta do now is get a program to overwrite the binary file created by the linker with your binary include file starting at the addresses found in the symbol table. It is up to you how you wanna do it, there are a lot of ways, I ended up writing a small C program that does this.
Then I just call ./as86_binin snake snake.symbols splash_img splash.bin and it copies the binary include into my linked assembler program.
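The C program itself is not shown, so here is a rough Python sketch of the same idea, not the original tool. It assumes the symbol table lines have the four fields shown above (segment, hex value, flags, name) and that the symbol value equals the file offset, which only holds for a flat binary without an executable header. The function names parse_symbols and patch_binary are made up for this example.

```python
def parse_symbols(text: str) -> dict:
    """Map label -> value from lines like '0 0000037A ---R- splash_img'."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip anything that is not a plain symbol line
        _segment, value, _flags, name = fields
        try:
            table[name] = int(value, 16)
        except ValueError:
            continue  # second field was not a hex number
    return table


def patch_binary(image: bytearray, offset: int, payload: bytes) -> None:
    """Overwrite image[offset:offset+len(payload)] with payload, in place.

    Assumption: offset is a file offset (flat binary, no header).
    """
    if offset < 0 or offset + len(payload) > len(image):
        raise ValueError("payload does not fit inside the linked binary")
    image[offset:offset + len(payload)] = payload


if __name__ == "__main__":
    symbols = parse_symbols("0 00000762 ---R- snake_pit\n"
                            "0 0000037A ---R- splash_img\n")
    linked = bytearray(0x762 + 2000)          # stand-in for the linker output
    patch_binary(linked, symbols["splash_img"], b"\x11" * 1000)
    print(hex(symbols["splash_img"]), linked[0x37A], linked[0x762])
```

For real use you would read the linker output and the include file with open(..., "rb"), patch the bytearray, and write it back.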
I am sorry for answering my own question now, but I felt like this is the best way to do it. It is quite unfortunate bin86 doesn't have a simple binary include macro on its own. If anybody else runs into this problem in the future, I hope this will help you.

Double up bit gb-emu

Apologies in advance as this is an old topic. I was reading the following post on how the Nintendo logo data is decompressed and scaled before being copied into VRAM during bootstrap. Interestingly enough, the data written does indeed look like gibberish (as pointed out by the questioner), and I have tried my best (with a GB emulator I wrote) to produce that same output, but without success.
Link to post
The assembly code in question is this part of the boot rom:
LD C,A ; $0095 "Double up" all the bits of the graphics data
LD B,$04 ; $0096 and store in Video RAM
Addr_0098:
PUSH BC ; $0098
RL C ; $0099
RLA ; $009b
POP BC ; $009c
RL C ; $009d
RLA ; $009f
DEC B ; $00a0
JR NZ, Addr_0098 ; $00a1
LD (HL+),A ; $00a3
INC HL ; $00a4
LD (HL+),A ; $00a5
INC HL ; $00a6
RET
In a reply to the above post, the output to VRAM is shown to be:
8000: 00000000000000000000000000000000
8010: F000F000FC00FC00FC00FC00F300F300
8020: 3C003C003C003C003C003C003C003C00
8030: F000F000F000F00000000000F300F300
8040: 000000000000000000000000CF00CF00
... and so on
Can anyone explain how this output is generated and if it is indeed correct?
Many thanks in advance.
P.S. The assumption is made that the Nintendo logo is explicitly (inside some C/Java code) copied into memory starting at address 0104h during the boot process to test the bootstrap.
.DB $CE,$ED,$66,$66,$CC,$0D,$00,$0B,$03,$73,$00,$83,$00,$0C,$00,$0D
.DB $00,$08,$11,$1F,$88,$89,$00,$0E,$DC,$CC,$6E,$E6,$DD,$DD,$D9,$99
.DB $BB,$BB,$67,$63,$6E,$0E,$EC,$CC,$DD,$DC,$99,$9F,$BB,$B9,$33,$3E
After going through my code and seeing a potential silly bug (maybe a couple or two) I was able to finally get the same result as above. Please consider this resolved.
Basically, I was forgetting to update the F (flags) register for INC n, ADD n and SUB n.
So technically, the above output seems to be correct.
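For anyone who wants to check the dump without stepping through the boot ROM, here is a small Python sketch of what the routine does. It relies on two assumptions of mine about the surrounding boot code: the routine runs twice per logo byte (high nibble first, then low nibble), and VRAM was zeroed beforehand, so the bytes skipped by INC HL stay 00. The function names are made up for this example.

```python
def double_nibble(nibble: int) -> int:
    """Repeat each of the 4 bits twice, e.g. 0b1100 -> 0b11110000.

    This mirrors the RL C / RLA pair being executed twice per bit.
    """
    out = 0
    for bit in (3, 2, 1, 0):
        b = (nibble >> bit) & 1
        out = (out << 2) | (b * 0b11)
    return out


def expand_logo(logo: bytes) -> bytes:
    """Return the VRAM bytes produced for the given logo bytes."""
    vram = bytearray()
    for byte in logo:
        for nibble in (byte >> 4, byte & 0x0F):
            row = double_nibble(nibble)
            # LD (HL+),A / INC HL twice: A lands at HL and HL+2,
            # the skipped bytes are assumed to still be zero.
            vram += bytes([row, 0x00, row, 0x00])
    return bytes(vram)


if __name__ == "__main__":
    print(expand_logo(bytes([0xCE, 0xED])).hex().upper())
    # F000F000FC00FC00FC00FC00F300F300
```

The first two logo bytes ($CE, $ED) give exactly the row shown at 8010, and $66,$66 and $CC,$0D give the rows at 8020 and 8030.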

Extend section in Mach-O file

I am trying to extract libraries from the dyld_shared_cache, and need to fix up external references.
For example, the pointers in the __DATA.__objc_selrefs section usually point to data outside the Mach-O file. To fix that I would have to copy the corresponding C-string from the dyld cache and append it to the __TEXT.__objc_methname section.
From my understanding of the Mach-O file format, though, extending __TEXT.__objc_methname would shift all the sections after it and would force me to fix all the offsets and pointers that reference them. Is there a way to add data to a section without breaking a lot of things?
Thanks!
Thanks to @Kamil.S for the idea about adding a new load command and section.
One way to add more data to a section is to create a duplicate segment and section and insert it before the __LINKEDIT segment.
1) Slide the __LINKEDIT segment so we have space to add the new section.
   - Define the slide amount. This must be page-aligned, so I chose 0x4000.
   - Add the slide amount to the relevant load commands. This includes but is not limited to:
     - __LINKEDIT segment (duh)
     - dyld_info_command
     - symtab_command
     - dysymtab_command
     - linkedit_data_commands
   - Physically move the __LINKEDIT in the file.
2) Duplicate the section and change the following [1]:
   - size, should be the length of your new data.
   - addr, should be in the free space.
   - offset, should be in the free space.
3) Duplicate the segment and change the following [1]:
   - fileoff, should be the start of the free space.
   - vmaddr, should be the start of the free space.
   - filesize, anything as long as it is bigger than your data.
   - vmsize, must be identical to filesize.
   - nsects, change to reflect how many sections you're adding.
   - cmdsize, change to reflect the size of the segment command and its section commands.
4) Insert the duplicated segment and sections before the __LINKEDIT segment.
5) Update the mach_header:
   - ncmds
   - sizeofcmds
6) Physically write the extra data in the file.
[1] You can optionally change the segname and sectname fields, though it isn't necessary. Thanks Kamil.S!
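The arithmetic behind the slide can be sketched in a few lines of Python. This is only an illustration of the rounding and offset bookkeeping, not a Mach-O editor. The helper names are made up, and treating an offset of 0 as "table not present, leave it alone" is my assumption that you should verify against the load commands you actually touch.

```python
PAGE_SIZE = 0x4000  # the page-aligned amount chosen in the steps above


def aligned_slide(extra_bytes: int, page_size: int = PAGE_SIZE) -> int:
    """Smallest multiple of page_size that can hold extra_bytes."""
    return -(-extra_bytes // page_size) * page_size


def slide_offsets(offsets: dict, slide: int) -> dict:
    """Add slide to every file offset that points into __LINKEDIT.

    Assumption: an offset of 0 means the table is absent and stays 0.
    """
    return {name: (off + slide if off else 0) for name, off in offsets.items()}


if __name__ == "__main__":
    slide = aligned_slide(0x123)                     # a few new selector strings
    print(hex(slide))
    print(slide_offsets({"symoff": 0x8000, "stroff": 0x9000, "tocoff": 0}, slide))
```

The same slide has to be added to the __LINKEDIT segment's fileoff and vmaddr as well, otherwise the tables and the segment no longer agree.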
UPDATE
After clarifying with OP that the extension of __TEXT.__objc_methname would happen during Mach-O post-processing of an existing executable, I had a fresh look at the problem.
Another take would be to create a new load command LC_SEGMENT_64 with a new __TEXT_EXEC.__objc_methname segment / section entry (normally __TEXT_EXEC is used for some kernel stuff but essentially it's the same thing as __TEXT). Here's a quick POC to illustrate the concept:
#import <Foundation/Foundation.h>
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        printf("%lx",[NSObject new]);
    }
    return 0;
}
Compile like this:
gcc main.m -c -o main.o
ld main.o -rename_section __TEXT __objc_methname __TEXT_EXEC __objc_methname -lobjc -lc
Interestingly, only ld up to Mojave 10.14.6 generates __TEXT.__objc_methname; there is no trace of it on Catalina, where it's done differently.
UPDATE2.
Playing around with it, I noticed execution rights for __TEXT segment (and __TEXT_EXEC for that matter) are not required for __objc_methname to work.
Even better specific segment & section names are not required:
I could pull off:
__DATA.__objc_methname
__DATA_CONST.__objc_methname
__ARBITRARY.__arbitrary
or in my case last __DATA section
__DATA.__objc_classrefs, where the selector name got appended to the original data.
It's all fine as long as a proper null terminated C-string with the selector name is there. If I intentionally break the "new\0" in hex editor or MachOView I'll get
"+[NSObject ne]: unrecognized selector sent to instance ..."
upon launching my POC executable so the value is used for sure.
So, to sum up, the __TEXT.__objc_methname section itself is likely some debugger hint made by the linker. The app runtime seems to only need selector names as char* anywhere in memory.

How to MODIFY an ELF file with Python

I am trying to modify an ELF file's .text segment using Python.
I successfully acquired the .text field so then I can simply change the bit that I want. The thing is that pyelftools does not provide any way to generate an ELF file from the ELF object.
So what I tried is the following:
I've created a simple helloworld program in C, compiled it and got the a.out file. Then I used pyelftools to parse it.
To change/edit any section of the ELF file I simply used pyelftools's ELFFile class methods to acquire the field's (i) offset and (ii) size. So then I know exactly where to look inside the binary file.
So after getting the values-margins of the field (A,B) I simply treated the file like a normal binary. The only thing I did is to do a file.seek(A) to move the file pointer to the specific section that I wish to modify.
def edit_elf_section(elf_object, original_file, section):
    elf_section = elf_object.get_section_by_name(section)
    # !! IMPORTANT !!
    # Use sh_offset (position in the file), NOT sh_addr (the section's logical address).
    section_start = elf_section['sh_offset']
    section_end = section_start + elf_section['sh_size']
    original_file.seek(section_start)
    # Write whatever you want to the file here
    # Fails if the write ran past the end of the section
    assert original_file.tell() <= section_end, "wrote outside the section"
To validate the results, you can use the diff binary (or cmp) to check whether the files are identical.
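A minimal stdlib-only sketch of the "treat it like a normal binary" step and the validation could look like this. It assumes you already have the section's file offset and size (for example sh_offset and sh_size as read above); patch_region and differing_offsets are names I made up for this example, and the comparison is a plain byte-by-byte check rather than a wrapper around diff.

```python
def patch_region(path: str, start: int, size: int, new_bytes: bytes) -> None:
    """Overwrite bytes inside [start, start + size) of the file at path.

    Refuses to write if new_bytes would run past the end of the region,
    so neighbouring sections are never touched.
    """
    if len(new_bytes) > size:
        raise ValueError("new data is larger than the section")
    with open(path, "r+b") as f:   # r+b: modify in place, do not truncate
        f.seek(start)
        f.write(new_bytes)


def differing_offsets(path_a: str, path_b: str) -> list:
    """Return the offsets at which two equally long files differ."""
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        data_a, data_b = a.read(), b.read()
    if len(data_a) != len(data_b):
        raise ValueError("files have different lengths")
    return [i for i, (x, y) in enumerate(zip(data_a, data_b)) if x != y]
```

Keeping an untouched copy of a.out and calling differing_offsets on the copy and the patched file shows exactly which bytes changed, which should all lie between the section's start and end.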

How to find pointer w/ offset [ecx+eax*4] (address offset?)

I've seen this topic: How to find a point with offset eax+ebx*4
eax will be the pointer value to look for
ebx*4 will be the offset (ebx is the offset in an array with elements of 4 bytes long)
so:
ebx=0 : offset=0
ebx=1 : offset=4
ebx=2 : offset=8
ebx=3 : offset=c
ebx=4 : offset=10
But I still don't understand how I can determine ebx.
Here is my situation: I'm trying to get current ammo pointer for Red Faction: Guerrilla (gfwl version)
I see that the address of this ammo changes when I load another save file. So I use "Find out what writes to this address" on the ammo pointer (which no longer works after loading another save file).
Then I load another save file to see what it writes to the pointer:
The result is the pointer with offset [ecx+eax*4]
So I make a pointer like this
ecx=00C1B988 (address 00C1B988 holds the value: ECX=00C1B994)
EAX*4= I don't know how to work with this, so I just put: E71*4
But it still doesn't work when I load another save file. I'm stuck at E71*4: what should I replace E71 with? I even tried to search for the value E71 (or 3697), but it seems like I'm going nowhere.
Usually when you see ecx+eax*4 it's indexing into an array. ECX points to the array, EAX is the element # and 4 is size of the element. Often times when you see 4 or 8 it's because it's an array of pointers and that's the size of the pointer on x86.
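To see what the CPU computes for [ecx+eax*4], here is a tiny Python sketch of the address arithmetic. The numbers are the ones from the question (ECX=00C1B994, EAX=E71); the function name is made up, and the result only tells you where that one access landed, not a stable pointer path.

```python
def effective_address(base: int, index: int, scale: int = 4) -> int:
    """Address used by [base + index*scale], wrapped to 32 bits."""
    return (base + index * scale) & 0xFFFFFFFF


if __name__ == "__main__":
    # Element offsets for the first few indices: 0x0, 0x4, 0x8, 0xc, 0x10
    print([hex(effective_address(0, i)) for i in range(5)])
    # The access from the question: ECX=00C1B994, EAX=E71
    print(hex(effective_address(0x00C1B994, 0xE71)))  # 0xc1f358
```

Since EAX is just the array index at that moment, it can be different after loading another save, which is why hard-coding E71 does not survive a reload.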
What you're seeing is not some encryption/obfuscation/anticheat. It is just how object oriented programming/C++ gets compiled into assembly.
That pointer chain you're creating isn't going to work for you, the solution will be to get the address of the weapon/player object so you can offset into it to get address of the ammo. To do this you need to:
find another pointer manually
find another pointer using pointer scanner
pattern scanning + hooking and pulling the address out of the register
If perhaps this is some obfuscation, you can easily get the value of EAX by hooking the instruction and grabbing its value.

80x86 Assembly - Very basic I/O program conversion to Linux from Windows

So my first day of Assembly class, and what do you know? My professor teaches everything on her Windows box, using Windows API calls, etc. which is fine except that I'm running Ubuntu on my box..
Basically, I'm hoping I can find either a workaround or some form of common-grounds in order for me to get my assignments done.
Today, our first programming assignment was to input two integers and output the sum. I followed my professor's code as follows:
.386
.model flat
ExitProcess PROTO NEAR32 stdcall, dwExiteCode:DWORD
include io.h
cr EQU 0dh
lf EQU 0ah
.stack 4096
.data
szPrompt1 BYTE "Enter first number: ", 0
szPrompt2 BYTE "Enter second number: ", 0
szLabel1 BYTE cr, lf, "The sum is "
dwNumber1 DWORD ? ; numbers to be added
dwNumber2 DWORD ?
szString BYTE 40 DUP (?) ; input string for numbers
szSum BYTE 12 DUP (0) ; sum in string form
szNewline BYTE cr,lf,0
.code ; start of main program code
_start:
output szPrompt1 ; prompt for first number
input szString,40 ; read ASCII characters
atod szString ; convert to integer
mov dwNumber1,eax ; store in memory
output szPrompt2 ; repeat for second number
input szString,40
atod szString
mov dwNumber2,eax
mov eax,dwNumber1 ; first number to EAX
add eax,dwNumber2 ; add second number
dtoa szSum,eax ; convert to ASCII characters
output szLabel1 ; output label and results
output szSum
output szNewline
INVOKE ExitProcess,0 ; exit with return code 0
PUBLIC _start ; make entry point public
END ; end of source code
Simple and straightforward enough, yeah? So I turned it in today all linked up from the crappy school computers. And I completely understand all the concepts involved, however, I see 2 main issues here for if I actually want to assemble it on my box:
1) .model flat
2) ExitProcess PROTO NEAR32 stdcall, dwExiteCode:DWORD
Both of which I've heard are very Windows-specific. So my question is: how can I change this code so that it assembles on Linux?
Sorry if I'm missing any details, but I'll let you know if you need more.
Thanks!
Assembly code is, generally speaking, almost always platform specific. Indeed, the very syntax varies between assemblers, even within the same hardware and OS platform!
You'll also probably have problems with that io.h there - I would bet it's making a lot of calls into win32 APIs.
I would recommend simply using wine, along with a copy of whatever assembler your professor is using, to run your professor's examples. If it can run things like Microsoft Office and Steam, it can certainly run some trivial example code :)
