a way to run assembler code in protected environment (x86-64) - linux

I'm trying to create a Linux software in C++ which need to run code in a protected environment on x86 and x86-64 processor.
My problem is to find a way to run code in protected environment, first, only on x86-64 (it's a technical part of processors way of working), I have see Local Descriptors Table, but I found it no more works on x86-64. I also heard about the Intel VT technology, but documents seems very complicated.
Have you any idea of ways to run code in a protected environment on linux and x86-64 inside a process?
My goal is to create something like an OS inside a linux process.
Like Windows or Linux does, I want the program runned inside my protected environment no to access part of my software, and make systemcall if needed. I believe I have found a way to do so, I esxplain it below.

I have found a way to do what I want:
Each time my program will switch from main part to the program inside, it will use mprotect (a function of Glibc on Linux) to change the right to access to lot of part of the memory of the process.
Each time the program inside will make a systemcall to my program, it will change back the right to access to the memory.
You may thinks it stays security issues, because the program inside can run any kind of code and access to system call to linux and so can access to not allowed things. But I believe I can use a tricky which would prohibit the code inside to start any kind of opcodes.

Related

Program that runs on windows and linux

Is it possible to write a program (make executable) that runs on windows and linux without any interpreters?
Will it be able to take input and print output to console?
A program that runs directly on hardware, pure machine code as this should be possible in theory
edit:
Ok, file formats are different, system calls are different
But how hard or is it possible for kernel developers to introduce another executable format called raw for fun and science? Maybe raw program wont be able to report back but it should be able to inflict heavy load on cpu and raise its temperature as evidence of running for example
Is it possible to write a program (make executable) that runs on windows and linux without any interpreters?
in practice, no !
Levine's book Linkers and loaders explain why it is not possible in practice.
On recent Linux, an executable has the elf(5) format.
On Windows, it has some PE format.
The very first bytes of executables are different. And these two OSes have different system calls. The Linux ones are listed in syscalls(2).
And even on Linux, in practice, an executable is usually dynamically linked and depends on shared objects (and they are different from one distribution to the next one, so it is likely that an executable built for Debian/Testing won't run on Redhat). You could use the objdump(1), readelf(1), ldd(1) commands to inspect it, and strace(1) with gdb(1) to observe its runtime behavior.
Portability of software is often achieved by publishing it (in source form) with some open source license. The burden of recompilation is then on the shoulders of users.
In practice, real software (in particular those with a graphical user interface) depends on lots of OS specific and computer specific resources (e.g. fonts, screen size, colors) and user preferences.
A possible approach could be to have a small OS specific software base which generate machine code at runtime, like e.g. SBCL or LuaJit does. You could also consider using asmjit. Another approach is to generate opaque or obfuscated C or C++ code at runtime, compile it (with the system compiler), and load it -at runtime- as a plugin. On Linux, use dlopen(3) with dlsym(3).
Pitrat's book: Artificial Beings, the conscience of a conscious machine describes a software system (some artificial mathematician) which generates all of its C source code (half a million lines). Contact me by email to basile#starynkevitch.net for more.
The Wine emulator allows you to run some (but not all) simple Windows executables on Linux. The WSL layer is rumored to enable you to run some Linux executable on Windows.
PS. Even open source projects like RefPerSys or GCC or Qt may be (and often are) difficult to build.
No, mainly because executable formats are different, but...
With some care, you can use mostly the same code to create different executables, one for Linux and another one for windows. Depending on what you consider an interpreter Java also runs on both Windows and Linux (in a Java Virtual Machine though).
Also, it is possible to create scripts that can be interpreted both by PowerShell and by the Bash shell, such that running one of these scripts could launch a proper application compiled for the OS of the user.
You might require the windows user to run on WSL, which is maybe an ugly workaround but allows you to have the same executable for both Windows and Linux users.

Record dynamic instruction trace or histogram in QEMU?

I've written and compiled a RISC-V Linux application.
I want to dump all the instructions that get executed at run-time (which cannot be achieved by static analysis).
Is it possible to get a dynamic assembly instruction execution historgram from QEMU (or other tools)?
For instruction tracing, I go with -singlestep -d nochain,cpu, combined with some awk. This can become painfully slow and large depending on the code you run.
Regarding the statistics you'd like to obtain, delegate it to R/numpy/pandas/whatever after extracting the program counter.
The presentation or video of user "yvr18" on that topic, might cover some aspects of QEMU tracing at various levels (as well as some interesting heatmap visualization).
QEMU doesn't currently support that sort of trace of all instructions executed.
The closest we have today is that there are various bits of debug logging under the -d switch, and you can combine the tracing of "instructions translated from guest to native" with the "blocks of translated code executed" translation to work out what was executed, but this is pretty awkward.
Alternatively you could try scripting the gdbstub interface to do something like "disassemble instruction at PC; singlestep" which will (slowly!) give you all the instructions executed.
Note: There ongoing work to improve QEMU's ability to introspect guest execution so that you can write a simple 'plugin' with functions that are called back on events like guest instruction execution; with that it would be fairly easy to write a dump of guest instructions executed (or do more interesting processing), but this is still work-in-progress, so not available yet.
It seems you can do something similar with rv8 (https://github.com/rv8-io/rv8), using the command:
rv-jit -l
The "spike" RISC-V emulator allows tracing instructions executed, new values stored into registers, or just simply a histogram of PC values (from which you can extract what instruction was at each PC location).
It's not as fast as qemu, but runs at 100 to 200 MIPS on current x86 hardware (at least without tracing enabled)

How does linux protect memory?

I'm interested in how linux runs in protected mode from an assembly point of view. Which registers and interrupts are used when it comes to putting the cpu in protected mode for an i386:0x86_64 machine? I understand how memory managment works when I look at the c source of functions like mmap and mprotect, however whats keeping me from taking over with assembly? Where can I get more info on this?
I believe you're looking for arch/x86/mm/ -- arch/x86/mm/init.c sets up the page tables for the correct architecture (ia32 or AMD64) and takes into account the processor features available (PSE, PGE, etc.).
It bears stressing: This is a function of the processor. Linux tells the processor what to protect, and the processor does it.
AFA the system call interface, have a glance at http://stromberg.dnsalias.org/~strombrg/pbmonherc.html from back before the C library had mmap, but after the Linux kernel did. See file mmap.c.

Address space identifiers using qemu for i386 linux kernel

Friends, I am working on an in-house architectural simulator which is used to simulate the timing-effect of a code running on different architectural parameters like core, memory hierarchy and interconnects.
I am working on a module takes the actual trace of a running program from an emulator like "PinTool" and "qemu-linux-user" and feed this trace to the simulator.
Till now my approach was like this :
1) take objdump of a binary executable and parse this information.
2) Now the emulator has to just feed me an instruction-pointer and other info like load-address/store-address.
Such approaches work only if the program content is known.
But now I have been trying to take traces of an executable running on top of a standard linux-kernel. The problem now is that the base kernel image does not contain the code for LKM(Loadable Kernel Modules). Also the daemons are not known when starting a kernel.
So, my approach to this solution is :
1) use qemu to emulate a machine.
2) When an instruction is encountered for the first time, I will parse it and save this info. for later.
3) create a helper function which sends the ip, load/store address when an instruction is executed.
i am stuck in step2. how do i differentiate between different processes from qemu which is just an emulator and does not know anything about the guest OS ??
I can modify the scheduler of the guest OS but I am really not able to figure out the way forward.
Sorry if the question is very lengthy. I know I could have abstracted some part but felt that some part of it gives an explanation of the context of the problem.
In the first case, using qemu-linux-user to perform user mode emulation of a single program, the task is quite easy because the memory is linear and there is no virtual memory involved in the emulator. The second case of whole system emulation is a lot more complex, because you basically have to parse the addresses out of the kernel structures.
If you can get the virtual addresses directly out of QEmu, your job is a bit easier; then you just need to identify the process and everything else functions just like in the single-process case. You might be able to get the PID by faking a system call to get_pid().
Otherwise, this all seems quite a bit similar to debugging a system from a physical memory dump. There are some tools for this task. They are probably too slow to run for every instruction, though, but you can look for hints there.

Execute code in process's stack, on recent Linux

I want to use ptrace to write a piece of binary code in a running process's stack.
However, this causes segmentation fault (signal 11).
I can make sure the %eip register stores the pointer to the first instruction that I want to execute in the stack. I guess there is some mechanism that linux protects the stack data to be executable.
So, does anyone know how to disable such protection for stack. Specifically, I'm trying Fedora 15.
Thanks a lot!
After reading all replies, I tried execstack, which really makes code in stack executable. Thank you all!
This is probably due to the NX bit on modern processors. You may be able to disable this for your program using execstack.
http://advosys.ca/viewpoints/2009/07/disabling-the-nx-bit-for-specific-apps/
http://linux.die.net/man/8/execstack
As already mentioned it is due to the NX bit. But it is possible. I know for sure that gcc uses it itself for trampolines (which are a workaround to make e.g. function pointers of nested functions). I dont looked at the detailes, but I would recommend a look at the gcc code. Search in the sources for the architecture specific macro TARGET_ASM_TRAMPOLINE_TEMPLATE, there you should see how they do it.
EDIT: A quick google for that macro, gave me the hint: mprotect is used to change the permissions of the memory page. Also be carefull when you generate date and execute it - you maybe have in addition to flush the instruction cache.

Resources