Why is x86 to x86 emulation so slow?

I was running Debian in a QEMU instance, and everything was running extremely slowly. I understand why emulating different instruction sets is slow (you have to translate each instruction to a form executable on the host architecture), but why are emulators so slow when both the host and guest are running on the same architecture (x86, in this case)? The instruction sets are exactly the same, so surely there isn't anything that needs to be translated, right? So what specific operation causes the guest OS to run so slowly? Am I missing something or not understanding something here?

Related

Best way to simulate old, slow processor on modern hardware?

I really like the idea of running and optimizing my software on old hardware, because you can viscerally feel when things are slower (or faster!). The most obvious way to do this is to buy an old system and literally use it for development, but that would slow down my IDE, compiler, and all other development tasks, which is less helpful and (possibly) unnecessary.
I want to be able to:
Run my application at various levels of performance, on demand
At the same time, run my IDE, debugger, compiler at full speed
On a single system
Nice to have:
Simulate real, specific old systems, with some accuracy
Similarly throttle memory speed, and size
Optionally run my build system slowly

Try using QEMU in full emulation mode, but keep in mind that it uses more CPU resources.
https://stuff.mit.edu/afs/sipb/project/phone-project/OldFiles/share/doc/qemu/qemu-doc.html
QEMU has two operating modes:
Full system emulation. In this mode, QEMU emulates a full system (for example a PC), including one or several processors and various peripherals. It can be used to launch different Operating Systems without rebooting the PC or to debug system code.
User mode emulation (Linux host only). In this mode, QEMU can launch Linux processes compiled for one CPU on another CPU.
The supported architectures are listed here:
https://wiki.qemu.org/Documentation/Platforms

Is an operating system kernel an interpreter for all other programs?

So, from my understanding, there are two types of programs: those that are interpreted and those that are compiled. Interpreted programs are executed by an interpreter that is a native application for the platform it's on, and compiled programs are themselves native applications (or system software) for the platform they are on.
But my question is this: is anything besides the kernel actually being directly run by the CPU? A Windows Executable is a "Windows Executable", not an x86 or amd64 executable. Does that mean every other process that's not the kernel is literally being interpreted by the kernel in the same way that a browser interprets Javascript? Or is the kernel placing these processes on the "bare metal" that the kernel sits on top of?
If they're on the "bare metal", how, say, does Windows know that a program is a Windows program and not a Linux program, since they're both compiled for amd64 processors? If it's because of the "format" of the executable, how is that executable able to run on the "bare metal", since, to me, the fact that it's formatted to run on a particular OS would mean that some interpretation would be required for it to run?
Is this question too complicated for Stack Overflow?

They run on the "bare metal", but they do contain operating system-specific things. An executable file will typically provide some instructions to the kernel (which are, arguably, "interpreted") as to how the program should be loaded into memory, and the file's code will provide ways for it to "hook" in to the running operating system, such as by an operating system's API or via device drivers. Once such a non-interpreted program is loaded into memory, it runs on the bare metal but continues to communicate with the operating system, which is also running on the bare metal.
In the days of single-process operating systems, it was common for executables to essentially "seize" control of the entire computer and communicate with hardware directly. Computers like the Apple ][ and the Commodore 64 work like that. In a modern multitasking operating system like Windows or Linux, applications and the operating system share use of the CPU via a complex multitasking arrangement, and applications access the hardware via a set of abstractions built in to the operating system's API and its device drivers. Take a course in Operating System design if you are interested in learning lots of details.
Bouncing off Junaid's answer, the way that the kernel blocks a program from doing something "funny" is by controlling the allocation and usage of memory. The kernel requires that memory be requested and accessed through it via its API, and thus protects the computer from "unauthorized" access. In the days of single-process operating systems, applications had much more freedom to access memory and other things directly, without involving the operating system. An application running on an old Apple ][ can read from or write to any address in RAM on the entire computer.
One of the reasons why a compiled application won't just "run" on another operating system is that these "hooks" are different for different operating systems. For example, an application that knows how to request the allocation of RAM from Windows might not have any idea how to request it from Linux or the Mac OS. As Disk Crasher mentioned, these low level access instructions are inserted by the compiler.
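To make the "format" point concrete: before any code runs, a loader usually recognizes an executable by a signature at the start of the file and only then decides how to place it in memory. The following is a minimal sketch, assuming just two well-known signatures (the ELF magic and the "MZ" stub that begins Windows PE files). The function name and the table are made up for illustration, and a real loader validates far more than the first few bytes.

```python
# Hypothetical helper: guess an executable's container format from its
# leading bytes. Only two well-known signatures are listed here.
MAGIC_PREFIXES = [
    (b"\x7fELF", "ELF (Linux, BSD)"),
    (b"MZ", "PE/DOS (Windows)"),
]


def guess_executable_format(header: bytes) -> str:
    """Return a format name based on the first bytes of a file."""
    for prefix, name in MAGIC_PREFIXES:
        if header.startswith(prefix):
            return name
    return "unknown"


if __name__ == "__main__":
    # Sample headers typed in by hand, not read from real files.
    print(guess_executable_format(b"\x7fELF\x02\x01\x01"))
    print(guess_executable_format(b"MZ\x90\x00"))
    print(guess_executable_format(b"#!/bin/sh\n"))
```

The machine code inside both kinds of file can be identical amd64 instructions; what differs is the wrapper around it and the operating system services it expects to call.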

I think you are confusing things. A compiled program is in machine-readable format. When you run the program, the kernel will allocate memory, CPU time, etc., and ensure that the program does not interfere with other programs. If the program requires access to hardware resources, disk, etc., the kernel will handle it, so the kernel will always be between the hardware and any software you run in user space.
If the program is interpreted, then an interpreter for that language will convert the code to machine-readable form on the fly, and the kernel will still provide the same functionality, like access to hardware and making sure programs aren't doing anything funny, such as trying to access another program's memory.

The only thing that runs on "bare metal" is machine code (what assembly language is assembled into), which is abstracted from the programmer by many layers in the OS and compiler. Generally speaking, applications are compiled for a specific OS and CPU architecture. They will not run on other OSes, at least not without a compatible framework in place (e.g. Mono on Linux).
Back in the day a lot of code used to be written on bare metal using macro assemblers, but that's pretty much unheard of on PCs today. (And there was even a time before macro assemblers.)

Comparative analysis between libkvm on linux and NetBSD

I want to build a sample program and as an initial step to learn KVM I started it from the link below.
http://www.linuxjournal.com/magazine/linux-kvm-learning-tool?page=0,1
I see that this is quite an old post for KVM, but I realize that the very first program does not compile, as it asks to include libkvm.h, which is not in my Ubuntu 13.04 installation.
To prepare for this program I installed qemu-kvm, dkms, and the libvirt packages.
I also verified that the user is in the kvm and libvirtd groups.
I am running Ubuntu in VirtualBox on a Windows host with a modern i7 processor.
So I have two different questions here:
1) Since I don't find libkvm.h on my box, what is the way to compile my program and learn this kind of programming? If you have any tutorials, please forward them.
2) I learned that there is another libkvm, used in BSD-style Unix (e.g. NetBSD/FreeBSD), for accessing kernel data structures. From the internet I see that GDB uses that library to fetch info from kernel memory. KVM in Linux is a tool to create virtual machines on a Linux box. Is my understanding correct, or is there anything more to it? Please provide a comparative analysis of these two libraries, namely libkvm on Linux and libkvm on BSD.

As you already said, Linux KVM is a virtualisation technique whereas BSD kvm is much older, the acronym even expands to something different, and is a library to access (not only) kernel data structures in a defined manner.
They are totally separate and different things that have absolutely nothing to do with each other except for sharing the same acronym.
As do, for example, Keyboard-Video-Mouse switches. I was confused by all those Linux people talking about a “KVM” thing suddenly, back when Linux-KVM first came out, and not meaning those.

Can I use JTAG to debug my program on top of embedded Linux?

I am using an at91sam9260 for my development. There is a Linux kernel running on it and I start my own software on top of it.
I was wondering if I could use a JTAG debugger to debug the software I am working on without seeing too much of what is going on in the Linux kernel?
I am asking because I think that it might become very complex to debug my software while seeing the full Linux execution.
In other words, I would like to know if there could be some abstraction layer when debugging with a JTAG probe?

Probably not -- as far as I know, most JTAG debuggers assume the ability of setting breakpoints in the processor. Under a multitasking OS, that stops the OS kernel too.
Embedded OS's like QNX have debuggers that operate on top of the OS kernel and which communicate over Ethernet.

Generally yes, you can: JTAG as a debugger has absolutely nothing to do with what software you happen to be running on that processor. Where you can get into trouble is the cache. For example, suppose you stop the processor, change some instructions in RAM, and restart. Changing instructions in RAM is a data access, which goes through the data cache, not the instruction cache. If you have separate instruction and data caches, they are enabled, and some of the instructions you modified are at addresses held in the instruction cache, you can get messed up pretty fast with a mix of new and stale instructions being fed to the processor. Linux likes to use the caches if they are there.
Second is the MMU. The processor/JTAG is likely operating on virtual addresses, on the processor side of the MMU, not on physical addresses. So, depending on how the hardware works, if for example you set a breakpoint by address in a debug unit in the processor and the operating system task-switches to another program/thread in that same address space, you will break in the wrong program at the right address. If the debugger/processor sets breakpoints by modifying an instruction in RAM, then you run into the cache problem above; if not cached, then you will break on the right instruction in the right thread, but then you have that cache problem.
Bottom line: absolutely. If the processor supports JTAG-based debugging, that doesn't change based on whatever software you choose to run on that processor.

It depends on the JTAG device and its driver. Personally, I know of only one device that is capable of doing that: XDS560 + Code Composer Studio (CCS). But there may be others.
I suggest consulting the manufacturer of your device.

For ARM, the Asset Arium family is claimed to be able to debug application code. I haven't tried it, though.

How do emulator/virtual computer work?

How does an emulator work, like the Android one? And a virtual PC like VirtualBox?

In one form, the software reads the binary in the same way that hardware on the real system would: it fetches instructions, decodes them, and executes them, using variables in a program instead of registers in hardware. Memory and other I/O are similarly emulated/simulated. To be interesting beyond just an instruction set simulator, it also needs to simulate hardware, so it may have software that pretends to behave like, for example, a VGA video card. Ideally, the software running on the emulator cannot tell that the memory and I/O come from simulated hardware; you do enough to fool the software being simulated. You also try to honor what those register writes and reads mean by making calls to the operating system you are running on and/or to the hardware directly (the emulated program, of course, thinks it is talking to hardware and not an emulator).
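The fetch/decode/execute idea can be sketched in a few lines. This is only a toy, assuming a made-up four-register machine with five invented instructions (LOADI, ADD, SUB, JNZ, HALT); it is not how any particular emulator is written, but it shows registers becoming ordinary variables and every guest instruction costing many host instructions.

```python
# Toy instruction set simulator for an invented 4-register machine.
# Instruction names and encoding are hypothetical, for illustration only.
MASK32 = 0xFFFFFFFF


def run(program, max_steps=1000):
    regs = [0, 0, 0, 0]  # emulated registers are plain Python variables
    pc = 0               # emulated program counter
    steps = 0
    while pc < len(program) and steps < max_steps:
        op, a, b = program[pc]  # fetch
        pc += 1
        steps += 1
        if op == "LOADI":       # decode + execute: regs[a] = constant b
            regs[a] = b & MASK32
        elif op == "ADD":       # regs[a] += regs[b], wrapping at 32 bits
            regs[a] = (regs[a] + regs[b]) & MASK32
        elif op == "SUB":       # regs[a] -= regs[b], wrapping at 32 bits
            regs[a] = (regs[a] - regs[b]) & MASK32
        elif op == "JNZ":       # jump to instruction b if regs[a] != 0
            if regs[a] != 0:
                pc = b
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return regs


# Sum 5 + 4 + 3 + 2 + 1 into register 0.
PROGRAM = [
    ("LOADI", 0, 0),  # r0 = accumulator
    ("LOADI", 1, 5),  # r1 = counter
    ("LOADI", 2, 1),  # r2 = constant 1
    ("ADD", 0, 1),    # r0 += r1
    ("SUB", 1, 2),    # r1 -= 1
    ("JNZ", 1, 3),    # loop while r1 != 0
    ("HALT", 0, 0),
]

if __name__ == "__main__":
    print(run(PROGRAM))
```

Each pass through the `while` loop handles one guest instruction, which is the basic reason a pure interpreter of this kind runs much slower than the hardware it imitates.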
The next level up would be a virtual machine. For the case I am describing, the instruction sets match, so you want to, say, virtualize an x86 program on an x86 host machine. The long and short of it is that the host processor/machine has hardware features that allow you to run the actual instructions of the program being virtualized, so long as the instructions are simple register-based, stack, or other local memory operations. Once the program ventures out of its memory space, the virtualization hardware interrupts the operating system. The virtual machine software, like VMware or VirtualBox, then examines the memory or I/O request from the software being virtualized, determines whether it was a request to a video card, USB device, NIC, or whatever, and then emulates the device in question in much the same way that a pure non-virtualized setup would. A virtual machine can often outrun a purely emulated machine because it allows a percentage of the software to run at the full speed of the processor. The downside is that you have to have a virtual machine that matches the software being run. An emulator can be far more accurate and portable than a virtual machine, at the cost of performance.
The next level up would be something like Wine or Cygwin, where not only are you trying to do something like a virtual machine, running native instructions and trapping memory requests, but you are going beyond that and trying to trap operating system calls, so that you can run a program compiled for one operating system on another operating system, but much faster than a virtual machine. Instead of trapping the hardware-level register or memory access to a video card, you trap the operating system call for a bitblt, fill, line draw, or string draw with a specific font, etc. Then you can translate that operating system request into calls to the native operating system.
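The trap-and-emulate pattern described for virtual machines can be imitated in software. This is a rough sketch under loud assumptions: a Python exception stands in for the hardware trap, the address map (`RAM_SIZE`, `SERIAL_PORT`) and the serial device are invented, and a real hypervisor uses CPU virtualization features rather than anything like this. It only shows the control flow: ordinary memory accesses take a fast path, and accesses outside guest RAM bounce to device emulation.

```python
# Sketch of trap-and-emulate. All names and addresses are hypothetical.
RAM_SIZE = 0x1000      # guest RAM covers addresses 0x0000..0x0FFF
SERIAL_PORT = 0x2000   # made-up address of an emulated serial device


class MMIOTrap(Exception):
    """Stands in for the hardware interrupting the virtual machine monitor."""

    def __init__(self, address, value):
        super().__init__(f"access outside RAM at {address:#x}")
        self.address = address
        self.value = value


class GuestMemory:
    def __init__(self):
        self.ram = bytearray(RAM_SIZE)

    def write(self, address, value):
        if 0 <= address < RAM_SIZE:
            self.ram[address] = value       # fast path: monitor not involved
        else:
            raise MMIOTrap(address, value)  # slow path: hand off to monitor


class SerialDevice:
    """Emulated device: collects the bytes the guest writes to it."""

    def __init__(self):
        self.output = []

    def handle_write(self, value):
        self.output.append(chr(value))


def run_guest(writes):
    memory = GuestMemory()
    serial = SerialDevice()
    traps = 0
    for address, value in writes:
        try:
            memory.write(address, value)
        except MMIOTrap as trap:
            traps += 1
            if trap.address == SERIAL_PORT:
                serial.handle_write(trap.value)
            # Writes to unknown addresses are silently dropped in this sketch.
    return "".join(serial.output), traps, memory


if __name__ == "__main__":
    text, traps, _ = run_guest([
        (0x10, 7),
        (SERIAL_PORT, ord("h")),
        (SERIAL_PORT, ord("i")),
        (0x11, 9),
    ])
    print(text, traps)
```

The more often a guest takes the slow path, the closer its speed falls toward that of a pure emulator, since every trap costs far more than a plain memory write.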

At their simplest emulators or virtual computers provide an abstraction layer built on top of the host system (the actual physical system running the emulator) that implements the emulated system's functionality to the code that is to be run.

Emulators and virtual machines simulate hardware like a PC or an Android phone. A virtual machine (or virtual PC) looks at an operating system's machine code instructions and runs them on top of your current (host) operating system in a virtual computer.

http://en.wikipedia.org/wiki/Virtual_machine
and
http://en.wikipedia.org/wiki/Emulator
Depending on the type of virtualization, a virtual machine is not always an emulator.
