What is the difference between x64 and IA-64? - 64-bit

I was on Microsoft's website and noticed two different installers, one for x64 and one for IA-64. Reference: Installing the .NET Framework 4.5, 4.5.1
My understanding is that IA-64 is a subclass of x64, so I'm curious why it would have a separate installer.

x64 is shorthand for the 64-bit extensions of the "classical" x86 architecture; almost every "normal" PC produced in recent years has a processor based on this architecture.
AMD invented the AMD64 extensions; Intel was more or less forced to implement them, and called them first IA-32e, then EM64T and finally Intel 64 (actually, the AMD and Intel extensions aren't exactly the same, but they are almost identical).
Many people also call this stuff x86-64, to have a vendor-independent name and to stress the fact that it's the 64 bit evolution of the x86 architecture. All the "regular" PCs that are sold with "64 bit processors" run on x86-64 architecture.
IA-64 (Intel Architecture 64) is an almost completely unrelated 64 bit architecture (also known as Itanium), developed by Intel initially for high-end servers. It was said that Itanium could have been a replacement for the x86 architecture, but this architecture didn't have much success (for various reasons), so it's unlikely that you'll ever need the IA-64 installers.
For more information, you may have a look at the wikipedia articles on x86-64 and Itanium.
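The reason for separate installers shows up directly at compile time: compilers define different predefined macros for each target. A minimal sketch, assuming the usual GCC/Clang macros (`__x86_64__`, `__ia64__`, `__i386__`) and MSVC macros (`_M_X64`, `_M_IA64`, `_M_IX86`); the function name `target_arch_name` is made up for illustration:

```c
#include <assert.h>
#include <string.h>

/* Returns a label for the architecture this translation unit was
   compiled for. It says nothing about other CPUs the binary might
   later be copied to. */
const char *target_arch_name(void) {
#if defined(__x86_64__) || defined(_M_X64)
    return "x86-64";   /* AMD64 / Intel 64, i.e. "x64" */
#elif defined(__ia64__) || defined(_M_IA64)
    return "IA-64";    /* Itanium, unrelated to x86 */
#elif defined(__i386__) || defined(_M_IX86)
    return "x86-32";   /* 32-bit x86 */
#else
    return "other";    /* ARM, RISC-V, ... */
#endif
}
```

A binary built with the "x86-64" branch active cannot run on an Itanium machine natively, and vice versa, which is why each needs its own build.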

IA-64 is the Intel Itanium architecture. Its instruction set follows a Very Long Instruction Word (VLIW) design.
x86_64 is the usual 64-bit architecture, used by the processors in practically every laptop and desktop today. These processors schedule instructions dynamically.
The main differences between the two are:
In VLIW, the compiler resolves the dependencies between instructions and schedules them appropriately. The processor merely executes them.
With a dynamic processor, the compiler just emits the instructions without worrying much about dependencies. The processor takes care of dependencies, reorders the instructions and executes them appropriately.
VLIW code depends on each chip's internal architecture, so the compiler needs to know that information. The advantage is that the compiler can extract much more parallelism than a dynamic processor can find on its own.
For dynamic processors, the code is independent of each chip's internal architecture. It just needs to follow the instruction set, so code compiled on one machine can run on other machines very easily. The disadvantage is that only limited parallelism can be exploited. The internal logic and design are also much more complex and intricate than in a VLIW processor.
Nevertheless, dynamic processors are what most consumers (individuals) use today, so they can run code compiled or generated on any machine. VLIW processors are used by servers and enterprises because of the parallelism they can produce.
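To make "the compiler resolves the dependencies" concrete, here is a toy sketch of static scheduling. It is not how a real IA-64 compiler works (those also reorder instructions and honour per-chip resource limits); the `Insn` type, `BUNDLE_WIDTH` and `schedule_bundles` are all hypothetical names. It walks the instructions in order and starts a new bundle whenever the next instruction depends on one already in the current bundle, or the bundle is full:

```c
#include <assert.h>
#include <stddef.h>

#define BUNDLE_WIDTH 3  /* toy limit: instructions issued together */

/* Toy three-operand instruction: dst = op(src1, src2), register numbers. */
typedef struct { int dst, src1, src2; } Insn;

/* True if `later` must not issue in the same bundle as `earlier`:
   it reads earlier's result, or writes a register earlier reads/writes. */
static int depends_on(const Insn *later, const Insn *earlier) {
    return later->src1 == earlier->dst || later->src2 == earlier->dst ||
           later->dst == earlier->dst ||
           later->dst == earlier->src1 || later->dst == earlier->src2;
}

/* Assigns each instruction a bundle index (in program order) and
   returns the number of bundles used. */
size_t schedule_bundles(const Insn *code, size_t n, size_t *bundle_of) {
    size_t bundle = 0, start = 0;  /* start: first insn of current bundle */
    assert((code != NULL && bundle_of != NULL) || n == 0);
    for (size_t i = 0; i < n; i++) {
        int conflict = (i - start) >= BUNDLE_WIDTH;  /* bundle is full */
        for (size_t j = start; j < i && !conflict; j++)
            conflict = depends_on(&code[i], &code[j]);
        if (conflict) { bundle++; start = i; }
        bundle_of[i] = bundle;
    }
    return n ? bundle + 1 : 0;
}
```

On a dynamically scheduled processor, the equivalent of this analysis happens in hardware while the program runs, which is why the same binary works across chips with different internals.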

They are different.
IA-64 is Itanium, an architecture for servers.
x64 is what 64-bit Intel Core and AMD CPUs implement.

x64 is short for x86-64 which is an extension of the x86 instruction set.
IA-64 is for the Itanium 64 bit Architecture (by Intel)

IA-64 is for computers running Intel Itanium 64 bit processors. They do not support running 32 bit applications like x64 processors do. A special version of Windows is needed to run on these processors, thus the two different installers.

They have different instruction sets; this is the key point.

Related

why does the runtime version of my android studio say AMD64 on my Windows system? [duplicate]

This question is about terminology for 32-bit vs. 64-bit x86.
If I have 2 directories with source code of the same program - one for 32-bit Windows and another for 64-bit Windows, what will be the more correct names for these folders:
x86-64 and x64?
or IA-32 and x64?
I already have read some web resources, but can't understand. Just for the record:
https://superuser.com/questions/179919/x86-vs-x64-why-is-32-bit-called-x86
Difference between x86, x32, and x64 architectures?
https://en.wikipedia.org/wiki/X86
https://en.wikipedia.org/wiki/IA-32
https://en.wikipedia.org/wiki/X86-64
x86 can be a broad term that covers all CPUs that are backwards-compatible with 8086, and all extensions to the architecture including x86-64.
Note that IA-64 is not x86 at all, it's Itanium (a 64-bit VLIW architecture with explicit speculation / parallelism). It was also designed by Intel, but is totally unrelated to x86 in terms of compatibility or design. (Early IA-64 CPUs also had an x86 core integrated, for compatibility. Intel was pushing IA-64 while AMD was pushing AMD64/x86-64)
Intel sometimes talks about their CPUs as having IA cores + the integrated GPU and the other logic outside of each IA core. (IA = Intel Architecture = x86).
32-bit x86 specifically (excluding 16-bit or 64-bit) can be called
IA-32 (used sometimes by Intel)
i386 or i686 (common on Linux)
(Windows only): x86. Yes really: in the Windows world, "x86" specifically means 32-bit. That's why you have a Program Files (x86) directory with that name. This choice causes potential terminology confusion for everyone, because "x86" is still by far the best way to refer to the architecture in general, as opposed to ARM or MIPS.
rarely: x86-32. This is not used officially by any hardware or software vendors I'm aware of, but it is a useful term that's unambiguous.
Never call it x32. x32 is an ILP32 variant of the x86-64 System V ABI: 32-bit pointers in 64-bit mode. https://en.wikipedia.org/wiki/X32_ABI
64-bit x86 is easier to refer to specifically (excluding 32 and 16-bit):
x86-64 or x86_64 (the dash vs. underscore is not at all significant. In text most people use a dash, but only _ can be part of function/variable names in most languages.)
AMD64 or amd64
(Windows only) x64
IA-32e or Intel64 (mostly only in Intel CPU-architecture documentation documenting Intel's implementation of x86-64, these aren't popular and I haven't seen them in software directory names or config options). The "e" stands for "extensions", apparently. https://en.wikipedia.org/wiki/X86-64#Intel_64 has a History section that mentions naming. There are very minor differences between Intel and AMD implementations of x86-64, mostly only affecting kernels, not user-space.
Not IA-64, that's a separate architecture.
Of course if you want to be pedantic, x86-64 CPUs are required to support legacy mode, so you can run a pure 32-bit OS on an x86-64 CPU, and it's still a 64-bit CPU.
With a 64-bit kernel running 32-bit user-space, the CPU is in "compatibility" mode, which is a lot like 32-bit protected mode except the page-table format has 52-bit physical addresses. (More than the 36-bit physical address width from PAE page tables, which the x86-64 page-table format is based on.) User-space would be hard pressed to tell the difference between running under a 32-bit kernel vs. a 64-bit kernel, except for OS-specific stuff like asking the kernel with a system call.
Software directory names
Many projects go with i386 vs. x86-64 or amd64. That would be my recommendation as the least ambiguous. (Or maybe i686 if you don't really care about compat with CPUs older than PPro.)
Some, like GMP (the GNU MultiPrecision library), which has hand-written asm for many architectures, use "x86" and "x86_64". https://gmplib.org/repo/gmp/file/tip/mpn
(GMP has multiple hand-tuned versions of the same function for different x86 CPUs. Within "x86", there are subdirectories with different versions of functions tuned for Pentium, Core 2, Haswell, and/or taking advantage of instruction-set extensions like BMI2. This is unusual; most projects don't get that specific. Some will maybe have some stuff to take advantage of AVX or AVX512, or SSE4.1 for example, but that's often just within source files.)
x86-32 versions are for Intel-compatible 32-bit processors.
x86-64 versions are for Intel-compatible 64-bit processors.
IA-64 versions are for specifically 64-bit Intel Itanium microprocessors.
Other names in use:
x86-32 is sometimes referred to as IA-32, i386 or i686, x86 (see note 1), x32 (see note 2).
x86-64 is sometimes referred to as AMD64, Intel 64, x64.
Note 1: Strictly speaking, it's not correct, because
x86 covers all CPUs that are backwards-compatible with 8086 and all extensions to the architecture including x86-32 and x86-64.
Though, this shortening is often used in Windows world.
Note 2: This name should be avoided, to prevent confusion with the x32 ABI.
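The aliases listed above can be folded into a small lookup, for example in a build script helper. This is only a sketch: `canonical_arch` and `same_name` are invented names, and the table deliberately leaves out "x86" and "x32" because, as the notes say, those are ambiguous or mean something else.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive string equality (portable stand-in for strcasecmp). */
static int same_name(const char *a, const char *b) {
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;  /* both must end at the same position */
}

/* Maps a vendor or OS alias to "x86-32", "x86-64" or "IA-64".
   Returns NULL for unknown or ambiguous names such as "x86" or "x32". */
const char *canonical_arch(const char *name) {
    static const struct { const char *alias, *canonical; } table[] = {
        {"x86-32", "x86-32"}, {"ia-32", "x86-32"},
        {"i386", "x86-32"},   {"i686", "x86-32"},
        {"x86-64", "x86-64"}, {"x86_64", "x86-64"},
        {"amd64", "x86-64"},  {"x64", "x86-64"},
        {"intel 64", "x86-64"}, {"intel64", "x86-64"},
        {"ia-32e", "x86-64"},
        {"ia-64", "IA-64"},   {"itanium", "IA-64"},
    };
    assert(name != NULL);
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
        if (same_name(name, table[i].alias))
            return table[i].canonical;
    return NULL;
}
```

Note that "ia-32e" maps to x86-64 while "ia-64" maps to Itanium, which is exactly the kind of near-collision that makes these names confusing.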
Sources:
https://social.msdn.microsoft.com/Forums/vstudio/en-US/3da8f2fd-1089-4000-8c3f-f9225f0635e3
https://stackoverflow.com/questions/53364320#53364541

Environmental performance parameters for applications on Linux

I have two physical, "identical" Linux RedHat servers. I ran a small program on both of them. My problem: the CPU usage of my program varies between both servers. I am not a Linux expert. I am wondering what could lead to that performance difference?
I wrote the program in C++ and in Java to see if the inconsistency comes from the programming language chosen. The program itself does a little bit of integer calculation over time to consume a constant amount of CPU time. Both program versions show the same percentage difference in CPU usage.
The environmental variables I have already thought of and could be excluded:
identical server type
identical processor (both have two sockets, single core)
both Intel Hyper-Threading-Technology enabled
identical clock speed
identical OS version (Red Hat Enterprise Linux Server release 5.9)
identical Java version, Java RE, JVM
Intel Demand based Switching can be ignored since the measurement tool uses the default value of clock speed for CPU capacity
processor affinity can be excluded as well I think. I ran multiple measurement series and I always retrieve exactly the same CPU usage values.
Is there maybe a C library or something like that, that has an impact on the CPU usage of C++ and Java programs which needs to be updated separately from the actual OS version? Or could there be a different thread scheduler?
There are a variety of things that can differ even for "identical" systems. Different compilers being used to build various libraries, as well as different versions of compilers. For example, there are continuous improvements from generation to generation of the ability of Intel compilers to optimize. Other differences can occur due to airflow differences causing one machine to run hotter than the other resulting in a drop in frequency occasionally. There are a whole host of other issues that can cause identical systems to run differently.
Here's my recommendation: Create an OS image and use that same image for both systems. Disconnect both from any network. Run compute bound (which you are). Bind your app to a certain core. Verify the exit air temperatures are well within specification. Disable any turbo capability. If there are still differences, do a memory speed check.
Also, use a more sophisticated profiling and analysis tool such as Intel Vtune. You can dig into actual cycles, measure cache misses, branch mispredicts, etc. They should also be identical. If they aren't, the analysis should give you an idea of where the problem lies.
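Before reaching for a profiler, it can help to have a fixed, deterministic workload whose CPU time can be compared between the two machines. The following is a minimal sketch using only standard C `clock()`; the function names and the particular arithmetic are assumptions for illustration, not the asker's program.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Deterministic integer workload; the result wraps modulo 2^64. */
uint64_t integer_workload(uint64_t iterations) {
    uint64_t acc = 0;
    for (uint64_t i = 0; i < iterations; i++)
        acc = acc * 31 + i;
    return acc;
}

/* Runs the workload and returns the CPU time it consumed in seconds,
   or -1.0 if clock() is unavailable. The workload result is stored in
   *result so the compiler cannot discard the loop. */
double cpu_seconds_for_workload(uint64_t iterations, uint64_t *result) {
    assert(result != NULL);
    clock_t start = clock();
    *result = integer_workload(iterations);
    clock_t end = clock();
    if (start == (clock_t)-1 || end == (clock_t)-1)
        return -1.0;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

If the same binary, run with the same iteration count, reports clearly different CPU seconds on the two servers, the difference is in the machines (frequency, memory, firmware settings) rather than in libraries or the scheduler.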

Control over memory virtualization on Linux and Windows

This is semi-theoretic question.
Can I specify the virtualization mode for memory (pure segmentation/segmentation+paging/just paging) while compiling for Windows (e.g., MSVS12 C++) and for Linux (e.g. g++)?
I have read all MSVS linker+compiler options, and found no point of control in there.
For g++ the manual is too complex for such a question.
The source of this question is this - link
I know from theory and practice that these should either be possible or restricted by OS policy at some level cause core i7 supports all three modes I mentioned above.
Practical background:
The piece of code that created lots of data is here, function Init - and it exhausted my memory if I wanted to have over 2-3G primes on heap.
Intel x86 CPUs always use some form of segmentation that can't be turned off. In 64-bit mode segmentation is limited, but it's still there. Paging is required for both Windows and Linux to work on Intel CPUs (though Linux doesn't use paging on certain other CPU architectures). Paging is also required to enable 64-bit mode on Intel CPUs.
So in other words on Windows and Linux the OS always uses segmentation and paging, and so do any applications run on them, though this is largely transparent. It's not possible to "compiled+linked for 'segmentation without paging'" as you said in the answer you linked. Maybe the book you referenced is referring to ancient 16-bit versions of Windows (3.1 or earlier) which could be run in a mode that supported 80286 CPUs which didn't have paging. Though even then that normally didn't make any difference in how you compiled and linked your applications.
What you are describing is not a function of a compiler, or even a linker.
When you run your program, you get the memory model that is already running on the system. Your compiled code does not care about the underlying memory mode.
However, your program itself can change the memory model IF it starts running in an unprotected processor mode.

How to programmatically detect a 64 bit or 32 bit machine?

I don't understand what 32 bit and 64 bit means. It seems that people say 64 bit computers run faster - but why? Does it mean that there are 64 bit integers instead of 32? If it's something like that, is there a way to write a program to determine if we're on a 32 bit or 64 bit machine?
On 64-bit machines pointers are 8 bytes (64 bits). On 32-bit machines they are 4 bytes (32 bits). Thus we can determine from the size of a pointer what we are dealing with. In its simplest form:
#define IS_64BIT (sizeof(void *) == 8)
The only drawback is that a 64 bit computer running in 32 bit mode will register as 32 bit. Of course, this isn't actually important as for all intents and purposes a 32 bit OS on a 64 bit computer will be a 32 bit computer.
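A slightly fuller variant of the same idea, as a sketch: `BUILD_IS_64BIT` and `pointer_bits` are illustrative names, and `UINTPTR_MAX` from `<stdint.h>` is assumed to be available (it is on all mainstream compilers). Like the macro above, this reports how the program was built, not what the CPU is capable of.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Compile-time check, usable in #if: pointers wider than 32 bits? */
#if UINTPTR_MAX > 0xFFFFFFFFu
#define BUILD_IS_64BIT 1
#else
#define BUILD_IS_64BIT 0
#endif

/* Run-time equivalent: width of a data pointer in bits. */
int pointer_bits(void) {
    return (int)(sizeof(void *) * CHAR_BIT);
}
```

A 32-bit build running on a 64-bit OS still yields 32 here, which matches the drawback described above.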
There are actually several different things you're asking here.
First of all there's the CPU. Most modern day CPUs (within the past 5-years approx) will support 64-bit.
Now just because the CPU supports it doesn't mean the OS supports it; that's where you have either a 64-bit OS or a 32-bit OS. (32-bit is also known as x86. There is a small technical difference in that x86 refers to the CPU instruction set, but in most common usage x86 and 32-bit are interchangeable.)
Even if the OS supports it, it doesn't mean the specific program you're running supports 64-bit. What most (if not all?) 64-bit OS's do is they have a 32-bit emulation mode so you can still run 32-bit programs.
Now for your question of how to determine which architecture you're running on, the most reliable way is to ask the OS through some API call.
As for why 64-bit is sometimes considered faster, it is because with 32 bits it is only possible to address 4GB of memory, whereas with 64 bits the limit imposed by the address space is much higher (about 4 billion times higher), so the limiting factor is the hardware, not the address space. As to when and why more memory is faster, that's a separate topic altogether.
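The arithmetic behind "4GB" and "about 4 billion times higher" can be checked in a few lines. This is just a worked calculation; the function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* How many times larger a wide_bits address space is than a
   narrow_bits one: 2^(wide_bits - narrow_bits). */
uint64_t address_space_ratio(unsigned wide_bits, unsigned narrow_bits) {
    assert(wide_bits >= narrow_bits && wide_bits - narrow_bits < 64);
    return UINT64_C(1) << (wide_bits - narrow_bits);
}

/* Size in GiB of a byte-addressed space with `bits` address bits:
   2^bits bytes / 2^30 bytes per GiB. */
uint64_t addressable_gib(unsigned bits) {
    assert(bits >= 30 && bits - 30 < 64);
    return UINT64_C(1) << (bits - 30);
}
```

`addressable_gib(32)` is 4, and `address_space_ratio(64, 32)` is 4294967296, i.e. roughly 4.3 billion.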
64-bit machines do not run faster than 32-bit machines except in cases where 64-bit math is being done or in cases where more than 4 GB of RAM is needed.
64-bit AMD (and later Intel) machines run faster than 32-bit x86 machines because when AMD designed the new instruction set they added more CPU registers and made SSE math the default.
32-bit x86 systems can waste a lot of CPU time pushing data around in RAM, while a x86_64 system can store that data in CPU registers instead. Registers are much faster than level-1 CPU cache. Having more registers also saves CPU instructions that otherwise need to store the old value of a register in RAM, load in a different value from RAM, then load the original value back from RAM.
In some especially register-starved cases the extra registers can gain 30% speed for a program. The benefit is usually much less than that.
The speed benefits from assuming SSE2 are many. In 32-bit CPUs SSE instructions may or may not exist, so to use them the software needs to have clumsy test code and two (or more!) implementation of the math functions. Most software just doesn't care enough and so it never bothers, always falling back on x87 FPU math from the 486 days. The 64-bit CPUs made SSE2 a required part of the instruction set, so all x86_64 programs are free to assume it exists and use it in all cases.
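The "clumsy test code and two implementations" pattern looks roughly like the sketch below. The function names are invented; `__builtin_cpu_supports` is a GCC/Clang builtin, and the SSE2 path is only compiled when the compiler itself is targeting SSE2 (`__SSE2__`), so on a plain 32-bit build this sketch falls back to the scalar version. On x86-64 the check collapses to a constant, which is the point made above.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* 1 if SSE2 can be used on the CPU we are running on. */
int cpu_has_sse2(void) {
#if defined(__x86_64__) || defined(_M_X64)
    return 1;  /* SSE2 is part of the x86-64 baseline */
#elif defined(__GNUC__) && defined(__i386__)
    return __builtin_cpu_supports("sse2") != 0;  /* run-time check */
#else
    return 0;
#endif
}

/* Implementation 1: plain scalar loop, works everywhere. */
double sum_scalar(const double *v, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

#if defined(__SSE2__)
/* Implementation 2: adds two doubles per instruction. */
double sum_sse2(const double *v, size_t n) {
    __m128d acc = _mm_setzero_pd();
    double lanes[2];
    size_t i = 0;
    for (; i + 2 <= n; i += 2)
        acc = _mm_add_pd(acc, _mm_loadu_pd(v + i));
    _mm_storeu_pd(lanes, acc);
    double s = lanes[0] + lanes[1];
    for (; i < n; i++)  /* leftover element when n is odd */
        s += v[i];
    return s;
}
#endif

/* Picks an implementation based on what the CPU supports. */
double sum_dispatch(const double *v, size_t n) {
#if defined(__SSE2__)
    if (cpu_has_sse2())
        return sum_sse2(v, n);
#endif
    return sum_scalar(v, n);
}
```

Every math routine that wants SSE2 on 32-bit x86 needs this kind of scaffolding, which is why much 32-bit software never bothered.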
64bit computers do not run faster, per se. They can just support higher precision (larger integers, more precise floats).
In some rare cases, libraries might jam two 32bit numbers into 64bits to perform a large number of parallel operations, possibly resulting in potentially up to 2x speedup. This might occur for some highly optimized scientific/numeric libraries, or in special applications that (for some reason or another) have been highly optimized at a very low level. For example, some multimedia software. It should be noted that such applications could always have made this tradeoff even in 32bit mode, but chose not to; they are merely trading away precision (which they may not need) for parallelism.
Operating system benchmarks which reveal faster performance (maybe <10% improvement) are not necessarily related to 64bit-related optimizations. 64bit architectures may be correlated with having for example more registers or advanced features that programs can take advantage of [citation: http://www.tuxradar.com/content/ubuntu-904-32-bit-vs-64-bit-benchmarks ], which may be the cause of a performance difference (as well as other variables).
How to determine whether a CPU is 32bit or 64bit depends on what OS you are using. For example on Linux, you can call uname -a, though there's probably a better way to do so. If you're using C/C++, see the other answer for a way to determine it in a program.
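The information `uname -a` prints is also available from a program through the POSIX `uname()` call. A minimal sketch for Linux and other POSIX systems (it will not build on Windows); the helper names are invented, and the machine strings compared against are the ones commonly reported for 64-bit x86.

```c
#include <assert.h>
#include <string.h>
#include <sys/utsname.h>  /* POSIX: struct utsname, uname() */

/* Copies the kernel's machine name (e.g. "x86_64", "i686", "ia64")
   into buf. Returns 0 on success, -1 if uname() fails. */
int kernel_machine_name(char *buf, size_t len) {
    struct utsname info;
    assert(buf != NULL && len > 0);
    if (uname(&info) < 0)
        return -1;
    strncpy(buf, info.machine, len - 1);
    buf[len - 1] = '\0';
    return 0;
}

/* 1 if the given machine name denotes 64-bit x86. */
int machine_is_64bit_x86(const char *machine) {
    return strcmp(machine, "x86_64") == 0 || strcmp(machine, "amd64") == 0;
}
```

This describes the running kernel rather than the program, so a 32-bit program on a 64-bit kernel will normally still see "x86_64" here.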

What is Intel microcode?

From what I've read it's used to fix bugs in the CPU without modifying the BIOS.
From my basic knowledge of Assembly I know that assembly instructions are split into microcodes internally by the CPU and executed accordingly. But intel somehow gives access to make some updates while the system is up and running.
Anyone has more info on them? Is there any documentation regarding what can it be done with microcodes and how can they be used?
EDIT:
I've read the wikipedia article: didn't figure out how can I write some on my own, and what uses it would have.
In older times, microcode was heavily used in CPUs: every single instruction was split into microcode. This enabled relatively complex instruction sets in modest CPUs (consider that a Motorola 68000, with its many operand modes and eight 32-bit registers, fits in 40000 transistors, whereas a single-core modern x86 will have more than a hundred million). This is not true anymore. For performance reasons, most instructions are now "hardwired": their interpretation is performed by inflexible circuitry, outside of any microcode.
In a recent x86, it is plausible that some complex instructions such as fsin (which computes the sine function on a floating point value) are implemented with microcode, but simple instructions (including integer multiplication with imul) are not. This limits what can be achieved with custom microcode.
That being said, microcode format is not only very specific to the specific processor model (e.g. microcode for a Pentium III and a Pentium IV cannot be freely exchanged with each other -- and, of course, using Intel microcode for an AMD processor is out of the question), but it is also a severely protected secret. Intel has published the method by which an operating system or a motherboard BIOS may update the microcode (it must be done after each hard reset; the update is kept in volatile RAM) but the microcode contents are undocumented. The Intel® 64 and IA-32 Architectures Software Developer’s Manual (volume 3a) describes the update procedure (section 9.11 "microcode update facilities") but states that the actual microcode is "encrypted" and chock-full of checksums. The wording is vague enough that just about any kind of cryptographic protection may be hidden, but the bottom line is that it is not currently possible, for people other than Intel, to write and try some custom microcode.
If the "encryption" does not include a digital (asymmetric) signature and/or if the people at Intel botched the protection system somehow, then it may be conceivable that some remarkable reverse-engineering effort could potentially enable one to produce such microcode, but, given the probably limited applicability (since most instructions are hardwired), chances are that this would not buy much, as far as programming power is concerned.
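While the contents are opaque, the revision of the update currently loaded is visible from user space. As a sketch, assuming a Linux x86 system where `/proc/cpuinfo` contains a line of the form `microcode : 0x...`; the function names are invented, and on other systems `read_microcode_revision` simply reports that nothing was found.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parses one /proc/cpuinfo line. If it is the "microcode" field,
   stores the revision in *rev and returns 1; otherwise returns 0. */
int parse_microcode_line(const char *line, unsigned long *rev) {
    static const char key[] = "microcode";
    const char *colon;
    char *end;
    unsigned long value;
    assert(line != NULL && rev != NULL);
    if (strncmp(line, key, sizeof key - 1) != 0)
        return 0;
    colon = strchr(line, ':');
    if (colon == NULL)
        return 0;
    value = strtoul(colon + 1, &end, 0);  /* base 0 accepts "0x..." */
    if (end == colon + 1)
        return 0;  /* no digits after the colon */
    *rev = value;
    return 1;
}

/* Scans /proc/cpuinfo for the first microcode line.
   Returns 1 and fills *rev if found, 0 otherwise. */
int read_microcode_revision(unsigned long *rev) {
    char line[256];
    int found = 0;
    FILE *f = fopen("/proc/cpuinfo", "r");
    if (f == NULL)
        return 0;
    while (!found && fgets(line, sizeof line, f) != NULL)
        found = parse_microcode_line(line, rev);
    fclose(f);
    return found;
}
```

Comparing this value before and after an OS or BIOS update shows whether a new microcode revision was applied.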
Think loosely about a virtual machine or simulator: qemu-arm, for example, can simulate an ARM processor on an x86 host, and ideally the software running on the simulated ARM has no idea that it isn't a real ARM. Take this idea to the level where the whole chip is designed such that it always looks like an x86, and the software never knows there are some programmable items inside the chip, or that some other processor inside is designed for the purpose of implementing/simulating an x86. Supposedly the popular AMD 29000 product line just went away because the hardware team and perhaps the processor/core became the guts of an early x86 clone. Transmeta, where Linus worked, had a VLIW processor that was made to be a low-power x86. In that case the translation layer was not (as much of) a secret. VLIW (very long instruction word), RISC taken to the extreme, is the kind of thing you build for this kind of task.
No, it is not as much of an emulation layer as I am implying; there isn't some Linux running there with a qemu program inside each chip. It is somewhere between hardwired, where there is no software/microcode in the middle, and a full-blown emulation. The programmable bits may be like an FPGA (programmable gates), or they may be software or programmable state machines, meaning the gates are not programmable, just what runs on the gates is.
Your non-x86, non-big-iron processors, ARM for example, are hardwired, with no microcode. For microcontrollers (PIC, MSP430, AVR), assume these are not microcoded. Basically, do not assume all processors are microcoded; few if any processor families are. It is just that the ones we deal with in PCs have been and may still be, so it may feel like they all are.
As fun as it may sound to play with this microcode, it is likely very specific to the processor family, and you likely will never gain access to how it works unless you work for Intel or AMD, each of which likely have their own internals. So you would need to get a job at one of the two, then work your way through the trenches to become one of what is likely an elite team that does this work. And once you get that far your career is trapped, your skills may be limited to one job at one company. You might have more fun programming individual gpus on a video card, something that is documented or at least has tools, something you can do today without spending 10 years at AMD or Intel to possibly get nowhere.
