What "64 bit" means? - 64-bit

I know this question seems obvious, but I don't manage to find a precise answer.
If on my laptop it is written "Windows 8 64 bit", what "64 bit" exactly refers to? (I know that "Windows 8" is just the name and version of the OS).
I have a few interpretations, but none of them make me entirely happy:
The virtual address space of a process is of size 2^64 units (with unit being some small size). This definition does not make me happy, because even with disc storage, the total storage of my computer is far less than that. So I would never be able in a program to initialize an array of size 2^64.
The registers in memory have a capacity of 64 bits. This also does not make me entirely happy, because my machine could have both 64 bit and 32 bit registers, and perhaps registers of smaller size.
The maximum capacity of registers is 64 bits. This definition could be sensible, but looks "iffy".
So could anyone give me a clear definition, or at least tell that one of the above is correct?

"Windows 64 bit" means that the operating system supports 64-bit addressing.
This, in turn, implies that the CPU also supports 64-bit addressing.
The OS and the CPU are two entirely different things.
Runtime binaries (.exes and .dlls for Windows) are yet another "different thing". 32-bit and 64-bit .exe's have different binary formats, are loaded differently by the OS, and use different runtime resources.
You can't run a 64-bit OS on a 32-bit CPU. But you can run a 32-bit OS on a 64-bit CPU. Similarly, you can't use a 64-bit shared library or executable program on a 32-bit OS.
The key aspect of "64-bit" is 64-bit addressing: that both the CPU and the running program can address up to 2^64 bytes of virtual memory:
In practice, a running program will likely be able to address only a portion of that address space.
You can read more here:
https://en.wikipedia.org/wiki/64-bit_computing
PS:
Yes: CPU registers come in all different sizes. For example, ah is 8-bits, ax 16 bits, eax 32 bits and rax is 64 bits. Furthermore, different registers do "different things". For "64-bit computing", we're primarily interested in those registers that load from and store to virtual memory.

Related

great size pointer in gcc

I want to define a great size pointer(64 bit or 128 bit) in gcc which is not depend on platform.
I think there is something like __ptr128 or __ptr64 in MSDN.
sizeof(__ptr128) is 16 bytes.
sizeof(__ptr64 ) is 8 bytes.
is it possible?
it can be useful when you use kernel functions in 64 bit OS which requires 8 bytes pointer argument and you have a 32 bit application which uses 32 bits address and you want to use this kernel function.
Your question makes no sense. Pointers, by definition, are a memory address to something - the size must depend upon the platform. How would you dereference a 128-bit pointer on a hardware platform supporting 64-bit addressing?!
You can create 64 or 128-bit values, but a pointer is directly related to the memory addressing scheme of the underlying hardware.
EDIT
With your additional statement, I think I see what you're trying to do. Unfortunately, I doubt it's possible. If the kernel function you want to use takes a 64-bit pointer argument, it's highly likely to be a 64-bit function (unless you're developing for some unusual hardware).
Even though it's technically possible to mix 64-bit instructions into a 32-bit executable, no compiler will actually let you do this. A 64-bit API call will use 64-bit code, 64-bit registers and a 64-bit stack - it would be extremely awkward for the compiler and operating system to manage arbitrary switching from a 32-bit environment to a 64-bit environment.
You should look at finding the equivalent API for a 32-bit environment. Perhaps you could post the kernel function prototype (name+parameters) you want to use and someone could help you find a better solution.
Just so there's no confusion, __ptr64 in MSDN is not platform independent:
On a 32-bit system, a pointer declared with __ptr64 is truncated to a
32-bit pointer.
Can't comment, but the statement that you can't use 64 bit instructions in a "32 bit executable" is misleading since the definition of "32 bit executable" is subject to interpretation. If you mean an executable that uses 32 bit pointers, then there is nothing at all that says you can't use instructions that manipulate 64 bit values while using 32 bit pointers. The processor doesn't know the difference.
Linux even supports a mode where you can have a 32 bit userspace and a 64 bit kernel space. Thus, each app has access to 4GB of RAM, but the system can access much more. This keeps the size of your pointers down to 4 bytes but does not restrict the use of 64 bit data manipulations.
I'm late to the party but the question makes quite a lot of sense in embedded platforms.
If you combine a CPU with some additional accelerators in the same SOC, they don't necessarily need to have the same address space or address space size.
For the firmware in the accelerator you would want pointers that cover its address space from the CPU and the accelerator's perspective. They are not necessarily the same size.
For example, with a 64 bit CPU and a 32 bit accelerator, the pointer for the firmware can cover 32 bit long address space and the pointer for CPU covers 64 bit address space. C does not have two or more void * types depending on the address spaces you want to talk to.
People generally solve this by casting void * to uintN_t with N as large as needed and passing this around between different parts of the system.
There is none, because gcc was not designed for embedded architectures. There are architectures where multiple sized pointers exist like for example m16c: ram has 16 bit addresses and rom(flash) has 20 bit addresses in the same address space. The performance and size usage is better for smaller pointers.

How to programatically detect a 64 bit or 32 bit machine?

I don't understand what 32 bit and 64 bit means. It seems that people say 64 bit computers run faster - but why? Does it mean that there are 64 bit integers instead of 32? If it's something like that, is there a way to write a program to determine if we're on a 32 bit or 64 bit machine?
On 64-bit machines pointers are 8 bytes (64 bits). On 32-bit machines they are 4 bytes (32 bits). Thus we can determine by the size of a pointer what we are dealing with, in it's simplest form:
#define IS_64BIT (sizeof(void *) == 8)
The only drawback is that a 64 bit computer running in 32 bit mode will register as 32 bit. Of course, this isn't actually important as for all intents and purposes a 32 bit OS on a 64 bit computer will be a 32 bit computer.
There's actually several different things your asking here.
First of all there's the CPU. Most modern day CPUs (within the past 5-years approx) will support 64-bit.
Now just because the CPU supports it doesn't mean the OS supports it, that's where you have either 64-bit OS or 32-bit OS (32-bit is also known as x86, there's small technical differences in the x86 refers to the CPU instruction set, but for most common usage x86 and 32-bit are interchangeable)
Even if the OS supports it, it doesn't mean the specific program you're running supports 64-bit. What most (if not all?) 64-bit OS's do is they have a 32-bit emulation mode so you can still run 32-bit programs.
Now for your question of how to determine which architecture you're running on, the most reliable way is to ask the OS through some API call.
As for why 64-bit is sometimes considered faster, it because with 32-bits it is only possible to address 4GB of memory, whereas with 64-bit the limit imposed by address space is much higher (as in about 4 billion times higher) and the limiting factor is hardware not address space. As to when and why more memory is faster, that's a separate topic altogether.
64-bit machines do not run faster than 32-bit machines except in cases where 64-bit math is being done or in cases where more than 4 GB of RAM is needed.
64-bit AMD (and later Intel) machines run faster than 32-bit x86 machines because when AMD designed the new instruction set they added more CPU registers and made SSE math the default.
32-bit x86 systems can waste a lot of CPU time pushing data around in RAM, while a x86_64 system can store that data in CPU registers instead. Registers are much faster than level-1 CPU cache. Having more registers also saves CPU instructions that otherwise need to store the old value of a register in RAM, load in a different value from RAM, then load the original value back from RAM.
In some especially register-starved cases the extra registers can gain 30% speed for a program. The benefit is usually much less than that.
The speed benefits from assuming SSE2 are many. In 32-bit CPUs SSE instructions may or may not exist, so to use them the software needs to have clumsy test code and two (or more!) implementation of the math functions. Most software just doesn't care enough and so it never bothers, always falling back on x87 FPU math from the 486 days. The 64-bit CPUs made SSE2 a required part of the instruction set, so all x86_64 programs are free to assume it exists and use it in all cases.
64bit computers do not run faster, per se. It just can support higher precision (larger integers, more precise floats).
In some rare cases, libraries might jam two 32bit numbers into 64bits to perform a large number of parallel operations, possibly resulting in potentially up to 2x speedup. This might occur for some highly optimized scientific/numeric libraries, or in special applications that (for some reason or another) have been highly optimized at a very low level. For example, some multimedia software. It should be noted that such applications could always have made this tradeoff even in 32bit mode, but chose not to; they are merely trading away precision (which they may not need) for parallelism.
Operating system benchmarks which reveal faster performance (maybe <10% improvement) are not necessarily related to 64bit-related optimizations. 64bit architectures may be correlated with having for example more registers or advanced features that programs can take aware of [citation: http://www.tuxradar.com/content/ubuntu-904-32-bit-vs-64-bit-benchmarks ], which may be the cause of a performance difference (as well as other variables).
How to determine whether a CPU is 32bit or 64bit depends on what OS you are using. For example on Linux, you can call uname -a, though there's probably a better way to do so. If you're using C/C++, see the other answer for a way to determine it in a program.

What is the difference between a 32-bit and 64-bit processor?

I have been trying to read up on 32-bit and 64-bit processors (http://en.wikipedia.org/wiki/32-bit_processing). My understanding is that a 32-bit processor (like x86) has registers 32-bits wide. I'm not sure what that means. So it has special "memory spaces" that can store integer values up to 2^32?
I don't want to sound stupid, but I have no idea about processors. I'm assuming 64-bits is, in general, better than 32-bits. Although my computer now (one year old, Win 7, Intel Atom) has a 32-bit processor.
All calculations take place in the registers. When you're adding (or subtracting, or whatever) variables together in your code, they get loaded from memory into the registers (if they're not already there, but while you can declare an infinite number of variables, the number of registers is limited). So, having larger registers allows you to perform "larger" calculations in the same time. Not that this size-difference matters so much in practice when it comes to regular programs (since at least I rarely manipulate values larger than 2^32), but that is how it works.
Also, certain registers are used as pointers into your memory space and hence limits the maximum amount of memory that you can reference. A 32-bit processor can only reference 2^32 bytes (which is about 4 GB of data). A 64-bit processor can manage a whole lot more obviously.
There are other consequences as well, but these are the two that comes to mind.
First 32-bit and 64-bit are called architectures.
These architectures means that how much data a microprocessor will process within one instruction cycle i.e. fetch-decode-execute
In one second there might be thousands to billions of instruction cycles depending upon a processor design.
32-bit means that a microprocessor can execute 4 bytes of data in one instruction cycle while 64-bit means that a microprocessor can execute 8 bytes of data in one instruction cycle.
Since microprocessor needs to talk to other parts of computer to get and send data i.e. memory, data bus and video controller etc. so they must also support 64-bit data transfer theoretically. However, for practical reasons such as compatibility and cost, the other parts might still talk to microprocessor in 32 bits. This happened in original IBM PC where its microprocessor 8088 was capable of 16-bit execution while it talked to other parts of computer in 8 bits for the reason of cost and compatibility with existing parts.
Imagine that on a 32 bit computer you need to write 'a' as 'A' i.e. in CAPSLOCK, so the operation only requires 2 bytes while computer will read 4 bytes of data resulting in overhead. This overhead increases in 64 bit computer to 6 bytes. So, 64 bit computers not necessarily be fast all the times.
Remember 64 bit windows could be run on a microprocessor only if it supports 64-bit execution.
Processor calls data from Memory i.e. RAM by giving its address to MAR (Memory Address Register). Selector electronics then finds that address in the memory bank and retrieves the data and puts it in MDR (Memory Data Register) This data is recorded in one of the Registers in the Processor for further processing. Thats why size of Data Bus determines the size of Registers in Processor. Now, if my processor has 32 bit register, it can call data of 4 bytes size only, at a time. And if the data size exceeds 32 bits, then it would required two cycles of fetching to have the data in it. This slows down the speed of 32 bit Machine compared to 64 bit, which would complete the operation in ONE fetch cycle only. So, obviosly for the smaller data, it makes no difference if my processors are clocked at the same speed.
Again, with 64 bit processor and 64 bit OS, my instructions will be of 64 bit size always... which unnecessarily uses up more memory space.
32bit processors can address a memory bank with 32 bit address with. So you can have 2^32 memory cells and therefore a limited amount of addressable memory (~ 4GB). Even when you add another memory bank to your machine it can not be addressed. 64bit machines therefore can address up to 2^64 memory cells.
This answer is probably 9 years too late, but I feel that the above answers don't adequately answer the question.
The definition of 32-bit and 64-bit are not well defined or regulated by any standards body. They are merely intuitive concepts. The 32-bit or 64-bit CPU generally refers to the native word size of the CPU's instruction set architecture (ISA). So what is an ISA and what is a word size?
ISA and word size
ISA is the machine instructions / assembly mnemonics used by the CPU. They are the lowest level of a software which directly tell what the hardware to do. Example:
ADD r2,r1,r3 # add instruction in ARM architecture to do r2 = r1 + r3
# r1, r2, r3 refer to values stored in register r1, r2, r3
# using ARM since Intel isn't the best when learning about ISA
The old definition of word size would be the number of bits the CPU can compute in one instruction cycle. In modern context the word size is the default size of the registers or size of the registers the basic instruction acts upon (I know I kept a lot of ambiguity in this definition, but it's an intuitive concept across multiple architectures which don't completely match with each other). Example:
ADD16 r2,r1,r3 # perform addition half-word wise (assuming 32 bit word size)
ADD r2,r1,r3 # default add instruction works in terms of the word size
Example bit-ness of a Pentium Pro CPU with PAE
First, various word sizes in general purpose instrucion:
Arithmetic, logical instructions: 32 bit (Note that this violates old concept of word size since multiply and divide takes more than one cycle)
Branch, jump instructions: 32 bit for indirect addressing, 16-bit for immediate (Again Intel isn't a great example because of CISC ISA and there is enough complexity here)
Move, load, store: 32 bit for indirect, 16 bit for immediate (These instructions may take several cycles, so old definition of word size does not hold)
Second, bus and memory access sizes in hardware architecture:
Logical address size before virtual address translation: 32 bit
Virtual address size: 64-bit
Physical address size post translation: 36 bit (system bus address bus)
System bus data bus size: 256 bit
So from all the above sizes, most people intuitively called this a 32-bit CPU (despite no clear consensus on ALU word size and address bit size).
Interesting point to note here is that in olden days (70s and 80s) there were CPU architectures whose ALU word size was very different from it's memory access size. Also note that we haven't even dealt with the quirks in non-general purpose instructions.
Note on Intel x86_64
Contrary to popular belief, x86_64 is not a 64-bit architecture in the truest sense of the word. It is a 32 bit architecture which supports extension instructions which can do 64 bit operations. It also supports a 64-bit logical address size. Intel themselves call this ISA IA32e (IA32 extended, with IA32 being their 32-bit ISA).
References
ARM instruction examples
Intel addressing modes
From here:
The main difference between 32-bit processors and 64-bit processors is
the speed they operate. 64-bit processors can come in dual core, quad
core, and six core versions for home computing (with eight core
versions coming soon). Multiple cores allow for increase processing
power and faster computer operation. Software programs that require
many calculations to function operate faster on the multi-core 64-bit
processors, for the most part. It is important to note that 64-bit
computers can still use 32-bit based software programs, even when the
Windows operating system is a 64-bit version.
Another big difference between 32-bit processors and 64-bit processors
is the maximum amount of memory (RAM) that is supported. 32-bit
computers support a maximum of 3-4GB of memory, whereas a 64-bit
computer can support memory amounts over 4 GB. This is important for
software programs that are used for graphical design, engineering
design or video editing, where many calculations are performed to
render images, drawings, and video footage.
One thing to note is that 3D graphic programs and games do not benefit
much, if at all, from switching to a 64-bit computer, unless the
program is a 64-bit program. A 32-bit processor is adequate for any
program written for a 32-bit processor. In the case of computer games,
you'll get a lot more performance by upgrading the video card instead
of getting a 64-bit processor.
In the end, 64-bit processors are becoming more and more commonplace
in home computers. Most manufacturers build computers with 64-bit
processors due to cheaper prices and because more users are now using
64-bit operating systems and programs. Computer parts retailers are
offering fewer and fewer 32-bit processors and soon may not offer any
at all.
32-bit and 64-bit are basically the registers size, register the fastest type of memory and is closest to the CPU. A 64-bit processor can store more data for addressing and transmission than a 32-bit register but there are other factors also on the basis of the speed of the processor is measured such as the number of cores, cache memory, architecture etc.
Reference: Difference between 32-bit processor and 64-bit processor
From what is the meaning of 32 bit or 64 bit
process??
by kenshin123 :
The virtual addresses of a process are the mappings of an address
table that correspond to real physical memory on the system. For
reasons of efficiency and security, the kernel creates an abstraction
for a process that gives it the illusion of having its own address
space. This abstraction is called a virtual address space. It's just a
table of pointers to physical memory.
So a 32-bit process is given about 2^32 or 4GB of address space. What
this means under the hood is that the process is given a 32-bit page
table. In addition, this page table has a 32-bit VAS that maps to 4GB
of memory on the system.
So yes, a 64-bit process has a 64-bit VAS. Does that make sense?
there are 8 bits in a byte so if its 32 bit you are processing 4 bytes of data at whatever ghz or mhz your cpu is clocked at per second. so if there is a 64 bit cpu and 32 bit cpu clocked at the same speed the 64 bit cpu would be faster
32 bit processors are processing 32 bits of data based on Ghz of Processor in per second and 64 bit processors are processing 64bit of data per second on what speed your PC has. as well the 34 bit processors works with 4GB of RAM .

What are 16, 32 and 64-bit architectures?

What do 16-bit, 32-bit and 64-bit architectures mean in case of Microprocessors and/or Operating Systems?
In case of Microprocessors, does it mean maximum size of General Purpose Registers or size of Integer or number of Address-lines or number of Data Bus lines or what?
What do we mean by saying "DOS is a 16-bit OS", "Windows in a 32-bit OS", etc...?
My original answer is below, if you want to understand the comments.
New Answer
As you say, there are a variety of measures. Luckily for many CPUs a lot of the measures are the same, so there is no confusion. Let's look at some data (Sorry for image upload, I couldn't see a good way to do a table in markdown).
As you can see, many columns are good candidates. However, I would argue that the size of the general purpose registers (green) is the most commonly understood answer.
When a processor is very varied in size for different registers, it will often be described in more detail, eg the Motorola 68k being described as a 16/32bit chip.
Others have argued it is the instruction bus width (yellow) which also matches in the table. However, in today's world of pipelining I would argue this is a much less relevant measure for most applications than the size of the general purpose registers.
Original answer
Different people can mean different things, because as you say there are several measures. So for example someone talking about memory addressing might mean something different to someone talking about integer arithmetic. However, I'll try and define what i think is the common understanding.
My take is that for a CPU it means "The size of the typical register used for standard operations" or "the size of the data bus" (the two are normally equivalent).
I justify this with the following logic. The Z80 has an 8bit accumulator and an 8 bit databus, while having 16bit memory addressing registers (IX, IY, SP, PC), and a 16bit memory address bus. And the Z80 is called an 8bit microprocessor. This means people must normally mean the main integer arithmetic size, or databus size, not the memory addressing size.
It is not the size of instructions, as the Z80 (again) had 1,2 and 3 byte instructions, though of course the multi-byte were read in multiple reads. In the other direction, the 8086 is a 16bit microprocessor and can read 8 or 16bit instructions. So I would have to disagree with the answers that say it is instruction size.
For Operating systems, I would define it as "the code is compiled to run on a CPU of that size", so a 32bit OS has code compiled to run on a 32 bit CPU (as per the definition above).
How many bits a CPU "is", means what it's instruction word length is.
On a 32 bit CPU, the word length of such instruction is 32 bit, meaning that this is the width what a CPU can handle as instructions or data, often resulting in a bus line with that width.
For a similar reason, registers have the size of the CPU's word length, but you often have larger registers for different purposes.
Take the PDP-8 computer as an example. This was a 12 bit computer. Each instruction was 12 bit long. To handle data of the same width, the accumulator was also 12 bit.
But what makes the 12-bit computer a 12 bit machine, was its instruction word length. It had twelve switches on the front panel with which it could be programmed, instruction by instruction.
This is a good example to break out of the 8/16/32 bit focus.
The bit count is also typically the size of the address bus. It therefore usually tells the maximum addressable memory.
There's a good explanation of this at Wikipedia:
In computer architecture, 32-bit integers, memory addresses, or other data units are those that are at most 32 bits (4 octets) wide. Also, 32-bit CPU and ALU architectures are those that are based on registers, address buses, or data buses of that size. 32-bit is also a term given to a generation of computers in which 32-bit processors were the norm.
Now let's talk about OS.
With OS-es, this is way less bound to the actual "bitty-ness" of the CPU, it usually reflects how opcodes are assembled (for which word length of the CPU) and how registers are adressed (you can't load a 32 bit value in a 16 bit register) and how memory is adressed. Think of it as the completed, compiled program. It is stored as binary instructions and has therefore to fit into the CPUs word length. Task-wise, it has to be able to address the whole memory, otherwise it couldn't do proper memory management.
But what come's down to it, is whether a program is 32 or 64 bit (an OS is essentially a program here) it how its binary instructions are stored and how registers and memory are addressed. All in all, this applies to all kinds of programs, not just OS-es. That's why you have programs compiled for 32 bit or 64 bit.
The difference comes down to the bit width of an instruction set passed to a general purpose register for operating on. 16 bits can operate on 2 bytes, 64 on 8 bytes of instruction at a time. You can often increase throughput of a processor by executing more dense instructions per clock cycle.
The definitions are marketing terms more than precise technical terms.
In fuzzy technical term they are more related to architecturally visible widths than any real implementation register or bus width. For instance the 68008 was classed as a 32-bit CPU, but had 16-bit registers in the silicon and only an 8-bit data bus and 20-odd address bits.
http://en.wikipedia.org/wiki/64-bit#64-bit_data_models the data models mean bitness for the language.
The "OS is x-bit" phrase usually means that the OS was written for x-bit cpu mode, that is, 64-bit Windows uses long mode on x86-64, where registers are 64 bits and address space is 64-bits large and there are other distinct differences from 32-bits mode, where typically registers are 32-bits wide and address space is 32-bits large. On x86 a major difference between 32 and 64 bits modes is presence of segmentation in 32-bits for historical compatibility.
Usually the OS is written with CPU bitness in mind, x86-64 being a notable example of decades of backwards compatibility - you can have everything from 16-bit real-mode programs through 32-bits protected-mode programs to 64-bits long-mode programs.
Plus there are different ways to virtualise, so your program may run as if in 32-bits mode, but in reality it is executed by a non-x86 core at all.
When we talk about 2^n bit architectures in computer science then we are basically talking about memory registers, address buses size or data buses size. The basic concept behind term of 2^n bit architecture is to signify that this here 2^n bit of data can be made use to address/transport the data of size 2^n by processes.
As far as I know, technically, it's the width of the integer pathways. I've heard of 16bit chips that have 32bit addressing. However, in reality, it is the address width. sizeof(void*) is 16bit on a 16bit chip, 32bit on a 32bit, and 64bit on a 64bit.
This leads to problems because C and C++ allow conversions between void* and integral types, and it's safe if the integral type is large enough (the same size as the pointer). This lead to all sorts of unsafe stuff in terms of
void* p = something;
int i = (int)p;
Which will horrifically crash and burn on 64bit code (works on 32bit) because void* is now twice as big as int.
In most languages, you have to work hard to care about the width of the system you're working on.

What are the advantages of a 64-bit processor?

Obviously, a 64-bit processor has a 64-bit address space, so you have more than 4 GB of RAM at your disposal. Does compiling the same program as 64-bit and running on a 64-bit CPU have any other advantages that might actually benefit programs that aren't enormous memory hogs?
I'm asking about CPUs in general, and Intel-compatible CPUs in particular.
There's a great article on Wikipedia about the differences and benefits of 64bit Intel/AMD cpus over their 32 bit versions. It should have all the information you need.
Some on the key differences are:
16 general purpose registers instead of 8
Additional SSE registers
A no execute (NX) bit to prevent buffer overrun attacks
The main advantage of a 64-bit CPU is the ability to have 64-bit pointer types that allow virtual address ranges greater than 4GB in size. On a 32-bit CPU, the pointer size is (typically) 32 bits wide, allowing a pointer to refer to one of 2^32 (4,294,967,296) discrete addresses. This allows a program to make a data structure in memory up to 4GB in size and resolve any data item in it by simply de-referencing a pointer. Reality is slightly more complex than this, but for the purposes of this discussion it's a good enough view.
A 64-bit CPU has 64-bit pointer types that can refer to any address with a space with 2^64 (18,446,744,073,709,551,616) discrete addresses, or 16 Exabytes. A process on a CPU like this can (theoretically) construct and logically address any part of a data structure up to 16 Exabytes in size by simply de-referencing a pointer (looking up data at an address held in the pointer).
This allows a process on a 64-bit CPU to work with a larger set of data (constrained by physical memory) than a process on a 32 bit CPU could. From the point of view of most users of 64-bit systems, the principal advantage is the ability for applications to work with larger data sets in memory.
Aside from that, you may get a native 64-bit integer type. A 64 bit integer makes arithmetic or logical operations using 64 bit types such as C's long long faster than one implemented as two 32-bit operations. Floating point arithmetic is unlikely to be significantly affected, as FPU's on most modern 32-bit CPU's natively support 64-bit double floating point types.
Any other performance advantages or enhanced feature sets are a function of specific chip implementations, rather than something inherent to a system having a 64 bit ALU.
This article may be helpful:
http://www.softwaretipsandtricks.com/windowsxp/articles/581/1/The-difference-between-64-and-32-bit-processors
This one is a bit off-topic, but might help if you plan to use Ubuntu:
http://ubuntuforums.org/showthread.php?t=368607
And this pdf below contains a detailed technical specification:
http://www.plmworld.org/access/tech_showcase/pdf/Advantage%20of%2064bit%20WS%20for%20NX.pdf
Slight correction. On 32-bit Windows, the limit is about 3GB of RAM. I believe the remaining 1GB of address space is reserved for hardware. You can still install 4GB, but only 3 will be accessable.
Personally I think anyone who hasn't happily lived with 16K on an 8-bit OS in a former life should be careful about casting aspersions against some of today's software starting to become porcine. The truth is that as our resources become more plentiful, so do our expectations. The day is not long off when 3GB will start to seem ridiculously small. Until that day, stick with your 32-bit OS and be happy.
About 1-3% of speed increase due to instruction level parallelism for 32-bit calculations.
Just wanted to add a little bit of information on the pros and cons of 64-bit CPUs. https://blogs.msdn.microsoft.com/joshwil/2006/07/18/should-i-choose-to-take-advantage-of-64-bit/
The main difference between 32-bit processors and 64-bit processors is the speed they operate. 64-bit processors can come in dual core, quad core, and six core versions for home computing (with eight core versions coming soon). Multiple cores allow for increase processing power and faster computer operation. Software programs that require many calculations to function operate faster on the multi-core 64-bit processors, for the most part. It is important to note that 64-bit computers can still use 32-bit based software programs, even when the Windows operating system is a 64-bit version.
Another big difference between 32-bit processors and 64-bit processors is the maximum amount of memory (RAM) that is supported. 32-bit computers support a maximum of 3-4GB of memory, whereas a 64-bit computer can support memory amounts over 4 GB. This is important for software programs that are used for graphical design, engineering design or video editing, where many calculations are performed to render images, drawings, and video footage.
One thing to note is that 3D graphic programs and games do not benefit much, if at all, from switching to a 64-bit computer, unless the program is a 64-bit program. A 32-bit processor is adequate for any program written for a 32-bit processor. In the case of computer games, you'll get a lot more performance by upgrading the video card instead of getting a 64-bit processor.
In the end, 64-bit processors are becoming more and more commonplace in home computers. Most manufacturers build computers with 64-bit processors due to cheaper prices and because more users are now using 64-bit operating systems and programs. Computer parts retailers are offering fewer and fewer 32-bit processors and soon may not offer any at all.

Resources