Difference between NASM, TASM, & MASM - linux

Can somebody explain the differences between MASM, TASM, and NASM?
Why can't I run TASM code on Linux? Are they different languages?
I thought that assembly language was the same on all systems.

TASM, MASM, and NASM are x86 assemblers.
Borland Turbo Assembler (TASM) and Microsoft Macro Assembler (MASM) are DOS/Windows-based; the Netwide Assembler (NASM) is available for other platforms as well. TASM produces 16-bit and 32-bit output; MASM and NASM also produce 64-bit output.
All of those assemblers take the x86 instruction set as input. That does not mean that assembler source files are identical and compatible, however.
Instruction syntax
Assemblers expect either the original syntax used in the Intel instruction set documentation (the Intel syntax) or the so-called AT&T syntax developed at AT&T Bell Labs. AT&T uses mov src, dest, whereas Intel uses mov dest, src, amongst other differences.
The Windows assemblers (TASM, MASM) prefer Intel syntax, while most Linux/UNIX assemblers use AT&T. NASM uses a variant of the Intel syntax.
Assembler-specific syntax
Assemblers have their own syntax for directives affecting the assembly process, macros and comments. These usually differ from assembler to assembler.
Compatibility
TASM can assemble MASM sources in "MASM mode".
NASM can assemble TASM code in "TASM mode". So, in theory, you can take TASM code and assemble it with NASM on Linux using that mode. Of course, the code might still need adjustments. If the code has OS dependencies, these will require your attention as well when you move from Windows to Linux.

Related

Is there a way to use NASM syntax for inline assembly?

I really dislike the GNU Assembler syntax and I have some existing code written with NASM syntax that would be quite painful and time-consuming to port.
Is it possible to make the global_asm!() macro use NASM as the assembler or possibly make GAS use NASM syntax?
You might be able to change it but it seems as if GAS is the only viable option. In Directives Support:
'Inline assembly supports a subset of the directives supported by both GNU AS and LLVM's internal assembler, given as follows. The result of using other directives is assembler-specific (and may cause an error, or may be accepted as-is).'
Additionally, the documentation states: "Currently, all supported targets follow the assembly code syntax used by LLVM's internal assembler which usually corresponds to that of the GNU assembler (GAS). On x86, the .intel_syntax noprefix mode of GAS is used by default."
This might be helpful as well: https://github.com/Amanieu/rfcs/blob/inline-asm/text/0000-inline-asm.md
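To make the quoted default concrete, here is a minimal sketch, assuming an x86-64 target and a Rust toolchain with stable asm! (1.59 or later). It writes the same addition once in the default Intel operand order and once with options(att_syntax). The function names add_intel and add_att are made up for illustration, and the non-x86-64 fallbacks exist only so the sketch builds elsewhere. Note that this only changes the instruction syntax; it does not make NASM directives or macros available.

```rust
// Hypothetical helpers; only the asm! macro and options(att_syntax) come from Rust itself.

#[cfg(target_arch = "x86_64")]
fn add_intel(a: u64, b: u64) -> u64 {
    use std::arch::asm;
    let mut x = a;
    // Default on x86: Intel operand order, destination first.
    unsafe {
        asm!("add {0}, {1}", inout(reg) x, in(reg) b);
    }
    x
}

#[cfg(target_arch = "x86_64")]
fn add_att(a: u64, b: u64) -> u64 {
    use std::arch::asm;
    let mut x = a;
    // AT&T operand order, source first; the % register prefixes are added by the compiler.
    unsafe {
        asm!("add {1}, {0}", inout(reg) x, in(reg) b, options(att_syntax));
    }
    x
}

// Assumption: plain-Rust stand-ins so the sketch still compiles on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn add_intel(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}

#[cfg(not(target_arch = "x86_64"))]
fn add_att(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}
```

Calling add_intel(2, 3) and add_att(2, 3) from main should give the same result; only the order of the operands in the template string differs.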

Is coding MASM for linux illegal?

Almost 30 years after MASM 6.0, it is still being used for educational purposes worldwide.
The 16-bit MASM is what is being taught.
I use Linux and am quite pained that not a single MASM assembler is available for it.
1. Wine does not work
2. DOSBox does not run the linker
3. I don't have a Windows XP CD to run in a VM
4. My only Windows computer runs Windows 8.1, which does not support MASM
People suggest NASM, but its syntax is not the same as that of a 16-bit MASM assembler.
Is it illegal to make a MASM for Linux, or for Windows 7 and above?
I was thinking about coding a simple one for educational purposes myself.

Is all MIPS code on Linux supposed to be PIC?

In Linux on MIPS CPUs (MIPSEL32 to be precise), is it true that all userland SOs are supposed to be position independent (PIC)? A citation from an authoritative source would be best.
How about Android?
My interest stems from this.
The situation with PIC code on Linux appears to be somewhat interesting. In the past (pre EGLIBC-2.9) all binaries on MIPS were supposed to be PIC (both applications and shared libraries). However, to reduce the size of applications, an ABI extension was developed to allow for non-PIC executables (but shared objects stay PIC, as before):
At this time we do not propose any change to the position-independent
addressing conventions used by shared objects. Similarly,
position-independent executables compiled with '-fpie' -- as required
for address space randomisation in "hardened" Linux distributions --
shall continue to use the existing psABI addressing and calling
mechanisms.
http://gcc.gnu.org/ml/gcc/2008-07/txt00000.txt
The wiki page on linux-mips.org stating that all binaries on MIPS must be PIC appears to be somewhat out of date, as both recent GCC and EGLIBC on Linux support non-PIC executables: http://www.linux-mips.org/wiki/PIC_code

Using x86 materials to learn assembly on a 64 bit OS?

I am teaching myself/reading up about assembly. Most of the books on assembly refer to x86- all the register names in the code begin with "e" and not "r" (as they would in x86-64). However, I use 64-bit Linux and I was wondering if these books have any value because they are not referring to x86-64.
So in short: is it really worth my using these resources to learn x86-64? Or, reworded: besides the difference in register naming convention, are there any other differences between the two which could make learning from x86 materials difficult?
64-bit Linux allows running 32-bit applications, so you can still create 32-bit applications on your computer. This way, the books and the example 32-bit code are fully useful.
The only problem you might have is if the assembly application dynamically links to some 32-bit shared library. To fix this you should install the 32-bit compatibility layer.
Assembly programs that use only Linux system calls work fine without this layer, which is actually a set of shared libraries compiled for 32-bit.
BTW, in my opinion, writing 32 bit code is still better if you want your programs to be useful for more people. There are still many 32 bit computers around and they will not disappear soon.
It's indeed a bit easier to learn assembly on 32bit since the calling conventions and stack management are simpler.
On 64-bit you need to worry about the ABI, and the conventions are not the same on every OS. For instance, the ABI rules on Mac OS X differ from those on Windows: the argument registers are not the same, and Windows passes only the first 4 arguments in registers.
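As a rough illustration of that ABI split, the following sketch uses Rust's extern "sysv64" and extern "win64" ABI strings, which are only accepted on x86-64 targets. The function names sum3_sysv and sum3_win are hypothetical, and the fallbacks for other architectures are stand-ins so the sketch builds anywhere. The source-level code is identical; what changes is which registers the compiler uses for the arguments.

```rust
// System V AMD64 convention (Linux, Mac OS X): a in RDI, b in RSI, c in RDX.
#[cfg(target_arch = "x86_64")]
extern "sysv64" fn sum3_sysv(a: u64, b: u64, c: u64) -> u64 {
    a.wrapping_add(b).wrapping_add(c)
}

// Windows x64 convention: a in RCX, b in RDX, c in R8.
#[cfg(target_arch = "x86_64")]
extern "win64" fn sum3_win(a: u64, b: u64, c: u64) -> u64 {
    a.wrapping_add(b).wrapping_add(c)
}

// Assumption: ordinary functions as stand-ins on non-x86-64 targets,
// where these two ABI strings are not available.
#[cfg(not(target_arch = "x86_64"))]
fn sum3_sysv(a: u64, b: u64, c: u64) -> u64 {
    a.wrapping_add(b).wrapping_add(c)
}

#[cfg(not(target_arch = "x86_64"))]
fn sum3_win(a: u64, b: u64, c: u64) -> u64 {
    a.wrapping_add(b).wrapping_add(c)
}
```

Both functions return the same value for the same inputs. Hand-written assembly that calls or implements either one has to follow the matching register assignment, which is why 64-bit assembly written for one OS cannot simply be reused on the other.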
You can assemble 32-bit code by passing -arch i386 to the assembler (as). With clang or gcc you can use -m32 (at least on Mac OS X; I haven't used it on Linux proper). You won't be able to link modules of different bitness (32-bit vs 64-bit) together.
Once you're ready to switch and build your program for 64-bit, you will have to make sure that your stack handling pushes 64-bit words instead of 32-bit ones, but that more or less goes without saying.
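The point about 64-bit pushes can be checked with a small sketch, assuming an x86-64 target and Rust's stable asm! macro. It reads the stack pointer, pushes one register, reads the stack pointer again, and pops to restore the stack. The function name push_width_bytes is hypothetical, and the fallback for other architectures is only a stand-in that reports the pointer width.

```rust
#[cfg(target_arch = "x86_64")]
fn push_width_bytes() -> u64 {
    use std::arch::asm;
    let before: u64;
    let after: u64;
    unsafe {
        asm!(
            "mov {before}, rsp", // stack pointer before the push
            "push {before}",     // push one 64-bit register
            "mov {after}, rsp",  // stack pointer after the push
            "pop {before}",      // restore the stack; reloads the value just pushed
            before = out(reg) before,
            after = out(reg) after,
        );
    }
    // The stack grows downwards, so the difference is the size of one push.
    before - after
}

// Assumption: on other architectures simply report the pointer width in bytes.
#[cfg(not(target_arch = "x86_64"))]
fn push_width_bytes() -> u64 {
    std::mem::size_of::<usize>() as u64
}
```

On x86-64 the difference is 8 bytes per push, where 32-bit code would move the stack pointer by 4. Offsets into the stack that are copied from 32-bit examples therefore need to be recomputed.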

Is assembly language `assembler` specific too? Which assembler is best?

I'm learning assembly language. I started with Paul A. Carter's PC Assembly Language which uses NASM (The Netwide Assembler). Then in the middle I switched and started reading Introduction to 80×86 Assembly Language and Computer Architecture which uses MASM.
In NASM I used to write, for initializing a byte
db 110101b
In MASM I'm using
BYTE 110101b
I'm in the middle of reading. Since these are assembler directives, they will be different for each assembler, right?
Don't these assembler developers follow a standard for these directives? They know that mnemonics are CPU-specific, so it's already a pain in the ass to learn and code in assembly language.
Now if they use different directives, it's even more pain if you change assembler or switch operating system (a MASM developer is in deep trouble if he goes to Linux).
My confusion is: should I acquaint myself with NASM or MASM? I'm a fan of Windows, but I may have to work on Linux in the future as well.
Every book should be titled "_________ Assembly Language using __________ Assembler"
Unfortunately there has never been a standard for assembly language. You'll just have to learn the directives that your assembler supports. Fortunately most of the directives, while having different names, are semantically similar, like db and BYTE.
But wait! It gets worse, especially for the x86. You have (at least) two forms of code that assemblers can accept: Intel and AT&T format. AT&T format reverses the order of most operands to instructions (or is it vice versa ;-).
NASM is probably a better choice for portability, but you could also look at the GNU assembler.
Intel Syntax / AT&T Syntax
With x86 in particular, the first assemblers were from Intel and then largely-compatible assemblers from Microsoft formed one branch.
These assemblers organize source and destination operands right to left and have an unusual (and to my eyes, kind of wacky) abstraction layer that uses a single mnemonic for 8, 16, and 32-bit ops and then derives the actual machine opcode to use based on properties of the operand. Modifiers exist (on operands) to force a particular size.
But Unix was also important and it had a completely different assembler line with different traditions and conventions.
The original Unix vendor was AT&T, which owned the intellectual property developed at Bell Labs. A series of BSD projects and then Linux continued with this tradition. These assemblers historically process operands left to right, have a spare design optimized for speed, and when used by humans they generally use cpp for macros and conditionals, even if the assembler also has parallel features.
These days you are probably using Visual Studio on Windows or the GNU tools on Linux or Mac, but this is why we still say AT&T vs Intel. The GNU assembler has an option to assemble both ways, although it's still really in the AT&T camp.
Generally yes. They are mostly feature-compatible though, so converting from one assembler syntax to another is usually not terribly difficult if you know both.
Processors are all documented in a manufacturer-supplied reference manual. Its notation (along with the assembler provided by the vendor) usually became the normative syntax for assembly programs on a particular platform. Consequently, many processors from a single vendor have similar syntax.
The situation became more complex with second sourcing of processors and the eventual development of multi-targeting assemblers that, for historical reasons, use mostly consistent syntax across all platforms. This also provides some arguable advantages when porting code across platforms.
Your best choices are to: pick a notation you are comfortable with and accept that books will use a different syntax; see if you can locate cross-system macro libraries or translation tools; or bite the bullet and learn multiple dialects. The third is usually tolerable, although it makes building private libraries labour-intensive.

Resources