Porting JVM to MINIX - Linux

As you can see from the title, I need to make it possible to run .class files on MINIX (a compiler is not necessary). Could somebody point me in a direction, suggest some literature, or give some advice? Generally, how would you go about it?
So far I have found OpenJDK (but it's not exactly what I am looking for). I have also read Tanenbaum's "Operating Systems: Design and Implementation", which gave me a lot of insight into MINIX internals.

If you just want to run .class files without much concern for performance, you could create a bytecode interpreter, which might be simpler than porting or creating a full compiler. You can find the format of these class files detailed here, and the behavior of the VM specified here.
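To give a feel for the first step, here is a minimal sketch in C of reading the class file header (magic number, version, constant pool count) exactly as the class file format lays it out; error handling is kept minimal for brevity:

    /* Minimal class-file header reader; fields are big-endian per the spec. */
    #include <stdio.h>
    #include <stdint.h>

    static uint16_t read_u2(FILE *f)
    {
        int hi = getc(f), lo = getc(f);
        return (uint16_t)((hi << 8) | lo);
    }

    static uint32_t read_u4(FILE *f)
    {
        uint32_t hi = read_u2(f);
        return (hi << 16) | read_u2(f);
    }

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s Foo.class\n", argv[0]);
            return 1;
        }
        FILE *f = fopen(argv[1], "rb");
        if (!f) {
            perror("fopen");
            return 1;
        }
        uint32_t magic    = read_u4(f);   /* always 0xCAFEBABE */
        uint16_t minor    = read_u2(f);
        uint16_t major    = read_u2(f);
        uint16_t cp_count = read_u2(f);   /* number of constant pool entries + 1 */
        if (magic != 0xCAFEBABEu) {
            fprintf(stderr, "not a class file\n");
            fclose(f);
            return 1;
        }
        printf("class file version %u.%u, constant pool count %u\n",
               major, minor, cp_count);
        fclose(f);
        return 0;
    }

From there, an interpreter parses the constant pool and method tables, then executes bytecodes in a dispatch loop.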
You'll also need to pick a runtime -- OpenJDK and GNU Classpath are probably the best bets -- and port it to MINIX by implementing its native methods in C. Native methods are usually concerned with platform-specific stuff, like calls to file I/O, and therefore cannot be implemented in the platform-independent Java language.
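As a rough illustration of what porting one native method involves, here is a hedged JNI-style sketch; the class and method names (MinixFile.nativeOpen) are hypothetical, and the real set of natives to implement is dictated by whichever runtime you pick:

    /* Hypothetical native method mapping a Java file-open call to the
     * POSIX-like open() that MINIX provides. */
    #include <jni.h>
    #include <fcntl.h>

    JNIEXPORT jint JNICALL
    Java_MinixFile_nativeOpen(JNIEnv *env, jclass cls, jstring path)
    {
        (void)cls;
        const char *cpath = (*env)->GetStringUTFChars(env, path, NULL);
        int fd = open(cpath, O_RDONLY);   /* the platform-specific part */
        (*env)->ReleaseStringUTFChars(env, path, cpath);
        return (jint)fd;
    }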
There are a number of other links and resources that you might find useful on this wiki page.

The Jainja JVM (I'm the author) works on Minix 3.2 (not tested with 3.3). It's an interpreter (i.e. no JIT) with a Java 5 standard library. There is limited support for AWT/Swing using an X11 backend.

Related

Is there a good "OCaml Browser" tool for Linux?

I'm using Emacs + Typerex for OCaml programming. I tried OcaIDE before on Windows. It's not as nice as Typerex, but it does have one good feature: OCaml Browser.
Is there any way to get such a browser in Typerex?
(Eclipse + OcaIDE on Linux might work, but I do not like it as much as Typerex.)
Thanks
ocamlbrowser is actually the name of a program that has been distributed with the OCaml compiler for a very long time. It is written with LablTk, maintained by Jacques Garrigue, and lives inside the OCaml distribution (rather than being an external tool) because it accesses .cmi files in ways that rely on internal details of the compiler.
So the short answer is: "yes, just call ocamlbrowser in your terminal" (assuming your distribution packaged ocamlbrowser with the compiler, which may or may not be the case; there may be a separate ocamlbrowser package instead). The look&feel of the tool may be a bit dated compared to a shiny Eclipse version, but it indeed exists and works fine.

Linux equivalent header for synch.h

I have C code which contains #include <synch.h>. The code compiles successfully on Solaris, but on Linux I find that the header is missing.
As suggested in a few links, can "sync.h" be used instead?
Or is there any other equivalent header for synch.h in Linux?
The synch.h header in Solaris is for Solaris threads. Among other things, it provides declarations for semaphores and mutexes. You can either use this library (http://sctl.sourceforge.net/sctl_v1.1_rn.html) to give you Solaris-compatible threading on Linux or, a much better idea, rework your code to use POSIX threading, as sketched below.
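For reference, a minimal sketch of the POSIX equivalents, with the corresponding Solaris calls shown in comments; compile with -pthread:

    /* pthread_mutex_t replaces Solaris mutex_t; sem_t, from <semaphore.h>,
     * replaces sema_t. */
    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static sem_t slots;

    int main(void)
    {
        /* Solaris: sema_init(&slots, 4, USYNC_THREAD, NULL); */
        sem_init(&slots, 0, 4);     /* 0 = shared between threads, not processes */

        /* Solaris: mutex_lock(&lock); ... mutex_unlock(&lock); */
        pthread_mutex_lock(&lock);
        puts("in critical section");
        pthread_mutex_unlock(&lock);

        /* Solaris: sema_wait(&slots); ... sema_post(&slots); */
        sem_wait(&slots);
        sem_post(&slots);

        sem_destroy(&slots);
        return 0;
    }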
We can't be sure, but one likely possibility is that this is the synch.h that's part of NACHOS, which is often used in educational environments. Head over to the NACHOS project page and read up on it, and decide whether you think that's probably the right thing; if so, you can download and install it for free.

Compiled interpreted language

Is there a programming language that has a usable interactive interpreter, even though it can also be compiled to machine code?
Compilation vs. "interpretation" is essentially a matter of the implementation, not the language itself. For example, MRI Ruby 1.8 is interpreted, while MacRuby is compiled to native machine code. Both include an interactive REPL. Here are all the languages I know of that have at least one machine-code compiler and at least one REPL:
Ruby
Python
Almost all Lisps (Lisp was the language that pioneered this technique, AFAIK)
OCaml
Haskell
Forth
If we're counting compilation to bytecode as well as machine code, it's true of the vast majority of popular bytecode-compiled languages:
Java
Scala
Groovy
Erlang
C#
F#
Smalltalk
Haskell, using the Glasgow Haskell Compiler, which has an interactive "shell" called GHCi.
Many flavors of Lisp offer both options, including Clojure.
Two come to mind: OCaml and Scala (~= Java), but I'm sure there must be a lot more out there.
And here's another one to burn your house down:
x86 Assembly
Yup, there are interpreters for this as well.
JavaScript x86 Assembly Interpreter
Jasmin
At this point you're really in emulator land, but it does meet the requirements you state.
I'm wondering if it's easier to name compiled languages that someone hasn't cobbled up a working interpreter for. :-)
Lua has an interactive mode for one-liners and experimentation. It normally compiles to bytecode for its VM for execution. LuaJIT is an independent implementation of a Lua VM that also does just-in-time compilation to 32-bit x86. Support for 64-bit is underway, and support for ARM is frequently requested.
Compilation to a bytecode is often a reasonable compromise between a pure interpreter and a pure compiler. The VM can be tuned to the needs of the language, and JIT techniques can analyze the VM code as it executes and concentrate on frequently executed code paths and inner loops.
As others have mentioned, OCaml.
If managed code (.NET CLI) is close enough to machine code, F# would be a candidate as well. There are probably other .NET/Mono languages which meet the requirement as well.
You may regret you asked:
C and C++.
Why?
Ch
CINT
EIC
picoc
and there are probably others out there as well.
Plenty of languages offer an implementation that both interacts and compiles to machine code, but it's rare to do both at once. Standard ML of New Jersey is one that has an interactive loop but no bytecode: it simply compiles to machine code in memory and then branches to it.
Not exactly machine code, but Java can be compiled and also used via BeanShell.
I've used Ruby with an interpreter, and there seems to be a compiler here.
Icon used to have a compiler, but it falls in and out of maintenance. It may still work.
Python can be compiled to Windows executables.
C# can be compiled using SnippetCompiler; maybe that would act as an interactive interpreter for you?
Your question is a bit vague. Even Java would fit it:
"by interactive interpreter, I mean a shell-like environment, where you can work in the runtime interactively."
Java has this, e.g. in the Eclipse "scrapbook pages", where you can enter Java expressions and have them evaluated right away. Java is of course also a compiled language (and while it's usually compiled to bytecode, there are various compilers that output machine code).
So what are you looking for? Maybe you could explain your problem or interest.
I tried using Mono/.NET for a bit and found the random GC pauses disagreeable (at least on my crusty old laptop). I looked at using Gambit-C, an implementation of Scheme that can compile to C, but it seemed difficult to work with because the docs were somewhat limited and the packages were not very easy to install and use.
I usually just stick to an interpreted language such as Python bound to C/C++, which is more painful, but at least I know what I am in for.

Bare metal cross compilers input

What are the input limitations of a bare-metal cross compiler? For instance, will it refuse to compile programs with pointers or mallocs, or anything else that would require more than the underlying hardware? Also, how can one find these limitations?
I also wanted to ask: I built a cross compiler targeting MIPS. I need to create a MIPS executable using this cross compiler, but I am not able to find where the executable is. There is one executable I found, mipsel-linux-cpp, which is supposed to compile, assemble and link and then produce a.out, but it is not doing so.
However, ./cc1 does give MIPS assembly.
There is an install folder which has a gcc executable that uses i386 assembly and then gives an exe. I don't understand how the gcc exe can give i386 and not MIPS assembly when I have specified the target as MIPS.
Please help; I'm really not able to understand what is happening.
I followed the following steps:
1. Installed binutils 2.19
2. Configured gcc for MIPS (g++, core)
I would suggest that you should have started two separate questions.
The GNU toolchain does not have any OS dependencies, but the GNU library does. Most bare-metal cross builds of GCC use the Newlib C library, which provides a set of syscall stubs that you must map to your target yourself. These stubs include the low-level calls necessary to implement stream I/O and heap management. They can be very simple or very complex depending on your needs. If the only I/O support is a UART for stdin/stdout/stderr, then it is simple. You don't have to implement everything, but if you do not implement the I/O stubs, you won't be able to use printf() for example. You must implement the sbrk()/sbrk_r() syscall if you want malloc() to work.
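As a hedged sketch of those two stubs (uart_putc() here is a stand-in for whatever routine writes a byte to your board's UART, and _end is the end-of-data symbol that GNU linker scripts conventionally provide):

    extern char _end;               /* provided by the linker script */
    static char *heap_ptr = &_end;  /* current top of the heap */

    extern void uart_putc(char c);  /* your board-specific UART output */

    /* Called by Newlib to implement write(); enough to make printf()
     * work once stdout reaches here. */
    int _write(int fd, const char *buf, int len)
    {
        (void)fd;
        for (int i = 0; i < len; i++)
            uart_putc(buf[i]);
        return len;
    }

    /* Called by Newlib's malloc() to grow the heap. A real stub should
     * check for collision with the stack and return (void *)-1 on failure. */
    void *_sbrk(int incr)
    {
        char *prev = heap_ptr;
        heap_ptr += incr;
        return prev;
    }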
The GNU C++ library will work correctly with Newlib as its underlying library. If you use C++, the C runtime start-up (usually crt0.s) must include the static initialiser loop to invoke the constructors of any static objects your code may include (a sketch follows below). The run-time start-up must also, of course, initialise the processor, clocks, SDRAM controller, timers, MMU, etc.; that is your responsibility, not the compiler's.
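For the static initialiser loop itself, a minimal sketch in C; the __init_array_start/__init_array_end symbols are the conventional names GNU linker scripts give to the bounds of the .init_array section, so check your own linker script for the exact names:

    typedef void (*init_fn)(void);

    extern init_fn __init_array_start[];  /* from the linker script */
    extern init_fn __init_array_end[];

    /* Call every static constructor before entering main(). */
    static void run_static_constructors(void)
    {
        for (init_fn *fn = __init_array_start; fn < __init_array_end; fn++)
            (*fn)();
    }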
I have no experience of MIPS targets, but the principles are the same for all processors. There is a very useful article called "Building Bare Metal ARM with GNU" which you may find helpful; much of it will be relevant, especially the parts regarding implementing Newlib stubs.
Regarding your other question: if your compiler is called mipsel-linux-cpp, then it is not a 'bare-metal' build but rather a Linux build. Also, this executable does not really "compile, assemble and link"; it is rather a driver that separately calls the pre-processor, compiler, assembler and linker. It has to be configured correctly to invoke the cross-tools rather than the host tools. I generally invoke the linker separately in order to enforce decisions about which standard library to link (-nostdlib), and also because it makes more sense when an application is comprised of multiple execution units. I cannot offer much help other than that here, since I have always used GNU-ARM tools built by people with obviously more patience than me, and moreover hosted on Windows, where there is less possibility of the host tool-chain being invoked instead (one reason why I have also avoided those tool-chains that rely on Cygwin).
EDIT
With more time available, I have rewritten my original answer in an attempt to provide something more useful.
I cannot provide a specific answer for your question. I have never tried to get code running on a MIPS machine. What I do have is plenty of experience getting a variety of "bare metal" boards up and running. All kinds of CPUs and all kinds of compilers and cross compilers. So I have an understanding of the principles that apply in all such situations. I will point out the kind of knowledge you will need to absorb before you can hope to succeed with a job like this, and hopefully I can list some links to resources to get you started on learning that knowledge.
I am worried that you don't know that pointers are exactly the kind of thing a bare-metal compiler can handle; they are a basic machine primitive. This tells me you are probably not an expert embedded developer who is just stuck in this particular scenario. Never mind. There isn't anything magic about programming an embedded system, and you can learn what you need to know.
The first step is getting to understand the relationship between C and the machine you wish to run code on. Basically C is a portable assembly language. This means that C is good for manipulating the basic operations of the machine. In this sense the basic operations of the machine are reading and writing memory locations, performing arithmetic and boolean operations on the data read from memory, and making branching and looping decisions based on that data. In particular the C concept of pointers allows you to manipulate data at locations in memory that you specify.
So far so good, but just doing raw computations in memory is not usually enough - you need a way to input and output data from memory. To do that you need to manipulate the hardware peripherals on your board. If the hardware peripherals are memory mapped, then the machine registers used to control the peripherals look exactly like memory locations and C can manipulate them directly. Even in that case, though, it is much more likely that doing useful I/O is best handled by extending the C core language with a library of routines provided just for that purpose. These library routines handle all the nasty details (timers, interrupts, non-memory-mapped I/O) involved in manipulating the peripheral hardware on the board, and wrap them up with a convenient C function call interface. The idea is that you can simply write printf("hello world"); and the library takes care of the details of displaying the string.
An appropriately skilled developer knows how to adapt an existing I/O library to a new board, or how to develop new library routines to provide access to non-standard custom hardware. The classic way to develop these skills is to start with something simple, usually a LED for an output device and a switch for an input device. Write a program that pulses a LED in a predictable way, or reads a switch and reflects it on a LED (a sketch of the LED exercise follows below). The first time you get this working will be hugely satisfying.
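A minimal sketch of that exercise; the GPIO register address 0x10012000 and the bit position are entirely hypothetical, so substitute the values from your board's datasheet:

    #include <stdint.h>

    #define GPIO_OUT (*(volatile uint32_t *)0x10012000)  /* hypothetical address */
    #define LED_BIT  (1u << 3)                           /* hypothetical bit */

    /* Crude busy-wait; a real program would use a hardware timer. */
    static void delay(volatile uint32_t n)
    {
        while (n--)
            ;
    }

    int main(void)
    {
        for (;;) {
            GPIO_OUT ^= LED_BIT;   /* toggle the LED */
            delay(500000);
        }
    }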
Okay, I have rambled enough. It is time to provide some more resources for you to study. The good news is that there's never been a better time to learn how things work at the interface between hardware and software. There is a wealth of freely available code and docs. Stack Overflow is a great resource, as you know. Good luck! Links follow:
Embedded systems overview
Knowing the C language well is fundamental
Why not get your code working on a simulator before you try real hardware
Another emulated environment
Linux device drivers - an overlapping subject
Another book about bare metal programming

Is assembler portable between Linux distros?

Is a program shipped in assembler format portable between Linux distributions (modulo CPU architecture differences)?
Here's the background to my question: I'm working on a new programming language (named Aklo), whose modus operandi will be the classic one of compiling to .s and feeding the result to the GNU assembler.
Obviously it would be nice ultimately to have the implementation written in itself, but I had resigned myself to maintaining it in C++ to solve the chicken and egg problem: suppose you download the compiler for the first time and it is itself written in Aklo, how do you compile it? As I understand it, different Linux distributions and other UNIX like systems have different conventions for binary formats.
But it's just occurred to me that a solution might be to ship the .s files (well, one per CPU architecture): it's fair to assume you either have or can install the GNU assembler. Of course I'd still need a bootstrap compiler, but that doesn't need to be fast; I can write it in Python.
Is assembler portable in the way that binaries are not? Are there any other stumbling blocks I haven't thought of?
Added in response to one answer:
I had looked wistfully at LLVM, there is certainly a lot of good stuff there and it would make my life easier -- except that it would incur a dependency on the correct version of LLVM being installed. It wouldn't be so bad having that dependency on development machines, but in a world where it's common to ship programs as source, the same dependency would be incurred for every user of every program ever written in Aklo, and I decided that was too high a price to pay.
But if the solution of shipping compiled programs as assembler works... then that solves that problem, and I can use LLVM after all, which would be a big win.
So the question about portability of assembler is even considerably more important than I had first realized.
Conclusion: from the answers here and on the LLVM mailing list (http://lists.cs.uiuc.edu/pipermail/llvmdev/2010-January/028991.html), it seems the bad news is that the problem is unsolvable, but the good news is that this means using LLVM makes it no worse, so I'm free to do so and obtain all the advantages thereof.
You might want to check out LLVM before going down this particular path. It might make your life a lot easier, as it provides a low level virtual machine that makes a lot of hard stuff just work and has been very popular.
At a very high level, the ABI consists of { instruction set, system calls, binary format, libraries }.
Distribution as .s may free you from the binary format. This is still rather pointless, because you are fixed to a particular ISA and still need to use libraries and/or make system calls. Libraries vary from distribution to distribution (although this isn't really that bad, especially if you just use libc) and syscalls vary from OS to OS.
It's basically 20 years since I last bootstrapped a C compiler. At the level of compilers, the differences between Linux distributions are minimal.
The much more important reason for going LLVM is cross-platform; if you're not writing some intermediate language, your compiler will be extremely difficult to retarget for different processors. And seeing as, on my laptop, I have compilers for x86, x86_64, two kinds of MIPS, PowerPC, ARM and AVR... you see where I'm going? I can compile multiple languages for most of those targets too (only C for AVR).
