When distributing source code is not an option the Linux Standard Base allows a mechanism for achieving binary compatibility with many Linux distros.
http://www.linuxfoundation.org/collaborate/workgroups/lsb
However, it appears that it has been somewhat abandoned as an approach:
https://github.com/LinuxStandardBase/lsb/blob/master/README.md
I'd be interested in thoughts on the best path forward for maintaining binary compatibility. The document referenced above basically says "Linux is evolving so rapidly that it's not practical to have a standard base". I can appreciate that, but what are the other options? It's difficult to google this topic because you mostly get references about the LSB.
Related
I am a non CS/IT student, but having knowledge of C, Java, DS and Algorithms. Now-a-days I am focusing on operating system and had gained some of its concepts. But I want some practical knowledge of it. Merely writing algo code in java/c has no fun in doing. I have gone through many articles where they mentioned we can customize source code of Linux-kernel.
I want to start customizing the kernel as I move ahead in the learning of OS concepts and apply the same. It will make two goals achievable 1. I will gain practical idea of the operating system 2. I will have a project.
Problem which I face-
1. From where to get the source code? Which source code should I download? Also the documentation if possible.
https://www.kernel.org/
I went in there but there are so many of them which one will be better?
2. How will I customize the code once I have it?
Please give me suggestions with detail about how I should start this journey (of changing source code to customize Linux).
Moreover I am using Windows 8.
I recommend first reading several books on OSes and on programming. You need a broad CS culture (if possible get a CS degree)
I am a non CS/IT student,
You'll better become one, or else spend years of work to learn all the stuff a CS graduate student has learnt.
First, you need to be very familiar with Linux programming on user side (application programs). So read at least Advanced Linux Programming and study the source code of several programs, including shells (and some kind of servers). Read also carefully syscalls(2). Explore the state of your kernel (e.g. thru proc(5)...). Look into https://kernelnewbies.org/
I also recommend learning several programming languages. You should in particular read SICP, an excellent introduction to programming. Read also some book like programming language pragmatics. Read something about continuation and continuation passing style. Read the Dragon book. Read some Introduction to Algorithms. Read something about computer architecture and instruction set architecture
Merely writing algo code in java/c has no fun in doing.
But the kernel is also written in C (mostly) and full of algorithmic code. What makes you think you'll get more fun in it?
I want to start customizing the kernel as I move ahead in the learning of OS concepts and apply the same.
But why? Why don't you also consider studying and contributing to some user-level code
I would recommend first reading a good book on OSes in general, notably Operating Systems: Three Easy Pieces. Look also on OSdev.
At last, the general advice about kernel programming is don't. A common mistake is to try adding code inside the kernel to solve some issue that can and should be solved in user-land.
How will I customize the code once I have it?
You probably should not customize the kernel, but if you did you'll use familiar tools (a good source code editor like emacs or vim, a compiler and linker on the command line, a build automation tool like make). Patching the kernel is similar to patching some other free software. But testing your kernel is harder (because you'll often reboot).
You'll also find several books explaining the Linux kernel.
If you still want to customize the kernel you should first try to code some kernel module.
Moreover I am using Windows 8.
This is a huge mistake. You first need to be an advanced Linux user. So wipe out Windows from your computer, and install some Linux distribution -I recommend Debian- (and use only Linux, no more Windows). Become familiar with command line.
I seriously recommend to avoid working on the kernel as your first project.
I strongly recommend looking at some existing user-land free software project first (there are thousands of them, notably on github, e.g. choose some package in your distribution, study its source code, work on it, propose the patch to the community). Be able to build from source code a lot of things.
A wise man once said you "must act your way into right thinking, as you cannot think your way into right acting". In your case, you'll need to act as an experienced programmer would act, which means before we write any code, we need to answer some questions.
What do we want to change?
Why do we want to change it?
What are the repercussions of this change (ie what other functions - out of all the 10's of millions of lines of source code - call this function)?
After we've made the change, how are we going to compile it? In other words, there is a defined process for this. What is it?
After we compile our new kernel/module, how are we going to test it?
A good start, in addition to the answer that was just posted, would be to run LFS (Linux from Scratch). Get a successful install of that and use it as a starting point.
Now, since we're experienced programmers, we know that tinkering with a 10M+ line codebase is a recipe for trouble; we need a bit more direction than that. Here's a list of bugs that need to be fixed: https://bugzilla.kernel.org/buglist.cgi?chfield=%5BBug%20creation%5D&chfieldfrom=7d
I, for one, would be glad to see the one called "AUFS hangs on fanotify" go away, as I use AUFS with Docker on a daily basis.
If, down the line, you decide you'd rather hack on something besides the kernel, there are plenty of other options.
From your question it follows that you've already gained some concepts of an operating system. However, if you feel that it's still insufficient, it is OK to spend more time on learning. An operating system (mainly, a kernel) has certain tasks to perform like memory management (or memory protection), multiprogramming, hardware abstraction and so on. Neither of the topics may be neglected - they are all as important. So, if you have some time, you may refer to such useful books as "Modern Operating Systems" by Andrew Tanenbaum. Special books like that will shed much light on all important aspects of a modern OS. Suffice it to say, Linux kernel itself was started by Linus Torvalds because of a strong inspiration by MINIX - an educational project by A. Tanenbaum.
Such a cumbersome project like an OS kernel (BSD, Linux, etc.) contains lots of code. Many people are collaborating to write or enhance whatever parts of the kernel. So, there is a common and inevitable need to use a version control system. So, if you have an intention to submit your code to the kernel in future, you also have to have hands on with version control. Particularly, Linux relies on Git SCM (software configuration management - a synonym for version control).
So, once you have some knowledge of Git, you can install it on your computer and download Linux source code: git clone https://github.com/torvalds/linux.git
Determine your goals at Linux kernel modification. What do you want to achieve? Perhaps, you have a network card which you suspect to miss some features in Linux? Take a look at the other vendors' drivers and make an attempt to fix the driver of interest to include the features. Of course, this will require some knowledge of the HW, and, if the features are HW dependent, you will unlikely succeed to elaborate your code without special knowledge. But, in general, - if you are trying to make an enhancement, it assumes that you are an experienced Linux user yourself. Otherwise, how will you understand that some fixes/enhancements/etc. are required? So, I can't help but agree with the proposal to postpone Windows 8 for a while and start using some Linux distribution (eg. Debian).
If you succeed to determine your goals (eg. if you find a paper describing some desired changes in Linux kernel or if you decide to enhance some device drivers / write your own), you will be able to try it hands on. However, you still might need some helpful books, but, in this case, some Linux-specific ones. Also, writing C code for the kernel itself will require one important detail - you will need to comply with a so called coding standard, otherwise Linux kernel maintainers will not be able to accept your patches.
So, I made an attempt to outline some tips based on your current question. Of course, the job of kernel development has far more broad prerequisites, but these are which are just obvious.
The motivation for this question is a far-fetched dream I have where a lot of the excellent software available on *nix platforms could be trivially ported to Windows. Microsoft has taken a different approach to open source and openness in general recently, so I'd really like to know how viable such a thing would be if Microsoft were so inclined. Some of the more specific things I'm curious about is if it could be done without breaking backwards compatibility, and perhaps some sort of gauge on the amount of effort that would be involved. If there were any specific technical examples that would highlight particular difficulties in doing such a thing, that would also be greatly appreciated.
Windows has already been such. The NT kernel itself has supported the concept of "personalities" (API layers over the NT layer) since the beginning, to support by design at the very least the Win32 API, the POSIX API and the OS/2 API.
The POSIX layer circulated for a long time in higher end SKUs (typically server-related) with different names (Microsoft POSIX subsystem/SFU/SUA), but it never really caught on for non-specialized use, both because it was not universally available (Microsoft never really pushed it, probably for commercial reasons) and because other solutions became widespread (think Cygwin/MSYS/MinGW).
Notice that, although the "personality API" is an interesting concept (and probably one of the cleanest way to implement a multi-API OS) it suffers a bit from a "deep segregation" problem - i.e., it's true that you can access kernel objects through a POSIX interface, but all services that have been built over the Win32 interface (like Windows, GDI & co.) aren't easily available; besides, however good the interface may be, there are some details (like the format of paths) that cannot be ironed over, so a POSIX application will always look a bit out of place.
Is there a Linux distro (other than Minix) with good documentation for the source code? Or, is there some good documentation to describe the general Linux source code?
I have downloaded the Kernel source code, but, it is (unsurprisingly) a little overwhelming to find my way around and I wondered if there were some higher-level documentation to go with how the Linux kernel works?
Have you tried having a look on The linux documentation project I've find it quietly exhaustive regarding linux
They have a section The Linux Kernel wich is an online book that explains
how the linux kernel works and why it does behaves in certain ways, you should deffinitely
look into it because it's very well made.
Some of the Linux kernel code has decent commenting as documentation, but if you're going to be getting into kernel development, I'd recommend picking up a good book. A good, relatively easy-to-read one is Linux Kernel Development, by Robert Love. I got started on the Second Edition when I was in college, and keep a copy of the third on my bookshelf now.
I also find the Linux Cross Reference site helpful in jumping around the kernel source code. It's nice for tracking down functions that are in different files, and getting at what you need.
If you want to learn about operating systems and their basics, I strongly suggest you to start with a small kernel and then ramp up to learn about Linux. Starting with an operating system like Linux would be overwhelming in terms of code and documentation.
There is XV6 operating system which follows the basic Unix notion of files and processes. You can get the code listing and the documentation explaining the code properly. Here is a link to it. link.
Since academia is using this course as a baseline, I think you should get good support for understanding the same.
Linux Core Kernel Commentary is a little dated, but is still an excellent source of info.
For something which is not obsolete (like kernel.org/doc is), you may see:
Free Electrons Linux/Documentation/ (3.8)
Linux Cross Reference kernel/Documentation/
kernel-doc (3.6.10)
The first is the one I prefer personally (clean, readable, pleasant, up‑to‑date).
The second is the most well known.
The third, is for download, if you wish to browse and search it off‑line (may be handy in some case).
My two cents as a side note before I leave: I feel it's weird how for such a famous stuff as the Linux kernel is, when you search the web for documentation, you get masses of obsolete documentations, and how the rather up‑to‑date ones seems to be rather hidden and far from the top position of search engines.
Is there any way to protect a shared library (.so file) against reverse engineering ?
Is there any free tool to do that ?
The obvious first step is to strip the library of all symbols, except the ones required for the published API you provide.
The "standard" disassembly-prevention techniques (such as jumping into the middle of instruction, or encrypting some parts of code and decrypting them on-demand) apply to shared libraries.
Other techniques, e.g. detecting that you are running under debugger and failing, do not really apply (unless you want to drive your end-users completely insane).
Assuming you want your end-users to be able to debug the applications they are developing using your library, obfuscation is a mostly lost cause. Your efforts are really much better spent providing features and support.
Reverse engineering protection comes in many forms, here are just a few:
Detecting reversing environments, such as being run in a debugger or a virtual machine, and aborting. This prevents an analyst from figuring out what is going on. Usually used by malware. A common trick is to run undocumented instructions that behave differently in VMWare than on a real CPU.
Formatting the binary so that it is malformed, e.g. missing ELF sections. You're trying to prevent normal analysis tools from being able to open the file. On Linux, this means doing something that libbfd doesn't understand (but other libraries like capstone may still work).
Randomizing the binary's symbols and code blocks so that they don't look like what a compiler would produce. This makes decompiling (guessing at the original source code) more difficult. Some commercial programs (like games) are deployed with this kind of protection.
Polymorphic code that changes itself on the fly (e.g. decompresses into memory when loaded). The most sophisticated ones are designed for use by malware and are called packers (avoid these unless you want to get flagged by anti-malware tools). There are also academic ones like UPX http://upx.sourceforge.net/ (which provides a tool to undo the UPX'ing).
I would like to create a set of domain objects in multiple languages, so that I can target different platforms. I have been looking at external DSLs as a way to define a language for my domain, and then potentially writing adapters that generate code for the languages I'm interested in targeting. Is this the best way to solve this problem? Or is it just simpler to maintain multiple versions of the project?
I think that Apache Thrift delivers what you are asking for.
Sorry for late answer, but as you mention C# being your main language, this practically fully supported Visual Studio based technology is exactly what you are looking for.
You have to understand what you want to abstract with your DSLs, but the multiple-platform support is trivial on top of that.
Disclaimer: This is our technology, but it's publicly open and it solves exactly the problem presented in the question.
http://abstraction.codeplex.com/
Note! Mind the very "alpha" stage of the current download, I suggest you skip the zipped download and grab the latest source. I am updating better construct in relatively near future. Check out the "Context" implementation in "Production/Dev/AbstractionTemplate" solution.
It is difficult to be helpful without understanding what you are planning to use your DSL for.
Is portability your main problem here?
To succesfully target these different platforms, you will probably have to maintain plaftorm-specific layers anyway (generated or not).
If you plan to write your whole application in your DSL, then use your own compiler to transform it into runnable code for each platform, well it is most probably a bad idea, too complex and overengineered.
However, if you have a well-defined chunk of platform-independent logic, then a DSL is a good choice. Just write an interpreter for it on each target platform (provided that performance is not critical, this is also simpler and easier than generating code).
What is the best way to create multiple language versions of a domain?
This is (was?) somehow the idea of Model Driven Architecture (MDA). Quoting Model-driven architecture from Wikipedia:
The Model-Driven Architecture approach
defines system functionality using a
platform-independent model (PIM) using
an appropriate domain-specific
language (DSL).
Then, given a platform definition
model (PDM) corresponding to CORBA,
.NET, the Web, etc., the PIM is
translated to one or more
platform-specific models (PSMs) that
computers can run. This requires
mappings and transformations and
should be modeled too.
The PSM may use different Domain
Specific Languages (DSLs), or a
General Purpose Language (GPL) like
Java, C#, PHP, Python, etc. Automated tools generally
perform this translation.
Depending on the complexity of your domain and the availability of a MDA Tool, this might be an option (with a lower implementation cost).
See also
MDA: Nice idea, shame about the ...
Language Workbenches and Model Driven Architecture
UML vs. Domain-Specific Languages
DSL in the context of UML and GPL
UML or DSL: Which Bear Is Best? (be sure to read this one)