How to proceed with Linux source code customization?

How to proceed with Linux source code customization? - linux

I am a non CS/IT student, but having knowledge of C, Java, DS and Algorithms. Now-a-days I am focusing on operating system and had gained some of its concepts. But I want some practical knowledge of it. Merely writing algo code in java/c has no fun in doing. I have gone through many articles where they mentioned we can customize source code of Linux-kernel.
I want to start customizing the kernel as I move ahead in the learning of OS concepts and apply the same. It will make two goals achievable 1. I will gain practical idea of the operating system 2. I will have a project.
Problem which I face-
1. From where to get the source code? Which source code should I download? Also the documentation if possible.
https://www.kernel.org/
I went in there but there are so many of them which one will be better?
2. How will I customize the code once I have it?
Please give me suggestions with detail about how I should start this journey (of changing source code to customize Linux).
Moreover I am using Windows 8.

I recommend first reading several books on OSes and on programming. You need a broad CS culture (if possible get a CS degree)
I am a non CS/IT student,
You'll better become one, or else spend years of work to learn all the stuff a CS graduate student has learnt.
First, you need to be very familiar with Linux programming on user side (application programs). So read at least Advanced Linux Programming and study the source code of several programs, including shells (and some kind of servers). Read also carefully syscalls(2). Explore the state of your kernel (e.g. thru proc(5)...). Look into https://kernelnewbies.org/
I also recommend learning several programming languages. You should in particular read SICP, an excellent introduction to programming. Read also some book like programming language pragmatics. Read something about continuation and continuation passing style. Read the Dragon book. Read some Introduction to Algorithms. Read something about computer architecture and instruction set architecture
Merely writing algo code in java/c has no fun in doing.
But the kernel is also written in C (mostly) and full of algorithmic code. What makes you think you'll get more fun in it?
I want to start customizing the kernel as I move ahead in the learning of OS concepts and apply the same.
But why? Why don't you also consider studying and contributing to some user-level code
I would recommend first reading a good book on OSes in general, notably Operating Systems: Three Easy Pieces. Look also on OSdev.
At last, the general advice about kernel programming is don't. A common mistake is to try adding code inside the kernel to solve some issue that can and should be solved in user-land.
How will I customize the code once I have it?
You probably should not customize the kernel, but if you did you'll use familiar tools (a good source code editor like emacs or vim, a compiler and linker on the command line, a build automation tool like make). Patching the kernel is similar to patching some other free software. But testing your kernel is harder (because you'll often reboot).
You'll also find several books explaining the Linux kernel.
If you still want to customize the kernel you should first try to code some kernel module.
Moreover I am using Windows 8.
This is a huge mistake. You first need to be an advanced Linux user. So wipe out Windows from your computer, and install some Linux distribution -I recommend Debian- (and use only Linux, no more Windows). Become familiar with command line.
I seriously recommend to avoid working on the kernel as your first project.
I strongly recommend looking at some existing user-land free software project first (there are thousands of them, notably on github, e.g. choose some package in your distribution, study its source code, work on it, propose the patch to the community). Be able to build from source code a lot of things.

A wise man once said you "must act your way into right thinking, as you cannot think your way into right acting". In your case, you'll need to act as an experienced programmer would act, which means before we write any code, we need to answer some questions.
What do we want to change?
Why do we want to change it?
What are the repercussions of this change (ie what other functions - out of all the 10's of millions of lines of source code - call this function)?
After we've made the change, how are we going to compile it? In other words, there is a defined process for this. What is it?
After we compile our new kernel/module, how are we going to test it?
A good start, in addition to the answer that was just posted, would be to run LFS (Linux from Scratch). Get a successful install of that and use it as a starting point.
Now, since we're experienced programmers, we know that tinkering with a 10M+ line codebase is a recipe for trouble; we need a bit more direction than that. Here's a list of bugs that need to be fixed: https://bugzilla.kernel.org/buglist.cgi?chfield=%5BBug%20creation%5D&chfieldfrom=7d
I, for one, would be glad to see the one called "AUFS hangs on fanotify" go away, as I use AUFS with Docker on a daily basis.
If, down the line, you decide you'd rather hack on something besides the kernel, there are plenty of other options.

From your question it follows that you've already gained some concepts of an operating system. However, if you feel that it's still insufficient, it is OK to spend more time on learning. An operating system (mainly, a kernel) has certain tasks to perform like memory management (or memory protection), multiprogramming, hardware abstraction and so on. Neither of the topics may be neglected - they are all as important. So, if you have some time, you may refer to such useful books as "Modern Operating Systems" by Andrew Tanenbaum. Special books like that will shed much light on all important aspects of a modern OS. Suffice it to say, Linux kernel itself was started by Linus Torvalds because of a strong inspiration by MINIX - an educational project by A. Tanenbaum.
Such a cumbersome project like an OS kernel (BSD, Linux, etc.) contains lots of code. Many people are collaborating to write or enhance whatever parts of the kernel. So, there is a common and inevitable need to use a version control system. So, if you have an intention to submit your code to the kernel in future, you also have to have hands on with version control. Particularly, Linux relies on Git SCM (software configuration management - a synonym for version control).
So, once you have some knowledge of Git, you can install it on your computer and download Linux source code: git clone https://github.com/torvalds/linux.git
Determine your goals at Linux kernel modification. What do you want to achieve? Perhaps, you have a network card which you suspect to miss some features in Linux? Take a look at the other vendors' drivers and make an attempt to fix the driver of interest to include the features. Of course, this will require some knowledge of the HW, and, if the features are HW dependent, you will unlikely succeed to elaborate your code without special knowledge. But, in general, - if you are trying to make an enhancement, it assumes that you are an experienced Linux user yourself. Otherwise, how will you understand that some fixes/enhancements/etc. are required? So, I can't help but agree with the proposal to postpone Windows 8 for a while and start using some Linux distribution (eg. Debian).
If you succeed to determine your goals (eg. if you find a paper describing some desired changes in Linux kernel or if you decide to enhance some device drivers / write your own), you will be able to try it hands on. However, you still might need some helpful books, but, in this case, some Linux-specific ones. Also, writing C code for the kernel itself will require one important detail - you will need to comply with a so called coding standard, otherwise Linux kernel maintainers will not be able to accept your patches.
So, I made an attempt to outline some tips based on your current question. Of course, the job of kernel development has far more broad prerequisites, but these are which are just obvious.

Related

How Could I Create This Type of Machine?

So I have a computer. It has programs on it already. If I delete those programs, I would be left with an operating system that is able to run commands. I could create my own programs from that point, but I would be limited to the constraints of the operating system already loaded onto the machine. What I would like to do is remove the operating system from the computer entirely and be left only with a blank screen and a cursor where I could type whatever I want. I want to be able to create my own program without having to run an operating system program behind it. I do not understand how the physical machine would be able to process the strings of characters that I type into it and produce its own response, which would then be displayed on the screen, but obviously someone has done it before, otherwise I would not have the machine that I am typing on right now.
(I apologize for the run on sentences but I do not know how to say what I want to say right now.)
My goal here is to have a computer, kind of like the Apple 2, where the only thing that I could do with it is type into a text line and see characters pop up on the screen. My goal on top of that goal is to develop a program that would hide in the background of the machine, so that there would still only be a cursor on screen, but the program would make it so that when I type any simple question into the screen, such as, "How are you feeling today?", I would receive a response like, "I am doing quite well, thank you. How are you?".
Does anybody have any idea how I would be able to start this project properly?

If you need to ask this question, you need to learn more than one answer on SO can provide.
Operating system is needed even to get the cursor thingy on screen.
If you are serious about the idea - you might want to start with a microcontroller, such as Arduino. They are more powerful than Apple 2 and they will allow you to write programs and boot straight into them. Adding some kind of terminal IO will not be hard - at least comparing to bootstrapping a program on an actual PC.

A good starting point for a project like this is to learn about operating systems in general. It is a vast topic but you don't have to know everything.
When we speak of an operating system we have in mind a large system that provides capabilities like managing memory, reading and writing files to permanent storage and interacting with input and output such as keyboards and displays. We are also usually thinking of a large number of higher level software applications. Think about commands like dir or ls as programs that come along with the operating system. Of course with a GUI based OS we also have windows and buttons and a wide variety of controls to consider.
The good news is that in order to get started you don't need to be an expert on everything and you certainly don't have to start with a full-blown OS.
The other good news is that the topic can be broken down into byte-sized pieces. A great introduction to the fundamentals you will need is Charles Petzold's Code The Hidden Language of Hardware and Software
Petzold begins with discussions of the inventions of Morse code and Braille, adds electricity, number systems, Boolean logic, and the resulting epiphanies required to put them all together economically. With these building blocks he builds circuits, relays, gates, switches, discusses the inventions of the vacuum tube, transistors, and finally the integrated circuit.
The last portion of the book contains a grab bag of subjects such as implementation of floating point math, operating systems, and the various refinements that have occurred in the latter half of the twentieth century.
Once you have a feel for the fundamentals a good next step in learning about operating systems is to study one that provides as few capabilities as possible. Take a look at MINIX
MINIX originally was developed in 1987 by Andrew S. Tanenbaum as a teaching tool for his textbook Operating Systems Design and Implementation. Today, it is a text-oriented operating system with a kernel of less than 6,000 lines of code. MINIX's largest claim to fame is as an example of a microkernel, in which each device driver runs as an isolated user-mode process—a structure that not only increases security but also reliability, because it means a bug in a driver cannot bring down the entire system.
Have fun.

Linux kernel/os source code documentation?

Is there a Linux distro (other than Minix) with good documentation for the source code? Or, is there some good documentation to describe the general Linux source code?
I have downloaded the Kernel source code, but, it is (unsurprisingly) a little overwhelming to find my way around and I wondered if there were some higher-level documentation to go with how the Linux kernel works?

Have you tried having a look on The linux documentation project I've find it quietly exhaustive regarding linux
They have a section The Linux Kernel wich is an online book that explains
how the linux kernel works and why it does behaves in certain ways, you should deffinitely
look into it because it's very well made.

Some of the Linux kernel code has decent commenting as documentation, but if you're going to be getting into kernel development, I'd recommend picking up a good book. A good, relatively easy-to-read one is Linux Kernel Development, by Robert Love. I got started on the Second Edition when I was in college, and keep a copy of the third on my bookshelf now.
I also find the Linux Cross Reference site helpful in jumping around the kernel source code. It's nice for tracking down functions that are in different files, and getting at what you need.

If you want to learn about operating systems and their basics, I strongly suggest you to start with a small kernel and then ramp up to learn about Linux. Starting with an operating system like Linux would be overwhelming in terms of code and documentation.
There is XV6 operating system which follows the basic Unix notion of files and processes. You can get the code listing and the documentation explaining the code properly. Here is a link to it. link.
Since academia is using this course as a baseline, I think you should get good support for understanding the same.

Linux Core Kernel Commentary is a little dated, but is still an excellent source of info.

For something which is not obsolete (like kernel.org/doc is), you may see:
Free Electrons Linux/Documentation/ (3.8)
Linux Cross Reference kernel/Documentation/
kernel-doc (3.6.10)
The first is the one I prefer personally (clean, readable, pleasant, up‑to‑date).
The second is the most well known.
The third, is for download, if you wish to browse and search it off‑line (may be handy in some case).
My two cents as a side note before I leave: I feel it's weird how for such a famous stuff as the Linux kernel is, when you search the web for documentation, you get masses of obsolete documentations, and how the rather up‑to‑date ones seems to be rather hidden and far from the top position of search engines.

Fuzzing the Linux Kernel: A student in peril.

I am currently a student at a university studying a computing related degree and my current project is focusing on finding vulnerabilities in the Linux kernel. My aim is to both statically audit as well as 'fuzz' the kernel (targeting version 3.0) in an attempt to find a vulnerability.
My first question is 'simple' is fuzzing the Linux kernel possible? I have heard of people fuzzing plenty of protocols etc. but never much about kernel modules. I also understand that on a Linux system everything can be seen as a file and as such surely input to the kernel modules should be possible via that interface shouldn't it?
My second question is: which fuzzer would you suggest? As previously stated lots of fuzzers exist that fuzz protocols however I don't see many of these being useful when attacking a kernel module. Obviously there are frameworks such as the Peach fuzzer which allows you to 'create' your own fuzzer from the ground up and are supposedly excellent however I have tried repeatedly to install Peach to no avail and I'm finding it difficult to believe it is suitable given the difficulty I've already experienced just installing it (if anyone knows of any decent installation tutorials please let me know :P).
I would appreciate any information you are able to provide me with this problem. Given the breadth of the topic I have chosen, any idea of a direction is always greatly appreciated. Equally, I would like to ask people to refrain from telling me to start elsewhere. I do understand the size of the task at hand however I will still attempt it regardless (I'm a blue-sky thinker :P A.K.A stubborn as an Ox)
Cheers
A.Smith

I think a good starting point would be to extend Dave Jones's Linux kernel fuzzer, Trinity: http://codemonkey.org.uk/2010/12/15/system-call-fuzzing-continued/ and http://codemonkey.org.uk/2010/11/09/system-call-abuse/
Dave seems to find more bugs whenever he extends that a bit more. The basic idea is to look at the system calls you are fuzzing, and rather than passing in totally random junk, make your fuzzer choose random junk that will at least pass the basic sanity checks in the actual system call code. In other words, you use the kernel source to let your fuzzer get further into the system calls than totally random input would usually go.

"Fuzzing" the kernel is quite a broad way to describe your goals.
From a kernel point of view you can
try to fuzz the system calls
the character- and block-devices in /dev
Not sure what you want to achieve.
Fuzzing the system calls would mean checking out every Linux system call (http://linux.die.net/man/2/syscalls) and try if you can disturb regular work by odd parameter values.
Fuzzing character- or block-drivers would mean trying to send data via the /dev-interfaces in a way which would end up in odd result.
Also you have to differentiate between attempts by an unprivileged user and by root.
My suggestion is narrowing down your attempts to a subset of your proposition. It's just too damn broad.
Good luck -
Alex.

One way to fuzzing is via system call fuzzing.
Essentially the idea is to take the system call, fuzz the input over the entire range of possible values - whether it remain within the specification defined for the system call does not matter.

Contributing to a Linux distribution

I'm interested in contributing to a Linux distro, but regarding the various distro's developer communities, I'm having a bit of trouble figuring out which one I'd most like to join.
What languages I know: C, C++, Lua, Python, and fairly familiar with Perl (though I wouldn't say I "know" it). In particular, I have very little experience with x86 assembly besides hacking stuff together for performance tweaks, though that will be partially rectified soon.
What I'm looking for: A community that provides plenty of opportunities for developers to work on various aspects of the distribution. To be honest I'm most interested in reading and working on the kernel source (in which case the distro doesn't matter), but it's pretty daunting and I figure getting into the Linux community and working with experienced Linux developers might give me a better idea of how to jump into the guts(let me know if this is bogus, or if you have any advice regarding that).
So...
Which distro has the "best" developer community in terms of organization, people who are fun to work with, and opportunities to contribute?
I've read various "Contributing to XXX" pages and mailing lists for distros like Ubuntu, OpenSuse, Fedora, etc. but I'd rather get a more personal testament from an actual developer.

Unless you have a specific desire to learn the ins and outs of various packaging formats you would probably be better off contributing directly upstream to applications/libraries that you find interesting. While individual distributions often have a few management applications that are unique(ish) to them most core applications and libraries are shared between them.
As you have expressed an interest in guts it would make sense to stick to one of the main community distros (Fedora and Ubuntu/Debian) as the rest tend to be variations on a base distro. The other option is to choose a source based distribution which have a number of advantages to developers although you may find yourself spending a bit of time keeping your machine trim.
As I'm a developer I personally use Gentoo which gives me a number of things:
Rolling release: New versions of applications are generally available soon after release
Stable/Unstable mix: I can run stable core with bleeding edge on upstream packages I care about
Development ready: Any installed package is by default a "dev" package, the distinction between buildtime/runtime dependencies is blurred
Packaging is easy: If it's a simple as "configure/make/make install" writing and ebuild is very easy.
Contribution is easy: Contributing new ebuilds is fairly painless, from there you can get as involved as you like
Of course there are downsides, not least of all your machine spends a considerable amount of time building things and if your run a large selection of "unstable" packages you may find you occasionally need to fix-up your machine. However I find these disadvantages minor compared to giving me an up to date platform with which to contribute to upstream from.

If you want to work with the kernel then you shouldn't be picking a distribution, but rather working upstream.

Somebody correct me if I'm wrong, but I think that contributing to Ubuntu can be very easy and fun if you use Launchpad. I haven't tried contributing code, but I contribute translations and file bugs on some projects.

How to build Linux system from kernel to UI layer

I have been looking into MeeGo, maemo, Android architecture.
They all have Linux Kernel, build some libraries on it, then build middle layer libraries [e.g telephony, media etc...].
Suppose i wana build my own system, say Linux Kernel, with some binariers like glibc, Dbus,.... UI toolkit like GTK+ and its binaries.
I want to compile every project from source to customize my own linux system for desktop, netbook and handheld devices. [starting from netbook first :)]
How can i build my own customize system from kernel to UI.

I apologize in advance for a very long winded answer to what you thought would be a very simple question. Unfortunately, piecing together an entire operating system from many different bits in a coherent and unified manner is not exactly a trivial task. I'm currently working on my own Xen based distribution, I'll share my experience thus far (beyond Linux From Scratch):
1 - Decide on a scope and stick to it
If you have any hope of actually completing this project, you need write an explanation of what your new OS will be and do once its completed in a single paragraph. Print that out and tape it to your wall, directly in front of you. Read it, chant it, practice saying it backwards and whatever else may help you to keep it directly in front of any urge to succumb to feature creep.
2 - Decide on a package manager
This may be the single most important decision that you will make. You need to decide how you will maintain your operating system in regards to updates and new releases, even if you are the only subscriber. Anyone, including you who uses the new OS will surely find a need to install something that was not included in the base distribution. Even if you are pushing out an OS to power a kiosk, its critical for all deployments to keep themselves up to date in a sane and consistent manner.
I ended up going with apt-rpm because it offered the flexibility of the popular .rpm package format while leveraging apt's known sanity when it comes to dependencies. You may prefer using yum, apt with .deb packages, slackware style .tgz packages or your own format.
Decide on this quickly, because its going to dictate how you structure your build. Keep track of dependencies in each component so that its easy to roll packages later.
3 - Re-read your scope then configure your kernel
Avoid the kitchen sink syndrome when making a kernel. Look at what you want to accomplish and then decide what the kernel has to support. You will probably want full gadget support, compatibility with file systems from other popular operating systems, security hooks appropriate for people who do a lot of browsing, etc. You don't need to support crazy RAID configurations, advanced netfilter targets and minixfs, but wifi better work. You don't need 10GBE or infiniband support. Go through the kernel configuration carefully. If you can't justify including a module by its potential use, don't check it.
Avoid pulling in out of tree patches unless you absolutely need them. From time to time, people come up with new scheduling algorithms, experimental file systems, etc. It is very, very difficult to maintain a kernel that consumes from anything else but mainline.
There are exceptions, of course. If going out of tree is the only way to meet one of your goals stated in your scope. Just remain conscious of how much additional work you'll be making for yourself in the future.
4 - Re-read your scope then select your base userland
At the very minimum, you'll need a shell, the core utilities and an editor that works without an window manager. Paying attention to dependencies will tell you that you also need a C library and whatever else is needed to make the base commands work. As Eli answered, Linux From Scratch is a good resource to check. I also strongly suggest looking at the LSB (Linux standard base), this is a specification that lists common packages and components that are 'expected' to be included with any distribution. Don't follow the LSB as a standard, compare its suggestions against your scope. If the purpose of your OS does not necessitate inclusion of something and nothing you install will depend on it, don't include it.
5 - Re-read your scope and decide on a window system
Again, referring to the everything including the kitchen sink syndrome, try and resist the urge to just slap a stock install of KDE or GNOME on top of your base OS and call it done. Another common pitfall is to install a full blown version of either and work backwards by removing things that aren't needed. For the sake of sane dependencies, its really better to work on this from bottom up rather than top down.
Decide quickly on the UI toolkit that your distribution is going to favor and get it (with supporting libraries) in place. Define consistency in UIs quickly and stick to it. Nothing is more annoying than having 10 windows open that behave completely differently as far as controls go. When I see this, I diagnose the OS with multiple personality disorder and want to medicate its developer. There was just an uproar regarding Ubuntu moving window controls around, and they were doing it consistently .. the inconsistency was the behavior changing between versions. People get very upset if they can't immediately find a button or have to increase their mouse mileage.
6 - Re-read your scope and pick your applications
Avoid kitchen sink syndrome here as well. Choose your applications not only based on your scope and their popularity, but how easy they will be for you to maintain. Its very likely that you will be applying your own patches to them (even simple ones like messengers updating a blinking light on the toolbar).
Its important to keep every architecture that you want to support in mind as you select what you want to include. For instance, if Valgrind is your best friend, be aware that you won't be able to use it to debug issues on certain ARM platforms.
Pretend you are a company and will be an employee there. Does your company pass the Joel test? Consider a continuous integration system like Hudson, as well. It will save you lots of hair pulling as you progress.
As you begin unifying all of these components, you'll naturally be establishing your own SDK. Document it as you go, avoid breaking it on a whim (refer to your scope, always). Its perfectly acceptable to just let linux be linux, which turns your SDK more into formal guidelines than anything else.
In my case, I'm rather fortunate to be working on something that is designed strictly as a server OS. I don't have to deal with desktop caveats and I don't envy anyone who does.
7 - Additional suggestions
These are in random order, but noting them might save you some time:
Maintain patch sets to every line of upstream code that you modify, in numbered sequence. An example might be 00-make-bash-clairvoyant.patch, this allows you to maintain patches instead of entire forked repositories of upstream code. You'll thank yourself for this later.
If a component has a testing suite, make sure you add tests for anything that you introduce. Its easy to just say "great, it works!" and leave it at that, keep in mind that you'll likely be adding even more later, which may break what you added previously.
Use whatever version control system is in use by the authors when pulling in upstream code. This makes merging of new code much, much simpler and shaves hours off of re-basing your patches.
Even if you think upstream authors won't be interested in your changes, at least alert them to the fact that they exist. Coordination is essential, even if you simply learn that a feature you just put in is already in planning and will be implemented differently in the future.
You may be convinced that you will be the only person to ever use your OS. Design it as though millions will use it, you never know. This kind of thinking helps avoid kludges.
Don't pull upstream alpha code, no matter what the temptation may be. Red Hat tried that, it did not work out well. Stick to stable releases unless you are pulling in bug fixes. Major bug fixes usually result in upstream releases, so make sure you watch and coordinate.
Remember that it's supposed to be fun.
Finally, realize that rolling an entire from-scratch distribution is exponentially more complex than forking an existing distribution and simply adding whatever you feel that it lacks. You need to reward yourself often by booting your OS and actually using it productively. If you get too frustrated, consistently confused or find yourself putting off work on it, consider making a lightweight fork of Debian or Ubuntu. You can then go back and duplicate it entirely from scratch. Its no different than prototyping an application in a simpler / rapid language first before writing it for real in something more difficult. If you want to go this route (first), gNewSense offers utilities to fork your own OS directly from Ubuntu. Note, by default, their utilities will strip any non free bits (including binary kernel blobs) from the resulting distro.
I strongly suggest going the completely from scratch route (first) because the experience that you will gain is far greater than making yet another fork. However, its also important that you actually complete your project. Best is subjective, do what works for you.
Good luck on your project, see you on distrowatch.

Check out Linux From Scratch:
Linux From Scratch (LFS) is a project
that provides you with step-by-step
instructions for building your own
customized Linux system entirely from
source.

Use Gentoo Linux. It is a compile from source distribution, very customizable. I like it a lot.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string