What parts of Linux kernel can I read for fun? [closed] - linux

Programming isn't my main job, though I enjoy it and sometimes get paid for it. For many years now I've been hearing about Linux, and my friends have shown me many *nixes (or *nici?), though I stick with Mac OS.
Do you think there are any parts of the Linux kernel that I could enjoy looking at, and that would help me understand what the whole thing is about? For example, how Linux is different from Darwin?
I've grown up with assembler and DOS, so things like interrupts or low-level C shouldn't be barriers to understanding. But in the end I'm more interested in high-level concepts, like threading or networking stack - I know different operating systems do them differently. And I'm looking for something fun, easy and enjoyable, like late-night reading.
(Note: made a CW, just in case)
Update: I looked for some docs and started reading:
Unreliable Guide To Locking

I would recommend looking at LXR. It makes it easier to follow the flow of the code (you do not have to search for each function that is called; well, you do, but the site does it for you).
Some starting points, for the current version (2.6.30):
start_kernel(): think of it as the kernel equivalent of main(). This function initializes almost all the kernel subsystems; follow it to see in code what you see scrolling on the screen during boot.
entry_32.S: system calls and interrupts (x86-32 version, which should be nearer to what you know; note the use of the AT&T assembly dialect instead of the Intel dialect you might be more used to).
head_32.S: the kernel entry point. This is where the kernel starts after switching to protected mode; in the end, it will call start_kernel().
arch/x86/boot: the real-mode bootstrap code. It starts in assembly (boot/header.S), but quickly jumps into C code (starting at boot/main.c). It does the real-mode initialization (mostly BIOS calls which have to be done before switching to protected mode); it is compiled using a weird GCC trick (.code16gcc), which allows the generation of 32-bit real-mode code.
arch/x86/boot/compressed: if you ever wondered where the "Decompressing Linux..." message comes from, it is from here.

Myself, I've always found the task scheduling code a bit of a hoot :-/
Methinks you need to get yourself a hobby outside the industry. Or a life :-)

The comments in the kernel can be pretty funny. There are some tips on where to find the best ones on kerneltrap.
arch/sparc/lib/checksum.S- /* Sun, you just can't beat me, you just can't. Stop trying,
arch/sparc/lib/checksum.S: * give up. I'm serious, I am going to kick the living shit
arch/sparc/lib/checksum.S- * out of you, game over, lights out.*/

linux-0.01.tar.gz is the historic first kernel release and a good place to start.
It is simple and tiny, which makes it better for a first read.
(It also has void main(void) instead of start_kernel(), lol :D )

You might want to read or skim a book that describes the Linux Kernel before looking deep into the Linux kernel.
The books that come to mind are:
Understanding the Linux Kernel, Second Edition
Design of the UNIX Operating System

You need to re-define the word 'fun' in your context. :)
That said, the Linux kernel may be too much of a monster to take on. You may want to start with some academic or more primitive kernels to get the hang of what's going on, first. You may also want to consider the Jolix book.

You'd probably get more out of reading a book on OS theory. As far as source code goes: I've got no idea, but you could easily download the Linux kernel source and see if you can find anything that appeals.

This should turn up some interesting code when run in the src directory:
grep -ir "fixme" *
Also try other comical terms: crap, shit, f***, penguin, etc.

Quite a few people recommend v0.01 of Linux as the easiest to understand.
Though if you're looking for good kernel source to read, I wouldn't go with Linux; it's a beast of a hack (about like saying the GCC sources are "fun"). Instead, you may wish to try Minix or one of the BSDs (Darwin's BSD layer is largely derived from FreeBSD, on top of a Mach-based kernel), or even one of the many free DOS clones if everything else is a little too scary.

Try reading the code that implements these character devices:
/dev/zero
/dev/null
/dev/full
And maybe the random number generators if you are inclined. The code is straightforward and simpler than all other device drivers since it does not touch any hardware.
Start at drivers/char/mem.*

kernel.h
Some simple tricks we can learn, such as
#define ARRAY_SIZE(x) (sizeof(x)/sizeof(x[0]))
...
#define min(x, y) ...
...
#define container_of

For fun I guess you could also look at Minix. It isn't exactly Linux, but Modern Operating Systems by Tanenbaum is a good read.


Using the Linux kernel in my operating system [closed]

PREFACE
I tried to put as much effort into this question as I reasonably could, so I would highly appreciate it if you could at least read it through. I have also tried researching this question, but I never found anything that directly answered it. I do not know if this is the right place for this question: it is related to programming, but it is more about operating system development and the Linux kernel. If there is a better place for it that I am unaware of, please move it there. Feel free to edit the question if need be; I just need an answer, because this is stressing me out.
The following is some background on why I am asking this question; if you are uninterested and just want to see what I am asking, skip to the 'MY QUESTION' label. I put this here so that anyone reading would know why I am asking.
BACKGROUND
I have recently begun setting up an operating system development project. For now it will be only me working on it, and I plan to write the whole thing (yes, I know it will take a whole lot of work, but I can try, right? :p): the bootstrapping, the CLI, most of what is needed for either my own kernel or the Linux kernel to function, a GUI, and much more. Granted, I may eventually end up having a team, but that is for the future.
MY QUESTION
My question actually consists of three parts, which I narrowed down to the following three things:
(1) If I were to build everything else and use the Linux kernel as-is, not tying the other parts of the system into the kernel but using the kernel for I/O and system calls, would I be violating the GPL in any way, and would I need to open source the rest of my code?
(2) If I were to only use the kernel for I/O and for system calls, but not have the code that I wrote actually interface with any kernel functions, would that still be considered linking?
(3) If I were to do the above, would that be considered a derived work, when I wrote everything else but used Linux as the system's kernel?
All these legal issues are making my head spin and are extremely confusing to me.
(1) No
(2) No
(3) No
The Linux kernel treats the system call interface as a boundary, and code that communicates with the kernel only via system calls is not covered by the kernel's license. So the user-space code you write is not a derivative work of the kernel.
There's also a set of header files provided by the kernel, collectively named the UAPI headers, which you can use without having your code become a derivative work.
This is covered at https://www.kernel.org/doc/html/v4.17/process/license-rules.html and https://github.com/torvalds/linux/blob/master/LICENSES/exceptions/Linux-syscall-note
If you need legal advice though, contact a lawyer.

How to proceed with Linux source code customization?

I am a non CS/IT student, but I have knowledge of C, Java, data structures and algorithms. Nowadays I am focusing on operating systems and have learned some of the concepts, but I want some practical knowledge too. Merely writing algo code in java/c has no fun in doing. I have gone through many articles that mention we can customize the source code of the Linux kernel.
I want to start customizing the kernel as I move ahead in the learning of OS concepts and apply the same. That would achieve two goals: 1. I will gain a practical idea of the operating system. 2. I will have a project.
The problems I face:
1. From where to get the source code? Which source code should I download? Also the documentation if possible.
https://www.kernel.org/
I went in there but there are so many of them which one will be better?
2. How will I customize the code once I have it?
Please give me suggestions with detail about how I should start this journey (of changing source code to customize Linux).
Moreover I am using Windows 8.
I recommend first reading several books on OSes and on programming. You need a broad CS culture (if possible, get a CS degree).
I am a non CS/IT student,
You'd better become one, or else spend years of work learning all the stuff a CS graduate student has learnt.
First, you need to be very familiar with Linux programming on user side (application programs). So read at least Advanced Linux Programming and study the source code of several programs, including shells (and some kind of servers). Read also carefully syscalls(2). Explore the state of your kernel (e.g. thru proc(5)...). Look into https://kernelnewbies.org/
I also recommend learning several programming languages. You should in particular read SICP, an excellent introduction to programming. Read also some book like Programming Language Pragmatics. Read something about continuations and continuation-passing style. Read the Dragon Book. Read some Introduction to Algorithms. Read something about computer architecture and instruction set architecture.
Merely writing algo code in java/c has no fun in doing.
But the kernel is also written in C (mostly) and full of algorithmic code. What makes you think you'll get more fun in it?
I want to start customizing the kernel as I move ahead in the learning of OS concepts and apply the same.
But why? Why don't you also consider studying and contributing to some user-level code?
I would recommend first reading a good book on OSes in general, notably Operating Systems: Three Easy Pieces. Look also on OSdev.
Lastly, the general advice about kernel programming is: don't. A common mistake is to try adding code inside the kernel to solve some issue that can and should be solved in user-land.
How will I customize the code once I have it?
You probably should not customize the kernel, but if you do, you'll use familiar tools (a good source code editor like emacs or vim, a compiler and linker on the command line, a build automation tool like make). Patching the kernel is similar to patching any other free software. But testing your kernel is harder (because you'll often reboot).
You'll also find several books explaining the Linux kernel.
If you still want to customize the kernel you should first try to code some kernel module.
Moreover I am using Windows 8.
This is a huge mistake. You first need to be an advanced Linux user. So wipe out Windows from your computer, and install some Linux distribution -I recommend Debian- (and use only Linux, no more Windows). Become familiar with command line.
I seriously recommend avoiding the kernel as your first project.
I strongly recommend looking at some existing user-land free software project first (there are thousands of them, notably on github, e.g. choose some package in your distribution, study its source code, work on it, propose the patch to the community). Be able to build from source code a lot of things.
A wise man once said you "must act your way into right thinking, as you cannot think your way into right acting". In your case, you'll need to act as an experienced programmer would act, which means before we write any code, we need to answer some questions.
What do we want to change?
Why do we want to change it?
What are the repercussions of this change (ie what other functions - out of all the 10's of millions of lines of source code - call this function)?
After we've made the change, how are we going to compile it? In other words, there is a defined process for this. What is it?
After we compile our new kernel/module, how are we going to test it?
A good start, in addition to the answer that was just posted, would be to run LFS (Linux from Scratch). Get a successful install of that and use it as a starting point.
Now, since we're experienced programmers, we know that tinkering with a 10M+ line codebase is a recipe for trouble; we need a bit more direction than that. Here's a list of bugs that need to be fixed: https://bugzilla.kernel.org/buglist.cgi?chfield=%5BBug%20creation%5D&chfieldfrom=7d
I, for one, would be glad to see the one called "AUFS hangs on fanotify" go away, as I use AUFS with Docker on a daily basis.
If, down the line, you decide you'd rather hack on something besides the kernel, there are plenty of other options.
From your question it follows that you've already learned some operating system concepts. However, if you feel that is still insufficient, it is OK to spend more time on learning. An operating system (mainly, a kernel) has certain tasks to perform, like memory management (or memory protection), multiprogramming, hardware abstraction and so on. None of these topics may be neglected; they are all equally important. So, if you have some time, you may refer to useful books such as "Modern Operating Systems" by Andrew Tanenbaum. Specialized books like that will shed much light on all the important aspects of a modern OS. Suffice it to say, Linus Torvalds started the Linux kernel under strong inspiration from MINIX, an educational project by A. Tanenbaum.
A project as large as an OS kernel (BSD, Linux, etc.) contains lots of code, and many people collaborate to write or enhance parts of it. So there is a common and inevitable need for a version control system. If you intend to submit your code to the kernel in the future, you also need hands-on experience with version control. In particular, Linux relies on the Git SCM (software configuration management, a synonym for version control).
So, once you have some knowledge of Git, you can install it on your computer and download Linux source code: git clone https://github.com/torvalds/linux.git
Determine your goals for modifying the Linux kernel. What do you want to achieve? Perhaps you have a network card whose Linux driver you suspect is missing some features? Take a look at other vendors' drivers and try to fix the driver of interest to include those features. Of course, this will require some knowledge of the hardware, and if the features are hardware-dependent, you are unlikely to get your code working without special knowledge. But in general, trying to make an enhancement assumes that you are an experienced Linux user yourself. Otherwise, how will you recognize that some fixes or enhancements are required? So I can't help but agree with the proposal to set Windows 8 aside for a while and start using some Linux distribution (e.g. Debian).
If you manage to determine your goals (e.g. if you find a paper describing some desired changes in the Linux kernel, or if you decide to enhance some device drivers or write your own), you will be able to try it hands-on. You still might need some helpful books, but in this case Linux-specific ones. Also, writing C code for the kernel itself involves one important detail: you will need to comply with the so-called coding standard, otherwise the Linux kernel maintainers will not be able to accept your patches.
So, I have tried to outline some tips based on your current question. Of course, kernel development has far broader prerequisites, but these are the most obvious ones.

What is the best way to learn x86 assembly on a Linux platform? [closed]

I have no prior knowledge of assembly programming, and would like to learn how to code x86 assembly on a Linux platform. However, I'm having a hard time finding a good resource to teach myself with.
The Art of Assembly book looks good, but it teaches HLA. I'm not interested in having to learn one way, then relearning it all over again. It also seems like RISC architectures have better resources for assembly, but unfortunately I do not have a RISC processor to learn with. Does anyone have any suggestions?
http://asm.sf.net has some material on architectures besides x86.
If you are interested in RISC architectures, you could run Linux on Qemu. Qemu emulates several RISC architectures like PowerPC, ARM and MIPS. You might be able to find a ready to use Qemu hard disk image here.
Another way to experiment with RISC architectures would be to use gdb's built-in simulator.
I found Assembly Language Step-by-Step to be a very good resource. It has a section at the back that's aimed at Linux assembly too.
Probably nothing much better than The Art of Assembly Language Programming and the other resources at that web site.
There are really two parts to learning assembly-level programming: the basic concepts, and then specific architectures. If you haven't had any exposure to asm programming, I strongly suggest you get the basics down first with a simple, small architecture, even tho' it likely is not directly applicable to any real hardware. If many folks are pointing to a particular resource like "The Art of...", take another look at it, use it to learn what an architecture is, how to use the basic tools (asm, debugger, disasm, etc).
Once those are out of the way, then you can start looking into more advanced instruction sets. The x86 architecture and instruction set are pretty convoluted and there are many obscure ways to twist your brain - learn something simple before you tackle that.
Even though many people I know at school hated this book, I will link it anyway:
http://www.amazon.com/Professional-Assembly-Language-Programmer/dp/0764579010
The main reason I used this book is because it uses x86 on Linux with the GNU assembler. That last point helped since I had to use that assembler in our school's lab, and if you aren't aware - the syntax is different from Intel syntax.
Also, I would just add that learning how high level languages are compiled into assembly language really helped me move along.
x86 assembly is really an Intel language, best learnt with an Intel chip and a Windows platform that can run DOS.
If you have something like WinXP, there used to be a DOS interpreter which showed a user the basics of asm and allowed the user to reverse a command and tweak the code in real time, then assemble the code into a block which could be run in the interpreter.
It was called the "Ketman Interpreter".
It was for DOS asm only, but it was pretty unique because it let you see what happens with all the registers and flags, and it allows a totally clueless individual to get a handle on the logic.
Try http://www.emu8086.com which is a windows-hosted 8086 emulator with an assembler and debugger. It comes with a tutorial.
I learned x86 assembler from a book about the 8086 (which I can't remember the name of at present... it was obviously quite old, and purple. if you're really interested I can dig it up when I get home). That will only teach you 16 bit stuff, for the more advanced 32 bit stuff I read some tutorials online. I've never done 64 bit. At least at first, the OS you're targeting probably won't matter, as you're too low-level... the BIOS is all you really care about. If you don't have access to a test system, an emulator is probably a good choice, as others have mentioned, but you can also build yourself an 8088 or 8086 without too much trouble from discrete parts. You can find tutorials and circuit diagrams online easily. It should cost less than $50 and it's a great learning experience -- you're essentially building a motherboard from scratch.
If you're not too attached to x86 assembly and want to learn RISC, I recommend the Microchip PIC microcontrollers. You can pick up a starter kit for less than $50 (the PICKit 1, which I have, even works under Linux). They have extensive documentation and plenty of third-party tutorials aimed at hobbyists.
Don't forget to grab a copy of the book Guide to Assembly Language Programming in Linux.
The Art of Assembly Language Programming

How to convince my co-worker the linux kernel code is re-entrant?

Yeah I know ... Some people are sometimes hard to convince of what sounds natural to the rest of us, and I need your help right now, SO community (or I'll go postal soon ..)
One of my co-workers is convinced the Linux kernel code is not re-entrant, as he read it somewhere the last time he got interested in it, likely 7 years ago. His reading was probably right at that time: remember that multi-core architectures were not very widespread back then, and the Linux project in its early days was not as well written or as fully featured as it is now.
Today is different. It's obvious that calling the same system call from different processes running in parallel on the same machine won't lead to undefined behavior. The Linux kernel is widespread now, and known for its reliability even when running on multicore architectures.
That is my argument for now. But what would be yours to prove that objectively?
I was thinking of showing him some function in the Linux kernel (on the LXR website), such as mutex_lock(). Everything is tuned to make it work in a concurrent environment. But the code might not be that obvious for a newbie (as I am).
Please help me.. ;-)
Search the kernel mailing list archive for "BKL". That stands for "Big Kernel Lock", which is what used to be used to prevent problems. A lot of work has been put into breaking it up into pieces, to allow reentry as long as different parts of the kernel are used by different processes. Most recent mentions of "BKL" (at least that I've noticed) have basically referred to somebody trying to make his own life easy by locking more than somebody else approved of, at which point they frequently say something about "returning to the days of the BKL", or something on that order.
The easiest way to prove that multiple CPUs can execute in the kernel simultaneously would be to write a program that does a lot of work in-kernel (for example, looks up long pathnames in a tight loop), then run two copies of it at the same time on a dual-core machine and show that the "system" percentage in top goes above 50%.
At the risk of being snarky: why not just read the code? If neither of you are expert enough to follow the code through an interrupt handler and into some subsystem or another where you can read out the synchronization code, then ... why bother? Isn't this just a dancing on the head of a pin argument? It's like a creationist demanding "proof" of evolution when they aren't interested in learning any biology.
Maybe you should have your friend prove Linux is not reentrant. Burden should not be on you to prove this.

What's the best way to get to know linux or BSD kernel internals? [closed]

I'd like to gain better knowledge of operating system internals. Process management, memory management, and stuff like that.
I was thinking of learning by getting to know either linux or BSD kernel.
Which kernel is better for learning purposes?
What's the best place to start?
Can you recommend any good books?
In college, I had an operating systems class where we used a book by Tanenbaum. In the class, we implemented a device driver in the Minix operating system. It was a lot of fun, and we learned a lot.
One thing to note though: if you pick Minix, it is designed for learning. It is a microkernel, while Linux and the BSDs are monolithic kernels, so what you learn may not be 100% translatable to working with Linux or BSD, but you can still gain a lot out of it, without having to process quite as much information.
As a side note, if you've read Just for Fun, Linus actually was playing with Minix before he wrote Linux, but it just wasn't enough for his purposes.
As a Linux user I'd say Linux has a great community for people to learn about the kernel. http://kernelnewbies.org is a great place to start asking questions and learning about how the kernel works. I can't make a book recommendation, but once you've read the starting material on kernelnewbies the source is very well documented.
Aside from the good books already mentioned (Operating Systems Design and Implementation is particularly good), get a hold of a 1.x release Linux Kernel, load it into VMWare or VirtualBox and start playing around from there.
You will need to spend a lot of time browsing source code. For this, check out http://lxr.linux.no/ which is a browsable linked version of the source and makes life a lot easier. For the very first version of Linux (0.01) check out http://lxr.linux.no/linux-old+v0.01/. The fun begins at http://lxr.linux.no/linux-old+v0.01/boot/boot.s. As you progress from version to version, check out the ChangeLog and dig into those parts that have changed to save you re-reading the whole thing again.
Once you've gotten a hold of the concepts, look at 2.0, then 2.2, etc. Be prepared to sink A LOT of time into the process.
Linux Device Drivers
Linux Core Kernel Commentary
Operating Systems Design and Implementation
I had previously bought these books on recommendation for the same purpose but I never got to studying them myself so only take them as second-hand advice.
I recommend the BSD kernels! BSD kernels have far fewer hackers, so following their evolution is easier. Both the BSD and Linux kernels have great hackers, but some people argue that BSD's lower fame filters out the novice ones. Also, making design decisions is easier when the sources are not being updated 100 times a day.
Among the BSD choices, my favorite is NetBSD. It might not be the pain-free choice you want for your desktop, but because it has a strong focus on portability, the quality is quite good. I think this part says it all:
Some systems seem to have the philosophy of “If it works, it's right”. In that light NetBSD's philosophy could be described as “It doesn't work unless it's right”
If you have been working long enough, you will know that NetBSD is quite a joy for learning good coding, although professionally you will find more opportunities with Linux.
Whichever choice you make, start by joining their mailing lists and following the discussions. Study some patches and finally try to do your own bug-fixing. Regarding books, search for Diomidis Spinellis's articles and his book. It is not exactly a kernel book, but it has NetBSD examples and helps a lot in tackling large software.
Noting the lack of BSDs here, I figured I'd chip in:
The Design and Implementation of the FreeBSD Operating System (dead-tree book)
Unix and BSD Courses (courses and videos)
FreeBSD Architecture Handbook (online book)
I haven't taken any of the courses myself, but I've heard Marshall Kirk McKusick speak on other occasions, and he is really good at what he does.
And of course the BSD man pages, which are an excellent resource as they are maintained to a far greater extent than your average Linux man-page. Take for instance the uvm(9) man-page, describing the virtual memory interface in OpenBSD.
Not quite related, but I'll also recommend the video History of the Berkeley Software Distributions, as it gives a nice introduction to the BSD parts of UNIX history and culture as well as plenty of hilarious anecdotes from back when.
There's no substitute for diving into the code. Try to find a driver or subsystem that you're interested in and poke around with it. With tools like VMware Workstation it's super easy to make whatever changes you want, snapshot the VM, and run your modified kernel. If the kernel panics on boot, who cares? Just jump back to the snapshot and fix the problem.
For books, I strongly recommend Linux Kernel Development by Robert Love. It's a wonderfully written book -- lots of information, organized sanely, and humorous... not dry reading at all.
Take Mike Stone's advice and start with Minix. That's what Linus did! The textbook is really well written, and Tanenbaum does a great job of showing how the various features are implemented in a real system.
Nobody seems to have mentioned that code-wise BSD is much cleaner and more consistent. The documentation's way better too (as already mentioned). But since there's a whole lot of fiddling with whatever system you choose - I'd pick the one you use more often.
Linux and Minix are fun to learn. If you also want to learn what a modern micro-kernel operating system looks like, you can look at QNX. The complete documentation is available online and it is very accessible. For example, this online book.
When I was at uni I spent a semester studying operating systems, and as part of this had an assignment where we had to implement a RAM-based filesystem in Linux.
It was a fantastic way to get to understand the internals of the Linux kernel and to get a grasp on how everything fits together. And a heck of a lot of fun playing around with how it interacts with standard tools too.
I haven't tried it myself, but you can go to Linux From Scratch and start building your own Linux distribution. Sounds like something that'll take a junkload of time, but will result in an intimate knowledge of the guts of the Linux kernel and how each part works. Of course, you can supplement this learning by following any of the other tips here.
