Read a book to learn the Linux kernel [closed] - linux

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
I have read the general answers to these related questions,
understanding-the-linux-kernel-source
How is the system call in Linux implemented?
How does a syscall actually happen on linux?
but I am still left with questions of my own. For example, on int 0x80 the kernel services the system call, but what does it mean to "service" a call? e.g. if a system call is made for getuid
#define __NR_getuid (__NR_SYSCALL_BASE+ 24)
then once int 0x80 occurs, the kernel services the call. So what exactly must the kernel do to implement getuid? Somewhere there must be some code which runs after int 0x80. Assuming you have downloaded the Linux kernel source, where (e.g. under what path) might you find the source code that implements __NR_getuid?

The handler for getuid(2) is in kernel/timer.c, but you're going to find a single-line function there which won't enlighten you at all.
I found that file by running make tags at the top level of the kernel source directory, then running vi -t sys_getuid. sys_*() is how the syscall entry points are named in the kernel. Having done that, you can see why the find command given by 0xDen should work, and in fact it does on my system. But using tags is faster and easier.
This is why I keep recommending that those who want to understand how the kernel works read a book on it. These books don't literally tell you how each and every syscall is implemented, since that would require them to walk you through all the code, line by line. There are millions of SLOC in the kernel. If you printed it out, it would literally require a bookcase to hold it all, even if you printed it in small text, double-sided.
Even if you ignore all the non-core parts, such as drivers, weird filesystems, less popular CPU types, etc., so that you were able to cut this down to 1% of its total size, you'd still be left with a hundred thousand SLOC to plow through. That fills a big book all by itself, without leaving room for much commentary.
So, what a book on the kernel is going to do is give you the overall understanding of what goes on in there so that you will be able to find out where things live and how they are activated yourself. You will learn about the data structures that tie it all together, so you can follow the call chains, etc.

find . -name "*.c" | xargs grep -n -E "SYSCALL_DEFINE" | grep getuid

Related

Using the Linux kernel in my operating system [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 4 years ago.
PREFACE
I tried to put as much effort into this question as I reasonably could, so I would highly appreciate it if you could at least read it through. I have also tried researching this question, but I never found anything that directly answered it. I do not know if this is the right place for it: although it is related to programming, it is more about operating system development and the Linux kernel. If there is a better place for this question that I am unaware of, please move it there. Feel free to edit the question if need be; I just need an answer, because this is stressing me out.
The following is some background on why I am asking this question. If you are uninterested and just want to see what I am asking, skip to the 'MY QUESTION' label. I put this here so that anyone reading would know why I am asking.
BACKGROUND
I have recently begun setting up an operating system development project. After I get some things ready, it will be only me working on it for now, and I plan to write the whole thing (yes, I know it will take a whole lot of work, but I can try, right? :p), including the bootstrapping, the CLI, most of what is necessary for either my own kernel or the Linux kernel to function, the GUI, and much more. Granted, eventually I may end up having a team, but that is for the future.
MY QUESTION
My question actually consists of three parts, which I have narrowed down to the following three things:
(1) If I were to build everything else and use the Linux kernel as-is, and if I were to not tie the other parts of the system into the kernel but use the kernel for I/O and system calls, would I be violating the GPL in any way, and would I then need to open source the rest of my code?
(2) If I were to only use the kernel for I/O and for system calls, but not have the code that I wrote actually interface with any kernel functions, would that still be considered linking?
(3) If I were to do the above, would that be considered a derived work, when I wrote everything else, but used Linux as the system's kernel?
All these legal issues are making my head spin and are extremely confusing to me.
(1) No
(2) No
(3) No
The Linux kernel treats the system call interface as a boundary, and code that communicates with the kernel via system calls is not covered by the licensing of the kernel. So the user space code you write is not a derivative work of the kernel.
There's also a set of header files provided by the kernel, collectively named the UAPI headers, which you can use without having your code become a derivative work.
This is covered at https://www.kernel.org/doc/html/v4.17/process/license-rules.html and https://github.com/torvalds/linux/blob/master/LICENSES/exceptions/Linux-syscall-note
If you need legal advice though, contact a lawyer.

How I could make a Linux OS for a CPU that I designed? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
Background information on the architecture: I have just designed an instruction set for a CPU that I'm going to put in an FPGA, and I want to make it compatible with Linux (a simple Linux system). I have only just started with this and do not know much about it, but I'm sure Linux can run on my CPU. I think AVR supports Linux too; I do not know if this is true, but if it is, I think my CPU can as well.
My CPU is 16 bits, and it has the following registers:
AX
BX
CX
DX
EX
FX
The design can support up to 256 16-bit registers. I only included a few registers because I do not know whether there will be room left for the VGA driver in my FPGA; I think 8 more registers could fit in the register file. My FPGA board has a Cyclone IV.
The program counter (PC) of my CPU is 16 bits.
My CPU handles data with pointers (ARP, BRP) that point to two registers and route their values to the two outputs that feed the A and B inputs of the ALU. To save data in the registers I use two more pointers (CRP, DRP), which point to the registers where the values will be stored. The instruction says whether the pointers are going to be used to save a value, because otherwise the value would be saved in two registers by mistake.
I do not know if this information is useful to give you an idea if I'm going to be able to use Linux in my design.
Thank you so much! ☺
Question: Is it possible to port Linux to a 16bit architecture?
Edit: After almost 3 years of gained experience with embedded systems, I see how ignorant this question is. I cannot provide an answer to this question because this question is flagged to not accept answers. But I will try to explain why porting Linux natively to a 16bit CPU is almost impossible.
Real Linux requires an MMU to work, although there is uClinux, which requires no MMU. The MMU is required to give each userspace program its own memory address space, without other programs interfering.
A 16-bit address space is too limited to even hold what is required: the smallest Linux installations that I've seen need 8 MB, far beyond the 64 kilobytes a 16-bit address space can reach.
The Linux kernel needs Binutils and GCC to compile!
It would be very hacky and tricky to port GCC, because GCC was designed primarily with 32-bit architectures in mind.
I mentioned earlier that it's almost impossible but, you can do emulation and with help of external hardware, you can emulate another architecture. But that's cheating, isn't it?
http://dmitry.gr/index.php?r=05.Projects&proj=07.%20Linux%20on%208bit
Finally, if you really want to run Linux on your custom CPU, start with RISC-V. It is supported by GCC and all the required tools, plus RISC-V is the future!
If you really want to run an OS on your custom CPU, you can port the LCC compiler to it and run an RTOS. This is a more realistic approach, but it is still a challenging one.
You are out of luck. Linux requires at least a 32-bit system to run.

How are linux filesystems created? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I've been studying the Linux operating system for a while now. I understand what file systems are, but I'm curious as to how they're made. Is it possible for a programmer to create their own custom-made file system in Linux? Is it possible to combine multiple file systems together, and how much control do we have over a file system? Thanks.
Also, does anyone know any online sources or books that talk about Linux file systems?
Certainly a programmer can create a file system; everyone can, you just have to use the appropriate command (mkfs, for instance). In addition to that, a programmer can theoretically implement the logic behind what you probably mean by a "custom-made filesystem", just as a programmer can change, remove or add anything he wants in any part of the system he uses. It is questionable, though, whether many programmers are actually able to create a working and usable file system from scratch, since that is quite a complex thing to do.
Combining multiple filesystems is certainly possible, but maybe you should define in more detail what you actually mean by that. You certainly can use multiple filesystems inside a single system by simply mounting them. You can mount one filesystem inside another. You can even use a loopback device to store a whole filesystem inside a file contained in another filesystem. What you cannot do is somehow take two separate file systems, hold a magic wand over them and declare them as one from now on. Well, actually you can do that but it won't work as expected ;-)
About the last question, how much control we have... well, difficult to answer without you giving any metric to do so... We certainly can configure a filesystem, we can use it and its content. We can even destroy or damage it, mount it, check it, examine it, monitor it, repair it, create it, ... I personally would indeed call that some amount of "control" we have over filesystems.

How does rm work? What does rm do? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 8 years ago.
My understanding is that 'files' are effectively just pointers to the memory location corresponding to the file's content. If you 'rm' a file, you certainly must be deleting that pointer. If rm actually 'erases' the data, I would guess each bit is written over (set to 0 or something). But I know that there are special procedures/programs (i.e. srm) to make sure that data isn't 'recoverable' --- which suggests that nothing is actually being overwritten...
So, is deleting the pointer to a memory address the only thing rm does? Is the data still sitting there in a contiguous block like it was before?
My understanding is that 'files' are effectively just pointers to the memory location corresponding to the file's content.
Be careful with your terminology. The files (and pointers) are on disk, not in memory (RAM).
If you 'rm' a file, you certainly must be deleting that pointer.
Yes. What happens is heavily file-system dependent. Some have a bitmap of which blocks are free/busy, so it would have to flip the bit for each block freed. Other filesystems use more sophisticated methods of tracking free space.
which suggests that nothing is actually being overwritten...
Correct. You can find various "undelete" utilities. But depending on the filesystem, it can get rather complex. But stuff you saved years ago could still be sitting around -- or it could be overwritten. It all depends on minute details. For example, see e2fsprogs.
So, is deleting the pointer to a memory address the only thing rm does?
Well, it also has to remove the "directory entry" that gives metadata about the file. (Sometimes it just wipes out the first byte of the filename).
Is the data still sitting there in a contiguous block like it was before?
Yes, the data is still there. But don't assume it is a contiguous block. Files can be fragmented all over the disk, with lots of pointers that tell the filesystem how to reassemble them. And if you are using RAID, things get real complex.
Yes. rm simply deletes the pointer. If you have multiple pointers to the file (hard links), then deleting one of those pointers with rm leaves the others completely untouched and the data still available.
Deleting all of those links still does not touch the data, however the OS is now free to reuse the blocks which previously were reserved for storing that data.
It's worth noting that any process which opens a file creates a file handle for it. This adds to the overall count of references to the file. If you delete all of the pointers from your filesystem, but the operating system still has a process running with an open file handle for your file, then the count of pointers will not be zero and the file will not really be deleted. Only when that final pointer is closed will the filesystem register the disk space as having been released, and only at that point will the OS be free to overwrite the blocks previously reserved for storing that data.
You may or may not be able to recover that data at any point in the future depending on whether any reuse of the blocks in question has occurred.
Incidentally, you have no guarantee that your data is sitting there in a contiguous block in the first place.

What parts of Linux kernel can I read for fun? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
Programming isn't my main job, though I enjoy it and sometimes get paid for it. For many years now I've been hearing about Linux and my friends have shown to me many *nixes (or *nici?), though I stick with Mac OS.
Do you think there are any parts of the Linux kernel that I could enjoy looking at, that would help me understand what it's all about? For example, how is Linux different from Darwin?
I've grown up with assembler and DOS, so things like interrupts or low-level C shouldn't be barriers to understanding. But in the end I'm more interested in high-level concepts, like threading or networking stack - I know different operating systems do them differently. And I'm looking for something fun, easy and enjoyable, like late-night reading.
(Note: made a CW, just in case)
Update: I looked for some docs and started reading:
Unreliable Guide To Locking
I would recommend looking at LXR. It makes it easier to follow the flow of the code (you do not have to search for each function that is called; well, you do, but the site does it for you).
Some starting points, for the current version (2.6.30):
start_kernel(): think of it as the kernel equivalent of main(). This function initializes almost all the kernel subsystems; follow it to see in code what you see scrolling on the screen during the boot.
entry_32.S: system calls and interrupts (x86-32 version, which should be nearer what you know; note the use of the AT&T assembly dialect instead of the Intel dialect you might be more used to).
head_32.S: the kernel entry point. This is where the kernel starts after switching to protected mode; in the end, it will call start_kernel().
arch/x86/boot: the real-mode bootstrap code. It starts in assembly (boot/header.S), but quickly jumps into C code (starting at boot/main.c). It does the real-mode initialization (mostly BIOS calls which have to be done before switching to protected mode); it is compiled using a weird GCC trick (.code16gcc), which allows the generation of 32-bit real-mode code.
arch/x86/boot/compressed: if you ever wondered where the "Decompressing Linux..." message comes from, it is from here.
Myself, I've always found the task scheduling code a bit of a hoot :-/
Methinks you need to get yourself a hobby outside the industry. Or a life :-)
The comments in the kernel can be pretty funny. There are some tips on where to find the best ones on kerneltrap.
arch/sparc/lib/checksum.S- /* Sun, you just can't beat me, you just can't. Stop trying,
arch/sparc/lib/checksum.S: * give up. I'm serious, I am going to kick the living shit
arch/sparc/lib/checksum.S- * out of you, game over, lights out.*/
linux-0.01.tar.gz is the historic first kernel and a good place to start.
It is simple and tiny, which makes it better for a first read.
(Also, it has void main(void) instead of start_kernel() lol :D )
You might want to read or skim a book that describes the Linux Kernel before looking deep into the Linux kernel.
The books that come to mind are:
Understanding the Linux Kernel, Second Edition
Design of the UNIX Operating System
You need to re-define the word 'fun' in your context. :)
That said, the Linux kernel may be too much of a monster to take on. You may want to start with some academic or more primitive kernels to get the hang of what's going on, first. You may also want to consider the Jolix book.
You'd probably get more out of reading a book on OS theory. As far as source code goes: I've got no idea, but you could easily download the Linux kernel source and see if you can find anything that appeals.
This should turn up some interesting code when run in the src directory:
grep -ir "fixme" *
also try with other comical terms, crap, shit, f***, penguin, etc.
It's been recommended by quite a few people that v0.01 of Linux is the easiest to understand.
Though, if you're looking for good kernel source to read, I wouldn't go with Linux; it's a beast of a hack (about like saying the GCC sources are "fun"). Instead, you may wish to try Minix or one of the BSDs (Darwin's BSD layer is derived largely from FreeBSD, iirc), or even one of the many free DOS clones if everything else is a little too scary.
Try reading the code that implements these character devices:
/dev/zero
/dev/null
/dev/full
And maybe the random number generators if you are inclined. The code is straightforward and simpler than all other device drivers since it does not touch any hardware.
Start at drivers/char/mem.*
kernel.h
Some simple tricks we can learn, such as
#define ARRAY_SIZE(x) (sizeof(x)/sizeof(x[0]))
...
#define min(x, y) ...
...
#define container_of
For fun I guess you could also look at Minix. It isn't exactly Linux, but Modern Operating Systems by Tanenbaum is a good read.

Resources