Let's say I want the text "Hello World" on my monitor. How does the computer represent the text graphically, at the binary level?
That depends on the hardware, and potentially on the application or the OS, so there is no single answer.
In general, the hardware you are using will have a defined text encoding that maps character images (or something similar: pixel patterns, screen colors) to given binary values. These images are loaded into the screen's memory buffer, and on the next refresh they are displayed on the screen.
So, in a very basic sense, let's say you have an embedded system with an LCD board. In this case it would not be images but pixel patterns being mapped. You would likely have an 8-bit encoding that supports ASCII. You would load the binary values that represent the text you want to display into the LCD's memory buffer, then issue a command to the board to refresh. The display would change based on what you loaded into the memory.
If you are working at a very low level, you would have to define that relationship at the driver level, likely manipulating pixels in memory buffers based on the binary values.
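As a concrete (and heavily simplified) illustration of that embedded case, here is a minimal sketch in C. The register addresses, buffer layout and refresh command value are hypothetical; a real board's datasheet defines them, and the controller's character-generator ROM is what maps each ASCII byte to a pixel pattern.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical memory-mapped LCD registers: the addresses and the
 * REFRESH command value are made up for illustration only. */
#define LCD_BUF  ((volatile uint8_t *)0x40001000u)  /* character buffer */
#define LCD_CMD  ((volatile uint8_t *)0x40001080u)  /* command register */
#define LCD_CMD_REFRESH 0x01u

static void lcd_print(const char *text)
{
    /* Each byte is an ASCII code; the controller maps it to a pixel
     * pattern from its built-in character generator. */
    size_t n = strlen(text);
    for (size_t i = 0; i < n; i++)
        LCD_BUF[i] = (uint8_t)text[i];

    /* Tell the controller to redraw from its buffer. */
    *LCD_CMD = LCD_CMD_REFRESH;
}

int main(void)
{
    lcd_print("Hello World");
    return 0;
}
```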
It gets more complex with, say, the computer you used to ask this question.
When you type something and it appears on your screen, this is roughly what happens:
1: The keyboard sends an electrical interrupt to the processor with the binary representation of the key you pressed (see ASCII)
2: The processor looks for the memory location (which was setup by the operating system) that has the instructions to handle the interrupt
3: The interrupt is then interpreted by the operating system (let's say, Linux)
4: If there's a process waiting for input, the operating system delivers the key code to that process (let's say, Bash)
5: Bash receives the code, and sends an instruction to the operating system to display certain characters on the screen device
6: The operating system receives the instruction from Bash, and sends it to the screen device
7: The screen device receives the instruction, translates the bits into pixels, and shows them on your screen
All of this is abstraction. In the end, everything is binary, and if you want to get down there you should first understand the abstractions (assembly, C, operating systems, devices, memory, the processor, etc.)
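To make steps 4 to 6 concrete, here is a minimal sketch of what a program like a shell does at this level: ask the kernel for input bytes with read() and hand bytes back to the kernel with write() for display. It is not how Bash is actually implemented, just the smallest program that exercises the same two system calls.

```c
#include <unistd.h>

/* Minimal sketch of steps 4-6 above: the process asks the kernel for
 * input with read(2) and asks it to display bytes with write(2).
 * Everything crossing these two calls is just binary data. */
int main(void)
{
    char buf[64];

    for (;;) {
        /* Block until the kernel delivers keyboard input (step 4). */
        ssize_t n = read(STDIN_FILENO, buf, sizeof buf);
        if (n <= 0)
            break;

        /* Ask the kernel to send these bytes to the terminal device
         * (steps 5-6); the terminal turns them into pixels (step 7). */
        write(STDOUT_FILENO, buf, (size_t)n);
    }
    return 0;
}
```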
Related
I was searching for a syscall that would draw a pixel at a given coordinate on the screen, or something similar, but I couldn't find any such syscalls on this site.
I came to know that the OS interacts with monitors using graphics drivers. But these drivers may be different on different machines, so is there a common native API provided by Linux for handling this?
Much like how there are syscalls for opening, closing, reading, and writing files: even though the underlying file systems may be different, these syscalls provide an abstract API that simplifies things for user programs. I was looking for something similar for drawing onto the screen.
Typically a user is running a display server and window system, which organizes the screen into windows that applications draw to individually using the API provided by that system. The details depend on the architecture of this system.
The traditional window system on Linux is the X Window System, and the more modern Wayland display server/protocol is also in common use. For example, X has commands to instruct the X server to draw primitives to the screen.
If no such system is in use, you can draw directly to a display either via a framebuffer device or via the DRM API. Neither is accessed through special syscalls; instead you use normal file syscalls like open, read, write, and also ioctl, on special device files in /dev, e.g. /dev/dri/card0 for DRM to the first graphics card or /dev/fb0 for the first framebuffer device. DRM is also what applications use to render directly to the screen or to a buffer when running under a display server or window system as above.
In any case, DRM is usually not used directly to draw, say, pixels to the screen; it is still specific to the graphics card. Typically a library like Mesa3D translates the card-specific details into a common API like OpenGL or Vulkan for applications to use.
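If you do want to push pixels by hand, here is a hedged sketch using the framebuffer device mentioned above. It assumes /dev/fb0 exists (it may not under a modern display server), that you have permission to open it, and that the mode is 32 bits per pixel with an XRGB-style layout.

```c
#include <fcntl.h>
#include <linux/fb.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Sketch: put one white pixel at (x, y) on /dev/fb0. */
int main(void)
{
    int fd = open("/dev/fb0", O_RDWR);
    if (fd < 0) { perror("open /dev/fb0"); return 1; }

    struct fb_var_screeninfo vinfo;
    struct fb_fix_screeninfo finfo;
    if (ioctl(fd, FBIOGET_VSCREENINFO, &vinfo) < 0 ||
        ioctl(fd, FBIOGET_FSCREENINFO, &finfo) < 0) {
        perror("ioctl");
        return 1;
    }

    /* Map the framebuffer memory into this process. */
    size_t len = (size_t)finfo.line_length * vinfo.yres;
    uint8_t *fb = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (fb == MAP_FAILED) { perror("mmap"); return 1; }

    unsigned x = 100, y = 100;
    uint32_t *pixel = (uint32_t *)(fb + y * finfo.line_length
                                      + x * (vinfo.bits_per_pixel / 8));
    *pixel = 0x00FFFFFF;   /* white, assuming an XRGB8888-style layout */

    munmap(fb, len);
    close(fd);
    return 0;
}
```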
I found this awesome text explaining a lot about TTY devices. It focuses on the relation between a TTY device and a shell (and its spawned jobs), but it says little about the relation between the terminal emulator and the TTY device, and now I'm wondering about that. I googled, but I could not find the answers...
1) What kind of input logic is the terminal emulator responsible for? Does it just send each character code (received via window events) to the TTY device, or does it do more complicated processing before/during the transmission to the TTY? And how are these character codes sent to the TTY device? Via a file?
2) After a foreground process calls write() on the TTY device file, a.k.a. stdout/stderr, what happens? How does this data reach the terminal emulator process so it can be rendered? Again, via a file?
3) Is the terminal emulator responsible for "allocating" a TTY device? Can TTY devices be created "on the fly" by the kernel, or is there a limited number of available TTY devices the kernel can manage?
First of all, ask yourself what a terminal is.
Historically, terminal devices were dumb devices that transformed output characters from programs into visible drawings on some output device (a printer or a cathode-ray tube) and sent input characters (produced at a local keyboard) to programs over a serial line.
From that perspective, a terminal emulator is a software application, normally running on a computer that was not designed to act as a terminal device, that makes it behave as one. Normally this means it receives output over a serial line and shows it to the user (for example in a specific window on the screen), and it processes the user's input in that window and sends it to a remote computer, where the program actually runs.
TTY lines, by contrast, were the serial lines used to send and receive those characters. In UNIX they have a common driver that does some processing of the characters received from the actual terminal. For example, the driver collects characters, allowing some editing via the backspace key, and only makes the data available to the program running on the computer after the user (at the terminal) has pressed the RETURN key.
Some time ago, virtual terminal devices (devices that don't have an actual terminal behind them, but another program instead) became necessary, both to run programs that reprogram the connecting device (for example, to stop echoing password characters back to the terminal, or to switch to character-by-character instead of line-by-line input) and to let the driving program on the other side of the virtual TTY act on those settings.
Virtual terminal devices come in pairs: the terminal-emulating program gets the master side of the virtual terminal and runs the actual program on the slave side (for example a login shell, which is how a window-based pseudoterminal works).
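As a small illustration, this sketch does the first thing a terminal emulator does: allocate a pseudoterminal pair and find the slave's device node. It only prints the slave name; a real emulator would go on to fork, attach a shell to the slave side, and keep reading and writing the master side.

```c
#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Allocate a pseudoterminal pair: the emulator keeps the master fd;
 * a shell (not spawned here) would get the slave as stdin/stdout/stderr. */
int main(void)
{
    int master = posix_openpt(O_RDWR | O_NOCTTY);
    if (master < 0) { perror("posix_openpt"); return 1; }

    if (grantpt(master) < 0 || unlockpt(master) < 0) {
        perror("grantpt/unlockpt");
        return 1;
    }

    /* The slave side appears as an ordinary tty device node. */
    printf("slave device: %s\n", ptsname(master));

    /* A real emulator would now fork, open the slave in the child,
     * make it the controlling terminal, and exec a login shell,
     * while reading/writing the master side itself. */
    close(master);
    return 0;
}
```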
Now the answers to your questions:
1) Terminal input logic is managed by the virtual terminal driver of the slave device, so the program running on it has full control over character mapping and line/raw input. The program attached to the master side only gets the raw characters, without any interpretation, so it can, for example, send a Control-C character to interrupt the program running on the slave side.
2) When the program running on the slave side does a write, the data goes through the tty driver, which applies all the line-discipline processing (for example, adding a CR before each LF to make a CRLF sequence when the terminal is in cooked mode). The program running on the master side receives the raw characters (even, for example, a Ctrl-C written by the program). On input, the tty device interprets special characters (such as Ctrl-C) and sends the proper signal to the group of processes attached to that pseudoterminal. (A small termios sketch after these answers shows how a program switches the slave side between cooked and raw mode.)
3) Historically, terminals appeared as device pairs (one specific kind of terminal character device driver), and as such they had inodes with major/minor number pairs. This limited their number to a value configured by the administrator. Nowadays Linux, for example, allows dynamic allocation of devices, making it possible to allocate device pairs on the fly, but the maximum number is still bounded (for efficiency and implementation reasons).
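Here is a minimal sketch of that "programming of the connecting device": switching a tty from cooked to raw mode with the standard termios interface. It assumes it is run from an interactive terminal; error handling is kept minimal.

```c
#include <stdio.h>
#include <termios.h>
#include <unistd.h>

/* Switch the tty out of cooked mode so the program sees every
 * keystroke immediately, without echo or line editing. */
int main(void)
{
    struct termios saved, raw;

    if (tcgetattr(STDIN_FILENO, &saved) < 0) {
        perror("tcgetattr");
        return 1;
    }

    raw = saved;
    raw.c_lflag &= ~(ICANON | ECHO);   /* no line buffering, no echo  */
    raw.c_cc[VMIN]  = 1;               /* read() returns after 1 byte */
    raw.c_cc[VTIME] = 0;
    tcsetattr(STDIN_FILENO, TCSANOW, &raw);

    char c;
    if (read(STDIN_FILENO, &c, 1) == 1)
        printf("got byte 0x%02x\r\n", (unsigned char)c);

    /* Restore cooked mode before exiting. */
    tcsetattr(STDIN_FILENO, TCSANOW, &saved);
    return 0;
}
```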
I have recently started reading Linux Kernel Development by Robert Love and I am Love-ing it!
Please read the below excerpt from the book to better understand my questions:
A number identifies interrupts and the kernel uses this number to execute a specific interrupt handler to process and respond to the interrupt. For example, as you type, the keyboard controller issues an interrupt to let the system know that there is new data in the keyboard buffer. The kernel notes the interrupt number of the incoming interrupt and executes the correct interrupt handler. The interrupt handler processes the keyboard data and lets the keyboard controller know it is ready for more data...
Now I have dual boot on my machine, and sometimes (in fact, often) when I type something on Windows, I find myself doing it in what I call Night Crawler mode: I am typing and I don't see anything on the screen, and then after a while the entire text appears in one flash; probably the buffer just spits everything out.
Now I don't see this happening on Linux. Is it because of the interrupt context present in Linux and the absence of it in Windows?
BTW, I am still not sure if there is an interrupt context in Windows; Google didn't give me any relevant results for that.
All OSes have an interrupt context; it's a feature/constraint of the CPU architecture, basically just the way computer hardware works. Different OSes (and drivers within an OS) make different choices about what work, and how much of it, to do inside the interrupt before returning, though. That may or may not be related to your Windows experience. There is a lot of code involved in getting a key press translated into screen output, and interrupt handling is only a tiny part of it.
A number identifies interrupts and the kernel uses this number to execute a specific interrupt handler to process and respond to the interrupt. For example, as you type, the keyboard controller issues an interrupt to let the system know that there is new data in the keyboard buffer. The kernel notes the interrupt number of the incoming interrupt and executes the correct interrupt handler. The interrupt handler processes the keyboard data and lets the keyboard controller know it is ready for more data
This is a pretty poor description. Things might be different now with USB keyboards, but this seems to describe what happens with an old PS/2 connection, where an "8042"-compatible chipset on your motherboard signals on an IRQ line to the CPU, which then executes whatever code is at the address stored in entry 9 of the interrupt table (traditionally an array of pointers starting at address 0 in physical memory, though from memory you could change that address, and the last time I played with this stuff PCs still had <1MB RAM and used different memory layout modes).
That dispatch process has nothing to do with the kernel... it's the way the hardware works. (The keyboard controller could be asked not to generate interrupts, allowing OS/driver software to "poll" it regularly to see if there happens to be new event data available, but it would be pretty crazy to actually use that.)
Still, the code address from the interrupt table will point into the kernel or the keyboard driver, and the kernel/driver code will read the keyboard event data from the keyboard controller's I/O port. For these hardware interrupt handlers, a primary goal is to get the data from the device and store it in a buffer as quickly as possible, both to ensure a quick return from the interrupt to whatever processing was happening, and because the keyboard controller can only handle one event at a time: it needs to be read off into the buffer before the next event arrives.
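The shape of such a handler looks roughly like the sketch below. It is only a schematic: read_scancode() is a stand-in, since a real handler runs in kernel mode and reads the controller's data port (0x60 on a PC), but the grab-it, buffer-it, return-immediately structure is the point.

```c
#include <stdint.h>
#include <stdio.h>

/* Schematic of a keyboard interrupt handler: read the event from the
 * controller and stash it in a buffer as fast as possible, so the CPU
 * can return to whatever it interrupted. */

#define BUF_SIZE 128
static volatile uint8_t buf[BUF_SIZE];
static volatile unsigned head, tail;

static uint8_t read_scancode(void)
{
    /* Stand-in: a real driver would read the controller's data port. */
    return 0x1E;            /* pretend the 'A' key was pressed */
}

static void keyboard_irq_handler(void)
{
    uint8_t sc = read_scancode();          /* 1. grab the event    */
    unsigned next = (head + 1) % BUF_SIZE;
    if (next != tail) {                    /* 2. buffer it if room */
        buf[head] = sc;
        head = next;
    }
    /* 3. return immediately; the driver/application drains the buffer
     *    later, at its own pace. */
}

int main(void)
{
    keyboard_irq_handler();                /* simulate one interrupt */
    while (tail != head) {                 /* later: drain the buffer */
        printf("scancode 0x%02x\n", (unsigned)buf[tail]);
        tail = (tail + 1) % BUF_SIZE;
    }
    return 0;
}
```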
It's then up to the OS/driver either to provide some kind of input-availability signal to application software, or to wait for the application software to attempt to read more keyboard input, but it can do that in a "whenever you're ready" fashion. Either way, once an application has time to read and start responding to the input, things can happen that make it take an unexpectedly long time: the extra keystroke might trigger some complex repagination algorithm that takes a long time to run, or it might cause the program to execute code that has been swapped out to disk (see Wikipedia on "virtual memory"), in which case the program can only continue after the hard disk has read that part of it back into memory. There are thousands of such edge cases involving window movement, graphics clipping algorithms, etc. that could account for the keyboard-handling code taking a long time to complete, and if other keystrokes happen meanwhile they'll be read by the keyboard driver into that buffer, then only "perceived" by the application after the slow/blocking processing completes. It may well be that the processing for all the keystrokes then in the buffer completes much more quickly: for example, if part of the program was swapped in from disk, that part may already be in memory and ready to process the remaining keystrokes.
Why would Linux do better at this than Windows? Mainly because the operating system, drivers and applications tend to be "leaner and meaner": less bloated software (like C++ vs. C# .NET), less wasted memory, and therefore less swapping and fewer delays.
Here is a quote from Wikipedia:
"In computing, an executable file causes a computer "to perform indicated tasks according to encoded instructions," ( Machine Code ?? )
"Modern operating systems retain control over the computer's resources, requiring that individual programs make system calls to access privileged resources. Since each operating system family features its own system call architecture, executable files are generally tied to specific operating systems."
Well, this is my perspective.
Executables cannot be pure machine code, as they need to talk to the OS for hardware services (system calls); hence an executable is not quite "machine code" yet... Perhaps some parts of it are actual machine code and some parts are just meant to call the machine code embedded in the operating system? Overall it contains some chunks of machine code and some chunks of code that call into the operating system.
Edited after Damon's answer:
In the end the OS is a set of machine code. Basically the OS does the job of copying the user's machine code (created by the C compiler) into place, and then, if an instruction is a system call, control transfers to the OS's memory region to handle it. Now the question is: what machine code generated from C can do this part, i.e. ask for control to be transferred to the OS? I suppose it's system calls at a higher abstraction, but under the hood, how does it work?
I get the feeling it's similar to a chicken-and-egg problem: C creates the OS, and C uses the OS. I can't find exactly how the process goes.
Can anyone break down the puzzle for me?
One thing does not exclude the other. Executables are machine code (unless they are some form of bytecode running in a virtual machine). However, there are different kinds of instructions, some of which are not usable at certain privilege levels.
That is where the operating system comes in: it is "machine code" that runs at the highest privilege level, acting as the arbiter for the "important" parts and tasks, such as deciding who gets CPU time and what value goes into some hardware register.
(originally comment, made an answer by request)
EDIT: About your extended question, this works approximately as follows. When the computer is turned on, the processor runs at its highest privilege level. In this "mode", the BIOS, the boot loader, and the operating system can do whatever they want. This sounds great, but you don't want just any code to be able to do whatever it wants.
For example, privileged code can tell the MMU which memory pages are allowed to be read or written, and which ones are not. Or it can define what address is jumped to if "something special" such as a trap or interrupt happens. Or it can write directly to special memory addresses that map to ports of some devices (disk, network, whatever).
Eventually, the OS switches to "unprivileged" mode and calls some non-OS code. When a trap or interrupt happens, execution is interrupted and continues elsewhere (as specified by the OS previously), and the privilege level is raised again. Once the interrupt has been dealt with, privilege is taken away again, and user code is called again.
If a user program needs the OS to do something "OS like", it sets up parameters according to an agreed scheme (for example in some particular registers) and executes a trap instruction.
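As a concrete example of that agreed scheme, here is a minimal sketch for Linux on x86-64 (other architectures use different registers and a different trap instruction): the write system call invoked by hand, with the syscall number in rax and the arguments in rdi, rsi and rdx. Normally the C library wraps this for you; the sketch only shows what the trap looks like at the machine level.

```c
#include <stddef.h>

/* Linux x86-64 convention: syscall number in rax (1 = write),
 * arguments in rdi, rsi, rdx; the `syscall` instruction is the trap
 * that hands control to the kernel. */
static long raw_write(long fd, const void *buf, size_t len)
{
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(1L),        /* syscall number: write */
                        "D"(fd),        /* rdi: file descriptor  */
                        "S"(buf),       /* rsi: buffer           */
                        "d"(len)        /* rdx: length           */
                      : "rcx", "r11", "memory");
    return ret;
}

int main(void)
{
    raw_write(1, "Hello World\n", 12);  /* fd 1 is stdout */
    return 0;
}
```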
This is, for example, how things like multithreading or virtual memory are implemented. At regular intervals a timer fires an interrupt, which stops execution of "normal" code and calls some code in the kernel (in privileged mode). That code then decides which user process control should be returned to, following some kind of priority scheme. Those are the "CPU time slices" that are handed out.
If a process reads from or writes to a page that it isn't allowed to, a trap is generated by the MMU. The OS then looks at what happened and where, and decides whether to load some data from disk into some memory region (possibly purging something else) and change the process's mappings, or whether to kill the process with a "segmentation fault" error.
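You can watch the second case from user space. The sketch below (Linux-specific, assumes mmap and POSIX signals are available) maps a read-only page, writes to it, and catches the resulting SIGSEGV, which is the user-visible side of the MMU trap just described.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Writing to a page the MMU marks read-only raises a fault; the
 * kernel either fixes the mapping or delivers SIGSEGV to the process. */
static void on_segv(int sig)
{
    (void)sig;
    const char msg[] = "caught SIGSEGV: the MMU/kernel rejected the write\n";
    write(STDOUT_FILENO, msg, sizeof msg - 1);   /* async-signal-safe */
    _exit(0);
}

int main(void)
{
    signal(SIGSEGV, on_segv);

    /* Map one anonymous page, readable but not writable. */
    char *page = mmap(NULL, 4096, PROT_READ,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) { perror("mmap"); return 1; }

    page[0] = 'x';   /* write to a read-only page -> hardware trap */

    puts("never reached");
    return 0;
}
```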
Of course, in reality it is a million times more complicated, but in principle that's about how it works.
It does not really matter whether the OS or the programs were originally written in C or in assembler. To the processor, it's just a sequence of machine instructions. Even a Python or Perl script is "just machine instructions" in the end, only with a detour via the interpreter.
I am using a generic USB keyboard, Linux 2.6.27 with a GNOME desktop, gnome-terminal and the bash shell. I am interested in what happens in the software. How are special characters from my keyboard interpreted, with some encoding, into characters, and where do the character pictures come from?
The Linux input layer with the USB drivers gets scancodes (basically "KEY 1 DOWN" "KEY 1 UP") from the keyboard.
X uses its keymap to convert scancodes into keycodes and X input events.
The GTK input method converts the sequence of input events into composed unicode characters.
Gnome-terminal encodes these in UTF-8 for the shell.
The shell doesn't care; it just knows that it's dealing with a multibyte encoding.
The shell echoes multibyte-encoded text back through the TTY.
Gnome-terminal decodes the incoming text and determines unicode code points.
Gnome-terminal draws characters using GTK+ facilities.
GTK+ uses Pango to render the text, and calls the X library to draw the pixels to the screen.
The X server draws characters into the screen buffer and the video card displays them.
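To make the first step less abstract, here is a small sketch that reads raw events straight from the kernel's input layer. The device node /dev/input/event0 is only a guess (the right one depends on your hardware), and opening it usually requires root or membership in the input group.

```c
#include <fcntl.h>
#include <linux/input.h>
#include <stdio.h>
#include <unistd.h>

/* Read raw key events from the Linux input layer and print the
 * keycodes; this is the data that X later maps through its keymap. */
int main(void)
{
    int fd = open("/dev/input/event0", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct input_event ev;
    while (read(fd, &ev, sizeof ev) == sizeof ev) {
        if (ev.type == EV_KEY)   /* value: 1 = press, 0 = release, 2 = repeat */
            printf("key code %u, value %d\n", (unsigned)ev.code, ev.value);
    }

    close(fd);
    return 0;
}
```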
Here is my attempt at a diagram:
http://osoft.us/system_layers.png
Look at it in layers. First is the hardware, and a device driver in the Linux kernel will have specific methods for controlling and responding to the keyboard via status registers in the device and interrupt handlers, for example.
Next is the Linux kernel, which will have some method of loading the appropriate driver for each piece of hardware detected at boot time. Once loaded, the device driver conforms to some kernel-driver interface, providing data from the device to the kernel and vice versa.
Outside the kernel, at some level, the device driver and hardware are visible, usually as a listing in the /dev directory. Software, like a terminal emulator, that needs to use a device will gain access to the device through an entry in /dev.
Communication between a user-level application and the device now happens via a series of read/write and ioctl operations. These trap into the kernel (see the manual pages for these for some detail), at which point the kernel communicates with the device driver loaded above.
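For a tiny, concrete example of that read/write/ioctl pattern, the sketch below asks the terminal driver for the current window size with the TIOCGWINSZ ioctl. It assumes it is run from a terminal, and standard output is used as the device file descriptor.

```c
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Ask the terminal driver, via the kernel, for the window size of
 * whatever device is attached to stdout. */
int main(void)
{
    struct winsize ws;

    if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &ws) < 0) {
        perror("ioctl TIOCGWINSZ");
        return 1;
    }

    printf("%u columns x %u rows\n", (unsigned)ws.ws_col, (unsigned)ws.ws_row);
    return 0;
}
```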
The terminal emulator will display characters as you type them (in most cases) and as they are received from the device (in most cases), using fonts that it can access, located in various places depending on the application. (I'm speaking in generalities here because I don't know GNOME specifically.)
Does this help?