How does an operating system draw windows on the screen?

I realized after many years of using and programming computers that the stack of software that actually draws on the screen is mostly a mystery to me.
I have worked on some embedded LCD GUI applications and I think that provides some clues as to a simplified stack but the whole picture for something like the Windows operating system is still murky.
From what I know:
Lowest level 0 is electronic hardware (integrated circuits) that provide a digital interface to turn a pixel on the screen a certain color or grey scale shade. The interface is documented in data sheets so you know how to toggle the digital lines to turn any pixel the way you want it.
Next level 1 is a hardware driver. This usually abstracts the hardware into a common interface. Something like SetPixel() etc.
Next level 2 is 2D/3D graphics library (of which I have limited widget/single screen experience). The lower levels seem to provide a buffer or range of memory that represents the pixels on the screen. The graphics library abstracts this so you can call functions like DrawText("text", 10, 10, "font") and it will set the pixels for you in the right way.
Next level would be the magic of the OS. The windows/buttons/forms/WPF/etc is created in memory and then routed to the appropriate driver while also being directed to a certain part of the screen?
But how does something like Windows really work?
I would assume that the GPU fits between level 0 and level 1. The GPU drives the pixels on the display directly and now the level 1 drivers are a GPU driver. There are more functions available to enable the added functionality a GPU provides. (what would this be though? Does the OS pass on an array of triangles in 3D space and the GPU processes this into a 3D perspective view and then chuck it on the screen?)
The biggest mystery to me though is when you get into the windows part of things. You can have SketchUp, Visual Studio and an FPS game all running at the same time and be able to switch between them, or in some cases tile them on the screen or have them spread across multiple screens. How is this tracked and rendered? Each of these would have to be running in the background, and the OS would have to say which graphics pipe should be connected to which part of the screen. How would Windows say this part of the screen is a 3D game and this part is a 2D WPF app, etc.?
On top of all that you have DirectX used in one application and Qt in another. I remember having multiple games or apps running that use the same technology, so how would that work? From what I can see you would have Application->Graphics library (DirectX, WPF etc)->Frame Buffer->Windows director (where and what part of the screen should this frame buffer be scaled to)->Driver?
In the end it is just bits toggling to indicate which pixel should be what color but it is one hell of a lot of toggling bits along the way to get there.
If I fire up Visual Studio and create a basic WPF app, what is going on in the background when I drop a button on the screen and hit start? I have used the VS designer to drop it on, created it in XAML, and I have even manually drawn things pixel by pixel in an embedded system, but what happens in between, the so-called meat of this sandwich?
I have used Android, iOS, Windows and Linux and it seems to be common functionality, but I have never seen or heard an explanation of the 'how' behind what I outline above; I only have a slightly educated guess.
Is anyone able to shed some light on how this works?

VGA
Assuming x86, VGA memory is mapped at a standard video buffer address in the lowest 1 MiB (0x000B8000 for text mode and 0x000A0000 for graphics mode). There are also many VGA registers that control the behaviour of the card. There were two widely used video modes, mode 0x12 (16-color 640x480) and mode 0x13 (256-color 320x200). Mode 0x12 involved switching planes (blue, green, red, intensity) with VGA registers, while mode 0x13 involved having a 256-color palette which can be modified using VGA registers.
Normally, an OS relying on VGA would set the mode using BIOS while booting, or write to the appropriate VGA registers at runtime (if it knows what it is doing). To draw to the screen, the video driver would either simply write to the video memory (mode 0x13) or combine that with writing to VGA registers too (mode 0x12).
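As a rough sketch of the mode 0x13 case: once the mode is set, plotting a pixel is just a byte write into the memory at 0xA0000 (this assumes the address is directly reachable, e.g. under DOS or in an identity-mapped kernel; the function name is just for illustration):

    #include <stdint.h>

    /* Sketch: plot one pixel in VGA mode 0x13 (320x200, 256 colours).
     * Assumes the mode is already set and 0xA0000 is directly addressable. */
    static volatile uint8_t *const vga_mem = (volatile uint8_t *)0xA0000;

    void putpixel_mode13h(int x, int y, uint8_t colour_index)
    {
        /* One byte per pixel; rows are stored one after another. */
        vga_mem[y * 320 + x] = colour_index;
    }
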
Most cards in use today are still (partly) VGA compatible.
VBE
Some years later, VESA invented "VESA BIOS Extensions", which was a standard interface for video cards and allowed higher resolutions and greater color depths. The video memory was exposed through two different ways: banked mode and linear framebuffer. The banked mode would expose some small portion of the video memory to a low address (0x000A0000) and the video driver would need to switch banks almost each time the screen is to be updated. The linear framebuffer is a much more convenient solution, which would map the entire video memory to a non-standard high address.
During boot, an OS would call the VBE interface to query for supported modes and to set the most convenient one, or it would bypass the VBE interface and write directly to the needed video hardware registers (if it knows what it is doing). In either case (banked mode or linear framebuffer), the video driver writes to the memory address at which the video memory is mapped.
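For the linear framebuffer, a pixel write is just an offset calculation into that mapped memory; a sketch, assuming a 32-bpp mode, where the base address and pitch (bytes per scanline) come from the VBE mode information block:

    #include <stdint.h>

    /* Sketch: write a 32-bpp pixel into a VBE linear framebuffer.
     * 'framebuffer' and 'pitch' are values reported by the VBE mode info. */
    void lfb_putpixel(uint8_t *framebuffer, uint32_t pitch,
                      int x, int y, uint32_t xrgb)
    {
        /* pitch can be larger than width * 4 because of padding. */
        uint32_t *row = (uint32_t *)(framebuffer + (uint32_t)y * pitch);
        row[x] = xrgb;   /* e.g. 0x00FF0000 for red in an XRGB layout */
    }
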
Most cards in use today are still (partly) VBE compatible.
Modern video interfaces
The most modern video interfaces usually aren't documented as widely as VGA and/or VBE. However, the video memory is still mapped at an address, while hardware registers and/or a buffer contain modifiable information about the behaviour of the graphics card. The difference is that the interfaces aren't standardised anymore and nowadays an advanced OS requires different drivers for each graphics card.

Related

Establishing a virtually bigger display buffer through code

I don't know if this is a programming question, but it is an issue that may be possible to solve through programming.
Based on my limited knowledge about how the display processing pipeline in computers works, I theorised that pixels on the monitor are allocated space in a memory buffer somewhere, and that this buffer size depends on the size of our screen. So, can we fool the computer into thinking that we have a bigger monitor than we actually have, and take advantage of that, for instance by screencasting at a larger resolution than we actually have?
To answer your question, more information about your hardware and OS (and drivers) is required, as it all depends heavily on them.
Nvidia for example has Dynamic Super Resolution, while AMD has Virtual Super Resolution.
Both will provide a bigger available resolution to the applications (games) than is actually available on your monitor. How to enable/configure this depends on your hardware, so you should Google a bit for your specific setup.
Your OS then is (or should be) able to scale it down so that it displays properly on your monitor.
If your screen capture software is able to directly capture your video memory (and not what is output to your monitor), it will capture the higher resolution. (I have no experience with screen capturing software, so I won't be of much help with this.)
Yes, there's a large chunk of memory (probably in your video card) that contains the actual displayed pixels, and there's a completely separate memory area maintained by the desktop software. It is possible (and in fact common) for the latter to maintain a "virtual" desktop that is larger than your monitor, extending the desktop into a second monitor, or perhaps scrolling or page flipping to access the extended areas.
All of this is very OS-specific.

Hardware acceleration without X

I was wondering if it would be possible to get graphical hardware acceleration without Xorg and its DDX driver, only with a kernel module and the rest of the driver in userspace. I'm asking this because I'm starting to develop on an embedded platform (something like a BeagleBoard, or more roughly a Texas Instruments ARM chip with an integrated GPU), and I would like to get hardware acceleration without the overhead of a graphical server (which is not needed).
If yes, how? I was thinking about OpenGL or OpenGL ES implementations, or Qt Embedded http://harmattan-dev.nokia.com/docs/library/html/qt4/qt-embeddedlinux-accel.html
TI provides extensive documentation, but it is still not clear to me:
http://processors.wiki.ti.com/index.php/Sitara_Linux_Software_Developer%E2%80%99s_Guide
Thank you.
The answer will depend on your user application. If everything is bare metal and your application team is writing everything, the DirectFB API can be used as Fredrik suggests. This might be especially interesting if you use the framebuffer version of GTK.
However, if you are using Qt, then this is not the best way forward. Qt5.0 does away with QWS (Qt embedded acceleration). Qt is migrating to LightHouse, now known as QPA. If you write a QPA plug-in that uses your graphics acceleration by whatever kernel mechanism you expose, then you have accelerated Qt graphics. Also of interest might be the Wayland architecture; there are QPA plug-ins for Wayland. Support exists for QPA in Qt4.8+ and Qt5.0+. Skia is also an interesting graphics API with support for an OpenGL backend; Skia is used by Android devices.
Getting graphics acceleration is easy. Do you want compositing? What is your memory footprint? Who is your developer audience that will program to the API? Do you need object functionality or just drawing primitives? There is a big difference between Skia, PegUI, WindML and full-blown graphics frameworks (Gtk, Qt) with all the widgets and dynamic effects that people expect today. Programming to the OpenGL ES API might seem fine at first glance, but if your application has any complexity you will need a richer graphics framework; mostly re-iterating Mats Petersson's comment.
Edit: From the Qt embedded acceleration link,
CPU blitter - slowest
Hardware blitter - e.g., DirectFB. Fast memory movement, usually with bit operations as opposed to machine words, like DMA.
2D vector - OpenVG; stick-figure drawing, with bit manipulation.
3D drawing - OpenGL(ES) has polygon fills, etc.
These are the types of drawing you wish to perform. Frameworks like Qt and Gtk give an API to put a radio button, checkbox, editbox, etc. on the screen. They also handle styling of the text and interaction with a keyboard, mouse and/or touch screen and other elements. A framework uses the drawing engine to put the objects on the screen.
Graphics acceleration is just putting algorithms like Bresenham's algorithm in a separate CPU or dedicated hardware. If the framework you chose doesn't support 3D objects, it is unlikely to need OpenGL support and may not perform any better.
The final piece of the puzzle is a window manager. Many embedded devices do not need this. However, many handsets use compositing and alpha values to create transparent windows and allow multiple apps to be seen at the same time. This may also influence your graphics API.
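As a rough sketch of what a compositor does per pixel, this is a plain source-over alpha blend (8-bit channels, non-premultiplied; real compositors typically do this on the GPU, often with premultiplied alpha):

    #include <stdint.h>

    /* Blend one source channel over a destination channel.
     * alpha = 255 means the window is fully opaque, 0 fully transparent. */
    uint8_t blend_channel(uint8_t src, uint8_t dst, uint8_t alpha)
    {
        return (uint8_t)((src * alpha + dst * (255 - alpha)) / 255);
    }
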
Additionally: DRI without X gives some compelling reasons why this might not be a good thing to do; for the case of a single user task, the DRI is not even needed.
A diagram of the Wayland graphics stack can be found in a blog post on Wayland.
This depends on the SoC GPU driver implementation.
On i.MX6, you can use a Wayland compositor on the framebuffer.
I built a sample project as a reference:
Qt with wayland on imx6D/Q
On OMAP3 there is a project:
omap3 sgx wayland

Programming graphics and sound on PC - Total newbie questions, and lots of them!

This isn't exactly a programming question (or is it?), but I was wondering:
How are graphics and sound processed from code and output by the PC?
My guess for graphics:
There is some reserved memory space somewhere that holds exactly enough room for a frame of graphics output for your monitor.
i.e. 800 x 600, 24-bit color mode: 800 x 600 x 3 bytes = ~1.4 MB of memory
Between each refresh, the program writes video data to this space. This action is completed before the monitor refresh.
Assume a simple 2D game: the graphics data is stored in machine code as many bytes representing color values. Depending on what the program(s) being run instruct the PC, the processor reads the appropriate data and writes it to the memory space.
When it is time for the monitor to refresh, it reads from each memory space byte-for-byte and activates hardware depending on those values for each color element of each pixel.
All of this of course happens crazy-fast, and repeats x times a second, x being the monitor's refresh rate. I've simplified my own likely-incorrect explanation by avoiding talk of double buffering, etc
Here are my questions:
a) How close is the above guess (the three steps)?
b) How could one incorporate graphics in pure C++ code? I assume the practical thing that everyone does is use a graphics library (SDL, OpenGL, etc), but, for example, how do these libraries accomplish what they do? Would manual inclusion of graphics in pure C++ code (say, a 2D sprite) involve creating a two-dimensional array of bit values (or three-dimensional to include multiple RGB values per pixel)? Is this how it would be done waaay back in the day?
c) Also, continuing from above, do libraries such as SDL etc. that use bitmaps actually just build the bitmap/etc. files into the machine code of the executable and use them as though they were built in the same manner mentioned in question b above?
d) In my hypothetical step 3 above, are there any registers involved? Like, could you write some byte value to some register to output a single color of one byte on the screen? Or is it purely dedicated memory space (= RAM) + hardware interaction?
e) Finally, how is all of this done for sound? (I have no idea :) )
a.
A long time ago, that was the case, but it hasn't been for quite a while. Most hardware will still support that type of configuration, but mostly as a fall-back -- it's not how they're really designed to work. Now most have a block of memory on the graphics card that's also mapped to be addressable by the CPU over the PCI/AGP/PCI-E bus. The size of that block is more or less independent of what's displayed on the screen though.
Again, at one time that's how it mostly worked, but it's mostly not the case anymore.
Mostly right.
b. OpenGL normally comes in a few parts -- a core library that's part of the OS, and a driver that's supplied by the graphics chipset (or possibly card) vendor. The exact distribution of labor between the CPU and GPU varies somewhat though (between vendors, over time within products from a single vendor, etc.) SDL is built around the general idea of a simple frame-buffer like you've described.
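As a small illustration of that frame-buffer idea, a minimal SDL2 program (assuming SDL2 is installed; error handling omitted) grabs the window's surface and treats it as a plain pixel buffer:

    #include <SDL2/SDL.h>

    /* Minimal SDL2 sketch: fill the window surface with one colour. */
    int main(void)
    {
        SDL_Init(SDL_INIT_VIDEO);
        SDL_Window *win = SDL_CreateWindow("framebuffer demo",
                                           SDL_WINDOWPOS_CENTERED,
                                           SDL_WINDOWPOS_CENTERED,
                                           640, 480, 0);
        SDL_Surface *surf = SDL_GetWindowSurface(win);

        /* Write into the surface, then push it to the screen. */
        SDL_FillRect(surf, NULL, SDL_MapRGB(surf->format, 0, 128, 255));
        SDL_UpdateWindowSurface(win);

        SDL_Delay(2000);            /* keep the window visible briefly */
        SDL_DestroyWindow(win);
        SDL_Quit();
        return 0;
    }
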
c. You usually build bitmaps, textures, etc., into separate files in formats specifically for the purpose.
d. There are quite a few registers involved, though the main graphics chipset vendors (ATI/AMD and nVidia) tend to keep their register-level documentation more or less secret (though this could have changed -- there's constant pressure from open source developers for documentation, not just closed-source drivers). Most hardware has capabilities like dedicated line drawing, where you can put (for example) line parameters into specified registers, and it'll draw the line you've specified. Exact details vary widely though...
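To make the idea concrete, a driver-level call might look something like the sketch below. The register layout and names here are entirely invented for illustration; real register maps are vendor-specific and, as noted, mostly undocumented:

    #include <stdint.h>

    /* Hypothetical memory-mapped registers for a line-drawing engine.
     * These offsets and names are made up purely for illustration. */
    struct line_engine_regs {
        volatile uint32_t x0, y0;   /* start point */
        volatile uint32_t x1, y1;   /* end point   */
        volatile uint32_t colour;   /* packed RGB  */
        volatile uint32_t go;       /* write 1 to kick off the operation */
    };

    void draw_line_hw(struct line_engine_regs *regs,
                      int x0, int y0, int x1, int y1, uint32_t colour)
    {
        regs->x0 = (uint32_t)x0; regs->y0 = (uint32_t)y0;
        regs->x1 = (uint32_t)x1; regs->y1 = (uint32_t)y1;
        regs->colour = colour;
        regs->go = 1;               /* the hardware draws the line itself */
    }
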
e. Sorry, but this is getting long already, and sound covers a pretty large area...
For graphics, Jerry Coffin's got a pretty good answer.
Sound is actually handled similarly to your (the OP's) description of how graphics is handled. At a very basic level, you have a "buffer" (some memory, somewhere).
Your software writes the sound you want to play into that buffer. It is basically an encoding of the position of the speaker cone at a given instant in time.
For "CD quality" audio, you have 44100 values per second (a "sample rate" of 44.1 kHz).
A little bit behind the write position, you have the audio subsystem reading from a read position in the buffer.
This read position will be a little bit behind the write position. The distance behind is known as the latency. A larger distance gives more of a delay, but also helps to avoid the case where the read position catches up to the write position, leaving the sound device with nothing to actually play!
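A sketch of what "writing the sound into that buffer" means: filling a block of signed 16-bit samples at 44.1 kHz with a sine tone (which audio API eventually consumes the buffer - ALSA, WASAPI, Core Audio, etc. - is left out here):

    #include <math.h>
    #include <stdint.h>

    #ifndef M_PI
    #define M_PI 3.14159265358979323846
    #endif

    #define SAMPLE_RATE 44100

    /* Fill 'buffer' with a sine tone at freq_hz, as 16-bit mono samples. */
    void fill_tone(int16_t *buffer, int num_samples, double freq_hz)
    {
        for (int i = 0; i < num_samples; i++) {
            double t = (double)i / SAMPLE_RATE;   /* time of this sample */
            buffer[i] = (int16_t)(0.25 * 32767.0 * sin(2.0 * M_PI * freq_hz * t));
        }
    }
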

Learning about low-level graphics programming

I'm interested in learning about the different layers of abstraction available for making graphical applications.
I see a lot of terms thrown around: At the highest level of abstraction, I hear about things like C#, .NET, pyglet and pygame. Further down, I hear about DirectX and OpenGL. Then there's DirectDraw, SDL, the Win32 API, and still other multi-platform libraries like WxWidgets.
How can I get a good sense of where one of these layers ends and where the next one begins? What is the "lowest possible level" way of creating a window in Windows, in C? What about C++? (A code sample would be divine.) What about in X11? Are the Windows implementations of OpenGL and DirectX built on top of the Win32 API? Where can I begin to learn about these things?
There's another question on SO where Programming Windows is suggested. What about for Linux? Is there an equivalent book?
I'm aware that this is very low-level, and that there are many friendlier tools available, but I would like to at least learn the basics of what's going on beneath the surface. As much as I'd like to begin slinging windows and vectors right off the bat, starting with something like pygame is too high-level for me; I really need to make the full conceptual circuit of how you draw stuff on a computer.
I will certainly appreciate suggestions for books and resources, but I think it would be stupendously cool if the answers to this question filled up with lots of different ways to get to "Hello world" with different approaches to graphics programming. C? C++? Using OpenGL? Using DirectX? On Windows XP? On Ubuntu? Maybe I ask for too much.
The lowest level would be the graphics card's video RAM. When the computer first starts, the graphics card is typically set to the 80x25 character legacy mode.
You can write text with a BIOS-provided interrupt at this point. You can also change the foreground and background color from a palette of 16 distinctive colors. You can access ports/registers to change the display mode. At this point you could, say, load a different font into the display memory and still use the 80x25 mode (OS installations usually do this) or you can go ahead and enable VGA/SVGA. It's quite complicated; that's what drivers are for.
Once the card's in the 'higher' mode you'd change what's on screen by accessing the memory mapped to the video card. It's stored horizontally pixel by pixel, with some 'dirty regions' of pixels that aren't mapped to the screen at the end of each line, which you have to compensate for. But yeah, you could copy the pixels of an image in memory directly to the screen.
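Copying an image to the screen in that setup is mostly a row-by-row copy that skips the unused bytes at the end of each scanline; a sketch, assuming one byte per pixel:

    #include <stdint.h>
    #include <string.h>

    /* Copy a w x h image (1 byte per pixel) into video memory at (x, y).
     * screen_pitch is the number of bytes per scanline in video memory,
     * which may be larger than the visible width. */
    void blit(uint8_t *screen, int screen_pitch,
              const uint8_t *image, int w, int h, int x, int y)
    {
        for (int row = 0; row < h; row++) {
            memcpy(screen + (y + row) * screen_pitch + x,
                   image + row * w,
                   (size_t)w);
        }
    }
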
For things like DirectX and OpenGL, rather than writing directly to the screen, commands are sent to the graphics card and it updates the screen itself. Commands like "Hey you, draw this image I've loaded into the VRAM here, here and here" or "Draw these triangles with this transformation matrix..." take a fraction of the time compared to doing it pixel by pixel. The CPU will thank you.
DirectX/OpenGL is a programmer friendly library for sending those commands to the card with all the supporting functions to help you get it done smoothly. A more direct approach would only be unproductive.
SDL is an abstraction layer so without bothering to read up on it I'd guess it would have different ways of working on each system. On one it might use semi-direct screen writing, another Direct3D, etc. Whatever's fastest as long as the code stays cross-platform..able.
The GDI/GDI+ and X Window systems. They're designed specifically to draw windows. Originally they drew using the pixel-by-pixel method (which was good enough because they only had to redraw when a button was pressed or a window moved, etc.), but now they use Direct3D/OpenGL for accelerated drawing (and special effects). Optimizations depend on the versions and implementations of these libraries.
So if you want the most power and speed, DirectX/OpenGL is the way to go. SDL is certainly useful for getting the most from a cross-platform environment and integrates with OpenGL anyway. The windowing system comes last, but don't underestimate it. Especially with the stuff Microsoft's coming up with lately.
Michael Abrash's Graphics Programming 'Black Book' is a great place to start. Plus you can download it for free!
If you really want to start at the bottom then drawing a line is the most basic operation. Computer graphics is simply about filling in pixels on a grid (screen), so you need to work out which pixels to fill in to get a line that goes from (x0,y0) to (x1,y1).
Check out Bresenham's algorithm to get a feel for what is involved.
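A minimal integer-only version of it looks like this (putpixel() stands in for whatever actually sets a pixel in your framebuffer):

    #include <stdlib.h>

    void putpixel(int x, int y);   /* assumed to be provided elsewhere */

    /* Bresenham's line algorithm, handling all octants. */
    void draw_line(int x0, int y0, int x1, int y1)
    {
        int dx = abs(x1 - x0), sx = x0 < x1 ? 1 : -1;
        int dy = -abs(y1 - y0), sy = y0 < y1 ? 1 : -1;
        int err = dx + dy;

        for (;;) {
            putpixel(x0, y0);
            if (x0 == x1 && y0 == y1)
                break;
            int e2 = 2 * err;
            if (e2 >= dy) { err += dy; x0 += sx; }   /* step in x */
            if (e2 <= dx) { err += dx; y0 += sy; }   /* step in y */
        }
    }
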
Being a good graphics and image-processing programmer doesn't require this low-level knowledge, but I do hate to be clueless about the insides of what I'm using. I see two ways to chase this: top-down or bottom-up.
Top-down is a matter of following how the action traces from a high-level graphics operation, such as drawing a circle, down to the hardware. Get to know OpenGL well. Then the source to Mesa (free!) provides a peek at how OpenGL can be implemented in software. The source to Xorg would be next, first to see how the action goes from API calls through the client side to the X server. Finally you dive into a device driver that interfaces with hardware.
Bottom up: build your own graphics hardware. Think of ways it could connect to a computer - how to handle massive numbers of pixels through a few byte-size registers, how DMA would work. Write a device driver, and try designing a graphics library that might be useful for app programmers.
The bottom-up way is how I learned, years ago when it was a possibility with the slow 8-bit microprocessors. The direct experience with circuitry and hardware-software interfacing gave me a good appreciation of the difficult design decisions - e.g. whether to paint rectangles using clever hardware, in the device driver, or at a higher level. None of this is of practical everyday value, but it provided a foundation of knowledge to understand newer technology.
see Open GPU Documentation section:
http://developer.amd.com/documentation/guides/Pages/default.aspx
HTH
On MS Windows it is easy: you use what the API provides, whether it is the standard Windows programming API or the DirectX family of APIs, and they are well documented.
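For reference, a minimal window in plain C using nothing but the Win32 API looks roughly like this (link against user32; error handling omitted):

    #include <windows.h>

    /* Smallest useful Win32 program: register a window class, create a
     * window, and run the message loop until the user closes it. */
    static LRESULT CALLBACK WndProc(HWND hwnd, UINT msg, WPARAM wp, LPARAM lp)
    {
        if (msg == WM_DESTROY) {
            PostQuitMessage(0);
            return 0;
        }
        return DefWindowProcA(hwnd, msg, wp, lp);
    }

    int WINAPI WinMain(HINSTANCE hInst, HINSTANCE hPrev, LPSTR cmd, int show)
    {
        WNDCLASSA wc = {0};
        wc.lpfnWndProc   = WndProc;
        wc.hInstance     = hInst;
        wc.lpszClassName = "MinimalWindow";
        RegisterClassA(&wc);

        HWND hwnd = CreateWindowA("MinimalWindow", "Hello, Win32",
                                  WS_OVERLAPPEDWINDOW,
                                  CW_USEDEFAULT, CW_USEDEFAULT, 640, 480,
                                  NULL, NULL, hInst, NULL);
        ShowWindow(hwnd, show);

        MSG msg;
        while (GetMessage(&msg, NULL, 0, 0) > 0) {
            TranslateMessage(&msg);
            DispatchMessage(&msg);
        }
        return 0;
    }
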
In an X Window System environment you use whatever X11 libraries are provided. If you want to understand the principles behind windowing on X, I suggest that you do exactly that, never mind that many others tell you not to; it will really help you to understand graphics and windowing under X. You can read the documentation on X programming (google for it). (After this exercise you will appreciate the higher-level libraries!)
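The X11 equivalent with plain libX11 (compile with -lX11; error handling omitted) is about the same size: open the display, create and map a window, and draw on an Expose event:

    #include <X11/Xlib.h>
    #include <string.h>

    /* Minimal Xlib program: create a window and draw a string in it. */
    int main(void)
    {
        Display *dpy = XOpenDisplay(NULL);
        int screen = DefaultScreen(dpy);

        Window win = XCreateSimpleWindow(dpy, RootWindow(dpy, screen),
                                         10, 10, 640, 480, 1,
                                         BlackPixel(dpy, screen),
                                         WhitePixel(dpy, screen));
        XSelectInput(dpy, win, ExposureMask | KeyPressMask);
        XMapWindow(dpy, win);

        for (;;) {
            XEvent ev;
            XNextEvent(dpy, &ev);
            if (ev.type == Expose) {
                const char *msg = "Hello, X11";
                XDrawString(dpy, win, DefaultGC(dpy, screen),
                            50, 50, msg, (int)strlen(msg));
            }
            if (ev.type == KeyPress)   /* quit on any key press */
                break;
        }
        XCloseDisplay(dpy);
        return 0;
    }
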
Apart from the above, the absolute lowest level (excluding chip level) you can go to is calling the interrupts that switch to the various graphics modes available - there are several - and then writing to the screen buffers, but for this you would have to use assembler; anything else would be too slow. Going this way will not be portable at all.
Another post mentions Abrash's Black Book - an excellent resource.
Edit: As for books on programming Linux: it is a community thing, there are many howto's around; also find a forum, join it, and as long as you act civilized you will get all the help you can ever need.
Right off the bat, I'd say "you're asking too much." From what little experience I've had, I would recommend reading some tutorials or getting a book on either DirectX or OpenGL to start out. To go any lower than that would be pretty complex. Most of the books I've seen on OGL or DX have pretty good introductions that explain what the functions/classes do.
Once you get the hang of one of these, you could always dig in to the libraries to see what exactly they're doing to go lower.
Or, if you really, absolutely MUST learn the LOWEST level... read the book in the above post.
libX11 is the lowest-level library for X11. I believe OpenGL/DirectX talk to the driver/hardware directly (or emulate unsupported ops), so they would be the lowest-level libraries.
If you want to start with very low level programming, look for x86 assembly code for VGA and fire up a copy of dosbox or similar.
The Vulkan API gives you very low-level access to most, if not all, features of the GPU, computational and graphical; it works on AMD and Nvidia GPUs (though not all of them).
You can also use CUDA, but it only works on Nvidia GPUs and gives access to computational features only, with no video output.

VGA standard for Graphics Controller

I'm attempting to create a generic graphics controller for VGA monitors with an Altera FPGA via a VGA connector, but I cannot find any good online resources explaining the standard specification which monitors use. I've found all the pin descriptions and some resources which describe how to create a specific graphics controller, such as this 8-colour 480x640 controller, but no resources I've found describe the actual 'protocol' which monitors expect.
For example, nowhere have I found what the exact timings are supposed to be between different parts of the signal -- in the above, specific timings in µs are given, but not why. Are all the sections supposed to be in these set proportions, or is there some arbitrariness with regard to pause timings between rows, etc.? What would the pseudo-code look like if you were implementing it in code (and wanted to be able to change resolution / colour depth)?
Again, I'm looking for the expected 'protocol' for a generic controller -- similar to what an OS would use when no monitor type is specified. Any pointers in the right direction would be appreciated.
I haven't done any lower-level VGA stuff for years, but a book I used that may be of some help is: Programmer's Guide to the EGA, VGA, and Super VGA Cards
The table of contents for the book is as follows:
Introduction to the Programmer's Guide
The EGA, VGA, and Super VGA Features
Graphics Hardware and Software
Types of Graphics Systems
Principles of Computer Graphics
Alphanumeric Processing
Graphics Processing
Color Palette and Color Registers
Reading the State of the EGA and VGA
The EGA/VGA Registers
The EGA/VGA BIOS
Programming Examples
The Super VGA
Graphics Coprocessors
Super VGA Code Basics
The Adapter Interface
The 8514/A
The XGA
ATI Technologies
Chips and Technologies
Cirrus Logic
The Video7 Super VGA Chip Set
IIT
NCR
Oak
S3 Incorporated
The Trident Super VGA Chip Sets
The Tseng Labs Super VGA Chips
The Paradise Super VGA Chips
Weitek
This site has a pretty good discussion of VGA:
http://server.oersted.dtu.dk/www/sn/31002/?Materials/vga/main.html
The key to what you're asking is answered with this clip from the site: http://web.mit.edu/6.111/www/s2004/NEWKIT/vga.shtml
"As with RS-232, the standard for VGA video is that there are lots of standards. Every manufacturer seems to list different timings in the manuals for their monitors. The values given in the table above are not particularly critical. On a CRT monitor, the lengths of the front and back porches control the position of the image on the display. If the image appears offset to the right or left, or up or down, try adjusting the front and back porch values for the corresponding direction (or use the image position adjustments on the monitor, which accomplish the same thing)."
The problem is that backwards compatibility doesn't lend itself well to a simple equation for determining these values. There is a modern spreadsheet that will calculate values for monitors that use the most recent standards, but if you're playing around with VGA, the old analog monitors will let you do tricks that you can't do on an LED-type display.
Your resolution is limited by how fast the electronics can turn the electron beam on and off, but the horizontal placement is only limited by your clock and whatever phase adjustments are possible on your FPGA.
For instance, you can set up 640x480 timing on your sync pulses and, instead of clocking data at 25 MHz, use 100 or 200 MHz and simply require a minimum on-time for each pixel, effectively allowing you to smooth-scroll by 1/8th of the width of a pixel. You may be able to do similar tweaking to the distance between scan lines, although I've never tried it.
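For reference, the commonly used 640x480 @ 60 Hz mode (25.175 MHz pixel clock, negative sync polarity) breaks down as follows; expressing it as data like this makes it easy to swap in other modes in an FPGA design:

    /* Industry-standard VGA 640x480 @ 60 Hz timing.
     * Horizontal values are in pixel clocks, vertical values in lines. */
    struct vga_timing {
        int visible, front_porch, sync_pulse, back_porch, total;
    };

    static const struct vga_timing h_640x480_60 = { 640, 16, 96, 48, 800 };
    static const struct vga_timing v_640x480_60 = { 480, 10,  2, 33, 525 };
    /* 800 clocks/line x 525 lines x ~60 Hz ~= 25.175 MHz pixel clock */
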

Resources