Why emulate for a certain number of cycles?

I have seen the following way of emulating in more than one place, i.e. a number of cycles is passed into the emulate function:
int CPU_execute(int cycles) {
    int cycle_count = cycles;
    do {
        /* OPCODE execution here: fetch and run one instruction, then
           subtract its cost, e.g. cycle_count -= cycles_for_this_opcode; */
    } while (cycle_count > 0);
    return cycles - cycle_count;  /* cycles actually consumed; may exceed 'cycles'
                                     because the last instruction is completed */
}
I am having a hard time understanding why you would take this approach, i.e. why would you emulate for a certain number of cycles? Can you give some scenarios where this approach is useful?
Any help is heartily appreciated!

Emulators tend to be interested in fooling the software written for multiple chip devices — in terms of the Z80 and the best selling devices you're probably talking about at least a graphics chip and a sound chip in addition to the CPU.
In the real world those chips all act concurrently. There'll be some bus logic to allow them all to communicate but they're otherwise in worlds of their own.
You don't normally run emulation of the different chips as concurrent processes because the cost of enforcing synchronisation events is too great, especially in the common arrangement where something like the same block of RAM is shared between several of the chips.
So instead the most basic approach is to cooperatively multitask the different chips — run the Z80 for a few cycles, then run the graphics chip for the same amount of time, etc, ad infinitum. That's where the approach of running for n cycles and returning comes from.
It's usually not an accurate way of reproducing the behaviour of a real computer bus but it's easy to implement and often you can fool most software.
In the specific code you've posted the author has further decided that the emulation will round the number of cycles up to the end of the next whole instruction. Again that's about simplicity of implementation rather than being anything to do with the actual internals of a real machine. The number of cycles actually run for is returned so that other subsystems can attempt to adapt.
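To make the cooperative-multitasking idea concrete, a scheduling loop often looks roughly like the sketch below. This is an illustration, not code from any particular emulator: CPU_execute is the function from the question, while video_run and sound_run are assumed stand-ins for whatever advances the other chips by a given number of cycles.
/* Sketch of interleaving the chips in fixed slices of cycles. */
#define CYCLES_PER_SLICE 100   /* arbitrary slice length for the sketch */
void run_frame(int cycles_per_frame) {
    int done = 0;
    while (done < cycles_per_frame) {
        /* May overshoot slightly, because CPU_execute rounds up to the end of
           the current instruction; the return value says by how much, so the
           other chips can be kept in step. */
        int executed = CPU_execute(CYCLES_PER_SLICE);
        video_run(executed);   /* advance the graphics chip by the same amount (assumed helper) */
        sound_run(executed);   /* advance the sound chip by the same amount (assumed helper)   */
        done += executed;
    }
}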

Since you mentioned z80, I happen to know just the perfect example of a platform where this kind of precise emulation is sometimes necessary: the ZX Spectrum. The standard graphics output area on the ZX Spectrum was a box of 256 x 192 pixels situated in the centre of the screen, surrounded by a fairly wide "border" area filled with a solid color. The color of the border was controlled by outputting a value to a special output port. The computer designer's idea was that one would simply choose the border color that is the most appropriate to what is happening on the main screen.
ZX Spectrum did not have a precision timer. But programmers quickly realised that the "rigid" (by modern standards) timings of z80 allowed one to do drawing that was synchronised with the movement of the monitor's beam. On ZX Spectrum one could wait for the interrupt produced at the beginning of each frame and then literally count the precise number of cycles necessary to achieve various effects. For example, a single full scanline on ZX Spectrum was scanned in 224 cycles. Thus, one could change the border color every 224 cycles and generate pixel-thick lines on the border.
The graphics capability of the ZX Spectrum was limited in the sense that the screen was divided into 8x8 blocks of pixels, each of which could only use two colors at any given time. Programmers overcame this limitation by changing these two colors every 224 cycles, hence, effectively, increasing the color resolution 8-fold.
I can see that the discussion under another answer focuses on whether one scanline may be a sufficiently accurate resolution for an emulator. Well, some of the border scroller effects I've seen on the ZX Spectrum are basically timed to a single z80 cycle. An emulator that wants to reproduce the correct output of such code would also have to be precise to a single machine cycle.
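To show why that matters in code, here is a minimal sketch (my own simplification, not real Spectrum emulator code) of how an emulator might record border-port writes with their cycle timestamps and use them when rendering; if the CPU core is only accurate to "some time this frame", the stripes land on the wrong scanlines and the effect breaks.
#include <cstdint>
#include <vector>
const int CYCLES_PER_LINE = 224;            // one Spectrum scanline, as above
struct BorderWrite { long cycle; uint8_t colour; };
std::vector<BorderWrite> border_writes;     // filled in by the emulated OUT handler
// Called by the emulated OUT instruction handler with the current cycle count.
void on_border_write(long current_cycle, uint8_t colour) {
    border_writes.push_back({current_cycle, colour});
}
// When rendering a frame, each scanline picks up the colour that was current
// at the cycle the beam reached that line.
uint8_t border_colour_for_line(int line, uint8_t colour_at_frame_start) {
    long line_start_cycle = (long)line * CYCLES_PER_LINE;
    uint8_t colour = colour_at_frame_start;
    for (const BorderWrite& w : border_writes)
        if (w.cycle <= line_start_cycle) colour = w.colour;
    return colour;
}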

If you want to sync your processor with other hardware, it can be useful to do it like this. For instance, if you want to sync it with a timer, you need to control how many cycles can pass before the timer interrupts the CPU.
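For example (a sketch only: the timer period is made up and raise_timer_interrupt is an assumed helper), a periodic timer can be kept in step by only letting the CPU run up to the next timer event and feeding it the cycle counts it actually ran.
const int TIMER_PERIOD = 1000;      /* timer fires every 1000 CPU cycles (made up) */
int cycles_until_timer = TIMER_PERIOD;
void step(void) {
    /* Only let the CPU run as far as the next timer event. */
    int executed = CPU_execute(cycles_until_timer);
    cycles_until_timer -= executed;
    if (cycles_until_timer <= 0) {
        raise_timer_interrupt();                 /* assumed helper */
        cycles_until_timer += TIMER_PERIOD;
    }
}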

Related

What is the point of having metric mapping modes like MM_LOMETRIC and MM_LOENGLISH?

Page 47 of the book Programming with MFC (second edition) by Jeff Prosise (chapter 2: Drawing in a window) has the following statement:
One thing to keep in mind when you use the metric mapping modes is that on display screens, 1 logical inch usually doesn't equal 1 physical inch. In other words, if you draw a line that's 100 units long in the MM_LOENGLISH mapping mode, the line probably won't be exactly 1 inch long.
My question is, if Windows cannot give any guarantee on the physical dimensions of things we draw using metric mapping modes, then what is the point of having such a mapping mode? Are metric mapping modes relevant only for printers, and completely irrelevant for monitors?
With modern monitors and digital ports like HDMI/DisplayPort, can't the Windows OS get the physical dimensions of the screen, thus making it possible to draw things using metric dimensions (inches rather than pixels; note that the current resolution of the monitor will already be known to the OS)?
One of the ideas behind the logical inch is that the viewing distance to a monitor was typically larger than the distance to a printed page, so it made sense to have the default of a logical inch on a typical monitor be a bit larger than a physical inch, especially in an era where WYSIWYG was taking off. Rather than put all of the burden of adjusting for device resolution on the application, the logical inch lets a WYSIWYG application developer think in terms of distances and sizes on the printed page and not have to work in pixels or dots, which varied widely from device to device (and especially from monitor to printer).
Another issue was that, with the relatively limited resolutions of early monitors, it just wasn't practical to show legible text as small as typically printed text. For example, text was commonly printed at 6 lines per inch. At typical monitor resolutions, this might mean 12 pixels per line, which really limits font design and legibility (especially before anti-aliased and sub-pixel rendered text was practical). Making the logical inch default to 120-130% of an actual inch (on a typical monitor of the era) means lines of text would be 16 pixels high, making typographic niceties like serifs and italic more tenable (though still not pretty).
Also keep in mind that the user controls the logical inch and could very well set the logical inch so that it matches the physical inch if that suited their needs.
The logical units are still useful today, even as monitors have resolutions approaching those of older laser printers. Consider designing slides for a presentation that will be projected and also printed as handouts. The projection size is a function of the projector's optics and its distance from the screen. There's no way, even with two-way communication between the OS and the display device, for the OS to determine the actual physical size of the projected image (nor would it be useful for most applications).
I'm not a CSS expert, but it's my understanding that even when working in CSS's px units, you're working in a logical unit that may not be exactly the size of a physical pixel. It's supposed to take into account the actual resolution of the device and the typical viewing distance, allowing web designers to make the same 96-per-inch assumption that native application developers had long been using.
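For completeness, here is a minimal GDI sketch of the mapping mode the question asks about (standard Win32 calls; obtaining the HDC is assumed to happen elsewhere, e.g. in a paint handler). In MM_LOENGLISH one logical unit is 0.01 inch and the y axis points up, so a 100-unit line is one logical inch, which, per the quoted passage, is usually not exactly one physical inch on screen.
#include <windows.h>
void DrawOneLogicalInch(HDC hdc)
{
    SetMapMode(hdc, MM_LOENGLISH);      // 1 logical unit = 0.01 inch, y axis points up
    MoveToEx(hdc, 100, -100, NULL);     // note negative y: down from the origin
    LineTo(hdc, 200, -100);             // 100 units long = 1 logical inch
    // GetDeviceCaps(hdc, LOGPIXELSX) reports the assumed pixels per logical inch
    // (typically 96 or 120), not the true physical DPI of the panel.
}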

Looking for cool LED graphics routines that don't require arrays

I have made a 24 x 15 LED matrix using an Arduino, shift registers and TLC5940s.
The Arduino Uno has a measly 32 KB of flash and only 2 KB of SRAM, so the graphics are not stored in arrays beforehand. Rather, I write algorithms that generate artistic animations from math equations.
Example code for a rainbow sine wave is:
for (int iterations = 0; iterations < times; iterations++)
{
  val += PI / 500;  // 'val' (defined elsewhere in the sketch) shifts the wave right each frame
  for (int col = 0; col < NUM_COLS; col++)
  {
    // Select the current column via the shift registers.
    digitalWrite(layerLatchPin, LOW);
    shiftOut(layerDataPin, layerClockPin, MSBFIRST, colMasks[col] >> 16);
    shiftOut(layerDataPin, layerClockPin, MSBFIRST, colMasks[col] >> 8);
    shiftOut(layerDataPin, layerClockPin, MSBFIRST, colMasks[col]);
    digitalWrite(layerLatchPin, HIGH);

    Tlc.clear();
    // Choose which of the 15 LEDs in this column to light, following a sine wave across the columns.
    int rainbow1 = 7 + 7 * sin(2 * PI * col / NUM_COLS_TOTAL + val);
    setRainbowSinkValue(rainbow1, k);  // 'k' is defined elsewhere in the sketch
    Tlc.update();
  }
}
Here setRainbowSinkValue sets one of the LEDs from 1 to 15 to a certain colour, and val shifts the wave to the right on every iteration.
So I'm looking for simple graphics routines like this, in order to get cool animations without having to store everything in arrays, as frames of 15 x 24 x RGB data quickly use up the available RAM.
I will try to get an Arduino Mega, but let's assume that isn't an option for now.
How can I do it?
There are many effects you can get if you start to overlay simple functions like sin or cos. This guy creates the "plasma" effect which I think is always a cool thing to watch :)
Another way is to use noise functions to calculate the color of your pixels. You get a lot of examples if you google for "Arduino Perlin noise" (depending on your Arduino model you might not be able to get high framerates because Perlin noise requires some CPU power).
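For example, a classic plasma is just a few overlaid sines evaluated per pixel; a rough sketch follows (my own illustration: setPixelColour and the matrix dimensions are placeholders for whatever your driver code provides).
const int PLASMA_COLS = 24;   // matrix size, adjust to your build
const int PLASMA_ROWS = 15;
float t = 0;
void drawPlasmaFrame() {
  t += 0.05;
  for (int col = 0; col < PLASMA_COLS; col++) {
    for (int row = 0; row < PLASMA_ROWS; row++) {
      // Overlay three sines with different directions and speeds.
      float v = sin(col * 0.4 + t)
              + sin(row * 0.3 - t)
              + sin((col + row) * 0.25 + t / 2);
      // v is in [-3, 3]; map it onto 0..255 per channel with phase offsets.
      byte r = 128 + 127 * sin(v * PI);
      byte g = 128 + 127 * sin(v * PI + 2 * PI / 3);
      byte b = 128 + 127 * sin(v * PI + 4 * PI / 3);
      setPixelColour(col, row, r, g, b);   // placeholder output routine
    }
  }
}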
I've been working on similar graphics-style projects with the Arduino and have considered a variety of strategies to deal with the limited memory. Personally I find algorithmic animations rather banal and generic unless they are combined with other things or directed in some way.
At any rate, the two approaches I have been working on:
defining a custom format to pack the data as bits and then using bit-shifting to unpack it (a minimal sketch of this is shown below)
storing simple SVG graphics in PROGMEM and then using sprite techniques to move them around the screen (with screen wrap around etc.). By using Boolean operations to merge multiple graphics together it's possible to get animated layer effects and build up complexity/variety.
I only use single color LEDs so things are simpler conceptually and datawise.
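Since single-colour LEDs only need one bit per pixel, a minimal sketch of the bit-packing plus PROGMEM approach might look like this; the sprite data and drawPixel are made up for illustration.
#include <avr/pgmspace.h>
// An 8x8, 1-bit-per-pixel sprite stored in flash rather than SRAM.
const uint8_t smiley[8] PROGMEM = {
  0b00111100,
  0b01000010,
  0b10100101,
  0b10000001,
  0b10100101,
  0b10011001,
  0b01000010,
  0b00111100
};
void blitSprite(int x0, int y0) {
  for (int row = 0; row < 8; row++) {
    uint8_t bits = pgm_read_byte(&smiley[row]);   // fetch one packed row from flash
    for (int col = 0; col < 8; col++) {
      bool on = bits & (0x80 >> col);             // unpack one bit with a shifted mask
      drawPixel(x0 + col, y0 + row, on);          // placeholder output routine
    }
  }
}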
A good question but you're probably not going to find anything due to the nature of the platform.
You have the general idea to use algorithms to generate effects, so you should go ahead and write more crazy functions.
You could package your functions and make them available to everyone.
Also, if you allow it, use the serial port to communicate with a host that has more resources and can supply endless streams of patterns.
Using a transmitter and receiver will also work for connecting to another computer.
I will answer related questions but not exactly the question you asked because I am not a graphics expert....
First of all, don't forget PROGMEM, which allows you to store data in flash memory. There is a lot more flash than SRAM, and the usual thing to do, in fact, is to store extra data in flash.
Secondly, there are compression techniques available that will reduce your memory consumption. And these "compression" techniques are natural to the kinds of tasks you are doing anyway, so the word "compression" is a bit misleading.
First of all, we observe that because human perception of light intensity is exponential (shameless link to my own answer on this topic), depending on how exactly you use the LED drivers, you need not store the exact intensity. It looks like you are using only 8 bits of intensity on the TLC5940, not the full 12. For 8 bits of LED driver intensity you only have 8 or 9 perceptibly different intensity values (because the intensity you tell the LED driver to use is 2^perceptible_intensity), and 8 different values can be stored in only three bits. Storing three-bit chunks in bytes can be a bit of a pain, but you can still treat each "pixel" in your array as a uint16_t and store the entire colour information (three 3-bit channels) in it. That reduces your memory consumption to 2/3 of what it was.
Furthermore, you can palettize your image: each pixel is a byte (uint8_t) that indexes a place in a palette, which could be three bytes per entry if you'd like. The palette need not be large, and in fact you don't have to have a palette at all, which just means having a palette in code: your code knows how to transform a byte into a set of intensities. Then you generate the actual intensity values that the TLC5940 needs right before you shift them out.
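A minimal sketch of the palette idea (the palette contents and dimensions are just placeholders): each pixel is one byte indexing a small table of TLC5940 intensities, so a 15 x 24 frame costs 360 bytes of SRAM instead of 1080.
#include <stdint.h>
#define NUM_COLS 24
#define NUM_ROWS 15
struct Colour { uint16_t r, g, b; };          // 12-bit TLC5940 intensities (0..4095)
const Colour palette[8] = {                    // made-up 8-entry palette
  {0, 0, 0}, {4095, 0, 0}, {0, 4095, 0}, {0, 0, 4095},
  {4095, 4095, 0}, {0, 4095, 4095}, {4095, 0, 4095}, {4095, 4095, 4095}
};
uint8_t frame[NUM_ROWS][NUM_COLS];            // 360 bytes instead of 1080
// Expand one pixel to driver intensities just before shifting them out.
Colour pixelIntensity(int row, int col) {
  return palette[frame[row][col] & 0x07];     // mask keeps the index inside the palette
}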

How to make colours on one screen look the same as another

Given two separate computers, how could one ensure that colours are being projected roughly the same on each screen?
I.e., one screen might have 50% more brightness than another, so colours appear duller on one screen. One artist on one computer might be seeing a picture differently from another; it's important that they are seeing the same levels.
Is there some sort of calibration technique you can do via software? Any techniques? Or is a hardware solution the only way?
If you are talking about lab-critical calibration (that is, the colours on one monitor need to exactly match the colours on another, and both need to match an external reference as closely as possible) then a hardware colorimeter (with its own appropriate software and test targets) is the only solution. Software solutions can only get you so far.
The technique you described is a common software-only solution, but it's only for setting the gamma curves on a single device. There is no control over the absolute brightness and contrast; you are merely ensuring that solid colours match their dithered equivalents. That's usually done after setting the brightness and contrast so that black is as black as it can be and white is as white as it can be, but you can still distinguish not-quite-black from black and not-quite-white from white. Each monitor, then, will be optimized for its own maximum colour gamut, but it will not necessarily match any other monitor in the shop (even monitors that are the same make and model will show some variation due to manufacturing tolerances and age/use). A hardware colorimeter will (usually) generate a custom colour profile for the device under test as it is at the time of testing, and there is generally an end-to-end solution built into the product (so your scanner, printer, and monitor are all as closely matched as they can be).
You will never get to an absolute end-to-end match in a complete system, but hardware will get you as close as you can get. Software alone can only get you to a local maximum for the device it's calibrating, independent of any other device.
What you need to investigate are color profiles.
Wikipedia has some good articles on this:
https://en.wikipedia.org/wiki/Color_management
https://en.wikipedia.org/wiki/ICC_profile
The basic thing you need is the color profile of the display on which the color was seen. Then, with the color profile of display #2, you can take the original color and convert it into a color that will look as close as possible (depends on what colors the display device can actually represent).
Color profiles are platform independent and many modern frameworks support them directly.
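As a minimal illustration of that conversion, here is a sketch using LittleCMS (lcms2), one common open-source colour-management library; the .icc file names are placeholders for the two displays' profiles.
/* Convert an 8-bit RGB colour from display #1's profile to display #2's. */
#include <lcms2.h>
int main(void)
{
    cmsHPROFILE src = cmsOpenProfileFromFile("display1.icc", "r");
    cmsHPROFILE dst = cmsOpenProfileFromFile("display2.icc", "r");
    cmsHTRANSFORM xform = cmsCreateTransform(src, TYPE_RGB_8,
                                             dst, TYPE_RGB_8,
                                             INTENT_PERCEPTUAL, 0);
    unsigned char in[3]  = {200, 30, 30};   /* colour as seen on display #1 */
    unsigned char out[3];
    cmsDoTransform(xform, in, out, 1);      /* 'out' approximates it on display #2 */
    cmsDeleteTransform(xform);
    cmsCloseProfile(src);
    cmsCloseProfile(dst);
    return 0;
}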
You may be interested in reading about how Apple has dealt with this issue:
Color Programming Topics
https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/DrawColor/DrawColor.html
You'd have to allow or ask the individual users to calibrate their monitors. But there's enough variation across monitors - particularly between models and brands - that trying to implement a "silver bullet" solution is basically impossible.
As @Matt Ball observes, calibrating your monitors is what you are trying to do. Here's one way to do it without specialised hardware or software. For 'roughly the same', visual calibration against a reference image is likely to be adequate.
Getting multiple monitors of varying quality/brand/capabilities to render a given image the same way is simply not possible.
IF you have complete control over the monitor, video card, calibration hardware/software, and lighting used then you have a shot. But that's only if you are in complete control of the desktop and the environment.
Assuming you are just accounting for LCDs, they are built with different types of panels that have a host of different capabilities. Brightness is just one factor (albeit a big one). Another is simply the number of colors they are capable of rendering.
Beyond that, there is the environment that the monitor is in. Even assuming the same brand monitor and calibration points, a person will perceive a different color if an overhead fluorescent is used versus an incandescent placed next to the monitor itself. At one place I was at we had to shut off all the overheads and provide exact lamp placement for the graphic artists. Picky picky. ;)
I assume that you have no control over the hardware used, each user has a different brand and model monitor.
You have also no control over operating system color profiles.
An extravagant solution would be to display a test picture or pattern, and ask your users to take a picture of it using their mobile or webcam.
Download the picture to the computer, and check whether its levels are valid or too out of range.
This will also ensure that the ambient light at the office is appropriate.

Programming graphics and sound on PC - Total newbie questions, and lots of them!

This isn't exactly a programming question (or is it?), but I was wondering:
How are graphics and sound processed from code and output by the PC?
My guess for graphics:
There is some reserved memory space somewhere that holds exactly enough room for a frame of graphics output for your monitor.
IE: 800 x 600, 24 bit color mode == 800x600x3 = ~1.4MB memory space
Between each refresh, the program writes video data to this space. This action is completed before the monitor refresh.
Assume a simple 2D game: the graphics data is stored in machine code as many bytes representing color values. Depending on what the program(s) being run instruct the PC, the processor reads the appropriate data and writes it to the memory space.
When it is time for the monitor to refresh, it reads from each memory space byte-for-byte and activates hardware depending on those values for each color element of each pixel.
All of this of course happens crazy-fast, and repeats x times a second, x being the monitor's refresh rate. I've simplified my own likely-incorrect explanation by avoiding talk of double buffering, etc
Here are my questions:
a) How close is the above guess (the three steps)?
b) How could one incorporate graphics in pure C++ code? I assume the practical thing that everyone does is use a graphics library (SDL, OpenGL, etc.), but, for example, how do these libraries accomplish what they do? Would manual inclusion of graphics in pure C++ code (say, a 2D sprite) involve creating a two-dimensional array of bit values (or three-dimensional to include multiple RGB values per pixel)? Is this how it would be done waaay back in the day?
c) Also, continuing from above, do libraries such as SDL etc. that use bitmaps actually just build the bitmap files into the machine code of the executable and use them as though they were built in the same manner mentioned in question b) above?
d) In my hypothetical step 3 above, are there any registers involved? Like, could you write some byte value to some register to output a single color of one byte on the screen? Or is it purely dedicated memory space (=RAM) + hardware interaction?
e) Finally, how is all of this done for sound? (I have no idea :) )
a.
Your step 1: A long time ago, that was the case, but it hasn't been for quite a while. Most hardware will still support that type of configuration, but mostly as a fall-back -- it's not how they're really designed to work. Now most have a block of memory on the graphics card that's also mapped to be addressable by the CPU over the PCI/AGP/PCI-E bus. The size of that block is more or less independent of what's displayed on the screen, though.
Your step 2: Again, at one time that's how it mostly worked, but it's mostly not the case anymore.
Your step 3: Mostly right.
b. OpenGL normally comes in a few parts -- a core library that's part of the OS, and a driver that's supplied by the graphics chipset (or possibly card) vendor. The exact distribution of labor between the CPU and GPU varies somewhat though (between vendors, over time within products from a single vendor, etc.). SDL is built around the general idea of a simple frame buffer like you've described (see the sketch after this list of answers).
c. You usually build bitmaps, textures, etc., into separate files in formats specifically for the purpose.
d. There are quite a few registers involved, though the main graphics chipset vendors (ATI/AMD and nVidia) tend to keep their register-level documentation more or less secret (though this could have changed -- there's constant pressure from open source developers for documentation, not just closed-source drivers). Most hardware has capabilities like dedicated line drawing, where you can put (for example) line parameters into specified registers, and it'll draw the line you've specified. Exact details vary widely though...
e. Sorry, but this is getting long already, and sound covers a pretty large area...
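To make (b) concrete, here is a minimal SDL2 sketch of the "write pixel values into a buffer, then present it" model the question describes (error handling and surface locking omitted; it assumes the typical 32-bit window surface format).
#include <SDL.h>
#include <cstdint>
int main(int, char**)
{
    SDL_Init(SDL_INIT_VIDEO);
    SDL_Window*  win  = SDL_CreateWindow("fb", SDL_WINDOWPOS_CENTERED,
                                         SDL_WINDOWPOS_CENTERED, 800, 600, 0);
    SDL_Surface* surf = SDL_GetWindowSurface(win);   // the "reserved memory space"
    uint32_t* pixels = static_cast<uint32_t*>(surf->pixels);
    for (int y = 0; y < surf->h; ++y)
        for (int x = 0; x < surf->w; ++x)            // write one colour value per pixel
            pixels[y * (surf->pitch / 4) + x] =
                SDL_MapRGB(surf->format, x % 256, y % 256, 128);
    SDL_UpdateWindowSurface(win);                    // present the buffer
    SDL_Delay(2000);
    SDL_DestroyWindow(win);
    SDL_Quit();
    return 0;
}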
For graphics, Jerry Coffin's got a pretty good answer.
Sound is actually handled similarly to your (the OP's) description of how graphics is handled. At a very basic level, you have a "buffer" (some memory, somewhere).
Your software writes the sound you want to play into that buffer. It is basically an encoding of the position of the speaker cone at a given instant in time.
For "CD quality" audio, you have 44100 values per second (a "sample rate" of 44.1 kHz).
A little bit behind the write position, you have the audio subsystem reading from a read position in the buffer.
This read position will be a little bit behind the write position. The distance behind is known as the latency. A larger distance gives more of a delay, but also helps to avoid the case where the read position catches up to the write position, leaving the sound device with nothing to actually play!
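A self-contained sketch of that buffer arrangement (the buffer size, latency, and 440 Hz test tone are arbitrary; a real program would hand readSample's output to an audio API):
#include <cstdint>
#include <cmath>
#include <vector>
const int    SAMPLE_RATE = 44100;         // "CD quality", samples per second
const int    LATENCY     = 2048;          // distance kept between write and read positions
const double TWO_PI      = 6.283185307179586;
std::vector<int16_t> buffer(16384);       // the shared sound buffer
size_t writePos = LATENCY;                // start writing a little ahead of the reader
size_t readPos  = 0;
double phase    = 0.0;
// Application side: generate samples (here a 440 Hz tone) into the buffer.
void writeSamples(size_t count) {
    for (size_t i = 0; i < count; ++i) {
        buffer[writePos] = static_cast<int16_t>(30000 * std::sin(phase));
        phase += TWO_PI * 440.0 / SAMPLE_RATE;
        writePos = (writePos + 1) % buffer.size();
    }
}
// Audio-subsystem side: consume samples a little behind the writer.
int16_t readSample() {
    int16_t s = buffer[readPos];
    readPos = (readPos + 1) % buffer.size();
    return s;
}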

When using Direct3D, how much math is being done on the CPU?

Context: I'm just starting out. I'm not even touching the Direct3D 11 API, and instead looking at understanding the pipeline, etc.
From looking at documentation and information floating around the web, it seems like some calculations are being handled by the application. That is, instead of simply presenting the GPU with matrices to multiply, the calculations are being done by a math library that operates on the CPU. I don't have any particular resources to point to, although I guess I can point to the XNA Math Library or the samples shipped in the February DX SDK. When you see code like mViewProj = mView * mProj;, that projection is being calculated on the CPU. Or am I wrong?
If you were writing a program where you can have 10 cubes on the screen, and you can move or rotate the cubes as well as the viewpoint, what calculations would you do on the CPU? I think I would store the geometry for a single cube, and then transform matrices representing the actual instances. And then it seems I would use the XNA math library, or another of my choosing, to transform each cube in model space. Then get the coordinates in world space. Then push the information to the GPU.
That's quite a bit of calculation on the CPU. Am I wrong?
Am I reaching conclusions based on too little information and understanding?
What terms should I Google for, if the answer is STFW?
Or if I am right, why aren't these calculations being pushed to the GPU as well?
EDIT: By the way, I am not using XNA, but the documentation notes that the XNA Math Library replaces the previous DX math library. (I see the XNA Math Library in the SDK as essentially a template library.)
"Am I reaching conclusions based on too little information and understanding?"
Not as a bad thing, as we all do it, but in a word: Yes.
What is being done by the GPU is, generally, dependent on the GPU driver and your method of access. Most of the time you really don't care or need to know (other than curiosity and general understanding).
For mViewProj = mView * mProj;, this is most likely happening on the CPU. But it is not much of a burden (counted in 100's of cycles at the most). The real trick is the application of the new view matrix to the "world". Every vertex needs to be transformed, more or less, along with shading, textures, lighting, etc. All of this work will be done on the GPU (if done on the CPU, things slow down really fast).
Generally you make high level changes to the world, maybe 20 CPU bound calculations, and the GPU takes care of the millions or billions of calculations needed to render the world based on the changes.
In your 10 cube example: You supply a transform for each cube, and any math needed for you to create the transform is CPU bound (with exceptions). You also supply a transform for the view; again, creating the transform matrix might be CPU bound. Once you have your 11 new matrices you apply them to the world. From a hardware point of view the 11 matrices need to be copied to the GPU... that will happen very, very fast... once copied, the CPU is done and the GPU recalculates the world based on the new data, renders it to a buffer and poops it on the screen. So for your 10 cubes the CPU bound calculations are trivial.
Look at some reflected code for an XNA project and you will see where your calculations end and XNA begins (XNA will do everything it possibly can on the GPU).
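To put the CPU side into code: with XNA Math (or its successor DirectXMath, which keeps the same function names) the per-frame CPU work for the 10-cube scene is just building a handful of matrices, which are then handed to the GPU and applied to every vertex in the vertex shader. A sketch with made-up camera and layout values; the constant-buffer upload and the shaders themselves are omitted.
#include <DirectXMath.h>
using namespace DirectX;
void BuildMatrices()
{
    XMMATRIX view = XMMatrixLookAtLH(XMVectorSet(0.0f, 3.0f, -8.0f, 1.0f),  // eye
                                     XMVectorSet(0.0f, 0.0f,  0.0f, 1.0f),  // look-at target
                                     XMVectorSet(0.0f, 1.0f,  0.0f, 0.0f)); // up
    XMMATRIX proj = XMMatrixPerspectiveFovLH(XM_PIDIV4, 800.0f / 600.0f, 0.1f, 100.0f);
    XMMATRIX viewProj = XMMatrixMultiply(view, proj);   // the mView * mProj step, on the CPU
    for (int i = 0; i < 10; ++i)
    {
        // One cheap transform per cube on the CPU...
        XMMATRIX world = XMMatrixRotationY(0.1f * i) *
                         XMMatrixTranslation(2.0f * i - 9.0f, 0.0f, 0.0f);
        XMMATRIX wvp = world * viewProj;
        // ...then wvp would be copied to the GPU, which applies it to every vertex.
        (void)wvp;
    }
}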
