DirectWrite'ing glyphs such that the em square has a specific size - dpi

I'm working on an application that renders music notation. The musical symbol are specified in regular font files, which use the convention that the height of the em square corresponds to the height of a regular five-line staff of music. For example, the glyph for a note head is approximately 0.25 em high, the distance between two lines of the staff.
When it comes to rendering, I use a coordinate system in which 4 units corresponds to the height of a five-line staff of music. Therefore, I need to render glyphs such that the em square ends up rendered 4 units high. However DirectWrite only allows specifying text size in device independent pixels (DIPs) and I'm confused about how to juggle between the coordinate systems. There are two parts to this:
From a given font size in DIPs I can compute a height in physical pixels, but what is mapped to that height? The em square or some other design-space metric?
What if I'm using some arbitrary transformation matrix? How do I specify DIPs in order to get meaningful values in the coordinate system I am using?
And for good measure:
If get this to work, is this going to mess up font hinting because my DIP values don't have a clear relationship to physical pixels?

After some more experimentation and research, I have come to the following conclusions.
The font size specifies the size of the EM square as drawn. Drawing at 12 DIPs means that the EM square is scaled to use 12 DIPs of vertical space.
The top Y coordinate of the layoutRect parameter of the ID2D1RenderTarget::DrawText function is mapped to the top of the font's ascent (for the first line of text).
The identity matrix gives a coordinate system in which (0, 0) is the top-left and (width, height), as retrieved from ID2D1RenderTarget::GetSize, is the bottom-right, in DIPs. Which means for any transformation matrix, the font size unit should match the unit in the render target's coordinate system and a vertical line of 42 units will be as high as the EM square with a font size of 42 units.
I was unable to find information about the effect of arbitrary transformations on font hinting, however.

Related

Text Display Implementation Across Multiple Platforms

I have been scouring the internet for weeks trying to figure out exactly how text (such as what you are reading right now) is displayed to the screen.
Results have been shockingly sparse.
I've come across the concepts of rasterization, bitmaps, vector graphics, etc. What I don't understand, is how the underlying implementation works so uniformly across all systems (windows, linux, etc.) in way we as humans understand. Is there a specification defined somewhere? Is implementation code open source and viewable by the general public?
My understanding as of right now, is this:
Create a font with an external drawing program, one character at a time
Add these characters into a font file that is understood by language-specific libraries
These characters are then read from the file as needed by the GPU and displayed to the screen in a linear fashion as defined by the parenting code.
Additionally, if characters are defined in a font file such as 'F, C, ..., Z', how are vector graphics (which rely on a set of coordinate points) supported? Without coordinate points, rasterization would seem the only option for size changes.
This is about as far as my assumptions/research goes.
If you are familiar with this topic and can provide a detailed answer that may be useful to myself and other readers, please answer at your discretion. I find it fascinating just how much code we take for granted that is remarkably complicated under the hood.
The following provides an overview (leaving out lots of gory details):
Two key components for display of text on the Internet are (i) character encoding and (ii) fonts.
Character encoding is a scheme by which characters, such as the Latin capital letter "A", are assigned a representation as a byte sequence. Many different character encoding schemes have been devised in the past. Today, what is almost ubiquitously used on the Internet is Unicode. Unicode assigns each character to a code point, which is an integer value; e.g., Unicode assigns LATIN CAPITAL LETTER A to the code point 65, or 41 in hexadecimal. By convention, Unicode code points are referred to using four to six hex digits with "U+" as a prefix. So, LATIN CAPITAL LETTER A is assigned to U+0041.
Fonts provide the graphical data used to display text. There have been various font formats created over the years. Today, what is ubiquitously used on the Internet are fonts that follow the OpenType spec (which is an extension of the TrueType font format created back around 1991).
What you see presented on the screen are glyphs. An OpenType font contains data for the glyphs, and also a table that maps Unicode code points to corresponding glyphs. More precisely, the character-to-glyph mapping (or 'cmap') table maps Unicode code points to glyph IDs. The code points are defined by Unicode; the glyph IDs are a font-internal implementation detail, and are used to look up the glyph descriptions and related data in other tables.
Glyphs in an OpenType font can be defined as bitmaps, or (far more common) as vector outlines (Bezier curves). There is an assumed coordinate grid for the glyph descriptions. The vector outlines, then, are defined as an ordered list of coordinate pairs for Bezier curve control points. When text is displayed, the vector outline is scaled onto a display grid, based on the requested text size (e.g., 10 point) and pixel sizing on the display. A rasterizer reads the control point data in the font, scales as required for the display grid, and generates a bitmap that is put onto the screen at an appropriate position.
One extra detail about displaying the rasterized bitmap: most operating systems or apps will use some kind of filtering to give glyphs a smoother and more legible appearance. For example, a grayscale anti-alias filter will set display pixels at the edges of glyphs to a gray level, rather than pure black or pure white, to make edges appear smoother when the scaled outline doesn't align exactly to the physical pixel boundaries—which is most of the time.
I mentioned "at an appropriate position". The font has metric (positioning) information for the font as a whole and for each glyph.
The font-wide metrics will include a recommended line-to-line distance for lines of text, and the placement of the baseline within each line. These metrics are expressed in the units of the font's glyph design grid; the baseline corresponds to y=0 within the grid. To start a line, the (0,0) design grid position is aligned to where the baseline meets the edge of a text container within the page layout, and the first glyph is positioned.
The font also has glyph metrics. One of the glyph metrics is an advance width for each given glyph. So, when the app is drawing a line of text, it has a starting "pen position" at the start of the line, as described above. It then places the first glyph on the line accordingly, and advances the pen position by the amount of that first glyph's advance width. It then places the second glyph using the new pen position, and advances again. And so on as glyphs are placed along the line.
There are (naturally) more complexities in laying out lines of text. What I described above is sufficient for English text displayed in a basic text editor. More generally, display of a line of text can involve substitution of the default glyphs with certain alternate glyphs; this is needed, for example, when displaying Arabic text so that characters appear cursively connected. OpenType fonts contain a "glyph substitution" (or 'GSUB') table that provides details for glyph substitution actions. In addition, the positioning of glyphs can be adjusted for various reasons; for example, to position a diacritic glyph correctly over a letter. OpenType fonts contain a "glyph positioning" ('GPOS') table that provides the position adjustment data. Operating system platforms and browsers today support all of this functionality so that Unicode-encoded text for many different languages can be displayed using OpenType fonts.
Addendum on glyph scaling:
Within the font, a grid is set up with a certain number of units per em. This is set by the font designer. For example, the designer might specify 1000 units per em, or 2048 units per em. The glyphs in the font and all the metric values—glyph advance width, default line-to-line distinance, etc.—are all set in font design grid units.
How does the em relate to what content authors set? In a word processing app, you typically set text size in points. In the printing world, a point is a well defined unit for length, approximately but not quite 1/72 of an inch. In digital typography, points are defined as exactly 1/72 of an inch. Now, in a word processor, when you set text size to, say, 12 points, that really means 12 points per em.
So, for example, suppose a font is designed using 1000 design units per em. And suppose a particular glyph is exactly 1 em wide (e.g., an em dash); in terms of the design grid units, it will be exactly 1000 units wide. Now, suppose the text size is set to 36 points. That means 36 points per em, and 36 points = 1/2", so the glyph will print exactly 1/2" wide.
When the text is rasterized, it's done for a specific target device, that has a certain pixel density. A desktop display might have a pixel (or dot) density of 96 dpi; a printer might have a pixel density of 1200 dpi. Those are relative to inches, but from inches you can get to points, and for a given text size, you can get to ems. You end up with a certain number of pixels per em based on the device and the text size. So, the rasterizer takes the glyph outline defined in font design units per em, and scales it up or down for the given number of pixels per em.
For example, suppose a font is designed using 1000 units per em, and a printer is 1000 dpi. If text is set to 72 points, that's 1" per em, and the font design units will exactly match the printer dots. If the text is set to 12 points, then the rasterizer will scale down so that there are 6 font design units per printer dot.
At that point, the details in the glyph outline might not align to whole units in the device grid. The rasterizer needs to decide which pixels/dots get ink and which do not. The font can include "hints" that affect the rasterizer behaviour. The hints might ensure that certain font details stay aligned, or the hints might be instructions to move a Bezier control point by a certain amount based on the current pixels-per-em.
For more details, see Digitizing Letterform Designs and Font Engine from Apple's TrueType Reference Manual, which goes into lots of detail.

How to get the MFC view in millimeters (mm)?

I have to draw a line so that — for example — if the input width value of line is 20 mm then the width of the line drawn should be in 20 mm. I read in MFC documentation that the input width value that we provide is considered as units by the MFC and the drawn object values are in pixels. Can anyone tell me how to set and get the width scaling in mm .
You need to use SetMapMode.
The SetMapMode function sets the mapping mode of the specified device context. The mapping mode defines the unit of measure used to transform page-space units into device-space units, and also defines the orientation of the device's x and y axes.
Look at MM_LOMETRIC or MM_HIMETRIC:
Each logical unit is mapped to 0.01 millimeter. Positive x is to the right; positive y is up.
Towards the bottom of the article it states:
The MM_HIENGLISH, MM_HIMETRIC, MM_LOENGLISH, MM_LOMETRIC, and MM_TWIPS modes are useful for applications drawing in physically meaningful units (such as inches or millimeters).

What is the real definition of resolution?

I read everywhere that resolution is defined by the number of pixels on a screen.
But if you imagine 1000 x 1000 pixels on a screen the size of 20 skyscrapers and compare it to 999 x 999 pixels on a box of matches, the resolution would make the skyscrapers screen look 'low-res' and the box of matches screen look 'high-res'. Instinctively, I would say that the box of matches screen is higher resolution than the skyscrapers screen.
Am I wrong to say this? Is resolution definitely defined by the total number of pixels instead of the dots per inch?
Indeed, in the context of displays, the term resolution says nothing about pixel density. As stated in Wikipedia's article on Display Resolution:
The term "display resolution" is usually used to mean pixel dimensions, the number of pixels in each dimension (e.g. 1920 × 1080), which does not tell anything about the pixel density of the display on which the image is actually formed: broadcast television resolution properly refers to the pixel density, the number of pixels per unit distance or area, not total number of pixels. In digital measurement, the display resolution would be given in pixels per inch (PPI)
Definition of resolution varies according with the context. Every thing has a measurement unit.
When you talk about screens(Monitors), screen has pixels not dots thats why its resolution is measured in Pixels.
And When you talk about printing or video it is all about dots per inch. In your case, Box of Match is not a screen, its on printed paper.
For eg: you might have heard people saying DPI's(not resolution) while scanning documents.
So don't get yourself confused with the definition of resolution that is meant for different context.

texture mapping (u,v) values

Here is a excerpt from Peter Shirley's Fundamentals of computer graphics:
11.1.2 Texture Arrays
We will assume the two dimensions to be mapped are called u and v.
We also assume we have an nx and ny image that we use as the texture.
Somehow we need every (u,v) to have an associated color found from the
image. A fairly standard way to make texturing work for (u,v) is to
first remove the integer portion of (u,v) so that it lies in the unit
square. This has the effect of "tiling" the entire uv plane with
copies of the now-square texture. We then use one of the three
interpolation strategies to compute the image color for the
coordinates.
My question is: What are the integer portion of (u,v)? I thought u,v are 0 <= u,v <= 1.0. If there is an integer portion, shouldn't we be dividing u,v by the texture image width and height to get the normalized u,v values?
UV values can be less than 0 or greater than 1. The reason for dropping the integer portion is that UV values use the fractional part when indexing textures, where (0,0), (0,1), (1,0) and (1,1) correspond to the texture's corners. Allowing UV values to go beyond 0 and 1 is what enables the "tiling" effect to work.
For example, if you have a rectangle whose corners are indexed with the UV points (0,0), (0,2), (2,0), (2,2), and assuming the texture is set to tile the rectangle, then four copies of the texture will be drawn on that rectangle.
The meaning of a UV value's integer part depends on the wrapping mode. In OpenGL, for example, there are at least three wrapping modes:
GL_REPEAT - The integer part is ignored and has no meaning. This is what allows textures to tile when UV values go beyond 0 and 1.
GL_MIRRORED_REPEAT - The fractional part is mirrored if the integer part is odd.
GL_CLAMP_TO_EDGE - Values greater than 1 are clamped to 1, and values less than 0 are clamped to 0.
Peter O's answer is excellent. I want to add a high level point that the coordinate systems used in graphics are a convention that people just stick to as a defacto standard-- there's no law of nature here and it is arbitrary (but a decent standard thank goodness). I think one reason texture mapping is often confusing is that the arbitrariness of this stardard isn't obvious. This is that the image has a de facto coordinate system on the unit square [0,1]^2. Give me a (u,v) on the unit square and I will tell you a point in the image (for example, (0.2,0.3) is 20% to the right and 30% up from the bottom-left corner of the image). But what if you give me a (u,v) that is outside [0,1]^2 like (22.7, -13.4)? Some rule is used to make that on [0.1]^2, and the GL modes described are just various useful hacks to deal with that case.

Generating density map for tree growth rings

I was just wondering if someone know of any papers or resources on generating synthetic images of growth rings in trees. Im thinking 2d scalar-fields or some other data representation which can then be used to render growth rings like images :)
Thanks!
never done or heard about this ...
If you need simulation then search for biology/botanist sites instead.
If you need just visually close results then I would:
make a polygon covering the cut (circle/oval like shape)
start with circle and when all working try to add some random distortion or use ellipse
create 1D texture with the density
it will be used to fill the polygon via triangle fan. So first find an image of the tree type you want to generate for example this:
Analyze the color and intensity as a function of diameter so extract a pie like piece (or a thin rectangle)
and plot a graph of R,G,B values to see how the rings are shaped
then create function that approximate that (or use piecewise interpolation) and create your own texture as function of tree age. You can interpolate in this way booth the color and density of rings.
My example shows that for this tree the color is the same so only its intensity changes. In this case you do not need to approximate all 3 functions. The bumps are a bit noisy due to another texture layer (ignore this at start). You can use:
intensity=A*|cos(pi*t)| as a start
A is brightness
t is age in years/cycles (and also the x coordinate (scaled) in your 1D texture)
so take base color R,G,B multiply it by A for each t and fill the texture pixel with this color. You can add some randomness to ring period (pi*t) and also the scale can be matched more closely. This is linear growth ,... so you can use exponential instead or interpolate to match bumps per length affected by age (distance form t=0)...
now just render the polygon
mid point is the t=0 coordinate in texture each vertex of polygon is t=full_age coordinate in texture. So render the triangle fan with these texture coordinates. If you need more close match (rings are not the same thickness along the perimeter) then you can convert this to 2D texture
[Notes]
You can also do this incrementally so do just one ring per iteration. Next ring polygon is last one enlarged or scaled by scale>1 and add some randomness, but this needs to be rendered by QUAD STRIP. You can have static texture for single ring so interpolate just the density and overall brightness:
radius(i)=radius(i-1)+ring_width=radius(i-1)*scale
so:
scale=(radius(i-1)+ring_width)/radius(i-1)

Resources