Identifying teeth area within a mouth region in an image - colors

I am trying to do an image manipulation wherein the user would be prompted to enclose the mouth portion within an image. Once the user does that, my application should identify the pixels that represent the teeth (the color varying from white to yellow), and then I would like to brighten only those pixels. Could anyone give me guidance on how to proceed?

Your question is, quite honestly, very broad, as an adequate answer will touch on a large number of areas.
Nevertheless, what you are trying to attempt is called Pattern Recognition. More specifically, your problem falls under image analysis, dealing mainly with Template Matching:
Template matching is a technique in digital image processing for finding small parts of an image which match a template image. It can be used in manufacturing as a part of quality control, a way to navigate a mobile robot, or as a way to detect edges in images.
The Template Matching page has a C-like language sample algorithm which demonstrates what you are attempting to do (identify a specific color within an image).
As for how to go about this, generally speaking you will have to load an image, store it in an array, and then manipulate it as the algorithm suggests:
One way to perform template matching on color images is to decompose the pixels into their color components and measure the quality of match between the color template and search image using the sum of the absolute differences (SAD) computed for each color separately.
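To make that concrete, here is a minimal, brute-force sketch in Python with NumPy that scores every candidate position by summing absolute differences over all three color channels; the array shapes, dtype, and function name are assumptions rather than anyone's actual implementation:

```
import numpy as np

def sad_match(search_img: np.ndarray, template: np.ndarray):
    """Return the (row, col) of the best match, scored by the sum of
    absolute differences (SAD) over all three color channels."""
    H, W, _ = search_img.shape
    th, tw, _ = template.shape
    tmpl = template.astype(np.int32)
    best_pos, best_sad = (0, 0), float("inf")
    for r in range(H - th + 1):
        for c in range(W - tw + 1):
            window = search_img[r:r + th, c:c + tw].astype(np.int32)
            sad = np.abs(window - tmpl).sum()  # per-channel differences, summed
            if sad < best_sad:
                best_sad, best_pos = sad, (r, c)
    return best_pos
```

A real implementation would vectorize this or lean on an existing library, which is exactly the advice below.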
Of course, there are numerous projects in various languages that do that for you. My suggestion is to read up a bit more on the topic, pick a language, and attempt a solution using libraries as necessary.
One book that you might find very helpful is the classic Image Processing in C by Phillips, even if you don't want to use C. Why? Because it goes over a lot of the algorithmic details of how these techniques work and how to implement them. And it's free, too.

Related

Handle different layouts of documents using Kofax

I am new to the Kofax TotalAgility solution, but I am well aware of OCR, OMR, and recognition mechanisms.
I have two forms in one folder, A and B.
Both of them are identical, but due to manual scanning there is a slight axis shift, say a 20-pixel shift to the right, so the layout differs slightly.
The layouts of Image A and Image B are different; the position of the form on the page is not fixed.
I know other solutions like ABBYY FineReader provide FlexiLayout, where we can handle this by finding the text and setting up right/left/top/bottom offsets to automatically identify zones.
As I have only just started learning Kofax TotalAgility, I am unaware of all the options provided by Kofax Transformation Designer.
My question is: which locator should I use? I am currently working with the Advanced Zone Locator, and for the document I set as a reference (Image A), extraction works properly. But for the other (Image B), the text/box fields are not extracted because of the layout mismatch.
Can anyone point me in the right direction so I can handle this case properly?
I know I am asking for a direct option/solution; any help is highly appreciated.
In general, Kofax Transformations has two groups of locators:
Deterministic. You tell the locator precisely what to do, and how to do it (similar to an imperative approach when programming).
Probabilistic. You just tell your locator what to extract, and it works out the rest (based on AI).
When working with forms, you might be tempted to rely on forms-specific locators such as the Advanced Zone Locator. While this locator can account for fields "moving around", for example due to images being skewed, zoomed, or distorted, there are certain limitations. Other locators don't have these limitations - the Format Locator, for example, allows you to define a certain pattern (a regular expression) that should be matched, along with a keyword that has to be found somewhere around that pattern.
For your example, you could create a regex like M|F|X, and then define "Gender" as the keyword that needs to be present on the left.
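This isn't the Kofax API - just a plain-Python illustration of the pattern-plus-keyword idea, with the regex and sample OCR lines made up from the example above:

```
import re

# Value to the right of the keyword "Gender": one of M, F, or X.
GENDER_PATTERN = re.compile(r"Gender\s*[:.]?\s*\b(M|F|X)\b")

def locate_gender(ocr_lines):
    """Return the first gender letter found after the 'Gender' keyword."""
    for line in ocr_lines:
        match = GENDER_PATTERN.search(line)
        if match:
            return match.group(1)
    return None

print(locate_gender(["Name: Jane Doe", "Gender: F", "DOB: 01/01/1990"]))  # -> F
```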
However, any locator that's ruled by determinism follows Murphy's law - at some point that keyword might change. There could be different languages. And maybe additional letters for certain genders might be added, ultimately breaking your extraction logic.
Enter AI - while Murphy's law still applies when using Group Locators, the difference here is that users can train the system to pick up the new data. Said locator will automatically work out the best way to extract that piece of data. If you used a format locator, the customer would need to get back to you to add additional expressions, or have the keywords changed.
In your particular case, I'd try to use a Trainable Group Locator first. If you already know what you're looking for - for example SSNs that you have somewhere in a database, go for the Database Locator. Use Format Locators as a last resort, as tempting as they may be. Advanced Zone Locators are useful when you deal with forms, but I find myself using them almost exclusively for handprint or checkbox recognition.

Modifying a model and texture mid-game

Just have a question for anyone out there who knows some sort of game engine pretty well. What I am trying to implement is some sort of script or code that will allow me to make a custom game character and textures mid-game. A few examples would be along the lines of changing facial expressions and body part positions in the game SecondLife. I don't really need a particular language (feel free to use your favorite); I'm just looking for an example of how to go about this.
Also, I was wondering if there is any way to combine textures for optimization; for example, if I wanted to add a tattoo to a character mid-game, is there any code that could combine his body texture and the tattoo texture into one texture to use? (This way I can simply render one texture per body.)
Any tips would be appreciated; sorry if the question is a wee bit too vague.
I think that "swappable tattoos" are typically done as a second render pass of the polygons. You could do some research into "detail maps" and see if they provide what you're looking for.
As for actually modifying the texture data at runtime, all you need to do is composite the textures into a new one. More than likely you could even use the rendering API to do it for you: render the textures you want to combine, in the order you want to combine them, into a new texture. Mind you, doing this every frame would be counterproductive, since it's slower to render two textures into one and then draw the result than it is to just draw the two sources one after the other.
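For the one-off compositing step, a sketch along these lines would do it - done here on the CPU with Pillow for simplicity rather than through a rendering API, and with the file names and paste offset as placeholders:

```
from PIL import Image

# Load the base body texture and the tattoo decal (transparent where there is no ink).
body = Image.open("body_texture.png").convert("RGBA")
tattoo = Image.open("tattoo.png").convert("RGBA")

# Composite the tattoo onto a copy of the body texture at the desired UV offset,
# then hand the single combined texture to the renderer.
combined = body.copy()
combined.alpha_composite(tattoo, dest=(128, 256))  # offset is an assumption
combined.save("body_with_tattoo.png")
```

Do this once when the character's appearance changes and cache the result, not every frame.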

How does Nike's website do this Flash effect when the user selects a choice

I was wondering how the Nike website makes the change you can see when selecting a color or a sole. At first I thought they were only using images and that when the user picked a color you just replaced that part, but when I selected a different sole I noticed it didn't change like an image; it looked a bit more as if it was being rendered. Does anybody happen to know how this is made? Or where can I get further info about making this effect?
It's hard to know for sure, but my guess would be that they're using a rendering service similar to that provided by Adobe's Scene7.
It's a product that is used to colorize/customize a base product image based on user choices.
If you're interested in using the service, I'd suggest signing up for their weekly webinar. I attended one a while back and was very impressed with their offering. They showed the Converse site (which had functionality almost identical to the Nike site's) as a demo.
A lot of these tools are built out in Flash using a variety of techniques:
1) You can use Flash's BitmapData object to directly shift the hues of the pixels in your item (see the sketch after this list). This is probably the simplest technique but often limits you to simple color transformations.
2) You can pre-render transparent PNGs (or photos, I guess) containing the various patterns or textures you would want to show on your object and have them dynamically added to your stage at runtime. This, I think, offers the highest fidelity but means you need all of your items rendered upfront.
3) You can create 3D COLLADA files and load them via a library like Papervision3D, then dynamically change the texture at runtime. This is the most memory-intensive technique and tends to result in far worse fidelity, but in return you get a full 3D object that you can view in space.
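To make option 1 concrete without assuming ActionScript, here is the same pixel-level hue-shift idea sketched in Python with Pillow; the file name and hue offset are placeholders:

```
from PIL import Image

def shift_hue(path: str, offset: int) -> Image.Image:
    """Rotate the hue channel of an RGB image by `offset` (0-255, wraps around)."""
    hsv = Image.open(path).convert("HSV")
    h, s, v = hsv.split()
    h = h.point(lambda x: (x + offset) % 256)  # shift hue, keep saturation/value
    return Image.merge("HSV", (h, s, v)).convert("RGB")

shift_hue("shoe_sole.png", 64).save("shoe_sole_recolored.png")
```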
I'm sure there are other techniques but those are the top 3 I can think of. I hope that helps!

Is there an automated tool for making photo-mosaics (using images as pixels)?

I occasionally see portraits and other images which have been redrawn into an abstract form, where each pixel in the redrawn image is actually another, much smaller picture.
I am looking for a tool (or library) which can perform this type of transformation automatically. Does something like that exist?
You are thinking of a Photomosaic. You can use AndreaMosaic. There is a HowTo here.
I don't know about any existing tool, but the general algorithm would be to look at each pixel in the source image and, from a pool of pictures, select one to represent that pixel. You would probably want to select the image whose overall color is closest to the pixel's color. (So when you encounter a blue pixel, you select an image that is mostly blue to represent it, and so on.)
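Here is a bare-bones sketch of that algorithm in Python with Pillow and NumPy; the paths, grid size, tile size, and file pattern are all assumptions:

```
from pathlib import Path
from PIL import Image
import numpy as np

TILE = 32   # each source pixel becomes a 32x32 tile in the mosaic

def average_color(img):
    """Mean RGB color of an image as a float array of shape (3,)."""
    return np.asarray(img.convert("RGB"), dtype=np.float32).reshape(-1, 3).mean(axis=0)

def build_mosaic(source_path, tile_dir, out_path, grid=64):
    # Load the candidate tiles and precompute their average colors.
    tiles = [Image.open(p).convert("RGB").resize((TILE, TILE))
             for p in Path(tile_dir).glob("*.jpg")]
    tile_colors = np.stack([average_color(t) for t in tiles])   # (N, 3)

    # Downscale the source so each pixel maps to one tile.
    small = Image.open(source_path).convert("RGB").resize((grid, grid))
    pixels = np.asarray(small, dtype=np.float32)                # (grid, grid, 3)

    mosaic = Image.new("RGB", (grid * TILE, grid * TILE))
    for y in range(grid):
        for x in range(grid):
            # Nearest tile by squared Euclidean distance in RGB space.
            idx = int(np.argmin(((tile_colors - pixels[y, x]) ** 2).sum(axis=1)))
            mosaic.paste(tiles[idx], (x * TILE, y * TILE))
    mosaic.save(out_path)

build_mosaic("portrait.jpg", "tiles/", "mosaic.jpg")
```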

Preferable Tag Cloud Visualization Formats

Out of curiosity, I would love to know which tag cloud formats best serve the purpose of discovering more and more (relevant) content.
I am aware of 3 formats, but don't know which one is the best.
1) The Delicious one - color shading
2) The standard one with font-size variations
3) The one on this site - numbers showing importance/usage
So which ones do you prefer, and why?
Edit:
Thanks to the answers below, I now have a much better understanding of tag cloud visualization techniques.
4) Parallel Tag Clouds - a simple use of the parallel-coordinates technique. I find it more organized and readable.
5) Voronoi diagram - more useful for identifying tag relationships and making decisions based on them. Doesn't serve our purpose of discovering relevant content.
6) Mind maps - They are good and can be employed to filter content step by step.
I found some more interesting techniques here - http://www.cs.toronto.edu/~ccollins/research/index.html
I really do think that depends on the content of the information and the audience. What's relevant to one is not relevant to another. If an audience is more specialized, then they will be more likely to think along the same lines, but it would still need to be analyzed and catered to by the content provider.
There are also multiple paths that a person can take to "discover more". Take the tag "DNS" for example. You could drill down to more specific details like "UDP Port 53" and "MX Record", or you could go sideways with terms like "IP address", "Hostname", and "URL". A Voronoi diagram shows clusters, but wouldn't handle the case where general terms could be related to many concepts - "Hostname" mapping to "DNS", "HTTP", "SSH", etc.
I've noticed that in certain tag clouds there's usually one or two items that are vastly larger than the others. Those sorts of things could be served by a mind map, where one central concept has others radiating out from it.
For the cases of lots of "main topics" where a mind map is inappropriate, there are parallel coordinates but that would be baffling to many net users.
I think that if we found an extremely well organized way of sorting clusters of tags while preserving links between generalities and specificities, that would be somewhat helpful to AI research.
In terms of which I personally prefer, I think the numeric approach is nice because infrequently referenced tags are still presented at a readable font size. I also think SO does it this way because it has vastly more tags to cover than the average size-based cloud a la the standard format.
I would go with #2 out of the options you listed above.
1 - The human eye recognizes and comprehends size differences much more effectively than color when the color scale is along the same spectrum (i.e., various blues as opposed to discrete individual colors).
3 - Requires the user to scan the full list and mathematically compare each individual number while scanning. There is no real meaningful relationship between tags without a lot of work on the user's part.
So, going with #2, there are several considerations to take into account:
Keep the tags alphabetical. This affords the user another method of searching and establishes a known relationship between each (assuming they know the alphabet!). If they're unordered, it's just a crapshoot to find a single one.
If size comparison is absolutely critical (this usually isn't the case, as you can scale up each level by a certain percentage or pixel amount), use a monospaced font. Otherwise, certain letter combinations may end up looking larger than they actually are.
Don't include any commas, pipes, or other dividers. You're already going to have a lot of data in a small area - no need to clutter it up with debris. Space the tags out with a decent amount of padding, of course. Just don't double the number of visual elements by adding more than just the data.
Set a min/max font size and scale between those (see the sketch after this list). There are situations where one tag may be so popular that visually it would appear exponentially larger than the others. Likewise, you don't want a tag to end up rendering at 1px! Set the min/max and adjust between them as necessary.
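A tiny sketch of that min/max scaling in Python; the pixel bounds and tag counts are made-up numbers:

```
MIN_PX, MAX_PX = 11, 28   # assumed font-size bounds

def font_size(count, min_count, max_count):
    """Map a tag's usage count onto the [MIN_PX, MAX_PX] range."""
    if max_count == min_count:
        return (MIN_PX + MAX_PX) // 2
    scale = (count - min_count) / (max_count - min_count)   # 0.0 .. 1.0
    return round(MIN_PX + scale * (MAX_PX - MIN_PX))

counts = {"python": 480, "regex": 35, "dns": 7}
lo, hi = min(counts.values()), max(counts.values())
for tag, n in counts.items():
    print(tag, font_size(n, lo, hi), "px")
```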
A size-adjusted Voronoi diagram - it shows which tags are inter-related.
My favorite tag cloud format is the Wordle format. It looks great and it also does a pretty good job of fitting a lot of tags in a small space.
