Segmenting License Plate Characters #ImageProcessing - python-3.x

For a university project I have to segment characters from a license plate using Python. This sounds reasonably simple. However, the thing is that we are not allowed to use any sophisticated library functions such as cv2.findContours(). The basics such as cv2.imread() cv2.resize() cv2.rectangle() are allowed.
I have written a function that localizes a license plate in an image and outputs a result as can be seen in the images Output 1 and Output 2 . These are binary images.
As one can see. Sometimes, the output of this function is relatively clean (Output 2). However, often it is also noisy (Output 1)
For a clean image (Output 2) I have tried finding the columns that contain less than x black pixels in order to segment the characters. However, this only works when the image is clean. This is often not the case. Changing the x parameter here does not make significant improvement.
Does anybody have suggestions on how I can approach this problem?

For an elementary solution, you can form a profile by counting the black pixels on all vertical lines. Then look for maximas and minimas of the average count in a sliding interval on this profile. The interval length should be a fraction of the expected width of a character. Only the extrema with sufficient contrast should be considered.
To avoid the effect of surrounding features in rotated plates, you can restrict the counting to just a slice of the image.
Once you have approximate vertical limits between the characters, you can repeat a similar processing to get the bottom and top limits of the characters (the sliding interval is no more necessary).
Finally, you can refine the boxing by finding the horizontal limits in the rectangles so formed.

Related

How to identify joints in the profile of a shape?

I'm working on a system to automatically take 2D profiles of components and assemble them into 3D shapes.
Imagine given these pieces:
You want to make this shape:
I'm highlighting one of the components to show how they fit together.
I'm open to any suggestions on how to go about doing this but the current approach I'm attempting first finds joints that may fit together just by looking at the 2D profile.
How could I go about identifying the "tabs" from the polyline profile?
The same technique should also work on assemblies like such:
see How to compare two shapes?
so you basically trying to find the "same" sequences in polylines encoded in the polar increment format (turn angle, line length) and then just check if relative position of matched sequences are the same in both shapes ...
Beware that the locks might have some gap between the joined shapes to ensure assembly is possible... in same case the gap might be even negative (overlap) depends on material and function so You need to compare the sequences with some margin ...
Also I would divide each shape into its sides to speed up the process as the lock is most likely not crossing sides ...
You may define the "code" for a tab. For example:
3,C,5,C,3 would mean: Three units length, then turn 90º counter-clockwise, then 5 units length, then turn 90º counter-clockwise, then 3 units length.
Of course more identifiers than C can be used, for different angles and so.
A tab in another piece that fits in the tab of the first piece has the same (or very similar) 3,C,5,C,3 code
So, finding same code in both pieces may be a fit. Check if adjacents codes in both pieces also fit, and you're done.
Notice that pieces can be rotated. This case doesn't change the code, but may change the order of adjacents codes.

Recognizing license plate characters using template characters in Python

For a university project I have to recognize characters from a license plate. I have to do this using python 3. I am not allowed to use OCR functions or use functions that use deep learning or neural networks. I have reached the point where I am able to segment the characters from a license plate and transform them to a uniform format. A few examples of segmented characters are here.
The format of the segmented characters is very dependent on the input. However, I can easily convert this to uniform dimensions using opencv. Additionally, I have a set of template characters and numbers that I can use to predict what character / number it is.
I therefore need a metric to express the similarity between the segmented character and the reference image. In this way, I can say that the reference image with the highest similarity score matches the segmented character. I have tried the following ways to compute the similarity.
For these operations I have made sure that the reference characters and the segmented characters have the same dimensions.
A bitwise XOR-operator
Inverting the reference characters and comparing them pixel by pixel. If a pixel matches increment the similarity score, if a pixel does not match decrement the similarity score.
hash both the segmented character and the reference character using 'imagehash'. Consequently comparing the hashes and see which ones are most similar.
None of these methods succeed to give me an accurate prediction for all characters. Most characters are usually correctly predicted. However, the program confuses characters like 8-B, D-0, 7-Z, P-R consistently.
Does anybody have an idea how to predict the segmented characters? I.e. defining a better similarity score.
Edit: Unfortunately, cv2.matchTemplate and cv2.matchShapes are not allowed for this assignment...
The general procedure for comparing two images consists in the extraction of features from the two images and their subsequent comparison. What you are actually doing in the first two methods is considering the value of every pixel as a feature. The similarity measure is therefore a distance-computation on a space of very high dimension. This methods are, however, subject to noise and this requires very big datasets in order not to obtain acceptable results.
For this reason, usually one attempts to reduce the space dimensionality. I'm not familiar with the third method, but it seems to go in this direction.
A way to reduce the space dimensionality consists in defining some custom features meaningful for the problem you are facing.
A possibility for the character classification problem could be to define features that measure the response of the input image on strategic subshapes of the characters (an upper horizontal line, a lower one, a circle in the upper part of the image, a diagonal line, etc.).
You could define a minimal set of shapes that, combined together, can generate every character. Then you should retrieve one feature for each shape, by measuring the response (i.e., integrating the signal of the input image inside the shape) of the original image on that particular shape. Finally, you should determine the class which the image belongs to by taking the nearest reference point in this, smaller, space of the features.

How to clean border noise from a license plate using filters

I am filtering an image so that it removes noise from them.
This image corresponds to a patent plate, and to detect the letters I need them to be without noise.
Original image:
Output:
Any way to make that 5 able to remove white part from above? or decrease it
I have a couple of images like this with that problem, which occurs when I transform the image horizontally. Any help is welcome.
With just low-level operations (filters), you can't reduce the black area on the top because is it of the same nature as the characters themselves. Any action you take against this zone will also damage the characters. No filter will work satisfactorily.
Hence you must use some extra contextual information such as "against the top edge", and possibly "forming a straight edge". Even so, finding the exact border with the 5 is challenging.

A better Greyscale algorithm

I'm trying to create a spectral image with a constant grey-scale value for every row. I've written some fantastically slow code that basically tries 1000 different variation between black and white for a given hue and it finds the one whose grey-scale value most closely approximates the target value, resulting in the following image:
On my laptop screen (HP) there is a very noticeable 'dip' near the blue peak, where blue pixels near the bottom of the image appear much brighter than the neighbouring purple and cyan pixels. On my second screen (Acer, which has far superior colour display) the dip is smaller, but still there.
I use the following function to compute the grey-scale approximation of a colour:
Math.Abs(targetGrey - (0.2989 * R + 0.5870 * G + 0.1140 * B))
when I convert the image to grey-scale using Paint.NET, I get a perfect black to white gradient, so that part of the code at least works.
So, question: Is this purely an artefact of the display qualities of my screens? Or can the above mentioned grey-scale algorithm be improved upon to give a visually more consistent result?
EDIT: The problem seems to be mostly monitor calibration. Not, I repeat not, a problem with the code.
I'm wondering if its more to do with the way our eyes interpret the colors, rather than screen artifacts.
That said... I am using a very-high quality screen (Dell Ultrasharp, IPS) that has incredible color reproduction and I'm not sure what you mean by "dip" in the blue peak. So either I'm just not noticing it, or my screen doesn't show the same picture and it more color-accurate.
The output looks correct given the greyscale conversion you have used (which I believe is the standard one for sRGB colour spaces).
However - there are lots of tradeoffs in colour models and one of these is that you can get results which aren't visually quite what you want. In your case, the fact that there is a very low blue weight means that a greater amount of blue is needed to get any given greyscale value, hence the blue seems to start lower, at least in terms of how the human eye perceives it.
If your objective is to get a visually appealing spectral image, then I'd suggest altering your function to make the R,G,B weights more equal, and see if you like what you get.

Help with the theory behind a pixelate algorithm?

So say I have an image that I want to "pixelate". I want this sharp image represented by a grid of, say, 100 x 100 squares. So if the original photo is 500 px X 500 px, each square is 5 px X 5 px. So each square would have a color corresponding to the 5 px X 5 px group of pixels it swaps in for...
How do I figure out what this one color, which is best representative of the stuff it covers, is? Do I just take the R G and B numbers for each of the 25 pixels and average them? Or is there some obscure other way I should know about? What is conventionally used in "pixelation" functions, say like in photoshop?
If you want to know about the 'theory' of pixelation, read up on resampling (and downsampling in particular). Pixelation algorithms are simply downsampling an image (using some downsampling method) and then upsampling it using nearest-neighbour interpolation. Note that in code these two steps may be fused into one.
For downsampling in general, to downsample by a factor of n the image is first filtered by an appropriate low-pass filter, and then one sample out of every n is taken. An "ideal" filter to use is the sinc filter, but because of issues with implementing it, the Lanczos filter is often used as a close alternative.
However, for almost all purposes when doing pixelization, using a simple box blur should work fine, and is very simple to implement. This is just an average of nearby pixels.
If you don't need to change the output size of the image, then this means you divide the image into blocks (the big resulting pixels) which are k×k pixels, and then replace all the pixels in each block with the average value of the pixels in that block.
when the source and target grids are so evenly divisible and aligned, most algorigthms give similar results. if the grids are fixed, go for simple averages.
in other cases, especially when resizing by a small percentage, the quality difference is quite evident. the simplest enhancement over simple average is weighting each pixel value considering how much of it's contained in the target pixel's area.
for more algorithms, check multivariate interpolation

Resources