How to count small seeds of two different kinds in an image? - python-3.x

I have an image (below) in which there are about 503 chia seeds and 4 black pepper seeds:
I am trying to count both of them. I am familiar with counting one type of object by taking the size of a single seed and dividing the total area covered by the seeds by it. This is a simple approach, but it works just fine.
For the image above, I found the grain size using a thresholding approach; it turns out to be 197 units. Using the code below, I can find the total number of seeds:
from PIL import Image
import numpy as np

image = Image.open('./shapes/S__14155926.jpg')
arr = np.array(image)
# Count dark pixels (blue channel below 100) and divide by one seed's area.
nseeds = np.sum(arr[..., 2] < 100) / 197
print(nseeds)
504.58468309859154
The number is in the ballpark, and a couple of wrong counts here and there are no issue. However, how do I get the output classified as ~500 small seeds and 4 large seeds without having to train a CNN model? I don't wish to recognize the seeds, just detect and count them.

Both seeds are very similar in colour, which makes thresholding a bit challenging. Perhaps passing the seeds through a strainer would be easier, so you could apply your technique to two separate images. If the seeds were nicely separated from each other, you could try blob detection.
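If the seeds were separated, blob detection reduces to connected-component labeling followed by an area cutoff. Here is a toy stdlib sketch of that idea; the grid, the cutoff of 3 pixels, and the helper names are all invented for illustration (in practice something like `cv2.connectedComponentsWithStats` would do the labeling on the real binary image):

```python
from collections import deque

def label_blobs(grid):
    """Return the sizes of all 4-connected components of 1s in a 0/1 grid."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    sizes = []
    for y in range(h):
        for x in range(w):
            if grid[y][x] == 1 and not seen[y][x]:
                # Flood-fill one component with BFS, counting its pixels.
                q = deque([(y, x)])
                seen[y][x] = True
                size = 0
                while q:
                    cy, cx = q.popleft()
                    size += 1
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w and grid[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                sizes.append(size)
    return sizes

# One 1-pixel "chia" blob and one 4-pixel "pepper" blob.
grid = [
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
sizes = label_blobs(grid)
small = sum(1 for s in sizes if s < 3)   # area cutoff chosen for this toy grid
large = sum(1 for s in sizes if s >= 3)
```

Classifying blobs by area is exactly the "two size classes" split the question asks for, without any training.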

First of all, I find the total area of all seeds:
import cv2

img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Invert the threshold so the (dark) seeds become white foreground.
img_bin = 255 - cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)[1]
S = cv2.moments(img_bin)
S['m00']
and get that the area equals 26607210.0 (the units cancel in the ratios below).
Then I cut out 5 small seeds and find their combined area: 175185.0. Thus one small seed has an area of 35037.0.
Since a big seed contains about 8 small seeds (measured the same way as above), I get that there are approximately 728 small seeds and 4 big seeds.
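The estimate in this answer can be reproduced in a few lines (it lands around 727-728 depending on rounding):

```python
# Reproducing the answer's arithmetic: total foreground area divided by
# the area of one small seed, with each big seed counted as ~8 small ones.
total_area = 26607210.0      # m00 moment of the binary image
five_small = 175185.0        # combined area of 5 small seeds cut out of the image
one_small = five_small / 5   # 35037.0
n_big = 4                    # counted by eye
small_per_big = 8            # one big seed covers about 8 small-seed areas

n_small = total_area / one_small - n_big * small_per_big
print(round(n_small))        # about 727, vs. ~728 quoted in the answer
```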

Related

Compute maximum edge length for subdividing mesh

I have two triangulated meshes m and m1 with the following properties:
m_faces: 16640
m_points: 49920
m_surface_area: 178.82696989147524
m1_faces: 8
m1_points: 24
m1_surface_area: 1.440205667851934
Now I would like to subdivide m1 so that it has approx. the same faces number as m, i.e. 16640. I am using the vtk library and more specifically the vtkAdaptiveSubdivisionFilter() function which according to the description:
…is a filter that subdivides triangles based on maximum edge length and/or triangle area.
My question is how to compute the maximum edge length. From some trial and error, I found that it needs to be a value between [0.0188-0.0265], which gives me 16384 faces. However, I couldn't find any formulation that yields a number in this range and is consistent across different cases. Any idea how to calculate this maximum edge length each time?
On another example I have the following two meshes:
Sph1_faces: 390
Sph1_points: 1170
Sph1_surface_area: 1.9251713393584104
Sph2_faces: 1722
Sph2_points: 5166
Sph2_surface_area: 10.59400389764954
And to get Sph1's face count close to Sph2's, the maximum edge length should be between [0.089-0.09], which gives me 1730 faces for Sph1.
I've tried using the equilateral-triangle area formula, making the corresponding assumption, then solving for the side length and dividing by the number of faces or points, but it didn't seem to work. Any other idea would be appreciated.
Thanks.
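Absent a closed-form value, the trial and error described above can be automated: the face count only grows as the maximum edge length shrinks, so a bisection over the limit converges quickly. A sketch is below; `count_faces` is a hypothetical callback that would wrap `vtkAdaptiveSubdivisionFilter` (set the maximum edge length, update, count output cells), and here it is replaced by a mock so the logic is testable:

```python
def find_edge_length(count_faces, target, lo, hi, iters=40):
    """Bisect for an edge-length limit whose face count is close to `target`.

    Assumes count_faces is monotonically decreasing in the edge length,
    and that the target count is bracketed by [lo, hi].
    """
    for _ in range(iters):
        mid = (lo + hi) / 2
        if count_faces(mid) > target:
            lo = mid    # too many faces -> allow longer edges
        else:
            hi = mid    # too few faces  -> force shorter edges
    return (lo + hi) / 2

# Mock in place of the real filter: faces ~ C / edge_length**2.
mock = lambda s: 4.0 / s ** 2
s = find_edge_length(mock, target=16640, lo=1e-3, hi=1.0)
```

This sidesteps the need for a formula that holds across meshes, at the cost of running the subdivision filter a few dozen times.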

Using CNN with Dataset that has different depths between volumes

I am working with medical images: I have 130 patient volumes, each consisting of N DICOM images/slices.
The problem is that the number of slices N varies between volumes.
About 50% of the volumes have 20 slices; the rest differ by 3 or 4 slices, and some by more than 10 (so much so that interpolating to equalize the slice count across volumes is not feasible).
I am able to use Conv3d when the depth N is the same across volumes, but I have to make use of the entire dataset for the classification task. So how do I incorporate the entire dataset and feed it to my network model?
If I understand your question, you have 130 3-dimensional images, which you need to feed into a 3D ConvNet. I'll assume your batches, if N was the same for all of your data, would be tensors of shape (batch_size, channels, N, H, W), and your problem is that your N varies between different data samples.
So there are two problems. First, your model needs to handle data with different values of N. Second, there's the more implementation-related problem of batching data of different lengths.
Both problems come up in video classification models. For the first, I don't think there's a way of getting around interpolating SOMEWHERE in your model (unless you're willing to pad/cut/sample) -- if you're doing any kind of classification task, you pretty much need a constant-sized layer at your classification head. However, the interpolation doesn't have to happen right at the beginning. For example, if for an input tensor of size (batch, 3, 20, 256, 256) your network conv-pools down to (batch, 1024, 4, 1, 1), then you can perform an adaptive pool (e.g. https://pytorch.org/docs/stable/nn.html#torch.nn.AdaptiveAvgPool3d) right before the output to downsample everything larger to that size before prediction.
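The adaptive-pool idea is easiest to see in one dimension. Here is a pure-Python sketch along the depth axis, using the same floor/ceil binning rule that PyTorch documents for `AdaptiveAvgPool1d` (no torch required):

```python
def adaptive_avg_pool_1d(xs, out_size):
    """Average-pool a sequence of any length N down to exactly out_size values."""
    n = len(xs)
    out = []
    for i in range(out_size):
        # Bin i covers input indices [floor(i*n/k), ceil((i+1)*n/k)).
        start = (i * n) // out_size
        end = -(-((i + 1) * n) // out_size)   # ceiling division
        chunk = xs[start:end]
        out.append(sum(chunk) / len(chunk))
    return out

adaptive_avg_pool_1d([1, 2, 3, 4, 5, 6], 3)  # -> [1.5, 3.5, 5.5]
adaptive_avg_pool_1d([1, 2, 3, 4, 5], 3)     # N=5 also maps to 3 outputs
```

Whatever N each volume has, the output length is fixed, which is exactly what lets the classification head stay constant-sized.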
The other option is padding and/or truncating and/or resampling the images so that all of your data is the same length. For videos, sometimes people pad by looping the frames, or you could pad with zeros. What's valid depends on whether your length axis represents time, or something else.
For the second problem, batching: If you're familiar with pytorch's dataloader/dataset pipeline, you'll need to write a custom collate_fn which takes a list of outputs of your dataset object and stacks them together into a batch tensor. In this function, you can decide whether to pad or truncate or whatever, so that you end up with a tensor of the correct shape. Different batches can then have different values of N. A simple example of implementing this pipeline is here: https://github.com/yunjey/pytorch-tutorial/blob/master/tutorials/03-advanced/image_captioning/data_loader.py
Something else that might help with batching is putting your data into buckets depending on their N dimension. That way, you might be able to avoid lots of unnecessary padding.
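The pad-to-max logic of such a `collate_fn` can be sketched without torch. All names below are illustrative (a real `collate_fn` would stack the result into tensors rather than return nested lists):

```python
def collate_pad(samples):
    """Zero-pad each volume's depth axis up to the longest volume in the batch.

    Each sample is a (volume, label) pair, where volume is a list of N
    slices and N varies between samples.
    """
    max_n = max(len(vol) for vol, _ in samples)
    batch, labels = [], []
    for vol, label in samples:
        zero_slice = [0.0] * len(vol[0])
        # Append zero slices until this volume matches the batch maximum.
        batch.append(vol + [zero_slice] * (max_n - len(vol)))
        labels.append(label)
    return batch, labels

a = ([[1.0, 1.0]] * 20, 0)   # volume with N=20 slices of 2 voxels each
b = ([[2.0, 2.0]] * 17, 1)   # volume with N=17 slices
batch, labels = collate_pad([a, b])
```

Combined with bucketing by N, most batches would need little or no padding at all.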
You'll need to flatten the dataset and treat every individual slice as an input to the CNN. You can encode each variable as a boolean Yes/No flag if it is categorical; if it is numerical, you can use the equivalent of none (usually 0) for missing values.

How to reduce an unknown size data into a fixed size data? Please read details

Example:
Given n images marked 1 to n, where n is unknown, I can calculate a property of every image, which is a scalar quantity. Now I have to represent this property across all images as a fixed-size vector (say 5 or 10).
One naive approach could be this vector: [avg max min std_deviation]
I also want to include the effect of the relative positions of those images.
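The naive summary vector from the question can be sketched with the stdlib; the extra position feature at the end is just one illustrative way to fold in relative position (an assumption, not part of the question):

```python
from statistics import mean, pstdev

def summarize(props):
    """Collapse n scalar properties into a fixed-size vector, for any n."""
    return [mean(props), max(props), min(props), pstdev(props)]

def summarize_with_position(props):
    """Same, plus the normalized index of the maximum as a crude position feature."""
    return summarize(props) + [props.index(max(props)) / len(props)]

summarize([3.0, 1.0, 4.0, 1.0, 5.0])   # always 4 numbers, whatever n is
```

This is exactly the kind of hand-crafted feature extraction that PCA or auto-encoders would learn instead.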
What you are looking for is called feature extraction.
There are many techniques for this. For images, you could try:
PCA
Auto-encoders
Convolutional Auto-encoders, 1 & 2
You could also look into conventional (older) methods like SIFT, HOG, or edge detection, but they all need an extra step to reduce their output to a smaller fixed size.

Reducing / Enhancing known features in an image

I am a microbiology student new to computer vision, so any help will be extremely appreciated.
This question involves microscope images that I am trying to analyze. The goal is to count bacteria in an image, but I first need to pre-process the image to enhance any bacteria that are not fluorescing very brightly. I have thought about several techniques, like enhancing the contrast or sharpening the image, but they aren't exactly what I need.
I want to reduce the noise (black spaces) to 0s on the RGB scale and enhance the green spaces. I originally wrote a for loop in OpenCV with threshold limits to change each pixel, but I know there is a better way.
Here is an example that I did in Photoshop of the original image vs. what I want: original image and enhanced image.
I need to learn to do this in a Python environment so that I can automate the process. As I said, I am new, but I am familiar with Python's OpenCV, mahotas, numpy, etc., so I am not attached to a particular package. I am also very new to these techniques, so I am open to you just pointing me in the right direction.
Thanks!
You can have a look at histogram equalization. This would emphasize the green and reduce the black range. There is an OpenCV tutorial here. Afterwards you can experiment with different thresholding mechanisms to best isolate the bacteria.
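To see what histogram equalization does under the hood, here is a numpy sketch of the classic remapping through the normalized cumulative histogram (equivalent in spirit to `cv2.equalizeHist` on a single channel; the tiny test image is made up):

```python
import numpy as np

def equalize(img):
    """Histogram-equalize a uint8 grayscale image via its cumulative histogram."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[np.nonzero(cdf)][0]          # count at the first occupied gray level
    # Classic mapping: stretch occupied levels onto the full 0..255 range.
    lut = np.clip(np.round((cdf - cdf_min) / (img.size - cdf_min) * 255), 0, 255)
    return lut.astype(np.uint8)[img]

img = np.array([[50, 50, 60], [60, 60, 200]], dtype=np.uint8)
eq = equalize(img)   # darkest level maps to 0, brightest to 255
```

The dark background gets pushed toward 0 and the bright fluorescing regions toward 255, which is exactly the separation the question asks for.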
Use TensorFlow:
Create your own dataset with images of bacteria and their positions stored in accompanying text files (the bigger the dataset, the better).
Create a positive and a negative set of images.
Update the default TensorFlow example with your images.
Make sure you have a bunch of convolution layers.
Train and test.
TensorFlow is perfect for such tasks, and you don't need to worry about different intensity levels.
I initially tried histogram equalization but did not get the desired results, so I used adaptive thresholding with a mean filter:
th = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, 3, 2)
Then I applied the median filter:
median = cv2.medianBlur(th, 5)
Finally I applied morphological closing with the ellipse kernel:
k1 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
closed = cv2.morphologyEx(median, cv2.MORPH_CLOSE, k1, iterations=3)
THIS PAGE will help you modify this result however you want.

Help with the theory behind a pixelate algorithm?

So say I have an image that I want to "pixelate": I want the sharp image represented by a grid of, say, 100 x 100 squares. If the original photo is 500 px x 500 px, each square is 5 px x 5 px, and each square would have a colour corresponding to the 5 px x 5 px group of pixels it swaps in for.
How do I figure out what this one colour, best representative of the pixels it covers, is? Do I just take the R, G, and B values of each of the 25 pixels and average them? Or is there some other, more obscure way I should know about? What is conventionally used in "pixelation" functions, say in Photoshop?
If you want to know about the 'theory' of pixelation, read up on resampling (and downsampling in particular). Pixelation algorithms are simply downsampling an image (using some downsampling method) and then upsampling it using nearest-neighbour interpolation. Note that in code these two steps may be fused into one.
For downsampling in general, to downsample by a factor of n the image is first filtered by an appropriate low-pass filter, and then one sample out of every n is taken. An "ideal" filter to use is the sinc filter, but because of issues with implementing it, the Lanczos filter is often used as a close alternative.
However, for almost all purposes when doing pixelization, using a simple box blur should work fine, and is very simple to implement. This is just an average of nearby pixels.
If you don't need to change the output size of the image, then this means you divide the image into blocks (the big resulting pixels) which are k×k pixels, and then replace all the pixels in each block with the average value of the pixels in that block.
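The block-averaging step just described fits in a few numpy lines: average each k x k block, then blow each average back up with nearest-neighbour repetition. (This sketch assumes the image dimensions are divisible by k; for RGB you would apply it per channel.)

```python
import numpy as np

def pixelate(img, k):
    """Replace every k x k block of a 2-D image with that block's mean value."""
    h, w = img.shape
    # reshape exposes the blocks as separate axes, so mean() collapses each block.
    small = img.reshape(h // k, k, w // k, k).mean(axis=(1, 3))
    # Nearest-neighbour upsampling back to the original size.
    return np.repeat(np.repeat(small, k, axis=0), k, axis=1)

img = np.arange(16, dtype=float).reshape(4, 4)
out = pixelate(img, 2)   # each 2x2 block replaced by its mean
```

The `reshape`/`mean` pair is the box-blur-plus-subsample fused into one step, and `np.repeat` is the nearest-neighbour upsample.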
When the source and target grids are so evenly divisible and aligned, most algorithms give similar results. If the grids are fixed, go for simple averages.
In other cases, especially when resizing by a small percentage, the quality difference is quite evident. The simplest enhancement over a simple average is weighting each pixel value by how much of it is contained in the target pixel's area.
For more algorithms, check multivariate interpolation.
