Apply function to masked region - python-3.x

I have an image like this:
I have both the mask and the original image. I would like to calculate the colour temperature of ONLY the ducks region.
Right now, I'm iterating through each row and column of the image below and getting pixels where their values are not zero. But I think this isn't the right way to do this. Any suggestions?
What I did was:
xyzImg = cv2.cvtColor(resImage, cv2.COLOR_BGR2XYZ)
x,y,z = cv2.split(xyzImg)
xList=[]
yList=[]
zList=[]
rows=x.shape[0]
cols=x.shape[1]
for i in range(rows):
    for j in range(cols):
        if (x[i][j]!=0) and (y[i][j]!=0) and (z[i][j]!=0):
            xList.append(x[i][j])
            yList.append(y[i][j])
            zList.append(z[i][j])
xAvg = np.mean(xList)
yAvg = np.mean(yList)
zAvg = np.mean(zList)
xs = xAvg / (xAvg + yAvg + zAvg)
ys = yAvg / (xAvg + yAvg + zAvg)
xyChrome = np.array([xs,ys])
But this is very slow and I don't think it's right...

The simplest way is to use the cv2.mean() function.
It takes two arguments, src (with 1 to 4 channels) and mask, and returns a vector with the mean value of each channel, computed only over the pixels where the mask is non-zero.
Refer to cv2::mask
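A minimal sketch of the masked-mean idea, written with NumPy boolean indexing so that it runs without OpenCV. Calling cv2.mean(xyzImg, mask=mask) should give the same per-channel means in its first three values. The function name masked_chromaticity and the tiny 2x2 test image are made up for illustration; they are not from the question.

```python
import numpy as np

def masked_chromaticity(xyz_img, mask):
    """Return the (x, y) chromaticity of the pixels where mask is non-zero."""
    # Boolean indexing keeps only the masked pixels: shape (n_pixels, 3).
    pixels = xyz_img[mask > 0].astype(np.float64)
    # Per-channel mean over the masked region (X, Y, Z).
    x_avg, y_avg, z_avg = pixels.mean(axis=0)
    total = x_avg + y_avg + z_avg
    return np.array([x_avg / total, y_avg / total])

# Hypothetical 2x2 XYZ image; only the top row lies inside the mask.
xyz_img = np.array([[[10, 20, 30], [30, 40, 50]],
                    [[0, 0, 0], [200, 200, 200]]], dtype=np.uint8)
mask = np.array([[255, 255], [0, 0]], dtype=np.uint8)

xy_chroma = masked_chromaticity(xyz_img, mask)
print(xy_chroma)
```

Unlike the "all three channels non-zero" test in the question, this relies on the mask alone, so dark pixels inside the region are still counted.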

Related

subtracting every element of a list from a previous element in a dictionary

I have a dictionary which holds a 2D list (a list of lists). This 2D list contains the x and y coordinates [x,y] of a particle. Whenever the particle moves, its new coordinates are appended to this 2D list in the dictionary. I want to calculate the distance between every pair of consecutive locations and append the result to another list (it can just be a normal list, without a dictionary). What I want is something like the following:
dist1 = sqrt((x1 - x0)^2 + (y1 - y0)^2)
dist2 = sqrt((x2 - x1)^2 + (y2 - y1)^2)
.....
distN = sqrt((xN - x(N-1))^2 + (yN - y(N-1))^2)
but I am having issues in accessing elements of a list in a dictionary. I have a very long 2D list but you can use the below example to give me some suggestions.
c = {"coordinates":[[1,2],[3,4],[5,6],[7,8]]}
for k, dk in c.items():
    for x in dk:
        print(x[0], x[1])
I can access one element in the dk at a time in a loop but how to get the previous one? There should be a nice way of doing it but I just don't know.
Any help will be appreciated.
Using a for loop (probably not the most efficient solution):
import numpy as np
c = {"coordinates":[[1,2],[3,4],[5,6],[7,8]]}
coordinates = np.array(c['coordinates'])
distances = []
for i in range(1, len(coordinates)):
    distances.append(np.linalg.norm(coordinates[i-1] - coordinates[i]))
print(distances)
# [2.8284271247461903, 2.8284271247461903, 2.8284271247461903]
I also used numpy and its linalg.norm function to calculate the distance (How can the Euclidean distance be calculated with NumPy?), but you could of course use your own function or calculation if you prefer.
I tried this and it also works:
import math
c = {"coordinates":[[1,2],[3,4],[15,6],[7,8]]}
l1 = []
for k, dk in c.items():
    for x in dk:
        l1.append(x)
print(l1)
dist = [math.sqrt((p1[0]-p0[0])**2 + (p1[1]-p0[1])**2) for p0, p1 in zip(l1, l1[1:])]
As others suggested in this question, a better way to get l1 is:
l1 = c["coordinates"]
dist = [math.sqrt((p1[0]-p0[0])**2 + (p1[1]-p0[1])**2) for p0, p1 in zip(l1, l1[1:])]
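For reference, the "pair each point with the previous one" idea can be sketched with only the standard library. This assumes Python 3.8 or newer for math.dist; the helper name step_distances is made up for illustration.

```python
import math

def step_distances(points):
    """Distance between each point and the one before it."""
    # zip(points, points[1:]) pairs every point with its successor,
    # so no index arithmetic is needed to reach the "previous" element.
    return [math.dist(prev, curr) for prev, curr in zip(points, points[1:])]

c = {"coordinates": [[1, 2], [3, 4], [5, 6], [7, 8]]}
distances = step_distances(c["coordinates"])
print(distances)
```

For four points this gives three distances, one per move.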

What's a potentially better algorithm to solve this python nested for loop than the one I'm using?

I have a nested loop that has to loop through a huge amount of data.
Assume a data frame of random values with 1,000,000 rows, each holding an X,Y location in 2D space. A window of length 10 slides through all 1M rows one by one until all the calculations are done.
Explaining what the code is supposed to do:
Each row represents a coordinate in the X-Y plane.
r_test contains the diameters of the different circles of investigation in our 2D plane (X-Y plane).
For each window of 10 points/rows, and for every diameter in r_test, we compare the distance between every point and the remaining 9 points; if the value is less than that diameter, we add 1 to H. Then we calculate H/(N**2) and store it in c_10 at the index corresponding to that diameter of investigation.
For these first 10 points, once the loop has gone through all the diameters in r_test, we read the slope of the fitted line and save it to S_wind[ii]. The first 9 data points therefore have no value calculated for them, so they are given np.inf to distinguish them later.
Then the window moves one point down the rows and repeats this process until S_wind is complete.
What's a potentially better algorithm to solve this in Python 3.x than the one I'm using?
Many thanks in advance!
import numpy as np
import pandas as pd
####generating input data frame
df = pd.DataFrame(data = np.random.randint(2000, 6000, (1000000, 2)))
df.columns= ['X','Y']
####====creating upper and lower bound for the diameter of the investigation circles
x_range =max(df['X']) - min(df['X'])
y_range = max(df['Y']) - min(df['Y'])
R = max(x_range,y_range)/20
d = 2
N = 10 #### Number of points in each window
#r1 = 2*R*(1/N)**(1/d)
#r2 = (R)/(1+d)
#r_test = np.arange(r1, r2, 0.05)
##===avoiding generation of empty r_test
r1 = 80
r2= 800
r_test = np.arange(r1, r2, 5)
S_wind = np.zeros(len(df['X'])) + np.inf
for ii in range (10,len(df['X'])): #### maybe the code run slower because of using len() function instead of a number
    c_10 = np.zeros(len(r_test)) +np.inf
    H = 0
    C = 0
    N = 10 ##### maybe I should also remove this
    for ind in range(len(r_test)):
        for i in range (ii-10,ii):
            for j in range(ii-10,ii):
                dd = r_test[ind] - np.sqrt((df['X'][i] - df['X'][j])**2+ (df['Y'][i] - df['Y'][j])**2)
                if dd > 0:
                    H += 1
        c_10[ind] = (H/(N**2))
    S_wind[ii] = np.polyfit(np.log10(r_test), np.log10(c_10), 1)[0]
You can use numpy broadcasting to eliminate all of the inner loops. I'm not sure if there's an easy way to get rid of the outermost loop, but the others are not too hard to avoid.
The inner loops compare ten 2D points against each other in pairs. That is a natural fit for a 10x10x2 numpy array:
# replacing the `for ind` loop and its contents:
points = np.hstack((np.asarray(df['X'])[ii-10:ii, None], np.asarray(df['Y'])[ii-10:ii, None]))
differences = np.subtract(points[None, :, :], points[:, None, :]) # broadcast to 10x10x2
squared_distances = (differences * differences).sum(axis=2)
within_range = squared_distances[None,:,:] < (r_test*r_test)[:, None, None] # compare squares
c_10 = within_range.sum(axis=(1,2)).cumsum() * 2 / (N**2)
S_wind[ii] = np.polyfit(np.log10(r_test), np.log10(c_10), 1)[0] # this is unchanged...
I'm not very pandas savvy, so there's probably a better way to get the X and Y values into a single 2-dimensional numpy array. You generated the random data in the format that I'd find most useful, then converted into something less immediately useful for numeric operations!
Note that this code matches the output of your loop code. I'm not sure that's actually doing what you want it to do, as there are several slightly strange things in your current code. For example, you may not want the cumsum in my code, which corresponds to only re-initializing H to zero in the outermost loop. If you don't want the matches for smaller values of r_test to be counted again for the larger values, you can skip that sum (or equivalently, move the H = 0 line to in between the for ind and the for i loops in your original code).
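As a rough check of the broadcasting idea, here is a self-contained sketch for a single window of ten points. It compares a loop version, written to mirror the question's code as posted (H += 1, never reset between radii), with a vectorised version. The function names are made up, and the scaling follows the loop as posted, so there is no extra constant factor; a constant factor would not change the fitted log-log slope anyway.

```python
import numpy as np

def window_c10_loop(points, r_test):
    """Reference: the question's nested loops for one window of points."""
    n = len(points)
    c = np.empty(len(r_test))
    h = 0  # deliberately not reset between radii, as in the question
    for ind, r in enumerate(r_test):
        for i in range(n):
            for j in range(n):
                if r - np.sqrt(((points[i] - points[j]) ** 2).sum()) > 0:
                    h += 1
        c[ind] = h / n**2
    return c

def window_c10_broadcast(points, r_test):
    """Same quantity, with broadcasting instead of the three inner loops."""
    diff = points[None, :, :] - points[:, None, :]           # shape (n, n, 2)
    sq_dist = (diff * diff).sum(axis=2)                       # shape (n, n)
    # Compare squared distances with squared radii: shape (len(r_test), n, n).
    within = sq_dist[None, :, :] < (r_test ** 2)[:, None, None]
    # cumsum reproduces the running total kept in H across radii.
    return within.sum(axis=(1, 2)).cumsum() / len(points) ** 2

rng = np.random.default_rng(0)
points = rng.integers(2000, 6000, size=(10, 2)).astype(float)
r_test = np.arange(80, 800, 5)

loop_result = window_c10_loop(points, r_test)
fast_result = window_c10_broadcast(points, r_test)
print(np.allclose(loop_result, fast_result))
```

Dropping the .cumsum() call corresponds to resetting H for every radius, as discussed above.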

How to make a double loop inside a dictionary in Python?

I have a data cube with 2 dimensions of coordinates and a third dimension for wavelength. My goal is to write a mask for the coordinates that lie outside a circle of a given radius around the central coordinates (x0 and y0 in my code). For this, I'm trying to use a dictionary, but I'm having trouble because it seems that I'll have to make a double loop inside the dictionary to iterate over the two dimensions, and as a beginner with dictionaries, I don't know yet how to do that.
I wrote the following code
x0 = 38
y0 = 45
radius = 9
xcoords = np.arange(1,flux.shape[1]+1,1)
ycoords = np.arange(1,flux.shape[2]+1,1)
mask = {'xmask': [xcoords[np.sqrt((xcoords[:]-x0)**2 + (y-y0)**2) < radius] for y in ycoords], 'ymask': [ycoords[np.sqrt((x-x0)**2 + (ycoords[:]-y0)**2) < radius] for x in xcoords]}
It returned several arrays, one for each value of y (for xmask) and one for each value of x (for ymask), although I want just one array for each. Could anyone say what I did wrong and how to achieve my goal?
Note: I also made it without using a dictionary, as
xmask = []
for x in xcoords:
    for y in ycoords:
        if np.sqrt((x-x0)**2 + (y-y0)**2) < radius:
            xmask.append(x)
            break
ymask = []
for y in ycoords:
    for x in xcoords:
        if np.sqrt((x-x0)**2 + (y-y0)**2) < radius:
            ymask.append(y)
            break
but I hope it's possible to make it more efficiently.
Thanks for any help!
Edit: I realized that no loop was needed. If I fix y = y0 and x = x0, I get the values of x and y that are inside the circle, respectively. So I ended up with
mask = {'xmask': [xcoords[abs(xcoords[:]-x0) < radius]], 'ymask': [ycoords[abs(ycoords[:]-y0) < radius]]}
The OP explains that assigning
mask = {'xmask': [xcoords[abs(xcoords[:] - x0) < radius]],
'ymask': [ycoords[abs(ycoords[:] - y0) < radius]]}
solves the problem.
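If what is eventually needed is a mask over the 2-D coordinate grid itself (an assumption about the goal, not something stated in the question), broadcasting can build it without any loop. The sketch below uses made-up grid sizes of 80 and 90 in place of flux.shape[1] and flux.shape[2], and the function name circle_mask is hypothetical.

```python
import numpy as np

def circle_mask(nx, ny, x0, y0, radius):
    """Boolean (nx, ny) array, True where the 1-based coordinate is inside the circle."""
    xcoords = np.arange(1, nx + 1)
    ycoords = np.arange(1, ny + 1)
    # A column of x values against a row of y values gives every squared distance at once.
    sq_dist = (xcoords[:, None] - x0) ** 2 + (ycoords[None, :] - y0) ** 2
    return sq_dist < radius ** 2

inside = circle_mask(nx=80, ny=90, x0=38, y0=45, radius=9)

# x (or y) values that have at least one point inside the circle.
xmask = np.arange(1, 81)[inside.any(axis=1)]
ymask = np.arange(1, 91)[inside.any(axis=0)]
print(xmask, ymask)
```

The two 1-D arrays agree with the abs(...) < radius expressions above, while inside keeps the full circular shape; ~inside selects the coordinates outside the circle.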

Data Storage and Export From Matlab

Here is background information to the problem I am encountering:
1) output is a cell array, each cell contains a matrix of size = 1024 x 1024, type = double
2) labelboutput is a cell array which is identical to output, except that each matrix has been binarized.
3) I am using the function regionprops to extract the mean intensity and centroid values for ROIs (there are multiple ROIs in each image) for each cell of output
4) props is a 5 x 1 struct with 2 fields (centroid and mean intensity)
The problem: I would like to take the mean intensity values for each ROI in every matrix and export to excel. Here is what I have so far:
for i = 1:size(output,2)
    props = regionprops(labelboutput{1,i},output{1,i},'MeanIntensity','Centroid');
end
for i = 1:size(output,2)
    meanValues = getfield(props(1:length(props),'MeanIntensity'));
end
writetable(struct2table(props), 'advanced_test.xlsx');
There seem to be a few issues:
1) my getfield command is not working and gets the error: "Index exceeds matrix dimensions"
2) when the information is being stored into props, it overwrites the values for each matrix. How do I make props a 5 x n (where n = number of cells in output)?
Please help!!
1) my getfield command is not working and gets the error: "Index exceeds matrix dimensions"
An easier way to collect the numeric values of the same field from an array of structs into a single array is [structArray.fieldName]. In your case this will be:
meanValues = [props.MeanIntensity];
2) when the information is being stored into props, it overwrites the values for each matrix. How do I make props a 5 x n (where n = number of cells in output)?
One option would be to preallocate an empty cell of the necessary dimensions and then fill it in with your regionprops output. Like this:
props = cell(size(output,2),1); % one cell per image in output
for k = 1:size(output,2)
    props{k} = regionprops(labelboutput{1,k},output{1,k},'MeanIntensity','Centroid');
end
for k = 1:size(output,2)
    meanValues = [props{k}.MeanIntensity];
end
...
Another option would be to combine your loops so that you can use your matrix data before it is overwritten. Like this:
for i = 1:size(output,2)
    props = regionprops(labelboutput{1,i},output{1,i},'MeanIntensity','Centroid');
    meanValues = [props.MeanIntensity];
    % update this call to place props in non-overlapping parts of your file (e.g. append)
    % writetable(struct2table(props), 'advanced_test.xlsx');
end
The downside of this second option is that it puts a file I/O step right inside your loop, which can really slow things down; you will also need to adjust your writetable call so that it places each resulting table in a non-overlapping region of 'advanced_test.xlsx'.

Color Histogram

I'm trying to calculate histogram for an image. I'm using the following formula to calculate the bin
%bin = red*(N^2) + green*(N^1) + blue;
I have to implement the following Matlab functions.
[row, col, noChannels] = size(rgbImage);
hsvImage = rgb2hsv(rgbImage); % Ranges from 0 to 1.
H = zeros(4,4,4);
for col = 1 : columns
    for row = 1 : rows
        hBin = floor(hsvImage(row, column, 1) * 15);
        sBin = floor(hsvImage(row, column, 2) * 4);
        vBin = floor(hsvImage(row, column, 3) * 4);
        F(hBin, sBin, vBin) = hBin, sBin, vBin + 1;
    end
end
When I run the code I get the following error message "Subscript indices must either be real positive integers or logical."
As I am new to Matlab and Image processing, I'm not sure if the problem is with implementing the algorithm or a syntax error.
There are 3 problems with your code. (Four if you count that you changed your accumulator from H to F, but I'll assume that's a typo.)
First, your variable bin can be zero whenever the values of a given pixel are low, and 0 is not a valid index for a vector or matrix in MATLAB, so F(0) fails. This is why you are getting that error.
You can solve this easily by using F(bin+1); just keep in mind that the values in your F vector will be shifted one position over.
Second, you are assigning the value bin + 1 to your accumulator vector F, which is not what you want. You want to add 1 every time a pixel in that range is found, so what you should do is F(bin+1) = F(bin+1) + 1;. This way the counts in F increase as pixels are found.
Third, and simpler: you forgot to implement your bin = red*(N^2) + green*(N^1) + blue; equation.
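To make the three fixes concrete, here is a short sketch of the same accumulation in Python rather than MATLAB. Python indexing is zero-based, so the +1 shift is not needed here. It assumes an 8-bit RGB image and N = 4 levels per channel; both are illustrative choices, and the function name color_histogram is made up.

```python
import numpy as np

def color_histogram(rgb_image, n_levels=4):
    """Count pixels per combined bin = red*N^2 + green*N + blue."""
    # Quantise each 0..255 channel value to 0..N-1 with integer arithmetic.
    quantised = rgb_image.astype(np.int64) * n_levels // 256
    red, green, blue = quantised[..., 0], quantised[..., 1], quantised[..., 2]
    bins = red * n_levels**2 + green * n_levels + blue
    # bincount adds 1 to the accumulator for every pixel that lands in a bin.
    return np.bincount(bins.ravel(), minlength=n_levels**3)

# Hypothetical 2x2 image: two black pixels, one white, one pure red.
image = np.array([[[0, 0, 0], [255, 255, 255]],
                  [[255, 0, 0], [0, 0, 0]]], dtype=np.uint8)
hist = color_histogram(image)
print(hist[0], hist[48], hist[63])
```

In MATLAB the same accumulator would be indexed as F(bin+1), as described above.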

Resources