I constructed a large circle of uniformly spaced dots (8,000,000 of them) in Python 3. As a next step, I would like to add additional dots (listed in myList) just outside the circle, each at its corresponding position.
import matplotlib.pyplot as plt
import numpy as np
circleSize = 8000000
myList = [155744, 213230, 215537, 262274, 262613, 6143898, 244883, 509516, 1997259, 2336382]
fig = plt.figure(figsize=(4, 4))
n_dots = circleSize # set number of points in circle
uniformSpacing = np.linspace(0, 2*np.pi, n_dots) # create uniform spacing between points
center_x, center_y = (50, 20) # set the center of the circle
x_coord, y_coord = [], [] # for coordinates of points to plot
radius = 10.0 # set the radius of circle
for items in uniformSpacing:
    x = center_x + radius*np.cos(items)
    y = center_y + radius*np.sin(items)
    x_coord.append(x)
    y_coord.append(y)
plt.scatter(x_coord, y_coord, c = 'black', s=1) # plot points
plt.show()
How can I add the points to my plot?
Thank you!
If you're coming from a MATLAB background, pyplot has hold on by default, so you can do multiple plot() or scatter() calls without it erasing what was on the plot before.
Also, since you're already using numpy, you should use its vectorized operations to compute x_coord and y_coord on whole arrays at once, rather than looping and appending to a Python list (which is painfully slow).
fig = plt.figure(figsize=(4, 4))
n_dots = circleSize # set number of points in circle
uniformSpacing = np.linspace(0, 2*np.pi, n_dots) # create uniform spacing between points
center_x, center_y = (50, 20) # set the center of the circle
radius = 10.0 # set the radius of circle
x_coord = center_x + radius * np.cos(uniformSpacing)
y_coord = center_y + radius * np.sin(uniformSpacing)
plt.scatter(x_coord, y_coord, marker='o', color='k')
new_dots_angles = np.linspace(0, 2 * np.pi, 5)
new_radius = 15.0
new_xcoord = center_x + new_radius * np.cos(new_dots_angles)
new_ycoord = center_y + new_radius * np.sin(new_dots_angles)
plt.scatter(new_xcoord, new_ycoord, marker='*', color='r')
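The snippet above places five evenly spaced example stars. To place the entries of myList instead, here is a minimal sketch under one assumption about what "corresponding position" means: each value in myList is taken to be the index of one of the n_dots points on the circle. The helper index_to_angle and the value of outer_radius are made up for illustration, and the circle is drawn as a line rather than as 8 million scatter markers.

```python
import numpy as np
import matplotlib.pyplot as plt

circle_size = 8_000_000
my_list = [155744, 213230, 215537, 262274, 262613,
           6143898, 244883, 509516, 1997259, 2336382]
center_x, center_y = 50, 20
radius = 10.0
outer_radius = 11.0  # assumption: how far outside the circle the extra dots go


def index_to_angle(indices, n_dots):
    """Angle of the i-th point of np.linspace(0, 2*pi, n_dots) (assumed meaning of myList)."""
    return 2 * np.pi * np.asarray(indices) / (n_dots - 1)


# Vectorized positions of the extra dots, at the same angle but a larger radius
angles = index_to_angle(my_list, circle_size)
dots_x = center_x + outer_radius * np.cos(angles)
dots_y = center_y + outer_radius * np.sin(angles)

# At this figure size a 1000-point line looks the same as 8 million dots
theta = np.linspace(0, 2 * np.pi, 1000)
fig, ax = plt.subplots(figsize=(4, 4))
ax.plot(center_x + radius * np.cos(theta), center_y + radius * np.sin(theta), color='k')
ax.scatter(dots_x, dots_y, marker='*', color='r')
ax.set_aspect('equal')
```

Call plt.show() afterwards to display the figure. If the values in myList mean something else (for example angles or arc lengths), only index_to_angle needs to change.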
I used below code to generate the colorbar plot of an image:
plt.imshow(distance)
cb = plt.colorbar()
plt.savefig(generate_filename("test_images.png"))
cb.remove()
The image looks like this:
I want to draw a single contour line on this image where the signed distance value is equal to 0. I checked the doc of pyplot.contour but it needs a X and Y vector that represents the coordinates and a Z that represents heights. Is there a method to generate X, Y, and Z? Or is there a better function to achieve this? Thanks!
If you leave out X and Y, by default, plt.contour uses the array indices (in this case the range 0-1023 in both x and y).
To only draw a contour line at a given level, you can use levels=[0]. The colors= parameter can fix one or more colors. Optionally, you can draw a line on the colorbar to indicate the value of the level.
import matplotlib.pyplot as plt
import numpy as np
from scipy import ndimage # to smooth a test image
# create a test image with similar properties as the given one
np.random.seed(20221230)
distance = np.pad(np.random.randn(1001, 1001), (11, 11), constant_values=-0.02)
distance = ndimage.gaussian_filter(distance, 100) # ndimage.filters is a deprecated namespace
distance -= distance.min()
distance = distance / distance.max() * 0.78 - 0.73
plt.imshow(distance)
cbar = plt.colorbar()
level = 0
color = 'red'
plt.contour(distance, levels=[level], colors=color)
cbar.ax.axhline(level, color=color) # show the level on the colorbar
plt.show()
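To illustrate the point about leaving out X and Y, here is a small sketch comparing the default call with one where the index grids are built explicitly via np.meshgrid. The signed distance field is a made-up stand-in (a circle of radius 20 on a 64x96 image), not the image from the question.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical signed distance field: negative inside a circle, positive outside
rows, cols = 64, 96
yy, xx = np.mgrid[0:rows, 0:cols]
distance = np.hypot(xx - 48, yy - 32) - 20.0

fig, (ax1, ax2) = plt.subplots(ncols=2)

# Default: contour uses the column and row indices of the array as x and y
cs_default = ax1.contour(distance, levels=[0], colors='red')

# Explicit: build the same index grids by hand
X, Y = np.meshgrid(np.arange(cols), np.arange(rows))
cs_explicit = ax2.contour(X, Y, distance, levels=[0], colors='red')

# Both calls should trace the same zero-level line
same = all(np.allclose(a, b)
           for a, b in zip(cs_default.allsegs[0], cs_explicit.allsegs[0]))
```

Explicit X and Y are only needed when the image is shown with real-world coordinates (for example via extent=), so that the line and the image share one coordinate system.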
Reference: https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.contour.html
You can accomplish this by setting the [levels] parameter in contour([X, Y,] Z, [levels], **kwargs).
You can draw contour lines at the specified levels by giving an array that is in increasing order.
import matplotlib.pyplot as plt
import numpy as np
x = y = np.arange(-3.0, 3.0, 0.02)
X, Y = np.meshgrid(x, y)
Z1 = np.exp(-X ** 2 - Y ** 2)
Z2 = np.exp(-(X - 1) ** 2 - (Y - 1) ** 2)
Z3 = np.exp(-(X + 1) ** 2 - (Y + 1) ** 2)
Z = (Z1 - Z2 - Z3) * 2
fig, ax = plt.subplots()
im = ax.imshow(Z, interpolation='gaussian',
               origin='lower', extent=[-3, 3, -3, 3],  # same range as x and y, so image and contour line up
               vmax=abs(Z).max(), vmin=-abs(Z).max())
plt.colorbar(im)
CS = ax.contour(X, Y, Z, levels=[0.9], colors='black')
ax.clabel(CS, fmt='%1.1f', fontsize=12)
plt.show()
Result (levels=[0.9]):
I would like to plot a heatmap where the input data is not in the typical rectangularly spaced grid. Here is some sample data:
import numpy as np
xmin = 6
xmax= 12
ymin = 0
x = np.linspace(xmin, xmax, 100)
ymax = x**2
final = []
for i in range(len(ymax)):
    yrange = np.linspace(0, ymax[i], 100)
    for j in range(len(yrange)):
        intensity = np.random.rand()
        final.append([x[i], yrange[j], intensity])
data_for_plotting = np.asarray(final) # (10000, 3) shaped array
I would like to plot intensity (in the colorbar) as a function of (x,y) which represents the position and I would like to do this without interpolation.
Here is my solution which uses matplotlib's griddata and linear interpolation.
import matplotlib.pyplot as plt
from matplotlib.mlab import griddata
total_length = 100
x1 = np.linspace(min(data_for_plotting[:,0]), max(data_for_plotting[:,0]), total_length)
y1 = np.linspace(min(data_for_plotting[:,1]), max(data_for_plotting[:,1]), total_length)
z1 = griddata(data_for_plotting[:,0], data_for_plotting[:,1], data_for_plotting[:,2], x1, y1, interp='linear')
p=plt.pcolormesh(x1, y1, z1, vmin = 0. , vmax=1.0, cmap='viridis')
clb = plt.colorbar(p)
plt.show()
I am looking for an alternate solution without interpolation as I would like to see the smallest unit of measurement in my x and y position (pixel size/rectangle). Based on the sample data given above I expect the height of the pixel to increase for large values of x.
matplotlib.mlab.griddata only exists in old Matplotlib versions; it was removed in Matplotlib 3.1.
You could use scipy.interpolate.griddata which needs its parameters in a slightly different format. method='nearest' switches off the interpolation (default method='linear').
Here is how it could look with your test data (see griddata's documentation for more explanation and examples):
import matplotlib.pyplot as plt
from scipy.interpolate import griddata
import numpy as np
xmin = 6
xmax = 12
ymin = 0
x = np.linspace(xmin, xmax, 100)
ymax = x ** 2
final = []
for i in range(len(ymax)):
    yrange = np.linspace(0, ymax[i], 100)
    for j in range(len(yrange)):
        intensity = np.random.rand()
        final.append([x[i], yrange[j], intensity])
data_for_plotting = np.asarray(final) # (10000, 3) shaped array
total_length = 100
x1 = np.linspace(min(data_for_plotting[:, 0]), max(data_for_plotting[:, 0]), total_length)
y1 = np.linspace(min(data_for_plotting[:, 1]), max(data_for_plotting[:, 1]), total_length)
grid_x, grid_y = np.meshgrid(x1, y1)
z1 = griddata(data_for_plotting[:, :2], data_for_plotting[:, 2], (grid_x, grid_y), method='nearest')
img = plt.imshow(z1, extent=[x1[0], x1[-1], y1[0], y1[-1]], origin='lower',
                 vmin=0, vmax=1, cmap='inferno', aspect='auto')
cbar = plt.colorbar(img)
plt.show()
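As a quick check of the claim that method='nearest' avoids interpolation, the following sketch uses made-up scattered points whose values are limited to 0..4. With 'nearest', every grid cell should carry one of the measured values; with 'linear', in-between values appear and cells outside the convex hull become NaN.

```python
import numpy as np
from scipy.interpolate import griddata

# Hypothetical scattered measurements with only five distinct values
rng = np.random.default_rng(0)
points = rng.uniform(0, 1, size=(50, 2))
values = rng.integers(0, 5, size=50).astype(float)

grid_x, grid_y = np.meshgrid(np.linspace(0, 1, 20), np.linspace(0, 1, 20))
z_nearest = griddata(points, values, (grid_x, grid_y), method='nearest')
z_linear = griddata(points, values, (grid_x, grid_y), method='linear')

# 'nearest' copies the value of the closest sample, so no new values are invented
nearest_only_measured = np.isin(z_nearest, values).all()
# 'linear' leaves NaN outside the convex hull of the samples
linear_nan_count = np.isnan(z_linear).sum()
```

Note that 'nearest' still resamples onto a regular grid, so the drawn pixels all have the same size; it does not show the original, column-dependent pixel heights.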
An alternative is to create one rectangle for each of the elongated pixels. Beware that this can be a rather slow operation. If really needed, one could create a pcolormesh for each column.
import matplotlib.pyplot as plt
from matplotlib.cm import ScalarMappable
import numpy as np
# ... create x and data_for_plotting as before
fig, ax = plt.subplots()
cmap = plt.get_cmap('inferno')
norm = plt.Normalize(0, 1)
x_step = x[1] - x[0]
y_step = 0
for i, (xi, yi, intensity_i) in enumerate(data_for_plotting):
    if i + 1 < len(data_for_plotting) and data_for_plotting[i + 1, 0] == xi:  # when False, the last y_step is reused
        y_step = data_for_plotting[i + 1, 1] - yi
    ax.add_artist(plt.Rectangle((xi, yi), x_step, y_step, color=cmap(norm(intensity_i))))
cbar = plt.colorbar(ScalarMappable(cmap=cmap, norm=norm), ax=ax)  # recent Matplotlib versions need ax= here
ax.set_xlim(x[0], x[-1])
ax.set_ylim(0, data_for_plotting[:, 1].max())
plt.tight_layout()
plt.show()
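The "pcolormesh for each column" idea mentioned above could look roughly like the sketch below. It rebuilds the sample data with a seeded generator and assumes the rows of data_for_plotting are ordered column by column, as in the question. As in the rectangle version, each pixel starts at its (x, y) position and extends to the next sample.

```python
import numpy as np
import matplotlib.pyplot as plt

# Rebuild the sample data (same layout as data_for_plotting in the question)
rng = np.random.default_rng(0)
n_cols, n_rows = 100, 100
x = np.linspace(6, 12, n_cols)
final = []
for xi in x:
    for yj in np.linspace(0, xi ** 2, n_rows):
        final.append([xi, yj, rng.random()])
data_for_plotting = np.asarray(final)

# Assumption: rows are grouped per x value, so a reshape separates the columns
columns = data_for_plotting.reshape(n_cols, n_rows, 3)
x_step = x[1] - x[0]

fig, ax = plt.subplots()
for col in columns:
    xi = col[0, 0]
    y_lower = col[:, 1]                                  # lower edge of each pixel
    y_step = y_lower[1] - y_lower[0]
    y_edges = np.append(y_lower, y_lower[-1] + y_step)   # add the top edge of the last pixel
    intensity = col[:, 2].reshape(n_rows, 1)
    # One narrow mesh per column: 2 x-edges, n_rows + 1 y-edges, n_rows x 1 colors
    mesh = ax.pcolormesh(np.array([xi, xi + x_step]), y_edges, intensity,
                         cmap='inferno', vmin=0, vmax=1, shading='flat')

fig.colorbar(mesh, ax=ax)
ax.set_xlim(x[0], x[-1] + x_step)
ax.set_ylim(0, y_edges.max())
```

This creates 100 mesh artists instead of 10,000 rectangles, which should be noticeably lighter to draw.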
I have 2 sets of datapoints:
import random
import pandas as pd
A = pd.DataFrame({'x':[random.uniform(0, 1) for i in range(0,100)], 'y':[random.uniform(0, 1) for i in range(0,100)]})
B = pd.DataFrame({'x':[random.uniform(0, 1) for i in range(0,100)], 'y':[random.uniform(0, 1) for i in range(0,100)]})
For each one of these dataset I can produce the jointplot like this:
import seaborn as sns
sns.jointplot(x=A["x"], y=A["y"], kind='kde')
sns.jointplot(x=B["x"], y=B["y"], kind='kde')
Is there a way to calculate the "common area" between these 2 joint plots?
By common area, I mean: if you put one joint plot "inside" the other, what is the total area of intersection? So if you imagine these 2 joint plots as mountains, and you put one mountain inside the other, how much does one fall inside the other?
EDIT
To make my question more clear:
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as st
def plot_2d_kde(df):
    # Extract x and y
    x = df['x']
    y = df['y']
    # Define the borders
    deltaX = (max(x) - min(x))/10
    deltaY = (max(y) - min(y))/10
    xmin = min(x) - deltaX
    xmax = max(x) + deltaX
    ymin = min(y) - deltaY
    ymax = max(y) + deltaY
    # Create meshgrid
    xx, yy = np.mgrid[xmin:xmax:100j, ymin:ymax:100j]
    # We will fit a gaussian kernel using the scipy’s gaussian_kde method
    positions = np.vstack([xx.ravel(), yy.ravel()])
    values = np.vstack([x, y])
    kernel = st.gaussian_kde(values)
    f = np.reshape(kernel(positions).T, xx.shape)
    fig = plt.figure(figsize=(13, 7))
    ax = plt.axes(projection='3d')
    surf = ax.plot_surface(xx, yy, f, rstride=1, cstride=1, cmap='coolwarm', edgecolor='none')
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    ax.set_zlabel('PDF')
    ax.set_title('Surface plot of Gaussian 2D KDE')
    fig.colorbar(surf, shrink=0.5, aspect=5) # add color bar indicating the PDF
    ax.view_init(60, 35)
I am interested in finding the intersection/common volume (just the number) of these 2 kde plots:
plot_2d_kde(A)
plot_2d_kde(B)
Credits: The code for the kde plots is from here
I believe this is what you're looking for. I'm basically calculating the volume (by numerical integration) of the intersection (overlap) of the two KDE distributions.
import itertools
import random
import numpy as np
import pandas as pd
import scipy.stats
A = pd.DataFrame({'x':[random.uniform(0, 1) for i in range(0,100)], 'y':[random.uniform(0, 1) for i in range(0,100)]})
B = pd.DataFrame({'x':[random.uniform(0, 1) for i in range(0,100)], 'y':[random.uniform(0, 1) for i in range(0,100)]})
# KDE for both A and B
kde_a = scipy.stats.gaussian_kde([A.x, A.y])
kde_b = scipy.stats.gaussian_kde([B.x, B.y])
min_x = min(A.x.min(), B.x.min())
min_y = min(A.y.min(), B.y.min())
max_x = max(A.x.max(), B.x.max())
max_y = max(A.y.max(), B.y.max())
print(f"x is from {min_x} to {max_x}")
print(f"y is from {min_y} to {max_y}")
# sample on a regular grid with spacing `step` (roughly 100x100 points)
step = 0.01
grid = list(itertools.product(np.arange(min_x, max_x, step), np.arange(min_y, max_y, step)))
x = [p[0] for p in grid]
y = [p[1] for p in grid]
a_dist = kde_a([x, y])
b_dist = kde_b([x, y])
# Riemann sum: density values times the area of one grid cell
cell_area = step * step
print(a_dist.sum() * cell_area) # integral of A
print(b_dist.sum() * cell_area) # integral of B
print(np.minimum(a_dist, b_dist).sum() * cell_area) # integral of the intersection between A and B
The following code compares calculating the volume of the intersection either via scipy's dblquad or via taking the average value over a grid.
Remarks:
For the 2D case (and with only 100 sample points), it seems the deltas need to be considerably larger than 10%. The code below uses 25%. With a delta of 10%, the calculated values for f1 and f2 are about 0.90, while in theory they should be 1.0. With a delta of 25%, these values are around 0.994.
To approximate the volume the simple way, the average needs to be multiplied by the area (here (xmax - xmin)*(ymax - ymin)). Also, the more grid points are considered, the better the approximation. The code below uses 1000x1000 grid points.
Scipy has some special functions to calculate the integral, such as scipy.integrate.dblquad. This is much slower than the 'simple' method, but a bit more precise. The default precision didn't work, so the code below reduces that precision considerably. (dblquad outputs two numbers: the approximate integral and an indication of the error. To only get the integral, dblquad()[0] is used in the code.)
The same approach can be used for more dimensions. For the 'simple' method, create a higher-dimensional grid (xx, yy, zz = np.mgrid[xmin:xmax:100j, ymin:ymax:100j, zmin:zmax:100j]). Note that a subdivision by 1000 in each dimension would create a grid that's too large to work with.
When using scipy.integrate, dblquad needs to be replaced by tplquad for 3 dimensions or nquad for N dimensions. This probably will also be rather slow, so the accuracy needs to be reduced further.
import numpy as np
import pandas as pd
import scipy.stats as st
from scipy.integrate import dblquad
df1 = pd.DataFrame({'x':np.random.uniform(0, 1, 100), 'y':np.random.uniform(0, 1, 100)})
df2 = pd.DataFrame({'x':np.random.uniform(0, 1, 100), 'y':np.random.uniform(0, 1, 100)})
# Extract x and y
x1 = df1['x']
y1 = df1['y']
x2 = df2['x']
y2 = df2['y']
# Define the borders
deltaX = (np.max([x1, x2]) - np.min([x1, x2])) / 4
deltaY = (np.max([y1, y2]) - np.min([y1, y2])) / 4
xmin = np.min([x1, x2]) - deltaX
xmax = np.max([x1, x2]) + deltaX
ymin = np.min([y1, y2]) - deltaY
ymax = np.max([y1, y2]) + deltaY
# fit a gaussian kernel using scipy’s gaussian_kde method
kernel1 = st.gaussian_kde(np.vstack([x1, y1]))
kernel2 = st.gaussian_kde(np.vstack([x2, y2]))
print('volumes via scipy`s dblquad (volume):')
# kernel(...) returns a length-1 array; [0] turns it into the scalar that dblquad expects
print(' volume_f1 =', dblquad(lambda y, x: kernel1((x, y))[0], xmin, xmax, ymin, ymax, epsabs=1e-4, epsrel=1e-4)[0])
print(' volume_f2 =', dblquad(lambda y, x: kernel2((x, y))[0], xmin, xmax, ymin, ymax, epsabs=1e-4, epsrel=1e-4)[0])
print(' volume_intersection =',
      dblquad(lambda y, x: np.minimum(kernel1((x, y)), kernel2((x, y)))[0], xmin, xmax, ymin, ymax, epsabs=1e-4, epsrel=1e-4)[0])
Alternatively, one can calculate the mean value over a grid of points, and multiply the result by the area of the grid. Note that np.mgrid is much faster than creating a list via itertools.
# Create meshgrid
xx, yy = np.mgrid[xmin:xmax:1000j, ymin:ymax:1000j]
positions = np.vstack([xx.ravel(), yy.ravel()])
f1 = np.reshape(kernel1(positions).T, xx.shape)
f2 = np.reshape(kernel2(positions).T, xx.shape)
intersection = np.minimum(f1, f2)
print('volumes via the mean value multiplied by the area:')
print(' volume_f1 =', np.sum(f1) / f1.size * ((xmax - xmin)*(ymax - ymin)))
print(' volume_f2 =', np.sum(f2) / f2.size * ((xmax - xmin)*(ymax - ymin)))
print(' volume_intersection =', np.sum(intersection) / intersection.size * ((xmax - xmin)*(ymax - ymin)))
Example output:
volumes via scipy`s dblquad (volume):
volume_f1 = 0.9946974276169385
volume_f2 = 0.9928998852123891
volume_intersection = 0.9046421634401607
volumes via the mean value multiplied by the area:
volume_f1 = 0.9927873844924111
volume_f2 = 0.9910132867915901
volume_intersection = 0.9028999384136771
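The remark above about higher dimensions can be sketched for three dimensions as follows. The sample data, the seed and the grid size of 40 points per axis are assumptions chosen to keep the run time short. Because the grid is coarse, the sum is multiplied by the volume of one grid cell instead of taking the mean times the box volume; with only 40 points per axis the two differ noticeably.

```python
import numpy as np
import scipy.stats as st

# Hypothetical 3D samples; each row is one coordinate (x, y, z)
rng = np.random.default_rng(0)
sample1 = rng.uniform(0, 1, size=(3, 100))
sample2 = rng.uniform(0, 1, size=(3, 100))

kernel1 = st.gaussian_kde(sample1)
kernel2 = st.gaussian_kde(sample2)

# Borders: extend the data range by 25% on each side, as in the 2D code
both = np.hstack([sample1, sample2])
span = both.max(axis=1) - both.min(axis=1)
lo = both.min(axis=1) - span / 4
hi = both.max(axis=1) + span / 4

# 40 points per axis gives 64,000 grid points
n = 40
xx, yy, zz = np.mgrid[lo[0]:hi[0]:40j, lo[1]:hi[1]:40j, lo[2]:hi[2]:40j]
positions = np.vstack([xx.ravel(), yy.ravel(), zz.ravel()])

f1 = kernel1(positions)
f2 = kernel2(positions)

# Riemann sum: density values times the volume of one grid cell
cell_volume = np.prod((hi - lo) / (n - 1))
volume_f1 = f1.sum() * cell_volume
volume_f2 = f2.sum() * cell_volume
volume_intersection = np.minimum(f1, f2).sum() * cell_volume
```

volume_f1 and volume_f2 should come out close to 1, which serves as a sanity check on the chosen borders and grid size before trusting volume_intersection.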
I would like some tips on how to properly visualize/plot two 2-dimensional arrays of the same shape,
say ground_arr and water_arr. ground_arr represents the elevation of some surface, and water_arr represents the height/depth of water on top of that surface. The total elevation is then ground_arr + water_arr.
For now I'm using plt.imshow(water_arr, cmap=...) to see only the water and plt.imshow(water_arr + ground_arr) to see the total elevation, but I would like to merge both into the same plot to get a map-like picture.
Any tips?
Suppose you have 2D arrays of height values for the terrain and for the water level, and that the water level is set to zero at the places without water.
Then just set the water level to NaN wherever you want the water image to be transparent.
import numpy as np
import matplotlib.pyplot as plt
# Generate test data, terrain is some sine on the distance to the center
terrain_x, terrain_y = np.meshgrid(np.linspace(-15, 15, 1000), np.linspace(-15, 15, 1000))
r = np.sqrt(terrain_x * terrain_x + terrain_y * terrain_y)
terrain_z = 5 + 5 * np.sin(r)
# test data for water has some height where r is between 3 and 4 pi, zero everywhere else
water_z = np.where(3 * np.pi < r, 3 - terrain_z, 0)
water_z = np.where(4 * np.pi > r, water_z, 0)
extent = [-15, 15, -15, 15]
fig, (ax1, ax2, ax3) = plt.subplots(ncols=3)
ax1.imshow(terrain_z, cmap="YlOrBr", extent=extent)
ax1.set_title('Terrain')
ax2.imshow(water_z, cmap="Blues", extent=extent)
ax2.set_title('Water')
ax3.imshow(terrain_z, cmap="YlOrBr", extent=extent)
water_z = np.where(water_z > 0, water_z, np.nan)
ax3.imshow(water_z, cmap="Blues", extent=extent)
ax3.set_title('Combined')
plt.show()
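A variation on the NaN approach, sketched here with made-up data standing in for ground_arr and water_arr, is to mask the dry cells with np.ma.masked_where. imshow leaves masked cells undrawn, so the terrain shows through, and the original water array is left unchanged. The bowl-shaped terrain and the water level of 1.5 are assumptions for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: a bowl-shaped terrain, filled with water up to a fixed level
yy, xx = np.mgrid[-3:3:200j, -3:3:200j]
ground_arr = np.hypot(xx, yy)
water_level = 1.5
water_arr = np.clip(water_level - ground_arr, 0, None)  # depth, zero where dry

# Mask the dry cells instead of overwriting them with NaN
water_masked = np.ma.masked_where(water_arr <= 0, water_arr)

fig, ax = plt.subplots()
ground_img = ax.imshow(ground_arr, cmap='YlOrBr')
water_img = ax.imshow(water_masked, cmap='Blues', alpha=0.8)  # drawn on top of the terrain
fig.colorbar(ground_img, ax=ax, label='ground elevation')
fig.colorbar(water_img, ax=ax, label='water depth')
```

Using two colorbars keeps the two scales readable, since elevation and water depth usually have different ranges.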
I want to create a black-and-white structure of "L" shapes on a 20x20 mm figure. The width and length of each L shape are defined by uw, ul, lw and ll (see code). As per my understanding, matplotlib works with 72 points per inch (PPI), so with a linewidth of 1 a line will be 1/72 inch wide. I cannot understand how I can make these figures big enough to be visible when I use plt.show(), and how to save them at the size I want (i.e. a 20x20 mm page, with each L at its exact size and a DPI high enough that I can see it when I open the saved figure). My code is:
import matplotlib.pyplot as plt
import numpy as np
uw = 20e-6 #upper width in meters
ul = 100e-6 #upper length in meters
lw = 20e-6 #lower width in meters
ll = 100e-6 #lower length in meters
w_space = 50e-6 #width spacing for subplots
h_space = 50e-6 #height spacing for subplots
N = 40
coord = [[0,0], [ll,0], [ll,lw], [uw,lw], [uw,ul], [0,ul]]
coord.append(coord[0]) #repeat the first point to create a 'closed loop'
xs, ys = zip(*coord) #create lists of x and y values
fig = plt.figure(num=None, figsize=(0.1, 0.1), dpi=100, facecolor='w', edgecolor='k') #figsize cannot be chosen below 0.1
for i in range(N):
    ax = fig.add_subplot(5,10,i+1)
    ax.fill(xs,ys,'k',linewidth=1)
    plt.axis('off')
plt.subplots_adjust(wspace = w_space, hspace = h_space)
plt.savefig('screenshots/L_shape.png' ,bbox_inches = 'tight', pad_inches = 0, dpi=10000)
plt.show()
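One possible direction, sketched under several assumptions, is to skip subplots and draw all shapes into a single axes that fills the whole figure, using millimetres as data units. The figure size is then 20 mm converted to inches, and the shapes are filled with linewidth=0 so that their size comes only from the data coordinates. The 10-per-row layout, the 1 mm offset from the corner and the helper names l_shape and save are made up for illustration.

```python
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon

MM_PER_INCH = 25.4
page_mm = 20.0

# Dimensions from the question, converted from metres to millimetres
uw, ul, lw, ll = 0.020, 0.100, 0.020, 0.100
pitch_x = ll + 0.050   # shape length plus the 50 micrometre spacing
pitch_y = ul + 0.050
n_shapes, per_row = 40, 10


def l_shape(x0, y0):
    """Corner points of one L with its lower-left corner at (x0, y0), in mm."""
    return [(x0, y0), (x0 + ll, y0), (x0 + ll, y0 + lw),
            (x0 + uw, y0 + lw), (x0 + uw, y0 + ul), (x0, y0 + ul)]


fig = plt.figure(figsize=(page_mm / MM_PER_INCH, page_mm / MM_PER_INCH))
ax = fig.add_axes([0, 0, 1, 1])   # axes covers the full page, no margins
ax.set_xlim(0, page_mm)
ax.set_ylim(0, page_mm)
ax.set_aspect('equal')
ax.axis('off')

for k in range(n_shapes):
    row, col = divmod(k, per_row)
    corners = l_shape(1.0 + col * pitch_x, 1.0 + row * pitch_y)
    # linewidth=0: no outline, so the filled area has exactly the data-unit size
    ax.add_patch(Polygon(corners, closed=True, facecolor='k', linewidth=0))


def save(path, px_per_mm=100):
    """Save the page; 100 px per mm means a 20 micrometre feature spans 2 pixels."""
    fig.savefig(path, dpi=px_per_mm * MM_PER_INCH)
```

Calling save('L_shape.png') would write a 2000x2000 pixel image of the 20x20 mm page. bbox_inches='tight' is left out on purpose, since it crops the figure and changes the page size. On screen the shapes are tiny relative to the page, so the zoom tool of the plt.show() window is needed to inspect them.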