Fit a GMM to a 3D histogram in scikit-learn

The mixture model code in scikit-learn works for a list of individual data points, but what if you have a histogram? That is, I have a density value for every voxel, and I want the mixture model to approximate it. Is this possible? I suppose one solution would be to sample values from this histogram, but that shouldn't be necessary.

Scikit-learn has extensive utilities and algorithms for kernel density estimation, which is specifically centered around inferring distributions from things like histograms. See the documentation here for some examples. If you have no expectations for the distribution of your data, KDE might be a more general approach.

For a 2D histogram Z (your 2D array of voxels):
import numpy as np
# GaussianMixture replaces the older GMM class in current scikit-learn
from sklearn.mixture import GaussianMixture
# create the coordinate values
X, Y = np.mgrid[0:Z.shape[0], 0:Z.shape[1]]
# artificially create a list of points from your histogram
data_points = []
for x, y, z in zip(X.ravel(), Y.ravel(), Z.ravel()):
    # add the data point / voxel (x, y) as many times as it occurs
    # in the histogram (assumes Z holds non-negative integer counts)
    for _ in range(int(z)):
        data_points.append((x, y))
# now fit your GMM
gmm = GaussianMixture()
gmm.fit(np.array(data_points))
Though, as @Kyle Kastner points out, there are better methods for achieving this. For a start, your histogram is 'binned', which will already lose you some resolution. Can you get hold of the raw data before it was binned?
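The same idea (repeat each voxel coordinate according to its count) can be written without Python loops and extended to 3D. The following is only a sketch: the synthetic histogram Z is a made-up stand-in for your data, and it assumes the voxel values are non-negative integer counts. If they are densities, they would first have to be scaled and rounded to counts, which is a further approximation.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical stand-in for the asker's histogram: integer counts per voxel.
rng = np.random.default_rng(0)
samples = rng.normal(loc=8.0, scale=2.0, size=(5000, 3))
Z, _ = np.histogramdd(samples, bins=16, range=[(0, 16)] * 3)

# Voxel index coordinates, one row per voxel, in the same order as Z.ravel().
coords = np.indices(Z.shape).reshape(Z.ndim, -1).T
counts = Z.ravel().astype(int)

# Repeat each voxel coordinate as many times as its count.
data_points = np.repeat(coords, counts, axis=0).astype(float)

# Fit the mixture to the expanded point list (one component for this toy data).
gmm = GaussianMixture(n_components=1, random_state=0)
gmm.fit(data_points)
```

The fitted means and covariances are in voxel-index units, so they still need to be mapped to bin centers if physical coordinates are wanted.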

Related

networkx: node spacing when plotting multipartite graph

I want to plot a multipartite graph using networkx. However, when adding more nodes, the plot becomes very crowded. Is there a way to have more space between nodes and partitions?
Looking at the documentation of multipartite_layout, I couldn't find parameters for this.
Of course, one could create complicated formulas for the positions, but since the spacing of multipartite_layout already looks so good for small graphs, I was wondering how to scale this to bigger graphs.
Has anyone an idea how to do this (efficiently)?
Sample code, generating a graph with three partitions:
import matplotlib.pyplot as plt
import networkx as nx
# build graph:
G = nx.Graph()
for i in range(0, 30):
    G.add_node(i, layer=0)
for i in range(30, 50):
    G.add_node(i, layer=1)
    for j in range(0, 30):
        G.add_edge(i, j)
G.add_node(100, layer=2)
G.add_edge(40, 100)
# plot graph
pos = nx.multipartite_layout(G, subset_key="layer")
plt.figure(figsize=(20, 8))
nx.draw(G, pos, with_labels=False)
plt.axis("equal")
plt.show()
The current, crowded plot: (image not included)
nx.multipartite_layout returns a dictionary with the following format: {node: array([x, y])}
I suggest you try pos = {p:array_op(pos[p]) for p in pos} where array_op is a function acting on the position of each node, array([x, y]).
In your case, I think a simple scaling along the x-axis should suffice, i.e.
array_op = lambda xy: np.array([xy[0] * sx, xy[1]]), where sx is the scale factor you choose.
For visualization purposes I guess this should be equivalent to @JPM's comment. However, this approach gives you the advantage of having the actual transformed position data.
In the end, if such uniform transformation does not satisfy your need, you can always manipulate the position manually with the knowledge of the format of the dict (although it might be less efficient).
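A minimal sketch of the scaling idea above, using the graph from the question. The scale factors sx and sy are hypothetical values chosen for illustration; the drawing calls are left out so the snippet only computes positions.

```python
import networkx as nx
import numpy as np

# Rebuild the three-partition graph from the question.
G = nx.Graph()
G.add_nodes_from(range(0, 30), layer=0)
G.add_nodes_from(range(30, 50), layer=1)
G.add_edges_from((i, j) for i in range(30, 50) for j in range(0, 30))
G.add_node(100, layer=2)
G.add_edge(40, 100)

# multipartite_layout returns {node: array([x, y])}.
pos = nx.multipartite_layout(G, subset_key="layer")

# Hypothetical scale factors: sx spreads the partitions apart,
# sy spreads the nodes within each partition.
sx, sy = 3.0, 1.5
scaled_pos = {node: np.array([xy[0] * sx, xy[1] * sy]) for node, xy in pos.items()}
```

scaled_pos can be passed to nx.draw(G, scaled_pos) in place of pos. With plt.axis("equal") as in the question, it is the ratio of sx to sy that changes how the partitions and nodes are spaced relative to each other.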

How do I visualise orthogonal parameter steps in gradient descent, using Matplotlib?

I have implemented multivariate linear regression, where the parameters theta0 (intercept), theta1 and theta2 are optimized by minimizing the MSE loss with gradient descent, with the step size chosen by line search. How do I visually illustrate the mathematical property that the directions of steepest descent (the negative gradients) at successive steps are orthogonal? I'm trying to generate a contour map similar to this image: Plot, but with respect to 2 parameters instead of 1 (if it's not possible, 2 separate plots would also be great).
Also, I originally wanted to perform multivariate linear regression with 4 features, but ultimately decided to use only the 2 most strongly correlated ones (after comparing their PCC) in order to be able to plot a graph. Although I'm not aware of any way to plot 4-dimensional data, does anyone know if this is possible and how?
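One way to check the orthogonality property numerically before plotting it is sketched below. This is an illustration under stated assumptions, not the asker's implementation: the data are synthetic, the features and target are centered so that the intercept theta0 drops out and only theta1 and theta2 remain, and the line search is the exact closed-form step for a quadratic loss.

```python
import numpy as np

# Hypothetical data: two correlated features and a noisy linear target.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)
X = np.column_stack([x1, x2])
y = X @ np.array([1.5, -2.0]) + 0.1 * rng.normal(size=n)

# Centering removes the intercept, leaving a 2-parameter problem.
X = X - X.mean(axis=0)
y = y - y.mean()

# Loss f(theta) = ||X theta - y||^2 / (2n); its Hessian is constant.
H = X.T @ X / n

def grad(theta):
    return X.T @ (X @ theta - y) / n

theta = np.zeros(2)
path = [theta.copy()]   # visited parameter values, for plotting
dots = []               # dot products of successive gradients
g = grad(theta)
for _ in range(8):
    alpha = (g @ g) / (g @ H @ g)   # exact line search step for a quadratic
    theta = theta - alpha * g
    g_new = grad(theta)
    dots.append(float(g @ g_new))
    path.append(theta.copy())
    g = g_new
path = np.array(path)
```

Every entry of dots should be zero up to rounding error, which is the orthogonality property. The rows of path are the (theta1, theta2) iterates and can be drawn as a connected line on top of a Matplotlib contour plot of the loss evaluated over a grid of the two parameters.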

How to decompose affine matrix?

I have a series of points in two 3D systems. With them, I use np.linalg.lstsq to calculate the affine transformation matrix (4x4) between both. However, due to my project, I have to "disable" the shear in the transform. Is there a way to decompose the matrix into the base transformations? I have found out how to do so for Translation and Scaling but I don't know how to separate Rotation and Shear.
If not, is there a way to calculate a transformation matrix from the points that doesn't include shear?
I can only use numpy or tensorflow to solve this problem btw.
I'm not sure I understand what you're asking.
Anyway, if you have two sets of 3D points P and Q, you can use the Kabsch algorithm to find a rotation matrix R and a translation vector T such that the sum of squared distances between (RP+T) and Q is minimized.
You can of course combine R and T into a 4x4 matrix (rotation and translation only, without shear or scale).
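A numpy-only sketch of the Kabsch approach described above. The function name kabsch and the test data are made up for illustration; P and Q are assumed to be (n, 3) arrays whose rows are corresponding points.

```python
import numpy as np

def kabsch(P, Q):
    """Return rotation R and translation t minimizing sum ||R p_i + t - q_i||^2."""
    p_mean = P.mean(axis=0)
    q_mean = Q.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (P - p_mean).T @ (Q - q_mean)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_mean - R @ p_mean
    return R, t

# Hypothetical example: recover a known rigid transform.
rng = np.random.default_rng(0)
P = rng.normal(size=(20, 3))
A, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(A) < 0:
    A[:, 0] *= -1.0               # make A a proper rotation
t_true = np.array([1.0, -2.0, 0.5])
Q = P @ A.T + t_true

R, t = kabsch(P, Q)

# Combine into a 4x4 homogeneous matrix (rotation and translation only).
M = np.eye(4)
M[:3, :3] = R
M[:3, 3] = t
```

Because R is constrained to be a rotation, the resulting matrix M contains no shear or scale by construction, which sidesteps the decomposition of the least-squares affine matrix.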

fitting for offset in a patsy model

Using patsy, I understand how to turn intercepts on or off. But I haven't managed to get horizontal offsets. For instance, I would like to be able to fit, in essence
y = alpha + beta * abs(x_opt - x_obs)
with x_opt free in the fit. I tried writing this like so:
y ~ 1 + np.abs(y - x)
using a constant column for y. But within the np.abs() parentheses, patsy "turns off," and y - x is just interpreted as a number. If I shift y to 1 or 20, I get different answers.
A similar question applies for e.g., np.pow(1-x, 2) or a sine wave. Being able to fit for the x offset would be extremely helpful. Is this possible? Or is this precisely what is meant that patsy doesn't do non-linear?
patsy and most of statsmodels only handle models that are linear in parameters. Or more precisely, models where the design matrix and estimated parameters are combined in a linear way, x * beta.
Polynomials and splines are nonlinear in the underlying variables but have a linear representation in terms of basis function and are therefore linear in parameters.
The only non-linearities in the models that are currently implemented in statsmodels are predefined nonlinearities like link functions in GLM or discrete models, shape parameters in models like NegativeBinomial, or covariances in mixed models and GEE.
The best Python package for nonlinear least squares is currently lmfit https://pypi.python.org/pypi/lmfit/
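As an illustration of fitting the offset with a nonlinear least-squares routine, here is a sketch that uses scipy.optimize.curve_fit rather than lmfit, simply to keep the dependencies small. The data, the true parameter values, and the starting guess are all made up for the example.

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, alpha, beta, x_opt):
    # y = alpha + beta * |x_opt - x|, with x_opt a free parameter.
    return alpha + beta * np.abs(x_opt - x)

# Hypothetical observations generated from known parameters plus noise.
rng = np.random.default_rng(0)
x_obs = np.linspace(0.0, 10.0, 200)
y = model(x_obs, 1.0, 2.0, 4.0) + 0.05 * rng.normal(size=x_obs.size)

# Starting guess: put x_opt at the location of the smallest observed y.
p0 = [y.min(), 1.0, x_obs[np.argmin(y)]]
params, cov = curve_fit(model, x_obs, y, p0=p0)
alpha_hat, beta_hat, x_opt_hat = params
```

The absolute value makes the model non-smooth at x = x_opt, so a sensible starting guess matters more here than for a smooth model such as (x_opt - x)**2.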

scikit-learn Standard Scaler - get the standard deviation in the original unscaled space for GMM

Before running a GMM clustering model, I use a StandardScaler to transform my data into a dataset with 0 mean and 1 standard deviation.
Having then performed clustering, I am interested in representing the learned cluster back in the original space rather than the 0-mean, 1 standard deviation, where the feature values make more sense.
Is it then correct to do the following:
1) Get the mean by multiplying the mean of each GMM cluster by the scaler.mean_ parameters.
2) Get the standard deviation by multiplying the square of the diagonal covariance matrix by the scaler.std_ parameters.
I'd appreciate any feedback,
Thank you!
For the cluster centers you can use scaler.inverse_transform() directly (because they live in the same space as your data). It adds the column means back and scales each column back up by its standard deviation.
import numpy as np
from sklearn.preprocessing import StandardScaler
X = np.random.randn(10, 3)
scaler = StandardScaler()
scaler.fit(X)
You will then see that
scaler.inverse_transform(scaler.transform(X)) - X
is equal or extremely close to 0, making the two essentially equal. In order to automate your pipeline, you should also take a look at sklearn.pipeline.Pipeline, with which you can concatenate your processes and invoke the transform and inverse_transform methods.
As for the rescaling of the covariance, you should multiply your cluster covariance matrices by np.diag(scaler.scale_) on both the left and the right. (The attribute was called scaler.std_ in older scikit-learn versions; current versions expose it as scaler.scale_.)
To answer your question:
1) You obtain the mean by multiplying the cluster means by scaler.scale_ and adding scaler.mean_ back.
2) You rescale the cluster covariances by multiplying left and right by np.diag(scaler.scale_), viz. rescaled_cov = np.diag(scaler.scale_).dot(cov).dot(np.diag(scaler.scale_)). The per-feature standard deviations in the original space are then the square roots of the diagonal of rescaled_cov.
Note: If your covariance matrices are rather large, you may not want to create another (diagonal, but dense) matrix of the same size. The operation scaler.scale_[:, np.newaxis] * cov * scaler.scale_ is mathematically equivalent to 2) but does not require creating the diagonal matrix.
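Putting the two steps together, a sketch with made-up two-cluster data might look like the following. It assumes a GaussianMixture fitted with covariance_type="full" and uses scaler.scale_, the current name of the per-feature standard deviation.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Hypothetical data: two clusters with very different feature scales.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[0, 0, 0], scale=[1, 5, 10], size=(200, 3)),
    rng.normal(loc=[10, 50, 100], scale=[2, 5, 20], size=(200, 3)),
])

scaler = StandardScaler().fit(X)
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(scaler.transform(X))

s = scaler.scale_

# 1) Cluster means back in the original space.
means_orig = scaler.inverse_transform(gmm.means_)

# 2) Cluster covariances back in the original space, via broadcasting
#    (covariances_ has shape (n_components, n_features, n_features)).
covs_orig = s[:, np.newaxis] * gmm.covariances_ * s

# Same result with explicit diagonal matrices, for comparison.
covs_check = np.array([np.diag(s) @ c @ np.diag(s) for c in gmm.covariances_])

# Per-cluster, per-feature standard deviations in the original space.
stds_orig = np.sqrt(np.diagonal(covs_orig, axis1=1, axis2=2))
```

The broadcasting form and the explicit np.diag form give the same matrices; the first just avoids building the dense diagonal matrix.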
