What do the r_orig and r_emb parameters represent in the UMAP algorithm? - dimensionality-reduction

What exactly do r_orig and r_emb represent in umap.UMAP (docs)?
These values are returned when the output_dens flag is set.
Reading the docs:
r_orig: array, shape (n_samples)
Local radii of data points in the original data space
(log-transformed)
and
r_emb: array, shape (n_samples)
Local radii of data points in the embedding (log-transformed).
I tried to figure that out with an example, by having a toy array of integers with shape (100,50):
array([[15, 8, 16, ..., 12, 9, 14],
[ 4, 4, 5, ..., 4, 19, 15],
[ 2, 4, 16, ..., 4, 7, 8],
...,
[11, 17, 14, ..., 7, 18, 6],
[ 2, 16, 12, ..., 18, 17, 15],
[ 3, 11, 9, ..., 11, 14, 8]])
and executing
umap_trans = umap.UMAP(densmap=True,output_dens=True).fit(arr)
then r_orig looks something like
array([7.557382 , 7.6884522, 7.5413175, 7.5586753, 7.526751 , 7.633186 ,
7.579795 , 7.6138983, 7.4713755, 7.5365367, 7.63102 , 7.627236 ,
7.5586395, 7.5616612, 7.5946164, 7.626307 , 7.6850867, 7.5265946,
7.5604353, 7.5958605, 7.5464926, 7.5515323, 7.6224527, 7.5082755,
7.6015797, 7.5680337, 7.6188903, 7.5625224, 7.6245193, 7.5826597,
7.6149483, 7.5915165, 7.558839 , 7.613548 , 7.578578 , 7.613815 ,
7.684106 , 7.5169396, 7.5644665, 7.6615157, 7.6193194, 7.626235 ,
7.656492 , 7.58103 , 7.5389533, 7.641165 , 7.588751 , 7.554403 ,
7.647078 , 7.6455092, 7.561126 , 7.5732226, 7.6015496, 7.6265235,
7.564877 , 7.5956354, 7.6075587, 7.5987916, 7.626135 , 7.539194 ,
7.5905514, 7.6090746, 7.6593614, 7.6186256, 7.66446 , 7.5629582,
7.6118226, 7.54342 , 7.5881543, 7.563827 , 7.60424 , 7.6116834,
7.5791817, 7.5829387, 7.6135163, 7.562068 , 7.7188945, 7.5859914,
7.6612687, 7.5608892, 7.5465975, 7.5277977, 7.6697884, 7.5451746,
7.5410295, 7.5975976, 7.588921 , 7.6266494, 7.630443 , 7.621092 ,
7.5729136, 7.559135 , 7.665758 , 7.585926 , 7.7076025, 7.4915547,
7.6049953, 7.5991044, 7.637067 , 7.5531616], dtype=float32)
How are these numbers related to the original array? I couldn't find any further mathematical explanation of this.
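For intuition, here is a minimal NumPy sketch of one plausible reading of "local radii (log-transformed)", assuming the densMAP definition: a weighted mean of the squared distances from a point to its neighbors, followed by a log. The helper name `log_local_radii`, the choice `k=15`, and the uniform neighbor weights are assumptions for illustration only. umap-learn weights neighbors by its fuzzy graph memberships, so these numbers are not expected to match r_orig exactly.

```python
import numpy as np

def log_local_radii(X, k=15, eps=1e-8):
    """Hypothetical helper: log of the mean squared distance to the k nearest neighbors."""
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances, shape (n_samples, n_samples).
    sq_norms = (X ** 2).sum(axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    d2 = np.maximum(d2, 0.0)          # clip tiny negative values from round-off
    np.fill_diagonal(d2, np.inf)      # exclude each point from its own neighbors
    # Squared distances to the k nearest neighbors of every point (unordered).
    nearest = np.partition(d2, k - 1, axis=1)[:, :k]
    # Uniform weights are an assumption; densMAP uses graph edge weights instead.
    return np.log(nearest.mean(axis=1) + eps)

rng = np.random.default_rng(0)
arr = rng.integers(0, 20, size=(100, 50))   # toy data shaped like the example above
r = log_local_radii(arr)
print(r.shape, r.min(), r.max())
```

Under this reading, each entry is one number per sample that grows when the sample's neighborhood is sparse and shrinks when it is dense. For integer data in a range like this, the values should come out at a similar order of magnitude to the r_orig array shown above.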

Related

Flopy: error about flopy3_modpath7_unstructured_example

While studying flopy's example notebook flopy3_modpath7_unstructured_example.ipynb,
I ran the following section and got a warning, which then causes an error in the code that follows:
disv = flopy.mf6.ModflowGwfdisv(
    gwf,
    nlay=nlay,
    ncpl=ncpl,
    top=top,
    botm=botm,
    nvert=nvert,
    vertices=vertices,
    cell2d=cell2d,
)
WARNING: Unable to resolve dimension of ('gwf6', 'disv', 'cell2d', 'cell2d', 'icvert') based on shape "ncvert".
I printed some of the disv information. Is the problem in cell2d?
cell2d
{internal}
([( 0, 250. , 10250. , 5, 0, 1, 2, 3, 0, None)
( 1, 750. , 10250. , 5, 1, 4, 5, 2, 1, None)
( 2, 1250. , 10250. , 5, 4, 6, 7, 5, 4, None)
( 3, 1750. , 10250. , 5, 6, 8, 9, 7, 6, None)
( 4, 2250. , 10250. , 5, 8, 10, 11, 9, 8, None)
( 5, 2750. , 10250. , 5, 10, 12, 13, 11, 10, None)
( 6, 3250. , 10250. , 5, 12, 14, 15, 13, 12, None)
Can anyone help me?

efficient way to operate on the ndarray

There is a NumPy ndarray A of shape [100,50,5], and I want to expand it as follows: A will be appended with a one-dimensional array of shape (50,), so that the resulting A has shape [100,50,6].
The elements of this one-dimensional array are computed from the original ndarray, specifically from A[:,:,4], by a given formula: A[:,i,5]=A[:,i,4]*B[i]+5 for i = 0:49. Here A[:,:,5] corresponds to the added array, and B is another array that acts as a weight.
Besides writing a for loop, how can I accomplish this task in a vectorized, efficient way using NumPy operations?
Make 2 arrays - with sizes that we can look at:
In [371]: A = np.arange(24).reshape(2,3,4); B = np.array([10,20,30])
Due to broadcasting, we can add a (3,) array to a (2,3) array:
In [372]: A[:,:,-1]+B
Out[372]:
array([[13, 27, 41],
[25, 39, 53]])
We can then convert that to a (2,3,1) array:
In [373]: (A[:,:,-1]+B)[:,:,None]
Out[373]:
array([[[13],
[27],
[41]],
[[25],
[39],
[53]]])
In [374]: A
Out[374]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
and join them on the last axis:
In [375]: np.concatenate((A, Out[373]), axis=-1)
Out[375]:
array([[[ 0, 1, 2, 3, 13],
[ 4, 5, 6, 7, 27],
[ 8, 9, 10, 11, 41]],
[[12, 13, 14, 15, 25],
[16, 17, 18, 19, 39],
[20, 21, 22, 23, 53]]])
Or we can make a target array of the right size, and copy values to it:
In [376]: A1 = np.zeros((2,3,5),int)
In [377]: A1[:,:,:-1]=A
In [379]: A1[:,:,-1]=Out[372]
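Applied to the shapes and formula from the question, the same broadcasting idea can be sketched as follows. The random contents of A and B are placeholders, since the question does not give actual data.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((100, 50, 5))   # placeholder data with the question's shape
B = rng.random(50)             # one weight per index along axis 1

# B has shape (50,) and broadcasts against A[:, :, 4], which has shape (100, 50).
# Indexing with None adds the trailing axis needed for concatenation.
new_col = (A[:, :, 4] * B + 5)[:, :, None]
A_ext = np.concatenate((A, new_col), axis=-1)
print(A_ext.shape)  # (100, 50, 6)
```

Note that the result is a new array; NumPy arrays cannot grow in place, so A itself keeps its original shape.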

Can anyone explain why I can't concatenate these two matrices?

Here are my matrices and the line of code:
d = np.array([[1,2,3],[6,7,8],[11,12,13],
[16,17,18]])
e = np.array([[ 4, 5],[ 9, 10],[14, 15],[19, 20]])
np.concatenate(d,e)
and this is the error that I get:
TypeError: only integer scalar arrays can be converted to a scalar index
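The call np.concatenate(d,e) is wrong: the arrays must be passed together as a single sequence, such as a tuple: np.concatenate((d,e)). As written, e is taken as the axis argument, which is what raises the TypeError. I tested it, and axis=1 is also required for it to work, because d and e have different numbers of columns.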
np.concatenate((d, e), axis=1)
is the solution
Since those arrays have different shapes, you should specify the axis along which to concatenate, as follows:
1) np.concatenate((d,e), axis=1)
array([[ 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15],
[16, 17, 18, 19, 20]])
or
2) np.concatenate((d,e), axis=None)
array([ 1, 2, 3, 6, 7, 8, 11, 12, 13, 16, 17, 18, 4, 5, 9, 10, 14,
15, 19, 20])
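To see why the axis matters here, a short sketch using the same d and e: with the default axis=0 the column counts (3 and 2) would have to match, while along axis=1 only the row counts (4 and 4) have to match.

```python
import numpy as np

d = np.array([[1, 2, 3], [6, 7, 8], [11, 12, 13], [16, 17, 18]])   # shape (4, 3)
e = np.array([[4, 5], [9, 10], [14, 15], [19, 20]])                # shape (4, 2)

try:
    np.concatenate((d, e))        # default axis=0 needs matching column counts
except ValueError as exc:
    print("axis=0 fails:", exc)

joined = np.concatenate((d, e), axis=1)   # row counts match, so this works
print(joined.shape)                       # (4, 5)
same = np.hstack((d, e))                  # equivalent shorthand for 2-D inputs
```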

Swap pair of elements along an axis

I have a 2d numpy array as such:
import numpy as np
a = np.arange(20).reshape((2,10))
# array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
# [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])
I want to swap pairs of elements in each row. The desired output looks like this:
# array([[ 9, 0, 2, 1, 4, 3, 6, 5, 8, 7],
# [19, 10, 12, 11, 14, 13, 16, 15, 18, 17]])
I managed to find a solution in 1d:
a = np.arange(10)
# does the job for all pairs except the first
output = np.roll(np.flip(np.roll(a,-1).reshape((-1,2)),1).flatten(),2)
# first pair done manually
output[0] = a[-1]
output[1] = a[0]
Any ideas on a "numpy only" solution for the 2d case ?
Because the first pair does not follow the usual pair-swap pattern, we can handle it separately. For the rest, it is relatively straightforward: reshape to split the last axis into pairs, then flip along the new axis. Hence, it would be -
In [42]: a # 2D input array
Out[42]:
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]])
In [43]: b2 = a[:,1:-1].reshape(a.shape[0],-1,2)[...,::-1].reshape(a.shape[0],-1)
In [44]: np.hstack((a[:,[-1,0]],b2))
Out[44]:
array([[ 9, 0, 2, 1, 4, 3, 6, 5, 8, 7],
[19, 10, 12, 11, 14, 13, 16, 15, 18, 17]])
Alternatively, stack and then reshape+flip-axis -
In [50]: a1 = np.hstack((a[:,[0,-1]],a[:,1:-1]))
In [51]: a1.reshape(a.shape[0],-1,2)[...,::-1].reshape(a.shape[0],-1)
Out[51]:
array([[ 9, 0, 2, 1, 4, 3, 6, 5, 8, 7],
[19, 10, 12, 11, 14, 13, 16, 15, 18, 17]])
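Another way to express the same reordering, as a sketch: build the column permutation once and apply it with fancy indexing. The helper name `swap_pairs` is hypothetical, and it assumes an even number of columns, as in the example.

```python
import numpy as np

def swap_pairs(a):
    """Hypothetical helper: reorder the last axis as [last, first, 2, 1, 4, 3, ...]."""
    n = a.shape[-1]
    if n < 4 or n % 2:
        raise ValueError("expected an even number of columns (at least 4)")
    # Inner columns 1..n-2, grouped in pairs and flipped within each pair.
    inner = np.arange(1, n - 1).reshape(-1, 2)[:, ::-1].ravel()
    # The wrap-around pair (last, first) goes in front.
    perm = np.concatenate(([n - 1, 0], inner))
    return a[..., perm]

a = np.arange(20).reshape((2, 10))
print(swap_pairs(a))
```

Because the permutation is applied on the last axis, the same function also covers the 1d case.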

understanding behavior of mapping to an array

When does map modify an array in place? I know the preferred way to iterate over an array is with a list comprehension, but I'm preparing an algorithm for ipyparallel, which apparently uses the map function. Each row of my array is a set of model inputs, and I want to use map, ultimately in parallel, to run the model for each row. I'm using Python 3.4.5 and NumPy 1.11.1. I need these versions for compatibility with other packages.
This simple example creates a list and leaves the input array intact, as I expected.
grid = np.arange(25).reshape(5,5)
grid
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
def f(g):
    return g + 1
n = list(map(f, grid))
grid
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
But when the function modifies a slice of the input row, the array is modified in place. Can anyone explain this behavior?
def f(g):
    g[:2] = g[:2] + 1
    return g
n = list(map(f, grid))
grid
array([[ 1, 2, 2, 3, 4],
[ 6, 7, 7, 8, 9],
[11, 12, 12, 13, 14],
[16, 17, 17, 18, 19],
[21, 22, 22, 23, 24]])
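A likely explanation, sketched below: iterating over a 2-D array (which is what map does) hands each row to the function as a view into the original buffer, not a copy. `g + 1` builds a new array and leaves the row alone, whereas `g[:2] = ...` writes through the view into grid. The name `f_copy` is hypothetical; it shows one way to keep the input intact by copying the row first.

```python
import numpy as np

grid = np.arange(25).reshape(5, 5)

row = next(iter(grid))                 # the same kind of object map() passes to f
print(np.shares_memory(row, grid))     # True: the row is a view, not a copy

def f_copy(g):
    g = g.copy()          # work on a private copy of the row
    g[:2] = g[:2] + 1     # slice assignment now only touches the copy
    return g

n = list(map(f_copy, grid))
print(grid[0])            # first row of grid is unchanged
```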
