Calculating the normed distances between two points in numpy array - python-3.x

I'm hoping to calculate the distances between two points in a (Nx1) numpy array, i.e.:
a = [2, 5, 5, 12, 5, 3, 10, 8, 1, 3, 1]
I'm hoping to get a square matrix with the (normed) distances between each point:
sq = [[0, |2-5|, |2-5|, |2-12|, |2-5|, ...],
[|5-2|, 0, ...], ...]
So far, what I have doesn't work: it gives wrong values for the square distance matrix. Is there also a way to vectorise my method (I'm not sure if that is the correct term)? I am unfamiliar with advanced indexing.
What I currently have is the following:
sq = np.zero((len(a), len(a))
for i in a:
for j in len(a+1):
sq[i,j] = np.abs(a[:,0] - a[:,0])
Would appreciate any help!

I think that by exploiting numpy broadcasting, this is the faster solution:
a = [2, 5, 5, 12, 5, 3, 10, 8, 1, 3, 1]
a = np.array(a).reshape(-1,1)
sq = np.abs(a.T-a)
sq
array([[ 0, 3, 3, 10, 3, 1, 8, 6, 1, 1, 1],
[ 3, 0, 0, 7, 0, 2, 5, 3, 4, 2, 4],
[ 3, 0, 0, 7, 0, 2, 5, 3, 4, 2, 4],
[10, 7, 7, 0, 7, 9, 2, 4, 11, 9, 11],
[ 3, 0, 0, 7, 0, 2, 5, 3, 4, 2, 4],
[ 1, 2, 2, 9, 2, 0, 7, 5, 2, 0, 2],
[ 8, 5, 5, 2, 5, 7, 0, 2, 9, 7, 9],
[ 6, 3, 3, 4, 3, 5, 2, 0, 7, 5, 7],
[ 1, 4, 4, 11, 4, 2, 9, 7, 0, 2, 0],
[ 1, 2, 2, 9, 2, 0, 7, 5, 2, 0, 2],
[ 1, 4, 4, 11, 4, 2, 9, 7, 0, 2, 0]])
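The same broadcasting idea can be sketched without the reshape, by adding the extra axis at subtraction time. This is only an illustrative variant; the helper name pairwise_abs_diff is made up for this example and is not part of numpy.

```python
import numpy as np

def pairwise_abs_diff(values):
    """Return the NxN matrix of |values[i] - values[j]| (hypothetical helper)."""
    a = np.asarray(values)
    # a[:, None] has shape (N, 1) and a[None, :] has shape (1, N);
    # broadcasting the subtraction yields an (N, N) result.
    return np.abs(a[:, None] - a[None, :])

sq = pairwise_abs_diff([2, 5, 5, 12, 5, 3, 10, 8, 1, 3, 1])
print(sq.shape)  # (11, 11)
```

The result is symmetric with a zero diagonal, which is a quick sanity check for any distance matrix.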

With numpy, the following list comprehension may be the shortest route to your result:
import numpy as np
a = np.array([2, 5, 5, 12, 5, 3, 10, 8, 1, 3, 1])
sq = np.array([np.array([(np.abs(i - j)) for j in a]) for i in a])
print(sq)
The following would give you the desired result without numpy.
a = [2, 5, 5, 12, 5, 3, 10, 8, 1, 3, 1]
sq = []
for i in a:
    distances = []
    for j in a:
        distances.append(abs(i-j))
    sq.append(distances)
print(sq)
With both, the result comes as:
[[0, 3, 3, 10, 3, 1, 8, 6, 1, 1, 1], [3, 0, 0, 7, 0, 2, 5, 3, 4, 2, 4], [3, 0, 0, 7, 0, 2, 5, 3, 4, 2, 4], [10, 7, 7, 0, 7, 9, 2, 4, 11, 9, 11], [3, 0, 0, 7, 0, 2, 5, 3, 4, 2, 4], [1, 2, 2, 9, 2, 0, 7, 5, 2, 0, 2], [8, 5, 5, 2, 5, 7, 0, 2, 9, 7, 9], [6, 3, 3, 4, 3, 5, 2, 0, 7, 5, 7], [1, 4, 4, 11, 4, 2, 9, 7, 0, 2, 0], [1, 2, 2, 9, 2, 0, 7, 5, 2, 0, 2], [1, 4, 4, 11, 4, 2, 9, 7, 0, 2, 0]]

There may be more than one way to do this, but one way is to use only numpy operations instead of Python loops, because numpy runs array operations in compiled code rather than element by element in the interpreter.
One way to do this using only array operations is to create an NxN matrix b in which row i holds a[i] repeated N times.
E.g:
a = [1, 2, 3]
b = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
Then you can do the array operation
ans = abs(b - a)
Assuming a is a 1-D numpy array, you can do:
b = np.repeat(a, a.shape[0]).reshape((a.shape[0], a.shape[0]))
ans = np.abs(b - a)
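As a rough check of the repeat-based approach, the sketch below runs it on a three-element array and compares it with np.subtract.outer, which computes every pairwise difference without building b by hand. The variable names via_repeat and via_outer are only illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])
n = a.shape[0]

# Row i of b holds a[i] repeated n times: [[1, 1, 1], [2, 2, 2], [3, 3, 3]].
b = np.repeat(a, n).reshape(n, n)
via_repeat = np.abs(b - a)

# np.subtract.outer(a, a)[i, j] is a[i] - a[j], so no explicit repeat is needed.
via_outer = np.abs(np.subtract.outer(a, a))

print(via_repeat)
print(np.array_equal(via_repeat, via_outer))
```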

Related

How to interpret the indices returned by torch.mode

>>> b
tensor([[ 6, 7, 12, 7, 8],
[ 0, 1, 6, 1, 2],
[ 0, 1, 6, 1, 2],
[ 2, 3, 8, 3, 4],
[ 2, 3, 8, 3, 4],
[ 2, 3, 8, 3, 4],
[10, 11, 16, 11, 12],
[-1, 0, 5, 0, 1],
[-2, -1, 4, -1, 0],
[ 2, 3, 8, 3, 4],
[ 1, 2, 7, 2, 3],
[ 1, 2, 7, 2, 3],
[ 2, 3, 8, 3, 4],
[ 5, 6, 11, 6, 7],
[-2, -1, 4, -1, 0],
[-3, -2, 3, -2, -1],
[-5, -4, 1, -4, -3],
[ 1, 2, 7, 2, 3],
[12, 13, 18, 13, 14],
[-3, -2, 3, -2, -1],
[ 2, 3, 8, 3, 4],
[ 3, 4, 9, 4, 5],
[10, 11, 16, 11, 12],
[-6, -5, 0, -5, -4],
[ 9, 10, 15, 10, 11],
[12, 13, 18, 13, 14],
[-3, -2, 3, -2, -1],
[-2, -1, 4, -1, 0],
[-4, -3, 2, -3, -2],
[-1, 0, 5, 0, 1],
[ 2, 3, 8, 3, 4],
[ 4, 5, 10, 5, 6],
[-1, 0, 5, 0, 1],
[ 5, 6, 11, 6, 7],
[ 7, 8, 13, 8, 9],
[ 3, 4, 9, 4, 5],
[ 2, 3, 8, 3, 4],
[ 4, 5, 10, 5, 6],
[-4, -3, 2, -3, -2],
[ 2, 3, 8, 3, 4],
[-1, 0, 5, 0, 1],
[ 2, 3, 8, 3, 4],
[ 4, 5, 10, 5, 6],
[ 9, 10, 15, 10, 11],
[-1, 0, 5, 0, 1],
[-4, -3, 2, -3, -2],
[ 0, 1, 6, 1, 2],
[ 4, 5, 10, 5, 6],
[ 6, 7, 12, 7, 8],
[-2, -1, 4, -1, 0]])
>>> torch.mode(b, 0)
torch.return_types.mode(
values=tensor([2, 3, 8, 3, 4]),
indices=tensor([20, 20, 20, 20, 20]))
I don't know why the indices are all equal to 20.
The torch.mode documentation reads as follows:
https://pytorch.org/docs/stable/generated/torch.mode.html#torch.mode
torch.mode(input, dim=-1, keepdim=False, *, out=None)
Returns a namedtuple (values, indices) where values is the mode value of each row of the input tensor in the given dimension dim, i.e. a value which appears most often in that row, and indices is the index location of each mode value found.
By default, dim is the last dimension of the input tensor.
If keepdim is True, the output tensors are of the same size as input except in the dimension dim where they are of size 1. Otherwise, dim is squeezed (see torch.squeeze()), resulting in the output tensors having 1 fewer dimension than input.
It is because of the way the tensor b is. The row [2, 3, 8, 3, 4] is repeated a lot, so in each column the modes are respectively [2, 3, 8, 3, 4] and more importantly, the mode indices will be equal precisely because the modes occur together; if you look at the row with index 20 (i.e., the 21st row), it is exactly [2, 3, 8, 3, 4].
I am assuming that you constructed b similarly to the example in the torch.mode documentation, which I believe is a poor choice of example as it leads to confusion like the one you are having.
Instead, consider the following:
>>> b = torch.randint(4, (5, 7))
>>> b
tensor([[0, 0, 0, 2, 0, 0, 2],
[0, 3, 0, 0, 2, 0, 1],
[2, 2, 2, 0, 0, 0, 3],
[2, 2, 3, 0, 1, 1, 0],
[1, 1, 0, 0, 2, 0, 2]])
>>> torch.mode(b, 0)
torch.return_types.mode(
values=tensor([0, 2, 0, 0, 0, 0, 2]),
indices=tensor([1, 3, 4, 4, 2, 4, 4]))
In the above, b has different modes in each column which are respectively [0, 2, 0, 0, 0, 0, 2] and the indices returned by torch.mode are [1, 3, 4, 4, 2, 4, 4]. This makes sense because, for example, in the first column, 0 is the most common element and there is a 0 at index 1. Similarly, in the second column, 2 is the most common element and there is a 2 at index 3. This is true for all columns. If you want the modes of the rows instead, you would do torch.mode(b, 1).
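To make the meaning of values and indices concrete, here is a pure-Python sketch of a column-wise mode on the same 5x7 data. It is not how torch.mode is implemented: the tie-breaking rule (smallest value wins) and the choice of the last matching row as the index are assumptions of this sketch, and column_modes is a made-up name.

```python
from collections import Counter

def column_modes(rows):
    """Column-wise mode with one row index per column (illustrative only)."""
    values, indices = [], []
    for column in zip(*rows):
        counts = Counter(column)
        top = max(counts.values())
        # Assumption: among equally frequent values, take the smallest.
        mode = min(v for v, c in counts.items() if c == top)
        # Assumption: report the last row in which the mode occurs.
        index = max(i for i, v in enumerate(column) if v == mode)
        values.append(mode)
        indices.append(index)
    return values, indices

b = [[0, 0, 0, 2, 0, 0, 2],
     [0, 3, 0, 0, 2, 0, 1],
     [2, 2, 2, 0, 0, 0, 3],
     [2, 2, 3, 0, 1, 1, 0],
     [1, 1, 0, 0, 2, 0, 2]]
values, indices = column_modes(b)
print(values)
print(indices)
```

Whatever index is reported, the defining property is that b[indices[c]][c] equals values[c] for every column c.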

How to create a multidimensional matrix in python with recursion

Let's say I have defined a function to generate a list of 6 random integers ranging from 0 to 10
import random
def func():
    randomlist = random.sample(range(11), 6)
    return randomlist
Run:
func()
Output:
[3, 7, 10, 9, 4, 1]
Now I want to define another function which calls func() inside with a parameter n to generate a multidimensional matrix, where each element will be replaced by a newly created list generated by func(), and n is the times of the completed replacement (so that the total number of integers in the matrix would be 6^n) -- for example --
when n=1, expected result:
[3, 7, 10, 9, 4, 1]
when n=2, expected result:
[[6, 2, 9, 1, 4, 0],
[7, 8, 1, 9, 4, 1],
[1, 0, 4, 6, 3, 1],
[9, 4, 3, 8, 6, 7],
[2, 4, 3, 9, 5, 6],
[4, 7, 2, 0, 1, 8]]
when n=3, expected result:
[[[4, 7, 3, 0, 2, 8],[1, 5, 6, 5, 4, 8],[8, 9, 6, 5, 10, 4],[7, 8, 6, 6, 4, 10],[7, 8, 1, 0, 2, 3],[4, 5, 8, 5, 4, 6]],
[[1, 7, 2, 0, 2, 8],[4, 5, 8, 8, 4, 5],[9, 5, 6, 2, 1, 3],[5, 4, 1, 2, 6, 10],[7, 5, 4, 1, 1, 4],[9, 6, 5, 2, 2, 1]],
[[8, 2, 7, 10, 2, 7],[8, 9, 5, 4, 5, 5],[5, 8, 7, 7, 4, 6],[9, 5, 9, 10, 5, 4],[1, 4, 5, 6, 5, 7],[9, 8, 7, 6, 5, 1]],
[[0, 7, 4, 0, 1, 9],[4, 7, 3, 0, 2, 8],[8, 9, 6, 5, 10, 4],[8, 2, 7, 10, 2, 7],[7, 5, 4, 1, 1, 4],[7, 8, 1, 9, 4, 1]],
[[9, 5, 3, 9, 2, 8],[8, 9, 6, 5, 10, 4],[9, 4, 3, 8, 6, 7],[3, 7, 10, 9, 4, 1],[4, 7, 3, 0, 2, 8],[9, 4, 3, 8, 6, 7]],
[[5, 3, 4, 5, 2, 10],[7, 5, 4, 1, 1, 4],[4, 7, 3, 0, 2, 8],[4, 5, 8, 8, 4, 5],[7, 8, 1, 9, 4, 1],[8, 2, 7, 10, 2, 7]]]
I think I'd need a recursive function, but I'm totally out of a clue. Any insights? Thanks a lot.
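Well, the base case of 1 is just to call your func; otherwise, build a list of 6 recursive calls on n-1, so that each level multiplies the number of integers by 6 and the total is 6^n:
def recur_6(n):
    if n <= 1:
        return func()
    # 6 sublists per level gives 6**n integers in total
    return [recur_6(n-1) for _ in range(6)]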
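A quick way to check a recursive builder like this is to count the integers it produces and compare against 6^n. The sketch below assumes six sublists per nesting level, as in the expected results above; nested and count_ints are hypothetical helper names.

```python
import random

def func():
    # Six distinct random integers from 0 to 10, as in the question.
    return random.sample(range(11), 6)

def nested(n):
    """Build an n-level nested list with six children at every level."""
    if n <= 1:
        return func()
    return [nested(n - 1) for _ in range(6)]

def count_ints(obj):
    """Count the integers stored anywhere inside a nested list."""
    if isinstance(obj, int):
        return 1
    return sum(count_ints(item) for item in obj)

for n in range(1, 5):
    print(n, count_ints(nested(n)), 6 ** n)
```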

PyTorch: How to insert before a certain element

Currently I have a 2D tensor. For each row, I want to insert a new element e before the first occurrence of a specified value v. Additional information: there is no guarantee that every row contains such a value. If a row doesn't, just append the element.
Example: suppose e is 0 and v is 10. Given a tensor
[[9, 6, 5, 4, 10],
[8, 7, 3, 5, 5],
[4, 9, 10, 10, 10]]
I want to get
[[9, 6, 5, 4, 0, 10],
[8, 7, 3, 5, 5, 0],
[4, 9, 0, 10, 10, 10]]
Are there some Torch-style ways to do this? In the worst case I can treat this as a plain Python problem, but I suspect that solution would be slow.
I haven't yet found a full PyTorch solution. I'll keep looking, but here is somewhere to start:
>>> v, e = 10, 0
>>> v, e = torch.tensor([v]), torch.tensor([e])
>>> x = torch.tensor([[ 9, 6, 5, 4, 10],
[ 8, 7, 3, 5, 5],
[ 4, 9, 10, 10, 10],
[10, 9, 7, 10, 2]])
To deal with the edge case where v is not found in one of the rows you can add a temporary column to x. This will ensure every row has a value v in it. We will use x_ as a helper tensor:
>>> x_ = torch.cat([x, v.repeat(x.size(0))[:, None]], axis=1)
tensor([[ 9, 6, 5, 4, 10, 10],
[ 8, 7, 3, 5, 5, 10],
[ 4, 9, 10, 10, 10, 10],
[10, 9, 7, 10, 2, 10]])
Find the indices of the first value v on each row:
>>> bp = (x_ == v).int().argmax(axis=1)
tensor([4, 5, 2, 0])
Finally, the easiest way to insert values at different positions in each row is with a list comprehension:
>>> torch.stack([torch.cat([xi[:bpi], e, xi[bpi:]]) for xi, bpi in zip(x, bp)])
tensor([[ 9, 6, 5, 4, 0, 10],
[ 8, 7, 3, 5, 5, 0],
[ 4, 9, 0, 10, 10, 10],
[ 0, 10, 9, 7, 10, 2]])
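The per-row loop can in principle be avoided with a boolean mask. The sketch below illustrates that idea in NumPy rather than PyTorch, so it is an assumption that the same indexing pattern carries over to tensors; insert_before_first is a made-up name.

```python
import numpy as np

def insert_before_first(x, v, e):
    """Insert e before the first v in each row, or append e if v is absent."""
    x = np.asarray(x)
    n_rows, n_cols = x.shape
    hits = x == v
    # Column of the first v per row; n_cols (the append slot) when v is missing.
    pos = np.where(hits.any(axis=1), hits.argmax(axis=1), n_cols)
    # Start with e everywhere, then overwrite every slot except the insertion one.
    out = np.full((n_rows, n_cols + 1), e, dtype=x.dtype)
    keep = np.arange(n_cols + 1)[None, :] != pos[:, None]
    # Boolean-mask assignment fills in row-major order, preserving each row's order.
    out[keep] = x.ravel()
    return out

x = [[9, 6, 5, 4, 10],
     [8, 7, 3, 5, 5],
     [4, 9, 10, 10, 10],
     [10, 9, 7, 10, 2]]
result = insert_before_first(x, v=10, e=0)
print(result)
```

Each row of the mask has exactly one False entry, which is why the flattened input fits the remaining slots exactly.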
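Edit - If v cannot occur in the first position, x_ is not needed. Note, however, that the variant below shifts the insertion point by one, so e lands one position earlier than requested (and before the last element when v is absent):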
>>> x
tensor([[ 9, 6, 5, 4, 10],
[ 8, 7, 3, 5, 5],
[ 4, 9, 10, 10, 10]])
>>> bp = (x == v).int().argmax(axis=1) - 1
>>> torch.stack([torch.cat([xi[:bpi], e, xi[bpi:]]) for xi, bpi in zip(x, bp)])
tensor([[ 9, 6, 5, 0, 4, 10],
[ 8, 7, 3, 5, 0, 5],
[ 4, 0, 9, 10, 10, 10]])

Question about rvs boundary in scipy.stats.randint

I'm using the scipy.stats.randint to get random numbers.
Here is my source code and result.
Input:
from scipy.stats import randint
randint.rvs(0.00001, 10, size=100)
Output:
array([6, 4, 6, 7, 9, 7, 3, 0, 2, 5, 1, 1, 0, 3, 6, 7, 3, 6, 4, 8, 6, 5,
0, 0, 5, 1, 3, 2, 3, 1, 0, 6, 5, 2, 0, 0, 9, 1, 5, 2, 3, 6, 1, 4,
3, 1, 4, 4, 9, 5, 6, 3, 4, 3, 7, 7, 2, 4, 0, 2, 0, 6, 8, 1, 5, 6,
4, 6, 5, 0, 8, 8, 5, 9, 3, 2, 8, 7, 1, 4, 6, 0, 7, 3, 9, 1, 2, 7,
7, 6, 4, 3, 3, 3, 4, 7, 7, 4, 1, 1])
My question is: I've set low to 0.00001, so how did the 0s end up in the output?
Thanks for your help.
Scipy's randint invokes mtrand.randint, which is part of the NumPy package.
As you can see from its source code, the lower bound is truncated using (int)(low), so 0.00001 becomes 0.
So, to get random numbers from the closed interval [1, 10], do the following:
randint.rvs(1, 11, size=100)
Note that you need to increase the high bound by 1, because high is exclusive, as can be seen from the probability mass function (pmf) for randint.
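The two points above (truncation of the lower bound and the exclusive upper bound) can be illustrated with plain Python and NumPy. This sketch uses NumPy's Generator.integers as a stand-in and does not call scipy.stats.randint itself; the seed and sample size are arbitrary.

```python
import numpy as np

# Truncating a small positive float toward zero gives 0,
# which is why 0 could appear in the output.
low = 0.00001
truncated_low = int(low)
print(truncated_low)

# With integer bounds, the upper bound is exclusive by default,
# so (1, 11) covers the closed interval [1, 10].
rng = np.random.default_rng(0)
samples = rng.integers(1, 11, size=1000)
print(samples.min(), samples.max())
```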

Python: finding all palindromic sequences of length k that sum to n

I'm trying to find all palindromic sequences of length k that sum to n. I have a specific example (k=6):
def brute(n):
    J=[]
    for a in range(1,n):
        for b in range(1,n):
            for c in range(1,n):
                if (a+b+c)*2==n:
                    J.append((a,b,c,c,b,a))
    return(J)
The output gives me something like:
[(1, 1, 6, 6, 1, 1),
(1, 2, 5, 5, 2, 1),
(1, 3, 4, 4, 3, 1),
(1, 4, 3, 3, 4, 1),
(1, 5, 2, 2, 5, 1),
(1, 6, 1, 1, 6, 1),
(2, 1, 5, 5, 1, 2),
(2, 2, 4, 4, 2, 2),
(2, 3, 3, 3, 3, 2),
(2, 4, 2, 2, 4, 2),
(2, 5, 1, 1, 5, 2),
(3, 1, 4, 4, 1, 3),
(3, 2, 3, 3, 2, 3),
(3, 3, 2, 2, 3, 3),
(3, 4, 1, 1, 4, 3),
(4, 1, 3, 3, 1, 4),
(4, 2, 2, 2, 2, 4),
(4, 3, 1, 1, 3, 4),
(5, 1, 2, 2, 1, 5),
(5, 2, 1, 1, 2, 5),
(6, 1, 1, 1, 1, 6)]
The issue is that I have no idea how to generalize this to arbitrary values of n and k. I hear that dictionaries would be helpful. Did I mention I'm new to Python? Any help would be appreciated.
Thanks
The idea is that we simply count from 0 to 10**k - 1, and treat each of these "integers" as a candidate sequence of k digits, left-padding with 0 where necessary. So, for k==6, 0 -> [0, 0, 0, 0, 0, 0], 1 -> [0, 0, 0, 0, 0, 1], etc. This enumerates all possible digit combinations (note that each element is therefore limited to 0-9). If a candidate is a palindrome, we also check that it adds up to n.
Below is some code that (should) give a correct result for arbitrary n and k, but is not terribly efficient. I'll leave optimizing up to you (if it's necessary), and give some tips on how to do it.
Here's the code:
def find_all_palindromic_sequences(n, k):
    result = []
    for i in range(10**k):
        paly = gen_palindrome(i, k, n)
        if paly is not None:
            result.append(paly)
    return result
def gen_palindrome(i, k, n):
    # Left-pad i with zeros to k digits, then split into a list of ints
    i_padded = str(i).zfill(k)
    i_digits = [int(digit) for digit in i_padded]
    if i_digits == i_digits[::-1] and sum(i_digits) == n:
        return i_digits
    # Implicitly returns None when i is not a palindrome summing to n
To test it, we can do:
for paly in find_all_palindromic_sequences(n=16, k=6):
    print(paly)
This outputs:
[0, 0, 8, 8, 0, 0]
[0, 1, 7, 7, 1, 0]
[0, 2, 6, 6, 2, 0]
[0, 3, 5, 5, 3, 0]
[0, 4, 4, 4, 4, 0]
[0, 5, 3, 3, 5, 0]
[0, 6, 2, 2, 6, 0]
[0, 7, 1, 1, 7, 0]
[0, 8, 0, 0, 8, 0]
[1, 0, 7, 7, 0, 1]
[1, 1, 6, 6, 1, 1]
[1, 2, 5, 5, 2, 1]
[1, 3, 4, 4, 3, 1]
[1, 4, 3, 3, 4, 1]
[1, 5, 2, 2, 5, 1]
[1, 6, 1, 1, 6, 1]
[1, 7, 0, 0, 7, 1]
[2, 0, 6, 6, 0, 2]
[2, 1, 5, 5, 1, 2]
[2, 2, 4, 4, 2, 2]
[2, 3, 3, 3, 3, 2]
[2, 4, 2, 2, 4, 2]
[2, 5, 1, 1, 5, 2]
[2, 6, 0, 0, 6, 2]
[3, 0, 5, 5, 0, 3]
[3, 1, 4, 4, 1, 3]
[3, 2, 3, 3, 2, 3]
[3, 3, 2, 2, 3, 3]
[3, 4, 1, 1, 4, 3]
[3, 5, 0, 0, 5, 3]
[4, 0, 4, 4, 0, 4]
[4, 1, 3, 3, 1, 4]
[4, 2, 2, 2, 2, 4]
[4, 3, 1, 1, 3, 4]
[4, 4, 0, 0, 4, 4]
[5, 0, 3, 3, 0, 5]
[5, 1, 2, 2, 1, 5]
[5, 2, 1, 1, 2, 5]
[5, 3, 0, 0, 3, 5]
[6, 0, 2, 2, 0, 6]
[6, 1, 1, 1, 1, 6]
[6, 2, 0, 0, 2, 6]
[7, 0, 1, 1, 0, 7]
[7, 1, 0, 0, 1, 7]
[8, 0, 0, 0, 0, 8]
Which looks similar to your result, plus the results that contain 0.
Ideas for making it faster (this will slow down a lot as k becomes large):
This is an embarrassingly parallel problem, consider multithreading/multiprocessing.
The palindrome check of i_digits == i_digits[::-1] isn't as efficient as it could be (both in terms of memory and CPU). Having a pointer at the start and end, and traversing characters one by one till the pointers cross would be better.
There are some conditional optimizations you can do on certain values of n. For instance, if n is 0, it doesn't matter how large k is, the only palindrome will be [0, 0, 0, ..., 0, 0]. As another example, if n is 8, we obviously don't have to generate any permutations with 9 in them. Or, if n is 20, and k is 6, then we can't have 3 9's in our permutation. Generalizing this pattern will pay off big assuming n is reasonably small. It works the other way, too, actually. If n is large, then there is a limit to the number of 0s and 1s that can be in each permutation.
There is probably a better way of generating palindromes than testing every single integer. For example, if we know that integer X is a palindrome sequence, then X+1 will not be. It's pretty easy to show this: the first and last digits can't match for X+1 since we know they must have matched for X. You might be able to show that X+2 and X+3 cannot be palindromes either, etc. If you can generalize where you must test for a new palindrome, this will be key. A number theorist could help more in this regard.
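One possible way to avoid testing every integer is to enumerate only the first half of the sequence and mirror it, so every candidate is a palindrome by construction. The sketch below keeps the answer's assumption that each element is a single digit 0-9; palindromic_digit_sequences is a made-up name and this is not the code from the answer.

```python
from itertools import product

def palindromic_digit_sequences(n, k):
    """List all length-k digit palindromes whose digits sum to n."""
    half_len = (k + 1) // 2  # includes the middle digit when k is odd
    results = []
    for half in product(range(10), repeat=half_len):
        # Mirror the first k // 2 digits to complete the palindrome.
        seq = list(half) + list(half[:k // 2][::-1])
        if sum(seq) == n:
            results.append(seq)
    return results

found = palindromic_digit_sequences(16, 6)
print(len(found))
print(found[0], found[-1])
```

This loops over 10**((k + 1) // 2) candidates instead of 10**k, which for k = 6 is 1,000 rather than 1,000,000.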
HTH.

Resources