boolean mask for similarity between tensors - pytorch

I have two tensors and would like to check, for each element of a row in a, whether it also appears in the same row of b.
a = [[1,2,3], [7,8,4]]
b = [[2,1,1], [4,5,6]]
c = [[T,T,F], [F,F,T]]
I would like this to be done in pure PyTorch in the fastest way possible.

Found the solution (https://stackoverflow.com/a/67870684/12216433)
In my case it would work like this.
AA = a.reshape(2, 3, 1)
BB = b.reshape(2, 1, 3)
mask = (AA == BB).sum(-1).bool()
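For reference, here is a self-contained sketch of the same broadcasting idea on the tensors from the question. It assumes two integer tensors of equal shape. It uses unsqueeze and any(dim=-1) in place of reshape and sum(-1).bool(), which should give the same mask without hard-coding the sizes.

```python
import torch

a = torch.tensor([[1, 2, 3], [7, 8, 4]])
b = torch.tensor([[2, 1, 1], [4, 5, 6]])

# a.unsqueeze(-1) has shape (2, 3, 1) and b.unsqueeze(1) has shape (2, 1, 3),
# so the comparison broadcasts to (2, 3, 3): every element of a row in `a`
# is compared against every element of the same row in `b`.
pairwise = a.unsqueeze(-1) == b.unsqueeze(1)

# An element of `a` is "in" the row of `b` if any of those comparisons is True.
mask = pairwise.any(dim=-1)
print(mask)
```

The intermediate tensor has one entry per pair of elements within a row, so memory grows with the square of the row length.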

Related

Conditionally use parts of a nested for loop

I've searched for this answer extensively, but can't seem to find an answer. Therefore, for the first time, I am posting a question here.
I have a function that uses many parameters to perform a calculation. Based on user input, I want to iterate through possible values for some (or all) of the parameters. If I wanted to iterate through all of the parameters, I might do something like this:
for i in range(low1,high1):
    for j in range(low2,high2):
        for k in range(low3,high3):
            for m in range(low4,high4):
                doFunction(i, j, k, m)
If I only wanted to iterate the 1st and 4th parameter, I might do this:
for i in range(low1,high1):
    for m in range(low4,high4):
        doFunction(i, user_input_j, user_input_k, m)
My actual code has almost 15 nested for-loops with 15 different parameters - each of which could be iterable (or not). So, it isn't scalable for me to use what I have and code a unique block of for-loops for each combination of a parameter being iterable or not. If I did that, I'd have 2^15 different blocks of code.
I could do something like this:
if use_static_j:
    # dummy range with exactly one iteration
    low2 = -1000
    high2 = -999
for i in range(low1,high1):
    for j in range(low2,high2):
        for k in range(low3,high3):
            for m in range(low4,high4):
                j1 = user_input_j if use_static_j else j
                doFunction(i, j1, k, m)
I'd just like to know if there is a better way. Perhaps using filter(), map(), or list comprehension... (which I don't have a clear enough understanding of yet)
As suggested in the comments, you could build an array of the parameters and then call the function with each of the values in the array. The easiest way to build the array is using recursion over a list defining the ranges for each parameter. In this code I've assumed a list of tuples consisting of start, stop and scale parameters (so for example the third element in the list produces [3, 2.8, 2.6, 2.4, 2.2]). To use a static value you would use a tuple (static, static+1, 1).
def build_param_array(ranges):
    r = ranges[0]
    if len(ranges) == 1:
        return [[p * r[2]] for p in range(r[0], r[1], -1 if r[1] < r[0] else 1)]
    res = []
    for p in range(r[0], r[1], -1 if r[1] < r[0] else 1):
        pa = build_param_array(ranges[1:])
        for a in pa:
            res.append([p * r[2]] + a)
    return res

# range = (start, stop, scale)
ranges = [(1, 5, 1),
          (0, 10, .1),
          (15, 10, .2)
         ]
params = build_param_array(ranges)
for p in params:
    doFunction(*p)
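As an alternative to writing the recursion by hand, the standard library's itertools.product yields the same kind of parameter combinations. The following is only a sketch: do_function and param_values are hypothetical stand-ins, and a static parameter is modelled as a one-element list.

```python
from itertools import product

def do_function(i, j, k, m):
    # Hypothetical stand-in for the real calculation.
    return (i, j, k, m)

def param_values(low, high, static=None):
    """Return the values to loop over: a range, or one static value."""
    return range(low, high) if static is None else [static]

# Iterate over the 1st and 4th parameters; hold the 2nd and 3rd fixed.
axes = [
    param_values(0, 2),
    param_values(0, 0, static=7),
    param_values(0, 0, static=9),
    param_values(5, 7),
]

# product(*axes) behaves like one nested for-loop per entry in `axes`.
results = [do_function(*combo) for combo in product(*axes)]
print(results)
```

Because product is lazy, the combinations are generated one at a time rather than stored up front, which matters when 15 parameters are involved.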

Python assigning variables with an OR on assignment, multiple statements in one line?

I am not super familiar with Python, and I am having trouble reading this code. I have never seen this syntax, where multiple statements are paired together (I think) on one line, separated by commas.
if L1.data < L2.data:
    tail.next, L1 = L1, L1.next
Also, I don't understand assignment in python with "or": where is the conditional getting evaluated? See this example. When would tail.next be assigned L1, and when would tail.next be assigned L2?
tail.next = L1 or L2
Any clarification would be greatly appreciated. I haven't been able to find much on either syntax
See below
>>> a = 0
>>> b = 1
>>> a, b
(0, 1)
>>> a, b = b, a
>>> a, b
(1, 0)
>>>
It allows one to swap values without requiring a temporary variable.
In your case, the line
tail.next, L1 = L1, L1.next
is equivalent to
tail.next = L1
L1 = L1.next
In Python, writing comma-separated values creates a tuple (a kind of data structure).
a = 4,5
type(a) --> tuple
This is called tuple packing.
When we do:
a, b = 4,5
This is called tuple unpacking. It is equivalent to:
a = 4
b = 5
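or is the boolean operator here. L1 or L2 evaluates to L1 if L1 is truthy, and to L2 otherwise, so tail.next is assigned L1 unless L1 is None (or otherwise falsy), in which case it is assigned L2.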
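A small sketch of both behaviours. The Node class is a hypothetical minimal linked-list node and is not taken from the question.

```python
class Node:
    # Hypothetical minimal linked-list node, for illustration only.
    def __init__(self, data, next=None):
        self.data = data
        self.next = next

# `x or y` returns x if x is truthy, otherwise y.
L2 = Node(5)
picked_when_L1_is_none = None or L2   # L2, because None is falsy
L1 = Node(3)
picked_when_L1_is_set = L1 or L2      # L1, because a Node object is truthy

# Tuple assignment evaluates the whole right-hand side first,
# then assigns to the targets from left to right.
tail = Node(0)
rest = Node(4)
L1 = Node(3, rest)
first = L1
tail.next, L1 = L1, L1.next
# Now tail.next is the old L1 (`first`) and L1 has advanced to `rest`.
```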

Replacing list in list of lists gives only zeros

I have a matrix (list of lists), say a, and I want to normalize each "row" so that each element becomes its fraction of the row sum, i.e. [p/sum(row) for p in row].
I have the following code
a_norm[:] = a
for i,row in enumerate(a_norm):
    b = [p/sum(row) for p in row]
    print(b)
    a_norm[i] = b
The rows being printed (print(b)) are completely fine, but a_norm consists purely of zeros for some reason.
EDIT: Adding an example.
a=np.array([[1,2,3], [20,22,13]]) should give a_norm=[[0.16,0.33,0.5],[0.36,0.4,0.24]]
Try this one:
a_norm = [[i / sum(row) for i in row] for row in a]
The mistake is in how you copy the list: use a_norm = a[:] instead of a_norm[:] = a.
You can try:
a_norm = a[:]
for i, row in enumerate(a_norm):
    b = [p/sum(row) for p in row]
    print(b)
    a_norm[i] = b
print(a_norm)
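One caveat, assuming a really is an integer NumPy array as in the EDIT of the question rather than a plain list of lists: a[:] is then a view with an integer dtype, and writing fractions into it truncates them to 0. The sketch below shows that truncation and a vectorised alternative that builds a new float array instead.

```python
import numpy as np

a = np.array([[1, 2, 3], [20, 22, 13]])

# Writing floats into an integer array truncates them toward zero,
# which is a plausible source of the all-zero result.
truncated = a.copy()
truncated[0] = [p / a[0].sum() for p in a[0]]

# Dividing by the per-row sums creates a new float array.
# keepdims=True keeps the sums as a column so they broadcast across each row.
a_norm = a / a.sum(axis=1, keepdims=True)
print(a_norm.round(2))
```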

search in iteration using python

I have two lists "OD_pair" and "OD_list".
OD_pair = [A, B, C]
OD_list = [B, B, A, B, A, B, C]
I am writing a Python search to count how many times each OD pair is repeated in the OD list, and to add another column for the result. For example:
I will take "A" from OD_pair, go to OD_list, count how many "A"s are in OD_list, return the number, and add it next to the OD pair.
# take OD pairs from the moira data
OD_pair = df_moira['OD_pair'] # OD pair list
# loop over the ticket gate data and count how many times each OD pair appears
OD_list = df_ticket_gate['OD_PAIRS'] # OD list
i = 0
while i < len(OD_pair): # go through the OD pair list
    OD = OD_pair(i) # take an item to search for
    j = 0
    for j in OD_list:
        sum(1 for OD_pair in OD_list if OD = OD_list(j)) # search for the item in the OD list and count
    i += 1
The result will look like this :
OD_pair = [ A 2
B 4
C 1 ]
If all you are looking for is the number of times an item is repeated in a list of values, you can try this:
import pandas as pd

df = pd.DataFrame({'A':[1,2,3,4]})
df1 = pd.DataFrame({'B':[2,1,2,3,1,2,3,1,3]})
OD_pair = df[['A']]
# count each value in B; the columns are named explicitly so the merge does not depend on the pandas version
OD_list = df1['B'].value_counts().rename_axis('A').reset_index(name='count')
Output = OD_pair.merge(OD_list, how='left', on='A')
print(Output)
A more general solution using pure python would be:
OD_pair = ['A','B','C']
OD_list = ['B','B','A','B','A','B','C']
results = {}
for val in OD_pair:
    results[val] = OD_list.count(val)
print(results)
which would give:
{'A': 2, 'B': 4, 'C': 1}
However, the code shown in the question suggests you're using pandas dataframes, so the other solution is more useful in this specific case.
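If OD_list is long, calling list.count once per pair rescans the whole list each time. A possible variation, sketched here with the same example data, is to count everything in one pass with collections.Counter from the standard library:

```python
from collections import Counter

OD_pair = ['A', 'B', 'C']
OD_list = ['B', 'B', 'A', 'B', 'A', 'B', 'C']

# Count every value in OD_list in a single pass.
counts = Counter(OD_list)

# Look up each pair; a Counter returns 0 for values it has never seen.
results = {val: counts[val] for val in OD_pair}
print(results)
```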

How to find two or more equal strings stored in two different vectors in matlab

I have two vectors (of different size) with strings in a data file.
I want to find the locations of two (or more) similar strings in each of these vectors.
E.g.:
a=['str1', 'str2', 'str3', 'str4', 'str5', 'str6'];
b=['str3', 'str1', 'str4', 'str4'];
I want an output like:
b(1) corresponds to a(3)
b(2) corresponds to a(1)
b(3) corresponds to a(4)
b(4) corresponds to a(4)
Is it possible?
If you store your strings in cell arrays, you can do it like this:
>> a = {'str1', 'str2', 'str3', 'str4', 'str5', 'str6'};
>> b = {'str3', 'str1', 'str4', 'str4'};
>> result = cellfun(@(x) find(strcmp(a, x)), b, 'UniformOutput', false);
result =
[3] [1] [4] [4]
Note: result is a cell array. Therefore, result{i} == j means b(i) corresponds to a(j). If b(i) was not found in a, result{i} is empty.
An alternative is to use the ismember command which will return an array of logicals indicating whether the element of array b is a member of array a. It can also return a vector which indicates where in a the element of b is found. Using your example:
[ismem,idxa]=ismember(b,a)
returns the results
ismem =
1 1 1 1
idxa =
3 1 4 4
So we see that each member of b is in a (due to the ismem vector being all ones) and we see where in a is that element of b from the idxa vector. (Note that if b has an element that is not in a then there would be a zero element in both vectors.)
