Inserting multiple elements in a numpy array - python-3.x

Is there a function in Python that allows me to insert several copies of 100, or a run of consecutive non-zero numbers, between the elements of the array [1,2,3,4,5]?
The output should be [1, 100, 100, 100, 2, 100, 100, 100, 3, ...] or [1, 100, 101, 102, 2, 100, 101, 102, 3, ...]
I have tried numpy.insert():
import numpy as np
ar1 = np.array([1, 2, 3, 4, 5])
ar2 = np.insert(ar1, slice(1, None), range(100, 103))
Output: array([ 1, 100, 2, 101, 3, 102, 4, 100, 5, 101])
It seems np.insert() only places a single number between the input elements. Let me know your thoughts on this.

You can use numpy.kron:
np.kron([1, 2, 3, 4, 5], [1, 0, 0, 0]) + 100 * np.kron(np.ones(5, dtype=int), [0, 1, 1, 1])
and for the second desired output:
np.kron([1, 2, 3, 4, 5], [1, 0, 0, 0]) + np.kron(np.ones(5, dtype=int), [0, 100, 101, 102])
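np.insert can also do this if you repeat each insertion index once per inserted value. A minimal sketch (note that index 5 appends the final block after the last element, matching the kron result):

import numpy as np

ar1 = np.array([1, 2, 3, 4, 5])
idx = np.repeat(np.arange(1, 6), 3)   # insert 3 values before each position 1..5
out = np.insert(ar1, idx, np.tile([100, 100, 100], 5))
print(out)  # [  1 100 100 100   2 100 100 100 ... ]
# use np.tile([100, 101, 102], 5) for the second variant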

Related

Adding in a single 2D / Matrix with a string

How do I add up element 1 of all the lists, and after that element 2? I need to find their percentage.
rows = [["SOBs", 60, 80, 70, 75], ["Test1", 60, 50, 60, 65], ["Test2", 40, 30, 40, 45], ["Test3", 45, 90, 80, 85], ["CW", 40, 80, 70, 75]]
I have tried in this manner:
sum(sum(rows, [2]))  # this doesn't work
print(sum(rows[0][1] + [1][1]))  # also doesn't work
So for element 1 the sum would be 60 + 60 + 40 + 45 + 40 = 245,
and then I take 245/500*100 = 49%.
You can do this easily with a list comprehension:
element = 1
rows = [["SOBs", 60, 80, 70, 75], ["Test1", 60, 50, 60, 65], ["Test2", 40, 30, 40, 45], ["Test3", 45, 90, 80, 85], ["CW", 40, 80, 70, 75]]
sum_1 = sum(row[element] for row in rows)
element = 2
sum_2 = sum(row[element] for row in rows)
print((sum_1 * 100) / (sum_1 + sum_2))
Consider it like a table and use comprehensions to summarize the columns.
Try this:
rows = [["SOBs", 60, 80, 70, 75], ["Test1", 60, 50, 60, 65], ["Test2", 40, 30, 40, 45], ["Test3", 45, 90, 80, 85],
["CW", 40, 80, 70, 75]]
COL_LENGTH = 4  # there are 4 numeric columns to evaluate, at indices 1 through 4
# Create a list of column values for each column index
selected_cols = [[row[i] for row in rows] for i in range(1, COL_LENGTH + 1)]
# Get the sum of every column
sums = [sum(col) for col in selected_cols]
# Get the average of every column
avgs = [_sum / len(rows) for _sum in sums]
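Since each score is out of 100 and there are five rows, each column average above already equals the percentage the asker wants (for column 1: 245/500*100 = 49). A hypothetical usage line making that explicit:

percentages = [s / (len(rows) * 100) * 100 for s in sums]
print(percentages)  # first entry: 245 / 500 * 100 = 49.0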

Pytorch how to reshape/reduce the number of filters without altering the shape of the individual filters

With a 3D tensor of shape (number of filters, height, width), how can one reduce the number of filters with a reshape which keeps the original filters together as whole blocks?
Assume the new size has dimensions chosen such that a whole number of the original filters fits side by side in one of the new filters. So an original size of (4, 2, 2) can be reshaped to (2, 2, 4).
(A visual explanation of the side-by-side reshape was attached, showing that the standard reshape alters the individual filter shapes.)
I have tried various PyTorch functions such as gather and index_select but have not found a way to get the end result in a general manner (i.e. one that works for different numbers of filters and different filter sizes).
I think it would be easier to rearrange the tensor values after performing the reshape, but I could not get from the PyTorch reshaped form:
[[[1,2,3,4],
[5,6,7,8]],
[[9,10,11,12],
[13,14,15,16]]]
to:
[[[1,2,5,6],
[3,4,7,8]],
[[9,10,13,14],
[11,12,15,16]]]
for completeness, the original tensor before reshaping:
[[[1,2],
[3,4]],
[[5,6],
[7,8]],
[[9,10],
[11,12]],
[[13,14],
[15,16]]]
Another option is to construct a list of parts and concatenate them:
import torch

x = torch.arange(4).reshape(4, 1, 1).repeat(1, 2, 2)
y = torch.cat([x[i::2] for i in range(2)], dim=2)
print('Before\n', x)
print('After\n', y)
which gives
Before
tensor([[[0, 0],
[0, 0]],
[[1, 1],
[1, 1]],
[[2, 2],
[2, 2]],
[[3, 3],
[3, 3]]])
After
tensor([[[0, 0, 1, 1],
[0, 0, 1, 1]],
[[2, 2, 3, 3],
[2, 2, 3, 3]]])
Or, a little more generally, we can write a function that takes groups of neighbors along a source dimension and concatenates them along a destination dimension:
def group_neighbors(x, group_size, src_dim, dst_dim):
    assert x.shape[src_dim] % group_size == 0
    # For each offset i, take every group_size-th slice along src_dim
    # (using a tuple index, since indexing with a list of slices is deprecated),
    # then concatenate the pieces along dst_dim.
    return torch.cat(
        [x[tuple([slice(None)] * src_dim + [slice(i, None, group_size)])]
         for i in range(group_size)],
        dim=dst_dim,
    )
x = torch.arange(4).reshape(4, 1, 1).repeat(1, 2, 2)
# read as "take neighbors in groups of 2 from dimension 0 and concatenate them in dimension 2"
y = group_neighbors(x, group_size=2, src_dim=0, dst_dim=2)
print('Before\n', x)
print('After\n', y)
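For instance, this reproduces the (4, 2, 2) -> (2, 2, 4) example from the question:

x = torch.arange(1, 17).reshape(4, 2, 2)
print(group_neighbors(x, group_size=2, src_dim=0, dst_dim=2))
# tensor([[[ 1,  2,  5,  6],
#          [ 3,  4,  7,  8]],
#
#         [[ 9, 10, 13, 14],
#          [11, 12, 15, 16]]])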
You could do it by chunking the tensor and then recombining it:
def side_by_side_reshape(x):
    n_pairs = x.shape[0] // 2
    filter_size = x.shape[-1]
    # Group the filters into consecutive pairs...
    x = x.reshape((n_pairs, 2, filter_size, filter_size))
    # ...then place the two filters of each pair side by side
    return torch.stack([torch.hstack(pair.unbind()) for pair in x])
>>> p = torch.arange(1, 91).reshape((10, 3, 3))
>>> side_by_side_reshape(p)
tensor([[[ 1, 2, 3, 10, 11, 12],
[ 4, 5, 6, 13, 14, 15],
[ 7, 8, 9, 16, 17, 18]],
[[19, 20, 21, 28, 29, 30],
[22, 23, 24, 31, 32, 33],
[25, 26, 27, 34, 35, 36]],
[[37, 38, 39, 46, 47, 48],
[40, 41, 42, 49, 50, 51],
[43, 44, 45, 52, 53, 54]],
[[55, 56, 57, 64, 65, 66],
[58, 59, 60, 67, 68, 69],
[61, 62, 63, 70, 71, 72]],
[[73, 74, 75, 82, 83, 84],
[76, 77, 78, 85, 86, 87],
[79, 80, 81, 88, 89, 90]]])
but I know it's not ideal, since unbind and the Python-level recombination disrupt memory, so the result is a copy. This is what I can offer until I figure out how to do it via a view only (so a real reshape).
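A pure view is not possible here, because the target layout is not in the source's memory order, but a reshape-permute-reshape sketch avoids the Python-level loop entirely (the final reshape still copies, since the permuted tensor is non-contiguous; side_by_side is a hypothetical helper name):

import torch

def side_by_side(x, group_size=2):
    # (n, h, w) -> (n // group_size, h, group_size * w)
    n, h, w = x.shape
    x = x.reshape(n // group_size, group_size, h, w)  # group consecutive filters
    x = x.permute(0, 2, 1, 3)                         # move rows in front of the group index
    return x.reshape(n // group_size, h, group_size * w)

p = torch.arange(1, 17).reshape(4, 2, 2)
print(side_by_side(p))
# tensor([[[ 1,  2,  5,  6],
#          [ 3,  4,  7,  8]],
#
#         [[ 9, 10, 13, 14],
#          [11, 12, 15, 16]]])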

How to select the elements of a numpy array without using loop?

I am doing crossover, so I want to replace elements of a matrix (m1) with elements of another matrix (m2). (An image was attached for reference; the elements in the green box are to be replaced.)
How can I do this without using a loop?
Assume that your two arrays contain:

a:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

b:
array([[100, 101, 102, 103],
       [104, 105, 106, 107],
       [108, 109, 110, 111]])

and you want to copy elements from a to b through the "filter" that you defined.
To do it, create a mask:
of the same shape as a, initially filled with 0,
with the left upper corner (2 rows by 2 columns) filled with 1.
The code to do it is:
msk = np.zeros_like(a)
msk[0:2, 0:2] = 1
So its content is:
array([[1, 1, 0, 0],
[1, 1, 0, 0],
[0, 0, 0, 0]])
And now combine b with a through this mask (keeping a where the mask is 1 and taking b elsewhere):
a = np.where(msk, a, b)
getting:
array([[ 0, 1, 102, 103],
[ 4, 5, 106, 107],
[108, 109, 110, 111]])
As you wish, without any loop.
Another solution (a one-liner) is:
a = np.where(np.array([[1,1,0,0], [1,1,0,0], [0,0,0,0]]), a, b)
You can simply use slicing for that. Either use the elements of m1 to replace those in m2, or the other way round.
m3 = m2.copy()
m3[:2,:2] = m1[:2,:2]
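A quick check with the 3x4 arrays from the previous answer standing in for m1 and m2 (hypothetical stand-ins, since the original matrices were only shown in the image):

import numpy as np

m1 = np.arange(12).reshape(3, 4)         # plays the role of m1
m2 = np.arange(100, 112).reshape(3, 4)   # plays the role of m2
m3 = m2.copy()
m3[:2, :2] = m1[:2, :2]                  # replace the 2x2 upper-left block
print(m3)
# [[  0   1 102 103]
#  [  4   5 106 107]
#  [108 109 110 111]]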

What is the time complexity for the worst case of this algorithm?

I am analysing an algorithm that gives the location of a "peak value" in a square matrix (this means that the neighbors of the value are less than or equal to the value).
The algorithm in question is very inefficient, because it checks values one by one, starting at position (0,0) and moving to the neighbor with the greater value. Here is the code:
def algorithm(problem, location = (0, 0), trace = None):
    # if it's empty, it's done!
    if problem.numRow <= 0 or problem.numCol <= 0:  # O(1)
        return None
    # getBetterNeighbor evaluates the neighbor values and returns the location of
    # the highest one; if there is no better neighbor, it returns the input location.
    nextLocation = problem.getBetterNeighbor(location, trace)  # O(1)
    if nextLocation == location:
        # If it doesn't have a better neighbor, then it's a peak.
        if trace is not None: trace.foundPeak(location)  # O(1)
        return location
    else:
        # There is a better neighbor, so make a recursive call with that location.
        return algorithm(problem, nextLocation, trace)  # O(????)
I know that the best case is when the peak is at (0,0), and I determined that the worst-case scenario is the following (using a 10x10 matrix):
problem = [
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 10],
[34, 35, 36, 37, 38, 39, 40, 41, 0, 11],
[33, 0, 0, 0, 0, 0, 0, 42, 0, 12],
[32, 0, 54, 55, 56, 57, 0, 43, 0, 13],
[31, 0, 53, 0, 0, 58, 0, 44, 0, 14],
[30, 0, 52, 0, 0, 0, 0, 45, 0, 15],
[29, 0, 51, 50, 49, 48, 47, 46, 0, 16],
[28, 0, 0, 0, 0, 0, 0, 0, 0, 17],
[27, 26, 25, 24, 23, 22, 21, 20, 19, 18]]
Note that it basically makes the algorithm go in a spiral, and it has to evaluate 59 positions.
So, the question is: how do I get the time complexity for this case in particular, and why?
I know that all the operations are O(1) except for the recursion, and that is where I am lost.
For an arbitrary matrix of size [m,n], as you showed with your example, we can break down the traversal of a given matrix made by this algorithm (A) as follows:
A will traverse n-1 elements from the top-left corner to element 8,
then m-1 elements from 9 to 17,
then n-1 elements from 18 to 27,
then m-3 elements from 27 to 33,
then n-3 elements from 34 to 40,
then m-5 elements from 41 to 45,
then n-5 elements from 46 to 50,
then m-7 elements from 51 to 53
etc.
At this point, the pattern should be clear, and thus the following worst-case recurrence relation can be established:
T(m,n) = T(m-2,n-2) + m-1 + n-1
T(m,n) = T(m-4,n-4) + m-3 + n-3 + m-1 + n-1
...
T(m,n) = T(m-2i,n-2i) + i*m + i*n -2*(i^2)
where i is the number of iterations, and this recurrence will continue only while m-2i and n-2i are both greater than 0.
WLOG we can assume m <= n, so the recurrence continues while m-2i > 0, i.e. while i < m/2, which gives m/2 iterations. Plugging i = m/2 back in, we get:
T(m,n) = T(m-m, n-m) + (m/2)*m + (m/2)*n - 2*((m/2)^2)
T(m,n) = 0 + m^2/2 + m*n/2 - 2*(m^2/4)
T(m,n) = m*n/2 = O(m*n)
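As a sanity check, a small sketch can replay the traversal on the worst-case matrix above and count the visited cells (this assumes a getBetterNeighbor that moves to the largest of the four neighbors only when it is strictly greater than the current value):

def count_steps(matrix, location=(0, 0)):
    rows, cols = len(matrix), len(matrix[0])
    steps = 1
    while True:
        r, c = location
        # the four in-bounds neighbors of the current cell
        neighbors = [(r + dr, c + dc)
                     for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                     if 0 <= r + dr < rows and 0 <= c + dc < cols]
        best = max(neighbors, key=lambda p: matrix[p[0]][p[1]])
        if matrix[best[0]][best[1]] <= matrix[r][c]:
            return steps  # the current cell is a peak
        location, steps = best, steps + 1

print(count_steps(problem))  # 59 for the 10x10 worst case above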

Converting vectors of various lengths to a fixed length (NLP)

Recently I have been looking around at Natural Language Processing, its vectorization methods, and the advantages of each vectorizer.
I am interested in character-level vectorization, but it seems the biggest concern with a character vectorizer is getting the embedding of each word to a fixed length.
I do not want to just pad them with 0 (well known as zero padding): for instance, if the target fixed length is 100 and only 72 characters exist, then 28 zeros will be padded at the end.
"The example of paragraphs and phrases.... ... in vectorizer form" < with length 72
becomes
[0, 25, 60, 12, 24, 0, 19, 99, 7, 32, 47, 11, 19, 43, 18, 19, 6, 25,
43, 99, 0, 32, 40, 14, 20, 5, 37, 47, 99, 11, 29, 7, 19, 47, 18, 20,
60, 18, 19, 2, 19, 11, 31, 130, 130, 76, 0, 32, 40, 14, 20, 7, 19, 47,
18, 20, 60, 11, 37, 43, 99, 11, 29, 99, 17, 39, 47, 11, 31, 18, 19,
43, 0, 19, 77, 0, 0, 0, 0, 0, 0, 0, 0, ...., 0, 0, 0, 0, 0, 0]
I want the vectors to be fairly distributed over N fixed dimensions, not like the one above.
If you know any papers or algorithms that consider this matter, or a common way to produce fixed-length vectors from variable-length vectors, please share.
Further information, added as gojomo requested:
I am trying to get character-level vectors for the words in a corpus.
Say that, in the example above, "The example of paragraphs...." starts with:
T [40]
h [17]
e [3]
e [3]
x [53]
a [1]
m [21]
p [25]
l [14]
e [3]
Notice that each character has its own number (it could be ASCII, for example), and a word is represented by the combination of its character vectors, for example,
The [40, 17, 3]
example [3, 53, 1, 21, 25, 14, 3]
so the vectors are not all of the same dimension. In the case mentioned above, many people pad 0 at the end to make them a uniform size.
For example, if someone wants the dimension of each word to be 300, then 297 zeros will be padded onto the word "The" and 293 zeros onto "example", like:
The [40, 17, 3, 0, 0, 0, 0, 0, ...., 0]
example [3, 53, 1, 21, 25, 14, 3, 0, 0, 0, 0, 0, ...., 0]
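For reference, a minimal sketch of the zero padding described above (pad_word is a hypothetical helper; the character codes are the illustrative ones from this question):

def pad_word(char_codes, target_len, pad_value=0):
    # Right-pad a word's character codes with zeros up to target_len
    return char_codes + [pad_value] * (target_len - len(char_codes))

print(pad_word([40, 17, 3], target_len=10))                # "The"     -> [40, 17, 3, 0, 0, 0, 0, 0, 0, 0]
print(pad_word([3, 53, 1, 21, 25, 14, 3], target_len=10))  # "example" -> [3, 53, 1, 21, 25, 14, 3, 0, 0, 0]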
Now, I do not think this padding method is appropriate for my experiments, so I want to know if there are any methods to convert the vectors to a uniform, non-sparse form (if this term is allowed).
Even the two-word phrase "The example" is only 11 characters long, which is still not long enough.
Whatever the case, I would like to know if there are well-known techniques to convert variable-length vectors to some fixed length.
Thank you!
