How to convert a list of elements to n*n space separated arrangement where n is number of elements in the list - python-3.x

This is my list:
N = 9
Mylist = [9,8,7,6,5,4,3,2,1]
For this input, the output should be:
9 8 7
6 5 4
3 2 1

It sounds like you're wondering how to turn a list into a numpy array of a particular shape. Documentation is here.
import numpy as np
my_list=[3,9,8,7,6,5,4,3,2,1]
# Dropping the first item of your list as it isn't used in your output
array = np.array(my_list[1:]).reshape((3,3))
print(array)
Output
[[9 8 7]
 [6 5 4]
 [3 2 1]]
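If the goal is only the space-separated text from the question rather than a numpy array, a plain-Python sketch along the following lines may be enough. The helper name format_square is hypothetical, and the sketch assumes the list length is a perfect square (math.isqrt needs Python 3.8 or later).

```python
import math


def format_square(values):
    """Return the values as a square block of space-separated text."""
    side = math.isqrt(len(values))
    # Assumption: the list must hold exactly side * side elements.
    if side == 0 or side * side != len(values):
        raise ValueError("list length must be a non-zero perfect square")
    # Slice the flat list into consecutive rows of `side` elements each.
    rows = (values[i:i + side] for i in range(0, len(values), side))
    return "\n".join(" ".join(map(str, row)) for row in rows)


my_list = [9, 8, 7, 6, 5, 4, 3, 2, 1]
print(format_square(my_list))
```

Returning a string instead of printing inside the helper keeps it easy to reuse, for example when writing the block to a file.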

Related

Compare value in a dataframe to multiple columns of another dataframe to get a list of lists where entries match in an efficient way

I have two pandas dataframes and I want to find all entries of the second dataframe where a specific value occurs.
As an example:
df1:
   NID
0    1
1    2
2    3
3    4
4    5
df2:
   EID  N1  N2  N3  N4
0    1   1   2  13  12
1    2   2   3  14  13
2    3   3   4  15  14
3    4   4   5  16  15
4    5   5   6  17  16
5    6   6   7  18  17
6    7   7   8  19  18
7    8   8   9  20  19
8    9   9  10  21  20
9   10  10  11  22  21
Now, what I basically want is a list of lists with the EID values (from df2) for which the NID values (from df1) occur in any of the columns N1, N2, N3, N4:
Solution would be:
sol = [[1], [1, 2], [2, 3], [3, 4], [4, 5]]
The desired solution explained:
The solution has 5 entries (len(sol) == 5) since I have 5 entries in df1.
The first entry in sol is 1 because the value NID = 1 only appears in the columns N1,N2,N3,N4 for EID=1 in df2.
The second entry in sol refers to the value NID=2 (of df1) and has the length 2 because NID=2 can be found in column N1 (for EID=2) and in column N2 (for EID=1). Therefore, the second entry in the solution is [1,2] and so on.
What I tried so far is looping for each element in df1 and then looping for each element in df2 to see if NID is in any of the columns N1,N2,N3,N4. This solution works but for huge dataframes (each df can have up to some thousand entries) this solution becomes extremely time-consuming.
Therefore I was looking for a much more efficient solution.
My code as implemented:
Input data:
import pandas as pd
df1 = pd.DataFrame({'NID':[1,2,3,4,5]})
df2 = pd.DataFrame({'EID':[1,2,3,4,5,6,7,8,9,10],
                    'N1':[1,2,3,4,5,6,7,8,9,10],
                    'N2':[2,3,4,5,6,7,8,9,10,11],
                    'N3':[13,14,15,16,17,18,19,20,21,22],
                    'N4':[12,13,14,15,16,17,18,19,20,21]})
solution acquired using looping:
sol = []
for idx, node in df1.iterrows():
    x = []
    for idx2, elem in df2.iterrows():
        if node['NID'] == elem['N1']:
            x.append(elem['EID'])
        if node['NID'] == elem['N2']:
            x.append(elem['EID'])
        if node['NID'] == elem['N3']:
            x.append(elem['EID'])
        if node['NID'] == elem['N4']:
            x.append(elem['EID'])
    sol.append(x)
print(sol)
If anyone has a solution where I do not have to loop, I would be very happy. Maybe using a numpy function or something like cKDTrees but unfortunately I have no idea on how to get this problem solved in a faster way.
Thank you in advance!
You can reshape with melt, filter with loc, and groupby.agg as list. Then reindex and convert tolist:
out = (df2
       .melt('EID')  # reshape to long form
       # filter the values that are in df1['NID']
       .loc[lambda d: d['value'].isin(df1['NID'])]
       # aggregate as list
       .groupby('value')['EID'].agg(list)
       # ensure all original NID are present in order
       # and convert to list
       .reindex(df1['NID']).tolist()
       )
Alternative with stack:
df3 = df2.set_index('EID')
out = (df3
       .where(df3.isin(df1['NID'].tolist())).stack()
       .reset_index(name='group')
       .groupby('group')['EID'].agg(list)
       .reindex(df1['NID']).tolist()
       )
Output:
[[1], [2, 1], [3, 2], [4, 3], [5, 4]]
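The output above lists the EIDs in column order (N1 matches first, then N2), whereas the question's expected sol has them ascending. As a hedged variation on the melt approach, the sketch below sorts each group and falls back to an empty list for an NID that never occurs in df2 (with reindex such an entry would be NaN rather than a list). The extra NID 99 is a hypothetical value added only to show that case.

```python
import pandas as pd

# 99 is a hypothetical NID with no match in df2, added for illustration.
df1 = pd.DataFrame({'NID': [1, 2, 3, 4, 5, 99]})
df2 = pd.DataFrame({'EID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                    'N1': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                    'N2': [2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
                    'N3': [13, 14, 15, 16, 17, 18, 19, 20, 21, 22],
                    'N4': [12, 13, 14, 15, 16, 17, 18, 19, 20, 21]})

# Long form: one row per (EID, column, value), filtered to the wanted values,
# then one list of EIDs per value.
grouped = (df2
           .melt('EID')
           .loc[lambda d: d['value'].isin(df1['NID'])]
           .groupby('value')['EID'].agg(list))

# Look up each NID in order; unmatched NIDs get an empty list.
out = [sorted(grouped.get(nid, [])) for nid in df1['NID']]
print(out)
```

Like the loop in the question, this keeps an EID once per matching column, so an EID would appear twice if the same NID occurred in two columns of one row.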

How to print numpy array in columns without using pandas?

Here is my basic code:
import numpy as np
a = np.asarray([ [1,2,3], [4,5,6], [7,8,9] ])
I want to print the arrays as follows:
1 4 7
2 5 8
3 6 9
Also, how would I approach the same problem if my array a has 1000 embedded lists in it?
Here is a powerful one-line solution without loops:
print('\n'.join(map(lambda line: ' '.join(map(str, line)), a.T)))
a.T transposes the 2D array, the map calls turn each line into a string, and the outer join concatenates the line strings (using \n between them).
This is an alternative version with generators (likely slower):
print('\n'.join(' '.join(str(item) for item in line) for line in a.T))
Yet another solution with one loop (likely even slower):
for line in a.T:
    print(' '.join(str(item) for item in line))
Note that the last version produces a trailing newline.
You could also make use of str:
print(str(a.T).translate(str.maketrans({'[':'',']':''})))
1 4 7
 2 5 8
 3 6 9
print(str(a.T).replace('[', '').replace(']',''))
1 4 7
 2 5 8
 3 6 9
print(str(a.T).translate(str.maketrans({'[':'',']':''})).replace('\n ', '\n'))
1 4 7
2 5 8
3 6 9
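For the follow-up about an array with 1000 embedded lists, the join-based approach should carry over unchanged, since it never depends on the number of rows. A small sketch, with the helper name columns_as_text being hypothetical:

```python
import numpy as np


def columns_as_text(arr):
    """Return the columns of a 2D array, one per line, space separated."""
    # Iterating over the transpose yields the original columns as rows.
    return "\n".join(" ".join(map(str, col)) for col in np.asarray(arr).T)


a = np.asarray([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(columns_as_text(a))

# A larger case: 1000 rows of 3 values become 3 lines of 1000 values each.
big = np.arange(3000).reshape(1000, 3)
text = columns_as_text(big)
print(len(text.splitlines()))
```

Unlike the str(a.T) variants, this does not rely on numpy's print formatting, which abbreviates large arrays with an ellipsis by default.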

Is there a way to build the Dot product of two matrices with different shape?

Is there a way to build the dot product of two matrices with different shapes, using nothing but pure Python and numpy?
The two matrices have the same number of columns but a different number of rows. (Example below.)
Of course I know the brute force way:
for i in A:
    for j in B:
        np.dot(i, j)
but is there something else?
Here an example:
import numpy as np
A = np.full((4,5),3)
B = np.full((3,5),5)
print(A)
print(B)
result = np.zeros((A.shape[0],B.shape[0]))
for i in range(A.shape[0]):
    for j in range(B.shape[0]):
        result[i,j] = np.dot(A[i],B[j])
print(result)
Output:
A = [[3 3 3 3 3]
     [3 3 3 3 3]
     [3 3 3 3 3]
     [3 3 3 3 3]]
B = [[5 5 5 5 5]
     [5 5 5 5 5]
     [5 5 5 5 5]]
result = [[75. 75. 75.]
          [75. 75. 75.]
          [75. 75. 75.]
          [75. 75. 75.]]
The goal is to calculate the dot product without two loops. Is there a more efficient way?
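One possible approach: entry (i, j) of the looped result is the dot product of row i of A with row j of B, which is the definition of the matrix product of A with the transpose of B. The sketch below checks that against the double loop from the question; the variable names expected and result_einsum are introduced here for illustration only.

```python
import numpy as np

A = np.full((4, 5), 3)
B = np.full((3, 5), 5)

# Reference: the double loop from the question.
expected = np.zeros((A.shape[0], B.shape[0]))
for i in range(A.shape[0]):
    for j in range(B.shape[0]):
        expected[i, j] = np.dot(A[i], B[j])

# (4, 5) @ (5, 3) -> (4, 3): all row-by-row dot products in one call.
result = A @ B.T

# The same contraction written with einsum, as a cross-check.
result_einsum = np.einsum('ik,jk->ij', A, B)

print(result)
```

Note that result keeps the integer dtype of A and B, while the loop version is float because it was filled into np.zeros.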

Compare two matrices and create a matrix of their common values [duplicate]

I'm currently trying to compare two matrices and return matching rows into the "intersection matrix" via python. Both matrices are numerical data, and I'm trying to return the rows of their common entries (I have also tried just creating a matrix with matching positional entries along the first column and then creating an accompanying tuple). These matrices are not necessarily the same in dimensionality.
Let's say I have two matrices with the same number of columns but an arbitrary number of rows (possibly very large, and different for each matrix):
23 3 4 5      23 3 4 5
12 6 7 8      45 7 8 9
45 7 8 9      34 5 6 7
67 4 5 6       3 5 6 7
I'd like to create a matrix with the "intersection" being for this low dimensional example
23 3 4 5
45 7 8 9
perhaps it looks like this though:
1 2 3 4       2  4 6 7
2 4 6 7       4 10 6 9
4 6 7 8       5  6 7 8
5 6 7 8
in which case we only want:
2 4 6 7
5 6 7 8
I've tried things of this nature:
def compare(x):
    # This is a matrix I created with another function-purely numerical data of arbitrary size with fixed column length D
    y = n_c(data_cleaner(x))
    # this is a second matrix that i'd like to compare it to. note that the sizes are probably not the same, but the columns length are
    z = data_cleaner(x)
    # I initialized an array that would hold the matching values
    compare = []
    # create nested for loop that will check a single index in one matrix over all entries in the second matrix over iteration
    for i in range(len(y)):
        for j in range(len(z)):
            if y[0][i] == z[0][i]:
                # I want the row or the n tuple (shown here) of those columns with the matching first indexes as shown above
                c_vec = ([0][i],[15][i],[24][i],[0][25],[0][26])
                compare.append(c_vec)
            else:
                pass
    return compare
compare(c_i_w)
Sadly, I'm running into some errors. Specifically it seems that I'm telling python to improperly reference values.
Consider the arrays a and b
a = np.array([
[23, 3, 4, 5],
[12, 6, 7, 8],
[45, 7, 8, 9],
[67, 4, 5, 6]
])
b = np.array([
[23, 3, 4, 5],
[45, 7, 8, 9],
[34, 5, 6, 7],
[ 3, 5, 6, 7]
])
print(a)
[[23  3  4  5]
 [12  6  7  8]
 [45  7  8  9]
 [67  4  5  6]]
print(b)
[[23  3  4  5]
 [45  7  8  9]
 [34  5  6  7]
 [ 3  5  6  7]]
Then we can broadcast and get an array of equal rows with
x = (a[:, None] == b).all(-1)
print(x)
[[ True False False False]
 [False False False False]
 [False  True False False]
 [False False False False]]
Using np.where we can identify the indices
i, j = np.where(x)
Show which rows of a
print(a[i])
[[23  3  4  5]
 [45  7  8  9]]
And which rows of b
print(b[j])
[[23  3  4  5]
 [45  7  8  9]]
They are the same! That's good. That's what we wanted.
We can put the results into a pandas dataframe with a MultiIndex with row number from a in the first level and row number from b in the second level.
pd.DataFrame(a[i], [i, j])
      0  1  2  3
0 0  23  3  4  5
2 1  45  7  8  9
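The same broadcasting idea should also cover the question's second example, where the two matrices have a different number of rows. A minimal sketch, assuming only that both arrays share the same number of columns (the names equal_rows and common are introduced here for illustration):

```python
import numpy as np

a = np.array([
    [1, 2, 3, 4],
    [2, 4, 6, 7],
    [4, 6, 7, 8],
    [5, 6, 7, 8],
])
b = np.array([
    [2, 4, 6, 7],
    [4, 10, 6, 9],
    [5, 6, 7, 8],
])

# Compare every row of a with every row of b.
# Shapes: (4, 1, 4) == (1, 3, 4) -> (4, 3, 4); all(-1) reduces to (4, 3).
equal_rows = (a[:, None, :] == b[None, :, :]).all(axis=-1)

# Keep each row of a that equals at least one row of b.
common = a[equal_rows.any(axis=1)]
print(common)
```

The intermediate boolean array has one entry per (row of a, row of b, column) triple, so for very large inputs memory use may become the limiting factor.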

Reshaping in julia

If I reshape in python I use this:
import numpy as np
y= np.asarray([1,2,3,4,5,6,7,8])
x=2
z=y.reshape(-1, x)
print(z)
and get this
>>>
[[1 2]
[3 4]
[5 6]
[7 8]]
How would I get the same thing in julia? I tried:
z = [1,2,3,4,5,6,7,8]
x= 2
a=reshape(z,x,4)
println(a)
and it gave me:
[1 3 5 7
2 4 6 8]
If I use reshape(z,4,x) it would give
[1 5
2 6
3 7
4 8]
Also is there a way to do reshape without specifying the second dimension like reshape(z,x) or if the secondary dimension is more ambiguous?
I think what you have hit upon is that NumPy stores arrays in row-major order while Julia stores them in column-major order, as covered here.
So Julia is doing what numpy would do if you used
z=y.reshape(-1,x,order='F')
What you want is the transpose of your first attempt, which is
z = [1,2,3,4,5,6,7,8]
x= 2
a=reshape(z,x,4)'
println(a)
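On the NumPy side, the difference between the two orderings can be seen directly with the order argument of reshape. This sketch compares the default row-major fill with the column-major fill that corresponds to the Julia results shown in the question; the variable names are chosen here for illustration.

```python
import numpy as np

y = np.asarray([1, 2, 3, 4, 5, 6, 7, 8])

# NumPy default (order='C'): fill row by row.
row_major = y.reshape(-1, 2)

# order='F': fill column by column, matching Julia's reshape(z, 4, 2).
col_major = y.reshape(-1, 2, order='F')

# Column-major 2 x 4, matching Julia's reshape(z, 2, 4) ...
julia_like = y.reshape(2, -1, order='F')

# ... whose transpose is the row-major 4 x 2 result again.
print(np.array_equal(julia_like.T, row_major))
```

This is why transposing reshape(z, x, 4) in Julia reproduces the Python output.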
Do you want to know whether there is something that will compute the second dimension, assuming the array is two-dimensional? Not that I know of. Possibly ArrayViews? Here's a simple function to start with:
julia> shape2d(x, shape...) = length(shape) != 1 ? reshape(x, shape...) : reshape(x, shape[1], Int64(length(x)/shape[1]))
shape2d (generic function with 1 method)
julia> shape2d(z,x)'
4x2 Array{Int64,2}:
1 2
3 4
5 6
7 8
How about
z = [1,2,3,4,5,6,7,8]
x = 2
a = reshape(z,x,4)'
which gives
julia> a = reshape(z,x,4)'
4x2 Array{Int64,2}:
1 2
3 4
5 6
7 8
As for your bonus question
"Also is there a way to do reshape without specifying the second
dimension like reshape(z,x) or if the secondary dimension is more
ambiguous?"
the answer is not exactly, because it'd be ambiguous: reshape can make 3D, 4D, ..., tensors, so it's not clear what is expected. You can, however, do something like
matrix_reshape(z,x) = reshape(z, x, div(length(z),x))
which does what I think you expect.
"Also is there a way to do reshape without specifying the second dimension like reshape(z,x) or if the secondary dimension is more ambiguous?"
Use : instead of -1
I'm using Julia 1.1 (not sure whether this feature existed when the question was originally answered)
julia> z = [1,2,3,4,5,6,7,8]; a = reshape(z,:,2)
4×2 Array{Int64,2}:
1 5
2 6
3 7
4 8
However, if you want the first row to be 1 2 and match Python, you'll need to follow the other answer mentioning row-major vs column-major ordering and do
julia> z = [1,2,3,4,5,6,7,8]; a = reshape(z,2,:)'
4×2 LinearAlgebra.Adjoint{Int64,Array{Int64,2}}:
1 2
3 4
5 6
7 8

Resources