I have a list of column indices and I want to print only those columns from a matrix.
Note: the list can be of any length, and the indices can be arbitrary.
For instance, the following does what I want:
import numpy as np
column_list = [2,3]
a = np.array([[1,2,6,1],[4,5,8,2],[8,3,5,3],[6,5,4,4],[5,2,8,8]])
new_matrix = []
for i in column_list:
    new_matrix.append(a[:,i])
new_matrix = np.array(new_matrix)
new_matrix = new_matrix.transpose()
print(new_matrix)
However, I was wondering if there is a shorter method?
Yes, there's a shorter way. You can pass a list (or numpy array) to an array's indexer. Therefore, you can pass column_list to the columns indexer of a:
>>> a[:, column_list]
array([[6, 1],
       [8, 2],
       [5, 3],
       [4, 4],
       [8, 8]])
# This is your new_matrix produced by your original code:
>>> new_matrix
array([[6, 1],
       [8, 2],
       [5, 3],
       [4, 4],
       [8, 8]])
>>> np.all(a[:, column_list] == new_matrix)
True
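As a small sketch of the same idea, the index list does not have to be sorted or unique; the columns come back in the order given. The variable names `cols`, `picked` and `taken` below are made up for illustration, and `np.take` is shown only as an equivalent spelling:

```python
import numpy as np

a = np.array([[1, 2, 6, 1],
              [4, 5, 8, 2],
              [8, 3, 5, 3],
              [6, 5, 4, 4],
              [5, 2, 8, 8]])

# Hypothetical index list: any length, any order
cols = [3, 0]

# Indexing with a list selects the columns in the order listed
picked = a[:, cols]

# np.take along axis=1 selects the same columns
taken = np.take(a, cols, axis=1)

print(picked)
print(np.array_equal(picked, taken))
```

Note that indexing with a list returns a copy, so modifying `picked` leaves `a` unchanged.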
I am attempting to create a list of arrays from 2 vectors.
I have a dataset I'm reading from a .csv file and need to pair each value with a 1 to create a list of arrays.
import numpy as np
Data = np.array([1, 2, 3, 4, 5]) #this is actually a column in a .csv file, but simplified it for the example
#do something here
output = ([1,1], [1,2], [1,3], [1,4], [1,5]) #2nd column in each array is the data, first is a 1
I've tried to use numpy concatenate and vstack, but they don't give me exactly what I'm looking for.
Any suggestions would be appreciated.
You can form the output using a list comprehension:
data = [1, 2, 3, 4, 5]
output = [[1, item] for item in data]
This will output:
[[1, 1], [1, 2], [1, 3], [1, 4], [1, 5]]
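If the result should stay a NumPy array rather than a list of lists, one possible sketch (assuming the data is already a 1-D integer array, as in the question) uses `np.column_stack`:

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5])

# Put a column of ones next to the data column
output = np.column_stack((np.ones_like(data), data))

print(output.tolist())
```

`np.ones_like(data)` keeps the dtype of `data`, so the ones are integers here.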
I searched the net but couldn't find anything. I am trying to get all possible combinations, including combinations of all subsets, of two lists (ideally n lists). Every combination should include at least one item from each list.
list_1 = [1,2,3]
list_2 = [5,6]
output = [
[1,5], [1,6], [2,5], [2,6], [3,5], [3,6],
[1,2,5], [1,2,6], [1,3,5], [1,3,6], [2,3,5], [2,3,6], [1,5,6], [2,5,6], [3,5,6],
[1,2,3,5], [1,2,3,6],
[1,2,3,5,6]
]
All I can get is pair combinations like [1,5], [1,6], .. by using
combs = list(itertools.combinations(itertools.chain(*ls_filter_columns), cnt))
What is the pythonic way of achieving this?
Here is one way:
from itertools import combinations, product
def non_empties(items):
    """returns nonempty subsets of list items"""
    subsets = []
    n = len(items)
    for i in range(1,n+1):
        subsets.extend(combinations(items,i))
    return subsets
list_1 = [1,2,3]
list_2 = [5,6]
combs = [list(p) + list(q) for p,q in product(non_empties(list_1),non_empties(list_2))]
print(combs)
Output:
[[1, 5], [1, 6], [1, 5, 6], [2, 5], [2, 6], [2, 5, 6], [3, 5], [3, 6], [3, 5, 6], [1, 2, 5], [1, 2, 6], [1, 2, 5, 6], [1, 3, 5], [1, 3, 6], [1, 3, 5, 6], [2, 3, 5], [2, 3, 6], [2, 3, 5, 6], [1, 2, 3, 5], [1, 2, 3, 6], [1, 2, 3, 5, 6]]
This has more elements than the output you gave, though I suspect that your intended output is in error. Note that my code might not correctly handle the case where the two lists have a non-empty intersection. Then again, it might; you didn't specify what the intended output should be in such a case.
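To make the size difference concrete, here is a rough check. A list of k items has 2**k - 1 non-empty subsets, so for these two lists one would expect (2**3 - 1) * (2**2 - 1) = 21 combinations. The names `asked`, `expected_count` and `missing` are introduced only for this illustration; `asked` is the output copied from the question:

```python
from itertools import combinations, product

def non_empties(items):
    """Return all non-empty subsets of items as tuples."""
    return [c for r in range(1, len(items) + 1) for c in combinations(items, r)]

list_1 = [1, 2, 3]
list_2 = [5, 6]
combs = [list(p) + list(q)
         for p, q in product(non_empties(list_1), non_empties(list_2))]

# Each list of k items contributes 2**k - 1 non-empty subsets
expected_count = (2 ** len(list_1) - 1) * (2 ** len(list_2) - 1)

# The output listed in the question
asked = [
    [1, 5], [1, 6], [2, 5], [2, 6], [3, 5], [3, 6],
    [1, 2, 5], [1, 2, 6], [1, 3, 5], [1, 3, 6], [2, 3, 5], [2, 3, 6],
    [1, 5, 6], [2, 5, 6], [3, 5, 6],
    [1, 2, 3, 5], [1, 2, 3, 6],
    [1, 2, 3, 5, 6],
]

# Combinations generated here that the question's list leaves out
missing = [c for c in combs if c not in asked]

print(len(combs), expected_count, len(asked))
print(missing)
```

The entries in `missing` are the ones that pair two items of the first list with both items of the second.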
Here is another way.
Even though it is not fancy, it handles n lists easily.
def change_format(X):
    output = []
    for x in X:
        output += list(x)
    return output
import itertools
list_1 = [1,2,3]
list_2 = [5,6]
list_3 = [7,8]
lists = [list_1, list_2, list_3]
lengths = list(map(len, lists))
rs_list = itertools.product(*[list(range(1, l+1)) for l in lengths])
output = []
for rs in rs_list:
    temp = []
    for L, r in zip(lists, rs):
        temp.append(list(itertools.combinations(L, r)))
    output += list(itertools.product(*temp))
output = list(map(change_format, output))
I have managed to make it work for n-lists with less code, which was a challenge for me.
from itertools import chain, combinations, product
def get_subsets(list_of_lists):
    """
    Get all possible combinations of subsets of given lists
    :param list_of_lists: consists of any number of lists with any number of elements
    :return: list
    """
    ls = [chain(*map(lambda x: combinations(e, x), range(1, len(e)+1))) for e in list_of_lists if e]
    ls_output = [[i for tpl in ele for i in tpl] for ele in product(*ls)]
    return ls_output
list_1 = [1, 2, 3]
list_2 = [5, 6]
list_3 = [7, 8]
ls_filter_columns = [list_1, list_2, list_3]
print(get_subsets(ls_filter_columns))
I'd like to multiply the values of two columns per row...
from this:
to this:
I think this can be done easily with numpy or pandas. Here is a sample solution:
import pandas as pd
column = ['A','B','C']
dataframe = pd.DataFrame({"A":['a','b','c'],"B":[1,2,3],"C":[2,2,2]})
dataframe['D'] = dataframe['B']*dataframe['C']
print(dataframe)
The pandas answer is perfectly fine, but when learning Python it is perhaps better to start with the built-in types first. Here is an answer using lists:
my_list = []
my_list.append([1, 2])
my_list.append([2, 2])
my_list.append([3, 2])
print(my_list)
product_list = []
for element in my_list:
    # multiply the two column values of this row, as the question asks
    my_product = element[0] * element[1]
    product_list.append(element + [my_product])
print(product_list)
Result
[[1, 2], [2, 2], [3, 2]]
[[1, 2, 2], [2, 2, 4], [3, 2, 6]]
Adding the first column is left as an exercise!
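For the row-wise multiplication the question asks about, a plain-Python sketch that also carries the first column along could look like this. The names `labels`, `col_b`, `col_c` and `rows` are made up here and simply mirror the columns A, B and C of the pandas example:

```python
# Hypothetical column data, mirroring columns A, B and C from the pandas example
labels = ['a', 'b', 'c']
col_b = [1, 2, 3]
col_c = [2, 2, 2]

# Walk the three columns in parallel and append the product B * C to each row
rows = [[label, b, c, b * c] for label, b, c in zip(labels, col_b, col_c)]

print(rows)
```

`zip` stops at the shortest input, so this assumes all three columns have the same length.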
In order to feed data into a LSTM network to predict remaining-useful-life (RUL) I need to create a 3D numpy array (No of machines, No of sequences, No of variables).
I already tried to combine solutions from stackoverflow and managed to create a prototype (which you can see below).
import numpy as np
import tensorflow as tf
import pandas as pd
df = pd.DataFrame({'ID': [1, 1, 2, 3, 3, 3, 3],
                   'V1': [1, 2, 2, 3, 3, 4, 2],
                   'V2': [4, 2, 3, 2, 1, 5, 1],
                   })
df_desired_result = np.array([[[1, 4], [2, 2], [-99, -99]],
                              [[2, 3], [-99, -99], [-99, -99]],
                              [[3, 2], [3, 1], [4, 5]]])
max_len = df['ID'].value_counts().max()
def pad_df(df, cols, max_seq, group_col= 'ID'):
    array_for_pad = np.array(list(df[cols].groupby(df[group_col]).apply(pd.DataFrame.as_matrix)))
    padded_array = tf.keras.preprocessing.sequence.pad_sequences(array_for_pad,
                                                                 padding='post',
                                                                 maxlen=max_seq,
                                                                 value=-99
                                                                 )
    return padded_array
#testing prototype
pad_df(df, ['V1', 'V2'], max_len)
But when I apply the code above to my data, it applies the right-padding correctly but all values are set to 0.0.
I can't fully figure out this behaviour, I noticed that in the first line of my function, I get returned an array with nested arrays for 'array_for_pad'.
Here is a screenshot of the result: [image: result padding]
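As a rough sketch only, the padding can also be done with NumPy alone, which sidesteps the nested object array and keeps the values as floats. The function name `pad_groups` and its parameters are hypothetical, not part of pandas or Keras. It may also be worth checking the `dtype` argument of `pad_sequences`: as far as I know it defaults to `'int32'`, which would truncate float values between -1 and 1 to 0, though that is only a guess about the real data.

```python
import numpy as np
import pandas as pd

def pad_groups(df, cols, group_col='ID', pad_value=-99.0):
    """Hypothetical helper: build a (groups, max_len, variables) float array,
    right-padding shorter groups with pad_value."""
    # One 2-D float array per group, in sorted group order
    groups = [g[cols].to_numpy(dtype=float) for _, g in df.groupby(group_col, sort=True)]
    max_len = max(len(g) for g in groups)
    # Start with an array full of the padding value, then copy each group in
    out = np.full((len(groups), max_len, len(cols)), pad_value, dtype=float)
    for i, g in enumerate(groups):
        out[i, :len(g), :] = g
    return out

df = pd.DataFrame({'ID': [1, 1, 2, 3, 3, 3, 3],
                   'V1': [1, 2, 2, 3, 3, 4, 2],
                   'V2': [4, 2, 3, 2, 1, 5, 1]})

result = pad_groups(df, ['V1', 'V2'])
print(result.shape)
print(result)
```

With this sample frame the longest group (ID 3) has four rows, so the result has shape (3, 4, 2).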
So this has kind of stumped me. I feel like it should be an easy problem though.
Let's say I have these two lists
a = [[3, 4], [4, 5]]
b = [[1, 2], [4, 6]]
I am trying to get the element-wise sum of the two 2-D lists, like so
c = [[4, 6], [8, 11]]
I am pretty sure I am getting lost in loops. I am only trying to use nested loops to produce an answer; any suggestions? I'm trying several different things, so my code is not exactly complete or set in stone and will probably change by the time someone responds, so I won't post any code here. I am trying though!
You could try some variation on nested for-loops using enumerate (which will give you the appropriate indices for comparison to some other 2d array):
a = [[3, 4], [4, 5]]
b = [[1, 2], [4, 6]]
Edit: I didn't see you wanted to populate a new list, so I put that in there:
>>> c = []
>>> for val, item in enumerate(a):
...     newvals = []
...     for itemval, insideitem in enumerate(item):
...         newvals.append(insideitem + b[val][itemval])
...     c.append(newvals)
...     newvals = []
...
Result:
>>> c
[[4, 6], [8, 11]]
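The same nested iteration can also be written with `zip` instead of index lookups. This is just an alternative sketch; it assumes both lists have the same shape, and the loop variable names are arbitrary:

```python
a = [[3, 4], [4, 5]]
b = [[1, 2], [4, 6]]

# Outer zip pairs up the rows, inner zip pairs up the elements within each row
c = [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

print(c)
```

If the shapes differ, `zip` silently stops at the shorter input, so a length check beforehand may be worthwhile.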
Use numpy:
import numpy as np
a = [[3, 4], [4, 5]]
b = [[1, 2], [4, 6]]
c = np.array((a,b))
np.sum(c, axis=0)
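A slightly more direct variant, sketched here under the assumption that both inputs have the same rectangular shape, is `np.add`, which converts the nested lists to arrays and adds them element-wise. Calling `.tolist()` turns the result back into plain Python lists if that is what is needed:

```python
import numpy as np

a = [[3, 4], [4, 5]]
b = [[1, 2], [4, 6]]

# np.add accepts array-like inputs and returns a NumPy array
summed = np.add(a, b)

# Convert back to nested lists
c = summed.tolist()

print(c)
```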
I know this is an old question, but the following nested-loop code works exactly as the OP wanted:
sumlist = []
for i, aa in enumerate(a):
    for j, bb in enumerate(b):
        if i == j:
            templist = []
            for k in range(2):
                templist.append(aa[k]+bb[k])
            sumlist.append(templist)
            templist = []
print(sumlist)
Output:
[[4, 6], [8, 11]]