SUM numbers in a nested list - python-3.x

Currently I am trying to sum numbers from two nested lists. I am very close, but for some reason the numbers are not adding properly.
def addTables(table1,table2):
    counter = 0
    index = 0
    table3 = [[None] * len(table1[0])] * len(table1)
    print(table3)
    for row in table1:
        for i in table1[0]:
            table3[counter][index] = table1[counter][index] + table2[counter][index]
            index = index +1
        index = 0
        counter = counter + 1
        print (table3)
My values are table1 = [[1,2,3],[4,5,6]] and table2 = [[1,2,3],[4,5,6]].
For some reason, it is printing
[[None, None, None], [None, None, None]]
[[2, 4, 6], [2, 4, 6]]
[[8, 10, 12], [8, 10, 12]]
but I want it to print
[[None, None, None], [None, None, None]]
[[2, 4, 6], [None, None, None]]
[[2, 4, 6], [8, 10, 12]]
I think this line is wrong, but I have no idea how to fix it.
table3[counter][index] = table1[counter][index] + table2[counter][index]

The problem is in
table3 = [[None] * len(table1[0])] * len(table1)
Multiplying a list actually copies references to its items; it doesn't duplicate the items in the list. To see what happens, look at your code like this:
subtable = [None] * len(table1[0])
# subtable = [None, None, None]
table3 = [subtable] * len(table1)
# table3 = [subtable, subtable]
Thus, table3[0] and table3[1] are actually the same list. So when you set table3[0][0], you're actually modifying subtable, which is table3[0] and table3[1]!
You can see this effect for yourself:
>>> table3 = [[None] * 3] * 2
>>> table3
[[None, None, None], [None, None, None]]
>>> table3[0][1] = 5
>>> table3
[[None, 5, None], [None, 5, None]]
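The sharing can also be checked directly with the `is` operator, which tests object identity rather than equality. A minimal sketch (the name `shared` is only for illustration):

```python
# Multiplying the outer list repeats a reference to ONE inner list.
shared = [[None] * 3] * 2

# Both rows are the very same object, not two equal-looking copies.
print(shared[0] is shared[1])

# Counting distinct object ids among the rows gives 1 for the same reason.
distinct_rows = len({id(row) for row in shared})
print(distinct_rows)
```

Both checks report that there is only one inner list, which is why a write through one row shows up in every row.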
You can fix this by using list comprehensions to construct table3:
>>> table3 = [[None for x in table1[0]] for y in table1]
>>> table3
[[None, None, None], [None, None, None]]
>>> table3[0][1] = 5
>>> table3
[[None, 5, None], [None, None, None]]
Alternatively, using list multiplication for the inner list is fine. (This is because the references to None get replaced, while references to a sublist get modified in place):
>>> table3 = [[None] * len(table1[0]) for y in table1]
>>> table3
[[None, None, None], [None, None, None]]
>>> table3[0][1] = 5
>>> table3
[[None, 5, None], [None, None, None]]
This subtlety may be confusing though. Using a nested list comprehension is more explicit.
But personally, I wouldn't construct the list ahead of time like that in the first place. Instead, I would recommend starting with an empty list and appending to it as you go:
table3 = []
counter = 0
for row in table1:
    sumrow = []
    index = 0
    for i in table1[0]:
        sumrow.append(table1[counter][index] + table2[counter][index])
        index += 1
    table3.append(sumrow)
    counter += 1
And, building upon that, it's usually cleaner to iterate over lists directly, rather than iterating over their indices. You can iterate over two equally-sized lists by using zip, like so:
for row1, row2 in zip(table1, table2):
    sumrow = []
    for item1, item2 in zip(row1, row2):
        sumrow.append(item1 + item2)
    table3.append(sumrow)
And once we're there, we could express sumrow as a list comprehension:
for row1, row2 in zip(table1, table2):
    table3.append([item1 + item2 for item1, item2 in zip(row1, row2)])
Note that this pairwise addition in the list comprehension can also be achieved by applying sum to every pair using map:
for row1, row2 in zip(table1, table2):
    table3.append(list(map(sum, zip(row1, row2))))
And then we could also replace the outer for loop by a list comprehension.
table3 = [list(map(sum, zip(row1, row2))) for row1, row2 in zip(table1, table2)]
Which can be slightly shortened by unpacking each pair of rows into zip with *:
table3 = [list(map(sum, zip(*rows))) for rows in zip(table1, table2)]
I'm a little conflicted about whether this is actually the best / most readable approach, so maybe I should've stopped a few versions ago. But hey, here we are ;)
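For reference, here is one of the middle versions wrapped into a complete function. This is only a sketch: the name `add_tables` is made up for illustration, and it assumes both tables have the same shape.

```python
def add_tables(table1, table2):
    """Return a new table holding the element-wise sums of two tables."""
    return [
        [item1 + item2 for item1, item2 in zip(row1, row2)]
        for row1, row2 in zip(table1, table2)
    ]


table1 = [[1, 2, 3], [4, 5, 6]]
table2 = [[1, 2, 3], [4, 5, 6]]
result = add_tables(table1, table2)
print(result)  # [[2, 4, 6], [8, 10, 12]]
```

Because every row is built fresh inside the comprehension, no two rows of the result share a list. Note that zip stops at the shorter input, so tables of different sizes would be truncated silently rather than raising an error.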

Related

Iterate over i,j in a function returning two arrays

I am trying to accept two arrays from a function and iterate over value pairs in an array
import numpy as np
a = np.zeros(10).astype(np.uint8)
a[0:4] = 1
hist = np.zeros(4)
values, counts = np.unique(a, return_counts=True)
for u, c in zip(values, counts):
    hist[u] += c
# This works. hist: [6. 4. 0. 0.]
# for u, c in zip(np.unique(a, return_counts=True)): # ValueError: not enough values to unpack (expected 2, got 1)
#     hist[u] += c
# for u, c in np.unique(a, return_counts=True): # IndexError: index 6 is out of bounds for axis 0 with size 4
#     hist[u] += c
The code works if I first unpack the two arrays and then use for k,v in zip(arr1, arr2).
Is it possible to write for k,v in function_returning_two_arrays(args) as a one-line statement?
Update: both zip(*arg) and [arg] work. Could you please elaborate on this syntax? A link to an article would be enough; then I can accept the answer. I understand that * unpacks a tuple, but what does [some_tuple] do?
Other than the unique step, this is just basic Python.
In [78]: a = np.zeros(10).astype(np.uint8)
...: a[0:4] = 1
...: ret = np.unique(a, return_counts=True)
unique returns a tuple of arrays, which can be used as is, or unpacked into 2 variables. I think unpacking makes the code clearer.
In [79]: ret
Out[79]: (array([0, 1], dtype=uint8), array([6, 4]))
In [80]: values, counts = ret
In [81]: values
Out[81]: array([0, 1], dtype=uint8)
In [82]: counts
Out[82]: array([6, 4])
The following just makes a list with 1 item - the tuple
In [83]: [ret]
Out[83]: [(array([0, 1], dtype=uint8), array([6, 4]))]
That's different from making a list of the two arrays - which just changes the tuple "wrapper" to a list:
In [84]: [values, counts]
Out[84]: [array([0, 1], dtype=uint8), array([6, 4])]
zip takes multiple items (it has a *args signature)
In [85]: list(zip(*ret)) # same as zip(values, counts)
Out[85]: [(0, 6), (1, 4)]
In [86]: [(i,j) for i,j in zip(*ret)] # using that in an iteration
Out[86]: [(0, 6), (1, 4)]
In [87]: [(i,j) for i,j in zip(values, counts)]
Out[87]: [(0, 6), (1, 4)]
So it pairs the nth element of values with the nth element of counts
Iteration on the [ret] list does something entirely different, or rather it does nothing - compare with Out[83]:
In [88]: [(i,j) for i,j in [ret]]
Out[88]: [(array([0, 1], dtype=uint8), array([6, 4]))]
I think of list(zip(*arg)) as a list version of transpose:
In [90]: np.transpose(ret)
Out[90]:
array([[0, 6],
       [1, 4]])
In [91]: [(i,j) for i,j in np.transpose(ret)]
Out[91]: [(0, 6), (1, 4)]
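The same behaviour can be reproduced without numpy. In the sketch below, `values_and_counts` is a hypothetical stand-in for any function that returns two equally long sequences, such as np.unique(a, return_counts=True):

```python
def values_and_counts():
    # Hypothetical stand-in: returns a tuple of two sequences.
    return (0, 1), (6, 4)


ret = values_and_counts()

# zip(*ret) is zip(ret[0], ret[1]): it pairs the n-th items of each sequence.
pairs = list(zip(*ret))
print(pairs)  # [(0, 6), (1, 4)]

# zip(ret) receives ONE argument, so every item it yields is a 1-tuple.
# Unpacking such an item into "u, c" is what raises
# "not enough values to unpack (expected 2, got 1)".
singles = list(zip(ret))
print(singles)  # [((0, 1),), ((6, 4),)]

# The one-line loop asked for in the question:
hist = [0, 0, 0, 0]
for u, c in zip(*values_and_counts()):
    hist[u] += c
print(hist)  # [6, 4, 0, 0]
```

The star spreads the returned tuple into separate positional arguments, so zip sees two sequences instead of one tuple.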

Filter pandas dataframe in python3 depending on the value of a list

So I have a dataframe like this:
df = {'c': ['A','B','C','D'],
      'x': [[1,2,3],[2],[1,3],[1,2,5]]}
And I want to create another dataframe that contains only the rows that have a certain value contained in the lists of x. For example, if I only want the ones that contain a 3, to get something like:
df2 = {'c': ['A','C'],
       'x': [[1,2,3],[1,3]]}
I am trying to do something like this:
df2 = df[(3 in df.x.tolist())]
But I am getting a
KeyError: False
exception. Any suggestion/idea? Many thanks!!!
df = df[df.x.apply(lambda x: 3 in x)]
print(df)
Prints:
   c          x
0  A  [1, 2, 3]
2  C     [1, 3]
The code below should help.
First, create a proper DataFrame:
import pandas as pd
df = pd.DataFrame({'c': ['A','B','C','D'],
                   'x': [[1,2,3],[2],[1,3],[1,2,5]]})
Then filter the rows whose list contains 3:
df[df.x.apply(lambda x: 3 in x)]
Output:
   c          x
0  A  [1, 2, 3]
2  C     [1, 3]
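Why the original attempt failed can be seen with plain lists, without pandas. In this sketch, `x_column` stands in for what df.x.tolist() would return:

```python
x_column = [[1, 2, 3], [2], [1, 3], [1, 2, 5]]

# "in" on the outer list compares 3 with each sublist as a whole.
# 3 is not equal to any sublist, so the result is a single False.
whole_column_test = 3 in x_column
print(whole_column_test)  # False

# Testing each row separately gives one boolean per row, which is the
# kind of mask that boolean indexing expects.
per_row_mask = [3 in row for row in x_column]
print(per_row_mask)  # [True, False, True, False]
```

With the original expression, the indexing therefore becomes df[False], which is treated as a lookup of a column named False; that matches the KeyError: False in the question. The apply call in the answers produces the per-row mask instead.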

how to multiply nested list with list?

I have:
dataA=[[1,2,3],[1,2,5]]
dataB=[1,2]
I want to multiply every element of dataA[0] by dataB[0], and every element of dataA[1] by dataB[1]. How can I do that?
I tried the following, but the result didn't match my expectations:
dataA=[[1,2,3],[1,2,5]]
dataB=[1,2]
tmp=[]
for a in dataA:
    tampung = []
    for b in a:
        cou=0
        hasil = b*dataB[cou]
        tampung.append(hasil)
        cou+=1
    tmp.append(tampung)
print(tmp)
output : [[1, 2, 3], [1, 2, 5]]
expected output : [[1,2,3],[2,4,10]]
Please help
List comprehensions are a wonderful thing in Python.
result = [[x*y for y in l] for x, l in zip(dataB, dataA)]
This does the same as:
result = []
for x, l in zip(dataB, dataA):
    temp = []
    for y in l:
        temp.append(x * y)
    result.append(temp)
result
## [[1, 2, 3], [2, 4, 10]]
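The loop from the question can also be repaired directly. There, `cou` is reset to 0 for every element, so dataB[0] is always used. One possible fix, sketched here with illustrative names (`row_index`, `factor`), is to pick the factor by the row's position:

```python
dataA = [[1, 2, 3], [1, 2, 5]]
dataB = [1, 2]

tmp = []
for row_index, row in enumerate(dataA):
    factor = dataB[row_index]  # one factor per row, chosen by the row's position
    tmp.append([value * factor for value in row])

print(tmp)  # [[1, 2, 3], [2, 4, 10]]
```

This assumes dataB has at least as many entries as dataA has rows; otherwise dataB[row_index] raises an IndexError.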
If you are working with numbers, consider using numpy, as it will make your operations much easier.
import numpy as np
dataA = [[1,2,3],[1,2,5]]
dataB = [1,2]
# convert the lists to arrays
dataA = np.asarray(dataA)
dataB = np.asarray(dataB)
# dataA = array([[1, 2, 3], [1, 2, 5]])
# 2 x 3 array
# dataB = array([1, 2])
# 1-D array of length 2
dataC_1 = dataA[0] * dataB[0] # multiply first row of dataA by first element of dataB
dataC_2 = dataA[1] * dataB[1] # multiply second row of dataA by second element of dataB
# dataC_1 = array([1, 2, 3])
# dataC_2 = array([2, 4, 10])
These arrays can always be converted back into lists by calling .tolist() on them.
As other contributors have said, please look into the numpy library!

pandas: aggregate a column of list into one list

I have the following data frame my_df:
name    numbers
----------------------
A       [4,6]
B       [3,7,1,3]
C       [2,5]
D       [1,2,3]
I want to combine all numbers to a new list, so the output should be:
new_numbers
---------------
[4,6,3,7,1,3,2,5,1,2,3]
And here is my code:
def combine_list(my_lists):
    new_list = []
    for x in my_lists:
        new_list.append(x)
    return new_list
new_df = my_df.agg({'numbers': combine_list})
but new_df still looks the same as the original:
    numbers
----------------------
0   [4,6]
1   [3,7,1,3]
2   [2,5]
3   [1,2,3]
What did I do wrong? How do I make new_df like:
new_numbers
---------------
[4,6,3,7,1,3,2,5,1,2,3]
Thanks!
You need to flatten the values and then create a new DataFrame with the constructor:
flatten = [item for sublist in df['numbers'] for item in sublist]
Or:
flatten = np.concatenate(df['numbers'].values).tolist()
Or:
from itertools import chain
flatten = list(chain.from_iterable(df['numbers'].values.tolist()))
df1 = pd.DataFrame({'numbers':[flatten]})
print (df1)
                             numbers
0  [4, 6, 3, 7, 1, 3, 2, 5, 1, 2, 3]
Timings are here.
You can use df['numbers'].sum() which returns a combined list to create the new dataframe
new_df = pd.DataFrame({'new_numbers': [df['numbers'].sum()]})
                         new_numbers
0  [4, 6, 3, 7, 1, 3, 2, 5, 1, 2, 3]
This should do:
newdf = pd.DataFrame({'numbers':[[x for i in my_df['numbers'] for x in i]]})
Check this pandas groupby and join lists
What you are looking for is,
my_df = my_df.groupby(['name']).agg(sum)
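One part of the original problem is independent of pandas: combine_list uses append, which adds each sublist as a single element, so nothing gets flattened. A small sketch with plain lists, where `numbers_column` stands in for my_df['numbers'] and the two function names are made up for illustration:

```python
numbers_column = [[4, 6], [3, 7, 1, 3], [2, 5], [1, 2, 3]]


def combine_with_append(lists):
    out = []
    for x in lists:
        out.append(x)  # adds each sublist as ONE element
    return out


def combine_with_extend(lists):
    out = []
    for x in lists:
        out.extend(x)  # adds the sublist's items one by one
    return out


nested = combine_with_append(numbers_column)
flat = combine_with_extend(numbers_column)
print(nested)  # [[4, 6], [3, 7, 1, 3], [2, 5], [1, 2, 3]]
print(flat)    # [4, 6, 3, 7, 1, 3, 2, 5, 1, 2, 3]
```

This covers only the list handling. How agg calls the function is a separate matter that the sketch does not address, which is why the answers above flatten the column first and then build the new DataFrame.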

Initializing matrix python3

I don't know whether this is a bug, or whether I have misunderstood the meaning of the * operator on lists:
>>> arr = [None] * 5 # Initialize array of 5 'None' items
>>> arr
[None, None, None, None, None]
>>> arr[2] = "banana"
>>> arr
[None, None, 'banana', None, None]
>>> # right?
...
>>> mx = [ [None] * 3 ] * 2 # initialize a 3x2 matrix with 'None' items
>>> mx
[[None, None, None], [None, None, None]]
>>> # so far, so good, but then:
...
>>> mx[0][0] = "banana"
>>> mx
[['banana', None, None], ['banana', None, None]]
>>> # Huh?
Is this a bug, or did I misunderstand the semantics of the * operator (__mul__)?
You're copying the same reference to the list multiple times. Do it like this:
matrix = [[None]*3 for i in range(2)]
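If this comes up often, the pattern can be wrapped in a small helper. This is a sketch only; `make_matrix` and its parameters are hypothetical names, not a standard function:

```python
def make_matrix(rows, cols, fill=None):
    """Build a rows x cols matrix whose rows are independent lists."""
    # The inner expression runs once per row, so each row is a new list.
    return [[fill] * cols for _ in range(rows)]


mx = make_matrix(2, 3)
mx[0][0] = "banana"
print(mx)  # [['banana', None, None], [None, None, None]]
```

Only the first row changes now. One caveat: if `fill` is itself a mutable object such as a list, every cell still refers to that same object, so this helper is best used with immutable fill values like None, numbers, or strings.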
