Divide certain elements in each sub list - python-3.x

b = [[2021, 55, -0.65, 7.61, 10.65, 41.37, 3.39, 12.25, -10.14, 7.61, 8.84],
[2022, 56, 3.0, -0.13, 8.84, 27.25, -0.48, 2.54, 12.43, 7.56, 3.37]]
I want to divide elements [2:10] of each sub list in b by 100. Result expected:
a = [2021, 55, -0.0065, 0.0761, 0.1065, 0.4137, 0.0339, 0.1225, -0.1014, 0.0761, 0.0884], etc
I've tried:
a = [item[2:10] / 100 for item in b]
Also tried:
a = [[item[2:10] / 100 for item in x] for x in b]
The first one gives "unsupported operand type for /: list and int". Second one gives "int object not subscriptable"

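There is a minor error in your list comprehension: you were slicing in the wrong place. A slice is itself a list, and a list cannot be divided by an int, so divide each element of the slice and concatenate the result onto the untouched first two elements: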
a = [x[:2] + [item / 100 for item in x[2:]] for x in b]
print(a)
Output:
[[2021, 55, -0.006500000000000001, 0.0761, 0.1065, 0.41369999999999996, 0.0339, 0.1225, -0.1014, 0.0761, 0.08839999999999999], [2022, 56, 0.03, -0.0013, 0.08839999999999999, 0.2725, -0.0048, 0.0254, 0.1243, 0.0756, 0.0337]]
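Note that x[2:] divides everything from index 2 to the end of each sub list, which matches the expected result shown in the question. If the goal is strictly the slice [2:10], leaving any later elements untouched, the tail has to be concatenated back on. A minimal sketch, where scale_slice and its parameter names are made up for illustration:

```python
def scale_slice(rows, start, stop, divisor):
    """Return new rows in which row[start:stop] is divided by divisor."""
    return [
        row[:start] + [value / divisor for value in row[start:stop]] + row[stop:]
        for row in rows
    ]


b = [[2021, 55, -0.65, 7.61, 10.65, 41.37, 3.39, 12.25, -10.14, 7.61, 8.84],
     [2022, 56, 3.0, -0.13, 8.84, 27.25, -0.48, 2.54, 12.43, 7.56, 3.37]]

a = scale_slice(b, 2, 10, 100)
print(a[0][:2], a[0][-1])  # head and tail are unchanged: [2021, 55] 8.84
```

Because the comprehension builds new lists, b itself is left unmodified.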

Without a list comprehension (note that this modifies b in place):
In [12]: b = [[2021, 55, -0.65, 7.61, 10.65, 41.37, 3.39, 12.25, -10.14, 7.61, 8.84],
    ...:      [2022, 56, 3.0, -0.13, 8.84, 27.25, -0.48, 2.54, 12.43, 7.56, 3.37]]
    ...:

In [13]: for i in range(len(b)):
    ...:     if len(b[i]) >= 10:
    ...:         for j in range(2, 10):
    ...:             b[i][j] = b[i][j] / 100
Output (the value of b afterwards):
[[2021,
55,
-0.006500000000000001,
0.0761,
0.1065,
0.41369999999999996,
0.0339,
0.1225,
-0.1014,
0.0761,
8.84],
[2022,
56,
0.03,
-0.0013,
0.08839999999999999,
0.2725,
-0.0048,
0.0254,
0.1243,
0.0756,
3.37]]

No need to take slices out first... change your list-comprehension to:
b = [[2021, 55, -0.65, 7.61, 10.65, 41.37, 3.39, 12.25, -10.14, 7.61, 8.84],
[2022, 56, 3.0, -0.13, 8.84, 27.25, -0.48, 2.54, 12.43, 7.56, 3.37]]
res = [[item / 100 if 2 <= i < 10 else item for i, item in enumerate(lst)] for lst in b]
print(res)
Output:
[[2021, 55, -0.006500000000000001, 0.0761, 0.1065, 0.41369999999999996, 0.0339, 0.1225, -0.1014, 0.0761, 8.84], [2022, 56, 0.03, -0.0013, 0.08839999999999999, 0.2725, -0.0048, 0.0254, 0.1243, 0.0756, 3.37]]

res = [x[:2] + [x[i]/100 for i in range(len(x)) if i > 1] for x in b]
print(res)
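The outputs above contain values such as -0.006500000000000001 rather than the -0.0065 the question expects; that is ordinary binary floating-point behaviour, not a bug in the comprehensions. If tidy values are wanted, one option is to round after dividing. A sketch, assuming four decimal places are enough (the inputs have two decimals and are divided by 100); divide_and_round is a hypothetical helper name:

```python
def divide_and_round(rows, start, stop, divisor, ndigits=4):
    """Divide elements at positions start..stop-1 and round the quotients."""
    return [
        [round(value / divisor, ndigits) if start <= i < stop else value
         for i, value in enumerate(row)]
        for row in rows
    ]


b = [[2021, 55, -0.65, 7.61, 10.65, 41.37, 3.39, 12.25, -10.14, 7.61, 8.84],
     [2022, 56, 3.0, -0.13, 8.84, 27.25, -0.48, 2.54, 12.43, 7.56, 3.37]]

res = divide_and_round(b, 2, 10, 100)
print(res[0])
```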

Related

How to find the index position of items in a pandas list which satisfy a certain condition?

How can I find the index position of items in a list which satisfy a certain condition?
Like suppose, I have a list like:
myList = [0, 100, 335, 240, 300, 450, 80, 500, 200]
And the condition is to find out the position of all elements within myList which lie between 0 and 300 (both inclusive).
I am expecting the output as:
output = [0, 1, 3, 4, 6, 8]
How can I do this in pandas?
Also, how to find out the index of the maximum element in the subset of elements which satisfy the condition? Like, in the above case, out of the elements which satisfy the given condition 300 is the maximum and its index is 4. So, need to retrieve its index.
I have been trying many ways but not getting the desired result. Please help, I am new to the programming world.
You can try this,
>>> import pandas as pd
>>> df = pd.DataFrame({'a': [0, 100, 335, 240, 300, 450, 80, 500, 200]})
>>> index = list(df[(df.a >= 0) & (df.a <= 300)].index)
>>> df.loc[index,].idxmax()
a 4
dtype: int64
or using the list,
>>> l = [0, 100, 335, 240, 300, 450, 80, 500, 200]
>>> index = [(i, v) for i, v in enumerate(l) if v >= 0 and v <= 300]
>>> [t[0] for t in index]
[0, 1, 3, 4, 6, 8]
>>> sorted(index, key=lambda x: x[1])[-1][0]
4
As Grzegorz Skibinski suggests, numpy can do the same with less manual work:
>>> import numpy as np
>>> l = [0, 100, 335, 240, 300, 450, 80, 500, 200]
>>> index = np.array([[i, v] for i, v in enumerate(l) if v >= 0 and v <= 300])
>>> index[:,0]
array([0, 1, 3, 4, 6, 8])
>>> index[index.argmax(0)[1]][0]
4
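The pure-list variant above sorts all matching pairs just to pick the largest one. A lighter sketch of the same idea uses max() with a key function; the variable names here are illustrative, and default=None is an added assumption to cover the case where nothing matches:

```python
my_list = [0, 100, 335, 240, 300, 450, 80, 500, 200]

# Positions of all values inside the inclusive range 0..300.
matches = [i for i, value in enumerate(my_list) if 0 <= value <= 300]
print(matches)  # [0, 1, 3, 4, 6, 8]

# Among those positions, pick the one whose value is largest.
best = max(matches, key=my_list.__getitem__, default=None)
print(best)  # 4
```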
You can use numpy for that purpose:
import numpy as np
myList =np.array( [0, 100, 335, 240, 300, 450, 80, 500, 200])
res=np.where((myList>=0)&(myList<=300))[0]
print(res)
###and to get maximum:
res2=res[myList[res].argmax()]
print(res2)
Output:
[0 1 3 4 6 8]
4
pandas has Series.between for this (inclusive on both ends by default):
import pandas as pd
myList = [0, 100, 335, 240, 300, 450, 80, 500, 200]
s = pd.Series(myList)
s.index[s.between(0,300)]
Output:
Int64Index([0, 1, 3, 4, 6, 8], dtype='int64')

trapz integration on dataframe index after grouping

I have some data. I want to first group the Target column into intervals and then integrate it using the index spacing.
import numpy as np
import pandas as pd
from scipy import integrate
df = pd.DataFrame({'A': np.array([100, 105.4, 108.3, 111.1, 113, 114.7, 120, 125, 129, 130, 131, 133,135,140, 141, 142]),
'B': np.array([11, 11.8, 12.3, 12.8, 13.1,13.6, 13.9, 14.4, 15, 15.1, 15.2, 15.3, 15.5, 16, 16.5, 17]),
'C': np.array([55, 56.3, 57, 58, 59.5, 60.4, 61, 61.5, 62, 62.1, 62.2, 62.3, 62.5, 63, 63.5, 64]),
'Target': np.array([4000, 4200.34, 4700, 5300, 5800, 6400, 6800, 7200, 7500, 7510, 7530, 7540, 7590,
8000, 8200, 8300])})
df['y'] = df.groupby(pd.cut(df.iloc[:, 3], np.arange(0, max(df.iloc[:, 3]) + 100, 100))).sum().apply(lambda g: integrate.trapz(g.Target, x = g.index))
Above code gives me:
AttributeError: ("'Series' object has no attribute 'Target'", 'occurred at index A')
If I try this:
colNames = ['A', 'B', 'C', 'Target']
df['z'] = df.groupby(pd.cut(df.iloc[:, 3], np.arange(0, max(df.iloc[:, 3]) + 100, 100))).sum().apply(lambda g: integrate.trapz(g[colNames[3]], x = g.index))
I get:
TypeError: 'str' object cannot be interpreted as an integer
During handling of the above exception, another exception occurred:
....
KeyError: ('Target', 'occurred at index A')
You have several problems in your code:
After the groupby you have a new dataframe with a categorical (interval) index, which I think integrate.trapz cannot deal with.
With apply, you are calling integrate.trapz once per column (apply works column by column by default; note the 'occurred at index A' in the error), so g is a single Series with no Target attribute. That makes no sense here. For that reason I asked in my comment whether you need, in each row, the integral from 0 to the Target value in that row.
If you want to transform your data by intervals of 100 in the column 'Target', starting from 0 as you have done, first get the sum within each interval:
>>> i_df = df.groupby(pd.cut(df.iloc[:, 3], np.arange(0, max(df.iloc[:, 3]) + 100, 100))).sum()
Then get the trapezoidal integral of column 'Target' with a spacing of 100:
>>> integrate.trapz(i_df['Target'], dx=100)
10242034.0
You cannot use x=i_df.index because subtraction (which trapz performs internally) is not defined for intervals, and you have created an interval index.
If you need to use the dataframe index, you must reset it:
>>> i_df = df.groupby(pd.cut(df.iloc[:, 3], np.arange(0, max(df.iloc[:, 3]) + 100, 100))).sum().reset_index(drop=True)
>>> integrate.trapz(i_df['Target'], x=100*i_df.index)
10242034.0
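Both calls return the same number because, after reset_index, the positions 0, 1, 2, ... multiplied by 100 are simply sample points spaced 100 apart. The trapezoidal rule itself is short enough to write out by hand; the sketch below is a plain-Python illustration of what is being computed, not SciPy's implementation, and the function names are made up. (Recent SciPy releases expose this function as integrate.trapezoid; the older trapz name has been removed.)

```python
def trapezoid_uniform(y, dx):
    """Trapezoidal rule for samples y that are dx apart."""
    if len(y) < 2:
        return 0.0
    return dx * (sum(y) - (y[0] + y[-1]) / 2)


def trapezoid(y, x):
    """Trapezoidal rule for samples y taken at positions x."""
    return sum(
        (x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2
        for i in range(len(y) - 1)
    )


y = [1, 3, 2]
positions = [100 * i for i in range(len(y))]  # 0, 100, 200

print(trapezoid_uniform(y, dx=100))  # 450.0
print(trapezoid(y, positions))       # 450.0
```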

A problem with printing spiral list in python

I have a problem with making a spiral list.
The program should output a table of size n × n, filled with numbers from 1 to n * n in a spiral coming from the upper-left corner in a clockwise fashion, as shown in the example (here n = 5)
It works when n is even but not when n is odd:
n = int(input())
arr = [[0 for i in range(n)] for j in range(n)]
stop = 0
start = 0
elem = 1
while elem <= n*n:
    stop += 1
    for j in range(start, n-stop):
        i = start
        arr[i][j] = elem
        elem += 1
    for i in range(start, n-stop):
        j = n-stop
        arr[i][j] = elem
        elem += 1
    for j in range(n-stop, start, -1):
        i = n-stop
        arr[i][j] = elem
        elem += 1
    for i in range(n-stop, start, -1):
        j = start
        arr[i][j] = elem
        elem += 1
    start += 1
for i in range(len(arr)):
    for j in range(len(arr)):
        print(arr[i][j], end=' ')
    print()
Please help: where could the problem be?
You can use numpy:
import numpy as np
def spiral(n=5):
    a = np.arange(n*n)
    b = a.reshape((n,n))
    m = None
    for i in range(n, 0, -2):
        m = np.r_[m, b[0, :], b[1:, -1], b[-1, :-1][::-1], b[1:-1, 0][::-1]]
        b = b[1:-1, 1:-1]
    a[list(m[1:])]=list(a)
    return a.reshape((n,n)) + 1
spiral()
array([[ 1, 2, 3, 4, 5],
[16, 17, 18, 19, 6],
[15, 24, 25, 20, 7],
[14, 23, 22, 21, 8],
[13, 12, 11, 10, 9]])
spiral(10)
array([[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[ 36, 37, 38, 39, 40, 41, 42, 43, 44, 11],
[ 35, 64, 65, 66, 67, 68, 69, 70, 45, 12],
[ 34, 63, 84, 85, 86, 87, 88, 71, 46, 13],
[ 33, 62, 83, 96, 97, 98, 89, 72, 47, 14],
[ 32, 61, 82, 95, 100, 99, 90, 73, 48, 15],
[ 31, 60, 81, 94, 93, 92, 91, 74, 49, 16],
[ 30, 59, 80, 79, 78, 77, 76, 75, 50, 17],
[ 29, 58, 57, 56, 55, 54, 53, 52, 51, 18],
[ 28, 27, 26, 25, 24, 23, 22, 21, 20, 19]])
A better way would be to import pdb and step through your program with a debugger. Instead, I just added some extra print statements:
n = 5
arr = [[0 for i in range(n)] for j in range(n)]
stop = 0
start = 0
elem = 1
count = 0
while elem <= n*n:
    stop += 1
    for j in range(start, n-stop):
        i = start
        arr[i][j] = elem
        print('a')
        elem += 1
    for i in range(start, n-stop):
        j = n-stop
        arr[i][j] = elem
        print('b')
        elem += 1
    for j in range(n-stop, start, -1):
        i = n-stop
        arr[i][j] = elem
        print('c')
        elem += 1
    for i in range(n-stop, start, -1):
        j = start
        arr[i][j] = elem
        print('d')
        elem += 1
    print('e')
    count += 1
    if count > 50:
        break
    start += 1
for i in range(len(arr)):
    for j in range(len(arr)):
        print(arr[i][j], end=' ')
    print()
Here's the output I got:
a
a
a
a
b
b
b
b
c
c
c
c
d
d
d
d
e
a
a
b
b
c
c
d
d
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
e
1 2 3 4 5
16 17 18 19 6
15 24 0 20 7
14 23 22 21 8
13 12 11 10 9
It looks like something about odd values of n is allowing each for loop to complete, but elem doesn't get incremented enough so your while loop is running forever.
This looks like homework or a coding challenge, so I'm not going to go too deep into why, but I hope I've given you a hint.
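For reference, one way to avoid the stall on odd n is to track the four edges of the unfilled region explicitly, so that the single center cell is still visited by the first inner loop. This is a sketch of that alternative approach, not a patch of the code in the question; spiral_grid and its variable names are chosen here for illustration:

```python
def spiral_grid(n):
    """Fill an n x n grid with 1..n*n in a clockwise spiral from the top-left."""
    grid = [[0] * n for _ in range(n)]
    top, bottom, left, right = 0, n - 1, 0, n - 1
    elem = 1
    while top <= bottom and left <= right:
        for j in range(left, right + 1):        # top edge, left to right
            grid[top][j] = elem
            elem += 1
        top += 1
        for i in range(top, bottom + 1):        # right edge, top to bottom
            grid[i][right] = elem
            elem += 1
        right -= 1
        if top <= bottom:
            for j in range(right, left - 1, -1):  # bottom edge, right to left
                grid[bottom][j] = elem
                elem += 1
            bottom -= 1
        if left <= right:
            for i in range(bottom, top - 1, -1):  # left edge, bottom to top
                grid[i][left] = elem
                elem += 1
            left += 1
    return grid


for row in spiral_grid(5):
    print(*row)
```

The loop condition depends on the edges rather than on elem, so it terminates for both odd and even n.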

Get list of rows with same name from dataframe using pandas

I am looking for a way to collect the rows that share the same Name into lists.
Name x y r
a 9 81 63
a 98 5 89
b 51 50 73
b 41 22 14
c 6 18 1
c 1 93 55
d 57 2 90
d 58 24 20
So I am trying to get a dictionary like this:
di = {a:{0: [9,81,63], 1: [98,5,89]},
b:{0:[51,50,73], 1:[41,22,14]},
c:{0:[6,18,1], 1:[1,93,55]},
d:{0:[57,2,90], 1:[58,24,20]}}
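Use groupby with a custom function that builds the numbered lists, then convert the resulting Series with to_dict (the columns are selected with a list, since pandas 2.0 and later reject the older tuple form):
di = (df.groupby('Name')[['x','y','r']]
        .apply(lambda x: dict(zip(range(len(x)),x.values.tolist())))
        .to_dict())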
print (di)
{'b': {0: [51, 50, 73], 1: [41, 22, 14]},
'a': {0: [9, 81, 63], 1: [98, 5, 89]},
'c': {0: [6, 18, 1], 1: [1, 93, 55]},
'd': {0: [57, 2, 90], 1: [58, 24, 20]}}
Detail:
print (df.groupby('Name')[['x','y','r']]
         .apply(lambda x: dict(zip(range(len(x)),x.values.tolist()))))
Name
a {0: [9, 81, 63], 1: [98, 5, 89]}
b {0: [51, 50, 73], 1: [41, 22, 14]}
c {0: [6, 18, 1], 1: [1, 93, 55]}
d {0: [57, 2, 90], 1: [58, 24, 20]}
dtype: object
Thanks to volcano for the suggestion to use enumerate:
di = (df.groupby('Name')[['x','y','r']]
        .apply(lambda x: dict(enumerate(x.values.tolist())))
        .to_dict())
For easier testing, you can use a named function instead of the lambda:
def f(x):
    #print (x)
    a = range(len(x))
    b = x.values.tolist()
    print (a)
    print (b)
    return dict(zip(a,b))
[[9, 81, 63], [98, 5, 89]]
range(0, 2)
[[9, 81, 63], [98, 5, 89]]
range(0, 2)
[[51, 50, 73], [41, 22, 14]]
range(0, 2)
[[6, 18, 1], [1, 93, 55]]
range(0, 2)
[[57, 2, 90], [58, 24, 20]]
di = df.groupby('Name')[['x','y','r']].apply(f).to_dict()
print (di)
Sometimes it is best to minimize the footprint and overhead.
Using itertools.count, collections.defaultdict
from itertools import count
from collections import defaultdict
counts = {k: count(0) for k in df.Name.unique()}
d = defaultdict(dict)
for k, *v in df.values.tolist():
    d[k][next(counts[k])] = v
dict(d)
{'a': {0: [9, 81, 63], 1: [98, 5, 89]},
'b': {0: [51, 50, 73], 1: [41, 22, 14]},
'c': {0: [6, 18, 1], 1: [1, 93, 55]},
'd': {0: [57, 2, 90], 1: [58, 24, 20]}}
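The same counting idea also works without pandas when the rows are plain tuples. A small sketch, assuming each row is (Name, x, y, r) as in the question's table; group_rows is a hypothetical helper name, and the length of the inner dict stands in for the itertools.count counters:

```python
from collections import defaultdict


def group_rows(rows):
    """Map each name to {0: first_row_values, 1: second_row_values, ...}."""
    grouped = defaultdict(dict)
    for name, *values in rows:
        # The next free position is the number of rows stored so far.
        grouped[name][len(grouped[name])] = values
    return dict(grouped)


rows = [("a", 9, 81, 63), ("a", 98, 5, 89),
        ("b", 51, 50, 73), ("b", 41, 22, 14),
        ("c", 6, 18, 1), ("c", 1, 93, 55),
        ("d", 57, 2, 90), ("d", 58, 24, 20)]

di = group_rows(rows)
print(di["a"])  # {0: [9, 81, 63], 1: [98, 5, 89]}
```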

Compare dictionaries Python 3 [duplicate]

I have these 2 dicts:
a={"test1":90, "test2":45, "test3":67, "test4":74}
b={"test1":32, "test2":45, "test3":82, "test4":100}
How can I take the maximum value for each key to get a new dict like the one below?
c={"test1":90, "test2":45, "test3":82, "test4":100}
You can try like this,
>>> a={"test1":90, "test2":45, "test3":67, "test4":74}
>>> b={"test1":32, "test2":45, "test3":82, "test4":100}
>>> c = { key:max(value,b[key]) for key, value in a.items() }  # items() in Python 3; iteritems() is Python 2 only
>>> c
{'test1': 90, 'test2': 45, 'test3': 82, 'test4': 100}
Try this:
>>> a={"test1":90, "test2":45, "test3":67, "test4":74}
>>> b={"test1":32, "test2":45, "test3":82, "test4":100}
>>> c = {k: max(a[k], b[k]) for k in a if k in b}
>>> c
{'test1': 90, 'test2': 45, 'test3': 82, 'test4': 100}
Not the best, but still a variant:
from itertools import chain
a = {'test1':90, 'test2': 45, 'test3': 67, 'test4': 74}
b = {'test1':32, 'test2': 45, 'test3': 82, 'test4': 100, 'test5': 1}
c = dict(sorted(chain(a.items(), b.items()), key=lambda t: t[1]))
assert c == {'test1': 90, 'test2': 45, 'test3': 82, 'test4': 100, 'test5': 1}
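The variants above differ in how they treat keys that appear in only one of the two dicts: the first raises a KeyError if a key of a is missing from b, the second skips such keys, and the chain version keeps them. A sketch that makes the "keep every key" behaviour explicit without sorting; merge_max is a name made up for this example:

```python
def merge_max(a, b):
    """Combine two dicts, keeping the larger value for keys present in both."""
    merged = dict(a)
    for key, value in b.items():
        merged[key] = max(value, merged[key]) if key in merged else value
    return merged


a = {'test1': 90, 'test2': 45, 'test3': 67, 'test4': 74}
b = {'test1': 32, 'test2': 45, 'test3': 82, 'test4': 100, 'test5': 1}

c = merge_max(a, b)
print(c)  # {'test1': 90, 'test2': 45, 'test3': 82, 'test4': 100, 'test5': 1}
```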
