Error: 'BlockManager' object has no attribute 'T' issue while using df.at function in a loop - python-3.x

When I use the df.at function outside a loop it works fine and changes the data for a particular column, but it raises an error when I use it inside a loop.
The code is here:
import pandas as pd
data1 = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
         'Height': [5.1, 6.2, 5.1, 5.2]}
df1 = pd.DataFrame(data1)
data2 = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
         'Height': [4.1, 3.4, 7.1, 9.2]}
df2 = pd.DataFrame(data2)
df3 = pd.concat([df1, df2], axis=1)
for i in range(int(len(df1))):
    for j in range(int(len(df2))):
        if df1['Name'][i] != df2['Name'][j]:
            continue
        else:
            out = (df1['Height'][i] - df2['Height'][j])
            df3.at[i, 'Height_Comparison'] = out
            break
print(df3)

The issue occurred because of duplicate column names ('Name', 'Height') in the DataFrame df3. The concat operation along axis=1 keeps the columns of both inputs, so df3 ends up with two 'Name' columns and two 'Height' columns, and that is what causes the problem.
Once I changed the column names to Name1, Height1 in df1 and Name2, Height2 in df2, the issue was resolved.
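The fix described above can be sketched as follows. This is only an illustration: it uses add_suffix to produce the Name1/Height1 and Name2/Height2 names, and the variable names left and right are chosen for the example. It also creates the Height_Comparison column before the loop, which is an extra precaution not mentioned in the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
                    'Height': [5.1, 6.2, 5.1, 5.2]})
df2 = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
                    'Height': [4.1, 3.4, 7.1, 9.2]})

# Give each frame distinct column names so concat does not create duplicates.
left = df1.add_suffix('1')    # Name1, Height1
right = df2.add_suffix('2')   # Name2, Height2
df3 = pd.concat([left, right], axis=1)

# Precaution (not in the original answer): create the target column first,
# so that .at only writes into cells that already exist.
df3['Height_Comparison'] = float('nan')

for i in range(len(left)):
    for j in range(len(right)):
        if left['Name1'][i] == right['Name2'][j]:
            df3.at[i, 'Height_Comparison'] = left['Height1'][i] - right['Height2'][j]
            break

print(df3)
```

A quick way to check for this kind of problem in other code is df3.columns.is_unique, which is False whenever a label is repeated.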

Related

iterate over a column check condition and carry calculations with values of other data frames

import pandas as pd
import numpy as np
I have three dataframes: df1, df2 and df3.
df1:
data = {'Period': ['2024-04-01', '2024-07-01', '2024-10-01', '2025-01-01', '2025-04-01', '2025-07-01', '2025-10-01', '2026-01-01', '2026-04-01', '2026-07-01', '2026-10-01', '2027-01-01', '2027-04-01', '2027-07-01', '2027-10-01', '2028-01-01', '2028-04-01', '2028-07-01', '2028-10-01'],
        'Price': [np.nan] * 19,  # to be filled in
        'years': [2024, 2024, 2024, 2025, 2025, 2025, 2025, 2026, 2026, 2026, 2026, 2027, 2027, 2027, 2027, 2028,
                  2028, 2028, 2028],
        'quarters': [2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]
        }
df1 = pd.DataFrame(data=data)
df2:
data = {'price': [473.26, 244, 204, 185, 152, 157],
        'year': [2023, 2024, 2025, 2026, 2027, 2028]
        }
df2 = pd.DataFrame(data=data)
df3:
data = {'quarters': [1, 2, 3, 4],
        'weights': [1.22, 0.81, 0.83, 1.12]
        }
df3 = pd.DataFrame(data=data)
My aim is to compute the Price column of df1. For each row of df1, I want to check a condition and carry out the calculation accordingly. For example, in the first iteration, check whether df1['years'] == 2024 and df1['quarters'] == 2. Then df1['Price'] = df2.loc[df2['year'] == 2024, 'price'] * df3.loc[df3['quarters'] == 2, 'weights'].
===>>> df1['Price'][0] = 473.26 * 0.81
df1['Price'][1] = 473.26 * 0.83
...
...
...
and so on.
I could have used this method, but I want to write the code in a more efficient way. I would like to use the following code structure.
for i in range(len(df1)):
    # Compare the values of row i, not the whole columns.
    if (df1['years'][i] == 2024) and (df1['quarters'][i] == 2):
        df1.loc[i, 'Price'] = df2.loc[df2['year'] == 2024, 'price'].iloc[0] * df3.loc[df3['quarters'] == 2, 'weights'].iloc[0]
    elif (df1['years'][i] == 2024) and (df1['quarters'][i] == 3):
        df1.loc[i, 'Price'] = df2.loc[df2['year'] == 2024, 'price'].iloc[0] * df3.loc[df3['quarters'] == 3, 'weights'].iloc[0]
    elif (df1['years'][i] == 2024) and (df1['quarters'][i] == 4):
        df1.loc[i, 'Price'] = df2.loc[df2['year'] == 2024, 'price'].iloc[0] * df3.loc[df3['quarters'] == 4, 'weights'].iloc[0]
...
...
...
Thanks!!!
If I understand correctly, you can use pd.merge to bring these fields together first.
df1 = df1.merge(df2, how='left', left_on='years', right_on='year')
df1 = df1.merge(df3, how='left', on='quarters')
df1['Price'] = df1['price']*df1['weights']
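As a rough, self-contained illustration of the merge approach, the sketch below uses a shortened version of the question's data (only four rows of df1 and three rows of df2, chosen for brevity). The name out and the final drop of the helper columns are additions for the example, not part of the original answer.

```python
import pandas as pd

# Shortened sample data, following the column names used in the question.
df1 = pd.DataFrame({'years': [2024, 2024, 2024, 2025],
                    'quarters': [2, 3, 4, 1]})
df2 = pd.DataFrame({'price': [473.26, 244, 204],
                    'year': [2023, 2024, 2025]})
df3 = pd.DataFrame({'quarters': [1, 2, 3, 4],
                    'weights': [1.22, 0.81, 0.83, 1.12]})

# Left merges keep every row of df1 in its original order and attach
# the matching yearly price and quarterly weight to each row.
out = (df1.merge(df2, how='left', left_on='years', right_on='year')
          .merge(df3, how='left', on='quarters'))

# One vectorised multiplication replaces the whole if/elif chain.
out['Price'] = out['price'] * out['weights']

# Drop the helper columns that were only needed for the calculation.
out = out.drop(columns=['year', 'price', 'weights'])
print(out)
```

Note that matching on the year means the 2024 rows pick up the 2024 price (244), since that is the row of df2 whose year equals 2024.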

Better way to swap column values and then append them in a pandas dataframe?

Here is my dataframe:
import pandas as pd
data = {'from':['Frida', 'Frida', 'Frida', 'Pablo','Pablo'], 'to':['Vincent','Pablo','Andy','Vincent','Andy'],
'score':[2, 2, 1, 1, 1]}
df = pd.DataFrame(data)
df
I want to swap the values in columns 'from' and 'to' and add the swapped rows to the original, because these scores work both ways. Here is what I have tried:
df_copy = df.copy()
df_copy.rename(columns={"from":"to","to":"from"}, inplace=True)
df_final = pd.concat([df, df_copy])  # DataFrame.append was removed in pandas 2.0
This works, but is there a shorter way to do the same?
One line could be:
df_final = pd.concat([df, df.rename(columns={"from":"to","to":"from"})])
You are on the right track. Note that df.copy() already makes a deep copy by default (deep=True), so spelling it out is optional; renaming the columns of df_copy does not alter df.
df_copy = df.copy(deep=True)
df_copy.rename(columns={"from":"to","to":"from"}, inplace=True)
df_final = pd.concat([df, df_copy])
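For reference, here is a self-contained sketch of the swap-and-stack idea with pd.concat (DataFrame.append was removed in pandas 2.0). The name swapped and the use of ignore_index=True are choices made for this example. Without ignore_index, the original index labels 0 to 4 would appear twice in the result.

```python
import pandas as pd

df = pd.DataFrame({
    'from': ['Frida', 'Frida', 'Frida', 'Pablo', 'Pablo'],
    'to': ['Vincent', 'Pablo', 'Andy', 'Vincent', 'Andy'],
    'score': [2, 2, 1, 1, 1],
})

# rename returns a new frame, so df itself is left untouched.
swapped = df.rename(columns={'from': 'to', 'to': 'from'})

# concat aligns on column names, so the swapped values end up
# under the correct 'from' and 'to' headers.
df_final = pd.concat([df, swapped], ignore_index=True)
print(df_final)
```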

Check if the dataframe exist, if so do merge

I have 3 dataframes (df1, df2, df3), of which df3 may or may not be created. If df3 has been created, merge all three; otherwise just merge df1 and df2.
I am trying the code below:
import numpy as np
import pandas as pd
df1 = pd.DataFrame([['a',1,2],['b',4,5],['c',7,8],[np.nan,10,11]], columns=['id','x','y'])
df2 = pd.DataFrame([['a',1,2],['b',4,5],['c',7,10],[np.nan,10,11]], columns=['id','x','y'])
df3 = pd.DataFrame([['g',1,2],['h',4,5],['i',7,10],[np.nan,10,11]], columns=['id','x','y'])
if df3 is not None:
    result = pd.concat([df1, df2, df3])  # df3 exists: combine all three
else:
    result = pd.concat([df1, df2])  # otherwise combine only df1 and df2
It gives me the error "NameError: name 'df3' is not defined" if df3 does not exist.
This answer might have the key you're looking for: https://stackoverflow.com/a/1592578
result = pd.concat([df1, df2])
try:
    result = pd.concat([result, df3])
except NameError:
    pass  # df3 does not exist
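An alternative to catching NameError, assuming you control the code that may or may not build df3, is to set df3 = None up front and filter out the missing frame before concatenating. This is a different technique from the linked answer, and concat_existing is a hypothetical helper name used only for this sketch.

```python
import pandas as pd


def concat_existing(*frames):
    """Concatenate only the frames that are not None (hypothetical helper)."""
    present = [f for f in frames if f is not None]
    return pd.concat(present, ignore_index=True)


df1 = pd.DataFrame([['a', 1, 2], ['b', 4, 5]], columns=['id', 'x', 'y'])
df2 = pd.DataFrame([['c', 7, 10]], columns=['id', 'x', 'y'])
df3 = None  # defined up front; reassigned only if the third frame is built

two = concat_existing(df1, df2, df3)    # df3 is None, so it is skipped

df3 = pd.DataFrame([['g', 1, 2]], columns=['id', 'x', 'y'])
three = concat_existing(df1, df2, df3)  # now all three are combined
```

Because the name df3 always exists, the check becomes an ordinary "is not None" test and no exception handling is needed.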

None of [Index(['a', 'c'], dtype='object')] are in the [columns] error

I have a main dataframe and want to create a sub-dataframe with specific columns:
df_main=
[a,b,c,d]
[1,3,6,0]
When I want to pick specific columns and create a new one, it throws me this ugly error:
df_new = df.loc[:, ['a','c']]
df_new.head()
Out: None of [Index(['a', 'c'], dtype='object')] are in the [columns]
What is the issue here?
If I understand correctly:
import pandas as pd
df = pd.DataFrame({'a':[1], 'b':[3], 'c':[6], 'd':[0]})
df_new = df.loc[:, ['a','c']]
df_new:
   a  c
0  1  6
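Since the selection above works on a cleanly built frame, the error suggests that the real column labels differ from 'a' and 'c'. The question does not show how df_main was created, so the following is only a guess at one possible cause: headers carrying stray whitespace. The frame below is hypothetical and built to reproduce that situation.

```python
import pandas as pd

# Hypothetical frame whose headers carry stray spaces.
df = pd.DataFrame({' a': [1], 'b ': [3], ' c ': [6], 'd': [0]})

# Inspect the exact labels; repr-style output makes the spaces visible.
print(df.columns.tolist())

# Strip the whitespace, after which the label-based selection succeeds.
df.columns = df.columns.str.strip()
df_new = df.loc[:, ['a', 'c']]
print(df_new)
```

If the printed labels turn out to be integers (0, 1, 2, 3) or something else entirely, the header row was probably not read as column names, and the selection needs to use those labels instead.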

Resetting multi index for a pivot_table to get single line index

I have a dataframe in the following format:
df = pd.DataFrame({'a':['1-Jul', '2-Jul', '3-Jul', '1-Jul', '2-Jul', '3-Jul'], 'b':[1,1,1,2,2,2], 'c':[3,1,2,4,3,2]})
I need the following dataframe:
df_new = pd.DataFrame({'a':['1-Jul', '2-Jul', '3-Jul'], 1:[3, 1, 2], 2:[4,3,2]})
I have tried the following:
df = df.pivot_table(index = ['a'], columns = ['b'], values = ['c'])
df_new = df.reset_index()
but it doesn't give me the required result. I have tried variations of this to no avail. Any help will be greatly appreciated.
Try this one:
df2 = df.groupby('a')['c'].agg(['first','last']).reset_index()
cols_ = df['b'].unique().tolist()
cols_.insert(0,df.columns[0])
df2.columns = cols_
df2
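The groupby version above relies on each value of 'a' having exactly two rows, in the order b=1 then b=2. A pivot-based sketch that does not depend on that ordering is shown below. It assumes every (a, b) pair occurs at most once, which holds for the sample data, and the name out is chosen for the example.

```python
import pandas as pd

df = pd.DataFrame({'a': ['1-Jul', '2-Jul', '3-Jul', '1-Jul', '2-Jul', '3-Jul'],
                   'b': [1, 1, 1, 2, 2, 2],
                   'c': [3, 1, 2, 4, 3, 2]})

# Passing values as a plain string (not a list) avoids the extra column level.
out = df.pivot(index='a', columns='b', values='c').reset_index()

# Remove the leftover 'b' label on the column axis so the header is a single line.
out.columns.name = None
print(out)
```

If an (a, b) pair can occur more than once, df.pivot raises an error; pivot_table with an explicit aggfunc would be the usual substitute in that case.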
