I am trying to write data to an Excel spreadsheet using Python 3 and openpyxl. Assigning a single value works, but as soon as I introduce a for loop I get an error. The program will eventually filter a Python dictionary and write out its keys and values. For now, I am just trying to write a random integer to the spreadsheet for every top-level key in the dictionary (not including nested keys). If anyone can help me determine why this error comes up, it would be much appreciated. Thanks in advance!
# Writing Dictionary to excel spreadsheet
import openpyxl
import random
wb = openpyxl.load_workbook("ExampleSheet.xlsx")
sheet = wb.get_sheet_by_name("Sheet1")
sheet["B1"].value = "Price" #This works and assigns the B1 value to "price" in the spreadsheet
my_dict = {'key': {'key2' : 'value1', 'key3' : 'value2'} 'key4' : {'key5' : 'value3', 'key6' : 'value4'}} #an example dictionary
n = len(my_dict)
for i in range(0,n):
    sheet['A'+ str(i)].value = random.randint(1,10) #This does not work and gives an error
wb.save('ExampleSheet.xlsx')
OUTPUT >>> AttributeError: 'tuple' object has no attribute 'value'
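Rows in openpyxl are one-based, so 'A0' (what the loop builds on its first pass, when i is 0) is not a valid cell reference and the lookup does not return a single cell. If you change the loop to go over range(1, n + 1), your issue should be resolved.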
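Using .format(i) instead of string + str(i) in your code also works well, as long as the row number starts at 1.
By the way, your my_dict definition raises a SyntaxError: a comma is missing between the 'key' and 'key4' entries.
For example:
for i in range(1, 11):
    sheet['A{}'.format(i)].value = 'xx'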
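Putting both answers together, the corrected loop might look like the sketch below. It assumes an in-memory workbook (openpyxl.Workbook()) in place of ExampleSheet.xlsx so that it runs without an existing file, and it uses the dictionary from the question with the missing comma added.

```python
import random

import openpyxl

# In-memory workbook stands in for ExampleSheet.xlsx (assumption for this sketch)
wb = openpyxl.Workbook()
sheet = wb.active

my_dict = {
    'key': {'key2': 'value1', 'key3': 'value2'},
    'key4': {'key5': 'value3', 'key6': 'value4'},
}

sheet["B1"].value = "Price"

# enumerate(..., start=1) yields 1-based row numbers, matching openpyxl's rows
for row, key in enumerate(my_dict, start=1):
    sheet.cell(row=row, column=1).value = random.randint(1, 10)

# Read the values back from column A to confirm one was written per top-level key
written = [sheet.cell(row=r, column=1).value for r in range(1, len(my_dict) + 1)]
```

Using sheet.cell(row=..., column=...) avoids building the 'A1'-style string at all, which is one way to sidestep the off-by-one.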
Related
I would like to create a pandas dataframe using the names from a list and then appending '_df' to the end of it but I seem to have two issues. Here is my code below.
read_csv = ['apple', 'orange', 'bananna']
for f in read_csv:
    print('DEBUG 7: Value of f inside the loop: ', f)
    ##!!! ERROR HERE - We have reassigned the csv file to f
    ##!!! ERROR HERE - f now contains contents of f.csv(e.g. apple.csv)
    f = pd.read_csv(f + '.csv')
    ##!!! ERROR HERE - Fix above error and the spice shall flow.
    #print('DEBUG 8: Inside read_csv \n', f)
The for loop runs and reads in the first item in my list 'apple' and assigns it to f.
We drop into the loop. The first print statement, DEBUG 7, returns the value of f as 'apple'. So far so good.
Next, we run on to the pd.read_csv which is where my first issue is. How do I append '_df' to f? I have read a few answers on here and tried them but it's not working as I expect. I would like to have the loop run and create a new dataframe for apple_df, orange_df and bananna_df. But we can come back to that.
The second error I get here is "ValueError: Wrong number of items passed 8, placement implies 1" The CSV file has 8 columns and that is getting assigned to f instead of the dataframe name.
I can't for the life of me work out what's occurring to make that happen. Well, I can. If I fix the apple_df issue I believe the dataframe will read in the csv file fine.
Still learning so all help is appreciated.
Thanks
Tom
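Use locals() to create the variables (apple_df, orange_df, ...). Note that this only works at module level, where locals() is the same dictionary as globals(); inside a function, assigning into locals() does not reliably create local variables.
read_csv = ['apple', 'orange', 'bananna']
for f in read_csv:
    locals()[f"{f}_df"] = pd.read_csv(f"{f}.csv")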
>>> type(apple_df)
pandas.core.frame.DataFrame
ValueError: Wrong number of items passed 8, placement implies 1
You got that error because the loop reassigns f, which holds the file name string, to the DataFrame. Store the DataFrame in a new variable instead, for example df:
df = pd.read_csv(f + '.csv')
If you want to create a new variable whose name is f plus "_df", one option is exec:
exec(f + "_df" + " = pd.read_csv(f + '.csv')")
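An alternative to locals() and exec that is often easier to maintain is to keep the DataFrames in a dictionary keyed by the names you want. The sketch below is only an illustration: the fake_files contents are hypothetical stand-ins for apple.csv, orange.csv and bananna.csv, so the example runs without any files on disk.

```python
import io

import pandas as pd

# Hypothetical stand-ins for apple.csv, orange.csv and bananna.csv
fake_files = {
    'apple': "a,b\n1,2\n",
    'orange': "a,b\n3,4\n",
    'bananna': "a,b\n5,6\n",
}

read_csv = ['apple', 'orange', 'bananna']
frames = {}
for name in read_csv:
    # With real files this would be pd.read_csv(name + '.csv')
    frames[name + '_df'] = pd.read_csv(io.StringIO(fake_files[name]))

# Look a DataFrame up by its generated name
apple_df = frames['apple_df']
```

The loop variable name is never reassigned here, so it keeps holding the file name while each DataFrame is stored under its own key.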
I have written a program that converts the raw credit card transaction file from my bank account into a cleansed file with some new columns.
I'm replacing a column's values based on my dictionary. The dictionary has 5 rows, whereas the data frame has a variable number of rows. The goal is to further group the data into types.
I'm also filtering the data, so I'm using masking as well.
Replace code:
t_type = df2['Transaction'].replace(mappingforcc.load_dictionary.dictionary, inplace=True)
While debugging, when I make the number of rows in the dictionary and the data frame equal, the code runs smoothly without any issue. But when the two differ, I get the following error:
ValueError: cannot assign mismatch length to masked array
I even wrote a function so that I don't have two data frames in my original code, since I'm creating the dictionary from an Excel file.
Despite several searches, I have been unable to resolve it.
Thanks in advance for the help.
Edit: I have found the issue. The problem is that I'm creating the dictionary with the following code:
load_dictionary.dictionary2 = df_dict.groupby(['Transactions'])['Type'].apply(list).to_dict()
Because there are multiple rows per transaction in the sheet, I get the following output in the dictionary:
{'Adv Tax FCY Tran 1%-F': ['Recoverable - Adv. Tax', 'Recoverable - Adv. Tax', 'Recoverable - Adv. Tax', 'Recoverable - Adv. Tax']
As a result, if another 'Adv Tax FCY Tran 1%-F' transaction appears, the replacement fails, because that key maps to a list instead of a single value.
I need help avoiding this issue.
I solved it by removing the duplicates from the dictionary Excel file.
My file had multiple rows for each transaction, so the groupby collected all of them into the dictionary. When the counts did not match, I got the mismatch error.
I also wrote a function that builds the dictionary myself as a workaround, but didn't use it, as everything worked like a charm after I removed the duplicates.
I'm a beginner at programming, but I'm still sharing the function (converted to plain code so it is easier to share and use):
import ast
import openpyexcel

file_path = r"filepath.xlsx"
df = openpyexcel.load_workbook(file_path)
ws = df['Sheet1']  # Enter your sheet name or automate using multiple functions in openpyxl
dictionary = "{"
# Row 1 is the header, so start from row 2; max_row + 1 so the last row is included
for i in range(2, ws.max_row + 1):
    # Columns A and B hold the key and the value; change accordingly
    dictionary = dictionary + '"' + ws['A' + str(i)].value + '"' + ":" + '"' + ws['B' + str(i)].value + '"'
    if i != ws.max_row:
        dictionary = dictionary + ','
dictionary = dictionary + '}'
dictionary = ast.literal_eval(dictionary)
This issue wasn't a waste of time at all; I learnt a lot along the way.
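For comparison, the duplicate rows can also be dropped in pandas before the dictionary is built, which gives one scalar value per key instead of a list. This is only a sketch: the column names 'Transactions', 'Type' and 'Transaction' come from the post, while the rows ('Annual Fee', 'Bank Charges', 'Other') are made up for illustration.

```python
import pandas as pd

# Made-up rows standing in for the mapping sheet; note the duplicated first entry
df_dict = pd.DataFrame({
    'Transactions': ['Adv Tax FCY Tran 1%-F', 'Adv Tax FCY Tran 1%-F', 'Annual Fee'],
    'Type': ['Recoverable - Adv. Tax', 'Recoverable - Adv. Tax', 'Bank Charges'],
})

# Keep one row per transaction name, then build a plain {name: type} dictionary
mapping = (
    df_dict.drop_duplicates(subset='Transactions')
    .set_index('Transactions')['Type']
    .to_dict()
)

df2 = pd.DataFrame({'Transaction': ['Annual Fee', 'Adv Tax FCY Tran 1%-F', 'Other']})
# replace returns a new Series; values without a mapping entry stay unchanged
df2['Type'] = df2['Transaction'].replace(mapping)
```

Assigning the result of replace to a column, rather than combining it with inplace=True, keeps the returned Series instead of None.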
I need to check whether an employee checked out during the break.
To do so, I need to see whether there is a time at which Door Name is RDC_OUT-1 within the interval [12:15:00 ; 14:15:00].
import pandas as pd
df_by_date= pd.DataFrame({'Time':['01/02/2019 07:02:07', '01/02/2019 10:16:55', '01/02/2019 12:27:20', '01/02/2019 14:08:58','01/02/2019 15:32:28','01/02/2019 17:38:54'],
'Door Name':['RDC_OUT-1', 'RDC_IN-1','RDC_OUT-1','RDC_IN-1','RDC_OUT-1','RDC_IN-1']})
df_by_date['Time'] = pd.to_datetime(df_by_date['Time'])
df_by_date['hours']=pd.to_datetime(df_by_date['Time'], format='%H:%M:%S').apply(lambda x: x.time())
print('hours \n',df_by_date['hours'])
out = '12:15:00'
inn = '14:15:00'
pause=0
for i in range (len(df_by_date)):
    if (out < str((df_by_date['hours'].iloc[i]).where(df_by_date['Door Name'].iloc[i]=='RDC_IN-1')) < inn) :
        pause+=1
        print('Break outside ')
    else:
        print('Break inside')
When running the code above, I got this error:
if (out < ((df_by_date['hours'].iloc[i]).where(df_by_date['Door Name'].iloc[i]=='RDC_OUT-1')) < inn) :
AttributeError: 'datetime.time' object has no attribute 'where'
When you iterate over the DataFrame/Series with iloc, you select one cell at a time.
The cell you are selecting is of type datetime.time, which has no where method.
where only works on a complete DataFrame/Series, not on a single value inside a loop.
For example:
sub_df = df_by_date['hours'].where(condition)
Note that where keeps the original length and fills the rows that fail the condition with NaN, so count the matches with sub_df.count() rather than len(sub_df).
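A loop-free version of the check can be sketched with boolean masks. It reuses the data from the question; the explicit month-first date format and the names start, end, is_out and in_break are assumptions made for this illustration.

```python
import datetime as dt

import pandas as pd

df_by_date = pd.DataFrame({
    'Time': ['01/02/2019 07:02:07', '01/02/2019 10:16:55', '01/02/2019 12:27:20',
             '01/02/2019 14:08:58', '01/02/2019 15:32:28', '01/02/2019 17:38:54'],
    'Door Name': ['RDC_OUT-1', 'RDC_IN-1', 'RDC_OUT-1', 'RDC_IN-1', 'RDC_OUT-1', 'RDC_IN-1'],
})
# Month-first format is an assumption; only the time of day matters below
df_by_date['Time'] = pd.to_datetime(df_by_date['Time'], format='%m/%d/%Y %H:%M:%S')

# Time-of-day part of each timestamp, as datetime.time objects
hours = df_by_date['Time'].dt.time
start, end = dt.time(12, 15), dt.time(14, 15)

# One boolean per row for each condition, combined with &
is_out = df_by_date['Door Name'] == 'RDC_OUT-1'
in_break = hours.apply(lambda t: start <= t <= end)

# Number of check-outs that fall inside the break interval
pause = int((is_out & in_break).sum())
```

Comparing datetime.time objects directly avoids the string comparison used in the original loop.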
I am creating a dictionary code_data by loading a CSV file to a data frame and converting it with to_dict method. This is a fragment of my code:
path = r"E:\Knoema_Work_Dataset\hrfpfwd\MetaData\MetaData_num_person.csv"
code_data = pd.read_csv(path, usecols=['value', 'display_value'], dtype=object)
code_data = code_data.set_index('value')['display_value'].to_dict()
In the following line I am attempting to replace its values:
data["Number of Deaths"] = data["Number of Deaths"].replace(code_data)
Sadly, it leads to an error:
Cannot compare types 'ndarray(dtype=int64)' and 'str'
Could you provide me with some assistance with regards to my problem?
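Going by the error message, one likely cause is a type mismatch: reading the CSV with dtype=object makes the dictionary keys strings, while the "Number of Deaths" column holds int64 values. The sketch below uses made-up data to show one way to line the types up; whether it matches the real files is an assumption.

```python
import pandas as pd

# Made-up data: the column to translate holds integers
data = pd.DataFrame({"Number of Deaths": [1, 2, 3]})

# Reading the lookup CSV with dtype=object produces string keys like these
code_data = {"1": "one", "2": "two", "3": "three"}

# Convert the keys to int so they match the column's dtype
int_codes = {int(key): label for key, label in code_data.items()}

# map looks each value up in the dictionary; values without an entry become NaN
data["Number of Deaths"] = data["Number of Deaths"].map(int_codes)
```

Converting the column to strings with astype(str) and keeping the string keys would be the mirror-image fix.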
Trying to run this simple for loop with a pandas cross tab function. The iteration target is an argument in the cross-tab function. It's supposed to read through a list of columns and produce a cross-tab for each column combination. But instead it's interpreting my 'i' iterable as the literal title of the column instead of whatever variable it should be in that iteration.
I get the error: 'DataFrame' object has no attribute 'i' because it's reading 'i' as the literal name of an attribute instead of the value that should be stored in i from the loop.
import pandas
DF = pandas.read_excel('example.xlsx')
Categories = list(DF.columns.values)
for i in Categories:
    pandas.crosstab(DF.Q, DF.i, normalize = 'index', margins=True)
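IIUC, you want to loop through every column and create the cross tab against column Q, but your current loop won't produce anything: DF.i looks for a column literally named "i", whereas DF[i] uses the value stored in the loop variable, and the result of each crosstab call is thrown away.
Use the following to assign the results to a Python dict that you can access with the column names as keys: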
DF = pandas.read_excel('example.xlsx')
Categories = list(DF.columns.values)
cross_tabs = {}
for i in Categories:
    cross_tabs[i] = pandas.crosstab(DF.Q, DF[i], normalize = 'index', margins=True)
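To try the loop without example.xlsx, here is a small self-contained sketch. The survey data is made up, and skipping the Q column in Categories is an added assumption so that Q is not cross-tabulated against itself.

```python
import pandas

# Made-up survey data standing in for example.xlsx
DF = pandas.DataFrame({
    'Q': ['yes', 'yes', 'no', 'no'],
    'A': ['x', 'y', 'x', 'x'],
    'B': ['u', 'u', 'v', 'u'],
})

# Skip Q itself so it is not cross-tabulated against itself
Categories = [col for col in DF.columns if col != 'Q']

cross_tabs = {}
for i in Categories:
    # DF[i] uses the value of i; DF.i would look for a column literally named "i"
    cross_tabs[i] = pandas.crosstab(DF['Q'], DF[i], normalize='index', margins=True)

# Share of 'x' answers in column A among the rows where Q is 'yes'
share_x_given_yes = cross_tabs['A'].loc['yes', 'x']
```

Each table can then be retrieved by column name, for example cross_tabs['B'].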