Dynamically set the parameter (value_if_true) of an IF formula - python-3.x

I am working with a large Excel workbook of stock data. I have the data in a format like this:
What I need to do is write the stock ticker name next to each cell whose loss is less than -10%.
I can try the simple formula =IF(B2<-0.1, "AAL", ""), but this only works until the next stock starts: for AADI it would still print "AAL", and that is the problem. I need to print the right ticker whenever the condition is true. If the stock is AAPL, the ticker AAPL should be printed next to the loss cell. How can I do that?
I don't know how to do this across millions of data points. I would welcome a good solution using Python, VB, or Excel formulas.

IIUC, here is a simple proposal using openpyxl:
from openpyxl import load_workbook

wb = load_workbook("file.xlsx")
ws = wb['Sheet1']
ticker_name = None  # no ticker seen yet
for num_row in range(1, ws.max_row + 1):
    cellB = ws.cell(row=num_row, column=2)
    if isinstance(cellB.value, str):
        # A text value in column B starts a new ticker block
        ticker_name = cellB.value
    else:
        try:
            cellC = ws.cell(row=num_row, column=3)
            # Loss below -10%: write the current ticker in column D
            if cellC.value < -0.1:
                ws.cell(row=num_row, column=4).value = ticker_name
        except TypeError:
            # Empty or non-numeric loss cell: skip the row
            pass
wb.save("file.xlsx")
NB: Always keep a backup copy of your original Excel file before running any Python/openpyxl script on it.
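
The carry-forward logic can also be kept separate from the file handling, which makes it easier to test on a small sample before running it on millions of rows. The sketch below is a minimal pure-Python illustration, assuming (as the answer above does) that a text value in column B starts a new ticker block and that column C holds the loss as a fraction. The name `label_losses`, its parameters, and the sample rows are hypothetical and not part of openpyxl.

```python
def label_losses(rows, threshold=-0.1):
    """Return one label per row: the current ticker if loss < threshold, else "".

    rows is an iterable of (col_b, col_c) pairs. A string in col_b is
    assumed to start a new ticker block (hypothetical layout).
    """
    labels = []
    ticker = None
    for col_b, col_c in rows:
        if isinstance(col_b, str):
            # Header row of a new ticker block: remember the name
            ticker = col_b
            labels.append("")
        elif isinstance(col_c, (int, float)) and ticker is not None and col_c < threshold:
            labels.append(ticker)
        else:
            labels.append("")
    return labels


# Made-up sample rows for illustration only
sample = [
    ("AAL", None),
    (None, -0.25),
    (None, 0.05),
    ("AADI", None),
    (None, -0.11),
    (None, -0.1),
]
print(label_losses(sample))
```

The pairs could be read with `ws.iter_rows(min_col=2, max_col=3, values_only=True)`. Opening the workbook with `load_workbook(..., read_only=True)` keeps memory use down, though a read-only workbook cannot be written to, so the labels would have to be saved to a separate file.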

Related

Openpyxl: How to merge cells using variable rows

I can't figure out how to merge cells without using the format 'A1:A4'
I want to be able to use ws.cell(row=x,column=y) format to set the start and end points of the merge.
The code below demonstrates what I want, but doesn't work as the range argument isn't in the correct format.
for x in range(10):
    start = ws.cell(row=x,column=1)
    end = ws.cell(row=x, column=4)
    ws.merge_cells(start:end)
Any help much appreciated!
According to the documentation, merge_cells has the following signature:
merge_cells(range_string=None, start_row=None, start_column=None, end_row=None, end_column=None)
so what you need to do is this instead:
# Rows in openpyxl are 1-based, so start at 1 rather than 0
for x in range(1, 11):
    ws.merge_cells(start_row=x, start_column=1, end_row=x, end_column=4)
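
If an 'A1:D1'-style string is still wanted, for example for logging or for the range_string argument, it can be built from row and column numbers. The following is a minimal sketch. `column_letter` and `range_string` are hypothetical helper names written in pure Python for illustration, and openpyxl ships its own `openpyxl.utils.get_column_letter` for the letter conversion.

```python
def column_letter(index):
    """Convert a 1-based column index to Excel letters (1 -> A, 27 -> AA)."""
    if index < 1:
        raise ValueError("column index must be >= 1")
    letters = ""
    while index > 0:
        # Excel columns are a bijective base-26 system, hence the index - 1
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def range_string(start_row, start_column, end_row, end_column):
    """Build an 'A1:D1'-style range from 1-based row and column numbers."""
    return "{}{}:{}{}".format(
        column_letter(start_column), start_row,
        column_letter(end_column), end_row,
    )


for x in range(1, 4):
    print(range_string(x, 1, x, 4))
```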

Openpyxl dataframe_to_rows generating unreadable content error

I created a fairly simple utility function that just puts a dataframe into an Excel worksheet, using the newer dataframe_to_rows option in openpyxl:
insertRows = dataframe_to_rows(df)
worksheet = workbook.create_sheet(title=sheetName)
for r_idx, row in enumerate(insertRows, 1):
    for c_idx, cell_value in enumerate(row, 1):
        worksheet.cell(row=r_idx, column=c_idx, value=cell_value)
When I opened the workbook, I received Excel's unreadable content error, which is usually related to formatting. After some googling, I found that the answer wasn't posted anywhere, so I am posting my fix below.
The problem ended up being related to the formatting of the numpy NaN values in the dataframe, which the cell class couldn't convert properly.
A simple fix was to change the insertion code into something like the following:
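for r_idx, row in enumerate(insertRows, 1):
    for c_idx, cell_value in enumerate(row, 1):
        try:
            if numpy.isnan(cell_value):
                cell_value = "NAN"
            elif numpy.isinf(cell_value):
                cell_value = "INF"
            else:
                cell_value = float(cell_value)
        except (TypeError, ValueError):
            # Not a number (e.g. a string or None): write it unchanged
            pass
        worksheet.cell(row=r_idx, column=c_idx, value=cell_value)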
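
The same check can be pulled out into a small helper so it can be tried on its own. This is a minimal sketch using the standard-library math module instead of numpy, assuming the numeric values are Python or NumPy floating-point scalars. `sanitize_cell_value` and its parameters are hypothetical names. Unlike the fix above, it returns ordinary numbers unchanged rather than converting them to float.

```python
import math


def sanitize_cell_value(value, nan_text="NAN", inf_text="INF"):
    """Replace NaN and infinite numbers with text; return other values unchanged."""
    try:
        if math.isnan(value):
            return nan_text
        if math.isinf(value):
            return inf_text
    except (TypeError, OverflowError):
        # Non-numeric values (strings, None, dates) or integers too large
        # for a float: leave them alone
        return value
    return value


row = [1.5, float("nan"), float("-inf"), "text", None, 3]
print([sanitize_cell_value(v) for v in row])
```

Inside the loop, the call would be `worksheet.cell(row=r_idx, column=c_idx, value=sanitize_cell_value(cell_value))`.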

How do I use OpenPyXL for a specified range?

How do I divide each value in one column by each value in a separate column?
Do I use the range function?
Example:
for i in range(2,80):
    sheet['D{}'.format(i)] = '=C1/E1, C2/E2, C3/E3, etc...'
You can get it done by applying the division to the actual values of the cells. Your code is pretty close; you just need to correct the right-hand side by accessing the cell values:
import openpyxl

wb = openpyxl.load_workbook('path/to/xl/file', read_only=False)
# Assuming you are working with Sheet1
sheet = wb['Sheet1']
for i in range(2, 80):
    c_value = sheet['C{}'.format(i)].value
    e_value = sheet['E{}'.format(i)].value
    try:
        sheet['D{}'.format(i)].value = int(c_value) / int(e_value)
    except (TypeError, ValueError, ZeroDivisionError):
        # Empty cell, non-numeric text, or a zero divisor
        print("Could not divide {} by {} as ints.".format(c_value, e_value))
wb.save('path/to/new/xl/file')
I hope this helps.
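
An alternative that stays closer to the question's original attempt is to write a per-row formula and let Excel do the division. The sketch below only builds the formula strings. `division_formulas` and its parameters are hypothetical names, and the column letters are the ones from the question.

```python
def division_formulas(first_row, last_row, numerator="C", denominator="E", target="D"):
    """Map each target cell address to a formula dividing two cells in the same row."""
    return {
        "{}{}".format(target, row): "={}{}/{}{}".format(numerator, row, denominator, row)
        for row in range(first_row, last_row + 1)
    }


formulas = division_formulas(2, 4)
for address, formula in formulas.items():
    print(address, formula)
```

Each entry could then be assigned with `sheet[address] = formula`, since openpyxl stores a string starting with "=" as a formula. Note that range(2, 80) in the question covers rows 2 to 79, which corresponds to `division_formulas(2, 79)`.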

write a list to Excel, starting at a specific cell, with openpyxl

I am trying to write a list of values to an Excel spreadsheet, starting at a specific cell. The values will be written periodically and the new values will replace the existing values, so writing will always start in cell F2. I have tried multiple versions that I found on SO and other sites, but I keep getting various errors, most recently KeyError: 0 for these attempts:
for rowNum in range(len(list)):
    ws.cell(row=rowNum+1, column=5).value = list[rowNum]
for i in range(len(list)):
    ws['F' + r].value = list[i+1]
Please help! Many thanks in advance.
Edit: I found the solution in "Automate the Boring Stuff with Python", chapter 12. I converted my dataframe to a dictionary, and then this worked:
for rowNum in range(2, ws.max_row + 1):  # + 1 so the last row is included
    item = ws.cell(row=rowNum, column=1).value
    if item in new_dict:
        ws.cell(row=rowNum, column=5).value = new_dict[item]
I just tried the script below and it worked fine for me.
from openpyxl import Workbook
wb = Workbook()
# grab the active worksheet
ws = wb.active
# Data can be assigned directly to cells
ws['A1'] = 42
# Rows can also be appended
ws.append([1, 2, 3])
# Python types will automatically be converted
import datetime
ws['A2'] = datetime.datetime.now()
# Save the file
wb.save("C:\\Users\\your_path_here\\Desktop\\sample.xlsx")
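
For the original goal of writing a list down a column starting at F2, the row and column arithmetic can be sketched as follows. `column_targets`, its parameters, and the sample values are hypothetical. Column F is column number 6 (the question's attempt used 5, which is column E), and naming a variable `list` shadows the built-in type, so a different name is used here.

```python
def column_targets(values, start_row=2, column=6):
    """Yield (row, column, value) for writing values down one column.

    column=6 is column F, so the defaults start at F2.
    """
    for offset, value in enumerate(values):
        yield start_row + offset, column, value


# Made-up values for illustration only
new_values = [10.5, 11.0, 9.75]
for row, column, value in column_targets(new_values):
    print(row, column, value)
```

With a worksheet at hand, each triple could be written with `ws.cell(row=row, column=column, value=value)`.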

How to execute the second Iteration of Data in excel using Openpyxl with Python 3.4

I am trying to read data from my Excel spreadsheet. So far I have been able to do it using the code below, but I can't run iterations.
from openpyxl import load_workbook
import numpy as np

# Raw string so the backslashes in the Windows path are kept literally
wb = load_workbook(r'c:\ExcelData\pyExcel.xlsx')
ws = wb['Sheet1']
table = np.array([[cell.value for cell in col] for col in ws['A2':'A3']])
print(table)
Another example:
import os

val1 = 2
val2 = 1
wb = load_workbook(os.path.abspath(os.path.join(os.path.dirname(__file__), r'c:\ExcelData\pyExcel.xlsx')))
sheet = wb['Sheet1']
c = sheet.cell(row=val1, column=val2).value
d = sheet.cell(row=val2, column=val2).value
print(c)
print(d)
So far, this reads a hardcoded row and cell from an Excel file and prints the value or assigns it to a variable, but I am looking for a way to run iterations over the data. I want to use the sheet as a data table: the first row of all the columns is used the first time, and then at the end the script starts over again using the next row.
Thanks.
Pinky, you should use variables in table = np.array([[cell.value for cell in col] for col in ws['A2':'A3']]),
for example ws[variable:variable] or ws['A' + str(number_variable):'A' + str(number_variable)].
@Pinky Read this page, http://openpyxl.readthedocs.org/en/latest/tutorial.html, and try to find your answer. If you still don't understand after that, I'll try to help you with some code. I feel this is the best way to actually learn what you are doing, rather than just receiving the code directly.
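
The row-by-row pattern described in the question can be sketched as a generator that yields one record per data row. This is a minimal illustration, assuming the first row of the sheet holds column headers. `iter_records` and the sample rows are hypothetical.

```python
def iter_records(rows):
    """Yield one dict per data row, keyed by the values of the header row."""
    rows = iter(rows)
    try:
        header = next(rows)
    except StopIteration:
        return  # empty sheet: nothing to yield
    for row in rows:
        yield dict(zip(header, row))


# Made-up rows standing in for the worksheet contents
sheet_rows = [
    ("name", "value"),
    ("first", 1),
    ("second", 2),
]
for iteration, record in enumerate(iter_records(sheet_rows), start=1):
    print(iteration, record["name"], record["value"])
```

With openpyxl, the rows could come from `ws.iter_rows(values_only=True)`, so each pass of the loop runs the script's logic with the next row of the sheet.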
