Average Excel Columns - excel

I am trying to average the values in a certain range of a column. I tried getting the range as a tuple and then looping over it to get each cell value. I then created a variable for the average, but I get the error "TypeError: 'float' object is not iterable".
range1 = ws["A2":"A6"]
for cell in range1:
    for x in cell:
        average = sum(x.value)/len(x.value)
        print(average)

Python and the openpyxl API make this kind of thing very easy.
rows = ws.iter_rows(min_row=2, max_row=6, max_col=1, values_only=True)
values = [row[0] for row in rows]
avg = sum(values) / len(values)
But you should probably check that the cells contain numbers, otherwise you'll see an exception.
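The check mentioned above could look something like the following minimal sketch. It assumes `values` is the list built from `iter_rows(..., values_only=True)`, and the helper name `mean_of_numbers` is made up for illustration. It also hints at why the original attempt failed: `sum()` needs an iterable of numbers, whereas `x.value` is a single float.

```python
from numbers import Number


def mean_of_numbers(values):
    """Average only the numeric entries; skip None, text, and booleans."""
    numbers = [v for v in values
               if isinstance(v, Number) and not isinstance(v, bool)]
    if not numbers:
        raise ValueError("no numeric cells in the range")
    return sum(numbers) / len(numbers)


# Stand-in for the values read from A2:A6 (empty cells come back as None).
values = [1.5, 2.5, None, "n/a", 5.0]
print(mean_of_numbers(values))  # 3.0
```

Raising on an all-empty range is one possible choice; returning None instead would work equally well if the caller prefers that.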

Something like this will get you the mean of the cells.
import openpyxl as op

def main():
    wb = op.load_workbook(filename='C:\\Users\\####\\Desktop\\SO nonsense\\Book1.xlsm')
    range1 = wb['Sheet1']['A2:A6']
    cellsum = 0
    for i, cell in enumerate(range1, 1):
        print(i)
        cellsum += cell[0].value
    print(cellsum / i)

main()

Related

Use previous values of rows in data frame to build the current cell value

I have a dataframe as represented here
A
0.001216
0.000453
0.00506
0.004556
0.005266
I want to create a new column B according to the formula in the code below.
column_key = 'B'
factor = 'A'
df[column_key] = np.nan
df[column_key][0] = (df[factor][0] + 1) * 100
for i in range(1, len(df)):
    df[column_key][i] = (df[factor][i] + 1) * df[column_key][i-1]
I have been trying to fill the current cell using the previous cell of the same column and the adjacent cell of another column.
This is what I have tried, but I don't think it is efficient.
Can anyone suggest a more efficient approach to this problem?
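Using pandas' Series.cumprod(), it can be done in the following way:
df['B'] = df['A'] + 1
# Scale the first factor by 100; .loc avoids chained assignment (assumes the default integer index)
df.loc[0, 'B'] *= 100
df['B'] = df['B'].cumprod()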
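To see why a cumulative product reproduces the loop, note that the recurrence B[i] = (A[i] + 1) * B[i-1] with B[0] = (A[0] + 1) * 100 unrolls to 100 times the running product of (A + 1). The sketch below checks this on the sample data using only the standard library (`itertools.accumulate` stands in for `cumprod` here, so pandas is not required to run it).

```python
from itertools import accumulate
from math import isclose
from operator import mul

a = [0.001216, 0.000453, 0.00506, 0.004556, 0.005266]

# Loop version: each B depends on the previous B, starting from 100.
b_loop = []
prev = 100.0
for x in a:
    prev = (x + 1) * prev
    b_loop.append(prev)

# Cumulative-product version: scale the first factor, then take running products.
factors = [x + 1 for x in a]
factors[0] *= 100
b_cumprod = list(accumulate(factors, mul))

# Both approaches agree element by element.
print(all(isclose(p, q) for p, q in zip(b_loop, b_cumprod)))  # True
```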

Alternate values in a range of cells using Openpyxl

I am trying to alternate a value in an Excel range using openpyxl with a loop: for example, starting with an "x" value in ['A1'] or (1,1), then moving to (2,2) on the next iteration, and so on until reaching column 8 (H) and row 10.
If you need to loop like (1,1), (2,2) ... (8,8) [looping up to column H], here is a solution. If you need to cover more cells, change the upper bound of range to the number you require.
Using openpyxl's ws.cell():
import openpyxl
import os

def func():
    wb = openpyxl.load_workbook(os.path.join(os.getcwd(), 'sample.xlsx'))
    ws = wb['Sheet1']
    for var in range(1, 9):
        print(ws.cell(row=var, column=var).value)

func()
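Since the question asks about placing a value rather than reading one, here is a minimal sketch of generating the diagonal coordinates, stopping at whichever limit (row 10 or column 8) is reached first. The helper name `diagonal_coords` is hypothetical, and the actual write is shown only as a comment because it needs an open worksheet.

```python
def diagonal_coords(max_row, max_col):
    """Yield 1-based (row, column) pairs along the diagonal."""
    for n in range(1, min(max_row, max_col) + 1):
        yield n, n


coords = list(diagonal_coords(max_row=10, max_col=8))
print(coords)  # (1, 1) through (8, 8)

# With an open worksheet `ws`, each pair could then be written with:
#     for r, c in coords:
#         ws.cell(row=r, column=c, value="x")
# followed by wb.save(...) to persist the change.
```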

How to randomly select column that does not contain a specific value in Excel or SPSS

Would anyone know the Excel formula, VBA, or SPSS syntax to do the following:
Create a new variable/column in a dataset or spreadsheet which is populated by the column number (or column title) of a randomly selected column (from a range of 1-42 columns), provided the value in that column for a given row does not contain 99.
In Excel I can do the first step and create random numbers and match these to columns, but I don't know how (or if possible) to 're-roll' a new random number if the initial matched column contains the value 99.
My formula for generating a random number between 1 and 42 to identify a column:
AQ=RANDBETWEEN(1,3)
For a row in Excel using 9-row dummy data: =HLOOKUP(AQ,$A$1:$AP$9,2,FALSE)
Here's an example of how you can re-roll. I originally chose a single given row (10), but you can change this however you need.
EDIT - now looping through givenRow:
Sub test()
    Dim randCol As Integer
    Dim givenRow As Long
    Dim saveCol As Integer: saveCol = 44 ' where to store results
    Randomize ' seed Rnd so each run produces a different sequence
    With ThisWorkbook.Worksheets("your sheet name")
        For givenRow = 1 To 100
            Do While True
                ' get column between 1 and 42
                randCol = Int(42 * Rnd + 1)
                ' if not 99 exit
                If .Cells(givenRow, randCol).Value <> 99 Then Exit Do
            Loop
            ' store results in saveCol for givenRow
            .Cells(givenRow, saveCol).Value = randCol
        Next
    End With
End Sub
Here's how you could go about it in SPSS using Python:
begin program.
import spss, spssaux
import random
# get variable list
vars = spssaux.VariableDict().expand(spss.GetVariableName(0) + " to " + spss.GetVariableName(spss.GetVariableCount()-1))
proceed = True
breakcount = 0
while proceed:
    # generate random integer between 0 and variable count -1, get random variable's
    # name and index-position in dataset
    rng = random.randint(0, spss.GetVariableCount() - 1)
    ranvar = spss.GetVariableName(rng)
    ind = int(vars.index(ranvar))
    # read data from random variable, if value 99 is stored in the variable, go back to the top. if not, compute variable
    # random_column = column number (index +1 NOT index)
    randat = spss.Cursor([ind])
    d = randat.fetchall()
    randat.close()
    data = [str(x).strip('(),') for x in d]
    breakcount += 1
    if "99.0" not in data:
        spss.Submit("compute random_column = %s." %(ind + 1))
        proceed = False
    elif breakcount == 42:
        break
end program.
It iterates through random variables until it finds one without the value 99 in it, then computes the new variable containing the column number.
Edit: Added a break condition so that it doesn't loop infinitely in case every variable contains a 99.
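Both answers above re-roll until a valid column turns up. A different technique, sketched here in plain Python, is to filter first and then choose: collect the columns of a row whose value is not 99 and pick one of those at random, which needs no retry loop and handles the all-99 case directly. The function name `pick_column` and the sample row are illustrative assumptions, not part of either answer.

```python
import random


def pick_column(row_values, excluded=99, rng=random):
    """Return the 1-based number of a random column whose value is not
    `excluded`, or None if every column holds the excluded value."""
    eligible = [i for i, v in enumerate(row_values, start=1) if v != excluded]
    if not eligible:
        return None
    return rng.choice(eligible)


row = [99, 3, 99, 7, 1]
print(pick_column(row))        # one of 2, 4, 5
print(pick_column([99, 99]))   # None
```

Each eligible column is equally likely to be chosen, which matches the distribution the re-roll loop converges to.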

MatLab: Iterating dynamically through Cells with Excel COM Add-In

First of all, thanks a lot for the very good answers that I have found here in other topics in the past.
Now to a new challenge:
I am currently working with the COM Add-In in Matlab, i.e. I am reading an Excel Workbook and extracting the Color property:
excelapp = actxserver('Excel.Application'); %connect to excel
workbook = excelapp.Workbooks.Open('Solutions.xls');
worksheet = workbook.Sheets.Item(1);
ColorValue_Solutions=worksheet.Range('N2').Interior.Color;
Now, I want to do this for the cells in the range A1 to J222, for which I would like to dynamically loop through the Range property, letting the program read each cell individually and then extract the Color property. For example:
Columns = {'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'};
for j = 1:length(Columns)
    for i = 1:222
        worksheet.(char( strcat('Range(''',Columns(j), num2str(i), ''')') )).Interior.Color
    end
end
This, however, results in an error:
Undefined function or variable 'Range('A1')'.
I guess the problem is in the combination of interpreting a string with an included string, i.e. Range('A1').
Any help is much appreciated.
Some time ago I asked a similar question. Check it out, maybe you will find it helpful.
The following code should do what you want:
You can see how the Interior.Color property is extracted from every cell by first getting a Cell object, then accessing the Interior object of that Cell and finally getting the Color property of that Interior object. The Color property is a color integer defined by Microsoft (you can learn more here and here), I store that value in a matrix (M). In order to repeat this process through the specified range of cells, I use a nested loop, like you were doing already. Once the process is finished, I display the contents of M.
excelapp = actxserver('Excel.Application'); % Start Excel as ActiveX server.
workbook = excelapp.Workbooks.Open('Solutions.xls'); % Open Excel workbook.
worksheet = workbook.Sheets.Item(1); % Get the sheet object (sheet #1).
ncols = 10; % From column A to J.
nrows = 222; % From row 1 to 222.
M(nrows, ncols) = 0; % Preallocate matrix M.
for col = 1:ncols % Loop through every column.
    for row = 1:nrows % Loop through every row.
        cell = get(worksheet, 'Cells', row, col); % Get the cell object.
        interior = cell.Interior; % Get the interior object.
        colorint = get(interior, 'Color'); % Get the color integer property.
        M(row, col) = colorint; % Store color integer in matrix M.
    end
end
disp(M); % Display matrix M with the color integer information.
Remember to close the connection when you are finished. You can learn how to do it here and here.
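As a side note on the color integers stored in M: Excel packs them as red + 256 * green + 65536 * blue, so each one can be split back into its components. The sketch below is in Python rather than MATLAB, purely for illustration, and the function name `excel_color_to_rgb` is made up.

```python
def excel_color_to_rgb(color_int):
    """Split an Excel color integer into (red, green, blue) components."""
    color_int = int(color_int)       # COM may hand the value back as a double
    red = color_int & 0xFF           # lowest byte
    green = (color_int >> 8) & 0xFF  # middle byte
    blue = (color_int >> 16) & 0xFF  # highest byte
    return red, green, blue


print(excel_color_to_rgb(255))       # (255, 0, 0)
print(excel_color_to_rgb(65535))     # (255, 255, 0)
print(excel_color_to_rgb(16777215))  # (255, 255, 255)
```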

Comparing items in an Excel file with Openpyxl in Python

I am working with a big set of data, which has 9 columns (B3:J3 in row 3) and stretches down to B1325:J1325. Using Python and the openpyxl library, I need to get the biggest and second-biggest value of each row and write them to new fields in the same row. I have already assigned values to single fields manually (headings), but I cannot even get the max value in my range written to a new field automatically. My code looks like the following:
for row in ws.rows['B3':'J3']:
    sumup = 0.0
    for cell in row:
        if cell.value != None:
            .........
It throws the error:
for row in ws.rows['B3':'J3']:
TypeError: 'generator' object has no attribute '__getitem__'
How could I get to my goal here?
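You can use iter_rows to do what you want.
Try this (in current openpyxl versions the range is given as row and column bounds; raise max_row to 1325 to cover all of the data):
for row in ws.iter_rows(min_row=3, max_row=3, min_col=2, max_col=10):  # B3:J3
    sumup = 0.0
    for cell in row:
        if cell.value is not None:
            ........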
Check out this answer for more info:
How we can use iter_rows() in Python openpyxl package?
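For the actual goal, the biggest and second-biggest value of each row, one possible sketch uses `heapq.nlargest` from the standard library on the values of a single row. It assumes the row's values have already been read into a list (for example via `values_only=True`); the function name `two_largest` and the sample data are illustrative.

```python
import heapq


def two_largest(row_values):
    """Return (largest, second largest) among the numeric values of a row.
    Missing positions are filled with None; duplicates count separately."""
    numbers = [v for v in row_values
               if isinstance(v, (int, float)) and not isinstance(v, bool)]
    top = heapq.nlargest(2, numbers)
    top += [None] * (2 - len(top))  # pad when fewer than two numbers exist
    return top[0], top[1]


print(two_largest([4, None, 9.5, 2, 7]))  # (9.5, 7)
print(two_largest([None, 3]))             # (3, None)
```

The two results could then be written to spare columns of the same row, for instance with `ws.cell(row=..., column=..., value=...)`; which columns to use is left open here.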

Resources