Auto calculate average over varying number values row by row - excel

I have an Excel file with several columns and many rows. One column, say A, has ID numbers. Another column, say G, has prices. The ID numbers in column A repeat, but not all the same number of times: sometimes just once, other times 2, 3 or several times. Each row has its own price in column G.
Basically, I need to average those prices for a given ID in column A. If each ID were repeated the same number of times, this would be quite simple, but because they are not, I have to do my average calculation manually for each grouping. Since my spreadsheet has many rows, this is taking forever.
Here is an example (column H is the average that I am currently calculating manually):
      A      ...   G      H
1     1234         3.00   3.50
2     1234         4.00
3     3456         2.25   3.98
4     3456         4.54
5     3456         5.15
11    8890         0.70   0.95
13    8890         1.20
...
So in the above example, the average price for ID# 1234 would be 3.50. Likewise, the average price for ID# 3456 would be 3.98 and for #8890 would be 0.95.
NOTICE how rows are missing between row 5 and 11, and row 12 is missing too? That is because they are filtered out for some other reason. I need to exclude those hidden rows from my calculations and only calculate the average for the rows visible.
I'm trying to write a VBA script that will automatically calculate this, then print the average value for each ID in column H.
Here is some code I have considered:
Sub calcAvg()
    Dim rng As Range
    Set rng = Range("sheet1!A1:A200003")
    For Each Val In rng
        Count = 0
        V = Val.Value '''V is set equal to the value within the range
        If Val.Value = V Then
            Sum = Sum + G.Value
            V = rng.Offset(1, 0) '''go to next row
            Count = Count + 1
        Else
            '''V = Val.Value '''set value in this cell equal to the value in the next cell down.
            avg = Sum / Count
            H = avg '''Column G gets the avg value.
        End If
    Next Val
End Sub
I know there are some problems with the above code. I'm not too familiar with VBA. Also, this would print the average on the same line every time. I'm not sure how to iterate over every row.
This seems overly complicated. It's a simple problem in theory, but the missing rows and the differing number of ID repetitions make it more complex.
If this can be done with an Excel function, that would be even better.
Any thoughts or suggestions would be greatly appreciated. Thanks.

If you can add another row to the top of your data (put column headers in it), it's quite simple with a formula.
The formula for C2 (with the IDs in column A and the prices in column B) is
=IF(A2<>A1,AVERAGEIFS(B:B,A:A,A2),"")
Copy this down for all data rows.
This applies to Excel 2007 or later. AVERAGEIF was also introduced in Excel 2007, so in Excel 2003 or earlier divide SUMIF by COUNTIF instead, adjusting ranges accordingly.
If you can't add a header row, change the first formula (cell C1) to
=AVERAGEIFS(B:B,A:A,A1)

In my way ..
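Sub calcAvg()
    Dim x As Long, y As Long, y2 As Long, i As Long, t As Long
    Dim Mount As Long      ' number of prices in the current group
    Dim Count As Double    ' running sum of prices in the current group
    Dim Seek0 As String    ' ID of the current group
    x = 1 '--> means Col A
    y = 1 '--> means start - Row 1
    y2 = 7 '--> means end - Row 7 (adjust to your last data row)
    For i = y To y2
        If i = y Then
            Seek0 = Cells(i, x)
            t = i
            Count = Cells(i, x + 6)
            Mount = 1
        Else
            If Cells(i, x) <> Seek0 Then
                Cells(t, x + 7) = Count / Mount ' write the finished group's average to Col H
                Count = Cells(i, x + 6)
                Mount = 1
                t = i
                Seek0 = Cells(i, x)
            Else
                Count = Count + Cells(i, x + 6)
                Mount = Mount + 1
            End If
        End If
    Next
    Cells(t, x + 7) = Count / Mount ' write the last group's average
End Sub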
Hope this helps ..
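Note that neither the AVERAGEIFS formula nor the macro above skips rows hidden by a filter, which the question asks for. Here is a minimal sketch of one way to do that. It assumes the IDs are in column A and the prices in column G of the active sheet, that there is no header row, and that equal IDs sit next to each other. The name CalcVisibleAverages is made up for illustration.

```vba
Sub CalcVisibleAverages()
    Dim ws As Worksheet
    Dim lastRow As Long, i As Long, firstRow As Long, n As Long
    Dim total As Double
    Dim currentId As String

    Set ws = ActiveSheet                      ' assumption: the data is on the active sheet
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    For i = 1 To lastRow
        If Not ws.Rows(i).Hidden Then         ' ignore filtered-out rows
            ' A different ID on a visible row closes the previous group
            If n > 0 And CStr(ws.Cells(i, "A").Value) <> currentId Then
                ws.Cells(firstRow, "H").Value = total / n
                n = 0
            End If
            ' Open a new group on its first visible row
            If n = 0 Then
                currentId = CStr(ws.Cells(i, "A").Value)
                firstRow = i
                total = 0
            End If
            total = total + ws.Cells(i, "G").Value
            n = n + 1
        End If
    Next i

    ' Close the last group
    If n > 0 Then ws.Cells(firstRow, "H").Value = total / n
End Sub
```

With the sample data from the question this should put 3.50 in H1, 3.98 in H3 and 0.95 in H11. If the sheet does have a header row, start the loop at row 2.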

Related

Find value in column where running total is equal to a certain percentage

In Excel I have a list of values (in random order), and I want to figure out which values make up 75% of the total value; i.e. when adding the values together from largest to smallest, which ones should I include in order to get to 75% of the total? I would like to find the "cut-off value", i.e. the smallest number to include in the group of values (that combined sum up to 75%). However, I want to do this without first sorting my data.
Consider the example below. Here we can see that the cutoff is at "Company 6", which corresponds to a "cut-off value" of 750.
The data I have is not sorted, hence I just want to figure out what the "cut-off value" should be, because then I know that if the amount in a row is at or above that number, it is part of the group of values that constitute 75% of the total.
The answer can be either Excel or VBA, but I want to avoid having to sort my table first, and I want to avoid having a calculation in each row (so ideally a single formula that can calculate it).
Row number     Amount   Percentage   Running Total
Company 1       1,000        12.9%           12.9%
Company 2         950        12.3%           25.2%
Company 3         900        11.6%           36.8%
Company 4         850        11.0%           47.7%
Company 5         800        10.3%           58.1%
Company 6         750         9.7%           67.7%
Company 7         700         9.0%           76.8%
Company 8         650         8.4%           85.2%
Company 9         600         7.7%           92.9%
Company 10        550         7.1%          100.0%
Total           7,750
75% of total    5,813
EDIT:
My initial thought was to use a percentile/quartile function, however that is not giving me the expected results.
I have been trying to use a combination of PERCENTRANK, SORT, SUM and AGGREGATE, but cannot figure out how to combine them to get the result I need.
In the example I want to include Companies 1 through 6, as they sum to 5250, hence the smallest number to include is 750. If I add Company 7 I get above 5813 (which is where 75% is).
VBA bubble sort - no changes to sheet.
Option Explicit
Sub calc75()
    Const PCENT = 0.75
    Dim rng As Range, ar As Variant, ix() As Long
    Dim x As Long, z As Long, cutoff As Double
    Dim n As Long, i As Long, a As Long, b As Long
    Dim t As Double, msg As String, bFlag As Boolean
    ' company and amount
    Set rng = Sheet1.Range("A2:B11")
    ar = rng.Value2
    n = UBound(ar)
    ' calc cutoff (75% of the total) and fill the index array
    ReDim ix(1 To n)
    For i = 1 To n
        ix(i) = i
        cutoff = cutoff + ar(i, 2) * PCENT
    Next
    ' sort the index array by amount, descending (the sheet is not changed)
    For a = 1 To n - 1
        For b = a + 1 To n
            ' compare col B
            If ar(ix(b), 2) > ar(ix(a), 2) Then
                z = ix(a)
                ix(a) = ix(b)
                ix(b) = z
            End If
        Next
    Next
    ' result: x is the sorted position of the last company within the cutoff
    x = 1
    For i = 1 To n
        t = t + ar(ix(i), 2)
        If t > cutoff And Not bFlag Then
            msg = msg & vbLf & String(30, "-")
            bFlag = True
            If i > 1 Then x = i - 1
        End If
        msg = msg & vbLf & i & ") " & ar(ix(i), 1) _
            & Format(ar(ix(i), 2), " 0") _
            & Format(t, " 0")
    Next
    ' look the company up through the sorted index, not the sheet order
    MsgBox msg, vbInformation, ar(ix(x), 1) & " Cutoff=" & cutoff
End Sub
So, set this up simply as I suggested.
You can add or change the constraints as you wish to get the results you need - I chose Binary to start but you could limit to integer and to 1, 2 or 3 for example.
I included the roundup() I used as well as the sumproduct.
I used Binary as that gives a clear indication of the ones chosen, integer values will also do the same of course.
Smallest Value of a Running Total...
=LET(Data,B2:B11,Ratio,0.75,
Sorted,SORT(Data,,-1),MaxSum,SUM(Sorted)*Ratio,
Scanned,SCAN(0,Sorted,LAMBDA(a,b,IF((a+b)<=MaxSum,a+b,0))),
srIndex,XMATCH(0,Scanned)-1,
Result,INDEX(Sorted,srIndex),Result)
G2: =SORT(B2:B11,,-1)
H2: =SUM(B2:B11)*0.75
I2: =SCAN(0,G2#,LAMBDA(a,b,IF((a+b)<$H$2,a+b,0)))
J2: =XMATCH(0,I2#)
K2: =INDEX(G2#,XMATCH(0,I2#)-1)
The issue that presents itself is that there could be duplicates in the Amount column, in which case it wouldn't be possible to determine which of them is the correct result.
If the company names are unique, an accurate way would be to return the company name.
=LET(rData,A2:A11,lData,B2:B11,Ratio,0.75,
Data,HSTACK(rData,lData),Sorted,SORT(Data,2,-1),
lSorted,TAKE(Sorted,,-1),MaxSum,SUM(lSorted)*Ratio,
Scanned,SCAN(0,lSorted,LAMBDA(a,b,IF((a+b)<=MaxSum,a+b,0))),
rSorted,TAKE(Sorted,,1),rIndex,XMATCH(0,Scanned)-1,
Result,INDEX(rSorted,rIndex),Result)
Note that you can define a name, e.g. GetCutOffCompany, with the following LAMBDA version of the formula:
=LAMBDA(rData,lData,Ratio,LET(
Data,HSTACK(rData,lData),Sorted,SORT(Data,2,-1),
lSorted,TAKE(Sorted,,-1),MaxSum,SUM(lSorted)*Ratio,
Scanned,SCAN(0,lSorted,LAMBDA(a,b,IF((a+b)<=MaxSum,a+b,0))),
rSorted,TAKE(Sorted,,1),rIndex,XMATCH(0,Scanned)-1,
Result,INDEX(rSorted,rIndex),Result))
Then you can use the name like any other Excel function anywhere in the workbook e.g.:
=GetCutOffCompany(A2:A11,B2:B11,0.75)
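LET, SCAN, LAMBDA, HSTACK and TAKE only exist in newer Excel versions such as Microsoft 365. For older versions, the same cut-off logic can be sketched as a user-defined function that still needs no sorting on the sheet and no per-row helper formulas. The name CutoffValue and its parameters are made up for illustration; it relies on the worksheet function LARGE to walk the amounts from largest to smallest.

```vba
Function CutoffValue(amounts As Range, ratio As Double) As Variant
    Dim k As Long, n As Long
    Dim limit As Double, running As Double, v As Double

    n = Application.WorksheetFunction.Count(amounts)            ' numeric cells only
    limit = Application.WorksheetFunction.Sum(amounts) * ratio  ' e.g. 75% of the total

    ' Returned as #N/A if even the largest value is above the limit
    CutoffValue = CVErr(xlErrNA)

    For k = 1 To n
        v = Application.WorksheetFunction.Large(amounts, k)     ' k-th largest amount
        If running + v > limit Then Exit For                    ' adding it would pass the limit
        running = running + v
        CutoffValue = v                                         ' smallest amount included so far
    Next k
End Function
```

Used in a cell like =CutoffValue(B2:B11,0.75), this should return 750 for the example data, the same as the LET formula. Like that formula, it cannot tell duplicates in the Amount column apart.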

Generate random numbers between 0 and 1 in 10 cells in a row, where the sum of the random numbers always equals 7

How can I generate random numbers 0 or 1 in 10 cells in the row, in which the sum of the random number is always equal to 7?
Here's a way to get seven "1"s and three "0"s in random order using RAND and RANK
In A1:J1: =RAND()
In A2:J2: =IF(RANK(A1,$A$1:$J$1,1)>3,1,0)
Available here is a version that I really think works! https://www.dropbox.com/s/ec431fu0h0fhb5i/RandomNumbers.xlsx?dl=0
And here's the '0 and 1' version (sheet 2 at the above link):
De-dup        Rankings   Randoms       First Cut   Sorted
0.47999002    7          0.479992063   1           1
0.68823003    3          0.688233075   1           1
0.07594004    9          0.075938331   1           1
0.02077005    10         0.020766892   1           0
0.69217006    2          0.692170173   1           0
0.73355007    1          0.733549516   1           1
0.51546008    6          0.515462872   1           1
0.62308009    4          0.623078278               0
0.33033001    8          0.330331577               1
0.561260011   5          0.561260557               1
Formulae for columns A-C exactly as before, D is just 7 1's, E is:
=VLOOKUP(ROW(E2)-1,B$1:D$11,3,FALSE)
Assuming that you want a list of positive random numbers that add to 7, you can use the following method.
1. Enter a 0 in the top-left cell (Blue Cell).
2. Enter =RAND()*7 into the next 9 cells below the 0 (Orange Cells).
3. Enter a 7 in the cell below the 9 random values (Blue Cell).
4. Copy the 9 random values and paste-special-values over the top to turn the formulas into values.
5. Sort just these 9 cells in ascending order.
6. In the cell just to the right of the first random value, put a formula that subtracts the cell to the left and one above from the cell to the left (Yellow Cells).
7. Repeat this formula down to the cell next to the 7 that was typed in.
8. Sum the values in the second column (Green Cell).
That should give you 10 random values whose sum is exactly 7.
The only issue is that getting the values to be between 0 and 1 will take a bit of trial and error.
It appears that trial and error may not be practical. Only about one time in 2,710 will this list contain only numbers between 0 and 1. Not overly practical. Sorry.
To answer the question in the post, enter this in A1:J1 as an array formula (ctrl+shift+enter):
=1-(TRANSPOSE(MOD(SMALL(RANDBETWEEN(0,1e12*(ROW(INDIRECT("1:10"))>0))+(ROW(INDIRECT("1:10"))-1)/10,ROW(INDIRECT("1:10"))),1))>0.65)
To answer the question in the post title, do the following:
In A1:J1 enter:
=RAND()
In K1 enter:
=IF(SUM(A1:J1)<7,(7-SUM(A1:J1))/(COUNT(A1:J1)-7),7/SUM(A1:J1))
In L1 enter:
=IF(SUM($A1:$J1)<7,(A1+$K1)/($K1+1),A1*$K1)
Fill over to U1.
I believe the 10 numbers generated will be identically distributed in [0,1), but obviously not uniformly (I'm fairly certain the distribution does not have a name). The numbers can't be considered independent. A few statistics on the distribution:
Mean: 0.7 (as expected)
The other statistics are estimated from 10,000 simulations:
Variance: 0.0295
Kurtosis: -0.648
Skewness: -0.192
Think of it as drawing a sample of size 7 from the set {1, 2, ..., 10}. The 1s correspond to the numbers chosen for inclusion in the sample. Here is some VBA code which generates such samples:
Function sample(n As Long, k As Long) As Variant
    'returns a variant of length n
    'consisting of k 1s and n-k 0s
    'thought of as a sample of {1,...,n} of size k
    Dim v As Variant 'vector to hold sample
    Dim numbers As Variant
    Dim i As Long, j As Long, temp As Long
    ReDim v(1 To n)
    ReDim numbers(1 To n)
    For i = 1 To n
        v(i) = 0
        numbers(i) = i
    Next i
    'do k steps of a Fisher-Yates shuffle on numbers
    For i = 1 To Application.WorksheetFunction.Min(k, n - 1)
        j = Application.WorksheetFunction.RandBetween(i, n)
        If i < j Then 'swap
            temp = numbers(i)
            numbers(i) = numbers(j)
            numbers(j) = temp
        End If
    Next i
    'now use the first k elements of the partially shuffled array:
    For i = 1 To k
        v(numbers(i)) = 1
    Next i
    sample = v
End Function
Used like: Range("A1:J1").Value = sample(10,7)
Using a bit of brute force, I think I've got a workable solution to the original version of the question which asked for random numbers between 0 and 1.
Cells A1 to A9:
=rand()
Cell A10:
=7-sum(A1:A9)
Now you have 10 numbers that add up to 7, but the last one is probably not in the range 0 to 1. To deal with that, just recalculate the sheet to generate new random numbers until that last value is within range. A single recalculation lands in range only about 4% of the time (the sum of the 9 random values has to fall between 6 and 7), so it takes roughly 73 recalculations to have a ~95% chance that one of them will be within range, so it could take a while. A little VBA can do that for you very quickly:
Sub rand7()
    While Range("A10").Value > 1 Or Range("A10").Value < 0
        ActiveSheet.Calculate
    Wend
End Sub

Excel for loop to get value from another row

I have a spreadsheet that contains various data. It looks like this:
     A    A    A    B    B    C    C    C    C
a    1    2    3    2    1    4    2    3    2
b    0    2    3    3    0    1    2    3    0
c    6    6    3    0    2    1    0    4    0
etc.
What I want is to add all the Aa's and come up with an Aa total, all the Bb's and come up with a Bb total, all the Ab's, etc.
What I want to do is, for every column, check if it is A, B or C. I want to do that because the data may change, and I might end up with four columns for A, two for B, etc. I know, however, that a, b and c will stay where they are.
I also don't know the order of A, B and C. There could be two A's followed by two C's and then one B.
My final result will be a table containing all the totals:
Aa Ab Ac
Ba Bb Bc
Ca Cb Cc
Where in the previous example would mean that Aa = 1 + 2 + 3 = 6, Ab = 5, etc.
Something like that.
I think the way to go is for 1-1 (the total of Aa's) is to go through every column in the first row. Check if it is an A. If it is, then get the value of the same column but second row. Add it to the total. When gone through all the columns, show up the total in 1-1.
What I have so far (for A):
Sub getA()
    Dim x As Integer
    Dim total As Integer
    'cols = Find number of columns with data in them
    For x = 1 To cols
        'cell = cell in Ax
        If InStr(1, cellvalue, "a") = 1 Then
            'val = value from row 5 in same column
            total = total + Val
        End If
    Next
End Sub
But I don't really know how to proceed with the commented lines.
Finally, another thing I would like to know is how will these values be presented in their respective cells without any extra event being carried (button for example). They should just appear in their cells from the moment someone opens the spreadsheet.
Any help is greatly appreciated.
Thanks.
Just an FYI, this can be done using the SUMPRODUCT formula:
=SUMPRODUCT(($B$1:$J$1=D$9)*($A$2:$A$4=$C10)*$B$2:$J$4)
EDIT
To compare the first letter then use this formula:
=SUMPRODUCT((LEFT($B$1:$J$1,1)=D$9)*($A$2:$A$4=$C10)*$B$2:$J$4)
Are you looking for something like:
Function countletter(strLetter As String) As Long
    ' counts the cells in the region around A1 whose value equals strLetter
    Dim x As Long, y As Long, xMax As Long, yMax As Long
    xMax = Range("A1").CurrentRegion.Columns.Count
    yMax = Range("A1").CurrentRegion.Rows.Count
    For x = 1 To xMax
        For y = 1 To yMax
            If Cells(y, x).Value = strLetter Then
                countletter = countletter + 1
            End If
        Next
    Next
End Function
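If you prefer VBA over SUMPRODUCT, here is a minimal sketch of a user-defined function that returns one cell of the totals table. The name CrossTotal and its parameters are made up for illustration. It assumes the range passed in has the column letters in its first row and the row letters in its first column, as in the question's layout. Because the data range is an argument, Excel recalculates the result whenever the data changes, so no button or other event is needed.

```vba
Function CrossTotal(data As Range, colKey As String, rowKey As String) As Double
    Dim r As Long, c As Long

    ' Column 1 holds the row letters, so the data columns start at 2
    For c = 2 To data.Columns.Count
        ' Compare only the first character of the column header
        If Left$(CStr(data.Cells(1, c).Value), 1) = colKey Then
            ' Row 1 holds the column letters, so the data rows start at 2
            For r = 2 To data.Rows.Count
                If CStr(data.Cells(r, 1).Value) = rowKey Then
                    CrossTotal = CrossTotal + data.Cells(r, c).Value
                End If
            Next r
        End If
    Next c
End Function
```

With the sample data in A1:J4, =CrossTotal($A$1:$J$4,"A","a") should give 6 and =CrossTotal($A$1:$J$4,"A","b") should give 5. The comparison is case-sensitive unless the module sets Option Compare Text, which is what keeps "A" and "a" apart here.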

Excel - Entries in List of values be based on some other column value

We have a scenario in Excel (2010) where the list of values present in a dropdown changes dynamically based on some column of that row. For example, consider the "Supervisor" dropdown in sheet1 below:
Emp   Grade   Supervisor
A     14
B     12
C     13
D     12
E     12
F     13
G     14
Now let's say there is a dropdown for the supervisor. For every employee, the supervisor can only be a person of the same grade or a higher grade. So, for example, a grade 13 employee can have a supervisor with grade 13 or grade 14 only, not grade 12.
How can I write a custom condition like this inside the list of values? I have tried things like named ranges, OFFSET etc., but none of them allows specifying custom conditions. Any help?
I found the following document to be helpful in creating dependent Data Validation dropdowns: DV0064 - Dependent Lists Clear Cells, which can be downloaded here (for free):
http://www.contextures.com/excelfiles.html#DataVal
You can tailor the example to your needs.
=OFFSET('validation pivot'!$A$1,0,1,COUNTIFS('validation pivot'!$A:$A,">="&B2),1)
The supervisor needs to be at least his pay grade (>=B2). In order to have it work you need to have the pivot inserted in validation pivot A1. How to create the pivot (hasty notes):
add grade and emp 'emp as subset
tabular view 'to have separate columns
repeat labels ' to be able to count them
remove autosums(both within and total) 'to not deal with evading it
hide column labels and filters 'same
descending order(grade) 'to get a simple match method
data: store none 'to refresh the descending order every time
See uploaded sample file.
This code (column A = EMP, B = Grade, C = Supervisor)
Sub test()
    Dim actualgrade As Integer
    Dim lastRowA As Long
    Dim i As Long, j As Long
    Dim numbers As String
    lastRowA = Sheets("sheet1").Cells(Sheets("sheet1").Rows.Count, "A").End(xlUp).Row
    For i = 2 To lastRowA 'row 1 = headers
        actualgrade = Cells(i, 2)
        ' collect every employee whose grade is at least this row's grade
        For j = 2 To lastRowA
            If Cells(j, 2) >= actualgrade Then
                numbers = numbers & " " & Cells(j, 1).Value
            End If
        Next j
        Cells(i, 3).Value = numbers
        numbers = ""
    Next i
End Sub
Makes this result:
Emp   Grade   Supr
A     14      A G
B     12      A B C D R F G
C     13      A C F G
D     12      A B C D R F G
R     12      A B C D R F G
F     13      A C F G
G     14      A G
Feel free to change it like you need it
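The macro above writes the allowed names into the cell as text. To turn the same list into an actual dropdown, one option is to give each Supervisor cell its own list validation. This is only a sketch under the same layout assumption (column A = Emp, B = Grade, C = Supervisor, headers in row 1, sheet named "Sheet1"). The name BuildSupervisorDropdowns is made up for illustration.

```vba
Sub BuildSupervisorDropdowns()
    Dim ws As Worksheet
    Dim lastRow As Long, i As Long, j As Long
    Dim names As String

    Set ws = ThisWorkbook.Worksheets("Sheet1")    ' assumption: sheet name
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    For i = 2 To lastRow                          ' row 1 = headers
        names = ""
        ' Build a comma-separated list of everyone at this grade or higher
        For j = 2 To lastRow
            If ws.Cells(j, 2).Value >= ws.Cells(i, 2).Value Then
                names = names & IIf(names = "", "", ",") & ws.Cells(j, 1).Value
            End If
        Next j
        ' Replace any existing validation on the Supervisor cell with the new list
        With ws.Cells(i, 3).Validation
            .Delete
            .Add Type:=xlValidateList, AlertStyle:=xlValidAlertStop, Formula1:=names
        End With
    Next i
End Sub
```

The lists are built once, so the macro has to be run again after grades change. A list typed directly into Formula1 is limited in length, so for a long employee list it is safer to write the names to a helper range and point Formula1 at that range instead.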

Updating column values on specific Row using VBA

How do I update the value from xt to xtt in the 6th column, first row?
1    2    3    4    5    6
x    xx   xy   xz   x1   xt
y    yx   tt   cc   z3   xcc
Based on the above data, I am getting a range from the worksheet. After getting the Row object, how do I update the cell value in a particular column?
As asked, you can update a specific column using the method:
'Sheet.Cells(row, column) = value
' i.e.
ActiveSheet.Cells(1, 6) = "xtt"
If you only want to perform the update if it has a value of "xt", then obviously you'd need to check the contents before performing the update... For example:
If (ActiveSheet.Cells(1, 6) = "xt") Then
    ActiveSheet.Cells(1, 6) = "xtt"
End If
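If you already hold the row as a Range object, as the question describes, the same Cells property can be used relative to that row. This is a minimal sketch in which the address A2:F3 and the name rowRange are assumptions for illustration.

```vba
Sub UpdateCellInRow()
    Dim rowRange As Range

    ' Assumption: the data sits in A2:F3 of the active sheet, below a header row
    Set rowRange = ActiveSheet.Range("A2:F3").Rows(1)

    ' Cells(1, 6) is relative to rowRange: its first row, sixth column
    If rowRange.Cells(1, 6).Value = "xt" Then
        rowRange.Cells(1, 6).Value = "xtt"
    End If
End Sub
```

Indexing through the row object keeps the code independent of where the range starts on the sheet.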

Resources