I came across this question and its answers. I am facing a similar challenge: I tried the answer aimed at non-365 Office users but could not get it to work - Excel

I have this data set below. It has 3 columns (PTQ, % Growth, $ Growth) that all need to be ranked individually, then summed, and then ranked again for a total power rank of each region. Is there any way I can do this with a single formula? I do this a lot and it would be nice not to have to rank everything individually each time.
To clarify, I do not want to rank first on one column and then on another; they all need to be ranked equally together.
Data:
Region                    PTQ     % Growth   $ Growth
TR ARIZONA                103     17.5       201330
TR IDAHO UTAH             75.5    -6.3       -69976
TR LA HAWAII              99.4    19.2       194840
TR LA NORTH               125     32.7       241231
TR NORTHERN CALIFORNIA    102.3   26.2       308824
TR NORTHWEST              91.1    -0.6       -4801
TR SAN FRANSISCO          76.9    -16.7      -158387
TR SOUTHERN CALIFORNIA    106.9   30.8       495722
TR TUCSON                 100.3   7.6        34888

Assuming the same layout as P.b., in I4:
=1+SUMPRODUCT(N(MMULT(CHOOSE({1,2,3},RANK(C$4:C$12,C$4:C$12),RANK(D$4:D$12,D$4:D$12),RANK(E$4:E$12,E$4:E$12)),{1;1;1})<SUM(RANK(C4,C$4:C$12),RANK(D4,D$4:D$12),RANK(E4,E$4:E$12))))
and copied down.

This is quite challenging in older Excel, but possible nonetheless:
=IFERROR(
INDEX(
MMULT(--(RANK($C$4:$C$12,$C$4:$C$12)+RANK($D$4:$D$12,$D$4:$D$12)+RANK($E$4:$E$12,$E$4:$E$12)>=TRANSPOSE(RANK($C$4:$C$12,$C$4:$C$12)+RANK($D$4:$D$12,$D$4:$D$12)+RANK($E$4:$E$12,$E$4:$E$12))),ROW($C$4:$C$12)^0),ROW($A1))
-SUMPRODUCT(--(MMULT(--(RANK($C$4:$C$12,$C$4:$C$12)+RANK($D$4:$D$12,$D$4:$D$12)+RANK($E$4:$E$12,$E$4:$E$12)>=TRANSPOSE(RANK($C$4:$C$12,$C$4:$C$12)+RANK($D$4:$D$12,$D$4:$D$12)+RANK($E$4:$E$12,$E$4:$E$12))),ROW($C$4:$C$12)^0)
=INDEX(MMULT(--(RANK($C$4:$C$12,$C$4:$C$12)+RANK($D$4:$D$12,$D$4:$D$12)+RANK($E$4:$E$12,$E$4:$E$12)>=TRANSPOSE(RANK($C$4:$C$12,$C$4:$C$12)+RANK($D$4:$D$12,$D$4:$D$12)+RANK($E$4:$E$12,$E$4:$E$12))),ROW($C$4:$C$12)^0),
ROW($A1))))+1
,"")
(This must be entered as an array formula with Ctrl+Shift+Enter.)
Explanation:
First, an array is made of the sum of the 3 rankings:
RANK($C$4:$C$12,$C$4:$C$12)+RANK($D$4:$D$12,$D$4:$D$12)+RANK($E$4:$E$12,$E$4:$E$12)
This results in your so-called Rank Sum array.
Then, since RANK requires a range, not an array, we need an alternative way to rank the array: MMULT can do that.
MMULT(--(RankSum>=TRANSPOSE(RankSum)),ROW(RankSum)^0) creates an array of the ranked RankSum. However, if two items are tied, for instance for rank 1, it ranks both as 2, not 1. Therefore I used SUMPRODUCT to count the items in the MMULT array that equal the indexed MMULT result, as an alternative to COUNTIF, which also only accepts a range, not an array. So IndexedMMULTarray-SUMPRODUCT(--(MMULTarray=IndexedMMULTarray))+1 is your end result.
The calculation assumes your data is in B4:E12, with the formula above entered (with Ctrl+Shift+Enter) in a cell in row 4 and copied down; I4 in the shared screenshot.
Even though this formula answers your question, I doubt it is what you expected. Changing the ranges to a different range could be very tedious, and calculating the rankings separately, then summing and ranking them, is probably easier to maintain. You may make it more dynamic by using INDEX in the ranges.
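To check a formula result independently, the rank, sum, rank-again logic can be sketched outside Excel. This is only an illustration in Python, not part of either answer: the function names rank_desc and power_rank are made up here, and ties are assumed to share the lowest rank, as Excel's RANK does.

```python
def rank_desc(values):
    """Competition rank, largest value = 1 (like Excel's RANK with order 0)."""
    return [1 + sum(other > v for other in values) for v in values]


def power_rank(columns):
    """Rank each column, sum the ranks per row, then rank the sums (lowest sum = 1)."""
    per_column = [rank_desc(col) for col in columns]
    rank_sums = [sum(ranks) for ranks in zip(*per_column)]
    # A row's power rank is 1 plus the number of rows with a strictly smaller sum.
    return [1 + sum(other < s for other in rank_sums) for s in rank_sums]


# Columns from the question's data, in the same row order.
ptq = [103, 75.5, 99.4, 125, 102.3, 91.1, 76.9, 106.9, 100.3]
pct_growth = [17.5, -6.3, 19.2, 32.7, 26.2, -0.6, -16.7, 30.8, 7.6]
usd_growth = [201330, -69976, 194840, 241231, 308824, -4801, -158387, 495722, 34888]

print(power_rank([ptq, pct_growth, usd_growth]))
```

With this data, TR LA NORTH and TR SOUTHERN CALIFORNIA both have a rank sum of 5 and therefore share power rank 1.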

Related

Is there a way to distribute data according to a logic in Excel vba?

I have an Excel sheet with the below data.
There are 10,000 Data rows.
9000 are of "USA" & 1000 are of "Other" country.
I want to distribute the data evenly so that every 9 "USA" rows are followed by 1 "Other" row throughout.
Name     Country
Alice    USA
Brook    Other
Cathy    USA
David    USA
Esther   Other
Freddy   USA
Galin    USA
Henry    Other
Indigo   USA
Jenny    USA
Kalin    Other
Linda    USA
How do I accomplish this both manually and with Excel VBA? I would appreciate both solutions. Thanks
This can be achieved with a formula if you have the newest version of Excel.
Try something like (adapt ranges and what you are filtering on as necessary):
=LET(x, FILTER($B$1:$C$12, $C$1:$C$12="a"),
y, FILTER($B$1:$C$12, $C$1:$C$12="b"),
z, ROW(D1:D12), myrows, MAX(z),
ratio, MAX((COUNTA(x)/2)/(COUNTA(y)/2), (COUNTA(y)/2)/(COUNTA(x)/2))+1,
IF(MOD(z,ratio)<>0,
INDEX(x, IF(MOD(SEQUENCE(myrows),ratio)=0, 0, SEQUENCE(myrows)-CEILING(ROW(G1:G12)/ratio-1,1)), SEQUENCE(1,2)),
INDEX(y, IF(MOD(SEQUENCE(myrows),ratio)<>0,0,SEQUENCE(myrows)/ratio), SEQUENCE(1,2))))
The trick is to create the "correct" index sequence for each result. For the first array you want to skip every nth row (in your case, every 10th) and have row n+1 map to index n rather than n+1, while for the second array you want to skip every row that isn't a multiple of n and have the nth rows count sequentially.
A caveat: as is, I don't believe the formula will work with a repetition other than 1, i.e. if you want something like 8 rows followed by 2 rows, this won't work.
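The index arithmetic described above can be sketched in a few lines. This is a hypothetical Python illustration, not a translation of the LET formula: the function name interleave is invented here, and it assumes the larger group's size is an exact multiple of the smaller group's size.

```python
def interleave(majority, minority):
    """Place one minority item after every block of majority items."""
    # e.g. 9000 // 1000 + 1 = 10, so every 10th output row is a minority row
    ratio = len(majority) // len(minority) + 1
    result = []
    for r in range(1, len(majority) + len(minority) + 1):  # 1-based output row
        if r % ratio:
            # For rows that are not a multiple of ratio, r // ratio equals
            # CEILING(r/ratio - 1, 1): the number of minority rows already placed.
            result.append(majority[r - r // ratio - 1])
        else:
            # Every ratio-th row takes the next minority item in sequence.
            result.append(minority[r // ratio - 1])
    return result


print(interleave(["u1", "u2", "u3", "u4", "u5", "u6"], ["o1", "o2"]))
```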
This works even with older Excel versions.
With the countries in column B (headers in row 1), add a Sort column with the following formula in C2 and fill it down:
=IF(B2="USA",COUNTIF($B$2:B2,"USA")+INT((COUNTIF($B$2:B2,"USA")-1)/ROUNDUP(COUNTIF(B:B,"USA")/(COUNTA(B:B)-COUNTIF(B:B,"USA")),0)),COUNTIF($B$2:B2,"Other")*(ROUNDUP(COUNTIF(B:B,"USA")/(COUNTA(B:B)-COUNTIF(B:B,"USA")),0)+1))
Then sort by column C, and USA and Other are evenly spread.
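The sort-key idea can also be sketched outside Excel to see why sorting spreads the rows. The Python below is an illustration under assumptions: the names sort_keys and ratio are invented here, and the ratio is computed from the data rows only, whereas the formula's COUNTA(B:B) also counts the header cell. It assumes at least one row of each kind.

```python
import math


def sort_keys(countries, major="USA"):
    """Give each row a key so that sorting by key spreads the minority rows evenly."""
    n_major = sum(c == major for c in countries)
    n_minor = len(countries) - n_major
    ratio = math.ceil(n_major / n_minor)  # majority rows per minority row
    keys = []
    seen_major = seen_minor = 0
    for c in countries:
        if c == major:
            seen_major += 1
            # Leave one free slot after every `ratio` majority rows.
            keys.append(seen_major + (seen_major - 1) // ratio)
        else:
            seen_minor += 1
            # The k-th minority row fills the k-th free slot.
            keys.append(seen_minor * (ratio + 1))
    return keys


countries = ["Other", "USA", "USA", "Other", "USA", "USA", "USA", "USA"]
keys = sort_keys(countries)
print([c for _, c in sorted(zip(keys, countries))])
```

Every key is unique, so sorting gives one fixed order: blocks of majority rows, each followed by a single minority row.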

Filtering discrepancies in duplicate measurements

I have a dataset with the following problem.
Sometimes, a temperature sensor would return duplicate readings at the exact same minute, where sometimes 1 of 2 of the duplicates is "reasonable" and the other is slightly off.
For example:
TEMP TIME
1 24.5 4/1/18 2:00
2 24.7 4/1/18 2:00
3 24.6 4/1/18 2:05
4 28.3 4/1/18 2:05
5 24.3 4/1/18 2:10
6 24.5 4/1/18 2:10
7 26.5 4/1/18 2:15
8 24.4 4/1/18 2:15
9 24.7 4/1/18 2:20
10 22.0 4/1/18 2:20
Lines 4, 7 & 10 are readings that should be removed as they are too high or too low (it doesn't make sense for the temperature to rise or drop by more than a degree within 5 minutes in a relatively stable environment).
The end goal with this dataset is to average the similar values (such as lines 1 & 2) and remove the lines that are too extreme (such as lines 4 & 7) from the dataset entirely.
My current idea is to look at the previously obtained row and, if one of the 2 duplicates differs from it by more than 0.5 degrees, mark it TRUE in a 3rd column so I can filter out all the TRUE values at the end. However, I'm not sure how to express within the IF statement that I'm looking for + OR - 0.5 of a previous number. Does anyone know?
Here is a Google Sheets example that does what you want:
https://docs.google.com/spreadsheets/d/1Va9RjSeulOfVTd-0b4EM4azbUkYUb22jXNc_EcafUO8/edit?usp=sharing
What I did:
Calculate a column of a 3-item running average of the data using "=AVERAGE(B3:B1)"
Filter the list using "=IF(ABS(B2-C2) < 1, B2, )"
Calculate the average of the filtered list
The use of the absolute value is what provides the "+ OR -" that you were looking for. It says that if the distance between two numbers is too large, the term is not included.
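The same absolute-value test can be applied to the idea from the question, comparing each duplicate with the previously accepted reading. The Python below is a sketch of that variant, not of the running-average sheet: the name clean_readings and the 0.5 tolerance are assumptions, the first timestamp is accepted as-is because there is nothing to compare it with, and a timestamp with no reading inside the tolerance is skipped.

```python
from itertools import groupby


def clean_readings(readings, tolerance=0.5):
    """Average duplicates per timestamp, dropping values far from the last accepted one."""
    cleaned = []
    previous = None
    # readings are (temp, time) pairs, already ordered by time
    for time, group in groupby(readings, key=lambda r: r[1]):
        temps = [temp for temp, _ in group]
        if previous is not None:
            # abs() covers both "+" and "-" deviations in a single comparison
            temps = [t for t in temps if abs(t - previous) <= tolerance]
        if not temps:
            continue  # nothing plausible at this timestamp
        previous = sum(temps) / len(temps)
        cleaned.append((time, round(previous, 2)))
    return cleaned


readings = [
    (24.5, "4/1/18 2:00"), (24.7, "4/1/18 2:00"),
    (24.6, "4/1/18 2:05"), (28.3, "4/1/18 2:05"),
    (24.3, "4/1/18 2:10"), (24.5, "4/1/18 2:10"),
    (26.5, "4/1/18 2:15"), (24.4, "4/1/18 2:15"),
    (24.7, "4/1/18 2:20"), (22.0, "4/1/18 2:20"),
]
for time, temp in clean_readings(readings):
    print(time, temp)
```

On the sample data this keeps one averaged value per timestamp and discards 28.3, 26.5 and 22.0.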
A simple solution came to mind. Follow the steps below:
Convert the data to a table
Add a 4th column at the end
Enter the formula "Current Value - Previous Value"
Filter the column for high difference values
Delete the filtered rows and you'll be left with the normal values
Or, if you only want to compare readings taken at the same time, do the following:
Convert your data to a table
Add a 4th column at the end of the table
Write the following formula in the 4th column
IF(Current_Time = Previous_Time, Current_Temp-Previous_Temp,"")
Filter and delete the rows with a high difference

Spreadsheet aggregation/manipulation

I have a spreadsheet structured like
2005 Alameda total HS graduates 1234
2005 Alameda UC enrollees 112
2006 Alameda total HS graduates 892
2006 Alameda UC enrollees 84
...
2009 Yolo total HS graduates 1300
2009 Yolo UC enrollees 93
and so on for every CA county for several years.
I want to generate a spreadsheet like this:
county 2005 2006 ...
Alameda 11.1% 9%
Alpine 7% 8%
...
Yolo 5.5% 4%
i.e. I want to project the years from rows to columns and have a row for each county, then divide the number of graduates (the data from each odd-numbered row in the original sheet) by the number of UC enrollees (even-row data) for each year, and insert it in the appropriate cell.
This would be easy enough for me to do in Java, but I want to get a feel for what's possible using Excel/Google Sheets alone. How might I go about accomplishing this?
Assuming the counties are sorted, and they start in cell B2, enter =B2 in cell F2, and enter the following in F3:
=INDIRECT("B"&COUNTIF(B3:B$9999,"<="&F2)+ROW())
You can change 9999 based on the number of records, but it's fine as-is.
Copy F3 down as many rows as are needed.
You can then calculate percentages using SUMPRODUCT:
=IFERROR(
SUMPRODUCT(($A$2:$A$100=G$1)*
($B$2:$B$100=$F2)*
($C$2:$C$100="UC enrollees")*
$D$2:$D$100
)
/
SUMPRODUCT(($A$2:$A$100=G$1)*
($B$2:$B$100=$F2)*
($C$2:$C$100="total HS graduates")*
$D$2:$D$100
),
"")
The first SUMPRODUCT totals UC enrollees that match the year and county. The second SUMPRODUCT does the same for HS graduates. The results are divided, and IFERROR handles divide-by-zero errors for missing data.
Since your example shows percentages, I assume you want to divide UC enrollees by HS graduates, and not the other way around. Either way, I don't get the same totals as you, so let me know if I misunderstood.
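For comparison with the Java route mentioned in the question, the same year-by-county projection can be sketched in a few lines of Python. This is an illustration only: the function name enrollment_rates and the tuple layout (year, county, measure, value) are assumptions, and it divides UC enrollees by HS graduates, as the SUMPRODUCT formula does.

```python
from collections import defaultdict


def enrollment_rates(rows):
    """Return {county: {year: UC enrollees / total HS graduates}}."""
    totals = defaultdict(lambda: {"total HS graduates": 0, "UC enrollees": 0})
    for year, county, measure, value in rows:
        totals[(county, year)][measure] += value
    table = defaultdict(dict)
    for (county, year), t in totals.items():
        grads = t["total HS graduates"]
        # None plays the role of IFERROR's "" when graduates are missing or zero
        table[county][year] = t["UC enrollees"] / grads if grads else None
    return {county: dict(years) for county, years in table.items()}


rows = [
    (2005, "Alameda", "total HS graduates", 1234),
    (2005, "Alameda", "UC enrollees", 112),
    (2006, "Alameda", "total HS graduates", 892),
    (2006, "Alameda", "UC enrollees", 84),
    (2009, "Yolo", "total HS graduates", 1300),
    (2009, "Yolo", "UC enrollees", 93),
]
for county, years in enrollment_rates(rows).items():
    print(county, {year: f"{rate:.1%}" for year, rate in years.items()})
```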
Here is the pivot table way of doing it for comparison.
There are many ways of doing this, but I've added column headers and chosen to use this formula to put percentages in even rows of column E and zeroes in odd rows in sheet 1:
=IF(ISEVEN(ROW()),D3/D2*100,0)
Then I've inserted a pivot table in sheet 2 referring to my data in sheet 1 and set up the fields as shown, and it's pretty automatic.

How to make a "trending" or "averaging" curve

I have a spreadsheet on which I've been tracking my weight for the last year.
I weigh myself nearly every day, and I can be off by as much as 5 pounds from day to day.
I would like to make a graph that shows the overall pattern of my weight loss / gain, but without all of the noise.
What are some formulas that I can use to calculate the overall trend?
Place the raw daily measurements in A1 through A365. In B2 enter:
=(A1+A2+A3)/3
and copy down. Column B will give you a smoother dataset for plotting and trending.
Once you have enough data points a "moving average" will help reduce the daily noise. Let's say you have 10 data points starting in A1:
120.0 119.0 114.1 116.7 112.0 108.7 107.9 104.6 108.9 111.7
In cell C2 you could use the formula =AVERAGE(A1:C1) and copy it across to the end of your data set. The relative references will always average the last 3 measurements.
Now your data looks like:
120.0 119.0 114.1 116.7 112.0 108.7 107.9 104.6 108.9 111.7
            117.7 116.6 114.3 112.5 109.5 107.1 107.1 108.4
So your second row has far less variation than the raw data.
You can also get fancy and make the number of measurements variable. If that number were stored in A5 (below your data) then the formula would be something like
=AVERAGE(OFFSET(C1,0,0,1,-MIN(COLUMN(),$A$5)))
The MIN ensures that you don't go past the beginning of the data set (if you do a 5-day moving average you can't go back 5 days from the 4th day, etc.)
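The variable-window average, including the clipping at the start that MIN provides, can be sketched as follows. This is a Python illustration with an invented function name (moving_average), not a translation of the OFFSET formula.

```python
def moving_average(values, window):
    """Trailing moving average; the window shrinks near the start of the data."""
    averages = []
    for i in range(len(values)):
        start = max(0, i - window + 1)  # never reach back past the first value
        chunk = values[start:i + 1]
        averages.append(sum(chunk) / len(chunk))
    return averages


weights = [120.0, 119.0, 114.1, 116.7, 112.0, 108.7, 107.9, 104.6, 108.9, 111.7]
print([round(avg, 1) for avg in moving_average(weights, 3)])
```

From the third value onward, the rounded results match the second row of the example above; the first two entries average only one and two measurements.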

Using Excel's RANK() function to calculate allocations based on ranking and constraints

I have the following table set up
Limit   Allocation   Yield   Ranking
$600    [to calc]    0.07%   7
$600                 0.09%   6
$600                 0.20%   1
$400                 0.20%   1
$400                 0.13%   4
$200                 0.19%   3
$200                 0.12%   5
Additionally, I have a constraint: I can only allocate a total of $2000 across the 7 rows, in order of their yield ranking (so a higher-yield row is allocated up to the amount in its Limit column, as long as anything is left of the $2000 total).
I was wondering how I could set up the formulas so that the allocation is performed automatically. Thanks!
I'm going to assume this table starts in A1...
In E1, put the amount you have to allocate
In B2 (and then copied to B3...B8) use the following formula
=MIN(A2,$E$1-SUMIF($D$2:$D$8,"<"&D2,$B$2:$B$8))
This works out how much has already been taken by higher-ranked rows (those with a smaller rank number) and takes the rest, up to the lesser of the row's limit and what is left in the pot.
There is one fault with this formula that you will need to figure out how to handle:
If rows with equal ranks compete for the last of the pot, each of them will claim it. (e.g. try this with $700, and you will see that the 2 rows that have rank 1 both draw on the full pot and claim $600 and $400, $1,000 in total)
To solve the problem caused by tied ranks: in the rank column D, add a small tiebreaker to the rank, =RANK(C2,$C$2:$C$8,0)+(0.0000001*ROW(A2)), adjusting the reference for whatever row you are in. Then format the rank column to show only integers. The very small decimal addition acts as the tiebreaker, so the first row among those sharing a rank's integer takes the allocation first. Since you are adding it only to the rank, it doesn't affect any totals, and with the column formatted as integers the viewer will not be aware of the tiebreaker.
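The allocation rule from the question (fill the highest yield first, up to its limit, until the pot runs out) can be checked with a short sketch. This Python version is an illustration only: the function name allocate is invented here, and ties are assumed to be broken by row order, which is what the tiebreaker above achieves.

```python
def allocate(limits, yields, pot):
    """Greedy allocation: highest yield first, ties broken by row order."""
    order = sorted(range(len(limits)), key=lambda i: (-yields[i], i))
    allocation = [0] * len(limits)
    remaining = pot
    for i in order:
        take = min(limits[i], remaining)  # never exceed the limit or the pot
        allocation[i] = take
        remaining -= take
    return allocation


limits = [600, 600, 600, 400, 400, 200, 200]
yields = [0.07, 0.09, 0.20, 0.20, 0.13, 0.19, 0.12]
print(allocate(limits, yields, 2000))
```

Because the rows are processed one at a time, two rows with the same yield can never both claim the same remaining amount.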
