Calculating number of times one "team" beats another in Excel - excel

(I use the term "teams" generically here because the entirety of this question rests on ranking, and it seemed to be the most intuitive language to describe my problem.)
In a league of 30 teams, each day only 8 teams play. The results for those teams are ranked ordinally from 1 to 8 for the day. This continues "forever", so that additional results must be recorded every day.
Example after 4 days:
I want to calculate a single number to describe the relationship between two teams. For instance, given the example, the value (in a 2d table) that describes the relationship of Ace to Get is 1. Ace beat Get twice and Get beat Ace once (2-1).
I have been messing with Sumproduct, Match, and Index to get get values, which I could calculate using many extra tables, but I may need to add "teams" on the fly, and I do not know how large the pool of teams will become. Because of this, I was hoping to be able to use a single formula in the 2d relationship table. The results of that table, looking at just day 1 and day 2 given the previous example, are:
Is there a direct formula I can use to calculate the results to populate that table?

You can try following formula:
=IF($A11<>B$10;
SUMPRODUCT(
IF(MMULT(($B$1:$I$1)*($B$2:$I$3=$A11);ROW($1:$8)^0)
<MMULT(($B$1:$I$1)*($B$2:$I$3=B$10);ROW($1:$8)^0);
1;
-1)
*(((MMULT(--($B$2:$I$3<>$A11);ROW($1:$8)^0)=8)
+(MMULT(--($B$2:$I$3<>B$10);ROW($1:$8)^0)=8))
=0));
"")
Copy right and down.

Related

Using Sumproduct to calculate two tables using horizontal (table headers) and vertical references

Hopefully the title makes some sense because I'm trying to wrap my head around the logic and I'm not quite sure how to phrase the question.I'll try to give a brief explanation of the end goal without over complicating it with unnecessary details.
I have a table of survey score averages for every month per person and a correlating table with the number of surveys each person received for each month. The logic is essentially multiple the score for each month by the number of surveys, combine them, divide by the total number of surveys within that time period to get their true average. Where things get a little complicated is that I have to include the ability to set a custom date range and return the value. So sometimes I might be looking at the average for Jan - Apr, other times I might just be looking at Feb-Mar etc.
I think sumproduct is going to get what I need done but I'm running into issues trying to write it out. I've written it several different ways and none of them worked so here's one that best conveys what I'm trying to do,
=SUMPRODUCT(--(F7:I7,L7:O7>=C2),--(F7:I7,L7:O7<=C3),--(E8:E12,K8:K12=B9),tbl_average[[Jan-20]:[Apr-20]],tbl_surveys[[Jan-20]:[Apr-20]])
I super appreciate any assistance I can get on this. I'm hoping the end result is not nearly as difficult as I'm making it out to be.
Some additional information:
I'm going to be using this same process to calculate multiple metrics across multiple worksheets.In the test example each of the tables will most likely be on different sheets. The dashboard with the calculated results will contain everyone's names and will be filtered and rearranged frequently, so I need to make sure we're always matching directly to their names and not just the relative rows. Basically, in my example I show that Agent 1 is always lined up on row 8 but that's not always going to be the case. Agent 1 could be in Row 8 on Sheet 1, Row 10 on Sheet 2, and Row 12 on Sheet 3 and I need all the correct values to multiply and sum against one another.

Index Match Match across multiple columns

I have an Index Match Match question that I have not been able to find the answer for in researching. Although the solution may actually might be different than an Index Match Match formula - I'm open to try something more efficient than my current workaround.
I have one worksheet with data from my company on it. We sell a Product (let's call it Coke Zero) and we track the weeks that we put a promotion on and how much profit we make by selling it to the retailer. For example a promotion for Coke Zero starts the first week of Jan and ends 3 weeks later and we make a gross profit of $100 each week the promotion runs. I then have an external database with sales data formatted on a weekly basis to tell me how many units of Coke Zero I sold in each week. My internal data has thousands of lines like this with dozens of products, however the promotions are consolidated on one single row regardless of if it runs for more than one week, making matching up to the external database difficult. I need to create a lookup for what our Gross Profit was for each week of the promotion.
I have attached an example image of the workbook + two data sheets of what I've tried to do, summarised below.
On the Internal Data Sheet I've created additional columns to the right with all of the weeks listed that the promotion is on for, and concatenated them with the Product Code to be able to match week by week to the data in the External data sheet. Then my lookup basically checks every column one after another until it finds one where the concatenate of Week_Product Code concatenate matches.
My current solution technically works but my final formula is really slow and cumbersome given the data can be anywhere from 10K-200K lines when looking at multiple retailers. I was hoping to find a more efficient formula to complete the lookup.
Current solution on the External Data Sheet Column E:
=IF(ISNUMBER(MATCH(D2,'Internal Data'!$E:$E,0)),INDEX('Internal Data'!$D:$D,MATCH(D2,'Internal Data'!$E:$E,0)),
IF(ISNUMBER(MATCH(D2,'Internal Data'!$F:$F,0)),INDEX('Internal Data'!$D:$D,MATCH(D2,'Internal Data'!$F:$F,0)),
IF(ISNUMBER(MATCH(D2,'Internal Data'!$G:$G,0)),INDEX('Internal Data'!$D:$D,MATCH(D2,'Internal Data'!$G:$G,0)),
"0")))
I got SUMPRODUCT to work using this formula in J2:
=SUMPRODUCT(--($B$2:$D$3=H2)*--($E$2:$E$3=I2)*$F$2:$F$3)
And, you don't need those concatenated lookup columns:
Well, that was fun.

Displaying multiple items in Excel graph and few calculation issues

I've done some Googling for each of my issues but haven't found exactly the results as I wanted. Things I need to be done doesn’t probably include any macros/VBA skills, just basic knowledge of Excel.
Now to my spreadsheet. I'm a Dota 2 player and I like statistics. I like it that much that I'd like to keep track of my achievements and results. Only problem is that the game tracker sucks and to get great information in web you have to pay for it, so I decided it's time for me to create my own spreadsheet to track my skills.
I don't know which place is the best to share my spreadsheet but I uploaded it to Estonian uploading host, link is here. I will also provide with pictures so you don't have to download anything.
This is what it looks like in general:
Problem number 1: The left table, or column has 1000 rows. In web design it's possible to make elements fixed depending on the scroll, I'd like to use similar feature here. If the table gets scrolled down, the right table (area with games, bonus and graph) will get scrolled down with it.
Problem number 2: Average MMR. I'd like to show average MMR after each entry depending on the first entries. Right now there's avg MMR for J4:J8. The calculation for J8 looks like this: =AVERAGE(C4:C8). For J7 it looks like this: =AVERAGE(C4:C7). I'd like to do this for all my 1000 rows, but I don't want to type it out. If I try to drag down from the corner, it will continue with C5:C8, C6:C9 etc (so it changes the starting point)
Problem number 3: Under longestGame there's currently Date and Hero. This should show the Date and Hero of which the longest game occurred. I tried to do this with LOOKUP function but it required table to be in ascending order, which I don't want. For current, 44,22, there should be Storm Spirit and 14.06.2015.
Problem number 4: Graph. I'd like to display three series on graph - MMR, average MMR and game length (time). The problem is, that MMR and average MMR will be in the numbers on 3000-7000 but the game length will only be probably in timeframe 20:00-120:00. Maybe it's possible to add two sets of values to the Y axis or maybe set Time series maximum 200:00 and minimum 0:00 and create graph according to this. I'm really stupid making graphs and I haven't figured out a clever way yet.
Problem number 5: Graph again. Right now I have to set the series for the graph. I've currently set it to C4:C54 (so 50 rows). I'd like it to move around a bit and by that I mean that if there happens to be C55-th game then the graph would start from C5:C55 and move along (so it'll count 50 last games).
I'm in a benevolent mood so rather than downvoting your question, because it is not really suitable for this forum I'm going to give you some hints and guidance. The numbers below correspond to the problems in your question.
Excel permits more than one window to be used on the same workbook -
so one window can show the data and one the summary.
Find out about absolute and relative cell addressing - its a valuable bit of knowledge for anyone serious about Excel and it will be of use in solving your problem.
Find out about the MATCH function. You can use this to find out which row of your table contains the longest game, shortest game, max MMR, min MMR by matching an element from the summary on the right (cols M onward) against the appropriate column table on the left. The find out about the INDEX function - this can be used to pull the values in the columns for Hero and Date which correspond to a specific row (such as the row containing the longest game, shortest game, etc). Search INDEX MATCH and find out why using these two functions in combination is often preferred to using the VLOOKUP function
Persevere - there are graph options available to do what you want and the only way to really learn is to go through the pain of trying them out, failing and working at it until you succeed.
Set up an area of worksheet to hold the 50*3 table of data for your graphs. Find out about the COUNT function and think how it might be of use in determining which rows of the data table map to the 50 rows of graph data. Then think about how to populate the graph data table using one of the functions mentioned above. Incidentally, C4:C54 is actually 51 rows, not 50.

AverageIf and Multiple data strings

I'm involved with a youth football tournament on the referee side, with assessing/coaching the referees. I've just taken over doing the data entry for the referees assessment scores which we then use to determine who gets finals etc and am looking to extract more usable information from the data to help us identify trends.
I've got (up to) 200 referees, each receiving from none to two assessment scores each day for 5 days. The scores are entered as both the raw mark and the weighted mark based on match difficulty (along with a host of other data about the match that isn't relevant to this issue.
I can extract the average mark (raw and weighted) across all referees without issues and have done so using the below formula, which is the raw average mark:
=AVERAGE(Working!AK4:AK200,Working!BK4:BK200,Working!CL4:CL200,Working!DL4:DL200,Working!EM4:EM200,Working!FM4:FM200,Working!GN4:GN200,Working!HN4:HN200,Working!IO4:IO200,Working!JO4:JO200)
But I also want to extract the average mark (raw and weighted) across two subsets - Academy and non academy referees, to help plot trends and determine where resources need to be utilised.
I've attempted to use an AVERAGEIF formula, but am getting a #VALUE! return. This is the formula that I've attempted to use to return the average raw mark for those referees in the academy:
=AVERAGEIF(Working!G4:G200,Working!G4:G200="Yes",(Working!AK4:AK200,Working!BK4:BK200,Working!CL4:CL200,Working!DL4:DL200,Working!EM4:EM200,Working!FM4:FM200,Working!GN4:GN200,Working!HN4:HN200,Working!IO4:IO200,Working!JO4:JO200))
If I do the same formula as above, but without the brackets around the [average_range], I get a 'you've used too many arguments, and it highlights BK200.
From what I've been able to find so far online, it seems that the formula I'm trying to use would only work if ALL the cells in (Working!G4:G200) returned "Yes". However if there are only 50 academy referees as indicated by "Yes" in G column, then I want those specific scores to be averaged, and the inverse for the non-academy referees.
I thought about having another sheet, which would simply contain populate from Column G (a simple =G4 and then populated down to =G200 next to all of the scores), consolidated into a block of raw marks columned under Assessment 1, 2, 3, 4.... and then the same for all of the weighted marks which would populate from the equivalent cell on the working sheet, but there's a lot of filtering, and re-sorting that goes on on the working sheet, and I'm not 100% certain that that wouldn't cause issues.
Any feedback on how to work through this problem, so that I can display the overall average mark for academy and non-academy referees in both raw and weighted form would be much appreciated, and I apologize if this post is rather convoluted.
I don't think there is a neat solution if the scores are in several columns which are not consecutive.
My suggestion is:-
(1) Work out the sum for each column separately and total them up
(2) Work out the count for each column separately and total them up
(3) Divide Sum by Count to get Average.
In my small example below with 3 referees and 3 columns:-
(1) In K2:-
=SUMIF(H2:H4,"Yes",B2:B4)+SUMIF(H2:H4,"Yes",D2:D4)+SUMIF(H2:H4,"Yes",F2:F4)
(2) In K3:-
=COUNTIFS(B2:B4,">=0",H2:H4,"Yes")+COUNTIFS(D2:D4,">=0",H2:H4,"Yes")+COUNTIFS(F2:F4,">=0",H2:H4,"Yes")
(3) In K4:
=K2/K3
This would include any zero scores (if this is possible) but exclude any blanks.
You can then scale it up to your data.
Beyond this, you would have to change the data structure either
(1) Add a row to label the columns that you want to average e.g.
Score 1 Score 2 Score 3
3 0 3
so you could pick up only the columns labelled 3 say
Here's how it would be in my small example:-
In K3:-
=SUM((B$2:F$2=3)*($H3:$H5="Yes")*B3:F5)
Which is an array formula and must be entered with Ctrl-Shift-Enter
In K4:-
=SUM((B$2:F$2=3)*($H3:$H5="Yes")*(B3:F5<>""))
another array formula
In K5:-
=K3/K4
This is how the columns you want are labelled with a 3 in row 2, so it ignores the other columns:-
(2) Consolidate them into another sheet as you suggest.

How can I implement 'balanced' error spreading functionality in Excel?

I have a requirement in Excel to spread small; i.e. pennies, monetry rounding errors fairly across the members of my club.
The error arises when I deduct money from members; e.g. £30 divided between 21 members is £1.428571... requiring £1.43 to be deducted from each member, totalling £30.03, in order to hit the £30 target.
The approach that I want to take, continuing the above example, is to deduct £1.42 from each member, totalling £29.82, and then deduct the remaining £0.18 using an error spreading technique to randomly take an extra penny from 18 of the 21 members.
This immediately made me think of Reservoir Sampling, and I used the information here: Random selection,
to construct the test Excel spreadsheet here: https://www.dropbox.com/s/snbkldt6e8qkcco/ErrorSpreading.xls, on Dropbox, for you guys to play with...
The problem I have is that each row of this spreadsheet calculates the error distribution indepentently of every other row, and this causes some members to contribute more than their fair share of extra pennies.
What I am looking for is a modification to the Resevoir Sampling technique, or another balanced / 2 dimensional error spreading methodology that I'm not aware of, that will minimise the overall error between members across many 'error spreading' rows.
I think this is one of those challenging problems that has a huge number of other uses, so I'm hoping you geniuses have some good ideas!
Thanks for any insight you can share :)
Will
I found a solution. Not very elegant, through.
You have to use two matrix. In the first you get completely random number, chosen with =RANDOM() and in the second you choose the n greater value
Say that in F30 you have the first
=RANDOM()
cell.
(I have experimented with your sheet.)
Just copy a column of n (in your sheet 8) in column A)
In cell F52 you put:
=IF(RANK(F30,$F30:$Z30)<=$A52, 1, 0)
Until now, if you drag left and down the formulas, you have the same situation that is in your sheet (only less elegant und efficient).
But starting from the second row of random number you could compensate for the penny esbursed.
In cell F31 you put:
=RANDOM()-SUM(F$52:F52)*0.5
(pay attention to the $, each random number should have a correction basated on penny already spent.)
If the $ are ok you should be OK dragging formulas left and down. You could also parametrize the 0.5 and experiment with other values. With 0,5 I have a error factor (the equivalent of your cell AB24) between 1 and 2

Resources