Optimization of a list in Excel with Variables - excel

I have a list of 153 golfers with associated salaries and average scores.
I want to find the combination of 6 golfers that optimizes avg score and keeps salary under $50,000.
I've tried using Solver, but I am stuck! Can anyone help please? :)

Illustrating a solution that is pretty close to what @ErwinKalvelagen suggested.
Column A is the names of the 153 golfers
Column B is the golfers' salaries (generated by =RANDBETWEEN(50, 125)*100, filled down, then Copy/Paste Values)
Column C is the golfers' average scores (generated by =RANDBETWEEN(70, 85), filled down, then Copy/Paste Values)
Column D is a 0 or 1 to indicate if the golfer is included.
Cell F2 is the total salary, given by =SUMPRODUCT(B2:B154,D2:D154)
Cell G2 is the number of golfers, given by =SUM(D2:D154)
Cell H2 is the average score of the team, given by =SUMPRODUCT(C2:C154,D2:D154)/G2
The page looks like this, before setting up Solver ...
The Solver setup looks like this ...
Solver's help says to use the Evolutionary engine for non-smooth problems. In Options, I needed to increase the Maximum Time without Improvement from 30 to 300 (60 may have been enough).
It took a couple of minutes to complete. It reached an average score of 70 fairly quickly, but spent more time looking for a better answer.
And here are the six golfers it came up with.
It could have found golfers with an average of 70 and a lower total salary, though.
In cell I2 I added the formula =F2+F2*(H2-70), which is essentially the salary penalized by any increase in average score above 70 ...
... and used the same Solver setup, except minimizing cell I2 instead of H2 ...
and these are the golfers it chose ...
Again, it looks like there is still a better solution: it could have picked Name97 instead of Name96.

This is a simple optimization problem that can be solved with Excel's Solver. Just use the "Simplex LP" solving method (somewhat of a misnomer, as we use it here to solve an integer programming, or MIP, problem).
You need one column with 153 binary (BIN) variables (Excel's limit is, I believe, 200). Make sure you add a constraint to set the values to binary. Let's call this column INCLUDE; Solver will fill it with 0 or 1 values. Sum these values, and add the constraint SUMINCLUDE = 6. Then add a column with INCLUDE * SCORE. Sum this column; this is your objective (optimizing the average is the same as optimizing the sum, because the team size is fixed at 6). Then add a column with INCLUDE * SALARY and sum it. Add the constraint SUMSALARY <= 50k. Press Solve and you are done.
I don't agree with claims that Excel will crash on this, or that it does not fit within the limits of Excel's Solver (I actually tried it out).
I prefer the simplex method over the evolutionary solver because it is more suitable for this problem: it is faster (simplex takes less than 1 second) and provides optimal solutions (the evolutionary solver often gives suboptimal solutions).
If you want to solve this problem with MATLAB, a function to look at is intlinprog (Optimization Toolbox).
To be complete: this is the mathematical model we are solving here:
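The model image is not reproduced in this copy. Based on the steps described above, it can plausibly be written as follows. The symbols are assumed names: x_i is the INCLUDE value of golfer i, c_i the average score and s_i the salary. Minimizing assumes that a lower score is better, as in the other answer.

```latex
\begin{aligned}
\min_{x} \quad & \sum_{i=1}^{153} c_i \, x_i \\
\text{subject to} \quad & \sum_{i=1}^{153} x_i = 6, \\
& \sum_{i=1}^{153} s_i \, x_i \le 50000, \\
& x_i \in \{0, 1\}, \qquad i = 1, \dots, 153 .
\end{aligned}
```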
Results with random data:
....
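To cross-check a Solver result outside Excel, the same selection problem can be solved exactly with a small dynamic program. This is only a sketch and not what Solver does internally: it swaps in dynamic programming for the simplex / branch-and-bound approach, and it assumes that lower scores are better and that all salaries are multiples of 100 (as in the RANDBETWEEN test data). The function name pick_team and its parameters are hypothetical.

```python
from typing import List, Optional, Tuple


def pick_team(
    salaries: List[int],
    scores: List[float],
    team_size: int = 6,
    budget: int = 50_000,
    step: int = 100,
) -> Optional[Tuple[float, Tuple[int, ...]]]:
    """Return (total score, chosen indices) of the cheapest-scoring team.

    Assumes every salary is a multiple of `step`.
    """
    cap = budget // step
    # best[k][s]: best (total score, indices) using exactly k golfers
    # whose salaries add up to exactly s * step; None if unreachable.
    best = [[None] * (cap + 1) for _ in range(team_size + 1)]
    best[0][0] = (0, ())
    for i, (salary, score) in enumerate(zip(salaries, scores)):
        w = salary // step
        # Walk k downwards so each golfer is used at most once.
        for k in range(team_size - 1, -1, -1):
            for s in range(cap - w + 1):
                cur = best[k][s]
                if cur is None:
                    continue
                cand = (cur[0] + score, cur[1] + (i,))
                old = best[k + 1][s + w]
                if old is None or cand[0] < old[0]:
                    best[k + 1][s + w] = cand
    feasible = [c for c in best[team_size] if c is not None]
    return min(feasible, key=lambda c: c[0]) if feasible else None
```

Dividing the returned total by the team size gives the average score that cell H2 reports in the first answer.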

Related

Excel SUMPRODUCT to find the % of a whole number

The Excel SUMPRODUCT function can be quite powerful in different scenarios, including:
Doing the product
Sum with condition
Division
Weighted average
Calculating the % given two sets of values
Finding text in a cell given a range of values
But what I couldn't make it do is find the contribution ratio (%) of two numbers.
Say something like:
A | B | Contribution %
414 | 24 | 5,80%
754 | 36 | 4,78%
 | | 5,13%
Can I use the SUMPRODUCT to obtain the above Contribution (B/A) but across the whole set?
A formula like =SUMPRODUCT(B1:B2;0,1/A1:B2) gives a value next to it, but not the correct one.
Perhaps I'm overlooking something, or simply asking for something impossible.
Thanks for any input.
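One way to read the table is that the contribution across the whole set is the ratio of the column totals, not the average of the per-row ratios. A minimal Python sketch of that reading follows. The function name overall_contribution is hypothetical, and the ratio-of-sums interpretation is an assumption inferred from the sample numbers. In Excel this would correspond to something like SUM(B1:B2)/SUM(A1:A2).

```python
from typing import List


def overall_contribution(a: List[float], b: List[float]) -> float:
    """Contribution of B relative to A across all rows (ratio of the sums)."""
    return sum(b) / sum(a)


# Sample data from the question: columns A and B.
ratio = overall_contribution([414, 754], [24, 36])
print(f"{ratio:.2%}")
```

For the sample data this comes out at roughly 5.14%, which agrees with the 5,13% in the table up to rounding.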

What function to use for this difficult excel calculation for the roulette wheel?

I am a complete Excel and math noob, and I want a cell in Excel that displays the "Pelayo number", which is used in calculating bias in a roulette wheel. You can read more about it here: https://www.roulette-bet.com/2015/06/the-roulette-bias-winning-method.html
Let me explain briefly what I want. As you can see in the image, there are two columns: one holds the numbers on a roulette wheel and the second holds the frequency of each number. At the top you see the number of spins (852). The number at the bottom (23,02.....) is the expected frequency of each number. The table is dynamic, constantly evolving as I enter new data.
Now I want a cell to display the total number of positives, which is calculated like this:
If there have been 300 spins, each number has to have been spun 300/36 = 8.33 times in order to break even. This means those which have been spun 8 times are losing a little, and those which have shown up 9 times are winning something. If a number has appeared 14 times, it clearly has 14-8.33 = 5.67, which we will express in abbreviated form as +5. Let’s suppose the exact same situation has occurred for 6 other numbers as well; together they make a total of 5.67 + 5.67 + 5.67 + 5.67 + 5.67 + 5.67 + 5.67 = 39.69. As no other number has been spun over 9 times, we say the total positives at this table at 300 spins is +39.
TL;DR: ideally something like: select all the numbers in G6:G42 that are bigger than the value in G50, subtract the expected frequency (G50) from each of them, and then add this all up.
I tried to solve it but just couldn't find a tutorial anywhere.
I'll break this down for you, and show you a few helpful Excel concepts along the way.
Especially if you are a beginner, I'd recommend using a helper column. Helper columns are great ways to break down complicated functions into smaller, more manageable parts.
In H6, write =IF(G6>G$50,G6-G$50,0). That IF statement sets us up for our sum, with either the amount by which G6 exceeds G50, or a 0. The $ will be cleared up in a moment.
Then, hover your mouse over that cell, and a little square box will appear in the lower right corner of H6. Grab that tiny box, and drag it down to H42. This fills in the formula, adjusting the row numbers relatively as you go. Note that the 50 stayed constant, however: that's what the $ did!
Column H is now your helper column. It doesn't find your answer, but it gets an important intermediate step done.
Finally, wherever you'd like your answer, write =SUM(H6:H42), and you should be well on your way.
=SUMIF(G6:G42,">"&G50,G6:G42)-COUNTIF(G6:G42,">"&G50)*G50
It sums the values that are greater than G50, then subtracts the G50 value once for each value that was summed.
For example, say G50 is 23.02 and you have the values 20, 21, 22, 23, 24, 25.
It would calculate (24+25)-2*23.02, because only 24 and 25 are greater than 23.02.
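The same calculation can be sketched in Python to check a spreadsheet result by hand. The function name total_positives is hypothetical; it simply mirrors the SUMIF/COUNTIF formula above by adding up the excess of every frequency that is above the expected value.

```python
from typing import Iterable


def total_positives(frequencies: Iterable[float], expected: float) -> float:
    """Sum of (frequency - expected) over all frequencies above expected."""
    return sum(f - expected for f in frequencies if f > expected)


# Example from the answer: only 24 and 25 exceed 23.02.
print(total_positives([20, 21, 22, 23, 24, 25], 23.02))
```

With the 300-spin example from the question (seven numbers at 14 hits, expected 300/36), the result is about 39.67, whose integer part is the +39 quoted there.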

Excel Rolling Mean of 3 Similar Consecutive Observations

I'm trying to find the rolling mean of a time series while ignoring values that do not follow the trend.
x
869
1570
946
0
1136
So, what I would want the result to look like is...
x | y
869 | 0
1570 | 0
946 | 1128.33
3 | 0
1136 | 1217.33 ([1136+1570+946]/3)
900 | 2982 ([946+1136+900]/3)
860 | 2896
The tough part here is that if the row I'm on is a trending value, I want to take the 3 most recent trending values and find their mean, but if it's a non-trending value I want it to just zero out. Sometimes I might also have to skip 2 or 3 previous lines to get 3 trending values to average.
So far I've been using array and RC formulas in a VBA macro, but I'm not sure whether I could use RC here or whether it has to be something else completely. Any help would be greatly appreciated.
I believe I can help you with your problem. First three notes:
1) It appears to me that you are trying to do DCA on smoothed production profiles, ignoring months with an incomplete record or no data. I'm making this assumption since you mentioned this was time series data but didn't give a sample rate.
2) I've added some extra 'data' for the sake of the demo.
3) In the example you shared, it looks like the last two values in your 'y' column were summed but not divided by 3.
The solution I came up with has three parts: 1) create a metric to identify 'outliers'; 2) flag 'outliers'; 3) smooth non-flagged data. Let's establish some worksheet infrastructure and say that your production values are in column B and the associated time is in column A as follows:
Part 1) In column 'C', estimate a rough data value based on a trend approximated from two points on either side of your current time step. Subtract the actual value from this approximation. The result will always be positive and quite large for a timestep with little or no production.
=(INTERCEPT(B1:B6,A1:A6)+(A4*SLOPE(B1:B6,A1:A6)))-B4
Part 2) In column 'D', add a condition for when the value computed above is larger than the actual data point. Have it use '0' to identify a point that shouldn't be included in your average. Copy this down to the end of your data as well.
=IF(C4>B4,0,1)
Our sheet now looks like this:
Part 3) Your three-element average can now be computed. In the last cell of column 'E', enter the following array formula. You have to confirm it by pressing Ctrl+Shift+Enter. Once that is done, fill the column from bottom to top:
=IFERROR(IF(D17=1,AVERAGE(INDEX(B12:B17,MATCH(2,1/(FIND(1,D12:D17)))),INDEX(B12:B16,MATCH(2,1/(FIND(1,D12:D16)))-COUNTIF(D17,"=0")),INDEX(B12:B15,MATCH(2,1/(FIND(1,D12:D15)))-COUNTIF(D16:D17,"=0"))),0),"")
This averages the most recent three values and allows for skipping up to three time steps of outlier data, per your problem statement. For an idea of how the completed sheet looks:
This was a fun challenge. I have some ideas for a more efficient formula, but this should get the job done. Please let me know how this works for you!
Cheers
[EDIT]
An alternative approach, which lets the user specify the number of previous entries to include, is detailed below. It is more general (and preferred) and replaces the previously described step 3.
3Alt) In cell G2 enter the number of previous values to average; for this example I am sticking with 3. In cell E4 enter the following array expression (Ctrl+Shift+Enter) and drag it to the end of column E:
=IFERROR(IF(D4=1,SUM(INDEX(D:D,LARGE(($D$4:D4=1)*ROW($D$4:D4),$G$2)):D4 * INDEX(B:B,LARGE(($D$4:D4=1)*ROW($D$4:D4),$G$2)):B4)/$G$2,0),"")
This uses the LARGE function to find the nth largest row number among the rows flagged with 1, where n is the number of preceding values (counting back from the current time step) to average. Then it builds a range that extends from the found cell to the current time step. Then it multiplies the flags (0's and 1's) by each month's production value, sums the products and divides by n. In this way months flagged as bad are set to 0 and not included in the sum.
This is a much cleaner way to achieve the desired result and has the flexibility to average different periods of time. See example of the final value below.
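The logic of the alternative step can be sketched in Python for checking the spreadsheet output. This is an illustration only: the function name flagged_rolling_mean is hypothetical, the 0/1 flags are taken as given (they correspond to column D), and rows with fewer than n flagged values so far are returned as None, which plays the role of the blank produced by IFERROR.

```python
from typing import List, Optional


def flagged_rolling_mean(
    values: List[float], flags: List[int], n: int = 3
) -> List[Optional[float]]:
    """Mean of the n most recent flagged values; 0 for unflagged rows."""
    result: List[Optional[float]] = []
    kept: List[float] = []  # flagged values seen so far, in order
    for value, flag in zip(values, flags):
        if flag != 1:
            result.append(0.0)  # outlier rows are zeroed out
            continue
        kept.append(value)
        if len(kept) >= n:
            result.append(sum(kept[-n:]) / n)
        else:
            result.append(None)  # not enough flagged history yet
    return result


# Data from the question, with the zero row flagged as an outlier.
x = [869, 1570, 946, 0, 1136, 900, 860]
flags = [1, 1, 1, 0, 1, 1, 1]
print(flagged_rolling_mean(x, flags))
```

With the question's data, the last two results are 2982/3 and 2896/3, that is, the sums shown in the question divided by 3.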

Microsoft Excel 2007 Always round up even if the decimal is under 0.5

I'm creating a spreadsheet that determines the cost of materials, and the quantity of each material needed, to complete a desired project based on my input. Right now the project is a 250x9 wall that requires replacing all the 4x8 sheets of wood with OSB and installing vinyl siding. The issue I'm running into is that I cannot get it to always round up. By that I mean that even if the value is 1.1, it should round up to 2. In this specific case I am buying nails for my nail gun in boxes of 2,000, and each sheet of OSB will have 32 nails in it. If the 250x9 area requires 70.3125 sheets of OSB, I still have to buy 71 sheets. Those 71 sheets require 2,272 nails, so I need 1.136 boxes of nails. However, I can't seem to get it to show this as 2 boxes, even though I need to purchase more than one box to complete the project. If I take the number of OSB sheets needed, 70.3125, and place it in a formula with a ROUNDUP function, it still rounds down (it gives me a headache that there is a ROUNDUP and a ROUNDDOWN function and it will still round down on me). Perhaps the way I am using it in the formula is incorrect; I'm not sure. So let me walk through the formulas used, and you can let me know if I'm doing something wrong or if there is a function or set of functions I can use to solve this.
=SUM(((B30*C30)+(B35*C35)+(E30*F30)+(H30*I30))/(E9*G9))
This says that if I add Wall1 L*W, Wall2 L*W, Wall3 L*W and Wall4 L*W and divide the total by OSB H*W, I get the number of sheets needed, which in this case is basically 2250/32. But it's set up so that I can enter the information for individual walls in different areas and get the total sq ft for each wall, plus an individual breakdown per wall of material needed with the associated cost per sq ft of material, bleh bleh bleh. The point is, I take the result, 70.3125, and move it to a different workbook in a box labelled "Sheets OSB Needed", and in that box I have
=ROUNDUP(Sheet1!A9,1)
Here I'm asking it to round up A9, which is the result of the above formula, in intervals of 1. But the output is still 70 instead of 71. It is much the same with the nails needed: they can be calculated in a few different ways, but regardless, the number of nails needed divided by 2000 yields a decimal value of less than 1.5, and it too gives me 1 instead of 2 with much the same formula. I suppose I could achieve my desired result with TRUNC and MOD functions working together, using multiple cells to output the different portions of the data. But is there a way to do this that doesn't use up so many cells?
Cell | Formula | Purpose
C7 | =TRUNC(A9) | Removes the decimals from 70.3125
C8 | =MOD(A9,1) | Outputs the decimals from 70.3125
C9 | =IF(C8<1,"1",C8) | If the decimals are less than a whole number, make them a whole number
C3 | =SUM(C7+C9) | Adds the whole number to the TRUNC number to get the desired value
I'm already seeing an issue with this: if there are no decimals in the sheets needed, wouldn't it always add one, because the decimal part would be 0? How can I handle this? Isn't there an easier way to do this, or a way to nest it all, or at least most of it, into one calculation without creating a circular reference of some sort?
You need to change the second parameter to 0. ROUNDUP(70.3125, 1) is 70.4; the .4 must be getting dropped elsewhere or lost in formatting.
ROUNDUP(70.3125, 0) will give 71.
The second parameter of ROUNDUP is the number of decimal places, so to round up to integers it should be 0, not 1.
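To see the effect of the second parameter outside Excel, here is a small Python sketch. The roundup helper is a hypothetical stand-in that mimics ROUNDUP for the values in this question (rounding away from zero to a given number of decimal places); it is not Excel's implementation. The material counts use math.ceil, which is the integer round-up the question is after.

```python
import math
from decimal import Decimal, ROUND_UP


def roundup(value: float, digits: int = 0) -> float:
    """Round away from zero to `digits` decimal places, like ROUNDUP."""
    quantum = Decimal(1).scaleb(-digits)  # e.g. digits=1 gives 0.1
    return float(Decimal(str(value)).quantize(quantum, rounding=ROUND_UP))


sheets_exact = 2250 / 32                  # 70.3125 sheets of OSB
sheets = math.ceil(sheets_exact)          # whole sheets to buy
nails = sheets * 32                       # 32 nails per sheet
boxes = math.ceil(nails / 2000)           # boxes of 2,000 nails

print(roundup(sheets_exact, 1), roundup(sheets_exact, 0), sheets, boxes)
```

A value that is already whole, such as 71, stays at 71, so the "always adds one" problem of the TRUNC/MOD workaround does not arise.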

How can I implement 'balanced' error spreading functionality in Excel?

I have a requirement in Excel to spread small monetary rounding errors (i.e. pennies) fairly across the members of my club.
The error arises when I deduct money from members; e.g. £30 divided between 21 members is £1.428571..., so £1.43 has to be deducted from each member to reach the £30 target, which totals £30.03.
The approach that I want to take, continuing the above example, is to deduct £1.42 from each member, totalling £29.82, and then deduct the remaining £0.18 using an error spreading technique to randomly take an extra penny from 18 of the 21 members.
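The arithmetic in this example can be checked with integer division in pence. A minimal Python sketch, where split_pennies is a hypothetical helper name:

```python
from typing import Tuple


def split_pennies(total_pence: int, members: int) -> Tuple[int, int]:
    """Return (base deduction per member, number of extra pennies to spread)."""
    return divmod(total_pence, members)


base, extra = split_pennies(3000, 21)  # £30.00 between 21 members
print(base, extra)
```

Here base is the 142p taken from everyone and extra is the number of members (18) who must each give one more penny.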
This immediately made me think of Reservoir Sampling, and I used the information here: Random selection,
to construct the test Excel spreadsheet here: https://www.dropbox.com/s/snbkldt6e8qkcco/ErrorSpreading.xls, on Dropbox, for you guys to play with...
The problem I have is that each row of this spreadsheet calculates the error distribution independently of every other row, and this causes some members to contribute more than their fair share of extra pennies.
What I am looking for is a modification to the Reservoir Sampling technique, or another balanced / 2-dimensional error spreading methodology that I'm not aware of, that will minimise the overall error between members across many 'error spreading' rows.
I think this is one of those challenging problems that has a huge number of other uses, so I'm hoping you geniuses have some good ideas!
Thanks for any insight you can share :)
Will
I found a solution. Not very elegant, though.
You have to use two matrices. In the first you put completely random numbers, generated with =RAND(), and in the second you pick the n largest values.
Say that F30 holds the first
=RAND()
cell.
(I have experimented with your sheet.)
Just copy a column of n (in your sheet 8) into column A.
In cell F52 you put:
=IF(RANK(F30,$F30:$Z30)<=$A52, 1, 0)
Up to this point, if you drag the formulas right and down, you have the same situation as in your sheet (only less elegant and efficient).
But starting from the second row of random numbers, you can compensate for the pennies already paid.
In cell F31 you put:
=RAND()-SUM(F$52:F52)*0.5
(Pay attention to the $: each random number should get a correction based on the pennies already spent.)
If the $ signs are right, you should be OK dragging the formulas right and down. You could also parametrize the 0.5 and experiment with other values. With 0.5 I get an error factor (the equivalent of your cell AB24) between 1 and 2.
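The idea behind the two matrices can be sketched in Python to experiment with the penalty value before building the sheet. This is only an illustration of the approach described above, not a reproduction of the workbook: the function name spread_pennies, its parameters and the fixed seed are all assumptions. Each round draws a random score per member, subtracts penalty times the pennies that member has already paid, and charges the n members with the highest scores.

```python
import random
from typing import List, Optional, Tuple


def spread_pennies(
    extras_per_round: List[int],
    members: int,
    penalty: float = 0.5,
    seed: Optional[int] = None,
) -> Tuple[List[List[int]], List[int]]:
    """Return (0/1 matrix of who pays each round, total extra pennies per member)."""
    rng = random.Random(seed)
    paid = [0] * members
    history: List[List[int]] = []
    for n in extras_per_round:
        # Random score, pushed down for members who have already paid more.
        scores = [rng.random() - penalty * paid[m] for m in range(members)]
        ranked = sorted(range(members), key=lambda m: scores[m], reverse=True)
        row = [0] * members
        for m in ranked[:n]:  # the n highest scores pay one extra penny
            row[m] = 1
            paid[m] += 1
        history.append(row)
    return history, paid


rounds = [18, 8, 5, 11, 3] * 20
history, paid = spread_pennies(rounds, members=21, seed=1)
print(max(paid) - min(paid))
```

With a penalty of 0.5, a member who is two pennies ahead always scores below a member who is two pennies behind, which is why the gap between the most and least charged members should stay small.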
