How can I implement 'balanced' error spreading functionality in Excel? - excel

I have a requirement in Excel to spread small monetary rounding errors (i.e. pennies) fairly across the members of my club.
The error arises when I deduct money from members; e.g. £30 divided between 21 members is £1.428571... requiring £1.43 to be deducted from each member, totalling £30.03, in order to hit the £30 target.
The approach that I want to take, continuing the above example, is to deduct £1.42 from each member, totalling £29.82, and then deduct the remaining £0.18 using an error spreading technique to randomly take an extra penny from 18 of the 21 members.
This immediately made me think of Reservoir Sampling, and I used the information here: Random selection,
to construct the test Excel spreadsheet here: https://www.dropbox.com/s/snbkldt6e8qkcco/ErrorSpreading.xls, on Dropbox, for you guys to play with...
The problem I have is that each row of this spreadsheet calculates the error distribution independently of every other row, and this causes some members to contribute more than their fair share of extra pennies.
What I am looking for is a modification to the Reservoir Sampling technique, or another balanced / 2-dimensional error spreading methodology that I'm not aware of, that will minimise the overall error between members across many 'error spreading' rows.
I think this is one of those challenging problems that has a huge number of other uses, so I'm hoping you geniuses have some good ideas!
Thanks for any insight you can share :)
Will

I found a solution. Not very elegant, though.
You have to use two matrices. The first holds completely random numbers, generated with =RAND(), and the second picks out the n largest values in each row.
Say that F30 holds the first
=RAND()
cell.
(I have experimented with your sheet.)
Just copy a column of n (8 in your sheet) into column A.
In cell F52 you put:
=IF(RANK(F30,$F30:$Z30)<=$A52, 1, 0)
Up to this point, if you drag the formulas across and down, you get the same situation as in your sheet (only less elegant and less efficient).
But starting from the second row of random numbers you can compensate for the pennies already disbursed.
In cell F31 you put:
=RAND()-SUM(F$52:F52)*0.5
(Pay attention to the $: each random number should carry a correction based on the pennies already spent.)
If the $ are right, you should be OK dragging the formulas across and down. You could also parametrize the 0.5 and experiment with other values. With 0.5 I get an error factor (the equivalent of your cell AB24) between 1 and 2.
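The same idea can be sketched outside Excel. The Python below is only an illustration of the penalised-random-rank approach described above, not the spreadsheet itself: the function name `spread_pennies`, its parameters, and the example figures are assumptions made for this sketch. Each row draws one random score per member, lowers it by `penalty` times the pennies that member has already paid, and charges the `n` highest-scoring members.

```python
import random

def spread_pennies(extra_per_row, members, penalty=0.5, seed=None):
    """Return (rows, paid): one list of 0/1 flags per row, plus running totals.

    extra_per_row: how many extra pennies each row must collect (hypothetical input).
    penalty: how strongly pennies already paid lower a member's score.
    """
    rng = random.Random(seed)
    paid = [0] * members              # extra pennies paid so far, per member
    rows = []
    for n in extra_per_row:
        # Random score, reduced for members who have already paid more.
        scores = [rng.random() - penalty * paid[m] for m in range(members)]
        # The n highest scores pay this row (the RANK(...) <= n test).
        ranked = sorted(range(members), key=lambda m: scores[m], reverse=True)
        chosen = set(ranked[:n])
        rows.append([1 if m in chosen else 0 for m in range(members)])
        for m in chosen:
            paid[m] += 1
    return rows, paid

# Ten rows that each need 18 extra pennies from 21 members.
rows, paid = spread_pennies([18] * 10, 21, penalty=1.0, seed=1)
print(paid)
```

With a penalty of 1 or more, a member who has paid fewer pennies always outranks one who has paid more, so the running totals never differ by more than one penny. Smaller penalties such as 0.5 keep more randomness at the cost of a looser balance.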

Related

What function to use for this difficult excel calculation for the roulette wheel?

So I am a complete Excel and math noob, and I want a cell in Excel that displays the "Pelayo number", which is used in calculating bias in a roulette wheel. You can read more about it here: https://www.roulette-bet.com/2015/06/the-roulette-bias-winning-method.html
Let me explain briefly what I want. As you can see in the image, there are two columns: one holds the numbers on a roulette wheel and the other holds the frequency of each number. At the top you see the number of spins (852). The number at the bottom (23.02...) is the expected frequency of each number. The table is dynamic, constantly evolving as I enter new data.
Now I want a cell to display the total number of positives. Which is calculated like this:
If there have been 300 spins, each number has to have been spun 300/36 = 8.33 times in order to be breaking even. This means those which have been spun 8 times are losing a little, and those which have shown 9 times are winning something. If a number has appeared 14 times it is clear it has 14-8.33 = 5.67, which we will express in an abbreviated form like +5. Let's suppose the exact same situation has occurred for 6 other numbers also; they all will make a total sum of 5.67 + 5.67 + 5.67 + 5.67 + 5.67 + 5.67 + 5.67 = 39.69. As no other number has been spun over 9 times, we say the amount of total positives at this table at 300 spins is +39.
TLDR So ideally something like: select all the numbers in G6:G42 that are bigger than the value in G50, subtract the expected frequency (G50) from each of them, and then add the results up.
I tried to solve it but just couldn't find a tutorial anywhere.
I'll break this down for you, and show you a few helpful Excel concepts along the way.
Especially if you are a beginner, I'd recommend using a helper column. Helper columns are great ways to break down complicated functions into smaller, more manageable parts.
In H6, write =IF(G6>G$50,G6-G$50,0). That IF statement sets us up for our sum, giving either the amount by which G6 exceeds the expected frequency or a 0. The $ will be cleared up in a moment.
Then, hover your mouse over that cell, and a little square box will appear in the lower right corner of H6. Grab that tiny box, and drag it down to H42. This fills in the formula, adjusting all of the numbers relatively as you go. Note that the 50 stayed constant, however - that's what the $ did!
Column H is now your helper column. It doesn't find your answer, but it gets an important, intermediary step done.
Finally, wherever you'd like your answer, write =SUM(H6:H42), and you should be well on your way.
=SUMIF(G6:G42,">"&G50,G6:G42)-COUNTIF(G6:G42,">"&G50)*G50
It sums the values that are greater than G50, then subtracts G50 once for each value that was summed.
For example, suppose G50 is 23.02 and you have the values 20, 21, 22, 23, 24, 25.
It would calculate (24+25)-2*23.02 = 2.96, because only 24 and 25 exceed 23.02.
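As a cross-check of the arithmetic, here is a minimal Python sketch of the same "total positives" calculation. The function names are made up for illustration; the second function mirrors the SUMIF minus COUNTIF structure of the formula above, and the sample data is the small example from this answer rather than real wheel data.

```python
def total_positives(frequencies, expected):
    """Sum of (frequency - expected) over the numbers that beat the expectation."""
    return sum(f - expected for f in frequencies if f > expected)

def total_positives_sumif(frequencies, expected):
    """Same result, written the way the SUMIF/COUNTIF formula computes it."""
    above = [f for f in frequencies if f > expected]
    return sum(above) - len(above) * expected

values = [20, 21, 22, 23, 24, 25]
print(round(total_positives(values, 23.02), 2))        # (24 + 25) - 2 * 23.02
print(round(total_positives_sumif(values, 23.02), 2))
```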

Distribution of time values randomly in a table Excel - Modeling Power Grid

I am working on a model of the charging load of electric vehicles. I am attaching a link to an Excel workbook for your better understanding.
Column B contains random time values.
Columns G to P represent houses, and each house can have 1 car, so each time value needs to be assigned to one column. When a car is plugged in, its load stays constant for 3 cells.
I want Excel to randomly distribute these cars, e.g. 4 cars to 4 houses, and leave the others blank.
What I can think of is to assign each time a random house, then use an IF formula with the AND function, where the first condition matches the random times against the time series and the second matches the random houses against columns 1-10.
The problem I am facing is that the formula gives a value error and only works in the rows that have a randomly generated time in front of them (see screenshot). I know there is a very small thing that I am missing. Please help me find it.
Regards
workbook
=IF(ISNA(MATCH(G$5,$C$6:$C$9,FALSE)),"",IF(AND(INDEX($B$6:$B$9,MATCH(G$5,$C$6:$C$9,FALSE))>=$F6,INDEX($B$6:$B$9,MATCH(G$5,$C$6:$C$9,FALSE))<=$F6+TIME(0,30,0)),11,""))
The two elements in the AND find the house number in column C and return the corresponding time in column B.
The first element compares the time in F to that time. The second element compares the time + 30 minutes to F (three cells). If it's between those two times, it gets an 11.
The ISNA makes sure that the house in question is on the list. You could also use an IFERROR, but I prefer the precision of ISNA.
Update
If you want the values to wrap around, you need to OR compare to the next day.
=IF(ISNA(MATCH(G$5,$C$6:$C$9,FALSE)),"",IF(OR(AND(ROUND($F6,5)>=ROUND(INDEX($B$6:$B$9,MATCH(G$5,$C$6:$C$9,FALSE)),5),ROUND($F6,5)<=ROUND(INDEX($B$6:$B$9,MATCH(G$5,$C$6:$C$9,FALSE))+TIME(0,30,0),5)),AND(ROUND($F6+1,5)>=ROUND(INDEX($B$6:$B$9,MATCH(G$5,$C$6:$C$9,FALSE)),5),ROUND($F6+1,5)<=ROUND(INDEX($B$6:$B$9,MATCH(G$5,$C$6:$C$9,FALSE))+TIME(0,30,0),5))),11,""))
That formula structure looks like
=If(isna(),"",if(or(and(today,today),and(tomorrow,tomorrow)),11,""))
This formula is already getting too big. If you triple it for your three voltages, it will be huge. You should consider writing a UDF in VBA. It won't be as quick to calculate, but will probably be more maintainable.
If you want to stick with a formula, you could put the wattage in row 4 above the house number. Then in another table, list the wattages and minutes to charge. So in, say, B12:C14 you have
3.7 120
11 30
22 15
Now where you have 11 in your formula, you'd have G$4, and in the two places you have TIME(0,30,0), you'd have TIME(0,INDEX($C$12:$C$14,MATCH(G$4,$B$12:$B$14,FALSE)),0). I re-arranged some stuff to make it more 'readable' (but it's still pretty tough) and here's the final formula
=IF(ISNA(MATCH(G$5,$C$6:$C$9,FALSE)),"",IF(OR(AND(ROUND($F6,5)>=ROUND(INDEX($B$6:$B$9,MATCH(G$5,$C$6:$C$9,FALSE)),5),ROUND($F6,5)<=ROUND(INDEX($B$6:$B$9,MATCH(G$5,$C$6:$C$9,FALSE))+TIME(0,INDEX($C$12:$C$14,MATCH(G$4,$B$12:$B$14,FALSE)),0),5)),AND(ROUND($F6+1,5)>=ROUND(INDEX($B$6:$B$9,MATCH(G$5,$C$6:$C$9,FALSE)),5),ROUND($F6+1,5)<=ROUND(INDEX($B$6:$B$9,MATCH(G$5,$C$6:$C$9,FALSE))+TIME(0,INDEX($C$12:$C$14,MATCH(G$4,$B$12:$B$14,FALSE)),0),5))),G$4,""))
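To make the logic of that long formula easier to follow, here is a rough Python sketch of the same test. It is not a translation of the workbook: times are expressed as minutes since midnight, the function names are hypothetical, and the wattage table simply copies the three rows listed above. The "today or tomorrow" OR in the formula is expressed here as a modulo over one day.

```python
MINUTES_PER_DAY = 24 * 60

# Wattage -> minutes needed to charge (the B12:C14 lookup table).
CHARGE_MINUTES = {3.7: 120, 11: 30, 22: 15}

def is_charging(slot_min, plug_in_min, duration_min):
    """True if the time slot lies inside the charging window, wrapping past midnight."""
    elapsed = (slot_min - plug_in_min) % MINUTES_PER_DAY  # minutes since plug-in
    return elapsed <= duration_min

def load_at(slot_min, plug_in_min, kw):
    """Return the load for this slot, or None where the formula would return ""."""
    if is_charging(slot_min, plug_in_min, CHARGE_MINUTES[kw]):
        return kw
    return None

# A car plugged in at 23:50 on an 11 kW charger is still charging at 00:10.
print(load_at(10, 23 * 60 + 50, 11))
print(load_at(40, 23 * 60 + 50, 11))
```

Keeping the lookup table and the window test in separate functions is the same split a VBA UDF would allow, which is the maintainability argument made above.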

How to sum total hours in a row while skipping certain values?

I study wildlife and currently, I am doing an analysis regarding how long my focal species goes off of the mountain (its main habitat) and into human settlements.
Here is a picture with the data: data
Anyways, as you can see there are three coloured columns. Yellow is data, green is time, and blue is whether the animal is on or off the mountain (with red being when the animal is off).
As you can see, this one particular animal went off on several occasions. In this case, he went off the mountain three times but stayed off at various lengths. As I have thousands of data points, I essentially would like to determine how long each "off the mountain" event lasted. That is, since I consider every time the animal went off the mountain to be a separate event, I would like to determine how long the animal was off the mountain for each excursion, separately. In this case, the animal went off three times and I would like to total those three events individually.
So, as stated, an event would be every single occasion that the animal left the mountain, stayed there (for however long), and eventually made its way back up.
Any help would be greatly appreciated.
The simplest way would be to count how many consecutive "Off" fixes there are in a run following an "On" fix, then multiply by 3 hours 20 minutes. You could do that like this, starting in (say) K2:
=IF(AND(G1="On",G2="Off"), MATCH("On",G3:G$100,0)*TIME(3,20,0)*24,0)
You could take it further by looking at the individual times of the fixes as well to get an upper and lower limit (e.g. for the first excursion it could be between 3 hours 20 minutes and 10 hours 40 minutes roughly).
Upper limit
=IF(AND(G1="On",G2="Off"), (INDEX(J3:J$100,MATCH("On",G3:G$100,0))-J1)*24,0)
Lower limit
=IFERROR(IF(AND(G1="On",G2="Off"), (INDEX(J3:J$100,MATCH("On",G3:G$100,0)-1)-J2)*24,0),0)
where my column J contains a datetime value formed by adding the date and time in columns A and B together.
This raises an issue about what happens when the animal is still off-mountain at the end of its data (it currently gives #N/A because MATCH is unable to find a cell containing "On"). You would need to decide how to treat this case if it ever occurs in practice.
Note when there is only one off-mountain measurement the lower limit is zero because in theory the animal could have left immediately before the measurement and returned immediately afterwards.
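The upper and lower limits can also be sketched in Python, which may help when checking the formulas against a few rows by hand. This is only an illustration under stated assumptions: the function name and the list-of-tuples input are made up, the fixes are assumed to belong to a single animal in time order, and the sample timestamps are invented at a regular 3 hour 20 minute spacing. An excursion that is still open at the end of the data is skipped, since how to treat it is left undecided above.

```python
from datetime import datetime

def excursion_limits(fixes):
    """fixes: list of (datetime, "On" or "Off") for one animal, in time order.

    Returns one (lower_hours, upper_hours) pair per completed excursion.
    """
    results = []
    i = 1
    while i < len(fixes):
        if fixes[i - 1][1] == "On" and fixes[i][1] == "Off":
            j = i
            while j < len(fixes) and fixes[j][1] == "Off":
                j += 1
            if j == len(fixes):
                break  # still off-mountain at the end of the data
            # Lower limit: first "Off" fix to last "Off" fix.
            lower = (fixes[j - 1][0] - fixes[i][0]).total_seconds() / 3600
            # Upper limit: last "On" before leaving to first "On" after returning.
            upper = (fixes[j][0] - fixes[i - 1][0]).total_seconds() / 3600
            results.append((lower, upper))
            i = j
        else:
            i += 1
    return results

sample = [
    (datetime(2020, 1, 1, 0, 0), "On"),
    (datetime(2020, 1, 1, 3, 20), "Off"),
    (datetime(2020, 1, 1, 6, 40), "Off"),
    (datetime(2020, 1, 1, 10, 0), "On"),
]
print(excursion_limits(sample))
```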
EDIT
To address the above issue where the animal is still off-mountain at the end of its data (and looking at the sample data it looks as if a different animal's data is immediately following the first animal's data) you would need this
=IF(AND(G1="On",G2="Off"), IFERROR(MATCH(1,(G3:G$100="On")*(E3:E$100=E2),0),MATCH(TRUE,E3:E$100<>E2,0))*TIME(3,20,0)*24,0)
which would have to be entered as an array formula using Ctrl+Shift+Enter.
You could argue that you might need to do some averaging for an incomplete off-mountain excursion like this which would make it even more complicated, but this is an Excel answer and can't go too far into the rights or wrongs of the analysis.
I guess a good starting-point would be knowing how you gather these statistics in the first place.

Calculating the best combination of coaches for passengers?

I have a table in Microsoft Excel that I'd like to use to calculate the best combination of coaches to house the supplied number of passengers. Here is a simplified version of the table:
I need to enter three formulas in the coach count column that calculate the best value-for-money combination of coaches that can carry all the passengers. For example, if there were 40 passengers, the result should be one 49-seat coach as opposed to two 20-seat coaches, as it's the cheapest combination.
I have no idea how I would work on implementing these formulas and would appreciate some pointers.
So far, all I have in C4 is
=IF(MOD(B1, A4) = 0, B1 / A4, 0) which only works with multiples of 20 and does not account for combinations of coaches or cost efficiency.
Perhaps this is too complex of a task to implement in formulae? Would I be better off using a VB macro, or simply leaving it to the user to calculate the best combination?
There are two ways to address this problem. I will outline both solutions:
Option 1: In Worksheet Formulas
I'd have to spend more time on this in order to find a really elegant solution for this route, but here's a functional approach that should work well enough. Here are some quick highlights:
Firstly, you need to add a column to your table that outlines the minimum number of seats a coach carries. This helps to facilitate the vlookup.
Secondly, make sure that your lookup table is sorted in ascending order according to the minimum # of seats.
I have made the assumption that the most effective pricing model is to get the majority of people onto the largest coach (or many of the largest coach), and then to use the smallest coach that would accommodate the remaining people. If this is not a fair assumption, then this solution may not be appropriate.
Here are screenshots of the final outcome:
And the formulas required to make it: (and a link in case you need to blow it up: http://i.stack.imgur.com/hKjQK.jpg)
Note: You'll notice that the previous answer is incorrect, as it suggested that 74 people would need to spend $180 instead of $140.
Option 2: Using Excel's Solver Add-In
Enable the solver add-in (File --> Options --> Add-ins --> Excel Add-ins (Manage) --> Solver Add-In)
Configure worksheet as shown:
UI:
The formulas:
On the Ribbon, go to the Data Tab, Analysis Group, & Click Solver.
Configure Solver as follows:
Click "Solve" and then click "Ok"
Final Outcome:
This seems to be a classic linear programming problem (strictly an integer one, since the coach counts must be whole numbers). You need to minimize total cost = (number of coach 1 times 50) + (number of coach 2 times 60) + (number of coach 3 times 80), subject to the constraint that (number of coach 1 times 20) + (number of coach 2 times 29) + (number of coach 3 times 49) is greater than or equal to (number of attendees), and all numbers of coaches are integers greater than or equal to zero. I think Excel's Solver is the tool for such a problem. You don't need to implement any of the solution yourself; you just set it up and Solver handles the algorithmic stuff.
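For a problem this small, the optimum can also be checked by brute force. The Python sketch below is not what Solver does internally; it simply tries every combination of coach counts up to a safe upper bound and keeps the cheapest one that seats everybody. The seat and cost figures are the ones quoted above, and the function name is an assumption for illustration.

```python
from itertools import product

COACHES = [(20, 50), (29, 60), (49, 80)]  # (seats, cost) per coach type

def cheapest_combination(passengers):
    """Return (total_cost, counts) for the cheapest feasible mix of coaches."""
    # No optimal plan needs more of one type than would carry everyone alone.
    limits = [-(-passengers // seats) for seats, _ in COACHES]  # ceiling division
    best = None
    for counts in product(*(range(limit + 1) for limit in limits)):
        seats = sum(n * s for n, (s, _) in zip(counts, COACHES))
        if seats < passengers:
            continue  # not enough seats, constraint violated
        cost = sum(n * c for n, (_, c) in zip(counts, COACHES))
        if best is None or cost < best[0]:
            best = (cost, counts)
    return best

print(cheapest_combination(40))
print(cheapest_combination(74))
```

For 74 passengers this finds one 29-seat and one 49-seat coach at a total of 140, matching the figure quoted in the other answer.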
Try this:
Sample calculation
With Formulas showing
The idea is to check for the largest coach first, using the integer value of the division Count/Seats. Then do the same for the 2nd largest coach with the remaining people. Etc etc.
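A rough Python sketch of this largest-first idea is below. The screenshots are not available here, so one detail is an assumption: any passengers left over after the last integer division are put on one extra coach of the smallest size. The function name and the seat and cost figures (taken from elsewhere on this page) are also illustrative.

```python
def greedy_coaches(passengers, coaches):
    """coaches: list of (seats, cost). Returns (counts_by_seats, total_cost)."""
    remaining = passengers
    counts = {}
    # Largest coach first: take as many full coaches as possible.
    for seats, _ in sorted(coaches, reverse=True):
        counts[seats] = remaining // seats
        remaining -= counts[seats] * seats
    # Assumption: leftovers go on one more coach of the smallest size.
    if remaining > 0:
        smallest = min(seats for seats, _ in coaches)
        counts[smallest] += 1
    total = sum(counts[seats] * cost for seats, cost in coaches)
    return counts, total

print(greedy_coaches(74, [(20, 50), (29, 60), (49, 80)]))
```

Under that assumption, 74 passengers come out as one 49-seat and two 20-seat coaches costing 180, which is more than the 140 achievable with one 49-seat and one 29-seat coach. A largest-first pass is simple to build from formulas but is not guaranteed to find the cheapest combination.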

Microsoft Excel 2007 Always round up even if the decimal is under 0.5

So I'm creating a spreadsheet that determines the cost of materials and the quantity of each material needed to complete a desired project, using input from myself. Right now the desired project is a 250x9 wall that requires replacing all the 4x8 sheets of wood with OSB and installing vinyl siding.
The issue I'm running into is that I cannot get it to always round up. By that I mean even if the value is 1.1 it should round up. In this specific case I am buying nails for my nail gun in boxes of 2,000, and each sheet of OSB will have 32 nails in it. If the 250x9 area requires 70.3125 sheets of OSB, I still have to buy 71 sheets. With 71 sheets I need 2,272 nails, which works out to 1.136 boxes of nails. However, I can't seem to get it to show this as 2 boxes, because again I need to purchase more than one box to complete the project.
So with that being said, if I take the number of OSB sheets needed, 70.3125, and place it in a formula with a ROUNDUP function, it still rounds down (it gives me a headache that there is a ROUNDUP and a ROUNDDOWN function and it will still round down on me). Perhaps the way I am using it in the formula is incorrect; I'm not sure. So let me walk through the formulas used, and you can let me know if I'm doing something wrong or if there is a function or set of functions that I can use to solve this issue.
=SUM(((B30*C30)+(B35*C35)+(E30*F30)+(H30*I30))/(E9*G9))
This says that if I add Wall1 L*W, Wall2 L*W, Wall3 L*W and Wall4 L*W and divide by OSB H*W, I get the number of sheets needed, which in this case is basically 2250/32. But it's set up so that I can input the information for individual walls in different areas and have it spit out the total sq ft for each wall, with an individual breakdown per wall of the material needed and the cost per sq ft of material, bleh bleh bleh. The point is that I take the result, 70.3125, move it to a different worksheet under "Sheets OSB Needed", and in that box I have
=ROUNDUP(Sheet1!A9,1)
Here I'm asking it to round up A9, the result of the above formula, in intervals of 1. But the output is still 70 instead of 71. It is much the same with the nails needed: they can be calculated in a few different ways, but regardless, the number of nails needed divided by 2,000 gives a decimal answer below 1.5, and with much the same formula it too gives me 1 instead of 2. I suppose I could achieve my desired result with the TRUNC and MOD functions working together, using multiple cells to output the different portions of the data. But is there a way to do this that doesn't use up so many cells?
C7
=Trunc(A9)
Removes Decimal from 70.3125
C8
=MOD(A9,1)
Outputs decimals from 70.3125
C9
=IF(C8<1,"1",C8)
If Decimals are < a whole number make it a whole number
C3
=SUM(C7+C9)
Add the whole number to the Trunc Number to get value desired.
I'm already seeing an issue with this: if there are no decimals in the sheets needed, wouldn't it always add one, because the decimal part would be 0? How can I handle this? Isn't there an easier way to do this, or a way to write it so that it's all nested into one calculation, or at least mostly into one, without creating a circular reference of some sort?
You need to change the second parameter to 0. ROUNDUP(70.3125, 1) is 70.4; the .4 must be getting dropped elsewhere or lost in formatting.
ROUNDUP(70.3125, 0) will give 71.
The second parameter of ROUNDUP is the number of decimal places, so to round up to an integer it should be 0, not 1.
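To see the effect of the second parameter outside Excel, here is a small Python sketch. `round_up` is a hypothetical helper that imitates ROUNDUP for positive values only (Excel rounds away from zero, which differs from a plain ceiling for negative numbers); the sheet and nail figures are the ones from the question.

```python
import math

def round_up(value, digits=0):
    """Imitate ROUNDUP(value, digits) for positive values."""
    factor = 10 ** digits
    return math.ceil(value * factor) / factor

sheets_exact = 2250 / 32                 # 70.3125 sheets of OSB
print(round_up(sheets_exact, 1))         # one decimal place kept
print(round_up(sheets_exact, 0))         # whole sheets to buy

sheets = math.ceil(sheets_exact)         # 71 sheets
nails = sheets * 32                      # 32 nails per sheet
boxes = math.ceil(nails / 2000)          # boxes of 2,000 nails
print(sheets, nails, boxes)
```

The same chaining works in a single Excel cell by nesting one ROUNDUP(..., 0) inside another, which avoids the TRUNC and MOD helper cells.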

Resources