Timetable schedule in Excel

I'm trying to make a simple timetable/schedule for some subjects taught in a school, using the Solver in Excel.
This is the setup:
The Cost, Timetable/Schedule, Demand, and Supply matrices are hardcoded. "Math Assigned" holds the sums of the respective columns and rows, and the objective is to minimize the "total cost", which is defined as the SUMPRODUCT of the Cost matrix and the Timetable/Schedule matrix.
Using the Solver in Excel I'm only able to assign one subject, in this case Math.
Here is what I wrote in the solver:
This results in the Math classes being allocated:
Question
How can I best allocate the other subjects as well?
EDIT
I tried adding a Timetable/Schedule for the subject English (now there is one for Math and one for English).
Now the total cost to minimize is the SUMPRODUCT of the Math timetable and the Cost matrix plus the SUMPRODUCT of the English timetable and the same Cost matrix.
I also added constraints for English corresponding to the constraints for Math.
How can I prevent the Solver (perhaps with a constraint) from placing ones in the same entries of the binary matrices (Math Timetable and English Timetable), since they use the same cost matrix?
An example where the two matrices share a 1 in the same entry:
Solution
I solved it myself by adding a helper matrix that sums the two timetables entry by entry, and constraining that matrix to be less than or equal to a matrix containing only ones.
However, if someone can come up with a less extensive solution, I will accept it. This is just a minimal example, and I would love to make it simpler and more efficient.
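A minimal sketch of that helper-matrix constraint, with hypothetical ranges: say the Math timetable is the binary range B2:F6, the English timetable is B10:F14, and the Cost matrix is B18:F22.

Helper matrix, e.g. in B26: =B2+B10, filled across B26:F30
Solver constraint: B26:F30 <= 1 (each slot can hold at most one subject)
Objective to minimize: =SUMPRODUCT($B$18:$F$22,B2:F6)+SUMPRODUCT($B$18:$F$22,B10:F14)

With both timetables constrained to binary, the helper constraint is what stops the two matrices from sharing a 1 in the same entry.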

Related

Excel Solver, way to maximize two values?

I am a user of Microsoft Excel's Solver, and am pretty sure it is not possible to maximize for two values. I was wondering if anyone might have another clever way to do this.
Basically I have a column of about 200 numbers between 1 and 30, and I need to pull out 9 to 10 values based on a couple of other constraints. I don't just want to maximize this value; I also want to maximize a probability value (ranging from 0 to 1).
Adding them up won't work, as that would grossly undervalue the probability value, and multiplying may do the opposite by overvaluing the probability. Any strategies to handle this problem would be greatly appreciated.
This is an example of multi-objective optimization, which has an extensive literature. As the Wikipedia article shows, this can lead to some pretty deep waters.
By far the easiest approach is linear scalarization. This refers to replacing a vector of 2 (or more) objective functions by a single (hence scalar) objective function which is a linear combination of the objective functions. What you can do with the Solver is create 2 cells to hold the relative weights to assign to the two objectives. These will be 2 numbers between 0 and 1 which sum to 1. Then create a new objective function which is the SUMPRODUCT (linear combination) of these weights and the objectives. Then just use the Solver to optimize this objective function. If you aren't happy with the results, adjust the weights. There is no one right answer. One of the advantages of this approach is that it allows a decision maker to clarify the relative importance of the objectives.
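A minimal sketch of this, with hypothetical cells: the two objective values computed in B1:B2, the weights in C1:C2 (with C2 entered as =1-C1 so they sum to 1):

Combined objective: =SUMPRODUCT(B1:B2,C1:C2)

Point Solver at this single cell, solve, and re-run with a different C1 if the trade-off between the two objectives doesn't look right.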

Optimization of a list in Excel with Variables

I have a list of 153 golfers with associated salaries and average scores.
I want to find the combination of 6 golfers that optimizes avg score and keeps salary under $50,000.
I've tried using Solver, but I am stuck! Can anyone help please? :)
Illustrating a solution that is pretty close to what ErwinKalvelagen suggested.
Column A is the names of the 153 golfers
Column B is the golfers' salaries (generated by =RANDBETWEEN(50, 125)*100, filled down, then Copy/Paste Values)
Column C is the golfers' average scores (generated by =RANDBETWEEN(70, 85), filled down, then Copy/Paste Values)
Column D is a 0 or 1 to indicate if the golfer is included.
Cell F2 is the total salary, given by =SUMPRODUCT(B2:B154,D2:D154)
Cell G2 is the number of golfers, given by =SUM(D2:D154)
Cell H2 is the average score of the team, given by =SUMPRODUCT(C2:C154,D2:D154)/G2
The page looks like this, before setting up Solver ...
The Solver setup looks like this ...
The Solver help says to use the Evolutionary engine for non-smooth problems. In Options, I needed to increase the Maximum Time without Improvement from 30 to 300 seconds (60 may have been good enough).
It took a couple of minutes for it to complete. It reached the solution of 70 fairly quickly, but spent more time looking for a better answer.
And here are the six golfers it came up with.
Among the teams averaging 70, it could have found one with a lower total salary.
In cell I2 I added the formula =F2+F2*(H2-70), which is essentially the salary penalized by increases in average score above 70 ...
... and use the same Solver setup, except to minimize Cell I2 instead of H2 ...
and these are the golfers it chose ...
Again - it looks like there is still a better solution. It could have picked Name97 instead of Name96.
This is a simple optimization problem that can be solved using Excel's Solver (just use the "Simplex LP" engine -- somewhat of a misnomer here, as we will use it to solve an integer programming, or MIP, problem).
You need one column with 153 binary (BIN) variables (Excel's limit is, I believe, 200). Make sure you add a constraint to set the values to binary. Let's call this column INCLUDE; Solver will fill it with 0 or 1 values. Sum these values, and add a constraint that this sum equals 6. Then add a column with INCLUDE * SCORE. Sum this column; this is your objective (optimizing the average is the same as optimizing the sum, since the team size is fixed). Then add a column with INCLUDE * SALARY and sum these. Add a constraint that this sum is <= 50k. Press Solve and you're done.
I don't agree with claims that Excel will crash on this or that it does not fit within the limits of Excel's Solver (I actually tried it out).
I prefer the simplex method over the evolutionary solver, as the simplex solver is more suitable for this problem: it is faster (simplex takes < 1 second) and provides optimal solutions (the evolutionary solver often gives suboptimal ones).
If you want to solve this problem with Matlab, a function to look at is intlinprog (Optimization Toolbox).
To be complete: this is the mathematical model we are solving here:
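In symbols, with \(x_i = 1\) if golfer \(i\) is included (a sketch reconstructed from the description above):

\[
\begin{aligned}
\min\ & \sum_{i=1}^{153} \mathrm{score}_i \, x_i \\
\text{s.t.}\ & \sum_{i=1}^{153} x_i = 6,\qquad \sum_{i=1}^{153} \mathrm{salary}_i \, x_i \le 50000,\qquad x_i \in \{0,1\}.
\end{aligned}
\]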
Results with random data:
....

Calculating the best combination of coaches for passengers?

I have a table in Microsoft Excel that I'd like to use to calculate the best combination of coaches to house the supplied number of passengers. Here is a simplified version of the table:
I need to enter three formulas in the coach count column that calculate the best value-for-money combination of coaches that can carry all the passengers. For example, if there were 40 passengers, the result should be one 49-seat coach as opposed to two 20-seat coaches, as it's the cheaper combination.
I have no idea how I would work on implementing these formulas and would appreciate some pointers.
So far, all I have in C4 is
=IF(MOD(B1, A4) = 0, B1 / A4, 0)
which only works with multiples of 20 and does not account for combinations of coaches or cost efficiency.
Perhaps this is too complex of a task to implement in formulae? Would I be better off using a VB macro, or simply leaving it to the user to calculate the best combination?
There are two ways to address this problem. I will outline both solutions:
Option 1: In Worksheet Formulas
I'd have to spend more time on this to find a really elegant solution, but here's a functional approach that should work well enough. Some quick highlights:
Firstly, you need to add a column to your table giving the minimum number of passengers for which each coach is used. This facilitates the VLOOKUP.
Secondly, make sure that your lookup table is sorted in ascending order by that minimum-passengers column.
I have made the assumption that the most effective pricing model is to get the majority of people onto the largest coach (or several of the largest coaches), and then to use the smallest coach that would accommodate the remaining people. If this is not a fair assumption, then this solution may not be appropriate.
Here are screenshots of the final outcome:
And the formulas required to make it: (and a link in case you need to blow it up: http://i.stack.imgur.com/hKjQK.jpg)
Note: You'll notice that the previous answer is incorrect, as it suggested that 74 people would need to spend $180 instead of $140.
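Since the actual formulas live in the screenshots, here is a rough plain-text sketch of the same idea, with hypothetical cells: passengers in B1, and the lookup table in F2:H4 with columns Min passengers / Seats / Cost, holding rows 0/20/$50, 21/29/$60, 30/49/$80, sorted ascending by the first column:

49-seat coaches: =INT(B1/49)
Remaining passengers: =MOD(B1,49)
Smallest coach that fits the remainder: =VLOOKUP(MOD(B1,49),$F$2:$H$4,2)

The approximate-match VLOOKUP returns the row whose minimum-passengers threshold is the largest value not exceeding the remainder, which is exactly the smallest coach that still fits it (the zero-remainder case needs a small IF guard).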
Option 2: Using Excel's Solver Add-In
Enable the solver add-in (File --> Options --> Add-ins --> Excel Add-ins (Manage) --> Solver Add-In)
Configure worksheet as shown:
UI:
The formulas:
On the Ribbon, go to the Data Tab, Analysis Group, & Click Solver.
Configure Solver as follows:
Click "Solve" and then click "Ok"
Final Outcome:
This seems to be a classic linear programming problem (strictly an integer program, since coach counts must be whole numbers). You need to minimize total cost = (number of coach 1 times 50) + (number of coach 2 times 60) + (number of coach 3 times 80), subject to the constraint that (number of coach 1 times 20) + (number of coach 2 times 29) + (number of coach 3 times 49) is greater than or equal to the number of attendees, with all coach counts greater than or equal to zero. I think Excel's Solver is the tool for such a problem: you don't need to implement any of the solution yourself; you just set it up and Solver handles the algorithmic work.
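In symbols, with \(x_1, x_2, x_3\) the numbers of 20-, 29-, and 49-seat coaches and \(N\) the number of attendees:

\[
\min\; 50x_1 + 60x_2 + 80x_3 \quad \text{s.t.} \quad 20x_1 + 29x_2 + 49x_3 \ge N, \quad x_1, x_2, x_3 \ge 0 \text{ and integer.}
\]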
Try this:
Sample calculation
With Formulas showing
The idea is to check the largest coach first, using the integer part of the division Count/Seats. Then do the same for the 2nd-largest coach with the remaining people, and so on; a sketch follows.
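A minimal sketch with hypothetical cells: passenger count in B1, seats 49/29/20 in A4:A6 (largest first), and the coach counts in C4:C6:

C4: =INT($B$1/A4)
C5: =INT(($B$1-A4*C4)/A5)
C6: =ROUNDUP(($B$1-A4*C4-A5*C5)/A6,0)

Note that this greedy fill is not always the cheapest combination: for 40 passengers it yields one 29-seat plus one 20-seat coach ($110) rather than the single 49-seat coach ($80).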

Compute statistical significance with Excel

I have 2 columns and multiple rows of data in Excel. Each column represents an algorithm, and the values in the rows are the results of these algorithms with different parameters. I want to run a statistical significance test on these two algorithms in Excel. Can anyone suggest a function?
As a result, it would be nice to state something like "Algorithm A performs 8% better than Algorithm B with .9 probability (or a 95% confidence interval)".
The Wikipedia article accurately explains what I need:
http://en.wikipedia.org/wiki/Statistical_significance
It seems like a very easy task, but I failed to find a suitable statistical function.
Any advice on a built-in Excel function, or function snippets, is appreciated.
Thanks.
Edit:
After tharkun's comments, I realized I should clarify some points:
The results are just real numbers between 1 and 100 (they are percentage values). Each row represents a different parameter, and the values in a row are each algorithm's result for that parameter. The results do not depend on each other.
When I take the average of all values for Algorithm A and Algorithm B, I see that the mean of Algorithm A's results is 10% higher than Algorithm B's. But I don't know if this is statistically significant or not. In other words, maybe for one parameter Algorithm A scored 100 percent higher than Algorithm B, and for the rest Algorithm B has higher scores, but just because of this one result the difference in averages is 10%.
And I want to do this calculation using just excel.
Thanks for the clarification. In that case you want an independent-samples t-test, meaning you want to compare the means of two independent data sets.
Excel has a function TTEST; that's what you need.
For your example you should probably use two tails and type 2.
The formula outputs a probability value known as the probability of alpha error. This is the error you would make if you assumed the two datasets are different when they aren't. The lower the alpha-error probability, the higher the chance your sets are different.
You should only accept that the two datasets differ if the value is lower than 0.01 (1%), or for critical outcomes even 0.001 or lower. You should also know that the t-test needs at least around 30 values per dataset to be reliable enough, and that the type 2 test assumes equal variances of the two datasets. If equal variances are not given, you should use the type 3 test.
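For instance, with hypothetical ranges, Algorithm A's results in A2:A31 and Algorithm B's in B2:B31:

=TTEST(A2:A31,B2:B31,2,2)

This returns the two-tailed p-value for the equal-variance (type 2) test; use 3 as the last argument if the variances differ.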
http://depts.alverno.edu/nsmt/stats.htm

Statistically removing erroneous values

We have an application where users enter prices all day. These prices are recorded in a table with a timestamp and then used to produce charts of how the price has moved. Every now and then a user enters a price wrongly (e.g. puts in one zero too many or too few), which somewhat ruins the chart (you get big spikes). We've even put in an extra confirmation dialog if the price moves by more than 20%, but this doesn't stop them entering wrong values.
What statistical method can I use to analyse the values before I chart them to exclude any values that are way different from the rest?
EDIT: To add some meat to the bones: say the prices are share prices (they are not, but they behave in the same way). Prices can move significantly up or down during the day. On an average day we record about 150 prices, and sometimes one or two are way off; other times they are all good.
Calculate and track the standard deviation for a while. After you have a decent backlog, you can disregard the outliers by seeing how many standard deviations away they are from the mean. Even better, if you've got the time, you could use the info to do some naive Bayesian classification.
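A minimal sketch of that check in Excel terms, with hypothetical ranges: recorded prices in A2:A151 and a newly entered price in B1:

=ABS(B1-AVERAGE($A$2:$A$151))/STDEV($A$2:$A$151)

Flag the entry for review when this number of standard deviations exceeds a chosen threshold, say 3.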
That's a great question, but it may lead to quite a bit of discussion, as the answers could be very varied. It depends on:
how much effort you are willing to put into this;
whether some entries could genuinely differ by +/-20% (or whatever test you invent), meaning there will always be a need for some human intervention;
and, to invent a relevant test, I'd need to know far more about the subject matter.
That being said the following are possible alternatives.
A simple test against the previous value (or the mean/mode of the previous 10 or 20 values) would be straightforward to implement.
The next level of complexity would involve some statistical measurement of all values (or the previous x values, or the values of the last 3 months); a normal (Gaussian) distribution would let you give each value a degree of certainty as to whether it is a mistake or accurate. This degree of certainty would typically be expressed as a percentage.
See http://en.wikipedia.org/wiki/Normal_distribution and http://en.wikipedia.org/wiki/Gaussian_function. There are adequate links from these pages to help in programming this; also, depending on the language you're using, there are likely to be functions and/or plugins available to help.
A more advanced method would be some sort of learning algorithm that takes other parameters into account (on top of the last x values). It could consider the product type or manufacturer, for instance, or even the time of day or the user who entered the figure. This option seems way over the top for what you need, however; it would require a lot of work to code and to train.
I think the second option is the right one for you. Using the standard deviation (a lot of languages contain a function for this) may be a simpler alternative; it is simply a measure of how far a value has deviated from the mean of the x previous values. I'd put the standard-deviation option somewhere between options 1 and 2.
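A minimal sketch of the degree-of-certainty idea from the second option, in Excel terms with hypothetical ranges (previous values in A2:A21, new value in B1):

=2*(1-NORM.S.DIST(ABS(B1-AVERAGE($A$2:$A$21))/STDEV($A$2:$A$21),TRUE))

Under a normal assumption this is the two-sided probability of seeing a value at least this far from the mean; the smaller it is, the more likely the entry is a mistake.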
You could measure the standard deviation in your existing population and exclude those that are greater than 1 or 2 standard deviations from the mean?
It's going to depend on what your data looks like to give a more precise answer...
Or graph a moving average of prices instead of the actual prices.
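For example (a hypothetical layout with prices in column A starting at A2), a 10-point moving average entered in B11 and filled down:

=AVERAGE(A2:A11)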
Quoting from here:
Statisticians have devised several methods for detecting outliers. All the methods first quantify how far the outlier is from the other values. This can be the difference between the outlier and the mean of all points, the difference between the outlier and the mean of the remaining values, or the difference between the outlier and the next closest value. Next, standardize this value by dividing by some measure of scatter, such as the SD of all values, the SD of the remaining values, or the range of the data. Finally, compute a P value answering this question: If all the values were really sampled from a Gaussian population, what is the chance of randomly obtaining an outlier so far from the other values? If the P value is small, you conclude that the deviation of the outlier from the other values is statistically significant.
Google is your friend, you know. ;)
For your specific question of plotting, and your specific scenario of an average of 1-2 errors per day out of 150, the simplest thing might be to plot trimmed means, or the range of the middle 95% of values, or something like that. It really depends on what value you want out of the plot.
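Excel has built-ins for both, e.g. with the day's prices in a hypothetical range A2:A151:

=TRIMMEAN(A2:A151,0.05)   (mean after discarding the top and bottom 2.5% of values)
=PERCENTILE(A2:A151,0.025) and =PERCENTILE(A2:A151,0.975)   (bounds of the middle 95%)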
If you are really concerned with the true max and true min of a day's prices, then you have to deal with the outliers as outliers and properly exclude them, probably using one of the outlier tests previously proposed (a data point is x% more than the next point, or than the last n points, or more than 5 standard deviations away from the daily mean). Another approach is to look at what happens after the outlier: if it is an outlier, it will show a sharp upturn followed by a sharp downturn.
If, however, you care about the overall trend, plotting the daily trimmed mean, median, and 5% and 95% percentiles will portray the history well.
Choose your display methods and how much outlier detection you need based on the analysis question. If you care about medians or percentiles, the outliers are probably irrelevant.
