I am trying to port a very simple Excel spreadsheet to MATLAB code (I am not completely satisfied with Excel Solver!). My problem is this:
I have two materials (say A and B) with their properties (density, viscosity, etc.) and prices, and I mix them to obtain a third material (say C), whose properties are a mix (not necessarily linear) of the two and which, if it respects some limits (i.e. density below X, viscosity below Y), can be sold for a certain price. What I have is a function that takes the quantities of A and B, their properties, their prices, material C's limits, and material C's price. It then computes a profit (i.e. price C * (quantity A + quantity B) - (price A * quantity A + price B * quantity B)) and an indicator vector that tells me whether all the property limits are satisfied in material C (basically it compares limits and actual properties, and puts 0 if OK and 1 otherwise, so if all properties are respected the mean of that vector should be 0).
Thus I have:
[profit, ok] = blend([qA, qB], [specA, specB], [pA, pB], [limits], pC)
and I want to maximize profit by changing quantity A and quantity B, subject to the ok vector being 0 and qA+qB being less than a specified maximum quantity. The real problem is imposing that the ok vector equals 0. I thought about moving the limit check outside of the function, but I can only check whether the limits are respected once the function has calculated the properties of the blend, so I cannot put that check outside. Is there a solution to this? Many thanks!
What you are looking for is called nonlinear constrained optimization, and most likely specifically the function fmincon.
I am afraid it is probably best if you unpack your function to fit into the standard scheme. This is incidentally a very good thing to learn, as it is the common way to express such problems.
The call is
x=fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon)
You have two parameters, giving the quantities of your materials, or if you normalize the total quantity to 1, you could get by with just one parameter x and express the other as (1-x) in all equations.
So you would need to write one function fun that just computes the profit based on the parameters.
The material constraints are then put into the remaining parameters. You could put all constraints into the ceq return of nonlcon, as explained here, so that it returns zero when the mix is OK.
However, it is cleaner and more efficient to encode all linear constraints using the A and b matrices.
For more details I would require the actual constraints and functions that you have.
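To make this concrete, here is a minimal sketch of how the pieces map onto the fmincon arguments. All prices, properties and limits below are made-up placeholders, and the quantity-weighted mixing rule inside nonlcon is an assumption; substitute the property calculation from your own blend function there.

% Minimal fmincon sketch -- placeholder numbers, assumed linear mixing rule.
specA = [0.85 3.2];    % [density, viscosity] of A (illustrative values)
specB = [0.95 1.1];    % properties of B
pA = 400; pB = 550;    % purchase prices of A and B
pC = 520;              % selling price of blend C
limits = [0.92 2.5];   % maximum allowed density and viscosity of C
Qmax = 100;            % maximum total quantity qA + qB
% fmincon minimizes, so return the NEGATIVE profit.
profitFun = @(x) -( pC*(x(1)+x(2)) - (pA*x(1) + pB*x(2)) );
% Nonlinear inequality constraints c(x) <= 0: blend properties under the limits.
% deal() lets one anonymous function return both c and ceq (ceq is empty here).
nonlcon = @(x) deal( (x(1)*specA + x(2)*specB)/(x(1)+x(2)) - limits, [] );
A  = [1 1];  b = Qmax;   % linear constraint qA + qB <= Qmax
lb = [0 0];  ub = [];    % quantities cannot be negative
x0 = [1 1];              % starting guess
xOpt   = fmincon(profitFun, x0, A, b, [], [], lb, ub, nonlcon);
profit = -profitFun(xOpt);

If you normalize the total quantity as suggested above, the same setup collapses to a single variable x, with B's share expressed as (1-x).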
I have to run a Monte Carlo where, for some products, certain exchanges are related to each other in the sense that my process can take as input any of the products in different (bounded) proportions but with a fixed sum.
Example:
my product a takes as input a total of 10 kg of x, y, and z altogether; x has a uniform distribution that goes from 0 to 4 kg, y from 1 to 6, and z from 3 to 8, and their sum must equal 10. So, at every iteration I would need to draw a random number for each of my three exchanges within their bounds while making sure that their sum is always 10.
I have seen that in stats_array it is possible to set the bounds of the distributions and thus create values in a specified interval but this would not ensure that the sum of my random vector equals the fixed sum of 10.
I am wondering if there is already a (relatively) straightforward way to implement this in bw2.
Otherwise the only way I see to make this feasible is to create all the uncertainty parameters with ParameterVectorLCA, tweak the values in the array for those products that must meet the aforementioned requirements (e.g. with something like this or this) and then use this array with modified parameters to re-run my MC.
We are working on this in https://github.com/PascalLesage/brightway2-presamples, but it isn't ready yet. I don't know of any way to do this currently without hacking something together by subclassing the MonteCarloLCA.
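Independently of how it gets wired into bw2, the sampling itself can be hacked together with plain rejection sampling. Below is a minimal sketch for the 10 kg example above, written in the same MATLAB style as the first example in this document; the logic ports one-to-one to Python/numpy for use inside a Monte Carlo run. Note that once the fixed-sum constraint is enforced, the accepted values are no longer exactly uniform on their stated ranges.

nSamples = 1000;
samples = zeros(nSamples, 3);
for i = 1:nSamples
    while true
        x = 0 + 4*rand();        % uniform on [0, 4]
        y = 1 + 5*rand();        % uniform on [1, 6]
        z = 10 - x - y;          % forced by the fixed-sum constraint
        if z >= 3 && z <= 8      % accept only if z also respects its bounds
            samples(i, :) = [x y z];
            break
        end
    end
end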
I've read many articles about the Monte Carlo algorithm for approximating the preflop equity in NL holdem poker.
Unfortunately, it iterates over only a few possible boards to see what happens. The good thing about this is that you can put in exact hand ranges.
Well, I don't need exact ranges. It's good enough to say "Top 20% vs Top 35%".
Is there a simple formula to tell (or approximate) the likelihood of winning or losing? We can ignore splits here.
I can imagine that the way to calculate the odds becomes much simpler if we just use two (percentile) numbers instead of all possible card combinations.
The thing is, I don't know whether, for example, the case "Top 5% vs Top 10%" is equivalent to "Top 10% vs Top 20%".
Does anyone know of a usable relation or a formula for these inputs?
Thanks
Okay, I've done a bit of analytical work and I came up with the following.
The Formula
eq_a(a, b) := 1/2 - 1/(6*ln(10)) * ln(a/b)
Or if you like:
eq_a(a, b) := 0.5 - 0.072382 * ln(a/b)
Where a is the range, expressed as a fraction (0 to 1), for player a. Same for b.
The function outputs the equity for player a. To get the equity for player b just swap the two ranges.
When we plot the function it will look like this: (Where a = x and b = y)
As you can see it's very hard to get an equity greater than 80% preflop (as even AA isn't that good mostly).
How I came up with this
After doing some analysis I became aware of the fact that the probability of winning depends on just the ratio of the two ranges (the same holds for multiway pots).
So:
eq_a(a, b) = eq_a(a * h, b * h)
And yes, Top 5% vs Top 10% has the same equities as Top 50% vs Top 100%.
The way I got the formula: I ran some regressions on sample data I had calculated with an app and picked the best fit (the logarithmic one). Then I tuned it using special cases like eq_a(0.1, 1) = 2/3 and eq_a(a, a) = 1/2.
It would be great if someone would do the work for multiway preflop all-ins.
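For reference, here is a quick numerical check of the formula and of the special cases used for the fit (the exact constant 1/(6*ln(10)) is what makes eq_a(0.1, 1) = 2/3 and eq_a(a, a) = 1/2 come out exactly):

eq_a = @(a, b) 0.5 - 1/(6*log(10)) * log(a ./ b);
eq_a(0.10, 1.00)   % 0.6667 = 2/3
eq_a(0.30, 0.30)   % 0.5000, equal ranges
eq_a(0.05, 0.10)   % ~0.55, same as eq_a(0.50, 1.00) since only the ratio matters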
I have a dataset of size (61573, 25). The rows represent users, whereas the columns represent views of particular movie genres. For example, data[i,j] == 3 means that user i has viewed 3 movies of genre j in total. As expected, rows are sparse and right-skewed.
What I would like to do is to compute how engaged a user is with each of the 25 movie genres by assigning to him one of the following tags: {VL, L, A, H, VH}.
What I have tried so far is to compute z-scores, either row- or column-wise (I haven't tried to standardize values twice, though, i.e. first on rows and then on columns), and then apply the following mapping depending on how far away the z-scores are from 0:
(-oo, -2] --> VL
(-2, -1] --> L
(-1, +1) --> A
[+1, +2) --> H
[+2, +oo) --> VH
In either case, my problem is that the results seem very bad in most of the cases, probably because most values lie between -1 and +1 and are thus almost always tagged as A (i.e. average). So, what else should I try, in your opinion? How would YOU approach this problem?
The z-scores clearly are not the right way to go.
The reason is that they are based on the assumption that your data is normally distributed. I have strong doubts that your data is normally distributed; in particular, it probably doesn't have any negative values, does it?
Have you tried just using quantiles? top 10%, bottom 10% etc.?
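If you go the quantile route, a minimal per-genre tagging sketch could look like the following (the 10/30/70/90 cut points are only an illustration, and with heavily zero-inflated columns some quantiles may coincide, in which case the tied edges need to be merged first):

col = data(:, j);                                      % views of one genre, all users
edges = [-Inf, quantile(col, [0.10 0.30 0.70 0.90]), Inf];
tags = discretize(col, edges, 'categorical', {'VL','L','A','H','VH'});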
I have two groups of data sets, A and B. I would like to know whether the average value of A
significantly differs from B's average. How can I do that in Excel 2007?
(I know there's a TTEST formula in Excel; I also know I don't need to use the paired version of it. What other parameters do I need to set, and how do I interpret the result?)
Thanks,
Jon
=ttest(array1,array2,tails,type)
array1 is data set A
array2 is data set B
tails: 1= one tailed, 2 = two tailed. Use one tailed if you are testing whether A is higher than B, or whether A is lower than B. Use two tailed if you are testing whether A is either higher or lower than B. (Probably 1 for your situation.)
type: You said you don't need paired, which is Type 1. Type 2 is if your data sets have equal variance, and Type 3 is if they have unequal variance. For example if the data points in A are all pretty close, but in B they are wildly different, use Type 3.
I have 2 columns and multiple rows of data in Excel. Each column represents an algorithm and the values in the rows are the results of these algorithms with different parameters. I want to run a statistical significance test on these two algorithms with Excel. Can anyone suggest a function?
As a result, it would be nice to state something like "Algorithm A performs 8% better than Algorithm B with .9 probability (or a 95% confidence interval)".
The wikipedia article explains accurately what I need:
http://en.wikipedia.org/wiki/Statistical_significance
It seems like a very easy task but I failed to find a scientific measurement function.
Any advice on a built-in Excel function or function snippets is appreciated.
Thanks..
Edit:
After tharkun's comments, I realized I should clarify some points:
The results are simply real numbers between 1 and 100 (they are percentage values). As each row represents a different parameter, the values in a row represent each algorithm's result for that parameter. The results do not depend on each other.
When I take the average of all values for Algorithm A and Algorithm B, I see that the mean of all results that Algorithm A produced is 10% higher than Algorithm B's. But I don't know if this is statistically significant or not. In other words, maybe for one parameter Algorithm A scored 100 percent higher than Algorithm B while Algorithm B has higher scores for all the rest, and the 10% difference in the averages is due to that single result alone.
And I want to do this calculation using just Excel.
Thanks for the clarification. In that case you want to do an independent-samples t-test, meaning you want to compare the means of two independent data sets.
Excel has a function TTEST, that's what you need.
For your example you should probably use two tails and type 2.
The formula will output a probability value known as the probability of an alpha error. This is the error you would make if you assumed the two datasets are different when they actually aren't. The lower the alpha error probability, the higher the chance that your sets are different.
You should only accept that the two datasets differ if this value is lower than 0.01 (1%), or for critical outcomes even 0.001 or lower. You should also know that the t-test needs at least around 30 values per dataset to be reliable enough, and that the type 2 test assumes equal variances of the two datasets. If equal variances are not given, you should use the type 3 test.
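If you ever want to sanity-check the spreadsheet outside of Excel, the same two-tailed tests can be reproduced, for example with MATLAB's ttest2 (the numbers below are made-up placeholders standing in for your two result columns):

a = [72 68 75 80 71 77 69 74 73 76];   % results of algorithm A (illustrative values)
b = [65 70 66 72 64 69 71 63 68 67];   % results of algorithm B (illustrative values)
[hEq,   pEq  ] = ttest2(a, b);                         % equal variances   (Excel TTEST type 2)
[hUneq, pUneq] = ttest2(a, b, 'Vartype', 'unequal');   % unequal variances (Excel TTEST type 3)
% pEq / pUneq are the alpha-error probabilities discussed above.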
http://depts.alverno.edu/nsmt/stats.htm