Create exchanges with bounded random parameters and fixed sum to be used in Monte Carlo - brightway

I have to run a Monte Carlo simulation where, for some products, certain exchanges are related to each other: my process can take any of the products as input in different (bounded) proportions, but with a fixed sum.
Example:
My product a takes as inputs a total of 10 kg of x, y, and z altogether. x has a uniform distribution from 0 to 4 kg, y from 1 to 6 kg, and z from 3 to 8 kg, and their sum must equal 10 kg. So in every iteration I need a random number for each of my three exchanges, within its bounds, such that the sum is always 10.
I have seen that in stats_arrays it is possible to set the bounds of the distributions and thus create values in a specified interval, but this would not ensure that the sum of my random vector equals the fixed sum of 10.
Wondering if there is already a (relatively) straightforward way to implement this in bw2.
Otherwise, the only feasible way I see is to create all the uncertainty parameters with ParameterVectorLCA, tweak the values in the array for those products that must meet the aforementioned requirements (e.g. with something like this or this), and then use the array with the modified parameters to re-run my MC.

We are working on this in https://github.com/PascalLesage/brightway2-presamples, but it isn't ready yet. I don't know of any way to do this currently without hacking something together by subclassing the MonteCarloLCA.
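In the meantime, the sampling step itself can be sketched outside of Brightway. The snippet below is only an illustration under stated assumptions: it does not use any Brightway API, the function name sample_fixed_sum is hypothetical, and it uses simple rejection sampling (draw all but the last value from their uniform bounds, set the last one to the remainder, and retry if the remainder falls outside its own bounds). Note that the accepted values are uniform over the feasible region, so the individual marginals are no longer exactly uniform.

```python
import random


def sample_fixed_sum(bounds, total, rng=random, max_tries=10_000):
    """Draw one value per (low, high) pair so that the values sum to `total`.

    Hypothetical helper, not part of Brightway: all but the last value are
    drawn uniformly, the last one is the remainder, and the draw is rejected
    if the remainder violates its own bounds.
    """
    *free_bounds, (last_low, last_high) = bounds
    for _ in range(max_tries):
        values = [rng.uniform(low, high) for low, high in free_bounds]
        remainder = total - sum(values)
        if last_low <= remainder <= last_high:
            return values + [remainder]
    raise RuntimeError("no feasible sample found; check bounds and total")


# Bounds for x, y and z from the question, with a fixed sum of 10 kg.
x, y, z = sample_fixed_sum([(0, 4), (1, 6), (3, 8)], total=10)
```

The sampled values would still have to be written into the parameter array of each iteration by hand, as described in the question.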

Related

Adding a number to a date in excel (with an average and st dev)

I have a list of dates. They're dummy data for sign up dates.
I want to add another list that is dummy data for first usage dates.
To make the dates, I used this -
=RANDBETWEEN(DATE(2017,1,1),DATE(2017,6,30))
To make the first usage dates, I'm trying this -
=C6+RANDBETWEEN(0,100)
But, I don't want to just add a random number. I want to add a number from a normal distribution with a mean of 10 and a standard deviation of 30 (without going into negatives). Is that possible?
The following will generate a random number from a normal distribution with a mean of 10 and a standard deviation of 30:
= NORM.INV(RAND(),10,30)
The easiest way I can think of to exclude negative values is to just take the absolute value of this.
= ABS(NORM.INV(RAND(),10,30))
But, as already noted, if you exclude negative numbers (no matter how you decide to exclude them), then it isn't really a normal distribution anymore.
EDIT:
Another way to exclude negative numbers is the following:
Since NORM.INV(0.37,10,30) returns a value just above 0, you can use this knowledge to change the formula to only allow random values between 0.37 and 1 to be generated:
= NORM.INV(0.63*RAND()+0.37,10,30)
However, again I must point out this isn't a true normal distribution.
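The 0.37 threshold in the formula above is the probability that the normal variable falls below zero. As a rough cross-check, here is the same inverse-CDF trick sketched in Python with the standard library; the names p_zero and truncated_sample are made up for this illustration.

```python
import random
from statistics import NormalDist

dist = NormalDist(mu=10, sigma=30)

# Probability of a negative draw; this is where the 0.37 comes from.
p_zero = dist.cdf(0)  # about 0.369


def truncated_sample(rng=random):
    """Draw from the normal distribution restricted to non-negative values."""
    u = rng.random()                         # uniform in [0, 1)
    p = p_zero + (1.0 - p_zero) * u          # rescaled to [p_zero, 1)
    p = min(p, 1.0 - 2 ** -53)               # guard: inv_cdf needs p < 1
    return max(0.0, dist.inv_cdf(p))         # clamp tiny rounding errors at 0


days_to_first_use = round(truncated_sample())
```

This mirrors = NORM.INV(0.63*RAND()+0.37,10,30), only with the threshold computed instead of rounded by hand.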

Matlab - Optimize blend of two stream sub constraint to max profit

I am trying to port a very simple Excel into a Matlab code (I am not completely satisfied with Excel Solver!). My problem is this:
I have two materials (say A and B) with their properties (density, visco, etc) and prices, and I mix them to obtain a third material (say C), whose properties are a mix (non necessarily linear) of the two, and which, if it respects some limits (ie density max X, visco max Y), can be sold for a certain price. What I have is a function which takes the quantity of A and B, their properties, their prices, material C limits, and material C price. It then comes up with a profit (i.e. price C * (quantity A + quantity B) - (price A * quantity A + price B * quantity B) ), and an indicator which tells me if all the properties limits are satisfied in material C (basically it compares limits and actual properties, and puts 0 if ok and 1 otherwise ---> if all properties are respected, mean of that vector should be 0).
Thus I have:
[profit, ok] = blend([qA, qB], [specA, specB], [pA, pB], [limits], pC)
and I want to max profit by changing quantity A and quantity B, sub that the ok vector is 0 and qA+qB is less than a specified max quantity. The real problem is imposing the ok vector equal to 0. I thought about porting the limit check outside of the function, but I can only check if the limits are respected once the function has calculated the property of the blends, so I cannot put that outside. Is there a solution to this? Many thanks!
What you are looking for is called nonlinear constrained optimization, and most likely specifically the function fmincon.
I am afraid it is probably best to unpack your function so that it fits the standard scheme. This is incidentally a very good thing to learn, as it is the common way to express such problems.
The call is
x=fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon)
You have two parameters, giving the quantities of your materials, or if you normalize the total quantity to 1, you could get by with just one parameter x and express the other as (1-x) in all equations.
So you would need to write one function fun that computes the negative of the profit from the parameters (fmincon minimizes, so minimizing the negative profit maximizes the profit).
The material constraints are then put into the remaining parameters. You could put all constraints into the ceq return of nonlcon, as explained here to return zero when the mix is ok.
However, cleaner and more efficient is to encode all linear constraints using the A and b matrices.
For more details I would require the actual constraints and functions that you have.
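As a sketch of how the blending problem could map onto fmincon's standard form: the notation below is assumed for illustration only (q_A, q_B for the quantities, p_A, p_B, p_C for the prices, Q_max for the maximum total quantity, prop_k and limit_k for the k-th blend property and its maximum). It writes the property limits as inequalities in the c return of nonlcon rather than in ceq, and it assumes the quantities cannot be negative.

```latex
\begin{aligned}
\min_{q_A,\,q_B}\quad
  & f(q) = -\bigl[\,p_C\,(q_A + q_B) - (p_A\,q_A + p_B\,q_B)\,\bigr] \\
\text{subject to}\quad
  & q_A + q_B \le Q_{\max}
  && \text{(linear: } A = [\,1 \;\; 1\,],\; b = Q_{\max}\text{)} \\
  & c_k(q) = \mathrm{prop}_k(q_A, q_B) - \mathrm{limit}_k \le 0
  && \text{(nonlinear: returned by nonlcon)} \\
  & q_A \ge 0,\quad q_B \ge 0
  && \text{(bounds: } lb = [\,0 \;\; 0\,]\text{)}
\end{aligned}
```

With this split, the blend properties are still computed inside the constraint function, so the limit check does not have to be known before the blend is calculated.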

How to obtain Incremental standard deviations from a set of standard deviations?

I have a data set containing three columns, first column represents number of trials, second column represents experimental values, and the third column represents corresponding standard deviation.
With each experiment there is an increment in my experimental values. To get the incremental values, I hold my first value as the reference value and subtract this reference value from each subsequent value and use them to create fourth column of these incremental values.
My problem begins right here. How do I create a new set of incremental standard deviations for the incremental experimental values I got? My apologies if the problem is not well defined, but hopefully someone will be able to help me out. Many thanks!
Below is my data set,
Trial   Mean     SD      Incr Mean   Incr SD
1       45.311   4.668    0
2       56.682   2.234   11.371
3       62.197   2.266   16.886
4       70.550   4.751   25.239
5       80.528   4.412   35.217
6       87.453   4.542   42.142
7       89.979   2.185   44.668
8       96.859   3.476   51.548
To be clear, for other readers, your incremental mean is actually the difference between trial 1 and the other trials.
Variances add directly when you subtract (or add) independent normal distributions. So you first convert each standard deviation to a variance by squaring it, then add the variances, and then take the square root to turn the result back into a standard deviation. Note that when using this kind of Pythagorean combination, you are assuming that trial 1 is independent of the other trials, so for example you cannot have the same sample in both trials.
Logically this makes sense that your so called "incremental SD" will always be greater than the individual SDs, since the uncertainty of both distributions contributes towards the uncertainty of the difference.
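Applied to the table in the question, the calculation could look like the sketch below. The function name incremental_stats is made up for this example, and it relies on the independence assumption stated above.

```python
import math

# (trial, mean, sd) rows copied from the question.
trials = [
    (1, 45.311, 4.668),
    (2, 56.682, 2.234),
    (3, 62.197, 2.266),
    (4, 70.550, 4.751),
    (5, 80.528, 4.412),
    (6, 87.453, 4.542),
    (7, 89.979, 2.185),
    (8, 96.859, 3.476),
]


def incremental_stats(rows):
    """Return (trial, incr_mean, incr_sd) relative to the first row.

    Assumes each trial is independent of the reference trial, so the
    variances of the two means simply add.
    """
    _, ref_mean, ref_sd = rows[0]
    result = []
    for trial, mean, sd in rows[1:]:
        incr_mean = mean - ref_mean
        incr_sd = math.sqrt(sd ** 2 + ref_sd ** 2)
        result.append((trial, incr_mean, incr_sd))
    return result


for trial, incr_mean, incr_sd in incremental_stats(trials):
    print(f"{trial}  {incr_mean:7.3f}  {incr_sd:6.3f}")
```

Trial 1 itself is left out, since its increment is zero by definition.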

probability logic statistics

I am not sure whether this is the right place to ask this question.
As this is more like a logic question.. but hey no harm in asking.
Suppose I have a huge list of data (customers)
and they all have a data_id
Now I want to split the data in a ratio of, let's say, 10:90.
Now rather than stating a condition that (example)
the sum of digits is even...go to bin 1
the sum of digits is odd.. go to bin 2
or sum of last three digits are x then go to bin 1
sum of last three digits is not x then go to bin 2
Now this might result in uneven data collection..sometimes it might be able to find the data.. more (which is fine) but sometimes it might not be able to find enough data
Is there a way (probabilistically speaking)
which says.. sample size is always greater than x%
Thanks
You want to partition your data by a feature that is uniformly distributed. Hash functions are designed to have this property ... so if you compute a hash of your customer ID, and then partition by the first n bits to get 2^n bins, each bin should have approximately the same number of items. (You can then select, say, 90% of your bins to get 90% of the data.) Hope this helps.
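A minimal sketch of this idea in Python is below. It is a small variation on the description above: instead of taking the first n bits to get 2^n bins, it reduces the hash to one of 100 buckets so that a 10:90 split can be expressed directly. The function name assign_bin and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib


def assign_bin(data_id, percent_bin1=10):
    """Deterministically assign an ID to bin 1 (about percent_bin1 %) or bin 2."""
    digest = hashlib.sha256(str(data_id).encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer, then map to 0..99.
    bucket = int.from_bytes(digest[:8], "big") % 100
    return 1 if bucket < percent_bin1 else 2


bins = [assign_bin(customer_id) for customer_id in range(20_000)]
share_bin1 = bins.count(1) / len(bins)
```

Because the assignment depends only on the ID, the same customer always lands in the same bin, and the share in bin 1 stays close to 10% for large lists.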

How to determine statistical significance using T-Test in Excel?

I have two groups of data sets, A and B. I would like to know whether the average value of A
differs significantly from B's average. How do I do that in Excel 2007?
(I know there's a TTEST formula in Excel, and I also know I don't need to use the paired version of it. What other parameters do I need to set, and how do I interpret the result?)
Thanks,
Jon
=ttest(array1,array2,tails,type)
array1 is data set A
array2 is data set B
tails: 1 = one tailed, 2 = two tailed. Use one tailed if you are testing whether A is higher than B, or whether A is lower than B. Use two tailed if you are testing whether A is either higher or lower than B. (Since you ask whether A differs from B, without a direction, that would be 2.)
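type: You said you don't need paired, which is Type 1. Type 2 is if your data sets have equal variance, and Type 3 is if they have unequal variance. For example, if the data points in A are all pretty close, but in B they are wildly different, use Type 3.
The formula returns the p-value of the test. A small value (commonly below 0.05) means the difference between the two averages is statistically significant.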
