Distributing data with lower and upper boundaries in Excel - excel

Here's a link to a screenshot with the formula used in Column B and some sample data
I have a spreadsheet with 48 rows of data in column A
The values range from 0 to 19
The average of these 48 rows = 8.71
the standard deviation of the population = 3.77
I've used the STANDARDIZE function in excel in column B to return the Z-score of each item in column A given that I know the mean (8.71), std dev (3.77), and x (whatever is in column A).
For example (row 2) has:
x = 2
z = -1.779
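As a cross-check outside Excel, the z-score can be sketched in Python. This is only an illustration: the helper name `standardize` is made up here to mirror what STANDARDIZE(x, mean, standard_dev) computes, and the mean and standard deviation are the rounded figures quoted above.

```python
def standardize(x, mean, std_dev):
    """Z-score of x: how many standard deviations x lies from the mean."""
    if std_dev <= 0:
        raise ValueError("std_dev must be positive")
    return (x - mean) / std_dev


MEAN = 8.71     # average of column A, as quoted in the question
STD_DEV = 3.77  # population standard deviation, as quoted in the question

z_row2 = standardize(2, MEAN, STD_DEV)
print(round(z_row2, 2))  # -1.78
```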
Using the z value, I want to create a lower (4) and upper (24) boundary and calculate what the value would be in a third column.
Essentially, if x = 0 (min value), then z = -2.3096, and columnC = 4 (lower boundary condition)
Conversely, if x = 19 (max value), then z = 2.9947, and columnC = 24 (upper boundary condition)
and then all other values between 0 to 19 would be calculated....
Any ideas how I can accomplish this with a formula in the column C?

So if your lowest original value is 0 and your highest is 19, and you want to redistribute them from 4 to 24, and we assume that the mapping is linear, that means:
Since both are linear we have to use these formulas:
We solve the first for c, so we get
and replace the c in the second equation with that, so we get
and solve this for m as follows
If we put this together with our third equation above we get:
So we finally have equations for m and c, and we can use the numbers from our old and new lower and upper bounds to get:
You can use these values with
where x are your old values in column A and y is the new distributed value in column B:
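Written out, the linear derivation described in the steps above would read roughly as follows. The symbols x_min, x_max (old bounds, 0 and 19) and y_min, y_max (new bounds, 4 and 24) are labels chosen here, not taken from the original answer; the linear form y = m x + c is the assumption the answer states.

```latex
\begin{align*}
y_{\min} &= m\,x_{\min} + c, \qquad y_{\max} = m\,x_{\max} + c \\
c &= y_{\min} - m\,x_{\min} \\
y_{\max} &= m\,x_{\max} + y_{\min} - m\,x_{\min} \\
m &= \frac{y_{\max} - y_{\min}}{x_{\max} - x_{\min}} \\
c &= y_{\min} - \frac{y_{\max} - y_{\min}}{x_{\max} - x_{\min}}\,x_{\min} \\
m &= \frac{24 - 4}{19 - 0} = \frac{20}{19} \approx 1.0526, \qquad
c = 4 - \frac{20}{19}\cdot 0 = 4 \\
y &= \frac{20}{19}\,x + 4
\end{align*}
```

In a worksheet this corresponds to something like =20/19*A2+4 filled down, assuming the old values start in A2.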
Some visualization if you change the boundaries:
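To see how the output moves when the boundaries change, the same linear mapping can be sketched in Python. The function name `linear_rescale` and the second pair of boundaries (0 to 100) are illustrative assumptions, not part of the original answer.

```python
def linear_rescale(x, old_min, old_max, new_min, new_max):
    """Map x linearly so that old_min -> new_min and old_max -> new_max."""
    if old_max == old_min:
        raise ValueError("old_min and old_max must differ")
    m = (new_max - new_min) / (old_max - old_min)  # slope
    c = new_min - m * old_min                      # intercept
    return m * x + c


# Old range 0..19 from the question; try two different target ranges.
for new_min, new_max in [(4, 24), (0, 100)]:
    values = [round(linear_rescale(x, 0, 19, new_min, new_max), 2) for x in (0, 2, 19)]
    print(new_min, new_max, values)
```

With the boundaries from the question, x = 2 lands at about 6.11.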
Idea for a non-linear solution
If you want 4 and 24 as boundaries and the mean should be 12 the solution cannot be linear of course. But you could use for example any other formula like
So you can use this formula for column D y2 with the following values a, b, c as well as calculating the mean, min and max over column D y2.
Then use the solver:
Goal is: Mean $M$15 should be 12
secondary conditions: $M$16 = 4 (lower boundary) and $M$17 = 24 (upper boundary)
variable cells are a, b and c: $M$11:$M$13
The solver will now adjust the values a, b and c so that you get very close to your goal and to get these results:
The min is 4, the max is almost 24, and the mean is almost 12; that is probably the closest you can get with a numeric method.
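The formula used in the answer is not shown, so the following is only a sketch under a loudly stated assumption: that the curve is a quadratic, y = a*x^2 + b*x + c. In that special case the three conditions (value at the minimum, value at the maximum, mean) are linear in a, b and c, so they can be solved directly from the summary statistics quoted in the question, because mean(x^2) = variance + mean^2. The names `det3` and `solve3` are helpers invented for this example.

```python
# Assumed model (NOT taken from the original answer): y = a*x**2 + b*x + c
X_MIN, X_MAX = 0.0, 19.0                 # range of column A, from the question
MEAN_X, STD_X = 8.71, 3.77               # population mean / std dev, from the question
Y_MIN, Y_MAX, Y_MEAN = 4.0, 24.0, 12.0   # target boundaries and target mean

# For any data set, mean(x**2) = variance + mean**2.
MEAN_X2 = STD_X ** 2 + MEAN_X ** 2


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(matrix, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(matrix)
    if abs(d) < 1e-12:
        raise ValueError("system is singular")
    solution = []
    for col in range(3):
        replaced = [row[:col] + [rhs[i]] + row[col + 1:]
                    for i, row in enumerate(matrix)]
        solution.append(det3(replaced) / d)
    return solution


# Rows: y(X_MIN) = Y_MIN, y(X_MAX) = Y_MAX, mean(y) = Y_MEAN.
a, b, c = solve3(
    [[X_MIN ** 2, X_MIN, 1.0],
     [X_MAX ** 2, X_MAX, 1.0],
     [MEAN_X2, MEAN_X, 1.0]],
    [Y_MIN, Y_MAX, Y_MEAN],
)

# The curve must keep rising on [X_MIN, X_MAX]; otherwise the endpoints
# would not be the min and max of the new column.
is_increasing = (2 * a * X_MIN + b) > 0 and (2 * a * X_MAX + b) > 0
print(a, b, c, is_increasing)
```

Because the mean and standard deviation are rounded, the mean of the resulting column on the real 48 rows would only be approximately 12. For other curve shapes the conditions are generally not linear in the parameters, which is where a numeric tool such as the Solver comes in.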

Related

TRUE/FALSE ← VLOOKUP ← Identify the ROW! of the first negative value within a column

Firstly, we have an array of predetermined factors, i.e. V-Z;
each has 3 attributes, the first two (• × m) multiplied giving the 3rd.
f ... factors
• ... cap, the values in the data set may increase max
m ... fixed multiplier
p ... let's call it power
This is a separate, standalone array .. we'd access with eg. VLOOKUP
f   •   m   pwr
V   1   9   9
W   2   8   16
X   3   7   21
Y   4   6   24
Z   5   5   25
—————————————————————————————————————————————
Then we have 6 columns holding the actual data to be processed, from which the next-level result is derived, based on the interaction of the two samples introduced.
In addition, two columns are added, for balance & profit.
Here's a short, 6-row data sample:
f   •   m   bal   profit
V   2   3   377   1
Y   2   3   156   7
Y   1   1   122   0
X   1   2   -27   2
Z   3   3   223   3
—————————————————————————————————————————————
Ultimately, starting at the end, we are comparing IF -27 inverted → so 27 is within the X's power range ie. 21 (as per the first sample) .. which is then fed into a bigger formula, beyond the scope of this post.
This can be done with VLOOKUP, all fine by now.
—————————————————————————————————————————————
To get to that .. for the working example, we are focusing coincidentally on row5, since that's the one with the first negative value in the 'balance' column, so ..
on factorX = which factor exactly is to us unknown &
balance -27 = which we have to locate amongst potentially dozens to hundreds of rows.
Why!?
Once we know that the factor is X, based on the * & multiplier pertaining to it, then we also know which 'power' (top array) to compare -27, as the identified first negative value in the balance column, to.
Is that clear?
I'd like to know the formula on how to achieve that, & (get to) move on with the broader-scope work.
—————————————————————————————————————————————
The main issue for me is that I don't know how to identify the first negative value, or the row that -27 pertains to, and then, having that piece of information, how to leverage it to get the X, i.e. identify the factor type, especially since it's positioned left of the balance column & to the best of my knowledge I cannot use a negative column index number (so the latter, even if possible, is out of the question anyway).
To recap;
IF(21>27) = IF(-21<-27)
27 → LOCATE ROW with the first negative number (-27)
21 → IDENTIFY the FACTOR TYPE, same row as (-27)
→ VLOOKUP pwr, based on factor type identified (top array, 4th column right)
→ invert either 21 to a negative number or (-27) to the positive number
= TRUE/FALSE
Guessing your columns I'll say your first chart is in columns A to D, and the second in columns G to K
You could find the letter of that factor with something like this:
=INDEX(G:G,XMATCH(TRUE,INDEX(J:J<0)))
INDEX(J:J<0) converts that column to TRUE and FALSE depending on being negative or not and with XMATCH you find the first TRUE. You could then use that in VLOOKUP:
=VLOOKUP(INDEX(G:G,XMATCH(TRUE,INDEX(J:J<0))),A:D,4,0)
That would return the 21. You can use the same concept to find the -27, and with ABS get its "positive value":
=VLOOKUP(INDEX(G:G,XMATCH(TRUE,INDEX(J:J<0))),A:D,4,0) > ABS(INDEX(J:J,XMATCH(TRUE,INDEX(J:J<0))))
That should return true or false in the comparison
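The lookup chain can be traced step by step in a short Python sketch, using the sample data from the question. The names `POWER_BY_FACTOR`, `ROWS` and `first_negative_check` are made up for illustration; the logic mirrors the formulas above (find the first negative balance, read the factor in that row, look up its power, compare with the absolute balance).

```python
# Top array from the question: factor -> power (pwr column).
POWER_BY_FACTOR = {"V": 9, "W": 16, "X": 21, "Y": 24, "Z": 25}

# Data sample from the question: (factor, cap, multiplier, balance, profit).
ROWS = [
    ("V", 2, 3, 377, 1),
    ("Y", 2, 3, 156, 7),
    ("Y", 1, 1, 122, 0),
    ("X", 1, 2, -27, 2),
    ("Z", 3, 3, 223, 3),
]


def first_negative_check(rows, power_by_factor):
    """Return (factor, power, balance, power > |balance|) for the first
    row with a negative balance, or None if no balance is negative."""
    for factor, _cap, _multiplier, balance, _profit in rows:
        if balance < 0:
            power = power_by_factor[factor]
            return factor, power, balance, power > abs(balance)
    return None


result = first_negative_check(ROWS, POWER_BY_FACTOR)
print(result)  # ('X', 21, -27, False)
```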

Sum of the greatest value in one column, plus the sum of the other values in another column

Consider the following sheet/table:
    A     B
1   90    71
2   40    25
3   60    16
4   110   13
5   87    82
I want to have a general formula in cell C1 that sums the greatest value in column A (which is 110), plus the sum of the other values in column B (which are 71, 25, 16 and 82). I would appreciate if the formula wasn't an array formula (as in requiring Ctrl + Shift + Enter). I don’t have Office 365, I have Excel 2019.
My attempt
Getting the greatest value in column A is easy, we use MAX(A1:A5).
So the formula I want in cell C1 should be something like:
=MAX(A1:A5) + SUM(array_of_values_to_be_summed)
Obtaining the values of the other rows in column B (what I called array_of_values_to_be_summed in the previous formula) is the hard part. I've read about using INDEX, MATCH, their combination, and obtaining arrays by using parentheses and equal signs, and I've tried that, without success so far.
For example, I noticed that NOT((A1:A5 = MAX(A1:A5))) yields an array/list containing ones (or TRUEs) for the relative position of the rows to be summed, and containing a zero (or FALSE) for the relative position of the row to be omitted. Maybe this is useful, I couldn't find how.
Any ideas? Thanks.
Edit 1 (solution)
I managed to obtain what I wanted. I simply multiplied the array obtained with the NOT formula, by the range B1:B5. The final formula is:
=MAX(A1:A5) + SUM(NOT((A1:A5 = MAX(A1:A5))) * B1:B5)
Edit 2 (duplicate values)
I forgot to explain what the formula should do if there are duplicates in column A. In that case, the first term of my final formula (the term that has the MAX function) would be the one whose corresponding value in column B is smallest, and the value in column B of the other duplicates would be used in the second term (the one containing the SUM function).
For example, consider the following sheet/table:
    A     B
1   90    71
2   110   25
3   60    16
4   110   13
5   110   82
Based on the above table, the formula should yield 110 + (71 + 25 + 16 + 82) = 304.
Just to give context, the reason I want such a formula is that I'm writing a spreadsheet that automatically calculates the electric current rating of the short-circuit protective device of the feeder of a group of electric motors in a house, building or mall, as required by Article 430.62(A) of the US National Electrical Code. Column A is the current rating of the short-circuit protective device of the branch circuit of each motor, and column B is the full-load current of each motor.
You can use this formula
=MAX(A1:A5)
+SUM(B1:B5)
-AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1)
Based on @Anupam Chand's hint about duplicates of the max value: there could also be duplicates of the min value in column B among the rows that hold the max value in column A. :) This formula would account for that:
=SUM(B1:B5)
+(MAX(A1:A5)-AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1))
*SUMPRODUCT((A1:A5=MAX(A1:A5))*(B1:B5=AGGREGATE(15,6,(B1:B5)/(A1:A5=MAX(A1:A5)),1)))
Or, in @Anupam Chand's shorter, more readable and overall better style :)
=SUM(B1:B5)
+(MAX(A1:A5)-MINIFS(B1:B5,A1:A5,MAX(A1:A5)))
*COUNTIFS(A1:A5,MAX(A1:A5),B1:B5,MINIFS(B1:B5,A1:A5,MAX(A1:A5)))
The explanation works for both solutions:
The SUM-part just sums the whole list.
The second line gets the max-value for column A and the corresponding min-value of column B for the max-values in column A and adds or subtracts it respectively.
The third line counts, how many times the corresponding min-value for the max-value occurs and multiplies it with the second line.
Can you try this?
=MAX(A1:A5)+SUM(B1:B5)-MINIFS(B1:B5,A1:A5,MAX(A1:A5))
What we're doing is adding the max of A to all rows of B and then subtracting the min value of B where A is the max.
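The same arithmetic can be checked against both tables from the question with a few lines of Python. This is a sketch of the logic only; the function name `feeder_rating` is invented here and does not come from the thread.

```python
def feeder_rating(col_a, col_b):
    """max(A) + sum(B) - min(B over the rows where A equals max(A))."""
    max_a = max(col_a)
    min_b_at_max = min(b for a, b in zip(col_a, col_b) if a == max_a)
    return max_a + sum(col_b) - min_b_at_max


B = [71, 25, 16, 13, 82]
first_table = feeder_rating([90, 40, 60, 110, 87], B)     # unique max in A
second_table = feeder_rating([90, 110, 60, 110, 110], B)  # max duplicated in A
print(first_table, second_table)  # 304 304
```

In the second table the rows with A = 110 carry B values 25, 13 and 82; only the smallest (13) is left out of the sum, which matches the rule stated in Edit 2.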
If you have Excel 365 you can use the following LET-Formula
=LET(A,A1:A5,
B,B1:B5,
MaxA,MAX(A),
MinBExclude, MINIFS(B,A,MaxA),
sumB1,SUMPRODUCT(B*(A=MaxA)*(B<>MinBExclude)),
sumB2,SUMPRODUCT(B*(A<>MaxA)),
MaxA+sumB1+sumB2)
A and B are shortcuts for the two ranges
MaxA returns the max value for A (110)
MinBExclude filters the values of column B by the MaxA-value (25, 13, 82) and returns the min-value of the filtered result (13)
sumB1 returns the sum of the other MaxA values from column B (25 + 82)
sumB2 returns the sum of the values from B where value in A <> MaxA (71 + 16)
and finally the result is returned
If you don't have Excel 365 you can add helper columns for MaxA, MinBExclude, sumB1 and sumB2 and the final result

VLookup with Multiple Ranges

I'm trying to make a formula that would do the following: There are say 10 categories 1-10, given a number x and y, the line is in category 3 if and only if x is between 1 and 2 and y is between 5-7 for example. I don't know how to use VLookup given the multiple conditions and the two ranges that are completely different and not in a sequential order.
I tried using index match:
=INDEX(B5:B15,MATCH(1,IF(AND(K5>=C5:C15,K5<=D5:D15),1,0)*IF(AND(L5>=E5:E15,L5<=F5:F15),1,0),0))
but this returns an error. Column B holds the categories, K5 and L5 are x and y respectively, column C holds the lower bounds for x per category with D as the upper bounds, and E and F do the same for y.
Here's a mock representation of the data and rules:
Data
x      y    category
1.2    12   1
1.5    5    2
0.98   23   3
.
.
.
Rules
Category   X-LB   X-UB   Y-LB   Y-UB
1          1      2      9      15
2          1.5    1.7    1      9
3          0.8    1      20     23
.
.
.
LB is lower bound and UB is upper bound. For example given x and y above using the rules table we find the expected return column.
Thank you,
If only one category will fit the bill in each case, one way is to use SUMPRODUCT.
Formula in C2 and down is
=SUMPRODUCT(($B$10:$B$12<=A2)*($C$10:$C$12>=A2)*($D$10:$D$12<=B2)*($E$10:$E$12>=B2)*($A$10:$A$12))
In M5, copied down :
=INDEX($B$5:$B$15,MATCH(1,INDEX(($K5>=$C$5:$C$15)*($K5<=$D$5:$D$15)*($L5>=$E$5:$E$15)*($L5<=$F$5:$F$15),0),0))
Or,
Another shorter option.
Using SUMIFS function, formula in M5 copied down :
=SUMIFS($B:$B,$C:$C,"<="&$K5,$D:$D,">="&$K5,$E:$E,"<="&$L5,$F:$F,">="&$L5)
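The formulas above all test the four bounds row by row and combine the results, which is what the original attempt could not do: AND collapses its array arguments into a single TRUE/FALSE instead of comparing element by element. The same row-wise test can be sketched in Python with the mock rules from the question; `RULES` and `categorize` are names chosen for this example.

```python
# Rules table from the question: (category, x_lb, x_ub, y_lb, y_ub).
RULES = [
    (1, 1.0, 2.0, 9, 15),
    (2, 1.5, 1.7, 1, 9),
    (3, 0.8, 1.0, 20, 23),
]


def categorize(x, y, rules):
    """Return the first category whose x and y bounds both contain the point,
    or None when no rule matches."""
    for category, x_lb, x_ub, y_lb, y_ub in rules:
        if x_lb <= x <= x_ub and y_lb <= y <= y_ub:
            return category
    return None


for x, y in [(1.2, 12), (1.5, 5), (0.98, 23)]:
    print(x, y, categorize(x, y, RULES))
```

One design difference is worth noting: this sketch and the INDEX/MATCH formula return the first matching rule, while SUMPRODUCT and SUMIFS add up the categories of every matching rule. With the mock rules, a point such as x = 1.6, y = 9 sits on the shared boundary of categories 1 and 2, so the summing formulas would return 3 there.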

How to sum constants if the values of a row contain a specific value in excel?

I have the following row in excel:
12 4 12p 12a 12b
I need to sum these elements with their values from the legend.
12 = 12;
4 = 4;
12p = 12,5;
12a = 12,2;
12b = 12,3;
For example
=12 + 4 + 12,5 + 12,2 + 12,3
Any ideas?
If you have all the elements within one cell as a single string of text, the optimal approach would be to start by using text-to-column to split them up. So you'll have 12 in A, 4 in B, 12p in C, 12a in D, 12b in E. If that's not an option, I can show you string manipulations that can be an alternative.
You'll need to turn your "legend" into a look-up table, (perhaps on sheet2?), with column A having: p, a, b, etc.. and column B having the relative values.
Once that's done, place this formula on sheet1, in F column:
=IFERROR(VALUE(A2),VALUE(LEFT(A2,LEN(A2)-1))+VLOOKUP(RIGHT(A2),Sheet2!$A:$B,2,FALSE))
Then drag it to the right 5 times, and it will have the values of the elements "translated".
You can sum the translated range easily.
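The translate-then-sum idea can be sketched in Python as well. The offsets in `SUFFIX_VALUE` are an assumption read off the legend in the question (12p = 12,5 suggests that "p" adds 0.5, and so on); the sketch also assumes a suffix adds the same amount whatever number it follows. `element_value` is a name made up for this example.

```python
import re

# Assumed lookup table derived from the legend: suffix -> amount added.
SUFFIX_VALUE = {"p": 0.5, "a": 0.2, "b": 0.3}


def element_value(token, suffix_value):
    """Translate an element such as '12' or '12p' into its numeric value."""
    match = re.fullmatch(r"(\d+)([a-z]?)", token)
    if match is None:
        raise ValueError(f"unrecognised element: {token!r}")
    number, suffix = match.groups()
    return int(number) + (suffix_value[suffix] if suffix else 0.0)


row = ["12", "4", "12p", "12a", "12b"]
total = sum(element_value(token, SUFFIX_VALUE) for token in row)
print(round(total, 1))  # 53.0
```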

How do I perform a monte carlo simulation in Open Office?

I am trying to generate some ranges for a problem I am working on. These ranges are going to be based on the sum of the ratios of a bunch of numbers. So for example, the constants are 5, 6 and 7.
The ranges I get will be 5/x + 6/y + 7/z = S
I want x, y, and z to come out of a list of numbers I have - say .5, .6, .7, .8, .9, and 1
So if I run 100 iterations of this, I want the spreadsheet to randomly fill a value in x from that list of numbers, another random selection for y, and yet another for z.
And like I said, I want that sum, S, to be calculated 100 times in such a way that I will get a range of values for S.
I have been trying to figure out how to do this without the use of macros.
Here's one way to do it. Create a table of x, y, and z input values. Put a column to the left of the table with the number of each input value (1...N). Say that you have 10 potential input values for each. So your table is in A1:D10 with 1 through 10 in column A and the x values in B, y values in C, and z values in D.
Then you can select a random value of the x values by writing =VLOOKUP(10*RAND()+1,$A$1:$D$10,2,TRUE). This randomly generates a number from 1 up to (but not including) 11 and looks up the x value whose column A entry matches that number, rounded down. E.g. if the random number is 4.3, it will select the 4th value. Replace the third parameter of VLOOKUP with 3 for y values and 4 for z values...
If you don't have any other data in columns A:D, you can generalize this with =VLOOKUP(count($A:$A)*RAND()+1,$A:$D,2,TRUE).
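For comparison outside the spreadsheet, the whole simulation fits in a few lines of Python. This is a sketch under the question's example numbers (constants 5, 6, 7 and the list .5 to 1); the function name `simulate` and the fixed seed are choices made here so that runs are repeatable.

```python
import random

CHOICES = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0]  # candidate values for x, y and z


def simulate(iterations, choices, seed=None):
    """Draw x, y, z independently from choices and return every S = 5/x + 6/y + 7/z."""
    rng = random.Random(seed)
    results = []
    for _ in range(iterations):
        x, y, z = (rng.choice(choices) for _ in range(3))
        results.append(5 / x + 6 / y + 7 / z)
    return results


samples = simulate(100, CHOICES, seed=42)
print(min(samples), max(samples))  # observed range of S over 100 draws
```

Whatever the draws, S stays between 18 (all three values equal to 1) and 36 (all three equal to 0.5), which is a quick sanity check on the spreadsheet version too.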
