Excel formula: For each instance of peak/bottom value in column, get range/distance to the second next peak/bottom - excel

I am looking to solve the following problem in Excel:
ID Value Distance
1 1 3
2 0 0
3 -1 3
4 1 0
5 0 0
6 -1 0
7 0 0
Essentially, the Distance column is what I want. It looks at peak/bottom values (1 and -1), then scans down to find the second next peak or bottom and computes the distance. For example, ID 1 is a peak, so we look for the second next peak/bottom: ID 3 is skipped since it is the first, so we take ID 4 and get distance = 4 - 1 = 3.

Try following formula:
=IFERROR(AGGREGATE(15,6,A2:$A$18/ABS(B2:$B$18),3)/ABS(B2)-A2,0)
Explanation:
AGGREGATE function with first two parameters 15, 6 and last 3 returns the third smallest value in the array A2:$A$18/ABS(B2:$B$18) ignoring errors - in the first row after division the array looks like this [1, #DIV/0!, 3, 4, #DIV/0!, 6, #DIV/0!, ...] and returns 4.
Next, this value is divided by the absolute value of column B of the current row (if we divide by 0, then we get an error and the IFERROR function returns 0).
Then we subtract the value in column A of the current row (1 in the first row) from the result and get the desired distance: 4 - 1 = 3.
To get the third and subsequent values, increase the last parameter of the AGGREGATE function accordingly.
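The same logic can be cross-checked outside Excel. The following is a minimal Python sketch, not part of the original answer; the function name `second_next_distance` and the parameter `k` are hypothetical. Note that the formula asks AGGREGATE for the third smallest value because its range includes the current row, whereas the sketch only looks at later rows and therefore uses k = 2.

```python
def second_next_distance(ids, values, k=2):
    """For each peak/bottom row (value != 0), return the distance to the
    k-th later peak/bottom row; return 0 for other rows or if none exists."""
    result = []
    for i, (row_id, value) in enumerate(zip(ids, values)):
        if value == 0:
            # Not a peak/bottom: mirrors the IFERROR(..., 0) branch.
            result.append(0)
            continue
        # IDs of all later rows that are peaks or bottoms.
        later = [ids[j] for j in range(i + 1, len(ids)) if values[j] != 0]
        result.append(later[k - 1] - row_id if len(later) >= k else 0)
    return result


ids = [1, 2, 3, 4, 5, 6, 7]
values = [1, 0, -1, 1, 0, -1, 0]
distances = second_next_distance(ids, values)
print(distances)  # [3, 0, 3, 0, 0, 0, 0]
```

The printed list matches the Distance column in the question.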

Related

Nuanced Excel Question; calculating proportions

Fellow overflowers, all help is appreciated;
I have the following rows of values (always 7 values per row) of data in Excel (3 examples below), where data is coded as 1 or 2. I am interested in the 1's.
2, 2, 1, 2, 2, 1, 1.
1, 2, 2, 2, 2, 1, 2.
2, 2, 2, 1, 1, 1, 2.
I use =MATCH(1,A1:G1,0) to tell me WHEN the first 1 appears, BUT now I want to calculate the proportion that 1's make up of the remaining values in the row.
For example;
2, 2, 1, 2, 2, 1, 1. (1 first appears at point 3, but then 1's make up 2 out of 4 remaining points; 50%).
1, 2, 2, 2, 2, 1, 2. (1 first appears at point 1, but then 1's make up 1 out of the 6 remaining points; 16%).
2, 2, 2, 1, 1, 1, 2. (1 first appears at point 4, but then 1's make up 2 out of the 3 remaining points; 66%).
Please help me calculate this proportion!
You could use this one
=(LEN(SUBSTITUTE(SUBSTITUTE(MID(A1,SEARCH(1,A1)+3,1000)," ",""),",",""))
-LEN(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(MID(A1,SEARCH(1,A1)+3,1000)," ",""),",",""),1,""))
)/LEN(SUBSTITUTE(SUBSTITUTE(MID(A1,SEARCH(1,A1)+3,1000)," ",""),",",""))
The
SUBSTITUTE(SUBSTITUTE(MID(A1,SEARCH(1,A1)+3,1000)," ",""),",","")
-part gets the string after the first 1. The single 1 in the middle part is the character whose percentage you want to calculate. So to adapt the formula to other characters, change the single 1 in the middle part and the three 1s in the three SEARCH calls.
EDIT thank you for the hint #foxfire
A solution for values in columns would be
=COUNTIF(INDEX(A1:G1,1,MATCH(1,A1:G1,0)+1):G1,1)/(COUNT(A1:G1)-MATCH(1,A1:G1,0))
You can do it with SUMPRODUCT:
My formula in column H is a MATCH like yours:
=MATCH(1;A3:G3;0)
My formula for calculating the % of 1's over the remaining numbers after the first 1 is:
=SUMPRODUCT((A3:G3=1)*(COLUMN(A3:G3)>H3))/(7-H3)
This is how it works:
1. (A3:G3=1) returns an array of 1 and 0 depending on whether each cell value is 1. So for row 3 it would be {0;0;1;0;0;1;1}.
2. COLUMN(A3:G3)>H3 returns an array of 1 and 0 depending on whether the column number of each cell is higher than the column number of the first 1 found (which matches its position inside the array, since the range starts in column A). So for row 3 it would be {0;0;0;1;1;1;1}.
3. We multiply both arrays. So for row 3 it would be {0;0;1;0;0;1;1} * {0;0;0;1;1;1;1} = {0;0;0;0;0;1;1}.
4. With SUMPRODUCT we sum up the array of 1 and 0 from the previous step. So for row 3 we obtain 2. That means there are 2 cells with value 1 after the first 1 found.
5. (7-H3) just returns how many cells there are after the first 1 found, so for row 3 it means there are 4 cells after the first 1.
6. We divide the value from step 4 by the value from step 5, and that's the % you want. So for row 3 it would be 2/4=0,50. That means 50%.
update: I used 2 columns just in case you need to show where is the first 1. But in case you want a single column with the %, formula would be:
=SUMPRODUCT((A3:G3=1)*(COLUMN(A3:G3)>MATCH(1;A3:G3;0)))/(7-MATCH(1;A3:G3;0))
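As a cross-check of the numbers in the question, here is a minimal Python sketch of the same calculation (find the first 1, then take the share of 1's among the cells after it). It is an illustration rather than a translation of any of the formulas above; the function name `proportion_after_first` is hypothetical.

```python
def proportion_after_first(row, target=1):
    """Share of `target` among the values after its first occurrence.
    Returns None if nothing follows the first occurrence."""
    first = row.index(target)      # 0-based position of the first target
    rest = row[first + 1:]         # the "remaining points"
    if not rest:
        return None
    return rest.count(target) / len(rest)


rows = [
    [2, 2, 1, 2, 2, 1, 1],
    [1, 2, 2, 2, 2, 1, 2],
    [2, 2, 2, 1, 1, 1, 2],
]
proportions = [proportion_after_first(r) for r in rows]
print(proportions)  # 0.5, then 1/6, then 2/3
```

Unlike the Excel formulas, the sketch returns None instead of dividing by zero when the first 1 sits in the last cell.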

Intermediate steps in evaluation of Frequency formula

This refers to the SO question "Counting unique list of items from range based on criteria from other ranges".
Formula Suggested by Scot Craner is :
=SUM(--(FREQUENCY(IF(B2:B7<=25,IF(C2:C7<=35,COUNTIF(A2:A7,"<"&A2:A7),""),""),COUNTIF(A2:A7,"<"&A2:A7))>0))
I have been able to understand clearly the logic and evaluation of the formula except for this step shown in the attached snapshots.
As per MS Office document:
FREQUENCY(data_array, bins_array) The FREQUENCY function syntax has
the following arguments: Data_array Required. An array of or
reference to a set of values for which you want to count frequencies.
If data_array contains no values, FREQUENCY returns an array of zeros.
Bins_array Required. An array of or reference to intervals into
which you want to group the values in data_array. If bins_array
contains no values, FREQUENCY returns the number of elements in
data_array.
It is clear to me how {1;1;4;0;"";""} comes about as data_array and how {1;1;4;0;5;3} comes about as bins_array. But how it evaluates to {2;0;1;1;0;0;0} is not clear to me.
Would appreciate if someone can lucidly explain it.
So you want to know how
FREQUENCY({1;1;4;0;"";""},{1;1;4;0;5;3}) evaluates to {2;0;1;1;0;0;0}?
The point is that bins_array does not need to be sorted for FREQUENCY to work. But internally it must of course sort bins_array to get the intervals into which to group the values in data_array. It then groups and counts, and returns the counts in the same order in which the bins were given in bins_array.
Scores  Bins
1       1
1       1
4       4
0       0
""      5
""      3
Bins sorted
0   (<=0)
1   (>0, <=1)
1   (>1, <=1) == not possible
3   (>1, <=3)
4   (>3, <=4)
5   (>4, <=5)
    (>5)
Bin  Description                                  Result
1    Number of scores (>0, <=1)                   2
1    Number of scores (>1, <=1) == not possible   0
4    Number of scores (>3, <=4)                   1
0    Number of scores (<=0)                       1
5    Number of scores (>4, <=5)                   0
3    Number of scores (>1, <=3)                   0
     Number of scores (>5)                        0
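The table above can be reproduced with a short Python sketch. It models the behaviour described in this answer (sort the bins, count, report in the original bin order) and is not a statement about how Excel implements FREQUENCY internally; the function name `frequency` is an assumption made for the illustration.

```python
from bisect import bisect_left


def frequency(data, bins):
    """Count numeric values per bin, reporting counts in the order the bins
    were given, plus one trailing count for values above the largest bin."""
    # Sort bin positions by bin value; sorted() is stable, so of two equal
    # bins the one listed first comes first and receives the matches.
    order = sorted(range(len(bins)), key=lambda i: bins[i])
    sorted_bins = [bins[i] for i in order]
    counts = [0] * (len(bins) + 1)
    for value in data:
        if not isinstance(value, (int, float)):
            continue  # text such as "" is ignored
        pos = bisect_left(sorted_bins, value)  # first bin with value <= bin
        if pos == len(sorted_bins):
            counts[-1] += 1                    # larger than every bin
        else:
            counts[order[pos]] += 1
    return counts


result = frequency([1, 1, 4, 0, "", ""], [1, 1, 4, 0, 5, 3])
print(result)  # [2, 0, 1, 1, 0, 0, 0]
```

The second bin with value 1 gets 0 because its interval (>1, <=1) is empty, which is what makes the >0 test in the original formula count each distinct value only once.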

Pandas - Least frequent value in column

I have a Pandas series of integers, 'win'. I want most_common and least_common to be the most and least frequent values in the column. For example, with the following numbers, I would want most_common to be 2 and least_common to be 1. If there is a tie (either way), it can be broken arbitrarily.
0 1 2 2 2 0 0 2 2 0
I can find most_common using the following code:
win.mode()[0]
How can I find the least common? I tried the following code, but it did not work, and in any case I was not sure if this was the best way to go about this:
lowest =valid_loss.value_counts().tail(1)[0]
I think you need the last value of the index of value_counts() for the least frequent value and the first value of the index for the most frequent one:
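import pandas as pd
valid_loss = pd.Series([0, 1, 2, 2, 2, 0, 0, 2, 2, 0])
# value_counts() sorts by frequency, most frequent value first
s = valid_loss.value_counts()
print(s)
# 2 5
# 0 4
# 1 1
# dtype: int64
highest = s.index[0]   # first index label = most frequent value
print(highest)
# 2
lowest = s.index[-1]   # last index label = least frequent value
print(lowest)
# 1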
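For comparison, the same idea (rank the values by count, then take the first and last) can be sketched without pandas using `collections.Counter` from the standard library. This swaps in a different tool from the answer above and is shown only as an illustration; ties are broken by the order in which values first appear.

```python
from collections import Counter

wins = [0, 1, 2, 2, 2, 0, 0, 2, 2, 0]

# most_common() returns (value, count) pairs, highest count first.
ranked = Counter(wins).most_common()

most_common = ranked[0][0]    # value with the highest count
least_common = ranked[-1][0]  # value with the lowest count
print(most_common, least_common)  # 2 1
```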

Sum values based on first occurrence of other column using excel formula

Let's say I have the following two columns in excel spreadsheet
A B
1 10
1 10
1 10
2 20
3 5
3 5
and I would like to sum the values from B-column that represents the first occurrence of the value in A-column using a formula. So I expect to get the following result:
result = B1+B4+B5 = 35
i.e., sum column B once for each unique value in column A. In my case, if Ai = Aj then Bi = Bj, where i and j are row positions: if two rows of column A have the same value, their corresponding values in column B are also the same. I can sort the data by column A, but I would prefer a formula that works regardless of sorting.
I found this post that refers to the same problem, but the proposed solution I am not able to understand.
Use SUMPRODUCT and COUNTIF:
=SUMPRODUCT(B1:B6/COUNTIF(A1:A6,A1:A6))
Here the step by step explanation:
COUNTIF(A1:A6, A1:A6) will produce an array with the frequency of the values: A1:A6. In our case it will be: {3, 3, 3, 1, 2, 2}
Then we have to do the following division: {10, 10, 10, 20, 5, 5}/{3, 3, 3, 1, 2, 2}. The result will be: {3.33, 3.33, 3.33, 20, 2.5, 2.5}. It replaces each value by that value divided by the size of its group, so each group adds up to a single copy of its value.
Summing the result we will get: (3.33+3.33+3.33) + 20 + (2.5+2.5) = 10 + 20 + 5 = 35.
Using the above trick we can just get the same result as if we just sum the first element of each group from the column A.
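The division trick can be verified with a minimal Python sketch, assuming (as the question states) that equal values in column A always carry equal values in column B. It computes the sum both ways: by dividing each value by its group size, and by literally taking the first occurrence of each group. `Fraction` is used only to avoid rounding noise from values such as 10/3.

```python
from fractions import Fraction

a = [1, 1, 1, 2, 3, 3]
b = [10, 10, 10, 20, 5, 5]

# Equivalent of COUNTIF(A1:A6, A1:A6): how often each row's key occurs.
counts = [a.count(key) for key in a]

# Equivalent of SUMPRODUCT(B1:B6 / COUNTIF(...)).
divided_total = sum(Fraction(value, n) for value, n in zip(b, counts))

# Direct approach: add B only the first time each key in A is seen.
seen = set()
first_total = 0
for key, value in zip(a, b):
    if key not in seen:
        seen.add(key)
        first_total += value

print(counts)                      # [3, 3, 3, 1, 2, 2]
print(divided_total, first_total)  # 35 35
```

Neither approach depends on the rows being sorted by column A.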
To make this dynamic, so it grows and shrinks with the data set use this:
=SUMPRODUCT($B$1:INDEX(B:B,MATCH(1E+99,B:B))/COUNTIF($A$1:INDEX(A:A,MATCH(1E+99,B:B)),$A$1:INDEX(A:A,MATCH(1E+99,B:B))))
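... or just SUMPRODUCT, comparing each row with the one above it. This assumes the data sits in rows 2 to 7 below a header row and that equal values in column A are adjacent (for example, sorted):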
=SUMPRODUCT(B2:B7, --(A2:A7<>A1:A6))

SUMPRODUCT with a conditional with two ranges to calculate

To calculate a margin (JAN) I need to calculate:
sales(loja1)*margin(loja1)+sales(loja2)*margin(loja2)+sales(loja3)*margin(loja3)
/
SUM(sales(loja1);sales(loja2);sales(loja3))
but I need to make this using a SUMPRODUCT. I tried:
=SUMPRODUCT((B3:B11="sales")*(C3:C11);(B3:B11="margin")*C3:C11))/SUMPRODUCT((B3:B11="sales")*(C3:C11))
but gave error!
When SUMPRODUCT is used to select cells within a range by comparing them with text, the result of each comparison is either TRUE or FALSE. You will need to convert these to 1's and 0's by putting '--' in front of the comparison, so that when it is multiplied by another range of cells you get the expected value.
SUMPRODUCT Example: Sum of column B where column A is equal to "Sales"
A B
1 | Sales 5
2 | Sales 6
3 | Margin 3
4 | Margin 2
Resulting Formula =SUMPRODUCT(--(A1:A4 = "Sales"),B1:B4)
How SUMPRODUCT works:
First, an array is returned that has TRUE for each value in A1:A4 that equals "Sales", and FALSE for each value that doesn't
Sales -> TRUE
Sales -> TRUE
Margin -> FALSE
Margin -> FALSE
Then the double negative converts TRUE to 1 and False to 0
1
1
0
0
Next, the first array (now the one with 1's and 0's) is multiplied by your second array (B1:B4) to get a new array
1st 2nd New Array
1 * 5 = 5
1 * 6 = 6
0 * 3 = 0
0 * 2 = 0
Finally all the values in the new array are summed to get your result (5+6+0+0 = 11)
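The four steps above can be mirrored in a few lines of Python. This is only a sketch of the arithmetic, with a hypothetical helper `sumproduct` standing in for the Excel function; `int(...)` plays the role of the double negative.

```python
from math import prod


def sumproduct(*arrays):
    """Multiply the arrays element by element and sum the products."""
    return sum(prod(values) for values in zip(*arrays))


labels = ["Sales", "Sales", "Margin", "Margin"]   # column A
amounts = [5, 6, 3, 2]                            # column B

# Comparison gives True/False; int() converts to 1/0 like '--' in Excel.
mask = [int(label == "Sales") for label in labels]

total = sumproduct(mask, amounts)
print(mask)   # [1, 1, 0, 0]
print(total)  # 11
```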
Step 1:
For your scenario, you're going to need to find the sales amount for each location and multiply it by the margin for the corresponding location
location 1: sales * margin
=SUMPRODUCT(--(A3:A11="loja1"),--(B3:B11="venda"),(C3:C11)) * SUMPRODUCT(--(A3:A11="loja1"),--(B3:B11="margem"),(C3:C11))
You can do a similar formula for location 2 and 3 and then sum them all together.
Step 2:
To sum the sales for all locations, you can use a similar formula, again with the double negative, i.e. "--"
=SUMPRODUCT(--(B3:B11="sales"),(C3:C11))
The resulting formula will be a bit long, but when you divide Step 1 by Step 2, you'll get the desired result
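To make the target quantity concrete, here is a minimal Python sketch of Step 1 divided by Step 2, i.e. a sales-weighted average margin. The question does not show its data, so the rows below (store names, the "sales"/"margin" labels, and all numbers, with margins written in percent) are hypothetical.

```python
# Hypothetical layout: (store, kind, value), like columns A, B and C.
rows = [
    ("loja1", "sales", 100), ("loja1", "margin", 10),
    ("loja2", "sales", 200), ("loja2", "margin", 20),
    ("loja3", "sales", 700), ("loja3", "margin", 30),
]


def lookup(store, kind):
    """Sum of values matching both criteria, like a two-condition SUMPRODUCT."""
    return sum(v for s, k, v in rows if s == store and k == kind)


stores = sorted({store for store, _, _ in rows})

# Step 1: sales * margin per store, summed over all stores.
numerator = sum(lookup(s, "sales") * lookup(s, "margin") for s in stores)
# Step 2: total sales over all stores.
denominator = sum(lookup(s, "sales") for s in stores)

weighted_margin = numerator / denominator
print(weighted_margin)  # 26.0 (percent)
```

With these numbers the largest store dominates, so the weighted margin of 26 is higher than the plain average of the three margins (20).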
