Excel - find the biggest gap between numbers in rows

I have an Excel file with more than 12,500 rows in one column.
Each row contains a string of 20 random numbers, like these:
2,3,4,6,7,8,12,13,14,24,30,42,45,46,48,50,56,58,**59**,61
1,2,6,8,11,12,13,16,17,21,24,27,28,33,34,42,44,48,58,61
3,7,10,13,14,15,18,21,23,24,25,29,30,34,37,48,51,56,57,60
8,11,13,16,17,19,21,27,29,35,36,39,42,44,46,50,53,54,57,60
2,4,7,9,21,26,28,30,32,34,35,37,38,39,43,44,50,60,61,62
10,13,15,18,21,22,23,24,25,26,40,42,48,49,51,52,56,**59**,61,62
1,2,4,7,14,15,18,20,24,29,30,32,35,41,42,50,52,55,58,62
1,4,8,9,10,12,17,24,25,33,37,41,43,44,46,49,52,**59**,61,62
1,2,4,6,9,12,15,17,21,24,30,31,32,36,41,44,47,48,51,58
2,7,10,12,15,16,20,24,25,27,30,33,39,44,45,52,54,55,58,60
5,7,10,11,20,22,24,31,32,33,36,38,39,41,43,47,50,52,56,58
3,6,8,9,14,15,19,21,25,28,34,37,39,45,47,54,55,56,57,**59**
1,2,3,4,5,8,14,15,18,20,23,31,33,37,42,45,46,51,52,55
I need to know the biggest gap between rows in which a given number appears. For example, I search for a number (e.g. 59) and I need the largest number of consecutive rows in which 59 does not appear.
In this example the gap between 59s is 4 rows.
I hope I have made myself clear.

This seems like a fun problem with a simple but not quite obvious answer. First, make sure that the data is in 20 columns (use the Text to Columns feature on the Data tab). Using your example, I set up a spreadsheet with the data in columns A to T.
V1 holds the target number. The formulas are in column U.
In U1 I entered:
=IF(ISNA(MATCH($V$1,A1:T1,0)),1,0)
This formula uses MATCH to test whether the value in V1 appears in the range to its left. If it doesn't, MATCH returns #N/A, and ISNA checks for this error value. If the error is present, the overall formula returns 1 (since there is now 1 consecutive row without the target number); otherwise it returns 0.
The formula in U2 is similar with a little twist:
=IF(ISNA(MATCH($V$1,A2:T2,0)),1+U1,0)
The same basic logic -- but rather than returning 1 if the target number isn't present it adds 1 to the number above. The formula is then copied down the rest of the range. It has the effect of keeping a running total of consecutive rows without the target value. This running total is reset to 0 whenever a row with the target value is encountered.
The final ingredient requires no comment. In U14 I just have
=MAX(U1:U13)
which is the number you are looking for (assuming that the maximum number of consecutive rows without the target number is what you are looking for, even if this occurs either at the top or bottom of the data. If you want the largest gap that is literally between two rows where the number occurs, the logic would need to be made more complex).
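As a cross-check outside Excel, the running-count logic of column U can be sketched in Python. This is only an illustration, not part of the spreadsheet solution: the function names are made up here, and the second function is one possible reading of the "literally between two rows" variant mentioned above.

```python
def longest_run_without(rows, target):
    """Longest streak of consecutive rows that do not contain target."""
    longest = 0
    run = 0  # plays the role of column U: reset to 0 on a row with the target
    for row in rows:
        run = 0 if target in row else run + 1
        longest = max(longest, run)
    return longest


def longest_gap_between_hits(rows, target):
    """Longest streak lying strictly between two rows that contain target."""
    hits = [i for i, row in enumerate(rows) if target in row]
    return max((b - a - 1 for a, b in zip(hits, hits[1:])), default=0)


# The 13 sample rows from the question (bold markers removed).
RAW = """
2,3,4,6,7,8,12,13,14,24,30,42,45,46,48,50,56,58,59,61
1,2,6,8,11,12,13,16,17,21,24,27,28,33,34,42,44,48,58,61
3,7,10,13,14,15,18,21,23,24,25,29,30,34,37,48,51,56,57,60
8,11,13,16,17,19,21,27,29,35,36,39,42,44,46,50,53,54,57,60
2,4,7,9,21,26,28,30,32,34,35,37,38,39,43,44,50,60,61,62
10,13,15,18,21,22,23,24,25,26,40,42,48,49,51,52,56,59,61,62
1,2,4,7,14,15,18,20,24,29,30,32,35,41,42,50,52,55,58,62
1,4,8,9,10,12,17,24,25,33,37,41,43,44,46,49,52,59,61,62
1,2,4,6,9,12,15,17,21,24,30,31,32,36,41,44,47,48,51,58
2,7,10,12,15,16,20,24,25,27,30,33,39,44,45,52,54,55,58,60
5,7,10,11,20,22,24,31,32,33,36,38,39,41,43,47,50,52,56,58
3,6,8,9,14,15,19,21,25,28,34,37,39,45,47,54,55,56,57,59
1,2,3,4,5,8,14,15,18,20,23,31,33,37,42,45,46,51,52,55
"""
SAMPLE = [set(map(int, line.split(","))) for line in RAW.strip().splitlines()]

print(longest_run_without(SAMPLE, 59))       # 4
print(longest_gap_between_hits(SAMPLE, 59))  # 4
```

For the sample data both readings give 4, but they differ whenever the longest streak sits at the very top or bottom of the data.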

Related

Get count only if all parameters are within their LSL & USL

From the data table below, I am trying to count the samples in which every value lies within its specification limits. For a sample to be counted, all 3 values must meet their specification requirements. I used the SUMPRODUCT function, checking each column against its specification limits, with the following formula.
=SUMPRODUCT((B3:B8>=B9)*(B3:B8<=B10),(C3:C8>=C9)*(C3:C8<=C10),(D3:D8>=D9)*(D3:D8<=D10))
But when I deal with more columns, this gets more complex.
My question is: is there any other way to check all columns at once, to reduce the formula's complexity?
Note: values highlighted in red are outside the specification limits. Only rows 2 & 6 are counted in the returned result.
You can use SUMPRODUCT/MMULT:
=SUMPRODUCT(--(MMULT((B3:D8>=B9:D9)*(B3:D8<=B10:D10),ROW(A1:A3)^0)=3))
Just remember that the second parameter of MMULT must specify the number of rows corresponding to the number of columns of the first parameter, i.e. B3:D8 = 3 columns => A1:A3 = 3 rows. The comparison with 3 also changes accordingly.
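The row-wise tally that MMULT produces can be sketched in Python for anyone who wants to check a result outside Excel. The data and the function name below are hypothetical; the original table is not reproduced here.

```python
def count_in_spec(samples, lower, upper):
    """Count rows in which every value lies within its column's [LSL, USL]."""
    count = 0
    for row in samples:
        # Per-row number of passing columns, like MMULT(..., ROW(A1:A3)^0).
        passed = sum(lo <= value <= hi
                     for value, lo, hi in zip(row, lower, upper))
        if passed == len(row):  # the "=3" comparison, for any column count
            count += 1
    return count


# Hypothetical samples with 3 parameters each, plus per-column limits.
samples = [(5.0, 10.2, 1.1), (4.9, 9.8, 0.9), (6.2, 10.0, 1.0)]
lsl = (4.5, 9.5, 0.8)
usl = (5.5, 10.5, 1.2)

print(count_in_spec(samples, lsl, usl))  # 2 (the third row fails column 1)
```

Comparing against `len(row)` rather than a literal 3 mirrors the advice above: the threshold has to follow the number of columns.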

Excel multilevel array formula with partial string matches to sum resultant cells

I've been trying to sort this for over a day now without much luck. I have successfully used SUMIFS, INDEX, MATCH, COUNTIF, "--" etc. array functions previously and am not a novice, but also not an expert on these. I can't seem to weave these together correctly, and am likely on an altogether incorrect path.
Basically, I am trying to aggregate data from multiple spreadsheets, requiring a mapping of various items (rows) into a canonical form for summing.
The image here shows a representative, but simplified version of my quest. Each "region" on this example spreadsheet (Final..., Mapping, DataSet1, DataSet2) is actually in different spreadsheets, and there are several sheets with 50-150 rows in each xlsx.
Note that the names in Column B are quite arbitrary (meaning not all P1's have an 'x' pattern, like shown here as x1, x2, etc.). Do not rely on any pattern in the names, except that the x, y, z in the Mapping table are substrings (case insensitive, trailing match) of the names in Column B in the DataSets.
In the image, the Final Result Table (summed manually) is what I want to compute via an (array) formula. A single formula would be ideal: I have many spreadsheets from which the monthly data is being pulled, so I can't readily modify them, but I can create an interim spreadsheet if required, so I am open to helper columns or helper rows.
Here's the process:
1. For each name (B3-B5) in the Final Result Table, I want to sum the name from its components as follows.
2. Look up all the matches in the Mapping Table (so for P1, the formula =IF($C$10:$C$15=$B3, $B$10:$B$15,"") gives {"x1";"";"";"x2";"";"x3"}).
3. I then want to search each of x1, x2, and x3 in B19:B26 to get rows 21, 22, 24, 25, 26 in DataSet1 and B31:B35 to get row 32 in DataSet2, to then add up the Jan totals into C3. (Effectively, C3=C21+C22+C24+C25+C26+C32). Same for P2 and P3, and thru Feb, Mar, ...
I am stuck on how to remove blank or 0 or Div0 or such "error rows" from the interim result in 2, and also need to use 2 arrays of different sizes (3 valid rows in example 2 above, ignoring blanks) to search many rows in DataSets. I tried SEARCH("*"&IF($C$10:$C$15=$B3, $B$10:$B$15,""), $B$19:$B$26) but get unexpected results. I have tried to replace text in the interim result {"x1";"";"";"x2";"";"x3"} with TRUE/FALSE, and 1/0, etc. to help with INDEX or MATCH, but am stymied by errors in downstream ("surrounding") formulas.
Thanks in advance.
Here is a solution without resorting to nasty (imo) CSE formulas.
= SUMPRODUCT($C$19:$F$26*(COUNTIFS($B$10:$B$15, RIGHT($B$19:$B$26,2),$C$10:$C$15,$B3)>0)*($C$18:$F$18=C$2))
+
SUMPRODUCT($C$31:$F$35*(COUNTIFS($B$10:$B$15, RIGHT($B$31:$B$35,2),$C$10:$C$15,$B3)>0)*($C$30:$F$30=C$2))
There is one SUMPRODUCT for each data set. If possible, it would be better to put all your data sets into a single table with a column identifying which data set each row belongs to.
It works by taking each value in your data set and multiplying it by whether the 2 rightmost characters of its name appear in your mapping table for that P code, and by whether the value is in the correct month. So a value contributes 0 if either of those conditions is false. The formula then returns the sum.
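The same filter-then-sum idea can be sketched in Python. Everything in this block is an assumption made for illustration: the names, the numbers, and the mapping entries for P2 and P3 are invented, and the sketch uses a case-insensitive trailing match rather than exactly the 2 rightmost characters.

```python
def sum_for_code(code, month, mapping, datasets):
    """Sum the month values of rows whose name ends with a suffix mapped to code."""
    suffixes = tuple(suffix.lower() for suffix, c in mapping if c == code)
    total = 0
    for rows in datasets:  # one pass per data set, like one SUMPRODUCT each
        for name, values in rows:
            # A row contributes only if its name ends with a mapped suffix.
            if name.lower().endswith(suffixes):
                total += values.get(month, 0)
    return total


# Hypothetical mapping table: (suffix, P code).
mapping = [("x1", "P1"), ("y1", "P2"), ("z1", "P3"),
           ("x2", "P1"), ("y2", "P2"), ("x3", "P1")]

# Hypothetical data sets: (name, {month: value}).
dataset1 = [("AAAAAAAAx1", {"Jan": 10, "Feb": 1}),
            ("BBBBBBBBy1", {"Jan": 5}),
            ("CCCCCCCCX2", {"Jan": 7})]
dataset2 = [("DDDDDDDDx3", {"Jan": 3}),
            ("EEEEEEEEz1", {"Jan": 100})]

print(sum_for_code("P1", "Jan", mapping, [dataset1, dataset2]))  # 20
```

With 2-character suffixes the trailing match agrees with `RIGHT(name,2)`; for suffixes of mixed lengths the two approaches can differ.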
UPDATE IN RESPONSE TO OP COMMENTS
If the X, Y, Z codes are not always 2 digits but the first part is ALWAYS 8 digits, you can easily amend:
RIGHT($B$19:$B$26,2)
to be:
RIGHT($B$19:$B$26,LEN($B$19:$B$26)-8)
Making the formula for the first data set:
=SUMPRODUCT($C$19:$F$26*(COUNTIFS($B$10:$B$15, RIGHT($B$19:$B$26,LEN($B$19:$B$26)-8),$C$10:$C$15,$B3)>0)*($C$18:$F$18=C$2))
And you can amend for other data sets and simply add them together.
Nice challenge! Are you willing to drop all your tables (DataSet1, DataSet2...) into one spreadsheet, so that we can refer just one single range for each month?
Here's one solution (hopefully a good starting point) - array formula (Ctrl+Shift+Enter):
=SUMPRODUCT(IFERROR(IF(TRANSPOSE(IF($B3=$C$10:$C$15,$B$10:$B$15,""))=RIGHT($B$18:$B$36,2),C$18:C$36,0),0))

(Excel) Giving the smallest value between 2 values

Excel data table
I am new here. If I have made any mistakes in this post, please tell me and I will correct them.
In the picture above, I want to extract two values from column D: the smallest value between row 77 and row 84, and the smallest value between row 84 and row 97. The expected results are shown in P77 and P84 respectively.
How should I write the Excel formula for this? Or does it need VBA?
Thanks a lot for your help!
(Update)
data set
The picture above is another capture of my data set, filtered to show only the days with "Bullish breaking candle/bearish breaking candle".
Thanks
There are lots of functions/ways to calculate Minimum in addition to the MIN function and it is worth being familiar with them as you will require different ones according to your data.
So quick rundown of some of the main offerings:
SMALL function:
I would consider also the more versatile SMALL function
=SMALL(D77:D84,1) in cell P77
=SMALL(D84:D97,1) in cell P84
You put the array (the range of cells to compare) then the k-th smallest item in that range that you want to retrieve e.g. put 1 to get the smallest, as above, comparable to MIN function, or 2 to get the second smallest etc.
Official blurb below:
Description
Returns the k-th smallest value in a data set. Use this function to
return values with a particular relative standing in a data set.
Syntax
SMALL(array, k)
The SMALL function syntax has the following arguments:
Array Required. An array or range of numerical data for which you
want to determine the k-th smallest value.
K Required. The position (from the smallest) in the array or range
of data to return.
AGGREGATE Function:
Consider the even more versatile AGGREGATE function which can cope with hidden rows in the range, errors etc. You can specify a host of additional requirements whilst still getting the minimum value
General syntax for first form:
AGGREGATE(function_num, options, ref1, [ref2], …)
Function 5 is Minimum. The full list of options is in the function's documentation; option 7 ignores errors and hidden rows. So, you could use:
=AGGREGATE(5,7,D77:D84)
The AGGREGATE option above is the only version that will still return the minimum correctly if there is an error in the range D77:D84 e.g. a DIV/0 error.
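To make the difference concrete, here is a rough Python sketch of the two behaviours described so far: picking the k-th smallest value, and taking a minimum that skips error cells. The cell values are hypothetical, hidden rows are not modelled, and returning `None` for a range with no numbers is a choice made for this sketch rather than what Excel does.

```python
def kth_smallest(values, k):
    """k-th smallest value, counting from 1, like SMALL(array, k)."""
    return sorted(values)[k - 1]


def min_ignoring_errors(cells):
    """Minimum of the numeric cells; error strings such as '#DIV/0!' are skipped."""
    numbers = [c for c in cells
               if isinstance(c, (int, float)) and not isinstance(c, bool)]
    return min(numbers) if numbers else None


# Hypothetical contents of a range that includes one error cell.
column_d = [12.5, 9.75, "#DIV/0!", 11.0]

print(kth_smallest([12.5, 9.75, 11.0], 1))  # 9.75
print(kth_smallest([12.5, 9.75, 11.0], 2))  # 11.0
print(min_ignoring_errors(column_d))        # 9.75
```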
SUBTOTAL Function:
Similar to the AGGREGATE function is the SUBTOTAL function.
You can use SUBTOTAL(5, D77:D84) where 5 specifies you want the minimum for the range. This will not ignore errors. SUBTOTAL(105,D77:D84) will ignore hidden rows though.
Simply put the formula '=Min(D77:D84)' in cell P77
and '=Min(D84:D97)' in cell P84

How to find the index of remaining columns if the data is repetitive

I have data entries like this:
Data entries
Now, I need to find the smallest 10 values and also get the corresponding person, area and date for each of them.
I used the SMALL function to find the 10 smallest values. Then I used the INDEX and MATCH functions to get their corresponding row entries. The problem is that some values are repeated, so these functions return the row of the first 2 for all the remaining 2s. How can I solve this?
In F2 use Rank like this, so you have unique numbers:
=RANK(C2,$C$2:$C$21,1)+ROW()/1000
In G2 use SMALL to pull the smallest of the ranked numbers, and copy down 10 rows.
=SMALL($F$2:$F$21,ROW(A1))
Now you can pull person, date, real hours and area with an index match in H2, copied across and down.
=INDEX(A$2:A$21,MATCH($G2,$F$2:$F$21,0))
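The tie-breaking trick in column F can be sketched in Python as follows. The records and the function name are hypothetical; the sketch assumes the data starts in sheet row 2, as in the formulas above.

```python
def smallest_n_with_ties(records, n, key_index):
    """Return the n records with the smallest key, keeping duplicates distinct."""
    values = [record[key_index] for record in records]
    helper = []
    for pos, value in enumerate(values):
        rank = 1 + sum(v < value for v in values)  # ascending rank, ties share it
        sheet_row = pos + 2                        # data assumed to start in row 2
        helper.append(rank + sheet_row / 1000)     # the fraction makes ties unique
    order = sorted(range(len(records)), key=lambda i: helper[i])
    return [records[i] for i in order[:n]]


# Hypothetical (person, hours, area) records with a duplicated value of 2.
records = [("Ann", 2, "North"), ("Bob", 5, "East"),
           ("Cy", 2, "South"), ("Dee", 1, "West")]

for person, hours, area in smallest_n_with_ties(records, 3, key_index=1):
    print(person, hours, area)
# Dee 1 West
# Ann 2 North
# Cy 2 South
```

Note that the added fraction only stays below 1 while the row number is below 1000, so a larger divisor is probably safer for longer tables.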

Rank the top 5 entries in different criteria

I have a table in which I want to find the top X people in each of the different groups.
Unique Names Number Group
a 30 1
b 4 2
c 19 3
d 40 2
e 1 1
f 9 2
g 15 3
I've ranked the top 5 people by number using =index($A$2:$A$8,match(large($B$2:$B$8,1),$B$2:$B$8,0)). I linked the 1 in the LARGE function to a ranked range, so that the number changes as I drag the formula down.
What I would like to do next is rank the top x people in each group, e.g. the top 3 in group 1.
I tried =index($A$2:$A$8,match("1"&large($B$2:$B$8,1),$C$2:$C$8&$B$2:$B$8,0)) but it didn't seem to work.
Thanks
EDIT: After looking at the answers below I have realised why they are not working for me. My actual data that I want to use the formula with have multiple entries of numbers. I have adjusted the example data to show this. The problem I have is that if there are duplicate numbers then it returns both of the names even if one is not in the group.
Unique Names Number Group
a 30 1
b 30 2
c 19 3
d 40 2
e 1 1
f 30 2
g 15 3
Proof of Concept
Use the following formula in the example above in cell F2 and copy down and to the right as needed.
=IFERROR(INDEX($A$2:$A$8,MATCH(AGGREGATE(14,6,($C$2:$C$8=F$1)*($B$2:$B$8),ROW($A2)-1),$B$2:$B$8,0)),"")
In the header row, provide the group numbers, or come up with a formula to increment and reset the group number as you copy down, based on the X number in your question.
Explanation:
Unlike the LARGE function, the AGGREGATE function can perform array operations without the need to use CSE. As such, we can add criteria to what we want to use. In this case only one criterion was used: the group number. In the formula it is the following part:
($C$2:$C$8=F$1)
If there were multiple criteria, we would use either a + operator as an OR or a * operator as an AND.
The 6 option in the AGGREGATE function allows us to ignore errors. This is useful when retrieving the k-th smallest value. It is also useful for dealing with other data that may cause errors that do not need to be worried about.
As this is technically an array operation, avoid using full column/row references, as they can bog down your system.
In outline, the overall formula builds a list of the numbers that match the group you are interested in. After filtering your numbers, it determines which is the largest, second largest, etc. according to the row you have copied down to. It then finds the row in which the nth largest number occurs using the MATCH function, and finally returns the name in that row using the INDEX function.
Building on all the other great answers.
Because you have the possibilities of duplicate values in each group we need to do this with two formulas.
First we need to get the numbers in order. I used the Aggregate, but this could be done with the array LARGE(IF()) also:
=IFERROR(AGGREGATE(14,6,$B$2:$B$8/($C$2:$C$8=E$1),ROW(1:1)),"")
Then, with those numbers in order to reference, we can use a modified version of @ForwardEd's formula, using COUNTIF() to ensure we get the correct name in return.
=IFERROR(INDEX($A$2:$A$8,AGGREGATE(15,6,(ROW($B$2:$B$8)-ROW($B$2)+1)/(($C$2:$C$8=F$1)*($B$2:$B$8=E3)),COUNTIF(E$2:E2,E3)+1)),"")
This will count the number in the results returned and then bring in the correct name.
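For comparison, the intended result (top n per group, with duplicate numbers resolved in sheet order) can be sketched in Python using the adjusted example data from the question. The function name is made up for this sketch.

```python
def top_n_in_group(rows, group, n):
    """Top n (name, number) pairs within one group, largest number first."""
    members = [(name, number) for name, number, g in rows if g == group]
    # list.sort is stable, so names with equal numbers keep their sheet order,
    # which matches picking the first, then the second matching row.
    members.sort(key=lambda item: item[1], reverse=True)
    return members[:n]


# The adjusted example table from the question: (name, number, group).
rows = [("a", 30, 1), ("b", 30, 2), ("c", 19, 3), ("d", 40, 2),
        ("e", 1, 1), ("f", 30, 2), ("g", 15, 3)]

print(top_n_in_group(rows, 2, 3))  # [('d', 40), ('b', 30), ('f', 30)]
print(top_n_in_group(rows, 1, 3))  # [('a', 30), ('e', 1)]
```

Filtering by group before ranking is what prevents a name from another group with the same number (here "a" with 30) from being returned.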
You could also solve this with array formulas - to filter a group whose name is stored in E1, your code
=INDEX($A$2:$A$8,MATCH(LARGE($B$2:$B$8,1),$B$2:$B$8,0))
would then be adapted to
=INDEX($A$2:$A$8,MATCH(LARGE(IF($C$2:$C$8<>E1,-1,$B$2:$B$8),1),$B$2:$B$8,0))
Note: After entering an array formula, you have to press CTRL+SHIFT+ENTER.
Thank you to everyone who offered help but for some reason none of your methods worked for me, which I am sure was to do with the quality of my data. I used an alternate method in the end which is slightly convoluted but seemed to work.
=IF($C2="1",RANK($B2,$B$2:$B$8,1)+ROW()/10000,-1)
Essentially using the rank function and adding a fraction to separate out duplicate values.
