top matching cells only - excel

I have a simple formula in an Excel sheet with 3000 rows. Is there a way to return only a certain percentage of the top matching cells? I only want the top 9% (270 rows) that match my criteria to be shown, and the other rows to be hidden.
Here is my formula:
= IF(AND(H2=1,O2>=58)," A",IF(AND(H2=2,O2>=58)," B",IF(AND(H2=5,O2>=55)," C")))
I did it with a pivot table, but I want to do it in the same sheet by extending the same formula.
Thanks

Refer to the screenshot; the formula is:
=LET(fil,FILTER($C$4:$C$3000,--($C$4:$C$3000>=500)),FILTER(fil,--(fil>=PERCENTILE(fil,1-E3))))
Pre-requisite: Office 365 compatible version of Excel
Notes: cell E3 (screenshot) = 8% (custom number format, including the quotation marks: "top "0%" of filtered"; likewise, cell I3 uses: "top "0%)
Advantages:
Auditability
Understandability
Preserves original ordering
Moderate calc. speed
Quick implementation
No interim calcs
Disadvantages:
Requires Office 365
References:
LET function: not essential but quite useful/convenient (the longer alternative without LET becomes unwieldy to work with and is likely to be more error-prone than the more succinct equivalent provided above)
FILTER function
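The two-stage logic of the LET/FILTER formula can also be sketched outside Excel. The following is a minimal Python illustration, not the answerer's method: the function names and the sample scores are made up, and percentile_inc is a hand-written inclusive percentile with linear interpolation, which is assumed to behave like Excel's PERCENTILE. Note that keeping the top fraction p of the rows means cutting at the (1 - p) percentile.

```python
def percentile_inc(values, k):
    """Inclusive percentile with linear interpolation between ranks."""
    ordered = sorted(values)
    position = k * (len(ordered) - 1)
    lower = int(position)          # whole part of the rank
    fraction = position - lower    # distance towards the next rank
    if lower + 1 < len(ordered):
        return ordered[lower] + fraction * (ordered[lower + 1] - ordered[lower])
    return ordered[lower]


def top_fraction(values, minimum, fraction):
    # Stage 1: keep only the values that meet the criterion (the inner FILTER).
    matching = [v for v in values if v >= minimum]
    if not matching:
        return []
    # Stage 2: the top `fraction` of the matches sits at or above
    # the (1 - fraction) percentile of those matches.
    cutoff = percentile_inc(matching, 1 - fraction)
    # Filtering rather than sorting keeps the original row order.
    return [v for v in matching if v >= cutoff]


scores = [5, 600, 700, 100, 900, 800]   # hypothetical column C values
print(top_fraction(scores, 500, 0.5))   # [900, 800]
```

Because the second stage filters instead of sorting, the surviving rows stay in their original order, which is the "preserves original ordering" advantage listed above.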

Related

How to identify and name different ranges in one column in Excel?

I have a recorded drive cycle of a truck which includes speed and coordinates over time. The file looks like this (simplified version)
I want to know the number of times the truck is stopped (when speed = 0) and number them. Therefore, I would like to group the intervals with '0' values (vehicle stopped) in the Speed column and name them in order (Stop 1, Stop 2, etc). Ultimately, my goal would be to somehow be able to calculate the number of stops and duration like this:
Is there any function in Excel which would allow me to do something like that? Thank you.
You can do it using a Pivot Table and a helper column or, if you have Excel 365, with functions like FILTER, UNIQUE and SUMIFS.
The helper column formula just numbers each group of stops. Note that the data need to be sorted as in your example, or it won't work:
=IF(D2="STOP";IF(D1="STOP";E1;MAX($E$1:E1)+1);"")
Then pivot table can be inserted:
Helper Column and Vehicle Stop into rows section
Time-step seconds into values section, formatted as Time and operation=Sum
Filter field Helper Column to exclude blanks
If you have Excel 365, you can get this output with advanced formulas. In cell J14 formula is:
=UNIQUE(FILTER(D2:D19&E2:E19;D2:D19="STOP"))
And in K14 (drag down) it is:
=SUMIFS($B$2:$B$19;$D$2:$D$19;"STOP";$E$2:$E$19;MID(J14;5;99))
Anyway, I've uploaded the workbook to Google Drive so you can see the Pivot Table and the formulas for yourself. If you don't have Excel 365, the formula part may give some errors when you open it:
https://docs.google.com/spreadsheets/d/1t1INxGKJRHFexWnSC5LlIi37JrThjajs/edit?usp=sharing&ouid=114417674018837700466&rtpof=true&sd=true
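For readers who want to check the helper-column logic away from the workbook, here is a minimal Python sketch of the same idea: number each run of consecutive "STOP" rows, then sum the time steps per number (the SUMIFS step). The function names and sample rows are made up for illustration, and the rows are assumed to be in time order.

```python
def number_stops(statuses):
    """Give each run of consecutive "STOP" rows its own number (the helper column)."""
    ids = []
    current = 0
    previous_was_stop = False
    for status in statuses:
        if status == "STOP":
            if not previous_was_stop:
                current += 1          # a new stop starts here
            ids.append(current)
            previous_was_stop = True
        else:
            ids.append(None)          # blank helper cell for moving rows
            previous_was_stop = False
    return ids


def stop_durations(time_steps, statuses):
    """Sum the time steps of each numbered stop."""
    totals = {}
    for step, stop_id in zip(time_steps, number_stops(statuses)):
        if stop_id is not None:
            totals[stop_id] = totals.get(stop_id, 0) + step
    return totals


statuses = ["GO", "STOP", "STOP", "GO", "STOP"]   # hypothetical Vehicle Stop column
time_steps = [1, 2, 3, 4, 5]                      # hypothetical time-step seconds
print(number_stops(statuses))                     # [None, 1, 1, None, 2]
print(stop_durations(time_steps, statuses))       # {1: 5, 2: 5}
```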
Just as a postscript to this, you could do the same thing without using helper columns by comparing offset ranges to see where the transitions in speed occurred from 0 to 10 (go) or 10 to 0 (stop). The problem is that normally you would have to go outside the data range (a2:a16 and c2:c16) which gets you either a header cell (a1 or c1) or a blank cell (a17 or c17). In the case of time, there is also a special case where the first time-step is taken to be zero.
All this can be avoided in Excel 365 by using VSTACK to add appropriate dummy values to the beginning or end of the two ranges:
=LET(timeRange,A2:A16,
speedRange,C2:C16,
timeRange1,VSTACK(A2,timeRange),
speedRange1,VSTACK(1,speedRange),
speedRange2,VSTACK(speedRange,1),
startTime,FILTER(timeRange1,(speedRange1>0)*(speedRange2=0)),
endTime,FILTER(timeRange1,(speedRange1=0)*(speedRange2>0)),
stopTime,endTime-startTime,
label,"STOP "&SEQUENCE(ROWS(stopTime)),
HSTACK(label,stopTime))
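The padding trick in the LET formula above can be mirrored in a short Python sketch. This is only an illustration under assumptions: the function name and the sample data are hypothetical, and the lists are padded exactly as the VSTACK calls do, so each start and end time comes from the row before the transition.

```python
def stop_table(times, speeds):
    # Pad the ranges the same way the VSTACK calls do: repeat the first
    # time, and add a dummy "moving" speed of 1 before and after the data.
    times1 = [times[0]] + list(times)
    previous = [1] + list(speeds)    # speed on the row above
    current = list(speeds) + [1]     # speed on this row
    rows = list(zip(times1, previous, current))
    # Moving then stopped: a stop begins.
    starts = [t for t, p, c in rows if p > 0 and c == 0]
    # Stopped then moving: the stop ends.
    ends = [t for t, p, c in rows if p == 0 and c > 0]
    return [
        (f"STOP {n}", end - start)
        for n, (start, end) in enumerate(zip(starts, ends), start=1)
    ]


times = [0, 1, 2, 3, 4, 5]        # hypothetical time column
speeds = [10, 0, 0, 10, 0, 10]    # hypothetical speed column
print(stop_table(times, speeds))  # [('STOP 1', 2), ('STOP 2', 1)]
```

The dummy speed of 1 at both ends means a recording that begins or finishes with the truck stopped still produces a matching start and end.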

Matching all combinations (values) from matrix-like-table using drop-down

My workbook has two sheets. The first, Test1, holds my drop-downs (from A22 downwards), and A8-A11 hold the matching fields, which are colored when the corresponding match is "hit". In my case FALSE is (_) and TRUE is (1) when you look at the matrix table on sheet 2.
Sheet 2 (Matrix) holds a matrix table whose horizontal and vertical axes have the same headers, with (_) or (1) at each intersection. Every intersection has to be FALSE for the system to be "green (all FALSE)" and sellable; if even one part is "red (TRUE)", the combination is not supported.
Example from matrix:
For example, 070FX has (1) at its intersections with CE0 and D01, and in that case it should hit TRUE. So all three parts should be "red (TRUE)", as in the 3rd picture.
If you check the TRUE/FALSE results of my formula (in A13 and A14), you can understand it slightly better:
=SUMPRODUCT((Matrix!$A$2:$A$103=A12)*((Matrix!$B$1:$CV$1=$A$9)+(Matrix!$B$1:$CV$1=$B$9)+(Matrix!$B$1:$CV$1=$C$9)+(Matrix!$B$1:$CV$1=$D$9)+(Matrix!$B$1:$CV$1=$E$9)+(Matrix!$B$1:$CV$1=$F$9)+(Matrix!$B$1:$CV$1=$G$9)+(Matrix!$B$1:$CV$1=$H$9)+(Matrix!$B$1:$CV$1=$I$9)+(Matrix!$B$1:$CV$1=$J$9)+(Matrix!$B$1:$CV$1=$K$9)+(Matrix!$B$1:$CV$1=$A$12)+(Matrix!$B$1:$CV$1=$B$12)+(Matrix!$B$1:$CV$1=$C$12)+(Matrix!$B$1:$CV$1=$D$12)+(Matrix!$B$1:$CV$1=$E$12)+(Matrix!$B$1:$CV$1=$F$12)+(Matrix!$B$1:$CV$1=$G$12)+(Matrix!$B$1:$CV$1=$I$12)+(Matrix!$B$1:$CV$1=$J$12)+(Matrix!$B$1:$CV$1=$K$12))*(NOT(ISERROR(1/VALUE(Matrix!$B$2:$CV$103)=1))))>0
You may be asking why there are two rows of formulas (A13 and A14). It is actually one formula, but I split it across two rows for printing, since this document should fit on one page only.
The problem I have is making this more dynamic and easier to read and understand. My formula is a SUMPRODUCT with hard-coded arrays, which is not what I need: I realised recently that our document changes often, and parts are sometimes added or deleted. Because the arrays are hard-coded, you can imagine how much effort it takes to adjust them, and explaining to someone how it works is also a pain.
I hope there is a different way to do this, maybe another set of functions or even Power Query, as the most dynamic tool in Excel.
https://docs.google.com/spreadsheets/d/1UC0cgsVCm0ekbtu7Wsjpy8PdJ76o8HNq/edit?usp=sharing&ouid=101738555398870704584&rtpof=true&sd=true
The formula below could be used in cell A13 and then copied across
=SUMPRODUCT((_0359_matrix[_]=A9)*(IFNA(IF(MATCH(COLUMN(_0359_matrix[[#Headers],[070FX]:[YS1]])-1,MATCH($A$9:$K$9,_0359_matrix[[#Headers],[070FX]:[YS1]],0),0),1,0),0)+IFNA(IF(MATCH(COLUMN(_0359_matrix[[#Headers],[070FX]:[YS1]])-1,MATCH($A$12:$K$12,_0359_matrix[[#Headers],[070FX]:[YS1]],0),0),1,0),0))*(NOT(ISERROR(1/VALUE(_0359_matrix[[070FX]:[YS1]])=1))))>0
and the formula below could be used in cell A14 and then copied across
=SUMPRODUCT((_0359_matrix[_]=A12)*(IFNA(IF(MATCH(COLUMN(_0359_matrix[[#Headers],[070FX]:[YS1]])-1,MATCH($A$9:$K$9,_0359_matrix[[#Headers],[070FX]:[YS1]],0),0),1,0),0)+IFNA(IF(MATCH(COLUMN(_0359_matrix[[#Headers],[070FX]:[YS1]])-1,MATCH($A$12:$K$12,_0359_matrix[[#Headers],[070FX]:[YS1]],0),0),1,0),0))*(NOT(ISERROR(1/VALUE(_0359_matrix[[070FX]:[YS1]])=1))))>0
Both formulae are longer than they could be, since they're using table referencing but, since you wanted your formulae to be dynamic, I think it's appropriate (without table referencing the formulae are approximately half the length of the originals).
The formulae are just shorter versions of your originals but there is a LOT of duplication of calculations in both sets, i.e. ALL 22 cells calculate the mid portion (the middle multiplicand) and the last portion (the last multiplicand) so, for the sake of performance, it would be sensible to have 2 helper cells (or 2 named formulae) which calculate these values once, and then just have the 22 formulae referring to these, thus shortening the formulae considerably.
The table referencing may, or may not, make the formula less readable, so I'm including a screenshot of my 'research'.
The data in row 1 are proxies for your table headers, and the data in row 3 are proxies for your row 9 (of Test1). The formula effectively does all the matching for a single row in aggregate, rather than having to sum individual results, as was the case in your original formulae. There are 2 'copies' of the formula because your headers (on Test1) are on 2 different rows due to your 'printing constraint'. You could make my formula shorter if all headers were in a single row, e.g. by putting the formula =A12 in cell L9, =B12 in M9 etc. (and possibly hiding those columns, to keep the sheet 'clean'). If you implemented this suggestion, the formula for cell A13 could be shortened to this:
=LET(hdrs,_0359_matrix[[#Headers],[070FX]:[YS1]],SUMPRODUCT((_0359_matrix[_]=A9)*(IFNA(IF(MATCH(COLUMN(hdrs)-1,MATCH($A$9:$V$9,hdrs,0),0),1,0),0))*(NOT(ISERROR(1/VALUE(_0359_matrix[[070FX]:[YS1]])=1))))>0)
which is less than 1/3 the length of your original formula.
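The test each SUMPRODUCT performs can be stated compactly: a selected part is "red" when its matrix row has a 1 under any selected header. The Python sketch below illustrates only that logic. The dictionary layout, the function name, and the matrix values are invented for the example (the part names are borrowed from the question); it is not a replacement for the worksheet formula.

```python
def conflict_flags(matrix, selected):
    """For each selected part, report whether it clashes with any selected part.

    `matrix` maps a row label to {column header: 1} for the marked
    intersections; anything missing is treated as "_" (no clash).
    """
    flags = {}
    for part in selected:
        row = matrix.get(part, {})
        # TRUE when at least one selected header is marked 1 in this row.
        flags[part] = any(row.get(other) == 1 for other in selected)
    return flags


# Hypothetical matrix contents, for illustration only.
matrix = {
    "070FX": {"CE0": 1, "D01": 1},
    "CE0": {"070FX": 1},
    "D01": {"070FX": 1},
    "YS1": {},
}
print(conflict_flags(matrix, ["070FX", "CE0", "D01"]))
# {'070FX': True, 'CE0': True, 'D01': True}
print(conflict_flags(matrix, ["070FX", "YS1"]))
# {'070FX': False, 'YS1': False}
```

Because the selection is passed in as a list, adding or removing parts does not require touching the logic, which is the kind of flexibility the question asks for.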

Specifying range from A2 till infinity (NO VBA)

Without VBA, I am trying to refer to a range that starts at A2 and never ends. For example, if I want row 2 till row 5 I'd do
$A$2:$A$5
But what if I want the end to be open?
$A$2:??
Is this possible?
Depending on what's in A1 and what formula you're putting the reference into, you could simply use A:A. For example, if you wanted to sum all of the values in column A, but A1 contained a column title rather than a number, you could still write =SUM(A:A) and the title in A1 would just be ignored.
A2:A works in many formulas in Google Sheets; Excel itself does not accept this open-ended form.
hope that helps
If you want to refer to a range starting from A2 until the last row (1048576, or 65536 for Excel prior to 2007), you can use this volatile formula: =OFFSET(A2,0,0,(COUNTBLANK(A:A)+COUNTA(A:A)-1),1). Use the formula as a defined range name or inside another formula that takes a range as an argument (e.g. SUM).
Another option (in case your formula is in A1, so accessing A:A would create a circular reference) is:
OFFSET(A2, 0, 0, ROWS(A:A)-1)
This uses ROWS to count the total number of rows (without actually accessing the rows!), subtracts 1 (because we're starting with the second row), and uses this result as the height of a range created with OFFSET.
This is another option based on a formula, using the example locations in the OP's question:
=A2:INDEX(A:A,MAX(FILTER(ROW(A:A),IF(ISBLANK(A:A),0,1)=1)))
The components are the following:
=MAX(FILTER(ROW(A:A),IF(ISBLANK(A:A),0,1)=1))
which finds the number of the deepest row that is not blank, and
A2:INDEX(A:A,<expression 1 above>)
which builds on expression 1 to obtain a range that starts at the given cell (A2) and ends, in the given column, at the row returned by that expression.
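As a rough illustration of what the two components compute, here is a minimal Python sketch. It assumes a column is a plain list in which None stands for a blank cell; the function name and sample data are hypothetical.

```python
def open_ended_range(column, first_row=2):
    """Return the cells from `first_row` down to the last non-blank cell.

    Row numbers are 1-based, as on a worksheet.
    """
    # Expression 1: the row number of the deepest non-blank cell.
    filled_rows = [row for row, value in enumerate(column, start=1) if value is not None]
    if not filled_rows:
        return []
    last_row = max(filled_rows)
    # Expression 2: the range from the first row to that last row.
    return column[first_row - 1:last_row]


column_a = ["Header", 3, None, 7, None, None]   # hypothetical column A
print(open_ended_range(column_a))               # [3, None, 7]
```

Blank cells inside the range are kept; only the trailing blanks after the last filled cell are dropped.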
This is an alternative to the others listed, and may be of interest as it differs from them in potentially substantial ways.
I can note the following characteristics:
It is not necessarily fast.
It seems NOT to be a volatile formula. This is important, as it means it won't necessarily be recalculated every time a calculation is made. However, I am not sure about the frequency of calculation, and don't fully understand its volatility status.
The uncertainty is related to the use of the INDEX function (and, apparently, specifically after the : in a range). There are some resources that describe it.
INDIRECT and OFFSET functions are definitely volatile. There are a number of resources that describe performance implications of volatile functions, some of them mentioned in other SO answers. For example:
https://learn.microsoft.com/en-us/office/client-developer/excel/excel-recalculation
https://www.sumproduct.com/thought/volatile-functions-talk-dirty-to-me
http://www.decisionmodels.com/calcsecretsi.htm
https://chandoo.org/wp/handle-volatile-functions-like-they-are-dynamite/
It allows the user to not have to think about the data in certain cells (for example, A1, which may be meant to have a header, and not numbers).
It returns a range between the cell specified before the : and the last cell in the column that is non-blank. I think it should include non-numeric values in its consideration as well.
It shares some commonality in terms of the range it aims to identify with the answer by Kresimir L.: =OFFSET(A2,0,0,(COUNTBLANK(A:A)+COUNTA(A:A)-1),1).
To note: This answer applies to the version of Excel available at the time of writing as part of Office 365 (and continually updated). However, it is based only on my own verification on my installation. I am not sure that all installations of Office 365 have exactly the same software, and I have the sense that some features may differ between installations. I am not sure that this answer applies to everyone. Please test. I would appreciate feedback on your success with this approach.
This is well covered in VBA, as in the code below:
Range("A2", Range("A2").End(xlDown))
If you want to achieve that in a formula, it depends on the version of Excel.
According to this reference, a sheet has 1048576 rows from Excel 2007 onwards, which you can use as below:
$A$2:$A$1048576
Because this range depends on the Excel version, it may be different in future versions.
Finally, I suggest you use VBA.

IF formula not working

What is wrong with this Excel formula?
=IF((C7-$C$2)<=$C$2,(C7*C3),IF((AND((C7-$C$2)>(2*$C$2),(C7-$C$2)<=($C$2*2),((C7-$C$2)*($C$4)),IF(AND((C7-($C$2*2)>($C$2*2),(C7-$C$2)>($C$2*2),(C7-(C7-($C$2*2)),(C7*$C$5)))
It's a sales calculation where:
if you sell over a certain number you get one level of commission per deal and
if you sell 2X a specific number you get a higher pay out for every deal
I entered your formula into D7 and immediately got an error "The formula you typed contains an error" with the whole formula highlighted. That indicated that Excel couldn't find one explicit error. I also noticed that the last ")" wasn't black so that suggested a nesting error. My usual way for finding these is to F2 while on the cell and left arrow through the brackets - go over the last bracket and its corresponding bracket should turn bold in the cell/formula bar. If it's not the first bracket you've got a problem!
Bracketing can be good but it can be confusing if over-used - multiplications will always be calculated first by Excel before addition/subtraction so you could try getting rid of some bracket pairs but read on because there may be a better solution.
Formulae are sometimes easier to understand if you define your parameters as Named Ranges (e.g. Base Sales Volume in C2 named as BSV or BaseSales), which you can then see in the formula. BaseSales is a lot easier to comprehend than $C$2.
After naming the range for the Base Sales, create Named Ranges for your three commission rates in $C$3, $C$4 and $C$5. Then replace the absolute cell references in your formula: in 2007+ on the Formulas tab go to Define Name > Apply Names, highlight all the names you made, then apply. Your formula should now have names instead of cell references.
Nested formulae are good but sometimes you need to build them up from simple formulae over many columns and then consolidate the crucial bits to put into your mega-formula.
Instead of testing from the bottom up through the target levels consider top down -
IF sales > top target
    sales * top rate
ELSE
    IF sales > 2nd target
        sales * 2nd rate
    ELSE
        sales * base rate
Then you should be able to do it with just two IF functions.
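The top-down test order can be sketched in Python to show why two conditions are enough. The parameter names and the numbers in the example are hypothetical; in the workbook they would come from the named ranges.

```python
def commission(sales, second_target, top_target, base_rate, second_rate, top_rate):
    """Pay every deal at the rate of the highest target that was exceeded."""
    if sales > top_target:
        # The highest tier is checked first, so no upper bound is needed below.
        return sales * top_rate
    if sales > second_target:
        return sales * second_rate
    return sales * base_rate


# Hypothetical targets of 10 and 20 deals, paying 1, 2 or 3 per deal.
print(commission(5, 10, 20, 1, 2, 3))    # 5
print(commission(15, 10, 20, 1, 2, 3))   # 30
print(commission(25, 10, 20, 1, 2, 3))   # 75
```

Testing from the top down removes the need for AND conditions, because each later branch is only reached when the higher target was not exceeded.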

understanding an excel formula I have been using

I have been using the following formula to compare strings and show all the matches. It works perfectly but I am trying to increase my overall understanding.
=IF(ISNA(VLOOKUP($B8,N$1:N$1048576,1,0)),"",1)
From what I know, this looks up all strings between N1 and N104, compares them to the string located in B*, and returns a 1 if found and a 0 if not found. What is confusing me is the 8576 number: what does it do?
1048576 (2 to the 20th power) is the maximum number of rows in a worksheet in Excel 2007 and later. (In previous versions, it was 65536, or 2 to the 16th power.)
Basically, the N$1:N$1048576 refers to "all the cells in column N".
However, for safety reasons, you should change that part of the formula to the simpler N:N - in fact, if I copy your formula, click on an Excel cell, and press Ctrl+V, Excel does that replacement automatically.
As stated in an article on Office.com, the maximum size of an Excel spreadsheet is:
1,048,576 rows by 16,384 columns
That's where your "8576" is coming from. Your formula is not checking from cell N1:N104 but rather the entire column of N.
Another way of writing your formula would be:
=IF(ISNA(VLOOKUP($B8,N:N,1,0)),"",1)
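To make the behaviour of the IF(ISNA(VLOOKUP(...))) pattern concrete, here is a minimal Python sketch of the same test: return 1 when the value appears anywhere in the column, otherwise an empty string. The function name and sample data are made up, and the case-insensitive comparison is an assumption about how an exact-match VLOOKUP treats text; wildcard characters are not handled.

```python
def found_marker(needle, column):
    """Return 1 if `needle` occurs in `column`, else an empty string."""
    # Assumption: text is compared without regard to case.
    wanted = needle.casefold()
    for cell in column:
        if isinstance(cell, str) and cell.casefold() == wanted:
            return 1      # VLOOKUP found a match, so ISNA is FALSE
    return ""             # VLOOKUP returned #N/A, so the cell shows ""


column_n = ["Pear", "APPLE", None]        # hypothetical column N
print(found_marker("apple", column_n))    # 1
print(repr(found_marker("kiwi", column_n)))  # ''
```

Note that the not-found result is an empty string rather than 0, matching the "" in the formula.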

Resources