I'm having what seems to be an inexplicable inconsistency between two formulas in Excel. I'm attempting to create a multiple output XLOOKUP-type search using this temporary formula:
=INDEX($B$3:$B$41, SMALL(IF(D$2=$A$3:$A$41, ROW($A$3:$A$41)- MIN(ROW($A$3:$A$41))+1,""), ROW()-2))
I'm drawing from simplified data in two columns; A3:A41 including numerical job numbers, and B3:B41 with comments on each job. Each line of comment for a job has a duplicated job number, e.g.
Job No  Comments
1000    Comment 1
1000    Comment 2
2000    Comment 1
3000    Comment 1
My output is supposed to list all of the comments in a column, headed by the job number, like so:
1000       2000       3000
Comment 1  Comment 1  Comment 1
Comment 2
This works for the first test job (number 1000, in cell D2), and lists all the comments for that job in cells D3:D7, and then lists #NUM! errors for the rest of the pre-set range when it runs out of comments to list, as it should. However, job 2000 in cell E2 results in #NUM! errors for all cells, and isn't finding any comments. The formula is exactly the same (I dragged to fill it), and when I manually change the IF reference for the 2000 job to refer to cell D2, it works.
All of the job numbers are text cells, but the only difference between the successful and unsuccessful formula is the change from ...IF(D$2=$A$3:$A$41... to ...IF(E$2=$A$3:$A$41..., which changes the reference from 1000 to 2000, both of which I've manually confirmed exist and have comments in the test data.
I'm stumped. There's no technical difference between
=INDEX($B$3:$B$41, SMALL(IF(D$2=$A$3:$A$41, ROW($A$3:$A$41)- MIN(ROW($A$3:$A$41))+1,""), ROW()-2))
and
=INDEX($B$3:$B$41, SMALL(IF(E$2=$A$3:$A$41, ROW($A$3:$A$41)- MIN(ROW($A$3:$A$41))+1,""), ROW()-2))
if both D2 and E2 are text cells, is there?
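For reference, here is a minimal Python sketch of what the INDEX/SMALL/IF formula is meant to compute: the k-th comment whose job number equals the header cell. The names kth_match, job_numbers and comments are hypothetical, and the data is the simplified sample above. It only illustrates the intended lookup; it does not establish why E2 fails in the real workbook.

```python
def kth_match(keys, values, target, k):
    """Return the k-th (1-based) value whose key equals target.

    Mirrors INDEX(values, SMALL(IF(target=keys, position, ""), k)).
    """
    positions = [i for i, key in enumerate(keys) if key == target]
    if k > len(positions):
        return "#NUM!"  # SMALL has no k-th smallest position left
    return values[positions[k - 1]]


job_numbers = ["1000", "1000", "2000", "3000"]  # stored as text
comments = ["Comment 1", "Comment 2", "Comment 1", "Comment 1"]

# Column headed by job 1000: two comments, then an error once they run out.
column_1000 = [kth_match(job_numbers, comments, "1000", k) for k in (1, 2, 3)]
print(column_1000)
```

In this sketch, as with Excel's = operator, the text "2000" and the number 2000 do not compare equal, so a header stored with a different type than the data would produce nothing but errors. Whether that applies here is an assumption, not something the question confirms.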
Excel formula stop working after x number of rows
A similar question was posted 4 years ago. There were two answers. I have reviewed both. The first response is excellent but it is not the cause of my issues. The second answer was a statement that the person who responded had the same issue and what he or she did to solve it which will not work in my case. There was no explanation as to why the problem occurred.
Here is my problem. I am using the following formula:
IF(MONTH(MDB.xlsx!Date)=7,SUMIFS(MDB.xlsx!_910,MDB.xlsx!Activity,"MGRINC",MDB.xlsx!AcctNum,$Y202),0)
MDB is a Master Data Base. It currently has 214 rows but will grow substantially to probably around 5000 rows. The named ranges in the MDB are currently defined as rows 1 to 500. The above formula is in a spreadsheet, with about 300 rows. The formula works fine through row 201. From 202 on, it only returns zeros.
This is what I have done:
I have looked at the constituent parts of the formula using F9; all values and arrays are reporting correctly. (That’s why the defined name range is currently set to only 500 rows, so I can break down a formula using F9 and not get an error after 8,192 characters.)
If I move the line with this formula from line 202 to an earlier row, it works fine.
If I delete earlier rows, the formula works fine.
This appears to be a memory issue of some sort but I don’t understand why. I have built larger and much more complex spreadsheets some of which take minutes to calculate with no issues.
Any thoughts?
Extracting the Month inside a Sumif will not work. You could add two conditions, one testing for the date being greater/equal to July 1, the second testing for the date being smaller/equal to July 31.
=SUMIFS(MDB.xlsx!_910,MDB.xlsx!Date,">="&date(2020,7,1),MDB.xlsx!Date,"<="&date(2020,7,31),MDB.xlsx!Activity,"MGRINC",MDB.xlsx!AcctNum,$Y202)
Or, you could add a column to your source data that has the month number, then test for that month number
Or, you could change the formula to use SumProduct instead of Sumif, incorporating the month filter into the SumProduct parameters like this:
=Sumproduct((MONTH(MDB.xlsx!Date)=7),--(MDB.xlsx!Activity="MGRINC"),--(MDB.xlsx!AcctNum=$Y202),MDB.xlsx!_910)
With the formula testing only the month number, the year is disregarded, so if you have data spanning multiple years, you may want to add a parameter that checks the year.
Note that all these formulas are regular formulas and do not need to be array entered.
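To make the difference between the date-range test and the month-only test concrete, here is a small Python sketch. The rows, account code "A100" and amounts are made-up sample data, and the function names are hypothetical; only the filtering logic corresponds to the formulas above.

```python
from datetime import date

# (date, activity, account number, amount): hypothetical sample rows
rows = [
    (date(2020, 7, 3), "MGRINC", "A100", 50.0),
    (date(2020, 7, 20), "MGRINC", "A100", 25.0),
    (date(2020, 8, 1), "MGRINC", "A100", 10.0),
    (date(2020, 7, 5), "OTHER", "A100", 99.0),
    (date(2019, 7, 9), "MGRINC", "A100", 7.0),
]


def sum_by_date_range(rows, start, end, activity, acct):
    """Like SUMIFS with >= start and <= end conditions on the date column."""
    return sum(amount for d, act, acc, amount in rows
               if start <= d <= end and act == activity and acc == acct)


def sum_by_month(rows, month, activity, acct):
    """Like the SUMPRODUCT version: tests the month number only, ignoring the year."""
    return sum(amount for d, act, acc, amount in rows
               if d.month == month and act == activity and acc == acct)


range_total = sum_by_date_range(rows, date(2020, 7, 1), date(2020, 7, 31), "MGRINC", "A100")
month_total = sum_by_month(rows, 7, "MGRINC", "A100")
print(range_total, month_total)
```

The month-only total also picks up the July 2019 row, which is the multi-year caveat mentioned above.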
Suppose you have an ordered, indexed list of positive values. These positive values are interrupted by 0 values. I want to determine if a consecutive sub-array exists which is not interrupted by 0 values and whose sum exceeds a certain threshold.
Simple example:
Index, Value
0 0
1 0
2 3
3 4
4 2
5 6
6 0
7 0
8 0
9 2
10 3
11 0
In the above example, the largest consecutive sub-array not interrupted by 0 is from index 2 to index 5 inclusive, and the sum of this sub-array is 15.
Thus, for the following thresholds 20, 10 and 4, the results should be FALSE, TRUE and TRUE respectively.
Note I don't necessarily have to find the largest sub-array, I only have to know if any uninterrupted sub-array sum exceeds the defined threshold.
I suspect this problem is a variation of Kadane's algorithm, but I can't quite figure out how to adjust it.
The added complication is that I have to perform this analysis in Excel or Google Sheets, and I cannot use scripts to do it - only inbuilt formulas.
I'm not sure if this can even be done, but I would be grateful for any input.
Start with
=B2
in C2
then put
=IF(B3=0,0,B3+C2)
in C3 and copy down.
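As a rough illustration of what this helper column computes, here is a Python sketch (the function names are hypothetical). The running sum resets to 0 at every zero, so its maximum is the largest uninterrupted run total, which can then be compared against the threshold.

```python
def running_sums(values):
    """Equivalent of the helper column: reset at each 0, otherwise accumulate."""
    sums = []
    current = 0
    for v in values:
        current = 0 if v == 0 else current + v
        sums.append(current)
    return sums


def any_run_exceeds(values, threshold):
    """True if some uninterrupted run of non-zero values sums to more than threshold."""
    return max(running_sums(values), default=0) > threshold


values = [0, 0, 3, 4, 2, 6, 0, 0, 0, 2, 3, 0]  # the example from the question
print(running_sums(values))
print([any_run_exceeds(values, t) for t in (20, 10, 4)])
```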
EDIT 1
If you were looking for a Google sheets solution, try something like this:
=ArrayFormula(max(sumif(A2:A,"<="&A2:A,B2:B)-vlookup(A2:A,{if(B2:B=0,A2:A),sumif(A2:A,"<="&A2:A,B2:B)},2)))
Assumes that numbers in column B start with zero: would need to add Iferror if not. It's basically an array formula implementation of @Gary's Student's method.
EDIT 2
Here is the Google Sheets formula translated back into Excel. It gives you an alternative if you don't want to use Offset:
=MAX(SUMIF(A2:A13,"<="&A2:A13,B2:B13)-INDEX(SUMIF(A2:A13,"<="&A2:A13,B2:B13),N(IF({1},MATCH(A2:A13,IF(B2:B13=0,A2:A13))))))
(entered as an array formula).
Comment
Maybe the real challenge is to find a formula that works both in Excel and Google sheets because:
Vlookup doesn't work the same way in Excel
The offset/subtotal combination doesn't work in Google sheets
The index/match combination with n(if{1}... doesn't work in Google sheets.
With data in columns A and B, ensure column B ends with a 0. Then in C2 enter:
=IF(AND(B3=0,B2<>0),SUM(B$1:$B2)-MAX($C$1:C1),"")
and copy downwards:
Column C lists the sums of consecutive non-zeros. In another cell enter something like:
=MAX(C:C)>19
where 19 is the threshold value.
You can avoid the "helper" column by using a VBA UDF.
EDIT#1:
Use this instead:
=IF(AND(B3=0,B2<>0),SUM(B$1:$B2)-SUM($C$1:C1),"")
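The idea behind this helper column can be sketched in Python as follows (run_totals is a hypothetical name). At the last non-zero cell of each run, the run's total is the cumulative sum so far minus the totals already recorded for earlier runs. Like the formula, the sketch assumes the data ends with a 0.

```python
def run_totals(values):
    """Total of each uninterrupted non-zero run, recorded where the run ends."""
    totals = []
    cumulative = 0
    for i, v in enumerate(values):
        cumulative += v
        next_is_zero = i + 1 < len(values) and values[i + 1] == 0
        if v != 0 and next_is_zero:
            # cumulative sum so far minus all earlier run totals
            totals.append(cumulative - sum(totals))
    return totals


values = [0, 0, 3, 4, 2, 6, 0, 0, 0, 2, 3, 0]
totals = run_totals(values)
print(totals, max(totals) > 19)
```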
Thanks to @Tom Sharpe and @Gary's Student for answering the question.
While I admittedly did not specify this in the question, I would prefer to achieve the solution without a helper column because I have to do this operation on 30+ successive columns. I just didn't think it was possible in Excel.
Full credit goes to user XOR LX on the Excelforum for coming up with this solution. It has blown my mind and took me the better part of an hour to wrap my head around, but it is certainly very creative. There is no way I could have come up with it myself. Re-posting it here for the benefit of everyone who is looking into this.
Copy and paste the table from my initial question into an empty Excel sheet such that the headers appear in (A1:B1) and the values appear in (A2:B13).
Then enter this formula as an array formula (ctrl+shift+enter), which gives the max of the sums of all the uninterrupted sub-arrays:
=MAX(SUBTOTAL(9,OFFSET(B2,A2:A14,,-FREQUENCY(IF(B2:B13,A2:A13),IF(B3:B14=0,A2:A13,0))-1)))
Note the deliberate offset to include one additional row below the end of the dataset.
I am creating a spreadsheet to track my habits through the coming year, and I want to be able to show both a simple count, and also a current streak for each habit. I've already searched and found one answer for counting a streak, but this answer does not work with a live tracker (increasing the end column by one each day), and also requires there to be no blank fields (1 or 0 entry).
=COUNTA(S$2:S$74)-MATCH(2, 1/(S2:S$74=0), 1)
If I switch the above from =0 to ="" in the match, it breaks pretty spectacularly, and in any case it fails completely for a range that includes 366 days, but the current day is <366. I would be tracking "current day" by a habit row labeled "logging" that I would populate each day, whether any other habits were logged or not. Is this something possible with formulas?
Using the suggestions I got to concat 1's and 0's to a string, I then got a successful streak formula using
=IFERROR((LEN(NG3)-SEARCH("0[^0]*$",NG3)),LEN(NG3))
Where NG is the column the concatenations exist in, and row 3 is the first row of data.
I then basically duplicated my calendar to allow for non-binary logging (converting any blank or 0 entry to a 0, all other entries to a 1), and contingent on the "logged" row being populated so that unlogged days are blank fields. Works like a charm now.
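The current-streak logic can be sketched in Python like this, assuming the same kind of concatenated string of 1s and 0s with the most recent day last (current_streak is a hypothetical name). The streak is the number of characters after the last 0, or the whole length if there is no 0.

```python
def current_streak(log):
    """Length of the trailing run of '1' characters in a string of 1s and 0s."""
    return len(log) - len(log.rstrip("1"))


print(current_streak("1101110111"))  # three 1s after the last 0
print(current_streak("1110"))        # streak broken on the latest day
print(current_streak("111"))         # no 0 at all: the whole string counts
```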
My interpretation of the algorithm for this:
Suppose you had a 1 for each time the same thing happened, and a 0 when you did something different.
Then you concatenated all the 0 and 1 into one long string.
Then split the string into an array based on 0, removing all the zeros.
Then finally counted the maximum length of the array result
In formula put the following:
=MAX(LEN(SPLIT(B1,"0")))
Then either press Shift + Ctrl + Enter on Windows
Or Cmd + Shift + Enter on a Mac
This will convert formula into an ArrayFormula.
(You need to do this, because if you just entered
=SPLIT(B1,"0")
in C1, it would generate a range of columns, one per array entry, filling C1, D1, E1 etc...).
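The same split-and-measure idea looks like this in Python (longest_streak is a hypothetical name; the input is assumed to be the concatenated string of 1s and 0s described above):

```python
def longest_streak(log):
    """Longest run of '1' characters: split on '0' and take the longest piece."""
    return max(len(run) for run in log.split("0"))


print(longest_streak("1101110111"))
print(longest_streak("0000"))
```

Python's split keeps empty pieces between consecutive zeros; they have length 0, so they do not affect the maximum.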
Sample spreadsheet in GoogleDocs
You may be able to upload/download/convert it between Googledocs and LibreOffice.
Have not used that spreadsheet solution.
But most are pretty similar these days.
(Alternate title: Why on earth doesn't Excel support user-defined formulas with parameters without resorting to VB and the problems that entails?).
[ Updated to clarify my question ]
In excel when you define a table it will tend to automatically replicate a formula in a column. This is very much like "fill down".
But ... what if you need exceptions to the rule?
In the tables I'm building to do some calculations the first row tends to be "special" in some way. So, I want the auto-fill down, but just not on the first row, or not on cells marked as custom. The Excel docs mention exceptions in computed columns but only in reference to finding them and eliminating them.
For example, first row is computing the initial value
Then all the remaining rows compute some incremental change.
A trivial example - a table of 1 column and 4 rows:
A
1 Number
2 =42
3 =A2+1
4 =A3+1
The first formula must be different than the rest.
This creates a simple numbered list with A2=42, A3=43, A4=44.
But now, say I'd like to change it to be incremented by 2 instead of 1.
If I edit A3 to be "A2+2", Excel changes the table to be:
A
1 Number
2 =A1+2
3 =A2+2
4 =A3+2
Which of course is busted -- it should allow A2 to continue to be a special case.
Isn't this (exceptions - particularly in the first row of a table) an incredibly common requirement?
If you have the data formatted as a table, you can use structured (table) references (e.g. [@ABC]) instead of A1-style references (e.g. A1, $C2, etc.). But there are two tricks to account for.
Firstly, there is no structured-reference syntax for the previous row; Excel falls back to A1 format instead. You can, however, use the OFFSET function to move from the current cell to the previous row, as shown below. In this case the first row will return a #VALUE! error, since you can't add 1 to the header text "ABC".
ABC
1 =OFFSET([@ABC],-1,0)+1
2 =OFFSET([@ABC],-1,0)+1
3 =OFFSET([@ABC],-1,0)+1
4 ....
So the second trick is to use an IF statement to initialise the value, by checking whether the previous row's value equals the heading value. If it does, use the initial value; otherwise add the increment. Note this assumes the table is named Table1.
ABC
1 =IF(OFFSET([@ABC],-1,0)=Table1[[#Headers],[ABC]],42,OFFSET([@ABC],-1,0)+1)
2 =IF(OFFSET([@ABC],-1,0)=Table1[[#Headers],[ABC]],42,OFFSET([@ABC],-1,0)+1)
3 =IF(OFFSET([@ABC],-1,0)=Table1[[#Headers],[ABC]],42,OFFSET([@ABC],-1,0)+1)
4 ....
Note you can also take the initial value (in, say, $A$1) and the increment (in, say, $A$2) from cells outside the table, as below:
ABC
1 =IF(OFFSET([@ABC],-1,0)=Table1[[#Headers],[ABC]],$A$1,OFFSET([@ABC],-1,0)+$A$2)
2 =IF(OFFSET([@ABC],-1,0)=Table1[[#Headers],[ABC]],$A$1,OFFSET([@ABC],-1,0)+$A$2)
3 =IF(OFFSET([@ABC],-1,0)=Table1[[#Headers],[ABC]],$A$1,OFFSET([@ABC],-1,0)+$A$2)
4 ....
I use this IF OFFSET combination all the time for iterating and looping in tables.
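The logic of the IF/OFFSET combination, sketched in Python under the assumption that "previous cell is the header" simply means "this is the first data row" (fill_column and its parameters are hypothetical names):

```python
def fill_column(n_rows, initial, increment):
    """One rule for every row: first row gets the initial value, others add to the row above."""
    column = []
    for row in range(n_rows):
        if row == 0:
            # the cell above is the header, so use the initial value
            column.append(initial)
        else:
            # otherwise take the previous row's value and add the increment
            column.append(column[row - 1] + increment)
    return column


print(fill_column(3, 42, 1))
print(fill_column(3, 42, 2))
```

Because every row uses the same rule, changing the increment updates the whole column without disturbing the special first row.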
If you have a lot of columns that need to know whether they are in the first row, you can have one column test for the first row, and the rest can use a simpler IF. E.g. ABC gives TRUE for the first row and FALSE for the others; DEF then starts from the initial value and adds the increment.
ABC DEF
1 =OFFSET([@ABC],-1,0)=Table1[[#Headers],[ABC]] =IF([@ABC],$A$1,OFFSET([@DEF],-1,0)+$A$2)
2 =OFFSET([@ABC],-1,0)=Table1[[#Headers],[ABC]] =IF([@ABC],$A$1,OFFSET([@DEF],-1,0)+$A$2)
3 =OFFSET([@ABC],-1,0)=Table1[[#Headers],[ABC]] =IF([@ABC],$A$1,OFFSET([@DEF],-1,0)+$A$2)
4 ....
Hope that helps
I don't know if you are looking for something as simple as locking down a formula. You can do that by highlighting the part of the formula you do not want to change and then hitting F4. This makes that part of the formula an absolute reference, marked with $, so it will not change as you copy/paste it down the table.
Alternatively, you may be able to use Defined Names. You can set these up in the Formulas tab; they assign a name to something, which you can then use in your formulas like a variable. These can be as simple as an easy reference to a cell on another sheet, or as complex as multi-sheet formulas.
Normally, to handle an "exceptional" formula in the first row of a table consisting of several columns, you simply enter it there manually, and fill only the lines below. But if you have more "exceptional" cases scattered around, you will need another column with 0/1 values indicating where the exceptions are. And then you use if(condition, formula_if_true, formula_if_false) everywhere.
A B
Number Exceptional?
1 if(C1,42,A1+1) 0
2 if(C2,42,A2+1) 1
3 if(C3,42,A3+1) 0
As much as I love Excel, and as much as it is the best product in all of MS, it is still a weak tool. FYI, you can quickly learn modern and powerful scripting languages, such as Ruby, and never be bothered by spreadsheet idiosyncrasies again.
I get the feeling this is close to impossible, but here goes. I have data structured like this:
Test Group 1
pass
fail
pass
fall
Test Group 2
pass
fail
Test Group 3
pass
fail
fail
pass
I want to be able to paste it into Excel and have Excel summarise the percentage of each test group. So it would end up looking like this:
Test Group 1 Percentage Pass: 50%
pass
fail
pass
fall
Test Group 2 Percentage Pass: 50%
pass
fail
Test Group 3 Percentage Pass: 50%
pass
fail
fail
pass
BUT as you can see, the test group length is not set, and it may vary. I was hoping that I might be able to create a formula that follows this logic:
if( A<n> contains "Test")
count A<n> +1 until A<n> contains "Test"
It seems like I'm asking a lot from Excel formulas. I have spent some of today writing a small C# app that will split the configs into separate files so that I could copy and paste separate files into Excel. But it would be great to have fewer steps!
--- UPDATE ----
There were three very interesting approaches to this problem proving nothing is impossible :p
I wanted a solution that allowed me to copy and paste results in and see a load of percentages pop up, so I was always sort of looking for a formula solution. HOWEVER, please take the time to look at pnuts' and Jerry's answers, as they reveal some useful features of Excel!
chuff's answer was the one I was looking for and it worked out of the box. For anyone who wants to delve deep into how it works and why, I broke the formula down into steps and filled in some help information. The key to these formulas is combining MATCH, OFFSET and the slightly more obvious SEARCH / FIND / LEFT (I used to use an IFERROR(FIND(...)) type of approach; LEFT seems cleaner :) )
Do look at the documentation for these formulas but to see it all in action with some examples see the Google Spreadsheets I created detailing chuffs answer:
https://docs.google.com/spreadsheet/ccc?key=0AqODI11eAjtldDhDd2dBcFhpZW9SXzEybGtMUWMwM3c#gid=0
---P.S---
For the record, I did actually create a C++ program to prettify my data and output it as .csv files. If I had had this info I wouldn't have bothered, but I am glad I tried both routes; it was a fun learning adventure.
A single, array-formula solution to this problem is possible. It requires only that a stop value "Test" be added at the bottom of your column of data.
Assuming your data are in the range A2:A18, here is the formula that would compute the average in cell B2:
=IF(LEFT(A1,4)="Test",
COUNTIF(OFFSET(A1,1,0,MATCH("Test*",$A2:$A$18,0)-1,1),"pass")
/ COUNTA(OFFSET(A1,1,0,MATCH("Test*",$A2:$A$18,0)-1,1)),
"")
The key parts of the formula are expressions that calculate the ranges for the two count functions, COUNTIF and COUNTA.
The COUNTIF function - which counts the number of passes in a group - takes two arguments, the range for the count and the condition to be met by cells counted.
I use the OFFSET function to provide the count range. OFFSET takes five arguments:
An anchor cell (if a range is provided, the function only uses the top left cell in the range).
The number of rows below (+) or above (-) the anchor cell that the range to be returned will begin.
The number of columns to the right (+) or left (-) of the anchor cell for the start of the return range.
The number of rows to return.
The number of columns to return.
For example, OFFSET(A1,5,2,4,1) will return the range C6:C9.
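To make the five arguments concrete, here is a small Python sketch that works out which A1-style range a given OFFSET call refers to. The helper names are hypothetical, and the sketch only models the address arithmetic, not Excel's error handling.

```python
def col_to_num(col):
    """Column letters to a 1-based number: A -> 1, Z -> 26, AA -> 27."""
    n = 0
    for ch in col:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def num_to_col(n):
    """1-based column number back to letters."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def offset_range(anchor_col, anchor_row, rows, cols, height, width):
    """Address of the range returned by OFFSET(anchor, rows, cols, height, width)."""
    top = anchor_row + rows
    left = col_to_num(anchor_col) + cols
    bottom = top + height - 1
    right = left + width - 1
    start = f"{num_to_col(left)}{top}"
    end = f"{num_to_col(right)}{bottom}"
    return start if start == end else f"{start}:{end}"


# OFFSET(A1,1,0,4,1): start one row below A1, four rows tall, one column wide.
print(offset_range("A", 1, 1, 0, 4, 1))
```

With a row offset of 1, a column offset of 0 and a height of 4, the range covers the four result cells directly below a group header in A1.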
In the formula, the anchor cell is A1, the row offset is 1 and the column offset is 0, in other words, the start of the range to return is A2.
The number of rows to return is computed by the MATCH function. For the calculation of the average for the first test group, MATCH looks in the range A2:A18 for the row of the first cell that starts with the word 'Test'. A2 is the cell in the pass/fail column just below the cell that starts the test group.
The last cell in the column is A18, which has been added to the data and contains the word 'Test'. This prevents the formula from returning an error for the last test group. Otherwise, the match would return an error because it could not find another occurrence of Test beyond the start of the final group.
Also note that the anchor points for the match range, i.e., $A2:$A$18, leaves unanchored the row reference to the top of the range (the 2 in A2). That's so the range will adjust downward as the formula is copied down so as to find the next, not the previous, occurrence of 'Test'.
Finally, I decrease the number of return rows by 1 to exclude the 'Test' cell that MATCH found, which belongs to the next group.
The COUNTA function - which counts the total number of passes and fails in a group - uses the same OFFSET/MATCH expression to get the range for the group.
This is an array formula and so must be entered with the Control-Shift-Enter key combination. Copy the formula down to the bottom of the data (excluding the cell with the 'Test' stop value) to calculate the averages for each group.
For convenience, here is the formula without the breaks shown above. It can be directly pasted into the worksheet (remembering to confirm it with Control-Shift-Enter).
=IF(LEFT(A1,4)="Test",COUNTIF(OFFSET(A1,1,0,MATCH("Test*",$A2:$A$18,0)-1,1),"pass")/COUNTA(OFFSET(A1,1,0,MATCH("Test*",$A2:$A$18,0)-1,1)),"")
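For comparison, the same grouping logic can be sketched in Python (pass_rates is a hypothetical name, and the data is the sample from the question, including its "fall" entry). Each cell starting with "Test" opens a new group; every non-empty cell below it counts towards the total, and cells equal to "pass" count as passes.

```python
def pass_rates(cells):
    """Map each 'Test...' header to the fraction of 'pass' cells beneath it."""
    counts = {}  # header -> [passes, total]
    current = None
    for cell in cells:
        if cell.startswith("Test"):
            current = cell
            counts[current] = [0, 0]
        elif current is not None and cell:
            counts[current][1] += 1          # like COUNTA: any non-empty cell
            if cell.lower() == "pass":       # like COUNTIF(..., "pass")
                counts[current][0] += 1
    return {group: passes / total
            for group, (passes, total) in counts.items() if total}


cells = [
    "Test Group 1", "pass", "fail", "pass", "fall",
    "Test Group 2", "pass", "fail",
    "Test Group 3", "pass", "fail", "fail", "pass",
]
rates = pass_rates(cells)
print(rates)
```

Unlike the worksheet formula, the loop needs no trailing "Test" stop value, because it simply runs to the end of the list.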
Basically Subtotal will meet the requirement, provided some fairly tedious layout adjustments are acceptable.
Preparation:
Assuming Test Group 1 is in A1, insert Row1 and ColumnA, with in A2 =IF(LEFT(B2,1)="T",B2,"") and in C2 =IFERROR(MATCH(B2,{"fail","pass"},0)-1,"") and both formulae copied down as required. (And change fall to fail in source!)
Subtotal:
Select A:C, Data > Outline – Subtotal, At each change in: (1) Test Group 1, Use function: Average, Add subtotal to: check (Column C), uncheck Summary below data, OK.
Tidy up:
1. Move A3:C3 to A1 and A2:C2 to A3.
2. Filter A:C and for ColumnB, Text Filters, Contains add Test, OK.
3. In C3 put ="Percentage Pass: "&C4*100&"%" and copy down to last row numbered in blue (should show #DIV/0!).
4. Highlight A:C for all the rows numbered in blue and embolden.
5. Hide ColumnA and, optionally, hide or delete Rows1:2.
6. Take off filter selection.
Hopefully with preparation as on the left the result would be similar to as shown on the right:
Example re comment to @Jerry's answer:
There are many ways to tackle this -- one possible easy solution is to add a column next to your pass/fail column that says (assuming column A):
=IF(A1="pass",100,0)
If you do that, then the average of those values would be equal to the percentage pass.
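A quick Python sketch of why this works, using made-up results for a single group: scoring each pass as 100 and everything else as 0 makes the mean equal to the pass percentage.

```python
results = ["pass", "fail", "pass", "fail"]  # hypothetical group of four tests

# Same mapping as the IF formula: 100 for "pass", 0 otherwise.
scores = [100 if r == "pass" else 0 for r in results]
percentage = sum(scores) / len(scores)
print(percentage)
```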