IF + AND + Date range formula - excel

I'm having trouble creating a formula, so I'd like to ask for your help.
The Excel sheet has 150 000 rows, and I want this formula to save me time.
I have Date, Name and Status columns, and I need to show in another cell which Names had GOOD or OK 4 or more times in a row.
Example of input:
https://imgur.com/aRALd9S
I think IF + AND + a date range is enough, but I don't know how to put them together.
Thanks a lot for your suggestions!
Here is what I have so far: https://imgur.com/Y5WAov5
=COUNTIFS($D$2:$D$15;D2;$C$2:$C$15;"OK";$D$2:$D$15;D2;$E$2:$E$15;">="&E2;$E$2:$E$15;"<="&E2+7)+(COUNTIFS($D$2:$D$15;D2;$C$2:$C$15;"GOOD"))
With this I can count how many times a Name is OK or GOOD within a one-week range, but I still don't know what to change so that the count stops as soon as a FALSE appears.

Well here's something you could try. You could do it in one formula with an array formula, but with 150K rows it seems much better to try and avoid array formulas and use helper columns where necessary.
The first helper column just contains the person's ID if the row contains FALSE:
=IF(H2=FALSE,I2,"")
The second helper column contains the offset from the current row to the next FALSE for the same person:
=IFERROR(MATCH(I2,K2:K$15,0)-1,16-ROW())
So now you can use basically your same COUNTIFS formula but replacing each range with an INDEX which specifies how many rows you should count:
=IF(H2=FALSE,0,COUNTIFS(I2:INDEX(I2:I$15,L2),I2,H2:INDEX(H2:H$15,L2),"GOOD",J2:INDEX(J2:J$15,L2),">="&J2,J2:INDEX(J2:J$15,L2),"<="&J2+7))+
IF(H2=FALSE,0,COUNTIFS(I2:INDEX(I2:I$15,L2),I2,H2:INDEX(H2:H$15,L2),"OK",J2:INDEX(J2:J$15,L2),">="&J2,J2:INDEX(J2:J$15,L2),"<="&J2+7))
Note 1
The 16 in the second formula allows for the case where there are no more rows labelled FALSE after the current row, so the MATCH fails. This makes the COUNTIFS count everything from the current row to the end of the data.
Note 2 - expanding to a larger range of data
You should be able to replace the figure 16 with COUNTA(I:I), the size of the data plus headers.
There shouldn't be a problem with using a larger range for the Index e.g.
=IF(H2=FALSE,0,COUNTIFS(I2:INDEX(I2:I$150000,L2),I2,H2:INDEX(H2:H$150000,L2),"GOOD",J2:INDEX(J2:J$150000,L2),">="&J2,J2:INDEX(J2:J$150000,L2),"<="&J2+7))+
IF(H2=FALSE,0,COUNTIFS(I2:INDEX(I2:I$150000,L2),I2,H2:INDEX(H2:H$150000,L2),"OK",J2:INDEX(J2:J$150000,L2),">="&J2,J2:INDEX(J2:J$150000,L2),"<="&J2+7))
but increasing the search range in the MATCH to 150K rows in the second formula does seriously affect performance when repeated 150K times. The only solution I can think of at the moment is to see if a maximum can be placed on the distance from any occurrence of a name to the next occurrence of the name with FALSE next to it.
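The helper-column logic above can be checked outside Excel with a small Python sketch. It assumes each row is a (status, name, date) tuple in sheet order; the function name and the sample data are hypothetical, and this is only an illustration of the counting rule, not a replacement for the formulas.

```python
from datetime import date, timedelta

def streak_counts(rows, window_days=7):
    """For each row, count GOOD/OK rows for the same name, from this row
    up to (not including) the next FALSE row for that name, within the window."""
    counts = []
    for i, (status, name, day) in enumerate(rows):
        if status == "FALSE":
            counts.append(0)
            continue
        # Offset to the next FALSE for the same name (second helper column).
        stop = len(rows)
        for j in range(i + 1, len(rows)):
            if rows[j][1] == name and rows[j][0] == "FALSE":
                stop = j
                break
        limit = day + timedelta(days=window_days)
        # Same conditions as the COUNTIFS, applied to the shortened range.
        counts.append(sum(
            1 for s, n, d in rows[i:stop]
            if n == name and s in ("GOOD", "OK") and day <= d <= limit
        ))
    return counts

# Hypothetical sample data
rows = [
    ("OK", "Ann", date(2024, 1, 1)),
    ("GOOD", "Ann", date(2024, 1, 2)),
    ("OK", "Bob", date(2024, 1, 2)),
    ("FALSE", "Ann", date(2024, 1, 3)),
    ("GOOD", "Ann", date(2024, 1, 4)),
    ("OK", "Ann", date(2024, 1, 20)),
]
print(streak_counts(rows))
```

A name then qualifies when any of its counts reaches 4.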

Related

Generate two false Booleans every ten rows in excel

I need to add a column to my spreadsheet that generates two "false" values at random positions within every ten rows.
So for example rows 1 through 10 could read:
true
true
true
False
true
false
true
true
true
true
and then repeat that for rows 11 through 20, but with the false values placed randomly in different positions, and so on. I want to write a formula that does this for me.
With Office 365:
In first cell you want the list to be created put:
=LET(rws,1000,arr,RANDARRAY(10,rws/10),seq,SEQUENCE(rws,,0),INDEX(MAKEARRAY(10,rws/10,LAMBDA(i,j,INDEX(BYCOL(arr,LAMBDA(v,MATCH(SMALL(v,i),v,0))),1,j)<9)),MOD(seq,10)+1,INT(seq/10)+1))
Change the 1000 to the number of rows desired.
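As far as I can tell, the LET formula ranks ten random numbers per group and marks a row FALSE when the value with that rank sits in position 9 or 10. Here is a rough Python sketch of the same idea; the function name and parameters are made up for illustration.

```python
import random

def two_false_per_block(n_rows, block=10, n_false=2, rng=random):
    """Return booleans with exactly n_false False values in every full block."""
    out = []
    for _ in range(n_rows // block):
        keys = [rng.random() for _ in range(block)]
        # order[i] is the position of the i-th smallest key (the MATCH(SMALL(...)) part).
        order = sorted(range(block), key=lambda p: keys[p])
        # Rows whose ranked value sits in the last n_false positions become False (the "<9" test).
        out.extend(pos < block - n_false for pos in order)
    return out

flags = two_false_per_block(1000, rng=random.Random(0))
```

Because `order` is a random permutation, every arrangement of the two FALSE values should be equally likely.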
If one does not have Office 365 then put this in the second row of a column and copy it down.
=IF(COUNTIF(INDEX(A:A,MIN(ROW($ZZ1)-MOD(ROW($ZZ1)-1,10)+1,ROW()-1)):INDEX(A:A,ROW()-1),FALSE)>=2,TRUE,IF(COUNTIF(INDEX(A:A,MIN(ROW($ZZ1)-MOD(ROW($ZZ1)-1,10)+1,ROW()-1)):INDEX(A:A,ROW()-1),TRUE)>=8,FALSE,RANDBETWEEN(0,9)<8))
Be aware:
Each cell is decided in turn, from top to bottom, so FALSE will land in the last rows of each group of 10 more often than it would in a truly random arrangement. One can play with the RANDBETWEEN(0,9)<8 test to maybe make that more even.
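The row-by-row rule in the non-365 formula can be sketched in Python like this. The names are hypothetical, and `p_true=0.8` stands in for RANDBETWEEN(0,9)<8.

```python
import random

def sequential_flags(n_rows, block=10, n_false=2, p_true=0.8, rng=random):
    """Decide each row in turn, looking only at the rows above it in the same block."""
    out = []
    for r in range(n_rows):
        seen = out[r - r % block:r]          # rows already decided in this block
        if seen.count(False) >= n_false:
            out.append(True)                 # quota of FALSE already used up
        elif seen.count(True) >= block - n_false:
            out.append(False)                # only FALSE slots remain
        else:
            out.append(rng.random() < p_true)
    return out

flags = sequential_flags(200, rng=random.Random(3))
```

Every full block ends up with exactly two FALSE values, but the forced FALSE rows pile up at the end of a block, which is the bias the note above warns about.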
BRUTE FORCE METHOD
There are 10!/(8!*2!) = 45 ways of arranging your True/False requirements
I personally didn't have anything better to do with my time so I wrote out all possible combinations in 45 columns.
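For anyone who would rather not type the 45 columns by hand, here is a small Python sketch that enumerates them. The function name is hypothetical; the resulting lists would still have to be pasted into the sheet.

```python
from itertools import combinations
from math import comb

def all_columns(block=10, n_false=2):
    """Every way to place n_false False values among block rows, one list per column."""
    columns = []
    for false_rows in combinations(range(block), n_false):
        columns.append([row not in false_rows for row in range(block)])
    return columns

columns = all_columns()
print(len(columns), comb(10, 2))
```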
The concept with this methodology is to randomly write out one of the 45 columns every 10 rows. One of the problems here is that using random in a formula does not mean you will be able to use the same random value in the next row of the formula.
A potential random problem side step
In order to make a random result accessible to multiple formula calculations, one can write the results out in a helper column. For this solution we will be randomly selecting from 45 possible columns, so in the first column the following formula is used and copied down. The number of rows needed equals the number of groups of 10 you will use.
Start in A1 and copy down
=RANDBETWEEN(1,45)
How to make each formula in a group of ten pick the same random number
For demonstration purposes the next column is to generate integers starting at 1 and increasing by 1 after every 10 rows. For the demonstration it would need to be copied down a number of rows equal to the number of results needed (10 * number of groups of 10). Ultimately this formula can be embedded in the final formula.
Start in B1 and copy down
=INT((ROW(A1)-1)/10)+1
For demonstration purposes the next column is to generate integers starting at 1 and increasing by 1 row but resetting to 1 after the 10th row. For the demonstration it would need to be copied down a number of rows equal to the number of results needed (10 * number of groups of 10). Ultimately this formula can be embedded in the final formula.
Start in C1 and copy down
=MOD(ROW(A1)-1,10)+1
So now there is a way of indexing the column you need and what row of that column you need.
Indexing the solution
In the next column the INDEX function is used (twice) to find out which column and row to look in from the list of all possible combinations. In this demo, the list of all possible combinations is written out in F1:AX10.
First we start by indexing which random column to use. Since the random numbers are written in column A starting in row 1 I used the following formula:
=INDEX(A:A,B1)
To get the row reference I used the following formula:
=C1
I then took those two formulas and combined them to pull data from the possibility table as follows:
Start in D1
=INDEX($F$1:$AX$10,C1,INDEX(A:A,B1))
Tidying it up
We can't eliminate the random number column, as we need something quasi-static for the formulas to refer to. I say quasi-static because RANDBETWEEN is a volatile function, which means it will recalculate every time the sheet recalculates. However, we can place the formulas from B and C into D. This results in the formula in D looking like:
=INDEX($F$1:$AX$10,MOD(ROW(A1)-1,10)+1,INDEX(A:A,INT((ROW(A1)-1)/10)+1))
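To see how the pieces of that final formula fit together, here is a rough Python sketch. `TABLE` plays the role of F1:AX10 and `picks` plays the role of column A; both names are made up for this illustration.

```python
import random
from itertools import combinations

BLOCK = 10
# Stand-in for F1:AX10: 45 columns, each with exactly two False values.
TABLE = [[row not in pair for row in range(BLOCK)]
         for pair in combinations(range(BLOCK), 2)]

def brute_force_flags(n_groups, rng=random):
    # Column A: one random column number (1..45) per group of ten.
    picks = [rng.randint(1, len(TABLE)) for _ in range(n_groups)]
    flags = []
    for sheet_row in range(1, n_groups * BLOCK + 1):
        group = (sheet_row - 1) // BLOCK + 1          # INT((ROW(A1)-1)/10)+1
        row_in_group = (sheet_row - 1) % BLOCK + 1    # MOD(ROW(A1)-1,10)+1
        column = picks[group - 1]                     # INDEX(A:A, group)
        flags.append(TABLE[column - 1][row_in_group - 1])
    return flags

flags = brute_force_flags(5, random.Random(1))
```

All ten rows of a group read the same entry of `picks`, which is the reason the random numbers live in their own helper column.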
It's not clear which version of Excel you're using so this approach will work for all versions:
the starting point is C12:L13, where the formula in row 12 is
=RANDBETWEEN(1,5)
and the formula in row 13 is
=RANDBETWEEN(6,10)
These results determine the positions of the FALSE values in the range starting with cell C1 where the formula is
=NOT(OR(ROW()=C$12,ROW()=C$13))
The array formula in A1:A10 is
=INDEX($C$1:$L$10,,1+MOD(RANDBETWEEN(1,100),10))
column B is just an indexing column containing the formula
=1+MOD(ROW()-1,10)
which, coupled with the conditional formatting in column A illustrates that the positions of the FALSE values are different in each 10-row sequence.
(you will notice that the random numbers generated in columns I and J happen to be the same so, if this is a concern, you could extend the 'helper range' beyond 10 columns in order to augment randomness)
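One helper column from this approach can be sketched in Python as follows (the function name is hypothetical). It also shows a side effect worth knowing about: with one FALSE drawn from rows 1-5 and the other from rows 6-10, only 5 x 5 = 25 of the 45 possible arrangements can ever appear.

```python
import random

def half_split_block(rng=random):
    """One FALSE somewhere in rows 1-5 and one in rows 6-10, as in rows 12 and 13 of the helper range."""
    first = rng.randint(1, 5)     # =RANDBETWEEN(1,5)
    second = rng.randint(6, 10)   # =RANDBETWEEN(6,10)
    # =NOT(OR(ROW()=C$12,ROW()=C$13)) for rows 1..10
    return [row not in (first, second) for row in range(1, 11)]

rng = random.Random(7)
blocks = [half_split_block(rng) for _ in range(100)]
```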

Q: Begin a row from the first non-null value in excel

Problem:
I have a data table with the price of different references each year. Some references have been bought since 2004, others since 2002...
However, I want something like this, where all the prices begin at the same point, called year 1 in my example. Year 1 is the first year in which the company bought a reference (so this is the first non-null value in the row), year 2 is the second year, and so on.
The main difficulty is that I can't use VBA (even if I could, I don't see how) and I have to do this for several thousand rows.
How can I automate this?
I tried to get the first value of each row with this:
=INDEX(A2:V2;MATCH(TRUE;A2:V2<>"";0))
But I don't know how to get the values that come after it.
I got something for you:
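Sub Delete_blanks()
    ' Select only the blank cells inside the current selection
    Selection.SpecialCells(xlCellTypeBlanks).Select
    ' Delete them and shift the remaining cells to the left
    Selection.Delete Shift:=xlToLeft
End Sub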
Select the data and run the macro it should work for you.
Any questions or problems let me know.
With your requirements, I think it would be easier (in terms of finding the "next" column) to find the number of the first filled column instead of the first value. Then you can increment that.
=IF(INDEX(2:2,AGGREGATE(15,6,1/ISNUMBER(2:2)*COLUMN(2:2),1)-1+COLUMNS($A:A))="","",INDEX(2:2,AGGREGATE(15,6,1/ISNUMBER(2:2)*COLUMN(2:2),1)-1+COLUMNS($A:A)))
To find the first column number that contains a number (I assume the first column is text; if not, some changes may be needed):
1/ISNUMBER(2:2)*COLUMN(2:2)
will return an array of {#DIV/0!, n, …} where n is a column number where the cell contains a number
The AGGREGATE function will return the smallest number in this array, ignoring errors, and that is the first filled column.
We then increment this as we fill right with the COLUMNS($A:A).
And we need the IF statement because when INDEX finds an empty cell, it will return a zero instead of a blank.
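Here is a minimal Python sketch of what the filled-right formula produces for one row, assuming the first cell is a text label and the prices are numbers. The function name and the sample row are hypothetical.

```python
def align_to_first_value(row):
    """Shift the row left so it starts at its first numeric cell; blanks stay blank."""
    # AGGREGATE(15,6,...): index of the first cell that holds a number.
    first = next((i for i, v in enumerate(row) if isinstance(v, (int, float))), None)
    if first is None:
        return [""] * len(row)
    # INDEX(..., first - 1 + COLUMNS($A:A)), with the IF keeping empty cells as "".
    shifted = ["" if v in ("", None) else v for v in row[first:]]
    return shifted + [""] * (len(row) - len(shifted))

# Hypothetical row: label, two empty years, then prices with one gap
print(align_to_first_value(["Ref A", "", "", 12.5, 13.0, "", 14.2]))
```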

VBA Code: Average Columns to Right (Variable # of Columns)

I am writing a code that will average all values to the left in data sets that have varying numbers of columns. For example, if my data set is A1:AC1, then I used the code
ActiveCell.FormulaR1C1 = "=AVERAGE(RC[-28]:RC[-2])"
This code works for finding the average of cells B1:AB1, which is exactly what I want in this instance, but if my next data set has, say, 40 columns, this code will still only average the 27 cells referenced. I have researched everywhere and found that I should somehow first count the columns to the left and then use that in the average function, but I am not sure how exactly to accomplish this. How can I write code that will average all values to the left of my active cell regardless of how many columns there are?
You're close, just:
Don't use relative references for your starting range.
Keep the ending range relative.
Try this:
ActiveCell.FormulaR1C1 = "=AVERAGE(RC2:RC[-2])"
This will always start in column B and end 2 columns before the current cell, for example.
Hope that makes sense / helps.
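If it helps to see why the mixed reference adapts, here is a small Python sketch that resolves RC2:RC[-2] into A1 notation for a given active cell. Both helper names are made up for this illustration.

```python
def column_letters(index):
    """Convert a 1-based column number to Excel letters (1 -> A, 28 -> AB)."""
    letters = ""
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters

def resolve_average_range(active_row, active_col):
    start_col = 2               # RC2: always column B, wherever the formula sits
    end_col = active_col - 2    # RC[-2]: two columns left of the active cell
    return f"{column_letters(start_col)}{active_row}:{column_letters(end_col)}{active_row}"

print(resolve_average_range(1, 30))  # active cell AD1
print(resolve_average_range(1, 42))  # active cell AP1, for a wider data set
```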

Excel formula to lookup the last value in a column and return the value of the adjacent cell

I have the following formula to return the value of the last value in a column:
=LOOKUP(2,1/(D:D<>""),D:D)
What I need now is to return the value of the cell adjacent to it as well. (It will not necessarily be the last value in that column, and the info in Column D could have duplicates.)
If your data looks like this:
A 1
A 2
A 3
B 4
B 5
B 6
C 7
To get last value this will do the trick:
=INDIRECT("B"&COUNTA(A:A))
And to get last where value is A:
=INDIRECT("B"&MATCH("A",A1:A7,0)+COUNTIF(A1:A7,"A")-1)
Just use next column:
=LOOKUP(2,1/(D:D<>""),E:E)
Ok, So I have found an answer by playing around with array formulas.
The problem was that this is a stock control sheet where there are changes made at multiple times, each recorded in the next available row. There is always a date (Column E) but not necessarily a Supplier, as it might be stock moving out. When a Supplier delivers, the Supplier name is recorded in Column D. In D1 the last supplier is then shown with the following formula.
=LOOKUP(2,1/(D:D<>""),D:D)
I want to then see what date it was last received. The formula I found that works is as follows (Array Formula):
=INDEX(E:E,MAX(IF(D:D=D1,ROW(D:D)-ROW(INDEX(D:D,1,1))+1)))
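A quick Python sketch of the two lookups, assuming columns D and E are plain lists with "" for empty cells. The function names and the sample data are hypothetical.

```python
def last_non_blank(values):
    """Like =LOOKUP(2,1/(D:D<>""),D:D): the last value that is not empty."""
    for v in reversed(values):
        if v != "":
            return v
    return None

def last_row_matching(keys, target):
    """Like MAX(IF(D:D=D1,ROW(...))): 1-based position of the last row equal to target."""
    matches = [i for i, k in enumerate(keys, start=1) if k == target]
    return max(matches) if matches else None

# Hypothetical stock sheet: supplier (column D) and date (column E)
suppliers = ["Acme", "", "Bolt", "", "Acme", ""]
dates = ["2024-01-02", "2024-01-05", "2024-01-09", "2024-01-12", "2024-01-20", "2024-01-22"]

supplier = last_non_blank(suppliers)
received = dates[last_row_matching(suppliers, supplier) - 1]
print(supplier, received)
```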
This is generally how I do it:
=XMATCH(FALSE,ISBLANK(A:A),0,-1)
This is what each part does:
Parameter: Explanation
FALSE: The value to look for; Excel returns the first FALSE that it finds
ISBLANK(A:A): Takes in the column A:A and gives TRUE for every blank cell and FALSE for every cell that is not blank
0: Means we want an exact match. This is the default, so it is not strictly necessary to put in, but I think it's good practice anyway
-1: Instructs Excel to start the search at the bottom/right of the range and work up/left. If you change this to 1 (the default), Excel will begin the search at the top/left and work down/right
So, taken together, this will search from the bottom of the column A:A until Excel finds the first cell that is not blank, and return that cell's position (its row number, when the range is the whole column).
Also, yes, this formula can be changed to a row format (e.g. 1:1), and can take a smaller range (e.g. A1:A20), but it cannot take a 2-dimensional range (e.g. A1:B20).
As a practical matter, this approach is much faster than other approaches (and much faster than you'd think, given it's evaluating against every row/column in the range), and won't get fooled by columns that have blank cells in the middle (as a COUNTA-style approach would).
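The difference from a COUNTA-style count is easy to see in a small Python sketch, where None stands for a blank cell. The function name and the data are hypothetical.

```python
def last_non_blank_position(cells):
    """Search from the bottom, like XMATCH with search mode -1; return a 1-based position."""
    for pos in range(len(cells), 0, -1):
        if cells[pos - 1] is not None:   # ISBLANK(...) is FALSE for this cell
            return pos
    return None

column_a = ["a", None, "b", None, None]
position = last_non_blank_position(column_a)
counta = sum(1 for c in column_a if c is not None)  # what COUNTA would report
print(position, counta)
```

The position (3) points at the real last entry, while the count (2) is thrown off by the gap.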

HLOOKUP with wildcard and SUM more than one column

I need help with the following:
I have a worksheet containing some data. Row 1 is header and from row 2 downward is the data. At the end there is total for all the data above. This worksheet is dynamic, i.e., if week 1 has 200 rows of data, then week 2 could have 250 or 190 rows of data.
Likewise, the columns across, change every week. This week I have 18 columns and next week I could have 20 columns.
Within row # 1, the header, I have two headings "CTAEO1P" and "CTAEO2P".
On another worksheet, I want to add the "totals" of both of those columns i.e., Individual totals of CTAEO1P = 32.98 + CTAEO2P = 46.25 = 79.23
I am using named ranges and named the whole of the worksheet with data as "MT". The range is whole of the worksheet so when next week I copy the data over from another worksheet, I should not have to adjust the range.
I am using the following formula, courtesy of another expert on this forum:
=HLOOKUP("CT*",MT,MATCH(9^99,INDEX(MT,0,MATCH("CT*",INDEX(MT,1,0),0))),0)
This formula looks for any column that starts with "CT", and then the "MATCH(9^99" and "INDEX" parts find the last number within that column (the total in this case) and return that value on the worksheet. In this case the formula returns "32.98" only, as this is the first occurrence.
I think I can use "Sumproduct" formula here but then a) I would have to create more than one named range, one for the header row and another for the "Total" row, b) every week I would have to adjust the range for "Total" row. Unless, if I can nest "Match(9^99..." part within "SUMPRODUCT" function.
I want to use "MT" range alone and want to add the totals of all the columns that start with "CT".
I hope I have explained my problem well enough to make sense; if you need any further information, please let me know.
Regards
Tariq
I will forget about the MT range, as long as your data starts in A1 this will work
=SUMPRODUCT(ISNUMBER(SEARCH("CT*";OFFSET(A1;0;0;1;MATCH(9^99;2:2))))*OFFSET(A1;MATCH(9^99;A:A)-1;0;1;MATCH(9^99;2:2)))
Depending on your regional settings you may need to replace field separator ";" by ","
I think you can use a relatively simple SUMPRODUCT solution like this
=SUMPRODUCT((LEFT(INDEX(MT,1,0),2)="CT")*ISNUMBER(MT),MT)/2
SUMPRODUCT will total all values in the relevant columns, including the totals, so division by 2 will ensure you get the correct sum
If you don't like that approach then assuming first column of MT always has data and that the totals for each column will all be in the same row you can use SUMIF like this
=SUMIF(INDEX(MT,1,0),"CT*",INDEX(MT,MATCH(9^99,INDEX(MT,0,1)),0))
That should be more efficient than the first version
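Both ideas can be compared in a short Python sketch, assuming MT is a list of rows with the header first and the totals last. The table values and function names are hypothetical, and unlike SUMIF's "CT*" criterion the prefix test here is case-sensitive.

```python
def ct_total(table, prefix="CT"):
    """SUMIF version: add the totals-row entries whose header starts with the prefix."""
    header = table[0]
    # MATCH(9^99, first column): the last row holding a number in the first column.
    total_row = max(i for i, row in enumerate(table) if isinstance(row[0], (int, float)))
    return sum(v for h, v in zip(header, table[total_row])
               if str(h).startswith(prefix) and isinstance(v, (int, float)))

def ct_total_by_halving(table, prefix="CT"):
    """SUMPRODUCT version: add everything in the matching columns, then halve."""
    header = table[0]
    total = 0
    for row in table[1:]:
        for h, v in zip(header, row):
            if str(h).startswith(prefix) and isinstance(v, (int, float)):
                total += v
    return total / 2

# Hypothetical data: header, two data rows, totals row
MT = [
    ["QTY", "CTAEO1P", "OTHER", "CTAEO2P"],
    [1, 10, 5, 20],
    [2, 22, 7, 26],
    [3, 32, 12, 46],
]
print(ct_total(MT), ct_total_by_halving(MT))
```

The halving trick only works when the totals row really equals the sum of the rows above it, whereas the SUMIF version reads the totals directly.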

Resources