how to conditionally match in excel - excel

I've got two data sets: Data-A and Data-B.
Data-A
A B C D Start_Date End_Date
N C P 1 23-05-2015 27-05-2015
N C K 1 30-05-2015 07-06-2015
N C Ke 1 09-06-2015 28-06-2015
N C Ch 1 14-07-2015 25-07-2015
N C Th 1 29-06-2015 13-07-2015
N C Po 2 23-05-2015 27-05-2015
N C Kan 2 30-05-2015 08-06-2015
Data-B
X D Date A B C
444 1 09-07-2015
455 1 20-07-2015
1542 1 28-06-2015
2321 1 21-07-2015
2744 1 01-07-2015
7455 2 25-05-2015
12454 2 02-06-2015
18568 2 24-05-2015
28329 2 03-06-2015
28661 2 31-05-2015
Values is data-Bare missing and I need to fill them using conditional index matching/vlookup such that column D(Data-B) is matched along with Date(Data-B) such that Start Date<= Date <=End Date.
Desired Output:
X D Date A B C
444 1 09-07-2015 N C Th
455 1 20-07-2015 N C Ch
1542 1 28-06-2015 N C Ke
2321 1 21-07-2015 N C Ch
2744 1 01-07-2015 N C Th
7455 2 25-05-2015 N C Po
12454 2 02-06-2015 N C Kan
18568 2 24-05-2015 N C Po
28329 2 03-06-2015 N C Kan
28661 2 31-05-2015 N C Kan

Proof of Concept
In order to achieve the above I used the AGGREGATE function. It is a normal formula that performs array like calculations. The following formula will return the results from the first row that matches your criteria.
=INDEX(A$2:A$8,AGGREGATE(15,6,ROW($D$2:$D$8)/(($J2=$D$2:$D$8)*($E$2:$E$8<=$K2)*($K2<=$F$2:$F$8)),1)-1)
This assumed your table Data-A Started in A1 and included 1 row as a header row. The formula can be place in the first cell under A in Data-B and copied down and to the right as needed.
UPDATE Formula explained
The aggregate function performs array calculations within its brackets for certain sub function. There are about 19 different subfunctions. Subfunction 14 and 15 are both array calculations. This is a nice feature since it does array like calculations while being a regular formula.
Since I wanted the first row that met your criteria, I opted to use the small function or subfunction 15 for the first argument. Basically I am telling the aggregate function to generate a list and sort it in ascending order.
The second argument has a value of 6 which tell the aggregate to ignore any results from the array that generate errors. This will come in very handy if we can make results we do not want turn in to errors.
Now we are getting into the array portion of the formula. You can take this next part of the equation and highlight the appropriate rows in a neighbouring column and enter it as a CONTROL+SHIFT+ENTER (CSE) formula. As long as you do this in the top cell the array formula will propagate to the remainder of the selected cells and show you the results of the array. Also check the formula bar to see if { } appeared around your formula. You cannot add the { } manually.
{=ROW($D$2:$D$8)/(($J2=$D$2:$D$8)*($E$2:$E$8<=$K2)*($K2<=$F$2:$F$8))}
What this will do is determine the current row and then will divide it by the results of our conditions. You can also try each of the following conditions in a separate column as CSE formulas in the same manner described above to see their results.
($J2=$D$2:$D$8)
($E$2:$E$8<=$K2)
($K2<=$F$2:$F$8)
These on their own will provide you with either TRUE or FALSE as it checks each row. Now the interesting thing is, and this applies to excel formulas, when you perform a math operation on a Boolean, it will treat 0 as false and anything other number as TRUE. It will actually convert TRUE to 1. You will also note that each of the logic checks was separated by *. In this case * is acting like an AND operator as only when all results are true will you get an answer of 1. (+ will act like an OR operator)
Now if you remember from earlier 6 said to ignore all errors. So any row that does not meet our logic check will result in a division by 0 since not all logic checks results in TRUE or 1. All the checks that wound up false wind up getting ignored. So now after doing that, a list of only row numbers that met our criteria is left inside the aggregates array.
After the logic check there is a ,1 for the next argument. In this case we are telling the aggregate to return the 1st number in the list which is the first row number that met our criteria. If we wanted the third number, this would be ,3 instead.
So aggregate is returning the first row number of the results we want. When this is paired with an INDEX function, when can use the result to tell us what row of the INDEX function to look in. In this case we said we wanted to look in the index A$2:A$8. The aggregate function is telling us how many rows to go down in the index. If the index had start in row 1 we would not have to do anything. But since there is a header row, we need to adjust the results from the aggregate function by subtracting 1 for the head row (in reality you need to subtract the row number above the start of your data). This is why you see the -1 after the aggregate function.
Now if you pay attention to the lock on the range you will notice I did not lock the A in A$2:A$8. I did this so that I could copy the formula to the right and the column A address would update as I did. This only works because you were keeping the columns in the same order. If the order has changed I would have changed the index from a 1D array to a 2D array and used a MATCH function to line up the column headers.

Related

Textjoin values of column B if duplicates are present in column A

I want to consolidate the data of column B into a single cell ONLY IF the index (ie., Column A) is duplicated.
For example:
Currently, I'm doing manually for each duplicated index by using the following formula:
=TEXTJOIN(", ",TRUE,B4:B6)
Is there a better way to do this all at once?
Any help is appreciated.
There may easier way but you can try this formula-
=BYROW(A2:A17,LAMBDA(p,IF(INDEX(MAP(A2:A17,LAMBDA(x,SUM(--(A2:INDEX(A2:A17,ROW(x)-1)=x)))),ROW(p)-1,1)=1,TEXTJOIN(", ",1,FILTER(B2:B17,A2:A17=p)),"")))
Using REDUCE might be possible for a more succinct solution, though try this for now:
=BYROW(A2:A17,LAMBDA(ζ,LET(α,A2:A17,IF((COUNTIF(α,ζ)>1)*(COUNTIF(INDEX(α,1):ζ,ζ)=1),TEXTJOIN(", ",,FILTER(B2:B17,α=ζ)),""))))
For the sake of alternatives about how to solve it:
Using XMATCH/UNIQUE
=LET(A, A2:A17, ux, UNIQUE(A),idx, FILTER(XMATCH(ux, A), COUNTIF(A, ux)>1),
MAP(SEQUENCE(ROWS(A)), LAMBDA(s, IF(ISNA(XMATCH(s, idx)), "", TEXTJOIN(",",,
FILTER(B2:B17, A=INDEX(A,s)))))))
or using SMALL/INDEX to identify the first element of the repetition:
=LET(A, A2:A17, n, ROWS(A), s, SEQUENCE(n),
MAP(A, s, LAMBDA(aa,ss, LET(f, FILTER(B2:B17, A=aa), IF((ROWS(f)>1)
* (INDEX(s, SMALL(IF(A=aa, s, n+1),1))=ss), TEXTJOIN(",",, f), "")))))
Here is the output:
Explanation
XMATCH and UNIQUE
The main idea here is to identify the first unique elements of column A via ux, and find their corresponding index position in A via XMATCH(ux, A). It is an array of the same size as ux. Then COUNTIF(A, ux)>1) returns an array of the same size as XMATCH output indicating where we have a repetition.
Here is the intermediate result:
XMATCH(ux, A) COUNTIF(A, ux)>1)
1 FALSE
2 FALSE
3 TRUE
6 FALSE
7 TRUE
9 TRUE
11 FALSE
12 TRUE
15 FALSE
16 FALSE
so FILTER takes only the rows form the first column where the second column is TRUE, i.e the index position (idx) where the repetition starts. For our sample it will be: {3;7;9;12}.
Now we iterate over the sequence of index positions (s) via MAP . If s is found in idx via XMATCH (also XLOOKUP(s, idx, TRUE, FALSE) can be used for the same purpose) then we join the values of column B filtered by column A equal to INDEX(A,s).
SMALL and INDEX
This is a more flexible approach because in the case we want to do the concatenation in another position of the repetition you just need to specify the order and the formula doesn't change.
We iterate via MAP through elements of column A and index position (s). The name f has the filtered values from column B where column A is equal to a given value of the iteration aa. We need to identify only filtered rows with repetition, so the first condition ROWS(f) > 1 ensures it.
The second condition identifies only the first element of the repetition:
INDEX(s, SMALL(IF(A=aa, s, n+1),1))=ss
The second argument of SMALL indicates we want the first smallest value, but it could be the second, third, etc.
Where A is equal to aa, IF assigns the corresponding value of the sequence (remember IF works as an array formula), if not then it assigns a value that will never be the smallest one, for example, n+1, where n represents the number of rows of column B. SMALL returns the smallest index position. If the current index position ss is not the smallest one, the conditions FALSE.
Finally, we do a TEXTJOIN only when both conditions are met (we multiply them to ensure an AND condition).

how to vlookup if prefix found in the list?

HI.
how can i come up with return value of "company name" (column H) at Column B IF any of the "PrefiX" (Column G) found at "con no" (Column A).
Sample of outcome needed as in column B.
Sample:
620011113 = DD
CN1234 = BB
thanks
=INDEX($H:$H,AGGREGATE(15,6,ROW($G$1:$G$7)/(--(FIND($G$1:$G$7,$A2)=1)*--(LEN($G$1:$G$7)>0)),1),1)
Breaking this down, the INDEX retrieves the Nth item from Column H (Company name). To find the value of N, we are using the AGGREGATE function
AGGREGATE is a weird function - it lets us use things like MAX or LARGE or SUM while ignoring any error values. In this case, we will be using it for SMALL (first argument, 15), while Ignoring Error Values (second argument, 6). We will want the very smallest value, so the fourth argument will be 1. (If we wanted the second smallest, it would be 2, and so on)
=INDEX($H:$H,AGGREGATE(15,6, <SOMETHING> ,1),1)
So, all we need now is a list of values to compare! To make things slightly simpler, I'll break that bit of the code out for you here:
ROW($G$1:$G$7) / (--(FIND($G$1:$G$7,$A2)=1) * --(LEN($G$1:$G$7)>0))
There are 3 parts to this. The first, ROW($G$1:$G$7)is the actual value we want to retrieve - these will be the Row Numbers for each Prefix that matches your value. On its own, however, it will be all the row numbers. Since we are skipping errors, we want any Rows that don't match the prefix to throw an error. The easiest way to do this is to Divide by Zero
At the start of --(FIND($G$1:$G$7,$A2)=1) and --(LEN($G$1:$G$7)>0) we have a double-negative. This is a quick way to convert True and False to 1 and 0. Only when both tests are True will we not divide by 0, as this table shows:
A | B | A*B
1 | 1 | 1
1 | 0 | 0
0 | 1 | 0
0 | 0 | 0
Starting with the second test first (it's easier), we have LEN($G$1:$G$7)>0 - basically "don't look at blank cells".
The other test (FIND($G$1:$G$7,$A2)=1) will search for the Prefix in the Con No, and return where it is found (or a #VALUE! error if it isn't). We then check "is this at position 1" - in other words, "Is this at the start of the Con No, rather than in the middle". We don't want to say Con No CNQ6060 is part of Company AA instead of Company BB by mistake!
So, if the Prefix is at the Start of the Con No, AND it isn't Blank (because there is an infinite amount of Nothing Before, After, and Between every number and letter), then we get it added to our list of Rows. We then take the smallest row (i.e. closest to the top - change AGGREGATE(15 to AGGREGATE(14 if you want the closest to the bottom!), and use that to get the Company Name
You could try the below formal:
=VLOOKUP(IF(LEFT(A3,1)="6",LEFT(A3,4),IF(LEFT(A3,1)="C",LEFT(A3,2),IF(LEFT(A3,1)="E",LEFT(A3,7)))),$G$3:$H$7,2,0)
Have in mind that you have to use ' before the cell value of column A & G in order to convert cell value into text get the correct out comes using VLOOKUP
Result:

Deleting duplicate data in one of three column in Excel

I tried to use some functions that I found whilst searching to solve my problem, they are slightly modified it to remove duplicate data in a field.
File
Rather than a count of 4, I would like the count of 2 from column J. The information below are my attempts for 4 different sections on the attached document as I always thought the next one would give me the result that I wanted.
H ====I==========J
P13C Body Exterior 4943
P13C Body Exterior 4943
P13C Body Exterior 5122
P13C Body Exterior 5122
=IFERROR(INDEX($K$7:$K$142,MATCH(0,COUNTIFS($H$7:$H$142,B14,$K$7:$K$142,$E$14),0)),"")
as does this
=IFERROR(INDEX($J$7:$J$142,MATCH(,IF(H$7:H$142="P13C",COUNTIF(I7:I142,$J$7:$J$142)),)),"")
and this
=IFERROR(INDEX($K$7:$K$142,MATCH(0,COUNTIF($H$7:$H$142,$K$7:$K$142),0)),"")
This, gives me a 0
=IF($J$7:$J$142>1,IF($K$7:$K$142="20",SUM(IF(FREQUENCY($H$7:$H$142,$H$7:$H$142)>1,1))))
This gives me a DIV error
=SUMPRODUCT(((H7:H142="P13C")*(I7:I142="Body Exterior"))/(COUNTIFS(J7:J142, J7:J142, H7:H142, "P13C", I7:I142,"Body Exterior")+((H7:H142<>"P13C")+(I7:I142="Body Exterior"))))
There are duplicates in $J$7:$J$142, but I only want the one count.
Sort the column J in smallest to largest order
concatenate all three values by using '&' on next column K ---> H&I&J
then use IF and COUNTIF formula on next column L,
Column L:
=IF(K1=K2,COUNTIF(K1,K1:K142),"")
this a easy method to get count 1 for J column

Return a row number that matches multiple criteria in vbs excel

I need to be able to search my whole table for a row that matches multiple criteria. We use a program that outputs data in the form of a .csv file. It has rows that separate sets of data, each of these headers don't have any columns that are unique in of them self but if i searched the table for multiple values i should be able to pinpoint each header row. I know i can use Application.WorksheetFunction.Match to return a row on a single criteria but i need to search on two three or four criteria.
In pseudo-code it would be something like this:
Return row number were column A = bill & column B = Woods & column C = some other data
We need to work with arrays:
There are 2 kinds of arrays:
numeric {1,0,1,1,1,0,0,1}
boolean {TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE}
to convert between them we can use:
MATCH function
MATCH(1,{1,0,1,1,1,0,0,1},0) -> will result {TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE}
simple multiplication
{TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE}*{TRUE,FALSE,TRUE,TRUE,TRUE,FALSE,FALSE,TRUE} -> will result {1,0,1,1,1,0,0,1}
you can can check an array in the match function, entering it like in the picture below, be warned that MATCH function WILL TREAT AN ARRAY AS AN "OR" FUNCTION (one match will result in true
ie:
MATCH(1,{1,0,1,1,1,0,0,1},0)=TRUE
, YOU MUST CTR+SHIFT+ENTER !!! FOR IT TO GIVE AN ARRAY BACK!!!
in the example below i show that i want to sum the hours of all the employees except the admin per case
we have 2 options, the long simple way, the complicated fast way:
long simple way
D2=SUMPRODUCT(C2:C9,(A2=A2:A9)*("admin"<>B2:B9)) <<- SUMPRODUCT makes a multiplication
basically A1={2,3,11,3,2,4,5,6}*{0,1,1,0,0,0,0,0} (IT MUST BE A NUMERIC ARRAY TO THE RIGHT IN SUMPRODUCT!!!)
ie: A1=2*0+3*1+11*1+3*0+2*0+4*0+5*0+6*0
this causes a problem because if you drag the cell to autocomplete the rest of the cells, it will edit the lower and higher values of
ie: D9=SUMPRODUCT(C9:C16,(A9=A9:A16)*("admin"<>B9:B16)), which is out of bounds
same as the above if you have a table and want to view the results in a diferent order
the fast complicated way
D3=SUMPRODUCT(INDIRECT("c2:c9"),(A3=INDIRECT("a2:a9"))*("admin"<>INDIRECT("b2:b9")))
it's the same, except that INDIRECT was used on the cells that we want not be modified when autocompleting or table reorderings
be warned that INDIRECT sometimes give VOLATILE ERROR,i recommend not using it on a single cell or using it only once in an array
f* c* i cant post pictures :(
table is:
case emplyee hours totalHoursPerCaseWithoutAdmin
1 admin 2 14
1 him 3 14
1 her 11 14
2 him 3 5
2 her 2 5
3 you 4 10
3 admin 5 10
3 her 6 10
and for the functions to check the arrays, open the insert function button (it looks like and fx) then doubleclick MATCH and then if you enter inside the Lookup_array a value like
A2=A2:A9 for our example it will give {TRUE,TRUE,TRUE,FALSE,FALSE,FALSE,FALSE,FALSE} that is because only the first 3 lines are from case=1
Something like this?
Assuming that you data in in A1:C20
I am looking for "Bill" in A, "Woods" in B and "some other data" in C
Change as applicable
=IF(INDEX(A1:A20,MATCH("Bill",A1:A20,0),1)="Bill",IF(INDEX(B1:B20,MATCH("Woods",B1:B20,0),1)="Woods",IF(INDEX(C1:C20,MATCH("some other data",C1:C20,0),1)="some other data",MATCH("Bill",A1:A20,0),"Not Found")))
SNAPSHOT
I would use this array* formula (for three criteria):
=MATCH(1,((Range1=Criterion1)*(Range2=Criterion2)*(Range3=Criterion3)),0)
*commit with Ctrl+Shift+Enter

Excel VBA - Referring between ranges

Here's my problem:
I have two ranges, r_products and r_ptypes which are from two different sheets, but of same length i.e.
Set r_products = Worksheets("Products").Range("A2:A999")
Set r_ptypes = Worksheets("SKUs").Range("B2:B999")
I'm searching for something in r_products and I've to select the value at the same position in r_ptypes. The result of Find method is being stored in cellfound. Now, consider the following data:
Sheet: Products
A B C D
1 Product
2 S1
3 P1
4 P2
5 S2
6 S3
Sheet: SKUs
A B C D
1 SKU
2 S1-RP003
3 P1-BQ900
4 P2-HE300
5 S2-NB280
6 S3-JN934
Now, when I search for S1, cellfound.Row gives me value 2, which is, as I understand, 2nd row in the total worksheet, but is actually 1st row in the range(A2:A999).
When I use this cellfound.Row value to refer to r_ptypes.cells(cellfound.Row), It is taking it as an Index value and returns B3 (P1-BQ900) instead of what I want, i.e. B2 (S1-RP003).
My question is how'll I find out the index number in cellfound? If not possible, how can I use Row number to extract data from r_ptypes?
Dante's solution above works fine. Also, I managed to get the index value using built in excel function Match instead of using Find method of a range. Listing it here for reference.
indexval = Application.WorksheetFunction.Match("searchvalue", r_products, 0)
Using the above, I'm now able to refer the rows in r_ptypes
skuvalue = r_ptypes.Rows(indexval).Value
Because .Row always returns the absolute row number of a sheet, not the offset (i.e. index) in the range.
So, just do some minus job to deal with it.
For you example,
r_ptypes.Cells(cellfound.Row - r_ptypes.Cells(1).Row + 1)
or a little bit neat (?)
With r_ptypes
.Cells(cellfound.Row - .Cells(1).Row + 1)
End With
That is, get the row difference between cellfound and the first cell and + 1 because Excel counts cells from 1.

Resources