Vlookup for double values - excel

How can I create a VLOOKUP formula or VBA code so that the lookup doesn't always return the first match, but returns the second match once the first has been used, and so on?
I think it becomes clearer with a screenshot
Some further explanation:
Column A is an identifier (in the real sample the cusip (an Identification number) of a company)
Column B represents a Dealnumber.
CUSIPs are not available for every deal (which explains the empty cells), and a company can occur in several deals, so the values in column A are not unique.
Since the values in column A are not unique, I have to "map" the values of column B, which are unique, to the occurrences in column D.

Use this, but it requires a title row on the data column:
=IF(D1=D2,E1,INDEX(B:B,AGGREGATE(15,7,ROW($B$1:$B$6)/($A$1:$A$6=D2),COUNTIFS($D$1:D1,D2,$D$2:D2,"<>"&D2)+1)))
It does not matter what the value in D1 is but the formula must be placed in the second row as it uses the first row as a check for changes.
Since this is an array-type formula, the references should be limited to the data set rather than full columns. We can use INDEX(MATCH()) to size the ranges automatically:
=IF(D1=D2,E1,INDEX(B:B,AGGREGATE(15,7,ROW($B$1:INDEX(B:B,MATCH("zzz",A:A)))/($A$1:INDEX(A:A,MATCH("zzz",A:A))=D2),COUNTIFS($D$1:D1,D2,$D$2:D2,"<>"&D2)+1)))
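The logic of the formula above can be sketched outside Excel. The Python below is only an illustration under assumptions: the function name lookup_by_run and the sample data are made up, and None stands in for the error Excel would show when there are no more matches. Like the formula, it repeats the previous result while the lookup value stays the same and moves on to the next match each time a new run of that value starts.

```python
def lookup_by_run(ids, deals, lookups):
    """ids/deals mirror columns A and B; lookups mirrors column D (hypothetical data)."""
    results = []
    runs_seen = {}        # lookup value -> number of runs already started
    previous = object()   # sentinel that never equals a real value
    for value in lookups:
        if value == previous:
            # Same value as the row above: reuse the result (the IF(D1=D2,E1,...) part).
            results.append(results[-1])
        else:
            k = runs_seen.get(value, 0)
            matches = [deal for i, deal in zip(ids, deals) if i == value]
            # k-th match, or None when the matches are exhausted.
            results.append(matches[k] if k < len(matches) else None)
            runs_seen[value] = k + 1
        previous = value
    return results


ids = ["A", "B", "A", "C"]
deals = [10, 20, 30, 40]
print(lookup_by_run(ids, deals, ["A", "A", "B", "A", "C"]))
```

Here the second run of "A" picks up deal 30 because deal 10 was already used by the first run.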

If I am understanding how your Columns A & B are being populated (by looking at occurrences in C & D), why not just use the C&D data?
You could number the tasks like this (I have this in D2):
=IFERROR(IF(C2=C1,D1,D1+1),1)
This gives the occurrence numbers without VLOOKUP or INDEX/MATCH, adding 1 wherever the task changes. The IFERROR supplies the first "1", since adding 1 to the header text in D1 would otherwise return an error.
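As a rough illustration of the same running counter, here is a minimal Python sketch. The function name run_numbers and the sample list are assumptions, not part of the workbook.

```python
def run_numbers(tasks):
    """Number consecutive runs: the counter goes up by 1 whenever the task changes."""
    numbers = []
    for i, task in enumerate(tasks):
        if i == 0:
            numbers.append(1)                # first data row (the IFERROR branch)
        elif task == tasks[i - 1]:
            numbers.append(numbers[-1])      # same task as the row above
        else:
            numbers.append(numbers[-1] + 1)  # task changed
    return numbers


print(run_numbers(["x", "x", "y", "z", "z"]))
```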

You probably have your answer by now. Here's a nice (additional) solution.
=IFERROR(INDEX($B$3:$B$13, SMALL(IF(D$2=$A$3:$A$13, ROW($B$3:$B$13)-2,""), ROW()-2)),"")
or
=IFERROR(INDEX($B$3:$B$13,SMALL(IF(D$2=$A$3:$A$13,ROW($A$3:$A$13)- MIN(ROW($A$3:$A$13))+1,""), ROW()-2)),"")
See this link:
https://www.ablebits.com/office-addins-blog/2017/02/22/vlookup-multiple-values-excel/
To get VBA code, enter the function and get it working, then turn on the Macro Recorder, click on the primary cell (first cell), hit F2, hit Enter to commit the change, and double click the black cross to fill down. Now, you should have all your VBA, and of course you can modify your code to do whatever you want.

Related

Find and remove duplicate IDs and replace

I have only very basic Excel knowledge. I have a spreadsheet where I keep track of reloading data. Each load I enter gets a unique load number that is calculated automatically with a formula, based on the caliber name and an incrementally increasing number. As of now, every load I enter gets a number, even if it's been repeated before. Popular loads that I repeat often are identical except for the date and number of rounds made, but currently get different load numbers. Is there a way, with a formula instead of manually, to skip these repeated loads and either assign them the previous load number or not assign a load number at all?
I know this is asking for a great deal but I'd greatly appreciate any help! I'm certainly open to suggestions if this isn't even the best way to go about this.
Sample workbook at:
https://www.dropbox.com/s/v5y1ufxjiosmnap/My%20Reloading%20Data%20-%20Sample.xlsx?dl=0
Here's what I've tried so far:
In column Q2, combine all the criteria.
=C17&E17&F17&G17&H17&L17&M17&N17&P17
In column R2 look for duplicates.
=IF(COUNTIF($Q$2:$Q17, $Q17)>1, "Duplicate", "")
D2 is the Load # column.
=IF(R17="Duplicate","",(TEXT(C17,0)&"-"&TEXT(COUNTIF($C$2:C17,C17),"000")))
This will skip the duplicate loads and not give them a load # leaving the cell blank. I'd love to find and match what that load # should be and insert it. Also, when the sequential numbering resumes it acts as if it's counted the duplicate row. For instance D2 might look like:
9mm-001
9mm-002
(Skipped for duplicate and left blank, but would like it to find, match, and insert the duplicate load #)
9mm-004 (I'd like this to be 9mm-003)
You should be able to achieve this with a VLOOKUP formula or a combination of MATCH and INDEX.
VLOOKUP (Vertical Lookup) looks for a match in another cell and returns a value from an offset column. A non match, if you use FALSE as the last parameter, returns a #N/A error.
So, in D20 (for example) you could, using column Q as your determinant, use the following, assuming you had a copy of D in column R:
=IFERROR(VLOOKUP(Q20,Q$1:R19,2,FALSE),[value for newly found loadno])
This formula tries a VLOOKUP first and, if no record is found, calculates a new value. The VLOOKUP takes the concatenated key from column Q of the current row and searches all previous rows (the range is anchored at row 1 but not at the bottom, so the formula can be copied down). It returns column 2 of the range (Q is column 1, so R is column 2) and demands an exact match (FALSE). If there is no match, VLOOKUP returns #N/A and the second argument of IFERROR takes over.
See how you go with this.
INDEX/MATCH may work better: VLOOKUP can only return values from columns to the right of the key, which is why it needs the additional R column, whereas INDEX/MATCH does not.
Here is an INDEX and MATCH solution - slightly harder to understand, but a more flexible solution.
=IFERROR(INDEX(D$1:D19,MATCH(Q20,Q$1:Q19,0)),[value for newly found load number])
I prefer this.
The outer function says return the nth value in the list. The inner MATCH function says find this value (Q20) in this list (Q1:Q19). The 0 as the third parameter of the MATCH function says the match has to be exact.
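To show the intended outcome (reuse the earlier load number for a repeat, and keep the sequence gap-free), here is a small Python sketch. It is an illustration only: the function name assign_load_numbers and the sample keys are assumptions, and the key stands for the concatenated criteria from column Q. Counting only newly seen keys is one possible way to fill the "[value for newly found load number]" placeholder.

```python
def assign_load_numbers(rows):
    """rows: (caliber, key) pairs, where key is the concatenated criteria (hypothetical)."""
    seen = {}       # key -> load number already assigned
    counters = {}   # caliber -> how many distinct loads so far
    numbers = []
    for caliber, key in rows:
        if key not in seen:
            # New load: advance the per-caliber counter and format like 9mm-003.
            counters[caliber] = counters.get(caliber, 0) + 1
            seen[key] = f"{caliber}-{counters[caliber]:03d}"
        numbers.append(seen[key])  # repeats get the earlier number back
    return numbers


rows = [("9mm", "load-a"), ("9mm", "load-b"), ("9mm", "load-a"), ("9mm", "load-c")]
print(assign_load_numbers(rows))
```

The third row repeats the first load, so it gets 9mm-001 again and the next new load becomes 9mm-003 rather than 9mm-004.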

Matching part of Excel column to another and pasting a set of values

I have a long column in an Excel spreadsheet that I'd like to match with another long column in the same sheet.
The columns look like:
I want to make a script that checks column A against column F and, where they match, copies F, G, H into B, C, D, so that the result looks like:
The problem is that A and F don't usually match exactly; only most of the words - as you can see with "metal corp & company" vs "metal corp".
Is there any way I can do this via script? I know I can see if they match exactly using something like =NOT(ISERROR(MATCH(A2,$F$2:$F$5, 0))) etc....Or to explicitly state the name in quotations, but nothing to match only part of the entry to another.
Also, is there any way to automatically paste F,G,H if the match is found?
OK, you can do a VLOOKUP and then replace the formulas with their results via copy / paste values if needed, though you may not need to.
First, consider how to make VLOOKUP work. VLOOKUP takes a value, searches for it in the first column of an array, and returns a corresponding value from the row where it is found. Since you want the values to the right of the match, this should work well once the match itself works. Afterwards, just repeat the same formula, adding 1 to the column index for each column being returned (F, G, H).
Your real issue is getting the match to work, since most of the time the values are close but not identical. You need to make them match somehow: either by manually building a lookup table in which the aliases are aligned (e.g. on another sheet, put "metal corp & company" in column A and "metal corp" in column B), or by a formula that establishes the match. Depending on the number of rows, doing it manually may be substantially less work than a formula. A formula can work if the aliases differ in a consistent way; if the differences are random, a formula won't work.
I would try something as simple as =VLOOKUP(value,array,column,TRUE). TRUE returns an approximate match instead of an exact match. Note that this is not a fuzzy text match: the first column of the array must be sorted ascending, and the formula returns the row with the largest value less than or equal to the lookup value.
See the example screenshot; the only formula is in column B. "hi" is close enough to "high", and "there" is returned from column E.
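To make clear what an approximate match does, here is a minimal Python sketch of the idea (largest key less than or equal to the lookup value, on a sorted list). The function name approx_vlookup and the sample data are assumptions, and Python's string ordering is only a stand-in for Excel's, which is case-insensitive and may order some characters differently.

```python
from bisect import bisect_right


def approx_vlookup(value, keys, results):
    """keys must be sorted ascending; returns the result for the largest key <= value."""
    pos = bisect_right(keys, value)
    if pos == 0:
        return None  # nothing is <= value; Excel would show #N/A here
    return results[pos - 1]


keys = ["apple", "hi", "zebra"]
results = ["x", "there", "z"]
print(approx_vlookup("high", keys, results))   # lands on "hi"
print(approx_vlookup("hello", keys, results))  # lands on "apple", not "hi"
```

The second call shows why this is not a similarity search: "hello" sorts before "hi", so it falls back to "apple" even though "hi" looks closer.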

Transpose multiple occurrences

EDIT: I have revised the source data to remove the ambiguity of my last screenshots.
I am trying to transpose spreadsheet data where there are many rows where the customer name may be duplicated but each row contains a different product.
For instance
revised original data source
to
revised proposed data format
I would like to do it with formulae if possible as I struggle with VB
Thank you for any help
I realise this is a huge answer, apologies but I wanted to be clear. If you need anything from me, drop me a comment and I'll help out.
Here's the output from my formula:
EDITED ANSWER - Named ranges used for ease of understanding:
These are just an example of a few of the named ranges I have used, you can reference the ranges directly or name them yourself (simplest way is to highlight the data then put the name in the drop down next to the formula bar [top left])
Be wary that as we will be using Array formulas for AccNum and AccType, you will not want to select the entire column and instead opt for either the exact data length or overshoot it by 100 or so. Large array formulas tend to slow down calculation and will calculate every cell individually regardless of it being empty.
First formula
=IF(COUNTIF(D2:D11,">""")>0,CONCATENATE("Account Number ",LEFT((COLUMN(A:A)+1)/2,1)),"")
This formula is identical to the one in the original answer apart from the adjusted heading title.
=IF(Condition,True,False) - There are so many uses for the IF logic; it is the best formula in Excel in my opinion. I have used IF with COUNTIF to check whether more than 0 cells contain something other than blank (""). This is just a trick to avoid ISBLANK() or other blank identifiers, which get confused when a formula is present.
If the result is TRUE, I use CONCATENATE(Text1,Text2,etc.) to build a text string for the column header. ROW(1:1) or COLUMN(A:A) is commonly used to generate an automatically increasing integer for formulas, depending on whether the count needs to increase vertically or horizontally. I add 1 to this increasing integer and divide it by 2 so that the increase for each column is 0.5 (1 > 1.5 > 2 > 2.5). I then use the LEFT formula to take just the first digit of this decimal answer, so the number increases only once every 2 columns.
If the result is FALSE, the final argument ,"") leaves the cell blank. Standard stuff here, no explanation needed.
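The column-pairing arithmetic can be checked with a short Python sketch. The function names are made up for illustration. It also shows one difference between the two variants used in this answer: taking the first character, as LEFT(...,1) does, only works while the pair number has a single digit, whereas rounding down, as in the ROUNDDOWN version, keeps working.

```python
def header_number_left(column):
    """Mimics LEFT((COLUMN()+1)/2, 1): first character of the halved value."""
    return str((column + 1) / 2)[0]


def header_number_rounddown(column):
    """Mimics ROUNDDOWN((COLUMN()+1)/2, 0): integer part of the halved value."""
    return (column + 1) // 2


print([header_number_left(c) for c in range(1, 7)])       # pairs 1, 1, 2, 2, 3, 3
print(header_number_left(19), header_number_rounddown(19))  # "1" versus 10
```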
Second Formula
=CONCATENATE(INDEX(Forename,MATCH(Sheet4!$A2,Reference,0)))
=CONCATENATE(INDEX(Surname,MATCH(Sheet4!$A2,Reference,0)))
CONCATENATE has only been used here to force blank cells to remain blank when pulled by INDEX. INDEX will read blank cells as values and therefore 0's whereas CONCATENATE will read them as text and therefore "".
INDEX(Range,Row,Column): This is a lookup formula that is much more advanced than VLOOKUP or HLOOKUP and not limited in the way that they are.
The range I have used is the expected output range - Forename or Surname.
The row is then calculated using MATCH(Criteria,Range,Match Type). Match will look through a range and return the position as an integer where a match occurs. For this I have set the criteria to the unique reference number in column A for that row, the range to the named range Reference and the match type as 0 (1 Less than, 0 Exact Match, -1 Greater than).
I did not define a column number for INDEX as it defaults to the first column and I am only giving it one column of data to output from anyway.
Third Formula
Remember these need to be entered as an array (when in the formula bar hit Ctrl+Shift+Enter)
=IFERROR(INDEX(AccNum,SMALL(IF(Reference=Sheet4!$A2,ROW(Reference)-ROW(INDEX(Reference,1,1))+1),ROUNDDOWN((COLUMN(A:A)+1)/2,0))),"")
=IFERROR(INDEX(AccType,SMALL(IF(Reference=Sheet4!$A2,ROW(Reference)-ROW(INDEX(Reference,1,1))+1),ROUNDDOWN((COLUMN(B:B)+1)/2,0))),"")
As you can see, one of these is used for AccNum and the other for AccType.
IFERROR(Value,Value_if_error): This is used because we do not expect the formula to always return something. When the formula cannot return anything, or SMALL has run out of matches to go through, an error will occur (usually #VALUE! or #NUM!), so I use ,"") to force a blank result instead (again standard stuff).
I have already explained the INDEX formula above so let's just dive in to how I have worked out the rows that match what we are looking for:
SMALL(IF(Reference=Sheet4!$A2,ROW(Reference)-ROW(INDEX(Reference,1,1))+1),ROUNDDOWN((COLUMN(B:B)+1)/2,0))
The IF statement here is fairly self-explanatory, but because it is used in an array formula, it evaluates the comparison =Sheet4!$A2 (the unique reference) against every cell in the named range Reference individually. In your mock data this returns {FALSE;TRUE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE} for the first entry (I included titles in the range, hence the initial FALSE). IF will do my row calculation* for every TRUE but leave the FALSEs as they are.
This leaves a result of {FALSE;2;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE;FALSE} that SMALL(array,k) will use. SMALL only works on numeric values and returns the k-th smallest. Again the column trick has been used, but to cover more ground I used another method: ROUNDDOWN(Number,digits) as opposed to LEFT(). Digits here means decimal places, so I used 0 to round down to a whole integer for the same result. As this copies across the columns as 1, 1, 2, 2, 3, 3, SMALL will alternately (as the formulas alternate) grab the 1st smallest AccNum, then the 1st smallest AccType, before grabbing the 2nd AccNum and AccType, and so forth.
*(Row number of the match minus the first row number of the range, then plus 1. This is a fairly common, foolproof way to always get the correct row regardless of where the data starts. As your data starts on row 1 we could just use ROW(Reference), but I left it as is in case you had data in a different format.)
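The overall effect of the INDEX/SMALL/IF formulas (one row per reference, with each match's account number and type laid out side by side) can be sketched in Python. This is only an illustration: the function names and the sample rows are assumptions, and the empty-string padding plays the role of the ,"") in IFERROR.

```python
def transpose_accounts(rows):
    """rows: (reference, acc_num, acc_type) tuples in sheet order (hypothetical data)."""
    wide = {}
    for reference, acc_num, acc_type in rows:
        # Append this match's pair after any earlier pairs for the same reference.
        wide.setdefault(reference, []).extend([acc_num, acc_type])
    return wide


def pad_rows(wide, fill=""):
    """Turn the mapping into equal-width rows, blank-filling where matches ran out."""
    width = max((len(values) for values in wide.values()), default=0)
    return [[ref] + values + [fill] * (width - len(values)) for ref, values in wide.items()]


rows = [("R1", "111", "Savings"), ("R2", "222", "Current"), ("R1", "333", "Loan")]
for line in pad_rows(transpose_accounts(rows)):
    print(line)
```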
ORIGINAL ANSWER - Same logic as above
Here's your solution in 3 parts
Part 1 is a trick for the auto-completion of the titles so that they hide when not used (in case you just copy and paste values over the whole lot to speed things up again).
=IF(COUNTIF(C2:C11,">""")>0,CONCATENATE("Product ",LEFT((COLUMN(A:A)+1)/2,1)),"") in C
=IF(COUNTIF(D2:D11,">""")>0,CONCATENATE("Prod code ",LEFT((COLUMN(B:B)+1)/2,1)),"") in D
Highlight both of the cells and drag across to stagger the outputs "Product " and "Prod code "
Part 2 would be inputting the unique IDs to the new sheet, I would suggest copying your entire column A across to a new sheet and using DATA > REMOVE DUPLICATES > Continue with current selection to trim out the multiple occurrences of unique IDs.
In column B use =INDEX(Sheet2!$B$1:$B$7,MATCH(Sheet4!$A2,Sheet2!$A$1:$A$7,0)) to get the names pulled across.
Part 3, the INDEX
Once again, we are doing a staggered input here before copying the formula across the page to cover the entirety of the data.
=IFERROR(INDEX(Sheet2!$C$1:$D$11,SMALL(IF(Sheet2!$A$1:$A$11=Sheet4!$A2,ROW(Sheet2!$A$1:$A$11)-ROW(INDEX(Sheet2!$A$1:$A$11,1,1))+1),ROUNDDOWN((COLUMN(A:A)+1)/2,0)),1),"") in C
=IFERROR(INDEX(Sheet2!$C$1:$D$11,SMALL(IF(Sheet2!$A$1:$A$11=Sheet4!$A2,ROW(Sheet2!$A$1:$A$11)-ROW(INDEX(Sheet2!$A$1:$A$11,1,1))+1),ROUNDDOWN((COLUMN(B:B)+1)/2,0)),2),"") in D
The formulas of Part 3 will need to be entered as an array (when in the formula bar hit Ctrl+Shift+Enter) . This will need to be done before copying the formulas across.
These formulas can now be dragged / copied in all directions and will feed off of the unique ID in column A.
My Answer is already rather long so I haven't gone on to break the formula down. If you have any trouble understanding how this works, let me know and I will be happy to write up a quick guide, breaking it down chunk by chunk for you.

How Do I Copy The Cell Containing a Substring?

I have a list of alphanumeric inventory items in Column A. They are sorted ascending by value.
In Column B, I have an unsorted list of the inventory items filenames.
I'd like to place a formula in Column C that finds the cell in Column B that contains a substring that matches Column A.
I've experimented with several forms of VLOOKUP and INDEX/MATCH, but the best I've gotten is an index number of the matching cell. That isn't quite what I need, but it's the closest I've gotten.
I'd really like to get the entire value of the cell in Column B.
I know of one way to do this, and one additional resource that might make your life much easier.
First, if I had the index of the matching cell, I would simply use the INDIRECT function. It is used like this:
indirect("string reference to a cell")
In your case, cell C1 would contain
=indirect("A" & B1)
Second, the resource I have in mind is something I read long, long ago, but stuck in my mind as absolutely brilliant. It uses the lookup function, which only exists for backwards compatibility but has turned out to have hidden utility. See this MrExcel.com page for more info. Barry Houdini's answer is, to prevent link rot, repeated here without the question itself: =LOOKUP(2^15,SEARCH(D$2:D$10,A2),E$2:E$10)
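For comparison, the underlying idea (find the cell whose text contains a given substring and return the whole cell) looks like this in Python. This is a sketch under assumptions: the function name find_containing and the sample filenames are made up, matching is case-insensitive as with SEARCH, and it returns the first hit, which is not necessarily the one the LOOKUP formula would pick when several cells match.

```python
def find_containing(item, filenames):
    """Return the first filename that contains item (case-insensitive), else None."""
    needle = item.lower()
    for name in filenames:
        if needle in name.lower():
            return name
    return None


filenames = ["img_zz99.png", "photo_ab12_front.jpg"]
print(find_containing("AB12", filenames))
```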

Vlookup and get the min value (date)

TOP Table is Input, and bottom table is preview for required output.
For Each ID I need to find earliest datetime. I also need other information from other columns (please see image below).
My current solution is:
In Cell E2 =A2
Cell E3 drag down =IF(E2<>A3,IF(E1=A3,"",A3),"")
In Cell F2 drag down =IF(E2<>"",MIN(IF($A$2:$A$14=E2,$C$2:$C$14)),"") Ctrl+Shift+Enter
One more option without any intermediate calculations:
Select the whole range starting E2 and to the last row where IDs are located - for the sample given it's row 14, so select range E2:E14: =IFERROR(INDEX($A$2:$A$14,SMALL(IF(MATCH($A$2:$A$14,$A$2:$A$14,0)=ROW(INDIRECT("1:"&ROWS($A$2:$A$14))),MATCH($A$2:$A$14,$A$2:$A$14,0),""),ROW(INDIRECT("1:"&ROWS($A$2:$A$14))))),"") and press CTRL+SHIFT+ENTER instead of usual ENTER - this will define a Multicell ARRAY formula and will result in curly {} brackets around it (but do NOT type them manually!).
F2 (ID2): =IF(E2="","",SUMPRODUCT(--(E2=$A$2:$A$14),--(G2=$C$2:$C$14),$B$2:$B$14)) - normal formula.
G2 (Min Date): =IF(E2="","",MIN(IF(E2=$A$2:$A$14,$C$2:$C$14,2^100))) and press CTRL+SHIFT+ENTER instead of usual ENTER - this will define an ARRAY formula and will result in curly {} brackets around it (but do NOT type them manually!).
H2 (InCh): =IF(E2="","",INDEX($D$2:$D$14,SUMPRODUCT(--(E2=$A$2:$A$14),--(F2=$B$2:$B$14),--(G2=$C$2:$C$14),ROW(INDIRECT("1:"&ROWS($D$2:$D$14)))))) - normal formula.
Remarks:
To make the solution more compact and easier to read, define a named range for the ID column, and then reference the other data columns using OFFSET.
ID2 values may not be unique - as in the sample for IDs 1...3.
The resulting Min Date values should be formatted the same way as the source Date values.
The key formula of the solution is the multicell monster, which returns unique IDs without empty rows, as the OP requested.
Sample file: https://www.dropbox.com/s/d2098updfh8djnf/MinDateIDs.xlsx
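What these formulas compute together (for each ID, the row with the earliest datetime, along with its other columns) can be sketched in Python. This is an illustration only: the function name earliest_rows, the tuple layout (ID, ID2, datetime, InCh) and the sample values are assumptions.

```python
from datetime import datetime


def earliest_rows(rows):
    """rows: (id, id2, when, in_ch) tuples; keeps the earliest row for each id."""
    best = {}
    for row in rows:
        key = row[0]
        # Keep this row if the id is new or its datetime is earlier than the stored one.
        if key not in best or row[2] < best[key][2]:
            best[key] = row
    return list(best.values())  # ids stay in order of first appearance


rows = [
    (1, "a", datetime(2015, 3, 2, 9, 0), "in"),
    (1, "b", datetime(2015, 3, 1, 8, 0), "out"),
    (2, "c", datetime(2015, 3, 5, 7, 30), "in"),
]
for row in earliest_rows(rows):
    print(row)
```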
This is quite a challenge... I think I have found an approach that works. For the sake of clarity, I used a few helper columns. Also, I did not use any named ranges but stuck with the column-row indications. You might want to change that.
It looks like this:
and zooming in to the relevant columns:
Column F contains an array formula to filter out duplicates. An approach is explained here. The formula I used in F2 is
=INDEX($A$2:$A$14, MATCH(MIN(IF(COUNTIF($F$1:F1,$A$2:$A$14)=0, 1, MAX((COUNTIF($A$2:$A$14, "<"&$A$2:$A$14)+1)*2))*(COUNTIF($A$2:$A$14, "<"&$A$2:$A$14)+1)), COUNTIF($A$2:$A$14, "<"&$A$2:$A$14)+1, 0))
Use Ctrl-Shift-Enter to confirm as array formula. Drag this down or copy into column F. Then columns G and H contain the starting and ending indices of the duplicate ID values. This answer helped, please upvote it :-). The two formulas used are:
=MATCH(2,1/FREQUENCY($F2,$A$2:$A$14))
in G2, and
=FREQUENCY($A$2:$A$14,$F2)
in H2. Again, drag them down to get the full column filled. Next, column I is for clarification only -- and for sanity checking. It contains the desired minimum date from each sub-array. Column J substitutes that formula into a MATCH to find the actual index of the desired date.
=MIN(OFFSET($C$2:$C$14,$G2-1,0,1+$H2-$G2,1))
in I2 and
=$G2-1+MATCH(2,1/FREQUENCY(MIN(OFFSET($C$2:$C$14,$G2-1,0,1+$H2-$G2,1)), OFFSET($C$2:$C$14,$G2-1,0,1+$H2-$G2,1)))
in J2. Finally, columns L, M and N index into the original set of data via
=INDEX(B$2:B$14,$J2)
in L2, which you can drag horizontally and then vertically.
When you are done, you can hide the helper columns, or fold everything into big formulas. Good luck with that... There might be an easier way to achieve this, but I did not find it.
If you want the value from column D in G then assuming that column C values are unique you could just use a VLOOKUP, i.e. in G2 copied down
=VLOOKUP(F2,C$2:D$14,2,0)
Per your picture, they're all in the same sheet. Just sort by ID, then Date (ascending). As you work your way down the ID column, each time the ID changes, you know you've found the row with the minimum Date for that specific ID. Create an extra column to signify where ID changes occur, and filter for those rows (hide the column if you so desire).
And... voila.
I know this thread is old, but there is a much shorter and easier way!
How about using a pivot table with Minimum as the field setting, and then using =GETPIVOTDATA() to get the information back?
That seems a lot simpler than these formulas!
Actually, I just realized I've been overthinking this...Excel keeps the top item and removes all that follow when removing duplicates.
So if you are going to create an extra working table anyway, why not just copy the range/columns you want to keep, then use the basic sort.
Sort first by ID, then by the column you want as the second filter. Be sure the sorts are in the order you want (e.g. newest to oldest, oldest to newest, A to Z, Largest to smallest, etc).
Once the data is sorted, remove duplicates based on ID. You are left with all of your columns of data, filtered by newest/oldest/largest/smallest per individual.
This worked for my table with 30,000+ records, filtered down to 1500 unique individuals with most recent (plus associated amount), and with a second filter, the largest (plus associated date) for each person.
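The sort-then-remove-duplicates procedure can be sketched in Python as well. The function name sort_then_dedupe and the (ID, date, amount) layout are assumptions for illustration. It relies on the same property described above: after sorting, the first row kept for each ID is the one you want.

```python
def sort_then_dedupe(rows, newest_first=False):
    """rows: (id, date, amount) tuples; keep the first row per id after sorting."""
    ordered = sorted(rows, key=lambda r: r[1], reverse=newest_first)
    ordered.sort(key=lambda r: r[0])  # stable sort keeps the date order within each id
    seen = set()
    kept = []
    for row in ordered:
        if row[0] not in seen:        # first occurrence wins, later ones are dropped
            seen.add(row[0])
            kept.append(row)
    return kept


rows = [(2, "2020-03-01", 5), (1, "2020-02-01", 7), (1, "2020-01-01", 9), (2, "2020-01-15", 3)]
print(sort_then_dedupe(rows))                     # oldest row per id
print(sort_then_dedupe(rows, newest_first=True))  # newest row per id
```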
