Excel Column Missing Data - excel

I have an Excel spreadsheet with a date/name pair in two columns for one list, and the same kind of date/name pair in two columns for a second list.
One list is longer than the other. How can I find the date/name pairs that are missing from the other list?
For instance, let's say:
List 1            List2
1/2/2012  Tim     1/2/2012  Tim
2/2/2012  Jill    2/2/2012  Jill
3/2/2012  Bob
So basically I need to search List 1 and find out that List2 is missing "3/2/2012 Bob". The dates and names are each in their own column.
How do I do this? Keep in mind that the lists are in no particular order, so someone in List2 might show up in List 1, just not on the same row.

If List 1 is in columns A and B, and List2 is in columns C and D, then select an area the same size as List 1 (or the size of List 1 minus List2--make sure it has 2 columns) and enter this as an array formula (ctrl+shift+enter):
=IFERROR(INDEX(A2:B4,SMALL(IFERROR(MATCH(A2:A4&B2:B4,C2:C3&D2:D3,0)+ROWS(A2:A4),ROW(INDIRECT("1:"&ROWS(A2:A4)))),ROW(INDIRECT("1:"&ROWS(A2:A4)))),{1,2}),"")
Expand the rows as needed. Also, you'll need to format the first output column as a date.
EXPLANATION UPDATE
The Evaluate Formula dialogue is helpful for breaking down complicated formulas. Select the cell with the formula and press alt then T then U then F ("TUF" for tough formulas...).
Start with the MATCH function. Since we're looking for a date-name match, concatenate the date and name columns with &. MATCH will tell us which pairs in List 1 are also in List2 (specifically where: we'll get an array of the positions in List2 at which the List 1 pairs were found). If a match is not found, it will return #N/A. So for the OP's example MATCH will return {1;2;#N/A} (the first value in the List 1 array is in position 1 in List2, the second value is in position 2, and the third value was not found in List2).
The second argument in the inner IFERROR is an array of the indices of List 1. ROWS returns the number of rows in a range, INDIRECT returns a safe (meaning it won't be accidentally deleted or moved) reference to rows 1 through the number returned by ROWS, and ROW returns the row number of each of the rows--so in the OP's example, this is {1;2;3}.
We want the elements from List 1 that were not found in List2, so we add the number of rows in List 1 to the non-error MATCH results. This will send those values to the end of the array returned by SMALL. For the OP's example, the array passed as the first argument to SMALL is {4;5;3}.
Now we want to bring the indices of interest to the top using SMALL. We use ROW(INDIRECT("1:"&ROWS(A2:A4))) again as the second argument in SMALL to sort the array smallest to largest. For the OP's example, the resulting array is {3;4;5}.
We then pass that array to INDEX as the "row_num" argument and {1,2} as the column argument. INDEX will then pull both columns from the range that we give as its first argument for each row in the array that resulted from SMALL. The values at the end of the array that resulted from SMALL are larger than the number of rows in List 1 (since we added ROWS(A2:A4) to them), so they will result in #REF! errors. These correspond to the elements that returned a match in the MATCH function. We wrap this with yet another IFERROR to blank out the errors.
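The steps above can be traced outside Excel. The following is a minimal Python sketch, not the formula itself: the function name `missing_from_list2` is hypothetical, and rows are compared as (date, name) tuples instead of concatenated strings. It drops the matched rows, whereas the formula leaves blank cells in their place.

```python
def missing_from_list2(list1, list2):
    """Return rows of list1 that do not occur in list2, mirroring the formula's steps."""
    n = len(list1)
    keys = []
    for i, row in enumerate(list1, start=1):
        if row in list2:
            # MATCH(...)+ROWS(...): 1-based position in list2, pushed past n.
            keys.append(list2.index(row) + 1 + n)
        else:
            # Inner IFERROR: fall back to the row's own 1-based index in list1.
            keys.append(i)
    # SMALL(..., {1;2;...;n}): ascending order, so unmatched indices come first.
    ordered = sorted(keys)
    # INDEX errors for anything beyond n; the outer IFERROR blanks those out.
    return [list1[k - 1] for k in ordered if k <= n]


list1 = [("1/2/2012", "Tim"), ("2/2/2012", "Jill"), ("3/2/2012", "Bob")]
list2 = [("1/2/2012", "Tim"), ("2/2/2012", "Jill")]
print(missing_from_list2(list1, list2))
```

For the OP's example the intermediate `keys` list is [4, 5, 3], matching the array passed to SMALL in the explanation.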

Related

Appending two lists in excel

I have been trying and searching how to append two lists in excel to use in a formula. The lists do not exist in columns, they are created using a formula. I want to combine the two lists in a single one, not to show the values but to use the new list in a formula. I am using excel 365 (UNIQUE function). Let me replace my initial text by a real small case.
I have an excel file with 3 work sheets. Sheet1 is:
Sheet2 is:
Now I want to run some analysis in Sheet3. In my example I want to count how many unique values from column A have column B containing one of the letters 'a', 'b', 'c', or 'd'. For instance, in Sheet1, the letter 'a' appears in all rows. Column A has 3 unique values. So my result for 'a' is 3. The letter 'b' does not appear for the case where column A is '3'. Therefore the result for 'b' is '2'.
So I create a Sheet3 to show my results. The first column contains a list of letters {a, b, c, d}. I then use the formula:
=COUNT(UNIQUE(FILTER(Sheet1!$A$1:$A$100, ISNUMBER(SEARCH(A1, Sheet1!$B$1:$B$100)))))
From the inside out: the SEARCH function looks in cells B1 to B100 of Sheet1 (I can live with specifying a larger range) for the value specified in column A of the current sheet. If the value is found, SEARCH returns its position as a number. I check whether the return value is a number (ISNUMBER) and use this to filter the values in column A of Sheet1. I then apply the UNIQUE function to these values and finally count them.
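The same filter-then-count logic can be sketched in Python. This is only an illustration: the function name `count_unique_matching` is hypothetical, and the `sheet1` rows are made-up data chosen to agree with the description above ('a' in every row, 'b' absent where column A is 3), since the real sheet contents are not shown.

```python
def count_unique_matching(rows, letter):
    """Count distinct column-A values whose column-B text contains `letter`."""
    # Lower-casing both sides mimics SEARCH, which is not case-sensitive.
    return len({a for a, b in rows if letter.lower() in b.lower()})


# Hypothetical stand-in for Sheet1: (column A, column B) pairs.
sheet1 = [(1, "a, b"), (1, "a, c"), (2, "a, b"), (3, "a, d")]
for letter in "abcd":
    print(letter, count_unique_matching(sheet1, letter))
```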
Then I do the same with values in Sheet2. And it works. This is the output:
Column B is the number of unique values (as specified above) from Sheet1 and Column C the same from Sheet2.
So far so good. But now I want to have the counting of unique values globally. Not for each Sheet. One cannot just add the values from column B and C, as there might be an overlap. For example, the result for 'a' should be 3, not 5.
The solution here would be to grab the two unique lists (one from Sheet1 and the other from Sheet2), join them, UNIQUE this new list, and count. How do I join them ? That is my question.
Note that this 'counting of unique values' is just an example. I might want to find the maximum, or sort them, or find only prime numbers, or the average, or the median, or something else. So I need a general approach to join the results.
I got options close to a workable thing when all the data is in the same worksheet.
Finally, note that the data size I have is not huge, but it is large (thousands of lines at the most).
Here is something you could try:
=LET(x,{"A";"B";"C"},y,{"D";"E"},z,CHOOSE({1,2},x,y),cnt,MAX(COUNTA(x),COUNTA(y)),seq,SEQUENCE(cnt*2),final,INDEX(z,MOD(seq-1,cnt)+1,CEILING(seq/cnt,1)),FILTER(final,NOT(ISERROR(final))))
Here both 'x' and 'y' variables are placeholders for your two (vertical) arrays. In this case I used {"A";"B";"C"} and {"D";"E"} (in an English-locale Excel, semicolons separate rows, so these array constants are vertical). Assuming you just want to place the 2nd array directly under the 1st one, the above suggestion does just that.
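To see why the MOD/CEILING arithmetic walks down the first column and then the second, here is a small Python sketch of the same idea. The function name `stack` is hypothetical, and `None` stands in for the #N/A padding that CHOOSE produces when the two arrays differ in length.

```python
import math


def stack(x, y):
    """Place y under x using the same index arithmetic as the LET formula."""
    cnt = max(len(x), len(y))                       # cnt
    # z: two columns padded to equal height; None plays the role of #N/A.
    z = [list(x) + [None] * (cnt - len(x)),
         list(y) + [None] * (cnt - len(y))]
    final = []
    for seq in range(1, cnt * 2 + 1):               # SEQUENCE(cnt*2)
        row = (seq - 1) % cnt + 1                   # MOD(seq-1,cnt)+1
        col = math.ceil(seq / cnt)                  # CEILING(seq/cnt,1)
        final.append(z[col - 1][row - 1])           # INDEX(z,row,col)
    return [v for v in final if v is not None]      # FILTER out the padding


print(stack(["A", "B", "C"], ["D", "E"]))
```

Once the two lists are stacked, any further step applies to the combined list; for the global unique count in the question that would be something like `len(set(stack(list_a, list_b)))`.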

VLOOKUP - Find lookup value may be separated by comma

I have two lists, and I need to see whether values from the 1st list are also present in the 2nd list. However, due to the way my system is formatted, some cells in the 1st list contain multiple values that each need to be looked up.
If just one of the values is present in the 2nd list, it should print that value.
1st list values:
COLUMN A:
C00276129, CDK1029191
CAE031070
CAU029379
2nd list values:
COLUMN B:
CDK1029191
CAE031070
CUS0000000
CUS0000002
As you can see, in list one, some of the values may be printed on the same row, separated by a comma.
I am trying to get VLOOKUP to search for both values in list 1 and compare to the entire list 2:
=IFERROR(VLOOKUP(A1 & "*";B:B;1;FALSE);"Value not present")
However, above just returns "Value not present", even though the value on the first row is indeed present in list 2.
You can use this "clumsy" formula to return just the value that was found when two values are in the same row:
=TRIM(IFERROR(VLOOKUP(LEFT(A2,FIND(",",A2,1)-1),B:B,1,FALSE),"")&" "&IFERROR(VLOOKUP(RIGHT(A2,LEN(A2)-FIND(",",A2,1)-1),B:B,1,FALSE),"")&" "&IFERROR(VLOOKUP(A2,B:B,1,FALSE),""))
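The formula handles at most two values per cell. As a rough illustration of the general idea (split on commas, look each part up), here is a Python sketch. The function name `find_present` is hypothetical, the comparison is case-sensitive unlike VLOOKUP, and the "Value not present" fallback comes from the question's formula, not from the answer's, which returns an empty string instead.

```python
def find_present(cell, list2):
    """Return the comma-separated parts of `cell` that occur in list2."""
    wanted = set(list2)
    # Split on commas and trim the space that follows each comma.
    hits = [part.strip() for part in cell.split(",") if part.strip() in wanted]
    return " ".join(hits) if hits else "Value not present"


list1 = ["C00276129, CDK1029191", "CAE031070", "CAU029379"]
list2 = ["CDK1029191", "CAE031070", "CUS0000000", "CUS0000002"]
for cell in list1:
    print(find_present(cell, list2))
```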

Array of unique values and matching

Suppose several dice (3, for example) are thrown each time. There could also be more than six possible outcomes per "die", but I took six for better illustration.
1). Columns E or G:
Lookback is simply the size of an array. The array should include only unique values and ignore zero values. The tricky thing is that the series of observations are sorted from oldest to newest, and values of an array must be updated based on the newest series of 3 numbers (largest row number in the selected range).
So the parameters of a function should include (array range, max value, array size).
What I need to do is simply to take all values from 1 to 'max value' (1,2,3,...) and remove all values that are in the array. In other words, take only those values which are not included in the array for a given range. Finally, type them in ascending order using a comma delimiter.
2). Columns D or F:
Here we take any particular range of values, and compare it with our comma delimited list. If there is a match, then type matched numbers similarly using comma delimiter.
I suggest setting up a lookup table in columns H to M, with 1, 2, 3, 4, 5, 6 across the top in H1 to M1. Then in each row you can use HLOOKUP(H$1, $A3:$C3, 1, FALSE) in cells H3 to M3 (the $ anchors keep the references in place when the formula is filled across and down). This will return either a number or an error. You could further wrap this function in an IF: IF(ISERROR(HLOOKUP(...)), H$1, ""). This would give you a row of the numbers that were not found in your dice roll, which you could concatenate to get what you're looking for.
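As a rough sketch of what the helper columns compute for a single row, here is the same logic in Python. Both function names and the sample rolls are hypothetical, and the sketch covers only one row of rolls, not the lookback window over several rows that the question describes.

```python
def missing_values(rolls, max_value=6):
    """Comma-delimited values in 1..max_value that do not appear in `rolls`."""
    seen = {r for r in rolls if r != 0}  # ignore zeros, as the question asks
    return ",".join(str(v) for v in range(1, max_value + 1) if v not in seen)


def matched_values(rolls, missing):
    """Comma-delimited values from `rolls` that appear in a `missing` string."""
    wanted = {int(v) for v in missing.split(",") if v}
    return ",".join(str(v) for v in sorted(set(rolls) & wanted))


print(missing_values([2, 5, 2]))
print(matched_values([1, 2, 3], "1,3,4,6"))
```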

Lookup Multiple Items

I have a list of names and numbers
NAME | Number
Joe | 1
Jane | 0
Jack | 1
Jill | 0
John | 1
I'm trying to look up the numbers and find out the corresponding name
The formula I have is
{=index($A$2:$B$6, SMALL(IF($B$2:$B$6 = 1, ROW ($B$2:$B$6)), Row(1:1)), 1)}
As I understand the formula:
First Excel runs the index function. It runs the index function on the array A2 through B6.
For the row number in the index function, it uses the function SMALL(IF($B$2:$B$6 = 1, ROW($B$2:$B$6)), ROW(1:1)). This examines an array, B2:B6, and if the element under consideration in B2:B6 is a 1, it returns the row number of that element. In this case, it would return a 2.
At this point I'm kind of stuck. I'm guessing that the second ROW function returns first case of the 1 derived from the small function
Lastly, the index function finds the name located in column 1 for the index found.
Your understanding of this formula is pretty good. I assume that you are going to copy it down enough rows to get all the values reported? If so, here is what is happening:
INDEX needs to know what row to go retrieve. In order to do this, we are going to give it a row number.
In order to get a row number we need to know which items meet the condition. We use the IF conditional to report a row number if the condition is met (otherwise we get FALSE).
Since that will give us an array of row numbers, we then use the SMALL function to give us a single value. That satisfies the INDEX function which needs a single row to retrieve.
So which value do we choose from SMALL? Well, we just give it a sequence of 1-2-3-... by using ROW(1:1). When this is copied down, it will become ROW(2:2), ROW(3:3), etc. Each of these will return 1, 2, 3, respectively so we get the next entry. Note that SMALL skips FALSE so it works for the output of the IF call.
So the first call to ROW (inside the IF) is used to determine the row of the values in the array that match the condition.
The second call to ROW(1:1) is just used to get an incrementing sequence once the formula is copied down.
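The roles of the two ROW calls can be mimicked in a few lines of Python. This is only a sketch: the function name `kth_match_row` and the `names` lookup keyed by sheet row are hypothetical, and the `first_row=2` default assumes the data starts in row 2 as in the question.

```python
def kth_match_row(numbers, target, k, first_row=2):
    """Sheet row of the k-th cell equal to `target`, like SMALL(IF(...), k)."""
    # IF(range=target, ROW(range)): a row number where the test passes, else False.
    rows = [first_row + i if n == target else False
            for i, n in enumerate(numbers)]
    # SMALL ignores the False entries and returns the k-th smallest row number.
    hits = sorted(r for r in rows if r is not False)
    return hits[k - 1]


names = {2: "Joe", 3: "Jane", 4: "Jack", 5: "Jill", 6: "John"}  # sheet row -> name
numbers = [1, 0, 1, 0, 1]                                       # B2:B6
for k in (1, 2, 3):  # ROW(1:1), ROW(2:2), ROW(3:3) as the formula is copied down
    row = kth_match_row(numbers, 1, k)
    print(k, row, names[row])
```

Asking for a k larger than the number of matches raises an IndexError here, which is roughly the counterpart of the #NUM! error SMALL gives when the formula is copied down too far.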
The final thing to note is that your formula will be off by one row on the answers because ROW($B$2:$B$6) will return the absolute row number of those rows and not one that is relative to the starting corner of the array of interest. In this case, you will need to subtract 1 to get it to work (since it starts in row 2). In the general case, use a formula like this which accounts for the offset of the array:
=INDEX($A$2:$A$6,SMALL(IF($B$2:$B$6=1,ROW($B$2:$B$6)-ROW($B$2)+1),ROW(1:1)))
That is an array formula like you have (enter with CTRL+SHIFT+ENTER). The corresponding ranges look like:

Call 'Large' ref, grab value from column in referenced row

Let's say I have a few columns, five for example. Multiple rows. For each individual row, on column A and B, I have two strings that I would like to reference. Columns C and D add up into column E, which totals the two values.
What I'm looking to do is reference the largest values in the chart, pull that number, and also return the two strings in columns A and B.
I know you can pull the largest number in range x,y in col E with =LARGE(Ex:Ey,1), but how does one reference the row that the number represents?
Let's say for reference that the two strings in the sixth row are Alpha and Bravo, and this sixth row contains the largest value (26 for example) that I want to pull.
I'm looking for a way to get the output 26 Alpha Bravo, if that's possible. I'm making a list going from largest to smallest, so I'm looking for a way to incorporate LARGE in there as well - looking to pick the 10 largest values and their respective strings.
Any thoughts?
I'm looking for a way to get the output 26 Alpha Bravo, if that's possible.
Please try:
=MAX(E:E)&" "&INDEX(A:A,MATCH(MAX(E:E),E:E,0))&" "&INDEX(B:B,MATCH(MAX(E:E),E:E,0))
try =MATCH(LARGE(A1:A6, 2),A1:A6,0)
LARGE returns the value you are looking for, and MATCH will find the row it is in. Give MATCH a range that starts in row 1 so that it returns the physical row number, not the position within the list. MATCH with the third argument set to zero is basically a find-first, and the MATCH function can also accept wildcards.
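For the top-10 list the question asks about, the LARGE-plus-MATCH idea amounts to sorting the rows by their total and keeping the first few. Here is a minimal Python sketch under that reading. The function name `top_n` and the sample rows are hypothetical; only the "26 Alpha Bravo" row is taken from the question. Unlike MATCH, which always finds the first row holding a given value, sorting keeps every row of a tie.

```python
def top_n(rows, n):
    """Return the n rows with the largest total as (total, a, b), largest first."""
    # Each row is (a, b, c, d); c + d plays the role of column E.
    totals = [(c + d, a, b) for a, b, c, d in rows]
    # Python's sort is stable, so tied totals keep their original sheet order.
    return sorted(totals, key=lambda t: t[0], reverse=True)[:n]


rows = [
    ("Charlie", "Delta", 5, 7),
    ("Alpha", "Bravo", 10, 16),
    ("Echo", "Foxtrot", 12, 9),
]
for total, a, b in top_n(rows, 2):
    print(total, a, b)
```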
