I'm struggling to create (by create, I mean hack together) a formula that will pull in the last entry from an individual row, depending on the value of another cell.
I have a formula to pull in data for the relevant name:
=IFERROR(VLOOKUP($B$1,DATA!$C$4:$BZ$64,2,0),"")
$B$1 contains a list of names, e.g. Bloggs, Joe; Doe, Jane; etc.
DATA!$C$4:$BZ$64 contains multiple criteria, some of which are "Class".
The class is either First, Second or Third and appears in columns O, T, Y, AD and so on (every 5th column up to $BZ).
Is there a way to get the data from the last populated cell in the row that corresponds to the name?
Bloggs, Joe may have entries for O, T, Y and AD but I would like to get the content from the last cell, AD
Doe, Jane may only have entries for O and T but, again, I would like to take the data from the last cell, T
I think I should be using the same formula but changing the part where it looks at column 2, but I'm not sure to what.
Hope this makes sense.
Thanks for looking,
Sam
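As a rough illustration of the logic being asked for (not an Excel formula), the Python sketch below walks one row's Class cells from right to left and returns the first non-empty one. The function name `last_populated` and the sample rows are hypothetical; each list stands in for the Class cells in columns O, T, Y, AD and so on.

```python
def last_populated(class_cells):
    """Return the last non-empty entry in a row's Class cells, or "" if none."""
    for value in reversed(class_cells):
        if value not in ("", None):
            return value
    return ""

# Hypothetical rows: one list item per Class column (O, T, Y, AD, ...)
rows = {
    "Bloggs, Joe": ["First", "Second", "Second", "Third", ""],
    "Doe, Jane": ["Third", "First", "", "", ""],
}

joe_last = last_populated(rows["Bloggs, Joe"])   # taken from the 4th Class column
jane_last = last_populated(rows["Doe, Jane"])    # taken from the 2nd Class column
```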
I am trying to figure out how to best do this. I have a sheet which has about 44 columns and around 64,000 rows. The columns have different customer data points such as name, date of birth, phone number, and e-mail (these are the most relevant columns for my purposes). I was wondering how I could sort by or highlight the rows in which at least three column data points match, to show a duplicate record for a customer. To explain clearly, I only want to highlight the rows that are duplicates based on at least 3 columns (the name column (the constant) and either phone number or DOB or e-mail.)
For example:
In the above, John Smith matched based on DOB alone, Lisa Winters based on email, and Stephanie Wright based on both DOB and email.
Now that I am looking at it more I will combine first and last name into one column so it will only have to match 2 or more columns instead of three.
I posted on Super User and all I got was COUNTIFS, which seems like a start, but I seem to need to incorporate "and, or" logic as well?
Any help with specific formulas is greatly appreciated!
Just for comparison, this would be the array-type approach, but as @Luuklag rightly says, it could be slow with 64K rows of data, although it does give complete results:
=SUMPRODUCT(($A2<>"")*($A2=$A$2:$A$10)*($B2=$B$2:$B$10)*SIGN((($C2=$C$2:$C$10)+($D2=$D$2:$D$10)+($E2=$E$2:$E$10))))>1
So this tests all rows to see if there is more than one which agrees with the current row on last name, first name, and one of DOB, phone and email, assuming your data is in the first five columns and omitting any rows where last name is blank. Adjust ranges to suit.
This is too slow on 64K rows. A little better is to use COUNTIFS:
=(COUNTIFS($A$2:$A$64000,$A2,$B$2:$B$64000,$B2,$C$2:$C$64000,$C2)
+COUNTIFS($A$2:$A$64000,$A2,$B$2:$B$64000,$B2,$D$2:$D$64000,$D2)
+COUNTIFS($A$2:$A$64000,$A2,$B$2:$B$64000,$B2,$E$2:$E$64000,$E2))>3
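The same test can be sketched outside Excel. The Python below is a minimal illustration, assuming each record is a tuple of (last name, first name, DOB, phone, email); the function name `flag_duplicates` and the sample records are made up for the example. Like the COUNTIFS version, it counts how many rows share the name plus each of the three fields and flags a row when some other row shares one of them. Unlike the formula, it skips blank fields outright.

```python
from collections import Counter

def flag_duplicates(records):
    """Flag rows that share full name plus DOB, phone or email with another row."""
    counts = Counter()
    for last, first, *fields in records:
        for position, value in enumerate(fields):
            if value:  # ignore blank DOB/phone/email
                counts[(last, first, position, value)] += 1
    flags = []
    for last, first, *fields in records:
        flags.append(any(
            bool(value) and counts[(last, first, position, value)] > 1
            for position, value in enumerate(fields)
        ))
    return flags

# Hypothetical records: (last name, first name, DOB, phone, email)
records = [
    ("Smith", "John", "1980-01-01", "555-0100", "js@example.com"),
    ("Smith", "John", "1980-01-01", "555-0199", "john@example.com"),
    ("Smith", "Jane", "1980-01-01", "555-0100", "jane@example.com"),
    ("Wright", "Stephanie", "", "", ""),
]
flags = flag_duplicates(records)
```

Because the counts are built in a single pass, the work grows roughly in line with the number of rows rather than with its square.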
You should sort your data on name, and then create an extra helper column that flags whether each row is a duplicate or not.
You could simply use a formula in F2 like:
=IF(AND($A2=$A1,$B2=$B1,OR($C2=$C1,$D2=$D1,$E2=$E1)),1,0)
This will give you 1's in column F for rows that duplicate the row above on both first and last name and at least one other column. This isn't a completely ideal solution, of course, as it doesn't always catch a duplicate. For example:
Suppose there are 3 entries with the same name, and the first has all other fields populated. The second entry has only name and email, and is considered a match to the first entry. The third entry has only name and DOB, and isn't considered a match to the second entry, as only the names match.
To circumvent this you would need INDEX(MATCH()); however, that is quite a burden on your PC, especially if you are going to use it recursively on 64K entries.
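The limitation is easier to see with the three-entry case written out. This Python sketch mimics the helper column by comparing each row only with the row directly above it; the function name `adjacent_flags` and the sample data are hypothetical. One deliberate difference from the formula as written: the sketch ignores blank fields, whereas in Excel two empty cells compare as equal.

```python
def adjacent_flags(rows):
    """Return 1 for rows matching the row above on name plus one other non-blank field."""
    flags = [0] if rows else []  # the first row has nothing above it
    for prev, cur in zip(rows, rows[1:]):
        same_name = prev[0] == cur[0] and prev[1] == cur[1]
        other_match = any(
            p == c and c != "" for p, c in zip(prev[2:], cur[2:])
        )
        flags.append(1 if same_name and other_match else 0)
    return flags

# Hypothetical rows: (last name, first name, DOB, phone, email)
rows = [
    ("Smith", "John", "1980-01-01", "555-0100", "js@example.com"),  # all fields
    ("Smith", "John", "", "", "js@example.com"),                    # name + email
    ("Smith", "John", "1980-01-01", "", ""),                        # name + DOB
]
flags = adjacent_flags(rows)
```

The third row shares its DOB with the first row, but since it is only compared with the second row it is not flagged.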
Currently, I am working with a scenario where I have 0 to 6 names in a field. For example, A2 has Bob Smith and Jone Random. Below is an example of how they look in a check.
Effi Liu
<- (enter/gap)
Kevin Xing
Basically, I want to generate one column that counts how many people are in A2 (for example, two people) and then create a function that will separate each of the names into different columns.
If you have the names separated by the Enter key in column A, use the formula below in column B, drag it to the right across 6 or more cells, and then drag it down:
=TRIM(MID(SUBSTITUTE(SUBSTITUTE($A1,CHAR(10),"#"),"#",REPT(" ",999)),(COLUMN(A:A)-1)*999+1,999))
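This formula splits the name string into multiple columns as you drag, using the line break entered with the Enter key as the delimiter. You could then count the names. Note that COUNTA also counts cells whose formula returns an empty string, so something like =COUNTIF($B1:$G1,"?*") (assuming the split results sit in columns B to G) is a safer way to count the non-blank results. Hope this helped you.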
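For comparison, here is the same split-and-count idea as a short Python sketch. It assumes the cell text uses a line feed (the character Excel calls CHAR(10)) between names; the function name `split_names` is hypothetical.

```python
def split_names(cell_text):
    """Split a multi-line cell into names, dropping blank lines and stray spaces."""
    return [line.strip() for line in cell_text.split("\n") if line.strip()]

# The example cell from the question: two names with an empty line between them
cell = "Effi Liu\n\nKevin Xing"
names = split_names(cell)
name_count = len(names)
```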
I have two lists of names: one is the employee list, the other a list of transfers. I want to match the names and then post the new city in an adjacent column.
I am using the following and I realize that the $X$1 will only post the first transfer city to all of the applicable names.
=IF(ISERROR(MATCH(D1,$Q$1:$Q$159,0)),"",$X$1)
What can I use to replace the $X$1?
Looks like it's =LOOKUP() time:
=IFERROR(LOOKUP($D1,$Q:$Q,$X:$X),"")
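This assumes the names on the transfer list are in column Q and the transfer cities are in column X. Note that LOOKUP does an approximate match and expects column Q to be sorted in ascending order; if the list is unsorted, or some employees are not on it, an exact-match alternative is =IFERROR(INDEX($X:$X,MATCH($D1,$Q:$Q,0)),"").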
Notes:
=IFERROR() lets you simplify the =IF() and =ISERROR() combo.
You can pass the whole columns and rows to functions instead of fixed ranges. This is useful because you don't have to rewrite your formulae each time more data are added. $Q:$Q means whole column Q, 1:1 means whole row 1 etc.
It's a good practice to fix the column of parameters, even if you don't plan to copy your formula to another column: $D1.
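The underlying operation is an exact-match lookup with a blank fallback. A minimal Python sketch, with a made-up transfer list and a hypothetical function name `transfer_city`:

```python
def transfer_city(employee, transfers):
    """Return the new city for an employee, or "" if the name is not on the transfer list."""
    return transfers.get(employee, "")

# Hypothetical transfer list: name (column Q) -> new city (column X)
transfers = {"Doe, Jane": "Leeds", "Bloggs, Joe": "Bristol"}

# Hypothetical employee list (column D)
employees = ["Bloggs, Joe", "Smith, Sam", "Doe, Jane"]
cities = [transfer_city(name, transfers) for name in employees]
```

A dictionary lookup does not depend on the order of the transfer list, which is the behaviour an exact match gives you in Excel as well.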
I'm trying to use Excel to extract figures based on multiple criteria and their location within columns.
So, for example, if I wanted to do a SUMIF to retrieve the figures associated with the First class, the formula would retrieve the figure in a specified row.
But if I wanted to retrieve the figure associated with England, the formula would need multiple criteria: look for the First class, then look for the country England, and retrieve the figure on its row in a specified column.
These columns will grow and shrink each month. Meaning I need it to be somewhat dynamic.
I've tried to do this using SUMIF and SUMIFS with no luck.
=SUMIFS(D2:D10,A2:A10,"First",B2:B10,"England")
The challenge you have is that in columns A, B and C, the values are not repeated downwards into the now blank cells. So values do not appear next to each other in the same row.
Assuming that the example you gave is quite simple, and you could also have multiple International Products for a given Class and Country, I would go for the following solution:
Reserve two columns (E and F) for intermediate calculations. If they are currently used, move those used columns to the right, making room for an empty E and F column. You could of course also choose two other columns for this purpose. But I will assume they are E and F.
Then in E2 put this formula and copy it further down the E column as far as needed.
=IF(A2<>"", A2, OFFSET(E2,-1,0))
In F2 put this formula and copy it down as well:
=IF(B2<>"", B2, IF(A2<>"", "", OFFSET(F2,-1,0)))
This should give the following display (the header titles in E1 and F1 are cosmetic only):
Now you can do formulas on those columns in combination with the C column. For instance:
=SUMIFS(D2:D10, E2:E10,"First", F2:F10,"England", C2:C10,"")
And this would output 2. Note that if you really only want to match one row, you should specify a condition for each column (E, F and C).
The intermediate formulas in the E and F columns are quite resistant to deletion of rows, due to the use of OFFSET. If you insert rows, you should of course make sure the formulas in E and F are copied into it.
If you will ever use more than 3 columns for the source data, you'll need to also add more intermediate columns with similar formulas. Also your SUMIFS would need extra conditions then.
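The fill-down logic of the two helper columns can be sketched in Python as follows. The rows are hypothetical (class, country, product, value) tuples with blanks where the sheet has empty cells, and the function name `fill_down` is made up for the illustration.

```python
def fill_down(rows):
    """Carry class and country down into blank cells, like helper columns E and F."""
    filled = []
    current_class = ""
    current_country = ""
    for cls, country, product, value in rows:
        if cls != "":
            current_class = cls
            current_country = ""  # a new class resets the country, as in column F
        if country != "":
            current_country = country
        filled.append((current_class, current_country, product, value))
    return filled

# Hypothetical sheet contents: class, country, product, value
rows = [
    ("First", "", "", 100),
    ("", "England", "", 40),
    ("", "", "Widget", 25),
    ("", "", "Gadget", 15),
    ("", "France", "", 60),
    ("Second", "", "", 70),
    ("", "England", "", 70),
]
filled = fill_down(rows)

# Equivalent of the SUMIFS on the helper columns: First, England, blank product
total = sum(
    value
    for cls, country, product, value in filled
    if cls == "First" and country == "England" and product == ""
)
```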
You could use the following SUMPRODUCT() For Class and Country:
=SUMPRODUCT(($A$2:$A$10=$F$1)*($B$3:$B$11=$G$1)*($D$3:$D$11))
Then for all three:
=SUMPRODUCT(($A$2:$A$10=$F$1)*($B$3:$B$11=$G$1)*($C$4:$C$12=H1)*($D$4:$D$12))
A picture for references.
The idea is that each column must move down one row in its reference. And the Sum column must start on the same row as the last column being referenced.
I have data in two columns:
a 1
a 1
a 2
b 3
b 4
In the list there are 4 unique rows. I would like to add a unique ID to each unique row.
Like this:
1 a 1
1 a 1
2 a 2
3 b 3
4 b 4
Of course I have many more rows and columns, and the data are more complex than in this example.
Is there any way to do this in Excel?
Mvh Kresten Buch
I have the same issue, I have developed a three formula approach to this. I could probably concatenate it if I nested them, but whatevs, this works.
Assume the data you want to 'number' is in column A, and the first row of the table is row 3.
The first column (in column B) counts occurrences of the 'value' and the range expands from the top of the table downwards as the table grows:
=COUNTIF($A$3:A3,A3)
The second column's formula (in column C) also expands as the row count does, and simply adds 1 to the running ID every time it encounters a 1 (i.e. the first occurrence of a new unique value) in column B:
=IF(B3=1,MAX($C$2:C2)+1,"")
This one worked for me even in the first row of the table, by the way; I was expecting to have to manually input a 1 to start the list. Having it work without the manual entry is a good thing: it means the formulas all work even if you re-sort the table data into a different order.
The third one in column D uses a vlookup to find the value. Note that when vlookup finds more than one number, it always pulls the first occurrence.
=VLOOKUP(A3,$A$3:C3,3,FALSE)
Note that this will renumber all the data dynamically if you do re-sort the entire thing, i.e. the formulas all still work, but the number assigned to a particular item might be different, as it all depends on whatever order the list of items is in.
My use case for these formulas assumes that every month I paste a new set of data at the bottom of the table, some items of which are repeats from previous months (i.e. are already in the table) and some of which are new.
If the dynamic renumbering is a problem, use a 'row key' so you can re-sort back to the original order at the end.
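To make the three helper columns concrete, here is a small Python sketch that produces the same three results for a list of values. The function name `number_values` and the sample values are hypothetical; the comments note which formula each step loosely corresponds to.

```python
def number_values(values):
    """Return (running count, new-ID marker, assigned ID) lists, like columns B, C and D."""
    seen_counts = {}
    ids = {}
    col_b, col_c, col_d = [], [], []
    for value in values:
        seen_counts[value] = seen_counts.get(value, 0) + 1
        col_b.append(seen_counts[value])       # like COUNTIF($A$3:A3,A3)
        if seen_counts[value] == 1:
            ids[value] = len(ids) + 1          # like MAX($C$2:C2)+1
            col_c.append(ids[value])
        else:
            col_c.append("")
        col_d.append(ids[value])               # like the VLOOKUP of the first occurrence
    return col_b, col_c, col_d

col_b, col_c, col_d = number_values(["x", "y", "x", "z"])
```

As with the formulas, the ID a value receives depends on where it first appears, so re-sorting the input changes the numbering.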
Assuming your data is in B2:C6 please try =IF(AND(B1=B2,C1=C2),A1,A1+1) in A2, copied down
If your data is not sorted, it's more complicated... but you can use something like this in A2:
=IF(COUNTIFS($B$1:B2,B2,$C$1:C2,C2)>1,INDEX($A$1:A1,IFERROR(MATCH(B2&"-"&C2,$B$1:B1&"-"&$C$1:C1,0),1)),MAX($A$1:A1)+1)
I'm assuming that there are no headers and you have already put 1 in cell A1 for the first record.
It basically checks the whole columns above the formula and if there's already a similar record, it'll assign the previously given unique ID and if not, it'll give a new ID.
This is an array formula and as such will only work if you confirm it with Ctrl+Shift+Enter and not Enter alone.
The IFERROR() is there because MATCH(B2&"-"&C2,$B$1:B1&"-"&$C$1:C1,0) would return an error if it is on row 2 (the first record to check).
Once you put that in the first cell, you can fill down the formula.
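The same "reuse the earlier ID if this row was seen before, otherwise take the next number" logic can be sketched in Python for unsorted, multi-column data. The function name `assign_ids` is hypothetical, and the sample rows are the question's data shuffled out of order.

```python
def assign_ids(rows):
    """Give each distinct row the ID of its first occurrence, numbering from 1."""
    ids = {}
    result = []
    for row in rows:
        key = tuple(row)
        if key not in ids:
            ids[key] = len(ids) + 1  # next unused ID
        result.append(ids[key])
    return result

# The question's rows, deliberately unsorted
rows = [("a", 1), ("b", 3), ("a", 1), ("a", 2), ("b", 4)]
row_ids = assign_ids(rows)
```

Using the whole row as a tuple key avoids the ambiguity that joining values with "-" could introduce if the values themselves contain a hyphen.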
I deal with this issue all the time when structuring a data set into panel data. Say you have multiple columns of data, and each row is identified by someone's name, like:
ANNE
ROSE
ANNE
FRANK
TOM
ROSE
ANNE
but instead of having each column related to Anne, Rose, Frank, or Tom, you want it to look like this:
1 ANNE
2 ROSE
1 ANNE
3 FRANK
4 TOM
2 ROSE
1 ANNE
So that each name now has a unique numerical identifier that can be used in place of the name.
Make a pivot table of your data and only place the column that has the names (or whatever the identifier may be) into the Rows section. This will single out all the different names used within the dataset. Copy and paste this pivot table anywhere on the sheet so that the names are in actual cells and not off of a pivot table. To the right of the names, enter 1 next to the first name, and then =B1+1, and so on so that you number each name with a unique value; then copy and paste this column as numbers so that their formulas are erased. Finally, just go to your original dataset and perform a VLOOKUP so that the names get attached with whatever unique value was assigned off of the pivot table. Make sure to copy and paste as numbers once done to remove the VLOOKUP formula.
Takes literally 2 minutes to do, depending on size of dataset, and is very easy. It will work perfectly every time.