How to identify new/repeat customers - Excel

I have a list of ~55k customer sales. I need to create a column to identify if a sale was repeat or new business based on customer name. I have columns A/B and I want to create column C as shown below:

Alternatively:
=IF(COUNTIF(B$2:B2,B2)>1,"Repeat","New")

This assumes that Column A is sorted ascending:
Put this in C2 and copy down:
=IF(ISNUMBER(MATCH(B2,$B$1:B1,)),"Repeat", "New")
If the data is not sorted, we need to switch to COUNTIFS(), which is not as optimized as MATCH and will cause calculation issues if the data set is too large (>10,000 rows):
=IF(COUNTIFS($A:$A,"<"&A2,$B:$B,B2),"Repeat","New")
Again put that in C2 and copy down the dataset.

MINIFS Beats COUNTIFS?
If there is only one first date associated with each customer, then the data does not have to be sorted and you can use the following:
=IF(A2>MINIFS($A:$A,$B:$B,B2),"Repeat","New")
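For checking the MINIFS logic outside Excel, here is a minimal Python sketch of the same rule. It assumes column A holds the sale date and column B the customer name; the function name `label_sales` and the sample rows are hypothetical.

```python
from datetime import date


def label_sales(sales):
    """Label each (sale_date, customer) pair as "New" or "Repeat".

    Mirrors the MINIFS formula: a sale is "Repeat" when its date is later
    than the earliest date recorded for that customer, so the input does
    not need to be sorted.
    """
    first_seen = {}
    for sale_date, customer in sales:
        if customer not in first_seen or sale_date < first_seen[customer]:
            first_seen[customer] = sale_date
    return [
        "Repeat" if sale_date > first_seen[customer] else "New"
        for sale_date, customer in sales
    ]


# Hypothetical, deliberately unsorted sample data.
sales = [
    (date(2023, 3, 1), "Acme"),
    (date(2023, 1, 5), "Acme"),
    (date(2023, 2, 9), "Bolt"),
]
labels = label_sales(sales)
```

As with the formula, two sales by the same customer on that customer's first date would both be labelled "New".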

Related

Query another table in Excel

I want to pull all the values from a particular column in another table. My goal is to take a handful of different tables and put particular columns from each of them into a single, collated table.
For example, let's say I have tables about different kinds of objects
FRUITS
name flavor
banana savory
orange sweet
peach sweet
PETS
name lifespan
dog long
fish short
cat long
Imagine that I now want to make a third table with the name column from fruits and pets.
COLLATED
name source
banana fruits
orange fruits
peach fruits
dog pets
fish pets
cat pets
I tried to install the Power Pivot add-in to do this, but I wasn't sure how to do it on a Mac. I'd prefer to use any "table connection" features that Excel offers, if that is possible.
A combination of ideas from both @Ike's and @JosWoolley's great answers would be this:
=LET(
n,{"Fruits","Pets","Cars"},
w,(tblFruits[Name],tblPets[Name],tblCars[Name]),
y,COUNTA(w),
s,SEQUENCE(AREAS(w)*y,,0),
q,1+QUOTIENT(s,y),
z,CHOOSE({1,2},IFERROR(INDEX(w,1+MOD(s,y),,q),""),INDEX(n,q)),
FILTER(z,INDEX(z,0,1)<>""))
For a new table, the table name would be added to the n variable and the column/range to the w variable without the need to edit the rest of the formula.
Edit #1
Adding more columns can get tricky using this approach but it can be done. For example having an extra 'Price' column in all tables would require something like this:
=LET(
n,{"Fruits","Pets","Cars"},
w,(tblFruits[Name],tblPets[Name],tblCars[Name]),
p,(tblFruits[Price],tblPets[Price],tblCars[Price]),
y,COUNTA(w),
s,SEQUENCE(AREAS(w)*y,,0),
q,1+QUOTIENT(s,y),
z,CHOOSE({1,2,3},IFERROR(INDEX(w,1+MOD(s,y),,q),""),INDEX(n,q),IFERROR(INDEX(p,1+MOD(s,y),,q),"")),
FILTER(z,INDEX(z,0,1)<>""))
where you have an extra p variable and the CHOOSE is updated to reflect the new values. Of course, you could change the order of the columns in the CHOOSE by either changing the order of the 3 parts or by simply changing the numbers in the {1,2,3} array (e.g. {1,3,2}).
=LET(w,(tblFruits[name],tblPets[name],tblCars[name]),x,AREAS(w),y,COUNTA(w),z,IFERROR(INDEX(w,1+MOD(SEQUENCE(x*y,,0),y),,1+INT(SEQUENCE(x*y,,0)/y)),""),FILTER(z,z<>""))
Amend the table column names as required, adding in as many as required.
This should work for reasonably small ranges, though x*y could certainly be improved as an upper bound.
Agreed with Ike that a recursive lambda would probably be of help here.
I added two tables to a sheet: tblFruits and tblPets.
Then you can put the following formula in any cell on the same sheet or another sheet.
=LET(
a,CHOOSE({1,2},tblFruits[name],"Fruits"),
b,CHOOSE({1,2},tblPets[name],"Pets"),
rowIndex,SEQUENCE(ROWS(a) + ROWS(b)),
colIndex,SEQUENCE(1,COLUMNS(a)),
IF(rowIndex<=ROWS(a),
INDEX(a,rowIndex,colIndex),
INDEX(b,rowIndex-ROWS(a),colIndex)
)
)
The first four lines of the formula define variables that are then used in the final IF function:
a and b return "virtual" arrays of each name column plus the "new" column giving the type.
rowIndex returns a single array {1,2,...(number of rows of both tables)}
colIndex returns an array built from the number of columns, in this case 2 (name and type)
These variables are used in the IF formula:
Think of it as a For i = 1 To UBound(rowIndex) loop.
If the current value from the rowIndex array is less than or equal to the number of rows of tblFruits,
then the INDEX result is based on virtual array a;
if not, the row index for b is calculated and the INDEX result is based on virtual array b.
The result is a spill-down array - you can use a filter on it. Just add a header row and add filter.
But you won't be able to create a table based on it.
Therefore you will have to use VBA to create the combined data.
This would be the formula with a third table:
=LET(
a,CHOOSE({1,2},tblFruits[Name],"Fruits"),
b,CHOOSE({1,2},tblPets[name],"Pets"),
c,CHOOSE({1,2},tblRooms[name],"Rooms"),
rowIndex,SEQUENCE(ROWS(a)+ROWS(b)+ROWS(c)),
colIndex,SEQUENCE(1,COLUMNS(a)),
IF(rowIndex<=ROWS(a),
INDEX(a,rowIndex,colIndex),
IF(rowIndex<=ROWS(a) + ROWS(b),
INDEX(b,rowIndex-ROWS(a),colIndex),
INDEX(c,rowIndex-(ROWS(a)+ROWS(b)),colIndex))))
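The stacking that these formulas perform can be sketched in Python to make the intended result explicit. This is only an illustration of the logic, not a replacement for the spill formula; the function name `collate` and the dictionary layout are assumptions, while the sample rows come from the FRUITS and PETS tables in the question.

```python
def collate(tables, column="name"):
    """Stack one column from several tables, tagging each row with its source.

    `tables` maps a source label to a list of row dictionaries. Rows are
    emitted table by table, in insertion order, like the IF/INDEX chain.
    """
    rows = []
    for source, records in tables.items():
        for record in records:
            rows.append((record[column], source))
    return rows


tables = {
    "fruits": [
        {"name": "banana", "flavor": "savory"},
        {"name": "orange", "flavor": "sweet"},
        {"name": "peach", "flavor": "sweet"},
    ],
    "pets": [
        {"name": "dog", "lifespan": "long"},
        {"name": "fish", "lifespan": "short"},
        {"name": "cat", "lifespan": "long"},
    ],
}
collated = collate(tables)
```

Adding a third table only means adding one more entry to `tables`, which is the same property the n and w variables give the LET version.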

Can I create a SUMIF with data from either of 2 columns?

I am creating a budgeting table. I am using SUMIF (Category, "Category Name", Amount). However, I have 2 sets of "Amount" - Budgeted and Actual.
How do I make it in such a way that the SUMIF for that Category would add the data from Budgeted if the Actual field is blank?
Try this
Check the ";" separators: depending on your regional settings, you may have to replace them with ",".
=SUMIFS(Amount;Category;SelectedCategory)+SUMIFS(Budget;Category;SelectedCategory;Amount;"")
Another way to do this is to have a column that represents the value to sum.
New column:
=if(isblank(Amount), Budget, Amount)
Then you just sumifs the new column by category.
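The helper-column idea can be sketched in Python as follows. The function name `sum_with_fallback`, the tuple layout (category, budgeted, actual) and the sample figures are all hypothetical; `None` stands in for a blank Actual cell.

```python
def sum_with_fallback(rows, category):
    """Sum Actual per category, using Budgeted where Actual is blank (None)."""
    total = 0
    for row_category, budgeted, actual in rows:
        if row_category != category:
            continue
        # Same rule as IF(ISBLANK(Amount), Budget, Amount).
        total += budgeted if actual is None else actual
    return total


# Hypothetical budget rows: (category, budgeted, actual).
rows = [
    ("Food", 100, 80),
    ("Food", 50, None),
    ("Rent", 900, None),
]
food_total = sum_with_fallback(rows, "Food")
rent_total = sum_with_fallback(rows, "Rent")
```

Note that an Actual of 0 counts as a real value here and does not fall back to Budgeted, which matches ISBLANK.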

Return second (and subsequent) unique name in list

I have a list of data that I get from a third party. For simplicity, let's say that Column A is the Unique ID (alphanumeric), and Column B is the employee who is assigned to that ID. One employee has several IDs, as they work several cases at a time. A few of the Unique IDs begin with "AC", and these IDs are special cases.
I need a formula that will search through Column A on the "Raw Data" sheet for any license number that begins with "AC", and return the Assigned Employee name on my "Assigned Employees" sheet. This is easy enough for the first one with a simple index match formula. However, I need it to bring back the second name, and any other names that are there. In the example below, I would need it to bring back Paul, then Lee.
Column A Column B
Unique ID Assigned Employee
AC798358 Paul
90807248 Paul
AC48298 Lee
B98281 Lee
AC42795 Lee
The table on "Assigned Employees" looks like this:
Employee 1 Employee 2 Employee 3 Employee 4
Paul Lee
I'm using this index match formula to get the first return (Paul), but it will only work for the first "AC" ID number on the sheet.
=INDEX('Assigned'!$B:$B,MATCH("AC*",'Assigned'!$A:$A,0))
I'm trying this formula, which would bring the first and subsequent returns by changing the "k" number for the "Small" function, but it's not working for me.
=INDEX('Assigned'!$B:$B,SMALL(IF('Assigned'!$A:$A="AC*",ROW('Assigned'!$A:$A)-ROW(INDEX('Assigned'!$A:$A,1,1))+1),1))
I know that it doesn't like this part: IF('Assigned'!$A:$A="AC*", but I don't know how else to write it to make it work. Any help would be appreciated.
Possibly relevant: there are a lot of blank rows in this data set.
There is a standard array formula method for pulling a unique list from a list of duplicates. For your sample data, put this in D2, finish with ctrl+shift+enter (aka CSE) then drag right.
=INDEX($B2:$B10, MATCH(0, COUNTIF($C2:C2, $B2:$B10), 0))
You can add conditions (e.g. IF(LEFT($A2:$A10, 2)="AC", ...)) to this.
=INDEX($B2:$B10, MATCH(0, IF(LEFT($A2:$A10, 2)="AC", COUNTIF($C2:C2, $B2:$B10)), 0))
This style of LISTUNIQUE formula requires an unused cell to the left (or above if listing into rows). If you don't have the room for an unused cell to the left or above, you could avoid that by using a more conventional formula to achieve the first item in the list and modifying the second to use the first as its reference starting point.
'in D2
=INDEX(B2:B10, MATCH("AC*", A2:A10, 0))
'in E2 (with CSE and dragged right)
=INDEX($B2:$B10, MATCH(0, IF(LEFT($A2:$A10, 2)="AC", COUNTIF($D2:D2, $B2:$B10)), 0))
You can avoid the #N/A errors when you run out of matching items with a wrapping IFERROR function.
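To make the expected output of the conditional unique-list formula concrete, here is a small Python sketch of the same logic. The function name `unique_special_employees` is hypothetical; the rows are the sample data from the question plus one blank row, since the data set is said to contain many.

```python
def unique_special_employees(rows, prefix="AC"):
    """Return employees assigned to IDs starting with `prefix`.

    Each name appears once, in order of first appearance. Blank rows are
    skipped because an empty ID never starts with the prefix.
    """
    seen = []
    for unique_id, employee in rows:
        if unique_id.startswith(prefix) and employee not in seen:
            seen.append(employee)
    return seen


rows = [
    ("AC798358", "Paul"),
    ("90807248", "Paul"),
    ("", ""),  # blank row
    ("AC48298", "Lee"),
    ("B98281", "Lee"),
    ("AC42795", "Lee"),
]
employees = unique_special_employees(rows)
```

One difference to keep in mind: `startswith` is case-sensitive, whereas the `=` comparison in the Excel formula is not.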

EXCEL Find values containing

I have a list of product codes and product SKUs and need to find partial matches. The problem is all the data is out of order.
I have provided a subset of data done manually
Master SKU Product Code Corresponding Product SKU
1_100049 1000510 1_1000510
1_1000510 1000511 1_1000511
1_1000511 100052 4_100052
1_100052 1000525 N/A
1_100053 100053 2_100053
1_100054 100054 1_100054
1_1000560 1000540 N/A
1_1000570 100055 N/A
1_1000575 1000560 1_1000560
1_100060 1000570 1_1000570
1_1000600 1000575 6_1000575
1_100061 100060 3_100060
1_1000620 1000600 1_1000600
I need to find the Product SKU corresponding to the product code. Is there any way to just list the match in column C? (The data is just in two columns, A and B.)
The formula I have is
=VLOOKUP(A2,B$2:B$6000,3,"TRUE")
You can use INDEX/MATCH on a modified Master-SKU column with an array formula
=INDEX(A2:A10,MATCH(B2,RIGHT(A2:A10,LEN(B2)),0))
Use Ctrl-Shift-Enter when you insert the formula. If your columns contain numbers instead of text, you might have to add VALUE
=INDEX(A2:A10,MATCH(B2,VALUE(RIGHT(A2:A10,LEN(B2))),0))
This VLOOKUP may work for you. Adjust the lookup range to your data:
=VLOOKUP(RIGHT(A2,LEN(A2)-2),$B$2:$B$2,1,0)
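The suffix comparison behind the INDEX/MATCH answer, RIGHT(A, LEN(B)) = B, can be sketched in Python like this. The function name `find_sku` is hypothetical, and the SKU list is only the Master SKU column of the sample, so results against the full data may differ.

```python
def find_sku(product_code, master_skus):
    """Return the first master SKU whose trailing characters equal the code.

    Returns None when nothing matches, which corresponds to #N/A in Excel.
    """
    for sku in master_skus:
        if sku.endswith(product_code):
            return sku
    return None


master_skus = [
    "1_100049", "1_1000510", "1_1000511", "1_100052", "1_100053",
    "1_100054", "1_1000560", "1_1000570", "1_1000575", "1_100060",
    "1_1000600", "1_100061", "1_1000620",
]
match = find_sku("1000510", master_skus)
missing = find_sku("1000525", master_skus)
```

A pure suffix test can also match a code that is shorter than the part after the underscore. Comparing against the text after the underscore, as the VLOOKUP answer does by stripping the first two characters, is stricter.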

Spread values across rows based on codes

I have two tables (1. the purchase order and 2. the invoice). I want to spread the quantity from the invoice table across the purchase order table's invoice quantity column, matched by code, while matching the quantity in the purchase order quantity column.
Here is how the table looks now:
Purchase order table and Invoice table
And this is how I want it to look:
In this post a formula was suggested
=MAX(MIN(M$2-SUM(E$1:E1), D2), 0)
which I customized to use vlookup so it can match the code,
=MAX(MIN(VLOOKUP(A2,J:M,4,FALSE)-SUM(E$1:E1), D2), 0)
but that doesn't work.
@Jeeped suggested using the AGGREGATE function for a one-column conditional match, but can anyone give me an example related to this situation?
Here is the sample Excel file
Thank you!
In E2 as,
=MAX(MIN(VLOOKUP(A2, J:N, 4, FALSE)-SUMIFS(E$1:E1, A$1:A1, A2), D2), 0)
What happened to row 1?
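The running allocation in the E2 formula can be traced with a short Python sketch. Since the original tables were only posted as images, the codes and quantities below are hypothetical, as is the function name `spread_invoice`. Unlike VLOOKUP, which returns #N/A for a missing code, this sketch treats a missing code as an invoiced quantity of 0.

```python
def spread_invoice(po_rows, invoiced):
    """Spread each code's invoiced quantity over purchase order rows in order.

    Each row receives MAX(MIN(invoiced - already allocated, ordered), 0),
    where "already allocated" is the running total for the same code
    (the SUMIFS part of the formula).
    """
    allocated_so_far = {}
    result = []
    for code, ordered_qty in po_rows:
        used = allocated_so_far.get(code, 0)
        remaining = invoiced.get(code, 0) - used
        qty = max(min(remaining, ordered_qty), 0)
        allocated_so_far[code] = used + qty
        result.append(qty)
    return result


# Hypothetical purchase order rows (code, ordered quantity) and invoice totals.
po_rows = [("A", 10), ("B", 5), ("A", 10), ("A", 10)]
invoiced = {"A": 25, "B": 3}
allocation = spread_invoice(po_rows, invoiced)
```

Here code A's 25 invoiced units fill the first two A rows completely and leave 5 for the third, while code B's single row gets only the 3 units that were invoiced.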
