Finding companies appearing with different IDs in MS Excel - excel

I have 2 columns in my data:
A - each company's unique ID.
B - the company name that corresponds to the respective ID.
This type of data extends to 13,000 rows. For instance:
Col A Col B
12 Google Inc
12 The Google
14 Google
18 Amazon
18 Amazon
21 Amazon INC
18 Amazon
...
As the example shows, the same company sometimes appears under more than one ID. In addition, although each group of rows refers to the same company, the names are worded differently, which makes an exact match hard.
My goal in this exercise is two-fold:
Find which companies have different IDs showing.
Identify the row at which this happens.
It would be cumbersome to go through all 13,000 rows. What Excel formulas would do the trick?

You could use pivot tables to count how many duplicates each name has.
I would also:
Order the list by column B.
Add a formula in column C that compares each row with the previous row.
For example, consider a formula in row 5:
=IF(B4=B5,"Identical","Different")
You could build in more intelligence, for example by checking whether the first word of the name in row 5 appears in the row 4 name, e.g.
=IF(ISERROR(FIND(LEFT(B5,FIND(" ",B5,1)-1),B4,1)),"","Similar")
You could combine the two formulas above into a single formula, or use both in separate columns (which is easier).
PART 2:
The data must be ordered by column B!
To compare the IDs using the logic above, add another column (column F) with this formula:
=FIND(LEFT(B5,FIND(" ",B5,1)-1),B4,1)
Then add another column (column G):
=IF(B4=B5,B5,IF(ISERROR(F5),"",LEFT(B5,FIND(" ",B5,1)-1)))
This puts a value in column G that is either the identical company name or the first word of a company name that also appears in the row above.
You can then add another column (column H) that compares the IDs of adjacent rows in the same company group:
=IF(G4=G5,IF(A4<>A5,"Different IDs","Ok IDs"),"First row in company group")
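For anyone who wants to sanity-check the idea outside Excel, here is a minimal Python sketch of the same first-word matching. It is not the formula approach itself: instead of comparing sorted neighbours it groups rows by the lowercased first word of the name (an assumption made for illustration) and reports groups that contain more than one ID, together with their row numbers. The function name find_id_conflicts is hypothetical. Like the formulas, it misses names such as "The Google" whose first word differs.

```python
from collections import defaultdict

def find_id_conflicts(rows):
    """Group rows by the first word of the name; keep groups with more than one ID.

    rows is a list of (company_id, name) tuples in sheet order.
    Returns {group_key: [(row_number, company_id), ...]} with 1-based row numbers.
    """
    groups = defaultdict(list)
    for row_number, (company_id, name) in enumerate(rows, start=1):
        words = name.split()
        # Assumption: the lowercased first word identifies the company.
        key = words[0].lower() if words else ""
        groups[key].append((row_number, company_id))
    # Keep only the groups whose rows carry at least two distinct IDs.
    return {
        key: members
        for key, members in groups.items()
        if len({company_id for _, company_id in members}) > 1
    }

# Sample rows taken from the question.
rows = [
    (12, "Google Inc"), (12, "The Google"), (14, "Google"),
    (18, "Amazon"), (18, "Amazon"), (21, "Amazon INC"), (18, "Amazon"),
]
conflicts = find_id_conflicts(rows)
for key, members in conflicts.items():
    print(key, members)
```

Each reported group lists every row in it, so the rows with the odd ID (row 3 for Google, row 6 for Amazon in the sample) can be picked out directly.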

Related

Excel: match in a column with another column based on a non blank value in another column

I have 2 arrays. The first array lists account numbers with the company name and a column that tracks whether each account was active last month (marked with an x). The second array has two columns: one with only company names and another that needs to be marked with an x if the company is active in the first array in January.
Objective: I want to track active companies in January. To do this, I want to mark the second column in array 2 for all companies that have an 'x' in 'active Jan' in array 1.
Array 1
Name Company[1]   account nr.   active Jan
A                 123           x
B                 321
B                 132           x
Array 2
Name Company[2]   active Jan
A
B
What I tried: formula in [array 2,column 2] that gives the value in [array1,column 3] based on a match between name company in array 1 and array 2:
=IFERROR(VLOOKUP([#NameCompany[2]],'array1'!A3:C5,3,FALSE),"")
Result: This returns the value (blank or x) from the first matching row for a company in array 1. I need an 'x' in [array2,column2] only if there is an 'x' in [array1,column3] for that company, rather than a copy of the first value the formula comes across in [array1,column3].
For example, the formula above would give a 0 value (or blank) for company B instead of an x because there are multiple accounts from company B. Ideally, the formula should search only companies with an 'x' in [array1,column3] and then put an 'x' in [array2,column2].
I know I use the wrong formula to reach my objective but I can't find the right one in google/stackoverflow. Please help.
You can use this formula:
=SUMPRODUCT(('array1'!$A$3:$A$5=[#[Name Company'[2']]])*('array1'!$C$3:$C$5="x")) > 0
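The SUMPRODUCT test asks whether any row has both a matching company name and an 'x'. A small Python sketch of that same check follows; the function name is_active and the step that turns the result into an 'x' are assumptions for illustration (the formula itself returns TRUE or FALSE).

```python
def is_active(company, accounts):
    """True if any account row for this company is marked with an 'x'."""
    return any(
        name == company and flag == "x"
        for name, _account_nr, flag in accounts
    )

# Array 1 from the question: (company name, account nr., active Jan).
accounts = [("A", 123, "x"), ("B", 321, ""), ("B", 132, "x")]

# Array 2: one mark per company, 'x' when at least one account is active.
marks = {
    company: "x" if is_active(company, accounts) else ""
    for company in ["A", "B"]
}
print(marks)
```

Company B gets an 'x' even though its first account row is blank, which is the case the VLOOKUP attempt gets wrong.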
I've checked this, and it returns what you want.
Please try my formula:
=IF(LEN(VLOOKUP(A11,$A$2:$C$5,3))=0," ",VLOOKUP(A11,$A$2:$C$5,3))

EXCEL - Find if value exists in column B in the range of a value in column A

I have a list of companies with certain products. Now I want to find out whether a company has a certain product or not. For example, I want to find which company has Product C and return a 1 on every row for that company:
Column 1
Company A
Company A
Company A
Company A
Company B
Company B
Column 2
Product A
Product B
Product C
Product A
Product B
Column 3 (Result):
1
1
1
0
0
This solution will require 2 additional columns. I'm assuming your first row is headers, and the range is from A1:B6. Data starts on Row 2. I'll give a few options on how to execute this though. Where I put "Product C" can also reference a cell. Whenever I'm using binary like this it's usually to filter datasets, so there might be a better alternative to what you want vs. what's below.
In Column C, =if(B2="Product C",1,0) or you can use =--(B2="Product C")
Sort by Column C in descending order, then use =VLOOKUP(A2,$A$2:$C$6,3,0) in the next column and copy and paste the results as values; if you keep the formula and re-sort, the results will break.
If Product C appears only once for any given company, you can use SUMIFS too: =SUMIFS($C$2:$C$6,$A$2:$A$6,A2)
If you have 365, you can also use =MAXIFS($C$2:$C$6,$A$2:$A$6,A2), which doesn't depend on how the dataset is sorted.
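The MAXIFS option amounts to "flag every row of a company if any row of that company has the product". Here is a rough Python sketch of that logic; the function name flag_companies_with_product and the five sample rows are hypothetical, chosen to reproduce the result column in the question.

```python
def flag_companies_with_product(rows, product):
    """Return 1 for every row whose company has the product on any row, else 0."""
    # First pass: collect the companies that have the product at all.
    companies_with_product = {company for company, item in rows if item == product}
    # Second pass: flag each row by company, independent of row order.
    return [1 if company in companies_with_product else 0 for company, _item in rows]

# Hypothetical sample data as (company, product) pairs.
rows = [
    ("Company A", "Product A"),
    ("Company A", "Product B"),
    ("Company A", "Product C"),
    ("Company B", "Product A"),
    ("Company B", "Product B"),
]
flags = flag_companies_with_product(rows, "Product C")
print(flags)
```

As with MAXIFS, the result does not depend on how the rows are sorted.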

Excel: Find duplicates in one column, then remove rows based on value in other column

I've been able to find a number of articles that seem to orbit my particular puzzle, but I'm having difficulty carving out the specific solution for it. Using the below image for reference:
ID Name Company Name
5 Dennis E Lantz Boggio Architects, Pc
6 Director Lantz Boggio Architects, Pc
7 Glenn D Lantz Boggio Architects, Pc
8 Director Ge Johnson Construction
9 Evan Da GH Phipps Construction Companies
10 Paul Fog GH Phipps Construction Companies
11 Todd W GH Phipps Construction Companies
I have a mailing list that is organized so each unique contact is placed on an individual row. The list contains columns for Name (column A in my sheet) and Company Name (column B).
If the Name cell was originally empty, a default 'generic' title is entered (e.g. 'Director', as per rows 6 and 8 in the image).
In some cases, there are multiple contacts at the same company (e.g. rows 5-7, 9-11). Occasionally, one of those contacts has a 'generic' name (e.g. row 6).
What I'd like to do:
Search for duplicates in Column B
Then delete the row based on the value in Column A (with me defining the specific values to look for)
So in the example image, only row 6 would be deleted, because Column B contains a duplicate company name and Column A contains the value 'Director'.
Thank you!
Maybe, in C5 and copied down to suit:
=AND(COUNTIF(B:B,B5)>1,A5=C$1)
with Director in C1.
Then filter column C to select TRUE and delete those rows.
COUNTIF(B:B,B5) searches column B (the B:B) for the content of B5 and returns the number of instances. B5 is itself within column B, so the function always finds at least 1; for duplicates it finds more than one, so >1 detects that the row in question (5, for example) is not the only instance.
However, similar entries will not be counted - for example those that end in a trailing space, when what is in B5 does not.
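The same rule (company appears more than once AND the name is a generic title) can be sketched in Python to check the expected outcome on the sample list. The function name drop_generic_duplicates is hypothetical, and like COUNTIF here it only treats exactly equal company strings as duplicates.

```python
from collections import Counter

def drop_generic_duplicates(rows, generic_names):
    """Drop rows with a generic name when the company appears on more than one row."""
    counts = Counter(company for _name, company in rows)
    return [
        (name, company)
        for name, company in rows
        if not (counts[company] > 1 and name in generic_names)
    ]

# Sample rows from the question as (name, company name) pairs.
rows = [
    ("Dennis E", "Lantz Boggio Architects, Pc"),
    ("Director", "Lantz Boggio Architects, Pc"),
    ("Glenn D", "Lantz Boggio Architects, Pc"),
    ("Director", "Ge Johnson Construction"),
    ("Evan Da", "GH Phipps Construction Companies"),
    ("Paul Fog", "GH Phipps Construction Companies"),
    ("Todd W", "GH Phipps Construction Companies"),
]
kept = drop_generic_duplicates(rows, {"Director"})
for row in kept:
    print(row)
```

Only the 'Director' row at Lantz Boggio is dropped; the 'Director' at Ge Johnson Construction stays because that company appears once.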

Using Correl Function in Excel for Varying Array Sizes

So the current setup of the problem at hand is that I have 4 columns, Employee ID, Category 1, Category 2, and Category 3. I need to find the correlation between Category 1 & Category 2, Category 1 & Category 3, and Category 2 & Category 3 for each Employee ID. The issue is that the array length for each Employee ID is different. Some employees will have 5 records, some employees will have 8 records to their ID.
This problem would be simple if the Subtotal button had the CORREL function built into it given its grouping by feature.
How would I go about calculating the 3 correlation coefficients for each unique Employee ID? Excel function or VBA works
You need to use an array formula. In a sample layout with Employee IDs in A2:A16, the two categories in B2:B16 and C2:C16, and the Employee ID to look up in F2, the formula for cell G2 is:
=CORREL(IF($A$2:$A$16=F2,$B$2:$B$16,""), IF($A$2:$A$16=F2,$C$2:$C$16,""))
This says: if the cell in column A matches your Employee ID, include the corresponding cell in the array (column B for the first IF and column C for the second IF). After entering the formula, press Ctrl+Shift+Enter to tell Excel it is an array formula, or Command+Enter on a Mac.
You obviously need to modify the formula to fit the size of your data, and you can copy it to other cells if you set up your $ references correctly.
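If a scripted cross-check is useful, the per-employee grouping can be sketched in Python. This is an illustration only, not the Excel or VBA solution: the function names pearson and correlations_by_employee and the sample records are hypothetical, and the Pearson coefficient is computed by hand so no extra libraries are needed.

```python
import math
from collections import defaultdict

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (fails on zero variance)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlations_by_employee(records):
    """records: (employee_id, cat1, cat2, cat3) tuples.

    Returns {employee_id: (r_cat1_cat2, r_cat1_cat3, r_cat2_cat3)}.
    """
    grouped = defaultdict(list)
    for employee_id, *values in records:
        grouped[employee_id].append(values)
    result = {}
    for employee_id, rows in grouped.items():
        cat1, cat2, cat3 = zip(*rows)  # one tuple per category column
        result[employee_id] = (
            pearson(cat1, cat2), pearson(cat1, cat3), pearson(cat2, cat3)
        )
    return result

# Hypothetical records; the two employees have different record counts.
records = [
    (1, 1, 2, 3), (1, 2, 4, 2), (1, 3, 6, 1),
    (2, 1, 4, 1), (2, 2, 3, 3), (2, 3, 2, 2), (2, 4, 1, 4),
]
result = correlations_by_employee(records)
for employee_id, coefficients in result.items():
    print(employee_id, coefficients)
```

Grouping first means the differing number of records per employee needs no special handling.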

Conditional Unique ID for each record excel

I have two values in different columns. Column A has the department name, i.e. HR, Admin and Ops, and column B has a date. I want a unique ID in column C based on the combination of columns A and B, with a number at the end.
Unique ID: HR-Aug-16-1
Admin-Aug-16-1
This number is repeated until the combination of columns A and B has occurred 50 times; after 50 times the last value increases by 1, i.e.
HR-Aug-16-2
Admin-Aug-16-2
Right now I am using formula,
=A1&"-"&TEXT(B1,"mmm-yy")&"-1"
In C1 as a standard formula,
=A1&"-"&TEXT(B1,"mmm-yy")&CHAR(45)&INT((SUMPRODUCT(--(A$1:A1&TEXT(B$1:B1, "mmm-yy")=A1&TEXT(B1, "mmm-yy")))-1)/5)+1
I've set this to repeat after 5 for an example. I'll leave it to you to change the modifier to 50. Fill down as necessary.
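The counting part of the formula is a running count of each department and month combination, turned into a block number with INT((count-1)/5)+1. A short Python sketch of the same arithmetic follows; the function name build_ids, the block_size parameter and the sample rows are hypothetical.

```python
from collections import defaultdict
from datetime import date

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def build_ids(rows, block_size=50):
    """Build IDs like 'HR-Aug-16-1'; the suffix rises by 1 every block_size repeats."""
    seen = defaultdict(int)  # running count per department-month label
    ids = []
    for department, day in rows:
        # Same shape as A1 & "-" & TEXT(B1,"mmm-yy") in the formula.
        label = f"{department}-{MONTHS[day.month - 1]}-{day.year % 100:02d}"
        seen[label] += 1
        # Same arithmetic as INT((count-1)/block_size)+1.
        suffix = (seen[label] - 1) // block_size + 1
        ids.append(f"{label}-{suffix}")
    return ids

# Hypothetical rows: six HR entries and one Admin entry in August 2016.
rows = [("HR", date(2016, 8, 1))] * 6 + [("Admin", date(2016, 8, 2))]
ids = build_ids(rows, block_size=5)
for value in ids:
    print(value)
```

With block_size=5, the sixth HR row moves on to suffix 2, mirroring the /5 in the formula; use 50 for the behaviour asked for in the question.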

Resources