Give warning when duplicate values are placed in different keywords - excel

Let's say I have a sheet with multiple columns, but to keep the example simple I only show 2.
What I want to achieve is to give the user a warning when the same person gets placed in different teams. (It's possible to have duplicate persons, but it's not allowed to have them in separate teams.)
First I thought I'd filter the sheet so I only see the duplicates and then check whether the same person gets placed in 2 different teams. But now I see it's not possible to filter the duplicates. Then I thought of using conditional formatting and checking whether the cell's interior color has changed, but I noticed it does not change! The solutions provided on Stack Overflow do not suffice for me. Neither do I want to use a pivot table or an extra column, since my sheet is already overcrowded.
Example:
Value A   Value B
Tom       Team 1
Ben       Team 1
Tom       Team 1   <- possible
Elle      Team 2
Tom       Team 2   <- not possible, give warning!
Rick      Team 2
And the list goes on.
Does someone know how I can give the user a warning when placing the same person in different teams?
Or how to see the duplicate values in the sheet, or get them into a range in VBA?
Thanks!

Okay then, since you can't sort the table, it requires a rather more brutal or complicated approach.
1) First a simple, quick approach
Even so, you could still add a temporary column, number the rows in it, sort the sheet on the Value A field, and then just iterate through the cells, looking for instances where
Cells(curRow, colNumOfValueA).Value = Cells(curRow - 1, colNumOfValueA).Value
When this happens, check that the value in column 'ValueB' is the same for both rows. If so, continue on to the next row. If not, mark the row in some way - setting the interior colour is often a good way to do it. You can then sort the table back on the added column, delete the column and then take action.
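The sort-first idea can be sketched outside Excel. The following is only an illustration in Python, not VBA; the function name and the in-memory list of (person, team) pairs are assumptions standing in for the sheet, and the stored index plays the role of the temporary numbered column.

```python
def flag_conflicts_sorted(rows):
    """rows: list of (person, team) tuples in sheet order.

    Returns the 0-based sheet positions of rows whose team differs from
    the previous row for the same person, after sorting on the person.
    """
    # Pair each row with its original position (the temporary numbered column).
    numbered = [(person, team, idx) for idx, (person, team) in enumerate(rows)]
    # Sort on the person only; Python's sort is stable, so sheet order
    # is preserved among rows for the same person.
    numbered.sort(key=lambda r: r[0])
    flagged = []
    # One pass over adjacent rows: roughly N checks for N rows.
    for prev, cur in zip(numbered, numbered[1:]):
        if cur[0] == prev[0] and cur[1] != prev[1]:
            flagged.append(cur[2])
    return sorted(flagged)


sheet = [("Tom", "Team 1"), ("Ben", "Team 1"), ("Tom", "Team 1"),
         ("Elle", "Team 2"), ("Tom", "Team 2"), ("Rick", "Team 2")]
print(flag_conflicts_sorted(sheet))  # [4] -> the "Tom, Team 2" row
```

Because the original positions are carried along, there is nothing to "sort back" in this sketch; in the sheet itself that is what the temporary column is for.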
2) Second, a brute-force approach
Starting at the first row, get the values in ValueA and ValueB; call them, for example, findMeA and findMeB.
a) Starting at the following row, check to see if ValueA matches findMeA. If it doesn't, move on to the next row. If it does match, ensure that ValueB matches findMeB too. If not, mark the row or add its index to an array for reporting when the sanity-check is done.
b) Move to the next row. If it's empty, return to (2) and repeat with the next starting row. If it's not, return to (a).
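As a rough illustration of the brute-force steps, here is a Python sketch (not VBA). The function name and the list of (person, team) pairs are assumptions; find_a and find_b mirror findMeA and findMeB, and the check counter is only there to show how the work grows.

```python
def flag_conflicts_brute(rows):
    """rows: list of (person, team) tuples in sheet order.

    Returns (flagged, checks): the 0-based positions of rows that put a
    person in a different team than an earlier row, and the number of
    row-to-row comparisons performed.
    """
    flagged = set()
    checks = 0
    for i, (find_a, find_b) in enumerate(rows):
        # Step (a)/(b): scan every row below the current starting row.
        for j in range(i + 1, len(rows)):
            checks += 1
            person, team = rows[j]
            if person == find_a and team != find_b:
                flagged.add(j)
    return sorted(flagged), checks


sheet = [("Tom", "Team 1"), ("Ben", "Team 1"), ("Tom", "Team 1"),
         ("Elle", "Team 2"), ("Tom", "Team 2"), ("Rick", "Team 2")]
print(flag_conflicts_brute(sheet))  # ([4], 15)
```

For the six example rows this makes 5 + 4 + 3 + 2 + 1 = 15 comparisons.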
As you can probably see, the brute-force approach gets exceedingly nasty CPU-wise as the number of rows goes up. With just 10 rows of data, it's 9 + 8 + 7 + 6 + 5 + 4 + 3 + 2 + 1 = 45 checks that need to be done, unless you make a mistake of logic and check every row against every row, in which case those 10 rows would take 10^2 = 100 checks.
If, on the other hand, you sort the table first, you can get away with about N checks for N rows. 10 rows takes 10 checks, 1000 rows takes 1000 checks. For the brute force, 1000 rows means 999 + 998 + ... + 1 checks, which is huge: pairing the terms up (999+1, 998+2, 997+3 ... 501+499, with 500 left over) gives 499*1000 + 500 = 499,500.
So, the brute force takes quadratically longer as the list grows (twice as big equals roughly four times as long), but the sort-first method takes linear time. Twice as big equals twice as long.
Sorted method: 10 items - 10 checks. 1000 items - 1000 checks.
Brute force: 10 items - 45 checks. 1000 items - 499,500 checks.
I.e. for 1000 items, brute force takes about 500 times as long as the sorted approach. If it was only a few items, I'd just suggest brute-forcing it. Nearly half a million checks with VBA doesn't seem so pleasant though, so I'd sort the sheet, or at the very least make a copy, sort that and operate on it, before reporting the problems with the actual sheet.

Related

How do I get a single number and separate it into multiple groups

say I have a number
250,000
And I have a dynamically changing group:
group 1 - 0 to 100,000
group 2 - 100,001 to 200,000
group 3 - 200,000 onwards
Is there a formula where I can split it up this way, so that the first 100,000 goes into the first group, then the next 150,000 goes into the second group, and the third group is empty?
I've been trying with IF statements, and the first group is easy. For the second one I try to subtract from the first, but only after I check that it's higher than the max of group 1, and then everything starts breaking.
Thanks so much.
This is the formula everybody has been talking about. The value is in A2.
=4-MATCH(A2,{100000000,200000,100000,0},-1)
Note that 100,000,000 is an arbitrary high number intended to be larger than any you will ever have to sort. The array must be sorted in descending order.
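To see what the formula computes, here is a small Python sketch of the same lookup (Python is used only for illustration; the helper names are assumptions). With match_type -1 and a descending array, MATCH returns the position of the smallest value that is greater than or equal to the lookup value, and the formula subtracts that position from 4.

```python
def match_descending(value, thresholds):
    """Mimic MATCH(value, thresholds, -1) for a descending list:
    1-based position of the smallest threshold >= value."""
    position = None
    for i, threshold in enumerate(thresholds, start=1):
        if threshold >= value:
            position = i
        else:
            break
    if position is None:
        raise ValueError("value is larger than every threshold")  # like #N/A
    return position


def group_of(value):
    # Same constants as the formula: {100000000, 200000, 100000, 0}
    return 4 - match_descending(value, [100_000_000, 200_000, 100_000, 0])


print(group_of(250_000))  # 3
print(group_of(150_000))  # 2
print(group_of(100_000))  # 1
```

Note that this returns the number of the group the value falls into; it does not by itself split the value across the groups.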
To mirror BigBen's comment, you need to do a =MATCH(). Instead of the match_type being 0 (exact), you can change it to 1 or -1 depending on your needs.

If match found yes/ no, two different tables, two different values

I did not see something like this in other questions/ forums so hopefully it can be done.
Need to know if values from one table are in another, checking to see if there is a match.
Table 1 in Sheet 1 is used to record "Incoming" data of Part Number and Lot Number.
Table 2 in Sheet 2 is used to record when record is "Outgoing".
Column A is Part Number and B is Lot Number in both sheets. Part Number can repeat, but Lot # will not. I'm trying to find a way to return Yes/No or 1/0 in Column C of Sheet1, depending on whether the Part Number and Lot Number in Sheet 1 exist in Sheet 2. I have attached a snippet example of what I am trying to do. This will help me generate info on whether an Incoming record has been completed and has left (Outgoing). I do not believe VLOOKUP will work, and I have tried some different permutations of MATCH. Open to other options. Thanks!!
Edit: Lot # does have the ability to repeat (not often), but with a different corresponding Part Number. I need to know if there is a match on both Lot # and Part Number, as in the Incoming record.
Use:
=--(COUNTIFS(Sheet2!A:A,A2,Sheet2!B:B,B2)>0)
If there are matches it will return 1; if not, 0.
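The logic behind the COUNTIFS test is a membership check on the (Part Number, Lot Number) pair. A minimal Python sketch of that idea follows; the function name and the sample part and lot values are hypothetical and only stand in for the two sheets.

```python
def outgoing_flags(incoming, outgoing):
    """For each (part, lot) pair in incoming, return 1 if the same pair
    also appears in outgoing, else 0 (like COUNTIFS(...) > 0)."""
    shipped = set(outgoing)  # both columns must match, so compare pairs
    return [1 if pair in shipped else 0 for pair in incoming]


# Hypothetical sample data: (Part Number, Lot Number)
sheet1 = [("P-100", "L1"), ("P-100", "L2"), ("P-200", "L1")]
sheet2 = [("P-100", "L1"), ("P-300", "L2")]
print(outgoing_flags(sheet1, sheet2))  # [1, 0, 0]
```

Matching on the pair rather than on the lot alone is what handles the case from the edit, where a Lot # repeats under a different Part Number.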

Rank the top 5 entries in different criteria

I have a table that I want to find the top X people in each of the different groups.
Unique Names   Number   Group
a              30       1
b              4        2
c              19       3
d              40       2
e              1        1
f              9        2
g              15       3
I've ranked the top 5 people by number by using =index($A$2:$A$8,match(large($B$2:$B$8,1),$B$2:$B$8,0)). The 1 in the LARGE function I linked to a ranked range so that when I dragged down it changed up the number.
What I would like to do next is rank the top x number of people in each group. So top 3 in group 1.
I tried =index($A$2:$A$8,match("1"&large($B$2:$B$8,1),$C$2:$C$8&$B$2:$B$8,0)) but it didn't seem to work.
Thanks
EDIT: After looking at the answers below I have realised why they are not working for me. The actual data that I want to use the formula with has the same number appearing in more than one row. I have adjusted the example data to show this. The problem I have is that if there are duplicate numbers, the formula can return a name that is not in the group.
Unique Names   Number   Group
a              30       1
b              30       2
c              19       3
d              40       2
e              1        1
f              30       2
g              15       3
Proof of Concept
Use the following formula in the example above in cell F2 and copy down and to the right as needed.
=IFERROR(INDEX($A$2:$A$8,MATCH(AGGREGATE(14,6,($C$2:$C$8=F$1)*($B$2:$B$8),ROW($A2)-1),$B$2:$B$8,0)),"")
In the header row, provide the group numbers, or come up with a formula to increment and reset the group number as you copy down, based on the X number in your question.
Explanation:
The AGGREGATE function, unlike the LARGE function, is an array function without the need to use CSE. As such, we can add criteria to what we want to use. In this case only one criterion was used, and that was the group number. In the formula it was the following part:
($C$2:$C$8=F$1)
If there were multiple criteria, we would use a + operator as an OR or a * operator as an AND.
The 6 option in the AGGREGATE function allows us to ignore errors. This is useful when trying to get the nth smallest value. It is also useful for dealing with other information that may cause errors that do not need to be worried about.
As this is technically an array operation, avoid using full column/row references, as they can bog down your system.
The basics of what the overall formula is doing is building a list of the numbers that match the group number you are interested in. After filtering your numbers, it then determines which is the largest, second largest, etc. by what row you have copied down to. It then determines what row the nth largest number occurs in through the MATCH function, and finally it returns the name corresponding to that row with the INDEX function.
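The steps of that formula can be traced in a short Python sketch (for illustration only; the function name is an assumption, and the lists stand in for columns A, B and C). It also shows why duplicate numbers cause trouble: the final lookup searches the whole number column and takes the first match.

```python
def nth_in_group(names, numbers, groups, group, n):
    """Name holding the n-th largest number within a group, following
    the same steps as the INDEX/MATCH/AGGREGATE formula."""
    # (group match) * number: rows outside the group become 0.
    masked = [num if g == group else 0 for num, g in zip(numbers, groups)]
    try:
        nth_largest = sorted(masked, reverse=True)[n - 1]
        # MATCH(..., 0): first row in the unfiltered column with that number.
        return names[numbers.index(nth_largest)]
    except (IndexError, ValueError):
        return ""  # plays the role of IFERROR(..., "")


names = ["a", "b", "c", "d", "e", "f", "g"]
groups = [1, 2, 3, 2, 1, 2, 3]

original = [30, 4, 19, 40, 1, 9, 15]
print([nth_in_group(names, original, groups, 2, n) for n in (1, 2, 3)])
# ['d', 'f', 'b']

with_duplicates = [30, 30, 19, 40, 1, 30, 15]
print(nth_in_group(names, with_duplicates, groups, 2, 2))
# 'a' -> wrong group, because the first 30 in the column belongs to group 1
```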
Building on all the other great answers.
Because you have the possibilities of duplicate values in each group we need to do this with two formulas.
First we need to get the numbers in order. I used the Aggregate, but this could be done with the array LARGE(IF()) also:
=IFERROR(AGGREGATE(14,6,$B$2:$B$8/($C$2:$C$8=E$1),ROW(1:1)),"")
Then, using that ordered list of numbers as a reference, we can use a modified version of #ForwardEd's formula, using COUNTIF() to ensure we get the correct name in return.
=IFERROR(INDEX($A$2:$A$8,AGGREGATE(15,6,(ROW($B$2:$B$8)-ROW($B$2)+1)/(($C$2:$C$8=F$1)*($B$2:$B$8=E3)),COUNTIF(E$2:E2,E3)+1)),"")
This will count the number in the results returned and then bring in the correct name.
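The effect of the two formulas together can be sketched in Python as follows (illustration only; the function name is an assumption). Rows are first restricted to the group, then ordered by number, and ties are resolved by sheet order, which corresponds to the COUNTIF() occurrence count picking the first, second, ... matching row.

```python
def top_n_in_group(names, numbers, groups, group, n):
    """Names of the n largest numbers within a group; rows with equal
    numbers are returned in sheet order."""
    # Keep only the row positions that belong to the requested group.
    rows = [i for i, g in enumerate(groups) if g == group]
    # Python's sort is stable, so equal numbers keep their sheet order.
    rows.sort(key=lambda i: -numbers[i])
    return [names[i] for i in rows[:n]]


names = ["a", "b", "c", "d", "e", "f", "g"]
numbers = [30, 30, 19, 40, 1, 30, 15]
groups = [1, 2, 3, 2, 1, 2, 3]
print(top_n_in_group(names, numbers, groups, 2, 3))  # ['d', 'b', 'f']
```

Filtering on the group before looking up the name is what keeps "a" (30, group 1) out of the group 2 results.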
You could also solve this with array formulas - to filter a group whose name is stored in E1, your code
=INDEX($A$2:$A$8,MATCH(LARGE($B$2:$B$8,1),$B$2:$B$8,0))
would then be adapted to
=INDEX($A$2:$A$8,MATCH(LARGE(IF($C$2:$C$8<>E1,-1,$B$2:$B$8),1),$B$2:$B$8,0))
Note: After entering an array formula, you have to press CTRL+SHIFT+ENTER.
Thank you to everyone who offered help but for some reason none of your methods worked for me, which I am sure was to do with the quality of my data. I used an alternate method in the end which is slightly convoluted but seemed to work.
=IF($C2="1",RANK($B2,$B$2:$B$8,1)+ROW()/10000,-1)
Essentially using the rank function and adding a fraction to separate out duplicate values.

Multiple Date Comparison Queries

Is there an efficient way of identifying which date is the maximum date across 12 columns of data which all have different dates? Naturally the easiest approach is MAX(RANGE A: RANGE L) and then pull that value down to get the rest of the rows. However, this isn't what I want.
What I want to do is to create a function where I can compare the dates across rows and if it is the max - highlight that value. This is because each column is responsible for a specific part of a process and I want to identify where is the largest delay.
My initial thoughts were defining 2 variables and having them each hold one value temporarily and performing a check to see if var 1 > var 2. If it is, then move to the next one (FOR EACH) loop - otherwise max value is reached. Highlight that value.
Would anyone be able to assist me?

Excel - find the biggest gap between numbers in rows

I have an Excel file with >12500 rows in one column.
Each row contains a string of 20 numbers, like these:
2,3,4,6,7,8,12,13,14,24,30,42,45,46,48,50,56,58,**59**,61
1,2,6,8,11,12,13,16,17,21,24,27,28,33,34,42,44,48,58,61
3,7,10,13,14,15,18,21,23,24,25,29,30,34,37,48,51,56,57,60
8,11,13,16,17,19,21,27,29,35,36,39,42,44,46,50,53,54,57,60
2,4,7,9,21,26,28,30,32,34,35,37,38,39,43,44,50,60,61,62
10,13,15,18,21,22,23,24,25,26,40,42,48,49,51,52,56,**59**,61,62
1,2,4,7,14,15,18,20,24,29,30,32,35,41,42,50,52,55,58,62
1,4,8,9,10,12,17,24,25,33,37,41,43,44,46,49,52,**59**,61,62
1,2,4,6,9,12,15,17,21,24,30,31,32,36,41,44,47,48,51,58
2,7,10,12,15,16,20,24,25,27,30,33,39,44,45,52,54,55,58,60
5,7,10,11,20,22,24,31,32,33,36,38,39,41,43,47,50,52,56,58
3,6,8,9,14,15,19,21,25,28,34,37,39,45,47,54,55,56,57,**59**
1,2,3,4,5,8,14,15,18,20,23,31,33,37,42,45,46,51,52,55
I need to know what the biggest gap is between rows in which a number repeats. For example, I search for any number (e.g. 59) and I need to know the largest gap between two rows containing 59, i.e. the longest run of rows where 59 does not appear.
In this example it's a 4-row gap between 59s.
I hope I've made myself clear.
Seems like a fun problem which admits a simple but not quite obvious answer. First, make sure that the data is in 20 columns (use the Text to Columns feature under the Data tab). Using your example, I set up a spreadsheet with the data in columns A to T.
V1 holds the target number. The formulas are in column U.
In U1 I entered:
=IF(ISNA(MATCH($V$1,A1:T1,0)),1,0)
This formula uses MATCH to test if the value in V1 lies in the range to the left of it. If it doesn't, the MATCH function returns #N/A. The function ISNA checks for this error value. If it is present, the overall formula returns 1 (since there is now 1 consecutive row without the target number); otherwise it returns 0.
The formula in U2 is similar with a little twist:
=IF(ISNA(MATCH($V$1,A2:T2,0)),1+U1,0)
The same basic logic -- but rather than returning 1 if the target number isn't present it adds 1 to the number above. The formula is then copied down the rest of the range. It has the effect of keeping a running total of consecutive rows without the target value. This running total is reset to 0 whenever a row with the target value is encountered.
The final ingredient requires no comment. In U14 I just have
=MAX(U1:U13)
which is the number you are looking for, assuming that the maximum number of consecutive rows without the target number is what you want, even if this run occurs at the top or bottom of the data. If you want the largest gap that is literally between two rows where the number occurs, the logic would need to be made more complex.
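The running-total logic of column U can be mirrored in a few lines of Python (a sketch for illustration; the function name is an assumption, and each row is a list of the 20 numbers).

```python
def longest_absence(rows, target):
    """Longest run of consecutive rows that do not contain target.

    Mirrors column U: the counter grows by 1 on every row without the
    target and resets to 0 on every row that contains it.
    """
    longest = 0
    running = 0
    for row in rows:
        running = 0 if target in row else running + 1
        longest = max(longest, running)  # the final MAX over column U
    return longest


# In the question's 13 example rows, 59 appears in rows 1, 6, 8 and 12.
rows = [[59] if r in (1, 6, 8, 12) else [0] for r in range(1, 14)]
print(longest_absence(rows, 59))  # 4
```

Like the spreadsheet version, this also counts runs at the very top or bottom of the data.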
