How to find duplicate adjacent cells in Excel

I have 2 columns in Excel (like below) and I would like to identify (conditionally format) any rows that are exactly the same.
As you can see 326.001 1,000 HOUR are identical for the first 3 rows, I would like to highlight or mark these rows so I can see that they are not unique.
+---------+------------+
| ID | INTERVAL |
+---------+------------+
| 326.001 | 1,000 HOUR |
| 326.001 | 1,000 HOUR |
| 326.001 | 1,000 HOUR |
| 326.001 | 3,000 HOUR |
| 326.002 | 1 MONTH |
| 326.002 | 1 YEAR |
| 326.002 | 5 YEAR |
| 326.002 | 500 HOUR |
| 326.002 | 500 HOUR |
| 326.002 | 500 HOUR |
| 326.002 | 1,000 HOUR |
| 326.002 | 1,000 HOUR |
| 326.002 | 1,000 HOUR |
| 326.002 | 3,000 HOUR |
| 326.009 | 3 MONTH |
| 326.009 | 1 YEAR |
| 326.01 | 3 MONTH |
+---------+------------+

I would add a third column, EqualityTest, with a formula such as:
=AND(A5=A4,B5=B4)
This assumes the data is sorted.
The above formula is for row 5, with ID and Interval in columns A and B; it compares the row with the one directly above it. Copy and paste the formula down, then apply conditional formatting: TRUE marks a row that repeats the previous row, and FALSE marks the first row of each new combination.
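One way to read the adjacent-row test is to compare each row with the row directly above it. Here is a minimal Python sketch of that logic, handy for checking the expected flags outside Excel. The sample rows come from the question; the function name is made up for illustration.

```python
# Sample rows from the question as (ID, INTERVAL) pairs, already sorted.
rows = [
    ("326.001", "1,000 HOUR"),
    ("326.001", "1,000 HOUR"),
    ("326.001", "1,000 HOUR"),
    ("326.001", "3,000 HOUR"),
    ("326.002", "1 MONTH"),
]

def same_as_previous(rows):
    """Return one flag per row: True if the row equals the row directly above it."""
    if not rows:
        return []
    flags = [False]  # the first row has nothing above it
    for prev, cur in zip(rows, rows[1:]):
        flags.append(prev == cur)
    return flags

print(same_as_previous(rows))
```

Note that the first row of a repeated group is flagged False, which is why this approach only works on sorted data and does not mark every member of a duplicate set.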

qroberts,
I'm not clear on your question.
If you're actually attempting to remove duplicate rows, Excel offers that directly: the "Remove Duplicates" button (!) in the "Data Tools" section of the "Data" ribbon.
If, instead, you are looking for a highlighter which will identify duplicates, things are a bit more complicated, as you need a macro which will find duplicates and then turn on some form of formatting/highlighting (I suggest formatting a background color).
For a macro which will highlight duplicates, we need to hear a bit more to understand your needs. If you have two different sets of duplicates, do you want them highlighted in different colors? If the number of duplicate sets gets large, this could be a problem.
As another poster has noted, it also matters whether your candidate set is sorted. A full range search for duplicates would be an interesting bit of coding.

So use COUNTIFS():
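=COUNTIFS($A$2:$A$10000,$A2,$B$2:$B$10000,$B2)>1
It returns TRUE for any row whose ID and INTERVAL appear together in at least one other row. The count includes the row itself, so the test must be >1 rather than >0. This formula does not care whether the data is sorted.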
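To cross-check the counting logic outside Excel, here is a minimal Python sketch. It assumes a row counts as a duplicate when its (ID, INTERVAL) pair occurs more than once anywhere in the list. The unsorted sample and the function name are illustrative only.

```python
from collections import Counter

# Deliberately unsorted sample of (ID, INTERVAL) pairs.
rows = [
    ("326.001", "1,000 HOUR"),
    ("326.002", "1 MONTH"),
    ("326.001", "1,000 HOUR"),
    ("326.001", "3,000 HOUR"),
]

def duplicate_flags(rows):
    """Flag every row whose pair appears more than once, in any position."""
    counts = Counter(rows)  # each row counts itself, as COUNTIFS does
    return [counts[row] > 1 for row in rows]

print(duplicate_flags(rows))
```

Because every row is counted along with its matches, the threshold is "more than one", and every member of a duplicate set is flagged regardless of order.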

Related

MS Excel: How to list all column if the rows contain a given date?

My data looks like below. I have groups with which I share topics each day. We do this randomly, based on need.
| | Topic 1 | Topic 2 | Topic 3 | Topic 4 | Topic 5 | Topic 6 | Topic 7 | Topic 8 | Topic 9 |
|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
| Group 1 | | 19-apr | 30-apr | | | | | | |
| Group 2 | 18-apr | 25-apr | | | | | | | |
| Group 3 | | | | | 19-apr | 30-apr | | | |
| Group 4 | 18-apr | 25-apr | | | | | | | |
| Group 5 | | | | | | | 19-apr | 30-apr | |
| Group 6 | | | 25-apr | | | | | | |
| Group 7 | 18-apr | 25-apr | | | | | | | |
For our metrics and analysis, we need a list of groups per date on a different sheet. We would like to know which groups were engaged on a given day.
Can somebody please help me get this done using only formulas, without macros?
I believe this can somehow be handled with INDEX/MATCH or lookups.
You could definitely do this with macros. You can do something similar without macros; it may not be precisely what you were looking for because it will leave blank space where groups were not addressed.
Method 1
Here is the formula I used and a picture of the sheet it is in:
=IF(IFERROR(MATCH(L$4,$B6:$H6,0),FALSE),INDEX($B$5:$B$13,MATCH($K5,$B$5:$B$13,0),1),"")
The idea is that if you have absolute references alongside your list of groups per date, then you can use index and match to fill in that group's name, but only if Match finds that precise date code in that group's row from the previous table. If you place an equivalent formula in the first cell, you can drag it out to the rest of the array.
The formula I used is not the only way to do this, but if you know Index and Match, then it should make sense to you.
Method 2
A more convoluted method would be to use image references. With these, it is possible to make the report precisely what you asked for on a separate sheet.
Suppose you took Method 1 and separated each column out into a different table. Use nearly the same formula in the cells below the date heading, except that you enclose the heading reference in INT() as shown below. Create one table for each of N dates, where N is the number of days you want to monitor at once. Then, when you want the summary to show different dates, go to each table, change the heading, and filter out blanks.
formula:
=IF(IFERROR(MATCH(INT($L$2),$B4:$H4,1),FALSE),INDEX($B$4:$B$11,MATCH($K3,$B$4:$B$11,1),0),"")
The below image shows what I mean by one table for each date:
Then you insert an image. It doesn't matter what image; it could be a screenshot of anything. Click on that image, then click into the formula bar. Then highlight the table column you want it to represent. Below is a screenshot of how to do that:
Now place that picture on its own sheet in the workbook. Place each date table on its own sheet in the workbook. The reason you do this is: if you filter a table, everything else overlapping the filtered rows outside the table will also be hidden. You move tables to separate sheets to prevent them from hiding each other.
Finally, arrange your pictures into the order you like, filter the blanks out of the tables, and your images will be exactly what you were looking for:
Again, this is a little convoluted because if you want the report to show new date summaries, you would have to change the headings on every table. Then you would have to go to each table and refresh its filter. This is where macros usually come in.
Assume range A1:J8 housed your Source table, and L1:P8 housed the Date/Group Output
1] In L2, copied across :
=IFERROR(1/(1/AGGREGATE(15,6,$B$2:$J$8/($B$2:$J$8>K$2),1)),"")
2] In L3, copied across to P3 and all copied down :
=IF(L$2="","",IFERROR(INDEX($A:$A,AGGREGATE(15,6,ROW($A$2:$A$8)/($B$2:$J$8=L$2),ROW(A1))),""))
You can use the following formula to get a list of dates from a table:
=IFERROR(AGGREGATE(15,6,($B$2:$J$8/($B$2:$J$8*(COUNTIF($A$15:A15,$B$2:$J$8)=0)))*$B$2:$J$8,1),"")
To get a list of groups by date, use the following:
=IFERROR(INDEX($A$1:$A$8,AGGREGATE(15,6,(1/(B$15=$B$1:$J$8))*ROW($B$1:$J$8),ROW(A1))),"")
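The AGGREGATE formulas above effectively invert the table: for each date, they collect the groups whose row contains that date. Here is a minimal Python sketch of that inversion, with the sample schedule transcribed from the question. The dictionary layout and function name are assumptions made for illustration.

```python
from collections import defaultdict

# Dates per group, read off the question's table (blank cells omitted).
schedule = {
    "Group 1": ["19-apr", "30-apr"],
    "Group 2": ["18-apr", "25-apr"],
    "Group 3": ["19-apr", "30-apr"],
    "Group 4": ["18-apr", "25-apr"],
    "Group 5": ["19-apr", "30-apr"],
    "Group 6": ["25-apr"],
    "Group 7": ["18-apr", "25-apr"],
}

def groups_by_date(schedule):
    """Map each date to the groups engaged on it, keeping table order."""
    result = defaultdict(list)
    for group, dates in schedule.items():
        for date in dates:
            result[date].append(group)
    return dict(result)

by_date = groups_by_date(schedule)
for date, groups in by_date.items():
    print(date, groups)
```

The result can be used to check what the formula-based output sheet should contain for each date column.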

EXCEL: SUMIFS criterion applied to a INDEX MATCH search equals a value

I've spent pretty much all day trying to figure this out. I've read so many threads on here and on various other sites. This is what I'm trying to do:
I've got the total sales output. It's large and the number of items on it varies depending on the time frame it's looked at. There is a major lack in the system where I cannot get the figures by region. That information is not stored in the system. The records only store the customer's name, the product information, number of units, price, and purchase date. I want to get the total number of each item sold by region so that I can compare item popularity across regions.
There are only about 50 customers, so it is feasible for me to create a separate sheet assigning a region to the customers.
So, I have three sheets:
Sheet 1: Sales
+-----------------------------------------------------+
|Customer Name | Product | Amount | Price | Date |
-------------------------------------------------------
| Joe's Fish | RT-01 | 7 | 5.45 | 2020/5/20 |
-------------------------------------------------------
| Joe's Fish | CB-23 | 17 | 0.55 | 2020/5/20 |
-------------------------------------------------------
| Mack's Bugs | RT-01 | 4 | 4.45 | 2020/4/20 |
-------------------------------------------------------
| Joe's Fish | VX-28 | 1 | 1.20 | 2020/5/13 |
-------------------------------------------------------
| Karen's \/ | RT-01 | 9 | 3.45 | 2020/3/20 |
+-----------------------------------------------------+
Sheet 2: Regions
+----------------------+
| Customer | Region |
------------------------
| Joe's Fish | NA |
------------------------
| Mack's Bugs | NA |
------------------------
| Karen's \/ | EU |
+----------------------+
And my results are going in Sheet 3:
+----------------------+
| | NA | EU |
------------------------
| RT-01 | 11 | 9 |
+----------------------+
So, looking at the data I made up for this question, I want to compare the number of RT-01s sold in North America to those sold in Europe. I can do it if I add an INDEX MATCH column to the end of the sales sheet, but I would have to do that every time I update the sales information.
Is there some way to do a SUMIFS like:
SUMIFS(Sheet1!$D:$D,Sheet1!$A:$A,INDEX(Sheet2!$B:$B,MATCH(Sheet1!#Current A#,Sheet2!$A:$A))=Sheet3!$B2,Sheet1!$B:$B,Sheet3!$A3)
?
I think it's difficult to do it with a SUMIFS because the columns you're matching have to be ranges, but you can certainly do it with a SUMPRODUCT and COUNTIFS:
=SUMPRODUCT(Sheet1!$C$2:$C$10*(Sheet1!$B$2:$B$10=$A2)*COUNTIFS(Sheet2!$A$2:$A$5,Sheet1!$A$2:$A$10,Sheet2!$B$2:$B$5,B$1))
I don't recommend using full-column references because it could be slow.
BTW I was assuming that there were no duplicates in Sheet2 for a particular combination of customer and region - if there were, you could use
=SUMPRODUCT(Sheet1!$C$2:$C$10*(Sheet1!$B$2:$B$10=$A2)*
(COUNTIFS(Sheet2!$A$2:$A$5,Sheet1!$A$2:$A$10,Sheet2!$B$2:$B$5,B$1)>0))
EDIT
It may be worth using a dynamic version of the formula, though it is not elegant (in Excel versions without dynamic arrays it has to be entered as an array formula with Ctrl+Shift+Enter):
=SUM(Sheet1!$C$2:INDEX(Sheet1!$C:$C,MATCH(2,1/(Sheet1!$C:$C<>"")))*(Sheet1!$B$2:INDEX(Sheet1!$B:$B,MATCH(2,1/(Sheet1!$B:$B<>"")))=$A2)*
(COUNTIFS(Sheet2!$A$2:INDEX(Sheet2!$A:$A,MATCH(2,1/(Sheet2!$A:$A<>""))),Sheet1!$A$2:INDEX(Sheet1!$A:$A,MATCH(2,1/(Sheet1!$A:$A<>""))),Sheet2!$B$2:INDEX(Sheet2!$B:$B,MATCH(2,1/(Sheet2!$B:$B<>""))),B$1)>0))
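The SUMPRODUCT/COUNTIFS approach boils down to this: for each sales row, look up the customer's region, and add the amount when both the product and the region match. Here is a minimal Python sketch of that logic using the question's sample data. The variable and function names are illustrative, not part of any workbook.

```python
# Sheet 1: (customer, product, amount); price and date are not needed here.
sales = [
    ("Joe's Fish", "RT-01", 7),
    ("Joe's Fish", "CB-23", 17),
    ("Mack's Bugs", "RT-01", 4),
    ("Joe's Fish", "VX-28", 1),
    (r"Karen's \/", "RT-01", 9),
]

# Sheet 2: customer -> region.
regions = {
    "Joe's Fish": "NA",
    "Mack's Bugs": "NA",
    r"Karen's \/": "EU",
}

def units_sold(sales, regions, product, region):
    """Total amount of `product` sold to customers assigned to `region`."""
    return sum(
        amount
        for customer, prod, amount in sales
        if prod == product and regions.get(customer) == region
    )

print(units_sold(sales, regions, "RT-01", "NA"))
print(units_sold(sales, regions, "RT-01", "EU"))
```

For the sample data this reproduces the Sheet 3 figures from the question (11 for NA and 9 for EU), which is a quick way to sanity-check the formula's output.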
As you would need to make the match in memory, I don't think it's feasible in Excel; you'll have to use a VBA dictionary.
On the other hand, if the number of columns in your sales sheet is fixed, you can just format it as a table and add your INDEX/MATCH in column F.
When updating the sales data, delete all rows from row 3 down and paste in the updated values. Excel will automatically apply the INDEX/MATCH to all rows.

How to compare multiple HLOOKUP cases

In my sheet I have a formula using HLOOKUP to calculate a number based on the content of a cell. The content can be chosen from "OK", "NOK" and "-".
Deliverable | CASE |Description | Value
Deliverable1 | OK | ******* | 3
Deliverable1 | NOK | ####### | 6
Deliverable1 | - | &&&&&&& | 10
Deliverable2 | OK | ******* | 4
Deliverable2 | NOK | ####### | 7
Deliverable2 | - | &&&&&&& | 9
I then want to calculate, for a given deliverable, the difference between the applied case and the case when the selection is set to "OK". For example, if Deliverable1 is set to "-", I want to get 7 (10 minus 3).
To achieve this I have created a duplicate sheet where I force the content of the cell to be OK, and then I calculate the difference. The problem with this approach is that when using filtering or sorting the calculation is messed up.
Is there a way to avoid using the duplicate sheet?
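The calculation being asked for amounts to two lookups against the same table: the value for the selected case minus the value for "OK" on the same deliverable. Here is a minimal Python sketch of that idea, assuming the table is keyed by deliverable and case. The dictionary and function name are hypothetical, not taken from the original sheet.

```python
# Value per (deliverable, case), transcribed from the question's table.
values = {
    ("Deliverable1", "OK"): 3,
    ("Deliverable1", "NOK"): 6,
    ("Deliverable1", "-"): 10,
    ("Deliverable2", "OK"): 4,
    ("Deliverable2", "NOK"): 7,
    ("Deliverable2", "-"): 9,
}

def difference_from_ok(deliverable, case):
    """Value of the selected case minus the value of the OK case."""
    return values[(deliverable, case)] - values[(deliverable, "OK")]

print(difference_from_ok("Deliverable1", "-"))
```

Since both lookups use the same row key, nothing depends on a second, parallel sheet staying in the same sort or filter order.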

How to return the header by value in a PivotTable

I have a PivotTable like this:
Sum of Gf_Amount | Column Labels
| 2015 | | | | Grand Total
Row Labels | 17-Mar | 18-Mar | 19-Mar | 20-Mar |
3601 | 20 | 20 | | | 40
10386 | 35 | | | | 35
76301 | 5 | | | | 5
80941 | | | | 10 | 10
205738 | | | 5 | | 5
219576 | | 15 | | | 15
Grand Total | 60 | 35 | 5 | 10 | 110
What I want to do is find the last non-empty column and return the date that goes with that value. For example, for ID 3601 the result should be 2015 18-Mar.
Currently I know how to find the last non-empty column by using =LOOKUP(9.99E+307,B6:E6). For ID 3601 it gives me 20 which is correct. However when I use:
=INDEX($B$5:$E$5,MATCH(LOOKUP(9.99E+307,B6:E6),B6:E6,0))
to find the header, it gives me 17-Mar which is the corresponding header for the first 20. Besides, the formula I wrote can't even give me the year.
Can anyone help me out so I can find the date and year? (It doesn't have to be in PivotTable. You can copy and paste it in a normal table.)
I'm guessing that your column labels are date indices formatted as dd-mmm so there is no need to find the 2015 that is displayed hence:
=INDEX($5:$5,MATCH(1E+100,A6:E6))
formatted as say dd-mmm-yyyy and presumably copied down may suit.
It is a peculiarity (perhaps never really intended) of the MATCH function that, when the optional match-type argument is omitted (so it defaults to 1) and the lookup value is larger than every number in the list, it returns the index of the last numeric entry in the list – very useful, as here, at times! So all the “big number” (there are lots of versions of it – for example the one you used, 9.99E+307) does is feed MATCH a number so large it is never likely to find it (to force selection of the last entry).
I like 1E+100, a googol, as short and easy to remember, and for its ‘derivation’. 9.99E+307 is theoretically better as closer to the largest number Excel can handle:
9.99999999999999E+307
but
10,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000
for me is big enough – I don’t expect ever to want to work with a number bigger than that and smaller than or equal to 9.99E+307.
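To see why an exact match on the value picks the wrong header while the "last entry" trick picks the right one, here is a minimal Python sketch. It assumes empty pivot cells are represented as None; the function names are made up for illustration.

```python
headers = ["17-Mar", "18-Mar", "19-Mar", "20-Mar"]
row_3601 = [20, 20, None, None]  # ID 3601 from the question's pivot table

def header_of_first_match(headers, values, target):
    """Like INDEX/MATCH with exact match: finds the FIRST cell equal to target."""
    return headers[values.index(target)]

def header_of_last_filled(headers, values):
    """Walk from the right and return the header of the last non-empty cell."""
    for header, value in zip(reversed(headers), reversed(values)):
        if value is not None:
            return header
    return None  # the row is entirely empty

print(header_of_first_match(headers, row_3601, 20))
print(header_of_last_filled(headers, row_3601))
```

The first function lands on the earlier of the two 20s, which is the behaviour described in the question; the second depends only on position, not on the value.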

Counting the frequency of combinations of numbers (in excel using VBA)

I want excel to count the FREQUENCY that certain number-letter combinations appear down a column in excel (using vba). All my data goes down one column like this:
Column A (only 1,2,3,4,5,s,f appear)
1
2
s
4
3
s
4
2
f
2
s
2
s
I want to count the number of occasions combinations of (1-s, 2-s, 3-s, 4-s, 5-s) occur, strictly when the number occurs first (is in the higher row). I do not want to count occasions when the s comes before the number (e.g. s-2). I know how to count the number of individual letters/numbers using the countIf function.
I might later want to expand my analysis to look at the occasions on which three letter-number combinations (e.g. 2-s-3, 2-s-5) occur.
I am very much a VBA noob.
Try inserting a new column to the right of Column A. Use this formula =A1&A2 and fill it down the column. The values will look like this:
+----------+----------+
| Column A | Column B |
+----------+----------+
| 1 | 12 |
| 2 | 2s |
| s | s4 |
| 4 | 43 |
| 3 | 3s |
| s | s4 |
| 4 | 42 |
| 2 | 2f |
| f | f2 |
| 2 | 2s |
| s | s2 |
| 2 | 2s |
| s | s |
+----------+----------+
Now you can count occurrences like you were doing before! :D
Of course, you can expand to three character frequency analysis by making the formula =A1&A2&A3.
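The helper-column idea is the same as counting overlapping n-grams down the column. Here is a minimal Python sketch using the question's sample column; the function name is illustrative.

```python
from collections import Counter

# Column A from the question, top to bottom.
column = ["1", "2", "s", "4", "3", "s", "4", "2", "f", "2", "s", "2", "s"]

def ngram_counts(column, n):
    """Count every run of n consecutive cells, joined like =A1&A2&..."""
    return Counter(
        "".join(column[i:i + n]) for i in range(len(column) - n + 1)
    )

pairs = ngram_counts(column, 2)
triples = ngram_counts(column, 3)
print(pairs["2s"], pairs["s4"], triples["2s4"])
```

Unlike the filled-down formula, the sketch stops at the last complete combination, so the trailing lone "s" in the helper column has no counterpart here.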
Seems possible with COUNTIFS, with 1 to 5 inclusive in C1:G1 and in C2:
=COUNTIFS($A1:$A12,C1,$A2:$A13,"s")
copied across to suit.
You can use the VBA equivalent of this formula:
=SUMPRODUCT(--(ISNUMBER(A1:A12)),--(A2:A13="s"))
which looks for a number followed by s in the row below (4 for your sample).
Code:
MsgBox Evaluate("SUMPRODUCT(--(ISNUMBER(A1:A12)),--(A2:A13=""s""))")
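For comparison, here is a minimal Python sketch of the same "number directly above an s" test, broken down per digit as in the COUNTIFS answer. It assumes the cells are read as strings; the function name is hypothetical.

```python
from collections import Counter

# Column A from the question, top to bottom.
column = ["1", "2", "s", "4", "3", "s", "4", "2", "f", "2", "s", "2", "s"]

def digit_then_s(column):
    """Count, per digit, how often it sits directly above an 's'."""
    counts = Counter()
    for upper, lower in zip(column, column[1:]):
        if upper.isdigit() and lower == "s":
            counts[upper] += 1
    return counts

counts = digit_then_s(column)
print(dict(counts), sum(counts.values()))
```

The total across all digits corresponds to the single SUMPRODUCT result, while the per-digit counts correspond to the COUNTIFS formula copied across C2:G2.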
