Combine and Sum Duplicate Columns in Excel - excel

My apologies in advance - this is my first question on Excel here on SE, and I am fairly new at using it (if this question should be migrated to SuperUser please let me know in comments).
I have a spreadsheet which looks like this:
I need to combine the duplicate properties, summing the rows as I go.
Thus, the outcome should look something like this:
Notice that in this example I have created a third column for the duplicate properties - this is fine, though ideally, the original columns would be hidden after the duplicate columns are summed up. I have also created another column for property 2, even though it does not have a duplicate - this is also fine, though leaving property 2 alone entirely would also be acceptable.
I have read this question, but my question is somewhat different in that I am trying to sum the values, as well as attempting to sum based on duplicate columns, and not duplicate rows. I have also attempted to manipulate the instructions from here and here, though am having little luck.
Any help would be greatly appreciated!

With this data
Enter this formula in B6 as an array formula (CTRL+SHIFT+ENTER), then fill it across to E6:
=IFERROR(INDEX($B$1:$J$1, MATCH(0,COUNTIF($A$6:A6, $B$1:$J$1), 0)),"")
Then enter this formula in B7 and fill it across and down through E8:
=SUMIF($B$1:$G$1,B$6,$B2:$G2)
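For readers who find the logic easier to follow outside a worksheet, here is a minimal Python sketch of the same two steps: build the list of unique headers in first-seen order (what the INDEX/MATCH/COUNTIF array formula does), then total every column that shares a header (what SUMIF does). The function name and the sample data are made up for illustration and are not from the original spreadsheet.

```python
def sum_duplicate_columns(headers, rows):
    """Merge columns that share a header, summing their values row by row."""
    # Unique headers in first-seen order (the INDEX/MATCH/COUNTIF step).
    unique = list(dict.fromkeys(headers))
    summed = []
    for row in rows:
        totals = dict.fromkeys(unique, 0)
        for name, value in zip(headers, row):
            totals[name] += value  # the SUMIF step
        summed.append([totals[name] for name in unique])
    return unique, summed


# Hypothetical sample data: two properties appear twice.
headers = ["Property 1", "Property 2", "Property 1", "Property 3", "Property 3"]
rows = [
    [1, 2, 3, 4, 5],
    [10, 20, 30, 40, 50],
]

unique, summed = sum_duplicate_columns(headers, rows)
print(unique)
print(summed)
```

A header that appears only once (Property 2 here) simply gets its own value back, which matches the behaviour the question says is acceptable.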

Related

Looking at independent, unrelated, values in two columns, find where information matches in both cells and remove duplicates into two new columns

Thanks for any help here. I've been racking my brain (and searching online I promise) for a while on this one.
I'm looking at columns A and B, which hold unrelated information, but sometimes the pair of values in a row duplicates the pair in another row. For example, cell A2 says "Frank" and B2 says "1", then cell A3 says "Frank" and B3 says "1": the information is duplicated across the two columns. The rest of the names in this example can be anything (Frank, Sally, Robert, etc.), and the rest of the numbers in column B can be anything (image of example attached). If the information is a duplicate, I'd like to output a reduced list using two new columns.
These functions need to work in real time: as data is added, the formulas must update. I also can't concatenate because I need the info to stay in two columns. I've seen a lot of examples of doing this for one column of data using an array formula (see example 2), but I don't know how to build one that considers two columns. One-column example: =IFERROR(INDEX($T$2:$T$9, MATCH(0,COUNTIF($X$1:X3, $T$2:$T$9), 0)),"") Any ideas how to build this out so it works for two columns?
I'd like to avoid using VBA if possible, but if it's the only way so be it. I want to avoid using VBA because a lot of people touch the spreadsheet and it's a lot easier for me to fix a function than code. Gotta love humans!
Thanks so much for your help!
Robby
(Images: example, example2)
The simplest way that I can think of is:
a) Select your data, and paste elsewhere either on same sheet or a new one.
b) Select one of the cells in the copied data
c) Go to the Data menu tab, and click Remove Duplicates
d) Click OK
Create a new column C that concatenates columns A and B. Then remove duplicates on the basis of column C.
https://support.office.com/en-us/article/CONCATENATE-function-8F8AE884-2CA8-4F7A-B093-75D702BEA31D
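As a rough illustration of what "remove duplicates across two columns" means, the sketch below keeps the first occurrence of each (name, number) pair and preserves the original order. It is written in Python only to show the logic; the function name and sample values are assumptions, not part of the original workbook. It compares the pair as a tuple rather than as a joined string, which sidesteps the case where two different pairs concatenate to the same text (for example "A1" with "2" and "A" with "12").

```python
def unique_pairs(names, numbers):
    """Return the distinct (name, number) pairs in first-seen order."""
    seen = set()
    result = []
    for pair in zip(names, numbers):
        if pair not in seen:
            seen.add(pair)
            result.append(pair)
    return result


# Hypothetical sample data modelled on the "Frank"/"1" example.
names = ["Frank", "Frank", "Sally", "Frank", "Sally"]
numbers = [1, 1, 2, 3, 2]

pairs = unique_pairs(names, numbers)
for name, number in pairs:
    print(name, number)
```

The result stays in two "columns" (the two tuple fields), which is the requirement the question places on the reduced list.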

How to identify the cells from the sumproduct formula result

I would first like to apologize if this question has already been posted and answered numerous times but I was unable to find the right wording for my question to find a thread that matched.
I have a Sumproduct formula with multiple criteria that helps identify the number of issues I have on the main spreadsheet. I got the number of issues; however, I would now like to identify the cells meeting these criteria. Is there any way to do this?
To further explain my intention, the main tab on my spreadsheet is a report with many different columns that would need to be filtered several different ways each time in order to catch the exceptions we are looking for. I am trying to avoid this manual process by creating a new tab that shows these exceptions without anyone having to look for them, which leaves room for user error where something could be missed. In a new tab, I used several formulas (like the one below) to determine the number of different exceptions we need to catch; however, I am wondering if there is a way to also identify the specific cells the exceptions fall in so that the user can immediately locate and correct them.
For example: 2 issues identified; B10 and B26. (Or more specifically, if possible, the contents of that given cell?)
Sumproduct:
=SUMPRODUCT(--(May!C2:C452="FHA"),--(May!Z2:Z452<>""),--(May!AB2:AB452<>""),--(May!AC2:AC452=""))
Note: I have also tried to achieve this by using conditional formatting using the formula above, however the issue that I run into with that approach is that the entire row gets highlighted instead of the specific cells matching the criteria from the formula. I am open to a solution with this as well if it is an easier approach.
I hope I am getting across what I am trying to do! Thank you in advance to whomever can help!
Consider adding an additional column to the May worksheet. The new column would contain formulas like:
=(C2="FHA")*(Z2<>"")*(AB2<>"")*(AC2="")
If you AutoFilter on this new column, you will see all the contributors to the SUMPRODUCT() formula.
Your "2 issues identified; B10 and B26" appears to refer to cells that have no bearing on what you seek to achieve, so I may have misunderstood, but I suggest selecting A:AC and applying a CF (conditional formatting) formula rule of:
=AND($C1="FHA",$Z1<>"",$AB1<>"",$AC1="",OR(COLUMN()=3,COLUMN()=26,COLUMN()=28,COLUMN()=29))
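To make the idea behind both answers concrete, here is a small Python sketch that applies the same four tests as the SUMPRODUCT formula but returns the worksheet row numbers of the matches instead of only their count. The function name, the dictionary layout, and the sample records are hypothetical; only the four conditions come from the formula in the question.

```python
def matching_rows(records, first_row=2):
    """Return the worksheet row numbers whose cells satisfy all four tests."""
    hits = []
    for offset, rec in enumerate(records):
        if (rec["C"] == "FHA" and rec["Z"] != ""
                and rec["AB"] != "" and rec["AC"] == ""):
            hits.append(first_row + offset)
    return hits


# Hypothetical data standing in for rows 2 to 6 of the May sheet.
records = [
    {"C": "FHA", "Z": "x", "AB": "y", "AC": ""},      # row 2: match
    {"C": "FHA", "Z": "",  "AB": "y", "AC": ""},      # row 3: Z is blank
    {"C": "VA",  "Z": "x", "AB": "y", "AC": ""},      # row 4: not FHA
    {"C": "FHA", "Z": "x", "AB": "y", "AC": "done"},  # row 5: AC is filled
    {"C": "FHA", "Z": "1", "AB": "2", "AC": ""},      # row 6: match
]

rows = matching_rows(records)
print(len(rows), "issues identified in rows", rows)
```

The length of the returned list corresponds to the SUMPRODUCT result, and the list itself is what a helper column plus AutoFilter would surface.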

Sumproduct or Countif on a 2D matrix

I'm working on data from a population of people with allergies. Each person has a unique ExceptionID, and each allergen has a unique AllergenID (451 in total).
I have a data table with 2 columns (ExceptionID and AllergenID), where each person's allergies are listed row by row. This means that the ExceptionID column has repeated values for people with multiple allergies, and the AllergenID column has repeated values for the different people who have that allergy.
I am trying to count how many times each pair of allergies is present in this population (e.g. Allergen#107 & Allergen#108, Allergen#107 & Allergen#109,etc). To keep it simple I've created a matrix of 451 rows X 451 columns, representing every pair (twice actually because A/B and B/A are equivalent).
I somehow need to use the row name (allergenID) to lookup the ExceptionID in my data table, and count the cases where that matches the ExceptionIDs from the column name (also AllergenID). I have no problem using Vlookup or Index/Match, but I'm struggling with the correct combination of a lookup and Sumproduct or Countif formula.
Any help is greatly appreciated!
Mike
PS I'm using Excel 2016 if that changes anything.
-=UPDATE=-
So the methods suggested by Dirk and MacroMarc both worked, though I couldn't apply the latter to my full data set (17,000+ rows) because it was taking a long time.
I've since decided to turn this into a VBA macro because we now want to see the counts of triplets instead of pairs.
With the 2 columns you start with, it is as good as impossible... You would need to check every ExceptionID to have 2 different specific AllergenID. Better use a helper-table with ExceptionID as rows and AllergenID as columns (or the opposite... whatever you like). The helper table needs a formula like:
=COUNTIFS($A:$A,$D2,$B:$B,E$1)
Which then can be auto-filled. (The ranges are from my example, you need to change them to your needs).
With this helper-matrix you can easily go for your bigger matrix like this:
=COUNTIFS(E:E,1,INDEX($E:$G,,MATCH($I2,$E$1:$G$1,0)),1)
Again, you can auto-fill with this formula, but you need to change it, so it fits your needs.
Because the columns have the same ID2 (which would be your AllergenID), there is no need to look them up: E:E shifts automatically with the auto-fill.
The most important part of the formulas is the $ signs. If they are misplaced, the auto-fill will not work.
Picture of my self-made example (formulas are from the upper left cell in each table):
If you still have any questions, just ask :)
It can be done straight from your original set-up with array formulas:
Please note that array formulas MUST be entered with Ctrl-Shift-Enter, before copying across and down:
In the example pic, I have NAMED the data ranges $A$2:$A$21 as 'People' and $B$2:$B$21 as 'Allergens' to make it a nicer set-up. You can see in the formula bar how that looks as a formula. However you could use the standard references like this in your first matrix cell:
EDIT: silly me, N function is not needed to turn the booleans into 1's and 0's, since multiplying booleans will do the trick. Below formula works...
=SUM(IF(MATCH($A$2:$A$21,$A$2:$A$21,0)=ROW($A$2:$A$21)-1, NOT(ISERROR(MATCH($A$2:$A$21&$E2,$A$2:$A$21&$B$2:$B$21,0)))*NOT(ISERROR(MATCH($A$2:$A$21&F$1, $A$2:$A$21&$B$2:$B$21,0))), 0))
Then copy from F2 across and down. It can be perhaps improved in technique with sumproduct or whatever, but it's just a rough example of the technique....
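Since the update mentions moving to a macro and counting triplets, here is a minimal Python sketch of the underlying counting logic, offered as one possible approach rather than what either answer implements. It groups allergens by person and then counts every combination of a given size. The function name, the `size` parameter, and the sample IDs are assumptions made for illustration. Unlike the 451 x 451 matrix, it records each unordered pair once and leaves out the diagonal.

```python
from collections import Counter, defaultdict
from itertools import combinations


def count_allergen_groups(records, size=2):
    """Count how many people have each combination of `size` allergens."""
    by_person = defaultdict(set)
    for person, allergen in records:
        by_person[person].add(allergen)  # a set ignores repeated rows

    counts = Counter()
    for allergens in by_person.values():
        # sorted() gives each combination one canonical order,
        # so (107, 108) and (108, 107) are counted together.
        for combo in combinations(sorted(allergens), size):
            counts[combo] += 1
    return counts


# Hypothetical (ExceptionID, AllergenID) rows.
records = [
    (1, 107), (1, 108), (1, 109),
    (2, 107), (2, 108),
    (3, 108), (3, 109),
    (4, 107),
]

pairs = count_allergen_groups(records)
triplets = count_allergen_groups(records, size=3)
print(pairs)
print(triplets)
```

The work grows with the number of allergens per person rather than with the full matrix size, which may help on the 17,000+ row data set.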

Removing blank entries from Excel data validation with dependant lists

I need help removing blank entries from an Excel data validation list.
I’ve looked at various solutions, but in my implementation I am using dependent lists to drive several VLOOKUPs, so none of the solutions I have found seem to work. As an Excel novice it’s difficult to work out which path I should be heading down, so I’d be grateful to anyone who could help out.
If anyone feels like a challenge and wants to have a look, my sheet can be accessed at: https://www.dropbox.com/s/b7lxe9oagzdaniy/MRF_Dashboard_v0.6.1.1.xlsm?dl=a
For your New list, in Raw!DL2 use this array formula.
=IF(LEN(DL1),IFERROR(INDEX($A$2:$A$99, MATCH(0,IF($A$2:$A$99<>"",IF($DK$2:$DK$99="",COUNTIF(DL$1:DL1,$A$2:$A$99),1),1),0)),""),"")
Array formulas require Ctrl+Shift+Enter to finalize. Once entered correctly, you can fill down to Raw!DL100. This array formula produces a list of the numbers from column A where column DK is blank.
Similarly, the array formula for Raw!DM2 would be:
=IF(LEN(DM1),IFERROR(INDEX($A$2:$A$999, MATCH(0,IF($A$2:$A$999<>"",IF($DK$2:$DK$999="Pending",COUNTIF(DM$1:DM1,$A$2:$A$999),1),1),0)),""),"")
Fill down as necessary.
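The logic of these array formulas can be sketched in Python as follows: walk the ID column, keep an ID only if it is non-blank, its status matches the wanted value, and it has not already been listed. This is a simplified illustration with made-up names and data, not a replacement for the worksheet formulas.

```python
def filtered_list(ids, statuses, wanted=""):
    """Return the distinct non-blank ids whose status equals `wanted`."""
    result = []
    for item, status in zip(ids, statuses):
        # Mirrors the nested IFs: non-blank id, matching status,
        # and not already present in the output list (the COUNTIF test).
        if item != "" and status == wanted and item not in result:
            result.append(item)
    return result


# Hypothetical stand-ins for column A (ids) and column DK (statuses).
ids = [101, 102, "", 103, 101, 104]
statuses = ["", "Pending", "", "", "", "Pending"]

blank_status = filtered_list(ids, statuses)        # like the DL column
pending = filtered_list(ids, statuses, "Pending")  # like the DM column
print(blank_status)
print(pending)
```

Because the output contains no gaps, a list built this way can feed a data validation dropdown without blank entries.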

Using a Variable in Excel for COUNTIF

First time question and I hope it's easier than I'm making this.
Can I use a variable inside a COUNTIF formula?
Currently my formula is:
=COUNTIF($C$2:$C$415,R6)
I would like to have $415 as my variable. I have tried something along the lines of:
D1=415
=COUNTIF($C$2:$C$(D1),R6)
but obviously get an error.
The reason I need this is column C will constantly be incrementing as I add more rows.
Instead of going into each of my formulas and updating 415 to 416, 417, etc., I would like to just define a cell that can be my variable, or total rows.
Currently Column C can have blank cells, so I can't have a macro that finds the next empty cell.. but I do however have Column A with a constant populated cell and stops at the last ticket. However Column A is unrelated to the COUNTIF.
UPDATE 1
I'd also like to mention that I'd be using this variable in many formulas in the spreadsheet. Not only COUNTIF's. Also, the COUNTIF contains text.
UPDATE 2
Actually, I figured it out! I am using this formula instead:
=COUNTIF(INDIRECT("C"&D1&":C"&D2),R6)
I'm putting D1=2 and D2=415 and will just update cell D2 with how many rows I have.
I guess I just needed to ask the question thoroughly to fully understand what I wanted!
Thank you in advance for all help, tips and suggestions.
Would "=COUNTIF($C:$C,R6)" do the trick? This will apply COUNTIF to the whole of column C. It's an easy solution, but probably not the most efficient.
I prefer tables for storing data; as new data is added, the table automatically expands and the columns are already labeled (much like Named Ranges). Then you can have =COUNTIF(Table1[Column1],"Criteria"), which will encompass any new rows added to the table automatically. Especially helpful if you have multiple tables in the same column.
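The idea running through this thread, counting matches in a column between a first and last row that are stored elsewhere, can be sketched in Python like this. The function name and sample column are hypothetical, and the comparison is plain equality, whereas Excel's COUNTIF also ignores case and supports wildcards.

```python
def countif_rows(column, criterion, first_row, last_row):
    """Count cells equal to `criterion` between two 1-based row numbers."""
    # column[0] is worksheet row 1, so shift the start by one;
    # the slice end is exclusive, which makes last_row inclusive.
    window = column[first_row - 1:last_row]
    return sum(1 for cell in window if cell == criterion)


# Hypothetical column C: a header in row 1, then data with a blank cell.
column_c = ["Status", "open", "closed", "", "open", "open"]

first_row, last_row = 2, 5  # stand-ins for the values kept in D1 and D2
print(countif_rows(column_c, "open", first_row, last_row))

# Whole-column or table-style approach: let the end follow the data.
print(countif_rows(column_c, "open", 2, len(column_c)))
```

The second call reflects the whole-column and table suggestions: when the range end follows the data automatically, there is no row count to maintain by hand.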
