Excel - Remove duplicates and SUM at the same time

I have a column with IDs, but they are duplicated; for instance:
"0,0,1,1,1,2,3,3,4,4, ... "
For each row, I have a given value in the other columns, for instance:
"0-24; 0-36; 1-13; 1-34; 1-23;..."
I want to keep only one row per ID, with the values for that ID summed; that is, for a given ID (0, 1, 2, ...), which may span several rows, sum all the values in each column.
Is there an easy way to do this using Excel?
Here is some sample data (table to the left) together with the desired output (tables to the right).
ID   Value        ID   Value
0    24           0    60
0    36           1    70
1    13           2    16
1    34           3    24
1    23           4    48
2    16
3    9
3    15
4    24
4    24

What you can do is to copy your IDs and paste them for example in another Sheet. Let's assume your original table is in Sheet1, and you copy all your IDs to column A in Sheet2.
Then you remove duplicate IDs in Sheet2:
Select column A > Data Ribbon > Data Tools > Remove Duplicates
In column B, you then put the formula:
=SUMIF(Sheet1!$A:$A, Sheet2!$A2, Sheet1!$B:$B)
Note: the formula above goes into cell B2 on Sheet2; copy it down with Paste Special > Formulas.
Edit: if you still want the same number of rows etc because of the information in your other columns, just skip the "Remove duplicates"-part.
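For readers who want to check the expected totals outside Excel, here is a minimal Python sketch of the same idea (one entry per distinct ID, values summed per ID). It is not part of the Excel solution; the function name sum_by_id is hypothetical, and the rows are the sample data from the question.

```python
from collections import defaultdict

# Sample (ID, Value) rows copied from the question's left-hand table.
rows = [(0, 24), (0, 36), (1, 13), (1, 34), (1, 23),
        (2, 16), (3, 9), (3, 15), (4, 24), (4, 24)]


def sum_by_id(rows):
    """Return {id: total}: one entry per distinct ID, like Remove Duplicates + SUMIF."""
    totals = defaultdict(int)
    for row_id, value in rows:
        totals[row_id] += value  # accumulate every row that shares this ID
    return dict(totals)


totals = sum_by_id(rows)
for row_id, total in totals.items():
    print(row_id, total)
```

The totals can be compared against the right-hand table in the question.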

Related

How do I create rows in a table based off a cell value and fill a column also based off the cell values in Excel

I am looking to create and fill a table based on two cell values.
Suppose D2 and E2 contain the values 64 and 8 respectively. I want to create a column in a table that has 64 x 8 = 512 rows (not a 64x8 table). Then I want to fill the column with the values 0-63, repeated 8 times.
For example, the table will have a column with values:
0
1
2
3
4
...
63
0
1
2
3
4
...
63
The 0-63 pattern repeating 8 times.
Is this possible? Sorry if my explanation isn't clear. I can provide more detail.
Try the formula below:
=MOD(SEQUENCE(D2*E2,,0),D2)
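The formula counts from 0 to D2*E2-1 and takes each number modulo D2, which is what makes the 0-63 run restart. A rough Python equivalent, purely as an illustration (the function name repeating_sequence and its parameter names are made up here, standing in for D2 and E2):

```python
def repeating_sequence(period, repeats):
    """Mimic MOD(SEQUENCE(period*repeats,,0), period): 0..period-1, repeated."""
    return [i % period for i in range(period * repeats)]


# period plays the role of D2 (64), repeats the role of E2 (8).
column = repeating_sequence(64, 8)
print(len(column), column[:5], column[62:66])
```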

Make formula for list of row names independent from entire column and first row

I have the following Excel spreadsheet:
     A         B    C    D    E    F                  G    H
1              Q1   Q2   Q3   Q4   Search criteria:   60   Asset 2
2    Asset 1   15   85   90   70                           Asset 3
3    Asset 2   40   80   45   60                           Asset 3
4    Asset 3   30   60   55   60                           Asset 5
5    Asset 4   12   72   25   15
6    Asset 5   60   48   27   98
7
In Cells A1:E6 I have different assets with their performance from quarter Q1-Q4.
In Column H I list all assets that match the search criteria in Cell G1.
In this case the search criteria is 60 which can be found in the Cells A1:E6 for the Assets 2, 3 and 5.
For creating the list I use the formula from here:
=INDEX(A:A,SMALL(IF($B$2:$E$6=$G$1,ROW($B$2:$E$6)),ROW(1:1)))
All this works fine so far.
Now, when I move the cells A1:E6 elsewhere in the sheet, for example to D9:H14, the array formula only keeps working if it still refers to A:A and ROW(1:1), which might be a problem if the user decides to delete row 1. Therefore, I tried to modify the formula to:
=INDEX($D$9:$D$14,SMALL(IF($E$10:$H$14=$J$10,ROW($E$10:$H$14)),ROW($D$9:$H$9)))
However, with this modification I get #NUM! error.
Do you have any idea whether it is possible to make the array formula independent of A:A and ROW(1:1), so that it refers only to the cells A1:E6 and automatically adjusts when those cells are moved?
If you use Excel 2013 or later, you can use the following formula.
=IFERROR(INDEX($D$10:$D$14,AGGREGATE(15,6,ROW($1:$5)/($E$10:$H$14=$J$10),ROW(1:1))),"")
You can limit A:A to A1:A6 so that the reference adjusts as necessary when you move the data. Your formula should now be:
=INDEX(A1:A6,SMALL(IF($B$2:$E$6=$G$1,ROW($B$2:$E$6)),ROW(1:1)))
As for ROW(1:1), your top formula should always be ROW(1:1) and when you drag it down, then next formula should have ROW(2:2). When you move your top formula somewhere else and the ROW(1:1) changes to something like ROW(9:9) or anything, change it to ROW(1:1).
Please note that 'moving' your formula is different from 'dragging it down'.
EDIT:
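So after you moved your data set, the top formula should now be:
=INDEX($D$9:$D$14,SMALL(IF($E$10:$H$14=$J$10,ROW($E$10:$H$14)-ROW($D$9)+1),ROW(1:1)))
The -ROW($D$9)+1 term converts the sheet row numbers returned by ROW() into positions within $D$9:$D$14; without it, INDEX would be asked for rows 10 to 14 of a six-row range.
This is assuming that cell G1 (criteria) is also moved to J10.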
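The logic behind the INDEX/SMALL/IF pattern is: collect the position of every cell that equals the criterion, take those positions in ascending order, and look up the asset name at each position. A small Python sketch of that logic, using positions relative to the data block so nothing depends on where the block sits (the function name matching_assets is hypothetical; the data is the question's sample):

```python
# Asset names and their Q1-Q4 values, copied from the question's sample sheet.
names = ["Asset 1", "Asset 2", "Asset 3", "Asset 4", "Asset 5"]
table = [
    [15, 85, 90, 70],
    [40, 80, 45, 60],
    [30, 60, 55, 60],
    [12, 72, 25, 15],
    [60, 48, 27, 98],
]


def matching_assets(names, table, criterion):
    """One name per matching cell, ordered by row (the k-th smallest row position)."""
    hit_rows = sorted(
        row_pos
        for row_pos, row in enumerate(table)
        for value in row
        if value == criterion
    )
    return [names[row_pos] for row_pos in hit_rows]


print(matching_assets(names, table, 60))
```

As in the question's column H, an asset is listed once per matching cell, so Asset 3 appears twice.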

Vlookup, match, index

I am trying to compare data from one sheet to another: an Address and its ID.
Both sheets have Address and ID. ID can be repetitive.
Sheet 1            Sheet 2
Address   ID       Address   ID
23        1        22        1
45        1        45        1
23        2        23        2
45        2        45        3
I want to check whether each Address and ID pair on Sheet 1 also appears on Sheet 2, and add a new column on Sheet 1 that returns Yes or No for every row.
This can be done very quickly with an array formula.
Picture of ranges and result shows the data on the same sheet so that it is easier to see what is going on. Pretend Sheet1 is on the left and Sheet2 is on the right.
Formula in cell D3 is an array formula (enter with CTRL+SHIFT+ENTER) and then copied down to fill.
=MAX((B3=$F$3:$F$6)*(C3=$G$3:$G$6))
This formula will simply return a 0 or 1 for no match/match; the MAX collapses the row-by-row comparison into a single result for the cell. You can wrap it in an IF if you want text instead. It is simply checking that the relevant values match for both columns in both "sheets".
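The check amounts to asking whether each (Address, ID) pair from Sheet 1 exists anywhere among the pairs on Sheet 2. A minimal Python sketch of that test, using the sample data from the question (the function name pair_exists and the variable names are hypothetical):

```python
# (Address, ID) pairs copied from the question's two sheets.
sheet1 = [(23, 1), (45, 1), (23, 2), (45, 2)]
sheet2 = [(22, 1), (45, 1), (23, 2), (45, 3)]


def pair_exists(sheet1, sheet2):
    """Return "Yes"/"No" per Sheet 1 row: does the same Address+ID pair occur on Sheet 2?"""
    lookup = set(sheet2)  # both values must match together, not independently
    return ["Yes" if pair in lookup else "No" for pair in sheet1]


print(pair_exists(sheet1, sheet2))
```

Note that (45, 2) comes back as "No": address 45 and ID 2 both occur on Sheet 2, but never on the same row.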

Removing specific rows in Excel

I have a data set in Excel as shown in the snippet below:
Patient Number Age State
1 20 1
1 20 3
1 20 2
2 35 1
2 35 4
3 62 2
3 62 1
3 62 3
3 62 4
3 62 5
I need to keep the last row of each patient, i.e. I need the dataset to look as follows:
Patient Number Age State
1 20 2
2 35 4
3 62 5
Is there a simple way to do this with Excel? The dataset is huge, so I cannot do it manually.
If your data is in A:C columns, you can add another column with the following formula in D2:
=A2<>A3
Fill it down. Apply autofilter, choose False in D column and delete all filtered rows.
Edit:
This solution assumes your data is sorted by column A.
Enter the formula below in D2 and press CTRL+SHIFT+ENTER to make it an array formula:
=MAX(IF($A$2:$A$11=A2,ROW($A$2:$A$11)))=ROW()
The advantage of this formula is that the Patient Number column doesn't have to be sorted. The formula finds the last entry for each patient number.
You can easily keep the top entry with Data ► Data Tools ► Remove Duplicates. To keep the last entry, you first need to reverse the order.
In an unused Helper column to the right, put a 1 in the top row. Then select all of the cells in that column down to the bottom of your data and use Home ► Editing ► Fill ► Series to get a column of sequential numbers.
Sort your data using that new column in descending order.
Choose Data ► Data Tools ► Remove Duplicates using only the Patient column as the criteria for duplication.
Delete the Helper column as it is no longer needed.
Duplicates are deleted from the bottom up, so the first value for each patient, which after the reverse sort is the original last entry, will be retained.
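All three approaches above aim at the same result: for each patient, keep only the row that appears last. As a cross-check outside Excel, that rule can be sketched in a few lines of Python (the function name last_row_per_patient is hypothetical; the rows are the question's sample data):

```python
# (Patient Number, Age, State) rows copied from the question.
rows = [
    (1, 20, 1), (1, 20, 3), (1, 20, 2),
    (2, 35, 1), (2, 35, 4),
    (3, 62, 2), (3, 62, 1), (3, 62, 3), (3, 62, 4), (3, 62, 5),
]


def last_row_per_patient(rows):
    """Keep the last row seen for each patient; the input need not be sorted."""
    last = {}
    for row in rows:
        last[row[0]] = row  # a later row for the same patient overwrites the earlier one
    return list(last.values())


for row in last_row_per_patient(rows):
    print(row)
```

Like the MAX/ROW array formula, this sketch does not require the patient column to be sorted; patients come out in order of first appearance.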

Sum the values in Excel cells depending on changing criteria

In an Excel spreadsheet I have three columns of data, the first column A is a unique identifier. Column B is a number and column C is either a tick or a space:
    A      B     C
1   d-45   150   √
2   d-46   200
3   d-45   80
4   d-46   20    √
5   d-45   70    √
Now, I wish to sum the values in column B depending on a tick being present and also relative to the unique ID in column A; in this case, rows 1 and 5. To identify the tick I use
=IF(ISTEXT(C1),CONCATENATE(A1))
and
=IF(ISTEXT(C1),CONCATENATE(B1)).
This leaves me with two arrays of data:
D E
1 d-45 150
4 d-46 20
5 d-45 70
I now want to sum the values in column E depending on the ID in column D, in this case rows 1 and 5. I can use a straightforward SUMIFS statement to specify d-45 as the criterion; however, this unique ID will always change. Is there a variation of SUMIFS I can use?
I also wish to put each new variation of ID number into a separate header with the summed totals underneath, that is:
A B
1 d-45 d-46
2 220 20
etc...
You can try this:
To get the distinct IDs, write the following in H1 and then copy right.
This one is an array formula, so you need Ctrl+Shift+Enter to enter it:
=INDEX($A$1:$A$5;SMALL(IF(ROW($A$1:$A$5)-ROW($A$1)+1=MATCH($A$1:$A$5;$A$1:$A$5;0);ROW($A$1:$A$5)-ROW($A$1)+1;"");COLUMNS($A$1:A1)))
Now to get the sum (H2 and copy right)
=SUMPRODUCT(($A$1:$A$5=H1)*ISTEXT($C$1:$C$5)*$B$1:$B$5)
Data in the example is in A1:C5
Depending on your regional settings, you may need to replace the ";" field separator with ",".
Try this, using SUMIFS:
=SUMIFS(B1:B5,A1:A5,"=d-45",C1:C5,"<>")
where "<>" means that the cell is not empty...
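Both answers implement the same rule: a row counts toward an ID's total only when its tick column is non-empty. Here is a small Python sketch of that rule, handy for checking the expected totals by hand (the function name ticked_totals is hypothetical; the rows are the question's sample data, with an empty string standing in for a blank cell):

```python
# (ID, value, tick) rows copied from the question; "" means no tick.
rows = [
    ("d-45", 150, "√"),
    ("d-46", 200, ""),
    ("d-45", 80, ""),
    ("d-46", 20, "√"),
    ("d-45", 70, "√"),
]


def ticked_totals(rows):
    """Sum the value per ID, counting only rows whose tick cell is non-empty."""
    totals = {}
    for row_id, value, tick in rows:
        if tick.strip():  # skip rows with a blank or whitespace-only tick cell
            totals[row_id] = totals.get(row_id, 0) + value
    return totals


print(ticked_totals(rows))
```

The keys correspond to the header row of the desired layout and the values to the summed totals underneath.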
