How to count unique column data in an Excel sheet - excel

I am using an Excel sheet and have a data column as shown below:
As we can see, some of the names are duplicated or appear twice. My question is: how can I count the records (rows) associated with each unique name for a summary column?
The output I am looking for is shown below:
I am not sure which formula to use, as COUNT counts all of the data, i.e. '7' in this case. How can I use COUNT or any other function to count unique records as shown above?

You can do what you're after with a pivot table.
Click the Insert tab then select "Recommended Pivot Tables".
A window will open up prompting you to select the data range. I recommend using a named range for your list and referencing that, but you can just highlight the list directly if you want.
Once the data range is selected, click "OK" and a new window will open with exactly what you want: a unique values list and a "Count of Column1". This is the default of the recommended pivot tables.
I outlined this because it's easy and fast, but it's important to understand you can make this pivot table yourself from scratch if you learn about pivot tables in general. Pivot tables are often overlooked in Excel as an option.
Lastly, you could get really advanced with Excel Power Query. Just Google "Excel Power Query" and you will be shown all kinds of information on it. For manipulating Excel data, it comes a close second in power to VBA.
Good luck!

COUNTA(UNIQUE(D2:D8,,FALSE)) = 5 [COUNTA(UNIQUE(D2:D8)) is the same, as FALSE is the default.]
COUNTA(UNIQUE(D2:D8,,TRUE)) = 3 (values that occur once and only once)
Note: the UNIQUE function was released to Office 365 in late 2019, so check your version before relying on it: it is not present in version 1908, but it is present in 2006.
Edit: It's actually present from version 2002; I just updated my 1908 machine.
HTH
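For anyone who wants to sanity-check the two counts outside Excel, here is a minimal Python sketch of the same distinction. The seven names are made-up sample data (the original list is not shown), chosen so that the results match the 5 and 3 above. Note that Python compares text case-sensitively, whereas UNIQUE does not, so this only lines up when the casing is consistent.

```python
from collections import Counter

# Hypothetical stand-in for D2:D8 (the original data is not shown).
names = ["Ann", "Bob", "Ann", "Cid", "Bob", "Dee", "Eve"]


def count_distinct(values):
    """Like COUNTA(UNIQUE(range)): each different value is counted once."""
    return len(set(values))


def count_exactly_once(values):
    """Like COUNTA(UNIQUE(range,,TRUE)): only values that occur a single time."""
    return sum(1 for n in Counter(values).values() if n == 1)


print(count_distinct(names))      # 5
print(count_exactly_once(names))  # 3
```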

If the duplicate names have already been removed (leaving a unique list in column F), the following formula can be used: =COUNTIF(B:B,F2)
If duplicates must be removed by formula, the MATCH function (searches for a specified item in a range of cells and returns the relative position of that item in the range) and the SMALL function (returns the k-th smallest value in a data set) can be used as shown.
C$1048576 references the last row of the sheet, so the formula also covers a very long list.
Formulas:
Column A, names sequence
Column B, names
Column C, formula =MATCH(B2,B:B,0)
Column D, formula =IF(COUNTIF(C2:$C$1048576,C2)=1,C2,"")
Column E, formula =SMALL(D:D,A2)
Column F, formula =VLOOKUP(E2,A:B,2,0)
Column G, formula =COUNTIF(B:B,F2)
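The end result of the helper columns is a de-duplicated list of names (column F) with a count beside each one (column G). As a rough cross-check, the same summary can be sketched in Python. The sample names are hypothetical, and the sketch only reproduces the final result, not the individual MATCH/SMALL steps.

```python
# Hypothetical stand-in for the names in column B.
names = ["Ann", "Bob", "Ann", "Cid", "Bob", "Dee", "Eve"]


def summarize(values):
    """Return (name, count) pairs, one per distinct name, in order of first appearance."""
    counts = {}
    for value in values:
        # dicts keep insertion order, so the first appearance fixes the position
        counts[value] = counts.get(value, 0) + 1
    return list(counts.items())


for name, count in summarize(names):
    print(name, count)
```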

This is for anyone like me without O365's lovely UNIQUE and FILTER functions who doesn't want to use a pivot table. There are many ways to do this, but I have just done it in normal Excel.
List of data in column H, formula in O3. Drag down. It highlights the distinct and unique values from H.
=IF(COUNTIF(H:H,H28)=1,"U - "&COUNTIF(H:H,H28),IF(COUNTIF(H$1:H27,H28)=1,"U - "&COUNTIF(H:H,H28),"-"))
The formula can be shortened. You can just enter this and drag down. Apply the same principle to your worksheet data wherever it is.
=IF(COUNTIF(H:H,H3)=1,"U",IF(COUNTIF(H$1:H2,H3)=1,"U","-"))
Similarly, you can just use this formula here (credit goes to this source for this one):
=(COUNTIF($H$1:$H1,$H1)=1)+0
I'd like to point out that the above formula is better than mine. It flags with a "1" (or, with a tweak, the value of your choice) the first time any value is seen in a given list, whether it is a duplicate or unique.
Mine, by contrast, is a bit "more random" when picking up the "unique and distinct" values.
Mine gets there in the end, but Extend Office's gets there first, which I think is proper (flagging the first time a unique distinct value occurs).
Formula in K5 =IF((COUNTIF($H$5:$H5,$H5)=1)+0=1,"UNIQUE DIST","") and drag down...
You could append a normal basic COUNTIF after the result to show how many times the given value actually appears, if you wanted:
=IF((COUNTIF($H$5:$H5,$H5)=1)+0=1,"UNIQUE DIST","")&" - "&COUNTIF(H:H,H5)
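To make the "first time seen" idea concrete, here is a small Python sketch of what the running COUNTIF does row by row. The function names and sample values are my own, purely for illustration, and Python compares case-sensitively where COUNTIF does not.

```python
from collections import Counter


def first_occurrence_flags(values):
    """1 on the row where a value first appears, 0 on later repeats.

    Mirrors =(COUNTIF($H$1:$H1,$H1)=1)+0 dragged down the column.
    """
    seen = set()
    flags = []
    for value in values:
        flags.append(1 if value not in seen else 0)
        seen.add(value)
    return flags


def labels(values):
    """Mirrors the last formula above: the flag text, then ' - ' and the total count."""
    totals = Counter(values)
    flags = first_occurrence_flags(values)
    return [
        ("UNIQUE DIST" if flag else "") + " - " + str(totals[value])
        for value, flag in zip(values, flags)
    ]


print(first_occurrence_flags(["x", "y", "x", "z", "y"]))  # [1, 1, 0, 1, 0]
print(labels(["x", "y", "x"]))
```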

Related

Excel formula for counting the number of incidents of a word in a column?

Is there a quicker way of searching for terms without typing each one into the formula? Like, say I have a column that has a bunch of names of locations and I want to find out how many times each one comes up.
This is the formula for when I type in the locations:
=COUNTIF($F$2:$F$274,"*AD library*")
I just modify the AD library to the next one, say monastery so it would be
=COUNTIF($F$2:$F$274,"*monastery*")
Is there another way of getting the same info without having to type in each one (it's a big sheet with a lot of locations).
Thanks
Use a PivotTable on just that one column. Put that column in both Rows and Values:
Go to the Insert tab and insert a pivot table.
Then drag the header you want the count of (locations?) to Row Labels and any other header to the Values field (preferably something with text).
Now the pivot should give you the count of each item.
If you don't have a column with text, choose any other column and switch from Sum to Count in the Value Field Settings.
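One thing worth noticing is that a pivot table counts exact cell values, while the wildcard COUNTIF in the question counts cells that merely contain the term. A minimal Python sketch of the difference, using made-up locations rather than the real column F:

```python
from collections import Counter

# Hypothetical stand-in for the location column F2:F274.
locations = ["AD library", "monastery", "AD library", "old monastery", "market"]


def count_each(values):
    """Pivot-table style: one count per exact cell value."""
    return Counter(values)


def count_containing(values, term):
    """Like COUNTIF(range, "*term*"): cells that contain the term, ignoring case."""
    term = term.lower()
    return sum(1 for value in values if term in value.lower())


print(count_each(locations)["monastery"])        # 1 (exact matches only)
print(count_containing(locations, "monastery"))  # 2 (also counts "old monastery")
```

If the cells hold nothing but the location name, the two approaches give the same numbers.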

Restructuring data in excel

I am trying to condense data in a specific way. I want any occurrence of the number 1 in each column to show up as a single 1 (regardless of how many times it occurs) against the corresponding site, in the corresponding column. Some sites occur multiple times in the original data, and I want only one row for each unique site in the resulting table, with a 1 in a given column if there are any 1's in that column for that site in the original data.
I would think it would be a vlookup function, but I have tried many different things and I am really stuck on this.
Image of original data and what I am trying to do:
Thank you
This assumes that your data set only contains 1 or blank and this approach uses a Pivot Table with MAX function. Below are details in case anyone doesn't know Pivot Tables.
Select a cell in your data and insert Pivot Table. Note, I added a title for column A, as you need that in the Pivot Table.
Click in the created Pivot Table and the PivotTable Fields dialog should pop up. If not, right click in Pivot Table and select Show Field List.
Drag the field names (Code, a, b, & c) down to the appropriate blocks below. (Values under Columns will be created for you.)
Click on the drop-down arrow next to each field name and select Max. That will rename it to "Max of ...". If that bothers you, you can type the name you want into the Custom Name field. Note, it will not let you type the same name as the field name, e.g. a, but it will work if you put a space in front of it.
Given that the Pivot Table would be a lot of work for a large number of columns, here is a formula based approach. Put this formula in cell G2, then drag it down and across to fill your new table.
Note, you will have to populate all codes that you have in column F. And if any new codes are added later you will have to keep this updated. One of the advantages of a Pivot Table is that it will do this for you.
I know that you won't be putting this in these cells, so adjust accordingly. In fact, I would recommend this be in another sheet.
=IF(COUNTIFS($A:$A,$F2,B:B,1)>0,1,0)
COUNTIFS($A:$A,$F2,B:B,1)
This will count each occurrence when the value in column A matches your code $F2 AND the value in column B equals 1.
If that count is >0, then you know that at least one match was found and the IF will return 1, otherwise 0.
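The same "any 1 for this site in this column" logic can be sketched in Python for anyone who wants to see it outside a worksheet. The rows below are hypothetical (the original image is not available), and None stands for a blank cell.

```python
# Hypothetical rows: (site code, a, b, c); None stands for a blank cell.
rows = [
    ("S1", 1, None, None),
    ("S1", None, 1, None),
    ("S2", None, None, None),
    ("S2", None, None, 1),
    ("S3", None, None, None),
]


def condense(data):
    """One entry per site; each column is 1 if any original row had a 1 there, else 0.

    Same idea as =IF(COUNTIFS(code range, code, column, 1)>0,1,0) per cell.
    """
    result = {}
    for code, *flags in data:
        current = result.setdefault(code, [0] * len(flags))
        for i, flag in enumerate(flags):
            if flag == 1:
                current[i] = 1
    return result


for code, flags in condense(rows).items():
    print(code, flags)
```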

Generate or fill cell data based on another dataset excel

I have a data set that shows:
employee name
date
time work started
time work ended
Now I am trying to build a report-like sheet where I can select an employee name from a list of employees to view his/her attendance times for a particular month.
I tried VLOOKUP but got nowhere, since I need to look up by two columns plus a row.
Is this possible without macros or VBA?
Thanks
Since name and date together are unique identifiers, it is possible to use the SUMIFS function.
For 'time in' and 'Rachel' this will look as follows:
=SUMIFS(column 'time in' from data set, column 'name' from data set, "Rachel", column 'date' from data set, "10/01/2017")
Rachel and the date can also be cell references.
=AGGREGATE(15,6,ROW(SHEET1!$A$2:$E$22)/((SHEET2!$B$1=SHEET1!$B$2:$B$22)*(SHEET2!$A4=SHEET1!$C$2:$C$22)),1)
The above formula will grab the row number that matches your criteria. To pull the information you want, you can place the row number inside an INDEX formula to get the following:
=INDEX(SHEET1!$D:$E,AGGREGATE(15,6,ROW(SHEET1!$A$2:$E$22)/((SHEET2!$B$1=SHEET1!$B$2:$B$22)*(SHEET2!$A4=SHEET1!$C$2:$C$22)),1),COLUMN(A1))
You can place the above in your first Time cell and copy right and down. You will see errors if the criteria do not match anything, i.e. there is no person of that name or no data for that person on that date. To avoid this, you can wrap the whole thing in an IFERROR like the below:
=IFERROR(INDEX(SHEET1!$D:$E,AGGREGATE(15,6,ROW(SHEET1!$A$2:$E$22)/((SHEET2!$B$1=SHEET1!$B$2:$B$22)*(SHEET2!$A4=SHEET1!$C$2:$C$22)),1),COLUMN(A1)),"Nothing found")
If you would rather display a blank than "Nothing found", change "Nothing found" to "", or to 0 if you want 0 to be displayed.
Note: AGGREGATE is performing array-like calculations in this case. As such, you do not want to use full column references, as they will cause a lot of unnecessary calculations to be performed. Because you have unique entries, the SUMIFS option given in another answer is a much better choice.
I think a pivot table will do the job for you.
Place the employee name in the filter, place date and times in the rows.
Remove subtotals from the Pivot Table
Change Table layout to tabular and Repeat rows
Right click on the Time In and select Ungroup
Then you have the image below.
I have the following layout:
In B11 write this formula and drag down:
=INDEX($B$2:$E$5,MATCH($B$7&$A11,$B$2:$B$5&$C$2:$C$5,0),3)
In C11 write this and drag down:
=INDEX($B$2:$E$5,MATCH($B$7&$A11,$B$2:$B$5&$C$2:$C$5,0),4)
Note that these are array formulas, so you need to enter them with CTRL + SHIFT + ENTER instead of the normal Enter.
You will get an #N/A error (#NV in German-language Excel) if the employee hasn't worked on one of the dates in A11 and A12. You can wrap the formula in IFERROR to avoid this.
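All of the answers above boil down to a lookup on two keys at once (employee name plus date). A minimal Python sketch of that idea, with made-up records and my own function names, where the default value plays the role of IFERROR:

```python
# Hypothetical attendance records: (name, date, time in, time out).
records = [
    ("Rachel", "2017-10-01", "09:00", "17:00"),
    ("Rachel", "2017-10-02", "08:30", "16:45"),
    ("Tom", "2017-10-01", "10:00", "18:00"),
]


def build_index(rows):
    """Key each record by (name, date), much like matching on the two columns joined."""
    return {(name, date): (t_in, t_out) for name, date, t_in, t_out in rows}


def lookup(index, name, date, default=("", "")):
    """Return (time in, time out), or the default when nothing matches."""
    return index.get((name, date), default)


index = build_index(records)
print(lookup(index, "Rachel", "2017-10-02"))  # ('08:30', '16:45')
print(lookup(index, "Tom", "2017-10-02"))     # ('', '')
```

This relies on each name and date pair appearing only once, which is the same assumption the SUMIFS answer makes.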

How do I prevent Excel from automatically replicating formulas in tables?

I'm using Excel 2016. I have a table with headers and when I plug in a formula, Excel is automatically replicating the formula to all other cells in the column. While that would normally be fine, it's wrongly calculating the table headers. I thought I could just change the top row to exclude the header but Excel updates the rest of the column which I don't want.
I would like to either turn this automatic formula replication feature off or figure out a way to customize the formula in the top row so it doesn't calculate the header value.
Here's the formula I'm using. I didn't do anything special with the table other than adding a 'Totals' row:
=SUM(B2+C1-D2)
You can stop creating calculated columns. The option to automatically fill formulas to create calculated columns in an Excel table is on by default. If you don’t want Excel to create calculated columns when you enter formulas in table columns, you can turn the option to fill formulas off. If you don’t want to turn the option off, but don’t always want to create calculated columns as you work in a table, you can stop calculated columns from being created automatically.
Turn calculated columns on or off
1) On the File tab, click Options.
2) Click Proofing.
3) Under AutoCorrect options, click AutoCorrect Options.
4) Click the AutoFormat As You Type tab.
5) Under Automatically as you work, select or clear the Fill formulas in tables to create calculated columns check box to turn this option on or off.
Stop creating calculated columns automatically
After entering the first formula in a table column, click the AutoCorrect Options button that is displayed, and then click Stop Automatically Creating Calculated Columns.
In Excel 2016/365 you can also change the one cell you want, let Excel auto-populate the rest of the column, then press Ctrl+Z. This undoes the auto-populate but keeps the new formula/text you just entered in that one cell.
First, why do you wrap a simple formula into a SUM function? I always wonder why people do that when it's much shorter to write =B2+C1-D2 instead.
Second, if you used the true capabilities of SUM() then text, i.e. your column header, would be ignored instead of throwing an error. The + and - operators don't tolerate text, be it in a table or not. You could rewrite your formula to be
=Sum(B2,C1,D2*-1)
Third, be aware that cell referencing like that will behave erratically when you insert rows into the existing table (between existing rows). The row references will be off for anything below the inserted row and you will need to manually copy down the formula again to get correct results.
In order to get a formula that does not require adjusting, you may want to use structured referencing, where each row has exactly the same formula, instead of cell references, where row references are adjusted in each row. A possible formula for this would be (if your columns are labelled data1, data2 and data3 for columns B, C and D):
=SUM([@data1],OFFSET([@data2],-1,0),[@data3]*-1)
To get the data from the row above, OFFSET() is used on the cell in the current row (using the @ sign), with a negative row offset. Keep in mind that OFFSET is volatile, which may slow down very large datasets.
Adjusting AutoCorrect options is not optimal. You can do it on a column-by-column basis:
1) Let the column auto-populate your formula.
2) Delete any number of values in the auto-calculated column (just one will do the trick).
3) Adjust the formulas in the column as you wish.
Only able to test in Excel 2013.
It may be a little late to reopen this query, but a couple of points:
Formula replication does not work if there are already different formulas in the column. You may need 3 different formulas before Excel says it cannot guess which formula to replicate.
A solution could be to add 2 dummy rows at the top, which will then prevent the replication. The benefit of this is that you are not disabling replication for any other tables that may be in the workbook; I'm also not sure whether the setting may inadvertently be copied to other new workbooks you create whilst you have the first workbook open.
Finally, the old chestnut that =A1+B2 etc. creates an error if A1 or B2 is not numeric, whereas =SUM(A1,B2) works differently in that text will be ignored and effectively treated as zero.
If your reference has letters in the middle, these techniques won't work. To add a zero before the reference in that case, create a new column next to it, put "0" in all the cells next to the references, then add another column and use the CONCATENATE function (or similar) to join the "0" cell and the cell with the reference.
Not only that, the LEN function will correctly tell you the number of characters in references with letters in the middle (because Excel treats them as text/general format rather than as numbers), whereas for other references with 0s added in front by formatting this doesn't happen.
E.g.: 012345 =LEN gives 5 digits / 01A345 =LEN gives 6 digits

How to automatically enforce a formula in cells for a newly created row?

I have a worksheet used for estimation where some columns have a formula, most importantly Column(A) has a formula =ROW() which I used as an ID for that estimate (row item) and Column(I) is a total (=SUM(E2:H2)) of all the estimates in that row.
When a user inserts a new row (a new estimate) I need to ensure the new row already has the required formulas in it. I do not want the user to have to drag the formulas down from the cells above (this is not being used by Excel-savvy people).
Also, this should only be done for a specific range of rows (I do not want to see an ID/Total for Row750 if it is not used), so there needs to be a way to tell it to stop at a specific point. For example, I have a row (13) where B13="TOTAL" and C13=SUM of totals (overall total), and this should be the last row, so if the user inserts a new row it would go above this one, etc.
Any help/hints/ideas would be much appreciated.
You could use the 'Tables' feature. In Excel 2003 it was called 'Lists'. I'm not sure where to find this feature in 2010.
Ex:
Define a range as being a list table. I used the ROW() function to autonumber the rows. Here's how it looks:
It doesn't overcome the problem that Tim mentioned, but I file that under "not a problem". I actually prefer dynamic autonumbering. When you insert a row, the autonumber changes automatically:
And when you delete a row, the autonumber adjusts:
The best part is that any formulas in the list table will auto-fill down. Besides the ROW formula, I added a SUM formula in column C, then added a new record:
As far as stopping at some point, you will need to move your formulas to another column. The list will continue down and won't stop simply because you have a formula there. You would need to right-click the list table, choose "Insert » Row" in order to keep your formulas in the same columns as the list table.

Resources