Retrieve unique counts in excel - excel

I'm looking to create a formula that counts the number of sign ups for a given day. My input data is based off an event log, and looks something like this:
AccountName | SignUpDay | EventLogged
acct1 | 06/01/2016 | 06/01/2016
acct1 | 06/01/2016 | 06/05/2016
acct2 | 06/01/2016 | 06/02/2016
acct3 | 06/01/2016 | 06/04/2016
acct3 | 06/01/2016 | 06/06/2016
acct4 | 06/03/2016 | 06/06/2016
The above is dummy data, but let's say I have 10k lines of input data. For my output, given a specific day, I want to look at the input data and return the number of signups for that particular day. What I want to achieve is something like this:
SignUpDay | Count
06/01/2016 | 3
06/02/2016 | 0
06/03/2016 | 1
I know I could probably do something like this in R, but I'm working within what I have right now, which is Excel. Does anyone have any ideas on how to achieve this?

With Excel's FREQUENCY() and an Array formula:
Assuming
your sheet is set up with AccountNames in A2:A100
your SignupDates are stored in B2:B100
the current Date you want to subset for is in F2
then enter:
=SUM(IF(FREQUENCY(MATCH(IF(B$2:B$100=F2,A$2:A$100,""),IF(B$2:B$100=F2,A$2:A$100,""),0),MATCH(IF(B$2:B$100=F2,A$2:A$100,""),IF(B$2:B$100=F2,A$2:A$100,""),0))>0,1))-1
and press CTRL+SHIFT+ENTER to enter this as an array formula.

With your list of sequential dates in, for example, F2:Fn, you may try this array-entered formula:
G2: =SUM(1/COUNTIF(AccountName,AccountName)*(SignUpDay=F2))
AccountName and SignUpDay should refer only to the existing data range (no blanks). If there are blanks, a more complex formula would be required. If you use a table, with structured addressing, the names can adjust automatically.
Also, the formula assumes that the same account will not sign up on more than one day. If that might be the case, a more complex formula would be required.
eg:
=SUM(1/COUNTIF(SignUpTable[AccountName],SignUpTable[AccountName])*(SignUpTable[SignUpDay]=F2))
To array-enter a formula, after entering the formula into the cell or formula bar, hold down Ctrl+Shift while hitting Enter. If you did this correctly, Excel will place braces {...} around the formula.
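As a cross-check outside Excel, the 1/COUNTIF idea can be sketched in Python: every row contributes 1 divided by the number of rows that share its account, so each account adds up to exactly 1 when all of its rows fall on the same day. This is only an illustration of the arithmetic, not the Excel implementation; the names `rows` and `unique_signups` are made up for the example, and the data is the dummy data from the question.

```python
from collections import Counter
from fractions import Fraction

# (AccountName, SignUpDay) pairs taken from the question's dummy data.
rows = [
    ("acct1", "06/01/2016"),
    ("acct1", "06/01/2016"),
    ("acct2", "06/01/2016"),
    ("acct3", "06/01/2016"),
    ("acct3", "06/01/2016"),
    ("acct4", "06/03/2016"),
]


def unique_signups(rows, day):
    """Mirror SUM(1/COUNTIF(AccountName,AccountName)*(SignUpDay=day))."""
    # COUNTIF(AccountName, AccountName): how often each account appears overall.
    occurrences = Counter(account for account, _ in rows)
    # Each matching row adds 1/occurrences, so one account sums to 1.
    return sum(
        Fraction(1, occurrences[account])
        for account, signup_day in rows
        if signup_day == day
    )


for day in ("06/01/2016", "06/02/2016", "06/03/2016"):
    print(day, unique_signups(rows, day))
```

Exact fractions are used here only to sidestep floating-point rounding. As in the formula, the result is only a whole number if an account never signs up on more than one day.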
EDIT: If your data does not fit into the above constraints, I would suggest one of the solutions linked to by @pnuts in his comments, where you add an additional column to your data with the formula:
=IF(SUMPRODUCT(($A$2:$A2=A2)*($B$2:$B2=B2))>1,0,1)
and then construct a Pivot table with SignUpDay in the Rows area, and this new column in the Values area. A disadvantage of the Pivot Table solution is that dates with zero signups will not be represented in the table.
In Excel 2013+ it is possible to generate Unique Counts in the Values area, but you mentioned you are using Excel 2010, so that is not a possibility.
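The helper-column formula can be read as "flag the first occurrence of each account/day pair with 1 and every repeat with 0", after which the Pivot Table simply sums the flags per day. A rough Python sketch of that logic follows, using the question's dummy data; `first_occurrence_flags` and `pivot_sum` are hypothetical names chosen for the illustration.

```python
from collections import defaultdict

# (AccountName, SignUpDay) pairs taken from the question's dummy data.
rows = [
    ("acct1", "06/01/2016"),
    ("acct1", "06/01/2016"),
    ("acct2", "06/01/2016"),
    ("acct3", "06/01/2016"),
    ("acct3", "06/01/2016"),
    ("acct4", "06/03/2016"),
]


def first_occurrence_flags(rows):
    """Return 1 for the first time an (account, day) pair appears, else 0."""
    seen = set()
    flags = []
    for pair in rows:
        flags.append(0 if pair in seen else 1)
        seen.add(pair)
    return flags


def pivot_sum(rows, flags):
    """Sum the flags per sign-up day, as the Pivot Table would."""
    totals = defaultdict(int)
    for (_, day), flag in zip(rows, flags):
        totals[day] += flag
    return dict(totals)


flags = first_occurrence_flags(rows)
print(flags)
print(pivot_sum(rows, flags))
```

Note that 06/02/2016 does not appear in the result at all, which matches the disadvantage mentioned above: days with zero signups are not represented.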

Related

Excel Offset: Automating the Reference Cell by using Index Match

If the answer was already provided, please feel free to post the link, but I did not see one.
This is a real estate example. I have a table with many properties and their information. One of the fields is "Occupancy." I am trying to use the AVERAGE and OFFSET functions combined with CELL and INDEX/MATCH to calculate the average Occupancy of the top 5 and top 10 buildings in this table. To accomplish this, I wanted to create a "Summary Table" below the Property table to list the averages.
Specifically:
For OFFSET, in the "Reference" section, I want to look up "Occupancy" in my summary cell, use INDEX MATCH to find "Occupancy" in the table, and then;
I want the Cell Reference returned AS A VALUE and then move one row down and then highlight the top 5.
Most importantly, I want everything cell referenced without hardcoding or navigating the table to highlight the cell reference. I am not sure if this is possible, but after poring through a bunch of articles, I am having trouble finding a solution.
To make things simple, let's pull out all the columns I had in the original table and go off the below. Assume all the data types are in the proper format:
**COLUMN A** **COLUMN B**
ROW 1 | Properties | Occupancy |
ROW 2 | Property A | 58% |
ROW 3 | Property B | 56% |
ROW 4 | Property C | 77% |
ROW 5 | Property D | 85% |
ROW 6 | Property E | 92% |
ROW 7
ROW 8 |Top 5 | | 'NOTE: I used a custom formula to format:
"Top "#. Please do not think this is text.
ROW 10|Occupancy | **CODE HERE**| 'Format is "General"
ROW 11| | |
ROW 12| | |
Here's how I initially coded it in B10:
'=AVERAGE(OFFSET(CELL("address",INDEX($A$1:$B$1,MATCH(A10,$A$1:$B$1,0))),1,0,5,1))
I am getting "there is a problem with your formula", and I am guessing it is because CELL returns text instead of the cell reference as a value. So I error-checked myself, stripped out the AVERAGE and OFFSET, and put this in B11:
'=CELL("address",INDEX($A$1:$B$1,MATCH(A10,$A$1:$B$1,0)))
I know I get $B$1, so I then put $B$1 to replace the CELL function within B10 to see if my average offset is wrong:
=AVERAGE(OFFSET($B$1,1,0,5,1))
I now get the right solution since I went and hardcoded the reference, which does not solve the issue.
If I use INDIRECT and then CELL in B12:
'=INDIRECT(CELL("address",INDEX($A$1:$B$1,MATCH(A10,$A$1:$B$1,0))))
I then get the title "Occupancy" and not the cell reference.
Obviously, I am aware of the fact that the CELL function will return text and not a value. I can't use INDIRECT because when I do, INDIRECT will just display the title of the column in text and not return the cell reference as a value. ADDRESS is not useful because you have to hardcode and the goal is to have zero hardcodes.
How can I code finding the cell reference of specific text in a table without actually going to the table to cell reference when I am building out my OFFSET function?
As you can see, I went through a variety of ways to dissect the problem and I would appreciate anyone's help to come to a solution. Thanks in advance.

Excel average with missing values

I have data like the following in cells A1:F4.
Quarter | FY15Q4 | FY16Q1 | FY16Q2 | FY16Q3 | FY16Q4
Company A | 0.34% | 0.48% | 0.55% | 0.68% | ------
Company B | 0.32% | 0.36% | 0.34% | 0.35% | 0.35%
Company C | 1.18% | ------ | ------ | ------ |
I'm trying to find the average of the most recent non-missing value from last 4 columns in each row. So for:
FY15Q4, I want the average of 0.34%, 0.32%, and 1.18%
FY16Q1, I want the average of 0.48%, 0.36%, and 1.18%
FY16Q2, I want the average of 0.55%, 0.34%, and 1.18%
FY16Q3, I want the average of 0.68%, 0.35%, and 1.18%
FY16Q4, I want the average of 0.68% and 0.33% (Company C has no data for the most recent 4Qs, so it is to be ignored from the calculation of the average)
The following array formula works as I want for FY16Q3...
{=AVERAGE(IF(COUNT(B2:E2)=0,"",INDIRECT(ADDRESS(ROW(B2:E2),MAX((B2:E2<>"")*COLUMN(B2:E2))))),IF(COUNT(B3:E3)=0,"",INDIRECT(ADDRESS(ROW(B3:E3),MAX((B3:E3<>"")*COLUMN(B3:E3))))),IF(COUNT(B4:E4)=0,"",INDIRECT(ADDRESS(ROW(B4:E4),MAX((B4:E4<>"")*COLUMN(B4:E4))))))}
But for FY16Q4, the same formula structure...
=AVERAGE(IF(COUNT(C2:F2)=0,"",INDIRECT(ADDRESS(ROW(C2:F2),MAX((C2:F2<>"")*COLUMN(C2:F2))))),IF(COUNT(C3:F3)=0,"",INDIRECT(ADDRESS(ROW(C3:F3),MAX((C3:F3<>"")*COLUMN(C3:F3))))),IF(COUNT(C4:F4)=0,"",INDIRECT(ADDRESS(ROW(C4:F4),MAX((C4:F4<>"")*COLUMN(C4:F4))))))
returns the #VALUE error value.
It seems that the AVERAGE function, which usually deals well with blank cell ("") values, is struggling because of the added complexity of the array formula.
Any suggestions on how I can make this work without either (i) using find-and-replace to replace "" with a truly blank cell; or (ii) using VBA?
Surely there must be a way to make this work using only formulas...
One method without array formulas is to use helper cells. I set up another table with the same column/row headers.
Then in the first cell I put:
=IFERROR(INDEX($A:$F,MATCH($A12,$A:$A,0),IF(MATCH(1E+99,INDEX($A:$F,MATCH($A12,$A:$A,0),1):INDEX($A:$F,MATCH($A12,$A:$A,0),MATCH(B$11,$1:$1,0)))<=MATCH(B$11,$1:$1,0)-4,NA(),MIN(MATCH(1E+99,INDEX($A:$F,MATCH($A12,$A:$A,0),1):INDEX($A:$F,MATCH($A12,$A:$A,0),MATCH(B$11,$1:$1,0))),MATCH(B$11,$1:$1,0)))),"")
Then drag it across and down to fill in the table with the correct number:
Then it is just a simple AVERAGE() formula:
=AVERAGE(B12:B14)
Which when dragged over will ignore the blank cells.
A few caveats pointed out by @Jeeped and @XOR LX:
This only works if the "Blank" are NOT 0 formatted as something else.
As companies and quarters are added the reference table will need the same columns and rows added.
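The logic of the helper table (pick, for each company, the most recent non-missing value within the last four quarters, then average whatever was picked) can be sketched in Python as follows. This is an illustration of the intended calculation rather than a translation of the formula; the function names are hypothetical, `None` stands in for a blank cell, and the values are copied from the table in the question.

```python
quarters = ["FY15Q4", "FY16Q1", "FY16Q2", "FY16Q3", "FY16Q4"]

# Values in percent; None marks a missing value.
data = {
    "Company A": [0.34, 0.48, 0.55, 0.68, None],
    "Company B": [0.32, 0.36, 0.34, 0.35, 0.35],
    "Company C": [1.18, None, None, None, None],
}


def latest_in_window(values, end, window=4):
    """Most recent non-missing value among the `window` columns ending at `end`."""
    start = max(0, end - window + 1)
    for i in range(end, start - 1, -1):
        if values[i] is not None:
            return values[i]
    return None  # nothing in the window: the company is ignored


def quarter_average(data, end, window=4):
    """Average the picked values, skipping companies with no data in the window."""
    picked = [latest_in_window(values, end, window) for values in data.values()]
    picked = [value for value in picked if value is not None]
    return sum(picked) / len(picked) if picked else None


for index, quarter in enumerate(quarters):
    print(quarter, quarter_average(data, index))
```

For FY16Q4 the window no longer reaches Company C's only value, so the average is taken over Company A and Company B alone.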

Change the order of table column in Excel

Well, it's probably not that easy, but here it goes.
I want to arrange the columns of a table in Excel in a specific order.
That is, I want all the columns with the prefix 'Product'/'product' to come at the end in alphabetical order, except that 'Product_ID' should be first among them, and everything else should stay the same.
For example:
Cust_ID | Name | Product_Quantity | product_Name| Product_ID | Price_per_quantity
1 | Rohit | 4 | Pen | A23 | $2
2 | Tim | 3 | Pot | P41 | $3
to
Cust_ID | Name | Price_per_quantity | Product_ID | product_Name | Product_Quantity
1 | Rohit | $2 | A23...(respective columns data)
I would like a general approach (with or without a VBA script) that works for n columns, since the real table contains more columns.
Also (second question): how to reverse the columns irrespective of their values, which has already been solved by fellow SO members.
Assuming a is in A1, insert a row at the top and populate with:
=COLUMN()
copied across above all populated columns. Select all populated columns and sort by Row1 Largest to Smallest.
Follow these steps
Insert a top row for dummy purpose with values 1, 2, 3, 4
Copy all data including this dummy row -> paste special -> transpose in some other location
Sort the transposed data using "Largest to smallest" with "Expand" option
Now again copy the data, this time without the first dummy column -> paste special -> transpose.
Phew...
Picture attached.
Select the Cells you want to sort. Right Click on them, choose Sort -> Custom Sort
Click on Options and choose Left to Right. And set the settings as in the picture:
And press Ok.
This answer is an addendum to the response given by @pnuts, which, if possible to use, is far simpler. If you cannot sort by column for some reason, one way to achieve your result is to take the transpose of the input (i.e. swap rows and columns), sort alphabetically by row, then transpose back to get the original format. Here is a step-by-step guide for how to do this:
Highlight a 4x3 region of the spreadsheet where you want the transposed data to go. Then enter a transpose array formula, which is =TRANSPOSE($A$1:$D$3) in the example above. Note carefully that the cell ranges are absolute ($) so that they don't change as Excel copies the formula over the range. Also remember to press CTRL + SHIFT + ENTER, rather than just ENTER, to tell Excel that you are entering an array formula. Next, copy this data to a new location (values only), and sort in descending order by the letter column:
The final step is to take the transpose of this sorted data to obtain your final result:
Again, this involved using the TRANSPOSE function with an array formula.
Now you have your original data sorted by column.
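The transpose, sort, transpose-back idea can also be sketched in Python, here applied to the ordering the question asks for rather than a plain reversal. This is a minimal illustration under stated assumptions: the sort key (`column_key`) and the function `reorder` are hypothetical, the prefix test is case-insensitive, and the data is the demo table from the question.

```python
header = ["Cust_ID", "Name", "Product_Quantity", "product_Name",
          "Product_ID", "Price_per_quantity"]
rows = [
    [1, "Rohit", 4, "Pen", "A23", "$2"],
    [2, "Tim", 3, "Pot", "P41", "$3"],
]


def column_key(indexed_column):
    """Sort key: other columns keep their order, then Product_ID, then product_* A-Z."""
    position, column = indexed_column
    name = column[0].lower()  # column[0] is the header cell
    if not name.startswith("product"):
        return (0, position, "")
    if name == "product_id":
        return (1, 0, "")
    return (2, 0, name)


def reorder(header, rows):
    # Transpose: each tuple is one column (header cell followed by its values).
    columns = list(zip(header, *rows))
    ordered = [column for _, column in sorted(enumerate(columns), key=column_key)]
    # Transpose back into a header row and data rows.
    new_header, *new_rows = [list(row) for row in zip(*ordered)]
    return new_header, new_rows


new_header, new_rows = reorder(header, rows)
print(new_header)
for row in new_rows:
    print(row)
```

Because whole columns are moved as units, each value stays with its header, which is the same guarantee the "Expand" option gives when sorting the transposed data in Excel.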

Find one cell inside a string and then populate another cell based off of that

I have a basic financial spreadsheet that I am trying to make even nicer. I have a table that looks like this:
| Description | Category | Amount
| 2341234 Chick-fil-a | Eating Out | 5.89
| 11234 Redbox/Georgetown | Entertainment | 1.21
And another table that looks like this:
| Name | Category |
| Chick-fil-a | Eating out |
| Redbox | Entertainment |
I want to find the strings from the first column of the second table in the first column of the first table and automatically populate the second column of the first table. I am pretty sure I can find the position of the string by doing a {=find(2ndSheet!A:A,1stSheet!A:A)} or something of the like, but I have no idea how to populate another cell based off of that.
Does it have to be a macro?
This is a bit ugly but seems to work. It handles the situation where your description may not necessarily match the lookup table (for instance, with Redbox | Redbox/Georgetown). Replace the [Lookup_Table] with the absolute range of your lookup table, and enter as an array formula (Control+Shift+Enter):
=INDEX([Lookup_Table],MATCH(TRUE,NOT(ISERR((FIND([Lookup_Table],A2)))),0),2)
As for what it's doing:
=INDEX(
    [Lookup_Table],
    MATCH(
        TRUE,
        NOT(ISERR((FIND([Lookup_Table],A2)))),
        0),
    2)
It's using an INDEX formula with the [Lookup_Table] as the base range. For the row to match, it looks for instances where running the FIND formula using the current cell does not cause an error. So in the case of Redbox, you'd get an array like {#VALUE!, 7}. You then set the values to TRUE/FALSE based on whether they are errors, and then flip them so that non-errors are True. You then match the TRUE condition against that array, and a hit will return the row number in your lookup table. You then use that row number and column 2 to find the corresponding value.
Again, that is not a pretty one (and there is probably a better way to handle all of your cases), but it seems to work in my extremely exhaustive test of three rows :)
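The same "first lookup name that occurs inside the description" logic can be sketched in Python, which may make the INDEX/MATCH/FIND chain easier to follow. This is only an illustration; `lookup_table` and `category_for` are hypothetical names, and the data comes from the two tables in the question. Like FIND, the `in` test below is case-sensitive.

```python
# (Name, Category) pairs from the second table in the question.
lookup_table = [
    ("Chick-fil-a", "Eating out"),
    ("Redbox", "Entertainment"),
]


def category_for(description, lookup_table):
    """Return the category of the first lookup name found inside the description."""
    for name, category in lookup_table:
        # Equivalent to NOT(ISERR(FIND(name, description))).
        if name in description:
            return category
    return None  # no match; the Excel formula would return #N/A here


for description in ("2341234 Chick-fil-a", "11234 Redbox/Georgetown"):
    print(description, "->", category_for(description, lookup_table))
```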
Assuming that your first table starts in column A, entering this formula in B2 and copying it down should find the value for you.
=VLOOKUP(RIGHT(A2,LEN(A2)-FIND(" ",A2,1)),[Range of Helper Table],2,FALSE)
You'll need to substitute [Range of Helper Table] with the actual range. Make sure you use absolute references using $ (for example, 2ndSheet!A$2:B$20) so that you can copy and paste.

Excel: SUMIF depending on number in field

Hi All You Amazing People
Update
You know what, I should let you know that I am actually trying to do this with numbers and not alphabets. For instance, I have a field with value like 225566 and I am trying to pick out fields which have 55 in them. It is only now I realize this might make a huge difference
ColumnA | ColumnB |
225566 | 2
125589 | 3
95543 | 2
(Below is what I had asked first and later realized I wasn't asking the right question.)
*Let's say I have a table like this:
ColumnA | ColumnB |
AABBC | 2
AADDC | 3
ZZBBC | 2
Now how could I get a SUMIF for those rows where Column A has a field with BB in it? Assume that there are hundreds of rows. I realize that I have to borrow something conceptually from the way text to column is done. But I wonder if anyone would know how I could do this. Thanks a lot.*
Since you're trying to do this on numbers, you'll need to use an array formula.
If your test values are in A3:A5 and your values to sum are in B3:B5, this will work:
=SUM( IF(ISERROR(FIND("55", TEXT(A3:A5,"#"))), 0, 1) * B3:B5 )
When entering an array formula, use Ctrl-Shift-Enter rather than just hitting Enter.
This sums the product of the sum value and a 0 or 1 from the IF() statement, which tests whether or not each test value, after being converted to text, contains a "55".
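For comparison, the same test (convert each number to text, check whether it contains "55", and add up the matching amounts) can be sketched in Python. The data is the sample from the question's update; `sum_if_contains` is a hypothetical name used only for this illustration.

```python
# (ColumnA, ColumnB) pairs from the question's update.
rows = [
    (225566, 2),
    (125589, 3),
    (95543, 2),
]


def sum_if_contains(rows, pattern):
    """Sum ColumnB for rows whose ColumnA, written as text, contains `pattern`."""
    return sum(amount for value, amount in rows if pattern in str(value))


print(sum_if_contains(rows, "55"))
```

With this sample data every value in ColumnA contains "55", so all three amounts are included in the total.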
I think you will need a matrix/array formula to do this:
{=SUM(IF(ISERROR(FIND("55",A2:A4,1)),0,1))}
The curly brackets {} indicate that it is a matrix formula; you get them by pressing SHIFT+CTRL+RETURN instead of RETURN when editing the formula.
This formula cycles through the range A2:A4, checks whether it finds "55" inside each value and, if so, adds 1 to the sum (so it counts the matching rows rather than summing column B).
Google array/matrix formulas as they are not self explanatory.
Best
Jan
In Excel 2003 and 2007 (and possibly earlier versions, I cannot test), you can use * as a wildcard character in the match. For example, with your sample data set C1 to
=SUMIF(A1:A3,"*BB*",B1:B3)
and you should see the value 4.
Create a 3rd column (ColumnC) and put this formula in it:
=Text(A2,0)
Drag that column down to complete your column. This will format the value as text. Next, use SUMIF as DocMax explained, except with different columns:
=SUMIF(C1:C3,"*BB*",B1:B3)
The reason you do this is because you need to be reading a Text value, not a Number value when using the *BB* comparison of SUMIF. Great question.

Resources