I am stuck trying to count the number of rows that meet a certain criteria. I have a large database in excel. Now on column A and B, I have a lot of text in cell, but I want to count every row if cell A or B meets my criteria.
Let's say column A has some notes from the morning and column B from the afternoon, and every rows represents a day. Now, let's say I want to count how many days I have mentioned "shopping", morning or evening. I would need a function like this:
// COUNTIF_OR(<range_1>,<criteria_1>,<range_2>,<criteria_2>)
COUNTIF_OR(A1:A100;"*shopping*";B1:B100;"*shopping*")
Most solution I could find on the internet counts the rows if A AND B meet the criteria, instead of A OR B.
Please, help me!
Since your two columns are contiguous, you could use:
=SUMPRODUCT(N(MMULT(N(ISNUMBER(SEARCH("shopping",A1:B100))),{1;1})>0))
though I would be tempted by:
=SUM(COUNTIFS(A:A,{"=";"<>";"="}&"*shopping*",B:B,{"<>";"=";"="}&"*shopping*"))
which, notwithstanding its rather obtuse nature and the fact that it is not readily extendible to more than two columns, does nevertheless mean that you can continue to employ COUNTIFS and so benefit from referencing entire columns (A:A, B:B) with virtually no detriment to calculation performance (which cannot be said for the SUMPRODUCT/MMULT set-up).
Related
Sorry for the macabre subject matter, but I need to add the ages up of a group of people IF they have died.
My data is structured as follows:
Sheet A has the list of people to analyze.
Sheet B is an essential database of all people, with a Yes/No column for if they are alive, and another column for their age.
Basically, I want to get the SUM of ages of the people on a list on Sheet A IF the person is dead.
In my head, it combines the VLOOKUP, SUM and IF functions, but not quite sure how. I want the result in ONE cell.
Thanks for any suggestions!
For formula matters, let's assume the data is structured as below, with the highlighted cell where I want the formula to say 60:
Due to the data structure you provided if you only want to sum the names in Group 1 that are dead then you would use:
=SUMPRODUCT(SUMIFS(E:E,F:F,"No",D:D,B1:B3))
If the Names are not important then use #JoeJam's answer.
You can use =SUMIF(C2:C4,"No",B2:B4) where the age is in column B and Yes/No is in column C.
So, I've searched for an answer to this, but I can't find anything. Hopefully some Excel guru out there has an easy answer.
CONTEXT
I have a sheet that has two columns; a list of airport codes (Col A) and a list of fuel gallons (Col B). Column A has a bunch of duplicate entries, column B is always different. It's basically a giant list of fill-up events for aircraft over time at different airports. The airports can be the same, because it's one row per fill-up event.
PROBLEM
What I want to do is have a formula that takes the enter data set, finds all identical entries in Col A, sums the Col B values for the matches, and spits out the result on a separate sheet with one entry for every set/match.
OTHER STUFF
I do not have a reference list for Column A and I would rather not create one since there are thousands of entries. I would like to just write a formula that does all this at once, using the data itself as the reference.
All the answers I find are "create a reference list on a separate sheet", and it's driving me crazy!
Thanks in advance for any help!
-rt
Sounds that you need a formula version of remove duplicated for column A, and a simple sumif for column B.
Column A
=IFERROR(INDEX(Data!A$1:A$1000,SMALL(IF(
MATCH(Data!A$1:A$1000,Data!A$1:A$1000,0)=ROW(Data!A$1:A$1000),ROW(Data!A$1:A$1000)),ROW())),"")
Array Formula so please press Ctrl + Shift + Enter to complete it. After that you might see a {} outside the formula.
Column B
=SUMIF(Data!A$1:A$1000,A2,Data!B$1:B$1000)
Just change the range for your data.
Reminders: The formula in columnA should starts from Row#1, or you have to add some offset constant for adjustments.
Since the returning value of MATCH() represents the position of the key in the given array. If we wanted it to be equal to its row number, we have to add some constant if the array is not started from ROW#1. So the adjustment of data in Range(B3:B1000) is below.
=IFERROR(INDEX('Event Data'!B$3:B$1000,SMALL(IF(
MATCH('Event Data'!B$3:B$1000,
'Event Data'!B$3:B$1000,0)+2=ROW('Event Data'!B$3:B$1000),
ROW('Event Data'!B$3:B$1000)),ROW())-2),"")
Further more, the range should exactly the same as the data range. If you need it larger than the data range for future expandability, an IFERROR() should added into the formula.
=IFERROR(INDEX('Event Data'!B$3:B$1000,SMALL(IFERROR(IF(MATCH(
'Event Data'!B$3:B$1000,'Event Data'!B$3:B$1000,0)+2
=ROW('Event Data'!B$3:B$1000),
ROW('Event Data'!B$3:B$1000)),FALSE),ROW())-2),"")
Lastly, I truly recommended that you should use the Remove Duplicated built in excel since the array formula is about O(n^2) of time complexity and memory usage also. And every time you entered any data in even other cells, it will automatically re-calculate all formulas once the calculation option in your excel is automatic. This will pull down the performance.
This sounds simple, but I'm getting a real headache trying to figure it out:
I have two tables in excel in the same workbook but on different sheets. I want to count unique items in column B on the first table that meet a criteria (based on the data that's in column A of that table) and appear on the second table (on a different worksheet).
Because the data I'm working with is confidential, I've made up the two tables below (I've just clipped .jpgs). They are similarly formatted, but in reality I have much more data.
I need a formula that counts the number of unique people in Column B of Table 1 who also appear in Column B of Table 2 and whose date (in Column A of Table 1) is on or before 4/2/2016.
In this example it should come out with the answer three (for Bob, Jim, and Sue).
Table 1
Table 2
Any help you can provide would be hugely appreciated!!
If you put this =IF(AND(COUNTIF('Second Table'!$B:$B,'First Table'!$B2)>0,$A2<DATE(2016,4,2),SUMIF(B1:$B1,B2,$C$1:$C1)=0),1,"") in C2, then auto-fill it down would do it and you could sum it then at the bottom..
I'm looking for a way to count the number of times two cells appear side by side in Excel - like intersections. Sometimes in my data (of about 550 records) A Road will appear next to B Road, which is a count of 1. If it occurs again, later in the data the count would be 2. But if B Road appears in the first column and A Road appears in the second column, I can't find a way to make that number 3 in the count.
I've tried concatenating the data, but I need to be able to write this formula without inserting specific criteria (like searching for A Road) because it would be easier in that case to do this manually. Does anyone know if there's a formula for find the occurrence of the same two variables between two columns without specific criteria?
If the order of values in the two columns were important (i.e. A Road followed by B Road is different than B Road followed by A Road), then a simple pivot table would provide the counts you need. You would just put Col M on the rows, Col N on the columns and the count of any field as the value.
But the OP has said that A Road followed by B Road should be counted in the same total as B Road followed by A Road. Let's modify the concatenation in Col O to become =IF(M2<N2,CONCAT(M2," & ",N2),CONCAT(N2," & ",M2)). That provides a canonical form of each combination regardless of order. Having done that, then it is again an easy matter to create a pivot table that shows all the required counts -- just put the concatenated value on the rows and the count as the value.
If I correctly understand your intent, then try this array formula : =SUM(IF(EXACT(B$2:B3&C$2:C3,B3&C3),1,""))+SUM(IF(EXACT(B$2:B3&C$2:C3,C3&B3),1,"")).The second sum formula accounts for any reverse order of adjacent street occurrences.Then copy down the column as needed. Enter with Ctrl+Shift+Enter.
I'm performing array calculations that are taking a long time to complete. I'd like to optimize my formulas some more. All of the formulas are of the same nature - they perform some high-level function (Average, Slope, Min, Max) across a column of values. However, not all cells in a column are included in the array. I use multiple IF criteria to choose which cells get included. All comparisons are made to the current row. Here's an example of the data:
A B C D E
1 Company Generation Date Value ToCalculate
2 Abc 1 1/1/2010 5.6
3 ... ... ... ... ...
E would look something like this
{=Average(If(A2=A2:A1000, If(B2=B2:B1000, If(C2 > C2:C1000, D2:D1000))))}
So once E2 is calculated then I have to autofill down column E. Column F, G, H, ... Uses the same approach, either selects different values to operate on or a different function to perform. My dataset is quite large, and with only a few of these the spreadsheet is taking an hour plus to compute. Every so often I'll add a fourth criteria, all other criteria being the same.
Is there an efficiency? Some thoughts:
Can I use a single array per column instead of thousands per column?
Can I condense the first three criteria so that the output is row numbers? Perhaps then subsequent formulas won't have to search for multiple criteria but can just perform the function?
or somehow build the crtieria up? So a new column returns all rows where the company is the same. another column returns all rows from the first column where generation is the same...and so on...
For the Average you can do without arrays:
=AVERAGEIFS(D2:D$1000,A2:A$1000,A2,B2:B$1000,B2,C2:C$1000,"<="&C2)
As there is also a COUNTIFS and a SUMIFS, I think your slopes could be calculated the same way.
For the rest of the functions (max, min, etc), we should analyze case by case.
I did a slight performance test, and this is apparently better, but of course my datasets are just mocked.
HTH!
Note: Excel 2007 and up only!
Edit - Answering your comment.
Without knowing the dimensions of the problem is difficult to give advice, but I'll risk one anyway:
You could write a VBA function that:
1) Generates a new sheet for each company-generation pair
2) Sorts the data in those sheets by date
3) Adds the formulas to those sheets (no conditionals needed in this context)
4) Recalculates and Gets the results from those formulas and populates the original sheet
5) Deletes the auxiliary sheets
To capture the rows and re-use try this approach:
Sort the data by Company & Generation.
Make a unique list of Companies & generations (use Advanced Filter, Unique Only, Copy)
For each Company generation pair in the list build 2 columns of formulae. First column gives the count of rows in the data for this pair (use COUNTIFS), second column gives the first row in the data for this pair (=first row for previous pair+count of rows for previous pair). Then you can use a function like OFFSET to return only the rows of data for the Company-Generation pair and embed this inside the final function/array formula (AVERAGEIFS etc) You could extend this sort and count approach to include dates if you wanted. There is a drawback that if the list of cities and generations change you have to change the list of uniques and associated formulas. There are examples of this approach on my website athttp://www.decisionmodels.com/optspeedk.htmhttp://www.decisionmodels.com/optspeedj.htm