I am working on a data sheet that has almost 300,000 rows by about 40 columns.
I have a countifs function to count the number of rows that have an entry ranging from "A1" through "A5" for each letter A-G in a particular column.
I have broken out the analysis onto separate sheets to pull data for each row for each separate letter A-G using countifs(range,"other data","F?") (I know it's simplified).
I need to create a new sheet that excludes any row with an A value in it.
I tried countifs(range,"other data", range,{"B?","C?","D?","E?","F?","G?"}) and it only returns the count for the outside values (B and G). How do I get Excel to count all of those other values as well? I would like to keep this format because, to create the sheets for B-G, I just used find and replace to replace "A?" with "B?" and so on for the other sheets.
I would like to just replace "B?" with whatever works to count the number of rows that have B-G in that particular column.
Your COUNTIFS formula, with an array constant for criteria, returns an array of values. But what you want is the SUM of that array. So:
sum(countifs(range,"other data", range,{"B?","C?","D?","E?","F?","G?"}))
Without the sum function, you will only see the value of the first element of that array.
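To see why the SUM wrapper is needed, here is a minimal Python sketch of what COUNTIFS does with an array constant: it evaluates once per pattern and yields one count per pattern. The column values and the helper name countif_wildcard are made up for illustration, and fnmatch is only an approximation of Excel's ? wildcard.

```python
from fnmatch import fnmatchcase

def countif_wildcard(values, pattern):
    """Count values matching an Excel-style wildcard pattern (? = one character).
    Excel matches case-insensitively, so both sides are lower-cased."""
    return sum(fnmatchcase(v.lower(), pattern.lower()) for v in values)

# Hypothetical column contents, not taken from the question's data
column = ["A1", "B2", "C5", "A3", "G4", "D1", "B1"]
patterns = ["B?", "C?", "D?", "E?", "F?", "G?"]

per_pattern = [countif_wildcard(column, p) for p in patterns]  # one count per pattern
total = sum(per_pattern)  # the equivalent of SUM(COUNTIFS(...))
```

Here per_pattern plays the role of the array that COUNTIFS returns, and total is the single number you actually want in the cell.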
I have a feeling this is the wrong answer, but I'll say it anyway. Why can't you use
=COUNTIFS(Range,"<>A?")
Or are there other possible values that you want to exclude?
In which case you should be able to use this to count everything except A
=COUNTIFS(Range,">=B1",Range,"<=G5")
and this to count everything except B1-B5
=COUNTIFS(Range,">=A1",Range,"<=A5")+COUNTIFS(Range,">=C1",Range,"<=G5")
which can be modified for C, D, E and F
and this to count everything except G
=COUNTIFS(Range,">=A1",Range,"<=F5")
Related
I have been trying and searching for how to append two lists in Excel to use in a formula. The lists do not exist in columns; they are created using a formula. I want to combine the two lists into a single one, not to show the values but to use the new list in a formula. I am using Excel 365 (UNIQUE function). Let me replace my initial text with a real small case.
I have an excel file with 3 work sheets. Sheet1 is:
Sheet2 is:
Now I want to run some analysis in Sheet3. In my example I want to count how many unique values from column A have column B containing one of the letters 'a', 'b', 'c', or 'd'. For instance, in Sheet1, the letter 'a' appears in all rows. Column A has 3 unique values. So my result for 'a' is 3. The letter 'b' does not appear for the case where column A is '3'. Therefore the result for 'b' is '2'.
So I create a Sheet3 to show my results. The first column contains a list of letters {a, b, c, d}. I then use the formula:
=COUNT(UNIQUE(FILTER(Sheet1!$A$1:$A$100, ISNUMBER(SEARCH(A1, Sheet1!$B$1:$B$100)))))
From inside out: the SEARCH function looks in cells B1 to B100 (I can live with specifying a larger range) for the position of the value specified in column A of the current sheet. If the value is found, SEARCH returns a number. I check whether the return value is a number (ISNUMBER) and use this to filter the values in column A of Sheet1. I then apply the UNIQUE function to these values and finally count them.
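The same inside-out pipeline can be sketched in Python. This is only an illustration: the sheet contents below are hypothetical, chosen so that 'a' gives 3 and 'b' gives 2 as described above, and the helper name count_unique_matches is made up. SEARCH's wildcard handling is not modelled.

```python
def count_unique_matches(rows, letter):
    """Mirror COUNT(UNIQUE(FILTER(A, ISNUMBER(SEARCH(letter, B))))).
    rows is a list of (a_value, b_text) pairs; SEARCH is case-insensitive."""
    kept = [a for a, b in rows if letter.lower() in b.lower()]  # FILTER + SEARCH
    return len(set(kept))  # UNIQUE, then COUNT

# Hypothetical Sheet1 contents (column A value, column B text)
sheet1 = [(1, "a, b"), (1, "a, c"), (2, "a, b, d"), (3, "a"), (3, "a, c")]

counts = {letter: count_unique_matches(sheet1, letter) for letter in "abcd"}
```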
Then I do the same with values in Sheet2. And it works. This is the output:
Column B is the number of unique values (as specified above) from Sheet1 and Column C the same from Sheet2.
So far so good. But now I want to have the counting of unique values globally. Not for each Sheet. One cannot just add the values from column B and C, as there might be an overlap. For example, the result for 'a' should be 3, not 5.
The solution here would be to grab the two unique lists (one from Sheet1 and the other from Sheet2), join them, UNIQUE this new list, and count. How do I join them? That is my question.
Note that this 'counting of unique values' is just an example. I might want to find the maximum, or sort them, or find only prime numbers, or the average, or the median, or something else. So I need a general approach to join the results.
I got options close to a workable thing when all the data is in the same worksheet.
Finally, note that the data size I have is not huge, but it is large (thousands of lines at the most).
Here is something you could try:
=LET(x,{"A","B","C"},y,{"D","E"},z,CHOOSE({1,2},x,y),cnt,MAX(COUNTA(x),COUNTA(y)),seq,SEQUENCE(cnt*2),final,INDEX(z,MOD(seq-1,cnt)+1,CEILING(seq/cnt,1)),FILTER(final,NOT(ISERROR(final))))
Here both 'x' and 'y' variables are placeholders for your two (vertical) arrays. In this case I used: {"A","B","C"} and {"D","E"}. Assuming you just want to place the 2nd array directly under the 1st one, the above suggestion does just that:
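The index arithmetic in that formula can be traced with a short Python sketch. It assumes the two inputs are plain lists, and it uses None where the Excel version would see an error in the padded cells; the function name stack_columns is only for illustration.

```python
import math

def stack_columns(x, y):
    """Place y directly under x using the same index arithmetic as the LET formula."""
    cnt = max(len(x), len(y))
    # z: a cnt-row, 2-column grid; cells past the end of the shorter list are None
    # (the Excel version has an error value there instead)
    z = [[x[i] if i < len(x) else None, y[i] if i < len(y) else None]
         for i in range(cnt)]
    final = []
    for seq in range(1, cnt * 2 + 1):       # SEQUENCE(cnt*2)
        row = (seq - 1) % cnt + 1           # MOD(seq-1,cnt)+1
        col = math.ceil(seq / cnt)          # CEILING(seq/cnt,1)
        final.append(z[row - 1][col - 1])   # INDEX(z,row,col), 1-based
    return [v for v in final if v is not None]  # FILTER out the padding

stacked = stack_columns(["A", "B", "C"], ["D", "E"])
```

Once the two lists are stacked into one, the result can be passed to whatever comes next, for example len(set(stacked)) for a count of unique values.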
The blue columns are the given data and the red columns are what is being calculated. The table to the right is what I am referencing. So, F2 will be calculated by the following steps:
Look at the Machinery column (D), if the cell contains LF, select column K, otherwise select column L
Look at the Grade column (E), if the cell contains RG, select rows 4:8, otherwise select rows 9:12.
Look at the Species column (A), if the cell contains MS, select rows 5 and 10, otherwise.......
Wherever the most-selected cell is in columns K and L, copy it into column F.
Multiply column F by column C.
I don't want to make another column for my final result. I did in the picture to show the two steps separately. So column F should be the final answer (F2 = 107.33). The reference table can be formatted differently as well.
At first, I tried using nested IF statements, but realized that I would need 20+ IF statements to cover all the different outcomes. I think I would want to use the SEARCH function to find whether or not the cell contains a specific piece of information. Then I would probably use some combination of MATCH, IF, VLOOKUP, INDEX and SEARCH, but I am not sure how to condense these.
Any suggestion?
SUMPRODUCT is the function you need. I quickly created some test data on the lines of what you shared like this:
Then I entered the below formula in cell F2
=SUMPRODUCT(($I$4:$I$9=E2)*($J$4:$J$9=LEFT(A2,FIND(" ",A2)-1))*IF(ISERROR(FIND("LF",D2,1)),$L$4:$L$9,$K$4:$K$9))
The formula may look a little scary but is indeed very simple as each sub formula checks for a condition that you would want to evaluate. So, for example,
($I$4:$I$9=E2)
is looking for rows that match GRADE of the current row in range $I$4:$I$9 and so on. The * ensures that the arrays thus returned are multiplied and only the value where all conditions are true remains.
Since some of your conditions require looking for partial content like in Species and Machine, I have used Left and Find functions within Sumproduct
This formula simply returns the value from either column K or L based on the matching conditions and you may easily extend it or add more conditions.
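For readers who find the formula hard to parse, here is a rough Python sketch of the same logic. The reference table, the species and machinery strings, and the function name lookup_rate are all hypothetical; note also that Excel's = comparison ignores case, while Python's == does not.

```python
def lookup_rate(species, machinery, grade, table):
    """Mirror the SUMPRODUCT formula: multiply the condition masks by the chosen
    value column and add up the result."""
    species_code = species.split(" ")[0]   # LEFT(A2, FIND(" ", A2) - 1)
    use_lf = "LF" in machinery             # FIND("LF", D2) succeeds
    total = 0.0
    for row_grade, row_species, lf_value, other_value in table:
        # Each comparison is True/False; multiplying gives 1 only if both hold
        match = (row_grade == grade) * (row_species == species_code)
        total += match * (lf_value if use_lf else other_value)
    return total

# Hypothetical reference table: (grade, species code, column K value, column L value)
table = [
    ("RG", "MS", 1.5, 2.5),
    ("RG", "SP", 3.0, 4.0),
    ("LG", "MS", 5.0, 6.0),
    ("LG", "SP", 7.0, 8.0),
]

rate_lf = lookup_rate("MS pine", "LF-200", "RG", table)
rate_other = lookup_rate("SP oak", "HX", "LG", table)
```

Because every non-matching row contributes zero, the sum equals the single value from the one row where all conditions hold, provided the reference table has no duplicate combinations.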
I understand that VLOOKUP searches the first column of a table in order to find a value, then it grabs the value from the same row and a different user-specified column. The following code returns data from the 2nd column, column B.
VLOOKUP(5,$A$2:B100,2)
Is there a way to set the return column to the last column of the input table? Something like the following, which would return data from columns B, P, and AC, respectively.
VLOOKUP(5,$A$2:B100,end)
VLOOKUP(5,$A$2:P100,end)
VLOOKUP(5,$A$2:AC100,end)
Alternatively, is there a way to grab the current column number and use that as an index?
VLOOKUP(5,$A$2:B100,current_column_number)
I'd like to write one VLOOKUP formula and then be able to drag it right across the spreadsheet, so that B100 becomes C100, D100, E100, etc. and the column lookup changes accordingly.
Update
I can do the alternate approach using the COLUMN function, but it requires programming a fixed offset and doesn't seem as robust. I'd still like to know if there is an "end" option.
=VLOOKUP(5,$A$2:B100,COLUMNS($A$2:B100))
Unfortunately, you cannot simply drag it; you'll need to replace the range in both places, as the same range is written twice in the nested function.
The COLUMNS effectively counts the columns in the range giving the exact result needed for the VLOOKUP's end variant.
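As a rough illustration of what "use the column count as the index" means, here is a Python sketch of an approximate-match lookup (VLOOKUP's behaviour when the fourth argument is omitted) that always returns the last column. The table data and the name vlookup_last are invented for the example.

```python
from bisect import bisect_right

def vlookup_last(value, table):
    """Approximate-match lookup returning the last column, i.e. COLUMNS(table)
    used as the column index. Assumes the first column is sorted ascending,
    as approximate match requires."""
    keys = [row[0] for row in table]
    pos = bisect_right(keys, value)   # number of rows whose key is <= value
    if pos == 0:
        return None                   # Excel would return #N/A here
    row = table[pos - 1]
    return row[len(row) - 1]          # last column of the range

# Hypothetical three-column table
table = [(1, "a", "x"), (5, "b", "y"), (9, "c", "z")]
exact = vlookup_last(5, table)
nearest_below = vlookup_last(7, table)
```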
EDIT to show OP what a simple drag function would be like:
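Function VLOOKUP2(Expected As Variant, Target As Range) As Variant
    ' Use the number of columns in Target as the return column (the last column)
    Dim lastCol As Long
    lastCol = Target.Columns.Count
    VLOOKUP2 = Application.WorksheetFunction.VLookup(Expected, Target, lastCol)
End Function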
You can use the Excel COLUMN() function to convert the column reference to a numerical index into the VLOOKUP table. Try this:
VLOOKUP(5, $A$2:B100, COLUMN(B2))
VLOOKUP(5, $A$2:P100, COLUMN(P2))
VLOOKUP(5, $A$2:AC100, COLUMN(AC2))
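In practice, you can just enter the first formula I gave above and then copy it to the right. Each copy will automatically shift the column number to the end. This works because the table starts in column A, so the sheet column number equals the column index within the table.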
You could use the count function while holding ($) one side of the count range, thus giving you an integer that Vlookup can use.
Something like:
VLOOKUP(5,$A$2:B100,COUNT($A$2:A2))
You may need to add a + or - 1 to the count function depending on where your range starts.
It's effectively doing the same thing you already did with the array for the vlookup
I'm struggling with integrating a condition into my COUNTIFS statement. I have about 5 conditions which I've been able to easily work in, but I can't figure out the last one. The criteria range would be A1:A40000, and the criteria would count the number that match any value in a list of 30 text strings on Sheet 2, Cells A1:A40. Is this possible? I can get the result without the other conditions. Unfortunately, I do not have the flexibility to add a column next to A1:A40000 that checks to see if it is in the list.
Edit: Clarification per request.
Simplified version of what I'm doing. I need to count the number of items (column A) that meet several conditions depending on the column in the entire data set. So, I need to find the number of items that have a value of "1" in column B - AND - a value of "YES" in column "C" - AND - a value of "OLD" in column "D" - AND - (the part I'm struggling with) column "E" must contain any one of the values that's in a completely separate range (call it Z1:Z40). The formula for the first 3 conditions would be:
=COUNTIFS(B:B,1, C:C,"YES", D:D,"OLD")
The final criteria in bold would be something like:
=COUNTIFS(B:B,1, C:C,"YES", D:D,"OLD", **E:E,isnumber(match(E:E,Z1:Z40,0))**)
But that does not work...
You can simply use the range as a criterion. If you do that then your COUNTIFS function will return an array (one value for each value in Z1:Z40), so you need a function to sum that array. I use SUMPRODUCT because it doesn't require array entry.
=SUMPRODUCT(COUNTIFS(B:B,1,C:C,"yes",D:D,"old",E:E,Z1:Z40))
That approach has some limitations - you can only use two "multi-item" criteria in one COUNTIFS function (and if you do one must be a column, the other a row, or you need to use TRANSPOSE to make it that way), and items in Z1:Z40 should not be repeated (or you may get double counting).
To overcome either of those limitations you can use SUMPRODUCT in place of COUNTIFS - with ISNUMBER(MATCH for the multi-item criterion. If you use SUMPRODUCT like that then it's better to restrict the ranges for efficiency reasons, e.g.
=SUMPRODUCT((B2:B100=1)*(C2:C100="yes")*(D2:D100="old")*ISNUMBER(MATCH(E2:E100,Z1:Z40,0)))
You can add as many ISNUMBER(MATCH criteria as you want and Z1:Z40 can be any single row/column range
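The difference between the two approaches, including the double-counting caveat, can be shown with a small Python sketch. The rows and the list below are hypothetical, and the function names are only for illustration; wildcard handling in criteria is not modelled.

```python
def count_with_list(rows, allowed):
    """Mirror SUMPRODUCT((B=1)*(C="yes")*(D="old")*ISNUMBER(MATCH(E, list, 0))).
    Each row is counted at most once, even if the list repeats a value."""
    allowed_set = {s.lower() for s in allowed}
    return sum(
        1 for b, c, d, e in rows
        if b == 1 and c.lower() == "yes" and d.lower() == "old"
        and e.lower() in allowed_set
    )

def count_per_item(rows, allowed):
    """Mirror SUMPRODUCT(COUNTIFS(..., E:E, list)): one COUNTIFS per list item,
    so a repeated list item is counted twice."""
    return sum(count_with_list(rows, [item]) for item in allowed)

# Hypothetical data: columns B, C, D, E
rows = [
    (1, "YES", "OLD", "red"),
    (1, "YES", "OLD", "blue"),
    (1, "NO", "OLD", "red"),
    (2, "YES", "OLD", "red"),
    (1, "YES", "OLD", "green"),
]
allowed = ["red", "blue", "red"]   # "red" deliberately repeated

once = count_with_list(rows, allowed)
doubled = count_per_item(rows, allowed)
```

With the repeated "red" entry, the per-item version counts the first row twice, which is the double counting mentioned above.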
Let's say all your headers are in row 1 and the real data starts in row 2.
I would add a column on the end and put in the formula
=IF(AND(B2=1, C2="YES", D2="OLD", COUNTIF($Z$1:$Z$40,E2)),"YES","NO")
Then copy that down and any row where Column F was "YES" is a row that met all the criteria.
There is also a way to use wildcards
=countifs(A1:D1;"*yes")
which counts all cells that end with 'yes' (use "*yes*" to count cells that contain 'yes' anywhere)
I use a sheet to enter names of people who work at certain shifts, for example
on column A, the people that work from 8am to 4pm,
on column B the people that work from 4pm till midnight
on column C and beyond, special shifts
etc
This table is A1:N24 and it contains titles (of shifts), names of workers and some special notes, about each worker.
On column R I have a list of workers that I use for data validation/drop down lists, to make the entry of workers' names easier
My question is how I can count the number of cells on the A1:N24 table that contain only names from the R column list, leaving out the title cells and the special notes cells.
The COUNTIF function seems like a logical choice but I couldn't make it work with a range of criteria, my workers list. Maybe the DCOUNTA function could be of use in my case?
Any help would be appreciated
Try this (entered as an array formula)
=COUNT(MATCH(A1:N24,R:R,0))
How it works:
MATCH(A1:N24,R:R,0) returns an array of position numbers where the entry in A1:N24 is found in column R, and #N/A errors where it's not
COUNT( ) counts the numbers in that array, i.e. the number of matching cells
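A rough Python equivalent of that logic, with a made-up shift table and worker list, might look like this (the name count_named_cells is only illustrative):

```python
def count_named_cells(grid, names):
    """Mirror COUNT(MATCH(grid, names, 0)): count grid cells whose text is in the list."""
    lookup = {n.lower() for n in names}   # MATCH ignores case
    return sum(
        1 for row in grid for cell in row
        if isinstance(cell, str) and cell.lower() in lookup
    )

# Hypothetical table with titles, names, a note and an empty cell
grid = [
    ["Morning", "Evening"],
    ["Ann", "Bob"],
    ["Cy", "(on call)"],
    ["Ann", ""],
]
names = ["Ann", "Bob", "Cy"]

named = count_named_cells(grid, names)
```

Titles, notes and empty cells are skipped because they are not in the list, while a name that appears in several cells is counted once per cell.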