How to ignore values already counted in previous rows for cumulative count - excel

I have a dataset that looks like this:
Sample Species1 Species2 Species3 Cumulative count
1 1 1
2 1 1 2
3 1 2
4 2 2
5 1 2 1 3
I would like to count every new species added by each sample. So in the example above, samples 3 and 4 don't add any new species to the total number of species, so their cumulative count remains the same (I am trying to create a species accumulation curve).
I have tried the approach from the question linked below, but cannot get it to work with numbers (values >0, for instance) rather than text:
How to ignore data previously counted by countif and return a specific value in cell
Essentially, I need something to check if the species in the current row were already present in previous rows.
The goal is to produce a graph like this, so I can determine where sampling effort begins to have a diminishing return (in number of species):
Is there an excel formula I could use to fill the 'Cumulative count' column and return the results above? I should also mention that a short solution would be best, because I have 35+ species and the formulas can get long and complicated very quickly. Any assistance would be appreciated.

Going horizontal (testing the whole row at once) is easy and gives a smaller formula:
=SUMPRODUCT((B3:D3>0)*1,(B$2:D2<=0)*1)+E2
.......
=SUMPRODUCT((B6:D6>0)*1,(B$2:D2<=0)*1,(B$3:D3<=0)*1,(B$4:D4<=0)*1,(B$5:D5<=0)*1)+E5
I found a method but this probably can be refined a bit:
For our first Cum Count Row:
=COUNTIF(B2:D2,"<>" & "")
For each thereafter:
=IF(AND(B3>0,SUM(B$2:B2)<=0),1,0)+IF(AND(C3>0,SUM(C$2:C2)<=0),1,0)+IF(AND(D3>0,SUM(D$2:D2)<=0),1,0)+E2
.....................
=IF(AND(B6>0,SUM(B$2:B5)<=0),1,0)+IF(AND(C6>0,SUM(C$2:C5)<=0),1,0)+IF(AND(D6>0,SUM(D$2:D5)<=0),1,0)+E5
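The rule these formulas apply (a species counts as new when its value in the current row is >0 and it has not appeared in any earlier row) can be cross-checked outside Excel. Below is a minimal Python sketch of that rule. The function name `cumulative_species` and the placement of values across the three species columns are assumptions made for illustration; the question's table does not show which column each value belongs to.

```python
def cumulative_species(rows):
    """Return the running count of distinct species seen so far.

    rows: one list of abundances per sample; None or 0 means "not observed".
    """
    seen = set()    # column indices of species already counted
    counts = []
    for row in rows:
        for col, value in enumerate(row):
            if value is not None and value > 0:
                seen.add(col)
        counts.append(len(seen))
    return counts


# Hypothetical column placement, chosen to be consistent with the question.
samples = [
    [1, None, None],
    [1, 1, None],
    [1, None, None],
    [None, 2, None],
    [1, 2, 1],
]
print(cumulative_species(samples))  # [1, 2, 2, 2, 3]
```

Tracking a set of already-seen columns is what keeps the logic short regardless of how many species there are.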

A shorter solution is a UDF, and is provided here by JvdV:
stackoverflow.com/questions/51980149/count-column-if-it-contains-a-filled-cell-in-excel/51980258
Well, I guess if you want to work without any helper rows to use the
COUNTA function, a smooth way could be a UDF, possibly like so:
Function CountColumns(RNG As Range) As Long
    Dim COL As Range
    ' Count every column of RNG that contains at least one non-empty cell
    For Each COL In RNG.Columns
        If Application.WorksheetFunction.CountA(COL) > 0 Then CountColumns = CountColumns + 1
    Next COL
End Function

Related

Count of Excel based on 2 column criteria and counting the 3rd column

I need a count of how many date items fall within Data 1 & Data 2
ie:
x-1 will have a count of 2
x-2 will have a count of 1
x-3 will have a count of 2
y-1 will have a count of 2
What would be the best way to go abouts when approaching this?
Data 1  Data 2  Date
x       1       Date 1
x       1       Date 1
x       1       Date 2
x       2       Date 3
x       2       Date 3
y       1       Date 1
y       1       Date 1
I see only one way to interpret with the available information:
To count the number of times Date_to_test falls within Date_1 and Date_2 (in the formulas below these appear to sit in columns C, A, and B respectively), you could use either a SUM array formula or something like COUNTIFS with an interim helper calculation:
sum approach
=SUM(1*($C$2:$C$11<=$B$2:$B$11)*($A$2:$A$11<=$C$2:$C$11))
countifs + interim calc
helper
=1*(C2<=B2)*(A2<=C2)
(additional column, drag down)
countifs
=COUNTIFS($D$2:$D$11,1)
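Both approaches apply the same test to each row, start <= test date <= end, and then add up the rows that pass. A minimal Python sketch of that test follows. The function name `count_within` and the sample dates are hypothetical and used only for illustration.

```python
from datetime import date


def count_within(rows):
    """Count rows whose test date lies between start and end (inclusive)."""
    return sum(1 for start, end, test in rows if start <= test <= end)


# Hypothetical rows: (Date_1, Date_2, Date_to_test)
rows = [
    (date(2021, 1, 1), date(2021, 1, 31), date(2021, 1, 15)),  # inside
    (date(2021, 2, 1), date(2021, 2, 28), date(2021, 3, 1)),   # after the end
    (date(2021, 3, 1), date(2021, 3, 31), date(2021, 3, 1)),   # on the start boundary
]
print(count_within(rows))  # 2
```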
Alternative
As with the 'sum' approach, SUMPRODUCT variants (e.g. =SUMPRODUCT(1*($C$2:$C$11<=$B$2:$B$11),1*($B$2:$B$11>=$A$2:$A$11))) are calculation- and memory-intensive.
Although the COUNTIFS + helper approach puts more 'visible' data on the sheet, the helper values need only be calculated once. The COUNTIFS can then be evaluated independently (assuming no updates to the helper column), which can make it more memory/calculation efficient, depending on your calculation mode and screen-updating preferences.
Caveat
If, by some misfortune in interpreting your question, you are referring to some other means of establishing whether "date items fall within Data 1 & Data 2", then without knowing what that is, there is very little chance of guessing it correctly.

Multiple Return Vlookup Horizontal match range with Vertical return range

I have binary data running in the horizontal direction: For example the match ranges look like:
Mike 0 1 0 0 0 1
Julie 1 1 0 1 1 0
Joe 1 1 1 0 0 0
And the return Range contains textual data:
Q1: What is the capital of NY?
Q2: What is the capital of Ohio?
Q3: What is the capital of Washington?
.
.
.
I need to match every occurrence of 1 with corresponding data that runs in the vertical direction. i.e. horizontal index corresponding with vertical index. I have found several instances where a multiple return vlookup was accomplished by using:
=IFERROR(INDEX(return_range,SMALL(IF((1=match_range),ROW(match_range)-1),ROW(1:1)),2),"")
However, this isn't working. I assume it isn't working because it is meant for two vertical data sets. I have tried switching "row" for "column" in the function, but didn't have any luck.
Also, the match range and return range are on different sheets.
The match range (in horizontal direction) is binary information on whether a question was answered correctly. The return range is the corresponding set of questions (in vertical direction). Therefore, the output would be an array:
Mike: Q2 Q6
Julie: Q1 Q2 Q4 Q5
Joe: Q1 Q2 Q3
How can this function be modified to accomplish this?
To get the correct row as an array that can then be used in other formulas, we use INDEX:
INDEX($A:$G,MATCH($I2,$A:$A,0),0)
This will return all the values in columns A through G in the row where the name matches the one in I2.
It can then be used in an INDEX/AGGREGATE formula:
=IFERROR(INDEX($A$1:$G$1,AGGREGATE(15,6, COLUMN(INDEX($A:$G,MATCH($I2,$A:$A,0),0))/(INDEX($A:$G,MATCH($I2,$A:$A,0),0)=1),COLUMN(A:A))),"")
My best guess as to your data set up:
Use a formula like this:
=IFERROR(INDEX($I:$I,AGGREGATE(15,6, COLUMN(INDEX($A:$G,MATCH($K2,$A:$A,0),0))/(INDEX($A:$G,MATCH($K2,$A:$A,0),0)=1),COLUMN(A:A))),"")
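The matching logic, pairing the n-th flag in a person's horizontal row with the n-th entry of the vertical question list, can be sketched in Python as a sanity check. The function name `answered_questions` is hypothetical; the flags and expected output are taken from the question, with the questions shortened to their labels.

```python
def answered_questions(scores, questions):
    """Map each name to the questions whose flag in that person's row is 1."""
    return {
        name: [q for flag, q in zip(flags, questions) if flag == 1]
        for name, flags in scores.items()
    }


scores = {
    "Mike": [0, 1, 0, 0, 0, 1],
    "Julie": [1, 1, 0, 1, 1, 0],
    "Joe": [1, 1, 1, 0, 0, 0],
}
questions = ["Q1", "Q2", "Q3", "Q4", "Q5", "Q6"]

for name, qs in answered_questions(scores, questions).items():
    print(name + ":", " ".join(qs))
```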

Excel formula to apply penalty column to ranking

I have thought long and hard about this, but I can't find a solution to what I believe is quite a simple problem.
I have a table of results, where sometimes someone will be given a penalty of a varying amount. This is entered into the penalty column (Col C).
I need a formula which checks if there is an entry into the penalty column and applies it, not only to that row, but to the number of subsequent rows which are affected, depending on the severity of the penalty.
I have tried to see if this is possible by referencing the penalty against the 'ROW()' function but have not been able to achieve the desired effect.
Col D shows the desired output of the formula.
Col E is included for reference only, to show the desired effect on each row.
Col A  Col B      Col C    Col D    Col E
Pos    Name       Penalty  New Pos  Change
1      Jack                1        0
2      Matt                2        0
3      Daniel     2        5        +2
4      Gordon              3        -1
5      Phillip             4        -1
6      Günther             6        0
7      Johann     3        10       +3
8      Alain               7        -1
9      John                8        -1
10     Gianmaria           9        -1
The big issue is, if someone is handed a big penalty, for example '10' then it affects the following ten rows. I can't work out how to include this variable logic...
I would be interested to hear the approach of others...
You need to use the RANK() function:
Excel RANK Function Examples
In a new column, add the penalty value to the original position, plus a small coefficient that depends on the original position (0.01 per position, perhaps), so that the penalised player lands below the person originally at that position. Then, in the next column, you can RANK() the new column of values (F in my case).
The new value is therefore =A2+(IF(C2>0,C2+(0.01*A2)))
The rank is then =RANK(F2,$F$2:$F$11,1) (with the range anchored so the formula can be filled down)
You can combine all the functions into one, but it's clearer to do it in separate columns at first.
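To check that this helper-value-plus-rank idea reproduces the desired "New Pos" column, here is a minimal Python sketch of the same arithmetic. The function name `new_positions` is hypothetical; the positions and penalties come from the question's table.

```python
def new_positions(positions, penalties):
    """Apply penalties, then rank ascending (like RANK with order 1)."""
    # Helper value: position + penalty + small tie-breaker for penalised rows
    scores = [
        pos + (pen + 0.01 * pos if pen > 0 else 0)
        for pos, pen in zip(positions, penalties)
    ]
    # Ascending rank: 1 + number of strictly smaller helper values
    return [1 + sum(other < s for other in scores) for s in scores]


positions = list(range(1, 11))
penalties = [0, 0, 2, 0, 0, 0, 3, 0, 0, 0]   # Daniel +2, Johann +3
print(new_positions(positions, penalties))
# [1, 2, 5, 3, 4, 6, 10, 7, 8, 9]
```

The tie-breaker matters for Daniel: without it, his helper value of 5 would tie with Phillip's original position 5.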

how to count number between certain range of rows?

I would like to count values in every block of 7 rows; the data are in one column. I use the formula below, but it is not working.
From B8 to B14329, for every 7 rows, count the cells that equal 3, so I know how many 3s there are in every 7 rows.
=COUNTIFS(B8:B14329, OFFSET($B$7,(ROW()-12)*7,0,7,1),B8:B14329,=3)
Thanks a lot!
I want something like this:
data count
3
2
3
1
3
3
1 4
1
2
2
3
3
1
1 2
.....
....
...
Simple and easy:
=SUMPRODUCT((B8:B14329=3)*(MOD(ROW(B8:B14329),7)=1))
Just change the =1 to suit your needs: the remainder is 1 for row 1, 2 for row 2, ... 6 for row 6, and 0 for row 7. So, to start the count at row 8, use =1.
EDIT: having your example now, you want something completely different... lol.
=IF(MOD(ROW(),7)=0,COUNTIF(A8:A14,3),"")
Put this in row 14 and then drag down... change the =0 as you need it.
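As a cross-check of the per-block result, the same count (how many 3s in each consecutive group of 7 values) can be sketched in Python. The function name `count_per_block` is hypothetical; the data are the two blocks shown in the question.

```python
def count_per_block(values, target=3, block=7):
    """Count occurrences of target in each consecutive block of rows."""
    return [
        values[i:i + block].count(target)
        for i in range(0, len(values), block)
    ]


data = [3, 2, 3, 1, 3, 3, 1,
        1, 2, 2, 3, 3, 1, 1]
print(count_per_block(data))  # [4, 2]
```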
Here's what I would do
Add a new column with the row index (8 to 14329 in your case)
Add yet another column, with a formula that tells whether the index you just added is a multiple of 7. Give it a value like "TRUE" or "FALSE"
You can use the MOD function to check the remainder of the division.
= MOD ( Number , Divisor )
By now, you should have, aside from the columns you already have, something like:
8-----FALSE
9-----FALSE
10-----FALSE
11-----FALSE
12-----FALSE
13-----FALSE
14-----TRUE
15-----FALSE
Once you have that, just apply a filter on the "TRUE/FALSE" column, select the "TRUE" values and you will be able to count the number of "3"s on the actual value column, by also using a filter on it.
I hope it helps, and it's easier than a really messy formula.

Find the top n values in a range while keeping the sum of values in another range under x value

I'd like to accomplish the following task. There are three columns of data. Column A represents price, where the sum needs to be kept under $100,000. Column B represents a value. Column C represents a name tied to columns A & B.
Out of >100 rows of data, I need to find the highest 8 values in column B while keeping the sum of the prices in column A under $100,000. And then return the 8 names from column C.
Can this be accomplished?
EDIT:
I attempted the Solver solution w/ no luck. 200 rows looks to be the max w/ Solver, and that is what I'm using now. Here are the steps I've taken:
Create a column called rank RANK(B2,$B$2:$B$200) (used column D -- what is the purpose of this?)
Create a column called flag just put in zeroes (used column E)
Create 3 total cells total_price (=SUM(A2:A200)), total_value (=SUM(B2:B200)) and total_flag (=(E2:E200))
Use solver to minimize total_value (shouldn't this be maximize??)
Add constraints -Total_price<=100000 -Total_flag=8 -Flag cells are binary
Using Simplex LP, it simply changes the flags for the first 8 values. However, the total price for the first 8 values is >$100,000 ($140k). I've tried changing some options in the Solver Parameters as well as using different solving methods to no avail. I'd like to post an image of the parameter settings, but don't have enough "reputation".
EDIT #2:
The first 5 rows looks like this, price goes down to ~$6k at the bottom of the table.
Price Value Name Rank Flag
$22,538 42.81905675 Blow, Joe 1 0
$22,427 37.36240932 Doe, Jane 2 0
$17,158 34.12127693 Hall, Cliff 3 0
$16,625 33.97654031 Povich, John 4 0
$15,631 33.58212402 Cow, Holy 5 0
I'll give you the Solver solution as a starting point. It involves creating some extra columns and total cells. Note that Solver is limited in the number of cells it can handle, but it will work with 100 anyway.
Create a column called rank RANK(B2,$B$2:$B$100)
Create a column called flag just put in zeroes
Create 3 total cells total_price, total_value and total_flag
Use solver to minimize total_value
Add constraints
-Total_price<=100000
-Total_flag=8
-Flag cells are binary
This will flag the rows you want and you can grab the names however you want.
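The underlying selection problem (choose exactly k rows so that total value is as high as possible while total price stays within the budget) can be illustrated with a brute-force Python sketch. This is not the Solver model itself. The function name `best_subset`, k = 3, and the $60,000 budget are assumptions chosen so the five sample rows from the question give a non-trivial result. Trying every combination is only practical for small tables; 8 out of 200 rows is far too many combinations.

```python
from itertools import combinations


def best_subset(rows, k, budget):
    """Return names of the k rows with the highest total value within budget.

    rows: (price, value, name) tuples. Returns [] if no combination fits.
    """
    best = None  # (total_value, combo)
    for combo in combinations(rows, k):
        price = sum(r[0] for r in combo)
        value = sum(r[1] for r in combo)
        if price <= budget and (best is None or value > best[0]):
            best = (value, combo)
    return [r[2] for r in best[1]] if best else []


rows = [
    (22538, 42.81905675, "Blow, Joe"),
    (22427, 37.36240932, "Doe, Jane"),
    (17158, 34.12127693, "Hall, Cliff"),
    (16625, 33.97654031, "Povich, John"),
    (15631, 33.58212402, "Cow, Holy"),
]
# Hypothetical parameters: pick 3 rows with a 60,000 budget
print(best_subset(rows, k=3, budget=60000))
# ['Blow, Joe', 'Hall, Cliff', 'Povich, John']
```

Note that the top three values on their own cost $62,123, which is over this budget, so simply taking the highest-ranked rows is not enough; the price constraint has to be part of the selection.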

Resources