I have a big data set which has about 9000 rows. I have a few variables for every year from 1960 onwards, and I need to average them in ten year bins. So I have something like:
1
2
3
4
2
3
4
5
Now I need to average the first ten rows, then the next ten, and so on, for all 9000-odd rows. I can do that, but then I get all these averaged rows in the middle which I don't need, and I can't go about deleting that many rows. There has to be an easy way to do this, surely?
Would appreciate any help!
Suppose your data starts from A1. Try this one in B1:
=AVERAGE(INDEX(A:A,1+10*(ROW()-ROW($B$1))):INDEX(A:A,10*(ROW()-ROW($B$1)+1)))
and drag it down.
in B1 it would be =AVERAGE(A1:A10)
in B2 it would be =AVERAGE(A11:A20)
in B3 it would be =AVERAGE(A21:A30)
and so on.
General case
If your data starts from An (where n is 2,3,4,...), use this one:
=AVERAGE(INDEX(A:A,n+10*(ROW()-ROW($B$1))):INDEX(A:A,n-1+10*(ROW()-ROW($B$1)+1)))
where you should change n to 2,3,4,...
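The row arithmetic in these formulas can be sanity-checked outside Excel. Below is a minimal Python sketch, not part of the original answer: the helper names `block_bounds` and `block_averages` are hypothetical, and dropping a trailing partial block is an assumption of the sketch (Excel's AVERAGE would simply average whatever numbers are present).

```python
def block_bounds(r, n=1, size=10):
    """Return the (first, last) data rows averaged by the formula in row r.

    Mirrors n+size*(ROW()-ROW($B$1)) and n-1+size*(ROW()-ROW($B$1)+1),
    where n is the first data row and the formula column starts in B1.
    """
    k = r - 1  # ROW() - ROW($B$1)
    return n + size * k, n - 1 + size * (k + 1)


def block_averages(values, size=10):
    """Average consecutive full blocks of `size` values.

    Assumption of this sketch: a trailing partial block is dropped.
    """
    return [
        sum(values[i:i + size]) / size
        for i in range(0, len(values) - size + 1, size)
    ]
```

For example, `block_bounds(3)` gives rows 21 to 30, matching the B3 case listed above, and `block_bounds(2, n=2)` gives rows 12 to 21 for data that starts in A2.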
I have the following dataset where I want to get the sum of each variable every 6 days. I can get the total sum of every 6 days using
=SUM(OFFSET($A$2,,(COLUMNS($A$5:A5)-1)*6,,6))
And I can get the total sum of each variable using
=SUMIF(A1:S1,A1,A2:S2)
But I can't get the sum of each variable within each block of 6 days. The formula won't increment when I drag it.
So the results should be
First batch Second batch Third batch
A B C A B C A B C
2 2 2 4 4 4 6 6 6
You can use SUMPRODUCT:
=SUMPRODUCT((1:1=A6)*2:2*(COLUMN(1:1)>(INT((COLUMN()-1)/3)*6))*(COLUMN(1:1)<=(INT((COLUMN()-1)/3+1)*6)))
Edit:
To shift the columns by five positions, you will need to change the following parameters in the formula:
Change the full-row ranges to exact ranges, e.g. 1:1 to $F$1:$W$1
Change COLUMN()-1 to COLUMN()-3
Change the comparisons from > and <= to >= and <
If you also want to change the number of columns to be summed, additionally replace *6 with *7-1 for seven columns or *36-30 for thirty-six columns (the subtracted offset keeps the first batch starting at column F, which is column 6).
So the formulas look like:
batch of 6 cols
=SUMPRODUCT(($F$1:$W$1=F6)*$F$2:$W$2*(COLUMN($F$1:$W$1)>=((INT((COLUMN()-3)/3))*6))*(COLUMN($F$1:$W$1)<((INT((COLUMN()-3)/3+1))*6)))
batch of 7 cols
=SUMPRODUCT(($F$1:$Z$1=F6)*$F$2:$Z$2*(COLUMN($F$1:$Z$1)>=((INT((COLUMN()-3)/3))*7-1))*(COLUMN($F$1:$Z$1)<((INT((COLUMN()-3)/3+1))*7-1)))
batch of 36 cols
=SUMPRODUCT(($F$1:$WW$1=F6)*$F$2:$WW$2*(COLUMN($F$1:$WW$1)>=((INT((COLUMN()-3)/3))*36-30))*(COLUMN($F$1:$WW$1)<((INT((COLUMN()-3)/3+1))*36-30)))
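The column window that the shifted formulas build can be checked with a short Python sketch. This is an illustration rather than the answerer's method: the function name `batch_sum` and the sample data are made up, and `k` stands for the value of INT((COLUMN()-3)/3) in the cell holding the formula.

```python
def batch_sum(labels, values, label, k, first_col=6, width=6):
    """Sum the values whose label matches inside the k-th batch (k = 1, 2, ...).

    Columns are 1-based sheet columns; the data is assumed to start at
    first_col (column F = 6), as in the shifted formulas.
    """
    lo = first_col + (k - 1) * width  # k*6, k*7-1 or k*36-30 for widths 6, 7, 36
    hi = lo + width                   # exclusive upper bound, like the "<" test
    total = 0
    for offset, (lab, val) in enumerate(zip(labels, values)):
        col = first_col + offset
        if lab == label and lo <= col < hi:
            total += val
    return total


# Made-up data: 18 columns labelled A, B, C repeatedly, three batches of 6.
labels = ["A", "B", "C"] * 6
values = [1] * 6 + [2] * 6 + [3] * 6
first_batch_a = batch_sum(labels, values, "A", 1)
second_batch_a = batch_sum(labels, values, "A", 2)
```

Writing the lower bound as `first_col + (k - 1) * width` shows where the offsets come from: with the data starting in column 6 it reduces to `6*k`, `7*k-1` and `36*k-30` for batches of 6, 7 and 36 columns.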
Instead of creating a really, really, really complex formula that can be dragged right, I suggest you add a row at the top of the data that identifies the batch number. Then you can use that batch number as an additional criterion in SUMIFS(). You can hide the row with the batch numbers if it upsets your spreadsheet design.
=SUMIFS(3:3,1:1,A16,2:2,A17)
This is far easier than creating a formula that dynamically adjusts references in tiered steps of three and six.
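The helper-row idea can be sketched in Python as follows. The layout is an assumption read off the SUMIFS formula (batch numbers in row 1, variable names in row 2, values in row 3), and the names `batch_numbers` and `sumifs` are hypothetical.

```python
def batch_numbers(n_cols, width=6):
    """Helper row: 1 for the first `width` columns, 2 for the next, and so on."""
    return [i // width + 1 for i in range(n_cols)]


def sumifs(values, batches, batch, labels, label):
    """Sum the values where both the batch number and the label match."""
    return sum(
        v
        for v, b, lab in zip(values, batches, labels)
        if b == batch and lab == label
    )


# Made-up data: 12 columns, two batches of 6, labels A, B, C repeating.
labels = ["A", "B", "C"] * 4
values = [1] * 6 + [2] * 6
batches = batch_numbers(len(values))
total_b_second = sumifs(values, batches, 2, labels, "B")
```

The point of the design is that the batch boundary is stored as data rather than computed inside each formula, so changing the batch width only means refilling the helper row.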
I'd like to sum the lpqty and receivedqty columns for every two rows in Excel, and I'm struggling to find a way to do it. I want a list of the first 4 columns, where the week values only appear once, not twice as it is right now. (Ignore the iog column). So essentially I want it to look something like this:
2018 9 35 BCN1 59380 109963
2018 9 36 BCN1 356071 724178
I've tried with some SUM and OFFSET formulas, but I can't seem to make it work.
For the first 4 columns:
=INDEX(A:A,(ROW()*2-2))
starting in (say) J2
Then for the next 2 columns:
=INDEX(F:F,(ROW()*2-2))+INDEX(F:F,(ROW()*2-1))
(assumes all rows are strictly in pairs)
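To see which source rows each output row reads, here is a minimal Python sketch of the same pairing. The helper names and the sample numbers are made up, and like the formulas it assumes the rows come strictly in pairs.

```python
def pair_rows(r):
    """Source rows read by the formulas sitting in output row r (r = 2, 3, ...)."""
    return 2 * r - 2, 2 * r - 1


def collapse_pairs(keys, qty):
    """Keep the key of the first row of each pair and sum the quantity over the pair."""
    return [(keys[i], qty[i] + qty[i + 1]) for i in range(0, len(qty) - 1, 2)]


# Made-up week numbers and quantities, two rows per week.
weeks = [35, 35, 36, 36]
quantities = [10, 20, 30, 40]
collapsed = collapse_pairs(weeks, quantities)
```

With the formulas starting in row 2, output row 2 reads source rows 2 and 3, output row 3 reads rows 4 and 5, and so on.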
I am dealing with a large data set in Excel and need to search for two neighboring cells in the same column. Usually I would just go through this quickly row by row, but there are around 30,000 rows and probably 1% of those are the neighbors I am looking for. The data is organized temporally, meaning I cannot just sort.
Anyone have an idea if/how this can be done?
You could drag this formula down in the column next to your data.
For example, in B3 where column A has data:
=IF(AND(A3<>"",A2<>""),"neighbour above","")
So:
Row A B
1 Data Check
2 10
3 20 neighbour above
4
5 40
6 50 neighbour above
7 60 neighbour above
8
9
10 90
Note that B2, the first data row, has no formula, since the cell above it is the header. The formula flags neighbouring cells within the column.
How many?
To count how many neighbours there are, use COUNTIF. So in C1 you can have:
=COUNTIF(B:B, "neighbour above")
which will return 3 in the example above: the pairs 10 and 20, 40 and 50, and 50 and 60.
You can choose other marker text to flag the neighbour, besides "neighbour above". Just put it in the IF statement and use the same text in the COUNTIF.
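The flag-and-count logic can be reproduced in a few lines of Python, using the example column above. This is only a sketch: representing empty cells as `None` and the function name `flag_neighbours` are assumptions.

```python
def flag_neighbours(cells, marker="neighbour above"):
    """Mark each non-empty cell whose predecessor is also non-empty.

    Empty cells are represented by None (an assumption of this sketch).
    """
    if not cells:
        return []
    flags = [""]  # the first data cell has nothing above it to compare with
    for prev, cur in zip(cells, cells[1:]):
        both_filled = prev is not None and cur is not None
        flags.append(marker if both_filled else "")
    return flags


# Column A from the example (rows 2 to 10), blanks as None.
cells = [10, 20, None, 40, 50, 60, None, None, 90]
flags = flag_neighbours(cells)
count = flags.count("neighbour above")  # plays the role of COUNTIF
```

Here `count` comes out as 3, the same result as the COUNTIF in the example.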
I am trying to analyze blood pressure that is taken every minute, and determine how long the values are within a certain range, consecutively. I have the data set up in Excel for the moment. I have color coded the values based on the ranges I would like to quantify. I know that with a simple =COUNTIF() function I can get the total number of times these values meet the criteria. But what I want to do next is quantify for how long the values fall within a specified range, consecutively.
This shows values in columns in Excel, where each column is a different patient, and the heat map shows the value conditions to help me visualize if certain thresholds occur for longer times than others. But I want to find a way to quantify this in Excel, if possible. Any help would be much appreciated.
The final result I am looking for is to be able to measure how much time each patient sustains a specific category of blood pressure to know if certain ranges are more prolonged than others (e.g. blood pressure is between 120-130 for 30 minutes). So in the spreadsheet above, assuming each cell is a 1-minute bin, for column HU, BP is between 120-130 for 3 minutes (rows 2-4), and again for 16 minutes (rows 6-22). In column HS, blood pressure is above 140 (black) for 7 minutes.
I want to find a workflow to quantify these durations so that I can get a summary of the number of consecutive 1-minute bins (each cell) at a specified range/threshold for each patient (column).
First, I would create another sheet -- let's call it "Thresholds" -- with blood pressure thresholds in ascending order in column A.
Put a category number next to each value (in column B)
For example:
0 0
90 1
100 2
105 3
110 4
115 5
120 6
125 7
... etc.
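Back in the other sheet, add a new column next to each blood pressure column. So you would have a new column HR next to HQ.
Put there a formula that looks up the category for the value in HQ, from sheet "Thresholds".
You can use VLOOKUP for that. With the fourth argument omitted it does an approximate match and returns the category of the largest threshold that does not exceed the value, which is why the thresholds must be in ascending order. For example in row 2:
=VLOOKUP(HQ2, Thresholds!$A$1:$B$1000, 2)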
Then add yet another column, which will be HS.
In there make a running count for same category rows, like this (for row 2, I assume you have used row 1 for column titles):
=IF(HR1<>HR2, 1, HS1+1)
Drag this formula down the column. It checks if this row has a different category of blood pressure than the previous one. If so, it sets the counter to 1 (it is the first instance in this running series). Otherwise it takes the value of the counter in the previous row and adds 1 to it.
Repeat this for the other columns (inserting 2 new columns next to each).
This will already give you a start for further analysis.
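The two helper columns can be sketched in Python to show what they produce. The threshold table below copies the example values; the function names are hypothetical, and `run_lengths` is one possible next step for the summary, not something the steps above prescribe.

```python
from bisect import bisect_right

THRESHOLDS = [0, 90, 100, 105, 110, 115, 120, 125]  # column A of "Thresholds"
CATEGORIES = [0, 1, 2, 3, 4, 5, 6, 7]               # column B of "Thresholds"


def category(bp):
    """Approximate-match lookup: category of the largest threshold <= bp."""
    i = bisect_right(THRESHOLDS, bp) - 1
    if i < 0:
        raise ValueError("value is below the lowest threshold")
    return CATEGORIES[i]


def running_counts(readings):
    """Counter that restarts at 1 whenever the category changes."""
    counts, prev = [], None
    for bp in readings:
        cat = category(bp)
        counts.append(counts[-1] + 1 if cat == prev else 1)
        prev = cat
    return counts


def run_lengths(readings):
    """(category, minutes) for each consecutive run, assuming 1 reading per minute."""
    runs = []
    for bp in readings:
        cat = category(bp)
        if runs and runs[-1][0] == cat:
            runs[-1] = (cat, runs[-1][1] + 1)
        else:
            runs.append((cat, 1))
    return runs
```

In the sheet itself, the length of each run is the counter value in the last row before the counter resets to 1, so a summary per category can be built from those rows.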
I have a column of data in Excel, and I need to take the average of the bottom 10% of it. My data reads:
1
2
3
4
5
6
7
8
9
10
so the average of the bottom 30% would be (1+2+3)/3 = 2. Is there a way to automate this in Excel where all I have to do is give it what percent I want and it gives me the answer?
A simpler version: no Array Formula or Indirect required
Assuming data in column A, and required percentage in cell B1 (as a decimal)
=AVERAGEIF(A:A,"<="&SMALL(A:A,COUNT(A:A)*B1))
I'm not entirely sure what you're looking for when you say 'where all I have to do is give it what percent I want and it gives me the answer', but you could perhaps try AVERAGEIF:
=AVERAGEIF(A1:A10,"<="&COUNTA(A1:A10)*0.3)
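This assumes that the data is in the range A1:A10. Note that it compares the values themselves with 30% of the count (3 here), so it picks out the bottom 30% only because the sample values happen to run from 1 to 10. You can use a cell reference instead of the 0.3 for the percentage: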
=AVERAGEIF(A1:A10,"<="&COUNTA(A1:A10)*B1)
If you put the percentage in B1, then the formula will change accordingly.
Assuming your data is in A1:A10, and your desired % is in B1:
=AVERAGE(SMALL(A1:A10,ROW(INDIRECT("1:"&(B1*COUNT(A1:A10))))))
Note! This is an array formula! That means that you have to enter it in the formula bar at the top (not in the cell), and press Ctrl+Shift+Enter when you're done.
This will wrap the formula in { }, so you'll know you did it right. Typing the braces in does not work; you have to press Ctrl+Shift+Enter!
How does it work?
ROW(INDIRECT("1:"&(B1*COUNT(A1:A10))))
COUNT checks how many items you have in your list, so the formula knows how many numbers it will need to average. Let's say B1 is 40%.
40% of 10 items is 4, but 40% of 20 is 8.
Since our list is 10 entries long, ROW(INDIRECT("1:4")) creates an "array", a series of numbers from 1 to 4 (40%).
SMALL(A1:A10,
SMALL finds the nth smallest number in a range. With our array of 1 to 4, it will find the lowest 4 entries.
AVERAGE(
Then we average the result :)
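The same steps (count, take the k smallest, average) can be written as a minimal Python sketch. The function name `bottom_average` is hypothetical, and truncating the item count to a whole number is an assumption, since the answers above do not say how a fractional count should be rounded.

```python
def bottom_average(values, fraction):
    """Average the smallest `fraction` of the values (0.3 for the bottom 30%).

    Assumption of this sketch: the number of items is truncated to a whole
    number; the small rounding step guards against floating-point noise.
    """
    k = int(round(len(values) * fraction, 9))  # COUNT(...) * B1
    if k < 1:
        raise ValueError("fraction selects no values")
    smallest = sorted(values)[:k]              # the k smallest, like SMALL with 1..k
    return sum(smallest) / k                   # AVERAGE(...)


data = list(range(1, 11))
bottom_30 = bottom_average(data, 0.3)
bottom_40 = bottom_average(data, 0.4)
```

For the 1 to 10 sample this gives 2.0 for the bottom 30%, matching the (1+2+3)/3 worked out in the question, and 2.5 for the bottom 40%.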