Dividing a column into N equal groups by value - excel

Say I have a column with values:
23
24
25
66
67
84
81
85
I want to divide this into N groups; say N is 4 for now:
23,1
24,1
25,2
66,2
67,3
84,3
81,4
85,4
I actually need to divide around 30k sorted values into groups 1 to 99, each with an equal number of elements.
Any quick way to do this in Excel?

With data in column A, in B1 enter:
=A1 & "," & ROUNDUP(ROW()/(COUNT(A:A)/4),0)
and copy down.
Change the 4 in the formula to vary the number of groups.
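
To see what the formula does without opening Excel, here is a minimal Python sketch of the same arithmetic. The function name `group_by_position` is made up for illustration, and it assumes the data starts in row 1 with no header, as in the answer above.

```python
def group_by_position(values, n_groups):
    """Model of =ROUNDUP(ROW()/(COUNT(A:A)/n_groups),0), data starting in row 1."""
    count = len(values)
    pairs = []
    for row, value in enumerate(values, start=1):
        # ceil(row * n_groups / count), done with integers to avoid float rounding
        group = (row * n_groups + count - 1) // count
        pairs.append((value, group))
    return pairs


for value, group in group_by_position([23, 24, 25, 66, 67, 84, 81, 85], 4):
    print(f"{value},{group}")
```

The group depends only on the row position, not on the value, so the column has to be sorted first if the groups are meant to follow the values.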

I use this trick for equal data bucketing. Suppose you have data in the range A1:A8. Put this formula in B1:
=MAX( ROUNDUP( PERCENTRANK($A$1:$A$8, A1) *4, 0),1)
Fill the formula down column B and you are done. The formula divides the range into 4 equal buckets and returns the number of the bucket that the value in A1 falls into. The first bucket contains the lowest 25% of values.
Adjust the number of buckets as needed:
=MAX(ROUNDUP(PERCENTRANK([Range],[OneCellOfTheRange]) *[NumberOfBuckets],0),1)
The number of observations in each bucket will be equal or almost equal. For example, if you have 100 observations and you want to split them into 3 buckets, the buckets will contain 33 or 34 observations each. So almost equal. You do not have to worry about that; the formula works it out for you.
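
A rough Python model of this bucketing, for checking bucket sizes by hand. It rests on two assumptions that are mine, not the answer's: all values are distinct, and PERCENTRANK for a value in the range is (number of smaller values) / (n - 1), ignoring Excel's default truncation to three significant digits. The name `percentrank_bucket` is hypothetical.

```python
from collections import Counter


def percentrank_bucket(values, x, n_buckets):
    """Bucket (1..n_buckets) for x, assuming x is one of the distinct values."""
    smaller = sum(1 for v in values if v < x)  # values strictly below x
    denom = len(values) - 1                    # assumed PERCENTRANK denominator
    # ceil(smaller / denom * n_buckets) in integer arithmetic, floored at 1
    return max((smaller * n_buckets + denom - 1) // denom, 1)


data = [23, 24, 25, 66, 67, 84, 81, 85]
print([percentrank_bucket(data, x, 4) for x in data])  # [1, 1, 2, 2, 3, 4, 3, 4]

hundred = list(range(100))
sizes = Counter(percentrank_bucket(hundred, x, 3) for x in hundred)
print(sorted(sizes.values()))  # [33, 33, 34]
```

Unlike a row-based formula, this ranks by value, so the input does not need to be sorted (81 lands in bucket 3 even though it sits below 84 in the column).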

If this is in column A:
row 1
row 2
row 3
row 4
row 5
place this formula in column B:
=MOD(ROW(); 4)+1
This results in:
row 1, 2
row 2, 3
row 3, 4
row 4, 1
row 5, 2
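
The same arithmetic as a small Python sketch (the function name `round_robin_group` is made up). It shows that this formula deals rows out to the groups in rotation rather than in contiguous blocks, which may or may not be what is wanted for sorted data.

```python
def round_robin_group(row, n_groups):
    """Model of =MOD(ROW(); n_groups)+1 for a 1-based row number."""
    return row % n_groups + 1


for row in range(1, 6):
    print(f"row {row}, {round_robin_group(row, 4)}")
```

Each group still ends up with an equal (or nearly equal) number of rows, but neighbouring values land in different groups.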

Related

looping through columns to adjust formula

I have 3 rows of data and 17 columns. In row 2, depending on a selection (low, med, high, which is cell-referenced as, say, 1, 2, 3), I want a formula such that, if low is selected, each cell in row 2 equals the corresponding cell in row 1 plus 1.
Effectively, some code I was thinking of is:
For i = 2 to 17
Cells(2,i).Formula = "Cells(1,i) + 1"
I need the output to actually be a formula and not a value, though.

Nested IF Excel formula

I am currently using the formula =IF(COUNTIF($A$1:A2,A2)>4,A2+1,A2) to change the number as I drag the formula down the rows.
For example, the number changes every five rows: A1 to A5 are 1, A6 to A10 are 2, A11 to A15 are 3, etc.
I just wanted to know whether it is possible to extend the same formula so that, along with adding 1 every five rows, it also skips 2 numbers every 60 rows.
For example, if row 60 is number 12, then row 61 should be 15, row 120 will be 26, row 121 should be 29, etc.
Can someone please help me with this formula?
Thanks for your help in advance.
The number starts at one.
Then get the cell's row number and subtract one. Divide that number by 5 and discard the fractional part (the remainder). So numbers from 0 to 4 (which are rows 1 through 5) all get an increment of 0, 5 to 9 get 1, and so on. The same logic applies to multiples of 60, except that the increment is doubled.
=1 + floor((row()-1)/5, 1) + floor((row()-1)/60, 1) * 2
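
A quick way to check the formula against the rows mentioned in the question is to restate it in Python, where `//` is integer division. The function name `stepped_number` is only for illustration.

```python
def stepped_number(row):
    """Model of =1 + FLOOR((ROW()-1)/5, 1) + FLOOR((ROW()-1)/60, 1) * 2."""
    every_five = (row - 1) // 5          # +1 for each completed block of 5 rows
    every_sixty = (row - 1) // 60 * 2    # +2 extra for each completed block of 60 rows
    return 1 + every_five + every_sixty


for row in (1, 5, 6, 60, 61, 120, 121):
    print(row, stepped_number(row))
```

This gives 12 for row 60, 15 for row 61, 26 for row 120 and 29 for row 121.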

Sum the values in Excel cells depending on changing criteria

In an Excel spreadsheet I have three columns of data. Column A is an identifier, column B is a number, and column C is either a tick or a space:
    A     B    C
1   d-45  150  √
2   d-46  200
3   d-45  80
4   d-46  20   √
5   d-45  70   √
Now I wish to sum the values in column B where a tick is present, grouped by the ID in column A. For d-45 that means rows 1 and 5. To identify the ticked rows I use
=IF(ISTEXT(C1),CONCATENATE(A1))
and
=IF(ISTEXT(C1),CONCATENATE(B1))
This leaves me with two columns of data:
    D     E
1   d-45  150
4   d-46  20
5   d-45  70
I now want to sum the values in column E depending on the ID in column D, in this case rows 1 and 5. I can use a straightforward SUMIFS statement to specify d-45 as the criterion; however, this ID will always change. Is there a variation of SUMIFS I can use?
I also wish to put each distinct ID into a separate column header with the summed totals underneath, that is:
    A     B
1   d-45  d-46
2   220   20
etc...
You can try this.
To get the distinct IDs, enter the following in H1 and copy it to the right. It is an array formula, so confirm it with Ctrl+Shift+Enter:
=INDEX($A$1:$A$5;SMALL(IF(ROW($A$1:$A$5)-ROW($A$1)+1=MATCH($A$1:$A$5;$A$1:$A$5;0);ROW($A$1:$A$5)-ROW($A$1)+1;"");COLUMNS($A$1:A1)))
Now, to get the sum, enter this in H2 and copy it to the right:
=SUMPRODUCT(($A$1:$A$5=H1)*ISTEXT($C$1:$C$5)*$B$1:$B$5)
The data in the example is in A1:C5.
Depending on your regional settings, you may need to replace the ";" argument separator with ",".
Try SUMIFS:
=SUMIFS(B1:B5,A1:A5,"=d-45",C1:C5,"<>")
where "<>" means that the cell is not empty.
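
For comparison, here is a minimal Python sketch of the result both answers are aiming at: one total per distinct ID, counting only ticked rows. It uses the sample data from the question. The name `ticked_totals` is hypothetical, and treating an empty or whitespace-only cell as "no tick" is an assumption made for the sketch.

```python
def ticked_totals(rows):
    """Sum column B per ID in column A, counting only rows with a tick in column C."""
    totals = {}
    for row_id, amount, tick in rows:
        if tick.strip():  # assumption: empty or whitespace-only means no tick
            totals[row_id] = totals.get(row_id, 0) + amount
    return totals


rows = [
    ("d-45", 150, "√"),
    ("d-46", 200, ""),
    ("d-45", 80, " "),
    ("d-46", 20, "√"),
    ("d-45", 70, "√"),
]
print(ticked_totals(rows))  # {'d-45': 220, 'd-46': 20}
```

The dictionary keys play the role of the header row of distinct IDs, and the values are the totals underneath.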

Top third, next third of items by sales

I have an Excel sheet as shown below. I need to get the top third and the next third of items by sales count. Is there a way to get this done in Excel?
Item Count
1 100
2 90
3 80
4 60
5 55
6 50
7 45
8 35
9 25
Dividing into 3 buckets gives about 540/3 = 180 items in each:
Bucket 1: Items 1 and 2 (Count = 190)
Bucket 2: Items 3, 4 and 5 (Count = 195)
Bucket 3: Items 6, 7, 8 and 9 (Count = 155)
There are multiple ways to achieve this. Assuming that your Item and Count data are in columns A and B, then the shortest path is to use the following formula in cell C2:
=ROUND(3*SUM($B$2:$B2)/SUM($B$2:$B$10),0)
After entering that into C2, select that cell and drag down the right-bottom corner of the cell all the way to the last row. Note the $ sign that is "missing" on purpose before the second 2. That takes care of the auto-fill behavior needed when dragging down the corner.
If you are allowed to use a helper column, you can create a computationally more efficient method using the following layout:
If you want to, you can hide column C. It contains cumulative values of the different sales counts. Cell C1 is set to 0, cell C2 contains the formula =$C1+$B2. Column D then approximates the buckets by using the formula =ROUND(3*$C2/$C$10,0) in cell D2, and then again dragging down the bottom-right corner. This might be the better approach if you have many rows on your sheet.
Note that both solutions yield the same results. The value in one or more buckets could become 0, which is not exactly right. That can be avoided by using ROUNDUP instead of ROUND, but since you have not indicated clearly where you want the boundaries of the buckets to fall in different situations, I thought I'd leave that as an exercise for you :-).
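
To see which bucket each item gets, here is a minimal Python sketch of the running-total approach with the question's counts. The name `cumulative_buckets` is made up. Excel's ROUND rounds halves away from zero while Python's built-in `round` rounds halves to even, so the sketch does the rounding with integer arithmetic instead.

```python
def cumulative_buckets(counts, n_buckets):
    """Model of =ROUND(n_buckets*SUM($B$2:$B2)/SUM($B$2:$B$10),0) filled down."""
    total = sum(counts)
    running = 0
    buckets = []
    for count in counts:
        running += count  # the helper column: cumulative sales so far
        # floor(n_buckets * running / total + 1/2), i.e. round half up
        buckets.append((2 * n_buckets * running + total) // (2 * total))
    return buckets


print(cumulative_buckets([100, 90, 80, 60, 55, 50, 45, 35, 25], 3))
# [1, 1, 2, 2, 2, 2, 3, 3, 3]
```

With these numbers item 6 falls into bucket 2 rather than bucket 3, so the boundaries differ slightly from the ones listed in the question.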

Averaging daily varying column in excel vb

Every day I have to analyze two columns of numbers.
The columns differ each day.
Column 1 has numbers from 1 to 5, e.g. on day 1 there are 150 x 1's and 200 x 2's, etc.; on day 2, 350 x 1's and 85 x 2's, etc.
Column 2 has values between 1 and 99.
I need to count how many 1's there are to obtain an average for the 1's, another for the 2's, etc. So far I have tried to write a VB program (Excel 2010). I have written the following:
Function Phil2()
ct = 0
For X = 2 To 10
If ax = 1 Then Let b15 = b15 + bx
ct = ct + 1
Next
End Function.
But I cannot get it to display. Can anyone help me?
I want the average of the 1's in cell b15.
See the formula bar for what is in cell E1. If you don't have XL2007 or above, the formula becomes:
=IF(ISERROR(SUMIF($A$1:$A$10,D1,$B$1:$B$10)/COUNTIF($A$1:$A$10,D1)),"",SUMIF($A$1:$A$10,D1,$B$1:$B$10)/COUNTIF($A$1:$A$10,D1))
You could also make it more "automated" by using Dynamic Named Ranges for your ID (1,2,3..) and data (%) sets, which change each day.
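
The logic of the formula (sum the matching values, divide by the number of matches, show an empty string when there are none) can be sketched in Python as follows. The function name `average_if` and the sample numbers are invented for illustration; they are not from the question's data.

```python
def average_if(ids, values, wanted):
    """Average of the values whose ID equals `wanted`, or "" if there is none."""
    matches = [v for i, v in zip(ids, values) if i == wanted]
    if not matches:
        return ""  # mirrors the "" branch of the IF(ISERROR(...)) wrapper
    return sum(matches) / len(matches)


# Hypothetical sample: column 1 holds the category, column 2 the value.
ids = [1, 2, 1, 3, 2, 1]
values = [10, 20, 30, 40, 60, 50]
for category in range(1, 6):
    print(category, average_if(ids, values, category))
```

Categories with no rows on a given day simply come back as an empty string, which is the case the ISERROR check is guarding against.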
OK, it works fine. I modified your formula to:
=IFERROR(AVERAGEIFS(B$16:B$500,$A$16:$A$500,$A2,B$16:B$500,">0"),"")
and it works perfectly for values 1 and 2, so that's a great start. I placed the formula cells at the top: in A1 I typed cow no., in B1 %Milk, in C1 %weight, etc. In A2 I typed 1, in A3 2, in A4 3, etc., and in B2 your formula, etc. My next challenge is to lump together all cow types 3 to 11. Next to cow type 1 we have a % for each category, the same for cow type 2, etc., but the 3rd row must have an average for all categories 3+. The raw data cow types are in A10 down, with values in B10, C10, etc.
