How do I create groups based on the sum of values? - excel

I haven't been able to find anything like this and maybe I'm looking in the wrong place because I have very limited knowledge in programming so any help would be appreciated.
If there is a way to do this in Microsoft Excel without code, that would be preferable, but if code is necessary then please help me by telling me what the code is or how I can write it myself.
I need to be able to make separate groups of rows based on the sum of the values in a single column. This can be done either by inserting a "Total" row after a set summation value is reached or by color coding each group that reaches this summation value.
So, the order I see that this needs to happen is:
1) In one column of a multi-column table, sum the rows in order (if possible, starting on a row of my choosing).
2) Once the sum has reached a certain value without exceeding it, those rows are grouped together.
3) This continues down the entire length of the table, creating separate groups in which the sum of the rows is as close to the set value as possible.
For example:
Starting with this data:
Cust. │Qty.│ Type
A │2│ L
B │4│ XL
C │4│ M
D │9│ S
E │1│ L
F │9│ M
G │10│ L
H │1│ L
I │1│ XL
J │5│ L
K │1│ M
L │5│ S
M │4│ S
N │2│ S
The quantities are summed and checked against the value 10 and then grouped accordingly:
Cust. │Qty.│ Type
A │2│ L
B │4│ XL
C │4│ M
│**Total: 10**│
D │9│ S
E │1│ L
│**Total: 10**│
F │9│ M
│**Total: 9**│
G │10│ L
│**Total: 10**│
H │1│ L
I │1│ XL
J │5│ L
K │1│ M
│**Total: 8**│
L │5│ S
M │4│ S
│**Total: 9**│
N │2│ S
... etc.
Or by color coding the different groups:
[Image: Color Coded Groups]
The data table is constantly changing, with old rows being removed, and new ones added. The table is static once the information is pulled, so I would need to be able to apply this easily every time I pull updated data.
Also, if possible, the algorithm needs to be dynamic enough so if I insert, remove, or rearrange the rows that the groups are automatically updated.
Any help, suggestions, or comments would be greatly appreciated. Doing this manually is very time consuming and cumbersome due to the large amount of data that needs to be sorted.
Thank you in advance for the help.

One solution using only excel is this:
1) Add three additional columns to the table: "Total", "Starts Group", and "Group Number". So the table has 6 columns:
Customer │ Quantity │ Type │ Total │ Starts Group │ Group Number
2) Add one empty row between the headers and the data rows - this will make crafting and maintaining the formulas easier.
3) On the third row, which would be the first row with actual data (A | 2 | L), put the following formulas for the three new columns:
"Total" -> =IF(B3+D2>10,B3,B3+D2)
"Starts Group" -> =B3+D2>10
"Group Number" -> =IF(E3,F2+1,F2)
4) The "Group Number" column contains the information that you want. You can color code the rows using that value. Also, the table should be completely dynamic - you can add/remove rows as you wish and it will get recomputed.
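To sanity-check the logic outside Excel, here is a minimal Python sketch of the same three helper columns, run on the quantities from the question. The function name `group_rows` and the `limit` parameter are made up for illustration; this only models what the formulas do and is not something Excel itself runs.

```python
def group_rows(quantities, limit=10):
    """Mirror the "Total", "Starts Group" and "Group Number" helper columns."""
    rows = []
    total = 0   # stands in for the empty row above the data (D2)
    group = 0   # stands in for F2
    for qty in quantities:
        starts_group = total + qty > limit             # "Starts Group"
        total = qty if starts_group else total + qty   # "Total"
        if starts_group:
            group += 1                                 # "Group Number"
        rows.append((qty, total, starts_group, group))
    return rows


# Qty. column from the question, in the original row order
quantities = [2, 4, 4, 9, 1, 9, 10, 1, 1, 5, 1, 5, 4, 2]

for row in group_rows(quantities):
    print(row)
```

The group numbers start at 0 for the first group, just as they do in the sheet, and the last "Total" value of each group is that group's sum.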
So your specific example would look like this:
OrderID Contract Price BuySell OrderType Quantity
1 ZS 10914 Buy 6
2 ZS 10916 Buy 4
3 ZL 3188 Sell 9
4 ZM 3981 Sell 9
5 ZM 3985 Sell 2
6 ZS 10914 Buy 10
7 ZL 3186 Sell 9
8 ZM 3982 Sell 11
9 ZS 10910 Buy 2
10 ZS 10911 Buy 4
11 ZS 10913 Buy 2
12 ZS 10914 Buy 4
13 ZL 3184 Sell 9
14 ZM 3983 Sell 11
15 ZS 10926 Buy 10
16 ZL 3184 Sell 9
17 ZM 3983 Sell 11
18 ZS 10926 Buy 10
19 ZL 3184 Sell 9
20 ZM 3983 Sell 11

Related

TRUE/FALSE ← VLOOKUP ← Identify the ROW! of the first negative value within a column

First, we have an array of predetermined factors, i.e. V to Z;
each has three attributes, and the first two (• × m) multiplied give the third.
f ... factors
• ... cap, the maximum by which the values in the data set may increase
m ... fixed multiplier
pwr ... let's call it power
This is a separate, standalone array .. we'd access with eg. VLOOKUP
f • m pwr
V 1 9 9
W 2 8 16
X 3 7 21
Y 4 6 24
Z 5 5 25
—————————————————————————————————————————————
Then we have 6 columns holding the actual data to be processed, from which we derive the next-level result, based on the interaction of the two samples introduced.
In addition, two columns are added, for balance & profit.
Here's a short, 6-row data sample:
f • m bal profit
V 2 3 377 1
Y 2 3 156 7
Y 1 1 122 0
X 1 2 -27 2
Z 3 3 223 3
—————————————————————————————————————————————
Ultimately, starting at the end, we are checking whether -27 inverted (so 27) is within X's power range, i.e. 21 (as per the first sample); the result is then fed into a bigger formula, beyond the scope of this post.
This can be done with VLOOKUP, all fine by now.
—————————————————————————————————————————————
To get to that .. for the working example, we are focusing coincidentally on row5, since that's the one with the first negative value in the 'balance' column, so ..
on factorX = which factor exactly is to us unknown &
balance -27 = which we have to locate amongst potentially dozens to hundreds of rows.
Why!?
Once we know that the factor is X, based on the • & multiplier pertaining to it, we also know which 'power' (top array) to compare -27, the first negative value identified in the balance column, to.
Is that clear?
I'd like to know the formula on how to achieve that, & (get to) move on with the broader-scope work.
—————————————————————————————————————————————
The main issue for me is not knowing how to identify the first negative value, or the row that -27 belongs to, and then how to use that information to get the X, i.e. identify the factor type. This is especially tricky since the factor is positioned to the left of the balance and, to the best of my knowledge, I cannot use a negative column index number (so the latter is out of the question anyway).
To recap;
IF(21>27) = IF(-21<-27)
27 → LOCATE ROW with the first negative number (-27)
21 → IDENTIFY the FACTOR TYPE, same row as (-27)
→ VLOOKUP pwr, based on factor type identified (top array, 4th column right)
→ invert either 21 to a negative number or (-27) to the positive number
= TRUE/FALSE
Guessing at your columns, I'll say your first chart is in columns A to D and the second in columns G to K.
You could find the letter of that factor with something like this:
=INDEX(G:G,XMATCH(TRUE,INDEX(J:J<0)))
INDEX(J:J<0) converts that column to TRUE and FALSE depending on being negative or not and with XMATCH you find the first TRUE. You could then use that in VLOOKUP:
=VLOOKUP(INDEX(G:G,XMATCH(TRUE,INDEX(J:J<0))),A:D,4,0)
That would return the 21. You can use the first concept too to find the -27, and wrap it in ABS to get its "positive value":
=VLOOKUP(INDEX(G:G,XMATCH(TRUE,INDEX(J:J<0))),A:D,4,0) > ABS(INDEX(J:J,XMATCH(TRUE,INDEX(J:J<0))))
That should return TRUE or FALSE for the comparison.
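If it helps to see the same steps outside Excel, here is a minimal Python sketch using the two sample arrays from the question. The names `powers`, `data` and `check_first_negative` are invented for illustration, and the layout (factor first, balance fourth) follows the column guess above.

```python
# Top array: factor -> power (4th column of the first sample)
powers = {"V": 9, "W": 16, "X": 21, "Y": 24, "Z": 25}

# Data sample: (factor, cap, multiplier, balance, profit)
data = [
    ("V", 2, 3, 377, 1),
    ("Y", 2, 3, 156, 7),
    ("Y", 1, 1, 122, 0),
    ("X", 1, 2, -27, 2),
    ("Z", 3, 3, 223, 3),
]


def check_first_negative(rows, powers):
    """Find the first negative balance, look up its factor's power, compare."""
    for factor, _cap, _mult, balance, _profit in rows:
        if balance < 0:                      # XMATCH(TRUE, J:J<0)
            power = powers[factor]           # VLOOKUP(factor, A:D, 4, 0)
            return factor, balance, power, power > abs(balance)
    return None                              # no negative balance found


print(check_first_negative(data, powers))
```

For the sample data the first negative balance is -27 on the X row, so the comparison is 21 > 27.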

List result of lookup A in B, B in C without helper column

I have 2 tables:
Table1 containing Customer & Part#
Table2 containing Part# & Type
(The actual data lists are larger)
Table1 (Customer & Part#) & Table3 (Helper):
Customer Part# Helper
A 1 X
B 2 Y
C 3 X
A 4 Y
A 5 X
A 5 X
A 2 Y
Table2:
Part# Type
1 X
2 Y
3 X
4 Y
5 X
Desired result for combination of customer A and Type X:
Part#
1
5
5
These being the 3 results of part numbers in Table1 that are Customer A and the lookup of the Part# results in Type X (see also Helper column).
I'm able to retrieve the results by creating the helper column as shown in the example data, however I want to skip this column and solve it in one go. But I don't know if that's even possible.
I was thinking about something in this direction: =INDEX(Table1[Part'#],IF(Table1[Customer]="A",ROW(Table1[Customer])))
..but there I get stuck. I think I can pickup from there with IF, ISNUMBER, SEARCH but my head errors there.
Does anybody know a way to skip the helper column for this?
PS I have office365, but FILTER is not yet released by company rules (unfortunately).
PS I prefer a formula solution, but VBA is allowed when necessary
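Here is a formula solution for Excel versions 2010 to 2019.
The ranges in the formula assume Table1 in A3:B9, Table2 in D3:E7, the customer to match in G3 and the type to match in H3.
In I3, formula copied down:
=IFERROR(INDEX(B:B,AGGREGATE(15,6,ROW(A$3:A$9)/(VLOOKUP(N(IF({1},B$3:B$9)),D$3:E$7,2,0)=H$3)/(A$3:A$9=G$3),ROW(A1))),"")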
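For comparison, the same "lookup inside a filter" idea can be sketched in a few lines of Python. This is only an illustration of the logic, using the sample tables from the question; `table1`, `table2` and `parts_for` are hypothetical names.

```python
# Table1: (customer, part number), in the original row order
table1 = [("A", 1), ("B", 2), ("C", 3), ("A", 4), ("A", 5), ("A", 5), ("A", 2)]

# Table2: part number -> type
table2 = {1: "X", 2: "Y", 3: "X", 4: "Y", 5: "X"}


def parts_for(customer, part_type):
    """Return Table1 part numbers for a customer whose looked-up type matches."""
    return [
        part
        for cust, part in table1
        if cust == customer and table2.get(part) == part_type  # lookup done inline
    ]


print(parts_for("A", "X"))
```

The inline `table2.get(part)` plays the role of the VLOOKUP inside the AGGREGATE, so no helper column is needed.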

Calculate Average Across Multiple Pairs/Permutations

Not sure how to use AVERAGEIFS or a combination of SUMIFS and COUNTIFS to efficiently solve this, or some other function.
Basically, assume I have the following dataset of trip times between certain points
Start End Trip Time(Minutes)
A B 12
A B 8
B A 9
B A 2
A C 15
C A 5
C B 11
C B 9
B C 7
A B 16
A D 18
D C 21
E A 11
X Y 19
There could be n number of points in the dataset, but assume we are only interested in the average trip time of all trip pairs between 4 cities (A,B,C,D). i.e. AB, BA, AC, CA, BD, DB, etc. but not AA, BB, CC, DD.
How can I go about averaging the trip time between all these permutations? Much help would be appreciated..thank you!
Not very pretty, but using a named range "CITIES" (A20:A23 below)
In E3 to arrange as unique pairs regardless of direction (and fill down):
=IFERROR(INDEX(CITIES,MIN(MATCH(A3,CITIES,0),MATCH(B3,CITIES,0)))&":"&
INDEX(CITIES,MAX(MATCH(A3,CITIES,0),MATCH(B3,CITIES,0))),"")
In F3:
=IF(E3<>"",AVERAGEIFS($C$3:$C$16,$E$3:$E$16,E3),"")
You can copy/paste values/remove duplicates to get the unique pairs.
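As a cross-check of what the two formulas produce, here is a minimal Python sketch on the sample data. The names `CITIES`, `trips` and `pair_averages` are assumptions for illustration; unlike the formula, the sketch also skips same-city trips explicitly, since the question excludes AA, BB and so on.

```python
from collections import defaultdict

CITIES = ["A", "B", "C", "D"]  # the cities of interest (the named range)

trips = [
    ("A", "B", 12), ("A", "B", 8), ("B", "A", 9), ("B", "A", 2),
    ("A", "C", 15), ("C", "A", 5), ("C", "B", 11), ("C", "B", 9),
    ("B", "C", 7), ("A", "B", 16), ("A", "D", 18), ("D", "C", 21),
    ("E", "A", 11), ("X", "Y", 19),
]


def pair_averages(trips, cities):
    """Average trip time per unordered city pair, ignoring other cities."""
    times = defaultdict(list)
    for start, end, minutes in trips:
        if start in cities and end in cities and start != end:
            # Order the pair by position in the city list, like MIN/MAX of MATCH
            first, second = sorted((start, end), key=cities.index)
            times[f"{first}:{second}"].append(minutes)
    return {pair: sum(values) / len(values) for pair, values in times.items()}


for pair, average in pair_averages(trips, CITIES).items():
    print(pair, average)
```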

Get number of unique values from a column with multiple criteria

I am working on an Excel problem. Here is my question:
name department year
a cs 5
b cs 8
c cs 2
d cs 3
a cs 1
b cs 10
a ma 7
f ma 8
h ma 2
The question is to get the number of unique names (those that occur only once) with department="cs" and year>2. In this case the result is 2 (i.e., "a" and "d" each occur only once).
I know the formula below might do the trick, but I don't know how to feed it the range filtered by department="cs" and year>2.
=SUM(IF(COUNTIF(range, range)=1,1,0))
Use SUMPRODUCT:
=SUMPRODUCT((COUNTIFS(A:A,A2:INDEX(A:A,MATCH("zzz",A:A)),B:B,"cs",C:C,">2")=1)*(B2:INDEX(B:B,MATCH("zzz",A:A))="cs")*(C2:INDEX(C:C,MATCH("zzz",A:A))>2))
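To check the expected answer independently, here is a minimal Python sketch of the same counting rule on the sample data. `rows` and `count_single_names` are illustrative names only.

```python
from collections import Counter

# (name, department, year) rows from the question
rows = [
    ("a", "cs", 5), ("b", "cs", 8), ("c", "cs", 2),
    ("d", "cs", 3), ("a", "cs", 1), ("b", "cs", 10),
    ("a", "ma", 7), ("f", "ma", 8), ("h", "ma", 2),
]


def count_single_names(rows, department, min_year):
    """Count names that appear exactly once among rows matching both criteria."""
    matching = [
        name for name, dept, year in rows
        if dept == department and year > min_year
    ]
    occurrences = Counter(matching)          # like COUNTIFS per name
    return sum(1 for n in occurrences.values() if n == 1)


print(count_single_names(rows, "cs", 2))
```

After filtering, "a" and "d" appear once and "b" appears twice, so the count is 2, as the question expects.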

Excel formula to apply penalty column to ranking

I have thought long and hard about this, but I can't find a solution to what I believe is quite a simple problem.
I have a table of results, where sometimes someone will be given a penalty of a varying amount. This is entered into the penalty column (Col C).
I need a formula which checks if there is an entry into the penalty column and applies it, not only to that row, but to the number of subsequent rows which are affected, depending on the severity of the penalty.
I have tried to see if this is possible by referencing the penalty against the 'ROW()' function but have not been able to achieve the desired effect.
Col D shows the desired output of the formula.
Col E is included for reference only, to show the desired effect on each row.
Col A │ Col B │ Col C │ Col D │ Col E
Pos │ Name │ Penalty │ New Pos │ Change
1 │ Jack │ │ 1 │ 0
2 │ Matt │ │ 2 │ 0
3 │ Daniel │ 2 │ 5 │ +2
4 │ Gordon │ │ 3 │ -1
5 │ Phillip │ │ 4 │ -1
6 │ Günther │ │ 6 │ 0
7 │ Johann │ 3 │ 10 │ +3
8 │ Alain │ │ 7 │ -1
9 │ John │ │ 8 │ -1
10 │ Gianmaria │ │ 9 │ -1
The big issue is, if someone is handed a big penalty, for example '10' then it affects the following ten rows. I can't work out how to include this variable logic...
I would be interested to hear the approach of others...
You need to use the RANK() function:
Excel RANK Function Examples
In a new column, add the penalty value to the original position, plus a small coefficient that depends on the original position (0.01 per position, perhaps) so that the penalised player lands below whoever already holds that position. Then, in the next column, you can RANK() the new column of values (F in my case).
New value is therefore =A2+IF(C2>0,C2+0.01*A2,0)
Rank is then =RANK(F2,F$2:F$11,1)
You can combine all the functions into one, but it's clearer to do it in separate columns at first.
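To see that this reproduces the desired "New Pos" column, here is a minimal Python sketch of the two steps on the sample table. The function name `new_positions` and the list layout (one penalty per original position, 0 meaning no penalty) are assumptions for illustration.

```python
def new_positions(penalties):
    """Apply penalties, then rank ascending, like the adjusted value plus RANK()."""
    adjusted = [
        # original position, plus penalty and a small tie-breaker if penalised
        pos + (penalty + 0.01 * pos if penalty > 0 else 0)
        for pos, penalty in enumerate(penalties, start=1)
    ]
    # Ascending rank: 1 + number of values strictly smaller than this one
    return [1 + sum(other < value for other in adjusted) for value in adjusted]


# Penalty column from the question: Daniel (pos 3) gets 2, Johann (pos 7) gets 3
penalties = [0, 0, 2, 0, 0, 0, 3, 0, 0, 0]

print(new_positions(penalties))
```

Daniel's adjusted value is 5.03, just above Phillip's 5, which is what pushes him to fifth place rather than tying.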

Resources