Distribute quantity through min max in Excel - excel

I initially tried to do this directly in SQL Server, but it does not seem possible with a query, so I want to calculate the "Distribute" column in Excel instead. The details are below; I would appreciate any help.
I have the following columns in Excel and want to calculate the values in the "Distribute" column.
Item Qty Customer Rank Min Max Distribute
001 1500 0101 1 250 600 ????
001 1500 0104 2 0 500 ????
001 1500 0103 3 100 300 ????
001 1500 0105 4 200 300 ????
002 2000 0104 1 200 600 ????
002 2000 0105 2 150 700 ????
002 2000 0101 3 100 200 ????
002 2000 0103 4 100 500 ????
002 2000 0102 5 50 200 ????
003 800 0103 1 100 500 ????
003 800 0102 2 50 200 ????
003 800 0101 2 50 100 ????
003 800 0104 3 50 80 ????
There are multiple items (Item), and each item has a fixed quantity available (Qty).
Each item is distributed among different customers (Customer) based on their rank (Rank). Ranks are assigned per item, and the data is already sorted by the Rank column within each item. Multiple customers of the same item can share the same rank.
From the total quantity (Qty) of each item, every customer must first get the minimum quantity given in the (Min) column, irrespective of rank.
The remaining quantity of each item must then be distributed in rank order, making sure no customer exceeds the maximum quantity given in the (Max) column.
It is OK if the total quantity of an item is not fully consumed after every customer has received its maximum.
The result I am after looks like this:
Item Qty Customer Rank Min Max Distribute
001 1500 0101 1 250 600 600
001 1500 0104 2 0 500 500
001 1500 0103 3 100 300 200
001 1500 0105 4 200 300 200
002 2000 0104 1 200 600 600
002 2000 0105 2 150 700 700
002 2000 0101 3 100 200 200
002 2000 0103 4 100 500 450
002 2000 0102 5 50 200 50
003 800 0103 1 100 500 500
003 800 0102 2 50 200 200
003 800 0101 2 50 100 50
003 800 0104 3 50 80 50
I would appreciate a formula or any other solution. Thanks for your help.

FORMULA BASED SOLUTION
Here is a possible formula-based solution that uses several helper columns. It assumes the table is already properly sorted (by item in any order, then by rank from smallest to largest) and will stay that way:
A | B | C | D | E | F | G | H | I | J | K
Item | Qty | Customer | Rank | Min | Max | [Cumulative] Qty - Min | Basic | [Cumulative] Remain | Extra | Distribute
1 | 1500 | 101 | 1 | 250 | 600 | =MAX(0,IF(A1<>A2,B2-E2,G1-E2)) | =IF(A2<>A1,MIN(B2,E2),MIN(G1,E2)) | =IF(A1<>A2,AGGREGATE(15,6,G:G/(A:A=A2),1),MAX(0,I1-(F1-E1))) | =MIN(I2,F2-E2) | =H2+J2
1 | 1500 | 104 | 2 | 0 | 500 | =MAX(0,IF(A2<>A3,B3-E3,G2-E3)) | =IF(A3<>A2,MIN(B3,E3),MIN(G2,E3)) | =IF(A2<>A3,AGGREGATE(15,6,G:G/(A:A=A3),1),MAX(0,I2-(F2-E2))) | =MIN(I3,F3-E3) | =H3+J3
1 | 1500 | 103 | 3 | 100 | 300 | =MAX(0,IF(A3<>A4,B4-E4,G3-E4)) | =IF(A4<>A3,MIN(B4,E4),MIN(G3,E4)) | =IF(A3<>A4,AGGREGATE(15,6,G:G/(A:A=A4),1),MAX(0,I3-(F3-E3))) | =MIN(I4,F4-E4) | =H4+J4
1 | 1500 | 105 | 4 | 200 | 300 | =MAX(0,IF(A4<>A5,B5-E5,G4-E5)) | =IF(A5<>A4,MIN(B5,E5),MIN(G4,E5)) | =IF(A4<>A5,AGGREGATE(15,6,G:G/(A:A=A5),1),MAX(0,I4-(F4-E4))) | =MIN(I5,F5-E5) | =H5+J5
2 | 2000 | 104 | 1 | 200 | 600 | =MAX(0,IF(A5<>A6,B6-E6,G5-E6)) | =IF(A6<>A5,MIN(B6,E6),MIN(G5,E6)) | =IF(A5<>A6,AGGREGATE(15,6,G:G/(A:A=A6),1),MAX(0,I5-(F5-E5))) | =MIN(I6,F6-E6) | =H6+J6
2 | 2000 | 105 | 2 | 150 | 700 | =MAX(0,IF(A6<>A7,B7-E7,G6-E7)) | =IF(A7<>A6,MIN(B7,E7),MIN(G6,E7)) | =IF(A6<>A7,AGGREGATE(15,6,G:G/(A:A=A7),1),MAX(0,I6-(F6-E6))) | =MIN(I7,F7-E7) | =H7+J7
2 | 2000 | 101 | 3 | 100 | 200 | =MAX(0,IF(A7<>A8,B8-E8,G7-E8)) | =IF(A8<>A7,MIN(B8,E8),MIN(G7,E8)) | =IF(A7<>A8,AGGREGATE(15,6,G:G/(A:A=A8),1),MAX(0,I7-(F7-E7))) | =MIN(I8,F8-E8) | =H8+J8
2 | 2000 | 103 | 4 | 100 | 500 | =MAX(0,IF(A8<>A9,B9-E9,G8-E9)) | =IF(A9<>A8,MIN(B9,E9),MIN(G8,E9)) | =IF(A8<>A9,AGGREGATE(15,6,G:G/(A:A=A9),1),MAX(0,I8-(F8-E8))) | =MIN(I9,F9-E9) | =H9+J9
2 | 2000 | 102 | 5 | 50 | 200 | =MAX(0,IF(A9<>A10,B10-E10,G9-E10)) | =IF(A10<>A9,MIN(B10,E10),MIN(G9,E10)) | =IF(A9<>A10,AGGREGATE(15,6,G:G/(A:A=A10),1),MAX(0,I9-(F9-E9))) | =MIN(I10,F10-E10) | =H10+J10
3 | 800 | 103 | 1 | 100 | 500 | =MAX(0,IF(A10<>A11,B11-E11,G10-E11)) | =IF(A11<>A10,MIN(B11,E11),MIN(G10,E11)) | =IF(A10<>A11,AGGREGATE(15,6,G:G/(A:A=A11),1),MAX(0,I10-(F10-E10))) | =MIN(I11,F11-E11) | =H11+J11
3 | 800 | 102 | 2 | 50 | 200 | =MAX(0,IF(A11<>A12,B12-E12,G11-E12)) | =IF(A12<>A11,MIN(B12,E12),MIN(G11,E12)) | =IF(A11<>A12,AGGREGATE(15,6,G:G/(A:A=A12),1),MAX(0,I11-(F11-E11))) | =MIN(I12,F12-E12) | =H12+J12
3 | 800 | 101 | 2 | 50 | 100 | =MAX(0,IF(A12<>A13,B13-E13,G12-E13)) | =IF(A13<>A12,MIN(B13,E13),MIN(G12,E13)) | =IF(A12<>A13,AGGREGATE(15,6,G:G/(A:A=A13),1),MAX(0,I12-(F12-E12))) | =MIN(I13,F13-E13) | =H13+J13
3 | 800 | 104 | 3 | 50 | 80 | =MAX(0,IF(A13<>A14,B14-E14,G13-E14)) | =IF(A14<>A13,MIN(B14,E14),MIN(G13,E14)) | =IF(A13<>A14,AGGREGATE(15,6,G:G/(A:A=A14),1),MAX(0,I13-(F13-E13))) | =MIN(I14,F14-E14) | =H14+J14
VBA SOLUTION
Here is a possible VBA solution. It assumes the table is already properly sorted (by item in any order, then by rank from smallest to largest) and will stay that way:
Sub SubDistribution()

    'Declarations.
    Dim RngData As Range
    Dim RngItem As Range
    Dim RngQty As Range
    Dim RngMin As Range
    Dim RngMax As Range
    Dim RngDistribute As Range
    Dim VarArray() As Variant
    Dim DblItemCol As Double
    Dim DblQtyCol As Double
    Dim DblMinCol As Double
    Dim DblMaxCol As Double
    Dim DblRow As Double
    Dim DblCounter01 As Double
    Dim DblQuantity As Double
    Dim BlnFirstLap As Boolean

    'First data cell of each column (headers are in row 1).
    Set RngData = Range("A2")
    Set RngQty = Range("B2")
    Set RngItem = Range("A2")
    Set RngMin = Range("E2")
    Set RngMax = Range("F2")
    Set RngDistribute = Range("G2")

    'Relative column positions (not used further below).
    DblItemCol = RngData.Column - RngItem.Column + 1
    DblQtyCol = RngData.Column - RngQty.Column + 1
    DblMinCol = RngData.Column - RngMin.Column + 1
    DblMaxCol = RngData.Column - RngMax.Column + 1

    'Expand RngData to the whole data block and size the result array.
    Set RngData = Range(RngData, RngData.End(xlToRight).End(xlDown))
    ReDim VarArray(1 To RngData.Rows.Count)

    'Offset(DblRow - 1) is the current row, Offset(DblRow) the next one.
    For DblRow = 1 To RngData.Rows.Count

        'Load the item's quantity at the start of a group;
        'clear the flag on the last row of the group.
        If RngItem.Offset(DblRow).Value = RngItem.Offset(DblRow - 1).Value And BlnFirstLap = False Then
            DblQuantity = RngQty.Offset(DblRow - 1).Value
            BlnFirstLap = True
        Else
            If RngItem.Offset(DblRow).Value <> RngItem.Offset(DblRow - 1).Value Then
                BlnFirstLap = False
            End If
        End If

        'Assign the minimum quantity to the current row.
        If RngItem.Offset(DblRow).Value <> RngItem.Offset(DblRow - 1) Then
            VarArray(DblRow) = Excel.WorksheetFunction.Min(RngQty.Offset(DblRow - 1), RngMin.Offset(DblRow - 1))
        Else
            VarArray(DblRow) = Excel.WorksheetFunction.Min(DblQuantity, RngMin.Offset(DblRow - 1))
        End If
        DblQuantity = Excel.WorksheetFunction.Max(0, DblQuantity - RngMin.Offset(DblRow - 1).Value)

        If BlnFirstLap = True Then
            'Still inside the group: count its rows.
            DblCounter01 = DblCounter01 + 1
        Else
            'Last row of the group: walk the group from its first row to its
            'last and add the remaining quantity, capped at Max - Min per row.
            For DblCounter01 = DblCounter01 To 0 Step -1
                VarArray(DblRow - DblCounter01) = VarArray(DblRow - DblCounter01) + Excel.WorksheetFunction.Min(DblQuantity, RngMax.Offset(DblRow - 1 - DblCounter01) - RngMin.Offset(DblRow - 1 - DblCounter01))
                DblQuantity = Excel.WorksheetFunction.Max(0, DblQuantity - (RngMax.Offset(DblRow - 1 - DblCounter01).Value - RngMin.Offset(DblRow - 1 - DblCounter01).Value))
            Next
            DblCounter01 = 0
        End If

    Next

    'Write the results to the Distribute column.
    RngDistribute.Resize(UBound(VarArray)).Value = Excel.WorksheetFunction.Transpose(VarArray)

End Sub
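As a cross-check outside Excel, the same two-pass idea (grant the minimums first, then hand out the remainder in row order up to each maximum) can be sketched in Python. This is only an illustrative sketch, not a translation of the macro: the function name `distribute` and the tuple layout are assumptions, and it expects rows already sorted by item and then by rank, with Max not smaller than Min.

```python
from itertools import groupby

# (item, qty, customer, rank, min, max), sorted by item and then by rank,
# copied from the sample table in the question.
rows = [
    ("001", 1500, "0101", 1, 250, 600),
    ("001", 1500, "0104", 2, 0, 500),
    ("001", 1500, "0103", 3, 100, 300),
    ("001", 1500, "0105", 4, 200, 300),
    ("002", 2000, "0104", 1, 200, 600),
    ("002", 2000, "0105", 2, 150, 700),
    ("002", 2000, "0101", 3, 100, 200),
    ("002", 2000, "0103", 4, 100, 500),
    ("002", 2000, "0102", 5, 50, 200),
    ("003", 800, "0103", 1, 100, 500),
    ("003", 800, "0102", 2, 50, 200),
    ("003", 800, "0101", 2, 50, 100),
    ("003", 800, "0104", 3, 50, 80),
]


def distribute(rows):
    """Return one allocated quantity per input row (hypothetical helper)."""
    result = []
    for _, group in groupby(rows, key=lambda r: r[0]):
        group = list(group)
        remaining = group[0][1]  # Qty is repeated on every row of an item
        alloc = []
        # Pass 1: grant each customer its minimum while stock lasts.
        for _, _, _, _, lo, _ in group:
            give = min(remaining, lo)
            alloc.append(give)
            remaining -= give
        # Pass 2: top up in row (rank) order, never exceeding the maximum.
        for i, (_, _, _, _, _, hi) in enumerate(group):
            extra = max(0, min(remaining, hi - alloc[i]))
            alloc[i] += extra
            remaining -= extra
        result.extend(alloc)
    return result


result = distribute(rows)
print(result)
```

For the sample data this prints [600, 500, 200, 200, 600, 700, 200, 450, 50, 500, 200, 50, 50], which matches the Distribute column requested in the question. Customers that share a rank are served in row order, as in the expected result for item 003.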

Related

Unable to establish relationship for two fact tables with multiple many-to-many dimensions in Excel Data Model

I have 4 tables (FACT1, FACT2, DIM1, DIM2)
FACT1:
Code Month Value
058 1 500
059 1 600
061 1 700
058 2 1000
059 2 1000
061 2 1000
FACT2:
Service Month Status Value
058-buy 1 OK 700
059-purchase 1 Missing 800
061-trade 1 OK 900
058-buy 2 OK 300
059-purchase 2 Missing 400
061-trade 2 OK 500
DIM1:
Code Service
058 058-buy
059 059-purchase
061 061-trade
DIM2:
Month Name
1 January
2 February
3 March
I have all 4 tables loaded into Data Model in Excel and created a new measure in FACT1
Value Total:=sum([Value])+sumx(FILTER('FACT2','FACT2'[Status]="OK"),'FACT2'[VALUE])
I have also created relationship between tables:
FACT1(Code) to DIM1(Code)
FACT2(Service) to DIM1(Service)
FACT1(Month) to DIM2(Month)
FACT2(Month) to DIM2(Month)
However, when I insert a new pivot using the data model by having
Code from DIM1 on "ROW"
Month from FACT1 on "COLUMN"
New Measure Value Total on "Values"
I get something like this:
Code 1 2 Grand Total
058 1500 2000 2500
059 600 1000 1600
061 2100 2400 3100
Grand Total 4200 5400 7200
Somehow the values are not sliced by Month properly, but if you look at the last column of the pivot table (Grand Total), those values are correct: 2500 is indeed 500 + 700 + 1000 + 300, 1600 is indeed 600 + 1000, and 3100 is indeed 700 + 900 + 1000 + 500.
The final grand total for the above 3 is correct at 7200 too.
Now the question is why the middle part of the pivot tables acts oddly?
For your Pivot Table, use Month from the DIM2 table, not the FACT1 table. With your current set-up, FACT1 cannot filter the FACT2 table, though DIM2 can.
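The effect can be reproduced outside the Data Model. The sketch below is plain Python, not DAX: the function `value_total` and its `month_filters_fact2` flag are hypothetical names that emulate whether the Month filter on the pivot reaches FACT2 (as it does when Month comes from DIM2) or only FACT1 (as it does when Month comes from FACT1).

```python
# Sample data from the question.
fact1 = [  # (Code, Month, Value)
    ("058", 1, 500), ("059", 1, 600), ("061", 1, 700),
    ("058", 2, 1000), ("059", 2, 1000), ("061", 2, 1000),
]
fact2 = [  # (Service, Month, Status, Value)
    ("058-buy", 1, "OK", 700), ("059-purchase", 1, "Missing", 800),
    ("061-trade", 1, "OK", 900), ("058-buy", 2, "OK", 300),
    ("059-purchase", 2, "Missing", 400), ("061-trade", 2, "OK", 500),
]
dim1 = {"058": "058-buy", "059": "059-purchase", "061": "061-trade"}


def value_total(code, month, month_filters_fact2):
    """Emulate the measure for one pivot cell (hypothetical helper)."""
    service = dim1[code]
    part1 = sum(v for c, m, v in fact1 if c == code and m == month)
    part2 = sum(
        v
        for s, m, status, v in fact2
        if s == service
        and status == "OK"
        # Without a filter path to FACT2, every month is included.
        and (m == month or not month_filters_fact2)
    )
    return part1 + part2


as_in_pivot = value_total("058", 1, month_filters_fact2=False)
with_dim2 = value_total("058", 1, month_filters_fact2=True)
print(as_in_pivot, with_dim2)
```

With the filter reaching only FACT1, the cell for code 058 and month 1 comes out as 1500, the value shown in the question's pivot; once the filter also reaches FACT2, it is 500 + 700 = 1200.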

pandas - sort columns and group by a particular field

I have a list of objects
[
{
"companyid": long,
"parentid": long,
"score": long,
...
}
]
The parentid is nothing but the cid of the parent company
Sample data will look something like this
cid parentid score
1 10 1000
2 10 100
3 10 1001
10 10 20
11 100 1000
12 100 100
100 100 200
111 1000 10
112 1000 100
1000 100 2000
I need to sort the values by score, but I want the rows grouped by parentid.
I tried this, which doesn't fit my requirements, since it groups first and then sorts:
df.groupby('headcompanyid').apply(lambda x: x.sort_values('score'))
Sorting by score will give this result:
cid parentid score
1000 100 2000
3 10 1001
1 10 1000
11 100 1000
100 100 200
2 10 100
112 1000 100
12 100 100
10 10 20
111 1000 10
Grouping by parentid on the sorted data (which is my end goal), should give this result
cid parentid score
1000 100 2000
11 100 1000 // since 100 is the parentid, it needs to be pushed up in the result set
100 100 200 // if multiple records are pushed up, then sorting should be based on score
12 100 100
3 10 1001 // 2nd group by is based on 10, since 1001 is the next highest score which
1 10 1000 // doesn't belong to the 100 parentid group
2 10 100
10 10 20
112 1000 100
111 1000 10
I am using pandas v0.24.2 and Python 3.7, if it matters.
Try this:
df.sort_values(['parentid', 'score'], ascending=[False, False])
Output:
cid parentid score
8 112 1000 100
7 111 1000 10
9 1000 100 2000
4 11 100 1000
6 100 100 200
5 12 100 100
2 3 10 1001
0 1 10 1000
1 2 10 100
3 10 10 20
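Note that the output above orders the groups by parentid, whereas the expected result in the question orders them by the highest score each group contains. One way to get that ordering is to sort on the group maximum first. The sketch below uses plain Python rather than pandas to stay self-contained; the names `group_max` and `ordered` are illustrative only.

```python
# (cid, parentid, score) rows from the question.
rows = [
    (1, 10, 1000), (2, 10, 100), (3, 10, 1001), (10, 10, 20),
    (11, 100, 1000), (12, 100, 100), (100, 100, 200),
    (111, 1000, 10), (112, 1000, 100), (1000, 100, 2000),
]

# Highest score found in each parentid group.
group_max = {}
for cid, parentid, score in rows:
    group_max[parentid] = max(group_max.get(parentid, score), score)

# Groups ordered by their best score (descending); parentid keeps the rows
# of one group together on ties; rows inside a group by score (descending).
ordered = sorted(rows, key=lambda r: (-group_max[r[1]], r[1], -r[2]))

for row in ordered:
    print(*row)
```

This yields the cid order 1000, 11, 100, 12, 3, 1, 2, 10, 112, 111, the same as the expected table in the question. In pandas the same idea would be a helper column holding each group's maximum score, followed by a sort on that column and then on score.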

Group by multiple columns and calculate the average sum

I have the below dataframe :
Customer Category Month Mon_exp
1 A 1 200
1 A 1 100
1 A 2 150
1 B 2 150
1 B 3 300
2 A 1 300
2 A 1 200
2 A 2 150
2 B 2 150
2 B 3 400
Expected Dataframe :
Customer Category Month Mon_exp Ave_Mon_exp
1 A 1 200 300
1 A 1 100 300
1 A 2 150 300
1 B 2 150 300
1 B 3 300 300
2 A 1 300 400
2 A 1 200 400
2 A 2 150 400
2 B 2 150 400
2 B 3 400 400
Explanation for the new column 'Ave_Mon_exp' :
1) For Each customer, sum the 'Mon_exp' and divide with the count of unique 'Month' value.
For eg. Customer - 1, Sum of 'Mon_exp' is 900 and count of unique 'Month' value is 3. Hence the Ave_Mon_exp is 300.
Can anyone help me to derive the new column 'Ave_Mon_exp' ?
Thanks
import pandas as pd
sample_df = pd.DataFrame({'Customer':[1,1,1,1,1,2,2,2,2,2],'Category':['A','A','A','B','B','A','A','A','B','B'], 'Month': [1,1,2,2,3,1,1,2,2,3], 'Mon_exp': [200, 100, 150, 150, 300,300,200,150,150,400]})
# Per customer: total Mon_exp divided by the number of distinct months
new_col = sample_df.groupby('Customer')['Mon_exp'].sum() / sample_df.groupby('Customer')['Month'].nunique()
new_col.name = 'Ave_Mon_exp'
# Attach the per-customer value to every row of that customer
sample_df = sample_df.join(new_col, on='Customer')
print(sample_df)

Counting number of specific observations inside groups of rows

Reproducible example
Consider the following data:
ID ID_2 Specie Area Tree DBH H Cod
2 111 E_citriodora 432 1 19.098 20
2 111 E_citriodora 432 2 1
2 111 E_citriodora 432 3 1
2 111 E_citriodora 432 4 20.530 17.4 6
...
2 111 E_grandis 557 1 1
2 111 E_grandis 557 2 24.828 15 6
2 111 E_grandis 557 3 1
2 111 E_grandis 557 4 14.483 16 5
...
2 111 E_paniculata 704 1 1
2 111 E_paniculata 704 2 14.164 19.5
2 111 E_paniculata 704 3 1
2 111 E_paniculata 704 4 17.507 20
Here is a complete reproducible example with 208 rows. The actual data has more rows and species, in which the number of rows per specie is not always the same.
Question
What I would like to do is the following:
Check whether the count of code 6 in column "Cod" for each specie is smaller than 3 (the minimum threshold) or greater than Area/100 (rounded up to an integer). If either condition is met, I would like to display a message box.
Count of code 6 is smaller than 3 or greater than roundup(Area/100,0)
Expected result
E_citriodora has four numbers 6 on column "Cod". The correct count of code number 6 should be between 3 and =ROUNDUP(432/100,0)=5. So, 3 < 4 < 5 would not trigger the message box.
E_grandis has seven observations for code 6, but in this case the maximum threshold is 6 because the area of 557/100 is 5.57 which rounded up is 6.
3 < 7 < 6. This result would trigger the message box.
The third example, E_paniculata has only 2 observations for code 6. This is smaller than the minimum threshold of 3. 3 < 2 < 8. This result would also trigger the message box.
It is not necessary to display a message box for each time a condition is met, but just one message indicating there is at least one flaw.
What I have tried
I could do this manually for each specie using formulas. For example, regarding the first specie of the data frame:
=IF(OR(COUNTIF(H2:H73,6) < 3,COUNTIF(H2:H73,6) > ROUNDUP(D2/100,0)),"Not Ok", "Ok")
However I was expecting to achieve this with a macro and my main difficulty has been to set the count inside each group of specie and which type of loop would be the most suitable in this situation. Tks.
Assuming your data is always sorted as in your example file, this code prints every specie whose count of code 6 is greater than 3 to the Immediate window (replace the placeholder condition with your real one):
Sub test()
    'Assuming A2 in Sheet 1 contains your first ID
    Dim r As Range
    Set r = ThisWorkbook.Sheets(1).Range("A2")
    If r = "" Then Exit Sub
    Dim specie As String
    specie = ""
    Dim cod6 As Integer
    'Stop at first empty row
    Do While Not r = ""
        'Next Specie
        If specie <> r.Offset(0, 2) Then
            specie = r.Offset(0, 2)
            cod6 = 0
        End If
        'Count cod
        If r.Offset(0, 7) = 6 Then cod6 = cod6 + 1
        'Check cod at end of specie
        If specie <> r.Offset(1, 2) Then
            'Put your real condition here and make a msgbox
            If cod6 > 3 Then Debug.Print specie & " has cod6 greater than 3"
        End If
        Set r = r.Offset(1, 0)
    Loop
End Sub
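The macro leaves the real condition as a placeholder. As a rough sketch of the full check described in the question (count of code 6 below 3, or above Area/100 rounded up), here is the logic in Python. The rows are synthetic, built only from the counts and areas stated in the question, not the real 208-row file, and `flawed_species` is a hypothetical helper name.

```python
import math
from collections import Counter

# Synthetic (specie, area, cod) rows; None stands for an empty Cod cell.
rows = (
    [("E_citriodora", 432, cod) for cod in (None, 1, 1, 6, 6, 6, 6)]
    + [("E_grandis", 557, cod) for cod in (1, 6, 1, 5, 6, 6, 6, 6, 6, 6)]
    + [("E_paniculata", 704, cod) for cod in (1, None, 1, 6, 6)]
)


def flawed_species(rows, min_count=3):
    """List species whose count of code 6 is outside the allowed range."""
    counts = Counter()
    areas = {}
    for specie, area, cod in rows:
        areas[specie] = area
        if cod == 6:
            counts[specie] += 1
    return [
        specie
        for specie, area in areas.items()
        if counts[specie] < min_count or counts[specie] > math.ceil(area / 100)
    ]


flaws = flawed_species(rows)
if flaws:
    # A single message is enough, as requested in the question.
    print("At least one specie fails the code 6 check:", ", ".join(flaws))
```

With these rows, E_grandis (7 > 6) and E_paniculata (2 < 3) are reported, while E_citriodora (4, allowed range 3 to 5) passes. In the macro, the same test would replace the `cod6 > 3` placeholder, with a flag set inside the loop and one MsgBox shown after it.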

How to get the corresponding values of the latest date in Excel?

I have these values in Excel:
A B C D
StaffId FSales ESales Date
1 100 500 23-Jan-13
1 50 170 25-Jan-13
1 70 230 26-Jan-13
2 100 300 25-Jan-13
2 130 200 27-Jan-13
Outcome wanted:
A B C D
StaffId FSales ESales Date
1 100 500 23-Jan-13 10:00:00AM
1 50 170 25-Jan-13 11:00:00AM
1 70 230 26-Jan-13 11:30:00AM
2 100 300 25-Jan-13 03:00:00PM
2 130 200 27-Jan-13 02:00:00PM
3 100 200 29-Jan-13 01:01:00PM
3 90 209 29-Jan-13 01:00:00PM
A B C D
StaffId FSales ESales Date
1 70 230 26-Jan-13 11:30:00AM
2 130 200 27-Jan-13 02:00:00PM
3 100 200 29-Jan-13 01:01:00PM
Let's say the dates are jumbled up and not arranged in any order. How can I get the FSales and ESales of the latest date for each staff member?
That is, getting 70 and 230 for StaffId 1, and 130 and 200 for StaffId 2.
Help needed, please.
Assuming you have the second list with the unique staff ID in Sheet2 and the original list in Sheet1, starting in row 2, enter the following formula:
FSales max in Sheet2!B2: =INDEX(Sheet1!$B:$B,MATCH(MAX(Sheet1!$D:$D*(Sheet1!$A:$A=A2)),(Sheet1!$D:$D*(Sheet1!$A:$A=A2)),0))
ESales max in Sheet2!C2: =INDEX(Sheet1!$C:$C,MATCH(MAX(Sheet1!$D:$D*(Sheet1!$A:$A=A2)),(Sheet1!$D:$D*(Sheet1!$A:$A=A2)),0))
Both formulas are array-formulas, i.e. enter them with Ctrl-Shift-Enter instead of Enter.
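The logic behind these formulas (for each StaffId, find the row with the largest date and return its sales figures) can be illustrated with a short Python sketch. The rows are the timestamped sample from the question; the dictionary name `latest` is only illustrative.

```python
from datetime import datetime

# (StaffId, FSales, ESales, Date) from the question, in no particular order.
rows = [
    (1, 100, 500, datetime(2013, 1, 23, 10, 0)),
    (1, 50, 170, datetime(2013, 1, 25, 11, 0)),
    (1, 70, 230, datetime(2013, 1, 26, 11, 30)),
    (2, 100, 300, datetime(2013, 1, 25, 15, 0)),
    (2, 130, 200, datetime(2013, 1, 27, 14, 0)),
    (3, 100, 200, datetime(2013, 1, 29, 13, 1)),
    (3, 90, 209, datetime(2013, 1, 29, 13, 0)),
]

# Keep, per StaffId, the row with the most recent date seen so far.
latest = {}
for staff_id, fsales, esales, when in rows:
    if staff_id not in latest or when > latest[staff_id][2]:
        latest[staff_id] = (fsales, esales, when)

for staff_id, (fsales, esales, when) in sorted(latest.items()):
    print(staff_id, fsales, esales, when)
```

This gives 70 and 230 for StaffId 1, 130 and 200 for StaffId 2, and 100 and 200 for StaffId 3, matching the outcome table in the question.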
