I have a table in Excel with some information; the main column is Weight (in kg).
I need Excel to group the rows so that each group's total Weight is less than 24000 kg and greater than 23500 kg.
Doing this manually is very time-consuming, since there are thousands of rows with different Weight values.
Table example:
ID | Weight (KG)
1 | 11360
2 | 22570
3 | 10440
4 | 20850
5 | 9980
6 | 9950
7 | 19930
8 | 9930
9 | 9616
10 | 9580
... and so on
The closest I got to solving the problem is adding 3 new columns: Total, Starts Group and Group Number.
Total function: =IF(SUM(B3+C2)>24000,B3,SUM(B3+C2)) - calculates current sum of Weight values in the current group
Starts group function: =IF(SUM(B3+C2)>24000,TRUE,FALSE) - checks if current row makes a new group
Group number function: =IF(D3,E2+1,E2) - all rows that contain same number are in the same group
The problem with this is that it only keeps each group below 24000 kg; it does not also make sure that each group is greater than 23500 kg.
It doesn't have to be in Excel, any app/script would work too, it just has to get the job done.
Desired output:
ID | Weight (KG) | Group ID
1 | 11360 | 1
2 | 2570 | 2
3 | 10440 | 1
4 | 20850 | 2
5 | 180 | 2
6 | 1950 | 1
So I want to get groups similar to these:
Group number 1 - Total 23750 kg
Group number 2 - Total 23600 kg
Url to my example table with functions I added:
https://1drv.ms/x/s!Au0UogL2uddbgTFJJ4TzSKLhPFPE?e=r02sPX
You may want to try this for total:
=IF(SUM(B3+C2)>24000;B3;IF(SUM(B3+C2)<=23500;SUM(B3+C2);B3))
edit:
I just saw you pasted the proposal into your sample file. You may need to replace the ; with , due to regional format settings.
The limitation remains:
the first priority is < 24k and the second priority is >= 23.5k.
If the next row's value would push the total above 24k, the formula switches to the next group, and the current group may be left below 23.5k.
edit2:
You may want to look up some optimization models and algorithms for your combination problem before trying to implement it in Excel.
Or try with simple rules, e.g. categorizing your rows such as weight over 20k, 16k, 12k,8k, 4k, 2k, 1k, 500, etc. and try to group/combine them accordingly
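As a rough illustration of the "simple rules" idea, here is a minimal Python sketch (the question says any script would do). It uses a first-fit-decreasing heuristic, which is a choice made for this example and not something taken from the posts above: sort the rows from heaviest to lightest, put each row into the first group that stays under 24000 kg, then check which groups ended up above 23500 kg. The function names and the dictionary layout are hypothetical, and a heuristic like this does not guarantee that every group lands inside the window.

```python
def group_weights(weights, high=24000):
    """First-fit-decreasing sketch. `weights` maps row ID -> weight in kg."""
    groups = []  # each group is {"ids": [row IDs], "total": summed weight}
    # Heaviest rows first, so the small rows can fill the remaining gaps.
    for row_id, weight in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        for group in groups:
            if group["total"] + weight < high:  # group must stay under the upper limit
                group["ids"].append(row_id)
                group["total"] += weight
                break
        else:
            # No existing group has room, so this row starts a new group.
            groups.append({"ids": [row_id], "total": weight})
    return groups


def in_window(group, low=23500, high=24000):
    """True if the group's total is strictly between the two limits."""
    return low < group["total"] < high


# Sample rows taken from the "Desired output" table in the question.
sample = {1: 11360, 2: 2570, 3: 10440, 4: 20850, 5: 180, 6: 1950}
for number, group in enumerate(group_weights(sample), start=1):
    print(number, sorted(group["ids"]), group["total"], in_window(group))
```

Groups that fail the in_window check would still need manual attention or a proper optimization model, as suggested above.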
The goal is to fix a circular reference in my logic in this "two weeks pay" input workbook.
It's a temporary sheet for when people are outside the office and can't access the system.
That said, that sheet still should give them accurate data.
There are 26 sheets, each containing the hours worked by an employee over 2 weeks in a typical format (a year in total).
Stripped of all formatting and of information that is not useful for this enquiry, they look somewhat like this (with proper dates):
+-----------+----------+--------+----------+--------+-------+------+
| Date | AM start | AM end | PM start | PM end | total | over |
+-----------+----------+--------+----------+--------+-------+------+
| Monday | 8:00 | 12:00 | 13:00 | 16:00 | 7:00 | 0:00 |
+-----------+----------+--------+----------+--------+-------+------+
| Tuesday | 8:00 | 12:00 | 13:00 | 15:00 | 6:00 | 0:00 |
+-----------+----------+--------+----------+--------+-------+------+
| Wednesday | 8:00 | 12:00 | 13:00 | 17:00 | 8:00 | 1:00 |
+-----------+----------+--------+----------+--------+-------+------+
| ... | .... | .... | .... | .... | .... | .... |
+-----------+----------+--------+----------+--------+-------+------+
Then, on another sheet, there are some calculations as to what the paid amount is (maximum 70 hours per 2 weeks), the overtime done that has to be paid, etc.
    A       B            C          D               E                    F   G
  +-------+------------+----------+---------------+--------------------+---+---------------------+
1 | Pay # | Hours paid | Overtime | Used overtime | Total hours worked |   | Total overtime left |
  +-------+------------+----------+---------------+--------------------+---+---------------------+
2 | 1     | 70:00      | 5:00     | 0:00          | 75:00              |   | 0:00                |
  +-------+------------+----------+---------------+--------------------+---+---------------------+
3 | 2     | 70:00      | 0:00     | 5:00          | 65:00              |   |                     |
  +-------+------------+----------+---------------+--------------------+---+---------------------+
4 | ...   | ...        | ...      | ....          |                    |   |                     |
  +-------+------------+----------+---------------+--------------------+---+---------------------+
In the above table, the pay #2 got 70 hours paid, but the person would have done only 65 hours and used 5 hours of the overtime done the past two weeks.
A1:E4 is connected together, G1:G2 is data by itself, not linked to the pay numbers or other data in that sheet (in other words, there is only one cell that contains that total overtime left and F is used to separate both tables).
G2 currently has 0:00 because the 5:00 it would have had has been used to complete the second pay.
The Hours paid cell (B) contains this formula :
=IF($E$2>=2.91666666666667,2.91666666666667,IF((2.91666666666667-$E$2)<=$G$2,2.91666666666667,$E$2+$D$2))
Step 1 [condition] : If the total hours worked for those two weeks is greater than or equal to 70 hours (2.91666666666667, i.e. 70 hours expressed in days, is used here instead of "70:00" to make the comparison work);
Step 1.1 [true] : Then put "70:00" in the cell because there's a 70 hours maximum per two weeks (this is fine since we have another cell that stores the overtime done (in this example, C));
Step 1.2 [false->condition] : Else, the total hours worked for that two weeks is lower than 70 hours so check if 70 hours minus the total hours worked for that two weeks is lower than the total overtime left to be used (used to check if there's overtime left that can be used this time to make the pay the highest it can be up to a maximum of 70 hours);
Step 1.2.1 [true] : If it is, put 70 hours because we'll use some of the overtime left to be paid to complete this two weeks;
Step 1.2.2 [false] : Otherwise, put the total hours worked for those two weeks plus the used overtime for that pay (this cell is explained later with its formula; this step covers the case where there is overtime left, but not enough to reach 70 hours, so we put whatever amount of time it ends up being).
The important part here is to remember that B needs D, which is why I explained its formula.
The Overtime (C) and "Total hours worked" (E) cells contain basic formulas that give either the amount of time over 70 hours or the total hours worked; no need to explain them here, they work.
The Used overtime cell (D) is where it gets tricky. To explain it, we'll need to know what's up with G2.
The Total overtime left cell (G2) is the total of overtime hours minus the sum of all cells in D (used overtime).
Its purpose is to give an up-to-date value of how much overtime is left to be paid.
Back to Used Overtime.
You can probably see the circular reference here: D needs G2 to work, and G2 depends on the sum of all cells in D (within the table range, not the whole column).
The formula requires the notion of how much overtime there is left so it can check if we can use some.
Here's the formula :
=IF($E$2>=2.91666666666667,"00:00",IF((2.91666666666667-$E$2)<=$G$2,(2.91666666666667-$E$2),IF(($G$2+$E$2)<=2.91666666666667,$G$2,"00:00")))
Step 1 [condition] : If the total hours worked for that two weeks is greater than or equals to 70 hours;
Step 1.1 [true] : Then put 0 hours since that pay is already at the 70 hours maximum;
Step 1.2 [false->condition] : Else, the total hours worked for that two weeks is lower than 70 hours and could grow higher if we still have overtime to be paid so check if 70 hours minus the total hours worked for that two weeks (the amount of time we could add from the overtime left) is lower than or equals to the total overtime left to be paid;
Step 1.2.1 [true] : Then put 70 hours minus the total hours worked for those two weeks (the amount of time we will add from the overtime left to make this pay grow to the maximum of 70 hours);
Step 1.2.2 [false->condition] : Otherwise, check if the total overtime left added to the total of hours worked for that two weeks is lower than or equals to 70 hours (if so, then it means that we can add all the overtime left here without getting over 70 hours);
Step 1.2.2.1 [true] : If it is then the value is the total overtime left since it will make the total hours worked for that two weeks still be under the maximum yet pay for the overtime that was left to be paid;
Step 1.2.2.2 [false] : Otherwise, put 0 hours since we will not be adding overtime to this pay because there is no overtime left to be added.
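For readers who find the nested IFs hard to follow, the steps of both formulas can be written out as a small Python sketch. It works in plain hours instead of Excel's day fractions, and the function and parameter names are made up for this illustration. It also shows where the constant comes from: 70 hours divided by 24 hours per day.

```python
CAP = 70.0  # maximum paid hours per two weeks

# Excel stores times as fractions of a day, so 70 hours is 70/24 days.
CAP_AS_EXCEL_DAYS = CAP / 24  # approximately 2.91666666666667


def used_overtime(worked, left, cap=CAP):
    """Column D: overtime taken from the balance `left` for one pay."""
    if worked >= cap:            # Step 1.1: the pay is already at the maximum
        return 0.0
    if cap - worked <= left:     # Step 1.2.1: enough overtime left to reach the cap
        return cap - worked
    if left + worked <= cap:     # Step 1.2.2.1: use everything that is left
        return left
    return 0.0                   # Step 1.2.2.2: not reached when `left` is >= 0


def hours_paid(worked, left, used, cap=CAP):
    """Column B: hours paid for one pay, given column D as `used`."""
    if worked >= cap:            # Step 1.1
        return cap
    if cap - worked <= left:     # Step 1.2.1
        return cap
    return worked + used         # Step 1.2.2


# Pay #2 from the table: 65 hours worked with 5 hours of overtime still available.
used = used_overtime(65.0, 5.0)
print(used, hours_paid(65.0, 5.0, used))
```

Note that both functions take the overtime left as an input, which is exactly the dependency on G2 described above.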
How can I have both the accurate overtime left and the used overtime calculate themselves dynamically without a circular reference?
What if every row had an up to date value for Total Overtime Left after that pay period?
Formula for G2: =C2-D2
Then every G cell after that only needs to add the Overtime Left from the previous pay period + overtime - used overtime:
G3: =G2+C3-D3
And it just goes on from there.
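To see why a per-row balance breaks the loop, here is a minimal Python sketch of the same idea. It assumes (this part is my reading, not stated above) that the Used overtime formula in each row looks only at the balance from the previous row, never at a total that includes its own row. The names are hypothetical and the values are whole hours for simplicity.

```python
CAP_HOURS = 70  # maximum paid hours per two weeks


def settle_pay_periods(worked_hours):
    """Return one dict per pay period with paid, overtime, used and left hours."""
    rows = []
    left = 0  # overtime left before the first pay period (assumed to be zero)
    for pay_number, worked in enumerate(worked_hours, start=1):
        overtime = max(worked - CAP_HOURS, 0)   # column C
        shortfall = max(CAP_HOURS - worked, 0)  # hours missing to reach the cap
        used = min(shortfall, left)             # column D, reads the PREVIOUS balance
        paid = min(worked, CAP_HOURS) + used    # column B
        left = left + overtime - used           # column G for this row
        rows.append({"pay": pay_number, "paid": paid, "overtime": overtime,
                     "used": used, "left": left})
    return rows


# The two pays from the question: 75 hours worked, then 65 hours worked.
for row in settle_pay_periods([75, 65]):
    print(row)
```

Because each row only depends on rows above it, nothing refers back to itself, and the last row's balance is the current total overtime left.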
In my theoretical data set, I have a list which shows the date-time of a sale, and the employee who completed the transaction.
I know how to do grouping in order to show how many sales each employee has per day, but I'm wondering if there's a way to count how many grouped days have more than 0 sales.
For example, here's the original data set:
Employee | Order Time
A | 8/12 8:00
B | 8/12 9:00
A | 8/12 10:00
A | 8/12 14:00
B | 8/13 10:00
B | 8/13 11:00
A | 8/13 15:00
A | 8/14 12:00
Here's the pivot table that I have created:
Employee | 8/12 | 8/13 | 8/14
A | 3 | 1 | 1
B | 1 | 2 | 0
And here's what I want to know:
Employee | Working Days
A | 3
B | 2
Split your Order Time column (assumed to be B) into two, say with Text to Columns and Space as the delimiter (might need a little adjustment). Then pivot (using the Data Model) as shown:
and sum the results (outside the PT) such as with:
=SUM(F3:H3)
copied down to suit.
Columns F:G may then be hidden.
I fully support @Andrea's comment (a correction) on the above:
I think this could have been made simpler. Remove "Time" from the values of the pivot table, then move "Order" from columns to values and use Distinct Count, as in the example. It should then count the distinct dates for each employee, making the sum unnecessary. If you scale this up, say to 50 dates, the =SUM() would need to be moved each time.
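Outside Excel, the same distinct-count idea can be sketched in a few lines of Python. This is only an illustration using the sample rows from the question; the function name is made up, and it assumes the date is everything before the first space in Order Time.

```python
from collections import defaultdict

# Sample rows from the question: (Employee, Order Time).
orders = [
    ("A", "8/12 8:00"), ("B", "8/12 9:00"), ("A", "8/12 10:00"),
    ("A", "8/12 14:00"), ("B", "8/13 10:00"), ("B", "8/13 11:00"),
    ("A", "8/13 15:00"), ("A", "8/14 12:00"),
]


def working_days(rows):
    """Count the distinct dates on which each employee has at least one order."""
    days = defaultdict(set)
    for employee, order_time in rows:
        date_part = order_time.split(" ", 1)[0]  # drop the time, keep the date
        days[employee].add(date_part)            # a set ignores repeated dates
    return {employee: len(dates) for employee, dates in days.items()}


print(working_days(orders))
```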
I have 2 columns in Excel.
Column 1 indicates 'pieces' (of delivery) and the other indicates 'processing time'.
I typed these in by hand because I was given them on a sheet of paper, so there is no formula visible.
Is there a way to get Excel to tell me how 'processing time' is being calculated? I really can't figure it out.
--- Example of situation ---
Total pieces | Pro Time (MM:SS)
40 | 00:21
3 | 00:01
12 | 00:04
43 | 00:22
I am back with a new Excel question.
Let's say I have a table like this.
| A | B
------------------------------------------
1 | ENV | Value
------------------------------------------
2 | ABC - 10/1/2014 1:38:32 PM | 4
3 | XYZ - 10/1/2014 1:38:32 PM | 6
4 | ABC - 9/1/2014 1:38:32 PM | 1
5 | XYZ - 10/1/2014 1:38:32 PM | 10
6 | ABC - 10/1/2014 1:38:32 PM | 7
7 | XYZ - 9/1/2014 1:38:32 PM | 1
8 | ABC - 9/1/2014 1:38:32 PM | 10
9 | ABC - 10/1/2014 1:38:32 PM | 7
10 | XYZ - 10/1/2014 1:38:32 PM | 7
Now, in Cell C2, I've selected ABC.
So in cell D2, I want the average (from col B) of all the "ABC" (col A) where Month = 10 (col A) and in cell E2, Max (from col B) of all the "ABC" where Month = 10 (col A).
So, my result in cells D2 and E2 would be 6 and 7 respectively.
I hope my question and example make sense.
UPDATE:
Thank you all for all your help.
Now let's say I am not sure how many rows I'll have on this spreadsheet, so I came up with this formula, but it's not working; it gives me a #DIV/0! error.
*Note: I am using formula to get "ABC" and "10" from cell C2.
=AVERAGEIFS(
(OFFSET($A$1,1,1,COUNTA($B:$B)-1,1)),
OFFSET($A$1,1,0,COUNTA($A:$A)-1,1), (MID(C2,1,(FIND("-",C2))-2)),
OFFSET($A$1,1,0,COUNTA($A:$A)-1,1), (MID(C2,(FIND("-",C2)+1),(FIND("/",C2))-(FIND("-",C2)+1))))
Even tried this, but same error:
=SUMPRODUCT(((MID(A2:A10,1,(FIND("-",A2:A10))-1))=(MID(C2,(FIND("-",C2)+1),(FIND("/",C2))-(FIND("-",C2)+1))))*
(MONTH(DATEVALUE(MID(A2:A10,7,99)))=(MID(C2,(FIND("-",C2)+1),(FIND("/",C2))-(FIND("-",C2)+1))))*
(B2:B10))/SUMPRODUCT(((MID(A2:A10,1,(FIND("-",A2:A10))-1))=(MID(C2,(FIND("-",C2)+1),(FIND("/",C2))-(FIND("-",C2)+1))))*
(MONTH(DATEVALUE(MID(A2:A10,7,99)))=(MID(C2,(FIND("-",C2)+1),(FIND("/",C2))-(FIND("-",C2)+1)))))
Can you help me with this...?
Solution with Intermediary Values
To solve the issue (I tested the average only) I first used 2 intermediary values: this solution is not optimal and there will be many smarter ways to address the issue (e.g. pivot tables).
ENV | Value | Intermediary 1 | Intermediary 2
ABC - 10/1/2014 1:38:32 PM | 4 | ABC | 10
XYZ - 10/1/2014 1:38:32 PM | 6 | XYZ | 10
ABC - 9/1/2014 1:38:32 PM | 1 | ABC | 9
XYZ - 10/1/2014 1:38:32 PM | 10 | XYZ | 10
The first intermediary column contains the first 3 characters of the ENV column (=LEFT(A9,3)), while the second intermediary column contains the month (=MID(A9,7,2)). This works only if your ENV records are fixed-size and homogeneous (e.g. your env name has exactly 3 chars); for a single-digit month such as 9/1/2014, MID(A9,7,2) returns "9/" rather than "9".
With this layout, you can compute the average putting in any cell the following formula:
=AVERAGEIFS(D9:D12, F9:F12,"=ABC", G9:G12, "=10")
Where D9:D12 is the values interval, F9:F12 is the 1st intermediary column and G9:G12 the second intermediary column.
One Shot Compact Solution (Arrays)
A more compact solution can be built with array formulas. For instance, to calculate the average and the max of an interval based on 2 "vectorial" conditions, you can write these one-liners:
= MAX(IF((LEFT(A9:A12,3)="ABC")*(MID(A9:A12,7,2)="10"),D9:D12))
= AVERAGE(IF((LEFT(A9:A12,3)="ABC")*(MID(A9:A12,7,2)="10"),D9:D12))
With A9:A12 your original records, and D9:D12 is the values interval.
The advantages of this solution are that you don't need any intermediary column and that you can extend this approach to all the other formulas that don't have 'xxxxxIFS' (it's the case for MAX).
NOTE: you have to confirm this formula with CTRL + SHIFT + RETURN or your formula will fail with #VALUE error.
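If a script is an option, the two conditions can also be checked by actually parsing the ENV string instead of slicing fixed character positions. The following Python sketch is only an illustration: the function name is made up, and it assumes the timestamps are in month/day/year order with a 12-hour clock, as they appear in the question.

```python
from datetime import datetime

# Rows from the question: (ENV, Value).
rows = [
    ("ABC - 10/1/2014 1:38:32 PM", 4), ("XYZ - 10/1/2014 1:38:32 PM", 6),
    ("ABC - 9/1/2014 1:38:32 PM", 1), ("XYZ - 10/1/2014 1:38:32 PM", 10),
    ("ABC - 10/1/2014 1:38:32 PM", 7), ("XYZ - 9/1/2014 1:38:32 PM", 1),
    ("ABC - 9/1/2014 1:38:32 PM", 10), ("ABC - 10/1/2014 1:38:32 PM", 7),
    ("XYZ - 10/1/2014 1:38:32 PM", 7),
]


def matching_values(data, name, month):
    """Collect the values whose ENV name and month both match."""
    values = []
    for env, value in data:
        env_name, stamp = env.split(" - ", 1)  # name may have any length
        when = datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")  # assumed m/d/y
        if env_name == name and when.month == month:
            values.append(value)
    return values


selected = matching_values(rows, "ABC", 10)
# An empty selection is the situation that shows up as #DIV/0! in Excel.
average = sum(selected) / len(selected) if selected else None
maximum = max(selected) if selected else None
print(average, maximum)
```

Splitting on " - " means the name does not have to be exactly 3 characters, and single-digit months need no special handling.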
You can start by splitting column A into the letters and the date, using Data > Text to Columns with the delimiter " - ".
Once you have the two new columns (let's say F and G), you can use the function "AVERAGEIF" with a condition that checks whether the value of the cell in "F" is ABC and MONTH(cell in "G") = 10.
As for the max, you can do the same with MAX(IF(...)) for column E.
SUMPRODUCT will allow you to parse the left-most and date characters from your combined string. A pseudo-MAXIF() can be similarly constructed using MAX() and INDEX().
In D2 use =SUMPRODUCT((LEFT(A2:A10,3)="ABC")*(MONTH(DATEVALUE(MID(A2:A10,7,99)))=10)*(B2:B10))/SUMPRODUCT((LEFT(A2:A10,3)="ABC")*(MONTH(DATEVALUE(MID(A2:A10,7,99)))=10))
In E2 use =MAX(INDEX((LEFT(A2:A10,3)="ABC")*(MONTH(DATEVALUE(MID(A2:A10,7,99)))=10)*(B2:B10),,))
Both SUMPRODUCT and INDEX like to choke on anything remotely resembling an error when parsing text so keep the cell range references to what your actual data is and avoid blanks.
Your results should look like the following.