Generating columns based on query results - excel

I have four pivots I'm pulling data from that look like this:
tracked_work_by_users_by_operation_pivot:
+-------------------+---------------+---------------+-----------------------+---------------+
| DATE(start_time) | userid | operation_id | Time estimated | Time Elapsed |
+-------------------+---------------+---------------+-----------------------+---------------+
| 1/2/2011-1/8/2011 | jsmith | 11| 40| 40|
| 1/2/2011-1/8/2011 | jsmith | 10| 20| 24|
+-------------------+---------------+---------------+-----------------------+---------------+
faults_by_user_pivot:
+-------------------+---------------+---------------+-----------------------+----------+
| date(date_entered)| userid | operation_id | Major | Minor |
+-------------------+---------------+---------------+-----------------------+----------+
| 1/2/2011-1/8/2011 | jsmith | 11| 2 | 1|
+-------------------+---------------+---------------+-----------------------+----------+
paid_hours_by_user_pivot:
+-------------------+---------------+---------+
|date_range | userid | Total |
+-------------------+---------------+---------+
| 1/2/2011-1/8/2011 | jsmith | 40 |
+-------------------+---------------+---------+
tracked_work_by_users_pivot:
+-------------------+---------------+---------+
|DATE(start_time) | userid | Total |
+-------------------+---------------+---------+
| 1/2/2011-1/8/2011 | jsmith | 24 |
| | | |
+-------------------+---------------+---------+
What I need to do is compile a report for each user for each operation. From what I see the best way to do that is to have a format similar to:
+--------------+--------------+ +--------------+--------------+
| jsmith | packaging | | jsmith | machining |
+--------------------+--------------+--------------+----------------+--------------+--------------+--------------+--------------+----------------+--------------+--------------+
| DATE | time_elapsed | hours_worked | estimated_work | minor_faults | major_faults | time_elapsed | hours_worked | estimated_work | minor_faults | major_faults |
+--------------------+--------------+--------------+----------------+--------------+--------------+--------------+--------------+----------------+--------------+--------------+
| 1/2/2011-1/8/2011 | 24 | 40 | 36 | 1 | 2 | 24 | 40 | 36 | 1 | 2 |
+--------------------+--------------+--------------+----------------+--------------+--------------+--------------+--------------+----------------+--------------+--------------+
That way jsmith has separate entries for machining and for packaging, because we want to be able to rank him against all machining operators and all packaging operators. How can I best do this so that I do not have to add another 12 entries (there are twelve operations) every time I add a new user?

The best way to do this seems to be to write an app instead of strong-arming Excel into it.
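As a rough idea of what such an app could do, here is a minimal Python sketch that joins the pivots on (date range, user, operation) and groups the rows per user and operation, so a new user needs no extra columns. The sample rows are copied from the question; the function name `build_report`, the tuple layout, and the mapping of `hours_worked` to the paid-hours total are assumptions made for illustration. The fourth pivot is left out for brevity.

```python
from collections import defaultdict

# Sample rows copied from the pivots in the question.
# (date_range, userid, operation_id, time_estimated, time_elapsed)
tracked_by_operation = [
    ("1/2/2011-1/8/2011", "jsmith", 11, 40, 40),
    ("1/2/2011-1/8/2011", "jsmith", 10, 20, 24),
]
# (date_range, userid, operation_id, major, minor)
faults_by_user = [("1/2/2011-1/8/2011", "jsmith", 11, 2, 1)]
# (date_range, userid, total_paid_hours)
paid_hours_by_user = [("1/2/2011-1/8/2011", "jsmith", 40)]


def build_report(tracked, faults, paid):
    """Return {(userid, operation_id): [one row per date range]}."""
    fault_lookup = {(d, u, op): (major, minor) for d, u, op, major, minor in faults}
    paid_lookup = {(d, u): total for d, u, total in paid}
    report = defaultdict(list)
    for d, u, op, estimated, elapsed in tracked:
        # Operations with no fault record are assumed to have zero faults.
        major, minor = fault_lookup.get((d, u, op), (0, 0))
        report[(u, op)].append({
            "date": d,
            "time_elapsed": elapsed,
            "hours_worked": paid_lookup.get((d, u), 0),  # assumption: paid hours
            "estimated_work": estimated,
            "minor_faults": minor,
            "major_faults": major,
        })
    return dict(report)


report = build_report(tracked_by_operation, faults_by_user, paid_hours_by_user)
```

Because the result is keyed by (user, operation), ranking all operators of one operation amounts to filtering the keys by `operation_id`.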

Related

Pandas: How to merge cells in the dataframe from a specific column using pandas?

I want to remove the duplicated names from the cells and merge them. This dataframe is generated after concatenating multiple dataframes.
My dataframe is as follows:
| | Customer ID | Category | VALUE |
| -:|:----------- |:------------- | -------:|
| 0 | GETO90 | Baby Sets | 1090.0 |
| 1 | GETO90 | Girls Dresses | 5357.0 |
| 2 | GETO90 | Girls Jumpers | 2823.0 |
| 3 | SETO90 | Girls Top | 3398.0 |
| 4 | SETO90 | Shorts | 7590.0 |
| 5 | SETO90 | Shorts | 7590.0 |
| 6 | RETO90 | Pants | 6590.0 |
| 7 | RETO90 | Pants | 6590.0 |
| 8 | RETO90 | Jeans | 8590.0 |
| 9 | YETO90 | Jeans | 9590.0 |
| 10| YETO90 | Jeans | 2590.0 |
I want to merge the first column and the expected dataframe is mentioned below:
| | Customer ID | Category | VALUE |
| -:|:----------- |:------------- | -------:|
| 0 | GETO90 | Baby Sets | 1090.0 |
| 1 | | Girls Dresses | 5357.0 |
| 2 | | Girls Jumpers | 2823.0 |
| 3 | SETO90 | Girls Top | 3398.0 |
| 4 | | Shorts | 7590.0 |
| 5 | | Shorts | 7590.0 |
| 6 | RETO90 | Pants | 6590.0 |
| 7 | | Pants | 6590.0 |
| 8 | | Jeans | 8590.0 |
| 9 | YETO90 | Jeans | 9590.0 |
| 10| | Jeans | 2590.0 |
Use duplicated with loc:
df.loc[df.duplicated('Customer ID'), 'Customer ID'] = ''
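For reference, a small self-contained run of that one-liner. The frame below is a shortened copy of the question's data, and the variable names `mask` and `result` are only for illustration.

```python
import pandas as pd

# Shortened copy of the question's data.
df = pd.DataFrame({
    "Customer ID": ["GETO90", "GETO90", "SETO90", "SETO90", "RETO90"],
    "Category": ["Baby Sets", "Girls Dresses", "Girls Top", "Shorts", "Pants"],
    "VALUE": [1090.0, 5357.0, 3398.0, 7590.0, 6590.0],
})

# duplicated() flags every repeat of an ID after its first occurrence.
mask = df.duplicated("Customer ID")

# Blank out the flagged cells; the other columns are untouched.
df.loc[mask, "Customer ID"] = ""

result = df["Customer ID"].tolist()
```

Note that `duplicated` looks at the whole column, not only at adjacent rows, so an ID that reappears further down after other IDs would be blanked as well.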

How do I append the result of a PowerQuery to itself?

Let's say I have a table as follows
| make | model | license | mileage | book value |
|-----------|-----------|---------|---------|------------|
| ford | F150 | 123456 | 34000 | 35000 |
| chevrolet | Silverado | 555778 | 32000 | 29000 |
| | | | | |
Let's pretend I had to unpivot and all that, which I've done; I just used simplified data for this question. Now let's assume I run the query today (July 30th). I want my result to be:
| Date | make | model | license | mileage | book value |
|------------|-----------|-----------|---------|---------|------------|
| 2020-07-30 | ford | F150 | 123456 | 34000 | 35000 |
| 2020-07-30 | chevrolet | Silverado | 555778 | 32000 | 29000 |
| | | | | | |
I want to add the day the query is run. However, here's where I am stuck. If I run the query again tomorrow, I want it to add the new values to the bottom of the existing result:
| Date | make | model | license | mileage | book value |
|------------|-----------|-----------|---------|---------|------------|
| 2020-07-30 | ford | F150 | 123456 | 34000 | 35000 |
| 2020-07-30 | chevrolet | Silverado | 555778 | 32000 | 29000 |
| 2020-07-31 | ford | F150 | 123456 | 34200 | 35000 |
| 2020-07-31 | chevrolet | Silverado | 555778 | 32156 | 29000 |
This would allow me to track the fleet over time.
Any help would be greatly appreciated
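The behaviour described above, sketched in Python rather than Power Query: each run stamps the current rows with the run date and appends them to a history file that persists between runs. The file layout, the function names `append_snapshot` and `read_history`, and the choice of CSV are all assumptions made for illustration, not a Power Query feature.

```python
import csv
import os
from datetime import date

FIELDS = ["Date", "make", "model", "license", "mileage", "book value"]


def append_snapshot(history_path, rows, run_date=None):
    """Append the current rows to the history file, stamped with the run date."""
    run_date = run_date or date.today()
    is_new = not os.path.exists(history_path)
    with open(history_path, "a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()  # header is written only on the first run
        for row in rows:
            writer.writerow({"Date": run_date.isoformat(), **row})


def read_history(history_path):
    """Return every stored row, oldest first."""
    with open(history_path, newline="") as handle:
        return list(csv.DictReader(handle))
```

The key design point is that the accumulated result lives outside the query that produces today's rows, so rerunning the query never overwrites earlier dates.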

How can I compare two spreadsheets to see if column a is a match AND they overlap ranges =true?

I have two spreadsheets--one represents locations where the road was recently repaired and the second shows all eligible roads based on the road's speed limit. The first spreadsheet has a list of IDs (Column B) and a beginning point (Column E) and ending point (Column F) for each repair location. The second spreadsheet may have multiple matches for each ID (Column A), with the eligible beginning points in Column P and ending points in Column Q.
I want to compare to see if any portions of the eligible roads are already on the recently repaired list.
Completed repairs = 18SealCoatMap where B=Highway Name, E=Beginning Limit, and F=Ending Limit.
| County | Highway | BDFO | EDFO |
|-----------|-----------|--------|--------|
| Guadalupe | FM0078-KG | 13.064 | 14.018 |
| Guadalupe | FM0078-KG | 14.018 | 14.848 |
| Guadalupe | FM0078-KG | 14.848 | 18.991 |
| Guadalupe | FM0465-KG | 0 | 3.342 |
Eligible repairs = MLOVER45 where A=Highway Name, B=Line ID, P=Beginning Limit, and Q=Ending Limit.
| Lane | ID | Highway | SpeedLimit | Begin_DFO | End_DFO |
|-----------|----|---------|------------|-----------|---------|
| FM0078-KG | 1 | FM0078 | 50 | 13.064 | 14.018 |
| FM0078-KG | 2 | FM0078 | 55 | 14.845 | 14.848 |
| FM0078-KG | 3 | FM0078 | 50 | 14.018 | 14.845 |
| FM0078-KG | 4 | FM0078 | 55 | 14.848 | 15.006 |
So far, I'm only working with the beginning point of each eligible location. When I get a working formula, I'll copy it for the ending location.
Here's a more varied example...
Eligible Locations:
| Lane | ID | Highway | SpeedLimit | Begin_DFO | End_DFO |
|-----------|-----|---------|------------|-----------|---------|
| FM0791-KG | 369 | FM0791 | 70 | 0 | 6.909 |
| FM0791-KG | 372 | FM0791 | 70 | 6.909 | 18.603 |
| FM0791-KG | 377 | FM0791 | 55 | 19.286 | 19.486 |
| FM0791-KG | 378 | FM0791 | 70 | 19.486 | 30.971 |
Completed Locations:
| County | Highway | BDFO | EDFO |
|----------|-----------|--------|--------|
| Atascosa | FM0791-KG | 21.619 | 23.196 |
| Atascosa | FM0791-KG | 21.619 | 23.196 |
| McMullen | FM0791-KG | 0.000 | 7.017 |
| McMullen | FM0791-KG | 0.000 | 7.017 |
| McMullen | FM0791-KG | 2.190 | 2.760 |
| McMullen | FM0791-KG | 2.190 | 2.760 |
I tried the following formula but every location came back true:
=IF(A2='18SealCoatMap'!B2:B345,AND(MLOVER45!P2>'18SealCoatMap'!E2:E345,MLOVER45!P2<'18SealCoatMap'!F2:F345),TRUE)
Then I tried:
=INDEX('18SealCoatMap'!B2:B345,MATCH(A2,IF(P2>'18SealCoatMap'!E2:E345,P2<'18SealCoatMap'!F2:F345)),2)
but all of the results came back #N/A
I expect the outcome to be the ID number for the eligible location (or TRUE) if there's a match so that I can schedule repairs for all locations that do not already fall within the limits. Based on the results, I will then schedule locations that are entirely or partially due for repair.
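The comparison described here is an interval-overlap test: two ranges on the same highway overlap when each one begins before the other ends. A minimal Python sketch of that logic, using the FM0791-KG rows from the more varied example (duplicate completed rows omitted); the function names and tuple layouts are assumptions for illustration.

```python
def overlaps(begin_a, end_a, begin_b, end_b):
    """True when the two ranges share more than a single boundary point."""
    return begin_a < end_b and end_a > begin_b


# (lane, id, begin_dfo, end_dfo) from the eligible list
eligible = [
    ("FM0791-KG", 369, 0.0, 6.909),
    ("FM0791-KG", 372, 6.909, 18.603),
    ("FM0791-KG", 377, 19.286, 19.486),
    ("FM0791-KG", 378, 19.486, 30.971),
]
# (highway, bdfo, edfo) from the completed list
completed = [
    ("FM0791-KG", 21.619, 23.196),
    ("FM0791-KG", 0.000, 7.017),
    ("FM0791-KG", 2.190, 2.760),
]


def already_repaired_ids(eligible, completed):
    """IDs of eligible segments that overlap any completed repair on the same highway."""
    return [
        seg_id
        for lane, seg_id, begin, end in eligible
        if any(lane == hwy and overlaps(begin, end, b, e) for hwy, b, e in completed)
    ]


hits = already_repaired_ids(eligible, completed)  # [369, 372, 378]
```

Testing only whether the beginning point falls inside a repaired range misses segments that start before a repair and run into it, which is why both endpoints appear in the test. In the workbook, one possible equivalent (untested against the real sheets) is `=COUNTIFS('18SealCoatMap'!B:B,A2,'18SealCoatMap'!E:E,"<"&Q2,'18SealCoatMap'!F:F,">"&P2)>0`.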

Blending Model: Oil Production

Oil Blending
An oil company produces three brands of oil: Regular, Multigrade, and
Supreme. Each brand of oil is composed of one or more of four crude stocks, each having a different lubrication index. The relevant data concerning the crude stocks are as follows.
+-------------+-------------------+------------------+--------------------------+
| Crude Stock | Lubrication Index | Cost (€/barrel)  | Supply per day (barrels) |
+-------------+-------------------+------------------+--------------------------+
| 1 | 20 | 7,10 | 1000 |
+-------------+-------------------+------------------+--------------------------+
| 2 | 40 | 8,50 | 1100 |
+-------------+-------------------+------------------+--------------------------+
| 3 | 30 | 7,70 | 1200 |
+-------------+-------------------+------------------+--------------------------+
| 4 | 55 | 9,00 | 1100 |
+-------------+-------------------+------------------+--------------------------+
Each brand of oil must meet a minimum standard for a lubrication index, and each brand
thus sells at a different price. The relevant data concerning the three brands of oil are as
follows.
+------------+---------------------------+---------------+--------------+
| Brand | Minimum Lubrication index | Selling price | Daily demand |
+------------+---------------------------+---------------+--------------+
| Regular | 25 | 8,50 | 2000 |
+------------+---------------------------+---------------+--------------+
| Multigrade | 35 | 9,00 | 1500 |
+------------+---------------------------+---------------+--------------+
| Supreme | 50 | 10,00 | 750 |
+------------+---------------------------+---------------+--------------+
Determine an optimal output plan for a single day, assuming that production can be either
sold or else stored at negligible cost.
The daily demand figures are subject to alternative interpretations. Investigate the
following:
(a) The daily demands represent potential sales. In other words, the model should contain demand ceilings (upper limits). What is the optimal profit?
(b) The daily demands are strict obligations. In other words, the model should contain demand constraints that are met precisely. What is the optimal profit?
(c) The daily demands represent minimum sales commitments, but all output can be sold. In other words, the model should permit production to exceed the daily commitments. What is the optimal profit?
QUESTION
I've been able to construct the following model in Excel and solve it via OpenSolver, but I'm only able to integrate the mix for the Regular Oil.
I'm trying to work my way through the book Optimization Modeling with Spreadsheets by Kenneth R. Baker but I'm stuck with this exercise. While I could transfer the logic from another blending problem I'm not sure how to construct the model for multiple blendings at once.
I modeled the problem as a minimization problem on the cost of the different crude stocks. Using the Lubrication Index data I built the constraint for the R-Lub Index as a linear constraint. So far the answer seems to be right for the Regular Oil. However using this approach I've no idea how to include even the second Multigrade Oil.
+--------------------+--------+--------+--------+--------+--+-------------+----+------+
| Decision Variables | | | | | | | | |
+--------------------+--------+--------+--------+--------+--+-------------+----+------+
| | C1 | C2 | C3 | C4 | | | | |
+--------------------+--------+--------+--------+--------+--+-------------+----+------+
| Inputs | 1000 | 0 | 1000 | 0 | | | | |
+--------------------+--------+--------+--------+--------+--+-------------+----+------+
| | | | | | | | | |
+--------------------+--------+--------+--------+--------+--+-------------+----+------+
| Objective Function | | | | | | Total | | |
+--------------------+--------+--------+--------+--------+--+-------------+----+------+
| Cost | 7,10 € | 8,50 € | 7,70 € | 9,00 € | | 14.800,00 € | | |
+--------------------+--------+--------+--------+--------+--+-------------+----+------+
| | | | | | | | | |
+--------------------+--------+--------+--------+--------+--+-------------+----+------+
| Constraints | | | | | | LHS | | RHS |
+--------------------+--------+--------+--------+--------+--+-------------+----+------+
| C1 supply | 1 | | | | | 1000 | <= | 1000 |
+--------------------+--------+--------+--------+--------+--+-------------+----+------+
| C2 supply | | 1 | | | | 0 | <= | 1100 |
+--------------------+--------+--------+--------+--------+--+-------------+----+------+
| C3 supply | | | 1 | | | 1000 | <= | 1200 |
+--------------------+--------+--------+--------+--------+--+-------------+----+------+
| C4 supply | | | | 1 | | 0 | <= | 1100 |
+--------------------+--------+--------+--------+--------+--+-------------+----+------+
| R- Lub Index | -5 | 15 | 5 | 30 | | 0 | >= | 0 |
+--------------------+--------+--------+--------+--------+--+-------------+----+------+
| R- Output | 1 | 1 | 1 | 1 | | 2000 | = | 2000 |
+--------------------+--------+--------+--------+--------+--+-------------+----+------+
| | | | | | | | | |
+--------------------+--------+--------+--------+--------+--+-------------+----+------+
| Blending Data | | | | | | | | |
+--------------------+--------+--------+--------+--------+--+-------------+----+------+
| R- Lub | 20 | 40 | 30 | 55 | | 25 | >= | 25 |
+--------------------+--------+--------+--------+--------+--+-------------+----+------+
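For reference, the coefficients in the "R- Lub Index" row follow from clearing the denominator of the blend-average requirement. Writing x_i for the barrels of crude i used in Regular (notation introduced here, not taken from the book), and assuming a volume-weighted linear blend with positive total volume:

```latex
\[
\frac{20x_1 + 40x_2 + 30x_3 + 55x_4}{x_1 + x_2 + x_3 + x_4} \ge 25
\;\Longleftrightarrow\;
(20-25)x_1 + (40-25)x_2 + (30-25)x_3 + (55-25)x_4 \ge 0
\;\Longleftrightarrow\;
-5x_1 + 15x_2 + 5x_3 + 30x_4 \ge 0 .
\]
```

The same step with 35 or 50 in place of 25 would give the corresponding rows for the other two brands, provided each brand gets its own set of four variables.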
Here is the model with Excel formulas:
+--------------------+------+-----+------+----+--+----------------------------------------------------+----+------+
| Decision Variables | | | | | | | | |
+--------------------+------+-----+------+----+--+----------------------------------------------------+----+------+
| | C1 | C2 | C3 | C4 | | | | |
+--------------------+------+-----+------+----+--+----------------------------------------------------+----+------+
| Inputs | 1000 | 0 | 1000 | 0 | | | | |
+--------------------+------+-----+------+----+--+----------------------------------------------------+----+------+
| | | | | | | | | |
+--------------------+------+-----+------+----+--+----------------------------------------------------+----+------+
| Objective Function | | | | | | Total | | |
+--------------------+------+-----+------+----+--+----------------------------------------------------+----+------+
| Cost | 7,1 | 8,5 | 7,7 | 9 | | =SUMMENPRODUKT(B5:E5;B8:E8) | | |
+--------------------+------+-----+------+----+--+----------------------------------------------------+----+------+
| | | | | | | | | |
+--------------------+------+-----+------+----+--+----------------------------------------------------+----+------+
| Constraints | | | | | | LHS | | RHS |
+--------------------+------+-----+------+----+--+----------------------------------------------------+----+------+
| C1 supply | 1 | | | | | =SUMMENPRODUKT($B$5:$E$5;B11:E11) | <= | 1000 |
+--------------------+------+-----+------+----+--+----------------------------------------------------+----+------+
| C2 supply | | 1 | | | | =SUMMENPRODUKT($B$5:$E$5;B12:E12) | <= | 1100 |
+--------------------+------+-----+------+----+--+----------------------------------------------------+----+------+
| C3 supply | | | 1 | | | =SUMMENPRODUKT($B$5:$E$5;B13:E13) | <= | 1200 |
+--------------------+------+-----+------+----+--+----------------------------------------------------+----+------+
| C4 supply | | | | 1 | | =SUMMENPRODUKT($B$5:$E$5;B14:E14) | <= | 1100 |
+--------------------+------+-----+------+----+--+----------------------------------------------------+----+------+
| R- Lub Index | -5 | 15 | 5 | 30 | | =SUMMENPRODUKT($B$5:$E$5;B15:E15) | >= | 0 |
+--------------------+------+-----+------+----+--+----------------------------------------------------+----+------+
| R- Output | 1 | 1 | 1 | 1 | | =SUMMENPRODUKT($B$5:$E$5;B16:E16) | = | 2000 |
+--------------------+------+-----+------+----+--+----------------------------------------------------+----+------+
| | | | | | | | | |
+--------------------+------+-----+------+----+--+----------------------------------------------------+----+------+
| Blending Data | | | | | | | | |
+--------------------+------+-----+------+----+--+----------------------------------------------------+----+------+
| R- Lub | 20 | 40 | 30 | 55 | | =SUMMENPRODUKT($B$5:$E$5;B19:E19)/SUMME($B$5:$E$5) | >= | 25 |
+--------------------+------+-----+------+----+--+----------------------------------------------------+----+------+
A nudge in the right direction would be a tremendous help.
I think you want your objective to be Profit, which I would define as the sum of sales value - sum of cost.
To include all blends, develop calculations for Volume produced, Lube Index, Cost, and Value for each blend. Apply constraints for volume of stock used, volume produced, and lube index, and optimize for Profit.
I put together the model as follows ...
Columns A through D are the information you provided.
The 10's in G2:J5 are seed values for the stock volumes used in each blend. Solver will manipulate these.
Column K contains the total product volume produced. These will be constrained in different ways, as per your investigation (a), (b), and (c). It is =SUM(G3:J3) filled down.
Column L is the Lube Index for the product. As you noted, it is a linear blend - this is typically not true for blending problems. These values will be constrained in Solver. It is {=SUMPRODUCT(G3:J3,TRANSPOSE($B$2:$B$5))/$K3} filled down. Note that it is a Control-Shift-Enter (CSE) formula, required because of the TRANSPOSE.
Column M is the cost of the stock used to create the product. This is used in the Profit calculation. It is {=SUMPRODUCT(G3:J3,TRANSPOSE($C$2:$C$5))}, filled down. This is also a CSE formula.
Column N is the value of the product produced. This is used in the Profit calculation. It is =K3*C8 filled down.
Row 7 is the total stock volume used to generate all blends. These values will be constrained in Solver. It is =SUM(G3:G5), filled to the right.
The profit calculation is =SUM(N3:N5)-SUM(M3:M5).
Below is a snap of the Solver dialog box ...
It does the following ...
The objective is to maximize profit.
It will do this by manipulating the amount of stock that goes into each blend.
The first four constraints ($G$7 through $J$7) ensure the amount of stock available is not violated.
The next three constraints ($K$3 through $K$5) are for case (a) - make no more product than there is demand.
The last three constraints ($L$3 through $L$5) make sure the lube index meets the minimum specification.
Not shown - I selected options for GRG Nonlinear and selected "Use Multistart" and deselected "Require Bounds on Variables".
Below is the result for case (a) ...
For case (b), change the constraints on Column K to be "=" instead of "<=". Below is the result ...
For case (c), change the constraints on Column K to be ">=". Below is the result ...
I think I came up with a solution, but I'm unsure if this is correct.
| Decision Variables | | | | | | | | | | | | | | | | |
|--------------------|---------|--------|--------|--------|-------------|--------|--------|--------|--------|--------|--------|--------|---|--------------------------------|----|------|
| | C1R | C1M | C1S | C2R | C2M | C2S | C3R | C3M | C3S | C4R | C4M | C4S | | | | |
| Inputs | 1000 | 0 | 0 | 800 | 0 | 300 | 0 | 1200 | 0 | 200 | 300 | 600 | | | | |
| | | | | | | | | | | | | | | | | |
| Objective Function | | | | | | | | | | | | | | Total Profit (Selling - Cost) | | |
| Cost | 7,10 € | 7,10 € | 7,10 € | 8,50 € | 8,50 € | 8,50 € | 7,70 € | 7,70 € | 7,70 € | 9,00 € | 9,00 € | 9,00 € | | 3.910,00 € | | |
| | | | | | | | | | | | | | | | | |
| Constraints | | | | | | | | | | | | | | LHS | | RHS |
| Regular | -5 | | | 15 | | | 5 | | | 30 | | | | 13000 | >= | 0 |
| Multi | | -15 | | | 5 | | | -5 | | | 20 | | | 0 | >= | 0 |
| Supreme | | | -30 | | | -10 | | | -20 | | | 5 | | 0 | >= | 0 |
| C1 Supply | 1 | 1 | 1 | | | | | | | | | | | 1000 | <= | 1000 |
| C2 Supply | | | | 1 | 1 | 1 | | | | | | | | 1100 | <= | 1100 |
| C3 Supply | | | | | | | 1 | 1 | 1 | | | | | 1200 | <= | 1200 |
| C4 Supply | | | | | | | | | | 1 | 1 | 1 | | 1100 | <= | 1100 |
| Regular Demand | 1 | | | 1 | | | 1 | | | 1 | | | | 2000 | >= | 2000 |
| Multi Demand | | 1 | | | 1 | | | 1 | | | 1 | | | 1500 | >= | 1500 |
| Supreme Demand | | | 1 | | | 1 | | | 1 | | | 1 | | 900 | >= | 750 |
| | | | | | | | | | | | | | | | | |
| | | | | | | | | | | | | | | | | |
| Selling | | | | | | | | | | | | | | | | |
| Regular | 8,50 € | x | 2000 | = | 17.000,00 € | | | | | | | | | | | |
| Multi | 9,00 € | x | 1500 | = | 13.500,00 € | | | | | | | | | | | |
| Supreme | 10,00 € | x | 900 | = | 9.000,00 € | | | | | | | | | | | |
| | | | | | 39.500,00 € | | | | | | | | | | | |
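One way to sanity-check a candidate plan like the one above is to recompute the blend indices, supply usage, and profit outside the spreadsheet. The sketch below does that in plain Python with the data from the problem statement and the inputs from the table; the names `plan` and `summarize` are assumptions for illustration. It checks feasibility only and says nothing about whether the plan is optimal.

```python
LUBE_INDEX = {1: 20, 2: 40, 3: 30, 4: 55}
COST = {1: 7.10, 2: 8.50, 3: 7.70, 4: 9.00}
SUPPLY = {1: 1000, 2: 1100, 3: 1200, 4: 1100}
MIN_INDEX = {"R": 25, "M": 35, "S": 50}
PRICE = {"R": 8.50, "M": 9.00, "S": 10.00}

# plan[(crude, brand)] = barrels, taken from the "Inputs" row (zeros omitted).
plan = {
    (1, "R"): 1000, (2, "R"): 800, (2, "S"): 300, (3, "M"): 1200,
    (4, "R"): 200, (4, "M"): 300, (4, "S"): 600,
}


def summarize(plan):
    """Recompute volumes, blend indices, supply usage and profit for a plan."""
    volume = {b: 0.0 for b in PRICE}
    index_sum = {b: 0.0 for b in PRICE}
    used = {c: 0.0 for c in SUPPLY}
    for (crude, brand), barrels in plan.items():
        volume[brand] += barrels
        index_sum[brand] += LUBE_INDEX[crude] * barrels
        used[crude] += barrels
    # Volume-weighted average index per brand (linear blending assumed).
    blend_index = {b: index_sum[b] / volume[b] for b in PRICE if volume[b] > 0}
    revenue = sum(PRICE[b] * volume[b] for b in PRICE)
    cost = sum(COST[c] * used[c] for c in SUPPLY)
    feasible = (
        all(used[c] <= SUPPLY[c] for c in SUPPLY)
        and all(blend_index[b] >= MIN_INDEX[b] - 1e-9 for b in blend_index)
    )
    return {"volume": volume, "blend_index": blend_index, "used": used,
            "profit": revenue - cost, "feasible": feasible}


summary = summarize(plan)
```

For this plan the recomputed profit agrees with the 3.910,00 € shown in the sheet, and all supply and index constraints hold.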

SSIS Convert column to rows from an excel sheet

I have an excel sheet table with a structure like this:
+------------+-----+----------+----------+------------------+------------------+------------------+------------------+------------------+------------------+----------------+----------------+------------------+------------------+------------------+------------------+------------------+------------------+----------------+----------------+
| date | Day | StoreDdg | StoreR/H | DbgCategory1Dpt1 | R/HCategory1Dpt1 | DbgCategory2Dpt1 | R/HCategory2Dpt1 | DbgCategory3Dpt1 | R/HCategory2Dpt1 | DbgDepartment1 | R/HDepartment1 | DbgCategory1Dpt2 | R/HCategory1Dpt2 | DbgCategory2Dpt2 | R/HCategory2Dpt2 | DbgCategory3Dpt2 | R/HCategory2Dpt2 | DbgDepartment2 | R/HDepartment2 |
+------------+-----+----------+----------+------------------+------------------+------------------+------------------+------------------+------------------+----------------+----------------+------------------+------------------+------------------+------------------+------------------+------------------+----------------+----------------+
| 1-Jan-2017 | Sun | 138,894 | 133% | 500 | 44% | 12,420 | 146% | | | | 11,920 | 104% | #DIV/0! | 13,580 | 113% | 9,250 | 92% | 6,530 | 147% |
| 2-Jan-2017 | Mon | 138,894 | 270% | 500 | 136% | 12,420 | 277% | 11,920 | | | | 193% | #DIV/0! | 13,580 | 299% | 9,250 | 225% | 6,530 | 181% |
+------------+-----+----------+----------+------------------+------------------+------------------+------------------+------------------+------------------+----------------+----------------+------------------+------------------+------------------+------------------+------------------+------------------+----------------+----------------+
I would like to convert this into
+------------+-----+--------+-------------+---------------+---------+------+
| date | Day | Store | Department | Category | Dpt | R/H |
+------------+-----+--------+-------------+---------------+---------+------+
| 1-Jan-2017 | Sun | Store1 | Department1 | Category1Dpt1 | 138,894 | 133% |
| 1-Jan-2017 | Sun | Store1 | Department1 | Category2Dpt1 | 500 | 44% |
| 1-Jan-2017 | Sun | Store1 | Department1 | Category3Dpt1 | 12,420 | 146% |
| 1-Jan-2017 | Sun | Store1 | Department2 | Category1Dpt2 | 11,920 | 104% |
| 1-Jan-2017 | Sun | Store1 | Department2 | Category2Dpt2 | 13,580 | 44% |
| 1-Jan-2017 | Sun | Store1 | Department2 | Category3Dpt2 | 9,250 | 92% |
| 2-Jan-2017 | Mon | Store1 | Department1 | Category1Dpt1 | 138,894 | 270% |
| 2-Jan-2017 | Mon | Store1 | Department1 | Category2Dpt1 | 500 | 136% |
| 2-Jan-2017 | Mon | Store1 | Department1 | Category3Dpt1 | 12,420 | 277% |
| 2-Jan-2017 | Mon | Store1 | Department2 | Category1Dpt2 | 13,580 | 299% |
| 2-Jan-2017 | Mon | Store1 | Department2 | Category2Dpt2 | 9,250 | 225% |
| 2-Jan-2017 | Mon | Store1 | Department2 | Category3Dpt2 | 6,530 | 181% |
+------------+-----+--------+-------------+---------------+---------+------+
Any recommendations about how to do this?
You can do this by using the Excel file as the source. Depending on the version of Visual Studio you are using, you might have to save the workbook in 2005 or 2007 format; if it is already in 2007 format, that is fine.
To extract the data for DbgDepartment1 and DbgDepartment2, you can create two different sources in the Data Flow Task (DFT). In the first, select the columns related to DbgDepartment1; in the second, select those related to DbgDepartment2. You might need a Derived Column transformation, depending on the logic you apply afterwards. Then use the Union All transformation, since the source file is the same, and load the data into the destination. Try it and you should get a solution.
I used the R statistical language to solve this issue, with the data tidying packages ("tidyr", "devtools").
For more info, check this link: http://garrettgman.github.io/tidying/
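The same wide-to-long reshape can be sketched in plain Python: pair each `Dbg...` column with its `R/H...` partner by parsing the category and department numbers out of the header. This is an illustration, not the R code used above; the function name `unpivot`, the regular expression, and the sample row are assumptions, and the store and department total columns are simply skipped.

```python
import re

# Matches headers such as "DbgCategory2Dpt1" or "R/HCategory2Dpt1".
PAIR = re.compile(r"^(Dbg|R/H)Category(\d+)Dpt(\d+)$")


def unpivot(row):
    """Turn one wide row (a dict) into one record per (department, category)."""
    records = {}
    for column, value in row.items():
        match = PAIR.match(column)
        if not match:
            continue  # date, Day, store and department totals are skipped here
        kind, category, dept = match.groups()
        record = records.setdefault((int(dept), int(category)), {
            "date": row["date"],
            "Day": row["Day"],
            "Department": f"Department{dept}",
            "Category": f"Category{category}Dpt{dept}",
            "Dpt": None,
            "R/H": None,
        })
        record["Dpt" if kind == "Dbg" else "R/H"] = value
    # Sort by department, then category, for a stable output order.
    return [records[key] for key in sorted(records)]


# Hypothetical, simplified row for illustration.
wide_row = {
    "date": "1-Jan-2017", "Day": "Sun",
    "DbgCategory1Dpt1": "500", "R/HCategory1Dpt1": "44%",
    "DbgCategory1Dpt2": "13,580", "R/HCategory1Dpt2": "113%",
}
long_rows = unpivot(wide_row)
```

Applying `unpivot` to every row read from the sheet and concatenating the lists gives the long table; a constant `Store` column could be added in the same step.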
