How can I convert a repeated column element into a title row? - excel

I have some rather ugly post-pivot data, much like the following:
Location  Team  Staff  Sales
North     1     1100   55
North     2     2100   56
North     3     3200   91
South     1     7100   75
South     2     3100   16
South     3     9200   41
East      1     8100   25
East      2     9100   56
East      3     4200   31
My users don't like the duplication in the first column and would rather it be a header row with only one element, with the three resulting tables side-by-side. So, something like this:
with the obvious extension for East.
How can I achieve this automatically? I would do it by hand, but the real version of my table has a few hundred categories of values in the Location column.
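If exporting the data out of Excel is an option, the reshaping described above can be sketched in a few lines of Python. This is only an illustration of the layout logic, not an Excel solution: the function name `side_by_side` and the tuple layout of `rows` are assumptions made for the example, and the sample is shortened to two locations.

```python
# Sample rows in the same shape as the table above (shortened).
rows = [
    ("North", 1, 1100, 55),
    ("North", 2, 2100, 56),
    ("North", 3, 3200, 91),
    ("South", 1, 7100, 75),
    ("South", 2, 3100, 16),
    ("South", 3, 9200, 41),
]


def side_by_side(rows):
    """Group rows by location and lay the groups out next to each other."""
    groups = {}  # location -> list of (team, staff, sales), in first-seen order
    for location, team, staff, sales in rows:
        groups.setdefault(location, []).append((team, staff, sales))

    title, header = [], []
    for location in groups:
        title += [location, "", ""]          # location appears once per block
        header += ["Team", "Staff", "Sales"]

    depth = max(len(block) for block in groups.values())
    body = []
    for i in range(depth):
        line = []
        for block in groups.values():
            # Pad shorter blocks so every block stays three columns wide.
            line += list(block[i]) if i < len(block) else ["", "", ""]
        body.append(line)
    return [title, header] + body


grid = side_by_side(rows)
for line in grid:
    print("\t".join(str(cell) for cell in line))
```

The tab-separated output can be pasted back into a sheet. The same grouping works for a few hundred locations because nothing in the function depends on how many distinct values the first column holds.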

Related

Excel Power Query. Unpivot Years only to create multiple measures on each row

I'm trying to normalise some data that is supplied in Excel. The data is made up of a number of dimension columns followed by several measure columns over time. Unfortunately the data comes in with a single "Measure/Year" identifier which means that if there are 10 years of data and 4 measures, there will be 40 measure columns.
I can't select specific columns to unpivot as the number of columns will change over time and I want to automate this completely.
A simplified sample of data looks like this (just showing 2 measures over 3 years in this example - but potentially 5 measures over an ever increasing number of years).
Country  Category  Product  QTY_2018  QTY_2019  QTY_2020  Value_2018  Value_2019  Value_2020
France   Fruit     Apple    10        20        30        11          22          33
France   Fruit     Orange   40        50        60        44          55          66
Germany  Veg       Carrot   70        80        90        77          88          99
What I would like to achieve is...
Country  Category  Product  Year  QTY  Value
France   Fruit     Apple    2018  10   11
France   Fruit     Apple    2019  20   22
France   Fruit     Apple    2020  30   33
France   Fruit     Orange   2018  40   44
France   Fruit     Orange   2019  50   55
France   Fruit     Orange   2020  60   66
Germany  Veg       Carrot   2018  70   77
Germany  Veg       Carrot   2019  80   88
Germany  Veg       Carrot   2020  90   99
So far I have selected all the non-measure columns, applied the "Unpivot other columns" transform, and then created 2 custom columns to hold the measure name (QTY or Value in this example) and the year. This gets around the problem of the varying number of measure columns, but it only gets me so far.
I now have data that looks like this
Country  Category  Product  Year  Measure  Amount
France   Fruit     Apple    2018  QTY      10
France   Fruit     Apple    2018  Value    11
and so on...
Notes:
The measure column labels will always have the form 'measurename_YYYY'.
The list of measure names is finite (4 or 5 maybe), so updating the query to support a new measure name will be fine, as this will be rare. The number of years will increase each year. I want end users to be able to refresh the query from the contents of a sheet they update (the sample data above), so the varying periods must be handled in the query.
If this can be done in the data model I'm happy to go with that too.
I may be going about this the wrong way with my attempts so far, but my Power Query knowledge is pretty basic, so any help would be gratefully received.
You should be able to just repivot on the new Measure column to get your desired result now.
You're nearly there. Just Pivot on your "Measure" column, to complete the output:
Unpivoted = Table.UnpivotOtherColumns(Source, {"Country", "Category", "Product"}, "Attribute", "Value"),
#"Split Column" = Table.SplitColumn(Unpivoted, "Attribute", Splitter.SplitTextByEachDelimiter({"_"}, QuoteStyle.Csv, false), {"Measure", "Year"}),
#"Pivoted Column" = Table.Pivot(#"Split Column", List.Distinct(#"Split Column"[Measure]), "Measure", "Value")
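To make the unpivot, split and pivot sequence easier to follow, here is a minimal sketch of the same reshaping in plain Python. It is not Power Query and is not meant to replace the M steps above; the function name `reshape` and the list-of-dicts input are assumptions made for the illustration.

```python
KEYS = ("Country", "Category", "Product")

source = [
    {"Country": "France", "Category": "Fruit", "Product": "Apple",
     "QTY_2018": 10, "QTY_2019": 20, "QTY_2020": 30,
     "Value_2018": 11, "Value_2019": 22, "Value_2020": 33},
    {"Country": "France", "Category": "Fruit", "Product": "Orange",
     "QTY_2018": 40, "QTY_2019": 50, "QTY_2020": 60,
     "Value_2018": 44, "Value_2019": 55, "Value_2020": 66},
    {"Country": "Germany", "Category": "Veg", "Product": "Carrot",
     "QTY_2018": 70, "QTY_2019": 80, "QTY_2020": 90,
     "Value_2018": 77, "Value_2019": 88, "Value_2020": 99},
]


def reshape(rows, keys=KEYS):
    """Turn measure_YYYY columns into one row per key combination and year."""
    out = {}  # (key values..., year) -> output row
    for row in rows:
        for column, amount in row.items():
            if column in keys:
                continue
            # Unpivot + split: "QTY_2018" -> measure "QTY", year "2018".
            measure, year = column.split("_", 1)
            ident = tuple(row[k] for k in keys) + (year,)
            record = out.setdefault(
                ident, {**{k: row[k] for k in keys}, "Year": int(year)}
            )
            # Pivot: each measure name becomes its own column.
            record[measure] = amount
    return list(out.values())


result = reshape(source)
for record in result:
    print(record)
```

Because the measure and year are read from the column names at run time, adding a `QTY_2021` column or a new measure needs no change to the function, which is the same property the unpivot-other-columns approach gives the query.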

In Excel how can a formula verify whether the column location or column element has taken the correct data from its header name?

The input data is in Sheet1 and the output is calculated in Sheet2.
The Sheet1 data can be changed by the user, so the columns 'Units1' and 'Units2' may not stay in columns C and D. Suppose a new user enters data in which 'Avocado' and 'Banana' are in columns C and D: the 'Output' calculation in Sheet2 will then be incorrect, because we always want to use Units1 and Units2 for the calculation.
How can I fix this, so that every time data is entered the formula checks whether the correct columns have been used for the calculation?
Is there a way to use INDEX, the LOOKUP family of functions, or any other function for this?
Maybe by creating a new sheet with a table of indexes that refer to (or point to) the column names of the data sheet?
Location      Dates     Units1  Units2  Avocado  Banana
New York      05-01-18  10      12      1        2
Los Angeles   02-02-18  20      23      1        2
Chicago       08-03-18  30      34      1        2
Houston       05-04-18  40      45      1        2
Phoenix       02-05-18  50      56      1        2
Philadelphia  08-06-18  60      67      1        2
San Antonio   05-07-18  70      78      1        2
San Diego     02-08-18  80      89      1        2
Dallas        08-09-18  90      99      1        2
San Jose      05-10-18  100     112     1        2
Use INDEX/MATCH:
=INDEX(2:2,1,MATCH("Units2",$1:$1,0))/INDEX(2:2,1,MATCH("Units1",$1:$1,0))
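The idea behind the formula is to find each column by its header text rather than by its position. A minimal Python sketch of that idea follows; the function name `ratio_by_header` is hypothetical, and the Units2 / Units1 ratio simply mirrors the formula above, since the actual Sheet2 calculation is not shown in the question.

```python
def ratio_by_header(header, row, numerator="Units2", denominator="Units1"):
    """Divide two cells of a row, locating each one by its header name."""
    # list.index plays the role of MATCH(..., 0): an exact match on the
    # header text, raising ValueError when the header is missing.
    num = row[header.index(numerator)]
    den = row[header.index(denominator)]
    return num / den


# Original column order from the question.
header_a = ["Location", "Dates", "Units1", "Units2", "Avocado", "Banana"]
row_a = ["New York", "05-01-18", 10, 12, 1, 2]

# A user moves Avocado and Banana into columns C and D.
header_b = ["Location", "Dates", "Avocado", "Banana", "Units1", "Units2"]
row_b = ["New York", "05-01-18", 1, 2, 10, 12]

print(ratio_by_header(header_a, row_a))
print(ratio_by_header(header_b, row_b))
```

Both calls return the same value because the lookup follows the header, not the column letter. A missing header fails loudly instead of silently using the wrong column, much as MATCH returns #N/A in the sheet.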

Issue in Normalizer transformation in Informatica PowerCenter

I am trying to normalize the records of my source table using the Normalizer transformation in Informatica, but the sequence is not restarting for each source row.
Below is the source table:
Store_Name  Sales_Quarter1  Sales_Quarter2  Sales_Quarter3  Sales_Quarter4
DELHI       150             240             455             100
MUMBAI      100             500             350             340
Target table columns:
Store_name
Sales
Quarter
I set Occurrence to 4 on the Sales column to get the GCID Sales port.
For Quarter, I am using that GCID Sales column.
Output:
STORE_NAME  SALES_COLUMN  QUARTER
Mumbai      100           1
Mumbai      500           2
Mumbai      350           3
Mumbai      340           4
Delhi       150           5
Delhi       240           6
Delhi       455           7
Delhi       100           8
Why is the Quarter value not restarting from 1 for Delhi, and instead continuing from 5?
The Normalizer also has a GK column, which keeps a sequential number across all output rows. GCID is the right column here: it numbers the occurrences within each source row. So double-check that it is the GCID port, and not the GK port, that is linked to the QUARTER port in the target.
It would help to include a screenshot of the mapping and of the Normalizer transformation (Normalizer tab) to make the question more informative.
I assume you have the 'Store_Name' port at level 1 and the 'Sales_Quarter1', 'Sales_Quarter2', 'Sales_Quarter3' and 'Sales_Quarter4' ports grouped at level 2 on the Normalizer tab (using the >> button at the top left), with Occurrence set to 4 at group level for these four ports.
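The difference between the two generated columns, as described in the answer, can be sketched in Python. This only models the numbering behaviour under that description and is not Informatica code; the function name `normalize` and the dictionary keys are assumptions made for the illustration.

```python
def normalize(rows, occurs=4):
    """Emit one output row per repeated column, with both counters attached."""
    gk = 0  # runs across all output rows and never resets
    out = []
    for store, *sales in rows:
        # gcid restarts at 1 for every source row.
        for gcid, amount in enumerate(sales[:occurs], start=1):
            gk += 1
            out.append(
                {"STORE_NAME": store, "SALES": amount, "GK": gk, "GCID": gcid}
            )
    return out


source = [
    ("DELHI", 150, 240, 455, 100),
    ("MUMBAI", 100, 500, 350, 340),
]

normalized = normalize(source)
for record in normalized:
    print(record)
```

Linking the GK-style counter to QUARTER reproduces the 1 to 8 sequence shown in the question, while the GCID-style counter gives 1 to 4 for each store.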

Cumulatively Reduce Values in Column by Results of Another Column

I am dealing with a dataset that shows duplicate stock per part and location. Orders from multiple customers are coming in, and the stock was simply added via a VLOOKUP, so every order row for the same part and plant carries the same starting stock. I need help writing some sort of looping function in Python that cumulatively decreases the stock quantity by the order quantity.
Currently data looks like this:
SKU Plant Order Stock
0 5455 989 2 90
1 5455 989 15 90
2 5455 990 10 80
3 5455 990 20 80
I want to accomplish this:
SKU Plant Order Stock
0 5455 989 2 88
1 5455 989 15 73
2 5455 990 10 70
3 5455 990 20 50
Try:
df.Stock -= df.groupby(['SKU','Plant'])['Order'].cumsum()
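To show what the one-liner computes, here is a dependency-free sketch of the same logic: a running total of orders per (SKU, Plant) pair, subtracted from the stock on each row. The function name `remaining_stock` and the list-of-dicts input are assumptions made for the illustration; with pandas available, the groupby version above is the shorter route.

```python
orders = [
    {"SKU": 5455, "Plant": 989, "Order": 2, "Stock": 90},
    {"SKU": 5455, "Plant": 989, "Order": 15, "Stock": 90},
    {"SKU": 5455, "Plant": 990, "Order": 10, "Stock": 80},
    {"SKU": 5455, "Plant": 990, "Order": 20, "Stock": 80},
]


def remaining_stock(rows):
    """Return the stock left after each order, tracked per (SKU, Plant)."""
    running = {}  # (SKU, Plant) -> cumulative order quantity so far
    result = []
    for row in rows:
        key = (row["SKU"], row["Plant"])
        running[key] = running.get(key, 0) + row["Order"]
        result.append(row["Stock"] - running[key])
    return result


remaining = remaining_stock(orders)
print(remaining)
```

For the sample data this gives 88, 73, 70 and 50, matching the desired Stock column in the question.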

Extract a value from the chart, filtered by criteria in excel

I've got two sheets like this in excel :
Price chart :
                          **Post AB**        **Post Tenn**      **Post DN**
                          Price 10.1-10.20   Price 10.1-10.20   Price 10.1-20.1
CityOrigin   Destination  20 kg    40 kg     20 kg    40 kg     20 kg    40 kg
New York     Madrid       45       40        40       50        45       40
Los Angeles  Madrid       65       70        70       70        56       60
Oregon       Paris        89       100       110      105       74       98
Washington   Paris        34       80        45       65        45       69
and Working chart:
                                     Price Rate
Post Career  CityOrigin   Date      20KG  40KG
Post AB      New York     =Today()  ?     ?
Post Tenn    Los Angeles  "         ?     ?
Post DN      Oregon       "         ?     ?
Is it possible to use today's date, together with Post Career and CityOrigin, to extract only the rates that are valid today for 20 kg and 40 kg packages from the price chart sheet?
My ideal result would look like this:
                                     Price Rate
Post Career  CityOrigin   Date      20KG  40KG
Post AB      New York     10/20     40    45
Post Tenn    Los Angeles  10/20     70    70
Post DN      Oregon       10/20     74    98
Which function should I use to pull the price from the price sheet based on the date and Post Career? Multiple lookups?
Here is what I have so far. I don't know what the other date ranges will look like in your data structure, but this should give you something to work with.
The formula I entered in cell D13 is:
=INDEX($C$4:$H$7,MATCH($B13,$A$4:$A$7,0),MATCH($A13,$C$1:$H$1,0)+IF(D$12="20 kg",0,1))
It uses INDEX/MATCH to look up the row and column numbers: the first MATCH finds the city's row, the second finds the carrier's first column, and the IF shifts one column to the right for the 40 kg rate. Once other data comes into play, I can take another look if you can't find a way around it.
Please note that I have removed the * signs on row 1 so that MATCH can find the carrier name directly. Otherwise you would need an array formula, which is probably not the way you want to go.
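The row and column arithmetic in that formula can be sketched in Python as follows. The lists stand in for the ranges A4:A7, C1:H1 and C4:H7; the function name `rate` is hypothetical, and the empty strings in `carriers` assume that each carrier name sits only in the first column of its pair. Like the formula, the sketch ignores the date-range row.

```python
carriers = ["Post AB", "", "Post Tenn", "", "Post DN", ""]      # C1:H1
cities = ["New York", "Los Angeles", "Oregon", "Washington"]    # A4:A7
prices = [                                                      # C4:H7
    [45, 40, 40, 50, 45, 40],
    [65, 70, 70, 70, 56, 60],
    [89, 100, 110, 105, 74, 98],
    [34, 80, 45, 65, 45, 69],
]


def rate(carrier, city, weight):
    """Look up a price by city (row) and carrier plus weight (column)."""
    row = cities.index(city)  # MATCH on the city column
    # MATCH on the carrier row, then move one column right for 40 kg.
    col = carriers.index(carrier) + (0 if weight == "20 kg" else 1)
    return prices[row][col]


print(rate("Post AB", "New York", "20 kg"))
print(rate("Post DN", "Oregon", "40 kg"))
```

Handling the validity dates would need one more step that picks the block whose date range contains today's date, which depends on how further ranges are laid out in the sheet.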
