Assistance with Power Query to select rows and transpose - excel

I have close to 400 Excel files that contain client pricing information. The job is to transform these into multiple database tables to load into Azure SQL Database, and I'm responsible for the transformation.
The existing data follows a consistent logic: one workbook per client, and a single client can have multiple ICPs (shown in the example). I'm looking for help using Power Query to transform the data from the source format to the target format. Once in the target format, it will be bulk loaded into SQL Server for additional transformation into the required SQL tables.
Rows that need to be selected
Customer Name
Supply Address
ICP
Any subsequent row within the selected range (between ICPs) that contains data in the "Old Rates" or "New Rates" columns. Even if a value is stored in only one of the two, both columns for that row need to be extracted.
Example of source and target
Source and Target
Having to post as picture as I keep getting an error message about code not being formatted correctly, yet there is no code. Only two tables

First, index your table:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
rowscount = List.Count(List.Distinct(Source[type])),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"old rate", Percentage.Type}, {"new rate", Percentage.Type}}),
#"Replaced Value" = Table.ReplaceValue(#"Changed Type",null,0,Replacer.ReplaceValue,{"type", "detail", "old rate", "new rate"}),
#"Added Index" = Table.AddIndexColumn(#"Replaced Value", "Index", 0, 1, Int64.Type),
#"Integer-Divided Column" = Table.TransformColumns(#"Added Index", {{"Index", each Number.IntegerDivide(_, rowscount), Int64.Type}})
in
#"Integer-Divided Column"
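To see what the integer-divided index does, here is a minimal sketch on a made-up table (the sample rows and step names are hypothetical, not taken from the workbook). It assumes every client block contains each type exactly once, so the number of distinct types equals the block size:

```powerquery
let
    // Hypothetical sample: two clients, three rows per client block
    Sample = #table(
        {"type", "detail"},
        {
            {"customer name", "Client A"},
            {"supply address", "1 Main St"},
            {"ICP", "0001"},
            {"customer name", "Client B"},
            {"supply address", "2 High St"},
            {"ICP", "0002"}
        }
    ),
    // Rows per block, assuming each type appears exactly once per block
    RowsPerBlock = List.Count(List.Distinct(Sample[type])),
    Indexed = Table.AddIndexColumn(Sample, "Index", 0, 1, Int64.Type),
    // Rows 0..2 become block 0, rows 3..5 become block 1
    Blocked = Table.TransformColumns(
        Indexed,
        {{"Index", each Number.IntegerDivide(_, RowsPerBlock), Int64.Type}}
    )
in
    Blocked
```

If the blocks differ in length, the block numbers drift out of step with the data, so this only fits evenly sized blocks.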
Then you can pivot and merge the other data:
let
Source = Table1,
#"Removed Columns" = Table.RemoveColumns(Source,{"old rate", "new rate"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Columns", each ([type] = "customer name" or [type] = "ICP" or [type] = "supply address")),
#"Pivoted Column" = Table.Pivot(#"Filtered Rows", List.Distinct(#"Filtered Rows"[#"type"]), "type", "detail"),
#"Merged Queries" = Table.NestedJoin(#"Pivoted Column", {"Index"}, Table1, {"Index"}, "Table1", JoinKind.LeftOuter),
#"Expanded Table1" = Table.ExpandTableColumn(#"Merged Queries", "Table1", {"type", "old rate", "new rate"}, {"type", "old rate", "new rate"}),
#"Filtered Rows1" = Table.SelectRows(#"Expanded Table1", each ([type] = "Anytime" or [type] = "Day" or [type] = "EA Levy")),
#"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows1",{"Index"})
in
#"Removed Columns1"
See also the sample file.
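If the number of rate rows varies between ICPs, a fixed block size will not line up. One possible alternative, sketched here on hypothetical sample data and not part of the answer above, is to mark each "customer name" row with its index and fill that marker down, so every row carries the number of the block it belongs to:

```powerquery
let
    // Hypothetical sample: blocks of different lengths
    Sample = #table(
        {"type", "detail"},
        {
            {"customer name", "Client A"},
            {"ICP", "0001"},
            {"Anytime", null},
            {"customer name", "Client B"},
            {"ICP", "0002"},
            {"Day", null},
            {"EA Levy", null}
        }
    ),
    Indexed = Table.AddIndexColumn(Sample, "Index", 0, 1, Int64.Type),
    // Only rows that start a block get a value; all other rows get null
    Marked = Table.AddColumn(
        Indexed,
        "Block",
        each if [type] = "customer name" then [Index] else null,
        Int64.Type
    ),
    // Fill the block marker down over the rows that follow it
    Filled = Table.FillDown(Marked, {"Block"})
in
    Filled
```

The "Block" column can then play the role of "Index" in the pivot and merge steps. The assumption here is that every block starts with a "customer name" row.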

Related

Excel (or) Power BI, Rolling Sum

Is there any way in an Excel PivotTable or Power BI to compute a rolling sum of the given data (let's say monthly)?
Let's say I have a list of cases, where each row holds a case count and an amount. The project start and end dates vary as follows.
For simplicity, if I show the data graphically, it looks as follows.
What I'm trying to do is aggregate the total case count and amount for each month.
My goal is to produce the list below directly using a PivotTable (or, if that is not possible, Power Query).
I could produce the monthly aggregates using the FILTER function and SUM, then pivot that data to produce the result above.
If there is a direct way to produce those aggregates in one step, that would be better. Please suggest one.
Please see sample data in below link
https://docs.google.com/spreadsheets/d/1vAKElb2-V_If-MMlPwHk_VGhYr8pkOg_gQfRYRrkbtc/edit?usp=share_link
Excel file in Zip
https://drive.google.com/file/d/1QqgNUrJlBuvin7iecsxsvexrGZXFIt-g/view?usp=share_link
Thank you in advance
LuZ
You can load the data into Power Query and transform the table on the left into the data table on the right.
The code for that is:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Custom1" = Table.AddColumn(Source, "Date", each List.Generate(()=>[x=[Start Date],i=0], each [i]<12, each [i=[i]+1,x=Date.AddMonths([x],1)], each [x])),
#"Expanded Custom" = Table.ExpandListColumn(#"Added Custom1", "Date"),
#"Added Custom" = Table.AddColumn(#"Expanded Custom", "Year", each Date.Year([Date])),
#"Added Custom2" = Table.AddColumn(#"Added Custom", "Month", each Date.Month([Date])),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom2",{"Start Date", "End Date", "Date"})
in #"Removed Columns"
Afterwards, load the query back into Excel as a PivotTable report and generate your table.
Alternatively, just use
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Custom1" = Table.AddColumn(Source, "Date", each List.Generate(()=>[x=[Start Date],i=0], each [i]<12, each [i=[i]+1,x=Date.AddMonths([x],1)], each [x])),
#"Expanded Custom" = Table.ExpandListColumn(#"Added Custom1", "Date"),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"Start Date", "End Date"}),
#"Grouped Rows" = Table.Group(#"Removed Columns", {"Date"}, {{"Amount", each List.Sum([Amount]), type number}, {"Case Count", each List.Sum([Case Count]), type number}}),
#"Changed Type" = Table.TransformColumnTypes(#"Grouped Rows",{{"Date", type date}, {"Amount", type number}, {"Case Count", type number}})
in #"Changed Type"
to generate this table, then graph it.
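Both queries above always generate 12 months from the start date. If a case should instead count in every month from its start month through its end month, a variant along these lines might work. This is a sketch on hypothetical sample rows, and the "Month" column name is my own:

```powerquery
let
    // Hypothetical sample rows using the column names from the queries above
    Sample = #table(
        type table [#"Start Date" = date, #"End Date" = date, Amount = number, #"Case Count" = number],
        {
            {#date(2023, 1, 15), #date(2023, 3, 10), 100, 1},
            {#date(2023, 2, 1), #date(2023, 4, 30), 50, 2}
        }
    ),
    // One list entry per month, from the start month through the end month
    AddMonths = Table.AddColumn(
        Sample,
        "Month",
        each List.Generate(
            () => Date.StartOfMonth([Start Date]),
            (m) => m <= Date.StartOfMonth([End Date]),
            (m) => Date.AddMonths(m, 1)
        ),
        type list
    ),
    Expanded = Table.ExpandListColumn(AddMonths, "Month"),
    // Total amount and case count per month
    Grouped = Table.Group(
        Expanded,
        {"Month"},
        {
            {"Amount", each List.Sum([Amount]), type number},
            {"Case Count", each List.Sum([Case Count]), type number}
        }
    )
in
    Grouped
```

With these two sample rows, January gets 100 / 1, February and March get 150 / 3, and April gets 50 / 2.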

How to group data from a column into rows based on prefix in Excel

I have a set of data in an Excel column, for example 1000_1.jpg, 1000_2.jpg, 1001_1.jpg ... I am looking to convert this data into rows based on the prefix of each file name, i.e. 1000, 1001, etc.
I tried the formula given by @Tom in the guide "how to group data from a column into rows based on content", but it only works on a small set of data, which I tested on 10,000 rows. When I test it on the whole Excel sheet, the same formula returns 0.
I am attaching a link to the Excel file here: https://drive.google.com/file/d/1vfEFh2idNpB_gMiMWPhXY2JTsAALtxS0/view?usp=sharing
The expected result is the same as in the referenced question.
This is done quite quickly with Power Query.
I'm not that good at it, and it can probably be done more elegantly, but it works like this:
let
Source = Table.FromColumns({Lines.FromBinary(File.Contents("D:\OneDrive\Desktop\images names.csv"))}),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Images Names", type text}}),
#"Inserted Text Between Delimiters" = Table.AddColumn(#"Changed Type", "Text Between Delimiters", each Text.BetweenDelimiters([Images Names], "_", "."), type text),
#"Changed Type1" = Table.TransformColumnTypes(#"Inserted Text Between Delimiters",{{"Text Between Delimiters", Int64.Type}}),
#"Inserted Text Before Delimiter" = Table.AddColumn(#"Changed Type1", "Text Before Delimiter", each Text.BeforeDelimiter([Images Names], "_"), type text),
#"Sorted Rows" = Table.Sort(#"Inserted Text Before Delimiter",{{"Text Before Delimiter", Order.Ascending}, {"Text Between Delimiters", Order.Ascending}}),
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Sorted Rows", {{"Text Between Delimiters", type text}}, "de-DE"), List.Distinct(Table.TransformColumnTypes(#"Sorted Rows", {{"Text Between Delimiters", type text}}, "de-DE")[#"Text Between Delimiters"]), "Text Between Delimiters", "Images Names"),
#"Filtered Rows" = Table.SelectRows(#"Pivoted Column", each ([2] <> null)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Text Before Delimiter"})
in
#"Removed Columns"
82505 rows and 40 columns
I uploaded the file here:
https://1drv.ms/x/s!AncAhUkdErOkgvInwjmn2ETvY0-ysA?e=Rnpxp1
If you try to load it from Power Query into Excel, it might not work, because the original name list is not available. But you can see how I did it, and if you only need to do this job once, you can just download the pasted values.
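The core of the query above is splitting each file name into a prefix and a sequence number, then pivoting on the number. A condensed sketch on three hypothetical file names (the "Prefix" and "Suffix" column names are mine) looks like this:

```powerquery
let
    // Hypothetical sample of file names
    Sample = #table({"Images Names"}, {{"1000_1.jpg"}, {"1000_2.jpg"}, {"1001_1.jpg"}}),
    // Text before the underscore identifies the group
    AddPrefix = Table.AddColumn(
        Sample, "Prefix", each Text.BeforeDelimiter([Images Names], "_"), type text
    ),
    // Text between the underscore and the dot becomes the output column name
    AddSuffix = Table.AddColumn(
        AddPrefix, "Suffix", each Text.BetweenDelimiters([Images Names], "_", "."), type text
    ),
    // One row per prefix, one column per sequence number
    Pivoted = Table.Pivot(AddSuffix, List.Distinct(AddSuffix[Suffix]), "Suffix", "Images Names")
in
    Pivoted
```

Here prefix 1000 gets both file names on one row, and prefix 1001 gets a null in column "2". Note that the full query above also filters out rows where column 2 is null, which drops prefixes that have only one image.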

Excel: arrange prescription journey based on dates and IDs

I am trying to arrange each patient's journey based on the first regimen, then the second regimen, and so on.
However, after sorting data based on date:
I tried using the following IF formula, but it does not work correctly (it worked for an ID with three rows, without A5 in the formula):
=IF(AND(A2=A3,A3=A4,A4=A5,(C5-C4<=30),(C4-C3<=30),D3>(C4+30),D2>(C3+30)),B2&", "&B3&", "&B4&", "&B5,
IF(AND(A2=A3,A3=A4,A4=A5,(C4-C3<=30),D2>(C3+30)),B2&", "&B3&", "&B4,
IF(AND(A2=A3,A3=A4,A4=A5,(C4-C3<=30),D2<(C3+30)),B3&", "&B4,
IF(AND(A2=A3,A3=A4,A4=A5,(C4-C3>30),D2<(C3+30)), B3, "N"))))
I need results similar to the following:
Is there a formula that can do this, or any other way to get similar results?
I had been working on Power Query code, so I will present that first.
You can adapt the same algorithm to VBA if you prefer. I would probably use nested dictionaries and/or a class module to accomplish it effectively.
To use Power Query
Select some cell in your Data Table
Data => Get&Transform => from Table/Range or from within sheet
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Read the comments and explore the Applied Steps to understand the algorithm
M Code
let
//Change next line to reflect your data source
Source = Excel.CurrentWorkbook(){[Name="Drugs"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"CATEGORY", type text},
{"F.DATE", type date}, {"L.DATE", type date}}),
//Group by ID, then run fnJourney custom function on each subtable to return results
#"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {
{"Count", each fnJourney(_)}}),
//Expand results
#"Expanded Count" = Table.ExpandListColumn(#"Grouped Rows", "Count"),
#"Expanded Count1" = Table.ExpandRecordColumn(#"Expanded Count", "Count", {"Year", "Reg"}),
//Pivot on Year column with no aggregation
#"Year Headers" = List.Sort(List.Distinct(Table.TransformColumnTypes(#"Expanded Count1", {{"Year", type text}}, "en-US")[Year])),
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Expanded Count1", {{"Year", type text}}, "en-US"),
#"Year Headers", "Year", "Reg"),
#"Changed Type1" = Table.TransformColumnTypes(#"Pivoted Column", List.Transform(#"Year Headers", each {_, type text})),
//Join with original table
join = Table.NestedJoin(#"Changed Type", "ID", #"Changed Type1","ID", "Joined", JoinKind.LeftOuter),
//add shifted ID column to decide if the joined table should be retained or deleted
#"Shifted ID" = Table.FromColumns(Table.ToColumns(join) & {{null} & List.RemoveLastN(join[ID])},
type table[ID=Int64.Type, CATEGORY=text, F.DATE=date, L.DATE=date, joined=table, shiftedID=Int64.Type]),
#"Added Custom" = Table.AddColumn(#"Shifted ID", "Custom", each if [shiftedID] <> [ID] then [joined] else null, type nullable table),
//Remove shifted and Joined columns
//Then expand the tables in the Custom Column
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"joined", "shiftedID"}),
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Columns", "Custom", #"Year Headers"),
#"Changed Type2" = Table.TransformColumnTypes(#"Expanded Custom", List.Transform(#"Year Headers", each {_, type text}))
in
#"Changed Type2"
Custom Function Code
Create a new blank query and paste the code below
//Rename this query fnJourney
(tbl as table) =>
let
//Source = Drugs,
Source = tbl,
Years = {List.Min(List.Transform(Source[F.DATE], each Date.Year(_)))..
List.Max(List.Transform(Source[L.DATE], each Date.Year(_)))},
inJourney = List.Generate(
()=>[yr=Years{0},
reg=Text.Combine(Table.SelectRows(Source,
each Years{0} >= Date.Year([F.DATE])
and Years{0} <= Date.Year([L.DATE]))[CATEGORY],", "),
idx=0],
each [idx] < List.Count(Years),
each [yr=Years{[idx]+1},
reg=Text.Combine(Table.SelectRows(Source,
(r)=>Years{[idx]+1} >= Date.Year(r[F.DATE])
and Years{[idx]+1} <= Date.Year(r[L.DATE]))[CATEGORY],", "),
idx=[idx]+1],
each Record.FromList({[yr], [reg]},{"Year","Reg"})
)
in
inJourney
Before
After
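The heart of fnJourney is an overlap test: a regimen belongs to a year when that year lies between the year of F.DATE and the year of L.DATE. A minimal sketch of that test on hypothetical rows, using List.Transform in place of the List.Generate loop above, could look like this:

```powerquery
let
    // Hypothetical rows for a single ID
    Sample = #table(
        type table [CATEGORY = text, F.DATE = date, L.DATE = date],
        {
            {"A", #date(2019, 3, 1), #date(2020, 2, 1)},
            {"B", #date(2020, 6, 1), #date(2021, 1, 15)}
        }
    ),
    // Every year from the earliest start to the latest end
    Years = {
        List.Min(List.Transform(Sample[F.DATE], Date.Year))
        .. List.Max(List.Transform(Sample[L.DATE], Date.Year))
    },
    // For each year, combine the categories whose date range touches that year
    Journey = List.Transform(
        Years,
        (y) => [
            Year = y,
            Reg = Text.Combine(
                Table.SelectRows(
                    Sample,
                    (r) => Date.Year(r[F.DATE]) <= y and y <= Date.Year(r[L.DATE])
                )[CATEGORY],
                ", "
            )
        ]
    )
in
    Table.FromRecords(Journey)
```

For these rows, 2019 gives "A", 2020 gives "A, B", and 2021 gives "B".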

Pivot columns with multiple instance (rows) of attribute

I've searched far and wide and haven't found an answer to this specific case, and I wasn't able to adapt the solutions I did find.
First of all, my data is a long list of attributes and their values for every product, structured like this:
Structured Initial Data
Note that some products have a single value per attribute, but (and here's my problem) some products have different values for the same attribute.
When I pivot the table in Power Query, I get errors where a product has multiple instances of the same attribute.
The resulting table that i'm looking for would be structured like this:
Structured Final Data
Thank you for your help!
See if this works for you
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Sorted Rows" = Table.Sort(Source,{{"Products", Order.Ascending}, {"Attributes", Order.Ascending}}),
#"Grouped Rows" = Table.Group(#"Sorted Rows", {"Products"}, {{"data", each _, type table}}),
#"Added Index1" = Table.AddIndexColumn(#"Grouped Rows", "Index", 0, 1),
#"Expanded data" = Table.ExpandTableColumn(#"Added Index1", "data", {"Attributes", "Values"}, {"Attributes", "Values"}),
mGroup = Table.Group(#"Expanded data" , {"Attributes","Products"}, {{"GRP", each Table.AddIndexColumn(_, "Index2", 1, 1), type table}}),
#"Expanded GRP" = Table.ExpandTableColumn(mGroup, "GRP", {"Values", "Index", "Index2"}, {"Values", "Index", "Index2"}),
#"Added Custom" = Table.AddColumn(#"Expanded GRP", "Row#", each [Index]+[Index2]),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Index", "Index2"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Attributes]), "Attributes", "Values"),
#"Removed Columns1" = Table.RemoveColumns(#"Pivoted Column",{"Row#"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Columns1",{"Products", "Each", "Pack"})
in #"Reordered Columns"
It groups on product and adds an index. Then it groups on product and attribute and adds another index. The sum of the two gives a row number that, together with the product, identifies each output row, so you can use it for pivoting.
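As a smaller illustration of why the extra index avoids the pivot error, here is a sketch on hypothetical sample rows that numbers repeated attributes within each product and pivots with that number as an additional row key. The "Occurrence" column name is my own, and this sketch uses only the second index, which seems to be enough because "Products" stays in the table as a key:

```powerquery
let
    // Hypothetical sample: P1 has two values for "Pack"
    Sample = #table(
        {"Products", "Attributes", "Values"},
        {
            {"P1", "Each", "1"},
            {"P1", "Pack", "6"},
            {"P1", "Pack", "12"},
            {"P2", "Each", "1"}
        }
    ),
    // Number the repeated attributes within each product: 1, 2, ...
    Grouped = Table.Group(
        Sample,
        {"Products", "Attributes"},
        {{"GRP", each Table.AddIndexColumn(_, "Occurrence", 1, 1), type table}}
    ),
    Expanded = Table.ExpandTableColumn(Grouped, "GRP", {"Values", "Occurrence"}),
    // Products and Occurrence remain as row keys, so duplicates land on separate rows
    Pivoted = Table.Pivot(Expanded, List.Distinct(Expanded[Attributes]), "Attributes", "Values"),
    Result = Table.RemoveColumns(Pivoted, {"Occurrence"})
in
    Result
```

P1 ends up on two rows (Each = 1 with Pack = 6, then a null with Pack = 12), and P2 on one.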

Power Query transpose and pivot list

I have the following list in Excel Powerquery:
I would like to transform this list within the Power Query editor to make the following list:
Assuming a source table called Table1, you could use this (in the advanced editor):
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
Buffered = Table.Buffer(Source),
#"Grouped Rows" = Table.Group(Buffered, {"value"}, {{"AllRows", each Table.AddIndexColumn(_, "Record", 1, 1), type table}}),
#"Expanded AllRows" = Table.ExpandTableColumn(#"Grouped Rows", "AllRows", {"value2", "Record"}, {"value2", "Record"}),
#"Pivoted Column" = Table.Pivot(#"Expanded AllRows", List.Distinct(#"Expanded AllRows"[value]), "value", "value2"),
#"Removed Columns" = Table.RemoveColumns(#"Pivoted Column",{"Record"})
in
#"Removed Columns"
