I have data structured in Excel in the following format:
I want to transform it into this. In simple words: for each ID, I want to record the difference in value from the previous day, and if there is no value for the previous day, just keep the current value.
As an intermediate step I am trying to transform the raw data into something like this, but I am not sure how to go about it with simple Excel pivot tables or Power Query transformations.
There is something wrong with your sample, because [v1-v2] is not the same method as [v5-v4, v3-v2, v8-v7], but I assume the latter ones are right.
See if this works for you.
It assumes the data is in 3 columns in a range named Table1 with column headers Dates, ID, Value.
You can paste it into Power Query using ... Advanced Editor ...
It creates a column with yesterday's value for that ID (null if nothing is found, which is then replaced with 0), does the subtraction, and pivots.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Dates", type date}, {"ID", type text}, {"Value", Int64.Type}}),
Yesterday = Table.AddColumn(#"Changed Type" , "Yesterday", (i) => List.Sum(Table.SelectRows( #"Changed Type", each ([ID] = i[ID] and Date.AddDays([Dates],1) = i[Dates]))[Value]), type number ),
#"Replaced Value" = Table.ReplaceValue(Yesterday,null,0,Replacer.ReplaceValue,{"Yesterday"}),
#"Added Custom" = Table.AddColumn(#"Replaced Value", "Custom", each [Value]-[Yesterday]),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Value", "Yesterday"}),
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Removed Columns", {{"Dates", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(#"Removed Columns", {{"Dates", type text}}, "en-US")[Dates]), "Dates", "Custom", List.Sum)
in #"Pivoted Column"
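For readers who want to check the logic outside Power Query, here is a minimal Python sketch of the same idea. It is not the query above: the function name and the sample rows are made up for illustration, and it assumes there is at most one row per ID and date.

```python
from datetime import date, timedelta

def daily_differences(rows):
    """rows: iterable of (day, id, value) with one row per ID and day (assumption).

    Returns {(id, day): value minus the previous day's value for that ID}.
    """
    values = {(key, day): value for day, key, value in rows}
    result = {}
    for (key, day), value in values.items():
        # A missing previous day counts as 0, so the current value is kept.
        previous = values.get((key, day - timedelta(days=1)), 0)
        result[(key, day)] = value - previous
    return result

# Hypothetical sample data, not taken from the question.
sample = [
    (date(2021, 1, 1), "A", 10),
    (date(2021, 1, 2), "A", 15),
    (date(2021, 1, 4), "A", 7),   # no row for Jan 3, so 7 is kept as-is
    (date(2021, 1, 1), "B", 3),
]
diffs = daily_differences(sample)
```

Pivoting the result into one column per date is then only a matter of presentation.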
I have a set of data of non-trivial size that I am trying to transform in Power Query. The values of one column (say, "Column_1") hold several dimensions of data that are not consistently delimited in any way. I want to apply formulas to this column to do the following:
With reference to various separate tables (say, "Lookup_n"), each listing all possible values for a given dimension, identify whether a substring contained in a lookup table is present in the data in Column_1.
If it is present, insert that substring into a new column specific to that dimension, and remove it from the data in Column_1.
Here is an example of what I would like to have happen:
Sample Output
I am fairly new to Power Query so don't really know where to begin in formulating a solution to this. I would be very interested to hear if there is an easier way to accomplish this than using the method I have described.
Thanks!
In Power Query, try this code for the input after creating query lookup_1 (with column name lookup_1), query lookup_2 (with column name lookup_2) and query lookup_3 (with column name lookup_3).
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
Lookup = Table.UnpivotOtherColumns( Table.Combine({lookup_3, lookup_2, lookup_1}),{} , "Attribute", "Value"),
#"Added Custom" = Table.AddColumn(Source,"custom",(i)=>(Table.SelectRows(Lookup, each Text.Contains(i[Column_1],[Value])))),
Expanded = Table.ExpandTableColumn(#"Added Custom", "custom", {"Attribute", "Value"}, {"Attribute", "Value"}),
#"Changed Type1" = Table.TransformColumnTypes(Expanded,{{"Column_1", type text}, {"Attribute", type text}, {"Value", type text}}),
#"Replaced Value" = Table.ReplaceValue(#"Changed Type1",null,"<none>",Replacer.ReplaceValue,{"Attribute", "Value"}),
#"Pivoted Column" = Table.Pivot(#"Replaced Value", List.Distinct(#"Replaced Value"[Attribute]), "Attribute", "Value"),
#"Removed Columns" = Table.RemoveColumns(#"Pivoted Column",{"<none>"})
in #"Removed Columns"
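As a rough cross-check of the matching logic described in the question, here is a small Python sketch. It is not a translation of the query above: the function name, the dimension names and the sample values are all hypothetical, and unlike the query it also removes each matched substring from the original text, as the question asks.

```python
def split_dimensions(text, lookups):
    """Return (remaining_text, {dimension: matched value or None}).

    lookups maps a dimension name to the list of its possible values.
    Matching is case-sensitive.
    """
    found = {}
    remaining = text
    for dimension, values in lookups.items():
        # Try longer values first so a longer entry wins over its own substring.
        candidates = sorted(values, key=len, reverse=True)
        match = next((v for v in candidates if v in remaining), None)
        found[dimension] = match
        if match is not None:
            remaining = remaining.replace(match, "", 1)
    # Collapse the whitespace left behind by the removed substrings.
    return " ".join(remaining.split()), found

# Hypothetical lookup tables and input value.
lookups = {"colour": ["red", "blue"], "size": ["small", "large"]}
remaining, dims = split_dimensions("large red widget", lookups)
```

If two dimensions can share a value, the order of the lookup tables decides which one claims it, so that case would need an explicit rule.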
I have a set of data in an Excel column, for example 1000_1.jpg, 1000_2.jpg, 1001_1.jpg ... I am looking to convert this data into rows based on the prefix of each file name, i.e. 1000, 1001 etc.
I have tried the formula given by #Tom in "how to group data from a column into rows based on content", but it only works on a small set of data (I tested it on 10,000 rows). When I test it on the whole Excel sheet, the same formula returns 0.
I am attaching excel file link here: https://drive.google.com/file/d/1vfEFh2idNpB_gMiMWPhXY2JTsAALtxS0/view?usp=sharing
The expected result is the same as in the referenced question.
This can be done quite quickly with Power Query.
I'm not so good at it, and it can probably be done more elegantly, but it works like this:
let
Source = Table.FromColumns({Lines.FromBinary(File.Contents("D:\OneDrive\Desktop\images names.csv"))}),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Images Names", type text}}),
#"Inserted Text Between Delimiters" = Table.AddColumn(#"Changed Type", "Text Between Delimiters", each Text.BetweenDelimiters([Images Names], "_", "."), type text),
#"Changed Type1" = Table.TransformColumnTypes(#"Inserted Text Between Delimiters",{{"Text Between Delimiters", Int64.Type}}),
#"Inserted Text Before Delimiter" = Table.AddColumn(#"Changed Type1", "Text Before Delimiter", each Text.BeforeDelimiter([Images Names], "_"), type text),
#"Sorted Rows" = Table.Sort(#"Inserted Text Before Delimiter",{{"Text Before Delimiter", Order.Ascending}, {"Text Between Delimiters", Order.Ascending}}),
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Sorted Rows", {{"Text Between Delimiters", type text}}, "de-DE"), List.Distinct(Table.TransformColumnTypes(#"Sorted Rows", {{"Text Between Delimiters", type text}}, "de-DE")[#"Text Between Delimiters"]), "Text Between Delimiters", "Images Names"),
#"Filtered Rows" = Table.SelectRows(#"Pivoted Column", each ([2] <> null)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Text Before Delimiter"})
in
#"Removed Columns"
82505 rows and 40 columns
I uploaded the file here:
https://1drv.ms/x/s!AncAhUkdErOkgvInwjmn2ETvY0-ysA?e=Rnpxp1
If you try to load it from Power Query to Excel it might not work, because the original name list is not available. But you can see how I did it, and if you only need to do this job once, you can simply download the pasted values.
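If it helps to see the grouping step in isolation, here is a minimal Python sketch of the same idea: split each name at the underscore, sort by the number after it, and collect the names per prefix. The function name and the sample names are assumptions for illustration, and unlike the query above it keeps prefixes that have only one file.

```python
from collections import defaultdict

def group_by_prefix(names):
    """Group file names like '1000_2.jpg' by the text before the underscore."""
    groups = defaultdict(list)
    for name in names:
        prefix, _, rest = name.partition("_")
        # The number between "_" and "." decides the position within the row.
        index = int(rest.split(".", 1)[0])
        groups[prefix].append((index, name))
    return {
        prefix: [name for _, name in sorted(items)]
        for prefix, items in sorted(groups.items())
    }

# Hypothetical sample names in the format described in the question.
names = ["1000_2.jpg", "1000_1.jpg", "1001_1.jpg", "1000_10.jpg"]
rows = group_by_prefix(names)
```

Sorting on the numeric index rather than the text keeps 1000_10.jpg after 1000_2.jpg.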
I am trying to arrange each patient's journey by first regimen, then second regimen, and so on.
However, after sorting data based on date:
I tried using an IF formula as follows, but it does not work correctly (it worked for IDs with three rows, without A5 in the formula):
=IF(AND(A2=A3,A3=A4,A4=A5,(C5-C4<=30),(C4-C3<=30),D3>(C4+30),D2>(C3+30)),B2&", "&B3&", "&B4&", "&B5,
IF(AND(A2=A3,A3=A4,A4=A5,(C4-C3<=30),D2>(C3+30)),B2&", "&B3&", "&B4,
IF(AND(A2=A3,A3=A4,A4=A5,(C4-C3<=30),D2<(C3+30)),B3&", "&B4,
IF(AND(A2=A3,A3=A4,A4=A5,(C4-C3>30),D2<(C3+30)), B3, "N"))))
I need results similar to the following:
Is there a formula that would help to do this, or any other way to get similar results?
I had been working on Power Query code so I will present that first.
You can adapt the same algorithm to use in VBA, if you prefer. I would probably use nested dictionaries and/or a class module to accomplish it effectively.
To use Power Query
Select some cell in your Data Table
Data => Get&Transform => from Table/Range or from within sheet
When the PQ Editor opens: Home => Advanced Editor
Make note of the Table Name in Line 2
Paste the M Code below in place of what you see
Change the Table name in line 2 back to what was generated originally.
Read the comments and explore the Applied Steps to understand the algorithm
M Code
let
//Change next line to reflect your data source
Source = Excel.CurrentWorkbook(){[Name="Drugs"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"CATEGORY", type text},
{"F.DATE", type date}, {"L.DATE", type date}}),
//Group by ID, then run fnJourney custom function on each subtable to return results
#"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {
{"Count", each fnJourney(_)}}),
//Expand results
#"Expanded Count" = Table.ExpandListColumn(#"Grouped Rows", "Count"),
#"Expanded Count1" = Table.ExpandRecordColumn(#"Expanded Count", "Count", {"Year", "Reg"}),
//Pivot on Year column with no aggregation
#"Year Headers" = List.Sort(List.Distinct(Table.TransformColumnTypes(#"Expanded Count1", {{"Year", type text}}, "en-US")[Year])),
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Expanded Count1", {{"Year", type text}}, "en-US"),
#"Year Headers", "Year", "Reg"),
#"Changed Type1" = Table.TransformColumnTypes(#"Pivoted Column", List.Transform(#"Year Headers", each {_, type text})),
//Join with original table
join = Table.NestedJoin(#"Changed Type", "ID", #"Changed Type1","ID", "Joined", JoinKind.LeftOuter),
//add shifted ID column to decide if the joined table should be retained or deleted
#"Shifted ID" = Table.FromColumns(Table.ToColumns(join) & {{null} & List.RemoveLastN(join[ID])},
type table[ID=Int64.Type, CATEGORY=text, F.DATE=date, L.DATE=date, joined=table, shiftedID=Int64.Type]),
#"Added Custom" = Table.AddColumn(#"Shifted ID", "Custom", each if [shiftedID] <> [ID] then [joined] else null, type nullable table),
//Remove shifted and Joined columns
//Then expand the tables in the Custom Column
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"joined", "shiftedID"}),
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Columns", "Custom", #"Year Headers"),
#"Changed Type2" = Table.TransformColumnTypes(#"Expanded Custom", List.Transform(#"Year Headers", each {_, type text}))
in
#"Changed Type2"
Custom Function Code
Create a new blank query and paste the code below
//Rename this query fnJourney
(tbl as table) =>
let
//Source = Drugs,
Source = tbl,
Years = {List.Min(List.Transform(Source[F.DATE], each Date.Year(_)))..
List.Max(List.Transform(Source[L.DATE], each Date.Year(_)))},
inJourney = List.Generate(
()=>[yr=Years{0},
reg=Text.Combine(Table.SelectRows(Source,
each Years{0} >= Date.Year([F.DATE])
and Years{0} <= Date.Year([L.DATE]))[CATEGORY],", "),
idx=0],
each [idx] < List.Count(Years),
each [yr=Years{[idx]+1},
reg=Text.Combine(Table.SelectRows(Source,
(r)=>Years{[idx]+1} >= Date.Year(r[F.DATE])
and Years{[idx]+1} <= Date.Year(r[L.DATE]))[CATEGORY],", "),
idx=[idx]+1],
each Record.FromList({[yr], [reg]},{"Year","Reg"})
)
in
inJourney
Before
After
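To make the core of fnJourney easier to follow, here is a minimal Python sketch of the same per-ID step: for every year between the earliest F.DATE and the latest L.DATE, combine the categories whose date range covers that year. The function name and the sample rows are hypothetical; the grouping by ID, the pivot and the join back to the original table are not shown.

```python
from datetime import date

def journey_by_year(rows):
    """rows: list of (category, first_date, last_date) for a single ID.

    Returns {year: "cat1, cat2"} for each year in the overall range.
    """
    first_year = min(first.year for _, first, _ in rows)
    last_year = max(last.year for _, _, last in rows)
    return {
        year: ", ".join(
            category
            for category, first, last in rows
            if first.year <= year <= last.year
        )
        for year in range(first_year, last_year + 1)
    }

# Hypothetical rows for one patient.
drugs = [
    ("A", date(2018, 3, 1), date(2019, 6, 30)),
    ("B", date(2019, 5, 1), date(2019, 12, 31)),
    ("C", date(2021, 1, 1), date(2021, 2, 1)),
]
journey = journey_by_year(drugs)
```

A year with no active regimen comes back as an empty string, which is also what Text.Combine returns for an empty list.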
Hello all you power query wizards,
I have a similar question to this question: Timeseries with overlapping timeframes, using just the most recent in Excel Power Query, except my column isn't just a date column, but instead a date/time column. I am bringing together a directory of files that look like this and have overlapping times but I only want to keep the newer data instead of combining them together:
List A
List B
Does anyone have a strategy to accomplish this goal, or is this something I should do outside of Power Query, for example in Python?
Many thanks in advance for any insight you can provide!
let
Source = Folder.Files("C:\Users\xxxx\OneDrive\Documents\Atom Projects\10MinOrtho\2. Orthometric\2021-06\10MinOrthos"),
#"Filtered Hidden Files1" = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
#"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File (2)", each #"Transform File (2)"([Content])),
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File (2)"}),
#"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File (2)", Table.ColumnNames(#"Transform File (2)"(#"Sample File (2)"))),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded Table Column1",{{"Source.Name", type text}, {"Column1", type date}, {"Column2", type time}, {"Column3", type number}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Source.Name"}),
#"Merged Date and Time" = Table.CombineColumns(#"Removed Columns", {"Column1", "Column2"}, (columns) => List.First(columns) & List.Last(columns), "Merged"),
#"Sorted Rows" = Table.Sort(#"Merged Date and Time",{{"Merged", Order.Ascending}})
in
#"Sorted Rows"
You don't describe exactly what you want to do with the overlapped times.
I suggest removing the entries from List A that are in the overlap region with List B.
This can be done with a simple filter based on the first time listed in List B.
I have assumed that List B is in date/time sorted order. If not, a minor code change will be required.
Then append the two lists.
M Code
let
Source = Excel.CurrentWorkbook(){[Name="ListA"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date/Time", type datetime}, {"Value", type number}}),
Source2 = Excel.CurrentWorkbook(){[Name="ListB"]}[Content],
#"Changed Type2" = Table.TransformColumnTypes(Source2,{{"Date/Time", type datetime}, {"Value", type number}}),
//overlap starts at the first date from the second list
overlapStart = #"Changed Type2"[#"Date/Time"]{0},
//Filter list A to end before start time in List B
filteredA = Table.SelectRows(#"Changed Type", each [#"Date/Time"] < overlapStart),
//now combine the two lists
combLists = Table.Combine({filteredA,#"Changed Type2"})
in
combLists
Lists A & B
Combined
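Since the question mentions Python as an alternative, here is a minimal sketch of the same filter-then-append idea in plain Python. The function name and the sample readings are assumptions for illustration. It takes the earliest timestamp in List B with min(), so List B does not have to be sorted.

```python
from datetime import datetime

def combine_keep_newer(list_a, list_b):
    """Each list holds (timestamp, value) pairs; list_b is the newer file."""
    if not list_b:
        return list(list_a)
    # The overlap starts at the earliest timestamp in the newer list.
    overlap_start = min(timestamp for timestamp, _ in list_b)
    kept = [(ts, value) for ts, value in list_a if ts < overlap_start]
    return kept + list(list_b)

# Hypothetical 10-minute readings; the last two rows of A overlap with B.
a = [
    (datetime(2021, 6, 1, 0, 0), 1.0),
    (datetime(2021, 6, 1, 0, 10), 2.0),
    (datetime(2021, 6, 1, 0, 20), 3.0),
]
b = [
    (datetime(2021, 6, 1, 0, 10), 2.5),
    (datetime(2021, 6, 1, 0, 20), 3.5),
    (datetime(2021, 6, 1, 0, 30), 4.5),
]
combined = combine_keep_newer(a, b)
```

With more than two files, the same function could be applied repeatedly from oldest to newest.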
Given the following time series of cashflow, how can I aggregate them into a cumulative time series of cashflow in Excel, ideally using only array formulas and no VBA macro?
Specifically, I was given this time series of cashflow for each transaction:
Given the inputs (in column F) for the number of transactions in each period, I would like to calculate the aggregated time series of total cashflow (in column G, highlighted in yellow), ideally using only array formulas and no VBA macro.
Note: Column H to J are for illustrations only to show how column G should be calculated, I don't want to have them in my final spreadsheet.
Thank you very much for your help!
I believe you can do it by formula - most easily by reversing the cash flows and multiplying by the current and previous 5 transactions:
=SUMPRODUCT(INDEX(F:F,MAX(ROW()-5,3)):F3*INDEX(C:C,MAX(11-ROW(),3)):$C$8)
in G3.
This is an ordinary non-array formula.
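What the formula computes is, in effect, a discrete convolution: each period's total is the current and earlier transaction counts multiplied by the cashflow that a transaction of that age produces. A minimal Python sketch of that calculation, with a hypothetical function name and made-up numbers, may make this easier to verify:

```python
def total_cashflow(per_transaction, transactions):
    """per_transaction[k] is the cashflow k periods after a transaction starts.
    transactions[t] is the number of transactions started in period t.
    Returns the total cashflow for each period in transactions.
    """
    totals = []
    for t in range(len(transactions)):
        total = 0
        for age, cash in enumerate(per_transaction):
            # Only transactions started `age` periods ago contribute here.
            if t - age >= 0:
                total += transactions[t - age] * cash
        totals.append(total)
    return totals

# Hypothetical inputs: one outflow followed by five inflows per transaction.
per_transaction = [-100, 30, 30, 30, 30, 30]
transactions = [1, 2, 0, 1]
totals = total_cashflow(per_transaction, transactions)
```

Reversing the cashflows in a helper column, as the formula does, lines them up so that SUMPRODUCT can multiply the two ranges row by row.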
OK Put this array formula in G3:
=IFERROR(SUMPRODUCT(INDEX($B$2:$B$7,N(IF({1},MODE.MULT(IF(INDEX(F:F,MAX(ROW()-5,3)):F3>0,(ROW()-ROW(INDEX(F:F,MAX(ROW()-5,3)):F3)+1)*{1,1}))))),INDEX(INDEX(F:F,MAX(ROW()-5,3)):F3,N(IF({1},MODE.MULT(IF(INDEX(F:F,MAX(ROW()-5,3)):F3>0,(ROW(INDEX(F:F,MAX(ROW()-5,3)):F3)-MIN(ROW(INDEX(F:F,MAX(ROW()-5,3)):F3))+1)*{1,1})))))),0)
Being an array formula it must be confirmed with Ctrl-Shift-Enter instead of Enter when exiting edit mode. Then copy down.
Once Microsoft releases FILTER and SEQUENCE, it can be shortened:
=IFERROR(SUMPRODUCT(INDEX($B$2:$B$7,FILTER(SEQUENCE(ROW()-MAX(ROW()-5,3)+1,,ROW()-MAX(ROW()-5,3)+1,-1),INDEX(F:F,MAX(ROW()-5,3)):F3>0)),FILTER(INDEX(F:F,MAX(ROW()-5,3)):F3,INDEX(F:F,MAX(ROW()-5,3)):F3>0)),0)
This can also be done in Power Query.
Please refer to this article to find out how to use Power Query on your version of Excel. It is available in Excel 2010 Professional Plus and later versions. My demonstration is using Excel 2016.
Steps are:
Load both tables being the time series of cash-flow and your 2-column output table to the power query editor, then you should have:
For the first table, merge the Period column with the Cashflow column using a semicolon ; as the delimiter;
Transpose the column/table, then merge the columns using a comma , as the delimiter;
Add a custom column using the formula ="Connector", which will fill the column with the word Connector, then you should have:
For the second table, also add a custom column using the same formula ="Connector", which will fill the column with the word Connector;
Merge the second table with the first table using the Custom column as the connection, then expand the new column to show the Merged column from the first table, then you should have:
Remove the Custom column, then split the Merged column by delimiter comma , and put the results into Rows;
Split the Merged column again by delimiter semicolon ; to separate the Period and Cashflow from the first table;
Add a custom column to calculate the New Period being =[Period]+[Merged.1];
Add another custom column to calculate the Cashflow being =[#"# Tran"]*[Merged.2], then you should have something like the following:
Group/sum the Cashflow column by New Period.
Once done you can Close & Load the result to a new worksheet (by default). If you want to show the # Trans column in the final output, you can make a duplicate of your second table before making any changes, and then merge it with the final output table by the Period column to show the corresponding number of transactions.
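The steps above amount to a cross join of the two tables followed by a grouped sum. As a minimal Python sketch of that logic (the function name and the sample numbers are assumptions, and this is not the Power Query code itself):

```python
from collections import defaultdict
from itertools import product

def aggregate_by_cross_join(cashflows, transactions):
    """cashflows: [(period_offset, cashflow)] for a single transaction.
    transactions: [(period, number_of_transactions)].
    Returns {new_period: summed cashflow}.
    """
    sums = defaultdict(int)
    # Pair every transaction row with every cashflow row (the "Connector" merge).
    for (period, count), (offset, cash) in product(transactions, cashflows):
        # New Period = Period + offset, Cashflow = number of transactions * cashflow.
        sums[period + offset] += count * cash
    return dict(sorted(sums.items()))

# Hypothetical tables.
cashflows = [(0, -100), (1, 60), (2, 60)]
transactions = [(1, 2), (2, 1)]
result = aggregate_by_cross_join(cashflows, transactions)
```

The constant Connector column exists only to give the merge a key that matches every row to every row, which product() does directly here.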
Here is the Power Query M code for the first table:
let
Source = Excel.CurrentWorkbook(){[Name="Tbl_CFS"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Period", Int64.Type}, {"Cashflow", Int64.Type}}),
#"Merged Columns1" = Table.CombineColumns(Table.TransformColumnTypes(#"Changed Type", {{"Period", type text}, {"Cashflow", type text}}, "en-AU"),{"Period", "Cashflow"},Combiner.CombineTextByDelimiter(";", QuoteStyle.None),"Merged"),
#"Transposed Table" = Table.Transpose(#"Merged Columns1"),
#"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Transposed Table", {{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}}, "en-AU"),{"Column1", "Column2", "Column3", "Column4", "Column5", "Column6"},Combiner.CombineTextByDelimiter(",", QuoteStyle.None),"Merged"),
#"Added Custom" = Table.AddColumn(#"Merged Columns", "Custom", each "Connector")
in
#"Added Custom"
And here is the code for the second table:
let
Source = Excel.CurrentWorkbook(){[Name="Tbl_Total"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Period", Int64.Type}, {"# Tran", Int64.Type}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each "Connector"),
#"Merged Queries" = Table.NestedJoin(#"Added Custom", {"Custom"}, Tbl_CFS, {"Custom"}, "Tbl_CFS", JoinKind.LeftOuter),
#"Expanded Tbl_CFS" = Table.ExpandTableColumn(#"Merged Queries", "Tbl_CFS", {"Merged"}, {"Merged"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Tbl_CFS",{"Custom"}),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Removed Columns", {{"Merged", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Merged"),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Merged", type text}}),
#"Split Column by Delimiter1" = Table.SplitColumn(#"Changed Type1", "Merged", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), {"Merged.1", "Merged.2"}),
#"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Merged.1", Int64.Type}, {"Merged.2", Int64.Type}}),
#"Added Custom1" = Table.AddColumn(#"Changed Type2", "New Period", each [Period]+[Merged.1]),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Cashflow", each [#"# Tran"]*[Merged.2]),
#"Grouped Rows" = Table.Group(#"Added Custom2", {"New Period"}, {{"Sum", each List.Sum([Cashflow]), type number}})
in
#"Grouped Rows"
All steps use built-in functions, so they should be straightforward and easy to execute. Let me know if you have any questions. Cheers :)