Power Query - replacing text in a new column - excel

I am new to Power Query in Excel.
I have a text column with dates in the format "09-Feb-17 A". To remove the text " A" and put the date in a new (custom) column, I used this code:
= Table.AddColumn(#"Changed Type", "Start Date", each Replacer.ReplaceText([Start], " A",""))
The problem is that some of the dates are already in the correct format, i.e. without " A". For those dates I get an error:
Expression.Error: We cannot convert the value #date(2019, 1, 11) to type Text.
Details:
Value=11/01/2019
Type=Type
Is there any way to solve this within Power Query?
Thanks in advance.

You can use try otherwise to deal with both data types:
= Table.AddColumn(#"Changed Type", "Start Date", each try Date.FromText(Replacer.ReplaceText([Start], " A","")) otherwise DateTime.Date([Start]), type date)
Or this, which will extract the date before the first space, irrespective of which (or how many) characters follow:
= Table.AddColumn(#"Changed Type", "Start Date", each try Date.FromText(Text.BeforeDelimiter([Start], " ")) otherwise DateTime.Date([Start]), type date)
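To check the logic outside Excel, here is a minimal Python sketch of the same idea: try the text route first and fall back when the value is already a date. The function name `to_start_date` is hypothetical, and the sketch assumes English month abbreviations and the day-month-year layout shown in the question.

```python
from datetime import date, datetime

def to_start_date(value):
    """Return a date from either text such as '09-Feb-17 A' or a real date."""
    try:
        # Text route: keep only what precedes the first space, then parse it.
        return datetime.strptime(value.split(" ", 1)[0], "%d-%b-%y").date()
    except AttributeError:
        # Not text (no .split): the value is already a date or a datetime.
        return value.date() if isinstance(value, datetime) else value

print(to_start_date("09-Feb-17 A"))      # 2017-02-09
print(to_start_date(date(2019, 1, 11)))  # 2019-01-11
```

As in the M versions, text that does not parse as a date is not handled here and would raise an error.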

Perhaps:
1. Change the column data type to Text (real dates become date-time text).
2. Extract the portion of the string before the space.
3. Optionally, change the new column's data type to Date.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Dates", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each Text.BeforeDelimiter([Dates]," ")),
#"Changed Type1" = Table.TransformColumnTypes(#"Added Custom",{{"Custom", type date}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type1",{"Dates"})
in
#"Removed Columns"

Related

How to group data from a column into rows based on prefix in Excel

I have a set of data in an Excel column, for example 1000_1.jpg, 1000_2.jpg, 1001_1.jpg ... I want to convert this data into rows based on the prefix of each file name, i.e. 1000, 1001 etc.
I tried the formula given by @Tom in "How to group data from a column into rows based on content", but it only works on a small data set; I tested it on 10,000 rows. When I apply the same formula to the whole sheet, it returns 0.
I am attaching a link to the Excel file here: https://drive.google.com/file/d/1vfEFh2idNpB_gMiMWPhXY2JTsAALtxS0/view?usp=sharing
The expected result is the same as in the referenced question.
This can be done quite quickly with Power Query.
I'm not very good at it, and it can probably be done more elegantly, but it works like this:
let
Source = Table.FromColumns({Lines.FromBinary(File.Contents("D:\OneDrive\Desktop\images names.csv"))}),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Images Names", type text}}),
#"Inserted Text Between Delimiters" = Table.AddColumn(#"Changed Type", "Text Between Delimiters", each Text.BetweenDelimiters([Images Names], "_", "."), type text),
#"Changed Type1" = Table.TransformColumnTypes(#"Inserted Text Between Delimiters",{{"Text Between Delimiters", Int64.Type}}),
#"Inserted Text Before Delimiter" = Table.AddColumn(#"Changed Type1", "Text Before Delimiter", each Text.BeforeDelimiter([Images Names], "_"), type text),
#"Sorted Rows" = Table.Sort(#"Inserted Text Before Delimiter",{{"Text Before Delimiter", Order.Ascending}, {"Text Between Delimiters", Order.Ascending}}),
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Sorted Rows", {{"Text Between Delimiters", type text}}, "de-DE"), List.Distinct(Table.TransformColumnTypes(#"Sorted Rows", {{"Text Between Delimiters", type text}}, "de-DE")[#"Text Between Delimiters"]), "Text Between Delimiters", "Images Names"),
#"Filtered Rows" = Table.SelectRows(#"Pivoted Column", each ([2] <> null)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Text Before Delimiter"})
in
#"Removed Columns"
82505 rows and 40 columns
I uploaded the file here:
https://1drv.ms/x/s!AncAhUkdErOkgvInwjmn2ETvY0-ysA?e=Rnpxp1
If you try to load it from Power Query into Excel, it might not work, because the original name list is not available. But you can see how I did it, and if you only need to do this job once, you can just download the pasted values.
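For readers who want to sanity-check the grouping on a small sample, the following Python sketch mirrors the split, sort, and group steps of the query. The function name `group_by_prefix` is hypothetical, and it assumes every name has the form prefix_index.extension. Unlike the pivot, it does not leave empty cells for missing indices.

```python
from collections import defaultdict

def group_by_prefix(names):
    """Return one row per prefix, each row ordered by the numeric index."""
    groups = defaultdict(list)
    for name in names:
        prefix, _, rest = name.partition("_")   # '1000_2.jpg' -> '1000', '2.jpg'
        index = int(rest.split(".", 1)[0])      # '2.jpg' -> 2
        groups[prefix].append((index, name))
    # Sort prefixes as text and indices as numbers, like the Sorted Rows step.
    return [[name for _, name in sorted(groups[prefix])] for prefix in sorted(groups)]

print(group_by_prefix(["1000_2.jpg", "1001_1.jpg", "1000_1.jpg"]))
# [['1000_1.jpg', '1000_2.jpg'], ['1001_1.jpg']]
```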

Create a different pivot view in Power Query

I have the data structured in excel in the following format
What I want to do with that is to transform it into this. In simple words for each ID I want to record the difference in value from previous day, and if there is no value in previous day we just keep the current value.
As an intermediate step I am trying to transform the raw data into something like this but I am not sure how to go about it in simple Excel pivot tables, or Power query transformations.
There is something wrong with your sample, because [v1-v2] is not the same method as [v5-v4, v3-v2, v8-v7], but I assume the latter ones are right.
See if this works for you.
It assumes the data is in 3 columns in a range named Table1, with column headers Dates, ID, Value.
You can paste into PowerQuery using ... Advanced Editor ...
It creates a column with yesterday's value for that ID, returning null if nothing is found. It then replaces null with 0, does the subtraction, and pivots.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Dates", type date}, {"ID", type text}, {"Value", Int64.Type}}),
Yesterday = Table.AddColumn(#"Changed Type" , "Yesterday", (i) => List.Sum(Table.SelectRows( #"Changed Type", each ([ID] = i[ID] and Date.AddDays([Dates],1) = i[Dates]))[Value]), type number ),
#"Replaced Value" = Table.ReplaceValue(Yesterday,null,0,Replacer.ReplaceValue,{"Yesterday"}),
#"Added Custom" = Table.AddColumn(#"Replaced Value", "Custom", each [Value]-[Yesterday]),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Value", "Yesterday"}),
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Removed Columns", {{"Dates", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(#"Removed Columns", {{"Dates", type text}}, "en-US")[Dates]), "Dates", "Custom", List.Sum)
in #"Pivoted Column"
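The core of the query above, looking up the previous day's value for the same ID and treating a missing value as 0, can be sketched in Python as follows. The function name `daily_changes` and the sample data are hypothetical, and the sketch assumes at most one value per ID and day.

```python
from datetime import date, timedelta

def daily_changes(values):
    """values maps (id, day) -> value; the result maps (id, day) -> change."""
    one_day = timedelta(days=1)
    return {
        # A missing previous day counts as 0, so the current value is kept.
        (id_, day): value - values.get((id_, day - one_day), 0)
        for (id_, day), value in values.items()
    }

sample = {
    ("A", date(2020, 1, 1)): 10,
    ("A", date(2020, 1, 2)): 15,
    ("A", date(2020, 1, 4)): 7,
}
print(daily_changes(sample)[("A", date(2020, 1, 2))])  # 5
```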

How do we aggregate time series in Excel?

Given the following time series of cashflow, how can I aggregate them into a cumulative time series of cashflow in Excel, ideally by using array formulas only and without VBA macros?
Specifically, I was given this time series of cashflow for each transaction:
Given the inputs (in column F) for the number of transactions in each period, I would like to calculate the aggregated time series of total cashflow (in column G, highlighted in yellow).
Note: Column H to J are for illustrations only to show how column G should be calculated, I don't want to have them in my final spreadsheet.
Thank you very much for your help!
I believe you can do it with a formula, most easily by reversing the cash flows and multiplying them by the current and previous 5 transaction counts:
=SUMPRODUCT(INDEX(F:F,MAX(ROW()-5,3)):F3*INDEX(C:C,MAX(11-ROW(),3)):$C$8)
in G3, copied down.
This is an ordinary non-array formula.
OK, put this array formula in G3:
=IFERROR(SUMPRODUCT(INDEX($B$2:$B$7,N(IF({1},MODE.MULT(IF(INDEX(F:F,MAX(ROW()-5,3)):F3>0,(ROW()-ROW(INDEX(F:F,MAX(ROW()-5,3)):F3)+1)*{1,1}))))),INDEX(INDEX(F:F,MAX(ROW()-5,3)):F3,N(IF({1},MODE.MULT(IF(INDEX(F:F,MAX(ROW()-5,3)):F3>0,(ROW(INDEX(F:F,MAX(ROW()-5,3)):F3)-MIN(ROW(INDEX(F:F,MAX(ROW()-5,3)):F3))+1)*{1,1})))))),0)
Being an array formula, it must be confirmed with Ctrl+Shift+Enter instead of Enter when exiting edit mode. Then copy down.
Once Microsoft releases FILTER and SEQUENCE, it can be shortened:
=IFERROR(SUMPRODUCT(INDEX($B$2:$B$7,FILTER(SEQUENCE(ROW()-MAX(ROW()-5,3)+1,,ROW()-MAX(ROW()-5,3)+1,-1),INDEX(F:F,MAX(ROW()-5,3)):F3>0)),FILTER(INDEX(F:F,MAX(ROW()-5,3)):F3,INDEX(F:F,MAX(ROW()-5,3)):F3>0)),0)
This can also be done in Power Query.
Please refer to this article to find out how to use Power Query on your version of Excel. It is available in Excel 2010 Professional Plus and later versions. My demonstration is using Excel 2016.
Steps are:
1. Load both tables, the cash-flow time series and your 2-column output table, into the Power Query editor.
2. For the first table, merge the Period column with the Cashflow column, using a semicolon ; as the delimiter.
3. Transpose the table, then merge the columns using a comma , as the delimiter.
4. Add a custom column with the formula ="Connector", which fills the column with the word Connector.
5. For the second table, also add a custom column with the same formula ="Connector".
6. Merge the second table with the first table using the Custom column as the connection, then expand the new column to show the Merged column from the first table.
7. Remove the Custom column, then split the Merged column by the comma delimiter , and put the results into rows.
8. Split the Merged column again by the semicolon delimiter ; to separate the Period and Cashflow from the first table.
9. Add a custom column to calculate the New Period, being =[Period]+[Merged.1].
10. Add another custom column to calculate the Cashflow, being =[#"# Tran"]*[Merged.2].
11. Group/sum the Cashflow column by New Period.
Once done you can Close & Load the result to a new worksheet (by default). If you want to show the # Trans column in the final output, you can make a duplicate of your second table before making any changes, and then merge it with the final output table by the Period column to show the corresponding number of transactions.
Here is the Power Query M code for the first table:
let
Source = Excel.CurrentWorkbook(){[Name="Tbl_CFS"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Period", Int64.Type}, {"Cashflow", Int64.Type}}),
#"Merged Columns1" = Table.CombineColumns(Table.TransformColumnTypes(#"Changed Type", {{"Period", type text}, {"Cashflow", type text}}, "en-AU"),{"Period", "Cashflow"},Combiner.CombineTextByDelimiter(";", QuoteStyle.None),"Merged"),
#"Transposed Table" = Table.Transpose(#"Merged Columns1"),
#"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Transposed Table", {{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}}, "en-AU"),{"Column1", "Column2", "Column3", "Column4", "Column5", "Column6"},Combiner.CombineTextByDelimiter(",", QuoteStyle.None),"Merged"),
#"Added Custom" = Table.AddColumn(#"Merged Columns", "Custom", each "Connector")
in
#"Added Custom"
And here is the code for the second table:
let
Source = Excel.CurrentWorkbook(){[Name="Tbl_Total"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Period", Int64.Type}, {"# Tran", Int64.Type}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each "Connector"),
#"Merged Queries" = Table.NestedJoin(#"Added Custom", {"Custom"}, Tbl_CFS, {"Custom"}, "Tbl_CFS", JoinKind.LeftOuter),
#"Expanded Tbl_CFS" = Table.ExpandTableColumn(#"Merged Queries", "Tbl_CFS", {"Merged"}, {"Merged"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Tbl_CFS",{"Custom"}),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Removed Columns", {{"Merged", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Merged"),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Merged", type text}}),
#"Split Column by Delimiter1" = Table.SplitColumn(#"Changed Type1", "Merged", Splitter.SplitTextByDelimiter(";", QuoteStyle.Csv), {"Merged.1", "Merged.2"}),
#"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Merged.1", Int64.Type}, {"Merged.2", Int64.Type}}),
#"Added Custom1" = Table.AddColumn(#"Changed Type2", "New Period", each [Period]+[Merged.1]),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Cashflow", each [#"# Tran"]*[Merged.2]),
#"Grouped Rows" = Table.Group(#"Added Custom2", {"New Period"}, {{"Sum", each List.Sum([Cashflow]), type number}})
in
#"Grouped Rows"
All steps use built-in functions, so they should be straightforward and easy to execute. Let me know if you have any questions. Cheers :)
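Stripped of the merge and split mechanics, the queries pair every row of the transactions table with every row of the cash-flow table, add the two periods, multiply the count by the cash flow, and sum by the new period. A minimal Python sketch of that calculation follows; the function name `aggregate_cashflows` and the sample numbers are made up for illustration.

```python
from collections import defaultdict

def aggregate_cashflows(cashflow_by_period, transactions_by_period):
    """Sum count * cashflow over every (transaction period, cashflow period) pair."""
    totals = defaultdict(int)
    for start, count in transactions_by_period.items():
        for offset, amount in cashflow_by_period.items():
            # New Period = Period + Merged.1, Cashflow = # Tran * Merged.2
            totals[start + offset] += count * amount
    return dict(sorted(totals.items()))

cashflows = {0: -100, 1: 60, 2: 60}   # hypothetical cash flow per transaction
transactions = {1: 2, 2: 1}           # hypothetical transaction counts per period
print(aggregate_cashflows(cashflows, transactions))
# {1: -200, 2: 20, 3: 180, 4: 60}
```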

How to repeat a sequence of number in excel

I have a column in Excel, and next to it I want to create a column that contains the repeated sequence from the first column.
What I have:
What I need next to it:
Here is a step-by-step solution using Power Query.
Please note that you need Excel 2010 or a later version to use Power Query. My version is Excel 2016.
I did not use any advanced coding, just a few built-in functions of the Power Query Editor in combination with the Text.Repeat function.
Here is the full code behind the scenes, for reference only.
let
Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", Int64.Type}}),
#"Duplicated Column" = Table.DuplicateColumn(#"Changed Type", "Column1", "Column1 - Copy"),
#"Renamed Columns" = Table.RenameColumns(#"Duplicated Column",{{"Column1", "Number"}, {"Column1 - Copy", "Text"}}),
#"Changed Type1" = Table.TransformColumnTypes(#"Renamed Columns",{{"Text", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type1", "Custom", each Text.Repeat([Text],[Number])),
#"Split Column by Position" = Table.ExpandListColumn(Table.TransformColumns(#"Added Custom", {{"Custom", Splitter.SplitTextByRepeatedLengths(1), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Custom"),
#"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Position",{{"Custom", Int64.Type}}),
#"Removed Other Columns" = Table.SelectColumns(#"Changed Type2",{"Custom"})
in
#"Removed Other Columns"
Cheers :)
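The intended result, each number n repeated n times, can be written in Python as a quick reference for checking the output. The function name `repeat_each` is hypothetical. If I read the query correctly, splitting the repeated text one character at a time would break a multi-digit number such as 12 into its digits, so the query as written suits single-digit values; the sketch below has no such limit.

```python
def repeat_each(values):
    """Repeat every number n exactly n times, keeping the input order."""
    return [n for n in values for _ in range(n)]

print(repeat_each([1, 2, 3]))  # [1, 2, 2, 3, 3, 3]
```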
Here is one way of doing this:
Formula in B2:
=IF(COUNTIF($B$1:B1,B1)=INDEX($A$2:$A$6,SUMPRODUCT(1/(COUNTIF($B$1:B1,$B$1:B1)))-1),INDEX($A$2:$A$6,SUMPRODUCT(1/(COUNTIF($B$1:B1,$B$1:B1)))),INDEX($A$2:$A$6,SUMPRODUCT(1/(COUNTIF($B$1:B1,$B$1:B1)))-1))
It's a rather long formula and could be significantly shorter, but I have a feeling your sample data does not represent your real data, so this also works for numbers that do not simply increase by 1. For example:
Just for interest, you can do it by looking up the row of the output column in the cumulative sums of the input column. I like the idea of getting the output directly from the row number, but I can't see a neat way of implementing it.
(1) Helper column
Put the cumulative totals in column B:
=SUM(A1:A$1)-A1
Then just do a lookup in the output column:
=IF(ROW()>SUM(A$1:A$5),"",INDEX(A$1:A$5,MATCH(ROW()-1,B$1:B$5)))
(2) Subtotal/offset combo:
=IF(ROW()>SUM(A$1:A$5),"",INDEX(A$1:A$5,MATCH(ROW()-1,SUBTOTAL(9,OFFSET($A$1,0,0,ROW(A$1:A$5)))-A$1:A$5)))
This has to be entered as an array formula using Ctrl+Shift+Enter.
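The cumulative-total lookup can be sketched in Python with a binary search, which plays the role of the approximate MATCH. The function name `value_for_row` is hypothetical, rows are 1-based as in the worksheet, and the input numbers are assumed to be positive.

```python
from bisect import bisect_right
from itertools import accumulate

def value_for_row(values, row):
    """Return the value shown in output row `row`, or '' past the end."""
    if row > sum(values):
        return ""
    # Helper column: running total of the rows above, i.e. SUM(A1:A$1)-A1.
    starts = [total - v for total, v in zip(accumulate(values), values)]
    # Approximate MATCH: position of the last start that is <= row - 1.
    return values[bisect_right(starts, row - 1) - 1]

print([value_for_row([1, 2, 3], r) for r in range(1, 8)])
# [1, 2, 2, 3, 3, 3, '']
```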

excel powerquery replacevalues based on cell

I'm extracting data from a site using Excel Power Query. The site is http://www.timeanddate.com/holidays/south-africa/2014
The web table presents dates in the format mmm dd, like so:
Jan 01
Mar 20
Mar 21 ...etc.
To get results for different years, I can invoke a prompt to request the year and insert that value into the URL as follows:
= let
#"Table 0" = (myParm)=>
let
Source = Web.Page(Web.Contents("http://www.timeanddate.com/holidays/south-africa/" & Number.ToText(myParm))),
However, without the year specified in the web results table, the import understandably fills in its own values (native Excel just uses the current year, 2015; Power Query interprets the data completely differently), like so:
2001/01/01
2020/03/01
2021/03/01
Hence my questions:
1. I want to specify the year in the query using a cell, replacing myParm with a cell value and refreshing on change (I can do it with native Excel, but need to know how to do it with Power Query).
2. I want to replace the year value in the resulting column data with whatever is in that cell.
For #1, assuming you have an Excel Table named YearTable with a single column named Year and a single detail row with the required year value (e.g. 2015), you can use this M expression:
Excel.CurrentWorkbook(){[Name="YearTable"]}[Content]{0}[Year]
This dives into that table and plucks the value from the first detail row.
For example, you could embed that in your opening Step e.g.
Web.Page(Web.Contents("http://www.timeanddate.com/holidays/south-africa/" & Number.ToText(Excel.CurrentWorkbook(){[Name="YearTable"]}[Content]{0}[Year])))
For #2, I would use that expression to add a column, using something like this formula:
[Date] & " " & Number.ToText(Excel.CurrentWorkbook(){[Name="YearTable"]}[Content]{0}[Year])
Then you can use the Parse button (Transform ribbon, under Date) to convert that to a Date datatype if required.
Note that the generated Change Type step I got from that page cast Date as a Date with an implied year (the issue you noticed). Just edit the formula for that step to set the "Date" column to "text" to avoid that.
Here's my entire test M script:
let
Source = Web.Page(Web.Contents("http://www.timeanddate.com/holidays/south-africa/" & Number.ToText(Excel.CurrentWorkbook(){[Name="YearTable"]}[Content]{0}[Year]))),
Data0 = Source{0}[Data],
#"Changed Type" = Table.TransformColumnTypes(Data0,{{"Header", type text}, {"Date", type text}, {"Weekday", type text}, {"Holiday name", type text}, {"Holiday type", type text}}),
#"Added Derived Date" = Table.AddColumn(#"Changed Type", "Derived Date", each [Date] & " " & Number.ToText(Excel.CurrentWorkbook(){[Name="YearTable"]}[Content]{0}[Year])),
#"Parsed Date" = Table.TransformColumns(#"Added Derived Date",{{"Derived Date", each Date.From(DateTimeZone.From(_)), type date}})
in
#"Parsed Date"
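The derived-date step amounts to appending the year to the "mmm dd" text and parsing the result. A minimal Python sketch of that idea, with a hypothetical function name `derive_date` and assuming English month abbreviations:

```python
from datetime import datetime

def derive_date(month_day, year):
    """Combine text such as 'Mar 21' with a year and parse it as a date."""
    return datetime.strptime(f"{month_day} {year}", "%b %d %Y").date()

print(derive_date("Mar 21", 2014))  # 2014-03-21
```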
Another solution, from Colin Banfield:
1) In Excel, create a table with Year as the column name and enter the year as the row value. Then create a query from the table. Your query should have one column and one row value. Name the query appropriately and save.
2) Get the data from the web site. Assume we name the query HolidayTable. Convert the query to a function query e.g.
(Year as number)=>
let
Source = Web.Page(Web.Contents("http://www.timeanddate.com/holidays/south-africa/"&Number.ToText(Year))),
Data0 = Source{0}[Data],
#"Changed Type" = Table.TransformColumnTypes(Data0,{{"Header", type text}, {"Date", type date}, {"Weekday", type text}, {"Holiday name", type text}, {"Holiday type", type text}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Header"})
in
#"Removed Columns"
3) Add this function as a new column in the step (1) query, and add a new date custom column. After a couple other transformations (column reorder, column removal), you should end up with a query that looks like the following:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Year", Int64.Type}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each HolidayTable([Year])),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Date", "Weekday", "Holiday name", "Holiday type"}, {"Date", "Weekday", "Holiday name", "Holiday type"}),
#"Added Custom1" = Table.AddColumn(#"Expanded Custom", "Calendar Date", each #date([Year],Date.Month([Date]),Date.Day([Date]))),
#"Reordered Columns" = Table.ReorderColumns(#"Added Custom1",{"Year", "Date", "Calendar Date", "Weekday", "Holiday name", "Holiday type"}),
#"Removed Columns" = Table.RemoveColumns(#"Reordered Columns",{"Year","Date"})
in
#"Removed Columns"
Notes:
a) The first two lines are from the original table query in step (1).
b) The #"Added Custom" step adds a new custom column, which passes the value in the Year column to the HolidayTable function.
c) The #"Added Custom1" step adds a new custom column that creates a new date from the value in the Year column, and the month and day values from the original Date column.
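Note (c) describes rebuilding each date from the chosen year plus the original month and day. In Python the same idea looks like the sketch below; the function name `with_year` is hypothetical, and in this Python version a 29 February combined with a non-leap year raises a ValueError.

```python
from datetime import date

def with_year(original, year):
    """Keep the month and day of `original`, but use the given year."""
    return date(year, original.month, original.day)

print(with_year(date(2015, 3, 21), 2014))  # 2014-03-21
```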

Resources