Related
source link
Hi, so I have 2 tables (need and supply) like this:
I am trying to add a custom column on the need table where for each item I want to retrieve the appropriate supply date based on the following condition (pseudocode) :
if
supply(Qty) >= need(Qty) and (supply(Supply date) <> null and |supply(Supply date) - need(Date)| < 31 days)
then supply(Supply date)
else
if
supply(Supply date) = null
then "NO"
else
"NON2"
Here's what I started doing:
x = Table.Column(source, Table.SelectRows(supply, each supply([Qty]) >= need([Qty]) and (supply[Supply date] <> null and ((supply([Supply date]) =< Date.AddMonths(need([Date]),1) or (supply([Supply date]) >= Date.AddMonths(need([Date]),-1)) )),Supply([date]),
if x <> null then x else "NO2"
Obviously I don't get what I want, that's why I come here asking for your help. Thx
Supply is a table. How can I apply Supply(Supply date) = null then "NO" to an entire table?
That said, see if this helps at all
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Item", type any}, {"Date", type datetime}, {"Qty", Int64.Type}}),
#"Merged Queries" = Table.NestedJoin(#"Changed Type", {"Item"}, supply, {"item"}, "supply", JoinKind.LeftOuter),
#"Added Custom" = Table.AddColumn(#"Merged Queries", "Custom", each
let ThisDate=[Date], ThisQty=[Qty] in try Table.SelectRows( [supply], each [Qty]>ThisQty and [Supply date]<>null
and Number.From([Supply date]) - Number.From(ThisDate)<31)[Supply date]{0} otherwise "NON")
in #"Added Custom"
Pulling data into excel from a text file of PDF occasionally splits sentences.
Following a previous post, Using PQ I can group text together and then with regex identify . belonging to the end of a sentence and split using this. This works really well, except where a Subheading may exist without punctuation. This results in a minor error where sentences are occasionally joined together. for example
SUBHEADING 1
Non-sensical
sentence 1. Sentence 2.
Returns as:
SUBHEADING 1 Non-sensical sentence 1.
Sentence 2.
I initially thought this was the best that could be done, however, the focus so far has been on identifying the end of a sentence with . Additionally though, we could also use the fact that sentences tend to start with a capital letter to refine this further. The way excel interprets PDF pages means that prior transformations are required to group the text, which I am pretty certain rules out using regex.
Here is an example of what I am trying to do (orange) and where I currently am (green):
Notice how each sentence starts with a capital and ends with .. I think this will be a logical way to separate SUBHEADING/CHAPTERS, etc., from the rest of the text. From this, I can then apply the remainder of the transformations on the previous post and if im right, this should give better separation.
So far:
I have identified which sentences Start with a Captial letter/lowercase and which end in lowercase or . (where ending in a number is also considered lowercase). Using this, I think it's safe to assume that:
If a row ends lowercase AND the next row also contains lowercase,
this belongs to the same sentence and can be appended.
If the Sentence starts Uppercase and ends in a full stop, this is a
complete sentence and requires no changes.
If a Sentence starts uppercase without a .At the end and then Next
is uppercase; this is a subheading/Extra info, not a typical
sentence.
I think it is possible to generate the Desired table using these rules. I am having trouble determining the transformations required to append the sentences following the above rules. I need some sort of way to identify the Upper/lower/. of rows above and below and I just cant see how.
I will update as I progress; however, if anyone could comment on how to achieve this using these rules, that would be great.
Note there is an error with the current M code where 3d. Is recognised as uppercase.
M Code:
let
Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
#"Added Custom" = Table.AddColumn(Source, "Custom", each if Text.Start([Column1], 1) = Text.Lower(Text.Start([Column1], 1)) then "Lowercase" else "Uppercase"),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each if Text.End([Column1], 1) = "." then Text.End([Column1], 1) else if Text.End([Column1], 1) = Text.Lower(Text.End([Column1], 1)) then "Lowercase" else null)
in
#"Added Custom1"
Source Data:
SUBHEADING 1
Non-sensical
sentence 1. Sentence 2.
Sentence 3 part 3a,
part 3b, and
part 3c.
SUBHEADING 1.1.
Non-sensical
sentence 4. Sentence 5.
Sentence 6 part 3a,
part 3b, 3c and
3d.
2.0. SUBHEADING
Extra Info 1 (Not a proper sentence)
Extra Info 2
Sentence 7.
SUBHEADING 3.0
Sentence 8.
SOLUTION SO FAR (Minor error if Subheading starts with number)
let
Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
#"Added Custom" = Table.AddColumn(Source, "Start", each if Text.Start([Column1], 1) = Text.Lower(Text.Start([Column1], 1)) then "Lowercase" else "Uppercase"),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "End", each if Text.End([Column1], 1) = "." then Text.End([Column1], 1) else if Text.End([Column1], 1) = Text.Lower(Text.End([Column1], 1)) then "Lowercase" else null),
#"Added Custom4" = Table.AddColumn(#"Added Custom1", "Complete Sentences", each if [Start] = "Uppercase" and [End] = "." then "COMPLETE" else null),
#"Added Index" = Table.AddIndexColumn(#"Added Custom4", "Index", 0, 1, Int64.Type),
#"Added Custom2" = Table.AddColumn(#"Added Index", "StartCheckCaseBelow", each try #"Added Index" [Start] { [Index] + 1 } otherwise "COMPLETE"),
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom2",{"Index"}),
#"Added Custom3" = Table.AddColumn(#"Removed Columns1", "CompleteHeadings_CompleteText", each if [StartCheckCaseBelow]= "COMPLETE" then "COMPLETE" else if [Start] = "Uppercase" and [StartCheckCaseBelow] = "Uppercase" then "COMPLETE" else null),
#"Added Custom5" = Table.AddColumn(#"Added Custom3", "Custom", each if [CompleteHeadings_CompleteText] = null then if [Start] = "Uppercase" and [End] = "Lowercase" then "START" else if [End] = "." then "END" else null else null),
#"Added Index1" = Table.AddIndexColumn(#"Added Custom5", "Index", 0, 1, Int64.Type),
#"Merged Columns" = Table.CombineColumns(Table.TransformColumnTypes(#"Added Index1", {{"Index", type text}}, "en-GB"),{"Index", "CompleteHeadings_CompleteText"},Combiner.CombineTextByDelimiter(" ", QuoteStyle.None),"CompleteHeadings_CompleteText"),
#"Added Index3" = Table.AddIndexColumn(#"Merged Columns", "Index", 0, 1, Int64.Type),
#"Filtered Rows" = Table.SelectRows(#"Added Index3", each ([Custom] = "START")),
#"Added Index2" = Table.AddIndexColumn(#"Filtered Rows", "temp", 0, 1, Int64.Type),
combined = #"Added Index2" & Table.SelectRows(#"Added Index3", each [Custom] <> "START"),
#"Sorted Rows" = Table.Sort(combined,{{"Index", Order.Ascending}}),
#"Removed Columns" = Table.RemoveColumns(#"Sorted Rows",{"Index"}),
#"Added Custom6" = Table.AddColumn(#"Removed Columns", "Custom.1", each if Text.Contains([CompleteHeadings_CompleteText], "COMPLETE") then [CompleteHeadings_CompleteText] else if [Complete Sentences] <> null then [Complete Sentences] else [temp]),
#"Filled Down" = Table.FillDown(#"Added Custom6",{"Custom.1"}),
#"Removed Other Columns" = Table.SelectColumns(#"Filled Down",{"Column1", "Custom.1"}),
#"Grouped Rows" = Table.Group(#"Removed Other Columns", {"Custom.1"}, {{"Text", each Text.Combine([Column1], " "), type text}}),
#"Removed Other Columns1" = Table.SelectColumns(#"Grouped Rows",{"Text"})
in
#"Removed Other Columns1"
This works to achieve the table in orange however requires many steps which need to be simplified. If you could advise on this or know of a better way please let me know.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Custom" = Table.AddColumn(Source, "Custom", each if Text.Start([Column1], 1) = Text.Lower(Text.Start([Column1], 1)) then " " else "| "),
#"Merged Columns" = Table.CombineColumns(#"Added Custom",{ "Custom", "Column1"},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged"),
#"Transposed Table" = Table.Transpose(#"Merged Columns"),
Custom1 = Table.ColumnNames( #"Transposed Table"),
Custom2 = #"Transposed Table",
#"Merged Columns1" = Table.CombineColumns(Custom2,Custom1,Combiner.CombineTextByDelimiter("", QuoteStyle.None),"Merged"),
#"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Merged Columns1", {{"Merged", Splitter.SplitTextByDelimiter("| ", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Merged"),
#"Split Column by Delimiter1" = Table.ExpandListColumn(Table.TransformColumns(#"Split Column by Delimiter", {{"Merged", Splitter.SplitTextByDelimiter(". ", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Merged"),
#"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Merged", type text}})
in
#"Changed Type"
Will be interested to see if there are more elegant solutions out there.
I have the following data with duplicates which I wish to identify. I do not wish to remove these so unique value only won't work. I want to be able to identify them but just saying null.
I have attempted to self-reference the code but end up just duplicating the original result.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
#"Removed Duplicates" = Table.Distinct(#"Changed Type"),
#"Merged Queries" = Table.NestedJoin(Source, {"Column1"}, #"Removed Duplicates", {"Column1"}, "Removed Duplicates", JoinKind.LeftOuter)
in
#"Merged Queries"
You can use List.Generate to generate a list with your requirements. And then you can either replace the first column or add the list as a second column.
This needs to be done in the Advanced Editor.
Please note there is a difference between the text string "null" and the power query null value. Based on your example screenshot, I assumed you wanted the "null" text string. If you prefer the null value, remove the surrounding quotes in the code
M Code
let
//Change next line to reflect your actual data source
Source = Excel.CurrentWorkbook(){[Name="Table13"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
//change 2nd and later duplicates to null
dupsNull = List.Generate(
()=>[v=#"Changed Type"[Column1]{0}, idx=0],
each [idx]<Table.RowCount(#"Changed Type"),
each [v=if List.PositionOf(#"Changed Type"[Column1],#"Changed Type"[Column1]{[idx]+1},Occurrence.First) = [idx]+1
then #"Changed Type"[Column1]{[idx]+1} else "null", idx=[idx]+1],
each [v]),
//either add as a column or replace the first column
#"add de-duped" = Table.FromColumns(
Table.ToColumns(#"Changed Type") & {dupsNull},
type table[Column1=text, Column2=text])
in
#"add de-duped"
Here's another way. First, add an index column. Then add another column using List.PositionOf to get the row of the first occurrence of each value in the column. Then add one last column to compare the index and List.PositionOf, to determine which row entries should be null.
Let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(#"Added Index", "Custom", each List.PositionOf(#"Added Index"[Column1],[Column1])),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each if [Index] = [Custom] then [Column1] else null)
in
#"Added Custom1"
Here a solution that doesn't require to add a new column. It returns the same column just with duplicated values replaced with "null":
let
Source = Excel.CurrentWorkbook(){[Name="TB_INPUT"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
removeDups = (lst as list) =>
List.Accumulate(lst, {}, (x, y) => x & {if List.Contains(x, y) then "null" else y}),
replacedValues = removeDups(Table.Column(#"Changed Type", "Column1")),
#"replaced Values" = Table.FromList(replacedValues, null, type table[Column1 = Text.Type ])
in
#"replaced Values"
it uses a List.Accumulate function to simplify the process to generate the corresponding list with the specified requirements.
The output will be the following in Power Query:
and in Excel:
If you want an empty cell instead of "null" token, then in the function removeDups replace "null" with null.
If you want to consider a more general case, lets say you have more than one column in the input Excel Table and you want to replace duplicated values in more than one column at the same time.
Let's say we have the following input in Excel:
The following code can be used to replace duplicates in Column1 and Column2:
let
Source = Excel.CurrentWorkbook(){[Name="TB_GralCase"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", Int64.Type}}),
listOfColumns = {"Column1", "Column2"},
remainingColumns = List.Difference(Table.ColumnNames(#"Changed Type"), listOfColumns),
removeDups = (lst as list) =>
List.Accumulate(lst, {}, (x, y) => x & {if List.Contains(x, y) then "null" else y}),
replacedValues = List.Transform(listOfColumns, each removeDups(Table.Column( #"Changed Type", _))),
#"replaced values" = Table.FromColumns(
replacedValues & Table.ToColumns(Table.SelectColumns( #"Changed Type", remainingColumns)),
listOfColumns & remainingColumns
)
in
#"replaced values"
In listOfColumns variable, you define the list of columns you want to replace duplicates.
The the output in Power Query will be:
What I am trying to achieve is to obtain "matches/pairs" from two tables. One (source 1)is data table with Date/Time and Pressure value columns and the other (source 2) is like Date/Time and Info value Columns. Second table has so called "pairs" , start and stop in certain time. I want to get exact matches when is found in source 1 or approximate match when is not exact as in source 1 (seconds can be a problem).
Lets say you are matching/lookup two tables, give me everything that falls between for instance 15.01.2022 06:00:00 and 15.01.2022 09:15:29.
Where I have a problem is more likely exact match and seconds. It is skipping or cant find any pair if the seconds are not matching. So my question is how to make if not seconds then lookup for next availablee match, can be a minute too as long as they are in the given range (start stop instances).
That is a reason I am getting this Expression error. Or is there a way to skip that error and proceed with Query??
Link to download the data:
https://docs.google.com/spreadsheets/d/1Jv5j7htAaEFktN0ntwOZCV9jesF43tEP/edit?usp=sharing&ouid=101738555398870704584&rtpof=true&sd=true
On the code below is what I am trying to do:
let
//Be sure to change the table names in the Source= and Source2= lines to be the actual table names from your workbook
Source = Excel.CurrentWorkbook(){[Name="Parameters"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date/Time", type datetime}, {"P7 [mbar]", Int64.Type}}),
//get start/stop times table
Source2 = Excel.CurrentWorkbook(){[Name="Log_Original"]}[Content],
typeIt = Table.TransformColumnTypes(Source2, {"Date/Time", type datetime}),
#"Filtered Rows" = Table.SelectRows(typeIt, each ([#"Date/Time"] <> null)),
#"Added Index" = Table.AddIndexColumn(#"Filtered Rows", "Index", 0, 1),
#"Added Custom" = Table.AddColumn(#"Added Index", "NextLineStart", each if Text.Contains([Info],"start", Comparer.OrdinalIgnoreCase) = true
and Text.Contains(#"Added Index"[Info]{[Index]+1},"start",Comparer.OrdinalIgnoreCase) = true
then "delete"
else null),
#"Filtered Rows1" = Table.SelectRows(#"Added Custom", each ([NextLineStart] = null)),
#"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows1",{"Index", "NextLineStart"}),
//create a list of all the relevant start/stop times
filterTimes = List.Combine(
List.Generate(
()=> [times = List.DateTimes(#"Removed Columns1"[#"Date/Time"]{0},
Duration.TotalSeconds(#"Removed Columns1"[#"Date/Time"]{1}-#"Removed Columns1"[#"Date/Time"]{0})+1,
#duration(0,0,0,1)), IDX = 0],
each [IDX] < Table.RowCount(#"Removed Columns1"),
each [times = List.DateTimes(#"Removed Columns1"[#"Date/Time"]{[IDX]+2},
Duration.TotalSeconds(#"Removed Columns1"[#"Date/Time"]{[IDX]+3}-#"Removed Columns1"[#"Date/Time"]{[IDX]+2})+1,
#duration(0,0,0,1)), IDX = [IDX]+2],
each [times]
)
),
//filter the table using the list
filterTimesCol = Table.FromList(filterTimes,Splitter.SplitByNothing()),
filteredTable = Table.Join(#"Changed Type","Date/Time",filterTimesCol,"Column1",JoinKind.Inner),
#"Removed Columns" = Table.RemoveColumns(filteredTable,{"Column1"}),
#"Added Custom1" = Table.AddColumn(#"Removed Columns", "Custom", each DateTime.ToText([#"Date/Time"],"dd-MMM-yy")),
#"Filtered Rows2" = Table.SelectRows(#"Added Custom1", each [#"Date/Time"] > #datetime(2019, 01, 01, 0, 0, 0)),
#"Sorted Rows" = Table.Sort(#"Filtered Rows2",{{"Date/Time", Order.Ascending}})
in
#"Sorted Rows"
I set up the below to return a sorted table with all results between the start and ending date/times. You can then select the first or middle or bottom row of each table if you want from this point. Its hard to tell from your question if you are looking for the value closest to the start value, closest to the end value or something inbetween. You can wrap my Table.Sort with a Table.FirstN or Table.LastN to pick up the first or last row.
I left most of your starting code alone
let Source = Table.Buffer(T1),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date/Time", type datetime}, {"P7 [mbar]", Int64.Type}}),
//get start/stop times table
Source2 = T2,
typeIt = Table.TransformColumnTypes(Source2, {"Date/Time", type datetime}),
#"Filtered Rows" = Table.SelectRows(typeIt, each ([#"Date/Time"] <> null)),
#"Added Index" = Table.AddIndexColumn(#"Filtered Rows", "Index", 0, 1),
// shift Info up one row for comparison
shiftedList = List.RemoveFirstN( #"Added Index"[Info],1),
custom1 = Table.ToColumns( #"Added Index") & {shiftedList},
custom2 = Table.FromColumns(custom1,Table.ColumnNames( #"Added Index") & {"NextInfo"}),
#"Filtered Rows2" = Table.SelectRows(custom2, each not (Text.Contains([Info],"start", Comparer.OrdinalIgnoreCase) and Text.Contains([NextInfo],"start", Comparer.OrdinalIgnoreCase))),
#"Added Custom3" = Table.AddColumn(#"Filtered Rows2", "Type", each if Text.Contains(Text.Lower([Info]),"start") then "start" else if Text.Contains(Text.Lower([Info]),"finished") then "finished" else null),
#"Removed Columns2" = Table.RemoveColumns(#"Added Custom3",{"Info", "NextInfo"}),
#"Added Custom1" = Table.AddColumn(#"Removed Columns2", "Custom", each if [Type]="start" then [Index] else null),
#"Filled Down" = Table.FillDown(#"Added Custom1",{"Custom"}),
#"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Index"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Type]), "Type", "Date/Time"),
#"Added Custom2" = Table.AddColumn(#"Pivoted Column","Table",(i)=>Table.Sort(Table.SelectRows(T1, each [#"Date/Time"]>=i[start] and [#"Date/Time"]<=i[finished]),{{"Date/Time", Order.Ascending}}) , type table )
in #"Added Custom2"
My question or rather problem, doesn't fall to far from this post.
I've checked these sources to get to where I am with the code below;
link 1
link 2
let
Source = Table.Combine({CITIBank_USDChecking, QNBFinansbank_TRYMevduat}),
#"Sort Rows" = Table.Sort(Source,{{"Date", Order.Ascending}}),
#"Add Custom ""Num"" Col" = Table.AddColumn(#"Sort Rows", "Num", each null),
#"Add Custom ""Payee"" Col" = Table.AddColumn(#"Add Custom ""Num"" Col", "Payee", each null),
#"Add Custom ""Tag"" Col" = Table.AddColumn(#"Add Custom ""Payee"" Col", "Tag", each null),
#"Add Custom ""Category"" Col" = Table.AddColumn(#"Add Custom ""Tag"" Col", "Category", each null),
#"Reorder Cols" = Table.ReorderColumns(#"Add Custom ""Category"" Col",{"Account", "Currency", "Date", "Num", "Payee", "Description", "Tag", "Category", "Clr", "Debit", "Credit"}),
BufferValues = List.Buffer(#"Reorder Cols"[Debit]),
BufferGroup = List.Buffer(#"Reorder Cols"[Account]),
runTotal = Table.FromList (
fxGroupedRunningTotal (BufferValues , BufferGroup) , Splitter.SplitByNothing() , {"runningDebit"}
),
combineColumns = List.Combine (
{ Table.ToColumns (#"Reorder Cols") , Table.ToColumns(runTotal) }
),
#"Convert to Table" = Table.FromColumns (
combineColumns , List.Combine ( { Table.ColumnNames(#"Reorder Cols") , {"runningDebit"} } )
),
#"Removed Columns" = Table.RemoveColumns(#"Convert to Table",{"runningDebit"})
in
#"Removed Columns"
The issue with doing grouped running totals -in my case- is the resetting that occurs once the list hits a value of 0. I don't know how to get around this issue at this point. Basically, I'm trying to get a running total of debits by account, where the debit is a list of decimal numbers and 0 at times. Maybe #alexis-olson can chime in...
This would be the running group sum of Debits based on the sample starting image
Basically, you add an index
Compare the Account to the Account in the row below. If they dont match, copy over the index
Fill upwards. You now have unique groupings for Account
Add a formula for running sum based on that grouping
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1),
#"Added Custom" = Table.AddColumn(#"Added Index", "Custom", each try if [Account] = #"Added Index"{[Index]+1}[Account] then null else [Index] otherwise [Index]),
#"Filled Up" = Table.FillUp(#"Added Custom",{"Custom"}),
#"Added Custom2" = Table.AddColumn(#"Filled Up","runningDebit",(i)=>List.Sum(Table.SelectRows(#"Filled Up", each [Custom] = i[Custom] and [Index]<=i[Index]) [Debit]), type number)
in #"Added Custom2"