Excel Power Query to only show unhidden rows

Excel Power Query to only show unhidden rows - excel

i have an excel workbook with 3 worksheets. each of the worksheet contains hidden and unhidden rows. i’ve managed to combine all 3 worksheets into one single worksheet via power query. however, the worksheet generated shows all the hidden and unhidden rows. is there anyways i could just generate the worksheet with just the unhidden rows?
i’ve read about using a helping column and applying a filter on it but i am unsure of how to do that as well. any new suggestions/walkthrough is greatly appreciated

You can do this in Power Query.
You can unzip the XLSX file to XML tables, and examine the row attributes - then get the sheet data, and merge the hidden attribute with the row number.
Pass this function the file name (including path) and sheet name, and it will return the sheet contents, with an added "Row hidden" column:
//fnGetRowHiddenStatus
(MyFileName as text, MySheetName as text) =>
let
fnUnzip = (ZIPFile, Position, FileToExtract, DataSoFar) =>
let
MyBinaryFormat = try BinaryFormat.Record([DataToSkip=BinaryFormat.Binary(Position),
MiscHeader=BinaryFormat.Binary(18),
FileSize=BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger32, ByteOrder.LittleEndian),
UnCompressedFileSize=BinaryFormat.Binary(4),
FileNameLen=BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger16, ByteOrder.LittleEndian),
ExtrasLen=BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger16, ByteOrder.LittleEndian),
TheRest=BinaryFormat.Binary()]) otherwise null,
MyCompressedFileSize = try MyBinaryFormat(ZIPFile)[FileSize]+1 otherwise null,
MyFileNameLen = try MyBinaryFormat(ZIPFile)[FileNameLen] otherwise null,
MyExtrasLen = try MyBinaryFormat(ZIPFile)[ExtrasLen] otherwise null,
MyBinaryFormat2 = try BinaryFormat.Record([DataToSkip=BinaryFormat.Binary(Position), Header=BinaryFormat.Binary(30), Filename=BinaryFormat.Text(MyFileNameLen), Extras=BinaryFormat.Binary(MyExtrasLen), Data=BinaryFormat.Binary(MyCompressedFileSize), TheRest=BinaryFormat.Binary()]) otherwise null,
MyFileName = try MyBinaryFormat2(ZIPFile)[Filename] otherwise null,
GetDataToDecompress = try MyBinaryFormat2(ZIPFile)[Data] otherwise null,
DecompressData = try Binary.Decompress(GetDataToDecompress, Compression.Deflate) otherwise null,
NewPosition = try Position + 30 + MyFileNameLen + MyExtrasLen + MyCompressedFileSize - 1 otherwise null,
AsATable = Table.FromRecords({[Filename = MyFileName, Content=DecompressData]}),
#"Appended Query" = if DecompressData = null then DataSoFar else if (MyFileName = FileToExtract) then AsATable else
if (FileToExtract = "") and Position <> 0 then Table.Combine({DataSoFar, AsATable})
else AsATable
in
if (MyFileName = FileToExtract) or (#"Appended Query" = DataSoFar) then
#"Appended Query"
else
#fnUnzip(ZIPFile, NewPosition, FileToExtract, #"Appended Query"),
Unzipped = fnUnzip(File.Contents(MyFileName), 0, "", null),
WorkbookXML = Xml.Tables(Table.SelectRows(Unzipped, each Text.Contains([Filename],"xl/workbook"))[Content]{0}),
WorkbookData = Table.SelectRows(WorkbookXML, each [Name] = "sheets"){0}[Table]{0}[Table],
SheetIDs = Table.ExpandTableColumn(WorkbookData, "http://schemas.openxmlformats.org/officeDocument/2006/relationships", {"Attribute:id"}, {"Attribute:id"}),
SheetID = Text.Replace(Table.SelectRows(SheetIDs, each ([#"Attribute:name"] = MySheetName))[#"Attribute:id"]{0},"rId","sheet"),
SheetXML = Xml.Tables(Table.SelectRows(Unzipped, each Text.Contains([Filename], "worksheets/" & SheetID))[Content]{0}),
SheetData = Table.SelectRows(SheetXML, each [Name]="sheetData"){0}[Table]{0}[Table],
Renamed = Table.RenameColumns(SheetData,{{"Attribute:r", "Row"}}),
#"Row Data" = Table.SelectColumns(Renamed,{"Row", "Attribute:hidden"}, MissingField.Ignore),
#"Changed Type" = Table.TransformColumnTypes(#"Row Data",{{"Row", Int64.Type}}),
#"Row Status" = Table.AddColumn(#"Changed Type", "Hidden", each try if [#"Attribute:hidden"] = "1" then true else false otherwise false, type logical),
Workbook = Excel.Workbook(File.Contents(MyFileName)),
Worksheet = Table.SelectRows(Workbook, each ([Name] = MySheetName))[Data]{0},
#"Added Row" = Table.AddIndexColumn(Worksheet, "Row", 1, 1, Int64.Type),
#"Merged Row Status" = Table.NestedJoin(#"Added Row", {"Row"}, #"Row Status", {"Row"}, "Row Status", JoinKind.LeftOuter),
#"Expanded Row Status" = Table.ExpandTableColumn(#"Merged Row Status", "Row Status", {"Hidden"}, {"Row hidden"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Row Status",{"Row"})
in
#"Removed Columns"
Example:
let
Source = fnGetRowHiddenStatus("C:\Temp\rowhide.xlsx", "Sheet1")
in
Source

Related

How can i get all the sub-children?

source link
Hello guys, so i have a function ("flecheD"),
(ColChild,ColParent,ParentActuel,source)=>
let
mylist=Table.Column(Table.SelectRows(source,each Record.Field(_,ColParent)=ParentActuel),ColChild),
resultat=Text.Combine(mylist)
in
Text.Trim(
if resultat ="" then "" else # resultat &"|" & # flecheD(ColChild,ColParent,resultat,source),"|")
which loops through 2 columns (Parent,Child) to get all children of the main parent (output->Children column). The problem is that when the function is confronted with several children, the resultat variable no longer has a single letter/child but several, which blocks the function from looking for the other sub-children.
In order to solve this, I tried to create a custom function "SubChilldren" with List.Generate()
(children as text, ColChild,ColParent,source)=>
let
i = 1,
length = Text.Length(children),
subchildren = List.Generate( ()=>#flecheD(ColChild,ColParent,Text.At(children,i-1),source), i<=length, i+1 )
in
Text.Combine(subchildren)
which when coupled with my initial function
(ColChild,ColParent,ParentActuel,source)=>
let
mylist=Table.Column(Table.SelectRows(source,each Record.Field(_,ColParent)=ParentActuel),ColChild),
resultat=Text.Combine(mylist)
in
Text.Trim(
if resultat ="" then "" else if Text.Length(resultat) = 1 then # resultat &"|" & # flecheD(ColChild,ColParent,resultat,source)
else #resultat &"|"& SubChildren(resultat,ColChild,ColParent,source),"|")
should normally get the sub-children of each children. However, it doesnt work . Could you please help me . Thx

I thought this was a fun way, but you could write a recursive function as well. I have it hard coded to 4 levels of children deep
(not sure how in your source data D child can have two parents, c and J, but whatever)
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Grouped Rows" = Table.Group(Source, {"Parent"}, {{"data", each List.RemoveNulls(_[Child]), type list}}),
Parent_List = List.Buffer(#"Grouped Rows"[Parent] ),
Child_List = List.Buffer(#"Grouped Rows"[data] ),
Process = (n as list) as list =>
let children = List.Transform(List.Transform(n, each Text.ToList(_)), each Text.Combine( List.Distinct(List.Combine(List.Transform(_, each try Child_List{List.PositionOf( Parent_List, _ )} otherwise null))))) in children,
Level1=Process(Source[Parent]),
Level2=Process(Level1),
Level3=Process(Level2),
Level4=Process(Level3),
Final=List.Transform(List.Positions(Level1),each Level1{_}&"|"&Level2{_}&"|"&Level3{_}&"|"&Level4{_}&"|"),
#"Replaced Value" = Table.ReplaceValue(Table.FromList(Final),"||","",Replacer.ReplaceText,{"Column1"}),
custom1 = Table.ToColumns(Source) & Table.ToColumns(#"Replaced Value"),
custom2 = Table.FromColumns(custom1,Table.ColumnNames(Source) & {"Children"})
in custom2
edited to be generic so it can take text as well as numerical inputs
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Parent", type text}, {"Child", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Parent"}, {{"data", each List.Transform(List.RemoveNulls(_[Child]), each Text.From(_)), type list}}),
Parent_List = List.Buffer(List.Transform(#"Grouped Rows"[Parent], each Text.From(_))),
Child_List = List.Buffer(#"Grouped Rows"[data]),
Process = (n as list) as list =>let children = List.Transform(List.Transform(n, each Text.Split(_,",") ) , each try Text.Combine(List.Distinct(List.Combine(List.Transform(_, each try Child_List{List.PositionOf( Parent_List, _ )} otherwise ""))),"," ) otherwise "") in children,
Level1=Process(#"Changed Type"[Parent]),
Level2=Process(Level1),
Level3=Process(Level2),
Level4=Process(Level3),
Final=List.Transform(List.Positions(Level1),each Level1{_}&"|"&Level2{_}&"|"&Level3{_}&"|"&Level4{_}&"|"),
#"Replaced Value" = Table.ReplaceValue(Table.FromColumns({Final}),"||","",Replacer.ReplaceText,{"Column1"}),
custom1 = Table.ToColumns(#"Changed Type") & Table.ToColumns(#"Replaced Value"),
custom2 = Table.FromColumns(custom1,Table.ColumnNames(#"Changed Type") & {"Children"})
in custom2

PowerQuery, excel: How to dynamically detect if certain tags are present in an external source and create a table with points per tag for each series?

I am a bit new to powerquery and need some help with the points specified below the code.
I have a piece of working code that does the following:
Load the source
find the max "series" of a row, the format is a mix of letters and numbers, i.e. Y21Q3S1, the letters stay the same and the numbers are increasing (year, quarter, and series).
I want to look if a certain tag is assigned to a row, so I search all the tag columns if a tag is present and write that in the "Tags" column and "none" if there were none found
through grouping I find the points per tag, for each "max series"
I finally present it in a table in excel with first column being the series, then a column for the Tags as well as a column "None" if none of the Tags were present. I add a last updated date column.
The code:
let
Source = Csv.Document(Web.Contents("somefile.csv"),[Delimiter=",", Columns=32, Encoding=65001, QuoteStyle=QuoteStyle.None]),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Changed Type with Locale" = Table.TransformColumnTypes(#"Promoted Headers", {{"Custom field (Points)", type number}}, "en-GB"),
#"Added Custom" = Table.AddColumn(#"Changed Type with Locale", "Max Series", each List.Max({[Series], [Series_1], [Series_2], [Series_3], [Series_4]})),
Tags = Table.AddColumn(#"Added Custom", "Tags", each if List.Contains({[Tags], [Tags_7], [Tags_8], [Tags_9], [Tags_10], [Tags_11], [Tags_12], [Tags_13], [Tags_14], [Tags_15], [Tags_16], [Tags_17], [Tags_18], [Tags_19], [Tags_20], [Tags_21], [Tags_22], [Tags_23]}, "tag1") then "tag1"
else if List.Contains({[Tags], [Tags_7], [Tags_8], [Tags_9], [Tags_10], [Tags_11], [Tags_12], [Tags_13], [Tags_14], [Tags_15], [Tags_16], [Tags_17], [Tags_18], [Tags_19], [Tags_20], [Tags_21], [Tags_22], [Tags_23]}, "tag2") then "tag2"
else if List.Contains({[Tags], [Tags_7], [Tags_8], [Tags_9], [Tags_10], [Tags_11], [Tags_12], [Tags_13], [Tags_14], [Tags_15], [Tags_16], [Tags_17], [Tags_18], [Tags_19], [Tags_20], [Tags_21], [Tags_22], [Tags_23]}, "tag3") then "tag3"
else "zzzNone"),
RemoveDummy = Table.SelectRows(Tags, each [ID] <> "ID-1234"),
#"Grouped Rows" = Table.Group(RemoveDummy, {"Max Series", "Tags"}, {{"Points per Tags", each List.Sum([#"Custom field (Points)]), type number}}),
#"Sorted Rows" = Table.Sort(#"Grouped Rows",{{"Tags", Order.Ascending}}),
#"Pivoted Column" = Table.Pivot(#"Sorted Rows", List.Distinct(#"Sorted Rows"[Tags]), "Tags", "Points per Tags"),
#"Renamed Columns" = Table.RenameColumns(#"Pivoted Column",{{"zzzNone", "None"}, {"Max Series", "Series"}}),
#"Added Custom1" = Table.AddColumn(#"Renamed Columns", "Last update", each DateTime.LocalNow()),
#"Changed Type1" = Table.TransformColumnTypes(#"Added Custom1",{{"Last update", type datetime}})
in
#"Changed Type1"
The "Series" and "Tags" columns are a multivariable field, containing all series and tags and is translated by excel into multiple columns. The issue is that the number of series and tags are changing and to try coping with this I have created a dummy row with a lot of series. However, as you can see from the code this also changes and somehow "Tags_2" to "Tags_6" has disappeared and I had to error correct by removing these from the code.
Is there a dynamic way to if any column "Tags_*" contains "tag1" then... so I don't have to hard-code this?
Same goes for the "Max Series" where I would like to dynamically take max value of any columns "Series_*"
I would like to make the “Tags” step more dynamic, so that I can take input from a table in the excel sheet specifying which tags I want to search for instead of hardcoding “tag1, “tag2” etc.
My current code only assigns the points to the first tag found. However, I would like to assign points to several tags, so if two tags were found the "points" be assigned with half to each and for 3 tags they would all get one third of the points. I don’t know how to do this. Could you help me here?
As I am a bit new powerquery my code might be far from optimal, if you have some suggestions in your answers on how I can improve it that would be highly appreciated :-)

Hi mitru and welcome to StackOverflow!
You can make the 'Tags' step automatic by 'Unpivot other columns' and 'Group by' operations. To obtain this you should select all non Tag* columns and use 'Unpivot other columns'. Then please perform a Group by operation with operation = All Rows
You will receive a column populated with tables. The next step is to create a Custom columns with following formula:
=if List.Contains([Tags][Value],"tag1") then "tag1"
else if List.Contains([Tags][Value],"tag2") then "tag2"
else if List.Contains([Tags][Value],"tag3") then "tag3"
else "zzzNone"
The [Tags] is the column containing tables while [Value] is the column within each table that contains tags you are looking for.
Under below link there is a file with sample solution that I created.
https://sendeyo.com/en/608c8dee7f
Regarding bullet 3. I am not sure how the scoring system should work. Can you provide a sample data with the final output?

With the help from Gonso's post I was able to make the "tags" step more dynamic.
Furthermore, I found a solution on the bullet 3 assigning points to different tags if more than one tag is present.
I am posting the updated code here in case anyone else find the solution helpful:
let
Source = Csv.Document(Web.Contents("somefile"),[Delimiter=",", Columns=50, Encoding=65001, QuoteStyle=QuoteStyle.None]),
Promoted_Headers = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
Changed_Type_with_Locale = Table.TransformColumnTypes(Promoted_Headers, {{"Custom field (Points)", type number}}, "en-GB"),
Max_Series = Table.AddColumn(Changed_Type_with_Locale, "Max Series", each List.Max({[Series], [Series_1], [Series_2], [Series_3], [Series_4]})),
Unpivoted_Other_Columns = Table.UnpivotOtherColumns(Max_Series, {"Type", "ID", "Custom field (Points)", "Series", "Series_1", "Series_2", "Series_3", "Series_4", "Series_5", "Series_6", "Series_7", "Series_8", "Series_9", "Max Series"}, "Attribute", "Value"),
Tags = Table.Group(Unpivoted_Other_Columns, {"Type", "ID", "Custom field (Points)", "Series", "Series_1", "Series_2", "Series_3", "Series_4", "Series_5", "Series_6", "Series_7", "Series_8", "Series_9", "Max Series"}, {{"Tags", each _, type table [Type=nullable text, ID=nullable text, #"Custom field (Points)"=nullable number, Series=nullable text, Series_1=nullable text, Series_2=nullable text, Series_3=nullable text, Series_4=nullable text, Series_5=nullable text, Series_6=nullable text, Series_7=nullable text, Series_8=nullable text, Series_9=nullable text, Max Series=text, Attribute=text, Value=text]}}),
Tag1 = Table.AddColumn(Tags, "Tag1", each if List.Contains([Tags][Value],"Tag1") then 1
else 0),
Tag2 = Table.AddColumn(Tag, "Tag2", each if List.Contains([Tags][Value],"tag2") then 1
else 0),
Tag3 = Table.AddColumn(Tag2, "Tag3", each if List.Contains([Tags][Value],"tag3") then 1
else 0),
NoneTag = Table.AddColumn(Tag3, "TagNone", each if List.Sum({[Tag], [Tag2], [Tag3]}) > 0
then 0
else 1),
Tag_total = Table.AddColumn(NoneTag, "Tag_total", each List.Sum({[Tag], [Tag2], [Tag3], [TagNone]})),
Tag_update = Table.ReplaceValue(Tag_total,each [Tag1], each if [Tag1] > 0 then ([#"Custom field (Points)"] * ([Tag1] / [Tag_total])) else [Tag1],Replacer.ReplaceValue,{"Tag1"}),
Tag2_update = Table.ReplaceValue(Tag_update, each [Tag2], each if [Tag2] > 0 then ([#"Custom field (Points)"] * ([Tag2] / [Tag_total])) else [Tag2],Replacer.ReplaceValue,{"Tag2"}),
Tag3_update = Table.ReplaceValue(Tag2_update, each [Tag3], each if [Tag3] > 0 then ([#"Custom field (Points)"] * ([Tag3] / [Tag_total])) else [Tag3],Replacer.ReplaceValue,{"Tag3"}),
TagNone_update = Table.ReplaceValue(Tag3_update, each [TagNone], each if [TagNone] > 0 then ([#"Custom field (Points)"] * ([TagNone] / [Tag_total])) else [TagNone],Replacer.ReplaceValue,{"TagNone"}),
RemoveDummy = Table.SelectRows(TagNone_update, each [ID] <> "ID-1234"),
Grouped_Series = Table.Group(RemoveDummy, {"Max Series"}, {{"Tag1", each List.Sum([Tag1]), type number}, {"Tag2", each List.Sum([Tag2]), type number}, {"Tag3", each List.Sum([Tag3]), type number}, {"None", each List.Sum([TagNone]), type nullable number}}),
Sorted_Series = Table.Sort(Grouped_Series,{{"Max Series", Order.Ascending}}),
Renamed_Series = Table.RenameColumns(Sorted_Series,{{"Max Series", "Series"}}),
Added_last_update = Table.AddColumn(Renamed_Series, "Last update", each dateTime.LocalNow()),
Changed_date_Type = Table.TransformColumnTypes(Added_last_update,{{"Last update", type datetime}})
in
Changed_date_Type

Replace by regular expresion in Excel

I have a list in Excel like the following:
1 / 6 / 45
123
1546
123 456
1247 /% 456 /
I want to create a new column with all sequences of consecutive non digits replaced by a character. In Google Sheets, this is easy using =REGEXREPLACE(A1&"/","\D+",","), resulting in:
1,6,45,
123,
1546,
123,456
1247,456,
In that formula A1&"/" is needed in order for REGEXREPLACE to work with numbers. No big deal, just adds a comma at the end.
How can we do this in Excel? Pure Power Query (not R, not Python, just M) is very much encouraged. VBA and other clickable Excel features are unacceptable (like find and replace).

If you have Excel 365:
In B1:
=LET(X,MID(A1,SEQUENCE(LEN(A1)),1),SUBSTITUTE(TRIM(CONCAT(IF(ISNUMBER(--X),X," ")))," ",","))
Or if streaks of digits are always delimited by at least a space:
=TEXTJOIN(",",,FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s[.*0=0]"))
Another option, if you have got access to it, is LAMBDA(). Make a function to replace all kind of characters, something along the lines of this. Without LAMBDA() and TEXTJOIN() I think your best bet would be to start nesting SUBSTITUTE() functions.

Here is a Power Query solution.
It makes use of the List.Accumulate function to determine whether to add a digit, or a comma, to the string:
Note that the code replicates what you show for results. If you prefer to avoid trailing (and/or leading) commas, it can be easily modified.
let
Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "textToList", each List.Combine({Text.ToList([Column1]),{","}})),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "commaTerminators", each List.Accumulate(
[textToList],"", (state,current) =>
if List.Contains({"0".."9"},current)
then state & current
else if Text.EndsWith(state,",")
then state
else state & ",")),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom1",{"textToList"})
in
#"Removed Columns"
Edit To eliminate leading/trailing commas, we add the Text.Trim function which, in Power Query, allows defining a specific text to Trim from the start/end:
let
Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "textToList", each List.Combine({Text.ToList([Column1]),{","}})),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "commaTerminators", each
Text.Trim(
List.Accumulate(
[textToList],"", (state,current) =>
if List.Contains({"0".."9"},current)
then state & current
else if Text.EndsWith(state,",")
then state
else state & ","),
",")),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom1",{"textToList"})
in
#"Removed Columns"
VBA UDF You mentioned you did not want VBA, but not clear if you were restricting that to a "clickable". Here is a user defined function that you can use on a worksheet directly. It uses the VBA regex engine which allows easy extraction of multiple matches
You can enter a formula on the worksheet such as =commaSep(cell_ref) to get the same results as shown above in my second PQ example
Option Explicit
Function commaSep(S As String) As String
Dim RE As Object, MC As Object, M As Object
Dim sTemp As String
Set RE = CreateObject("vbscript.regexp")
With RE
.Global = True
.Pattern = "\d+"
If .test(S) Then
Set MC = .Execute(S)
sTemp = ""
For Each M In MC
sTemp = sTemp & "," & M
Next M
commaSep = Mid(sTemp, 2)
Else
commaSep = "no digits"
End If
End With

This is another variation if you have TEXTJOIN function available.
=SUBSTITUTE(TRIM(TEXTJOIN("",TRUE,IFERROR(MID(A2,ROW($A$1:INDEX(A:A,LEN(A2))),1)+0," ")))," ",",")

And another option in Power Query.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTQVzADYhNTpVgdINfIGEKbmpjBBIByZgpQjom5gr4qWEBfKTYWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
x1 = Table.AddColumn(#"Changed Type", "x1", each Text.ToList([Column1])),
x2 = Table.AddColumn(x1, "x2", each List.Transform([x1], each if Text.Contains("0123456789", _) then _ else " " )),
x3 = Table.AddColumn(x2, "x3", each Text.Split(Text.Combine([x2])," ")),
x4 = Table.AddColumn(x3, "x4", each List.Transform([x3], each if Text.Contains("0123456789", try Text.At(_,0) otherwise " ") then _&"," else "" )),
x5 = Table.AddColumn(x4, "x5", each Text.Combine([x4])),
#"Removed Columns" = Table.RemoveColumns(x5,{"x1", "x2", "x3", "x4"})
in
#"Removed Columns"

Moving one column to the end of another

I have a bunch of excel files that are formatted in a pretty weird way that I would like to make an automated import script for, so I can easily get this data into a sheet with the correct formatting.
It has hourly values for each month in the first 12 columns, then the date along with hours in the next 12 columns.
What I would like to do is be able to get this data in to a table where the first column has the date and hour (in excel format) and the second one containing the data. My thought was to record a macro during the process of adjusting the data with power query and then repeating the macro with multiple files. However, I can't seem to find a good way to make "Column 2"'s data moved to the end of "Column 1" and repeating that for both values and the dates using Power Query. Any pointers?
Also notice, column 1s length differs from column 2 since January has more values than February. However, column 1s lenght is the same as column 13, column 2s length same as 14 and so on.
I have uploaded a sample file here

Create a blank query from scratch (on my machine I do this in Excel via: Data > Get Data > From Other Sources > Blank Query).
Click Home > Advanced Editor, copy-paste the code below and change this line folderPath = "C:\Users\user\", to the path of the parent folder that contains the Excel files. Then click Close & Load.
As the data is being imported from multiple workbooks (and possibly multiple sheets), the first two columns of the loaded table should be the workbook and worksheet that that row of data came from. (If you want to get rid of the first two columns, edit the query.)
let
folderPath = "C:\Users\user\",
getDataFromSheet = (sheetData as table) =>
let
promoteHeaders = Table.PromoteHeaders(sheetData, [PromoteAllScalars=true]),
standardiseHeaders =
let
headers = Table.ColumnNames(promoteHeaders),
zipWithLowercase = List.Zip({headers, List.Transform(headers, Text.Lower)}),
renameAsLowercase = Table.RenameColumns(promoteHeaders, zipWithLowercase)
in
renameAsLowercase,
emptyTable = Table.FromColumns({{},{}}, {"Date", "Value"}),
monthsToLoopOver = {"januari", "februari", "mars", "april", "maj", "juni", "juli", "augusti", "september", "oktober", "november", "december"},
appendEachMonth = List.Accumulate(monthsToLoopOver, emptyTable, (tableState, currentMonth) =>
let
selectColumns = Table.SelectColumns(standardiseHeaders, {currentMonth & "tim", currentMonth}, MissingField.UseNull),
renameColumns = Table.RenameColumns(selectColumns, {{currentMonth & "tim", "Date"}, {currentMonth, "Value"}}),
appendToTable = Table.Combine({tableState, renameColumns})
in
appendToTable
),
tableOrNull = if List.Contains(Table.ColumnNames(standardiseHeaders), "januari") then appendEachMonth else null
in
tableOrNull,
getDataFromWorkbook = (filePath as text) =>
let
workbookContents = Excel.Workbook(File.Contents(filePath)),
sheetsOnly = Table.SelectRows(workbookContents, each [Kind] = "Sheet"),
invokeFunction = Table.AddColumn(sheetsOnly, "f", each getDataFromSheet([Data]), type table),
appendAndExpand =
let
selectColumnsAndRows = Table.SelectColumns(Table.SelectRows(invokeFunction, each not ([f] is null)), {"Name", "f"}),
renameColumns = Table.RenameColumns(selectColumnsAndRows, {{"Name", "Sheet"}}),
expandColumn = Table.ExpandTableColumn(renameColumns, "f", {"Date", "Value"})
in
expandColumn
in
appendAndExpand,
filesInFolder = Folder.Files(folderPath),
validFilesOnly = Table.SelectRows(filesInFolder, each [Extension] = ".xlsx"),
invokeFunction = Table.AddColumn(validFilesOnly, "f", each getDataFromWorkbook([Folder Path] & [Name])),
appendAndExpand =
let
selectRowsAndColumns = Table.SelectColumns(Table.SelectRows(invokeFunction, each not ([f] is null)), {"Name", "f"}),
renameColumns = Table.RenameColumns(selectRowsAndColumns, {{"Name", "Workbook"}}),
expandColumn = Table.ExpandTableColumn(renameColumns, "f", {"Sheet", "Date", "Value"})
in
expandColumn,
excludeBlankDates = Table.SelectRows(appendAndExpand, each not (Text.StartsWith([Date], " "))),
transformTypes =
let
dateAndHour = Table.TransformColumns(excludeBlankDates, {{"Date", each Text.Split(_, " ")}}),
changeTypes = Table.TransformColumns(dateAndHour, {{"Workbook", Text.From, type text}, {"Sheet", Text.From, type text}, {"Date", each DateTime.From(_{0}) + #duration(0, Number.From(_{1}), 0, 0), type datetime}, {"Value", Number.From, type number}})
in
changeTypes
in
transformTypes
For reliability and robustness, it would be good if you create a folder and put all Excel files (that need restructuring) into that folder -- and ensure nothing else goes into that folder (not even the file that will be doing the importing/restructuring).
If you can't do this for whatever reason, then click the validFilesOnly step whilst in the Query Editor and amend the filter criteria, such that the table only includes files you want restructured.

You can do the formatting pretty quickly by writing your own macro in VBA. Then you can create another macro to run the formatting macro on multiple files within a folder.
Run same excel macro on multiple excel files
Here is an example of a macro that will reformat your data closer to what you are looking for.
Sub FormatBlad()
' create a new sheet and rename it
Sheets.Add After:=ActiveSheet
Sheets(Sheets.Count).Name = "Formatted"
' set helper variables
Dim blad As Worksheet
Dim format As Worksheet
Set blad = Sheets("Blad1")
Set ft = Sheets("Formatted")
Dim blad_row_num As Integer
Dim ft_row_num As Integer
Dim month_offset As Integer
blad_row_num = 2
ft_row_num = 2
month_offset = 13 ' column N - 1
' set column headers in formatted sheet
ft.Range("A1").Value = "Date"
ft.Range("B1").Value = "Value"
' loop through months
For i = 1 To 12
blad_row_num = 2
While blad.Cells(blad_row_num, i).Value <> ""
ft.Cells(ft_row_num, 1).Value = blad.Cells(blad_row_num, month_offset + i).Value
ft.Cells(ft_row_num, 2).Value = blad.Cells(blad_row_num, i).Value
blad_row_num = blad_row_num + 1
ft_row_num = ft_row_num + 1
Wend
Next i
End Sub

Power BI - Call Azure API with nextLink (next page)

Apologies, I'm new to Power BI. I'm using Power BI to call an Azure API that will list all the VMs in my subscription, however it will only show the first 50 before having a nextLink.
Here is the API I'm calling;
https://management.azure.com/subscriptions/< subscription >/providers/Microsoft.Compute/virtualMachines?api-version=2017-12-01
I've seen other pages and forums with a similar issue (such as Microsoft API), but not for Azure API. I messed about with their fix, but could not work out how to apply it to mine.
Their code;
let
GetUserInfo = (Path)=>
let
Source = Json.Document(Web.Contents(Path)),
LL= #Source[value],
result = try #LL & #GetUserInfo(Source[#"#odata.nextLink"]) otherwise #LL
in
result,
Fullset = GetUserInfo("https://graph.microsoft.com/beta/users?$select=manager&$expand=manager"),
#"Converted to Table" = Table.FromList(Fullset, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"id", "displayName", "manager"}, {"Column1.id", "Column1.displayName", "Column1.manager"}),
#"Expanded Column1.manager" = Table.ExpandRecordColumn(#"Expanded Column1", "Column1.manager", {"id", "displayName"}, {"id", "displayName"}),
#"Renamed Columns" = Table.RenameColumns(#"Expanded Column1.manager",{{"Column1.displayName", "Employee Full Name"}, {"Column1.id", "Employee Id"}, {"id", "Manager Id"}, {"displayName", "Manager Full name"}})
in
#"Renamed Columns"
Compared to the start of mine once I've connected the source by the simple web link;
let
Source = Json.Document(Web.Contents("https://management.azure.com/subscriptions/< subscription >/providers/Microsoft.Compute/virtualMachines?api-version=2017-12-01")),
#"Converted to Table" = Record.ToTable(Source)
in
#"Converted to Table"
If I were to adjust it, I suspected it would look something like this;
let
GetUserInfo = (Path)=>
let
Source = Json.Document(Web.Contents(Path)),
LL= #Source[value],
result = try #LL & #GetUserInfo(Source[#"#odata.nextLink"]) otherwise #LL
in
result,
Fullset = GetUserInfo("https://management.azure.com/subscriptions/< subscription >/providers/Microsoft.Compute/virtualMachines?api-version=2017-12-01"),
#"Converted to Table" = Record.ToTable(Source)
in
#"Converted to Table"
However I am prompted with the following error once clicking OK;
Expression.Error: The name 'Source' wasn't recognized. Make sure it's spelled correctly.
Any help on this would be greatly appreciated.

For anyone interested, here is what I ended up doing thanks to this link:
https://datachant.com/2016/06/27/cursor-based-pagination-power-query/
let
iterations = 10,
url =
"https://management.azure.com/subscriptions/< subscription >/providers/Microsoft.Compute/virtualMachines?api-version=2017-12-01",
FnGetOnePage =
(url) as record =>
let
Source = Json.Document(Web.Contents(url)),
data = try Source[value] otherwise null,
next = try Source[nextLink] otherwise null,
res = [Data=data, Next=next]
in
res,
GeneratedList =
List.Generate(
()=>[i=0, res = FnGetOnePage(url)],
each [i]<iterations and [res][Data]<>null,
each [i=[i]+1, res = FnGetOnePage([res][Next])],
each [res][Data])
in
GeneratedList
1 whole day of Googling headache :S

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Excel Power Query to only show unhidden rows - excel

Related

How can i get all the sub-children?

PowerQuery, excel: How to dynamically detect if certain tags are present in an external source and create a table with points per tag for each series?

Replace by regular expresion in Excel

Moving one column to the end of another

Power BI - Call Azure API with nextLink (next page)

Categories

Resources