Power Query data load into model and table does not sort correctly - Excel

I have a table with a list of names that I fetch via a connection in Power Query. I apply some steps, including sorting the names column alphabetically, and then load it to a Table and the "Data Model". However, the table that is loaded onto the worksheet contains the names sorted in a completely different order; it's like Excel is ignoring my sorting preference completely. I tried sorting the data in the "Data Model", re-sorting it in Power Query, and even sorting the table in the worksheet itself, but after I hit refresh it reverts to the wrong order.

Try wrapping the sort in Table.Buffer:
= Table.Buffer(Table.Sort(Source,{{"date", Order.Ascending}}))
Alternatively, add an index column at the start and re-sort on the index when done.
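
The index approach might look roughly like the sketch below. It assumes a workbook table named Names with a text column Name (both hypothetical), and the Trimmed step only stands in for whatever transformations the real query performs.

```
let
    // Hypothetical source: an Excel table called "Names" in the current workbook
    Source = Excel.CurrentWorkbook(){[Name="Names"]}[Content],
    // Sort first, then record the sorted position in an index column
    Sorted = Table.Sort(Source, {{"Name", Order.Ascending}}),
    Indexed = Table.AddIndexColumn(Sorted, "SortIndex", 1, 1, Int64.Type),
    // Placeholder for the other steps that may disturb the row order
    Trimmed = Table.TransformColumns(Indexed, {{"Name", Text.Trim, type text}}),
    // Re-sort on the index as the last step
    Result = Table.Sort(Trimmed, {{"SortIndex", Order.Ascending}})
in
    Result
```

Keeping SortIndex in the loaded table also leaves a column to sort by on the worksheet if the load step still reorders the rows.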

I can confirm buffering doesn't work. Incredible bug.

Related

Refresh power pivot-power query

I have a table in Power Query that is fed from some Excel files. With this data I do an inner join with other catalog tables, perform operations on calculated columns, and then add the result to the Power Pivot data model to build some pivot tables. Initially everything worked very well, until I adjusted the Power Query table by removing and adding columns and editing the inner join operations. Now, when I refresh and try to pass the Power Query table to the Power Pivot data model, I get an error that the table does not exist. Note that refreshing only the Power Query table works without problems; the problem occurs when the data is passed to Power Pivot.
How can I correct this error?
Sorry for my English
Yes, these issues often happen when changes are made to the initial queries. Typically, when query names or field/column names are changed and those names are used in the merge or calculation steps, the query will raise these errors.
So, review and compare all the changes that you made to the steps after the merge.
If you don't find any errors, consider making a copy of the steps and rebuilding from the merge onward to ensure optimum performance.

Why does Power Query not honor the sort in the output table when loaded to the data model?

I'm using Excel 365 and pulling in a table via Power Query. In the underlying SQL, I use an ORDER BY clause to sort the data, which works correctly in the Power Query preview and in my SQL client. But I noticed that the table that ends up loaded in Excel isn't sorted in the same order. I added a sort step to Power Query as the final step, and it still didn't work, even after I wrapped the step in Table.Buffer(). I have also tried it with and without the "Preserve column sort/filter/layout" checkbox checked.
It only honors the sort settings if I uncheck the box to load the query into the data model.
How can I make it follow the sort as specified and load into the data model?

Excel Data Queries - Ignore missing table / assign specific table number for every query

I am having a bit of trouble creating an automated report based on an HTML file. The file contains tables with data structured from the web page, and I just create tables from the tables recognized by Excel. So far it does what I need, but sometimes one or more tables are missing from the HTML file, which causes the remaining tables to shift position: if table 0 is missing, table 1 takes its place and breaks the entire sheet, because the wrong table is now where table 0 should be.
What I want to know is whether there is a way to assign every query to a specific table number, so that Table 0 gets its values from the specified query and not from the first one that comes in the list of queries. The code so far in the Power Query Editor is this:
let
    Source = Web.Page(File.Contents("D:\AUTO.html")),
    Data0 = Source{0}[Data]
in
    Data0
I use this code because the columns or rows will not always be the same, sometimes one can be missing and if I use the original code that is generated when getting the data from the page it will give errors and not load the table if there is a missing column/row.
Any help is appreciated.
MissingField.Ignore
When you use functions like Table.SelectColumns, Table.RenameColumns, or Table.ReorderColumns, you can pass the MissingField.Ignore option so that a missing field does not stop your query with an error,
e.g.:
= Table.SelectColumns(#"blah",{"column1", "column2", "column3"}, MissingField.Ignore)
documentation:
https://learn.microsoft.com/en-us/powerquery-m/missingfield-error
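
For the shifting-table part of the question, one possible sketch is to look the table up by its HTML id instead of by position and fall back to an empty table when it is absent. This assumes the table returned by Web.Page exposes an Id column and that the HTML table carries id="summary"; the id and the column names "Name" and "Value" are hypothetical.

```
let
    Source = Web.Page(File.Contents("D:\AUTO.html")),
    // Assumption: the wanted HTML table has id="summary"; match on Id rather than on position
    Matches = Table.SelectRows(Source, each [Id] = "summary"),
    // Fall back to an empty table when that table is absent from the file
    Data0 = if Table.IsEmpty(Matches) then #table({}, {}) else Matches{0}[Data],
    // Hypothetical column names; columns that are not present are skipped instead of raising an error
    Selected = Table.SelectColumns(Data0, {"Name", "Value"}, MissingField.Ignore),
    // A rename for a column that is not present is likewise skipped
    Renamed = Table.RenameColumns(Selected, {{"Value", "Amount"}}, MissingField.Ignore)
in
    Renamed
```

If the output should always have the same columns, MissingField.UseNull can be passed to Table.SelectColumns instead, which adds any absent column filled with null.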

PowerQuery to Excel sorting

How do I force an Excel table to keep the same sorting I've applied in Power Query?
I have loaded a data model query from an Access database file, which I have then shaped and sorted using Power Query.
Afterwards I imported it as an Excel table using "Existing Connections" and made sure that the "Preserve column sort/filter/layout" box is checked.
However, the data I see in Excel is not sorted and seems to be thrown in completely at random.
I have also checked the "Preserve column sort/filter/layout" box under external connections in "Design - Table tools".
I usually just add an index column in PQ and re-sort in Excel after linking to the existing connection.
The same issue happens in reverse when you bring sorted data into PQ: it re-sorts it without being asked. An index column in the initial table import solves that as well.

Power Query in Excel to Select Specific Cells from a column

I'm using Power Query in Excel to reference a table within the same workbook. I want to select specific columns within that table. I know that can be accomplished by loading the table into Power Query and then choosing the columns I want to use. The resulting query is:
let
    Source = Excel.CurrentWorkbook(){[Name="Legend_Data_Merged"]}[Content],
    #"Removed Other Columns" = Table.SelectColumns(
        Source,
        {
            "Observation number",
            "First Sales Offer - Products",
            "Middle Sales Offer(s) - Products",
            "Last Sales Offer - Products"
        }
    )
in
    #"Removed Other Columns"
So, here's my question/issue:
I think this way is first pulling the entire table into Power Query, then stripping down from there. What I want to do is define the source table as the "Legend_Data_Merged" table, but choose which columns to pull from that table in the same operation. This way, it never has to load the entire table into Power Query. The reason is the table itself is about 120 columns long, and I only need three columns, and I have about 20 of these similar queries and it's starting to hog memory. Am I wrong in my logic here? And if not, anyone have an idea on what the query would be?
Could there maybe be a way to define the columns in the [Content] part of the source operation?
Thanks.
It may be a very simple approach, but why not add a worksheet "DataTransfer" that holds only references to the columns you need, and read this small table with Power Query?
If your columns are close together, you could also define a named range and read only this range with Power Query.
But anyway, when the workbook is open, your big table is already in memory. There should not be much memory allocation, when reading the table with powerquery and selecting the three columns.
It's possible there's some problem in Excel or Power Query. How much memory are you seeing used by the excel.exe and Microsoft.Mashup.Container.NetFX40.exe process?
The only way to directly remove the columns from [Content] is to modify the actual data of the Excel table. You could try that to see if it makes a difference, but Power Query generally tries to be smart about only loading columns it needs.
If your query is using a lot of memory, you might get better performance by saving your data in a more efficient format (I'd try CSV). In any case, try turning off "load to worksheet" and instead load only to the data model.
You can refer to my question and answer here.
What you will want to do is use Table.SelectColumns instead of removing columns.
let
    db = Sql.Databases("sqlserver.database.url"){[Name="DatabaseName"]}[Data],
    Sales_vDimCustomer = Table.SelectColumns(
        db{[Schema="Sales",Item="vDimCustomer"]}[Data],
        {
            "Name",
            "Representative",
            "Status",
            "DateLastModified",
            "UserLastModified",
            "ExtractionDate"
        }
    )
in
    Sales_vDimCustomer
When viewing the raw SQL using Express Profiler, you can see that it is sent as a single statement:
SELECT
    $Table.Name,
    $Table.Representative,
    $Table.Status,
    $Table.DateLastModified,
    $Table.UserLastModified,
    $Table.ExtractionDate
FROM
    Sales.vDimCustomer as $Table
Power BI and Power Query will also now show an error/warning message with this recommendation when you try to import a large number of columns.
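
Applied to the workbook table from the question, the same nesting might look like the sketch below. It reuses the table and column names from the question. Whether it lowers memory use is not confirmed: as far as I know, Excel.CurrentWorkbook is not a foldable source the way SQL Server is, so this mainly keeps the query to a single step.

```
let
    // Source lookup and column selection combined in one step
    Legend = Table.SelectColumns(
        Excel.CurrentWorkbook(){[Name="Legend_Data_Merged"]}[Content],
        {
            "Observation number",
            "First Sales Offer - Products",
            "Middle Sales Offer(s) - Products",
            "Last Sales Offer - Products"
        }
    )
in
    Legend
```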
