Connection Only on merged Power Queries - excel

I have two queries in my workbook that rely on each other. One is set to connection only; the other is set to load to a table after performing some merging and expanding operations. I noticed that when refreshing, the query set to "Connection only" shows no visual indication of refreshing.
When I refresh the secondary query that relies on information from the connection only one, does it actually refresh both of them? I am having a hard time finding clear documentation on this. A link to where the information is would also be appreciated.
Further information on the queries themselves:
Both link to SQL tables.
One pulls the latest data available in the table.
The other pulls recent information from a different table.
The second one merges the two tables together (by the key).
The second one then only grabs information from the first when there is missing information in the second.
I am specifically asking: when the second query is refreshed, does the first query also refresh even though it is connection only?

Yes, effectively the first query is also refreshed: its query definition is run and the result is pulled into your second query.
Note that in the Query Editor window you will see a "Preview" dataset for your first query, which is not refreshed when you refresh the second query. That "Preview" dataset is only a design tool; it doesn't affect your results when you actually refresh and deliver data into an Excel table.
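As a rough illustration of the behaviour described above, here is a minimal Python model (not Power Query code). It treats each query as a function, so refreshing the loaded query necessarily runs the definition of the query it references. The function names, keys, and sample values are all hypothetical stand-ins for the two SQL sources in the question.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]
evaluated: List[str] = []  # records which query definitions were run


def latest_data() -> List[Row]:
    """Hypothetical connection-only query: returns rows, loads nothing to a sheet."""
    evaluated.append("latest_data")
    return [
        {"key": "A", "value": "a-latest"},
        {"key": "B", "value": "b-latest"},
    ]


def recent_source() -> List[Row]:
    """Hypothetical stand-in for the second SQL table, with one missing value."""
    evaluated.append("recent_source")
    return [
        {"key": "A", "value": None},
        {"key": "B", "value": "b-recent"},
    ]


def merged_query() -> List[Row]:
    """Loaded query: merges by key and uses latest_data only to fill gaps."""
    evaluated.append("merged_query")
    fallback = {row["key"]: row["value"] for row in latest_data()}
    merged: List[Row] = []
    for row in recent_source():
        value = row["value"] if row["value"] is not None else fallback.get(row["key"])
        merged.append({"key": row["key"], "value": value})
    return merged


# "Refreshing" only the loaded query still evaluates the one it depends on.
table = merged_query()
print(table)
print(evaluated)
```

In this model there is no separate refresh step for `latest_data`; it runs because `merged_query` asks for its result, which is the sense in which the connection-only query is "also refreshed".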

I also had a tough time finding information about this. There is a post by Ken Puls on PowerPivotPro.com that helps draw some (good) conclusions about this. Net-net, the "connection only" queries ARE refreshed before the merge query; you just don't see it (a visual indication is something I think Excel should implement, by the way). Hope this helps.

Related

Excel create multiple MS Queries using one data connection between two sheets in the same workbook

In Excel 2021, what exactly is a "data connection", "query" and "data source name"?
Let's say I have a Workbook "Manahil_Customer_Database.xlsm" in which I have a sheet "sht_Customer_Cities" that has a table "tbl_Customer_Cities". In a new sheet "sht_Report" I want to run two queries using one connection via MS Query. Now when I go through the MS Query route I get one Data Source Name file "Manahil_Customer_Database.dsn", one MS Query file "Customer_Countries_Cities.dqy", and one Connection file "Customer_Countries_Cities.odc".
However, when I look at "Queries & Connections" it says 0 Queries and 1 Connection named "Customer_Countries_Cities". I want to be able to establish a single Data Connection via MS Query from "sht_Report" to the Workbook "Manahil_Customer_Database.xlsm" and then run multiple queries using the same connection.
Power Query has superseded MS Query from Excel 2016 onwards, although MS Query is still available as a legacy option. The objects and panes you are describing relate to Power Query, not MS Query.
Power Query is far more functional, reliable, flexible and performant than MS Query.
For example, depending on your exact requirement, you might create a base query that gathers all the required data, then refer to that base query in Reference queries that filter the output needed for each destination table.
Here's a starting point for Power Query:
https://support.microsoft.com/en-us/office/about-power-query-in-excel-7104fbee-9e62-4cb9-a02e-5bfb1a6c536a
Power Query is a Microsoft tool that assists with your ETL tasks.
As a previous answer notes, it is based on the M language.
To import, transform, or connect to your data, go to:
Data > Get Data and select your source
Check this link for a quick introduction:
https://learn.microsoft.com/en-us/power-query/power-query-what-is-power-query
If I understand the situation correctly, you are working internally, within a single Excel file. Data connections, queries, and data source names are all used to connect to external data.
For data inside the workbook, I would think you could use a PivotTable and/or a slicer.
If you provide additional details on what specifically you are trying to do, a better answer could be provided.
Some additional reading below may help further:
Power Query Help
Data Connections
Queries
External Links

My simple PowerQuery finishes, but Excel continues to lag for minutes afterwards

I'll run a custom PowerQuery in Excel 365, the results are returned and displayed as a table on the worksheet, and all is well -- until I go to continue to use Excel. Excel continues to lag for about 5 minutes after the results are returned!
I've looked through some PowerQuery performance solutions, but all seem to address slow query performance. The queries perform within expectations -- whereas I'm looking for advice to combat Excel locking up for a while after results are returned.
This happens with a variety of queries using different datasources -- not just one or a few. Since the lag is occurring AFTER results are returned, I'm looking for generic potential solutions.
Most of my code uses datasources like the current workbook's tables, a file folder, and perhaps tables located within other Excel files.
I've never encountered this before when working with Power Query. Do you experience the same symptoms if you load similar connections to a mostly empty workbook? If you do, the issue may be with your install or the resources available on your workstation. If not, there may be an issue with existing code in the workbook (e.g. inefficient code triggered by a pivot table update event).

Excel Power Query connection to another table

Fairly new to power queries and finding my feet largely by trial and error.
I have built a master query returning ~2000 rows of data covering different regions. I want to create sub-reports on different tabs for each region. I can easily do this by copying the original table and applying a filter on region for each new query. As my spreadsheet is already 10 MB, I am trying to do this as efficiently as possible in terms of performance. I understand I can also do this by creating a "reference only" to the master query instead of duplicating and filtering the master query (which would create 10 versions of the master query with different filters).
I have been trying to do this via the Query / Reference menu, but I am not sure whether I end up with a "connection only" query, as it doesn't say so in the Queries panel on the right.
Anyway, I guess the questions are:
1. What is the difference between queries and "connection only" queries (especially with regard to performance / spreadsheet size)?
2. When is it best to use a "connection only" query?
3. How to create "connection only" query (ideally via menu not code) and how to check if a query is connection only?
Connection only signifies that the data is not being materialised anywhere. It may still be referenced by other queries, but the data isn't being loaded to a worksheet table or to the data model.
You can control the behaviour of queries with the Load To dialog, invoked by Close and Load To from the Query Editor, or right click > Load To from the Workbook Queries pane.
Connection only queries reduce workbook size; a good rule of thumb is not to materialise any queries unless you really need to. In your example, it looks like you want to materialise each regional query, but unless you also need a master table of all regions, your 'master' query can be connection only.
In performance and size terms, it makes more sense to only have your master query, loaded to the data model, with a regional slicer on your report...
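The relationship between a connection-only master and its regional reference queries can be sketched in Python (a conceptual model only, not Power Query code). The `master` function, the region names, and the figures are hypothetical; the point is that each reference query reuses the master's definition plus one filter step, and only the regional results are materialised.

```python
from typing import Dict, List

Row = Dict[str, object]


def master() -> List[Row]:
    """Hypothetical connection-only master query: defined once, loaded nowhere."""
    return [
        {"region": "North", "sales": 10},
        {"region": "South", "sales": 20},
        {"region": "North", "sales": 5},
    ]


def region_query(region: str) -> List[Row]:
    """A 'reference' query: the master's definition plus a single filter step."""
    return [row for row in master() if row["region"] == region]


# Only the regional outputs are materialised (one per worksheet tab);
# the master rows themselves are never stored as a table of their own.
loaded_tables = {region: region_query(region) for region in ("North", "South")}
print(loaded_tables["North"])
```

Changing the logic in `master` changes every regional result at once, which is the main maintenance advantage of referencing over duplicating.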

Adding Data Manually to Correlate with Refreshing Rows of Existing Live Data Connection Feed in Excel 2013

I have a table in Excel that is populated by a live connection to an external database. The SQL used to generate this table and refresh it looks similar to this (only with more fields and such):
SELECT DISTINCT shr_pf_student_v.ID
FROM shr_pf_student_v
What my customer wants to be able to do is add additional columns and manually add data that correlates to the ID in each row in Excel. Of course, when the ID data is refreshed, if new rows are added or deleted, the manually added data no longer correlates with the ID it was originally intended to match up with.
I've thoroughly explored the Excel Connection "External Data Properties" options and none solve this issue. I've only found this one solution here: http://www.mrexcel.com/forum/excel-questions/376984-database-query-possibilities.html but after several hours of attempted application, I can't get it to work and I'm not sure that it's possible to do this way either.
Lookup formulas won't work of course because as soon as the data is refreshed, the data looks just like the new refreshed set.
Any new viable options are welcomed. I've searched high and low but I can't help but think that this is such a valuable process that must be rather prevalent and have a solution developed for it somewhere out there?
Many thanks,
Lindsay

Excel - Best Way to Connect With Access Data

Here is the situation we have:
a) I have an Access database / application that records a significant amount of data. Significant fields would be hours, # of sales, # of unreturned calls, etc
b) I have an Excel document that connects to the Access database and pulls data in to visualize it
As it stands now, the Excel file has a Refresh button that loads new data. The data is loaded into a large PivotTable. The main 'visual form' then uses VLOOKUP to get the results from the form, based on the related hours.
This operation is slow (~10 seconds) and seems to be redundant and inefficient.
Is there a better way to do this?
I am willing to go just about any route - just need directions.
Thanks in advance!
Update: I have confirmed (thanks to helpful comments/responses) that the problem is with the data loading itself. Removing all the VLOOKUPs only took a second or two off the load time. So the question stands as how I can rapidly and reliably get the data without so much time involved (it loads around 3000 records into the PivotTables).
You need to find out whether it's the Pivot Table Refresh or the VLOOKUP that's taking the time.
(Try removing the VLOOKUP to see how long it takes just to do the Refresh.)
If it's the VLOOKUP, you can usually speed that up.
(See http://www.decisionmodels.com/optspeede.htm for some hints.)
If it's the Pivot Table Refresh, then it depends on which method you are using to get the data (Microsoft Query, ADO/DAO, ...) and how much data you are transferring.
One way to speed this up is to minimize the amount of data you are reading into the pivot cache by reducing the number of columns and/or predefining a query to subset the rows.
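To make the idea of trimming columns and rows concrete, here is a small Python sketch using the standard-library `sqlite3` module as a stand-in for the Access database. The `activity` table, its columns, the sample rows, and the cutoff date are all assumptions made for illustration; the same principle applies to whatever query feeds the pivot cache.

```python
import sqlite3

# In-memory database standing in for the Access source (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity "
    "(agent TEXT, work_date TEXT, hours REAL, sales INTEGER, notes TEXT)"
)
conn.executemany(
    "INSERT INTO activity VALUES (?, ?, ?, ?, ?)",
    [
        ("Ann", "2024-02-01", 7.5, 3, "long free-text note"),
        ("Bob", "2024-03-15", 8.0, 1, "another note"),
        ("Cy", "2023-11-30", 6.0, 2, "old record"),
    ],
)

# Wide query: every column and every row is transferred.
all_rows = conn.execute("SELECT * FROM activity").fetchall()

# Narrow query: only the columns the report uses, and only recent rows.
narrow_rows = conn.execute(
    "SELECT agent, hours, sales FROM activity "
    "WHERE work_date >= ? ORDER BY agent",
    ("2024-01-01",),
).fetchall()

print(len(all_rows), len(all_rows[0]))
print(narrow_rows)
conn.close()
```

The narrow query moves fewer rows and fewer columns across the connection, which is the part of the refresh that the update in the question identified as slow.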
