We have a use case where we want to leverage incremental cube refresh. The incremental refresh needs to handle three cases: insert, update and delete. In the source data, historical records can be updated or deleted, and new data is appended. In this scenario, whenever we refresh the cube we expect that deleted source records are removed from the cube (if they were previously processed), updated source records are updated in the cube (on the mentioned attributes), and new records are appended to the cube. The cube has partitions created based on time period.
Currently we do a full refresh every time the cube is processed, which hurts processing time. How can we reduce it?
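Since the cube is partitioned by time period, one possible way to avoid a full refresh is to work out which partitions actually changed and reprocess only those. Below is a minimal sketch of that idea, assuming each source row carries a partition key (here a hypothetical `period` field) and that a fingerprint of each partition was saved at the previous refresh. The function names and row layout are illustrative assumptions, not part of any cube API.

```python
import hashlib
from collections import defaultdict


def partition_fingerprints(rows, partition_key):
    """Return {partition value: SHA-256 digest of all rows in that partition}."""
    buckets = defaultdict(list)
    for row in rows:
        # Serialise each row with its fields in a stable order.
        buckets[row[partition_key]].append(repr(sorted(row.items())))
    return {
        part: hashlib.sha256("\n".join(sorted(lines)).encode()).hexdigest()
        for part, lines in buckets.items()
    }


def partitions_to_process(previous, current):
    """Compare two fingerprint maps.

    Returns (changed, dropped): partitions that are new or whose rows were
    inserted, updated or deleted, and partitions that no longer have any rows.
    """
    changed = sorted(p for p, h in current.items() if previous.get(p) != h)
    dropped = sorted(set(previous) - set(current))
    return changed, dropped


# Hypothetical data: one historical row updated, one new period appended.
old_rows = [
    {"id": 1, "period": "2024-01", "amount": 10},
    {"id": 2, "period": "2024-01", "amount": 20},
    {"id": 3, "period": "2024-02", "amount": 30},
]
new_rows = [
    {"id": 1, "period": "2024-01", "amount": 10},
    {"id": 2, "period": "2024-01", "amount": 25},
    {"id": 3, "period": "2024-02", "amount": 30},
    {"id": 4, "period": "2024-03", "amount": 5},
]
changed, dropped = partitions_to_process(
    partition_fingerprints(old_rows, "period"),
    partition_fingerprints(new_rows, "period"),
)
```

Here only the partitions in `changed` would need a full process and those in `dropped` would be removed; untouched periods are skipped. In practice the fingerprints would more likely be computed in the source database than in client code.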
Related
There is a use case in my company to enable business users with no technical knowledge to use the data from the Azure cloud. Back in the SQL Server days this was easily solved through OLAP cubes. You could write a query for the data backing the cube, and business people could then connect to the cube and work with the data as a pivot table. The only problem with large datasets was compute (the larger the data, the slower the pivot table), not the row limit.
With the current Azure Synapse setup, it seems that Excel tries to download the entire dataset and always hits the 1M row limit. Is there any way to use the data directly in the pivot table without bringing it into Excel in full? All my tables are >1M rows.
UPD: You can load data directly to a pivot table, but that loads the data into RAM and the loading itself takes time. I am looking for a cube-like solution, where the pivot table is available immediately and the querying happens only as you add fields and calculations to the pivot table.
I need to delete the query steps after loading the data into the data model. The reason is to hide the sources, protect our know-how, or maybe I'm just not very proud of what I've done ;).
But when I delete the PQ connections or change the "Load To" option, the tables also disappear from the data model and the pivot table becomes unresponsive. It's also not possible to modify or delete the connection created in Power Query from the Power Pivot window, or even to view the table properties.
I could use Review > Protect Workbook > Protect Structure to disable viewing and editing queries / connections, but the steps are still visible, and the user cannot modify the workbook; even pivot table drill-through function doesn't work as it needs to create a new sheet to show data rows.
If you need to remove the query steps, then you have to store the data within the Excel file (since a query is just a set of instructions for how to connect to the data and transform it).
What you can do is create a query, load it to a table in an Excel sheet and then delete the query, leaving a static table. You can then create a pivot table using this static table as the source and it should function normally (though you obviously won't be able to refresh the data). I.e. don't create a data model until you've loaded your data and removed the query.
I have a rather large data model in Excel. It consists of an imported data mart featuring one fact table and around 20 dimension tables.
I also have 3 tables directly in the Excel sheet, where users can enter data that then gets merged into the existing data model using Power Query.
I would like to be able to update the data model, and thereby the content of my pivot tables and my calculations, without refreshing the actual data coming from my external server.
Is this possible without having to disable external data connections in the sheet? (I'd like to periodically update the data.)
For clarification, I am building a KPI that will be measured monthly on the data present on the 1st of every month, but it will have to be analyzed, commented, and have outliers handled throughout the month.
You've not mentioned VBA in your question, but going by the fact you've tagged your question as VBA, I'm guessing that's what you're using?
VBA code to refresh a single query is:
Sheets("sheetName").ListObjects("queryName").Refresh
If you're trying to do it manually, then it's just a question of selecting a cell within the table the query is pulling to, and then Query > Refresh.
I am migrating data from a SAP HANA view to an ODS (using Azure Data Factory). From there, a third-party company moves the data to a Salesforce database. Currently, when I migrate the data we do a truncate and load in the sink.
There is no column in the source that shows the date or last-updated date when new rows are added in SAP HANA.
Do we need to have the date in the source, or is there another way to write it in the ODS?
The ODS must carry a last-updated date or something similar to denote when a row was inserted or changed after the initial load, so that the third party can track changes when loading into the Salesforce database.
Truncate and Load a staging table, then run a stored procedure to MERGE into your target table, marking inserted and updated rows with the current sysdatetime(). Or MERGE from the staging table into a Temporal Table, or a table with Change Tracking enabled to track the changes automatically.
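To illustrate the staging-then-merge pattern, here is a minimal runnable sketch using Python's built-in `sqlite3` module. SQLite has no `MERGE` statement, so `INSERT ... ON CONFLICT DO UPDATE` is used as a stand-in; on SQL Server the same logic would go into a `MERGE` inside the stored procedure. The table names (`staging`, `target`), the columns and the load timestamps are hypothetical. Note that this sketch only covers inserted and updated rows; rows deleted at the source would need separate handling.

```python
import sqlite3


def merge_staging(conn, load_time):
    """Upsert staging rows into target, stamping only new or changed rows."""
    conn.execute(
        """
        INSERT INTO target (id, val, last_updated)
        SELECT id, val, ? FROM staging WHERE true
        ON CONFLICT(id) DO UPDATE
            SET val = excluded.val,
                last_updated = excluded.last_updated
            WHERE target.val IS NOT excluded.val
        """,
        (load_time,),
    )
    conn.commit()


def reload_staging(conn, rows):
    """Truncate and load the staging table, as the pipeline sink does."""
    conn.execute("DELETE FROM staging")
    conn.executemany("INSERT INTO staging (id, val) VALUES (?, ?)", rows)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging (id INTEGER PRIMARY KEY, val TEXT)")
    conn.execute(
        "CREATE TABLE target (id INTEGER PRIMARY KEY, val TEXT, last_updated TEXT)"
    )
    # Initial load.
    reload_staging(conn, [(1, "a"), (2, "b")])
    merge_staging(conn, "2024-01-01")
    # Second load: row 1 unchanged, row 2 changed, row 3 new.
    reload_staging(conn, [(1, "a"), (2, "B"), (3, "c")])
    merge_staging(conn, "2024-02-01")
    return conn.execute(
        "SELECT id, val, last_updated FROM target ORDER BY id"
    ).fetchall()


result = demo()
```

After the second load, the unchanged row keeps its original timestamp while the changed and new rows carry the later one, which is what the downstream Salesforce load can filter on.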
I have lost the connection to the source analytics service. However, I still have data in the PivotTable's cache (I can see it in the tooltips).
How can I get that source data?
I am using Office 365.
Note: I cannot use Show Details since the pivot table has some filters applied, and I cannot remove the filters since Excel then asks me to connect to the source.
(And yes I have checked this thread but it didn't work for me: Recreate Source Data from PivotTable Cache. It gives 1004 error.)
The easiest way I can think of for a table based data set is:
Show the field list on your pivot table.
Remove all filters, column labels, row labels, and values.
Add one field to the values. This will show one aggregate value in the pivot table.
Double click the value and a sheet with all the data should pop up.
If it doesn't, go to the pivot table options then on the data tab check "Enable show details".
This method however will not work with OLAP data. Excel does not download the entire cube; it queries for new data slices with every change to the filters or layout of the pivot table/chart. So even if you could access the data in the pivot cache it would not hold the entire cube, but only the slices needed to show the current layout. You CAN create a snapshot cube file to hold all of the data needed to run in offline mode, however it requires you to be able to connect to the server at least once to create the file.