Moderator please move to appropriate forum if required.
I use MS Excel 2016 for data visualization.
I understand that Extract means saving Excel data onto a spreadsheet and Transform means manipulating it in Power Query.
QUESTION:
But if I decide to load data to Power Pivot (Data Model), doesn't that fall back into Transform, because you can:
Create a Calendar Table
Create Measures (or Calculated Columns if necessary)
Or does using Power Pivot (Data Model) fall under Data Modelling, because you are no longer formatting or merging pre-existing data?
Rather, you are creating new data (i.e. Calendar Table, Measures, etc.) to merge with pre-existing data.
Kindly clarify
Power Query (now standard in Excel 2016, on the Data tab) is an ETL (Extract, Transform, Load) tool. A standard example: you connect it to your source ERP system and build a product table. That wouldn't be an exact copy of one source table; it could consist of several tables joined together, and you keep only the relevant columns.
Power Pivot is a data modelling tool. It lets you create relationships between data tables and attribute tables, and it lets you use time-related measures (YTD, Previous Year, ...).
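To make "time-related measures" concrete, here is a minimal sketch in Python of what a YTD and a Previous Year calculation do. In Power Pivot you would write these as DAX measures (DAX has time-intelligence functions such as TOTALYTD); the function names and sample sales below are made up purely for illustration.

```python
from datetime import date

# Hypothetical sales rows: (sale date, amount).
sales = [
    (date(2015, 11, 3), 90),
    (date(2016, 1, 15), 40),
    (date(2016, 3, 2), 60),
    (date(2016, 7, 20), 25),
]

def year_to_date(rows, as_of):
    """Sum amounts from 1 January of as_of's year up to and including as_of."""
    return sum(
        amount
        for day, amount in rows
        if day.year == as_of.year and day <= as_of
    )

def previous_year_total(rows, as_of):
    """Sum all amounts in the calendar year before as_of's year."""
    return sum(amount for day, amount in rows if day.year == as_of.year - 1)

ytd = year_to_date(sales, date(2016, 3, 31))
previous_year = previous_year_total(sales, date(2016, 3, 31))
```

The point is only that these measures depend on a date context, which is why a calendar table in the data model is so useful.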
In general, when you build your model in Power Pivot, you can choose to load the data directly into Power Pivot (without Power Query). This is useful if you already have a data warehouse in which the ETL process is done.
If you still have an ETL process to execute, it's better to use Power Query and load the result into Power Pivot (option: Load to Data Model).
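As a rough illustration of the kind of transform step Power Query would do before loading to the data model, the Python sketch below joins two source tables and keeps only the relevant columns. This is not Power Query code; the table contents, column names, and the build_product_table function are all hypothetical stand-ins for an ERP extract.

```python
# Hypothetical source tables standing in for ERP extracts.
products = [
    {"product_id": 1, "name": "Widget", "category_id": 10, "internal_code": "X1"},
    {"product_id": 2, "name": "Gadget", "category_id": 20, "internal_code": "X2"},
]
categories = [
    {"category_id": 10, "category": "Hardware"},
    {"category_id": 20, "category": "Electronics"},
]

def build_product_table(products, categories):
    """Join products to categories and keep only the relevant columns."""
    category_by_id = {c["category_id"]: c["category"] for c in categories}
    return [
        {
            "product_id": p["product_id"],
            "name": p["name"],
            # None if the product points at an unknown category.
            "category": category_by_id.get(p["category_id"]),
        }
        for p in products
    ]

product_table = build_product_table(products, categories)
```

The resulting product table is what you would then load into the data model and relate to your fact data.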
Related
There is a use case in my company to enable business users with no technical knowledge to use data from the Azure cloud. Back in the SQL Server days this was easily solved through OLAP cubes. You could write a query for the data backing the cube, and then business people could just connect to the cube and the data was available as a pivot table. The only problem with large datasets there was compute (the larger the data, the slower the pivot table), not really the row limit.
With the current Azure Synapse setup it seems that Excel is trying to download the entire data set and obviously always hits the 1M row limit. Is there any way to use the data directly in the pivot table without bringing it in full into Excel? All my tables are >1M rows.
UPD: You can load data directly to a pivot table, but that loads the data into RAM and the actual loading takes time. I am looking for a cube-like solution, where the pivot table is available immediately and the querying happens as you add fields and calculations to the pivot table.
I work with Power Query & Power Pivot. I load several tables into the Power Pivot data model using Power Query. Then I create relationships between those entities.
Now that the tables are linked with each other, is it possible to take advantage of that and dump the whole data set into a sheet? I must allow the users to see the whole thing in a table, not just summaries in pivot tables.
I have created a small data warehouse with the help of Tableau software. First I entered my information in Excel and created my fact table there, then imported it into Tableau, where I created my queries.
I would like to know whether creating the fact table is the ETL process. (I know what ETL means, I just want to know where it happened in my project.)
In principle you do Extract, Transform and Load, but mostly manually. Your Extract step is done manually, while gathering the information you need to create your Excel sheet. The Transform step is again manual: you build the Excel sheet from the data you collected from wherever. And finally, you load the finished Excel sheet into your BI system, Tableau.
Tableau is a data analytics package that helps you look at already gathered data and query it for business intelligence. It is separate from an ETL tool.
The extract-transform-load process is where data is extracted from a source system (database, customer relationship system, whatever) and then transformed/converted so it can be loaded into a data warehouse. Examples are converting Excel spreadsheets to CSV, or changing how dates are formatted in Oracle DB data. Once the data is in a format that the warehouse can process, it is loaded into the data warehouse.
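A minimal sketch of those three steps in Python, assuming a made-up source that delivers dates as DD/MM/YYYY and a warehouse that expects ISO dates in CSV. The rows, the column names, and the extract/transform/load function names are all hypothetical.

```python
import csv
import io
from datetime import datetime

def extract():
    """Stand-in for reading rows from a source system."""
    return [
        {"customer": "Acme", "order_date": "25/12/2016"},
        {"customer": "Globex", "order_date": "03/01/2017"},
    ]

def transform(rows):
    """Convert DD/MM/YYYY dates to ISO format (YYYY-MM-DD)."""
    converted = []
    for row in rows:
        parsed = datetime.strptime(row["order_date"], "%d/%m/%Y")
        converted.append(
            {"customer": row["customer"], "order_date": parsed.date().isoformat()}
        )
    return converted

def load(rows):
    """Write rows as CSV text; a real load step would write to the warehouse."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["customer", "order_date"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = load(transform(extract()))
```

Doing the same thing by hand in Excel is still the same three steps, just without the automation.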
Tableau can be used to query and analyze the data in a data warehouse to help discover trends or problems in a business. In and of itself, it is not an ETL tool.
Fact table creation is not part of the ETL concept. It's related to data modelling.
There is no ETL happening in your process.
Problematic data source
There is a SharePoint list with columns: Person, Customer, Responsibility, Week_1, Week_2, Week_n. Values are workload estimates as percentages (estimated hours are then calculated from avg. working hours).
Reasoning
The reason for this layout is that the list is much easier to use, as there is no need to create a new line entry for every single week per employee.
My take on the issue
You might already see my problem. For painless analytics, the data model should look like the one mentioned under Reasoning: one line entry per week per employee.
However, it should be possible to create the required data model. A new column would be created, and it should get its "Week" value from a lookup function that returns the column name. This is possible in Excel, but for the record I haven't succeeded in Power BI with DAX functions.
What would be your recommendation?
I would recommend using Power Query to read from the SharePoint list and then unpivot the data. You can install Power Query as an add-in for Excel 2010 and Excel 2013. It ships with Excel 2016 under the name Get & Transform. Power Query is also available in Power BI Desktop. It is probably a better choice for this than DAX.
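To show what unpivoting does to the list described above, here is a small Python sketch. It is only an illustration of the reshaping, not Power Query code; the sample rows, the unpivot function, and the output column names "Week" and "Workload" are assumptions.

```python
def unpivot(rows, id_columns):
    """Turn every non-id column into its own (Week, Workload) row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column in id_columns:
                continue
            long_row = {key: row[key] for key in id_columns}
            long_row["Week"] = column      # the former column name
            long_row["Workload"] = value   # the former cell value
            long_rows.append(long_row)
    return long_rows

# Hypothetical rows in the wide SharePoint layout.
wide = [
    {"Person": "Ann", "Customer": "Acme", "Responsibility": "Lead",
     "Week_1": 50, "Week_2": 80},
    {"Person": "Bob", "Customer": "Acme", "Responsibility": "Dev",
     "Week_1": 100, "Week_2": 20},
]

long_table = unpivot(wide, ["Person", "Customer", "Responsibility"])
```

Each person now has one row per week, which is the shape a data model can filter and aggregate without any column-name lookups.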
I'm looking for a way to load data more recent than date x into Power Pivot and link/add it to an existing table.
Background:
The user downloads data from a datafeed and saves it in Excel Power Pivot.
The data is deleted from the server afterwards.
In the next step, new data must be added to the existing table in Power Pivot,
so that the workbook charts can access the complete dataset.
I know there is no API for VBA access to Power Pivot. Is there a
workaround with linked tables and direct access to the database?
1) You create one table containing just the dates, from the earliest one to far into the future.
2) You import every new set of data into a new Power Pivot table.
3) You link the dates of the records in the newly imported table to the dates in the Power Pivot table containing the dates.
So the backbone of your whole data is the dates table, while the tables keeping the actual data are treated as lookup tables.
This is hacky and I haven't tried it, but it should work.
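The idea in the steps above can be sketched in Python as follows. This is only an illustration of the structure (one dates table, several separately imported batches linked to it by date), not something that runs inside Power Pivot; the function names and sample batches are made up.

```python
from datetime import date, timedelta

def build_date_table(start, end):
    """Step 1: one row per calendar day from start to end, inclusive."""
    days = (end - start).days
    return [start + timedelta(days=offset) for offset in range(days + 1)]

def combine_by_date(date_table, batches):
    """Step 3: look up each batch's records through the shared dates."""
    combined = {day: [] for day in date_table}
    for batch in batches:
        for day, value in batch:
            if day in combined:  # records outside the dates table are skipped
                combined[day].append(value)
    return combined

# Step 2: each import is kept as its own table (hypothetical values).
first_import = [(date(2016, 1, 1), 10), (date(2016, 1, 2), 12)]
second_import = [(date(2016, 1, 3), 9)]

date_table = build_date_table(date(2016, 1, 1), date(2016, 1, 4))
combined = combine_by_date(date_table, [first_import, second_import])
```

Reports built on the dates table then see the records of every import, even though no import was ever appended to another.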