Full table join in PowerPivot - Excel

In PowerPivot, Related(Othertable[field]) retrieves the associated column from a related table.
I would like to import ALL such columns, doing the equivalent of a join.
Is it possible to do this?

Nicolas,
The smartest thing to do, from my perspective, is to merge your queries into one so that you can keep your original tables.
I would suggest using the new Power Query Merge functionality, which is very easy and works reliably (and also supports loading data directly into your PowerPivot data model).
Or you can write your own custom query in PowerPivot - if you use an MSSQL (or any other) database as your source, you can actually use JOIN directly in the PowerPivot window; the Table Import Wizard makes this a bit easier.
So the answer is: keep your original data tables intact, and create a new one that merges them together just for the purpose of your desired report.
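For illustration, a minimal sketch of the second approach, assuming a SQL Server source and hypothetical Orders/Customers tables; something like this could be pasted as the custom query in the Table Import Wizard so PowerPivot receives one already-joined table:

```sql
-- Hypothetical example: flatten two related tables into a single import query
-- so every "related" column arrives in one table.
SELECT
    o.OrderID,
    o.OrderDate,
    o.Amount,
    c.CustomerName,   -- columns you would otherwise fetch with Related()
    c.Country
FROM dbo.Orders AS o
INNER JOIN dbo.Customers AS c
    ON c.CustomerID = o.CustomerID;
```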
Hope this helps.

Related

Refresh Power Pivot - Power Query

I have a table in Power Query that is fed from some Excel files. With this data I do an inner join with some other catalog tables I have, add some calculated columns, and then load the result into the Power Pivot data model to build some pivot tables. Initially everything was working very well, until I made adjustments to the Power Query table by removing or adding columns and editing the inner join operations. Now, when I refresh and the Power Query table is passed to the Power Pivot data model, I get an error that the table does not exist. Note that if I refresh only the Power Query table it works without problems; the problem occurs when the data is passed on to Power Pivot.
How can I correct this error?
Sorry for my English
Yes, these issues often happen when changes are made to the initial queries. Typically, when query names and/or field/column names are changed while those names are still used in the merge or calculation steps, the query will throw these errors.
So, review/compare all the changes that you made to the steps after the merge.
If you don't find any errors, consider making a copy of the steps and rebuilding them from the merge onwards to ensure optimum performance.

Practical tips on documenting Excel Queries, data model tables, pivot tables?

We are building a BI system (dashboards) in Excel using tables imported from Excel files. We're using Excel 2016 Power Query, the data model, measures written as DAX expressions, and the resulting pivot tables (some of which are reloaded into the data model), etc.
My question: is there a "best practice" for 1) naming these data elements and 2) documenting these pieces, so as to have more complete system documentation?
Background: I'm the senior "hacker" munging these things together. But I need to move this towards being sustainable. I did some prototyping work and when I went back a week later it was challenging to reconstruct my thoughts and relationships...
I've seen folks refer to use of PowerBI flow diagrams to support documentation; but it seems to be more of the "icing on the cake" than the "cake" itself.
So what "bread and butter" documentation approaches have you, more experienced developers, taken to ensure that your systems are clearly documented so that others can pick up where you left off???
For naming, I follow the Kimball Group's advice for data warehouses/marts, e.g.
https://www.kimballgroup.com/2014/07/design-tip-168-whats-name/
I rename many/most query steps to reference the column or table name, e.g. Added Custom => Added Customer Name, Append Queries => Append Customers. The idea is to be able to pick the right step the first time when coming back for maintenance.
You can select all the queries in the Query Editor window and copy their code, then paste it into Word etc. as the starting point for your documentation. You can also screenshot the Query Editor's Query Dependencies pop-up.
For the Power Pivot logic, try this solution:
https://powerpivotpro.com/2014/03/automatically-create-data-dictionary-for-your-power-pivot-model/

Limit data coming into Spotfire by a different data table

I have Table A prompted on Year/Month and Table B. Table B also has a Year/Month column. Table A is the default data table (gets pulled in first). I have set up a relationship between Table A and B on the common Year/Month column.
The goal is to get Table B to only pull through data where the Year/Month matches the Year/Month on Table A (what the user entered). The purpose is to keep the user from entering the Year/Month multiple times.
The issue is that Table B contains almost 35 million records. What I do not want is for Spotfire to pull across all 35 million of them. What currently happens is that Spotfire pulls all those records, and then, by setting filtering to include Filtered Rows Only on Table B, I limit what is seen in the visualization to under 200,000 rows. I would much rather just pull across those 200,000 rows to start with.
The question: Is there a way to force Spotfire to filter the data table (Table B) by another data table (Table A) as it pulls the data table (Table B) across, thus only pulling a small number of records into memory?
I'm writing this on the basis that most people use information links to get data into Spotfire, especially for large data sets where the data is not embedded in the analysis. With that being said, I prefer to handle as much, if not all, of the joining / filtering / massaging at the data source rather than in the Spotfire application. Here are my views on the best practices and why.
Tables / Views vs Procedures as Information Links
Most people are familiar with the Table / View structure and get data into Spotfire in one of two ways:
1. Create all joins / links in Information Designer, based on data relations defined by the author, by selecting individual tables from the available data sources.
2. Create a view (or similar object) at the data source where all joining / data relations are done, thus giving Spotfire a single flat file of data.
Personally, option 2 is much easier IF you have access to the data source, since the data source is designed to handle this type of work. Spotfire just makes the data available, but with limited functionality (i.e. complex queries, IntelliSense, etc. aren't available; there is no native IDE). What's even better, IMHO, is stored procedures, and here is why.
In options 1 and 2 above, if you want to add a column you have to change the view / source code at the data source, or individually add a column in the information designer. This creates dwarfed objects and clutters up your library. For example, when you create an information link there is a folder with all the elements associated with it. If you want to add columns later, you'll have another folder for any columns added, and this gets confusing and hard to manage. If you create a procedure at the data source to return the data you need, and later want to add some columns, you only have to change this at the data source. i.e. change the procedure. Everything else will be inherited by Spotfire... all you have to do is click the "reload data" button in Spotfire. You don't have to change anything in the information designer. Additionally, you can easily add new parameters, set default parameter properties or prompt the user, making this a very efficient method of data retrieval. This is perfect when the data source is an OLTP and not a data-mart/data-warehouse (i.e. the data isn't already aggregated / cleansed) but can also be powerful in data warehouse environments as well.
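As a rough sketch of that last point (table, column, and procedure names are assumptions, not taken from the original question), a parameterized procedure at the source might look like this; Spotfire then only ever receives rows for the requested parameter:

```sql
-- Minimal sketch: filter at the data source instead of inside Spotfire.
CREATE PROCEDURE dbo.GetTableB_ByYearMonth
    @YearMonth CHAR(6)   -- e.g. '202401'; can be prompted for in Spotfire
AS
BEGIN
    SET NOCOUNT ON;

    SELECT *
    FROM dbo.TableB
    WHERE YearMonth = @YearMonth;   -- only matching rows leave the database
END;
```

Adding a column or another parameter later only means changing this procedure; the information link itself stays untouched.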
Ditch the GUI, Edit the SQL
I find managing conditions, parameters, join paths, etc. in the GUI a bit annoying--but that's me. Instead, when possible, I prefer to click "Edit SQL" next to the elements in my information link and alter the SQL there. This also allows database folks to work in an environment that is more familiar to them.
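As a rough illustration of the kind of hand edit this makes possible (table and column names are assumed for the Year/Month scenario above), the generated SQL can be restricted so Table B is joined to Table A at the source rather than filtered after the fact:

```sql
-- Hypothetical hand-edited information-link SQL: only rows whose
-- Year/Month exists in Table A are ever pulled across.
SELECT b.*
FROM dbo.TableB AS b
INNER JOIN dbo.TableA AS a
    ON a.YearMonth = b.YearMonth;
```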

Import data from Excel to SQL database

I have to import data from Excel into a SQL database. Before inserting the data I have to apply some business rules. What is the best approach?
1. Should I read the whole Excel file into a collection, apply the business rules, and then insert everything,
or
2. Should I read the Excel file record by record, applying the business rules and inserting each record into the database as I go?
I would put the data into the database without applying any rules. Afterwards you can easily apply the business rules using SQL.
This approach assumes, though, that:
Your data is well-enough formatted. Maybe one cell is not equivalent to one field? Or maybe the Excel file was edited by several people following different editing conventions. In my experience it is much easier to do this kind of data cleaning before the import to SQL.
You are able to apply your business logic using SQL. After all, you need to be able to implement it, and if you are a better VBA programmer than a SQL or PL/SQL programmer, then you should probably apply the business rules before inserting everything into the database.
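If you do go the load-raw-then-clean route, a minimal sketch (staging table, target table, and rules are all made up for illustration) could look like this:

```sql
-- Sketch only: everything is bulk-loaded into a staging table first,
-- then the business rules are applied in SQL while moving rows across.
INSERT INTO dbo.Customers (CustomerName, Email, Country)
SELECT
    LTRIM(RTRIM(s.CustomerName)),   -- rule: trim stray whitespace
    LOWER(s.Email),                 -- rule: normalize e-mail casing
    s.Country
FROM dbo.Staging_CustomerImport AS s
WHERE s.Email LIKE '%@%'            -- rule: reject rows without a plausible e-mail
  AND s.CustomerName IS NOT NULL;   -- rule: name is mandatory
```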

Importing data from Excel into a database

I want to import data from Excel into the corresponding tables, matching on ID columns, for example customer data matched on the CustomerID present in the Customer table.
That is, we have to match the data in the table and the Excel source on the basis of IDs.
Could you please help me out on this?
Use the SQL Server Data Import Wizard - see an article on it here.
This wizard lets you specify the Excel file to import, define the target where the data should go, map columns in Excel to columns in your SQL table, and much more.
Update: based on your comment to the other answer, if you need to import the Excel sheet and match it up against some pre-existing lookup data, then you should definitely look at SQL Server Integration Services (SSIS), which exists exactly for this kind of import/lookup scenario.
Your question's grammar is a bit all over the place, so I'm not entirely sure what you are asking, but here goes.
You can save your Excel spreadsheet as a CSV file and then import that into your database. There are a number of tutorials on this if you search Google. Try searching for "import CSV into database".
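For example, assuming SQL Server and a made-up staging table and file path, the exported CSV can be loaded with a plain BULK INSERT:

```sql
-- Sketch: load the CSV export into a staging table for further processing.
BULK INSERT dbo.Staging_CustomerImport
FROM 'C:\imports\customers.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '\n',
    FIRSTROW        = 2     -- skip the header row
);
```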
