Can I use CSV file as Excel pivot data source? - excel

I have some external csv/txt files and I'd like to use them for a pivot table.
However, after I select my CSV file as the external data source, at the end of the guided procedure (headers, separators, etc.) Excel throws an error along the lines of: it's impossible to use the selected type of connection for a pivot table.
Now, I know how to do it with another Excel/DB table, but here it would come in very handy to use a CSV/TXT file.
Can this be done natively, without external plugins?

Use the keyboard shortcut Alt, D, P (press each key separately, not all at once like Alt+D+P). This brings up the old-style pivot table wizard.
Select External Data Source
Click Get Data
Choose <New Data Source> and click OK.
Name your data source and choose Microsoft Access Text Driver
Click Connect, uncheck Use Current Directory (unless that's what you want), and Select the Directory you want.
If you don't identify the file when you get back to the "Select a default table..." text box, you'll get prompted to select one.
At that point, click OK back through the dialog boxes. Eventually you'll get thrown into MSQuery where you can build the query you want. From there Return Data to Excel and you can build your pivot table.
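For reference, here is a rough sketch of what the wizard ends up configuring: the text driver treats the directory as the "database" and each file in it as a "table". The exact driver name depends on what is installed on your machine, and the path C:\data\sales.csv is only a made-up example, so treat both as assumptions.

```python
from pathlib import PureWindowsPath


def text_driver_connection(csv_path: str) -> tuple[str, str]:
    """Return (ODBC connection string, SQL query) for one CSV file."""
    path = PureWindowsPath(csv_path)
    # The folder plays the role of the database.
    # Assumption: the installed driver is registered under this name.
    conn = (
        "Driver={Microsoft Access Text Driver (*.txt, *.csv)};"
        f"Dbq={path.parent};"
    )
    # The file name plays the role of the table.
    sql = f"SELECT * FROM [{path.name}]"
    return conn, sql


# Hypothetical file location, for illustration only.
conn, sql = text_driver_connection(r"C:\data\sales.csv")
print(conn)
print(sql)
```

This is why the wizard asks for a directory first and a "default table" afterwards.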

Related

How can I write back to source excel files when I get Excel data from data tab?

So, I want to create one Excel file per manager, each with a single sheet, and place them on OneDrive.
In order to get the data in one place, I am creating another Excel file called combined.xlsx.
Now, I can import each workbook into a tab using Data -> Get Data -> From File -> From Workbook.
This is great so far. So, I can read the data of 10 Excel files on 10 sheets in combined.xlsx.
Now, I want to modify the contents of one of the tabs and make sure the change is reflected in the original file. How can I do this?
To elaborate on why it is not possible, you need to understand how Power Query deals with data:
You load your data into Power Query via the "Data" tab. The source can be anything Microsoft allows.
You then manipulate the data any which way in Power Query.
As a last step, you decide if and where to load the results. If you only want to create a connection to the query, click the arrow next to "Close and Load", select "Close and Load to", and pick that option. Otherwise, the only other options are loading the query results to a table, a PivotTable report, or a PivotChart.
Because the output sheets you have are connected to the query that produced them, any time you refresh the query, whatever manual changes you have made in the table that the query created will be wiped out and overwritten with the refreshed data.
If you were able to write back to the source here, you'd in effect create a circular reference.
Check out this article about having Power Query output your data after manipulating it, maybe it helps.
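To make the one-way flow concrete, here is a toy model in Python (not Power Query itself). The function run_query and the sample row are invented for illustration; the point is only that a refresh rebuilds the output from the source, so edits made to the output never reach the source and do not survive.

```python
def run_query(source_rows):
    """Stand-in for a query: copy each source row and add a computed column."""
    return [dict(row, total=row["qty"] * row["price"]) for row in source_rows]


source = [{"item": "A", "qty": 2, "price": 5.0}]  # a manager's workbook
output = run_query(source)                         # the sheet in combined.xlsx

output[0]["qty"] = 99        # manual edit in the loaded table
output = run_query(source)   # refresh: the output is rebuilt from the source

print(output[0]["qty"])  # the manual edit is gone
print(source[0]["qty"])  # the source never saw it
```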

Save a pivot table structure based on external source without saving/cacheing the data

I have an external data source that implements row-level security.
I have an XLSX file which I want to distribute which will have a pivot table based on the external source. All of my users have an identically configured ODBC connection, except it uses each of their personal credentials and thus they have access to different data.
I've explored all of the connection & pivot table settings that supposedly give you such controls but they are not working for me. When I save my workbook, it seems that it is not possible to prevent the contents of the pivot table (as they currently look) from being saved. When a new user opens it, they will be able to see the current pivot table contents (which they perhaps shouldn't have access to) until they click "enable content", accept the various popups and/or wait for refresh.
Previously, I created a table based on external source and configured the connection to not save data - this worked. I then created a pivot table on top of the table range and configured it to not save source data. This sort of works except the table refreshes first and so the pivot table loses its settings and you have to start again with a blank pivot.
If you create a pivot table directly on the external source (rather than indirectly via a table), which I expect is the best practice, the tickbox in pivot table options for "Save source data with file" is greyed out. Presumably this is because Excel knows the source data is actually external and so the question isn't relevant, except it is relevant, because the pivot table output still contains data when saved.
The only thing I can think of is to save the workbook as a user with zero permissions, so the pivot table is structured correctly but has no contents, and then send that round. Users will then see no/harmless data before it auto-refreshes, at which point they'll see what they should see.
Kind of feels like a glaring omission from excel. Am I missing something?
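One way to check what a saved workbook actually carries is to look inside the .xlsx package, which is a zip archive. The sketch below assumes the usual layout Excel writes, where cached pivot records live in parts named xl/pivotCache/pivotCacheRecordsN.xml; the in-memory archive is a stand-in for a real file. Note that even when no such part exists, the rendered pivot cells are still stored in the worksheet XML, which is exactly the problem described above.

```python
import io
import zipfile


def cached_pivot_parts(xlsx_file) -> list[str]:
    """List the parts of an .xlsx package that hold pivot cache records."""
    with zipfile.ZipFile(xlsx_file) as package:
        return sorted(
            name
            for name in package.namelist()
            if name.startswith("xl/pivotCache/pivotCacheRecords")
        )


# Build a tiny stand-in package in memory (not a valid workbook, just the names).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("xl/workbook.xml", "<workbook/>")
    z.writestr("xl/pivotCache/pivotCacheDefinition1.xml", "<pivotCacheDefinition/>")
    z.writestr("xl/pivotCache/pivotCacheRecords1.xml", "<pivotCacheRecords/>")

parts = cached_pivot_parts(buf)
print(parts)
```

With a real workbook you would pass its path instead of buf.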

SSIS: failed to retrieve long data / truncation errors

I'm getting either of those two errors when trying to export data from a set of excel spreadsheets.
Simplified scenario:
two excel spreadsheets containing 1 text column
in file 1 the text is never longer than 200 characters
in the 2nd - it is.
SSIS is supposed to import them automatically from a folder. Easy and simple, but...
The Excel source component decides what data type is used here.
When I use a sample file I created with short sample text, it decides to use DT_WSTR(255), and then it fails on the second file with a truncation error.
When I force it to use DT_NTEXT (by putting longer text in the sample file), it fails on the 1st file, complaining "Failed to retrieve long data for column"... because the 1st file doesn't contain longer texts.
Has anybody found a solution/work-around for this? I mean - except manually changing the source data?
Any help much appreciated.
We can use a Flat File Connection Manager instead of the Excel Connection Manager. When we create a Flat File Connection Manager we can set the data type and length explicitly. To do so, we first need to save the Excel file as a CSV or tab-delimited file. Then we can use this file to create the Flat File Connection.
Drag and drop a Flat File Source onto the Data Flow tab. In the Flat File Source Editor dialog box, click the New button to launch the Flat File Connection Manager Editor dialog box. In the General tab, specify the full path of the file, then click the Advanced tab and set the data type and column width of the text column there.
Click OK and close the dialog box; this creates our connection manager. Now the connection manager can successfully read the full-length data, but we still have to set the data type and length of the Output Columns so that the data arrives intact in the output pipeline. To do that, right-click the Flat File Source, click Show Advanced Editor, and set the data type and length of each output column on the Input and Output Properties tab.
When we finish, we run our package. It runs successfully without any truncation error and inserts all the data into our target database.
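Since the width now has to be chosen by hand, it can help to scan the exported CSV files first and see how long the text really gets. The sketch below is only a helper idea, not part of SSIS; the function name and the sample data are made up. It relies on DT_WSTR holding at most 4000 characters and falls back to DT_NTEXT beyond that.

```python
import csv
import io

DT_WSTR_MAX = 4000  # maximum length of an SSIS DT_WSTR column, in characters


def suggest_ssis_types(csv_text_files):
    """Map each column name to a suggested SSIS type across all given files."""
    longest = {}
    for handle in csv_text_files:
        for row in csv.DictReader(handle):
            for column, value in row.items():
                longest[column] = max(longest.get(column, 0), len(value or ""))
    return {
        column: f"DT_WSTR({max(width, 1)})" if width <= DT_WSTR_MAX else "DT_NTEXT"
        for column, width in longest.items()
    }


# Two stand-in files: one with short text, one with text longer than 255.
file1 = io.StringIO("comment\nshort text\n")
file2 = io.StringIO("comment\n" + "x" * 300 + "\n")
suggestion = suggest_ssis_types([file1, file2])
print(suggestion)
```

The width is taken across all files together, so a single setting covers both the short and the long file.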

How to reuse a set of power query steps in another Excel document?

We have a 4 GB CSV file which is the source for Power Query in an Excel document. It takes some time to set up all of the transformations, and we would like to be able to reuse the steps when creating other documents that need to import files of the same format into the data model.
Is there a way to save the query and reuse it in another document? I've seen some references to copying the query text from the Advanced Editor, but it seems like there should be a better way of doing it.
Separation of data and PowerQuery transformations
I assume you opened your Excel data file and did all PowerQuery transformations within it. In order to separate them, you could either go for Peter's solution or make two copies of that file, one for the data (e.g. "data.xlsx") and the other for the transformations (e.g. "PQ_transformations.xlsx"). Either way, you will have to make some adjustments.
Adjustments
Remove all PQ queries from the data file.
Alter the PQ file. It depends on whether you would like to change the location of each data file within PowerQuery (Option 1) or not (Option 2).
Option 1: Select the data file within PowerQuery.
Open the PQ editor
Go to the first query of your transformations and replace the first statement (which should look like = Excel.CurrentWorkbook(){[Name="Table1"]}[Content]) by = Excel.Workbook(File.Contents("[PATH]\data.xlsx"), null, true) with [PATH] being a placeholder for your file's location.
Close the PQ editor
Delete the tab that contained your original data.
Option 2: Apply transformations without editing PowerQuery
The following setup assumes that you organize your data files in different folders. You can then copy your PQ file into each of these folders, open it and click on "Data"/"Refresh All" to apply your transformations to the data file in the given folder.
Notes:
I assume that all data files have the same structure and name.
I define the folder in Excel and not in PowerQuery to allow users that have no knowledge of PQ to manually change the folder by overwriting the formula in case they do not want to copy the file all the time.
Add a tab called "Paths".
Select A1 and enter Current folder.
Select A2 and enter =MID(CELL("filename"),1,FIND("[",CELL("filename"))-1). This formula provides you with the folder of the current file as soon as it has been saved.
Select range A1:A2 and bring it into the PQ editor by selecting the "Data" ribbon and choosing "From Table/Range" in the "Get & Transform Data" section.
A new query is generated, showing you the current folder.
Open the "Advanced Editor" ("Home"/"Advanced Editor"), change the name of the second step to "SetTypes" and add the remaining steps. The result should look similar to this:
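let
    // Read the table that holds the folder path (created from Paths!A1:A2)
    Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    SetTypes = Table.TransformColumnTypes(Source, {{"Current folder", type text}}),
    // Take the folder path out of the first row as a plain text value
    GetPathAsValue = SetTypes{0}[Current folder],
    // List the files in that folder and keep only the data file
    // (text comparison in M is case-sensitive, so the name must match exactly)
    ShowFilesInPath = Folder.Files(GetPathAsValue),
    FilterForDataFile = Table.SelectRows(ShowFilesInPath, each ([Name] = "Data.xlsx"))
in
    FilterForDataFile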
Close the "Advanced editor" and accept the changes.
You should see a row which features your data file.
Click on "Binary" in the "Content" column to see a list of all tables and sheets in that file.
Select the desired sheet or table, whichever you usually have in your data file.
Rename the query to "GetFile"
Go to the first query of your original transformations and replace the first statement (which should look like = Excel.CurrentWorkbook(){[Name="Table1"]}[Content]) by = GetFile.
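If it is unclear what the formula in A2 and the file filter in the query are doing, the same logic can be sketched in Python. The example path and file names are made up; only the slicing and the exact-match filter mirror the steps above.

```python
def folder_from_cell_filename(cell_filename: str) -> str:
    """Mirror =MID(CELL("filename"),1,FIND("[",CELL("filename"))-1)."""
    # CELL("filename") returns folder + [workbook name] + sheet name;
    # everything before the first "[" is the folder, trailing backslash included.
    return cell_filename[: cell_filename.index("[")]


def pick_data_file(file_names, wanted):
    """Mirror the Table.SelectRows filter: exact, case-sensitive name match."""
    return [name for name in file_names if name == wanted]


# Hypothetical values, for illustration only.
folder = folder_from_cell_filename(r"C:\reports\[PQ_transformations.xlsx]Paths")
matches = pick_data_file(["data.xlsx", "Data.xlsx", "notes.txt"], "Data.xlsx")
print(folder)
print(matches)
```

The second call shows why the file name in the query has to match the name on disk exactly, including upper and lower case.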
Copy and Paste
In PowerQuery right-click on the final query and select copy. Open a new Excel workbook, open PowerQuery and paste the query into the queries pane. All dependent queries and parameters will be copied as well. Then you can adjust the query steps and save the new workbook.

Change data source of a list table in Excel 2010

Is there a way to change the data source of a list table in Excel? It's easy to change the data source of a pivot table but the only way I have found to change the data source of a list table is to delete the table and start again.
Just to be clear, what I mean by a list table is what you get when you set up a connection to a table or query in Access (for example) and then click on "Existing Connections" under the Data tab.
I would prefer a method using the usual user interface but if there is a method that works only in VBA, that would be fine.
You can do it from the Excel interface without VBA.
Under the Data tab, click on Connections.
From the Workbook Connections dialog, click Properties.
From the Connection Properties dialog, click Definition.
Browse for a connection file and then select a table.
That's it.
If your data source is inside Excel on another sheet and you're using OLE DB Query, you might find changing the data source impossible (which was the case with me).
To change the data source in this instance, click on Resize Table under the Table Design tab in Excel.
When you click on this, you are able to change the range of your query.

Resources