Dataprep pivot transform - pivot

I'm new to Dataprep and am trying to create a pivot table using the Pivot transform:
https://cloud.google.com/dataprep/docs/html/Pivot-Transform_57344645#example---basic-pivot
I searched the documentation and the syntax looks simple enough, but it is shown out of context and I am not sure where to enter it.
The syntax is:
pivot col: (the parameter to be used as col) value: (value to present) group: (what to group by)
Other solutions that I found here and elsewhere all require a lot of code and rely heavily on knowing the columns in advance,
e.g. using case when ____ = 'name of col' to pivot the data.
Any ideas would be appreciated.

I'll assume that you have already imported the dataset and created a new flow. Once you add the dataset to the flow, you can add a recipe. The quickstart explains how to do so.
Regarding the pivot transform: once you have added the recipe, click the Recipe button at the top and select Add New Step.
Then, in the Transformation text box, start typing pivot and select the Pivot transform. A menu will appear where you can set the parameters using the UI.
Alternatively, you can copy and paste the formula into the text box. For example:
pivot col:Date value:SUM(Sales) group:State
The fields are then filled in for you, and you can check the preview of the pivot transform.
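To make the roles of col, value and group concrete, here is a minimal Python sketch of what the formula above computes. It is not Dataprep code: the sample rows and the pivot_sum helper are hypothetical, and filling missing combinations with 0 is a choice made for this sketch. Note that the output columns are discovered from the data, so they do not need to be known in advance.

```python
from collections import defaultdict

# Hypothetical sample rows; the column names mirror the example formula.
rows = [
    {"State": "CA", "Date": "2020-01-01", "Sales": 10},
    {"State": "CA", "Date": "2020-01-01", "Sales": 5},
    {"State": "CA", "Date": "2020-01-02", "Sales": 7},
    {"State": "NY", "Date": "2020-01-01", "Sales": 3},
]


def pivot_sum(rows, col, value, group):
    """Sum `value` for each (group, col) pair; output columns come from the data."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in rows:
        totals[row[group]][row[col]] += row[value]
    # Distinct values of `col` become the pivoted column headers.
    columns = sorted({row[col] for row in rows})
    return {
        g: {c: by_col.get(c, 0) for c in columns}
        for g, by_col in totals.items()
    }


# Analogous to: pivot col:Date value:SUM(Sales) group:State
pivoted = pivot_sum(rows, col="Date", value="Sales", group="State")
for state, sums in pivoted.items():
    print(state, sums)
```

Each distinct Date becomes a column, each State becomes a row, and the cells hold the summed Sales.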

Related

Is there an Excel function that can create a dynamic and tiered table based off of the number of inputs for each tier?

I am looking for assistance with a project of mine. I want to create a dynamic table that is updated based on the number of inputs.
The goal is a fully built-out table that follows the tier structure. I have attached an example with the inputs above and the intended output below. The table should update if I add a new selection to a tier, e.g. D to tier 1 or 3 to tier 2.
Example of input and output desired
I previously had to map this manually, which is time-consuming and error-prone, so I am looking for a way to do it automatically. Thank you in advance for any help provided :)
Separate your tier values into three separate tables. In the image below I have created three tables - tier1, tier2 and tier3. In each table, there is one column called "values" containing the values for that tier.
For each table, create a query using Data>Get & Transform Data>From Table/Range.
You should then have three queries in the Power Query Editor, like this:
In the Power Query Editor, select the "tier1" query, then use Add Column>Custom Column and configure it like this:
When you hit OK, you will see this:
Hit the double-headed arrow at the top of the new column then hit OK on the dialog to expand the column. You'll see this at the end:
Repeat the above steps for adding a column for tier3, so at the end you have this:
You can now right-click any of the columns and use 'Rename' to rename them as you want.
Finally, click 'Close & Load' to put the result back to the workbook.
Now, you only need to put your tier values into the three tables, then right-click the final query and select 'Refresh' to run the steps again.
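For readers who want to check the expected result outside Excel, here is a small Python sketch. It assumes the custom column steps above amount to a cross join of the three tier tables, that is, every combination of one value per tier. The tier values are hypothetical placeholders.

```python
from itertools import product

# Hypothetical tier values; in the workbook these live in the
# tier1, tier2 and tier3 tables.
tier1 = ["A", "B", "C"]
tier2 = [1, 2]
tier3 = ["x", "y"]

# Every combination of one value per tier, with tier1 varying slowest.
combos = list(product(tier1, tier2, tier3))
for row in combos:
    print(row)
```

Adding a value to any of the lists and rerunning plays the same role as adding a row to a tier table and refreshing the query.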

Use datamodel data for cell formula

I have an Excel file with a table imported from a txt file into a sheet (using New Query). From that table I created a pivot table and some formulas, for example MAX().
I was told that for large files it is better to add the data to the data model as connection only (so the data is not visible in a sheet).
Creating the pivot table is no problem and works great, but when I try to write the formulas, Excel does not find the table.
Before I could do something like this:
=MAX(Table1[Column1])
but now when I use MAX, Excel does not find Table1, which I have loaded as connection only. Is there any way to point a formula at data that has been added to the model as connection only?
Thanks.
To access data in the data model you can use "Cube functions". Follow these steps:
1- From inside the "Manage Data Model" option, create a pivot table of your table
2- Customize the new pivot table according to your needs
3- Click the ribbon "PivotTable Tools" | "Analyze" | "OLAP Tools" | "Convert to Formulas"
4- Optional: Merge the formulas into one
Remarks:
The aggregations (MAX, SUM, etc.) must be defined in the pivot table.
Here is a screencast I created for you.
Reference: https://support.office.com/en-us/article/cube-functions-reference-2378132b-d3f2-4af1-896d-48a9ee840eb2
First create a Data Model table, then use
=MAXIFS(Table1[Values],Table1[Labels],"a")

Count how many times a particular value appears with respect to value in another column

I want to count how many times a particular value appears with respect to the value in another column. (Apologies, I am struggling to put it into words properly. Maybe that's why I couldn't google it.)
I am using Spotfire and the actual data set is quite big.
Using my dummy data, I want 5 more columns (a, b, c, d, e) that give me the counts, as in the table 'what I want'.
Please help if you can.
Thanks,
AP
What you're looking for is called a Pivot Table. It doesn't look quite like what you've got in your example, and because you haven't said much about what you're trying to do in the end, I'm working under the assumption that it's just a quick example you put together. If that's not the case, please clarify your question with your end goal and I'll update my answer.
To create a Pivot Table in Spotfire:
click the Insert menu at the top of the screen
choose Transformation...
in the Insert Transformation dialog that appears, choose your data table from the top dropdown, and choose Pivot from the bottom one, then click Add...
configure the pivot like I've done in the screenshot below
click OK and confirm the Insert Transformation dialog
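As an illustration of what such a count pivot produces, here is a minimal Python sketch. It is not Spotfire code, and because the original dummy data is not shown, the records below (a key column and a value column holding a to e) are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical dummy data: (key, value) pairs standing in for two columns.
records = [
    ("x", "a"), ("x", "b"), ("x", "a"),
    ("y", "c"), ("y", "e"),
    ("z", "d"),
]

# Count how often each value occurs for each key.
counts = defaultdict(Counter)
for key, value in records:
    counts[key][value] += 1

# One output column per distinct value; a Counter returns 0 for missing values.
columns = sorted({value for _, value in records})
table = {key: [counts[key][c] for c in columns] for key in counts}

print("key", columns)
for key, row in table.items():
    print(key, row)
```

Each row corresponds to one key and each of the columns a to e holds the number of times that value appeared for it.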

Custom row labels in PivotTable

I have an Excel spreadsheet full of customer data, including a few single-letter categorical variables.
For example, property type can be I (investment), O (owner occupier) or R (renter). Is it possible to replace the single letter with a descriptive title in the rows of a PivotTable? I do not have the descriptive names anywhere in my spreadsheet and I would prefer not to add them.
You can give nicknames to the fields that populate the pivot table.
If you go to the pivot table data and right-click, you can change the field settings to give a custom name to a row/series, but I do not know about individual data points.
Path: pivot table data => right-click => select Field Settings => edit custom name.
It does not appear to modify the raw data behind the pivot table.
It applies the name to the chart as well, so make sure your chart still looks okay.
To my knowledge this is the best tool for you to experiment with.
Hopefully this answers your question.
This comes from experimenting in Excel 2013.

How to find data source of a slicer for a pivot table via the Excel UI?

Note: I don't think it makes any fundamental difference, but I am working with pivot tables running on top of a PowerPivot model.
Example scenario:
Three tables in a model: SalesTransaction, BuyerCustomers, SellerCustomers, with two defined PowerPivot relations:
BuyerCustomers.CustomerCode --> SalesTransaction.BuyerCustomerCode
SellerCustomers.CustomerCode --> SalesTransaction.SellerCustomerCode
I have a PivotTable defined using SalesTransaction as the data source.
Now, if I want to create slicers on both BuyerCustomer and SellerCustomer, in the Pivot Table fields window I can right click and "add as slicer" on either:
SalesTransaction.BuyerCustomerCode and SalesTransaction.SellerCustomerCode (the two columns in the transaction table)
BuyerCustomers.CustomerCode and SellerCustomers.CustomerCode (the individual lookup tables)
Either way, the behavior is identical. My question is: once this has been set up, how can one tell what a slicer is bound to via the UI in Excel? Other than being able to deduce the obvious association via column names, how does one tell?
Using VBA, one can discover the association like so:
ActiveWorkbook.SlicerCaches("Slicer_CustomerCode").SourceName
...which yields:
"[SalesTransaction].[BuyerCustomerCode]"
or
"[BuyerCustomers].[CustomerCode]"
...but as far as I can tell, there is no way to see this via the UI.
You cannot find the slicer's TableName.ColumnName data source via the UI; you can only see the ColumnName.
As posted in the question, you can see both table and column names via VBA:
ActiveWorkbook.SlicerCaches("Slicer_YourSlicerName").SourceName
...which yields:
"[TableName].[ColumnName]"
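If you export these strings (for example from the Immediate window) and want to split them into table and column, a small Python sketch could look like the following. The parse_source_name helper is hypothetical, and it assumes the value always has the "[TableName].[ColumnName]" shape shown above.

```python
import re

# Matches "[TableName].[ColumnName]" as reported by SourceName above.
SOURCE_NAME = re.compile(r"^\[(?P<table>[^\]]+)\]\.\[(?P<column>[^\]]+)\]$")


def parse_source_name(source_name):
    """Split a slicer source name into (table, column)."""
    match = SOURCE_NAME.match(source_name)
    if match is None:
        raise ValueError(f"unexpected source name: {source_name!r}")
    return match["table"], match["column"]


print(parse_source_name("[SalesTransaction].[BuyerCustomerCode]"))
print(parse_source_name("[BuyerCustomers].[CustomerCode]"))
```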
