In Tableau can I calculate a column based on totals of two other columns? - calculated-columns

I'm new to Tableau and new to posting in Stackoverflow so bear with me.
I have a dataset with variables such as State, County, Organization, 2020 Enrollment, 2021 Enrollment, and Delta (change in enrollment over those two years). What I want is a column that gives the percent delta in enrollment over these two years.
The first thing I tried was calculating a column just using the growth formula:
(ZN([2021Enrolled])-ZN([2020Enrolled]))/ZN([2020Enrolled])
In the Data View this works great, because nothing is being summed, I get the correct delta. But when I use this formula in my worksheet, what happens is that the formula is being calculated across all the observations (there are several observations per county, per organization, for example) and then summed up. This gives an incorrect delta for year over year.
What I am looking for is a way to calculate the % delta column based on the total enrollments for 2020 and 2021 in order to achieve the correct % delta.
I included two screenshots below showing what Tableau is giving, and then an Excel spreadsheet of the same data filtered on just one county to show the problem a little better.
Maybe a similar question has been asked before, but I was unsure just how to search this up. Any help would be appreciated.
Thanks!
Sam
Tableau view
Excel view

I found the answer: I was trying to create a calculated column in Data View, what I needed to do was create a calculated column in my worksheet view, so that it would only work on the data presented there.

Related

Excel PivotTable create column based on summarised column data

I have the following Pivot Table:
I am trying to derive the number of cases created per resource for a given month. I.e. for Jan 2021 the formula would be 768/9 = 85.333...
I've tried to use a Calculated Field, but the issue I'm having is that the PivotTable uses the underlying data e.g. for a given case, create date is 1/1/2022 and the resources on that day is 8. It is then dividing the numeric value of 1/1/22 by 8 and coming up with a value in the thousands, for obvious reasons, and then working out the average/sum/etc for all of the cases within that month. What I need is a way of just doing B5/C5 but within the pivot table itself, rather than a column outside of the pivottable. Is this possible?
P.S. Sorry if this is a basic one that has already been answered, but I'm not always sure on the correct terminology for PivotTable functionality so I've likely been googling like a 6 year old.
Thanks,

How to Group Measures and Make Columns Table Source Name

I'm new to Tableau! I hope this is a simple answers. Thanks in advance!
I'm working with employee data and I need to create a matrix of headcount totals across years and months.
Final Matrix Output Example
I'm starting with 6 tables listing all active employees at the beginning of each year from 2015 through 2020. I then have a list of employees and the date that were hired; so all employee additions. I then have the same thing for terminations. All 8 of these tables are in the same Excel file but different tables.
List of Data Tables
How can I take this data and create the matrix I linked above? I've tried creating calculated fields to count the number of active employees for each time period, but I can't then seem to get the matrix to organize itself correctly in a table.
Current Status
I feel like the easiest solution would be to query this so that I just have a snapshot of all active employees at the beginning of each month and year with month and year columns, but I'm not sure how to convert what I have now, into that sort of structure.
Thanks again.
I fear you have to extensively restructure your data before proceeding to build a view/crosstab, as is evident from the current status of your data (screenshot shared by you). You can do it much easily in excel. Meanwhile, I recommend/suggest you to read the paper by Hadley Wickham, renowned statistician/data scientist at this page https://vita.had.co.nz/papers/tidy-data.pdf
Still, I am trying to give you the steps which you can follow-
Step-1 Rename all columns of headcount tables by removing years from these. (Keep year names in sheets instead). This will give same column names for your all headcount tables.
Step-2 UNION all these headcount tables in data-tab of tableau. Keep sheet_names in a separate columns which will later-on be used to extract years' values.
Step-3 PIVOT all months columns to rows (In data tab only)
Step-4 Extract year names from file/sheetname column
Step-5 This will give a table structure with three useful columns to build your crosstab i.e. 1. Year (to be placed in columns); 2. Months (to be placed in rows) and 3. Headcount value (to be placed on viz/text marks card)

PowerBI DAX: logic to use aggregated table as parameter in functions or another workaround to calculate dataset KPI filtered by any field?

In PowerBI, I need to create a Performance Indicator (KPI) measure which evaluates dataset values in a scale from 0 to 1, with target (1) being the MAX value in a 20 years history. It's a national airport trip records open database. The formula is basically [value]/[max value].
My dataset has a lot of fields and I wish I could filter it by any of these fields, with a line chart showing the 0-1 indicator for each month based on the filters.
This is my workaround test solution:
Table 1 - Original dataset: if I filter something here, below tables also update (there are more fields to the left, including YEAR and MONTH
Table 2 - Reference to original dataset, aggregating YEAR-MONTH by the sum of "take-offs" (decolagens)
Table 3 - Reference to above (sum) table, aggregating MONTH by the max of "take-offs" (decolagens)
Table 4 - 'Sum table' merged to 'Max table' by MONTH as new table: then do [Value]/[Max] and we've got the indicator
So if i filter the original dataset by any fields, all other tables update accordingly and the indicators always stays between 0-1, works like a charm.
TL;DR
The problem is: I need to create a dashboard of this on Power Bi. So I need this calculation to be in a measure or another workaround.
My possible solution: by pure DAX code in the measure field, to produce Tables 2 and 3 so I'll divide the month sum values by their month max value (which will both be produced according to PowerBi dashboard slicers) and get the indicator dinamically produced.
I'm stuck at: I don't understand how can I reference a sum/max aggregate table in dax code. Something like = SUM (dataset[take-offs]) / MAX (SUM (dataset[take-offs])). Of course these functions do not work like that, but I hope I made my point clear: how can I produce this four table effect with a single measure?
Other solutions are welcome.
Link to the original dataset: https://www.anac.gov.br/assuntos/dados-e-estatisticas/dados-estatisticos/arquivos/DadosEstatsticos.csv
It's an open dataset, so I guess there's no problem sharing it. Please help! :)
EDIT: please download the dataset and try to solve this. Personally I think it's a quality statistics doubt that will eventually help others. The calculation works, it only needs a Power Bi Measure port.
Add the ALL formula:
Measure = SUMX(ALL('Table'),[Valor])/SUM('Table'[Max])
Example

Calculate average based on a value column (count) in a pivot table

I'm looking a way to add an extra column in a pivot table that that averages the sum of the count for the months ("Count of records" column) within a time period that is selected (currently 2016 - one month, 2017 - full year, 2018 - 5 month). Every month would have the same number based on the year average, needs to be dynamically changing when selecting different period: full year or for example 4 months. I need the column within the pivot table, so it could be used for a future pivot chart.
I can't simply use average as all my records appear only once and I use Count to aggregate those numbers ("Count of records" column).
My current data looks like this:
The final result should look like this:
I assume that it somehow can be done with the help of "calculated filed" option but I couldn't make it work now.
Greatly appreciate any help!
Using the DataModel (built in to Excel 2013 and later) you can write really cool formulas inside PivotTables called Measures that can do this kind of thing. Take the example below:
As you can see, the Cust Count & Average field gives a count of transactions by month but also gives the average of those monthly readings for the subtotal lines (i.e. the 2017 Total and 2018 Total lines) using the below DAX formula:
=AVERAGEX(SUMMARIZE(Table1,[Customer (Month)],"x",COUNTA(Table1[Customer])),[x])
That just says "Summarize this table by count of the customer field by month, call the resulting summarization field 'x', and then give me the average of that field x".
Because DAX measures are executed within the context of the PivotTable, you get the count that you want for months, and you get the average that you want for the yearly subtotals.
Hard to explain, but demonstrates that DAX can certainly do this for you.
See my answer at the following link for an example of how to add data to the DataModel and how to subsequently write measures:
Using the Excel SMALL function with filtering criteria AND ignoring zeros
I also recommend grabbing yourself a book called Supercharge Excel when you learn to write DAX by Matt Allington, and perhaps even taking his awesome online course, because it covers this kind of thing very well, and will save you significant head-scratching compared to going it alone.

Formula for getting worksheet names based on pivot table results

I have an excel file with 30 time cards, each on their own worksheet, where the only identifier is the worksheet name (ie the employee name). Each worksheet has a first column of account numbers, followed by columns for hours worked for each day of the month, and then total.
From these individual employee tabs I make a Totals worksheet(using =SUM('Adams:White'!B1) and then fill left and fill down. . .)
I then make a pivot on the Totals data and get summary data for the department. (ie we spent 100 hours total on account# 12345) - no problem.
My Question is: How do I write a formula(s) to find which employees contributed to the hours spent on account# 12345. The specific output I would want is a table with a column heading of "12345", and then only the names of those who worked on that account below the heading. (Or all names, sorted, with a second column of how many hours they worked on "12345").
Thanks!
Steve
Since you are feeding your data set into a pivot table, you will need to ensure each record (row) in your data set is reportable. i.e. if Adam and Jane worked on account 12345 for a total of 7 hours and your record in your data set (table) is only one row with the account listed and the total number of hours, it will be difficult and extremely bad practice to attempt to report this by staffer (how do you know that the 7 hours is made up of Adam and Jane, or it could be 14 part-time workers that each put in half an hour).
You have two approaches. One: you could consolidate the data into a master data tab and from there you could have each sheet (Adam, Jane, White) be a report off the master table to show performance by staffer.
Two: Make use of power pivot, if you have Excel 2013+ installed. Here you would create a link for each table by account. Now you would have each rep's hours contributed as a field in the power pivot connection.
Please let me know which of the two seems a better choice and I can assist from there.

Resources