How to Group Measures and Make Columns Table Source Name - pivot

I'm new to Tableau! I hope this has a simple answer. Thanks in advance!
I'm working with employee data and I need to create a matrix of headcount totals across years and months.
Final Matrix Output Example
I'm starting with 6 tables listing all active employees at the beginning of each year from 2015 through 2020. I then have a list of employees and the dates they were hired, i.e. all employee additions. I then have the same thing for terminations. All 8 of these tables are in the same Excel file, but as separate tables.
List of Data Tables
How can I take this data and create the matrix I linked above? I've tried creating calculated fields to count the number of active employees for each time period, but I can't then seem to get the matrix to organize itself correctly in a table.
Current Status
I feel like the easiest solution would be to query this so that I just have a snapshot of all active employees at the beginning of each month and year, with month and year columns, but I'm not sure how to convert what I have now into that sort of structure.
Thanks again.

I'm afraid you will have to restructure your data extensively before you can build a view/crosstab, as the current state of your data (the screenshot you shared) shows. You can do this much more easily in Excel. Meanwhile, I recommend reading the tidy data paper by Hadley Wickham, a renowned statistician/data scientist: https://vita.had.co.nz/papers/tidy-data.pdf
Still, here are the steps you can follow:
Step-1 Rename the columns of all headcount tables by removing the years from them (keep the year in each sheet name instead). This gives all your headcount tables the same column names.
Step-2 UNION all these headcount tables in the data tab of Tableau. Keep the sheet names in a separate column, which will later be used to extract the year values.
Step-3 PIVOT all the month columns to rows (also in the data tab).
Step-4 Extract the year from the file/sheet name column.
Step-5 This gives a table with three useful columns for building your crosstab: 1. Year (to be placed in columns); 2. Month (to be placed in rows); and 3. Headcount value (to be placed on the viz/text marks card).
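To make the reshaping concrete, here is a minimal Python sketch of the same union, pivot and year-extraction logic outside Tableau. The sheet names, month columns and headcount numbers are made up for illustration, and the sketch assumes each sheet name ends with its year.

```python
# Hypothetical wide-format sheets: one sheet per year, one column per month.
# Sheet names and values are invented for illustration only.
sheets = {
    "Headcount 2015": [{"Jan": 100, "Feb": 102, "Mar": 105}],
    "Headcount 2016": [{"Jan": 110, "Feb": 111, "Mar": 115}],
}


def union_and_pivot(sheets):
    """Union all sheets, pivot month columns to rows, and extract the year."""
    long_rows = []
    for sheet_name, rows in sheets.items():        # Step 2: union, keeping the sheet name
        year = int(sheet_name.split()[-1])         # Step 4: year taken from the sheet name
        for row in rows:
            for month, headcount in row.items():   # Step 3: month columns become rows
                long_rows.append(
                    {"Year": year, "Month": month, "Headcount": headcount}
                )
    return long_rows


def crosstab(long_rows):
    """Step 5: months in rows, years in columns, headcount as the cell value."""
    table = {}
    for r in long_rows:
        table.setdefault(r["Month"], {})[r["Year"]] = r["Headcount"]
    return table


long_rows = union_and_pivot(sheets)
matrix = crosstab(long_rows)
```

Each entry in `long_rows` has exactly the three columns described in Step-5, which is the shape Tableau needs to build the matrix.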

Related

In Tableau can I calculate a column based on totals of two other columns?

I'm new to Tableau and new to posting in Stackoverflow so bear with me.
I have a dataset with variables such as State, County, Organization, 2020 Enrollment, 2021 Enrollment, and Delta (change in enrollment over those two years). What I want is a column that gives the percent delta in enrollment over these two years.
The first thing I tried was calculating a column just using the growth formula:
(ZN([2021Enrolled])-ZN([2020Enrolled]))/ZN([2020Enrolled])
In the Data View this works great: because nothing is being summed, I get the correct delta. But when I use this formula in my worksheet, it is calculated for each observation (there are several observations per county, per organization, for example) and the results are then summed up. This gives an incorrect year-over-year delta.
What I am looking for is a way to calculate the % delta column based on the total enrollments for 2020 and 2021 in order to achieve the correct % delta.
I included two screenshots below showing what Tableau is giving, and then an Excel spreadsheet of the same data filtered on just one county to show the problem a little better.
Maybe a similar question has been asked before, but I was unsure how to search for it. Any help would be appreciated.
Thanks!
Sam
Tableau view
Excel view
I found the answer: I was trying to create a calculated column in Data View. What I needed to do was create a calculated column in my worksheet view, so that it would only work on the data presented there.
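The difference between the two results can be shown with a small Python sketch. The rows and field names below are invented for illustration; the point is only that summing row-level growth rates is not the same as computing growth from the summed enrollments.

```python
# Hypothetical observations for a single county (values are made up).
rows = [
    {"county": "A", "enrolled_2020": 100, "enrolled_2021": 110},
    {"county": "A", "enrolled_2020": 10, "enrolled_2021": 20},
]


def row_level_delta_summed(rows):
    """Growth computed per row and then summed: the misleading result."""
    return sum(
        (r["enrolled_2021"] - r["enrolled_2020"]) / r["enrolled_2020"]
        for r in rows
    )


def delta_of_totals(rows):
    """Growth computed from the totals: the intended percent delta."""
    total_2020 = sum(r["enrolled_2020"] for r in rows)
    total_2021 = sum(r["enrolled_2021"] for r in rows)
    return (total_2021 - total_2020) / total_2020


wrong = row_level_delta_summed(rows)   # 0.10 + 1.00
right = delta_of_totals(rows)          # (130 - 110) / 110
```

In Tableau, one common way to get the second behaviour is to aggregate inside the calculation, for example (SUM([2021Enrolled])-SUM([2020Enrolled]))/SUM([2020Enrolled]). Whether that is exactly what the worksheet-level calculation above did is an assumption.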

Subtracting minimum and maximum data from the group in Excel

I have a problem extracting data from a dataset. In my company, each project has several processes, which are grouped into categories, so one project has many grouped processes. Based on column "D", I would like to get the first date and the last date for every category. For example, for project 20.28 I would like to take the start date from row 5 and the finish date from row 4. I have hundreds of projects divided into categories, so doing this manually isn't an option. Below is a sample.
Sample data set
I have a solution based on consecutive filters with formulas. If this works for you, I'll post the steps here.
https://drive.google.com/file/d/1pdxMGRDl_sV5wRmqIFouDQs8L43LVxgi/view?usp=sharing
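Independently of the linked workbook, the underlying operation is a grouped minimum and maximum. A minimal Python sketch, assuming each row carries a project, a category, a start date and a finish date (these field names and values are hypothetical, since the sample layout is only shown as an image):

```python
from datetime import date

# Hypothetical rows: (project, category, start date, finish date).
rows = [
    ("20.28", "Design", date(2020, 1, 6), date(2020, 1, 20)),
    ("20.28", "Design", date(2020, 1, 2), date(2020, 1, 10)),
    ("20.28", "Build", date(2020, 2, 1), date(2020, 3, 1)),
]


def date_ranges(rows):
    """Earliest start and latest finish for each (project, category) group."""
    ranges = {}
    for project, category, start, finish in rows:
        key = (project, category)
        if key not in ranges:
            ranges[key] = (start, finish)
        else:
            first, last = ranges[key]
            ranges[key] = (min(first, start), max(last, finish))
    return ranges


ranges = date_ranges(rows)
```

Note that the earliest start and the latest finish can come from different rows, as in the project 20.28 example in the question.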

Pivot Table project - Avoid using many INDEX and MATCH functions that make Excel crash

I need some help with an Excel project that's giving me headaches. I managed to achieve everything I wanted, but the result is too heavy for Excel and it crashes all the time. I'm overusing the INDEX and MATCH functions on large tables (50 000+ lines) and Excel doesn't like it. I'm looking for a lighter way to do the same thing in Excel.
Here's what I achieved: I created a report that helps me analyze my employees' performance against their billing targets. To create such a report, I used a Pivot Table.
That Pivot Table needs this information as its source :
Each sale that every employee made (amount in $ and date)
The hourly rate of each employee (which changes for every period, see TABLE1 below)
The billing target for each employee (which changes for every period, see TABLE1 below)
Here's my setup. I have 3 tables :
TABLE1 (See attached image) - A table where I manually input data for each of my employees (hourly rate and billing target). Their billing target and hourly rate change every period. So, each period has a different line and I indicate the first day of the period and the last day of the period.
TABLE2 (See attached image) - Table that contains sales data exported from another software I use. Each line represents an amount sold by an employee to a customer on a specific date. This table is pretty heavy and contains more than 50 000 lines. Moreover, the last 2 columns of this table use Index and Match functions to get the right hourly rate and the right billing target from TABLE1. That means that each of those 50 000 lines uses the INDEX and MATCH functions twice… This part is too heavy for Excel and I need a workaround.
Moreover, TABLE2 is refreshed every few days with new data coming from my other software (an ERP). So the solution must take that into account and must be permanent (I want to avoid steps that have to be repeated every time I refresh TABLE2 with new data).
TABLE3 - A Pivot Table that uses TABLE2 as its data source. I use the slicer to select the name of an employee and a timeline to specify which months I want to display. Then the Pivot Table shows my employee's statistics grouped by months. The main statistic is the amount of "billed hours" for each employee, which is in reality the amount of sales made by that employee, divided by their hourly rate on a specific date.
My thoughts :
It is absurd that TABLE2 uses that many INDEX and MATCH functions. For example, if Employee1 made 500 sales between 2020-07-01 and 2020-07-31 (the same month, thus the same period, thus the same hourly rate and billing target), there will be 500 different lines that will use INDEX and MATCH to get the same hourly rate and billing target from TABLE1. That leads to a lot of duplicated calculation and a lot of duplicated data.
Would it be possible for a Pivot Table Calculated Field to use INDEX and MATCH in its formula? And would it be lighter for Excel to do so?
Another way would be to add, at the bottom of TABLE2, 12 lines per year (1 for each month) for every employee where I would write their hourly rate and the billing target. That way, the Pivot Table would be able to display an hourly rate and a billing target for each month, for each employee. That solution would work and would be lighter for Excel, but it would create a high risk of making mistakes while manually inputting the data.
I'm open to all suggestions including VBA!
Thank you very much for your precious time!
EDIT : FORMULA
As requested, here's my INDEX AND MATCH formula that is in TABLE2 and gets the hourly rate from TABLE1 :
=INDEX(TAB_Employee_Data[[#All];[Hourly_Rate]];MATCH([#[Date (Cell)]]; IF(TAB_Employee_Data[[#All];[Name]]=[#[Employee(Cell)]];TAB_Employee_Data[[#All];[First day of the period]]);1))
TAB_Employee_Data is the tab that contains "TABLE1".
I translated the names of the fields since all my work is in French.
This formula does the following : it searches the name of an employee in TABLE1 and finds the period which fits the date of a line in TABLE2.
Also, to work properly, I need to sort the lines of TABLE1 in chronological order.
TABLE 1 :
TABLE 2:
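The lookup the formula performs (for a given employee, find the latest period whose first day is on or before the sale date) can be sketched in Python as follows. This only illustrates the logic and the idea of building the per-employee index once instead of rescanning TABLE1 for every sale; it is not an Excel feature, and the names and numbers are made up.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical stand-in for TABLE1:
# (employee name, first day of period, hourly rate, billing target).
periods = [
    ("Employee1", date(2020, 7, 1), 100.0, 5000.0),
    ("Employee1", date(2020, 8, 1), 110.0, 5500.0),
    ("Employee2", date(2020, 7, 1), 90.0, 4000.0),
]


def build_index(periods):
    """Group periods by employee, sorted chronologically (done once)."""
    index = {}
    for name, first_day, rate, target in sorted(periods, key=lambda p: (p[0], p[1])):
        starts, values = index.setdefault(name, ([], []))
        starts.append(first_day)
        values.append((rate, target))
    return index


def lookup(index, name, sale_date):
    """Return (hourly rate, billing target) of the period containing sale_date.

    Like MATCH with match type 1, this picks the last period start that is
    less than or equal to the sale date.
    """
    starts, values = index[name]
    pos = bisect_right(starts, sale_date) - 1
    if pos < 0:
        raise KeyError(f"no period for {name} on or before {sale_date}")
    return values[pos]


index = build_index(periods)
rate, target = lookup(index, "Employee1", date(2020, 7, 15))
```

Because the index is built once and each lookup is a binary search, 500 sales in the same month no longer trigger 500 scans of the whole period table.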

Formula for getting worksheet names based on pivot table results

I have an Excel file with 30 time cards, each on its own worksheet, where the only identifier is the worksheet name (i.e. the employee name). Each worksheet has a first column of account numbers, followed by columns for hours worked for each day of the month, and then a total.
From these individual employee tabs I make a Totals worksheet (using =SUM('Adams:White'!B1) and then fill left and fill down...).
I then make a pivot on the Totals data and get summary data for the department (i.e. we spent 100 hours total on account# 12345). No problem.
My Question is: How do I write a formula(s) to find which employees contributed to the hours spent on account# 12345. The specific output I would want is a table with a column heading of "12345", and then only the names of those who worked on that account below the heading. (Or all names, sorted, with a second column of how many hours they worked on "12345").
Thanks!
Steve
Since you are feeding your data set into a pivot table, you need to ensure each record (row) in your data set is reportable. For example, if Adam and Jane worked on account 12345 for a total of 7 hours, and your data set has only one row with the account and the total number of hours, it will be difficult (and extremely bad practice) to try to report this by staffer: how would you know whether the 7 hours came from Adam and Jane, or from 14 part-time workers who each put in half an hour?
You have two approaches. One: consolidate the data into a master data tab, and from there have each sheet (Adam, Jane, White) be a report off the master table to show performance by staffer.
Two: use Power Pivot, if you have Excel 2013+ installed. Here you would create a link for each table by account. You would then have each rep's hours contributed as a field in the Power Pivot connection.
Please let me know which of the two seems a better choice and I can assist from there.
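As an illustration of the first approach, here is a minimal Python sketch of what a consolidated master table looks like and how it answers the question. The employee names, accounts and hours are hypothetical, and each sheet is reduced to its per-account totals.

```python
# Hypothetical per-employee time cards: worksheet name -> {account: total hours}.
timecards = {
    "Adams": {"12345": 4.0, "67890": 2.5},
    "Jones": {"67890": 8.0},
    "White": {"12345": 3.0},
}


def build_master(timecards):
    """One record per (employee, account), so every row is reportable."""
    return [
        (employee, account, hours)
        for employee, accounts in timecards.items()
        for account, hours in accounts.items()
        if hours > 0
    ]


def contributors(master, account):
    """Employees who logged hours on the given account, sorted by name."""
    return sorted((e, h) for e, a, h in master if a == account)


master = build_master(timecards)
on_12345 = contributors(master, "12345")
```

With the employee name stored on every row, a pivot table over the master data can put the account in the column heading and the employee names below it directly.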

Cognos - Showing every month on x-axis when some months don't have values

Let me first say I am very new to Cognos and have mainly learned by just manipulating items within active reports. I am having an issue with creating a graph that acts like a time series. I want it to display every month (with multiple values in some months and none in others). I want to visually see gaps between data points (ex: we order products every 3 months starting in January, so we should see gaps in the months we do not order products - like February and March).
I have tried changing the label control to manual and setting display frequency to 1. However, I think my issue is that there is not any data within certain months.
You are correct in that your problem is lack of data. A standard inner join will drop rows where there is not a corresponding row in both tables, resulting in gaps.
There are two solutions available:
Use a union to create "dummy" records for each date
Manually specify an outer join between the date table and the table containing the rest of information
Since the first technique is the most common, I'll outline the basic steps for it here.
1. Create a new query
2. Add your month data item to the query
3. Create a 'dummy' data item for your measure. Use 0 for its expression.
4. If there is a date range filter in the main query, apply it here
5. Create a union
6. Drag your new query into the union
7. Drag your original query into the union
8. Pull the date and measure data items into the union query
9. Set the Aggregate Function property of the measure to Total
10. Use the union query as the source for your chart
For every month with measure data you will have two rows, one with the measure amount and one with 0. The two rows will be combined by the auto-group and summarize function. The measures will be added together. Anything added to 0 will end up as the original amount.
For months with no measure data, there will only be the 'dummy' row with 0 for the measure and it will be represented in your chart.
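The effect of the union with dummy rows can be sketched in Python. This is only an illustration of the logic, not Cognos syntax, and the order data is made up.

```python
from collections import defaultdict

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Hypothetical original query result: (month, measure). Some months are absent.
orders = [("Jan", 500), ("Jan", 250), ("Apr", 300)]


def fill_missing_months(orders, months):
    """Union a zero-valued dummy row per month with the real rows, then total."""
    dummy = [(month, 0) for month in months]   # the 'dummy' query
    union = dummy + orders                     # the union query
    totals = defaultdict(int)
    for month, amount in union:                # auto-group and summarize (Total)
        totals[month] += amount
    return [(month, totals[month]) for month in months]


series = fill_missing_months(orders, MONTHS)
```

Every month now appears in `series`, with 0 for the months that had no orders, so the chart shows the gaps.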
