Cognos - Showing every month on x-axis when some months don't have values

Let me first say I am very new to Cognos and have mainly learned by manipulating items within active reports. I am having trouble creating a chart that behaves like a time series. I want it to display every month, with multiple values in some months and none in others, so that the gaps between data points are visible. For example, we order products every 3 months starting in January, so we should see gaps in the months when we do not order products, such as February and March.
I have tried changing the label control to manual and setting display frequency to 1. However, I think my issue is that there is not any data within certain months.

You are correct that your problem is a lack of data. A standard inner join drops rows that do not have a corresponding row in both tables, so months with no data never reach the chart.
There are two solutions available:
1. Use a union to create "dummy" records for each date
2. Manually specify an outer join between the date table and the table containing the rest of the information
Since the first technique is the most common, I'll outline the basic steps for it here.
1. Create a new query
2. Add your month data item to the query
3. Create a 'dummy' data item for your measure. Use 0 for its expression.
4. If there is a date range filter in the main query, apply it here
5. Create a union
6. Drag your new query into the union
7. Drag your original query into the union
8. Pull the date and measure data items into the union query
9. Set the Aggregate Function property of the measure to Total
10. Use the union query as the source for your chart
For every month with measure data you will have two rows, one with the measure amount and one with 0. The two rows will be combined by the auto-group and summarize function. The measures will be added together. Anything added to 0 will end up as the original amount.
For months with no measure data, there will only be the 'dummy' row with 0 for the measure and it will be represented in your chart.
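To make the union-and-total logic concrete, here is a minimal sketch in plain Python rather than Cognos. The order amounts and the 2023 month range are made up for illustration; the point is only that adding a 0 row per month leaves real totals unchanged while guaranteeing that every month appears.

```python
from collections import defaultdict

# Hypothetical order data: (month, amount). Only months with orders appear.
orders = [("2023-01", 120), ("2023-01", 80), ("2023-04", 200), ("2023-07", 50)]

# "Dummy" query: one row per month in the range, with 0 as the measure.
all_months = [f"2023-{m:02d}" for m in range(1, 13)]
dummy_rows = [(month, 0) for month in all_months]

# Union of both queries (all rows kept in this sketch), then group by
# month and total the measure, mimicking auto-group and summarize.
totals = defaultdict(int)
for month, amount in dummy_rows + orders:
    totals[month] += amount

# One row per month, in order, ready to feed a chart.
chart_rows = sorted(totals.items())
for month, total in chart_rows:
    print(month, total)
```

Months with orders keep their real totals, and the remaining months show up with 0, which is what lets the chart draw the gaps.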

Related

Pivot Table project - Avoid using many INDEX and MATCH functions that make Excel crash

I need some help with an Excel project that's giving me headaches. I managed to achieve everything I wanted, but the result is too heavy for Excel and it crashes all the time. I'm overusing the INDEX and MATCH functions on large tables (50 000+ lines) and Excel doesn't like it. I'm looking for a lighter way to do the same thing.
Here's what I achieved: I created a report that helps me analyze my employees' performance against their billing targets. To create this report, I used a Pivot Table.
That Pivot Table needs this information as its source:
Each sale that every employee made (amount in $ and date)
The hourly rate of each employee (which changes for every period, see TABLE1 below)
The billing target for each employee (which changes for every period, see TABLE1 below)
Here's my setup. I have 3 tables :
TABLE1 (See attached image) - A table where I manually input data for each of my employees (hourly rate and billing target). Their billing target and hourly rate change every period. So, each period has a different line and I indicate the first day of the period and the last day of the period.
TABLE2 (See attached image) - Table that contains sales data exported from another software I use. Each line represents an amount sold by an employee to a customer on a specific date. This table is pretty heavy and contains more than 50 000 lines. Moreover, the last 2 columns of this table use Index and Match functions to get the right hourly rate and the right billing target from TABLE1. That means that each of those 50 000 lines uses the INDEX and MATCH functions twice… This part is too heavy for Excel and I need a workaround.
Moreover, TABLE2 is refreshed every few days with new data coming from my other software (an ERP). So the solution must take that into account and must be permanent (I want to avoid steps that have to be repeated every time I refresh TABLE2 with new data).
TABLE3 - A Pivot Table that uses TABLE2 as its data source. I use the slicer to select the name of an employee and a timeline to specify which months I want to display. Then the Pivot Table shows my employee's statistics grouped by months. The main statistic is the amount of "billed hours" for each employee, which is in reality the amount of sales made by that employee, divided by their hourly rate on a specific date.
My thoughts :
It is absurd that TABLE2 uses that many INDEX and MATCH functions. For example, if Employee1 made 500 sales between 2020-07-01 and 2020-07-31 (the same month, thus the same period, thus the same hourly rate and billing target), there will be 500 different lines that will use INDEX and MATCH to get the same hourly rate and billing target from TABLE1. That leads to a lot of duplicated calculation and a lot of duplicated data.
Would it be possible for a Pivot Table Calculated Field to use INDEX and MATCH in its formula? And would it be lighter for Excel to do so?
Another way would be to add, at the bottom of TABLE2, 12 lines per year (1 for each month) for every employee where I would write their hourly rate and the billing target. That way, the Pivot Table would be able to display an hourly rate and a billing target for each month, for each employee. That solution would work and would be lighter for Excel, but it would create a high risk of making mistakes while manually inputting the data.
I'm open to all suggestions including VBA!
Thank you very much for your precious time!
EDIT : FORMULA
As requested, here's my INDEX AND MATCH formula that is in TABLE2 and gets the hourly rate from TABLE1 :
=INDEX(TAB_Employee_Data[[#All];[Hourly_Rate]];MATCH([#[Date (Cell)]]; IF(TAB_Employee_Data[[#All];[Name]]=[#[Employee(Cell)]];TAB_Employee_Data[[#All];[First day of the period]]);1))
TAB_Employee_Data is the tab that contains "TABLE1".
I translated the names of the fields since all my work is in French.
This formula does the following : it searches the name of an employee in TABLE1 and finds the period which fits the date of a line in TABLE2.
Also, to work properly, I need to sort the lines of TABLE1 in chronological order.
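For readers who want to see the lookup logic outside Excel, here is a small Python sketch of what the formula above does: for a given employee, it finds the latest period whose first day is on or before the sale date. MATCH with match type 1 returns the largest value less than or equal to the lookup value, which is why TABLE1 has to be sorted chronologically. The names, dates and rates below are invented, and `hourly_rate` is a hypothetical helper, not part of the workbook.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical TABLE1 rows: (name, first day of the period, hourly rate).
table1 = [
    ("Alice", date(2020, 7, 1), 110.0),
    ("Alice", date(2020, 1, 1), 100.0),
    ("Bob", date(2020, 1, 1), 90.0),
]

# Build one chronologically sorted list of period starts per employee.
periods = {}
for name, start, rate in sorted(table1, key=lambda row: (row[0], row[1])):
    starts, rates = periods.setdefault(name, ([], []))
    starts.append(start)
    rates.append(rate)

def hourly_rate(name, sale_date):
    """Return the rate of the latest period starting on or before sale_date."""
    starts, rates = periods[name]
    i = bisect_right(starts, sale_date) - 1
    if i < 0:
        raise LookupError(f"no period for {name} on {sale_date}")
    return rates[i]

print(hourly_rate("Alice", date(2020, 7, 15)))
```

Because the per-employee lists are built once, each of the 50 000 sales rows only needs a binary search instead of a scan of the whole table, which is the duplicated work described above.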
TABLE 1 :
TABLE 2:

PowerBI DAX: logic to use aggregated table as parameter in functions or another workaround to calculate dataset KPI filtered by any field?

In Power BI, I need to create a Performance Indicator (KPI) measure that evaluates dataset values on a scale from 0 to 1, with the target (1) being the MAX value in a 20-year history. It's a national open database of airport trip records. The formula is basically [value]/[max value].
My dataset has a lot of fields, and I would like to be able to filter it by any of them, with a line chart showing the 0-1 indicator for each month based on the filters.
This is my workaround test solution:
Table 1 - Original dataset: if I filter something here, the tables below also update (there are more fields to the left, including YEAR and MONTH)
Table 2 - Reference to original dataset, aggregating YEAR-MONTH by the sum of "take-offs" (decolagens)
Table 3 - Reference to above (sum) table, aggregating MONTH by the max of "take-offs" (decolagens)
Table 4 - 'Sum table' merged to 'Max table' by MONTH as new table: then do [Value]/[Max] and we've got the indicator
So if I filter the original dataset by any field, all the other tables update accordingly and the indicator always stays between 0 and 1. It works like a charm.
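As a rough illustration of the four-table logic above (in plain Python, not DAX), the sketch below sums take-offs per year-month, takes the maximum of those sums per calendar month, and divides. The records are invented and stand in for whatever rows remain after filtering.

```python
from collections import defaultdict

# Hypothetical filtered records: (year, month, take_offs).
records = [
    (2019, 1, 40), (2019, 1, 60), (2020, 1, 50),
    (2019, 2, 30), (2020, 2, 90),
]

# Table 2: sum of take-offs per year-month.
monthly_sum = defaultdict(int)
for year, month, take_offs in records:
    monthly_sum[(year, month)] += take_offs

# Table 3: for each calendar month, the max of the year-month sums.
month_max = defaultdict(int)
for (year, month), total in monthly_sum.items():
    month_max[month] = max(month_max[month], total)

# Table 4: indicator = year-month sum / max for that calendar month.
# Assumes every month has a positive max, as in this sample.
indicator = {
    (year, month): total / month_max[month]
    for (year, month), total in monthly_sum.items()
}

for key in sorted(indicator):
    print(key, round(indicator[key], 3))
```

Since the maximum is taken over the same filtered sums that form the numerator, every indicator value falls between 0 and 1.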
TL;DR
The problem is: I need to create a dashboard of this on Power Bi. So I need this calculation to be in a measure or another workaround.
My possible solution: use pure DAX code in the measure to produce Tables 2 and 3, so that I divide the month sum values by their month max value (both computed according to the Power BI dashboard slicers) and get the indicator produced dynamically.
I'm stuck at: I don't understand how I can reference a sum/max aggregate table in DAX code. Something like = SUM (dataset[take-offs]) / MAX (SUM (dataset[take-offs])). Of course these functions do not work like that, but I hope I made my point clear: how can I produce this four-table effect with a single measure?
Other solutions are welcome.
Link to the original dataset: https://www.anac.gov.br/assuntos/dados-e-estatisticas/dados-estatisticos/arquivos/DadosEstatsticos.csv
It's an open dataset, so I guess there's no problem sharing it. Please help! :)
EDIT: please download the dataset and try to solve this. Personally, I think it's a good statistics question that will eventually help others. The calculation works; it only needs to be ported to a Power BI measure.
Add the ALL formula:
Measure = SUMX(ALL('Table'),[Valor])/SUM('Table'[Max])
Example

Different aggregation functions for different dimensions in Excel pivot table

Can I define different aggregation methods for subtotals in different dimension in an Excel pivot table?
The following example shows a result I'm trying to obtain. The metric to aggregate is, let's say, lines of code of a software project. The 2 dimensions in question are Date and Organization. In source data, Organization is broken down into 2 columns, Department and Project, while Date is a single column and Excel makes up the Months/Years summaries automatically when making the ODBC data connection.
A metric such as this one should be aggregated differently along the different dimensions. For the Organization dimension, the subtotal for all projects of the department is the SUM, but in the date dimension, the subtotal for all months of the year is the MAX of any given month (or perhaps AVG, or last etc. but certainly not SUM).
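To pin down the intended result, here is a minimal Python sketch of the two-step aggregation described above: SUM across projects within a department, then MAX across the months of a year. The departments, projects and line counts are made up for illustration.

```python
from collections import defaultdict

# Hypothetical source rows: (department, project, year, month, lines_of_code).
rows = [
    ("Dept 1", "P1", 2020, 1, 10), ("Dept 1", "P1", 2020, 2, 12),
    ("Dept 1", "P2", 2020, 1, 5),  ("Dept 1", "P2", 2020, 2, 4),
]

# Organization dimension: SUM across projects for each department and month.
dept_month = defaultdict(int)
for dept, _project, year, month, loc in rows:
    dept_month[(dept, year, month)] += loc

# Date dimension: MAX across months for each department and year.
dept_year = {}
for (dept, year, _month), total in dept_month.items():
    key = (dept, year)
    dept_year[key] = max(dept_year.get(key, total), total)

print(dict(dept_month))
print(dept_year)
```

The order of the two steps matters: the yearly value here is the max of the monthly sums (16), which differs from the sum of each project's own yearly max (12 + 5 = 17).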
I've tried to define the different aggregation methods in Excel in the field settings, but it always selects one or the other method for both dimensions. Is there a way to do it, preferably using standard Pivot Table mechanisms or at worst a UDF in Excel?
To tackle this problem, I would add both aggregation functions, sum and max, then hide (or shrink considerably) the columns you do not want to display.
In the above example I shrank columns B, D, F and I because they hold values that are out of scope for your requirements.
The "Total Max of Loc" column displays a value consistent with the function applied throughout the entire column, that is, "the maximum number of lines of code reached by each project in each department". This could lead to misunderstandings when reading the subtotals and the grand total: the "Grand Total - Total Max of Loc" is not the "Total Max of Sum of Loc". In the example it shows 18, which is the absolute maximum value of Loc reached by any single project in any department. In the same way, the Total Max of Loc for Department 2 is 18 and for Department 1 is 12.
If a different behavior is required, as expressed in the comment to this answer, I think we are entering heavy-customization territory. A solution could be found by writing a custom macro and leveraging the GETPIVOTDATA function or, if acceptable for your case, simply by adding a new column with a MAX() formula and possibly hiding the "Total Max of Loc" column.

Problem with SSAS ParallelPeriod and Excel 2013 Timeline Filter

I am currently working on a project using Microsoft SQL Server Analysis Services. I found a problem with filtering data using the Excel timeline.
Here is my date dimension screenshot:
<img src="https://i.stack.imgur.com/NUr2x.png"/><img src="https://i.stack.imgur.com/5OSgA.png" />
I have a cube with 2 measures, Sales Quantity (measure) and Sales Quantity Last Year (calculation). Here is the MDX expression for the Sales Quantity Last Year calculation:
( ParallelPeriod([Date].[YM].[Calendar Year],1,[Date].[YM].CurrentMember),[Measures].[Sales Quantity In 1000] )
After deploying the project to my local server, the data can be shown perfectly using excel 2013:
Pic: Data in Excel without filter
The problem starts when I want to filter the data using the Excel timeline. When I filter only '2016', my calculated measure no longer works: the 'Sales Quantity in 1000 LY' column is blank. It looks like I can't see data outside the current filter (2016). Pic: Filtered using timeline filter
But when I use a slicer, the data is shown normally. Pic: Filtered using Slicer
Did I make a mistake in building the date dimension? Or do I need to fix the MDX calculation? When I test this case in Microsoft AdventureWorksDW2014 with the same date hierarchy and the same calculation, everything works.
Your parallel period calculation looks correct assuming [Date].[YM] is your date hierarchy. I am guessing that your date dimension is off somehow.
Make sure that:
1. It has a hierarchy created, and that hierarchy is the one you reference in the parallel period calculation. Here is an example; you could obviously have more or fewer attributes in the hierarchy.
2. Your attribute relationships are defined correctly.
3. Key columns on the attributes in the hierarchy are correct. In the example above, you would just make year the key for the year column, but for quarter it would be a collection of the year and quarter columns. For period, the key columns would be year, quarter, period. For week, the key columns would be year, quarter, period, week. Date would just use the date column, since date is the key.
4. The date key attribute uses a date field for its value column, as a time slicer needs this.
5. Time intelligence is defined on your date dimension. Right-click the date dimension in Solution Explorer and choose Add Business Intelligence, then on the Choose Enhancement screen pick Define Dimension Intelligence. Then set the attribute type for each dimension attribute. Here is how it would be for our example.
Hopefully one of these does it for you.
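The reason for the composite key columns mentioned above can be seen with a tiny Python sketch (illustrative data only, not SSAS code): a quarter number on its own repeats every year, so it cannot identify a single member of the hierarchy, while the (year, quarter) pair can.

```python
# Hypothetical date dimension rows: (year, quarter).
dates = [(2015, 1), (2015, 2), (2016, 1), (2016, 2)]

# Keyed by quarter number alone, Q1 2015 and Q1 2016 collapse into one member.
quarter_only = {quarter for _year, quarter in dates}

# Keyed by (year, quarter), every quarter of every year stays distinct.
year_quarter = {(year, quarter) for year, quarter in dates}

print(len(quarter_only), len(year_quarter))
```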

Statistical functions on non-numerical value

I am not looking for any code or formula but a rationale/logic.
Background: My data set comes in Date/Time format where a new timestamp is created for each new occurrence of an event.
My goal is to calculate the number of occurrences within each hour for a given day. Unfortunately, the system does not capture the number of occurrences per period as integers, so I have to count the number of times an hour value appears, i.e. the number of times the 4 o'clock hour appears. I am currently using a Pivot Table in Excel to count the number of times each hour appears. The fields in Rows are hour and date, and the field in Values is count of hour.
The trouble is that I cannot use any summarize functions to get things like sum, min, max, percentile, and standard deviation. For example, changing count to sum will only add up the hour values, so the sum for the 4 o'clock hour will return 12 instead of 3. So I have to use array formulas on the pivot table to get max, min, etc.
If I were to use this data in data viz tools like Tableau or Power BI, I wouldn't be able to get very far. I am looking for a suggestion or workaround that lets me reshape my data so it can be used in Pivot Tables in Excel and in data viz tools.
I know my question is not specific to one tool, but I am looking to enhance my understanding of data and data manipulation techniques.
EDIT: Please see attached image
Build a data model, using PowerPivot. Join your fact table to a calendar dimension table. Create a row count measure - you can then summarise that measure to suit (sum, average, min, etc)
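Since the question asks for the rationale rather than a formula, here is a tool-neutral sketch in Python of the idea behind a row count measure: first turn the timestamps into a count per date and hour, then summarise those counts like any other number. The timestamps are invented for illustration.

```python
from collections import Counter
from datetime import datetime
from statistics import mean

# Hypothetical event log: one timestamp per occurrence.
events = [
    datetime(2021, 3, 1, 4, 5),
    datetime(2021, 3, 1, 4, 20),
    datetime(2021, 3, 1, 4, 50),
    datetime(2021, 3, 1, 9, 15),
    datetime(2021, 3, 2, 4, 10),
]

# Row count per (date, hour): this is the numeric measure.
counts = Counter((ts.date(), ts.hour) for ts in events)

# Summarise the counts for the 4 o'clock hour across days.
four_oclock = [n for (_day, hour), n in counts.items() if hour == 4]
summary = {
    "sum": sum(four_oclock),
    "min": min(four_oclock),
    "max": max(four_oclock),
    "mean": mean(four_oclock),
}
print(summary)
```

Once the count exists as its own numeric value per date and hour, sum, min, max and the other statistics apply to the counts rather than to the hour values themselves.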

Resources