How to UnPivot e.g. 300 columns to e.g. 100 columns in Spotfire? - spotfire

I want to do some unpivoting in Spotfire as described in the official documentation:
https://docs.tibco.com/pub/spotfire/7.0.1/doc/html/data/data_unpivoting_data.htm
The problem is that I want to apply many of these transformations in succession and don't want to do them manually one by one. My data did not come from a pivot, but data with this shape is easily generated by pivoting as well:
https://docs.tibco.com/pub/spotfire/7.0.1/doc/html/data/data_details_on_pivot_data.htm
Similar to the Pivot example, I can end up with columns called e.g. Site, Date, Min(Measure1), Average(Measure1), Max(Measure1), Min(Measure2), Average(Measure2), Max(Measure2), ..., Min(Measure99), Average(Measure99), Max(Measure99), Min(Measure100), Average(Measure100), Max(Measure100)
Measure1 could be for example the length of a thing, measure2 the width, measure3 the height and so on, over all things produced at a site at a date.
Now, if I wanted to unpivot over all Min, all Average and all Max columns, I would need three Unpivot transformations (one for each) and would end up with columns called Min, Average and Max covering all 100 measures (per Site and Date); which measure a row belongs to could go in another column (or three).
That is probably manageable, but if I wanted to unpivot the same table over Measure1, Measure2, ..., Measure99, Measure100 to get one column per measure with a row each for Min, Average and Max (per Site and Date), I would need to run Unpivot 100 times. I think there must be an easier way. Any ideas? The column name contains both pieces of information needed to describe how to unpivot it, as in the example above, but I could also take the name of the new column from a column property.
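To make the desired reshaping concrete, here is a minimal sketch in plain Python, outside Spotfire. It is not a Spotfire transformation; it only illustrates how the "Aggregation(Measure)" column names from the question could drive the unpivot in one pass. The sample row, the regular expression and the names `unpivot_by_measure` and `Aggregation` are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical wide row; column names follow the "Agg(Measure)" pattern from the question.
wide_rows = [
    {"Site": "A", "Date": "2015-01-01",
     "Min(Measure1)": 1.0, "Average(Measure1)": 2.0, "Max(Measure1)": 3.0,
     "Min(Measure2)": 4.0, "Average(Measure2)": 5.0, "Max(Measure2)": 6.0},
]

KEY_COLUMNS = ("Site", "Date")
# Splits a name such as "Min(Measure1)" into its aggregation and measure parts.
NAME_PATTERN = re.compile(r"^(?P<agg>\w+)\((?P<measure>\w+)\)$")


def unpivot_by_measure(rows):
    """Return one row per (Site, Date, Aggregation) with one column per measure."""
    grouped = defaultdict(dict)
    for row in rows:
        keys = tuple(row[k] for k in KEY_COLUMNS)
        for column, value in row.items():
            match = NAME_PATTERN.match(column)
            if match is None:
                continue  # key column, not a measure column
            grouped[keys + (match["agg"],)][match["measure"]] = value
    return [
        {"Site": site, "Date": date, "Aggregation": agg, **measures}
        for (site, date, agg), measures in grouped.items()
    ]


long_rows = unpivot_by_measure(wide_rows)
for row in long_rows:
    print(row)
```

Swapping the roles of `agg` and `measure` in the grouping step would give the other layout from the question: one row per measure with Min, Average and Max as columns.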

Related

How to calculate average of parent category in excel?

I have data in below format. It shows starting and end time of an activity and calculates duration accordingly. The activity is performed through out the day at different times.
I have added a pivot. I want to find the average duration on a workday or a holiday (Day category). When I apply Average in the current pivot, it divides the total duration by the number of sessions in a day. For example, in week 1 an activity was done on 4 workdays and the total duration for the activity on workdays was 04.19. I want to divide this number by 4 to find the average time spent on each day, but the pivot divides it by 11, which is the total number of sessions in the four days.
Link for data
Steps:
Add a helper column to identify how many unique pairs of Dates/Day Categories there are:
=IF(SUMPRODUCT(($A$2:$A2=A2)*($B$2:$B2=B2))>1,0,1)
You can add extra products to this formula to force extra fields to be unique to be counted as well.
SRC:Simple Pivot Table to Count Unique Values
Add a Calculated Field in the PivotTable that is:
SUM(Duration)/SUM([Helper Column Name]) and include it in the 'Values' section of the PivotTable. Due to the new column being added, you might have to re-create the PivotTable.
This should produce the average in the manner that you want.
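The logic of the helper column and the calculated field can be sketched in plain Python as follows. The session data is invented for illustration, and `average_per_day` is a hypothetical name; the sketch only shows that flagging the first row of each date/category pair and dividing by the sum of the flags yields a per-day average rather than a per-session one.

```python
# Hypothetical session rows: (date, day_category, duration_minutes).
sessions = [
    ("2017-01-02", "Workday", 30),
    ("2017-01-02", "Workday", 45),
    ("2017-01-03", "Workday", 60),
    ("2017-01-07", "Holiday", 20),
    ("2017-01-07", "Holiday", 40),
]


def average_per_day(rows):
    """Total duration per category divided by the number of distinct days."""
    seen = set()
    totals = {}
    day_counts = {}
    for date, category, duration in rows:
        # Helper flag: 1 for the first row of a (date, category) pair, else 0.
        helper = 0 if (date, category) in seen else 1
        seen.add((date, category))
        totals[category] = totals.get(category, 0) + duration
        day_counts[category] = day_counts.get(category, 0) + helper
    # Equivalent of SUM(Duration) / SUM(helper) per category.
    return {category: totals[category] / day_counts[category] for category in totals}


averages = average_per_day(sessions)
print(averages)
```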

Different aggregation functions for different dimensions in Excel pivot table

Can I define different aggregation methods for subtotals in different dimension in an Excel pivot table?
The following example shows a result I'm trying to obtain. The metric to aggregate is, let's say, lines of code of a software project. The 2 dimensions in question are Date and Organization. In source data, Organization is broken down into 2 columns, Department and Project, while Date is a single column and Excel makes up the Months/Years summaries automatically when making the ODBC data connection.
A metric such as this one should be aggregated differently along the different dimensions. For the Organization dimension, the subtotal for all projects of the department is the SUM, but in the date dimension, the subtotal for all months of the year is the MAX of any given month (or perhaps AVG, or last etc. but certainly not SUM).
I've tried to define the different aggregation methods in Excel in the field settings, but it always selects one or the other method for both dimensions. Is there a way to do it, preferably using standard Pivot Table mechanisms or at worst a UDF in Excel?
To tackle this problem I would add both aggregation functions, Sum and Max, and then hide (or shrink heavily) the columns you do not want to display.
In the above example I shrank columns B, D, F and I because they hold values that are out of scope for your requirements.
The "Total Max of Loc" column displays a value consistent with the function applied throughout the entire column, that is, the maximum number of lines of code reached by each project in each department. This could lead to misunderstandings when reading the subtotals and the grand total: the "Grand Total - Total Max of Loc" is not the "Total Max of Sum of Loc". In the example it shows 18, which is the absolute maximum value of Loc for a project in any department. In the same way, the Total Max of Loc for Department 2 is 18 and for Department 1 it is 12.
If different behavior is required, as expressed in the comment on this answer, I think we are entering heavy-customization territory. A solution could be found by writing a custom macro and leveraging the GETPIVOTDATA function or, if acceptable in your case, simply by adding a new column with a MAX() formula and possibly hiding the "Total Max of Loc" column.
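For reference, the aggregation the question asks for (Sum across projects, then Max across months) can be sketched outside Excel in plain Python. The records and the function name `department_year_subtotals` are hypothetical; this is an illustration of the two-step aggregation, not a Pivot Table feature.

```python
from collections import defaultdict

# Hypothetical records: (department, project, year, month, lines_of_code).
records = [
    ("Dept1", "ProjA", 2015, 1, 5),
    ("Dept1", "ProjB", 2015, 1, 7),
    ("Dept1", "ProjA", 2015, 2, 6),
    ("Dept1", "ProjB", 2015, 2, 9),
    ("Dept2", "ProjC", 2015, 1, 18),
    ("Dept2", "ProjC", 2015, 2, 10),
]


def department_year_subtotals(rows):
    """SUM over projects within each month, then MAX over months within each year."""
    monthly = defaultdict(int)
    for dept, _project, year, month, loc in rows:
        monthly[(dept, year, month)] += loc  # organization dimension: sum
    yearly = {}
    for (dept, year, _month), total in monthly.items():
        key = (dept, year)
        yearly[key] = max(yearly.get(key, total), total)  # date dimension: max
    return yearly


subtotals = department_year_subtotals(records)
print(subtotals)
```

The order matters: taking the maximum per project first and summing afterwards would generally give a different number.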

Statistical functions on non-numerical value

I am not looking for any code or formula but a rationale/logic.
Background: My data set comes in Date/Time format where a new timestamp is created for each new occurrence of an event.
My goal is to calculate the number of occurrences within each hour for a given day. Unfortunately, the system does not capture the number of occurrences per period as integers, so I have to count the number of times an hour value appears, i.e. the number of times the 4 o'clock hour appears. I am currently using a Pivot Table in Excel to count the number of times each hour appears. The fields in Rows are hour and date, and the field in Values is count of hour.
The trouble is that I cannot use any summarize functions to get things like sum, min, max, percentile and standard deviation. For example, changing count to sum will only add up the hour values, so the sum for the 4 o'clock hour returns 12 instead of 3. I therefore have to use array formulas on the pivot table to get max, min, etc.
If I were to use this data in data viz tools like Tableau or Power BI, I would not get very far. I am looking for suggestions or a workaround that lets me manipulate my data so it can be used in Pivot Tables in Excel and in data viz tools.
I know my question is not specific to one tool, but I am looking to enhance my understanding of data and data manipulation techniques.
EDIT: Please see attached image
Build a data model, using PowerPivot. Join your fact table to a calendar dimension table. Create a row count measure - you can then summarise that measure to suit (sum, average, min, etc)
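The underlying idea, independent of the tool, is to turn the timestamps into a numeric count per day and hour first and only then summarize that count. A minimal plain-Python sketch, with made-up timestamps and a hypothetical `hourly_counts` helper:

```python
from collections import Counter
from datetime import datetime
from statistics import mean

# Hypothetical event timestamps, one per occurrence.
timestamps = [
    "2018-03-01 04:05:00", "2018-03-01 04:20:00", "2018-03-01 04:55:00",
    "2018-03-01 05:10:00",
    "2018-03-02 04:15:00",
]


def hourly_counts(stamps):
    """Count occurrences per (date, hour); this plays the role of the row count measure."""
    counts = Counter()
    for stamp in stamps:
        moment = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        counts[(moment.date().isoformat(), moment.hour)] += 1
    return counts


counts = hourly_counts(timestamps)

# Once the count is a number per day and hour, ordinary statistics apply.
four_oclock = [n for (_day, hour), n in counts.items() if hour == 4]
summary = {
    "sum": sum(four_oclock),
    "min": min(four_oclock),
    "max": max(four_oclock),
    "mean": mean(four_oclock),
}
print(summary)
```

Hours with no events do not appear in `counts` at all; joining against a calendar table, as the answer suggests, is what would supply those zero rows.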

Cognos - Showing every month on x-axis when some months don't have values

Let me first say I am very new to Cognos and have mainly learned by just manipulating items within active reports. I am having an issue with creating a graph that acts like a time series. I want it to display every month (with multiple values in some months and none in others). I want to visually see gaps between data points (ex: we order products every 3 months starting in January, so we should see gaps in the months we do not order products - like February and March).
I have tried changing the label control to manual and setting display frequency to 1. However, I think my issue is that there is not any data within certain months.
You are correct in that your problem is lack of data. A standard inner join will drop rows where there is not a corresponding row in both tables, resulting in gaps.
There are two solutions available:
Use a union to create "dummy" records for each date
Manually specify an outer join between the date table and the table containing the rest of information
Since the first technique is the most common, I'll outline the basic steps for it here.
Create a new query
Add your month data item to the query
Create a 'dummy' data item for your measure. Use 0 for its expression.
If there is a date range filter in the main query, apply it here
Create a union
Drag over your new query into the union
Drag over your original query into the union
Pull in the date and measure data items into the union query
Set the Aggregate Function property of the measure to Total
Use the union query as the source for your chart
For every month with measure data you will have two rows, one with the measure amount and one with 0. The two rows will be combined by the auto-group and summarize function. The measures will be added together. Anything added to 0 will end up as the original amount.
For months with no measure data, there will only be the 'dummy' row with 0 for the measure and it will be represented in your chart.
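The union technique can be illustrated with a small plain-Python sketch. It is not Cognos syntax; the order data, the month list and the name `fill_missing_months` are assumptions, and the list of months is taken as given rather than queried.

```python
from collections import defaultdict

# Hypothetical measure rows: (month, amount). February and March have no orders.
orders = [("2016-01", 120), ("2016-01", 30), ("2016-04", 80)]
# Assumed list of every month that should appear on the axis.
all_months = [f"2016-{m:02d}" for m in range(1, 7)]


def fill_missing_months(measure_rows, months):
    """Union the real rows with zero-valued dummy rows, then total per month."""
    dummy_rows = [(month, 0) for month in months]
    union = dummy_rows + list(measure_rows)
    totals = defaultdict(int)
    for month, amount in union:
        totals[month] += amount  # adding 0 leaves real amounts unchanged
    return [(month, totals[month]) for month in sorted(totals)]


series = fill_missing_months(orders, all_months)
for month, amount in series:
    print(month, amount)
```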

How do I get the proper average in a pivot based on pivot data?

I'm trying to get the average number of "on time shipment" based on items rolled up to "ship numbers" and then by "order number". I have one order number in this scenario that is shipped via multiple shipments. It seems to me that after rolling it up via PowerPivot and then creating a pivot table, it's calculating the average based on the total lines of the "order number" instead of the pivot.
PowerPivot Data:
Pivot based on data above:
How can I get the average number based on the pivot table rather than the PowerPivot total data of the order number? I'm probably not making any sense, but hopefully the images below explain it better. As you can see, when you roll up the items by ship number then by order number, you'll see that the actual average is 0.6 but the pivot is showing 0.5.
Help!
Technically speaking, the average is correct - if you look at the source data, for some reason all rows are duplicated and if you do regular average calculation, it's actually 0.5.
What you are looking for is calculating average for distinct values, which can be done easily with AVERAGEX function.
I have copied your table and created those 2 Calculated Fields (in Excel 2010, it's Measures):
Average on Time:
=AVERAGE(Table1[On Time])
Average on Time (UNIQUE)
=AVERAGEX(VALUES(Table1[Ship Number]), [Average on Time])
Using AVERAGEX with the VALUES() function makes it easy to evaluate an expression once per unique value only.
If you then put both measures on a PivotTable, you should get this:
First column is same as yours (using "regular" AVERAGE function). The second one shows the average calculated over distinct (unique) values of Ship Numbers.
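The difference between the two measures can be mimicked in plain Python. The rows below are invented (they are not the asker's data) and the function names are hypothetical; the sketch only shows why averaging once per distinct ship number differs from averaging over all rows when some ships appear more than once.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (ship_number, on_time). Ship S4 appears twice.
rows = [("S1", 1), ("S2", 1), ("S3", 1), ("S4", 0), ("S4", 0), ("S5", 0)]


def plain_average(data):
    """Average over every row, like AVERAGE(Table1[On Time])."""
    return mean(on_time for _ship, on_time in data)


def average_over_ships(data):
    """Average of per-ship averages, in the spirit of AVERAGEX over VALUES(Ship Number)."""
    per_ship = defaultdict(list)
    for ship, on_time in data:
        per_ship[ship].append(on_time)
    return mean(mean(values) for values in per_ship.values())


print(plain_average(rows))
print(average_over_ships(rows))
```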
Hope this helps.
PS: This great article by Kasper de Jonge helped me quite a bit with similar scenarios.
