Calculate average based on weekly dates in Excel - excel

I am completely new to Excel and I have an assignment involving about 12k rows. Basically, I have to calculate the average of all the values that share the same date. The dates form an arithmetic sequence with a difference of 7 days, so they look like 2/2/52, 2/9/52, 2/16/52, 2/23/52, etc. I know how to find the average of a specific group of values, but selecting one group at a time will take forever because there must be about 5k different dates. I am therefore looking for an automated way to find the averages without selecting the values by hand every time. The following is an example of the spreadsheet that I am dealing with:
DATE------------------VALUE
2/2/52----------------3.5
2/2/52----------------3.4
2/2/52----------------2.5
2/9/52----------------4.5
2/9/52----------------3.6
2/16/52---------------2.4
2/16/52---------------4.1
2/16/52---------------3.1
2/16/52---------------4.2
2/16/52---------------2.34
Also, please note that the number of rows per date is not fixed, meaning the date does not change every n rows.

This is a perfect candidate for a pivot table.
Here is your data.
DATE VALUE
2/2/1952 3.5
2/2/1952 3.4
2/2/1952 2.5
2/9/1952 4.5
2/9/1952 3.6
2/16/1952 2.4
2/16/1952 4.1
2/16/1952 3.1
2/16/1952 4.2
2/16/1952 2.34
Select the data and insert a pivot table.
Drag DATE into Rows.
Drag VALUE into Values.
Open the drop-down on the VALUE field in Values and choose Value Field Settings.
Select Average.
The resulting pivot table:
Row Labels Average of VALUE
2/2/1952 3.133333333
2/9/1952 4.05
2/16/1952 3.228
Grand Total 3.364
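For readers who want to check the pivot table result outside Excel, the same group-and-average step can be sketched in Python. This is only an illustration, not part of the Excel answer: the function name average_by_date is made up here, and the rows are the sample data from the question.

```python
from collections import defaultdict

# Sample rows copied from the question: (date, value)
rows = [
    ("2/2/1952", 3.5), ("2/2/1952", 3.4), ("2/2/1952", 2.5),
    ("2/9/1952", 4.5), ("2/9/1952", 3.6),
    ("2/16/1952", 2.4), ("2/16/1952", 4.1), ("2/16/1952", 3.1),
    ("2/16/1952", 4.2), ("2/16/1952", 2.34),
]


def average_by_date(rows):
    """Group values by date and return {date: mean of that date's values}.

    Mirrors DATE in Rows plus "Average of VALUE" in the pivot table.
    The group sizes may differ, so no fixed row pattern is needed.
    """
    groups = defaultdict(list)
    for date, value in rows:
        groups[date].append(value)
    return {date: sum(vals) / len(vals) for date, vals in groups.items()}


averages = average_by_date(rows)
# The pivot table's Grand Total is the mean over all rows, not the mean of means.
grand_total = sum(value for _, value in rows) / len(rows)

for date, avg in averages.items():
    print(date, round(avg, 9))
print("Grand Total", round(grand_total, 3))
```

The per-date means and the grand total should agree with the pivot table output above, up to rounding.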

Related

Calculate average based on a value column (count) in a pivot table

I'm looking for a way to add an extra column to a pivot table that averages the monthly counts (the "Count of records" column) over the selected time period (currently 2016 has one month, 2017 a full year, and 2018 five months). Every month would show the same number, the average for its year, and it needs to change dynamically when a different period is selected, such as a full year or only 4 months. I need the column inside the pivot table so that it can be used for a future pivot chart.
I can't simply use Average, because each of my records appears only once and I use Count to aggregate them (the "Count of records" column).
My current data looks like this:
The final result should look like this:
I assume that it can somehow be done with the "calculated field" option, but I couldn't make it work.
Greatly appreciate any help!
Using the Data Model (built into Excel 2013 and later), you can write really cool formulas inside PivotTables called measures that can do this kind of thing. Take the example below:
As you can see, the Cust Count & Average field gives a count of transactions by month but also gives the average of those monthly readings for the subtotal lines (i.e. the 2017 Total and 2018 Total lines) using the below DAX formula:
=AVERAGEX(SUMMARIZE(Table1,[Customer (Month)],"x",COUNTA(Table1[Customer])),[x])
That just says "Summarize this table by count of the customer field by month, call the resulting summarization field 'x', and then give me the average of that field x".
Because DAX measures are executed within the context of the PivotTable, you get the count that you want for months, and you get the average that you want for the yearly subtotals.
It's hard to explain, but it demonstrates that DAX can certainly do this for you.
See my answer at the following link for an example of how to add data to the Data Model and how to subsequently write measures:
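To make the "count by month, then average those counts" logic concrete, here is a rough Python sketch of what the AVERAGEX over SUMMARIZE measure computes at a yearly subtotal. It is not DAX and does not model filter context; the records and function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical transactions: one (year, month) entry per record.
records = [
    (2017, 1), (2017, 1), (2017, 2), (2017, 2), (2017, 2), (2017, 3),
    (2018, 1), (2018, 1), (2018, 1), (2018, 1), (2018, 2), (2018, 2),
]


def monthly_counts(records):
    """Count records per (year, month): the summarization step, field 'x'."""
    return Counter(records)


def yearly_average_of_monthly_counts(records):
    """Average the monthly counts within each year: the averaging step."""
    per_year = defaultdict(list)
    for (year, _month), count in monthly_counts(records).items():
        per_year[year].append(count)
    return {year: sum(counts) / len(counts) for year, counts in per_year.items()}


counts = monthly_counts(records)
yearly = yearly_average_of_monthly_counts(records)
print(counts[(2017, 2)])  # count shown on a month row
print(yearly[2017])       # average shown on the 2017 Total row
print(yearly[2018])       # average shown on the 2018 Total row
```

Only months that actually contain records enter the average here, so selecting a shorter period changes the result, which is the dynamic behaviour the question asks for.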
Using the Excel SMALL function with filtering criteria AND ignoring zeros
I also recommend grabbing yourself Matt Allington's book "Supercharge Excel: When You Learn to Write DAX", and perhaps even taking his awesome online course, because it covers this kind of thing very well and will save you significant head-scratching compared to going it alone.

Statistical functions on non-numerical value

I am not looking for any code or formula but a rationale/logic.
Background: My data set comes in Date/Time format where a new timestamp is created for each new occurrence of an event.
My goal is to calculate the number of occurrences within each hour for a given day. Unfortunately, the system does not capture the number of occurrences per period as integers, so I have to count the number of times each hour value appears, i.e. the number of times the 4 o'clock hour appears. I am currently using a pivot table in Excel to count the number of times each hour appears. The fields in Rows are hour and date, and the field in Values is Count of hour.
The trouble is that I cannot use any of the summarize functions to get things like sum, min, max, percentile, and standard deviation. For example, changing Count to Sum only adds up the hour values, so the sum for the 4 o'clock hour returns 12 instead of 3. I therefore have to use array formulas on the pivot table to get max, min, etc.
If I were to use this data in data viz tools like Tableau or Power BI, I wouldn't get very far. I am looking for suggestions or a workaround that would let me reshape my data so that it can be used in pivot tables in Excel and in data viz tools.
I know my question is not specific to one tool, but I am looking to enhance my understanding of data and data manipulation techniques.
EDIT: Please see attached image
Build a data model using Power Pivot. Join your fact table to a calendar dimension table. Create a row count measure; you can then summarise that measure to suit (sum, average, min, etc.).
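The rationale can be sketched in Python: first turn the raw timestamps into one numeric count per date and hour (the "row count measure"), and only then apply statistics to those counts. The timestamps and function names below are hypothetical, and this is plain Python rather than Power Pivot.

```python
from collections import Counter
from datetime import datetime
from statistics import mean

# Hypothetical event log: one timestamp per occurrence.
timestamps = [
    "2020-01-01 04:05", "2020-01-01 04:20", "2020-01-01 04:55",
    "2020-01-01 05:10",
    "2020-01-02 04:30",
    "2020-01-02 05:01", "2020-01-02 05:45",
]


def hourly_counts(timestamps):
    """Return a Counter keyed by (date, hour) holding the number of events."""
    counts = Counter()
    for text in timestamps:
        ts = datetime.strptime(text, "%Y-%m-%d %H:%M")
        counts[(ts.date(), ts.hour)] += 1
    return counts


def summarize_hour(counts, hour):
    """Summarize the per-day counts for one hour of the day."""
    values = [n for (_day, h), n in counts.items() if h == hour]
    return {"sum": sum(values), "min": min(values),
            "max": max(values), "mean": mean(values)}


counts = hourly_counts(timestamps)
four = summarize_hour(counts, 4)
five = summarize_hour(counts, 5)
print(four)
print(five)
```

Note that a date and hour with no events never appears in the counts, so min and mean ignore empty hours in this sketch. Joining to a calendar table, as suggested above, is one way to make those zero rows explicit.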

Spotfire: Select data from column based on criteria

I have a data table in Spotfire which contains two columns I'm interested in: Time (31/01/2015 for example), and Value (integer).
I want the most recent date (e.g. December 2015) to be treated as the current time. Then I want to select Value based on the previous 1 month, 3 months, 6 months, etc. So if I want all the values for the past 6 months, it should compute Sum(Value) over December 2015, November 2015, October 2015, September 2015, August 2015 and July 2015 and return that.
So far I've only been able to accomplish this by manually performing the task in Excel before importing the data into Spotfire. Is there any way to create a calculated column for each of the periods I want (past month, 3 months, etc.)?
There are likely a number of ways to solve this, but I'm going to give one suggestion and we'll see how it fits your specific case.
You can add a calculated column for each timespan you are interested in, defined like this:
Sum(if (DateAdd('month', 3, [Time]) >= Max([Time]), [Value], null))
This example would get you a column with all the values that have occurred in the past 3 months; replace the number 3 to adjust it to the timespans you are interested in. A full sum of the calculated column would get you the total for that timespan.
It might be nicer to use a boolean column instead of duplicating the value column. Your calculated columns would then be defined as:
DateAdd('month', 3, [Time]) >= Max([Time])
When calculating totals you would then use an if statement using the calculated column, like this:
Sum(if([3Months],[Value],null))
where [3Months] is a boolean column.
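The filtering condition can be tried out in Python with made-up month-end data. The add_months helper is my own stand-in for Spotfire's DateAdd and clamps the day at month ends, which may not match DateAdd exactly; the rows and function names are hypothetical.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift a date by whole months, clamping the day to the target month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def in_last_months(d, latest, months):
    """Boolean flag: the date shifted forward by `months` reaches the latest date."""
    return add_months(d, months) >= latest


def sum_last_months(rows, months):
    """Sum the values whose date falls within `months` of the most recent date."""
    latest = max(d for d, _ in rows)
    return sum(v for d, v in rows if in_last_months(d, latest, months))


# Hypothetical month-end rows: (Time, Value)
rows = [
    (date(2015, 7, 31), 10), (date(2015, 8, 31), 20), (date(2015, 9, 30), 30),
    (date(2015, 10, 31), 40), (date(2015, 11, 30), 50), (date(2015, 12, 31), 60),
]

print(sum_last_months(rows, 3))
print(sum_last_months(rows, 6))
```

With these month-end dates, 30 September shifted by 3 months lands on 30 December, just short of the latest date, so the 3-month total covers October to December only. Behaviour at month boundaries is worth checking against your own data.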

How do I get the proper average in a pivot based on pivot data?

I'm trying to get the average number of "on time shipment" based on items rolled up to "ship numbers" and then by "order number". I have one order number in this scenario that is shipped via multiple shipments. It seems to me that after rolling it up via PowerPivot and then creating a pivot table, it's calculating the average based on the total lines of the "order number" instead of the pivot.
PowerPivot Data:
Pivot based on data above:
How can I get the average number based on the pivot table rather than the PowerPivot total data of the order number? I'm probably not making any sense, but hopefully the images below explain it better. As you can see, when you roll up the items by ship number then by order number, you'll see that the actual average is 0.6 but the pivot is showing 0.5.
Help!
Technically speaking, the average is correct: if you look at the source data, for some reason all rows are duplicated, and a regular average calculated over them really is 0.5.
What you are looking for is the average over distinct values, which can be done easily with the AVERAGEX function.
I have copied your table and created these two Calculated Fields (called Measures in Excel 2010):
Average on Time:
=AVERAGE(Table1[On Time])
Average on Time (UNIQUE)
=AVERAGEX(VALUES(Table1[Ship Number]), [Average on Time])
Using AVERAGEX with the VALUES() function makes it easy to evaluate any expression ONLY for unique values.
If you then put both measures on a PivotTable, you should get this:
The first column is the same as yours (using the "regular" AVERAGE function). The second one shows the average calculated over distinct (unique) values of Ship Number.
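The difference between the two measures can be reproduced with a small Python sketch. The rows below are made up (the original table was an image) and were chosen so that the row-level average is 0.5 while the per-shipment average is 0.6, matching the numbers in the question; the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (ship number, on time flag). Ship S4 has two item lines.
rows = [("S1", 1), ("S2", 1), ("S3", 1), ("S4", 0), ("S4", 0), ("S5", 0)]


def average_on_time(rows):
    """Plain average over every row, like =AVERAGE(Table1[On Time])."""
    return sum(flag for _, flag in rows) / len(rows)


def average_on_time_unique(rows):
    """Average of the per-shipment averages, one entry per distinct ship number.

    This follows the idea of AVERAGEX(VALUES(Table1[Ship Number]), [Average on Time]).
    """
    by_ship = defaultdict(list)
    for ship, flag in rows:
        by_ship[ship].append(flag)
    per_ship = [sum(flags) / len(flags) for flags in by_ship.values()]
    return sum(per_ship) / len(per_ship)


regular = average_on_time(rows)
unique = average_on_time_unique(rows)
print(regular, unique)
```

Each shipment counts once in the second calculation, no matter how many item lines it has, which is why the two results differ.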
Hope this helps.
PS: This great article by Kasper de Jonge helped me quite a bit with similar scenarios.

Excel- Average days between group of dates

I'm trying to use Excel to calculate the average frequency of delivery for a set of parts. I have a data set with two columns: part number and delivery date. I'm trying to figure out how often parts get delivered, on average, in terms of days. I tried using nested functions like averageif(a2=a2:b9999,datedif(xx)) etc., but to no avail. I'm looking for this:
Input:
Part A 8.1
Part A 8.8
Part A 8.15
Output: Part A Average Delivery - Every 7 Days
etc etc. Any ideas?
If your dates are in column B:
=(MAX(B:B)-MIN(B:B)--1)/COUNT(B:B)
or:
=(MAX(B:B)+1-MIN(B:B))/COUNTA(B:B)
should serve.
Edit
If you have multiple parts (the above assumed only one) and the list is in no particular order, then a PivotTable may be best (say with its top left-hand corner in D1), in Tabular form, with Part for Row Labels and Delivery three times for Σ Values (the first as MAX, the second as MIN and the third as COUNT). Then =(1+E3-F3)/G3 copied down should give you the average number of days between deliveries. For example, 5 for your sample data (3 deliveries in 15 days).
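As a cross-check of the per-part MAX, MIN and COUNT approach, here is a Python sketch using the sample from the question. The year 2015 and the function names are assumptions made for illustration. The second function is not from the answer above: it divides the span by the number of gaps instead, which is the calculation that yields the 7 days the question expected.

```python
from collections import defaultdict
from datetime import date

# Sample deliveries from the question; the year is assumed.
deliveries = [
    ("Part A", date(2015, 8, 1)),
    ("Part A", date(2015, 8, 8)),
    ("Part A", date(2015, 8, 15)),
]


def dates_by_part(deliveries):
    """Collect delivery dates per part; the input order does not matter."""
    grouped = defaultdict(list)
    for part, day in deliveries:
        grouped[part].append(day)
    return grouped


def days_per_delivery(deliveries):
    """(MAX - MIN + 1) / COUNT per part, as in =(1+E3-F3)/G3."""
    return {part: ((max(days) - min(days)).days + 1) / len(days)
            for part, days in dates_by_part(deliveries).items()}


def average_gap(deliveries):
    """Alternative (not in the answer): (MAX - MIN) / (COUNT - 1) per part."""
    return {part: (max(days) - min(days)).days / (len(days) - 1)
            for part, days in dates_by_part(deliveries).items()
            if len(days) > 1}


per_delivery = days_per_delivery(deliveries)
gaps = average_gap(deliveries)
print(per_delivery["Part A"], gaps["Part A"])
```

Which of the two figures is the right "frequency" depends on whether you want days per delivery over the whole span or the average interval between consecutive deliveries.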
