I have a huge database of records and I'm finding it a nightmare to analyse the data.
Objective:
Group my data by Country of Purchase (rows), by Years/Months (rows), by Product (columns) with the Sum of Paid amount being the value.
Let me explain:
Below is a sample excerpt from my table.
And here is the result that I am looking for that I was able to achieve using an Excel Pivot table:
Why use MS Access:
My table has over 3 million records stored across many workbooks, and Excel has a limit of about 1 million rows per sheet. Excel also crashes more often than not when loading more than 500k rows.
I installed an older version of MS Access (2010), which has a pivot table option, but it was very slow and did not allow me to group correctly. I then tried using a combination of queries and reports to arrive at my result, to no avail.
Any help will be very welcome :)
How about doing the aggregation in Access and then the pivot in Excel?
SELECT country, year, month, product, sum(paid)
FROM myTable
GROUP BY country, year, month, product
(year and month are derived using Access's date functions. Alternatively, you could keep the value as a date to retain date functionality in the pivot: just set it to the first of the relevant month.)
Then use this query as the source of the pivot table. The pivot table then basically just does the formatting, which it can hopefully do quickly enough.
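To illustrate the "aggregate first, pivot later" idea, here is a minimal sketch in Python using SQLite as a stand-in for Access. The column name purchase_date and the sample rows are assumptions, and strftime is SQLite's date function, not Access's (Access has its own date functions for the same job). It collapses each date to the first of its month, as suggested above, and sums the paid amount per country, month and product.

```python
import sqlite3

# Hypothetical sample rows: (country, purchase_date, product, paid).
rows = [
    ("France", "2015-01-05", "Widget", 10.0),
    ("France", "2015-01-20", "Widget", 5.0),
    ("France", "2015-02-03", "Gadget", 7.5),
    ("Spain", "2015-01-11", "Widget", 4.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (country TEXT, purchase_date TEXT, product TEXT, paid REAL)"
)
conn.executemany("INSERT INTO myTable VALUES (?, ?, ?, ?)", rows)

# Collapse each purchase date to the first of its month, then aggregate.
# The result has one row per country/month/product, which is far smaller
# than the raw table and is what the pivot table would read.
summary = conn.execute(
    """
    SELECT country,
           strftime('%Y-%m-01', purchase_date) AS month_start,
           product,
           SUM(paid) AS total_paid
    FROM myTable
    GROUP BY country, month_start, product
    ORDER BY country, month_start, product
    """
).fetchall()

for row in summary:
    print(row)
```

The pivot then only has to arrange these pre-summed rows, rather than scan millions of records.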
Related
I am trying to provide one of our FHF fundraising campaign managers with a tool they can use to help them pick postcodes to send a campaign to.
Specifically, I'm seeking help writing an Excel Power Pivot Measure for a cumulative total that counts only the visible cells in a Pivot Table, where the source data is OLAP (i.e. an Excel Data Model, which seems to constrain the possible solutions).
The linked worksheet is a simplified example of the Excel worksheet tool I want to give to the Campaign Manager in a fundraising setting: https://www.dropbox.com/scl/fi/3pfc8ix2ekduoocsa90zs/minReprodExample_share1.xlsx?dl=0&rlkey=m6gqf5zbjxhl6dzkjq653it3d
We have two predictive models, each giving a different predicted response rate per postcode.
The 'data' tab contains the raw data
The 'modelA_pivot' tab contains a pivot table ranking the raw data according to the predicted response rates from ModelA
'model_pivotB' does the identical pivot for ModelB
Focussing just on the 'modelA_pivot' tab for now, you can see a slicer that allows the campaign manager to exclude postcodes with only 3,000 addresses (or some other threshold level of their choosing).
The slicer's exclusion of a single postcode with 3,000 addresses in this example is why the rank column in the pivot runs 1, 3, 4, 5 (i.e. it is missing the rank 2 postcode, which has 3,000 addresses).
In the pivot, the last column - 'addressCount_postcodeCumulative_modelA' - is based on an excel Power Pivot 'Measure'
And the current formula of the measure is
=VAR rankCurrent = MAX([rank_modelA])
RETURN
    CALCULATE(
        SUM([addressCount_postcodePer]),
        FILTER(
            ALL(data),
            [rank_modelA] <= rankCurrent
        )
    )
You can see in the 'modelA_pivot' tab that the 'addressCount_postcodeCumulative_modelA' column doesn't work or make sense, because it includes the cumulative total of ALL addresses (including the 3,000 addresses that are excluded from the pivot).
Can anyone help me with a 'Measure' formula that sums only the addresses of the postcodes that are visible and included in the pivot table?
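To make the desired behaviour concrete, here is a minimal sketch in Python (not DAX) with made-up postcodes and address counts. It contrasts a running total computed over every row, which is what the ALL(data) filter produces, with a running total computed only over the rows left after the slicer. In DAX, ALLSELECTED is the function normally used to keep slicer filters in place; whether swapping it in for ALL fixes this particular workbook is untested here.

```python
# Hypothetical rows: (postcode, rank_modelA, addressCount_postcodePer).
data = [
    ("AB1", 1, 5000),
    ("AB2", 2, 3000),
    ("AB3", 3, 8000),
    ("AB4", 4, 6000),
    ("AB5", 5, 4000),
]


def cumulative_by_rank(rows):
    """Running total of address counts in rank order, over the rows given."""
    total = 0
    out = []
    for _postcode, rank, count in sorted(rows, key=lambda r: r[1]):
        total += count
        out.append((rank, total))
    return out


# Stand-in for the slicer: the postcode with 3,000 addresses is excluded.
visible = [r for r in data if r[2] != 3000]

# What the current measure shows: totals over ALL rows, displayed only
# for the ranks that remain visible in the pivot.
all_rows = dict(cumulative_by_rank(data))
visible_only = cumulative_by_rank(visible)
current_measure = [(rank, all_rows[rank]) for rank, _ in visible_only]

print(current_measure)  # includes the hidden 3,000 addresses
print(visible_only)     # the running total being asked for
```

In this toy data the two columns differ by exactly 3,000 from rank 3 onward, which is the discrepancy described above.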
FYI, in case anyone is wondering why a 'Measure': I am using Excel Power Query and Power Pivot so that if/when the data upstream changes, the data team will be able to refresh this worksheet with a single click (more or less), and the campaign manager still gets to use Excel (the tool they know and like).
But using Excel Power Query and Power Pivot, and setting this up as an Excel Data Model (which uses OLAP-structured data), also introduces constraints that I'm hoping the answer to this puzzle can fit inside.
Constraints such as:
I can't seem to put calculated fields onto the pivot table, and
I don't want to just add conventional Excel columns alongside the pivot table, as the size of the pivot table is dynamic and depends on the Campaign Manager's choices (like their choice of threshold number of addresses in the slicer).
I have a very large data set of 15,000 rows with a few descriptive columns; the data itself is stored as monthly sales in 200 columns, one for each month. There are a couple of other data sets I need to connect this to, so I want to be able to use Power Pivot to build the relationships.
How do I go about harnessing Power Pivot to build a dashboard for this data? Specifically, is there a way to link all those date columns to a DimDate table so I can connect it to other data sets without reformatting the whole data set?
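For context on why the month columns are the sticking point: a relationship to a DimDate table needs a single date column, so the usual shape is one row per item per month. Here is a minimal sketch of that wide-to-long reshape in Python. The column names, the "YYYY-MM" header format and the sample values are all assumptions for illustration; Power Query offers an equivalent unpivot step.

```python
from datetime import date

# Hypothetical wide rows: descriptive columns plus one column per month.
wide = [
    {"item": "A", "region": "North", "2015-01": 100, "2015-02": 120},
    {"item": "B", "region": "South", "2015-01": 80, "2015-02": 95},
]
id_columns = ["item", "region"]


def unpivot(rows, id_cols):
    """Turn each month column into its own row with a real date value."""
    long_rows = []
    for row in rows:
        for col, value in row.items():
            if col in id_cols:
                continue
            # Assumes month headers look like "YYYY-MM".
            year, month = map(int, col.split("-"))
            record = {c: row[c] for c in id_cols}
            record["date"] = date(year, month, 1)
            record["sales"] = value
            long_rows.append(record)
    return long_rows


long_table = unpivot(wide, id_columns)
for record in long_table:
    print(record)
```

The resulting date column is what would join to DimDate.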
Longtime answer-seeker, first-time question-asker here, so I'm open to feedback about how I'm asking as well. I'm relatively new to Excel's PowerPivot but feel like I have a handle on it for the most part.
I am using PowerPivot for Excel 2010. I have data that I only receive weekly totals for, and I use the Monday of that week as the primary key in the table I call 'WeeklyTracking'. I created a relationship from that to my Date Table so that I can filter/analyze by month, year, etc. I get no error when I make that relationship, it is one-to-many (I checked for duplicates in my WeeklyTracking table), and it is showing as 'active'.
However, when I create a pivot table, it doesn't separate the data by my Date Table fields; it simply repeats the total for the column. (Screenshots: what my pivot table shows me; table relationships.)
I tried disconnecting all other table relationships, and I even tried converting dates to numeric values and linking those, but to no avail. When I use the date within the 'WeeklyTracking' table as the column labels, it separates out by date just fine, which leads me to believe the problem is the relationship. But I did something very similar with data I get monthly and didn't have any problems, so I can't figure out what's different.
Any ideas?
EDIT: On closer inspection, it's actually not working for my monthly report either. But I still don't understand why not: there's a primary key in each table...
UPDATE: I tried creating an ID number for each week using a formula and creating the relationship on that, and it didn't work either.
I am trying to create an ageing report from data in Power Query. I'm able to do the following if I pull the data into an Excel table:
Age column: =TODAY()-[@[Request Date]]
Ageing buckets: =LOOKUP(J19372,{-60,2.1,5.1,7.1},{"0-2","2+","5+","7+"})
However, every time I refresh the data in the table from the query, I need to copy the formulas down again. This is fine for me, but I want it to be automatic for the other people I send the file to.
Is there a way to do those calculations in PowerQuery?
Sure you can. The one caveat is that today's date is retrieved by Power Query, so the ages only update when you refresh the table. See one solution for banding on Ken Puls's blog: http://www.excelguru.ca/blog/2016/02/29/creating-a-banding-function-in-power-query/
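To spell out the logic that would need reproducing in Power Query, here is a minimal sketch in Python (not M). It reuses the thresholds and labels from the LOOKUP formula in the question; the function names are made up for illustration. LOOKUP picks the largest threshold that is less than or equal to the age, which is what the bisect call mirrors.

```python
from bisect import bisect_right
from datetime import date

# Thresholds and labels copied from the LOOKUP formula in the question.
THRESHOLDS = [-60, 2.1, 5.1, 7.1]
LABELS = ["0-2", "2+", "5+", "7+"]


def age_in_days(request_date, today=None):
    """Days between the request date and today (or a supplied date)."""
    today = today or date.today()
    return (today - request_date).days


def ageing_bucket(age):
    """Label of the largest threshold <= age, or None if below them all."""
    i = bisect_right(THRESHOLDS, age) - 1
    return LABELS[i] if i >= 0 else None


age = age_in_days(date(2024, 1, 1), today=date(2024, 1, 9))
print(age, ageing_bucket(age))
```

Passing today explicitly keeps the example repeatable; in a refreshed query the current date would be read at refresh time instead.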
I have a Pivot Table structure as follows:
ROWS:
+-State
+---Customer
+-----Brand
Columns:
+-Cost
I would like to have another column that contains the number of Customers in each State. The issue is that my data contains every order the customers have placed, so when I try to get the count of Customers, it counts every occurrence of each customer in the column. Another issue is that my data is 40,000 rows, so I want to avoid having to edit the raw data.
I can easily do this with brute force, but I was wondering if there is any way to do this with standard pivot tables and no add-ons. The pivot table already does a nice job of consolidating the unique values for customers; now I just need a count of those unique values.
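For clarity, the number being asked for is a distinct count, not a row count. A minimal sketch of the difference in Python, with made-up order rows (the states, customers, brands and costs are all hypothetical):

```python
from collections import defaultdict

# Hypothetical order rows: (state, customer, brand, cost).
# A customer appears once per order, so plain counting over-counts them.
orders = [
    ("TX", "Acme", "BrandX", 100.0),
    ("TX", "Acme", "BrandY", 50.0),
    ("TX", "Bolt", "BrandX", 75.0),
    ("CA", "Crux", "BrandZ", 20.0),
    ("CA", "Crux", "BrandZ", 30.0),
]

# Collect each state's customers in a set so repeats collapse to one entry.
customers_by_state = defaultdict(set)
for state, customer, _brand, _cost in orders:
    customers_by_state[state].add(customer)

customer_counts = {state: len(names) for state, names in customers_by_state.items()}
print(customer_counts)
```

A plain count would give 3 for TX and 2 for CA here, whereas the distinct count gives 2 and 1.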