Expressing percentage - subquery

I need to divide the value of each record in a certain field of a query result by the sum of all the values in that field. This will let me calculate the percentage that each record represents of the whole. The result must be expressed in a new field in the same query.
That is: each record in ColumnA must be divided by the sum of ColumnA, then expressed as a fraction/percentage in a new, neighbouring ColumnB.
I have tried this in Query1:
Total: (select sum(ColumnA) from Query1) - then, in a second expression, Percentage: [CountOfColumnA]/[Total]
This worked once, but then Access stopped me with a circular reference error, because Query1 was selecting from Query1.
Now I am trying to rewrite it and reach the answer in only one subquery to avoid the circular problem. I can easily do this in Excel, but don't know enough about expressions and coding to manage it in Access.
I found a question on this forum from someone with a similar problem, but I do not understand the answer; I do not know enough about SQL (nothing, actually) to adapt it to my case. My adaptation was something like this:
SELECT Query1, ColumnA / ((SELECT SUM(ColumnA) FROM Records) AS Percentage FROM Records

Assuming Query1 is a strange field name, try with:
Select
    Query1,
    ColumnA,
    ColumnA / (Select Sum(T.ColumnA) From Records As T) As Percentage
From
    Records
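
If you want the value shown as a true percentage rather than a fraction, multiply by 100 (or leave it as a fraction and set the column's Format property to Percent in the query design grid). A minimal sketch using the same names as above:

Select
    Query1,
    ColumnA,
    100 * ColumnA / (Select Sum(T.ColumnA) From Records As T) As Percentage
From
    Records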

Using ADODB SQL in VBA, why are Strings truncated [to 255] only when I use grouping?

I’m using ADODB in VBA to query Sheet1. If I fetch the data with the SQL query below, without grouping, I get all the characters from the Comment column.
However, if I use grouping, the strings are truncated to 255 characters.
Note: my first data row contains an 800-character comment, so the driver should have inferred the column's data type correctly.
Here is my query without grouping:
Select Product, Value, Comment, len(comment) from [sheet1$A1:T10000]
With grouping
Select Product, sum(value), Comment, len(comment) from [sheet1$A1:T10000] group by Product, Comment
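
For context, here is a minimal VBA sketch of the kind of ADODB call being described (the ACE provider version and the HDR setting are assumptions about the setup):

Dim cn As Object, rs As Object
Set cn = CreateObject("ADODB.Connection")
' Open the workbook itself as a database via the ACE OLE DB provider
cn.Open "Provider=Microsoft.ACE.OLEDB.12.0;" & _
        "Data Source=" & ThisWorkbook.FullName & ";" & _
        "Extended Properties=""Excel 12.0;HDR=YES"""
Set rs = cn.Execute("Select Product, Value, Comment, Len(Comment) From [sheet1$A1:T10000]")
' The ungrouped query returns Comment untruncated; the GROUP BY version does not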
Thanks for posting this! In 20+ years of database development using ADO recordsets I had never faced this issue until this week. Once I traced the truncation to the recordset I was really scratching my head; I couldn't figure out how or why it was happening until I found your post and you got me focused on the GROUP BY. Sure enough, that was the cause (some kind of ADO bug, I guess). I was able to work around it by putting correlated scalar subqueries in the SELECT list instead of using JOIN and GROUP BY.
To elaborate...
At least 9 times out of 10 (in my experience) JOIN/GROUP BY syntax can be replaced with correlated scalar subquery syntax, with no appreciable loss of performance. That's fortunate in this case since there is apparently a bug with ADO recordset objects whereby GROUP BY syntax results in the truncation of text when the string length is greater than 255 characters.
The first example below uses JOIN/GROUP BY. The second uses a correlated scalar subquery. Both would/should provide the same results. However, if any comment is greater than 255 characters these 2 queries will NOT return the same results if an ADODB recordset is involved.
Note that in the second example the last column in the SELECT list is itself a full select statement. It's called a scalar subquery because it will only return 1 row / 1 column. If it returned multiple rows or columns an error would be thrown. It's also known as a correlated subquery because it references something that is immediately outside its scope (e.emp_number in this case).
SELECT e.emp_number, e.emp_name, e.supv_comments, SUM(i.invoice_amt) AS total_sales
FROM employees e INNER JOIN invoices i ON e.emp_number = i.emp_number
GROUP BY e.emp_number, e.emp_name, e.supv_comments

SELECT e.emp_number, e.emp_name, e.supv_comments,
    (SELECT SUM(i.invoice_amt) FROM invoices i WHERE i.emp_number = e.emp_number) AS total_sales
FROM employees e

query does not include the specified expression 'ColB' as part of an aggregate function

I am using this query and it was working perfectly, but with Excel as a database it gives me an aggregate function error.
What I have tried: once I add all the columns to the GROUP BY, every row becomes its own group, so the SUM is meaningless and the grouping doesn't work.
select ColA,ColB,ColC,ColdD,SUM(ColE),ColF,ColG FROM automate GROUP BY ColA
Please help if someone knows - MS Access / Excel as database.
Every field in your SELECT list must be either GROUPed BY or aggregated. If all values are guaranteed to be the same, or if you don't care which value gets picked, use FIRST(); otherwise use the appropriate aggregate function (MIN, MAX, FIRST, LAST, SUM, etc.).
Example:
SELECT ColA, FIRST(ColB), FIRST(ColC), FIRST(ColdD), SUM(ColE), FIRST(ColF), FIRST(ColG) FROM automate GROUP BY ColA
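If your provider doesn't accept FIRST() (it's Jet/ACE SQL rather than standard SQL), MIN() or MAX() can stand in, provided you don't care which of the grouped values comes back. A sketch with the same column names:

SELECT ColA, MIN(ColB), MIN(ColC), MIN(ColdD), SUM(ColE), MIN(ColF), MIN(ColG) FROM automate GROUP BY ColA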

Avoid DISTINCTCOUNT in PowerPivot

Due to performance issues I need to remove a few distinct counts from my DAX. However, I have a particular scenario and I can't figure out how to do it.
As an example, say one or more restaurants can be hired for one or more feasts, and each prepares one or more menus.
I want a PowerPivot table that shows how many feasts each restaurant was present at. I achieved this by using DISTINCTCOUNT.
Why not precalculate this in Power Query? The real data I have is a bit more complex (more ID columns), and in order to pivot the data I would have to calculate thousands of possible combinations.
I tried adding a Feast dimension table to my model (in the example this would be just one column of two rows). I was hoping to use that relationship to make a straight count, but I haven't been able to come up with the right DAX to do so.
You could use COUNTROWS() combined with VALUES().
Specifically, COUNTROWS() gives you the count of rows in a table; that means COUNTROWS expects a table as input. Here's the magic part: VALUES() returns a table, and the table it returns contains the distinct values of the table/column you provide as the argument to VALUES().
I'm not sure if I'm explaining it well, so for the sample data you provided, the measure would look like this (assuming the table is named Table1):
Unique Feasts:=COUNTROWS(VALUES('Table1'[Feast Id]))
You can then create a pivot table from PowerPivot, drag Restaurant Id into Rows, and drag the measure above into Values. Same result as DISTINCTCOUNT, but with less performance overhead (I think).
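A closely related variant, for what it's worth: DISTINCT() also returns the unique values of a column as a table, but unlike VALUES() it excludes the implicit blank row a model adds for unmatched relationships. With the same table and column names as above:

Unique Feasts Alt:=COUNTROWS(DISTINCT('Table1'[Feast Id]))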

How do I SUMIF one PowerPivot table according to the rows of a second PowerPivot table?

I have two tables: one of customers ("Donor"), and one of transactions ("Trans"). In Donor, I want a "Total" column that sums all the transactions by a particular Donor ID, which I would calculate in a standard Excel table thus:
=SUMIF(Trans[Donor ID],[@ID],Trans[Amt])
Simple! How do I do the same thing with a DAX formula? I thought
=CALCULATE(SUM(Trans[Amt]),Trans[Donor ID]=[ID])
would do it, but I get the error
Column "ID" cannot be found or may not be used in this expression.
Strangely, when I use
=CALCULATE(SUM(Trans[Amt]),Trans[Donor ID]=3893)
I do get the total for ID 3893.
Eschewing CALCULATE, I did find that this works:
=SUMX(FILTER(Trans, Trans[Donor ID]=[ID]),[Amt])
...but it only allows the one filter, and I'll need to be able to add more filters, but:
=SUMX(CALCULATETABLE(Trans, Trans[Donor ID]=[ID]),[Amt])
...(which I understand is like FILTER but allows for multiples) does not work.
Can you identify what I'm doing wrong?
After putting together a quick model with a Donor table and a Trans table related on the donor ID, I've confirmed that this DAX formula works as a calculated column in the Donor table:
=CALCULATE(SUM(Trans[Amt]), FILTER(Trans, Trans[Donor] = Donor[DonorKey]))
The key here is to make sure that the relationship between the two tables is correctly configured, and then make sure to use the combination of CALCULATE() and FILTER() -- filtering the trans table based on the current donor context.
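Since the goal was to be able to stack more conditions, note that the FILTER predicate can combine tests with &&. A sketch only; Trans[Year] is a hypothetical column standing in for whatever extra filter is needed:

=CALCULATE(SUM(Trans[Amt]), FILTER(Trans, Trans[Donor] = Donor[DonorKey] && Trans[Year] = 2014))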

Report on users who don't estimate well in Excel

I have a spreadsheet of entries, each recording a user, their estimate, and the actual value (for example, hours for a particular project; again, this is only an example), which we can represent in CSV like:
User,Estimate,Actual
"User 1",5,5
"User 1",7,7
"User 2",3,3
"User 2",9,8
"User 3",6,7
"User 3",8,7
I'm trying to build a report on these users, to quickly see which users underestimate or overestimate, so I created a pivot table. But I can't figure out how to simply show whether a user has ever underestimated. I tried to create a calculated field like =IF(Estimate > Actual, 1, 0), but a calculated field sums the Estimate and Actual columns first and compares the sums, so it tells me that "User 3" doesn't over/underestimate.
Without adding an additional field to my data, how can I accomplish this?
A similar SQL pseudo-query would be:
SELECT DISTINCT al.User,
(SELECT COUNT(*) FROM ActivityLog AS l2 WHERE l2.User = al.User AND l2.Estimate > l2.Actual) AS Overestimates
FROM ActivityLog AS al
Edit:
I'm still working on this; currently I have created a static list of users in some cells on the side and given them the array formulas {=SUM(IF((A$2:A20 = F6)*(B$2:B20 > C$2:C20), 1, 0))} and {=SUM(IF((A$2:A20 = F6)*(B$2:B20 < C$2:C20), 1, 0))} (if I have the user's name in F6).
Mainly, I want to do this where the list of users can populate dynamically from the main data.
Calculated fields in pivot tables stink. I would get rid of the pivot table and do it with formulas. Start a unique list of users in H16 and enter this in I16:
{=MAX(($A$2:$A$7=H16)*($B$2:$B$7-$C$2:$C$7<>0))}
array entered. This will return 1 if they ever over- or under-estimated, and zero if they never did. The downside is that you can't "refresh" it like a pivot table, so you have to make sure your unique user list is accurate at all times.
If that's too big of a downside, I think you'll need to add a column to your source data. Specifically
=ABS(B2-C2)
And add that to your pivot table. It will show zero for never over/under and non-zero otherwise.
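If the array entry (Ctrl+Shift+Enter) is a nuisance, a SUMPRODUCT variant of the first formula needs no special entry; it counts the mismatched rows for a user, so any non-zero result means they have missed an estimate at least once (same ranges as above):

=SUMPRODUCT(($A$2:$A$7=H16)*($B$2:$B$7-$C$2:$C$7<>0))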
Are you aware that you should make sure the estimates are all on the same scale? Smaller numbers can be estimated more accurately (when talking about hours).
Add a column for actual - estimate, then summarize those values for min, max, and average (or standard deviation).
