Group and sum columns based on values (Query) - excel

I would like to group and sum a table based on values, the way a pivot table would, but without using one.
Example Table:
For example, in this query I would like to group each row that contains the same RU and TP and sum the Balance Value.
I tried using Group By like this:
but it does not return every possible RU: if a value appears in the TP column, it does not appear in the RU column.
EDIT1: My table starts like this
and when I group and sort it using Group By in the query, the result is
The result has the shape I want, but as you can see, RUs 0156 and 0195 are, for some reason, missing from my table, along with many other RUs.
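To make the intended result concrete, here is a minimal Python sketch of the grouping described above: one output row per distinct (RU, TP) pair, with Balance summed. The sample rows and the `group_and_sum` helper are hypothetical, and the codes are kept as text so that values like 0156 keep their leading zero. It illustrates the expected output only and does not diagnose why rows go missing in the query.

```python
from collections import defaultdict

# Hypothetical sample rows: (RU, TP, Balance). RU and TP are stored as
# strings so that codes such as "0156" keep their leading zero.
rows = [
    ("0156", "0195", 100.0),
    ("0156", "0195", 50.0),
    ("0195", "0156", 25.0),
    ("0200", "0156", 10.0),
]


def group_and_sum(rows):
    """Sum Balance for every distinct (RU, TP) pair."""
    totals = defaultdict(float)
    for ru, tp, balance in rows:
        totals[(ru, tp)] += balance
    return dict(totals)


totals = group_and_sum(rows)
for (ru, tp), balance in sorted(totals.items()):
    print(ru, tp, balance)
```

Every RU that occurs in the input appears in the output, whether or not the same code also occurs in the TP column.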

Related

Spark - partitioning/bucketing of n-tables with overlapping but not identical ids

I'm currently trying to optimize a query on two rather large tables, which look like this:
Table 1: id column - alphanumeric, about 300 million unique ids, more than 1 billion rows overall
Table 2: id column - identical semantics, about 200 million unique ids, more than 1 billion rows overall
Let's say that on a given day, 17.03., I want to join those two tables on id.
Table 1 is the left side and table 2 is the right side. I get about 90% matches, meaning table 2 contains about 90% of the ids present in table 1.
One week later, table 1 has not changed (it could have, but assume it didn't to keep the explanation simple), while table 2 was updated and now contains more records. I do the join again, and some of the previously missing ids now show up, so I get about 95% matches.
In general, table1.id has some matches with table2.id at a given time, and that set may change from day to day.
I now want to optimize this join and came across the bucketing feature. Is this possible?
Example:
1st join: id "ABC123" is present in table1, not in table2. ABC123 gets sorted into a certain bucket, e.g. "1".
2nd join (a week later): id "ABC123" has now appeared in table2; how can I ensure that it lands in the table 2 bucket that is co-located with the matching bucket of table 1?
Or am I misunderstanding how bucketing works in general?
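The example above can be explored with a small Python sketch. It assumes that hash-based bucketing derives the bucket number only from the key value and the number of buckets, not from the table or from the time the row arrives. `bucket_for`, `NUM_BUCKETS`, and the CRC32 hash are stand-ins chosen for illustration; they are not the hash function Spark itself uses.

```python
import zlib

NUM_BUCKETS = 8  # hypothetical; both tables would need the same bucket count


def bucket_for(record_id: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Deterministic bucket number computed from the id alone."""
    return zlib.crc32(record_id.encode("utf-8")) % num_buckets


# Week 1: "ABC123" exists only in table 1.
table1_buckets = {i: bucket_for(i) for i in ["ABC123", "XYZ789"]}
table2_week1 = {i: bucket_for(i) for i in ["XYZ789"]}

# Week 2: "ABC123" now also arrives in table 2.
table2_week2 = {i: bucket_for(i) for i in ["XYZ789", "ABC123"]}

# The late-arriving id gets the same bucket number it has in table 1.
print(table1_buckets["ABC123"], table2_week2["ABC123"])
```

Under that assumption, an id that shows up later in table 2 is assigned the same bucket number as in table 1, provided both tables are bucketed on the same column with the same bucket count.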

Count of KPI from different tables based on date

I have the following 4 tables and I want to show the count of IDs for each KPI and each Caseload (1, 2, 3, 4, etc.), something like the second picture.
I am not sure what the correct method is, but I've created a date table and use it as a slicer, because one ID might be in all 4 tables while its KPI*Date differs between them.
Any ideas how to calculate this?

How to invert a merge query in power query

I have a single-column table of customer account numbers and a main table containing 400,000 records pulled from an Access database. I want to remove all records from the main table where the customer account number can be found in the single-column table.
The merge query capability in power query allows me to return only the records where there is a match on the customer list (in addition to a variety of other variations on this theme) but I would like to know whether there is a way to invert this so that I return all records where the customer number does not appear in this list.
I have achieved this already by using the List.Contains function and adding a custom column to identify the rows to exclude and then filtering them out, but I think this is severely impacting the performance of my workbook. Refreshing the table that initially has 400,000 rows prior to this series of transformations takes a very long time, and all queries that depend on this table then also take a long time to refresh.
Thank you
If you merge your main table with the single-column table using a Left Anti join, the result is your main table filtered down to only the rows that have no match in the single-column table.
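For readers unfamiliar with the term, the semantics of a left anti join can be sketched in Python as follows. The table contents, the `account` column name, and the `left_anti_join` helper are all hypothetical; this illustrates what the join returns, not how Power Query implements it.

```python
# Hypothetical main table and exclusion list.
main_table = [
    {"account": "A001", "amount": 120},
    {"account": "A002", "amount": 75},
    {"account": "A003", "amount": 300},
    {"account": "A004", "amount": 40},
]
exclude_accounts = ["A002", "A004"]


def left_anti_join(rows, keys_to_exclude, key="account"):
    """Keep only the rows whose key has no match in keys_to_exclude."""
    excluded = set(keys_to_exclude)  # build the lookup once, not per row
    return [row for row in rows if row[key] not in excluded]


kept = left_anti_join(main_table, exclude_accounts)
print([row["account"] for row in kept])
```

The lookup set is built once, whereas a per-row scan of a list (roughly what a List.Contains check in a custom column does) repeats the search for every one of the 400,000 rows.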

Avoid DISTINCTCOUNT in PowerPivot

Due to performance issues, I need to remove a few distinct counts from my DAX. However, I have a particular scenario and I can't figure out how to do it.
As an example, let's say one or more restaurants can be hired for one or more feasts and prepare one or more menus (see data below).
I want a PowerPivot table that shows in how many feasts each restaurant was present (see table below). I achieved this by using distinctcount.
Why not precalculate this in Power Query? The real data is a bit more complex (more ID columns), and in order to pivot it I would have to calculate thousands of possible combinations.
I tried adding a Feast dimension table to my model (in the example this would be just 1 column with 2 rows). I was hoping to use that relationship to make a straight count, but I haven't been able to come up with the right DAX to do so.
You could use COUNTROWS() combined with VALUES().
Specifically, COUNTROWS() gives you the number of rows in a table, so it expects a table as input. Here's the magic part: VALUES() returns a table, and that table contains the distinct values of the column you pass as its argument.
I'm not sure I'm explaining it well, so for the sample data you provided, the measure would look like this (assuming the table is named Table1):
Unique Feasts:=COUNTROWS(VALUES('Table1'[Feast Id]))
You can then create a pivot table from PowerPivot, drag Restaurant Id into Rows, and drag the measure above into Values. This gives the same result as DISTINCTCOUNT, but with less performance overhead (I think).
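What the measure computes for each Restaurant Id row can be sketched in Python: collect the distinct Feast Ids seen for that restaurant, then count them. The sample rows and the `unique_feasts_per_restaurant` helper are hypothetical, since the original data is not shown here.

```python
from collections import defaultdict

# Hypothetical rows: (restaurant_id, feast_id, menu_id).
rows = [
    ("R1", "F1", "M1"),
    ("R1", "F1", "M2"),  # same feast, second menu: must not be counted twice
    ("R1", "F2", "M1"),
    ("R2", "F1", "M3"),
]


def unique_feasts_per_restaurant(rows):
    """Count the distinct feasts per restaurant."""
    feasts = defaultdict(set)
    for restaurant_id, feast_id, _menu_id in rows:
        feasts[restaurant_id].add(feast_id)  # the set plays the role of VALUES()
    return {r: len(f) for r, f in feasts.items()}  # len() plays COUNTROWS()


counts = unique_feasts_per_restaurant(rows)
print(counts)
```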

How to get Excel Pivot 'Summarize by' to return actual data values not sum, avg, etc?

I am using Excel PowerPivot with data in two separate tables. Table 1 has ITEM level data with a BRAND characteristic. Table 2 has BRAND level data. The two tables are linked by the BRAND key. The measure I am using is non-additive, i.e. the sum of the ITEMS does not equal the BRAND. The pivot is set up with ITEMS nested under BRANDS in the rows and the measure in the columns.
Excel assumes that I want to summarize ITEM to a BRAND level by applying SUM, MAX, MIN, AVG, etc. I would like to return the actual values from the appropriate ITEM or BRAND level table and not apply any calculations to the values. Is this possible?
If what you are effectively trying to do is produce a different result for the Brand rows (e.g. blank()) then the answer is to write a further measure that does a logic check to determine whether or not the row in question is an ITEM or a BRAND.
= IF (HASONEVALUE(table1[Item]), [Measure], Blank() )
Bear in mind that this will work for your current pivot but may not be adaptable to all pivots.
This assumes that you have explicitly created a measure called [Measure] and are not just dragging the numeric column into the Values box. If not, you can create the initial [Measure] with something like this:
= Sum(table1[Value])
Here Value is the column you want to use in the measure. Although this uses a sum, when the context is a single item that has a single row in the table, the sum is just that row's value, which is the desired result.
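The logic of the HASONEVALUE check can be sketched in Python. The function receives the rows visible in the current pivot cell and returns the summed value only when they belong to exactly one item, otherwise `None` as a stand-in for BLANK(). The data and the `measure_for_context` helper are hypothetical.

```python
# Hypothetical rows: (brand, item, value).
table1 = [
    ("BrandA", "Item1", 10.0),
    ("BrandA", "Item2", 20.0),
    ("BrandB", "Item3", 5.0),
]


def measure_for_context(visible_rows):
    """Return the summed value for a single-item context, else None."""
    items = {item for _brand, item, _value in visible_rows}
    if len(items) == 1:  # analogue of HASONEVALUE(table1[Item])
        return sum(value for _brand, _item, value in visible_rows)
    return None  # analogue of BLANK()


# ITEM row in the pivot: only Item1 is visible.
item_level = measure_for_context([r for r in table1 if r[1] == "Item1"])
# BRAND row in the pivot: every item of BrandA is visible.
brand_level = measure_for_context([r for r in table1 if r[0] == "BrandA"])
print(item_level, brand_level)
```

One caveat that follows from this logic: a brand with only one item would also pass the check at the brand level, which is one way the measure may not carry over to other pivot layouts.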

Resources