Trying to figure out how to perform the sum of two columns inside a Table.Group step.
This one throws:
Expression.Error: We cannot convert Type to List type.
M Code:
= Table.Group(PreviousStep, {"PF"}, {{"ColumnName", each List.Sum({[Column1], [Column2]})}, type number})
M Gurus: Is this even possible?
Objective: simplify the steps (I actually have 12 columns I want to reduce to 6 by adding pairs and grouping them, all in one step).
You can kind of do it if you create a custom column to sum within that step. Something like this:
= Table.Group(
    PreviousStep,
    {"PF"},
    {{"ColumnName",
      each List.Sum(
        Table.AddColumn(
          _,
          "Custom",
          each [Column1] + [Column2]
        )[Custom]
      ),
      type number
    }}
)
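The general shape of the operation (group by a key, then sum column pairs) is easy to prototype outside M; here is a plain-Python sketch of the same idea, with toy rows and column names standing in for the real table:

```python
from collections import defaultdict

# Toy rows standing in for PreviousStep; extend `pairs` to all six pairs
# of the twelve real columns.
rows = [
    {"PF": "x", "Column1": 1, "Column2": 2},
    {"PF": "x", "Column1": 3, "Column2": 4},
    {"PF": "y", "Column1": 5, "Column2": 6},
]
pairs = [("Column1", "Column2")]

# One running total per group key and per column pair.
totals = defaultdict(lambda: [0] * len(pairs))
for row in rows:
    for i, (a, b) in enumerate(pairs):
        totals[row["PF"]][i] += row[a] + row[b]

print(dict(totals))  # {'x': [10], 'y': [11]}
```

The M answer above does the same thing per group: materialise the pairwise sum as a helper column, then List.Sum it.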
Related
I have the following query of Olympic countries in power query which I wish to sort using another query containing "prioritised countries" (the current top 10). I wish to sort the original query such that if a country is on the prioritised list it is alphabetically sorted at the top of the query.
A visual of what I am trying to achieve:
The best I have been able to do is merge queries; however, this removes countries that are not on the prioritised query. I appreciate that I could create a second copy of the original query, append it to the prioritised countries and then remove duplicates, but I am looking for a more elegant solution, as that would require refreshing the data twice.
Let Q be the query to sort and P be the priority list. Then you can get your desired result by appending the intersection Q ∩ P with the set difference Q \ P.
Here's one way to do this in M:
let
    Source =
        Table.FromList(
            List.Combine(
                {
                    List.Sort( List.Intersect( { P[Country], Q[Country] } ) ),
                    List.Sort( List.RemoveItems( Q[Country], P[Country] ) )
                }
            ),
            null,
            {"Country"}
        )
in
    Source
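The same intersection-plus-difference idea, sketched in Python (the country lists are illustrative):

```python
def priority_sort(countries, priority):
    """Sort `countries` so prioritised ones come first, each part alphabetical."""
    pset = set(priority)
    top = sorted(c for c in countries if c in pset)       # Q ∩ P
    rest = sorted(c for c in countries if c not in pset)  # Q \ P
    return top + rest

print(priority_sort(["Japan", "USA", "Brazil", "Kenya"], ["USA", "Kenya"]))
# ['Kenya', 'USA', 'Brazil', 'Japan']
```

Because both halves come from Q itself, no country is lost the way it is with a plain merge.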
With Table.NestedJoin and JoinKind.FullOuter, a null may be written into columns when a key value in the right table does not exist in the left table.
However, unlike a null that is in the left table because the cell is empty, this generated null does not evaluate to true with the formula [column] = null.
For example:
Table1 (note the null in row 3)
Table2
Joined Table (the null in row 5 was created as a result of the join)
Custom Column added with the formula =[A]=null (note the different results for the two nulls)
M code to reproduce the above:
let
    Source1 = Table.FromRecords({
        [A="a"],
        [A="b"],
        [A=null],
        [A="c"]
    }),
    type1 = Table.TransformColumnTypes(Source1, {"A", type text}),
    Source2 = Table.FromRecords({
        [A="c"],
        [A="d"]
    }),
    type2 = Table.TransformColumnTypes(Source2, {"A", type text}),
    combo = Table.NestedJoin(type1, "A", type2, "A", "joined", JoinKind.FullOuter),
    #"Added Custom" = Table.AddColumn(combo, "Custom", each [A] = null)
in
    #"Added Custom"
Explanations and suggestions as to how to deal with this would be appreciated.
Edit: In addition to the above, doing a Replace will also only replace the null in row 3, and not the null in row 5. There seems to be something different about these two nulls.
Note: if I expand the joined table, the null in column A then tests correctly.
Asking the same question on the Microsoft Q&A forum pointed me to a possible issue with the Power Query evaluation model, and also to this article on Lazy Evaluation and Query Folding in Power BI/Power Query.
By forcing evaluation of the table with Table.Buffer, both nulls now behave the same.
So:
let
    Source1 = Table.FromRecords({
        [A="a"],
        [A="b"],
        [A=null],
        [A="c"]
    }),
    type1 = Table.TransformColumnTypes(Source1, {"A", type text}),
    Source2 = Table.FromRecords({
        [A="c"],
        [A="d"]
    }),
    type2 = Table.TransformColumnTypes(Source2, {"A", type text}),
    //Table.Buffer forces evaluation
    combo = Table.Buffer(Table.NestedJoin(type1, "A", type2, "A", "joined", JoinKind.FullOuter)),
    //IsItNull now works
    IsItNull = Table.AddColumn(combo, "[A] = null", each [A] = null)
in
    IsItNull
It also seems that try ... otherwise forces an evaluation, so instead of Table.Buffer the following works too:
...
combo = Table.NestedJoin(type1,"A",type2,"A","joined",JoinKind.FullOuter),
//try ... otherwise seems to force Evaluation
IsItNull = Table.AddColumn(combo, "[A] = null", each try [A] = null otherwise null)
Very interesting case. Indeed, the behaviour of the second kind of null is counterintuitive. If you wish to get the same behaviour for both kinds of nulls, try this approach:
= Table.AddColumn(combo, "test", each [A] ?? 10)
Interestingly, similar code doesn't work:
= Table.AddColumn(combo, "test", each if [A] = null then 10 else [A])
Moreover, if we try to improve the previous code using the ?? syntax, we still get an unexpected result (10 instead of 20 for the last null):
= Table.AddColumn(combo, "test", each if [A] = null then 10 else [A] ?? 20)
Curiously, applying the ?? operator also fixes the problem with the original column. Afterwards there are regular nulls in column A:
= Table.AddColumn(add, "test2", each [A] = null)
So, if we don't need any calculations and just want to fix the invalid nulls, we can use code like this:
= Table.TransformColumns(combo, {"A", each _ ?? _})
The column doesn't matter; transforming the joined column gives the very same result:
transform = Table.TransformColumns(combo, {"joined", each _ ?? _}),
add = Table.AddColumn(transform, "test", each [A] = null)
I want to cast some columns and then select all the others.
id, name, property, description = column("id"), column("name"), column("property"), column("description")
select([cast(id, String).label('id'), cast(property, String).label('property'), name, description]).select_from(events_table)
Is there any way to cast some columns and still select all columns without listing every column name?
I tried:
select([cast(id, String).label('id'), cast(property, String).label('property')], '*').select_from(events_table)
py_.transform(return_obj, lambda acc, element: acc.append(dict(element)), [])
But I get two extra columns (seven in total), the cast ones, and I can't convert the rows to dictionaries; it throws a KeyError.
I'm using FastAPI, SQLAlchemy and databases (async).
Thanks
Pretty sure you can do
select_columns = []
for field in events_table.columns.keys():
    select_columns.append(getattr(events_table.c, field))

select(select_columns).select_from(events_table)
to select all fields from that table. You can also keep a list of just the fields you actually want to select instead of iterating over every column, like
select_these = ["id", "name", "property", "description"]
select_columns = []
for field in select_these:
    select_columns.append(getattr(events_table.c, field))
select(select_columns).select_from(events_table)
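Building on that, the cast-some-and-pass-through-the-rest pattern from the question can be written by iterating events_table.c directly. A sketch, assuming SQLAlchemy 1.4+ (where select() takes columns as positional arguments rather than a list); the table definition here is a stand-in for the real events_table:

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, cast, select

metadata = MetaData()
events_table = Table(
    "events", metadata,
    Column("id", Integer),
    Column("name", String),
    Column("property", Integer),
    Column("description", String),
)

cast_these = {"id", "property"}
# Cast the chosen columns to String; pass every other column through as-is,
# so no column name has to be listed twice.
columns = [
    cast(col, String).label(col.name) if col.name in cast_these else col
    for col in events_table.c
]
stmt = select(*columns).select_from(events_table)
print(stmt)
```

Because the cast columns replace (rather than add to) their originals in the select list, the result has exactly one entry per table column, so converting rows to dictionaries no longer hits duplicate keys.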
I'm using a SQL table to generate filters on each dimension for a value in an SSAS cube.
The MDX query is based on the Query column below; the calculated member is:
AGGREGATE
(
IIF(Query= "" or ISEMPTY(Query),
[Code].[_KeyQuery].[ALL],
StrToTuple('('+ Query+')')
),[Measures].[Value]
)
I have to work with a pivot table in Excel. It works perfectly; the value is correctly filtered on each dimension member. A query like this works fine:
[Level].[LevelCode].&[A],[Status].[StatusCode].&[ST]
But now I need to add the possibility of filtering on multiple members of the same dimension. For example, using a query:
[Level].[LevelCode].&[A],[Level].[LevelCode].&[X],[Status].[StatusCode].&[ST]
It doesn't work. I've tried changing the query like this:
{[Level].[LevelCode].&[A],[Level].[LevelCode].&[X]},[Status].[StatusCode].&[ST]
but then the StrToTuple() function raises an error. I don't know how to filter on multiple values of the same dimension hierarchy.
If it will always be a tuple, then there is no need for AGGREGATE; just a tuple should return the value:
IIF(
Query= "" OR ISEMPTY(Query),
(
[Code].[_KeyQuery].[ALL]
,[Measures].[Value]
)
,StrToTuple('('+ Query +',[Measures].[Value])')
)
Or this version:
StrToTuple(
'('
+ IIF(
Query= "" OR ISEMPTY(Query)
,[Code].[_KeyQuery].[ALL]
,Query
)
+',[Measures].[Value])'
)
A possible approach for deciding between a tuple and a set: add a column "TupleOrSet" to your control table, with values of either "T" or "S". Then you can amend your code to something like this:
IIF(
Query= "" OR ISEMPTY(Query),
(
[Code].[_KeyQuery].[ALL]
,[Measures].[Value]
)
,IIF(
TupleOrSet = "T"
,StrToTuple('('+ Query +',[Measures].[Value])')
,AGGREGATE( StrToSet('{'+ Query +'}'), [Measures].[Value])
)
)
Note: a tuple is a definite point in cube space, so it cannot be made up of two members from the same hierarchy; that would create coordinates that are indeterminate.
Aggregating multiple columns:
I have a dataframe input.
I would like to apply different aggregation functions per grouped columns.
In the simple case, I can do this, and it works as intended:
val x = input.groupBy("user.lang").agg(Map("user.followers_count" -> "avg", "user.friends_count" -> "avg"))
However, if I want to add more aggregation functions for the same column, they are missed, for instance:
val x = input.groupBy("user.lang").agg(Map("user.followers_count" -> "avg", "user.followers_count" -> "max", "user.friends_count" -> "avg"))
As I am passing a Map, this is not exactly surprising. How can I resolve this and add another aggregation function for the same column?
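That collision is ordinary map-key semantics, not anything Spark-specific; the same effect shown with a Python dict, where the last duplicate key wins:

```python
# Duplicate keys in a literal collapse: only the last value per key survives.
aggs = {"followers_count": "avg", "followers_count": "max", "friends_count": "avg"}
print(aggs)  # {'followers_count': 'max', 'friends_count': 'avg'}
```

So any Map-based agg API can hold at most one aggregation per column name.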
It is my understanding that this could be a possible solution:
val x = input.groupBy("user.lang").agg(avg($"user.followers_count"), max($"user.followers_count"), avg("user.friends_count"))
This, however, returns an error: error: not found: value avg.
New column naming:
In the first case, I end up with new column names such as avg(user.followers_count AS `followers_count`) and avg(user.friends_count AS `friends_count`). Is it possible to define new column names for the aggregations?
I know that using SQL syntax might be a solution for this, but my goal eventually is to pass arguments via the command line (group-by columns, aggregation columns and functions), so I'm trying to construct a pipeline that would allow this.
Thanks for reading this!