Excel power query average of all values in a column - excel

I have a table in Power Query and I want to create a new column that displays the average of all values in another column. Any idea how to achieve this? For example, for the home.goalsFor column I want to add up all values and divide by the number of values.
I have tried multiple ways, including this one:
Avg = List.Average(
    Table.SelectColumns(
        #"ENG - Premier League",
        [home.goalsFor],
        0
    )
)
But that's giving me the error:
Expression.Error: A cyclic reference was encountered during evaluation.
Any ideas?

Based on the error you're getting, I suspect that #"ENG - Premier League" is the name of the current step. A step that refers to itself produces a cyclic reference.
Try this instead:
Avg = List.Average(#"Previous Step Name"[home.goalsFor])

Related

Excel: #NA error while using index and match

I'm working with Excel 2016. I am trying to use INDEX and MATCH to insert numbers into the ALT COST/ACRE column. The screenshot shows the first table.
I'm trying to insert ZIP AVG PRICE/ACRE from Table2 (the second table) into ALT COST/ACRE.
My attempt, as you can see in the first screenshot, is:
=INDEX(Table2[[zip avg price/acre]:[ZIP2]],MATCH([#ZIP],Table2[[#Headers],[ZIP2]],0),1)
However, this results in a #N/A error. How can I fix this?
Table2[[#Headers],[ZIP2]] represents the single header cell that contains the column name, ZIP2.
As you want to search for [#ZIP] in the entire column ZIP2 rather than in its header cell, you should replace it with Table2[ZIP2] which is the data portion of the ZIP2 column:
=INDEX(Table2[[zip avg price/acre]:[ZIP2]], MATCH([#ZIP],Table2[ZIP2],0), 1)
Note also that INDEX never uses the second column of Table2[[zip avg price/acre]:[ZIP2]], so the range can be narrowed to the single column it returns:
=INDEX(Table2[zip avg price/acre], MATCH([#ZIP],Table2[ZIP2],0), 1)
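To see why the lookup range matters, here is a rough Python analogue of the two MATCH calls. It is only a sketch: the table contents are invented sample values, and `match_exact` and `index_match` are hypothetical helpers that mimic MATCH(..., 0) and INDEX, not Excel APIs.

```python
# Hypothetical sample data standing in for Table2 (values are invented).
header = ["ZIP2"]                      # like Table2[[#Headers],[ZIP2]]
zip2 = [30301, 30302, 30303]           # like Table2[ZIP2]
avg_price = [1200.0, 950.5, 1430.25]   # like Table2[zip avg price/acre]


def match_exact(value, lookup):
    """Mimic MATCH(value, lookup, 0): 1-based position, or None for #N/A."""
    for position, candidate in enumerate(lookup, start=1):
        if candidate == value:
            return position
    return None


def index_match(value, lookup, results):
    """Mimic INDEX(results, MATCH(value, lookup, 0))."""
    position = match_exact(value, lookup)
    return None if position is None else results[position - 1]


# Searching the header cell never finds a ZIP code, hence the #N/A.
print(index_match(30302, header, avg_price))  # None
# Searching the data column finds the row and returns its price.
print(index_match(30302, zip2, avg_price))    # 950.5
```

The header range holds only the text "ZIP2", so no ZIP value can ever match it; the data column is the only range where a match is possible.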

Pyspark conditionally replace value in column with value from another column

I am working with some weather data that is missing some values (indicated via value code). For example, if SLP data is missing, it is assigned code 99999. I was able to use a window function to calculate a 7 day average and save it as a new column. A significantly reduced example of a single row is shown below:
SLP_ORIGIN | SLP_ORIGIN_7DAY_AVG
99999 | 11945.823516044207
I'm trying to write code such that when SLP_ORIGIN has the missing code it gets replaced using the SLP_ORIGIN_7DAY_AVG value. However, most code explains how to replace a column value based on a conditional with a constant value, not the column value. I tried using the following:
train_impute = train.withColumn("SLP_ORIGIN", \
when(train["SLP_ORIGIN"] == 99999, train["SLP_ORIGIN_7DAY_AVG"]).otherwise(train["SLP_ORIGIN"]))
where the dataframe is called train.
When I perform a count on the SLP_ORIGIN column using train.where("SLP_ORIGIN = 99999").count() I get the same count from before I attempted replacing the value in that column. I have already checked and my SLP_ORIGIN_7DAY_AVG does not have any values that match the missing code.
So how do I actually replace the 99999 values in the SLP_ORIGIN column with the associated SLP_ORIGIN_7DAY_AVG value?
EVEN BETTER, is there a way to do this replacement and window calculation without making a 7 day average column (I have other variables I need to do the same thing with so I'm hoping there is a more efficient way to do this).
Make sure to double-check which dataframe you are verifying against.
I was using train.where("SLP_ORIGIN = 99999").count() when I should have been using train_impute.where("SLP_ORIGIN = 99999").count()
Additionally, instead of making a whole new column to store the imputed 7 day average, one can use the window average only where the missing value code is present (here f is assumed to be pyspark.sql.functions and w the 7 day window specification):
train = train.withColumn("SLP_ORIGIN", when(train["SLP_ORIGIN"] == 99999, f.avg('SLP_ORIGIN').over(w)).otherwise(train["SLP_ORIGIN"]))
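The replacement logic can also be illustrated outside Spark. The sketch below is plain Python, not PySpark: it replaces each missing code with the mean of the valid readings in a trailing 7-row window. The sample readings, the `MISSING` constant and the `impute_trailing_mean` helper are assumptions made up for illustration. It also leaves the 99999 codes out of the average, which is an extra design choice not stated in the answer.

```python
MISSING = 99999  # missing-value code from the question


def impute_trailing_mean(values, window=7, missing=MISSING):
    """Replace each missing code with the mean of valid values in the
    trailing window (current row and the window - 1 rows before it)."""
    imputed = []
    for i, value in enumerate(values):
        if value != missing:
            imputed.append(value)
            continue
        start = max(0, i - window + 1)
        valid = [v for v in values[start:i + 1] if v != missing]
        # Keep the code when the window holds no valid reading at all.
        imputed.append(sum(valid) / len(valid) if valid else missing)
    return imputed


# Invented sample readings; the third one is missing.
readings = [10130.0, 10150.0, MISSING, 10140.0]
print(impute_trailing_mean(readings))
```

The averages are taken from the original readings rather than from already-imputed ones, so one filled-in value never feeds into the next.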

How do I convert SUMIFS in Excel to a Power BI formula?

I am trying to replicate the following Excel formula in Power BI. It adds up all the refunded costs for a unique identifier within a date period.
I have tried using the SUMX function in Power BI, but it doesn't return the values I need.
SUMIFS([#Refunded;
[#Date];">="&MAX([#Date])-42;
[#Date];"<="&MAX([#Date])-14;
[#UID];)
It needs to return the sum of the same unique identifiers between 42 and 14 days earlier.
I have tried solving it as follows:
calculate(SUM([Refunded]),DATESBETWEEN(all_funnel_data_view[Date].[Date],Value(all_funnel_data_view[Date].[Date])=TODAY()-42,Value(all_funnel_data_view[Date].[Date])=TODAY()-14))
But it only returns an empty field.
Use the FILTER function as the second argument of CALCULATE. In this, you can filter the date column of the all_funnel_data_view table for a date in the specified time frame. I'm assuming that Refunded is a column, not already a measure. If so, qualifying it with the name of the table will help to make the measure easier to read. In the following example, "YourFactTable" is used for this.
CALCULATE
(
    SUM(YourFactTable[Refunded]),
    FILTER(all_funnel_data_view,
        AND
        (
            all_funnel_data_view[date] >= TODAY() - 42,
            all_funnel_data_view[date] <= TODAY() - 14
        )
    )
)
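For readers who want to check the date-window logic outside Power BI, here is a small Python sketch of the same filter. The rows, the fixed `today` value and the `refunded_in_window` helper are hypothetical. The sketch also matches on the UID, which the question asks for but the DAX above does not filter on explicitly.

```python
from datetime import date, timedelta


def refunded_in_window(rows, uid, today, oldest=42, newest=14):
    """Sum 'refunded' for one UID where today - oldest <= date <= today - newest."""
    start = today - timedelta(days=oldest)
    end = today - timedelta(days=newest)
    return sum(
        row["refunded"]
        for row in rows
        if row["uid"] == uid and start <= row["date"] <= end
    )


# Invented sample rows.
rows = [
    {"uid": "A", "date": date(2024, 1, 1), "refunded": 10.0},   # too old
    {"uid": "A", "date": date(2024, 2, 1), "refunded": 25.0},   # in window
    {"uid": "A", "date": date(2024, 2, 25), "refunded": 40.0},  # too recent
    {"uid": "B", "date": date(2024, 2, 1), "refunded": 99.0},   # other UID
]
print(refunded_in_window(rows, "A", today=date(2024, 3, 1)))  # 25.0
```

Passing `today` in as an argument keeps the example reproducible; the DAX version uses TODAY(), so its result shifts from day to day.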

Moving total which uses two calculated fields and also uses its previous value-Spotfire

Hi folks, I am new to Spotfire and am having difficulty replicating one of my Excel formulas in Spotfire.
Sample data (Excel):
https://docs.google.com/spreadsheets/d/1KSdrIYKlRYG9c3wIM3NwQcLP_Ob8Z2UZ5Cjrdjy8UO8/edit?usp=sharing
I am trying to replicate the column [Steady Repay-Option Scenario].
Formula used in Excel:
=IF(B6-IF(C3>0,C2,0)>0,B6-IF(C3>0,C2,0),0)
The above is the formula I have in Excel, where each subsequent value is calculated from the previous value and the current values of the columns [Monthly impact on cash] and [running total].
This is the formula I have created in Spotfire:
if((Sum([Scenario opening balance]) over (allPrevious([Document_Date_Number])) - (If(Sum([Rolling_total_cash_calculated]) over (AllPrevious([Document_Date_Number]))>0,Sum([Monthly_impact_on_cash_calculated]) over (AllPrevious([Document_Date_Number])),0)))>0,Sum([Scenario opening balance]) over (allPrevious([Document_Date_Number])) - (If(sum([Rolling_total_cash_calculated]) over (allPrevious([Document_Date_Number]))>0,Sum([Monthly_impact_on_cash_calculated]) OVER (allPrevious([Document_Date_Number])),0)),0)
Assumptions:
The data has been pivoted into three columns ([Document_Date_Number], [Monthly_impact_on_cash_calculated] and [Rolling_total_cash_calculated])
where:
[Scenario opening balance] = 150000000 (fixed)
[Document_Date_Number] = Jan, Feb, Mar, etc.
[Rolling_total_cash_calculated] = Rolling total (Excel)
[Monthly_impact_on_cash_calculated] = Monthly impact on cash (Excel)
But I get incorrect results for some reason.
Results in Spotfire:
The expected result is:
Correct result in Excel:
So although the results match until Oct, as shown above, they don't match afterwards.
Please let me know what I can do to get the same values. Any help is deeply appreciated.

PivotTable - Calculate value depending on combination of row labels

WARNING - Using Excel 2011 for Macs, inexperienced user
Hi All,
I have a sheet in Excel with a bunch of categorical fields and some numerical ones as well. Let's say it looks like the following:
I would like to make a pivot table that will display the average click rate (avg_click_rate) of the unique combinations of [year, region], i.e. the combinations of fields in the pivottable's rows section.
For example, the avg_click_rate of [years=5] is:
(0.5*10)/(10+5) + (0.6*5)/(10+5) = 0.53
while the avg_click_rate of [region=north] is:
(0.6*5)/(5+20) + (0.2*20)/(5+20) = 0.28
and the avg_click_rate of [years=5, region=south] is:
(0.5*10)/10 = 0.5
I know I have to make a custom Calculated Field to do this, but for the life of me I cannot figure out how to code the formula. Any help would be seriously, seriously appreciated.
To be clear, my formula would be:
SUM{ (click_rate * number_members) / SUM{number_members} }
where the numerator is a single value for each row included in the unique combination of [year, region], while the denominator is a constant - the total number_members for the unique combination of [year, region].
You should create a new column in your source table:
product = click_rate * number_members
And then create a Calculated Field in the pivot table:
CF = product / number_members
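As a sanity check, the arithmetic behind this calculated field can be reproduced in a few lines of Python. This is a sketch under stated assumptions: the rows are reconstructed from the worked examples in the question, the `years` value of the third row is not given there and is assumed to be 10, and `avg_click_rate` is a hypothetical helper. A pivot table sums each field before dividing, so the calculated field amounts to sum(product) / sum(number_members) for each combination of row labels.

```python
# Rows reconstructed from the worked examples in the question; the 'years'
# value of the third row is not given there, so 10 is an assumption.
rows = [
    {"years": 5, "region": "south", "click_rate": 0.5, "number_members": 10},
    {"years": 5, "region": "north", "click_rate": 0.6, "number_members": 5},
    {"years": 10, "region": "north", "click_rate": 0.2, "number_members": 20},
]


def avg_click_rate(rows, **criteria):
    """Weighted average: sum(click_rate * members) / sum(members)."""
    selected = [
        r for r in rows if all(r[key] == value for key, value in criteria.items())
    ]
    product = sum(r["click_rate"] * r["number_members"] for r in selected)
    members = sum(r["number_members"] for r in selected)
    return product / members


print(round(avg_click_rate(rows, years=5), 2))                  # 0.53
print(round(avg_click_rate(rows, region="north"), 2))           # 0.28
print(round(avg_click_rate(rows, years=5, region="south"), 2))  # 0.5
```

The three printed values match the expected results listed in the question.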
Using AVERAGEIFS:
AVERAGEIFS($C$2:$C$9,$A$2:$A$9,A2,$B$2:$B$9,B2)
EDIT:
You can add up to 127 conditions; see the AVERAGEIFS function documentation:
"Criteria_range1, criteria_range2, … are 1 to 127 ranges in which to evaluate the associated criteria."
