Rounding all columns except first one - power query - rounding

I have a question: is it possible to round all columns except the first one?
I used a hint from Alexis Olson about changing data types (thank you, it is very useful), but I cannot work rounding into that syntax.
When I use this syntax
= Table.TransformColumns(
#"Rename column",
List.Transform(
List.RemoveFirstN(Table.ColumnNames(#"Rename"),1),
each Number.Round(_, 0))
)
I get this error: Expression.Error: Value "2019_1" cannot be transferred to type Number. For context, 2019_1 is a header value.
Can anybody help me, please?
Thank you very much in advance
Jiri

Use this code:
round = Table.TransformColumns(Source,List.Transform(List.Skip(Table.ColumnNames(Source)),
each {_, Number.Round}))
Here, passing Number.Round on its own is equivalent to each Number.Round(_, 0).
P.S. I usually use the 3rd argument of Number.Round (the rounding mode) as well. For instance, if you wish to round 0.5 to 1: Number.Round(_, 0, 2), where 2 is RoundingMode.AwayFromZero.

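You need to pass a list of lists, where the first element of each inner list is a column name and the second is the function applied to each value in that column. A bare list of functions will not work. In your code, List.Transform applies Number.Round to the column names themselves, which is why the header value "2019_1" triggers the error.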
let
Source = #table({"first", "second", "third", "fourth"},{
{ 1, 2, 3, 4},
{1.1, 2.2, 3.7, 4.9}
}),
cols = Table.ColumnNames(Source),
cols_1 = List.RemoveFirstN(cols, 1),
transformation_list = List.Transform(cols_1, (column_name)=>{column_name, (value)=>Number.Round(value, 0)}),
round = Table.TransformColumns(Source, transformation_list)
in
round
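So transformation_list is a list of {column name, function} pairs, one for each column after the first: {{"second", function}, {"third", function}, {"fourth", function}}.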

Related

Conditional Column: Order of Tests Matters?

Condition to Test Date
I am using Power Query to create a status column that checks the date against a specified date, like so:
However, this gives me the following error:
Expression.Error: We cannot convert the value null to type Logical.
Details:
Value=
Type=[Type]
The column does contain empty cells, which I want to report as "null" in the new column. I then tried the following logic, and it errors out as well:
Then I moved the null test to the top, and it finally works:
Why Does Order Matter?
Why does the third query produce the expected results but not the first one? This seems bizarre to me, so if there is something I am missing please let me know.
M evaluates the if expression lazily: the conditions are tested in order, and once one of them is true the remaining conditions are never evaluated. With the null test at the top, the date comparison is never run against a null.
https://learn.microsoft.com/en-us/powerquery-m/m-spec-introduction
For computer language theorists: the formula language specified in
this document is a mostly pure, higher-order, dynamically typed,
partially lazy functional language.
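The same ordering concern can be sketched in Python rather than M. This is an illustration only: the function name, the labels, and the cutoff date are assumptions, and M differs in that comparing null with a date yields null instead of raising an error. Branches are tested top to bottom, so putting the missing-value test first means the date comparison never sees a missing value:

```python
from datetime import date

CUTOFF = date(2022, 1, 1)  # hypothetical cutoff date


def status(d):
    # Test for the missing value first; the comparisons below are
    # only reached when d is an actual date.
    if d is None:
        return "null"
    elif d < CUTOFF:
        return "Before cutoff"
    else:
        return "On or after cutoff"


print([status(d) for d in (date(2020, 1, 1), None, date(2024, 1, 1))])
```

If the None test were moved below the comparison, Python would raise a TypeError on the missing value, which is the same kind of failure the first query runs into.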
Easy fix
On a step before your filter, choose "remove nulls" or "replace nulls with values"
using catch
If you want more flexibility, you can use a try + catch pair.
The FirstTry step stands in for your filter; I then added two ways to handle errors.
let
Source = Table.FromList(sample, Splitter.SplitByNothing(),
type table[Date = nullable date], null, ExtraValues.Error),
sample = {
#date(2020, 1, 1),
"text", null,
#date(2024, 1, 1)
},
filter = #date(2022, 1, 1),
FirstTry = Table.AddColumn(
Source , "Comparison", each filter > [Date], Logical.Type),
WithFallback = Table.AddColumn(FirstTry, "WithFallback",
each try
filter > [Date]
catch (e) => e[Message], type text),
WithPreservedDatatype = Table.AddColumn(WithFallback, "PreserveColumnType",
each try
filter > [Date]
catch (e) => null meta [ Reason = e[Message] ],
type logical)
in
WithPreservedDatatype
things to note
The query steps are "out of order", which is totally valid (above, sample is referenced "before" the line that defines it).
Errors are propagated, so an error that surfaces at step 4 could actually originate in step 2. Just keep going up until you find it.
The schema says column [Date] is type date, but the values are actually type any.
What you need is to call Table.TransformColumnTypes to convert the values and assert the datatypes
= Table.TransformColumnTypes( Source,{{"Date", type date}})
Now row 2 will correctly show an error, because text couldn't convert into a date
Better Understanding of NULLs
I was not understanding how Excel (or any other data tool) handles null values and how logical tests are performed on null values. This response on Reddit really helped clarify this in my mind:
https://www.reddit.com/r/excel/comments/xu37dr/comment/iqtmivn/?utm_source=share&utm_medium=web2x&context=3
In short, logical tests involving nulls do not behave like you would expect.
For a deeper dive into this, read this excellent post:
https://bengribaudo.com/blog/2018/09/13/4617/power-query-m-primer-part9-types-logical-null-binary
My Solution
Given this enlightened understanding of nulls and how they behave in logical tests, I now know that I must either:
Convert the null values to empty strings ("") before the query
Test first for nulls in the query
My choice is to test for nulls first within the query, like so:

Applying function to each row in Query Custom Column

Summary of problem:
I have a PowerQuery Table in Excel that contains 13 columns. The 13th Column is a custom column "Task Start Week Number". I want the PowerQuery to apply a formula to each of the rows generated for this Query. The formula is as follows:
=IFS(AND('Program Dates'!$B$2<WEEKNUM(New_Items_to_Save[Start Date]),
WEEKNUM(New_Items_to_Save[Start Date])<54),
'Program Dates'!$G$2-('Program Dates'!$D$2-(-53+WEEKNUM(New_Items_to_Save[Start Date]))),
WEEKNUM(New_Items_to_Save[Start Date])<'Program Dates'!$B$2,
'Program Dates'!$G$2-('Program Dates'!$D$2-(-53+WEEKNUM(New_Items_to_Save[Start Date])))+53)
What I've done here is reference a cell which contains the formula, that way I can just run the GetValue() function for a named range. I can't get this to work and I don't know what I'm doing wrong.
Thank you in advance for your help!
Context:
This is the query table I need to add the calculation to.
The last column is the custom column, and those values should be calculated using the following cells:
This is the source of the other info needed to calculate the week number of the program, with reference arrows shown.
Note: The dates referenced in the function have already been converted using the WEEKNUM() operation. I am comparing Week# to Week#, not Date to Week#
Function Logic:
AND: if the date falls within the range of the current year ie. week# is less than 54, but after the start of the program, then perform this calc.
IFS: otherwise, if week# is before the end of the program ie. 2023, then perform this calculation.
Edit:
Here is the PowerQuery function I want to call for each of the new cells in this custom column:
Parameter2 = Date.WeekOfYear(StartWeek)
let
GetWeek = ()
if GetValue("Start_Week") < Parameter2 < 54
then (GetValue("Program_Duration") - GetValue("End_Week") + 53 - Parameter2))
else
(GetValue("Program_Duration") - GetValue("End_Week") + 53 - Parameter2 +53))
in
GetWeek
I don't know if I need the let statement or if I should just put it in a function
f(x) => [equation]
and then call "...each f([column name])" in power query?
I think that there are actually three different parts to your question, and maybe your confusion is coming from combining them all together.
The way I see it is in these parts:
How to create a custom function.
How to apply a function to a new column.
How to apply a function to an existing column.
How to create a custom function
There are two main ways to create a custom function in Power Query:
Using the UI (follow steps here):
Step | Description
1 | Write your query
2 | Parameterise your query
3 | Create your function
Using only code (follow steps here):
Example to filter a table:
let fun_FilterTable = (tbl_InputTable as table, txt_FilterValue as text) as table =>
    let
        Source = tbl_InputTable,
        // Keep only the rows whose [Column] contains the filter text
        Filter = Table.SelectRows(Source, each Text.Contains([Column], txt_FilterValue))
    in
        Filter
in
    fun_FilterTable
Example to check if one string contains another:
let fun_CheckStringContains = (txt_String as text, txt_Check as text) as nullable logical =>
let
Source = txt_String,
Check = Text.Contains(Source, txt_Check)
in
Check
in
fun_CheckStringContains
More resources:
Using custom functions
Custom Functions Made Easy in Power BI Desktop
PowerQuery best practices
DataFlow best practices
How to apply a function to a new column
There are also two ways to do this:
Custom Column (follow steps here):
Step | Description
1 | Create custom column
2 | Add function
Custom Function (follow steps here):
Step | Description
1 | Invoke custom function
Sources:
Add a custom column
Using custom functions
Custom Functions Made Easy in Power BI Desktop
How to apply a function to an existing column
Again, there are two ways to do this (unfortunately, only possible with pure code):
Using Transformation:
Example to uppercase an entire column:
let
Source = Table,
#"Uppercased text" = Table.TransformColumns(Source, {{"Column", each Text.Upper(_), type nullable text}})
in
#"Uppercased text"
Example to add a prefix to all rows in one column:
let
Source = Table,
#"Added prefix" = Table.TransformColumns(Source, {{"Column", each "test_" & _, type text}})
in
#"Added prefix"
Example to coerce column to date in Australian format:
let
Source = Table,
#"Fix date" = Table.TransformColumns(Source, {{"DateColumn", each Date.From(_, "en-AU"), type date}})
in
#"Fix date"
Using Replacement
Example to replace some text:
let
Source = Table,
#"Replaced value" = Table.ReplaceValue(Source, "Admin", "Administrator", Replacer.ReplaceText, {"Column"})
in
#"Replaced value"
Example to replace with values from another column
let
Source = Table,
#"Replaced value" = Table.ReplaceValue(Source, each [FixThisColumn], each [OtherColumn], Replacer.ReplaceText, {"FixThisColumn"})
in
#"Replaced value"
Your Specific Problem
You did not provide any dummy data, so I have created some here. In future, please provide some data in a minimum reproducible example (see here), so that we can easily recreate the scenario from your example.
Data:
ID | ProgramStartDate | ProgramEndDate
1 | 1/Jan/2020 | 1/Dec/2021
2 | 1/Jan/2022 | 1/Mar/2023
3 | 1/Mar/2022 | 1/Dec/2022
4 | 1/Sep/2021 | 1/Dec/2023
5 | 1/Jan/2023 | 1/Dec/2023
I think that you should be using a combination of the PowerQuery built-in date functions (see here) and some of the PowerQuery conditional expressions (see here).
My code would look something like this:
let
Source = Table.FromColumns({{1,2,3,4,5},{"1/Jan/2020","1/Jan/2022","1/Mar/2022","1/Sep/2021","1/Jan/2023"},{"1/Dec/2021","1/Mar/2023","1/Dec/2022","1/Dec/2023","1/Dec/2023"}},{"ID","ProgramStartDate","ProgramEndDate"}),
fix_Types = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"ProgramStartDate", type date}, {"ProgramEndDate", type date}}),
add_Today = Table.AddColumn(fix_Types, "DateToday", each Date.From(DateTime.LocalNow()), type date),
add_CheckCurrentYear = Table.AddColumn(add_Today, "IsInCurrentYear", each Date.IsInCurrentYear([DateToday]), type logical),
add_CheckProgramRunning = Table.AddColumn(add_CheckCurrentYear, "ProgramIsCurrent", each [DateToday]>[ProgramStartDate] and [DateToday]<[ProgramEndDate], type logical),
add_ConditionalCheck = Table.AddColumn(add_CheckProgramRunning, "DoSomething", each if [IsInCurrentYear] and [ProgramIsCurrent] then "Do Something" else null, type text)
in
add_ConditionalCheck
And the final output would look something like this:
ID | ProgramStartDate | ProgramEndDate | DateToday | IsInCurrentYear | ProgramIsCurrent | DoSomething
1 | 1/01/2020 | 1/12/2021 | 22/12/2022 | TRUE | FALSE | null
2 | 1/01/2022 | 1/03/2023 | 22/12/2022 | TRUE | TRUE | Do Something
3 | 1/03/2022 | 1/12/2022 | 22/12/2022 | TRUE | FALSE | null
4 | 1/09/2021 | 1/12/2023 | 22/12/2022 | TRUE | TRUE | Do Something
5 | 1/01/2023 | 1/12/2023 | 22/12/2022 | TRUE | FALSE | null
This should help you work towards resolving your issue.

Python Warning Panda Dataframe "Simple Issue!" - "A value is trying to be set on a copy of a slice from a DataFrame"

This is my first post and I'm a total Python novice, so please be patient with my slow understanding!
I have a dataframe containing a list of transactions in order of transaction date.
I've appended a new field/column called ["DB/CR"] that, depending on the presence of "-" in the ["Amount"] field, is populated with 'Debit', or with 'Credit' in the absence of "-".
Since the transactions are in date order, I've included another new field/column called [Top x], which I want to populate with an incremental number (starting at 1), counted independently for debits and for credits.
To do this, I have created a simple loop with an associated 'if' / 'elif' statement (I could probably use else, as it's binary) that loops through the data set from row 0 to the last row in the df and, depending on whether the row is 1) "Debit" or 2) "Credit", increments a separate counter for each: integer 'i' for "Debit" and integer 'ii' for "Credit".
The code works as expected in terms of the 'Top x' output; however, I always receive the warning "A value is trying to be set on a copy of a slice from a DataFrame".
I'd like my script to run without any warnings, and I've been trying to understand what I'm doing incorrectly, but I'm not getting it for my use case.
I'd appreciate it if someone could shed light on how the code needs to be refactored to avoid receiving this warning.
Code (the df source data is an imported csv):
#top x debits/credits
i = 0
ii = 0
for ind in df.index:
    if df["DB/CR"][ind] == "Debit":
        i = i+1
        df["Top x"][ind] = i
    elif df["DB/CR"][ind] == "Credit":
        ii = ii+1
        df["Top x"][ind] = ii
Interpreter
df["Top x"][ind] = i
G:\Finances Backup\venv\Statementsv.03.py:173: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df["Top x"][ind] = ii
Many thanks :)
You should use df.loc[ind, "Top x"] = i, with the row label first and the column label second, so that the assignment is a single indexing operation on df.
Use iterrows() to iterate over the DF. However, updating the DF while iterating over it is not recommended.
Refer to the documentation for iterrows():
You should never modify something you are iterating over. This is not
guaranteed to work in all cases. Depending on the data types, the
iterator returns a copy and not a view, and writing to it will have no
effect.
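As an alternative to the loop, the per-group counter can be sketched without iterating at all. This is a minimal sketch, not the asker's actual script: the sample "Amount" values are made up to stand in for the imported CSV, and it assumes the rows are already in date order. groupby(...).cumcount() numbers the rows within each group from 0, so adding 1 gives the running count per "Debit" / "Credit":

```python
import pandas as pd

# Hypothetical sample data standing in for the imported CSV.
df = pd.DataFrame({"Amount": ["-10.00", "25.00", "-3.50", "-7.25", "40.00"]})

# Label each row from the presence of "-" in the amount, as in the question.
has_minus = df["Amount"].str.contains("-", regex=False)
df["DB/CR"] = has_minus.map({True: "Debit", False: "Credit"})

# Number the rows within each group independently, starting at 1.
df["Top x"] = df.groupby("DB/CR").cumcount() + 1

print(df)
```

Because the whole column is assigned in one statement, there is no chained indexing and therefore nothing for SettingWithCopyWarning to flag.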

Can nested LookUp be done?

Hi again dear stackoverflowers!
I have one column in Sharepoint for an ID number (say that the number is 29) and another column that for that ID holds different subIDs (29.1, 29.2, 29.3, etc.).
What I need is for my PowerApp to look up the Sharepoint list, take the maximum subID number associated with the given ID, and automatically add 0.1, because I do not want people to enter two equal subIDs.
I'll give you the formula I tried (but the problem is that StartsWith is for text and Max is for numbers), so if you have any ideas about how can it be solved or you have any function that works with both text and numbers I would really appreciate it:
LookUp(my_list.'Prueba', StartsWith('Prueba' , DataCardValue9_2.Text), Max('Prueba')+0.1)
Another thing I tried was to nest LookUp functions, but that did not work either, do you know if that can be done?
LookUp(my_list, Prueba = DataCardValue9_2.Text + 0.3 , Max(Prueba) + 0.1) & LookUp(my_list, Prueba = DataCardValue9_2.Text + 0.2 , Max(Prueba) + 0.1) & LookUp(my_list, Prueba = DataCardValue9_2.Text + 0.1 , Max(Prueba) + 0.1)
Thank you very much for your time and help.
You must convert your column to a number datatype. Use the Value() function to convert the text to a number.
Please do check this link: https://learn.microsoft.com/en-us/powerapps/maker/canvas-apps/functions/function-value
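The logic being asked for (filter the subIDs by their text prefix, convert the matches to numbers, take the maximum, add 0.1) can be sketched in Python. This is only an illustration of the steps, not Power Apps code: the function name next_sub_id and the sample values are hypothetical, and Decimal is used here so that adding 0.1 does not pick up floating-point noise:

```python
from decimal import Decimal


def next_sub_id(existing, parent_id):
    """Return the next free sub-ID for parent_id as text (hypothetical helper)."""
    prefix = parent_id + "."
    # Text step: keep only the sub-IDs that belong to the given parent ID.
    matches = [Decimal(s) for s in existing if s.startswith(prefix)]
    if not matches:
        return parent_id + ".1"
    # Number step: take the maximum and add 0.1.
    return str(max(matches) + Decimal("0.1"))


print(next_sub_id(["29.1", "29.2", "29.3", "30.1"], "29"))
```

One limitation of the "add 0.1" scheme itself: after 29.9 the next value would be 30.0, which belongs to a different ID, so the approach only holds for up to nine subIDs per ID.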

Pandas Indexing-View-Versus-Copy

I have a dataframe with several columns.
Later, a column titled 'Active' is added.
If the 'Volume' column contains anything greater than 0, I need to set 'Active' to 1.
This is a simple example of how I've attempted it:
import pandas as pd
active_df = pd.DataFrame(columns=['Volume'])
active_df['Volume'] = 0, 0, 22, 22, 0, 22, 0, 22, 0, 22
active_df['Active'] = 0
active_df['Active'].loc[active_df['Volume'] > 0] = 1
print(active_df)
Although this produces the expected results, I constantly get a warning: "A value is trying to be set on a copy of a slice from a DataFrame"
I have read the referenced page: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy but still can't solve this.
I thought that I had dealt with this in other code and resolved it, but I can't find an example in existing code.
I rediscovered this question after it being up for a year after a recent upvote.
Having learned a lot more about Pandas since it was asked, I thought I'd revisit the difference in my 'copy of a slice' and the solution.
My original attempt was:
active_df['Active'].loc[active_df['Volume'] > 0] = 1
Which was really a convoluted way at best.
First I'm getting boolean values for active_df['Volume'] > 0
And then, where the row value is True, I'm setting the slice active_df['Active'] to 1.
Although this worked, there was uncertainty in whether this was a view or copy of the dataframe.
The solution was:
active_df.loc[active_df['Volume'] > 0, 'Active'] = 1
In the active_df dataframe, locate the rows where active_df['Volume'] > 0, and the column 'Active', and set those values to 1.
Or stated a different way: Set a value of 1 for the 'Active' column for the rows that have a value greater than 0 in the 'Volume' column.
So you are really working on the whole dataframe (active_df.loc) instead of the slice and possible copy (active_df['Active'].loc)
Thank you again to #Deena for providing the solution.
I believe that the copy and view internals differ between versions, since I don't get that warning using 0.20.3.
I would totally understand if the latest releases moved some of the view operations to copies, given the amount of confusion and the possible bugs they caused.
The safest option for all the versions is:
active_df.loc[active_df['Volume'] > 0, 'Active'] = 1
And you can always double check if the filtered dataframe is a copy or a view:
active_df['Active'].loc[active_df['Volume'] > 0].is_view
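For reference, here is a self-contained sketch of the single .loc assignment on the question's data, together with one alternative that is not in the original answers: deriving the column directly from the boolean mask. The variable name derived is only for illustration:

```python
import pandas as pd

active_df = pd.DataFrame({"Volume": [0, 0, 22, 22, 0, 22, 0, 22, 0, 22]})

# Single .loc call: the row mask and the column label are passed together,
# so pandas writes into active_df itself rather than an intermediate object.
active_df["Active"] = 0
active_df.loc[active_df["Volume"] > 0, "Active"] = 1

# Alternative: build the 0/1 values straight from the boolean mask.
derived = (active_df["Volume"] > 0).astype(int)

print(active_df)
```

Both approaches avoid chained indexing, so neither should trigger the warning.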

Resources