I want to iterate over the rows of a pandas DataFrame and compare rows on only a few of their columns. When two rows match on those columns, I want to compare their dates (which one is earlier or later) and apply the corresponding change to one element of the selected row.
For selecting particular rows, I want something like this:
p=z["product_name", "Category 1", "Category 2", "Features"].iloc[i-1:i]
I know it's not correct, but it's just to give an idea: select a row using only a few particular columns out of many.
i=1
while (i<=len(z)):
    j=i+1
    p=z["product_name", "Category 1", "Category 2", "Features"].iloc[i-1:i]
    p=p.to_string(index=False)
    while(j<=len(z)):
        q=z["product_name", "Category 1", "Category 2", "Features"].iloc[j-1:j]
        q=q.to_string(index=False)
        if (p==q):
            if(z["Update Date"].iloc[i-1:i]>z["Update Date"].iloc[j-1:j]):
                z.drop(j, axis=0)
        j=j+1
    i=i+1
I know that most of this code is actually wrong but this is the approach I'm trying. Please suggest a better approach/function that solves this problem.
I'm not sure exactly what you are asking, but there are some basic issues in your code. The slice i-1:i returns only one row, so why slice instead of selecting that row directly? Also, this line does not select the columns correctly:
p=z["product_name", "Category 1", "Category 2", "Features"].iloc[i-1:i]
Try this, with the column names passed as a list (double brackets):
p=z[["product_name", "Category 1", "Category 2", "Features"]].iloc[i-1:i]
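For the wider goal in the question (keep only the most recently updated row among rows that match on the chosen columns), a vectorized approach avoids the nested loops entirely. The following is a minimal sketch, not a drop-in fix: the helper name keep_latest and the sample data are hypothetical, and it assumes "Update Date" can be parsed by pd.to_datetime and that the later row is the one to keep.

```python
import pandas as pd

# Columns that decide whether two rows count as "the same" (taken from the question).
KEY_COLS = ["product_name", "Category 1", "Category 2", "Features"]


def keep_latest(df, key_cols=KEY_COLS, date_col="Update Date"):
    """Keep, for each group of rows equal on key_cols, the row with the latest date."""
    # Parse the dates so the comparison is chronological, not string-based.
    dates = pd.to_datetime(df[date_col])
    ordered = df.assign(**{date_col: dates}).sort_values(date_col, kind="stable")
    # After sorting, the last duplicate in each group is the most recent one.
    latest = ordered.drop_duplicates(subset=key_cols, keep="last")
    # Restore the original row order.
    return latest.sort_index()


# Hypothetical sample data for illustration only.
z = pd.DataFrame({
    "product_name": ["A", "A", "B"],
    "Category 1": ["x", "x", "y"],
    "Category 2": ["m", "m", "n"],
    "Features": ["f1", "f1", "f2"],
    "Update Date": ["2021-01-05", "2021-03-01", "2021-02-10"],
})
result = keep_latest(z)
```

Note that DataFrame.drop, as used in the question, returns a new frame rather than modifying z, so its result would have to be assigned back for the rows to disappear.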
I have created two custom dimensions to group our people by their designated team and manager.
The first dimension is for teams, and the second is for managers.
When I look at an employee's scorecard, I also want to see the team score, and ticking "Sub totals" does the job (see screenshot). I then want to see the managers' scores as well, but after I added the second dimension (Managers), it does not show any data.
This is the code for Teams dimension
if (matches_filter(${DA_name},`name1, name2, name3...`), "Team A",
if (matches_filter(${DA_name},`name10, name11, name12...`), "Team B", ...)
This is the code I use for Custom dimension of Managers
if ((${team_name} = "Team A"
OR ${team_name} = "Team B"
OR ${team_name} = "Team C"), "Manager A",
if ((${team_name} = "Team D"
OR ${team_name} = "Team E"
OR ${team_name} = "Team F", "Manager B", “”))
I have a table with many columns. Three of these columns are:
Package Name (text)
Units Required (Int64)
Assessment (Int64)
What I am trying to do is find the 'minimum' "Package Name": first select the smallest "Units Required", and then, because several rows can share the same number of required units, pick the row with the lowest "Assessment" among them.
I am exploring the Table.Group() approach but I am not getting anywhere with my understanding of it. I am doing this in Power Query in Excel 365.
Pseudocode would be something like:
Table.Group("Previous Step Name",{"Package Name"},{MIN("Units Required"),MIN("Assessment")})
As an aside - is it possible to use a single Table.Group and group at two levels? such as "Package Name" and "Column X" so that the result would be a: for each "Package Name" then for each "Column X" in each "Package Name" (nested as it were).
Thank you in advance for taking a look at this.
Any help greatly appreciated.
Cheers
The Frog
I think you have to do it step by step.
Data
Queries
Load_Data
Load data from Excel table
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content]
in
Source
Min_Unit
Identify min unit by grouping with empty "group by" field.
let
Source = Load_Data,
Group = Table.Group(Source, {}, {{"Min_Unit", each List.Min([Units Required]), type number}})
in
Group
Min_Unit_And_Assessment
Use inner join to filter original data for entries which equal min_unit. Next, group by "units required" to get the min_assessment.
let
Source = Table.NestedJoin(Load_Data, {"Units Required"}, Min_Unit, {"Min_Unit"}, "Min_Unit", JoinKind.Inner),
Group = Table.Group(Source , {"Units Required"}, {{"Min_Assessment", each List.Min([Assessment]), type nullable number}})
in
Group
Result
Inner join to filter original data for the combination of min_unit and min_assessment.
let
Source = Table.NestedJoin(Load_Data, {"Units Required", "Assessment"}, Min_Unit_And_Assessment, {"Units Required", "Min_Assessment"}, "Min_Unit_And_Assessment", JoinKind.Inner),
RemoveUnnecessaryColumns = Table.RemoveColumns(Source,{"Min_Unit_And_Assessment"})
in
RemoveUnnecessaryColumns
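For readers who want to check the logic outside Power Query, the same two-stage filter can be sketched in pandas. This is only an illustration of the idea, not part of the M solution: the function name min_package and the sample rows are hypothetical, and the column names are taken from the question.

```python
import pandas as pd


def min_package(df):
    """Rows with the smallest "Units Required", then the smallest "Assessment" among those."""
    # Stage 1: keep only rows that have the overall minimum number of units.
    min_units = df["Units Required"].min()
    candidates = df[df["Units Required"] == min_units]
    # Stage 2: among those rows, keep the ones with the lowest assessment.
    min_assessment = candidates["Assessment"].min()
    return candidates[candidates["Assessment"] == min_assessment]


# Hypothetical sample data for illustration only.
table = pd.DataFrame({
    "Package Name": ["P1", "P2", "P3", "P4"],
    "Units Required": [5, 2, 2, 7],
    "Assessment": [1, 9, 4, 3],
})
result = min_package(table)
```

As with the inner joins above, more than one row is returned if several rows tie on both columns.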
Qualia, thank you for pointing me in the right direction.
The way that I solved this was really simple in the end!
Step 1: Sort the rows based on the grouping criteria (package name, system class) in that order
Step 2: Add an Index Column so each row has a unique ID to work with
Step 3: Group the table based on the same fields (package name, system class) and 'aggregate' on the lowest Index Number (MIN)
Step 4: Perform a 'Merge Queries' with a Left Outer Join, using the Index Number as the matching field between your current step and the earlier step where the Index was added. Only the rows you need are matched, because the others were eliminated by the MIN aggregation in Step 3. Here is my example:
Table.NestedJoin(#"Grouped Rows", {"Winner"}, #"Added Index", {"Index"}, "Lookup Data", JoinKind.LeftOuter)
- Grouped Rows was the grouping step (Step 3)
- Winner is the name of the Index that had the minimum value
- Added Index was the last step before grouping that still had all the columns (Step 2)
- Index is the column that was added after the sort to uniquely number each row
Step 5: Expand the table and select the columns of data that you want to hang onto
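The five steps above can be sketched in pandas to show how the pieces fit together. This is a rough analogue under stated assumptions, not the Power Query code itself: the column names "Package Name" and "System Class" are guesses based on the description, the function name is hypothetical, and the sort is assumed to also include "Units Required" and "Assessment" so that the first row of each group is the one wanted.

```python
import pandas as pd

GROUP_COLS = ["Package Name", "System Class"]  # assumed column names
RANK_COLS = ["Units Required", "Assessment"]   # assumed tie-break order within a group


def first_row_per_group(df, group_cols=GROUP_COLS, rank_cols=RANK_COLS):
    # Step 1: sort so the preferred row comes first within each group.
    ordered = df.sort_values(group_cols + rank_cols, kind="stable")
    # Step 2: add an index column that uniquely numbers each row.
    indexed = ordered.reset_index(drop=True)
    indexed["Index"] = indexed.index
    # Step 3: group and keep the lowest index per group as "Winner".
    winners = indexed.groupby(group_cols, as_index=False).agg(Winner=("Index", "min"))
    # Step 4: left join back to the indexed table to recover the full rows.
    merged = winners.merge(
        indexed, how="left", left_on="Winner", right_on="Index",
        suffixes=("", "_lookup"),
    )
    # Step 5: keep only the columns of interest.
    return merged[group_cols + rank_cols]


# Hypothetical sample data for illustration only.
table = pd.DataFrame({
    "Package Name": ["P1", "P1", "P1", "P2"],
    "System Class": ["S1", "S1", "S1", "S1"],
    "Units Required": [5, 2, 2, 7],
    "Assessment": [3, 9, 4, 1],
})
result = first_row_per_group(table)
```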
Treating it a bit like a database was a good approach and I appreciate the suggestion you put together for me. Hopefully this will allow others to solve some of their problems too.
Cheers and many thanks
The Frog
I have a number of columns called "Team Leader", "Process Lead", and "Regional Manager".
I want to get the Team Leader name; if the "Team Leader" column is blank, get the "Process Lead" name; and if both columns are blank, get the "Regional Manager" name.
I also have a calculated column that looks to this column.
=IF(ISBLANK([Team Leader]),"",[Team Leader]),=IF(ISBLANK(["Process Lead]),"",[Process Lead])
This is where I am with my formula which doesn't work. Has anyone achieved this?
Either of these two should help you:
The first results in a blank value if Regional Manager is not provided.
IF(ISBLANK([Test column1]),IF(ISBLANK([Test Column2]),IF(ISBLANK([Test Column3]),"",[Test Column3]),[Test Column2]),[Test column1])
Another one:
IF(ISBLANK([Test column1]),IF(ISBLANK([Test Column2]),IF(ISBLANK([Test Column3]),"",[Test Column3]),[Test Column2]),[Test column1])
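The nested IF reads more easily as "take the first column that is not blank". A minimal Python sketch of that fallback logic, purely for illustration: the function name is hypothetical, and "blank" is assumed here to mean None or an empty string, which may not match ISBLANK exactly.

```python
def first_non_blank(*values):
    """Return the first value that is neither None nor an empty string, else ""."""
    for value in values:
        if value is not None and value != "":
            return value
    # Mirrors the innermost "" branch of the nested IF.
    return ""


# Arguments are given in priority order: Team Leader, Process Lead, Regional Manager.
chosen = first_non_blank("", "Priya", "Sam")
```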
I have Bar chart with values colored by 2 different filters.
What I need is to rename each combination of the two filters to a single name.
I tried to write it with a CASE expression, but with no luck.
The screenshot below shows what is required.
Any ideas ?
Screenshot sample : Motor Type
Why did your CASE expression not work?
Something like that should work:
<CASE [COLUMN] WHEN "VALUE1" then "Category A" WHEN "VALUE2" then "Category B" ELSE "Category C" END>
But in your situation I would do
<if([column 2] in ('SALA','SUPP'),'Motor category A','Motor category B')>
Add this expression here (right click > custom expression):
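To make the behaviour of the if/in expression concrete, here is the same two-way mapping sketched in pandas. It is only an illustration outside the charting tool: the function name, the sample frame, and the value "OTHER" are hypothetical, while 'SALA' and 'SUPP' come from the expression above.

```python
import pandas as pd


def motor_category(df, column="column 2"):
    """Label rows whose value is SALA or SUPP as category A, everything else as B."""
    in_group_a = df[column].isin(["SALA", "SUPP"])
    return in_group_a.map({True: "Motor category A", False: "Motor category B"})


# Hypothetical sample data for illustration only.
sample = pd.DataFrame({"column 2": ["SALA", "SUPP", "OTHER"]})
sample["Motor Type"] = motor_category(sample)
```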
I have the two visualizations below. Is there any way to combine them and show user_name or service_id based on another column, "Type"?
For example, if [type]="user", the table will show "requestor name" and "user name"; if [type]="service", the same table will show "requestor name" and "service id".
Thanks
There are at least 3 ways you can approach this.
You can create a calculated column using one of the expressions below and use this column in your visualization.
Option 1
case
when [user_name] is null or [user_name] = "" then [service id]
else [user_name]
end
Option 2
If(([user_name] is null) or ([user_name]=""),[service id],[user_name])
Option 3
You can use one of the expressions above directly in the category axis of your visualization.
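The fallback in Options 1 and 2 (use [service id] whenever [user_name] is null or empty) can be checked on sample data with a short pandas sketch. This is an illustration only: the function name, the output column "display_name", and the sample rows are hypothetical.

```python
import pandas as pd


def user_or_service(df):
    """Use user_name where it is present and non-empty, otherwise fall back to service id."""
    has_user = df["user_name"].notna() & (df["user_name"] != "")
    # Series.where keeps values where the mask is True and takes the other series elsewhere.
    return df["user_name"].where(has_user, df["service id"])


# Hypothetical sample data for illustration only.
rows = pd.DataFrame({
    "user_name": ["alice", "", None],
    "service id": ["s1", "s2", "s3"],
})
rows["display_name"] = user_or_service(rows)
```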