How to merge two rows into one in Spotfire?

I am stuck at a point in Spotfire where I need to transform this table:
ID  First Name  Last Name
1   Mark
1               Taylor
2   Howard
2               Giblin
into this one:
ID  First Name  Last Name
1   Mark        Taylor
2   Howard      Giblin
Could someone please help me out? Thanks in advance for the help!

File > Add Data Tables
Add (button) > From Current Analysis > "Your Table Name"
Under transformations, Select "Calculate and Replace Column" > Add (button)
Then use this formula
Max([FirstName]) over ([ID]) as [FirstName]
Repeat the last step for your last name column
Max([LastName]) over ([ID]) as [LastName]
Note, you could do this in a cross table or a calculated column as well. It will not remove the duplicate rows though, only fill in the gaps.
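To make the "fill in the gaps, then remove duplicates" idea concrete outside Spotfire, here is a minimal Python sketch. It is not Spotfire code: the function name merge_rows_by_id and the dictionary layout are assumptions made for illustration. It takes the per-ID maximum of the non-empty values, much like Max(...) over ([ID]), and keeps one row per ID.

```python
# Illustrative sketch only (plain Python, not Spotfire).
# Rows mirror the question's table; None marks an empty cell.
rows = [
    {"ID": 1, "FirstName": "Mark", "LastName": None},
    {"ID": 1, "FirstName": None, "LastName": "Taylor"},
    {"ID": 2, "FirstName": "Howard", "LastName": None},
    {"ID": 2, "FirstName": None, "LastName": "Giblin"},
]


def merge_rows_by_id(rows, key="ID"):
    """Collapse rows sharing the same key, keeping the max non-empty value per column."""
    merged = {}
    for row in rows:
        target = merged.setdefault(row[key], {key: row[key]})
        for column, value in row.items():
            if column == key:
                continue
            current = target.get(column)
            if value is None:
                # Remember the column even if no value has been seen yet.
                target.setdefault(column, None)
            elif current is None:
                target[column] = value
            else:
                target[column] = max(current, value)
    return list(merged.values())


for merged_row in merge_rows_by_id(rows):
    print(merged_row)
```

Because the rows are collapsed into one dictionary per ID, the duplicate rows disappear in the same pass, which is the step the Spotfire transformation alone does not perform.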

Related

M Query Table.Group with Min based on two columns

I have a table with many columns. Three of these columns are:
Package Name (text)
Units Required (Int.64)
Assessment (Int.64)
What I am trying to do is find the "minimum" "Package Name": first by selecting the smallest number of "Units Required", and then, because several rows sometimes share the same number of required units, by taking the row with the lowest "Assessment".
I am exploring the Table.Group() approach but I am not getting anywhere with my understanding of it. I am doing this in Power Query in Excel 365.
Pseudocode would be something like:
Table.Group("Previous Step Name",{"Package Name"},{MIN("Units Required"),MIN("Assessment")})
As an aside - is it possible to use a single Table.Group and group at two levels? such as "Package Name" and "Column X" so that the result would be a: for each "Package Name" then for each "Column X" in each "Package Name" (nested as it were).
Thank you in advance for taking a look at this.
Any help greatly appreciated.
Cheers
The Frog
I think you have to do it step by step.
Data
Queries
Load_Data
Load data from Excel table
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content]
in
    Source
Min_Unit
Identify min unit by grouping with empty "group by" field.
let
    Source = Load_Data,
    // An empty group-by list aggregates over the whole table
    Group = Table.Group(Source, {}, {{"Min_Unit", each List.Min([Units Required]), type number}})
in
    Group
Min_Unit_And_Assessment
Use inner join to filter original data for entries which equal min_unit. Next, group by "units required" to get the min_assessment.
let
    // Keep only the rows whose "Units Required" equals the overall minimum
    Source = Table.NestedJoin(Load_Data, {"Units Required"}, Min_Unit, {"Min_Unit"}, "Min_Unit", JoinKind.Inner),
    // Among those rows, find the lowest "Assessment"
    Group = Table.Group(Source, {"Units Required"}, {{"Min_Assessment", each List.Min([Assessment]), type nullable number}})
in
    Group
Result
Inner join to filter original data for the combination of min_unit and min_assessment.
let
    // Keep only the rows matching both the minimum units and the minimum assessment
    Source = Table.NestedJoin(Load_Data, {"Units Required", "Assessment"}, Min_Unit_And_Assessment, {"Units Required", "Min_Assessment"}, "Min_Unit_And_Assessment", JoinKind.Inner),
    // Drop the nested table column added by the join
    RemoveUnnecessaryColumns = Table.RemoveColumns(Source, {"Min_Unit_And_Assessment"})
in
    RemoveUnnecessaryColumns
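The staged logic of these queries (minimum units first, then minimum assessment among the ties) can be sketched in a few lines of Python. This is only an illustration of the idea, not Power Query code; the sample rows and the helper names rows_with_min and min_unit_then_assessment are made up.

```python
# Illustrative sketch only; the sample data below is hypothetical.
packages = [
    {"Package Name": "Alpha", "Units Required": 4, "Assessment": 2},
    {"Package Name": "Beta", "Units Required": 2, "Assessment": 5},
    {"Package Name": "Gamma", "Units Required": 2, "Assessment": 3},
    {"Package Name": "Delta", "Units Required": 6, "Assessment": 1},
]


def rows_with_min(rows, column):
    """Return every row whose value in `column` equals the column minimum."""
    smallest = min(row[column] for row in rows)
    return [row for row in rows if row[column] == smallest]


def min_unit_then_assessment(rows):
    """Filter to the minimum units first, then to the minimum assessment among those."""
    return rows_with_min(rows_with_min(rows, "Units Required"), "Assessment")


print(min_unit_then_assessment(packages))
```

Like the inner joins above, this keeps every row that ties on both values rather than picking one arbitrarily.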
Qualia, thank you for pointing me in the right direction.
The way that I solved this was really simple in the end!
Step 1: Sort the rows based on the grouping criteria (package name, system class) in that order
Step 2: Add an Index Column so each row has a unique ID to work with
Step 3: Group the table based on the same fields (package name, system class) and 'aggregate' on the lowest Index Number (MIN)
Step 4: Perform a 'Merge Queries' with a Left Outer Join, using the Index Number as the matching field between your current step and the earlier step where the Index was added. Only the rows you need are matched, because the others were dropped by the MIN aggregation. Here is my example:
Table.NestedJoin(#"Grouped Rows", {"Winner"}, #"Added Index", {"Index"}, "Lookup Data", JoinKind.LeftOuter)
- Grouped Rows was the grouping step (Step 3)
- Winner is the name of the Index that had the minimum value
- Added Index was the last step before grouping that still had all the columns (Step 2)
- Index is the column that was added after the sort to uniquely number each row
Step 5: Expand the table and select the columns of data that you want to hang onto
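The five steps above can be sketched in Python as follows. This is a rough illustration, not the Power Query implementation: the function name first_row_per_group, the sample rows, and the extra sort columns are assumptions. In particular, the sketch assumes the sort also orders rows within each group (here by "Units Required" and "Assessment"), so that the lowest index in a group is the row you want.

```python
# Illustrative sketch only; names and sample data are hypothetical.
def first_row_per_group(rows, group_cols, sort_cols):
    # Step 1: sort by the grouping columns (plus, by assumption, the tie-break columns).
    ordered = sorted(rows, key=lambda r: tuple(r[c] for c in group_cols + sort_cols))
    # Step 2: add an index so every row has a unique ID.
    indexed = [dict(row, Index=i) for i, row in enumerate(ordered)]
    # Step 3: group and keep the lowest index per group (the "Winner").
    winners = {}
    for row in indexed:
        group = tuple(row[c] for c in group_cols)
        winners[group] = min(winners.get(group, row["Index"]), row["Index"])
    # Steps 4 and 5: look the winning index back up in the indexed rows.
    by_index = {row["Index"]: row for row in indexed}
    return [by_index[index] for index in winners.values()]


sample = [
    {"Package Name": "Alpha", "System Class": "X", "Units Required": 4, "Assessment": 2},
    {"Package Name": "Alpha", "System Class": "X", "Units Required": 2, "Assessment": 5},
    {"Package Name": "Alpha", "System Class": "X", "Units Required": 2, "Assessment": 3},
    {"Package Name": "Beta", "System Class": "Y", "Units Required": 1, "Assessment": 9},
]
best = first_row_per_group(
    sample, ["Package Name", "System Class"], ["Units Required", "Assessment"]
)
for row in best:
    print(row)
```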
Treating it a bit like a database was a good approach and I appreciate the suggestion you put together for me. Hopefully this will allow others to solve some of their problems too.
Cheers and many thanks
The Frog

Join two tables with OR logic in PowerQuery

As the title states, I am trying to merge 2 tables. I want a nested join where the rows from the first table are always kept and the matching rows from the second table are added to them. I believe this is known as a nested join.
Unfortunately, it only allows matching 1 key to 1 key, whereas I need to match 1 key in Table 1 against 2 keys in Table 2.
Here is an example
Table1:
Group
..
..
Time
Date
Table2:
Group 1
Group 2
..
..
..
Other Info
What I want is where "Group = Group 1 OR Group = Group 2" and display the matching row from table 2 nested into Table 1
I looked at the following example but I must be confused by the syntax because it doesn't seem to be working for me.
How to join two tables in PowerQuery with one of many columns matching?
So after further investigation of the answer post I linked earlier, I will add an explanation of it here:
Table.AddColumn(Source, "Name_of_Column",
    (Q1) => Table.SelectRows(Query2,
        each Q1[Col_from_q1] = [Col_from_q2] or Q1[Col_from_q1] = [2_Col_from_q2]
    )
)
This worked for me. It adds an extra column that needs to be expanded to get all the values from the second table. One caveat: I haven't tested how it behaves when there are multiple matches. Based on how NestedJoin behaves, I would assume that expanding duplicates the rows from the first table.
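For readers who want to reason about the multiple-match case, here is a small Python sketch of the same OR-matching idea. It is not Power Query and does not prove how Table.AddColumn behaves there; the function name nested_or_join and the sample rows are hypothetical. Each left row gets a nested list of every right row where either key matches.

```python
# Illustrative sketch only; names and sample data are hypothetical.
def nested_or_join(left_rows, right_rows, left_key, right_keys, new_column):
    """Attach to each left row the list of right rows matching on ANY of right_keys."""
    result = []
    for left in left_rows:
        matches = [
            right
            for right in right_rows
            if any(left[left_key] == right[key] for key in right_keys)
        ]
        result.append({**left, new_column: matches})
    return result


table1 = [
    {"Group": "G1", "Time": "09:00"},
    {"Group": "G2", "Time": "10:00"},
    {"Group": "G9", "Time": "11:00"},
]
table2 = [
    {"Group 1": "G1", "Group 2": "G5", "Other Info": "a"},
    {"Group 1": "G3", "Group 2": "G1", "Other Info": "b"},
    {"Group 1": "G2", "Group 2": "G2", "Other Info": "c"},
]
joined = nested_or_join(table1, table2, "Group", ["Group 1", "Group 2"], "Matches")
for row in joined:
    print(row["Group"], [match["Other Info"] for match in row["Matches"]])
```

In this sketch a left row with two matches carries a two-row nested list, so flattening ("expanding") it would produce two output rows for that left row, and a left row with no match keeps an empty list.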

Create Dynamic dropdowns with no blanks

I'm trying to create a dropdown on Sheet1 from data on Sheet10. I cannot add a new column on Sheet10, so I need to create the dropdown dynamically. There will be at least 1500 rows on Sheet10, but the dropdown should only list Column2 where Column4 matches the value in Sheet1!A1.
Ideally, the dropdown will concatenate the two columns (e.g. "Column2(Column4)"), but that is gravy.
I've tried several similar answers on this site, but most require creating a helper column of the subset, then defining a name for it (Formula:Name), then Data Validation. I've tried making the name item WITHOUT creating the helper column, using FILTER and OFFSET and some others...
Sheet1!A1 = My Department
Sheet1!B3:B500 "MyNewDropList" Data validation.
Sheet10:
REF DOC  TASK    DATE      ORG
-------  ------  --------  ---------------
DOC 1    TASK 1  1/1/2000  My Department
DOC 1    TASK 2  1/1/2000
DOC 1    TASK 3  1/1/2000  My Department
DOC 1    TASK 4  1/1/2000
DOC 1    TASK 5  1/1/2000  Your Department
I would like to have a dynamic dropdown on Sheet1!B2:B550 that includes the values in Sheet10!B2:B1500 where "ORG" equals the value in Sheet1!A1 (in this case "My Department"). If possible, I would like to include ORG in the dropdown.
With the values above, I would need the dropdown in Sheet1, column B, to show:
TASK 1
TASK 3
Or better would be:
TASK 1 (My Department)
TASK 3 (My Department)
The dropdown needs to be dynamic, so that if any value in ORG is changed to "My Department", the dropdown includes the new TASK that meets the My Department criteria.
Since I can't add a column, I've decided to create a separate worksheet. From here I've created my dynamic list, created a name (excluding the blank lines). I now have my pulldowns working, but was hoping to do so without creating the separate worksheet.
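The filtering that the dynamic list has to perform (keep only matching rows, skip blanks, optionally append the ORG) can be stated precisely with a short Python sketch. This is not an Excel formula; the function name dropdown_items and the tuple layout are assumptions that mirror the sample data in the question.

```python
# Illustrative sketch only (plain Python, not an Excel formula).
# Each tuple is (REF DOC, TASK, DATE, ORG); an empty string is a blank cell.
sheet10 = [
    ("DOC 1", "TASK 1", "1/1/2000", "My Department"),
    ("DOC 1", "TASK 2", "1/1/2000", ""),
    ("DOC 1", "TASK 3", "1/1/2000", "My Department"),
    ("DOC 1", "TASK 4", "1/1/2000", ""),
    ("DOC 1", "TASK 5", "1/1/2000", "Your Department"),
]


def dropdown_items(rows, department):
    """Return 'TASK (ORG)' labels for rows whose ORG equals the chosen department."""
    return [
        f"{task} ({org})"
        for _ref_doc, task, _date, org in rows
        if org == department
    ]


print(dropdown_items(sheet10, "My Department"))
```

Because non-matching rows are never emitted, the resulting list has no blank entries, which is the property the helper worksheet provides in the workaround described above.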

Power Query - Keeping Most recent records in change log columns

I need to strip records to show just the most recent for a given person, and I'm trying to think of a method for doing this in a custom column so I can keep only the most recent records. This is essentially a status change list, and I need to match the last change as a "current status" for merging with another query. Each date can be unique, and each person can have anywhere from 1 to a dozen status changes. I've picked a selection below; last names have been removed to protect the innocent. For the sake of the example, each "name" has a unique identifier that I can use to prevent any overlap from similar names.
AaronS 4/1/2015
AaronS 10/16/2013
AaronS 5/15/2013
AdamS 2/27/2007
AdamL 12/16/2004
AdamL 11/17/2004
AlanG 11/1/2007
AlexanderJ 7/1/2016
AlexanderJ 1/25/2016
AlexanderJ 4/1/2015
AlexanderJ 10/16/2013
AlexanderJ 6/1/2013
AlexanderJ 11/7/2011
My goal would be to return the most recent date for each individual "name" and nulls for the other rows. Then I can filter out the nulls to return one row per name. I'm fairly new to Power Query and mostly adept with the UI; I'm only just learning M code. Any help will be most welcome.
GUI
Bring the "Name" and "Date" data into Power Query.
Group by "Name". In the Group By dialog select the operation All Rows. Name the new column "AllRows". Click OK.
Add a custom column and title it "LatestRow". Enter the formula below. Click OK. Note that the "Date" column is coming from the sub-table in the "AllRows" column.
= Table.Max([AllRows], "Date")
Click the expand button in the upper right corner of the "LatestRow" column. This will return the record associated with the latest date for each name.
Code
let
    Source = Excel.CurrentWorkbook(){[Name="data"]}[Content],
    // Keep all rows of each name as a nested table
    GroupedRows = Table.Group(Source, {"Name"}, {{"AllRows", each _, type table [Name=nullable text, Date=nullable datetime]}}),
    // Pick the record with the latest date from each nested table
    AddedCustomColumn = Table.AddColumn(GroupedRows, "LatestRow", each Table.Max([AllRows], "Date")),
    ExpandedLatestRow = Table.ExpandRecordColumn(AddedCustomColumn, "LatestRow", {"Date"}, {"LatestRow.Date"})
in
    ExpandedLatestRow
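As a cross-check of the grouping logic, here is a minimal Python sketch that keeps the latest date per name, using the sample rows from the question. It is an illustration of the idea rather than Power Query code; the function name latest_per_name is hypothetical, and the dates are assumed to be in month/day/year order.

```python
# Illustrative sketch only; assumes month/day/year date strings.
from datetime import datetime

change_log = [
    ("AaronS", "4/1/2015"), ("AaronS", "10/16/2013"), ("AaronS", "5/15/2013"),
    ("AdamS", "2/27/2007"),
    ("AdamL", "12/16/2004"), ("AdamL", "11/17/2004"),
    ("AlanG", "11/1/2007"),
    ("AlexanderJ", "7/1/2016"), ("AlexanderJ", "1/25/2016"),
    ("AlexanderJ", "4/1/2015"), ("AlexanderJ", "10/16/2013"),
    ("AlexanderJ", "6/1/2013"), ("AlexanderJ", "11/7/2011"),
]


def latest_per_name(records):
    """Return a dict mapping each name to its most recent date."""
    latest = {}
    for name, text in records:
        changed_on = datetime.strptime(text, "%m/%d/%Y").date()
        if name not in latest or changed_on > latest[name]:
            latest[name] = changed_on
    return latest


for name, changed_on in latest_per_name(change_log).items():
    print(name, changed_on.isoformat())
```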

How to get rows with the oldest date per year

I have a task in Excel. I think it is easiest to show with an example. Let's say we have this table:
ID date
1 2015-03-11
1 2015-05-13
2 2013-01-10
2 2010-05-11
1 2014-09-19
2 2013-04-01
I need to get the rows with the oldest date for every year. So I should have:
ID date
1 2015-03-11
1 2014-09-19
2 2013-01-10
2 2010-05-11
I will be grateful for any help. Thanks in advance!
This is but one option. I like using SQL for this type of work and since Excel can connect to itself as an ODBC data source, that's just what I did here...
Create a named range in Excel (I called mine SomeTable). I do this by selecting the range in question and clicking in the drop-down field to the left of the formula space that usually lists the selected cell (B11 in image below).
I then select Data, From External Sources, and select the option for Microsoft Query (ODBC). Select New Data Source, give it a name (the Excel file name), and select the Microsoft Excel driver. Click Connect and browse to the file containing the named range (SomeTable). Select OK, then in the 4th option select the named range (SomeTable), and select a place to put the table on a worksheet.
Now click in the "table" data it creates, go to the Data menu, Properties, and enter the following in the Definition tab under Command text:
SELECT ID, Date
FROM SomeTable ST
INNER JOIN
    (SELECT MIN(date) AS mDate, YEAR(date) AS mYear
     FROM SomeTable
     GROUP BY YEAR(date)) A
ON ST.Date = A.mDate
If all is done correctly, you should get results like this:
Columns E and F hold the source table named "SomeTable".
A10 is where I chose to put the result table.
B20 is where I put the SQL used to get the min (oldest) date per year.
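The grouping the query performs (minimum date per calendar year) can be checked against the question's sample data with a short Python sketch. This is an illustration, not the ODBC query itself; the function name oldest_per_year is hypothetical. Unlike the SQL join, which would return every row tied on the minimum date, this sketch keeps a single row per year.

```python
# Illustrative sketch only; sample data taken from the question.
from datetime import date

some_table = [
    (1, date(2015, 3, 11)),
    (1, date(2015, 5, 13)),
    (2, date(2013, 1, 10)),
    (2, date(2010, 5, 11)),
    (1, date(2014, 9, 19)),
    (2, date(2013, 4, 1)),
]


def oldest_per_year(rows):
    """Return one (ID, date) row per year: the one with the earliest date."""
    oldest = {}
    for row in rows:
        year = row[1].year
        if year not in oldest or row[1] < oldest[year][1]:
            oldest[year] = row
    # Newest year first, matching the order of the expected output in the question.
    return sorted(oldest.values(), key=lambda row: row[1], reverse=True)


for row_id, row_date in oldest_per_year(some_table):
    print(row_id, row_date.isoformat())
```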
