Microsoft Excel - Comparing 2 Columns and Deleting Duplicates by Row

I need to compare column X and column Y: if X = Y, I want to delete that row; if X ≠ Y, the row should stay, because I need to correct it manually. I have looked for a reference but found nothing.
Example of Table
I tried Power Query because the name list was scattered. After sorting so that X = Y lined up, some of the data still wasn't right, because names were being compared against almost identical names. I also tried 'Remove Duplicates', but nothing happened, as it only removes rows when a column has the same data in multiple rows.
Thanks in advance.

For another method that does not involve a helper column, you can use the Table.SelectRows function of Power Query:
let
    //sample data
    Source = Table.FromRecords(
        {[x="Johnny White", y="Johnny White"],
         [x="Black Mmamba", y="Black Mamba"],
         [x="Tom Evans", y="Tom Evans"],
         [x="Britney Blue", y="Britney Blue"],
         [x="White Kingdom", y="Wine Kingdom"],
         [x="Daniel Zack", y="Daniel Zack"]},
        type table[x=Text.Type, y=Text.Type]),
    //select rows where the data is not the same in each column
    remDupes = Table.SelectRows(Source, each [x] <> [y])
in
    remDupes
Source
Dupes Removed

Related

Tableau: Multiple columns in a filter

I have three numeric fields named A, B and C and want them in a single filter in Tableau, so that a line chart is shown based on the one selected in that filter. For example, if column B is selected in the filter Stages, the line chart of B is shown. Had column A been selected, the line chart of A would be displayed.
Pardon my asking the question by showing an image. I have just started learning Tableau and cannot find this trick anywhere.
Here is the snapshot of data
Create a (list) parameter named 'ABC' with the values
A
B
C
Then create a calculated field
IF [ABC] = 'A' THEN [column_a]
ELSEIF [ABC] = 'B' THEN [column_b]
ELSEIF [ABC] = 'C' THEN [column_c]
END
Something like that should work for you. Check out Tableau training here. It's free, but you have to sign up for an account.
Another way, without creating a calculated field: pivot the three columns to rows, which creates the field you can apply the filter on. Let me show you.
This is a screenshot of the input data
I pivoted the three columns to get the data reshaped like this
After renaming the pivoted field names column to Stages, I can add it directly to the view and get my desired result.

Power Query: Split table column with multiple cells in the same row

I have a SharePoint list as a datasource in Power Query.
It has an "AttachmentFiles" column that contains a table; from that table I want the values in the column "ServerRelativeURL".
I want to split that column so each value in "ServerRelativeURL" gets its own column.
I can get the values if I use the expand table function, but that splits them into multiple rows; I want to keep them in one row.
I only want one row per unique ID.
Example:
I can live with a fixed number of columns as there are usually no more than 3 attachments per ID.
I'm thinking that I can add a custom column that refers to "AttachmentFiles ServerRelativeURL Value(1)" but I don't know how.
Can anybody help?
Try this code:
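let
    //sample data: fn(x) builds a row {x, nested table with x*2 values in "ServerRelativeUrl"}
    fn = (x)=> {x, #table({"ServerRelativeUrl"},List.FirstN(List.Zip({{"a".."z"}}), x*2))},
    Source = #table({"id", "AttachmentFiles"},{fn(2),fn(3),fn(1)}),
    //replace each nested table with the list of values in its "ServerRelativeUrl" column
    replace = Table.ReplaceValue(Source,0,0,(a,b,c)=>a[ServerRelativeUrl],{"AttachmentFiles"}),
    //one column name per position, up to the longest list: url1, url2, ...
    cols = List.Transform({1..List.Max(List.Transform(replace[AttachmentFiles], List.Count))}, each "url"&Text.From(_)),
    //spread each list across the new columns
    split = Table.SplitColumn(replace, "AttachmentFiles", (x)=>List.Transform({0..List.Count(x)-1}, each x{_}), cols)
in
    split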
I managed to solve it myself.
I added 3 custom columns like this
CustomColumn1: [AttachmentFiles]{0}
CustomColumn2: [AttachmentFiles]{1}
CustomColumn3: [AttachmentFiles]{2}
And expanded them with only the "ServerRelativeURL" selected.
It would be nice to have a dynamic solution. But this will work fine for now.
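A dynamic variant of the custom-column idea might look like the sketch below. It is only an illustration under assumptions: the sample Source and the step names are hypothetical, and the nested column is assumed to be called ServerRelativeUrl, as in the code of the other answer (the question spells it ServerRelativeURL). Each nested table becomes a list of URLs, each list is padded with null up to the longest one, and the result is expanded as a record, so every ID stays on one row.

```powerquery
let
    // Hypothetical sample standing in for the SharePoint list
    Source = #table({"id", "AttachmentFiles"}, {
        {1, #table({"ServerRelativeUrl"}, {{"/a.pdf"}, {"/b.pdf"}})},
        {2, #table({"ServerRelativeUrl"}, {{"/c.pdf"}})}
    }),
    // Replace each nested table with the list of values in its URL column
    ToLists = Table.TransformColumns(Source, {{"AttachmentFiles", each _[ServerRelativeUrl]}}),
    // Largest number of attachments on any row decides how many columns are needed
    MaxCount = List.Max(List.Transform(ToLists[AttachmentFiles], List.Count)),
    ColNames = List.Transform({1..MaxCount}, each "url" & Text.From(_)),
    // Turn each list into a record url1, url2, ..., padding short lists with null
    ToRecords = Table.TransformColumns(ToLists, {{"AttachmentFiles", (urls) =>
        Record.FromList(urls & List.Repeat({null}, MaxCount - List.Count(urls)), ColNames)}}),
    Expanded = Table.ExpandRecordColumn(ToRecords, "AttachmentFiles", ColNames)
in
    Expanded
```

With this sample, id 1 gets url1 and url2 filled in, while id 2 gets url1 and a null in url2.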

Cleaning Excel Table using VBA without impacting the entire table and formatting

Hi, I am trying to write VBA for Excel to clean up data elements that contain extra information without impacting the other elements.
I am writing VBA for the first time, and my table is in the middle of the sheet.
Given Table and Requested Output.
I think your question was not clear in regard to the "steps" that you want to perform on your data (i.e. the exact logic or transformation that needs to be applied).
Based purely on your images and your comment, I make the "steps" to be:
Split any customer IDs in column valueC into multiple rows.
If column valueC does not contain customer IDs (i.e. is blank or contains non-customer ID text), leave it untouched.
My answer uses Power Query instead of VBA. If you are interested in trying it out, in Excel try clicking Data > Get Data > From Other Sources > Blank Query, then click Advanced Editor near the top-left, copy-paste the code below, then click Done.
You might need to change the name of the table in the first line of the code (below), as it was "Table1" for me, but I imagine yours is named something else. Also, the code below is case-sensitive. So if there is no column named exactly valueC, then you will get an error.
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    fxProcessSomeText = (textToProcess as any) =>
        let
            canBeSplit = Text.StartsWith(textToProcess, "### customer id"),
            result = if textToProcess is null then null else if canBeSplit then Text.Split(Text.BetweenDelimiters(textToProcess, "### customer id", " ###"), ",") else {textToProcess}
        in
            result,
    invokeFunction = Table.TransformColumns(Source, {{"valueC", fxProcessSomeText}}),
    expanded = Table.ExpandListColumn(invokeFunction, "valueC"),
    reindex =
        let
            removeIndex = Table.RemoveColumns(expanded, {"index"}),
            addIndex = Table.AddIndexColumn(removeIndex, "index", 1, 1),
            moveIndex = Table.ReorderColumns(addIndex, List.Distinct(List.InsertRange(Table.ColumnNames(addIndex), 0, {"index"})))
        in
            moveIndex
in
    reindex
My output table contains more rows than yours. Also, the value in column valueA, row 11 is 1415 for me (it is 1234 in your requested output). I'm not sure whether this is a mistake in your example or whether I'm missing some logic.

power query subtract row above from row below

I am using Power Query in Excel and used Add Custom Column to create a new column. What I need is for this new column to take the value from the second row and subtract it from the first row, and to repeat this for all rows: row two is subtracted from row one, row three from row two, and row four from row three. Please help. I have no understanding of DAX or Power Query; I started using it today and only need this one thing to work.
PS: I have an index column that starts from one, called index.
Here is the code:
= Table.AddColumn(#"Reordered Columns", "Custom", each [#"ODO - Km"] - [#"ODO - Km"])
At the moment, ODO km is subtracted from the ODO km in the same row; I want the previous ODO km to be subtracted from the next ODO km.
Create two indexes, one 0-based, called Index0, and one 1-based, called Index1. Merge the query with itself joining on Index1 = Index0. You'll now have duplicate of every column, but they will be offset by one. Then you can do all of your arithmetic in one row. After this, you can remove all but the result fields you want.
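The self-merge described above could be sketched roughly as follows. The sample Source and the step names are hypothetical; only the column name "ODO - Km" comes from the question. The sketch computes current minus previous, so swap the operands if the opposite sign is wanted.

```powerquery
let
    // Hypothetical sample data; "ODO - Km" is the column name used in the question
    Source = Table.FromRecords({
        [#"ODO - Km" = 100],
        [#"ODO - Km" = 150],
        [#"ODO - Km" = 225]
    }),
    AddIndex0 = Table.AddIndexColumn(Source, "Index0", 0, 1),
    AddIndex1 = Table.AddIndexColumn(AddIndex0, "Index1", 1, 1),
    // Match each row's Index0 to the Index1 of the row before it
    Merged = Table.NestedJoin(AddIndex1, {"Index0"}, AddIndex1, {"Index1"}, "Prev", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Merged, "Prev", {"ODO - Km"}, {"Prev ODO - Km"}),
    // A merge does not guarantee row order, so restore it explicitly
    Sorted = Table.Sort(Expanded, {{"Index0", Order.Ascending}}),
    // The first row has no previous row, so its difference is null
    AddDiff = Table.AddColumn(Sorted, "Difference", each [#"ODO - Km"] - [#"Prev ODO - Km"], type number),
    Result = Table.RemoveColumns(AddDiff, {"Index0", "Index1", "Prev ODO - Km"})
in
    Result
```

For the three sample rows, the Difference column comes out as null, 50 and 75.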
You don't need to do this. You can look up rows in a table by position using an index. The key is to reference the name of the previous step, like below:
let
    Source = whatever,
    addindex = Table.AddIndexColumn(Source, "Index", 0, 1),
    //the first row has no previous record, so return null there
    addRelative = Table.AddColumn(addindex, "Previous record", each if [Index] <> 0 then addindex[myField]{[Index] - 1} else null)
in
    addRelative

Excel Power Query -- Select value in column specified in related table -- INDEX+MATCH alternative

Problem
I have two queries, one contains product data (data_query), the other (recode_query) contains product names from within the data_query and assigns them specific id_tags. id_tags are also column names within the data_query.
What I need to achieve and fail at
I need the data_query to look at the id_tag of the specific product name within the data_query, as parsed from the recode_query (this is already working and in place) and input the retrieved value within the specific custom column cell. In Excel, I would be using INDEX/MATCH combo:
{=INDEX(data_query[#Data];; MATCH(data_query[#id_tag]; data_query[#Headers]; 0))}
I have searched near and far, but I probably can't even spot the solution, even if I have come across it, as I am not that deep in the data manipulation and power query myself.
Is this what you're wanting?
let
DataQuery = Table.FromColumns({{1,2,3}, {"Boxed", "Bagged", "Rubberbanded"}}, {"ID","Pkg"}),
RecodeQuery = Table.FromColumns({{"Squirt Gun", "Coffee Maker", "Trenching Tool"}, {1,2,3}}, {"Prod Name", "ID2"}),
Rzlt = Table.Join(DataQuery, "ID", RecodeQuery, "ID2", JoinKind.Inner)
in
Rzlt
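If the goal is closer to INDEX/MATCH, meaning that each row should read the column whose name is stored in that row's id_tag, a per-row record lookup may be what is needed. The sketch below is an assumption about the intended result, not a confirmed answer to the question: the sample table, the tag_a and tag_b columns, and the looked_up column name are all hypothetical.

```powerquery
let
    // Hypothetical sample: id_tag holds the name of the column to read for each row
    data_query = #table(
        {"product", "id_tag", "tag_a", "tag_b"},
        {
            {"Squirt Gun", "tag_a", 10, 20},
            {"Coffee Maker", "tag_b", 30, 40}
        }
    ),
    // Inside "each", _ is the current row as a record, so the field can be picked by name;
    // Record.FieldOrDefault returns null when id_tag names a column that does not exist
    AddLookup = Table.AddColumn(data_query, "looked_up", each Record.FieldOrDefault(_, [id_tag], null))
in
    AddLookup
```

Here looked_up is 10 for the first row (read from tag_a) and 40 for the second (read from tag_b).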
