Tableau: Multiple columns in a filter - pivot

I have three numeric fields named A,B,C and wants them in a single filter in tableau and based on the one selected in that filter a line chart will be shown. For e.g. in filter Stages B column is selected and line chart of B is shown. Had it been column A selected then line chart of A would be displayed .
Pardon my way of asking question by showing a image. I just picked up learning tableau and not getting this trick any where.
Here is the snapshot of data

Create a (list) parameter named 'ABC'. With the values
A
B
C
Then create a calculated field
IF ABC = 'A' THEN [column_a]
ELSEIF ABC = 'B' THEN [column_b]
ELSEIF ABC = 'C' THEN [column_c]
END
Something like that should work for you. Check out Tableau training here. It's free, but you have to sign up for an account.

Another way without creating a calculated field. Just pivot the three columns to rows and your field on which you can apply filter is created. Let me show you
This is screenshot of input data
I converted three cols to pivots to get data reshaped like this
After renaming pivoted-fields column to Stages I can add directly this one to view and get my desired result.

Related

How to get the data from previous row in Azure data factory

I am working on transforming data in Azure data factory
I have a source file that contains data like this:
ABC Code-01
DEF
GHI
JKL Code-02
MNO
I need to make the data looks like this to the sink file:
ABC Code-01
DEF Code-01
GHI Code-01
JKL Code-02
MNO Code-02
You can achieve this using Fill down concept available in Azure data factory. The code snippet is available here.
Note: The code snippet assumes that you have already added source transformation in data flow.
Steps:
Add source and link it with the source file (I have generated file with your sample data).
Edit the data flow script available on the right corner to add code.
Add the code snippet after the source as shown.
source1 derive(dummy = 1) ~> DerivedColumn
DerivedColumn keyGenerate(output(sk as long),
startAt: 1L) ~> SurrogateKey
SurrogateKey window(over(dummy),
asc(sk, true),
Rating2 = coalesce(Rating, last(Rating, true()))) ~> Window1
After adding the code in the script, data flow generated 3 transformations
a. Derived column transformation with a new dummy column with constant “1”
b. SurrogateKey transformation to generate Key value for each row starting with value 1.
c. Window transformation to perform window based aggregation. Here the code add predefined clause last() to take previous row not Null vale if current row value is NULL.
For more information on Window transformation refer - https://learn.microsoft.com/en-us/azure/data-factory/data-flow-window
As I am getting the values as single column in source, added additional columns in Derived column to split and store the single source column into 2 columns.
Substitute NULL values if column value is blank. If it is blank, last() clause will not recognize as NULL to substitute previous values.
case(length(dropLeft(Column_1,4)) >1, dropLeft(Column_1,4), toString(null()))
Preview of Derived column: Column_1 is the Source raw data, dummy is the column generated from the code snippet added with constant 1, Column1Left & Column1Right are to store the values after splitting (Column_1) raw data.
Note: Column1Right blank values are replaced with NULLs.
In windows transformation:
a. Over – This partition the source data based on the column provided. As there no other columns to uses as partition column, add the dummy column generated using derived column.
b. Sort – Sorts the source data based on the sort column. Add the Surrogate Key column to sort the incoming source data.
c. Window Column – Here, provide the expression to copy not Null value from previous rows only when the current value is Null
coalesce(Column1Right, last(Column1Right,true()))
d. Data preview of window transformation: Here, Column1Right data Null Values are replaced by previous not Null values based on the expression added in Window Columns.
Second derived column is added to concat Column1Left and Column1Right as single column.
Second Derived column preview:
A select transformation is added to only select required columns to the sink and remove unwanted columns (This is optional).
sink data output after fill down process.

Microsoft Excel - Comparing 2 Column and Delete Duplicate by Row

I am facing an issue where I need to compare column X and column Y, if X=Y then I want to delete that row. But if X≠Y then just leave it there as I need to correct it manually. I try to find any reference but to no avail.
Example of Table
I try using PowerQuery, because the name list were scattered, after sorting up to X=Y, there are some data that wasnt right because it is comparing to almost identical name. I try to use 'remove duplicate' but nothing happened as it only remove if the column has the same data in multiple row.
Thanks in advance.
For another method that does not involve a helper column, you can use the Table.SelectRows function of Power Query:
let
//sample data
Source = Table.FromRecords(
{[x="Johnny White", y="Johnny White"],
[x= "Black Mmamba", y= "Black Mamba"],
[x="Tom Evans", y="Tom Evans"],
[x="Britney Blue",y="Britney Blue"],
[x="White Kingdom", y="Wine Kingdom"],
[x="Daniel Zack", y="Daniel Zack"]},
type table[x=Text.Type,y=Text.Type]),
//select rows where data not the same in each column
remDupes = Table.SelectRows(Source, each [x] <> [y])
in
remDupes
Source
Dupes Removed

Power Query: Split table column with multiple cells in the same row

I have a SharePoint list as a datasource in Power Query.
It has a "AttachmentFiles" column, that is a table, in that table i want the values from the column "ServerRelativeURL".
I want to split that column so each value in "ServerRelativeURL"gets its own column.
I can get the values if i use the expand table function, but it will split it into multiple rows, I want to keep it in one row.
I only want one row per unique ID.
Example:
I can live with a fixed number of columns as there are usually no more than 3 attachments per ID.
I'm thinking that I can add a custom column that refers to "AttachmentFiles ServerRelativeURL Value(1)" but I don't know how.
Can anybody help?
Try this code:
let
fn = (x)=> {x, #table({"ServerRelativeUrl"},List.FirstN(List.Zip({{"a".."z"}}), x*2))},
Source = #table({"id", "AttachmentFiles"},{fn(2),fn(3),fn(1)}),
replace = Table.ReplaceValue(Source,0,0,(a,b,c)=>a[ServerRelativeUrl],{"AttachmentFiles"}),
cols = List.Transform({1..List.Max(List.Transform(replace[AttachmentFiles], List.Count))}, each "url"&Text.From(_)),
split = Table.SplitColumn(replace, "AttachmentFiles", (x)=>List.Transform({0..List.Count(x)-1}, each x{_}), cols)
in
split
I manged to solve it myself.
I added 3 custom columns like this
CustomColumn1: [AttachmentFiles]{0}
CustomColumn2: [AttachmentFiles]{1}
CustomColumn3: [AttachmentFiles]{2}
And expanded them with only the "ServerRelativeURL" selected.
It would be nice to have a dynamic solution. But this will work fine for now.

Use Power Query to grab top row of CSV files in a folder. Place in Excel

I would like to grab the first rows of all CSV files in a folder. I have read that power query would probably be best.
I have gone to Excel > Data > Get Data > From Folder > OK. That has brought me to a table of all the csvs in the folder. I would like to grab the first row of all of these files. I do not want to import all rows of the tables because it was way too many rows. It is also too many tables to do one by one. Please tell me what I should do next. Thank you!
First image is where I am, Second image is where I would like to be
The approach below should give you a single table, wherein each column contains a given CSV's first row's values. It's not exactly what you've shown in your second image (namely, there are no blank columns in between each column of values), but it might still be okay for you.
You can parse a CSV with Csv.Document function (which should give you a table).
You can get the first row of the table (from the previous step) using:
Table.First and Record.FieldValues
or Table.PromoteHeaders and Table.ColumnNames
(It would make sense to create a custom function to do above the steps for you and then invoke the function for each CSV. See GetFirstRowOfCsv in code below.)
The function above returns a list (containing the CSV's first row's values). Calling the function for all your CSVs should give you a list of lists, which you can then combine into a single table with Table.FromColumns.
Overall, starting from the Folder.Files call, the code looks like:
let
filesInFolder = Folder.Files("C:\Users\"),
GetFirstRowOfCsv = (someFile as binary) as list =>
let
csv = Csv.Document(someFile, [Delimiter=",", Encoding=65001, QuoteStyle=QuoteStyle.Csv]),
promoted = Table.PromoteHeaders(csv, [PromoteAllScalars=true]),
firstRow = Table.ColumnNames(promoted)
in firstRow,
firstRowExtracted = Table.AddColumn(filesInFolder, "firstRowExtracted", each GetFirstRowOfCsv([Content]), type list),
combined =
let
columns = firstRowExtracted[firstRowExtracted],
headers = List.Transform(firstRowExtracted[Name], each Text.BeforeDelimiter(_, ".csv")),
toTable = Table.FromColumns(columns, headers)
in toTable
in
combined
which gives me:
The null values are because there were more values in the first row of my ActionLinkTemplate.csv than the first rows of the other CSVs.
You will need to change the folder path in the above code to whatever it is on your machine.
In the GUI, you can select the top N row(s) where you choose N. Then you can expand all remaining rows.

Cleaning Excel Table using VBA without impacting the entire table and formatting

Hi I am trying to change to write VBA for excel to clean up data elements that has extra information without impacting the other elements.
I am writing VBA for the first time my table is in the middle of the sheet.
Given Table and Requested Output.
I think your question was not clear in regard to the "steps" that you want to perform on your data (i.e. the exact logic or transformation that needs to be applied).
Based purely on your images and your comment, I make the "steps" to be:
Split any customer IDs in column valueC into multiple rows.
If column valueC does not contain customer IDs (i.e. is blank or contains non-customer ID text), leave it untouched.
My answer uses Power Query instead of VBA. If you are interested in trying it out, in Excel try clicking Data > Get Data > From Other Sources > Blank Query, then click Advanced Editor near the top-left, copy-paste the code below, then click Done.
You might need to change the name of the table in the first line of the code (below), as it was "Table1" for me, but I imagine yours is named something else. Also, the code below is case-sensitive. So if there is no column named exactly valueC, then you will get an error.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
fxProcessSomeText = (textToProcess as any) =>
let
canBeSplit = Text.StartsWith(textToProcess, "### customer id"),
result = if textToProcess is null then null else if canBeSplit then Text.Split(Text.BetweenDelimiters(textToProcess, "### customer id", " ###"), ",") else {textToProcess}
in
result,
invokeFunction = Table.TransformColumns(Source, {{"valueC", fxProcessSomeText}}),
expanded = Table.ExpandListColumn(invokeFunction, "valueC"),
reindex =
let
removeIndex = Table.RemoveColumns(expanded, {"index"}),
addIndex = Table.AddIndexColumn(removeIndex, "index", 1, 1),
moveIndex = Table.ReorderColumns(addIndex, List.Distinct(List.InsertRange(Table.ColumnNames(addIndex), 0, {"index"})))
in
moveIndex
in
reindex
My output table contains more rows than yours. Also, the value in column valueA, row 11 is 1415 for me (it is 1234 in your request output). Not sure if this is a mistake in your example, or if I'm missing some logic.

Resources