Pivoting based on Row Number in Azure Data Factory - Mapping Data Flow - azure

I am new to Azure and am trying to see if the below result is achievable with data factory / mapping data flow without Databricks.
I have my csv file with this sample data:
I have the following data in my table:
My expected data/result:
Which transformations would be helpful to achieve this?
Thanks.

Now that you have the RowNumber column, you can use the Pivot transformation to do row-to-column pivoting.
I used your sample data to make a test as follows:
My Projection tab is like this:
My DataPreview is like this:
In the Pivot1 activity, we select the Table_Name and Row_Number columns to group by. If you don't want the Table_Name column, you can delete it here.
On the Pivot key tab, we select the Col_Name column.
Under Pivoted columns, we must select an aggregate function to aggregate the Value column; here I use max().
The result shows:
Please correct me if I understand you wrong in the answer.
update:
The data source like this:
The result shows, as you said, that ADF sorts the columns alphabetically. There seems to be no way to customize the sorting:
But when the Sink activity runs, it will auto-map the columns into your SQL result table.
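As a rough illustration of the two steps above (generate a row number, then pivot with a max() aggregate), here is a minimal pandas sketch with made-up sample data — the column names Table_Name, Col_Name, Value and Row_Number follow the answer, everything else is assumed:

```python
import pandas as pd

# Hypothetical long-format input: one row per (Table_Name, Col_Name, Value)
df = pd.DataFrame({
    "Table_Name": ["T1", "T1", "T1", "T1"],
    "Col_Name":   ["A", "B", "A", "B"],
    "Value":      [1, 2, 3, 4],
})

# Equivalent of the row-numbering step: number the rows within each Col_Name
df["Row_Number"] = df.groupby("Col_Name").cumcount() + 1

# Equivalent of the Pivot transformation:
# group by = Table_Name + Row_Number, pivot key = Col_Name, aggregate = max(Value)
pivoted = df.pivot_table(index=["Table_Name", "Row_Number"],
                         columns="Col_Name",
                         values="Value",
                         aggfunc="max").reset_index()
```

Each distinct Col_Name value becomes its own column, with rows aligned by the generated row number.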

Related

Azure Data Flow Flatten and Parsing key/value column

I'm trying to transform a key/value data into column using Azure Data Flow. Basically this:
{"key":"rate1","value":"123"}-{"key":"rate2","value":"456"}
into this:
key    value
rate1  123
rate2  456
I was following this example here (Flatten and Parsing Json using Azure Data Flow), and everything looked good until I tried to use Parse.
The output just shows the value column, not the key, and I don't know why. Below are my data flow settings.
Source query: https://i.stack.imgur.com/6Q8Xb.png
Source Data preview: https://i.stack.imgur.com/UNj8x.png
Derived Column: https://i.stack.imgur.com/C0g1N.png
Derived Column Data preview: https://i.stack.imgur.com/vtVY7.png
Flatten: https://i.stack.imgur.com/Bkp7P.png
Flatten Data preview: https://i.stack.imgur.com/yM6h1.png
Parse: https://i.stack.imgur.com/RUJpr.png
Parse Data preview: https://i.stack.imgur.com/RC42Y.png
Anyone have any idea what I'm missing?
Edit: My source is Snowflake
Thanks in advance!
I reproduced the above and got the same result after the Parse transformation.
The above process is correct; maybe the preview is just not rendering it correctly. You can view the desired result as individual columns by adding a Derived Column transformation after Parse.
In the sink, select the desired columns via Mapping -> deselect auto mapping -> + -> Fixed mapping.
Sink Data preview:
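For intuition, the split-then-parse logic the data flow performs can be sketched outside ADF. This is a hedged pandas/json illustration using the sample string from the question; treating '-' as the delimiter is an assumption based on that sample:

```python
import json
import pandas as pd

# Sample cell from the question: two JSON objects joined by '-'
raw = '{"key":"rate1","value":"123"}-{"key":"rate2","value":"456"}'

# Equivalent of the Derived Column (split) + Flatten + Parse steps.
# Assumes '-' never occurs inside the keys or values, as in the sample.
records = [json.loads(part) for part in raw.split("-")]
df = pd.DataFrame(records)  # columns: key, value
```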

Pivot all table in Azure Data Factory - Mapping Data Flow

I am new to Azure and am trying to see if the below request is achievable with Data Factory. I have my csv file with this sample data:
I have this result
My expected data/ result:
Which transformations would be helpful to achieve this?
Thanks.
Use the pivot transformation to create multiple columns from the unique row values of a single column. Pivot is an aggregation transformation where you select group by columns and generate pivot columns using aggregate functions.
The pivot transformation requires three different inputs: group by columns, the pivot key, and how to generate the pivoted columns.
Refer to the official Microsoft documentation: Pivot transformation in mapping data flow
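Those three inputs map naturally onto pandas' pivot_table if you want to prototype the shape outside ADF. A minimal sketch with invented data (the names id, attr, value are hypothetical):

```python
import pandas as pd

# Hypothetical long-format input: one row per (id, attr, value)
df = pd.DataFrame({
    "id":    [1, 1, 2, 2],
    "attr":  ["x", "y", "x", "y"],
    "value": [10, 20, 30, 40],
})

# The three Pivot inputs map onto pivot_table arguments:
# group by columns -> index, pivot key -> columns, aggregate -> aggfunc
wide = df.pivot_table(index="id", columns="attr", values="value",
                      aggfunc="sum").reset_index()
```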

Azure Datafactory: How to implement nested sql query in transformation data flow

I have two streams, customer and customercontact. I am new to Azure Data Factory. I just want to know which data flow transformation will achieve the result of the SQL query below.
(SELECT *
FROM customercontact
WHERE customerid IN
(SELECT customerid
FROM customer)
ORDER BY timestamp DESC
LIMIT 1)
I can utilize the Exists transformation for the inner query, but I need some help on how I can fetch the first row after sorting the customer contact data. So basically I am looking for a way to add a LIMIT/TOP/OFFSET clause in the data flow.
You can achieve the transformations for the given query in a data flow with different transformations.
For sorting you can use the Sort transformation. Here you can select ascending or descending order.
For the top few records you can use the Rank transformation.
For the "IN" clause you can use the Exists transformation.
Refer - https://learn.microsoft.com/en-us/azure/data-factory/data-flow-rank
Here is my sample data in SQL as Source
I have used the Rank transformation.
After the Rank transformation, one more column, RankColumn, is added.
Now to select only the top 1 record, I used the Filter row modifier with the expression equals(RankColumn,1).
Now finally use the Sink and run the pipeline.
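The Exists -> Sort -> Rank -> Filter chain described above can be sketched in pandas for intuition; the customer/customercontact rows below are invented stand-ins for the two streams:

```python
import pandas as pd

# Hypothetical stand-ins for the two streams in the question
customer = pd.DataFrame({"customerid": [1, 2]})
customercontact = pd.DataFrame({
    "customerid": [1, 2, 3],
    "timestamp":  ["2023-01-01", "2023-03-01", "2023-02-01"],
})

# Exists: keep contacts whose customerid appears in customer ("IN" clause)
exists = customercontact[customercontact["customerid"].isin(customer["customerid"])]

# Sort descending by timestamp, then Rank + Filter equals(RankColumn, 1)
ranked = exists.sort_values("timestamp", ascending=False).reset_index(drop=True)
ranked["RankColumn"] = ranked.index + 1
top1 = ranked[ranked["RankColumn"] == 1]  # equivalent of ORDER BY ... DESC LIMIT 1
```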

Power Query how to make a Table with multiple values a parameter that uses OR

I have a question regarding Power Query and Tables as parameters for excel.
Right now I can create a table and use it as a parameter for Power query via Drill down.
But I'm unsure how I would proceed with a table that has multiple values. How can a table with multiple "values" be recognized as a parameter?
For example:
I have the following rawdata and parameter tables
Rawdata+parametertables
Now if I wanted to filter on Value2 with a parameter table, I would do a drill down of the parameter table and load it to Excel.
After that I have two tables, so I can filter Value2 with an OR function by 1 and 2.
Is it possible to somehow combine this into one table so that it still uses an OR function to search Value2?
I'm asking because I want it to be possible to just keep adding parameters to the table without creating a new table every time. Basically just copy-paste some parameters into the parameter table and be done with it.
Thanks for any help in advance.
Assuming you use the parameters only for filtering: there are other ways, but this one looks the best from a performance point of view.
You may create a Parameters table, so you have tables like these:
Note that it's handy to have the same name (Value2) for the key column in both tables; otherwise Table.Join will create additional column(s) after merging the tables.
Add a similar step to filter the RawData table:
join = Table.Join(RawData, "Value2", Parameters, "Value2")
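Table.Join with default options performs an inner join, so the multi-row Parameters table effectively ORs its values together: RawData keeps every row whose Value2 matches any parameter row. A hedged pandas equivalent (sample rows invented) shows the filtering effect:

```python
import pandas as pd

# Hypothetical RawData and Parameters tables matching the question's setup
raw_data = pd.DataFrame({"Value1": ["a", "b", "c"], "Value2": [1, 2, 3]})
parameters = pd.DataFrame({"Value2": [1, 2]})

# Inner join on the shared key column, like Table.Join(RawData, "Value2",
# Parameters, "Value2"): only rows whose Value2 is in the parameter table survive.
joined = raw_data.merge(parameters, on="Value2", how="inner")
```

Adding more rows to the parameter table widens the filter without touching the query.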

How do we create a generic mapping dataflow in datafactory that will dynamically extract data from different tables with different schema?

I am trying to create an Azure Data Factory mapping data flow that is generic for all tables. I am going to pass the table name, the primary column (for join purposes), and the other columns to be used in groupBy and aggregate functions as parameters to the data flow.
parameters to df
I am unable to reference this parameter in groupBy.
Error: DF-AGG-003 - Groupby should reference atleast one column -
MapDrifted1 aggregate(
) ~> Aggregate1,[486 619]
Has anyone tried this scenario? Please help if you have some knowledge on this, or whether it can be handled in a U-SQL script.
We need to first look up your parameter string name from your incoming source data to locate the metadata and assign it.
Just add a Derived Column before your Aggregate and it will work. Call the column 'groupbycol' in your Derived Column and use this formula: byName($group1).
In your Aggregate, select 'groupbycol' as your group-by column.
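The trick is to copy the parameterized column into a fixed name first, then group by that fixed name. A rough pandas analogue of the byName($group1) derived-column step (all names here are invented for illustration):

```python
import pandas as pd

# Hypothetical parameter naming the group-by column, like $group1 in the answer
group_param = "category"

df = pd.DataFrame({
    "category": ["a", "a", "b"],
    "amount":   [1, 2, 3],
})

# Equivalent of the Derived Column trick: materialize the parameterized column
# under a fixed name ('groupbycol'), then group by that fixed name.
df["groupbycol"] = df[group_param]
agg = df.groupby("groupbycol", as_index=False)["amount"].sum()
```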
