Pivot an entire table in Azure Data Factory - Mapping Data Flow

I am new to Azure and am trying to see whether the request below is achievable with Data Factory. I have a CSV file with this sample data:
I have this result:
My expected result:
Which transformations would be helpful to achieve this?
Thanks.

Use the pivot transformation to create multiple columns from the
unique row values of a single column. Pivot is an aggregation
transformation where you select group by columns and generate pivot
columns using aggregate functions.
The pivot transformation requires three different inputs: group by columns, the pivot key, and how to generate the pivoted columns.
Refer to the official Microsoft documentation: Pivot transformation in mapping data flow
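To make the three inputs concrete, here is a minimal plain-Python sketch of what a pivot does. This is not Data Factory code; the `pivot` helper, the `sales` rows, and the column names are hypothetical and only illustrate the roles of the group by columns, the pivot key, and the aggregate function.

```python
from collections import defaultdict

def pivot(rows, group_by, pivot_key, value, aggregate):
    """Turn the distinct values of `pivot_key` into columns, one output row per group."""
    # Collect the values for every (group, pivot key value) pair.
    cells = defaultdict(lambda: defaultdict(list))
    for row in rows:
        group = tuple(row[col] for col in group_by)
        cells[group][row[pivot_key]].append(row[value])

    # Build one output row per group and aggregate each pivoted cell.
    result = []
    for group, by_key in cells.items():
        out = dict(zip(group_by, group))
        for key, values in by_key.items():
            out[key] = aggregate(values)
        result.append(out)
    return result

# Hypothetical sample data (not the asker's CSV).
sales = [
    {"region": "East", "quarter": "Q1", "amount": 10},
    {"region": "East", "quarter": "Q1", "amount": 5},
    {"region": "East", "quarter": "Q2", "amount": 7},
    {"region": "West", "quarter": "Q1", "amount": 3},
]

pivoted = pivot(sales, group_by=["region"], pivot_key="quarter",
                value="amount", aggregate=sum)
print(pivoted)
```

Each distinct `quarter` value becomes its own column, and rows that share a group and a pivot key value are combined by the aggregate function.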

Related

Azure Data Factory: How to implement a nested SQL query in a data flow transformation

I have two streams, customer and customercontact. I am new to Azure Data Factory and want to know which data flow transformations will produce the result of the SQL query below.
(SELECT *
FROM customercontact
WHERE customerid IN
(SELECT customerid
FROM customer)
ORDER BY timestamp DESC
LIMIT 1)
I can use the Exists transformation for the inner query, but I need some help with fetching the first row after sorting the customer contact data. So, basically, I am looking for a way to add a LIMIT/TOP/OFFSET clause in a data flow.
You can reproduce this query in a data flow by combining several transformations.
For sorting, you can use the Sort transformation. There you can select ascending or descending order.
For the top few records, you can use the Rank transformation.
For the “IN” clause, you can use the Exists transformation.
Reference: https://learn.microsoft.com/en-us/azure/data-factory/data-flow-rank
Here is my sample data, with SQL as the source.
I used the Rank transformation.
After the Rank transformation, one more column, RankColumn, was added.
To select only the top record, I used the Filter row modifier with the expression equals(RankColumn,1).
Finally, add a Sink transformation and run the pipeline.
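The logic of these steps can be sketched in plain Python to check what the data flow should return. This is not Data Factory code: the sample rows and the `phone` column are hypothetical, and timestamps are assumed to be distinct ISO-formatted strings so that row numbering and ranking give the same answer.

```python
# Hypothetical sample streams.
customer = [{"customerid": 1}, {"customerid": 2}]
customercontact = [
    {"customerid": 1, "timestamp": "2023-01-05", "phone": "111"},
    {"customerid": 3, "timestamp": "2023-03-01", "phone": "333"},
    {"customerid": 2, "timestamp": "2023-02-10", "phone": "222"},
]

# Exists: keep contact rows whose customerid appears in the customer stream.
known_ids = {c["customerid"] for c in customer}
existing = [row for row in customercontact if row["customerid"] in known_ids]

# Sort + Rank: order by timestamp descending and number the rows from 1.
ordered = sorted(existing, key=lambda row: row["timestamp"], reverse=True)
ranked = [{**row, "RankColumn": rank} for rank, row in enumerate(ordered, start=1)]

# Filter: the equivalent of equals(RankColumn, 1).
top = [row for row in ranked if row["RankColumn"] == 1]
print(top)
```

The row for customerid 3 has the latest timestamp but is dropped by the Exists step, so the filter runs only over contacts that have a matching customer.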

Power Query: how to use a table with multiple values as a parameter with OR logic

I have a question regarding Power Query and tables as parameters in Excel.
Right now I can create a table and use it as a parameter for Power Query via drill down.
But I'm unsure how I would proceed with a table that has multiple values. How can a table with multiple values be used as a parameter?
For example:
I have the following rawdata and parameter tables
Rawdata+parametertables
Now if I wanted to filter on Value2 with the parameter tables, I would drill down into each parameter table and load them to Excel.
After that I have two parameters, and I can filter Value2 with an OR condition on 1 and 2.
Is it possible to somehow combine this into one table that still uses an OR condition to search Value2?
I'm asking because I want to be able to add more and more parameters to the table without creating a new table every time. Basically, I want to copy and paste some parameters into the parameter table and be done with it.
Thanks in advance for any help.
I am assuming you use the parameters only for filtering. There are other ways, but this one looks best from a performance point of view.
You can create a Parameters table, so that you have these tables:
Note that it's handy to have the same name (Value2) for the key column in both tables; otherwise Table.Join will create additional column(s) after merging the tables.
Add a step like this to filter the RawData table:
join = Table.Join(RawData, "Value2", Parameters, "Value2")
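Why an inner join behaves like an OR filter can be seen in a small plain-Python sketch. This is not Power Query code: the `Value1` column, the sample rows, and the `inner_join_filter` helper are hypothetical, and the parameter values are assumed to be distinct (a real inner join would duplicate rows for repeated parameter values).

```python
# Hypothetical stand-ins for the RawData and Parameters tables.
raw_data = [
    {"Value1": "a", "Value2": 1},
    {"Value1": "b", "Value2": 2},
    {"Value1": "c", "Value2": 3},
]
parameters = [{"Value2": 1}, {"Value2": 2}]

def inner_join_filter(rows, params, key):
    """Keep the rows whose key matches any parameter row (Value2 = 1 OR Value2 = 2 ...)."""
    allowed = {p[key] for p in params}
    return [row for row in rows if row[key] in allowed]

filtered = inner_join_filter(raw_data, parameters, "Value2")
print(filtered)

# Adding a value to the parameter table widens the filter without any new query step.
parameters.append({"Value2": 3})
widened = inner_join_filter(raw_data, parameters, "Value2")
```

A row survives if its key matches any row in the parameter table, which is why pasting more values into that table is enough to extend the filter.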

Pivoting based on Row Number in Azure Data Factory - Mapping Data Flow

I am new to Azure and am trying to see if the below result is achievable with data factory / mapping data flow without Databricks.
I have a CSV file with this sample data:
I have the following data in my table:
My expected data/ result:
Which transformations would be helpful to achieve this?
Thanks.
Now that you have the RowNumber column, you can use the Pivot transformation to turn rows into columns.
I ran a test with your sample data, as follows:
My Projection tab is like this:
My DataPreview is like this:
In the Pivot1 transformation, we select the Table_Name and Row_Number columns to group by. If you don't want the Table_Name column, you can remove it here.
On the Pivot key tab, we select the Col_Name column.
Under Pivoted columns, we must select an aggregate function to aggregate the Value column; here I use max().
The result shows:
Please correct me if I have misunderstood your question.
update:
The data source like this:
The result shows that, as you said, ADF sorts the columns alphabetically. There seems to be no way to customize the sort order:
But when the sink runs, it will auto-map the columns into your SQL result table.
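The role of the row number in the grouping can be checked with a small plain-Python sketch. This is not Data Factory code: the sample values are hypothetical, and only the column names Table_Name, Row_Number, Col_Name, and Value come from the answer. The pivoted columns are sorted alphabetically here to mirror the behavior reported above.

```python
# Hypothetical key/value rows; each Row_Number marks one logical record.
rows = [
    {"Table_Name": "T1", "Row_Number": 1, "Col_Name": "name", "Value": "Ann"},
    {"Table_Name": "T1", "Row_Number": 1, "Col_Name": "city", "Value": "Oslo"},
    {"Table_Name": "T1", "Row_Number": 2, "Col_Name": "name", "Value": "Bob"},
    {"Table_Name": "T1", "Row_Number": 2, "Col_Name": "city", "Value": "Rome"},
]

# Pivot key: distinct Col_Name values, in alphabetical order.
pivot_columns = sorted({r["Col_Name"] for r in rows})

# Group by (Table_Name, Row_Number) and aggregate Value with max().
grouped = {}
for r in rows:
    key = (r["Table_Name"], r["Row_Number"])
    cell = grouped.setdefault(key, {})
    current = cell.get(r["Col_Name"])
    cell[r["Col_Name"]] = r["Value"] if current is None else max(current, r["Value"])

result = [
    {"Table_Name": t, "Row_Number": n, **{c: cells.get(c) for c in pivot_columns}}
    for (t, n), cells in grouped.items()
]
print(result)
```

Because Row_Number is part of the group key, each logical record keeps its own output row; max() only ever sees one value per cell, so it acts as a pass-through.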

How do we create a generic mapping data flow in Data Factory that dynamically extracts data from different tables with different schemas?

I am trying to create an Azure Data Factory mapping data flow that is generic for all tables. I am going to pass the table name, the primary column used for joins, and the other columns used in groupBy and aggregate functions as parameters to the data flow.
parameters to df
I am unable to reference this parameter in groupBy:
Error: DF-AGG-003 - Groupby should reference atleast one column -
MapDrifted1 aggregate(
) ~> Aggregate1,[486 619]
Has anyone tried this scenario? Please help if you have some knowledge of this, or if it can be handled in a U-SQL script.
We first need to look up the column named by your parameter string in the incoming source data, so that its metadata can be located and assigned.
Just add a Derived Column before your Aggregate and it will work. Call the column 'groupbycol' in your Derived Column and use this formula: byName($group1).
In your Aggregate, select 'groupbycol' as your group by column.
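The idea of resolving a column from a parameter string before grouping can be sketched in plain Python. This is not data flow expression code: the `aggregate_by_parameter` helper, the `orders` rows, and the summing aggregate are hypothetical; the lookup `row[group_param]` plays the role that byName($group1) plays in the Derived Column.

```python
from collections import defaultdict

def aggregate_by_parameter(rows, group_param, value_param):
    """Group rows by the column whose NAME is passed in as a string parameter."""
    totals = defaultdict(int)
    for row in rows:
        # Resolve the column by name first (the 'groupbycol' derived column) ...
        groupbycol = row[group_param]
        # ... then aggregate on the resolved value.
        totals[groupbycol] += row[value_param]
    return dict(totals)

# Hypothetical source rows.
orders = [
    {"country": "NO", "product": "A", "qty": 2},
    {"country": "NO", "product": "B", "qty": 1},
    {"country": "IT", "product": "A", "qty": 4},
]

by_country = aggregate_by_parameter(orders, "country", "qty")
by_product = aggregate_by_parameter(orders, "product", "qty")
print(by_country, by_product)
```

The same function serves any table, because the grouping column is chosen at run time from the parameter rather than fixed in the definition.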

How to get value from nested relations in Power Pivot?

I'm using the Power Pivot add-in to create a data warehouse for generating dynamic tables and graphs (the data source is strictly Excel), but I have a problem with a calculation across the relationships. My data model is the following:
My Snowflake data warehouse model
So for the fact table "fSales" I need to multiply dCostDetail[Value] by dWorkCost[Value] to generate the fSales[Expenses] amount.
I tried to use RELATED, but I get an error because it does not allow nesting across the relationships, e.g. fSales[Expenses] = related(dCostDetail[Value])*related(dWorkCost[Value])
I also tried the following formula:
fSales[Expenses] = related(dWorkCost[Value]) * Calculate(Calculate(Calculate(Value(dCostDetail[Value]), Userelationship(fSales[IdProduct],dProduct[Sku]),Userelationships(dProduct[IdCateg],dCategory[IdCategory]), Userelationships(dCategory[IdCategory],dCostDetail[IdCateg]))))
And I need this "type" of normalized model to have the details when I analyze the information, e.g. filter, but if you know another way to generate the calculation it would be ok.
RELATED doesn't work in measures, because it evaluates on a record-by-record level. So you're on the right track, but what you need to do is create a column in Power Pivot in the fSales table called "Cost Detail" or whatever, and use a RELATED formula there to pull in that value from the dCostDetail table. Create another column and do the same thing to pull the dWorkCost value into the fSales table.
Then you can do a measure for the expenses like this:
Expenses:=SUM([whatever you called CostDetailColumn])*SUM([whatever you called WorkCostColumn])
You should be able to drop that measure into a pivot and it should do what you're looking for.
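It may help to check what this measure evaluates to over a set of rows. The plain-Python sketch below is not DAX, and the two fSales rows and their column names are hypothetical; it compares the product of the two column totals (what the SUM * SUM measure computes) with the total of the per-row products.

```python
# Hypothetical fSales rows after the two calculated columns have been added.
f_sales = [
    {"CostDetail": 2.0, "WorkCost": 10.0},
    {"CostDetail": 3.0, "WorkCost": 20.0},
]

# What SUM([CostDetail]) * SUM([WorkCost]) evaluates to over these rows.
product_of_sums = (
    sum(r["CostDetail"] for r in f_sales) * sum(r["WorkCost"] for r in f_sales)
)

# Per-row expense first, then the total.
sum_of_products = sum(r["CostDetail"] * r["WorkCost"] for r in f_sales)

print(product_of_sums, sum_of_products)
```

The two numbers match when a pivot cell covers a single row and diverge once it covers several, so it is worth confirming which of the two the Expenses figure is meant to be.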
