Can we get a row-wise average of multiple columns in an ADF Data Flow?

Please refer to the picture; I need to do the same thing, but using a data flow.

Create a Derived Column transformation; it lets you add a new column in a data flow. Give the column whatever name you want, and in the expression use:
(col1+col2+col3)/3
Output:
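For intuition, here is a minimal Python sketch of what that derived-column expression computes per row (the column names col1..col3 are the placeholders from the expression above, not real columns):

```python
# Row-wise average, mirroring the Data Flow expression (col1+col2+col3)/3.
# Column names are the placeholders from the expression above.
rows = [
    {"col1": 10, "col2": 20, "col3": 30},
    {"col1": 1,  "col2": 2,  "col3": 3},
]
for r in rows:
    r["average"] = (r["col1"] + r["col2"] + r["col3"]) / 3

print([r["average"] for r in rows])  # -> [20.0, 2.0]
```

Note that, unlike this sketch, the Data Flow expression will return NULL for a row if any of the three columns is NULL, so you may want to wrap the columns in a null-handling function first.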

Related

Data Flow - Window Transformation - NTILE Expression

I'm attempting to assign quartiles to a numeric source data range as it transits a data flow.
I gather that this can be accomplished by using the ntile expression within a window transform.
So far I've had no success following the documentation provided here.
This is just a basic attempt to understand the implementation before using it in a real application. I have a numeric value in my source dataset, and I want the values within the range to be spread across 4 buckets and labelled as such.
Thanks in advance for any assistance with this.
In the Window transformation of the Data Flow, configure the settings with the source's numeric column in the “Sort” tab as shown below:
Next, in the Window columns tab, create a new column and use the expression “nTile(4)” to create 4 buckets:
In the Data Preview we can see that the data is spread across 4 Buckets:
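As a sanity check on what nTile(4) produces, here is a hedged Python sketch of the standard tiling rule (rows are bucketed by sorted order, and earlier buckets receive the extra rows when the count is not evenly divisible). This illustrates the behaviour; it is not ADF's implementation:

```python
def ntile(values, n):
    """Assign each value to one of n buckets by sorted order, the way a
    window nTile(n) does; earlier buckets get the extra rows."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    buckets = [0] * len(values)
    size, extra = divmod(len(values), n)
    pos = 0
    for b in range(1, n + 1):
        for _ in range(size + (1 if b <= extra else 0)):
            buckets[order[pos]] = b
            pos += 1
    return buckets

print(ntile([5, 1, 9, 3], 4))  # -> [3, 1, 4, 2]
```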

Excel: difference between pivot value columns in a multi-level table

My raw table looks like this.
When I create a pivot it looks like the screenshot below. I have filtered the Type column, excluding Actual and the others. Now I want to subtract these two columns and add just one more column. Some cells may show 0 here because I created dummy data, sorry for that.
Can somebody please help?
You will need to create a calculated field/column.
There is a simple guide here: How to create a Calculated Field in a Pivot Table.
Because your data set is incomplete, I'm unable to replicate it and provide results.
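For illustration, here is a small Python sketch of what such a calculated field computes, using made-up item and type names: the raw rows are pivoted by type, then one value column is subtracted from the other:

```python
from collections import defaultdict

# Made-up raw data: pivot value by (item, type), then subtract the
# two pivoted type columns, as the calculated field would.
raw = [
    {"item": "A", "type": "plan",     "value": 10},
    {"item": "A", "type": "forecast", "value": 7},
    {"item": "B", "type": "plan",     "value": 5},
    {"item": "B", "type": "forecast", "value": 5},
]
pivot = defaultdict(lambda: defaultdict(float))
for r in raw:
    pivot[r["item"]][r["type"]] += r["value"]

# The "calculated field": plan minus forecast per row label.
diff = {item: vals["plan"] - vals["forecast"] for item, vals in pivot.items()}
print(diff)  # -> {'A': 3.0, 'B': 0.0}
```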
Please mark answer as 'accepted' if you have found this response meets your requirements.

Azure Data Factory DataFlow exclude 1 column from expression columns()

I'm looking for a solution for the following problem.
I've created the following expression in a Derived Column in Azure Data Factory DataFlow
md5(concatWS("||", toString(columns())))
But I want to exclude one column from columns() in the expression above,
so something like md5(concatWS("||", toString(columns()-'PrimaryKey'))). I cannot exclude the primary-key column with a Select in front of the Derived Column because I need it at a later stage.
In Databricks I'm executing the following, but I want to achieve the same thing in ADF:
non_key_columns = [column for column in dfsourcechanges.columns if column not in key_columns]
Are there any suggestions for how I can solve this?
You can try the byNames() function. Create an array containing all your column names except 'PrimaryKey', then pass it to byNames() as the first parameter. Something like this expression: md5(concatWS("||", toString(byNames(['yourColumn1','yourColumn2',...]))))
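For intuition, here is a hedged Python equivalent of that hashing pattern (hashlib's md5 stands in for the Data Flow's md5/concatWS, and the row and column names are hypothetical):

```python
import hashlib

row = {"PrimaryKey": 1, "colA": "x", "colB": "y"}      # hypothetical row
non_key = [c for c in row if c != "PrimaryKey"]        # mirrors the Databricks list comprehension
payload = "||".join(str(row[c]) for c in non_key)      # concatWS("||", toString(...))
row_hash = hashlib.md5(payload.encode()).hexdigest()   # md5(...)
print(payload)  # -> x||y
```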

How do we create a generic mapping dataflow in datafactory that will dynamically extract data from different tables with different schema?

I am trying to create an Azure Data Factory mapping data flow that is generic for all tables. I am going to pass the table name, the primary column (for join purposes) and the other columns to be used in groupBy and aggregate functions as parameters to the data flow.
parameters to df
I am unable to reference this parameter in groupBy.
Error: DF-AGG-003 - Groupby should reference atleast one column -
MapDrifted1 aggregate(
) ~> Aggregate1,[486 619]
Has anyone tried this scenario? Please help if you have some knowledge of this, or of whether it can be handled in a U-SQL script.
You need to first look up your parameter's string name in the incoming source data to locate the column metadata and assign it.
Just add a Derived Column before your Aggregate and it will work. Call the column 'groupbycol' in your Derived Column and use this formula: byName($group1).
In your Aggregate, select 'groupbycol' as your group-by column.
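A minimal Python sketch of that pattern, with a made-up parameter value and column names: resolve the column named by the parameter into a concrete 'groupbycol' first, then group on it:

```python
group_param = "category"  # hypothetical value of $group1
rows = [
    {"category": "a", "qty": 1},
    {"category": "b", "qty": 2},
    {"category": "a", "qty": 3},
]
# Derived Column step: groupbycol = byName($group1)
for r in rows:
    r["groupbycol"] = r[group_param]
# Aggregate step: group by 'groupbycol' and sum qty
totals = {}
for r in rows:
    totals[r["groupbycol"]] = totals.get(r["groupbycol"], 0) + r["qty"]
print(totals)  # -> {'a': 4, 'b': 2}
```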

Adding Extraction DateTime in Azure Data Factory

I want to write a generic Data Factory pipeline in V2 for the scenario below.
Source ---> Extract (from Salesforce or some other way), which doesn't have an extraction timestamp ---> write it to Blob with an extraction timestamp.
I want it to be generic, so I don't want to give column mapping anywhere.
Is there any way to use an expression or system variable in a Custom activity to append a column to the output dataset? I'd like a very simple solution to keep the implementation realistic.
To do that, you should add the column you need in the query, using the query property of the pipeline's copy activity: https://learn.microsoft.com/en-us/azure/data-factory/connector-salesforce#copy-activity-properties
I don't know much about Salesforce, but in SQL Server you can do the following:
SELECT *, CURRENT_TIMESTAMP as AddedTimeStamp from [schema].[table]
This will give you every field in your table and add a column named AddedTimeStamp with the CURRENT_TIMESTAMP value in every row of the result.
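If you can't change the query, the same idea can be sketched generically in code: stamp every record after extraction without any column mapping. This is a hedged illustration (the column name and records are made up), not how ADF itself does it:

```python
from datetime import datetime, timezone

def stamp(records, column="AddedTimeStamp"):
    """Append one extraction-timestamp column to every record, schema-agnostic:
    no per-table column mapping is needed."""
    now = datetime.now(timezone.utc).isoformat()
    for rec in records:
        rec[column] = now
    return records

rows = stamp([{"id": 1}, {"id": 2, "name": "x"}])
```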
Hope this helped!
