Azure Data Flow Flatten and Parsing key/value column - azure

I'm trying to transform key/value data into columns using Azure Data Flow. Basically, turning this:
{"key":"rate1","value":"123"}-{"key":"rate2","value":"456"}
into this:
key   | value
rate1 | 123
rate2 | 456
I was following this example ( Flatten and Parsing Json using Azure Data Flow ), and everything looked good until I tried to use Parse.
The output only shows the value column, not the key, and I don't know why. Below are my data flow settings.
Source query: https://i.stack.imgur.com/6Q8Xb.png
Source Data preview: https://i.stack.imgur.com/UNj8x.png
Derived Column: https://i.stack.imgur.com/C0g1N.png
Derived Column Data preview: https://i.stack.imgur.com/vtVY7.png
Flatten: https://i.stack.imgur.com/Bkp7P.png
Flatten Data preview: https://i.stack.imgur.com/yM6h1.png
Parse: https://i.stack.imgur.com/RUJpr.png
Parse Data preview: https://i.stack.imgur.com/RC42Y.png
Anyone have any idea what I'm missing?
Edit: My source is Snowflake
Thanks in advance!

I reproduced the above and got the same result after the Parse transformation.
The above process is correct; maybe the preview just isn't displaying correctly. You can view the desired result as individual columns by using a Derived Column transformation after Parse, as sketched below.
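For illustration, a minimal sketch of those Derived Column expressions, assuming the Parse output column is named jsonData with the output type (key as string, value as string) (the column name is my assumption, not taken from your screenshots):

key   : jsonData.key
value : jsonData.value

Dot notation on the parsed complex column pulls each field out into its own top-level column, which the data preview and sink mapping can then show directly.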
In the sink, select the desired columns via Mapping -> deselect Auto mapping -> + -> Fixed mapping.
Sink Data preview:

Related

How do you filter for a string (not) containing a substring in an Azure Data Factory data flow expression?

Coming from an Alteryx background, it has been a slow process for me to get up to speed with the expressions and syntax of Azure Data Factory data flows. I am trying to filter out rows containing a particular string, in the same manner as the Alteryx filter code below:
!Contains([Subtype], "News")
After scrolling through all the string expressions in Azure Data Factory, I am struggling to find anything similar to the logic above. Thanks in advance for any help you can provide me on this front!
You can use the Filter transformation in an ADF data flow and give a condition on any column, like below:
My Sample Data:
Here I am filtering out the rows that contain the string "Rakesh" in the Name column, using the data flow expression instr(Name,"Rakesh")==0.
instr() returns the position of the substring within the string, or 0 if it is not found, so the condition is satisfied (and the row is kept) only when Name does not contain "Rakesh".
Filter Transformation:
Output in the Data preview of the filter:
You can see that only the remaining rows appear in the result.
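Applied back to the original question, the equivalent Filter condition would be the following minimal sketch, using the same instr() approach with the Subtype column from the Alteryx example:

instr(Subtype, "News") == 0

Rows where "News" appears anywhere in Subtype get a non-zero position from instr() and are therefore excluded by the filter.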

How to query data from sub-columns of a column in log analytics using Kusto

We have a table in Azure Log Analytics that has nested (multi-level) data in the Properties column.
We would like to extract the data from the nested part as individual columns.
Is there any way to do that?
Our data looks like the below, where inside the Properties column the data is nested within multiple brackets.
We are able to extract the non-nested data from the Properties column using the extend operator, such as:

Resource | Workspace
Azure    | test
But we want to extract the values that are in the subcolumns as well, such as:

ws    | env  | value1            | value2            | value3
azure | test | "alpha"=1,"mse"=2 | "alpha"=0,"mse"=1 | "alpha"=2,"mse"=2
You would have to use scalar functions like parse_json and tabular operators like mv-expand. Check this old thread for a sample.
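As a rough illustration only (the table name MyTable and the exact shape of the nested JSON are assumptions on my part, since only the sample values are visible), the query could look something like:

// Parse the Properties column, expand the nested array into one row per
// element, then pull the nested fields out as real columns.
MyTable
| extend props = parse_json(Properties)
| mv-expand item = props
| extend alpha = toint(item.alpha), mse = toint(item.mse)
| project Resource, Workspace, alpha, mse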

Not able to change datatype of Additional Column in Copy Activity - Azure Data Factory

I am facing a very simple problem: I am not able to change the datatype of an additional column in a Copy activity in an ADF pipeline from String to Datetime.
I am trying to change the source datatype for the additional column in the mapping using JSON, but it still doesn't work with the PolyBase command.
When I run my pipeline it gives the same error.
Is it not possible to change the datatype of an additional column? By default it is treated as a string.
Dynamic (additional) columns return strings.
Try putting the value (e.g. utcnow()) in the dynamic content of the query and casting it to the required target datatype.
Otherwise you can use a data flow derived column:
https://learn.microsoft.com/en-us/azure/data-factory/data-flow-derived-column
Since your source is a query, you can choose to bring the current date into the source SQL query itself, already in the desired format, rather than adding it as an additional column.
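For example, assuming an Azure SQL / SQL Server source (an assumption on my part, since the source type isn't stated), something like the sketch below would return the load date already formatted; dbo.SourceTable and the 'yyyy-dd-MM' format are placeholders:

-- Hypothetical source query that emits the formatted load date as a column
SELECT *,
       FORMAT(GETUTCDATE(), 'yyyy-dd-MM') AS LoadDate
FROM dbo.SourceTable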
Thanks
Try using formatDateTime as shown below and define the desired date format:
Here, since the format given is 'yyyy-dd-MM', the result will look as below:
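For reference, the dynamic content for the additional column's value would look roughly like this (the additional column's name itself is whatever you choose):

@formatDateTime(utcnow(), 'yyyy-dd-MM')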
Note: the output here will still be a string, as the Copy activity cannot cast the additional column's data type to date.
We can either create the current date in the source SQL query or use the above approach, so that the data loads into the sink in the expected format.

Pivoting based on Row Number in Azure Data Factory - Mapping Data Flow

I am new to Azure and am trying to see if the result below is achievable with Data Factory / mapping data flow, without Databricks.
I have a CSV file with this sample data:
I have the following data in my table:
My expected data/result:
Which transformations would be helpful to achieve this?
Thanks.
Now that you have the RowNumber column, you can use the Pivot transformation to do row-to-column pivoting.
I used your sample data to make a test as follows:
My Projection tab is like this:
My DataPreview is like this:
In the Pivot1 transformation, we select the Table_Name and Row_Number columns to group by. If you don't want the Table_Name column, you can delete it here.
On the Pivot key tab, we select the Col_Name column.
Under Pivoted columns, we must select an aggregate function to aggregate the Value column; here I use max().
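Put together, the data flow script generated for this pivot would look roughly like the sketch below (an approximation only; the exact script syntax and your stream/column names may differ):

source1 pivot(groupBy(Table_Name, Row_Number),
    pivotBy(Col_Name),
    {} = max(Value)) ~> Pivot1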
The result shows:
Please correct me if I have understood you wrong in the answer.
Update:
The data source looks like this:
The result shows that, as you said, ADF sorts the columns alphabetically. There seems to be no way to customize the sorting:
But when the sink activity runs, it will auto-map the columns into your SQL result table.

ADFv2 trouble with column mapping (reposting)

I have a source .csv with 21 columns and a destination table with 25 columns.
Not ALL columns within the source have a home in the destination table and not all columns in the destination table come from the source.
I cannot get my CopyData task to let me pick and choose how I want the mapping to be. The only way I can get it to work so far is to load the source data to a "holding" table that has a 1:1 mapping and then execute a stored procedure to insert data from that table into the final destination.
I've tried altering the schemas on both the source and destination to match but it still errors out because the ACTUAL source has more columns than the destination or vice versa.
This can't possibly be the most efficient way to accomplish this but I'm at a loss as to how to make it work.
Yes I have tried the user interface, yes I have tried the column schemas, no I can't modify the source file and shouldn't need to.
The error code that is returned is some variation on:
"errorCode": "2200",
"message": "ErrorCode=UserErrorInvalidColumnMappingColumnCountMismatch,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Invalid column mapping provided to copy activity: '{LONG LIST OF COLUMN MAPPING HERE}', Detailed message: Different column count between target structure and column mapping. Target column count:25, Column mapping count:16. Check column mapping in table definition.,Source=Microsoft.DataTransfer.Common,'",
"failureType": "UserError",
"target": "LoadPrimaryOwner"
Tim F. Please review the statements in Schema mapping in copy activity:
Column mapping supports mapping all or a subset of columns in the source
dataset "structure" to all columns in the sink dataset "structure".
The following are error conditions that result in an exception:
1. Source data store query result does not have a column name that is specified in the input dataset "structure" section.
2. Sink data store (if with pre-defined schema) does not have a column name that is specified in the output dataset "structure" section.
3. Either fewer columns or more columns in the "structure" of sink dataset than specified in the mapping.
4. Duplicate mapping.
So, you can see that every column in the sink dataset needs to be mapped. Since you can't change the destination, maybe you shouldn't struggle against an unsupported feature.
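For reference, an explicit mapping in the copy activity's translator looks roughly like the sketch below (the column names are hypothetical). Per statement 3 above, the mapping has to cover every column in the sink dataset's "structure", which is why a 16-column mapping against a 25-column target fails:

"translator": {
    "type": "TabularTranslator",
    "columnMappings": "SourceCol1: SinkCol1, SourceCol2: SinkCol2"
}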
Of course, you could use the stored procedure approach mentioned in your description. That's a perfectly good workaround and not very troublesome. For the usage details, you could refer to my previous cases:
1. Azure Data Factory activity copy: Evaluate column in sink table with #pipeline().TriggerTime
2. Azure Data factory copy activity failed mapping strings (from csv) to Azure SQL table sink uniqueidentifier field
In addition, if you really don't want to use the above workaround, you could submit feedback to the ADF team about your desired feature.
