Splitting comma-separated values in a column into multiple columns in SnapLogic

I have table data like below.
OTDATA
"ABC,CDE,EDF,123,10/20/2020"
"WDE,RED,ERT,231,09/22/2020"
"ERT,WED,TGY,453,08/10/2020"
I am trying to split it into the following through SnapLogic.
OTDATA,OTDATA,OTDATA,OTDATA,OTDATA
ABC,CDE,EDF,123,10/20/2020
WDE,RED,ERT,231,09/22/2020
ERT,WED,TGY,453,08/10/2020
I have used a Mapper to do $OTDATA.split(',') but I am not achieving the desired output. Can you please suggest a way to do it?

You can use two Mappers one after the other: the first splits the string into an array, and the second maps the elements of the resulting array to their corresponding fields.
Please note that you can't have multiple fields with the same name.
Please refer to the following screenshots.
#1 Mapper that splits the string
#2 Mapper that maps the array elements to corresponding fields
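For reference, here is a sketch of what the two Mapper configurations might look like (the target field names $parts and $OTDATA1 through $OTDATA5 are illustrative, since duplicate field names are not allowed):

Mapper #1 (split the string into an array)
    Expression: $OTDATA.split(',')    Target path: $parts

Mapper #2 (map the array elements to separate fields)
    Expression: $parts[0]    Target path: $OTDATA1
    Expression: $parts[1]    Target path: $OTDATA2
    Expression: $parts[2]    Target path: $OTDATA3
    Expression: $parts[3]    Target path: $OTDATA4
    Expression: $parts[4]    Target path: $OTDATA5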

Related

How to modify dynamic complex data type fields in Azure Data Factory data flow

I have a complex data type (fraudData) that undesirably has hyphen characters in the field names. I need to remove the hyphens or change them to some other character.
The input schema of the complex object looks like:
I have tried using the "Select" and "Derive Column" data flow functions and adding a custom mapping. It seems both functions have the same mapping interface. My current attempt with Select is:
This gets me close to the desired results. I can use the replace expression to convert hyphens to underscores.
The problem here is that this mapping creates new root level columns outside of the fraudData structure. I would like to preserve the hierarchy of the fraudData structure and modify the column names in place.
If I am unable to modify fraudData in place, is there any way I can take the new columns and merge them into another complex data type?
Update: I do not know the fields of the complex data type in advance. This is a schema drift problem, which is why I have tried using the pattern-matching solution. I will not be able to hardcode known sub-column names.
You can rename the sub-columns of a complex data type using the Derived Column transformation and convert them back into a complex data type. I tried this with sample data, and the approach is below.
A sample complex data type column with two sub-fields is taken, as shown in the image below.
img:1 source data preview
In the Derived Column transformation, for the column fraudData, the expression is given as
#(fraudData_1_chn=fraudData.{fraudData-1-chn},
fraudData_2_chn=fraudData.{fraudData-2-chn})
img:2 Derived column settings
This expression renames the subfields and nests them under the parent column fraudData.
img:3 Transformed data- Fields are renamed.
Update: To rename sub-columns dynamically
You can use the expression below to rename all the fields under the root column fraudData.
#(each(fraudData, match(true()), replace($$,'-','_') = $$))
This will replace - with _ in every field name that contains it (for example, fraudData-1-chn becomes fraudData_1_chn).
You can also use a pattern match in the expression.
#(each(fraudData, patternMatch(`fraudData-.+` ), replace($$,'-','_') = $$))
This expression will take only the fields matching the pattern fraudData-.+ and replace - with _ in those field names.
Reference:
Microsoft documentation on the script for hierarchical definitions in data flow.
Microsoft documentation on building schemas using the Derived Column transformation.

Using contains function in Azure Data Factory Dataflow expression builder

I am using Azure Data Factory with a data flow, and I want to split my file into two based on a condition. I am attaching an image with 2 lines; the first one is working, but I want to use a more programmatic approach to achieve the same output:
I have a column named indicator inside my dataset, and I want to use the contains functionality to split the data: basically one file where the string value inside the indicator column has the substring Weekly, and another where it does not.
Similar to what I would use in pandas:
df1 = df[df.indicator.str.contains('Weekly')]
df2 = df[~df.indicator.str.contains('Weekly')]
You can also try the expression below in the Conditional Split.
contains() expects an array, so first split the column content to create the array and pass it to the contains() function.
contains(split(indicator, ' '),#item=='weekly')
This is my sample data.
Conditional split:
Weekly data in the output:
Remaining data:
If you are looking for the existence of a value inside a string scalar column, use instr().
https://learn.microsoft.com/en-us/azure/data-factory/data-flow-expressions-usage#instr
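For example, the Conditional Split condition for the "Weekly" stream could be written as below (a sketch; instr() returns the 1-based position of the substring, or 0 when it is not found, so any non-zero result means the substring is present):
instr(indicator, 'Weekly') > 0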

How to convert recordset.field to a string

I am currently attempting to compare certain values in a column from a query in Access to a vector of strings to look for a match between any two values.
I used recordset.fields("column1") to access specific records from my desired column, but it seems like I am unable to get matches since the values are of different data types.
How do I convert the records from recordset.fields("column1") into a string?
Thanks!
If you are working in VBA, wrap your value in the CStr() function, which returns the value converted to a string.
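A minimal sketch of what that could look like in Access VBA (the recordset source, field name, and candidates array are illustrative, not from the original question):

Dim rs As DAO.Recordset
Dim candidates As Variant
Dim i As Long

candidates = Array("ABC", "XYZ")   ' the "vector" of strings to match against
Set rs = CurrentDb.OpenRecordset("SELECT column1 FROM MyQuery")
Do Until rs.EOF
    For i = LBound(candidates) To UBound(candidates)
        ' CStr makes the comparison string-to-string; Nz guards against Null,
        ' which CStr cannot convert
        If CStr(Nz(rs.Fields("column1").Value, "")) = candidates(i) Then
            Debug.Print "Match found: " & candidates(i)
        End If
    Next i
    rs.MoveNext
Loop
rs.Close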

Excel array formula to multi-dimensional intermediate arrays

I have the following data
What I want to achieve is these intermediate arrays
i.e. for John there are two Project entries, A and C. For Project A there are two tags, 10 and 6, in the 2nd table, and for Project C there are three tags, 22, 9, 7. I want to get these as intermediate arrays in a bigger array formula.
With array formula
=UNIQUE(IF({"D"}=D2:D9,ROW(D2:D9)))
I can achieve something close for Julia (Project D and tags 4, 8), but for anyone who has two projects it doesn't work.
=UNIQUE(IF({"A","C"}=D2:D9,ROW(D2:D9)))
I want to have intermediate results in the form for arrays to be used in other array formula.
EDIT: The complex version of my problem.
OK, for the bigger problem. It's a bit complex to explain, but I can try. I have a mapping table, a truth table, and input data. I want to find out whether the input data can be mapped using the truth table and the mapping table, so it is a bit of a mapping I want to achieve at the end. The mapping table itself is many-to-many, where one project can have multiple tags and one tag can be associated with multiple projects, but there is no Person in there; purely projects and tags. The truth table has Person and Projects but not the tags. I want to map from the unique person to a tag. I can easily get all the projects associated with one person (in an array formula, as an intermediate array), and now for each project in that array I want to find all the tags and consolidate them into one array. The problem lies in the many-to-many relationship in the mapping table, as it returns a multidimensional array as a result, not a single dimension. I want to get an array (intermediate, inside an array formula) to finally search whether a claimed tag is in that list. I hope it makes sense.
The following array formula can be used to join USER with a TAG:
{=AGGREGATE(15,6,INDEX($E$2:$E$9,N(IF(MMULT(--IFERROR(TRANSPOSE(INDEX($B$2:$B$6,N(IF($G2=$A$2:$A$6,ROW($B$2:$B$6)))-1))=$D$2:$D$9,FALSE),ROW($B$2:$B$6)^0),ROW($D$2:$D$9)))-1),COLUMNS($G$2:G2))}
I don't know where the formula will be used next, but to display the result I put it into the AGGREGATE function. Without AGGREGATE, the array is returned including error values.
"Clean" formula:
{=INDEX($E$2:$E$9,N(IF(MMULT(--IFERROR(TRANSPOSE(INDEX($B$2:$B$6,N(IF($G2=$A$2:$A$6,ROW($B$2:$B$6)))-1))=$D$2:$D$9,FALSE),ROW($B$2:$B$6)^0),ROW($D$2:$D$9)))-1)}
The "array result without error values" can be displayed by highlighting that part of the formula inside the formula bar and pressing the F9 key.
1] "Project" array result, formula in H2 copied down:
=INDEX(B$2:B$6,N(IF(1,AGGREGATE(15,6,ROW(B$1:B$5)/($A$2:$A$6=G2),ROW(INDIRECT("1:"&COUNTIF(A$2:A$6,G2)))))))
2] "Tag" array result, in I2 formula copied down :
=INDEX(E$2:E$9,N(IF(1,AGGREGATE(15,6,ROW(E$1:E$8)/ISNUMBER(MATCH(D$2:D$9,IF(A$2:A$6=G2,B$2:B$6),0)),ROW(INDIRECT("1:"&COUNT(MATCH(D$2:D$9,IF(A$2:A$6=G2,B$2:B$6),0))))))))

How to eval a field name contained in another field in an Access Query?

I need to create a long list of complex strings, containing the data of different fields in different places to create explanatory reports.
The only way I have conceived of, in Access 2010, is to save the text parts in a table, together with the field names to be used to compose the string to be shown (see the line1 expression in the figure). Briefly:
// field A contains a string with a field name:
A = "[Quantity]"
// query expression:
=EVAL(A)
// returns an error instead of the number contained in field [Quantity], which is present in the query dataset
I thought of doing an EVAL on field (A) to obtain the value of the field (B) whose name is contained in field A, but it does not seem to work.
Does any way exist?
Example (very simplified):
Sample query that EVALs a field containing other field names to obtain the value of those fields
Any idea?
PS: Sorry for my English, it is not my mother tongue.
I found an interesting workaround in another forum.
Other people had the same problem using EVAL, but found that it is possible to substitute a field's contents into a string using the REPLACE function.
REPLACE("The value of field Quantity is {Quantity}";"{Quantity}";[Quantity])
(The {} braces are used only for clarity; they are not needed if one knows that the words to be substituted do not otherwise appear in the string.) Using this code in a query, and nesting as many REPLACE calls as there are different fields one wants to use:
REPLACE(REPLACE("<Salutation> <Name>";"<Salutation>";[Salutation]);"<Name>";[Name])
it is possible to embed field names in a string and substitute them with the current value of each field in a query. Of course the latter example could be done more simply with concatenation (&), but if the string is contained in a field instead of being hardcoded, it can be linked to records as needed.
REPLACE(REPLACE([DescriptiveString];"[Salutation]";[Salutation]);"[Name]";[Name])
Moreover, it is possible to create context-based complex strings such as:
REPLACE(REPLACE(REPLACE("{Salutation} {Name} {MaidenName}";"{Salutation}";[Salutation]);"{Name}";[Name]);"{MaidenName}";IIF(Isnull([MaidenName]);"";[MaidenName]))
The hard part is to enumerate, in the REPLACE calls, all the field placeholders one wants to insert in the string (like {Quantity}, {Salutation}, {Name}, {MaidenName}), whereas with EVAL one would avoid this tedious part, if only it worked.
Not as neat as I would like, but it works.
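If the list of placeholders grows, some of that enumeration can be moved into a small VBA helper that takes placeholder/value pairs and loops over them (a sketch; the function name ExpandTemplate is hypothetical, not part of the original workaround):

' Hypothetical helper: replaces each {Name} placeholder in the template
' with the value supplied immediately after it in the argument list.
Public Function ExpandTemplate(template As Variant, ParamArray pairs() As Variant) As String
    Dim result As String
    Dim i As Long
    result = Nz(template, "")
    For i = LBound(pairs) To UBound(pairs) - 1 Step 2
        ' Nz turns Null field values into empty strings before Replace
        result = Replace(result, "{" & pairs(i) & "}", Nz(pairs(i + 1), ""))
    Next i
    ExpandTemplate = result
End Function

In a query it would be called as, for example, ExpandTemplate([DescriptiveString]; "Salutation"; [Salutation]; "Name"; [Name]): the fields still have to be listed once, but the nested REPLACE calls disappear.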
