Can we select columns which start with a particular string in ADF Dataflow? - azure

I have a dataset and need to select the columns whose names start with "Rain" using a data flow.
Is there a way to do that?

You can achieve this with a Select transformation. The following demonstrates how.
The following columns are used as the source.
After adding the source, add a Select transformation. In it, remove all the mapped columns, then click Add Mapping -> Rule-based Mapping.
In the rule-based mapping, use a condition that matches the column names starting with Rain. The rule is startsWith(name,'Rain') and the output column name is $$ (which keeps the same name as the source column).
Inspect the output to confirm that only the columns whose names start with Rain are selected.
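To make the rule's logic concrete, here is a minimal Python sketch of what a prefix-based column selection does. It is only an analogy for the startsWith(name,'Rain') rule, not ADF code; the column names and the helper select_by_prefix are hypothetical, and only the "Rain" prefix comes from the question.

```python
# Hypothetical source columns; only the "Rain" prefix comes from the question.
source_columns = ["Date", "RainToday", "RainTomorrow", "Temp", "Rainfall"]


def select_by_prefix(columns, prefix):
    """Keep the columns whose name starts with `prefix`.

    Each kept column retains its original name, which mirrors using $$
    as the output column name in the rule-based mapping.
    """
    return [name for name in columns if name.startswith(prefix)]


selected = select_by_prefix(source_columns, "Rain")
print(selected)  # ['RainToday', 'RainTomorrow', 'Rainfall']
```

Note that Python's str.startswith is case-sensitive, so a column named "rainfall" would not be selected by this sketch.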

Related

Couldn't convert to number when expanding a table in Power Query

I have a very annoying problem when I try to merge two tables in Power Query (Excel). I use one column to match records from both tables, and when I try to expand the second table the following message pops up:
DataFormat.Error: We couldn't convert to Number.
Details:
ECS
I have no idea how to fix this. The matched columns contain text, not numbers. There are no errors when I import the data. Can anyone help?
Try the following:
Delete the step #"Changed Type" in both queries
Make sure that the two merged columns have the same type (text ABC, in your case)
When you create a query from a table, Power Query tries to guess the type of each column (based on the first 200 rows). The value "ECS" shown in the error details is probably in a different column (not one of the two merged columns) that has the Number (123) type. That column likely holds mixed values, either because of the source data itself or because of a delimiter issue.
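The failure mode described above can be illustrated with a small Python sketch. It is an analogy only and does not reproduce Power Query's actual type detection: the functions guess_type and convert, the sample size parameter, and the sample column are all assumptions made for illustration. The point is that a type guessed from the first rows can fail on a later text value such as "ECS".

```python
def guess_type(values, sample_size=200):
    """Guess "number" if every sampled value parses as a float, else "text"."""
    for value in values[:sample_size]:
        try:
            float(value)
        except ValueError:
            return "text"
    return "number"


def convert(values, kind):
    """Apply the guessed type to the whole column, not just the sample."""
    if kind == "number":
        return [float(value) for value in values]
    return list(values)


# Hypothetical mixed column: 200 numeric strings followed by one text value.
column = ["1"] * 200 + ["ECS"]
kind = guess_type(column)  # the sample contains only numbers

try:
    convert(column, kind)
    conversion_failed = False
except ValueError:
    # The value outside the sample cannot be converted to a number.
    conversion_failed = True

print(kind, conversion_failed)  # number True
```

Removing the automatic type step, as the answer suggests, corresponds to skipping the guess and leaving the column as text.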

Power Query nested table

I have an XML file that I need to process. My output looks like this:
The nested table contains only one value, but I can't figure out how to ungroup it.
Add a column and insert the following M (replacing channel.item.ht with whatever your column name actually is).
if [channel.item.ht] is table then Record.ToList([channel.item.ht]{0}){0} else [channel.item.ht]
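For readers less familiar with M, the same idea can be sketched in Python. This is an analogy under stated assumptions: the nested table is modelled as a list of dict rows, and the helper unwrap and the sample values are hypothetical.

```python
def unwrap(cell):
    """Return the first field of the first row if `cell` is a nested table.

    A nested table is modelled here as a non-empty list of dict rows.
    Anything else (including an empty list) is returned unchanged.
    """
    if isinstance(cell, list) and cell:
        first_row = cell[0]
        return next(iter(first_row.values()))
    return cell


# Hypothetical cells: one nested single-value table and one plain value.
print(unwrap([{"ht": "42"}]))  # 42
print(unwrap("17"))            # 17
```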

ADF Mapping Data Flow - Dirty / Label Replacement / Sink cache and derived column

I am trying to build an ADF mapping data flow for generically adding labels: its purpose is to look up the value in a particular column and replace it with a label. I already have my dataset that looks like this (Table B):
The goal is to replace the values with their labels. Since my label mapping dataset is in a Cached Sink (Table B), I thought that I could use a Derived Column transformation, along with cached lookups, to find the clean value, using the current column name and the current (dirty) value as keys. I wrote a rule-based mapping expression to get just the columns that need cleaning:
I tested the derived column transformation using "Each column that matches": libCached#lookup(name).Column_Name
This part lets me pick out the column names that need to be replaced by a label, and that works fine.
I need help with the replacement itself. I tried several formulas and it still doesn't work; I don't know whether it's achievable or not.
Thanks a lot.
To replace the actual values in the derived column, you'll need to use the lookup formula with the key that you've set in the cached sink so that ADF can match on that value. The screenshot you posted only shows that you are checking for null; it does not actually return the lookup value.

Azure Data Factory DataFlow exclude 1 column from expression columns()

I'm looking for a solution for the following problem.
I've created the following expression in a Derived Column in Azure Data Factory DataFlow
md5(concatWS("||", toString(columns())))
But I want to exclude one column from columns() in the above expression.
So something like md5(concatWS("||", toString(columns()-'PrimaryKey'))). I cannot exclude the primary key column with a Select before the Derived Column because I need it at a later stage.
In Databricks I'm executing the following, but I want to achieve the same in ADF:
non_key_columns = [column for column in dfsourcechanges.columns if column not in key_columns]
Are there any suggestions for how I can solve this?
You can try using the byNames function for this. Create an array containing all your column names except 'PrimaryKey', then pass it to byNames as the first parameter. Something like this expression: md5(concatWS("||", toString(byNames(['yourColumn1','yourColumn2',...]))))
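As a point of comparison, the intended result (an MD5 hash over every column except the key) can be sketched in plain Python. This is not ADF code and makes no claim about how concatWS treats nulls; the function row_hash and the sample row are hypothetical, and only the "||" separator and the 'PrimaryKey' name come from the question.

```python
import hashlib


def row_hash(row, key_columns, separator="||"):
    """MD5 of the non-key column values joined with `separator`.

    `row` is a dict of column name to value; columns listed in
    `key_columns` are left out of the hash but stay in the row.
    """
    non_key_columns = [column for column in row if column not in key_columns]
    joined = separator.join(str(row[column]) for column in non_key_columns)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# Hypothetical rows that differ only in the primary key.
row_a = {"PrimaryKey": 1, "Name": "a", "City": "b"}
row_b = {"PrimaryKey": 2, "Name": "a", "City": "b"}

# The key is excluded, so both rows produce the same hash.
print(row_hash(row_a, ["PrimaryKey"]) == row_hash(row_b, ["PrimaryKey"]))  # True
```

Because the key is only skipped while hashing, it remains available in the row for later stages, which is the constraint stated in the question.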

Excel 2010 - filter pivot table by pattern

In Excel 2010 I am trying to analyze some data from an external analysis service.
In a pivot table I am trying to filter the report by one field which has multiple values separated by a comma. These look like this:
AB, CD1, EF1-5
AB, CD1,3, EF1
BCD, EFG
EXG, HIJ, CD1
...
So as you can see, there are hundreds of values in any possible order and no fixed scheme.
What I'm trying to achieve is to select all the fields which have a key beginning with an E (EF1-5, EF1, EFG, EXG, ...) and those beginning with H. These literals are never part of another key, so I could imagine using wildcards and creating a filter pattern like
*E* OR *H*
contains(E) || contains(H)
or something equivalent.
Is there any way to do this?
With best regards,
Babbage
edit: I've tried selecting some of the Keys manually by deselecting all and searching e.g. for EF1-5 to select them. But even then there are more than 10000 Keys with EF1-5 at different locations. So I can't even select them all. The plan was to create two or more pivot tables and merge the results.
Go to the drop-down filter in the pivot table, click Select Multiple Items and deselect All. Then, in the search box of the same filter, type *e* and tick "Add current selection to filter". Repeat for *h*.
Do not forget to tick "Add current selection to filter" in both cases.
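If the data can also be processed outside the pivot table, the same selection can be sketched in Python. This is a minimal illustration, assuming keys are separated by a comma followed by a space (so "CD1,3" stays one key, as in the sample values). The helper has_key_with_prefix is hypothetical, and the last row is made up to show a non-matching case.

```python
def has_key_with_prefix(field, prefixes=("E", "H")):
    """True if any key in the comma-separated field starts with a prefix."""
    keys = field.split(", ")
    return any(key.startswith(prefixes) for key in keys)


rows = [
    "AB, CD1, EF1-5",
    "AB, CD1,3, EF1",
    "BCD, EFG",
    "EXG, HIJ, CD1",
    "AB, CD1",  # hypothetical row without an E or H key
]

matching = [row for row in rows if has_key_with_prefix(row)]
print(matching)
# ['AB, CD1, EF1-5', 'AB, CD1,3, EF1', 'BCD, EFG', 'EXG, HIJ, CD1']
```

Unlike the *e* wildcard, which matches the letter anywhere in the value, this checks the start of each key, so it does not rely on E and H never appearing inside other keys.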
