Filter container on column name - blueprism

Below is the COLLECTION (DATA) format:
Column A   Column B   Column C
AAAA       1234       54
AAAA       5678       56
AAAA       1234       46
I need to loop through the container and sum up Column C for rows with a matching Column B.
The filter on the container is not working:
"[DATA] = COLUMN B"
I also cannot get the filter condition to work on text, e.g.:
"[DATA] = '2234.'"
I also tried it without double quotes:
[DATA] = COLUMN B
Is there a way to filter on a column name?

Reading between the lines of your post, I guess that you're having problems with filtering a collection. An example of filtering can be found below:
Object: Utility - Collection Manipulation
Action: Filter Collection
Input:
Collection_in: [Data]
Filter: "[Column B] = '1234'"
If you would like to filter the collection by a variable, it gets a little complicated, with all the single and double quotation marks and the concatenation of strings.
Imagine the filtering value is stored in a data item [Filter].
To filter the collection [Data] by the [Filter] data item you need to use the following filter:
Filter: "[Column B] = '" & [Filter] & "'"
For example, if [Filter] holds the text 1234, the expression above evaluates to the filter string [Column B] = '1234'.

Related

PowerQuery/M: How can I combine text from multiple rows into one row

I have a table in PowerQuery like this:
#   KeyA   KeyB   Comment     Value
1   A1     B1     Comment 1   1
2   A1     B1     Comment 2   3
3   A2     B2     Comment 3   6
How can I combine it so that rows with the same entries in columns KeyA and KeyB (e.g. #1 & #2 in the example) concatenate the text in column Comment and sum the values in column Value, i.e. resulting in this:
A    B    Comments               Sum
A1   B1   Comment 1, Comment 2   4
A2   B2   Comment 3              6
Table.Group is doing exactly what is required!
Here's the solution to the problem:
let
    Source = Table.FromRecords({
        [KeyA="A1", KeyB="B1", Comment="Comment 1", Value=1],
        [KeyA="A1", KeyB="B1", Comment="Comment 2", Value=2],
        [KeyA="A2", KeyB="B2", Comment="...", Value=3]
    }),
    Grouped = Table.Group(
        Source,
        {"KeyA", "KeyB"},
        {
            {"Comments", (t) => Text.Combine(t[Comment], ", ")}, // <- Magic happens here!
            {"Sum", (t) => List.Sum(t[Value])},
            {"RecordCount", each Table.RowCount(_), Int64.Type}
        }
    )
in
    Grouped
Explanation:
The function requires 3 parameters:
the source data
a list of columns that should be used for grouping
a list of functions that produce the new aggregated column(s)
The last parameter is the key and is where the function's power comes in: internally, PowerQuery calls each of these functions with a table as parameter that contains all the rows of one group. What you do with it is completely up to you, i.e. you can aggregate it using List.Sum, List.Average, ... - but you can also do any other modification of the table.
The entry {"Comments", (t) => Text.Combine(t[Comment], ", ")} creates a new column Comments. (t) => ... is the function definition. This function gets called for each group - with a table containing all rows of that group. t[Comment] extracts the Comment column from the table as a list - which can then be passed to Text.Combine to concatenate the values.
How to get started from the PowerQuery editor
If you have the Source table in the PowerQuery editor, select columns KeyA and KeyB and click Group By in the Transform ribbon. This will scaffold the Table.Group query with a record count. Use the Advanced Editor for further modifications.

Find a value in column of lists in pandas dataframe

I have a dataframe with two columns, A and B; B is a column of lists and A is a string. I want to search for a value in column B and get the corresponding value from column A. For example:
category zones
0 category_1 [zn_1, zn_2]
1 category_2 [zn_3]
2 category_3 [zn_4]
3 category_4 [zn_5, zn_6]
If the input = 'zn_1', how can I get a response back as 'category_1'?
Use str.contains and filter category values
inputvalue = 'zn_1'
df[df.zones.str.contains(inputvalue)]['category']

# If you don't want a Series back, take the first matching value:
list(df[df.zones.str.contains(inputvalue)]['category'].values)[0]
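Note that .str.contains matches substrings of string values; if the zones column holds actual Python lists (as the sample output suggests), the .str accessor will not match the list elements. A minimal sketch of an alternative, assuming zones holds lists of strings:
import pandas as pd

# Rebuild the sample frame from the question, with zones as actual Python lists.
df = pd.DataFrame({
    "category": ["category_1", "category_2", "category_3", "category_4"],
    "zones": [["zn_1", "zn_2"], ["zn_3"], ["zn_4"], ["zn_5", "zn_6"]],
})

inputvalue = 'zn_1'
match = df[df.zones.apply(lambda zs: inputvalue in zs)]['category']   # membership test per row
print(match.iloc[0] if not match.empty else None)   # -> 'category_1'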

DAX LOOKUPVALUE on text/string

I would like to create some kind of LOOKUPVALUE on text in DAX that will match a sentence with a particular keyword. For instance, in the example below the second and third rows are hits because "Apple" and "Chicken" are in the string. The problem is that the text is inside a string and not a standalone value.
(Table 1, Table 2 and the desired Output were shown as images in the original post.)
EDIT, improved answer: this new version also works when there are multiple keys in one string.
I think PowerQuery is the natural place to perform an operation like this.
The Output table ends up with one row per String, with the matched Keys and Items concatenated.
A description of the applied steps:
Source: a reference to Table1
Added Column Key lists: adds a custom column with lists of the Table2[Key] value(s) that are in the [String] value. This is the logic for this Custom column:
For each row the function selects the values from the Table2[Key] column that it finds in the [String] value. It then returns a list that holds only the selected values.
Expanded Key list: expands the lists in the [Key] column
Join with Table2 on Key: Joins with Table2 on the Key Value
Expanded Table2: Expands the table values in the [ItemTables] column and keeps the [Item] column
Group and concate keys/items: Groups the Output table on String, concatenating the Keys and the Items. If you don't want to see the [Key] column, delete {"Key", each Text.Combine([Key], " | "), type text}, from this step
The script in the Advanced Editor looks like this:
let
    Source = #"Table1",
    #"Added Column Key lists" = Table.AddColumn(Source, "Key", (r) => List.Select(Table.Column(Table2, "Key"), each Text.Contains(r[String], _, Comparer.OrdinalIgnoreCase)), type text),
    #"Expanded Key lists" = Table.ExpandListColumn(#"Added Column Key lists", "Key"),
    #"Join with Table2 on Key" = Table.NestedJoin(#"Expanded Key lists", {"Key"}, Table2, {"Key"}, "ItemTables", JoinKind.LeftOuter),
    #"Expanded ItemTables" = Table.ExpandTableColumn(#"Join with Table2 on Key", "ItemTables", {"Item"}, {"Item"}),
    #"Group and concate keys / items" = Table.Group(#"Expanded ItemTables", {"String"}, {{"Key", each Text.Combine([Key], " | "), type text}, {"Item", each Text.Combine([Item], " | "), type text}})
in
    #"Group and concate keys / items"
Here is a link to my .pbix file
I created a couple of dummy data sets to test with.
My interpretation of what you're after is to identify whether a sentence contains a keyword.
This can be done via a calculated column with the following formula:
Lookup = LOOKUPVALUE(Table2[Result],Table2[LookUp], IF(SEARCH("Apple",Table1[Sentence],,0)> 0, "Apple",""))
You can combine the IF and SEARCH functions with the LOOKUPVALUE function.
SEARCH looks for the word "Apple" and returns its position within the text; if no result is found, it returns 0.
The IF statement then takes any result greater than 0 (anything greater than 0 means a match was found, and the number is its position within the string) and returns "Apple". This then becomes your lookup value.
This then displays as below.
You can then replace the blank ("") that is currently the result if false with another IF statement that looks for another keyword, such as "Orange", and add that to your lookup table to pull through the result you're after.
Hope this makes sense and helps!
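Just to spell out the logic independently of the DAX syntax, here is the same search-then-lookup idea as a plain Python sketch (the keyword/result pairs are hypothetical stand-ins for Table2[LookUp] and Table2[Result]):
# Hypothetical stand-in for Table2: keyword -> result.
table2 = {"Apple": "Fruit", "Chicken": "Meat"}

def lookup_item(sentence):
    # Return the result for the first keyword found in the sentence, else "".
    for keyword, result in table2.items():
        if keyword.lower() in sentence.lower():
            return result
    return ""

print(lookup_item("We had chicken for dinner"))   # -> "Meat"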
Try this formula (see the picture for which cell is where in my assumed layout):
=IFERROR(INDEX($B$7:$B$9,MATCH(1,--NOT(ISERROR(FIND($A$7:$A$9,$A12))),0)),"-")

Table column split by value to other columns and it's values

I have a table that looks like this, and it represents specifications for products,
where the 1st column is the SKU and serves as the ID, and the 2nd column holds the specifications: a specification title, a value, and 0 or 1 as an optional parameter (1 is the default if it is missing), separated by "~", with each specification separated by "^".
I want to split it into a table with the SKU and each specification title as a column header and its value as the cell value.
I managed to write this code to split it into records with the specifications divided, but I am stuck on separating the title from the value for each specification and record, and am looking for help with this:
let
    Source = Excel.CurrentWorkbook(){[Name="Таблица1"]}[Content],
    Type = Table.TransformColumnTypes(Source, {{"Part Number", type text}, {"Specifications", type text}}),
    #"Replaced Value" = Table.ReplaceValue(Type, "Specification##", "", Replacer.ReplaceText, {"Specifications"}),
    SplitByDelimiter = (table, column, delimiter) =>
        let
            Count = List.Count(List.Select(Text.ToList(Table.Column(table, column){0}), each _ = delimiter)) + 1,
            Names = List.Transform(List.Numbers(1, Count), each column & "." & Text.From(_)),
            Types = List.Transform(Names, each {_, type text}),
            Split = Table.SplitColumn(table, column, Splitter.SplitTextByDelimiter(delimiter), Names),
            Typed = Table.TransformColumnTypes(Split, Types)
        in
            Typed,
    Split = SplitByDelimiter(#"Replaced Value", "Specifications", "^"),
    Record = Table.ToRecords(Split)
in
    Record
Ok, I hope you still need this, as it took the whole evening. :))
Quite interesting task I must say!
I assume that "~1" is always combined with "^", so "~1^" always ends a field's value. I also assume that there are no ":" characters in the values, as all colons are removed.
IMO, you don't need to use Table.SplitColumn function at all.
let
    // Replace this with your Excel.CurrentWorkbook(){[Name="Таблица1"]}[Content]
    Source = #table(type table [Part Number = Int64.Type, Specifications = text], {{104, "Model:~104~1^Type:~Watch~1^Metal Type~Steel~1"}, {105, "Model:~105~1^Type:~Watch~1^Metal Type~Titanium~1^Gem Type~Ruby~1"}}),
    // I don't know why you replace these values, do you really need this?
    ReplacedValue = Table.ReplaceValue(Source, "Specification##", "", Replacer.ReplaceText, {"Specifications"}),
    TransformToLists = Table.TransformColumns(Source, {"Specifications", each List.Transform(List.Select(Text.Split(_ & "^", "~1^"), each _ <> ""), each Text.Split(Text.Replace(_, ":", ""), "~"))}),
    ConvertToTable = Table.TransformColumns(TransformToLists, {"Specifications", each Table.PromoteHeaders(Table.FromColumns(_))}),
    ExpandedSpecs = Table.TransformRows(ConvertToTable, (x) => Table.ExpandTableColumn(Table.FromRecords({x}), "Specifications", Table.ColumnNames(x[Specifications]))),
    UnionTables = Table.Combine(ExpandedSpecs),
    Output = UnionTables
in
    Output
UPDATE:
How it works (skipping obvious steps):
TransformToLists: TransformColumns takes a table and a list of column names paired with functions that are applied to that column's values. So it applies several nested functions to the value of the "Specifications" field of each row. These functions do the following: Text.Split breaks the field into pieces, List.Select keeps only the non-empty ones, and each remaining piece then has its ":"s removed and is split again:
Text.Split(
    Text.Replace(_, ":", ""),
    "~")
The each keyword means that the following expression is applied to every processed value (which can be a field, column, row/record, list item, text, function, etc.), and that value is referred to with the underscore sign. The underscore form can be rewritten as an explicit function:
each _ equals (x) => some_function_that_returns_corresponding_value(x)
So,
each Text.Replace(_, ":", "")
equals
(x) => Text.Replace(x, ":", "") //where x is any variable name.
//You can go further and make it typed, although generally it is not needed:
(x as text) => Text.Replace(x, ":", "")
//You can also write a custom function and use it:
(x as text) => CustomFunctionName(x)
That said, the TransformToLists step returns a table with two columns: "Part Number" and "Specifications", the latter containing a list of lists. Each of these lists has two values: a column name and its value. This happens because the initial value in the "Specifications" field has to be split twice: first it is split into pairs by "~1^", and then each pair is split by "~". So now we have a column name and its value in each nested list, and we have to convert them all into a single table.
ConvertToTable: We apply TransformColumns again, using a function on each row's "Specifications" field (remember, a list of lists). We use Table.FromColumns, as it takes a list of lists as an argument and returns a table in which the 1st row holds the column names and the 2nd row their values. Then we promote the 1st row to headers. Now we have a "Specifications" field containing a nested table with a variable number of columns, and we have to put them all together.
ExpandedSpecs: Table.TransformRows applies a transformation function to every row (as a record) of a table (in the code the record is named x). You can write your own custom function, as I did:
= Table.ExpandTableColumn(                   // This function expands the nested table. It needs a table, but the "x" we have is a record, so we do the conversion:
    Table.FromRecords({x}),                  // Table.FromRecords takes a list of records, so {x} means a list with the single value x
    "Specifications",                        // the column to expand
    Table.ColumnNames(x[Specifications])     // the 3rd argument is a list of the resulting columns; Table.ColumnNames takes the table contained in the "Specifications" field
)
It returns a list of tables (having single row each), and we combine them using Table.Combine at UnionTables step. This results in a table having all the columns from combined tables, with nulls when there is no such a column in some of them.
Hope it helps. :)
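For comparison only, here is a rough Python sketch of the same parsing logic under the same assumption (every specification ends with "~1^" and colons can be dropped); it is not part of the M solution:
rows = [
    (104, "Model:~104~1^Type:~Watch~1^Metal Type~Steel~1"),
    (105, "Model:~105~1^Type:~Watch~1^Metal Type~Titanium~1^Gem Type~Ruby~1"),
]

def parse_specs(spec):
    specs = {}
    # Split into "Title~Value" pairs on the "~1^" terminator, then split each pair on "~".
    for pair in (spec + "^").split("~1^"):
        if not pair:
            continue
        title, value = pair.replace(":", "").split("~", 1)
        specs[title] = value
    return specs

table = [{"Part Number": pn, **parse_specs(spec)} for pn, spec in rows]
for record in table:
    print(record)   # one dict per SKU, with a key per specification title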
A TextToColumns VBA solution is much simpler, if I understand what you are asking. See MSDN for Range.TextToColumns.

intersect cassandra rows

We have a Cassandra column family.
Each row has multiple columns. The columns have names, but the values are empty.
If we have 5-10 row keys, how can we find the column names that appear in all of these rows?
e.g.
row1: php, programming, accounting
row2: php, bookkeeping, accounting
row3: php, accounting
must return:
result: php, accounting
Note that we cannot easily load a whole row into memory, because it may contain 1M+ columns.
The solution does not need to be fast.
In order to intersect several rows, we need to intersect two of them first, then intersect the result with the third, and so on.
It looks like in Cassandra we can query the data by column names, and this is a relatively fast operation.
So we first get a column slice (say 10k columns) of the first row and make a list of the column names (in PHP Cassa, put them in an array). Then we select those names from the second row.
The code may look like this:
$x = $cf->get($first_key, <some column slice>);
$column_names = array();
foreach (array_keys($x) as $k)
    $column_names[] = $k;
$result = $cf->get($second_key, $column_slice = null, $column_names);
// write the result somewhere, and proceed with the next slice
Your column names are sorted, so you can create an iterator for each row (each iterator loads a portion of data at a time, for example 10k columns). Now put each iterator into a priority queue, keyed by its next column name. If you pop the same column name from the queue k times in a row (where k is the number of rows), it is a name common to all rows; otherwise, advance the iterators you popped to their next element and return them to the queue.
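A minimal sketch of that merge idea in Python, assuming each row's column names can be read as a sorted iterator (for example via paged slice queries); it is not tied to any particular Cassandra client:
import heapq

def intersect_sorted(iterators):
    # Yield the names that appear in every iterator (all inputs sorted, no duplicates within a row).
    k = len(iterators)
    heap = []
    for i, it in enumerate(iterators):
        name = next(it, None)
        if name is None:
            return                          # an empty row means an empty intersection
        heapq.heappush(heap, (name, i))
    while True:
        smallest = heap[0][0]
        indices = []
        while heap and heap[0][0] == smallest:
            indices.append(heapq.heappop(heap)[1])
        if len(indices) == k:               # the smallest name was at the top k times: common to all rows
            yield smallest
        for i in indices:                   # advance the iterators we popped and push them back
            name = next(iterators[i], None)
            if name is None:
                return                      # that row is exhausted, nothing more can be common
            heapq.heappush(heap, (name, i))

rows = [
    iter(sorted(["php", "programming", "accounting"])),
    iter(sorted(["php", "bookkeeping", "accounting"])),
    iter(sorted(["php", "accounting"])),
]
print(list(intersect_sorted(rows)))         # ['accounting', 'php']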
You could use a Hadoop map/reduce job as follows:
Map output key = column name
Map output value = row key
Reducer counts row keys for each column and outputs column name & count to a CF with the following schema:
key : [column name] {
Count : [count]
}
You can then query counts from this CF in reverse order. The first record will be the max, so you can keep iterating until a value is < max. This will be your intersection.
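To illustrate the counting idea without a Hadoop cluster, here is a small in-process Python sketch: count how many rows each column name appears in, then keep the names whose count equals the number of rows:
from collections import Counter

rows = {
    "row1": ["php", "programming", "accounting"],
    "row2": ["php", "bookkeeping", "accounting"],
    "row3": ["php", "accounting"],
}

# "Map" phase: emit one (column name, row key) pair per column; "Reduce" phase: count per column name.
counts = Counter(column for columns in rows.values() for column in columns)

# Column names whose count equals the number of rows appear in every row.
intersection = [column for column, count in counts.items() if count == len(rows)]
print(intersection)   # ['php', 'accounting']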
