I have a huge table in Power Query with text cells that consist of multiple 'x's and 'z's. I want to deduplicate the characters so that each cell contains at most one x and one z.
For example:
xzzzxxxzxz -> xz
zzzzzzzzzz -> z
The table is very big, so I don't want to create additional columns. Can you please help?
You can convert the string to a list of characters, make the list distinct (remove duplicates), sort (if desired), and then transform back to text.
= Table.TransformColumns(#"Previous Step", {{"ColumnName",
each Text.Combine( List.Sort( List.Distinct( Text.ToList(_) ) ) ),
type text}})
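For readers who want to check the logic outside Power Query, here is a minimal Python sketch of the same split, distinct, sort, combine pipeline. The function name dedupe_chars is made up for illustration; it is not part of the M answer above.

```python
def dedupe_chars(text: str) -> str:
    """Return the distinct characters of `text`, sorted and joined back into a string."""
    # set() plays the role of List.Distinct, sorted() of List.Sort,
    # and "".join() of Text.Combine.
    return "".join(sorted(set(text)))


print(dedupe_chars("xzzzxxxzxz"))  # xz
print(dedupe_chars("zzzzzzzzzz"))  # z
```

As in the M version, the sort step is what makes the result independent of the order in which the characters first appear.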
I have a table with array columns all_available_tags and used_tags.
Example row 1:
all_available_tags: A,B,C,D
used_tags: A,B
Example row 2:
all_available_tags: B,C,D,E,F
used_tags: F
I want to get the distinct set of all_available_tags across all rows and subtract (EXCEPT) the set of all used_tags across all rows. In the example above, all_available_tags across all rows would be A,B,C,D,E,F and all used_tags would be A,B,F, so the end result I am looking for is C,D,E.
I think I need to somehow pivot the table, but there could be hundreds of different tags, so it is not practical to list every one of them. Is there a good way to do this?
You can try:
with tags(at, ut) as
(
select "A,B,C,D", "A,B"
union all
select "B,C,D,E,F", "F"
)
select splitat
from tags
cross join unnest(split(at, ",")) as t1 splitat
except
select splitut
from tags
cross join unnest(split(ut, ",")) as t2 splitut
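The query boils down to a set difference between two flattened columns. A small Python sketch of that idea, using the two example rows from the question (the function name unused_tags and the dictionary layout are assumptions for illustration only):

```python
# The two example rows from the question, with the tag columns as lists.
rows = [
    {"all_available_tags": ["A", "B", "C", "D"], "used_tags": ["A", "B"]},
    {"all_available_tags": ["B", "C", "D", "E", "F"], "used_tags": ["F"]},
]


def unused_tags(rows):
    """Flatten both columns across all rows, then subtract used from available."""
    available = set()
    used = set()
    for row in rows:
        available.update(row["all_available_tags"])  # like UNNEST + DISTINCT
        used.update(row["used_tags"])
    return sorted(available - used)  # like EXCEPT


print(unused_tags(rows))  # ['C', 'D', 'E']
```

No pivot is needed because the tags are never turned into columns; they stay as values in a set.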
I have a column with the following text:
PLEASANT AVENUE
PATTERSON DRIVE
I would like to separate the road type ("Avenue", "Drive", etc.) from the address (or road name, like "Pleasant" or "Patterson").
I need to end up with Col1 as the street name and Col2 as the type, as follows:
col1      | col2
PLEASANT  | AVENUE
PATTERSON | DRIVE
How can I do this?
Select the column that contains the text, then go to Data => Text to Columns => Delimited => Space.
Adding on to @nicolaesse's post, you could first replace text to create a better delimiter, so that street names that themselves contain spaces are not split apart.
For example: replace "AVENUE" with "; AVENUE", and then use the delimiter " ; " to split into columns.
P.S. Do this for "DRIVE" too.
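The idea behind the custom delimiter is to split only in front of a known road type. A rough Python sketch of that rule follows; the ROAD_TYPES tuple, the split_road name and the multi-word street in the usage lines are hypothetical and not taken from the posts above.

```python
# Hypothetical list of road types; extend it to match your data.
ROAD_TYPES = ("AVENUE", "DRIVE")


def split_road(address: str) -> tuple[str, str]:
    """Split off the last word if it is a known road type; otherwise keep the text whole."""
    name, sep, last = address.rpartition(" ")
    if sep and last.upper() in ROAD_TYPES:
        return name, last
    return address, ""


print(split_road("PLEASANT AVENUE"))           # ('PLEASANT', 'AVENUE')
print(split_road("MARTIN LUTHER KING DRIVE"))  # ('MARTIN LUTHER KING', 'DRIVE')
```

Splitting on every space would break the second address into four columns, which is why a dedicated delimiter in front of the road type is the safer choice.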
I have a table like this that holds product specifications.
The first column is the SKU and serves as the ID. The second column holds the specifications: each specification consists of a title, a value, and an optional parameter of 0 or 1 (1 is the default if it is missing), separated by "~", and the specifications are separated from each other by "^".
I want to split it into a table with the SKU and one column per specification title, with the specification's value as the cell value.
I managed to write the code below, which splits the specifications into separate columns and returns the rows as records. I am stuck on separating the title from the value for each specification and record, and am looking for help with this.
let
    Source = Excel.CurrentWorkbook(){[Name="Таблица1"]}[Content],
    Type = Table.TransformColumnTypes(Source,{{"Part Number", type text}, {"Specifications", type text}}),
    #"Replaced Value" = Table.ReplaceValue(Type,"Specification##","",Replacer.ReplaceText,{"Specifications"}),
    SplitByDelimiter = (table, column, delimiter) =>
        let
            Count = List.Count(List.Select(Text.ToList(Table.Column(table, column){0}), each _ = delimiter)) + 1,
            Names = List.Transform(List.Numbers(1, Count), each column & "." & Text.From(_)),
            Types = List.Transform(Names, each {_, type text}),
            Split = Table.SplitColumn(table, column, Splitter.SplitTextByDelimiter(delimiter), Names),
            Typed = Table.TransformColumnTypes(Split, Types)
        in
            Typed,
    Split = SplitByDelimiter(#"Replaced Value","Specifications","^"),
    Record = Table.ToRecords(Split)
in
    Record
OK, I hope you still need this, as it took the whole evening. :))
Quite an interesting task, I must say!
I assume that "~1" is always followed by "^", so "~1^" always ends a field's value. I also assume that there are no ":" characters in the values, as all colons are removed.
IMO, you don't need to use the Table.SplitColumn function at all.
let
    //replace it with your Excel.CurrentWorkbook(){[Name="Таблица1"]}[Content],
    Source = #table(type table [Part Number = Int64.Type, Specifications = text], {{104, "Model:~104~1^Type:~Watch~1^Metal Type~Steel~1"}, {105, "Model:~105~1^Type:~Watch~1^Metal Type~Titanium~1^Gem Type~Ruby~1"}}),
    //I don't know why you replace these values; do you really need this?
    ReplacedValue = Table.ReplaceValue(Source,"Specification##","",Replacer.ReplaceText,{"Specifications"}),
    //split on "~1^" into "title~value" pairs, drop empty items, strip colons, then split each pair on "~"
    TransformToLists = Table.TransformColumns(ReplacedValue, {"Specifications", each List.Transform(List.Select(Text.Split(_ & "^", "~1^"), each _ <> ""), each Text.Split(Text.Replace(_, ":", ""), "~"))}),
    //turn each list of {title, value} pairs into a one-row table with the titles as headers
    ConvertToTable = Table.TransformColumns(TransformToLists, {"Specifications", each Table.PromoteHeaders(Table.FromColumns(_))}),
    //expand the nested table of every row; the result is a list of one-row tables
    ExpandedSpecs = Table.TransformRows(ConvertToTable, (x) => Table.ExpandTableColumn(Table.FromRecords({x}), "Specifications", Table.ColumnNames(x[Specifications]))),
    UnionTables = Table.Combine(ExpandedSpecs),
    Output = UnionTables
in
    Output
UPDATE:
How it works (skipping obvious steps):
TransformToLists: Table.TransformColumns takes a table and a list of column names paired with the functions to apply to each column's values. Here it applies several nested functions to the "Specifications" value of each row. Text.Split first splits the value (with "^" appended) on "~1^"; List.Select keeps only the non-empty items; and List.Transform then splits each remaining item on "~" after removing its ":" characters:
Text.Split(
Text.Replace(_, ":", "")
, "~")
The each keyword means that the expression that follows is applied to every processed value (which can be a field, column, row/record, list item, text, function, etc.); that value is denoted by the underscore. The each form is shorthand for an explicit function with a named parameter:
each some_function(_) equals (x) => some_function(x)
So,
each Text.Replace(_, ":", "")
equals
(x) => Text.Replace(x, ":", "") //where x is any variable name.
//You can go further and make it typed, although generally it is not needed:
(x as text) => Text.Replace(x, ":", "")
//You can also write a custom function and use it:
(x as text) => CustomFunctionName(x)
That said, the TransformToLists step returns a table with two columns: "Part Number" and "Specifications", the latter containing a list of lists. Each of these inner lists has two values: a column name and its value. This happens because the initial value in the "Specifications" field has to be split twice: first it is split into pairs on "~1^", and then each pair is split on "~". So now we have a column name and its value in each nested list, and we have to convert them all into a single table.
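To make the two-level split concrete, here is a small Python sketch that mirrors the nested M functions on one of the sample values. The name parse_specs is made up for illustration; the delimiters and the sample string come from the code above.

```python
def parse_specs(spec: str) -> list[list[str]]:
    """Split a specification string into [title, value] pairs."""
    # First split: append "^" so the last field also ends in "~1^",
    # split on "~1^", and drop the empty trailing item (List.Select).
    pairs = [p for p in (spec + "^").split("~1^") if p != ""]
    # Second split: remove colons, then split each pair on "~" (List.Transform).
    return [p.replace(":", "").split("~") for p in pairs]


print(parse_specs("Model:~104~1^Type:~Watch~1^Metal Type~Steel~1"))
# [['Model', '104'], ['Type', 'Watch'], ['Metal Type', 'Steel']]
```

Each inner list is one title/value pair, which is the shape the next step turns into a one-row table.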
ConvertToTable: We apply Table.TransformColumns again, using a function on each row's "Specifications" field (remember, a list of lists). We use Table.FromColumns because it takes a list of lists as an argument and treats each inner list as a column, so it returns a table whose first row holds the column headers and whose second row holds their values. Then we promote the first row to headers. Now we have a table whose "Specifications" field contains a nested table with a variable number of columns, and we have to put them all together.
ExpandedSpecs: Table.TransformRows applies a transformation function to every row of a table, passing the row as a record (denoted x in the code). You can write your own custom function, as I did:
= Table.ExpandTableColumn( //this function expands nested table. It needs a table, but "x" that we have is a record. So we do the conversion:
Table.FromRecords({x}) //function takes a list of records, so {x} means a list with single value of x
, "Specifications" //Column to expand
, Table.ColumnNames(x[Specifications]) //3rd argument is a list of resulting columns. It takes table as an argument, and table is contained within "Specifications" field.
)
It returns a list of tables (each with a single row), and we combine them using Table.Combine at the UnionTables step. This results in a table that has all the columns from the combined tables, with nulls where a table does not have a given column.
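The null-filling behaviour of the final combine step can be sketched in Python with plain dictionaries. The combine function below is a hypothetical stand-in for Table.Combine, written only to show the effect; the two rows correspond to the sample data in the query.

```python
def combine(rows: list[dict]) -> list[dict]:
    """Union the keys of all rows (in first-seen order) and fill gaps with None."""
    columns: list[str] = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    return [{name: row.get(name) for name in columns} for row in rows]


rows = [
    {"Part Number": 104, "Model": "104", "Type": "Watch", "Metal Type": "Steel"},
    {"Part Number": 105, "Model": "105", "Type": "Watch", "Metal Type": "Titanium", "Gem Type": "Ruby"},
]
combined = combine(rows)
print(list(combined[0]))        # ['Part Number', 'Model', 'Type', 'Metal Type', 'Gem Type']
print(combined[0]["Gem Type"])  # None
```

Part 104 has no gem, so its "Gem Type" cell is empty, just as the combined table in Power Query shows null there.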
Hope it helps. :)
A VBA solution using TextToColumns is much simpler, if I understand what you are asking. See the MSDN documentation for Range.TextToColumns.
I have a spreadsheet looking like this:
あう to meet
青 あお blue
青い あおい blue
Is there a way to convert the data in these columns into three SQL statements that I could then use to enter the data into a database? I am not so much concerned with how to form the statement; I would like to know whether there is a way to take data in columns and turn it into a script.
If I could convert this into:
col1: 青い col2: あおい col3: blue
I could add to and modify this to get the correct format:
INSERT INTO JLPT (col1, col2, col3) VALUES ('青い', 'あおい', 'blue')
etc
Use the formula
="('"&A1&"', '"&B1&"', '"&C1&"'), "
in column D and copy the formula down for all the rows. Then prepend
Insert into JLPT (col1, col2, col3) values
and you are done. The end result will be something like this:
Insert into JLPT (col1, col2, col3) values
('青', 'あお', 'blue'), 
('青い', 'あおい', 'blue'), 
Just don't forget to delete the last comma (and optionally exchange it for a semicolon) when you copy over the data from Excel.
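If the data can be exported from the spreadsheet, the same statement can also be built by a short script. This is only a sketch under assumptions: the function names sql_literal and build_insert are invented here, values are escaped by doubling single quotes (the standard SQL convention), and the table and column names are inserted as-is without any escaping.

```python
def sql_literal(value: str) -> str:
    """Quote a text value for SQL, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_insert(table: str, columns: list[str], rows: list[tuple]) -> str:
    """Build one multi-row INSERT statement; identifiers are trusted, not escaped."""
    values = ",\n".join(
        "(" + ", ".join(sql_literal(v) for v in row) + ")" for row in rows
    )
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES\n{values};"


rows = [("青", "あお", "blue"), ("青い", "あおい", "blue")]
print(build_insert("JLPT", ["col1", "col2", "col3"], rows))
# INSERT INTO JLPT (col1, col2, col3) VALUES
# ('青', 'あお', 'blue'),
# ('青い', 'あおい', 'blue');
```

Joining the rows with a separator avoids the trailing comma that has to be removed by hand with the formula approach. When the script talks to the database directly, parameterized queries are usually the safer option than building SQL text.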