i am trying to use a if formula in power query. For example, if the Column contains “guy” then value is male and the false value is “female”. I tried different ways and I can’t find the right formula to use in power query. Can anyone help me please?
If you are entering this into the Add Custom Column dialog, something like (for case-insensitive):
= if Text.Contains([Column],"guy", Comparer.OrdinalIgnoreCase) then "male" else "female"
It sounds like you are wanting to conditionally replace values in a column, based on the existing values in that column? If so, you can use the Table.ReplaceValue function in Power Query:
= Table.ReplaceValue(Source,
each [Gender],
each if [Gender] = "guy" then "male" else "female",
Replacer.ReplaceText,{"Gender"})
That will change all values of "guy" to "male', and ALL other values to "female", as you stated.
You can also leave values in place that don't meet the criteria, by simply referencing the column name instead of a specifying a new value:
= Table.ReplaceValue(Source,
each [Gender],
each if [Gender] = "guy" then "male" else [Gender],
Replacer.ReplaceText,{"Gender"})
Create a table with a column called Gender, and load it to power query. Right-click on the column header and choose Replace Values to get the UI to build your statement for you, then replace the generated code with the above modification(s) and apply to your actual requirements. The key is using the each expression to tell Power Query to test at the row value level. If you omit each, you'll see the error:
"Expression.Error: There is an unknown identifier. Did you use the [field] shorthand for a _[field] outside of an 'each' expression?"
= Table.AddColumn(#"Changed Type","ColumnName",each if[Column] ="""Guy""" then"""Male""" else"""Female""")
Related
I am facing an issue where I need to compare column X and column Y, if X=Y then I want to delete that row. But if X≠Y then just leave it there as I need to correct it manually. I try to find any reference but to no avail.
Example of Table
I try using PowerQuery, because the name list were scattered, after sorting up to X=Y, there are some data that wasnt right because it is comparing to almost identical name. I try to use 'remove duplicate' but nothing happened as it only remove if the column has the same data in multiple row.
Thanks in advance.
For another method that does not involve a helper column, you can use the Table.SelectRows function of Power Query:
let
//sample data
Source = Table.FromRecords(
{[x="Johnny White", y="Johnny White"],
[x= "Black Mmamba", y= "Black Mamba"],
[x="Tom Evans", y="Tom Evans"],
[x="Britney Blue",y="Britney Blue"],
[x="White Kingdom", y="Wine Kingdom"],
[x="Daniel Zack", y="Daniel Zack"]},
type table[x=Text.Type,y=Text.Type]),
//select rows where data not the same in each column
remDupes = Table.SelectRows(Source, each [x] <> [y])
in
remDupes
Source
Dupes Removed
Hi I am trying to change to write VBA for excel to clean up data elements that has extra information without impacting the other elements.
I am writing VBA for the first time my table is in the middle of the sheet.
Given Table and Requested Output.
I think your question was not clear in regard to the "steps" that you want to perform on your data (i.e. the exact logic or transformation that needs to be applied).
Based purely on your images and your comment, I make the "steps" to be:
Split any customer IDs in column valueC into multiple rows.
If column valueC does not contain customer IDs (i.e. is blank or contains non-customer ID text), leave it untouched.
My answer uses Power Query instead of VBA. If you are interested in trying it out, in Excel try clicking Data > Get Data > From Other Sources > Blank Query, then click Advanced Editor near the top-left, copy-paste the code below, then click Done.
You might need to change the name of the table in the first line of the code (below), as it was "Table1" for me, but I imagine yours is named something else. Also, the code below is case-sensitive. So if there is no column named exactly valueC, then you will get an error.
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
fxProcessSomeText = (textToProcess as any) =>
let
canBeSplit = Text.StartsWith(textToProcess, "### customer id"),
result = if textToProcess is null then null else if canBeSplit then Text.Split(Text.BetweenDelimiters(textToProcess, "### customer id", " ###"), ",") else {textToProcess}
in
result,
invokeFunction = Table.TransformColumns(Source, {{"valueC", fxProcessSomeText}}),
expanded = Table.ExpandListColumn(invokeFunction, "valueC"),
reindex =
let
removeIndex = Table.RemoveColumns(expanded, {"index"}),
addIndex = Table.AddIndexColumn(removeIndex, "index", 1, 1),
moveIndex = Table.ReorderColumns(addIndex, List.Distinct(List.InsertRange(Table.ColumnNames(addIndex), 0, {"index"})))
in
moveIndex
in
reindex
My output table contains more rows than yours. Also, the value in column valueA, row 11 is 1415 for me (it is 1234 in your request output). Not sure if this is a mistake in your example, or if I'm missing some logic.
By using Power Query, I have created an address list from address fields in approx. 15000 individual excel files.
I now have a list with 15143 rows but I have run into problems with the "Does Not Contain" Text filter.
I want to keep rows that do not contain the search term "foo" in a specific column.
When I first use the "Contains" "foo" Text Filter it returns a list of 150 rows
But when I use the "Does Not Contain" "foo" Text Filter instead the list is shortened to only 3218 rows.
A bit unexpected result...
If I recall my maths lessons correctly 15143-150=14993, not 3218.
This is driving me nuts!
Do I do something wrong or is it the Almighty Microsoft Bug that has hit me, once again?
This behavior is related to the expected Sql logic for null: if a row field is null, it doesn't contain "foo" but it also doesn't not contain "foo". Put differently, a WHERE filter skips rows that evaluate to null, and not null is also null.
You can see this in Power Query:
let
Source = Table.FromColumns({{null, "foo", "bar"}}),
FilteredRows = Table.SelectRows(Source, each
not Text.Contains([Column1], "foo") or Text.Contains([Column1], "foo"))
in
FilteredRows
... only returns the last two rows.
In Power Query if you want to avoid this bizarre kind of logic, you can replace null with empty string and then you get nicer behavior:
= Table.ReplaceValue(Source,null,"",Replacer.ReplaceValue,{"Column1"})
We have a spreadsheet that gets updated monthly, which queries some data from our server.
The query url looks like this:
http://example.com/?2016-01-31
The returned data is in a json format, like below:
{"CID":"1160","date":"2016-01-31","rate":{"USD":1.22}}
We only need the value of 1.22 from the above and I can get that inserted into the worksheet with no problem.
My questions:
1. How to use a cell value [contain the date] to pass the date parameter [2016-01-31] in the query and displays the result in the cell next to it.
2. There's a long list of dates in a column, can this query be filled down automatically per each date?
3. When I load the query result to the worksheet, it always load in pairs. [taking up two cells, one says "Value", the other contains the value which is "1.22" in my case]. Ideally I would only need "1.22", not the title, can this be removed? [Del won't work, will give you a "Column 1" instead, or you have to hide the entire row which will mess up with the layout].
I know this is a lot to ask but I've tried a lot of search and reading in the last few days and I have to say the M language beats me.
Thanks in advance.
Convert your Web.Contents() request into a function:
let
myFunct = ( param as date ) => let
x = Web.Contents(.... & Date.ToText(date) & ....)
in
x
in
myFunct
Reference your data request function from a new query, include any transformations you need (in this case JSON.Document, table expansions, remove extraneous data. Feel free to delete all the extra data here, including columns that just contain the label 'value'.
(assuming your table of domain values already exists) add a custom column like
=Expand(myFunct( [someparameter] ))
edit: got home and got into my bookmarks. Here is a more detailed reference for what you are looking to do: http://datachix.com/2014/05/22/power-query-functions-some-scenarios/
For a table - Add column where you get data and parse JSON
let
tt=#table(
{"date"},{
{"2017-01-01"},
{"2017-01-02"},
{"2017-01-03"}
}),
add_col = Table.AddColumn(tt, "USD", each Json.Document(Web.Contents("http://example.com/?date="&[date]))[rate][USD])
in
add_col
If you need only one value
Json.Document(Web.Contents("http://example.com/?date="&YOUR_DATE_STRING))[rate][USD]
Problem
I have two queries, one contains product data (data_query), the other (recode_query) contains product names from within the data_query and assigns them specific id_tags. id_tags are also column names within the data_query.
What I need to achieve and fail at
I need the data_query to look at the id_tag of the specific product name within the data_query, as parsed from the recode_query (this is already working and in place) and input the retrieved value within the specific custom column cell. In Excel, I would be using INDEX/MATCH combo:
{=INDEX(data_query[#Data];; MATCH(data_query[#id_tag]; data_query[#Headers]; 0))}
I have searched near and far, but I probably can't even spot the solution, even if I have come across it, as I am not that deep in the data manipulation and power query myself.
Is this what you're wanting?
let
DataQuery = Table.FromColumns({{1,2,3}, {"Boxed", "Bagged", "Rubberbanded"}}, {"ID","Pkg"}),
RecodeQuery = Table.FromColumns({{"Squirt Gun", "Coffee Maker", "Trenching Tool"}, {1,2,3}}, {"Prod Name", "ID2"}),
Rzlt = Table.Join(DataQuery, "ID", RecodeQuery, "ID2", JoinKind.Inner)
in
Rzlt