I am using Power Query in Excel and I used Add Custom Column to create a new column. What I desperately need is for this new column to take the value from the second row and subtract it from the first row, and this needs to happen for all rows: row two is subtracted from row one, row three from row two, row four from row three, and so on. PLEASE help. I have no understanding of DAX or Power Query, I started using it today, and I only need this one thing to work.
PS. I have an index column that starts from one, called Index.
Here is the code:
= Table.AddColumn(#"Reordered Columns", "Custom", each [#"ODO - Km"] - [#"ODO - Km"])
At the moment the ODO km is subtracting from the ODO km in the same row; I want the previous ODO km subtracted from the current ODO km.
Create two indexes, one 0-based, called Index0, and one 1-based, called Index1. Merge the query with itself, joining on Index1 = Index0. You'll now have a duplicate of every column, but offset by one row, so you can do all of your arithmetic within a single row. After that, you can remove all but the result columns you want.
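A minimal sketch of that self-merge, assuming the previous step is #"Reordered Columns" and the odometer column is named ODO - Km as in the question (step names are illustrative):
let
    Source = #"Reordered Columns",
    // 0-based and 1-based index columns
    AddIndex0 = Table.AddIndexColumn(Source, "Index0", 0, 1),
    AddIndex1 = Table.AddIndexColumn(AddIndex0, "Index1", 1, 1),
    // merge the query with itself: row N is paired with row N+1
    Merged = Table.NestedJoin(AddIndex1, {"Index1"}, AddIndex1, {"Index0"}, "Next", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Merged, "Next", {"ODO - Km"}, {"Next ODO - Km"}),
    // difference between consecutive rows; null on the last row
    AddDiff = Table.AddColumn(Expanded, "Difference", each [#"Next ODO - Km"] - [#"ODO - Km"], type number)
in
    AddDiff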
You don't need to do this. You can look up rows in a table by position using a single index column. The key is to reference the name of an earlier step, like below:
let
    Source = whatever,
    addindex = Table.AddIndexColumn(Source, "Index", 0, 1),
    addRelative = Table.AddColumn(addindex, "Previous record",
        each if [Index] <> 0 then addindex[myField]{[Index] - 1} else null)
in
    addRelative
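Applying the same idea to the question's own column, the difference can be computed directly (a sketch, assuming the indexed step is named addindex as above):
= Table.AddColumn(addindex, "Custom", each
    if [Index] <> 0
    then [#"ODO - Km"] - addindex[#"ODO - Km"]{[Index] - 1}
    else null)
This gives the current row minus the previous row; swap the operands for the opposite direction.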
I am facing an issue where I need to compare column X and column Y: if X = Y then I want to delete that row, but if X ≠ Y then just leave it there, as I need to correct it manually. I tried to find a reference but to no avail.
Example of Table
I tried using Power Query because the name list was scattered. After sorting, some rows where X = Y still weren't matched correctly because almost-identical names were being compared. I tried using 'Remove Duplicates', but nothing happened, as it only removes rows where a column has the same data in multiple rows.
Thanks in advance.
For another method that does not involve a helper column, you can use the Table.SelectRows function of Power Query:
let
    //sample data
    Source = Table.FromRecords(
        {[x = "Johnny White", y = "Johnny White"],
         [x = "Black Mmamba", y = "Black Mamba"],
         [x = "Tom Evans", y = "Tom Evans"],
         [x = "Britney Blue", y = "Britney Blue"],
         [x = "White Kingdom", y = "Wine Kingdom"],
         [x = "Daniel Zack", y = "Daniel Zack"]},
        type table [x = Text.Type, y = Text.Type]),
    //select rows where data not the same in each column
    remDupes = Table.SelectRows(Source, each [x] <> [y])
in
    remDupes
Source
Dupes Removed
I have a SharePoint list as a datasource in Power Query.
It has an "AttachmentFiles" column that is a table; from that table I want the values in the column "ServerRelativeURL".
I want to split that column so each value in "ServerRelativeURL" gets its own column.
I can get the values if I use the expand table function, but it splits them into multiple rows; I want to keep them in one row.
I only want one row per unique ID.
Example:
I can live with a fixed number of columns as there are usually no more than 3 attachments per ID.
I'm thinking that I can add a custom column that refers to "AttachmentFiles ServerRelativeURL Value(1)" but I don't know how.
Can anybody help?
Try this code:
let
    // generate sample data: each id gets a nested table of ServerRelativeUrl values
    fn = (x) => {x, #table({"ServerRelativeUrl"}, List.FirstN(List.Zip({{"a".."z"}}), x * 2))},
    Source = #table({"id", "AttachmentFiles"}, {fn(2), fn(3), fn(1)}),
    // turn each nested table into a list of its ServerRelativeUrl values
    replace = Table.ReplaceValue(Source, 0, 0, (a, b, c) => a[ServerRelativeUrl], {"AttachmentFiles"}),
    // build one column name per attachment, up to the longest list: url1, url2, ...
    cols = List.Transform({1..List.Max(List.Transform(replace[AttachmentFiles], List.Count))}, each "url" & Text.From(_)),
    // split the list column into those columns; shorter lists are padded with nulls
    split = Table.SplitColumn(replace, "AttachmentFiles", (x) => List.Transform({0..List.Count(x) - 1}, each x{_}), cols)
in
    split
I managed to solve it myself.
I added 3 custom columns like this:
CustomColumn1: [AttachmentFiles]{0}
CustomColumn2: [AttachmentFiles]{1}
CustomColumn3: [AttachmentFiles]{2}
And expanded them with only the "ServerRelativeURL" selected.
It would be nice to have a dynamic solution. But this will work fine for now.
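For reference, the same manual approach can be written as a single query. This is only a sketch (YourPreviousStep is a placeholder for the step holding the AttachmentFiles column); the try ... otherwise null guards return null when a row has fewer than three attachments:
let
    Source = YourPreviousStep,  // placeholder: the step with the AttachmentFiles column
    url1 = Table.AddColumn(Source, "CustomColumn1", each try [AttachmentFiles][ServerRelativeURL]{0} otherwise null),
    url2 = Table.AddColumn(url1, "CustomColumn2", each try [AttachmentFiles][ServerRelativeURL]{1} otherwise null),
    url3 = Table.AddColumn(url2, "CustomColumn3", each try [AttachmentFiles][ServerRelativeURL]{2} otherwise null)
in
    url3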
I have a comma separated csv file with the following structure:
Col Headers:
ProdDate, ProdTime, OLEDATETIME, ProdBuyPrice, ProdSellPrice, ProdBoughtQTY, ProdSoldQTY, etc
09/21/2019, 13:54:22, 43729.5801, 12.45, 12.61, 8, 9, etc.
This CSV file is updated many times per minute (5 to 70 times per minute), meaning it can have 5 to 70 rows within the last minute of sales, so I can't use an arbitrary fixed number in "keep first rows" to return only the rows that arrived in the last minute, and I have never done this before with Power Query. So I need a finished recipe, but my googling has turned up nothing so far.
Any suggestions?
This is an example of how you can identify a dynamic row number. In this example, we have a table that shows fruit sales by store. We want to create a query that returns the highest number of bananas sold.
This is what our data table looks like.
Step 1 - Add an index column starting from 1. This assigns row numbers.
Add Column > Index Column > From 1
Step 2 - Filter and Sort the data.
Remove any columns that are unnecessary.
Filter the Item column for Bananas.
Sort the Values column in descending order.
Right-click on the first value in the Index column and choose Drill-Down.
RESULT
Now you have a dynamic row number. You could instead drill down on the value itself to return the sales figure rather than the index. To apply this to other scenarios, just keep filtering and sorting until you get the result you need.
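Put together in the Advanced Editor, the generated steps might look roughly like this (a sketch; FruitSales, Item, and Values are assumed names from the example):
let
    Source = Excel.CurrentWorkbook(){[Name = "FruitSales"]}[Content],  // assumed table name
    // Step 1 - assign row numbers, starting from 1
    AddIndex = Table.AddIndexColumn(Source, "Index", 1, 1),
    // Step 2 - filter the Item column for Bananas, then sort Values descending
    FilterBananas = Table.SelectRows(AddIndex, each [Item] = "Bananas"),
    SortDesc = Table.Sort(FilterBananas, {{"Values", Order.Descending}}),
    // Drill-down on the first Index value: the row with the highest banana sales
    TopIndex = SortDesc{0}[Index]
in
    TopIndex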
This is how you filter a time column for records occurring in the last minute of times listed in the report.
let
    Source = Excel.CurrentWorkbook(){[Name = "t_DatesAndTimes"]}[Content],
    ChangedTypes_ColData = Table.TransformColumnTypes(Source, {{"Date", type date}, {"Time", type time}}),
    AddCol_DateAndTime = Table.AddColumn(ChangedTypes_ColData, "Date and Time", each [Date] & [Time], type datetime),
    LatestTime_ofReport_MinusOneMinute = List.Max(AddCol_DateAndTime[Date and Time]) - #duration(0, 0, 1, 0),
    FilterRows_KeepTimesInLastMinute = Table.SelectRows(AddCol_DateAndTime, each [Date and Time] >= LatestTime_ofReport_MinusOneMinute)
in
    FilterRows_KeepTimesInLastMinute
Data Table needing to be filtered
Table filtered for time in the last minute of times listed in the report.
I can rank my data with this formula, which groups by Year, Trust and ID, and ranks the Areas.
RANKX(
    FILTER(Table,
        [Year] = EARLIER([Year]) &&
        [Trust] = EARLIER([Trust]) &&
        [ID] = EARLIER([ID])),
    [Area], , 1, Dense)
This works fine - unless you have data where the same Area appears more than once in the same group, whereupon it gives all rows the rank of 1. Is there any way to force unique rank values? So two rows that have the same Area would be given the rank of 1 and 2 (in an arbitrary order)? Thank you for your time.
Assuming you don't have duplicate rows in your table, you can add another column as a tie-breaker in your expression.
Suppose your table has an additional column, [Name], that is distinct between your multiple [Area] rows. Then you could write your formula like this:
= RANKX(
FILTER(Table,
[Year] = EARLIER([Year]) &&
[Trust] = EARLIER([Trust]) &&
[ID] = EARLIER([ID])),
[Area] & [Name], , 1, Dense)
You can append as many columns as you need to get the tie-breaking done.
We have a Cassandra column family.
Each row has multiple columns. The columns have names, but the values are empty.
If we have 5-10 row keys, how can we find the column names that appear in all of these keys?
e.g.
row1: php, programming, accounting
row2: php, bookkeeping, accounting
row3: php, accounting
must return:
result: php, accounting
Note that we cannot easily load a whole row into memory, because it may contain 1M+ columns.
The solution does not need to be fast.
In order to intersect several rows, we need to intersect two of them first, then intersect the result with the third, and so on.
It looks like Cassandra can query data by column names, and this is a relatively fast operation.
So we first get a column slice of 10k columns from the first row and make a list of their names (in PHP Cassa, put them in an array), then select those names from the second row.
The code might look like this:
$x = $cf->get($first_key, <some column slice>);
$column_names = array_keys($x);
// request only those column names from the second row;
// whatever comes back is the intersection so far
$result = $cf->get($second_key, null, $column_names);
// write the result somewhere, and proceed with the next slice
Your column names are sorted, so you can create an iterator for each row (each iterator loads a portion of the data at a time, for example 10k columns). Put every iterator into a priority queue, keyed by its next column name. If the same column name comes off the queue k times in a row (once per iterator), that name is common to all rows; otherwise it is not. Either way, advance each iterator you took and return it to the queue.
You could use a Hadoop map/reduce job as follows:
Map output key = column name
Map output value = row key
Reducer counts row keys for each column and outputs column name & count to a CF with the following schema:
key : [column name] {
    Count : [count]
}
You can then query the counts from this CF in descending order. The first record will have the maximum count (a column that appears in every row has a count equal to the number of rows), so you can keep iterating until a value is less than that max. Those column names are your intersection.