I have been melting my brain trying to work out the formula I need for a multiple conditional lookup.
I have two data sets, one is job data and the other is contract data.
The job data contains customer name, location of job and date of job. I need to find out if the job was contracted when it took place, and if it was, return a value from column N in the contract data.
The problem comes when I try to use the date ranges, as there is frequently more than one contract per customer.
So for example, in my job data:-
CUSTOMER | LOCATION | JOB DATE
Cust A | Port A | 01/01/2014
Cust A | Port B | 01/02/2014
Customer A had a contract in port B that expired on 21st Feb 2014, so here I would want it to return the value from column N in my contract data, as the job was under contract.
Customer A did not have a contract in port A at the time of the job, so I would want it to return 'no contract'.
Contract data has columns containing customer name, port name, and a start and end date value, as well as my lookup category.
I think I need to be using INDEX / MATCH, but I can't seem to get them to work with my date ranges. Is there another type of lookup I can use to get this to work?
Please help, I'm losing the plot!
Thanks :)
You can use two approaches here:
In both the result and source tables, make a helper column that concatenates all three values like this: =A2&B2&C2. You then get something like 'Cust APort A01/01/2014', that is, a unique value by which you can identify the row. You can add a delimiter if needed: =A2&"|"&B2&"|"&C2. Then you can perform a VLOOKUP on this value.
You can add a helper column with the row number (1, 2, 3 ...) to the source table. Then you can use =SUMIFS(<row_number_column>,<source_condition_column_1>,<condition_1>,<source_condition_column_2>,<condition_2>,...) to return the row number of the source table row that matches all three conditions. You can use this row number with INDEX or whatever else is needed. But BE CAREFUL: check that the source table contains only unique combinations of all three columns, otherwise this approach may return wrong results. For example, if the matching conditions are met in rows 3 and 7, it will return 10, which is completely wrong.
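For the date-range part of the question, the matching logic itself can be sketched outside Excel. The following is a minimal Python sketch, not a worksheet formula: the contract rows, their start dates and the "Tier" values standing in for column N are all hypothetical, and the job dates are read as dd/mm/yyyy. A job matches when customer and port are equal and the job date falls between the contract's start and end dates.

```python
from datetime import date

# Hypothetical contract rows: (customer, port, start, end, column N value).
# Only the 21 Feb 2014 expiry comes from the question; the rest is made up.
contracts = [
    ("Cust A", "Port B", date(2013, 2, 22), date(2014, 2, 21), "Tier 1"),
    ("Cust A", "Port B", date(2014, 3, 1), date(2015, 2, 28), "Tier 2"),
]

def find_contract(customer, port, job_date, rows=contracts):
    """Return the column N value of the contract covering the job, or 'no contract'."""
    for cust, prt, start, end, value in rows:
        if cust == customer and prt == port and start <= job_date <= end:
            return value
    return "no contract"

port_a = find_contract("Cust A", "Port A", date(2014, 1, 1))
port_b = find_contract("Cust A", "Port B", date(2014, 2, 1))
```

The same three conditions (customer equal, port equal, date within range) are what a SUMIFS on a row-number helper column would need as criteria, with the uniqueness caveat above still applying if contract periods overlap.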
I have 2 large tables in Power Pivot and I am trying to reconcile stockpile build grades to crushed stockpile grades. Please see the example. I can create a pivot table that contains the crushed grades, but I am unable to find the right way to bring the stockpile grades through for the reconciliation highlighted in green in the attached example.
Thanks for any help or direction on where to look
In Power Query, create your lookup tables.
1) unique crushers, ID
2) Dates, ID
Here is a function to create a dates table, if you need one. After you invoke the function to get the column of dates, add another column for the ID.
/*--------------------------------------------------------------------------------------------------------------------
PQ Create a Dates Table, returning a single column of dates.
Inputs:
Start Date | Enter the year as yyyy, month as mm, day as dd
End Date | Enter the year as yyyy, month as mm, day as dd
Increments | One row will be returned per increment.
Author: Jenn Ratten
Edits:
07/16/18 | Modified query copied from the internet.
10/01/19 | Converted to a function.
--------------------------------------------------------------------------------------------------------------------*/
let
    fDatesTable = (StartYear as number, StartMonth as number, StartDay as number, EndYear as number, EndMonth as number, EndDay as number, IncrementDays as number, IncrementHours as number, IncrementMin as number, IncrementSec as number) as table =>
        let
            StartDate = #date(StartYear,StartMonth,StartDay),
            EndDate = #date(EndYear,EndMonth,EndDay),
            Increments = #duration(IncrementDays,IncrementHours,IncrementMin,IncrementSec),
            // List.Dates takes (start, count, step). The count here is the number of whole days between
            // the two dates, so with a one-day increment the list stops the day before EndDate.
            DatesTable = Table.FromColumns({List.Dates(StartDate, Number.From(EndDate) - Number.From(StartDate), Increments)}, type table[Date]),
            ChangeType = Table.TransformColumnTypes(DatesTable,{{"Date", type date}})
        in
            ChangeType
in
    fDatesTable
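To make the row count of the function above easier to check, here is a rough Python equivalent. It is only a sketch of how I read the M code: the names dates_table and step are mine, and it assumes a one-day increment.

```python
from datetime import date, timedelta

def dates_table(start, end, step=timedelta(days=1)):
    """Return dates from start up to, but not including, end."""
    # Mirrors Number.From(EndDate) - Number.From(StartDate) in the M function.
    count = (end - start).days
    return [start + i * step for i in range(count)]

rows = dates_table(date(2019, 1, 1), date(2019, 1, 4))
```

If the end date should appear in the table, the count would need one more row.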
Load all of the tables to the data model.
Go to Power Pivot, diagram view, and create your relationships.
Lookup Crusher to data tables 1 and 2
Lookup Date to data tables 1 and 2
Go to Data View on data tables 1 and 2 and add 2 new columns for the lookup IDs. You can specify the column header and the formula at the same time by clicking in the first cell and using this syntax, then either pressing Enter or clicking the check mark in the formula bar.
Dates Lookup ID:=RELATED(lookup_dates[ID])
Crusher Lookup ID:=RELATED(lookup_crusher[ID])
Optional, but a good practice....
Right-click the new fields you just created and select "hide from client tools". Also hide the date and crusher fields on both data tables, and the ID field on both lookup tables. When you are creating pivots to summarize data from more than one table, the text fields that you place on your pivot table should be the fields that are shared (aka the lookup tables). This helps to minimize pivots in which the grand totals don't match the sum that you actually see on the table. If you hide the fields, it reminds you of that. There are exceptions of course, but this is a good rule of thumb.
Now create measures to sum the tons and any other math calculations you'd like. With the measures, start simple and let the pivot do the slicing. Put the measures in the values section of the pivot table.
Sum of Source Tons:=sum(Table1[Tons])
Sum of Destination Tons:=sum(Table2[Tons])
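What the two measures do once the pivot slices them by the shared lookup fields can be sketched in Python. This is an illustration only: the rows and the crusher names are hypothetical, and the tuples stand in for Table1 and Table2.

```python
from collections import defaultdict

# Hypothetical rows: (crusher, date, tons)
table1 = [
    ("Crusher 1", "2019-10-01", 100.0),
    ("Crusher 1", "2019-10-01", 50.0),
    ("Crusher 2", "2019-10-01", 80.0),
]
table2 = [
    ("Crusher 1", "2019-10-01", 140.0),
    ("Crusher 2", "2019-10-01", 80.0),
]

def sum_tons(rows):
    """Total tons per (crusher, date), like a SUM measure sliced by both lookups."""
    totals = defaultdict(float)
    for crusher, day, tons in rows:
        totals[(crusher, day)] += tons
    return totals

source = sum_tons(table1)
destination = sum_tons(table2)

# Reconcile on the shared keys, so a key missing from one table still shows up.
keys = sorted(set(source) | set(destination))
difference = {k: source.get(k, 0.0) - destination.get(k, 0.0) for k in keys}
```

Grouping by the shared keys rather than by either table's own columns is the same reason the pivot should use the lookup table fields.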
I have a comma separated csv file with the following structure:
Col Headers:
ProdDate, ProdTime, OLEDATETIME, ProdBuyPrice, ProdSellPrice, ProdBoughtQTY, ProdSoldQTY, etc
09/21/2019, 13:54:22, 43729.5801, 12.45, 12.61, 8, 9, etc.
This CSV file is updated many times per minute (5 to 70 times per minute), meaning that it can have 5 to 70 lines within the last minute of sales. So I can't set an arbitrary fixed number on "maintain first lines" to return only the rows that arrived in the last minute, and I have never done this before with Power Query. I need a finished recipe to do this, but my googling has turned up nothing so far.
Any suggestion?
This is an example of how you can identify a dynamic row number. In this example, we have a table that shows fruit sales by store. We want to create a query that returns the highest number of bananas sold.
This is what our data table looks like.
Step 1 - Add an index column starting from 1. This assigns row numbers.
Add Column > Index Column > From 1
Step 2 - Filter and Sort the data.
Remove any columns that are unnecessary.
Filter the Item column for Bananas.
Sort the Values column in descending order.
Right-click on the first value in the Index column and choose Drill-Down.
RESULT
Now you have a dynamic row #. You could also instead choose the value itself to return the sales instead of the index. To apply this to other scenarios, just keep filtering and sorting until you get to the result you need.
This is how you filter a time column for records occurring in the latest one minute of times.
let
Source = Excel.CurrentWorkbook(){[Name="t_DatesAndTimes"]}[Content],
ChangedTypes_ColData = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Time", type time}}),
AddCol_DateAndTime = Table.AddColumn(ChangedTypes_ColData, "Date and Time", each [Date] & [Time], type datetime),
LatestTime_ofReport_MinusOneMinute = List.Max(AddCol_DateAndTime[Date and Time])-#duration(0,0,1,0),
FilterRows_KeepTimesInLastMinute = Table.SelectRows(AddCol_DateAndTime, each [Date and Time] >= LatestTime_ofReport_MinusOneMinute)
in
FilterRows_KeepTimesInLastMinute
Data Table needing to be filtered
Table filtered for time in the last minute of times listed in the report.
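The same steps (combine date and time, find the latest timestamp, subtract one minute, keep rows at or after that cutoff) can be sketched in Python. Only the 13:54:22 row comes from the question; the other rows, the three-column layout and the function name last_minute are assumptions for illustration.

```python
from datetime import datetime, timedelta

# (ProdDate, ProdTime, ProdSellPrice); the first two rows are made up.
rows = [
    ("09/21/2019", "13:52:10", 12.40),
    ("09/21/2019", "13:53:30", 12.45),
    ("09/21/2019", "13:54:22", 12.61),
]

def last_minute(rows):
    """Keep rows whose timestamp is within one minute of the latest timestamp."""
    stamped = [
        (datetime.strptime(f"{d} {t}", "%m/%d/%Y %H:%M:%S"), d, t, price)
        for d, t, price in rows
    ]
    cutoff = max(s[0] for s in stamped) - timedelta(minutes=1)
    return [(d, t, price) for ts, d, t, price in stamped if ts >= cutoff]

recent = last_minute(rows)
```

Because the cutoff is derived from the data rather than from the clock, the result does not depend on when the query is refreshed.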
I'm looking to do the following:
I want to have, say, 3 columns.
Transaction | Category | Amount
I want to be able to enter a certain name in the Transaction column, say "Tesco" for argument's sake, then have a result returned in the Category column, say "Groceries", and then enter a specific amount myself in the Amount column.
The thing is, I will need to have an unlimited, or at least large, number of different transactions, all assigned to predetermined categories, so that each time I type in a transaction it automatically displays the category for me.
All help much appreciated.
I know a simple IF statement won't suffice. I can get it to work with a simple IF statement, no problem, but as each transaction is different I don't know how to take it further.
Thanks.
Colin
Use a lookup table. Let's say it's on a sheet called "Categories" and it looks like this:
| A | B
1 | Name | Category
2 | Tesco | Groceries
3 | Shell | Fuel
Then, in the table you describe, use =VLOOKUP(A2, Categories!$A$2:$B$3, 2, FALSE) in your "Category" field, assuming it's in B2.
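In programming terms, the lookup table is a dictionary keyed on the transaction name. A small Python sketch of the exact-match behaviour, using the two rows from the table above (the function name category_for is mine):

```python
# The "Categories" sheet as a dictionary: Name -> Category
categories = {"Tesco": "Groceries", "Shell": "Fuel"}

def category_for(transaction):
    """Exact-match lookup; unknown names give '#N/A', as VLOOKUP with FALSE does."""
    return categories.get(transaction, "#N/A")
```

Adding a new transaction then only means adding a row to the lookup table, not changing the formula logic.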
I do this a fair bit using Data Validation and tables.
In this case I would have two tables containing my pick lists on a lookup sheet.
Transaction Table : [Name] = "loTrans" - with just the list of transactions sorted
Category Table : [Name] = "loCategory" - two columns in table, sorted by Both columns - Trans and Category
Header1 : Transactions
Header2 : Category
The Details Table:
the transaction field will have a simple data validation, using a named range "trans", that selects from the table loTrans.
the category field will also use data validation with a named range, but the source of the named range ("selCat") will be a little more complex. It will be something like:
=OFFSET(loCategory[Trans],MATCH(Enter_Details!A3,loCategory[Trans],0)-1,1,COUNTIF(loCategory[Trans],Enter_Details!A3),1)
As you enter details and select different transactions, the data validation will be limited to the categories of your selected transaction.
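The OFFSET / MATCH / COUNTIF combination relies on the category table being sorted, so that all rows for one transaction sit next to each other. A Python sketch of the same idea, with hypothetical table contents and a function name of my own:

```python
# Hypothetical loCategory table, sorted by Trans and then Category
lo_category = [
    ("Shell", "Fuel"),
    ("Tesco", "Clothing"),
    ("Tesco", "Groceries"),
]

def categories_for(trans, table=lo_category):
    """Return the block of categories listed for one transaction."""
    names = [t for t, _ in table]
    if trans not in names:
        return []
    first = names.index(trans)    # like MATCH(..., 0) - 1: first matching row
    height = names.count(trans)   # like COUNTIF: how many rows to take
    # like OFFSET ..., 1 column to the right, `height` rows tall
    return [category for _, category in table[first:first + height]]
```

If the table is not sorted, the slice would pick up rows belonging to other transactions, which is why the sort matters.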
I have raw data with only IDs and dates of transactions. Then I have a list of people: their ID, their hire date and their name.
What I want is a column in the raw data that gives me the person's name. The problem is that the ID is re-used if the person quits and another one starts.
Since I have the transaction date, I figure it should be possible to check:
IF the transaction falls within the period they were hired, then get the right name.
Is this possible, and if so, how?
Rawdata:
Column A = Transaction date
Column B = ID
Column C = Here I want their name
People list:
Column A = ID
Column B = Name
Column C = Hire date
Example (Excel Online - can be edited):
https://1drv.ms/x/s!AjWVkb2UBexjhzAKMy-YiE5EoKwc
Make sure your reference table is ordered in chronological order oldest to newest:
=INDEX(F:F,AGGREGATE(14,6,ROW($G$8:INDEX(G:G,MATCH(1E+99,E:E)))/(($E$8:INDEX(E:E,MATCH(1E+99,E:E))=B8)*($G$8:INDEX(G:G,MATCH(1E+99,E:E))<A8)),1))
I had to name your tables People and Raw_Data to use structured table references in the formula. Additionally, I added some data and changed a few dates.
=INDEX(People[Agent Name], AGGREGATE(15,6, (ROW(People[Personal ID])-ROW(People[[#Headers],[Agent Name]]))/((People[Personal ID]=[@[Personal ID]])*(People[Hire date]=AGGREGATE(14, 6, (People[Hire date])/((People[Personal ID]=[@[Personal ID]])*(People[Hire date]<=[@[Transaction date]])), 1))), 1))
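Both formulas boil down to the same rule: among the people with the matching ID, take the one with the latest hire date that is not after the transaction date. A Python sketch of that rule, with made-up people and a function name of my own:

```python
from datetime import date

# Hypothetical people list: (ID, name, hire date). ID 101 is re-used.
people = [
    (101, "Anna", date(2017, 1, 1)),
    (101, "Ben", date(2018, 6, 1)),
    (102, "Cara", date(2017, 3, 15)),
]

def name_for(person_id, transaction_date, rows=people):
    """Name of the person holding the ID on the transaction date, or None."""
    candidates = [
        (hired, name)
        for pid, name, hired in rows
        if pid == person_id and hired <= transaction_date
    ]
    if not candidates:
        return None
    # The latest hire date on or before the transaction date wins.
    return max(candidates)[1]
```

This sketch uses <= for the hire date, like the structured-reference formula; the first formula uses a strict <, so the two differ for a transaction on the hire date itself.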
We have a Cassandra column family.
Each row has multiple columns. The columns have names, but their values are empty.
If we have 5-10 row keys, how can we find the column names that appear in all of these rows?
e.g.
row1: php, programming, accounting
row2: php, bookkeeping, accounting
row3: php, accounting
must return:
result: php, accounting
Note that we cannot easily load a whole row into memory, because it may contain 1M+ columns.
The solution does not need to be fast.
To intersect several rows, we first intersect two of them, then intersect the result with the third, and so on.
It looks like in Cassandra we can query the data by column names, and this is a relatively fast operation.
So we first get a column slice of 10k columns from the first row and make a list of the column names (in phpcassa, put them in an array). Then we select those columns from the second row.
The code may look like this:
$x = $cf->get($first_key, <some column slice>);
$column_names = array();
foreach(array_keys($x) as $k)
$column_names[] = $k;
$result = $cf->get($second_key, $column_slice = null, $column_names);
// write result somewhere, and proceed with next slice
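The whole loop, including the "proceed with next slice" step and more than two rows, can be sketched in Python. This does not talk to Cassandra: get_slice and get_named are hypothetical stand-ins for a column-slice read and a by-name read, backed by an in-memory dictionary built from the question's example.

```python
# In-memory stand-in for the column family; column names are kept sorted.
rows = {
    "row1": ["accounting", "php", "programming"],
    "row2": ["accounting", "bookkeeping", "php"],
    "row3": ["accounting", "php"],
}

def get_slice(key, start, size):
    """Stand-in for a column-slice read: up to `size` names from position `start`."""
    return rows[key][start:start + size]

def get_named(key, names):
    """Stand-in for a by-name read: the subset of `names` present in the row."""
    present = set(rows[key])
    return [n for n in names if n in present]

def intersect(keys, slice_size=2):
    """Walk the first row slice by slice and filter each slice through the other rows."""
    first, rest = keys[0], keys[1:]
    result, start = [], 0
    while True:
        chunk = get_slice(first, start, slice_size)
        if not chunk:
            return result
        for key in rest:
            chunk = get_named(key, chunk)
        result.extend(chunk)
        start += slice_size

common = intersect(["row1", "row2", "row3"])
```

Only one slice of names is held in memory at a time, which is the point of the approach when a row may hold 1M+ columns.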
Your column names are sorted, so you can create an iterator for each row (the iterator loads a portion of the data at a time, for example 10k columns). Now put each iterator into a priority queue, ordered by its next column name. If you take the same column name from the queue k times in a row, where k is the number of rows, that name is common to all rows. Either way, advance the iterator you took to its next element and return it to the queue.
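A minimal Python sketch of this priority-queue merge, using heapq. Plain lists stand in for the per-row iterators, and the sketch assumes each row's column names are sorted and unique within the row; the function name common_columns is mine.

```python
import heapq

def common_columns(sorted_rows):
    """Names that appear in every row, found by a k-way merge of sorted iterators."""
    iters = [iter(r) for r in sorted_rows]
    k = len(iters)
    heap = []
    for i, it in enumerate(iters):
        name = next(it, None)
        if name is None:
            return []  # an empty row means an empty intersection
        heap.append((name, i))
    heapq.heapify(heap)

    result, current, run = [], None, 0
    while heap:
        name, i = heapq.heappop(heap)
        # Count how many times in a row the same name comes off the queue.
        if name == current:
            run += 1
        else:
            current, run = name, 1
        if run == k:
            result.append(name)
        # Advance the iterator we took from and put it back if it has more names.
        nxt = next(iters[i], None)
        if nxt is not None:
            heapq.heappush(heap, (nxt, i))
    return result

common = common_columns([
    ["accounting", "php", "programming"],
    ["accounting", "bookkeeping", "php"],
    ["accounting", "php"],
])
```

The queue never holds more than one name per row, so memory use depends on the number of rows and the iterator's page size, not on the row width.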
You could use a Hadoop map/reduce job as follows:
Map output key = column name
Map output value = row key
Reducer counts row keys for each column and outputs column name & count to a CF with the following schema:
key : [column name] {
Count : [count]
}
You can then query counts from this CF in reverse order. The first record will be the max, so you can keep iterating until a value is < max. This will be your intersection.
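The counting idea can be sketched in plain Python without Hadoop. One deliberate difference from the description above: this sketch keeps the names whose count equals the number of row keys, rather than comparing against the maximum count, so that it returns nothing when no column is shared by every row. The function name is an assumption.

```python
from collections import Counter

def intersection_by_count(rows):
    """Column names whose row-key count equals the number of rows."""
    counts = Counter()
    for row_key, columns in rows.items():  # map: emit (column name, row key)
        for name in set(columns):
            counts[name] += 1              # reduce: count row keys per column name
    return sorted(name for name, count in counts.items() if count == len(rows))

common = intersection_by_count({
    "row1": ["php", "programming", "accounting"],
    "row2": ["php", "bookkeeping", "accounting"],
    "row3": ["php", "accounting"],
})
```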