Filter based on existence in one table and non-existence in another - excel

I have the following data model:
Record: Id, ..., CreateDate
FactA: RecordId, CreateDate
FactB: RecordId, CreateDate
Relationships exist from FactA to Record and FactB to Record.
I've written measures on Records such as this with no issues:
FactA's:=CALCULATE(DISTINCTCOUNT(Records[Id]), FactA)
FactB's:=CALCULATE(DISTINCTCOUNT(Records[Id]), FactB)
Now I'd like a count of Records with FactA but no FactB. In SQL I'd do a LEFT JOIN ... WHERE FactB.RecordId IS NULL, but I can't figure out how to do something similar in DAX. I've tried:
-- this returns blank, presumably because when there is a FactB then RecordId isn't blank, and when there is no FactB then RecordId is a NULL, which isn't blank either
FactA_No_FactB:=CALCULATE(DISTINCTCOUNT(Records[Id]), FactA, FILTER(FactB, ISBLANK([RecordId])))
-- this returns the long "The value for columns "RecordId" in table "FactB" cannot be determined in the current context" error.
FactA_No_FactB:=CALCULATE(DISTINCTCOUNT(Records[Id]), FILTER(FactA, ISBLANK(FactB[RecordId])))
I've also tried various ways of using RELATED and RELATEDTABLE but I don't really understand enough about DAX and context to know what I'm doing.
Can someone explain how I can write the calculated measure to count Records with FactA but no FactB?
Thanks in advance.
Edit - Workaround
I've come up with this. It looks correct so far, but I'm not sure whether it is generally the correct way to do this:
-- Take the count with FactA and subtract the count of (FactA and FactB)
FactA_No_FactB:=CALCULATE(DISTINCTCOUNT(Records[Id]), FactA) - CALCULATE(DISTINCTCOUNT(Records[Id]), FactA, FactB)
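The arithmetic behind this workaround can be checked outside DAX. The following is a minimal Python sketch with made-up ids (the sample values and variable names are assumptions, not part of the model above): it compares a direct "in A but not in B" set difference with the count(A) minus count(A and B) subtraction.

```python
# Hypothetical sample data: the ids stand in for Records[Id],
# FactA[RecordId] and FactB[RecordId].
record_ids = {1, 2, 3, 4, 5}
fact_a_record_ids = [1, 2, 2, 3]   # a record may have several FactA rows
fact_b_record_ids = [2, 5]

# Records that have at least one FactA row / at least one FactB row.
with_a = record_ids & set(fact_a_record_ids)
with_b = record_ids & set(fact_b_record_ids)

# Direct anti-join: records with a FactA row and no FactB row.
a_without_b = with_a - with_b

# The workaround: count(A) - count(A and B).
workaround_count = len(with_a) - len(with_a & with_b)

print(sorted(a_without_b))
print(workaround_count)
```

Both approaches agree because every record counted in (A and B) is also counted in A, so subtracting removes exactly the records that have both.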

Here's an alternative that might still not be the best way of doing it:
FactA_No_FactB:=CALCULATE(DISTINCTCOUNT(Records[ID]), FILTER(Records,CONTAINS(FactA, FactA[RecordID],Records[ID]) && NOT(CONTAINS(FactB,FactB[RecordID],Records[ID]))))
The difference between my version and yours is that mine returns 1 for items in A but not B, and BLANK for everything else. Your version returns 1 for items in A but not B, 0 for those in both A and B, and BLANK for everything else. Depending on your use case, one outcome may be preferable to the other.

Related

PETL - Sorting as descending order

I'm having issues sorting the following code:
I first imported the dataframe with etl
I checked whether the "quantity" column is numeric (the raw data contains numerous errors)
I tried to sort the "quantity" column by the largest amounts (I tried to use 'nlargest' but it doesn't work, not sure why)
I was supposed to sort in descending order. I tried a bunch of different combinations but had no luck.
I'm wondering whether the steps chosen to solve this problem are correct, or whether I'm missing something in the syntax. I'd really appreciate any help, thanks!
table = etl.fromdataframe(df)
table = etl.select(table, 'quantity', lambda quantity: quantity.isnumeric())
table2 = etl.head(table, 5)
table
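The code above filters and takes the head but never sorts, and the "quantity" values are still strings (`isnumeric` is a string method), so they would sort lexicographically. A minimal standard-library sketch of the intended steps follows; the sample rows and the "item" column are hypothetical. In petl the corresponding calls would presumably be `etl.convert(table, 'quantity', int)` followed by `etl.sort(table, 'quantity', reverse=True)` before `etl.head`.

```python
# Hypothetical rows standing in for the DataFrame; quantities arrive as strings.
rows = [
    {"item": "a", "quantity": "12"},
    {"item": "b", "quantity": "n/a"},
    {"item": "c", "quantity": "7"},
    {"item": "d", "quantity": "100"},
    {"item": "e", "quantity": ""},
]

# Keep only rows whose quantity is a string of digits (same test as the question).
numeric_rows = [dict(r) for r in rows if r["quantity"].isnumeric()]

# Convert before sorting: compared as strings, "7" would sort above "100".
for r in numeric_rows:
    r["quantity"] = int(r["quantity"])

# Descending sort, then keep at most the first five rows.
top5 = sorted(numeric_rows, key=lambda r: r["quantity"], reverse=True)[:5]

print([(r["item"], r["quantity"]) for r in top5])
```

Note also that the last line of the question's snippet displays `table`, not `table2`, so the result of `head` is never shown.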

BulkCreate with a 1 to 1 relationship using Sequelize

Description:
I have a table ProductGroupItems that is in a 1 to 1 relationship with BonusGroupItems and ThresholdGroupItems. Sometimes a ProductGroupItem is a bonus item; other times it's a threshold item. I chose this design so that ProductGroupItems wouldn't have any null columns.
Question:
How do I insert multiple ProductGroupItems and their corresponding 1 to 1 rows (Threshold or Bonus) using Sequelize.bulkCreate?
What I have tried:
Looping through values and making a separate create() each time. This does work but fires a query for EACH record, obviously.
models.ProductGroupItems.bulkCreate(productItems, { return: true }) then doing something fancy with the values that come back. This was too confusing and I ended up not being able to figure it out.
Any help?

Excel Power Query -- Select value in column specified in related table -- INDEX+MATCH alternative

Problem
I have two queries, one contains product data (data_query), the other (recode_query) contains product names from within the data_query and assigns them specific id_tags. id_tags are also column names within the data_query.
What I need to achieve and fail at
I need the data_query to look at the id_tag of the specific product name within the data_query, as parsed from the recode_query (this is already working and in place) and input the retrieved value within the specific custom column cell. In Excel, I would be using INDEX/MATCH combo:
{=INDEX(data_query[#Data];; MATCH(data_query[#id_tag]; data_query[#Headers]; 0))}
I have searched near and far, but I probably can't even spot the solution, even if I have come across it, as I am not that deep in the data manipulation and power query myself.
Is this what you're wanting?
let
DataQuery = Table.FromColumns({{1,2,3}, {"Boxed", "Bagged", "Rubberbanded"}}, {"ID","Pkg"}),
RecodeQuery = Table.FromColumns({{"Squirt Gun", "Coffee Maker", "Trenching Tool"}, {1,2,3}}, {"Prod Name", "ID2"}),
Rzlt = Table.Join(DataQuery, "ID", RecodeQuery, "ID2", JoinKind.Inner)
in
Rzlt
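The join above attaches the recode columns to each row. The INDEX/MATCH step from the question (take the value from the column whose name is stored in that row's id_tag) is a separate per-row lookup. Here is a minimal Python sketch of that lookup; the rows, the `tag_a`/`tag_b` column names and the `picked` column are hypothetical. In Power Query the same idea would presumably be a custom column using `Record.Field(_, [id_tag])`.

```python
# Hypothetical rows of data_query after id_tag has been merged in.
# Each id_tag value is the name of another column in the same row.
data_rows = [
    {"product": "Squirt Gun", "id_tag": "tag_a", "tag_a": 10, "tag_b": 20},
    {"product": "Coffee Maker", "id_tag": "tag_b", "tag_a": 30, "tag_b": 40},
    {"product": "Trenching Tool", "id_tag": "tag_c", "tag_a": 50, "tag_b": 60},
]

for row in data_rows:
    # Look up the column named by id_tag; None when no such column exists.
    row["picked"] = row.get(row["id_tag"])

print([(r["product"], r["picked"]) for r in data_rows])
```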

PostgreSQL: how to take query results and change them on the fly before saving to file

I want to know how to take query results and change them into human-readable text.
For example:
Select backup_set.id, backup_set.status, backup_set.type
from backset.table
Then take the backup_set.type result, which is usually a number such as 1, 2, 3, 4 or 5, and change it to something like SUSPENDED, SCHEDULED, etc. But I don't want to change the data in the table, just the output.
This can be done using a CASE statement.
Select backup_set.id, backup_set.status,
case backup_set.type
when 1 then 'SUSPENDED'
when 2 then 'SCHEDULED'
else 'UNKNOWN'
end as "type"
from backset.table
But it would be better if you stored that information in another table and make the type column a foreign key to that lookup table. Then you can use a simple join to retrieve the description.
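A minimal sketch of the lookup-table approach, using Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs anywhere. The `backup_type` table, its `label` column and the sample rows are assumptions made for illustration; only `backup_set` and its three columns come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical lookup table holding the human-readable labels.
    CREATE TABLE backup_type (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
    CREATE TABLE backup_set (
        id INTEGER PRIMARY KEY,
        status TEXT,
        type INTEGER REFERENCES backup_type(id)
    );
    INSERT INTO backup_type VALUES (1, 'SUSPENDED'), (2, 'SCHEDULED');
    INSERT INTO backup_set VALUES (10, 'ok', 1), (11, 'ok', 2), (12, 'failed', 1);
""")

# The join replaces the numeric type with its label; the stored data is unchanged.
rows = conn.execute("""
    SELECT s.id, s.status, t.label AS type
    FROM backup_set AS s
    JOIN backup_type AS t ON t.id = s.type
    ORDER BY s.id
""").fetchall()

print(rows)
```

With this layout, adding a new type means inserting one row into the lookup table rather than editing every query that contains the CASE expression.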
Edit
If you want to replace UNKNOWN with the actual numeric value, you just need to put that into the ELSE part. You need to cast the number to a text value though:
Select backup_set.id, backup_set.status,
case backup_set.type
when 1 then 'SUSPENDED'
when 2 then 'SCHEDULED'
else backup_set.type::text
end as "type"
from backset.table

Lucene BooleanQuery - Must be present in one of two columns

Not sure how to format the query in Lucene. The scenario is that the search term must be present in one of the two columns (either one is fine).
boolQuery.Add(query1, Occur.MUST) 'this one is fine
boolQuery.Add(query2, Occur.SHOULD)
boolQuery.Add(query3, Occur.SHOULD)
Brings up results even when the search term is not present at all in column 2 and column 3.
boolQuery.Add(query2, Occur.MUST)
boolQuery.Add(query3, Occur.SHOULD)
Does not bring up results when the search term is present in column 3 but not in column 2.
How do I format the query so that I get equivalent of this:
where column 1= val1 and (column 2 = val2 or column 3 = val2)
MUST, as the name suggests, makes the occurrence mandatory. SHOULD means optional. Your first boolean query will match every document hit by the MUST clause; documents that are also hit by the second or third clause simply score higher. To get results matching your desired where clause (I assumed it was LINQ), nest the two optional clauses in a sub-query and require the sub-query. This should work (using Java):
BooleanQuery q = new BooleanQuery();
BooleanQuery subQuery = new BooleanQuery();
// subQuery has no MUST clause, so at least one SHOULD clause has to match.
subQuery.add(q2, BooleanClause.Occur.SHOULD);
subQuery.add(q3, BooleanClause.Occur.SHOULD);
q.add(q1, BooleanClause.Occur.MUST);
// Requiring the sub-query gives: q1 AND (q2 OR q3).
q.add(subQuery, BooleanClause.Occur.MUST);
Your confusion probably stems from the fact that the query API implements MUST and SHOULD as unary operators, while in traditional programming languages AND and OR are binary operators.
I solved a similar issue using query syntax:
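The matching rule can be illustrated with a toy model in Python. This is a sketch of the semantics only, not Lucene code: the `matches` function and the sample documents are hypothetical, scoring is ignored, and it assumes the default behaviour in which SHOULD clauses are required only when the query has no MUST clause.

```python
def matches(doc, must=(), should=()):
    """Toy model of boolean query matching (scoring is ignored).

    Each clause is a predicate taking the document. All MUST clauses have
    to match; SHOULD clauses are required only when there is no MUST clause.
    """
    if must:
        return all(clause(doc) for clause in must)
    return any(clause(doc) for clause in should)


q1 = lambda d: d["col1"] == "val1"
q2 = lambda d: d["col2"] == "val2"
q3 = lambda d: d["col3"] == "val2"
# Sub-query with only SHOULD clauses: behaves like (q2 OR q3).
sub_query = lambda d: matches(d, should=[q2, q3])

only_col1 = {"col1": "val1", "col2": "x", "col3": "x"}
col1_and_col3 = {"col1": "val1", "col2": "x", "col3": "val2"}

# First attempt from the question: the SHOULD clauses do not restrict the result.
flat_hits_unwanted = matches(only_col1, must=[q1], should=[q2, q3])
# Second attempt: q2 as MUST rejects a document that matches only in column 3.
second_attempt = matches(col1_and_col3, must=[q1, q2], should=[q3])
# Nested form: q1 AND (q2 OR q3).
nested_rejects = matches(only_col1, must=[q1, sub_query])
nested_accepts = matches(col1_and_col3, must=[q1, sub_query])

print(flat_hits_unwanted, second_attempt, nested_rejects, nested_accepts)
```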
+(col1:{query} OR col2:{query})
This will return the documents having the value {query} in at least one of the fields.
(Note: I am using the classes Query and MultiFieldQueryParser.)
