Join on inequality in Power Query


I have been trying to answer this question
With the following data
+---------+---------+-----------+---------+
| Column1 | Column2 | Column3 | Column4 |
+---------+---------+-----------+---------+
| 1 | happy | 1-veggies | GHF |
| 1 | sad | 1-veggies | HGF |
| 2 | angry | 1-veggies | GHG |
| 2 | sad | 1-veggies | FGH |
| 3 | sad | 1-veggies | HGF |
| 4 | moody | 2-meat | FFF |
| 4 | sad | 2-meat | HGF |
| 5 | excited | 2-meat | HGF |
+---------+---------+-----------+---------+
OP was asking for a way of finding how many records there were which matched 'sad' and '1-veggies', and also had another record with the same value in column 1 and a code of GHF or FGH in column 4. The first two rows qualify, but the fourth row does not qualify because (if I understand correctly) it has the correct code, but in the same record as the one matching 'sad' and '1-veggies'. The count should be one.
I think the answer would have been fairly standard if this had been a SQL question - you would do a self-join with an equality on the first column and an inequality on the row number. In SQL it would look something like this:
create table Veggies
(
    num integer,
    emotion varchar(10),
    food varchar(10),
    code varchar(10),
    seq integer
);
insert into Veggies
values
    (1, 'happy', '1-veggies', 'GHF', 1),
    (1, 'sad', '1-veggies', 'HGF', 2),
    (2, 'angry', '1-veggies', 'GHG', 3),
    (2, 'sad', '1-veggies', 'FGH', 4),
    (3, 'sad', '1-veggies', 'HGF', 5),
    (4, 'moody', '2-meat', 'FFF', 6),
    (4, 'sad', '2-meat', 'HGF', 7),
    (5, 'excited', '2-meat', 'HGF', 8);
-- t1: rows matching 'sad' and '1-veggies'
with t1 (num, seq)
as
(
    select num, seq
    from Veggies
    where emotion = 'sad' and food = '1-veggies'
),
-- t2: rows carrying one of the wanted codes
t2 (num, seq)
as
(
    select num, seq
    from Veggies
    where code = 'GHF' or code = 'FGH'
)
-- same num, but a different row
select *
from t1 inner join t2 on t1.num = t2.num and t1.seq <> t2.seq;
I thought it might be possible to do the same thing (join on first column equal but row number unequal) in Power Query, but I have worked through the steps of getting the two queries with row numbers, and am stuck here:
I don't see any way of expressing an inequality and the documentation seems unhelpful. Does anyone have any inside knowledge on how to do this?

Although it looks as though you can't translate the SQL in the question directly into Power Query and replicate this in a single step
select *
from t1 inner join t2 on t1.num=t2.num and t1.seq<>t2.seq
you can split it into two steps, as suggested by @Ron Rosenfeld.
To recap, the initial steps which hopefully were fairly straightforward were:
Establish a connection to the data as Table 1
Add an index column
Duplicate the table and call it Table 2
Filter Table 1 by 'sad' and '1-veggies'
Filter Table 2 by 'GHF' or 'FGH'
Now join Table 2 to Table 1 using an inner join on Column 1,
and exclude rows that were in Table 1 using a left anti join on the index column.
This leaves one row as required.
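The two steps can be sketched outside Power Query to check the logic. The following is a minimal Python illustration, not Power Query M: the tuple layout and the variable names (`table1`, `table2`, `two_step`, `direct`) are assumptions made for this sketch. It applies the inner join on Column1, then the anti join on the index, and compares the result with the single-step inequality join from the SQL version.

```python
# Sample rows from the question: (index, Column1, Column2, Column3, Column4).
rows = [
    (1, 1, "happy", "1-veggies", "GHF"),
    (2, 1, "sad", "1-veggies", "HGF"),
    (3, 2, "angry", "1-veggies", "GHG"),
    (4, 2, "sad", "1-veggies", "FGH"),
    (5, 3, "sad", "1-veggies", "HGF"),
    (6, 4, "moody", "2-meat", "FFF"),
    (7, 4, "sad", "2-meat", "HGF"),
    (8, 5, "excited", "2-meat", "HGF"),
]

# Table 1: 'sad' and '1-veggies'. Table 2: code 'GHF' or 'FGH'.
table1 = [r for r in rows if r[2] == "sad" and r[3] == "1-veggies"]
table2 = [r for r in rows if r[4] in ("GHF", "FGH")]

# Step 1: inner join Table 2 to Table 1 on Column1.
joined = [(t2, t1) for t2 in table2 for t1 in table1 if t2[1] == t1[1]]

# Step 2: left anti join on the index column, dropping pairs whose
# Table 2 row also appears in Table 1.
table1_indexes = {t1[0] for t1 in table1}
two_step = [(t2, t1) for t2, t1 in joined if t2[0] not in table1_indexes]

# Reference: the single-step inequality join used in the SQL version.
direct = [
    (t2, t1)
    for t2 in table2
    for t1 in table1
    if t2[1] == t1[1] and t2[0] != t1[0]
]

print(len(two_step), len(direct))
```

On this sample both approaches keep the same single pair. They are not guaranteed to agree on every data set: the anti join drops a Table 2 row whenever its index appears anywhere in Table 1, even if it also pairs with a different Table 1 row.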

Related

How to combine two columns into one in Sqlite and also get the underlying value of the Foreign Key?

I want to be able to combine two columns from a table into one column and then get the actual values of the foreign keys. I can do these things individually but not together.
Following the answer below I was able to combine the two columns into one using the first sql statement below.
How to combine 2 columns into a new one in sqlite
The combining process is shown below:
+---+---+
|HT | AT|
+---+---+
|1 | 2 |
|5 | 7 |
|9 | 5 |
+---+---+
into one column as shown:
+---+
|HT |
+---+
| 1 |
| 5 |
| 9 |
| 2 |
| 7 |
| 5 |
+---+
The second SQL statement shows the actual value of each foreign key corresponding to each foreign key id. The Foreign Key Table.
+-----+------------------------+
|T_id | TN |
+-----+------------------------+
| 1 | 'Dallas Cowboys' |
| 2 | 'Chicago Bears' |
| 5 | 'New England Patriots' |
| 7 | 'New York Giants' |
| 9 | 'New York Jets' |
+-----+------------------------+
sql = "SELECT * FROM (SELECT M.HT FROM M UNION SELECT M.AT FROM Match)t"
The second sql statement lets me get the foreign key values for each value in M.HT.
sql = "SELECT M.HT, T.TN FROM M INNER JOIN T ON M.HT = T.Tid WHERE strftime('%Y-%m-%d', M.ST) BETWEEN \'2015-08-01\' AND \'2016-06-30\' AND M.Comp = 6 ORDER BY M.ST"
Result of second SQL statement:
+-----+------------------------+
| HT | TN |
+-----+------------------------+
| 1 | 'Dallas Cowboys' |
| 5 | 'New England Patriots' |
| 9 | 'New York Jets' |
+-----+------------------------+
But try as I might I have not been able to combine these queries!

I believe the following will work (assuming that the tables are Match and T, and barring the WHERE and ORDER BY clauses, which were left untested for brevity/ease) :-
SELECT DISTINCT(m.ht), t.tn
FROM
(SELECT Match.HT FROM Match UNION SELECT Match.AT FROM Match) AS m
JOIN T ON t.tid = m.ht
JOIN Match ON (m.ht = Match.ht OR m.ht = Match.at)
/* WHERE and ORDER BY clauses using Match as m only has columns ht and at */
WHERE strftime('%Y-%m-%d', Match.ST)
BETWEEN '2015-08-01' AND '2016-06-30' AND Match.Comp = 6
ORDER BY Match.ST
;
Note only tested without the WHERE and ORDER BY clause.
That is using :-
DROP TABLE IF EXISTS Match;
DROP TABLE IF EXISTS T;
CREATE TABLE IF NOT EXISTS Match (ht INTEGER, at INTEGER, st TEXT DEFAULT (datetime('now')));
CREATE TABLE IF NOT EXISTS t (tid INTEGER PRIMARY KEY, tn TEXT);
INSERT INTO T (tn) VALUES('Cows'),('Bears'),('a'),('b'),('Pats'),('c'),('Giants'),('d'),('Jets');
INSERT INTO Match (ht,at) VALUES (1,2),(5,7),(9,5);
/* Directly without the Common Table Expression */
SELECT
DISTINCT(m.ht), t.tn,
Match.st /*<<<<< Added to show results of obtaining other values from Matches >>>>> */
FROM
(SELECT Match.HT FROM Match UNION SELECT Match.AT FROM Match) AS m
JOIN T ON t.tid = m.ht
JOIN Match ON (m.ht = Match.ht OR m.ht = Match.at)
/* WHERE and ORDER BY clauses here using Match */
;
Noting that limited data (just the one extra column) was used for brevity
Results in :-
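As a rough cross-check, the union-and-join part of the query can be run from Python's built-in sqlite3 module against the same test data. This is a sketch rather than the answer's exact statement: it leaves out the second join to Match and the WHERE/ORDER BY clauses, quotes the table name "Match" to avoid any clash with the SQL keyword, and the variable name `team_rows` is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "Match" (ht INTEGER, at INTEGER);
    CREATE TABLE T (tid INTEGER PRIMARY KEY, tn TEXT);
    INSERT INTO T (tn) VALUES ('Cows'),('Bears'),('a'),('b'),('Pats'),
                              ('c'),('Giants'),('d'),('Jets');
    INSERT INTO "Match" (ht, at) VALUES (1,2),(5,7),(9,5);
""")

# UNION stacks HT and AT into one de-duplicated column (named ht after
# the first SELECT); the join then looks up each id's name in T.
team_rows = conn.execute("""
    SELECT m.ht, T.tn
    FROM (SELECT ht FROM "Match" UNION SELECT at FROM "Match") AS m
    JOIN T ON T.tid = m.ht
    ORDER BY m.ht
""").fetchall()

for team_id, team_name in team_rows:
    print(team_id, team_name)
```

Because UNION removes duplicates, team 5 appears once even though it is both a home and an away team in the test data. UNION ALL would keep both occurrences.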

Conditional Inner join in sqlite python

I have three tables a, b and c.
Table a is related with table b through column key.
table b is related with table c through columns word, sense and speech. In addition table c holds column id.
Now some rows in a.word have no matching value with b.word, based on that
I want to inner join tables on condition if a.word = b.word then join, otherwise compare only a.end_key = b.key.
As a result I want a table in the form of a, with extra columns start_id and end_id from c matching key_start and key_end.
I tried the following SQL command with Python:
CREATE TABLE relations
AS
SELECT * FROM
c
INNER JOIN
a
INNER JOIN
b
ON
a.end_key = b.key
AND
a.start_key = b.key
AND
b.word = c.word
AND
b.speech = c.speech
AND
b.sense = c.sense
OR
a.word = b.word
a:
+-----------+---------+------+-----------+
| key_start | key_end | word | relation |
+-----------+---------+------+-----------+
| k5 | k1 | tree | h |
| k7 | k2 | car | m |
| k200 | k3 | bad | ho |
+-----------+---------+------+-----------+
b:
+-----+------+--------+-------+
| key | word | speech | sense |
+-----+------+--------+-------+
| k5 | sky | a | 1 |
| k2 | car | a | 1 |
| k3 | bad | n | 2 |
+-----+------+--------+-------+
c:
+----+---------+--------+-------+
| id | word | speech | sense |
+----+---------+--------+-------+
| 0 | light | a | 1 |
| 0 | dark | b | 3 |
| 1 | neutral | a | 2 |
+----+---------+--------+-------+
Edit for clarification:
Tables a, b and c hold hundreds of thousands of rows, so there are matching values in the tables. Table a is related to table b with end_key ~ key and start_key ~ key relations. Table b is related to c through word, sense and speech; there are values which match in each of these columns.
The desired table is in form
start_id|key_start|key_end|end_id|relation
Where start_id matches key_start and key_end matches end_id.

EDIT new answer
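The problem with the proposed query lies in the use of ANDs and ORs (and likely missing parentheses). AND binds more tightly than OR, so the trailing OR a.word = b.word stands on its own as an alternative to the whole chain of ANDs, rather than as an alternative to the key comparison only. This statement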
a.word = b.word then join, otherwise compare only a.end_key = b.key.
would translate to:
AND (a.word= b.word OR a.end_key = b.key).
Maybe try it like this:
ON
b.word = c.word
AND
b.speech = c.speech
AND
b.sense = c.sense
AND
(a.word = b.word OR a.end_key = b.key)
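The effect of the parentheses can be seen on a small made-up data set. The sketch below uses Python's sqlite3 module. The rows are hypothetical (the sample data in the question has no matches between b and c), the column names key_start/key_end follow the table listing rather than the start_key/end_key used in the query, and `loose` and `strict` are names chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (key_start TEXT, key_end TEXT, word TEXT, relation TEXT);
    CREATE TABLE b (key TEXT, word TEXT, speech TEXT, sense INTEGER);
    CREATE TABLE c (id INTEGER, word TEXT, speech TEXT, sense INTEGER);
    -- Hypothetical rows chosen so that some matches exist.
    INSERT INTO a VALUES ('k5','k1','tree','h'), ('k7','k2','car','m');
    INSERT INTO b VALUES ('k1','oak','n',1), ('k2','car','a',1), ('k9','sky','a',1);
    INSERT INTO c VALUES (10,'oak','n',1), (11,'car','a',1), (12,'sky','b',1);
""")

# Without parentheses the final OR is an alternative to ALL the ANDed
# conditions, so a.word = b.word alone lets every row of c through.
loose = conn.execute("""
    SELECT a.word, b.key, c.id
    FROM c INNER JOIN a INNER JOIN b
    ON a.key_end = b.key
       AND b.word = c.word AND b.speech = c.speech AND b.sense = c.sense
       OR a.word = b.word
    ORDER BY c.id, a.word
""").fetchall()

# With parentheses the b-to-c match is always required, and a matches b
# either by word or by key.
strict = conn.execute("""
    SELECT a.word, b.key, c.id
    FROM c INNER JOIN a INNER JOIN b
    ON b.word = c.word AND b.speech = c.speech AND b.sense = c.sense
       AND (a.word = b.word OR a.key_end = b.key)
    ORDER BY c.id
""").fetchall()

print(len(loose), strict)
```

In this sketch the unparenthesized form pairs the 'car' row with every row of c, including ones whose word, speech and sense do not match, while the parenthesized form returns one row per genuine b-to-c match.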
It would be a good idea to test in a sqlite manager (eg command line sqlite3, DB Browser for sqlite) before you try it in python; troubleshooting is much easier. And of course test the SELECT before you implement it in a CREATE TABLE.
You could clarify your question by showing the desired columns and result in relations table that this sample data would create (there is nothing between b and c that would match on word, speech, sense). Also the description of the relationship between a and b is confusing. In the first paragraph it says Table a is related with table b through column key. Should key be word?

CQL (Cassandra) Select only rows with Max Value on a Column

This question is essentially the same as in this post, SQL Select only rows with Max Value on a Column, except in CQL. I'm working with Cassandra 3.10 so GROUP BY is supported, but HAVING and JOIN are not.
As in the question in above link, we need to find the rows (including "content" column) in each id, with max(rev). In fact, the actual problem I'm trying to solve is to max(rev) grouping by two identifiers, id1 and id2, so ordering by id also doesn't work here.
+------+-------+-------+--------------------------------------+
| id1 | rev | id2 | content |
+------+-------+-------+--------------------------------------+
| 1 | 1 | 1 | ... |
| 1 | 2 | 1 | ... |
| 2 | 1 | 2 | ... |
| 1 | 3 | 3 | ... |
+------+-------+-------+--------------------------------------+
The SQL solutions I had for this were:
SELECT id1, id2, rev, content FROM table
GROUP BY id1, id2 HAVING rev = MAX(rev);
And
SELECT id1, id2, rev, content FROM table
WHERE rev IN
(SELECT MAX(rev) FROM table GROUP BY id1, id2)
(The second works assuming rev is unique.)
Without HAVING or JOIN, what would be a viable approach in CQL or Cassandra 3.10?

Transpositioning and matching values

Is there any formula or built in tool in Excel to make such a thing?
I have table:
| A | B | C |
1 | nam1 | val1 | val2 |
2 | nam2 | val3 | val4 |
3 | nam3 | val5 | val6 |
I want this to look like that:
1 | val1 | nam1
2 | val2 | nam1
3 | val3 | nam2
4 | val4 | nam2
I want to assign names to values, which should be in rows.

I guess getting someone to write formulae for you counts as easier! Assuming a layout as shown, I suggest two formulae, because one output column repeats a single source column and the other lists out a matrix (which I have extended with two further columns in view of your Comment). Applying OFFSET as suggested by @Jeeped:
In Row 1 (ColumnG in the example) and copied down to suit:
=OFFSET($B$1,INT((ROW()-1)/4),MOD(ROW()-1,4))
In H1 and copied down to suit:
=OFFSET($A$1,INT((ROW()-1)/4),0)
The row numbers of the formulae are used to calculate the appropriate offsets for rows and columns relative to the reference cells.
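The offset arithmetic can be traced in a short Python sketch. This is only an illustration of the index calculation, not an Excel feature: the function name `unpivot`, the `width` parameter and the sample values are assumptions, with `width=4` standing in for the four value columns the formulae assume.

```python
def unpivot(names, values, width=4):
    """Mimic the two OFFSET formulae for output rows 1, 2, 3, ..."""
    out = []
    for row in range(1, len(names) * width + 1):  # ROW() is 1-based
        row_offset = (row - 1) // width           # INT((ROW()-1)/4)
        col_offset = (row - 1) % width            # MOD(ROW()-1,4)
        value = values[row_offset][col_offset]    # OFFSET($B$1, row, col)
        name = names[row_offset]                  # OFFSET($A$1, row, 0)
        out.append((value, name))
    return out


# Hypothetical data: names in column A, four values per name in B:E.
names = ["nam1", "nam2"]
values = [["a", "b", "c", "d"], ["e", "f", "g", "h"]]

for value, name in unpivot(names, values):
    print(value, name)
```

Every block of four output rows shares one source row (integer division), while the remainder walks across that row's columns.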

How to align multiline values in AsciiDoc table?

I would like to dynamically generate a table with asciidoc, which could look like this :
--------------------------------------
|Text | Parameter | Value1 | Value2 |
--------------------------------------
|foo | param1 | val1 | val2 |
--------------------------------------
|bar | param2 | val3 | val4 |
| | param3 | value_ | val6 |
| | | multi_ | |
| | | 5 | |
| | param4 | val7 | val8 |
--------------------------------------
| baz | param5 | val9 | val10 |
--------------------------------------
That is, there might be multiple parameters to one text, and their
values might span multiple lines. I am looking for a way to automatically
align these. I have a program that gathers data which changes, so I can
not manually fix things.
What I currently do: I have frame and gridless nested tables in the
Parameter, Value1 and Value2 columns. The problem with this is they only align if each value does not span multiple lines.
I also tried making Parameter, Value1 and Value2 a nested table together, with grid but no frame.
It works in terms of alignment, but doesn't look very good because the grid lines do not touch the gridlines of the outer table. Adding a frame also looks dull since it emphasizes multiparameter entries.
What I really want to do is add an extra line to the outer table (no table nesting) with no horizontal line in between, if there is an extra parameter.
I can not see how to do this with AsciiDoc. Is that possible at all? Any other suggestions on how to solve this?

It turns out this is rather easy with spans (see chapter 23.5):
.Multiline values aligned with spans
[cols=",,,",width="60%", options="header"]
|================
|Text | Parameter | Value1 | Value2
|foo | param1 | val1 | val2
.3+<.<|foo .3+<.<|bar | val3 | val4
| razzle bla fasel foo bar | dazzle
|bli | bla
|foo2 | param3 | val5 | val6
|================
Now all I need to do is tell my templating system (jinja2) how many rows I need to span, but that is rather a diligent but routine piece of work.
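Computing the span is mostly a matter of counting the parameters per text entry. The sketch below does this in plain Python rather than jinja2. The function name `render_table` and the shape of the input data are assumptions for illustration, and it relies on the `.N+|` row-span cell specifier used in the example above.

```python
def render_table(groups):
    """Render (text, [(param, value1, value2), ...]) groups as an AsciiDoc table."""
    lines = ['[options="header"]', "|===", "|Text |Parameter |Value1 |Value2"]
    for text, params in groups:
        span = len(params)
        # A text cell with several parameters spans that many rows.
        prefix = f".{span}+|" if span > 1 else "|"
        for i, (param, value1, value2) in enumerate(params):
            # Only the first row of a group carries the text cell.
            first = f"{prefix}{text} " if i == 0 else ""
            lines.append(f"{first}|{param} |{value1} |{value2}")
    lines.append("|===")
    return "\n".join(lines)


# Hypothetical data in the shape of the question's example.
groups = [
    ("foo", [("param1", "val1", "val2")]),
    ("bar", [("param2", "val3", "val4"),
             ("param3", "value_multi_5", "val6"),
             ("param4", "val7", "val8")]),
]
print(render_table(groups))
```

The same count could be passed to a jinja2 template as a variable. The sketch keeps the logic in Python so that the template only has to print the prefix.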

If you're using asciidoctor, there are many other options for tables including putting columns on new lines and using the metadata for the table to specify how many columns the table contains. This is the recommended way of doing tables in Asciidoctor. You can see this example and many others in the user's guide. To give an example here on SO:
[cols="2*"]
|===
|Cell in column 1, row 1
|Cell in column 2, row 1
|Cell in column 1, row 2
|Cell in column 2, row 2
|===
Asciidoctor can be a drop in replacement for the asciidoc command, though you will want to look at differences between the two.
