I am running a query with Oracle:
SELECT
c.customer_number,
COUNT(DISTINCT o.ORDER_NUMBER),
COUNT(DISTINCT q.QUOTE_NUMBER)
FROM
Customer c
JOIN Orders o on c.customer_number = o.party_number
JOIN Quote q on c.customer_number = q.account_number
GROUP BY
c.customer_number
This works beautifully and I can get the customer and their order and quote counts.
However, not all customers have orders or quotes but I still want their data. When I use LEFT JOIN I get this error from Oracle:
ORA-24347: Warning of a NULL column in an aggregate function
Seemingly this error is caused by the eventual COUNT(NULL) for customers that are missing orders and/or quotes.
How can I get a COUNT of null values to come out to 0 in this query?
I can do COUNT(DISTINCT NVL(o.ORDER_NUMBER, 0)) but then the counts will come out to 1 if orders/quotes are missing which is no good. Using NVL(o.ORDER_NUMBER, NULL) has the same problem.
Try using inline views:
SELECT
c.customer_number,
o.order_count,
q.quote_count
FROM
customer c,
( SELECT
party_number,
COUNT(DISTINCT order_number) AS order_count
FROM
orders
GROUP BY
party_number
) o,
( SELECT
account_number,
COUNT(DISTINCT quote_number) AS quote_count
FROM
quote
GROUP BY
account_number
) q
WHERE 1=1
AND c.customer_number = o.party_number (+)
AND c.customer_number = q.account_number (+)
;
Sorry, but I'm not working with any databases right now to test this, or to test whatever the ANSI SQL version might be. Just going on memory.
Related
i have i would like to create a join over several tables.
table login : I would like to retrieve all the data from login
table logging : calculating the Nb_of_sessions for each db & for each a specific event type by user
table meeting : calculating the Nb_of_meetings for each db & for each user
table live : calculating the Nb_of_live for each db & for each user
I have those queries with the right results :
SELECT db.id,_id as userid,firstname,lastname
FROM "logins"."login",
UNNEST(dbs) AS a1 (db)
SELECT dbid,userid,count(distinct(sessionid)) as no_of_visits,
array_join(array_agg(value.from_url),',') as from_url
FROM "loggings"."logging"
where event='url_event'
group by db.id,userid;
SELECT dbid,userid AS userid,count(*) as nb_interviews,
array_join(array_agg(interviewer),',') as interviewer
FROM "meetings"."meeting"
group by dbid,userid;
SELECT dbid,r1.user._id AS userid,count(_id) as nb_chat
FROM "lives"."live",
UNNEST(users) AS r1 (user)
group by dbid,r1.user._id;
But when i begin to try put it all together, it seems i retrieve bad data (i have only on db retrieved) and it seems not efficient.
select a1.db.id,a._id as userid,a.firstname,a.lastname,count(rl._id) as nb_chat
FROM
"logins"."login" a,
"loggings"."logging" b,
"meetings"."meeting" c,
"lives"."live" d,
UNNEST(dbs) AS a1 (db),
UNNEST(users) AS r1 (user)
where a._id = b.userid AND a._id = c.userid AND a._id = r1.user._id
group by 1,2,3,4
Do you have an idea ?
Regards.
The easiest way is to work with with to structure the subquery and then reference them.
with parameter reference:
You can use WITH to flatten nested queries, or to simplify subqueries.
The WITH clause precedes the SELECT list in a query and defines one or
more subqueries for use within the SELECT query.
Each subquery defines a temporary table, similar to a view definition,
which you can reference in the FROM clause. The tables are used only
when the query runs.
Since you already have working sub queries, the following should work:
with logins as
(
SELECT db.id,_id as userid,firstname,lastname
FROM "logins"."login",
UNNEST(dbs) AS a1 (db)
)
,visits as
(
SELECT dbid,userid,count(distinct(sessionid)) as no_of_visits,
array_join(array_agg(value.from_url),',') as from_url
FROM "loggings"."logging"
where event='url_event'
group by db.id,userid
)
,meetings as
(
SELECT dbid,userid AS userid,count(*) as nb_interviews,
array_join(array_agg(interviewer),',') as interviewer
FROM "meetings"."meeting"
group by dbid,userid
)
,chats as
(
SELECT dbid,r1.user._id AS userid,count(_id) as nb_chat
FROM "lives"."live",
UNNEST(users) AS r1 (user)
group by dbid,r1.user._id
)
select *
from logins l
left join visits v
on l.dbid = v.dbid
and l.userid = v.userid
left join meetings m
on l.dbid = m.dbid
and l.userid = m.userid
left join chats c
on l.dbid = c.dbid
and l.userid = c.userid;
All of the documentation for Cosmos DB and it looks like it only supports the JOINkeyword, which seems to be a sort of INNER JOIN.
I have the following query:
SELECT * FROM
(
SELECT
DISTINCT(c.id),
c.OtherCollection,
FROM c
JOIN s IN c.OtherCollection
)
AS c order by c.id
This works fine and returns the data of documents that have OtherCollection populated. But It obviously does not return any documents that do not have it populated.
The reason for the join is that sometimes I execute the following query (queries are built up from user input)
SELECT * FROM
(
SELECT
DISTINCT(c.id),
c.OtherCollection,
FROM c
JOIN s IN c.OtherCollection
WHERE s.PropertyName = 'SomeValue'
)
AS c order by c.id
The question is how can I have a sort of LEFT JOIN operator in this scenario?
CosmosDB JOIN operation is limited to the scope of a single document. What possible is you can join parent object with child objects under same document.
It is totally different from SQL Join query which supports across two/many tables.
You can simulate LEFT JOIN with the EXISTS sentence.
Eg:
SELECT VALUE c
FROM c
WHERE (
(c.OtherCollection = null) OR EXISTS (--Like a "Left Join"
SELECT null
FROM s IN c.OtherCollection
WHERE s.PropertyName = 'SomeValue'
)
)
--AND/OR Some other c Node conditions
order by c.id
Am getting an error that goes like this:
Insert values statement can contain only constant literal values or variable references.
these are the statements in which I am getting the errors:
INSERT INTO val.summary_numbers (metric_name, metric_val, dt_create) VALUES ('Total IP Enconters',
(SELECT
count(DISTINCT encounter_id)
FROM prod.encounter
WHERE encounter_type = 'Inpatient')
,
(SELECT min(mod_loadidentifier)
FROM ccsm.stg_demographics_baseline)
);
INSERT INTO val.summary_numbers (metric_name, metric_val, dt_create) VALUES ('Total 30d Readmits',
(SELECT
count(DISTINCT encounter_id)
FROM prod.encounter_attr
WHERE
attr_name = 'day_30_readmit' AND attr_value = 1)
,
(SELECT min(mod_loadidentifier)
FROM ccsm.stg_demographics_baseline));
Change your query like this:
insert into val.summary_numbers
select
'Total IP Enconters',
(select count(distinct encounter_id)
from prod.encounter
where encounter_type = 'Inpatient'),
(select min(mod_loadidentifier)
from ccsm.stg_demographics_baseline)
When using the ADW service, I would recommend that you consider using the CTAS operation possibly combined with a RENAME. The RENAME is a metadata operation so it is fast and the CTAS is parallel where the INSERT INTO will be row by row.
You may still have a data related issue that can be hard to determine with out the create table statement.
Thanks
My application needs to get some basic data from a user table with primary key user_id - and various other data about the user from secondary tables, each of which has user_id as a foreign key. There are a bunch of these secondary tables such as name, addresss, phone, etcetera - things about a person that can change over time.
More specifically, I need only some values from the most recent row from each secondary table. Each table has a "latest" column which is unix timestamp of the most recent UPDATE or INSERT (we must not delete in this application).
The following works correctly:
SELECT u.username, u.user_id, u.password, u.email, u.active
, n.first , n.middle , n.last
, uo.organization_id /* , other_cols_from_other_tables */
FROM user u
LEFT JOIN user_org uo ON (uo.user_id = u.user_id AND
uo.latest in (select max(latest) from name uo1
where uo1.user_id = u.user_id))
/* here, other LEFT JOINs like the above one */
WHERE u.username = :username
However, a subquery solution is widely discouraged due to slowness, and some of these queries will run on every request. So I came up with the following that works in some cases and gets rid of the subquery:
SELECT u.username, u.user_id, u.password, u.email, u.active
, n.first , n.middle , n.last
, uo.organization_id /* , other_cols_from_other_tables, etc. */
FROM user u
INNER JOIN
( SELECT user_id, MAX(latest) utd
FROM user_org
GROUP BY user_id
) uo1 ON uo1.user_id = u.user_id
LEFT JOIN user_org uo
ON (uo.user_id = u.user_id and uo.latest = uo1.utd)
/* here, other clauses like the part from 'FROM' to here */
WHERE u.username = :username
The latter, unfortunately makes a hard dependence on data in the secondary table, so that the whole query fails if data is lacking in any secondary table for the particular user.
I've researched this on SO and www and there are many solutions for avoiding subqueries, but everything I've found on the subject has the issue in the main query, not in a left join.
The logic I need is "if there's data for this user in this secondary table, get the specified column(s) from the most recent row in that table, otherwise a null".
It seems to me that putting a "current row" marker column on the most recent row in each table would avoid the whole issue and run faster than any other solution, but would be against normalization (I would still have to have the 'latest' column to maintain order-able history of previous data).
Is there a solution that gets normalization + speed? This is mariadb so it needs Mysql syntax.
EDIT: Still would like a better way, but decided to go with the extra column. Now the problem described above is avoided, and the SELECT SQL is much simplified and presumably faster. The downside is adding complexity in saves, but SELECT is more frequent.
MariaDB supports ROW_NUMBER as of version 10.2:
SELECT
u.username,
u.user_id,
u.password,
u.email,
u.active,
uo.organization_id,
...
FROM user u
LEFT JOIN
(
select
user_org.*,
row_number() over(partition by user_id order by latest desc) as rn
from user_org
) uo ON uo.user_id = u.user_id AND uo.rn = 1
...
WHERE u.username = :username;
I'm fairly new to SQL and have started running into sub-queries as in this query below:
SELECT C.CustomerID
, C.Name
, ( Select PhoneNumber
FROM PhoneNumberTable P
WHERE P.CustomerID = C.CustomerID ) AS "PhoneNumber"
FROM CustomerTable C
Comparing to this query with a join below:
SELECT C.CustomerID
, C.Name
, P.PhoneNumber
FROM CustomerTable C
JOIN PhoneNumberTable P
ON P.customerID = C.customerID
Is there a difference in terms of efficiency/speed? The SQL I am working with has several sub-queries as I have shown above (no JOINs) and it is difficult to read.
joins in my experience tend to be faster, but sometimes you need a subquery.
you should also look into CTE they are very usefull and much easier (in my opinion) to manage
in your specific case i would use a join... because you are trying to join the 2 tables together.