How to make subquery fast

How to make subquery fast - subquery

for an author overview we are looking for a query which will show all the authors including their best book. The problem with this query is that it lacks speed. There are only about 1500 authors and the query do generate the overview is currently taking 20 seconds.
The main problem seems te be generating the average rating of all the books per person.
By selecting the following query, it is still rather fast
select
person.id as pers_id,
person.firstname,
person.suffix,
person.lastname,
thriller.title,
year(thriller.orig_pubdate) as year,
thriller.id as thrill_id,
count(user_rating.id) as nr,
AVG(user_rating.rating) as avgrating
from
thriller
inner join
thriller_form
on thriller_form.thriller_id = thriller.id
inner join
thriller_person
on thriller_person.thriller_id = thriller.id
and thriller_person.person_type_id = 1
inner join
person
on person.id = thriller_person.person_id
left outer join
user_rating
on user_rating.thriller_id = thriller.id
and user_rating.rating_type_id = 1
where thriller.id in
(select top 1 B.id from thriller as B
inner join thriller_person as C on B.id=C.thriller_id
and person.id=C.person_id)
group by
person.firstname,
person.suffix,
person.lastname,
thriller.title,
year(thriller.orig_pubdate),
thriller.id,
person.id
order by
person.lastname
However, if we make the subquery a little more complex by selecting the book with the average rating it takes a full 20 seconds to generate a resultset.
The query would then be as follows:
select
person.id as pers_id,
person.firstname,
person.suffix,
person.lastname,
thriller.title,
year(thriller.orig_pubdate) as year,
thriller.id as thrill_id,
count(user_rating.id) as nr,
AVG(user_rating.rating) as avgrating
from
thriller
inner join
thriller_form
on thriller_form.thriller_id = thriller.id
inner join
thriller_person
on thriller_person.thriller_id = thriller.id
and thriller_person.person_type_id = 1
inner join
person
on person.id = thriller_person.person_id
left outer join
user_rating
on user_rating.thriller_id = thriller.id
and user_rating.rating_type_id = 1
where thriller.id in
(select top 1 B.id from thriller as B
inner join thriller_person as C on B.id=C.thriller_id
and person.id=C.person_id
inner join user_rating as D on B.id=D.thriller_id
group by B.id
order by AVG(D.rating))
group by
person.firstname,
person.suffix,
person.lastname,
thriller.title,
year(thriller.orig_pubdate),
thriller.id,
person.id
order by
person.lastname
Anyone got a good suggestion to speed up this query?

Calculating an average requires a table scan since you've got to sum the values and then divide by the number of (relevant) rows. This in turn means that you're doing a lot of rescanning; that's slow. Can you calculate the averages once and store them? That would let your query use those pre-computed values. (Yes, it denormalizes the data, but denormalizing for performance is often necessary; there's a trade-off between performance and minimal data.)
It might be appropriate to use a temporary table as the store of the averages.

Related

Mysql slow multiple join query - version 5.7

I have the following query:
SELECT *
FROM `ResearchEntity` AS `ResearchEntity`
LEFT OUTER JOIN `UserEntity` AS `createdBy` ON `ResearchEntity`.`createdById` = `createdBy`.`id`
LEFT OUTER JOIN `Item1Entity` AS `item1` ON `ResearchEntity`.`id` = `Item1`.`researchId`
LEFT OUTER JOIN `Item2Entity` AS `item2` ON `ResearchEntity`.`id` = `Item2`.`researchId`
LEFT OUTER JOIN `Item3Entity` AS `item3` ON `ResearchEntity`.`id` = `Item3`.`researchId`
LEFT OUTER JOIN `Item4Entity` AS `item4` ON `ResearchEntity`.`id` = `Item4`.`researchId`
LEFT OUTER JOIN `Item5Entity` AS `item5` ON `ResearchEntity`.`id` = `Item5`.`researchId`
LEFT OUTER JOIN `Item6Entity` AS `item6` ON `ResearchEntity`.`id` = `Item6`.`researchId`
LEFT OUTER JOIN `Item7Entity` AS `item7` ON `ResearchEntity`.`id` = `Item7`.`researchId`
LEFT OUTER JOIN `Item8Entity` AS `item8` ON `ResearchEntity`.`id` = `Item8`.`researchId`
LEFT OUTER JOIN `Item9Entity` AS `item9` ON `ResearchEntity`.`id` = `Item9`.`researchId`
LEFT OUTER JOIN `Item10Entity` AS `item10` ON `ResearchEntity`.`id` = `Item10`.`researchId`
ORDER BY `ResearchEntity`.`id` DESC
LIMIT 20, 40;
UserEntity has 4 records
Item*Entity each has 15 records
When I remove the first join, the UserEntity join the query runs very fast, but with it is runs in 50 seconds.
The query is built dynamically in runtime using a ORM.
Why UserEntity is causing this much trouble?
Thanks

Try changing your select * to select specific fields from each entity, the load of your query will be reduced and run time should be faster. something like:
SELECT ResearchEntity.field1 as ReField1, item1.name as Item1Name, ...
-- Edit
Also, try to keep the smaller tables to the left of the JOIN statements.
Ensure you index the joining fields and the joining fields are INT
If you are sure that one Table is causing too much time, consider fetching results of that table separately in your code

MSSQL: Use the result of nested sub-queries

The following works and results in the output shown in the image below.
SELECT
SU_Internal_ID,
NQ_QuestionText,
NA_AnswerText,
NoOfTimesChoosen
FROM
(SELECT
U.SU_Internal_ID,
NQ.NQ_QuestionText,
NA.NA_AnswerText,
COUNT(PC.UserID) AS NoOfTimesChoosen
FROM [dbo].[ParticipantNSChoices] PC
INNER JOIN [dbo].[KnowledgeSurveyAnswers] NA
on PC.NA_Internal_ID = NA.NA_Internal_ID
INNER JOIN [dbo].[KnowledgeSurveyQuestions] NQ
on PC.NQ_Internal_ID = NQ.NQ_Internal_ID
INNER JOIN [dbo].[AspNetUsers] U
on PC.UserID = U.Id
WHERE
U.SU_Internal_ID=1
and NQ.NQ_QuestionText LIKE '%Do you feel comfortable working with computers%'
GROUP
BY U.SU_Internal_ID,
NQ.NQ_QuestionText,
NA.NA_AnswerText ) as A
I want to add a column to show the percent for the two answers 'No' and 'Yes': so next to 'No' I want '20' and next to 'Yes' '80', but I'm pretty new at this and am stuck; I would appreciate any help. Thanks.
Result of working script

You don't need the outer SELECT.
SELECT
U.SU_Internal_ID,
NQ.NQ_QuestionText,
NA.NA_AnswerText,
COUNT(PC.UserID) AS NoOfTimesChoosen,
(cast(COUNT(PC.UserID) as float) /
cast(
(select count(*) from [dbo].[ParticipantNSChoices] PC2
INNER JOIN [dbo].[KnowledgeSurveyAnswers] NA2 on PC2.NA_Internal_ID = NA2.NA_Internal_ID
INNER JOIN [dbo].[KnowledgeSurveyQuestions] NQ2 on PC2.NQ_Internal_ID = NQ2.NQ_Internal_ID
INNER JOIN [dbo].[AspNetUsers] U2 on PC2.UserID = U2.Id
WHERE
U2.SU_Internal_ID=1
and NQ2.NQ_QuestionText LIKE '%Do you feel comfortable working with computers%' )
as float))
* 100 as PercentChosen
FROM [dbo].[ParticipantNSChoices] PC
INNER JOIN [dbo].[KnowledgeSurveyAnswers] NA
on PC.NA_Internal_ID = NA.NA_Internal_ID
INNER JOIN [dbo].[KnowledgeSurveyQuestions] NQ
on PC.NQ_Internal_ID = NQ.NQ_Internal_ID
INNER JOIN [dbo].[AspNetUsers] U
on PC.UserID = U.Id
WHERE
U.SU_Internal_ID=1
and NQ.NQ_QuestionText LIKE '%Do you feel comfortable working with computers%'
GROUP
BY U.SU_Internal_ID,
NQ.NQ_QuestionText,
NA.NA_AnswerText
The counts will be integers, so you need to cast as floats before dividing. You can then further format to your liking. Also, I might not have your exact denominator, because I don't know what your data looks like, but you can modify to match what you need.

How to get sub query columns in main query with WHERE EXISTS in PostgreSQL?

I am stuck with a query which takes more time in JOIN, I want to use WHERE EXISTS in place of JOIN since as performance wise EXISTS takes less time than it.
I have modified the query and it's executing as per expectation but I am not able to use sub query's columns in my main query
Here is my query
SELECT MAX(st.grade_level::integer) AS grades ,
scl.sid AS org_sourced_id
FROM schedules_53b055b75cd237fde3af904c1e726e12 sch
LEFT JOIN schools scl ON(sch.school_id=scl.school_id)
AND scl.batch_id=sch.batch_id
AND scl.client_id = sch.client_id
AND sch.run_id = scl.run_id
WHERE EXISTS
(SELECT t.term_id,t.abbreviation
FROM terms t
WHERE (sch.term = t.term_id)
AND t.batch_id=sch.batch_id
AND t.client_id = sch.client_id
AND t.run_id = sch.run_id)
AND EXISTS
(SELECT st.grade_level,
st.sid
FROM students st
WHERE (sch.student_id=st.sid)
AND st.batch_id= sch.batch_id
AND st.client_id = sch.client_id
AND st.run_id = sch.run_id)
GROUP BY scl.sid ,
sch.course_name ,
sch.course_number,
sch.school_id
And I am getting this error:
ERROR: missing FROM-clause entry for table "st"
SQL state: 42P01
Character: 29
I have only used one column here just for sample but I have to use more fields from sub query.
My main aim is that how can I achieve this with EXISTS or any alternate solution which is more optimal as performance wise
I am using pg module on Node.js since as back end I am using Node.js.
UPDATE
Query with JOIN
SELECT MAX(st.grade_level::integer) AS grades ,
scl.sid AS org_sourced_id
FROM schedules_53b055b75cd237fde3af904c1e726e12 sch
LEFT JOIN schools scl ON(sch.school_id=scl.school_id)
AND scl.batch_id=sch.batch_id
AND scl.client_id = sch.client_id
AND sch.run_id = scl.run_id
LEFT JOIN terms t ON (sch.term = t.term_id)
AND t.batch_id=sch.batch_id
AND t.client_id = sch.client_id
AND t.run_id = sch.run_id
LEFT JOIN students st ON (sch.student_id=st.sid)
AND st.batch_id= sch.batch_id
AND st.client_id = sch.client_id
AND st.run_id = sch.run_id
GROUP BY scl.sid ,
sch.course_name ,
sch.course_number,
sch.school_id

Merge two queries

i need to merge two queries: one lists the sum off all items per month, and one lists the sum of the items YTD.
I used union and it works when 'YTD' is selected in my dropdownlist. However, when i select any other month it give me the results of YTD and the selected month....
The union query so far:
SELECT
Site.Site_Name 'Site',
'YTD' as 'Month_Name',
Sum(MOT.Total_MR_Count_Received) 'Receiving',
Sum(MOT.Total_Line_Item_Count_Received) 'Checking',
Sum(MOT.Total_MR_Count_Shipped) 'Shipment Activity'
FROM
Metrics_Main
INNER JOIN Metrics_MOT MOT ON Metrics_Main.Metrics_Key = MOT.Metrics_Key
INNER JOIN Month ON Metrics_Main.Month_Key = Month.Month_Key
INNER JOIN Site ON Metrics_Main.Site_Key = Site.Site_Key
group by Site.site_name
union
SELECT
Site.Site_Name 'Site',
Month.Month_Name 'Month_Name',
sum(MOT.Total_MR_Count_Received) 'Receiving',
sum(MOT.Total_Line_Item_Count_Received) 'Checking',
sum(MOT.Total_MR_Count_Shipped) 'Shipment_Activity'
FROM
Metrics_Main
INNER JOIN Metrics_MOT MOT ON Metrics_Main.Metrics_Key = MOT.Metrics_Key
INNER JOIN Month ON Metrics_Main.Month_Key = Month.Month_Key
INNER JOIN Site ON Metrics_Main.Site_Key = Site.Site_Key
WHERE
Month.Month_Name like #Month_Name
group by Site.site_name, month.month_name

This will help: http://www.w3schools.com/sql/sql_union.asp. Make sure you that you have the exact same number of columns in the two select queries; so add the "Month_Name" to the first query as well.

Complex sql query and JPQL

How to change this complex sql statement into JPQL?
select a.name, a.surname, b.street, c.location, c.location_desc
from table1 join table2 on b.id = a.some_fk
left join table3 d on d.id = a.id
left join table4 c on c.some2_id = d.some2_fk where a.id = 400;
If this is possible to present in the JPQL form?

Impossible to give a definitive answer without knowing the entities and their mapping, but the query would look like this:
select a.name, a.surname, b.street, c.location, c.locationDesc
from Entity1 a
join a.entity2 b
left join a.entity3 d
left join d.entity4 c
where a.id = 400;
provided the necessary associations between entities are there.

JPQL is object oriented, it operates against JPA entity objects ,not database tables. Either you need to change the Question and add an UML diagramm or provide the Entity classes.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

How to make subquery fast - subquery

Related

Mysql slow multiple join query - version 5.7

MSSQL: Use the result of nested sub-queries

How to get sub query columns in main query with WHERE EXISTS in PostgreSQL?

Merge two queries

Complex sql query and JPQL

Categories

Resources