Get MySQL rows where 2 words match - search

I am trying to build a simple search based on 2 MySQL tables: one called keywords (the words) and another called keywords2data (a map that binds words to a data source).
keywords holds id and keyword, whilst keywords2data holds keyword_id and data_id.
data_id itself is a reference to a 3rd table, which is unimportant in this case.
What I want is to be able to search for, for example, "dog sled" and get all data_ids that have both of those keywords bound to them.
SELECT k2d.`data_id` , k2d.`keyword_id`
FROM keywords2data as k2d, keywords as k
WHERE k2d.`keyword_id` = k.`id`
&& (k.`keyword` = 'dog' || k.`keyword` = 'sled')
LIMIT 10
This gives me every data_id that has either dog or sled bound to it, not necessarily both; I want only those that have both.
SELECT k2d.`data_id` , k2d.`keyword_id`
FROM keywords2data as k2d, keywords as k
WHERE k2d.`keyword_id` = k.`id`
&& (k.`keyword` = 'dog' && k.`keyword` = 'sled')
LIMIT 10
This returns nothing, since no single row in keywords2data holds 2 keywords.
What is the right way to do this?

How about something like
SELECT k2d.`data_id` ,
k2d.`keyword_id`
FROM keywords2data as k2d INNER JOIN
keywords as k ON k2d.`keyword_id` = k.`id` INNER JOIN
keywords2data as k2d2 ON k2d2.`data_id` = k2d.`data_id` INNER JOIN
keywords as k2 ON k2d2.`keyword_id` = k2.`id`
WHERE k.`keyword` = 'dog'
AND k2.`keyword` = 'sled'
LIMIT 10

How about this?
SELECT k2d.`data_id`
FROM keywords2data AS k2d
INNER JOIN keywords AS k
ON k2d.`keyword_id` = k.`id`
WHERE k.`keyword` IN ( 'dog', 'sled', 'rex' )
GROUP BY k2d.`data_id`
HAVING COUNT(DISTINCT k.`keyword`) = 3

Possibly, this?
To match more keywords, add more words to the OR condition in the subquery and change the = 2 after it to the number of words.
This assumes that each data item is linked to a given keyword through keywords2data once and only once.
SELECT k2d.data_id
, k2d.keyword_id
FROM keywords2data AS k2d
, keywords AS k
WHERE k2d.keyword_id = k.id
AND (
SELECT COUNT(*)
FROM keywords2data AS sqk2d
, keywords AS sqk
WHERE sqk2d.data_id = k2d.data_id
AND sqk2d.keyword_id = sqk.id
AND (sqk.keyword = 'dog' || sqk.keyword = 'sled')
) = 2
LIMIT 10
Here's a version that doesn't return the data_id repeated (as per comments), but also doesn't return any keywords at all:
SELECT k2d.data_id
FROM keywords2data AS k2d
WHERE (
SELECT COUNT(*)
FROM keywords2data AS sqk2d
, keywords AS sqk
WHERE sqk2d.data_id = k2d.data_id
AND sqk2d.keyword_id = sqk.id
AND (sqk.keyword = 'dog' || sqk.keyword = 'sled')
) = 2
LIMIT 10
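To check the "all keywords must match" logic on toy data, here is a minimal sketch using Python's built-in sqlite3 module instead of MySQL. It applies the GROUP BY data_id / HAVING COUNT(DISTINCT ...) approach; the sample rows and the helper name find_data_ids are made up for illustration, and only the table and column names come from the question.

```python
import sqlite3

# In-memory stand-in for the two MySQL tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE keywords (id INTEGER PRIMARY KEY, keyword TEXT);
    CREATE TABLE keywords2data (keyword_id INTEGER, data_id INTEGER);
    INSERT INTO keywords VALUES (1, 'dog'), (2, 'sled'), (3, 'cat');
    -- data 10: dog + sled, data 20: dog only, data 30: sled + cat
    INSERT INTO keywords2data VALUES (1, 10), (2, 10), (1, 20), (2, 30), (3, 30);
""")


def find_data_ids(conn, words):
    """Return the data_ids bound to every word in `words` (hypothetical helper)."""
    words = sorted(set(words))  # duplicates would break the count comparison
    placeholders = ", ".join("?" for _ in words)
    sql = f"""
        SELECT k2d.data_id
        FROM keywords2data AS k2d
        INNER JOIN keywords AS k ON k2d.keyword_id = k.id
        WHERE k.keyword IN ({placeholders})
        GROUP BY k2d.data_id
        HAVING COUNT(DISTINCT k.keyword) = ?
        ORDER BY k2d.data_id
    """
    return [row[0] for row in conn.execute(sql, [*words, len(words)])]


print(find_data_ids(conn, ["dog", "sled"]))  # only data 10 has both keywords
```

Grouping by data_id rather than by keyword is what makes the count comparison meaningful: each group is one data item, and the HAVING clause keeps it only if it matched as many distinct keywords as were searched for.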

Related

If my search term is not found add to array

I am looping through a list of terms and searching the database for each term. Is there a way to return a list of the search terms that did not find any results in the database?
This is my SELECT query:
SELECT x_tbl.x_str
FROM x_tbl
LEFT JOIN p_tbl ON p_tbl.p_id = x_tbl.p_id
LEFT JOIN t_tbl ON t_tbl.t_id = x_tbl.t_id
WHERE t_tbl.t = %(uw)s
OR t_tbl.ta = %(uw)s
OR %(uw)s = ANY (t_tbl.tb)
OR t_tbl.tc = %(uw)s
OR p_tbl.pa = %(uw)s
OR %(uw)s = ANY (p_tbl.pb)
I want to return uw if it does not find any results in the database. Is this possible?
For one search term
The two LEFT [OUTER] JOINs between x_tbl, p_tbl and t_tbl do not eliminate any rows by themselves. Two NOT EXISTS conditions return the search term only when it cannot find anything:
SELECT %(uw)s -- old-style Python placeholder
WHERE NOT EXISTS (
SELECT FROM p_tbl
WHERE pa = %(uw)s
OR %(uw)s = ANY (pb)
)
AND NOT EXISTS (
SELECT FROM t_tbl
WHERE %(uw)s IN (t, ta, tc)
OR %(uw)s = ANY (tb)
);
If there can be orphans in t_tbl and/or p_tbl (not linked to any row in x_tbl), the set may be bigger, and the query gets more expensive:
SELECT %(uw)s -- old-style Python placeholder
WHERE NOT EXISTS (
SELECT FROM x_tbl JOIN p_tbl p USING (p_id)
WHERE p.pa = %(uw)s
OR %(uw)s = ANY (p.pb)
)
AND NOT EXISTS (
SELECT FROM x_tbl JOIN t_tbl t USING (t_id)
WHERE %(uw)s IN (t.t, t.ta, t.tc)
OR %(uw)s = ANY (t.tb)
);
This is dealing with one search term at a time, like your original query. You mentioned a list. Running a single query for all of them might be (much) cheaper ...
One query to rule them all
Pass the list as array (Postgres array literal) - which may require an explicit cast (::text[]) - unnest() and attach the same WHERE conditions as above:
SELECT uw
FROM unnest(%(my_list_as_array)s::text[]) q(uw)
WHERE NOT EXISTS (
SELECT FROM p_tbl
WHERE pa = q.uw
OR q.uw = ANY (pb)
)
AND NOT EXISTS (
SELECT FROM t_tbl
WHERE q.uw IN (t, ta, tc)
OR q.uw = ANY (tb)
);
Or, including the join to x_tbl, same as above:
SELECT uw
FROM unnest(%(my_list_as_array)s::text[]) q(uw)
WHERE NOT EXISTS (
SELECT FROM x_tbl JOIN p_tbl p USING (p_id)
WHERE p.pa = q.uw
OR q.uw = ANY (p.pb)
)
AND NOT EXISTS (
SELECT FROM x_tbl JOIN t_tbl t USING (t_id)
WHERE q.uw IN (t.t, t.ta, t.tc)
OR q.uw = ANY (t.tb)
);
Basics:
Select rows which are not present in other table
You may want to keep array elements in original order, or even attach an ordinal position. See:
PostgreSQL unnest() with element number
Aside, your original query can multiply rows - if there can be more than one row on the right side of each join. See:
Two SQL LEFT JOINS produce incorrect result
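As a rough, runnable illustration of the "one query for the whole list" idea, here is a sketch using Python's sqlite3 module. It is a simplified stand-in, not the Postgres query: SQLite has no unnest() or array columns, so the list is passed as a VALUES list and the array columns tb and pb are left out. The sample rows and the helper name unmatched_terms are assumptions for the example.

```python
import sqlite3

# Toy versions of p_tbl and t_tbl without the array columns (pb, tb).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE p_tbl (p_id INTEGER PRIMARY KEY, pa TEXT);
    CREATE TABLE t_tbl (t_id INTEGER PRIMARY KEY, t TEXT, ta TEXT, tc TEXT);
    INSERT INTO p_tbl VALUES (1, 'alpha');
    INSERT INTO t_tbl VALUES (1, 'beta', 'gamma', 'delta');
""")


def unmatched_terms(conn, terms):
    """Return the search terms that match nothing (hypothetical helper)."""
    if not terms:
        return []
    # One "(?)" row per term plays the role of unnest(array) in Postgres.
    values = ", ".join("(?)" for _ in terms)
    sql = f"""
        WITH q(uw) AS (VALUES {values})
        SELECT uw
        FROM q
        WHERE NOT EXISTS (SELECT 1 FROM p_tbl WHERE pa = q.uw)
          AND NOT EXISTS (SELECT 1 FROM t_tbl WHERE q.uw IN (t, ta, tc))
    """
    return [row[0] for row in conn.execute(sql, list(terms))]


print(sorted(unmatched_terms(conn, ["alpha", "gamma", "omega", "zeta"])))
```

The point of the sketch is the shape of the query: the list of terms becomes a derived table, and each term survives only if both NOT EXISTS checks pass, so a single round trip replaces one query per term.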

Invalid Column In SQL Query

select
university_cars_video_kroenke.dbo.car_customer.cus_first,
university_cars_video_kroenke.dbo.car_customer.cus_last,
(
select COUNT(university_cars_video_kroenke.dbo.car_customer.cus_id)
from university_cars_video_kroenke.dbo.car_purchases
where university_cars_video_kroenke.dbo.car_customer.cus_id = university_cars_video_kroenke.dbo.car_purchases.cus_id
)
from university_cars_video_kroenke.dbo.car_customer
(edited for clarity)
select
customer.cus_first,
customer.cus_last,
(select
COUNT(customer.cus_id)
from purchases
where customer.cus_id = purchases.cus_id )
from customer
My error message is
Msg 8120, Level 16, State 1, Line 4 Column
'university_cars_video_kroenke.dbo.car_customer.cus_first'
is invalid in the select list because it is not contained
in either an aggregate function or the GROUP BY clause
I just want a count of the records where the cus_id is the same in both tables.
Something like the following should work.
SELECT
A.cus_id,
count(A.cus_id)
FROM
university_cars_video_kroenke.dbo.car_customer AS A,
university_cars_video_kroenke.dbo.car_purchases AS B
WHERE
A.cus_id = B.cus_id
GROUP BY
A.cus_id
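For a quick check of the per-customer count, here is a small sketch in Python's sqlite3 rather than SQL Server. The table names follow the question; the purchase_id column and the sample rows are invented for the example, and the LEFT JOIN (which keeps customers with no purchases) is a design choice of this sketch, not something the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE car_customer (cus_id INTEGER PRIMARY KEY, cus_first TEXT, cus_last TEXT);
    CREATE TABLE car_purchases (purchase_id INTEGER PRIMARY KEY, cus_id INTEGER);
    INSERT INTO car_customer VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Fox');
    INSERT INTO car_purchases VALUES (1, 1), (2, 1), (3, 2);
""")

# Every non-aggregated column in the select list also appears in GROUP BY,
# which is the rule that error Msg 8120 complains about.
sql = """
    SELECT c.cus_first, c.cus_last, COUNT(p.cus_id) AS purchase_count
    FROM car_customer AS c
    LEFT JOIN car_purchases AS p ON p.cus_id = c.cus_id
    GROUP BY c.cus_id, c.cus_first, c.cus_last
    ORDER BY c.cus_id
"""
purchase_counts = conn.execute(sql).fetchall()
for first, last, count in purchase_counts:
    print(first, last, count)
```

COUNT(p.cus_id) counts only rows where a purchase was actually joined, so a customer without purchases shows 0 instead of disappearing from the result.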

MSSQL: Use the result of nested sub-queries

The following works and results in the output shown in the image below.
SELECT
SU_Internal_ID,
NQ_QuestionText,
NA_AnswerText,
NoOfTimesChoosen
FROM
(SELECT
U.SU_Internal_ID,
NQ.NQ_QuestionText,
NA.NA_AnswerText,
COUNT(PC.UserID) AS NoOfTimesChoosen
FROM [dbo].[ParticipantNSChoices] PC
INNER JOIN [dbo].[KnowledgeSurveyAnswers] NA
on PC.NA_Internal_ID = NA.NA_Internal_ID
INNER JOIN [dbo].[KnowledgeSurveyQuestions] NQ
on PC.NQ_Internal_ID = NQ.NQ_Internal_ID
INNER JOIN [dbo].[AspNetUsers] U
on PC.UserID = U.Id
WHERE
U.SU_Internal_ID=1
and NQ.NQ_QuestionText LIKE '%Do you feel comfortable working with computers%'
GROUP
BY U.SU_Internal_ID,
NQ.NQ_QuestionText,
NA.NA_AnswerText ) as A
I want to add a column to show the percent for the two answers 'No' and 'Yes': so next to 'No' I want '20' and next to 'Yes' '80', but I'm pretty new at this and am stuck; I would appreciate any help. Thanks.
Result of working script
You don't need the outer SELECT.
SELECT
U.SU_Internal_ID,
NQ.NQ_QuestionText,
NA.NA_AnswerText,
COUNT(PC.UserID) AS NoOfTimesChoosen,
(cast(COUNT(PC.UserID) as float) /
cast(
(select count(*) from [dbo].[ParticipantNSChoices] PC2
INNER JOIN [dbo].[KnowledgeSurveyAnswers] NA2 on PC2.NA_Internal_ID = NA2.NA_Internal_ID
INNER JOIN [dbo].[KnowledgeSurveyQuestions] NQ2 on PC2.NQ_Internal_ID = NQ2.NQ_Internal_ID
INNER JOIN [dbo].[AspNetUsers] U2 on PC2.UserID = U2.Id
WHERE
U2.SU_Internal_ID=1
and NQ2.NQ_QuestionText LIKE '%Do you feel comfortable working with computers%' )
as float))
* 100 as PercentChosen
FROM [dbo].[ParticipantNSChoices] PC
INNER JOIN [dbo].[KnowledgeSurveyAnswers] NA
on PC.NA_Internal_ID = NA.NA_Internal_ID
INNER JOIN [dbo].[KnowledgeSurveyQuestions] NQ
on PC.NQ_Internal_ID = NQ.NQ_Internal_ID
INNER JOIN [dbo].[AspNetUsers] U
on PC.UserID = U.Id
WHERE
U.SU_Internal_ID=1
and NQ.NQ_QuestionText LIKE '%Do you feel comfortable working with computers%'
GROUP
BY U.SU_Internal_ID,
NQ.NQ_QuestionText,
NA.NA_AnswerText
The counts will be integers, so you need to cast as floats before dividing. You can then further format to your liking. Also, I might not have your exact denominator, because I don't know what your data looks like, but you can modify to match what you need.
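The integer-division point can be seen on toy data. This is a minimal sketch using Python's sqlite3 module, where dividing two integers truncates much as it does in SQL Server; the choices table and its rows are invented for the example and stand in for the joined survey tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE choices (user_id INTEGER, answer TEXT);
    INSERT INTO choices VALUES (1, 'Yes'), (2, 'Yes'), (3, 'Yes'), (4, 'Yes'), (5, 'No');
""")

sql = """
    SELECT answer,
           COUNT(*) AS n,
           -- integer / integer truncates to 0 before the multiplication
           COUNT(*) / (SELECT COUNT(*) FROM choices) * 100 AS pct_truncated,
           -- casting the numerator first keeps the fraction
           CAST(COUNT(*) AS REAL) / (SELECT COUNT(*) FROM choices) * 100 AS pct
    FROM choices
    GROUP BY answer
    ORDER BY answer
"""
percent_rows = conn.execute(sql).fetchall()
for answer, n, pct_truncated, pct in percent_rows:
    print(answer, n, pct_truncated, round(pct, 1))
```

The truncated column comes out as 0 for both answers, while the cast version gives roughly 20 for 'No' and 80 for 'Yes', matching the percentages asked for in the question.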

How to get sub query columns in main query with WHERE EXISTS in PostgreSQL?

I am stuck with a query that spends most of its time in JOINs. I want to use WHERE EXISTS in place of JOIN, since I understand EXISTS to perform better.
I have modified the query and it executes as expected, but I am not able to use the subquery's columns in my main query.
Here is my query
SELECT MAX(st.grade_level::integer) AS grades ,
scl.sid AS org_sourced_id
FROM schedules_53b055b75cd237fde3af904c1e726e12 sch
LEFT JOIN schools scl ON(sch.school_id=scl.school_id)
AND scl.batch_id=sch.batch_id
AND scl.client_id = sch.client_id
AND sch.run_id = scl.run_id
WHERE EXISTS
(SELECT t.term_id,t.abbreviation
FROM terms t
WHERE (sch.term = t.term_id)
AND t.batch_id=sch.batch_id
AND t.client_id = sch.client_id
AND t.run_id = sch.run_id)
AND EXISTS
(SELECT st.grade_level,
st.sid
FROM students st
WHERE (sch.student_id=st.sid)
AND st.batch_id= sch.batch_id
AND st.client_id = sch.client_id
AND st.run_id = sch.run_id)
GROUP BY scl.sid ,
sch.course_name ,
sch.course_number,
sch.school_id
And I am getting this error:
ERROR: missing FROM-clause entry for table "st"
SQL state: 42P01
Character: 29
I have used only one column here as a sample, but I need to use more fields from the subquery.
My main question is how I can achieve this with EXISTS, or with any alternative that performs better.
I am using the pg module, since my back end is Node.js.
UPDATE
Query with JOIN
SELECT MAX(st.grade_level::integer) AS grades ,
scl.sid AS org_sourced_id
FROM schedules_53b055b75cd237fde3af904c1e726e12 sch
LEFT JOIN schools scl ON(sch.school_id=scl.school_id)
AND scl.batch_id=sch.batch_id
AND scl.client_id = sch.client_id
AND sch.run_id = scl.run_id
LEFT JOIN terms t ON (sch.term = t.term_id)
AND t.batch_id=sch.batch_id
AND t.client_id = sch.client_id
AND t.run_id = sch.run_id
LEFT JOIN students st ON (sch.student_id=st.sid)
AND st.batch_id= sch.batch_id
AND st.client_id = sch.client_id
AND st.run_id = sch.run_id
GROUP BY scl.sid ,
sch.course_name ,
sch.course_number,
sch.school_id
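The error itself can be reproduced on a small scale. The sketch below uses Python's sqlite3 module rather than PostgreSQL, with cut-down, made-up versions of the schedules and students tables. It only illustrates the scoping rule: a table introduced inside an EXISTS subquery is not visible in the outer select list, so columns from it have to come from a join (or another construct that brings the table into the outer FROM clause). It does not say anything about which form is faster.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE schedules (student_id INTEGER, school_id INTEGER);
    CREATE TABLE students (sid INTEGER, grade_level TEXT);
    INSERT INTO schedules VALUES (1, 100), (2, 100), (3, 200);
    INSERT INTO students VALUES (1, '5'), (2, '7'), (3, '3');
""")

# "st" exists only inside the EXISTS subquery, so the outer MAX() cannot see it.
bad_sql = """
    SELECT sch.school_id, MAX(CAST(st.grade_level AS INTEGER)) AS grades
    FROM schedules sch
    WHERE EXISTS (SELECT 1 FROM students st WHERE st.sid = sch.student_id)
    GROUP BY sch.school_id
"""
try:
    conn.execute(bad_sql)
    exists_error = None
except sqlite3.OperationalError as exc:
    exists_error = str(exc)
print("EXISTS version failed with:", exists_error)

# Joining brings "st" into the outer FROM clause, so its columns are usable.
good_sql = """
    SELECT sch.school_id, MAX(CAST(st.grade_level AS INTEGER)) AS grades
    FROM schedules sch
    JOIN students st ON st.sid = sch.student_id
    GROUP BY sch.school_id
    ORDER BY sch.school_id
"""
grade_rows = conn.execute(good_sql).fetchall()
print(grade_rows)
```

An inner JOIN is used here because it filters rows the same way the EXISTS condition would; the original query's LEFT JOIN would also keep schedule rows without a matching student.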

Spotfire - advanced row level security

I'm working on row level security in a Spotfire (6.5) report.
It should be implemented on 3 levels; let's call them L1, L2 and L3. There is an additional mapping table that contains user logins and the values, at each level, to which the user has access. Additionally, if a user is not in the mapping table, he is a kind of root user and has access to everything.
On the DB side it looks like this:
CREATE TABLE SECURITY
(
USER_ID VARCHAR2(100 BYTE)
, L1 VARCHAR2(100 BYTE)
, L2 VARCHAR2(100 BYTE)
, L3 VARCHAR2(100 BYTE)
--, L1L2L3 VARCHAR2(100 BYTE) -- option there could be one column that contains lowest possible level
);
INSERT INTO SECURITY (USER_ID, L1) VALUES ('UNAME1','A');
INSERT INTO SECURITY (USER_ID, L2) VALUES ('UNAME2','BB');
INSERT INTO SECURITY (USER_ID, L3) VALUES ('UNAME3','CCC');
CREATE TABLE SECURED_DATA
(
L1 VARCHAR2(100 BYTE)
, L2 VARCHAR2(100 BYTE)
, L3 VARCHAR2(100 BYTE)
, V1 NUMBER
);
INSERT INTO SECURED_DATA (L1, V1) VALUES ('A',1);
INSERT INTO SECURED_DATA (L1, L2, V1) VALUES ('B','BB',2);
INSERT INTO SECURED_DATA (L1, L2, L3, V1) VALUES ('C','CC','CCC',3);
Finally, I made an Information Link and then changed its SQL code to something like this:
SELECT
M.*
FROM
SECURITY S
INNER JOIN SECURED_DATA M
ON
(
M.L1 = S.L1
AND S.USER_ID = (%CURRENT_USER%)
)
UNION ALL
SELECT
M.*
FROM
SECURITY S
INNER JOIN SECURED_DATA M
ON
(
M.L2 = S.L2
AND S.USER_ID = (%CURRENT_USER%)
)
UNION ALL
SELECT
M.*
FROM
SECURITY S
INNER JOIN SECURED_DATA M
ON
(
M.L3 = S.L3
AND S.USER_ID = (%CURRENT_USER%)
)
UNION ALL
SELECT
M.*
FROM
SECURED_DATA M
WHERE
(
SELECT
COUNT(1)
FROM
SECURITY S
WHERE S.USER_ID = (%CURRENT_USER%)
)
=0
It works fine, but I'm wondering if there is a more smart and more Spotfire way to get it?
Many thanks and regards,
Maciej
My guess at "more smart and more Spotfire way" is that you want to cache a single data set and use it for multiple users, limiting it in the analytic rather than in the data pull. There is some danger in this if it is done for security's sake: the data will technically be in the analytic, and if users have permission to edit and add visualizations, you no longer control what they can and cannot see. If any authoring is allowed in Web Player for the specific analytic, I recommend that all security be enforced on the database side.
If you want to do it in Spotfire anyway, here is my recommendation:
Have an Information Link (named IL_SecurityCheck in this example) defined as SELECT * FROM SECURITY S WHERE S.USER_ID = (%CURRENT_USER%).
If users move from a cover page to the page with the data in it, you can put the code in the script that changes pages; if not, you can use a method I explained here: Spotfire Current Date in input field with calendar popup, to fire off a script on open.
Button Script required:
from Spotfire.Dxp.Data import *
crossSource = Document.Data.Tables["IL_SecurityCheck"]
rowCount = crossSource.RowCount
rowIndexSet = IndexSet(rowCount, True)
print rowCount
#rowCount = Document.Data.Tables["Managed Care UpDownStream"].RowCount
colCurs = DataValueCursor.CreateFormatted(crossSource.Columns["L1"])
colCurs2 = DataValueCursor.CreateFormatted(crossSource.Columns["L2"])
colCurs3 = DataValueCursor.CreateFormatted(crossSource.Columns["L3"])
x = ""
if rowIndexSet.IsEmpty != True:
    for row in crossSource.GetRows(rowIndexSet, colCurs):
        if colCurs.CurrentValue is not None:
            x += "[L1] = '" + colCurs.CurrentValue + "' and "
    for row in crossSource.GetRows(rowIndexSet, colCurs2):
        if colCurs2.CurrentValue is not None:
            x += "[L2] = '" + colCurs2.CurrentValue + "' and "
    for row in crossSource.GetRows(rowIndexSet, colCurs3):
        if colCurs3.CurrentValue is not None:
            x += "[L3] = '" + colCurs3.CurrentValue + "' and "
    x = x[:len(x) - 4]
else:
    x = "1=1"
Document.Properties["SecurityLimits"] = x
Visualization Data Limited by Expression: ${SecurityLimits}
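The string-building part of the script can be tried outside Spotfire. The following is a plain Python sketch of the same logic, with the security rows passed in as dictionaries instead of being read through Spotfire cursors; the function name build_limit_expression and the dictionary layout are assumptions made for the example, not part of the Spotfire API.

```python
def build_limit_expression(rows):
    """Build the limiting expression from security rows (hypothetical helper).

    `rows` is a list of dicts that may hold 'L1', 'L2' and 'L3' values;
    a missing key or None means that level is not set for that row.
    """
    if not rows:
        # No mapping rows: the user is treated as a root user and sees everything.
        return "1=1"
    conditions = []
    # Same order as the script above: all L1 values, then L2, then L3.
    for level in ("L1", "L2", "L3"):
        for row in rows:
            value = row.get(level)
            if value is not None:
                conditions.append("[%s] = '%s'" % (level, value))
    # Joining avoids having to trim a trailing " and " afterwards.
    return " and ".join(conditions)


print(build_limit_expression([{"L1": "A", "L2": None, "L3": None}]))
print(build_limit_expression([]))
```

As in the script above, every value is combined with "and", so a user with several rows in the mapping table ends up with a narrower filter, not a wider one; whether that is the intended behaviour depends on how the mapping table is meant to be read.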
