The datastore is Postgres. Can someone help translate this into an Objection.js statement? It's easy to do with two round trips, but ideally this would happen in one.
INSERT INTO reports (
    id,
    created_by,
    "desc",
    dataset
)
SELECT
    '8971e660-7777-4d64-8cc3-171512063fff',
    123,
    'clone!',
    r.dataset
FROM reports r
WHERE r.id = '7771e660-9d7d-4d64-8cc3-17151206354f';
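Since the clone is a single INSERT ... SELECT statement, one round trip only needs a raw-query path, e.g. knex's raw (Objection models expose the underlying knex instance via Model.knex()), rather than a chained Objection builder. A parameterized sketch of the statement, with ? placeholders standing in for the four literal values above:

-- Bind, in order: the new report id, created_by, the "desc" text,
-- and the id of the report being cloned. "desc" is quoted because
-- DESC is a reserved word in Postgres.
INSERT INTO reports (id, created_by, "desc", dataset)
SELECT ?, ?, ?, r.dataset
FROM reports r
WHERE r.id = ?;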
Need your help. I am trying to convert the SQL query below to Redshift, but I am getting the error message "Invalid operation: This type of correlated subquery pattern is not supported yet"
SELECT
Comp_Key,
Comp_Reading_Key,
Row_Num,
Prev_Reading_Date,
( SELECT MAX(X) FROM (
SELECT CAST(dateadd(day, 1, Prev_Reading_Date) AS DATE) AS X
UNION ALL
SELECT dim_date.calendar_date
) a
) as start_dt
FROM stage5
JOIN dim_date ON calendar_date BETWEEN '2020-04-01' and '2020-04-15'
WHERE Comp_Key =50906055
The same query works fine in SQL Server. Could you please help me get it running in Redshift?
Regards,
Kiru
Kiru - you need to convert the correlated query into a join structure. Not knowing the data content of your tables and the exact expected output, I'm just guessing, but here's a swag:
SELECT
Comp_Key,
Comp_Reading_Key,
Row_Num,
Prev_Reading_Date,
Max_X
FROM stage5
JOIN dim_date ON calendar_date BETWEEN '2020-04-01' and '2020-04-15'
JOIN (
    SELECT MAX(X) AS Max_X, MAX(calendar_date) AS max_date
    FROM (
        SELECT CAST(dateadd(day, 1, Prev_Reading_Date) AS DATE) AS X,
               dim_date.calendar_date
        FROM stage5
        CROSS JOIN dim_date
    ) a
) start_dt ON start_dt.max_date = dim_date.calendar_date
WHERE Comp_Key = 50906055
This is just a guess, but it might get you started.
However, you are likely better off rewriting this query to use window functions as they are the fastest way to perform these types of looping queries in Redshift.
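For what it's worth, in this specific case the correlated subquery just picks the larger of two values per row, so it can be removed without a window function at all: GREATEST collapses the two-row UNION into a per-row expression. A sketch, assuming the intent is "the later of the day after the previous reading and the calendar date":

SELECT
    Comp_Key,
    Comp_Reading_Key,
    Row_Num,
    Prev_Reading_Date,
    -- per row: the later of (Prev_Reading_Date + 1 day) and the calendar date
    GREATEST(CAST(dateadd(day, 1, Prev_Reading_Date) AS DATE),
             dim_date.calendar_date) AS start_dt
FROM stage5
JOIN dim_date ON calendar_date BETWEEN '2020-04-01' AND '2020-04-15'
WHERE Comp_Key = 50906055;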
Thanks Bill. It won't work in Redshift as it still has a correlated sub-query.
However, I have modified the query another way and it works fine.
I am closing the ticket.
My table schema is:
CREATE TABLE users
(user_id BIGINT PRIMARY KEY,
user_name text,
email_ text);
I inserted below rows into the table.
INSERT INTO users(user_id, email_, user_name)
VALUES(1, 'abc@test.com', 'ABC');
INSERT INTO users(user_id, email_, user_name)
VALUES(2, 'abc@test.com', 'ZYX ABC');
INSERT INTO users(user_id, email_, user_name)
VALUES(3, 'abc@test.com', 'Test ABC');
INSERT INTO users(user_id, email_, user_name)
VALUES(4, 'abc@test.com', 'C ABC');
For searching data in the user_name column, I created an index so I can use the LIKE operator with a leading and trailing '%':
CREATE CUSTOM INDEX idx_users_user_name ON users (user_name)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {
'mode': 'CONTAINS',
'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
'case_sensitive': 'false'};
Problem 1:
When I execute the query below, it returns only 3 records instead of 4.
select *
from users
where user_name like '%ABC%';
Problem 2:
When I run the query below, it fails with this error:
ERROR: com.datastax.driver.core.exceptions.InvalidQueryException:
ORDER BY with 2ndary indexes is not supported.
Query = select * from users where user_name like '%ABC%' ORDER BY user_name ASC;
Query:
select *
from users
where user_name like '%ABC%'
ORDER BY user_name ASC;
My requirement is to filter on user_name and order the results by user_name.
The first query does work correctly for me using cassandra:latest which is now cassandra:3.11.3. You might want to double-check the inserted data (or just recreate from scratch using the cql statements you provided).
The second one gives you enough info - ordering by secondary indexes is not possible in Cassandra. You might have to sort the result set in your application.
That being said, I would not recommend running this setup in real apps. With some additional scale (when you have many records) this will be suicide performance-wise. I won't go into much detail since maybe you already understand this, and SO is not a wiki/documentation site, so here is a link.
My application needs to get some basic data from a user table with primary key user_id - and various other data about the user from secondary tables, each of which has user_id as a foreign key. There are a bunch of these secondary tables such as name, address, phone, etcetera - things about a person that can change over time.
More specifically, I need only some values from the most recent row from each secondary table. Each table has a "latest" column which is unix timestamp of the most recent UPDATE or INSERT (we must not delete in this application).
The following works correctly:
SELECT u.username, u.user_id, u.password, u.email, u.active
, n.first , n.middle , n.last
, uo.organization_id /* , other_cols_from_other_tables */
FROM user u
LEFT JOIN user_org uo ON (uo.user_id = u.user_id AND
uo.latest in (select max(latest) from user_org uo1
where uo1.user_id = u.user_id))
/* here, other LEFT JOINs like the above one */
WHERE u.username = :username
However, a subquery solution is widely discouraged due to slowness, and some of these queries will run on every request. So I came up with the following that works in some cases and gets rid of the subquery:
SELECT u.username, u.user_id, u.password, u.email, u.active
, n.first , n.middle , n.last
, uo.organization_id /* , other_cols_from_other_tables, etc. */
FROM user u
INNER JOIN
( SELECT user_id, MAX(latest) utd
FROM user_org
GROUP BY user_id
) uo1 ON uo1.user_id = u.user_id
LEFT JOIN user_org uo
ON (uo.user_id = u.user_id and uo.latest = uo1.utd)
/* here, other clauses like the part from 'FROM' to here */
WHERE u.username = :username
The latter, unfortunately, creates a hard dependency on data in the secondary table: because of the INNER JOIN, the whole query returns no row at all if any secondary table lacks data for the particular user.
I've researched this on SO and the web, and there are many solutions for avoiding subqueries, but everything I've found on the subject has the issue in the main query, not in a left join.
The logic I need is "if there's data for this user in this secondary table, get the specified column(s) from the most recent row in that table, otherwise a null".
It seems to me that putting a "current row" marker column on the most recent row in each table would avoid the whole issue and run faster than any other solution, but would be against normalization (I would still have to have the 'latest' column to maintain order-able history of previous data).
Is there a solution that gets normalization + speed? This is MariaDB, so it needs MySQL syntax.
EDIT: I would still like a better way, but I decided to go with the extra column. Now the problem described above is avoided, and the SELECT SQL is much simplified and presumably faster. The downside is added complexity in saves, but SELECTs are more frequent.
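For the save path, the marker approach boils down to demoting the old current row and inserting the new one inside one transaction. A minimal sketch, assuming a hypothetical is_current flag column on user_org and made-up values:

-- Demote the previous current row, then insert the new one as current.
-- is_current is the hypothetical marker column; 123/456 are placeholder ids.
START TRANSACTION;

UPDATE user_org
SET is_current = 0
WHERE user_id = 123 AND is_current = 1;

INSERT INTO user_org (user_id, organization_id, latest, is_current)
VALUES (123, 456, UNIX_TIMESTAMP(), 1);

COMMIT;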
MariaDB supports ROW_NUMBER as of version 10.2:
SELECT
u.username,
u.user_id,
u.password,
u.email,
u.active,
uo.organization_id,
...
FROM user u
LEFT JOIN
(
select
user_org.*,
row_number() over(partition by user_id order by latest desc) as rn
from user_org
) uo ON uo.user_id = u.user_id AND uo.rn = 1
...
WHERE u.username = :username;
I'm fairly new to SQL and have started running into sub-queries, as in this query below:
SELECT C.CustomerID
, C.Name
, ( Select PhoneNumber
FROM PhoneNumberTable P
WHERE P.CustomerID = C.CustomerID ) AS "PhoneNumber"
FROM CustomerTable C
Compare that to this query with a join:
SELECT C.CustomerID
, C.Name
, P.PhoneNumber
FROM CustomerTable C
JOIN PhoneNumberTable P
ON P.customerID = C.customerID
Is there a difference in terms of efficiency/speed? The SQL I am working with has several sub-queries like the one shown above (no JOINs), and it is difficult to read.
Joins in my experience tend to be faster, but sometimes you need a subquery.
You should also look into CTEs; they are very useful and, in my opinion, much easier to manage.
In your specific case I would use a join, because you are trying to join the two tables together. Keep in mind the two queries above are not strictly equivalent, though: the scalar subquery yields NULL for a customer with no phone number (and errors if a customer has more than one), while the INNER JOIN drops customers without numbers; a LEFT JOIN is the closer match.
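As a sketch of the CTE route, keeping the tables from the question (the LEFT JOIN preserves customers without a phone number, matching the subquery's NULL behavior):

-- Pull phone numbers once into a named CTE, then join against it.
WITH numbers AS (
    SELECT CustomerID, PhoneNumber
    FROM PhoneNumberTable
)
SELECT C.CustomerID
     , C.Name
     , N.PhoneNumber
FROM CustomerTable C
LEFT JOIN numbers N
    ON N.CustomerID = C.CustomerID;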
I want to perform a simple join on two tables (BusinessUnit and UserBusinessUnit), so I can get a list of all BusinessUnits allocated to a given user.
The first attempt works, but there's no override of Select which allows me to restrict the columns returned (I get all columns from both tables):
var db = new KensDB();
SqlQuery query = db.Select
.From<BusinessUnit>()
.InnerJoin<UserBusinessUnit>( BusinessUnitTable.IdColumn, UserBusinessUnitTable.BusinessUnitIdColumn )
.Where( BusinessUnitTable.RecordStatusColumn ).IsEqualTo( 1 )
.And( UserBusinessUnitTable.UserIdColumn ).IsEqualTo( userId );
The second attempt allows the column-name restriction, but the generated SQL contains pluralised table names (?)
SqlQuery query = new Select( new string[] { BusinessUnitTable.IdColumn, BusinessUnitTable.NameColumn } )
.From<BusinessUnit>()
.InnerJoin<UserBusinessUnit>( BusinessUnitTable.IdColumn, UserBusinessUnitTable.BusinessUnitIdColumn )
.Where( BusinessUnitTable.RecordStatusColumn ).IsEqualTo( 1 )
.And( UserBusinessUnitTable.UserIdColumn ).IsEqualTo( userId );
Produces...
SELECT [BusinessUnits].[Id], [BusinessUnits].[Name]
FROM [BusinessUnits]
INNER JOIN [UserBusinessUnits]
ON [BusinessUnits].[Id] = [UserBusinessUnits].[BusinessUnitId]
WHERE [BusinessUnits].[RecordStatus] = #0
AND [UserBusinessUnits].[UserId] = #1
So, two questions:
- How do I restrict the columns returned in method 1?
- Why does method 2 pluralise the table names in the generated SQL (and can I get round this?)
I'm using 3.0.0.3...
So far my experience with 3.0.0.3 suggests that this is not possible yet with the query tool, although it is with version 2.
I think the preferred method (so far) with version 3 is to use a LINQ query with something like:
var busUnits = from b in BusinessUnit.All()
join u in UserBusinessUnit.All() on b.Id equals u.BusinessUnitId
select b;
I ran into the pluralized table names myself, but it was because I'd only re-run one template after making schema changes.
Once I re-ran all the templates, the plural table names went away.
Try re-running all 4 templates and see if that solves it for you.