I have compared the sql query on sails.js with the other way of doing it, using waterline's ORM.
I did the following request : Get all countries from all continents and I compared both requests with wireshark.
The simple SQL query :
"SELECT * FROM countries AS cou INNER JOIN continents AS con ON (cou.continent_id=continent.id)"
And then I did the same defining a one to many associations between countries and continents and did the following request.
Continents.find().populate("countries").exec(function(err, result)) {
res.send(result)
}
With that way, it takes around 66 ms to return the result, considering I have 15 ms of network delay, I can go down to 50 ms by moving the node.js server.
When I do it with the sql query, it takes around 35ms, so I could go down to nearly 20ms, which is good for me.
Is there a way to get the same results using both methods? or will the sql query always be faster?
Actually, the query generated in such population is
1. Selection of parents :
select * from continent where ...
Selection of all countries of the retrieved continents.
(select * from country where continent_id = continent_1)
union
(select * from country where continent_id = continent_2)
union
...
union
(select * from country where continent_id = continent_n)
Regroup result (Affectation of every country to its continent by foreign key.
This implementation make easy the management of limit and skip clauses as the call :
Continents.find().populate("countries").limit(2).skip(1).exec(function(err,
result)) {
res.send(result)
}
should only return the second and the third country for every continent and such implementation as you can see generate one only query so DBMS will not be overloaded.
Related
I have a Cosmos Db instance with > 1 Million JSON Documents stored in it.
I am trying to pull data of a certain time frame as to when the document was created based on the _ts variable which is auto-generated when the document is inserted. It represents the UNIX timestamp of that moment.
I am unable to understand, why both these queries produce drastically different results:
Query 1:
Select *
from c
where c._ts > TimeStamp1
AND c._ts < TimeStamp2
Produces 0 results
Query 2
Select *
from c
where c._ts > TimeStamp1
AND c._ts < TimeStamp2
order by c._ts desc
Produces the correct number of results.
What I have tried?
I suspected that might be because of the default CosmosDb index on the data. So, I rewrote the index policy to index only that variable. Still the same problem.
Since my end purpose is to group by on the returned data from the query, then I tried to use group by with order by alone or in a subquery. Surprisingly, according to the docs, CosmosDb yet doesn't support using group by with order by.
What I need help on?
Why am I observing such a behavior?
Is there a way to index the Db in such a way that the rows are returned.
Beyond this, is there a way to still use group by and order by together (Please don't link the question to another one because of this point, I have gone through them and their answers are not valid in my case).
#Andy and #Tiny-wa, Thanks for replying.
I was able to understand the unintended behavior and it was showing up because of the GetCurrentTimestamp() used to calculate the TimeStamps. The documentation states that
This system function will not utilize the index. If you need to
compare values to the current time, obtain the current time before
query execution and use that constant string value in the WHERE
clause.
Although, I don't fully understand what this means but I was to solve this by creating a Stored Procedure where the Time Stamp is fetched before the SQLAPI query is formed and executed and I was able to get the rows as expected.
Stored Procedure Pseudocode for that is like:
function FetchData(){
..
..
..
var Current_TimeStamp = Date.now();
var CDbQuery =
`Select *
FROM c
where (c._ts * 10000000) > DateTimeToTicks(DateTimeAdd("day", -1, TicksToDateTime(` + Current_TimeStamp + ` * 10000)))
AND (c._ts * 10000000) < (` + Current_TimeStamp + ` * 10000)`
var isAccepted = collection.queryDocuments(
collection.getSelfLink(),
XQuery,
function (err, feed, options) {
..
..
..
});
}
I have an Employees collection and I want to retrieve full documents of 10 employees whose ID's I'd like to send to my SQL SELECT. How do I do that?
To further clarify, I have 10 EmployeeId's and I want pull these employees' information from my Employees collection. I'd appreciate your help with this.
Update:
As of 5/6/2015, DocumentDB supports the IN keyword; which supports up to 100 parameters.
Example:
SELECT *
FROM Employees
WHERE Employees.id IN (
"01236", "01237", "01263", "06152", "21224",
"21225", "21226", "21227", "21505", "22903",
"14003", "14004", "14005", "14006", "14007"
)
Original Answer:
Adding on to Ryan's answer... Here's an example:
Create the following UDF:
var containsUdf = {
id: "contains",
body: function(arr, obj) {
if (arr.indexOf(obj) > -1) {
return true;
}
return false;
}
};
Use your contains UDF is a SQL query:
SELECT * FROM Employees e WHERE contains(["1","2","3","4","5"], e.id)
For documentation on creating UDFs, check out the DocumentDB SQL reference
You can also vote for implementing the "IN" keyword for "WHERE" clauses at the DocumentDB Feedback Forums.
You could also achieve this by using OR support. Below is a sample –
SELECT *
FROM Employees e
WHERE e.EmployeeId = 1 OR e.EmployeeId = 2 OR e.EmployeeId = 3
If you need more number of ORs than what DocumentDB caps, you would have to break up your queries into multiple smaller queries by employeeId values. You can also issue the queries in parallel from the client and gather all the results
The best way to do this, today would be to create a Contains() UDF that took in the array of ids to search on and use that in the WHERE clause.
Does
Select * from Employees where EmployeeId in (1,3,5,6,...)
Not work ?
thanks to ryancrawcour we know it doesn't
Another method is to use the ARRAY_CONTAINS method in the SQL API.
Here is the sample code :
SELECT *
FROM Employees
WHERE ARRAY_CONTAINS(["01236", "01237", "01263", "06152", "21224"],Employees.id).
I ran both queries ( using the IN method ) with a sample set of datasets, both are consuming the same amount of RUs.
I have created a query with a subquery in Access, and cannot link it in Excel 2003: when I use the menu Data -> Import External Data -> Import Data... and select the mdb file, the query is not present in the list. If I use the menu Data -> Import External Data -> New Database Query..., I can see my query in the list, but at the end of the import wizard I get this error:
Too few parameters. Expected 2.
My guess is that the query syntax is causing the problem, in fact the query contains a subquery. So, I'll try to describe the query goal and the resulting syntax.
Table Positions
ID (Autonumber, Primary Key)
position (double)
currency_id (long) (references Currency.ID)
portfolio (long)
Table Currency
ID (Autonumber, Primary Key)
code (text)
Query Goal
Join the 2 tables
Filter by portfolio = 1
Filter by currency.code in ("A", "B")
Group by currency and calculate the sum of the positions for each currency group an call the result: sumOfPositions
Calculate abs(sumOfPositions) on each currency group
Calculate the sum of the previous results as a single result
Query
The query without the final sum can be created using the Design View. The resulting SQL is:
SELECT Currency.code, Sum(Positions.position) AS SumOfposition
FROM [Currency] INNER JOIN Positions ON Currency.ID = Positions.currency_id
WHERE (((Positions.portfolio)=1))
GROUP BY Currency.code
HAVING (((Currency.code) In ("A","B")));
in order to calculate the final SUM I did the following (in the SQL View):
SELECT Sum(Abs([temp].[SumOfposition])) AS sumAbs
FROM [SELECT Currency.code, Sum(Positions.position) AS SumOfposition
FROM [Currency] INNER JOIN Positions ON Currency.ID = Positions.currency_id
WHERE (((Positions.portfolio)=1))
GROUP BY Currency.code
HAVING (((Currency.code) In ("A","B")))]. AS temp;
So, the question is: is there a better way for structuring the query in order to make the export work?
I can't see too much wrong with it, but I would take out some of the junk Access puts in and scale down the query to this, hopefully this should run ok:
SELECT Sum(Abs(A.SumOfPosition)) As SumAbs
FROM (SELECT C.code, Sum(P.position) AS SumOfposition
FROM Currency As C INNER JOIN Positions As P ON C.ID = P.currency_id
WHERE P.portfolio=1
GROUP BY C.code
HAVING C.code In ("A","B")) As A
It might be worth trying to declare your parameters in the MS Access query definition and define their datatypes. This is especially important when you are trying to use the query outside of MS Access itself, since it can't auto-detect the parameter types. This approach is sometimes hit or miss, but worth a shot.
PARAMETERS [[Positions].[portfolio]] Long, [[Currency].[code]] Text ( 255 );
SELECT Sum(Abs([temp].[SumOfposition])) AS sumAbs
FROM [SELECT Currency.code, Sum(Positions.position) AS SumOfposition
FROM [Currency] INNER JOIN Positions ON Currency.ID = Positions.currency_id
WHERE (((Positions.portfolio)=1))
GROUP BY Currency.code
HAVING (((Currency.code) In ("A","B")))]. AS temp;
I have solved my problems thanks to the fact that the outer query is doing a trivial sum. When choosing New Database Query... in Excel, at the end of the process, after pressing Finish, an Import Data form pops up, asking
Where do you want to put the data?
you can click on Create a PivotTable report... . If you define the PivotTable properly, Excel will display only the outer sum.
I'm trying to get an average for a count on a groupBy by joining with a subquery. Don't know if that the right way to go at all but I couldn't anything about subqueries other than the mysema doc.
Scenario:
How many orders per product did a customer do on average?
Meaning: A Customer orders products. So a customer ordered a specific product a number of times (count). What's the average number of orders that customer placed for any product?
Might sound a bit hypothetical, in fact it's just part of a prototype, but it made me wonder, how to get a reference to a custom column created within a subquery with the fancy QueryDSL from Mysema.
In SQL you just give the count column an alias and join using a second ID column. QueryDSL has the "as()" method as well but I have no Idea, how to retrieve that column plus I dont't see how it can join one query with anothers, since query.list() just gets a list but for some reason the join accepts it. Feels wrong...
Here's my code:
JPQLQuery query = createJPQLQuery();
QOrdering qOrdering = QOrdering.ordering;
QProduct qProduct = QProduct.product;
QCustomer qCustomer = QCustomer.customer;
// how many of each product did a customer order?
HibernateSubQuery subQuery = new HibernateSubQuery();
subQuery.from(qOrdering).innerJoin(qOrdering.product,qProduct).innerJoin(qOrdering.customer, qCustomer);
subQuery.groupBy(qCustomer,qProduct).list(qCustomer.id,qProduct.id,qProduct.count());
// get the average number of orders per product for each customer
query.from(qCustomer);
query.innerJoin(subQuery.list(qCustomer.id,qOrdering.count().as("count_orders")));
query.groupBy(qCustomer.id);
return (List<Object[]>) query.list(qCustomer.firstname,subQuery.count_orders.avg());
Again: How do I join with a subquery?
How do I get the aliased "count" column to do more aggregation like avg (is my group right btw?)
Might be that I have some other errors in this, so any help appreciated!
Thanks!
Edit:
That's kind of the native SQL I'd like to see QueryDSL produce:
Select avg(numOrders) as average, cust.lastname from
customer cust
inner join
(select count(o.product_id) as numOrders, c.id as cid, p.name
from ordering o
inner join product p on o.product_id=p.id
inner join customer c on o.customer_id=c.id
group by o.customer_id, o.product_id) as numprods
on cust.id = numprods.cid
group by numprods.cid
order by cust.lastname;
Using subqueries in the join clause is not allowed. in JPQL, subqueries are only allowed in the WHERE and HAVING part. The join method signatures in Querydsl JPA queries are too wide.
As this query needs two levels of grouping, maybe it can't be expressed with JPQL / Querydsl JPA.
I'd suggest to write this query using the Querydsl JPA Native query support.
As Querydsl JPA uses JPQL internally, it is restricted by the expressiveness of JPQL.
I know that this question is old and already has an accepted answer, but judging from this question, it seems to still be troubling guys. See my answer in the same question. The use of JoinFlag in the join() section and Expression.path() is able to achieve left-joining a subquery. Hope this helps someone.
QueryDsl does not support subQuery in join but you can achieve this via following way:
We wanted to achieve the following query:
select A.* from A join (select aid from B group by aid) b on b.aid=A.id;
Map a View or SQL query to JPA entity:
import lombok.Setter;
import org.hibernate.annotations.Subselect;
import org.hibernate.annotations.Synchronize;
import javax.persistence.Entity;
import javax.persistence.Id;
#Entity
#Getter
#Setter
#Subselect("select aid from B group by aid")
#Synchronize("B")
public class BGroupByAid {
#Id
private Integer aId;
}
then use the equivalent QueryDSl entity in the class just like the regular entity:
JPAQuery<QAsset> query = new JPAQuery<>(entityManager);
QBGroupByAid bGroupById = QBGroupByAid.bGroupByAid;
List<A> tupleOfAssets =
query.select(A)
.from(A).innerJoin(bGroupById).on(bGroupById.aId.eq(A.aId))
.fetchResults()
.getResults();
You can also use blazebit which supports also subquery in join. I have try it and it is working. You can create SubQueryExpression f.e like this
SubQueryExpression<Tuple> sp2 = getQueryFactory().select(entity.id,
JPQLNextExpressions.rowNumber().over().partitionBy(entity.folId).orderBy(entity.creationDate.desc()).as(rowNumber))
.from(entity)
.where(Expressions.path(Integer.class, rowNumber).eq(1));
and then just join it like this:
return getBlazeQueryFactory()
.select(entity1, entity)
.from(entity1)
.leftJoin(sp2, entity).on(entity.id.eq(entity1.id)).fetch();
I have put here just simple example. So maybe it doesn't make a perfect sense but maybe can be helpful.
Also don't be confused it will can produce union in the generated select. This is just for naming columns from subquery you can read better explanation about this union here: Blaze-Persistence GROUP BY in LEFT JOIN SUBQUERY with COALESCE in root query
I want to perform a simple join on two tables (BusinessUnit and UserBusinessUnit), so I can get a list of all BusinessUnits allocated to a given user.
The first attempt works, but there's no override of Select which allows me to restrict the columns returned (I get all columns from both tables):
var db = new KensDB();
SqlQuery query = db.Select
.From<BusinessUnit>()
.InnerJoin<UserBusinessUnit>( BusinessUnitTable.IdColumn, UserBusinessUnitTable.BusinessUnitIdColumn )
.Where( BusinessUnitTable.RecordStatusColumn ).IsEqualTo( 1 )
.And( UserBusinessUnitTable.UserIdColumn ).IsEqualTo( userId );
The second attept allows the column name restriction, but the generated sql contains pluralised table names (?)
SqlQuery query = new Select( new string[] { BusinessUnitTable.IdColumn, BusinessUnitTable.NameColumn } )
.From<BusinessUnit>()
.InnerJoin<UserBusinessUnit>( BusinessUnitTable.IdColumn, UserBusinessUnitTable.BusinessUnitIdColumn )
.Where( BusinessUnitTable.RecordStatusColumn ).IsEqualTo( 1 )
.And( UserBusinessUnitTable.UserIdColumn ).IsEqualTo( userId );
Produces...
SELECT [BusinessUnits].[Id], [BusinessUnits].[Name]
FROM [BusinessUnits]
INNER JOIN [UserBusinessUnits]
ON [BusinessUnits].[Id] = [UserBusinessUnits].[BusinessUnitId]
WHERE [BusinessUnits].[RecordStatus] = #0
AND [UserBusinessUnits].[UserId] = #1
So, two questions:
- How do I restrict the columns returned in method 1?
- Why does method 2 pluralise the column names in the generated SQL (and can I get round this?)
I'm using 3.0.0.3...
So far my experience with 3.0.0.3 suggests that this is not possible yet with the query tool, although it is with version 2.
I think the preferred method (so far) with version 3 is to use a linq query with something like:
var busUnits = from b in BusinessUnit.All()
join u in UserBusinessUnit.All() on b.Id equals u.BusinessUnitId
select b;
I ran into the pluralized table names myself, but it was because I'd only re-run one template after making schema changes.
Once I re-ran all the templates, the plural table names went away.
Try re-running all 4 templates and see if that solves it for you.