Azure CosmosDB Pagination without OrderBy - azure

When we execute a simple query on CosmosDB without any ORDER BY clause:
SELECT * from c
What is the order of the records retrieved? Is it insertion order or is it random and unpredictable?
If it's random, how does it maintain order for a paginated Query? And what happens if more records are inserted after I get a continuation token from a query?

Related

Need to build a complex query for Azure Table to count number of rows

I am trying to run a complex query against an Azure Table: I want to count the number of rows across all deviceID values for a specific dateiot and timeiot. Is this possible?
Your query would be something like:
PartitionKey eq 'aaaa' and dateiot eq 'bbbb' and deviceID eq 'cccc' and timeiot eq 'dddd'
Two things, though:
This query will do a complete partition scan and may return incomplete data in a single request. Your code should handle continuation tokens to get all the data matching the query.
Table queries do not support a count function, so you will get the entities back. You will need to count the returned entities yourself to get the total number of entities matching the query.
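The two points above can be sketched as a loop that keeps requesting pages until no continuation token comes back, adding up the entities as it goes. This is only an illustration: `query_page`, `ENTITIES` and `PAGE_SIZE` are hypothetical in-memory stand-ins, not part of any Azure SDK, and a real client would pass the filter string and the token to the table service instead.

```python
from typing import List, Optional, Tuple

# Hypothetical in-memory stand-in for the table: 7 entities that all match the filter.
ENTITIES = [{"PartitionKey": "aaaa", "deviceID": "cccc", "n": i} for i in range(7)]
PAGE_SIZE = 3  # assumed page size, chosen only to force several round trips


def query_page(token: Optional[int]) -> Tuple[List[dict], Optional[int]]:
    """Return one page of entities and a continuation token (None on the last page)."""
    start = token or 0
    page = ENTITIES[start:start + PAGE_SIZE]
    next_start = start + PAGE_SIZE
    next_token = next_start if next_start < len(ENTITIES) else None
    return page, next_token


def count_matching() -> int:
    """Follow continuation tokens and count the entities client-side."""
    total = 0
    token = None
    while True:
        page, token = query_page(token)
        total += len(page)
        if token is None:
            return total


total = count_matching()
```

Stopping after the first page would undercount here, which is the "incomplete data in a single request" problem described above.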

mongodb query optimize: selecting specific column or all columns which query will be faster

I am looking for the fastest, most optimized way of getting data from MongoDB. I have one find query that returns all the records with all their fields, and another find query that returns all the records but only selected fields. Which query will be faster?

node.js and postgres bulk upsert or another pattern?

I am using Postgres, NodeJS and Knex.
I have the following situation:
A database table with a unique field.
In NodeJS I have an array of objects and I need to:
a. Insert a new row, if the table does not contain the unique id, or
b. Update the remaining fields, if the table does contain the unique id.
From my knowledge I have three options:
Query the database for each item to check whether it exists and, based on the response, do an update or an insert. This costs resources because there is one call for each array item plus an insert or update.
Delete all rows whose id is in the array and then perform an insert. This would mean only 2 operations, but the autoincrement field will keep growing.
Perform an upsert, since Postgres 9.5 supports it. Bulk upsert seems to work, and there is only one call to the database.
Looking through the options I am aware of, upsert seems the most reasonable one but does it have any drawbacks?
Upsert is a common approach.
Another way is to use separate insert and update operations, which will most likely be faster:
Determine which rows already exist:
select id from t where id in (object-ids) (*)
Update the existing rows, using the result of (*).
Filter the array by (*) and bulk insert the new rows.
See more details on the same question here
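The select / update / bulk-insert steps above can be sketched as follows. To stay self-contained, the sketch uses Python's built-in `sqlite3` as a stand-in for Postgres; the table `t` comes from the answer, while the `name` column and the `sync_rows` helper are assumptions made for illustration.

```python
import sqlite3


def sync_rows(conn, rows):
    """rows: list of (id, name). Returns (rows updated, rows inserted)."""
    ids = [row_id for row_id, _ in rows]
    if not ids:
        return (0, 0)
    # Step (*): find which of the incoming ids already exist.
    placeholders = ",".join("?" for _ in ids)
    existing = {
        r[0]
        for r in conn.execute(
            f"SELECT id FROM t WHERE id IN ({placeholders})", ids
        )
    }
    # Split the array by the result of (*).
    to_update = [(name, row_id) for row_id, name in rows if row_id in existing]
    to_insert = [(row_id, name) for row_id, name in rows if row_id not in existing]
    # Apply both batches in a single transaction.
    with conn:
        conn.executemany("UPDATE t SET name = ? WHERE id = ?", to_update)
        conn.executemany("INSERT INTO t (id, name) VALUES (?, ?)", to_insert)
    return (len(to_update), len(to_insert))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO t (id, name) VALUES (?, ?)", [(1, "old-1"), (2, "old-2")])
conn.commit()

result = sync_rows(conn, [(2, "new-2"), (3, "new-3")])
final_rows = conn.execute("SELECT id, name FROM t ORDER BY id").fetchall()
```

One caveat with this pattern: another writer could insert one of the "new" ids between the select and the insert, so under concurrent writes the insert can still hit a unique violation. An upsert avoids that window.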

Performance impact of Allow filtering on same partition query in cassandra

I have table like this.
CREATE TABLE posts (
topic text,
country text,
bookmarked text,
id uuid,
PRIMARY KEY (topic,id)
);
First query on single partition with allow filtering.
select * from posts where topic='cassandra' allow filtering;
Second query on single partition without allow filtering.
select * from posts where topic='cassandra';
My question is: what is the performance difference between the first query and the second query? Will the first query (with allow filtering) fetch results from all partitions before filtering, even though we have requested a single partition?
Thanks.
ALLOW FILTERING lets you run queries without specifying the partition key. But if you do specify one, the query will only use that specific partition.
In this specific example you should see no difference.
I ran both queries on my test table with tracing on and got a single-partition query in both execution plans:
Executing single-partition query on table_name
You don't need to use ALLOW FILTERING when you are querying with a partition key. So for the two queries you mentioned there will be no performance difference.
For Cassandra version 3.0 and up, ALLOW FILTERING can be used to query on any field other than the partition key. For example, you can run a query like this:
SELECT * FROM posts WHERE country='Bangladesh' ALLOW FILTERING;
For Cassandra versions below 3.0, ALLOW FILTERING can be used only on primary key columns.
However, it is not wise to query using ALLOW FILTERING.
The only way Cassandra can execute such a query is by retrieving all the rows from the table posts and then filtering out the ones that do not have the requested value in the country column.
So you should use ALLOW FILTERING at your own risk.
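The cost difference can be illustrated with a toy in-memory model. This is not how Cassandra is implemented (there are no nodes, SSTables or indexes here); `ToyTable` and its `rows_read` counter are hypothetical and only show why a lookup by partition key touches one partition while a filter on a non-key column has to read every row.

```python
class ToyTable:
    """Rows grouped by partition key (topic), with a counter of rows read."""

    def __init__(self):
        self.partitions = {}
        self.rows_read = 0

    def insert(self, topic, country):
        self.partitions.setdefault(topic, []).append(
            {"topic": topic, "country": country}
        )

    def by_partition(self, topic):
        # Like: SELECT * FROM posts WHERE topic = ?
        rows = self.partitions.get(topic, [])
        self.rows_read += len(rows)
        return list(rows)

    def by_country_scan(self, country):
        # Like filtering on a non-key column: every partition is read.
        result = []
        for rows in self.partitions.values():
            self.rows_read += len(rows)
            result.extend(r for r in rows if r["country"] == country)
        return result


table = ToyTable()
table.insert("cassandra", "Bangladesh")
table.insert("cassandra", "India")
table.insert("azure", "Bangladesh")
table.insert("mongodb", "Japan")

single = table.by_partition("cassandra")
reads_single = table.rows_read

table.rows_read = 0
scanned = table.by_country_scan("Bangladesh")
reads_scan = table.rows_read
```

In this model the partition lookup reads only the rows of one partition, while the country filter reads every row in the table to return its matches, and that gap grows with the table.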

Azure query using the select

I am trying to write a query in Azure that gets the entity with a given partition key and row key, based on Date.
I am storing entities with:
PartitionKey, RowKey, Date, Additional info.
I am looking for a query using the table service so that
I always get the latest one (by date).
How can I write the query? (I am using Node and Azure)
TableQuery
.select()
.from('myusertables')
.where('PartitionKey eq ?', '545455');
How do I write the table query?
To answer your question, check out this previously answered question: How to select only the records with the highest date in LINQ
However, you may be facing a design issue. Performing the operation you are trying to do will require you to pull all the entities from the underlying Azure Table, which will perform slower over time as entities are added. So you may want to reconsider your design and possibly change the way you use your partitionkey and rowkey. You could also store the latest entities in a separate table, so that only 1 entity is found per table, transforming your scan/filter into a seek operation. Food for thought...
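Both ideas in the answer can be sketched with plain dictionaries standing in for tables. The names `latest_entity`, `record` and `latest_by_partition` are hypothetical, and the entities are simplified to the fields mentioned in the question.

```python
from datetime import datetime


def latest_entity(entities):
    """Client-side scan: pull every entity, then keep the newest by Date."""
    return max(entities, key=lambda e: e["Date"], default=None)


# Alternative design: maintain a separate "latest" table on every write,
# so reading the newest entity becomes a single key lookup.
latest_by_partition = {}


def record(entity):
    current = latest_by_partition.get(entity["PartitionKey"])
    if current is None or entity["Date"] > current["Date"]:
        latest_by_partition[entity["PartitionKey"]] = entity


entities = [
    {"PartitionKey": "545455", "RowKey": "a", "Date": datetime(2015, 1, 1)},
    {"PartitionKey": "545455", "RowKey": "b", "Date": datetime(2015, 3, 1)},
    {"PartitionKey": "545455", "RowKey": "c", "Date": datetime(2015, 2, 1)},
]
for entity in entities:
    record(entity)

newest_by_scan = latest_entity(entities)
newest_by_lookup = latest_by_partition["545455"]
```

The scan version does work proportional to the number of entities in the partition, which is the slowdown the answer warns about; the lookup version trades that for one extra write per insert.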
