CosmosDb unexpected continuation token - azure

Please note:
I believe this question is different from the one here that discusses why the continuation token is null. The problem listed here is about this unexpected behavior and whether there is any solution to it.
I've also reported this on the Cosmos DB GitHub issues because at this stage I think this could very well be an SDK or Cosmos API bug.
Here it goes:
Basically I am getting no results but a continuation token, in an unexpected situation.
The only similar experience (no results but a continuation token) I had with Cosmos DB was when the RUs were not enough and the query needed more RUs to finish its job, for example when counting all the documents and you need to continue a couple of times.
How to reproduce the issue?
This is very hard to reproduce, as the consumer does not control the shard (physical partition) distribution. But you need a Cosmos DB collection that has a few logical partitions and at least two shards, and your query should be formed aiming for data in the second shard. Do not provide a partition key, and make the query cross-partition.
Expected behavior
When:
the query is cross partition
there is enough RU
the query costs a very small RU
I'm expecting to receive the result in the first call.
Actual behavior
Query result is empty
Response has an unusual continuation token
The token looks like below:
{"token":null,"range":{"min":"05C1DFFFFFFFFC","max":"FF"}}
Following is sample code with which I can reproduce the issue every single time. In this case I have a document sitting in partition 2 (index 1), which I assume is the second shard.
var client = new DocumentClient(ServiceEndpoint, AuthKey);
const string query = "select * from c where c.title='JACK CALLAGHAN'";
var collection = UriFactory.CreateDocumentCollectionUri(DatabaseId, CollectionId);
var cQuery = client.CreateDocumentQuery<dynamic>(collection, query, new FeedOptions
{
    EnableCrossPartitionQuery = true,
    PopulateQueryMetrics = true
}).AsDocumentQuery();
var response = cQuery.ExecuteNextAsync().GetAwaiter().GetResult();
Console.WriteLine($"response.AsEnumerable().Count()= {response.AsEnumerable().Count()}");
foreach (string headerKey in response.ResponseHeaders.Keys)
{
    Console.WriteLine($"{headerKey}");
    var keyValues = response.ResponseHeaders[headerKey].Split(";");
    foreach (var keyValue in keyValues)
    {
        Console.WriteLine($"{keyValue}");
    }
    Console.WriteLine();
}
And the output including all the headers:
response.AsEnumerable().Count()= 0
Cache-Control
no-store, no-cache
Pragma
no-cache
Transfer-Encoding
chunked
Server
Microsoft-HTTPAPI/2.0
Strict-Transport-Security
max-age=31536000
x-ms-last-state-change-utc
Wed, 03 Apr 2019 00:50:35.469 GMT
x-ms-resource-quota
documentSize=51200
documentsSize=52428800
documentsCount=-1
collectionSize=52428800
x-ms-resource-usage
documentSize=184
documentsSize=164076
documentsCount=94186
collectionSize=188910
lsn
118852
x-ms-item-count
0
x-ms-schemaversion
1.7
x-ms-alt-content-path
dbs/bettingedge/colls/fixtures
x-ms-content-path
S8sXAPPiCdc=
x-ms-xp-role
1
x-ms-documentdb-query-metrics
totalExecutionTimeInMs=0.27
queryCompileTimeInMs=0.04
queryLogicalPlanBuildTimeInMs=0.02
queryPhysicalPlanBuildTimeInMs=0.03
queryOptimizationTimeInMs=0.00
VMExecutionTimeInMs=0.06
indexLookupTimeInMs=0.05
documentLoadTimeInMs=0.00
systemFunctionExecuteTimeInMs=0.00
userFunctionExecuteTimeInMs=0.00
retrievedDocumentCount=0
retrievedDocumentSize=0
outputDocumentCount=0
outputDocumentSize=49
writeOutputTimeInMs=0.00
indexUtilizationRatio=0.00
x-ms-global-Committed-lsn
118851
x-ms-number-of-read-regions
0
x-ms-transport-request-id
12
x-ms-cosmos-llsn
118852
x-ms-session-token
0:-1#118852
x-ms-request-charge
2.86
x-ms-serviceversion
version=2.2.0.0
x-ms-activity-id
c4bc4b76-47c2-42e9-868a-9ecfe0936b1e
x-ms-continuation
{"token":null,"range":{"min":"05C1DFFFFFFFFC","max":"FF"}}
x-ms-gatewayversion
version=2.2.0.0
Date
Fri, 05 Apr 2019 05:40:21 GMT
Content-Type
application/json
If we continue the query with the composite continuation token, we can see the result.
Is this normal behavior or a bug?

Using the .NET SDK, the continuation token is handled natively if you drain the query with HasMoreResults:
var query = Client.CreateDocumentQuery(
    UriFactory.CreateDocumentCollectionUri(databaseId, collectionId),
    sqlQuery,
    feedOptions).AsDocumentQuery();
while (query.HasMoreResults)
{
    var response = await query.ExecuteNextAsync();
    results.AddRange(response);
}
Adding feedback provided through the linked GitHub issue:
Regarding your issue, we have faced the same situation.
I have some comments.
We didn't know about this behavior, and our client received an empty list with a continuation token, which pretty much broke our flow.
Now on the server side we handle this situation and continue until we get a result. The issue is: what if there are 1000 partitions? Do we have to continue 100 times? Is that how Cosmos DB protects its SLA and its under-10 ms response times? #ThinkingLoud
Yes, this broke our flow the first time too. But we handle it on the server side as well (together with MaxNumberOfObjects, we continue serving the request until we receive the number of items the client wants), and the pattern you're seeing is due to the underlying architecture of Cosmos DB, consisting of physical + logical partitions. It sounds like you're implementing paging in the interplay with your client, and that is fine. However, I don't think this is what Cosmos DB refers to with their SLA times.
What other undocumented, unexpected behaviors are there that are going to catch us by surprise in a production environment? #ThinkingLoudAgain
This is a bit vague, but my advice would be to read up on all FeedOptions together with the Cosmos DB performance tips and make sure you understand them as well as their application areas.
EDIT: Also, another warning. I am currently running into an issue with the continuation token and the DISTINCT keyword in the SQL query. It does not work as expected without an ORDER BY.

This is normal behavior for Cosmos DB. There are several things that can cause the query to time out internally and result in a response with few results or even an empty collection.
From the Cosmos DB documentation:
The number of items returned per query execution will always be less than or equal to MaxItemCount. However, it is possible that other criteria might have limited the number of results the query could return. If you execute the same query multiple times, the number of pages might not be constant. For example, if a query is throttled there may be fewer available results per page, which means the query will have additional pages. In some cases, it is also possible that your query may return an empty page of results.
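For what it's worth, the same drain pattern looks like this in the JavaScript SDK (@azure/cosmos v3). This is a minimal sketch, not the .NET DocumentClient code from the question, and the endpoint, key, and container names are placeholders. The point it illustrates is that an empty page is not the end of the query; you only stop when hasMoreResults() returns false:
import { CosmosClient } from "@azure/cosmos";

// Placeholder connection details.
const client = new CosmosClient({ endpoint: "https://<account>.documents.azure.com", key: "<auth-key>" });
const container = client.database("<database-id>").container("<container-id>");

async function queryAll(): Promise<unknown[]> {
    const iterator = container.items.query(
        "select * from c where c.title = 'JACK CALLAGHAN'",
        { maxItemCount: 100 }
    );
    const results: unknown[] = [];
    // A page can come back empty while hasMoreResults() is still true
    // (the SDK keeps the composite continuation token internally),
    // so the only reliable stop condition is hasMoreResults() === false.
    while (iterator.hasMoreResults()) {
        const { resources } = await iterator.fetchNext();
        results.push(...resources);
    }
    return results;
}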

Related

Some trivial transactions take dozens of seconds to complete on Spanner microinstance

Here are some bits of context.
Node.js server, connecting to Cloud Spanner from a development machine.
Most of the time the queries take about 200-400 ms, including data transfer from the servers' location to my dev machine.
But sometimes these trivial transactions take 12-16 seconds, which is surely not acceptable for the use case: session storage for a backend server.
In the local dev context the sessions service runs on the same machine as the main backend; at staging and prod they run in the same Kubernetes cluster.
This is not about the amount of data; there is a very small amount of data in our staging Spanner database overall, a few MB across all tables and only about 10 rows in the table in question.
Spanner instance stats:
Processing units: 100
CPU utilization: 4.3% for the staging database and 10% overall for instance.
The table is like so (a few other small fields omitted):
CREATE TABLE sessions
(
    id STRING(255) NOT NULL,
    created TIMESTAMP,
    updated TIMESTAMP,
    status STRING(16),
    is_local BOOL,
    user_id STRING(255),
    anonymous BOOL,
    expires_at TIMESTAMP,
    last_activity_at TIMESTAMP,
    json_data STRING(MAX),
) PRIMARY KEY(id);
The transaction in question runs a single query like this:
UPDATE ${schema.reportsTable}
SET ${statusCol.columnName} = @status_recycled
WHERE ${idCol.columnName} = @id_value
AND ${statusCol.columnName} = @status_active
with parameters like this:
{
    "id_value": "some_session_id",
    "status_active": "active",
    "status_recycled": "recycled"
}
Yes, that status field of STRING(16) with readable names instead of a boolean field is not ideal, I know, but this concept is inherited from older code. What concerns me is that while we do not yet have much data there, just 10 rows or so, experiencing this sort of delay is surely unexpected at this scale.
Okay, I understand I am practically on the other side of the globe from the Spanner servers, but that usually gives delays of 200-1200 ms, not 12-16 seconds.
The delay happens quite rarely and randomly, but it seems to happen on queries like this.
The delay comes at commit, not at, e.g., sending the SQL command itself or obtaining a transaction.
I tried a different query first, like
DELETE FROM Sessions WHERE id = @id_value
and it was the same: a rare, random 12-16 second delay on such a trivial query.
Thanks a lot for your help and time.
PS: Update: actually this 12-16 second delay can happen on any random transaction in the described context, and all of these transactions are standard single-row CRUD operations.
Update 2:
The code that sends the transaction is our own wrapper over the standard @google-cloud/spanner client library for Node.js.
The wrapper just gives an easy-to-use layer around the Spanner instance, database, and transaction.
The Spanner instance and database objects are long-lived singletons; I mean they are not recreated from scratch for every transaction.
The main purpose of that wrapper is to give logic like:
let result = await useDataContext(async (ctx) => {
    let sql = await ctx.getSQLRunner();
    return await sql.runSQLUpdate({
        sql: `Some SQL Trivial Statement`,
        parameters: {
            param1: 1,
            param2: true,
            param3: "some string"
        }
    });
});
The purpose of that is to give some guarantees: if changes were made to the data, transaction.commit will surely be called; if no changes were made, transaction.end will be called; and if an error blows up in the called code (like invalid generated SQL, or some variable being undefined or null), a transaction rollback will be initiated.
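For reference, here is a minimal sketch of what such a wrapper could look like on top of @google-cloud/spanner. useDataContext, getSQLRunner, and runSQLUpdate are the names from the question; everything else (the placeholder IDs and the exact commit/end/rollback placement) is an assumption, not the asker's actual code:
import { Spanner } from "@google-cloud/spanner";

// Long-lived singletons, as described in the question (placeholder IDs).
const spanner = new Spanner({ projectId: "<project-id>" });
const database = spanner.instance("<instance-id>").database("<database-id>");

interface SQLRunner {
    runSQLUpdate(stmt: { sql: string; parameters?: Record<string, unknown> }): Promise<number>;
}

// Runs `work` inside a read/write transaction: commit if an update was issued,
// end the transaction if nothing changed, roll back if the callback throws.
async function useDataContext<T>(work: (ctx: { getSQLRunner(): Promise<SQLRunner> }) => Promise<T>): Promise<T> {
    return database.runTransactionAsync(async (transaction) => {
        let dirty = false;
        const runner: SQLRunner = {
            async runSQLUpdate({ sql, parameters }) {
                const [rowCount] = await transaction.runUpdate({ sql, params: parameters });
                dirty = true;
                return rowCount;
            },
        };
        try {
            const result = await work({ getSQLRunner: async () => runner });
            if (dirty) {
                await transaction.commit(); // changes were made
            } else {
                transaction.end(); // nothing to commit, release the transaction
            }
            return result;
        } catch (err) {
            await transaction.rollback(); // error in the called code
            throw err;
        }
    });
}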

Unable to delete large number of rows from Spanner

I have a 3-node Spanner instance and a single table that contains around 4 billion rows. The DDL looks like this:
CREATE TABLE predictions (
    name STRING(MAX),
    ...,
    model_version INT64,
) PRIMARY KEY (name, model_version)
I'd like to set up a job to periodically remove some old rows from this table using the Python Spanner client. The query I'd like to run is:
DELETE FROM predictions WHERE model_version <> ?
According to the docs, it sounds like I would need to execute this as a Partitioned DML statement. I am using the Python Spanner client as follows, but am experiencing timeouts (504 Deadline Exceeded errors) due to the large number of rows in my table.
# this always throws a "504 Deadline Exceeded" error
database.execute_partitioned_dml(
    "DELETE FROM predictions WHERE model_version <> @model_version",
    params={"model_version": 104},
    param_types={"model_version": Type(code=INT64)},
)
My first intuition was to see if there was some sort of timeout I could increase, but I don't see any timeout parameters in the source :/
I did notice there was a run_in_transaction method in the Spanner lib that contains a timeout parameter, so I decided to deviate from the partitioned DML approach to see if using this method worked. Here's what I ran:
def delete_old_rows(transaction, model_version):
    delete_dml = "DELETE FROM predictions WHERE model_version <> {}".format(model_version),
    dml_statements = [
        delete_dml,
    ]
    status, row_counts = transaction.batch_update(dml_statements)

database.run_in_transaction(delete_old_rows,
    model_version=104,
    timeout_secs=3600,
)
What's weird about this is that the timeout_secs parameter appears to be ignored, because I still get a 504 Deadline Exceeded error within a minute or two of executing the above code, despite a timeout of one hour.
Anyway, I'm not too sure what to try next, or whether I'm missing something obvious that would allow me to run a delete query in a timely fashion on this huge Spanner table. The model_version column has pretty low cardinality (generally 2-3 unique model_version values in the entire table), so I'm not sure if that would factor into any recommendations. But if someone could offer some advice or suggestions, that would be awesome :) Thanks in advance.
The reason that setting timeout_secs didn't help is that the argument is unfortunately not the timeout for the transaction. It's the retry timeout for the transaction, so it's used to set the deadline after which the transaction will stop being retried.
We will update the docs for run_in_transaction to explain this better.
The root cause was that the total timeout for the streaming RPC calls was set too low in the client libraries: 120 s for streaming APIs (e.g. ExecuteStreamingSQL, which is used by partitioned DML calls).
This has been fixed in the client library source code, changing it to a 60 minute timeout (which is the maximum), and will be part of the next client library release.
As a workaround, in Java, you can configure the timeouts as part of the SpannerOptions when you connect to your database. (I do not know how to set custom timeouts in Python, sorry.)
final RetrySettings retrySettings =
    RetrySettings.newBuilder()
        .setInitialRpcTimeout(Duration.ofMinutes(60L))
        .setMaxRpcTimeout(Duration.ofMinutes(60L))
        .setMaxAttempts(1)
        .setTotalTimeout(Duration.ofMinutes(60L))
        .build();
SpannerOptions.Builder builder =
    SpannerOptions.newBuilder()
        .setProjectId("[PROJECT]");
builder
    .getSpannerStubSettingsBuilder()
    .applyToAllUnaryMethods(
        new ApiFunction<UnaryCallSettings.Builder<?, ?>, Void>() {
          @Override
          public Void apply(Builder<?, ?> input) {
            input.setRetrySettings(retrySettings);
            return null;
          }
        });
builder
    .getSpannerStubSettingsBuilder()
    .executeStreamingSqlSettings()
    .setRetrySettings(retrySettings);
builder
    .getSpannerStubSettingsBuilder()
    .streamingReadSettings()
    .setRetrySettings(retrySettings);
Spanner spanner = builder.build().getService();
The first suggestion is to try gcloud instead:
https://cloud.google.com/spanner/docs/modify-gcloud#modifying_data_using_dml
Another suggestion is to also pass a range of name, so as to limit the number of rows scanned. For example, you could add something like STARTS_WITH(name, 'a') to the WHERE clause so that each transaction touches a small number of rows, but first you will need to know the domain of the name column's values (see the sketch after these suggestions).
The last suggestion is to try to avoid using '<>' if possible, as it is generally pretty expensive to evaluate.
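To make the second suggestion concrete, here is a rough sketch of chunking the delete by a name prefix. It uses the Node.js client and runPartitionedUpdate (the question uses Python, but the idea is the same), and the prefix list and project/instance/database IDs are placeholders you would derive from your own data:
import { Spanner } from "@google-cloud/spanner";

const spanner = new Spanner({ projectId: "<project-id>" });
const database = spanner.instance("<instance-id>").database("<database-id>");

// Hypothetical prefixes; derive these from the real domain of `name`.
const prefixes = ["a", "b", "c" /* ... */];

async function deleteOldPredictions(keepVersion: number): Promise<void> {
    for (const prefix of prefixes) {
        // Each partitioned DML statement only scans rows under one prefix,
        // keeping the amount of work per statement bounded.
        const [rowCount] = await database.runPartitionedUpdate({
            sql: "DELETE FROM predictions WHERE STARTS_WITH(name, @prefix) AND model_version <> @version",
            params: { prefix, version: Spanner.int(keepVersion) },
        });
        console.log(`prefix=${prefix}: deleted ${rowCount} rows`);
    }
}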

TableStorage queryEntities sometimes returning 0 entries but no error

TableStorage & Nodejs
Using the function "queryEntities" sometimes result.entries.length is 0, even when I am pretty sure there are a lot of entries in the database. The "where" parameters are ok, but sometimes (maybe one every 100) it returns 0 entries. Not error returned. Just 0 entries.
And in my function that's causing troubles.
My theory is that the database sometimes is saturated because this function executes every 10 seconds and maybe sometimes before one finish another one starts and both operate over the same table, and instead of error it returns a length 0 , what is something awful.
There is any way to resolve this? Shouldn't it return error?
This is expected behavior. In this particular scenario, please check for the presence of continuation tokens in the response. The presence of these tokens indicates that there may be more entities matching the query, and you should execute the same query again with the continuation token you received.
Please read this document for explanation: https://learn.microsoft.com/en-us/rest/api/storageservices/query-timeout-and-pagination.
From this link:
A query against the Table service may return a maximum of 1,000 items
at one time and may execute for a maximum of five seconds. If the
result set contains more than 1,000 items, if the query did not
complete within five seconds, or if the query crosses the partition
boundary, the response includes headers which provide the developer
with continuation tokens to use in order to resume the query at the
next item in the result set.
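A rough sketch of that loop with the azure-storage Node package, assuming the table service, table name, and query are your own; the key point is that an empty entries array is only final once the response no longer carries a continuation token:
import * as azure from "azure-storage";

// Uses AZURE_STORAGE_CONNECTION_STRING from the environment; the query is illustrative.
const tableService = azure.createTableService();
const query = new azure.TableQuery().where("PartitionKey eq ?", "<partition-key>");

// Keep re-issuing the query with the continuation token until none comes back;
// an individual page may legitimately contain zero entries.
function queryAllEntities(tableName: string): Promise<any[]> {
    return new Promise((resolve, reject) => {
        const entries: any[] = [];
        const nextPage = (token: any) => {
            tableService.queryEntities(tableName, query, token, (error, result) => {
                if (error) return reject(error);
                entries.push(...result.entries);
                if (result.continuationToken) {
                    nextPage(result.continuationToken); // more pages (possibly empty) remain
                } else {
                    resolve(entries);
                }
            });
        };
        nextPage(null);
    });
}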

Request rate is large for remove request

When I try to find some docs in documentDB, all is good -
collection.find({query})
when I try to remove, all is bad
[mongooseModel | collection].remove({same-query})
I got
Request rate is large
The number of documents to remove is ~10,000. I have tested the queries in the Robomongo shell, which limits find results to 50 per page. My remove query also fails with Mongoose. I can't understand this behavior. How can I hit the request rate limit when the remove query is a single request?
Update
A count with the same query also raises the same error.
db.getCollection('taxonomies').count({query})
"Request rate is large" indicates that the application has exceeded the provisioned RU quota, and should retry the request after a small time interval.
Since you are using DocumentDB Node.js API, you could check out #Larry Maccherone's answer in Request rate is large on how to avoid this issue by handling retry behavior and logic in your application's error handling routines.
More on this:
Dealing with RequestRateTooLarge errors in Azure DocumentDB and testing performance
Request Units in Azure Cosmos DB
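To flesh out that retry-after-a-small-interval advice, here is a minimal hedged sketch with the MongoDB Node.js driver. The error-code check (16500 is what the Cosmos DB Mongo API commonly returns for "Request rate is large") and the backoff delays are assumptions you should verify against your own error payloads:
import { MongoClient } from "mongodb";

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry the remove with exponential backoff while the request rate is exceeded.
async function removeWithRetry(uri: string, maxAttempts = 10): Promise<number> {
    const client = await MongoClient.connect(uri);
    try {
        const collection = client.db("<db>").collection("<collection>");
        let delayMs = 500;
        for (let attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                const result = await collection.deleteMany({ myField: "my-query" });
                return result.deletedCount ?? 0;
            } catch (err: any) {
                // 16500 is the throttling code typically surfaced by the Cosmos DB Mongo API;
                // adjust this check to whatever your error payload actually contains.
                if (err.code !== 16500) throw err;
                await sleep(delayMs);
                delayMs *= 2;
            }
        }
        throw new Error("Gave up after repeated 'Request rate is large' errors");
    } finally {
        await client.close();
    }
}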
This is an awful part of CosmosDB - I don't see how they intend to become a real player in the cloud database game with limitations like this. That being said - I came up with a hack to delete all records from a collection through the MongoDB API when you are bumping up against the "Rate Exceeded" error. See below:
var loop = true
while (loop) {
    try {
        db.grid.deleteMany({"myField": "my-query"})
    }
    catch (err) {
        print(err)
        printjson(db.runCommand({getLastRequestStatistics: 1}))
        continue
    }
    loop = false
}
Just update the deleteMany query and run this script directly in your mongo shell and it should loop through and delete the records.

Bigquery API Intermittently returns http error 400 "Bad Request"

I am getting intermittent HTTP 400 errors for a particular query, yet when I examine the text of the query it appears to be correct, and if I then copy the query to the BigQuery GUI and run it, it executes without any problems. The query is being constructed in Node.js and submitted through the gcloud Node.js API. The response I receive, which contains the text of the query, is too large to post here, but I do have the path name:
"pathname":"/bigquery/v2/projects/rising-ocean-426/queries/job_aSR9OCO4U_P51gYZ2xdRb145YEA"
The error seems to occur only if the live_seconds_viewed calculations are included in the query. If any part of the live_seconds_viewed calculation is included then the query fails intermittently.
The initial calculation of this field is:
CASE WHEN event = 'video_engagement'
AND range IS NULL
AND INTEGER(video_seconds_viewed) > 0
THEN 10
ELSE 0 END AS live_seconds_viewed,
Sometimes I can get the query to execute simply by changing the order of the expressions. But again, it is intermittent.
Any help with this would be greatly appreciated.
After long and arduous trial and error, I've determined that the query is failing simply because the query string is too long. When the query is executed from the GUI, the whitespace is apparently stripped, so the query executes because without the whitespace it is short enough to pass the size limit.
When I manipulated the query to determine what part or parts were causing the problem, I would inadvertently reduce the size of the query below the critical limit and cause the query to pass.
It would be great if the error response from BigQuery included some hint about what the problem is, rather than firing off a 400 Bad Request error and calling it quits.
It would be even better if the BigQuery parser ignored whitespace when determining the size of the query. That way the behavior in the GUI would match the behavior when submitting the query through the API.
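Along those lines, a small defensive helper (just a sketch, not part of the original answer): collapse runs of whitespace in the generated SQL before submitting it through the API, which mirrors what the GUI effectively did here, and log the length so an oversized query is visible instead of surfacing as an opaque 400:
// Collapse whitespace in a generated query before submitting it, and log the size.
// Caveat: this also collapses whitespace inside string literals, so only use it
// when the query contains none (or handle literals separately).
function compactQuery(sql: string): string {
    const compact = sql.replace(/\s+/g, " ").trim();
    console.log(`query length: ${sql.length} -> ${compact.length} characters`);
    return compact;
}

// Usage: pass compactQuery(generatedSql) to whatever submits the BigQuery job.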
