Azure CosmosDB. Continuation token length in stored procedure

I have a REST API that is intended to query documents stored in CosmosDB with an OData-like syntax. I return documents in chunks, i.e. I set $top=10 and get 10 documents along with a continuation token. This continuation token is returned from a stored procedure:
    var accepted = collection.queryDocuments(collection.getSelfLink(),
        sql, requestOptions,
        function (err, documents, responseOptions) {
            // ...
            // put responseOptions.continuation into response body
        });
The problem is that if the continuation token is long (e.g. 6k characters) and I pass it in the URL, the URL cannot be handled and I can't reach my endpoint (I get a 404). As far as I understand, the more complex the initial SQL query is, the longer the continuation token is, and its length cannot be configured.
Is there a workaround for that?

I don't think there is an out-of-the-box solution for this issue. What you can try is to implement a tiny-URL kind of framework at your service layer.
https://www.geeksforgeeks.org/how-to-design-a-tiny-url-or-url-shortener/
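A minimal sketch of that idea, written in Python for brevity: the service keeps the long continuation token on its side and hands the client only a short key. The `TokenStore` class, its method names and the in-memory dict are assumptions for illustration; a real service would more likely use a shared cache or database with an expiry.

```python
import hashlib


class TokenStore:
    """Maps short keys to long continuation tokens (in-memory, illustrative only)."""

    def __init__(self, key_length=16):
        self._tokens = {}
        self._key_length = key_length

    def shorten(self, continuation_token):
        """Store the token and return a short, URL-safe key for it."""
        digest = hashlib.sha256(continuation_token.encode("utf-8")).hexdigest()
        key = digest[: self._key_length]  # truncated hash; collisions are unlikely but possible
        self._tokens[key] = continuation_token
        return key

    def resolve(self, key):
        """Return the original token, or None if the key is unknown."""
        return self._tokens.get(key)
```

The endpoint would call `shorten` before returning a page and `resolve` when the client sends the key back, so the URL only ever carries the short key.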

Related

Golang / CosmosDB Pagination

I'm trying to implement pagination while selecting records from CosmosDB using the cosmosapi package.
The Azure documentation states that continuation tokens never expire, and I'm trying to understand the semantics of that.
In "How does Cosmos DB Continuation Token work?" there is agreement that
"Documents created after serving the first page are observable on
subsequent pages"
I tried to validate that point by running some experiments from a Golang application, and something is not quite right. As a very high-level example, if we insert three records into CosmosDB:
Insert record #1
Insert record #2
Insert record #3
Then if we try to select from the table (query = SELECT * FROM c ORDER BY c.dateField DESC) using these options:
    opts := cosmosapi.QueryDocumentsOptions{
        IsQuery:           true,
        ContentType:       cosmosapi.QUERY_CONTENT_TYPE,
        ConsistencyLevel:  cosmosapi.ConsistencyLevelStrong,
        Continuation:      "",
        PartitionKeyValue: partitionKeyValue,
        MaxItemCount:      2,
    }
it returns:
record #1
record #2
continuation token = "cont-token-1"
Now when selecting again with the same options, but a different continuation token:
    opts := cosmosapi.QueryDocumentsOptions{
        IsQuery:           true,
        ContentType:       cosmosapi.QUERY_CONTENT_TYPE,
        ConsistencyLevel:  cosmosapi.ConsistencyLevelStrong,
        Continuation:      "cont-token-1",
        PartitionKeyValue: partitionKeyValue,
        MaxItemCount:      2,
    }
It returns
record #3
Which is fairly logical.
Now when I insert record #4 (it gets inserted right after record #3) and fetch again using "cont-token-1", record #4 does not show up. It only shows up when I regenerate the continuation tokens by selecting again with an empty opts.Continuation field.
If I try to select using an empty continuation token, then it fetches record #1 and record #2, and leads to a new token that fetches record #3 and record #4.
Is this the expected behavior? Or am I missing anything?
From my understanding, it should show up. The continuation token is like a bookmark, so the new record should be visible even when reusing the same continuation token.
A continuation token can only be used with the exact same query, and it will return the exact same answer every time, regardless of how you change the underlying data. You need to get a new token if the underlying data changes in a way that would have been reflected in the first answer.
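The paging pattern from the question can be sketched as a loop that feeds each returned token into the next request. This is written in Python rather than Go for brevity. `fetch_page` is a hypothetical stand-in for the real query call (it is not part of cosmosapi), and the fake implementation below only slices a fixed list, so it says nothing about how CosmosDB itself treats records inserted between pages.

```python
def fetch_all(fetch_page, max_item_count):
    """Follow continuation tokens until the server stops returning one."""
    items = []
    continuation = ""  # an empty token requests the first page
    while True:
        page, continuation = fetch_page(continuation, max_item_count)
        items.extend(page)
        if not continuation:
            return items


def make_fake_fetch_page(records):
    """Build a hypothetical fetch_page that pages over an in-memory list."""

    def fetch_page(continuation, max_item_count):
        # The fake token is just the offset of the next record, as a string
        start = int(continuation) if continuation else 0
        end = start + max_item_count
        next_token = str(end) if end < len(records) else ""
        return records[start:end], next_token

    return fetch_page
```

With three records and a page size of 2, the first call returns two records plus a token, and the second call returns the remaining record with an empty token, which mirrors the sequence described in the question.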

Understanding "x-ms-request-charge" and "x-ms-total-request-charge" in CosmosDB Gremlin API

I am using gremlin (version 3.4.6) package to query my Cosmos DB account targeting Gremlin (Graph) API. The code is fairly straightforward:
    const gremlin = require('gremlin');
    const authenticator = new gremlin.driver.auth.PlainTextSaslAuthenticator(
        `/dbs/<database-name>/colls/<container-name>`,
        "<my-account-key>"
    );
    const client = new gremlin.driver.Client(
        "wss://<account-name>.gremlin.cosmosdb.azure.com:443/",
        {
            authenticator,
            traversalsource : "g",
            rejectUnauthorized : true,
            mimeType : "application/vnd.gremlin-v2.0+json"
        }
    );
    client.submit("g.V()")
        .then((result) => {
            console.log(result);
        })
        .catch((error) => {
            console.log(error);
        });
The code is working perfectly fine and I am getting the result back. The result object has an attributes property which looks something like this:
    {
        "x-ms-status-code": 200,
        "x-ms-request-charge": 0,
        "x-ms-total-request-charge": 123.85999999999989,
        "x-ms-server-time-ms": 0.0419,
        "x-ms-total-server-time-ms": 129.73709999999994,
        "x-ms-activity-id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    }
If you notice, there are two things related to request charge (basically how expensive my query is): x-ms-request-charge and x-ms-total-request-charge.
I have three questions regarding this:
1. What's the difference between the two?
2. I noticed that x-ms-request-charge always comes back as 0 and x-ms-total-request-charge as a non-zero value. Why is that?
3. Which value should I use to calculate the request charge? My guess is to use x-ms-total-request-charge, as it is a non-zero value.
And while we're at it, I would appreciate if someone can tell me the difference between x-ms-server-time-ms and x-ms-total-server-time-ms as well.
These response headers are specific to our Gremlin API and are documented here: Azure Cosmos DB Gremlin server response headers.
For a single request, Gremlin server can send response with multiple partial response messages (loosely equivalent to a page, but returned as a stream instead of multiple request/responses with continuations as is done with SQL API).
x-ms-request-charge is the RUs consumed to resolve a single partial response.
x-ms-total-request-charge is the running total of RUs consumed up to the current partial response. So when the final message is sent, this will denote the total RUs consumed for the entire request.
Depending on the Gremlin client driver implementation, each partial response may be exposed to the caller, or the driver may accumulate all responses internally and return a final result. The latter case prompted us to add x-ms-total-request-charge, so that drivers implemented this way could still resolve the total cost of the request.
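The relationship between the two headers can be sketched as follows, in Python for brevity. The function name and the idea of receiving the partial responses as a list of attribute dicts are assumptions for illustration; a driver that accumulates responses internally will typically expose only the final set of attributes.

```python
def total_request_charge(partial_attributes):
    """Return the RU cost of a whole request from its partial responses.

    Prefers the running total reported on the last partial response and
    falls back to summing the per-response charges.
    """
    if not partial_attributes:
        return 0.0
    last = partial_attributes[-1]
    if "x-ms-total-request-charge" in last:
        return float(last["x-ms-total-request-charge"])
    # No running total available: add up each partial response's own charge
    return float(sum(a.get("x-ms-request-charge", 0) for a in partial_attributes))
```

For a driver that returns only the final attributes, as in the question, this reduces to reading x-ms-total-request-charge.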
Thanks for the question and hope this is helpful.

How to make a httpsrequest 'Get' in apex and then update a record in Salesforce

Overview:
We have a third party that hosts a text value at a given endpoint. A 'GET' request to a URL, to which we also pass a key and parameters, returns a string value (decimal digits and a space).
I created some Apex code, including an @InvocableMethod, so I could call the Apex from a flow where I pass in the URL, and the text is then returned to the flow. I then go on to update a record.
Here is the method.
There is also a class, FR_Amount_Variables, storing the URL and String @InvocableVariable values.
    public class FR_Amount_Sync {
        @InvocableMethod(label='FR Amount Raised Get')
        public static List<FR_Amount_Variables> getFRamount(List<FR_Amount_Variables> inputURL) {
            FR_Amount_Variables amtvar = new FR_Amount_Variables();
            List<FR_Amount_Variables> getFRamount = new List<FR_Amount_Variables>();
            // Call the third-party endpoint passed in from the flow
            String endpoint = inputURL[0].URL;
            Http http = new Http();
            HttpRequest request = new HttpRequest();
            request.setEndpoint(endpoint);
            request.setMethod('GET');
            HttpResponse response = http.send(request);
            // Strip whitespace; treat an empty body as zero
            String Amounts = response.getBody();
            Amounts = Amounts.replaceAll('\\s+', '');
            if (String.isEmpty(Amounts)) {
                Amounts = '0.00';
            }
            Decimal amountss = Decimal.valueOf(Amounts);
            amtvar.amount = amountss;
            getFRamount.add(amtvar);
            return getFRamount;
        }
    }
The flow is shown in the image below.
[Image: Update Flow]
Issue:
When I run the flow in Debug mode, set the 3 input variables and run, the flow executes the apex and updates the specified record correctly.
Likewise, if I preset the flow's input variables (add default values) and then just run the flow, the Apex call and the record update succeed, with the record being updated with the correct value from the third party.
The issue is when I try to automatically run the flow, either by Process Builder, or by Mass Action Scheduler, I receive system exception errors.
An Apex error occurred:
System.CalloutException: You have uncommitted work pending. Please commit or rollback before calling out
and
An Apex error occurred: System.CalloutException: Callout loop not allowed
respectively.
I was wondering if there is any way to trigger the flow without getting an error. Otherwise, is there a way I can make an HTTP GET callout and then update a record with the received value?
We cannot do DML before the callout in the same transaction.
DML can be done after the callout.
So the best practice is to make the callout from a future method. That way, the flow will handle the DML operations.
For an example, see this link:
https://www.infallibletechie.com/2020/04/how-to-do-callout-from-flow-in.html

Cloudant change notifications

I am new to Cloudant but have found it useful for a first stage of IoT data. But I need to subscribe to changes based on an id field that is separate from the _id and is unique to the sensor that is sending the data. The examples that I've seen so far haven't helped with this problem. What I'm doing now is sending a separate JSON doc for each post, so it should return new docs with this sensor id. The JSON docs sometimes come in by the second, but it can be hours between them as well.
I'm using C# in a .NET web app. The code below creates a call to the Cloudant database and returns the data that I want based on an index that was created for the field SensorID:
    json =
    {{
        "selector": {
            "SensorID" : "h7365cf3-17bc-4422-b436-f7bcf12b2e2a"
        },
        "fields": [
            "Data"
        ]
    }}
    url = My Cloudant url + "/_find".
This returns all docs with the sensorID field that corresponds to the SensorID value in the json query, but just the json object of each doc nested in the Data field.
    using (WebClient client = new WebClient())
    {
        byte[] postBytes = System.Text.Encoding.UTF8.GetBytes(json.ToString());
        client.UseDefaultCredentials = true;
        client.Credentials = new NetworkCredential(username, password);
        client.Headers[HttpRequestHeader.ContentType] = "application/json";
        var response = client.UploadData(url, "POST", postBytes);
        JObject iJson = JObject.Parse(client.Encoding.GetString(response));
        return parseIncoming(iJson);
    }
When the call is a GET to My Cloudant url + "/_db_updates", it returns information regarding changes to the whole database. This can be set up as a continuous feed.
I was hoping that this meant that I could subscribe to changes in documents to get new data as it comes in, like Redis Pub/Sub. I'm starting to think that this might not be the case, but if anybody can show me how to do it I would be grateful.
As @adasilva70 said, you need to use the _changes feed.
You can filter changes with an appropriate filter function (so that only changes regarding the documents you're interested in show up).
You can get all updates since a given sequence point (everything since the last data you got) and/or you can use long polling or continuous mode for instant notifications.
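A hedged sketch of such a request, in Python for brevity. Instead of a custom filter function in a design document, it assumes the built-in `_selector` filter (available in CouchDB 2.x and later and in Cloudant), which takes the same selector syntax as the `_find` call in the question. The function name and the example URL are hypothetical, and only the URL and body are built here; no request is sent.

```python
import json
from urllib.parse import urlencode


def build_changes_request(db_url, sensor_id, since="now", feed="longpoll"):
    """Return (url, body) for a POST to _changes filtered by SensorID."""
    query = urlencode({
        "feed": feed,            # "longpoll" or "continuous" for notifications
        "since": since,          # "now", or the last sequence value already seen
        "filter": "_selector",   # assumption: built-in selector filter is available
        "include_docs": "true",
    })
    url = db_url.rstrip("/") + "/_changes?" + query
    body = json.dumps({"selector": {"SensorID": sensor_id}})
    return url, body
```

The URL and body would be POSTed with the same credentials and content type as the `_find` call. With feed=longpoll the request returns once a matching change arrives or the request times out, and the sequence value from that response can be passed as `since` on the next call.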

What are all the ways CouchDB responses fail?

I'm building a Node.js application on the express.js framework with CouchDB as a database. I'm utilizing CouchDB's session api for maintaining session state, and various databases for different sections of data.
On essentially every request my application code makes a request to Couch and then if there's an error (with Node) I can respond appropriately, by logging the error and redirecting to a 404 page or something like that. But if I get a CouchDB error, Node wouldn't consider it an error, it would consider that data. Now that's totally fine with me as long as CouchDB can only return this format:
    {
        "error": "illegal_database_name",
        "reason": "Only lowercase characters (a-z), digits (0-9), and any of the characters _, $, (, ), +, -, and / are allowed. Must begin with a letter."
    }
A JSON doc with two properties, error and reason. That's fine I can parse it and return the appropriate message; quite gracefully actually.
BUT! Is that all I can expect from CouchDB, or is there another way Couch might fail, that wouldn't yield a JSON doc with those two fields (properties)?
dscape's advice to rely on the response codes is correct, and in most situations you will get an object with error and reason. The bulk-document errors are the only place I can think of where neither of these will be true. If just one document fails then you'll still get a 200, but you'll get the error/reason within the array element corresponding to the document that failed. See the docs for more info on that.
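Both cases can be handled in one place, sketched here in Python for brevity (the question's application is Node.js). The function name and the fallback label for non-JSON failures are assumptions for illustration; `body` is the already-parsed JSON response, or None if parsing failed.

```python
def find_failures(status_code, body):
    """Return a list of (error, reason) pairs found in a CouchDB response."""
    failures = []
    if isinstance(body, dict) and "error" in body:
        # The usual case: a single object with error and reason
        failures.append((body["error"], body.get("reason")))
    elif isinstance(body, list):
        # Bulk-document case: each array element reports on one document
        for row in body:
            if isinstance(row, dict) and "error" in row:
                failures.append((row["error"], row.get("reason")))
    if status_code >= 400 and not failures:
        # Failure status without the expected JSON shape
        failures.append(("http_%d" % status_code, None))
    return failures
```

An empty list means nothing in the status code or the body indicated a failure.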
