How to modify server-side paging size - pagination

I have an OData Web API service using .NET 4.5. It has a WebApi controller derived from EntitySetController:
public class WorkItemsController : EntitySetController<WorkItem, string>
{
    [Queryable(PageSize = 100)]
    public override IQueryable<WorkItem> Get()
    {
        // go to AWS DynamoDb, get the work items and then return them
    }
}
As you can see, I set the server-side page size to 100 by default. Later I realized that I need to increase the size programmatically inside the Get() function. Does anyone know how to do that?
If you want to know the reason, here is why:
AWS DynamoDB doesn't support $skip or $top queries. Each time a client wants to get a collection of work items, I need to fetch all work items from DynamoDB. When the number is big, it takes a very long time if each request only returns 100 items to the user. So my strategy is to double or triple the number of work items we return each time, so the user gets 100, 200, 400, 800 work items with consecutive requests. Assuming there are 1500 work items in DynamoDB, I will query only 4 times to return all of them. If we keep a constant page size, like 100, I need to query 15 times.

You can take an ODataQueryOptions parameter in your method and apply it yourself with your own ODataQuerySettings, which lets you control the page size per request.
public IQueryable Get(ODataQueryOptions queryOptions)
{
    var settings = new ODataQuerySettings { PageSize = 100 };
    var result = GetResult();
    return queryOptions.ApplyTo(result, settings);
}
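Since PageSize is just a property on ODataQuerySettings, nothing stops you from computing it per request instead of hard-coding it. Here is a minimal sketch of the doubling scheme from the question, assuming a hypothetical convention where the client reports how many pages it has already fetched via a pageAttempt query parameter (that parameter name and the GetResult() helper are illustrative, not part of OData):

public IQueryable Get(ODataQueryOptions queryOptions)
{
    // Hypothetical convention: ?pageAttempt=N means N pages were already fetched
    string raw = Request.GetQueryNameValuePairs()
                        .Where(kv => kv.Key == "pageAttempt")
                        .Select(kv => kv.Value)
                        .FirstOrDefault();
    int attempt;
    if (!int.TryParse(raw, out attempt))
    {
        attempt = 0;
    }

    // Double the page size on every attempt: 100, 200, 400, 800, ...
    // (capped so the left shift cannot overflow)
    var settings = new ODataQuerySettings { PageSize = 100 << Math.Min(attempt, 10) };

    var result = GetResult(); // fetch the work items from DynamoDB as before
    return queryOptions.ApplyTo(result, settings);
}

With this, four consecutive requests (pageAttempt 0 through 3) return 100, 200, 400, and 800 items, covering the 1500-item example in four round trips.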

This is exactly the issue that LINQ2DynamoDB addresses. To support $skip and $top (that is, Enumerable.Skip() and Enumerable.Take()), it caches the results returned by DynamoDB in ElastiCache, so server-side paging works much more efficiently and the number of read operations is greatly reduced.
Moreover, LINQ2DynamoDB supports OData automatically, so you might not even need to write any WebApi controllers.
Why not try it? :)

Related

How to get the total count of records for a search endpoint with Oracle R2DBC to implement pagination

I am implementing a search endpoint for a particular condition using Oracle R2DBC to provide paged output. I need the total number of records so that the user can fetch the results accordingly. Below is the code I am using. However, it fails the performance test and throws a 503 error. Any help will be appreciated.
// Signature reconstructed for readability; the filter/request types are assumptions
public Mono<SearchResponse> search(SearchFilter filter, PageRequest pageRequest) {
    return responseMapper.map(planogramRepository.findPlanogramsByDepartmentAndSubgroup(
                    filter.getDepartment_id(), filter.getSubgroup(), pageRequest))
            .collectList()
            .zipWith(this.planogramRepository.countPlanogramsByDepartmentAndSubgroup(
                    filter.getDepartment_id(), filter.getSubgroup()))
            .flatMap(tuple2 -> Mono.just(SearchResponse.builder()
                    .totalCount(tuple2.getT2())
                    .planograms(tuple2.getT1())
                    .build()));
}

Google Photos API mediaItems list/search methods ignore pageSize param

I am attempting to retrieve all media items that a given Google Photos user has, irrespective of any album(s) they are in. However, when I attempt to use either the mediaItems.list or the mediaItems.search method, the pageSize param I am including in the request is either being ignored or not fully fulfilled.
Details of mediaItems.list request
GET https://photoslibrary.googleapis.com/v1/mediaItems?pageSize=<###>
Details of mediaItems.search request
POST https://photoslibrary.googleapis.com/v1/mediaItems:search
BODY { 'pageSize': <###> }
I have made a simple implementation of these two requests here as an example for this question; it just requires a valid accessToken to use:
https://jsfiddle.net/zb2htog1/
Running this script with the following pageSize values against a Google Photos account with 100s of photos and 10s of albums consistently returns the same unexpected number of results for both methods:
Request pageSize    Returned media items count
1                   1
25                  9
50                  17
100                 34
I know that Google states the following for the pageSize parameter for both of these methods:
“Maximum number of media items to return in the response. Fewer media
items might be returned than the specified number. The default
pageSize is 25, the maximum is 100.”
I originally assumed that the reason fewer media items might be returned is that an account might have fewer media items in total than the requested pageSize, or that a request with a pageToken has reached the end of a set of paged results. However, I am now wondering if it just means that results may vary in general?
Can anyone else confirm if they have the same experience when using these methods without an album ID for an account with a suitable amount of photos to test this? Or am I perhaps constructing my requests in an incorrect fashion?
I experience something similar; I get back half of what I expect.
If I don't set the pageSize, I get back just 13. If I set it to 100, I get back 50.
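Given that the documentation explicitly allows under-filled pages, the usual remedy is to keep following nextPageToken until you have as many items as you need (or the token disappears). A minimal C# sketch against the REST endpoint, assuming a valid access token and Newtonsoft.Json (the helper name and target count are placeholders):

using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json.Linq;

static async Task<List<JToken>> FetchMediaItemsAsync(string accessToken, int wanted)
{
    var items = new List<JToken>();
    string pageToken = null;

    using (var http = new HttpClient())
    {
        http.DefaultRequestHeaders.Authorization =
            new AuthenticationHeaderValue("Bearer", accessToken);

        while (items.Count < wanted)
        {
            // mediaItems:search with no album or filters pages over the whole library
            var body = new JObject { ["pageSize"] = 100 };
            if (pageToken != null) body["pageToken"] = pageToken;

            var response = await http.PostAsync(
                "https://photoslibrary.googleapis.com/v1/mediaItems:search",
                new StringContent(body.ToString(), Encoding.UTF8, "application/json"));
            response.EnsureSuccessStatusCode();

            var json = JObject.Parse(await response.Content.ReadAsStringAsync());
            foreach (var item in json["mediaItems"] ?? new JArray())
                items.Add(item);

            // No nextPageToken means the end of the library was reached
            pageToken = (string)json["nextPageToken"];
            if (pageToken == null) break;
        }
    }
    return items;
}

Each iteration may return fewer than 100 items (as the table above shows), but the loop converges on the full set as long as you pass the token back.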

Understanding "x-ms-request-charge" and "x-ms-total-request-charge" in CosmosDB Gremlin API

I am using the gremlin package (version 3.4.6) to query my Cosmos DB account targeting the Gremlin (Graph) API. The code is fairly straightforward:
const gremlin = require('gremlin');

const authenticator = new gremlin.driver.auth.PlainTextSaslAuthenticator(
    `/dbs/<database-name>/colls/<container-name>`,
    "<my-account-key>"
);

const client = new gremlin.driver.Client(
    "wss://<account-name>.gremlin.cosmosdb.azure.com:443/",
    {
        authenticator,
        traversalsource: "g",
        rejectUnauthorized: true,
        mimeType: "application/vnd.gremlin-v2.0+json"
    }
);

client.submit("g.V()")
    .then((result) => {
        console.log(result);
    })
    .catch((error) => {
        console.log(error);
    });
The code is working perfectly fine and I am getting the result back. The result object has an attributes property which looks something like this:
{
    "x-ms-status-code": 200,
    "x-ms-request-charge": 0,
    "x-ms-total-request-charge": 123.85999999999989,
    "x-ms-server-time-ms": 0.0419,
    "x-ms-total-server-time-ms": 129.73709999999994,
    "x-ms-activity-id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}
If you notice, there are two things related to request charge (basically how expensive my query is): x-ms-request-charge and x-ms-total-request-charge.
I have three questions regarding this:
What's the difference between the two?
I noticed that x-ms-request-charge always comes back as 0 and x-ms-total-request-charge as a non-zero value. Why is that?
Which value should I use to calculate the request charge? My guess is x-ms-total-request-charge, as it is the non-zero value.
And while we're at it, I would appreciate it if someone could also tell me the difference between x-ms-server-time-ms and x-ms-total-server-time-ms.
These response codes are specific to our Gremlin API and are documented here: Azure Cosmos DB Gremlin server response headers.
For a single request, the Gremlin server can send a response composed of multiple partial response messages (loosely equivalent to pages, but returned as a stream instead of multiple request/responses with continuations, as is done with the SQL API).
x-ms-request-charge is the RUs consumed to resolve a single partial response.
x-ms-total-request-charge is the running total of RUs consumed up to the current partial response. So when the final message is sent, this will denote the total RUs consumed for the entire request.
Depending on the Gremlin client driver implementation, each partial response may be exposed to the caller, OR the driver may accumulate all responses internally and return a final result. The latter case is what prompted us to add x-ms-total-request-charge, so that drivers implemented that way can still resolve the total cost of the request.
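As an illustration of that second driver style: with the .NET driver (Gremlin.Net), SubmitAsync drains all partial responses into a single ResultSet, and the totals can then be read from its status attributes. A small sketch, with connection setup elided:

using System;
using System.Threading.Tasks;
using Gremlin.Net.Driver;

static async Task PrintRequestChargeAsync(GremlinClient client)
{
    // The driver accumulates every partial response internally,
    // so these attributes reflect the final (total) values.
    ResultSet<dynamic> resultSet = await client.SubmitAsync<dynamic>("g.V().count()");

    // Total RUs consumed by the whole request
    Console.WriteLine("Total RUs: {0}",
        resultSet.StatusAttributes["x-ms-total-request-charge"]);
}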
Thanks for the question and hope this is helpful.

Performance of Azure Cosmos DB with elements that have a big property containing HTML

We're using the Azure Cosmos DB Graph API to cache items from a CMS that have properties containing a fairly big chunk of HTML.
After adding 8000 items, Cosmos DB starts to be very slow.
For instance this simple query takes about 12-15 seconds to complete:
g.V().hasLabel('news').limit(10)
Data in each vertex is around 4-5 KB, and I've excluded the Content property in the graph settings.
I've increased the throughput to 5000 RU/s, and the Monitor tab in the Azure Portal seems to indicate that is enough. Estimating throughput needs suggests that 5000 RU/s should be enough for 500 reads/s, but I can't even do one.
Querying out items without the html-property like g.V().hasLabel('user') is still fast.
I also tried to exclude the path from indexing, but it made no difference (I haven't reloaded the items, if that is necessary):
"excludedPaths": [
{
"path": "/Content/?"
}
]
What can I do to get this up to speed?
If you are using the .NET SDK, it appears that the request retrieves all of the results for the "hasLabel" filter and performs the "limit" filtering in the client-side SDK code.
I sniffed a few queries with the "limit" step in Fiddler, and no matter the value, the query in the request does not contain a TOP clause. The DocumentDB query in the body of the request looks like:
{"query":"SELECT N_2 FROM Node N_2 WHERE (IS_DEFINED(N_2._isEdge) = false AND (N_2.label = 'news'))"}
I would expect it to be:
{"query":"SELECT TOP 10 N_2 FROM Node N_2 WHERE (IS_DEFINED(N_2._isEdge) = false AND (N_2.label = 'news'))"}

Create an alert on calling a third party API using Azure Application Insights

I've enabled application insights on an Azure WebApp that I created. My WebApp is calling a third party API which runs on a quota. I am only allowed 100k calls per month.
I need to track those API calls so that I can create an alert when the number of calls has reached 50% of the quota, then another alert at 75%.
I am using TrackEvent every time the call is made, and the event in the AppInsights dashboard does increment. But I can't seem to create an alert for when a certain number of calls is made; I can't see it in the 'events' dropdown list.
In addition, another requirement is to create an alert when the number of calls to the API goes over 10 per minute.
Is TrackEvent the right method to use for these requirements?
I did something like this ...
var telemetryEventClient = new Microsoft.ApplicationInsights.TelemetryClient(
    new Microsoft.ApplicationInsights.Extensibility.TelemetryConfiguration()
    {
        InstrumentationKey = "Instrumentation Key"
    });
telemetryEventClient.Context.Operation.Name = "MyAPIProvider";

var properties = new Dictionary<string, string>
{
    { "Source", "WebAppToAPI" }
};
var metrics = new Dictionary<string, double>
{
    { "CallingAPIMetric", 1 }
};

telemetryEventClient.TrackEvent("CallingAPI", properties, metrics);
but when I went to set up the alert and entered a threshold of 50000 (for testing, I just put 5), I never reached it, as the event count is always 1. Am I approaching this the right way?
The alert you're trying to define always looks at the value you supply in your custom event - not the number of events you're firing.
You can create an automated flow to query your events and send you an email whenever the query result passes some threshold.
The Application Insights Connector which works both for Flow and Microsoft Logic Apps was created just for that, and can be defined on any query result from any document type (event, metric or even traces).
Step-by-step documentation on how to create your own flow is here.
As for your query - you need a simple analytics query like this:
customEvents
| where timestamp > ago(1h) // or any time range you need
| where name == "CallingAPI"
| count
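For the second requirement (alerting when calls exceed 10 per minute), the same approach works with a binned query; a sketch along the same lines (the 5-minute window is an arbitrary choice):

customEvents
| where timestamp > ago(5m)
| where name == "CallingAPI"
| summarize calls = count() by bin(timestamp, 1m)
| where calls > 10

Any row returned means some minute in the window exceeded the threshold, so the flow condition can simply check for a non-empty result.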
