DocumentDB performance issues - Azure

When running DocumentDB queries from C# code on my local computer, a simple DocumentDB query takes about 0.5 seconds on average. As another example, getting a reference to a document collection takes about 0.7 seconds on average. Is this to be expected? Below is my code for checking whether a collection exists; it is pretty straightforward, but is there any way to improve the poor performance?
// Create a new instance of the DocumentClient
var client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey);
// Get the database with the id=FamilyRegistry
var database = client.CreateDatabaseQuery().Where(db => db.Id == "FamilyRegistry").AsEnumerable().FirstOrDefault();
var stopWatch = new Stopwatch();
stopWatch.Start();
// Get the document collection with the id=FamilyCollection
var documentCollection = client.CreateDocumentCollectionQuery("dbs/"
+ database.Id).Where(c => c.Id == "FamilyCollection").AsEnumerable().FirstOrDefault();
stopWatch.Stop();
// Get the elapsed time as a TimeSpan value.
var ts = stopWatch.Elapsed;
// Format and display the TimeSpan value.
var elapsedTime = String.Format("{0:00} seconds, {1:00} milliseconds",
ts.Seconds,
ts.Milliseconds );
Console.WriteLine("Time taken to get a document collection: " + elapsedTime);
Console.ReadKey();
Average output on local computer:
Time taken to get a document collection: 0 seconds, 752 milliseconds
In another piece of my code I'm doing 20 small document updates of about 400 bytes each in JSON size, and it still takes 12 seconds in total. I'm only running from my development environment, but I was expecting better performance.

In short, this can be done end to end in ~9 milliseconds with DocumentDB. I'll walk through the required changes and why/how they affect the results below.
The very first query always takes longer in DocumentDB because it does some setup work (fetching the physical addresses of the DocumentDB partitions). The next couple of requests also take a little longer while the connection pools warm up. Subsequent queries will be as fast as your network allows (the read latency in DocumentDB is very low thanks to SSD storage).
For example, if you modify the code above to measure 10 readings instead of just the first one, as shown below:
using (DocumentClient client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey))
{
long totalRequests = 10;
var database = client.CreateDatabaseQuery().Where(db => db.Id == "FamilyRegistry").AsEnumerable().FirstOrDefault();
Stopwatch watch = new Stopwatch();
for (int i = 0; i < totalRequests; i++)
{
watch.Start();
var documentCollection = client.CreateDocumentCollectionQuery("dbs/"+ database.Id)
.Where(c => c.Id == "FamilyCollection").AsEnumerable().FirstOrDefault();
Console.WriteLine("Finished read {0} in {1}ms ", i, watch.ElapsedMilliseconds);
watch.Reset();
}
}
Console.ReadKey();
I get the following results running from my desktop in Redmond against the Azure West US data center: roughly 50 milliseconds per read. These numbers may vary based on network connectivity and the distance between your client and the Azure data center hosting DocumentDB:
Finished read 0 in 217ms
Finished read 1 in 46ms
Finished read 2 in 51ms
Finished read 3 in 47ms
Finished read 4 in 46ms
Finished read 5 in 93ms
Finished read 6 in 48ms
Finished read 7 in 45ms
Finished read 8 in 45ms
Finished read 9 in 51ms
Next, I switch from the default Gateway mode to Direct/TCP connectivity to cut the latency from two hops to one, i.e., I change the initialization code to:
using (DocumentClient client = new DocumentClient(new Uri(EndpointUrl), AuthorizationKey, new ConnectionPolicy { ConnectionMode = ConnectionMode.Direct, ConnectionProtocol = Protocol.Tcp }))
Now the operation to find the collection by ID completes within 23 milliseconds:
Finished read 0 in 197ms
Finished read 1 in 117ms
Finished read 2 in 23ms
Finished read 3 in 23ms
Finished read 4 in 25ms
Finished read 5 in 23ms
Finished read 6 in 31ms
Finished read 7 in 23ms
Finished read 8 in 23ms
Finished read 9 in 23ms
How about when you run the same test from an Azure VM or Worker Role that is also running in the same Azure data center? The same operation completes in about 9 milliseconds!
Finished read 0 in 140ms
Finished read 1 in 10ms
Finished read 2 in 8ms
Finished read 3 in 9ms
Finished read 4 in 9ms
Finished read 5 in 9ms
Finished read 6 in 9ms
Finished read 7 in 9ms
Finished read 8 in 10ms
Finished read 9 in 8ms
So, to summarize:
For performance measurements, please take several samples to account for the startup/initialization of the DocumentDB client.
Please use TCP/Direct connectivity for lowest latency.
When possible, run within the same Azure region.
If you follow these steps, you should see great performance and get the best performance numbers DocumentDB can offer.

Related

Laravel-Excel keeps browser busy for 140 seconds after completion of import: how do I correct it?

Using the import to models option, I am importing an XLS file with about 15,000 rows.
Using the microtime_float function, the script times the import and echoes how long it takes; it reports about 29.6 seconds, so the import itself finishes in under 30 seconds. At that point I can see the database has all 15k+ records as expected, so no issues there.
The problem is that the browser is kept busy: at 1 min 22 secs, 1 min 55 secs and 2 min 26 secs it prompts me to either wait or kill the process. I keep clicking wait, and it finally ends at 2 min 49 secs.
This is a terrible user experience; how can I cut out this extra wait time?
It's a very basic setup: the route calls importcontroller#import with HTTP GET, and the code is as follows:
public function import()
{
ini_set('memory_limit', '1024M');
$start = $this->microtime_float();
Excel::import(new myImport, 'myfile.xls' , null, \Maatwebsite\Excel\Excel::XLS);
$end = $this->microtime_float();
$t = $end - $start;
return "Time: $t";
}
The class uses certain concerns as follows:
class myImport implements ToModel, WithBatchInserts, WithChunkReading, WithStartRow

rampUsers method is getting stuck in Gatling 3.3

I am having issues using the rampUsers() method in my Gatling script. The run gets stuck after the following entry, which shows it has passed the halfway point.
Version : 3.3
================================================================================
2019-12-18 09:51:44 45s elapsed
---- Requests ------------------------------------------------------------------
> Global (OK=2 KO=0 )
> graphql / request_0 (OK=1 KO=0 )
> rest / request_0 (OK=1 KO=0 )
---- xxxSimulation ---------------------------------------------------
[##################################### ] 50%
waiting: 1 / active: 0 / done: 1
================================================================================
I am seeing the following in the log, which gets repeated forever while the log keeps growing:
09:35:46.495 [GatlingSystem-akka.actor.default-dispatcher-2] DEBUG io.gatling.core.controller.inject.open.OpenWorkload - Injecting 0 users in scenario xxSimulation, continue=true
09:35:47.494 [GatlingSystem-akka.actor.default-dispatcher-6] DEBUG io.gatling.core.controller.inject.open.OpenWorkload - Injecting 0 users in scenario xxSimulation, continue=true
The above issue happens only with rampUsers() and not with:
atOnceUsers()
rampUsersPerSec()
rampConcurrentUsers()
constantConcurrentUsers()
constantUsersPerSec()
incrementUsersPerSec()
Is there a way to mimic rampUsers() some other way, or is there a solution for this?
My code is very minimal:
setUp(
scenarioBuilder.inject(
rampUsers(2).during(1 minutes)
)
).protocols(protocolBuilder)
I have been stuck on this for some time; my earlier post with more information can be found here.
Can any of the Gatling experts help me with this?
Thanks for looking into it.
It seems you have slightly incorrect syntax for rampUsers: you should try removing the . before during.
I have this code in my own script and it works fine:
setUp(userScenario.inject(
// atOnceUsers(4),
rampUsers(24) during (1 seconds))
).protocols(httpProtocol)
Also, the example in the Gatling documentation (open model) is written without the dot:
setUp(
scn.inject(
nothingFor(4 seconds), // 1
atOnceUsers(10), // 2
rampUsers(10) during (5 seconds), // HERE
constantUsersPerSec(20) during (15 seconds), // 4
constantUsersPerSec(20) during (15 seconds) randomized, // 5
rampUsersPerSec(10) to 20 during (10 minutes), // 6
rampUsersPerSec(10) to 20 during (10 minutes) randomized, // 7
heavisideUsers(1000) during (20 seconds) // 8
).protocols(httpProtocol)
)
My guess is that the syntax can't be parsed, so 0 is substituted instead. (Here is an example of rounding; not applicable, but for reference: gatling-user-injection-constantuserspersec.)
Also, you mentioned that the other methods work; could you paste that working code as well?

Redis Node - Querying a list of 250k items of ~15 bytes takes at least 10 seconds

I'd like to query a whole list of 250k items of ~15 bytes each.
Each item (some coordinates) is a 15-byte string like this: xxxxxx_xxxxxx_xxxxxx.
I'm storing them using this function:
function setLocation({id, lat, lng}) {
const str = `${id}_${lat}_${lng}`
client.lpush('locations', str, (err, status) => {
console.log('pushed:', status)
})
}
Using Node.js, doing an lrange('locations', 0, -1) takes between 10 and 15 seconds.
Slowlog from Redis Labs:
I tried to use sets, same results.
According to this post, this shouldn't take more than a few milliseconds.
What am I doing wrong here?
Update:
I'm using an instance on Redis Labs.

NodeJs scheduling jobs on multiple nodes

I have two Node.js servers running behind a load balancer. I have some scheduled jobs that I want to run only once, on either of the two instances, in a distributed manner.
Which module should I use? Will node-quartz (https://www.npmjs.com/package/node-quartz) be useful for this?
Adding Redis and using node-redlock seemed like overkill for the little caching job I needed to schedule once a day on a single server with three Node.js processes behind a load balancer.
I discovered http://kvz.io/blog/2012/12/31/lock-your-cronjobs/, which led me to the concept behind Tim Kay's solo.
The concept goes like this: instead of locking on an object (which only works within a single process) or using a distributed lock (needed for multiple servers), you "lock" by listening on a port. All the processes on the server share the same ports, and if the process fails, it will (of course) release the port.
Note that hard-failing (no catch anywhere surrounding) or releasing the lock in catch are both OK, but neglecting to release the lock when catching exceptions around the critical section will mean that the scheduled job never executes until the locking process gets recycled for some other reason.
I'll update when I've tried to implement this.
Edit
Here's my working example of locking on a port:
multiProc.js
var net = require('net');
var server = net.createServer();
server.on('error', function () { console.log('I am process number two!'); });
server.listen({ port: 3000 }, function () {
  console.log('I am process number one!');
  setTimeout(function () { server.close(); }, 3000);
});
If I run this twice within 3 seconds, here's the output from the first and second instances
first
I am process number one!
second
I am process number two!
If, on the other hand, more than 3 seconds pass between executing the two instances, both claim to be process number one.
I haven't done this before but I can see myself doing it this way.
Using any scheduler library for Node.js.
In order to achieve your goal, I would use Redis for a distributed lock. Before running any scheduled job, a worker/node has to acquire the lock, do the job, and release/ack() it when the job finishes (or on error).
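To illustrate the idea, here is a minimal sketch of such a lock built on a plain Redis SET ... NX PX, using the node_redis client. The lock key, TTL and job body are placeholders, and a production setup would release the lock atomically (e.g. with a Lua script) or simply use node-redlock as mentioned above:
// A sketch of a Redis-based lock: the first instance to set the key runs the job.
var redis = require('redis');
var crypto = require('crypto');

var client = redis.createClient();          // connection details omitted for brevity
var LOCK_KEY = 'locks:nightly-job';         // hypothetical job name
var LOCK_TTL_MS = 10 * 60 * 1000;           // lock auto-expires if this process crashes

function runJobWithLock(job) {
  var token = crypto.randomBytes(16).toString('hex'); // identifies our lock
  // SET key token NX PX ttl -> succeeds only if the key does not already exist
  client.set(LOCK_KEY, token, 'NX', 'PX', LOCK_TTL_MS, function (err, reply) {
    if (err || reply !== 'OK') {
      return; // another instance holds the lock, skip this run
    }
    Promise.resolve()
      .then(job)
      .catch(function (e) { console.error('job failed', e); })
      .then(function () {
        // Release only if we still own the lock (simplified; a Lua script would make this atomic).
        client.get(LOCK_KEY, function (getErr, value) {
          if (!getErr && value === token) { client.del(LOCK_KEY); }
        });
      });
  });
}

runJobWithLock(function () { /* the actual scheduled work goes here */ });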
A single server can be selected as leader by conducting an election among the available instances using the Zoologist package:
https://www.npmjs.com/package/zoologist
This requires a ZooKeeper server to conduct the election.
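For reference, the underlying ephemeral-sequential election recipe looks roughly like this; a minimal, non-authoritative sketch using the node-zookeeper-client package, with a hypothetical /scheduler-election path (libraries such as Zoologist expose this kind of logic through a higher-level API):
// A rough sketch of ZooKeeper's standard leader-election recipe.
var zookeeper = require('node-zookeeper-client');

var client = zookeeper.createClient('localhost:2181');
var ELECTION_PATH = '/scheduler-election';   // hypothetical path

client.once('connected', function () {
  client.mkdirp(ELECTION_PATH, function (err) {
    if (err) { throw err; }
    // Every instance registers an ephemeral, sequential znode under the election path.
    client.create(
      ELECTION_PATH + '/candidate-',
      Buffer.from('candidate'),
      zookeeper.CreateMode.EPHEMERAL_SEQUENTIAL,
      function (err, myPath) {
        if (err) { throw err; }
        client.getChildren(ELECTION_PATH, function (err, children) {
          if (err) { throw err; }
          // The instance holding the lowest sequence number is the leader;
          // only the leader runs the scheduled job.
          // (A full implementation would also watch the next-lower node to take over if the leader dies.)
          children.sort();
          var myNode = myPath.substring(myPath.lastIndexOf('/') + 1);
          if (children[0] === myNode) {
            console.log('I am the leader, running the scheduled job');
          } else {
            console.log('Not the leader, standing by');
          }
        });
      }
    );
  });
});

client.connect();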
I don't know if this might help you, but still posting it here.
Usually node-schedule is used for time-based schedules where you have to execute arbitrary code only once, e.g. a database read/write at 6:00 PM next month.
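For example, a one-off, date-based job with node-schedule looks roughly like this (a minimal sketch; the date and the job body are placeholders):
var schedule = require('node-schedule');

// Run some arbitrary code once, at a specific date and time.
var runAt = new Date(2020, 0, 1, 18, 0, 0); // e.g. 6:00 PM on 1 Jan 2020 (months are 0-based)
var job = schedule.scheduleJob(runAt, function () {
  console.log('Running the one-off job');
  // ...do the database read/write here
});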
The following explains how to write scheduled jobs that perform a certain action at a particular time/day.
To perform that task we are going to use the cron package for Node.
To add a job we need to:
1) Install Cron
npm install cron
2) Require cron's CronJob in our project.
var CronJob = require('cron').CronJob
3) Create an instance of CronJob
var jobs = new CronJob({
cronTime: ' * * * * * *',
onTick: function () {
//perform Your action
},
start: false,
timeZone: 'Asia/Kolkata'
});
Arguments
cronTime: it takes 6 fields, namely:
1) Second -> 0-59
2) Minute -> 0-59
3) Hour -> 0-23
4) Day of Month -> 1-31
5) Months -> 0-11
6) Day of Week -> 0-6
Note: we can also define cronTime with ranges and wildcards, e.g. * for always, or 0-59/5 for every 5th minute.
onTick: The operation to perform.
start: takes a boolean; if true, the job starts immediately.
timeZone: the job's time zone.
4) To start the job:
jobs.start()
For example:
var jobs = new CronJob({
cronTime: ' 00 00 0-23 * * *',
onTick: function () {
printMyName();
},
start: false,
timeZone: 'Asia/Kolkata'
});
jobs.start();
var printMyName = function () {
var date = new Date();
console.log("Hi Vipul it is ", date);
};
Hope it helps.

Spring Integration task executor queue filled with more records

I started to build a Spring Integration app in which the input gateway generates a fixed number (50) of records and then stops generating new ones. There are basic filters/routers/transformers in the middle, and the final service activator and task executor are configured as follows:
<int:service-activator input-channel="inChannel" output-channel="outChannel" ref="svcProcessor">
<int:poller fixed-rate="100" task-executor="myTaskExecutor"/>
</int:service-activator>
<task:executor id = "myTaskExecutor" pool-size="5" queue-capacity="100"/>
I tried to put some debug info at the beginning of the svcProcessor method:
@Qualifier(value="myTaskExecutor")
@Autowired
ThreadPoolTaskExecutor executor;
@ServiceActivator
public Order processOrder(Order order) {
log.debug("---- " + "executor size: " + executor.getActiveCount() +
" q: " + executor.getThreadPoolExecutor().getQueue().size() +
" r: " + executor.getThreadPoolExecutor().getQueue().remainingCapacity()+
" done: " + executor.getThreadPoolExecutor().getCompletedTaskCount() +
" task: " + executor.getThreadPoolExecutor().getTaskCount()
);
//
//process order takes up to 5 seconds.
//
return order;
}
After the program runs for a while, the log shows the queue has grown past 50, and eventually it gets a rejection exception:
23:38:31.096 DEBUG [myTaskExecutor-2] ---- executor size: 5 q: 44 r: 56 done: 11 task: 60
23:38:31.870 DEBUG [myTaskExecutor-5] ---- executor size: 5 q: 51 r: 49 done: 11 task: 67
23:38:33.600 DEBUG [myTaskExecutor-4] ---- executor size: 5 q: 69 r: 31 done: 11 task: 85
23:32:46.792 DEBUG [myTaskExecutor-1] ---- executor size: 5 q: 72 r: 28 done: 11 task: 88
It looks like the active count and the sum of queue size/remaining capacity are consistent with the configured pool size of 5 and capacity of 100, but I am not clear why there are more than 50 records in the queue, and why the task count is also larger than the limit of 50.
Am I looking at the wrong info from the executor and the queue?
Thanks
UPDATE:
(not sure if I should open another question)
I tried the XML version of the cafeDemo from spring-integration (branch SI3.0.x) and used the pool provided in the document, but with a 100-millisecond rate and an added queue capacity:
<int:service-activator input-channel="hotDrinks" ref="barista" method="prepareHotDrink" output-channel="preparedDrinks">
<int:poller task-executor="pool" fixed-rate="100"/>
</int:service-activator>
<task:executor id="pool" pool-size="5" queue-capacity="200"/>
After I ran it, it also got a rejection exception after around the 20th delivery:
org.springframework.core.task.TaskRejectedException: Executor [java.util.concurrent.ThreadPoolExecutor#6c31732b[Running, pool size = 5, active threads = 5, queued tasks = 200, completed tasks = 0]]
Only about 32 orders were placed before the exception, so I am not sure why queued tasks = 200 and completed tasks = 0.
THANKS
getTaskCount(): this method gives the total number of tasks assigned to the executor since it started, so it will increase over time.
The other counters are approximate numbers, not exact, as per the Java documentation:
getCompletedTaskCount()
Returns the approximate total number of tasks that have completed execution.
public int getActiveCount()
Returns the approximate number of threads that are actively executing tasks.
Ideally getTaskCount() and getCompletedTaskCount() will increase linearly with time, since they include all tasks assigned since the start of execution of your code. However, activeCount should be less than 50, but being an approximate number it will sometimes go beyond 50 by a small margin.
Refer to:
https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html
