Be notified when the poolsize is reached - node.js

In order to improve the pool design of an application, I would like to be notified (ideally with an event) when the pool size of an application is reached. This way, I can add a log, if this log occurs too often I will increase the pool size.
With a mongo client initialized this way :
const client = new MongoClient(url, {
poolSize: 10,
});
Is there a way to be notified when the 10 connections are reached within my application ?

Use connection pool events. These should be implemented by all recent MongoDB drivers.
Node documentation
Documentation/example in Ruby
For your question you would track the pool size using ConnectionCheckOut*/ConnectionCheckedIn events if your driver does not expose the pool size or the pool at all directly.

Related

Efficient OkHttp configuration for Multithreaded environment

What is the best configuration I can use to set up the OkHttp3 client correctly in a multi threaded environment? Had 2 main questions:
Connection pool - How do we define the number of available connections in the pool? Can it be scaled at runtime? The number of concurrent users will be very high and need to make sure users aren't waiting a long time for the connection to be available from the pool.
I read the OkHttp might end up doing multiple retries in case of failures or timeouts. Is it possible to only enable this for only the "Gets" and not "Post" while using just 1 OkHttp client?
Also Anything else I should be considering?
Here is my starting code for the client.
private static final int timeout = 15000;
private static final OkHttpClient okClient = new OkHttpClient()
.newBuilder()
.connectTimeout(timeout, TimeUnit.MILLISECONDS)
.readTimeout(timeout, TimeUnit.MILLISECONDS)
.writeTimeout(timeout, TimeUnit.MILLISECONDS)
.retryOnConnectionFailure(false)
.addInterceptor(new HttpLoggingInterceptor().setLevel(HttpLoggingInterceptor.Level.BASIC))
.build();
You can configure the connection pool then pass into the client builder.
https://square.github.io/okhttp/3.x/okhttp/okhttp3/ConnectionPool.html
See Connection Pool - OkHttp for an example.
For the second question, you can disable automatic retries and do this in your application code instead. Use retryOnConnectionFailure(false) as you show above.
To have this applied differently for get and posts you should use customise one client like the following
val postClient = client.newBuilder().retryOnConnectionFailure(false).build()

Meteor MongoDB subscription delivering data in 10 second intervals instead of live

I believe this is more of a MongoDB question than a Meteor question, so don't get scared if you know a lot about mongo but nothing about meteor.
Running Meteor in development mode, but connecting it to an external Mongo instance instead of using Meteor's bundled one, results in the same problem. This leads me to believe this is a Mongo problem, not a Meteor problem.
The actual problem
I have a meteor project which continuosly gets data added to the database, and displays them live in the application. It works perfectly in development mode, but has strange behaviour when built and deployed to production. It works as follows:
A tiny script running separately collects broadcast UDP packages and shoves them into a mongo collection
The Meteor application then publishes a subset of this collection so the client can use it
The client subscribes and live-updates its view
The problem here is that the subscription appears to only get data about every 10 seconds, while these UDP packages arrive and gets shoved into the database several times per second. This makes the application behave weird
It is most noticeable on the collection of UDP messages, but not limited to it. It happens with every collection which is subscribed to, even those not populated by the external script
Querying the database directly, either through the mongo shell or through the application, shows that the documents are indeed added and updated as they are supposed to. The publication just fails to notice and appears to default to querying on a 10 second interval
Meteor uses oplog tailing on the MongoDB to find out when documents are added/updated/removed and update the publications based on this
Anyone with a bit more Mongo experience than me who might have a clue about what the problem is?
For reference, this is the dead simple publication function
/**
* Publishes a custom part of the collection. See {#link https://docs.meteor.com/api/collections.html#Mongo-Collection-find} for args
*
* #returns {Mongo.Cursor} A cursor to the collection
*
* #private
*/
function custom(selector = {}, options = {}) {
return udps.find(selector, options);
}
and the code subscribing to it:
Tracker.autorun(() => {
// Params for the subscription
const selector = {
"receivedOn.port": port
};
const options = {
limit,
sort: {"receivedOn.date": -1},
fields: {
"receivedOn.port": 1,
"receivedOn.date": 1
}
};
// Make the subscription
const subscription = Meteor.subscribe("udps", selector, options);
// Get the messages
const messages = udps.find(selector, options).fetch();
doStuffWith(messages); // Not actual code. Just for demonstration
});
Versions:
Development:
node 8.9.3
mongo 3.2.15
Production:
node 8.6.0
mongo 3.4.10
Meteor use two modes of operation to provide real time on top of mongodb that doesn’t have any built-in real time features. poll-and-diff and oplog-tailing
1 - Oplog-tailing
It works by reading the mongo database’s replication log that it uses to synchronize secondary databases (the ‘oplog’). This allows Meteor to deliver realtime updates across multiple hosts and scale horizontally.
It's more complicated, and provides real-time updates across multiple servers.
2 - Poll and diff
The poll-and-diff driver works by repeatedly running your query (polling) and computing the difference between new and old results (diffing). The server will re-run the query every time another client on the same server does a write that could affect the results. It will also re-run periodically to pick up changes from other servers or external processes modifying the database. Thus poll-and-diff can deliver realtime results for clients connected to the same server, but it introduces noticeable lag for external writes.
(the default is 10 seconds, and this is what you are experiencing , see attached image also ).
This may or may not be detrimental to the application UX, depending on the application (eg, bad for chat, fine for todos).
This approach is simple and and delivers easy to understand scaling characteristics. However, it does not scale well with lots of users and lots of data. Because each change causes all results to be refetched, CPU time and network bandwidth scale O(N²) with users. Meteor automatically de-duplicates identical queries, though, so if each user does the same query the results can be shared.
You can tune poll-and-diff by changing values of pollingIntervalMs and pollingThrottleMs.
You have to use disableOplog: true option to opt-out of oplog tailing on a per query basis.
Meteor.publish("udpsPub", function (selector) {
return udps.find(selector, {
disableOplog: true,
pollingThrottleMs: 10000,
pollingIntervalMs: 10000
});
});
Additional links:
https://medium.baqend.com/real-time-databases-explained-why-meteor-rethinkdb-parse-and-firebase-dont-scale-822ff87d2f87
https://blog.meteor.com/tuning-meteor-mongo-livedata-for-scalability-13fe9deb8908
How to use pollingThrottle and pollingInterval?
It's a DDP (Websocket ) heartbeat configuration.
Meteor real time communication and live updates is performed using DDP ( JSON based protocol which Meteor had implemented on top of SockJS ).
Client and server where it can change data and react to its changes.
DDP (Websocket) protocol implements so called PING/PONG messages (Heartbeats) to keep Websockets alive. The server sends a PING message to the client through the Websocket, which then replies with PONG.
By default heartbeatInterval is configure at little more than 17 seconds (17500 milliseconds).
Check here: https://github.com/meteor/meteor/blob/d6f0fdfb35989462dcc66b607aa00579fba387f6/packages/ddp-client/common/livedata_connection.js#L54
You can configure heartbeat time in milliseconds on server by using:
Meteor.server.options.heartbeatInterval = 30000;
Meteor.server.options.heartbeatTimeout = 30000;
Other Link:
https://github.com/meteor/meteor/blob/0963bda60ea5495790f8970cd520314fd9fcee05/packages/ddp/DDP.md#heartbeats

scala: apache httpclient in multi-threaded environment

I am writing a singleton class (Object in scala) which uses apache httpclient(4.5.2) to post some file content and return status to caller.
object HttpUtils{
protected val retryHandler = new HttpRequestRetryHandler() {
def retryRequest(exception: IOException, executionCount: Int, context: HttpContext): Boolean = {
//retry logic
true
}
}
private val connectionManager = new PoolingHttpClientConnectionManager()
// Reusing same client for each request that might be coming from different threads .
// Is it correct ????
val httpClient = HttpClients.custom()
.setConnectionManager(connectionManager)
.setRetryHandler(retryHandler)
.build()
def restApiCall (url : String, rDD: RDD[SomeMessage]) : Boolean = {
// Creating new context for each request
val httpContext: HttpClientContext = HttpClientContext.create
val post = new HttpPost(url)
// convert RDD to text file using rDD.collect
// add this file as MultipartEntity to post
var response = None: Option[CloseableHttpResponse] // Is it correct way of using it ?
try {
response = Some(httpClient.execute(post, httpContext))
val responseCode = response.get.getStatusLine.getStatusCode
EntityUtils.consume(response.get.getEntity) // Is it require ???
if (responseCode == 200) true
else false
}
finally {
if (response.isDefined) response.get.close
post.releaseConnection() // Is it require ???
}
}
def onShutDown = {
connectionManager.close()
httpClient.close()
}
}
Multiple threads (More specifically from spark streaming context) are calling restApiCall method. I am relatively new to scala and apache httpClient. I have to make frequent connections to only few fixed server (i.e. 5-6 fixed URL's with different request parameters).
I went through multiple online resource but still not confident about it.
Is it the best way to use http client in multi-threaded environment?
Is it possible to keep live connections and use it for various requests ? Will it be beneficial in this case ?
Am i using/releasing all resources efficiently ? If not please suggest.
Is it good to use it in Scala or there exist some better library ?
Thanks in advance.
It seems the official docs have answers to all your questions:
2.3.3. Pooling connection manager
PoolingHttpClientConnectionManager is a more complex implementation
that manages a pool of client connections and is able to service
connection requests from multiple execution threads. Connections are
pooled on a per route basis. A request for a route for which the
manager already has a persistent connection available in the pool will
be serviced by leasing a connection from the pool rather than creating
a brand new connection.
PoolingHttpClientConnectionManager maintains a maximum limit of
connections on a per route basis and in total. Per default this
implementation will create no more than 2 concurrent connections per
given route and no more 20 connections in total. For many real-world
applications these limits may prove too constraining, especially if
they use HTTP as a transport protocol for their services.
2.4. Multithreaded request execution
When equipped with a pooling connection manager such as
PoolingClientConnectionManager, HttpClient can be used to execute
multiple requests simultaneously using multiple threads of execution.
The PoolingClientConnectionManager will allocate connections based on
its configuration. If all connections for a given route have already
been leased, a request for a connection will block until a connection
is released back to the pool. One can ensure the connection manager
does not block indefinitely in the connection request operation by
setting 'http.conn-manager.timeout' to a positive value. If the
connection request cannot be serviced within the given time period
ConnectionPoolTimeoutException will be thrown.

How to use database connections pool in Sequelize.js

I need some clarification about what the pool is and what it does. The docs say Sequelize will setup a connection pool on initialization so you should ideally only ever create one instance per database.
var sequelize = new Sequelize('database', 'username', 'password', {
host: 'localhost',
dialect: 'mysql'|'mariadb'|'sqlite'|'postgres'|'mssql',
pool: {
max: 5,
min: 0,
idle: 10000
},
// SQLite only
storage: 'path/to/database.sqlite'
});
When your application needs to retrieve data from the database, it creates a database connection. Creating this connection involves some overhead of time and machine resources for both your application and the database. Many database libraries and ORM's will try to reuse connections when possible, so that they do not incur the overhead of establishing that DB connection over and over again. The pool is the collection of these saved, reusable connections that, in your case, Sequelize pulls from. Your configuration of
pool: {
max: 5,
min: 0,
idle: 10000
}
reflects that your pool should:
Never have more than five open connections (max: 5)
At a minimum, have zero open connections/maintain no minimum number of connections (min: 0)
Remove a connection from the pool after the connection has been idle (not been used) for 10 seconds (idle: 10000)
tl;dr: Pools are a good thing that help with database and overall application performance, but if you are too aggressive with your pool configuration you may impact that overall performance negatively.
pool is draining error
I found this thread in my search for a Sequalize error was giving my node.js app: pool is draining. I could not for the life of me figure it out. So for those who follow in my footsteps:
The issue was that I was closing the database earlier than I thought I was, with the command sequelize.closeConnections(). For some reason, instead of an error like 'the database has been closed`, it was instead giving the obscure error 'pool is draining'.
Seems that you can try to put pool to false to avoid having the pool bing created. Here is the API details table :http://sequelize.readthedocs.org/en/latest/api/sequelize/
[options.pool={}]
Object
Should sequelize use a connection pool.
Default is true

Pusher Account over quota

We use Puhser in our application in order to have real-time updates.
Something very stange happens - while google analytics says that we have around 200 simultaneous connections, Pusher says that we have 1500.
I would like to monitor Pusher connections in real-time but could not find any method to do so. Somebody can help??
Currently there's no way to get realtime stats on the number of connections you currently have open for your app. However, it is something that we're investigating currently.
In terms of why the numbers vary between Pusher and Google Analytics, it's usually down to the fact that Google Analytics uses different methods of tracking whether or not a user is on the site. We're confident that our connection counting is correct, however, that's not to say that there isn't a potentially unexpected reason for your count to be high.
A connection is counted as a WebSocket connection to Pusher. When using the Pusher JavaScript library a new WebSocket connection is created when you create a new Pusher instance.
var pusher = new Pusher('APP_KEY');
Channel subscriptions are created over the existing WebSocket connection (known as multiplexing), and do not count towards your connection quota (there is no limit on the number allowed per connection).
var channel1 = pusher.subscribe('ch1');
var channel2 = pusher.subscribe('ch2');
// All done over as single connection
// more subscriptions
// ...
var channel 100 = pusher.subscribe('ch100');
// Still just a 1 connection
Common reasons why connections are higher than expected
Users open multiple tabs
If a user has multiple tabs open to the same application, multiple instances of Pusher will be created and therefore multiple connections will be used e.g. 2 tabs open will mean 2 connections are established.
Incorrectly coded applications
As mentioned above, a new connection is created every time a new Pusher object is instantiated. It is therefore possible to create many connections in the same page.
Using an older version of one our libraries
Our connection strategies have improved over time, and we recommend that you keep up to date with the latest versions.
Specifically, in newer versions of our JS library, we carry out ping-pong requests between server and client to verify that the client is still around.
Other remedies
While our efforts are always to keep a connection going indefinitely to an application, it is possible to disconnect manually if you feel this works in your scenario. It can be achieved by making a call to Pusher.disconnect(). Below is some example code:
var pusher = new Pusher("APP_KEY");
var timeoutId = null;
function startInactivityCheck() {
timeoutId = window.setTimeout(function(){
pusher.disconnect();
}, 5 * 60 * 1000); // called after 5 minutes
};
// called by something that detects user activity
function userActivityDetected(){
if(timeoutId !== null) {
window.clearTimeout(timeoutId);
}
startInactivityCheck();
};
How this disconnection is transmitted to the user is up to you but you may consider prompting them to let them know that they will not receive any further real-time updates due to a long period of inactivity. If they wish to start receiving real-time updates again they should click a button.

Resources