Huge performance gap between Node.js and PostgreSQL benchmarks

I am creating an application with Node.js, using Hapi as the web framework and Knex as the SQL query builder. The primary code is the following:
server.route({
  method: 'POST',
  path: '/location',
  config: {
    tags: ['api'],
    validate: {
      payload: {
        longitude: Joi.string().regex(/^\d+\.\d+$/).required(),
        latitude: Joi.string().regex(/^\d+\.\d+$/).required(),
        date: Joi.string().regex(/^\d{4}-\d{1,2}-\d{1,2}\s\d{1,2}:\d{1,2}.+$/).required(),
        phone: Joi.string().regex(/^1\d{10}$/).required()
      }
    }
  },
  handler: createLocation
})
async function createLocation(request, reply) {
  try {
    const data = await knex('locations').insert(request.payload)
    reply(data)
  } catch (error) {
    reply(error)
  }
}
It simply inserts some data into PostgreSQL. I am using wrk to benchmark its concurrent throughput on Google Compute Engine (the cheapest machine type). Result:
$ wrk -c 100 -t 12 http://localhost/api/location -s wrk.lua
Running 10s test @ http://panpan.tuols.com/api/location
12 threads and 100 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 542.22ms 102.93ms 1.02s 88.32%
Req/Sec 21.95 18.74 70.00 78.71%
1730 requests in 10.02s, 0.94MB read
Requests/sec: 172.65
Transfer/sec: 96.44KB
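As a sanity check on these numbers (my own arithmetic, not part of the original post): with 100 open connections and ~542 ms average latency, Little's law already caps throughput near the measured value:

```javascript
// Little's law: throughput ≈ concurrency / average latency.
const connections = 100;      // from wrk -c 100
const avgLatencySec = 0.542;  // 542.22 ms average latency reported by wrk
const estimatedRps = connections / avgLatencySec;
console.log(estimatedRps.toFixed(1)); // prints 184.5, close to the measured 172.65
```

So the server is latency-bound: what needs explaining is why each request holds a connection for about half a second, not the raw requests-per-second figure.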
Then I used pgbench to test PostgreSQL insert performance:
$ pgbench location -U postgres -h localhost -r -t 1000 -f index.sql -c 10
transaction type: index.sql
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
number of transactions per client: 1000
number of transactions actually processed: 10000/10000
latency average = 1.663 ms
tps = 6014.610692 (including connections establishing)
tps = 6029.973067 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
1.595 INSERT INTO "public"."locations"("phone", "longitude", "latitude", "date", "createdAt", "updatedAt") VALUES('18382383428', '123,33', '123.33', 'now()', 'now()', 'now()') RETURNING "id", "phone", "longitude", "latitude", "date", "createdAt", "updatedAt";
The Node.js service does 172.65 req/s, while PostgreSQL natively does about 6000 req/s. They are doing essentially the same thing, and even ignoring the HTTP overhead the difference should not be this big. Why is the performance so hugely different? Is it a Node.js problem or a node-pg package problem?

Related

Artillery returns 404 for target URL

I have an Artillery config set up to test two different versions of my API gateway (an Apollo federated GraphQL gateway). The v2 gateway URL returns a 404 with Artillery, but it is accessible from the browser and from Postman. I am quite confused as to why this is the case. What could be the issue here? The v1 URL of the gateway is accessible by Artillery, but when it hits the v2 URL, it returns a 404. Here is my config file:
config:
  environments:
    v2:
      target: "https://v2.com/graphql" # not accessible by artillery but works with postman and browser
      phases:
        - duration: 60
          arrivalRate: 5
          name: Warm up
        - duration: 120
          arrivalRate: 5
          rampTo: 50
          name: Ramp up load
        - duration: 300
          arrivalRate: 50
          name: Sustained load
    v1:
      target: "https://v1.com/graphql" # accessible by artillery and other agents
      phases:
        - duration: 60
          arrivalRate: 5
          name: Warm up
        - duration: 120
          arrivalRate: 5
          rampTo: 50
          name: Ramp up load
  payload:
    - path: "test-user.csv"
      skipHeader: true
      fields:
        - "email"
        - "_id"
    - path: "test-campaign.csv"
      skipHeader: true
      fields:
        - "campaignId"
scenarios:
  - name: "Test"
    flow:
      - post:
          url: "/"
          headers:
            Authorization: Bearer <token>
          json:
            query: |
              query Query {
                randomQuery {
                  ... on Error {
                    message
                    statusCode
                  }
                  ... on Response {
                    res {
                      __typename
                    }
                  }
                }
              }
Any help would be appreciated. Thank you.
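One thing worth ruling out (my suggestion, not from the original post): gateways behind a CDN or WAF sometimes return 404 to clients that lack browser-like headers, which would explain why Postman and the browser succeed where Artillery fails. Artillery can send default headers on every request; a sketch (verify the exact key for your Artillery version):

```yaml
config:
  defaults:
    headers:
      User-Agent: "Mozilla/5.0 (load test)"
      Content-Type: "application/json"
```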

Get the readiness probe logs

Good afternoon. I have a cluster in AKS with a microservice whose readiness probe queries 3 services that it depends on to function.
readinessProbe:
  failureThreshold: 3
  httpGet:
    path: /readiness
    port: http
    scheme: HTTP
  initialDelaySeconds: 3
  periodSeconds: 3
  successThreshold: 1
  timeoutSeconds: 1
The probe endpoint returns:
{"podname": "podname", "healths": [{"name": "redis", "status": "ok"}, {"name": "pushPublisher", "status": "ok"}, {"name": "postgres", "status": "ok"}]}
Sometimes the microservice is taken out of service (running 0/1) because one of the services checked by the readiness probe fails, and from the events I cannot tell which service failed or exceeded the timeout. Setting a higher value for timeoutSeconds would not be a solution, because we do not know the origin of the problem.
Is there a way to view the readiness probe logs, or to redirect them somewhere beyond the event information? In events I only have the following:
http-get http://:http/readiness delay=3s timeout=1s period=3s #success=1 #failure=3
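Since kubelet events only record the probe's HTTP outcome, one common workaround is to log each dependency's status from inside the readiness handler itself, so that `kubectl logs` shows which check failed. A minimal sketch (function and field names are illustrative, matching the JSON shape above):

```javascript
// Hypothetical readiness handler helper: given per-dependency health
// checks, log every failing one (these lines appear in `kubectl logs`,
// unlike probe events) and compute the HTTP status to return.
function readinessResponse(podname, healths) {
  const failing = healths.filter((h) => h.status !== 'ok');
  for (const h of failing) {
    console.error(`readiness: dependency ${h.name} failed with status "${h.status}"`);
  }
  return {
    statusCode: failing.length === 0 ? 200 : 503,
    body: { podname, healths },
  };
}
```

Returning 503 keeps the probe semantics, while the log line pinpoints which of redis, pushPublisher, or postgres tripped the probe.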

"[circuit_breaking_exception] [parent] Data too large, data for [<http_request>] would be [...]" error

After working smoothly for more than 10 months, I suddenly started getting this error in production while doing simple search queries.
{
  "error" : {
    "root_cause" : [
      {
        "type" : "circuit_breaking_exception",
        "reason" : "[parent] Data too large, data for [<http_request>] would be [745522124/710.9mb], which is larger than the limit of [745517875/710.9mb]",
        "bytes_wanted" : 745522124,
        "bytes_limit" : 745517875
      }
    ],
    "type" : "circuit_breaking_exception",
    "reason" : "[parent] Data too large, data for [<http_request>] would be [745522124/710.9mb], which is larger than the limit of [745517875/710.9mb]",
    "bytes_wanted" : 745522124,
    "bytes_limit" : 745517875
  },
  "status" : 503
}
Initially I was getting this error while doing simple term queries. To debug it I tried a _cat/health query on the Elasticsearch cluster, but I still got the same error; even the simplest query, against localhost:9200, gives the same error. I'm not sure what happened to the cluster so suddenly.
Here is my circuit breaker status:
"breakers" : {
  "request" : {
    "limit_size_in_bytes" : 639015321,
    "limit_size" : "609.4mb",
    "estimated_size_in_bytes" : 0,
    "estimated_size" : "0b",
    "overhead" : 1.0,
    "tripped" : 0
  },
  "fielddata" : {
    "limit_size_in_bytes" : 639015321,
    "limit_size" : "609.4mb",
    "estimated_size_in_bytes" : 406826332,
    "estimated_size" : "387.9mb",
    "overhead" : 1.03,
    "tripped" : 0
  },
  "in_flight_requests" : {
    "limit_size_in_bytes" : 1065025536,
    "limit_size" : "1015.6mb",
    "estimated_size_in_bytes" : 560,
    "estimated_size" : "560b",
    "overhead" : 1.0,
    "tripped" : 0
  },
  "accounting" : {
    "limit_size_in_bytes" : 1065025536,
    "limit_size" : "1015.6mb",
    "estimated_size_in_bytes" : 146387859,
    "estimated_size" : "139.6mb",
    "overhead" : 1.0,
    "tripped" : 0
  },
  "parent" : {
    "limit_size_in_bytes" : 745517875,
    "limit_size" : "710.9mb",
    "estimated_size_in_bytes" : 553214751,
    "estimated_size" : "527.5mb",
    "overhead" : 1.0,
    "tripped" : 0
  }
}
I found a similar GitHub issue that suggests increasing the circuit breaker memory or disabling it altogether, but I am not sure which to choose. Please help!
Elasticsearch version: 6.3
After some more research I finally found a solution:
We should not disable the circuit breaker, as that might result in an OOM error and eventually crash Elasticsearch.
Dynamically increasing the circuit breaker memory percentage works, but it is only a temporary fix, because the increased limit can eventually fill up as well.
Finally, there is a third option: increase the overall JVM heap size, which is 1 GB by default. In production the recommendation is to size it up to around 30-32 GB at most, and it should be less than 50% of the total available memory.
For more on good JVM memory configuration of Elasticsearch in production, see "Heap: Sizing and Swapping" in the Elasticsearch docs.
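For example, the heap can be raised in the jvm.options file (the 16 GB figure here is illustrative; Xms and Xmx should always match):

```
-Xms16g
-Xmx16g
```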
In my case I have an index with large documents; each document is ~30 KB and has more than 130 fields (nested objects, arrays, dates, and IDs), and I was searching all fields using this DSL query:
query_string: {
  query: term,
  analyze_wildcard: true,
  fields: ['*'], // search all fields
  fuzziness: 'AUTO'
}
Full-text searches are expensive, and searching through multiple fields at once is even more so (expensive in terms of computing power, not storage).
Therefore:
The more fields a query_string or multi_match query targets, the slower it is. A common technique to improve search speed over multiple fields is to copy their values into a single field at index time, and then use this field at search time.
Please refer to the Elasticsearch docs, which recommend searching as few fields as possible with the help of the copy_to directive.
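A minimal mapping sketch of that copy_to approach (index and field names here are illustrative, not from the original post):

```json
PUT my-index
{
  "mappings": {
    "properties": {
      "title":        { "type": "text", "copy_to": "search_field" },
      "description":  { "type": "text", "copy_to": "search_field" },
      "search_field": { "type": "text" }
    }
  }
}
```

At search time the query then targets only search_field instead of all fields.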
After I changed my query to search one field:
query_string: {
  query: term,
  analyze_wildcard: true,
  fields: ['search_field'] // search in one field
}
everything worked like a charm.
I got this error with my Docker container, so I increased the Java heap (ES_JAVA_OPTS) to 1 GB and now it works without any error.
Here is the docker-compose.yml:
version: '3'
services:
  elasticsearch-cont:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.9.2
    container_name: elasticsearch
    environment:
      - "ES_JAVA_OPTS=-Xms1024m -Xmx1024m"
      - discovery.type=single-node
    ulimits:
      memlock:
        soft: -1
        hard: -1
    ports:
      - 9200:9200
      - 9300:9300
    networks:
      - elastic
networks:
  elastic:
    driver: bridge
In my case, I also have an index with large documents which store system running logs, and I was searching the index with all fields. I use the Java client API, like this:
TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("uid", uid);
searchSourceBuilder.query(termQueryBuilder);
When I changed my code to fetch only the uid field, like this:
TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("uid", uid);
searchSourceBuilder.fetchField("uid");
searchSourceBuilder.fetchSource(false);
searchSourceBuilder.query(termQueryBuilder);
the error disappeared.

Mongodb - replica set - max connections

I have a replica set of 3 Mongo nodes: 1 primary, 1 secondary, and 1 arbiter.
Connected to this replica set I have 20 Node.js processes on 20 different servers, each using its own connection to the replica set. All of these processes use Mongoose.
My primary shows the following:
rsProd:PRIMARY> db.serverStatus().connections
{ "current" : 284, "available" : 50916, "totalCreated" : NumberLong(42655) }
From time to time, when I restart some Node.js processes, I get the following error:
mongodb no valid seed servers in list
My connection string to the replicaset is the following :
"mongodb://mongo2aws.abcdef:27017/dbname,mongo1.abcdef:27017/dbname"
And my db options are the following :
config.db_options = {
  user: "MYUSER",
  pass: "MYPASSWORD",
  replset: {
    rs_name: "RSNAME",
    ssl: true,
    sslValidate: false,
    sslCA: ca,
    ca: ca,
    sslKey: key,
    sslCert: key
  },
  socketOptions: {
    keepAlive: 1,
    connectTimeoutMS: 1000
  },
  server: {
    ssl: true,
    sslValidate: false,
    sslCA: ca,
    ca: ca,
    sslKey: key,
    sslCert: key
  },
  auth: {
    authdb: 'MYAUTHDB'
  }
};
I didn't have this error when I was running only 16 Node processes.
Based on this, I suppose I have reached a limit of max concurrent connections or something similar.
But if I restart the crashing nodes again, it eventually seems to work.
Why do Mongo/Mongoose raise this error?
What can I do to prevent this or increase the limit?
Thanks in advance.
Best regards.
Solved by increasing the ulimit for open files.
The default ulimit for open files on an AWS EC2 Ubuntu server is 1024.
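To make a higher limit survive reboots and new sessions, it can be set in /etc/security/limits.conf (the username and value here are illustrative):

```
ubuntu soft nofile 65535
ubuntu hard nofile 65535
```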
In addition, adding reconnect options prevents this problem:
config.db_options.reconnectTries=10;
config.db_options.reconnectInterval=500;
config.db_options.poolSize=20;
config.db_options.connectTimeoutMS=5000;

Low performance with Logstash Kafka input plugin with no-op output except metrics

Test environment is as follows:
CPU: Intel L5640 2.26 GHz 6 cores * 2 EA
Memory: SAMSUNG PC3-10600R 4 GB * 4 EA
HDD: TOSHIBA SAS 10,000 RPM 300 GB * 6 EA
OS: CentOS release 6.6 (Final)
Logstash 2.3.4
I used the following configuration:
input {
  kafka {
    zk_connect => '1.2.3.4:2181'
    topic_id => 'some-log'
    consumer_threads => 1
  }
}
filter {
  metrics {
    meter => "events"
    add_tag => "metric"
  }
}
output {
  if "metric" in [tags] {
    stdout {
      codec => line {
        format => "Count: %{[events][count]}"
      }
    }
  }
}
I got the following result:
./bin/logstash -f some-log-kafka.conf
Settings: Default pipeline workers: 24
Pipeline main started
Count: 9614
Count: 23080
Count: 37087
Count: 50815
Count: 64517
Count: 78296
Count: 91977
Count: 105990
The default flush_interval is 5 seconds, so that's roughly 14K events per 5 seconds (about 2.8K events per second).
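The 2.8K/s figure can be verified from the cumulative counts above (my own arithmetic; the counts are taken from the original output):

```javascript
// The metrics filter prints cumulative counts every flush_interval (5 s),
// so the per-second rate is the difference between samples divided by 5.
const counts = [9614, 23080, 37087, 50815, 64517, 78296, 91977, 105990];
const flushIntervalSec = 5;
const rates = counts.slice(1).map((c, i) => (c - counts[i]) / flushIntervalSec);
const avgRate = rates.reduce((a, b) => a + b, 0) / rates.length;
console.log(Math.round(avgRate)); // prints 2754 (events/sec)
```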
With consumer_threads set to 10, I got the following result:
./bin/logstash -f impression-log-kafka.conf
Settings: Default pipeline workers: 24
Pipeline main started
Count: 9599
Count: 23254
Count: 37253
Count: 51029
Count: 64881
Count: 78868
Count: 92663
Count: 106267
It looks like increasing consumer_threads doesn't make much difference.
Based on my simple no-op consumer benchmark, I expected around 30K events per second (and at least 10K), but this is just 1/10 of the expected performance.
How can I improve its performance?
Additional comment:
With the Kafka client Java library I'm using bootstrap servers, whereas with the Logstash Kafka input plugin I'm using ZooKeeper (there's no option for bootstrap servers). I'm not sure whether this could cause such a huge difference.