I have a Neo4j instance running on a $10/month DigitalOcean VPS:
1GB / 1 CPU,
30GB SSD DISK,
2TB TRANSFER
I'm using the awesome Node library seraph. When I tried to save a modest amount of data (around 100 root nodes plus 3-5 related nodes for each, about 10 properties per node), I started getting ECONNRESET errors, missing ids and similar failures about halfway through. I guess it's because of wrong configuration or insufficient resources.
So how do I check what is wrong? Which logs should I read?
I actually found this issue, and added the following to the configuration:
neostore.relationshipgroupstore.db.mapped_memory=10M
neostore.nodestore.db.mapped_memory=250M
neostore.propertystore.db.mapped_memory=500M
neostore.propertystore.db.strings.mapped_memory=500M
Now there are fewer errors, but a lot of data is still lost due to ECONNRESET.
What should I change? Thank you very much!
Use Neo4j 2.2.2.
In neo4j.properties, set dbms.pagecache.memory=250M
In neo4j-wrapper.conf, set wrapper.java.maxmemory=512
I'm not sure whether seraph uses the REST API to create data or Cypher's remote endpoint; the latter would be preferred.
Please provide your code!
I presume you can do your whole operation with about 500 Cypher statements, each parameterized with your 10 properties, passed to Neo4j as a map parameter that you can assign directly.
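As a rough sketch of the batching idea above: the snippet below builds one request body for Neo4j's transactional Cypher endpoint (POST /db/data/transaction/commit in the 2.x line), with one parameterized CREATE per node and the properties passed as a map parameter. The label Item, the function name and the sample properties are assumptions for illustration. The code only builds the JSON and does not send it.

```python
import json

# Cypher for Neo4j 2.x: {props} is a map parameter assigned directly to the node.
# The label "Item" is hypothetical; use whatever label fits your model.
CREATE_NODE = "CREATE (n:Item {props}) RETURN id(n)"


def build_batch(nodes):
    """Return the JSON body for one transactional request covering all nodes."""
    statements = [
        {"statement": CREATE_NODE, "parameters": {"props": dict(props)}}
        for props in nodes
    ]
    return {"statements": statements}


# Two nodes travel in one request instead of two separate HTTP calls.
payload = build_batch([
    {"name": "root-1", "kind": "root"},
    {"name": "child-1", "kind": "related"},
])
body = json.dumps(payload)
```

Sending all statements in a single request means far fewer connections are opened, which is likely to matter on a 1 GB machine.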
For example, reading "dbs/colls/document" instead of getting a container, then calling read on the container.
I've been having an issue where the first readItem on a container (after calling database.getContainer(x)) is extremely slow (1 second or longer), and I was thinking that using a database link could be faster.
I'm guessing a read after getting the container is slow because it doesn't make a service call until I call read.
Is there a way I can have this preloaded when reading in a database?
I have an application with a read(collectionName, key) method, and my approach was to use getContainer(collectionName) and then call read on that, but this method needs to be fast.
As discussed, the best practice is to keep an instance of your container alive between requests and call readItem on each request. This should resolve the primary issue.
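A minimal sketch of what "keep the container alive between requests" could look like for a read(collectionName, key) method. The class name, the get_container callable and the container's read(key) method are stand-ins rather than the SDK's actual API; in a real application get_container would wrap database.getContainer(name) and read would call readItem.

```python
class ContainerCache:
    """Hand out one long-lived container object per collection name."""

    def __init__(self, get_container):
        # get_container: callable(name) -> container (assumed wrapper around the SDK).
        self._get_container = get_container
        self._containers = {}

    def read(self, collection_name, key):
        container = self._containers.get(collection_name)
        if container is None:
            # Only the first request for a collection pays the setup cost.
            container = self._get_container(collection_name)
            self._containers[collection_name] = container
        return container.read(key)


class FakeContainer:
    """Stand-in so the sketch runs without the SDK."""

    created = 0

    def __init__(self, name):
        FakeContainer.created += 1
        self.name = name

    def read(self, key):
        return {"collection": self.name, "id": key}


cache = ContainerCache(FakeContainer)
first = cache.read("users", "42")
second = cache.read("users", "43")  # reuses the container created above
```

The dictionary is not guarded by a lock, so a multi-threaded server would need to add one or populate the cache at startup.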
As for the secondary concern, the "high latency every 50 requests or so": this is a known issue, but it should only occur in the first minute or so of operation. If you can tolerate the initial slow requests, the solution is to wait for performance to stabilize. How long do you have to run your app before you no longer see these high-latency requests?
FYI, if latency is a concern, run your client application in a geographically colocated Azure VM. Also a good rule of thumb is to allocate client CPU cores such that CPU utilization is not more than 40% or 50%.
I have a Couchbase cluster (Couchbase version 4.1) with N data nodes, 1 query node and 1 index node. The data nodes hold roughly 1 million key-value pairs in a single bucket. The whole setup is hosted in Microsoft Azure within a virtual network, and I can assure you that each node has enough resources, so RAM, CPU and disk are not an issue.
I can GET/SET JSON documents on my Couchbase server without any issue. I am just testing, so ports are not an issue either, as I have opened all ports between the machines for now.
But when I try to run N1QL queries (from the Couchbase shell or using the Python SDK), it does not work. The query just hangs and I don't get any reply from the server. Once in a while, though, the query works without any issue, and then after a minute it stops working again.
I have created a PRIMARY index on my bucket, plus any other Global Secondary Indexes that were required.
I also installed sample buckets provided by couchbase. Same problems exist.
Does anyone have a clue what the issue could be?
Your query probably hangs because you are straining the server too much. I don't know how many N1QL ops you are pushing each second, but for that type of query you will benefit most from a few tweaks that lower CPU usage and increase efficiency.
Create a specific covering index such as:
create index inx_id_email on clients(id,email) where transaction_successful=false
Use the explain keyword to check whether your query is using the index:
explain SELECT id, email FROM clients where transaction_successful = false LIMIT 100 OFFSET 200
I believe your query/index nodes are overutilized because you are actually doing the equivalent of a primary scan in a relational database.
I cannot figure out the cause of the bottleneck on this site: response times become very bad once about 400 users are reached. The site runs on Google Compute Engine, using an instance group with network load balancing. We created the project with Sails.js.
I have been doing load testing with Google Container Engine using Kubernetes, running the locust.py script.
The main results for one of the tests are:
RPS : 30
Spawn rate: 5 p/s
Total users: 1000
Avg response time: 27500 ms (27.5 seconds)
The response time initially is great, below one second, but when it starts reaching about 400 users the response time starts to jump massively.
I have tested obvious factors that can influence that response time, results below:
Compute engine Instances
(2 x standard-n2, 200gb disk, ram:7.5gb per instance):
CPU utilization: only about 20%
Outgoing network bytes: 340k bytes/sec
Incoming network bytes: 190k bytes/sec
Disk operations: 1 op/sec
Memory: below 10%
MySQL:
Max_used_connections : 41 (below total possible)
Connection errors: 0
All other results for MySQL also seem fine, no reason to cause bottleneck.
I ran the same test against a freshly created Sails.js project, and it did better, but the results were still terrible: 5-second response times at about 2000 users.
What else should I test? What could be the bottleneck?
Are you doing any file reading/writing? This is a major obstacle in Node.js and will always cause some issues. Cache the files you read, or remove the need for such code, as much as possible. In my own experience, serving files such as images, CSS and JS through my Node server started causing trouble as the number of concurrent requests increased. The solution was to serve all of this through a CDN.
Another problem could be the MySQL driver. We had some problems with connections not being closed correctly (not with Sails.js, but I think it used the same driver at the time I encountered this), which caused problems on the MySQL server and resulted in long delays when fetching data from the database. You should time and count your MySQL queries and make sure they aren't delayed.
Lastly, it could be some special issue with Sails.js and Google Compute Engine. You should make sure there aren't any open issues on either of these about the problem you are experiencing.
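The "time and count your queries" advice can be sketched as a small timing wrapper. It is shown in Python only to illustrate the idea; the same pattern can be wrapped around the query calls of whichever MySQL driver the Node app uses. The names timed_query and slow_queries, and the 0.5-second threshold, are assumptions.

```python
import time
from contextlib import contextmanager

slow_queries = []  # (label, seconds) for every query at or above the threshold


@contextmanager
def timed_query(label, threshold_s=0.5):
    """Time the wrapped block and record it if it was slow."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        if elapsed >= threshold_s:
            slow_queries.append((label, elapsed))


# Usage: wrap the call that talks to the database.
with timed_query("load user by id", threshold_s=0.0):
    time.sleep(0.01)  # stands in for the real query
```

If the recorded times grow with the number of concurrent users while MySQL itself reports fast queries, the delay is probably in the driver or connection pool rather than in the database.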
I am currently working on a Node.js API deployed on AWS with Elastic Beanstalk.
The api accepts a url with query parameters, saves the parameters on a db (in my case aws rds), and redirects to a new url without waiting for the db response.
The main priority by far for the api is the redirection speed and the ability to handle a lot of requests. The aim of this question is to get your suggestions on how to do that.
I ran the api through a service called blitz.io to see what load it could handle and this is the report I got from them: https://www.dropbox.com/s/15wsa8ksj3lz99e/Blitz.pdf?dl=0
The instance and the database are running on t2.micro and db.t2.micro respectively.
The api can handle the load if no write is performed on the db, but crashes under a certain load when it writes on the db (I shared the report for the latter case) even without waiting for the db responses.
I checked the logs and found the following error in /var/log/nginx/error.log:
*1254 socket() failed (24: Too many open files) while connecting to upstream
I am not familiar with how nginx works but I imagine that every db connection is seen as an open file. Hence, the error implies that we reach the limit for open files before being able to close the connections. Is that a correct interpretation? Why am I getting the error?
I increased the limit in the way suggested here: https://forums.aws.amazon.com/thread.jspa?messageID=613983#613983 but it did not solve the problem.
At this point I am not sure what to do. Can I close the connections before getting a response from the db? Is it a hardware limitation? The writes to the db are tiny.
Thank you in advance for your help! :)
If you only modified ulimit, it might not be enough. You should also look at fs.file-max, the system-wide limit on the number of file handles:
sysctl -w fs.file-max=100000
as explained here:
http://www.cyberciti.biz/faq/linux-increase-the-maximum-number-of-open-files/
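To see which of the two limits is actually in effect, a quick check like the following can help. It is a minimal sketch for Linux: it reads the per-process limit via the standard resource module and the system-wide value from /proc/sys/fs/file-max. The function name fd_limits is made up for this example.

```python
import resource


def fd_limits():
    """Return per-process and system-wide open-file limits."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        with open("/proc/sys/fs/file-max") as f:
            system_max = int(f.read().strip())
    except (OSError, ValueError):
        system_max = None  # not Linux, or /proc is unavailable
    return {"soft": soft, "hard": hard, "system_max": system_max}


limits = fd_limits()
print(limits)
```

Note that this reports the limits of the process running the script, not those of the nginx workers, which may have been started with different settings.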
I am using Hazelcast version 3.3.1.
I have a 9 node cluster running on aws using c3.2xlarge servers.
I am using a distributed executor service and a distributed map.
Distributed executor service uses a single thread.
Distributed map is configured with no replication and no near-cache and stores about 1 million objects of size 1-2kb using Kryo serializer.
My use case goes as follows:
All 9 nodes constantly execute a synchronous remote operation on the distributed executor service and generate about 20k hits per second (about ~2k per node).
Invocations are executed using Hazelcast API: com.hazelcast.core.IExecutorService#executeOnKeyOwner.
Each operation accesses the distributed map on the node owning the partition, does some calculation using the stored object, and stores the object back into the map (for that I use the get and set API of the IMap object).
Every once in a while Hazelcast encounters timeout exceptions such as:
com.hazelcast.core.OperationTimeoutException: No response for 120000 ms. Aborting invocation! BasicInvocationFuture{invocation=BasicInvocation{ serviceName='hz:impl:mapService', op=GetOperation{}, partitionId=212, replicaIndex=0, tryCount=250, tryPauseMillis=500, invokeCount=1, callTimeout=60000, target=Address[172.31.44.2]:5701, backupsExpected=0, backupsCompleted=0}, response=null, done=false} No response has been received! backups-expected:0 backups-completed: 0
In some cases I see map partitions start to migrate, which makes things even worse: nodes constantly leave and re-join the cluster, and the only way I can overcome the problem is by restarting the entire cluster.
I am wondering what may cause Hazelcast to block a map-get operation for 120 seconds?
I am pretty sure it's not network related since other services on the same servers operate just fine.
Also note that the servers are mostly idle (~70%).
Any feedback on my use case will be highly appreciated.
Why don't you make use of an entry processor? It is also sent to the machine owning the partition, and the load, modify and store steps are done automatically and atomically, so there are no race problems. It will probably outperform the current approach significantly, since there is less remoting involved.
The fact that map.get does not return for 120 seconds is indeed very confusing. In Hazelcast 3.5 we added some logging/debugging support for this, the slow operation detector (executing side) and the slow invocation detector (caller side), so switching to it should give you some insight into what is happening.
Do you see any Health monitor logs being printed?