Neo4j CPU stuck on GC - garbage-collection

Suddenly, after running for a month with almost no CPU use (between 1 and 5%), the Neo4j server is stuck at 100% CPU, garbage collecting.
I have neo4j-enterprise 2.0.3 (not embedded) running on an Ubuntu server with 4 processors.
This is my Neo4j configuration:
neo4j-wrapper.conf:
wrapper.java.additional=-Dorg.neo4j.server.properties=conf/neo4j-server.properties
wrapper.java.additional=-Djava.util.logging.config.file=conf/logging.properties
wrapper.java.additional=-Dlog4j.configuration=file:conf/log4j.properties
#********************************************************************
# JVM Parameters
#********************************************************************
wrapper.java.additional=-XX:+UseConcMarkSweepGC
wrapper.java.additional=-XX:+CMSClassUnloadingEnabled
# Remote JMX monitoring, uncomment and adjust the following lines as needed.
# Also make sure to update the jmx.access and jmx.password files with appropriate permission roles and passwords,
# the shipped configuration contains only a read only role called 'monitor' with password 'Neo4j'.
# For more details, see: http://download.oracle.com/javase/6/docs/technotes/guides/management/agent.html
# On Unix based systems the jmx.password file needs to be owned by the user that will run the server,
# and have permissions set to 0600.
# For details on setting these file permissions on Windows see:
# http://download.oracle.com/javase/1.5.0/docs/guide/management/security-windows.html
wrapper.java.additional=-Dcom.sun.management.jmxremote.port=3637
wrapper.java.additional=-Dcom.sun.management.jmxremote.authenticate=true
wrapper.java.additional=-Dcom.sun.management.jmxremote.ssl=false
wrapper.java.additional=-Dcom.sun.management.jmxremote.password.file=conf/jmx.password
wrapper.java.additional=-Dcom.sun.management.jmxremote.access.file=conf/jmx.access
# Some systems cannot discover host name automatically, and need this line configured:
#wrapper.java.additional=-Djava.rmi.server.hostname=$THE_NEO4J_SERVER_HOSTNAME
# disable UDC (report data to neo4j..)
wrapper.java.additional=-Dneo4j.ext.udc.disable=true
# Uncomment the following lines to enable garbage collection logging
wrapper.java.additional=-Xloggc:data/log/neo4j-gc.log
wrapper.java.additional=-XX:+PrintGCDetails
wrapper.java.additional=-XX:+PrintGCDateStamps
wrapper.java.additional=-XX:+PrintGCApplicationStoppedTime
#wrapper.java.additional=-XX:+PrintPromotionFailure
#wrapper.java.additional=-XX:+PrintTenuringDistribution
# Uncomment the following lines to enable JVM startup diagnostics
#wrapper.java.additional=-XX:+PrintFlagsFinal
#wrapper.java.additional=-XX:+PrintFlagsInitial
# Java Heap Size: by default the Java heap size is dynamically
# calculated based on available system resources.
# Uncomment these lines to set specific initial and maximum
# heap size in MB.
#wrapper.java.initmemory=512
wrapper.java.maxmemory=3072
#********************************************************************
# Wrapper settings
#********************************************************************
# path is relative to the bin dir
wrapper.pidfile=../data/neo4j-server.pid
#********************************************************************
# Wrapper Windows NT/2000/XP Service Properties
#********************************************************************
# WARNING - Do not modify any of these properties when an application
# using this configuration file has been installed as a service.
# Please uninstall the service before modifying this section. The
# service can then be reinstalled.
# Name of the service
wrapper.name=neo4j
neo4j.properties (default values):
# Default values for the low-level graph engine
neostore.nodestore.db.mapped_memory=25M
neostore.relationshipstore.db.mapped_memory=120M
neostore.propertystore.db.mapped_memory=90M
neostore.propertystore.db.strings.mapped_memory=100M
neostore.propertystore.db.arrays.mapped_memory=100M
What can I do?
EDIT:
The store file sizes:
[
{
"description": "Information about the sizes of the different parts of the Neo4j graph store",
"name": "org.neo4j:instance=kernel#0,name=Store file sizes",
"attributes": [
{
"description": "The total disk space used by this Neo4j instance, in bytes.",
"name": "TotalStoreSize",
"value": 401188207,
"isReadable": "true",
"type": "long",
"isWriteable": "false ",
"isIs": "false "
},
{
"description": "The amount of disk space used by the current Neo4j logical log, in bytes.",
"name": "LogicalLogSize",
"value": 24957516,
"isReadable": "true",
"type": "long",
"isWriteable": "false ",
"isIs": "false "
},
{
"description": "The amount of disk space used to store array properties, in bytes.",
"name": "ArrayStoreSize",
"value": 128,
"isReadable": "true",
"type": "long",
"isWriteable": "false ",
"isIs": "false "
},
{
"description": "The amount of disk space used to store nodes, in bytes.",
"name": "NodeStoreSize",
"value": 524160,
"isReadable": "true",
"type": "long",
"isWriteable": "false ",
"isIs": "false "
},
{
"description": "The amount of disk space used to store properties (excluding string values and array values), in bytes.",
"name": "PropertyStoreSize",
"value": 145348280,
"isReadable": "true",
"type": "long",
"isWriteable": "false ",
"isIs": "false "
},
{
"description": "The amount of disk space used to store relationships, in bytes.",
"name": "RelationshipStoreSize",
"value": 114126903,
"isReadable": "true",
"type": "long",
"isWriteable": "false ",
"isIs": "false "
},
{
"description": "The amount of disk space used to store string properties, in bytes.",
"name": "StringStoreSize",
"value": 128,
"isReadable": "true",
"type": "long",
"isWriteable": "false ",
"isIs": "false "
}
],
"url": "org.neo4j/instance%3Dkernel%230%2Cname%3DStore+file+sizes"
}
]

Assuming you have 16 GB of RAM in the machine:
The first thing is to set the neostore.xxx.mapped_memory settings to match the size of your store files. I'm assuming their total is 5 GB -> you have 11 GB left. See http://docs.neo4j.org/chunked/2.0.4/configuration-caches.html for more details.
Reserve some RAM for the system: 1 GB -> you have 10 GB left.
Assign the remaining RAM to the Java heap using wrapper.java.initmemory and wrapper.java.maxmemory. Set both to the same value.
If hpc is used as cache_type, consider tweaking its settings based on the cache hit ratio for relationships and nodes. Use JMX to monitor them, see http://docs.neo4j.org/chunked/2.0.4/jmx-mxbeans.html#jmx-cache-nodecache.
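To make that concrete, here is a rough sketch of what the settings could look like; the values are illustrative only, derived from the store sizes in your edit and the 10 GB heap figure above:
# conf/neo4j.properties - mapped memory roughly matching the store file sizes above
neostore.nodestore.db.mapped_memory=5M
neostore.relationshipstore.db.mapped_memory=130M
neostore.propertystore.db.mapped_memory=160M
neostore.propertystore.db.strings.mapped_memory=10M
neostore.propertystore.db.arrays.mapped_memory=10M
# conf/neo4j-wrapper.conf - fixed-size heap, init and max set to the same value
wrapper.java.initmemory=10240
wrapper.java.maxmemory=10240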

We also experienced these kinds of issues. In addition to configuration changes similar to what @stefan-armbruster mentioned and updating Neo4j to 2.1.2, we also configured Neo4j to use the G1 garbage collector instead of CMS.
Since making the garbage collection change we have seen far fewer spikes than we did previously.
If you want to give it a shot you can enable G1 GC by adding the following to your conf/neo4j-wrapper.conf file.
wrapper.java.additional=-XX:+UseG1GC
Hopefully, with a combination of this and the changes suggested by @stefan-armbruster, you'll resolve the issue.
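Since your wrapper config already writes a GC log to data/log/neo4j-gc.log, a quick way to confirm the switch took effect after a restart (just a suggestion) is to look for G1 entries in that log:
grep -i "g1" data/log/neo4j-gc.log | head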

Related

convert kubernetes state metrics curl response units to megabytes

I need to get kube state metrics in Mi; by default it comes in Ki. Can anyone please help me?
[root@dte-dev-1-bizsvck8s-mst harsha]# curl http://<server IP>:8088/apis/metrics.k8s.io/v1beta1/namespaces/default/pods/hello-kubernetes-65bc74d4b9-qp9dc
{
"kind": "PodMetrics",
"apiVersion": "metrics.k8s.io/v1beta1",
"metadata": {
"name": "hello-kubernetes-65bc74d4b9-qp9dc",
"namespace": "default",
"selfLink": "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods/hello-kubernetes-65bc74d4b9-qp9dc",
"creationTimestamp": "2020-04-17T12:31:59Z"
},
"timestamp": "2020-04-17T12:31:26Z",
"window": "30s",
"containers": [
{
"name": "hello-kubernetes",
"usage": {
"cpu": "0",
"memory": "20552Ki"
}
}
]
}
I want to get the memory usage in Mi (megabytes), not Ki. Please help me!
This unit is hardcoded in the official kube-state-metrics code and shouldn't be changed. For example, node metrics - especially memory usage - are reported in megabyte units, not kilobytes.
To get the memory usage of a specific pod in megabyte units, simply execute:
kubectl top pod --namespace example-app
NAME CPU(cores) MEMORY(bytes)
app-deployment-76bf4969df-65wmd 12m 1Mi
app-deployment-76bf4969df-mmqvt 16m 1Mi
The kubectl top command returns current CPU and memory usage for a cluster’s pods or nodes, or for a particular pod or node if specified.
You can also convert the received value yourself:
1 KB = 0.001 MB (in decimal),
1 KB = 0.0009765625 MB (in binary)
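As a worked example with the value from the response above: 20552 Ki / 1024 ≈ 20.1 Mi. If you want the conversion on the command line, here is a rough sketch (assuming jq, sed and awk are available and the response has the shape shown in the question):
curl -s http://<server IP>:8088/apis/metrics.k8s.io/v1beta1/namespaces/default/pods/hello-kubernetes-65bc74d4b9-qp9dc \
  | jq -r '.containers[].usage.memory' \
  | sed 's/Ki$//' \
  | awk '{printf "%.1fMi\n", $1/1024}'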
Take a look: kube-state-metrics-monitoring.

Slow Elasticsearch indexing using join datatype

We have an index with a join datatype and the indexing speed is very slow.
At best we are indexing 100 documents/sec, but mostly around 50/sec; the rate varies depending on the document size. We are using multiple threads with the .NET NEST client when indexing, but both batching and single inserts are pretty slow. We have tested various batch sizes but are still not getting any speed to speak of. Even with only small documents containing "metadata" it is slow, and the speed drops radically as the document size increases. Document size in this solution can vary from small up to 6 MB.
What can we expect when indexing with the join datatype? How much of a penalty must we expect from using it? We did of course try to avoid it when designing the index, but we did not find any way around it. Any tips or tricks?
We are using a 3-node cluster in Azure, all with 32 GB of RAM and premium SSD disks. The Java heap size is set to 16 GB. Swapping is disabled. Memory usage on the VMs is stable at about 60% of total, but the CPU is very low, < 10%. We are running Elasticsearch v6.2.3.
A short version of the mapping:
"mappings": {
"log": {
"_routing": {
"required": true
},
"properties": {
"body": {
"type": "text"
},
"description": {
"type": "text"
},
"headStepJoinField": {
"type": "join",
"eager_global_ordinals": true,
"relations": {
"head": "step"
}
},
"id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"statusId": {
"type": "keyword"
},
"stepId": {
"type": "keyword"
}
}
}
}
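For reference, indexing against this mapping looks roughly like the following (shown in plain REST form rather than NEST, with a made-up index name my-index); because of the join field and the required _routing, every child (step) document has to be routed to its parent (head) document:
PUT my-index/log/head-1?routing=head-1
{ "id": "head-1", "description": "parent document", "headStepJoinField": { "name": "head" } }
PUT my-index/log/step-1?routing=head-1
{ "id": "step-1", "body": "child document", "headStepJoinField": { "name": "step", "parent": "head-1" } }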

IoT Central - State values error: Another instance with the same id already exists

I just don't get it.
If I want to define a status, I must be able to refer to a value several times, right?
The sensor value is occupancy, with value 0 (Free) or 1 (Occupied). So I have 2 states, but I can only use "occupancy" in the Name field once ...
Regards,
Matthias
For better understanding, let's call the State property (such as a device twin reported property) Occupancy.
The following screen snippet shows its declaration, where the Occupancy state property has two states, Free and Occupied (Occupancy.Free and Occupancy.Occupied):
and its declaration in the Interface instance of the Capability Model (in my example):
{
"#id": "urn:rigado:interfaces:S1_Sensor:Occupancy:3",
"#type": [
"Property",
"SemanticType/State"
],
"displayName": {
"en": "Occupancy"
},
"name": "Occupancy",
"schema": {
"#id": "urn:rigado:interfaces:S1_Sensor:Occupancy:xkuwdf9p:3",
"#type": "Enum",
"valueSchema": "integer",
"enumValues": [
{
"#id": "urn:rigado:interfaces:S1_Sensor:Occupancy:xkuwdf9p:Free:3",
"#type": "EnumValue",
"displayName": {
"en": "Free"
},
"enumValue": 0,
"name": "Free"
},
{
"#id": "urn:rigado:interfaces:S1_Sensor:Occupancy:xkuwdf9p:Occupied:3",
"#type": "EnumValue",
"displayName": {
"en": "Occupied"
},
"enumValue": 1,
"name": "Occupied"
}
]
}
}
As you can see in the above schema, the names in the enumValues array must be unique; that is the reason why you get the error when you use the same enum name.
Note that the device can change the state of the Occupancy property between the values Free (0) and Occupied (1).
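As a minimal sketch (assuming the interface above sits in the default component and Occupancy is a plain reported property), the device-side twin patch for switching between the two states is simply:
{ "Occupancy": 1 }
to report Occupied, and
{ "Occupancy": 0 }
to report Free.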
For testing purposes, the Azure IoT Hub Tester can be used; see the following screen snippet:
The following screen snippets show changing a state in the Occupancy reported property on the PnP device (sensor3) connected to the IoTC App:
Publishing an Occupancy state:
Get the device twin properties:
and the IoTC App Dashboard for Occupancy State property:
As you can see, the above state has the value Free.

speed up copy task in azure Data factory

I have a Copy job that should copy 100 GB of Excel files between two Azure Data Lakes.
"properties": {
"activities": [
{
"name": "Copy Data1",
"type": "Copy",
"policy": {
"timeout": "7.00:00:00",
"retry": 0,
"retryIntervalInSeconds": 30,
"secureOutput": false,
"secureInput": false
},
"typeProperties": {
"source": {
"type": "AzureDataLakeStoreSource",
"recursive": true,
"maxConcurrentConnections": 256
},
"sink": {
"type": "AzureDataLakeStoreSink",
"maxConcurrentConnections": 256
},
"enableStaging": false,
"parallelCopies": 32,
"dataIntegrationUnits": 256
},
"inputs": [
{
"referenceName": "SourceLake",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "DestLake",
"type": "DatasetReference"
}
]
}
],
My throughput is about 4 MB/s. As I read here, it should be 56 MB/s. What should I do to reach that throughput?
You can use the copy activity performance tuning guide to help you tune the performance of your Azure Data Factory service with the copy activity.
Summary:
Take these steps to tune the performance of your Azure Data Factory service with the copy activity.
Establish a baseline. During the development phase, test your pipeline by using the copy activity against a representative data sample. Collect execution details and performance characteristics following copy activity monitoring.
Diagnose and optimize performance. If the performance you observe doesn't meet your expectations, identify performance bottlenecks. Then, optimize performance to remove or reduce the effect of bottlenecks.
In some cases, when you run a copy activity in Azure Data Factory, you see a "Performance tuning tips" message on top of the copy activity monitoring page, as shown in the following example. The message tells you the bottleneck that was identified for the given copy run. It also guides you on what to change to boost copy throughput.
Your data is about 100 GB in size, but the reference tests for file-based stores use multiple files of 10 GB each, so the performance may be different.
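To put the gap in perspective, some rough arithmetic based on the numbers in your question:
100 GB at 4 MB/s ≈ 100 * 1024 / 4 ≈ 25,600 s ≈ 7.1 hours
100 GB at 56 MB/s ≈ 100 * 1024 / 56 ≈ 1,830 s ≈ 0.5 hours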
Hope this helps.

Azure resource ID reported depending on consumed volumes

Guys, I couldn't find a similar question, so I'm asking here.
We have a client for the Microsoft REST API, and we normally receive consumed-usage data for multiple subscriptions.
But there's a problematic point.
There are some resource types which are billed depending on the consumed volume. Each of these has its own resource ID. For example, for blob storage there are at least 3 different IDs depending on the consumed amount (which, I suspect, is billed differently).
The question is: am I right in presuming that when the end user (our customer) exceeds the amount of resources allocated for a particular usage resource ID, the next report will contain a different resource ID for the same resource?
Here's the REST response I'm talking about:
{
"usageStartTime": "2017-06-07T17:00:00-07:00",
"usageEndTime": "2017-06-08T17:00:00-07:00",
"resource": {
"id": "**8767aeb3-6909-4db2-9927-3f51e9a9085e**", //I'm talking about this one
"name": "Storage Admin",
"category": "Storage",
"subcategory": "Block Blob",
"region": "Azure Stack"
},
"quantity": 0.217790327034891,
"unit": "1 GB/Hr",
"infoFields": {},
"instanceData": {
"resourceUri": "/subscriptions/ab7e2384-eeee-489a-a14f-1eb41ddd261d/resourcegroups/system.local/providers/Microsoft.Storage/storageaccounts/srphealthaccount",
"location": "azurestack",
"partNumber": "",
"orderNumber": "",
"additionalInfo": {
"azureStack.MeterId": "09F8879E-87E9-4305-A572-4B7BE209F857",
"azureStack.SubscriptionId": "dbd1aa30-e40d-4436-b465-3a8bc11df027",
"azureStack.Location": "local",
"azureStack.EventDateTime": "06/05/2017 06:00:00"
}
},
"attributes": {
"objectType": "AzureUtilizationRecord"
}
}
