how to fix cassandra debug.log error LEAK: ByteBuf.release() - cassandra

I am getting this error in my debug file. If anybody knows this error, please help me solve it. I am frustrated with it.
ERROR [epollEventLoopGroup-2-51] 2017-11-09 16:09:21,495 Slf4JLogger.java:176 - LEAK: ByteBuf.release() was not called before it's garbage-collected. Enable advanced leak reporting to find out where the leak occurred. To enable advanced leak reporting, specify the JVM option '-Dio.netty.leakDetection.level=advanced' or call ResourceLeakDetector.setLevel() See http://netty.io/wiki/reference-counted-objects.html for more information.
I am using the G1 garbage collector instead of the CMS collector.
I have 4 servers
x.x.x.1 contains-------------------------------------------
MAX_HEAP_SIZE="8G"
HEAP_NEWSIZE="2000M"
OS: CentOS - 7
RAM: 142 GB
Swap: 3 GB
Processor: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
Core: 40 Core
Disk: 2.5T
x.x.x.2 contains--------------------------------------------
MAX_HEAP_SIZE="16G"
HEAP_NEWSIZE="4000M"
OS: CentOS - 7
RAM: 125 GB
Swap: 3 GB
Processor: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
Core: 40 Core
Disk: 2.2T
x.x.x.3 contains---------------------------------------------
MAX_HEAP_SIZE="16G"
HEAP_NEWSIZE="4000M"
OS: CentOS - 7
RAM: 125 GB
Swap: 3 GB
Processor: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
Core: 40 Core
Disk: 2 TB
x.x.x.4 contains-----------------------------------------
MAX_HEAP_SIZE="4G"
HEAP_NEWSIZE="1200M"
OS: CentOS - 7
RAM: 125 GB
Swap: 3 GB
Processor: Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
Core: 12 Core
Disk: 2.7 TB
jvm options are like this----------------------
-XX:InitiatingHeapOccupancyPercent=70
-XX:ParallelGCThreads=16
-XX:ConcGCThreads=16
log options are like this -----------------------
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintHeapAtGC
-XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintPromotionFailure
-XX:PrintFLSStatistics=1
-Xloggc:/var/log/cassandra/gc.log
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=10
-XX:GCLogFileSize=10M
But the thing is, one of the servers always goes down.
Thanks and Regards
pavs
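The warning itself points at the first diagnostic step: raise Netty's leak-detection level so the next report includes where the leaked buffer was handled. A minimal, idempotent sketch of adding the flag follows; the jvm.options location is an assumption (package installs usually keep it under /etc/cassandra, tarball installs under conf/), so the sketch edits a local file that you would substitute with the real one:

```shell
# Sketch: append Netty's advanced leak-detection flag to Cassandra's JVM options.
# Operates on a local file here; on a real node the file is usually
# /etc/cassandra/jvm.options or <install>/conf/jvm.options (an assumption).
JVM_OPTS_FILE="./jvm.options"
touch "$JVM_OPTS_FILE"
LEAK_FLAG="-Dio.netty.leakDetection.level=advanced"
# Append only if the flag is not already present, so re-running is harmless.
grep -qF -- "$LEAK_FLAG" "$JVM_OPTS_FILE" || printf '%s\n' "$LEAK_FLAG" >> "$JVM_OPTS_FILE"
grep -c -- "$LEAK_FLAG" "$JVM_OPTS_FILE"   # prints 1 even after repeated runs
```

After restarting the node, subsequent LEAK warnings include access records showing where the ByteBuf was last touched, which is what the linked netty.io wiki page walks through.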

Related

intel SPDK ioat example fail to run

I am new to the Intel SPDK and ran into some problems when running the example code.
I set up the BIOS as this page said:
Intel® Hyper-Threading Technology off
Intel SpeedStep® technology enabled
Intel® Turbo Boost Technology disabled
Then I cloned the repository from this page and ran all the commands. The test command ./test/unit/unittest.sh returned "All unit tests passed".
But when I run the example examples/ioat/verify/verify, it returns:
EAL: 24 hugepages of size 1073741824 reserved, but no mounted hugetlbfs found for that size
Starting SPDK v18.10-pre / DPDK 18.05.0 initialization...
[ DPDK EAL parameters: verify --no-shconf -c 0x1 --legacy-mem --file-prefix=spdk_pid3170 ]
EAL: Detected 16 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/spdk_pid3170/mp_socket
EAL: 24 hugepages of size 1073741824 reserved, but no mounted hugetlbfs found for that size
EAL: Probing VFIO support...
User configuration:
Run time: 10 seconds
Core mask: 0x1
Queue depth: 32
Not enough ioat channels found. Check that ioat channels are bound
to uio_pci_generic or vfio-pci. scripts/setup.sh can help with this.
and scripts/setup.sh status shows
Hugepages
node hugesize free / total
node0 1048576kB 24 / 24
node0 2048kB 0 / 800
node1 1048576kB 0 / 0
node1 2048kB 0 / 224
NVMe devices
BDF Numa Node Driver name Device name
I/OAT DMA
BDF Numa Node Driver Name
virtio
BDF Numa Node Driver Name Device Name
My hardware is:
linux kernel version 4.15.7
with ioatdma compile as module
CPU intel Xeon E5-2695
chipset C612
It would be a great help if somebody could give me some advice or point me to some resources about SPDK!
Thank you!
Run ./scripts/setup.sh (with no parameters). If no ioat devices show up under the I/OAT DMA section afterwards, you can't run this app. Also, there are currently no hugetlbfs mount points.
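On the "no mounted hugetlbfs found for that size" part: the 24 reserved 1 GiB pages also need a hugetlbfs mount of that page size before DPDK can back memory with them. A sketch of the commands follows, printed rather than executed so they can be reviewed and then run as root on the target box (the /dev/hugepages1G path is just a conventional choice):

```shell
# Sketch: commands to give the 24 reserved 1 GiB hugepages a hugetlbfs mount.
# Printed instead of executed; run them as root on the actual machine.
hugetlbfs_1g_mount_cmds() {
    echo "mkdir -p /dev/hugepages1G"
    echo "mount -t hugetlbfs -o pagesize=1G none /dev/hugepages1G"
}
hugetlbfs_1g_mount_cmds
```

Afterwards `mount | grep hugetlbfs` should list the new mount, and the EAL warning about the missing 1 GiB hugetlbfs should disappear.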

TCP/UDP packets not reaching docker container

My host machine OS is OEL7 with kernel
Linux ispaaaems1 3.10.0-123.el7.x86_64 #1 SMP Wed Jul 9 18:59:11 PDT 2014 x86_64 x86_64 x86_64 GNU/Linux
And my docker info is
Containers: 4
Images: 124
Storage Driver: devicemapper
Pool Name: docker-253:0-88356-pool
Pool Blocksize: 65.54 kB
Backing Filesystem: xfs
Data file:
Metadata file:
Data Space Used: 7.43 GB
Data Space Total: 107.4 GB
Data Space Available: 99.94 GB
Metadata Space Used: 9.302 MB
Metadata Space Total: 2.147 GB
Metadata Space Available: 2.138 GB
Udev Sync Supported: true
Library Version: 1.02.107-RHEL7 (2015-12-01)
Execution Driver: native-0.2
Kernel Version: 3.10.0-123.el7.x86_64
Operating System: Oracle Linux Server 7.2
CPUs: 2
Total Memory: 7.641 GiB
Name: ispaaaems1
ID: 6MUK:HS3D:OQTS:QMWY:WCKE:AZT6:COJP:F7EA:RPNX:7RHY:TKFB:D4LT
I am running a Docker container with OS OEL 6.6. I am sending RADIUS requests on ports 1812-1813. All the packets reach the host machine, but a few of them (3 out of 5) are getting dropped and never reach the container.
Any help will be appreciated. Thanks in advance.
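One cause worth ruling out before digging into the network stack: RADIUS on 1812/1813 is UDP, and Docker's -p flag publishes TCP unless /udp is spelled out. A sketch of a run command with explicit UDP mappings (the image and container names are placeholders, not taken from the question):

```shell
# Sketch: build a docker run command with explicit /udp port mappings.
# Printed rather than executed; image/container names are placeholders.
cmd='docker run -d --name radius-test \
  -p 1812:1812/udp -p 1813:1813/udp \
  oraclelinux:6.6'
printf '%s\n' "$cmd"
# After starting, `docker port radius-test` should list both /udp mappings.
```

If the ports were already published as UDP, the next things to check would be host firewall rules (iptables on OEL7) and whether the intermittent drops correlate with conntrack table pressure.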

Xen PV VM uses max 1 Thread

I am running a CPU benchmark tool (LINPACK) on a PV virtual machine under Xen 4.5.1 on Ubuntu 15.10 x64 on an IBM x3550 M4 server. This tool should consume all available CPU cycles. I allocate 4 vCPUs in the Xen PV config (test.cfg). However, LINPACK detects only 1 core and 4 threads, while it should detect at least 4 cores:
CPU frequency: 2.494 GHz
Number of CPUs: 1
Number of cores: 1
Number of threads: 4
This is what lscpu says inside this Xen PV VM:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 4
Core(s) per socket: 1
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 62
Model name: Intel(R) Xeon(R) CPU E5-2609 v2 @ 2.50GHz
Stepping: 4
CPU MHz: 2500.062
BogoMIPS: 5000.12
Hypervisor vendor: Xen
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 10240K
NUMA node0 CPU(s): 0-3
Other platforms such as Docker and HVM DO get cores allocated inside the virtual node (see below). These nodes have significantly better performance than the Xen PV virtual node.
CPU frequency: 2.499 GHz
Number of CPUs: 2
Number of cores: 8
Number of threads: 4
This is lscpu on the Xen host machine (Dom0):
lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 2
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 62
Model name: Intel(R) Xeon(R) CPU E5-2609 v2 @ 2.50GHz
Stepping: 4
CPU MHz: 2500.062
BogoMIPS: 5000.12
Hypervisor vendor: Xen
Virtualization type: none
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 10240K
NUMA node0 CPU(s): 0-7
Xen vCPU list:
xl vcpu-list
Domain-0 0 0 7 -b- 10.1 all / all
Domain-0 0 1 2 -b- 6.5 all / all
Domain-0 0 2 4 r-- 2.6 all / all
Domain-0 0 3 0 r-- 3.9 all / all
Domain-0 0 4 3 -b- 4.4 all / all
Domain-0 0 5 6 -b- 2.6 all / all
Domain-0 0 6 5 -b- 4.7 all / all
Domain-0 0 7 7 -b- 2.9 all / all
test 3 0 1 -b- 1.5 0-3 / all
test 3 1 0 -b- 1.8 0-3 / all
test 3 2 0 -b- 0.7 0-3 / all
test 3 3 2 -b- 0.6 0-3 / all
xen DomU PV VM config:
cat test.cfg
bootloader = '/usr/lib/xen-4.5/bin/pygrub'
vcpus = '4'
memory = '2048'
cpus = "0-3"
Are there any options to give the paravirtualized guest the host's CPU topology? In other words, how do I get Xen to expose more vCPU cores / vCPUs?
Thanks!
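Since the guest reports 1 core with 4 threads, the benchmark may simply be capping itself at one compute thread per detected core, rather than Xen withholding vCPUs. A quick sanity check inside the guest (a sketch; assumes a Linux guest with /proc mounted):

```shell
# Sketch: confirm all vCPUs are actually online and schedulable inside the
# guest, independent of the core/thread split that lscpu reports.
nproc                                # CPUs the scheduler can use
grep -c '^processor' /proc/cpuinfo   # CPUs the kernel brought up
# If the benchmark is OpenMP-based, forcing its thread count may bypass its
# own topology detection (an assumption about the tool, not a documented flag):
#   OMP_NUM_THREADS=4 ./xlinpack_xeon64
```

If both numbers print 4, the scheduler can use all four vCPUs and the 1-core/4-thread report is only a topology presentation issue of the PV guest, not a capacity one.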

NUMA support on which CPU? What are the current server configuration of this kind of CPU?

Which CPUs support NUMA? What does a current server configuration with this kind of CPU look like? And which Linux commands deal with NUMA, i.e. how do I enable and inspect it?
This is going to depend on your server and whether it uses a multicore CPU that supports NUMA affinity. Run numactl --hardware and you will see the current configuration, for example:
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7
node 0 size: 32733 MB
node 0 free: 4027 MB
node 1 cpus: 8 9 10 11 12 13 14 15
node 1 size: 32767 MB
node 1 free: 20898 MB
node distances:
node 0 1
0: 10 21
1: 21 10
If you want to check performance with your application, just make sure it is using CPUs from the same NUMA node. You can verify this with the ps aux or top commands.
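To tell whether a process is staying on one node's CPUs, ps can print the processor each task last ran on; matched against the "node N cpus:" lines above, that gives you the NUMA node. A sketch using the current shell's own PID as a stand-in for a real workload:

```shell
# Sketch: the PSR column is the processor the task last ran on; compare it
# against the "node N cpus:" list from numactl --hardware to find its node.
ps -o pid,psr,comm -p $$
# To force node-local CPU and memory placement (assumes numactl is installed;
# ./my_app is a placeholder for your workload):
#   numactl --cpunodebind=0 --membind=0 ./my_app
```

Keeping a memory-bound workload on one node avoids the cross-node hop visible in the distance table above (21 vs. 10).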

Cannot Understand the TOP command output on Hadoop Datanode

Hi, I just installed Cloudera Manager on my cluster: 1 namenode and 4 datanodes, each datanode with 64 GB RAM, a 24-core Xeon CPU, 16 1 TB SAS disks, etc.
I installed a brand new Red Hat Linux and upgraded it to 6.5; each disk has been set up as a single-disk RAID0 since there is no JBOD option available on the array controller.
I am running a Hive query, and here is the top output on the datanode. I am confused and wondering if an experienced Hadoop admin could help me understand whether my cluster is working fine.
Why is there only 1 task running out of 897 while the other 896 are sleeping? There are 2271 mappers for that Hive query and it is only at 80% on the mapper side.
The load average is 8.66. I read that if your computer is working hard, the load average should be around the number of cores. Is my datanode working hard enough?
69 of 70 GB of memory has been "used", yet the active YARN processes seem to have fairly low memory cost. How could those 64 GB of memory be so easily used up?
Here is the top output:
top - 22:50:24 up 1 day, 8:24, 3 users, load average: 8.66, 8.50, 7.95
Tasks: 897 total, 1 running, 896 sleeping, 0 stopped, 0 zombie
Cpu(s): 32.3%us, 5.2%sy, 0.0%ni, 62.3%id, 0.2%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 70096068k total, 69286800k used, 809268k free, 222268k buffers
Swap: 4194296k total, 0k used, 4194296k free, 61468376k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
439 yarn 20 0 1417m 591m 19m S 193.9 0.9 1:06.12 java
561 yarn 20 0 1401m 581m 19m S 193.2 0.8 0:19.75 java
721 yarn 20 0 1415m 561m 19m S 172.0 0.8 0:08.54 java
611 yarn 20 0 1415m 574m 19m S 127.0 0.8 0:16.87 java
354 yarn 20 0 1428m 595m 19m S 121.4 0.9 0:35.96 java
27418 yarn 20 0 1513m 483m 18m S 13.6 0.7 18:26.14 java
16895 hdfs 20 0 1438m 410m 18m S 9.6 0.6 103:23.70 java
3726 hdfs 20 0 860m 249m 21m S 1.7 0.4 2:12.28 java
I am fairly new at system administration, and any metric tool or common-sense explanation will be much appreciated! Thanks!
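Two of the concerns above can be answered from the top output itself: Linux counts page cache as "used" memory (note the 61468376k cached on the Swap line), and the load average should be read against the 24 cores. A sketch redoing the arithmetic with the numbers copied from the output:

```shell
# Sketch: numbers copied from the top output above (memory values in kB).
total=70096068; used=69286800; buffers=222268; cached=61468376
# Memory that is really available = free + buffers + page cache,
# because the kernel reclaims cache on demand:
avail_kb=$(( total - used + buffers + cached ))
echo "$(( avail_kb / 1024 )) MB effectively available"
# Load average vs. cores: 8.66 on a 24-core box is moderate, not saturated.
awk 'BEGIN { printf "load is %.0f%% of core capacity\n", 100 * 8.66 / 24 }'
```

So the box is nowhere near out of memory; it is HDFS block reads filling the page cache, which is desirable. As for 1 running task out of 897: top counts every process on the host, and nearly all are legitimately asleep waiting on I/O or events, while the java processes near 193% CPU are the multi-threaded YARN containers doing the actual mapper work.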
