Updating topology failed in Heron cluster - heron

When I tried to update a topology running on the Heron cluster, it failed with the following message.
The output of the update command with --verbose is as follows:
[2018-07-03 12:07:27 +0800] [FINE] com.twitter.heron.scheduler.RuntimeManagerMain: Exception when submitting topology
com.twitter.heron.spi.packing.PackingException: Could not initialize containers using existing packing plan
at com.twitter.heron.packing.builder.PackingPlanBuilder.initContainers(PackingPlanBuilder.java:259)
at com.twitter.heron.packing.builder.PackingPlanBuilder.addInstance(PackingPlanBuilder.java:153)
at com.twitter.heron.packing.builder.PackingPlanBuilder.addInstance(PackingPlanBuilder.java:141)
at com.twitter.heron.packing.binpacking.FirstFitDecreasingPacking.placeFFDInstance(FirstFitDecreasingPacking.java:312)
at com.twitter.heron.packing.binpacking.FirstFitDecreasingPacking.assignInstancesToContainers(FirstFitDecreasingPacking.java:265)
at com.twitter.heron.packing.binpacking.FirstFitDecreasingPacking.getFFDAllocation(FirstFitDecreasingPacking.java:246)
at com.twitter.heron.packing.binpacking.FirstFitDecreasingPacking.repack(FirstFitDecreasingPacking.java:180)
at com.twitter.heron.scheduler.RuntimeManagerRunner.buildNewPackingPlan(RuntimeManagerRunner.java:304)
at com.twitter.heron.scheduler.RuntimeManagerRunner.updateTopologyHandler(RuntimeManagerRunner.java:183)
at com.twitter.heron.scheduler.RuntimeManagerRunner.call(RuntimeManagerRunner.java:81)
at com.twitter.heron.scheduler.RuntimeManagerMain.callRuntimeManagerRunner(RuntimeManagerMain.java:448)
at com.twitter.heron.scheduler.RuntimeManagerMain.manageTopology(RuntimeManagerMain.java:396)
at com.twitter.heron.scheduler.RuntimeManagerMain.main(RuntimeManagerMain.java:317)
Caused by: com.twitter.heron.packing.ResourceExceededException: Insufficient container resources to add instancePlan {component-name: split, task-id: 1, component-index: 0, instance-resource: {cpu: 1.000000, ram: ByteAmount{1 GB (536870912 bytes)}, disk: ByteAmount{1 GB (1073741824 bytes)}}} to container {containerId=1, instances=[{component-name: spout, task-id: 3, component-index: 0, instance-resource: {cpu: 1.000000, ram: ByteAmount{1 GB (536870912 bytes)}, disk: ByteAmount{1 GB (1073741824 bytes)}}}, {component-name: count, task-id: 5, component-index: 1, instance-resource: {cpu: 1.000000, ram: ByteAmount{1 GB (536870912 bytes)}, disk: ByteAmount{1 GB (1073741824 bytes)}}}], capacity={cpu: 2.000000, ram: ByteAmount{4 GB (3758096384 bytes)}, disk: ByteAmount{3 GB (3221225472 bytes)}}, paddingPercentage=10}
at com.twitter.heron.packing.builder.PackingPlanBuilder.getContainers(PackingPlanBuilder.java:392)
at com.twitter.heron.packing.builder.PackingPlanBuilder.initContainers(PackingPlanBuilder.java:256)
... 12 more
Caused by: com.twitter.heron.packing.ResourceExceededException: Adding 1.0 cores to existing 2.0 cores with 10 percent padding would exceed capacity 2.0
at com.twitter.heron.packing.builder.Container.assertHasSpace(Container.java:165)
at com.twitter.heron.packing.builder.Container.add(Container.java:77)
at com.twitter.heron.packing.builder.PackingPlanBuilder.addToContainer(PackingPlanBuilder.java:417)
at com.twitter.heron.packing.builder.PackingPlanBuilder.getContainers(PackingPlanBuilder.java:390)
... 13 more
[2018-07-03 12:07:28 +0000] [ERROR]: Could not initialize containers using existing packing plan
The topology itself is running normally; I don't know what is causing this problem.

The error implies that your config doesn't set the repacking class correctly (see: Creating Packing Class in Heron).
To solve this problem, you will need to add the corresponding entry to your packing.yaml config file; an example is sketched below.
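A minimal sketch of what the packing.yaml entries could look like. The key names and class paths below are taken from the stock Heron configuration files, but verify them against your own Heron version before relying on them:
# packing.yaml (sketch)
# algorithm used when a topology is first submitted
heron.class.packing.algorithm: com.twitter.heron.packing.roundrobin.RoundRobinPacking
# algorithm used when `heron update` re-packs a running topology
heron.class.repacking.algorithm: com.twitter.heron.packing.roundrobin.RoundRobinPacking
After changing the file, re-run the heron update command for the topology.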

Related

JDK 1.8 -XX:+UseLargePages behavior when there's not enough huge pages left on os

I am currently trying to figure out how to use HugePages with JVM applications (Netty, with the -XX:+UseLargePages option enabled, and G1GC).
I have also set the heap and metaspace to the same min and max sizes.
My application looks fine, but I was wondering what happens if there are no free huge pages left on the system, since the JVM also allocates additional native memory for direct memory buffers, etc.
(Assume that the application started up normally and then consumes additional HugePages in the off-heap memory area.)
I've read the following page, but it does not describe the behavior when the JVM fails to allocate huge pages:
https://www.oracle.com/java/technologies/javase/largememory-pages.html
I use CentOS 7 and OpenJDK 1.8.0_151-b12 for the testbed before deployment.
If allocating large pages fails, OpenJDK 8 or later falls back to allocating regular pages.
src/hotspot/share/memory/virtualspace.cpp:
if (base != NULL) {
  [...]
} else {
  // failed; try to reserve regular memory below
  if (UseLargePages && (!FLAG_IS_DEFAULT(UseLargePages) ||
                        !FLAG_IS_DEFAULT(LargePageSizeInBytes))) {
    log_debug(gc, heap, coops)("Reserve regular memory without large pages");
  }
}
All GC implementations use the ReservedSpace helper for allocating memory, so this is not GC-specific.
You can easily test that behavior on Linux by restricting available large pages:
$ echo 16 > /proc/sys/vm/nr_hugepages
$ cat /proc/meminfo | grep HugePages
AnonHugePages: 40960 kB
HugePages_Total: 16
HugePages_Free: 16
HugePages_Rsvd: 0
HugePages_Surp: 0
$ java -XX:+UseLargePages Test
OpenJDK 64-Bit Server VM warning: Failed to reserve large pages memory req_addr: 0x0000000000000000 bytes: 251658240 (errno = 12).
OpenJDK 64-Bit Server VM warning: Failed to reserve large pages memory req_addr: 0x0000000707c00000 bytes: 4164943872 (errno = 12).
OpenJDK 64-Bit Server VM warning: Failed to reserve large pages memory req_addr: 0x0000000000000000 bytes: 67108864 (errno = 12).
OpenJDK 64-Bit Server VM warning: Failed to reserve large pages memory req_addr: 0x0000000000000000 bytes: 67108864 (errno = 12).
$ echo $?
0
strace confirms the failed allocation attempt and the successful retry with the same size but without MAP_HUGETLB:
11631 mmap(NULL, 251658240, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0) = -1 ENOMEM (Cannot allocate memory)
11631 mmap(NULL, 251658240, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7f35d489c000
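For completeness, the Test class used in the session above is nothing special; any small program that starts a JVM (and ideally touches some heap) reproduces the warnings. A minimal, hypothetical version:
// Test.java -- hypothetical minimal program used only to start the JVM
// with -XX:+UseLargePages and observe the fallback behavior
public class Test {
    public static void main(String[] args) {
        byte[][] blocks = new byte[64][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[1024 * 1024]; // touch ~64 MB of heap
        }
        System.out.println("allocated " + blocks.length + " MB");
    }
}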

How To Figure Out Cassandra Extra Off-Heap Memory Usage

I have nodes with 32GB of RAM and I set a 20GB heap size. I am aware that Cassandra uses off-heap memory for memtables, caches, etc. Even though the reported usage by memtables, caches, etc. is very low, Cassandra somehow uses 27GB of memory. How can I figure out how Cassandra is using that extra 7GB?
You can get the metrics with JConsole.
Or you can use the Jolokia agent. For that, download and set up the Jolokia agent on your Cassandra node, e.g. on Linux:
mkdir /opt/jolokia
cd /opt/jolokia
wget https://github.com/rhuss/jolokia/releases/download/v1.4.0/jolokia-1.4.0-bin.tar.gz
tar -xf jolokia-1.4.0-bin.tar.gz
Add the agent path as a JVM option to the end of your cassandra-env.sh file:
echo 'JVM_OPTS="$JVM_OPTS -javaagent:/opt/jolokia/jolokia-1.4.0/agents/jolokia-jvm.jar"' >> /etc/conf/cassandra/cassandra-env.sh
Restart Cassandra.
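Once Cassandra is back up, you can check that the agent is answering; it listens on port 8778 by default, which the queries below assume:
wget -qO- http://localhost:8778/jolokia/version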
Then you can query metrics like:
Total on heap memory:
wget http://localhost:8778/jolokia/read/org.apache.cassandra.metrics:type=Memory/HeapMemoryUsage
Total off heap memory:
wget http://localhost:8778/jolokia/read/org.apache.cassandra.metrics:type=Memory/NonHeapMemoryUsage
Off heap memory used by memtables:
wget http://localhost:8778/jolokia/read/org.apache.cassandra.metrics:type=Table,keyspace=*,scope=*,name=MemtableOffHeapSize
And also for Bloomfilter, IndexSummary and Compression metadata:
wget http://localhost:8778/jolokia/read/org.apache.cassandra.metrics:type=Table,keyspace=*,scope=*,name=BloomFilterOffHeapMemoryUsed
wget http://localhost:8778/jolokia/read/org.apache.cassandra.metrics:type=Table,keyspace=*,scope=*,name=IndexSummaryOffHeapMemoryUsed
wget http://localhost:8778/jolokia/read/org.apache.cassandra.metrics:type=Table,keyspace=*,scope=*,name=CompressionMetadataOffHeapMemoryUsed
UPDATE:
Example response from Jolokia endpoint:
{
"request":{
"mbean":"org.apache.cassandra.metrics:keyspace=*,name=CompressionMetadataOffHeapMemoryUsed,scope=*,type=Table",
"type":"read"
},
"value":{
"org.apache.cassandra.metrics:keyspace=my_keyspace,name=CompressionMetadataOffHeapMemoryUsed,scope=my_table_name,type=Table":{
"Value":832
},
"org.apache.cassandra.metrics:keyspace=system,name=CompressionMetadataOffHeapMemoryUsed,scope=compaction_history,type=Table":{
"Value":64
},
"org.apache.cassandra.metrics:keyspace=my_keyspace,name=CompressionMetadataOffHeapMemoryUsed,scope=my_table_name2,type=Table":{
"Value":8184
},
...
}
}
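To turn such a response into a single number per metric across all tables, you can sum the Value fields. A small sketch, assuming the jq utility is installed on the node:
wget -qO- 'http://localhost:8778/jolokia/read/org.apache.cassandra.metrics:type=Table,keyspace=*,scope=*,name=MemtableOffHeapSize' | jq '[.value[].Value] | add'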

Hadoop error log jvm sqoop

My mistake: after 6-8 hours of running my Java programs I get this hs_err_pid6662.log
and this:
[testuser#apus ~]$ sh /home/progr/work/import.sh
/usr/bin/hadoop: fork: retry: Resource temporarily unavailable
/usr/bin/hadoop: fork: retry: Resource temporarily unavailable
/usr/bin/hadoop: fork: retry: Resource temporarily unavailable
/usr/bin/hadoop: fork: retry: Resource temporarily unavailable
/usr/bin/hadoop: fork: Resource temporarily unavailable
The programs run every five minutes and try to import/export from Oracle.
How can I fix this?
# There is insufficient memory for the Java Runtime Environment to continue.
# Cannot create GC thread. Out of system resources.
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32 bit mode, the process size limit was hit
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
# Out of Memory Error (gcTaskThread.cpp:48), pid=6662, tid=0x00007f429a675700
#
--------------- T H R E A D ---------------
Current thread (0x00007f4294019000): JavaThread "Unknown thread" [_thread_in_vm, id=6696, stack(0x00007f429a575000,0x00007f429a676000)]
Stack: [0x00007f429a575000,0x00007f429a676000], sp=0x00007f429a674550, free space=1021k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
VM Arguments:
jvm_args: -Xmx1000m -Dhadoop.log.dir=/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/hadoop -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -
Launcher Type: SUN_STANDARD
Environment Variables:
JAVA_HOME=/usr/java/jdk1.8.0_102
# JRE version: (8.0_102-b14) (build )
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.102-b14 mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
Memory: 4k page, physical 24591972k(6051016k free), swap 12369916k(11359436k free)
I am running programs like sqoop-import and sqoop-export from Java every 5 minutes.
Example:
#!/bin/bash
hadoop jar /home/progr/import_sqoop/oracle.jar.
CDH version: 5.11.1
Java version: jdk1.8.0_102
OS: Red Hat Enterprise Linux Server release 6.9 (Santiago)
Free memory:
total used free shared buffers cached
Mem: 24591972 20080336 4511636 132036 334456 2825792
-/+ buffers/cache: 16920088 7671884
Swap: 12369916 1008664 11361252
(screenshot: Host Memory Usage)
The maximum heap memory is (by default) limited to 1GB. You need to increase this:
JRE version: (8.0_102-b14) (build )
jvm_args: -Xmx1000m -Dhadoop.log.dir=/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/hadoop -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -
Try the following to increase this to 2048MB (or higher if required):
export HADOOP_CLIENT_OPTS="-Xmx2048m ${HADOOP_CLIENT_OPTS}"
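For example, the export can go at the top of the import.sh wrapper so every scheduled run picks it up (a sketch; keep your existing hadoop jar invocation unchanged):
#!/bin/bash
# give the hadoop/sqoop client JVM a larger heap before launching the job
export HADOOP_CLIENT_OPTS="-Xmx2048m ${HADOOP_CLIENT_OPTS}"
# ... existing hadoop jar /home/progr/import_sqoop/oracle.jar invocation here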
Reference:
Pig: Hadoop jobs Fail
https://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-user/201104.mbox/%3C5FFFF0E4-B3BA-420A-ADE3-B422A66E8B11#yahoo-inc.com%3E

Heketi can't provision a volume for Heketi database

I'm trying to set up a GlusterFS cluster with Heketi for Kubernetes persistent volumes. I have 3 nodes in the gluster cluster:
heketi-cli node list
Id:242e801e6eeb7ec10acda60a409b5d98 Cluster:fd539c5d13b6229498c6c67ac491163d
Id:439fb090888a745633f9db6ac4d243b8 Cluster:fd539c5d13b6229498c6c67ac491163d
Id:5e9b7e5f3ec33c77c42437e89ca857a3 Cluster:fd539c5d13b6229498c6c67ac491163d
But when I try to provision a volume for the Heketi database using the command:
heketi-cli setup-openshift-heketi-storage
I get an error:
Error: No space
But I have enough free space on my devices:
Devices:
Id:931b4f87e3675368a4f737ed6862e0cf Name:/dev/sdb State:online Size (GiB):29 Used (GiB):0 Free (GiB):29
Devices:
Id:3a2a30b22ade4efca7949e9cc082b685 Name:/dev/sdb State:online Size (GiB):29 Used (GiB):0 Free (GiB):29
Devices:
Id:5d1b5c7b258c52569bff1e1c720015c5 Name:/dev/sdb State:online Size (GiB):29 Used (GiB):0 Free (GiB):29
What can be the reason for this strange behavior?
Sorry, I have found the reason: the number of gluster nodes has to match the number of gluster instances in Kubernetes. Previously I had only 3 gluster nodes but 4 gluster instances in Kubernetes.
There can be a number of problems that lead to this error message. The 2 most common ones are:
You do not have the minimum of 3 nodes in your gluster cluster
The heketi-cli setup-openshift-heketi-storage command needs to create a volume for heketi's database. That volume is now 2GB by default, but it used to be 32GB(!) (see heketi issue #639). So depending on your heketi-cli version it may be trying to create a 32GB volume on your 29GB bricks. Nasty. (Both points can be confirmed with the topology check below.)
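To confirm the node count and the free space on each device in one place, heketi-cli can print the full topology; a quick check, assuming the same heketi-cli used above:
$ heketi-cli topology info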
I suggest you look at the logs of heketi:
$ kubectl get pod -l name=heketi
NAME READY STATUS RESTARTS AGE
heketi-703226055-7g3hb 1/1 Running 0 18h
$ kubectl logs heketi-703226055-7g3hb -f
Heketi v3.0.0-111-gc5f0f58
[heketi] INFO 2017/02/14 22:17:53 Loaded kubernetes executor
...

Spark - UbuntuVM - insufficient memory for the Java Runtime Environment

I'm trying to install Spark 1.5.1 on an Ubuntu 14.04 VM. After un-tarring the file, I changed into the extracted directory and executed "./bin/pyspark", which should fire up the pyspark shell. But I got the following error message:
[ OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000c5550000, 715849728, 0) failed;
error='Cannot allocate memory' (errno=12) There is insufficient
memory for the Java Runtime Environment to continue.
Native memory allocation (malloc) failed to allocate 715849728 bytes
for committing reserved memory.
An error report file with more information is saved as:
/home/datascience/spark-1.5.1-bin-hadoop2.6/hs_err_pid2750.log ]
Could anyone please give me some directions to sort out the problem?
You need to set the Spark memory options in the conf/spark-defaults.conf file to values your machine can accommodate; the example below lowers spark.driver.memory (spark.executor.memory can be set the same way). For example:
usr1#host:~/spark-1.6.1$ cp conf/spark-defaults.conf.template conf/spark-defaults.conf
nano conf/spark-defaults.conf
spark.driver.memory 512m
For more information, refer to the official documentation: http://spark.apache.org/docs/latest/configuration.html
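Equivalently, the same limit can be passed on the command line when starting the shell, without editing the config file (adjust the value to what your VM can hold):
$ ./bin/pyspark --driver-memory 512m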
Pretty much what the message says: the JVM failed to commit roughly 716MB (715849728 bytes), so the VM does not have enough free memory. Give the VM more RAM (~8GB is comfortable).
