I just installed Oracle Coherence 3.6 on RHEL 5.5. When I execute cache-server.sh I get a lot of GC warnings about allocating large blocks and then it fails with a segmentation fault. Suggestions? Here is the stack:
GC Warning: Repeated allocation of very large block (appr. size 1024000):
May lead to memory leak and poor performance.
GC Warning: Repeated allocation of very large block (appr. size 1024000):
May lead to memory leak and poor performance.
./bin/cache-server.sh: line 24: 6142 Segmentation fault $JAVAEXEC -server -showversion $JAVA_OPTS -cp "$COHERENCE_HOME/lib/coherence.jar" com.tangosol.net.DefaultCacheServer $1
[root@localhost coherence_3.6]# swapon -s
Filename Type Size Used Priority
/dev/mapper/VolGroup00-LogVol01 partition 2097144 0 -1
[root@localhost coherence_3.6]# free
total used free shared buffers cached
Mem: 3631880 662792 2969088 0 142636 353244
-/+ buffers/cache: 166912 3464968
Swap: 2097144 0 2097144
[root@localhost coherence_3.6]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
147G 6.7G 133G 5% /
/dev/sda1 99M 12M 82M 13% /boot
tmpfs 1.8G 0 1.8G 0% /dev/shm
/dev/hdb 2.8G 2.8G 0 100% /media/RHEL_5.5 Source
/dev/hda 57M 57M 0 100% /media/VBOXADDITIONS_4.2.16_86992
[root@localhost coherence_3.6]#
I haven't seen this issue before, but to start, I'd suggest the following:
Check for Linux updates. Recent JVMs, for example, try to use large pages, and there have been bugs in RHEL related to large pages that are fixed in the latest updates (see the version-check commands below).
Download the latest Java 7 JDK. While no JDK is entirely bug-free, we have done extensive testing with JDK 7 patch levels 15, 21 and 40.
Download the latest version of Coherence. Coherence 12.1.2 is now out, but if you don't want to go for the very latest, then Coherence 3.7.1 is the suggested version. (The release after 3.7.1 is called 12.1.2. That is to align with Oracle versioning.)
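For reference, a quick way to see what you are currently running before updating (the commands are standard on RHEL, but the output will obviously vary by system):
$ uname -r                 # running kernel version
$ cat /etc/redhat-release  # RHEL release level
$ rpm -q kernel            # installed kernel packages
$ java -version            # JDK vendor, version and patch level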
I would check your space allocation on disk and memory/swap. You are probably running out of space somewhere.
df -h
free
You could also check your Java version - make sure that you are well patched.
Are you using Java 6 or Java 7?
There are Oracle forums for Coherence; you should ask the question there, as that's where the real experts hang out.
I am currently trying to work out how to optimize HugePages usage for a JVM application that uses Netty, with the -XX:+UseLargePages option enabled and G1GC.
I have also set the minimum and maximum sizes of the heap and metaspace to the same values.
My application looks fine, but I was wondering what happens if there are no free huge pages left on the system, since the JVM also uses additional native memory to allocate direct memory buffers, etc.
(Assume that the application started up normally and consumes additional HugePages in off-heap memory.)
I've read the following page, but it doesn't describe the behavior when the JVM fails to allocate huge pages:
https://www.oracle.com/java/technologies/javase/largememory-pages.html
I use CentOS 7 and OpenJDK 1.8.0_151-b12 for the testbed before deployment.
If allocating large pages fails, OpenJDK 8 or later falls back to allocating regular pages.
src/hotspot/share/memory/virtualspace.cpp:
if (base != NULL) {
  [...]
} else {
  // failed; try to reserve regular memory below
  if (UseLargePages && (!FLAG_IS_DEFAULT(UseLargePages) ||
                        !FLAG_IS_DEFAULT(LargePageSizeInBytes))) {
    log_debug(gc, heap, coops)("Reserve regular memory without large pages");
  }
}
All GC implementations use the ReservedSpace helper for allocating memory, so this is not GC-specific.
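On JDK 9 or later (which is where the unified-logging code above comes from) you can watch for that fallback message directly; the exact tag selection below is my assumption based on the log_debug(gc, heap, coops) call in the snippet:
$ java -Xlog:gc+heap+coops=debug -XX:+UseLargePages -Xmx4g -version
If the reservation with large pages fails, the "Reserve regular memory without large pages" line from the snippet is printed and startup continues with regular pages.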
You can easily test that behavior on Linux by restricting available large pages:
$ echo 16 > /proc/sys/vm/nr_hugepages
$ cat /proc/meminfo | grep HugePages
AnonHugePages: 40960 kB
HugePages_Total: 16
HugePages_Free: 16
HugePages_Rsvd: 0
HugePages_Surp: 0
$ java -XX:+UseLargePages Test
OpenJDK 64-Bit Server VM warning: Failed to reserve large pages memory req_addr: 0x0000000000000000 bytes: 251658240 (errno = 12).
OpenJDK 64-Bit Server VM warning: Failed to reserve large pages memory req_addr: 0x0000000707c00000 bytes: 4164943872 (errno = 12).
OpenJDK 64-Bit Server VM warning: Failed to reserve large pages memory req_addr: 0x0000000000000000 bytes: 67108864 (errno = 12).
OpenJDK 64-Bit Server VM warning: Failed to reserve large pages memory req_addr: 0x0000000000000000 bytes: 67108864 (errno = 12).
$ echo $?
0
strace confirms the failed allocation attempt and the successful retry with the same size but without MAP_HUGETLB:
11631 mmap(NULL, 251658240, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0) = -1 ENOMEM (Cannot allocate memory)
11631 mmap(NULL, 251658240, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7f35d489c000
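To double-check whether a running JVM actually ended up on explicit huge pages or fell back to regular pages, something like the following works on Linux (the pgrep pattern matching the Test class above is just an assumption about how the process was started):
$ pid=$(pgrep -f 'java.*Test' | head -n 1)
$ grep HugePages /proc/meminfo   # HugePages_Free/HugePages_Rsvd only drop if the heap really uses huge pages
$ awk '/KernelPageSize:/ && $2 == 2048 {n++} END {print n+0, "mappings backed by 2 MB pages"}' /proc/$pid/smaps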
I am running an MPI job on a Linux server and I get this error:
--------------------------------------------------------------------------
The OpenFabrics (openib) BTL failed to initialize while trying to
allocate some locked memory. This typically can indicate that the
memlock limits are set too low. For most HPC installations, the
memlock limits should be set to "unlimited". The failure occured
here:
Local host: yw0431
OMPI source: ../../../../../ompi/mca/btl/openib/btl_openib_component.c:1216
Function: ompi_free_list_init_ex_new()
Device: mlx4_0
Memlock limit: 65536
You may need to consult with your system administrator to get this
problem fixed. This FAQ entry on the Open MPI web site may also be
helpful:
http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.
Local host: yw0431
Local device: mlx4_0
--------------------------------------------------------------------------
[yw0431:20193] 11 more processes have sent help message help-mpi-btl-openib.txt / init-fail-no-mem
[yw0431:20193] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[yw0431:20193] 11 more processes have sent help message help-mpi-btl-openib.txt / error in device init
forrtl: error (78): process killed (SIGTERM)
It means that my Linux server has locked memory limited to 65M, but my job needs more memory; I think 2G should be enough.
I have found a suggested solution, which is to raise the limit:
ulimit -l unlimited
But I am worried that this could crash the system or cause other problems.
So, is it safe to set "ulimit -l unlimited"?
If you set the ulimit to unlimited and your process starts consuming memory exhaustively, the OOM killer will kill your job to keep the system stable. I would set the ulimit to 80-90% of RAM instead of unlimited.
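If you do raise the limit, the usual way to make it persistent is an entry in /etc/security/limits.conf (or a drop-in under /etc/security/limits.d/); the user name and value below are placeholders, and the value is in KB (29360128 KB = 28 GB):
# /etc/security/limits.d/95-memlock.conf (hypothetical example)
mpiuser   soft   memlock   29360128
mpiuser   hard   memlock   29360128
Log out and back in (or restart the daemons that spawn the MPI processes), then verify the new limit with:
$ ulimit -l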
After 6-8 hours of running Java programs I get the log hs_err_pid6662.log (excerpted below)
and this:
[testuser@apus ~]$ sh /home/progr/work/import.sh
/usr/bin/hadoop: fork: retry: Resource temporarily unavailable
/usr/bin/hadoop: fork: retry: Resource temporarily unavailable
/usr/bin/hadoop: fork: retry: Resource temporarily unavailable
/usr/bin/hadoop: fork: retry: Resource temporarily unavailable
/usr/bin/hadoop: fork: Resource temporarily unavailable
The programs run every five minutes and try to import from / export to Oracle.
How can I fix this? Here is the relevant part of hs_err_pid6662.log:
# There is insufficient memory for the Java Runtime Environment to continue.
# Cannot create GC thread. Out of system resources.
# Possible reasons:
# The system is out of physical RAM or swap space
# In 32 bit mode, the process size limit was hit
# Possible solutions:
# Reduce memory load on the system
# Increase physical memory or swap space
# Check if swap backing store is full
# Use 64 bit Java on a 64 bit OS
# Decrease Java heap size (-Xmx/-Xms)
# Decrease number of Java threads
# Decrease Java thread stack sizes (-Xss)
# Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
# Out of Memory Error (gcTaskThread.cpp:48), pid=6662, tid=0x00007f429a675700
#
--------------- T H R E A D ---------------
Current thread (0x00007f4294019000): JavaThread "Unknown thread" [_thread_in_vm, id=6696, stack(0x00007f429a575000,0x00007f429a676000)]
Stack: [0x00007f429a575000,0x00007f429a676000], sp=0x00007f429a674550, free space=1021k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
VM Arguments:
jvm_args: -Xmx1000m -Dhadoop.log.dir=/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/hadoop -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -
Launcher Type: SUN_STANDARD
Environment Variables:
JAVA_HOME=/usr/java/jdk1.8.0_102
# JRE version: (8.0_102-b14) (build )
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.102-b14 mixed mode linux-amd64 compressed oops)
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
Memory: 4k page, physical 24591972k(6051016k free), swap 12369916k(11359436k free)
I am running programs like sqoop-import and sqoop-export (Java) every 5 minutes.
example:
#!/bin/bash
hadoop jar /home/progr/import_sqoop/oracle.jar.
CDH version 5.11.1
java version jdk1.8.0_102
OS: Red Hat Enterprise Linux Server release 6.9 (Santiago)
Mem free:
total used free shared buffers cached
Mem: 24591972 20080336 4511636 132036 334456 2825792
-/+ buffers/cache: 16920088 7671884
Swap: 12369916 1008664 11361252
Host Memory Usage (screenshot omitted)
The maximum heap memory is (by default) limited to 1 GB, as your log shows. You need to increase this:
JRE version: (8.0_102-b14) (build )
jvm_args: -Xmx1000m -Dhadoop.log.dir=/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/cloudera/parcels/CDH-5.11.1-1.cdh5.11.1.p0.4/lib/hadoop -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -
Try the following to increase this to 2048 MB (or higher if required):
export HADOOP_CLIENT_OPTS="-Xmx2048m ${HADOOP_CLIENT_OPTS}"
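For example, the import.sh wrapper from the question could export the setting right before calling hadoop (a sketch; the jar path is taken from the question):
#!/bin/bash
# Raise the client-side JVM heap before invoking hadoop.
export HADOOP_CLIENT_OPTS="-Xmx2048m ${HADOOP_CLIENT_OPTS}"
hadoop jar /home/progr/import_sqoop/oracle.jar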
Reference:
Pig: Hadoop jobs Fail
https://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-user/201104.mbox/%3C5FFFF0E4-B3BA-420A-ADE3-B422A66E8B11#yahoo-inc.com%3E
I'm trying to run Jest on Ubuntu 14.04.02, in a virtual machine with 4 GB of RAM, with Node version 0.12.2 and npm 2.0.0-alpha-5.
free shows me:
total used free shared buffers cached
Mem: 3.8G 199M 3.6G 976K 1.1M 18M
When I run npm test, I keep getting a variety of out of memory errors:
Error: FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory
FATAL ERROR: Committing semi space failed. Allocation failed - process out of memory
# Fatal error in ../deps/v8/src/heap/store-buffer.cc, line 132
# CHECK(old_virtual_memory_->Commit(reinterpret_cast<void*>(old_limit_), grow * kPointerSize, false)) failed
Any idea what the minimum memory requirement is, or whether I have misconfigured something that is leading to this?
It turns out that downgrading to Node version 0.10.32, installed via npm, resolved the issue.
Hudson repeatedly fails after building a few projects, with the following stack trace and the error "No space left on device", even though there is enough space on disk. There are no limits or quotas on any folder. Below is the output of several system commands.
Here is key system information:
Hudson ver. 1.361
executable-war /opt/hudson/hudson.war
java.runtime.name OpenJDK Runtime Environment
java.runtime.version 1.6.0_18-b18
os.name Linux-Ubuntu 10.04
os.version 2.6.32-19-generic
There is 50% free space according to df
$ df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda1 147550696 67382688 72672840 49% /
none 1535580 304 1535276 1% /dev
none 1539732 504 1539228 1% /dev/shm
none 1539732 96 1539636 1% /var/run
none 1539732 0 1539732 0% /var/lock
none 1539732 0 1539732 0% /lib/init/rw
none 147550696 67382688 72672840 49% /var/lib/ureadahead/debugfs
Next I checked open file handles with lsof; that is also within limits:
lsof | wc -l
694
Then I checked the file-handle counters with this command:
cat /proc/sys/fs/file-nr
3392 0 306935
Detailed Error
[INFO] No tests to run.
[HUDSON] Recording test results
[INFO] [jar:jar {execution: default-jar}]
[INFO] Building jar: /opt/hudson/jobs/EIF_Branch_R1.2__Projects_PkgOnly/workspace/r1.2/projects/CRMServices/deployment/target/eif.deployment.CRMServices-1.2-bt.jar
[INFO] [antrun:run {execution: default}]
[INFO] Executing tasks
[unzip] Expanding: /opt/hudson/jobs/EIF_Branch_R1.2__Projects_PkgOnly/workspace/r1.2/projects/CRMServices/deployment/src/it/resources/CRMServices.Driver-soapui-project.zip into /opt/hudson/jobs/EIF_Branch_R1.2__Projects_PkgOnly/workspace/r1.2/projects/CRMServices/deployment/src/it/resources
[HUDSON] Archiving /opt/hudson/jobs/EIF_Branch_R1.2__Projects_PkgOnly/workspace/r1.2/projects/CRMServices/deployment/pom.xml to /opt/hudson/jobs/EIF_Branch_R1.2__Projects_PkgOnly/modules/rogers.bt.deployment$eif.deployment.CRMServices/builds/2010-07-20_12-13-58/archive/rogers.bt.deployment/eif.deployment.CRMServices/1.2-bt/pom.xml
[HUDSON] Archiving /opt/hudson/jobs/EIF_Branch_R1.2__Projects_PkgOnly/workspace/r1.2/projects/CRMServices/deployment/target/eif.deployment.CRMServices-1.2-bt.jar to /opt/hudson/jobs/EIF_Branch_R1.2__Projects_PkgOnly/modules/rogers.bt.deployment$eif.deployment.CRMServices/builds/2010-07-20_12-13-58/archive/rogers.bt.deployment/eif.deployment.CRMServices/1.2-bt/eif.deployment.CRMServices-1.2-bt.jar
[HUDSON] Re-archiving /opt/hudson/jobs/EIF_Branch_R1.2__Projects_PkgOnly/workspace/r1.2/projects/CRMServices/deployment/target/eif.deployment.CRMServices-1.2-bt.jar
[INFO] ------------------------------------------------------------------------
[ERROR] FATAL ERROR
[INFO] ------------------------------------------------------------------------
[INFO] Failed to serialize hudson.model.Actionable#actions for class hudson.maven.MavenModuleSetBuild
No space left on device
[INFO] ------------------------------------------------------------------------
[INFO] Trace
java.langchannel stopped
ERROR: Failed to parse POMs
java.io.IOException: Remote call on Channel to Maven [/opt/bea/jdk160_05/bin/java, -cp, /opt/hudson/plugins/maven-plugin/WEB-INF/lib/maven-agent-1.363.jar:/opt/hudson/tools/Maven_2.2.1/boot/classworlds-1.1.jar, hudson.maven.agent.Main, /opt/hudson/tools/Maven_2.2.1, /opt/hudson/war/WEB-INF/lib/remoting-1.363.jar, /opt/hudson/plugins/maven-plugin/WEB-INF/lib/maven-interceptor-1.363.jar, 55951, /opt/hudson/plugins/maven-plugin/WEB-INF/lib/maven2.1-interceptor-1.2.jar] failed
at hudson.remoting.Channel.call(Channel.java:564)
at hudson.maven.ProcessCache$MavenProcess.call(ProcessCache.java:156)
at hudson.maven.MavenModuleSetBuild$RunnerImpl.doRun(MavenModuleSetBuild.java:483)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:416)
at hudson.model.Run.run(Run.java:1253)
at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:306)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:127)
Caused by: java.lang.Error: Unable to load resource hudson/maven/Messages.properties
at hudson.remoting.RemoteClassLoader.findResource(RemoteClassLoader.java:198)
at java.lang.ClassLoader.getResource(ClassLoader.java:977)
at java.lang.Class.getResource(Class.java:2074)
at org.jvnet.localizer.ResourceBundleHolder.get(ResourceBundleHolder.java:83)
at org.jvnet.localizer.ResourceBundleHolder.get(ResourceBundleHolder.java:102)
at org.jvnet.localizer.ResourceBundleHolder.get(ResourceBundleHolder.java:102)
at org.jvnet.localizer.ResourceBundleHolder.format(ResourceBundleHolder.java:139)
at hudson.maven.Messages.MavenBuilder_AsyncFailed(Messages.java:233)
at hudson.maven.MavenBuilder.call(MavenBuilder.java:184)
at hudson.maven.MavenModuleSetBuild$Builder.call(MavenModuleSetBuild.java:696)
at hudson.maven.MavenModuleSetBuild$Builder.call(MavenModuleSetBuild.java:640)
at hudson.remoting.UserRequest.perform(UserRequest.java:114)
at hudson.remoting.UserRequest.perform(UserRequest.java:48)
at hudson.remoting.Request$2.run(Request.java:270)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: No space left on device
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:247)
at hudson.remoting.RemoteClassLoader.makeResource(RemoteClassLoader.java:267)
at hudson.remoting.RemoteClassLoader.findResource(RemoteClassLoader.java:194)
... 19 more
FATAL: : No space left on device
hudson.util.IOException2: : No space left on device
at hudson.XmlFile.write(XmlFile.java:168)
at hudson.model.Run.save(Run.java:1383)
at hudson.maven.MavenModuleSetBuild$RunnerImpl.post2(MavenModuleSetBuild.java:595)
at hudson.model.AbstractBuild$AbstractRunner.post(AbstractBuild.java:528)
at hudson.model.Run.run(Run.java:1276)
at hudson.maven.MavenModuleSetBuild.run(MavenModuleSetBuild.java:306)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:127)
Caused by: com.thoughtworks.xstream.io.StreamException: : No space left on device
at com.thoughtworks.xstream.core.util.QuickWriter.flush(QuickWriter.java:73)
at com.thoughtworks.xstream.io.xml.PrettyPrintWriter.endNode(PrettyPrintWriter.java:288)
at com.thoughtworks.xstream.io.WriterWrapper.endNode(WriterWrapper.java:37)
at com.thoughtworks.xstream.io.path.PathTrackingWriter.endNode(PathTrackingWriter.java:48)
at com.thoughtworks.xstream.core.TreeMarshaller.start(TreeMarshaller.java:99)
at com.thoughtworks.xstream.core.AbstractTreeMarshallingStrategy.marshal(AbstractTreeMarshallingStrategy.java:38)
at com.thoughtworks.xstream.XStream.marshal(XStream.java:840)
at com.thoughtworks.xstream.XStream.marshal(XStream.java:829)
at com.thoughtworks.xstream.XStream.toXML(XStream.java:804)
at hudson.XmlFile.write(XmlFile.java:165)
... 7 more
Caused by: java.io.IOException: No space left on device
at java.io.FileOutputStream.writeBytes(FileOutputStream.java)
at java.io.FileOutputStream.write(FileOutputStream.java:260)
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:212)
at java.io.BufferedWriter.flush(BufferedWriter.java:236)
at hudson.util.AtomicFileWriter.flush(AtomicFileWriter.java:91)
at com.thoughtworks.xstream.io.xml.PrettyPrintWriter.endNode(PrettyPrintWriter.java:288)
at com.thoughtworks.xstream.io.path.PathTrackingWriter.endNode(PathTrackingWriter.java:49)
at com.thoughtworks.xstream.core.AbstractTreeMarshallingStrategy.marshal(AbstractTreeMarshallingStrategy.java:38)
at hudson.XmlFile.write(XmlFile.java:165)
at hudson.model.Run.save(Run.java:1384)
at hudson.maven.MavenModuleSetBuild$RunnerImpl.post2(MavenModuleSetBuild.java:595)
at hudson.model.AbstractBuild$AbstractRunner.post(AbstractBuild.java:528)
The "no space left on device" error isn't necessarily caused by running out of storage capacity, as it suggests, it can also be cause by running out of i-nodes on the filesystem. In other words, a given filesystem can only contain so many files. Running df will suggest everything's fine.
See this article for more info.
You either need to delete some files that you don't need any more, or put Hudson on a different filesystem.
In situations where you know a filesystem will hold a lot of small files, it's not uncommon to build it with an explicitly larger inode table.
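To check this, compare inode usage with disk usage; for example (the jobs path is taken from the build logs above):
$ df -i                                # IUse% at 100% means the inode table is exhausted even though df -h looks fine
$ find /opt/hudson/jobs -xdev | wc -l  # rough count of files/directories under the Hudson jobs tree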
This is usually a matter of a filesystem filling up. Run this command to check disk usage:
df
If any filesystem shows 100% (or close to 100%) usage, carefully delete data from it and then try again.
In many cases it is the /tmp directory that fills up, in which case rebooting your machine
sudo reboot
will clear /tmp on most setups (where /tmp is a tmpfs or is cleaned at boot).
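Before deleting anything (and before resorting to a reboot), it can help to see which directories actually consume the space; a rough sketch:
$ df -h                                                      # find the filesystem that is (nearly) full
$ du -xk --max-depth=2 / 2>/dev/null | sort -n | tail -n 20  # its largest directories, sizes in KB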
Also run ipcs -u and ipcs -p to see whether the "out of space" is really shared memory (SHM) being exhausted, and which applications are using it. If the number of allocated segments equals sysctl kernel.shmmni, the SHM is "full". You can also compare with the output of ipcs -l, which shows the system limits.
Recent Java versions can consume shared memory until none is left. Either stop the program that is hogging the SHM, or increase the kernel.shmmni and kernel.shmmax sysctls to make more SHM space available.
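A rough sketch of how to inspect and raise those limits (the values are placeholders; persist anything that helps in /etc/sysctl.conf):
$ ipcs -l                                           # current SysV IPC limits
$ sysctl kernel.shmmni kernel.shmmax kernel.shmall  # the same limits as seen by sysctl
$ sudo sysctl -w kernel.shmmni=8192                 # example: allow more shared memory segments
$ sudo sysctl -w kernel.shmmax=68719476736          # example: 64 GB max segment size
# make it permanent by adding the same keys to /etc/sysctl.conf and running: sudo sysctl -p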