Restart dead accumulo tablet server - accumulo

I use a single-instance Accumulo database. It worked fine until I tried to ingest a batch of data (following this tutorial), and then my tablet server died.
I tried to restart it (using bin/start-all or bin/start-here) but that did not work. Then I rebooted the whole server, and it seems that bin/start-all starts the tablet server first:
WARN : Using Zookeeper /root/Installs/zookeeper-3.4.6/zookeeper-3.4.6. Use version 3.3.0 or greater to avoid zookeeper deadlock bug.
Starting monitor on localhost
WARN : Max open files on localhost is 1024, recommend 32768
Starting tablet servers .... done
Starting tablet server on 46.101.229.80
WARN : Max open files on 46.101.229.80 is 1024, recommend 32768
OpenJDK Client VM warning: You have loaded library /root/Installs/hadoop-2.6.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
2016-01-27 04:44:18,778 [util.NativeCodeLoader] WARN : Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-01-27 04:44:23,770 [fs.VolumeManagerImpl] WARN : dfs.datanode.synconclose set to false in hdfs-site.xml: data loss is possible on hard system reset or power loss
2016-01-27 04:44:23,803 [server.Accumulo] INFO : Attempting to talk to zookeeper
2016-01-27 04:44:24,246 [server.Accumulo] INFO : ZooKeeper connected and initialized, attempting to talk to HDFS
2016-01-27 04:44:24,802 [server.Accumulo] INFO : Connected to HDFS
Starting master on 46.101.229.80
WARN : Max open files on 46.101.229.80 is 1024, recommend 32768
Starting garbage collector on 46.101.229.80
WARN : Max open files on 46.101.229.80 is 1024, recommend 32768
Starting tracer on 46.101.229.80
WARN : Max open files on 46.101.229.80 is 1024, recommend 32768
But according to the monitor, the tablet server is still dead.
The tserver_46.101.229.80.err log is empty; the tserver_46.101.229.80.out log says:
OpenJDK Client VM warning: You have loaded library /root/Installs/hadoop-2.6.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
#
# java.lang.OutOfMemoryError: Java heap space
# -XX:OnOutOfMemoryError="kill -9 %p"
# Executing /bin/sh -c "kill -9 3225"...
How can I get the tablet server up again?
I am using a 32-bit Ubuntu 14.04 Linux droplet on DigitalOcean with Hadoop 2.6, ZooKeeper 3.4.6 and Accumulo 1.6.4.

If the TabletServer is repeatedly crashing with an OutOfMemoryError, you need to increase the JVM maximum heap size via the -Xmx option in ACCUMULO_TSERVER_OPTS in conf/accumulo-env.sh.
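For example, in conf/accumulo-env.sh the tablet server heap is set with a line similar to the one below (the exact defaults depend on which example configuration you copied when setting Accumulo up):

test -z "$ACCUMULO_TSERVER_OPTS" && export ACCUMULO_TSERVER_OPTS="${POLICY} -Xmx128m -Xms128m"

Raising -Xmx (for instance to -Xmx512m, if the droplet has the RAM to spare) and restarting with bin/start-all.sh should let the tablet server survive the ingest. On a small 32-bit machine, also make sure the combined heap of all Accumulo, Hadoop and ZooKeeper processes stays below physical memory.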

Related

Add a Cassandra OSS 4.0 RC1 node into a cluster with DSE 6.0.14 nodes

Every node of the cluster is running DSE 6.0.14, and they are all set up in SSL mode (listening on port 7001).
We're trying to add a node running open-source Cassandra 4.0 RC1.
We force the storage port on this node:
storage_port: 7001
otherwise the node tries to communicate on port 7000, which is closed.
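For reference, the relevant fragment of cassandra.yaml on the new node looks roughly like this (only storage_port is taken from our actual config; the server_encryption_options values are illustrative assumptions about an SSL-enabled setup):

storage_port: 7001
server_encryption_options:
    internode_encryption: all
    keystore: /etc/cassandra/conf/.keystore
    truststore: /etc/cassandra/conf/.truststore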
We encounter the following error when trying to start the service on the new node:
INFO [main] 2021-05-10 16:22:00,985 StorageService.java:528 - Gathering node replacement information for /10.135.66.204:7001
DEBUG [main] 2021-05-10 16:22:00,986 YamlConfigurationLoader.java:112 - Loading settings from file:/etc/cassandra/default.conf/cassandra.yaml
DEBUG [main] 2021-05-10 16:22:00,996 YamlConfigurationLoader.java:112 - Loading settings from file:/etc/cassandra/default.conf/cassandra.yaml
INFO [Messaging-EventLoop-3-1] 2021-05-10 16:22:01,138 InboundConnectionInitiator.java:281 - peer /10.137.65.201:54916 only supports messaging versions lower (2) than this node supports (10)
ERROR [Messaging-EventLoop-3-2] 2021-05-10 16:22:01,237 NoSpamLogger.java:98 - /xx.xxx.xx.xxx:7001->/xx.xxx.xx.xxx:7001-URGENT_MESSAGES-[no-channel] failed to connect
java.nio.channels.ClosedChannelException: null
[...]
INFO [ScheduledTasks:1] 2021-05-10 16:22:02,398 TokenMetadata.java:525 - Updating topology for all endpoints that have changed
ERROR [Messaging-EventLoop-3-1] 2021-05-10 16:22:09,467 InboundConnectionInitiator.java:360 - Failed to properly handshake with peer /xx.xxx.xx.xxx:54922. Closing the channel.
java.lang.AssertionError: null
[...]
I don't know if the error comes from a mistake in the config of the OSS 4.0 node or from an incompatibility between the new node's version and the version of the existing nodes in the cluster.

how to increase open files on kafka cluster

We have 10 Kafka machines running Kafka version 1.x.
This Kafka cluster is part of HDP version 2.6.5.
We noticed the following message under /var/log/kafka/server.log:
ERROR Error while accepting connection {kafka.network.Accetpr}
java.io.IOException: Too many open files
We also saw:
Broker 21 stopped fetcher for partition ...................... because they are in the failed log dir /kafka/kafka-logs {kafka.server.ReplicaManager}
and
WARN Received a PartitionLeaderEpoch assignment for an epoch < latestEpoch. this implies messages have arrived out of order. New: {epoch:0, offset:2227488}, Currnet: {epoch 2, offset:261} for Partition: cars-list-75 {kafka.server.epochLeaderEpocHFileCache}
So, regarding the issue:
ERROR Error while accepting connection {kafka.network.Accetpr}
java.io.IOException: Too many open files
how do we increase the max open files in order to avoid this issue?
Update:
In Ambari we saw the following parameter under Kafka --> Configs.
Is this the parameter that we should increase?
It can be done like this:
echo "* hard nofile 100000
* soft nofile 100000" | sudo tee --append /etc/security/limits.conf
Then you should reboot.
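After the reboot you can verify the new limit for the account that runs the Kafka broker, e.g. (assuming the broker runs as a user named kafka; adjust the user name to your installation):

sudo -u kafka sh -c 'ulimit -n'

Note that if the broker is started by systemd, the values in /etc/security/limits.conf may not apply to it; in that case set LimitNOFILE=100000 in the Kafka service unit instead.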

Neo4j refused to connect

Characteristics:
Linux
Neo4j version 3.2.1
Access on remote
Installation
I had installed Neo4j and gave the folder chmod 777.
I'm running it remotely on my machine and I have already enabled non-local access.
Running neo4j start I get this message:
Active database: graph.db
Directories in use:
home: /home/cloudera/Muna/apps/neo4j
config: /home/cloudera/Muna/apps/neo4j/conf
logs: /home/cloudera/Muna/apps/neo4j/logs
plugins: /home/cloudera/Muna/apps/neo4j/plugins
import: /home/cloudera/Muna/apps/neo4j/import
data: /home/cloudera/Muna/apps/neo4j/data
certificates: /home/cloudera/Muna/apps/neo4j/certificates
run: /home/cloudera/Muna/apps/neo4j/run
Starting Neo4j.
WARNING: Max 1024 open files allowed, minimum of 40000 recommended. See the Neo4j manual.
Started neo4j (pid 9469). It is available at http://0.0.0.0:7474/
There may be a short delay until the server is ready.
See /home/cloudera/Muna/apps/neo4j/logs/neo4j.log for current status.
and it is not connecting in the browser.
Running neo4j console gives:
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 409600000 bytes for AllocateHeap
# An error report file with more information is saved as:
# /home/cloudera/hs_err_pid18598.log
Where could the problem be coming from?
Firstly, you should set the maximum open files to 40000, which is the recommended value; then you will not get the WARNING. See: http://neo4j.com/docs/1.6.2/configuration-linux-notes.html
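A minimal sketch, assuming Neo4j runs under a dedicated neo4j user (use your own user name if you start it differently):

echo "neo4j hard nofile 40000
neo4j soft nofile 40000" | sudo tee --append /etc/security/limits.conf

Log out and back in (or reboot) before starting Neo4j again so the new limit takes effect.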
Secondly, 'failed to allocate memory' means that the Java virtual machine cannot allocate the amount of memory it is started with.
It can be a misconfiguration, or the machine physically does not have enough memory.
Please read the memory sizing guidelines here:
https://neo4j.com/docs/operations-manual/current/performance/
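If it turns out to be configuration rather than genuinely missing RAM, you can cap the heap and page cache explicitly in conf/neo4j.conf; the values below are only an illustration for a small machine, not a recommendation:

dbms.memory.heap.initial_size=256m
dbms.memory.heap.max_size=256m
dbms.memory.pagecache.size=128m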

cassandra installation failed

I have been trying to install DataStax C* and am getting stuck at the line below. It doesn't go forward after this line. May I know what the issue could be?
NFO [main] 2016-02-01 11:09:01,032 CassandraDaemon.java:205 - JVM Arguments: [-Ddse.system_memory_in_mb=991, -Dcassandra.config.loader=com.datastax.bdp.config.DseConfigurationLoader, -Ddse.system_memory_in_mb=991, -Dcassandra.config.loader=com.datastax.bdp.config.DseConfigurationLoader, -ea, -javaagent:/usr/share/dse/cassandra/lib/jamm-0.3.0.jar, -XX:+UseThreadPriorities, -XX:ThreadPriorityPolicy=42, -Xms495M, -Xmx495M, -XX:+HeapDumpOnOutOfMemoryError, -Xss256k, -XX:+AlwaysPreTouch, -XX:-UseBiasedLocking, -XX:StringTableSize=1000003, -XX:+UseTLAB, -XX:+ResizeTLAB, -XX:CompileCommandFile=/etc/dse/cassandra/hotspot_compiler, -XX:+UseG1GC, -XX:G1RSetUpdatingPauseTimePercent=5, -XX:MaxGCPauseMillis=500, -Djava.net.preferIPv4Stack=true, -Dcassandra.jmx.local.port=7199, -XX:+DisableExplicitGC, -Dlogback.configurationFile=logback.xml, -Dcassandra.logdir=/var/log/cassandra, -Dcassandra.storagedir=, -Dcassandra-pidfile=/var/run/dse/dse.pid, -Dsearch-service=true, -Dcatalina.home=/usr/share/dse/tomcat, -Dcatalina.base=/usr/share/dse/tomcat, -Djava.util.logging.config.file=/usr/share/dse/tomcat/conf/logging.properties, -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager, -Dtomcat.logs=/var/log/tomcat, -XX:HeapDumpPath=/var/lib/cassandra/java_1454342934.hprof, -XX:ErrorFile=/var/lib/cassandra/hs_err_1454342934.log, -Djava.library.path=:/usr/share/dse/hadoop/native/Error:_JAVA_HOME_is_not_set./lib:/usr/share/dse/hadoop/native/Error:_JAVA_HOME_is_not_set./lib, -Dsolr.solr.home=solr/, -Ddse.system_memory_in_mb=991, -Dcassandra.config.loader=com.datastax.bdp.config.DseConfigurationLoader, -Ddse.system_memory_in_mb=991, -Dcassandra.config.loader=com.datastax.bdp.config.DseConfigurationLoader]
I see exactly the same issue when starting DSE in a Vagrant VM (CentOS 7) that does not have enough RAM allocated - are you running in Vagrant / a VM, or on hardware with limited memory?
If you set the RAM to 2096 MB or higher, you should see DSE start up successfully.
DataStax is pretty resource intensive, though it's unfortunate the error messages aren't more helpful here!
(The tell-tale symptom here is Error:_JAVA_HOME_is_not_set in the command line)
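If you are indeed on Vagrant, a minimal sketch of giving the VM more memory (assuming the VirtualBox provider) looks like this in the Vagrantfile:

Vagrant.configure("2") do |config|
  config.vm.box = "centos/7"
  config.vm.provider "virtualbox" do |vb|
    # DSE will not finish starting with only ~991 MB; give the VM more RAM.
    vb.memory = 4096
  end
end

followed by vagrant reload so the VM restarts with the new setting.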

cassandra connection error : Could not connect to localhost:9160

My Cassandra was working well, but suddenly it stopped working!
When I use the cqlsh command I get this error:
connection error : Could not connect to localhost:9160
and in the output.log file I see this:
Service exit with a return value of 1
OpenJDK Client VM warning: Insufficient space for shared memory file:
/tmp/hsperfdata_cassandra/10963
Try using the -Djava.io.tmpdir= option to select an alternate temp location.
INFO 12:23:31,307 Logging initialized
log4j:ERROR Failed to flush writer,
java.io.IOException: No space left on device
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:297)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:220)
at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:290)
at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:294)
at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:140)
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:59)
at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:324)
at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:276)
at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
at org.apache.log4j.Category.callAppenders(Category.java:206)
at org.apache.log4j.Category.forcedLog(Category.java:391)
at org.apache.log4j.Category.info(Category.java:666)
at org.apache.cassandra.service.CassandraDaemon.initLog4j(CassandraDaemon.java:118)
at org.apache.cassandra.service.CassandraDaemon.<clinit>(CassandraDaemon.java:65)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
at java.lang.Class.newInstance0(Class.java:374)
at java.lang.Class.newInstance(Class.java:327)
at org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:190)
INFO 12:23:31,332 32bit JVM detected. It is recommended to run Cassandra on a 64bit JVM for better performance.
INFO 12:23:31,335 JVM vendor/version: OpenJDK Client VM/1.6.0_27
WARN 12:23:31,335 OpenJDK is not recommended. Please upgrade to the newest Oracle Java release
INFO 12:23:31,335 Heap size: 252641280/253689856
INFO 12:23:31,335 Classpath: /usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar$
INFO 12:23:31,691 JNA mlockall successful
INFO 12:23:31,715 Loading settings from filService exit with a return value of 1
OpenJDK Client VM warning: Insufficient space for shared memory file:
Can somebody help me? :(
The /tmp directory on your host isn't large enough for the temp files that Cassandra wants to create. The temp files are related to the amount of data in the system, so since your database is larger now than it was in the past, it started before but does not start now.
Check the status of the /tmp directory with df. Here is my system:
$ df /tmp
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda6 570881944 350121276 191761552 65% /
To alter the location used for these temp files, set java.io.tmpdir as the error message says.
On my system (Ubuntu Linux) this can be done by editing the end of the file /etc/cassandra/cassandra-env.sh:
JVM_OPTS="$JVM_OPTS $JVM_EXTRA_OPTS -Djava.io.tmpdir=/opt"
Ensure that the new temp directory has sufficient space and that the permissions are correct; allowing read/write for the cassandra user would probably be enough.
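For example, assuming Cassandra runs as the cassandra user and you pick /opt/cassandra-tmp as the new location (the path is only an illustration):

sudo mkdir -p /opt/cassandra-tmp
sudo chown cassandra:cassandra /opt/cassandra-tmp
sudo chmod 700 /opt/cassandra-tmp

and then point -Djava.io.tmpdir at that directory in the JVM_OPTS line above.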
