I want to monitor Cassandra with Prometheus, following https://blog.pythian.com/step-step-monitoring-cassandra-prometheus-grafana/
When I add this line to /etc/cassandra/cassandra-env.sh:
JVM_OPTS="$JVM_OPTS -javaagent:/opt/jmx_prometheus/jmx_prometheus_javaagent-0.3.0.jar=7070:/opt/jmx_prometheus/cassandra.yml"
I get an error:
ubuntu@ip-172-21-0-111:~$ sudo service cassandra status
● cassandra.service - LSB: distributed storage system for structured data
Loaded: loaded (/etc/init.d/cassandra; bad; vendor preset: enabled)
Active: active (exited) since Mon 2020-04-13 05:43:38 UTC; 3s ago
Docs: man:systemd-sysv-generator(8)
Process: 3557 ExecStop=/etc/init.d/cassandra stop (code=exited, status=0/SUCCESS)
Process: 3570 ExecStart=/etc/init.d/cassandra start (code=exited, status=0/SUCCESS)
Apr 13 05:43:38 ip-172-21-0-111 systemd[1]: Starting LSB: distributed storage system for structured data...
Apr 13 05:43:38 ip-172-21-0-111 systemd[1]: Started LSB: distributed storage system for structured data.
ubuntu@ip-172-21-0-111:~$ nodetool status
nodetool: Failed to connect to '127.0.0.1:7199' - ConnectException: 'Connection refused (Connection refused)'.
When I remove the jmx_prometheus entry, it works again:
ubuntu@ip-172-21-0-111:~$ nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 172.21.0.111 1.83 GiB 128 100.0% b52324d0-c57f-46e3-bc10-a6dc07bae17a rack1
ubuntu@ip-172-21-0-111:~$ tail -f /var/log/cassandra/system.log
INFO [main] 2020-04-13 05:37:36,609 StorageService.java:2169 - Node /172.21.0.111 state jump to NORMAL
INFO [main] 2020-04-13 05:37:36,617 CassandraDaemon.java:673 - Waiting for gossip to settle before accepting client requests...
INFO [main] 2020-04-13 05:37:44,621 CassandraDaemon.java:704 - No gossip backlog; proceeding
INFO [main] 2020-04-13 05:37:44,713 NativeTransportService.java:70 - Netty using native Epoll event loop
INFO [main] 2020-04-13 05:37:44,773 Server.java:161 - Using Netty Version: [netty-buffer=netty-buffer-4.0.36.Final.e8fa848, netty-codec=netty-codec-4.0.36.Final.e8fa848, netty-codec-haproxy=netty-codec-haproxy-4.0.36.Final.e8fa848, netty-codec-http=netty-codec-http-4.0.36.Final.e8fa848, netty-codec-socks=netty-codec-socks-4.0.36.Final.e8fa848, netty-common=netty-common-4.0.36.Final.e8fa848, netty-handler=netty-handler-4.0.36.Final.e8fa848, netty-tcnative=netty-tcnative-1.1.33.Fork15.906a8ca, netty-transport=netty-transport-4.0.36.Final.e8fa848, netty-transport-native-epoll=netty-transport-native-epoll-4.0.36.Final.e8fa848, netty-transport-rxtx=netty-transport-rxtx-4.0.36.Final.e8fa848, netty-transport-sctp=netty-transport-sctp-4.0.36.Final.e8fa848, netty-transport-udt=netty-transport-udt-4.0.36.Final.e8fa848]
INFO [main] 2020-04-13 05:37:44,773 Server.java:162 - Starting listening for CQL clients on /172.21.0.111:9042 (unencrypted)...
INFO [main] 2020-04-13 05:37:44,811 CassandraDaemon.java:505 - Not starting RPC server as requested. Use JMX (StorageService->startRPCServer()) or nodetool (enablethrift) to start it
INFO [SharedPool-Worker-1] 2020-04-13 05:37:46,625 ApproximateTime.java:44 - Scheduling approximate time-check task with a precision of 10 milliseconds
INFO [OptionalTasks:1] 2020-04-13 05:37:46,752 CassandraRoleManager.java:339 - Created default superuser role 'cassandra'
It worked after I changed the javaagent port from 7070 to 7071:
JVM_OPTS="$JVM_OPTS -javaagent:/opt/jmx_prometheus/jmx_prometheus_javaagent-0.3.0.jar=7071:/opt/jmx_prometheus/cassandra.yml"
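Since moving the agent from 7070 to 7071 was enough to fix startup, a plausible explanation is that something else on the host was already listening on 7070. The following is a minimal Python sketch for probing a candidate port before putting it into JVM_OPTS. The function name port_is_free is illustrative, and by default it only probes the loopback address, so a listener bound to another interface would not be detected.

```python
import socket


def port_is_free(port, host="127.0.0.1", timeout=1.0):
    """Return True if no process accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 when the connection succeeds (port in use)
        # and an errno value otherwise.
        return sock.connect_ex((host, port)) != 0
```

For example, port_is_free(7070) returning False while port_is_free(7071) returns True would be consistent with the behaviour described above.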
I created a three-node Cassandra cluster. Initially it worked fine, but now two of the nodes have stopped working.
I tried
sudo service dse stop
and
sudo service dse start
and got the error below:
Job for dse.service failed because the control process exited with error code.
See "systemctl status dse.service" and "journalctl -xe" for details.
systemctl status dse.service
● dse.service - LSB: DataStax Enterprise
Loaded: loaded (/etc/init.d/dse; generated)
Active: failed (Result: exit-code) since Tue 2020-03-17 04:34:24 UTC; 4min 43s ago
Docs: man:systemd-sysv-generator(8)
Process: 4263 ExecStop=/etc/init.d/dse stop (code=exited, status=0/SUCCESS)
Process: 11273 ExecStart=/etc/init.d/dse start (code=exited, status=1/FAILURE)
Tasks: 0 (limit: 4915)
CGroup: /system.slice/dse.service
Mar 17 04:34:14 cstar-node1 su[11442]: pam_unix(su:session): session closed for user cassandra
Mar 17 04:34:14 cstar-node1 su[11456]: Successful su for cassandra by root
Mar 17 04:34:14 cstar-node1 su[11456]: + ??? root:cassandra
Mar 17 04:34:14 cstar-node1 su[11456]: pam_unix(su:session): session opened for user cassandra by (uid=0)
Mar 17 04:34:14 cstar-node1 su[11456]: pam_unix(su:session): session closed for user cassandra
Mar 17 04:34:24 cstar-node1 dse[11273]: ERROR: DSE failed to start. Please check your logs.
Mar 17 04:34:24 cstar-node1 dse[11273]: ...fail!
Mar 17 04:34:24 cstar-node1 systemd[1]: dse.service: Control process exited, code=exited status=1
Mar 17 04:34:24 cstar-node1 systemd[1]: dse.service: Failed with result 'exit-code'.
Mar 17 04:34:24 cstar-node1 systemd[1]: Failed to start LSB: DataStax Enterprise.
Only one node is up:
nodetool status
Datacenter: Cassandra
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
DN X.X.X.X ? 1 ? 46fdfb5e-238c-476b-a243-184a530fg30e rack1
UN X.X.X.Y 207.4 KiB 1 ? 7fasd242-891d-4ecf-ggef-0f8hffarr434 rack1
DN X.X.X.Z ? 1 ? 34ffda2f-46d2-443d-4546-33c55cface2c rack1
How can I resolve this error? Can anyone help?
Thanks in advance.
This was a while ago, so even if it no longer helps you, it might help others. I had the same issue, with no entries in either the Cassandra logs or the system logs, and the start process failed with the same non-descriptive message as above. To resolve it, I stopped (as root):
the agent: systemctl stop datastax-agent and
the dse service: systemctl stop dse
Then I deleted the directories where the PID files are located:
/var/run/datastax-agent
/var/run/dse
Finally, I restarted both services. That did the trick for me. I cannot say whether deleting the PID directories or restarting the datastax-agent actually resolved the problem, but my guess would be the PIDs.
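Before deleting anything, it can help to confirm that a PID file really is stale, meaning it names a process that no longer exists. Below is a minimal Python sketch of that check for POSIX systems. The function name and the example path /var/run/dse/dse.pid are assumptions for illustration; the actual file names inside those directories are not given above.

```python
import os
from pathlib import Path


def pid_file_is_stale(pid_file):
    """Return True if pid_file exists but does not name a running process."""
    path = Path(pid_file)
    if not path.is_file():
        return False  # no PID file, so nothing to clean up
    try:
        pid = int(path.read_text().strip())
    except ValueError:
        return True  # the file exists but does not contain a PID
    if pid <= 0:
        return True  # not a valid process ID
    try:
        os.kill(pid, 0)  # signal 0 only checks existence (POSIX)
    except ProcessLookupError:
        return True  # no such process
    except PermissionError:
        return False  # the process exists but belongs to another user
    return False
```

With the hypothetical path, pid_file_is_stale("/var/run/dse/dse.pid") returning True after both services are stopped would support the guess that leftover PID files were blocking the start.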
I have two Ubuntu 16.04 nodes on which I installed Cassandra 3.11.3 with java version "1.8.0_181". I want to join these two nodes into a Cassandra cluster. Their internal IPs are 172.16.10.20 and 172.16.10.30.
In the /etc/cassandra/cassandra.yaml file on each node I modified the following lines:
cluster_name: 'my_cluster'
- seeds: "172.16.10.20,172.16.10.30"
listen_address: XXXX
rpc_address: XXXX
where XXXX is the internal IP of the respective node.
I then restart Cassandra on each node:
sudo service cassandra restart
and check that Cassandra is running:
sudo service cassandra status
cassandra.service - LSB: distributed storage system for structured data
Loaded: loaded (/etc/init.d/cassandra; generated)
Active: active (running) since Wed 2018-08-08 00:31:42 UTC; 3s ago
Docs: man:systemd-sysv-generator(8)
and the cluster
nodetool status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 172.16.10.20 190.11 KiB 256 100.0% 84dded4c-c74e-45f4-9481-ff837fec229d rack1
UN 172.16.10.30 265.06 KiB 256 100.0% 4695fef4-70c7-46b2-a0bd-8b752fe5beb6 rack1
Everything is up and normal.
I now want to connect to Cassandra:
cqlsh
and get:
Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})
After some googling, I tried to start Cassandra by hand:
cassandra
and got (among a huge message):
ERROR [main] 2018-08-07 23:02:51,365 CassandraDaemon.java:708 - Port already in use: 7199; nested exception is:
java.net.BindException: Address already in use (Bind failed)
It looks like port 7199 is already in use. I killed the corresponding PID and changed JMX_PORT in /etc/cassandra/cassandra-env.sh to 7200, but hit the same issue: the port is reported as in use, plus this error:
00:33:06,236 |-ERROR in ch.qos.logback.core.rolling.RollingFileAppender[SYSTEMLOG] - openFile(/var/log/cassandra/system.log,true) call failed. java.io.FileNotFoundException: /var/log/cassandra/system.log (Permission denied)
I have changed the permissions, but the error remains. At this point I am running out of ideas. What I am trying to achieve seems pretty straightforward, so I guess others must have run into a similar issue.
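The nodetool status output says it all: both nodes were already up and running fine, so revert any port changes you made. The "Port already in use: 7199" error only means that the service instance was still running when you started a second one by hand.
Since nodetool status shows that your node IPs are 172.16.10.20 and 172.16.10.30, try running cqlsh with one of those IPs. By default cqlsh connects to 127.0.0.1, which fails here because rpc_address is set to each node's internal IP, so nothing is listening for CQL clients on the loopback address.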
cqlsh 172.16.10.20 -u yourusername -p yourpassword
Note: You can omit -u and -p if you don't have auth enabled. If that is the case, though, you should really enable auth on your cluster.
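To avoid guessing which address to give cqlsh, the value can be read from the node's cassandra.yaml. The following is a rough Python sketch that takes the file contents as text. The function name cqlsh_host is illustrative, the regex only handles simple top-level "key: value" lines rather than full YAML, and the fallback to localhost when rpc_address is absent is an assumption that mirrors the stock config file.

```python
import re


def cqlsh_host(yaml_text):
    """Pick the address cqlsh should connect to, based on cassandra.yaml text."""

    def setting(key):
        # Match an uncommented, top-level "key: value" line.
        pattern = rf"^{re.escape(key)}:[ \t]*(\S+)"
        match = re.search(pattern, yaml_text, re.MULTILINE)
        return match.group(1).strip("'\"") if match else None

    rpc = setting("rpc_address")
    if rpc == "0.0.0.0":
        # With a wildcard bind, clients are pointed at broadcast_rpc_address.
        return setting("broadcast_rpc_address") or "127.0.0.1"
    # Assumption: fall back to localhost when rpc_address is not set.
    return rpc or "localhost"
```

For the configuration in the question, this returns the node's internal IP (for example 172.16.10.20), which is the value to pass as the first argument to cqlsh.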
I installed Cassandra 3.11.3-1 on CentOS 7 in VMware.
I didn't get any errors while installing Cassandra.
I started Cassandra and saw the following output:
[root@localhost ~]# service cassandra start
Starting cassandra (via systemctl): [ OK ]
[root@localhost ~]# systemctl status cassandra
cassandra.service - LSB: distributed storage system for structured data
Loaded: loaded (/etc/rc.d/init.d/cassandra; bad; vendor preset: disabled)
Active: deactivating (stop) (Result: exit-code) since 2018-08-02 15:15:45 KST; 6s ago
Docs: man:systemd-sysv-generator(8)
Process: 10366 ExecStart=/etc/rc.d/init.d/cassandra start (code=exited, status=0/SUCCESS)
Main PID: 10450 (code=exited, status=3); : 10478 (cassandra)
Tasks: 2
CGroup: /system.slice/cassandra.service
└─control
├─10478 /bin/bash /etc/rc.d/init.d/cassandra stop
└─10549 sleep 0.5
02 15:15:39 localhost.localdomain systemd[1]: Starting LSB: distributed stora....
02 15:15:39 localhost.localdomain su[10376]: (to cassandra) root on none
02 15:15:41 localhost.localdomain cassandra[10366]: Starting Cassandra: OK
02 15:15:41 localhost.localdomain systemd[1]: Started LSB: distributed storag....
02 15:15:45 localhost.localdomain systemd[1]: cassandra.service: main process...D
02 15:15:45 localhost.localdomain su[10489]: (to cassandra) root on none
02 15:15:45 localhost.localdomain cassandra[10478]: Shutdown Cassandra: bash: …
Hint: Some lines were ellipsized, use -l to show in full.
Does this mean Cassandra started OK?
When I check the node status with nodetool status, I get this:
nodetool: Failed to connect to '127.0.0.1:7199' - ConnectException: Connection refused)
I searched a lot on Google and, based on what I found, tried the following:
edit cassandra-env.sh: "JVM_OPTS -Djava.rmi.server.hostname=127.0.0.1"
expand memory size: 1GB -> 2GB
But I still get the same error.
Can someone help me, please?
------------- system.log ----------------------------------
INFO [main] 2018-08-02 15:15:44,866 YamlConfigurationLoader.java:89 - Configuration location: file:/etc/cassandra/default.conf/cassandra.yaml
ERROR [main] 2018-08-02 15:15:45,043 CassandraDaemon.java:708 - Exception encountered during startup: Invalid yaml: file:/etc/cassandra/default.conf/cassandra.yaml
Error: while scanning a simple key; could not found expected ':'; in 'reader', line 601, column 1:
Set listen_address OR listen_i ...
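The quoted text at line 601 looks like one of the explanatory comments from cassandra.yaml, so one likely cause is a comment line that lost its leading # during editing. A small Python heuristic (not a YAML parser, and only a sketch) can list top-level lines that contain no colon and therefore cannot start a "key: value" entry. The function name suspicious_yaml_lines is illustrative.

```python
def suspicious_yaml_lines(yaml_text):
    """Return (line_number, text) for top-level lines that contain no ':'."""
    findings = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line or comment
        if line[0] in " \t-":
            continue  # indented content or list item; not checked here
        if ":" not in stripped:
            findings.append((number, stripped))
    return findings
```

Running it over the text of /etc/cassandra/default.conf/cassandra.yaml and looking for an entry near line 601 would confirm or rule out this guess. A prose line that happens to contain a colon would not be caught.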
First, make sure that the active OpenJDK version is openjdk-8-jdk. If you have multiple versions of OpenJDK on your machine, you can follow this tutorial to set the default version (in this case openjdk-8-jdk).
Then check the status of the cassandra service again; it should now be reported as active.
After that, you can follow this instruction to modify JVM_OPTS in /etc/cassandra/cassandra-env.sh. In my case, I did not need the 2nd step. Finally, checking the status of the node should then succeed.
The cassandra service takes some time to start. After the install, check whether the service has started by using
service --status-all
and you will see something like this
[ + ] cassandra
[ - ] dbus
[ ? ] hwclock.sh
[ ? ] kmod
[ - ] ntp
[ - ] procps
[ - ] rsync
[ - ] udev
[ - ] x11-common
If you see a - sign instead of a + sign next to cassandra, the service has not started yet. You can start it with this command:
service cassandra restart
Keep checking the status until you get a +. Now you should be able to run the nodetool command:
nodetool status
and you should get the desired result, something like this:
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 127.0.0.1 84.76 KiB 16 100.0% 7615cf7e-14cc-4475-bf46-ceeb122b6a12 rack1
This worked for me.
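Instead of re-running the status command by hand, the wait can be scripted. The following is a minimal Python sketch that polls a TCP port until it accepts connections or a deadline passes. The function name wait_for_port and the default timeout and interval are assumptions for illustration; 7199 is the JMX port nodetool reports in the errors above.

```python
import socket
import time


def wait_for_port(port, host="127.0.0.1", timeout=120.0, interval=2.0):
    """Poll host:port until it accepts a TCP connection or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connection means something is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling wait_for_port(7199) after service cassandra restart returns True once the JMX port is open, at which point nodetool status should be able to connect. A False result after the timeout suggests the process exited, so the next step is checking system.log.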
When I ran "systemctl status cassandra", it showed Active: failed, as seen below:
● cassandra.service - LSB: distributed storage system for structured data
Loaded: loaded (/etc/rc.d/init.d/cassandra; bad; vendor preset: disabled)
Active: failed (Result: signal) since Fri 2022-01-07 02:28:07 UTC; 10min ago
Docs: man:systemd-sysv-generator(8)
Main PID: 8239 (code=killed, signal=KILL)
So I changed the parameter below in cassandra-env.sh:
JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=localhost"
After I changed the parameter, I saw this:
[root@ip-172-31-28-163 default.conf]# systemctl status cassandra
● cassandra.service - LSB: distributed storage system for structured data
Loaded: loaded (/etc/rc.d/init.d/cassandra; bad; vendor preset: disabled)
Active: active (running) since Fri 2022-01-07 02:47:27 UTC; 12min ago
Docs: man:systemd-sysv-generator(8)
Be sure to do this as the root user, and be careful.
In cassandra-env.sh (/etc/cassandra/cassandra-env.sh), check that the system_memory_in_mb and system_cpu_cores values are configured according to your machine's capabilities.
Another solution: in my case I was installing Cassandra 41x with an older Java 8 version, which caused the issue. To solve it, I installed Java 11. Just make sure you provision the right version of Java.
This is a snippet from the system log while shutting down:
INFO [RMI TCP Connection(12)-127.0.0.1] 2016-07-27 22:28:50,995 StorageService.java:3788 - Announcing that I have left the ring for 30000ms
INFO [RMI TCP Connection(12)-127.0.0.1] 2016-07-27 22:29:20,995 ThriftServer.java:142 - Stop listening to thrift clients
INFO [RMI TCP Connection(12)-127.0.0.1] 2016-07-27 22:29:20,997 Server.java:182 - Stop listening for CQL clients
WARN [RMI TCP Connection(12)-127.0.0.1] 2016-07-27 22:29:20,997 Gossiper.java:1508 - No local state or state is in silent shutdown, not announcing shutdown
INFO [RMI TCP Connection(12)-127.0.0.1] 2016-07-27 22:29:20,997 MessagingService.java:786 - Waiting for messaging service to quiesce
INFO [ACCEPT-sysengplayl0127.bio-iad.ea.com/10.72.194.229] 2016-07-27 22:29:20,998 MessagingService.java:1133 - MessagingService has terminated the accept() thread
INFO [RMI TCP Connection(12)-127.0.0.1] 2016-07-27 22:29:21,022 StorageService.java:1411 - DECOMMISSIONED
INFO [main] 2016-07-27 22:32:17,534 YamlConfigurationLoader.java:89 - Configuration location: file:/opt/cassandra/product/apache-cassandra-3.7/conf/cassandra.yaml
And then while starting up:
INFO [main] 2016-07-27 22:32:20,316 StorageService.java:630 - Cassandra version: 3.7
INFO [main] 2016-07-27 22:32:20,316 StorageService.java:631 - Thrift API version: 20.1.0
INFO [main] 2016-07-27 22:32:20,316 StorageService.java:632 - CQL supported versions: 3.4.2 (default: 3.4.2)
INFO [main] 2016-07-27 22:32:20,351 IndexSummaryManager.java:85 - Initializing index summary manager with a memory pool size of 397 MB and a resize interval of 60 minutes
ERROR [main] 2016-07-27 22:32:20,357 CassandraDaemon.java:731 - Fatal configuration error
org.apache.cassandra.exceptions.ConfigurationException: This node was decommissioned and will not rejoin the ring unless cassandra.override_decommission=true has been set, or all existing data is removed and the node is bootstrapped again
at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:815) ~[apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:725) ~[apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:625) ~[apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:370) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:585) [apache-cassandra-3.7.jar:3.7]
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:714) [apache-cassandra-3.7.jar:3.7]
WARN [StorageServiceShutdownHook] 2016-07-27 22:32:20,358 Gossiper.java:1508 - No local state or state is in silent shutdown, not announcing shutdown
INFO [StorageServiceShutdownHook] 2016-07-27 22:32:20,359 MessagingService.java:786 - Waiting for messaging service to quiesce
Is there something wrong with the configuration?
I faced the same issue.
Posting the answer in case it helps others.
As the log suggests, the property cassandra.override_decommission needs to be set to true.
Start Cassandra with:
cassandra -Dcassandra.override_decommission=true
This should add the node back to the cluster.
I installed DataStax 3.7 on my Windows machine (IP: 10.175.12.249) and made the following changes in my cassandra.yaml file:
cluster_name: 'Test_cluster'
listen_address: "10.175.12.249"
start_rpc: true
rpc_address: "0.0.0.0"
broadcast_rpc_address: "10.175.12.249"
seeds: "10.175.12.249"
endpoint_snitch: SimpleSnitch
I then started the service, and Cassandra is running fine on the seed node.
I tried adding another node to my cluster, so I installed DataStax 3.7 on another Windows machine (IP: 192.168.158.78) and made the following changes in its cassandra.yaml file:
cluster_name: 'Test_cluster'
listen_address: "192.168.158.78"
start_rpc: true
rpc_address: "0.0.0.0"
broadcast_rpc_address: "192.168.158.78"
seeds: "10.175.12.249"
endpoint_snitch: SimpleSnitch
Now when I start the Cassandra service on the second machine, I get the following error:
INFO 09:41:27 Cassandra version: 3.7.0
INFO 09:41:27 Thrift API version: 20.1.0
INFO 09:41:27 CQL supported versions: 3.4.2 (default: 3.4.2)
INFO 09:41:27 Initializing index summary manager with a memory pool size of 100 MB and a resize interval of 60 minutes
INFO 09:41:27 Starting Messaging Service on /192.168.158.78:7000 (Intel(R) Centrino(R) Advanced-N 6235)
INFO 09:41:27 Scheduling approximate time-check task with a precision of 10 milliseconds
Exception (java.lang.RuntimeException) encountered during startup: Unable to gossip with any seeds
java.lang.RuntimeException: Unable to gossip with any seeds
at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1386)
at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:561)
at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:855)
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:725)
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:625)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:370)
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:585)
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:714)
ERROR 09:41:58 Exception encountered during startup
java.lang.RuntimeException: Unable to gossip with any seeds
at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1386) ~[apache-cassandra-3.7.0.jar:3.7.0]
at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:561) ~[apache-cassandra-3.7.0.jar:3.7.0]
at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:855) ~[apache-cassandra-3.7.0.jar:3.7.0]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:725) ~[apache-cassandra-3.7.0.jar:3.7.0]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:625) ~[apache-cassandra-3.7.0.jar:3.7.0]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:370) [apache-cassandra-3.7.0.jar:3.7.0]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:585) [apache-cassandra-3.7.0.jar:3.7.0]
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:714) [apache-cassandra-3.7.0.jar:3.7.0]
WARN 09:41:58 No local state or state is in silent shutdown, not announcing shutdown
INFO 09:41:58 Waiting for messaging service to quiesce
Below is the output of nodetool status on the seed node (IP: 10.175.12.249):
C:\Program Files\DataStax-DDC\apache-cassandra\bin>nodetool status
Datacenter: datacenter1
========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
DN 192.168.158.78 ? 256 68.1% 6bc4e927-3def-4dfc-b5e7-31f5882ce475 rack1
UN 10.175.12.249 257.76 KiB 256 65.7% 300d731e-a27c-4922-aacc-6d42e8e49151 rack1
Thanks!!!
The - seeds: value in conf/cassandra.yaml should be the same IP or hostname as listen_address: in that same file.
I ran into this error when the two addresses did not match. Try making them the same and restart the cluster. Hope this helps.
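The comparison described in this answer can be sketched in Python against the text of a cassandra.yaml file. The function name seeds_include_listen_address is illustrative, and the regexes only handle the simple one-line forms shown in this thread rather than full YAML. Note that a False result is expected on a node that is not itself a seed, since its seeds entry points at another machine.

```python
import re


def seeds_include_listen_address(yaml_text):
    """Return True if listen_address appears in the seeds list."""
    listen = re.search(r"^listen_address:[ \t]*(\S+)", yaml_text, re.MULTILINE)
    # Accept both "- seeds: ..." and "seeds: ..." on an uncommented line.
    seeds = re.search(
        r"^[ \t]*(?:-[ \t]*)?seeds:[ \t]*(.+)$", yaml_text, re.MULTILINE
    )
    if not listen or not seeds:
        return False
    address = listen.group(1).strip("'\"")
    raw = seeds.group(1).strip().strip("'\"")
    seed_list = [item.strip() for item in raw.split(",")]
    return address in seed_list
```

For the seed node's configuration shown in the question (listen_address and seeds both 10.175.12.249) this returns True; for the second node it returns False, which is normal for a non-seed node.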