Cassandra node-to-node encryption throws "Unable to gossip with any peers" exception
We currently run a multi-region Cassandra cluster in AWS: four regions with 12 nodes per region. It runs without node-to-node encryption or client encryption. We are trying to enable inter-datacenter node-to-node encryption. However, when we turn encryption on, the node throws an exception saying it is unable to gossip with any peers.
It is possible that we didn't build our JKS keystores/truststores correctly (more on how we built these files below). However, intra-datacenter communication, which should remain unencrypted with this setting, does not work either. cqlsh cannot connect to the node either, even though require_client_auth is left at its default of false.
ERROR [main] 2019-08-15 18:46:32,241 CassandraDaemon.java:749 - Exception encountered during startup
java.lang.RuntimeException: Unable to gossip with any peers
at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1435) ~[apache-cassandra-3.11.4.jar:3.11.4]
at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:566) ~[apache-cassandra-3.11.4.jar:3.11.4]
at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:823) ~[apache-cassandra-3.11.4.jar:3.11.4]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:683) ~[apache-cassandra-3.11.4.jar:3.11.4]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:632) ~[apache-cassandra-3.11.4.jar:3.11.4]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:388) [apache-cassandra-3.11.4.jar:3.11.4]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:620) [apache-cassandra-3.11.4.jar:3.11.4]
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:732) [apache-cassandra-3.11.4.jar:3.11.4]
INFO [main] 2019-08-15 18:47:07,384 YamlConfigurationLoader.java:89 - Configuration location: file:/etc/cassandra/cassandra.yaml
Note that this exception is not thrown immediately: it appears a few minutes after the node starts.
Information about our cassandra setup
cassandra version: 3.11.4
JDK version: openjdk-8.
Linux: Ubuntu 18.04 (bionic).
cassandra.yaml
endpoint_snitch: Ec2MultiRegionSnitch
server_encryption_options:
    internode_encryption: dc
    keystore: <omitted>
    keystore_password: <omitted>
    truststore: <omitted>
    truststore_password: <omitted>
client_encryption_options:
    enabled: false
cassandra-rackdc.properties
prefer_local=true
No obvious errors in the SSL output
When starting cassandra with JVM_OPTS="$JVM_OPTS -Djavax.net.debug=ssl" added to cassandra-env.sh we see SSL logs printed to stdout (Note: Subject and Issuer were omitted on purpose).
found key for : cassy-us-west-2
adding as trusted cert:
Subject: ...
Issuer: ...
Algorithm: RSA; Serial number: 0xdad28d843fc73325d4c1a75207d4e74
Valid from Fri May 27 00:00:00 UTC 2016 until Tue May 26 23:59:59 UTC 2026
...
trigger seeding of SecureRandom
done seeding SecureRandom
Compared against the Java SE SSL/TLS connection debugging documentation, this looks correct. Note, however, that this series of messages (along with the RSA key signature output) is repeated several times in quick succession. We never see any messages about the truststore being added; that might be something that only happens when a connection is initiated (?)
Additionally, we do see cassandra report that the Encrypted Messaging service has been started.
INFO [main] 2019-08-15 18:45:31,022 MessagingService.java:704 - Starting Encrypted Messaging Service on SSL port 7001
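The log line above shows the encrypted messaging service listening on its own port (7001) rather than on the regular storage port. Below is a minimal sketch, in Python, for checking from one node whether a peer accepts TCP connections on the ports involved. The port numbers are the Cassandra defaults (7000 storage, 7001 SSL storage, 9042 native transport) and are an assumption about this cluster; port_open and probe_ports are hypothetical helper names.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def probe_ports(host: str, ports=(7000, 7001, 9042), timeout: float = 3.0) -> dict:
    """Map each port to whether it accepted a TCP connection.

    The default ports are Cassandra's defaults (an assumption here):
    7000 = storage_port, 7001 = ssl_storage_port, 9042 = native transport.
    """
    return {port: port_open(host, port, timeout) for port in ports}
```

For example, probe_ports("10.0.2.7") run from a neighbouring node would show whether 7001 is reachable at all, independently of any certificate problem.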
Doesn't appear to be a cassandra.yaml configuration problem
We can bring the node back online by simply configuring internode_encryption: none. This action seems to rule out a broadcast_address or rpc_address configuration problem.
How we built our keystore/truststores
We followed the basic template in the DataStax docs for preparing SSL certificates. One minor difference is that our private keys and CSRs were generated using openssl, one per region (we plan to share the key and signed cert across the nodes in each region). They were created using the following command template:
openssl req -new -newkey rsa:2048 -out cassy-<region>.csr -keyout cassy-<region>.key -config cassy-<region>.conf -subj "..." -nodes -sha256
The generated CSR was then signed by an internal root CA. Because we generated our files using openssl, we had to build our jks files by importing our certs into them.
Commands to generate truststore
We distribute this one file to all nodes.
keytool -importcert \
    -keystore generic-server-truststore.jks \
    -alias rootCa \
    -file rootCa.crt \
    -noprompt \
    -keypass omitted \
    -storepass omitted
Commands to generate keystore
This was done once per region. Essentially, we created a keystore with keytool, deleted the generated key entry, and then imported our own key entry with keytool from a PKCS12 file.
keytool -genkeypair -keyalg RSA -alias cassy-${region} -keystore cassy-${region}.jks -storepass omitted -keypass omitted -validity 365 -keysize 2048 -dname "..."
keytool -delete -alias cassy-${region} -keystore cassy-${region}.jks -storepass omitted
openssl pkcs12 -export -in signed_certs/${region}.pem -inkey keys/cassandra.${region}.key -name cassy-${region} -out ${region}.p12
keytool -importkeystore -deststorepass omitted -destkeystore cassy-${region}.jks -srckeystore ${region}.p12 -srcstoretype PKCS12
keytool -importcert -keystore cassy-${region}.jks -alias rootCa -file ca.crt -noprompt -keypass omitted -storepass omitted
Looking back at this, I don't remember why we used keytool to generate a keypair/keystore, then deleted and imported. I think it was because the keytool importkeystore command refused to run if the keystore didn't already exist.
ca.crt and pem file
The ca.crt file contains the root certificate and the intermediate certificate that was used to sign the CSR. The pem file contains the signed certificate returned for our CSR, the intermediate cert, and the root CA (in that order).
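As a quick sanity check on the file layout described above, the following Python sketch counts the certificate blocks in a PEM bundle. It only inspects the PEM framing and does not parse or validate the certificates; split_pem_certs and count_certs_in_file are hypothetical helper names.

```python
import re

# Matches one PEM-framed certificate, including its BEGIN/END markers.
PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
    re.DOTALL,
)


def split_pem_certs(text: str) -> list:
    """Return the PEM certificate blocks found in text, in file order."""
    return PEM_CERT.findall(text)


def count_certs_in_file(path: str) -> int:
    """Count the certificate blocks in the PEM file at path."""
    with open(path, "r", encoding="ascii", errors="replace") as fh:
        return len(split_pem_certs(fh.read()))
```

With the layout described, count_certs_in_file would be expected to return 2 for ca.crt and 3 for the per-region pem file.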
openssl verify ca.crt and pem
openssl verify -CAfile ca.crt us-west-2.pem
signed_certs/us-west-2.pem: OK
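openssl verify checks the files on disk; it says nothing about what the running node presents on the wire. A minimal Python sketch of a handshake probe against the SSL storage port follows, under a few assumptions: the port is the default 7001, hostname checking is disabled because the certificates are shared per region, and the server does not require a client certificate. tls_probe is a hypothetical helper name.

```python
import socket
import ssl
from typing import Optional, Tuple


def tls_probe(host: str, port: int, cafile: Optional[str] = None,
              timeout: float = 5.0) -> Tuple[bool, str]:
    """Attempt a TLS handshake and report (success, detail).

    If cafile is given, the server chain is verified against it;
    otherwise the handshake is attempted without verification.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Certificates are shared per region, so hostnames are not checked.
    ctx.check_hostname = False
    if cafile is None:
        ctx.verify_mode = ssl.CERT_NONE
    else:
        ctx.load_verify_locations(cafile=cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw) as tls:
                return True, f"{tls.version()} {tls.cipher()[0]}"
    except ssl.SSLError as exc:
        # Handshake or certificate verification failure.
        return False, f"TLS error: {exc}"
    except OSError as exc:
        # Refused, timed out, or unreachable.
        return False, f"connection error: {exc}"
```

Calling tls_probe("10.0.2.7", 7001, cafile="ca.crt") would separate a network problem ("connection error") from a certificate problem ("TLS error").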
Command output after enabling encryption
nodetool status (output truncated)
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load        Tokens  Owns (effective)  Host ID  Rack
?N  52.44.11.221    ?           256     25.4%             null     1c
...
?N  52.204.232.195  ?           256     23.2%             null     1d
Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load        Tokens  Owns (effective)  Host ID  Rack
?N  34.209.2.144    ?           256     26.5%             null     2c
UN  52.40.32.177    105.99 GiB  256     23.7%             null     2c
?N  34.210.109.203  ?           256     24.7%             null     2a
...
The only node shown as up (UN) is the node with encryption enabled.
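With 48 nodes, reading this output by eye gets tedious. The sketch below, in Python, groups the nodes in nodetool status output by datacenter and status code. It assumes the text layout shown above (a "Datacenter:" header followed by rows starting with a two-character status/state code); parse_status and addresses_with_status are hypothetical helper names.

```python
import re
from collections import defaultdict

# Row format: status (U, D or ?) + state (N, L, J or M), then the address.
NODE_LINE = re.compile(r"^([UD?])([NLJM])\s+(\S+)")


def parse_status(output: str) -> dict:
    """Map each datacenter name to a list of (address, status_code) tuples."""
    nodes = defaultdict(list)
    dc = None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Datacenter:"):
            dc = line.split(":", 1)[1].strip()
            continue
        match = NODE_LINE.match(line)
        if match and dc is not None:
            status, state, address = match.groups()
            nodes[dc].append((address, status + state))
    return dict(nodes)


def addresses_with_status(nodes: dict, code: str) -> dict:
    """Per datacenter, list the addresses whose status code equals code."""
    return {dc: [addr for addr, c in entries if c == code]
            for dc, entries in nodes.items()}
```

For instance, addresses_with_status(parse_status(text), "UN") lists the nodes this node considers up and normal in each datacenter.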
cqlsh to localhost
cassy-node6:~$ cqlsh
Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})
cqlsh to remote node
Remote node is a node with encryption enabled
cassy-node6:~$ cqlsh 10.0.2.7
Connection error: ('Unable to connect to any servers', {'10.0.2.7': error(111, "Tried connecting to [('10.0.2.7', 9042)]. Last error: Connection refused")})
Behavior we expected
We expected the node to report that the other regions were all down, since traffic to them must now be encrypted. That part works as expected; however, cqlsh failing to connect and intra-datacenter peers being reported as unreachable is unexpected.
Specifically, we expected the node to still show peer nodes within the same datacenter as up and normal, regardless of whether there is a certificate issue. We also expected cqlsh to continue to work.
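The expectation above can be written down as a small decision function. This is a simplified Python model of the documented internode_encryption modes (none, all, dc, rack), not Cassandra's actual code. One assumption is labeled in the code and should be checked against the Cassandra source: a peer whose datacenter is not yet known is treated as being in a different datacenter. is_encrypted is a hypothetical helper name.

```python
from typing import Optional


def is_encrypted(mode: str, local_dc: str, peer_dc: Optional[str],
                 local_rack: str = "", peer_rack: Optional[str] = None) -> bool:
    """Return True if traffic to the peer would go over the encrypted port.

    peer_dc / peer_rack may be None when the peer's location is unknown.
    ASSUMPTION: an unknown location counts as "different".
    """
    if mode == "none":
        return False
    if mode == "all":
        return True
    same_dc = peer_dc is not None and peer_dc == local_dc
    if mode == "dc":
        # Only traffic that crosses datacenters is encrypted.
        return not same_dc
    if mode == "rack":
        # Only traffic within the same rack of the same datacenter is plain.
        same_rack = peer_rack is not None and peer_rack == local_rack
        return not (same_dc and same_rack)
    raise ValueError(f"unknown internode_encryption mode: {mode!r}")
```

Under this model, is_encrypted("dc", "us-west-2", "us-west-2") is False, which matches the expectation that same-region peers stay reachable. If the labeled assumption holds, a freshly started node that has not yet learned a seed's datacenter would try the encrypted port even for a same-region seed, which may be worth ruling out.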
Lastly, we are also trying to figure out if we have a jks certificate problem.
Related
openssl certificate error for WinRM connection
I have a valid, active certificate on a Windows server, and the WinRM listener is active on port 5986 (telnet works). The WinRM connection needs to be established from a Linux server. I didn't copy that certificate anywhere on the Linux server, as I do not know where it should go or how it should be configured. If I try to establish the WinRM connection, I get this error on the Linux server:
openssl s_client -connect 10.7.147.210:5986
No client certificate CA names sent
Peer signing digest: SHA256
Peer signature type: RSA
Server Temp Key: ECDH, P-384, 384 bits
---
SSL handshake has read 1367 bytes and written 447 bytes
Verification error: unable to verify the first certificate
I tried referencing the CAfile, CApath and cert options, but without success:
openssl s_client -cert winrmcert.pem -key winrmcert.key -CApath . -connect 10.7.147.210:5986
openssl s_client -CAfile winrmcert.pem -connect 10.7.147.210:5986
Can you please tell me what I need to do and configure on the Linux server so that it accepts the certificate generated on the Windows server for the WinRM connection? I am not an expert on this topic, so I would appreciate any useful instructions. Thank you
while starting orderer service in multihost env orderer1-org0 | panic: runtime error: index out of range [1] with length 1
Environment: multihost (4 orgs, 1 org hosting the entire Raft cluster). I followed the fabric-ca operations guide, and everything went fine up to the genesis block, but while creating the Docker container for the ordering service I hit the following error:
[orderer.common.server] initializeServerConfig -> INFO 004 Starting orderer with TLS enabled
orderer1-org0 | panic: runtime error: index out of range [1] with length 1
The commands below confirm that the hashes for the key and the cert are the same, but they don't match the tls-ca one:
openssl pkey -in hyperledger/org2/peer1/tls-msp/keystore/key.pem -pubout -outform pem | sha256sum
openssl x509 -in hyperledger/org2/peer1/tls-msp/signcerts/cert.pem -pubkey -noout -outform pem | sha256sum
openssl x509 -in hyperledger/org2/peer1/tls-msp/tlscacerts/tls-orderer1-org0-7052.pem -pubkey -noout -outform pem | sha256sum
Can you help identify which step in the operations guide would have caused the error, so that I can rerun from there? If I need to restart completely, what change or caution do I need to consider? Note that the passwords for the ca and tls-ca differ for certain identities, as per the latest operations guide; I hope this didn't cause the issue. Attached are the Docker file for the orderer and a screenshot of the file.
Hyperledger Fabric: Chain file does not exist at /etc/hyperledger/fabric-ca-server/ca-chain.pem
We get this error when trying to enroll a user against an intermediate CA:
root@dda3b6a7d56c:/home# fabric-ca-client enroll -u http://ica-admin:ica-adminpw@ica-jnj:7054 -M ica-admin
2019/03/21 16:47:27 [INFO] Created a default configuration file at /root/.fabric-ca-client/fabric-ca-client-config.yaml
2019/03/21 16:47:27 [INFO] generating key: &{A:ecdsa S:256}
2019/03/21 16:47:27 [INFO] encoded CSR
Error: Response from server: Error Code: 0 - Chain file does not exist at /etc/hyperledger/fabric-ca-server/ca-chain.pem
We started our intermediate CA (the ica-jnj server) like this:
root@710d3b5984cd:/etc/hyperledger/fabric-ca-server# fabric-ca-server start -b ica-admin:ica-adminpw -u http://admin:adminpw@rca-jnj:7054
We are not using any TLS. How can we fix this error?
The most likely cause of this error is that the files ca-cert.pem and ca-key.pem were not deleted before starting the intermediate CA. When an instance of fabric-ca is created, it automatically comes with these two files inside the /etc/hyperledger/fabric folder. They need to be deleted for an intermediate CA. Once you do that, after starting the fabric-ca-server you should see a ca-chain.pem file in the directory. The chain file can be inspected by running:
openssl crl2pkcs7 -nocrl -certfile ca-chain.pem | openssl pkcs7 -print_certs -text -noout
This will show the chain from the intermediate CA to the root CA.
Logstash Forwarder
I am trying to send some logs to a logstash server, using logstash forwarder to forward them, but the connection times out:
2015/03/04 08:19:15.266955 Started harvester at end of file (current offset now 10659): /apps/azuga-dds/logs/amqData.log
2015/03/04 08:19:15.267089 Setting trusted CA from file: /etc/logstash-forwarder/logstash-forwarder.crt
2015/03/04 08:19:15.290016 Connecting to [10.90.9.242]:5000 (ec2-54-70-33-51.us-west-2.compute.amazonaws.com)
2015/03/04 08:19:20.290259 Failure connecting to 10.90.9.242: dial tcp 10.90.9.242:5000: i/o timeout
2015/03/04 08:19:21.291691 Connecting to [10.90.9.242]:5000 (ec2-54-70-33-51.us-west-2.compute.amazonaws.com)
2015/03/04 08:19:26.291903 Failure connecting to 10.90.9.242: dial tcp 10.90.9.242:5000: i/o timeout
2015/03/04 08:19:27.293218 Connecting to [10.90.9.242]:5000 (ec2-54-70-33-51.us-west-2.compute.amazonaws.com)
Any idea how to resolve this issue?
You may have a problem with the SSL certificate; checking the certificate may help. Also make sure that you are using the same JVM version for logstash-forwarder, logstash and elasticsearch. Generate the certificate with your logstash server's IP: the log shows that you are connecting to the host by an IP that is not listed in the certificate. Try:
openssl s_client -showcerts -connect host:port
Try generating a new SSL certificate on the logstash server (10.90.9.242) with an IP SAN (subject alternative name). This means editing /etc/ssl/openssl.cnf and adding the following under the [v3_ca] section:
subjectAltName = IP:10.90.9.242
Only afterwards, generate the cert and key by running:
openssl req -x509 -batch -nodes -newkey rsa:2048 -keyout /etc/pki/tls/private/logstash-forwarder.key -out /etc/pki/tls/certs/logstash-forwarder.crt -days 3650
Don't forget to restart logstash and to move the crt and key to the paths that logstash-forwarder expects (as written in its config file).
generate key and certificate using keytool
I want to generate a self-signed trusted certificate and a CSR, and sign the CSR with the trusted certificate I created. I am trying to do this with keytool. As the first step, I create a trusted certificate using the command below:
keytool -genkey -alias mytrustCA -keyalg RSA -keystore keystore.jks -keysize 1024
This puts the certificate into the keystore. How can I store it in a file? Also, when I list the contents using
keytool -list -v -keystore cert/test.keystore
the certificate created with the above "genkey" command has the entry type "PrivateKeyEntry". How can I create a trusted cert entry?
In your first command, you have used the -genkey option to generate the keystore named keystore.jks. To export the certificate to a .CER file, you will need to use the -export option of keytool. An example is:
keytool -v -export -file mytrustCA.cer -keystore keystore.jks -alias mytrustCA
This will generate a file named mytrustCA.cer.
To generate a certificate request to send to a CA for obtaining a signed certificate, you will need to use the -certreq option of keytool. An example is:
keytool -v -certreq -keystore keystore.jks -alias mytrustCA
This will ask for the keystore password and, on successful authentication, it will show the certificate request as given below (a sample).
-----BEGIN NEW CERTIFICATE REQUEST-----
MIIBtDCCAR0CAQAwdDELMAkGA1UEBhMCSU4xFDASBgNVBAgTC01haGFyYXNodHJhMQ8wDQYDVQQH
EwZNdW1iYWkxEjAQBgNVBAoTCU1pbmRzdG9ybTEUMBIGA1UECxMLRW5naW5lZXJpbmcxFDASBgNV
BAMTC1JvbWluIElyYW5pMIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQCqOLEumwLHlzIUAPD6
Ab1pVp84mhSNCCcUKInZbSdiDYnKSr46EjEw0PtZOVPJbM4ZG3bZsOboYr0YfViJi41o4yJICFAZ
8wCQQxPK/4N8MPV7C5WDH28kRKGH/Pc2e7CxV+as573I34QmkINk7fEyERMDwP/WgmrcKZgL0sfy
ewIDAQABoAAwDQYJKoZIhvcNAQEFBQADgYEAlcpjOUZFP9ixskXSA7HNlioWwjbL9f9rQskJ9rK8
kGLJ1td+mqqm20yo/JrKCzZjOMqr/aL6Zw2dkoyU34T9HnR2Bs3SgKn6wlYsYEVvVBk71Ec6PeTi
e+fhfNQEHsj4wuB4qixO3s1jtsLDy+DpTzYguszczwxXGFVNuk+y2VY=
-----END NEW CERTIFICATE REQUEST-----
You will need to send this certificate request to the CA, or paste it into the certificate signer's web page. Alternatively, you can redirect this output to a file instead of the console as follows:
keytool -v -certreq -keystore keystore.jks -alias mytrustCA > mycertreq.txt
This is a command line example without any interactive prompts; it may be easier to work this way and document all commands in a text file.
Create a Java keystore file and a self-signed certificate key
keytool -genkey -alias server -keyalg RSA -keysize 2048 -sigalg SHA256withRSA -storetype JKS \
  -keystore my.server.com.jks -storepass mypwd -keypass mypwd \
  -dname "CN=my.server.com, OU=EastCoast, O=MyComp Ltd, L=New York, ST=, C=US" \
  -ext "SAN=dns:my.server.com,dns:www.my.server.com,ip:11.22.33.44" \
  -validity 7200
keytool -keystore my.server.com.jks -storepass mypwd -list -v
You can already use this keystore (.jks) file in Tomcat, but browsers will give a self-signed certificate warning. Pass the SubjectAlternativeName extension argument with one or more DNS names and an optional IP address.
Create a certificate signing request file
keytool -certreq -alias server -file my.server.com.csr \
  -keystore my.server.com.jks -storepass mypwd \
  -ext "SAN=dns:my.server.com,dns:www.my.server.com,ip:11.22.33.44"
keytool -printcertreq -file my.server.com.csr
Send the .csr file to the certificate authority (CA) operator for signing; you should later receive a certificate (.cer) file. You must pass the SubjectAlternativeName extension argument a second time here.
Import the certificate file into a keystore
keytool -import -trustcacerts -keystore my.server.com.jks -storepass mypwd \
  -alias server -file my.server.com.cer
This command pairs your private key with a public certificate issued by a trusted CA. Browsers should not give a certificate warning anymore.
Import intermediate CA certs
keytool.exe -importcert -trustcacerts -file SomeCA.cer -alias someca -keystore my.server.com.jks -storepass mypwd
keytool.exe -importcert -trustcacerts -file SomeCAIssuing.cer -alias somecaissuing -keystore my.server.com.jks -storepass mypwd
This imports the CA issuing certificates; you may need to do this before importing your certificate file (.cer).
Your hostname certificate may have an expiration date, so when it is about to expire, create a new signing request (.csr) file from the keystore, send the new csr file to the CA, and import the new certificate (.cer) file.
You are most likely using the jks keystore in a Tomcat web server, so here are tomcat/conf/server.xml https connector examples.
Tomcat 9+
<Connector port="443" protocol="org.apache.coyote.http11.Http11NioProtocol"
    connectionTimeout="20000" maxThreads="150"
    URIEncoding="UTF-8" useBodyEncodingForURI="true"
    maxHttpHeaderSize="65536"
    compression="on" compressionMinSize="2048"
    noCompressionUserAgents="gozilla, traviata"
    compressableMimeType="text/html,text/xml,text/plain,text/css,text/javascript,text/json,application/json"
    SSLEnabled="true" scheme="https" secure="true">
  <SSLHostConfig protocols="all">
    <Certificate certificateKeystoreFile="my.server.com.jks"
        certificateKeystoreType="JKS"
        certificateKeystorePassword="mypwd"
        certificateKeyAlias="server" />
  </SSLHostConfig>
</Connector>
Tomcat 8.5 (on versions older than 8.0 you may need to drop the ciphers argument)
<Connector port="8443" protocol="org.apache.coyote.http11.Http11NioProtocol"
    disableUploadTimeout="true" useBodyEncodingForURI="true"
    acceptCount="300" acceptorThreadCount="2" maxThreads="400"
    compressableMimeType="text/html,text/xml,text/plain,text/css,text/javascript,text/json,application/json"
    compression="off" compressionMinSize="2048"
    keystoreFile="my.server.com.jks" keystorePass="mypwd" keyAlias="server"
    SSLEnabled="true" scheme="https" secure="true" clientAuth="false"
    sslEnabledProtocols="+TLSv1,+TLSv1.1,+TLSv1.2"
    ciphers="
      TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384,
      TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384,
      TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA384,
      TLS_ECDH_RSA_WITH_AES_256_CBC_SHA384,
      TLS_DHE_DSS_WITH_AES_256_CBC_SHA256,
      TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,
      TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,
      TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA,
      TLS_ECDH_RSA_WITH_AES_256_CBC_SHA,
      TLS_DHE_DSS_WITH_AES_256_CBC_SHA,
      TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,
      TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256,
      TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA256,
      TLS_ECDH_RSA_WITH_AES_128_CBC_SHA256,
      TLS_DHE_DSS_WITH_AES_128_CBC_SHA256,
      TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,
      TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,
      TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA,
      TLS_ECDH_RSA_WITH_AES_128_CBC_SHA,
      TLS_DHE_DSS_WITH_AES_128_CBC_SHA,
      TLS_ECDHE_ECDSA_WITH_RC4_128_SHA,
      TLS_ECDH_ECDSA_WITH_RC4_128_SHA,
      TLS_ECDH_RSA_WITH_RC4_128_SHA,
      TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,
      TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,
      TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,
      TLS_RSA_WITH_AES_256_GCM_SHA384,
      TLS_ECDH_ECDSA_WITH_AES_256_GCM_SHA384,
      TLS_ECDH_RSA_WITH_AES_256_GCM_SHA384,
      TLS_DHE_DSS_WITH_AES_256_GCM_SHA384,
      TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
      TLS_RSA_WITH_AES_128_GCM_SHA256,
      TLS_ECDH_ECDSA_WITH_AES_128_GCM_SHA256,
      TLS_ECDH_RSA_WITH_AES_128_GCM_SHA256,
      TLS_DHE_DSS_WITH_AES_128_GCM_SHA256,
      TLS_ECDHE_ECDSA_WITH_3DES_EDE_CBC_SHA,
      TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA,
      TLS_ECDH_ECDSA_WITH_3DES_EDE_CBC_SHA,
      TLS_ECDH_RSA_WITH_3DES_EDE_CBC_SHA,
      SSL_RSA_WITH_RC4_128_MD5,
      SSL_RSA_WITH_RC4_128_SHA,
      TLS_EMPTY_RENEGOTIATION_INFO_SCSV
    " />