How to connect to Cassandra Database using Python code - cassandra

I followed the steps given in https://docs.datastax.com/en/developer/python-driver/3.25/getting_started/ to connect to a Cassandra database from Python code, but when I run the code snippet I still get:
NoHostAvailable: ('Unable to connect to any servers', {'hosts"port': OperationTimedOut('errors=None, last_host=None'),
Python version 2.7 and 3 (classpath is set for both the python versions)
Java 1.8 (class path has been set)
Apache cassandra 3.11.6 (apache home classpath has been set)

I tend to use a very simple app to test connectivity to a Cassandra cluster:
from cassandra.cluster import Cluster

# Contact point and CQL port of the node to test (example values)
cluster = Cluster(['10.1.2.3'], port=45678)
session = cluster.connect()
row = session.execute("SELECT release_version FROM system.local").one()
if row:
    print(row[0])
Then run it:
$ python HelloCassandra.py
4.0.6
In your comment you mentioned that you're getting OperationTimedOut which indicates that the driver never got a response back from the node within the client timeout period. This usually means (a) you're connecting to the wrong IP, (b) you're connecting to the wrong CQL port, or (c) there's a network connectivity issue between your app and the cluster.
Make sure that you're using the IP address that you've set in rpc_address of cassandra.yaml. Also make sure that the node is listening for CQL clients on the right port. You can easily verify this by checking the output of a Linux utility such as netstat or lsof, for example:
$ sudo lsof -nPi -sTCP:LISTEN
Cheers!
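The three causes listed above (wrong IP, wrong CQL port, no network path) can be told apart roughly from the client side before involving the driver at all. The following is a minimal sketch using only the Python standard library; the function name probe_tcp and the returned labels are made up for illustration, and the check only proves that a TCP connection can be opened, not that the listener actually speaks CQL.

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Try to open a plain TCP connection and classify the outcome.

    Hypothetical helper: it says nothing about CQL itself, only about
    whether something accepts TCP connections on host:port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"       # something is listening on that address
    except socket.timeout:
        return "timeout"        # no answer: often a firewall or a wrong IP
    except ConnectionRefusedError:
        return "refused"        # host reachable, nothing listening on the port
    except OSError as exc:
        return "error: %s" % exc  # e.g. DNS failure or no route to host


if __name__ == "__main__":
    # Same example address and port as the test app above
    print(probe_tcp("10.1.2.3", 45678))
```

A "timeout" result is the closest match to the driver's OperationTimedOut, while "refused" usually points at the port rather than the network.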

So that error message suggests that the host/port combination either does not have Cassandra running on it or is under heavy load and unable to respond.
Can you edit your question to include the Cassandra connection portion of your code, as well as maybe how you're calling it? I have a test script which I use (and you're welcome to check it out), and here is the connection portion:
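import sys
from cassandra.auth import PlainTextAuthProvider
from cassandra.cluster import Cluster

protocol = 4
# Host, user name and password are read from the command line
hostname = sys.argv[1]
username = sys.argv[2]
password = sys.argv[3]
nodes = []
nodes.append(hostname)
auth_provider = PlainTextAuthProvider(username=username, password=password)
cluster = Cluster(nodes, auth_provider=auth_provider, protocol_version=protocol)
session = cluster.connect()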
I call it like this:
$ python3 testCassandra.py 127.0.0.1 aaron notReallyMyPassword
local
One more thing you might try is running nodetool status on the cluster, just to make sure it's running OK.
Edit
local variable 'session' referenced before assignment
So this sounds to me like you're attempting a session.execute before session = cluster.connect(). Have a look at my Git repo (linked above) to see the correct order for instantiating session.
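To illustrate how that error can appear, here is a small sketch that does not need a Cassandra node. FakeCluster is a hypothetical stand-in, not part of the driver, and the try/except pattern is only an assumption about what the failing script might look like: if connect() raises inside a try block and the exception is swallowed, session is never assigned, and the later session.execute fails with exactly this kind of message.

```python
class FakeCluster:
    """Hypothetical stand-in for a cluster object whose connect() fails."""

    def connect(self):
        raise ConnectionError("Unable to connect to any servers")


def run_query_unsafely(cluster):
    try:
        session = cluster.connect()
    except ConnectionError as exc:
        # The failure is swallowed, so execution continues below
        print("connect failed:", exc)
    # session was never assigned if connect() raised -> UnboundLocalError
    return session.execute("SELECT release_version FROM system.local")


def run_query_safely(cluster):
    # Let a connection failure propagate instead of hiding it
    session = cluster.connect()
    return session.execute("SELECT release_version FROM system.local")


if __name__ == "__main__":
    try:
        run_query_unsafely(FakeCluster())
    except UnboundLocalError as exc:
        print("unsafe version:", exc)
    try:
        run_query_safely(FakeCluster())
    except ConnectionError as exc:
        print("safe version:", exc)
```

In the second variant the real problem (the connection failure) is what gets reported, which is usually the more useful error to debug.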
I am not using default port
In that case, make sure the port is being set in the cluster definition. Ex:
port = 19099
cluster = Cluster(nodes,auth_provider=auth_provider, port=port)

Related

Spring Boot app can't connect to Cassandra cluster, driver returning "AllNodesFailedException: Could not reach any contact point"

I've updated spring-boot to v3.0.0 and spring-data-cassandra to v4.0.0, and since then my app is unable to connect to the Cassandra cluster. The cluster is deployed in the stg env, runs on an IPv6 address, and uses a datacenter name other than DC1.
I've added a config class which sets the local DC programmatically:
@Bean(destroyMethod = "close")
public CqlSession session() {
    CqlSession session = CqlSession.builder()
        .addContactPoint(InetSocketAddress.createUnresolved("[240b:c0e0:1xx:xxx8:xxxx:x:x:x]", port))
        .withConfigLoader(
            DriverConfigLoader.programmaticBuilder()
                .withString(DefaultDriverOption.LOAD_BALANCING_LOCAL_DATACENTER, localDatacenter)
                .withString(DefaultDriverOption.AUTH_PROVIDER_PASSWORD, password)
                .withString(DefaultDriverOption.CONNECTION_INIT_QUERY_TIMEOUT, "10s")
                .withString(DefaultDriverOption.CONNECTION_CONNECT_TIMEOUT, "20s")
                .withString(DefaultDriverOption.REQUEST_TIMEOUT, "20s")
                .withString(DefaultDriverOption.CONTROL_CONNECTION_TIMEOUT, "20s")
                .withString(DefaultDriverOption.SESSION_KEYSPACE, keyspace)
                .build())
        //.addContactPoint(InetSocketAddress.createUnresolved(InetAddress.getByName(contactPoints).getHostName(), port))
        .build();
    return session;
}
and this is my application.yml file
spring:
  data:
    cassandra:
      keyspace-name: xxx
      contact-points: [xxxx:xxxx:xxxx:xxx:xxx:xxx]
      port: xxx
      local-datacenter: xxxx
      use-dc-aware: true
      username: xxxxx
      password: xxxxx
      ssl: true
      SchemaAction: CREATE_IF_NOT_EXISTS
Locally I was able to connect to Cassandra (by default it points to localhost), but in the stg env my application is not able to connect to that cluster.
Logs from my stg env:
Caused by: com.datastax.oss.driver.api.core.AllNodesFailedException: Could not reach any contact point, make sure you've provided valid addresses (showing first 1 nodes, use getAllErrors() for more): Node(endPoint=/[240b:c0e0:102:xxxx:xxxx:x:x:x]:3xxx, hostId=null, hashCode=4e9ba6a8): [com.datastax.oss.driver.api.core.connection.ConnectionInitException: [s0|control|id: 0x984419ed, L:/[240b:c0e0:102:5dd7:xxxx:x:x:xxx]:4xxx - R:/[240b:c0e0:102:xxxx:xxxx:x:x:x]:3xxx] Protocol initialization request, step 1 (OPTIONS): unexpected failure (com.datastax.oss.driver.api.core.connection.ClosedConnectionException: Lost connection to remote peer)]
Network
You appear to have a networking issue. The driver can't connect to any of the nodes because they are unreachable from a network perspective as it states in the error message:
... AllNodesFailedException: Could not reach any contact point ...
You need to check that:
you have configured the correct IP addresses,
you have configured the correct CQL port, and
there is network connectivity between your app and the cluster.
Security
I also noted that you configured the driver to use SSL:
ssl: true
but I don't see anywhere where you've configured the certificate credentials and this could explain why the driver can't initiate connections.
Check that the cluster has client-to-node encryption enabled. If it does then you need to prepare the client certificates and configure SSL on the driver.
Driver build
This post appears to be a duplicate of another question you posted but is now closed due to lack of clarity and details.
In that question it appears you are running a version of the Java driver not produced by DataStax, as pointed out by @absurdface:
Specifically I note that java-driver-core-4.11.4-yb-1-RC1.jar isn't a Java driver artifact released by DataStax (there isn't even a 4.11.4 Java driver release). This could be relevant for reasons we'll get into ...
We are not aware of where this build came from and without knowing much about it, it could be the reason you are not able to connect to the cluster.
We recommend that you switch to one of the supported builds of the Java driver. Cheers!
A hearty +1 to everything @erick-ramirez mentioned above. I would also expand on his answers with an observation or two.
Normally spring-data-cassandra is used to automatically configure a CqlSession and make it available for injection (or for use in CqlTemplate etc.). That's what you'd normally be configuring with your application.yml file. But you're apparently creating the CqlSession directly in code, which means that spring-data-cassandra isn't involved... and therefore what's in your application.yml likely isn't being used.
This analysis strongly suggests that your CqlSession is not being configured to use SSL. My understanding is that your testing sequence went as follows:
Tested app locally on a local server, everything worked
Tested app against test environment, observed the errors above
If this sequence is correct and you have SSL enabled in your test environment but not on your local Cassandra instance, that could very easily explain the behaviour you're describing.
It could also explain the specific error you cite. "Lost connection to remote peer" indicates that something is unexpectedly killing your socket connection before any protocol messages are exchanged... and an SSL issue would cause almost exactly that behaviour.
I would recommend checking the SSL configuration for both servers involved in your testing. I would also suggest consulting the SSL-related documentation referenced by Erick above and confirm that you have all the relevant materials when building your CqlSession.
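One rough way to compare the two environments is to check from the client machine whether the CQL port completes a TLS handshake at all. The sketch below uses only the Python standard library; speaks_tls is a made-up helper name, certificate verification is deliberately disabled because the only question is whether the port speaks TLS, and a False result can also mean the port is simply unreachable.

```python
import socket
import ssl


def speaks_tls(host, port, timeout=3.0):
    """Return True if a TLS handshake with host:port succeeds.

    Hypothetical diagnostic helper: verification is turned off on purpose,
    so this must not be used as a template for production connections.
    """
    context = ssl.create_default_context()
    context.check_hostname = False      # must be disabled before verify_mode
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                return tls.version() is not None
    except (ssl.SSLError, OSError):
        # Handshake rejected, timed out, or the port was not reachable
        return False


if __name__ == "__main__":
    # 127.0.0.1 and 9042 are placeholder values for a local node
    print(speaks_tls("127.0.0.1", 9042))
```

If this returns True for the test cluster and False for the local node, the client has to be configured with SSL for the former, which matches the explanation above.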
I added the certificate to my Spring application:
public CqlSession session() throws IOException, CertificateException, NoSuchAlgorithmException, KeyStoreException, KeyManagementException {
    // Load the root certificate from the classpath
    Resource resource = new ClassPathResource("root.crt");
    InputStream inputStream = resource.getInputStream();
    CertificateFactory cf = CertificateFactory.getInstance("X.509");
    Certificate cert = cf.generateCertificate(inputStream);

    // Build an in-memory trust store that contains only that certificate
    TrustManagerFactory trustManagerFactory = TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
    KeyStore keyStore = KeyStore.getInstance(KeyStore.getDefaultType());
    keyStore.load(null);
    keyStore.setCertificateEntry("ca", cert);
    trustManagerFactory.init(keyStore);

    SSLContext sslContext = SSLContext.getInstance("TLSv1.3");
    sslContext.init(null, trustManagerFactory.getTrustManagers(), null);

    return CqlSession.builder()
        .withSslContext(sslContext)
        .addContactPoint(new InetSocketAddress(contactPoints, port))
        .withAuthCredentials(username, password)
        .withLocalDatacenter(localDatacenter)
        .withKeyspace(keyspace)
        .build();
}
So I added the cert file to the CqlSession builder configuration, and this allowed me to connect to the remote Cassandra cluster.

How to run a http server on EMR master node of a Spark application

I have a Spark streaming application (Spark 2.4.4) running on AWS EMR 5.28.0. In the driver application on the master node, besides setting up the Spark streaming job, I am also running an HTTP server (Akka-http 10.1.6) which can query the driver application for data. I bind to port 6161 like this:
val bindingFuture: Future[ServerBinding] = Http().bindAndHandle(myapiroutes, "127.0.0.1", 6161)
try {
  bindingFuture.map { serverBinding =>
    log.info(s"AlertRestApi bound to ${serverBinding.localAddress}")
  }
} catch {
  case ex: Exception => {
    log.error(s"Failed to bind to 127.0.0.1:6161")
    system.terminate()
  }
}
then I start spark streaming:
ssc.start()
When I test this on local spark, I am able to access http://localhost:6161/myapp/v1/data and get data from spark streaming, everything is good so far.
However, when I run this application in AWS EMR, I cannot access port 6161. When I ssh into the driver node and try to curl my URL, I get this error message:
[hadoop@ip-xxx-xx-xx-x ~]$ curl http://xxx.xx.xx.x:6161/myapp/v1/data
curl: (7) Failed to connect to xxx.xx.xx.x port 6161: Connection refused
When I look at the log on the driver node, I do see that the port is bound. (I don't know why the host shows as 0:0:0:0:0:0:0:0, but I see the same log line in my dev testing, where it works and I am able to access the URL.)
20/04/13 16:53:26 INFO MyApp: MyRestApi bound to /0:0:0:0:0:0:0:0:6161
So my question is, what should I do so that I can access the api at port 6161 on the driver node? I realize Yarn resource manager may be involved but I know nothing about Yarn resource manager to point myself where to investigate.
Please help. Thanks
Are you binding to 127.0.0.1 or to 0.0.0.0 as the host name?
127.0.0.1 works on your local system but not on AWS, because it is the loopback address: the server only accepts connections that originate on the same machine and target 127.0.0.1. In that case you need to use 0.0.0.0 as the host name.
Also make sure that the port is open and that access is allowed from your IP. To do that, go to the inbound rules for your instance and add 6161 as a custom TCP rule, if not done already.
Let me know if this makes any difference.
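The difference between the two bind addresses can be reproduced without Akka or EMR. The sketch below uses plain Python sockets; open_listener and accepts_connections are hypothetical helper names, and port 0 asks the operating system for any free port.

```python
import socket


def open_listener(bind_host, port=0):
    """Start listening on bind_host; port 0 lets the OS pick a free port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((bind_host, port))
    srv.listen(5)
    return srv


def accepts_connections(target_host, port, timeout=1.0):
    """Return True if a TCP connection to target_host:port can be opened."""
    try:
        with socket.create_connection((target_host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    loopback_only = open_listener("127.0.0.1")
    wildcard = open_listener("0.0.0.0")
    # Both are reachable through the loopback address on the same machine
    print(accepts_connections("127.0.0.1", loopback_only.getsockname()[1]))
    print(accepts_connections("127.0.0.1", wildcard.getsockname()[1]))
    loopback_only.close()
    wildcard.close()
```

Only the listener bound to 0.0.0.0 would also answer on the machine's private or public IP, which is the address used in the failing curl command.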

One row test insertion to SQL Server RDS works but full load times out

I have a Glue job script that does this (not showing imports and setup here) and it inserts the row into SQL Server RDS just fine:
columns = ['test']
vals = [("test",)]  # one-element tuple; ("test") without the comma is just a string
df = sqlContext.createDataFrame(vals, columns)
test = DynamicFrame.fromDF(df, glueContext, "test")
datasink = glueContext.write_dynamic_frame.from_catalog(
    frame=test, database="database-name", table_name="table-name")
job.commit()
When I run with this same connection but for a larger test load (ends up being about 100 rows) I get this error:
An error occurred while calling o596.pyWriteDynamicFrame. The TCP/IP connection to the host , port 1433 has failed. Error: "Connection timed out: no further information. Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall
The thing is that I know there's no firewall or security group issue since one row inserts just fine. I've tried adding a loginTimeout parameter to the JDBC connection like so:
jdbc:sqlserver://<host>:<port>;databaseName=dbName;loginTimeout=600;
The documentation indicates that this parameter is supported, but the connection from Glue fails when I add it and succeeds when I remove the loginTimeout parameter.
I've also checked the remote timeout configuration on my SQL Server instance and it shows as 600 seconds which is longer than any of my failed jobs so it couldn't be that.
How can I get around this connection timeout error? It seems to be a limitation built into Glue.
In order to do a JDBC connection with Glue you need to follow the steps in this documentation: https://docs.aws.amazon.com/glue/latest/dg/setup-vpc-for-glue-access.html
We had done that, but it turned out that our self-referencing security group wasn't actually self-referencing. Once we changed that, this part was resolved.
I also had to create the connection as an Amazon RDS connection and not as a JDBC connection, even though it's doing the same thing under the hood.
Even after doing all that I still had issues. It turns out that you need to add the SQL connection specifically to the job, outside of the script. If you hit "Edit Job" you'll see a list of SQL connections there. If the connection you're trying to use isn't on the list of required connections, you will always time out.
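The "self-referencing" condition mentioned above can be checked mechanically. The sketch below assumes the security group is available as a dictionary shaped like one entry of the EC2 DescribeSecurityGroups response (GroupId, IpPermissions, UserIdGroupPairs); the function name and the sample IDs are invented for illustration, and the check only looks for an inbound rule that names the group itself, not at which ports that rule covers.

```python
def is_self_referencing(group):
    """Return True if any inbound rule of the group allows the group itself.

    `group` is assumed to look like one element of the "SecurityGroups"
    list in an EC2 DescribeSecurityGroups response.
    """
    group_id = group["GroupId"]
    for permission in group.get("IpPermissions", []):
        for pair in permission.get("UserIdGroupPairs", []):
            if pair.get("GroupId") == group_id:
                return True
    return False


if __name__ == "__main__":
    # Made-up sample data: the inbound rule points at a different group
    sample = {
        "GroupId": "sg-aaaa1111",
        "IpPermissions": [
            {"IpProtocol": "tcp", "FromPort": 0, "ToPort": 65535,
             "UserIdGroupPairs": [{"GroupId": "sg-bbbb2222"}]},
        ],
    }
    print(is_self_referencing(sample))
```

A group like the sample looks plausible at a glance but is not self-referencing, which is the situation described in the answer.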

Cassandra Connection with Groovy Script In SoapUI

Thanks for your time. I am trying to access a remote Cassandra DB in order to complete my assertions. I can see that the server is running:
Cassandra V 3.0.8.1293
Driver Type: Cassandra CQL
Datastax Java Driver for Apache Cassandra - Core [3.0.5]
So, I am trying with the following simple code to access the DB
import com.datastax.driver.core.*

Cluster cluster = null;
try {
    cluster = Cluster.builder().addContactPoint("x.x.x.x").withCredentials("xxxxxxx", "xxxxxx").withPort(9042).build()
    Session session = cluster.connect();
    ResultSet rs = session.execute("select * from TABLE");
    Row row = rs.one();
} finally {
    // Always release the driver's resources
    if (cluster != null) cluster.close();
}
when I use the cassandra-driver-core-2.0.1.jar I am getting the error :
ERROR:com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /x.x.x.x(null))
I read the documentation and a lot of posts here and on other blogs, and saw that there may be an incompatibility with the driver version. So I tried upgrading the driver to several versions (cassandra-driver-core-2.5, cassandra-driver-core-3, cassandra-driver-core-3.2), but with those I get the following:
ERROR:java.lang.ExceptionInInitializerError
Have also tried to connect using JDBC, but to no avail, using the configuration proposed at this thread
SoapUI JDBC connection with Apache Cassandra
Actually I am running out of ideas. Can anyone propose or point to some direction on how to actually achieve this, either by pointing me to some tutorial or any idea.
Thank you very much
I think you haven't enabled remote access to Cassandra.
Try enabling remote access using the configuration below.
File path: /etc/cassandra/default.conf/cassandra.yaml
rpc_address: 0.0.0.0
broadcast_rpc_address: <serverIPAddress>
After that, restart the Cassandra service.

Error while connecting to Cassandra using Java Driver for Apache Cassandra 1.0 from com.example.cassandra

While connecting to Cassandra using the Java driver for Cassandra by DataStax, the client throws the following error:
Exception in thread "main" com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: [/127.0.0.1])
Please suggest...
Thanks!
My java code is like this:
package com.example.cassandra;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Host;
import com.datastax.driver.core.Metadata;

public class SimpleClient {
    private Cluster cluster;

    public void connect(String node) {
        cluster = Cluster.builder().addContactPoint(node).build();
        Metadata metadata = cluster.getMetadata();
        System.out.println(metadata.getClusterName());
    }

    public void close() {
        cluster.shutdown();
    }

    public static void main(String args[]) {
        SimpleClient client = new SimpleClient();
        client.connect("127.0.0.1");
        client.close();
    }
}
In my case, I ran into this issue because I used the default RPC port of 9160 when connecting. The CQL native transport listens on a different port, which can be found in cassandra.yaml:
start_native_transport: true
# port for the CQL native transport to listen for clients on
native_transport_port: 9042
Once I changed the code to use port 9042 the connection attempt succeeded -
public BinaryDriverTest(String cassandraHost, int cassandraPort, String keyspaceName) {
    m_cassandraHost = cassandraHost;
    m_cassandraPort = cassandraPort;
    m_keyspaceName = keyspaceName;
    LOG.info("Connecting to {}:{}...", cassandraHost, cassandraPort);
    cluster = Cluster.builder().withPort(m_cassandraPort).addContactPoint(cassandraHost).build();
    session = cluster.connect(m_keyspaceName);
    LOG.info("Connected.");
}

public static void main(String[] args) {
    BinaryDriverTest bdt = new BinaryDriverTest("127.0.0.1", 9042, "Tutorial");
}
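To confirm which port a node actually exposes for CQL, the two settings quoted above can be read straight out of cassandra.yaml. This is a minimal sketch in Python that avoids a YAML library by matching only those two top-level keys; the function name native_transport_settings is made up, and it assumes the keys are not indented and not commented out.

```python
import re

# Only the two settings discussed above; commented-out lines start with '#'
_SETTING = re.compile(r"^(start_native_transport|native_transport_port)\s*:\s*(\S+)")


def native_transport_settings(yaml_text):
    """Extract the native transport settings from cassandra.yaml text."""
    settings = {}
    for line in yaml_text.splitlines():
        match = _SETTING.match(line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


if __name__ == "__main__":
    sample = (
        "start_native_transport: true\n"
        "# port for the CQL native transport to listen for clients on\n"
        "native_transport_port: 9042\n"
    )
    print(native_transport_settings(sample))
```

The port reported here is the one to pass to withPort(...) in the driver, rather than the Thrift RPC port.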
I had this issue and it was sorted by setting the ReadTimeout in SocketOptions:
Cluster cluster = Cluster.builder().addContactPoint("localhost").build();
cluster.getConfiguration().getSocketOptions().setReadTimeoutMillis(HIGHER_TIMEOUT);
Go to your Apache Cassandra conf directory and enable the binary protocol
Cassandra binary protocol
The Java driver uses the binary protocol that was introduced in Cassandra 1.2. It only works with a version of Cassandra greater than or equal to 1.2. Furthermore, the binary protocol server is not started with the default configuration file in Cassandra 1.2. You must edit the cassandra.yaml file for each node:
start_native_transport: true
Then restart the node.
I was having the same problem. I had installed Cassandra on a separate Linux PC and tried to connect from a Windows PC, but the connection was refused.
Once I edited cassandra.yaml, set rpc_address to the Linux PC's IP address, and restarted, I was able to connect successfully:
# The address or interface to bind the Thrift RPC service and native transport
# server to.
#
# Set rpc_address OR rpc_interface, not both. Interfaces must correspond
# to a single address, IP aliasing is not supported.
#
# Leaving rpc_address blank has the same effect as on listen_address
# (i.e. it will be based on the configured hostname of the node).
#
# Note that unlike listen_address, you can specify 0.0.0.0, but you must also
# set broadcast_rpc_address to a value other than 0.0.0.0.
#rpc_address: localhost
rpc_address: 192.168.0.10
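The rules spelled out in those comments can be turned into a quick sanity check. The sketch below is only an illustration: rpc_address_problems is an invented name, and it expects the relevant settings to be passed in as a plain dictionary rather than parsing cassandra.yaml itself.

```python
def rpc_address_problems(config):
    """Return a list of likely problems with the rpc_address settings.

    `config` is a plain dict holding the values from cassandra.yaml,
    e.g. {"rpc_address": "0.0.0.0", "broadcast_rpc_address": "192.168.0.10"}.
    """
    problems = []
    rpc = config.get("rpc_address")
    broadcast = config.get("broadcast_rpc_address")
    if rpc in ("localhost", "127.0.0.1"):
        problems.append("rpc_address is a loopback address, so clients on "
                        "other machines cannot connect")
    if rpc == "0.0.0.0" and broadcast in (None, "", "0.0.0.0"):
        problems.append("rpc_address is 0.0.0.0, so broadcast_rpc_address "
                        "must be set to a value other than 0.0.0.0")
    return problems


if __name__ == "__main__":
    print(rpc_address_problems({"rpc_address": "localhost"}))
    print(rpc_address_problems({"rpc_address": "192.168.0.10"}))
```

An empty list for the second call corresponds to the working configuration shown above.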
Just posting this for people who might have the same problem as I did, when I got that error message. Turned out my complex dependency tree brought about an old version of com.google.collections, which broke the CQL driver. Removing this dependency and relying entirely on guava solved my problem.
I was having the same issue testing a new cluster with one node.
After removing this from the Cluster builder I was able to connect:
.withLoadBalancingPolicy(new DCAwareRoundRobinPolicy("US_EAST"))
In my case this was a port issue, which I forgot to update
Old RPC port is 9160
New binary port is 9042
I too encountered this problem, and it was caused by a simple error in the statement that was being submitted.
session.prepare(null);
Obviously, the error message is misleading.
Edit /etc/cassandra/cassandra.yaml and set rpc_address to 0.0.0.0, and set broadcast_rpc_address and listen_address to the IP address of the cluster.
Assuming you have default configurations in place, check the driver version compatibility. Not all driver versions are compatible with all versions of Cassandra, though they claim backward compatibility. Please see the below link.
http://docs.datastax.com/en/developer/java-driver/3.1/manual/native_protocol/
I ran into a similar issue & changing the driver version solved my problem.
Note: Hopefully, you are using Maven (or something similar) to resolve dependencies. Otherwise, you may have to download a lot of dependencies for higher versions of the driver.
Check below points:
i) check server ip
ii) check listening port
iii) the DataStax client dependency must match the server version.
As for the yaml file, the latest versions have the properties below enabled:
start_native_transport: true
native_transport_port: 9042

Resources