How to connect Cassandra from gcloud cluster using python - python-3.x

We try to connect cluster using bash script using Jupyter notebook :
!gcloud compute --project "project_name" ssh --zone "us-central1-a" "cassandra-abc-m"
After that we try to connect using :
import cql
con= cql.connect(host="127.0.0.1",port=9160,keyspace="testKS")
cur=con.cursor()
result=cur.execute("select * from TestCF")
How to inter-connect both?
Kindly help me for it.

As I understand the question, you are SSHing out to a Google Compute (GCP) instance (running Cassandra) and are then trying to run a Python script to connect to the local node. I see two problems in your cql.connect line.
First, Cassandra does not use port 9160 for CQL. CQL uses port 9042. I find this point confuses people so much, that I recommend not setting port= at all. The driver will use the default, which should work.
Secondly, if you deployed Cassandra to a GCP instance, then you probably changed listen_address and rpc_address. This means Cassandra cannot bind to 127.0.0.1. You need to use the value defined in the yaml's rpc_address (or broadcast_rpc_address) property.
$ grep rpc_address cassandra.yaml
rpc_address: 10.19.17.5
In my case, I need to specify 10.19.17.5 if I want to connect either locally or remote.
tl;dr;
Don't specify the port.
Connect to your external-facing IP address, as 127.0.0.1 will never work.

Related

Cassandra not started correctly

I try to build Cassandra cluster with 3 nodes
On node 1 i've got a trouble when i try use cqlsh:
Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})
In cassandra.yaml i dont't see strings with ip 127.0.0.1, i wrote there correct parametres (in my mind)
cluster_name: 'cassandra_cluster'
seeds: "node_public_ip_1,node_public_ip_2,node_public_ip_3"
listen_address: node_public_ip_1
rpc_adress: node_public_ip_1
in cassandra-topology.properties:
`# Cassandra Node IP=Data Center:Rack`
node_public_ip_1=dc1:rac1
node_public_ip_2=dc1:rac1
node_public_ip_3=dc1:rac1
What i do wrong?
Thanks!
When you run cqlsh without specifying the host, it defaults to localhost (127.0.0.1). You need to specify the client IP address (rpc_address) of the node you want to connect to when running cqlsh.
Also, the standard recommendation for multi-homed servers is to use the private IP for internal cluster comms and the public IP for client connections. This means that in cassandra.yaml you would configure:
listen_address: private_ip
rpc_address: public_ip
Since the seeds are used for seeding cluster communication, you would also configure the seeds list with the nodes' private IP addresses.
Finally, if you're using the old PropertyFileSnitch, you should also configure cassandra-topology.properties with the nodes' private IP addresses.
However, PFS is really old and in almost all cases our recommendation is to use GossipingPropertyFileSnitch since it has all the benefits of being able to expand the cluster in the future without any downsides as I've explained in this post -- https://community.datastax.com/questions/8887/.
It is also a lot simpler to manage GPFS since you only need to set a single node's DC and rack configuration in cassandra-rackdc.properties without having to reconfigure all the nodes whenever you add/remove nodes from the cluster.
When you do switch to GPFS, you should delete cassandra-topology.properties on the nodes to prevent any gossip issues in the future as I've explained in this post -- https://community.datastax.com/questions/4621/. Cheers!
Try run the "nodetool status" and "nodetool describecluster" on both nodes of this cluster to identify the IP used by nodes.
and try connect passing the host/ip. like:
cqlsh <host> <port> -u <user> -p <password>

NoHostAvailableError: All host(s) tried for query failed

I've installed Cassandra on one EC2 instance that contains one keyspace with SimpleStrategy and replcation factor 1.
I've also made port 9042 accessible from anywhere in the security group.
I have a Node.js application that contains the following code:
const cassandra = require('cassandra-driver');
const client = new cassandra.Client({ contactPoints: ['12.34.567.890:9042',], keyspace: 'ks1' });
const query = 'CREATE TABLE table.name ( field1 text, field2 text, field3 counter, PRIMARY KEY (field1, field2));';
client.execute(query)
.then(result => console.log(result));
which produces the following error:
NoHostAvailableError: All host(s) tried for query failed. First host
tried, 12.34.567.890:9042: DriverError: Connection timeout. See
innerErrors.
I use cassandra-driver.
I've made sure Cassandra is running.
Edit:
As Aaron suggested, I have installed cqlsh on the client machine. When I go cqlsh 12.34.567.890 9042, it returns:
Connection error: ('Unable to connect to any servers',
{'12.34.567.890': error(10061, "Tried connecting to [('12.34.567.890',
9042)]. Last error: No connection could be made because the target
machine actively refused it")})
As Aaron suggeted, I have edited Cassandra.yaml on the server and replaced localhost with 12.34.567.890. I'm still getting the same error though.
First of all, you don't need to specify the port. Try this:
const client = new cassandra.Client({ contactPoints: ['12.34.567.890'], keyspace: 'ks1' });
Secondly, where is your NodeJS app running from? Install and run cqlsh from there, just to make sure connection is possible. You can also use telnet to make sure you can connect to your node on 9042.
Also, you're going to want to enable authentication and authorization, and never use SimpleStrategy again. Enabling auth and building your keyspaces with NetworkTopologyStrategy are good habits to get into.
I just noticed that you said this:
instance that contains one keystore
Did you mean "keyspace" or are you using client-to-node SSL? If so, you're going to need to adjust your connection code to present a SSL certificate which matches the one in your node's keystore.
If you're still having problems, the next thing to do, would be to go about ensuring that you are connecting to the correct IP address. Grep your cassandra.yaml to see that:
$ grep "_address:" conf/cassandra.yaml
listen_address: 192.168.0.2
broadcast_address: 10.1.1.4
# listen_on_broadcast_address: false
rpc_address: 192.168.0.2
broadcast_rpc_address: 10.1.1.4
If they're configured, you'll want to use the "broadcast" address. These different addresses are typically useful for deployments where you have both an internal and external IP address.
$ grep "_address:" conf/cassandra.yaml
listen_address: localhost
# broadcast_address: 1.2.3.4
# listen_on_broadcast_address: false
rpc_address: localhost
# broadcast_rpc_address: 1.2.3.4
If you see output that looks like this, it means that Cassandra is listening on your local IP of 127.0.0.1. In which case, you wouldn't even need to specify it.
grep "_address:" cassandra.yaml returns exactly what you wrote in the second quote (with the localhost). Is it good or I need to change it?
You will need to change this. Otherwise, it will only accept connections on 127.0.0.1, which will not allow anything outside that node to connect to it.
Then what should I write there? I guess Cassandra is not supposed to be aware to the IP address of the machine that hosting it.
Actually, the main problem is that Cassandra is very aware of which IP it is on. Since you're trying to connect on 12.34.567.890 (which I know isn't a real IP), you should definitely use that.
You only need to specify broadcast addresses if each of your instances has both internal and external IP addresses. Typically the internal address gets specified as both rpc and listen, while the external becomes the broadcast addresses. But if your instance only has one IP, then you can leave the broadcast addresses commented-out.

Spark standalone cluster doesn't accept connections

I'm trying to run the simplest Spark standalone cluster on an Azure VM. I'm running a single master, with a single worker running on the same machine. I can access the Web UI perfectly, and can see that the worker is registered with the master.
But I can't connect to this cluster using spark-shell on my laptop. When I looked in the logs, I see
15/09/27 12:03:33 ERROR ErrorMonitor: dropping message [class akka.actor.ActorSelectionMessage]
for non-local recipient [Actor[akka.tcp://sparkMaster#40.113.XXX.YYY:7077/]]
arriving at [akka.tcp://sparkMaster#40.113.XXX.YYY:7077] inbound addresses
are [akka.tcp://sparkMaster#somehostname:7077]
akka.event.Logging$Error$NoCause$
Now I think the reason why this is happening is that on Azure, every virtual machine sits behind a type of firewall/load balancer. I'm trying to connect using the Public IP that Azure tells me (40.113.XXX.YYY), but Spark refuses to accept connections because this is not the IP of an interface.
Since this IP is not of the machine, I can't bind to an interface either.
How can I get Spark to accept these packets as well?
Thanks!
you can try to set the ip address to MASTER_IP which in spark-env.sh instead of hostname.
I got the same problem and was able to resolve by getting the configured --ip parameter of the command line that runs spark:
$ ps aux | grep spark
[bla bla...] org.apache.spark.deploy.master.Master --ip YOUR_CONFIGURED_IP [bla bla...]
Then I was able to connect to my cluster by using exactly the same string as YOUR_CONFIGURED_IP:
spark-shell --master spark://YOUR_CONFIGURED_IP:7077

How do I connect to local cassandra db

I have a cassandra db running locally. I can see it working in Ops Center. However, when I open dev center and try to connect I get a cryptic "unable to connect" error.
How can I get the exact name / connectionstring that I need to use to connect to this local cassandra db via dev center?
The hostname/IP to connect to is specified in the listen_address property of your cassandra.yaml.If you are connecting to Cassandra from your localhost only (a sandbox machine), then you can set the listen_address in your cassandra.yaml accordingly:
listen_address: localhost
When you start Cassandra, you should see lines similar to this either in STDOUT or in your system.log (timestamps removed for brevity):
Starting listening for CQL clients on localhost/127.0.0.1:9042...
Binding thrift service to localhost/127.0.0.1:9160
Listening for thrift clients...
These lines indicate which address you should be using to connect to your cluster. The first way to test your connection, is with clqsh. Note that cqlsh will connect to "localhost" by default. If you are connecting to a host/IP other than localhost, then you will need to specify it on the command line.
$ cqlsh
Connected to Test Cluster at localhost:9042.
[cqlsh 5.0.1 | Cassandra 2.1.0-rc5-SNAPSHOT | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.
cqlsh>
If this works, then you should also be able to connect (and test) from DataStax Dev Center (also on your local machine) by defining a connection to localhost, like this:
At this point, you should be able to connect via your application code (Java CQL3 driver shown):
cluster = Cluster.builder().addContactPoint("localhost").build();
Metadata metadata = cluster.Metadata;
Console.WriteLine("Connected to cluster: " + metadata.ClusterName.ToString());
Session session = cluster.connect();

Apache Cassandra remote access

I have installed Apache Cassandra on the remote Ubuntu server. How to allow remote access for an Apache Cassandra database? And how to make a connection?
Remote access to Cassandra is via its thrift port (although note that the JMX port can be used to perform some limited operations).
The thrift port is defined in cassandra.yaml by the rpc_port parameter, which defaults to 9160. Your cassandra node should be bound to the IP address of your server's network card - it shouldn't be 127.0.0.1 or localhost which is the loopback interface's IP, binding to this will prevent direct remote access. You configure the bound address with the rpc_address parameter in cassandra.yaml. Setting this to 0.0.0.0 says "listen on all network interfaces" which may or may not be suitable for you.
To make a connection you can use:
The cassandra-cli in the cassandra distribution's bin directory provides simple get / set / list operations and depends on Java
The cqlsh shell which provides CQL access to cassandra, this depends on Python
A higher level interface such as Apollo
For anyone finding this question now, the top answer is out of date.
Apache Cassandra's thrift interface is deprecated and will be removed in Cassandra 4.0. The default client port is now 9042.
As noted by Tyler Hobbs, you will need to ensure that the rpc_address parameter is not set to 127.0.0.1 or localhost (it is localhost by default). If you set it to 0.0.0.0 to listen on all interfaces, you will also need to set broadcast_rpc_address to either the node's public or private IP address (depending on how you plan to connect to Cassandra)
Cassandra-cli is also deprecated and Apollo is no longer active. Use cqlsh in lieu of cassandra-cli and the Java driver in lieu of Apollo.
I do not recommend making the JMX port accessible remotely unless you secure it properly by enabling SSL and strong authentication.
Hope this is helpful.
cassandra 3.11.3
I did the following to get mine working. Changes in cassandra.yaml :
start_rpc: true
rpc_address: 0.0.0.0
broadcast_rpc_address: ***.***.***.***
broadcast_rpc_address is the address of machine where cassandra is installed
seed_provider:
- class_name: ...
- seeds: "127.0.0.1, ***.***.***.***"
In seeds i added/appended the ip address of machine where cassandra was running.
I accessed it from windows using tableplus. In tableplus, I wrote the ip address of the cassandra machine, in the port section I wrote 9042 and used the username and password, which i used for ssh connection.
For anyone using Azure, the issue may be that you need to create a public ip address since the virtual ip points to the cloud service itself and not the virtual machine. You can find more info in this post

Resources