DataStax DevCenter fails to connect to the remote Cassandra db

I've installed DataStax Cassandra and it is up and running on my remote machine. Now I am trying to connect via DataStax DevCenter, but it fails.
Before posting this question I read an identical one here: DataStax Devcenter fails to connect to the remote cassandra database
I looked in the cassandra.yaml conf file, but the start_native_transport: true option is not in my file. Where should I look for it?
Also I've changed rpc_address to: 0.0.0.0.
UPDATE:
If I add start_native_transport: true to my cassandra.yaml, Cassandra crashes on restart. Please see the log below:
ERROR 17:48:32,626 Fatal configuration error error
Can't construct a java object for tag:yaml.org,2002:org.apache.cassandra.config.Config; exception=Cannot create property=start_native_transport for JavaBean=org.apache.cassandra.config.Config#ef28a30; Unable to find property 'start_native_transport' on class: org.apache.cassandra.config.Config
in "<reader>", line 10, column 1:
cluster_name: 'Test Cluster'
^
at org.yaml.snakeyaml.constructor.Constructor$ConstructYamlObject.construct(Constructor.java:372)
at org.yaml.snakeyaml.constructor.BaseConstructor.constructObject(BaseConstructor.java:177)
at org.yaml.snakeyaml.constructor.BaseConstructor.constructDocument(BaseConstructor.java:136)
at org.yaml.snakeyaml.constructor.BaseConstructor.getSingleData(BaseConstructor.java:122)
at org.yaml.snakeyaml.Loader.load(Loader.java:52)
at org.yaml.snakeyaml.Yaml.load(Yaml.java:166)
at org.apache.cassandra.config.DatabaseDescriptor.loadYaml(DatabaseDescriptor.java:141)
at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:116)
at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:124)
at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:389)
at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:107)
Caused by: org.yaml.snakeyaml.error.YAMLException: Cannot create property=start_native_transport for JavaBean=org.apache.cassandra.config.Config#ef28a30; Unable to find property 'start_native_transport' on class: org.apache.cassandra.config.Config
at org.yaml.snakeyaml.constructor.Constructor$ConstructMapping.constructJavaBean2ndStep(Constructor.java:305)
at org.yaml.snakeyaml.constructor.Constructor$ConstructMapping.construct(Constructor.java:184)
at org.yaml.snakeyaml.constructor.Constructor$ConstructYamlObject.construct(Constructor.java:370)
... 10 more
Caused by: org.yaml.snakeyaml.error.YAMLException: Unable to find property 'start_native_transport' on class: org.apache.cassandra.config.Config
at org.yaml.snakeyaml.constructor.Constructor$ConstructMapping.getProperty(Constructor.java:342)
at org.yaml.snakeyaml.constructor.Constructor$ConstructMapping.constructJavaBean2ndStep(Constructor.java:240)
... 12 more
null; Can't construct a java object for tag:yaml.org,2002:org.apache.cassandra.config.Config; exception=Cannot create property=start_native_transport for JavaBean=org.apache.cassandra.config.Config#ef28a30; Unable to find property 'start_native_transport' on class: org.apache.cassandra.config.Config
Invalid yaml; unable to start server. See log for stacktrace.
Thanks for any Help!

start_native_transport: true
should be present in cassandra.yaml. If it is not there, add it to cassandra.yaml, restart the Cassandra server, and try again.

What version of Cassandra are you using? DevCenter supports Cassandra versions >= 1.2
If you still see errors with the change in cassandra.yaml you can post a link to a Gist. But the YAML format is pretty simple so I think you'll figure it out.
If you read my previous answer, you'll notice that it required rpc_address to be set to a value other than 0.0.0.0. Anyway, the latest version of DevCenter (1.1.1) will work even if all the nodes in your cluster have rpc_address set to 0.0.0.0 (as a side note, I don't think that's generally a good setting).
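The "Unable to find property 'start_native_transport'" error in the question is consistent with a Cassandra release older than 1.2, where that option does not exist. As a rough illustration of the ">= 1.2" rule, here is a minimal sketch of a version check. The class and method names are made up for this example, and the version string is assumed to come from something like the output of nodetool version.

```java
// Hypothetical helper (not part of DevCenter or Cassandra): decides whether a
// Cassandra version string is new enough to have the native (CQL binary) transport,
// using the ">= 1.2" threshold mentioned in this thread.
final class NativeTransportSupport {

    // Parses the leading digits of a version component, e.g. "2-rc1" -> 2.
    // Throws NumberFormatException if the component does not start with a digit.
    private static int leadingInt(String s) {
        int end = 0;
        while (end < s.length() && Character.isDigit(s.charAt(end))) {
            end++;
        }
        return Integer.parseInt(s.substring(0, end));
    }

    static boolean supportsNativeTransport(String version) {
        String[] parts = version.trim().split("\\.");
        int major = leadingInt(parts[0]);
        int minor = parts.length > 1 ? leadingInt(parts[1]) : 0;
        return major > 1 || (major == 1 && minor >= 2);
    }

    public static void main(String[] args) {
        for (String v : new String[] {"1.1.12", "1.2.1", "2.0.7"}) {
            System.out.println(v + " -> native transport available: "
                    + supportsNativeTransport(v));
        }
    }
}
```

If the check comes back false for your node, adding start_native_transport to cassandra.yaml will not help; the server itself needs to be upgraded first.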

DevCenter.ini does not have Java VM information.
Adding the VM lines below resolved the connection issue:
-vm
C:\Program Files (x86)\JDK64\1.8.0.74\jre\bin\java.exe
NOTE: the path above should point to the java.exe of the appropriate JRE version.

Related

org.postgresql.util.PSQLException: SSL error: Received fatal alert: handshake_failure while writing from Azure Databricks to Azure Postgres Citus

I am trying to write a PySpark DataFrame to Azure Postgres Citus (Hyperscale).
I am using the latest Postgres JDBC driver, and I tried writing on Databricks Runtime 7, 6, and 5.
df.write.format("jdbc").option("url","jdbc:postgresql://<HOST>:5432/citus?user=citus&password=<PWD>&sslmode=require" ).option("dbTable", table_name).mode(method).save()
This is what I get after running the above command
org.postgresql.util.PSQLException: SSL error: Received fatal alert: handshake_failure
I have already tried different parameters in the URL and under the options as well, but no luck so far.
However, I am able to connect to this instance from my local machine, and from the Databricks driver/notebook using psycopg2.
Both the Azure Postgres Citus and Databricks are in the same region and Azure Postgres Citus is public.
It worked after overriding the Java security properties for the driver and executor:
spark.driver.extraJavaOptions -Djava.security.properties=
spark.executor.extraJavaOptions -Djava.security.properties=
Explanation:
What actually happens is that the JVM's security configuration reads the file /databricks/spark/dbconf/java/extra.security by default, and that file disables some TLS algorithms. So editing this file and replacing the disabled TLS ciphers that Postgres Citus relies on with an empty string should also work.
Setting this variable for the executors (spark.executor.extraJavaOptions) does not change the JVM defaults. The driver behaves differently: there the setting does override the defaults, and so it starts to work.
Note: The file has to be edited before the variable is read, so an init script is the only way to accomplish that.
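To see what the JVM has actually disabled before and after changing java.security.properties, one option is to print the standard jdk.tls.disabledAlgorithms security property. This is only a sketch: the class name is made up, and I have not verified which entries Databricks puts in its extra.security file, so the output depends entirely on the cluster it runs on.

```java
import java.security.Security;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic: lists the TLS algorithms the running JVM has disabled.
final class DisabledTlsAlgorithms {

    // Splits the comma-separated property value into trimmed, non-empty entries.
    static List<String> parse(String raw) {
        List<String> entries = new ArrayList<>();
        if (raw == null) {
            return entries; // property not set at all
        }
        for (String piece : raw.split(",")) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                entries.add(trimmed);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // jdk.tls.disabledAlgorithms is the standard JDK security property
        // that restricts protocols and ciphers during the TLS handshake.
        String raw = Security.getProperty("jdk.tls.disabledAlgorithms");
        List<String> disabled = parse(raw);
        System.out.println("Disabled TLS algorithms (" + disabled.size() + "):");
        for (String entry : disabled) {
            System.out.println("  " + entry);
        }
    }
}
```

Running this on both the driver and an executor should show whether the override took effect in each JVM.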

Connecting to Cassandra with Spark

First, I bought the new O'Reilly Spark book and tried its Cassandra setup instructions. I've also found other Stack Overflow posts and various posts and guides around the web. None of them work as-is. Below is as far as I could get.
This is a test with only a handful of records of dummy test data. I am running the most recent Cassandra 2.0.7 VirtualBox VM provided by planetcassandra.org, linked from the main Cassandra project page.
I downloaded Spark 1.2.1 source and got the latest Cassandra Connector code from github and built both against Scala 2.11. I have JDK 1.8.0_40 and Scala 2.11.6 setup on Mac OS 10.10.2.
I run the spark shell with the cassandra connector loaded:
bin/spark-shell --driver-class-path ../spark-cassandra-connector/spark-cassandra-connector/target/scala-2.11/spark-cassandra-connector-assembly-1.2.0-SNAPSHOT.jar
Then I do what should be a simple row count type test on a test table of four records:
import com.datastax.spark.connector._
sc.stop
val conf = new org.apache.spark.SparkConf(true).set("spark.cassandra.connection.host", "192.168.56.101")
val sc = new org.apache.spark.SparkContext(conf)
val table = sc.cassandraTable("mykeyspace", "playlists")
table.count
I get the following error. What is confusing is that it is getting errors trying to find Cassandra at 127.0.0.1, but it also recognizes the host name that I configured which is 192.168.56.101.
15/03/16 15:56:54 INFO Cluster: New Cassandra host /192.168.56.101:9042 added
15/03/16 15:56:54 INFO CassandraConnector: Connected to Cassandra cluster: Cluster on a Stick
15/03/16 15:56:54 ERROR ServerSideTokenRangeSplitter: Failure while fetching splits from Cassandra
java.io.IOException: Failed to open thrift connection to Cassandra at 127.0.0.1:9160
<snip>
java.io.IOException: Failed to fetch splits of TokenRange(0,0,Set(CassandraNode(/127.0.0.1,/127.0.0.1)),None) from all endpoints: CassandraNode(/127.0.0.1,/127.0.0.1)
BTW, I can also use a configuration file at conf/spark-defaults.conf to do the above without having to close/recreate a Spark context or pass in the --driver-class-path argument. I ultimately hit the same error though, and the above steps seem easier to communicate in this post.
Any ideas?
Check the rpc_address config in your cassandra.yaml file on your cassandra node. It's likely that the spark connector is using that value from the system.local/system.peers tables and it may be set to 127.0.0.1 in your cassandra.yaml.
The spark connector uses thrift to get token range splits from cassandra. Eventually I'm betting this will be replaced as C* 2.1.4 has a new table called system.size_estimates (CASSANDRA-7688). It looks like it's getting the host metadata to find the nearest host and then making the query using thrift on port 9160.
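To illustrate why an rpc_address of 127.0.0.1 breaks a remote client: the address a node advertises only helps if it is reachable from another machine, which rules out loopback and wildcard addresses. A small sketch of that check using only java.net follows; the class and method names are hypothetical and this is not how the connector itself validates addresses.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical check: can an address advertised by a Cassandra node be used
// by a client running on a different machine?
final class AdvertisedAddressCheck {

    static boolean usableFromRemoteClient(String address) {
        try {
            InetAddress parsed = InetAddress.getByName(address);
            // 127.0.0.1 / ::1 point back at the client itself, and
            // 0.0.0.0 / :: are bind-only wildcards, so neither can be dialed remotely.
            return !parsed.isLoopbackAddress() && !parsed.isAnyLocalAddress();
        } catch (UnknownHostException e) {
            return false; // not resolvable from this machine
        }
    }

    public static void main(String[] args) {
        for (String a : new String[] {"127.0.0.1", "0.0.0.0", "192.168.56.101"}) {
            System.out.println(a + " usable remotely: " + usableFromRemoteClient(a));
        }
    }
}
```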

how to use presto to query hive data

I just installed presto and when I use the presto-cli to query hive data, I get the following error:
$ ./presto --server node6:8080 --catalog hive --schema default
presto:default> show tables;
Query 20131113_150006_00002_u8uyp failed: Table hive.information_schema.tables does not exist
The config.properties is:
coordinator=true
datasources=jmx,hive
http-server.http.port=8080
presto-metastore.db.type=h2
presto-metastore.db.filename=/root/h2
task.max-memory=1GB
discovery-server.enabled=true
discovery.uri=http://node6:8080
And the hive.properties is:
connector.name=hive-cdh4
hive.metastore.uri=thrift://node6:9083
The hadoop distribution I used is CDH 4.4. I believe it's properly installed and hive can process queries successfully on its own.
Can anyone help me work it out? Any ideas will be appreciated.
As recommended by the Getting Started guide, I created a coordinator (jmx only) and a separate worker (jmx,hive), each on its own machine.
What finally solved this for me was to specify the worker's hostname and http-server.http.port as the --server argument to presto. When specifying the coordinator, it didn't work.
This all makes sense, but I am still wondering what will happen when I have two Presto-Hive workers...
Add one more line to etc/catalog/hive.properties:
hive.config.resources=/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
Of course, check that those paths are correct before doing so.
Regarding "presto-metastore.db.filename= <- is this the value for the Hive Warehouse Directory?": no, this is Presto's own metastore, not Hive's.
I just figured out what was wrong in my case:
You also have to add the following line to $HIVE_HOME/conf/hive-env.sh to tell Hive to open the Thrift port (the same port given in the hive.metastore.uris property in hive-site.xml). Hive clients use this port to connect to the metastore through RPC.
export METASTORE_PORT=9084
This should sync your Hive with Presto.

populate_io_cache_on_flush is not a column defined in this metadata

While connecting to Cassandra 1.2.1 using DataStax Java driver version 1.0.2, I am getting the error:
Exception in thread "main" java.lang.IllegalArgumentException: populate_io_cache_on_flush is not a column defined in this metadata
at com.datastax.driver.core.ColumnDefinitions.getIdx(ColumnDefinitions.java:268)
at com.datastax.driver.core.Row.isNull(Row.java:84)
at com.datastax.driver.core.TableMetadata$Options.<init>(TableMetadata.java:440)
at com.datastax.driver.core.TableMetadata.build(TableMetadata.java:107)
at com.datastax.driver.core.Metadata.buildTableMetadata(Metadata.java:124)
at com.datastax.driver.core.Metadata.rebuildSchema(Metadata.java:88)
at com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:265)
at com.datastax.driver.core.ControlConnection.tryConnect(ControlConnection.java:220)
at the line below:
cluster = Cluster.builder().addContactPoint("localhost").build();
I tried deleting the folder \var\lib\cassandra and then restarting the Cassandra server, which means there is no previous data. The server starts without any error, but I still get the above error when I try to connect to it.
OK, I just discovered that it went away when I used the latest version of Cassandra (1.2.8), so it was probably a version incompatibility.

Error while connecting to Cassandra using Java Driver for Apache Cassandra 1.0 from com.example.cassandra

While connecting to Cassandra with the DataStax Java driver for Cassandra, the following error is thrown:
Exception in thread "main" com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: [/127.0.0.1])
Please suggest...
Thanks!
My java code is like this:
package com.example.cassandra;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Host;
import com.datastax.driver.core.Metadata;

public class SimpleClient {
    private Cluster cluster;

    // Build a Cluster for the given contact point and print the cluster name.
    public void connect(String node) {
        cluster = Cluster.builder().addContactPoint(node).build();
        Metadata metadata = cluster.getMetadata();
        System.out.println(metadata.getClusterName());
    }

    // Release the driver's resources.
    public void close() {
        cluster.shutdown();
    }

    public static void main(String[] args) {
        SimpleClient client = new SimpleClient();
        client.connect("127.0.0.1");
        client.close();
    }
}
In my case, I ran into this issue as I used the default RPC port of 9160 during connection. One can find a different port for CQL in cassandra.yaml -
start_native_transport: true
# port for the CQL native transport to listen for clients on
native_transport_port: 9042
Once I changed the code to use port 9042 the connection attempt succeeded -
public BinaryDriverTest(String cassandraHost, int cassandraPort, String keyspaceName) {
m_cassandraHost = cassandraHost;
m_cassandraPort = cassandraPort;
m_keyspaceName = keyspaceName;
LOG.info("Connecting to {}:{}...", cassandraHost, cassandraPort);
cluster = Cluster.builder().withPort(m_cassandraPort).addContactPoint(cassandraHost).build();
session = cluster.connect(m_keyspaceName);
LOG.info("Connected.");
}
public static void main(String[] args) {
BinaryDriverTest bdt = new BinaryDriverTest("127.0.0.1", 9042, "Tutorial");
}
I had this issue and it was sorted by setting the ReadTimeout in SocketOptions:
Cluster cluster = Cluster.builder().addContactPoint("localhost").build();
cluster.getConfiguration().getSocketOptions().setReadTimeoutMillis(HIGHER_TIMEOUT);
Go to your Apache Cassandra conf directory and enable the binary protocol
Cassandra binary protocol
The Java driver uses the binary protocol that was introduced in Cassandra 1.2. It only works with a version of Cassandra greater than or equal to 1.2. Furthermore, the binary protocol server is not started with the default configuration file in Cassandra 1.2. You must edit the cassandra.yaml file for each node:
start_native_transport: true
Then restart the node.
I was having the same problem. I had installed Cassandra on a separate Linux PC and tried to connect from a Windows PC, but I could not create the connection.
After I edited cassandra.yaml, set rpc_address to the Linux PC's IP address, and restarted, I was able to connect successfully:
# The address or interface to bind the Thrift RPC service and native transport
# server to.
#
# Set rpc_address OR rpc_interface, not both. Interfaces must correspond
# to a single address, IP aliasing is not supported.
#
# Leaving rpc_address blank has the same effect as on listen_address
# (i.e. it will be based on the configured hostname of the node).
#
# Note that unlike listen_address, you can specify 0.0.0.0, but you must also
# set broadcast_rpc_address to a value other than 0.0.0.0.
#rpc_address: localhost
rpc_address: 192.168.0.10
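The rule in the comments above (rpc_address may be 0.0.0.0 only if broadcast_rpc_address is set to something else) is easy to get wrong, so here is a minimal sketch that checks it. It deliberately uses a naive "key: value" line parser instead of a real YAML library, so it only understands flat top-level settings; the class and method names are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sanity check for the rpc_address / broadcast_rpc_address rule
// described in the cassandra.yaml comments quoted above.
final class RpcAddressRule {

    // Naive parser: reads flat "key: value" lines, skipping blanks and # comments.
    // It is NOT a general YAML parser (no nesting, lists, or quoting).
    static Map<String, String> parseFlat(String yaml) {
        Map<String, String> settings = new HashMap<>();
        for (String line : yaml.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            int colon = trimmed.indexOf(':');
            if (colon <= 0) {
                continue;
            }
            settings.put(trimmed.substring(0, colon).trim(),
                         trimmed.substring(colon + 1).trim());
        }
        return settings;
    }

    // Returns false only when rpc_address is 0.0.0.0 and broadcast_rpc_address
    // is missing, blank, or also 0.0.0.0.
    static boolean rpcSettingsConsistent(Map<String, String> settings) {
        String rpc = settings.getOrDefault("rpc_address", "");
        if (!rpc.equals("0.0.0.0")) {
            return true;
        }
        String broadcast = settings.getOrDefault("broadcast_rpc_address", "");
        return !broadcast.isEmpty() && !broadcast.equals("0.0.0.0");
    }

    public static void main(String[] args) {
        String sample = "#rpc_address: localhost\nrpc_address: 0.0.0.0\n";
        System.out.println("consistent: " + rpcSettingsConsistent(parseFlat(sample)));
    }
}
```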
Just posting this for people who might have the same problem as I did, when I got that error message. Turned out my complex dependency tree brought about an old version of com.google.collections, which broke the CQL driver. Removing this dependency and relying entirely on guava solved my problem.
I was having the same issue testing a new cluster with one node.
After removing this from the Cluster builder I was able to connect:
.withLoadBalancingPolicy(new DCAwareRoundRobinPolicy("US_EAST"))
It was able to connect.
In my case this was a port issue, which I forgot to update
Old RPC port is 9160
New binary port is 9042
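Before changing driver code, it can help to confirm which of the two ports is actually reachable from the client machine. Below is a minimal sketch using plain java.net sockets, with no Cassandra driver involved. The class name, the default host, and the 2-second timeout are arbitrary choices for this example, and an open port only proves that something is listening there, not that it speaks CQL.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical connectivity probe: checks whether a TCP port accepts connections.
final class PortProbe {

    static boolean isOpen(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        System.out.println("9042 (native/CQL binary): " + isOpen(host, 9042, 2000));
        System.out.println("9160 (old Thrift RPC):    " + isOpen(host, 9160, 2000));
    }
}
```

If 9160 is open but 9042 is not, the native transport is probably not enabled (or is firewalled), which matches the NoHostAvailableException seen with the Java driver.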
I too encountered this problem, and it was caused by a simple error in the statement that was being submitted.
session.prepare(null);
Obviously, the error message is misleading.
Edit
/etc/cassandra/cassandra.yaml
and set rpc_address to 0.0.0.0, and set broadcast_rpc_address and listen_address to the IP address of the cluster node.
Assuming you have default configurations in place, check the driver version compatibility. Not all driver versions are compatible with all versions of Cassandra, though they claim backward compatibility. Please see the below link.
http://docs.datastax.com/en/developer/java-driver/3.1/manual/native_protocol/
I ran into a similar issue & changing the driver version solved my problem.
Note: Hopefully, you are using Maven (or something similar) to resolve dependencies. Otherwise, you may have to download a lot of dependencies for higher versions of the driver.
Check the points below:
i) check the server IP
ii) check the listening port
iii) the DataStax client dependency must match the server version.
As for the yaml file, the latest versions have the properties below enabled:
start_native_transport: true
native_transport_port: 9042
