Explanation with an example:
import cql
# connect to the column family / keyspace
last_key = "XYZ"  # say it is fetched from elsewhere
cursor.execute("select * from domain_dimension where key=:key", key=last_key)
The CQL documentation says this can be done, but on the console I get: execute() got an unexpected keyword argument.
Does Cassandra CQL really support query substitution?
It looks like you need to pass the substitutions in a dict as a single arg, not as keyword args.
cursor.execute("select * from domain_dimension where key=:key", {'key': last_key})
That is how it is specified in the example on the project homepage: http://code.google.com/a/apache-extras.org/p/cassandra-dbapi2/
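A minimal sketch of the difference, using Python's built-in sqlite3 module purely as a stand-in (an assumption: it follows the same DB-API 2.0 conventions and also accepts :name placeholders). The in-memory table and its value column are made up for illustration; with the cql driver, the connection setup would be different.

```python
import sqlite3

# sqlite3 is only a stand-in DB-API 2.0 driver here (assumption);
# the table and its contents are invented for the demonstration.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE domain_dimension (key TEXT PRIMARY KEY, value TEXT)")
cursor.execute("INSERT INTO domain_dimension VALUES ('XYZ', 'example')")

last_key = "XYZ"

# Passing the substitution as a keyword argument is rejected by execute().
try:
    cursor.execute("SELECT * FROM domain_dimension WHERE key = :key", key=last_key)
    kwargs_accepted = True
except TypeError:
    kwargs_accepted = False

# Passing all substitutions as one dict argument works.
cursor.execute("SELECT * FROM domain_dimension WHERE key = :key", {"key": last_key})
rows = cursor.fetchall()
conn.close()
```

The dict form also keeps the value out of the query string, so it is bound as data and never parsed as part of the statement.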
I am trying to write data to IBM DB2 (10.5 fix pack 11) using PySpark (2.4).
When I execute the following piece of code
(df.write.format("jdbc")
    .mode('overwrite')
    .option("url", 'jdbc:db2://<host>:<port>/<DB>')
    .option("driver", 'com.ibm.db2.jcc.DB2Driver')
    .option('sslConnection', 'true')
    .option('sslCertLocation', '</location/***_ssl.crt>')
    .option("numPartitions", 1)
    .option("batchsize", 1000)
    .option('truncate', 'true')
    .option("dbtable", '<TABLE>')
    .option("user", '<user>')
    .option("password", '<PW>')
    .save())
the job throws the following exception:
File "/usr/local/Cellar/apache-spark/3.0.1/libexec/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o97.save.
: com.ibm.db2.jcc.am.SqlSyntaxErrorException: DB2 SQL Error: SQLCODE=-104, SQLSTATE=42601, SQLERRMC=END-OF-STATEMENT;ABLE<SEHEMA.TABLE>;IMMEDIATE, DRIVER=4.19.80
at com.ibm.db2.jcc.am.b5.a(b5.java:747)
The job is trying to perform a truncate, but it seems DB2 expects the IMMEDIATE keyword.
In the code above I only pass the table name as dbtable. Is there a way to pass the IMMEDIATE keyword as well?
Also, on the DB2 side, is there a way to set this when opening the session?
Just FYI, my code works without truncate, but then it drops the table, recreates it, and loads it. I don't want to do that in a prod environment.
Any thoughts on how to solve this issue are highly appreciated.
DB2Dialect in Spark 2.4 doesn't override the default JdbcDialect implementation of the TRUNCATE TABLE query. The comments in the code suggest overriding this method to return a statement that suits your database engine.
/**
 * The SQL query that should be used to truncate a table. Dialects can override this method to
 * return a query that is suitable for a particular database. For PostgreSQL, for instance,
 * a different query is used to prevent "TRUNCATE" affecting other tables.
 * @param table The table to truncate
 * @param cascade Whether or not to cascade the truncation
 * @return The SQL query to use for truncating a table
 */
@Since("2.4.0")
def getTruncateQuery(
    table: String,
    cascade: Option[Boolean] = isCascadingTruncateTable): String = {
  s"TRUNCATE TABLE $table"
}
In the DB2 case you could perhaps extend DB2Dialect itself, add your own getTruncateQuery() implementation, and define a "custom" JDBC protocol, "jdbc:mydb2" for example. You can then use this protocol in the JDBC connection URL: .option("url",'jdbc:mydb2://<host>:<port>/<DB>').
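To make the target concrete: the SQLERRMC in the exception names IMMEDIATE as the expected token, so the statement an overriding getTruncateQuery() would have to return is the default one with IMMEDIATE appended. A minimal sketch of just that string, written in Python for brevity; the helper name db2_truncate_query is hypothetical and is not part of Spark or the DB2 driver.

```python
def db2_truncate_query(table: str) -> str:
    """Build the TRUNCATE statement in the form DB2 asks for.

    Hypothetical helper for illustration only: it mirrors what a custom
    dialect's getTruncateQuery() would return, with IMMEDIATE appended
    to the default "TRUNCATE TABLE <table>".
    """
    return f"TRUNCATE TABLE {table} IMMEDIATE"


# <SCHEMA>.<TABLE> is a placeholder, as in the question.
statement = db2_truncate_query("<SCHEMA>.<TABLE>")
```

If you are able to run such a statement over your own DB2 connection before the write, the Spark job could then use mode('append') instead of relying on the truncate option. That workaround is an assumption on top of the answer, not something the dialect code prescribes.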
Using Cassandra 2.2.8 with 3.0 Connector.
I am trying to create a Statement with QueryBuilder. When I execute the Statement, it complains that no keyspace is defined. The only way I know to set the keyspace is shown below (there is no setKeyspace method in Statement), but when I call getKeyspace() I actually get null:
Statement s = QueryBuilder.select().all()
.from("test.tests");
System.out.println("getKeyspace:" + s.getKeyspace()); // >> null
Am I doing something wrong? Is there any other (more reliable) way to set the keyspace?
Thanks
from(String) expects a table name. While what you are doing is technically valid and Cassandra will interpret it correctly, the driver is not able to derive the keyspace name this way.
Instead you could use from(String, String) which takes the first parameter as the keyspace.
Statement s = QueryBuilder.select().all()
.from("test", "tests");
System.out.println("getKeyspace:" + s.getKeyspace()); // >> test
The DataStax documentation says that to page through all data, the following CQL query is useful:
SELECT * FROM test WHERE token(k) > token(42);
Is it possible to build this query using the QueryBuilder? It provides a token method, but that seems to work only on column names, not on values.
Ideally, the value (in the example: 42) is of type Object, just like in the eq/gte/lte functions.
Try using automatic paging with the setFetchSize method. The driver then pages through the results for you, so you do not have to write the token() queries yourself:
Automatic paging was introduced in Cassandra 2.0. It allows the developer to iterate over an entire ResultSet without having to care about its size: extra rows are fetched as the client code iterates over the results, while the old ones are dropped. The number of rows to retrieve per fetch can be set at query time. In the Java driver this looks like:
Statement stmt = new SimpleStatement("SELECT * FROM images");
stmt.setFetchSize(100);
ResultSet rs = session.execute(stmt);
Source: http://www.datastax.com/dev/blog/client-side-improvements-in-cassandra-2-0
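QueryBuilder.fcall("token", value) can solve the problem: it wraps the value in a token() function call, so it can be used on the value side of the comparison.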
USE users_tracking;
SELECT user_name FROM visits
where port_name IN
(SELECT port_name FROM ports where location = 'NY' )//as temp;
It gives an error
mismatched input 'SELECT' expecting RULE_T_R_PAREN
Is there any way I can store the inner query in a variable and then use that?
I tried using set @varname := query, but it does not recognize the set command.
Nested queries are not allowed in Cassandra CQL. For this kind of complex querying feature you'll need to use Hive or SparkSQL.
Here is a full CQL reference,
http://cassandra.apache.org/doc/cql3/CQL.html
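For small result sets, another option is to run the two queries separately from client code and feed the result of the first into the second. The sketch below is an assumption layered on top of the answer, not a Cassandra feature: execute stands for any callable that takes a CQL string plus parameters and returns rows as tuples, the %s placeholder style is assumed, and both queries only work if the table keys or indexes allow filtering on location and port_name.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

# Hypothetical signature: execute(cql, params) -> iterable of row tuples.
Execute = Callable[[str, Sequence], Iterable[Tuple]]


def users_for_location(execute: Execute, location: str) -> List[str]:
    """Emulate the nested query with two round trips."""
    # Step 1: the former inner query.
    port_rows = execute("SELECT port_name FROM ports WHERE location = %s", (location,))
    ports = [row[0] for row in port_rows]
    if not ports:
        # An empty IN list would be pointless, so stop early.
        return []

    # Step 2: the former outer query, with one placeholder per port name.
    placeholders = ", ".join(["%s"] * len(ports))
    cql = f"SELECT user_name FROM visits WHERE port_name IN ({placeholders})"
    return [row[0] for row in execute(cql, tuple(ports))]
```

The values are always passed as bound parameters; only the number of placeholders is built into the query text.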
I am trying to run the following query
SELECT edge_id, b_id FROM booking_by_edge WHERE edge_id IN ?
I bind a Java list of Longs as the parameter and get an exception:
SyntaxError: line 0:-1 mismatched input '<EOF>' expecting ')' (ResultSetFuture.java:242)
If I try to use (?), it expects a single Long item to be bound, but I need to bind a collection.
Is there an error in my syntax?
Tested in Cassandra 2.1.3, the following code snippet works:
PreparedStatement prepared = session.prepare("SELECT edge_id, b_id FROM booking_by_edge WHERE edge_id IN ?;");
List<Long> edgeIds = Arrays.asList(1L, 2L, 3L);
session.execute(prepared.bind(edgeIds));
I got a response on the DataStax Bugzilla: this is currently not supported, but it is planned.
https://issues.apache.org/jira/browse/CASSANDRA-4210
Update: Supported in Cassandra 2.0.1
It's a bit hard to find in the documentation but it is described in the tuples section of the manual.
If you want to use named parameters you should use the setList() method.
BoundStatement bs = session.prepare("select col from table where col in :values").bind();
bs.setList("values", Arrays.asList(v1, v2, v3));