Error while passing multiple parameters as an array to VoltDB's AdHoc stored procedure

I have a scenario where I receive a SQL query and its arguments (passed separately to avoid SQL injection) as input, and I run that SQL through VoltDB's AdHoc stored procedure with the code below.
private static final String voltdbServer = "localhost";
private static final int voltdbPort = 21212;

public ClientResponse runAdHoc(String sql, Object... sqlArgs) throws IOException, ProcCallException
{
    ClientConfig clientConfig = new ClientConfig();
    Client voltdbClient = ClientFactory.createClient(clientConfig);
    voltdbClient.createConnection(voltdbServer, voltdbPort);
    return voltdbClient.callProcedure("@AdHoc", sql, sqlArgs);
}
But I get the error org.voltdb.client.ProcCallException: SQL error while compiling query: Incorrect number of parameters passed: expected 2, passed 1
for runAdHoc("select * from table where column1 = ? and column2 = ?", "column1", "column2"), i.e. when there are two or more parameters,
and the error org.voltdb.client.ProcCallException: Unable to execute adhoc sql statement(s): Array / Scalar parameter mismatch ([Ljava.lang.String; to java.lang.String)
for runAdHoc("select * from table where column1 = ?", "column1"), i.e. when there is only one parameter.
But I do not face this problem when I call voltdbClient.callProcedure("@AdHoc", "select * from table where column1 = ? and column2 = ?", "column1", "column2") directly.
I think VoltDB is not treating the elements of sqlArgs as separate parameters; instead, it is treating the whole array as a single parameter.
One way to solve this would be to parse the SQL string myself and substitute the arguments, but I am posting this to find out the efficient way to solve it.
Note: the SQL used here is just a test query.

The @AdHoc system procedure is treating the array as one parameter. This kind of thing happens with @AdHoc because there is no procedure planning step in which you can explicitly state what each parameter is.
You have the right idea about parsing the sqlArgs array into the actual parameters to pass in separately. You could also concatenate these separate parameters into the SQL statement itself. That way, your adhoc statement will simply be:
voltdbClient.callProcedure("@AdHoc", sql)
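For illustration only, here is a rough sketch (my own, not part of the answer) of what inlining the arguments could look like. Note that this reopens the SQL injection concern the question is trying to avoid, so real code would need stricter escaping or validation; inlineArgs is a hypothetical helper, not a VoltDB API.
// Naive sketch: substitute each '?' in the SQL text with the corresponding argument,
// quoting non-numeric values with simple quote-doubling only.
private static String inlineArgs(String sql, Object... sqlArgs) {
    StringBuilder out = new StringBuilder();
    int argIndex = 0;
    for (char c : sql.toCharArray()) {
        if (c == '?' && argIndex < sqlArgs.length) {
            Object arg = sqlArgs[argIndex++];
            if (arg instanceof Number) {
                out.append(arg);
            } else {
                out.append('\'').append(String.valueOf(arg).replace("'", "''")).append('\'');
            }
        } else {
            out.append(c);
        }
    }
    return out.toString();
}
// ...then: voltdbClient.callProcedure("@AdHoc", inlineArgs(sql, sqlArgs));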
Full disclosure: I work at VoltDB.

I posted the same question on the VoltDB public Slack channel and got a response that solved the problem:
The short explanation is that your parameters to @AdHoc are being turned into [sql, sqlArgs] when they need to be [sql, sqlArg1, sqlArg2, …]. You'll need to create a new array that is sqlArgs.length + 1, put sql at position 0, and copy sqlArgs into the new array starting at position 1. Then pass that newly constructed array in the call to client.callProcedure("@AdHoc", newArray).
So I modified my runAdHoc method as below, and it solved the problem:
public ClientResponse runAdHoc(String sql, Object... sqlArgs) throws IOException, ProcCallException
{
    ClientConfig clientConfig = new ClientConfig();
    Client voltdbClient = ClientFactory.createClient(clientConfig);
    voltdbClient.createConnection(voltdbServer, voltdbPort);
    Object[] procArgs;
    if (sqlArgs == null || sqlArgs.length == 0)
    {
        procArgs = new Object[1];
    }
    else
    {
        procArgs = new Object[sqlArgs.length + 1];
        System.arraycopy(sqlArgs, 0, procArgs, 1, sqlArgs.length);
    }
    // the SQL text always goes in position 0, followed by its arguments
    procArgs[0] = sql;
    return voltdbClient.callProcedure("@AdHoc", procArgs);
}
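With that change, the calls from the question should work as intended, for example (reusing the test SQL from above):
ClientResponse twoParams = runAdHoc("select * from table where column1 = ? and column2 = ?", "column1", "column2");
ClientResponse oneParam = runAdHoc("select * from table where column1 = ?", "column1");
In real code you would likely create the Client and its connection once and reuse them, rather than opening a new connection on every call as this method does.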

Related

Is it possible to insert a record that populates a column with type set<text> using a single prepared statement?

Is it possible to insert a record using prepared statements when that record contains a Set and is intended to be applied to a field with a type of 'set'?
I can see how to do it with QueryBuilder.update -
Update.Where setInsertionQuery = QueryBuilder.update(keyspaceName, tableName)
    .with(QueryBuilder.addAll("set_column", QueryBuilder.bindMarker()))
    .where(QueryBuilder.eq("id_column", QueryBuilder.bindMarker()));
PreparedStatement preparedStatement = keyspace.prepare(setInsertionQuery.toString());
Set<String> set = new HashSet<>(Collections.singleton("value"));
BoundStatement boundStatement = preparedStatement.bind(set, "id-value");
However, that QueryBuilder.addAll() method returns an Assignment, and that appears to be usable only with QueryBuilder.update(), not with QueryBuilder.insertInto(). Is there any way to insert that record in one step, or do I have to first call QueryBuilder.insertInto() while leaving the set column blank, and then populate it with a subsequent call to QueryBuilder.update() that uses addAll()?
I figured out how to do this. The key thing is to call the no-arg PreparedStatement.bind(), and then chain that with calls to setXXX methods, using setSet() for the set values. So something like this -
Insert insertSet = QueryBuilder.insertInto(tableName)
    .value("set_column", QueryBuilder.bindMarker("set_column"))
    .value("id_column", QueryBuilder.bindMarker("id_column"));
PreparedStatement preparedStatement = keyspace.prepare(insertSet.toString());
Set<String> set = new HashSet<>(Collections.singleton("value"));
BoundStatement boundStatement = preparedStatement.bind()
    .setSet("set_column", set, String.class)
    .setString("id_column", "id value");
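For completeness, executing the bound statement is then just a normal execute call (assuming keyspace here is the same Session object that was used for prepare()):
// Single-step insert: both the set column and the id column were bound above
keyspace.execute(boundStatement);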

CosmosDB Cassandra API - select in query throws exception

SELECT partition_int, clustering_int, value_string
FROM test_ks1.test WHERE partition_int = ? AND clustering_int IN ?
A prepared SELECT query with an IN clause throws the following exception:
java.lang.ClassCastException: class com.datastax.oss.driver.internal.core.type.PrimitiveType cannot be cast to class com.datastax.oss.driver.api.core.type.ListType
(com.datastax.oss.driver.internal.core.type.PrimitiveType and com.datastax.oss.driver.api.core.type.ListType are in unnamed module of loader 'app')
at com.datastax.oss.driver.internal.core.type.codec.registry.CachingCodecRegistry.inspectType(CachingCodecRegistry.java:343)
at com.datastax.oss.driver.internal.core.type.codec.registry.CachingCodecRegistry.codecFor(CachingCodecRegistry.java:256)
at com.datastax.oss.driver.internal.core.data.ValuesHelper.encodePreparedValues(ValuesHelper.java:112)
at com.datastax.oss.driver.internal.core.cql.DefaultPreparedStatement.bind(DefaultPreparedStatement.java:159)
Using the DataStax OSS driver version 4.5.1 with CosmosDB.
The query works against Cassandra running in Docker, and it works in cqlsh against CosmosDB.
Queries used:
CREATE TABLE IF NOT EXISTS test_ks1.test (partition_int int, value_string text, clustering_int int, PRIMARY KEY ((partition_int),clustering_int))
Prepare the statement: INSERT INTO test_ks1.test (partition_int,clustering_int,value_string) values (?,?,?)
Insert values: 1,1,"a" | 1,2,"b"
Prepare the statement: SELECT partition_int, clustering_int, value_string FROM test_ks1.test WHERE partition_int = ? AND clustering_int IN ?
Execute with parameters 1,List.of(1,2)
The expected parameter is an integer and not a list of integers.
Sample code of the select prepared statement:
final CqlSessionBuilder sessionBuilder = CqlSession.builder()
    .withConfigLoader(loadConfig(sessionConfig));
CqlSession session = sessionBuilder.build();
PreparedStatement statement = session.prepare(
    "SELECT partition_int, clustering_int, "
    + "value_string FROM test_ks1.test WHERE partition_int = ? "
    + "AND clustering_int IN ?");
com.datastax.oss.driver.api.core.cql.ResultSet rs =
    session.execute(statement.bind(1, List.of(1, 2)));
Is there a workaround to use prepared SELECT queries with an IN clause?
Thanks.
My suggestion would be to use a workaround like this:
// IDs you want to use
List<Integer> list = Arrays.asList(1, 2);
// Repeat the "?," placeholder once per id, except the last one
String markers = StringUtils.repeat("?,", list.size() - 1);
// Final query (the last "?" is appended explicitly)
final String query = "SELECT * FROM test_ks1.test where clustering_int in (" + markers + " ?)";
PreparedStatement prepared = session.prepare(query);
BoundStatement bound = prepared.bind(list.toArray()).setIdempotent(true);
List<Row> rows = session.execute(bound).all();
I have tried this on my end and it works for me.
In your case you also have another parameter before the IN, which you need to include in the parameter list as well, but only after building the markers placeholder; see the sketch below.
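A rough adaptation for the original query (my own sketch, not tested against CosmosDB), binding the partition key first and then the expanded IN values:
// IDs for the IN clause
List<Integer> clusteringIds = Arrays.asList(1, 2);
// Build the "?," markers for all but the last id; the final "?" is in the query string
String markers = StringUtils.repeat("?,", clusteringIds.size() - 1);
String query = "SELECT partition_int, clustering_int, value_string FROM test_ks1.test "
        + "WHERE partition_int = ? AND clustering_int IN (" + markers + " ?)";
PreparedStatement prepared = session.prepare(query);

// Bind the partition key first, then the expanded IN values
List<Object> params = new ArrayList<>();
params.add(1);                  // partition_int
params.addAll(clusteringIds);   // clustering_int values
BoundStatement bound = prepared.bind(params.toArray()).setIdempotent(true);
List<Row> rows = session.execute(bound).all();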

How to create Spark broadcast variable from Java String array?

I have a Java String array which contains 45 strings, which are basically column names:
String[] fieldNames = {"colname1","colname2",...};
Currently I am storing the above array of Strings in a static field on the Spark driver. My job is running slowly, so I am trying to refactor the code. I am using the String array while creating a DataFrame:
DataFrame dfWithColNames = sourceFrame.toDF(fieldNames);
I want to do the above using a broadcast variable so that Spark doesn't ship the huge string array to every executor. I believe we can do something like the following to create the broadcast:
String[] brArray = sc.broadcast(fieldNames, String[].class); // gives a compilation error
DataFrame df = sourceFrame.toDF(???); // how do I use the above broadcast? can I pass brArray as-is?
I am new to Spark.
This is a bit of an old question; however, I hope my solution helps somebody.
In order to broadcast any object (a single POJO or a collection) with Spark 2+, you first need the following method, which creates a ClassTag for you:
private static <T> ClassTag<T> classTag(Class<T> clazz) {
    return scala.reflect.ClassManifestFactory.fromClass(clazz);
}
Next you use the SparkContext obtained from your SparkSession to broadcast your object as before:
sparkSession.sparkContext().broadcast(
    yourObject,
    classTag(YourObject.class)
);
In the case of a collection, say java.util.List, you use the following:
sparkSession.sparkContext().broadcast(
    yourObject,
    classTag(List.class)
);
The return variable of sc.broadcast is of type Broadcast<String[]> and not String[]. When you want to access the value, you simply call value() on the variable. From your example it would be like:
Broadcast<String[]> broadcastedFieldNames = sc.broadcast(fieldNames);
DataFrame df = sourceFrame.toDF(broadcastedFieldNames.value());
Note, that if you are writing this in Java, you probably want to wrap the SparkContext within the JavaSparkContext. It makes everything easier and you can then avoid having to pass a ClassTag to the broadcast function.
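For example, a minimal sketch of that route (my own illustration, assuming sc is the underlying Scala SparkContext and reusing fieldNames and sourceFrame from the question):
// Wrap the existing SparkContext so broadcast() can infer the type without a ClassTag
JavaSparkContext jsc = JavaSparkContext.fromSparkContext(sc);
Broadcast<String[]> broadcastedFieldNames = jsc.broadcast(fieldNames);
DataFrame df = sourceFrame.toDF(broadcastedFieldNames.value());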
You can read more about broadcast variables at http://spark.apache.org/docs/latest/programming-guide.html#broadcast-variables
ArrayList<String> dataToBroadcast = new ArrayList<>();
dataToBroadcast.add("string1");
...
dataToBroadcast.add("stringn");
// Creating the broadcast variable.
// No need to write the classTag code by hand; use akka.japi.Util, which is available
Broadcast<ArrayList<String>> strngBrdCast = spark.sparkContext().broadcast(
    dataToBroadcast,
    akka.japi.Util.classTag(ArrayList.class));
// Here is the catch: when you are iterating over a Dataset,
// Spark will actually run it in distributed mode. So if you try to access
// your object directly (e.g. dataToBroadcast) it will be null,
// because you didn't ask Spark to explicitly send that outside variable to each
// machine where this runs in parallel.
// So you need to use a broadcast variable (the most common use of broadcast).
someSparkDataSetWhere.foreach((row) -> {
    ArrayList<String> stringlist = strngBrdCast.value();
    ...
    ...
});

Setting a NULL value in a BoundStatement

I'm using Cassandra Driver 2.0.0-beta2 with Cassandra 2.0.1.
I want to set a NULL value to a column of type 'int', in a BoundStatement. I don't think I can with setInt.
This is the code I'm using:
String insertStatementString = "insert into subscribers(subscriber,start_date,subscriber_id) values (?,?,?)";
PreparedStatement insertStatement = session.prepare(insertStatementString);
BoundStatement bs = new BoundStatement(insertStatement);
bs.setString("subscriber",s.getSubscriberName());
bs.setDate("start_date",startDate);
bs.setInt("subscriber_id",s.getSubscriberID());
The last line throws a NullPointerException, which can be explained: s.getSubscriberID() returns an Integer, and BoundStatement accepts only ints, so when the ID is null it cannot be unboxed, hence the exception.
The definition in my opinion should change to:
BoundStatement.setInt(String name, Integer v);
The way it is right now, I can't set NULL values for numbers.
Or am I missing something?
Is there other way to achieve this?
In cqlsh, setting null to a column of type 'int' is possible.
There is no need to bind values where the value will be empty or null. Therefore a null check might be useful, e.g.,
if (null != s.getSubscriberID()) {
    bs.setInt("subscriber_id", s.getSubscriberID());
}
As to the question of multiple instantiations of BoundStatement, creating multiple BoundStatements is cheap in comparison with PreparedStatements (see the CQL documentation on prepared statements). The benefit therefore becomes clearer once you reuse the PreparedStatement, e.g., in a loop:
String insertStatementString = "insert into subscribers(subscriber,start_date,subscriber_id) values (?,?,?)";
PreparedStatement insertStatement = session.prepare(insertStatementString);
// Inside a loop, for example
for (Subscriber s : subscribersCollection) {
    BoundStatement bs = new BoundStatement(insertStatement);
    bs.setString("subscriber", s.getSubscriberName());
    bs.setDate("start_date", startDate);
    if (null != s.getSubscriberID()) {
        bs.setInt("subscriber_id", s.getSubscriberID());
    }
    session.execute(bs);
}
I decided not to set the value at all. By default, it is null. It's a weird workaround.
But now I have to instantiate the BoundStatement before every call, because otherwise I risk carrying over a non-null value from a previous call.
It would be great if they added a more comprehensive 'null' support.
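For what it's worth, later driver versions (2.1 and onwards, if I remember correctly) added BoundStatement.setToNull, which makes the intent explicit instead of relying on unset values; a minimal sketch:
if (s.getSubscriberID() != null) {
    bs.setInt("subscriber_id", s.getSubscriberID());
} else {
    // Explicitly bind NULL rather than leaving the variable unset
    bs.setToNull("subscriber_id");
}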

Need a counter query which gives all counters for a primary key using the Hector API

I am using the Hector API for Cassandra.
I create a counter column family as follows:
private void addColumnFamilyCounter(ThriftCluster cluster, String cfName, int rowCacheKeysToSave) {
    String cassandraKeyspace = this.env.getProperty("cassandra.keyspace");
    ThriftCfDef cfd =
        new ThriftCfDef(cassandraKeyspace, cfName, ComparatorType.UTF8TYPE);
    cfd.setRowCacheKeysToSave(rowCacheKeysToSave);
    cfd.setDefaultValidationClass(ComparatorType.COUNTERTYPE.getClassName());
    cluster.addColumnFamily(cfd);
}
and call the above method as follows:
addColumnFamilyCounter(cluster, COUNTER_CF, 0);
The format of the table is like follows
Primary key columns
Munich jingle : 1
mingle : 2
tingle : 1
pingle : 5
Now I want to execute a query to get all the columns and their values under Munich. Is there any way I can get all the columns?
What I know so far is the following query, but it gives me the value for only one combination of primary key and column key:
@Override
public long getTagCounter(String domain, String tag) {
    CounterQuery<String, String> counter =
        new ThriftCounterColumnQuery<String, String>(keyspaceOperator,
            StringSerializer.get(),
            StringSerializer.get());
    counter.setColumnFamily(TAG_COUNTER_CF).setKey("p_key").setName("name");
    return counter.execute().get().getValue();
}
Okay, I found the answer myself. I hope it will be helpful to others.
CounterSlice<String> query = HFactory.createCounterSliceQuery(keyspaceOperator, StringSerializer.get(), StringSerializer.get())
    .setColumnFamily("CF")
    .setKey("PK")
    .setRange(null, null, false, Integer.MAX_VALUE)
    .execute().get();
for (HCounterColumn<String> col : query.getColumns()) {
    log.info(col.getName());
    log.info(col.getValue());
}
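Building on that, here is a small wrapper (my own sketch, untested) that returns every counter under a row key such as "Munich" as a map, reusing keyspaceOperator and TAG_COUNTER_CF from the code above:
public Map<String, Long> getAllCounters(String rowKey) {
    Map<String, Long> counters = new HashMap<String, Long>();
    CounterSlice<String> slice = HFactory.createCounterSliceQuery(keyspaceOperator,
            StringSerializer.get(), StringSerializer.get())
        .setColumnFamily(TAG_COUNTER_CF)
        .setKey(rowKey)
        .setRange(null, null, false, Integer.MAX_VALUE)
        .execute().get();
    for (HCounterColumn<String> col : slice.getColumns()) {
        counters.put(col.getName(), col.getValue());
    }
    return counters;
}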
