Cannot insert into Cassandra table, getting SyntaxError

I have an assignment where I have to build a Cassandra database. I have connected Cassandra with IntelliJ, I'm writing in Java, and the output is shown in the command line.
My keyspace farm_db contains a couple of tables into which I would like to insert data. I would like to insert data with two columns and a list, all in one row, into the table 'farmers'. This is a part of my database so far:
cqlsh:farm_db> use farm_db;
cqlsh:farm_db> Describe tables;
farmers foods_dairy_eggs foods_meat
foods_bread_cookies foods_fruit_vegetables
cqlsh:farm_db> select * from farmers;
farmer_id | delivery | the_farmer
-----------+----------+------------
This is what I'm trying to do:
[Picture of what I'm trying to do][1]
I need to insert the collection types 'list' and 'map' into 'farmers', but after a couple of failed attempts with that I tried using HashMap and ArrayList instead. I think this could work, but I seem to have an error in my syntax and I have no idea what the problem is:
Exception in thread "main" com.datastax.driver.core.exceptions.SyntaxError: line 1:31 mismatched input 'int' expecting ')' (INSERT INTO farmers (farmer_id [int]...)
Am I missing something or am I doing something wrong?
This is my code:
public class FarmersClass {
    public static String serverIP = "127.0.0.1";
    public static String keyspace = "";

    //Create db
    public void crateDatabase(String databaseName) {
        Cluster cluster = Cluster.builder()
                .addContactPoints(serverIP)
                .build();
        keyspace = databaseName;
        Session session = cluster.connect();
        String create_db_query = "CREATE KEYSPACE farm_db WITH replication "
                + "= {'class':'SimpleStrategy', 'replication_factor':1};";
        session.execute(create_db_query);
    }

    //Create table
    public void createFarmersTable() {
        Cluster cluster = Cluster.builder()
                .addContactPoints(serverIP)
                .build();
        Session session = cluster.connect("farm_db");
        String create_farm_table_query = "CREATE TABLE farmers(farmer_id int PRIMARY KEY, the_farmer Map <text, text>, delivery list<text>); ";
        session.execute(create_farm_table_query);
    }

    //Insert data in table 'farmer'.
    public void insertFarmers(int id, HashMap<String, String> the_farmer, ArrayList<String> delivery) {
        Cluster cluster = Cluster.builder()
                .addContactPoints(serverIP)
                .build();
        Session session = cluster.connect("farm_db");
        String insert_query = "INSERT INTO farmers (farmer_id int PRIMARY KEY, the_farmer, delivery) values (" + id + "," + the_farmer + "," + delivery + ");";
        System.out.println(insert_query);
        session.execute(insert_query);
    }
}

public static void main(String[] args) {
    FarmersClass farmersClass = new FarmersClass();
    // FarmersClass.crateDatabase("farm_db");
    // FarmersClass.createFarmersTable();

    //Collection type map
    HashMap<String, String> the_farmer = new HashMap<>();
    the_farmer.put("Name", "Ana Petersen ");
    the_farmer.put("Farmhouse", "The great farmhouse");
    the_farmer.put("Foods", "Fruits & Vegetables");

    //Collection type list
    ArrayList<String> delivery = new ArrayList<String>();
    String delivery_1 = "Village 1";
    String delivery_2 = "Village 2";
    delivery.add(delivery_1);
    delivery.add(delivery_2);

    FarmersClass.insertFarmers(1, the_farmer, delivery);
}

The problem is the syntax of your CQL INSERT query:
String insert_query = \
"INSERT INTO farmers (farmer_id int PRIMARY KEY, the_farmer, delivery) \
values (" + id + "," + the_farmer + "," + delivery + ");";
You've incorrectly added int PRIMARY KEY in the list of columns.
The correct format is:
INSERT INTO table_name (pk, col2, col3) VALUES ( ... )
For details and examples, see CQL INSERT. Cheers!
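Note also that concatenating Java's HashMap/ArrayList toString output into the statement will not produce valid CQL collection literals; binding the values through a prepared statement sidesteps that. A minimal sketch (assuming the DataStax 3.x driver shown in the stack trace, inside insertFarmers with the session from the question):
// Sketch only: columns are listed by name, values are bound, and the driver maps
// java.util.Map / java.util.List to the CQL map and list columns.
PreparedStatement prepared = session.prepare(
        "INSERT INTO farmers (farmer_id, the_farmer, delivery) VALUES (?, ?, ?);");
session.execute(prepared.bind(id, the_farmer, delivery));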

Related

SQL Parser Visitor + Metabase + Presto

I'm facing what seems to be a fairly simple problem, but I'm not able to get my head around it to find a suitable solution.
Problem:
I need to append the schema to my SQL statement in a "weird" way (with the schema in double quotes).
FROM "SCHEMA".tableB tableB
LEFT JOIN "SCHEMA".tableC tableC
Context
Basically, we are hosting and exposing a Metabase tool that will connect to and perform queries on our Hive database using Presto SQL.
Metabase allows the customer to write SQL statements, and some customers just don't type the schema in their statements. Today we throw an error for those queries, but I could easily retrieve the schema value from the Authorization header, since in our multi-tenant product the schema is the tenant id the user is logged into; with that information in hand, I could append it to the customer's SQL statement and avoid the error.
Imagine that the customer typed the follow statement:
SELECT tableA.*
, (tableA.valorfaturado + tableA.valorcortado) valorpedido
FROM (SELECT from_unixtime(tableB.datacorte / 1000) datacorte
, COALESCE((tableB.quantidadecortada * tableC.preco), 0) valorcortado
, COALESCE((tableB.quantidade * tableC.preco), 0) valorfaturado
, tableB.quantidadecortada
FROM tableB tableB
LEFT JOIN tableC tableC
ON tableC.numeropedido = tableB.numeropedido
AND tableC.codigoproduto = tableB.codigoproduto
AND tableC.codigofilial = tableB.codigofilial
LEFT JOIN tableD tableD
ON tableD.numero = tableB.numeropedido
WHERE (CASE
WHEN COALESCE(tableB.codigofilial, '') = '' THEN
tableD.codigofilial
ELSE
tableB.codigofilial
END) = '10'
AND from_unixtime(tableB.datacorte / 1000) BETWEEN from_iso8601_timestamp('2020-07-01T03:00:00.000Z') AND from_iso8601_timestamp('2020-08-01T02:59:59.999Z')) tableA
ORDER BY datacorte
I should convert this into (adding the "SCHEMA"):
SELECT tableA.*
, (tableA.valorfaturado + tableA.valorcortado) valorpedido
FROM (SELECT from_unixtime(tableB.datacorte / 1000) datacorte
, COALESCE((tableB.quantidadecortada * tableC.preco), 0) valorcortado
, COALESCE((tableB.quantidade * tableC.preco), 0) valorfaturado
, tableB.quantidadecortada
FROM "SCHEMA".tableB tableB
LEFT JOIN "SCHEMA".tableC tableC
ON tableC.numeropedido = tableB.numeropedido
AND tableC.codigoproduto = tableB.codigoproduto
AND tableC.codigofilial = tableB.codigofilial
LEFT JOIN "SCHEMA".tableD tableD
ON tableD.numero = tableB.numeropedido
WHERE (CASE
WHEN COALESCE(tableB.codigofilial, '') = '' THEN
tableD.codigofilial
ELSE
tableB.codigofilial
END) = '10'
AND from_unixtime(tableB.datacorte / 1000) BETWEEN from_iso8601_timestamp('2020-07-01T03:00:00.000Z') AND from_iso8601_timestamp('2020-08-01T02:59:59.999Z')) tableA
ORDER BY datacorte
I'm still trying to find a solution that uses only presto-parser with a Visitor + Instrumentation approach.
Also, I know about JSQLParser and I tried it, but I always come back to looking for a "plain" solution, worried that JSQLParser will not be able to support all the Presto/Hive queries, which are a little different from standard SQL.
I created a small project on GitHub with a test case to validate:
https://github.com/genyherrera/prestosqlerror
But for those that don't want to clone a repository, here are the classes and dependencies:
import java.util.Optional;
import com.facebook.presto.sql.SqlFormatter;
import com.facebook.presto.sql.parser.ParsingOptions;
import com.facebook.presto.sql.parser.SqlParser;
public class SchemaAwareQueryAdapter {

    // Inspired from
    // https://github.com/prestodb/presto/tree/master/presto-parser/src/test/java/com/facebook/presto/sql/parser
    private static final SqlParser SQL_PARSER = new SqlParser();

    public String rewriteSql(String sqlStatement, String schemaId) {
        com.facebook.presto.sql.tree.Statement statement = SQL_PARSER.createStatement(sqlStatement, ParsingOptions.builder().build());
        SchemaAwareQueryVisitor visitor = new SchemaAwareQueryVisitor(schemaId);
        statement.accept(visitor, null);
        return SqlFormatter.formatSql(statement, Optional.empty());
    }
}
import java.lang.reflect.Field;
import java.util.List;
import com.facebook.presto.sql.tree.DefaultTraversalVisitor;
import com.facebook.presto.sql.tree.QualifiedName;
import com.facebook.presto.sql.tree.Table;

public class SchemaAwareQueryVisitor extends DefaultTraversalVisitor<Void, Void> {

    private String schemaId;

    public SchemaAwareQueryVisitor(String schemaId) {
        super();
        this.schemaId = schemaId;
    }

    /**
     * The customer can type:
     * [table name]
     * [schema].[table name]
     * [catalog].[schema].[table name]
     */
    @Override
    protected Void visitTable(Table node, Void context) {
        List<String> parts = node.getName().getParts();
        // [table name] -> is the only one we need to modify, so let's check by parts.size() == 1
        if (parts.size() == 1) {
            try {
                Field privateStringField = Table.class.getDeclaredField("name");
                privateStringField.setAccessible(true);
                QualifiedName qualifiedName = QualifiedName.of("\"" + schemaId + "\"", node.getName().getParts().get(0));
                privateStringField.set(node, qualifiedName);
            } catch (NoSuchFieldException | SecurityException | IllegalArgumentException | IllegalAccessException e) {
                throw new SecurityException("Unable to execute query");
            }
        }
        return null;
    }
}
import static org.testng.Assert.assertEquals;
import org.gherrera.prestosqlparser.SchemaAwareQueryAdapter;
import org.testng.annotations.Test;

public class SchemaAwareTest {

    private static final String schemaId = "SCHEMA";
    private SchemaAwareQueryAdapter adapter = new SchemaAwareQueryAdapter();

    @Test
    public void testAppendSchemaA() {
        String sql = "select * from tableA";
        String bound = adapter.rewriteSql(sql, schemaId);
        assertEqualsFormattingStripped(bound,
                "select * from \"SCHEMA\".tableA");
    }

    private void assertEqualsFormattingStripped(String sql1, String sql2) {
        assertEquals(sql1.replace("\n", " ").toLowerCase().replace("\r", " ").replaceAll(" +", " ").trim(),
                sql2.replace("\n", " ").toLowerCase().replace("\r", " ").replaceAll(" +", " ").trim());
    }
}
<dependencies>
    <dependency>
        <groupId>com.facebook.presto</groupId>
        <artifactId>presto-parser</artifactId>
        <version>0.229</version>
    </dependency>
    <dependency>
        <groupId>org.testng</groupId>
        <artifactId>testng</artifactId>
        <version>6.10</version>
        <scope>test</scope>
    </dependency>
</dependencies>
PS: I was able to add the schema without the double quotes, but then I ran into the "identifiers must not start with a digit; surround the identifier with double quotes" error. Basically this error comes from the SqlParser$PostProcessor.exitDigitIdentifier(...) method.
Thanks
I was able to find a solution for my case; either way, I will share my finding on the Presto Slack to see if that is the expected behavior.
So, if you want your schema appended in double quotes, you will need to create your own Visitor class and override the visitTable method. When you qualify your table name with the schema, (here's the trick) pass the schema as UPPERCASE, so it will not match the regex pattern in the SqlFormatter class's formatName method and the double quotes will be added.
public class SchemaAwareQueryVisitor extends DefaultTraversalVisitor<Void, Void> {

    private String schemaId;

    public SchemaAwareQueryVisitor(String schemaId) {
        super();
        this.schemaId = schemaId;
    }

    @Override
    protected Void visitTable(Table node, Void context) {
        try {
            Field privateStringField = Table.class.getDeclaredField("name");
            privateStringField.setAccessible(true);
            QualifiedName qualifiedName = QualifiedName.of(schemaId, node.getName().getParts().get(0));
            privateStringField.set(node, qualifiedName);
        } catch (NoSuchFieldException
                | SecurityException
                | IllegalArgumentException
                | IllegalAccessException e) {
            throw new SecurityException("Unable to execute query");
        }
        return null;
    }
}
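A hypothetical usage sketch, tying the adapter and visitor above together (names taken from the classes in the question; the schema id is passed in uppercase so the formatter quotes it):
// Usage sketch only: assumes SchemaAwareQueryAdapter delegates to the visitor above.
SchemaAwareQueryAdapter adapter = new SchemaAwareQueryAdapter();
String rewritten = adapter.rewriteSql("select * from tableA", "SCHEMA");
System.out.println(rewritten); // expected, modulo formatting: SELECT * FROM "SCHEMA".tableA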

Can't insert in a table having composite primary key using LINQ

I added the SqlConnection part as someone suggested on this website, but I'm still getting the exception
Cannot insert explicit value for identity column when identity insert is OFF
on dv.SubmitChanges();
The composite primary key is composed of stid and bookid.
Also, please tell me how I can write this query without concatenating the textbox's value, i.e. bname.Text.
using (SqlConnection conn = new SqlConnection(cs))
{
    conn.Open();
    using (DataClasses1DataContext dv = new DataClasses1DataContext())
    {
        var iq = (from b in dv.Books
                  where SqlMethods.Like(b.name, "%" + bname.Text + "%")
                  select b.Id).FirstOrDefault();

        IssueInfo info = new IssueInfo()
        {
            stid = Convert.ToInt32(rollno.Text),
            due_date = Convert.ToDateTime(dateTimePicker1.Text),
            bookid = Convert.ToInt32(iq)
        };

        dv.ExecuteCommand("SET IDENTITY_INSERT IssueInfo ON");
        dv.IssueInfos.InsertOnSubmit(info);
        dv.SubmitChanges();
        dv.ExecuteCommand("SET IDENTITY_INSERT IssueInfo OFF");
    }
}

Persisting data to DynamoDB using Apache Spark

I have an application where:
1. I read JSON files from S3 into a DataFrame using SqlContext.read.json
2. Then I do some transformations on the DataFrame
3. Finally I want to persist the records to DynamoDB, using one of the record values as the key and the rest of the JSON parameters as values/columns.
I am trying something like :
JobConf jobConf = new JobConf(sc.hadoopConfiguration());
jobConf.set("dynamodb.servicename", "dynamodb");
jobConf.set("dynamodb.input.tableName", "my-dynamo-table"); // Pointing to DynamoDB table
jobConf.set("dynamodb.endpoint", "dynamodb.us-east-1.amazonaws.com");
jobConf.set("dynamodb.regionid", "us-east-1");
jobConf.set("dynamodb.throughput.read", "1");
jobConf.set("dynamodb.throughput.read.percent", "1");
jobConf.set("dynamodb.version", "2011-12-05");
jobConf.set("mapred.output.format.class", "org.apache.hadoop.dynamodb.write.DynamoDBOutputFormat");
jobConf.set("mapred.input.format.class", "org.apache.hadoop.dynamodb.read.DynamoDBInputFormat");

DataFrame df = sqlContext.read().json("s3n://mybucket/abc.json");
RDD<String> jsonRDD = df.toJSON();
JavaRDD<String> jsonJavaRDD = jsonRDD.toJavaRDD();

PairFunction<String, Text, DynamoDBItemWritable> keyData = new PairFunction<String, Text, DynamoDBItemWritable>() {
    public Tuple2<Text, DynamoDBItemWritable> call(String row) {
        DynamoDBItemWritable writeable = new DynamoDBItemWritable();
        try {
            System.out.println("JSON : " + row);
            JSONObject jsonObject = new JSONObject(row);
            System.out.println("JSON Object: " + jsonObject);

            Map<String, AttributeValue> attributes = new HashMap<String, AttributeValue>();

            AttributeValue attributeValue = new AttributeValue();
            attributeValue.setS(row);
            attributes.put("values", attributeValue);

            AttributeValue attributeKeyValue = new AttributeValue();
            attributeValue.setS(jsonObject.getString("external_id"));
            attributes.put("primary_key", attributeKeyValue);

            AttributeValue attributeSecValue = new AttributeValue();
            attributeValue.setS(jsonObject.getString("123434335"));
            attributes.put("creation_date", attributeSecValue);

            writeable.setItem(attributes);
        } catch (Exception e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        return new Tuple2(new Text(row), writeable);
    }
};

JavaPairRDD<Text, DynamoDBItemWritable> pairs = jsonJavaRDD
        .mapToPair(keyData);

Map<Text, DynamoDBItemWritable> map = pairs.collectAsMap();
System.out.println("Results : " + map);
pairs.saveAsHadoopDataset(jobConf);
However I do not see any data getting written to DynamoDB. Nor do I get any error messages.
I'm not sure, but yours seems more complex than it may need to be.
I've used the following to write an RDD to DynamoDB successfully:
val ddbInsertFormattedRDD = inputRDD.map { case (skey, svalue) =>
  val ddbMap = new util.HashMap[String, AttributeValue]()

  val key = new AttributeValue()
  key.setS(skey.toString)
  ddbMap.put("DynamoDbKey", key)

  val value = new AttributeValue()
  value.setS(svalue.toString)
  ddbMap.put("DynamoDbValue", value)

  val item = new DynamoDBItemWritable()
  item.setItem(ddbMap)
  (new Text(""), item)
}
val ddbConf = new JobConf(sc.hadoopConfiguration)
ddbConf.set("dynamodb.output.tableName", "my-dynamo-table")
ddbConf.set("dynamodb.throughput.write.percent", "0.5")
ddbConf.set("mapred.input.format.class", "org.apache.hadoop.dynamodb.read.DynamoDBInputFormat")
ddbConf.set("mapred.output.format.class", "org.apache.hadoop.dynamodb.write.DynamoDBOutputFormat")
ddbInsertFormattedRDD.saveAsHadoopDataset(ddbConf)
Also, have you checked that you have upped the capacity correctly?
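For the Java code in the question, a hedged sketch of the corresponding JobConf changes, with the property names copied from the Scala example above (verify them against the DynamoDB connector version you are using); the main differences are the output table property and the write-throughput setting:
// Sketch only: configure the output table (not just the input one) before saving.
JobConf ddbConf = new JobConf(sc.hadoopConfiguration());
ddbConf.set("dynamodb.servicename", "dynamodb");
ddbConf.set("dynamodb.output.tableName", "my-dynamo-table");
ddbConf.set("dynamodb.endpoint", "dynamodb.us-east-1.amazonaws.com");
ddbConf.set("dynamodb.regionid", "us-east-1");
ddbConf.set("dynamodb.throughput.write.percent", "0.5");
ddbConf.set("mapred.input.format.class", "org.apache.hadoop.dynamodb.read.DynamoDBInputFormat");
ddbConf.set("mapred.output.format.class", "org.apache.hadoop.dynamodb.write.DynamoDBOutputFormat");
pairs.saveAsHadoopDataset(ddbConf);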

cassandra trigger on composite blob key

I use Cassandra 2.1.9 and have a table like
create table "Keyspace1"."Standard4" ( id blob, user_name blob, data blob, primary key(id, user_name));
and I followed the post in Cassandra Sample Trigger Code to get the inserted values, with trigger code like
public class InvertedIndex implements ITrigger
{
    private static final Logger logger = LoggerFactory.getLogger(InvertedIndex.class);

    public Collection augment(ByteBuffer key, ColumnFamily update)
    {
        CFMetaData cfm = update.metadata();

        ByteBuffer id_bb = key;
        String id_Value = new String(id_bb.array());

        Iterator col_itr = update.iterator();

        Cell username_col = (Cell) col_itr.next();
        ByteBuffer username_bb = CompositeType.extractComponent(username_col.name().collectionElement(), 0);
        String username_Value = new String(username_bb.array());

        Cell data_col = (Cell) col_itr.next();
        ByteBuffer data_bb = BytesType.instance.compose(data_col.value());
        String data_Value = new String(data_bb.array());

        logger.info(" id --> " + id_Value);
        logger.info(" username-->" + username_Value);
        logger.info(" data ---> " + data_Value);
        return null;
    }
}
I tried:
insert into "Keyspace1"."Standard4" (id, user_name, data) values (textAsBlob('id1'), textAsBlob('user_name1'), textAsBlob('data1'));
and got a runtime exception at:
ByteBuffer username_bb = CompositeType.extractComponent(username_col.name().collectionElement(), 0);
Caused by: java.lang.NullPointerException: null
at org.apache.cassandra.db.marshal.CompositeType.extractComponent(CompositeType.java:191) ~[apache-cassandra-2.1.9.jar:2.1.9]
at org.apache.cassandra.triggers.InvertedIndex.augment(InvertedIndex.java:52) ~[na:na]
at org.apache.cassandra.triggers.TriggerExecutor.executeInternal(TriggerExecutor.java:223) ~[apache-cassandra-2.1.9.jar:2.1.9]
... 17 common frames omitted
Can anybody tell me how to correct this?
You are trying to show all the inserted column names and values, right?
Here is the code:
@Override
public Collection<Mutation> augment(ByteBuffer key, ColumnFamily update) {
    CFMetaData cfm = update.metadata();
    System.out.println("key => " + ByteBufferUtil.toInt(key));

    for (Cell cell : update) {
        if (cell.value().remaining() > 0) {
            try {
                String name = cfm.comparator.getString(cell.name());
                String value = cfm.getValueValidator(cell.name()).getString(cell.value());
                System.out.println("Column Name => " + name + " Value => " + value);
            } catch (Exception e) {
                System.out.println("Exception : " + e.getMessage());
            }
        }
    }
    return null;
}
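One hedged side note: since the id column in the question is a blob written with textAsBlob(...), decoding the partition key as text is probably more appropriate than ByteBufferUtil.toInt. A sketch, assuming a UTF-8 text key:
// Assumption: the partition key was written with textAsBlob(...), so decode it as UTF-8 text.
try {
    System.out.println("key => " + ByteBufferUtil.string(key));
} catch (CharacterCodingException e) {
    System.out.println("key (hex) => " + ByteBufferUtil.bytesToHex(key));
}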

Retrieving data from composite key via astyanax

I am very new to Cassandra and am using Astyanax.
CREATE TABLE employees (empID int, deptID int, first_name varchar,
last_name varchar, PRIMARY KEY (empID, deptID));
I want to get the results of this query:
select * from employees where empID =2 and deptID = 800;
public void read(Integer empID, String deptID) {
    OperationResult<ColumnList<String>> result;
    try {
        columnFamilies = ColumnFamily.newColumnFamily("employees", IntegerSerializer.get(), StringSerializer.get());
        result = keyspace.prepareQuery(columnFamilies).getKey(empID).execute();
        ColumnList<String> cols = result.getResult();
        //Other stuff
    }
How should I achieve this?
As far as I can find, there isn't a super clean way to do this. You have to do it by executing a CQL query and then iterating through the rows. This code is taken from the Astyanax examples file:
public void read(int empId) {
    logger.debug("read()");
    try {
        OperationResult<CqlResult<Integer, String>> result
                = keyspace.prepareQuery(EMP_CF)
                .withCql(String.format("SELECT * FROM %s WHERE %s=%d;", EMP_CF_NAME, COL_NAME_EMPID, empId))
                .execute();
        for (Row<Integer, String> row : result.getResult().getRows()) {
            logger.debug("row: " + row.getKey() + "," + row); // why is rowKey null?

            ColumnList<String> cols = row.getColumns();
            logger.debug("emp");
            logger.debug("- emp id: " + cols.getIntegerValue(COL_NAME_EMPID, null));
            logger.debug("- dept: " + cols.getIntegerValue(COL_NAME_DEPTID, null));
            logger.debug("- firstName: " + cols.getStringValue(COL_NAME_FIRST_NAME, null));
            logger.debug("- lastName: " + cols.getStringValue(COL_NAME_LAST_NAME, null));
        }
    } catch (ConnectionException e) {
        logger.error("failed to read from C*", e);
        throw new RuntimeException("failed to read from C*", e);
    }
}
You just have to tune the CQL query to return what you want. This is a bit frustrating because, according to the documentation, you can do:
Column<String> result = keyspace.prepareQuery(CF_COUNTER1)
        .getKey(rowKey)
        .getColumn("Column1")
        .execute().getResult();
Long counterValue = result.getLongValue();
However, I don't know what rowKey is. I've posted a question about what rowKey can be; hopefully that will help.
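If you also need to restrict on deptID (as in the question's select * from employees where empID = 2 and deptID = 800), the CQL passed to withCql can simply include both key columns; a sketch along the same lines, with the column-name constants assumed to match the example above:
// Sketch: same Astyanax withCql pattern as above, filtering on both parts of the primary key.
OperationResult<CqlResult<Integer, String>> result = keyspace.prepareQuery(EMP_CF)
        .withCql(String.format("SELECT * FROM %s WHERE %s = %d AND %s = %d;",
                EMP_CF_NAME, COL_NAME_EMPID, 2, COL_NAME_DEPTID, 800))
        .execute();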
