Getting PK columns from ColumnFamily columnFamily, ByteBuffer key Cassandra triggers - cassandra

I am new to Cassandra triggers and still ramping up. I found a way to extract the value for a given ByteBuffer key, but I do not know how to get the name of the actual primary key column.
public static String getKeyText(ColumnFamily columnFamily, ByteBuffer key) {
    CFMetaData cfm = columnFamily.metadata();
    // Decode the raw partition key bytes with the table's key validator
    String key_data = cfm.getKeyValidator().getString(key);
    return key_data;
}
Any idea on how to get just the key column name?
Any pointers are highly appreciated
Thanks

Not sure if this is what you mean, but you can get the partition key columns from columnFamily.metadata().partitionKeyColumns(). Each ColumnDefinition has a readable name field. There may be more than one column, depending on the schema.
https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/config/CFMetaData.java#L797
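If the table has more than one partition key column, the key ByteBuffer holds all of the values packed together. Below is a minimal sketch of splitting such a key, assuming Cassandra's CompositeType layout (for each component: a 2-byte length, the value bytes, then a 1-byte end-of-component marker). The names CompositeKeySplitter, split and encode are made up for illustration and are not part of Cassandra. A single-column partition key is not wrapped this way; it is just the raw value.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class CompositeKeySplitter {

    /** Splits a composite-encoded key into its raw components (assumed layout, see above). */
    public static List<ByteBuffer> split(ByteBuffer key) {
        ByteBuffer in = key.duplicate();            // leave the caller's position untouched
        List<ByteBuffer> parts = new ArrayList<>();
        while (in.remaining() > 0) {
            int length = in.getShort() & 0xFFFF;    // unsigned 16-bit component length
            ByteBuffer part = in.slice();
            part.limit(length);                     // throws if the input is malformed
            parts.add(part);
            in.position(in.position() + length);
            in.get();                               // skip the end-of-component byte
        }
        return parts;
    }

    /** Illustration helper: encodes UTF-8 strings in the same layout. */
    public static ByteBuffer encode(String... values) {
        byte[][] raw = new byte[values.length][];
        int size = 0;
        for (int i = 0; i < values.length; i++) {
            raw[i] = values[i].getBytes(StandardCharsets.UTF_8);
            size += 2 + raw[i].length + 1;
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        for (byte[] r : raw) {
            out.putShort((short) r.length);
            out.put(r);
            out.put((byte) 0);
        }
        out.flip();
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer key = encode("pair-1", "user-42");
        for (ByteBuffer part : split(key)) {
            System.out.println(StandardCharsets.UTF_8.decode(part));
        }
    }
}
```

The components should line up by index with the entries returned by partitionKeyColumns(), so names and values can be paired that way. Each component still has to be decoded with the type of its own column.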

Related

com.datastax.driver.core.exceptions.InvalidQueryException: Invalid operator IN for PRIMARY KEY part

I have cassandra 2.1.15.
I have this table
CREATE TABLE ks_mobapp.messages (
pair_id text,
belong_to text,
message_id timeuuid,
cli_time bigint,
sender text,
text text,
time bigint,
PRIMARY KEY ((pair_id, belong_to), message_id)
) WITH CLUSTERING ORDER BY (message_id DESC)
I was trying to delete multiple records like this:
instances.getCqlSession().execute(QueryBuilder.delete()
.from(AppConstants.KEYSPACE, "messages")
.where(QueryBuilder.eq("pair_id", pairId))
.and(QueryBuilder.eq("belong_to", currentUser.value("userId")))
.and(QueryBuilder.in("message_id", msgId)));
I am getting error:
Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Invalid operator IN for PRIMARY KEY part message_id
Then I tried:
Session session = instances.getCqlSession();
PreparedStatement statement = session.prepare("DELETE FROM ks_mobApp.messages WHERE pair_id = ? AND belong_to = ? AND message_id = ?;");
Iterator<String> iterator = msgId.iterator();
while (iterator.hasNext()) {
    try {
        session.executeAsync(statement.bind(pairId, currentUser.value("userId"), UUID.fromString(iterator.next())));
    } catch (Exception ex) {
        // Exceptions are silently ignored here
    }
}
It's working nicely. Is this the correct way? Can't I use IN within the same partition key?
In Cassandra 2.x, the IN relation in a DELETE is only supported on the partition key.
There are some WHERE clause restrictions for UPDATE and DELETE statements in Cassandra 2.x. More specifically, you can only use the IN operator on the last partition key column. In your case the last partition key column is belong_to, so IN can only be used on that column.
However, these limitations are removed in Cassandra 3.0, which allows IN to be specified on any partition key column and on any clustering column.
Here is the patch: https://issues.apache.org/jira/browse/CASSANDRA-6237
See also: http://www.datastax.com/dev/blog/a-deep-look-to-the-cql-where-clause
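For the one-DELETE-per-id loop from the question, it may help to parse the ids up front instead of catching and discarding exceptions inside the loop. A minimal sketch using only java.util.UUID: the class name MessageIdParser and its Result type are hypothetical. The version check assumes that a timeuuid column only accepts version 1 UUIDs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

public class MessageIdParser {

    /** Parsed ids, split into usable UUIDs and rejected input strings. */
    public static final class Result {
        public final List<UUID> valid = new ArrayList<>();
        public final List<String> rejected = new ArrayList<>();
    }

    public static Result parse(Iterable<String> rawIds) {
        Result result = new Result();
        for (String raw : rawIds) {
            try {
                UUID id = UUID.fromString(raw);
                // Assumption: message_id is a timeuuid, so only version 1 is accepted
                if (id.version() == 1) {
                    result.valid.add(id);
                } else {
                    result.rejected.add(raw);
                }
            } catch (IllegalArgumentException | NullPointerException e) {
                // Not a UUID string at all (or null)
                result.rejected.add(raw);
            }
        }
        return result;
    }
}
```

Each entry in valid can then be bound to the prepared DELETE as before, and rejected can be logged or returned to the caller.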

azure table storage partitionkey/row key background?

I'm reviewing Azure tables in an existing implementation. Here's an example of data from 1 row:
partitionkey (string):
6b348096-e6cb-4126-ba3c-cd0c9e8ba9c9
rowkey (string):
02519452888782521547_c1a98e0f-1b25-4d38-bd96-d72b30a97bf0
Obviously, rowkey does not have a proper guid and both column names are native to Azure and required per entity. There does not appear to be an identity or default insert for these columns. Can someone please provide context around these columns and the implementation style differences and considerations between these Azure columns vs a SQL Server style implementation?
I have no idea about how the partition key and row key are constructed in your table, please turn to someone who created the table. :)
About the considerations on the Azure Table design, you can refer to this post (which is very complete and helpful).
PartitionKey and RowKey are just two properties of entities in an Azure table. RowKey is the "primary key" within one partition: within one PartitionKey, RowKeys must be unique. If you use multiple partitions, the same RowKey can be reused in every partition. PartitionKey + RowKey together form the unique identifier (primary key) for an entity.
In your table, the PartitionKey and RowKey appear to be assigned a random string. I'm not sure whether you designed this table or somebody else did, but these two properties can be assigned other values through the Azure Storage .NET client library and the REST API. As in the example below, you can design the RowKey and PartitionKey and assign whatever valid value you want, here lastName for PartitionKey and firstName for RowKey:
public class CustomerEntity : TableEntity
{
public CustomerEntity(string lastName, string firstName)
{
this.PartitionKey = lastName;
this.RowKey = firstName;
}
public CustomerEntity() { }
public string Email { get; set; }
public string PhoneNumber { get; set; }
}
It's better to think through both properties and your partitioning strategy. Don't just assign them a GUID or a random string, because the choice matters for performance. I recommend going through Designing a Scalable Partitioning Strategy for Azure Table Storage; the most commonly used approach is range partitions, but you can choose whatever fits.
This is a great blog post that helps you understand how PartitionKey and RowKey work: http://blog.maartenballiauw.be/post/2012/10/08/What-PartitionKey-and-RowKey-are-for-in-Windows-Azure-Table-Storage.aspx
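One hedged observation on the RowKey from the question: its numeric prefix may not be random. If it was produced as DateTime.MaxValue.Ticks minus the current ticks (a common way to keep the newest entities first within a partition), the prefix can be turned back into a timestamp. This is only a guess about how the table was built, and the class name ReverseTicks is made up for this sketch. It is written in Java to match the other snippets on this page.

```java
import java.time.Instant;

public class ReverseTicks {
    // .NET DateTime.MaxValue.Ticks (100 ns units since 0001-01-01T00:00:00)
    static final long MAX_TICKS = 3155378975999999999L;
    // .NET ticks at the Unix epoch, 1970-01-01T00:00:00Z
    static final long UNIX_EPOCH_TICKS = 621355968000000000L;

    /** Decodes the prefix before '_' under the assumption that it is MAX_TICKS - ticks. */
    public static Instant decode(String rowKey) {
        int sep = rowKey.indexOf('_');
        String prefix = sep < 0 ? rowKey : rowKey.substring(0, sep);
        long ticks = MAX_TICKS - Long.parseLong(prefix);
        long sinceEpoch = ticks - UNIX_EPOCH_TICKS;
        // One tick is 100 ns, so there are 10,000,000 ticks per second
        return Instant.ofEpochSecond(sinceEpoch / 10_000_000L, (sinceEpoch % 10_000_000L) * 100L);
    }

    public static void main(String[] args) {
        System.out.println(decode("02519452888782521547_c1a98e0f-1b25-4d38-bd96-d72b30a97bf0"));
    }
}
```

For the sample row this gives a timestamp in early March 2016, which at least looks like a plausible insert time. If that matches when the data was written, the RowKey is probably a reverse-chronological sort key followed by a GUID for uniqueness.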

Deserializing mutations from commit log in cassandra

I'm trying to deserialize the commit log in Cassandra for a research project.
I have succeeded so far in deserializing the cell names and the cell values from the mutation entries in the commit log.
However, I am struggling to deserialize the primary key entries of the mutations, since by design the cell values are empty for the primary keys. The closest I could get is retrieving the partition key name from the column definition in the column family metadata, but I don't know how to get the actual value of the primary key.
Thanks
Below is my approach to deserialize the mutation:
// function in CommitLog.java
public ReplayPosition add(Mutation mutation){
    Collection<ColumnFamily> myCollection = mutation.getColumnFamilies();
    for (ColumnFamily cf : myCollection) {
        CFMetaData cfm = cf.metadata();
        // Retrieve name of partition key
        logger.info("partition key={}.", cfm.partitionKeyColumns().get(0).name.toString());
        for (Cell cell : cf) {
            // Retrieve cell name
            String name = cfm.comparator.getString(cell.name());
            logger.info("name={}.", name);
            // Retrieve cell value
            String value = cfm.getValueValidator(cell.name()).getString(cell.value());
            logger.info("value={}.", value);
        }
    }
}

Astyanax getKey with compound key

I would like to run the following code with a compound primary key.
Column<String> result = keyspace.prepareQuery(CF_COUNTER1)
.getKey(rowKey)
.getColumn("Column1")
.execute().getResult();
Long counterValue = result.getLongValue();
Research suggests that it can be a string that represents the key (if it's not a compound primary key). The documentation says that it is of type K; alas, I am not very experienced with Java and have no idea what that means. Is it just a base type that lots of stuff inherits from? If so, I'm not really any closer to knowing what getKey(K) needs in order to handle a compound key (am I?).
You just need to write a class that fits the columns in your data model. You can then give this class to Astyanax in your mutations or queries.
For example, if you had a data model like this
CREATE TABLE fishblogs (
userid varchar,
when timestamp,
fishtype varchar,
blog varchar,
image blob,
PRIMARY KEY (userid, when, fishtype)
);
you would create a class like this:
public class FishBlog {
    @Component(ordinal = 0)
    public long when;
    @Component(ordinal = 1)
    public String fishtype;
    @Component(ordinal = 2)
    public String field;
    public FishBlog() {
    }
}
When and fishtype form your composite column key and are represented by the FishBlog class. Userid would be your row/partition key and can be of the simple "string" type.
Have a look at this blog explaining in great detail how to insert data with composite keys (where I took this example from).
Hope that helps.
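On the "type K" part of the question: K is not a base class but a generic type parameter, a placeholder for whatever row key type the column family was declared with. The sketch below uses a made-up Table<K, C> class (not Astyanax's API) purely to show how the placeholder gets filled in.

```java
import java.util.HashMap;
import java.util.Map;

public class GenericKeyDemo {

    /** Hypothetical stand-in for a column family: K is the row key type, C the column name type. */
    public static final class Table<K, C> {
        private final Map<K, Map<C, Long>> rows = new HashMap<>();

        public void put(K rowKey, C column, long value) {
            rows.computeIfAbsent(rowKey, k -> new HashMap<>()).put(column, value);
        }

        /** Returns the stored value, or null if the row or column is absent. */
        public Long get(K rowKey, C column) {
            Map<C, Long> row = rows.get(rowKey);
            return row == null ? null : row.get(column);
        }
    }

    public static void main(String[] args) {
        // Here K = String and C = String, so get(...) expects a String row key
        Table<String, String> counters = new Table<>();
        counters.put("user-1", "Column1", 7L);
        System.out.println(counters.get("user-1", "Column1"));
    }
}
```

So what getKey(K) accepts depends on how CF_COUNTER1 was declared: with a String row key type it takes a String, and with a custom key class it takes an instance of that class.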

Cassandra Hector: how to insert null as a column value?

A common use case with Cassandra is storing the data in the column names of a dynamically created column family. In this situation the column values themselves are not needed, and a usual practice is to store nulls there.
However, with Hector there seems to be no way to insert a null value, because HColumnImpl does an explicit null check in the column's constructor:
public HColumnImpl(N name, V value, long clock, Serializer<N> nameSerializer,
Serializer<V> valueSerializer) {
this(nameSerializer, valueSerializer);
notNull(name, "name is null");
notNull(value, "value is null");
this.column = new Column(nameSerializer.toByteBuffer(name));
this.column.setValue(valueSerializer.toByteBuffer(value));
this.column.setTimestamp(clock);
}
Are there any ways to insert nulls via Hector? If not, what is the best practice in the situation when you don't care about column values and need only their names?
Try using an empty byte[], i.e. new byte[0];
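A small sketch of why that works, using only the JDK: an empty array is a real object, so it passes a null check like the one quoted in the question, and it serializes to a zero-length buffer. The notNull method here is a stand-in written for this example, not Hector's own, and EmptyValueDemo is a made-up name.

```java
import java.nio.ByteBuffer;

public class EmptyValueDemo {
    /** Shared placeholder for "no value". */
    public static final byte[] EMPTY = new byte[0];

    // Stand-in for the null check in the constructor quoted above
    static <T> T notNull(T value, String message) {
        if (value == null) {
            throw new IllegalArgumentException(message);
        }
        return value;
    }

    /** Wraps the value the way a byte-array serializer would, after the null check. */
    public static ByteBuffer toValueBuffer(byte[] value) {
        return ByteBuffer.wrap(notNull(value, "value is null"));
    }

    public static void main(String[] args) {
        System.out.println(toValueBuffer(EMPTY).remaining());
    }
}
```

With Hector you would pass the empty array as the column value together with a byte-array value serializer (I believe BytesArraySerializer, but check your Hector version).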
