ThriftColumnFamilyTemplate for querying super column family and their columns - cassandra

I have a Cassandra data model that uses a super column family. There are multiple super columns, and every super column has multiple columns of different types (for example, quantity is an integer, Id is a long, and name is a string). I am able to query the names of all super columns for a row using ThriftSuperCfTemplate. However, I am unable to retrieve the names and values of the columns inside each super column. Are there any samples available?

This is a sample from our test suite in Hector that does this.
More info will be posted soon on hector-client.org.
@Test
public void testQuerySingleSubColumn() {
    SuperCfTemplate<String, String, String> sTemplate =
        new ThriftSuperCfTemplate<String, String, String>(keyspace, "Super1", se, se, se);
    SuperCfUpdater sUpdater = sTemplate.createUpdater("skey3","super1");
    sUpdater.setString("sub1_col_1", "sub1_val_1");
    sTemplate.update(sUpdater);
    HColumn<String,String> myCol = sTemplate.querySingleSubColumn("skey3", "super1", "sub1_col_1", se);
    assertEquals("sub1_val_1", myCol.getValue());
}
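The test above reads a single string-valued sub column. For sub columns of mixed types, as in the question, one approach is to fetch the values as raw bytes and decode each one by column name. Here is a minimal sketch of that decoding step using only java.nio; it makes no Hector calls. It assumes the values were written as big-endian 4-byte integers, 8-byte longs, and UTF-8 strings, and the class name and the column names ("quantity", "id") are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: decodes raw sub column values by column name.
final class SubColumnDecoder {

    // Decodes one raw value according to the (assumed) type of the named column.
    static Object decode(String columnName, ByteBuffer raw) {
        ByteBuffer buf = raw.duplicate(); // leave the caller's position untouched
        switch (columnName) {
            case "quantity":
                return buf.getInt();   // assumed 4-byte big-endian integer
            case "id":
                return buf.getLong();  // assumed 8-byte big-endian long
            default:
                return StandardCharsets.UTF_8.decode(buf).toString();
        }
    }

    // Decodes every sub column of one super column, keeping the input order.
    static Map<String, Object> decodeAll(Map<String, ByteBuffer> rawColumns) {
        Map<String, Object> decoded = new LinkedHashMap<>();
        for (Map.Entry<String, ByteBuffer> e : rawColumns.entrySet()) {
            decoded.put(e.getKey(), decode(e.getKey(), e.getValue()));
        }
        return decoded;
    }
}
```

Any column name the switch does not recognise is treated as a string, which is a design choice of this sketch rather than anything Cassandra enforces.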

Related

complex column design in Cassandra and query with Hector

I have a requirement to design like following
Row | Column    | Column
A   | B         | E
    | SubColumn | SubColumn
    | C    D    | F    G
    | 1    2    | 3    4
I don't know if this type of structure is possible. I would also like to know how to query such a table using the Hector client. It would be nice to see some examples of insertion, deletion, and update using Hector.
Inserting a Column on a SuperColumn with Mutator is quite similar to what we have already seen. The only difference is we are now creating an HSuperColumn in place of HColumn to provide some additional structure. For example, if we wanted to store our users under the "billing" department, we would use the following call to Mutator:
Mutator<String> mutator =
HFactory.createMutator(keyspace, stringSerializer);
mutator.insert("billing", "Super1", HFactory.createSuperColumn("jsmith",
Arrays.asList(HFactory.createStringColumn("first", "John")),
stringSerializer, stringSerializer, stringSerializer));
As for retrieval of a SuperColumn, the simple case is almost identical to retrieval of a standard Column. The only difference is the Query implementation used.
SuperColumnQuery<String, String, String, String> superColumnQuery =
HFactory.createSuperColumnQuery(keyspace, stringSerializer,
stringSerializer, stringSerializer, stringSerializer);
superColumnQuery.setColumnFamily("Super1")
.setKey("billing").setSuperName("jsmith");
QueryResult<HSuperColumn<String, String, String>> result = superColumnQuery.execute();
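To relate the layout in the question to this model: a row holds super columns, and each super column holds named sub columns. The sketch below models that shape with nested sorted maps in plain Java so that insert, update, and delete are easy to see. It is an in-memory illustration only, not Hector code, and the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model: row key -> super column -> sub column -> value.
final class SuperColumnModel {
    private final Map<String, Map<String, Map<String, String>>> rows = new TreeMap<>();

    // Inserts a sub column, or overwrites it if it already exists (an update).
    void put(String rowKey, String superColumn, String subColumn, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(superColumn, k -> new TreeMap<>())
            .put(subColumn, value);
    }

    // Returns the value, or null if any level is missing.
    String get(String rowKey, String superColumn, String subColumn) {
        Map<String, Map<String, String>> row = rows.get(rowKey);
        if (row == null) return null;
        Map<String, String> sub = row.get(superColumn);
        return sub == null ? null : sub.get(subColumn);
    }

    // Removes a whole super column together with its sub columns.
    void deleteSuperColumn(String rowKey, String superColumn) {
        Map<String, Map<String, String>> row = rows.get(rowKey);
        if (row != null) row.remove(superColumn);
    }
}
```

As I read the layout in the question, row A would hold super columns B and E, with sub columns C and D under B, and F and G under E.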

How to update multiple rows using Hector

Is there a way to update multiple rows in a Cassandra database using a column family template, for example by supplying a list of keys?
Currently I am using the ColumnFamilyTemplate updater to loop through a list of keys and do an update for each row. I have seen queries like multigetSliceQuery, but I don't know their equivalent for updates.
There is no utility method in ColumnFamilyTemplate that allows you to pass a list of keys with a list of mutations in one call.
You can implement your own using mutators.
This is the basic code for doing it in Hector:
Set<String> keys = MY_KEYS;
Map<String, String> pairsOfNameValues = MY_MUTATION_BY_NAME_AND_VALUE;
// Build the columns once; the same set is applied to every key.
Set<HColumn<String, String>> columns = new HashSet<HColumn<String, String>>();
for (Entry<String, String> pair : pairsOfNameValues.entrySet()) {
    columns.add(HFactory.createStringColumn(pair.getKey(), pair.getValue()));
}
Mutator<String> mutator = template.createMutator();
String columnFamilyName = template.getColumnFamily();
for (String key : keys) {
    for (HColumn<String, String> column : columns) {
        // Queued only; nothing is sent until execute().
        mutator.addInsertion(key, columnFamilyName, column);
    }
}
// Send all queued insertions in one batch.
mutator.execute();
It should look something like that. This example covers insertion; be sure to use the following methods for batch mutations:
mutator.addInsertion
mutator.addDeletion
mutator.addCounter
mutator.addCounterDeletion
because the following ones execute right away, without waiting for mutator.execute():
mutator.incrementCounter
mutator.deleteCounter
mutator.insert
mutator.delete
As a last note: A mutator allows you to batch mutations on multiple rows on multiple column families at once ... which is why I generally prefer to use them instead of CF templates. I have a lot of denormalization for functionalities that use the "push-on-write" pattern of NoSQL.
You can use a batch mutation to insert as much as you want (within thrift_max_message_length_in_mb). See http://hector-client.github.com/hector//source/content/API/core/1.0-1/me/prettyprint/cassandra/model/MutatorImpl.html.
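Because a single batch is bounded by the Thrift message size, a very large key set may need to be split across several execute() calls. Here is a minimal sketch of the splitting step in plain Java; the class name is hypothetical, and a suitable batch size is an assumption that depends on how large each mutation is.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical helper: splits keys into fixed-size batches.
final class KeyBatcher {

    static <K> List<List<K>> partition(Collection<K> keys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<K>> batches = new ArrayList<>();
        List<K> current = new ArrayList<>(batchSize);
        for (K key : keys) {
            current.add(key);
            if (current.size() == batchSize) {
                batches.add(current);
                current = new ArrayList<>(batchSize);
            }
        }
        if (!current.isEmpty()) {
            batches.add(current); // final, possibly smaller, batch
        }
        return batches;
    }
}
```

Each inner list would then get its own round of addInsertion calls followed by one mutator.execute().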

Cassandra Hector: how to insert null as a column value?

A common use case with Cassandra is storing the data in the column names of a dynamically created column family. In this situation the column values themselves are not needed, and the usual practice is to store nulls there.
However, when dealing with Hector, it seems there is no way to insert a null value, because Hector's HColumnImpl does an explicit null check in the column's constructor:
public HColumnImpl(N name, V value, long clock, Serializer<N> nameSerializer,
Serializer<V> valueSerializer) {
this(nameSerializer, valueSerializer);
notNull(name, "name is null");
notNull(value, "value is null");
this.column = new Column(nameSerializer.toByteBuffer(name));
this.column.setValue(valueSerializer.toByteBuffer(value));
this.column.setTimestamp(clock);
}
Are there any ways to insert nulls via Hector? If not, what is the best practice in the situation when you don't care about column values and need only their names?
Try using an empty byte[], i.e. new byte[0];
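A minimal sketch of why this works, in plain Java: an empty array is a non-null object, so a null check like the one in the constructor above passes, and the resulting value has zero bytes. The class below is hypothetical and only mimics that check; it does not call Hector.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in that mimics the null check quoted above.
final class EmptyValueDemo {
    // Shared placeholder for "no value"; safe to reuse because it has no contents.
    static final byte[] EMPTY = new byte[0];

    static ByteBuffer toValue(byte[] value) {
        if (value == null) {
            throw new IllegalArgumentException("value is null");
        }
        return ByteBuffer.wrap(value);
    }
}
```

With Hector itself, the empty array would presumably be paired with a byte-array value serializer rather than a string one.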

Using hector, how to delete a range of super columns?

I have a super column family from which, over time, I need to remove a range of super columns. I searched around but could not find a way to do that using Hector. Can anyone please help?
You'll have to do a column slice first to get the columns you want to delete, then loop through and generate a list of mutations. You can then send all these mutations to Cassandra in one Hector call:
Mutator<..> mutator = HFactory.createMutator(keyspace, serializer);
SuperSlice<..> result = HFactory.createSuperSliceQuery(keyspace, ... serializers ...)
    .setColumnFamily(cf)
    .setKey(key)
    .setRange("", "", false, Integer.MAX_VALUE)
    .execute()
    .get();
for (HSuperColumn<..> col : result.getSuperColumns())
    mutator.addDeletion(key, cf, col.getName(), serializer);
mutator.execute();

Retrieve AutoIncrement key value when the column is NOT the first one in the table

I've got a question regarding how to retrieve the auto-increment or identity value for a column in SQL Server 2005, when said column is not the first declared column in a table.
I can get the generated value for a table just by issuing the following code:
MyTable newRecord = new MyTable();
newRecord.SomeColumn = 2;
newRecord.Save();
return newRecord.MyIdColumn;
This works fine regardless of how many other columns make up the primary key of that particular table, but the first column declared MUST be the identity column; otherwise it doesn't work.
My problem is that I have to integrate my code with other tables that are out of my reach, and they have identity columns which are NOT the first columns in those tables. Is there a proper workaround for my problem, or am I stuck using something along the lines of SELECT @@IDENTITY to manually get that value?
Many thanks in advance for all your help!
From the "ewww gross" department:
Here's my workaround for now, hopefully someone may propose a better solution to what I did.
MyTable newRecord = new MyTable();
newRecord.SomeColumn = 2;
newRecord.Save();
CodingHorror horror = new CodingHorror();
string SQL = "SELECT IDENT_CURRENT(@tableName)";
int newId = horror.ExecuteScalar<int>(SQL, "MyTable");
newRecord.MyIdColumn = newId;
newRecord.MarkClean();
return newRecord.MyIdColumn;
