UUID Cassandra - cassandra

I am new to Cassandra and am trying to insert some values into a column family. The column family is defined in the config file as follows:
<ColumnFamily Name="CommandQueue"
ColumnType="Super"
CompareWith="TimeUUIDType"
CompareSubcolumnsWith="UTF8Type"/>
Whenever I try to insert values, I always get "InvalidRequestException(why: UUIDs must be exactly 16 bytes)".
I am using batch_mutate() to insert the columns.
How can I insert values into this column family?

"We have an API for that" :-)
https://github.com/rantav/hector/blob/master/core/src/main/java/me/prettyprint/cassandra/utils/TimeUUIDUtils.java
This class makes it easy to build type1 UUIDs and extract the timestamps as needed. See the related test case for examples.
Hector is MIT licensed, so if you are set on doing your own thing, feel free to use whatever helps.

Below is a code snippet (from Nick Berardi's Coder Journal):
public static Guid GenerateTimeBasedGuid(DateTime dateTime)
{
    long ticks = dateTime.Ticks - GregorianCalendarStart.Ticks;
    byte[] guid = new byte[ByteArraySize];
    byte[] clockSequenceBytes = BitConverter.GetBytes(Convert.ToInt16(Environment.TickCount % Int16.MaxValue));
    byte[] timestamp = BitConverter.GetBytes(ticks);
    // copy node
    Array.Copy(Node, 0, guid, NodeByte, Node.Length);
    // copy clock sequence
    Array.Copy(clockSequenceBytes, 0, guid, GuidClockSequenceByte, clockSequenceBytes.Length);
    // copy timestamp
    Array.Copy(timestamp, 0, guid, 0, timestamp.Length);
    // set the variant
    guid[VariantByte] &= (byte)VariantByteMask;
    guid[VariantByte] |= (byte)VariantByteShift;
    // set the version
    guid[VersionByte] &= (byte)VersionByteMask;
    guid[VersionByte] |= (byte)((int)GuidVersion.TimeBased << VersionByteShift);
    return new Guid(guid);
}

I am just continuing where "Schildmejir" stopped. This is how you can actually use the generated GUID when inserting values into column families.
// build the subcolumns first, then attach them to the super column
List<Column> listOfSomeColumns = new List<Column>();
listOfSomeColumns.Add(new Column() { Name = utf8Encoding.GetBytes("somename"), Value = utf8Encoding.GetBytes(somestring), Timestamp = timeStamp });
Mutation foobar = new Mutation()
{
    Column_or_supercolumn = new ColumnOrSuperColumn()
    {
        Super_column = new SuperColumn()
        {
            Name = GuidGenerator.GenerateTimeBasedGuid(DateTime.Now).ToByteArray(),
            Columns = listOfSomeColumns
        }
    }
};
You can use the generated GUID as either the super column name or the column name, depending on the requirement.

Cassandra expects UUIDs to conform to RFC 4122, so you'll need to either generate compliant values yourself or use an existing library for the language of your choice (most languages have free UUID generation libraries readily available).
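The "exactly 16 bytes" part of the error refers to the raw binary form of the UUID, not its 36-character text form. As a minimal Java sketch (the class name UuidBytes and the sample UUID value are made up for illustration), this is one way to turn a UUID into the 16 bytes a TimeUUIDType name needs. Note that java.util.UUID cannot generate version 1 (time-based) UUIDs by itself, so the value here is parsed from a fixed string; in practice it would come from a library such as the Hector TimeUUIDUtils class mentioned above.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

class UuidBytes {
    // Serialize a UUID as 16 big-endian bytes: most significant long first.
    static byte[] toBytes(UUID uuid) {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        buffer.putLong(uuid.getMostSignificantBits());
        buffer.putLong(uuid.getLeastSignificantBits());
        return buffer.array();
    }

    public static void main(String[] args) {
        // A fixed version 1 UUID used purely as sample data (assumption).
        UUID sample = UUID.fromString("c9a646d3-9c61-11e0-8e0c-0800200c9a66");
        byte[] name = toBytes(sample);
        System.out.println(name.length + " bytes, version " + sample.version());
    }
}
```

Passing sample.toString().getBytes() instead would produce 36 bytes, which is one way to end up with the exception from the question.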

Related

Map to hold multiple sets of key and values

I have a map1 which holds the following information:
[40256942,6] [60246792,5]
Now I want to prepare a map2 that holds information such as
itemNo, 40256942
qty, 6
itemNo, 60246792
qty, 5
in order to produce the final information as JSON:
"partialArticlesInfo": [{itemNo:"40256942", availQty:"6"}, {itemNo:"60246792", availQty:"5"}]
I am trying to iterate over map1, retrieve the values and set them against the keys, but I only get one entry, which is the last one. Is there any way to get a new map with entries such as those mentioned above?
Map<String, String> partialArticlesInfo = new HashMap<String,String>();
Map<String, String> partialArticlesTempMap = null;
for (Map.Entry<String,String> entry : partialStockArticlesQtyMap.entrySet())
{
    partialArticlesTempMap = new HashMap<String,String>();
    partialArticlesTempMap.put("itemNo",entry.getKey());
    partialArticlesTempMap.put("availQty",entry.getValue());
    partialArticlesInfo.putAll(partialArticlesTempMap);
}

In Java (I'm assuming you're using Java; in the future it would be helpful to specify that), and in every other language I know of, a map holds mappings between keys and values, and only one mapping is allowed per key. In your "map2", the keys are "itemNo" and "availQty". So your for loop sets the values for the first entry and then overwrites them with the data from the second entry, which is why the second entry is the only one you see. Look at "Java - Map" and "Map - Java 8" for more info.
I don't understand why you are trying to put the data into a map; you could just put it straight into JSON with something like this:
JSONArray partialArticlesInfo = new JSONArray();
for (Map.Entry<String,String> entry : partialStockArticlesQtyMap.entrySet()) {
    JSONObject stockEntry = new JSONObject();
    stockEntry.put("itemNo", entry.getKey());
    stockEntry.put("availQty", entry.getValue());
    partialArticlesInfo.put(stockEntry);
}
JSONObject root = new JSONObject();
root.put("partialArticlesInfo", partialArticlesInfo);
This will take "map1" (partialStockArticlesQtyMap in your code) and create a JSON object exactly like your example - no need to have map2 as an intermediate step. It loops over each entry in map1, creates a JSON object representing it and adds it to a JSON array, which is finally added to a root JSON object as "partialArticlesInfo".
The exact code may be slightly different depending on which JSON library you are using - check the docs for the specifics.
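To make the overwrite behaviour described in this answer concrete without depending on any JSON library, here is a small sketch using only java.util. The class name PartialArticles and the method toEntryList are made up for illustration; the idea is simply one small map per stock entry, collected in a list, which mirrors the shape of the JSON array.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PartialArticles {
    // Build one small map per stock entry and collect them in a list,
    // so that later entries do not overwrite earlier ones.
    static List<Map<String, String>> toEntryList(Map<String, String> stock) {
        List<Map<String, String>> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : stock.entrySet()) {
            Map<String, String> item = new LinkedHashMap<>();
            item.put("itemNo", entry.getKey());
            item.put("availQty", entry.getValue());
            result.add(item);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> stock = new LinkedHashMap<>();
        stock.put("40256942", "6");
        stock.put("60246792", "5");

        // Reusing a single map: the second put replaces the first value.
        Map<String, String> single = new LinkedHashMap<>();
        single.put("itemNo", "40256942");
        single.put("itemNo", "60246792");
        System.out.println(single.size()); // 1

        // One map per entry keeps both items.
        System.out.println(toEntryList(stock));
    }
}
```

A list of maps like this can then be handed to most JSON libraries as-is, though the exact call depends on the library.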

I agree with Brendan. Another solution would be to store the entries as objects in a Set or List, like the following.
class Item {
    Long itemNo;
    int quantity;

    @Override
    public int hashCode() {
        return java.util.Objects.hash(itemNo, quantity);
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Item)) return false;
        Item that = (Item) other;
        return java.util.Objects.equals(itemNo, that.itemNo) && quantity == that.quantity;
    }
}
Then you can use the JSONArray approach he described to produce the JSON string as output.
This means that adding new fields to the object won't require any more effort to generate the JSON.

Cassandra BoundStatement with Multiple Parameters and Multi-Partition Query

After reading "Asynchronous queries with the Java driver" article in the datastax blog, I was trying to implement a solution similar to the one in the section called - 'Case study: multi-partition query, a.k.a. “client-side SELECT...IN“'.
I currently have code that looks something like this:
public Future<List<ResultSet>> executeMultipleAsync(final BoundStatement statement, final Object... partitionKeys) {
    List<Future<ResultSet>> futures = Lists.newArrayListWithExpectedSize(partitionKeys.length);
    for (Object partitionKey : partitionKeys) {
        Statement bs = statement.bind(partitionKey);
        futures.add(executeWithRetry(bs));
    }
    return Futures.successfulAsList(futures);
}
But I'd like to improve on that. In the CQL query this BoundStatement holds, I'd like to have something that looks like this:
SELECT * FROM <column_family_name> WHERE <param1> = :p1_name AND <param2> = :p2_name AND <partition_key_name> = ?;
I'd like the clients of this method to give me a BoundStatement with already-bound parameters (two parameters in this case) and a list of partition keys. In that case, all I need to do is bind the partition keys and execute the queries. Unfortunately, when I bind the key to this statement it fails with an error: com.datastax.driver.core.exceptions.InvalidTypeException: Invalid type for value 0 of CQL type varchar, expecting class java.lang.String but class java.lang.Long provided. The problem is that the key gets bound to the first parameter, which is a string, rather than to the last one, which is a long.
I can solve this either by giving the partition parameter a name, but then I'd have to receive the name via a method parameter, or by specifying its index, which again requires an additional method parameter. Either way, whether I use the name or the index, I have to bind it with a specific type, for instance: bs.setLong("<key_name>", partitionKey);. For some reason, I can't leave it to the BoundStatement to interpret the type of the last parameter.
I'd like to avoid passing the parameter name explicitly and to bypass the type problem. Is there anything that can be done?
Thanks!

I've posted the same question in 'DataStax Java Driver for Apache Cassandra User Mailing List' and got an answer saying the functionality that I'm missing may be added in the next version (2.2) of the datastax java driver.
In JAVA-721 (to be introduced in 2.2) we are tentatively planning on adding the following methods with the signature to BoundStatement:
public BoundStatement setObject(int i, V v)
public BoundStatement setObject(String name, V v)
and
You can emulate setObject in 2.1:
void setObject(BoundStatement bs, int position, Object object, ProtocolVersion protocolVersion) {
    DataType type = bs.preparedStatement().getVariables().getType(position);
    ByteBuffer buffer = type.serialize(object, protocolVersion);
    bs.setBytesUnsafe(position, buffer);
}
To avoid passing the parameter name, one thing you could do is look for a position that isn't bound yet:
int findUnsetPosition(BoundStatement bs) {
    int size = bs.preparedStatement().getVariables().size();
    for (int i = 0; i < size; i++)
        if (!bs.isSet(i))
            return i;
    throw new IllegalArgumentException("found no unset position");
}
I don't recommend it though, because it's ugly and unpredictable if the user forgot to bind one of the non-PK variables.
The way I would do it is require the user to pass a callback that sets the PK:
interface PKBinder<T> {
    void bind(BoundStatement bs, T pk);
}
public <T> Future<List<ResultSet>> executeMultipleAsync(final BoundStatement statement, PKBinder<T> pkBinder, final T... partitionKeys)
As a bonus, this will also work with composite partition keys.

Cassandra convert UUID to string and back

If I use UUID1 for my column names and then retrieve them with PHP, how can I convert that UUID to a readable string so that I can insert the string into the HTML and later use it to select the same column by converting the string back to a UUID? Is that even possible?
I could go with UTF8 or something else, but I want to avoid collisions and get ordered wide rows, and I really need to store those column names in the HTML; I can't see any other way to do it.
I'm using phpcassa.

You can cast UUID objects to strings to get a nice printable version. That same string can be used with UUID::import() to create an identical UUID object again:
use phpcassa\UUID;
$uuid = UUID::uuid1();
$pretty_uuid = (string)$uuid;
echo("Printable version: " . $pretty_uuid . "\n");
$uuid_copy = UUID::import($pretty_uuid);
assert ($uuid == $uuid_copy);

Assuming you are getting the UUID as byte[], you can use something like this:
public Object convertFromNoSqlImpl(byte[] value) {
    byte[] timeArray = new byte[8];
    byte[] clockSeqAndNodeArray = new byte[8];
    System.arraycopy(value, 0, timeArray, 0, 8);
    System.arraycopy(value, 8, clockSeqAndNodeArray, 0, 8);
    long time = StandardConverters.convertFromBytes(Long.class, timeArray);
    long clockSeqAndNode = StandardConverters.convertFromBytes(Long.class, clockSeqAndNodeArray);
    UUID ud = new UUID(time, clockSeqAndNode);
    return ud;
}
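The snippet above relies on a StandardConverters helper from the answerer's own library. As a rough equivalent using only the JDK, the sketch below rebuilds a UUID from 16 bytes with ByteBuffer and then round-trips it through its printable string form, which is the part the question asks about. The class name UuidRoundTrip and the sample UUID value are assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

class UuidRoundTrip {
    // Rebuild a UUID from its 16-byte big-endian form.
    static UUID fromBytes(byte[] value) {
        if (value.length != 16) {
            throw new IllegalArgumentException("expected 16 bytes, got " + value.length);
        }
        ByteBuffer buffer = ByteBuffer.wrap(value);
        long mostSignificant = buffer.getLong();
        long leastSignificant = buffer.getLong();
        return new UUID(mostSignificant, leastSignificant);
    }

    public static void main(String[] args) {
        // Sample data only: a fixed version 1 UUID and its raw bytes.
        UUID original = UUID.fromString("c9a646d3-9c61-11e0-8e0c-0800200c9a66");
        byte[] raw = ByteBuffer.allocate(16)
                .putLong(original.getMostSignificantBits())
                .putLong(original.getLeastSignificantBits())
                .array();

        // bytes -> UUID -> printable string -> UUID again
        String printable = fromBytes(raw).toString();
        UUID parsed = UUID.fromString(printable);
        System.out.println(printable + " round-trips: " + parsed.equals(original));
    }
}
```

The printable string is safe to embed in HTML and can be parsed back later with UUID.fromString, much like the cast and UUID::import() pair in the phpcassa answer.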

Creating Data Table from object array

I am not sure if I am going about this the correct way, but I have a C# method which loads an Excel sheet into a two-dimensional object array. In this array, items 1,1 to 1,16 contain headers, then 2,1 to 2,16 contain data that match up with those headers, as do x,1 to x,16 from there on. I would like to turn this array into a DataTable so that ultimately I have it in a format I can import into an Access or SQL Server database, depending on a client's needs. I have tried using the following code to no avail, and I have a feeling I am way off. Any help on this would be very much appreciated.
private void ProcessObjects(object[,] valueArray)
{
    DataTable holdingTable = new DataTable();
    DataRow holdingRow;
    holdingTable.BeginLoadData();
    foreach(int row in valueArray)
    {
        holdingRow = holdingTable.LoadDataRow(valueArray[row], true);
    }
}

Any chance you're using a repository pattern (like subsonic or EF) or using LinqToSql?
You could do this (LinqToSql for simplicity):
List<SomeType> myList = valueArray.ToList().Skip([your header rows]).ConvertAll(f => Property1 = f[0] [the rest of your convert statement])
DataContext dc = new DataContext();
dc.SomeType.InsertAllOnSubmit(myList);
dc.SubmitChanges();

Insertion into Cassandra via thrift-client doesn't work after removing a row via cassandra-cli

I wrote a simple test to validate my own understanding of the Thrift interface for Cassandra. It just inserts a row into the database (using the keyspace and column family that come preconfigured with the Cassandra installation), then reads it back from the database and compares the results.
public class CassandraAPITest {
    @Test
    public void testCassandraAPI() throws Exception {
        TTransport tr = new TSocket("localhost", 9160);
        tr.open();
        Client client = new Cassandra.Client(new TBinaryProtocol(tr));
        String key = "123";
        byte[] value = { 52, 53, 54 };
        ColumnPath columnPath = new ColumnPath("Standard1");
        columnPath.setColumn("abc".getBytes("UTF8"));
        long timestamp = System.currentTimeMillis();
        client.insert("Keyspace1", key, columnPath, value, timestamp, ConsistencyLevel.ONE);
        SlicePredicate predicate = new SlicePredicate();
        SliceRange sliceRange = new SliceRange();
        sliceRange.setStart(new byte[0]);
        sliceRange.setFinish(new byte[0]);
        predicate.setSlice_range(sliceRange);
        List<ColumnOrSuperColumn> result = client.get_slice("Keyspace1", key, new ColumnParent("Standard1"), predicate, ConsistencyLevel.ONE);
        assertEquals(1, result.size());
        byte[] actual = result.get(0).column.value;
        assertArrayEquals(value, actual);
        // client.remove("Keyspace1", key, columnPath, System.currentTimeMillis(), ConsistencyLevel.ONE);
        tr.close();
    }
}
This test runs fine. Of course it leaves a row behind in the database. I could delete the row at the end of the test by uncommenting the client.remove statement above (this also works fine). But what I tried instead was deleting the row via the command-line interface:
cassandra> connect localhost/9160
Connected to: "Test Cluster" on localhost/9160
cassandra> get Keyspace1.Standard1['123']
=> (column=616263, value=456, timestamp=1287909211506)
Returned 1 results.
cassandra> del Keyspace1.Standard1['123']
row removed.
cassandra> get Keyspace1.Standard1['123']
Returned 0 results.
The test fails afterwards. Inserting the row into the database seems to have no effect anymore, so the line assertEquals(1, result.size()) fails:
java.lang.AssertionError: expected:<1> but was:<0>
at org.junit.Assert.fail(Assert.java:91)
at org.junit.Assert.failNotEquals(Assert.java:618)
at org.junit.Assert.assertEquals(Assert.java:126)
at org.junit.Assert.assertEquals(Assert.java:443)
at org.junit.Assert.assertEquals(Assert.java:427)
at test.package.CassandraAPITest.testCassandraAPI(CassandraAPITest.java:48)
I don't get any error messages (neither on the client nor on the server) and I have no idea what the cause of the problem might be.

You are inserting with millisecond-resolution timestamps, but the CLI (and other high-level clients) uses microseconds. Your second insert is therefore in the past compared to the delete, so Cassandra correctly ignores it.
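A small sketch of the arithmetic may help here. It assumes, purely for illustration, that the CLI delete happened one minute after the first insert and the second test run one minute after that; the class name MicrosTimestamp is made up, and only the first timestamp is taken from the CLI output in the question.

```java
import java.util.concurrent.TimeUnit;

class MicrosTimestamp {
    // Convert a millisecond clock reading to microseconds.
    static long toMicros(long millis) {
        return TimeUnit.MILLISECONDS.toMicros(millis);
    }

    public static void main(String[] args) {
        long firstInsertMillis = 1287909211506L;                  // from the CLI output
        long deleteMicros = toMicros(firstInsertMillis + 60_000); // CLI delete, in microseconds (assumed time)
        long secondInsertMillis = firstInsertMillis + 120_000;    // later test run, still in milliseconds

        // The millisecond value is numerically far smaller, so the insert looks older than the delete.
        System.out.println(secondInsertMillis > deleteMicros);           // false
        // Converted to microseconds, the insert is newer than the delete.
        System.out.println(toMicros(secondInsertMillis) > deleteMicros); // true
    }
}
```

In the test from the question, that would mean passing something like System.currentTimeMillis() * 1000 as the timestamp, so that all clients writing to the same data agree on the unit.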
