update cassandra field using string concatenation - cassandra

I am trying to update an existing string column in a Cassandra table.
For example, I want to prepend a domain id to the username.
The following is my table:
id, username
1, agaikwad
2, xyz
I want to write CQL to update the above table to reflect the following:
id, username
1, homeoffice\\agaikwad
2, homeoffice\\xyz
The following is what I have tried:
update users set username = 'homeoffice\\' + username where id = <id>

This is not allowed in C* because it implicitly requires a read before a write, which is bad practice with C* (and an expensive proposition in a distributed system). For similar behavior, you could store this field as a list of strings: lists support the append (and prepend) operation, and you would concatenate the elements on the application side.
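A minimal sketch of doing the concatenation on the application side instead, assuming the rows have already been read by the client. The helper name build_prefix_updates is hypothetical, and the ? placeholder style is an assumption: the exact placeholder syntax depends on the driver you use.

```python
from typing import Iterable, List, Tuple


# Hypothetical helper (not part of any Cassandra driver): the read happens in
# the application, and each write carries the already-concatenated value.
def build_prefix_updates(
    rows: Iterable[Tuple[int, str]], prefix: str
) -> List[Tuple[str, Tuple[str, int]]]:
    cql = "UPDATE users SET username = ? WHERE id = ?"
    updates = []
    for row_id, username in rows:
        # Skip rows that already carry the prefix so a re-run is harmless.
        if username.startswith(prefix):
            continue
        updates.append((cql, (prefix + username, row_id)))
    return updates


# Rows as they might come back from "SELECT id, username FROM users".
rows = [(1, "agaikwad"), (2, "xyz")]
updates = build_prefix_updates(rows, "homeoffice\\")
```

Each (statement, parameters) pair can then be handed to whatever execute call your driver provides. The read and the write are two separate steps here, so a concurrent change to username between them would be overwritten.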

Related

Fetching Data from Database using Strings not IDs

Whenever we save data to the database, there is always a corresponding ID which we use to fetch the data from that specific column.
sql_con.execute("SELECT * FROM DBNAME WHERE ID = ?", id)
The above code only allows us to fetch data based on the ID. The problem is that the above code accepts only one supplied binding. In my database, I use strings as the ID for each row, so the ID I pass in has more than one character, and the IDs differ in length.
How do I modify the code above so that I can pass strings as my ID without getting this error:
sqlite3.ProgrammingError: Incorrect number of bindings supplied. The current statement uses 1, and there are 8 supplied.
Thank you in advance. I use Python 3.xx and the built-in sqlite3 module. The database is a disk-based .db file.
I found the answer to my own question by asking someone else.
To resolve this binding problem, pass the parameter as a one-element tuple. execute expects a sequence of parameters, and a bare string is itself a sequence of characters, so an 8-character ID is read as 8 bindings.
OLD CODE:
sql_con.execute("SELECT * FROM DBNAME WHERE ID = ?", id)
INTO THIS...
NEW CODE:
sql_con.execute("SELECT * FROM DBNAME WHERE ID = ?", (id,))
Hope it helps.
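As a rough, self-contained illustration of the difference, the sketch below uses an in-memory database. The table name items, its columns, and the sample key are made up for the example.

```python
import sqlite3

# In-memory database with a hypothetical table, used only for illustration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE items (id TEXT PRIMARY KEY, payload TEXT)")
con.execute("INSERT INTO items VALUES (?, ?)", ("abcd1234", "hello"))

key = "abcd1234"  # an 8-character string ID

# Passing the bare string: sqlite3 treats it as a sequence of 8 values,
# which does not match the single ? placeholder.
try:
    con.execute("SELECT * FROM items WHERE id = ?", key)
    bare_string_error = None
except sqlite3.ProgrammingError as exc:
    bare_string_error = exc

# Passing a one-element tuple: exactly one value is bound to the placeholder.
row = con.execute("SELECT * FROM items WHERE id = ?", (key,)).fetchone()
```

A one-element list, [key], would work the same way; the point is that the parameters argument is a container holding one value per placeholder.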

unable to upsert using java datastax driver

I am unable to upsert a row using the datastax driver.
The data in the Cassandra table is stored as follows:
tag | partition_info
------------+--------------------------------------------------
sometag | {{year: 2018, month: 1}, {year: 2018, month: 2}}
tag is primary key and partition_info is a UDT
CREATE TYPE codingjedi.tag_partitions (
year bigint,
month bigint
);
If a tag doesn't exist, I want it to be created. If the tag exists, the new UDT value should be added to the existing ones. I suppose I cannot use insert because it overwrites the previous value, i.e. this will not work:
QueryBuilder.insertInto(tableName).value("tag",model.tag)
.value("partition_info",setAsJavaSet(Set(partitionsInfo)))
I am trying to use update, but it isn't working. The Datastax driver throws java.lang.IllegalArgumentException for the following query:
QueryBuilder.update(tableName).`with`(QueryBuilder.append("partition_info",setAsJavaSet(Set(partitionsInfo))))
.where(QueryBuilder.eq("tag", id.tag))
I tried using add and append for the primary key, but got the error "PRIMARY KEY part tag found in SET part":
QueryBuilder.update(tableName).`with`(QueryBuilder.add("tag",id.tag))
.and(QueryBuilder.append("partition_info",setAsJavaSet(Set(partitionsInfo))))
.where(QueryBuilder.eq("tag", id.tag))
You're using the wrong operation in your update statement: append is meant for columns of list types. For a set column, use either add if you're adding a single value (your case, so you won't even need to wrap the data in a Set explicitly), or addAll if you're adding multiple values.
QueryBuilder.update(tableName)
.`with`(QueryBuilder.add("partition_info", partitionsInfo))
.where(QueryBuilder.eq("tag", id.tag))

Create a Couchbase Document without Specifying an ID

Is it possible to insert a new document into a Couchbase bucket without specifying the document's ID? I would like to use Couchbase's Java SDK to create a document and have Couchbase determine the document's UUID, with Groovy code similar to the following:
import com.couchbase.client.java.CouchbaseCluster
import com.couchbase.client.java.Cluster
import com.couchbase.client.java.Bucket
import com.couchbase.client.java.document.JsonDocument
import com.couchbase.client.java.document.json.JsonObject
// Connect to localhost
CouchbaseCluster myCluster = CouchbaseCluster.create()
// Connect to a specific bucket
Bucket myBucket = myCluster.openBucket("default")
// Build the document
JsonObject person = JsonObject.empty()
.put("firstname", "Stephen")
.put("lastname", "Curry")
.put("twitterHandle", "#StephenCurry30")
.put("title", "First Unanimous NBA MVP")
// Create the document
JsonDocument stored = myBucket.upsert(JsonDocument.create(person));
No. Couchbase documents have to have a key; that's the whole point of a key-value store, after all. However, if you don't care what the key is (for example, because you retrieve documents through queries rather than by key), you can just use a UUID or any other unique value when creating the document.
It seems there is no way to have Couchbase generate the document IDs for me. At the suggestion of another developer, I am using UUID.randomUUID() to generate the document IDs in my application. The approach is working well for me so far.
Reference: https://forums.couchbase.com/t/create-a-couchbase-document-without-specifying-an-id/8243/4
As you already found out, generating a UUID is one approach.
If you want to generate a more meaningful ID, for instance a "foo" prefix followed by a sequence number, you can make use of atomic counters in Couchbase.
The atomic counter is a document that contains a long, on which the SDK relies to guarantee a unique, incremented value each time you call bucket.counter("counterKey", 1, 2). This code would take the value of the counter document "counterKey", increment it by 1 atomically and return the incremented value. If the counter doesn't exist, it is created with the initial value 2, which is the value returned.
This is not automatic, but a Couchbase way of creating sequences / IDs.
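The two approaches to generating keys can be sketched roughly as follows. This does not call Couchbase at all: InMemoryCounter is a hypothetical stand-in that only mimics the counter semantics described above (create with the initial value if missing, otherwise increment by the delta), and the "-" separator in the key is an arbitrary choice.

```python
import threading
import uuid


class InMemoryCounter:
    """Hypothetical stand-in for a counter document; not a Couchbase API."""

    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()

    def counter(self, key, delta, initial):
        # The lock makes the read-modify-write step atomic within this process.
        with self._lock:
            if key not in self._values:
                self._values[key] = initial
            else:
                self._values[key] += delta
            return self._values[key]


def random_key():
    # Approach 1: a random UUID generated by the application.
    return str(uuid.uuid4())


def sequential_key(counters, prefix="foo"):
    # Approach 2: a readable prefix followed by a sequence number.
    return f"{prefix}-{counters.counter('counterKey', 1, 2)}"
```

With a fresh InMemoryCounter, the first call to sequential_key returns the key built from the initial value 2, and later calls count up from there.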

Astyanax Composite Keys in Cassandra

I'm trying to create a schema that will enable me to access rows with only part of the row key.
For example the key is of the form user_id:machine_os:machine_arch
An example of a row key: 12242:"windows2000":"x86"
From the documentation, I could not understand whether this will enable me to query all rows that have userid=12242, or all rows that have "windows2000".
Is there any feasible way to achieve this?
Thanks,
Yadid
Alright, here is what is happening: based on your schema, you are effectively creating a column family with a composite primary key or a composite rowkey. What this means is, you will need to restrict each component of the composite key except the last one with a strict equality relation. The last component of the composite key can use inequality and the IN relation, but not the 1st and 2nd components.
Additionally, you must specify all three parts if you want to utilize any kind of filtering. This is necessary because without all parts of the partition key, the coordinator node will have no idea on which node in the cluster the data exists (remember, Cassandra uses the partition key to determine replicas and data placement).
Effectively, this means you can't do any of these:
select * from datacf where user_id = 100012; -- missing 2nd and 3rd key components
select * from datacf where user_id = 100012 and machine_arch = 'x86'; -- missing the 2nd key component (machine_os)
select * from datacf where machine_arch = 'x86'; -- you have to specify the 1st
select * from datacf where user_id = 100012 and machine_arch in ('x86', 'x64'); -- nope, machine_os is still missing
However, you will be able to run queries like this:
select * from datacf where user_id = 100012 and machine_arch = 'x86'
and machine_os = 'windows2000'; -- yes! all 3 parts are there
select * from datacf where user_id = 100012 and machine_os = 'windows2000'
and machine_arch in ('x86', 'x64'); -- the last part of the key can use the IN relation
To answer your initial question: with your existing data model, you will be able to neither query data with user_id = 12242 nor query all rows that have 'windows2000' as the machine_os.
If you can tell me exactly what kind of query you will be running, I can probably help in trying to design the table accordingly. Cassandra data models usually work better when looked at from the data retrieval perspective. Long story short: use only user_id as your primary key and use secondary indexes on the other columns you want to query on.
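The placement argument above (the coordinator needs every part of the partition key to find the data) can be illustrated with a toy sketch. It assumes, as the answer does, that all three components form the partition key. The node names and the node_for helper are made up, and the hashing is deliberately simplified: real Cassandra maps keys to tokens with its partitioner (Murmur3 by default) on a token ring.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical cluster


def node_for(user_id, machine_os, machine_arch):
    # Toy placement: hash the complete key and pick a node from the result.
    full_key = f"{user_id}:{machine_os}:{machine_arch}".encode("utf-8")
    digest = hashlib.md5(full_key).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


# The complete key always maps to the same node.
owner = node_for(12242, "windows2000", "x86")
```

Because the hash is taken over the whole key, knowing only user_id gives no way to compute owner: rows for the same user with a different machine_os or machine_arch may hash to any node.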

Cassandra Searching for a RowKey

I am very new to Cassandra and have not yet read much about the architecture. I have a simple question that I cannot find an answer to.
This is sample data from running list abcColumnFamily:
RowKey:Message_1
=> (column=word, value=Message_1, timestamp=1373976339934001)
RowKey:Message_2
=> (column=word, value=Message_2, timestamp=1373976339934001)
How can I search for the row key, say Message_1?
In the SQL world: Select * from Table where Rowkey = 'Message_1' (= OR like). I simply want to search on the full string.
My intention is just to check whether a particular piece of data I am interested in is present as a row key or not.
For CQL try:
select * from abcColumnFamily where KEY = 'Message_1'
If you want to query that data using the CLI, try the following:
assume abcColumnFamily keys as utf8;
get abcColumnFamily['Message_1'];

Resources