In my application, I want to get all the rows in a column family, but to ignore the rows that are temporarily unavailable (e.g. some nodes are down).
I have multiple nodes. If one of the nodes is down, then get_range will throw UnavailableException, and I get nothing.
What I want is to get all the rows that are currently available, because, to the user, it's better than nothing. How can I do this?
I'm using pycassa.
The row keys in my column family are random strings, so I cannot use get to fetch all the rows one by one.
If get_range by token support is added to pycassa, you could fetch each token range (as reported by describe_ring) separately, discarding those that resulted in an UnavailableException. Barring that, using consistency level ONE is your best option, as Dean mentions.
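As a rough illustration of the consistency-level suggestion, here is a minimal pycassa sketch (keyspace, column family, and server names are hypothetical; pycassa re-exports the Thrift ConsistencyLevel and UnavailableException types):

import pycassa
from pycassa import ConsistencyLevel

pool = pycassa.ConnectionPool('my_keyspace',
                              server_list=['node1:9160', 'node2:9160'])
cf = pycassa.ColumnFamily(pool, 'MyColumnFamily',
                          read_consistency_level=ConsistencyLevel.ONE)

# CL.ONE needs only a single live replica per range; a range whose
# replicas are all down will still raise UnavailableException when the
# iteration reaches it
try:
    for key, columns in cf.get_range():
        print(key, columns)
except pycassa.UnavailableException:
    pass  # some ranges were unreachable; keep what was read so far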
There should be a call to get that takes a list of row keys so you don't need to fetch them one by one. Also, if you have an index, that can help. For instance, playORM has an index for each partition of a table (and you can have as many partitions as you want). With that, you can iterate over each index and call get, passing it a LIST of keys.
Also, make sure your read consistency level is set to ONE as well ;).
later,
Dean
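A small pycassa sketch of the batched-get idea Dean describes (key list and names hypothetical); pycassa's multiget fetches a list of row keys in one round trip:

import pycassa

pool = pycassa.ConnectionPool('my_keyspace')
cf = pycassa.ColumnFamily(pool, 'MyColumnFamily')

# fetch a batch of known row keys at once instead of calling get()
# once per key
known_keys = ['key1', 'key2', 'key3']
rows = cf.multiget(known_keys)
for key, columns in rows.items():
    print(key, columns)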
I have a table with a PRIMARY KEY of ( (A,B), C)
Partition key (A,B)
Clustering key C
My question is related to deleting from this table.
Is it more efficient to use the IN clause when deleting, or to issue multiple delete statements using the equality operator?
delete from table where A=xx and B IN ('a','b','c');
-OR-
delete from table where A=xx and B='a';
delete from table where A=xx and B='b';
delete from table where A=xx and B='c';
Is there any harm in using the IN operator as in the first delete statement?
There may be up to around 20 deletes in total (or 20 items in the IN clause).
Thanks in advance for all your help!
With a few small exceptions, it's almost always better to use the second option: multiple deletes issued asynchronously. The coordinator of the IN-clause query will be put under a lot of load, while the latter will distribute the load evenly. Also, with a TokenAware load balancer, the requests will go directly to the correct replicas and can complete pretty quickly. If you are doing hundreds or more deletes, you might want to use a Semaphore or something similar to limit the number of in-flight deletes, just to prevent overloading the cluster.
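A rough sketch of that asynchronous approach with the DataStax Python driver and an in-flight cap (keyspace and table names are hypothetical):

import threading
from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('my_keyspace')
delete_stmt = session.prepare("DELETE FROM my_table WHERE a = ? AND b = ?")

in_flight = threading.Semaphore(32)  # cap concurrent deletes

def release(_result_or_exc):
    in_flight.release()

for b in ('a', 'b', 'c'):
    in_flight.acquire()
    future = session.execute_async(delete_stmt, ('xx', b))
    future.add_callbacks(release, release)  # release on success or error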
It depends on the needs of your application. If the delete operations are expected to be fast, then you'll probably want to run each one explicitly (second option).
On the other hand, if the delete runs as part of a batch or cleanup job, and nobody really cares how long it takes, then you could probably get away with using IN. The trick there would be keeping it from timing out (and, as Chris indicated, putting undue load on the node). It might make sense to break down your groups of values for column B to keep those small. While 20 list items with IN isn't the most I've heard of someone trying, it's definitely more than I would ever use personally (I'd try to keep it smaller than 10).
Essentially, using the IN operator with a DELETE is going to be susceptible to performance issues just like it would be on a SELECT, as described in this answer (included here for reference):
Is the IN relation in Cassandra bad for queries?
It seems to me that using IF can make the statement fail if retried. Therefore, the statement is not idempotent. For instance, given the CQL below, if it fails because of a timeout or system problem and I retry it, then it may not work, because another client may have updated the version between retries.
UPDATE users
SET name = 'foo', version = 4
WHERE userid = 1
IF version = 3
Best practices for updates in Cassandra are to make updates idempotent, yet the IF operator is in direct opposition to this. Am I missing something?
If your application is idempotent, then generally you wouldn't need to use the expensive IF clause, since all your clients would be trying to set the same value.
For example, suppose your clients were aggregating some values and writing the result to a roll up table. Each client would calculate the same total and write the same value, so it wouldn't matter if multiple clients wrote to it, or what order they wrote to it, since it would be the same value.
If what you are actually looking for is mutual exclusion, such as keeping a bank balance, then the IF clause could be used. You might read a row to get the current balance, then subtract some money and update the balance only if the balance hadn't changed since you read it. If another client was trying to add a deposit at the same time, then it would fail and would have to try again.
But another way to do that without mutual exclusion is to write each withdrawal and deposit as a separate clustered transaction row, and then calculate the balance as an idempotent result of applying all the transaction rows.
You can use the IF clause for idempotent writes, but it seems pointless. The first client to do the write would succeed and Cassandra would return the value "applied=True". And the next client to try the same write would get back "applied=False, version=4", indicating that the row had already been updated to version 4 so nothing was changed.
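To make that concrete, here is a small sketch with the DataStax Python driver (keyspace name hypothetical); the driver surfaces the [applied] column as was_applied on the result:

from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('my_keyspace')

result = session.execute(
    "UPDATE users SET name = 'foo', version = 4 "
    "WHERE userid = 1 IF version = 3"
)
row = result.one()
if result.was_applied:
    print("applied: row moved to version 4")
else:
    # someone else won the race; the returned row carries the current state
    print("not applied, current version is", row.version)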
This question is more about linearizability (ordering) than idempotency, I think. This query uses Paxos to try to determine the state of the system before applying a change. If the state of the system is identical, then the query can be retried many times without a change in the results. This provides a weak form of ordering (and is expensive), unlike most Cassandra writes. Generally, you should only use CAS operations if you are attempting to record the state of a system (rather than a history or log).
Do not use many of these queries if you can help it; the guidelines suggest having only a small percentage of your queries rely on this behavior.
Suppose I store a list of events in a Cassandra row, implemented with composite columns:
{
event:123 => 'something happened'
event:234 => 'something else happened'
}
It's almost fine by me and, as far as I understand, that's a common pattern. Compared to having a single event column with a JSON-serialized list, it scales better, since it's easy to add a new item to the list without reading it first and then writing it back.
However, now I need to implement these two requirements:
I don't want to add a new event if the last added one is the same,
I want to keep only N last events.
Is there any standard way of doing that with the best possible performance? (Any storage schema changes are ok).
Checking whether or not things already exist, or checking how many that exist and removing extra items, are both read-modify-write operations, and they don't fit very well with the constraints of Cassandra.
One way of keeping only the N last events is to make sure they are ordered so that you can do a range query and read the N last (for example, prefixing the column key with a timestamp/TimeUUID). This wouldn't remove the outdated events; you'd need to do that as a separate process. But by doing it this way, the code that queries the data will only ever see the last N, which is the real requirement if I interpret things correctly. The garbage collection of old events is just an optimization to avoid keeping things that will never be needed again.
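For example, with pycassa (names hypothetical), assuming TimeUUID column names, the last N events can be read with a reversed slice:

import pycassa

pool = pycassa.ConnectionPool('my_keyspace')
events = pycassa.ColumnFamily(pool, 'Events')

# read the 50 most recent events; column_reversed walks the columns
# from the end, so TimeUUID-named columns come back newest first
last_n = events.get('some_row_key', column_count=50, column_reversed=True)
for column_name, value in last_n.items():
    print(column_name, value)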
If the requirement isn't a strict N events, but rather events that are not older than T, you can of course use the TTL feature, but I assume that's not an option for you.
The first requirement is trickier. You can do a read before every write and check if you already have the item, but that would be slow, and unless you do some kind of locking outside of Cassandra there is no guarantee that two writers won't both do a read and then both do a write, so that neither sees the other's write. Maybe that's not a problem for you, but there's no good way around it. Cassandra doesn't do CAS.
The way I've handled similar situations when using Cassandra is to keep a cache in the application nodes of what has been written, and check that before writing. You then need to make sure that each application node sees all events for the same row, and that events for the same row aren't distributed over multiple application nodes. One way of doing that is to have a message queue system in front of your application nodes, and divide the event stream over several queues by the same key as you use as row key in the database.
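A sketch of that application-side cache, assuming pycassa and TimeUUID column names (all names hypothetical); it only works if every event for a given row key is routed through the same process:

import uuid

# last event body seen per row key, kept in this process only
last_event = {}

def append_event(cf, row_key, event_body):
    if last_event.get(row_key) == event_body:
        return False  # same as the most recent event, skip the write
    cf.insert(row_key, {uuid.uuid1(): event_body})  # TimeUUID column name
    last_event[row_key] = event_body
    return True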
There are several roll-your-own strategies for secondary indexes that handle concurrent updates; this one, for example:
http://www.slideshare.net/edanuff/indexing-in-cassandra
which uses 3 ColumnFamilies.
My question is, how is the PlayORM #NoSqlIndexed annotation implemented; in terms of what extra ColumnFamilies are needed / created?
Additionally, are concurrent updates supported? I.e., would it be impossible for two competing updates to have the index updated from one and the table from the other?
You can do concurrent updates with no locking.
Slide 46's question, "Can't I get a false positive?", is the same case with PlayOrm.
The one caveat is you may need to resolve on read. An example: say you have Fred with an address of 123 in the database.
Now, two servers each make an update to Fred:
server 1: Fred's new address is 456 (results in deleting index 123.fred and adding 456.fred)
server 2: Fred's new address is 789 (results in deleting index 123.fred and adding 789.fred)
This means your index may end up containing both 456.fred and 789.fred. You can then resolve this on read, as the query WILL return Fred when you ask for people with address 456. There is another ticket out for us to resolve this on reads for you ;) and eliminate the stale entry.
We did ask about getting a change into Cassandra where we could possibly do "add column 456.fred IF column 123.fred exists, or fail", but I am not sure if they will ever implement something like that. That would propagate a failure back to the loser (i.e. the last writer gets an exception). It would be nice, but I am not sure they will do a feature like this.
BIG NOTE: Unlike CQL, the query is NOT sent to all nodes. It only puts load on the nodes that contain the index instead of all 100 computers, i.e. it can scale better this way.
MORE DETAIL: On slide 27 of the presentation you linked, it is ALMOST like that for our indexes. The format does not contain the 1, 2, 3, though. The index format is
Indexes=
{"User_Keys_By_Last_Name":{
{"adams","e5d…"}: null,
{"alden","e80…"}: null,
{"anderson","e5f…"}: null,
{"anderson","e71…"}: null,
{"doe","e78…"}: null,
{"franks","e66…"}: null,
…:…,
}
}
This way, we can avoid the read to find out whether we need to use a 1, 2, 3, 4, 5 for the second half of the name. Instead we use the FK, which we know is unique, and just have to do a write. Cassandra is all about resolving conflicts on read anyway, which is why the repair process exists. It is based on the fact that conflicts will happen a very low percentage of the time, so you just take the hit at that low percentage.
LASTLY, you can just use the command line tool to view the index! It streams results back in batches of about 200 columns, so you could have 1 million entries and the command line tool will happily keep printing them until you ctrl-c it.
later,
Dean
As of now, only 3 tables are created for all indexes in PlayOrm, i.e., all the indexes are stored in the StringIndice, IntegerIndice and DecimalIndice column families.
Apart from that, there is a pattern under development which will create a new table for the column if required. See the pattern details at https://github.com/deanhiller/playorm/issues/44.
Is there any way to get all the data from a column family or from a key space?
I can't think of a way of doing this without knowing every single key for every single entry made to the database.
My problem is that I'm trying to create a Twitter clone where each message has its own id, and store those in the same keyspace in the same column family.
But then how do I get them back? I'll have to keep track of every single ID, and that can't possibly work.
Any help/ideas would be appreciated.
You can retrieve all data from a column family using get_range_slices, setting the range start and end to the same value to indicate that you want all data.
See the Cassandra FAQ
See http://aquiles.codeplex.com/discussions/278245 for a Thrift example.
I haven't yet found a handy Hector example, but I think it uses RangeSlicesQuery...
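For example, a minimal pycassa version (keyspace and column family names hypothetical); pycassa's get_range effectively wraps get_range_slices and pages through the whole column family:

import pycassa

pool = pycassa.ConnectionPool('my_keyspace')
messages = pycassa.ColumnFamily(pool, 'Messages')

# iterates over every row; with RandomPartitioner the rows arrive in
# token order, not key order
for key, columns in messages.get_range():
    print(key, columns)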
However, it's not clear why you want to do this: for this sort of application you would normally look up messages by ID and use an index to determine which IDs you need, such as storing a row for each user that lists all their messages. In the messages column family you might have something like:
MsgID0001 -> time text
1234567 Hello world
MsgID0300 -> time text
3456789 LOL ROTFL
And then in a "user2msg" column family, store the message IDs, perhaps using timestamp column names so the messages are stored sorted in time order:
UserID001 -> 1234567 3456789
MsgID0001 MsgID0300
This can then be used to look up a particular user's messages, possibly filtered by time.
You'd then also need further column families to store user profiles etc.
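Putting the two column families together, a lookup of a user's recent messages might look like this in pycassa (all names hypothetical):

import pycassa

pool = pycassa.ConnectionPool('my_keyspace')
user2msg = pycassa.ColumnFamily(pool, 'User2Msg')
messages = pycassa.ColumnFamily(pool, 'Messages')

# newest 20 message IDs for the user (timestamp column names sort in
# time order), then fetch those messages in a single multiget
recent = user2msg.get('UserID001', column_count=20, column_reversed=True)
rows = messages.multiget(list(recent.values()))
for msg_id, columns in rows.items():
    print(msg_id, columns)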
Perhaps you need to add more detail to your question?
Update in response to comment: Yes, if you have one message per row, you have to retrieve each message individually. But what is your alternative? Retrieving all messages is only useful for batch processing of messages, not for (say) showing a user their recent messages. Bear in mind that retrieving all messages could take a very long time. You haven't explained why you want to retrieve all messages or what you are going to do with them. How many messages are you expecting to have?
One possibility is to denormalise, i.e. in a row for each user, store the entire messages, so you don't have to do a separate lookup step for each message. This doubles the amount of storage required, however.
The answer I was looking for is CQL, Cassandra's query language. It works similarly to SQL, which is what I need for the function I'm after.
This link has some excellent tutorials.