I am using MemSQL.
I want to reset the auto-increment id to 1 after I issue a truncate table command. I issued the following commands:
truncate table BOOKS;
AGGREGATOR SYNC AUTO_INCREMENT ON db.BOOKS;
But when I insert rows after that, the id continues from where the earlier insertions left off.
How can I reset id to 1 in MemSQL?
MemSQL doesn't yet support ALTER TABLE ... AUTO_INCREMENT = ... to reset the auto_increment value. Right now, the only way to reset it is to drop and recreate the table.
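For example, a minimal sketch of the drop-and-recreate approach (the column definitions here are assumptions, since the question doesn't show the schema):
DROP TABLE BOOKS;
CREATE TABLE BOOKS (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    title VARCHAR(255)
);
-- Inserts now start the auto_increment sequence from 1 again:
INSERT INTO BOOKS (title) VALUES ('First book');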
Keep in mind that auto_increment columns behave differently in MemSQL than in other databases. For example, the auto_increment values of rows inserted one after another on different aggregators won't be consecutive.
Unfortunately, we don't support that as of 4.1.
You can always drop and recreate the table though :P.
Related
I have the following table in my Cassandra db, and I want to find the delta difference with a Cassandra query. For example, if I run any insert, update, or delete operation against the table, I should be able to show which row or rows were impacted as my final result.
Let's say on the first run I perform 10 row insertions; if I then take the delta difference, the output should show only that those 10 rows were inserted. Likewise, if we modify or delete any number of rows, those changes should be captured.
The next time we run the query it should ideally give 0, since we have not inserted, modified, or deleted any rows.
Here is the following table
CREATE TABLE datainv (
    datainv_account_id uuid,
    datainv_run_id uuid,
    id uuid,
    datainv_summary text,
    json text,
    number text,
    PRIMARY KEY (datainv_account_id, datainv_run_id));
I have searched a lot on the internet, but most of the solutions are based on timeuuid. In this case I have only uuid columns, so I haven't found any solution showing that the same use case can be achieved with uuid.
It's not so easy to generate a diff between two table states in Cassandra, because you can't easily detect whether new partitions have been inserted. You could implement something based on a timeuuid or a timestamp as a clustering column; in that case you'd be able to filter out the data since the latest change, because you'd have an ordering of values that you don't get with uuid, which is completely random. But it would still require a full scan of the whole table, and it wouldn't detect deletions.
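For illustration, a hedged sketch of that clustering-column approach; the datainv_by_time table, the changed_at column, and the trimmed column list are all hypothetical:
CREATE TABLE datainv_by_time (
    datainv_account_id uuid,
    changed_at timestamp,
    datainv_run_id uuid,
    datainv_summary text,
    PRIMARY KEY (datainv_account_id, changed_at, datainv_run_id)
) WITH CLUSTERING ORDER BY (changed_at DESC, datainv_run_id ASC);
-- Fetch everything in one partition changed since the last check:
SELECT * FROM datainv_by_time
WHERE datainv_account_id = 123e4567-e89b-12d3-a456-426614174000
  AND changed_at > '2024-01-01 00:00:00';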
Theoretically you could implement this with Spark as follows:
read all primary key values and store them in some other table or on disk;
next time, read all primary key values and compute the difference between the original set of primary keys and the new set, for example with a full outer join, treating a null on the left as an addition and a null on the right as a deletion (see the sketch after this list);
store the new set of primary keys in a separate table or on disk, truncating the previous version first.
but it will consume quite a lot of resources.
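A rough sketch of the diff step in Spark SQL, assuming the previous and current snapshots of primary keys have been registered as the hypothetical views prev_snapshot and curr_snapshot:
SELECT
    COALESCE(curr.datainv_account_id, prev.datainv_account_id) AS datainv_account_id,
    COALESCE(curr.datainv_run_id, prev.datainv_run_id) AS datainv_run_id,
    -- A null on one side of the full outer join marks the change type:
    CASE
        WHEN prev.datainv_account_id IS NULL THEN 'inserted'
        WHEN curr.datainv_account_id IS NULL THEN 'deleted'
    END AS change_type
FROM curr_snapshot curr
FULL OUTER JOIN prev_snapshot prev
    ON curr.datainv_account_id = prev.datainv_account_id
   AND curr.datainv_run_id = prev.datainv_run_id
WHERE prev.datainv_account_id IS NULL
   OR curr.datainv_account_id IS NULL;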
I have a table in Cassandra, say employee(id, email, role, name, password), with only id as my primary key.
I want to ...
1. Add another column (manager_id) with a default value in it
I know that I can add a column to the table, but there is no way I can provide a default value for that column through CQL. I also cannot update the value for manager_id later, since I would need to know the id (the partition key, whose values are randomly generated unique values that I don't know) to update each row. Is there any way I can achieve this?
2. Rename this table to all_employee.
I also know that it's not allowed to rename a table in Cassandra. So I am trying to copy the data of the table (employee) to CSV, copy from the CSV into the new table (all_employee), and delete the old table (employee). I am doing this through an automated script with CQL queries in it, and the script works fine, but it will fail if it gets executed again (which I cannot restrict), since the table employee will no longer be there once it's deleted. Essentially I am looking for an "IF EXISTS" clause in the COPY query, which is not supported in CQL. Is there any other way I can achieve this outcome?
Please note that the amount of data in the table is very small, so performance is not an issue.
For #1
I don't think Cassandra supports default column values. You need to handle that from your application: write the default value every time you insert a row.
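A minimal sketch, assuming both id and manager_id are uuids and using an arbitrary placeholder as the default:
ALTER TABLE employee ADD manager_id uuid;
-- The application supplies the default on every insert, e.g.:
INSERT INTO employee (id, email, role, name, password, manager_id)
VALUES (uuid(), 'a@example.com', 'dev', 'Alice', 'secret',
        00000000-0000-0000-0000-000000000000);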
For #2
You can check whether the table exists before trying to copy from it:
SELECT table_name FROM system_schema.tables WHERE keyspace_name = 'your_keyspace_name' AND table_name = 'your_table_name';
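Building on that, a sketch of an idempotent version of the script: the DDL statements use IF EXISTS / IF NOT EXISTS guards, and the script runs the COPY steps only when the check above returned a row (the column types and CSV path are assumptions):
CREATE TABLE IF NOT EXISTS all_employee (
    id uuid PRIMARY KEY,
    email text,
    role text,
    name text,
    password text
);
COPY employee TO '/tmp/employee.csv';
COPY all_employee FROM '/tmp/employee.csv';
DROP TABLE IF EXISTS employee;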
CREATE TABLE user_logins (
    user_id bigint,
    login_time timestamp,
    PRIMARY KEY (user_id, login_time)
) WITH CLUSTERING ORDER BY (login_time DESC);
Is there a way in which I can maintain the last 3 versions of a key in Cassandra? If I add more rows with this primary key, it should automatically delete the oldest row to ensure that only 3 rows are maintained at a time. For example, for every user, only maintain their last 3 login timestamps.
One way is to use a collection like a list to store the timestamps, and then do a read-before-write to fetch the current value, modify it, and persist it. Is there any other way I can have TTL-like functionality, without the notion of time, that maintains the last N versions?
There's no such functionality in Cassandra, you will need to work with workarounds.
One way, as you said, is to work with lists/maps/sets/UDTs and update them (read-before-write) all the time.
Another way is to have N tables, user_logins_N for example, and on the client/server side you will round robin on these tables. Each table will have only one version of the key with one login timestamp, and you will always save only N versions of logins.
A third way is to run some background housekeeping process that goes through the keys and deletes old or irrelevant logins.
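As a sketch of the first workaround (the user_logins_list table and the concrete values are hypothetical; trimming to the 3 newest entries happens in the application):
CREATE TABLE user_logins_list (
    user_id bigint PRIMARY KEY,
    login_times list<timestamp>
);
-- On each login, read-before-write from the application:
-- 1. Read the current list:
SELECT login_times FROM user_logins_list WHERE user_id = 42;
-- 2. Append the new timestamp in application code and keep the 3 newest.
-- 3. Write the trimmed list back:
UPDATE user_logins_list
SET login_times = ['2024-01-03 09:00:00', '2024-01-02 09:00:00', '2024-01-01 09:00:00']
WHERE user_id = 42;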
I have a table in Cassandra as mentioned below:
CREATE TABLE remaining (owner varchar, buddy varchar, remain counter, PRIMARY KEY (owner, buddy));
Generally I do some increment/decrement operations on the remain field, using CQL like below:
update remaining set remain=remain + 1 where owner='userA' and buddy='userB';
update remaining set remain=remain + 1 where owner='userA' and buddy='userC';
....
Now I need to find all buddies for userA whose remain field is greater than 0. When I use:
select buddy,remain from remaining where owner='userA' and remain > 0;
it gives me an error:
No indexed columns present in by-columns clause with Equal operator
How can I do this the Cassandra way?
The short answer to this is that you cannot do queries with conditionals on counter columns in Cassandra.
The reason behind this is that all Cassandra queries need to be modeled around the primary key of the table in question. Counter columns are not allowed as part of the primary key of a table (their changing values would cause constant reorganization of the data on disk). Counter columns are better suited to tracking the state of a known piece of data, for example the number of times a photo has been up-voted. That count can be recalled quickly as long as we know which photo we are interested in. To actually sort photos by number of votes, you would need to perform an analytics-style query using Spark or Hadoop.
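In plain CQL, the closest workaround is to restrict the query to the partition key and filter on the counter in the application:
-- Valid, since it is restricted to a single partition;
-- filter remain > 0 on the client side:
SELECT buddy, remain FROM remaining WHERE owner = 'userA';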
What is the difference between UPDATE and INSERT when executing CQL against Cassandra?
It looks like there used to be no difference, but now the documentation says that INSERT does not support counters while UPDATE does.
Is there a "preferred" method to use? Or are there cases where one should be used over the other?
Thanks so much!
There is a subtle difference. Records written via INSERT remain if you later set all non-key fields to null, while records written via UPDATE go away if you set all non-key fields to null.
Try this:
CREATE TABLE T (
    pk int,
    f1 int,
    PRIMARY KEY (pk)
);
INSERT INTO T (pk, f1) VALUES (1, 1);
UPDATE T SET f1=2 where pk=2;
SELECT * FROM T;
Returns:
pk | f1
----+----
1 | 1
2 | 2
Now, update each row setting f1 to null.
UPDATE T SET f1 = null WHERE pk = 1;
UPDATE T SET f1 = null WHERE pk = 2;
SELECT * FROM T;
Note that row 1 remains, while row 2 is removed.
pk | f1
----+------
1 | null
If you look at these using Cassandra-cli, you will see a difference in how the rows are added.
I'd sure like to know whether this is by design or a bug and see this behavior documented.
Counter columns in Cassandra can't be set to an arbitrary value: they can only be incremented or decremented by an arbitrary value.
For this reason, INSERT doesn't support counter columns, because you cannot "insert" a value into a counter column. You can only UPDATE them (increment or decrement) by some value. Here's how you would update a counter column:
UPDATE ... SET name1 = name1 + <value>
You asked:
Is there a "preferred" method to use? Or are there cases where one should be used over the other?
Yes. If you are inserting values into the database, you can use INSERT. If the column doesn't exist, it will be created for you; otherwise, INSERT's effect is similar to UPDATE's. INSERT is useful when you don't have a pre-designed schema (a dynamic column family, i.e. insert anything, anytime). If you are designing the schema beforehand (a static column family, similar to an RDBMS) and know each column, then you can use UPDATE.
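For example, on the table T from the earlier answer, these two statements have the same upsert effect (either one creates row 3 if it doesn't exist):
INSERT INTO T (pk, f1) VALUES (3, 3);
UPDATE T SET f1 = 3 WHERE pk = 3;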
Another subtle difference (I'm starting to believe CQL is a terrible interface to Cassandra, full of subtleties and caveats due to using SQL-like syntax with slightly different semantics) is with setting TTLs on existing data. With UPDATE you cannot update the TTL of the keys, even if the new values are equal to the old values. The solution is to INSERT the new row instead, with the new TTL already set.
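For instance, a sketch of refreshing a row's TTL by re-writing it with INSERT (table T from the example above; the TTL of 86400 seconds is arbitrary):
INSERT INTO T (pk, f1) VALUES (1, 1) USING TTL 86400;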
Regarding the subtle difference highlighted by billbaird (I'm unable to comment on that post directly), where a row created by an UPDATE operation is deleted if all non-key fields are set to null:
That is expected behavior, not a bug, based on the bug report at https://issues.apache.org/jira/browse/CASSANDRA-11805 (which was closed as "Not A Problem").
I ran into this myself when using Spring Data for the first time. I was using the save(T entity) method of a repository, but no row was being created. It turned out Spring Data was using an UPDATE because it determined that the object wasn't 'new' (I'm not sure the 'isNew' test makes sense here), and I happened to be testing with entities that only had the key fields set.
For this Spring Data case, the Cassandra-specific repository interfaces do provide an insert method that appears to consistently use an INSERT, if that behavior is desired instead (though Spring's documentation doesn't document these details sufficiently either).