I tried creating a hash index on my table in MemSQL using
CREATE INDEX hashindex USING HASH ON table (column);
But I get the following error:
ERROR 1710 (HY000): MemSQL does not support non-unique hash indexes.
Am I missing something?
To make that statement work, you need to add the UNIQUE keyword between CREATE and INDEX. For example:
CREATE UNIQUE INDEX hashindex USING HASH ON table (column);
If you are intentionally trying to have a non-unique hash index, however, it is not supported (as indicated in the error). If you are trying to have a unique index, then great! Adding the keyword will work for you. Just note that adding a unique index cannot be performed as an online operation.
Related
I'm trying to figure out if there is a way to ignore duplicates when doing a bulkInsert.
The problem is that when you use UUIDs for your primary key, you must include the id in your insert statement. Because of this, I generate a UUID before I insert the data, and technically all the other fields could be duplicates of another row except for the UUID.
I want to know if there is a way to do an insert in Sequelize in which I can check whether the row to be inserted is a duplicate, based on fields that I choose.
UPDATE: I am using the postgres dialect and I have just discovered that it has an ON CONFLICT (KEY,KEY) DO NOTHING clause.
As for primary keys, the best way to generate them is on the database side, by setting the primary key's default value to uuid_generate_v4() (in the case of PostgreSQL).
This (among other benefits) lets you omit the primary key column from the INSERT query.
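To illustrate the ON CONFLICT ... DO NOTHING clause mentioned in the question's update, here is a minimal sketch in Python. It uses SQLite's in-memory engine as a stand-in for PostgreSQL (SQLite 3.24+ accepts the same clause), and the users table with its email and name columns is purely hypothetical. Note that the columns named in ON CONFLICT need a UNIQUE constraint covering them, otherwise the database has nothing to detect a conflict on.

```python
import sqlite3
import uuid


def make_db():
    """Create an in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users ("
        " id TEXT PRIMARY KEY,"       # UUID generated by the client
        " email TEXT NOT NULL,"
        " name TEXT NOT NULL,"
        " UNIQUE (email, name))"      # the fields that define a duplicate
    )
    return conn


def bulk_insert_ignoring_duplicates(conn, rows):
    """Insert (email, name) pairs, skipping pairs that already exist.

    Returns the total number of rows in the table afterwards.
    """
    conn.executemany(
        "INSERT INTO users (id, email, name) VALUES (?, ?, ?) "
        "ON CONFLICT (email, name) DO NOTHING",
        [(str(uuid.uuid4()), email, name) for email, name in rows],
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

Every row gets a fresh UUID, so the primary key never conflicts; only the UNIQUE (email, name) constraint decides which rows are skipped.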
I want to use the IN clause on a non-primary-key column in Cassandra. Is it possible? If not, is there an alternative or suggestion?
Three possible solutions:
Create a secondary index. This is not recommended due to performance problems.
See if you can make that column part of the primary key in the existing table.
Create another denormalised table that is optimised for your query, i.e. model the data by query pattern.
Update:
Also, even after you move that column into the primary key, operations with an IN clause can be optimised further. I found the question "cassandra lookup by list of primary keys in java" very useful.
I find this a bit confusing. I am using a MemSQL columnstore table. I am trying to understand whether there is a way to prevent duplicates on a specific key (e.g. eventId). I found some documentation on unenforced unique constraints, but I didn't really understand their purpose.
The point of unenforced unique keys is as a hint:
An unenforced unique constraint is informational: the query planner may use the unenforced unique constraint as a hint to choose better query plans.
from https://docs.memsql.com/v6.8/concepts/unenforced-unique-constraints/.
Unfortunately MemSQL does not support (enforced) unique constraints on columnstore tables.
MemSQL supports enforced unique constraints on columnstore tables as of version 7, but only on a single column:
https://docs.memsql.com/v7.1/guides/use-memsql/physical-schema-design/creating-a-columnstore-table/creating-a-columnstore-table/
Your columnstore table definition can contain metadata-only unenforced unique keys, single-column hash keys (which may be UNIQUE), and a FULLTEXT key. You cannot define more than one unique key.
One hack to get a UNIQUE constraint across multiple columns is to add a computed column that concatenates those columns and then apply UNIQUE to it, which indirectly enforces uniqueness across all of them.
Example of a columnstore table with a single-column UNIQUE hash key:
CREATE TABLE articles (
    id INT UNSIGNED,
    year INT UNSIGNED,
    title VARCHAR(200),
    body TEXT,
    SHARD KEY (title),
    KEY (id) USING CLUSTERED COLUMNSTORE,
    KEY (id) USING HASH,
    UNIQUE KEY (title) USING HASH,
    KEY (year) USING HASH);
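The concatenated-column hack has one pitfall worth spelling out: if the columns are simply appended, different rows can produce the same combined value. The sketch below shows the idea in Python rather than MemSQL DDL; composite_key and the choice of separator are hypothetical, and in the database the same expression would live in the computed column.

```python
def composite_key(*parts, sep="\x1f"):
    """Join column values into a single key for a uniqueness check.

    The separator (ASCII unit separator, an assumption) must be a
    character that cannot occur in the column values themselves.
    """
    return sep.join(str(p) for p in parts)


def find_duplicates(rows):
    """Return the rows whose combined key was already seen earlier."""
    seen = set()
    duplicates = []
    for row in rows:
        key = composite_key(*row)
        if key in seen:
            duplicates.append(row)
        seen.add(key)
    return duplicates
```

Without a separator, ("ab", "c") and ("a", "bc") both collapse to "abc" and would be rejected as duplicates even though the rows differ.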
I have a DynamoDB table where the sort key has a numeric value.
I need to retrieve the first item whose sort key is lower than a value that I have.
I have gone through the docs at http://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_UpdateItem.html#API_UpdateItem_Examples but I can see no way to:
- sort the output
- limit the result to 1 entry
Is there any way to actually achieve what I want with DynamoDB?
EDIT:
According to this: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Query.html
The results are sorted by the sort key, and when it is numeric they come back in numeric order (ascending by default, descending if ScanIndexForward is set to false). Which is great, but I still can't find any way to get only a single result [I don't want to "pay" for a full table scan in some cases].
Are you searching for the next item which has a lower sort key within the same Partition Key?
In that case you can use Query, as you've found: sort in descending order (set ScanIndexForward to false) and set Limit to 1. This will not scan the entire table.
Alternatively, if you wish to search across partitions, unfortunately a table Scan is the only way to do this.
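As a rough sketch of the Query approach described above, the function below builds the request parameters for the low-level DynamoDB Query API. The function name and its arguments are hypothetical, and it assumes a string partition key and a numeric sort key; adjust the type tags ("S", "N") to match your table.

```python
def build_previous_item_query(table_name, pk_name, pk_value, sk_name, sk_value):
    """Build Query parameters for the item just below sk_value.

    Assumes (hypothetically) a string partition key and a numeric sort key.
    """
    return {
        "TableName": table_name,
        # Same partition, sort key strictly lower than the given value.
        "KeyConditionExpression": "#pk = :pk AND #sk < :sk",
        "ExpressionAttributeNames": {"#pk": pk_name, "#sk": sk_name},
        "ExpressionAttributeValues": {
            ":pk": {"S": pk_value},
            ":sk": {"N": str(sk_value)},  # numbers are sent as strings
        },
        "ScanIndexForward": False,  # descending sort key order
        "Limit": 1,                 # stop after the first match
    }


# With boto3 installed and credentials configured, the call would be roughly:
#   import boto3
#   response = boto3.client("dynamodb").query(**params)
#   items = response["Items"]  # empty list if nothing is lower
```

Because the key condition already restricts the read to one partition and to values below the threshold, descending order with a limit of 1 returns the closest lower item.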
I want to fetch all rows having a common prefix using the Hector API. I played with RangeSuperSlicesQuery a bit but didn't find a way to get it working properly. Do key range parameters work with wildcards, etc.?
Update: I used ByteOrderedPartitioner instead of RandomPartitioner and it works fine with that. Is this the expected behavior?
Yes, that's the expected behavior. In RandomPartitioner, rows are stored in the order of the MD5 hash of their keys, so to get a meaningful range of keys, you need to use an order preserving partitioner like ByteOrderedPartitioner.
However, there are downsides to using ByteOrderedPartitioner or OrderPreservingPartitioner that you can usually avoid with a slightly different data model and RandomPartitioner.
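To see why a key range is only meaningful with an order-preserving partitioner, here is a small illustration in Python. Sorting by the raw MD5 digest is a stand-in for RandomPartitioner's token order, not its exact computation, and the helper names are made up for this sketch.

```python
import hashlib


def md5_order(keys):
    """Order keys by MD5 digest (approximates RandomPartitioner placement)."""
    return sorted(keys, key=lambda k: hashlib.md5(k.encode("utf-8")).digest())


def byte_order(keys):
    """Order keys by their raw bytes, as ByteOrderedPartitioner does."""
    return sorted(keys, key=lambda k: k.encode("utf-8"))


def prefix_runs(ordered_keys, prefix):
    """Count the separate contiguous runs of keys that share the prefix."""
    runs = 0
    in_run = False
    for key in ordered_keys:
        matches = key.startswith(prefix)
        if matches and not in_run:
            runs += 1
        in_run = matches
    return runs
```

In byte order, all keys with a given prefix always form a single run, so one start/end key pair covers them. In MD5 order they are typically scattered among unrelated keys, so no single range selects exactly the prefixed rows.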
To elaborate on the above answer, you should consider using column names as your "common prefix" instead of the key. Then you can either use a column slice to get all column names in a certain range, or you could use a secondary index then do an indexed slice for all keys with that column name.
Column slice example:
Key (without prefix)
    <prefix1> : <data>
    <prefix2> : <data>
    ...
Secondary index example:
Key (with or without prefix)
    "prefix" : <the_prefix> <-- this column is indexed
    otherCol1 : <data>
    ...
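The column slice layout works because columns within a row are kept sorted by name, so a prefix maps to one contiguous range. Below is a minimal in-memory sketch of that idea in Python; it is not Hector code, and the function names and the "\uffff" upper bound are assumptions made for illustration.

```python
import bisect


def column_slice(sorted_columns, start, finish):
    """Return the (name, value) pairs with start <= name <= finish.

    sorted_columns must already be sorted by column name, mirroring
    how columns are ordered inside a single row.
    """
    names = [name for name, _ in sorted_columns]
    lo = bisect.bisect_left(names, start)
    hi = bisect.bisect_right(names, finish)
    return sorted_columns[lo:hi]


def prefix_slice(sorted_columns, prefix):
    """Slice every column whose name begins with prefix."""
    # "\uffff" is an assumed upper bound that sorts after ordinary characters.
    return column_slice(sorted_columns, prefix, prefix + "\uffff")
```

For example, with columns named "2012-01:a", "2012-01:b" and "2012-02:a", calling prefix_slice with "2012-01" returns only the first two.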