Are VoltDB default procedures partitioned?

VoltDB creates a set of default procedures on tables with primary keys (such as HELLO_WORLD.insert, HELLO_WORLD.upsert, HELLO_WORLD.delete, etc.).
Are these procedures partitioned if my HELLO_WORLD table is partitioned?
I couldn't find any documentation on default procedure partitioning.

Yes, the default procedures generated when you create a table in VoltDB are partitioned, if the table is partitioned.
All tables will get an insert procedure. If the table has a primary key, then upsert, update, and delete procedures are also generated.

Related

Cannot colocate hash partitioned table in YugabyteDB

[Question posted by a user on YugabyteDB Community Slack]
I'm trying to run through the demo instructions for row-level geo-partitioning. I've created the parent table and tablespaces, but am getting the following error when attempting to create a partitioned table:
CREATE TABLE transactions_us
PARTITION OF transactions
(user_id, account_id, geo_partition, account_type,
amount, txn_type, created_at,
PRIMARY KEY (user_id HASH, account_id, geo_partition))
FOR VALUES IN ('US') TABLESPACE us_tablespace;
ERROR: Invalid argument: Invalid table definition: Error creating table transactions_us on the master: Cannot colocate hash partitioned table
Are these demo instructions still valid?
Your database is colocated, so tables you create inside the database are colocated by default. For now, we disallow hash partitioned colocated tables, so you got an error.
You have two options:
make the tables range partitioned (specify ASC/DESC instead of HASH)
don't use a colocated database
To try row-level geo-partitioning, do not use colocated databases. If you want to use colocation together with row-level geo-partitioning, you would have to switch to tablegroups, for which there is a work-in-progress feature that you can track here: https://github.com/yugabyte/yugabyte-db/issues/5823

Azure Cosmos DB asking for partition key for stored procedure

I am using a GUID Id as my partition key, and I am facing a problem when I try to run a stored procedure. To run a stored procedure I need to provide a partition key, and I am not sure what value I should provide in this case. Please assist.
If the collection the stored procedure is registered against is a
single-partition collection, then the transaction is scoped to all the
documents within the collection. If the collection is partitioned,
then stored procedures are executed in the transaction scope of a
single partition key. Each stored procedure execution must then
include a partition key value corresponding to the scope the
transaction must run under.
The passage above is quoted from the documentation.
As @Rafat Sarosh said, a GUID Id is not an appropriate partition key. Based on your situation, city may be more appropriate. You may need to adjust your database partitioning scheme, because the partition key cannot be deleted or modified after you have defined it.
I suggest exporting your data to a JSON file and then importing it into a new collection partitioned by city, using the Azure Cosmos DB Data Migration Tool.
Hope it helps you.
To summarize:
Issue:
Unable to provide a specific partition key value when executing SQL to query documents.
Solution:
1. Set EnableCrossPartitionQuery to true when executing the SQL query (this can be a performance bottleneck).
2. Consider using a frequently queried field as the partition key.
For example, if your partition key is /id
and your Cosmos DB document is
{
"id": "abcde"
}
then when the stored procedure runs, you need to pass the value abcde as the partition key.
So if you want your stored procedure to run across partitions, it can't.
Answer from the Cosmos DB team:
https://feedback.azure.com/forums/263030-azure-cosmos-db/suggestions/33550159-support-stored-procedure-execution-over-all-partit
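To make the rule above concrete, here is a minimal Python sketch of which value has to be supplied with a stored procedure call. It does not use the Cosmos DB SDK. `partition_key_value` is a hypothetical helper, and the sample document fields are assumptions for illustration. It only reads the field named by a partition key path such as /id out of a document.

```python
def partition_key_value(document, partition_key_path):
    """Return the value stored at a partition key path such as "/id".

    Hypothetical helper for illustration only; it is not part of any
    Cosmos DB SDK. Nested paths like "/address/city" are followed
    one segment at a time.
    """
    value = document
    for part in partition_key_path.strip("/").split("/"):
        value = value[part]
    return value


# Sample document (assumed shape, mirroring the example above).
doc = {"id": "abcde", "city": "Seattle"}

print(partition_key_value(doc, "/id"))    # abcde
print(partition_key_value(doc, "/city"))  # Seattle
```

With /id as the partition key, every document has its own partition key value, so a stored procedure can only ever see one document per call. With a field such as /city, one call covers all documents that share that city.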

How do I get partition id from java stored procedure in voltdb

I have a single-partitioned Java stored procedure in VoltDB. I need to get the partition id of the current partition in which the procedure is running. How do I get that within the procedure?
Rather than maintaining your own transaction counts in a table, you could call the @Statistics system procedure:
SQL> exec @Statistics PROCEDUREPROFILE 0;
This provides statistics for each procedure for each partition.
SQL> exec @Statistics TABLE 0;
This provides the count of how many records per table per partition.

Cassandra: selecting first entry for each value of an indexed column

I have a table of events and would like to extract the first timestamp (column unixtime) for each user.
Is there a way to do this with a single Cassandra query?
The schema is the following:
CREATE TABLE events (
id VARCHAR,
unixtime bigint,
u bigint,
type VARCHAR,
payload map<text, text>,
PRIMARY KEY(id)
);
CREATE INDEX events_u
ON events (u);
CREATE INDEX events_unixtime
ON events (unixtime);
CREATE INDEX events_type
ON events (type);
According to your schema, each user will have a single timestamp. If you want one event per entry, consider:
PRIMARY KEY (id, unixtime).
Assuming that is your schema, the entries for a user will be stored in ascending unixtime order. Be careful, though: if it's an unbounded event stream and users have lots of events, the partition for the id will grow and grow. It's recommended to keep partition sizes to tens or hundreds of megabytes. If you anticipate larger partitions, you'll need to start some form of bucketing.
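One common form of bucketing is to add a time bucket to the partition key, so that each partition holds only a bounded time window. The sketch below is a minimal illustration, assuming unixtime is in seconds and that a one-day bucket is appropriate. The names `bucket_for` and `partition_key` are hypothetical, and the matching schema would be something like PRIMARY KEY ((id, bucket), unixtime).

```python
SECONDS_PER_DAY = 86_400  # assumed bucket width: one day


def bucket_for(unixtime, bucket_seconds=SECONDS_PER_DAY):
    """Map a Unix timestamp in seconds to an integer bucket number."""
    return unixtime // bucket_seconds


def partition_key(event_id, unixtime, bucket_seconds=SECONDS_PER_DAY):
    """Build the (id, bucket) pair that would serve as the partition key."""
    return (event_id, bucket_for(unixtime, bucket_seconds))


# Two events one second apart can land in different daily buckets.
print(partition_key("a", 86_399))  # ('a', 0)
print(partition_key("a", 86_400))  # ('a', 1)
```

The trade-off is that a query spanning several days must read several partitions, one per bucket, so the bucket width should match the typical query window.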
Now, on to your query. In a word, no. If you don't hit a partition (by specifying the partition key), your query becomes a cluster-wide operation. With little data it will work, but with lots of data you'll get timeouts. If you do have the data in its current form, then I recommend you use the Cassandra Spark connector and Apache Spark to do your query. An added benefit of the Spark connector is that if you have Cassandra nodes as Spark worker nodes, then due to locality you can efficiently hit a secondary index without specifying the partition key (which would normally cause a cluster-wide query with timeout issues, etc.). You could even use Spark to get the required data and store it in another Cassandra table for fast querying.
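Whatever engine runs it, the computation itself is a minimum-by-key reduction over the events. The sketch below shows that logic in plain Python, not Spark, on rows that have already been read into memory. The function name `first_timestamp_per_user` and the sample rows are assumptions for illustration; the column names u and unixtime come from the schema in the question.

```python
def first_timestamp_per_user(events):
    """Return {user: earliest unixtime} for an iterable of event rows.

    Each row is a dict with at least the keys "u" and "unixtime".
    """
    first = {}
    for event in events:
        user, ts = event["u"], event["unixtime"]
        # Keep the smallest timestamp seen so far for this user.
        if user not in first or ts < first[user]:
            first[user] = ts
    return first


# Sample rows (made up for illustration).
rows = [
    {"id": "e1", "u": 1, "unixtime": 300},
    {"id": "e2", "u": 2, "unixtime": 150},
    {"id": "e3", "u": 1, "unixtime": 100},
]
print(first_timestamp_per_user(rows))  # {1: 100, 2: 150}
```

In a distributed job the same reduction would run per worker and then be merged, which is why it scales where a single cluster-wide CQL query does not.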

Hector support for CQL3 specific features (Partition & Clustering keys) and Compact Storage option

I'm trying to use a specific feature of Apache Cassandra CQL3: partition and clustering keys for tables that are created with the compact storage option.
For example:
CREATE TABLE EMPLOYEE(id uuid, name text, field text, value text, primary key(id, name , field )) with compact storage;
I've created the table via CQL3 and I'm able to insert rows successfully using the Hector API.
But I couldn't find the right set of options in the Hector API to create the table itself as I require.
To elaborate a little more:
In ColumnFamilyDefinition.java I couldn't see an option for setting the storage option (compact storage), and in ColumnDefinition.java I couldn't find an option to say that a column is part of the partition and clustering keys.
Could you please tell me whether I can use Hector to create the table, and if so, which options I need to provide?
If you are not tied to Hector, you could look into the DataStax Java Driver which was created to use CQL3 and Cassandra's binary protocol.
