How do I get the partition id from a Java stored procedure in VoltDB?

I have a single-partitioned Java stored procedure in VoltDB. I need to get the partition ID of the partition the procedure is currently running in. How do I get that within the procedure?

Rather than maintaining your own transaction counts in a table, you could call the @Statistics system procedure:
SQL> exec @Statistics PROCEDUREPROFILE 0;
This provides statistics for each procedure, per partition.
SQL> exec @Statistics TABLE 0;
This provides the count of records per table, per partition.
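As a sketch, here is one way you might group the per-partition rows that @Statistics TABLE returns once they are in your client code. The row shape below (PARTITION_ID, TABLE_NAME, TUPLE_COUNT fields) is an assumption based on typical @Statistics output; check your VoltDB version's documentation for the exact column names.

```javascript
// Sum tuple counts per partition from rows shaped like the output of
// "exec @Statistics TABLE 0;" (field names are assumed for illustration).
function countsPerPartition(rows) {
  const totals = {};
  for (const row of rows) {
    const key = row.PARTITION_ID;
    totals[key] = (totals[key] || 0) + row.TUPLE_COUNT;
  }
  return totals;
}
```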

Related

Azure NodeJS - query all documents with stored procedure

I am trying to call a stored procedure from an Azure Function; the procedure runs the query 'SELECT * FROM Events c WHERE c.state = "0"'. When I run the code below, it says "PartitionKey value must be supplied for this operation." I have thousands and thousands of partition keys, and I need to query every single document. How would I go about doing this? I read about enabling cross-partition queries, but I can't find where to put that. Does it go in the Azure Function or in the stored procedure? Thanks
client.executeStoredProcedure(databaseUrl + "/colls/Events/sprocs/Events_FindDocs", "null",
    (err, results) => {
        if (err) {
            context.log(err);
        } else {
            context.log(results);
        }
    });
Stored procedures cannot run across partitions.
They are partition-specific, so you won't be able to query everything in a stored procedure if your collection is partitioned.
From the documentation:
"If the collection the stored procedure is registered against is a single-partition collection, then the transaction is scoped to all the documents within the collection. If the collection is partitioned, then stored procedures are executed in the transaction scope of a single partition key. Each stored procedure execution must then include a partition key value corresponding to the scope the transaction must run under."
Learn more about Stored Procedures here
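If you only need to run the stored procedure against one known partition, the usual fix for the "PartitionKey value must be supplied" error is to pass the partition key in the request options. A minimal sketch, assuming a documentdb-style Node client where executeStoredProcedure takes (link, params, options, callback):

```javascript
// Execute a sproc scoped to a single partition by supplying the
// partition key in the request options (client API shape assumed).
function execSprocInPartition(client, sprocLink, params, partitionKey, callback) {
  client.executeStoredProcedure(sprocLink, params, { partitionKey: partitionKey }, callback);
}
```

To cover the whole collection you would have to repeat this once per distinct partition key value, since each execution is scoped to a single partition.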

Azure Cosmos DB asking for partition key for stored procedure

I am using a GUID id as my partition key, and I am facing a problem when trying to run a stored procedure. To run a stored procedure I need to provide a partition key, and I am not sure what value I should provide in this case. Please assist.
"If the collection the stored procedure is registered against is a single-partition collection, then the transaction is scoped to all the documents within the collection. If the collection is partitioned, then stored procedures are executed in the transaction scope of a single partition key. Each stored procedure execution must then include a partition key value corresponding to the scope the transaction must run under."
The description quoted above comes from the documentation mentioned here.
As @Rafat Sarosh said, a GUID id is not an appropriate partition key. Based on your situation, city may be more appropriate. You may need to adjust your database partitioning scheme, because the partition key cannot be deleted or modified after you have defined it.
I suggest exporting your data to a JSON file and then importing it into a new collection partitioned by city, via the Azure Cosmos DB Data Migration Tool.
Hope it helps you.
Just for summary:
Issue:
Unable to provide a specific partition key value when executing SQL to query documents.
Solution:
1. Set EnableCrossPartitionQuery to true when executing the query (this has a performance cost).
2. Consider setting a frequently queried field as the partition key.
For example, if your partition key is /id and your Cosmos document is
{
  "id": "abcde"
}
then when the stored procedure runs, you need to pass the value abcde.
So if you want your stored procedure to run cross-partition, it can't.
Answer from the Cosmos team:
https://feedback.azure.com/forums/263030-azure-cosmos-db/suggestions/33550159-support-stored-procedure-execution-over-all-partit
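Since each execution is scoped to one partition key, the only way to cover a partitioned collection with a stored procedure is to fan out: call it once per known partition key value and merge the results. A hedged sketch — the executeStoredProcedure signature and the idea that you already have the list of key values are assumptions:

```javascript
// Run a sproc once per partition key value and merge the result arrays.
// Assumes a documentdb-style callback client; error handling kept minimal.
function execSprocAcrossPartitions(client, sprocLink, params, partitionKeys) {
  return Promise.all(partitionKeys.map(pk =>
    new Promise((resolve, reject) => {
      client.executeStoredProcedure(sprocLink, params, { partitionKey: pk },
        (err, results) => err ? reject(err) : resolve(results));
    })
  )).then(perPartition => perPartition.flat()); // one array per partition -> one array
}
```

With thousands of partition keys this fan-out is slow; a cross-partition SQL query (EnableCrossPartitionQuery) is usually the better tool for reading everything.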

Are voltdb default procedures partitioned?

VoltDB creates a set of default procedures on tables with primary keys (like HELLO_WORLD.insert, HELLO_WORLD.upsert, HELLO_WORLD.delete, etc.).
Are these procedures partitioned if my HELLO_WORLD table is partitioned?
I couldn't find any documentation on default procedure partitioning.
Yes, the default procedures generated when you create a table in VoltDB are partitioned, if the table is partitioned.
All tables will get an insert procedure. If the table has a primary key, then upsert, update, and delete procedures are also generated.
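For example, assuming HELLO_WORLD has columns (hello, world, dialect) — the column list here is an assumption for illustration — the generated procedures are invoked like any other procedure from sqlcmd, with parameters in column order:

```sql
SQL> exec HELLO_WORLD.insert 'Bonjour', 'Monde', 'French';
SQL> exec HELLO_WORLD.upsert 'Hello', 'World', 'English';
```

Because the table is partitioned, each of these runs as a single-partition transaction, routed by the value supplied for the table's partitioning column.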

Cassandra: selecting first entry for each value of an indexed column

I have a table of events and would like to extract the first timestamp (column unixtime) for each user.
Is there a way to do this with a single Cassandra query?
The schema is the following:
CREATE TABLE events (
id VARCHAR,
unixtime bigint,
u bigint,
type VARCHAR,
payload map<text, text>,
PRIMARY KEY(id)
);
CREATE INDEX events_u
ON events (u);
CREATE INDEX events_unixtime
ON events (unixtime);
CREATE INDEX events_type
ON events (type);
According to your schema, each user will have a single time stamp. If you want one event per entry, consider:
PRIMARY KEY (id, unixtime).
Assuming that is your schema, the entries for a user will be stored in ascending unixtime order. Be careful, though: if it's an unbounded event stream and users have lots of events, the partition for the id will grow and grow. It's recommended to keep partition sizes to tens or hundreds of megabytes; if you anticipate larger, you'll need to start some form of bucketing.
Now, on to your query. In a word, no. If you don't hit a partition (by specifying the partition key), your query becomes a cluster-wide operation. With little data it'll work, but with lots of data you'll get timeouts. If you do have the data in its current form, then I recommend you use the Cassandra Spark connector and Apache Spark to do your query. An added benefit of the Spark connector is that if your Cassandra nodes are also Spark worker nodes then, thanks to data locality, you can efficiently hit a secondary index without specifying the partition key (which would normally cause a cluster-wide query with timeout issues, etc.). You could even use Spark to get the required data and store it in another Cassandra table for fast querying.
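If the goal is "first event per user", the usual Cassandra approach is to model a table for that query rather than rely on secondary indexes. A sketch — the new table and its name are illustrative, not part of the original schema:

```sql
-- One partition per user, rows ordered by event time ascending,
-- so the first row in a partition is that user's earliest event.
CREATE TABLE events_by_user (
    u bigint,
    unixtime bigint,
    id varchar,
    type varchar,
    payload map<text, text>,
    PRIMARY KEY (u, unixtime)
) WITH CLUSTERING ORDER BY (unixtime ASC);

-- Earliest event for one user:
SELECT unixtime, id, type FROM events_by_user WHERE u = 42 LIMIT 1;
```

This still answers the question one user at a time; getting every user's first event in a single statement remains a cluster-wide operation, which is where Spark comes in.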

Insert multiple records at once in Cassandra

I've been researching a lot about how to insert multiple records directly in the Cassandra cqlsh console. I found something about batches, so I thought of using one with a loop (for, while), but it seems that Cassandra does not support that.
How can I insert multiple records directly in the Cassandra console? Is there something like a stored procedure in Cassandra?
Cassandra does not (at this time) have stored procedures, but you should be able to accomplish this with a batch statement. Essentially, you can encapsulate multiple INSERTs between BEGIN BATCH and APPLY BATCH statements. This example is from the DataStax documentation on batch operations.
BEGIN BATCH
  INSERT INTO purchases (user, balance) VALUES ('user1', -8) USING TIMESTAMP 19998889022757000;
  INSERT INTO purchases (user, expense_id, amount, description, paid)
    VALUES ('user1', 1, 8, 'burrito', false);
APPLY BATCH;
Check the doc linked above for more information.
Edit:
If you mean to INSERT several million records at once, then you should consider other methods. The cqlsh COPY command is a viable alternative (for a few million records or fewer), or the Cassandra Bulk Loader for 10 million or more.
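For reference, a cqlsh COPY invocation looks like this — the CSV file name and the assumption that the file has a header row are illustrative:

```sql
COPY purchases (user, expense_id, amount, description, paid)
FROM 'purchases.csv' WITH HEADER = true;
```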
