Updating to empty set - cassandra

I just created a new column for my table
alter table user add (questions set<timeuuid>);
Now the table looks like
user (
google_id text PRIMARY KEY,
date_of_birth timestamp,
display_name text,
joined timestamp,
last_seen timestamp,
points int,
questions set<timeuuid>
)
Then I tried to update all those null values to empty sets, by doing
update user set questions = {} where google_id = ?;
for each google id.
However they are still null.
How can I fill that column with empty sets?

A set, list, or map needs to have at least one element because an
empty set, list, or map is stored as a null set.
source
Also, this might be helpful if you're using a client (java for instance).

I've learnt that there's not really such a thing as an empty set, or list, etc.
These display as null in cqlsh.
However, you can still add elements to them, e.g.
> select * from id_set;
set_id | set_content
-----------------------+---------------------------------
104649882895086167215 | null
105781005288147046623 | null
> update id_set set set_content = set_content + {'apple','orange'} where set_id = '105781005288147046623';
set_id | set_content
-----------------------+---------------------------------
104649882895086167215 | null
105781005288147046623 | { 'apple', 'orange' }
So even though it displays as null you can think of it as already containing the empty set.

Related

How to add attributes to database columns

Im currently working on creating correct database columns for my database. I have created two tables and used alter:
CREATE TABLE stores (
id SERIAL PRIMARY KEY,
store_name TEXT
-- add more fields if needed
);
CREATE TABLE products (
id SERIAL,
store_id INTEGER NOT NULL,
title TEXT,
image TEXT,
url TEXT UNIQUE,
added_date timestamp without time zone NOT NULL DEFAULT NOW(),
PRIMARY KEY(id, store_id)
);
ALTER TABLE products
ADD CONSTRAINT "FK_products_stores" FOREIGN KEY ("store_id")
REFERENCES stores (id) MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE RESTRICT;
Now I am trying to use it together with PeeWee and I have managed to do a small step which is:
class Stores(Model):
id = IntegerField(column_name='id')
store_id = TextField(column_name='store_name')
class Products(Model):
id = IntegerField(column_name='id')
store_id = IntegerField(column_name='store_id')
title = TextField(column_name='title')
url = TextField(column_name='url')
image = TextField(column_name='image')
However my problem is that I have used:
ALTER TABLE products
ADD CONSTRAINT "FK_products_stores" FOREIGN KEY ("store_id")
REFERENCES stores (id) MATCH SIMPLE
ON UPDATE NO ACTION
ON DELETE RESTRICT;
which means that I do have a Foreign key and I am quite not sure how I can apply to use Foreign key together with PeeWee. I wonder how can I do that?
You need to add a ForeignKeyField to Products and remove store_id
class Products(Model):
id = IntegerField(column_name='id')
title = TextField(column_name='title')
url = TextField(column_name='url')
image = TextField(column_name='image')
store = ForeignKeyField(Stores, backref='products')

DynamoDB adds item instead of updates item when using update_item

I have an html table that is filled from a DynamoDB table. Clicking a row pops up an edit form in a modal. The data inputted is sent to a flask server to update the item - using AWS DynamoDB - that was edited in the modal form. Upon reading the AWS documentation for this, the correct method is to use update_item. However, when doing so the item is added again instead of updating the item. I used the AWS here to script the below. In my DynamoDB table, the primary partition key is KEY1 and the primary sort key is KEY2 in the below reference.
table = dynamodb.Table('table_name') #define DynamoDB table
key1 = account_id #string value of account id
key2 = request.form["KEY2"] #this is a read only field in the form, so the key does not get updated here
form_val1 = request.form["input1"]
form_val2 = request.form["input2"]
form_val3 = request.form["input3"]
form_val4 = request.form["input4"]
form_val5 = request.form["input5"]
form_val6 = request.form["input6"]
form_val7 = request.form["input7"]
form_val8 = request.form["input8"]
form_val9 = request.form["input9"]
#update item in dynamo
table.update_item(
Key={
'KEY1': key1, #partition key
'KEY2': key2 #sort key
},
UpdateExpression='SET dbField1 = :val1, dbField2 = :val2, dbField3 = :val3, dbField4 = :val4, dbField5 = :val5, dbField6 = :val6, dbField7 = :val7, dbField8 = :val8, dbField9 = :val9',
ExpressionAttributeValues={
':val1': form_val1,
':val2': form_val2,
':val3': form_val3,
':val4': form_val4,
':val5': form_val5,
':val6': form_val6,
':val7': form_val7,
':val8': form_val8,
':val9': form_val9
}
)
You can't and I will explain to you for what that not is possible.
When you create a table on dynamo DB with key and a order key you automatically create an index between key and sort key. We know an index is inmutable, that means you can't update the keys. Is for that reason that when you update dynamo create a new element.
It's a problem of the definition of your table because you never need to change the key or the sort key. Recreate your table only with the index and not with the sort index (because if your app can change the sort index that make not sense).
Is this the full query? the update_item docs say that TableName is required, which I don't see in your snippet.
From the updateitem docs:
Edits an existing item's attributes, or adds a new item to the table
if it does not already exist.
Make sure that the primary key (partition key and sort key) are unique in your table. If they are not, updateitem will create a new item in the database.
Are you absolutely certain that the primary key for the item already exists in the database?

How to run CQL in Zeppelin by taking input in user input format?

I was trying to run CQL query by taking in user input format in Zeppelin tool:-
%cassandra
SELECT ${Select Fields Type=uuid ,uuid | created_by | email_verify| username} FROM
${Select Table=keyspace.table_name}
${WHERE email_verify="true" } ${ORDER BY='updated_date' }LIMIT ${limit = 10};
while running this query I was getting this error:
line 4:0 mismatched input 'true' expecting EOF
(SELECT uuid FROM keyspace.table_name ["true"]...)
You need to move WHERE and ORDER BY out of the dynamic form declaration.
The input field declaration is looks as following: ${field_name=default_value}. In your case, instead of WHERE ..., you've got the field name of WHERE email_verify.
It should be as following (didn't tested):
%cassandra
SELECT ${Select Fields Type=uuid ,uuid | created_by | email_verify| username} FROM
${Select Table=keyspace.table_name}
WHERE ${where_cond=email_verify='true'} ORDER BY ${order_by='updated_date'} LIMIT ${limit = 10};
Update:
here is the working example for table with following structure:
CREATE TABLE test.scala_test2 (
id int,
c int,
t text,
tm timestamp,
PRIMARY KEY (id, c)
) WITH CLUSTERING ORDER BY (c ASC)

How to insert an array of strings in javascript into PostgreSQL

I am building an API server which accepts file uploads using multer.
I need to store an array of all the paths to all files uploaded for each request to a column in the PostgreSQL database which I have connected to the server.
Say I have a table created with the following query
CREATE TABLE IF NOT EXISTS records
(
id SERIAL PRIMARY KEY,
created_on TIMESTAMPTZ NOT NULL DEFAULT NOW(),
created_by INTEGER,
title VARCHAR NOT NULL,
type VARCHAR NOT NULL
)
How do I define a new column filepaths on the above table where I can insert a javascript string array (ex: ['path-to-file-1', 'path-to-file-2', 'path-to-file-3']).
Also how do I retrive, update/edit the list in javascript using node-postgres
You have 2 options:
use json or jsonb type. In the case string to insert will look:
'["path-to-file-1", "path-to-file-2", "path-to-file-3"]'
I would prefer jsonb - it allows to have good indexes. Json is rather just text with some additional built-in functions.
Use array of text - something like filepaths text[]. To insert you can use:
ARRAY ['path-to-file-1', 'path-to-file-2', 'path-to-file-3']
or
'{path-to-file-1,path-to-file-2,path-to-file-3,"path to file 4"}'
You need to use " here only for elements that contain space and so on. But you fill free to use it for all elements too.
You can create a file table that has a path column and a foreign key reference to the record that it belongs to. This way you can store the path as just a text column instead of storing an array in a column, which is better practice for relational databases. You'll also be able to store additional information on a file if you need to later. And it'll be more simple to interact with the file path records since you'd add a new file path by just inserting a new row into the file table (with the appropriate foreign key) and remove by deleting a row from the file table.
For example:
CREATE TABLE IF NOT EXISTS file (
record_id integer NOT NULL REFERENCES records(id) ON DELETE CASCADE,
path text NOT NULL
);
Then to get all the files for a record you can join the two tables together and convert to an array if you want.
For example:
SELECT
records.*,
ARRAY (
SELECT
file.path
FROM
file
WHERE
records.id = file.record_id
) AS file_paths
FROM
records;
Sample input (using only the title field of records):
INSERT INTO records (title) VALUES ('A'), ('B'), ('C');
INSERT INTO file (record_id, path) VALUES (1, 'patha1'), (1, 'patha2'), (1, 'patha3'), (2, 'pathb1');
Sample output:
id | title | file_paths
----+-------+------------------------
1 | A | {patha1,patha2,patha3}
2 | B | {pathb1}
3 | C | {}

Is it legit to store CQL tuples with null components in Cassandra 3.x

I have to store a protocol buffer structure in Cassandra 3.x. It is defined in a .proto file as:
message Attribute
{
required string key = 1;
oneof value {
int64 integerValue = 2;
float floatValue = 3;
string stringValue = 4;
}
}
To store multiple Attributes I was thinking about this CQL definition.
CREATE TABLE ... attributes: map<text, tuple<int, float, text> ...
and in each tuple 2 of 3 components would actually be null. I haven't tested this syntax yet but are there any downsides using this approach? Maybe there is a better way, i.e. User Defined Types?
Let's try this out. I'll start with a simple table, containing a valuemap column of type map<text,tuple<int,float,text> as you have above:
CREATE TABLE tupleTest (
key text,
value text,
valuemap map<text, FROZEN<tuple<int,float,text>>>,
PRIMARY KEY (key));
I'll INSERT some data:
INSERT INTO tupletest (key,value,valuemap) VALUES ('1','A',{'a':(0,0.0,'hi')});
INSERT INTO tupletest (key,value,valuemap) VALUES ('2','B',{'b':(0,null,'hi')});
INSERT INTO tupletest (key,value,valuemap) VALUES ('3','C',{'c':(null,null,'hi')});
And then I'll SELECT it, just to see:
aploetz#cqlsh:stackoverflow> SELECT * FROM tupletest ;
key | value | valuemap
-----+-------+---------------------------
3 | C | {'c': (None, None, 'hi')}
2 | B | {'b': (0, None, 'hi')}
1 | A | {'a': (0, 0, 'hi')}
(3 rows)
The main apprehension about explicitly INSERTing NULL values into Cassandra, is that in "normal" columns they actually create tombstones. But since we are not setting an entire column to NULL, merely an element in a tuple (nested inside a map), this is not the case. In fact, they are showing as None. And when I view the underlying SSTables, I also do not see evidence that a tombstone has been written.
Normally, I'd say that explicitly INSERTing a NULL into Cassandra is a terrible, terrible idea. But in this case, it shouldn't cause you any issues. Now, as to whether or not this is considered to be "legit" or a good practice...well, my data modeling senses do not approve. I would find another way to represent the absence of a value in a tuple type, as someone (the developer who follows you) could see this and interpret that as being "ok" to explicitly INSERT NULLs into other column values.

Resources