Inserting a value into a frozen set in Cassandra 3

I am working on a Cassandra 3 database in which one table has a column defined like this:
column_name map<int, frozen<set<int>>>
To replace the complete set for a given map key x, I just do this:
UPDATE keyspace.table SET column_name[x] = {1,2,3,4,5} WHERE ...
The thing is that I need to insert a single value into the set for a given key. I tried this:
UPDATE keyspace.table SET column_name[x] = column_name[x] + {1} WHERE ...
But it returns:
SyntaxException: line 1:41 no viable alternative at input '[' (... SET column_name[x] = [column_name][...)
What am I doing wrong? Does anyone know how to insert data the way I need?

Since the values of the map are frozen, you can't update them in place like this.
A frozen value serializes multiple components into a single value, and Cassandra treats it as a blob, so the entire value must be overwritten. Only non-frozen types allow updates to individual fields.
You have to read the full map, get the set stored under the key, add the new item, and then write the whole set back for that key.
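The read-modify-write cycle described above can be sketched in JavaScript as follows. This is a minimal sketch, not driver code: withAddedItem is a hypothetical helper, the starting set is assumed to be what a prior SELECT returned for key x, and the WHERE clause stays elided as in the question.

```javascript
// Hypothetical helper: returns the CQL set literal for the old set plus one new item.
function withAddedItem(currentItems, newItem) {
  const merged = new Set(currentItems); // copy, so the input is not mutated
  merged.add(newItem);                  // no-op when the item is already present
  const sorted = [...merged].sort((a, b) => a - b);
  return `{${sorted.join(', ')}}`;
}

// Assumption: a prior SELECT returned {1, 2, 3, 4, 5} for key x.
const literal = withAddedItem([1, 2, 3, 4, 5], 6);
console.log(literal); // {1, 2, 3, 4, 5, 6}

// The literal then replaces the whole frozen value:
// UPDATE keyspace.table SET column_name[x] = {1, 2, 3, 4, 5, 6} WHERE ...
```

Because the read and the write are separate steps, two clients doing this at the same time can overwrite each other's additions.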

Related

Checking if key exists in Presto value map

I am new to Presto, and can't quite figure out how to check if a key is present in a map. When I run a SELECT query, this error message is returned:
Key not present in map: element
SELECT value_map['element'] FROM
mytable
WHERE name = 'foobar'
Adding AND contains(value_map, 'element') does not work
The data type is a string array
SELECT typeof('value_map') FROM mytable
returns varchar(9)
How would I only select records where 'element' is present in the value_map?
You can look up a value in a map with element_at, which returns NULL instead of raising an error when the key is absent:
SELECT element_at(value_map, 'element')
FROM ...
WHERE element_at(value_map, 'element') IS NOT NULL
element_at is ambiguous in that case: it returns NULL both when there is no such key and when the key exists but has NULL associated with it. A guaranteed approach is contains(map_keys(my_map), 'mykey'), which admittedly should be a bit slower than the original variant.
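The ambiguity is easier to see with an in-memory analogue. The sketch below uses a plain JavaScript Map, not Presto: elementAt is a hypothetical stand-in that mimics element_at by returning null for a missing key, and Map.has plays the role of contains(map_keys(...), key).

```javascript
// Analogue only: a JavaScript Map standing in for a Presto map column.
const valueMap = new Map([
  ['present', 'x'],
  ['nullValued', null],
]);

// Hypothetical stand-in for element_at: null when the key is missing.
const elementAt = (m, key) => (m.has(key) ? m.get(key) : null);

// Both lookups yield null, so the result alone cannot tell the cases apart.
console.log(elementAt(valueMap, 'missing'));    // null
console.log(elementAt(valueMap, 'nullValued')); // null

// A key-membership test distinguishes them.
console.log(valueMap.has('missing'));    // false
console.log(valueMap.has('nullValued')); // true
```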

UPDATE prepared statement with Object

I have an Object that maps column names to values. The columns to be updated are not known beforehand and are decided at run-time.
e.g. map = {col1: "value1", col2: "value2"}.
I want to execute an UPDATE query, updating a table with those columns to the corresponding values. Can I do the following? If not, is there an elegant way of doing it without building the query manually?
db.none('UPDATE mytable SET $1 WHERE id = 99', map)
is there an elegant way of doing it without building the query manually?
Yes, there is, by using the helpers for SQL generation.
You can pre-declare a static object like this:
const cs = new pgp.helpers.ColumnSet(['col1', 'col2'], {table: 'mytable'});
And then use it like this, via helpers.update:
const sql = pgp.helpers.update(data, cs) + ' WHERE id = 99'; // append the WHERE clause with your condition
// and then execute it:
db.none(sql).then(data => {}).catch(error => {})
This approach will work with both a single object and an array of objects, and you will just append the update condition accordingly.
See also: PostgreSQL multi-row updates in Node.js
What if the column names are not known beforehand?
For that see: Dynamic named parameters in pg-promise, and note that a proper answer would depend on how you intend to cast types of such columns.
Something like this:
const map = {col1: "value1", col2: "value2", id: "existingId"};
db.none("UPDATE mytable SET col1=${col1}, col2=${col2} WHERE id=${id}", map)
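When the columns are only decided at run time, the named-parameter statement above could be generated from the object's keys. The following is a rough sketch in plain JavaScript, not part of pg-promise: buildUpdate and the ALLOWED whitelist are hypothetical, and it assumes the id property is only used in the WHERE clause. Column names are checked against the whitelist because they end up in the SQL text itself, while values stay as named parameters.

```javascript
// Hypothetical whitelist of columns the caller is allowed to update.
const ALLOWED = new Set(['col1', 'col2']);

// Builds an UPDATE with one named parameter per column found in `values`.
function buildUpdate(table, values) {
  const cols = Object.keys(values).filter((c) => c !== 'id');
  if (cols.length === 0) throw new Error('Nothing to update');
  for (const c of cols) {
    if (!ALLOWED.has(c)) throw new Error('Unexpected column: ' + c);
  }
  // Plain string concatenation, so ${...} is left for the query formatter.
  const assignments = cols.map((c) => c + '=${' + c + '}').join(', ');
  return 'UPDATE ' + table + ' SET ' + assignments + ' WHERE id=${id}';
}

const map = {col1: 'value1', col2: 'value2', id: 'existingId'};
const sql = buildUpdate('mytable', map);
console.log(sql);
// With pg-promise this would then be executed as: db.none(sql, map)
```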

COPY FROM CSV with static fields on Postgres

I'd like to switch an existing system that imports data from CSV files into a PostgreSQL 9.5 database to a more efficient approach.
I'd like to use the COPY statement because of its good performance. The problem is that I need to populate one field that is not in the CSV file.
Is there a way to have the COPY statement add a static field to all the inserted rows?
The perfect solution would have looked like this:
COPY data(field1, field2, field3='Account-005')
FROM '/tmp/Account-005.csv'
WITH DELIMITER ',' CSV HEADER;
Do you know a way to have that field populated in every row?
My server is running Node.js, so I'm open to any cost-efficient solution that completes the files using Node before COPYing them.
Use a temp table to import into. Before inserting the new records into the actual table, this allows you to:
add/remove/update columns
add extra literal data
delete or ignore records (such as duplicates)
-- target table
CREATE TABLE data
( id SERIAL PRIMARY KEY
, batch_name varchar NOT NULL
, remote_key varchar NOT NULL
, payload varchar
, UNIQUE (batch_name, remote_key)
-- or::
-- , UNIQUE (remote_key)
);
-- temp table
CREATE TEMP TABLE temp_data
( remote_key varchar -- PRIMARY KEY
, payload varchar
);
COPY temp_data(remote_key, payload)
FROM '/tmp/Account-005.csv'
WITH DELIMITER ',' CSV HEADER
;
-- The actual insert
-- (you could also filter out or handle duplicates here)
INSERT INTO data(batch_name, remote_key, payload)
SELECT 'Account-005', t.remote_key, t.payload
FROM temp_data t
;
BTW It is possible to automate the above: put it into a function (or maybe a prepared statement), using the filename/literal as argument.
Set a default for the column:
alter table data
alter column field3 set default 'Account-005';
Do not mention it in the COPY command:
COPY data(field1, field2) FROM...
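Since the question mentions completing the files with Node before COPYing them, a third option is to append the static column to every line first. The sketch below is only an illustration under stated assumptions: addStaticColumn is a hypothetical helper, the file has a header row, and no quoted field contains an embedded newline (otherwise a real CSV parser would be needed). Reading and writing the file is left out so the transformation itself stays visible.

```javascript
// Hypothetical helper: appends one column to every line of a CSV text.
// Assumes a header row and no newlines inside quoted fields.
function addStaticColumn(csvText, headerName, value) {
  // Quote the value and double any embedded quotes, per CSV convention.
  const quoted = '"' + String(value).replace(/"/g, '""') + '"';
  const lines = csvText.split(/\r?\n/);
  if (lines[lines.length - 1] === '') lines.pop(); // drop trailing empty line
  return lines
    .map((line, i) => line + ',' + (i === 0 ? headerName : quoted))
    .join('\n') + '\n';
}

const input = 'field1,field2\na,b\nc,d\n';
const output = addStaticColumn(input, 'field3', 'Account-005');
console.log(output);
// field1,field2,field3
// a,b,"Account-005"
// c,d,"Account-005"
```

The resulting file can then be loaded with a COPY that lists all three columns.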

CQL no viable alternative at input '(' error

I have an issue with my CQL: Cassandra is giving me a no viable alternative at input '(' (...WHERE id = ? if [(]...) error message. I think there is a problem with my statement.
UPDATE <TABLE> USING TTL 300
SET <attribute1> = 13381990-735b-11e5-9bed-2ae6d3dfc201
WHERE <attribute2> = dfa2efb0-7247-11e5-a9e5-0242ac110003
IF (<attribute1> = null OR <attribute1> = 13381990-735b-11e5-9bed-2ae6d3dfc201) AND <attribute3> = 0;
Any idea where the problem in the statement is?
It would help to have your complete table structure, so to test your statement I made a couple of educated guesses.
With this table:
CREATE TABLE lwtTest (attribute1 timeuuid, attribute2 timeuuid PRIMARY KEY, attribute3 int);
This statement works, as long as I don't add the lightweight transaction on the end:
UPDATE lwttest USING TTL 300 SET attribute1=13381990-735b-11e5-9bed-2ae6d3dfc201
WHERE attribute2=dfa2efb0-7247-11e5-a9e5-0242ac110003;
Your lightweight transaction...
IF (attribute1=null OR attribute1=13381990-735b-11e5-9bed-2ae6d3dfc201) AND attribute3 = 0;
...has a few issues.
"null" in Cassandra is not at all like its RDBMS counterpart. Not every row needs to have a value for every column. CQL rows that have no value for a column will show "null" for it, but you cannot query by "null" since it isn't really there.
The OR keyword does not exist in CQL.
You cannot use extra parentheses to separate conditions in your WHERE clause or your lightweight transaction.
Bearing those points in mind, the following UPDATE with a lightweight transaction runs without error:
UPDATE lwttest USING TTL 300 SET attribute1=13381990-735b-11e5-9bed-2ae6d3dfc201
WHERE attribute2=dfa2efb0-7247-11e5-a9e5-0242ac110003
IF attribute1=13381990-735b-11e5-9bed-2ae6d3dfc201 AND attribute3=0;
[applied]
-----------
False

Update time and remaining time to live for a Cassandra row

How can I tell when a certain row was written and when it is going to be discarded?
I've searched for that info but couldn't find it.
Thanks.
Using the WRITETIME function in a SELECT statement returns the time, in microseconds since the Unix epoch, at which the column was written to the database.
For example:
select writetime(login) from user;
Will return something like:
writetime(login)
------------------
1439082127862000
You can insert a row with a TTL (time-to-live) in seconds, for example:
INSERT INTO user(login) VALUES ('admin') USING TTL 60;
Using the TTL function in a SELECT statement then returns the number of seconds the inserted data has left to live.
For example:
select ttl(login) from user;
Will return something like:
ttl(login)
------------------
59
If you don't specify a TTL, the above query will return:
ttl(login)
------------------
null
If you're on Cassandra 2.2 or later, you can create a user-defined function (UDF) to convert the microseconds returned by WRITETIME to a more readable format.
To use user-defined functions, enable_user_defined_functions must be set to true in the cassandra.yaml file.
Then, in cqlsh, create a function like the following:
CREATE OR REPLACE FUNCTION microsToFormattedDate (input bigint)
CALLED ON NULL INPUT
RETURNS text
LANGUAGE java
AS 'return new java.text.SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSS").format( new java.util.Date(input / 1000) );';
User-defined functions are defined within a keyspace. If no keyspace is defined, the current keyspace is used.
Now using the function:
select microsToFormattedDate( writetime(login) ) from user;
Will return something like this:
social.microstoformatteddate(writetime(login))
-----------------------------------------------
2015-08-08 20:02:07,862
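If a UDF is not an option, the same conversion can be done on the client. The following is a minimal JavaScript sketch under stated assumptions: writetimeToDate and expiryFromTtl are hypothetical helpers, and the values are assumed to arrive as plain JavaScript numbers (a driver may hand back a different bigint representation).

```javascript
// Hypothetical helper: WRITETIME is in microseconds, Date expects milliseconds.
function writetimeToDate(micros) {
  return new Date(Math.floor(micros / 1000));
}

// Hypothetical helper: TTL is the number of seconds left, so the expected
// expiry is "now" plus that many seconds; null means no TTL was set.
function expiryFromTtl(ttlSeconds, now) {
  if (ttlSeconds === null) return null;
  return new Date(now.getTime() + ttlSeconds * 1000);
}

// The sample WRITETIME value from above, printed in UTC.
console.log(writetimeToDate(1439082127862000).toISOString());
// 2015-08-09T01:02:07.862Z

console.log(expiryFromTtl(59, new Date()));   // about 59 seconds from now
console.log(expiryFromTtl(null, new Date())); // null
```

The ISO string is in UTC, which is why it differs from the UDF output above, which is formatted in the server's local time zone.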
Use the writetime function in CQL to get the time the column was written:
select writetime(column) from tablename where clause
