Hi, I have a column family in a Cassandra DB, and when I check the contents of the table it is shown differently depending on how cqlsh is invoked:
./cqlsh -2
select * from table1;
KEY,31281881-1bef-447a-88cf-a227dae821d6 | A,0xaa | Cidr,10.10.12.0/24 | B,0xac | C,0x01 | Ip,10.10.12.1 | D,0x00000000 | E,0xace | F,0x00000000 | G,0x7375626e657431 | H,0x666230363 | I,0x00 | J,0x353839
While the output looks like this for
./cqlsh -3
select * from table1;
key | Cidr | Ip
--------------------------------------+---------------+------------
31281881-1bef-447a-88cf-a227dae821d6 | 10.10.12.0/24 | 10.10.12.1
These values are inserted by a running Java program.
Suppose I want to manually update the value of column "B", which is only visible when using the -2 option; when I try, it gives me an error about the hex value.
I am using these commands to update, but always get an error:
cqlsh:sdnctl_db> update table1 SET B='0x7375626e657431' where key='31281881-1bef-447a-88cf-a227dae821d6';
Bad Request: cannot parse '0x7375626e657431' as hex bytes
cqlsh:sdnctl_db> update table1 SET B=0x7375626e657431 where key='31281881-1bef-447a-88cf-a227dae821d6';
Bad Request: line 1:30 no viable alternative at input 'x7375626e657431'
cqlsh:sdnctl_db> update table1 SET B=7375626e657431 where key='31281881-1bef-447a-88cf-a227dae821d6';
Bad Request: line 1:37 mismatched character '6' expecting '-'
I need to insert the hex value itself, which will then be picked up by the application, but I am not able to insert it.
Kindly help me correct the syntax.
It depends on the data type of column B. Here is the reference for the CQL data types. The documentation says that blob data is represented as a hexadecimal string, so my assumption is that your column B is also a blob. This other Cassandra question (and answer) shows how to insert strings into blobs.
cqlsh:so> CREATE TABLE test (a blob, b int, PRIMARY KEY (b));
cqlsh:so> INSERT INTO test(a,b) VALUES (textAsBlob('a'), 0);
cqlsh:so> SELECT * FROM test;
b | a
---+------
0 | 0x61
cqlsh:so> UPDATE test SET a = textAsBlob('b') WHERE b = 0;
cqlsh:so> SELECT * FROM test;
b | a
---+------
0 | 0x62
In your case, you could convert your hex string to a char string in code, and then use the textAsBlob function of cqlsh.
If the data type of column B is int instead, why don't you convert the hex number to int before executing the insert?
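For example, a minimal sketch of the update in your case (assuming B really is a blob, and noting that 0x7375626e657431 is the hex encoding of the text 'subnet1'):
cqlsh:sdnctl_db> UPDATE table1 SET B = textAsBlob('subnet1') WHERE key='31281881-1bef-447a-88cf-a227dae821d6';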
I have a dataset.table partitioned by date (100 partitions), like this:
table_name_(100), which means: table_name_20200101, table_name_20200102, table_name_20200103, ...
Example of table_name_20200101:
| id  | col_1 | col_2 | col_3 |
|-----+-------+-------+-------|
| xxx |     2 |     6 |    10 |
| yyy |     1 |    60 |    29 |
| zzz |    12 |    61 |    78 |
| aaa |    18 |    56 |    80 |
I would like to delete the rows where id = 'yyy' in all the (partitioned) tables:
DELETE FROM `project_id.dataset_id.table_name_*`
WHERE id = 'yyy'
I got this error:
Illegal operation (write) on meta-table
project_id:dataset_id.table_name_*
Is there a way to delete the rows with id = 'yyy' in all the (partitioned) tables?
Thank you
Okay, a few things to call out here to ensure we're using consistent terminology.
You're talking about sharded tables, not partitioned. In a partitioned table, the data within the table is organized based on the partitioning specification. Here, you just have a series of tables named using a common prefix and a suffix based on date.
The use of the table_prefix* syntax is called a wildcard table, and DML is explicitly not allowed via wildcard tables: https://cloud.google.com/bigquery/docs/querying-wildcard-tables
The table_name_(100) is an aspect of how the BigQuery UI collapses series of like-named tables to save space in the navigation panes. It's not how the service itself references tables at all.
The way you can accomplish this is to leverage other aspects of BigQuery: The INFORMATION_SCHEMA tables and scripting functionality.
Information about what tables are in a dataset is available via the TABLES view: https://cloud.google.com/bigquery/docs/information-schema-tables
Information about scripting can be found here: https://cloud.google.com/bigquery/docs/reference/standard-sql/scripting
Now, here's an example that combines these concepts:
DECLARE myTables ARRAY<STRING>;
DECLARE X INT64 DEFAULT 0;
DECLARE queryStr STRING;
# First, we query INFORMATION_SCHEMA to generate an array of the tables we want to process.
# This INFORMATION_SCHEMA query currently has a LIMIT clause so that if you get it wrong,
# you won't bork all the tables in the dataset in one go.
SET myTables = (
  SELECT
    ARRAY_AGG(t)
  FROM (
    SELECT
      TABLE_NAME as t
    FROM `my-project-id`.my_dataset.INFORMATION_SCHEMA.TABLES
    WHERE
      TABLE_TYPE = 'BASE TABLE' AND
      STARTS_WITH(TABLE_NAME, 'table_name_')
    ORDER BY TABLE_NAME
    LIMIT 2
  )
);
# Now, we process that array of tables using scripting's loop construct,
# one at a time.
LOOP
  IF X >= ARRAY_LENGTH(myTables)
    THEN LEAVE;
  END IF;
  # DANGER WILL ROBINSON: This mutates tables!!!
  #
  # The next line constructs the SQL statement we want to run for each table.
  #
  # In this example, we're constructing the same DML DELETE
  # statement to run on each table. For safety's sake, you may want to start with
  # something like a SELECT query to validate your assumptions and project the
  # myTables values to see what you're getting.
  SET queryStr = "DELETE FROM `my-project-id`.my_dataset." || myTables[SAFE_OFFSET(X)] || " WHERE id = 'yyy'";
  # Now, run the generated SQL via EXECUTE IMMEDIATE.
  EXECUTE IMMEDIATE queryStr;
  SET X = X + 1;
END LOOP;
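As those comments suggest, a safer first pass could swap the generated DELETE for a plain SELECT so you can see what would be affected before mutating anything; a sketch of just that one line:
SET queryStr = "SELECT COUNT(*) AS rows_to_delete FROM `my-project-id`.my_dataset." || myTables[SAFE_OFFSET(X)] || " WHERE id = 'yyy'";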
How do I query in Cassandra for columns that are != null?
Select * from tableA where id != null;
Select * from tableA where name != null;
Then I want to store these values and insert them into a different table.
I don't think this is possible with Cassandra. First of all, Cassandra CQL doesn't support the use of NOT or not-equal (!=) operators in the WHERE clause. Secondly, your WHERE clause can only contain primary key columns, and primary key columns will not allow null values to be inserted. I wasn't sure about secondary indexes though, so I ran this quick test:
create table nullTest (id text PRIMARY KEY, name text);
INSERT INTO nullTest (id,name) VALUES ('1','bob');
INSERT INTO nullTest (id,name) VALUES ('2',null);
I now have a table and two rows (one with null data):
SELECT * FROM nullTest;
id | name
----+------
2 | null
1 | bob
(2 rows)
I then try to create a secondary index on name, which I know contains null values.
CREATE INDEX nullTestIdx ON nullTest(name);
It lets me do it. Now, I'll run a query on that index.
SELECT * FROM nullTest WHERE name=null;
Bad Request: Unsupported null value for indexed column name
And again, this is done under the premise that you won't be able to query for "not null" if you can't even query for column values that actually are null.
So, I'm thinking this can't be done. Also, if null values are a possibility in your primary key, then you may want to re-evaluate your data model. Again, I know the OP's question is about querying where data is not null. But as I mentioned before, Cassandra CQL doesn't have a NOT or != operator, so that's going to be a problem right there.
Another option is to insert an empty string instead of a null, as sketched below. You would then be able to query on an empty string. But that still doesn't get you past the fundamental design flaw of having a null in a primary key field. Perhaps if you had a composite primary key, and only part of it (the clustering columns) had the possibility of being empty (certainly not part of the partitioning key). But you'd still be stuck with the problem of not being able to query for rows that are "not empty" (instead of not null).
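For example, reusing the nullTest table and the nullTestIdx index from above, a minimal sketch of the empty-string workaround:
INSERT INTO nullTest (id,name) VALUES ('3','');
SELECT * FROM nullTest WHERE name='';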
NOTE: Inserting null values was done here for demonstration purposes only. It is something you should do your best to avoid, as inserting a null column value WILL create a tombstone. Likewise, inserting lots of null values will create lots of tombstones.
1) select * from test;
name | id | address
------------------+----+------------------
bangalore | 3 | ramyam_lab
bangalore | 4 | bangalore_ramyam
bangalore | 5 | jasgdjgkj
prasad | 11 | null
prasad | 12 | null
india | 6 | karnata
india | 7 | karnata
ramyam-bangalore | 3 | jasgdjgkj
ramyam-bangalore | 5 | jasgdjgkj
2) Cassandra doesn't support selecting on null values. It shows null just for our understanding.
3) For handling null values, use placeholder strings like "not-available" or "null"; then we can select the data, as in the sketch below.
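For example (a sketch, assuming name and id form the primary key of test, and that a secondary index exists on address so it can be used in the WHERE clause):
INSERT INTO test (name,id,address) VALUES ('prasad',13,'not-available');
SELECT * FROM test WHERE address='not-available';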
I get a syntax error in a MySQL database, version 7.0:
SELECT
r.id,
r.number,
r.numbertype,
r.forhandler,
LAG(r.number) OVER (PARTITION BY r.numbertype ORDER BY r.number) AS last_row_number,
LEAD(r.number) OVER (PARTITION BY r.numbertype ORDER BY r.number) AS next_row_number,
r.number -(LAG(r.number) OVER (PARTITION BY r.numbertype ORDER BY r.number)) AS gap_last_rk,
CAST (r.number-(LEAD(r.number) OVER (PARTITION BY r.numbertype ORDER BY r.`number`)) AS BIGINT SIGNED) AS gap_next_rk
FROM admin.numberranges r
WHERE r.status=2
ORDER BY r.number;
The syntax error is in the CAST part. My column number is a BIGINT UNSIGNED.
I tried CONVERT as well :-(
Error Code: 1064
You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'BIGINT SIGNED) AS neg_number
First, you have a space after CAST, which can cause additional parse errors/issues in your query; you have to use CAST(...). Second, the type BIGINT SIGNED is not allowed; check the list of permitted types for CAST(expr AS type). When you want a signed number, you use the type SIGNED or SIGNED INTEGER, as described in the documentation:
The type can be one of the following values:
[...]
SIGNED [INTEGER]
See the following queries on how to use the CAST() function (examples run on MySQL 8.0.23, the result might not be the same for MariaDB but the type restrictions are similar, see the MySQL documentation of CONVERT(expr, type)):
mysql> EXPLAIN Dummy;
+-------+-----------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-----------------+------+-----+---------+-------+
| Test | bigint unsigned | YES | | NULL | |
+-------+-----------------+------+-----+---------+-------+
1 row in set (0.01 sec)
mysql> SELECT Test, CAST(Test AS BIGINT SIGNED) FROM Dummy;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that
corresponds to your MySQL server version for the right syntax
to use near 'BIGINT SIGNED) FROM Dummy' at line 1
mysql> SELECT Test, CAST(Test AS SIGNED) FROM Dummy;
+------+----------------------+
| Test | CAST(Test AS SIGNED) |
+------+----------------------+
| 1234 | 1234 |
+------+----------------------+
1 row in set (0.00 sec)
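Applied to the query from the question, the fix is to remove the space after CAST and cast to SIGNED rather than BIGINT SIGNED; a sketch of just the affected column:
CAST(r.number - (LEAD(r.number) OVER (PARTITION BY r.numbertype ORDER BY r.number)) AS SIGNED) AS gap_next_rk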
Currently I am developing a solution in the field of time-series data. Within these data we have: an ID, a value and a timestamp.
So here it comes: the value might be of type boolean, float or string. I consider three approaches:
a) For every data type a distinct table, all sensor values of type boolean into a table, all sensor values of type string into another. The obvious disadvantage is that you have to know where to look for a certain sensor.
b) A meta-column describing the data type plus all values of type string. The obvious disadvantage is the data conversion e.g. for calculating the MAX, AVG and so on.
c) Having three columns of different type but only one will be with a value per record. The disadvantage is 500000 sensors firing every 100ms ... plenty of unused space.
As my knowledge is limited, any help is appreciated.
500000 sensors firing every 100ms
The first thing is to make sure that you partition properly, so that you don't exceed the limit of 2 billion columns per partition.
CREATE TABLE sensorData (
  stationID uuid,
  datebucket text,
  recorded timeuuid,
  intValue bigint,
  strValue text,
  blnValue boolean,
  PRIMARY KEY ((stationID,datebucket),recorded));
With a half-million every 100ms, that's 5 million in a second. So you'll want to set your datebucket to be very granular...down to the second. Next I'll insert some data and query it back:
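Illustrative INSERTs (a sketch, not necessarily the exact statements used) could look like this, letting now() generate the timeuuid:
INSERT INTO sensorData (stationID,datebucket,recorded,intValue) VALUES (8b466f1d-8d6b-46fa-9f5b-8c4eb51aa40c,'2015-04-22T14:54:29',now(),59);
INSERT INTO sensorData (stationID,datebucket,recorded,strValue) VALUES (8b466f1d-8d6b-46fa-9f5b-8c4eb51aa40c,'2015-04-22T14:54:29',now(),'CD');
INSERT INTO sensorData (stationID,datebucket,recorded,blnValue) VALUES (8b466f1d-8d6b-46fa-9f5b-8c4eb51aa40c,'2015-04-22T14:54:29',now(),true);
SELECT * FROM sensorData;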
stationid | datebucket | recorded | blnvalue | intvalue | strvalue
--------------------------------------+---------------------+--------------------------------------+----------+----------+----------
8b466f1d-8d6b-46fa-9f5b-8c4eb51aa40c | 2015-04-22T14:54:29 | 6338df40-e929-11e4-88c8-21b264d4c94d | null | 59 | null
8b466f1d-8d6b-46fa-9f5b-8c4eb51aa40c | 2015-04-22T14:54:29 | 633e0f60-e929-11e4-88c8-21b264d4c94d | null | null | CD
8b466f1d-8d6b-46fa-9f5b-8c4eb51aa40c | 2015-04-22T14:54:29 | 6342f160-e929-11e4-88c8-21b264d4c94d | True | null | null
3221b1d7-13b4-40d4-b41c-8d885c63494f | 2015-04-22T14:56:19 | a48bbdf0-e929-11e4-88c8-21b264d4c94d | False | null | null
...plenty of unused space.
You might be surprised. With the CQL output of SELECT * above, it appears that there are null values all over the place. But watch what happens when we use the cassandra-cli tool to view how the data is stored "under the hood":
RowKey: 3221b1d7-13b4-40d4-b41c-8d885c63494f:2015-04-22T14\:56\:19
=> (name=a48bbdf0-e929-11e4-88c8-21b264d4c94d:, value=, timestamp=1429733297352000)
=> (name=a48bbdf0-e929-11e4-88c8-21b264d4c94d:blnvalue, value=00, timestamp=1429733297352000)
As you can see, the data (above) stored for the CQL row where stationid=3221b1d7-13b4-40d4-b41c-8d885c63494f AND datebucket='2015-04-22T14:56:19' shows that blnValue has a value of 00 (false). But also notice that intValue and strValue are not present. Cassandra doesn't force a null value like an RDBMS does.
The obvious disadvantage is the data conversion e.g. for calculating the MAX, AVG and so on.
Perhaps you already know this, but I did want to mention that Cassandra CQL does not contain definitions for MAX, AVG, or any other data aggregation functions. You'll either need to do that client-side, or use Apache Spark to perform OLAP-type queries.
Be sure to read through Patrick McFadin's Getting Started With Time Series Data Modeling. It contains good suggestions on how to solve time series problems like this.
I have been trying to use COPY FROM to insert into a Cassandra table that has a timestamp type column. However, I encountered the following error:
code=2200 [Invalid query] message="unable to coerce '2015-03-06 18:11:33GMT' to a formatted date (long)"
Aborting import at record #3. Previously-inserted values still present.
0 rows imported in 0.211 seconds.
The content of the CSV file was actually created with a COPY TO command. My TZ environment variable has been set to GMT.
I did some searching and found a post here that mentioned using Z instead of GMT as the timezone in the data string, i.e. '2015-03-06 18:11:33Z'. If I replace all the GMT in my CSV with Z, COPY FROM worked. Link for the post here:
unable to coerce '2012/11/11' to a formatted date (long)
When I run a SELECT on this table, the datetime column shows up in the format of: 2015-03-06 17:53:23GMT.
As further info, there was a bug about the 'Z' timezone, but it was fixed. Link: https://issues.apache.org/jira/browse/CASSANDRA-6973
So my question is, is there a way that I can run COPY TO so that it writes Z instead of GMT for time zone?
Alternatively, is there a way I can make COPY FROM work with GMT?
Thanks.
Note: The solution is in the comment from @Aaron on this post. Yes, it's a hack, but it works.
I think what is happening here is that you are getting bitten by the time_format property in your ~/.cassandra/cqlshrc file. COPY uses this setting when exporting your timestamp data during a COPY TO. CQLSH uses the Python strftime formats. It is interesting to note that the difference between the lowercase %z and the uppercase %Z seems to be exactly your problem.
When I SELECT timestamp data with %Z (upper), it looks like this:
aploetz#cqlsh:stackoverflow> SELECT * FROm posts1;
userid | posttime | postcontent | postid
--------+------------------------+--------------+--------------------------------------
1 | 2015-01-25 13:25:00CST | blahblah5 | 13218139-991c-4ddc-a11a-86992f6fed66
1 | 2015-01-25 13:22:00CST | blahblah2 | eacdebcc-35c5-45f7-9374-d5fd987e699f
0 | 2015-03-12 14:10:00CDT | sdgfjdsgojr | 82766df6-4cca-4ad1-ae59-ba4488103da4
0 | 2015-03-12 13:56:00CDT | kdsjfsdjflds | bd5c2be8-be66-41da-b9ff-98e9a4836000
0 | 2015-03-12 09:10:00CDT | sdgfjdsgojr | 6865216f-fc4d-431c-8067-c27cf20b6be7
When I try to INSERT a record using that date format, it fails:
aploetz#cqlsh:stackoverflow> INSERT INTO posts1 (userid,posttime,postcontent,postid) VALUES (0,'2015-03-12 14:27CST','sdgfjdsgojr',uuid());
code=2200 [Invalid query] message="unable to coerce '2015-03-12 14:27CST' to a formatted date (long)"
But when I alter time_format to use the (lowercase) %z the same query produces this:
aploetz#cqlsh:stackoverflow> SELECT * FROm posts1;
userid | posttime | postcontent | postid
--------+--------------------------+--------------+--------------------------------------
1 | 2015-01-25 13:25:00-0600 | blahblah5 | 13218139-991c-4ddc-a11a-86992f6fed66
1 | 2015-01-25 13:22:00-0600 | blahblah2 | eacdebcc-35c5-45f7-9374-d5fd987e699f
0 | 2015-03-12 14:10:00-0500 | sdgfjdsgojr | 82766df6-4cca-4ad1-ae59-ba4488103da4
0 | 2015-03-12 13:56:00-0500 | kdsjfsdjflds | bd5c2be8-be66-41da-b9ff-98e9a4836000
0 | 2015-03-12 09:10:00-0500 | sdgfjdsgojr | 6865216f-fc4d-431c-8067-c27cf20b6be7
I can also INSERT data in this format:
INSERT INTO posts1 (userid,posttime,postcontent,postid)
VALUES (0,'2015-03-12 14:27-0500','sdgfjdsgojr',uuid());
It also appears in this way when I run a COPY TO, and a COPY FROM of the same data/file also works.
In summary, check your ~/.cassandra/cqlshrc and make sure that you are either using the default setting, or this setting in the [ui] section:
[ui]
time_format = %Y-%m-%d %H:%M:%S%z
It won't get you the 'Z' like you asked for, but it will allow you to COPY TO/FROM your data without having to muck with the CSV file.
Edit
For those of you poor souls out there using CQLSH (or Cassandra, God help you) on Windows, the default location of the cqlshrc file is c:\Users\%USERNAME%\.cassandra\cqlshrc.
Edit - 20150903
Inspired by this question, I submitted a patch (CASSANDRA-8970) to allow users to specify a custom time format with COPY, and it was marked as "Ready To Commit" yesterday. Basically, this patch will allow this problem to be solved by doing the following:
COPY posts1 TO '/home/aploetz/posts1.csv' WITH DELIMITER='|' AND HEADER=true
AND TIME_FORMAT='%Y-%m-%d %H:%M:%SZ';
Edit - 20161010
The COPY command was improved in Cassandra 2.2.5, and the TIMEFORMAT option has been renamed to DATETIMEFORMAT.
From New options and better performance in cqlsh copy:
DATETIMEFORMAT, which used to be called TIMEFORMAT, a string containing the Python strftime format for date and time values, such as '%Y-%m-%d %H:%M:%S%z'. It defaults to the time_format value in cqlshrc.
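For example, a sketch of an export with that option on a recent cqlsh (assuming the same posts1 table, and using the 'Z'-suffixed format from above):
COPY posts1 TO '/home/aploetz/posts1.csv' WITH HEADER=true AND DATETIMEFORMAT='%Y-%m-%d %H:%M:%SZ';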