Displaying indexes with metadata in YugabyteDB YCQL

[Question posted by a user on YugabyteDB Community Slack]
How can I get metadata info about indexes from the driver?
https://github.com/yugabyte/cassandra-java-driver/blob/3.10.0-yb-x/driver-core/src/main/java/com/datastax/driver/core/IndexMetadata.java
I use this one, but I do not see any info about the unique status or about WHERE conditions (partial indexes).

You can query the system_schema.indexes table:
ycqlsh:ybdemo> select * from system_schema.indexes;
keyspace_name | table_name | index_name | kind | options | table_id | index_id | transactions | is_unique | tablets
---------------+------------+--------------------+------------+--------------------------------------------------------------------------+--------------------------------------+--------------------------------------+---------------------+-----------+---------
ybdemo | emp | emp_by_userid | COMPOSITES | {'include': 'enum', 'target': 'userid'} | 13563f8c-997e-a298-de46-0c05025e00a7 | 57b96b7f-6be9-55b8-4145-c30739b6d467 | {'enabled': 'true'} | True | null
ybdemo | emp | emp_by_userid_bbbb | COMPOSITES | {'include': 'enum', 'predicate': 'lastname = ''x''', 'target': 'userid'} | 13563f8c-997e-a298-de46-0c05025e00a7 | f44cc10d-5251-6189-8040-6d73857f09dc | {'enabled': 'true'} | True | null
You can see both is_unique and the predicate used by the partial index.
This was run on YugabyteDB 2.15.0.0.
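For example, assuming you fetch these rows with any CQL driver, the fields of interest can be read off each row like this (a minimal sketch in Python; the dict literal just mirrors the second row of the output above rather than a live query result):

```python
# Sketch: reading uniqueness and the partial-index predicate from a
# system_schema.indexes row. The dict mirrors the second row of the
# output above; with a real driver you would obtain it from
# "SELECT * FROM system_schema.indexes" (row access is driver-specific).
row = {
    "keyspace_name": "ybdemo",
    "table_name": "emp",
    "index_name": "emp_by_userid_bbbb",
    "options": {"include": "enum", "predicate": "lastname = 'x'", "target": "userid"},
    "is_unique": True,
}

is_unique = row["is_unique"]
# Partial indexes carry their WHERE condition in options['predicate'];
# a plain index has no such key, so use .get() with a None default.
predicate = row["options"].get("predicate")

print(is_unique, predicate)
```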

Related

Select rows from array of uuid when dealing with two tables

I have products and providers. Each product has a uuid, and each provider has a list of uuids of the products it can provide.
How do I select all the products that a given provider (i.e. by provider uuid) can offer?
Products:
+------+------+------+
| uuid | date | name |
+------+------+------+
| 0 | - | - |
| 1 | - | - |
| 2 | - | - |
+------+------+------+
Providers:
+------+----------------+
| uuid | array_products |
+------+----------------+
| 0 | [...] |
| 1 | [...] |
| 2 | [...] |
+------+----------------+
select p.name, u.product_uuid
from products p
join (
  select unnest(array_products) as product_uuid
  from providers
  where uuid = :target_provider_uuid
) u on p.uuid = u.product_uuid;
Please note, however, that this data design is inefficient and much harder to work with than a normalized one (e.g. a junction table between products and providers).
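For comparison, here is the same lookup sketched in plain Python over hypothetical in-memory data (the names and uuids are made up): expand the provider's product array, then join back to products by uuid.

```python
# Hypothetical in-memory version of the query above: pick a provider,
# expand its array_products, and join each uuid back to the products table.
products = {0: "widget", 1: "gadget", 2: "gizmo"}   # uuid -> name
providers = {0: [1, 2], 1: [0], 2: [0, 1, 2]}       # uuid -> array_products

target_provider_uuid = 0
# Equivalent of the unnest + join: one (name, uuid) pair per array element.
offered = [(products[u], u)
           for u in providers[target_provider_uuid]
           if u in products]
print(offered)
```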

Error while querying hive table with map datatype in Spark SQL. But working while executing in HiveQL

I have a Hive table with the below structure:
+---------------+--------------+----------------------+
| column_value | metric_name | key |
+---------------+--------------+----------------------+
| A37B | Mean | {0:"202006",1:"1"} |
| ACCOUNT_ID | Mean | {0:"202006",1:"2"} |
| ANB_200 | Mean | {0:"202006",1:"3"} |
| ANB_201 | Mean | {0:"202006",1:"4"} |
| AS82_RE | Mean | {0:"202006",1:"5"} |
| ATTR001 | Mean | {0:"202007",1:"2"} |
| ATTR001_RE | Mean | {0:"202007",1:"3"} |
| ATTR002 | Mean | {0:"202007",1:"4"} |
| ATTR002_RE | Mean | {0:"202007",1:"5"} |
| ATTR003 | Mean | {0:"202008",1:"3"} |
| ATTR004 | Mean | {0:"202008",1:"4"} |
| ATTR005 | Mean | {0:"202008",1:"5"} |
| ATTR006 | Mean | {0:"202009",1:"4"} |
| ATTR006 | Mean | {0:"202009",1:"5"} |
+---------------+--------------+----------------------+
I need to write a Spark SQL query that filters on the key column with a NOT IN condition on the combination of both map entries.
The following query works fine in HiveQL via Beeline:
select * from your_data where key[0] between '202006' and '202009' and key NOT IN ( map(0,"202009",1,"5") );
But when I try the same query in Spark SQL, I get an error:
cannot resolve due to data type mismatch: map<int,string>
at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$3.applyOrElse(CheckAnalysis.scala:115)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$3.applyOrElse(CheckAnalysis.scala:107)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:278)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:278)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:277)
... (repeated Catalyst TreeNode frames)
Please help!
I got the answer from a different question I raised before. This query works fine:
select * from your_data where key[0] between 202006 and 202009 and NOT (key[0]="202009" and key[1]="5" );
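The working predicate can be restated in plain Python to see what it keeps and drops; the three sample rows below are taken from the table above, with the key map modeled as a dict:

```python
# Plain-Python restatement of the working Spark SQL predicate:
# keep rows whose key[0] is in the range, except the exact
# (key[0] = "202009", key[1] = "5") combination.
rows = [
    {"column_value": "ATTR005", "key": {0: "202008", 1: "5"}},
    {"column_value": "ATTR006", "key": {0: "202009", 1: "4"}},
    {"column_value": "ATTR006", "key": {0: "202009", 1: "5"}},  # excluded
]

kept = [
    r for r in rows
    if "202006" <= r["key"][0] <= "202009"
    and not (r["key"][0] == "202009" and r["key"][1] == "5")
]
print([r["column_value"] for r in kept])
```

The key idea is that Spark SQL rejects `NOT IN` against a whole map value, so the comparison is rewritten as a negated conjunction over the individual map entries.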

Cassandra PRIMARY KEY column "user" cannot be restricted as preceding column "eventtype" is not restricted

The table I created is the one below:
create table userevent(id uuid, eventtype text, sourceip text, user text, sessionid text, roleid int, menu text, action text, log text, date timestamp, PRIMARY KEY (id, eventtype, user));
id | eventtype | user | action | date | log | menu | roleid | sessionid | sourceip
--------------------------------------+-----------+---------+--------+--------------------------+----------+-----------+--------+-----------+--------------
b15c6780-d69e-11e8-bb9a-59dfa00365c6 | DemoType | Aqib | Login | 2018-10-01 04:05:00+0000 | demolog | demomenu | 1 | Demo_1 | 121.11.11.12
95df3410-d69e-11e8-bb9a-59dfa00365c6 | DemoType | Aqib | Login | 2018-09-30 22:35:00+0000 | demolog | demomenu | 1 | Demo_1 | 121.11.11.12
575b05c0-d69e-11e8-bb9a-59dfa00365c6 | DemoType | Aqib | Login | 2018-10-01 04:05:00+0000 | demolog | demomenu | 1 | Demo_1 | 121.11.11.12
e6cbc190-d69e-11e8-bb9a-59dfa00365c6 | DemoType3 | Jasim | Login | 2018-05-31 22:35:00+0000 | demolog3 | demomenu3 | 3 | Demo_3 | 121.11.11.12
d66992a0-d69e-11e8-bb9a-59dfa00365c6 | DemoType | Shafeer | Login | 2018-07-31 22:35:00+0000 | demolog | demomenu | 2 | Demo_2 | 121.11.11.12
But when I query as below,
select * from userevent where user='Aqib';
it shows something like this: InvalidRequest: Error from server: code=2200 [Invalid query] message="PRIMARY KEY column "user" cannot be restricted as preceding column "eventtype" is not restricted"
What is the error?
You need to read about data modelling for Cassandra, or, for example, take the DS220 course at DataStax Academy. Every row has a primary key consisting of a partition key, which defines on which node the data is located, and clustering columns, which define placement inside the partition. In your case, the primary key consists of id, eventtype, and user. To put a condition on user, you also need to restrict both id and eventtype.
You can add a secondary index or a materialized view to access rows by user alone, but I recommend getting deeper into data modelling first: define your queries, then build table structures around the queries you need to perform.
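As a rough analogy (plain Python, hypothetical data, not how Cassandra stores rows internally), the primary key (id, eventtype, user) behaves like nested dictionaries that can only be descended left to right; restricting user alone has no path in without scanning everything:

```python
# Toy model of a primary key (id, eventtype, user): rows are reachable
# only by fixing the key columns in declaration order.
table = {
    "b15c6780": {"DemoType": {"Aqib": {"action": "Login"}}},
    "e6cbc190": {"DemoType3": {"Jasim": {"action": "Login"}}},
}

# Fully restricted: id -> eventtype -> user is a direct lookup.
row = table["b15c6780"]["DemoType"]["Aqib"]

# Restricting only user means scanning every id and eventtype --
# exactly the full scan Cassandra refuses to do without an index.
by_user = [r for by_et in table.values() for by_u in by_et.values()
           for u, r in by_u.items() if u == "Aqib"]
print(row, by_user)
```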

Find all occurrences from a string - Presto

I have the following rows in Hive (on HDFS), and I am using Presto as the query engine.
1,#markbutcher72 #charlottegloyn Not what Belinda Carlisle thought. And yes, she was singing about Edgbaston.
2,#tomkingham #markbutcher72 #charlottegloyn It's true the garden of Eden is currently very green...
3,#MrRhysBenjamin #gasuperspark1 #markbutcher72 Actually it's Springfield Park, the (occasional) home of the might
The requirement is to get the following through a Presto query. How can we get this?
1,markbutcher72
1,charlottegloyn
2,tomkingham
2,markbutcher72
2,charlottegloyn
3,MrRhysBenjamin
3,gasuperspark1
3,markbutcher72
select t.id, u.token
from mytable as t
cross join unnest(regexp_extract_all(text, '(?<=#)\S+')) as u(token);
+----+----------------+
| id | token |
+----+----------------+
| 1 | markbutcher72 |
| 1 | charlottegloyn |
| 2 | tomkingham |
| 2 | markbutcher72 |
| 2 | charlottegloyn |
| 3 | MrRhysBenjamin |
| 3 | gasuperspark1 |
| 3 | markbutcher72 |
+----+----------------+
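If you want to sanity-check the regex outside Presto, the same extraction can be reproduced in plain Python with `re.findall` (sample rows copied from the question, abbreviated):

```python
import re

# Mirror of regexp_extract_all(text, '(?<=#)\S+'): grab every run of
# non-whitespace immediately preceded by '#', paired with the row id.
rows = [
    (1, "#markbutcher72 #charlottegloyn Not what Belinda Carlisle thought."),
    (2, "#tomkingham #markbutcher72 #charlottegloyn It's true..."),
]

pairs = [(rid, token)
         for rid, text in rows
         for token in re.findall(r"(?<=#)\S+", text)]
print(pairs)
```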

Through associations in Sails.js

A while ago I asked how to perform the "Through Associations".
I have the following tables :
genres
+-----------+--------------+------+-----+
| Field | Type | Null | Key |
+-----------+--------------+------+-----+
| id | int(6) | NO | PRI |
| slug | varchar(255) | NO | |
| parent_id | int(11) | YES | MUL |
+-----------+--------------+------+-----+
genres_radios
+----------+--------+------+-----+
| Field | Type | Null | Key |
+----------+--------+------+-----+
| genre_id | int(6) | NO | MUL |
| radio_id | int(6) | NO | MUL |
+----------+--------+------+-----+
radios
+-----------+--------------+------+-----+
| Field | Type | Null | Key |
+-----------+--------------+------+-----+
| id | int(5) | NO | PRI |
| slug | varchar(100) | NO | |
| url | varchar(100) | NO | |
+-----------+--------------+------+-----+
The answer is here: Sails.js associations.
Now I was wondering: what if I had a new field in the genres_radios table, for example:
genres_radios
+-----------+---------+------+-----+
| Field     | Type    | Null | Key |
+-----------+---------+------+-----+
| genre_id  | int(6)  | NO   | MUL |
| new_field | int(10) | NO   |     |
| radio_id  | int(6)  | NO   | MUL |
+-----------+---------+------+-----+
How would I get that attribute while making the join?
It is not implemented yet. Quoting Waterline's documentation:
Many-to-Many Through Associations
Many-to-Many through associations behave the same way as many-to-many
associations with the exception of the join table being automatically
created for you. This allows you to attach additional attributes onto
the relationship inside of the join table.
Coming Soon
