I'm using Solr to index an entity that has an indefinite number of related entities.
Table 1
id | name
1  | aa
2  | bb
3  | cc
Table 2
id | field1         | field2
1  | works in       | New York
1  | likes to go to | Paris
As you can see, each row represents an entity related to the entity with id 1, and it matters which field1 value corresponds to which field2 value.
How do I achieve this with Solr's DataImportHandler?
I used a sub-entity in data-config.xml with multiValued=true for field1 and field2, but the indexed document looks like
id: 1
field1: [works in, likes to go to]
field2: [New York, Paris]
and the relationship between the columns is completely lost: someone who searches for "works in Paris" can also get entity 1. What should I do to maintain the relationships? Thanks a lot.
Schema Definition in schema.xml
id (type string)
name (type string)
worksIn (type string, multiValued=true) - your choice whether multi-value is required or not
likesToGo (type string, multiValued=true) - multiValued makes sense here, as a person most likely has more than one place to go; again, that depends on your requirements
Sample docs after indexing
1, aa, worksIn: [New York, New Jersey], likesToGo: [Paris, Moon]
2, bb, worksIn: [Dallas], likesToGo: [New York, Sun]
Querying
For "works in Paris", query is "worksIn:Paris".
You get doc with id 1
For "likes to go to sun", query is "likesToGo:sun".
You get doc with id 2
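To feed those fields from the two tables, the DIH sub-entities can split the table2 rows on field1 so that each relation lands in its own Solr field and the pairing survives. A minimal sketch of the queries involved (table and column names are taken from the question; the surrounding data-config.xml entity definitions are omitted, and the ${parent.id} placeholder stands for whatever name you give the parent entity):
-- parent entity query
SELECT id, name FROM table1;
-- one sub-entity query per Solr field
SELECT field2 AS worksIn FROM table2 WHERE id = '${parent.id}' AND field1 = 'works in';
SELECT field2 AS likesToGo FROM table2 WHERE id = '${parent.id}' AND field1 = 'likes to go to';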
Related
I have two tables mentioned below:
Reports
Id | status
1  | Active
Reports_details
Id | Cntry | State | City
1  | IN    | UP    | Delhi
1  | US    | Texas | Salt lake
Now my requirement is:
SELECT DISTINCT r.Id
FROM Reports r
LEFT JOIN Reports_details rd ON r.Id = rd.Id
WHERE r.status = 'Active' AND CONTAINS(rd.City, '"Del*"')
Note: CONTAINS is used for full-text search.
Problem: how do I add a WHERE clause on both tables' Bookshelf models simultaneously,
and how do I fetch the above query's data with pagination?
I tried creating the two respective models related with belongsTo and hasMany, but the issue comes when applying where on either model: it does not accept WHERE clauses from both tables and fails with the error "Invalid column name".
I'd appreciate suggestions on a workaround. Thank you.
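For reference, here is one way the target query could be written with pagination in plain SQL (a sketch assuming SQL Server, since CONTAINS is its full-text predicate; the ORDER BY column and page size of 20 are placeholder choices):
SELECT DISTINCT r.Id
FROM Reports r
LEFT JOIN Reports_details rd ON r.Id = rd.Id
WHERE r.status = 'Active'
  AND CONTAINS(rd.City, '"Del*"')
ORDER BY r.Id
-- page 1: skip 0 rows, take 20
OFFSET 0 ROWS FETCH NEXT 20 ROWS ONLY;
In Bookshelf, the .query() escape hatch exposes the underlying knex query builder, which can express the join and the conditions on both tables when the model relations alone fall short.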
I need to do this query for Cassandra:
SELECT * FROM classes WHERE students = null ALLOW FILTERING;
students is a set,
but it looks like sets do not allow the = operator.
To test this out, I followed the DataStax docs on Indexing a Collection.
> CREATE TABLE cyclist_career_teams ( id UUID PRIMARY KEY, lastname text, teams set<text> );
> CREATE INDEX team_idx ON cyclist_career_teams ( teams );
With the table created and a secondary index on the teams set, I then inserted some test data:
> SELECT lastname,teams FROM cyclist_career_teams ;
lastname | teams
-----------------+---------------------------------------------------------------------------------------------------------
Vos | {'Neiderland bloeit', 'Rabobank Womens Team', 'Rabobonk-Liv Giant', 'Rabobonk-Liv Womens Cycling Team'}
Van Der Breggen | {'Rabobonk-Liv Womens Cycling Team', 'Sengers Ladies Cycling Team', 'Team Flexpoint'}
Brand | {'AA Drink - Leontien.nl', 'Rabobonk-Liv Giant', 'Rabobonk-Liv Womens Cycling Team'}
Armistead | null
Note that for Lizzie Armistead, I intentionally omitted a value for the teams column. While CQL does not allow the equals "=" relation on set types, it does allow CONTAINS. However, attempting to use that with null yields a different error:
> SELECT lastname,teams FROM cyclist_career_teams WHERE teams CONTAINS null;
[Invalid query] message="Unsupported null value for column teams"
The reason for this behavior is related to how Cassandra gives special treatment to null values and the "null" keyword. Essentially, writing a null creates a tombstone, which is Cassandra's structure signifying a delete.
Even if Cassandra's treatment of null were not a factor, you'd still be faced with the problem that a value of "null" is not unique, so your query would have to poll each node in the cluster. Such use cases are well-known anti-patterns. Unfortunately, Cassandra is just not good at querying data (or filtering on a key value) which does not exist.
One thing you could try would be to use a string literal to indicate an empty value, like this:
> INSERT INTO cyclist_career_teams (id,lastname,teams) VALUES (uuid(),'Armistead',{'empty'});
> SELECT lastname,teams FROM cyclist_career_teams WHERE teams CONTAINS 'empty';
lastname | teams
-----------+-----------
Armistead | {'empty'}
(1 rows)
To be honest though, because of the aforementioned anti-pattern, I can't recommend this approach in good faith. But with some added application logic at creation time, an "empty" string literal could work for you.
How do I query in Cassandra for columns that are != null?
SELECT * FROM tableA WHERE id != null;
SELECT * FROM tableA WHERE name != null;
I then want to store these values and insert them into a different table.
I don't think this is possible with Cassandra. First of all, Cassandra CQL doesn't support the NOT or not-equal-to (!=) operators in the WHERE clause. Secondly, your WHERE clause can only contain primary key columns, and primary key columns will not allow null values to be inserted. I wasn't sure about secondary indexes though, so I ran this quick test:
create table nullTest (id text PRIMARY KEY, name text);
INSERT INTO nullTest (id,name) VALUES ('1','bob');
INSERT INTO nullTest (id,name) VALUES ('2',null);
I now have a table and two rows (one with null data):
SELECT * FROM nullTest;
id | name
----+------
2 | null
1 | bob
(2 rows)
I then try to create a secondary index on name, which I know contains null values.
CREATE INDEX nullTestIdx ON nullTest(name);
It lets me do it. Now, I'll run a query on that index.
SELECT * FROM nullTest WHERE name=null;
Bad Request: Unsupported null value for indexed column name
And again, this all rests on the premise that you can't query for "not null" if you can't even query for column values that actually are null.
So, I'm thinking this can't be done. Also, if null values are a possibility in your primary key, then you may want to re-evaluate your data model. Again, I know the OP's question is about querying where data is not null. But as I mentioned before, Cassandra CQL doesn't have a NOT or != operator, so that's going to be a problem right there.
Another option, is to insert an empty string instead of a null. You would then be able to query on an empty string. But that still doesn't get you past the fundamental design flaw of having a null in a primary key field. Perhaps if you had a composite primary key, and only part of it (the clustering columns) had the possibility of being empty (certainly not part of the partitioning key). But you'd still be stuck with the problem of not being able to query for rows that are "not empty" (instead of not null).
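A quick sketch of that empty-string workaround against the same test table (it reuses the nullTestIdx secondary index created above; the row with id '3' is hypothetical):
INSERT INTO nullTest (id,name) VALUES ('3','');
SELECT * FROM nullTest WHERE name='';
This returns the id '3' row, because an empty string, unlike a null, is an actual stored value that the index can serve.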
NOTE: Inserting null values was done here for demonstration purposes only. It is something you should do your best to avoid, as inserting a null column value WILL create a tombstone. Likewise, inserting lots of null values will create lots of tombstones.
1) select * from test;
name | id | address
------------------+----+------------------
bangalore | 3 | ramyam_lab
bangalore | 4 | bangalore_ramyam
bangalore | 5 | jasgdjgkj
prasad | 11 | null
prasad | 12 | null
india | 6 | karnata
india | 7 | karnata
ramyam-bangalore | 3 | jasgdjgkj
ramyam-bangalore | 5 | jasgdjgkj
2) Cassandra doesn't support selecting on null values. It shows null only for our understanding.
3) To handle null values, use sentinel strings like "not-available" or "null"; then we can select the data.
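A minimal sketch of that sentinel approach against the table above (the primary key of test is not shown in the original, so the INSERT column list is an assumption; ALLOW FILTERING stands in for a secondary index on address):
INSERT INTO test (name, id, address) VALUES ('prasad', 13, 'not-available');
SELECT * FROM test WHERE address = 'not-available' ALLOW FILTERING;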
I am looking for the best way to store and retrieve an array of data. The solution I am currently implementing uses a many-to-many relationship, as follows.
venue_themes
user_id | style  | environment
A1A2    | formal | indoor
A2B2    | formal | outdoor
theme_settings_to_setting_enum
id | user_id | setting_enum_id
1  | A1A2    | 1
2  | A1A2    | 3
3  | A2B2    | 1
4  | A2B2    | 2
setting_enum
id | value
1  | garden
2  | beach
3  | golf course
4  | backyard
The query I currently have is:
SELECT vt.user_id, vt.style, vt.environment, se.value
FROM venue_themes vt
JOIN theme_settings_to_setting_enum ts ON vt.user_id = ts.user_id
JOIN setting_enum se ON ts.setting_enum_id = se.id
GROUP BY vt.user_id, ts.id, se.id;
This works but it returns multiple rows with the same data other than my setting enum values.
An example return is :
user_id | style  | environment | value
AAAA    | formal | indoor      | beach
AAAA    | formal | indoor      | backyard
AAAA    | formal | indoor      | tent
This is fine but seems excessive if I have many values. What I really want my data to look like is:
user_id | style  | environment | value
AAAA    | formal | indoor      | beach, backyard, tent
Ideally I would have my values returned in an array or something similar so I don't have to build a function to manipulate the returned data.
You can remove ts.id and se.id from the GROUP BY clause, and use STRING_AGG() to generate the CSV string:
SELECT vt.user_id, vt.style, vt.environment, STRING_AGG(se.value, ', ') se_values
FROM venue_themes vt
JOIN theme_settings_to_setting_enum ts ON vt.user_id = ts.user_id
JOIN setting_enum se ON ts.setting_enum_id = se.id
GROUP BY vt.user_id;
Assuming that user_id is the primary key of venue_themes, it is sufficient to have just this column in the GROUP BY clause (the other columns of the table are functionally dependent on the primary key).
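If user_id were not the primary key, you would instead list every non-aggregated column, e.g.:
GROUP BY vt.user_id, vt.style, vt.environment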
You can control the order in which values are aggregated in the string with an ORDER BY clause:
STRING_AGG(se.value, ', ' ORDER BY se.id) se_values
If you want an array instead of a CSV string, then use ARRAY_AGG(), which takes no separator argument:
ARRAY_AGG(se.value ORDER BY se.id) se_values
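Putting it together, the full query returning one array per user might look like this (assuming PostgreSQL, which the aggregate functions and the functional-dependency rule above imply):
SELECT vt.user_id, vt.style, vt.environment,
       ARRAY_AGG(se.value ORDER BY se.id) se_values
FROM venue_themes vt
JOIN theme_settings_to_setting_enum ts ON vt.user_id = ts.user_id
JOIN setting_enum se ON ts.setting_enum_id = se.id
GROUP BY vt.user_id;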
I'm looking to do the following:
I want to have, say, 3 columns.
Transaction | Category | Amount
I want to be able to enter a certain name in the Transaction column, say for argument's sake "Tesco", then have a result returned in the Category column, say "Groceries", and I can then enter a specific amount myself in the Amount column.
The thing is, I will need quite a lot of different transactions (potentially unlimited), all in predetermined categories, so that each time I type in a transaction it will automatically display the category for me.
All help much appreciated.
I know a simple IF statement won't suffice. I can get it to work no problem using a simple IF statement, but as each transaction is different, I don't know how to take the programming further.
Thanks.
Colin
Use a lookup table. Let's say it's on a sheet called "Categories" and it looks like this:
  | A     | B
1 | Name  | Category
2 | Tesco | Groceries
3 | Shell | Fuel
Then, in the table you describe, use =VLOOKUP(A2, Categories!$A$2:$B$3, 2, FALSE) in your "Category" field, assuming it's in B2.
I do this a fair bit using Data Validation and tables.
In this case I would have two tables containing my pick lists on a lookup sheet.
Transaction table: [Name] = "loTrans" - just the list of transactions, sorted
Category table: [Name] = "loCategory" - two columns in the table, sorted by both columns - Trans and Category
Header1 : Transactions
Header2 : Category
The Details Table:
The Transaction field will have simple data validation, using a named range "trans", that selects from the table loTrans.
The Category field will also use data validation with a named range, but the source of that named range ("selCat") will be a little more complex. It will be something like:
=OFFSET(loCategory[Trans],MATCH(Enter_Details!A3,loCategory[Trans],0)-1,1,COUNTIF(loCategory[Trans],Enter_Details!A3),1)
As you enter details and select different transactions, the data validation will be limited to the categories of the selected transaction.
An example file