Cassandra Custom Secondary Index - cassandra

This seems to be a mystery in Cassandra. According to the official documentation, one can create an index on a column using a custom indexer class:
CREATE CUSTOM INDEX ON users (email) USING 'path.to.the.IndexClass';
But I could not find any documentation on which interface or class has to be implemented or extended to do this, or on how to configure Cassandra to find the class.
I wanted to write a custom indexer which could skip indexing rows based on conditions/options.

Here is what I've found: https://issues.apache.org/jira/browse/CASSANDRA-6480
So you have to implement a subclass of org.apache.cassandra.db.index.SecondaryIndex and make sure that class is on the classpath of your Cassandra nodes.

Here you can find an example implementation:
http://tuplejump.github.io/stargate/
https://github.com/tuplejump/stargate-core

Stratio's Cassandra Lucene Index is a plugin that supports custom indexes in Cassandra. Its documentation describes how to work with custom indexes.

Related

Indexing dynamic fields in azure search

I have used the Solr search engine, which has a dynamic fields feature. For example, if we define a product_* field in schema.xml, it will accept all fields starting with product_ during indexing.
Is there a feature like this in Azure Search, where we can define a wildcard for a field so that it accepts the matching fields during indexing? Fixed fields reduce flexibility, since one has to define a new schema every time new fields are added.
Azure Cognitive Search does not support dynamic fields. Adding fields to the schema as you detect them during indexing is the suggested workaround.
Please consider creating an item on our User Voice page for this. While we haven't considered adding support for dynamic fields specifically, we have been looking at making schemas more flexible and extensible, and your input could help us prioritize this.
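The workaround of adding fields as they are detected could be sketched roughly as follows. This is a minimal illustration, not an Azure SDK call: the function name `new_field_definitions`, the sample field names, and the Python-to-EDM type mapping are assumptions made for the example, and the resulting definitions would still have to be sent to the service in an index update.

```python
from typing import Any

# Illustrative mapping from Python value types to Azure Search EDM types.
# It is an assumption for this sketch and is not exhaustive.
EDM_TYPES = {
    bool: "Edm.Boolean",
    int: "Edm.Int64",
    float: "Edm.Double",
    str: "Edm.String",
}


def new_field_definitions(known_fields: set[str], document: dict[str, Any]) -> list[dict[str, Any]]:
    """Return field definitions for document keys that the schema does not have yet."""
    definitions = []
    for name, value in document.items():
        if name in known_fields:
            continue
        # Unknown value types fall back to Edm.String in this sketch.
        edm_type = EDM_TYPES.get(type(value), "Edm.String")
        definitions.append({"name": name, "type": edm_type})
    return definitions


known = {"id", "product_name"}
doc = {"id": "1", "product_name": "Widget", "product_weight": 2.5, "product_color": "red"}
print(new_field_definitions(known, doc))
```

The detected definitions would be appended to the index's field list before the document itself is uploaded.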

Can data in Solr be extended with manually defined meta data?

I have several documents in a Solr collection that I want to be able to search through. Most of the data comes from web sites I can easily crawl; however, some attributes are not available on those sites, so I have to add them manually.
So as an example I get the following info from a site (all attributes returned from crawled site):
Name: Porsche Boxter
Year: 1996
...
I want to add additional fields through a web interface (info not present on crawled sites):
Cool: yes
foo: bar
My questions:
Does it make sense at all to store additional information along with the indexed data within Solr (inside the documents), or would the best practice be to keep only crawled data in Solr and merge it with an externally managed database at query time? To me it makes more sense to have all the data that is eventually queried in Solr, as some of the manually added attributes are required search criteria (e.g. look only for cool cars from the 90s).
Is it possible to use Solr to store additional information about indexed documents? I know the entire schema in advance, perhaps this is useful?
If I store my data exclusively in Solr, how can I ensure that during the next crawl the manually added data is not overwritten? Would partial update be required?
Since I am new to Solr, it would also be very helpful if someone could simply point out what to look for in the documentation that describes my use case.
That depends on how often the external data changes: the more often it changes, the less sense it makes to store it in Solr. Generally it is a good idea to store such data along with the indexed data, because you get it without an additional database query.
Yes. Use indexed:false and stored:true. If you do not know all such fields in advance, you could use a dynamicField like <dynamicField name="*_stored" type="string" indexed="false" stored="true" />.
Yes. You have to use partial update. This is no problem in your case, because the fields not updated have stored:true.
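A partial (atomic) update in Solr sends, for each field, a modifier such as "set" instead of a plain value, so only the named fields change. The following is a minimal sketch that only builds the JSON request body. The helper name `atomic_update`, the document id, and the assumption that the unique key field is called `id` are illustrative and not taken from the question.

```python
import json


def atomic_update(doc_id: str, fields: dict[str, object]) -> str:
    """Build a JSON atomic-update body that sets only the given fields."""
    # Assumes the collection's unique key field is named "id".
    doc: dict[str, object] = {"id": doc_id}
    for name, value in fields.items():
        # The "set" modifier replaces this field and leaves other fields untouched.
        doc[name] = {"set": value}
    return json.dumps([doc])


# Hypothetical document id; "Cool" and "foo" are the manual attributes from the question.
body = atomic_update("porsche-1996", {"Cool": "yes", "foo": "bar"})
print(body)
```

The body would be posted to the collection's update handler. For the manual fields to survive, the crawler's writes would presumably need the same shape, since a regular add replaces the whole document.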

Unable to create nested collection datatype

I am not able to create a collection inside another collection in Cassandra. Please find the error details below:
cqlsh:TestKeyspace> create table users2(user_id text primary key, feeschedule map<text,set<text>>);
Bad Request: map type cannot contain another collection
Here I am trying to create a column named feeschedule of type Map, whose values are of type Set.
Could anybody suggest how I can achieve this in Cassandra?
My Cassandra version details are given below:
cqlsh version- cqlsh 4.1.0
Cassandra version – 2.0.2
Thanks in advance,
You are correct, nested collections are not supported.
You will be able to do something similar with user-defined types, but not until 2.1: https://issues.apache.org/jira/browse/CASSANDRA-5590
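Until user-defined types are available, one commonly used alternative (not part of the answer above) is to flatten the outer map into clustering rows, so that each row holds a single non-nested set. The sketch below shows the reshaping. The table layout, the column names `fee_key` and `fee_values`, the sample values, and the `flatten` helper are all hypothetical.

```python
# Hypothetical CQL for this layout: one row per map key, with the set stored
# as an ordinary (non-nested) collection column.
CREATE_TABLE = """
CREATE TABLE users2 (
    user_id text,
    fee_key text,
    fee_values set<text>,
    PRIMARY KEY (user_id, fee_key)
);
"""


def flatten(user_id: str, feeschedule: dict[str, set[str]]) -> list[tuple[str, str, set[str]]]:
    """Turn a map<text, set<text>> value into (user_id, fee_key, fee_values) rows."""
    # Sorting by key mirrors the clustering order on fee_key.
    return [(user_id, key, values) for key, values in sorted(feeschedule.items())]


rows = flatten("u1", {"dental": {"D100", "D200"}, "vision": {"V10"}})
for row in rows:
    print(row)
```

With this layout, all entries for one user live in a single partition and can be read back with a query on user_id alone.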

Cassandra Astyanax Composite Column

I'm currently trying to figure out whether I can access composite columns in Cassandra without using the AnnotatedCompositeSerializer. I'm looking for a method similar to what Hector does, using the Composite class and adding components.
I have searched Google but wasn't able to find any hints except for the AnnotatedCompositeSerializer. By the way, I want to use the composite key as a row key.
Any hints on where to look next?
Take a look at the PrefixedSerializer from the wiki page.

Is there deep loading in subsonic?

I am new to SubSonic and can't find a way to load data together with its parent or child data in one query. Is it possible in SubSonic?
Basically no, SubSonic 2 does not support deep loading. It is possible in SubSonic 3 using IQueryable, however. See the following post for more:
Subsonic Deeploads: Is This Supported?
You CAN do it with SubSonic 2. Make a partial class with the same namespace and class name as the generated class.
Then create a property that loads the data when it is called.
