Solr terms component over multiple fields?

I am able to retrieve the most frequently used terms in my index via the terms component described here:
http://wiki.apache.org/solr/TermsComponent
However this only seems to work for exactly one field.
I would really like to have this functionality over several fields.
I am aware that I can use an extra field that I fill with all the data when indexing, but I would like to avoid this redundancy if possible.
Is there a possibility to use the termscomponent over several fields?

No, the current implementation of TermsComponent takes only a single field, as noted in the documentation.
Perhaps it would be interesting to implement this, accepting multiple comma-separated fields in terms.fl, then setting per-field parameters as with faceting, e.g. terms.<field>.limit
I'm not familiar enough with the implementation to say whether this is possible or really desirable; I'd try asking about it on the solr-dev list.
If this is about implementing suggestions / autocomplete, take a look at the Suggester component instead.

You can add extra terms.fl=fieldName parameters.
Just like for the facet.field parameter.
I hope this helps
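As a rough illustration of the repeated-parameter approach, here is a minimal Python sketch that builds such a request URL. The host, the /terms handler path, and the field names title and body are assumptions for the example, not taken from the question; adjust them to your own setup.

```python
from urllib.parse import urlencode


def build_terms_url(base_url, fields, limit=10):
    """Build a TermsComponent request URL with one terms.fl per field."""
    params = [("terms", "true")]
    # Repeat terms.fl once per field, the same way facet.field is repeated.
    params += [("terms.fl", field) for field in fields]
    params.append(("terms.limit", str(limit)))
    params.append(("wt", "json"))
    return f"{base_url}?{urlencode(params)}"


# Hypothetical handler URL and field names.
url = build_terms_url("http://localhost:8983/solr/terms", ["title", "body"])
print(url)
```

The response should then contain a separate term list per requested field, so any merging across fields would still have to happen on the client side.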

Related

Send one of multiple parameters to REST API and use it

I use MEAN stack to develop an application.
I'm trying to develop a RESTful API to get users by first name or last name.
Should I write one get function to get the users for both firstname and lastname?
What is the best practice to write the URL to be handled by the backend?
Should I use the following?
To get user by firstname: localhost:3000/users?firstname=Joe
To get user by lastname: localhost:3000/users?lastname=Terry
And then check what is the parameter in my code and proceed.
In other words, what is the best practice if I want to pass one of multiple parameters to a RESTful API and search by only one parameter?
Should I use content-location header?
There is no single best practice. There are lots of different ways to design a REST interface. You can use a scheme that is primarily path based such as:
http://myserver.com/query/users?firstname=Joe
Or primarily query parameter based:
http://myserver.com/query?type=users&firstname=Joe
Or, even entirely path based:
http://myserver.com/query/users/firstname/Joe
Only the last scheme dictates that a single search criterion can be passed. That is likely also its main limitation: if you later want to search on more than one parameter, you'd probably need to redesign.
In general, you want to take into account these considerations:
Make a list of all the things you think your REST API will want to do now and possibly in the future.
Design a scheme that anticipates all the things in your above list and feels extensible (you could easily add more things on to it without having to redesign anything).
Design a scheme that feels consistent for all of the different things a client will do with it. For example, there should be a consistent use of path and query parameters. You don't want some parts of your API using exclusively path segments and another part looking like a completely different design that uses only query parameters. An appropriate mix of the two is often the cleanest design.
Pick a design that "makes sense" to people who don't know your functionality. It should read logically and with a good REST API, the URL is often fairly self describing.
So, we can't really make a concrete recommendation on your one URL because it really needs to be considered in the totality of your whole API.
Of the three examples above, without knowing anything more about the rest of what you're trying to do, I like the first one because it puts what feels to me like the action into the path /query/users and then puts the parameters to that action into the query string and is easily extensible to add more arguments to the query. And, it reads very clearly.
There are clearly many different ways to successfully design and structure a REST API so there is no single best practice.
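To make the "check which parameter was passed" idea concrete, here is a minimal, framework-free Python sketch of the first scheme (resource in the path, criteria in the query string). The sample users and the set of allowed filter names are made up for illustration; in a real MEAN application the same logic would live in the route handler and query the database instead of a list.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical in-memory data standing in for the users collection.
USERS = [
    {"firstname": "Joe", "lastname": "Terry"},
    {"firstname": "Joe", "lastname": "Smith"},
    {"firstname": "Ann", "lastname": "Terry"},
]

# Only these query parameters are treated as search criteria.
ALLOWED_FILTERS = {"firstname", "lastname"}


def find_users(url, users=USERS):
    """Return the users matching every recognised filter in the URL's query string."""
    query = parse_qs(urlparse(url).query)
    filters = {key: values[0] for key, values in query.items() if key in ALLOWED_FILTERS}
    return [u for u in users if all(u[key] == value for key, value in filters.items())]


print(find_users("http://localhost:3000/users?firstname=Joe"))
print(find_users("http://localhost:3000/users?lastname=Terry"))
print(find_users("http://localhost:3000/users?firstname=Joe&lastname=Terry"))
```

One handler covers both cases, and supporting another criterion later only means adding its name to the allowed set rather than adding a new URL.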

Solr - Enriching the TermsComponent answer

I'm using Solr 3.5.0 (with WebSphere Commerce). While performing a search, Commerce uses the suggestion tool to suggest (auto-complete) search terms based on the letters already typed in the search box.
Currently WebSphere Commerce is using Solr's TermsComponent. But one of my new requirements is to be able to enrich the list of suggested terms.
Do you know if there is any way to do that by creating a plain text dictionary, using another Solr component, ... ?
Thanks for reading,
and for your help.
Regards,
Dekx.
I think a plain-text dictionary probably wouldn't be a usable data source (even if you could use it, searching linearly through a plain-text file would probably be too slow). If you create an index from your dictionary, you could probably incorporate it in the TermsComponent as a shard (see the TermsComponent documentation, under the heading "Distributed Search Support").
I don't believe TermsComponent supports searching multiple fields, so you'll want to make sure the same field name is used for the terms in the dictionary that you want to use (that is, if you are looking at the "name" field in the index, then create a "name" field in your indexed dictionary as well, rather than a "dictionaryentry" field).
To my mind, though, I fail to see what the value of this would be. Generally, the component is intended to look at the terms available in the index on that field. "Enriching" it with more data would just provide suggestions that the search won't actually be able to find. Of course, I don't really know about your search implementation, but in most cases, that would certainly be my thought.

CouchDB a fit for a query like this?

I have been looking for an escape from GAE as the datastore does not support a lot of the things I want to do with it.
So I have looked at CouchDB (among others) and I really like the REST interface and the hosting option I found at Cloudant.
But for all my googling and reading any docs I could find, I still am not sure if it is a good fit.
So I come here in the hope that someone might have more insight.
I write web apps and a lot of the projects I want to do will involve a query that looks like this:
Find all entries that are within a user-input-lat/long bounding box and where start-time is less than user-input-time-1 and end-time is greater than user-input-time-2 and has all tags in user-input-list-of-tags.
That's not even pseudocode, but I hope it makes sense anyway.
I am not just looking for a "You cannot do that in CouchDB". Some kind of explanation and perhaps something like "If you can live without the tags then you can do this:"
I would like to use the Cloudant service so GeoCouch is apparently out of the question, but they do something that should work like lucene, but does that mean the queries are slow?
As you can tell, I am a bit confused here, so just do your best to straighten me out and I'll be grateful :)
Not mentioning the tags (which in itself is already a problem), what you describe is a multi-dimensional query : you have several "coordinates" (lat, long, start-time, end-time) and provide a range for each of these coordinates.
On its own, CouchDB cannot perform multi-dimensional queries at all — you only get single-dimension queries across one coordinate.
Tags certainly are possible, but it depends on whether you need documents that have at least one tag in the list, or documents that have all tags in the list. The first case is easy (run one query per tag using the bulk API), the second might require excessive amounts of memory (if a document has N tags, it needs to emit 2^N - 1 tag-sets in order to match all possible tag combinations involving it, so you should place an upper bound on either the number of tags in a document, or the number of tags in a query).
Lucene does allow multi-dimensional and keyword-based queries, though I cannot vouch for their performance.
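To see why the "all tags" case grows so quickly, the Python sketch below enumerates the tag-sets a single document would have to emit as view keys: every non-empty subset of its tags, which is 2^N - 1 sets for N tags. It only illustrates the counting argument and is not CouchDB view code; the tag names are made up.

```python
from itertools import combinations


def tag_sets(tags):
    """Return every non-empty subset of a document's tags, each in sorted order."""
    ordered = sorted(set(tags))
    return [
        combo
        for size in range(1, len(ordered) + 1)
        for combo in combinations(ordered, size)
    ]


keys = tag_sets(["food", "music", "outdoor"])
print(len(keys))  # 7, i.e. 2**3 - 1
for n in (5, 10, 15):
    print(n, len(tag_sets([f"tag{i}" for i in range(n)])))
```

With 10 tags a document already needs 1023 keys and with 15 it needs 32767, which is why an upper bound on tags per document (or per query) is needed.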

cakephp filter index pages according to foreign keys

I'm pretty new to CakePHP and was missing a crucial feature not generated as scaffold: filtering.
What do I have to do to provide dropdowns or multi-selects on the index pages for each field that is a (foreign) key, thereby allowing to filter the table ("OR" inside multi-select, "AND" between different multi-selects, if any)?
From what my web search has shown me, many more people are trying to accomplish the same thing, but I couldn't find anything that would work for me: either they have text fields and do wildcard filtering, or the plugins they propose only work for 1.2 whereas I have now started with 1.3, and so on.
Can someone alleviate the confusion and maybe present some working code or direct me to the definitive guide[tm] where this matter has been solved?
Thx
It seems to me that scaffolding is provided as-is. If you find any helper accomplishing this, many would be interested, I'm sure. But scaffolding is not really meant to satisfy such "complex" requirements; it just lists the rows in the model.
That said, it shouldn't be difficult to program what you want: running cake from the console can generate all the code that scaffolding provides, so you only have to add your filters.
I think you want this one. The author of the filter wrote:
Filters hasOne and belongsTo relationships (I prefer selects from dropdowns, but to each their own).

Alternative Data Access pattern to Repository

I have certain objects in my domain which are not aggregate roots/entities, yet I still need to retrieve them from a database. I don't want to confuse things by creating repositories for these things. So, what are alternative data access patterns? Would you simply create a DAO for them, while still of course separating the interface?
Edit:
Some more detail on what I'm doing. I need to create a code. This code has certain rules as to its format. One of the rules is that the final character must be a unique number incremented by one from the last code generated. For example:
ABCD1
ABCD2
ABCD3
So, I'm keeping a table with one row, one column to store the number in question. Now, I don't want to consider this number an entity and create a repository for it - that's overkill. I just need a way of retrieving the number, adding 1 to it, and saving it. I know there are myriad ways I could do it, but I'm wondering if there's a customary way.
There are several data access patterns that could apply, in theory. You'd need to provide more detail though if you want us to suggest a specific pattern.
Without more detail, all I can suggest is to consider looking into Martin Fowler's Patterns of Enterprise Application Architecture book.
Edit: Customary way? No, not that I can think of - it really depends on where and how you're using this unique code in your domain. If I were doing this, I'd probably create a small service that speaks directly to the database to perform this function - not as heavy-weight as a repository, and very focused on the problem at hand.
Based on the edit: I would look first at the context in which you need to create that code. Perhaps there are some related entities or something that you are missing.
btw, I find the question really interesting as it comes up from time to time while coding specific features. I usually end up finding I was missing something on the scenario and it ends up fitting well with the normal repository pattern.
After surveying the options I'm going with the Table Gateway pattern.
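For illustration, a table gateway for the one-row counter table could look roughly like the Python sketch below. It uses sqlite3 with an in-memory database only so that it runs as-is; the table name code_sequence, the column last_number, and the class and function names are all invented for the example.

```python
import sqlite3


class CodeSequenceGateway:
    """Small gateway around the single-row table that stores the last number."""

    def __init__(self, conn):
        self.conn = conn
        with self.conn:
            self.conn.execute(
                "CREATE TABLE IF NOT EXISTS code_sequence (last_number INTEGER NOT NULL)"
            )
            count = self.conn.execute("SELECT COUNT(*) FROM code_sequence").fetchone()[0]
            if count == 0:
                # Seed the single row so the first generated number is 1.
                self.conn.execute("INSERT INTO code_sequence (last_number) VALUES (0)")

    def next_number(self):
        """Increment the stored number and return the new value in one transaction."""
        with self.conn:
            self.conn.execute("UPDATE code_sequence SET last_number = last_number + 1")
            return self.conn.execute("SELECT last_number FROM code_sequence").fetchone()[0]


def next_code(gateway, prefix="ABCD"):
    """Build the next code from a fixed prefix and the incremented number."""
    return f"{prefix}{gateway.next_number()}"


gateway = CodeSequenceGateway(sqlite3.connect(":memory:"))
codes = [next_code(gateway) for _ in range(3)]
print(codes)  # ['ABCD1', 'ABCD2', 'ABCD3']
```

The gateway knows only about this one table, so the domain code can depend on a narrow interface (next_number) without treating the counter as an entity. On a multi-user database the increment and read would need whatever locking or atomic update that database provides.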
