There are questions like this on here, but no answers.
I need to implement a feature where the two types of nodes (labelled :Hashtags and :Statements) in my Neo4J 2.0 database can be searched by the users from my Node.Js app.
So that means the users enter something they need into a search field, click search, and get the results. A better scenario is that it's more responsive and finds possible matches on the fly.
How would you implement that?
I have some ideas, but unsure about which one to go for:
Each time the user makes a search, make this kind of Cypher query (not very efficient to query the database so much, I guess, and won't work for responsive results suggestions):
MATCH (h:Hashtag{name:"user_query"}), (s:Hashtag{name:"user_query"}) RETURN h,s;
Install something like Elastic Search and let it handle the search (this is what the guys from Linkurio.us have done)
In the first option the .name property of those labeled nodes is, of course, indexed.
The second option seems to be more robust, but I really would like to avoid having to install extra software and having this kind of dependencies.
Maybe you know of a better solution?
Thank you!
I don't understand why the first option would not be responsive?
After all the Neo4j indexing by default is using Lucene, the same as elastic search?
And with an index (or unique constraint) the lookup should be instant.
Did you actually test the performance? (Make sure to use parameters for the actual value)
Related
I'm trying to create a search feature in Meteor 1.8.1 that does the following:
returns partial matches, e.g. "fish" will find "fish", "fishcake" and "dogfish"
has server-side control of which documents are returned, so search results don't include documents that are not published to the user
is reasonably efficient
returns a limited number of results
This seems like it should be a common requirement, but I'm failing to find any solution.
MongoDB full text search will only return on whole words, so will only find "fish".
Easy search doesn't support server-side permissions, as far as I can tell.
I could try a regex solution but I think it would be expensive?
Thank you for any solutions!
Edit: From discussion it seems that Easy Search does support server-side filtering using a selector, and this would be the best solution. However, I can't get a selector working from the examples and documentation. For clarity, I've created a new question for that issue
The documentation explicitly states that for advanced use cases you may want to use elastic search and offers you a pluggable extension to ease the burden of integration.
https://matteodem.github.io/meteor-easy-search/docs/recipes/#advanced-search
You might wish that a search for cafe returns documents with the text café in them (special character). Or that your search string is split up by whitespace and those terms used to search across multiple fields.
You should consider using a search engine like ElasticSearch for your search if you have these usecases. ElasticSearch allows you to configure precisely how your fields are being searched. One way you can do that is by analyzing your data, so that searching itself is as fast as possible.
I'm using Solr 3.5.0 (with WebSphere Commerce). While performing a search, commerce use the suggestion tool to suggest (auto-complete) search terms regarding the letters already typed on the search box.
Currently WebSphere Commerce is using the Solr's TermsComponent. But one of my new requirement is to be abble to enrich the list of suggested terms.
Do you know is there is any way to do that by creating a plain text dictionary, using an other solr component, ... ?
Thanks for reading,
and for your help.
Regards,
Dekx.
I think a plain-text dictionary probably wouldn't be a usable data source (even if you could use it, search linearly through a plain-text file would probably be too slow). If you create an index from you dictionary, you could probably incorporate it in the TermsComponent as a shard (see the TermsComponent documentation, under the heading "Distributed Search Support").
I don't believe TermsComponent supports searching multiple fields, so you'll want to make sure the same field name is used for the terms in the dictionary that you want to use (that is, if you are looking at the "name" field in the index, then create a "name" field in your indexed dictionary as well, rather than a "dictionaryentry" field)
Just to my mind, though, I fail to understand what the value this would be. Generally, it's intended to look at the terms available in the index on that field. "Enriching" it with more data, would just be providing suggestions that it won't actually be able to find when searching. Of course, I don't really know about your search implementation, but in most cases, that would certainly be my thought.
I am using node-neo4j to communicate with my neo4j. Following github.com/aseemk/node-neo4j-template was a real help to get started. Still learning my way to get things done, I am looking to solve a few issues, I'd appreciate any heads up you give me.
Implement site wide search.
We have users indexed with their email id's, and want to index stories/posts by tags or keywords. How do we search across all nodes, do we maintain indices for all nodes of various types, what would be a good approach? Should I go with google to enable this feature? How to index same node with multiple tags/keywords?
Specify custom id's for nodes
We are fine with integer indices for nodes, but since these id's can be re-used, we would like to identify nodes with unique id's, Is there a way to make neo4j use uuid's, adding an uid attribute would do but want to avoid having to maintain two id's.
Traversing nodes
How do we traverse nodes using node-neo4j, Cipher-lang looks like the answer, I am yet to get used to it. Does node-neo4j help do this out of the box?
Transactions
I may sound silly, but can I do transactional operations with node-neo4j?
Too many questions, I feel most of my doubts would clear once I get more used to querying the db, but any input from you will give me a headstart.
You probably should have broken this up into separate questions. I can answer a couple of them but not all.
Yes, node-neo4j can handle Cypher out of the box, with the query method: https://github.com/thingdom/node-neo4j/blob/develop/lib/GraphDatabase._coffee#L179. Help with Cypher--you should watch this intro video: http://vimeopro.com/neo4j/webinars/video/48603403
For your uuid, you probably should add a separate attribute to the nodes, and have an index on it--just ignore the regular ids except during transient queries where it's more convenient. As far as I know there's no way to override the incrementing ID--that sure would be nice, though.
Hope that helps.
I have been looking for an escape from GAE as the datastore does not support a lot of the things I want to do with it.
So I have looked at CouchDB (among others) and I really like the REST interface and the hosting option I found at Cloudant.
But for all my googling and reading any docs I could find, I still am not sure if it is a good fit.
So I come here in the hope that someone might have more insight.
I write web apps and a lot of the projects I want to do will involve a query that looks like this:
Find all entries that are within a user-input-lat/long bounding box and where start-time is less than user-input-time-1 and end-time is greater than user-input-time-2 and has all tags in user-input-list-of-tags.
Thats not even pseudocode, but I hope it makes sense anyway.
I am not just looking for a "You cannot do that in CouchDB". Some kind of explanation and perhaps something like "If you can live without the tags then you can do this:"
I would like to use the Cloudant service so GeoCouch is apparently out of the question, but they do something that should work like lucene, but does that mean the queries are slow?
As you can tell, I am a bit confused here, so just do your best to straighten me out and I'll be greatfull :)
Not mentioning the tags (which in itself is already a problem), what you describe is a multi-dimensional query : you have several "coordinates" (lat, long, start-time, end-time) and provide a range for each of these coordinates.
On its own, CouchDB cannot perform multi-dimensional queries at all — you only get single-dimension queries across one coordinate.
Tags certainly are possible, but it depends on whether you need documents that have at least one tag in the list, or documents that have all tags in the list. The first case is easy (run one query per tag using the bulk API), the second might require excessive amounts of memory (if a document has N tags, it needs to emit 2N-1 tag-sets in order to match all possible tag combinations involving it, so you should place an upper bound on either the number of tags in a document, or the number of tags in a query).
Lucene does allow multi-dimensional and keyword-based queries, though I cannot vouch for their performance.
for my current project I need to implement a search functionality, that combines both the content search and the user search.
This raises several issues, for example relevance. Nodes have a relevance when searched, and users do not.
Now, I was thinking how to tackle this the best way.
To my knowledge, I have several options :
hook search, and use the do_search function twice, once for nodes, once for users, and combine both results. However, this messes up the paging I presume. This seems my best bet.
hook search, and use the do_search's capability to combine both a node and a user search. do_search can combine 2 queries, but I'm not sure how this exactly works.
hook search, and write to whole stuff by hand, but that I would rather not do.
Any suggestions ? Is there someone who has done this before ?
Many thanks!
content_profile exposes user data in a node so you would just need one search
An option which would not require converting user profiles to nodes is the excellent https://www.drupal.org/project/search_by_page (enable the node and user search components; no need to use its signature paths component)