ElasticSearch and Couchdb view


I'm new to the whole ElasticSearch and CouchDB setup. I just got a river going from ES to a db I have in CouchDB. If I have a view in a db, is there a way to index just that view? For example, I have a db named "Movies" with a view called "Action" and another called "byActor".
I was thinking that I could create an index and point it at the view, like below, but that doesn't seem to work.
{
"type" : "couchdb",
"couchdb" : {
"host" : "localhost",
"port" : 5984,
"db" : "Movies",
"filter" : null
},
"index" : {
"index" : "Action",
"bulk_size" : "100",
"bulk_timeout" : "10ms"
}
}
I think I may not understand what index is exactly, because when I run http://localhost:9200/Movies/Action/_search?pretty=true nothing is returned.
Edit: After looking around more, it seems this isn't the way to do it. Index just seems to be how ES indexes things? Anyway, I'm reading that a mapping might accomplish this. Is that true?

Indexing views is not yet in CouchDb River. See this pull request.

Related

How to do multiple text search using "$text query and $or" in mongodb / mongoose?

Here is the 'Class' model, for which I have created a "text" index on 'keywords', 'lifeArea' and 'type'.
Structure of the model:
{
"_id" : ObjectId("558cf6e3387419850d892712"),
"keywords" : "rama,seetha",
"lifeArea" : [
"Emotional Wellness"
],
"type" : "Pre Recorded Class",
"description" : "ram description",
"synopsis" : "ram syn",
"name" : "ram demo",
"__v" : 0
}
db.Class.getIndexes()
// displaying index
{
"v" : 1,
"key" : {
"_fts" : "text",
"_ftsx" : 1
},
"name" : "classIndex",
"ns" : "innrme.classes",
"weights" : {
"keywords" : 1,
"lifeArea" : 1,
"type" : 1
},
"default_language" : "english",
"language_override" : "language",
"textIndexVersion" : 2
}
I want to do a text search on the fields mentioned above. I tried the following query.
db.classes.find({$or:[{keywords: { $text: { $search: "rama abc" } } }, {type: {$text: { $search: "class" }}}],score: {$meta: 'textScore'}});
But it did not work and I got the following error:
Error: error: {
"$err" : "Can't canonicalize query: BadValue unknown operator: $text",
"code" : 17287
}
Please help me to get the correct query.
Please correct/educate me if I am wrong in asking the question or in explaining the problem

The actual error suggests your MongoDB version is older than 2.6 (so no text search in that form). But you cannot do that anyway, for two reasons:
An $or expression can only have one special index expression, being either "text" or "geospatial" in the arguments.
You are expecting text searches on "two" different fields and you can only have one text index per collection. However that single index can be spread over several fields in the document. But you cannot ask different search terms for different fields.
Documentation quote:
You cannot combine the $text expression, which requires a special text index, with a query operator that requires a different type of special index. For example you cannot combine $text expression with the $near operator.
It should also say "You cannot use $or with a $text expression or the $near operator where either are used in more than one condition." That piece of information is missing from the documentation, but you still cannot do it.
Your syntax is generally not correct, but even with the correct syntax in a supported version of MongoDB you would get an error trying to use $or like this:
Error: error: {
"$err" : "Can't canonicalize query: BadValue Too many text expressions",
"code" : 17287
}
So to resolve this you need:
To have a MongoDB server version of 2.6 or greater that supports the $text syntax ( or live with command forms )
To live with indexing over multiple fields and using a single index.
To execute "separate queries" in place of your "or" conditions and "combine" the results in your client API interface.
That is the only way you get "or" conditions like this with MongoDB text search.
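The "separate queries, combined in the client" step could look roughly like the sketch below. It is only an illustration: mergeById is a hypothetical helper, the two input arrays stand in for the results of two separate queries run through whatever driver you use, and the score field name is an assumption (it matches a textScore projection named score). A document found by both queries is kept once, with the higher score.

```javascript
// Hypothetical helper: combine the results of separate queries on the
// client, since $or cannot hold more than one $text expression.
function mergeById(...resultSets) {
  const merged = new Map();
  for (const results of resultSets) {
    for (const doc of results) {
      const key = String(doc._id);
      const seen = merged.get(key);
      // Keep the copy with the higher text score (assumed field name: score).
      if (!seen || (doc.score || 0) > (seen.score || 0)) {
        merged.set(key, doc);
      }
    }
  }
  // Return the merged documents, highest score first.
  return [...merged.values()].sort((a, b) => (b.score || 0) - (a.score || 0));
}

// Stand-in data; in practice these would come from two find() calls.
const byKeywords = [{ _id: "a", score: 1.1 }, { _id: "b", score: 0.6 }];
const byType = [{ _id: "b", score: 0.9 }, { _id: "c", score: 0.5 }];
const combined = mergeById(byKeywords, byType);
```

Whether to keep the higher score, sum the scores, or ignore them is a design choice that depends on how you want to rank the combined results.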

First of all, I don't think you can use $text in that manner. You first need to create a text index on the collection; then you can use $text, but without specifying any field, because it works on the index, not on individual fields.
Please check here: http://docs.mongodb.org/manual/administration/indexes-text/

Mongodb query slow response time

I'm working on a project that uses flexible schemas. I've set up a local mongodb server and am using mongoose inside node.
I'm having an interesting scaling problem and was wondering if these response times are normal. If a query returns 50 documents, it takes 5-10 seconds for mongo to respond. In the same collection, a query that returns 2 documents takes milliseconds.
It's not a slow connection because it's local, was wondering if anyone had an idea as to what was causing this.
I'm using OS X and mongo 3.0.1
Edit: The documents are nearly empty at the moment, with just one or two properties.
Edit: The total number of documents doesn't really matter, just the returned size. If there are 51 documents, 50 like {_id: "...", _schema:"bar"} and 1 {_id:"...", _schema: "foobar" } then collection.find({_schema:"bar"}) takes several seconds and collection.find({_schema:"foobar"}) takes no time.
Explain output:
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "mean-dev.documentmodels",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [ ]
},
"winningPlan" : {
"stage" : "COLLSCAN",
"filter" : {
"$and" : [ ]
},
"direction" : "forward"
},
"rejectedPlans" : [ ]
},
"serverInfo" : {
"host" : "Sams-MBP.local",
"port" : 27017,
"version" : "3.0.1",
"gitVersion" : "nogitversion"
},
"ok" : 1

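No, it should not take that much time.
The issue is probably in the operations in your query (projections, sorting, geosearch, grouping, etc). Your explain output also shows a COLLSCAN, meaning the whole collection is scanned. The best way to speed up such a query is to create an index.
To create an index on the _schema field, run this command in the mongo shell (ensureIndex is the older name for the same operation; as of 3.0 it is a deprecated alias for createIndex):
db.collection.createIndex({"_schema":1});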
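To check whether an index is actually being used, you can look at the winning plan in the explain output. The sketch below is a small helper for that, written as plain JavaScript so it can be run outside the shell. winningStages is a hypothetical name, the first sample object is trimmed from the explain output in the question, and the second is an illustrative shape for an index-backed plan (a FETCH stage over an IXSCAN), not output from the asker's server. It only follows single inputStage links, so it is a simplification.

```javascript
// Hypothetical helper: list the stage names of the winning plan,
// from the outermost stage down to the innermost one.
function winningStages(explain) {
  const stages = [];
  let stage = explain.queryPlanner.winningPlan;
  while (stage) {
    stages.push(stage.stage);
    stage = stage.inputStage; // child stage, undefined at the leaf
  }
  return stages;
}

// Trimmed from the explain output in the question.
const collscanExplain = {
  queryPlanner: { winningPlan: { stage: "COLLSCAN", direction: "forward" } },
};

// Illustrative shape of a plan that uses an index on _schema.
const indexedExplain = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", keyPattern: { _schema: 1 } },
    },
  },
};

const before = winningStages(collscanExplain);
const after = winningStages(indexedExplain);
```

If the list still contains COLLSCAN after the index is created, the query is not using it.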

MongoDB collection scheme

I am starting to develop an online football management game using NodeJS and MongoDB. But now I don't know: should I use multiple collections, or can I put everything in one? Example:
{
"_id" : ObjectId("5118ee01032016dc02000001"),
"country" : "Aruba",
"date" : "February 11th 2013, 3:11:29 pm",
"email" : "tadad#adadasdsd.com",
"name" : "test",
"pass" : "9WcFwIITRp0e82ca3c3b314a656bfb437553b1d013",
"team" : {
"name" : "teamname",
"logo" : "urltologo",
"color" : "color",
"players" : [{
"name" : "name",
"surname" : "surname",
"tackling" : 58,
"finishing" : 84,
"pace" : 51,
....
}, {
"name" : "name",
"surname" : "surname",
"start_age" : 19,
"tackling" : 58,
"finishing" : 84,
"pace" : 51,
...
}],
"stadium" : {
"name" : "stadium",
"capacity" : 50000,
"pic" : "http://urltopic",
....
},
},
}
or create different collections for users, fixtures, players, teams ? Or any other method ?

When I started with MongoDB, I went by the mantra of 'embed everything', which is exactly what you're doing above. However, there needs to be some consideration for sub-documents that can grow to be very large. You should think about how often you'll be updating a particular document or subdocument as well. For instance, your players are probably going to be updated on a regular basis, so you'd probably want to put them in their own collection for ease of use. Anyway, the flexibility of MongoDB makes it so that there's really no 'right' answer to this problem, but it may help you to refer to the docs on data modeling.
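As a rough sketch of what moving players into their own collection could mean for the document in the question: the function below splits one embedded user document into a user document without the players array and a list of player documents that point back to it. splitPlayers and the user_id field are hypothetical names chosen for illustration, and the sample data is a cut-down version of the example document.

```javascript
// Hypothetical helper: turn one embedded document into a user document
// plus separate player documents that reference the user by _id.
function splitPlayers(user) {
  const { players = [], ...teamWithoutPlayers } = user.team;
  // Each player gets a reference back to its owner (assumed field: user_id).
  const playerDocs = players.map((p) => ({ ...p, user_id: user._id }));
  const userDoc = { ...user, team: teamWithoutPlayers };
  return { userDoc, playerDocs };
}

// Cut-down version of the document from the question.
const embedded = {
  _id: "5118ee01032016dc02000001",
  name: "test",
  team: {
    name: "teamname",
    players: [
      { name: "name", surname: "surname", pace: 51 },
      { name: "name", surname: "surname", pace: 51, start_age: 19 },
    ],
    stadium: { name: "stadium", capacity: 50000 },
  },
};

const { userDoc, playerDocs } = splitPlayers(embedded);
```

With this layout a single player can be updated without rewriting the whole user document, at the cost of a second query when you need the full squad.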

There is no hard and fast rule on how to design schemas in mongo. A lot depends on your application's data access patterns, the frequency of data access and the relationships between different entities: how they shrink, grow or change, and which of them stay intact. It is not feasible to give advice without knowing how your application is supposed to work. I recommend you consult a book, such as MongoDB in Action, which has advice on how to design a schema in mongo properly, taking into account application-specific requirements.

Mongodb increased db.currentOp() issue

My site uses mongodb for its chat application. Mongodb queries are timing out, so I checked db.currentOp(). Below are the currentOp() counts and some details about the mongodb setup:
637 active operations
750 inactive operations
Other details about mongodb:
Mongodb is running with sharding
I have two databases
a) The first database has only two tables
b) The second database has 5 tables
My questions are: why did the currentOp() count increase suddenly, and what should we take care of when the currentOp() count increases? Please help me with this, and apologies for my bad English.
Below is sample output from my currentOp():
MongoDB shell version: 1.8.2
> db.currentOp()
{
"inprog" : [
{
"opid" : "msdata1:234234234",
"active" : true,
"lockType" : "read",
"waitingForLock" : false,
"secs_running" : 43534,
"op" : "getmore",
"ns" : "local.oplog.rs",
"query" : {
},
"client_s" : "70.52.078.123:12345",
"desc" : "conn"
},
{
"opid" : "msdata1:2342323423",
"active" : true,
"lockType" : "read",
"waitingForLock" : false,
"secs_running" : 231231,
"op" : "query",
"ns" : "ichat.chatmemberlist",
"query" : {
"count" : "chatmemberlist",
"query" : {
"Mid" : "23423",
"bmid" : "23423"
}
},
"client_s" : "70.52.078.123:12345",
"desc" : "conn"
},
{
"opid" : "msdata1:2342323423",
"active" : false,
"lockType" : "write",
"waitingForLock" : true,
"op" : "update",
"ns" : "?ichat.useravail",
"query" : {
"Mid" : "23423"
},
"client_s" : "70.512.078234.423:12345",
"desc" : "conn"
},
...
...
...

From the limited amount of info, I can see that your queries are just running for a really long time: "secs_running" : 231231 means 231,231 seconds, which is well over two days. It's likely that you don't have enough resources available for the type of queries that you are running. It could be that you don't have enough memory, or perhaps there are too many queries acquiring a lock. If you're not on MongoDB 2.0.x yet, then you might want to upgrade to that too, as it has vastly improved locking: http://blog.pythonisito.com/2011/12/mongodbs-write-lock.html
I would advise checking the mongodb.log file to see which queries are slow, then using explain to figure out whether you have indexes on the fields being queried, and then either adding indexes or seeing how you can re-design your schema if that looks like a better solution.
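To pick out the long-running operations from output like the one in the question, something along these lines may help. It is a sketch in plain JavaScript: longRunningOps is a hypothetical helper, the 60-second threshold is an arbitrary choice, and the sample object is trimmed from the currentOp() output above.

```javascript
// Hypothetical helper: list active operations that have been running
// for at least minSecs seconds, with the duration converted to hours.
function longRunningOps(currentOp, minSecs) {
  return currentOp.inprog
    .filter((op) => op.active && (op.secs_running || 0) >= minSecs)
    .map((op) => ({
      opid: op.opid,
      ns: op.ns,
      op: op.op,
      hours: Number((op.secs_running / 3600).toFixed(1)),
    }));
}

// Trimmed from the currentOp() output in the question.
const sample = {
  inprog: [
    { opid: "msdata1:234234234", active: true, secs_running: 43534,
      op: "getmore", ns: "local.oplog.rs" },
    { opid: "msdata1:2342323423", active: true, secs_running: 231231,
      op: "query", ns: "ichat.chatmemberlist" },
    { opid: "msdata1:2342323423", active: false, waitingForLock: true,
      op: "update", ns: "?ichat.useravail" },
  ],
};

const slow = longRunningOps(sample, 60);
```

Note that the getmore on local.oplog.rs is most likely replication reading the oplog, which is expected to stay open for a long time; the count on ichat.chatmemberlist is the one that looks like a problem.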

Search CouchDB Using ElasticSearch River

I've created a couchDB river (from this elasticsearch example) for elasticsearch with the following code:
curl -XPUT 'localhost:9200/_river/tasks/_meta' -d '{
"type" : "couchdb",
"couchdb" : {
"host" : "localhost",
"port" : 5984,
"db" : "tasks",
"filter" : null
},
"index" : {
"index" : "tasks",
"type" : "tasks",
"bulk_size" : "100",
"bulk_timeout" : "10ms"
}
}'
When I try to search the couchDB data using elasticsearch with this command:
curl -XGET http://localhost:9200/tasks/tasks -d query{"user":"jbattle"}
I get the response:
No handler found for uri [/tasks/tasks] and method [GET][]
I've been searching but have yet to discover a solution to/for this issue.
UPDATE:
I've discovered the proper query is:
curl -XGET 'http://localhost:9200/_river/tasks/_search?q=user:jbattle&pretty=true'
Though, despite no longer receiving an error, I get 0 hits:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"hits" : {
"total" : 0,
"max_score" : null,
"hits" : [ ]
}

Both of your queries are incorrect. The first one is missing the endpoint /_search and the second one is querying index _river instead of index tasks.
The _river index is where your river configuration is stored, not your data. When you configured your river, you specified index tasks, so that is where the documents end up.
So try this instead:
curl -XGET 'http://localhost:9200/tasks/tasks/_search?q=user:jbattle&pretty=true'
Or if that doesn't work, try searching for any docs in tasks/tasks:
curl -XGET 'http://localhost:9200/tasks/tasks/_search?q=*&pretty=true'
clint
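The relationship between the river configuration and the search URL can be sketched as below: the index (and optional type) under the "index" key of the river's _meta document is what you search, not _river. searchUrl is a hypothetical helper, the base URL is the localhost address used in the question, and the query string is URL-encoded, so the colon in user:jbattle appears as %3A.

```javascript
// Hypothetical helper: build a _search URL from a river's _meta document.
// The data lives in the index named under "index", not in _river.
function searchUrl(riverMeta, query, base = "http://localhost:9200") {
  const { index, type } = riverMeta.index;
  // The type segment is optional; without it the whole index is searched.
  const path = type ? `/${index}/${type}/_search` : `/${index}/_search`;
  return `${base}${path}?q=${encodeURIComponent(query)}&pretty=true`;
}

// The "index" part of the river configuration from the question.
const riverMeta = {
  type: "couchdb",
  index: { index: "tasks", type: "tasks" },
};

const typedUrl = searchUrl(riverMeta, "user:jbattle");
const indexOnlyUrl = searchUrl({ index: { index: "tasks" } }, "*");
```

The resulting URLs can be passed to curl -XGET or pasted into a browser.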

The example file you posted got moved to github. These guys give a decent walkthrough of getting couch and elasticsearch to work together.
Unfortunately, the currently accepted answer doesn't work for me. But if I paste something like this in my browser's address bar it works. Notice that there is only one reference to the "tasks" index in the url, not two.
http://localhost:9200/tasks/_search?pretty=true
To do a real search you could try something like this:
http://localhost:9200/tasks/_search?q="hello"&pretty=true
