Indexing CouchDB using Elasticsearch

Hi, I have installed Elasticsearch version 0.18.7 and configured CouchDB according to these instructions. I am trying to create the river in the following way:
curl -XPUT '10.50.10.86:9200/_river/tasks/_meta' -d '{
"type": "couchdb",
"couchdb": {
"host": "10.50.10.86",
"port": 5984,
"db": "tasks",
"filter": null
},
"index": {
"index": "tasks",
"type": "tasks",
"bulk_size": "100",
"bulk_timeout": "10ms"
}
}'
and got this response:
{
"ok": true,
"_index": "_river",
"_type": "tasks",
"_id": "_meta",
"_version": 2
}
When I try to access the URL
curl -XGET 'http://10.50.10.86:9200/tasks/tasks?q=*&pretty=true'
I get:
{
"error": "IndexMissingException[[tasks] missing]",
"status": 404
}
Please guide me on how to index CouchDB using Elasticsearch.

I'm not sure where es_test_db2 is coming from. What's the output of this?
curl 10.50.10.86:9200/_river/tasks/_status\?pretty=1

Related

Logback Logstash appender: add own field

I need to send application logs from multiple microservices directly to Logstash using the Logstash Logback Encoder. The problem is that Logstash receives the logs like this:
{
"_index": "logstash-2021.01.21-000001",
"_type": "_doc",
"_id": "id",
"_version": 1,
"_score": 1.6928859,
"_source": {
"@timestamp": "2021-01-21T14:13:05.480Z",
"@version": "1",
"message": "message",
"host": "gateway",
"port": 43892
},
"fields": {
"@timestamp": [
"2021-01-21T14:13:05.480Z"
]
},
"highlight": {
"message": [msg]
},
"sort": [ sort ]
}
I need to add a custom field, either in the "fields" section or in the general section. Do you have any idea how I can do this?
You can use the mutate filter in your Logstash configuration file.
For example, in the Logstash configuration file it looks like this:
filter {
mutate { add_field => { "field_name" => "field_value" } }
}

Sorting in Elastic Search, using nested object type

I am trying to get data from Elasticsearch in a Python program. Currently I get the following data back from an Elasticsearch request. I wish to sort the data on rank:type. For example, I want to sort the data by raw_freq, or maybe by score.
What should the query look like?
I believe it will be something using nested query. Help would be very much appreciated.
{
"data": [
{
"customer_id": 108,
"id": "Qrkz-2QBigkG_fmtME8z",
"rank": [
{
"type": "raw_freq",
"value": 2
},
{
"type": "score",
"value": 3
},
{
"type": "pmiii",
"value": 1.584962
}
],
"status": "pending",
"value": "testingFreq2"
}
]
}
Here is a simple example of how you can sort your data:
{
"query": {
"term": {"status": "pending"}
},
"sort": [
{"rank.type.keyword": {"order" : "desc"}}
]
}
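Since the question asks to sort by the value of one particular rank type, a nested sort may be closer to what is wanted. The sketch below builds such a request body in Python. It assumes that rank is mapped with the nested type, that rank.type is a keyword field (use rank.type.keyword if it is a text field with a keyword sub-field), and that the cluster supports the nested sort syntax of Elasticsearch 6.1 and later. None of these are confirmed by the question, and the helper name nested_rank_sort is made up for illustration.

```python
import json


def nested_rank_sort(rank_type, order="desc"):
    """Build a search body that sorts hits by rank.value for one rank.type.

    Assumes `rank` is mapped as a nested field (not confirmed by the question).
    """
    return {
        # Same query as in the answer above.
        "query": {"term": {"status": "pending"}},
        "sort": [
            {
                "rank.value": {
                    "order": order,
                    "nested": {
                        "path": "rank",
                        # Only the rank entry of the requested type
                        # contributes to the sort key.
                        "filter": {"term": {"rank.type": rank_type}},
                    },
                }
            }
        ],
    }


body = nested_rank_sort("raw_freq")
print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent as the body of a _search request; calling nested_rank_sort("score") sorts by the score entries instead.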

Query filter in composer rest server

I'm having problems with queries in composer rest server.
I'm building a filter like this:
{
"where": {
"and": [
{
"origin": "web"
},
{
"affiliate": "resource:org.acme.affiliates.Affiliate#2"
},
{
"createdAt": {
"gte": "2018-01-01"
}
},
{
"createdAt": {
"lte": "2018-06-07"
}
}
]
}
}
request:
curl -X GET --header 'Accept: application/json' 'http://localhost:3000/api/User?filter=%7B%22where%22%3A%7B%22and%22%3A%5B%7B%22origin%22%3A%22web%22%7D%2C%7B%22affiliate%22%3A%22resource%3Aorg.acme.affiliates.Affiliate%232%22%7D%2C%7B%22createdAt%22%3A%7B%22gte%22%3A%222018-01-01%22%2C%22lte%22%3A%222018-06-07%22%7D%7D%5D%7D%7D'
response:
[
{
"$class": "org.acme.affiliates.User",
"affiliate": "resource:org.acme.affiliates.Affiliate#2",
"userId": "14",
"email": "diego@duncan.com",
"firstName": "diego",
"lastName": "duncan",
"createdAt": "2018-04-20T20:48:08.151Z",
"origin": "web"
},
{
"$class": "org.acme.affiliates.User",
"affiliate": "resource:org.acme.affiliates.Affiliate#1",
"userId": "15",
"email": "diego@algo.com",
"firstName": "diego",
"lastName": "algo",
"createdAt": "2018-04-20T20:53:40.720Z",
"origin": "web"
}
]
As you see, filters are not working because Affiliate#1 appears.
I tested without the createdAt filters and it works perfectly; then I tested without the affiliate filter and it works too. I also tested createdAt with range instead of gte and lte, with the same wrong result.
Environment: hlfv1, composer rest server v0.16.6.
It's a LoopBack filter issue, most likely related to the date range comparison (the other comparisons are fine, as you wrote).
The discussion at https://github.com/strongloop/loopback-connector-mongodb/issues/176 suggests that you need to use the between operator for DateTimes instead, e.g.:
{"where":{"createdAt":{"between": ["2018-01-05 10:00", "2018-05-10 10:00"]}}}
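To try the between operator together with the other conditions, the filter has to be serialized as JSON and percent-encoded into the query string (the # in the affiliate reference in particular must become %23). The following is a minimal Python sketch of that step only. The endpoint is the one from the question, the helper name build_filter_url is made up for illustration, and whether between behaves correctly on composer rest server 0.16.6 is not confirmed here.

```python
import json
from urllib.parse import quote

BASE_URL = "http://localhost:3000/api/User"  # endpoint taken from the question


def build_filter_url(where):
    """Return the GET URL with the LoopBack filter JSON percent-encoded."""
    filter_json = json.dumps({"where": where}, separators=(",", ":"))
    # safe="" also encodes ':' , '/' and '#', which appear inside the filter.
    return BASE_URL + "?filter=" + quote(filter_json, safe="")


where = {
    "and": [
        {"origin": "web"},
        {"affiliate": "resource:org.acme.affiliates.Affiliate#2"},
        # Date range expressed with `between`, as the answer suggests.
        {"createdAt": {"between": ["2018-01-01", "2018-06-07"]}},
    ]
}

url = build_filter_url(where)
print(url)
```

The printed URL can be passed to curl -X GET in place of the hand-encoded one in the question.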
Answering my own question:
I was using hlfv1 and composer 0.16.6.
After updating to hlfv11 and composer 0.19.8, the bug is fixed.

How to create initial Elasticsearch settings when using the river plugin

I am using the river plugin for CouchDB and when I execute the following curl command:
curl -XPUT 'localhost:9200/_river/blog/_meta' -d '{
"type": "couchdb",
"couchdb": {
"host": "localhost",
"port": 5984,
"db": "blog",
"filter": null
},
"index": {
"analysis": {
"analyzer": {
"whitespace": {
"type": "whitespace",
"filter": "lowercase"
},
"ox_edgeNGram": {
"type": "custom",
"tokenizer": "ox_t_edgeNGram",
"filter": [
"lowercase"
]
},
"ox_NGram": {
"type": "custom",
"tokenizer": "ox_t_NGram",
"filter": [
"lowercase"
]
}
},
"tokenizer": {
"ox_t_edgeNGram": {
"type": "edgeNGram",
"min_gram": 2,
"max_gram": 25,
"side": "front"
},
"ox_t_NGram": {
"type": "NGram",
"min_gram": 2,
"max_gram": 25
}
}
}
}
}'
I receive the response:
{
"ok": true,
"_index": "_river",
"_type": "blog",
"_id": "_meta",
"_version": 1
}
The problem is that when I want to view the settings in the browser, I go to:
http://localhost:9200/blog/_settings?pretty=true
The json that is returned is as follows, but I'm expecting information regarding the analyzer etc. that I thought I created.
Returned JSON:
{
"blog": {
"settings": {
"index.number_of_shards": "5",
"index.number_of_replicas": "1"
}
}
}
It should also be noted that when I create a blog index without using the river and run a curl command to input the analysis information, I do receive a response from the browser indicating the settings that I input.
How can I set the default settings of an index when using the River plugin?
To solve this issue:
1. Create a new Elasticsearch index, including its mappings etc.
2. Create a new Elasticsearch river with the name of the index set to that of the index created in step one.
I found the answer here:
http://groups.google.com/a/elasticsearch.com/group/users/browse_thread/thread/5ebf1556d139d5ac/f17e71e04cac5889?lnk=gst&q=couchDB+river+settings#f17e71e04cac5889
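The two steps in that answer could be sketched in Python roughly as follows. This only prepares the two PUT requests and prints them, so it can be inspected without a running cluster. It assumes that the analysis block belongs under a settings key in the index-creation body, and that the river's index section takes index and type names as in the first question on this page. The helper put_json is made up for illustration, and only one analyzer from the question is repeated to keep the example short.

```python
import json
from urllib.request import Request

ES = "http://localhost:9200"  # host and port from the question


def put_json(url, payload):
    """Prepare (but do not send) a PUT request with a JSON body."""
    data = json.dumps(payload).encode("utf-8")
    return Request(url, data=data, method="PUT",
                   headers={"Content-Type": "application/json"})


# Shortened copy of the analysis settings from the question.
analysis = {
    "analyzer": {
        "ox_edgeNGram": {
            "type": "custom",
            "tokenizer": "ox_t_edgeNGram",
            "filter": ["lowercase"],
        },
    },
    "tokenizer": {
        "ox_t_edgeNGram": {
            "type": "edgeNGram", "min_gram": 2, "max_gram": 25, "side": "front",
        },
    },
}

# Step 1: create the index together with its analysis settings.
create_index = put_json(ES + "/blog", {"settings": {"analysis": analysis}})

# Step 2: create the river and point index.index at the index from step 1.
create_river = put_json(ES + "/_river/blog/_meta", {
    "type": "couchdb",
    "couchdb": {"host": "localhost", "port": 5984, "db": "blog", "filter": None},
    "index": {"index": "blog", "type": "blog"},
})

for req in (create_index, create_river):
    print(req.get_method(), req.full_url)
```

To actually send them, pass each request to urllib.request.urlopen in this order, then check http://localhost:9200/blog/_settings?pretty=true again.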
You can try this URL: http://localhost:9200/blog/_mapping?pretty=true
In the returned mapping, if the analyzer is not explicitly mentioned for a field, the default analyzer is used.

CouchDB, Elastic Search, and River Plugin not operating correctly

I'm trying to get ElasticSearch to work, specifically with the River Plugin. For some reason I just can't get it to work. I've included the procedure I'm using to try and do it, found here:
curl -XDELETE 'http://localhost:9200/_all/'
Response:
{
"ok": true,
"acknowledged": true
}
This is so I know I'm working with an empty set of elasticsearch instances.
I have an existing database called test, and the river plugin has already been installed. Is there any way to test and confirm that the River plugin is installed and running?
I issue the following command:
curl -XPUT 'http://localhost:9200/_river/my_index/_meta' -d '{
"type" : "couchdb",
"couchdb" : {
"host" : "localhost",
"port" : 5984,
"db" : "my_couch_db",
"filter" : null
}
}'
my_couch_db is a real database, I see it in Futon. There is a document in it.
Response:
{
"ok": true,
"_index": "_river",
"_type": "my_index",
"_id": "_meta",
"_version": 1
}
Now at this point, my understanding is that Elasticsearch should be working as I saw in the tutorial.
I try to query, just to find anything. I go to
http://localhost:9200/my_couch_db/my_couch_db.
Response:
No handler found for uri [/my_couch_db/my_couch_db] and method [GET]
What's weird is when I go to
localhost:5984/my_couch_db/__changes
I get
{
"error": "not_found",
"reason": "missing"
}
Anyone have any idea what part of this I'm screwing up?
You wrote: I try to query, just to find anything. I go to http://localhost:9200/my_couch_db/my_couch_db.
Try adding /_search (with optional ?pretty=true) at the end of your curl -XGET, like so:
C:\>curl -XGET "http://localhost:9200/my_couch_db/my_couch_db/_search?pretty=true"
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 10,
"successful": 10,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1.0,
"hits": [
{
"_index": "my_couch_db",
"_type": "my_couch_db",
"_id": "a2b52647416f2fc27684dacf52001b7b",
"_score": 1.0,
"_source": {
"_rev": "1-5e4efe372810958ed636d2385bf8a36d",
"_id": "a2b52647416f2fc27684dacf52001b7b",
"test": "hello"
}
}
]
}
}
You also wrote: What's weird is when I go to localhost:5984/my_couch_db/__changes I get {"error":"not_found","reason":"missing"}
Try removing one of the underscores from __changes; that should work, like so:
C:\>curl -XGET "http://localhost:5984/my_couch_db/_changes"
{
"results": [
{
"seq": 1,
"id": "a2b52647416f2fc27684dacf52001b7b",
"changes": [
{
"rev": "1-5e4efe372810958ed636d2385bf8a36d"
}
]
}
],
"last_seq": 1
}
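Both fixes come down to getting the URL right, which can be captured in a small Python helper. This is only a sketch: the function name river_urls is made up, and it relies on what the output above shows, namely that the index and type are both named after the CouchDB database when the river is created without an index section.

```python
from urllib.parse import quote


def river_urls(db, es="http://localhost:9200", couch="http://localhost:5984"):
    """Return the search URL and the CouchDB changes-feed URL for one database."""
    name = quote(db, safe="")
    return {
        # Index and type are both named after the database in the output above.
        "search": f"{es}/{name}/{name}/_search?pretty=true",
        # CouchDB's changes feed uses a single leading underscore.
        "changes": f"{couch}/{name}/_changes",
    }


urls = river_urls("my_couch_db")
print(urls["search"])
print(urls["changes"])
```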
