Elasticsearch shows data as buffer type in Kibana - node.js

I am trying to index a JSON document into Elasticsearch.
It seems to work fine, as it does not give any error.
I indexed the document as below:
await client.index({
  id: fieldId.toString(),
  index: 'project_documents_textfielddata',
  body: {
    FieldId: fieldId,
    DocumentId: documentId,
    Value: fieldData.fieldHTMLText,
  },
  routing: projectId.toString(),
});
But in Elasticsearch Kibana it shows up as a Buffer type, as below (I have truncated the buffer as it was very long):
{
  "_index": "documenttextfile.files",
  "_id": "6252ab411deaba21fd877c26",
  "_version": 1,
  "_score": 1,
  "_routing": "62505a765ff176cd491f1d1e",
  "_source": {
    "id": "6252ab411deaba21fd877c26",
    "Content": {
      "type": "Buffer",
      "data": [
        10,
        // Some extra large binary content removed for convenience
        48,
        56,
        50
      ]
    },
    "id": [
      "6252ab411deaba21fd877c26"
    ],
    "Content.type.keyword": [
      "Buffer"
    ]
  }
}
So how can I see my data as-is (i.e. in JSON format) in Kibana? In many Kibana tutorials I have seen, people are able to see the data as plain text instead of a Buffer.
Or am I doing something wrong while indexing? I basically want to see the data the way I can see it in MongoDB Compass.

Your fieldData.fieldHTMLText field is probably of type Buffer, and you simply need to call fieldData.fieldHTMLText.toString() on it in order to transform the buffer into a string.
PS: the problem has nothing to do with Kibana, which shows you exactly what you're sending to Elasticsearch, i.e. a Buffer. So the problem is more related to your understanding of Node.js data structures (i.e. Buffer vs. string) ;-)
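A minimal sketch of the difference (the variable name `fieldHTMLText` stands in for whatever your data source returns):

```javascript
// A Buffer, as you might get from a file read or a MongoDB binary field
const fieldHTMLText = Buffer.from('<p>Hello world</p>');

// Serializing the Buffer directly produces the { type: 'Buffer', data: [...] }
// shape you are seeing in Kibana
console.log(JSON.stringify(fieldHTMLText));
// => {"type":"Buffer","data":[60,112,62,...]}

// Converting it first yields the plain text you expect to see
console.log(fieldHTMLText.toString('utf8'));
// => <p>Hello world</p>
```

So passing `fieldData.fieldHTMLText.toString()` as `Value` in the index call should store the field as a plain string.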

Related

Arangodb Search view is not consistent with collection

I am using ArangoDB version 3.9.3 and I am experiencing an inconsistency between an ArangoDB collection and the corresponding ArangoSearch view on top of it.
The number of documents found by querying the collection is nearly half of the number of documents found by querying the view.
I am using the following query
RETURN COUNT(FOR n IN collection RETURN n)
which returns 4353.
And the query on view
RETURN COUNT(FOR n IN view SEARCH true RETURN n)
returns 7303.
Because of this inconsistency, the LIMIT operation does not work as expected and queries also return extra results. Moreover, it seems to return old documents as well.
I also tested it on older ArangoDB versions: it happens on 3.7.18, 3.8.7, and the latest 3.9.3. I am using a single-node instance.
My workflow is something like this:
I simply delete and re-create a document (changing a few attributes each time) many times (e.g. 1000 times) in a loop, which leads to the inconsistency in the view.
And this seems to only happen when I have primarySortOrder/storedValues defined.
I create the view on app startup, and it looks like this:
{
  "writebufferSizeMax": 33554432,
  "writebufferIdle": 64,
  "cleanupIntervalStep": 2,
  "commitIntervalMsec": 250,
  "consolidationIntervalMsec": 500,
  "consolidationPolicy": {
    "type": "tier",
    "segmentsBytesFloor": 2097152,
    "segmentsBytesMax": 5368709120,
    "segmentsMax": 10,
    "segmentsMin": 1,
    "minScore": 0
  },
  "primarySortCompression": "none",
  "writebufferActive": 0,
  "links": {
    "FilterNodes": {
      "analyzers": [
        "identity"
      ],
      "fields": {
        "name": {},
        "city": {}
      },
      "includeAllFields": false,
      "storeValues": "none",
      "trackListPositions": false
    }
  },
  "globallyUniqueId": "h9943DEC4CDFB/231",
  "id": "231",
  "storedValues": [],
  "primarySort": [
    {
      "field": "name",
      "asc": true
    },
    {
      "field": "city",
      "asc": true
    }
  ],
  "type": "arangosearch"
}
I am unsure whether I am doing something wrong or whether this is a longstanding bug; either way it seems like a pretty severe inconsistency.
Has anyone else encountered this? And can anyone help?
Thanks

WikiJS GraphQL API returning 2 different IDs for the same page

I am using the docker container flavor of WikiJS with a postgres database, out of the box. No tweaks. I am trying to get the API working and it appears that everything works functionally. However, I'm getting wrong ID values for search.
The following query:
query {
  pages {
    list {
      id
      path
      title
      contentType
      isPublished
      isPrivate
      createdAt
      updatedAt
    }
  }
}
Returns the following result:
...
{
  "id": 41,
  "path": "characters/nicodemus",
  "title": "Nicodemus",
  "contentType": "markdown",
  "isPublished": true,
  "isPrivate": false,
  "createdAt": "2022-07-26T20:52:26.727Z",
  "updatedAt": "2022-08-17T20:31:02.537Z"
},
...
But this query:
query {
  pages {
    search(query: "nicodemus") {
      results {
        id
        title
        path
        locale
      }
    }
  }
}
returns this result:
{
  "id": "53",
  "title": "Nicodemus",
  "path": "characters/nicodemus",
  "locale": "en"
},
The second query, which is much more efficient than grabbing every page every time, returns the page id as "53" (incorrect), while the list query returns the page id as 41 (correct).
I am not very familiar with GraphQL, but I understand the basics, and like I said, everything else seems to work fine. I don't even know where to start debugging this issue.

Fauxton CouchDB doesn't show the data from blockfile

I am using Fabric 2.0 and CouchDB as the state database.
The data in the block file are correct (according to the input I entered via the web app), but when I go to Fauxton (http://localhost:6984/_utils/#database) I can't see the values (e.g. value1: 4, value2: ID, ...).
I only receive this via Fauxton, whereas normally there should also be "_value1": "4", etc.:
{
  "_id": "\u0000key~timestamp\u0000co\u00002021-04-30T08:47:23.961Z\u0000",
  "_rev": "1-210ffdc0",
  "~version": "CgMBBgA=",
  "_attachments": {
    "valueBytes": {
      "content_type": "application/octet-stream",
      "revpos": 1,
      "digest": "md5-n7UoN6woal8ve/9S9DrNTA==",
      "length": 48,
      "stub": true
    }
  }
}
Does anyone have an idea why the data in the block file are correct but don't show up in the right way in Fauxton?
I checked the docker-compose-couch.yaml and it's correct...
If the value cannot be parsed as JSON, it is stored in CouchDB as a binary attachment. The Fauxton UI only shows the digest of the binary value, not the actual binary attachment.
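An illustrative sketch of that decision (this is not Fabric's actual code, just the general rule it applies): if the value bytes parse as a JSON object, they can be stored as document fields; otherwise they are kept as an opaque attachment like the `valueBytes` stub above.

```javascript
// Hypothetical helper illustrating the JSON-vs-attachment rule
function classifyValue(valueBytes) {
  try {
    const parsed = JSON.parse(valueBytes.toString('utf8'));
    if (parsed !== null && typeof parsed === 'object') {
      // Valid JSON object: can be stored as queryable document fields
      return { storeAs: 'fields', value: parsed };
    }
  } catch (e) {
    // Not valid JSON at all
  }
  // Anything else is stored as a binary attachment
  return { storeAs: 'attachment' };
}

console.log(classifyValue(Buffer.from('{"value1":"4"}')).storeAs); // fields
console.log(classifyValue(Buffer.from([0x01, 0x02, 0x03])).storeAs); // attachment
```

So if your chaincode writes the state value as a serialized JSON object, Fauxton will show the fields; if it writes raw bytes or a plain string, you get an attachment stub instead.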

How to read json files with nested categories in node.js

I am using the Perspective API (see http://perspectiveapi.com/) for my Discord application. I send an analyze request and the API returns this:
{
  "attributeScores": {
    "TOXICITY": {
      "spanScores": [
        {
          "begin": 0,
          "end": 22,
          "score": {
            "value": 0.9345592,
            "type": "PROBABILITY"
          }
        }
      ],
      "summaryScore": {
        "value": 0.9345592,
        "type": "PROBABILITY"
      }
    }
  },
  "languages": [
    "en"
  ],
  "detectedLanguages": [
    "en"
  ]
}
I need to get the "value" in "summaryScore" as a number. I searched on Google, but I only found examples of reading values from flat or singly-nested JSON. How can I do that?
Note: Sorry if I asked something really easy or if I slaughtered English. English is not my primary language and I am not very experienced with Node.js.
First you must make sure the object you have received is parsed by Node.js as a JavaScript object (look at this answer for how). Once it is, reading from nested objects or arrays is as easy as:
object.attributeScores.TOXICITY.summaryScore.value
If you look closely at the object and its structure, you can see that the root object (the first {}) contains 3 values: "attributeScores", "languages" and "detectedLanguages".
The field you are looking for lives inside the "summaryScore" object, which is inside the "TOXICITY" object, and so on. Thus you need to traverse the object structure until you reach the value you need.
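Putting that together with the response shown in the question (hard-coded here for illustration, trimmed to the relevant fields):

```javascript
// The parsed API response; if it arrived as a string, run JSON.parse on it first
const response = {
  attributeScores: {
    TOXICITY: {
      summaryScore: { value: 0.9345592, type: 'PROBABILITY' }
    }
  }
};

// Walk down the nesting one property at a time
const score = response.attributeScores.TOXICITY.summaryScore.value;
console.log(score); // 0.9345592

// Optional chaining avoids a crash when an attribute is absent
const missing = response.attributeScores.SPAM?.summaryScore?.value;
console.log(missing); // undefined
```

Note that the value is a probability between 0 and 1, so you want it as a floating-point number, not an integer.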

limit in _source in elasticsearch

This is my _source from ES:
"_source": {
  "queryHash": "query412236215",
  "id": "query412236215",
  "content": {
    "columns": [
      {
        "name": "Catalog",
        "type": "varchar(10)",
        "typeSignature": {
          "rawType": "varchar",
          "typeArguments": [],
          "literalArguments": [],
          "arguments": [
            {
              "kind": "LONG_LITERAL",
              "value": 10
            }
          ]
        }
      }
    ],
    "data": [
      ["apm"],
      ["postgresql"],
      ["rest"],
      ["system"],
      ["tpch"]
    ],
    "query_string": "show catalogs",
    "execution_time": 1979
  },
  "createdOn": "1514269074289"
}
How can I get n records from _source.data?
Let's say _source.data has 100 records; I want only 10 at a time. Also, is it possible to specify an offset for the next 10 records?
Thanks
Take a look at scripting. As far as I know there isn't any built-in solution, because Elasticsearch is primarily built for searching and filtering, with the document store only as a secondary concern.
First, the order in _source is stable, so it's not totally impossible:
When you get a document back from Elasticsearch, any arrays will be in the same order as when you indexed the document. The _source field that you get back contains exactly the same JSON document that you indexed.
However, arrays are indexed (made searchable) as multivalue fields, which are unordered. At search time, you can't refer to "the first element" or "the last element." Rather, think of an array as a bag of values.
However, source filtering doesn't cover this, so you're out of luck with arrays.
Also inner hits won't help you. They do have options for sort, size, and from, but those will only return the matched subdocuments and I assume you want to page freely through all of them.
So your final hope is scripting, where you can build whatever you want. But this is probably not what you want:
Do you really need paging here? Results are transferred in a compressed fashion, so the overhead of paging is probably much larger than transferring the data in one go.
If you do need paging, because your array is huge, you probably want to restructure your documents.
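Given the point above about transfer overhead, the simplest option is usually to fetch the document once and page through the array on the client side. A minimal sketch (field names match the document in the question; `source` stands in for the `_source` of a fetched hit):

```javascript
// Client-side paging over _source.content.data: plain Array.prototype.slice
function pageData(source, offset, size) {
  return source.content.data.slice(offset, offset + size);
}

// The _source shown in the question, trimmed to the relevant fields
const source = {
  queryHash: 'query412236215',
  content: {
    data: [['apm'], ['postgresql'], ['rest'], ['system'], ['tpch']]
  }
};

console.log(pageData(source, 0, 2)); // [ [ 'apm' ], [ 'postgresql' ] ]
console.log(pageData(source, 2, 2)); // [ [ 'rest' ], [ 'system' ] ]
```

This relies on the _source order guarantee quoted above; offset plus size gives you the "next 10 records" behavior without any server-side scripting.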
