Why are *.raw fields empty in Kibana? - logstash

With a standard configuration of the ELK stack (deviant/docker-elk)
and the mapping at http://localhost:9200/logstash-alimentaris/_mapping/?pretty=true set to:
{
  "string_fields": {
    "mapping": {
      "fielddata": {},
      "index": "analyzed",
      "omit_norms": true,
      "type": "string",
      "fields": {
        "raw": {
          "ignore_above": 256,
          "index": "not_analyzed",
          "type": "string",
          "doc_values": true
        }
      }
    },
    "match": "*",
    "match_mapping_type": "string"
  }
}
In Kibana all raw fields are empty. What are some ways to investigate what prevents Elasticsearch from filling the raw fields?
One possibility is to create a custom template: Change default mapping of string to "not analyzed" in Elasticsearch.
But it is documented that the raw fields work out of the box, and in many cases it would be better to stick with the default configuration. What are possible solutions or hints?

Kibana deliberately does not show the .raw fields in the Discover pane. They show up only if one selects "show missing fields", and then they are represented as empty.
In visualizations, however, one can and should use the .raw fields in most aggregations.
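As a quick sanity check that the .raw fields really are populated, one can run a terms aggregation against one of them directly in Elasticsearch, bypassing Kibana. A minimal sketch, assuming a logstash-* index containing a host field (both the index pattern and the field name are illustrative):

curl 'http://localhost:9200/logstash-*/_search?pretty' -d '
{
  "size": 0,
  "aggs": {
    "hosts": {
      "terms": {"field": "host.raw"}
    }
  }
}'

If the aggregation returns buckets, the .raw fields are populated and the empty display is purely a Discover artifact.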

Related

Azure Data Factory Copy Activity error mapping JSON to SQL

I have an Azure Data Factory Copy Activity that is using a REST request to Elasticsearch as the Source and attempting to map the response to a SQL table as the Sink. Everything works fine except when it attempts to map the data field that contains the dynamic JSON. I get the following error:
{
  "errorCode": "2200",
  "message": "ErrorCode=UserErrorUnsupportedHierarchicalComplexValue,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=The retrieved type of data JObject with value {\"name\":\"department\"} is not supported yet, please either remove the targeted column or enable skip incompatible row to skip them.,Source=Microsoft.DataTransfer.Common,'",
  "failureType": "UserError",
  "target": "CopyContents_Paged",
  "details": []
}
Here's an example of my mapping configuration:
"type": "TabularTranslator",
"mappings": [
{
"source": {
"path": "['_source']['id']"
},
"sink": {
"name": "ContentItemId",
"type": "String"
}
},
{
"source": {
"path": "['_source']['status']"
},
"sink": {
"name": "Status",
"type": "Int32"
}
},
{
"source": {
"path": "['_source']['data']"
},
"sink": {
"name": "Data",
"type": "String"
}
}
],
"collectionReference": "$['hits']['hits']"
}
The JSON in the data object is dynamic, so I'm unable to do an explicit mapping for the nested fields within it. That's why I'm trying to store the entire JSON object under data in a single column of the SQL table.
How can I adjust my mapping configuration to allow this to work properly?
I posted this question on the MSDN forums, and I was told that if you are using a tabular sink you can set the option "mapComplexValuesToString": true and it should allow complex JSON properties to be mapped correctly. This resolved my ADF copy activity issue.
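For illustration, a sketch of where that flag sits in the translator from the question; only the mapComplexValuesToString line is new, and the mappings are abbreviated:

{
  "type": "TabularTranslator",
  "mapComplexValuesToString": true,
  "mappings": [
    {
      "source": {"path": "['_source']['data']"},
      "sink": {"name": "Data", "type": "String"}
    }
  ],
  "collectionReference": "$['hits']['hits']"
}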
I had the same problem a few days ago. You need to convert your JSON object to a JSON string; that solves the mapping problem (UserErrorUnsupportedHierarchicalComplexValue).
Try it and tell me if it also resolves your error.

How to insert data to bigquery table with custom fields with NodeJS?

I'm using the npm BigQuery module for inserting data into BigQuery. I have a custom field, say params, which is of type RECORD and accepts any int, float, or string value as a key-value pair. How can I insert into such fields?
Looked into this, but could not find anything useful:
https://cloud.google.com/nodejs/docs/reference/bigquery/1.3.x/Table#insert
If I understand correctly, you are asking for a map with values of ANY TYPE, which is not supported in BigQuery.
You can, however, have a map whose values carry type information, using a record like the schema below.
Your insert code then needs to pick the correct *_value field to set (see the sketch after the schema).
{
  "name": "map_field",
  "type": "RECORD",
  "mode": "REPEATED",
  "fields": [
    {
      "name": "key",
      "type": "STRING"
    },
    {
      "name": "int_value",
      "type": "INTEGER"
    },
    {
      "name": "string_value",
      "type": "STRING"
    },
    {
      "name": "float_value",
      "type": "FLOAT"
    }
  ]
}
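A minimal sketch of the corresponding insert using a recent @google-cloud/bigquery client; the dataset and table names are made up, and each map entry sets key plus exactly one typed value column:

const {BigQuery} = require('@google-cloud/bigquery');

const bigquery = new BigQuery();

async function insertRow() {
  const row = {
    // Each entry fills "key" and only the *_value column matching its type.
    map_field: [
      {key: 'retries', int_value: 3},
      {key: 'source', string_value: 'web'},
      {key: 'score', float_value: 0.87}
    ]
  };
  await bigquery.dataset('my_dataset').table('my_table').insert([row]);
}

insertRow().catch(console.error);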

Angular formly: calculate value of one field based on other fields input

JSON configuration:
{
  "moduleconfigs": {
    "create": [
      {
        "key": "Committed",
        "type": "horizontalInput",
        "templateOptions": {
          "label": "Committed"
        }
      },
      {
        "key": "Uncommitted",
        "type": "horizontalInput",
        "templateOptions": {
          "label": "Uncommitted"
        }
      },
      {
        "key": "Line",
        "type": "horizontalInput",
        "templateOptions": {
          "label": "Line"
        }
      },
      {
        "key": "Total",
        "type": "horizontalInput",
        "templateOptions": {
          "label": "Total"
        },
        "expressionProperties": {
          "value": function($viewValue, $modelValue, scope) {
            return scope.model.lineFill + scope.model.uncommitedBPD + scope.model.commitedBPD;
          }
        }
      }
    ]
  }
}
html:
<form>
  <formly-form model="vm.myModel" fields="vm.myFields"></formly-form>
</form>
I am new to angular-formly. I am creating a form using an angular-formly JSON config. The Total field should display the sum of the values provided in the Committed, Uncommitted, and Line fields. I am using expressionProperties, but it is not working.
I'm guessing you've moved on from this issue... however...
You are doing two things wrong.
First (1): They key value in the formly field configuration object is setting the value by that name on the model.
So your first field configuration object is:
{
  "key": "Committed",
  "type": "horizontalInput",
  "templateOptions": {
    "label": "Committed"
  }
},
Then later you try to access that value using the key commitedBPD, so you'll always get undefined.
Basically, formly is setting the value input in that field on the model object under the key Committed; you need to change the key to match.
Second (2): I could be wrong, but I don't think you can use an expression property to set the value like that. Formly will automatically respect value changes on the model, so you're better off putting an onChange on the other formly field configuration objects that does the parsing and addition, something like this:
{
  "key": "Committed",
  "type": "horizontalInput",
  "templateOptions": {
    "label": "Committed",
    "onChange": addTotal
  }
}
...
function addTotal() {
  // You have access to the model here because it's in your controller.
  // NOTE: the parseInput function you'll have to write yourself.
  vm.model.Total = parseInput(vm.model.Committed) + ...
}
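A fuller sketch of those controller-side helpers; parseInput here is the hypothetical parser mentioned in the note above, and the model keys assume the field keys are Committed, Uncommitted, and Line:

// Hypothetical helper: turn a field's raw input into a number, defaulting to 0.
function parseInput(value) {
  var n = parseFloat(value);
  return isNaN(n) ? 0 : n;
}

function addTotal() {
  vm.model.Total =
    parseInput(vm.model.Committed) +
    parseInput(vm.model.Uncommitted) +
    parseInput(vm.model.Line);
}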
All in all, your biggest problem is trying to access the values on the model object with the wrong keys.
Yes, updating the model does not change form.input.value.
The only way I've found is like this:
item.fieldGroup['Import'].formControl.setValue(333.33)

No results when the mapping specifies an index_analyzer for the _all field

With Elasticsearch I have created an index using a custom mapping and a custom set of analyzers; however, I'm not able to search on the _all field.
I'm using these analyzers:
{
  "analysis": {
    "analyzer": {
      "case_insensitive": {
        "type": "custom",
        "tokenizer": "keyword",
        "filter": [
          "lowercase",
          "asciifolding"
        ],
        "char_filter": "punctuation"
      }
    },
    "char_filter": {
      "punctuation": {
        "type": "mapping",
        "mappings": [
          ".=>\\u0020",
          "-=>\\u0020",
          "_=>\\u0020"
        ]
      }
    }
  }
}
and this mapping:
{
  "article": {
    "_all": {
      "enabled": true,
      "store": "yes",
      "index_analyzer": "case_insensitive",
      "search_analyzer": "case_insensitive"
    },
    "properties": {
      "title": {
        "type": "string",
        "index": "analyzed"
      },
      "subtitle": {
        "type": "string",
        "analyzer": "case_insensitive"
      },
      "comment": {
        "type": "string",
        "index": "not_analyzed"
      },
      "review": {
        "type": "string",
        "index": "not_analyzed",
        "include_in_all": false
      }
    }
  }
}
Then I add a document like this:
{
  "title": "This is the story of a wonderful man.",
  "subtitle": "A man goes on vacation in the worst place possible.",
  "comment": "I like the movie very much, however I did not undertand it.",
  "review": "Very well"
}
and I expect 3 of the 4 fields to be included in _all, specifically title, subtitle, and comment.
The analyzer works as follows (tested using the analyze API in Elasticsearch):
"I like the movie very much, however I did not undertand it." -> "i like the movie very much, however i did not undertand it "
"This is the story of a wonderful man." -> "this is the story of a wonderful man "
I expect that, at least when searching on _all with the query "This is the story of a wonderful man.", I should be able to find the document.
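For reference, a minimal form of the search being attempted; the index name articles is illustrative:

curl 'http://localhost:9200/articles/_search?pretty' -d '
{
  "query": {
    "match": {
      "_all": "This is the story of a wonderful man."
    }
  }
}'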
What am I doing wrong?
How is Elasticsearch populating the _all field?
If the title field is to be added to _all, which data is used, and how? Is the output of the analyzer selected for the title field used as input for the analyzer of _all, or is the raw data used?
In other words, is the flow of data
input -> title analyzer -> _all index_analyzer -> _all
or do the two analyzers run in parallel on the raw input:
input -> title analyzer -> title
input -> _all index_analyzer -> _all
Thank you in advance...
Your mapping looks OK to me. The only thing I would try is to set one of the fields explicitly to include_in_all: true and then rerun your query (a sketch follows the quote below).
According to the docs, it may be that, because you are overriding the default value of include_in_all for one of the fields, the default has changed for all the other fields of the object. See the _all documentation.
Relevant text from the documentation is below:
Inclusion in the _all field can be controlled on a field-by-field basis by using the include_in_all setting, which defaults to true. Setting include_in_all on an object (or on the root object) changes the default for all fields within that object.
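A sketch of that suggestion applied to the mapping from the question; only the explicit include_in_all on title is new, and the other properties are omitted for brevity:

{
  "article": {
    "_all": {
      "enabled": true,
      "store": "yes",
      "index_analyzer": "case_insensitive",
      "search_analyzer": "case_insensitive"
    },
    "properties": {
      "title": {
        "type": "string",
        "index": "analyzed",
        "include_in_all": true
      }
    }
  }
}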
UPDATE:
I think I know why it's not working. Here is what I did. First, I removed the custom analysers from the _all field (so it uses the standard analyser). With this I was able to query and get the results as expected. Results were returned for terms that were in any of the document attributes except review. At least this confirms that the general behaviour of _all is correct. Next, to test the analysers, I did a query on the subtitle field with the exact text (as it uses the keyword analyser). This also worked. Then I realised that _all is an aggregated field, which is then analysed.
So the query would have to include all the text from all the fields to match. But again, how do we know in which order they were aggregated :)
This link, _all custom analyser, has some information. Relevant bits extracted below (from Shay):
You don't want to set the analyzer for _all to keyword; _all is an aggregation of all the other fields in the doc, so you would basically treat the whole aggregation of text as a single token.

How to search through data with arbitrary amount of fields?

I have a web-form builder for science events. The event moderator creates a registration form with an arbitrary number of boolean, integer, enum, and text fields.
The created form is used to:
register a new member to the event;
search through registered members.
What is the best search tool for the second task (searching the members of an event)? Is Elasticsearch well suited for this?
I wrote a post about how to index arbitrary data into Elasticsearch and then to search it by specific fields and values. All this, without blowing up your index mapping.
The post is here: http://smnh.me/indexing-and-searching-arbitrary-json-data-using-elasticsearch/
In short, you will need to do the following steps to get what you want:
Create a special index described in the post.
Flatten the data you want to index using the flattenData function:
https://gist.github.com/smnh/30f96028511e1440b7b02ea559858af4.
Create a document with the original and flattened data and index it into Elasticsearch:
{
  "data": { ... },
  "flatData": [ ... ]
}
Optional: use Elasticsearch aggregations to find which fields and types have been indexed (see the sketch after this list).
Execute queries on the flatData object to find what you need.
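A hedged sketch of that optional aggregation step, assuming the index from the post maps flatData as nested with key and type as not-analyzed fields:

{
  "size": 0,
  "aggs": {
    "flat": {
      "nested": {"path": "flatData"},
      "aggs": {
        "keys": {"terms": {"field": "flatData.key"}},
        "types": {"terms": {"field": "flatData.type"}}
      }
    }
  }
}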
Example
Based on your original question, let's assume that the first event moderator created a form with the following fields to register members for the science event:
name string
age long
sex long - 0 for male, 1 for female
In addition to this data, the related event probably has some sort of id, let's call it eventId. So the final document could look like this:
{
  "eventId": "2T73ZT1R463DJNWE36IA8FEN",
  "name": "Bob",
  "age": 22,
  "sex": 0
}
Now, before we index this document, we will flatten it using the flattenData function:
flattenData(document);
This will produce the following array:
[
  {
    "key": "eventId",
    "type": "string",
    "key_type": "eventId.string",
    "value_string": "2T73ZT1R463DJNWE36IA8FEN"
  },
  {
    "key": "name",
    "type": "string",
    "key_type": "name.string",
    "value_string": "Bob"
  },
  {
    "key": "age",
    "type": "long",
    "key_type": "age.long",
    "value_long": 22
  },
  {
    "key": "sex",
    "type": "long",
    "key_type": "sex.long",
    "value_long": 0
  }
]
Then we wrap this data in a document, as shown before, and index it.
Then the second event moderator creates another form that has a new field, a field with the same name and type, and also a field with the same name but a different type:
name string
city string
sex string - "male" or "female"
This event moderator decided that instead of having 0 and 1 for male and female, his form will allow choosing between two strings - "male" and "female".
Let's try to flatten the data submitted by this form:
flattenData({
  "eventId": "F1BU9GGK5IX3ZWOLGCE3I5ML",
  "name": "Alice",
  "city": "New York",
  "sex": "female"
});
This will produce the following data:
[
  {
    "key": "eventId",
    "type": "string",
    "key_type": "eventId.string",
    "value_string": "F1BU9GGK5IX3ZWOLGCE3I5ML"
  },
  {
    "key": "name",
    "type": "string",
    "key_type": "name.string",
    "value_string": "Alice"
  },
  {
    "key": "city",
    "type": "string",
    "key_type": "city.string",
    "value_string": "New York"
  },
  {
    "key": "sex",
    "type": "string",
    "key_type": "sex.string",
    "value_string": "female"
  }
]
Then, after wrapping the flattened data in a document and indexing it into Elasticsearch we can execute complicated queries.
For example, to find members named "Bob" registered for the event with ID 2T73ZT1R463DJNWE36IA8FEN we can execute the following query:
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "flatData",
            "query": {
              "bool": {
                "must": [
                  {"term": {"flatData.key": "eventId"}},
                  {"match": {"flatData.value_string.keyword": "2T73ZT1R463DJNWE36IA8FEN"}}
                ]
              }
            }
          }
        },
        {
          "nested": {
            "path": "flatData",
            "query": {
              "bool": {
                "must": [
                  {"term": {"flatData.key": "name"}},
                  {"match": {"flatData.value_string": "bob"}}
                ]
              }
            }
          }
        }
      ]
    }
  }
}
Elasticsearch automatically detects the field content in order to index it correctly, even if the mapping hasn't been defined previously. So yes, Elasticsearch suits these cases well.
However, you may want to fine-tune this behavior, or the default mapping applied by Elasticsearch may not correspond to what you need: in that case, take a look at the default mapping or, for even further control, the dynamic templates feature.
If you let your end users decide the keys you store things under, you'll have an ever-growing mapping and cluster state, which is problematic.
This case, and a suggested solution, is covered in this article on common problems with Elasticsearch.
Essentially, you want everything that can possibly be user-defined to be a value. Using nested documents, you can have a key field and differently mapped value fields to achieve pretty much the same thing; a mapping sketch is below.
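A minimal mapping sketch of that nested key/value approach; the type and field names (member, attributes, string_value, long_value) are illustrative, not taken from the article:

{
  "mappings": {
    "member": {
      "properties": {
        "attributes": {
          "type": "nested",
          "properties": {
            "key": {"type": "string", "index": "not_analyzed"},
            "string_value": {"type": "string"},
            "long_value": {"type": "long"}
          }
        }
      }
    }
  }
}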
