FIWARE NGSI-LD Orion-LD context broker doesn't return all entities of a type as expected (GET)

I filled a FIWARE Orion-LD with about 90,000 addresses (IDs like 'urn:ngsi-ld:address:Testtown:12345:Haus-Knipp-Str:15').
If I read all entities of type "Address" from the Orion-LD broker page by page (limit 1000, increasing offset), e.g. "http://{{orion}}/ngsi-ld/v1/entities?type=Address&offset=0&limit=1000", I always receive the expected answer, until I request
"http://{{orion}}/ngsi-ld/v1/entities?type=Address&offset=25000&limit=1000".
Then the unexpected answer is "[]". It should contain 1000 entities.
If I delete all entities of type "Address" stepwise in batches of 1000 (each time I request the first 1000 entities and delete them afterwards), I am able to delete more than 98,000 entities. So all 98,000 entities were in Orion-LD, but I wasn't able to read them all.
Trying to find the last readable entity: "http://{{orion}}/ngsi-ld/v1/entities?type=Address&offset=25248&limit=1" still works, while the next request, "http://{{orion}}/ngsi-ld/v1/entities?type=Address&offset=25249&limit=1", fails. Every request for entities beyond offset 25248 fails.
To check whether all 98,000 IDs of type "Address" are in Orion-LD, I request each of them via "http://{{orion}}/ngsi-ld/v1/entities/{{id}}" and receive the expected answer for every ID. So all IDs are in Orion-LD; only the paginated read fails, starting at offset > 25248.
Is there a bug in Orion-LD, or an unknown threshold that needs to be handled another way?
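As an aside (not from the original post), here is a minimal sketch of the paginated read described above; the broker URL is a placeholder, and it uses count=true with the NGSILD-Results-Count header (as seen in the answer below) to compare the total the broker reports against what paging actually returns:

import requests

BROKER = "http://orion:1026"   # assumption: replace with your {{orion}} host/port
LIMIT = 1000

offset, fetched, reported = 0, 0, None
while True:
    r = requests.get(f"{BROKER}/ngsi-ld/v1/entities",
                     params={"type": "Address", "offset": offset,
                             "limit": LIMIT, "count": "true"})
    r.raise_for_status()
    page = r.json()
    reported = r.headers.get("NGSILD-Results-Count")  # total the broker claims
    fetched += len(page)
    if len(page) < LIMIT:                             # last (possibly empty) page
        break
    offset += LIMIT

print(f"broker reports {reported} entities, paging returned {fetched}")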
Today I created 100,000 entities of type "TestTestTest" like:
{
"#context": "https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld",
"id": "TestTestTest:20000",
"type": "TestTestTest",
"name": {
"type": "Property",
"value": 20000
}
},
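(Not part of the original post: a rough sketch of how such a batch of entities could be loaded, assuming the NGSI-LD batch create endpoint /ngsi-ld/v1/entityOperations/create, a placeholder broker URL, and Content-Type application/ld+json because the @context travels in the payload.)

import requests

BROKER = "http://orion:1026"   # assumption: replace with your {{orion}} host/port
CTX = "https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld"

def make_entity(i):
    # Entity shaped like the example above
    return {"@context": CTX,
            "id": f"TestTestTest:{i}",
            "type": "TestTestTest",
            "name": {"type": "Property", "value": i}}

BATCH = 1000
for start in range(0, 100000, BATCH):
    batch = [make_entity(i) for i in range(start, start + BATCH)]
    r = requests.post(f"{BROKER}/ngsi-ld/v1/entityOperations/create",
                      json=batch,
                      headers={"Content-Type": "application/ld+json"})
    r.raise_for_status()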
When I try to read all entities of type "TestTestTest" again, I find another threshold: 87637.
read all entities for type TestTestTest with offset=86000 and limit=1000 --> successful: read 1000 values
read all entities for type TestTestTest with offset=87000 and limit=1000 --> error: read 0 values
read all entities for type TestTestTest with offset=87000 and limit=500 --> successful: read 500 values
read all entities for type TestTestTest with offset=87500 and limit=500 --> error: read 0 values
read all entities for type TestTestTest with offset=87500 and limit=250 --> error: read 0 values
read all entities for type TestTestTest with offset=87500 and limit=125 --> successful: read 125 values
read all entities for type TestTestTest with offset=87625 and limit=125 --> error: read 0 values
read all entities for type TestTestTest with offset=87625 and limit=62 --> error: read 0 values
read all entities for type TestTestTest with offset=87625 and limit=31 --> error: read 0 values
read all entities for type TestTestTest with offset=87625 and limit=15 --> error: read 0 values
read all entities for type TestTestTest with offset=87625 and limit=7 --> successful: read 7 values
read all entities for type TestTestTest with offset=87632 and limit=7 --> error: read 0 values
read all entities for type TestTestTest with offset=87632 and limit=3 --> successful: read 3 values
read all entities for type TestTestTest with offset=87635 and limit=3 --> successful: read 3 values
read all entities for type TestTestTest with offset=87638 and limit=3 --> error: read 0 values
read all entities for type TestTestTest with offset=87638 and limit=1 --> error: read 0 values
try to read the last readable entity of type TestTestTest with offset=87637 and limit=1 --> id=TestTestTest:87637
try to read an entity after the last readable one with offset=87638 and limit=1 --> []
ERROR: the requested number of entities (iOffset=87637) does not match the expected count.
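For what it's worth, the manual bisection above can be automated with something like the following sketch (not from the original post; it assumes the same placeholder broker URL and that every offset up to the threshold returns at least one entity):

import requests

BROKER = "http://orion:1026"   # assumption: replace with your {{orion}} host/port
TYPE = "TestTestTest"
TOTAL = 100000                 # number of entities that were created

def readable(offset):
    # True if the broker returns at least one entity at this offset
    r = requests.get(f"{BROKER}/ngsi-ld/v1/entities",
                     params={"type": TYPE, "offset": offset, "limit": 1})
    r.raise_for_status()
    return len(r.json()) > 0

lo, hi = 0, TOTAL - 1          # invariant: offset lo is readable
while lo < hi:
    mid = (lo + hi + 1) // 2
    if readable(mid):
        lo = mid
    else:
        hi = mid - 1

print("last readable offset:", lo)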
I use the following version of Orion-LD in a Docker environment:
{
"orionld version": "post-v0.8.1",
"orion version": "1.15.0-next",
"uptime": "7 d, 12 h, 2 m, 55 s",
"git_hash": "nogitversion",
"compile_time": "Wed Dec 15 15:12:50 UTC 2021",
"compiled_by": "root",
"compiled_in": "",
"release_date": "Wed Dec 15 15:12:50 UTC 2021",
"doc": "https://fiware-orion.readthedocs.org/en/master/"
}
It would be nice if you could help with this case.
Thanks, Knigge

Yes, this really seems to be a bug in Orion-LD.
As it's a bug, we need it as an issue on the GitHub repo.
Could you file an issue here?
(That way you'll be informed of updates.)

So, I filled my DB with 27,000 entities and tested:
ldcurl GET "/entities?type=T&offset=26000&count=true&limit=1000"
(ldcurl is a tiny script of mine that adds the host, port, Accept header, etc.)
I had no problem; I got this output for the sample test with offset 26000:
HTTP/1.1 200 OK
Connection: Keep-Alive
Content-Length: 138003
Content-Type: application/json
Link: <https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld>; rel="http://www.w3.org/ns/json-ld#context"; type="application/ld+json"
NGSILD-Results-Count: 27000
Date: Thu, 27 Jan 2022 19:31:14 GMT
[
{
"id": "urn:ngsi-ld:entities:E26001",
"type": "T",
"A1": {
"type": "Property",
"value": "E26002:A1"
}
},
{
"id": "urn:ngsi-ld:entities:E26002",
"type": "T",
"A1": {
"type": "Property",
"value": "E26003:A1"
}
},
...
I tested with what's currently in the develop branch and also with what's in release/1.0.0.
Look me up on Skype (same user name as here on Stack Overflow); perhaps we can look at your problem together, as I don't seem to be able to reproduce it myself.
Meanwhile, I'll make my entities more similar to yours and see if I have more luck that way.

Related

Mcprotocol - Getting Error code - BAD 255

We're using the Node.js mcprotocol library to poll periodic data from a Mitsubishi Electric FX5U-64M PLC every 1 minute and save it to a MongoDB collection.
We've added one tag named Tag_1 with the following configuration:
{
'tagname' : 'Tag_1',
'type' : 'number',
'address' : 'D10',
'tagsize' : 2
}
But when we read the tag value, we get an error that something went wrong writing values, with a status code and actual data of BAD 255.
Please refer to the attached image.
Thank you.
So far we've tried changing the address to test another tag of the D series, but the BAD 255 code continues.

Lowercasing complex object field names in azure data factory data flow

I'm trying to lowercase the field names in a row entry in an Azure data flow. Inside a complex object I've got something like:
{
"field": "sample",
"functions": [
{
"Name": "asdf",
"Value": "sdfsd"
},
{
"Name": "dfs",
"Value": "zxcv"
}
]
}
and basically what I want is for "Name" and "Value" to be "name" and "value". However, I can't seem to find any expression that works for the nested fields of a complex object in the expression builder.
I've tried using something like a Select transformation with a rule-based mapping where the rule is 1 == 1 and lower($$), but $$ seems to work only on the root columns of the complex object, not on the nested fields inside.
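(For clarity, this is the transformation being asked for, shown here as a small Python sketch rather than ADF data flow syntax; the ADF answer follows below.)

def lowercase_keys(value):
    # Recursively lowercase field names in nested dicts/lists
    if isinstance(value, dict):
        return {k.lower(): lowercase_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [lowercase_keys(v) for v in value]
    return value

row = {"field": "sample",
       "functions": [{"Name": "asdf", "Value": "sdfsd"},
                     {"Name": "dfs", "Value": "zxcv"}]}
print(lowercase_keys(row))
# {'field': 'sample', 'functions': [{'name': 'asdf', 'value': 'sdfsd'}, {'name': 'dfs', 'value': 'zxcv'}]}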
As suggested by @Mark Kromer MSFT, to change the case of columns inside a complex type, select the columns at the Hierarchy level in the rule-based mapping.
Here, I have used both the root level and the Hierarchy level, and you can see the difference in the results.

How to show data properly in Office Excel Using Power Query Editor?

I have the JSON output below from an API; in Office Excel I am importing the data via Web from the API.
[{
"level": 1,
"children": [{
"level": 2,
"children": [{
"level": 3,
"name": "Chandni Chowk",
"data": ["Data 1", "Data 2"]
}],
"name": "Delhi",
"data": ["Delhi Area"]
}],
"name": "Country",
"data": ["India", "Bangladesh"]
}]
https://learn.microsoft.com/en-us/powerquery-m/quick-tour-of-the-power-query-m-formula-language
I have gone through the documentation above.
let
Source = Json.Document(Web.Contents("MY API URL GOES HERE")),
AsTable = Table.FromRecords(Source)
----
----
in
#"Renamed Column2"
This is what I have in the Power Query editor for now.
As output in the Excel file, I need something like this:
Country      Delhi        Chandni Chowk
India        Delhi Area   Data 1
Bangladesh                Data 2
Can I get this data from this JSON, or do I need to change my JSON output format to match Power Query?
Power Query interprets JSON as a hierarchy of records and lists. My goal is to flatten the JSON into a record like this and then convert it into a table:
Country : {"India", "Bangladesh"}
Delhi : {"Delhi Area"}
Chandni Chowk : {"Data 1", "Data 2"}
At any particular level, we can pull the name and data value using Record.FromList:
Record.FromList({CurrentLevel[data]}, {CurrentLevel[name]})
For the first level, this is
Record.FromList({{"India","Bangladesh"}}, {"Country"})
which corresponds to the first field in the goal record.
At any level, we can navigate to the next level like this:
NextLevel = CurrentLevel[children]{0}
Using these two building blocks, we can now write a custom function Expand to flatten the record:
1 | (R as record) as record =>
2 | let
3 | ThisLevel = Record.FromList({R[data]}, {R[name]}),
4 | CombLevel = if Record.HasFields(R, {"children"})
5 | then Record.Combine({ThisLevel, @Expand(R[children]{0})})
6 | else ThisLevel
7 | in
8 | CombLevel
Line 1: The syntax for defining a function. It takes a record R and returns a record after doing some transformations.
Line 3: How to deal with the current level, as mentioned earlier.
Line 4: Check if the record has another level to expand down to.
Line 5: If it does, then Record.Combine the result of the current level with the result of the next level, where the result of the next level is calculated by navigating to the next level and recursively applying the function we're defining. With three levels this looks like:
Record.Combine({Level1, Record.Combine({Level2, Level3})})
Line 6: Recursion stops when there are no more levels to expand. No more combinations, just the last level is returned.
All that's left is to transform it into the shape we want. Here's what my query looks like using the Expand function we just defined:
let
Source = Json.Document( < JSON Source > ),
ExpandRecord = Expand(Source{0}),
ToTable = Table.FromColumns(
Record.FieldValues(ExpandRecord),
Record.FieldNames(ExpandRecord)
)
in
ToTable
This uses Record.FieldValues and Record.FieldNames as arguments to Table.FromColumns.
The step after using the Expand custom function looks like this in the query editor if you select the first list cell:
The final result is what you asked for:

How do I keep existing data in couchbase and only update the new data without overwriting

Say I have created some records/documents under a bucket, and the user updates only one column out of 10 in the RDBMS, so I am trying to send only that one column's data and update it in Couchbase. The problem is that Couchbase overwrites the entire record and puts NULLs in the rest of the columns.
One approach is to fetch the existing record from Couchbase, copy all of its data, and overwrite only the new column while copying the rest from the old record. But that doesn't look like an optimal approach.
Any suggestions?
You can use N1QL UPDATE statements (google for "Couchbase N1QL").
UPDATE replaces a document that already exists with updated values.
update:
UPDATE keyspace-ref [use-keys-clause] [set-clause] [unset-clause] [where-clause] [limit-clause] [returning-clause]
set-clause:
SET path = expression [update-for] [ , path = expression [update-for] ]*
update-for:
FOR variable (IN | WITHIN) path (, variable (IN | WITHIN) path)* [WHEN condition ] END
unset-clause:
UNSET path [update-for] (, path [ update-for ])*
keyspace-ref: Specifies the keyspace of the document to update. You can add an optional namespace-name to the keyspace-name in this way: namespace-name:keyspace-name.
use-keys-clause: Specifies the keys of the data items to be updated. Optional. Keys can be any expression.
set-clause: Specifies the value for an attribute to be changed.
unset-clause: Removes the specified attribute from the document.
update-for: Uses the FOR statement to iterate over a nested array and SET or UNSET the given attribute for every matching element in the array.
where-clause: Specifies the condition that needs to be met for data to be updated. Optional.
limit-clause: Specifies the greatest number of objects that can be updated. This clause must have a non-negative integer as its upper bound. Optional.
returning-clause: Returns the data you updated as specified in the result_expression.
RBAC Privileges
The user executing the UPDATE statement must have the Query Update privilege on the target keyspace. If the statement has any clauses that need to read data, such as a SELECT clause or a RETURNING clause, then the Query Select privilege is also required on the keyspaces referred to in the respective clauses. For more details about user roles, see Authorization.
For example,
To execute the following statement, the user must have the Query Update privilege on `travel-sample`:
UPDATE `travel-sample` SET foo = 5
To execute the following statement, the user must have the Query Update privilege on `travel-sample` and the Query Select privilege on `beer-sample`:
UPDATE `travel-sample`
SET foo = 9
WHERE city = (SELECT RAW city FROM `beer-sample` WHERE type = "brewery")
To execute the following statement, the user must have the Query Update privilege on `travel-sample` and the Query Select privilege on `travel-sample`:
UPDATE `travel-sample`
SET city = "San Francisco"
WHERE lower(city) = "sanfrancisco"
RETURNING *
Example
The following statement changes the "type" of the product "odwalla-juice1" to "product-juice".
UPDATE product USE KEYS "odwalla-juice1" SET type = "product-juice" RETURNING product.type
"results": [
{
"type": "product-juice"
}
]
This statement removes the "type" attribute from the "product" keyspace for the document with the key "odwalla-juice1".
UPDATE product USE KEYS "odwalla-juice1" UNSET type RETURNING product.*
"results": [
{
"productId": "odwalla-juice1",
"unitPrice": 5.4
}
]
This statement unsets the "gender" attribute in the "children" array for the document with the key "dave" in the tutorial keyspace.
UPDATE tutorial t USE KEYS "dave" UNSET c.gender FOR c IN children END RETURNING t
"results": [
{
"t": {
"age": 46,
"children": [
{
"age": 17,
"fname": "Aiden"
},
{
"age": 2,
"fname": "Bill"
}
],
"email": "dave#gmail.com",
"fname": "Dave",
"hobbies": [
"golf",
"surfing"
],
"lname": "Smith",
"relation": "friend",
"title": "Mr.",
"type": "contact"
}
}
]
Starting with version 4.5.1, the UPDATE statement has been improved to SET nested array elements. The FOR clause is enhanced to evaluate functions and expressions, and the new syntax supports multiple nested FOR expressions to access and update fields in nested arrays. Additional array levels are supported by chaining the FOR clauses.
Example
UPDATE default
SET i.subitems = ( ARRAY OBJECT_ADD(s, 'new', 'new_value' )
FOR s IN i.subitems END )
FOR s IN ARRAY_FLATTEN(ARRAY i.subitems
FOR i IN items END, 1) END;
If you're using structured (JSON) data, you need to read the existing record, update the field you want in your program's data structure, and then send the whole record up again. You can't update individual fields in the JSON structure without sending it all up again. There isn't a way around this that I'm aware of.
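For the key-value path, that read-modify-write looks roughly like this; a minimal sketch using the Couchbase Python SDK 2.x-style API (the connection string, bucket, and key are illustrative, and the CAS value guards against silently overwriting a concurrent writer):

from couchbase.bucket import Bucket

bucket = Bucket('couchbase://localhost/product')   # illustrative connection string/bucket

key = 'odwalla-juice1'                  # key borrowed from the examples above
rv = bucket.get(key)                    # fetch the whole existing document
doc = rv.value
doc['type'] = 'product-juice'           # change only the field that came from the RDBMS
bucket.replace(key, doc, cas=rv.cas)    # write back; fails if someone else changed it meanwhile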
It is indeed true: to update individual items in a JSON doc, you need to fetch the entire document and overwrite it.
We are working on adding individual item updates in the near future.

ElasticSearch default scoring mechanism

What I am looking for is a plain, clear explanation of how the default scoring mechanism of ElasticSearch (Lucene) really works. I mean, does it use Lucene scoring, or does it use scoring of its own?
For example, I want to search for documents by the "Name" field. I use the .NET NEST client to write my queries. Let's consider this type of query:
IQueryResponse<SomeEntity> queryResult = client.Search<SomeEntity>(s =>
s.From(0)
.Size(300)
.Explain()
.Query(q => q.Match(a => a.OnField(q.Resolve(f => f.Name)).QueryString("ExampleName")))
);
which is translated to this JSON query:
{
"from": 0,
"size": 300,
"explain": true,
"query": {
"match": {
"Name": {
"query": "ExampleName"
}
}
}
}
There are about 1.1 million documents that the search is performed on. What I get in return is (this is only part of the result, formatted by me):
650 "ExampleName" 7,313398
651 "ExampleName" 7,313398
652 "ExampleName" 7,313398
653 "ExampleName" 7,239194
654 "ExampleName" 7,239194
860 "ExampleName of Something" 4,5708737
where the first field is just an Id, the second is the Name field on which ElasticSearch performed its search, and the third is the score.
As you can see, there are many duplicates in the ES index. Since some of the found documents have different scores despite being exactly the same (apart from the Id), I concluded that different shards performed the search on different parts of the whole dataset, which leads me to suspect that the score is somewhat based on the overall data in a given shard, not exclusively on the document actually considered by the search engine.
The question is: how exactly does this scoring work? Could you tell me / show me / point me to the exact formula used to calculate the score for each document found by ES? And finally, how can this scoring mechanism be changed?
The default scoring is the DefaultSimilarity algorithm in core Lucene, largely documented here. You can customize scoring by configuring your own Similarity, or by using something like a custom_score query.
The odd score variation in the first five results shown seems small enough that it doesn't concern me much as far as the validity of the query results and their ordering goes, but if you want to understand the cause of it, the explain API can show you exactly what is going on there.
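For reference, the practical scoring function that Lucene's classic DefaultSimilarity (TF-IDF) implements is roughly the following (a summary of the Lucene documentation, not text from the original answers):

$$
\mathrm{score}(q,d) = \mathrm{coord}(q,d) \cdot \mathrm{queryNorm}(q) \cdot \sum_{t \in q} \Bigl( \mathrm{tf}(t,d) \cdot \mathrm{idf}(t)^2 \cdot \mathrm{boost}(t) \cdot \mathrm{norm}(t,d) \Bigr)
$$

with $\mathrm{tf}(t,d) = \sqrt{\mathrm{freq}(t,d)}$ and $\mathrm{idf}(t) = 1 + \log\frac{\mathrm{numDocs}}{\mathrm{docFreq}(t)+1}$. The idf term depends on per-shard document counts, which is why identical documents on different shards can end up with slightly different scores.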
The score variation is based on the data in a given shard (as you suspected). By default, ES uses a search type called 'query then fetch', which sends the query to each shard and finds all the matching documents, with scores computed from local TF-IDF statistics (these vary with the data on a given shard; here's your problem).
You can change this by using the 'dfs query then fetch' search type: it pre-queries each shard, asking about term and document frequencies, and only then sends the query to each shard, etc.
You can set it in the URL:
$ curl -XGET 'http://localhost:9200/index/type/_search?pretty=true&search_type=dfs_query_then_fetch' -d '{
"from": 0,
"size": 300,
"explain": true,
"query": {
"match": {
"Name": {
"query": "ExampleName"
}
}
}
}'
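The same search_type can be set from a client library as well; for example, a sketch using the Python Elasticsearch client (7.x-style API; the host and index name are placeholders, mirroring the curl above):

from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])   # assumption: your cluster's host/port

resp = es.search(
    index="index",                              # placeholder index name
    search_type="dfs_query_then_fetch",
    body={
        "from": 0,
        "size": 300,
        "explain": True,
        "query": {"match": {"Name": {"query": "ExampleName"}}},
    },
)
print(resp["hits"]["total"])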
There are great explanations in the ElasticSearch documentation:
What is relevance:
https://www.elastic.co/guide/en/elasticsearch/guide/current/relevance-intro.html
Theory behind relevance scoring:
https://www.elastic.co/guide/en/elasticsearch/guide/current/scoring-theory.html
