Cross-layer filtering extension with more than 2 filter queries is not working - cql

I have a query like the one below:
http://localhost:8080/geoserver/wfs/?service=wfs&version=1.1.0&outputFormat=json&request=getfeature&typename=tiger:poly_landmarks,tiger:poi&cql_filter=INTERSECTS(the_geom, querySingle('tiger:poly_landmarks', 'the_geom','LAND=83','CFCC=H11'))
which returns 3 features:
{
  "type": "FeatureCollection",
  "totalFeatures": 3,
  "features": [
    {
      "type": "Feature",
      "id": "poly_landmarks.3",
      ...
But if I add one more filter, 'LANAME=East River', to the querySingle call, as below:
http://localhost:8080/geoserver/wfs/?service=wfs&version=1.1.0&outputFormat=json&request=getfeature&typename=tiger:poly_landmarks,tiger:poi&cql_filter=INTERSECTS(the_geom, querySingle('tiger:poly_landmarks', 'the_geom','LAND=83','CFCC=H11','LANAME=East River'))
it fails with the following error:
<ows:ExceptionReport version="1.0.0" xsi:schemaLocation="http://www.opengis.net/ows http://localhost:8080/geoserver/schemas/ows/1.0.0/owsExceptionReport.xsd">
<ows:Exception exceptionCode="NoApplicableCode">
<ows:ExceptionText>Could not parse CQL filter list. Function not found. Parsing : INTERSECTS(the_geom, querySingle('tiger:poly_landmarks', 'the_geom','LAND=83','CFCC=H11','LANAME=East River')).</ows:ExceptionText>
</ows:Exception>
</ows:ExceptionReport>

From the documentation it seems that querySingle only takes 3 arguments, so your approach won't work. I suspect (i.e. I haven't tested this in this context) that you can construct a more complex CQL filter by combining the predicates with AND. So I would try
querySingle('tiger:poly_landmarks', 'the_geom', 'LAND=83 AND CFCC=H11 AND LANAME=East River')
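If you go this route, remember that the whole request has to be URL-encoded; a quick Python sketch of assembling the request (the host, layer names, and filter are just the ones from the question):

```python
from urllib.parse import urlencode

# Base URL and layer names taken from the question; adjust for your install.
base = "http://localhost:8080/geoserver/wfs"

# The three predicates combined into querySingle's single filter argument.
cql = ("INTERSECTS(the_geom, querySingle('tiger:poly_landmarks', 'the_geom', "
       "'LAND=83 AND CFCC=H11 AND LANAME=East River'))")

params = {
    "service": "wfs",
    "version": "1.1.0",
    "request": "getfeature",
    "typename": "tiger:poly_landmarks,tiger:poi",
    "outputFormat": "json",
    "cql_filter": cql,
}

# urlencode percent-escapes the spaces, quotes and parentheses in the filter.
url = base + "?" + urlencode(params)
print(url)
```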

Related

Lowercasing complex object field names in azure data factory data flow

I'm trying to lowercase the field names in a row entry in azure data flow. Inside a complex object I've got something like
{
"field": "sample",
"functions": [
{
"Name": "asdf",
"Value": "sdfsd"
},
{
"Name": "dfs",
"Value": "zxcv"
}
]
}
and basically what I want is for "Name" and "Value" to be "name" and "value". However, I can't seem to use any expressions that work for the nested fields of a complex object in the expression builder.
I've tried using something like a select transformation with a rule-based mapping, the rule being 1 == 1 and lower($$), but $$ seems to only work for root columns of the complex object and not the nested fields inside.
As suggested by @Mark Kromer MSFT, to change the case of columns inside a complex type, select the functions column at the Hierarchy level.
Please check the below for your reference:
Here, I have used both.
You can see the difference in results.
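For reference, the target transformation is easy to state outside ADF; a plain-Python sketch of what the mapping should produce (this only pins down the expected output, it is not a data flow expression):

```python
def lowercase_keys(value):
    """Recursively lowercase dictionary keys in nested dicts/lists."""
    if isinstance(value, dict):
        return {k.lower(): lowercase_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [lowercase_keys(v) for v in value]
    return value

# The sample complex object from the question.
doc = {
    "field": "sample",
    "functions": [
        {"Name": "asdf", "Value": "sdfsd"},
        {"Name": "dfs", "Value": "zxcv"},
    ],
}
print(lowercase_keys(doc))
```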

How to obtain nested fields within JSON

Background:
I wish to update a nested field within my JSON document. I want to query for all documents whose "state" equals "new":
{
  "id": "123",
  "feedback": {
    "Features": [
      {
        "state": "new"
      }
    ]
  }
}
This is what I have tried: since this is a nested document, my query looks like this:
SELECT * FROM c WHERE c.feedback.Features.state = "new"
However, I keep ending up with zero results when I know this data exists in the database. What am I doing wrong? Maybe I am getting 0 results because Features is an array?
Any help is appreciated
For arrays, you'll need to use ARRAY_CONTAINS(). For example, in your case:
SELECT *
FROM c
WHERE ARRAY_CONTAINS(c.feedback.Features,{'state': 'new'}, true)
The third parameter specifies that you're matching against partial documents within the array, not scalar values.
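To make the semantics of that third parameter concrete, here is a small plain-Python sketch of the matching behavior (not Cosmos DB's implementation; partial_match stands in for the boolean third argument):

```python
def array_contains(arr, fragment, partial_match=False):
    """Mimic ARRAY_CONTAINS: with partial_match=True, an array element
    matches if it contains all of the fragment's key/value pairs."""
    if partial_match:
        return any(
            isinstance(item, dict)
            and all(item.get(k) == v for k, v in fragment.items())
            for item in arr
        )
    # Without the flag, only exact scalar/value matches count.
    return fragment in arr

# The sample document from the question.
doc = {"id": "123", "feedback": {"Features": [{"state": "new"}]}}
print(array_contains(doc["feedback"]["Features"], {"state": "new"}, True))
```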

Finding Vertices that are connected to all current vertices

I am fairly new to graph DBs and Gremlin, and I am having a similar problem to others (see this question), in that I am trying to get the Resource vertices that meet all the Criteria of a selected Item. So for the following graph:
Item 1 should return Resource 1 & Resource 2.
Item 2 should return Resource 2 only.
Here's a script to create the sample data:
g.addV("Resource").property("name", "Resource1")
g.addV("Resource").property("name", "Resource2")
g.addV("Criteria").property("name", "Criteria1")
g.addV("Criteria").property("name", "Criteria2")
g.addV("Item").property("name", "Item1")
g.addV("Item").property("name", "Item2")
g.V().has("Resource", "name", "Resource1").addE("isOf").to(g.V().has("Criteria", "name", "Criteria1"))
g.V().has("Resource", "name", "Resource2").addE("isOf").to(g.V().has("Criteria", "name", "Criteria1"))
g.V().has("Resource", "name", "Resource2").addE("isOf").to(g.V().has("Criteria", "name", "Criteria2"))
g.V().has("Item", "name", "Item1").addE("needs").to(g.V().has("Criteria", "name", "Criteria1"))
g.V().has("Item", "name", "Item2").addE("needs").to(g.V().has("Criteria", "name", "Criteria1"))
g.V().has("Item", "name", "Item2").addE("needs").to(g.V().has("Criteria", "name", "Criteria2"))
When I try the following, I get Resource 1 & 2, as it looks at all Resources related to either Criteria, whereas I only want the Resources that match both Criteria (Resource 2):
g.V()
.hasLabel('Item')
.has('name', 'Item2')
.outE('needs')
.inV()
.aggregate("x")
.inE('isOf')
.outV()
.dedup()
So I tried the following, as the referenced question suggests:
g.V()
.hasLabel('Item')
.has('name', 'Item2')
.outE('needs')
.inV()
.aggregate("x")
.inE('isOf')
.outV()
.dedup()
.filter(
out("isOf")
.where(within("x"))
.count()
.where(eq("x"))
.by()
.by(count(local)))
.valueMap()
I get the following exception, as the answer provided for the other question doesn't work in the Azure Cosmos DB graph database: it doesn't support the Gremlin filter() step.
Failure in submitting query: g.V().hasLabel('Item').has('name', 'Item2').outE('needs').inV().aggregate("x").inE('isOf').outV().dedup().filter(out("isOf").where(within("x")).count().where(eq("x")).by().by(count(local))).valueMap(): "Script eval error: \r\n\nActivityId : d2eccb49-9ca5-4ac6-bfd7-b851d63662c9\nExceptionType : GraphCompileException\nExceptionMessage :\r\n\tGremlin Query Compilation Error: Unable to find any method 'filter' # line 1, column 113.\r\n\t1 Error(s)\nSource : Microsoft.Azure.Cosmos.Gremlin.Core\n\tGremlinRequestId : d2eccb49-9ca5-4ac6-bfd7-b851d63662c9\n\tContext : graphcompute\n\tScope : graphparse-translate-csharpexpressionbinding\n\tGraphInterOpStatusCode : QuerySyntaxError\n\tHResult : 0x80131500\r\n"
I'm interested to know if there is a way to solve my problem with the Gremlin steps Microsoft supports in Azure Cosmos DB (these).
Just replace filter with where; the query will work equally the same:
g.V()
.hasLabel('Item')
.has('name', 'Item2')
.outE('needs')
.inV()
.aggregate("x")
.inE('isOf')
.outV()
.dedup()
.where(
out("isOf")
.where(within("x"))
.count()
.where(eq("x"))
.by()
.by(count(local)))
.valueMap()
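The containment check that the where(...) step performs can be sketched in plain Python against the sample data, just to make the expected results explicit:

```python
# Edges from the sample script: which criteria each resource "isOf",
# and which criteria each item "needs".
resource_criteria = {
    "Resource1": {"Criteria1"},
    "Resource2": {"Criteria1", "Criteria2"},
}
item_needs = {
    "Item1": {"Criteria1"},
    "Item2": {"Criteria1", "Criteria2"},
}

def resources_for(item):
    """Return resources whose criteria cover ALL of the item's needs,
    i.e. the set-containment check the traversal performs."""
    needed = item_needs[item]
    return sorted(r for r, crit in resource_criteria.items() if needed <= crit)

print(resources_for("Item1"))  # both resources satisfy Criteria1
print(resources_for("Item2"))  # only Resource2 satisfies both criteria
```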

Azure Stream Processing upsert to DocumentDB with array

I'm using Azure Stream Analytics to copy my Json over to DocumentDB using upsert to overwrite the document with the latest data. This is great for my base data, but I would love to be able to append the list data, as unfortunately I can only send one list item at a time.
In the example below, the document is matched on id, and all items are updated, but I would like the "myList" array to keep growing with the "myList" data from each document (with the same id). Is this possible? Is there any other way to use Stream Analytics to update this list in the document?
I'd rather steer clear of using a tumbling window if possible, but is that an option that would work?
Sample documents:
{
"id": "1234",
"otherData": "example",
"myList": [{"listitem": 1}]
}
{
"id": "1234",
"otherData": "example 2",
"myList": [{"listitem": 2}]
}
Desired output:
{
"id": "1234",
"otherData": "example 2",
"myList": [{"listitem": 1}, {"listitem": 2}]
}
My current query:
SELECT id, otherData, myList INTO [myoutput] FROM [myinput]
Currently, arrays are not merged; this is the existing behavior of the DocumentDB output from ASA, as also mentioned in this article:
Note that changes in the values of array properties in your JSON document result in the entire array getting overwritten, i.e. the array is not merged.
You could expand the incoming array (myList) into individual rows using the GetArrayElements function. Your query might look something like:
SELECT i.id , i.otherData, listItemFromArray
INTO myoutput
FROM myinput i
CROSS APPLY GetArrayElements(i.myList) AS listItemFromArray
cheers!
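If you end up doing the merge yourself downstream (for example in a trigger or an Azure Function), the append-instead-of-overwrite behavior you're after looks like this in a plain-Python sketch (merge_docs is a hypothetical helper, not part of ASA or DocumentDB):

```python
def merge_docs(existing, incoming):
    """Upsert-with-append: overwrite scalar fields, but extend 'myList'
    instead of replacing it (the merge a plain upsert does NOT do)."""
    merged = dict(existing)
    merged.update({k: v for k, v in incoming.items() if k != "myList"})
    merged["myList"] = existing.get("myList", []) + incoming.get("myList", [])
    return merged

# The two sample documents from the question.
a = {"id": "1234", "otherData": "example", "myList": [{"listitem": 1}]}
b = {"id": "1234", "otherData": "example 2", "myList": [{"listitem": 2}]}
print(merge_docs(a, b))
```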

ElasticSearch default scoring mechanism

What I am looking for is a plain, clear explanation of how the default scoring mechanism of ElasticSearch (Lucene) really works. I mean, does it use Lucene scoring, or does it use scoring of its own?
For example, I want to search for documents by the "Name" field. I use the .NET NEST client to write my queries. Consider this type of query:
IQueryResponse<SomeEntity> queryResult = client.Search<SomeEntity>(s =>
s.From(0)
.Size(300)
.Explain()
.Query(q => q.Match(a => a.OnField(q.Resolve(f => f.Name)).QueryString("ExampleName")))
);
which is translated to such JSON query:
{
"from": 0,
"size": 300,
"explain": true,
"query": {
"match": {
"Name": {
"query": "ExampleName"
}
}
}
}
There are about 1.1 million documents that the search is performed on. What I get in return is the following (only part of the result, formatted on my own):
650 "ExampleName" 7,313398
651 "ExampleName" 7,313398
652 "ExampleName" 7,313398
653 "ExampleName" 7,239194
654 "ExampleName" 7,239194
860 "ExampleName of Something" 4,5708737
where the first field is just an Id, the second is the Name field on which ElasticSearch performed its search, and the third is the score.
As you can see, there are many duplicates in the ES index. Since some of the found documents have different scores despite being exactly the same (differing only in Id), I concluded that different shards performed the search on different parts of the whole dataset, which leads me to believe that the score is somewhat based on the overall data in a given shard, not exclusively on the document that is actually considered by the search engine.
The question is, how exactly does this scoring work? I mean, could you tell me/show me/point me to the exact formula used to calculate the score for each document found by ES? And finally, how can this scoring mechanism be changed?
The default scoring is the DefaultSimilarity algorithm in core Lucene, largely documented here. You can customize scoring by configuring your own Similarity, or using something like a custom_score query.
The odd score variation in the first five results shown seems small enough that it doesn't concern me much, as far as the validity of the query results and their ordering, but if you want to understand the cause of it, the explain api can show you exactly what is going on there.
The score variation is based on the data in a given shard (as you suspected). By default, ES uses a search type called 'query then fetch', which sends the query to each shard and finds all the matching documents, scoring them using shard-local TF-IDF statistics (these vary with the data on a given shard; here's your problem).
You can change this by using the 'dfs query then fetch' search type, which first pre-queries each shard for term and document frequencies, and then sends the query to each shard as usual.
You can set it in the URL:
$ curl -XGET 'localhost:9200/index/type/_search?pretty=true&search_type=dfs_query_then_fetch' -d '{
"from": 0,
"size": 300,
"explain": true,
"query": {
"match": {
"Name": {
"query": "ExampleName"
}
}
}
}'
Great explanation in ElasticSearch documentation:
What is relevance:
https://www.elastic.co/guide/en/elasticsearch/guide/current/relevance-intro.html
Theory behind relevance scoring:
https://www.elastic.co/guide/en/elasticsearch/guide/current/scoring-theory.html
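To make the shard-local effect concrete, here are the two classic Lucene DefaultSimilarity components the score is built from, in a simplified Python sketch (the full practical scoring function also includes norms, boosts, coord and queryNorm; the document counts below are made up for illustration):

```python
import math

def tf(term_freq):
    # Lucene classic similarity: tf(t in d) = sqrt(frequency)
    return math.sqrt(term_freq)

def idf(doc_freq, num_docs):
    # Lucene classic similarity: idf(t) = 1 + ln(numDocs / (docFreq + 1))
    return 1.0 + math.log(num_docs / (doc_freq + 1))

# Same document and same term frequency, but two shards with different
# local document frequencies -> different idf -> different score.
# (idf is squared because it contributes to both query and field weight.)
score_shard_a = tf(1) * idf(doc_freq=200, num_docs=550_000) ** 2
score_shard_b = tf(1) * idf(doc_freq=180, num_docs=550_000) ** 2
print(score_shard_a, score_shard_b)
```

This is exactly why identical documents on different shards get slightly different scores, and why dfs_query_then_fetch (which uses global frequencies) removes the variation.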
