Finding vertices that are connected to all current vertices - Azure

I am fairly new to graph DBs and Gremlin and I am having a similar problem to others (see this question), in that I am trying to get the Resource vertices that meet all the Criteria of a selected Item. So for the following graph:
Item 1 should return Resource 1 & Resource 2.
Item 2 should return Resource 2 only.
Here's a script to create the sample data:
g.addV("Resource").property("name", "Resource1")
g.addV("Resource").property("name", "Resource2")
g.addV("Criteria").property("name", "Criteria1")
g.addV("Criteria").property("name", "Criteria2")
g.addV("Item").property("name", "Item1")
g.addV("Item").property("name", "Item2")
g.V().has("Resource", "name", "Resource1").addE("isOf").to(g.V().has("Criteria", "name", "Criteria1"))
g.V().has("Resource", "name", "Resource2").addE("isOf").to(g.V().has("Criteria", "name", "Criteria1"))
g.V().has("Resource", "name", "Resource2").addE("isOf").to(g.V().has("Criteria", "name", "Criteria2"))
g.V().has("Item", "name", "Item1").addE("needs").to(g.V().has("Criteria", "name", "Criteria1"))
g.V().has("Item", "name", "Item2").addE("needs").to(g.V().has("Criteria", "name", "Criteria1"))
g.V().has("Item", "name", "Item2").addE("needs").to(g.V().has("Criteria", "name", "Criteria2"))
When I try the following, I get Resource 1 & 2, because it looks at all Resources related to either Criteria, whereas I only want the Resources that match both Criteria (Resource 2).
g.V()
.hasLabel('Item')
.has('name', 'Item2')
.outE('needs')
.inV()
.aggregate("x")
.inE('isOf')
.outV()
.dedup()
So I try the following, as the referenced question suggests:
g.V()
.hasLabel('Item')
.has('name', 'Item2')
.outE('needs')
.inV()
.aggregate("x")
.inE('isOf')
.outV()
.dedup()
.filter(
out("isOf")
.where(within("x"))
.count()
.where(eq("x"))
.by()
.by(count(local)))
.valueMap()
I get the following exception. The answer provided for the other question doesn't work in the Azure Cosmos DB graph database, as Cosmos DB doesn't support the Gremlin filter() step.
Failure in submitting query: g.V().hasLabel('Item').has('name', 'Item2').outE('needs').inV().aggregate("x").inE('isOf').outV().dedup().filter(out("isOf").where(within("x")).count().where(eq("x")).by().by(count(local))).valueMap(): "Script eval error: \r\n\nActivityId : d2eccb49-9ca5-4ac6-bfd7-b851d63662c9\nExceptionType : GraphCompileException\nExceptionMessage :\r\n\tGremlin Query Compilation Error: Unable to find any method 'filter' # line 1, column 113.\r\n\t1 Error(s)\nSource : Microsoft.Azure.Cosmos.Gremlin.Core\n\tGremlinRequestId : d2eccb49-9ca5-4ac6-bfd7-b851d63662c9\n\tContext : graphcompute\n\tScope : graphparse-translate-csharpexpressionbinding\n\tGraphInterOpStatusCode : QuerySyntaxError\n\tHResult : 0x80131500\r\n"
I'm interested to know if there is a way to solve my problem with the Gremlin steps Microsoft provides in Azure Cosmos DB (these).

Just replace filter with where.
The query will work just the same.
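That is, the same traversal with where() in place of filter():
g.V()
.hasLabel('Item')
.has('name', 'Item2')
.outE('needs')
.inV()
.aggregate("x")
.inE('isOf')
.outV()
.dedup()
.where(
out("isOf")
.where(within("x"))
.count()
.where(eq("x"))
.by()
.by(count(local)))
.valueMap()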

Related

Lowercasing complex object field names in Azure Data Factory data flow

I'm trying to lowercase the field names in a row entry in azure data flow. Inside a complex object I've got something like
{
  "field": "sample",
  "functions": [
    {
      "Name": "asdf",
      "Value": "sdfsd"
    },
    {
      "Name": "dfs",
      "Value": "zxcv"
    }
  ]
}
and basically what I want is for "Name" and "Value" to be "name" and "value". However, I can't seem to use any expressions that will work for the nested fields of a complex object in the expression builder.
I've tried using something like a Select transformation with a rule-based mapping, the rule being 1 == 1 and lower($$), but $$ seems to only work for root columns of the complex object and not the nested fields inside.
As suggested by @Mark Kromer MSFT, to change the case of columns inside a complex type, apply the rule-based mapping at the Hierarchy level, selecting the functions hierarchy. Applying the same rule at the root level and at the hierarchy level shows the difference in results: only the hierarchy-level rule renames the nested fields, as sketched below.
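A rough sketch of the two rule-based mappings in the Select transformation (the labels approximate the expression-builder fields; the matching condition and name expression are the ones from the question, and only the hierarchy level changes):
Root-level rule (renames only top-level columns such as field):
Matching condition: 1 == 1
Name as: lower($$)
Hierarchy-level rule (applied to the functions hierarchy; renames the nested Name/Value fields):
Hierarchy level: functions
Matching condition: 1 == 1
Name as: lower($$)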

knex query with quotes around table name

I have a query as below which returns user information where code is not null:
select name,age from "dev.user" where code is not null
This returns the expected output.
In knex I am doing
knex("user").select('name','age').whereNotNull('code')
which returns empty!
The debug query shows:
select "name", "age" from "user" where "code" is not null
If I instead do
knex("dev.user").select('name','age').whereNotNull('code')
the debug query shows:
select "name", "age" from "dev"."user" where "code" is not null
First, when I initialise knex I set the schema, which is not working. Second, even when I provide the schema, it generates the query as "dev"."user" instead of "dev.user".
Any pointers will be helpful.
From the documentation, it looks like you should use .withSchema(). (see link)
NOTE: identifier syntax has no place for selecting schema, so if you
are doing schemaName.tableName, query might be rendered wrong. Use
.withSchema('schemaName') instead.
Which would be:
knex("user")
.withSchema("dev")
.select('name','age').whereNotNull('code')
I haven't had to include a schema myself, so best of luck with this.
Gary.

Azure Stream Processing upsert to DocumentDB with array

I'm using Azure Stream Analytics to copy my JSON over to DocumentDB, using upsert to overwrite the document with the latest data. This is great for my base data, but I would love to be able to append the list data, as unfortunately I can only send one list item at a time.
In the example below, the document is matched on id, and all items are updated, but I would like the "myList" array to keep growing with the "myList" data from each document (with the same id). Is this possible? Is there any other way to use Stream Analytics to update this list in the document?
I'd rather steer clear of using a tumbling window if possible, but is that an option that would work?
Sample documents:
{
  "id": "1234",
  "otherData": "example",
  "myList": [{"listitem": 1}]
}
{
  "id": "1234",
  "otherData": "example 2",
  "myList": [{"listitem": 2}]
}
Desired output:
{
  "id": "1234",
  "otherData": "example 2",
  "myList": [{"listitem": 1}, {"listitem": 2}]
}
My current query:
SELECT id, otherData, myList INTO [myoutput] FROM [myinput]
Currently arrays are not merged; this is the existing behavior of DocumentDB output from ASA, also mentioned in this article. I doubt using a tumbling window would help here.
Note that changes in the values of array properties in your JSON document result in the entire array getting overwritten, i.e. the array is not merged.
You could flatten the input array (myList) into one row per element using the GetArrayElements function; with CROSS APPLY, each element is exposed through its ArrayIndex and ArrayValue fields.
Your query might look something like:
SELECT i.id, i.otherData, listItemFromArray.ArrayValue AS listItem
INTO myoutput
FROM myinput i
CROSS APPLY GetArrayElements(i.myList) AS listItemFromArray
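With the sample documents above, each event is flattened into one row per array element, so the output documents would look roughly like this (the listItem name comes from the alias in the query):
{ "id": "1234", "otherData": "example", "listItem": { "listitem": 1 } }
{ "id": "1234", "otherData": "example 2", "listItem": { "listitem": 2 } }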
cheers!

DocumentDB - Cannot compare two paths in query

I get a Microsoft.Azure.Documents BadRequestException: An invalid query has been specified with filters against path(s) that are not range-indexed. Consider adding allow scan header in the request.
My query is:
SELECT c.id FROM users c WHERE (c.lat < 29.89)
over an unknown number of documents (as there is no way to get the number of documents in a collection with DocumentDB).
If you look at the blogpost here:
http://azure.microsoft.com/blog/2015/01/27/performance-tips-for-azure-documentdb-part-2/
Indexing Policy Tip #3: Specify range index path type for all paths
used in range queries
DocumentDB currently supports two index path types: Hash and Range.
Choosing an index path type of Hash enables efficient equality
queries. Choosing an index type of Range enables range queries (using
>, <, >=, <=).
It gives an example in C# of adding a Range index to make the path comparable, but there is similar functionality in the node.js library.
When you create a collection, you can pass the IndexingPolicy through the body parameter. The IndexingPolicy has a couple of members, one of which is IncludedPaths, where you can define indices.
var policy = {
    Automatic: true,
    IndexingMode: 'Lazy',
    IncludedPaths: [
        {
            IndexType: "Range",
            Path: "path to be indexed (c.lat)",
            NumericPrecision: "1",
            StringPrecision: "1"
        }
    ],
    ExcludedPaths: []
}
client.createCollection(
    '#yourdblink',
    {
        id: '10001',
        indexingPolicy: policy
    });
The Policy can be changed in the new Azure Portal (https://portal.azure.com), under (your DocumentDB resource) -> Settings -> Indexing Policy.
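Alternatively, the error message itself points at a second option: allow a scan for this one query instead of changing the index. A minimal sketch with the Node.js documentdb client, assuming the enableScanInQuery feed option and an illustrative collection link:
client.queryDocuments(
    'dbs/mydb/colls/users', // illustrative collection link
    'SELECT c.id FROM users c WHERE (c.lat < 29.89)',
    { enableScanInQuery: true } // lets the query run without a range index
).toArray(function (err, results) {
    // results holds the matching documents once the scan completes
});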

How do I keep existing data in couchbase and only update the new data without overwriting

So, say I have created some records/documents under a bucket and the user updates only one column out of 10 in the RDBMS, so I am trying to send only that one column's data and update it in Couchbase. But the problem is that Couchbase is overwriting the entire record and putting NULLs in for the rest of the columns.
One approach is to copy all the data from the existing record after fetching it from Couchbase, and then overwrite the new column while copying the data from the old one. But that doesn't look like an optimal approach.
Any suggestions?
You can use the N1QL UPDATE statement (search for Couchbase N1QL).
UPDATE replaces a document that already exists with updated values.
update:
UPDATE keyspace-ref [use-keys-clause] [set-clause] [unset-clause] [where-clause] [limit-clause] [returning-clause]
set-clause:
SET path = expression [update-for] [ , path = expression [update-for] ]*
update-for:
FOR variable (IN | WITHIN) path (, variable (IN | WITHIN) path)* [WHEN condition ] END
unset-clause:
UNSET path [update-for] (, path [ update-for ])*
keyspace-ref: Specifies the keyspace for which to update the document.
You can add an optional namespace-name to the keyspace-name in this way:
namespace-name:keyspace-name.
use-keys-clause: Specifies the keys of the data items to be updated. Optional. Keys can be any expression.
set-clause: Specifies the value for an attribute to be changed.
unset-clause: Removes the specified attribute from the document.
update-for: The update-for clause uses the FOR statement to iterate over a nested array and SET or UNSET the given attribute for every matching element in the array.
where-clause: Specifies the condition that needs to be met for data to be updated. Optional.
limit-clause: Specifies the greatest number of objects that can be updated. This clause must have a non-negative integer as its upper bound. Optional.
returning-clause: Returns the data you updated as specified in the result_expression.
RBAC Privileges
The user executing the UPDATE statement must have the Query Update privilege on the target keyspace. If the statement has any clauses that need to read data, such as a SELECT clause or RETURNING clause, then the Query Select privilege is also required on the keyspaces referred to in those clauses. For more details about user roles, see Authorization.
For example,
To execute the following statement, the user must have the Query Update privilege on travel-sample.
UPDATE `travel-sample` SET foo = 5
To execute the following statement, the user must have the Query Update privilege on travel-sample and the Query Select privilege on beer-sample.
UPDATE `travel-sample`
SET foo = 9
WHERE city = (SELECT raw city FROM `beer-sample` WHERE type = "brewery")
To execute the following statement, the user must have the Query Update privilege on `travel-sample` and the Query Select privilege on `travel-sample`.
UPDATE `travel-sample`
SET city = "San Francisco"
WHERE lower(city) = "sanfrancisco"
RETURNING *
Example
The following statement changes the "type" of the product "odwalla-juice1" to "product-juice".
UPDATE product USE KEYS "odwalla-juice1" SET type = "product-juice" RETURNING product.type
"results": [
{
"type": "product-juice"
}
]
This statement removes the "type" attribute from the "product" keyspace for the document with the "odwalla-juice1" key.
UPDATE product USE KEYS "odwalla-juice1" UNSET type RETURNING product.*
"results": [
{
"productId": "odwalla-juice1",
"unitPrice": 5.4
}
]
This statement unsets the "gender" attribute in the "children" array for the document with the key "dave" in the tutorial keyspace.
UPDATE tutorial t USE KEYS "dave" UNSET c.gender FOR c IN children END RETURNING t
"results": [
{
"t": {
"age": 46,
"children": [
{
"age": 17,
"fname": "Aiden"
},
{
"age": 2,
"fname": "Bill"
}
],
"email": "dave#gmail.com",
"fname": "Dave",
"hobbies": [
"golf",
"surfing"
],
"lname": "Smith",
"relation": "friend",
"title": "Mr.",
"type": "contact"
}
}
]
Starting with version 4.5.1, the UPDATE statement has been improved to SET nested array elements. The FOR clause is enhanced to evaluate functions and expressions, and the new syntax supports multiple nested FOR expressions to access and update fields in nested arrays. Additional array levels are supported by chaining FOR clauses.
Example
UPDATE default
SET i.subitems = ( ARRAY OBJECT_ADD(s, 'new', 'new_value' )
FOR s IN i.subitems END )
FOR s IN ARRAY_FLATTEN(ARRAY i.subitems
FOR i IN items END, 1) END;
If you're using structured (JSON) data, you need to read the existing record, update the field you want in your program's data structure, and then send the whole record up again. You can't update individual fields in the JSON structure without sending it all up again. There isn't a way around this that I'm aware of.
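A minimal sketch of that read-modify-write cycle with the Couchbase Node.js SDK (bucket, key, and field names are illustrative; the CAS value from the read guards against clobbering a concurrent update):
var couchbase = require('couchbase');
var bucket = new couchbase.Cluster('couchbase://localhost').openBucket('default');

bucket.get('user::123', function (err, result) {
    if (err) throw err;
    var doc = result.value;          // the full existing document
    doc.email = 'new@example.com';   // change just the one field locally
    // pass the CAS from the read so a concurrent write fails loudly instead of being lost
    bucket.replace('user::123', doc, { cas: result.cas }, function (err2) {
        if (err2) throw err2;
    });
});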
It is indeed true: to update individual items in a JSON doc, you need to fetch the entire document and overwrite it.
We are working on adding individual item updates in the near future.
