How can Cosmos DB query the values of the properties within a dynamic JSON?
The app allows storing a JSON object as a set of custom properties for an object. They are serialized and stored in Cosmos DB. For example, here are two entries:
{
"id": "ade9f2d6-fff6-4993-8473-a2af40f071f4",
...
"Properties": {
"fn": "Ernest",
"ln": "Hemingway",
"a_book": "The Old Man and the Sea"
},
...
}
and
{
"id": "23cb9d4c-da56-40ec-9fbe-7f5178a92a4f",
...
"Properties": {
"First Name": "Salvador",
"Last Name": "Dali",
"Period": "Surrealism"
},
...
}
How can the query be structured so that it searches in the values of Properties?
I'm looking for something that doesn't involve the name of the
sub-property, like SELECT * FROM c WHERE
some_function_here(c.Properties, 'Ernest')
If I understand correctly, you want to filter the documents by the values in Properties, not by the property names. If so, you could use a UDF (user-defined function) in Cosmos DB.
sample udf:
function query(Properties, fieldValue) {
    // Return true as soon as any property value equals the search value
    for (var k in Properties) {
        if (Properties[k] == fieldValue)
            return true;
    }
    return false;
}
sample query:
SELECT c.id FROM c where udf.query(c.Properties,'Ernest')
To summarize, Ovi's UDF (a case-insensitive substring match) looks like this:
function QueryProperties(Properties, fieldValue) {
    for (var k in Properties) {
        if (Properties[k] && Properties[k].toString().toUpperCase().includes(fieldValue.toString().toUpperCase()))
            return true;
    }
    return false;
}
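To sanity-check the value-matching logic outside Cosmos DB, the same idea can be run as plain JavaScript. The following is only a minimal local sketch, assuming the two sample documents from the question; `matchesAnyValue`, `docs`, and `hits` are hypothetical names, and `indexOf` is used in place of `includes` purely as a conservative choice.

```javascript
// Local sketch of a case-insensitive "search in values" check.
// matchesAnyValue, docs and hits are illustrative names, not Cosmos DB APIs.
function matchesAnyValue(properties, fieldValue) {
  if (!properties || fieldValue === undefined || fieldValue === null) {
    return false;
  }
  var needle = String(fieldValue).toUpperCase();
  for (var k in properties) {
    var value = properties[k];
    if (value !== undefined && value !== null &&
        String(value).toUpperCase().indexOf(needle) !== -1) {
      return true;
    }
  }
  return false;
}

// The two sample documents from the question (elided fields omitted).
var docs = [
  { id: "ade9f2d6-fff6-4993-8473-a2af40f071f4",
    Properties: { fn: "Ernest", ln: "Hemingway", a_book: "The Old Man and the Sea" } },
  { id: "23cb9d4c-da56-40ec-9fbe-7f5178a92a4f",
    Properties: { "First Name": "Salvador", "Last Name": "Dali", Period: "Surrealism" } }
];

// Collect the ids of documents with any property value containing "ernest".
var hits = docs
  .filter(function (d) { return matchesAnyValue(d.Properties, "ernest"); })
  .map(function (d) { return d.id; });

console.log(hits);
```

Only the first document should match here, because the search term is compared against values and never against property names.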
Both of the following syntaxes will work.
SELECT * FROM c where c.Properties["First Name"] = 'Salvador'
SELECT * FROM c where c.Properties.fn = 'Ernest'
I have a document structure in Cosmos that typically looks like this:
{
"Item No": "123456",
"Item Desc": "This is a description",
"images": [
"https://somedomain.com/image1.png",
"https://somedomain.com/image2.png"
]
}
Sometimes, there will be empty image values. I have written a UDF (user defined function) which will replace any empty values, with a default value:
function missingImage(doc, prop) {
if (typeof doc[prop] === "undefined" || doc[prop] === "" || doc[prop] === null) {
return "https://via.placeholder.com/150";
}
}
In the event an image url is blank, I get back this return (correct):
{
"id": "e3842b29-313c-4a84-bc94-bc43a9a55742",
"Item No": "123456",
"Item Desc": "This is a description.",
"image": "https://via.placeholder.com/150"
},
My SELECT query looks like this:
"c.id, c['Item No'], c['Item Desc'], udf.missingImage(c.images[0]) as image"
However, in situations where no image key exists at all, for example:
{
"Item No": "123456",
"Item Desc": "This is a description."
}
I don't get back my default.
My question: How can I modify my UDF or query, such that if the images key does not exist, I still return a default value?
Thanks #jay-gong for your time and response, however, this does not address the issue. I am looking for a way to return a default value when no images key exists in the document at all.
I feel the answer here is not going to be through the UDF; rather, it will need to be addressed at the query level. I am basing this on the fact that if I directly return a default, as you will see in the UDF example below, I don't get back an image regardless.
The document in Cosmos:
{
"id": "8fdc9f47-6209-455d-9b9c-482341bb3170",
"Item No": "123456",
"Item Desc": "This is a description."
}
The UDF:
function missingImage(images) {
return "https://via.placeholder.com/150";
}
The query:
SELECT c.id, c['Item No'], c['Item Desc'], udf.missingImage(c.images) FROM c
The return:
[
{
"id": "8fdc9f47-6209-455d-9b9c-482341bb3170",
"Item No": "123456",
"Item Desc": "This is a description."
}
]
UPDATE:
I have come up with a solution, which is to use IS_DEFINED to check whether the images key is defined. If it's not, the query passes false, which then gives me something to act on within the UDF.
The query:
SELECT c.id,
c['Item No'],
c['Item Desc'],
udf.missingImage((IS_DEFINED(c.images) = true ? c.images : false))
FROM c
The UDF:
function missingImage(images) {
if (images == false) {
return "https://via.placeholder.com/150";
}
return images;
}
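The two cases in this thread (a missing images key and blank entries inside the array) could be handled by a single function. The sketch below is a local, plain-JavaScript illustration only, assuming the query still passes false when the key is undefined, as in the IS_DEFINED expression above; `DEFAULT_IMAGE` is a hypothetical constant name holding the placeholder URL from the question.

```javascript
// Hypothetical constant; the URL itself is the placeholder used in the question.
var DEFAULT_IMAGE = "https://via.placeholder.com/150";

// Returns a non-empty array of image URLs. The placeholder is substituted
// when the input is not an array (e.g. false or null), when the array is
// empty, and for any blank entry inside the array.
function missingImage(images) {
  if (!Array.isArray(images) || images.length === 0) {
    return [DEFAULT_IMAGE];
  }
  return images.map(function (url) {
    return (url === undefined || url === null || url === "") ? DEFAULT_IMAGE : url;
  });
}

console.log(missingImage(false));
console.log(missingImage(["https://somedomain.com/image1.png", ""]));
```

Returning an array in every case keeps the result shape consistent, so the caller can always read the first element.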
First, the sample document you provided is not valid JSON.
I suppose that it should be like:
You could modify the UDF like this:
function missingImage(images) {
for(var i =0;i<images.length;i++){
if (typeof images[i] === "undefined" || images[i] === "" || images[i] === null) {
images[i] = "https://via.placeholder.com/150";
}
}
return images;
}
Then use SQL to make sure there are no empty ("") values in the results:
SELECT udf.missingImage(c.images) FROM c
Very specific question, if I have the following input in my Streaming Analytics component:
//some input
"outputs": [
{
"name": "output1",
"unit": "unit1",
"value": "95813"
},
{
"name": "output2",
"unit": "unit2",
"value": "303883"
}, // and more array values
How can I get a JSON result that would look as follows:
"outputs":[ {
"output1":95813,
"output2":303883
//etc
}]
So, I don't need the unit value, and to save space, I'd like to use the 'name' as the key, and the 'value' as the value of the key-value array.
This is my current query:
SELECT
input.outputs as outputs
INTO "to-mongodb"
FROM "from-iothub" input
but this of course creates separate JSON arrays with the same structure as my input.
Does anyone have an idea how to do this?
In worst case, just filtering out the 'unit' would also already be a great help.
Thanks in advance
You could use user-defined functions in Azure Stream Analytics. Please refer to the sample function I tested for you.
UDF:
function main(arg) {
var array = arg.outputs;
var returnJson = {};
var outputArray = [];
var map = {};
for(var i=0;i<array.length;i++){
var key=array[i].name;
map[key] = array[i].value;
}
outputArray.push(map);
returnJson = {"outputs" : outputArray};
return returnJson;
}
Query:
WITH
c AS
(
SELECT
udf.processArray(jsoninput) as result
from jsoninput
)
SELECT
c.result
INTO
jaycosmostest
FROM
c
Hope it helps you.
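The question asks for numeric values (95813 rather than "95813"), so the transformation may also need a string-to-number conversion. The following is a minimal sketch that can be run locally as plain JavaScript, assuming the input shape shown in the question; `sample` and `result` are hypothetical names, and values that do not parse as numbers are left unchanged.

```javascript
// Sketch of the name/value folding with numeric conversion.
// Runs as plain JavaScript; sample and result are illustrative names.
function main(arg) {
  var array = (arg && arg.outputs) || [];
  var map = {};
  for (var i = 0; i < array.length; i++) {
    var num = Number(array[i].value);
    // Keep the original value when it is not numeric; "unit" is dropped.
    map[array[i].name] = isNaN(num) ? array[i].value : num;
  }
  return { outputs: [map] };
}

var sample = {
  outputs: [
    { name: "output1", unit: "unit1", value: "95813" },
    { name: "output2", unit: "unit2", value: "303883" }
  ]
};

var result = main(sample);
console.log(JSON.stringify(result));
// prints {"outputs":[{"output1":95813,"output2":303883}]}
```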
I need to create a view that lists the values for an attribute of a doc field.
Sample Doc:
{
"_id": "003e5a9742e04ce7a6791aa845405c17",
"title": "testdoc",
"samples": [
{
"confidence": "high",
"handle": "joetest"
}
]
}
Example using that doc, I want a view that will return the values for "handle"
I found this example with the heading - Get contents of an object with specific attributes e.g. doc.objects.[0].attribute. But when I fill in the attribute name, e.g. "handle" and replace doc.objects with doc.samples, I get no results:
// map
function(doc) {
for (var idx in doc.objects) {
emit(doc.objects[idx], attribute)
}
}
The map function below creates an array of key-value pairs where the key is always the value of handle. Replace null with any value you want, e.g. doc.title. If you want the doc attached to every row, use the query parameter ?include_docs=true when requesting the view.
// map
function (doc) {
var samples = doc.samples
for(var i = 0, sample; sample = samples[i++];) {
emit(sample.handle, null)
}
}
Like this ->
function(doc) {
for (var i in doc.samples) {
emit(doc._id, doc.samples[i].handle)
}
}
It will produce a result with the doc._id field as the key. Or, if you want your key to be based on the .handle field, reverse the parameters in emit so you can search by startkey= and endkey=.
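A map function can be tried outside CouchDB by supplying a stand-in for emit. The sketch below is only a local illustration under that assumption: `rows`, `map`, and the local `emit` are hypothetical helpers, and the guard for documents without a samples array is an addition not present in the answers.

```javascript
// Stand-in for CouchDB's emit(): collects rows so the map can be run locally.
var rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Map function keyed on handle, skipping documents without a samples array.
function map(doc) {
  if (!Array.isArray(doc.samples)) {
    return;
  }
  doc.samples.forEach(function (sample) {
    if (sample && sample.handle) {
      emit(sample.handle, doc.title);
    }
  });
}

map({
  _id: "003e5a9742e04ce7a6791aa845405c17",
  title: "testdoc",
  samples: [{ confidence: "high", handle: "joetest" }]
});
map({ _id: "doc-without-samples" }); // emits nothing

console.log(JSON.stringify(rows));
// prints [{"key":"joetest","value":"testdoc"}]
```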
Is the below query supported in Azure DocumentDB? It returns no documents.
Variables values at runtime:
1. collectionLink = "<link for my collection>"
2. feedOptions = new FeedOptions { MaxItemCount = 2 }
3. name = "chris"
client.CreateDocumentQuery<T>(collectionLink, feedOptions).Where(m => (m.Status == "Foo" && (m.Black.Name == null || m.Black.Name != name) && (m.White.Name == null || m.White.Name != name)));
I have tested with simpler queries, such as the below, which both return results I expect.
client.CreateDocumentQuery<T>(collectionLink, feedOptions).Where(m => m.Status == "Foo");
client.CreateDocumentQuery<T>(collectionLink, feedOptions).Where(m => m.Status == "Foo").Where(m => m.Size == 19);
Lastly, I've ensured there are documents which meet the problematic query's filter criteria:
{
"id": "1992db52-c9c6-4496-aaaa-f8cb83a8c6b0",
"status": "Foo",
"size": 19,
"black": {
"name": "charlie"
},
"white": {}
}
Thanks.
Turns out the "m.White.Name == null || m.White.Name != name" check is problematic because the Name field does not exist on the document in the DB.
When the document is edited to the following, the query returns it. Notice the explicit null value for Name field.
{
"id": "1992db52-c9c6-4496-aaaa-f8cb83a8c6b0",
"status": "Foo",
"size": 19,
"black": {
"name": "charlie"
},
"white": {
"name": null
}
}
The query can be written to handle missing properties using DocumentDB UDFs as follows. DocumentDB uses JavaScript's semantics, and an explicit null is different from a missing property ("undefined") in JavaScript. To check for explicit null is simple (== null like your query), but to query for a field that may or may not exist in DocumentDB, you must first create a UDF for ISDEFINED:
function ISDEFINED(doc, prop) {
return doc[prop] !== undefined;
}
And then use it in a DocumentDB query like:
client.CreateDocumentQuery<T>(
    collectionLink,
    "SELECT * FROM docs m WHERE m.status = 'Foo' AND (NOT udf.ISDEFINED(m.white, 'name') OR m.white.name != 'chris')");
Hope this helps. Note that since != and UDFs both require scans, it's a good idea for performance/scale to always use them only within queries that have other filters.
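The difference between a missing property and an explicit null can be seen in plain JavaScript. The sketch below is a local illustration only: `nameIsNot` and the three sample objects are hypothetical, and the extra guard on `doc` in ISDEFINED is an addition. Note that in plain JavaScript `undefined !== "chris"` is already true; inside a DocumentDB query the comparison against a missing property does not evaluate to true, which is why the explicit check matters there.

```javascript
// Guarded variant of the ISDEFINED helper (the guard on doc is an addition).
function ISDEFINED(doc, prop) {
  return doc !== undefined && doc !== null && doc[prop] !== undefined;
}

// Hypothetical predicate mirroring the intended filter for one side:
// keep the document when name is missing, null, or different from `name`.
function nameIsNot(side, name) {
  return !ISDEFINED(side, "name") || side.name !== name;
}

var withMissing = { status: "Foo", black: { name: "charlie" }, white: {} };
var withNull = { status: "Foo", black: { name: "charlie" }, white: { name: null } };
var withChris = { status: "Foo", black: { name: "charlie" }, white: { name: "chris" } };

console.log(ISDEFINED(withMissing.white, "name")); // false: property is absent
console.log(ISDEFINED(withNull.white, "name"));    // true: explicit null is defined
console.log(nameIsNot(withMissing.white, "chris")); // true
console.log(nameIsNot(withChris.white, "chris"));   // false
```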
Is there a way to do the following in CouchDB? A way to return unique, distinct values by a given key?
SELECT DISTINCT field FROM table WHERE key="key1"
'key1' => 'somevalue'
'key1' => 'somevalue'
'key2' => 'anotherval'
'key2' => 'andanother'
'key2' => 'andanother'
For example:
http://localhost:5984/database/_design/designdoc/_view/distinctview?key="key1" would return ['somevalue']
http://localhost:5984/database/_design/designdoc/_view/distinctview?key="key2" would return ['anotherval', 'andanother']
As suggested in the CouchDB definitive guide, you should put the values you want to be unique in the key, then query the reduce function with group=true.
For example, given that keyfield is the field with "key1" and "key2" and valuefield is the field with the values, your map function could be:
function(doc) {
// filter to get only the interesting documents: change as needed
if (doc.keyfield && doc.valuefield) {
/*
* This is the important stuff:
*
* - by putting both, the key and the value, in the emitted key,
* you can filter out duplicates
* (simply group the results on the full key);
*
* - as a bonus, by emitting 1 as the value, you get the number
* of duplicates by using the `_sum` reduce function.
*/
emit([doc.keyfield, doc.valuefield], 1);
}
}
and your reduce function could be:
_sum
Then querying with group=true&startkey=["key2"]&endkey=["key2",{}] gives:
{"rows":[
{"key":["key2","andanother"],"value":2},
{"key":["key2","anotherval"],"value":1}
]}
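The effect of group=true with a _sum reduce can be approximated locally by grouping emitted rows on their full key. This is only a sketch of the idea, not how CouchDB implements it: `groupAndSum` and `emitted` are hypothetical names, and groups come back in first-seen order rather than in CouchDB's collation order.

```javascript
// Groups rows by their full (JSON-serialized) key and sums the values,
// roughly what group=true combined with _sum produces.
function groupAndSum(rows) {
  var totals = {};
  var order = [];
  rows.forEach(function (row) {
    var k = JSON.stringify(row.key);
    if (!Object.prototype.hasOwnProperty.call(totals, k)) {
      totals[k] = 0;
      order.push(k);
    }
    totals[k] += row.value;
  });
  return order.map(function (k) {
    return { key: JSON.parse(k), value: totals[k] };
  });
}

// Rows the map function would emit for the five example entries.
var emitted = [
  { key: ["key1", "somevalue"], value: 1 },
  { key: ["key1", "somevalue"], value: 1 },
  { key: ["key2", "anotherval"], value: 1 },
  { key: ["key2", "andanother"], value: 1 },
  { key: ["key2", "andanother"], value: 1 }
];

// Keep only the groups whose first key element is "key2".
var key2Groups = groupAndSum(emitted).filter(function (row) {
  return row.key[0] === "key2";
});

console.log(JSON.stringify(key2Groups));
```

Each distinct value appears once, and the summed value gives the number of duplicates.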
Based on what I see here, (I'll change my answer if needed) key1 and key2 look like independent fields, so you'll need 2 separate views.
I created 5 simple documents in my test database:
// I've left out fields like _id and _rev for the sake of simplicity
{ "key1": "somevalue" }
{ "key1": "somevalue" }
{ "key2": "anotherval" }
{ "key2": "andanother" }
{ "key2": "andanother" }
Here are the 2 view queries you'll need:
// view for key1
function(doc) {
if (doc.key1) {
emit("key1", doc.key1);
}
}
// view for key2
function(doc) {
if (doc.key2) {
emit("key2", doc.key2);
}
}
From there, your reduce function can return all the values in an array by just doing this:
function (key, values) {
return values;
}
However, you specifically mentioned distinct values. Since JavaScript doesn't have a native unique() method for arrays, and we can't use CommonJS modules in view functions, we'll have to add our own logic for that. I just copy-pasted the first array.unique() function I found on Google; you can certainly write a better-optimized one.
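function (key, values, rereduce) {
    var o = {}, i, j, items, r = [];
    for (i = 0; i < values.length; i += 1) {
        // On rereduce, each entry is an array returned by an earlier reduce call
        items = rereduce ? values[i] : [values[i]];
        for (j = 0; j < items.length; j += 1) {
            o[items[j]] = items[j];
        }
    }
    for (i in o) {
        r.push(o[i]);
    }
    return r;
}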
You'll use this same reduce function in both views. When you query either of those views, by default it will also perform the reduce. (You'll need to explicitly pass reduce=false to get just the results of your map function.)
Here are the result-sets you'd retrieve using the above map/reduce queries: (remember they are 2 separate queries)
{"rows":[
{"key":"key1", "value": ["somevalue"]}
]}
{"rows":[
{"key": "key2", "value": ["anotherval", "andanother"]}
]}
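A reduce function may also be called with rereduce set to true, in which case values holds results of earlier reduce calls rather than raw emitted values. The sketch below is a local illustration of a de-duplicating reduce that handles both cases; `uniqueReduce`, `partialA`, `partialB`, and `combined` are hypothetical names, and the way the values are split into partial batches is an assumption made only for the demonstration.

```javascript
// De-duplicating reduce that also handles the rereduce step, where each
// entry in values is itself an array produced by an earlier reduce call.
function uniqueReduce(key, values, rereduce) {
  var seen = {};
  var result = [];
  values.forEach(function (entry) {
    var items = rereduce ? entry : [entry];
    items.forEach(function (item) {
      if (!Object.prototype.hasOwnProperty.call(seen, item)) {
        seen[item] = true;
        result.push(item);
      }
    });
  });
  return result;
}

// Simulate two partial reductions followed by a rereduce.
var partialA = uniqueReduce(null, ["anotherval", "andanother"], false);
var partialB = uniqueReduce(null, ["andanother"], false);
var combined = uniqueReduce(null, [partialA, partialB], true);

console.log(JSON.stringify(combined));
// prints ["anotherval","andanother"]
```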