I've got a problem I can't solve on my own, so I'm trying to find help on Stack Overflow.
I'm generating XSD and using a Python-based converter to turn the XSD into a JSON schema. The JSON schema is then used to validate JSON documents (I won't go into the details). My problem is that the JSON I have to validate contains an array of arrays of integers.
The JSON-code looks like this:
"factors": [
[12,3],
[1,14]
]
I know how to write the JSON schema for this:
"factors": {
"items": {
"$ref": "#/definitions/factorscontent"
},
"type": "array"
}
...
"factorscontent": {
"items": {
"type": "integer"
},
"type": "array",
"properties": {}
}
This works fine. But I don't know how to model this structure in XSD. Does anyone know how to handle the "integers in arrays in an array" problem?
One thing to add: if the JSON code looked like this:
"factors": [ "items":{ [12,3], [1,14] } ]
I wouldn't have a problem because I could express the content of the "factors-array" as
<xs:element name="items" type="xs:integer" maxOccurs="unbounded"/>
But the inner arrays contain just integers, no elements!
To avoid downvoting, please clean up your "would" JSON example, since "factors": [ "items":{ [12,3], [1,14] } ] is simply not well-formed JSON. A well-formed version might look like the one below (feel free to correct it so that it reflects what you intended; at least, our tool would generate it this way from the XSD snippet you've shared):
{
"factors": [
{
"items": [ 12, 3 ]
},
{
"items": [ 1, 14 ]
}
]
}
Your question has an easy answer: there is no way to do it unless you rely on proprietary markup in the XSD.
In XML, representing data requires some sort of markup; a text node and an attribute require an element to "hold" them. In XSD, you may use an <xsd:list/> to model an array of integers; even then, using it requires a (repeating) element to make it an array of arrays (alternatively, the element may be wrapped in a repeating compositor, typically an <xsd:sequence/>).
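For illustration, a sketch of the xsd:list approach described above (the IntList and row names are made up for this example):

```xml
<!-- A list simple type models one inner array of integers, e.g. "12 3" -->
<xs:simpleType name="IntList">
  <xs:list itemType="xs:integer"/>
</xs:simpleType>

<!-- A repeating element is still required to model the outer array -->
<xs:element name="factors">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="row" type="IntList" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

Note that the repeating row element is exactly the mandatory markup that makes a converter emit a wrapping object/property.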
Because of this must-have element, any converter that I know of will automatically create objects (where "object" is as defined by the JSON Schema draft) and use the name of the element and/or attribute to create a property. Your array of arrays has no object, hence no property is allowed; therefore you cannot rely on core XSD constructs for this kind of conversion.
We've been doing XSD-to-JSD conversions for three years against real XSDs, so we've refined this conversion quite a bit. To achieve a scenario such as yours, we've defined a proprietary markup in an <xsd:appinfo/>: it instructs the conversion engine to skip creating a property and, with it, the associated object. I would recommend changing your Python-based parser so that it considers some sort of "hints", either the way we did it or by authoring patterns (if you wish to apply them indiscriminately).
Related
This is a contrived, made-up example that may not make sense practically, but I'm trying to paint a picture:
There is a web service / search API that supports Relay style pagination for searching for products
A user makes a search and requests 3 documents back (e.g. first: 3)
The service takes the request and passes it to a search index
The search index returns:
[
{
"name": "Shirt",
"sizes": ["S"]
},
{
"name": "Jacket",
"sizes": ["XL"]
},
{
"name": "Hat",
"sizes": ["S", "M"]
}
]
The result should be expanded so that each product appears as an individual record in the result set, with one size per record. The example above would split the Hat product into two results, so the final first page would be:
[
{
"name": "Shirt",
"sizes": ["S"]
},
{
"name": "Jacket",
"sizes": ["XL"]
},
{
"name": "Hat",
"sizes": ["S"]
}
]
If the SECOND page were requested, it would actually start with the second Hat size (M):
[
...
{
"name": "Hat",
"sizes": ["M"]
},
...
]
I'm wondering if there is a common strategy for handling this, or common libraries that I might use to handle some of this logic.
I'm using [OpenSearch](https://opensearch.org), and Elasticsearch has "collapse" and "expand" features that sound like they almost do what I want at the search-backend level, but unfortunately I don't think that's actually the case.
In reality, what I want to do is likely not 100% possible: if the search results change between queries, you might not see the correct thing on a subsequent page, for example. Still, this feels like a common enough issue that there should be some discussion or solution around it.
One somewhat reliable way of handling this would be to denormalize the data in the search index a bit and just index (for my example) a separate document for both the S and M Hat products, even though the rest of the data would be the same. I'd just need to make sure to remove all stale documents, and I'd need to come up with unique identifiers for the indexed documents (so somehow encode the size in the indexed document's ID).
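The expand-then-paginate behavior described above can be sketched in plain JavaScript. The function names (expandBySize, page) are made up for illustration, and a real Relay-style API would use cursors rather than the offset used here:

```javascript
// Expand each product into one record per size.
function expandBySize(products) {
  return products.flatMap((p) =>
    p.sizes.map((size) => ({ name: p.name, sizes: [size] }))
  );
}

// Slice a page out of the expanded result set (offset pagination for brevity).
function page(products, first, offset = 0) {
  return expandBySize(products).slice(offset, offset + first);
}

const results = [
  { name: "Shirt", sizes: ["S"] },
  { name: "Jacket", sizes: ["XL"] },
  { name: "Hat", sizes: ["S", "M"] },
];

const firstPage = page(results, 3);     // Shirt/S, Jacket/XL, Hat/S
const secondPage = page(results, 3, 3); // starts with Hat/M
```

As noted above, this only behaves correctly if the underlying result set does not change between page requests.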
This is my first question, so I hope I write it well; any suggestions are welcome. This is a design question.
I'm using the create-electron-app-typescript boilerplate. I use uniforms & JSON schemas for user input, and I'm new to all of these languages/tools.
I'm designing a tool that aims to create a simple, general graphical interface for modules that can expose (export) certain functionalities.
e.g.:
module.exports = {
  sendMoney(account, amount) { /* account is optional */ },
  foo(something) { /* ... */ }
}
I wish I could translate each function into a schema like this:
{
"title": "sendMoney",
"type": "object",
"properties": {
"account": {
"type":"string"
},
"amount": {
"type":"int"
}
},
"required": ["amount"]
}
in order to easily translate it into a form, as I do now using Uniforms/React.
The goal is to let the user implicitly create a high-level "script": the ordered, sequential execution of these functions, which will be interpreted later.
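Since parameter types can't be reliably introspected from plain JavaScript functions, one option is for each module to declare metadata alongside its exports and derive the schema from that. This is a sketch under that assumption; the metadata shape (name, params, required) and the toSchema helper are made up here:

```javascript
// Derive a JSON-Schema-like object from hand-written parameter metadata.
function toSchema(meta) {
  const properties = {};
  for (const p of meta.params) {
    properties[p.name] = { type: p.type };
  }
  return {
    title: meta.name,
    type: "object",
    properties,
    required: meta.params.filter((p) => p.required).map((p) => p.name),
  };
}

const schema = toSchema({
  name: "sendMoney",
  params: [
    { name: "account", type: "string", required: false },
    { name: "amount", type: "integer", required: true },
  ],
});
// schema.required is ["amount"], matching the schema in the question
```

The resulting object can then be handed straight to Uniforms to render the form.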
Let's say my graphql server wants to fetch the following data as JSON, where person3 and person5 are some IDs:
"persons": {
"person3": {
"id": "person3",
"name": "Mike"
},
"person5": {
"id": "person5",
"name": "Lisa"
}
}
Question: How to create the schema type definition with apollo?
The keys person3 and person5 here are dynamically generated depending on my query (i.e. the area used in the query). So at another time I might get person1, person2, person3 returned.
As you can see, persons is not iterable, so the following GraphQL type definition I wrote with apollo won't work:
type Person {
id: String
name: String
}
type Query {
persons(area: String): [Person]
}
The keys in the persons object may always be different.
One solution, of course, would be to transform the incoming JSON data to use an array for persons, but is there no way to work with the data as is?
GraphQL relies on both the server and the client knowing ahead of time what fields are available for each type. In some cases, the client can discover those fields (via introspection), but the server always needs to know them ahead of time. So dynamically generating those fields based on the returned data is not really possible.
You could utilize a custom JSON scalar (graphql-type-json module) and return that for your query:
type Query {
persons(area: String): JSON
}
By utilizing JSON, you bypass the requirement for the returned data to fit any specific structure, so you can send back whatever you want as long as it's properly formatted JSON.
Of course, there are significant disadvantages to doing this. For example, you lose the safety net provided by the type(s) you would have previously used (literally any structure could be returned, and if you're returning the wrong one, you won't find out about it until the client tries to use it and fails). You also lose the ability to use resolvers for any fields within the returned data.
But... your funeral :)
As an aside, I would consider flattening the data into an array (like you suggested in your question) before sending it back to the client. If you're writing the client code and working with a dynamically sized list of persons, chances are an array will be much easier to work with than an object keyed by id. If you're using React, for example, and displaying a component for each person, you'll end up converting that object to an array to map over it anyway. In designing your API, I would make client usability a higher consideration than avoiding additional processing of your data.
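The flattening suggested above is a one-liner with Object.values, which keeps the schema's [Person] type intact:

```javascript
// Flatten an object keyed by id into an array before returning it
// from the resolver, so the schema can stay `persons: [Person]`.
const persons = {
  person3: { id: "person3", name: "Mike" },
  person5: { id: "person5", name: "Lisa" },
};

const personList = Object.values(persons);
// [{ id: "person3", name: "Mike" }, { id: "person5", name: "Lisa" }]
```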
You can write your own GraphQLScalarType and precisely describe your object and your dynamic keys, what you allow and what you do not allow or transform.
See https://graphql.org/graphql-js/type/#graphqlscalartype
You can have a look at taion/graphql-type-json, where he creates a scalar that accepts and passes through any kind of content:
https://github.com/taion/graphql-type-json/blob/master/src/index.js
I had a similar problem with dynamic keys in a schema, and ended up going with a solution like this:
query lookupPersons {
persons {
personKeys
person3: personValue(key: "person3") {
id
name
}
}
}
returns:
{
  "data": {
    "persons": {
      "personKeys": ["person1", "person2", "person3"],
      "person3": {
        "id": "person3",
        "name": "Mike"
      }
    }
  }
}
By shifting the complexity to the query, this simplifies the response shape.
The advantage over the JSON approach is that the client doesn't need any deserialisation.
Additional info for Venryx: a possible schema to fit my query looks like this:
type Person {
id: String
name: String
}
type PersonsResult {
personKeys: [String]
personValue(key: String): Person
}
type Query {
persons(area: String): PersonsResult
}
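A possible set of resolvers to back that schema might look like the sketch below. The data store and the fetchPersonsByArea function are made up here; the PersonsResult resolvers just wrap the object keyed by id:

```javascript
// Made-up in-memory store, keyed by area, then by person id.
const store = {
  berlin: {
    person3: { id: "person3", name: "Mike" },
    person5: { id: "person5", name: "Lisa" },
  },
};

// Hypothetical data-access function returning an object keyed by person id.
function fetchPersonsByArea(area) {
  return store[area] || {};
}

const resolvers = {
  Query: {
    // The raw keyed object becomes the parent value for PersonsResult.
    persons: (_root, { area }) => fetchPersonsByArea(area),
  },
  PersonsResult: {
    personKeys: (personsById) => Object.keys(personsById),
    personValue: (personsById, { key }) => personsById[key],
  },
};
```

With these resolvers, the personValue field simply indexes into the parent object using the key argument, which is what makes the aliased lookups in the query work.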
As an aside, if your data set for persons gets large enough, you're probably going to want pagination on personKeys as well, at which point you should look into https://relay.dev/graphql/connections.htm
I am trying to update a value in the nested array but can't get it to work.
My object is like this
{
"_id": {
"$oid": "1"
},
"array1": [
{
"_id": "12",
"array2": [
{
"_id": "123",
"answeredBy": [], // need to push "success"
},
{
"_id": "124",
"answeredBy": [],
}
],
}
]
}
I need to push a value to "answeredBy" array.
In the example below, I tried pushing the string "success" to the "answeredBy" array of the object with _id "123", but it does not work.
callback = function(err,value){
if(err){
res.send(err);
}else{
res.send(value);
}
};
conditions = {
"_id": 1,
"array1._id": 12,
"array2._id": 123
};
updates = {
$push: {
"array2.$.answeredBy": "success"
}
};
options = {
upsert: true
};
Model.update(conditions, updates, options, callback);
I found this link, but its answer only says I should use an object-like structure instead of arrays. That cannot be applied in my situation; I really need my objects nested in arrays.
It would be great if you could help me out here. I've been spending hours trying to figure this out.
Thank you in advance!
General Scope and Explanation
There are a few things wrong with what you are doing here. Firstly, your query conditions: you are referring to several _id values where you should not need to, and at least one of them is not at the top level.
In order to get at a "nested" value, and presuming that the _id value is unique and would not appear in any other document, your query should take this form:
Model.update(
{ "array1.array2._id": "123" },
{ "$push": { "array1.0.array2.$.answeredBy": "success" } },
function(err,numAffected) {
// something with the result in here
}
);
Now that would actually work, but really it is only a fluke that it does as there are very good reasons why it should not work for you.
The important reading is in the official documentation for the positional $ operator under the subject of "Nested Arrays". What this says is:
The positional $ operator cannot be used for queries which traverse more than one array, such as queries that traverse arrays nested within other arrays, because the replacement for the $ placeholder is a single value
Specifically what that means is the element that will be matched and returned in the positional placeholder is the value of the index from the first matching array. This means in your case the matching index on the "top" level array.
So if you look at the query notation as shown, we have "hardcoded" the first ( or 0 index ) position in the top level array, and it just so happens that the matching element within "array2" is also the zero index entry.
To demonstrate this, you can change the matching _id value to "124" and the result will $push a new entry onto the element with _id "123", as they are both in the zero-index entry of "array1" and that is the value returned to the placeholder.
So that is the general problem with nesting arrays. You could remove one of the levels and you would still be able to $push to the correct element in your "top" array, but there would still be multiple levels.
Try to avoid nesting arrays as you will run into update problems as is shown.
The general case is to "flatten" the things you "think" are "levels" and actually make these "attributes" on the final detail items. For example, the "flattened" form of the structure in the question should be something like:
{
"answers": [
{ "by": "success", "type2": "123", "type1": "12" }
]
}
Or even when accepting the inner array is $push only, and never updated:
{
"array": [
{ "type1": "12", "type2": "123", "answeredBy": ["success"] },
{ "type1": "12", "type2": "124", "answeredBy": [] }
]
}
Both of these lend themselves to atomic updates within the scope of the positional $ operator.
MongoDB 3.6 and Above
From MongoDB 3.6 there are new features available to work with nested arrays. This uses the positional filtered $[<identifier>] syntax in order to match the specific elements and apply different conditions through arrayFilters in the update statement:
Model.update(
{
"_id": 1,
"array1": {
"$elemMatch": {
"_id": "12","array2._id": "123"
}
}
},
{
"$push": { "array1.$[outer].array2.$[inner].answeredBy": "success" }
},
{
"arrayFilters": [{ "outer._id": "12" },{ "inner._id": "123" }]
}
)
The "arrayFilters" as passed to the options for .update() or even
.updateOne(), .updateMany(), .findOneAndUpdate() or .bulkWrite() method specifies the conditions to match on the identifier given in the update statement. Any elements that match the condition given will be updated.
Because the structure is "nested", we actually use "multiple filters" as is specified with an "array" of filter definitions as shown. The marked "identifier" is used in matching against the positional filtered $[<identifier>] syntax actually used in the update block of the statement. In this case inner and outer are the identifiers used for each condition as specified with the nested chain.
This new expansion makes the update of nested array content possible, but it does not really help with the practicality of "querying" such data, so the same caveats apply as explained earlier.
You typically really "mean" to express these as "attributes": even if your brain initially thinks "nesting", that is usually just a reaction to how you believe the "previous relational parts" come together. In reality, you need more denormalization.
Also see How to Update Multiple Array Elements in mongodb, since these new update operators actually match and update "multiple array elements" rather than just the first, which has been the previous action of positional updates.
NOTE Somewhat ironically, since this is specified in the "options" argument for .update() and like methods, the syntax is generally compatible with all recent release driver versions.
However, this is not true of the mongo shell: because of the way the method is implemented there ("ironically, for backward compatibility"), the arrayFilters argument is not recognized and is removed by an internal method that parses the options in order to deliver "backward compatibility" with prior MongoDB server versions and a "legacy" .update() API call syntax.
So if you want to use the command in the mongo shell or other "shell based" products (notably Robo 3T), you need the latest version from either the development branch or a production release of 3.6 or greater.
See also positional all $[] which also updates "multiple array elements" but without applying to specified conditions and applies to all elements in the array where that is the desired action.
I know this is a very old question, but I just struggled with this problem myself and found what I believe to be a better answer.
A way to solve this problem is to use sub-documents. This is done by nesting schemas within your schemas:
// innermost schema first, so each schema exists before it is referenced
Array2Schema = new mongoose.Schema({
    answeredBy: [...]
})

Array1Schema = new mongoose.Schema({
    array2: [Array2Schema]
})

MainSchema = new mongoose.Schema({
    array1: [Array1Schema]
})
This way the object will look like the one you show, but now each array is filled with sub-documents. This makes it possible to dot your way into the sub-document you want. Instead of using .update, you then use .find or .findOne to get the document you want to update.
Main.findOne({ _id: 1 })
    .exec(function(err, result) {
        result.array1.id(12).array2.id(123).answeredBy.push('success')
        result.save(function(err) {
            console.log(result)
        })
    })
I haven't used the .push() function this way myself, so the syntax might not be exactly right, but I have used both .set() and .remove() like this, and both work perfectly fine.
I am trying to write a lookup which returns an array from a document while skipping some fields:
{
"id": 10000,
"schedule": [
{
"day": 0,
"flight": "AF198",
"utc": "10:13:00"
},
{
"day": 0,
"flight": "AF547",
"utc": "19:14:00"
},
...
]
}
I would like to get all schedule items, but only their flight properties. I want to get something like this:
[
  {
    "flight": "AF198"
  },
  {
    "flight": "AF547"
  },
  ...
]
bucket.lookupIn(key).get("schedule.flight") doesn't work. I also tried "schedule[].flight" and "schedule.$.flight". It seems I always need to know the index.
I saw that this is possible with N1QL.
Couchbase - SELECT a subset of fields from array of objects
Do you guys know how to do this with the Sub-Document API? Sorry if it is a trivial question; I just cannot find an example at
https://developer.couchbase.com/documentation/server/current/sdk/subdocument-operations.html
The Couchbase Sub-Document API requires the full path; it does not support expansion, so in this case it needs to know the index within the array. There are a few other options:
If every path is known, you could chain the subdocument gets. A total of 16 paths can be fetched at once:
bucket.lookupIn(key).get("schedule[0].flight").get("schedule[1].flight")
Get the parent object and filter on the application side:
bucket.lookupIn(key).get("schedule")
As mentioned in the question, use N1QL.
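A sketch of option 2, the application-side filter: fetch the whole "schedule" array with a single subdocument get and project out the flight fields yourself. The fetched value is stubbed here with the data from the question; in real code it would come from bucket.lookupIn(key).get("schedule"):

```javascript
// Stub of the value a `bucket.lookupIn(key).get("schedule")` call would return.
const schedule = [
  { day: 0, flight: "AF198", utc: "10:13:00" },
  { day: 0, flight: "AF547", utc: "19:14:00" },
];

// Keep only the flight property of each schedule item.
const flights = schedule.map(({ flight }) => ({ flight }));
// [{ flight: "AF198" }, { flight: "AF547" }]
```

This transfers the whole array over the wire, so for large schedules the N1QL approach is the better fit.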