How to use a Pydantic base schema for n child schemas - python-3.x

I'm new to pydantic. I want to define a pydantic schema and fields for the Python dictionary below, which follows the JSON:API standard:
{
    "data": {
        "type": "string",
        "attributes": {
            "title": "string",
            "name": "string"
        }
    }
}
I managed to achieve this by defining multiple schemas, like below:
from pydantic import BaseModel

class Child2(BaseModel):
    title: str
    name: str

class Child1(BaseModel):
    type: str
    attributes: Child2

class BaseParent(BaseModel):
    data: Child1
But I will be having multiple JSON requests with the same JSON:API structure, as below:
example 1:
{
    "data": {
        "type": "string",
        "attributes": {
            "source": "001",
            "status": "New"
        }
    }
}
example 2:
{
    "data": {
        "type": "string",
        "attributes": {
            "id": "001"
        }
    }
}
If you look at the dictionaries above, only the values under the attributes object are different. So, is there any way that I can define a parent pydantic schema for { "data": { "type": "string", "attributes": { } } } and use that parent schema for all the child schemas?

I finally found an answer; hope this will help someone.
Pydantic's create_model function resolves this kind of requirement: pass the child schema as one of the field definitions to create_model.
from pydantic import BaseModel, create_model

class Child(BaseModel):
    title: str
    name: str

class BaseParent(BaseModel):
    data: create_model('BaseParent', type=(str, ...), attributes=(Child, ...))
And this will produce a BaseParent structure like the one below:
data=BaseParent(type='id', attributes=Child(title='test002', name='Test'))
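Building on that, one way to avoid repeating the envelope for every request is to wrap create_model in a small factory. This is only a sketch of my own on top of the answer above: the helper name make_jsonapi_model and the two attribute models below are made up to match the earlier examples.

from typing import Type

from pydantic import BaseModel, create_model

def make_jsonapi_model(name: str, attributes_model: Type[BaseModel]) -> Type[BaseModel]:
    # Builds the shared {"data": {"type": ..., "attributes": ...}} envelope
    # around whichever attributes model a given request needs.
    data_model = create_model(name + 'Data', type=(str, ...), attributes=(attributes_model, ...))
    return create_model(name, data=(data_model, ...))

class StatusAttributes(BaseModel):   # attributes of example 1
    source: str
    status: str

class IdAttributes(BaseModel):       # attributes of example 2
    id: str

StatusRequest = make_jsonapi_model('StatusRequest', StatusAttributes)
IdRequest = make_jsonapi_model('IdRequest', IdAttributes)

# parse_obj is the pydantic v1 API; in v2 use model_validate instead.
request = StatusRequest.parse_obj(
    {"data": {"type": "string", "attributes": {"source": "001", "status": "New"}}}
)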

Related

Nested iteration over JSON using a Groovy closure in REST-assured

I have the following JSON response for my REST endpoint:
{
"response": {
"status": 200,
"startRow": 0,
"endRow": 1,
"totalRows": 1,
"next": "",
"data": {
"id": "workflow-1",
"name": "SampleWorkflow",
"tasks": [
{
"id": "task-0",
"name": "AWX",
"triggered_by": ["task-5"]
},
{
"id": "task-1",
"name": "BrainStorming",
"triggered_by": ["task-2", "task-5"]
},
{
"id": "task-2",
"name": "OnHold",
"triggered_by": ["task-0", "task-4", "task-7", "task-8", "task9"]
},
{
"id": "task-3",
"name": "InvestigateSuggestions",
"triggered_by": ["task-6"]
},
{
"id": "task-4",
"name": "Mistral",
"triggered_by": ["task-3"]
},
{
"id": "task-5",
"name": "Ansible",
"triggered_by": ["task-3"]
},
{
"id": "task-6",
"name": "Integration",
"triggered_by": []
},
{
"id": "task-7",
"name": "Tower",
"triggered_by": ["task-5"]
},
{
"id": "task-8",
"name": "Camunda",
"triggered_by": ["task-3"]
},
{
"id": "task-9",
"name": "HungOnMistral",
"triggered_by": ["task-0", "task-7"]
},
{
"id": "task-10",
"name": "MistralIsChosen",
"triggered_by": ["task-1"]
}
]
}
}
}
I am using REST-assured with a Groovy GPath expression for the extraction, as follows:
given()
    .when()
    .get("http://localhost:8080/workflow-1")
    .then()
    .extract()
    .path("response.data.tasks.findAll{ it.triggered_by.contains('task-3') }.name")
which correctly gives me [Mistral, Ansible, Camunda]
What I am trying to achieve is to find the task names that are triggered by the InvestigateSuggestions task. But I don't know for sure that the taskId I have to pass in to contains() is task-3; I only know its name i.e. InvestigateSuggestions. So I attempt to do:
given()
    .when()
    .get("http://localhost:8080/workflow-1")
    .then()
    .extract()
    .path("response.data.tasks.findAll{
        it.triggered_by.contains(response.data.tasks.find{
        it.name.equals('InvestigateSuggestions')}.id) }.name")
which does not work and complains that the parameter "response" was used but not defined.
How do I iterate over the outer collection from inside the findAll closure to find the correct id to pass into contains()?
You can make use of a dirty secret, the restAssuredJsonRootObject. This is undocumented (and subject to change, although it has never changed, as far as I can remember, in the 7+ year lifespan of REST Assured).
This would allow you to write:
given()
    .when()
    .get("http://localhost:8080/workflow-1")
    .then()
    .extract()
    .path("response.data.tasks.findAll{
        it.triggered_by.contains(restAssuredJsonRootObject.response.data.tasks.find{
        it.name.equals('InvestigateSuggestions')}.id) }.name")
If you don't want to use this "hack" then you need to do something similar to what Michael Easter proposed in his answer.
When it comes to generating matchers based on the response body, the story is better. See the docs here.
I'm not sure if this is idiomatic but one approach is to find the id first and then substitute into another query:
@Test
void testCase1() {
    def json = given()
        .when()
        .get("http://localhost:5151/egg_minimal/stacko.json")

    // e.g. id = 'task-3' for name 'InvestigateSuggestions'
    def id = json
        .then()
        .extract()
        .path("response.data.tasks.find { it.name == 'InvestigateSuggestions' }.id")

    // e.g. names of tasks triggered by 'task-3'
    def tasks = json
        .then()
        .extract()
        .path("response.data.tasks.findAll{ it.triggered_by.contains('${id}') }.name")

    assertEquals(['Mistral', 'Ansible', 'Camunda'], tasks)
}

Logic App: Finding an element in a JSON object array (like XPath for XML)

In my logic app, I have a JSON object (parsed from an API response) and it contains an object array.
How can I find a specific element based on attribute values? Example below, where I want to find the (first) active one:
{
"MyList" : [
{
"Descrip" : "This is the first item",
"IsActive" : "N"
},
{
"Descrip" : "This is the second item",
"IsActive" : "N"
},
{
"Descrip" : "This is the third item",
"IsActive" : "Y"
}
]
}
Well... the answer is in plain sight: there's a Filter array action, which works on a JSON object (from the Parse JSON action). Coupling this with a first() expression gives the desired outcome.
You can use the Parse JSON Task to parse your JSON and a Condition to filter for the IsActive attribute:
Use the following Schema to parse the JSON:
{
"type": "object",
"properties": {
"MyList": {
"type": "array",
"items": {
"type": "object",
"properties": {
"Descrip": {
"type": "string"
},
"IsActive": {
"type": "string"
}
},
"required": [
"Descrip",
"IsActive"
]
}
}
}
}
Here is how it looks (I included the sample data you provided to test it):
Then you can add the Condition:
And perform whatever action you want within the If true section.

How to convert JSON schema to mongoose schema

Is there a way to take a valid JSON schema, like the one below, and turn it into a mongoose schema?
{
"$schema": "http://json-schema.org/draft-04/schema#",
"description": "some desc",
"title": "Product",
"type": "object",
"properties": {
"endpoints": {
"type": "array",
"items": {
"type": "string"
}
},
"poi": {
"type": "array",
"items": {
"type": "object",
"properties": {
"location_name": {
"type": "string"
},
"distance": {
"type": "string"
}
}
}
}
}
}
This seems so basic and simple to me but I haven't found anything on the net.
There are a bunch of examples of how to get a JSON schema, and a bunch of examples of how to create a mongoose schema from objects like this:
const newSchema = new mongoose.Schema({ name: String });
If I try to pass the JSON schema in directly, I get an error:
node_modules/mongoose/lib/schema.js:674
throw new TypeError('Undefined type `' + name + '` at `' + path +
^
TypeError: Undefined type `Http://json-schema.org/draft-04/schema#` at `$schema`
Did you try nesting Schemas? You can only nest using refs or arrays.
But I could not find anywhere on the net how to convert from one type to the other.
Has anyone had this issue before?
EDIT:
This question was conceptually incorrect.
Basically, what you do is validate the data against the JSON schema before saving it to the DB. You do this using jsonschema from npm or some other validator.
So the data-validation step is not directly linked to the save-to-DB step.
I thought you could apply a JSON schema to a MongoDB schema, but that is not true (especially when you have deeply nested objects; then it's a mess).
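For illustration only, here is the same validate-before-saving idea sketched in Python with the jsonschema package (not the Node code actually used; the schema, document, and messages below are made-up examples, and the actual save-to-DB call is omitted):

from jsonschema import ValidationError, validate

product_schema = {
    "type": "object",
    "properties": {
        "endpoints": {"type": "array", "items": {"type": "string"}}
    },
    "required": ["endpoints"]
}

document = {"endpoints": ["https://example.com/api"]}

try:
    # Reject the document before it ever reaches the database layer.
    validate(instance=document, schema=product_schema)
except ValidationError as err:
    print("rejected:", err.message)
else:
    print("valid, safe to save")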
I have been looking into this. Since you placed a node tag on your question, I found these npm repos:
https://github.com/jon49/json-schema-to-mongoose
https://github.com/topliceanu/mongoose-gen
Both are working so far.
They are both a few years old. The former (TypeScript) has more recent commits. I may end up liking the latter more.

How to search through data with an arbitrary number of fields?

I have a web-form builder for science events. The event moderator creates a registration form with an arbitrary number of boolean, integer, enum, and text fields.
The created form is used to:
register a new member to the event;
search through registered members.
What is the best search tool for the second task (searching the members of an event)? Is Elasticsearch well suited for this?
I wrote a post about how to index arbitrary data into Elasticsearch and then search it by specific fields and values, all without blowing up your index mapping.
The post is here: http://smnh.me/indexing-and-searching-arbitrary-json-data-using-elasticsearch/
In short, you will need to do the following steps to get what you want:
Create a special index described in the post.
Flatten the data you want to index using the flattenData function:
https://gist.github.com/smnh/30f96028511e1440b7b02ea559858af4.
Create a document with the original and flattened data and index it into Elasticsearch:
{
"data": { ... },
"flatData": [ ... ]
}
Optional: use Elasticsearch aggregations to find which fields and types have been indexed.
Execute queries on the flatData object to find what you need.
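The flattenData function linked above is JavaScript; purely as an illustration of the idea (and much simplified: it handles only flat documents, not nested objects or arrays), a Python sketch could look like this:

from typing import Any, Dict, List

def flatten_data(document: Dict[str, Any]) -> List[Dict[str, Any]]:
    # Turn {"age": 22} into {"key": "age", "type": "long",
    # "key_type": "age.long", "value_long": 22}, matching the shape used below.
    type_names = {str: "string", bool: "boolean", int: "long", float: "double"}
    flat = []
    for key, value in document.items():
        type_name = type_names.get(type(value))
        if type_name is None:
            continue  # nested objects and arrays are handled by the real flattenData
        flat.append({
            "key": key,
            "type": type_name,
            "key_type": key + "." + type_name,
            "value_" + type_name: value,
        })
    return flat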
Example
Based on your original question, let's assume that the first event moderator created a form with the following fields to register members for the science event:
name string
age long
sex long - 0 for male, 1 for female
In addition to this data, the related event probably has some sort of id, let's call it eventId. So the final document could look like this:
{
"eventId": "2T73ZT1R463DJNWE36IA8FEN",
"name": "Bob",
"age": 22,
"sex": 0
}
Now, before we index this document, we will flatten it using the flattenData function:
flattenData(document);
This will produce the following array:
[
{
"key": "eventId",
"type": "string",
"key_type": "eventId.string",
"value_string": "2T73ZT1R463DJNWE36IA8FEN"
},
{
"key": "name",
"type": "string",
"key_type": "name.string",
"value_string": "Bob"
},
{
"key": "age",
"type": "long",
"key_type": "age.long",
"value_long": 22
},
{
"key": "sex",
"type": "long",
"key_type": "sex.long",
"value_long": 0
}
]
Then we will wrap this data in a document as I showed before and index it.
Then the second event moderator creates another form with a new field, a field with the same name and type, and also a field with the same name but a different type:
name string
city string
sex string - "male" or "female"
This event moderator decided that instead of having 0 and 1 for male and female, his form will allow choosing between two strings - "male" and "female".
Let's try to flatten the data submitted by this form:
flattenData({
"eventId": "F1BU9GGK5IX3ZWOLGCE3I5ML",
"name": "Alice",
"city": "New York",
"sex": "female"
});
This will produce the following data:
[
{
"key": "eventId",
"type": "string",
"key_type": "eventId.string",
"value_string": "F1BU9GGK5IX3ZWOLGCE3I5ML"
},
{
"key": "name",
"type": "string",
"key_type": "name.string",
"value_string": "Alice"
},
{
"key": "city",
"type": "string",
"key_type": "city.string",
"value_string": "New York"
},
{
"key": "sex",
"type": "string",
"key_type": "sex.string",
"value_string": "female"
}
]
Then, after wrapping the flattened data in a document and indexing it into Elasticsearch we can execute complicated queries.
For example, to find members named "Bob" registered for the event with ID 2T73ZT1R463DJNWE36IA8FEN we can execute the following query:
{
"query": {
"bool": {
"must": [
{
"nested": {
"path": "flatData",
"query": {
"bool": {
"must": [
{"term": {"flatData.key": "eventId"}},
{"match": {"flatData.value_string.keyword": "2T73ZT1R463DJNWE36IA8FEN"}}
]
}
}
}
},
{
"nested": {
"path": "flatData",
"query": {
"bool": {
"must": [
{"term": {"flatData.key": "name"}},
{"match": {"flatData.value_string": "bob"}}
]
}
}
}
}
]
}
}
}
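The optional aggregations step from the list above (discovering which keys and value types have been indexed so far) can be sketched with the official Python client; the index name events, the client setup, and the assumption that flatData.key_type is mapped as a non-analyzed keyword are all illustrative here, not part of the original post:

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Nested terms aggregation over the flattened key/type pairs.
resp = es.search(
    index="events",
    body={
        "size": 0,
        "aggs": {
            "flat": {
                "nested": {"path": "flatData"},
                "aggs": {
                    "keys": {"terms": {"field": "flatData.key_type", "size": 100}}
                }
            }
        }
    }
)

for bucket in resp["aggregations"]["flat"]["keys"]["buckets"]:
    print(bucket["key"], bucket["doc_count"])  # e.g. "age.long 1", "sex.string 1"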
Elasticsearch automatically detects the field content in order to index it correctly, even if the mapping hasn't been defined previously. So yes, Elasticsearch suits these cases well.
However, you may want to fine-tune this behavior, or maybe the default mapping applied by Elasticsearch doesn't correspond to what you need: in that case, take a look at the default mapping or, for even further control, the dynamic templates feature.
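As a rough illustration of what such a dynamic template could look like (the template name strings_as_keywords and the choice of mapping every new string field as a keyword are made-up examples, not taken from the answer), the mappings body passed at index creation might be:

# Sketch only: map any newly seen string field as a non-analyzed keyword
# instead of analyzed text, so exact-value term queries keep working.
mappings = {
    "dynamic_templates": [
        {
            "strings_as_keywords": {
                "match_mapping_type": "string",
                "mapping": {"type": "keyword"}
            }
        }
    ]
}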
If you let your end users decide the keys you store things in, you'll have an ever-growing mapping and cluster state, which is problematic.
This case and a suggested solution is covered in this article on common problems with Elasticsearch.
Essentially, you want to have everything that can possibly be user-defined as a value. Using nested documents, you can have a key-field and differently mapped value fields to achieve pretty much the same.

Elasticsearch term filter on inner object field not matching

I have just organized my document structure to have a more OO design (e.g. moved top level properties like venueId and venueName into a venue object with id and name fields).
However, I now cannot get a simple term filter working for fields on the child venue inner object.
Here is my mapping:
{
"deal": {
"properties": {
"textId": {"type":"string","name":"textId","index":"no"},
"displayId": {"type":"string","name":"displayId","index":"no"},
"active": {"name":"active","type":"boolean","index":"not_analyzed"},
"venue": {
"type":"object",
"path":"full",
"properties": {
"textId": {"type":"string","name":"textId","index":"not_analyzed"},
"regionId": {"type":"string","name":"regionId","index":"not_analyzed"},
"displayId": {"type":"string","name":"displayId","index":"not_analyzed"},
"name": {"type":"string","name":"name"},
"address": {"type":"string","name":"address"},
"area": {
"type":"multi_field",
"fields": {
"area": {"type":"string","index":"not_analyzed"},
"area_search": {"type":"string","index":"analyzed"}}},
"location": {"type":"geo_point","lat_lon":true}}},
"tags": {
"type":"multi_field",
"fields": {
"tags":{"type":"string","index":"not_analyzed"},
"tags_search":{"type":"string","index":"analyzed"}}},
"days": {
"type":"multi_field",
"fields": {
"days":{"type":"string","index":"not_analyzed"},
"days_search":{"type":"string","index":"analyzed"}}},
"value": {"type":"string","name":"value"},
"title": {"type":"string","name":"title"},
"subtitle": {"type":"string","name":"subtitle"},
"description": {"type":"string","name":"description"},
"time": {"type":"string","name":"time"},
"link": {"type":"string","name":"link","index":"no"},
"previewImage": {"type":"string","name":"previewImage","index":"no"},
"detailImage": {"type":"string","name":"detailImage","index":"no"}}}
}
Here is an example document:
GET /production/deals/wa-au-some-venue-weekends-some-deal
{
"_index":"some-index-v1",
"_type":"deals",
"_id":"wa-au-some-venue-weekends-some-deal",
"_version":1,
"exists":true,
"_source" : {
"id":"921d5fe0-8867-4d5c-81b4-7c1caf11325f",
"textId":"wa-au-some-venue-weekends-some-deal",
"displayId":"some-venue-weekends-some-deal",
"active":true,
"venue":{
"id":"46a7cb64-395c-4bc4-814a-a7735591f9de",
"textId":"wa-au-some-venue",
"regionId":"wa-au",
"displayId":"some-venue",
"name":"Some Venue",
"address":"sdgfdg",
"area":"Swan Valley & Surrounds"},
"tags":["Lunch"],
"days":["Saturday","Sunday"],
"value":"$1",
"title":"Some Deal",
"subtitle":"",
"description":"",
"time":"5pm - Late"
}
}
And here is an 'explain' test on that same document:
POST /production/deals/wa-au-some-venue-weekends-some-deal/_explain
{
"query": {
"filtered": {
"filter": {
"term": {
"venue.regionId": "wa-au"
}
}
}
}
}
{
"ok":true,
"_index":"some-index-v1",
"_type":"deals",
"_id":"wa-au-some-venue-weekends-some-deal",
"matched":false,
"explanation":{
"value":0.0,
"description":"ConstantScore(cache(venue.regionId:wa-au)) doesn't match id 0"
}
}
Is there any way to get more useful debugging info?
Is there something wrong with the explain result description? Simply saying "doesn't match id 0" does not really make sense to me: the field is called 'regionId' (not 'id') and the value is definitely not 0.
That happens because the type you submitted the mapping for is called deal, while the type you indexed the document in is called deals.
If you look at the mapping for your type deals, you'll see that it was automatically generated and that the field venue.regionId is analyzed; thus you most likely have two tokens in your index: wa and au. Only by searching for those tokens on that type would you get back that document.
Everything else looks just great! Only a small character is wrong ;)
