This is partially a follow-up to this question: Filtering Arrays in NodeJS without knowing where the value's location is
I got the JSON output, but now I'm required to append the coordinates of each element to its respective element in the JSON.
I tried to use the page.$(input[type='']) selector, with a variable in place of type for each attribute key and the value of that key as the selector value. The problem is that I only got one result: the first element with type text. It returned null when the type was anything but text (element, for example), and of course it didn't cycle through all the elements with type text (I do know why). I then tried page.$$(input[type='']), but I couldn't figure out how to use the elementHandle on each object. Lastly, I can't figure out how to append the coordinates back to each element without losing the original hierarchy.
For reference: Here's a sample of the outputted JSON:
[
{
"type": "element",
"tagName": "form",
"attributes": [
{
"key": "action",
"value": "/action_page.php"
},
{
"key": "target",
"value": "_blank"
}
],
"children": [
{
"type": "text",
"content": "\nFirst name:"
},
{
"type": "element",
"tagName": "input",
"attributes": [
{
"key": "type",
"value": "text"
},
{
"key": "name",
"value": "firstname"
},
{
"key": "value",
"value": "John"
}
],
"children": []
},
{
"type": "text",
"content": "\nLast name:"
},
{
"type": "element",
"tagName": "input",
"attributes": [
{
"key": "type",
"value": "text"
},
{
"key": "name",
"value": "lastname"
},
{
"key": "value",
"value": "Doe"
}
],
"children": []
},
{
"type": "element",
"tagName": "input",
"attributes": [
{
"key": "type",
"value": "submit"
},
{
"key": "value",
"value": "Submit"
}
],
"children": []
},
{
"type": "text",
"content": "\n"
}
]
Please note that this output isn't uniform and that this will be used on multiple pages with different layouts, so the answer needs to be as adaptable as possible.
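One possible approach, sketched under assumptions rather than taken from a confirmed answer: derive a CSS selector from each element node's tagName and attributes, look up a bounding box per selector, and rebuild the tree with a coordinates property on every element so the nesting stays untouched. The names selectorFor, collectSelectors and attachCoordinates, and the coordinates property itself, are hypothetical, and the boxes map is assumed to be filled elsewhere.

```javascript
// Hypothetical helper: build a CSS selector from a node's tag name and
// attributes, e.g. input[type="text"][name="firstname"].
// JSON.stringify is used for quoting, which is only adequate for simple values.
function selectorFor(node) {
  const attrs = (node.attributes || [])
    .map(({ key, value }) => `[${key}=${JSON.stringify(String(value))}]`)
    .join("");
  return node.tagName + attrs;
}

// Collect the distinct selectors of all element nodes, at any depth.
function collectSelectors(nodes, found = new Set()) {
  for (const node of nodes) {
    if (node.type !== "element") continue;
    found.add(selectorFor(node));
    collectSelectors(node.children || [], found);
  }
  return [...found];
}

// Return a copy of the tree in which every element node has a `coordinates`
// property (null when no box is known). Text nodes are copied unchanged,
// and the input tree is not mutated.
function attachCoordinates(nodes, boxes) {
  return nodes.map((node) => {
    if (node.type !== "element") return { ...node };
    return {
      ...node,
      coordinates: boxes.get(selectorFor(node)) ?? null,
      children: attachCoordinates(node.children || [], boxes),
    };
  });
}
```

With Puppeteer, the map could be filled by calling page.$(selector) for each collected selector and storing the result of elementHandle.boundingBox(). Nodes that share a selector (two identical inputs, say) would all receive the first match's box, so page.$$(selector) plus a per-selector counter would be needed to tell them apart.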
For the Store API endpoint /store-api/product, is it possible to filter on the properties of a product? I don't mean the defaults, such as whether it's active or its stock level, but the properties we've defined on the product, for example colour or Farbe. The search endpoint supports passing in a list of property IDs, which this one does not.
None of the queries below work; they return the errors shown after each query, or "Call to a member function buildAccessor() on null".
{
"limit": 40,
"filter": [
{
"type": "contains",
"field": "Farbe",
"value": "red"
}
]
}
"Field \"Farbe\" in entity \"product\" was not found."
{
"limit": 40,
"filter": [
{
"type": "contains",
"field": "properties.Farbe",
"value": "red"
}
]
}
"Field \"Farbe\" in entity \"property_group_option\" was not found."
You can combine filters for the name of the property value and their respective group in a multi filter. The following example will only give you products that have the "shoe-color" property with the value "coral".
{
"limit": 1,
"includes": {
"product": ["id", "productNumber", "properties"],
"property_group_option": ["name", "group"],
"property_group": ["name"]
},
"associations": {
"properties": {
"associations": {
"group": []
}
}
},
"filter": [
{
"type": "multi",
"operator": "and",
"queries": [
{
"type": "equals",
"field": "properties.group.name",
"value": "shoe-color"
},
{
"type": "equals",
"field": "properties.name",
"value": "coral"
}
]
}
]
}
Example response:
{
"entity": "product",
"total": 1,
"aggregations": [],
"page": 1,
"limit": 1,
"elements": [
{
"productNumber": "6bbfe1f608504c9b9a7bf92d6a071734",
"properties": [
{
"name": "coral",
"group": {
"name": "shoe-color",
"apiAlias": "property_group"
},
"apiAlias": "property_group_option"
},
{
"name": "cotton",
"group": {
"name": "textile",
"apiAlias": "property_group"
},
"apiAlias": "property_group_option"
}
],
"id": "062ba988aa1840fa84371c9c43b2f838",
"apiAlias": "product"
}
],
"states": [],
"apiAlias": "dal_entity_search_result"
}
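As a rough sketch, the filter part of the request above could be generated for any group and option name. The helper name propertyCriteria and its default limit are hypothetical, and the values "Farbe" and "red" are taken from the question rather than from a known shop.

```javascript
// Hypothetical helper: build a criteria body that combines a filter on the
// property group's name with a filter on the property option's name,
// following the multi filter from the answer.
function propertyCriteria(groupName, optionName, limit = 40) {
  return {
    limit,
    // Load the properties together with their group, as in the answer.
    associations: {
      properties: { associations: { group: [] } },
    },
    filter: [
      {
        type: "multi",
        operator: "and",
        queries: [
          { type: "equals", field: "properties.group.name", value: groupName },
          { type: "equals", field: "properties.name", value: optionName },
        ],
      },
    ],
  };
}

// Request body for the question's example.
const body = JSON.stringify(propertyCriteria("Farbe", "red"));
```

The resulting string would be sent as the POST body to /store-api/product.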
I'm working on a project that should scrape websites and output their HTML in the form of JSON. The only useful things in that JSON, for us, are the forms.
I wanted to filter for them, but the native array filter only works when I know the attribute's location relative to the entire page (the DOM?), and that won't always be the case. I fear that checking every object's value until I reach the desired one isn't viable, because:
some pages are humongous,
"form" also appears as a string in other places we don't want.
This is in NodeJS.
Snippet of input:
[
{
"type": "element",
"tagName": "p",
"attributes": [],
"children": [
{
"type": "text",
"content": "This is how the HTML code above will be displayed in a browser:"
}
]
},
{
"type": "text",
"content": "\n"
},
{
"type": "element",
"tagName": "form",
"attributes": [
{
"key": "action",
"value": "/action_page.php"
},
{
"key": "target",
"value": "_blank"
}
],
"children": [
{
"type": "text",
"content": "\nFirst name:"
},
{
"type": "element",
"tagName": "br",
"attributes": [],
"children": []
},
{
"type": "text",
"content": "\n"
},
{
"type": "element",
"tagName": "input",
"attributes": [
{
"key": "type",
"value": "text"
},
{
"key": "name",
"value": "firstname0"
},
{
"key": "value",
"value": "John"
}
],
"children": []
},
{
"type": "element",
"tagName": "br",
"attributes": [],
"children": []
},
{
"type": "text",
"content": "\nLast name:"
},
{
"type": "element",
"tagName": "br",
"attributes": [],
"children": []
},
{
"type": "text",
"content": "\n"
},
{
"type": "element",
"tagName": "input",
"attributes": [
{
"key": "type",
"value": "text"
},
{
"key": "name",
"value": "lastname0"
},
{
"key": "value",
"value": "Doe"
}
],
"children": []
},
{
"type": "text",
"content": "\n"
},
{
"type": "element",
"tagName": "br",
"attributes": [],
"children": []
},
{
"type": "element",
"tagName": "br",
"attributes": [],
"children": []
},
{
"type": "text",
"content": "\n"
},
{
"type": "element",
"tagName": "input",
"attributes": [
{
"key": "type",
"value": "submit"
},
{
"key": "value",
"value": "Submit"
}
],
"children": []
},
{
"type": "text",
"content": "\n"
},
{
"type": "element",
"tagName": "input",
"attributes": [
{
"key": "type",
"value": "reset"
}
],
"children": []
},
{
"type": "text",
"content": "\n"
}
]
},
{
"type": "text",
"content": "\n"
}
]
A snippet of output:
[
{
"type": "element",
"tagName": "form",
"attributes": [
{
"key": "action",
"value": "/action_page.php"
},
{
"key": "target",
"value": "_blank"
}
],
"children": [
{
"type": "text",
"content": "\nFirst name:"
},
{
"type": "element",
"tagName": "br",
"attributes": [],
"children": []
},
{
"type": "text",
"content": "\n"
},
{
"type": "element",
"tagName": "input",
"attributes": [
{
"key": "type",
"value": "text"
},
{
"key": "name",
"value": "firstname0"
},
{
"key": "value",
"value": "John"
}
],
"children": []
},
{
"type": "element",
"tagName": "br",
"attributes": [],
"children": []
},
{
"type": "text",
"content": "\nLast name:"
},
{
"type": "element",
"tagName": "br",
"attributes": [],
"children": []
},
{
"type": "text",
"content": "\n"
},
{
"type": "element",
"tagName": "input",
"attributes": [
{
"key": "type",
"value": "text"
},
{
"key": "name",
"value": "lastname0"
},
{
"key": "value",
"value": "Doe"
}
],
"children": []
},
{
"type": "text",
"content": "\n"
},
{
"type": "element",
"tagName": "br",
"attributes": [],
"children": []
},
{
"type": "element",
"tagName": "br",
"attributes": [],
"children": []
},
{
"type": "text",
"content": "\n"
},
{
"type": "element",
"tagName": "input",
"attributes": [
{
"key": "type",
"value": "submit"
},
{
"key": "value",
"value": "Submit"
}
],
"children": []
},
{
"type": "text",
"content": "\n"
},
{
"type": "element",
"tagName": "input",
"attributes": [
{
"key": "type",
"value": "reset"
}
],
"children": []
},
{
"type": "text",
"content": "\n"
}
]
}
]
TL;DR: only retain forms and any of their children.
First of all, this input looks very incomplete; it may be an array or an object. If I assume it's an array of objects, then I can use jsonpath to access any of the values.
var jp = require('jsonpath');
// nodes is the parsed JSON array from the question
var formNodes = jp.query(nodes, `$..[?(@.tagName=="form")]`);
You can achieve the same using vanilla JavaScript; there have been several Stack Overflow questions about that. But I found jsonpath and xpath easier to implement than those.
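For comparison, a minimal vanilla JavaScript sketch of the same idea, assuming the input is an array of nodes shaped like the snippet in the question. The function name findForms is hypothetical.

```javascript
// Recursively collect every node whose tagName is "form".
// A matching form is kept whole, with all of its children.
function findForms(nodes) {
  const forms = [];
  for (const node of nodes) {
    // Text nodes cannot be or contain forms, so skip them.
    if (node.type !== "element") continue;
    if (node.tagName === "form") {
      forms.push(node);
    } else {
      // Not a form itself: look for forms nested inside it.
      forms.push(...findForms(node.children || []));
    }
  }
  return forms;
}
```

Because it compares tagName only, the word "form" appearing in text content or attribute values elsewhere on the page is not matched. Every node is visited at most once, so the cost grows linearly with page size.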
I need help creating a JSON schema for a value that could be an object, or an array of objects.
lib: jsonschema==3.2.0
py: 3.8
I have 2 responses from the server:
first:
{
"result": [
{
"brand": "Test"
}
]}
second:
{
"result":
{
"brand": "Test"
}
}
As you can see, the difference between the two is that in the first case "result" is an array of objects, and in the second it is just an object.
My schema:
{
"$schema": "http://json-schema.org/draft-07/schema",
"$id": "http://example.com/example.json",
"type": "object",
"required": [
"result"
],
"properties": {
"result": {
"$id": "#/properties/result",
"type": ["array", "object"],
"additionalItems": true,
"items": {
"$id": "#/properties/result/items",
"anyOf": [
{
"$id": "#/properties/result/items/anyOf/0",
"type": "object",
"required": [
"brand"
],
"properties": {
"brand": {
"$id": "#/properties/result/items/anyOf/0/properties/brand",
"type": "string"
}
},
"additionalProperties": true
}
]
}
}
},
"additionalProperties": true}
In the first case, when the server returns an array, the schema checks the type of "brand"; in the second, when it returns an object, it does not.
How can I set up two types for the one field "result" so that the "brand" type is checked in both cases?
Your schema can be fixed as follows:
{
"$schema": "http://json-schema.org/draft-07/schema",
"$id": "http://example.com/example.json",
"type": "object",
"required": [
"result"
],
"properties": {
"result": {
"$id": "#/properties/result",
"anyOf": [
{
"$id": "#/properties/result/items/brand",
"type": "object",
"properties": {
"brand": {
"$id": "#/properties/result/items/anyOf/0/properties/brand",
"type": "string"
}
},
"required": [
"brand"
],
"additionalProperties": true
},
{
"$id": "#/properties/result/items/array",
"type": "array",
"items": {
"$ref": "#/properties/result/items/brand"
}
}
]
}
},
"additionalProperties": true
}
Demos here, here and here.
However, it is customary to extract reusable portions of a schema into a separate "definitions" section, like so:
{
"$schema": "http://json-schema.org/draft-07/schema",
"$id": "http://example.com/example.json",
"definitions": {
"brand": {
"type": "object",
"properties": {
"brand": {
"$id": "#/properties/result/items/anyOf/0/properties/brand",
"type": "string"
}
},
"required": [
"brand"
],
"additionalProperties": true
}
},
"type": "object",
"required": [
"result"
],
"properties": {
"result": {
"$id": "#/properties/result",
"anyOf": [
{
"$ref": "#/definitions/brand"
},
{
"$id": "#/properties/result/items/array",
"type": "array",
"items": {
"$ref": "#/definitions/brand"
}
}
]
}
},
"additionalProperties": true
}
Demos here, here and here.
Notes:
To express that the property "result" may be of two different types, use the "anyOf" keyword in the property's schema. The value of "anyOf" should be an array containing one schema for each possible type (here the "brand" object, or an array of "brand" objects).
See: Multiple Types.
To avoid duplicating the definition of the "brand" object, you can use "$ref" when defining the schema for the array's items to refer back to the previously given schema for "brand". As noted above, it is customary to place reused subschemas into a "definitions" section, but it is not necessary: "$ref" can refer to any schema item via the JSON Pointer syntax.
See: Reuse.
When the items of a list have a single schema, "additionalItems" should not be used.
See: List validation.
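To make the "anyOf" logic concrete, here is a hand-written sketch of the checks the fixed schema expresses. It is not a JSON Schema validator and does not use the jsonschema library; the function names isBrand, isValidResult and isValidResponse are hypothetical.

```javascript
// Mirrors the "brand" definition: an object with a required string "brand".
// Extra properties are allowed, as with "additionalProperties": true.
function isBrand(value) {
  return (
    typeof value === "object" &&
    value !== null &&
    !Array.isArray(value) &&
    typeof value.brand === "string"
  );
}

// Mirrors the "anyOf": either one brand object or an array of brand objects.
function isValidResult(result) {
  return isBrand(result) || (Array.isArray(result) && result.every(isBrand));
}

// Mirrors the top-level schema: an object with a required "result" property.
function isValidResponse(response) {
  return (
    typeof response === "object" &&
    response !== null &&
    !Array.isArray(response) &&
    "result" in response &&
    isValidResult(response.result)
  );
}
```

Both server responses from the question pass these checks, while a response whose "brand" is not a string fails in either shape.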
I am trying to parse some JSON that is the output of an AWS CLI command to display Snapshots. I want to load this data up into a spreadsheet to be able to filter, group, and audit it.
I've been stumped on how to get the nested Tags array flattened into the parent objects so that the intermediate result can then be passed to the @csv filter.
Here is the example:
Initial input JSON:
{
"Snapshots": [
{
"SnapshotId": "snap-fff",
"StartTime": "2014-04-01T06:00:13.000Z",
"VolumeId": "vol-fff",
"VolumeSize": 50,
"Description": "desc1",
"Tags": [
{
"Value": "/dev/sdf",
"Key": "device"
},
{
"Value": "a name",
"Key": "Name"
},
{
"Value": "Internal",
"Key": "Customer"
},
{
"Value": "Demo",
"Key": "Environment"
},
{
"Value": "Brand 1",
"Key": "Branding"
},
{
"Value": "i-fff",
"Key": "instance_id"
}
]
},
{
"SnapshotId": "snap-ccc",
"StartTime": "2014-07-01T05:59:14.000Z",
"VolumeId": "vol-ccc",
"VolumeSize": 8,
"Description": "B Desc",
"Tags": [
{
"Value": "/dev/sda1",
"Key": "device"
},
{
"Value": "External",
"Key": "Customer"
},
{
"Value": "Production",
"Key": "Environment"
},
{
"Value": "i-ccc",
"Key": "instance_id"
},
{
"Value": "B Brand",
"Key": "Branding"
},
{
"Value": "B Name",
"Key": "Name"
},
{
"Value": "AnotherValue",
"Key": "AnotherKey"
}
]
}
]
}
Desired Intermediate:
[
{
"SnapshotId": "snap-fff",
"StartTime": "2014-04-01T06:00:13.000Z",
"VolumeId": "vol-fff",
"VolumeSize": 50,
"Description": "desc1",
"device": "/dev/sdf",
"Name": "a name",
"Customer": "Internal",
"Environment": "Demo",
"Branding": "Brand 1",
"instance_id": "i-fff"
},
{
"SnapshotId": "snap-ccc",
"StartTime": "2014-07-01T05:59:14.000Z",
"VolumeId": "vol-ccc",
"VolumeSize": 8,
"Description": "B Desc",
"device": "/dev/sda1",
"Customer": "External",
"Environment": "Production",
"instance_id": "i-ccc",
"Branding": "B Brand",
"Name": "B Name",
"AnotherKey": "AnotherValue"
}
]
Final Output:
"SnapshotId","StartTime","VolumeId","VolumeSize","Description","device","Name","Customer","Environment","Branding","instance_id","AnotherKey"
"snap-fff","2014-04-01T06:00:13.000Z","vol-fff",50,"desc1","/dev/sdf","a name","Internal","Demo","Brand 1","i-fff",""
"snap-ccc","2014-07-01T05:59:14.000Z","vol-ccc",8,"B Desc","/dev/sda1","B Name","External","Production","B Brand","i-ccc","AnotherValue"
The following jq filter produces the requested intermediate output:
.Snapshots[] | (. + (.Tags|from_entries)) | del(.Tags)
Explanation: from_entries converts the array of key-value objects to an object with the given key-value pairs. This is added to the target object, and finally the "Tags" key is removed.
If the "target" object has a key that also appears in the "Tags" array, then the above filter will favor the value in the "Tags" array. You may accordingly wish to change the order of the operands of "+", or resolve the conflict in some other way.
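For readers more at home in JavaScript, the same transformation can be sketched as follows. This is an illustration of what the jq filter does, not a replacement for it; the function name flattenTags and its tagsWin parameter are hypothetical.

```javascript
// Merge a snapshot's Tags array into the snapshot itself and drop "Tags".
// tagsWin = true matches the jq filter above: on a key conflict the tag
// value overrides the snapshot's own value. Pass false to reverse that.
function flattenTags(snapshot, tagsWin = true) {
  // Separate the Tags array from the remaining properties.
  const { Tags = [], ...rest } = snapshot;
  // Equivalent of jq's from_entries for {Key, Value} pairs.
  const tags = Object.fromEntries(Tags.map(({ Key, Value }) => [Key, Value]));
  return tagsWin ? { ...rest, ...tags } : { ...tags, ...rest };
}
```

Given the parsed AWS CLI output in a variable named data, data.Snapshots.map((s) => flattenTags(s)) would yield the intermediate array.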
I have a frustrating issue: with the type set to input, I can't see the label. It works fine if I set the type to select. This is my code:
{
"type": "input",
"key": "firstName",
"templateOptions": {
"type": "text",
"placeholder": "jane doe",
"label": "First name"
}
},
That doesn't show the label, but this does:
{
"key": "transportation",
"type": "select",
"templateOptions": {
"label": "How do you get around in the city",
"valueProp": "name",
"options": [
{
"name": "Car"
},
{
"name": "Helicopter"
}
]
}
}
Any help would be appreciated.