Spark NLP Server REST API grouping parameter not respected - apache-spark

I make a POST request to installed Spark NLP Server at http://localhost:5000/api/results/ with JSON body of :
{
"data": "I eat in the tree",
"spell": "xx.embed.bert_base_multilingual_uncased",
"grouping": "document"
}
and receive a UUID response:
{
"uuid": "3i9aFi1YlYXrSZSin2bhu"
}
Sending the UUID in a GET request http://localhost:5000/api/results/3i9aFi1YlYXrSZSin2bhu
I receive the embedding but it includes 5 embeddings rather than one for the whole document. It appears the API is not respecting the grouping parameter or am I making some error.
The UI returns only the single embedding for the document if grouping is selected in the UI.

Related

JMeter: Connect to PostGresSQL in JSR using groovy and then compare values from multiple tables in DB with API response

Sorry for the long post, but I really need some guidance here. I need to compare values from an API response with the values from multiple tables in the DB.
Currently, I am doing it as follows:
Use a JDBC Connect Configuration to connect to Postgres DB and then use the JDBC Sampler to execute queries. I use it three times to query 3 different tables. I store this data in variables (lets call them DBVariables). Please see this image for current Jmeter setup. https://i.stack.imgur.com/GZJyF.png
In JSR Assertion, I have written code that takes data from various DBVariables and compares it against the API response.
However, my issue is the API response can have an array of records and then nested arrays inside each (please see API response sample below). And these array elements can be sorted in any order. This is where I have issues.
I was wondering what would be the most efficient way of writing this JSR Assertion to validate all data elements returned by the API are the same as what is in the DB.
I am very new to groovy, but I think if I can query the DB inside the JSR assertion (instead of using the JDBC sampler), then the comparison can be done by storing API response in a map and then the DBResponse in another map and sorting both and comparing the items.
My questions are:
How can I connect to postgressql using groovy and then execute query statements in it? I have not done that before and was hoping if someone can provide a sample code.
How can I store API response and DB responses in Map and sort them and compare them in groovy?
The API response is of the following type:
{
"data":{
"response":{
"employeeList":[
{
"employeeNumber":"11102",
"addressList":[
{
"addrType":"Home",
"street_1":"123 Any street"
},
{
"addrType":"Alternate",
"street_1":"123 Any street"
}
],
"departmentList":[
{
"deptName":"IT"
},
{
"deptName":"Finance"
},
{
"deptName":"IT"
}
]
},
{
"employeeNumber":"11103",
"addressList":[
{
"addrType":"Home",
"street_1":"123 Any street"
},
{
"addrType":"Alternate",
"street_1":"123 Any street"
}
],
"departmentList":[
{
"deptName":"IT"
},
{
"deptName":"Finance"
},
{
"deptName":"IT"
}
]
}
]
}
}
}
Have you seen Working with a relational database chapter of Groovy documentation? Alternatively you can obtain a Connection instance from the JDBC Configuration Element like
def connection = org.apache.jmeter.protocol.jdbc.config.getConnection('your-pool-name')
With regards to "sort" There is DefaultGroovyMethods class which provides sort() function for any "sortable" entity. With regards to "compare" - we don't know how the object from the database looks like hence cannot provide a comprehensive solution.
Maybe an easier option would be converting the response from the JDBC Sampler to JSON using JsonBuilder and once you have 2 JSON structures use the library like JSONassert which doesn't care about order and depth
You haven't asked, but if you're "very new to groovy" maybe it worth extracting individual values from API using JSON Extractor, do the same for the database with the JDBC elements and compare individual JMeter Variables using Response Assertion?

Google Places API returns 0 results after deploying on Google Cloud

I have developed an application that is powered by Google Places API. The problem is the places are loading when running locally but not after deploying it on google cloud. I am using a default keyword to fetch the desired results but surprisingly it is not working after its deployed. I tried changing the keyword but still, it returns 0 results. Please have a look at my code below
await axios
.get("https://maps.googleapis.com/maps/api/place/nearbysearch/json", {
params: {
key: process.env.GOOGLE_PLACE_API_KEY,
location: req.body.ll,
radius: 20000,
keyword: "popular destinations near me",
},
})
and the response I get
{ html_attributions: [], results: [], status: 'ZERO_RESULTS' }
Postman request of the same works without any issue
and the same request sent with a raw JSON data, I am getting an error
{
"key": "my key",
"location": "my location",
"radius": "20000",
"keyword": "popular destinations near me"
}
{
"error_message": "You must use an API key to authenticate each request to Google Maps Platform APIs. For additional information, please refer to http://g.co/dev/maps-no-account",
"html_attributions": [],
"results": [],
"status": "REQUEST_DENIED"
}
The same request sent using postman returns 20+ results. I have no clue what could possibly be wrong with the above request. Any help is appreciated, thanks.
The Place API Place Search Nearby Search requests required parameters which are key, location, and radius. There are optional parameters you can input too, they are
keyword(not keywords),
language,
minprice&maxprice,
name,
The name field is no longer restricted to place names. Values in this field are combined with values in the keyword field and passed as part of the same search string. We recommend using only the keyword parameter for all search terms.
opennow,rankby,type,pagetoken.
So you cannot assign more than one keyword to your request.

How to correct paging by using top and skip in Azure Search?

According to this document Total hits and Page Counts, if I want to implements paging, I should combine skip and top in my query.
The following is my POST query
{
"count": true ,
"search": "apple",
"searchFields": "content",
"select": "itemID, content",
"searchMode": "all",
"queryType":"full",
"skip": 100,
"top":100
}
It should return something like from 100 --> 200
{
"#odata.context": "https://example.search.windows.net/indexes('example-index-')/$metadata#docs(*)",
"#odata.count": 1611,
"#search.nextPageParameters": {
"count": true,
"search": "apple",
"searchFields": "content",
"select": "itemID, content",
"searchMode": "all",
"queryType": "full",
"skip": 200
},
But it just return top 100 , not paging offset to 100
"#odata.context": "https://example.search.windows.net/indexes('example-index')/$metadata#docs(*)",
"#odata.count": 1611,
Is there any thing I should setup?
The short answer of how to implement paging (from this article): Set top to your desired page size, then increment skip by that page size on every request. Here is an example of how that would look with the REST API using GET:
GET /indexes/onlineCatalog/docs?search=*$top=15&$skip=0
GET /indexes/onlineCatalog/docs?search=*$top=15&$skip=15
GET /indexes/onlineCatalog/docs?search=*$top=15&$skip=30
The REST API documentation also covers how to implement paging, as well as the true purpose of #odata.nextLink and #search.nextPageParameters (also covered in this related question).
Quoting from the API reference:
Continuation of Partial Search Responses
Sometimes Azure Search can't return all the requested results in a single Search response. This can happen for different reasons, such as when the query requests too many documents by not specifying $top or specifying a value for $top that is too large. In such cases, Azure Search will include the #odata.nextLink annotation in the response body, and also #search.nextPageParameters if it was a POST request. You can use the values of these annotations to formulate another Search request to get the next part of the search response. This is called a continuation of the original Search request, and the annotations are generally called continuation tokens. See the example in Response below for details on the syntax of these annotations and where they appear in the response body.
The reasons why Azure Search might return continuation tokens are implementation-specific and subject to change. Robust clients should always be ready to handle cases where fewer documents than expected are returned and a continuation token is included to continue retrieving documents. Also note that you must use the same HTTP method as the original request in order to continue. For example, if you sent a GET request, any continuation requests you send must also use GET (and likewise for POST).
Note
The purpose of #odata.nextLink and #search.nextPageParameters is to protect the service from queries that request too many results, not to provide a general mechanism for paging. If you want to page through results, use $top and $skip together. For example, if you want pages of size 10, your first request should have $top=10 and $skip=0, the second request should have $top=10 and $skip=10, the third request should have $top=10 and $skip=20, and so on.
The follwing example is Get method without top.
GET Method
https://example.search.windows.net/indexes('example-content-index-zh')/docs?api-version=2017-11-11&search=2018&$count=true&$skip=150&select=itemID
The Result:
{
"#odata.context": "https://example.search.windows.net/indexes('example-index-zh')/$metadata#docs(*)",
"#odata.count": 133063,
"value": [ //skip value ],
"#odata.nextLink": "https://example.search.windows.net/indexes('example-content-index-zh')/docs?api-version=2017-11-11&search=2018&$count=true&$skip=150"
}
No matter I use post or get, onces I add $top=# the #odata.nextLink will not show in result.
Finally, I figure out although #odata.nextLink or #search.nextPageParameters will not show. But the paging is working, I have to count myself.

Custom update actions in RESTful Services

My API requires many different types of updates that can be performed by different types of roles. A ticket can have it's data updated, a ticket can be approved (which includes some information), a ticket can be rejected, a ticket can be archived (state that makes a ticket unable to be updated), etc.
I've recently started working as a backend developer and I really do not know what is the most correct approach to this situation but I've two ideas in mind:
A single update endpoint (e.g. /api/tickets/:id) that accepts an action field with the type of update that wants to be done to that file;
Multiple custom action endpoints (e.g. /api/tickets/:id/validate, /api/tickets/:id/accept, etc.)
Which one of those is the best approach to the situation when it comes to the REST architecture? If both are incorrect when it comes to REST, what would be the most correct approach? I couldn't really find any post on SO that answered my question so I decided to create my own. Thank you!
REST stands for Representational State Transfer, which means that the client and the server affect each other’s state by exchanging representations of resources.
A client might GET a representation of a ticket like this:
GET /api/tickets/123 HTTP/1.1
HTTP/1.1 200 OK
Content-Type: application/json
{
"id": 123,
"state": "new",
"archived": false,
"comments": []
}
Now the client can PUT a new representation (replacing it wholesale) or PATCH specific parts of the existing one:
PATCH /api/tickets/123 HTTP/1.1
Content-Type: application/json-patch+json
[
{"op": "replace", "path": "/state", "value": "approved"},
{"op": "add", "path": "/comments", "value": {
"text": "Looks good to me!"
}}
]
HTTP/1.1 200 OK
Content-Type: application/json
{
"id": 123,
"state": "approved",
"archived": false,
"comments": [
{"author": "Thomas", "text": "Looks good to me!"}
]
}
Note how the update is expressed in terms of what is to be changed in the JSON representation, using the standard JSON Patch format. The client could use the same approach to update any resource represented in JSON. This contributes to a uniform interface, decoupling the client from the specific server.
Of course, the server need not support arbitrary updates:
PATCH /api/tickets/123 HTTP/1.1
Content-Type: application/json-patch+json
[
{"op": "replace", "path": "/id", "value": 456}
]
HTTP/1.1 422 Unprocessable Entity
Content-Type: text/plain
Cannot replace /id.
Still, this might require more complexity on the server than dedicated operations like “approve”, “reject”, etc. You might want to use a library to implement JSON Patch. If you find that you need many custom actions which are hard to express in terms of a representation, an RPC architecture might suit you better than REST.
REST is all about resources. And the state of the resources should be manipulated using representations (such as JSON, XML, you name it) on the top of stateless communication between client and server.
Once URIs are meant to identify the resources, it makes sense to use nouns instead of verbs in the URI. And when designing a REST API over the HTTP protocol, HTTP methods should be used to indicate the action intended to be performed over the resource.
Performing partial updates
You could use the PATCH HTTP verb to perform partial updates to your resource. The PATCH request payload must contain set of changes to applied to the resource.
There are a couple of formats that can be used to describe the set of changes to be applied to the resource:
###JSON Merge Patch
It's defined in the RFC 7396 and can be applied as described below:
If the provided merge patch contains members that do not appear within the target, those members are added. If the target does contain the member, the value is replaced. Null values in the merge patch are given special meaning to indicate the removal of existing values in the target.
So a request to modify the status of a ticket could be like:
PATCH /api/tickets/1 HTTP/1.1
Host: example.org
Content-Type: application/merge-patch+json
{
"status": "valid"
}
###JSON Patch
It's defined in the RFC 6902 and expresses a sequence of operations to be applies to a target JSON document. A request to modify the status of a ticket could be like:
PATCH /api/tickets/1 HTTP/1.1
Host: example.org
Content-Type: application/json-patch+json
[
{
"op": "replace",
"path": "/status",
"value": "valid"
}
]
The path is a JSON Pointer value, as described in the RFC 6901.
Try to either
Deal with a single object -> api/v1/tickets/1
Deal with a list of objects -> api/v1/tickets/.
Then try to capture all actions as CRUD actions.
Create object(s) -> HTTP POST
Retreive object(s) -> HTTP GET
Update object(s) -> HTTP PATCH
Delete object(s) -> HTTP DELETE
And also:
Save object(s) entirely -> HTTP PUT
When you are changing statuses, and these are just attributes on a ticket. I would send a PATCH request, for instance. If I need to change the statues of ticket #1 to "rejected" I would send something like PATCH api/v1/tickets/1 with a payload like:
{
"status": "rejected"
}
REST has a lot of best practices but not everything is set in stone. Maybe this tutorial: https://restfulapi.net can help you?
Really it all comes down to a matter of taste. It is often observed in the industry to have the static parameters in the URL (eg: /tickets/update, /users/add, /admin/accounts) and the variable parameters in the query (eg: IDs, messages, names). It allows to have a fixed number of endpoints.
I see you're using NodeJS so you're probably using Express and in Express getting the url parameters and the body parameters are equally easy:
// supposing you're using a JSON-based API with body-parser for JSON parsing
app.get('/some/path/:someParam', (req, res) => {
console.log(req.params.someParam);
console.log(req.body.someOtherParam);
res.send();
}

should I pass query parameters in the url or in the payload?

Suppose I have a REST endpoint like this :
http://server/users/query
And I have parameters in my query : age, city, country
I want to do a GET request with those parameters.
Should I better pass the parameters in the url ? Or put something like this in the payload of my GET request.
"query": {
"age": "something",
"city": "something",
"country": "something"
}
On my understanding, you have a collection of users and you want to get a representation of it. You should consider query parameters to filter your collection, as following:
http://[host]/api/users?age=something&city=something&country=something
And avoid GET requests with a payload. See the quote from the RFC 7231:
A payload within a GET request message has no defined semantics;
sending a payload body on a GET request might cause some existing
implementations to reject the request.
From MDN: GET requests (typically) don't have bodies so use the query parameters or the path.
If you are making requests to a server you should instead read the documentation of it's API.

Resources