Availability tests: is it possible to HTTP-ping an endpoint with AAD enabled? - azure

To better understand Availability Tests before implementing them in our projects, I've created a Function App with 3 endpoints (source code in this gist) and configured 6 Availability Tests (of kind Ping Test):
oktest: https://myfuncname.azurewebsites.net/api/ok (parse dependent requests = false)
oktest2: https://myfuncname.azurewebsites.net/api/ok (parse dependent requests = true)
badtest: https://myfuncname.azurewebsites.net/api/bad (parse dependent requests = false)
badtest2: https://myfuncname.azurewebsites.net/api/bad (parse dependent requests = true)
errtest: https://myfuncname.azurewebsites.net/api/err (parse dependent requests = false)
errtest2: https://myfuncname.azurewebsites.net/api/err (parse dependent requests = true)
The Function App is configured with AAD using Client ID and Allowed Token Audiences.
/ok returns 200 content OK
/bad returns 400 content BAD
/err returns 500
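The gist itself isn't reproduced here; as a rough sketch, the three handlers could look like this as Node.js HTTP triggers (purely illustrative, the actual gist code may differ):
// /api/ok
module.exports = async function (context, req) {
    context.res = { status: 200, body: "OK" };
};
// /api/bad
module.exports = async function (context, req) {
    context.res = { status: 400, body: "BAD" };
};
// /api/err
module.exports = async function (context, req) {
    context.res = { status: 500 };
};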
Testing the Azure endpoints with Postman (using a valid Bearer Token) produces the expected results, matching the local hosting environment.
I'm expecting 100% success in oktest and oktest2 and 100% failures in the other tests.
I'm getting these results:
oktest: success 21 fail 0
oktest2: success 17 fail 0
badtest: success 20 fail 2
badtest2: success 17 fail 0
errtest: success 21 fail 1
errtest2: success 17 fail 0
Then I set authentication to Allow Anonymous and I get these results after a few cycles:
oktest: success 31 fail 0
oktest2: success 36 fail 0
badtest: success 29 fail 9
badtest2: success 25 fail 8
errtest: success 28 fail 8
errtest2: success 24 fail 7
It's quite clear that the first authentication settings prevented the endpoints from being HTTP-pinged. Is it possible to keep AAD authentication, or do we have to rethink our network architecture?
Any help or suggestion would be greatly appreciated!
Regards,
Giacomo S. S.

I think I've understood the situation.
When Allow Anonymous is enabled, the HTTP ping correctly fails in 'badtest'.
When AAD authentication is enabled, the HTTP ping returns a success because the Availability Test is able to authenticate, and it then gets a 200 OK for that (a successful authentication).
Looking at these details, it's possible to explain every single result.
My conclusion: Availability Tests are more oriented towards testing web pages than REST services.
Confirmation can be found in the MS docs.
If anyone has something to add, I'm very interested!

Related

BigQuery Internal Error with `pageToken` when running in GCP

I run into this error with BigQuery:
"An internal error occurred and the request could not be completed. This is usually caused by a transient issue. Retrying the job with back-off as described in the BigQuery SLA should solve the problem: https://cloud.google.com/bigquery/sla. If the error continues to occur please contact support at https://cloud.google.com/support. Error: 5034996"
Two applications use the same pageToken approach to paginate through big result sets:
1. run query with initial startIndex: 0, maxResults: 10
2. get the result together with a pageToken
3. send to client
4. ... some time may pass ...
5. request the "next page": use pageToken together with maxResults: 10 to get the next results
6. repeat from 3.
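As a rough sketch, the flow above with @google-cloud/bigquery looks like this (the query and variable names are placeholders, not the actual application code):
import { BigQuery } from '@google-cloud/bigquery';

const bigquery = new BigQuery();

// steps 1+2: run the query and fetch the first page together with its pageToken
const [job] = await bigquery.createQueryJob({ query: 'SELECT * FROM `my_dataset.my_table`' });
const [rows, , firstResponse] = await job.getQueryResults({
    startIndex: '0',
    maxResults: 10,
    autoPaginate: false,
});

// step 5: request the "next page" using the pageToken from the previous response
const [nextRows] = await job.getQueryResults({
    maxResults: 10,
    pageToken: firstResponse?.pageToken,
    autoPaginate: false,
});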
NodeJS 16, @google-cloud/bigquery 6.0.3
Locally (Windows 10), everything works for both applications: pagination with pageToken returns results quite fast (<5s). All steps 1 to 6 work, including requesting multiple next pages one after another; I even tested that the pageToken still works after 60+ minutes.
Production in the cloud has problems: the initial query always works, but as soon as a pageToken is given, the query fails after ~15s, even when the next page is requested directly (1-5s delay) after getting the first page. Steps 1 to 3 work, but requesting the next page fails nearly every time; it's very rare that it doesn't fail.
Production uses Google Cloud Functions and Google Cloud Run to serve the applications.
One application is an internal experiment; it uses the same dataset + table when running locally and when running in "production".
The other application uses the same dataset but different tables for local/production, and it lives in a different Google Cloud project than the first application.
Thus project-level quotas or, e.g., different table setups between local and production shouldn't cause the issue here (hopefully).
Example code used:
const [rows, , jobQueryResults] = await (job.getQueryResults(
    ('maxResults' in paginate ?
        {
            // there seems to be no need to give the startIndex again, but tested it also with giving a stable `0` index;
            // atm. as soon as a pageToken is given, the `startIndex` is omitted
            startIndex: paginate.pageToken ? undefined : paginate.startIndex,
            maxResults: paginate.maxResults,
            pageToken: paginate.pageToken,
        } : undefined) as QueryResultsOptions,
) as Promise<QueryRowsResponse>)
What puzzles me is that the pageToken isn't shown in the log of the failure, while maxResults is visible.
Edit
The error message suggests some SLA problem. One of the GCP projects only includes experimental (non-public) applications, so any traffic/usage can be easily monitored.
The BigQuery monitoring in that project shows roughly 1 job per second when testing it: jobs 1+2 were "load without pageToken", job 3 used the pageToken from job 2 and ran into the error. The retries must happen on the BigQuery side; there is nothing implemented on my side (I'm using only the official BigQuery package).

CAP - Gateway Timeout - How to increase the time out of incoming request

I trigger a POST Function Import (Action in CDS) that typically takes about 2 minutes of processing. The POST operation completes successfully in Java, however I get a Gateway Timeout.
How can I increase the timeout of incoming requests? I tried setting the property INCOMING_CONNECTION_TIMEOUT: 0 in the mta.yaml of the service project, as well as using the commands
cf set-env x-service-name-blue INCOMING_CONNECTION_TIMEOUT 0
cf restage x-service-name-blue
It did not work either.
Could you assist?
Update: I think the correct environment variable on the approuter is called SESSION_TIMEOUT. Can you try this one instead?
This is for the XS Advanced approuter, though I'm not sure if it still applies to the one used for CF apps; this documentation suggests that it's a property of the approuter, so you can try setting it there.
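For reference, a minimal sketch of what that could look like in the mta.yaml (the module name and value are made up; per the approuter documentation, SESSION_TIMEOUT is given in minutes):
modules:
  - name: my-approuter        # hypothetical approuter module name
    type: approuter.nodejs
    properties:
      SESSION_TIMEOUT: 30     # minutes; pick a value longer than the 2-minute action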

How to trigger a failure in an azure function http trigger, WITH a custom error response

I cannot find a way to fail an HTTP call to a Node.js Azure Function and include a custom error response.
Calling context.done() allows for a custom response (but it is not indicated as a failure in Application Insights)
Calling context.done(true, XXX) does create a failure, but returns a generic error to the user (no matter what I put in XXX):
{"id":"b6ca6fb0-686a-4a9c-8c66-356b6db51848","requestId":"7599d94b-d3f2-48fe-80cd-e067bb1c8557","statusCode":500,"errorCode":0,"message":"An error has occurred. For more information, please check the logs for error ID b6ca6fb0-686a-4a9c-8c66-356b6db51848"}
This is just the latest headache I have run into while trying to get a fast web API running on Azure Functions. If you can't track errors, then it should hardly be called "Application Insights". Any ideas?
Success will be true, but resultCode will be set to your value.
Try an AppInsights query like this:
// Get all errors
requests
| where toint(resultCode) >= 400
| limit 10
[Update]
The Id value in Requests is the 'function instance id', which uniquely identifies that invocation.
There is also a 'traces' table that contains the logging messages from your azure function. You can join between requests and traces via the operation_Id.
requests
| where toint(resultCode) >= 400
| take 10
| join (traces) on operation_Id
| project id, message, operation_Id
The response body is not automatically logged to AppInsights. You'll need to add some explicit log statements to capture that.
Why not use context.res to return a custom response from an HTTP trigger function?
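For example, a minimal sketch of an HTTP-triggered Node.js function that fails the request with a custom response via context.res (the status code and body here are arbitrary):
module.exports = async function (context, req) {
    // Set a custom failing status code and body;
    // the request is logged in Application Insights with this resultCode.
    context.res = {
        status: 422,
        headers: { "Content-Type": "application/json" },
        body: { error: "ValidationFailed", detail: "name is required" }
    };
};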

Which status codes should I expect when using Azure Table Storage

I want to do something when/if an insert operation on Azure Table Storage fails. Assume that I want to return false from the code below when I receive an error. _table is of type CloudTable and the code below works.
public bool InsertEntity(TableEntity entity)
{
    var insertOperation = TableOperation.Insert(entity);
    var result = _table.Execute(insertOperation);
    return (result.HttpStatusCode == (int)System.Net.HttpStatusCode.OK);
}
I get the result 203 when the operation succeeds. But there are other possible results like "200 OK".
How can I write a piece of code that will allow me to understand from the status code that something went wrong?
Using the .NET SDK, any situation that needs to be handled will throw an exception, i.e. any status code that is not 2xx will cause an exception.
To handle situations where something went wrong, I don't have to manually check the status code of the result for every request. All I have to do is write exception handling code, like below:
try
{
    var result = _table.Execute(insertOperation);
}
catch (Exception)
{
    Log("Something went wrong in table operation.");
}
From this page:
REST API operations for Azure storage services return standard HTTP status codes, as defined in the HTTP/1.1 Status Code Definitions.
So every successful operation against the table service will return a 2xx status code. To find out the exact code returned, I would recommend checking each operation on the REST API documentation page. For example, the Create Table operation returns a 201 status code if the operation is successful.
Similarly, for errors in the table service you will get an error code in the 400 range (meaning you provided incorrect data, e.g. a 409 (Conflict) error if you're trying to create a table which already exists) or in the 500 range (for example, the table service is unavailable). You can find the list of all Table Service Error Codes here: https://msdn.microsoft.com/en-us/library/azure/dd179438.aspx.
Basically, any return in 2xx is "OK". In this example:
https://msdn.microsoft.com/en-us/library/system.net.httpstatuscode%28v=vs.110%29.aspx
203 Non-Authoritative Information:
Indicates that the returned metainformation is from a cached copy instead of the origin server and therefore may be incorrect.
This Azure white paper elaborates further:
http://go.microsoft.com/fwlink/?LinkId=153401
9.6.5 Error handling and reporting
The REST API is designed to look like a standard HTTP server interacting with existing HTTP clients (e.g., browsers, HTTP client libraries, proxies, caches, and so on). To ensure the HTTP clients handle errors properly, we map each Windows Azure Table error to an HTTP status code.
HTTP status codes are less expressive than Windows Azure Table error codes and contain less information about the error. Although the HTTP status codes contain less information about the error, clients that understand HTTP will usually handle the error correctly.
Therefore, when handling errors or reporting Windows Azure Table errors to end users, use the Windows Azure Table error code along with the HTTP status code as it contains more information about the error. Additionally, when debugging your application, you should also consult the human readable element of the XML error response.
These links are also useful:
Microsoft Azure: Status and Error Codes
Clean way to catch errors from Azure Table (other than string match?)
If you are using the Azure Storage SDK to access Azure Table Storage, the SDK throws a StorageException on the client side for unexpected HTTP status codes returned from the table storage service. To extract the actual HttpStatusCode you need to wrap your code in a try {} catch (StorageException ex) {} block and then inspect the exception object to extract the HttpStatusCode embedded in it.
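For example, a minimal sketch building on the snippet above (Log is the same hypothetical helper used earlier):
try
{
    var result = _table.Execute(insertOperation);
}
catch (StorageException ex)
{
    // The HTTP status code and the extended storage error code are on RequestInformation.
    int statusCode = ex.RequestInformation.HttpStatusCode;
    string errorCode = ex.RequestInformation.ExtendedErrorInformation?.ErrorCode;
    Log("Table operation failed with HTTP " + statusCode + ", error code " + errorCode);
}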
Have a look at Azure Storage Exception parser I implemented in Nuget:
https://www.nuget.org/packages/AzureStorageExceptionParser/
This extracts the HttpStatusCode and many other useful fields from Azure StorageExceptions. You can use the same library across table, blob, queue clients, etc., as they all follow the same StorageException pattern.
Note that some exceptions thrown by the Azure Storage SDK are not StorageExceptions; those are mostly client-side request validation exceptions and naturally do not contain any HttpStatusCode. (Hence you need a catch specifically for StorageException to extract the HttpStatusCode.)
As a separate note, the Azure Storage SDK has a fairly robust retry mechanism for failed requests. Below is a snippet from the SDK source code where they decide whether a failed response is retryable or not.
https://github.com/Azure/azure-storage-net/blob/master/Lib/Common/RetryPolicies/ExponentialRetry.cs
if ((statusCode >= 300 && statusCode < 500 && statusCode != 408)
    || statusCode == 501 // Not Implemented
    || statusCode == 505 // Version Not Supported
    || lastException.Message == SR.BlobTypeMismatch)
{
    return false; // i.e. do not retry if we get here; otherwise retry if within the max retry count
}

Max request length exceeded

I have a user receiving the following error in response to an ItemQueryRq with the QuickBooks Web Connector and IIS 7.
Version:
1.6
Message:
ReceiveResponseXML failed
Description:
QBWC1042: ReceiveResponseXML failed
Error message: There was an exception running the extensions specified in the config file. --> Maximum request length exceeded. See QWCLog for more details. Remember to turn logging on.
The log shows the prior request to be:
QBWebConnector.SOAPWebService.ProcessRequestXML() : Response received from QuickBooks: size (bytes) = 3048763
In IIS 7, the max allowed content length is set to 30000000, so I'm not sure what I need to change to allow this response through. Can someone point me in the right direction?
Chances are, your web server is rejecting the Web Connector's HTTP request because you're trying to POST too much data to it. It's tough to tell for sure though, because it doesn't look like you have the Web Connector in VERBOSE mode, the posted log isn't complete enough to see the rest of what happened, and you didn't post the ItemQuery request you sent or an idea of how many items you're getting back in the response.
If I had to guess, you're sending a very generic ItemQueryRq to try to fetch ALL items, which has a high likelihood of returning A LOT of data, and thus having IIS reject the HTTP request.
Whenever you're fetching a large amount of data using the Web Connector, you should be using iterators. Iterators allow you to break up the result set into smaller chunks.
qbXML Iterator example
other qbXML examples
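As an illustration, an iterated ItemQueryRq looks roughly like this (the MaxReturned value and iteratorID are placeholders):
<!-- first request: start the iterator and fetch the first chunk -->
<ItemQueryRq requestID="1" iterator="Start">
    <MaxReturned>100</MaxReturned>
</ItemQueryRq>

<!-- subsequent requests: continue with the iteratorID returned in the previous response -->
<ItemQueryRq requestID="2" iterator="Continue" iteratorID="{guid-from-previous-response}">
    <MaxReturned>100</MaxReturned>
</ItemQueryRq>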
If you just need to determine whether an item exists in QB, you can simply add IncludeRetElement to your ItemQuery.
So you should post something like:
<ItemQueryRq requestID="55">
    <FullName>Prepay Discount</FullName>
    <IncludeRetElement>ListID</IncludeRetElement>
</ItemQueryRq>
And in the ItemQuery response just check the status code: if it is equal to 500, it means the item was not found and you should push your item into QB; if it is equal to 0, it means that the item exists.
That workaround will save plenty of bytes in your response
