I am using the Azure offline data sync framework to sync mobile app data to the server, but I am seeing a huge performance hit: syncing 2000 records takes around 25 minutes. On further analysis, I found that each HTTP request during PushAsync takes around 800 ms.
Can someone help me send multiple records in a single HTTP request during the PushAsync operation?
I want to build an API using Python and host it on Google Cloud. The API will basically read some data from a bucket, do some processing on it, and return the data. I am hoping I can read the data into memory so that when a request comes in, I just process it and send the response back, serving it with low latency. Assume I will read a few thousand records from some database/storage, and when a request comes in, process them and send 10 back based on the request parameters. I don't want to make a connection/read from storage when the request comes, as that will take time and I want to serve as fast as possible.
Will Google Cloud Functions work for this need, or should I go with App Engine? (Basically, I want to be able to read the data once and hold it for incoming requests.) The data will mostly be less than 1-2 GB (max).
thanks,
Manish
You have to package or load the static data alongside your function code. Increase the Cloud Functions memory so the function can hold the data in memory, keeping it warm for very fast access.
Then, you have 2 ways to achieve it:
Load the data at startup. You load it only once: the first call has high latency because it downloads the data (from GCS, for instance) and loads it into memory. The advantage is that when the data changes, you don't have to redeploy your function, only update the data in its location; at the next function start, the new data will be loaded.
Deploy the function with the static data included in the deployment. This time the startup is much faster (no download), only the data to load into memory. But when you want to update the data, you have to redeploy your function.
A final word: if you have 2 sets of static data, you should have 2 functions. The responsibilities are different, so the deployments are different.
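A minimal sketch of the first option in Python (which the question mentions), assuming an HTTP Cloud Function; the bucket name, object name, and key parameter are made-up placeholders. The data is cached in a module-level variable, so only the first request on a cold instance pays the download cost:

```python
import json

import functions_framework
from google.cloud import storage

# Hypothetical location of the pre-computed records; adjust to your setup.
BUCKET_NAME = "my-data-bucket"
BLOB_NAME = "records.json"

# Module-level cache: filled on the first request of a cold instance and
# reused by every later request served by the same warm instance.
_records = None


def _load_records():
    global _records
    if _records is None:
        client = storage.Client()
        blob = client.bucket(BUCKET_NAME).blob(BLOB_NAME)
        _records = json.loads(blob.download_as_text())
    return _records


@functions_framework.http
def serve(request):
    records = _load_records()
    # Illustrative selection: return the first 10 records matching a
    # query parameter; replace with your real filtering logic.
    key = request.args.get("key")
    matches = [r for r in records if key is None or r.get("key") == key]
    return {"results": matches[:10]}
```

For the second option you would ship records.json inside the deployment package and read it from the local filesystem instead of GCS; everything else stays the same.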
We're running our Node backend on Firebase Functions, and have to frequently hit a third-party API (HubSpot), which is rate-limited to 100 requests / 10 seconds.
We're making these requests to HubSpot from our cloud functions, and often find ourselves exceeding HubSpot's rate-limit during campaigns or other website usage spikes. Also, since they are all write requests to update data on HubSpot, these requests cannot be made out of order.
Is there a way to throttle our requests to HubSpot, so as to not exceed their rate limit? Open to suggestions that may not necessarily involve cloud functions, although that would be preferred.
Note: When I say "throttle", I mean that all requests to HubSpot need to go through. I'm trying to achieve something similar to what Lodash's throttle method does, if that makes sense.
What we usually do in this case is store the data in a database, and then pass it over to HubSpot at a measured pace (i.e. without exceeding their rate limit) using a cron that runs every minute. Every data item that we pass to HubSpot successfully is marked as "success" in the database.
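A rough sketch of that pattern, with SQLite standing in for whatever database you actually use and a hypothetical hubspot_queue table and endpoint; a scheduler would call drain_pending() every minute:

```python
import json
import sqlite3

import requests

# Hypothetical HubSpot endpoint, purely for illustration.
HUBSPOT_URL = "https://api.hubapi.com/example-endpoint"


def send_to_hubspot(payload):
    # Placeholder for the real HubSpot write call.
    resp = requests.post(HUBSPOT_URL, json=payload, timeout=10)
    resp.raise_for_status()


def drain_pending(db_path="queue.db", batch=100):
    """Cron entry point: push up to `batch` pending items, oldest first."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT id, payload FROM hubspot_queue "
        "WHERE status = 'pending' ORDER BY id LIMIT ?",
        (batch,),
    ).fetchall()
    for row_id, payload in rows:
        send_to_hubspot(json.loads(payload))
        # Mark each item as soon as it succeeds so a crash never re-sends it.
        conn.execute(
            "UPDATE hubspot_queue SET status = 'success' WHERE id = ?",
            (row_id,),
        )
        conn.commit()
    conn.close()
```

Processing oldest-first keeps the writes in order, and a batch of 100 per one-minute run stays comfortably under HubSpot's 100 requests / 10 seconds limit.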
Cloud Functions cannot be rate limited; it will always attempt to service requests and events as fast as they arrive. But you can use Cloud Tasks to create a task queue that spreads the work out over time using a configured rate limit. A task queue can target another HTTP function. This effectively makes your processing asynchronous, but it is really the only mechanism Google Cloud gives you to smooth out load.
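Sketched in Python for brevity (the Node client follows the same shape), assuming a queue created with something like gcloud tasks queues create hubspot-queue --max-dispatches-per-second=10 --max-concurrent-dispatches=1 and a worker HTTP function; the project, location, and worker URL are placeholders:

```python
import json

from google.cloud import tasks_v2

# Placeholder identifiers; substitute your own project, region, and worker URL.
PROJECT = "my-project"
LOCATION = "us-central1"
QUEUE = "hubspot-queue"
WORKER_URL = "https://us-central1-my-project.cloudfunctions.net/hubspotWorker"

client = tasks_v2.CloudTasksClient()
parent = client.queue_path(PROJECT, LOCATION, QUEUE)


def enqueue_hubspot_update(payload: dict) -> None:
    """Queue one HubSpot write; Cloud Tasks dispatches it at the queue's rate."""
    task = {
        "http_request": {
            "http_method": tasks_v2.HttpMethod.POST,
            "url": WORKER_URL,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps(payload).encode(),
        }
    }
    client.create_task(parent=parent, task=task)
```

Setting max-concurrent-dispatches to 1 keeps only one write in flight at a time, which approximates ordering, although Cloud Tasks does not strictly guarantee FIFO delivery.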
Azure provides a monitor for incoming requests to Cosmos DB. While I was the only one working on my Cosmos DB, I ran a simple select-vertex statement (e.g., g.V('id')). When I then checked the incoming-request metric, it showed around 10, even though I know for sure I was the only one who accessed it. I also tried traversing the graph in a single select query, and the request count was huge (around 100).
Has anybody else noticed this in the metrics? We suspect the hourly request count in production is huge and is causing the slow performance. Is the metric trustworthy, or how else can I find the incoming requests to Cosmos DB?
I have created Azure Logic Apps to pull data from a REST API, populate an Azure SQL Database, process the data, and push the result to Dynamics 365. I have around 6000 rows from the REST API, and I have created 2 logic apps: the first pulls data in pages (each page having 10 records) and uses a Do Until loop to process each set. From the Do Until loop I call logic app 2, passing the paged records, and it inserts the records into the SQL Database.
The issue I'm encountering is that the main logic app times out after 2 minutes (it processes around 600 rows and then times out).
I came across this article, which explains various patterns for managing long-running processes:
https://learn.microsoft.com/en-us/azure/logic-apps/logic-apps-create-api-app
What would be the best approach to executing long-running tasks without timeout issues?
Your REST API should follow the async pattern by returning 202 with Retry-After and Location headers; see more at: https://learn.microsoft.com/azure/logic-apps/logic-apps-create-api-app
Or your REST API can be of the webhook kind, so Logic Apps can provide a callback URL for you to invoke once the processing is completed.
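A minimal Flask sketch of the 202 polling pattern from the first option, with in-memory job state and made-up route names, just to show the headers Logic Apps looks for:

```python
import threading
import uuid

from flask import Flask, jsonify, url_for

app = Flask(__name__)
jobs = {}  # job_id -> "running" | "done"  (in-memory for illustration only)


def long_running_work(job_id):
    # Stand-in for the real processing of a page of records.
    jobs[job_id] = "done"


@app.route("/process", methods=["POST"])
def start():
    job_id = str(uuid.uuid4())
    jobs[job_id] = "running"
    threading.Thread(target=long_running_work, args=(job_id,)).start()
    # 202 + Location + Retry-After: Logic Apps polls the Location URL
    # until it stops receiving 202 responses.
    return (
        "",
        202,
        {"Location": url_for("status", job_id=job_id, _external=True),
         "Retry-After": "20"},
    )


@app.route("/process/<job_id>", methods=["GET"])
def status(job_id):
    if jobs.get(job_id) == "done":
        return jsonify({"status": "done"}), 200
    return "", 202, {"Retry-After": "20"}
```

Logic Apps keeps polling the Location URL at the Retry-After interval until it receives something other than 202, so the calling logic app no longer sits inside a single long HTTP call.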
I have a DocumentDB database on Azure. I have a particularly heavy query that runs when I archive a user record and all of their data.
I was on the S1 plan and would get an exception indicating I was hitting the RU/s limit. The S1 plan has 250.
I decided to switch to the Standard plan that lets you set the RU/s and pay for it.
I set it to 500 RU/s.
I did the same query and went back and looked at the monitoring chart.
At the time I ran this latest query test, it said I made 226 requests and 10 were throttled.
Why is that? I set it to 500 RU/s. The query failed, by the way.
Firstly, requests != request units, so your 226 requests must at some point have needed more than 500 request units within one second.
The DocumentDB API will tell you how many RUs each request costs, so you can examine that client-side to find out which request is causing the problem. In my experience, even a simple by-id request often costs at least a few RUs.
How you see that cost depends on which client-side SDK you use. In my code, I have added something that automatically logs any request costing more than 10 RUs, just so I know and can take action.
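For instance, with the current Python SDK (azure-cosmos) the charge comes back in the x-ms-request-charge response header; a rough version of that logging, with placeholder connection details and the arbitrary 10 RU threshold mentioned above:

```python
import logging

from azure.cosmos import CosmosClient

# Placeholder connection details; substitute your own account and key.
client = CosmosClient("https://my-account.documents.azure.com:443/", "<key>")
container = client.get_database_client("mydb").get_container_client("users")

RU_ALERT_THRESHOLD = 10.0


def query_with_ru_logging(query):
    # Run the query, then read the RU charge from the last response headers.
    items = list(container.query_items(query, enable_cross_partition_query=True))
    charge = float(
        container.client_connection.last_response_headers["x-ms-request-charge"]
    )
    if charge > RU_ALERT_THRESHOLD:
        logging.warning("Query cost %.1f RUs: %s", charge, query)
    return items
```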
It's also the case that the monitoring tools in the portal are quite inadequate and I know the team are working on that; you can only see the total RUs for every five minute interval, but you may try to use 600 RUs in one second and you can't really see that in the portal.
In your case, you may either have a single big query that simply costs more than 500 RUs - the logging will tell you. In that case, look at the generated SQL to see why; maybe even post it here.
Alternatively, it may be the cumulative effect of lots of small requests being fired off in a small time window. If you are making 226 requests in response to one user action (and I don't know whether you are), then you probably want to reconsider your design :)
Finally, you can retry failed requests. I'm not sure about other SDKs, but the .NET SDK automatically retries a request 9 times before giving up (which might be another explanation for the 226 requests hitting the server).
If your chosen SDK doesn't retry, you can easily do it yourself; the server returns a specific status code (I think 429, but I can't quite remember) along with an indication of how long to wait before retrying.
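If you do handle it yourself, the shape is roughly this; the sketch assumes the current Python SDK (azure-cosmos), where a throttled call raises CosmosHttpResponseError with status_code 429 and the suggested pause in the x-ms-retry-after-ms header:

```python
import time

from azure.cosmos import exceptions


def with_retries(operation, max_attempts=9):
    """Call `operation` and retry on 429, honouring the server's suggested wait."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except exceptions.CosmosHttpResponseError as err:
            if err.status_code != 429 or attempt == max_attempts - 1:
                raise
            # The suggested pause comes back in x-ms-retry-after-ms;
            # fall back to one second if the header is missing.
            headers = getattr(err, "headers", None) or {}
            wait_ms = headers.get("x-ms-retry-after-ms", 1000)
            time.sleep(float(wait_ms) / 1000.0)


# Usage (hypothetical container object from azure-cosmos):
# items = with_retries(lambda: list(container.query_items(
#     "SELECT * FROM c WHERE c.userId = 'abc'",
#     enable_cross_partition_query=True)))
```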
Please examine the queries and update your question so we can help further.