SharePoint Online Throttling

We are creating a custom solution using SharePoint CSOM.
There are some cases where we need to store ~400 list items containing Attachments, with a total size of about 400 MB.
We are uploading them in parallel (using the TPL, with each iteration using a new ClientContext instance), so the whole upload process does not take much time. However, we receive a lot of throttling errors (HTTP status code 429).
We did some research and found that setting the User-Agent header could partially solve the throttling issue, but it's not clear which values (extracted from where?) you need to use to compose the header.
Any idea how we can avoid getting throttled? The main flows in the app require uploading a large number of documents or storing a large number of list items at once.
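One pattern that helps (a sketch, not a complete answer): Microsoft's throttling guidance asks clients to decorate traffic with a User-Agent of the form NONISV|CompanyName|AppName/Version (or ISV|... for vendors), honor the Retry-After header on 429/503 responses, and back off instead of retrying immediately. Since your code is C# CSOM, treat the following JavaScript as pseudocode for the pattern; the helper name, delays, and retry cap are all illustrative:

```javascript
// Hedged sketch: retry helper that honors Retry-After on HTTP 429/503,
// falling back to capped exponential backoff when the header is absent.
// `doRequest` is any function returning a fetch-style response object.
async function withThrottleRetry(doRequest, maxRetries = 5) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await doRequest();
    if (res.status !== 429 && res.status !== 503) return res;
    if (attempt === maxRetries) return res; // retries exhausted
    const retryAfter = Number(res.headers?.get?.('Retry-After'));
    const delayMs = Number.isFinite(retryAfter) && retryAfter > 0
      ? retryAfter * 1000                       // server-suggested wait
      : Math.min(2 ** attempt * 1000, 30000);   // fallback backoff, capped
    await new Promise(r => setTimeout(r, delayMs));
  }
}
```

Wrapping each upload call this way (and lowering the degree of parallelism) usually matters more than the User-Agent alone, because the Retry-After value is the server telling you exactly how long to pause.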

Related

Serving and processing large CSV and XML files from API

I am working on a Node web application and need a form that lets users provide a URL pointing to a (potentially 100 MB) CSV or XML file. Submitting it would trigger the server (Express) to download the file using fetch, process it, and save it to my Postgres database.
The problem I am having is the size of the file. Responses from the API take minutes to return, and I'm worried this solution is not optimal for a production application. I've also seen that many servers (including cloud-based ones) have response size limits, which would obviously be exceeded here.
Is there a better way to do this than simply via a fetch request?
Thanks

Microsoft Graph API Excel: session creation with the long-running operation pattern (when and how to do it)

I have been using the Microsoft Graph API to create/update/delete rows in Excel files.
However, my requests need to process tens of thousands of records, which the Graph API does not support by default.
To do this, we have to request a workbook-session-id and use it when calling the Graph API for requests comprising large numbers of rows.
However, when I request a workbook-session-id, even after multiple polling requests the call fails to return one, so I am not able to leverage this feature.
Can anyone help me resolve this?
After the first poll, the Graph API fails to report success and instead returns this error: {"error":{"code":"LostDepartedOperation","message":"We're sorry. We ran into a problem completing your request.","innerError":{"code":"notFoundUncategorized","message":"The requested resource cannot be found."}}}
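For reference, the documented long-running flow is: createSession returns 202 Accepted with a Location monitor URL; you poll that URL until the operation reports succeeded, then fetch its resourceLocation to obtain the session. A hedged sketch of that flow (the endpoint shape follows the Graph docs, but `driveItemId`, the token, and the 1-second polling interval are placeholders, and a real client should also cap the number of polls):

```javascript
// Hedged sketch: create a persistent workbook session, handling the case
// where Graph answers with 202 Accepted and a monitor URL to poll.
async function createWorkbookSession(accessToken, driveItemId, fetchImpl = fetch) {
  const base = 'https://graph.microsoft.com/v1.0/me/drive/items';
  const auth = { Authorization: `Bearer ${accessToken}` };
  let res = await fetchImpl(`${base}/${driveItemId}/workbook/createSession`, {
    method: 'POST',
    headers: { ...auth, 'Content-Type': 'application/json' },
    body: JSON.stringify({ persistChanges: true }),
  });

  if (res.status === 202) {
    // Long-running operation: poll the monitor URL from the Location header.
    const monitorUrl = res.headers.get('Location');
    for (;;) {
      await new Promise(r => setTimeout(r, 1000)); // interval is a placeholder
      const op = await (await fetchImpl(monitorUrl, { headers: auth })).json();
      if (op.status === 'succeeded') {
        res = await fetchImpl(op.resourceLocation, { headers: auth });
        break;
      }
      if (op.status === 'failed') throw new Error('workbook operation failed');
    }
  }
  if (!res.ok) throw new Error(`createSession failed: ${res.status}`);
  return (await res.json()).id; // value for the workbook-session-id header
}
```

A "notFoundUncategorized" error on the first poll often means the monitor URL was not taken verbatim from the Location header, or the poll used a different identity/tenant than the original request, so it's worth logging the exact URL being polled.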

List Blobs in a container with 1000+ files (large number of files) [AZURE-STORAGE][REST]

I am trying to make a List Blobs REST call against a container in my Azure storage account.
Currently I have a limited number of files, and the response I get is in the format specified in the official docs.
But if there are thousands of files in this container, will the response still be one huge XML document in that format, or will there be pagination or paging?
I can't practically test it with 1000 files, and there is no such thing mentioned in the docs.
The simple answer to your question is yes. The maximum number of blobs that can be returned in a single List Blobs request is 5000 (it can be less than that, and even zero).
If more blobs are available, you'll get a continuation token which you can use to fetch the next set of blobs. This continuation token can be found in the NextMarker element of the response body.
For further reference, please see this link: https://learn.microsoft.com/en-us/rest/api/storageservices/enumerating-blob-resources.
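The pagination loop described above can be sketched like this, assuming a SAS-authenticated container URL; the regex extraction is a minimal stand-in for a real XML parser:

```javascript
// Hedged sketch: page through List Blobs results by following NextMarker
// until the service returns an empty marker.
function parseListBlobsPage(xml) {
  const names = [...xml.matchAll(/<Name>([^<]+)<\/Name>/g)].map(m => m[1]);
  const marker = (xml.match(/<NextMarker>([^<]+)<\/NextMarker>/) || [])[1] || null;
  return { names, marker };
}

async function listAllBlobs(containerSasUrl, fetchImpl = fetch) {
  const all = [];
  let marker = null;
  do {
    let url = `${containerSasUrl}&restype=container&comp=list&maxresults=5000`;
    if (marker) url += `&marker=${encodeURIComponent(marker)}`; // resume from last page
    const res = await fetchImpl(url);
    const { names, marker: next } = parseListBlobsPage(await res.text());
    all.push(...names);
    marker = next; // null/empty NextMarker means this was the last page
  } while (marker);
  return all;
}
```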

Scalability for intensive pdf generation tasks on a node.js app using puppeteer?

The goal of the app is to generate a PDF using Puppeteer: we fetch the data, build the HTML template, generate the PDF with headless Chrome, and then return a link to the newly generated PDF.
The issue is that it takes about 7000 ms to generate a PDF, mainly because of three Puppeteer functions: launch (launch the headless browser), goto (navigate to the HTML template), and pdf (generate the PDF).
So with around 7-8 seconds to answer one request, more incoming requests or a sudden spike could easily push 30 simultaneous requests to 40-50 seconds, which I find unacceptable.
After much time spent on research, I will implement the cluster module to take advantage of multiple processes.
But besides clustering, are there any other possible options to optimize the time on a single instance?
There are a few things to consider...
Consider calling puppeteer.launch once per application start. Your conversion script should just check whether a browser instance already exists and reuse it by calling newPage(), which basically creates a new tab, instead of creating a new browser every time.
You may consider intercepting requests with page.on('request', this.onPageRequest); when calling goto(), filter out certain types of files that the page is loading but that you don't need for PDF rendering; you may filter out external resources as well if that applies to your case.
When using pdf(), you may return a Buffer from your service instead of writing the PDF to the file system and returning a link to its location. This may or may not speed things up, depending on your service setup; in any case, less I/O should be better.
This is probably all you can do for a single instance of your app. With the implementation above, a regular (couple of pages) PDF with a few images renders for me in 1-2 seconds.
To speed things up further, use clustering. Rather than embedding it inside your application, you may consider using the PM2 process manager to start and scale multiple instances of your service.
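The tips above can be combined into a sketch like this. The factory, `renderPdf`, and the filtered resource types are illustrative, and the launcher is injectable; in production you would pass `() => puppeteer.launch()`:

```javascript
// Hedged sketch: one shared browser per process, a fresh tab per request,
// request filtering during goto(), and a Buffer returned instead of a file.
function makePdfRenderer(launchFn) {
  let browserPromise = null; // shared across all requests in this process

  function getBrowser() {
    if (!browserPromise) browserPromise = launchFn(); // launch only once
    return browserPromise;
  }

  return async function renderPdf(url) {
    const browser = await getBrowser();
    const page = await browser.newPage(); // new tab, not a new browser
    try {
      // Skip resource types the PDF doesn't need (adjust for your templates).
      await page.setRequestInterception(true);
      page.on('request', req =>
        ['media', 'websocket'].includes(req.resourceType())
          ? req.abort()
          : req.continue());
      await page.goto(url, { waitUntil: 'networkidle0' });
      return await page.pdf({ format: 'A4' }); // Buffer, no filesystem round-trip
    } finally {
      await page.close(); // only the tab is torn down
    }
  };
}
```

An Express handler can then send the Buffer directly with `res.type('application/pdf').send(buf)`.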

Get Photos and Detailed Info for Many Foursquare Venues in one call

I am working on an iPhone app which allows users to search Foursquare. I am using the venues/explore endpoint for the search which works great, but the results don't include the images for a place or the priceTier.
Right now I am calling /venues/VENUE_ID for each of the returned results, which is generating a lot of API calls. Is there a better way to get this info in a single call?
Related question: If I use the multi endpoint to batch these requests, does that count as a single request towards the limit or as multiple requests?
Sounds like you're worried about limits more than network latency? If you're worried that making the extra call for details will make you hit rate limits faster, this is actually why we generally ask developers to cache details such as prices or photos :) A single multi request is not a single API call; it counts as however many requests are bundled into it.
There is a little help with photos, though: if you pass in the venuePhotos=1 param as part of an explore request, you ought to get back photos in the response.
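For illustration, an explore request built with that param might look like this (the version date and credentials are placeholders):

```javascript
// Hedged sketch: build a v2 explore URL that asks for photos inline,
// avoiding a per-venue /venues/VENUE_ID call just for images.
function buildExploreUrl({ clientId, clientSecret, near, query }) {
  const params = new URLSearchParams({
    client_id: clientId,
    client_secret: clientSecret,
    v: '20180323',       // API version date, placeholder
    near,
    query,
    venuePhotos: '1',    // include photos in the explore response
  });
  return `https://api.foursquare.com/v2/venues/explore?${params}`;
}
```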