Ingest more than 30k user details from Azure Active Directory - azure

I am facing an issue iterating over 39k users in Azure Active Directory.
I am able to get the AD users from the Microsoft Graph API page by page.
Because the Graph API returns results page by page, our throughput is limited to the number of records in a single page.
As a result, it takes a long time (more than an hour) to process the data for more than 10k users.
Is there any way to retrieve the pages of users in parallel and process those batches of data in parallel?
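For illustration, here is a minimal sketch (Python, calling the REST endpoint directly) of one common workaround: the @odata.nextLink paging itself stays sequential, but each page is handed to a thread pool so processing overlaps with fetching the next page. process_user() and the access-token handling are placeholders.

```python
# Sketch only: page retrieval via @odata.nextLink stays sequential, but each
# page is handed to a thread pool so processing overlaps with the next fetch.
# process_user() and the access token handling are placeholders.
import requests
from concurrent.futures import ThreadPoolExecutor

GRAPH_USERS_URL = "https://graph.microsoft.com/v1.0/users?$top=999"  # 999 is the max page size for /users

def process_user(user):
    pass  # placeholder for your per-user work (DB write, enrichment, ...)

def ingest_all_users(access_token, max_workers=8):
    headers = {"Authorization": f"Bearer {access_token}"}
    url = GRAPH_USERS_URL
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = []
        while url:
            resp = requests.get(url, headers=headers, timeout=30)
            resp.raise_for_status()
            body = resp.json()
            # fan out this page's records while the next page is being fetched
            futures.extend(pool.submit(process_user, u) for u in body.get("value", []))
            url = body.get("@odata.nextLink")
        for f in futures:
            f.result()  # surface any worker exceptions
```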

Related

Azure B2C Performance Metrics

Is it possible to track page load times for each user journey? What other performance metrics are available for Azure AD B2C that we can also plug into Azure Monitor? Can we track the total execution time of a user journey (at least for the steps where we do not wait for user input)?
I came across github.com/yoelhor/aadb2c-load-test, which might help with Azure AD B2C load testing.

Overcome throttling in Office 365 with service account

Currently I am working on an application that fetches and downloads data from Office 365 services for a given organization. I use public APIs such as EWS, SharePoint, and Microsoft Graph to access data for a given user in the organization (Outlook, Calendar, OneDrive), for groups (Team Site content, Planner, Conversations), and for SharePoint content. I need to execute a lot of requests at the same time, but unfortunately I run into throttling when I do. There is some information on the internet about using service accounts to reduce the throttling rate, but not much of it at the moment.
How can I use service accounts to overcome throttling in Office 365?
Are you already using a service account, or do you actually have the credentials for each user whose data you are retrieving? Typically the way to avoid throttling is to get a service account with Impersonation rights to the individual calendars, etc. When querying the mailbox or calendar, the service account impersonates the actual user, so the connection and communication charges are counted against the user and not the service account. This way a single service account can launch many parallel requests for multiple users without the charges accumulating against the service account itself and causing throttling.
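As a rough illustration of that impersonation pattern (not taken from the answer), here is a sketch using the third-party exchangelib package; the account names, password handling, and server value are placeholders, and newer tenants will need OAuth credentials rather than basic auth.

```python
# Sketch only, using the third-party exchangelib package. The service account,
# password handling and server value are placeholders; newer tenants will need
# OAuth credentials instead of basic auth.
from exchangelib import Account, Configuration, Credentials, IMPERSONATION

creds = Credentials("svc-ews@contoso.com", "service-account-secret")
config = Configuration(server="outlook.office365.com", credentials=creds)

def open_user_mailbox(user_smtp):
    # The service account authenticates once, but the throttling budget is
    # charged to the impersonated user, so many users can run in parallel.
    return Account(
        primary_smtp_address=user_smtp,
        config=config,
        autodiscover=False,
        access_type=IMPERSONATION,
    )

upcoming = list(open_user_mailbox("alice@contoso.com").calendar.all()[:10])
```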
I've spent an intense amount of time dealing with this problem. Here is what we did:
1) Moved our automated processing to off peak hours (e.g. 6pm - 6am)
2) All calls need to have retry ability. MS says to use the value in the Retry-After header, but it is always 2 minutes. I'll retry for 20 minutes. If it still fails after that period, I don't try again for another hour.
Using service accounts is absolutely a workaround, albeit a terrible one. At some point, MS will adjust the algorithms again and you will be just as screwed. The real answer is retrying over periods of time.
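A minimal sketch of the retry pattern described in point 2, assuming plain HTTP calls via requests; the status codes, fallback delay, and 20-minute window below are illustrative choices, not values from the answer.

```python
# Sketch of a throttling-aware retry: honor Retry-After when present, cap the
# total retry window (~20 minutes here), then give up and let the caller decide
# when to come back.
import time
import requests

def get_with_retry(url, headers, max_window_seconds=20 * 60):
    deadline = time.monotonic() + max_window_seconds
    delay = 10  # fallback backoff when no Retry-After header is returned
    while True:
        resp = requests.get(url, headers=headers, timeout=30)
        if resp.status_code not in (429, 503):
            return resp
        if time.monotonic() > deadline:
            raise RuntimeError(f"Still throttled after {max_window_seconds}s: {url}")
        wait = int(resp.headers.get("Retry-After", delay))
        time.sleep(wait)
        delay = min(delay * 2, 120)  # exponential fallback, capped
```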

Increasing the data retention for activity logs (Audit and Sign-ins) in Azure Active Directory

In the Azure Portal under Azure Active Directory I am looking for a way to persist the Audit and Sign-in activity data for 1-year or longer. Azure AD Premium 1-2 seems to only allow for a maximum of 30 days. I am in search of a method, preferably inside of the Azure ecosystem, to store this data longer. In my attempts to Google a solution, I found the ability to export the Azure Activity Log data to general purpose storage, but I do not see that option from within Azure Active Directory.
Is the only option to create a script to move this data to a more permanent location, or is there a way to extend the data retention for these logs within Azure?
I'm new to all things Azure, so if I am missing any obvious things, please inform me.
For now, AAD doesn't support increasing the data retention for Audit logs within Azure Active Directory.
Depending on your license, Azure Active Directory stores activity reports for the following durations:
Report           | Azure AD Free | Azure AD Premium P1 | Azure AD Premium P2
Directory Audit  | 7 days        | 30 days             | 30 days
Sign-in Activity | 7 days        | 30 days             | 30 days
If you need data for a duration longer than 30 days, you can pull the data programmatically using the reporting API and store it on your side. Alternatively, you can integrate the audit logs into your SIEM system.
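For illustration, a sketch of pulling those reports through the Microsoft Graph reporting endpoints (auditLogs/directoryAudits and auditLogs/signIns); save_records() and the token acquisition are placeholders for your own archive target and auth flow.

```python
# Sketch only: page through a Graph reporting endpoint and hand each page to
# save_records(), a placeholder for your archive target (blob storage, SQL,
# a SIEM ingestion endpoint, ...).
import requests

def export_report(access_token, endpoint, save_records):
    headers = {"Authorization": f"Bearer {access_token}"}
    url = f"https://graph.microsoft.com/v1.0/auditLogs/{endpoint}"
    while url:
        resp = requests.get(url, headers=headers, timeout=30)
        resp.raise_for_status()
        body = resp.json()
        save_records(body.get("value", []))
        url = body.get("@odata.nextLink")

# e.g. run daily from a scheduled job:
# export_report(token, "directoryAudits", archive_to_blob)
# export_report(token, "signIns", archive_to_blob)
```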
Hope this helps!

Azure Web Jobs long history affecting app service plan performance

I have been running my web jobs for a few months now, and the history includes hundreds of thousands of instances of when some of them ran, mainly TimerTriggers. When I go in the portal to the "Functions" view of the web jobs logs, I have noticed that my App Service plan shoots up to 100% CPU while I am sitting on that page. The page constantly says "Indexing...."
When I close the "Functions" view, the CPU goes straight back down to a few percent, its normal range.
I assume it must be down to the fact that it has been running for so long and the number of records to search through is so vast. I cannot see any option to archive or remove old records of when jobs ran.
Is there a way I can reduce the history of the jobs? Or is there another explanation?
I'm not familiar with Azure Web Jobs, but I am familiar with Azure Functions, which is built on top of Web Jobs, so this might work.
In Azure Functions, each execution is stored in an Azure Storage table. There you can see all of the parameters that were passed in, as well as the result. I was able to go into the storage table and truncate the records I did not need, so you might be able to do the same with Web Jobs.
Here is how to access this information: see the Table Storage section of markheath.net/post/three-ways-view-error-logs-azure-functions.
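If your execution history really does sit in a storage table as this answer describes, a cleanup could look roughly like the following sketch using the azure-data-tables package; the table name and retention window are hypothetical, so verify them against your own storage account first.

```python
# Sketch only, using the azure-data-tables package. "AzureWebJobsHostLogs" is a
# hypothetical table name and 30 days is an arbitrary retention window; check
# your own storage account before deleting anything.
from datetime import datetime, timedelta, timezone
from azure.data.tables import TableServiceClient

def truncate_old_entities(conn_str, table_name="AzureWebJobsHostLogs", keep_days=30):
    cutoff = (datetime.now(timezone.utc) - timedelta(days=keep_days)).strftime("%Y-%m-%dT%H:%M:%SZ")
    table = TableServiceClient.from_connection_string(conn_str).get_table_client(table_name)
    # Delete every entity whose Timestamp is older than the cutoff.
    for entity in table.query_entities(f"Timestamp lt datetime'{cutoff}'"):
        table.delete_entity(partition_key=entity["PartitionKey"], row_key=entity["RowKey"])
```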
Based on your description, I checked my WebJob and found the related logs for the Azure WebJobs dashboard in the storage account's blob container.
The Invocation Log (recently executed functions) entries are stored there as well.
Note: the list records for the Invocation Log are under azure-webjobs-dashboard\functions\recent\flat, and the detailed invocation logs are under azure-webjobs-dashboard\functions\instances.
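For illustration, a sketch of pruning those blobs with the azure-storage-blob package; the prefixes come from the note above, but the 30-day window is an arbitrary choice, and you should verify the paths in your own storage account before deleting anything.

```python
# Sketch only, using the azure-storage-blob package. Prefixes come from the
# note above; 30 days is an arbitrary retention window.
from datetime import datetime, timedelta, timezone
from azure.storage.blob import BlobServiceClient

def prune_dashboard_logs(conn_str, keep_days=30):
    cutoff = datetime.now(timezone.utc) - timedelta(days=keep_days)
    container = BlobServiceClient.from_connection_string(conn_str) \
        .get_container_client("azure-webjobs-dashboard")
    for prefix in ("functions/recent/flat", "functions/instances"):
        for blob in container.list_blobs(name_starts_with=prefix):
            if blob.last_modified < cutoff:
                container.delete_blob(blob.name)
```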

Setting up Azure to Sync Contacts in Custom Program, Tasks and Pricing

We have our own application that stores contacts in an SQL database. What all is involved in getting up and running in the cloud so that each user of the application can have his own, private list of contacts, which will be synced with both his computer and his phone?
I am trying to get a feeling for what Azure might cost in this regard, but I am finding more abstract talk than I am concrete scenarios.
Let's say there are 1,000 users, and each user has 1,000 contacts that he keeps in his contacts book. No user can see the contacts set up by any other user. Syncing should occur any time the user changes his contact information.
Thanks.
While the Windows Azure Cloud Platform is not intended to compete directly with consumer-oriented services such as Dropbox, it is certainly intended as a platform for building applications that do that. So your particular use case is a good one for Windows Azure: creating a service for keeping contacts in sync, scalable across many users, scalable in the amount of data it holds, and so forth.
Making your solution multi-tenant friendly (per the comment from @BrentDaCodeMonkey) is key to cost-efficiency. Your data needs are 1K users x 1K contacts/user = 1M contacts. If each contact is approximately 1KB, then we are talking about roughly 1GB of storage.
Checking out the pricing calculator, the at-rest storage cost is $9.99/month for a Windows Azure SQL Database instance for 1GB (then $13.99 if you go up to 2GB, etc. - refer to calculator for add'l projections and current pricing).
Then you have data transmission (Bandwidth) charges. Though since the pricing calculator says "The first 5 GB of outbound data transfers per billing month are also free" you probably won't have any costs with current users, assuming moderate smarts in the sync.
This does not include the costs of your application. What is your application, how does it run, etc? Assuming there is a client-side component, (typically) this component cannot be trusted to have the database connection. This would therefore require a server-side component running that could serve as a gatekeeper for the database. (You also, usually, don't expose the database to all IP addresses - another motivation for channeling data through a server-side component.) This component will also cost money to operate. The costs are also in the pricing calculator - but if you chose to use a Windows Azure Web Site that could be free. An excellent approach might be the nifty ASP.NET Web API stack that has recently been released. Using the Web API, you can implement a nice REST API that your client application can access securely. Windows Azure Web Sites can host Web API endpoints. Check out the "reserved instance" capability too.
I would start out with Windows Azure Web Sites and, as the service grows in complexity/sophistication, check out Windows Azure Cloud Services (as a more advanced approach to building server-side components).
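To make the gatekeeper idea concrete, here is a sketch shown in Python/Flask rather than ASP.NET Web API, purely for brevity: the client never touches the database, and every query is scoped to the authenticated user. The header-based "auth" and the in-memory store are placeholders for a real token check and the SQL contacts table.

```python
# Sketch only: a minimal per-user contacts gatekeeper. The X-User-Id header and
# the in-memory CONTACTS dict are placeholders for real token validation and
# the SQL database.
from flask import Flask, abort, jsonify, request

app = Flask(__name__)

CONTACTS = {}  # hypothetical stand-in for the per-user contacts table

def get_user_id():
    # Placeholder: a real service would validate a bearer token from your
    # identity provider and derive the caller's id from it.
    user_id = request.headers.get("X-User-Id")
    if not user_id:
        abort(401)
    return user_id

@app.route("/api/contacts", methods=["GET"])
def list_contacts():
    # Every query is scoped to the caller, so no user can read another's contacts.
    return jsonify(CONTACTS.get(get_user_id(), []))

@app.route("/api/contacts", methods=["POST"])
def add_contact():
    CONTACTS.setdefault(get_user_id(), []).append(request.get_json())
    return "", 201
```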
