Azure Active Directory B2C and Azure Front Door: reducing cost

I am currently using Azure Active Directory B2C in combination with Azure Front Door (so that I can have a custom domain). But this seems very expensive, which is why I am looking for ways to reduce the cost. I have investigated the following.
I have caching and compression enabled, but this does not help much in this case. It seems like almost all of the data comes from the Azure-generated content, which comes to around 160 KB for each request.
The authorize?client_id request seems to include only the JavaScript needed for the sign-in to work.
It may be that I just do not understand the monitoring and analytics, though, since the reports only show around 214 KB of data transferred from edge to client in the last 24 hours, while the metrics show a response size (Sum) of over 25 MB. The reports also do not seem to show anywhere near all of the requests that are actually made. Is this maybe a caching thing that I am not understanding?
The report also shows origin-to-edge data, but not edge-to-origin data, which I am paying for.
If you can clear up any of my confusion or have a great method to reduce the cost, I would greatly appreciate it.
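For what it's worth, one way to sanity-check what Front Door is actually serving is to request the sign-in pages yourself and look at the response headers. A minimal sketch; the URL is a placeholder, and it assumes the X-Cache header that Front Door adds to responses when caching is enabled:

using System;
using System.Net.Http;
using System.Threading.Tasks;

class FrontDoorCacheCheck
{
    static async Task Main()
    {
        // Placeholder: replace with a page that is served through your Front Door endpoint.
        var url = "https://login.example.com/";

        using var client = new HttpClient();
        var request = new HttpRequestMessage(HttpMethod.Get, url);
        request.Headers.TryAddWithoutValidation("Accept-Encoding", "gzip, br");

        var response = await client.SendAsync(request);
        var body = await response.Content.ReadAsByteArrayAsync();

        // Content-Encoding tells you whether compression was applied end to end;
        // X-Cache (e.g. TCP_HIT / TCP_MISS) indicates whether the edge served it from cache.
        Console.WriteLine($"Status:           {(int)response.StatusCode}");
        Console.WriteLine($"Bytes received:   {body.Length}");
        Console.WriteLine($"Content-Encoding: {string.Join(", ", response.Content.Headers.ContentEncoding)}");
        response.Headers.TryGetValues("X-Cache", out var xCache);
        Console.WriteLine($"X-Cache:          {string.Join(", ", xCache ?? Array.Empty<string>())}");
    }
}

If the B2C pages come back uncompressed and as cache misses on every request, that would line up with the roughly 160 KB per request described above.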

Related

PageSpeed Insights: the logic behind the number of distinct samples needed to show data for a URL

I'm reading the PageSpeed Insights documentation and am wondering if anyone knows how Google determines what is considered a sufficient number of distinct samples, per this FAQ:
Why is the real-world Chrome User Experience Report speed data not available for a URL?
Chrome User Experience Report aggregates real-world speed data from opted-in users and requires that a URL must be public (crawlable and indexable) and have sufficient number of distinct samples that provide a representative, anonymized view of performance of the URL.
I'm building a report centered around Core Web Vitals data and realizing some URLs have few data points with CWV timings, and I'm curious exactly how Google handles these situations. I've been searching through docs and articles but haven't found anything with a specific reference.
The exact threshold is kept secret, which is why you won't find it documented anywhere. However, as a site owner there are a few things you can do to work around a URL not having sufficient data:
Use the Core Web Vitals report in Search Console, which groups similar pages together, making them more likely to collectively exceed that threshold.
Look at origin-level aggregations in PSI or the CrUX API. These include user experiences from all pages on the origin, so it's much less granular, but it gives you a sense of typical experiences overall (see the sketch after this list).
Instrument your site with your own first-party Core Web Vitals monitoring. web-vitals.js can be integrated with your existing analytics provider to track vitals on all of your pages. If you're integrating with Google Analytics, you can link your data with the Web Vitals Report to see how your site is doing.
Use your site with the Web Vitals extension enabled to see the Core Web Vitals data for your own experience. Your local experiences may not be representative of most users, but this can be a great tool for validating expectations vs reality.
Use lab data as a proxy. For example, lab data from Lighthouse in PSI can tell you how a mobile user on a slow connection might experience your page. This should really only be used as a last resort when no other field data is available.
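For the origin-level option above, here is a rough sketch of querying the CrUX API's queryRecord endpoint directly; the API key environment variable and the example.com origin are placeholders:

using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class CruxOriginQuery
{
    static async Task Main()
    {
        // Placeholders: supply your own API key and origin.
        var apiKey = Environment.GetEnvironmentVariable("CRUX_API_KEY");
        var endpoint = $"https://chromeuserexperience.googleapis.com/v1/records:queryRecord?key={apiKey}";

        // Asking for origin-level data aggregates experiences across all pages on the origin.
        var body = "{\"origin\":\"https://example.com\",\"formFactor\":\"PHONE\"}";

        using var client = new HttpClient();
        var response = await client.PostAsync(endpoint,
            new StringContent(body, Encoding.UTF8, "application/json"));

        // The response contains per-metric histograms and percentiles,
        // or an error if even the origin lacks sufficient samples.
        Console.WriteLine((int)response.StatusCode);
        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
}

If even the origin does not meet the threshold, the API responds with an error rather than data, which is the same sufficiency behaviour described in the FAQ above.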

Azure CDN Purge and Custom Caching Rules

The Issue
I am currently building a PWA that is hosted on Azure and utilises Azure CDN Premium.
Within this PWA, we have the following files:
/service-worker.js
/js/translations/en-us.json
/js/translations/en-hk.json
etc...
When a release is deployed to the storage blob, we trigger a CDN 'purge' that is meant to tell the edge nodes to re-retrieve the assets from the origin storage account.
However, for some reason, the CDN is still returning old versions of these files, despite the storage account having the latest versions (I have left it for over 10 hours, so it is not a propagation issue).
Why is this happening? The whole point of a 'purge' is to empty the cache...
I appreciate that there may also be downstream caches beyond the edge nodes, but I never have these problems with AWS, so I can only conclude that either Azure is doing something badly or I am misunderstanding how it is meant to work.
Possible Solutions
I have come up with possible solutions to this; however, because I am fairly new to Azure, I want to get others' opinions on what the best solution is...
Use Query Strings and Set the relevant Cache mode
I am aware that I could just use query strings on these files (apart from service-worker.js); however, I do not feel confident this is the best solution.
Custom Rules Engine
Alternatively, I can define custom rules to instruct the CDN to skip the cache for certain files. This kind of defeats the purpose of a CDN, though, which brings me back to the question: why is Azure not purging these assets properly...
If this is the best solution, could someone please advise me on what rules I should define?
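Not an answer to why the purge misbehaves, but a common alternative to a rules-engine rule is to control caching from the origin side: give the handful of files that must stay fresh a short Cache-Control value, so the edge revalidates them instead of serving stale copies. A rough sketch with the Azure.Storage.Blobs SDK; the connection-string variable, the $web container name and the max-age value are assumptions for illustration:

using System;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

class SetShortCacheLifetime
{
    static async Task Main()
    {
        // Placeholders: use your own connection string and container name.
        var connectionString = Environment.GetEnvironmentVariable("STORAGE_CONNECTION_STRING");
        var container = new BlobContainerClient(connectionString, "$web");

        var freshnessCriticalFiles = new[]
        {
            "service-worker.js",
            "js/translations/en-us.json",
            "js/translations/en-hk.json"
        };

        foreach (var path in freshnessCriticalFiles)
        {
            var blob = container.GetBlobClient(path);
            var properties = await blob.GetPropertiesAsync();

            // SetHttpHeaders replaces all HTTP headers on the blob, so carry the
            // existing content type across while changing Cache-Control.
            await blob.SetHttpHeadersAsync(new BlobHttpHeaders
            {
                ContentType = properties.Value.ContentType,
                CacheControl = "public, max-age=60, must-revalidate"
            });
        }
    }
}

This keeps long-lived caching for everything else, so it does not defeat the purpose of the CDN in the way a blanket cache-skip rule would.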

Should I use a different class/method/pattern to handle local-cloud syncing of data?

I'm new to Azure Mobile Services as well as mobile development.
From my experience in web development, retrieving data from the database is done part by part as the user requests more data, i.e. the website doesn't load all the data in one go.
I'm implementing this principle in a mobile app, where data is loaded (if already in the local DB) or downloaded (if not yet in the local DB) as the user scrolls down.
I'm using an Azure Mobile Services Sync Table to handle the loading of data in the app. However, I won't be able to paginate the downloading of data. According to this post, the PullAsync method downloads all data that has changed or been added since its last sync and doesn't allow for using take/skip methods. This is because PullAsync uses incremental sync.
This would mean there will be a large download of data during the first ever launch of the app, or if the app hasn't been online for a while, even if the user hasn't requested that data (i.e. scrolled to it).
Is this a good way of handling data in mobile apps? I like using SyncTable because it handles quite a lot of important data upload/download work, e.g. data upload queuing and download/upload of data changes. I'm just concerned about downloading data that the user doesn't need yet.
Or maybe there's something I can do to limit the items PullAsync downloads (aside from deleted = false and UserId = the current user's UserId)?
Currently, I have limited the times PullAsync is called to the loading screen after the user logs in and to when the user pulls to refresh.
Mobile development is very different from web development. While loading lots of data to a stateless web page is a bad thing, loading the same data to a mobile app might actually be a good thing. It can help app performance and usability.
The main purpose of using something like the offline data storage is for occasionally disconnected scenarios. There are always architectural tradeoffs that have to be considered. "How much is too much" is one of those tradeoffs. How many roundtrips to the server is too much? How much data transfer is too much? Can you find the right balance of the data that you pass to the mobile device? Mobile applications that are "chatty" with the servers can become unusable when the carrier signal is lost.
In your question, you suggest "maybe there's something i can do to limit the items PullAsync downloads". In order to avoid the large download, it may make sense for you to design your application to allow the user to set criteria for download. If UserId doesn't make sense, maybe a Service Date or a number of days forward or back in the schedule. Finding the right "partition" of data to load to the device will be a key consideration for usability of your app...both online and offline.
There is no one right answer for your solution. However, key considerations should be bandwidth, data plan limits, carrier coverage and user experience both connected and disconnected. Remember...your mobile app is "stateful" and you aren't limited to round-trips to the server for data. This means you have a bit of latitude to do things you wouldn't on a web page.
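To make the "criteria for download" idea concrete, here is a rough sketch of passing PullAsync a filtered query together with a query ID for incremental sync. The Job type, its ServiceDate column and the 14-day window are made up for illustration; the Deleted/UserId filters are the ones already mentioned in the question, and the offline store is assumed to be initialised elsewhere:

using System;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.MobileServices;
using Microsoft.WindowsAzure.MobileServices.Sync;

// Hypothetical model matching the filters mentioned in the question,
// plus a "Service Date" style criterion from the answer.
public class Job
{
    public string Id { get; set; }
    public string UserId { get; set; }
    public DateTimeOffset ServiceDate { get; set; }
    public bool Deleted { get; set; }
}

public static class JobSync
{
    // Pulls only the current user's undeleted records from a recent window,
    // instead of everything that has ever changed.
    public static async Task PullRecentJobsAsync(MobileServiceClient client, string userId)
    {
        IMobileServiceSyncTable<Job> jobTable = client.GetSyncTable<Job>();

        var cutoff = DateTimeOffset.UtcNow.AddDays(-14); // illustrative window

        var query = jobTable.CreateQuery()
            .Where(j => !j.Deleted
                        && j.UserId == userId
                        && j.ServiceDate >= cutoff);

        // The query ID makes this an incremental sync: only records that changed
        // since the last pull with this same ID (and that match the filter) come down.
        await jobTable.PullAsync("recentJobs_" + userId, query);
    }
}

Different filters should use different query IDs, since the incremental-sync bookmark is tracked per query ID.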

Block traffic from referral spam bots in Azure Web App with DNN

I am sure many of you have found fake referral traffic in your google analytics reports/views. This makes it difficult for low to medium traffic sites to have accurate data for marketing. I am wondering what others are doing to exclude this traffic from their analytics reports.
If you go to your analytics account and open Acquisition -> All Traffic -> Referrals, you will see sites like floating-share-buttons.com. These are the sites I want to filter out, which you can do by setting up a custom filter for the view, as described at the bottom of this page. I have done this and it works.
I would rather block these bots from hitting the site altogether. Just a note: my sites are running as Web Apps in Azure.
I am not sure whether setting up the URL rewrite rules described here will work in Azure Web Apps, or whether this will mess with the existing URL rewrite functions of the Content Management System I am using (DotNetNuke DNN Platform 7).
I am really just looking to hear what others have done to block bots, rather than setting up filters in the analytics view's settings.
Thanks
PS: for those who are interested, this is the current filter list I am using:
webmonetizer\.net|trafficmonetizer\.org|success-seo\.com|event-tracking\.com|Get-Free-Traffic-Now\.com|buttons-for-website\.com|4webmasters\.org|floating-share-buttons\.com|free-social-buttons\.com|e-buyeasy\.com
With regard to this issue, there are a number of things that you can do. You are going the route that I see most commonly used, which is to block the information using the filters in Google Analytics.
You can go the route of an IIS filter as well, just like you have linked. DNN's Friendly URLs will not necessarily be impacted by this, as the rewrite rules are processed BEFORE DNN gets the request. There is a marginal performance impact from having two things process rewrites, but nothing to be concerned about until you reach incredibly high user volume.
This is also a great collection of options.
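If you would rather do the blocking in application code than in a rewrite rule, here is a rough sketch of an ASP.NET HttpModule that returns 403 for the referrers in your list. The module name is made up, and it needs to be registered under <system.webServer><modules> in web.config; as the next answer explains, this only helps against crawler spam, since ghost spam never actually hits your server:

using System;
using System.Linq;
using System.Web;

// Hypothetical module name; register it in web.config under
// <system.webServer><modules> so it runs in an Azure Web App / DNN site.
public class ReferralSpamBlockerModule : IHttpModule
{
    // Hosts taken from the filter list in the question.
    private static readonly string[] BlockedReferrerHosts =
    {
        "webmonetizer.net", "trafficmonetizer.org", "success-seo.com",
        "event-tracking.com", "get-free-traffic-now.com", "buttons-for-website.com",
        "4webmasters.org", "floating-share-buttons.com", "free-social-buttons.com",
        "e-buyeasy.com"
    };

    public void Init(HttpApplication application)
    {
        application.BeginRequest += (sender, args) =>
        {
            var context = ((HttpApplication)sender).Context;
            var referrer = context.Request.UrlReferrer;
            if (referrer == null)
                return;

            var host = referrer.Host.ToLowerInvariant();
            if (BlockedReferrerHosts.Any(blocked => host == blocked || host.EndsWith("." + blocked)))
            {
                // Reject the request before DNN processes it.
                context.Response.StatusCode = 403;
                ((HttpApplication)sender).CompleteRequest();
            }
        };
    }

    public void Dispose() { }
}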
First, you need to know that there are mainly two types of spam affecting GA right now: ghosts and crawlers.
The first (ghosts) never interact with your page, so any server-side solution such as HTTP rules or an htaccess file won't have any effect and will only fill your config files with rules that never get triggered.
The crawlers, as the name implies, do access your website and can be blocked this way, but there are only a few of them compared with the ghosts. To give you an idea, there are around 8 active crawlers while there are more than 100 ghosts, and the number increases each week.
This is because the ghost method is easier for the spammers to implement.
From your expression, only success-seo is a crawler. The rest should be filtered. Now there is a better way to get rid of all the ghosts with just one filter based on your valid hostnames, instead of creating or updating one every week.
You can find more information about ghost spam and the solution here:
https://stackoverflow.com/a/28354319/3197362
https://moz.com/ugc/stop-ghost-spam-in-google-analytics-with-one-filter
Hope it helps.

How to deal with Azure outages (the current one was a network drop between Websites and SQL Database)

We just suffered a SQL Database connectivity issue on Azure. Although it was very quick, around 1 minute, it kicked all users out and/or raised Elmah errors such as:
The wait operation timed out ...
at System.Data.ProviderBase.DbConnectionPool.TryGetConnection
Even glitches like this compromise confidence. I am trying to understand good approaches for dealing with these transitory outages. Some thoughts that come to mind include:
a) Have some code that checks that all required services are running before using them, and keep checking, with a friendly error message, until they are (see the retry sketch below). I think there is a tendency to assume all is available and working, and I wonder whether this is a dangerous assumption in the world of the cloud. I suppose this is more an approach one would take when building a distributed application, although one may not for a database, which is usually close to the web application.
b) Use failover procedures such as Traffic Manager. However, this is expensive, as one now has more than one instance, and one also needs to take care of syncing data across more than one DB, etc. Associated link on failover procedure in Azure.
c) Make sure custom error pages are used so the Yellow Screen of Death (YSOD) is not seen:
<customErrors mode="RemoteOnly" defaultRedirect="~/Error/Error" />
A YSOD was seen by a colleague, though; I am not sure how, with the above in force. One criticism I have of Azure is that if Websites are down, one can get bad error pages that are provided only by Azure and are not customisable, although I was advised that using something like CloudFlare can sort this issue.
I think a) is the most interesting concept. Should we code Azure Web Apps as if they are WAN rather than LAN applications, and assume nodes could be down, and so check beforehand?
I would really appreciate thoughts on the above. Our feeling is that Azure is getting a few too many of these outage blips now, which may be due to increased customer numbers... not sure, although it is no doubt within the 99.9% annual SLA.
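On thought a): for transient blips like the one above, the usual guidance is less "check the service is up first" and more "retry the failing operation with a short backoff". A minimal sketch, with illustrative retry counts and a deliberately coarse transient-error check; if you are on Entity Framework 6, its built-in SqlAzureExecutionStrategy provides similar retry behaviour:

using System;
using System.Data.SqlClient;
using System.Threading;

public static class ResilientSql
{
    // Runs a query with a bounded retry and exponential backoff, so a brief
    // connectivity blip is absorbed rather than surfacing an error page.
    public static T ExecuteWithRetry<T>(string connectionString, string sql,
        Func<SqlDataReader, T> readResult)
    {
        const int maxAttempts = 4; // illustrative, not a recommendation
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                using (var connection = new SqlConnection(connectionString))
                using (var command = new SqlCommand(sql, connection))
                {
                    connection.Open();
                    using (var reader = command.ExecuteReader())
                    {
                        return readResult(reader);
                    }
                }
            }
            catch (Exception ex) when (attempt < maxAttempts && IsLikelyTransient(ex))
            {
                // Back off before retrying; a one-minute outage will still fail,
                // but short blips are ridden out without kicking users off.
                Thread.Sleep(TimeSpan.FromSeconds(Math.Pow(2, attempt)));
            }
        }
    }

    // A production version should check SqlException error numbers against the
    // documented transient-error list; this coarse check is only for illustration.
    private static bool IsLikelyTransient(Exception ex) =>
        ex is SqlException || ex is TimeoutException || ex is InvalidOperationException;
}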
EDIT1
A useful MSDN Azure Cloud Architecture article on this:
Resilient Azure Website Architectures
