PageSpeed Insights: number of distinct samples needed to show data for a URL

I'm reading the PageSpeed Insights documentation and am wondering if anyone knows how Google determines what counts as a sufficient number of distinct samples, per this FAQ:
Why is the real-world Chrome User Experience Report speed data not available for a URL?
Chrome User Experience Report aggregates real-world speed data from opted-in users and requires that a URL must be public (crawlable and indexable) and have sufficient number of distinct samples that provide a representative, anonymized view of performance of the URL.
I'm building a report centered around Core Web Vitals data and am realizing that some URLs have few data points with CWV timings, so I'm curious exactly how Google handles these situations. I've been searching through docs and articles but haven't found anything with a specific reference.

The exact threshold is kept secret, which is why you won't find it documented anywhere. However, as a site owner there are a few things you can do to work around a URL not having sufficient data:
Use the Core Web Vitals report in Search Console, which groups similar pages together, making them more likely to collectively exceed that threshold.
Look at origin-level aggregations in PSI or the CrUX API. These include user experiences from all pages on the origin, so the data is much less granular, but it gives you a sense of typical experiences overall (see the CrUX API sketch after this list).
Instrument your site with your own first-party Core Web Vitals monitoring. web-vitals.js can be integrated with your existing analytics provider to track vitals on all of your pages. If you're integrating with Google Analytics, you can link your data with the Web Vitals Report to see how your site is doing.
Use your site with the Web Vitals extension enabled to see the Core Web Vitals data for your own experience. Your local experiences may not be representative of most users, but this can be a great tool for validating expectations vs reality.
Use lab data as a proxy. For example, lab data from Lighthouse in PSI can tell you how a mobile user on a slow connection might experience your page. This should really only be used as a last resort when no other field data is available.
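To make the origin-level fallback concrete, here is a minimal sketch (TypeScript) against the public CrUX API's records:queryRecord endpoint. The API key, example URL, and the PHONE form factor are placeholders; the API responds with HTTP 404 when a URL doesn't have enough distinct samples, which is what the fallback keys off:

    // Minimal sketch: query the CrUX API for page-level data and fall back to
    // origin-level data when the URL has too few distinct samples (HTTP 404).
    // CRUX_API_KEY and the form factor are placeholders.
    const CRUX_API_KEY = 'YOUR_API_KEY';
    const CRUX_ENDPOINT =
      'https://chromeuserexperiencereport.googleapis.com/v1/records:queryRecord';

    async function queryCrux(body: { url?: string; origin?: string }) {
      const res = await fetch(`${CRUX_ENDPOINT}?key=${CRUX_API_KEY}`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ ...body, formFactor: 'PHONE' }),
      });
      if (res.status === 404) return null; // not enough distinct samples
      if (!res.ok) throw new Error(`CrUX API error: ${res.status}`);
      return res.json();
    }

    async function p75Lcp(pageUrl: string): Promise<number | undefined> {
      // Try page-level data first, then fall back to the whole origin.
      const record =
        (await queryCrux({ url: pageUrl })) ??
        (await queryCrux({ origin: new URL(pageUrl).origin }));
      return record?.record?.metrics?.largest_contentful_paint?.percentiles?.p75;
    }

This is just one way to structure the fallback; the same pattern works for the other metrics in the response.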

Related

SharePoint - usage data/reporting

I do reporting/analytics for site usage and engagement for a SharePoint Online site at my company. I currently pull the usage logs manually from site audit reports, and the process is very time-consuming and not always accurate. Does anyone know a better way to get these logs? Also, has anyone had success implementing a third-party platform to capture site visits, like Google Analytics? We have tried to implement Matomo, but without much success.
#B1landry,
You may want to try Azure Application Insights, which provides similar functionality to Google Analytics with the advantage of keeping your data in the same ecosystem.
Check the docs below to get started:
https://sharepoint.handsontek.net/2019/02/19/how-to-add-application-insights-to-sharepoint-without-modifying-the-master-page/
https://learn.microsoft.com/en-us/azure/azure-monitor/app/sharepoint
https://learn.microsoft.com/en-us/answers/questions/246834/how-can-i-setup-a-sharepoint-online-site-usage-mon-1.html
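For reference, here is a minimal sketch of wiring up the Application Insights JavaScript SDK, assuming you can inject a script into your SharePoint pages (for example via an SPFx application customizer, as the articles above describe). The connection string below is a placeholder:

    // Minimal sketch using @microsoft/applicationinsights-web.
    // The connection string is a placeholder; use your own resource's value.
    import { ApplicationInsights } from '@microsoft/applicationinsights-web';

    const appInsights = new ApplicationInsights({
      config: {
        connectionString:
          'InstrumentationKey=00000000-0000-0000-0000-000000000000',
        enableAutoRouteTracking: true, // count SPA-style navigations as page views
      },
    });

    appInsights.loadAppInsights();
    appInsights.trackPageView(); // page views then show up in the Azure portal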
BR

Azure scaling with Orchard CMS

I was reading through these questions:
Scaling Orchard with Azure Web Sites
Orchard CMS Performance
How to deploy Orchard CMS in Windows Azure?
I started to think about an e-commerce project I am undertaking and would like to clarify a few things if possible.
Please forgive me; I am finding it very difficult to articulate this question in a way that clearly communicates what I am thinking.
Firstly, what factors would prompt me to start thinking about scaling to handle the traffic of my web site, and when would those factors kick in? The types of factors I am aware of include:
Session handling
Caching
The amount of data being served in a request, though I am not sure of the full implications of request size
Secondly, with all things there should be a certain level of up-front planning when trying to set up a web site that can handle traffic at certain levels. Would the Azure scaling need to be done up front, or is it a simple matter to make it work now for what is needed and then scale up at a later date when necessary?
Let me give a real-life scenario to illustrate where my fear comes from:
A radio broadcast was put out for a certain web site trying to sell their wares. The web site was not planned very well. The web site started to receive visits from people listening to the radio show: so many visitors that the web site was not able to handle the traffic, and an error message was displayed telling the world that they should 'talk to the administrator', or words to that effect. You know the picture, I am sure, and I am also very certain it would be embarrassing for any web developer to be told that this was happening to a web site they had designed.
I would really like to be able to distil a proper question out of this, but there are many things that I am just not aware of. To try to make this question less vague, I will summarise what I would like to achieve:
I want to have a web site that is able to handle a lot of traffic following successful advertising/marketing campaigns. I want to walk the tightrope of budget versus functionality, which is why I would like to be able to do the least amount possible to start with and be able to easily up-scale as demand dictates.
Bearing this in mind, what approach/considerations should I take to avoid nasty pitfalls with performance/availability/reliability when using an Orchard CMS/Azure combination to deliver my project?
Orchard on Azure Web Sites is working great for us, see http://nublr.pt
A few things to bear in mind with the site configuration are:
follow the guidelines in http://docs.orchardproject.net/Documentation/Optimizing-Performance-of-Orchard-with-Shared-Hosting
set up caching (module Contrib.Cache available in the gallery) which will use IIS's application cache.
set up the Warmup feature to keep the site alive,
also ensure that dynamic compilation is off by using the Config/HostComponents.config
We are currently in the "shared" mode of Azure Web Sites. We don't have much traffic yet, but our load testing with https://loadimpact.com has not taken the site down once. At any time we can move to the "reserved" mode (it does take up to 24h to happen).
Version 1.6 will bring a lot of improvements to Orchard, so try to get started with your development on it.
Hope this has helped.

Use Google Analytics for data to display on our webpage?

On some of our pages, we display some statistics like number of times that page has been viewed today, number of times it's been viewed the past week, etc. Additionally, we have an overall statistics page where we list the pages, in order, that have been viewed the most.
Today, we just insert these pageviews and event counts into our database as they happen. We also send them to Google Analytics via normal page tracking and their API. Ideally, instead of querying our database for these stats to display on our webpages, we just query Google Analytics' API. Google Analytics does a FAR better job figuring out who the real uniques are and avoids counting people who artificially inflate their pageview counts (we allow people to create pages on our site).
So the question is if it's possible to use Google Analytics' API for updating the statistics on our webpages? If I cache the results is it more feasible? Or just occasionally update our stats? I absolutely love Google Analytics for our site metrics, but maybe there's a better solution for this particular need?
So the question is if it's possible to use Google Analytics' API for updating the statistics on our webpages?
Yes, it is. But the authentication process and the XML returned may slow things down. You can speed it up by limiting the rows/columns returned. Also, authentication for the way you want to display the data (if I understood you correctly) would require you to use the client authentication method: you send the username and password, so security is an issue.
I have done exactly what you described but had to put a loading graphic on the page for the stats.
If I cache the results is it more feasible? Or just occasionally update our stats?
Either one, but caching seems like it would work well, especially since GA data is not real-time data anyway. You could make the API call and store (or process and then store) the returned XML for display later.
I haven't done this, but I think I might give it a go. It could even run as a scheduled job.
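To make the cache-and-refresh idea concrete, here is a rough sketch (TypeScript). Note that it targets the current GA4 Data API rather than the XML Data Export API discussed above, which has since been retired; the property ID, access token, and 15-minute TTL are placeholders/assumptions:

    // Cache-and-refresh sketch against the GA4 Data API (properties.runReport).
    // GA_PROPERTY_ID and the OAuth access token are placeholders.
    const GA_PROPERTY_ID = '123456789';

    type Row = { pagePath: string; views: number };

    let cached: { rows: Row[]; fetchedAt: number } | null = null;
    const TTL_MS = 15 * 60 * 1000; // GA data isn't real time, so 15 minutes is plenty

    async function topPages(accessToken: string): Promise<Row[]> {
      if (cached && Date.now() - cached.fetchedAt < TTL_MS) return cached.rows;

      const res = await fetch(
        `https://analyticsdata.googleapis.com/v1beta/properties/${GA_PROPERTY_ID}:runReport`,
        {
          method: 'POST',
          headers: {
            Authorization: `Bearer ${accessToken}`,
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            dateRanges: [{ startDate: '7daysAgo', endDate: 'today' }],
            dimensions: [{ name: 'pagePath' }],
            metrics: [{ name: 'screenPageViews' }],
          }),
        }
      );
      const report = await res.json();
      const rows: Row[] = (report.rows ?? [])
        .map((r: any) => ({
          pagePath: r.dimensionValues[0].value,
          views: Number(r.metricValues[0].value),
        }))
        .sort((a: Row, b: Row) => b.views - a.views); // most-viewed pages first

      cached = { rows, fetchedAt: Date.now() };
      return rows;
    }

The same function could run on a schedule and write the rows somewhere durable instead of keeping them in memory.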
I absolutely love Google Analytics for our site metrics, but maybe there's a better solution for this particular need?
There are some third-party solutions (googling should root them out) but money and feasibility should be considered.

Difference between Ad company statistics, Google Analytics and Awstats on adult sites

I have this problem: I have a web page with adult content, and for the past several months I have had PPC advertisements on it. I've noticed a big difference between the ad company's statistics for my page, Google Analytics data, and Awstats data on my server.
For example, the ad company tells me that I have 10K pageviews per day, Google Analytics tells me that I have 15K pageviews, and Awstats shows around 13K pageviews. Which system should I trust? Should I write my own (and reinvent the wheel again)? If so, how? :)
The strange thing is that I have another web page with "normal" content (an MMORPG fan site), and those numbers are roughly equal in all three systems (ad company, GA, Awstats). Do you think the discrepancy is because the first page is adult-oriented?
And a final question that is totally off-topic: do you know of an ad company that pays per impression and doesn't mind adult sites?
Thanks for the answers!
First, you should make sure not to mix up »hits«, »files«, »visits« and »unique visits«. They all have different meanings and are sometimes named differently. I recommend looking up some definitions if you are confused about the terms.
Awstats probably has the most accurate statistics, because it has access to the access.log from the web server. Unfortunately, a cached page (perhaps cached by the browser, an ISP proxy, or your own caching server) might not produce a hit on the web server. Especially if your site is served with good caching hints that don't force revalidation, and you are running your own web cache (e.g. Squid) in front of your site, the number will be considerably lower, because Awstats only measures the work of the web server.
On the other hand, Google Analytics can only count requests from users who haven't blocked Google Analytics and have JavaScript enabled (but it will count pages served by a web cache). So this count can be influenced by the user, but isn't affected by web caches.
The ad company is probably simply counting the number of requests it gets from your site (probably based on its own access.log). So, to get counted there, the ad must not be cached and must not be blocked by the user.
So, as you can see, it's not that easy to get a single correct value. But as long as you compare the measured values with those from previous months, you should at least get a (nearly) correct rate of growth.
And your porn site probably serves a large amount of static content (e.g. images from disk), and most web servers are really good at serving caching hints automatically for static files. Your MMORPG site, on the other hand, might consist mostly of dynamic scripts (PHP?) which don't send any caching hints at all, and web servers aren't able to determine caching headers for dynamic content automatically. That's at least my explanation, without knowing your application and server configuration :)
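As an illustration of that last point, here is a small sketch (Node/TypeScript) of choosing cache headers explicitly for static versus dynamic responses, so what ends up cached (and therefore missing from the access log) is a deliberate choice rather than an accident. The paths and max-age value are made up:

    // Illustrative only: explicit Cache-Control headers for static vs. dynamic content.
    import { createServer } from 'node:http';

    createServer((req, res) => {
      if (req.url?.startsWith('/images/')) {
        // Static assets: long-lived caching, so repeat views may never reach the
        // server and therefore never show up in the access.log / Awstats.
        res.setHeader('Cache-Control', 'public, max-age=86400');
        res.end('...image bytes...');
      } else {
        // Dynamic pages: force revalidation so every view hits the server.
        res.setHeader('Cache-Control', 'no-cache');
        res.end('<html>...dynamic page...</html>');
      }
    }).listen(8080);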

How long to retain an archive of web server traffic logs?

We've currently got four web servers in a farm generating about 100 MB of IIS web logs per day. These can be compressed pretty efficiently down to somewhere around 5% of their size.
We are planning to use waRmZip to move them off the servers and onto a SAN. After a week or so we can be confident we don't have any technical issues to investigate, so the only other use would be trend analysis as a complement to Google Analytics.
What retention periods do people recommend? Are there any legal requirements to keep this data?
Legal requirements will depend on your country, how much you're logging, and quite possibly the nature of your business. Talk to your company's lawyers - legal advice on SO is likely to be worth what you pay for it.
If you're only storing 5MB per day, you should be able to store them for basically as long as you want without worrying on the technical front.
Please consider the sensitivity of your web log data as well. I have no idea whether access to your web apps would be considered sensitive if made public, but you need to realize that your web logs contain the information needed to potentially identify individuals (especially in conjunction with other information available elsewhere). Your privacy policies should reflect how long you retain these logs and the purposes to which they will be put. Google, I think, recently decided to anonymize its logs after 9 months to help protect user privacy. Granted, their situation is a little different since they collect so much information, but you need to consider your customers' needs as well as your own when determining how long, and in what form, to keep your logs.
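If anonymization ends up being part of your retention policy, one common interpretation is masking the client IP before archiving. A hypothetical sketch (TypeScript) follows; the log layout and the choice to zero the last IPv4 octet are assumptions, not an established IIS feature:

    // Hypothetical: zero the last octet of IPv4 addresses in a log line before
    // archiving. Adjust to match your actual IIS log field layout.
    function anonymizeLine(line: string): string {
      return line.replace(
        /\b(\d{1,3}\.\d{1,3}\.\d{1,3})\.\d{1,3}\b/g,
        (_match, prefix: string) => `${prefix}.0`
      );
    }

    // "2024-01-01 12:00:00 GET /page 203.0.113.42 200"
    //   -> "2024-01-01 12:00:00 GET /page 203.0.113.0 200"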
I tend to keep mine forever. That's mainly for trend analysis because Google misses some visitors (non-JavaScript ones).

Resources