We implemented CLS optimizations 20 days ago, and since then the lab-data values have been essentially perfect.
CLS in the field data is a different story. It is improving, but very slowly. If it is true that it is calculated over a 28-day period, then we would expect to be seeing significantly better values by now.
We started with a CLS of 1.06 and are now at 0.68. Lab data on my computer shows a CLS of 0.001.
Is there any way to validate the field data calculation?
Or is there some other reason I am not seeing? Thanks.
First, after 20 days a CLS drop from 1.06 to 0.68 is good; you should level out at about 0.5, which is a big improvement.
Unfortunately, the reason you still have CLS issues in the field is that there is still a problem somewhere.
You see, synthetic lab tests only measure CLS during the initial page load, and only at two specific screen sizes.
Field data, on the other hand, measures CLS right up until page unload, at every screen size your visitors actually use.
So your problem is either further down the page, or caused by layout shifts at a screen size other than the ones tested.
As you have "maxed out" the synthetic tests, the advice in this answer I gave may help you identify CLS issues: it covers two ways to test using developer tools, and how to track real-world data (the best way, in my opinion) to help narrow down the cause.
According to Core Web Vitals, there are only three core vitals for measuring the user experience of any website: LCP (Largest Contentful Paint), FID (First Input Delay) and CLS (Cumulative Layout Shift). According to PageSpeed Insights and the CrUX dashboard, the FID of my website is within good limits, i.e. 90% of users have an input delay of less than 100 ms.
Will there be any benefit to the user experience of people landing on my website if I do chunk optimisations (splitting, lazy loading)?
I understand that it will affect TBT (Total Blocking Time) and TTI (Time to Interactive), but that shouldn't matter if my FID is already very low. Is my understanding correct?
I work on several large sites and we measure FID and TBT across thousands of pages. My work on this shows there is little correlation between TBT and FID. I have lots of pages reporting a TBT of 2 s or more that are nevertheless in the 90% score for FID. So I would NOT spend money or time optimizing TBT; what I would do instead is optimize for something that you can correlate to a business metric. For instance, add some user timings to measure how fast a CTA button appears and when it becomes interactive. That is a useful metric.
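As a rough illustration of what I mean by user timings (the element id, mark names and analytics endpoint below are placeholders, not anything prescribed by a library):

```typescript
// Sketch: mark when a CTA appears and when it becomes usable, then beacon the timings.
const cta = document.getElementById('signup-cta'); // placeholder id

if (cta) {
  // A proxy for "the CTA has appeared".
  performance.mark('cta-visible');

  cta.addEventListener('click', () => {
    /* real click handler */
  });

  // The handler is wired up, i.e. the button is actually usable.
  performance.mark('cta-interactive');
  performance.measure('cta-visible-to-interactive', 'cta-visible', 'cta-interactive');
}

// Beacon the timings on page hide so they can be joined with conversion data.
addEventListener('visibilitychange', () => {
  if (document.visibilityState !== 'hidden') return;
  const timings = ['cta-visible', 'cta-visible-to-interactive']
    .flatMap((name) => performance.getEntriesByName(name))
    .map((e) => ({ name: e.name, start: e.startTime, duration: e.duration }));
  navigator.sendBeacon('/analytics/user-timings', JSON.stringify(timings));
});
```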
Being in the green on the core web vitals report (for one or all metrics) is great, but it doesn't mean that you should not try to improve performance further. In fact, if all your competitors have better FID / CLS / LCP / etc. than you, you will be at a disadvantage. Generally speaking, I think the web vitals report can be used as a guide to continuously prioritise changes to try and improve performance.
While it's impossible to predict the improvements without looking at a current report and the codebase, it seems fair to expect code-splitting to improve FID and LCP, and lazy-loading to help with LCP. Both improvements would benefit users.
Note that TBT and FID are pretty similar.
I have a 3-year historical transaction dataset covering Jan 2016 - Dec 2018. I want to build a classification model of good and bad customers. Before doing that, I need to define an observation window and a performance window for my data in order to determine my features and my target (good or bad).
If the definition of a bad customer is somebody who did not make a transaction in 6 months, what is the best way to set up the observation window and performance window?
My idea is to make the performance window Jul 2018 - Dec 2018 (6 months). That way I can build all the features (predictors) from the range Jan 2016 - Jun 2018, and then mark a customer as bad (the target) if they do not make any transaction within the performance window.
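To make that concrete, here is a rough sketch of the labelling I have in mind (the Transaction shape and helper are just illustrative; only the dates and the "bad = no transaction in the performance window" rule come from the description above):

```typescript
// Split transactions into an observation window (features) and a
// performance window (target), then label customers.
interface Transaction {
  customerId: string;
  date: Date;
}

const OBS_START = new Date('2016-01-01');   // observation window: build features
const OBS_END = new Date('2018-06-30');
const PERF_START = new Date('2018-07-01');  // performance window: define target
const PERF_END = new Date('2018-12-31');

// 1 = bad (no transaction during the performance window), 0 = good.
function labelCustomers(transactions: Transaction[]): Map<string, 0 | 1> {
  const observed = transactions.filter((t) => t.date >= OBS_START && t.date <= OBS_END);
  const activeInPerf = new Set(
    transactions
      .filter((t) => t.date >= PERF_START && t.date <= PERF_END)
      .map((t) => t.customerId)
  );

  const labels = new Map<string, 0 | 1>();
  for (const t of observed) {
    labels.set(t.customerId, activeInPerf.has(t.customerId) ? 0 : 1);
  }
  return labels;
}
```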
Do you think this is the right method, or do I have a misunderstanding about the observation and performance windows? Thank you for your help.
A client is asking how long it takes to audit the security of his Drupal module, which is 29k lines long. Does anyone know at least what ballpark I should give him? His main concerns are file encryption and user permissions.
Nope, not a damn clue :-)
However, whatever value you choose, may I suggest one thing?
Monitor your progress! Tell your client that your initial estimate is (for example) twenty-nine working days but that it depends on a great many factors outside your control.
Tell them you plan to mitigate the risk of a budget overrun by providing a daily snapshot of progress:
current number of lines audited in total [a].
days spent [b].
current "run rate" (number of lines per day, average) [c = a/b].
number of lines yet to be audited [d = 29,000 - a].
estimated days to completion [e = d / c].
Allow them to pull the plug at any time if the run rate is well below what you estimated.
This basic project management/reporting should give them the confidence that you know what you're doing, and will minimise their exposure considerably, to the point where they'll feel a lot more comfortable about taking you on.
Just on that last bullet point above, you may want to consider giving them a range (say +/-5% of the estimate), but don't get too clever about working out best and worst case based on your best and worst days to date. The power of averaging is that it gives you a "best" guess without having to fiddle too much with figures.
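If it helps, that whole snapshot boils down to a few lines of arithmetic (the 29,000 total and the +/-5% band come from this answer; the sample figures in the usage line are invented purely for illustration):

```typescript
// Daily snapshot arithmetic from the bullets above.
function snapshot(linesAudited: number, daysSpent: number, totalLines = 29_000) {
  const runRate = linesAudited / daysSpent;       // c = a / b  (lines per day)
  const remaining = totalLines - linesAudited;    // d = total - a
  const daysToCompletion = remaining / runRate;   // e = d / c
  return {
    runRate,
    remaining,
    daysToCompletion,
    range: [daysToCompletion * 0.95, daysToCompletion * 1.05], // +/-5% band
  };
}

// e.g. 4,500 lines reviewed after 5 days:
console.log(snapshot(4_500, 5));
// run rate 900 lines/day, 24,500 lines remaining, ~27 days to completion
```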
Typical estimates I've seen are that you can expect a developer to review 100-150 lines of code per hour. This is a very rough estimate, and it will vary greatly depending upon the nature of the code and the thoroughness of the review. Also, if you can review code for 8 hours a day, 5 days a week, straight, you're inhuman and amazing; the rest of us need a change of activity to clear the brain.
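As a back-of-the-envelope conversion of that rate into calendar time (the 5 productive review hours per day is my assumption, per the caveat about not reviewing 8 hours straight):

```typescript
// Ballpark only: turn "100-150 lines per hour" into working days for 29k lines.
const totalLines = 29_000;
const hoursAtFastPace = totalLines / 150;  // ~193 hours
const hoursAtSlowPace = totalLines / 100;  // ~290 hours
const reviewHoursPerDay = 5;               // assumed sustainable pace

console.log(hoursAtFastPace / reviewHoursPerDay); // ~39 working days
console.log(hoursAtSlowPace / reviewHoursPerDay); // ~58 working days
```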
I've previously asked how long it takes for a winning combination to appear on Google's Web Optimizer, but now I have another weird problem during an A/B test:
For the past two days, Google has been announcing that there was a "High Confidence Winner" with a 98.5% chance of beating the original variation by 27.4%. Great!
I decided to leave it running to make absolutely sure, but something weird happened: today Google is saying that they "haven't collected enough data yet to show any significant results". Sure, the figures have changed slightly, but they're still very high: a 96.6% chance of beating the original by 22%.
So, why is Google not so sure now?
How could it have gone from having a statistically significant "High Confidence" winner, to not having enough data to calculate one? Are my numbers too tiny for Google to be absolutely sure or something?
Thanks for any insights!
How could it have gone from having a statistically significant "High Confidence" winner, to not having enough data to calculate one?
With all statistical tests there is what's called a p-value, which is the probability of obtaining the observed result by random chance, assuming that there is no difference between the things being tested. So when you run a test, you want a small p-value so that you can be confident in your results.
So GWO must use a p-value threshold somewhere between 1.5% and 3.4% (I'm guessing it's 2.5%, at least in this case; it may depend on the number of combinations).
So when (100% - chance to beat %) > p-value threshold, GWO will say that it has not collected enough information, and when a combination has (100% - chance to beat %) < p-value threshold, a winner is declared. Obviously, if that line has only just been crossed, the result could easily cross back with a little more data.
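In other words, the flip you saw is just the "chance to beat" hovering around that threshold. A rough sketch of the rule (the 2.5% is only my guess above, not a documented GWO value):

```typescript
// The decision rule described above, applied to the two reports in the question.
function gwoVerdict(chanceToBeat: number, threshold = 0.025): string {
  // A winner is only declared while (100% - chance to beat) stays under the
  // threshold, which is why a marginal result can flip back.
  return 1 - chanceToBeat < threshold
    ? 'High Confidence Winner'
    : 'Not enough data for significant results';
}

console.log(gwoVerdict(0.985)); // the earlier report (98.5%): winner
console.log(gwoVerdict(0.966)); // the later report (96.6%): not enough data
```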
To summarize: you shouldn't be checking the results frequently. You should set up a test, ignore it for a good long while, and then check the results.
Are my numbers too tiny for Google to be absolutely sure or something?
No
I've had an A/B Test running in Google Web Optimizer for six weeks now, and there's still no end in sight. Google is still saying: "We have not gathered enough data yet to show any significant results. When we collect more data we should be able to show you a winning combination."
Is there any way of telling how close Google is to making up its mind? (Does anyone know what algorithm it uses to decide whether there's been a "high confidence winner"?)
According to the Google help documentation:
Sometimes we simply need more data to be able to reach a level of high confidence. A tested combination typically needs around 200 conversions for us to judge its performance with certainty.
But all of our combinations have over 200 conversions at the moment:
230 / 4061 (Original)
223 / 3937 (Variation 1)
205 / 3984 (Variation 2)
205 / 4007 (Variation 3)
How much longer is it going to have to run??
Thanks for any help.
Is there any way of telling how close Google is to making up its mind?
You can use the GWO calculator to help determine how long a test will take, based on a number of assumptions that you provide. Keep in mind, though, that it is possible there is no significant difference between your test combinations, in which case a test to see which is best would take an infinite amount of time, because it is not possible to find a winner.
(Does anyone know what algorithm does it use to decide if there's been any "high confidence winners"?)
That is a mystery, but with most, if not all, statistical tests there is what's called a p-value, which is the probability of obtaining a result as extreme as the one observed by chance alone. GWO tests run until the p-value passes some threshold, probably 5%. To be clearer, GWO tests run until a combination is significantly better than the original combination, such that the result has only a 5% chance of occurring by chance alone.
For your test there appears to be no significant winner; it's a tie.
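For what it's worth, you can sanity-check that against the numbers you posted with a standard two-proportion z-test (this is not necessarily what GWO does internally):

```typescript
// Rough significance check of each variation against the original.
function zScore(convA: number, visA: number, convB: number, visB: number): number {
  const pA = convA / visA;
  const pB = convB / visB;
  const pooled = (convA + convB) / (visA + visB);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / visA + 1 / visB));
  return (pA - pB) / se;
}

const original: [number, number] = [230, 4061];
const variations: Array<[number, number]> = [
  [223, 3937], // Variation 1
  [205, 3984], // Variation 2
  [205, 4007], // Variation 3
];

for (const [conv, vis] of variations) {
  const z = zScore(original[0], original[1], conv, vis);
  // |z| needs to exceed roughly 1.96 for 95% confidence; all of these fall
  // well short of that, which is consistent with "it's a tie".
  console.log(`${conv}/${vis}: z = ${z.toFixed(2)}`);
}
```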