Resolving CDN response issues - cross-domain

When developing or visiting websites I'm seeing issues when pulling files down from various CDNs. E.g. I just signed up to Shopify, and on first load of the shop in Firefox it wouldn't pull down the files from http://cdn.shopify.com. Likewise, visiting stackoverflow.com and other sites I have had the same issue, with no response when downloading https://ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js.
I have experienced both of these issues today using the latest Firefox 14 and Chrome 21. Sometimes just doing a refresh brings the file through fine; then I refresh again and sometimes there is no response. It is always with CDN files (on a separate domain).
My questions are:
Is this an issue well known to other web developers?
What causes CDN file requests to fail (outside of the server just being down or under load)? Is it a cross-domain issue?
Are there any precautions we can take to prevent this, aside from hosting the files ourselves?
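One precaution that tends to come up (a minimal sketch, not specific to Shopify or Stack Overflow) is to detect whether the CDN copy actually loaded and fall back to a self-hosted copy when it didn't. The /js/jquery.min.js path below is a hypothetical location for your own copy:

    // Placed in an inline <script> immediately after the CDN <script> tag for jQuery.
    // If the CDN request failed, window.jQuery is undefined, so write out a tag
    // pointing at a self-hosted copy (the /js/jquery.min.js path is a placeholder).
    if (!window.jQuery) {
      document.write('<script src="/js/jquery.min.js"><\/script>');
    }

The same pattern works for any library that exposes a global you can test for; it doesn't stop the CDN request from failing, but it keeps the page functional when one does.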

Related

Google Site link reporting We're Sorry

We have a curriculum site hosted in the new Google Sites and shared publicly. Anyone who visits the site gets the Google "We're Sorry" page and can't access the website without refreshing the page multiple times. It seems that after you finally get each page to show, future visits are fine. But as they begin to roll this site out to teachers, they need the link to work. This happens both via direct link access and when clicking the link in an email. So far in testing it happens in both Chrome and Firefox.
I've never seen this happen with Google Sites. There is nothing specific on the page that is unsafe and no insecure embeds (just images and links to Google Drive docs).
I used https://transparencyreport.google.com/safe-browsing/search to test and it comes back safe.
Per request I am going to include screenshots from the Network tab. However, I can no longer replicate this issue on my network or machines, but many teachers are still reporting it, so I am trying to get screenshots from them. In this first one, logimpressions is blocked for them but the site loaded - this is most likely caused by having uBlock enabled.

Why does my GitHub Pages URL return the wrong page in Chrome?

I've been playing around with GitHub pages for a while, and have been doing most of my development in Firefox. Everything was working amazingly, until I attempted to test my project page in Google Chrome. To my surprise, when visiting the same GitHub project page in Firefox and Chrome, Firefox was served the correct index.html page while Chrome was served a completely different (and incorrect) one.
I've poked around for a few hours now and honestly have no idea what's going on. Both Firefox and Chrome are requesting the exact same URL with an HTTP GET request and receive different responses from the server. I've tried changing the user agent and messing with the request headers in both browsers, and it didn't seem to affect anything.
Does anybody have a clue what's going on? If it helps, the project page in question is "https://wgxli.github.io/complex-function-plotter/". Any help is much appreciated.
Edit: It appears to be related to a browser cache issue. The behavior disappears if I clear all data from the browser and visit the above page. However, if I clear the browsing data, visit the root directory of the above page, and then request the above page, the problem reproduces itself. At this point, I think I've reduced it to a question of why the browser (or CDN) is returning a cache hit when it shouldn't.
I ended up fixing the issue. I was using create-react-app, which automatically registers a service worker for local caching. I just disabled the service worker, which resolved the problem.
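In case anyone else hits the same symptom, a rough way to clear out a stale service worker by hand (using the standard navigator.serviceWorker API; the exact create-react-app opt-out call differs between versions, so this is a generic sketch) is to unregister everything currently registered and then hard-refresh:

    // Run once in the browser console (or temporarily in the app's entry point):
    // unregister every service worker controlling this origin so stale precached
    // responses stop being served, then do a hard refresh.
    if ('serviceWorker' in navigator) {
      navigator.serviceWorker.getRegistrations().then((registrations) => {
        registrations.forEach((registration) => registration.unregister());
      });
    }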

Multilingual Umbraco Website cannot be scraped?

I have created a multilingual Umbraco website which has 3 domain names pointing to it, one for each language. The site has gone live and people are starting to share links to it on LinkedIn and other social media. I have metadata in the website which should be picked up when these links are shared. On LinkedIn, when the link is shared it has 'coming soon' as the strap-line, which is what was on the holding page months ago, suggesting the site isn't being re-scraped.
I used the Facebook link debugging tool and that was returning a run-time error with a 500 response code.
My co-worker insists that there is nothing wrong with the DNS and there aren't any errors in the code of the website so I am wondering if anyone has any ideas why the website cannot be scraped?
It also has another issue where one of the domains sometimes doesn't redirect to its www. version despite having a redirect in the DNS, which may be related.
Is there some specific Umbraco configuration that I may have missed? Or a bug within Umbraco that may cause this?
Aside from this issue the website is working fine, it is just these scrapers seem to be unable to hit the website successfully.
Do you have metadata set for the encoding? See https://www.w3.org/International/questions/qa-html-language-declarations. Probably a long shot.
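One way to narrow down what the scrapers are seeing is to request the page roughly the way they do and check whether the 500 comes back. A quick sketch, assuming Node 18+ for its built-in fetch; the User-Agent strings are illustrative crawler identifiers, not exact copies of the bots' full headers:

    // Fetch the shared URL with crawler-like User-Agent strings and log the results.
    const url = 'https://example.com/'; // placeholder: the page being shared on LinkedIn
    for (const ua of ['facebookexternalhit/1.1', 'LinkedInBot/1.0']) {
      fetch(url, { headers: { 'User-Agent': ua } })
        .then((res) => console.log(ua, res.status, res.headers.get('content-type')))
        .catch((err) => console.error(ua, err));
    }

If these requests get a 500 while a normal browser request gets a 200, the problem is likely in how the site handles those requests rather than in the scrapers themselves.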

VueJS - Disabling cache results in network request errors

EDIT: This issue seems to only affect Chrome and Safari on my MacBook Pro. I can't replicate it on other computers and browsers. I thought it might have been malware or a virus, so I reformatted my MacBook. That didn't fix the issue. All of a sudden, I am running into this issue when developing on my local server with MAMP as well. Assets are missing everywhere and some pages fail to load altogether.
I've noticed recently when I refresh my Vue SPA with the cache disabled, the page tends to look messed up with missing images/resources.
When I check the console, I see a lot of ERR_CONNECTION_REFUSED for resources that are definitely there. If I refresh the page, the errors go away. It tends to happen after I clear cache and load up the webpage for the first time, or if I disable cache in the developer console.
It turns out there had recently been a DDoS attack against my IP address, so my hosting service forced rate limiting on connections to my IP address. So if you ever run into the same issues, check with your hosting company first.

Google links open wrong pages

Our website was recently hacked (Joomla 1.5, hosted on a VPS). The attacker added a few PHP scripts that were redirecting to some ad sites. We have cleaned everything (or at least we think we did), and now everything works as it should.
However, links on Google (or Yahoo) that point to our website are still trying to include these PHP scripts (and return 404, as these are deleted now). Direct links from the browser work as they should.
We cleaned the site 10 days ago, so I do not think that something is cached on Google's servers. Re-indexing should be done by now.
To reproduce this behavior:
Go to www.google.com
type in "anitex socks"
click any php link that starts with "anitexsocks.com"
You will get "The requested URL /wp-includes/client.php was not found on this server" + 404 error
Refresh page and everything works without issues
Why are only Google links making troubles?
Any help is welcome. Thanks!
As for the reason why this is happening: I installed a Firefox add-on which blocks my browser's Referer header and then followed a Google link to your site, and it worked fine. Then I disabled the add-on and the problem started occurring again.
This shows that there is still some malicious code running on your website which checks all HTTP requests to see if they come from Google (based on the HTTP Referer header) and redirects them to /wp-includes/client.php if they do.
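If you want to reproduce that check without a browser add-on, a quick sketch (assuming Node 18+ with built-in fetch) is to request the page with and without a Google Referer header and compare what comes back:

    // Request the affected site twice: once pretending to arrive from Google,
    // once with no referrer, and log the status and final URL of each response.
    const target = 'http://anitexsocks.com/'; // the site from the question
    for (const referer of ['https://www.google.com/', '']) {
      fetch(target, { headers: referer ? { Referer: referer } : {} })
        .then((res) => console.log(referer || '(no referrer)', res.status, res.url))
        .catch((err) => console.error(referer || '(no referrer)', err));
    }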
To try to determine where this code may lie, try performing a recursive grep through all your www files on your server as well as your www configuration files; somewhere in there, there must still be a reference to that client.php script. Hopefully you can find and eliminate it.
That said, if it were my site and I knew a hacker had had free rein over my server to do whatever they wanted to it, I would not mess around with trying to undo the damage and would instead restore the most recent backup from before the site was hacked. You only have to miss one back door the hacker left in place and they can re-enter your site. After restoring backups, you should also upgrade/reconfigure the software they used to gain access in the first place so they can't simply re-hack it in the same manner again.

Resources