YouTube's own embed code causing Page Speed alerts - pagespeed-insights

We recently ran a series of Google PageSpeed reports and found that pages using an embedded YouTube player (either within the product imagery or description) have significantly longer load times than all other pages. Now that we're utilizing video on our website more prominently, we'd like to do what we can to reduce these load times.
YouTube is the biggest offender in these categories:
Reduce unused JavaScript
Reduce the impact of third-party code
Some third-party resources can be lazy-loaded
(to name just a few)
Ironies aside, are there native embed options to improve this? I've tried lightweight players from GitHub, but those introduce visual quirks that are a downgrade from what YouTube's embed provided.
https://github.com/paulirish/lite-youtube-embed
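For context, the trick these lightweight players rely on is a click-to-load facade: render only a thumbnail at first and create the real iframe only when the visitor clicks. A rough TypeScript sketch, with made-up markup and video ID:

```typescript
// Click-to-load facade: show only a thumbnail until the visitor asks for the
// player, then inject the real YouTube iframe. Markup and video ID are made up:
// <div class="yt-facade" data-video-id="dQw4w9WgXcQ"></div>
document.querySelectorAll<HTMLElement>('.yt-facade').forEach((facade) => {
  const videoId = facade.dataset.videoId;
  if (!videoId) return;

  // YouTube's static thumbnail costs one small image request instead of
  // hundreds of kilobytes of player JavaScript.
  facade.style.backgroundImage = `url(https://i.ytimg.com/vi/${videoId}/hqdefault.jpg)`;
  facade.style.backgroundSize = 'cover';
  facade.style.cursor = 'pointer';

  facade.addEventListener('click', () => {
    const iframe = document.createElement('iframe');
    iframe.src = `https://www.youtube-nocookie.com/embed/${videoId}?autoplay=1`;
    iframe.allow = 'autoplay; encrypted-media; picture-in-picture';
    iframe.allowFullscreen = true;
    iframe.width = '560';
    iframe.height = '315';
    facade.replaceChildren(iframe);
  }, { once: true });
});
```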

Related

How to measure performance of browser extension on websites (> 100)

I'm looking for tools to measure webpage performance with a certain browser (Chrome) extension installed. I would like to know things like the number of requests, time to first byte, slowest call, average call, FCP, LCP, etc.
I've used the developer tools that come with the browser, and extensions such as Page load time and Performance-Analyser.
I'm looking for a method/tool that can load pages one by one from a list and download the results, so I can test many webpages and batch-process the results.
Thanks.
You can use any suitable browser automation framework, e.g. Selenium, which is more or less the de-facto standard.
Check out 6 Easy Steps to Testing Your Chrome Extension With Selenium for an example.
There is also the Lighthouse tool, which can also be considered and executed from shell scripts or programmatically. However, it's more web-oriented, hence not all metrics will be applicable, so you might get a lot of false-negative results.
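For illustration, a batch-runner sketch using selenium-webdriver from Node/TypeScript; the .crx path, URL list, output file, and the particular metrics pulled from the Navigation Timing and Paint Timing APIs are placeholders (LCP would additionally need a PerformanceObserver registered before load):

```typescript
// Batch-measure a list of pages in Chrome with an extension loaded, via
// selenium-webdriver. The .crx path, URL list, and output file are placeholders.
import { writeFileSync } from 'node:fs';
import { Builder, WebDriver } from 'selenium-webdriver';
import * as chrome from 'selenium-webdriver/chrome';

const urls = ['https://example.com/', 'https://example.org/'];

async function measure(driver: WebDriver, url: string) {
  await driver.get(url);
  // Read Navigation Timing and Paint Timing entries straight from the page.
  return driver.executeScript(`
    const nav = performance.getEntriesByType('navigation')[0];
    const paints = Object.fromEntries(
      performance.getEntriesByType('paint').map(p => [p.name, p.startTime]));
    return {
      url: location.href,
      requests: performance.getEntriesByType('resource').length,
      ttfb: nav.responseStart - nav.requestStart,
      loadTime: nav.loadEventEnd,
      fcp: paints['first-contentful-paint'],
    };
  `);
}

(async () => {
  const options = new chrome.Options().addExtensions('/path/to/extension.crx');
  const driver = await new Builder()
    .forBrowser('chrome')
    .setChromeOptions(options)
    .build();
  const results: unknown[] = [];
  try {
    for (const url of urls) {
      results.push(await measure(driver, url));
    }
  } finally {
    await driver.quit();
  }
  writeFileSync('results.json', JSON.stringify(results, null, 2));
})();
```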

Is automated detection of web forms, payment gateways, and ads on web pages possible?

I would like to know if it is possible to detect web forms, payment gateways, and ads on a web page in an automated manner, by running some code after crawling and indexing the web page.
To give the question some context, I would basically like to know if the given web page is being monetized in some way. If there were a program to detect the above, it would provide a way to find out if the web page is capturing leads, selling a product or service, or displaying ads.
Moreover, I would like to know: is there some form of on-page monetization that cannot be detected by using an automated program?
Yes, it's possible to parse a web page after downloading it - but if you need to ask the question then you still have a very long journey ahead of you before you are capable of understanding how to do this. Your question is also off-topic as it does not relate to solving a programming problem.
However, it is impossible to determine programmatically whether the content of a web page represents monetization. It is possible to make some guesses using a database of known advertisers and/or very advanced AI techniques. Come back and ask again after you have mastered data science, adversarial / back-propagated neural networks and image analysis, in about 15 years from now.
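As a very rough illustration of the "parse and guess against known lists" idea, here is a TypeScript sketch. The host lists are tiny illustrative stand-ins for a real advertiser/payment database, and regex-scanning raw HTML will miss anything injected later by JavaScript (a headless browser would do better):

```typescript
// Rough monetization heuristics: fetch a page and scan it for forms, known ad
// hosts, and known payment hosts. Host lists are illustrative, not a real database.
const AD_HOSTS = ['doubleclick.net', 'googlesyndication.com', 'adnxs.com'];
const PAYMENT_HOSTS = ['js.stripe.com', 'paypal.com', 'checkout.razorpay.com'];

async function detectMonetization(url: string) {
  const html = (await (await fetch(url)).text()).toLowerCase();
  const srcs = [...html.matchAll(/<(?:script|iframe)[^>]+src=["']([^"']+)["']/g)]
    .map((m) => m[1]);

  return {
    hasForms: /<form[\s>]/.test(html), // possible lead capture
    adScripts: srcs.filter((s) => AD_HOSTS.some((h) => s.includes(h))),
    paymentScripts: srcs.filter((s) => PAYMENT_HOSTS.some((h) => s.includes(h))),
  };
}

detectMonetization('https://example.com/').then(console.log);
```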

best approach for converting a heavy website into a hybrid app

What is the best approach to work with an IBM Worklight website which has lots of content? Should it be made multi-page? If it is multi-page, how do we access the Worklight context on each page?
IMO there are multiple aspects you need to think about and take into account with respect to your specific scenario and needs. Since you did not describe those in detail, I will try to generalise my suggestions:
You are not required to have an app per se
You could also re-design your website with responsive web design in mind. This way, as your users load the website in either a desktop or a mobile browser, the website fits itself to the device's viewport size.
If you do choose to create an app
Not all aspects of your website must exist in the mobile app. Re-consider your strategy and find the right balance of what you should present to your end users. Make it lighter.
Think mobile-first; the paradigm is different and so should be your approach and design: UI Design Dos and Don'ts
As for the technical aspect, many UI frameworks provide ways to present "pages" within your app. Worklight can work with any of them. Read more about the challenges and solutions here:
Building a multi-page application tutorial
Example application showcasing multi-page navigation in Worklight 6.2 using jQuery Mobile
Stack Overflow questions about Worklight and multi-page apps
Strictly speaking, Worklight hybrid apps are single-page apps: there is a single HTML page and we never navigate to a new "URL". However, from the UI point of view the user sees what appears to be multiple pages; typically this is achieved by manipulating the DOM of the single page. For example, we have a DIV for each "page" the user sees, and we navigate by showing and hiding those DIVs.
With that philosophy in mind, your question about accessing the Worklight context becomes trivial: we're on a single page, so the context is always available (see the sketch below).
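A minimal TypeScript sketch of that show/hide pattern; the .page class and data-target-page attribute are made-up conventions, not anything Worklight-specific:

```typescript
// Minimal "multi-page" navigation inside a single-page hybrid app:
// one <div class="page"> per screen; show the requested one, hide the rest.
function showPage(pageId: string): void {
  document.querySelectorAll<HTMLElement>('.page').forEach((page) => {
    page.style.display = page.id === pageId ? 'block' : 'none';
  });
}

// Because navigation never leaves the single HTML page, any globals set up
// at startup (including the Worklight context) remain available throughout.
document.querySelectorAll<HTMLElement>('[data-target-page]').forEach((link) => {
  link.addEventListener('click', () => showPage(link.dataset.targetPage!));
});
```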
As Idan says, it is usually simplest to implement such a single-page, multi-view app by using a JavaScript framework that manages the navigation. Many folks these days use AngularJS. Using such frameworks we can decompose the app into a number of small HTML and JS files that are dynamically loaded; from the app perspective it's still a single page, but from a development perspective we now have finer-grained artefacts that allow easier parallel development in a multi-person team. When you have an app with many tens of "pages", such decomposition really pays off.

Search engine components

I'm a middle school student learning computer programming, and I just have some questions about search engines like Google and Yahoo.
As far as I know, these search engines consist of:
Search algorithm & code
(Example: search.py file that accepts search query from the web interface and returns the search results)
Web interface for querying and showing results
Web crawler
What I am confused about is the Web crawler part.
Do Google's and Yahoo's web crawlers immediately search through every single webpage existing on the WWW? Or do they:
First download all the existing webpages on the WWW, save them on their huge servers, and then search through these saved pages?
If the latter is the case, then wouldn't the results appearing in a Google search be outdated, since I suppose searching through all the webpages on the WWW would take a tremendous amount of time?
P.S. One more question: how exactly does a web crawler retrieve all the web pages existing on the WWW? For example, does it search through all the possible web addresses, like www.a.com, www.b.com, www.c.com, and so on? (Although I know this can't be true.)
Or is there some way to get access to all the existing webpages on the World Wide Web? (Sorry for asking such a silly question.)
Thanks!!
The crawlers search through pages, download them and save (parts of) them for later processing. So yes, you are right that the results that search engines return can easily be outdated, and a couple of years ago they really were quite outdated. Only relatively recently did Google and others start doing more real-time searching, by collaborating with large content providers (such as Twitter) to get data from them directly and frequently, but they took real-time search offline again in July 2011. Otherwise they, for example, take note of how often a web page changes, so they know which ones to crawl more often than others. And they have special systems for it, such as the Caffeine web indexing system. See also their blog post Giving you fresher, more recent search results.
So what happens is:
Crawlers retrieve pages
Backend servers process them
Parse text, tokenize it, index it for full text search
Extract links
Extract metadata such as schema.org for rich snippets
Later they do additional computation based on the extracted data, such as
Page rank computation
In parallel they can be doing lots of other stuff such as
Entity extraction for Knowledge graph information
Discovering what pages to crawl happens simply by starting with a page, following its links to other pages, following their links, and so on. In addition to that, they have other ways of learning about new websites - for example, if people use their public DNS server, they will learn about pages that those people visit, and the same goes for links shared on G+, Twitter, etc.
There is no way of knowing what all the existing web pages are. There may be some that are not linked from anywhere, that no one publicly shares a link to (and that are never visited through their DNS, etc.), so they have no way of knowing what these pages are. Then there's the problem of the Deep Web. Hope this helps.
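To make the "follow links, index text" loop concrete, here is a toy TypeScript crawler. It deliberately ignores robots.txt, politeness delays, URL normalization and everything else a real crawler needs, and the seed URL is just a placeholder:

```typescript
// Toy breadth-first crawler + inverted index, to illustrate the
// "retrieve pages, extract links, index text" loop described above.
const seeds = ['https://example.com/'];
const maxPages = 50;

async function crawl() {
  const queue = [...seeds];
  const visited = new Set<string>();
  const index = new Map<string, Set<string>>(); // word -> pages containing it

  while (queue.length > 0 && visited.size < maxPages) {
    const url = queue.shift()!;
    if (visited.has(url)) continue;
    visited.add(url);

    let html: string;
    try {
      html = await (await fetch(url)).text();
    } catch {
      continue; // unreachable page, skip it
    }

    // Index: strip tags, tokenize, record which page each word appeared on.
    const text = html.replace(/<[^>]+>/g, ' ').toLowerCase();
    for (const word of text.match(/[a-z]{3,}/g) ?? []) {
      if (!index.has(word)) index.set(word, new Set());
      index.get(word)!.add(url);
    }

    // Extract links and queue the ones we have not seen yet.
    for (const [, href] of html.matchAll(/href=["'](https?:\/\/[^"']+)["']/g)) {
      if (!visited.has(href)) queue.push(href);
    }
  }
  return index;
}

crawl().then((index) => console.log('indexed words:', index.size));
```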
Crawling is not an easy task (for example, Yahoo is now outsourcing crawling to Microsoft's Bing). You can read more about it in Page and Brin's own paper: The Anatomy of a Large-Scale Hypertextual Web Search Engine.
You can find more details about storage, architecture, etc. on the High Scalability website, for example: http://highscalability.com/google-architecture

how to implement a web site like youtube?

I'm building a language website for my university's language center, where students log in and watch videos to learn English. I have to do it like this:
a person logs in to the system, searches using a search area, and finds the details, lessons and videos relevant to that search. This functionality closely matches the YouTube scenario.
For implementing Twitter-like functionality we can use StatusNet; is there a similar library, a well-known StatusNet-like implementation for YouTube, or some kind of platform or framework like CodeIgniter that we can use to implement a YouTube-like site very easily?
Please suggest some options, either open source or commercial.
And what is the best video format to use in such a website? FLV, MP4, or MOV?
regards,
Rangana
Your best option is to use a 'cloud' based video processing service. Most have a sample project / library for many different languages and frameworks. Here is a list of a few I've tried and liked:
http://zencoder.com/
http://transloadit.com/
http://pandastream.com/
The typical steps involve uploading the video files to a large 'cloud' static asset host (such as S3) through the browser. If you are inexperienced it is best to select a processor that provides an uploader (it will handle putting the files in the right spot). Of the three, Transloadit and Panda both have custom uploaders.
Usually the service will allow you to either pass the encoding settings (what formats and qualities to output) as parameters or configure them in your account. To support all current HTML5 browsers you just need H.264 (.mp4) and Ogg (.ogv). However, the new trend in the video world is WebM (.webm), so you might want to include it as well.
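Once the encoded files exist, serving all three formats is just a matter of listing multiple sources and letting the browser pick the first one it can play. A small TypeScript sketch (file paths are placeholders):

```typescript
// Offer each encoding and let the browser pick the first source it can play.
const video = document.createElement('video');
video.controls = true;
for (const [src, type] of [
  ['/videos/lesson1.webm', 'video/webm'],
  ['/videos/lesson1.mp4', 'video/mp4'],
  ['/videos/lesson1.ogv', 'video/ogg'],
]) {
  const source = document.createElement('source');
  source.src = src;
  source.type = type;
  video.appendChild(source);
}
document.body.appendChild(video);
```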
Next you will receive a unique code from the web service that you must store in persistent storage (database). The web service can be configured to 'callback' (perform an HTTP POST or GET request to your service) once the video is encoded.
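What such a callback receiver might look like, sketched with Node's built-in http module; the route and field names are hypothetical, since every service documents its own payload:

```typescript
// Sketch of a transcoding-callback endpoint using Node's built-in http module.
// The route and field names ("job_id", "state") are made up.
import { createServer } from 'node:http';

function markVideoReady(jobId: string): void {
  // Placeholder for your own persistence layer: look up the row you stored
  // when the upload started and flag the video as playable.
  console.log(`video ${jobId} is ready to display`);
}

const server = createServer((req, res) => {
  if (req.method !== 'POST' || req.url !== '/video-callback') {
    res.writeHead(404);
    res.end();
    return;
  }
  let body = '';
  req.on('data', (chunk) => { body += chunk; });
  req.on('end', () => {
    const payload = JSON.parse(body);
    if (payload.state === 'finished') {
      markVideoReady(payload.job_id);
    }
    res.writeHead(200);
    res.end('ok');
  });
});

server.listen(3000);
```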
Once you receive a callback you can activate your video and start displaying it on your pages. For playback, if you are inexperienced, I'd highly recommend one of the following players:
http://sublimevideo.net/
http://longtailvideo.com/
http://videojs.com/
They all do similar things for different prices. My current personal favourite is Sublime Video (it offers cool light box effects and a gorgeous player).
Why do you have to re-implement YouTube when you can just use it to host your videos for free? Many online e-learning portals (e.g. Khan Academy) do exactly that.
As far as the best video format to use, go read about H.264/AVC. It's what YouTube currently uses.
I think you will not find an already-built solution ;)
But it's not really that hard. You can use existing frameworks that will make your life easier while you build the account management system; the rest shouldn't really be that hard (assuming you don't want to re-build the whole of YouTube ;D ).
For playing videos, you can use JW Player. A great piece of software, you should check it out.

Resources