Programmatically retrieve the updated og:image of a page in an extension - google-chrome-extension

There seem to be many pages where the og:image does not change as I keep browsing from one page to another. The og:image always points to the first (landing) page. This is true of YouTube videos, for instance. Of course, reloading the page provides the correct og:image.
I am wondering if there is a way, within a custom extension on Chrome and Safari, to force refresh the og:image data without affecting user experience?

Dynamically updated sites (aka AJAX sites) change only a portion of the page with the new content on intra-site navigation. The meta information in the head element, such as og:image, usually isn't updated.
A universal workaround for any site with AJAX navigation would be to make an XMLHttpRequest for the current URL, convert the response into a DOM via the DOMParser API and extract the og:image tag.
Or you can write site-specific code and try to find an internal variable or element that contains the og:image. It requires some reverse-engineering, and your code would break on site changes.
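The universal workaround could look roughly like this. It is only a minimal sketch under a few assumptions: it uses fetch in place of XMLHttpRequest for brevity, the function names extractOgImage and fetchCurrentOgImage are made up for illustration, and it assumes the server returns the same HTML (with the right og:image) that a full reload would.

```javascript
// Reads og:image from any Document-like object that exposes querySelector.
function extractOgImage(doc) {
  const meta = doc.querySelector('meta[property="og:image"]');
  return meta ? meta.getAttribute('content') : null;
}

// Re-downloads the page the user is currently on and parses it off-screen,
// so the visible page is never reloaded.
// Assumption: runs in a content script, where fetch, DOMParser and location exist.
async function fetchCurrentOgImage() {
  const response = await fetch(location.href, { credentials: 'include' });
  const html = await response.text();
  const doc = new DOMParser().parseFromString(html, 'text/html');
  return extractOgImage(doc);
}
```

Because the response is parsed into a detached document, nothing in it is rendered or executed, which is what keeps the user experience untouched.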

Related

Is there a way in which a background video keeps playing when you navigate through a static website?

This website is not using AJAX to update sections of the site with content. It's a static site and therefore moves from one HTML page to the other in the classic way.
Each page has a video element embedded which plays the same video.
Clearly this means each time a user clicks to navigate to a different page, the video will restart.
What approaches can be taken to enable the video to maintain its position no matter which page the user visits?
I understand that each time a user clicks a hyperlink to navigate to a different page, a new HTTP request is performed and the HTML page is requested and retrieved.
I'm curious whether I can achieve what I'm looking for without using AJAX. I'd be interested to hear from you all on this.

How can I prevent an iframe displaying an email from loading images and other email trackers?

We have a web admin panel in which the agents can see conversations with customers.
Those conversations are the result of importing normal emails through an IMAP connection. We grab the "untouched" mailbox files and we store them in a database. Then we post-process the files to index by "from", "to", "date" and so on.
Up to here, all is fine. We can look up all the emails involved with a client and render them at will.
Then when the agent looks for a customer in the web admin panel and opens it, the full email conversation appears. And we display the HTML version of the email within an iframe (or the text version if the html version is not there). 90% of the customers send HTML.
What happens? When the agent opens the email in our web panel, the iframe loads the full HTML and renders it. This causes remote content (images, sounds, styles, and so on) to be downloaded, which allows customers to track whether we opened the email by appending tracking IDs to the assets (typically something like http://track.example.com/image.jpg?id=123456789).
I've tried the "sandbox" attribute of the iframe html tag with no luck (it still downloads the images).
Question
How can I programmatically tell the iframe to not load ANY remote content, and just render the initial HTML without any remote call?
Mozilla's iframe documentation, listing all available attributes for the element, is here: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe
If you look at "sandbox" there is no restriction specific to image or other includes, just restrictions on things like running JavaScript. There are no other attributes that would restrict images and includes.
To solve the problem of images and includes in your HTML you will need to filter the HTML either at the server before sending it or in the client after it arrives.
Server:
Before storing it into the database.
In the code that retrieves the HTML and returns it to the iframe.
Client:
Use AJAX to fill the iframe with the HTML, with code that filters the response. With this approach you could also use a div instead of an iframe if that works better for your layout.
If all of your users will use Chrome or Firefox, you could look at writing a browser extension.
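As a rough illustration of the client-side filtering option, here is a minimal sketch. The function names (isRemoteUrl, loadsUrl, stripRemoteContent, renderEmail) are hypothetical, the list of elements and attributes is an assumption and certainly not exhaustive, and a production filter would more likely rely on a maintained HTML sanitizer.

```javascript
// True for URLs that would trigger a request to another host.
// Relative, data: and cid: (inline attachment) URLs are left alone.
function isRemoteUrl(value) {
  return /^\s*(https?:)?\/\//i.test(value || '');
}

// True when a piece of CSS can load something through url(...) or @import.
function loadsUrl(css) {
  return /url\s*\(|@import/i.test(css || '');
}

// Strips remote references from an already-parsed email Document, in place.
function stripRemoteContent(doc) {
  // Elements that exist only to load or run external resources.
  doc.querySelectorAll('script, link, iframe, object, embed, audio, video')
    .forEach((el) => el.remove());
  // Stylesheets and inline styles are dropped only if they reference a URL.
  doc.querySelectorAll('style').forEach((el) => {
    if (loadsUrl(el.textContent)) el.remove();
  });
  doc.querySelectorAll('[style]').forEach((el) => {
    if (loadsUrl(el.getAttribute('style'))) el.removeAttribute('style');
  });
  // srcset can hold several URLs, so it is removed outright.
  doc.querySelectorAll('[srcset]').forEach((el) => el.removeAttribute('srcset'));
  // Attributes that carry a single URL.
  for (const attr of ['src', 'background', 'poster']) {
    doc.querySelectorAll('[' + attr + ']').forEach((el) => {
      if (isRemoteUrl(el.getAttribute(attr))) el.removeAttribute(attr);
    });
  }
  return doc;
}

// Assumption: browser context; `iframe` is the element that shows the email
// and `rawHtml` is the stored HTML fetched via AJAX.
function renderEmail(iframe, rawHtml) {
  const doc = new DOMParser().parseFromString(rawHtml, 'text/html');
  stripRemoteContent(doc);
  iframe.srcdoc = doc.documentElement.outerHTML;
}
```

Parsing with DOMParser does not fetch images or run scripts, so the tracking URLs are removed before anything is rendered. The same filtering idea applies on the server, using whatever HTML parser is available there.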

Chrome Extension: Sending a message to the page loaded in a specific iframe

I'm working on a Chrome extension to (among other things) support a page with multiple iframes, each of which loads a page from some other domain. I need to send a msg to the page loaded in a specific one of those iframes. The top-level page and the pages in the iframes each have their own content scripts, so the full messaging API is available.
From the top page, when I do chrome.runtime.sendMessage(), all the iframes get it (as does the top window, but it's easy for its content script to know that that particular msg isn't intended for it). Is there any way to target a specific one of those iframes, or for the desired iframe page to know that the msg is for it?
Note that...
The top page can't access anything in iframe pages directly, because they're from other domains.
The top page knows the URL that was originally loaded in each frame, but the user may have navigated from there, so including the target URL as a msg parameter for the receiving script to check won't work.
Is there something obvious I'm missing here?
UPDATE: @wOxxOm's answer was very helpful, but I'm still stuck on how to get the frameIds I need.
More specifically, I need to do two things with those iframes, both of which need that frameId:
Inject a script into each iframe
Send msgs to a specific iframe in response to user actions on the top-level page
All of this is complicated by the fact that the iframes are created and removed dynamically as the user works.
One idea I had is to initially load each new iframe with the URL "about:blank?id=nnn", where nnn is the DOM id of the corresponding iframe element. That way, when I call getAllFrames(), I can recognize the new iframes by that URL, and build a lookup of frameIds for each DOM id. Once that's done, I can load the real URL and inject the script once it's loaded.
That seems so roundabout, I'm hoping I've missed some supporting API or other straightforward approach.
I did find a solution, but it's pretty indirect. I hope this is clear; all these moving parts are the nature of the beast as I understand it.
Here's what I ended up doing:
Added a name attribute to each iframe, the same as its DOM id.
When the page in each iframe loads, a global content script calls chrome.runtime.sendMessage(), passing that name, which it can access as window.name.
The background script gets that msg, with the frameId of that iframe as sender.frameId, and calls chrome.tabs.sendMessage(), passing the frameId and window name.
The top-level page's content script builds a lookup object from those window-name (AKA iframe DOM id) / frameId pairs.
When the top-level page's content script wants to send a message to any of the iframe pages, it looks up the target's frameId in that lookup object, then calls chrome.runtime.sendMessage(), with a message type that indicates it's for a specific iframe, and including that frameId.
In response, the background script sends it on to the requested iframe's content script with chrome.tabs.sendMessage(), passing {frameId: request.frameId} as the 3rd parameter, as wOxxOm suggested.
This is working here, but by all means let me know if there's a simpler way to do this.
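For anyone following along, the relay in the background script could be sketched like this. It is an illustration of the steps above rather than my exact code: the message type names ('frame-hello', 'frame-registered', 'to-frame'), the payload field and the relay function are hypothetical, and sending the registration to frameId 0 (the top-level frame) is an assumption about how to reach only the top page.

```javascript
// Background script: relays messages so the top page can address one iframe.
// `sendToTab` is passed in so the routing logic can be exercised without Chrome.
function relay(request, sender, sendToTab) {
  if (request.type === 'frame-hello') {
    // An iframe announced itself; tell the top frame (frameId 0) its frameId.
    sendToTab(
      sender.tab.id,
      { type: 'frame-registered', name: request.name, frameId: sender.frameId },
      { frameId: 0 }
    );
  } else if (request.type === 'to-frame') {
    // The top page wants to reach one specific iframe.
    sendToTab(sender.tab.id, request.payload, { frameId: request.frameId });
  }
}

// Wire the relay to the real extension APIs when they are available.
if (typeof chrome !== 'undefined' && chrome.runtime && chrome.runtime.onMessage) {
  chrome.runtime.onMessage.addListener((request, sender) => {
    relay(request, sender, (tabId, message, options) =>
      chrome.tabs.sendMessage(tabId, message, options));
  });
}

// Content script side (comments only, since it lives in a separate file):
//   in every iframe: chrome.runtime.sendMessage({ type: 'frame-hello', name: window.name });
//   in the top page: chrome.runtime.sendMessage({ type: 'to-frame', frameId, payload });
```

The content script has to be declared with "all_frames": true in the manifest so that it runs inside the iframes as well as in the top page.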

Google Chrome Extension - prevent cookie on jquery ajax request or Use a chrome.extension

I have a great working chrome extension now.
It basically loops over the HTML listings of a web auction site. If a user has not paid to have the image shown in the main list, a default image is shown instead.
My plugin uses a jQuery Ajax request to load the auction page and find the main image to display as a thumbnail for any missing images. WORKS GREAT.
The plugin finds the correct image URL, updates the HTML DOM to the new image and sets a new width.
The issue is that the auction site tracks all page views and saves them to a "recently viewed" section of the site, where users can see any auctions they have clicked on.
ISSUE
- My plugin uses ajax and the cookies are sent via the jQuery ajax request. I am pretty sure I cannot modify the cookies in this request so the auction site tracks the request and for any listing that has a missing image this listing is now shown in my "recently viewed" even though I have not actually navigated to it.
Can I remove cookies for an ajax request? (I don't think I can.)
Can Chrome remove the cookie (only for the ajax requests)?
Could I get Chrome to make the request (e.g. like curl, with no cookie)?
Just for the curious.
Here is a page with missing images on this auction site
http://www.trademe.co.nz/Browse/SearchResults.aspx?searchType=all&searchString=toaster&type=Search&generalSearch_keypresses=9&generalSearch_suggested=0
Thanks for any input, John.
You can use the webRequest API to intercept and modify requests (including blanking headers). It cannot be used to modify requests which are created within the context of a Chrome extension though. If you want to use this API for cookie-blanking purposes, you have to load the page in a non-extension context, either by creating a new tab or by using an off-screen tab (using the experimental offscreenTabs API).
Another option is to use the chrome.cookies API and bind an onChanged event. Then you can intercept cookie modifications and revert the changes using chrome.cookies.set.
The last option is to create a new window+tab in Incognito mode. This method is not reliable, and should not be used:
The user can disallow access to the Incognito mode
The user could have navigated to the page in incognito mode, causing cookie fields to be populated.
It's disruptive: A new window is created.
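To illustrate the webRequest option, here is a minimal sketch of blanking the Cookie header. Several things in it are assumptions rather than part of the answer: it targets a Manifest V2 background page with the "webRequest" and "webRequestBlocking" permissions plus host access, the URL pattern is simply taken from the site in the question, and the helper name withoutCookie is made up. Note that, as written, it strips cookies from every XHR to that site, including the site's own, so a real extension would need a narrower filter.

```javascript
// Returns a copy of the request headers without the Cookie header.
function withoutCookie(requestHeaders) {
  return requestHeaders.filter((h) => h.name.toLowerCase() !== 'cookie');
}

// Register the listener only where the extension API exists.
if (typeof chrome !== 'undefined' && chrome.webRequest) {
  chrome.webRequest.onBeforeSendHeaders.addListener(
    (details) => ({ requestHeaders: withoutCookie(details.requestHeaders) }),
    // Hypothetical filter: XHR traffic to the auction site from the question.
    { urls: ['*://www.trademe.co.nz/*'], types: ['xmlhttprequest'] },
    // Chrome 72+ only exposes the Cookie header when 'extraHeaders' is requested.
    ['blocking', 'requestHeaders', 'extraHeaders']
  );
}
```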
Presumably this AJAX interaction is being run from a content script? Could you run it from the background page instead and pass the data to the content script? I believe the background page operates in a different context and shouldn't send the normal cookies.

Understanding 3rd party iframes security?

Facebook and others offer little iframe snippets that I can put in my site.
Example:
<iframe src="http://www.facebook.com/widgets/like.php?href=http://example.com"
scrolling="no" frameborder="0"
style="border:none; width:450px; height:80px"></iframe>
What I'd like to know is, if I put this code inside my site, could the code they load into my page access the DOM of my page? I see some security issues if so.
Likewise facebook allows me to put an iframe into their site, this is how facebook applications work.
Could I then mine any data off any page that contains my iframe?
Note I used Facebook as an example here, but many companies do the same thing, so this question is not specific to Facebook in any way and I am not tagging it as such.
Also can the parent page access the DOM of the iframe?
Actually there are specific rules of inheritance for iframes. This is a part of the same-origin policy, and I highly recommend reading the entire Google Browser Security Handbook.
I do know the parent page can access the DOM of the iframe when both are served from the same origin, as they were in our case. Recently we had a project at work where we had a site which needed to be 508 compliant. The iframe was not, and although screen readers are handling iframes much better, the content within this iframe was not compliant. We loaded the jQuery library into our site, and then also loaded code into our site to manipulate the iframe (only after it loads) and at that point rework the iframe's content to be accessible.
To give you an idea of how we did it, here is a sample of our jQuery. (We used a lot of finds and replaces, but you get the idea; you could do other things.)
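// .on('load', ...) replaces the .load() event shorthand, which jQuery 3 removed.
$('iframe').on('load', function () {
    // Only works when the iframe is same-origin with this page.
    var f = $(this).contents();
    f.find('#sysverb_back').remove();
    // Give column-header links a title taken from their text.
    f.find('a.column_head').each(function () {
        $(this).attr('title', $(this).text());
    });
    // Copy title to alt on images that have a title but no alt text.
    f.find('img[title]:not([alt])').each(function () {
        $(this).attr('alt', $(this).attr('title'));
    });
    // Add a hidden label before each input whose id matches the pattern.
    f.find('input').filter(function () {
        return this.id.match(/sys_readonly\..+|ni\..+/);
    }).each(function () {
        $(this).before('<label for="' + this.id + '" style="display:none;">' + this.id + '</label>');
    });
});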
However, I do not know whether the iframe can access the parent's DOM.
I'm assuming cross-domain iFrame since presumably the risk would be lower if you controlled it yourself.
I've been trying to figure this out myself. Here is what I have so far:
Clickjacking/XSS is a problem if your site is included as an iframe
A compromised iFrame could display malicious content (imagine the iFrame displaying a login box instead of an ad)
An included iframe can make certain JS calls like alert and prompt which could annoy your user
An included iframe can redirect via location.href (yikes, imagine a 3p frame redirecting the customer from bankofamerica.com to bankofamerica.fake.com)
Malware inside the 3p frame (java/flash/activeX) could infect your user
Note that the HTML5 "sandbox" attribute can solve a lot of these problems if your browser supports it, and you can prevent your site from being included as an iframe as well via the X-Frame-Options response header.
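Both mitigations can be sketched in a few lines. This is only an illustration: the helper names sandboxFrame and denyFraming are hypothetical, the choice of "allow-scripts" as the only sandbox token is an assumption (many third-party widgets need more, such as allow-same-origin), and `res` stands for any response object with a Node-style setHeader method.

```javascript
// Embedding side: restrict what a third-party frame may do.
// With only "allow-scripts" the frame can run scripts, but it cannot navigate
// the top window, open popups, show alert/prompt dialogs or load plugins.
function sandboxFrame(iframe) {
  iframe.setAttribute('sandbox', 'allow-scripts');
  return iframe;
}

// Embedded side: ask browsers not to render this response inside any frame.
function denyFraming(res) {
  res.setHeader('X-Frame-Options', 'DENY');
  return res;
}

// Typical use (assumed setup, shown as comments):
//   sandboxFrame(document.querySelector('iframe'));
//   http.createServer((req, res) => { denyFraming(res); res.end('...'); });
```

Using SAMEORIGIN instead of DENY still lets your own pages frame the site while blocking other origins.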
