tabs permission or content script? - google-chrome-extension

I'm writing an extension that needs to show a page action on amazon.com pages.
Would it be better to request the "tabs" permission or to inject a content script into amazon.com pages?
The tabs permission strikes me as using fewer resources (it just checks the URL against a regex in the background script), but it carries a scarier permission message ("access your tabs and browsing activity").
Injecting a content script into amazon.com pages seems like it would take more resources, but it would only need permission for amazon.com...

This is a generic question and the answer varies from client to client. You have already pointed out the pros and cons of each.
I suggest you go with content scripts if your users are particular about security and privacy. In this approach you add some extra load to the pages (content scripts plus message passing), which may slow down their normal execution.
I suggest you go with the tabs permission if you care most about performance. It is a native API that runs in the background page and adds no extra load to tabs. Many extensions on the Web Store already use the tabs API, so I don't think the warning will scare users, as it is nothing new.
However, it all comes down to your target users.

Related

Chrome Extension - Scrape any url, ignoring sandboxing and Content Security Policy?

I'd like to build a chrome extension that can make requests against any web page that the user has access to, even pages that are protected by Content Security Policies, preferably in the background (without having to have the page open in the browser).
So for example, I'd like to be able to:
request info from a page the user may be logged into, like Gmail
request info from RSS feeds or other pages
request info from pages on Facebook
Is this possible? It seems like I could have the extension open a new window, and a tab for every page I want to pull info from. Is this the only way this can work? I'd prefer to have this happen behind the scenes, without having to open a window.
CSP is not a problem as long as your manifest.json adds the URLs you want to process in permissions e.g. "*://*/" or "<all_urls>" will allow access to any site.
The solution, however, depends on how the page is built. If the server response contains all the info you need, you can simply make a direct request via XMLHttpRequest or fetch in the background script, parse it with DOMParser, and extract the data. Otherwise you can try to load it in an iframe (you'll have to strip X-Frame-Options) or in an inactive/pinned tab and use a content script to extract the data. To access the page's JavaScript variables you'll need to inject a script element into its DOM so that code runs in the page context.
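A minimal sketch of the direct-request path (assuming a Manifest V2 background page, where DOMParser is available, and that the target origin is covered by your permissions; the URL and selector below are placeholders):

async function scrapeTitle(url) {
  // Send the request with the user's cookies so logged-in pages work.
  const response = await fetch(url, { credentials: 'include' });
  const html = await response.text();
  // Parse the raw HTML without executing it (no scripts run here).
  const doc = new DOMParser().parseFromString(html, 'text/html');
  return doc.querySelector('title').textContent;
}

scrapeTitle('https://example.com/').then(title => console.log(title));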

How do you prevent crawling from your web site?

I am running a website on IIS with more than 1000 paginated page links, and I want to prevent others from crawling/stealing these pages by running a crawler script that fetches the info page by page.
Is there any way to tell whether a request comes from a real user or is being run by a script? Or maybe some filter for this at a higher level, before it even reaches the request handler?
You can't prevent automated crawling.
You can make it harder to crawl your content automatically, but if you allow users to see the content it can be automated (automating browser navigation is not hard, and computers generally don't mind waiting a long time between requests).
One option is to require a single "user" (authenticated or not) to leave some minimal delay between requests (say 1-5 seconds). That way generic crawling is not useful (it would need some "user id" in each request plus a delay between requests), and one would have to write custom crawling code for your site, which is clearly more time intensive.
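To illustrate the idea only (the asker's site runs on IIS, but the mechanism is the same on any stack; this is JavaScript-flavoured pseudocode with made-up names):

const lastRequestAt = new Map();   // "user id" (session/cookie) -> last request time
const MIN_DELAY_MS = 2000;         // somewhere in the 1-5 second range suggested above

function allowRequest(userId) {
  const now = Date.now();
  const last = lastRequestAt.get(userId) || 0;
  if (now - last < MIN_DELAY_MS) {
    return false;                  // too soon: delay or reject this request
  }
  lastRequestAt.set(userId, now);
  return true;
}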
Note that writing a special "crawler" for your site may be seen as a "noble" undertaking and significantly increase the incentive to create one (see, for example, the "how to make Google Maps available offline" questions).

moving from permissions to optional_permissions: how to handle content_scripts?

Originally posted this question here:
https://groups.google.com/a/chromium.org/d/msg/chromium-extensions/wbSpXvnO10A/nov36skmnQ0J
My extension has an optional feature that interacts with the user's Gmail tab. We don't want to mention mail.google.com domains at all in the permission confirmation that the user sees when first installing the extension, so I moved that entry out of the manifest's permissions block and into the optional_permissions block. We also need a content script tied to mail.google.com, but declaring it in the manifest causes the 'mail.google.com' permission warning that is spooking some users.
I've tried removing the content_scripts manifest block and using programmatic injection instead, as described here: http://developer.chrome.com/extensions/content_scripts.html#pi
However, scripts injected that way are not content scripts and don't have access to the needed APIs (chrome.tabs, etc.).
Is there some way to get the best of both worlds: use optional_permission, AND get the content scripts added to a matching URL, but only if the user has approved the optional permission?
It seems like you could create a background page and call chrome.tabs.query against your optional origin to get a list of tabs matching that host. You can then inject the content script programmatically (chrome.tabs.executeScript). None of this requires the "tabs" permission (many tabs functions don't require any special permission, and the API intelligently lets you query for tabs whose origins match your optional permission).
You could call this every second or so to see if there are any new tabs for which you haven't yet called executeScript.
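For example, something along these lines in the background page (a sketch; the match pattern and file name are placeholders, and it assumes the optional origin has already been granted):

const injectedTabs = new Set();    // tab ids we've already injected into

setInterval(() => {
  chrome.tabs.query({ url: 'https://mail.google.com/*' }, tabs => {
    for (const tab of tabs) {
      if (injectedTabs.has(tab.id)) continue;            // already handled this tab
      chrome.tabs.executeScript(tab.id, { file: 'content.js' });
      injectedTabs.add(tab.id);
    }
  });
}, 1000);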
It would be nice if this were edge-triggered. See https://code.google.com/p/chromium/issues/detail?id=264704
You can actually make it mostly edge-triggered by using chrome.tabs.onUpdated.addListener and simply trying to inject every time it fires (which is every time a page loads in any tab, whether or not you have permission). You'll get a lot of errors in the background script's console when you don't have permission. It is important to have your content script set a variable like _I_already_executed = true and check for its existence, so that you don't inject multiple times (this event fires several times for each page load).
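Roughly (a sketch; the guard variable name is just the one suggested above):

// background script
chrome.tabs.onUpdated.addListener((tabId, changeInfo) => {
  if (changeInfo.status !== 'complete') return;          // the event fires several times per load
  chrome.tabs.executeScript(tabId, { file: 'content.js' }, () => {
    void chrome.runtime.lastError;                       // expected error on tabs we can't access
  });
});

// top of content.js: guard against being injected twice into the same page
if (!window._I_already_executed) {
  window._I_already_executed = true;
  // ... actual content-script logic goes here ...
}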
Now there's the contentScripts.register() API, which lets you programmatically register content scripts.
browser.contentScripts.register({
matches: ['https://mail.google.com/*'],
js: [{file: 'content.js'}]
});
This API is only available in Firefox but there's a Chrome polyfill you can use. The new scripts will work exactly like your regular content scripts in manifest.json.
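If you pair it with the optional permission flow, the registration can be gated on the grant, roughly like this (a sketch; it assumes a promise-based browser namespace, e.g. via the WebExtension polyfill on Chrome, and permissions.request must be called from a user gesture):

async function enableGmailFeature() {
  // Ask the user for the optional origin first (e.g. from a button in the options page).
  const granted = await browser.permissions.request({
    origins: ['https://mail.google.com/*']
  });
  if (granted) {
    await browser.contentScripts.register({
      matches: ['https://mail.google.com/*'],
      js: [{file: 'content.js'}]
    });
  }
}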
For a more comprehensive solution you could look into my webext-domain-permission-toggle and webext-dynamic-content-scripts modules, which don't apply directly to your use case but can be helpful to anyone who wants to drop the <all_urls> permission and inject content scripts on demand.

How to stop Site from Scraping my site

I have this songs site, and whatever data it has is being displayed on another site.
Even if I echo "hello", the same appears on the other site. Does anybody know how I can prevent that?
Digging a bit deeper, I found out that the other site is using file_get_contents(). How can I prevent them from doing that?
Well, you can try to determine their IP address and block it.
You said file_get_contents was being used.
A URL can be used as a filename with this function if the fopen wrappers have been enabled. See fopen() for more details on how to specify the filename. See the Supported Protocols and Wrappers for links to information about what abilities the various wrappers have, notes on their usage, and information on any predefined variables they may provide.
To disable them, more information is at http://www.php.net/manual/en/filesystem.configuration.php#ini.allow-url-fopen
Edit: If they switch to cURL or an equivalent after this, try to mess with their script by changing the HTML layout, etc. If that doesn't help, locate the IP of the script's host and return nonsense to it ;)
Edit 2: If they use an iframe, use JavaScript to redirect when you detect your page is inside an iframe.
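For the iframe case, the classic frame-busting snippet is roughly:

// Run this on your own pages: if we're inside someone else's frame, break out.
if (window.top !== window.self) {
  window.top.location.href = window.location.href;
}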
Or you can even generate rubbish information just for that crawler, just to mess up the "clone" site.
The first question to be answered is: Have you identified the crawler getting the information from your site?
If so, then you can serve anything you want to that process: nothing (ignore/block it), a message telling its owners to stop taking your information, rubbish content, ...
Anyway, the first step is doing things properly: make sure your site has a robots.txt with the accepted policy for crawlers.

Cross-domain iframe communication in Opera

I need to communicate between two iframes on the same domain, which live inside a parent page on a different domain that I have no control over.
This is a Facebook app and the basic layout is this
apps.facebook.com/myapp
└ iframe1 (src='mysite.com/foo')
└ iframe2 (src='mysite.com/bar')
I need frame1 to talk to frame2, but in Opera I can't access window.parent.frames['frame2']
to do the usual cross-domain methods (updating location.hash for example)
Is there an alternate way to accomplish this in Opera?
Thanks for your help in advance
Did you try using HTML5 web messaging? It is quite well supported by recent versions of browsers.
iframe.contentWindow.postMessage('Your message','http://mysite.com');
The second argument to postMessage must be the target origin, http://mysite.com here.
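On the receiving side, listen for the message event and check the origin, e.g.:

window.addEventListener('message', event => {
  if (event.origin !== 'http://mysite.com') return;   // ignore messages from other origins
  console.log('Received:', event.data);
});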
Generally no. Same Origin Policy denies you the possibility of communicating upwards to the parent, which would be necessary to then step downwards to the other frame. This is true in any browser.
If the parent document has given your frame-to-be-contacted a unique name, there is a limited form of communication possible with it by getting the user to click a link with href="otherurl#message" target="name", which will navigate the target frame by changing the hash without reloading the page, as long as the URL matches exactly. In Mozilla you can also do this with a form target, allowing you to script its submission (since link clicking cannot be automated), but not in Opera. Probably not much use... don't know if FB gives you a frame target name in any case.
You can make a communication channel between scripts in the same domain by using cookies(*): one script writes a session cookie, the other script polls for changes to document.cookie to find messages in it. But it's super-ugly and requires some annoying work to control signalling which messages are meant for whom when there are multiple documents open simultaneously. And there are further limitations for cookies in third-party frames (you will probably need to write a P3P policy to get IE to co-operate).
(*: or, presumably, HTML5 web storage, where available.)
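For what it's worth, a rough sketch of that cookie channel (the cookie name, message format and polling interval here are all made up):

// sender frame: write the message into a session cookie
document.cookie = 'xmsg=' + encodeURIComponent(JSON.stringify({to: 'frame2', text: 'hi'})) + '; path=/';

// receiver frame: poll document.cookie for changes
let lastSeen = '';
setInterval(() => {
  const match = document.cookie.match(/(?:^|; )xmsg=([^;]+)/);
  if (match && match[1] !== lastSeen) {
    lastSeen = match[1];
    const msg = JSON.parse(decodeURIComponent(match[1]));
    if (msg.to === 'frame2') console.log('Received:', msg.text);
  }
}, 250);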
As others have said, use window.postMessage. But instead of using window.parent.frames['frame2'], try window.parent.frames[x], where x is the index of the other iframe in the parent's frames collection.
You can see an example of doing this across origins here: http://webinista.s3.amazonaws.com/postmessage
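In this layout that could look roughly like the following from inside iframe1 (the index 1 is an assumption about where iframe2 sits in the parent's frames collection):

// inside iframe1
const sibling = window.parent.frames[1];               // the other iframe's WindowProxy
sibling.postMessage('hello from iframe1', 'http://mysite.com');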
