Can I use a Python script instead of a Chrome extension? - python-3.x

I use a Chrome extension to download files from a site. Can I write a Python Selenium script that performs the same action as the extension, and is it difficult?

While this is possible, it's probably not great practice...
I am not that comfortable with Python or python-selenium.
However, since Selenium is a web driver, there is probably a way to accomplish this; search for the Python Selenium file download API.
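For what it's worth, here is a minimal sketch of how a Selenium download could look. It assumes Selenium 4 with Chrome, and that the file is reachable through a link with known visible text; the names chrome_download_prefs, wait_for_download and download_by_click are hypothetical helpers, not part of Selenium. It also assumes the download folder starts out empty.

```python
import time
from pathlib import Path


def chrome_download_prefs(download_dir):
    """Build Chrome preferences that save downloads to download_dir without a prompt."""
    return {
        "download.default_directory": str(Path(download_dir).resolve()),
        "download.prompt_for_download": False,
    }


def wait_for_download(download_dir, timeout=30.0, poll=0.5):
    """Return the first finished file in download_dir, or None on timeout.

    Assumption: the folder starts empty. Chrome keeps the .crdownload
    suffix on a file while the download is still in progress.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        finished = [p for p in Path(download_dir).iterdir()
                    if p.is_file() and p.suffix != ".crdownload"]
        if finished:
            return finished[0]
        time.sleep(poll)
    return None


def download_by_click(page_url, link_text, download_dir):
    """Open page_url in Chrome, click the link named link_text, wait for the file."""
    # Imported here so the helpers above work without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    Path(download_dir).mkdir(parents=True, exist_ok=True)
    options = webdriver.ChromeOptions()
    options.add_experimental_option("prefs", chrome_download_prefs(download_dir))
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(page_url)
        driver.find_element(By.LINK_TEXT, link_text).click()
        # Selenium gives no "download finished" signal, so poll the folder.
        return wait_for_download(download_dir)
    finally:
        driver.quit()
```

Something like download_by_click("https://example.com/files", "Download report", "downloads") would then return the path of the saved file, or None if nothing arrived in time. The URL and link text here are placeholders.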
Another approach would be to make two apps that communicate with each other: one made with HTML, CSS, JavaScript, PHP, etc. to find the name of a file on a webpage.
Then pass this to Python (once again, I'm not sure how to do this; it's probably documented somewhere). Once the file name and the website URL have been passed to Python, put them together with something like this (JavaScript variable joining, as an in-context example):
...
var example = pageurl_as_variable + "/" + file_name_as_variable;
some_download_function_defined_elsewhere(example);
...
Then download the file from the resulting URL...
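On the Python side, the same joining step could look roughly like the sketch below, using only the standard library. The function names build_file_url and download_file are made up for illustration, and the sketch assumes the file really lives directly under the page URL, which may not hold for every site.

```python
import shutil
import urllib.request
from urllib.parse import quote, urljoin


def build_file_url(page_url, file_name):
    """Join a page URL and a file name, like the JavaScript concatenation."""
    # urljoin replaces the last path segment unless the base ends with "/".
    base = page_url if page_url.endswith("/") else page_url + "/"
    # quote() percent-encodes spaces and other unsafe characters in the name.
    return urljoin(base, quote(file_name))


def download_file(url, destination):
    """Stream the response body at url into the local file destination."""
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
```

For example, build_file_url("https://example.com/files", "report 1.pdf") gives "https://example.com/files/report%201.pdf", which can then be handed to download_file.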
Once again, this isn't best practice, and you can't write a Chrome extension in Python for security reasons, but as described in the example above you could make a hybrid app and then run it on the target user's computer.
There are a lot of ways to do this with just basic HTML and potentially JavaScript, so you're better off looking into doing it that way...
Here's some links to point you in the right direction:
How can I download a file from a link in HTML?
How can I download a file on a click event using selenium?
(I couldn't find anything on passing messages between the apps in my five seconds of googling.)

Related

How to silently save small text file to local filesystem in Chrome packaged app or Chrome extension?

I have a Chrome extension that reads some info (small portions of text) from webpages using content scripts, then sends it to background script. Then background script must make that info available for other non-Chrome apps (in my current case it's a local Node.js app using Discord.js). And that must work silently and automatically.
After some research I've decided that the best way to share info is plain text files. Chrome saves them, Node reads them. So we come to the question in the title: is there any way to just save small text files somewhere on the local drive? If it necessarily requires a packaged app, I can create one and exchange messages with the extension, no problem.
I see no security hole here. I don't want any self-modifying code; I need only one isolated place on the drive where I will store files. Also, this extension is for my private use (my Discord channel automation).
Also, please tell me if the whole scheme I have in mind is impossible to implement. I don't want to get stuck in a dead end.
Thank you!
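To make the "Chrome saves them, another app reads them" idea concrete, the reading half might look like the sketch below. It is written in Python rather than Node.js purely for illustration, and it assumes the extension somehow manages to drop .txt files into one known folder, which is exactly the open part of the question. The name read_new_messages is hypothetical.

```python
from pathlib import Path


def read_new_messages(folder, seen):
    """Return {file name: text} for .txt files in folder not yet in seen.

    seen is a set of file names already processed; it is updated in place
    so repeated calls only report files that appeared since the last call.
    """
    messages = {}
    for path in sorted(Path(folder).glob("*.txt")):
        if path.name not in seen:
            messages[path.name] = path.read_text(encoding="utf-8")
            seen.add(path.name)
    return messages
```

The consuming app would call this on a timer with the same seen set and act on whatever comes back.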

Which languages do I need to learn to be able to program an extension

I want to program an extension for Google Chrome that runs in the background and extracts information from a site. If a specific value matches my search parameters, it should perform some automated actions and then go back to the information-extraction loop.
TL;DR
A language that can:
*automate web routines
*extract information
*run in the background
Thanks in advance.
Use CSS, JS and HTML, as you would to make a website.
Look at this webpage: https://developer.chrome.com/extensions/getstarted

Retrieve Google results without using the Custom Search API

Recently I've been working on an idea that requires me to query Google Images and retrieve links for images matching that search term. My most promising candidate for a usable Google Images API was the Google Web Search API, but it looks like it's going to be going out of service as of tomorrow:
https://developers.google.com/web-search/docs/
The API that replaced it is the Google Custom Search API, but it's a little discouraging to use:
Google API Custom Search with Python - Programmatic Search Results
100 search results a day is a very strict limit; that's just four searches per hour. I also don't want to go through the hassle of creating a custom search bar that I'm never going to use except through Python.
I decided to turn to parsing HTML directly from the results page. This presents a problem, though, because nowhere inside the page's HTML is there any direct link to the image, only referrer URLs. This is true of the javascript-enabled and javascript-disabled versions of Google Images (so even if Python spoofs javascript as enabled, nothing). I'm not sure where to go from here. Could anyone refer me to some obscure, updated library that I've somehow overlooked, or give me some pointers?
You could use Selenium Webdriver to actually execute the JavaScript and click on the images in the thumbnail view. Once an image has been opened, the link is in the DOM and you can scrape it from there. All Webdriver does is open an actual browser and simulate a user. You can even run it as a headless browser if you use xvfbwrapper. The downside is that even then, you will need all the dependencies of the browser you are using installed on your server.
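As a rough sketch of the "scrape it from the DOM" step: once the browser has rendered the page, its HTML can be fed to a small parser that collects image URLs. This uses only the standard library. The assumption here is that the HTML string comes from something like Selenium's driver.page_source, and that the full-size image shows up as a plain img tag with an http(s) src; the actual markup of any given results page may differ. ImageLinkCollector and extract_image_links are made-up names.

```python
from html.parser import HTMLParser


class ImageLinkCollector(HTMLParser):
    """Collect the src of every <img> tag that points at an http(s) URL."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            # Skip inline data: URIs and relative paths; keep real links.
            if src and src.startswith(("http://", "https://")):
                self.links.append(src)


def extract_image_links(page_source):
    """Return the absolute image URLs found in an HTML string."""
    collector = ImageLinkCollector()
    collector.feed(page_source)
    return collector.links
```

Calling extract_image_links(driver.page_source) after clicking a thumbnail would then return whatever direct image links are present at that moment.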
However, scraping Google is against their terms of service, and they will make an effort to block you as quickly as possible. So unless you get past the captchas (which are linked to sessions), you may not be able to make many searches this way before being blocked, either.

Get a friendly name for browser/computer

Is it possible to retrieve the computer name when developing a Chrome Extension, for example "Jenny-PC"?
At first glance I did not find the API, but maybe I missed something.
If you are quite the daredevil, you could try to extract that info from an NPAPI plugin. This is quite dangerous, as you can read on the Chrome extension site.
Not directly; for security reasons, extensions can't access OS services.
But, the hacker way, you may find some odd way to get what you're looking for.
If your extension has file:// permission, it can read system configuration files.
If you can get the user to drop a file containing the name you want onto some receiver in your extension's page, you can read it with the HTML5 FileReader object.
If you can get the user to download and execute a script you wrote (for example a .bat file on Windows), it can grab that name and send it to the extension in various ways:
- writing it in a file the extension can read
- executing something like
"c:\chrome install folder\chrome.exe" chrome://extensions/yourextensionkey/receiver.html?name=thenameyourellokingfor
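The "script writes the name to a file" option could be sketched like this in Python instead of a .bat file. It assumes Python is available on the user's machine and that the extension is able to read the target file; the function name write_computer_name and the file name used below are placeholders.

```python
import socket
from pathlib import Path


def write_computer_name(target_file):
    """Write this machine's host name to target_file and return the name."""
    name = socket.gethostname()
    Path(target_file).write_text(name, encoding="utf-8")
    return name
```

Running write_computer_name("computer_name.txt") leaves a one-line text file holding the machine's host name, which the extension would then have to pick up by one of the means described above.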
About file:// permission
The Chrome Web Store doesn't allow uploading or publishing extensions with this permission. But the extension works if you install it as a developer, or as a .crx file.
I'm not sure, but I think you can upload it to the Chrome Web Store if you modify it to ask for the permission.

What general approach can I take to parse the contents of a website?

Say someone else has a website generated by JavaScript, so I can't go look at the source and read what should be on the screen. How can I grab the text on the screen so I can feed it into another program? Also, how can I write a program that automatically clicks on radio buttons, links, etc. that satisfy certain criteria?
You can write a web scraping tool in Perl or Python. Or, you can use existing tools and frameworks to achieve that.
Check out Scrapy, an open-source tool written in Python.
Take a look at Selenium too.
To parse dynamic content, you could read the JavaScript source and fetch that same content the same way the webpage does (i.e. by replicating its AJAX calls and such).
If you want to submit data (not actually click on the elements) as if it were clicked/edited/selected, you could also send a request containing the data the server expects by using an HTTP library such as cURL. See an example here.
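A minimal sketch of that idea with Python's standard library instead of cURL: build the same form-encoded POST the browser would send, then submit it. The URL and field names are invented for the example; in practice you would copy them from the page's form or from the network tab of the browser's developer tools. build_form_request and submit_form are hypothetical helper names.

```python
import urllib.request
from urllib.parse import urlencode


def build_form_request(url, fields):
    """Build a POST request carrying fields as a URL-encoded form body."""
    body = urlencode(fields).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def submit_form(url, fields):
    """Send the form and return the server's response body as text."""
    with urllib.request.urlopen(build_form_request(url, fields)) as response:
        return response.read().decode("utf-8", errors="replace")
```

For instance, submit_form("https://example.com/vote", {"choice": "b"}) would post choice=b just as selecting a radio button and submitting the form would, assuming the server needs no cookies or hidden tokens.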
If you need to handle content generated by script, then your first problem is to cause the script to execute. Further, the script will want to generate the content into a DOM. That means you need to have a DOM, and a script engine, and probably HTTP access to the Internet, and XML handling, etc.
If that sounds a lot like a web browser, then you're listening.
What you basically need is a web browser that you can control from a program. You'll need to be able to tell it to browse to a page, click buttons and links, etc., then you'll need to read back the resulting DOM.
Only then will you need to parse the page.
If you're in the Microsoft world, then you can use the WebBrowser control. There are several forms of this, and they all amount to the same thing: you can have Internet Explorer run inside of your program, and your program can control it.
I understand there are other browsers that can be controlled from a program, but since I don't know their details, I'll wait for someone else to tell us both.
