Can I use parsing code in my Chrome extension? It parses the Google Images search results and displays them to the user.
Is it legal?
With Google search you have the following situation:
If the search is opened in a regular tab, nothing stops you from modifying / parsing the result with a content script. Nothing stops you from opening the tab for your user either.
Then again, I'm not a lawyer, treat this advice accordingly.
Google will actively prevent you from querying it in a context other than a regular page, for example in an iframe or through XHR. Trying to circumvent that is a ToS violation and will generally cause Google to block queries from the machine / subnet.
There are legal, but paid, options to query Google search.
Related
I am developing an Opera extension. At some point it lets the user perform a search. Currently I use chrome.tabs.update with { url: `https://google.com/search?q=${encodeURIComponent(query)}` }. This is not user-friendly. A better approach would be to maintain a user-editable list of URLs like https://google.com/search?q= or https://ca.search.yahoo.com/search?p= in the extension's options page and let the user choose between them, but I really don't want to reinvent the wheel.
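The user-editable list described above can be sketched as plain data plus a small helper. This is only an illustration of the approach, not an extension API; the template strings and the buildSearchUrl name are made up for the example:

```javascript
// Hypothetical user-editable list of search URL templates, as the
// question describes; in an extension this would live in storage and
// be edited on the options page.
const searchEngines = {
  google: 'https://google.com/search?q=',
  yahoo: 'https://ca.search.yahoo.com/search?p=',
};

// Build the final search URL for a chosen engine.
function buildSearchUrl(engine, query) {
  const prefix = searchEngines[engine];
  if (!prefix) throw new Error(`Unknown search engine: ${engine}`);
  return prefix + encodeURIComponent(query);
}

// In the extension, the chosen engine would come from the options page:
// chrome.tabs.update({ url: buildSearchUrl(chosenEngine, query) });
```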
Chromium-based browsers all have user-configurable search engines/providers, and I want to allow the user to choose between them or just use the default. To make that possible, I basically need two API functions:
a function to get the list of all configured search engines;
a function to get the default engine.
I didn't find anything similar to my problem in the API docs.
BTW, Opera also has an undocumented chrome.search function (Chromium has none), but it always searches via Google regardless of the user's default search engine setting.
Thanks.
PS. If there is a more appropriate place to ask this question, please tell me.
I didn't find anything similar to my problem in the API docs.
Because there isn't one, unfortunately.
There is a relevant old feature request with no real activity.
How can I get a list of URLs from a Google.com search and receive them in a TMemo, without using TWebBrowser? I don't mean the raw HTML source, as returned by e.g. IdHTTP.Get('http://www.google.com/search?q=blah+blah+blah'), but only the URLs of the search results as text/strings.
Thanks in advance.
Don't use Google's HTML-based website frontend. It is meant for web browsers and user interaction. Use Google's developer APIs instead, such as its Custom Search API.
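As a rough sketch, querying the Custom Search JSON API instead of scraping HTML looks like this. The API key and search engine ID are placeholders you obtain from Google's developer console; only the endpoint and the key/cx/q parameter names come from the API itself:

```javascript
// Build a request URL for Google's Custom Search JSON API.
// apiKey and cx (search engine ID) are credentials the caller supplies.
function customSearchUrl(apiKey, cx, query) {
  const params = new URLSearchParams({ key: apiKey, cx, q: query });
  return `https://www.googleapis.com/customsearch/v1?${params}`;
}

// Example usage (requires a real key and engine ID):
// fetch(customSearchUrl(API_KEY, SEARCH_ENGINE_ID, 'blah blah blah'))
//   .then(res => res.json())
//   .then(data => data.items.forEach(item => console.log(item.link)));
```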
I have run the same search on Google in different browsers, but they return different sets of results. Is this because of the web history or some other reason?
Google search results depend on several parameters, in order to offer the user the best possible answers.
Various parameters can influence the returned results:
- whether you're logged in or not (see the top bar)
- whether your browser stores your search preferences (Chrome is very well aware of your interests, while IE sends very little data to the Google search engine)
These two parameters are the main ones that can cause differences between results. I've never heard of the history of visited pages being sent directly.
I have this songs site, and whatever data it has is also being displayed on another site.
Even if I echo "hello", the same thing appears on the other site. Does anybody know how I can prevent that?
Digging a bit deeper, I found out that the other site is using file_get_contents(). How can I prevent them from doing that?
Well, you can try to determine their IP address and block it.
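A sketch of that idea, shown in Node purely for illustration (on a PHP site the equivalent check would go in a shared include). The blocked address is a TEST-NET placeholder; the empty-User-Agent heuristic relies on the fact that file_get_contents() sends no User-Agent header by default:

```javascript
// Hypothetical request filter: block a known scraper by IP, and treat a
// missing User-Agent header as a script rather than a browser.
const BLOCKED_IPS = new Set(['203.0.113.42']); // placeholder address

function shouldBlock(ip, userAgent) {
  // file_get_contents() sends no User-Agent by default, so an empty UA
  // is a strong hint that a script, not a browser, is asking.
  return BLOCKED_IPS.has(ip) || !userAgent;
}

// In a real server this check would run before serving content, e.g.:
// if (shouldBlock(req.socket.remoteAddress, req.headers['user-agent'])) {
//   res.statusCode = 403; res.end(); return;
// }
```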
You said file_get_contents was being used.
A URL can be used as a filename with this function if the fopen wrappers have been enabled. See fopen() for more details on how to specify the filename. See the Supported Protocols and Wrappers for links to information about what abilities the various wrappers have, notes on their usage, and information on any predefined variables they may provide.
To disable them, more information is at http://www.php.net/manual/en/filesystem.configuration.php#ini.allow-url-fopen
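For reference, the setting looks like this in php.ini. Note that it takes effect on the server where PHP runs, so it stops scripts hosted *there* from fetching remote URLs:

```ini
; php.ini - disable URL wrappers, so file_get_contents('http://...')
; and similar calls fail on this server.
allow_url_fopen = Off
```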
Edit: If they switch to cURL or an equivalent after this, try to mess with their script by changing the HTML layout, etc. If that doesn't help, try to locate the IP of the script host and return nonsense to it ;)
Edit 2: If they use an iframe, use JavaScript to redirect on iframe detection.
Or you can even generate rubbish information just for that crawler, to mess up the "clone" site.
The first question to be answered is: Have you identified the crawler getting the information from your site?
If so, then you can serve anything you want to that process: nothing (ignore / block it), a message telling the owners to stop taking your information, rubbish content, ...
Anyway, the first step is doing things properly. Make sure your site has a "robots.txt" stating the accepted policy for crawlers.
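A minimal robots.txt at the site root might look like this (the disallowed path is illustrative). Keep in mind it only binds well-behaved crawlers; a copier that ignores it must still be blocked server-side:

```
# Allow well-behaved crawlers everywhere except a private area.
User-agent: *
Disallow: /private/
```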
I've noticed that a lot of the time when I search something on Google, Google automatically uses the search function of relevant websites and returns the results of the website's search as if they were just another URL.
How do I let Google and other search engines know which is the search box on my own website, and does OpenSearch have anything to do with it?
Do you maybe mean the site search function via the Google Chrome omnibar?
To get there you just need to have:
- a form with method GET
- a text input element
- a submit button
on the root page of your domain.
If users go directly to your root page and search something there, Google learns of this form and adds it to the search engines accessible via the omnibar (the Google Chrome address bar).
Did you mean this?
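The ingredients listed above amount to something like this on the root page (the action URL and the parameter name "q" are illustrative):

```html
<form method="get" action="/search">
  <input type="text" name="q">
  <button type="submit">Search</button>
</form>
```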
Google doesn't use anyone's search forms - it just finds links to search results. You need to:
Use GET for your search parameters to make this possible
Create links to common/useful search-results pages
Make sure Google finds those links
Google makes it look like just another URL because that is exactly what it is.
Most of the time, though, Google will do a better job than your own search engine, so doing this could actually lower the quality of results from your site...
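As for the OpenSearch part of the question: OpenSearch doesn't make Google crawl your results, but it does let browsers (including Chrome's omnibar) discover your site search. A minimal description document, with illustrative URLs, referenced from your pages via <link rel="search" type="application/opensearchdescription+xml" href="/opensearch.xml">:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <ShortName>Example Search</ShortName>
  <Description>Search example.com</Description>
  <Url type="text/html" template="https://example.com/search?q={searchTerms}"/>
</OpenSearchDescription>
```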
I don't think it does. It's impossible to spider sites in real time.
It's just an SEO technique some sites use to improve their ranking by spamming Google with fake results. They feed the Google bot an endless stream of links to bogus pages:
http://en.wikipedia.org/wiki/Spamdexing