I want to be able to search Google for sites with certain response codes, or cookies that they set. Just as I'm able to use operators like site: and inurl:, I was wondering if there is any way to search sites by response cookie, e.g. cookie:cookiename.
Is there any search engine that does this?
I'm receiving multiple 404-causing URL requests. They contain keywords like plugins=wsm.
Ex.
/2EF8E3DF-30F3-4066-90C5-69DE2633DD99EqSdKyV8ca4PLF8Famfi6_o98Ea31pQxIm216UCLQF34JsE6qTZfDR5jolZzWb2EJSobMapS8zKpZI2zR3CO5w/init?url=https://www.example.com/page1&plugins=wsm&data={"data":[{"plugin":"wsm","parameters":"{\"referrer\":\"https://www.google.com/\"}"}]}&isTopLevel=true&nocache=1bdb3
and many more like these. The URL-encoded part at the start is mostly the same for all the URLs.
Is it due to some tool called WSM downloader?
Is the user trying to download media from my website?
If yes, then suppose I want to ban such users: I block any request to URLs containing keywords like plugin, wsm, etc. Is this check OK, or could it affect genuine users?
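In Python terms, the check I have in mind would be something like this (the keyword and matching rule are just a sketch, not a finished filter):

```python
from urllib.parse import urlparse, parse_qs

def is_suspicious(url: str) -> bool:
    """Return True if the request URL matches the downloader pattern."""
    parsed = urlparse(url)
    qs = parse_qs(parsed.query)
    # Match only the specific plugins=wsm query parameter rather than the
    # bare substring "wsm", to reduce the risk of blocking genuine users
    # whose URLs merely happen to contain similar text.
    return "wsm" in qs.get("plugins", [])

print(is_suspicious("/init?url=https://www.example.com/page1&plugins=wsm"))  # True
print(is_suspicious("/blog/wsm-review"))  # False
```

Matching the exact query parameter instead of any URL containing "wsm" is what keeps a genuine page like /blog/wsm-review from being blocked.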
Thanks.
I want to get or buy Google search results content (structured) from Google itself or any other source that can sell Google data legally. I want all results for a specific keyword over the recent 6 months, for example.
For this stage it would already be good enough to get just the page content as raw text.
Automated reading out / scraping of Google SERPs is against Google's ToS. From this point of view there is no one who sells such data legally - any seller violates Google's ToS.
There are many offers on the market where you can get SERP data as JSON or full HTML through API access - just google for it.
The way every seller does SERP scraping is always the same - you can do it on your own. Run many proxies with IP addresses in the countries whose SERPs you need, and query Google with a kind of headless browser. Use captcha-solving services to get data even if an IP gets banned. Multithread your queries to get more data at once. That's the whole magic.
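As a rough sketch of that pipeline in Python (the proxy hostnames are placeholders, and the actual headless-browser fetch and captcha solving are only indicated in comments):

```python
from itertools import cycle
from urllib.parse import urlencode

# Hypothetical proxy pool; a real scraper rotates IPs in the target countries.
PROXIES = cycle([
    "http://us-proxy-1.example.com:8080",
    "http://us-proxy-2.example.com:8080",
])

def build_serp_url(query: str, country: str = "us", start: int = 0) -> str:
    """Build a Google SERP URL for one result page (10 results per page)."""
    params = {"q": query, "gl": country, "start": start}
    return "https://www.google.com/search?" + urlencode(params)

# One URL per result page; a real setup would fetch each one through
# next(PROXIES) with a headless browser, handing any captcha off to a
# solving service, and run the fetches in parallel threads.
urls = [build_serp_url("cheap flights", start=page * 10) for page in range(3)]
for u in urls:
    print(u)
```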
How can I get a lot of URLs from a Google.com search and receive them in a TMemo, without using TWebBrowser? I don't mean the HTML source code, even via code like IdHTTP.Get('http://www.google.com/search?q=blah+blah+blah'); I only want the result URLs as text/strings from the Google search results.
Thanks in advance.
Don't use Google's HTML-based website frontend. It is meant for web browsers and user interactions. Use Google's developer APIs instead, like its Custom Search API.
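As a hedged sketch, here is how the Custom Search JSON API could be queried to get just the result URLs (shown in Python; the API key and engine ID are placeholders, and the same HTTPS request could be issued from Delphi with IdHTTP and the lines added to a TMemo):

```python
import json
import urllib.parse
import urllib.request

API_KEY = "YOUR_API_KEY"   # placeholder: key from the Google Cloud console
CX = "YOUR_ENGINE_ID"      # placeholder: Programmable Search Engine ID

def build_request_url(query: str) -> str:
    """Build a Custom Search JSON API request URL."""
    params = urllib.parse.urlencode({"key": API_KEY, "cx": CX, "q": query})
    return "https://www.googleapis.com/customsearch/v1?" + params

def search_links(query: str) -> list[str]:
    """Return only the result URLs, as plain strings."""
    with urllib.request.urlopen(build_request_url(query)) as resp:
        data = json.load(resp)
    return [item["link"] for item in data.get("items", [])]

# Example (requires valid credentials):
# for link in search_links("blah blah blah"):
#     print(link)
```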
I have a browser extension which is scraping the threadId from the URL when the user is reading an email in Gmail, and is using this threadId to fetch circumstantial data using the Google Apps Script API.
The extension does, however, not know which of maybe several Google accounts is reading this message; it knows only the URL of my Apps Script web app and the threadId. So when it executes the fetch, the web app will interpret the request as coming from the default user session, which in some cases is wrong and will thus result in null when executing GmailApp.getThreadById(e.parameter.threadId).
So what I am wondering is whether it is possible to specify what Google account to use when querying the webapp. Are there any possibilities other than asking the user to log off all other accounts and set the current one as default?
Unfortunately Google Apps Script does not have good support for multiple logins. See this page for more information.
You can add an authuser parameter to the requests you make to your Google Apps Script.
The authuser parameter's value is a zero-based index over all the Google accounts that are logged in to the current browser session.
To work out which index value you need to send, you can scrape the current page for profiles.google.com links that carry an authuser parameter, extract your value from them, and send it with your requests.
The link might look like this:
https://profiles.google.com/ ... authuser=0
Specifically for Gmail, the URL also contains the current authuser index. For example:
https://mail.google.com/mail/u/1/#inbox
The URL above contains the authuser value 1 (after /u/ and before the #inbox fragment).
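Extracting that index is a one-line pattern match. A sketch in Python (in the extension itself the same regex would run in JavaScript against the page URL):

```python
import re

def authuser_from_gmail_url(url: str):
    """Extract the authuser index from a Gmail URL like .../mail/u/1/#inbox."""
    match = re.search(r"/mail/u/(\d+)/", url)
    return int(match.group(1)) if match else None

print(authuser_from_gmail_url("https://mail.google.com/mail/u/1/#inbox"))  # 1
print(authuser_from_gmail_url("https://mail.google.com/mail/u/0/#inbox"))  # 0
```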
I know this is very complex and looks more like a hack, but I think it is a workable workaround until Google provides a better way of specifying the context for your Apps Script requests.
I hope this might be helpful.
Thanks.
I've noticed that a lot of the time when I search something on Google, Google automatically uses the search function of relevant websites and returns the website's search results as if they were just more URLs.
How do I let Google and other search engines know which form is the search box on my own website, and does OpenSearch have anything to do with it?
Do you maybe mean the site search function via the Google Chrome omnibar?
To get there you just need to have, on the root page of your domain, a form with:
method GET
a text input element
a submit button
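A minimal form satisfying those points might look like this (the action path and parameter name are placeholders, not anything Google requires):

```html
<!-- On the root page of your domain -->
<form method="get" action="/search">
  <input type="text" name="q">
  <button type="submit">Search</button>
</form>
```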
If users go directly to your root page and search something there, Google learns of this form and adds it to the search engines accessible via the omnibar (the Google Chrome address bar).
Did you mean this?
Google doesn't use anyone's search forms - it just finds links to search results. You need to:
Use GET for your search parameters to make this possible
Create links to common/useful search results pages
Make sure Google finds those links
Google makes it look like just another URL because that is exactly what it is.
Most of the time though Google will do a better job than your search engine so actually doing this could lower the quality of results from your site...
I don't think it does. It's impossible to spider sites in real time.
It's just an SEO technique some sites use to improve their ranking by spamming Google with fake results. They feed the Googlebot an endless stream of links to bogus pages:
http://en.wikipedia.org/wiki/Spamdexing