Search news by keyword using Google CSE

I want to retrieve "Google News" results through the Google Custom Search Engine (CSE) API, filtered by location/country and keyword.
I tried setting up a CSE that only searches the site "news.google.com", but it only returns old news article clippings, and I am not sure how to get the recent articles. I also noticed that restricting results to the schema type NewsArticle is not accurate, because not all news sites mark up their pages with this schema type.
I know there is a workaround that uses RSS feeds to get Google News, and it returns the results I need. Example: https://news.google.com/news/feeds?hl=en&q=corruption&ie=utf-8&num=10&output=rss . However, I am afraid this is not a correct or recommended approach.
I would appreciate it if anyone could advise on the right CSE settings or another recommended solution.
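For reference, the RSS workaround mentioned in the question can be sketched roughly as follows. This is only a minimal illustration using the Python standard library: the feed URL and its parameters are copied from the example above, the helper names build_feed_url and parse_rss_items are made up for this sketch, and it assumes the feed returns ordinary RSS 2.0 with item/title/link/pubDate elements. Whether that feed URL is still served is not something the sketch can guarantee.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Base URL taken from the example in the question (may no longer be served).
FEED_BASE = "https://news.google.com/news/feeds"


def build_feed_url(keyword, lang="en", num=10):
    """Build the RSS feed URL for a keyword, mirroring the question's example."""
    params = {"hl": lang, "q": keyword, "ie": "utf-8", "num": num, "output": "rss"}
    return FEED_BASE + "?" + urlencode(params)


def parse_rss_items(rss_text):
    """Extract title, link and publication date from an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        })
    return items
```

To use it, fetch build_feed_url("corruption") with any HTTP client (for example urllib.request.urlopen) and pass the response body to parse_rss_items.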

Related

Bing search API: specify filetype and site/domain?

Having had the luxury of having my fun and earning my bread well away from MS products for years, I am today trying to search programmatically with Bing on Azure (wearing gloves), basically because I thought getting a Google API key was complex. So I headed over to the Data Market and issued this request (say, with Perl's LWP, which was used to pass the credentials):
https://api.datamarket.azure.com/Bing/Search/v1/Composite?%24skip=0&%24top=10&%24format=json&Sources=%27web%27&Query=%27abc%27
which works.
What I am now trying to find out is
1) how to tell the Bing search API to restrict results to a specific domain (e.g. ".org", or even a single website such as "www.wikipedia.org").
2) how to tell the search engine to restrict results to a specific filetype, e.g. 'PDF' or 'XML' (or PDF and XML, if that's possible).
3) whether there is a simple list of the features/keywords accepted in the GET request of the latest Bing search API. No MS links, if you please; I am really tired.
I have seen "site:.org" work on the Bing search website when searching manually, and I have read that "filetype:pdf" works too.
Any hints?
bliako
cracked it:
... Query=%27abc site:.com filetype:pdf%27
at the point when m$ realises it costs to be clumsy
bliako
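The fix above amounts to putting the "site:" and "filetype:" operators inside the quoted Query value. A minimal sketch of building such a URL in Python (rather than Perl) could look like this. The endpoint and the parameter names are copied from the request in the question; build_bing_url is a hypothetical helper, the percent-encoding of spaces as %20 is an assumption about what the service accepts, and authentication is left out entirely.

```python
from urllib.parse import urlencode, quote

# Endpoint copied from the request in the question.
BASE = "https://api.datamarket.azure.com/Bing/Search/v1/Composite"


def build_bing_url(terms, site=None, filetype=None, skip=0, top=10):
    """Build the request URL, appending site:/filetype: operators to the query."""
    parts = [terms]
    if site:
        parts.append("site:" + site)
    if filetype:
        parts.append("filetype:" + filetype)
    query = " ".join(parts)
    params = {
        "$skip": skip,
        "$top": top,
        "$format": "json",
        "Sources": "'web'",            # string values are wrapped in single quotes
        "Query": "'" + query + "'",
    }
    # quote_via=quote encodes spaces as %20 instead of '+'.
    return BASE + "?" + urlencode(params, quote_via=quote)
```

Calling build_bing_url("abc") reproduces the URL from the question, and build_bing_url("abc", site=".com", filetype="pdf") yields the encoded form of the query in the answer.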

Custom field info. in Search Engines

Can I show custom field information on search engine result pages?
For example, when I type the keyword "google" into Google Search, along with the regular search results I also get more information such as CEO, Founded, Headquarters, Founder, etc. in the right sidebar.
I wondered whether I could get my company's information to show up like that too.
Any help would be greatly appreciated.
Yes, if your website is accessible to Google's indexing bots and your page(s) contain Structured Data Markup providing such info, more specifically by customizing the Knowledge Graph for your company. This is provided, of course, that Google's indexing service considers the info relevant enough to show up in search results.
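As an illustration of what such structured data markup can look like, here is a minimal sketch that generates a schema.org Organization description as a JSON-LD script tag. The property names (name, url, founder, foundingDate, address) come from the schema.org vocabulary; the helper organization_jsonld and the sample company values are invented for this example, and nothing here guarantees that Google will actually display the information.

```python
import json


def organization_jsonld(name, url, founder=None, founding_date=None, locality=None):
    """Return a <script> tag holding schema.org Organization data as JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
    }
    if founder:
        data["founder"] = {"@type": "Person", "name": founder}
    if founding_date:
        data["foundingDate"] = founding_date
    if locality:
        data["address"] = {"@type": "PostalAddress", "addressLocality": locality}
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    # Hypothetical company details, for illustration only.
    print(organization_jsonld("Example Corp", "https://www.example.com",
                              founder="Jane Doe", founding_date="2010-05-01",
                              locality="Springfield"))
```

The resulting tag would go into the HTML of the company's home page.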

Filter Search and Free text search using Foursquare api

We are currently developing a food/restaurant search on our website using Foursquare API.
We have hit an issue with free text search. If I want to search for a specific restaurant/food venue, e.g. "Lucilda Pizzeria", will the API allow me to do so?
Can we use the Food category in the category tree (https://developer.foursquare.com/categorytree) to let people filter the venues? E.g. "Minnesota" - "Bagel Shop".
I hope someone can clear up these questions for me.
Thanks to anyone who will get back to me with an answer.
Take a look at the getting started guide to search: https://developer.foursquare.com/start, then read about the search and explore endpoints. In your use case, I would recommend making an explore API request with the intent=food parameter passed in.
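To make the suggestion a bit more concrete, here is a rough sketch of building an explore request URL. Several things in it are assumptions rather than facts from this thread: the endpoint path https://api.foursquare.com/v2/venues/explore, the parameter names near, query, client_id, client_secret and v, and the placeholder credentials and version date. The intent=food parameter is passed exactly as the answer states it. Please check every parameter against the endpoint documentation before relying on this.

```python
from urllib.parse import urlencode

# Assumed endpoint path; verify against the Foursquare documentation.
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(near, query=None, extra=None,
                      client_id="CLIENT_ID", client_secret="CLIENT_SECRET",
                      version="20140101"):
    """Build an explore request URL; all parameter names are assumptions."""
    params = {
        "near": near,                    # place name, e.g. "Minnesota"
        "client_id": client_id,          # placeholder credentials
        "client_secret": client_secret,
        "v": version,                    # placeholder API version date
    }
    if query:
        params["query"] = query          # free text, e.g. a venue name
    if extra:
        params.update(extra)             # e.g. {"intent": "food"} per the answer
    return EXPLORE_URL + "?" + urlencode(params)
```

For the use case in the question, something like build_explore_url("Minnesota", query="Lucilda Pizzeria", extra={"intent": "food"}) would produce the URL to request.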

How to get the file excerpts for search using box api?

http://developers.box.com/docs/#search
This API returns only the files/folders that match the search query. How do I show search excerpts?
Should I integrate Solr/Lucene for search?
EDIT:
I mean excerpts from the content of the files/documents, i.e. the search snippets you see in Google.
Example:
http://www.bestrank.com/files/uploads/39/image/anatomy-of-a-search-engine-snippet.png
The description in this case.
The Box API currently does not provide this in the search response, but we're looking at adding it sometime in the future.

Definitive method to exclude page sections from main search engines

I have quite a few constant parts of pages that I'd like to exclude from search results, so that they do not obscure the unique content of each page.
I read that class="nocontent" will perform this action for Google. But what about the other main search engines like Yahoo and Bing? Is there a globally accepted solution for this, or is there an additional step to get them to do the same?
Thank you for any assistance.
Google doesn't offer such a feature for the general search. The class nocontent is only for Google Custom Search. The comments googleon/googleoff are only for Google Search Appliance.
Yahoo! introduced the class robots-nocontent in 2007. Google doesn't support it.
There is a microformats draft, but it probably has no support.
Despite that, there are some "hacks" that could accomplish what you need, but I wouldn't count on or use them. For example: inserting content with JS, or embedding content in an iframe (and blocking the source URL in robots.txt).
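For the iframe hack, the robots.txt part can at least be checked locally. The sketch below uses Python's standard urllib.robotparser to confirm that a given iframe source URL is disallowed by a robots.txt body. The /shared-blocks/ path, the example.com URLs and the is_blocked helper are made up for illustration, and this only tells you what a well-behaved crawler is asked to do, not how any search engine actually treats the embedded content.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the directory holding iframe sources.
ROBOTS_TXT = """User-agent: *
Disallow: /shared-blocks/
"""


def is_blocked(robots_txt, url, user_agent="*"):
    """Return True if robots_txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_blocked(ROBOTS_TXT, "https://www.example.com/shared-blocks/sidebar.html"))
    print(is_blocked(ROBOTS_TXT, "https://www.example.com/articles/1.html"))
```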
