Google Custom Search not indexing Dynamic Pages

I am trying to use Google Custom Search to provide search capabilities to an informational site.
About the site:
Content is generated dynamically
URL access to content is search-engine friendly (e.g. site.com/Info/3/4/45)
Sitemap (based on the RSS feed) submitted and accepted by Webmaster Tools; it notes that no pages were indexed
Annotations successfully submitted based on the RSS feed
Problem:
There are no results for any keywords that appear on the pages that were submitted.
Questions:
Why is Google not indexing the submitted pages?
What could I be doing wrong?

Custom Search with basic settings is essentially the same thing as a standard search with site:your.website. Does a standard search give you the expected results?
Note that Google doesn't index pages immediately; it takes some time. Check whether your site is already indexed.
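One quick way to check is to run a site: query through the Custom Search JSON API (or just in normal Google) and see what comes back for your URLs. A minimal sketch, assuming you have created an API key and a search engine ID in the CSE control panel (both are placeholders below):

```python
# Minimal sketch: see what the engine can find for a site: query via the
# Custom Search JSON API. API_KEY and ENGINE_ID are placeholders you would
# create in the Google developer console / CSE control panel.
import requests

API_KEY = "your-api-key"    # placeholder
ENGINE_ID = "your-cse-id"   # placeholder

def indexed_results(query):
    resp = requests.get(
        "https://www.googleapis.com/customsearch/v1",
        params={"key": API_KEY, "cx": ENGINE_ID, "q": query},
        timeout=10,
    )
    resp.raise_for_status()
    data = resp.json()
    # totalResults is reported as a string inside searchInformation
    total = data.get("searchInformation", {}).get("totalResults", "0")
    return total, [item["link"] for item in data.get("items", [])]

total, links = indexed_results("site:site.com/Info")
print(total, links)
```

If that returns nothing for pages you know exist, the problem is indexing rather than the Custom Search configuration.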

Yeah, it took about two weeks for Google to pick up all my pages after I submitted a sitemap, but you should see a few pages indexed after a couple of days.

Related

Liferay search results

I am currently involved in a project where we are using Liferay (6.1 GA2).
It seems that Liferay search results provide links to Web Content Fragments instead of to the pages containing them.
Have any of you gone through this issue? Do you know how to solve it?
Thanks a lot pals.
Best, Alberto
You can have a lot more content in the backend than is actually displayed on any page. Furthermore, you can display any article on multiple pages at once.
A way to work around this is to specify in the "Web Content Search" portlet that you're only interested in content that is actually published. However, this does not solve your second problem: the content can still be published on many different pages.
Every piece of content can have a "Display Page" - the setup of such a display page is well explained in the UI (see the Web Content Editor) - so that you'll actually get a proper page in the search results.
If you want to search for pages only instead of content (you might miss out on some metadata), I'd recommend going with a spider solution that crawls your website, indexes the pages independently of their building blocks (articles), and searches that external index, as sketched below.
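As a rough illustration of that external-spider idea (the start URL and same-host rule are assumptions; a real setup would feed a proper search engine rather than this toy in-memory index):

```python
# Rough sketch of the "external spider" approach: crawl the rendered pages
# and build a tiny inverted index keyed by page URL, independent of which
# web-content articles make up each page.
from collections import defaultdict
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

START_URL = "https://your-liferay-site.example/"   # placeholder
index = defaultdict(set)                            # word -> {page URLs}

def crawl(url, seen, limit=50):
    if url in seen or len(seen) >= limit:
        return
    seen.add(url)
    page = requests.get(url, timeout=10)
    soup = BeautifulSoup(page.text, "html.parser")
    for word in soup.get_text(" ").lower().split():
        index[word].add(url)
    for a in soup.find_all("a", href=True):
        link = urljoin(url, a["href"])
        # stay on the same host so the toy crawler doesn't wander off-site
        if urlparse(link).netloc == urlparse(START_URL).netloc:
            crawl(link, seen, limit)

crawl(START_URL, set())
print(sorted(index.get("liferay", [])))
```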

How to find the most visited pages in a site collection using SharePoint 2010

I want to find the 3 most viewed pages in the entire site collection and display their page titles. Using the Web Analytics web part, I got the URLs of the top 10 pages, but I want the page title, not the URL. Is there any way to do that?
Regards,
Raji
This post (http://auditlogsp.codeplex.com/) can give you a reference. The solution can help you log SharePoint events, including the 'Viewing' action. You can store that information in a separate database for analyzing the "top" reports.
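Once the view events are in your own database, the "top 3 by title" report is a simple aggregation. A sketch, assuming a table you would define yourself that records the page URL and title per view (the table and column names below are made up for illustration):

```python
# Sketch only: aggregate logged page-view events into a top-3-by-title
# report. The page_views table is a hypothetical store you would populate
# from the audit-log solution linked above.
import sqlite3

conn = sqlite3.connect("pageviews.db")
conn.execute(
    """CREATE TABLE IF NOT EXISTS page_views (
           url       TEXT NOT NULL,
           title     TEXT NOT NULL,
           viewed_at TEXT NOT NULL
       )"""
)

top_three = conn.execute(
    """SELECT title, COUNT(*) AS views
         FROM page_views
        GROUP BY url, title
        ORDER BY views DESC
        LIMIT 3"""
).fetchall()

for title, views in top_three:
    print(f"{title}: {views} views")
```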

Microsoft SharePoint Search - Ignore sections of the page

I am using Microsoft SharePoint Search (MOSS) to search all pages on a website.
My problem is that when you search for a word that appears in the header, footer, menu or tag cloud section of the website, that word will appear on every page, so the search server will bring you a list of results for that search term: every page on the website.
Ideally I want to tell the search server to ignore certain HTML sections in its search index.
This website seems to describe my problem, and a guy there says, "Why not hide those sections of your website if the User Agent is the search server?"
The problem with that approach is that most of the sections I would hide contain links to other pages (menus and tag clouds), so the crawler would hit a dead end and wouldn't crawl very far.
Anyone got any suggestions on how to solve this problem?
I'm not sure if I'm reading this correctly. You DON'T want Search to include parts of your site in the index, but you DO want it to go into those sections and follow any links in them?
I think the best way is indeed to exclude those sections based on user agent (i.e. put them in a user control and don't render the section if the user agent is MS Search).
Seeing as these sections would be the same on every page, it's okay to exclude them when the search crawler comes by.
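The real check would live in your ASP.NET user control, but as a rough Python sketch of the idea (the "MS Search" user-agent substring is an assumption; check your IIS/crawl logs for the crawler's actual user agent):

```python
# Generic sketch of user-agent-conditional rendering: skip the repeated
# chrome (header, footer, menu, tag cloud) when the search crawler asks.
def render_page(user_agent: str) -> str:
    is_crawler = "MS Search" in user_agent   # assumed UA substring; verify in logs
    parts = ["<main>page content...</main>"]
    if not is_crawler:
        parts.insert(0, "<header>...navigation...</header>")
        parts.append("<footer>...tag cloud...</footer>")
    return "\n".join(parts)

print(render_page("MS Search 6.0 Robot"))
print(render_page("Mozilla/5.0"))
```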
Just create ONE page (i.e. a sitemap :-D) that does include all the links a normal user would see in the footer / header / etc. The crawler could then use that page to follow links deeper into your site. This would be a performance boost as well, seeing as the crawler only encounters the links once instead of on every page.

SharePoint search of external RSS feeds

I want my SharePoint site to allow a user to search content in a known collection of RSS feeds. Conceptually, I can see a few ways to do this:
Crawl the feeds at their source (yikes!)
Pull the full articles into my SharePoint site, then let my crawler crawl them
Make use of an existing index (like Google)
Search the full articles, on demand, using something like a Google utility (my preference)
So can I somehow, from my SharePoint site, allow a user to search the full articles from a couple dozen named RSS feeds?
thanks
Cary
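For the second option above (pulling the full articles in so the local crawler can index them), a rough sketch of the fetch step might look like this; the feed URLs are placeholders, and where you store the articles (a list, a document library, a folder) is up to you:

```python
# Sketch: pull articles from a known set of feeds so they can be stored
# locally and indexed by the existing crawler. Feed URLs are placeholders.
import feedparser
import requests

FEEDS = [
    "https://example.com/feed1.rss",   # placeholder feed URLs
    "https://example.com/feed2.rss",
]

def pull_articles():
    for feed_url in FEEDS:
        feed = feedparser.parse(feed_url)
        for entry in feed.entries:
            # entry.link points at the full article; fetching it is the step
            # with the copyright and robots.txt implications discussed below.
            article_html = requests.get(entry.link, timeout=10).text
            yield entry.title, entry.link, article_html

for title, link, html in pull_articles():
    print(title, link, len(html))
```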
I don't see why there is a problem with crawling the feeds at their source; that seems reasonable.
It is fairly easy to create a content source that points at the feed and to select a suitable indexing schedule. If that does not work, you can try a more complicated approach.
Be aware that copying the content of another website to host on your own could have copyright implications (not to mention the risk that any inflammatory content would appear to be published on your own site).
--update--
Try reading the target site's robots.txt (if it even has one) to see whether it specifies a desired crawl frequency. Otherwise, it depends on the depth of the site you would be crawling.
If you are crawling just the RSS feed XML, I suspect you could do that every hour without annoying anyone. If you reach into each article, though, you may want to limit that. It really depends a lot on any relationship you have with the target site and the type of site you are hitting.
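Checking robots.txt is easy to script; a minimal sketch using the standard library (the URL and the crawler user-agent string are placeholders):

```python
# Sketch: inspect the target site's robots.txt before deciding on a crawl
# schedule. urllib.robotparser is in the Python standard library.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.set_url("https://example.com/robots.txt")   # placeholder target site
rp.read()

agent = "MySharePointCrawler"                  # placeholder user-agent name
print("may fetch feed:", rp.can_fetch(agent, "https://example.com/feed.rss"))
print("crawl delay:", rp.crawl_delay(agent))   # None if not specified
```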
Check out this article for a little more info on how SharePoint deals with robots.txt.
(p.s. the target site did not put the articles on the web so no one would read them)
The out of the box crawler will respect robots.txt and there are provisions for crawler impact rules that will lessen the chance that SharePoint will perform a beat down on the external site.

Search Center on SharePoint Publishing site

Can someone give me some direction on how to set up the SharePoint Search Center so I can get results from lists and have them use a custom (modified) link?
I have Forms authentication (and anonymous access) enabled with alternate access mapping.
Right now, in the Default zone, I get results from the data in lists and they all point to AllItems.aspx. If I try the search from the Internet zone, I don't get any results from the lists, and I am guessing this is because of some security settings. But if I do make them show, how will I customize the resulting link so that list items are shown with a publishing page?
For example, if I keep news in a News list, then when I search I want to get a result with a link in the following format:
http://somesite/Pages/News.aspx?itemId=12
where itemId is the ID of the news item.
Can I customize the link in the result?
You can customize the result link using the Core Search Results web part. It is all in the XSL which is available if you modify the shared properties of the web part.
The problem is that this page is meant to show search results of all types including documents in SharePoint, files potentially outside of SharePoint, web pages, business data, etc.
You may want to have a custom search results page that uses a specific scope or managed property query such that you can be sure the results will be list items. This can probably be done without any coding (if you don't consider XSL coding) and you could still use the Core Search Results web part.
Another option may be similar, but use the Data Form/View web part (through SharePoint Designer) or the Content Query Web Part (Publishing Infrastructure feature required).
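Whichever web part you use, the rewrite it has to perform is simple. A sketch of that mapping (the XSL in the Core Search Results web part would do the equivalent; the assumption below is that each list-item result exposes the item ID as an "ID" query parameter, e.g. .../DispForm.aspx?ID=12, so adjust to whatever your results actually contain):

```python
# Sketch of the link rewrite the customized XSL would perform: turn a raw
# list-item result URL into the publishing-page URL from the question.
from urllib.parse import urlparse, parse_qs

NEWS_PAGE = "http://somesite/Pages/News.aspx"

def rewrite_result_url(result_url: str) -> str:
    item_id = parse_qs(urlparse(result_url).query).get("ID", [None])[0]
    if item_id:
        return f"{NEWS_PAGE}?itemId={item_id}"
    return result_url   # leave non-list results (documents, pages) alone

print(rewrite_result_url("http://somesite/Lists/News/DispForm.aspx?ID=12"))
# -> http://somesite/Pages/News.aspx?itemId=12
```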
