Liferay search results

I am currently involved in a project where we are using Liferay (6.1 GA2).
It seems that Liferay search results provide links to web content fragments instead of to the pages containing them.
Have any of you run into this issue? Do you know how to solve it?
Thanks a lot pals.
Best, Alberto

You can have a lot more content in the back end than is actually displayed on any page. Further, you can display any article on multiple pages at once.
A way to work around this is to specify in the "Web Content Search" portlet that you're only interested in content that is actually published. However, this does not solve your second problem: the content can still be published on many different pages.
Every piece of content can have a "Display Page"; the setup of such a display page is well explained in the UI (see the Web Content editor). With a display page set, search results will actually link to a proper page.
If you want to search for pages only instead of content (you might miss out on some metadata), I'd recommend going with a crawler solution that spiders your website, indexes the pages independently of their construction elements (articles), and searches that external index.


Share news articles between sites

Does anyone know if it is possible to share data, such as news articles, between sites on the Kentico platform? I have tried searching but could not find an answer.
Thank you.
Yes. If you want the content to be the same and keep the URLs and Domains separate, then this is best accomplished with linked documents between the sites. This way if they are updated on either site, they will remain in sync on the other site. Read more here: https://docs.kentico.com/k9/managing-website-content/working-with-pages/copying-and-moving-pages-creating-linked-pages
https://docs.kentico.com/k9/e-commerce-features/managing-your-store/products/linking-existing-products-or-sections
Basically, what you'd want to do in your content tree is click "New page" and then select "Link an existing page" at the bottom of the next menu. After that, you will see a content tree from which to select a page. Use the site selector in the top left to choose a different site, and then select the page you'd like to link in.
Note: Keep in mind that the page type will need to be allowed in that section of the tree (for example, if you are trying to nest a news article under a folder but folders are not an allowed parent type, you will get an error).
If you'd like to pull data in from another site via a repeater while leaving the content on the other site, you can use a repeater or similar viewer control and specify a site within its properties in order to pull the pages from the other site.
Each of these methods assumes the Kentico sites are on the same instance of Kentico with a shared database.
If you have multiple sites in one Kentico instance, specify in the repeater that gets the news articles which site to get them from, and they will be displayed on that site. If you want to combine news from several sites, create a custom query and use the Query repeater to show the news articles.

Liferay: How can I get the pages of the site in a web content?

I have a portal in Liferay 6.2 and need to design the Velocity template of a web content that renders a menu listing the pages (as linked names) of the site where it is displayed.
My questions are:
Is this possible?
What would be the correct way to do this?
Would it be better to make a portlet instead of a web content for this purpose?
Thanks for the help.
It feels a bit like you are trying to solve many problems in a single template. Consider composing the UI from many different elements (e.g. custom portlets) rather than building the one structure/template that fits all requirements.
That being said, there's also the chance that your template doesn't need to do more than display the current navigation. In that case, the out-of-the-box Navigation portlet is quite configurable; you might be able to use it instead of implementing anything yourself (check its configuration options).
And lastly, if you want to implement it yourself: get hold of the themeDisplay object. getLayout() gives you the current page, while getLayouts() gives you all pages of the current site, which you can enumerate. There's one catch, though: you typically don't have access to the themeDisplay object from a CMS template, but there are several ways to still get at the data (search the Liferay forums for "cms template themedisplay"). An Application Display Template (ADT) will also be a lot more powerful, and you can check how the layouts collection is built by searching for usages of ThemeDisplay.setLayouts in Liferay's source code. But with ADTs we're digressing from your original question.
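For the implement-it-yourself route, here is a minimal sketch of what the portlet variant could look like against the Liferay 6.2 API. The class and method names (SiteMenuHelper, buildMenu) are illustrative, not Liferay API, and real code should HTML-escape the page names:

```java
// Minimal sketch (Liferay 6.2 portal-service API): enumerate the pages of the
// current site from a portlet's render phase and build a simple menu.
import java.util.List;

import javax.portlet.RenderRequest;

import com.liferay.portal.kernel.exception.PortalException;
import com.liferay.portal.kernel.exception.SystemException;
import com.liferay.portal.kernel.util.WebKeys;
import com.liferay.portal.model.Layout;
import com.liferay.portal.theme.ThemeDisplay;
import com.liferay.portal.util.PortalUtil;

public class SiteMenuHelper {

    public static String buildMenu(RenderRequest request)
            throws PortalException, SystemException {

        ThemeDisplay themeDisplay =
            (ThemeDisplay) request.getAttribute(WebKeys.THEME_DISPLAY);

        // getLayouts() holds the pages of the current site.
        List<Layout> layouts = themeDisplay.getLayouts();

        StringBuilder html = new StringBuilder("<ul>");

        for (Layout layout : layouts) {
            String name = layout.getName(themeDisplay.getLocale());
            String url = PortalUtil.getLayoutFriendlyURL(layout, themeDisplay);

            html.append("<li><a href=\"").append(url).append("\">");
            html.append(name).append("</a></li>");
        }

        return html.append("</ul>").toString();
    }
}
```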
Liferay offers a sitemap portlet out-of-the-box which lists pages of a site. You can configure it and define your own application display template (ADT).

Software for building a sitemap

I need to create a content inventory for a website that doesn't have a sitemap. I do not have access to modify the website, and the site is very large. How can I build a sitemap of that website without having to browse it entirely?
I tried with Visio's sitemap builder, but it fails badly.
Let's say, for example, I want to create a sitemap of Stack Overflow.
Do you guys know of any software to build it?
You would have to browse it entirely to search every page for unique links within the site and then put them in an index.
Also for each unique link you find within the site you then need to visit that page and search for more unique links.
You would use a tool such as HtmlAgilityPack to easily fetch pages and extract the links from them.
I have written an article which touches on the extracting links part of the problem:
http://runtingsproper.blogspot.com/2009/11/easily-extracting-links-from-snippet-of.html
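HtmlAgilityPack is a .NET library; if you would rather work on the JVM, jsoup fills the same role. A minimal breadth-first crawl along the lines described above could look like the sketch below. The seed URL and the page cap are arbitrary placeholders, and a real crawler should also respect robots.txt and throttle its requests:

```java
// Minimal same-host breadth-first crawl using jsoup (a Java analogue of
// HtmlAgilityPack). Prints each discovered page URL once.
import java.net.URL;
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SiteCrawler {

    public static void main(String[] args) throws Exception {
        String seed = "https://stackoverflow.com/"; // placeholder seed
        String host = new URL(seed).getHost();
        int maxPages = 100; // keep the example finite and polite

        Set<String> seen = new HashSet<>();
        Queue<String> queue = new ArrayDeque<>();
        seen.add(seed);
        queue.add(seed);

        while (!queue.isEmpty() && seen.size() < maxPages) {
            String url = queue.poll();

            Document doc;
            try {
                doc = Jsoup.connect(url).get();
            }
            catch (Exception e) {
                continue; // skip pages that fail to load or aren't HTML
            }

            System.out.println(url); // one sitemap entry

            for (Element link : doc.select("a[href]")) {
                String next = link.attr("abs:href"); // resolves relative URLs

                int hash = next.indexOf('#');
                if (hash >= 0) {
                    next = next.substring(0, hash); // drop fragments
                }

                try {
                    // Only follow links that stay on the same host.
                    if (next.startsWith("http") &&
                            new URL(next).getHost().equals(host) &&
                            seen.add(next)) {
                        queue.add(next);
                    }
                }
                catch (Exception e) {
                    // ignore malformed URLs
                }
            }
        }
    }
}
```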
I would register all your pages in a database and then just output them all on one page (PHP + SQL). Maybe indexing software could help you, too. First of all, make sure all your pages are linked up, and still submit the sitemap to Google!
Just googled and found this one.
http://www.xml-sitemaps.com/
Looks pretty interesting!
There is a pretty big collection of XML Sitemap generators (assuming that's what you want to generate, and not an HTML sitemap page or something else?) at http://code.google.com/p/sitemap-generators/wiki/SitemapGenerators
In general, for any larger site, the best solution is really to grab the information directly from the source, for example from the database that powers the site. By doing that you can get the most accurate and up-to-date Sitemap file. If you have to crawl the site to get the URLs for a Sitemap file, it will take quite some time for a larger site and it will load the server during that time (it's like someone visiting all pages in your site). Crawling the site from time to time to determine if there are crawlability issues (such as endless calendars, content hidden through forms, etc) is a good idea, but if you can, it's generally better to get the URLs for the Sitemap file directly.
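To illustrate the "directly from the source" approach: once you have the list of URLs (for example, straight from the database that powers the site), emitting the Sitemap file itself is trivial. A minimal sketch, with placeholder URLs and output path, that omits optional tags like lastmod:

```java
// Minimal sketch: write a Sitemap XML file from a list of URLs pulled
// straight from the site's own data store. URLs and path are placeholders.
import java.io.PrintWriter;
import java.util.List;

public class SitemapWriter {

    public static void write(List<String> urls, String path) throws Exception {
        try (PrintWriter out = new PrintWriter(path, "UTF-8")) {
            out.println("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
            out.println("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">");
            for (String url : urls) {
                // Real code should XML-escape the URL (&, <, >, quotes).
                out.println("  <url><loc>" + url + "</loc></url>");
            }
            out.println("</urlset>");
        }
    }

    public static void main(String[] args) throws Exception {
        write(List.of("https://example.com/", "https://example.com/about"),
            "sitemap.xml");
    }
}
```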

Microsoft SharePoint Search - Ignore sections of the page

I am using Microsoft SharePoint Search (MOSS) to search all pages on a website.
My problem is that when you search for a word that appears in the header, footer, menu, or tag cloud section of the website, that word appears on every page, so the search server returns every page on the website as the results for that search term.
Ideally I want to tell the search server to ignore certain HTML sections in its search index.
This website seems to describe my problem, and a guy there suggests: "why not hide those sections of your website if the user agent is the search server?"
The problem with that approach is that most of the sections I would hide contain links to other pages (menus and tag clouds), so the crawler will hit a dead end and won't crawl very far.
Anyone got any suggestions on how to solve this problem?
I'm not sure if I'm reading this correctly. You DON'T want the search to include parts of your site in the index, but you DO want it to go into those sections and follow any links in them?
I think the best way is indeed to exclude those sections based on user agent (i.e. put them in a usercontrol and don't render it if the user agent is the MS Search crawler).
Seeing as these sections would be the same on every page, it's okay to exclude them when the search crawler comes by.
Just create ONE page (i.e. a sitemap :-D) that does include all the links a normal user would see in the footer/header/etc. The crawler can then use that page to follow links deeper into your site. This would be a performance boost as well, seeing as the crawler encounters the links only once instead of on every page.
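Stripped of the SharePoint specifics, the render-time check boils down to string matching on the User-Agent header. Here is a sketch in Java servlet terms (in SharePoint you'd do the equivalent inside the usercontrol); note that "MS Search" is the commonly reported token in the MOSS crawler's user agent, so verify it against your own crawl logs before relying on it:

```java
// Sketch of a user-agent check for the search crawler. "MS Search" is an
// assumption based on the commonly reported MOSS crawler user agent; check
// your own crawl logs for the exact string.
import javax.servlet.http.HttpServletRequest;

public class CrawlerDetector {

    // True when the request comes from the search indexer, so the page can
    // skip rendering header/footer/menu/tag-cloud sections for it.
    public static boolean isSearchCrawler(HttpServletRequest request) {
        String userAgent = request.getHeader("User-Agent");
        return (userAgent != null) && userAgent.contains("MS Search");
    }
}
```

A page would then wrap the header/footer/menu markup in "if (!CrawlerDetector.isSearchCrawler(request))", while the single sitemap-style page suggested above renders its links unconditionally so the crawler can still reach the rest of the site.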

Google Custom Search not indexing Dynamic Pages

I am trying to use Google Custom Search to provide search capabilities to an informational site.
About the site:
Content is generated dynamically
URL access to content is search-engine friendly (i.e. site.com/Info/3/4/45)
Sitemap (based on an RSS feed) submitted and accepted by Webmaster Tools, which notes that no pages were indexed
Annotations successfully submitted based on the RSS feed
Problem:
There are no results for any keywords that appear on the pages that were submitted.
Questions:
Why is Google not indexing the submitted pages?
What could I be doing wrong?
Custom Search with basic settings is essentially the same thing as a standard search with site:your.website. Does a standard search give you the expected results?
Note that Google doesn't index pages immediately; it takes some time. Check whether your site is already indexed.
Yeah, it took about two weeks for Google to pick up all my pages after I submitted a sitemap. But you should see a few pages indexed after a couple of days.
