Kentico - Content Only page & smart search result

I have a custom page type (Content Only) for Locations. Then I have a landing page (/company/locations/) with a repeater that lists all locations and their details. Things work well so far. Now, after adding smart search, I notice that if I search for a location name like "san francisco", the landing page doesn't show up in the search results, but the content-only page does, with a URL like /company/locations/san-francisco. The problem is that this URL results in a 404, since that page doesn't really exist. What should I do? Should I re-create the page type as a regular page instead of content only before it's too late? Or is there a way to make the individual location URL (/company/locations/san-francisco) work, considering we can't specify a page template to go with a content-only page type? Thanks!

There are multiple types of Search indexes in Kentico.
"Pages" scans the data of a document, such as any webparts+properties, editable text, form data, etc. They do NOT scan the rendering on the page though, it doesn't catch any Repeaters (what you're using).
"Page Crawler" will literally load the page, and scan all the content in the page. This will catch Repeaters and dynamic content like that.
Knowing this, you have a couple of options:
Use Pages, then modify the Smart Search results transformation and add logic along the lines of the sketch after this list, so Location results link somewhere that actually exists.
Use Page Crawler and tell it specifically to index only the /company/locations section.
Use Page Crawler together with a custom smart search indexer, so you can exclude the header/footer and other areas from the indexed content (this is a bit more advanced).
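For the first option, here is a minimal sketch of what that transformation logic could look like, as an ASCX search results transformation. The page type code name custom.location, the column holding the class name, and the hard-coded landing page URL are all assumptions for illustration; adjust them to your site.

```ascx
<%-- Hedged sketch: point clicks on Location search results at the landing
     page instead of the non-existent content-only URL.
     "custom.location" is a hypothetical page type code name, and the exact
     column exposing the page type in search results may differ. --%>
<a href='<%# ValidationHelper.GetString(GetSearchValue("ClassName"), "").ToLowerInvariant() == "custom.location"
        ? "/company/locations/"
        : SearchResultUrl() %>'>
  <%# SearchHighlight(HTMLHelper.HTMLEncode(ValidationHelper.GetString(Eval("Title"), "")), "<strong>", "</strong>") %>
</a>
```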

If you don't want that URL to show up, then simply exclude those page types from that search index. But if you want them to show, create a detail (selected item) transformation for that /company/locations repeater so a single location is displayed when someone navigates to it from the search results; a minimal sketch follows. This will also be good for Google and other search engines if you plan to have specific content for each location.
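A hypothetical selected-item (detail) transformation for that repeater might look like this; the field names (LocationName, LocationAddress, LocationDescription) are assumptions, so replace them with the actual fields of the Location page type:

```ascx
<%-- Hedged sketch of a detail transformation for the Locations repeater.
     Field names are assumed, not taken from the actual page type. --%>
<h1><%# Eval("LocationName") %></h1>
<p><%# Eval("LocationAddress") %></p>
<div><%# Eval("LocationDescription") %></div>
```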

Related

Kentico 12 Smart Search Page Crawler Index not working

I have a Kentico 12 MVC site where the CMS and, I guess, the "client" site are on the same server but in separate IIS entries. One is called admin.site.com and the other is called dev.site.com.
I'm trying to implement the Smart Search functionality with a Page Crawler index. The reason I want a Page Crawler index is because my content structure is as follows:
Page Container > Page Type "Product"
Then within "Product" page type, I'm pulling in content from a different part of the content tree using widgets/page builder functionality in the Page tab. The Content tab of that page has very little actual content.
If I use Pages Index and search on that, it only grabs the page types that are in the content widget section of the site, so not the pages that implement the widgets which are the actually live pages on the site. I implemented the Page Crawler index and tried a search preview but literally anything I search comes with no results. Please let me know what details you'd need from me to help, I appreciate any help!
Best,
RP
Check the documentation and especially the note:
"We do not recommend using crawler indexes on MVC content-only sites. The crawler only selects pages from the site's content tree in Kentico, which may not match the actual structure of the website (in many cases, content-only pages only store data and do not represent pages on the live site)."
To achieve what you need, you will have to write your own crawler code and combine it with a custom search index.
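A minimal sketch of the crawling half of that, assuming you maintain the list of live-site URLs yourself. In Kentico, the indexing half would live inside the Rebuild() method of a class implementing CMS.Search.ICustomSearchIndex; the storeInIndex callback below is just a placeholder for that part:

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text.RegularExpressions;

public class LiveSiteCrawler
{
    private static readonly HttpClient client = new HttpClient();

    public void CrawlPages(IEnumerable<string> liveUrls, Action<string, string> storeInIndex)
    {
        foreach (var url in liveUrls)
        {
            // Fetch the fully rendered page from the live site (dev.site.com),
            // not the admin instance, so widget/page builder output is included.
            string html = client.GetStringAsync(url).GetAwaiter().GetResult();

            // Strip scripts, styles, and markup to keep only searchable text.
            string text = Regex.Replace(html, @"<(script|style)[\s\S]*?</\1>", " ", RegexOptions.IgnoreCase);
            text = Regex.Replace(text, "<[^>]+>", " ");
            text = Regex.Replace(text, @"\s+", " ").Trim();

            // Hand the URL and the extracted text to the custom index writer.
            storeInIndex(url, text);
        }
    }
}
```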

Search result: How to show only pages, not different content items?

We are using Liferay as a classic CMS, meaning that we compose pages out of web content articles. There is an issue with Liferay's internal search that I have not yet found a proper answer for:
Because web content articles are pretty much just building blocks for pages, we don't want the search to show them as distinct items. The user should only get a list of pages that contain their search keywords, including all the articles placed on each page.
At the moment we can see two different approaches and both come with certain problems we could not solve yet:
Idea 1
We modify the journal indexer and try to obtain all URLs of the pages (how?) where the article has been placed. Then we add them to the document to be indexed. In the search results we can then access the URLs and collect them. In the end we make sure every URL is only shown once.
Idea 2
At some point Liferay renders the entire page before sending it to the browser. If we somehow could put an indexer there, we could index the entire page. We then could limit the search to the special "page documents". Getting the fully rendered page would be the main issue here, because either we would have to run a crawler to frequently trigger this indexing or we would need to find a way to trigger page rendering from within an indexer or something like that.
I have been carrying this problem around for quite a while now and still could not find an idea good enough to spend time trying it out. If anyone of you has some input on those two ideas or maybe an entirely different approach, I would be extremely grateful.
I'll just answer myself, because by now we found a suitable solution to solve our problem:
In addition to the default search portlet, there is also a "Web Content Search Portlet" shipped with Liferay. It seems to have been part of Liferay for quite a while now, but it's somewhat hard to find, because there is hardly any documentation for it (I only found the Liferay wiki page, which isn't really anything at all). It searches only within web content articles and shows links to the pages rather than a link to an isolated view of the article. It has far fewer configuration options than the default search portlet, however. Pretty much all it allows you to change is whether articles actually have to be placed on at least one page to show up in the results.
So there is no need for any kind of custom indexer or any other "hack"...all we need to do is use the correct portlet. We will only need to write a hook that changes the appearance of the result page.
What you ask is interesting, but your ideas are headed in the wrong direction.
Idea 2 in particular is wrong, because you cannot do indexing work while a page is being rendered. Think about the performance alone.
In Liferay, pages and assets are not directly linked: pages have portlets, and portlets display assets (web content and more).
Liferay indexing scans asset content; it does not index the rendered output of the assets. Think about permissions: the same page can display different content depending on the user who is looking.
bye

Indexed search displaying results in another page

I want to have a search box at the top of the header, but when I submit keywords, I don't want the results to appear in the header, but in the body.
To do this, I thought I'd place the plugin once in the header and again in the body of a special "Search" page, where I could hide the form in the header when the user is on that page. But I don't know how to make a submitted search jump to this other page. (It's sort of like how tt_news has a single PID to go from LIST to SINGLE.)
How can I make this jump work? Or is there an easier way to achieve what I want?
On regular pages you need to construct a "pure HTML" search form in the header whose action links to the other page that displays the search results. Use typolink to generate the proper form action.
On the page with the search results you don't need to hide the search form; instead, you can use TypoScript to fill the search field with the value entered on the regular page.
There is a ready-to-use TypoScript sample for exactly this scenario in the Introduction Package. I don't use it myself, so sorry, I won't paste it here, but you can install it locally and dig for nice snippets and techniques.

Handle default web page with little information for search?

I would like to garner opinions. We've created a website for a gay members' club, and they wanted the default landing page to be mysterious, with little information on it.
As such, the Default.aspx only contains a form asking for some personal details. Users can click a button to skip this content and go to an AboutUs page.
The problem is that, because we cannot control what information Google uses for the site description in search results, it is picking up the form's fields, which obviously do not make sense as a description.
I think there are two options to counter this:
Use Robots.txt to block access to Default.aspx and only allow access to AboutUs.aspx
Write a description and title in an H1 tag but make the text colour the same as the background colour
Could I get opinions on which method people think is best for search results?
Thanks.
I would not block or try and deceive Google.
Make sure the title tag for the page is good and descriptive. Around 70 characters to explain what the website is about.
Same goes for your meta description. About two sentences to continue on from the title information.
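In ASP.NET Web Forms terms (the question mentions Default.aspx), a hedged sketch of setting both from the code-behind; the title and description strings are placeholders, not suggested copy:

```csharp
using System;
using System.Web.UI;
using System.Web.UI.HtmlControls;

public partial class Default : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        // Around 70 characters that explain what the website is about.
        Page.Title = "Example Members Club - a private social club in Example City";

        // About two sentences that continue on from the title information.
        var description = new HtmlMeta
        {
            Name = "description",
            Content = "A private members club in Example City. Request an invitation or read about us."
        };
        Page.Header.Controls.Add(description);
    }
}
```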

How would I best make this SEO-able?

I have a search engine that searches albums.
For each music album, I have a page.
So, the workflow goes like this:
People search for music titles
The search engine displays a list of albums.
People click on an album to go to a details page.
I want google to index my front page and the details page. I want the details page to be highly ranked. How can I build a sitemap for this?
By the way, I have about 5 million albums (but I want the top 1000 ones to be highly ranked on google)
You would not use a sitemap for that many results. You would want each album to appear as a page with a unique URI that references it. That way the search engine can crawl your site by following links, since search bots cannot submit form data. Each of those URIs should be simple, meaning limited to this part of the URI syntax:
scheme://authority_segment/path
Program your web application to remove and throw away any extraneous data, such as query strings or parameters. If you do this, be sure you are watching for URI poisoning or SQL injection, including attacks that arrive through character encoding.
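A minimal sketch of that canonicalization step, assuming the incoming URLs are absolute; Uri.GetLeftPart keeps exactly the scheme://authority/path portion:

```csharp
using System;

public static class UrlCanonicalizer
{
    // Reduce an incoming URL to scheme://authority/path, discarding the
    // query string and fragment before routing, logging, or indexing.
    public static string Canonicalize(string rawUrl)
    {
        var uri = new Uri(rawUrl);
        return uri.GetLeftPart(UriPartial.Path);
    }
}
```

For example, Canonicalize("https://example.com/albums/123?utm_source=x") returns "https://example.com/albums/123".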
How can I build a sitemap for this?
By pulling the addresses out of your database and creating an XML file with a high priority for some selected pages. Somehow I think that isn't your real question …
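As a rough sketch of that, where the list of top album URLs comes from your own data access (a placeholder here):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SitemapBuilder
{
    // Write a sitemap.xml containing only the selected high-priority pages,
    // e.g. the top 1000 albums rather than all 5 million.
    public static void WriteSitemap(IEnumerable<string> topAlbumUrls, string filePath)
    {
        XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";
        var doc = new XDocument(
            new XElement(ns + "urlset",
                topAlbumUrls.Select(url =>
                    new XElement(ns + "url",
                        new XElement(ns + "loc", url),
                        new XElement(ns + "priority", "0.9")))));
        doc.Save(filePath);
    }
}
```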
If I wanted to automate building a site map for a site like this, I'd employ Python. I'd pretty much write everything from the ground up (except the data store access). The format is quite simple.
I'm not sure I quite understand your question...
