I use TYPO3 7.6.10.
I have a crawler that indexes all pages, and they show up in the search results, but the crawler is not indexing the "content" of the pages.
Do I have to set something in the configuration?
This tutorial by Xavier Perseguers tells you everything you need to do to index pages and records with Indexed Search. It was made for an older version of TYPO3 (as you can see from the screenshots) but it should work for newer releases too.
Recently I set up a crawler with nutch-1.11 and solr-4.10.4. I can crawl data by running the Nutch commands in sequence, but my problem now is how to fetch only specific data, for example the tags of Stack Overflow questions, so that I can then index that data in Solr for my own purposes. I tried to configure and modify "local/conf/nutch-site", but it doesn't work for me. I'm new to Nutch!
Nutch fetches URLs, so what you could do is point it to a page that contains links to all the questions with that tag.
For example
https://stackoverflow.com/questions/tagged/nutch?sort=newest, this page contains links to all questions that have Nutch as a tag. Crawling for two or more rounds will make Nutch fetch all outlinks from this page.
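To make "fetching outlinks" a bit more concrete, here is a minimal Python sketch (not Nutch code) that collects the links found on a seed page, which is roughly the set of URLs a crawler would queue for its next round. The `OutlinkCollector` class, the `collect_outlinks` helper and the sample HTML are all made up for illustration; a real crawl would download the page instead.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

SEED_URL = "https://stackoverflow.com/questions/tagged/nutch?sort=newest"

# Hypothetical stand-in for the seed page's HTML (assumption, not real page content).
SAMPLE_HTML = """
<a href="/questions/1/first-question">First</a>
<a href="/questions/2/second-question">Second</a>
<a name="no-href">Ignored</a>
"""


class OutlinkCollector(HTMLParser):
    """Collects absolute URLs from every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.outlinks = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the seed URL.
            self.outlinks.append(urljoin(self.base_url, href))


def collect_outlinks(base_url, html):
    collector = OutlinkCollector(base_url)
    collector.feed(html)
    return collector.outlinks


if __name__ == "__main__":
    for link in collect_outlinks(SEED_URL, SAMPLE_HTML):
        print(link)
```

In the first round only the seed page is fetched; the question pages behind these links are fetched in the following round, which is why at least two rounds are needed.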
I'm having an issue where inline JavaScript is being displayed in Solr search results on my Drupal website. Is there a way to hide parts of my code from being indexed by Solr, similar to how Google uses googleoff:index and googleon:index to keep code from being indexed?
If you use the Solr search module for Drupal, you can tell Solr to index only specific fields of your content:
https://www.drupal.org/project/search_api_solr
That way your JavaScript will not get indexed.
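If restricting the indexed fields is not an option, the general idea of keeping script content out of the indexed text can be sketched as follows. This is only an illustration in Python, assuming you control the text before it is sent to the index; it is not how the Drupal module works, and `VisibleTextExtractor` and `strip_scripts` are hypothetical names.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Keeps text content but drops everything inside <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep non-empty text that is outside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def strip_scripts(html):
    extractor = VisibleTextExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor.text()


if __name__ == "__main__":
    print(strip_scripts("<p>Hello</p><script>var x = 1;</script><p>World</p>"))
```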
We updated our Sitecore CMS from version 6.3 to 6.6 SP2. This Sitecore version has the Intranet Module installed. Everything is working fine, but the Lucene Search doesn't seem to work properly.
There are two indexes defined. One for the whole content tree and one for the media library. The search only delivers results with media items (images, PDFs), but no pages. With the tool Luke I'm able to look into the indexes and I see the items there. But they are not in the search results on the website anymore.
I rebuilt the search indexes using the Sitecore Control Panel, but that didn't help.
As I said, it was working fine on Sitecore 6.3, but not on the updated 6.6 SP2.
Any idea what could be the problem?
Thanks in advance :)
Here is a blog post about troubleshooting Sitecore Lucene search and indexing.
In short:
Check whether the items are indexed correctly using Luke.
Check whether a MatchAll query returns page items:
// Requires: System.Linq, Lucene.Net.Search, Sitecore.Search, Sitecore.Data.Items
var items = SearchManager.GetIndex("your_index_name").CreateSearchContext()
    .Search(new MatchAllDocsQuery(), int.MaxValue)
    .FetchResults(0, int.MaxValue).Select(r => r.GetObject<Item>());
Check included templates:
<include hint="list:IncludeTemplate">
It turned out that three fields missing from the Content Lucene index, _sclsMedia, _sclsSearchable and _scLang, were causing the search not to function. I removed the three fields from the code in my solution, and now I get search results again.
The question is why those three fields were lost during the update from Sitecore 6.3 to 6.6.
I use indexed_search and RealUrl and I need it to show the whole url in the search result.
Right now it is only showing that part of the url which is related to pages and not the part that is related to my extension.
Now it shows: domain.dk/products/
But it should show: domain.dk/products/product/product-title
I don't know whether I should make the changes in the RealUrl configuration or in Indexed Search.
There are some pretty good explanations on the web, showing how to index database/extension records with the crawler extension. Try this one as a start, it shows everything step by step and with screenshots, so I guess it should be useful.
If this is not enough, there are ready-to-use examples for tt_news and other extensions in the crawler documentation.
I have just finished migrating an old TemplaVoila site to TYPO3 6.1 and set up indexed search (much like it was in 4.7), but I can't get indexed search to index any content on any page. I would like to know whether this extension actually works with a TV page and what I may have overlooked in its setup.
indexed_search is a core extension and always works on the current version. If you are using MySQL, it's also recommended to install indexed_search_mysql.
To activate indexing, just set the options:
config {
  # indexed_search
  index_enable = 1
}
And check the results in "Web > Info > Indexed Search". There are also scheduler tasks to clean up indexes.
Actually, Merec's answer is wrong. You will have to set
page.config.index_enable = 1
for it to work.