I have a website built with OpenCms that uses Lucene as its search engine. The site is available in two languages: Spanish (supported) and Galician (not supported). I've got the search working, but results are always shown in Spanish. Is it possible to force Lucene to return results for a specific locale?
When I build the search index (in the back office) there is an option called "Locale" where I can specify the locale of the index. I did this and created two separate indexes: one with locale "es" called "index-es" and another with locale "gl" called "index-gl".
I pass the appropriate index name to setIndex depending on which language the user is browsing in, but it doesn't work: results are always shown in the "es" locale.
Is this what you were referring to, or did I misunderstand you?
When you build the search index, you could add a new field called "Language".
Use that field to filter your search results.
EDIT
Document doc = new Document();
doc.add(new Field("Language", "GL", Field.Store.NO,
        Field.Index.NOT_ANALYZED_NO_NORMS));
...
indexWriter.addDocument(doc);
Get the top 10 documents in GL:
Directory dir = FSDirectory.open(new File("..."));
IndexSearcher searcher = new IndexSearcher(dir);
Query q = new TermQuery(new Term("Language", "GL"));
TopDocs hits = searcher.search(q, 10);
searcher.close();
Does anyone know how to return both unaccented and accented results via an Azure Search filter? For example, the filter query below returns the record named unicorn when I search for a record with the name unicorn.
var result = await searchServiceClient.Documents.SearchAsync<myDto>("*", new SearchParameters
{
    SearchFields = new List<string> { "Name" },
    Filter = "Name eq 'unicorn'"
});
This is all good, but what I want is to write a filter that returns the record named unicorn as well as the record named únicorn (note the accented first character), provided both records exist.
This can be achieved when searching for such a name via the search query, using a language analyzer or the standard ASCII-folding analyzer as mentioned in this link. What I'm struggling to find out is how to implement the same with Azure filters.
Please let me know if anyone has a solution for this.
Filters are applied to the non-analyzed representation of the data, so I don't think there's any way to do linguistic analysis in filters. One way to work around this is to manually create a field that only does lowercasing + ASCII folding (no tokenization) and then issue Lucene search queries that look like this:
"normal search query terms" AND customFilterColumn:"filtérValuèWithÄccents"
Basically, the document would need to match the search terms in any field AND also match the filter term in the "customFilterColumn" field. This may not be sufficient for your needs, but at least it shows what is possible.
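The "lowercasing + ASCII folding" normalization described above can be approximated with the JDK's java.text.Normalizer. This is a hypothetical sketch (the class and fold method are mine, not an Azure or Lucene API) of what that normalization does to a filter value:

```java
import java.text.Normalizer;

public class AccentFolding {
    // Lowercase, decompose accented characters (NFD), then strip the
    // combining diacritical marks, approximating an ASCII-folding filter.
    static String fold(String s) {
        String decomposed = Normalizer.normalize(s.toLowerCase(), Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    public static void main(String[] args) {
        System.out.println(fold("únicorn"));                // unicorn
        System.out.println(fold("filtérValuèWithÄccents")); // filtervaluewithaccents
    }
}
```

Lucene's ASCIIFoldingFilter performs a broader character mapping (ligatures and the like), but for accented Latin characters the decompose-and-strip approach gives the same result.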
Using filters, it won't work unless you specify all the possibilities in advance, for example:
$filter=name eq 'unicorn' or name eq 'únicorn'
You'd be better off using a different analyzer that reduces accented characters to their base form. As another possibility, you can try fuzzy search:
search=unicorn~&highlight=Name
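Fuzzy search (the trailing ~) matches terms within a small edit distance, and "únicorn" is a single substitution away from "unicorn", which is why it can match both spellings. A minimal edit-distance sketch, illustrative only and not Azure code:

```java
public class EditDistance {
    // Classic dynamic-programming Levenshtein distance.
    static int levenshtein(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("unicorn", "únicorn")); // 1
    }
}
```

Azure Search's fuzzy operator allows up to two edits by default, so both the accented and unaccented spellings fall within range of unicorn~.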
Can someone confirm the behaviour of the Smart search results web part when using a Smart search filter on a particular field (documentation here), when the index, and the expected results, comprise multiple page types?
In my scenario I have two page types, one always a child of the other; a hypothetical example would be Folder and File types.
I've configured the index with Pages type and Standard analyzer to include all Folder and File types under the path /MyOS/% on the tree.
The search page includes the Smart search results web part and a Smart search filter: a checkbox for the File field FileIsHidden.
What I'm trying to ascertain is whether the results can include all the folders as well as the files, given that only files have that field.
Client has a v8.2 license and now has a requirement similar to this scenario.
Thanks so much for any help in advance.
Firstly, I would download the latest version of Luke; it's a Lucene index inspector that lets you run queries, inspect the data, etc.
https://code.google.com/archive/p/luke/downloads
Your search indexes are in App_Data/Modules/SmartSearch/[SearchName]. I'm not sure whether Luke can query two indexes at the same time, but you can run the same query against both and see whether it's filtering out results one way or another.
If you are querying on a field that must have a value, and the other page type does not have that field, those pages probably are filtered out. What you need to do is use Lucene syntax to express, roughly, "(classname = 'cms.file' AND fileonlyproperty = '') OR classname <> 'cms.file'".
You'll have to test, but assuming the class names are cms.file and cms.folder and the property is FileIsHidden, the syntax would be something like:
+((FileIsHidden:(true) AND classname:('cms.file')) OR (NOT classname:('cms.file')))
But you'll have to test that.
I created a saved search for the Saved Search record type itself via the UI, with the internal id customsearch_savedsearch.
When I load the search using SuiteScript, it shows "An unexpected error has occurred".
var search = nlapiLoadSearch(null, 'customsearch_savedsearch');
The above statement works fine for all other record types, but fails for the Saved Search record type.
What could be the internal id of the Saved Search record type?
You cannot use null for the first parameter. When loading or creating a search, you must specify the record type being searched as well. Whatever record type customsearch_savedsearch searches for is what you should pass as the first parameter.
So for instance if your saved search is a Customer search, then you would load it by:
var search = nlapiLoadSearch('customer', 'customsearch_savedsearch');
Try
var search = nlapiSearchRecord(null, 'customsearch_savedsearch');
Documentation:
nlapiSearchRecord(type, id, filters, columns)
Performs a search using a set of criteria (your search filters) and columns (the results). Alternatively, you can use this API to execute an existing saved search. Results are limited to 1000 rows. Also note that in search/lookup operations, long text fields are truncated at 4,000 characters. Usage metering allowed for nlapiSearchRecord is 10 units.
This API is supported in client, user event, scheduled, portlet, and Suitelet scripts.
If
var search = nlapiSearchRecord(null, 'customsearch_savedsearch');
does not work, use
var search = nlapiSearchRecord('', 'customsearch_savedsearch');
Everything looks correct in your statement. I think the problem is that SuiteScript does not support the Saved Search record type. Here is a list of supported types.
You should be able to run this using the above-mentioned call:
var search = nlapiSearchRecord(null, 'customsearch_savedsearch',null,null);
I've used this in my code and haven't had any issues. Make sure you have the right permissions set on the saved search. To start with, set it to Public, and in the Audience section select all roles.
Your syntax is correct. However, the Saved Search type (like the Budget record) is not scriptable, even though you are able to create a saved search; that is why you encountered the error. The record types listed in the SuiteScript Records Browser are the supported ones. You can check the list here:
Note: you need to log in to the account first.
Production: https://system.netsuite.com/help/helpcenter/en_US/srbrowser/Browser2016_1/script/record/account.html
Sandbox: https://system.sandbox.netsuite.com/help/helpcenter/en_US/srbrowser/Browser2016_1/script/record/account.html
I know this is a bit of an old question, and I came across it in my own attempt to do the same, but I just tried the following with success:
var search = nlapiLoadSearch('savedsearch', 'customsearch_savedsearch');
Seemed a little on the nose, but it did the trick.
I've developed a simple web script that accepts some parameters as input and returns a list of workflows matching the conditions. This is a simplified version:
WorkflowInstanceQuery workflowInstanceQuery = new WorkflowInstanceQuery();
Map<QName, Object> filters = new HashMap<QName, Object>(9);
if (req.getParameter(MY_PARAM) != null)
    filters.put(QNAME_MYPROP, req.getParameter(MY_PARAM));
workflowInstanceQuery.setCustomProps(filters);

List<WorkflowInstance> workflows = new ArrayList<WorkflowInstance>();
workflows.addAll(workflowService.getWorkflows(workflowInstanceQuery));

List<Map<String, Object>> results = new ArrayList<Map<String, Object>>(workflows.size());
for (WorkflowInstance workflow : workflows)
{
    results.add(buildSimple(workflow));
}
This is working perfectly, but now I'd like the result to include all workflows whose property matches the input as a LIKE/contains query.
For example, if the input property value is "hello", I would like the web script to return workflows whose property has values such as "hello", "hello Dear", "Say hello", and so on...
This already works when searching for content in Alfresco Share's Advanced Search; how can I implement it with WorkflowInstanceQuery?
Alfresco's ActivitiWorkflowEngine class uses Activiti's HistoricProcessInstanceQuery for the search, and it adds the custom properties with the variableValueEquals method, so it will never behave like a LIKE clause.
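Given that equality-only constraint, one workaround is to query on the other criteria and apply the "contains" test in memory on the returned list. The sketch below uses plain strings to stand in for the workflow property values; the filterContains helper is hypothetical, not an Alfresco API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ContainsFilter {
    // Case-insensitive "LIKE %needle%" applied after the query returns.
    static List<String> filterContains(List<String> values, String needle) {
        List<String> matches = new ArrayList<String>();
        for (String value : values) {
            if (value.toLowerCase().contains(needle.toLowerCase())) {
                matches.add(value);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> props = Arrays.asList("hello", "hello Dear", "Say hello", "goodbye");
        System.out.println(filterContains(props, "hello")); // [hello, hello Dear, Say hello]
    }
}
```

Note that this fetches more rows than needed, so it only scales if the unfiltered result set stays reasonably small.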
There are two things to consider here: the workflow model and the content model. You need to understand both. Properties defined in the content model are stored with documents, not with workflows; workflows have a task model associated with them. So it is logically difficult to filter workflows based on document properties, because there is no association between them unless you have explicitly created one.
If you want to filter on properties, they should exist in the workflow model associated with the workflow tasks. Even then, you have to filter per task, because each task has its own properties.
Have you tried putting wildcards in your filter parameter?
filters.put(QNAME_MYPROP, "*"+req.getParameter(MY_PARAM)+"*");
I use Lucene.Net for indexing content, documents, etc. on websites. The index is very simple and has this format:
LuceneId - unique id for Lucene (TypeId + ItemId)
TypeId - the type of text (eg. page content, product, public doc etc..)
ItemId - the web page id, document id etc..
Text - the text indexed
Title - web page title, document name etc.. to display with the search results
I have these options for adapting it to serve multilingual content:
1. Create a separate index for each language, e.g. Lucene-enGB, Lucene-frFR, etc.
2. Keep the one index and add an additional 'language' field to it to filter the results.
Which is the best option, or is there another? I've not used multiple indexes before, so I'm leaning toward the second.
I do option 2, but one problem I have is that I cannot use different analyzers depending on the language. I've combined the stopwords of the languages I want, but I lose the more advanced capabilities an analyzer offers, such as stemming.
You can eliminate options 1 and 2.
You can use one index, and for each field that may contain Arabic words, create two fields.
For example, if the field "Text" might contain Arabic or English content, create two fields for it: one field, "Text", indexed/searched with your standard analyzer, and another, "Text_AR", indexed with the ArabicAnalyzer. To achieve this you can use
PerFieldAnalyzerWrapper