Hi I have a problem on this page
https://www.hwl.dk/da/frontpage?q=sp%C3%A6kbr%C3%A6t&hPP=6
The italic text in the search results shows things like spækbrætt8 and spækbrættn, even though I have a two-way synonym set up for [spækbræt,skærebræt] and the original text in the search result is "skærebræt".
My question is this:
Why does Algolia add t8, tn and so on to the end of the result when synonyms are used?
PS.
The æøå, ÆØÅ letters are native to Denmark by the way.
This was a bug in Algolia's API, which is now fixed (as of Friday, November 10th, 2017).
Basically I want to implement a fuzzy search that disregards language!
For example - let's say that there's an entry for "Hello World".
Now, I want this to work with:
"hello"
"henlp"
"руддщ" (these are the Russian characters if you try to type "hello" but forget to switch to English)
"рутдз" (same as above but with "henlp" instead of "hello")
"יקמךם" (same as above but in Hebrew)
etc.
Now, the thing that makes the most sense to me is to ignore the actual text and look at the corresponding keyCodes, which obviously work universally.
I did think about saving, for each entry, an array of all its key codes, and then implementing fuzziness based on those keyCodes instead of chars, but that feels like I'm doing something wrong or missing something that already exists.
So, from what I've gathered, there's no implementation of fuzzy search that takes this into account.
Is there maybe an algorithm (other than fuzzy search) that already handles this and that I'm missing?
Currently trying to implement in Node.js but open for more languages and frameworks
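One way to sketch the key-based idea described above is to translate every typed character back to the character on the same physical key of a US QWERTY layout, and only then apply an ordinary edit-distance comparison. The layout tables, the function names (`to_qwerty`, `edit_distance`, `matches`) and the distance threshold of 2 are assumptions made for illustration; the example is in Python for brevity, but the same approach should carry over to Node.js.

```python
# Hypothetical layout tables: each maps a character typed on a Russian or
# Hebrew layout back to the character on the same physical US QWERTY key.
US_FOR_RU = "qwertyuiop[]asdfghjkl;'zxcvbnm,."
RU_KEYS = "йцукенгшщзхъфывапролджэячсмитьбю"
US_FOR_HE = "ertyuiopasdfghjkl;zxcvbnm,."
HE_KEYS = "קראטוןםפשדגכעיחלךףזסבהנמצתץ"

TO_QWERTY = {**dict(zip(RU_KEYS, US_FOR_RU)), **dict(zip(HE_KEYS, US_FOR_HE))}


def to_qwerty(text):
    """Rewrite text as if the same keys had been pressed on a US layout."""
    return "".join(TO_QWERTY.get(ch, ch) for ch in text.lower())


def edit_distance(a, b):
    """Plain Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def matches(query, entry, max_distance=2):
    """True if the key-normalized query is close to any word of the entry."""
    key_form = to_qwerty(query)
    return any(edit_distance(key_form, word) <= max_distance
               for word in entry.lower().split())


for typed in ["hello", "henlp", "руддщ", "рутдз", "יקמךם"]:
    print(typed, "->", to_qwerty(typed), matches(typed, "Hello World"))
```

Normalizing at query time means the stored entries can stay as plain text; only one small table per supported layout has to be maintained.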
I am new to Notepad++ and have been researching how to do this, but it seems each answer I try to mimic doesn't work correctly.
Here is the scenario:
I have 2 text files, each with ATM transactions such as time of transaction (In military time, such as 18:09) and transaction amount (Displayed as 43.00)
I need to find a way to search the document so that it only returns matches where both the time and amount are there, and on the same line of the document.
For example, I need to find where in this huge text file both 43.00 and 18:09 appear on the same line, allowing me to verify the transaction was valid.
Any ideas on how to do this? I am using the latest Notepad++ 6.8 and have downloaded the compare plugin.
Thank you and I will begin researching how the coding works in notepad++ in the meantime, as I am not an experienced programmer (Just had 1 college course in C++ which I loved but eh)
Cheers!
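Press Ctrl-F, select "Regular expression" as the search mode, and then search for:
18:09.*43\.00
The dot in the amount is escaped so that it only matches a literal ".". This assumes the time comes before the amount on the line; if the order can vary, search for 18:09.*43\.00|43\.00.*18:09 instead.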
Ctrl F, search for 43.00 or 18:09.
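If the files get too large to check by hand, the same idea can be scripted. The sketch below uses two lookaheads so that the time and the amount may appear in either order on the line; the sample lines are made up, since the real file layout is not shown in the question. The same pattern should also be usable in Notepad++'s "Regular expression" search mode, though that is untested here.

```python
import re

# Both values must appear on the same line, in either order. The dot in the
# amount is escaped so that "43.00" does not also match text such as "43x00".
BOTH = re.compile(r"^(?=.*18:09)(?=.*43\.00).*$", re.MULTILINE)

# Hypothetical sample data standing in for the ATM transaction file.
log = """\
2015-08-01 18:09 ATM01 withdrawal 43.00
2015-08-01 18:09 ATM02 withdrawal 20.00
2015-08-01 09:15 ATM01 withdrawal 43.00
43.00 deposit ATM03 18:09
"""

# findall returns each full matching line because the pattern has no groups.
matching_lines = BOTH.findall(log)
for line in matching_lines:
    print(line)
```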
I am using the Hit Highlighting feature in Azure Search and noticed a discrepancy in the way it behaves from the documentation. In the documentation it says that when you use hit highlighting it will return a snippet of the field with the highlight, but it always returns the entire field (with proper highlighting).
Is there a way to have Azure Search instead return just a snippet (say of about 200 characters) that includes the highlight?
Currently, the answer is no, you cannot. The field breaks according to (English) sentence rules, i.e. it breaks on ".", "!", "?".
Also see this question for an example on breaking and some more info relating to the delimiters.
Depending on the nature of the field you might be able to add one of the above delimiters to 'emulate' what you want to accomplish (as suggested by Nate Ko).
I want to suggest something else on top of what Nate spoke to. When you look at the document response, also take a look at the Highlights part of the results (as opposed to the Document). For example, you might be currently getting the field results by retrieving something like this:
Results[i].Document.DESCRIPTION
If there is a highlight found for that field, the snippet will be found here:
Results[i].Highlights.DESCRIPTION
What I like to do is to first check if there is a valid Highlight and if so display it. If not, I show the actual field content.
Liam
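The "highlight first, field content as fallback" check described above could look roughly like this. The sketch assumes each result is a plain dictionary shaped like the JSON returned by the REST API, where highlights sit under the "@search.highlights" key as a list of fragments per field; the function name `display_text` and the sample results are hypothetical.

```python
def display_text(result, field):
    """Return the highlighted fragments for a field, or the raw field value."""
    highlights = result.get("@search.highlights", {})
    fragments = highlights.get(field)
    if fragments:
        # Several fragments may come back for one field; join them for display.
        return " ... ".join(fragments)
    return result.get(field, "")


# Hypothetical results: the first has a highlight, the second does not.
results = [
    {
        "DESCRIPTION": "It uses period as the delimiter. If not, please clarify",
        "@search.highlights": {"DESCRIPTION": ["If not, <em>please</em> clarify"]},
    },
    {"DESCRIPTION": "No hit in this field."},
]

for result in results:
    print(display_text(result, "DESCRIPTION"))
```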
We recently introduced a change that improves the highlighter performance on large fields and NLP experience. One side effect of the change was that the new highlighter generates snippets based on sentences, breaking the text field on '.' (period).
One way to work around the issue is to put '.'s in the field. We are working to enforce the snippet size and will let you know when it is available.
I am using hit highlighting in Azure Search. It works fine, but I want to fine-tune it a bit.
Say, a field has the following value:
"It uses period as the delimiter. If not, please clarify"
If I search for "please" I will get a highlight hit on that field, e.g.:
"If not, <em>please</em> clarify"
If I search for "period" I will get a highlight hit on that field, e.g.:
"It uses <em>period</em> as the delimiter."
After trying it with several examples it seems that it uses period (".") as a delimiter so that it doesn't return the whole field.
From another SO question (Hit Highlighting in Azure Search Service) it seems that I cannot configure Azure Search to return the whole field with all terms highlighted.
I want to ask:
if this is really the case or more complex rules apply
do I have any control of how the field is split for hit highlighting, e.g. change the delimiter to say "," or "\n"
Thanks in advance
Unfortunately, there is no way to customize how documents are split for hit highlighting. Feel free to use the Azure Search User Voice website to post improvement ideas; this gives other users the opportunity to vote for them and helps us prioritize: http://feedback.azure.com/forums/263029-azure-search
The hit highlighter splits documents into sentences. In general it's fair to assume it breaks on dots but it also handles abbreviations etc.
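To make the sentence-based behaviour concrete, here is a rough client-side imitation of it. This is not how the service is implemented: it simply splits on ".", "!" and "?" followed by whitespace and does not handle abbreviations. The function name `highlight_sentences` is made up for the example.

```python
import re


def highlight_sentences(text, term):
    """Return only the sentences containing term, with the term wrapped in <em>."""
    # Naive sentence split: break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile(rf"\b({re.escape(term)})\b", re.IGNORECASE)
    return [pattern.sub(r"<em>\1</em>", s) for s in sentences if pattern.search(s)]


field = "It uses period as the delimiter. If not, please clarify"
print(highlight_sentences(field, "please"))
print(highlight_sentences(field, "period"))
```

With the field value from the question, this yields the same two fragments the question reports for "please" and "period".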
Question about Lucene,
I have a file that I would like to index and search by different analyzers. My goal is to be able to change how I search.
In one case I would like to search for an exact phrase with punctuation, e.g. "one,two", and only return exact matches including the punctuation.
I would also like to be able to search for the exact phrase without punctuation, e.g. "one two", as with the StandardAnalyzer.
Essentially I need to change the search functionality on one field.
How can I change the search on the same file? I've tried using two analyzers (standard and whitespace); however, this makes the indexing time very long.
My second thought is to use just a WhitespaceAnalyzer and, when searching, pass a query that further tokenizes each string if needed. However, I am not sure which API, if any, supports this.
Also, is there any good reading on how analyzers and tokens work and are implemented?
Thanks
What do you mean by "tried two analyzers"? Did you duplicate the content into 2 separate fields with different analyzers? That would be my suggestion.
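The two-field suggestion can be illustrated without any Lucene code. The sketch below is only a conceptual model, not the Lucene API: the same text is indexed twice, once with a whitespace-style tokenizer that keeps punctuation and once with a rough stand-in for StandardAnalyzer that lowercases and drops punctuation. The field names "exact" and "normalized" and all function names are made up for the example.

```python
import re


def whitespace_analyze(text):
    """Split on whitespace only, so 'one,two' stays a single token."""
    return text.split()


def standard_like_analyze(text):
    """Rough stand-in for StandardAnalyzer: lowercase, drop punctuation."""
    return re.findall(r"\w+", text.lower())


# One analyzer per field; queries use the analyzer of the field they target.
ANALYZERS = {"exact": whitespace_analyze, "normalized": standard_like_analyze}


def index_document(text):
    """Store the same content once per field, tokenized by that field's analyzer."""
    return {field: analyze(text) for field, analyze in ANALYZERS.items()}


def contains_phrase(tokens, phrase_tokens):
    """True if phrase_tokens occurs as a contiguous run inside tokens."""
    n = len(phrase_tokens)
    return n > 0 and any(tokens[i:i + n] == phrase_tokens
                         for i in range(len(tokens) - n + 1))


def search(doc, field, query):
    return contains_phrase(doc[field], ANALYZERS[field](query))


doc = index_document("count one,two three")
print(search(doc, "exact", "one,two"))       # punctuation must match
print(search(doc, "exact", "one two"))       # no match in the exact field
print(search(doc, "normalized", "one two"))  # punctuation ignored
```

In Lucene itself, the usual way to assign a different analyzer to each field is PerFieldAnalyzerWrapper, and the search then targets whichever field fits the query.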