Does SharePoint Search support range tags?

I am working on a project to digitize approximately 1 million images for which metadata will be added to facilitate search.
Each image is, for example, a page in a dictionary. But not text. Just a static scanned image. OCR is not an option :(
My objective is to emulate the current search procedure, which consists of looking up the alphabetical entries until the correct page is found. In the absence of machine-readable text, I am looking at tagging each page with a dictionary range tag, for example (Apple-Canada). So if someone searches for "Banana", it should hit the (Apple-Canada) range tag.
Is this supported in SharePoint out of the box? If not, is there an addon product which provides this functionality or am I looking at building a customized extension?
Any help will be appreciated :)

Installing the IFilter for TIF files is done with a couple of clicks and gives you free OCR along the way. Very good for scanned pages.
On your question though: No, SharePoint does not have any kind of "range" tags or fields. The only vaguely similar thing to what you are requesting is the Thesaurus of the search. There you could define acronyms and synonyms for words and it would actually search for something else. So you could enter Banana but it would actually search for Apple. Some examples here: How to: Customize the Thesaurus in SharePoint Search and Search Server.
Other than that I can only think of a custom implemented search provider giving you the flexibility you need.
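If you do end up building the lookup yourself, the range matching itself is small. Below is a minimal sketch in Python, not SharePoint code: the page list, the page ids and the function name are made up for illustration, and it assumes each page is tagged with its first and last entry and that the ranges are sorted and do not overlap. It also uses plain case-insensitive string comparison, which is only an approximation of real dictionary collation.

```python
from bisect import bisect_right

# Hypothetical page index: (first_entry, last_entry, page_id), sorted by first_entry.
PAGES = [
    ("apple", "canada", "page-001"),
    ("candle", "dolphin", "page-002"),
    ("domino", "fig", "page-003"),
]


def find_page(term, pages=PAGES):
    """Return the page_id whose range contains term, or None if no range does."""
    key = term.casefold()
    starts = [first for first, _, _ in pages]
    # Index of the last range whose first entry is <= key.
    i = bisect_right(starts, key) - 1
    if i < 0:
        return None
    _, last, page_id = pages[i]
    return page_id if key <= last else None


print(find_page("Banana"))
```

A custom search provider could run a lookup like this before falling back to the regular keyword search.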


Creating Index and Skill Azure Cognitive Search

I am trying to create an index and skillset that will allow me to:
1. Index PDFs (multi-page and single-page) and all other types of files.
2. Extract the data and make it searchable.
3. Search for a term, say "Cat", and get back the sections of text where the term appears, along with the page number and the document name or downloadable URL of the PDF or image where it was found. A bounding box would be nice but is not necessary.
I am struggling. I have tried the text extraction skill and the OCR skill, but a search returns the whole extracted document (100 pages) as text in the "content" field.
It's not making much sense to me, and the JFK example is outdated.
I have spent 4 days on this. It cannot be that difficult, but the documentation is not that helpful either.
I have tried to "build" an index and skillset using the portal tools, but I am getting a similar result.
Any help would be appreciated.
You might want to try the hOCR custom skill, available on GitHub from the Power Skills repository, if you prefer to use the hOCR format for bounding boxes, but [the OCR skill](https://learn.microsoft.com/en-us/azure/search/cognitive-search-skill-ocr#sample-text-and-layouttext-output)'s output already offers bounding boxes for content. Note that the Power Skills repo also has updated versions of most of the skills used in the JFK sample, including the image store that can help you make pictures of the pages available in your app.
The key to making it work is in the skillset definition.
The JFK skillset has its OCR skill output layoutText.
There is also a custom image store skill that uploads /document/normalized_images/*/data and keeps the resulting URI as imageStoreUri.
Another custom skill transforms the OCR layout results into the hOCR format.
Then a ShaperSkill aggregates that information under ocrImageMetadata.
In the case of JFK, that information then gets further aggregated under cryptonyms, because that's the main thing the JFK demo focuses on. The image metadata is also exposed through an output field mapping from /document/hocrDocument/metadata to metadata, which is also indexed. The important point is that all the relevant information is mapped to indexed fields, so it becomes available in index query results.
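Once layoutText is stored in an indexed field, pulling out the bounding boxes for a search term on the client side could look roughly like the sketch below. It assumes the layoutText shape shown in the OCR skill documentation sample (a "words" list whose items carry "text" and a four-point "boundingBox"); the function names and the sample data are made up for illustration, so check the shape against what your own index actually returns.

```python
def box(x, y, w, h):
    """Build a four-corner bounding box as a list of {x, y} points (illustrative helper)."""
    return [
        {"x": x, "y": y},
        {"x": x + w, "y": y},
        {"x": x + w, "y": y + h},
        {"x": x, "y": y + h},
    ]


def find_term_boxes(layout_text, term):
    """Return the bounding box of every OCR word equal to term, ignoring case."""
    wanted = term.casefold()
    hits = []
    for word in layout_text.get("words", []):
        # Strip surrounding punctuation so "Cat!" still matches "cat".
        token = word["text"].strip(".,;:!?\"'()").casefold()
        if token == wanted:
            hits.append(word["boundingBox"])
    return hits


# Made-up sample in the assumed layoutText shape.
sample_layout = {
    "language": "en",
    "text": "The cat sat. Cat!",
    "words": [
        {"boundingBox": box(10, 10, 30, 15), "text": "The"},
        {"boundingBox": box(50, 10, 30, 15), "text": "cat"},
        {"boundingBox": box(90, 10, 35, 15), "text": "sat."},
        {"boundingBox": box(10, 40, 35, 15), "text": "Cat!"},
    ],
}

for hit in find_term_boxes(sample_layout, "cat"):
    print(hit[0], hit[2])  # top-left and bottom-right corners
```

Because the OCR skill runs once per normalized image, keeping one layoutText per page image is what lets you tie a hit back to a page rather than to the whole document.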

Search a specific section of a journal article based on the user type

I have this requirement:
We have a journal article and wish to have sections with content for internal and external users of the application.
We are able to hide the content from rendering by implementing a custom template on the Web Content Display and using a simple custom field on the user, which helps us classify the user.
Having said that, when we search for something as an external user, the search portlet can fetch an article in which the search text is part of the internal user content, and because of the above-mentioned template that content is not visible.
In short, from the user's perspective the resultant article does not match the searched term.
I am looking for pointers on whether there is a mechanism to ensure that, when an external user searches for something, we only search the dynamic-element of the document that matches the user type.
We have thousands of such articles, and creating multiple copies of the same article does not seem a viable solution, so any pointers would be a great help.
Liferay version : 6.2 GA4 CE
Thanks!
AJ
First of all: Not finding a search term in a document can be a sign of well-working synonym resolution in the search engine. It's questionable whether this behaviour is always wrong or only in this particular case. Remember Google bombs?
That being said, I believe that this architecture of half-visible documents is flawed from the beginning. Ideally I'd suggest changing it, for example by splitting the information into two articles, so that you can use the standard permissions to resolve this. If you link both, you can determine how/which article or template to use. It's not an ideal solution, but might be a workaround.
Another workaround might be to change Liferay's indexer component and index two different versions of the article, with two different permissions. Of course, you'll have to change the search side as well, so that you'll find each article at most once, even if it's now twice in the search engine.
Again - not ideal, but might be the quickest fix that you can get right now without changing the underlying architecture. However, changing the underlying architecture is my actual recommendation.
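To make the two-version indexing workaround a bit more concrete, here is a toy sketch in Python. It is not Liferay code and uses none of Liferay's indexer APIs: the class, the function names and the "audience" field are all hypothetical, and it assumes the internal version contains the shared text plus the internal-only sections. The point is only the shape of the idea: one index entry per audience, a query filtered by the searcher's audience, and each article returned at most once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedDoc:
    article_id: str
    audience: str  # "internal" or "external" (hypothetical field)
    text: str


def index_article(article_id, shared_text, internal_text):
    """Produce one index entry per audience for a single article."""
    return [
        IndexedDoc(article_id, "external", shared_text),
        IndexedDoc(article_id, "internal", shared_text + " " + internal_text),
    ]


def search(index, term, audience):
    """Return matching article ids for the given audience, each at most once."""
    wanted = term.casefold()
    found = []
    for doc in index:
        if doc.audience != audience:
            continue  # never match text the searcher is not allowed to see
        if wanted in doc.text.casefold() and doc.article_id not in found:
            found.append(doc.article_id)
    return found


index = index_article("A1", "public roadmap", "confidential budget")
print(search(index, "budget", "external"))
print(search(index, "budget", "internal"))
```

With this layout an external user searching for a term that only occurs in the internal sections gets no hit for that article, which is the behaviour asked for.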

Definitive method to exclude page sections from main search engines

I have quite a few constant parts of pages I'd like to exclude from displaying in search results to prevent obscuring of the unique content on each respective page.
I read that class="nocontent" will perform this action for Google. But what about the other main search engines like Yahoo and Bing? Is there a globally accepted solution for this, or is there an additional step to get them to do the same?
Thank you for any assistance.
Google doesn't offer such a feature for the general search. The class nocontent is only for Google Custom Search. The comments googleon/googleoff are only for Google Search Appliance.
Yahoo! introduced the class robots-nocontent in 2007. Google doesn't support it.
There is a microformats draft, but it probably has no support.
Despite that, there are some "hacks" that could accomplish what you need, but I wouldn't count on or use them. For example: inserting content with JS, or embedding content in an iframe (and blocking the source URL in robots.txt).
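If you try the iframe variant anyway, you can at least check that your robots.txt rule really blocks the iframe source. Here is a small sketch using Python's standard urllib.robotparser; the /fragments/ path, the example.com URLs and the "ExampleBot" user agent are placeholders, not anything the search engines prescribe, and passing this check says nothing about how a given engine treats the embedded content.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the directory holding the iframe sources.
ROBOTS_TXT = """\
User-agent: *
Disallow: /fragments/
"""


def is_crawlable(url, robots_txt=ROBOTS_TXT, user_agent="ExampleBot"):
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_crawlable("https://example.com/fragments/sidebar.html"))
print(is_crawlable("https://example.com/articles/page.html"))
```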

How to modify opencart basic search module to a more advanced one?

I am pretty new to OpenCart and I want to learn how I can create a more advanced search function.
For example, I have products in my store; some of them are blue and some of them are red.
How or where can I specify a product's color (not in the title, but as something like an attribute)?
And after that, how can I search for red products?
I do not want to search by keywords. I want a tab where I can select the color, and if I choose blue the search will show me all the blue products.
I hope you understand what I want. (And can you please give me some code examples: where to add what to achieve this?)
Thank you!
Normally, you would use Product Tags for this (the field is under Product description on the Product Page in Admin > Catalog > Products). You could have problems with very short tags of 3 characters or fewer. See this post for more info:
mysql fulltext MATCH,AGAINST returning 0 results
You would add tags to your products like: red, black, brown, leather, s, m, l, xl, small, medium, large
Then you could search for any of the terms.
[EDIT: in response to comment #1]
I would imagine that you just type multiple terms into search box:
'brown','large'
then all products that have (any? both?) of these tags are returned.
You could use a Tag cloud or similar module to display tags on your pages, and you could also use these terms in the search field. If you search for 'brown', all products that have this tag will be returned.
You may also consider a third-party extension for a more advanced search; check the extensions section of the OpenCart site.
If you want to modify/improve the search functionality yourself, you'll need to tinker with the SQL queries in catalog/model/catalog/product.php
OpenCart search is considered by many to be one of the weak points of this package. There have been discussions on the OpenCart forums on this matter.
Just see how it works for you with the out-of-the-box setup, then if you need more functionality, look for an extension that does what you want, hire a programmer or code it yourself.
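If you go the code-it-yourself route, the "any or both tags" question comes down to the shape of the SQL. The sketch below uses Python with an in-memory SQLite database just to show that shape; the product and product_tag tables are made up for the example and are not OpenCart's real schema, so the query would need adapting to the tables your OpenCart version actually uses.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with hypothetical product/tag tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE product_tag (product_id INTEGER, tag TEXT);
        INSERT INTO product VALUES (1, 'Leather jacket'), (2, 'Canvas bag'), (3, 'Rain boots');
        INSERT INTO product_tag VALUES
            (1, 'brown'), (1, 'large'), (2, 'blue'), (3, 'blue'), (3, 'large');
    """)
    return conn


def products_with_all_tags(conn, tags):
    """Return names of products carrying every tag in tags (tags must be distinct)."""
    placeholders = ",".join("?" for _ in tags)
    sql = f"""
        SELECT p.name
        FROM product p
        JOIN product_tag t ON t.product_id = p.product_id
        WHERE t.tag IN ({placeholders})
        GROUP BY p.product_id
        HAVING COUNT(DISTINCT t.tag) = ?
        ORDER BY p.name
    """
    return [row[0] for row in conn.execute(sql, [*tags, len(tags)])]


conn = build_demo_db()
print(products_with_all_tags(conn, ["blue"]))
print(products_with_all_tags(conn, ["blue", "large"]))
```

Dropping the HAVING clause turns the "all tags" match into an "any tag" match, which is the other behaviour mentioned above.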

SharePoint content is not searchable... why?

I created a form from InfoPath. On that form I added text boxes with corresponding fields, then I embedded the form in SharePoint. Then, in my document library, I clicked "New" and filled in the data. I can now see the data column-wise in my document library. Let's say I entered "Lalit" as data; when I tried to search for it, I got this message:
"No results matching your search were found."
1. Check your spelling. Are the words in your query spelled correctly?
2. Try using synonyms. Maybe what you're looking for uses slightly different words.
3. Make your search more general. Try more general terms in place of specific ones.
4. Try your search in a different scope. Different scopes can have different results.
What could the problem be?
If you're using SharePoint Server, you have to configure the search before you can use it. SharePoint then crawls the content of your site and builds an index for it, which will be used by the search.
You will find the search configuration in Central Administration under the Shared Service Provider for your web application.
