I am looking for a way to add a document to the search index through the API as soon as the document is added to a document library.
I can add an event handler and write code to call the API. I need to know whether the API supports such an interface. Any sample would be really helpful.
Thanks.
I think that SharePoint (2007 and 2010) has passive indexing, meaning it is out of your control beyond scheduling the indexing service to run at a certain frequency. That being the case, there are occasions when your search cache will be out of sync, such as when you first delete an item. However, I believe you can programmatically prime the index service.
It is also possible to have SharePoint index non-SharePoint content, such as a UNC path, via Central Admin.
As others mentioned, it isn't quite possible to do what you want. However, you can decrease the latency between when you add content and when it gets indexed. The process looks like this:
Create a new Search Content Source that includes your data that needs to be rapidly searched
Add only sites that you care about rapid search to this content source
Schedule this content source's incremental crawl to happen really often. Consider programmatically watching the crawl status so that you could restart the crawl after it has completed.
Tune your search databases' I/O and indexes so that search crawling happens as fast as possible.
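The "watch the crawl status and restart it" idea from the steps above can be sketched as a simple polling loop. This is only an illustration of the control flow, not SharePoint API code: `is_idle` and `start_incremental_crawl` are hypothetical callables that you would have to back with whatever crawl-status interface your farm exposes, and the 30-second poll interval is an arbitrary assumption.

```python
import time
from typing import Callable


def keep_crawling(
    is_idle: Callable[[], bool],
    start_incremental_crawl: Callable[[], None],
    max_crawls: int,
    poll_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Start a new incremental crawl whenever the previous one has finished.

    `is_idle` and `start_incremental_crawl` are hypothetical hooks, not
    real SharePoint calls. Returns the number of crawls started.
    """
    started = 0
    while started < max_crawls:
        if is_idle():
            start_incremental_crawl()
            started += 1
        # Wait before polling again so the status check itself stays cheap.
        sleep(poll_seconds)
    return started


# Usage with stand-in callables (no SharePoint connection involved).
statuses = iter([True, False, False, True])
crawl_log = []
count = keep_crawling(
    is_idle=lambda: next(statuses),
    start_incremental_crawl=lambda: crawl_log.append("incremental"),
    max_crawls=2,
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
```

Passing `sleep` in as a parameter keeps the loop easy to test; in a real job you would leave the default and probably run without a fixed `max_crawls`.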
Is Sharepoint my best option to replace an aging network of fileshares? There's approx 1TB of data residing among 3 fileshares (1 DFS, 2 NAS boxes). A document management system is in place for new things - the file shares are now just read-only archives/legacy. Our users would simply need to be able to search for and open the documents.
Users are finding it difficult to locate their documents in the file shares and windows search does not often help. Sharepoint was suggested as something which would play nicely with Office documents (99% of the content) and have a good search facility.
Not being a SharePoint developer or having had any training on it, I'm getting a little lost. I have set up a test server to try it out using SP2013. I have managed to index each of my file shares and have created a search page. However, results aren't consistent with the indexed items. I assume I need to somehow get the relevant metadata from the files but I have no idea how to go about this.
Could anyone suggest some resources for help on this subject (my searches have mainly turned up paid-for Sharepoint addons or outdated blogs) and any experience of doing something similar? Also happy for any suggestions on ways to achieve this using other software/platforms.
I went with Microsoft Search Server 2010 in the end.
SharePoint is basically optimized to be a document manager. I don't think you need to buy or download add-ons.
For your problem, metadata are the key! You need to properly specify the metadata.
Here is the theory behind planning document management in SharePoint 2013:
https://technet.microsoft.com/en-us/library/cc263266.aspx
A nice introduction to metadata:
http://fr.slideshare.net/gzelfond/document-management-in-sharepoint-without-folders-introduction-to-metadata
Be careful about starting with the Microsoft documentation. In my experience, it's difficult to begin with because it covers many things at once. There are also good books/ebooks that are easy to find and probably simpler than the MS documentation.
I am looking for a document repository solution (hosted internally, not cloud-based) that has the following set of features:
Statuses: Users can set a status (pending, complete, etc) on the document. Notification system based on the status change.
Workflows: Ability to define who the documents go to based on the different statuses. Be able to customize these workflows.
Tagging: Users want to be able to tag different sections of a document with certain key words.
Search: Ability to search for documents based on content within the document as well as for tags.
Merging: Ability to select different pages and/or content within one document and merge that content to another document. Kinda like copy and paste, but more seamless.
Archiving: We need to be able to archive documents in some way.
Permission System: We want to limit document access to users based on roles (read, edit, delete, etc.)
Real-time collaboration: Users should be able to view documents at the same time.
I know that SharePoint has support for at least some of these features, but I am not sure which ones. I am having a difficult time navigating Microsoft's website on what SharePoint can actually do.
So my question is, do you guys know which of the above features SharePoint supports? Oh and any recommendations for other document repository solutions are also welcome.
Assuming that you are targeting SharePoint 2013 on premises:
Status - this can be done by adding a column to document library
Workflow - this is available out of the box. You can also enhance workflow capabilities by adding a third-party workflow engine (e.g. Nintex)
Tagging - I am not sure I understand "tagging different sections of a document". Document tags are supported in SharePoint
Search - supported
Merging - this is "document set" functionality in SharePoint. But, it is limited to creating a set of documents with multiple files. Not sure this is what you want
Archiving - Look at records management processes in SharePoint. You can set file retention policies. You can declare records (so no one can change/delete them). Depending upon your definition of Archiving, this may or may not work for you.
Permissions System - most definitely available. More granular than NTFS IMHO
Real Time collaboration - available
Hope this helps
As part of a SharePoint solution, the functionality for users to create new web sites and publishing pages (programmatically) via a button click has been added. I need to ensure that the Description field for the newly created sites and pages is indexed by SharePoint Search. What is the best way to do this?
Please note, I am NOT interested in starting a new crawl. I just want to ensure that whenever the next scheduled crawl occurs, the contents of these fields will be searchable.
Thanks, MagicAndi
I'm guessing you mean how can you ensure the site is indexed immediately?
Generally, crawls are scheduled which means your new site will only be added to the search index after the next crawl is done. So if your incremental crawl happens every hour you may have to wait up to an hour for it to appear in the search index.
However, given that your new sites are being added programmatically, you could also programmatically start an incremental crawl if it is vital for them to start appearing in search results immediately. There are details on how to do this in this article.
Update:
The site title and description should be indexed automatically by the next crawl. If this isn't happening, then you don't have a Content Source that covers that site so you need to create/update one to cover the new sites and make sure it has a crawl schedule. If the new sites are created in separate site collections consider putting them on a Managed Path.
In our SharePoint system we have a terabyte of data with 100,000 site collections and probably 20 new site collections added every day. We only have one content source that points to the root of the site and everything gets indexed automatically.
It sounds like you're missing a content source or a crawl schedule.
It turns out that the site description is included in the crawl by default. I tested the search default properties by creating a new site and assigning a unique text string to the description. After the next incremental crawl, I was able to search and find the unique string via the default SharePoint search.
I have not yet tested if the page description is included in the search scope by default, but I'm prepared to guess that it is. I will update my answer as soon as I get a chance to test this.
I want my SharePoint site to allow a user to search content in a known collection of RSS feeds. Conceptually, I can think of a few ways to do this:
crawl the feeds at their source (Yikes!)
Pull the full articles into my sharepoint site, then let my crawler crawl it
Make use of an existing index (like google)
search the full articles, on demand, using something like a google utility (my preference)
So, can I somehow, from my SharePoint site, allow a user to search the full articles from a couple dozen named RSS feeds?
thanks
Cary
I don't see why there is a problem with crawling the feeds at their source? That would seem to be reasonable.
It is fairly easy to create a content source to point at the feed and select the correct indexing schedule. If that does not work then you can try a more complicated approach.
Be aware that copying the content of another website to host on your own could have copyright implications (not to mention the risk that any inflammatory content would appear to be published on your own site).
--update--
Try reading the target site's robots.txt (if it even has one) to see whether it specifies a desired crawl frequency. Otherwise it depends on the depth of the site you would be crawling.
If you are crawling just the rss feed xml, I suspect you could do that every hour without annoying anyone. Otherwise if you reach into each article, you may want to limit that. It really depends a lot on any relationship you have with the target site and type of site you are hitting.
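As a rough way to check what a feed's robots.txt asks for before settling on a schedule, here is a minimal sketch using Python's standard `urllib.robotparser`. The user agent name `ExampleCrawler`, the example.com URLs, and the sample robots.txt text are all made up for illustration. `Crawl-delay` and `Request-rate` are optional, non-standard directives, so many sites will not have them, and this sketch says nothing about how the SharePoint crawler itself treats them.

```python
from urllib.robotparser import RobotFileParser


def crawl_policy(robots_txt: str, user_agent: str, url: str) -> dict:
    """Summarize what a robots.txt body says about fetching `url`."""
    parser = RobotFileParser()
    # Parse text we already have instead of fetching it over the network.
    parser.parse(robots_txt.splitlines())
    rate = parser.request_rate(user_agent)
    return {
        "allowed": parser.can_fetch(user_agent, url),
        # None when the file has no Crawl-delay for this agent.
        "crawl_delay": parser.crawl_delay(user_agent),
        # (requests, seconds) or None when no Request-rate is given.
        "request_rate": (rate.requests, rate.seconds) if rate else None,
    }


# Made-up robots.txt content for the demo.
sample = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""
policy = crawl_policy(sample, "ExampleCrawler", "https://example.com/feed.xml")
```

If the file gives no delay or rate, you are back to the judgment call described above: hourly for the feed XML alone, less often if you follow links into each article.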
Check out this article for a little more info on how SharePoint deals with robots.txt
(p.s. the target site did not put the articles on the web so no one would read them)
The out of the box crawler will respect robots.txt and there are provisions for crawler impact rules that will lessen the chance that SharePoint will perform a beat down on the external site.
I have a case where if a SharePoint site owner decides to break permissions inheritance and directly manage site membership, I'd also like to correspondingly modify view permissions on items in a specific list in the top-level site.
How can I best catch those changes so I know when to apply the appropriate changes to the list items?
I'd like to have some C# code be notified when a site's permissions are changed so I can programmatically modify the appropriate list item permissions.
The best way to do this (unfortunately) is to periodically query all of the sites and check to see if inheritance is disabled. I had a similar problem and used powershell scripting to create a report on site security. If you haven't used Powershell before, don't be intimidated. The syntax is VERY similar to C#.
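The polling approach boils down to comparing two snapshots of "does this site have unique permissions?" and acting on the sites that changed. The sketch below shows only that comparison, under the assumption that you have already exported each site's URL and its inheritance flag (for example from a security report like the one mentioned above). The site URLs and the function name are hypothetical.

```python
from typing import Dict, List


def newly_broken_inheritance(
    previous: Dict[str, bool], current: Dict[str, bool]
) -> List[str]:
    """Return sites that have unique permissions now but did not before.

    Both arguments map a site URL to True when the site no longer
    inherits permissions. Sites missing from `previous` are treated as
    having inherited permissions, so new sites with unique permissions
    are reported too.
    """
    return sorted(
        url
        for url, has_unique in current.items()
        if has_unique and not previous.get(url, False)
    )


# Made-up snapshots from two consecutive polling runs.
last_run = {"/sites/a": False, "/sites/b": True}
this_run = {"/sites/a": True, "/sites/b": True, "/sites/c": True, "/sites/d": False}
changed = newly_broken_inheritance(last_run, this_run)
```

Each URL in `changed` is a site whose owner broke inheritance since the last run, which is the point at which you would update the list item permissions in the top-level site.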
You can use SharePoint auditing to monitor permission changes. It will track changes down to item level. The downside is that you have to turn this feature on and it will hurt performance somewhat.
As for notification, I don't think auditing tells you about changes. I'm pretty sure you would need to poll the audit log.
There's heaps of information about auditing in this article on MSDN.
Another approach which I think might do a very good job of this is to use the SharePoint ChangeLog. Basically, this is used by SharePoint during indexing, with the log telling the gatherer exactly what has changed, and what should be indexed during an incremental crawl.
When you have a permission change, then this should be picked up during an incremental crawl. The ChangeLog has specific parameters that can be passed to identify changes to permissions. Take a look here at the SPChangeQuery Class:
http://msdn.microsoft.com/en-us/library/microsoft.sharepoint.spchangequery.aspx
Specifically you can look for ChangeTypes:
http://msdn.microsoft.com/en-us/library/microsoft.sharepoint.spchangetype.aspx
Including:
AssignmentAdd
AssignmentDelete
MemberAdd
MemberDelete
...and more