Are SharePoint Lists evil in an SOA based enterprise? - sharepoint

My company is moving from Client/Server applications (thick client apps that make database calls directly) to a Service Oriented Architecture (SOA) (thin or thick clients that call a web service that then does business logic and calls the database).
Part of this includes using SharePoint as our client (not our only client type, but the major one). I have been watching the Pluralsight training on SharePoint and I am starting to see a lot about SharePoint Lists.
SharePoint Lists seem to be a core part of SharePoint. However, they also seem to be a huge step backward, architecturally speaking. These are my concerns:
Using these lists, I will have my SharePoint webparts hitting the data directly again (much like where we were with 2 tier client/server apps).
This confuses the data layer big time. Do I store my list of clients in the SQL Server database? Or a SharePoint list? Or both? (say it ain't so!) If both, how do I keep them in Sync?
If I store the data in SharePoint Lists do I then have to have my Web Services using the SharePoint Client Object Model to get at the lists (for non-SharePoint clients)?
Basically SharePoint Lists seem like a very very bad idea. But what I hear is that it is one of the big benefits of SharePoint. (Though I know that there are things like resource management and permissions that are also useful in SharePoint.)
SharePoint Lists seem like an attempt at low grade data storage (without all the benefits of a full data management solution like SQL Server).
So here are my questions: What are the right/best practice reasons why I would use SharePoint Lists over web services that access a SQL Server? And can SharePoint even work normally using web services to get and update data? (Basically, if I don't use lists, do I lose a lot of functionality?)

SharePoint lists are not a one-size-fits-all solution to data storage. There are a great many scenarios where you'll want to use data available from an external system, like an existing CRM database, inside of SharePoint.
SharePoint 2007 used a concept called Business Data Catalog to address some of these scenarios, allowing a read-only view of external system data in SharePoint lists.
SharePoint 2010 greatly expands on the SharePoint 2007 capabilities with Business Connectivity Services, allowing for full read/write from SharePoint lists, with API access allowing custom connectors to be implemented in code for whatever backend system you may be trying to access (a SQL Server provider is provided out of the box). Here's a pretty thorough primer on the BCS, and there's a lot more information to be found on MSDN.
Be wary of trying to use SharePoint lists as traditional tables in an RDBMS; that isn't their purpose, and it will only lead to intense headaches down the road.

While I agree with the answer by OedipusPrime, I feel your question "Why would I use lists" warrants a more detailed answer.
The short version is, you probably shouldn't. What SharePoint gives you are lists which are a bit 'database-like', but simple enough that your ordinary user can cope with them. They're quite flexible for users. It also gives you a user interface to interact with the lists and data.
You're not using the UI, and you're probably quite happy with SQL, so SQL should probably be your choice. There's nothing that you can do in SharePoint that you can't do yourself in SQL (often faster) - but SQL isn't as user friendly for non-techies to set things up. SharePoint isn't a "full data management solution" like SQL - it's more like ASP.NET on steroids, and it has different advantages. (That's why its back end is ... SQL)
So, where would you store your data? SQL or Lists? One or the other. Don't do both; that never works out well. If your data is in SQL, you can expose it in SharePoint with the BCS as mentioned already.
If your data is in a SharePoint list, yes, you can use the Client Object model. Or you can use Web Services directly. Or the REST API. All those are valid options.
Or, you could expose data from your database via your own Web Services, and then consume those via SharePoint's BCS, allowing you to present your data in SharePoint (with full CRUD, if you want) without your application becoming dependent upon it.
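To make the REST option above a little more concrete, here is a minimal sketch of how a non-SharePoint client might build the URL for reading list items. It assumes SharePoint 2013 or later, where the `/_api` endpoint is available; the site URL, list title, and field names are hypothetical, and `buildListItemsUrl` is just an illustrative helper, not part of any SharePoint library.

```javascript
// Sketch: build a SharePoint REST URL for reading list items.
// Assumes SharePoint 2013+ (the /_api endpoint). The site URL, list title,
// and field names passed in are hypothetical examples.
function buildListItemsUrl(siteUrl, listTitle, options) {
  const opts = options || {};

  // Inside an OData string literal, a single quote is escaped by doubling it.
  const escapedTitle = listTitle.replace(/'/g, "''");

  // Trim trailing slashes from the site URL before appending the endpoint.
  const base = siteUrl.replace(/\/+$/, '') +
    "/_api/web/lists/getbytitle('" + encodeURIComponent(escapedTitle) + "')/items";

  const query = [];
  if (opts.select && opts.select.length > 0) {
    // $select limits the returned fields.
    query.push('$select=' + opts.select.map((f) => encodeURIComponent(f)).join(','));
  }
  if (opts.top) {
    // $top limits the number of returned items.
    query.push('$top=' + Number(opts.top));
  }
  return query.length > 0 ? base + '?' + query.join('&') : base;
}

const url = buildListItemsUrl('https://intranet.example.com/sites/crm/', 'Clients', {
  select: ['Title', 'Id'],
  top: 10,
});
console.log(url);
```

A client would then issue an authenticated GET against that URL with an `Accept: application/json;odata=verbose` header. On SharePoint 2010 the REST interface lives under `_vti_bin/ListData.svc` instead, so the URL shape would differ.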

You are partially right. Regarding your options, here is the route you should go:
You should store the data only in lists, not in a separate SQL Server database. List data is ultimately stored in the SharePoint content database in SQL Server anyway, so there is no point syncing it.
To have your clients access the data, your web services can call the out-of-the-box web services exposed by SharePoint, which can operate on list data.
See these articles for an overview of the web services exposed by SharePoint:
http://msdn.microsoft.com/en-us/library/ee538665.aspx
http://www.sharepointmonitor.com/2007/01/sharepoint-web-service/

This is one of the big questions that faces someone new to SharePoint - should I store X in a SharePoint list, or in a SQL Server table?
SharePoint lists have similarities to database tables in that they have rows (items) and columns and the equivalent of triggers (event receivers). Data management pages are part of SharePoint so there is no need to build pages for updating and adding items to the table, and additionally the lists can be exposed as RSS feeds or through web services. They can also be versioned and participate in a workflow. Another advantage is that the contents of the lists are automatically included in content backups, so there is no need to manage a separate backup and restore process – everything is in the content database. There may not necessarily be a performance impact because there are several caching mechanisms which come into play.
A SharePoint list should certainly be considered as a storage mechanism, even for large datasets with appropriate treatment. In one sense the SharePoint list is acting as an effective data access layer, bearing in mind that ultimately the data is being stored in a SQL Server database anyway. What you do not get is the rigour of relational modelling, normalisation, referential integrity, optimisation of the execution plan, and all the other tools of the DBA’s craft. If the efficient management of the data is dependent on those skills then storing the data directly in its own database is probably a better choice. You can still get at the data through BCS, as well as through custom code.
A word of warning: on no account be tempted to interact with the SharePoint content databases directly.

Related

Benefits of using a hosted search service over building your own

I'm building a B2B Node app which has heavily related data models. We currently have our own search queries, but as we scale some of the queries appear to be becoming sluggish.
We will need to support multilingual search as well as content-based searches (searching matching content within related data).
The queries are growing more and more complicated (each has multiple joins on joins on joins) and I'm now considering a hosted search tool such as Algolia.
Given my concerns below, why should I use a hosted cloud search service rather than continue building my own queries?
Data privacy is important
Data is hosted in our own postgres DB - integrations with that are important (e.g.: will I now need to manually maintain our DB data and data in Algolia?)
Speed will be important, but not so much now
Must be able to do content-based searches across multiple languages
We are a tiny team of devs now, so dev resource time is vital
What other things should I be concerned about that can help make a decision in search capabilities?
Regarding maintenance of both DB and Cloud data, it seems it's as simple as getting all data, caching it, and storing it in the cloud:
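// Assumes the algoliasearch v3 client; the credentials below are placeholders.
var algoliasearch = require('algoliasearch');
var Algolia = algoliasearch('YourApplicationID', 'YourAPIKey');

var index = Algolia.initIndex('contacts');
var contactsJSON = require('./contacts.json');

// Push every contact record to the hosted index.
index.addObjects(contactsJSON, function(err, content) {
  if (err) {
    console.error(err);
  }
});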
Search services like Algolia or self-hosted Elasticsearch/Solr operate as full text search, not relational db queries.
But it sounds like the bottleneck is the continual rejoining. If you can make your relational data act like a full text document db, that could be a more efficient type of index (pre-joined, sort of).
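As a rough illustration of the "pre-joined" idea, the sketch below flattens related rows into one document per contact before indexing, so the join happens once at index time rather than on every query. The table shapes (`contacts`, `companies`, `notes`) and field names are hypothetical, and `objectID` is only used because Algolia expects a field by that name.

```javascript
// Sketch: denormalize relational rows into one search document per contact.
// Hypothetical shapes: contacts reference companies via companyId,
// and notes reference contacts via contactId.
function buildSearchDocuments(contacts, companies, notes) {
  // Index companies by id so each lookup is O(1) instead of a scan.
  const companyById = new Map(companies.map((c) => [c.id, c]));

  // Group note bodies by the contact they belong to.
  const notesByContact = new Map();
  for (const note of notes) {
    const list = notesByContact.get(note.contactId) || [];
    list.push(note.body);
    notesByContact.set(note.contactId, list);
  }

  // Emit one flat, pre-joined document per contact.
  return contacts.map((contact) => {
    const company = companyById.get(contact.companyId);
    return {
      objectID: String(contact.id),
      name: contact.name,
      companyName: company ? company.name : null,
      notes: notesByContact.get(contact.id) || [],
    };
  });
}

const docs = buildSearchDocuments(
  [{ id: 1, name: 'Ada', companyId: 10 }, { id: 2, name: 'Bob', companyId: 99 }],
  [{ id: 10, name: 'Acme' }],
  [{ contactId: 1, body: 'Met at expo' }, { contactId: 1, body: 'Prefers email' }]
);
console.log(JSON.stringify(docs, null, 2));
```

The trade-off is that a change to a company or note means re-indexing every document that embeds it, so this tends to suit data that is read far more often than it is written.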
You might also look into views, or a data warehouse (maybe star schema).
But if you are going the search route, maybe investigate hosting your own Elasticsearch.
You could specify database, schema, sql, index, query details if you want more help.
Full Disclosure: I founded a company called SearchStax on the premise that companies and developers should not spend time setting up, managing, scaling or building tools for the search infrastructure (ops) - they are better off investing time of their employees into building value for the company, whether that be features, capabilities, product or customers.
Open source search solutions built on top of Lucene (Apache Solr / Elasticsearch) have the search engine capabilities you need now and are likely to need in the near future. Find a mature service provider / as-a-service company that specializes in open source search and let them deal with all of it. It may look like a small effort right now, but it's probably not worth your devs' time and effort to handle the operations side.
For your concerns mentioned above:
Data privacy is important
Your concerns around privacy and security are addressable. There are multiple ways you can secure your Solr environment, and the right MSP (managed solution provider) should be able to address those.
a. Security at the transport layer can be addressed by SSL certificates. All the data going over the wire is encrypted.
b. IP Filtering and User Based Authentication should address who has access to what. Solr-as-a-Service offering by Measured Search supports both.
c. Security at rest can be addressed in multiple ways - OS level / File encryption, but you can even go further by ensuring not even your services provider has access to that data by using Searchable Encryption technology.
Privacy concerns are all addressed by Terms & Conditions - I am sure your legal department will address that from a service provider's perspective.
Data is hosted in our own postgres DB - integrations with that are important
Solr provides the ability to import data directly from a traditional relational database (MySQL, Postgres, Oracle, etc.) via the Data Import Handler (DIH). You can either use that so Solr pulls data periodically, or write your own simple script to push data through the Solr APIs.
If you are hosted in the cloud (AWS), a tunnel can be created so only the Solr deployments have the ability to pull data from your servers and your database servers are not exposed to the world, if you choose to go the DIH route.
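For the push option mentioned above, a script mostly needs to POST a JSON array of documents to the collection's `/update` handler. The sketch below only builds that request; the base URL, collection name, and document fields are hypothetical, and `buildSolrUpdateRequest` is an illustrative helper rather than anything Solr ships.

```javascript
// Sketch: build an HTTP request that pushes documents to Solr's JSON update handler.
// The base URL, collection name, and document fields are hypothetical.
function buildSolrUpdateRequest(solrBaseUrl, collection, docs) {
  return {
    // commit=true makes the documents searchable right away; simple, but
    // committing on every push may be too costly for frequent updates.
    url: solrBaseUrl.replace(/\/+$/, '') + '/solr/' +
      encodeURIComponent(collection) + '/update?commit=true',
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    // Solr accepts a JSON array of documents as the request body.
    body: JSON.stringify(docs),
  };
}

const request = buildSolrUpdateRequest('https://search.example.com/', 'contacts', [
  { id: '1', name: 'Ada', company_name: 'Acme' },
]);
console.log(request.url);
```

The resulting object can be handed to whatever HTTP client you already use, for example `fetch(request.url, request)` on a recent Node version.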
Speed will be important, but not so much now
Solr is built for search speed - I don't think that's where your problems are going to be. With a service offering like Measured Search's, you can spin up a cluster in any data center supported by AWS or Azure and make sure your search deployments are close to your application servers, so the latency overhead is minimal.
Must be able to do content-based searches across multiple languages
Yes, Solr supports that. More than 30 languages.
We are a tiny team of devs now, so dev resource time is vital
I am biased here, but I would not have my developers spend much time on operations; I would let them focus on what they do best - building great product capabilities that push the limits and deliver business value.
If you are interested in comparing the ROI of doing it yourself against using a Solr-as-a-Service offering like SearchStax's, check this paper out - https://www.searchstax.com/white-papers/why-measured-search-is-better-than-diy-solr-infrastructure/

Sharepoint Beginner Questions: where to store data?

I'm about to develop an application in SharePoint.
I've got experience in ASP.NET and C#, Domino, Java, etc.
Now my $1000 question: Where can I store data in SharePoint? I'm aware there are list definitions, so is it good practice to store the data "natively" in SharePoint using lists, or traditionally in an external data container, e.g. MS SQL 2008?
Because SharePoint is essentially a .NET Web Application, the options are virtually limitless for how you store data used in your application. The two most common practices would be to use SharePoint lists to store your data, or to store the data in a SQL database.
I would suggest that each has its advantages. A SharePoint list is advantageous because it can be seen by the users and you can leverage out of the box features to allow users to do CRUD operations. A SQL database makes more sense when the size of the data is large and does not fit well within the constructs of the SharePoint lists. SQL is going to perform much faster when doing bulk operations.
Hope this helps!

Single Shared Database, Fluent NHibernate, Many clients

I am working on an inventory application (C# .NET 4.0) that will simultaneously inventory dozens of workstations and write the results to a central database. To save me having to write a DAL, I am thinking of using Fluent NHibernate, which I have never used before.
Is it safe and good practice to allow the inventory application, which runs as a standalone application, to talk directly to the database using NHibernate? Or should I be using a client/server model where all access to the database goes through a server which then reads/writes to the database? In other words, if 50 workstations were being inventoried at the same time, there would be 50 active DB sessions. I am thinking of using GUID-Comb for the PK IDs.
Depending on the environment in which your application will be deployed, you should also consider that direct database connections to a central server might not always be allowed for security reasons.
Creating a simple REST service with WCF (using WebServiceHost) and simply POSTing or PUTting your inventory data (using HttpClient) might provide a good alternative.
As a result, clients can be very simple and can easily be written for other systems (Linux? Android?), and the server has full control over how and where data is stored.
It depends ;)
NHibernate has optimistic concurrency control out of the box, which is good enough for many situations. So if you just create data on 50 different stations, there should be no problem. If creating data on one station depends on data from all stations, it gets tricky and a central server would help.
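To show what optimistic concurrency means in practice, here is a small language-neutral sketch of the version-check idea using an in-memory store. It is not NHibernate's implementation; `VersionedStore` and its methods are hypothetical names used only for illustration.

```javascript
// Sketch: version-based optimistic concurrency with an in-memory store.
// Not NHibernate's implementation; all names here are hypothetical.
class VersionedStore {
  constructor() {
    this.rows = new Map();
  }

  // New rows start at version 1.
  insert(id, data) {
    this.rows.set(id, { version: 1, data: data });
    return 1;
  }

  // Readers get the data together with the version they saw.
  read(id) {
    const row = this.rows.get(id);
    return row ? { version: row.version, data: row.data } : undefined;
  }

  // The update only succeeds if nobody changed the row since it was read.
  update(id, expectedVersion, data) {
    const row = this.rows.get(id);
    if (!row || row.version !== expectedVersion) {
      return false; // stale read: the caller should reload and retry
    }
    this.rows.set(id, { version: expectedVersion + 1, data: data });
    return true;
  }
}

const store = new VersionedStore();
store.insert('pc-01', { ram: 8 });

// Two stations read the same row at version 1.
const a = store.read('pc-01');
const b = store.read('pc-01');

console.log(store.update('pc-01', a.version, { ram: 16 })); // first writer wins
console.log(store.update('pc-01', b.version, { ram: 32 })); // second writer is rejected
```

With 50 stations that each only insert their own rows, the conflict branch never triggers, which is why the insert-only case is unproblematic.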

SharePoint List like Data Access Interface

I am impressed by the way we programmatically access lists in SharePoint. I perceive it as a Data Access Layer, where modeling the database is as simple as defining the columns in the List.
I am looking for a tool or an application that would give me a similar interface to a database. Basically, for some reason I cannot use SharePoint, and I don't wish to take on the responsibility of modeling, deploying and maintaining the database. I find the SharePoint way of persistence management acceptable and exciting.
Can anyone suggest something even close to this?
BTW, my application is on ASP.Net and my preferred RDBMS is MS SQL Server.
If you don't want the overhead and expense of a SharePoint installation, 90% of the time all you really need is WSS 3.0 (free with a Windows Server license).
For auto-generated entity classes you can use LINQ to SharePoint (SPMetal).
For hand-written POCO entities you can try the SharepointCommon ORM.
Use a NoSQL database like MongoDB or CouchDB. These are schemaless, allowing you to freely add fields to JSON documents without having to define a schema first.

Should we use the SharePoint WF host for workflows that include external (to SharePoint) data sources?

We need to build a couple of applications that require fairly advanced workflow functionality. The plan is to store the data in SQL Server, use Windows Workflow Foundation as the workflow engine, and build the frontend using an RIA technology such as Flex or Silverlight.
We already have SharePoint 2007 set up, and some of us (including me) have a little bit of experience creating custom SharePoint workflows that work with data in SharePoint lists.
My question is, would it make sense to use SharePoint for the workflow, while the actual data is stored outside of SharePoint in a separate database? We need the task, authentication, and email functionality of SharePoint, but our data model is a bit complex so we'd rather not store the data in SharePoint. We'd rather not start from scratch with Workflow Foundation, because SharePoint already gives us 90% of the functionality we need.
Any thoughts / advice?
I think that this is a great example of using SharePoint as a development platform. I don't see any conceptual problems using it in the way that you describe. One thing you might want to keep in mind: if you want the workflow to continue on events happening in the separate database, you might have to update, for instance, the workflow task items from an external program.
Your use case is a perfect fit and one that SharePoint adds great value to. I would highly recommend using SharePoint to host your workflows.
I have developed many SharePoint-hosted WF workflows, and the only real problem that I ever experienced was making calls to long-running web services (asynchronous operations), as SharePoint's WF host has some limitations on the type of external providers it can listen for events from.
The solution that I developed (which was a bit of a hack at first but ended up being of some value to my customers) was to create a service proxy (WCF) that sat outside of SharePoint, routed calls to remote services, and waited for their responses. Alongside the asynchronous call, a parallel activity would create a SharePoint task associated with the asynchronous operation. The WF would then stop on an OnTaskCompleted activity, which causes the WF resources to be released and the state to be persisted to SQL. As the long-running operation sent back status updates or a completion notification, the external service would update the related SharePoint task. Once the task is marked completed, the WF is rehydrated and continues executing. The neat thing about this approach was that I could then create a dashboard that showed the status of all the long-running processes going on outside of SharePoint. Lastly, I packaged all of this up into a composite activity so that it didn't clutter up my pretty workflow diagrams.
SharePoint is ideally suited for this scenario. I would suggest using a Business Data Catalog (BDC) to access external data sources. It provides a tremendous benefit, primarily by making your data source searchable, as well as providing OOB web parts to display the data with master-child relationships, filtering and a rich API.
I would caution against making workflows too complex and instead break up the process into stages using smaller workflows, InfoPath and user actions to facilitate the entire process. This is where SharePoint really shines, as you can give others in the organization visibility of the process stages using dashboards (if it makes sense for your scenario), as well as collaboration, approvals ... the list goes on.
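The flow described above can be sketched as a small simulation. This is not SharePoint or WF code: `TaskList` stands in for the SharePoint task list, `startWorkflow` stands in for the workflow waiting on OnTaskCompleted, and all names are hypothetical.

```javascript
// Sketch: simulate the "park the workflow on a task" pattern in memory.
// TaskList stands in for a SharePoint task list; all names are hypothetical.
class TaskList {
  constructor() {
    this.tasks = new Map();
    this.listeners = new Map();
    this.nextId = 1;
  }

  create(title) {
    const id = this.nextId++;
    this.tasks.set(id, { id: id, title: title, status: 'Not Started' });
    return id;
  }

  // Register a one-shot callback, analogous to waiting on OnTaskCompleted.
  onCompleted(id, callback) {
    this.listeners.set(id, callback);
  }

  // Called by the external service proxy as the remote operation progresses.
  update(id, status) {
    const task = this.tasks.get(id);
    if (!task) {
      throw new Error('Unknown task ' + id);
    }
    task.status = status;
    if (status === 'Completed' && this.listeners.has(id)) {
      const callback = this.listeners.get(id);
      this.listeners.delete(id);
      callback(task);
    }
  }
}

function startWorkflow(taskList, callRemoteService) {
  const workflow = { state: 'Running', taskId: null };
  // 1. Create a task that represents the long-running operation.
  workflow.taskId = taskList.create('Long-running remote call');
  // 2. Park the workflow: its state is persisted and resources are released.
  workflow.state = 'Persisted';
  taskList.onCompleted(workflow.taskId, () => {
    // 4. Task completed: the workflow is reloaded and continues.
    workflow.state = 'Completed';
  });
  // 3. Hand the call to the external proxy, which returns immediately.
  callRemoteService(workflow.taskId);
  return workflow;
}

const taskList = new TaskList();
const pendingCalls = [];
const workflow = startWorkflow(taskList, (taskId) => pendingCalls.push(taskId));
console.log(workflow.state); // parked while the remote call is in flight

taskList.update(pendingCalls[0], 'In Progress'); // status update from the proxy
taskList.update(pendingCalls[0], 'Completed');   // completion notification
console.log(workflow.state);
```

Because every long-running call is represented by a task, listing the tasks and their statuses gives the dashboard view mentioned above with no extra bookkeeping.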
I agree that SP can provide a nice WF engine, but let me ask this... are you storing anything IN SharePoint? (tasks, data sources, etc)
I ask because it may be as easy (and more appropriate) to run your own WF engine. If you are running all native WF functionality, and just need an engine, you can write a quick console app that can start workflows.
If you are using SP for anything beyond WF, then I absolutely agree to use SP.
