SharePoint beginner question: where to store data?

I'm about to develop an application in Sharepoint.
I've got experience in ASP.NET and C#, Domino, Java, etc.
Now my $1000 question: where can I store data in SharePoint? I'm aware there are list definitions, so is it a good practice to store the data "natively" in SharePoint using lists, or traditionally in an external data container, e.g. MS SQL Server 2008?

Because SharePoint is essentially a .NET Web Application, the options are virtually limitless for how you store data used in your application. The two most common practices would be to use SharePoint lists to store your data, or to store the data in a SQL database.
I would suggest that each has its advantages. A SharePoint list is advantageous because it can be seen by the users and you can leverage out-of-the-box features to allow users to do CRUD operations. A SQL database makes more sense when the data is large and does not fit well within the constructs of SharePoint lists. SQL is going to perform much faster when doing bulk operations.
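For illustration, here is a minimal sketch of what list CRUD looks like in code, using the server-side object model (this assumes a farm solution running on the SharePoint server and an existing list called "Customers" - both placeholders):

    using Microsoft.SharePoint;

    public static class CustomerListDemo
    {
        public static void Run()
        {
            // Placeholder site URL and list name - adjust to your environment.
            using (SPSite site = new SPSite("http://server/sites/app"))
            using (SPWeb web = site.OpenWeb())
            {
                SPList list = web.Lists["Customers"];

                // Create
                SPListItem item = list.Items.Add();
                item["Title"] = "Contoso";
                item.Update();

                // Read (CAML query by Title)
                SPQuery query = new SPQuery
                {
                    Query = "<Where><Eq><FieldRef Name='Title'/><Value Type='Text'>Contoso</Value></Eq></Where>"
                };
                SPListItemCollection results = list.GetItems(query);

                // Update
                SPListItem found = results[0];
                found["Title"] = "Contoso Ltd";
                found.Update();

                // Delete
                found.Delete();
            }
        }
    }

With a SQL database you would write the equivalent with ADO.NET or an ORM; what the list buys you is the out-of-the-box UI, permissions and versioning on top.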
Hope this helps!

Related

Application Insights -> export -> Power BI Data Warehouse Architecture

Our team has just recently started using Application Insights to add telemetry data to our Windows desktop application. This data is sent almost exclusively in the form of events (rather than page views etc.). Application Insights is useful only up to a point; to answer anything other than basic questions we are exporting to Azure storage and then using Power BI.
My question is one of data structure. We are new to analytics in general and have just been reading about star/snowflake structures for data warehousing. This looks like it might help in providing the answers we need.
My question is quite simple: is this the right approach? Have we overcomplicated things? My current feeling is that a better approach would be to pull the latest data and transform it into a SQL database of facts and dimensions for Power BI to query. Does this make sense? Is this what other people are doing? We have realised that this is more work than we initially thought.
Definitely pursue Michael Milirud's answer; if your source product has suitable analytics, you might not need a data warehouse.
Traditionally, a data warehouse has three advantages: it integrates information from different data sources, both internal and external; data is cleansed and standardised across sources; and the history of change over time ensures that data is available in its historic context.
What you are describing is becoming a very common case in data warehousing, where star schemas are created for access by tools like Power BI, Qlik or Tableau. In smaller scenarios the entire warehouse might be held in the Power BI data engine, but larger data might need pass-through queries.
In your scenario, you might be interested in some tools that appear to handle at least some of the migration of Application Insights data:
https://sesitai.codeplex.com/
https://github.com/Azure/azure-content/blob/master/articles/application-insights/app-insights-code-sample-export-telemetry-sql-database.md
Our product Ajilius automates the development of star schema data warehouses, cutting development time to days or weeks. There are a number of other products doing a similar job; we maintain a complete list of industry competitors to help you choose.
I would continue with Power BI - it actually has a very sophisticated and powerful data integration and modeling engine built in. Historically I've worked with SQL Server Integration Services and Analysis Services for these tasks - Power BI Desktop is superior in many aspects. The design approaches remain consistent - star schemas etc, but you build them in-memory within PBI. It's way more flexible and agile.
Also are you aware that AI can be connected directly to PBI Web? This connects to your AI data in minutes and gives you PBI content ready to use (dashboards, reports, datasets). You can customize these and build new reports from the datasets.
https://powerbi.microsoft.com/en-us/documentation/powerbi-content-pack-application-insights/
What we ended up doing was not sending events from our WinForms app directly to AI, but to an Azure Event Hub.
We then created a job that reads from the Event Hub and sends the data to:
AI, using the SDK
Blob storage, for later processing
Azure Table storage, to create Power BI reports
You can of course add more destinations.
So basically all events are sent to one destination, and from there stored in many destinations, each for its own purpose. We definitely did not want to be restricted to 7 days of raw data, and since storage is cheap, blob storage can also be used by many of the Azure and Microsoft analytics offerings.
The Event Hub can be linked to Stream Analytics as well.
More information about Event Hubs can be found at https://azure.microsoft.com/en-us/documentation/articles/event-hubs-csharp-ephcs-getstarted/
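If it helps, here is a rough sketch of the sending side of that pattern (connection string, hub name and event shape are placeholders; this uses the Microsoft.ServiceBus.Messaging client from the WindowsAzure.ServiceBus package that was current at the time - the newer Azure.Messaging.EventHubs package has an equivalent producer API):

    using System;
    using System.Text;
    using Microsoft.ServiceBus.Messaging;   // WindowsAzure.ServiceBus NuGet package
    using Newtonsoft.Json;

    public static class TelemetrySender
    {
        // Placeholder values - use your own namespace, key and hub name.
        private const string ConnectionString =
            "Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=<key-name>;SharedAccessKey=<key>";
        private const string HubName = "telemetry";

        public static void Send(string eventName, string userId)
        {
            EventHubClient client = EventHubClient.CreateFromConnectionString(ConnectionString, HubName);

            // Each telemetry event becomes a small JSON payload; the worker job on the
            // other side fans it out to AI, blob storage and table storage.
            string payload = JsonConvert.SerializeObject(new
            {
                EventName = eventName,
                UserId = userId,
                TimestampUtc = DateTime.UtcNow
            });

            client.Send(new EventData(Encoding.UTF8.GetBytes(payload)));
        }
    }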
You can start using the recently released Application Insights Analytics feature. In Application Insights we now let you write any query you would like, so that you can get more insights out of your data. Analytics runs your queries in seconds, lets you filter / join / group by any possible property, and you can also run these queries from Power BI.
More information can be found at https://azure.microsoft.com/en-us/documentation/articles/app-insights-analytics/

Azure Table Storage - Entity Design Best Practices Question

I'm writing a 'proof of concept' application to investigate the possibility of moving a bespoke ASP.NET ecommerce system over to Windows Azure during a necessary re-write of the entire application.
I'm tempted to look at using Azure Table Storage as an alternative to SQL Azure, as the entities being stored are likely to change their schema (properties) over time as the application matures further, and I won't need to make endless database schema changes. In addition, we can build referential integrity into the application code - so the case for considering Azure Table Storage is a strong one.
The only potential issue I can see at this time is that we do a small amount of simple reporting - i.e. value of sales between two dates, number of items sold for a particular product, etc.
I know that Table Storage doesn't support aggregate-type functions, and I believe we can achieve what we want with clever use of partitions, multiple entity types to store subsets of the same data, and possibly pre-aggregation, but I'm not 100% sure how to go about it.
Does anyone know of any in-depth documents about Azure Table Storage design principles, so that we make proper and efficient use of tables, PartitionKeys, entity design, etc.?
There are a few simplistic documents around, and the current books available tend not to go into this subject in much depth.
FYI - the ecommerce site has about 25,000 customers and takes about 100,000 orders per year.
Have you seen this post?
http://blogs.msdn.com/b/windowsazurestorage/archive/2010/11/06/how-to-get-most-out-of-windows-azure-tables.aspx
Pretty thorough coverage of tables
I think there are three potential issues in porting your app to Table Storage:
The lack of reporting - including aggregate functions - which you've already identified
The limited availability of transaction support - with 100,000 orders per year I think you'll end up missing this support.
Some problems with costs - $1 per million operations is only a small cost, but you may need to factor this in if you get a lot of page views.
Honestly, I'd consider a hybrid approach - perhaps EF or NH against SQL Azure for critical data, with large objects stored in Table/Blob storage.
Enough of my opinion! For "in depth":
try the storage team's blog http://blogs.msdn.com/b/windowsazurestorage/ - I've found this very good
try the PDC sessions from Jai Haridas (couldn't spot a link - but I'm sure it's still there)
try articles inside Eric's book - http://geekswithblogs.net/iupdateable/archive/2010/06/23/free-96-page-book---windows-azure-platform-articles-from.aspx
there's some very good best-practice advice on http://azurescope.cloudapp.net/ - but this is somewhat performance-orientated
If you have started looking at Azure storage such as Table Storage, it would do no harm to look at other NoSQL offerings on the market (especially document databases). This would give you insight into the NoSQL space and how solutions around such stores are designed.
You can also think about a hybrid approach of a SQL DB plus a NoSQL solution; parts of the system may lend themselves very well to the Azure Table Storage model.
NoSQL solutions such as Azure Table Storage have their own challenges, such as:
Schema changes for data
Transactional support
ACID constraints
All table design papers I have seen are pretty much exclusively focused on the topics of scalability and search performance. I have not seen anything related to design considerations for reporting or BI.
Azure tables are accessible through REST APIs and via the Azure SDK. Depending on what reporting you need, you might be able to pull out the information you require with minimal effort. If your reporting requirements are very sophisticated, then perhaps SQL Azure together with Windows Azure SQL Reporting Services might be a better option to consider.
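As a rough illustration of that (classic Microsoft.WindowsAzure.Storage SDK; the table name, entity shape and the per-day PartitionKey are all just assumptions to show the idea), partitioning orders by day turns "value of sales between two dates" into a PartitionKey range query, with the aggregation done client-side:

    using System;
    using System.Linq;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Table;

    // One possible entity design: PartitionKey = order date (yyyyMMdd), RowKey = order id.
    public class OrderEntity : TableEntity
    {
        public OrderEntity() { }

        public OrderEntity(DateTime orderDateUtc, string orderId)
        {
            PartitionKey = orderDateUtc.ToString("yyyyMMdd");
            RowKey = orderId;
        }

        public string ProductCode { get; set; }
        public double OrderValue { get; set; }   // Table Storage has no decimal type
    }

    public static class SalesReport
    {
        public static double TotalSales(string connectionString, DateTime fromUtc, DateTime toUtc)
        {
            CloudTable table = CloudStorageAccount.Parse(connectionString)
                                                  .CreateCloudTableClient()
                                                  .GetTableReference("Orders");

            string lower = TableQuery.GenerateFilterCondition(
                "PartitionKey", QueryComparisons.GreaterThanOrEqual, fromUtc.ToString("yyyyMMdd"));
            string upper = TableQuery.GenerateFilterCondition(
                "PartitionKey", QueryComparisons.LessThanOrEqual, toUtc.ToString("yyyyMMdd"));

            var query = new TableQuery<OrderEntity>()
                .Where(TableQuery.CombineFilters(lower, TableOperators.And, upper));

            // Aggregation happens on the client - Table Storage has no server-side SUM.
            return table.ExecuteQuery(query).Sum(o => o.OrderValue);
        }
    }

For larger volumes you would probably pre-aggregate into a separate daily summary entity at write time rather than scanning the raw orders each time.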

SharePoint comments module best approach

I have a requirement to make a comments web part that allows paging. Paging is a common feature throughout the design.
What I was wondering is: is a web part the best way of going about this, or is there another approach that would be better suited to SharePoint?
I am not 100% sure what you are asking... but LINQ to SharePoint in SharePoint 2010 features Skip/Take functionality, which can provide paging for lists, if that is where your data is being persisted. If your data is being persisted to a database, you can obviously use the LINQ Skip/Take functionality there.
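As a rough sketch of the general Skip/Take pattern (plain LINQ against any IQueryable source; the Comment type and page size are invented for the example):

    using System;
    using System.Linq;

    public class Comment
    {
        public string Author { get; set; }
        public string Text { get; set; }
        public DateTime CreatedUtc { get; set; }
    }

    public static class CommentPaging
    {
        // pageNumber is 1-based; a stable ordering is needed for predictable pages.
        public static IQueryable<Comment> GetPage(IQueryable<Comment> comments, int pageNumber, int pageSize)
        {
            return comments
                .OrderByDescending(c => c.CreatedUtc)
                .Skip((pageNumber - 1) * pageSize)
                .Take(pageSize);
        }
    }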
Not sure if that helps

Are SharePoint Lists evil in an SOA based enterprise?

My company is moving from Client/Server applications (thick client apps that makes database calls directly) to a Service Oriented Architecture (SOA) (thin or thick clients that call a web service that then does business logic and calls the database).
Part of this includes using SharePoint as our client (not our only client type, but the major one). I have been watching the Pluralsight training on SharePoint and I am starting to see a lot about SharePoint Lists.
SharePoint Lists seem to be a core part of SharePoint. However, they also seem to be a huge step backward, architecturally speaking. These are my concerns:
Using these lists, I will have my SharePoint webparts hitting the data directly again (much like where we were with 2 tier client/server apps).
This confuses the data layer big time. Do I store my list of clients in the SQL Server database? Or a SharePoint list? Or both? (say it ain't so!) If both, how do I keep them in Sync?
If I store the data in SharePoint Lists do I then have to have my Web Services using the SharePoint Client Object Model to get at the lists (for non-SharePoint clients)?
Basically SharePoint Lists seem like a very very bad idea. But what I hear is that it is one of the big benefits of SharePoint. (Though I know that there are things like resource management and permissions that are also useful in SharePoint.)
SharePoint Lists seem like an attempt at low-grade data storage (without all the benefits of a full data management solution like SQL Server).
So here are my questions: what are the right/best-practice reasons why I would use SharePoint Lists over web services that access a SQL Server? And can SharePoint even work normally using web services to get and update data? (Basically, if I don't use lists, do I lose a lot of functionality?)
SharePoint lists are not a one-size-fits-all solution to data storage. There are a great many scenarios where you'll want to use data available from an external system, like an existing CRM database, inside of SharePoint.
SharePoint 2007 used a concept called Business Data Catalog to address some of these scenarios, allowing a read-only view of external system data in SharePoint lists.
SharePoint 2010 greatly expands on the SharePoint 2007 capabilities with Business Connectivity Services, allowing for full read/write from SharePoint lists, with API access allowing custom connectors to be implemented in code for whatever backend system you may be trying to access (a SQL Server provider is provided out of the box). Here's a pretty thorough primer on the BCS, and there's a lot more information to be found on MSDN.
Be wary of trying to use SharePoint lists as traditional tables in an RDBMS; that isn't their purpose, and it will only lead to intense headaches down the road.
While I agree with the answer by OedipusPrime, I feel your question "Why would I use lists" warrants a more detailed answer.
The short version is, you probably shouldn't. What SharePoint gives you are lists which are a bit 'database-like', but simple enough that your ordinary user can cope with them. They're quite flexible for users. It also gives you a user interface to interact with the lists and data.
You're not using the UI, and you're probably quite happy with SQL, so SQL should probably be your choice. There's nothing that you can do in SharePoint that you can't do yourself in SQL (often faster) - but SQL isn't as user friendly for non-techies to set things up. SharePoint isn't a "full data management solution" like SQL - it's more like ASP.NET on steroids, and it has different advantages. (That's why its back end is ... SQL)
So, where would you store your data? SQL or Lists. One or the other - don't do both, that never works out well. If your data is in SQL, you can expose it in SharePoint with the BCS as mentioned already.
If your data is in a SharePoint list, yes, you can use the Client Object model. Or you can use Web Services directly. Or the REST API. All those are valid options.
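For the Client Object Model option, a minimal sketch of reading a list from outside SharePoint looks roughly like this (site URL and list name are placeholders; the CSOM assemblies come from the SharePoint client SDK):

    using System;
    using Microsoft.SharePoint.Client;   // SharePoint Client Object Model (CSOM)

    public static class ClientListReader
    {
        public static void PrintClients()
        {
            // Placeholder URL and list title.
            using (var context = new ClientContext("http://sharepoint/sites/app"))
            {
                List list = context.Web.Lists.GetByTitle("Clients");
                ListItemCollection items = list.GetItems(CamlQuery.CreateAllItemsQuery(100));

                context.Load(items);
                context.ExecuteQuery();   // one round trip to the server

                foreach (ListItem item in items)
                {
                    Console.WriteLine(item["Title"]);
                }
            }
        }
    }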
Or, you could expose data from your database via your own Web Services, and then consume those via SharePoint's BCS, allowing you to present your data in SharePoint (with full CRUD, if you want) without your application becoming dependent upon it.
You are partially right. Regarding your options, here is the route you should go:
You should store the data only in lists, not in SQL Server. The data in lists is ultimately stored in the SharePoint content database in SQL Server anyway, so there is no point syncing it.
To have your clients access the data, your web services can call the out-of-the-box web services exposed by SharePoint, which can operate on list data.
See these articles for an overview of the web services exposed by SharePoint:
http://msdn.microsoft.com/en-us/library/ee538665.aspx
http://www.sharepointmonitor.com/2007/01/sharepoint-web-service/
This is one of the big questions that faces someone new to SharePoint - should I store X in a SharePoint list, or in a SQL Server table?
SharePoint lists have similarities to database tables in that they have rows (items) and columns and the equivalent of triggers (event receivers). Data management pages are part of SharePoint so there is no need to build pages for updating and adding items to the table, and additionally the lists can be exposed as RSS feeds or through web services. They can also be versioned and participate in a workflow. Another advantage is that the contents of the lists are automatically included in content backups, so there is no need to manage a separate backup and restore process – everything is in the content database. There may not necessarily be a performance impact because there are several caching mechanisms which come into play.
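As context for the "event receivers" mentioned above, a minimal sketch looks like this (SharePoint 2010 server object model; the list field is invented for the example):

    using Microsoft.SharePoint;

    // Roughly SharePoint's equivalent of a table trigger.
    public class OrderEventReceiver : SPItemEventReceiver
    {
        // Runs after an item has been added to the list the receiver is attached to.
        public override void ItemAdded(SPItemEventProperties properties)
        {
            // Avoid re-triggering events while we modify the item from inside the receiver.
            EventFiringEnabled = false;
            try
            {
                SPListItem item = properties.ListItem;
                item["Status"] = "New";   // hypothetical field
                item.Update();
            }
            finally
            {
                EventFiringEnabled = true;
            }
        }
    }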
A SharePoint list should certainly be considered as a storage mechanism, even for large datasets with appropriate treatment. In one sense the SharePoint list is acting as an effective data access layer, bearing in mind that ultimately the data is being stored in a SQL Server database anyway. What you do not get is the rigour of relational modelling, normalisation, referential integrity, optimisation of the execution plan, and all the other tools of the DBA’s craft. If the efficient management of the data is dependent on those skills then storing the data directly in its own database is probably a better choice. You can still get at the data through BCS, as well as through custom code.
A word of warning: on no account be tempted to interact with the SharePoint content databases directly.

SharePoint List like Data Access Interface

I am impressed by the way we programmatically access lists in SharePoint. I perceive it as a data access layer, where modeling the database is as simple as defining the columns in the list.
I am looking for a tool or an application that would give me a similar interface to a database. Basically, for some reason I cannot use SharePoint, and I don't wish to take up the responsibility of modeling, deploying and maintaining the database. I find the SharePoint way of persistence management acceptable and exciting.
Can anyone suggest something even close to this?
BTW, my application is on ASP.Net and my preferred RDBMS is MS SQL Server.
If you don't want the overhead and expense of a full SharePoint installation, 90% of the time all you really need is WSS 3.0 (free with a Windows Server license).
For auto-generated entity classes you can use LINQ to SharePoint (SPMetal).
For hand-written POCO entities you can try the SharepointCommon ORM.
Use a NoSQL database like MongoDB or CouchDB, which are schema-less, allowing you to freely add fields to JSON documents without having to define a schema first.
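To make the schema-less point concrete, here is a small sketch with the official MongoDB .NET driver (2.x API assumed; database, collection and field names are invented):

    using MongoDB.Bson;
    using MongoDB.Driver;

    public static class ProductStore
    {
        public static void Demo()
        {
            var client = new MongoClient("mongodb://localhost:27017");
            var products = client.GetDatabase("shop").GetCollection<BsonDocument>("products");

            // No schema to define up front - documents in the same collection
            // can carry completely different fields.
            products.InsertOne(new BsonDocument
            {
                { "name", "Widget" },
                { "price", 9.99 },
                { "tags", new BsonArray { "new", "featured" } }
            });

            products.InsertOne(new BsonDocument
            {
                { "name", "Gadget" },
                { "discontinued", true }   // a field the first document doesn't have
            });
        }
    }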
