Creating a sub site in SharePoint takes a very long time - sharepoint

I am working on a MOSS 2007 project and have customized many parts of it. On the production server, creating a sub site takes a very long time (more than 15 minutes, and it sometimes fails due to timeouts), even with the built-in site templates. On the development server, it only takes 1 to 2 minutes.
Both servers have the same configuration, with an 8-core CPU and 8 GB of RAM. Both use separate database servers with the same configuration. The content DB size is around 100 GB, and there are more than a hundred subsites.
What could be the reason why it takes so much longer on one server? Is there any configuration or anything else I need to take care of?
So today I had the chance to check the environment with my clients. This time site creation was fast, even though they said they hadn't changed any configuration on the server.
I also used that chance to examine the database. The disk fragmentation was quite high at 49%, so I suggested that they run a defrag. I also asked for the database file growth to be increased to 100 MB, up from the default 1 MB.
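A rough sketch of why the growth increment matters, assuming the data file grows in fixed-size steps and that each step can stall the operation that triggered it. The 1 GB of required growth is a made-up figure for illustration, not a measurement from this server.

```python
import math


def autogrowth_events(growth_needed_mb: int, increment_mb: int) -> int:
    """Number of fixed-size autogrowth steps needed to add growth_needed_mb."""
    return math.ceil(growth_needed_mb / increment_mb)


# Hypothetical case: the data file has to grow by 1 GB (1024 MB) in total.
for increment_mb in (1, 100):
    events = autogrowth_events(1024, increment_mb)
    print(f"{increment_mb} MB increment -> {events} growth events")
```

With a 1 MB increment the file is extended over a thousand times for the same amount of data, which also tends to fragment the file on disk.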
So my suspicion is that some processes were running heavily on the server previously, and that's why it took so much time.
Update 2:
Yesterday my client reported that the site creation was slow again so I went to check it. When I checked the db, I found that instead of the reported 100GB, the content db size is only around 30GB. So it's still far below the recommended size.
One thing that got my attention is that the site collection recycle bin was holding almost 5 million items. Whenever I tried to browse the site collection recycle bin, it took a long time to open and the whole site collection became inaccessible.
Since the web application setting is set to the default (30 days before cleaning up, and 50% size for the second stage recycle bin), is this normal or is this a potential problem also?
Actually, there is also another web application using the same database server, with a 100 GB content DB, and it's always fast. But the one with 30 GB is slow. Both have the same setup, only different data.
What should I check next?
Thanks for the input, everyone; I really appreciate it.
Any idea what I should check next? Thanks a lot.

Yes, it's normal out of the box if you haven't turned the second-stage recycle bin off or set a site quota. If a site quota has not been set, the growth of the second-stage recycle bin is not limited...
The second-stage recycle bin is by default limited to 50% of the site quota. In other words, if you have a site quota of 100 GB, you would have a second-stage recycle bin of 50 GB. If a site quota has not been set, there are no growth limitations...
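The quota rule above is just a percentage, which can be sketched like this. The function name is made up for illustration, and `None` is used here as a stand-in for "no site quota set".

```python
from typing import Optional


def second_stage_limit_gb(site_quota_gb: Optional[float], percent: float = 50) -> Optional[float]:
    """Return the second-stage recycle bin cap in GB.

    None means no site quota is set, so the bin has no growth limit.
    """
    if site_quota_gb is None:
        return None
    return site_quota_gb * percent / 100


print(second_stage_limit_gb(100))   # site quota of 100 GB
print(second_stage_limit_gb(None))  # no site quota: unbounded
```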

I second everything Nat has said and emphasize splitting the content database. There are instructions on how to do this, provided you have multiple site collections and not a single massive one.
Also check your SharePoint databases are in good shape. Have you tried DBCC CHECKDB? Do you have SQL Server maintenance plans configured to reindex and reduce fragmentation? Read these resources on TechNet (particularly the database maintenance article) for details.
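As a small illustration of what a reindexing maintenance plan decides, here is a sketch using a commonly cited SQL Server rule of thumb: reorganize an index between roughly 5% and 30% fragmentation, rebuild above 30%. Those thresholds are an assumption brought in for the example, not something stated in this thread, and the index names and percentages are invented.

```python
def maintenance_action(fragmentation_percent: float) -> str:
    """Pick an index maintenance action from its fragmentation percentage.

    Thresholds follow a common rule of thumb (assumed, not from the thread).
    """
    if fragmentation_percent > 30:
        return "rebuild"
    if fragmentation_percent > 5:
        return "reorganize"
    return "none"


# Hypothetical fragmentation readings for three indexes.
readings = {"IX_Example_A": 3.0, "IX_Example_B": 18.5, "IX_Example_C": 62.0}
for index_name, percent in readings.items():
    print(index_name, maintenance_action(percent))
```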
Finally, see if there is anything more you can do to isolate the SQL Server as the problem. Are there any other applications with databases on the same SQL Server and are they having problems? Are you running performance monitoring on the SQL Server or SharePoint servers that show any bottlenecks?

Back up the production database to dev and attach it to your dev SharePoint server.
Try to create a site. If it does not take forever to create a site, you can assume there is a problem with the Prod database.
Despite that, at 100 GB you are running up to the limit for a content database and should be planning to put content into more than one. You will know why when you try to back up the database. Searching should also be starting to take a good long time now.
So long term you are going to have to plan on splitting your websites out into different content databases.
Yeah, database size is all just about SQL Server handling it. 100 GB is just the "any more than this and it starts to be a pain" rule of thumb. Full search crawls will also start to take a while.
Given that you do not have access to the production database and that creating a sub-site is primarily a database operation, there is nothing you can really do to figure out what the issue is.
You could try creating a subsite while doing a trace of the Dev database and look at the tables those commands reference to see if there is a smoking gun, but without production access you are really hampered.
Does the production system serve pages and documents at a reasonable speed?
See if you can start getting some stats from the database during the creation, find out what work is being done. SQL has some great tools for that now.


Enabling NUMA on IIS when migrating to Azure VMs

So I'm trying to migrate a Legacy website from an AWS VM to an Azure VM and we're trying to get the same level of performance. The problem is I'm pretty new to setting up sites on IIS.
The authors of the application are long gone and we struggle with the application for many reasons. One of the problems with the site is when it's "warming up" it pulls back a ton of data to store in memory for the entire day. This involves executing long running stored procs and in memory processes which means first load of certain pages takes up to 7 minutes. It then uses a combination of in memory data and output caching to deliver the pages.
Sessions do seem to be in use. The site is capable of recovering session data from the database, but only through some more relatively long-running database operations, so it is better to stick with sessions where possible, which is why I'm avoiding a web garden.
That's a little bit of background; however, my question is really about upping the performance on IIS. When I went through their settings on the AWS box, they had something called NUMA enabled with what appear to be the default settings, and the maximum worker processes set to 0, which seems to enable NUMA. I don't know why they enabled NUMA or if it was necessary, but I am trying to get as close to a like-for-like transition as possible, and if it gives extra performance in this application we'll probably need it!
On the Azure box I can see options to set the maximum worker processes to 0 but no NUMA options. My question is whether NUMA is enabled with those default options or is there something further I need to do to enable NUMA.
Both are production-sized VMs, but the one on Azure I'm working with is a Standard D16s_v3 with 16 vCores and 64 GB of RAM. We are load balancing across a few of them.
If you don't see the option in the Azure VM it's because the server is using symmetric processing and isn't NUMA aware.
Now to optimize your loading a bit:
HUGE CAVEAT - if you have memory-leak-type issues, don't do this! To make sure you don't, put on a private bytes limit of roughly 70% of the memory on the server. If you see that limit get hit and trigger an IIS recycle (that event is logged by default), then you may want to ignore the further steps. Either that, or mess around with perfmon (or, more easily, iteratively check peak bytes in Task Manager, where you'll have to add that column in the Details pane).
Change your app pool start mode to AlwaysRunning.
Change your web app to preloadEnabled=true.
Set an initialization page in your web.config (so that preloading knows what to load).
Edit: forgot some steps. Make sure your idle timeout is cleared, or set it to midnight.
Make sure you don't have the default recycle time enabled; clear that out.
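For the 70% private bytes limit in the caveat, the number is easy to work out. This sketch assumes the 64 GB of RAM from the question and that the app pool's private memory limit is entered in kilobytes; the function name is made up for illustration.

```python
def private_memory_limit_kb(server_ram_gb: int, percent: int = 70) -> int:
    """Return percent of the server's RAM, expressed in KB (1 GB = 1024 * 1024 KB)."""
    total_kb = server_ram_gb * 1024 * 1024
    return total_kb * percent // 100


# 64 GB is the RAM size of the D16s_v3 VM mentioned in the question.
print(private_memory_limit_kb(64))
```

If several app pools share the box, the percentage would need to be split between them rather than applied to each one.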
If you want to get fancy you can add a loading page and set an HTTP refresh, or do further customizations seen below:

Content staging extremely slow

Recently, content staging became extremely slow for our Kentico 8.2 application (to move a page, it's taking 30 minutes or more). Similar staging tasks before took seconds to complete. We have restarted the website, and that had no effect.
Before, we just had the one website in the Kentico instance. We recently deployed another website to the same instance. This could be a coincidence, but it is the only thing we can think of that might be affecting the staging performance. However, we do not understand why. Why would adding a second website slow down the content staging of a different website? How do we fix it? Also, if the addition of another website is just a coincidence, what are other things to check in the event of slow content staging? We don't really know where to start with this one.
Sites are hosted on premise (not Azure) on same server.
Look into table index fragmentation. It grows over time and makes staging slow.
Another thing to check: sync tasks/changes to the higher environment frequently, to reduce the number of records that have to be tracked.
Hope this helps in resolving your issue.

How does memory usage, cpu time, data out, and file system storage apply to my website?

Pardon my ignorance but I have a few questions that I can not seem to get the answers by searching here or google. These questions will seem completely dumb but I honestly need help with them.
On my Azure website portal I have a few things I am curious about.
How does CPU-Time apply to my website? I am unaware how I am using CPU unless this applies to hosting some type of application? I am using this "site" as a form to submit data to my database.
What exactly does "data out" mean? I am allowed 165mb per day.
What exactly is file system storage? Is this the actual space available on my Azure server to store my project and any other things I might directly host on it?
Last question is, how does memory usage apply in this scenario as well? I am allowed 1024mb per hour.
I know what CPU-Time is in desktop computing as well as memory usage but I am not exactly sure how this applies to my website. I do not know how I will be able to project if I will go over any of these limits so that I can upgrade my site.
How does CPU-Time apply to my website? I am unaware how I am using CPU
unless this applies to hosting some type of application? I am using
this "site" as a form to submit data to my database.
This is CPU time used by your code. If you use a WebSite project (in ASP.NET) you may want to do PreCompilation for your WebSite project before deploying to Azure Website (read about PreCompilations here). Compiling your code is one side of things; the rest is executing it. Each web request that goes to a server handler/mapper/controller/aspx page/etc. uses some CPU time, especially when writing to a database and so on. All these actions count toward CPU time.
But exactly how the CPU time is measured is not documented.
What exactly does "data out" mean? I am allowed 165mb per day.
Every single HTTP request to your site generates a response. All the data that goes out from your website is counted as "data out". Basically all and any data that goes out of the Data Center where your WebSite is located counts as data out. This also includes any outgoing HTTP/Web Request your code might be performing against remote sources. This also is the Data that goes out if you are using Azure SQL Database that is not in the same Data Center as your WebSite.
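To get a feel for the 165 MB per day figure, here is a back-of-the-envelope sketch. The 50 KB average response size is purely an assumption for illustration; measure your own pages to get a real number.

```python
def responses_per_day(quota_mb: int, avg_response_kb: int) -> int:
    """How many responses of avg_response_kb fit into a daily data-out quota."""
    quota_kb = quota_mb * 1024
    return quota_kb // avg_response_kb


# 165 MB/day is the quota from the question; 50 KB per response is assumed.
print(responses_per_day(165, 50))
```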
What exactly is file system storage? Is this the actual space
available on my Azure server to store my project and any other things
I might directly host on it?
Exactly - your project + anything you upload to it (if you allow for example file uploads) + server logs.
Last question is, how does memory usage apply in this scenario as
well? I am allowed 1024mb per hour.
Memory is the same as CPU cycles. However, my guess is that this is much easier to gauge. Your application lives in its own .NET App Domain (check this SO question on AppDomain). It is relatively easy to measure memory usage for the App Domain.

How to reduce memory consumption for Orchard CMS site hosted on Windows Azure Websites

I have an Orchard CMS website currently hosted on Windows Azure Websites.
It's a pretty standard blog where images are hosted via SkyDrive and linked, so the blog itself only serves HTML.
I've set it in Shared mode, running 1 instance.
But I keep getting "quota reached", and it seems like my site is always maxing out the memory (max is 512 MB per hour), and I can't understand why.
I've tried increasing to 3 instances, but it doesn't increase the maximum memory I can use.
The maximum usage limits for websites under Shared mode are:
CPU Time: 4 hours per day, 2.5 minutes per 5 minutes
File System: 1024 MB
Memory usage: 512 MB per hour
Database: 1024 MB (web instance)
I've tried re-creating my website in different zones. Currently my site is hosted in US West, which has the above limits, but other zones have slightly different limits, such as East Asia has 1024mb per hour memory usage limit! I haven't been able to dig up any documentation on this, which is puzzling.
In Update2 I mentioned that different regions have different "memory usage per hour limit". This is actually not true. I had set up a new site under the "Free" setting with 1024mb per hour, but when I switched this to "Shared" the memory usage limit came down to 512mb per hour.
I have not been able to reproduce this issue in any of my other sites despite them using the same source code, which leads me to believe it's something weird with my particular Azure website setup. Possibly something to do with the dashboard as mentioned by #Vinblad.
I'm planning to set up a new azure website in a different region, and while I'm at it, upgrade to Orchard 1.6
Had a similar issue on Azure with Orchard. It was due to the error log files continually increasing and taking up space. Manually deleting files at the moment but have to look into a more automated solution.
512MB / hour doesn't make any sense at all, I agree with Steve. 512MB (not per hour) is more than enough to host Orchard however. Try to measure memory on your local copy of the site. If you do get abnormal memory consumption, try to profile it and find the module that's responsible for it. If not, then contact Azure support and ask them why the same application would take more memory on Azure than on your local machine.
Another thing to investigate would be caching: do you have output caching enabled?
I saw this post on the Azure forums where they recommend disabling the dynamic module loader. We gave this a try but this gave us problems with the images so we had to revert back.

Architecture recommendation for load-balanced ASP.NET site

UPDATE 2009-05-21
I've been testing the #2 method of using a single network share. It is resulting in some issues with Windows Server 2003 under load:
end update
I've received a proposal for an ASP.NET website that works as follows:
Hardware load-balancer -> 4 IIS6 web servers -> SQL Server DB with failover cluster
Here's the problem...
We are choosing where to store the web files (aspx, html, css, images). Two options have been proposed:
1) Create identical copies of the web files on each of the 4 IIS servers.
2) Put a single copy of the web files on a network share accessible by the 4 web servers. The webroots on the 4 IIS servers will be mapped to the single network share.
Which is the better solution?
Option 2 obviously is simpler for deployments since it requires copying files to only a single location. However, I wonder if there will be scalability issues since four web servers are all accessing a single set of files. Will IIS cache these files locally? Would it hit the network share on every client request?
Also, will access to a network share always be slower than getting a file on a local hard drive?
Does the load on the network share become substantially worse if more IIS servers are added?
To give perspective, this is for a web site that currently receives ~20 million hits per month. At recent peak, it was receiving about 200 hits per second.
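For a sense of scale, those two figures can be compared directly. This is only arithmetic on the numbers above, assuming a 30-day month.

```python
HITS_PER_MONTH = 20_000_000
PEAK_HITS_PER_SECOND = 200
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month

average_hits_per_second = HITS_PER_MONTH / SECONDS_PER_MONTH
peak_to_average = PEAK_HITS_PER_SECOND / average_hits_per_second

print(f"average: {average_hits_per_second:.2f} hits/s")
print(f"peak is about {peak_to_average:.1f}x the average")
```

So the farm has to be sized for the peak, which is well over an order of magnitude above the monthly average.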
Please let me know if you have particular experience with such a setup. Thanks for the input.
UPDATE 2009-03-05
To clarify my situation - the "deployments" in this system are far more frequent than a typical web application. The web site is the front end for a back office CMS. Each time content is published in the CMS, new pages (aspx, html, etc) are automatically pushed to the live site. The deployments are basically "on demand". Theoretically, this push could happen several times within a minute or more. So I'm not sure it would be practical to deploy one web server at time. Thoughts?
I'd share the load between the 4 servers. It's not that many.
You don't want that single point of contention when deploying, nor that single point of failure in production.
When deploying, you can do them 1 at a time. Your deployment tools should automate this by notifying the load balancer that the server shouldn't be used, deploying the code, any pre-compilation work needed, and finally notifying the load balancer that the server is ready.
We used this strategy in a 200+ web server farm and it worked nicely for deploying without service interruption.
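The one-at-a-time process can be sketched roughly as below. `FakeLoadBalancer` and its `drain`/`enable` methods are a hypothetical stand-in for whatever API your hardware load balancer actually exposes, and `deploy` is any callable that copies files and does pre-compilation for one server.

```python
from typing import Callable, List


class FakeLoadBalancer:
    """Stand-in for a real load balancer API (hypothetical interface)."""

    def __init__(self, servers: List[str]) -> None:
        self.active = set(servers)
        self.log: List[str] = []

    def drain(self, server: str) -> None:
        # Tell the balancer to stop sending traffic to this server.
        self.active.discard(server)
        self.log.append(f"drain {server}")

    def enable(self, server: str) -> None:
        # Tell the balancer the server is ready for traffic again.
        self.active.add(server)
        self.log.append(f"enable {server}")


def rolling_deploy(balancer: FakeLoadBalancer, servers: List[str],
                   deploy: Callable[[str], None]) -> None:
    """Deploy to one server at a time while the others keep serving."""
    for server in servers:
        balancer.drain(server)
        # If deploy raises, the server stays out of rotation and the
        # rollout stops, so a bad build never reaches the remaining servers.
        deploy(server)
        balancer.enable(server)


balancer = FakeLoadBalancer(["web1", "web2", "web3", "web4"])
rolling_deploy(balancer, ["web1", "web2", "web3", "web4"],
               deploy=lambda server: print(f"deploying to {server}"))
print(balancer.log)
```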
If your main concern is performance, which I assume it is since you're spending all this money on hardware, then it doesn't really make sense to share a network filesystem just for convenience sake. Even if the network drives are extremely high performing, they won't perform as well as native drives.
Deploying your web assets is automated anyway (right?), so doing it in multiples isn't really much of an inconvenience.
If it is more complicated than you're letting on, then maybe something like DeltaCopy would be useful to keep those disks in sync.
One reason the central share is bad is because it makes the NIC on the share server the bottleneck for the whole farm and creates a single point of failure.
With IIS6 and 7, the scenario of using a single network share across N attached web/app server machines is explicitly supported. MS did a ton of perf testing to make sure this scenario works well. Yes, caching is used. With a dual-NIC server, one for the public internet and one for the private network, you'll get really good performance. The deployment is bulletproof.
It's worth taking the time to benchmark it.
You can also evaluate an ASP.NET Virtual Path Provider, which would allow you to deploy a single ZIP file for the entire app. Or, with a CMS, you could serve content right out of a content database, rather than a filesystem. This presents some really nice options for versioning.
VPP For ZIP via #ZipLib.
VPP for ZIP via DotNetZip.
In an ideal high-availability situation, there should be no single point of failure.
That means a single box with the web pages on it is a no-no. Having done HA work for a major Telco, I would initially propose the following:
Each of the four servers has its own copy of the data.
At a quiet time, bring two of the servers off-line (i.e., modify the HA balancer to remove them).
Update the two off-line servers.
Modify the HA balancer to start using the two new servers and not the two old servers.
Test that to ensure correctness.
Update the two other servers then bring them online.
That's how you can do it without extra hardware. In the anal-retentive world of the Telco I worked for, here's what we would have done:
We would have had eight servers (at the time, we had more money than you could poke a stick at). When the time came for transition, the four offline servers would be set up with the new data.
Then the HA balancer would be modified to use the four new servers and stop using the old servers. This made switchover (and, more importantly, switchback if we stuffed up) a very fast and painless process.
Only when the new servers had been running for a while would we consider the next switchover. Up until that point, the four old servers were kept off-line but ready, just in case.
To get the same effect with less financial outlay, you could have extra disks rather than whole extra servers. Recovery wouldn't be quite as quick since you'd have to power down a server to put the old disk back in, but it would still be faster than a restore operation.
Use a deployment tool, with a process that deploys to one server at a time while the rest of the system keeps working (as Mufaka said). This is a tried process that works with both content files and any compiled piece of the application (whose deployment causes a recycle of the process).
Regarding the rate of updates this is something you can control. Have the updates go through a queue, and have a single deployment process that controls when to deploy each item. Notice this doesn't mean you process each update separately, as you can grab the current updates in the queue and deploy them together. Further updates will arrive to the queue, and will be picked up once the current set of updates is over.
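A minimal sketch of that queue idea, using Python's standard `queue` module in place of MSMQ or a SQL table. The queue holds only paths, everything currently queued is grabbed and deployed as one batch, and anything arriving later waits for the next pass. Dropping duplicate paths within a batch is my own assumption, not part of the description above.

```python
import queue
from typing import Callable, List


def drain_batch(updates: "queue.Queue[str]") -> List[str]:
    """Take everything currently in the queue, keeping first-seen order."""
    batch: List[str] = []
    seen = set()
    while True:
        try:
            path = updates.get_nowait()
        except queue.Empty:
            break
        if path not in seen:  # assumed: one deploy per path per batch
            seen.add(path)
            batch.append(path)
    return batch


def deploy_pending(updates: "queue.Queue[str]",
                   deploy: Callable[[List[str]], None]) -> int:
    """Deploy the current batch in one go; return how many items it held."""
    batch = drain_batch(updates)
    if batch:
        deploy(batch)
    return len(batch)


pending: "queue.Queue[str]" = queue.Queue()
for published_path in ("/news/a.aspx", "/news/b.html", "/news/a.aspx"):
    pending.put(published_path)

deploy_pending(pending, deploy=lambda batch: print("deploying", batch))
```

A single worker calling `deploy_pending` on a timer is what actually controls the rate: the CMS can publish as often as it likes, but deployments only happen once per pass.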
Update: About the questions in the comment. This is a custom solution based on my experience with heavy/long processes which need their rate of updates controlled. I haven't had the need to use this approach for deployment scenarios, as for such dynamic content I usually go with a combination of DB and cache at different levels.
The queue doesn't need to hold the full information; it just needs to have the appropriate info (ids/paths) that will let your process pass the info to start the publishing process with an external tool. As it is custom code, you can have it join the information to be published, so you don't have to deal with that in the publishing process/tool.
The DB changes would be done during the publishing process; again, you just need to know where the info for the required changes is and let the publishing process/tool handle it. Regarding what to use for the queue, the main ones I have used are MSMQ and a custom implementation with info in SQL Server. The queue is just there to control the rate of the updates, so you don't need anything specially targeted at deployments.
Update 2: make sure your DB changes are backwards compatible. This is really important, when you are pushing changes live to different servers.
I was in charge of development for a game website that had 60 million hits a month. The way we did it was option #1. Users did have the ability to upload images and such, and those were put on a NAS that was shared between the servers. It worked out pretty well. I'm assuming that you are also doing page caching and so on, on the application side of the house. I would also deploy the new pages on demand to all servers simultaneously.
What you gain from NLB with the 4 IIS servers, you lose to the bottleneck at the app server.
For scalability I'd recommend keeping the applications on the front-end web servers.
Here in my company we are implementing that solution. The .NET app in the front ends and an APP server for Sharepoint + a SQL 2008 Cluster.
Hope it helps!
We have a similar situation to you and our solution is to use a publisher/subscriber model. Our CMS app stores the actual files in a database and notifies a publishing service when a file has been created or updated. This publisher then notifies all the subscribing web applications and they then go and get the file from the database and place it on their file systems.
We have the subscribers set in a config file on the publisher but you could go the whole hog and have the web app do the subscription itself on app startup to make it even easier to manage.
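In outline, the flow looks something like the sketch below. All class and method names here are invented for illustration, the database is an in-memory dictionary, and each subscriber's "file system" is a dictionary too; the real thing would use a database table and actual file writes.

```python
from typing import Dict, List


class FileDatabase:
    """In-memory stand-in for the CMS database that stores the actual files."""

    def __init__(self) -> None:
        self._files: Dict[str, bytes] = {}

    def save(self, path: str, content: bytes) -> None:
        self._files[path] = content

    def load(self, path: str) -> bytes:
        return self._files[path]


class WebAppSubscriber:
    """One web application; local_files stands in for its file system."""

    def __init__(self, name: str, database: FileDatabase) -> None:
        self.name = name
        self.database = database
        self.local_files: Dict[str, bytes] = {}

    def on_file_changed(self, path: str) -> None:
        # Fetch the new content from the database and store it locally.
        self.local_files[path] = self.database.load(path)


class Publisher:
    """Notifies every subscribing web application about a changed file."""

    def __init__(self) -> None:
        self.subscribers: List[WebAppSubscriber] = []

    def subscribe(self, subscriber: WebAppSubscriber) -> None:
        self.subscribers.append(subscriber)

    def file_changed(self, path: str) -> None:
        for subscriber in self.subscribers:
            subscriber.on_file_changed(path)


database = FileDatabase()
publisher = Publisher()
web_apps = [WebAppSubscriber(f"web{i}", database) for i in range(1, 5)]
for web_app in web_apps:
    publisher.subscribe(web_app)  # could also happen on app startup

# The CMS saves a file, then tells the publisher about it.
database.save("/news/today.html", b"<html>hello</html>")
publisher.file_changed("/news/today.html")
print([web_app.local_files["/news/today.html"] for web_app in web_apps])
```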
You could use a UNC for the storage; we chose a DB for convenience and portability between our production and test environments (we simply copy the DB back and we have all the live site files as well as the data).
A very simple method of deploying to multiple servers (once the nodes are set up correctly) is to use robocopy.
Preferably you'd have a small staging server for testing and then you'd 'robocopy' to all deployment servers (instead of using a network share).
robocopy is included in the MS ResourceKit - use it with the /MIR switch.
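As a sketch of the staging-to-all-servers step, this builds one `robocopy ... /MIR` command per target without running anything. The staging path and the `\\web1\wwwroot` style UNC paths are made up for the example.

```python
from typing import List


def robocopy_commands(source: str, targets: List[str]) -> List[List[str]]:
    """Build one 'robocopy <source> <target> /MIR' command per target."""
    return [["robocopy", source, target, "/MIR"] for target in targets]


# Hypothetical staging folder and web server shares.
commands = robocopy_commands(
    r"D:\staging\wwwroot",
    [r"\\web1\wwwroot", r"\\web2\wwwroot", r"\\web3\wwwroot", r"\\web4\wwwroot"],
)
for command in commands:
    print(" ".join(command))
```

Bear in mind that /MIR also deletes files on the target that no longer exist in the source, and that robocopy uses non-zero exit codes for successful copies (codes of 8 and above indicate failures), so a wrapper that treats any non-zero code as an error will misreport.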
To give you some food for thought, you could look at something like Microsoft's Live Mesh. I'm not saying it's the answer for you, but the storage model it uses may be.
With the Mesh you download a small Windows Service onto each Windows machine you want in your Mesh and then nominate folders on your system that are part of the mesh. When you copy a file into a Live Mesh folder - which is the exact same operation as copying to any other folder on your system - the service takes care of syncing that file to all your other participating devices.
As an example, I keep all my code source files in a Mesh folder and have them synced between work and home. I don't have to do anything at all to keep them in sync; the action of saving a file in VS.Net, Notepad or any other app initiates the update.
If you have a web site with frequently changing files that need to go to multiple servers, and presumably multiple authors for those changes, then you could put the Mesh service on each web server, and as authors added, changed or removed files the updates would be pushed automatically. As far as the authors go, they would just be saving their files to a normal old folder on their computer.
Assuming your IIS servers are running Windows Server 2003 R2 or better, definitely look into DFS Replication. Each server has its own copy of the files, which eliminates the shared network bottleneck that many others have warned against. Deployment is as simple as copying your changes to any one of the servers in the replication group (assuming a full mesh topology). Replication takes care of the rest automatically, including using remote differential compression to send only the deltas of files that have changed.
We're pretty happy using 4 web servers each with a local copy of the pages and a SQL Server with a fail over cluster.
