Problem with Sharepoint during peak loads - sharepoint

My web application works in conjunction with Sharepoint. During peak hours, Sharepoint can't handle or respond to web application requests. At the same time, the CPU usage of the MySql server associated with Sharepoint reaches almost 100 percent. And the size of its database is already 1000 gigabytes. What can be done? And what can be done with the database, the filestream setting is disabled?

Related

IIS application pool setting Throttle under load

One of our customers has set the application pool to throttle under load at 35%, and at times they noticed the following event
Event ID: 5210
CPU time for application pool 'abc' has been throttled.
They noticed such events show up in the event viewer log, even though the CPU utilization on the web server is not pegged high, for example < 60%
Would like to know:
• Under what condition does the event id 5210 get generated?
• How does IIS detect contention on the CPU? Is it based on a performance counter etc?
The cause of the issue is there is something misconfiguration with your Application Pool.
open iis manager.
Right-click on the appropriate Application Pool and click Advanced Settings.
Set the Limit to 0 and Limit Action value to NoAction.
IIS 8.0 CPU Throttling: Sand-boxing Sites and Applications
This is strange question: you said you had set AppPool to ThrottleUnderLoad and set Limit to 35%. Why do you wonder when it hits the limit and generates a log record? The most common value for the throttle limit is 80%, which allows to throttle the load a little when there are too many requests and the CPU resources are not enough. I don't see sense to set the limit to 35%, this will slow web responses from your server when it has plenty of CPU time.
I suggest increasing the value. This would decrease the number of 5210 warnings.

ASP.NET Core 2.2 experiencing high CPU usage

So I have hosted asp.net core 2.2 web service on Azure(S2 plan). The problem is that my application sometimes getting high CPU usage(almost 99%). What I have done for now - checked process explorer on azure. I see there a lot of processes who are consuming CPU. Maybe someone knows if it's okay for these processes consume CPU?
Currently, I don't have an idea where do they come from. Maybe it's normal to have them here.
Shortly about my application:
Currently, there is not much traffic. 500-600 request in a day. Most of the request is used to communicate with MS SQL by querying records, adding, etc.
As well I am using MS Websocket, but high CPU happens when no WebSocket client is connected to web service, so I hardly believe that it's a cause. I tried to use apache ab for load testing, but there isn't any pattern, that after one request's load test, I would get high CPU. So sometimes happens, sometimes don't during load testing.
So I just update screenshot of processes, I see that lots of threads are being locked/used during the time when fluent migrator start running its logging.
Update*
I will remove fluent migrator logging middleware from Configure method. Will look forward with the situation.
UPDATE**
So I removed logging of FluentMigrator. Until now I didn't notice any CPU usage over 90%.
But still, I am confused. My CPU usage is spinning. Is it health CPU usage graph or not?
Also, I tried to make a load test on the websocket server.
I made a script that calls some functions of WebSocket every 100ms from 6-7 clients. So every 100ms there are 7 calls to WebSocket server from different clients, every function within itself queries some data/insert (approximately 3-4 queries of every WebSocket function).
What I did notice, on Azure S1 DTU 20 after 2min I am getting out of SQL pool connections, If I increase DTU to 100, it handles 7 clients properly without any errors of 'no connection pool'.
So the first question: is it a normal CPU spinning?
Second: should I get an error message of 'no SQL connection free' using this kind of load test on DTU 10 Azure SQL. I am afraid that when creating a scoped service on singleton WebSocket Service I am leaking connections.
This topic gets too long, maybe I should move it to a new topic?
-
At this stage I would say you need to profile your application and figure out what areas of your code are CPU intensive. In the past I have used dotTrace, this highlighted methods which are the most expensive with a call tree.
Once you know what areas of your code base are the least efficient, you can begin to refactor them so that they are more efficient. This could simply be changing some small operations, adding caching for queries or using distributed locking for example.
I believe the reason the other DLLs are showing CPU usage is because your code calling methods which are within those DLLs.

SQL Azure Premium tier is unavailable for more than a minute at a time and we're around 10-20% utilization, if that

We run a web service that gets 6k+ requests per minute during peak hours and about 3k requests per minute during off hours. Lots of data feeds compiled from 3rd party web services and custom generated images. Our service and code is mature, we've been running this for years. A lot of work by good developers has gone into our service's code base.
We're migrating to Azure, and we're seeing some serious problems. For one, we are seeing our Premium P1 SQL Azure database routinely become unavailable for 1-2 full entire minutes. I'm sorry, but this seems absurd. How are we supposed to run a web service with requests waiting 2 minutes for access to our database? This is occurring several times a day. It occurs less after switching from Standard level to Premium level, but we're nowhere near our DB's DTU capacity and we're getting throttled hard far too often.
Our SQL Azure DB is Premium P1 and our load according to the new Azure portal is usually under 20% with a couple spikes each hour reaching 50-75%. Of course, we can't even trust Azure's portal metrics. The old portal gives us no data for our SQL, and the new portal is very obviously wrong at times (our DB was not down for 1/2 an hour, like the graph suggests, but it was down for more than 2 full minutes):
Azure reports the size of our DB at a little over 12GB (in our own SQL Server installation, the DB is under 1GB - that's another of many questions, why is it reported as 12GB on Azure?). We've done plenty of tuning over the years and have good indices.
Our service runs on two D4 cloud service instances. Our DB libraries are all implementing retry logic, waiting 2, 4, 8, 16, 32, and then 48 seconds before failing completely. Controllers are all async, most of our various external service calls are async. DB access is still largely synchronous but our heaviest queries are async. We heavily utilize in-memory and Redis caching. The most frequent use of our DB is 1-3 records inserted for each request (those tables are queried only once every 10 minutes to check error levels).
Aside from batching up those request logging inserts, there's really not much more give in our application's db access code. We're nowhere near our DTU allocation on this database, and the server our DB is on has like 2000 DTU's available to be allocated still. If we have to live with 1+ minute periods of unavailability every day, we're going to abandon Azure.
Is this the best we get?
Querying stats in the database seems to show we are nowhere near our resource limits. Also, on premium tier we should be guaranteed our DTU level second-by-second. But, again, we go more than an entire solid minute without being able to get a database connection. What is going on?
I can also say that after we experience one of these longer delays, our stats seem to reset. The above image was a couple minutes before a 1 min+ delay and this is a couple minutes after:
We have been in contact with Azure's technical staff and they confirm this is a bug in their platform that is causing our database to go through failover multiple times a day. They stated they will be deploying fixes starting this week and continuing over the next month.
Frankly, we're having trouble understanding how anyone can reliably run a web service on Azure. Our pool of Websites randomly goes down for a few minutes a few times a month, taking our public sites down. If our cloud service returns too many 500 responses something in front of it is cutting off all traffic and returning 502's (totally undocumented behavior as far as we can tell). SQL Azure has very limited performance and obviously isn't ready for prime time.

When to create a web garden

I have an application that is currently running on IIS 6.0 with one worker process (the default). I am trying to determine if creating a web garden will improve performance. I have read a bunch of articles that say that a web garden is not the right approach for everyone (since it duplicates resources, cache is not shared, etc). I could not find an article that had a clear rational for using a web garden (Microsoft's site provides three bullet points, but no specific examples can be found). My situation is as follows:
We have can have up to 40 concurrent users at a given time.
Our application performs a series of calcuations (on the magnitude of 1,000s of calculations) that can take up to 10 minutes to complete.
We have multipe database calls some of which can take upwards to 30 seconds to complete.
Will creating a web garden improve performance, or should I simply increase the number of threads in the current worker process? When would be an example of when you should use a web garden? If a thread in the current worker process is performing calcuations (running .net code) and/or calling the database, can other threads run at the same time (I assume yes).
Thanks.
Ryan
Personally, I have gone the route of using a web garden when my web application required frequent worker process recycles.
In this specific case we needed to recycle the worker process often because we were using CodeDOM to emit assemblies dynamically, which has a memory leak by definition as more assemblies are loaded.
Having a web garden helped avoid a delay in server response every time the worker process was recycled.

Creating a sub site in SharePoint takes a very long time

I am working in a MOSS 2007 project and have customized many parts of it. There is a problem in the production server where it takes a very long time (more than 15 minutes, sometimes fails due to timeouts) to create a sub site (even with the built-in site templates). While in the development server, it only takes 1 to 2 minutes.
Both servers are having same configuration with 8 cores CPU and 8 GIGs RAM. Both are using separate database servers with the same configuration. The content db size is around 100 GB. More than a hundred subsites are there.
What could be the reason why in the other server it will take so much time? Is there any configuration or something else I need to take care?
Update:
So today I had the chance to check the environment with my clients. But site creation was so fast though they said they didn't change any configuration in the server.
I also used that chance to examine the database. The disk fragmentation was quite high at 49% so I suggested them to run defrag. And I also asked the database file growth to be increased to 100MB, up from the default 1MB.
So my suspicion is that some processes were running heavily on the server previously, that's why it took that much of time.
Update 2:
Yesterday my client reported that the site creation was slow again so I went to check it. When I checked the db, I found that instead of the reported 100GB, the content db size is only around 30GB. So it's still far below the recommended size.
One thing that got my attention is, the site collection recycle bin was holding almost 5 millions items. And whenever I tried to browse the site collection recycle bin, it would take a lot of time to open and the whole site collection is inaccessible.
Since the web application setting is set to the default (30 days before cleaning up, and 50% size for the second stage recycle bin), is this normal or is this a potential problem also?
Actually, there was also another web application using the same database server with 100GB content db and it's always fast. But the one with 30GB is slow. Both are having the same setup, only different data.
What should I check next?
So today I had the chance to check the environment with my clients. But site creation was so fast though they said they didn't change any configuration in the server.
I also used that chance to examine the database. The disk fragmentation was quite high at 49% so I suggested them to run defrag. And I also asked the database file growth to be increased to 100MB, up from the default 1MB.
So my suspicion is that some processes were running heavily on the server previously, that's why it took that much of time.
Thanks for the inputs everyone, I really appreciate.
Yesterday my client reported that the site creation was slow again so I went to check it. When I checked the db, I found that instead of the reported 100GB, the content db size is only around 30GB. So it's still far below the recommended size.
One thing that got my attention is, the site collection recycle bin was holding almost 5 millions items. And whenever I tried to browse the site collection recycle bin, it would take a lot of time to open and the whole site collection is inaccessible.
Since the web application setting is set to the default (30 days before cleaning up, and 50% size for the second stage recycle bin), is this normal or is this a potential problem also?
Actually, there was also another web application using the same database server with 100GB content db and it's always fast. But the one with 30GB is slow. Both are having the same setup, only different data.
Any idea what should I check next? Thanks a lot.
Yes, its normal OOB if you haven't turned the Second Stage Recycle bin off or set a site quota. If a site quota has not been set then the growth of the Second Stage Recycle bin is not limited...
the second stage recycle bin is by default limited to 50% size of the site quota, in other words if you have a site quota of 100gb then you would have a Second Stage recycle bin of 50gb. If a site quota has not been set, there are not any growth limitations...
I second everything Nat has said and emphasize splitting the content database. There are instructions on how to this provided you have multiple site collections and not a single massive one.
Also check your SharePoint databases are in good shape. Have you tried DBCC CHECKDB? Do you have SQL Server maintenance plans configured to reindex and reduce fragmentation? Read these resources on TechNet (particularly the database maintenance article) for details.
Finally, see if there is anything more you can do to isolate the SQL Server as the problem. Are there any other applications with databases on the same SQL Server and are they having problems? Are you running performance monitoring on the SQL Server or SharePoint servers that show any bottlenecks?
Backup the production database to dev and attach it to your dev SharePoint server.
Try and create a site. If it does not take forever to create a site, you can assume there is a problem with the Prod database.
Despite that, at 100gig, you are running up to the limit for a content database and should be planning to put content into more than one. you will know why when you try and backup the database. Searching should also be starting to take a good long time now.
So long term you are going to have to plan on splitting your websites out into different content databases.
--Responses--
Yeah, database size is all just about SQL server handling it. 100GB is just the "any more than this and it starts to be a pain" rule of thumb. Full Search crawls will also start a while.
Given that you do not have access to the production database and that creating a sub-site is primarily a database operation, there is nothing you can really do to figure out what the issue is.
You could try creating a subsite while doing a trace of the Dev database and look at the tables those commands reference to see if there is a smoking gun, but without production access you are really hampered.
Does the production system server pages and documents at a reasonable speed?
See if you can start getting some stats from the database during the creation, find out what work is being done. SQL has some great tools for that now.

Resources