I currently have two virtual test servers (Windows 2003). The websites are deployed on one, and the other holds all the tests. I have set up the nightly builds using CC.NET. Whenever I run the tests against a nightly build, some of them fail with:
1. System.Runtime.InteropServices.COMException : The RPC server is unavailable. (Exception from HRESULT: 0x800706BA)
2. A timeout exception. Sometimes the test fixture is not even set up because of a timeout on the server.
Does anyone have any idea what is causing this? Could it have anything to do with synchronisation between the two machines, privileges, or a firewall restriction?
Kind regards
I got the same System.Runtime.InteropServices.COMException in one of my tests. As for the timeout, I solved it by increasing the timeout settings (WatiN 2.0):
[SetUp] // for NUnit; use the equivalent attribute of your own test framework
public void EachSetup()
{
    Settings.AutoCloseDialogs = true;
    Settings.AttachToIETimeOut = 300;
    Settings.WaitForCompleteTimeOut = 300;
    Settings.WaitUntilExistsTimeOut = 300;
}
Background:
I'm currently hosting an ASP.NET application in Azure with the following specs:
ASP .Net Core 2.2
Using Flurl for HTTP requests
Kestrel Webserver
Docker (Linux - mcr.microsoft.com/dotnet/core/aspnet:2.2 runtime)
Azure App Service on P2V2 tier app service plan
I have a couple of background jobs that run on the service and make a lot of outbound HTTP calls to a 3rd party service.
Issue:
Under a light load (approximately 1 call per 10 seconds), all requests complete in under a second with no issue. Under a heavier load, when the service can make 3 or 4 calls in a 10-second span, some of the requests randomly time out and throw an exception. When I was using RestSharp, the exception read "The operation has timed out". Now that I'm using Flurl, it reads "The call timed out".
Here's the kicker: if I run the same job from my laptop running Windows 10 / Visual Studio 2017, this problem does NOT occur. This leads me to believe I'm hitting some limit or running out of some resource in my hosted environment. It's unclear whether that is connection/socket or thread related.
Things I've tried:
Ensure all code paths to the request are using async/await to prevent lockouts
Ensure Kestrel's defaults allow unlimited connections (they do)
Ensure Docker's default connection limits are sufficient (2000 by default, more than enough)
Configuring ServicePointManager settings for connection limits
Here is the code in my startup.cs that I'm currently using to try and prevent this issue:
public class Startup
{
    public Startup(IHostingEnvironment hostingEnvironment)
    {
        ...
        // ServicePointManager setup
        ServicePointManager.UseNagleAlgorithm = false;
        ServicePointManager.Expect100Continue = false;
        ServicePointManager.DefaultConnectionLimit = int.MaxValue;
        ServicePointManager.EnableDnsRoundRobin = true;
        ServicePointManager.ReusePort = true;
        // Set Service point timeouts
        var sp = ServicePointManager.FindServicePoint(new Uri("https://placeholder.thirdparty.com"));
        sp.ConnectionLeaseTimeout = 15 * 1000; // 15 seconds
        FlurlHttp.ConfigureClient("https://placeholder.thirdparty.com", cli => cli.Settings.ConnectionLeaseTimeout = new TimeSpan(0, 0, 15));
    }
}
Has anyone else run into a similar issue to this? I'm open to any suggestions on how to best debug this situation, or possible methods to correct the issue. I'm at a complete loss after researching this for several days.
Thank you in advance.
I had similar issues. Take a look at "Asp.net Core HttpClient has many TIME_WAIT or CLOSE_WAIT connections". Debugging via netstat helped identify the problem for me. As one possible solution, I suggest you use IHttpClientFactory. More info: https://learn.microsoft.com/en-us/aspnet/core/fundamentals/http-requests?view=aspnetcore-2.2 It should be fairly easy to use, as described in "Flurl client lifetime in ASP.Net Core 2.1 and IHttpClientFactory".
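For what it's worth, here is a minimal sketch of the IHttpClientFactory approach using a plain named HttpClient rather than Flurl. The client name "thirdparty", the 30-second timeout, and the ThirdPartyJob class are my own placeholders, not something from the question, and it assumes the Microsoft.Extensions.Http package is referenced (it is part of the ASP.NET Core 2.2 shared framework).

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;

public static class ThirdPartyClientSetup
{
    // Hypothetical client name, used only for this illustration.
    public const string ClientName = "thirdparty";

    // Call this from Startup.ConfigureServices(IServiceCollection services).
    public static IServiceCollection AddThirdPartyClient(this IServiceCollection services)
    {
        services.AddHttpClient(ClientName, client =>
        {
            client.BaseAddress = new Uri("https://placeholder.thirdparty.com");
            client.Timeout = TimeSpan.FromSeconds(30); // assumed value, tune as needed
        });
        return services;
    }
}

// Hypothetical background job that receives the factory through DI.
public class ThirdPartyJob
{
    private readonly IHttpClientFactory _factory;

    public ThirdPartyJob(IHttpClientFactory factory)
    {
        _factory = factory;
    }

    public async Task<string> FetchAsync(string relativePath)
    {
        // The factory pools and recycles the underlying handlers, so creating
        // a client per call does not open a fresh socket every time.
        HttpClient client = _factory.CreateClient(ThirdPartyClientSetup.ClientName);
        using (HttpResponseMessage response = await client.GetAsync(relativePath))
        {
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

The point of the design is that nothing in your job code news up or disposes HttpClient handlers itself, which is usually what leaves connections piling up in TIME_WAIT.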
Short error description:
Ms.Dynamics.Performance.CreateUsers.exe from PerfSDK throws error
There was no endpoint listening at https://mytest.sandbox.operations.dynamics.com/Services/AxUserManagement/Service.svc/ws2007FedHttp that could accept the message.
Long error description:
I have created a single user C# test from an XML recording and run it with PerfSDK successfully as described in the first part of the PerfSDK and multiuser testing with Visual Studio Online guide.
I am having trouble running multiuser load tests as described in the second part of the lab. The link above seems to be the only resource online describing how a multiuser test can be created from a singleuser test and how Visual Studio Online can be used to run it in a sandbox environment. I've also watched a few videos such as Tools to Measure and Improve Microsoft Dynamics AX Performance, Performance Tools and the like, but none of them explains all the steps that need to be taken in as much detail as the above article.
I've done the following:
Created a recording of a scenario with Task Recorder in Dynamics 365 for Operations.
Created C# perf test from recording in Visual Studio using the PerfSDKSample project from the PerfSDK folder.
Followed all 'Steps to run single user performance test with Perf SDK' from the article;
Built the solution and successfully ran my test from Test Explorer: Internet Explorer opened starting and replaying the scenario that I had recorded.
Note: I used DEV environment usnconeboxax1aos.cloud.onebox.dynamics.com for testing. When I tried using another hostname in CloudEnvironment.Config (a sandbox, e.g. mysandbox.sandbox.operations.dynamics.com), the singleuser test failed with the following error message:
System.TypeInitializationException: The type initializer for 'MS.Dynamics.TestTools.CloudCommonTestUtilities.Authentication.UserManagement' threw an exception. ---> System.ServiceModel.EndpointNotFoundException: There was no endpoint listening at https://mysandbox.sandbox.operations.dynamics.com/Services/AxUserManagement/Service.svc/ws2007FedHttp that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details. ---> System.Net.WebException: The remote server returned an error: (404) Not Found..
For multiuser testing, I launched Visual Studio from Visual Studio Online portal https://app.vssps.visualstudio.com/profile/view
I modified the TestSetup method as follows:
Single-user TestSetup:
public void TestSetup()
{
    SetupData();
    _userContext = new UserContext(UserManagement.AdminUser);
    Client = DispatchedClient.DefaultInstance;
    Client.ForceEditMode = false;
    Client.Company = "GB01";
    Client.Open();
}
Multi-user TestSetup:
public void TestSetup()
{
    var testroot = System.Environment.GetEnvironmentVariable("DeploymentDir");
    if (string.IsNullOrEmpty(testroot))
    {
        testroot = System.IO.Directory.GetCurrentDirectory();
    }
    Environment.SetEnvironmentVariable("testroot", testroot);
    if (this.TestContext != null)
    {
        timerProvider = new TimerProvider(this.TestContext);
    }
    SetupData();
    _userContext = new UserContext(UserManagement.AdminUser);
    Client = new DispatchedClientHelper().GetClient();
    Client.ForceEditMode = false;
    Client.Company = "GB01";
    Client.Open();
}
I set the HostName in CloudEnvironment.Config to the sandbox URL e.g. mysandbox.sandbox.operations.dynamics.com.
Logged in to the sandbox machine and installed the certificate I had generated earlier for the single-user testing.
Updated wif.config on the sandbox machine in the same way it had been updated in DEV earlier, and restarted IIS.
Double-clicked vsonline.testsettings in Solution Explorer and used the settings recommended in the above article (accordingly modified for my certificate and test scenario).
Opened SampleLoadTest.loadtest from Solution Explorer and tweaked it to use only my test in the Test Mix node, reduced test duration and user count.
Ran the load test.
The load test ended with a few errors. The first TestError is the same as mentioned above:
Initialization method MS.Dynamics.Performance.Application.TaskRecorder.GenJnlBase.TestSetup threw exception. System.TypeInitializationException: System.TypeInitializationException: The type initializer for 'MS.Dynamics.TestTools.CloudCommonTestUtilities.Authentication.UserManagement' threw an exception. ---> System.ServiceModel.EndpointNotFoundException: There was no endpoint listening at https://mysandbox.sandbox.operations.dynamics.com/Services/AxUserManagement/Service.svc/ws2007FedHttp that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details. ---> System.Net.WebException: The remote server returned an error: (404) Not Found..
Finally, even though I was able to run Ms.Dynamics.Performance.CreateUsers.exe on my DEV machine successfully (a number of test AX users were created in usnconeboxax1aos.cloud.onebox.dynamics.com), when the sandbox environment URL was set in CloudEnvironment.Config, Ms.Dynamics.Performance.CreateUsers.exe failed with the same error:
C:\PerfSDK>Ms.Dynamics.Performance.CreateUsers.exe 3 GB01
Failed with the following error:
System.TypeInitializationException: The type initializer for 'MS.Dynamics.TestTools.CloudCommonTestUtilities.Authentication.UserManagement' threw an exception. ---> System.ServiceModel.EndpointNotFoundException: There was no endpoint listening at https://mytest.sandbox.operations.dynamics.com/Services/AxUserManagement/Service.svc/ws2007FedHttp that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details.
...
Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
at MS.Dynamics.TestTools.CloudCommonTestUtilities.AxUserManagementServiceReference.IAxUserManagement.EnumUsers()
at MS.Dynamics.TestTools.CloudCommonTestUtilities.Authentication.UserManagement.PopulateAxUsers()
at MS.Dynamics.TestTools.CloudCommonTestUtilities.Authentication.UserManagement..cctor()
--- End of inner exception stack trace ---
at MS.Dynamics.TestTools.CloudCommonTestUtilities.Authentication.UserManagement.get_AdminUser()
at MS.Dynamics.Performance.CreateUsers.Program.Main(String[] args)
As per the walkthrough:
If you have an ARR-enabled environment, i.e. you have 2 endpoints like this:
apr-arr8aossoap.axcloud.test.dynamics.com
apr-arr8aos.axcloud.test.dynamics.com
You would need to enter both endpoints in CloudEnvironment.Config
The "no endpoint listening" error can be resolved by specifying the correct SOAP hostname, e.g.
<ExecutionConfigurations Key="HostName" Value="mysandbox.sandbox.operations.dynamics.com" />
<ExecutionConfigurations Key="SoapHostName" Value="mysandboxaossoap.sandbox.operations.dynamics.com" />
I am running an azure webjobs SDK console application (continuous) with the recommended setup:
public static void ProcessQueueMessage([QueueTrigger("logqueue")] string logMessage, TextWriter logger)
The azure queue I am running against has ~6000 messages in it and I am running the web-job locally, as a console application.
The problem I'm having is that the processing randomly stops after processing between zero and ~30 messages. The console stays open, but no more console messages are displayed.
For example, it might just process 2 messages:
Executing: 'Functions.ProcessQueueMessage' - Reason: 'New queue message detected on 'QueueName'.'
Executed: 'Functions.ProcessQueueMessage' (Succeeded)
Executing: 'Functions.ProcessQueueMessage' - Reason: 'New queue message detected on 'QueueName'.'
Executed: 'Functions.ProcessQueueMessage' (Succeeded)
And then, nothing. There doesn't seem to be anything wrong with my internet connection and I can't trace the issues down to any particular messages.
Has anyone else had issues with this SDK?
Update:
I made sure that I was using the right versions of all of the dependencies by removing the NuGet packages and then re-running install-package Microsoft.Azure.WebJobs. I am now using WebJobs version 1.1.0, which has pulled in version 4.3 of Azure Storage.
As recommended by Matthew, I have pulled down the source code for Azure WebJobs to determine where the process is freezing up. Once the freeze-up occurred, I paused execution and checked the running threads. What I believe is the culprit is within Microsoft.Azure.WebJobs.Host.CompositeTraceWriter:
protected virtual void InvokeTextWriter(TraceEvent traceEvent)
{
    if (_innerTextWriter != null)
    {
        string message = traceEvent.Message;
        if (!string.IsNullOrEmpty(message) &&
            message.EndsWith("\r\n", StringComparison.OrdinalIgnoreCase))
        {
            // remove any terminating return+line feed, since we're
            // calling WriteLine below
            message = message.Substring(0, message.Length - 2);
        }
        _innerTextWriter.WriteLine(message);
        if (traceEvent.Exception != null)
        {
            _innerTextWriter.WriteLine(traceEvent.Exception.ToDetails());
        }
    }
}
The line it freezes on is line 66 : _innerTextWriter.WriteLine(message);
_innerTextWriter is an instance of System.IO.TextWriter.SyncTextWriter
Is it possible there is some deadlock issue with this class or the way it is being used?
Some notes:
I am running in the debugger, so in this case I believe the textwriter is forwarding to the console internally
I have my batchsize set to 1 via config.Queues.BatchSize = 1;, not sure if that could matter
I'm currently working on setting up an environment on another computer so that I can see if it is reproducible somewhere other than this machine (surface book).
Update
The issue was me not understanding how the new windows 10 command prompt works. Any time you click on the command window, it goes into "select" mode which completely pauses execution of the process.
Basically: https://superuser.com/questions/419717/windows-command-prompt-freezing-randomly?newreg=ece53f5584254346be68f85d1fd2f18d
You can tell it is in this state because it will prefix the window title with the word "Select":
You have to press enter or click again to get it going once again.
So, two final comments:
1) What an incredibly confusing and un-intuitive behavior for a command window!
2) I hope some admin will come take pity on the shame I have brought upon myself and my family by deleting this question.
To get rid of this strange behavior, you can disable QuickEdit mode:
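Besides unchecking QuickEdit Mode in the console window's Properties dialog, it can also be turned off from inside the console application at startup. The following is only a sketch: the QuickEdit class and its method names are my own, and it assumes a Windows console process. It uses the Win32 GetConsoleMode/SetConsoleMode functions and clears ENABLE_QUICK_EDIT_MODE while setting ENABLE_EXTENDED_FLAGS, which is how the Win32 documentation describes disabling the mode.

```csharp
using System;
using System.Runtime.InteropServices;

public static class QuickEdit
{
    private const int StdInputHandle = -10;          // STD_INPUT_HANDLE
    private const uint EnableQuickEditMode = 0x0040; // ENABLE_QUICK_EDIT_MODE
    private const uint EnableExtendedFlags = 0x0080; // ENABLE_EXTENDED_FLAGS

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern IntPtr GetStdHandle(int nStdHandle);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool GetConsoleMode(IntPtr hConsoleHandle, out uint lpMode);

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool SetConsoleMode(IntPtr hConsoleHandle, uint dwMode);

    // Pure helper: returns the given mode with the QuickEdit bit cleared and
    // the extended-flags bit set (required for the change to take effect).
    public static uint WithoutQuickEdit(uint mode)
    {
        return (mode & ~EnableQuickEditMode) | EnableExtendedFlags;
    }

    // Returns false if there is no console or the mode could not be changed.
    public static bool Disable()
    {
        IntPtr handle = GetStdHandle(StdInputHandle);
        uint mode;
        if (!GetConsoleMode(handle, out mode))
        {
            return false;
        }
        return SetConsoleMode(handle, WithoutQuickEdit(mode));
    }
}
```

Calling QuickEdit.Disable() as the first line of Main, before starting the JobHost, should keep a stray click from pausing the process. It only affects the current console session, not the user's saved console defaults.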
Strange. When it is in this stuck state, can you try adding a new queue message to the queue and see if that triggers? Are you sure your function isn't hanging internally? What version of the SDK are you using? You might also try upgrading to v1.1.0, which we just released last week. If there are really a bunch of messages in the queue waiting to be processed, I can't think of anything that would cause this. The queue listener in the SDK should chug along, reading batches of messages in parallel and dispatching them to your function. Have you changed any of the JobHostConfiguration.Queues configuration knobs? You haven't force-updated the Azure SDK to a version higher than the WebJobs SDK supports, have you?
Another option if you can't figure this out might be to clone the SDK, build it and debug it locally. The repo is here. The main queue processing loop is here.
My azure webjob appears to be terminating without throwing an exception and I'm lost.
My web job is run on-demand (or scheduled) and has a dependency on my web site DLL (an MVC app). It calls into it to do most of the work, which includes working with an Entity Framework database and making REST calls to several other sites. Most of the work is done asynchronously. Most of the code used to do this work is also called from other parts of the site without problem, and it goes without saying that the web job works flawlessly when run locally.
The web job terminates without seeming to throw an exception, and it doesn't seem to be possible to debug a web job that's not of the continuously run variety (?). Therefore, my debugging has mostly been of the Console.WriteLine variety. Because of that and the asynchronicity, I haven't been able to nail down exactly where it's crashing. I thought it was while accessing the database, but after mucking with it, the database access started working.. ugh. My next best guess is that it dies during an await or other async plumbing. It does, however, crash within two try/catch blocks that have finallys that log results to Redis and Azure Storage. None of that happens. I cannot figure out, or imagine, how this process is crashing without hitting any exception handlers.. ?
Anyone had this problem with an azure webjob? Any idea what I should be looking for or any tips for debugging this?
Thanks!
I figured it out! One of the many things happening asynchronously was the creation of a certificate. I traced it down to this:
signedCert = new X509Certificate2(cert, "notasecret", X509KeyStorageFlags.Exportable);
This code works fine when called from my azure website or my tests, but kills the webjob process completely without throwing an exception! For example, the WriteLine in the exception handler below never gets called:
X509Certificate2 signedCert;
try
{
    signedCert = new X509Certificate2(cert, "notasecret", X509KeyStorageFlags.Exportable);
}
catch (Exception ex)
{
    // We never get here! Argh!
    Console.WriteLine("Exception converting cert: " + ex);
    throw;
}
Extremely time consuming and frustrating. Unlike the diagnosis, the fix is simple:
signedCert = new X509Certificate2(
    cert,
    "notasecret",
    X509KeyStorageFlags.Exportable |
    X509KeyStorageFlags.MachineKeySet |
    X509KeyStorageFlags.PersistKeySet);
We’ve been using WatiN and CruiseControl.NET for a few weeks now and most of the time they work together well. We have not had any problems running the tests on our developer machines. We are also able to run the tests interactively without problems when logged into the CI server.
Most of the time there are also no problems when the tests are executed under CruiseControl but this is not always the case as we’ve recently been seeing intermittent errors. The errors seem to come and go somewhat randomly but when an error does occur, it’s always one of the following:
WatiN.Core.Exceptions.TimeoutException: Timeout while Internet Explorer busy
System.Runtime.InteropServices.COMException: Creating an instance of the COM component with CLSID {0002DF01-0000-0000-C000-000000000046} from the IClassFactory failed due to the following error: 800704a6
Our CI server environment is:
Windows Server 2008 R2
IIS 7.5
IE 8
Watin 2.0
CruiseControl 1.6.7981.1. This is running as a service which logs in as a user on our domain because the tests need to access resources on the domain.
The 'randomly' failing tests create their IE instances as follows:
[TestMethod]
public void SomeTest()
{
    using (var browser = new IE())
    {
        // run tests here
    }
}
I also tried creating the IE in a new process as follows:
[TestMethod]
public void SomeTest()
{
    using (var browser = new IE(true))
    {
        // run tests here
    }
}
But when I did that, all of our tests failed with a “WatiN.Core.Exceptions.BrowserNotFoundException: Could not find an IE window matching constraint: Timeout while waiting to attach to newly created instance of IE.. Search expired after '30' seconds”
So, I have two questions:
Can anyone tell me how I can stop the timeouts and com exceptions?
Can anyone explain why IE(true) didn’t work at all?
TIA,
Mike
See this answer by Carl Hörberg:
Running Watin on TeamCity
CruiseControl.Net and TeamCity have the same problem when running as a service and the work-around should work for both environments.
By default, ccservice runs as Local System, which does not have access to interact with the desktop UI and has limited privileges. That is most probably what is causing the TimeoutException, COMException, and BrowserNotFoundException to be thrown.