Oracle on Azure closes connection after a time period

We have an Oracle 11g DB on a Microsoft Azure VM.
The Oracle connection on the client side closes after a period of time, even while active SQLs are running. I check the active SQL reports one minute and in the next, boom, the connection is closed.
We have not defined any profile that would time out connections, yet if I keep a connection open in SQL Developer and run a query some time later, the connection is closed.
When long-running batch programs execute on the server itself, they seem to hang, and afterwards no session exists for the batch program.
I'm guessing that some fixed time after a session is obtained, the DB closes the connection, which makes the batch hang.
Is this related to Azure or to something else? It does not even produce any error codes.

Related

node-postgres connection performance improvement

I am writing my backend using Node.js (Express), and my database is PostgreSQL. I am using node-postgres (pg) to connect to the Postgres database.
Currently I am using the pg.Pool concept, so that I have clients connected and ready to serve requests. What I observe is that node-postgres takes more than 2 seconds to connect to the DB, and hence the response time is quite long.
In the node-postgres documentation, they mention that the initial connection takes only 20-30 milliseconds, but I see 2-3 seconds to establish a connection. I load-tested my app at around 1000 requests/sec, and the average response time is quite high due to the initial connection establishment time. I have only a single SELECT query from which I build the response. The response processing time is very low; only connecting and getting data from the DB takes time.
I tried all the ways of releasing a client back to the pool after receiving the response, etc. Now I'm using pool.query, which takes care of connecting, as well as releasing the client back to the pool after the task is done.
Is there any alternative to node-postgres that can provide better performance for DB operations?

Connection to Redis cache fails after restart - Azure

We are using following code to connect to our caches (in-memory and Redis):
settings
    .WithSystemRuntimeCacheHandle()
    .WithExpiration(CacheManager.Core.ExpirationMode.Absolute, defaultExpiryTime)
    .And
    .WithRedisConfiguration(CacheManagerRedisConfigurationKey, connectionString)
    .WithMaxRetries(3)
    .WithRetryTimeout(100)
    .WithJsonSerializer()
    .WithRedisBackplane(CacheManagerRedisConfigurationKey)
    .WithRedisCacheHandle(CacheManagerRedisConfigurationKey, true)
    .WithExpiration(CacheManager.Core.ExpirationMode.Absolute, defaultExpiryTime);
It works fine, but sometimes the machine is restarted (automatically by Azure, where we host it), and after the restart the connection to Redis fails with the following exception:
Connection to '{connection string}' failed.
at CacheManager.Core.BaseCacheManager`1..ctor(String name, ICacheManagerConfiguration configuration)
at CacheManager.Core.BaseCacheManager`1..ctor(ICacheManagerConfiguration configuration)
at CacheManager.Core.CacheFactory.Build[TCacheValue](String cacheName, Action`1 settings)
at CacheManager.Core.CacheFactory.Build(Action`1 settings)
According to the "Why was my client disconnected from the cache?" section of the Azure Redis FAQ (https://learn.microsoft.com/en-us/azure/redis-cache/cache-faq), this might happen after a redeploy.
The questions are:
1. Is there any mechanism to restore the connection after a redeploy?
2. Is anything wrong with the way we initialize the connection?
We are sure the connection string is OK.
Most clients (including StackExchange.Redis) usually connect / re-connect automatically after a connection break. However, your connect timeout setting needs to be large enough for the re-connect to happen successfully. Remember, you only connect once, so it's alright to give the system enough time to be able to reconnect. A higher connect timeout is especially useful when a burst of connections or re-connections after a blip causes CPU to spike, in which case some connections might otherwise not complete in time.
In this case, I see RetryTimeout set to 100. If this is the connection timeout, check whether it is in milliseconds; 100 milliseconds is far too low. You might want to make it more like 10 seconds (remember, connecting is a one-time thing, so you want to give it time to succeed).
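As an illustration, here is a minimal sketch of the configuration from the question with the retry timeout raised to roughly 10 seconds, built via CacheFactory.Build as shown in the stack trace (the 10000 ms value is an assumption to tune, not a prescription):

var cache = CacheManager.Core.CacheFactory.Build(settings => settings
    .WithSystemRuntimeCacheHandle()
    .WithExpiration(CacheManager.Core.ExpirationMode.Absolute, defaultExpiryTime)
    .And
    .WithRedisConfiguration(CacheManagerRedisConfigurationKey, connectionString)
    .WithMaxRetries(3)
    .WithRetryTimeout(10000) // milliseconds: ~10 s instead of 100 ms, giving re-connects after a restart time to succeed
    .WithJsonSerializer()
    .WithRedisBackplane(CacheManagerRedisConfigurationKey)
    .WithRedisCacheHandle(CacheManagerRedisConfigurationKey, true)
    .WithExpiration(CacheManager.Core.ExpirationMode.Absolute, defaultExpiryTime));

If the 100 was meant as the actual connect timeout rather than the retry timeout, the StackExchange.Redis connect timeout can also be raised directly by appending connectTimeout=10000 to the Redis connection string.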

Apache Derby Embedded Mode and Multi-Threaded Connection Management

I am currently working on an application (whose logic and code I cannot put out here) that creates an embedded derby database and interacts with it using multiple threads to perform CRUD and SELECT operations.
Let's say the name of the embedded database is bar and the path to this database is c:\foo\bar
Multiple threads each open their own connection to c:\foo\bar and perform their respective operations against their own tables in the database.
The connection to the database is abstracted out by a decorator that also maintains the last time the connection was accessed.
If the last time the connection was accessed exceeds a particular threshold the database connection is shutdown and reaped.
There is a reaper thread that runs at a pre-defined scheduled interval and performs the reaping logic. As a part of the reaping logic it uses the following shutdown URL:
jdbc:derby:c:\foo\bar;shutdown=true
Any thread that attempts to run a query against this database after the reaper thread has run fails with Derby error 08003, which indicates that there is no current connection.
So is it that, in Derby embedded mode, even though each thread has opened its own connection, when the reaper thread runs it shuts down the entire database, and any connections previously opened against this database across all threads are now in an invalid or closed state?
What are the best practices for using embedded derby within such applications?
The shutdown=true attribute on the Derby JDBC Connection URL doesn't shut down the connection, it shuts down the database.
See: http://db.apache.org/derby/docs/10.13/ref/rrefattrib16471.html
If you just want to shut down a connection, call Connection.close()

Azure WebJob DB Connection Error Only on some instances

I have two Azure WebJobs. The first takes an incoming message that tells it to grab a PDF and break it into individual page images, and then queues another message for the second WebJob to process those individual pages. This worked fine on our QC instance, but when we moved to production I started getting strange, intermittent errors on the second job. The first job runs and breaks the file into page images without problems; I have confirmed that every page image gets created and every page message gets queued. However, for the second job, only some of the messages are getting processed correctly. The remaining ones show this error in the WebJob diagnostics:
Microsoft.Azure.WebJobs.Host.FunctionInvocationException: Microsoft.Azure.WebJobs.Host.FunctionInvocationException: Exception while executing function: Functions.ProcessBatchPage ---> System.Data.SqlClient.SqlException: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: SQL Network Interfaces, error: 52 - Unable to locate a Local Database Runtime installation. Verify that SQL Server Express is properly installed and that the Local Database Runtime feature is enabled.) ---> System.ComponentModel.Win32Exception: The system cannot find the file specified
But what's weird is that this error mentions the Local Database Runtime and SQL Server Express, and I am not referencing either anywhere in my code. The system points at an Azure SQL DB. The job uses ADO.NET, and I have hardcoded the connection string to rule out any issues with configuration-based connection strings. Stranger still, it only happens to a certain portion of the messages; the others process perfectly.
Lastly, I ran the job in debug mode locally (still pointing at the real queue and DB on Azure) and got the same problem. The job outputs a console line with the job ID as the first line of the code. For the jobs that process successfully, I see this write; for those that fail, I never see anything. It's almost as if the job is not really starting up correctly (the failed jobs also have a really short run time, 50-100 ms).
I had the same issue with some jobs, and I came across these articles while looking for a solution:
Transient Fault Handling (Building Real-World Cloud Apps with Azure)
Connection Resiliency / Retry Logic (EF6 onwards)
From these articles:
Causes of transient failures:
In the cloud environment you’ll find that failed and dropped database connections happen periodically. That’s partly because you’re going through more load balancers compared to the on-premises environment where your web server and database server have a direct physical connection. Also, sometimes when you’re dependent on a multi-tenant service you’ll see calls to the service get slower or time out because someone else who uses the service is hitting it heavily. In other cases you might be the user who is hitting the service too frequently, and the service deliberately throttles you – denies connections – in order to prevent you from adversely affecting other tenants of the service.
Use smart retry/back-off logic to mitigate the effect of transient failures:
The Microsoft Patterns & Practices group has a Transient Fault Handling Application Block that does everything for you if you're using ADO.NET for SQL Database access (not through Entity Framework). You just set a policy for retries (how many times to retry a query or command, and how long to wait between tries) and wrap your SQL code in a using block:
public void HandleTransients()
{
    var connStr = "some database";
    // RetryPolicy, SqlAzureTransientErrorDetectionStrategy and ReliableSqlConnection
    // all come from the Transient Fault Handling Application Block.
    var _policy = new RetryPolicy<SqlAzureTransientErrorDetectionStrategy>(
        retryCount: 3,
        retryInterval: TimeSpan.FromSeconds(5));
    using (var conn = new ReliableSqlConnection(connStr, _policy))
    {
        conn.Open(); // opens the connection, retrying per the policy
        // Do SQL stuff here.
    }
}
When you use the Entity Framework you typically aren’t working directly with SQL connections, so you can’t use this Patterns and Practices package, but Entity Framework 6 builds this kind of retry logic right into the framework. In a similar way you specify the retry strategy, and then EF uses that strategy whenever it accesses the database.
To use this feature in the Fix It app, all we have to do is add a class that derives from DbConfiguration and turn on the retry logic.
// EF follows a code-based configuration model and will look for a class that
// derives from DbConfiguration when executing any connection resiliency strategies.
public class EFConfiguration : DbConfiguration
{
    public EFConfiguration()
    {
        // Registers the retry strategy for the SQL Server provider.
        SetExecutionStrategy("System.Data.SqlClient", () => new SqlAzureExecutionStrategy());
    }
}
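With EFConfiguration present in the same assembly as your DbContext, EF6 discovers it automatically; no further wiring is needed. Below is a minimal usage sketch (the context and entity names are hypothetical, not taken from the Fix It app):

using System.Data.Entity;
using System.Linq;

// Hypothetical entity for illustration only.
public class TaskItem
{
    public int Id { get; set; }
    public bool Done { get; set; }
}

// Hypothetical EF6 context; EFConfiguration is picked up automatically
// because it lives in the same assembly.
public class AppDbContext : DbContext
{
    public DbSet<TaskItem> Tasks { get; set; }
}

public static class Demo
{
    public static void Run()
    {
        // This query now executes under SqlAzureExecutionStrategy and is
        // retried automatically when a transient SQL Database error occurs.
        using (var db = new AppDbContext())
        {
            var openTasks = db.Tasks.Where(t => !t.Done).ToList();
        }
    }
}

One caveat: EF6 execution strategies do not retry operations inside user-initiated transactions by default, so code that wraps its own TransactionScope needs extra care.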

IBM Cognos Report Studio: "The connection closed before the request is processed."

We are consuming TM1 cubes with Report Studio through Framework Manager.
Quite often when I am trying to come up with new solutions to my challenges in Report Studio, I get an error when I run the report, and then the server goes down. Then I have to restart the dispatchers (Cognos Administration -> Status -> System -> Right Click on the server -> Test Dispatchers -> Right Click on the server -> Start Dispatchers).
The error message that I get is:
The connection closed before the request is processed. If you are
using WebSphere Application Server, to reduce the frequency of this
error, increase the Persistent Timeout parameter for the Web container
transport chains in the administrative console. Increase the time in
10-15 second intervals until the error no longer or rarely occurs.
We are not using WebSphere, but Tomcat (the default with the installation).
-> Increasing the connection timeout interval on WebSphere is thus not applicable.
-> The timeout interval in the Tomcat config seems to be 60 seconds (60000 ms).
More importantly: the error message shows immediately (after about 1 second) when I run the report.
-> This indicates to me that the error occurs regardless of any timeout interval setting.
Additional info: the error almost always occurs when I manually and dynamically attempt to build MUNs (member unique names). However, sometimes (I don't know when or why) it instead shows the MUN that I've created and tells me that it is invalid, which is far better for debugging.
Any suggestions on why this is happening and how to fix it would be greatly appreciated!
Edit 1: http://www.linkedin.com/groups/Product-Cognos-BI-1011-Cognos-3917273.S.143157206
This post states (almost at the bottom) that
When the Cognos BI report ask for a field that does not exist, the TM1
Application disconnects the connection. And the Cognos BI Report will
timeout.
Is this true? If so, why am I sometimes told that my MUN is invalid, whereas other times the connection is closed and the server shuts down? Is it because even Report Studio thinks that my MUN is valid and tries to get it from the TM1 server?
And additionally: Is it possible to change this behavior for the TM1 server?
Edit 2: Or change the BI server behavior so that it does not shut down when the TM1 connection is disconnected, but rather shows an error of some kind?
Thanks again!
Edit 3: Okay, so I did some checking with the TM1 top utility (http://pic.dhe.ibm.com/infocenter/ctm1/v9r5m0/index.jsp?topic=%2Fcom.ibm.swg.im.cognos.tm1_op.9.5.1.doc%2Ftm1_op_id6961UsingtheTM1TopUtility_N160F47.html).
When a normal report is run, a new thread is shown in the monitoring list. This thread then disappears when I stop the BI server dispatchers, or automatically after approximately 5 minutes of idle time without any reports being run (according to the TM1 Top log dump).
Likewise, when the error occurs, a new thread is shown in the list. However, it disappears after a split second (probably because the BI server dispatchers are shut down).
I have therefore concluded that it is safe to assume (?) that the request reaches the TM1 server and that TM1 returns something back (or simply closes the connection, as suggested in the LinkedIn post referenced in my first edit). Hence, this is presumably something that has to be fixed on the BI server side (?).
The question is therefore more likely: is it possible to change the BI server behavior so that it does not shut down when the TM1 server returns something invalid or closes the connection, and instead shows some kind of error message?
Thanks for any input!
