Multi-tenant Snowflake DataSource using Hikari from Spring keeps creating more and more threads, causing memory issues later - memory-leaks

I have a Spring web app on Tomcat that connects to both MySQL and Snowflake depending on the configuration. For this I am using AbstractRoutingDataSource and putting all the data sources in a map. I am finding that the number of threads created by the connections keeps growing, and this eventually causes memory issues. I cannot tell whether this is a Hikari connection-pool concurrency issue or an issue with the underlying Snowflake driver. If I replace the Snowflake data source with another MySQL data source, everything works fine.
The code for the snowflake datasource is given below:
public DataSource getSFDataSource(CommonTenantEntity tenant) {
    // it's assumed it's Snowflake at this point
    HikariConfig config = new HikariConfig();
    config.setDriverClassName(<Datasource driver name>);
    config.setJdbcUrl(<URL to Snowflake>);
    config.setPoolName(org.springframework.util.StringUtils.isEmpty(SUFFIX_SF) ?
    config.setMaximumPoolSize(<Pool size>);
    config.addDataSourceProperty("cachePrepStmts", "true");
    config.addDataSourceProperty("prepStmtCacheSize", "250");
    config.addDataSourceProperty("prepStmtCacheSqlLimit", "2048");
    config.addDataSourceProperty("zeroDateTimeBehavior", "convertToNull");
    config.addDataSourceProperty("useLegacyDatetimeCode", "false");
    config.addDataSourceProperty("hibernate.physical_naming_strategy", "org.hibernate.boot.model.naming.OrgSFPhysicalNamingStrategyImpl");
    config.addDataSourceProperty("hibernate.implicit_naming_strategy", "CustomNamingStrategy");
    return new HikariDataSource(config);
}
The thread dump observed while this happens is shown below:
"integration-snowflake-4 housekeeper" - Thread t@122
java.lang.Thread.State: TIMED_WAITING
at java.base@11.0.16/jdk.internal.misc.Unsafe.park(Native Method)
- parking to wait for <6cb5eb65> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.base@11.0.16/java.util.concurrent.locks.LockSupport.parkNanos(
at java.base@11.0.16/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(
at java.base@11.0.16/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(
at java.base@11.0.16/java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(
at java.base@11.0.16/java.util.concurrent.ThreadPoolExecutor.getTask(
at java.base@11.0.16/java.util.concurrent.ThreadPoolExecutor.runWorker(
at java.base@11.0.16/java.util.concurrent.ThreadPoolExecutor$
at java.base@11.0.16/
Locked ownable synchronizers:
- None
"integration-snowflake-4 connection adder" - Thread t@123
java.lang.Thread.State: RUNNABLE
at java.base@11.0.16/ Method)
at java.base@11.0.16/
at java.base@11.0.16/
at java.base@11.0.16/
at java.base@11.0.16/
at java.base@11.0.16/
at java.base@11.0.16/
at java.base@11.0.16/
at java.base@11.0.16/$
at net.snowflake.client.jdbc.internal.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(
at net.snowflake.client.jdbc.internal.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(
at net.snowflake.client.jdbc.internal.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(
at net.snowflake.client.jdbc.internal.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(
at net.snowflake.client.jdbc.internal.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(
at net.snowflake.client.jdbc.internal.apache.http.protocol.HttpRequestExecutor.execute(
at net.snowflake.client.jdbc.internal.apache.http.impl.execchain.MainClientExec.execute(
at net.snowflake.client.jdbc.internal.apache.http.impl.execchain.ProtocolExec.execute(
at net.snowflake.client.jdbc.internal.apache.http.impl.execchain.RetryExec.execute(
at net.snowflake.client.jdbc.internal.apache.http.impl.execchain.RedirectExec.execute(
at net.snowflake.client.jdbc.internal.apache.http.impl.client.InternalHttpClient.doExecute(
at net.snowflake.client.jdbc.internal.apache.http.impl.client.CloseableHttpClient.execute(
at net.snowflake.client.jdbc.internal.apache.http.impl.client.CloseableHttpClient.execute(
at net.snowflake.client.jdbc.RestRequest.execute(
at net.snowflake.client.core.HttpUtil.executeRequestInternal(
at net.snowflake.client.core.HttpUtil.executeRequest(
at net.snowflake.client.core.HttpUtil.executeGeneralRequest(
at net.snowflake.client.core.SessionUtil.newSession(
at net.snowflake.client.core.SessionUtil.openSession(
- locked <381ec49c> (a net.snowflake.client.core.SFSession)
at net.snowflake.client.jdbc.DefaultSFConnectionHandler.initialize(
at net.snowflake.client.jdbc.DefaultSFConnectionHandler.initializeConnection(
at net.snowflake.client.jdbc.SnowflakeConnectionV1.initConnectionWithImpl(
at net.snowflake.client.jdbc.SnowflakeConnectionV1.<init>(
at net.snowflake.client.jdbc.SnowflakeDriver.connect(
at com.zaxxer.hikari.util.DriverDataSource.getConnection(
at com.zaxxer.hikari.pool.PoolBase.newConnection(
at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(
at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(
at com.zaxxer.hikari.pool.HikariPool$
at com.zaxxer.hikari.pool.HikariPool$
at java.base@11.0.16/
at java.base@11.0.16/java.util.concurrent.ThreadPoolExecutor.runWorker(
at java.base@11.0.16/java.util.concurrent.ThreadPoolExecutor$
at java.base@11.0.16/
Locked ownable synchronizers:
- locked <5c5da1e> (a java.util.concurrent.ThreadPoolExecutor$Worker)
- locked <761970bf> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)


azure-sdk-for-java eventhubs Partition has been lost

We recently deployed an Azure Event Hubs Java receiver/listener client by following the Azure docs.
I observed the following errors raised from processError and processPartitionClose:
Error occurred in partition14 - connectionId[MF_5fba9c_1636350888640] sessionName[eventhub-name/ConsumerGroups/consumer-group-name/Partitions/14] entityPath[eventhub-name/ConsumerGroups/consumer-group-name/Partitions/14] linkName[14_500701_1636350888641] Cannot create receive link from a closed session., errorContext[NAMESPACE: ERROR CONTEXT: N/A, PATH: eventhub-name/ConsumerGroups/consumer-group-name/Partitions/14]
ERROR | Partition has been lost 14 reason LOST_PARTITION_OWNERSHIP
Questions:
Does azure-sdk-for-java-sdk-eventhubs reconnect automatically when a partition is lost like this?
If not, what is the best practice before restarting manually?
Do I need to update the checkpoint manually?
Do I need to do anything about the ownership?
This is our SDK setup with sample code:
EventProcessorClientBuilder eventProcessorClientBuilder = new EventProcessorClientBuilder()
    .checkpointStore(new BlobCheckpointStore(blobContainerAsyncClient))
    .connectionString(getEventHubConnectionString(), getEventHubName());
EventProcessorClient eventProcessorClient = eventProcessorClientBuilder.buildEventProcessorClient();
// Starts the event processor
eventProcessorClient.start();
private final Consumer<ErrorContext> ERROR_HANDLER = errorContext -> {
    log.error("Error occurred in partition" + errorContext.getPartitionContext().getPartitionId()
        + " - " + errorContext.getThrowable().getMessage());
};
private final Consumer<CloseContext> CLOSE_HANDLER = closeContext -> {
    log.error("Partition has been lost " + closeContext.getPartitionContext().getPartitionId()
        + " reason " + closeContext.getCloseReason());
    EventContext lastContext = lastEvent.get();
    if (lastContext != null && (lastContext.getEventData().getSequenceNumber() % 10) != 0) {
        // ...
    }
};
jdk : 1.8
I did come across GitHub issue 15164, but could not find this mentioned anywhere.
Does azure-sdk-for-java-sdk-eventhubs reconnect automatically when a partition is lost like this?
Yes, the EventProcessorClient in azure-messaging-eventhubs library will reconnect on such partitions. You don't need to change anything manually.
If there are multiple instances of EventProcessorClients running and they all process events from the same Event Hub and use the same consumer group, then you see this LOST_PARTITION_OWNERSHIP error on one processor because the ownership of a partition might have been claimed by the other processor. The checkpoints are read from the checkpoint store (Storage Blob in your code sample above) and the processing resumes from the next sequence number.
Please refer to partition ownership and checkpointing for more details.

Getting ConcurrentModificationException from createRow() with Apache POI in a multi-threaded method even though I have already synchronized this code

I use multiple threads to create rows in an Excel sheet with the Apache POI package.
Each thread creates one row. To avoid concurrency issues, I put the code inside a synchronized block so that only one thread can create a row and its cells at a time. However, I still get a ConcurrentModificationException. I can reproduce the exception reliably, always at the same place in the code.
The exception is:
at java.util.TreeMap$NavigableSubMap$SubMapIterator.nextEntry(
at java.util.TreeMap$NavigableSubMap$
at java.util.TreeMap$NavigableSubMap$
at java.util.TreeMap$NavigableSubMap$EntrySetView.size(
at java.util.TreeMap$NavigableSubMap.size(
at org.apache.poi.xssf.usermodel.XSSFSheet.createRow(
at org.apache.poi.xssf.usermodel.XSSFSheet.createRow(
at java.util.concurrent.Executors$
My code is:
public void run() {
    // add row to excel sheet
    Row row = sheet.createRow(querySequence+1); // here I get exception
    row.createCell(0).setCellValue(rqResult.getRQName());
}
Thanks for any help.

How to save data using multiple threads in grails-2.4.4 application using thread pool

I have a multithreaded program running some logic to come up with rows of data that I need to save in my grails (2.4.4) application. I am using a fixedthreadpool with 30 threads. The skeleton of my program is below. My expectation is that each thread calculates all the attributes and saves on a row in the table. However, the end result I am seeing is that there are some random rows that are not saved. Upon repeating this exercise, it is seen that a different set of rows are not saved in the table. So, overall, each time this is attempted a certain set of rows are NOT saved in table at all. GORMInstance.errors did not reveal any errors. So, I have no clue what is incorrect in this program.
ExecutorService exeSvc = Executors.newFixedThreadPool(30)
for (obj in list) {
    exeSvc.execute({ -> finRunnable obj } as Callable)
}
Also, here's the runnable program that the above snippet invokes.
def finRunnable = { obj ->
    for (item in LIST-1) {
        for (it in LIST-2) {
            for (i in LIST-3) {
                rowdata = calculateValues(item, it, i);
                GORMInstance instance = new GORMInstance();
                instance.attribute2 = rowdata[1]; // ...and so on
                /* without flush:true, I am running into HeuristicCompletion
                   exception. So I need it here. */
                instance.save(flush: true)
            } // forloop 3
        } // forloop 2
    } // forloop 1
} // runnable closure

Coalescing items in channel

I have a function which receives tasks and puts them into a channel. Every task has an ID, some properties and a channel where the result will be placed. It looks like this:
task.Result = make(chan *TaskResult)
queue <- task
result := <-task.Result
Another goroutine takes a task from the channel, processes it and puts the result into task's channel
task := <-queue
task.Result <- doExpensiveComputation(task)
This code works fine. But now I want to coalesce tasks in the queue. Task processing is a very expensive operation, so I want to process all the tasks in the queue that share an ID only once. I see two ways of doing it.
The first is not to put tasks with the same ID into the queue at all, so when a task that already exists arrives, it waits for its copy to complete. Here is pseudo-code:
if newTask in queue {
    existing := queue.getById(newTask.ID)
    // wait for existing to complete and reuse its result
} else {
    // put newTask into the queue as usual
}
So, I can implement it using go channel and map for random access + some synchronization means like mutex. What I don't like about this way is that I have to carry both map and channel around the code and keep their contents synchronized.
The second way is to put all the tasks into queue, but to extract task and all the tasks with the same IDs from the queue when result arrives, then send result to all the tasks. Here is pseudo-code
someTask := queue.dequeue()
result := doExpensiveComputation(someTask)
someTask.Result <- result
moreTasks := queue.getAllWithID(someTask.ID)
for _, theSameTask := range moreTasks {
    theSameTask.Result <- result
}
And I have an idea how to implement this using chan + map + mutex in the same way as above.
And here is the question: is there a built-in or existing data structure that I can use for such a problem? Are there other (better) ways of doing this?
If I understand the problem correctly, the simplest solution that comes to mind is adding a middle layer between task senders (putting into the queue) and workers (taking from the queue). This layer, probably a goroutine, would be responsible for storing current tasks (by ID) and broadcasting the results to all matching tasks.
Pseudo-code:
go func() {
    active := make(map[TaskID][]Task)
    for {
        select {
        case task := <-queue:
            tasks := active[task.ID]
            // No tasks with such ID, start heavy work
            if len(tasks) == 0 {
                worker <- task
            }
            // Save task for the result
            active[task.ID] = append(active[task.ID], task)
        case r := <-response:
            // Broadcast to all tasks
            for _, task := range active[r.ID] {
                task.Result <- r.Result
            }
            // Forget the finished ID so a later task with the same ID starts new work
            delete(active, r.ID)
        }
    }
}()
No mutexes needed and probably no need to carry anything around either, workers will simply need to put all the results into this middle layer, which is then routing responses correctly. You could even easily add caching here if there's a chance clashing IDs can arrive some time apart.
Edit: I had this dream where the above code caused a deadlock. If you send a lot of requests at once and choke worker channel there's a serious problem – this middle layer routine is stuck on worker <- task waiting for a worker to finish, but all the workers will be probably blocked on send to response channel (because our routine can't collect it). Playable proof.
One could think of adding some buffers into the channels but this is not a proper solution (unless you can design the system in such way the buffer will never fill up). There're a few ways of solving this problem; for example, you can run a separate routine for collecting responses, but then you would need to protect active map with a mutex. Doable. You could also put worker <- task into a select, which would try to send task to a worker, receive new task (if nothing to send) or collect a response. One could take advantage of the fact that nil channel is never ready for communication (ignored by select), so you can alternate between receiving and sending tasks within a single select. Example:
go func() {
    var next Task     // received task which needs to be passed to a worker
    in := queue       // incoming channel (new tasks) -- active
    var out chan Task // outgoing channel (to workers) -- inactive
    for {
        select {
        case t := <-in:
            next = t // store task, so we can pass to worker
            in, out = nil, worker // deactivate incoming channel, activate outgoing
        case out <- next:
            in, out = queue, nil // deactivate outgoing channel, activate incoming
        case r := <-response:
            collect <- r
        }
    }
}()
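Putting the pieces together, the single-select loop and the broadcast map could be combined roughly as follows. This is a sketch under assumptions, not a drop-in implementation: Task, Response, dispatch, startWorker and the stop channel are illustrative names, IDs are ints and results are strings, and every Task.Result channel is assumed to have a buffer of 1 so that the dispatcher never blocks while broadcasting.

```go
package main

// Task carries an ID and a channel on which the sender receives the result.
// Result is assumed to be buffered (capacity 1).
type Task struct {
	ID     int
	Result chan string
}

// Response is what a worker reports back to the dispatcher.
type Response struct {
	ID     int
	Result string
}

// dispatch forwards the first task of each ID to a worker and broadcasts
// the worker's response to every task that arrived with that ID meanwhile.
func dispatch(queue <-chan Task, worker chan<- Task, response <-chan Response, stop <-chan struct{}) {
	active := make(map[int][]Task) // tasks waiting for a result, by ID
	var next Task                  // task waiting to be handed to a worker
	in := queue                    // receiving new tasks is enabled
	var out chan<- Task            // nil: sending to workers is disabled
	for {
		select {
		case t := <-in:
			first := len(active[t.ID]) == 0
			active[t.ID] = append(active[t.ID], t)
			if first {
				next = t
				in, out = nil, worker // stop receiving until next is handed off
			}
		case out <- next:
			in, out = queue, nil // handed off, receive new tasks again
		case r := <-response:
			for _, t := range active[r.ID] {
				t.Result <- r.Result // does not block: Result is buffered
			}
			delete(active, r.ID) // a later task with this ID starts new work
		case <-stop:
			return
		}
	}
}

// startWorker runs one worker that computes results for incoming tasks.
func startWorker(tasks <-chan Task, response chan<- Response, compute func(id int) string) {
	go func() {
		for t := range tasks {
			response <- Response{ID: t.ID, Result: compute(t.ID)}
		}
	}()
}
```

A sender would create a Task with Result: make(chan string, 1), send it on queue and then receive from Result. Because the loop is always ready to take a response, workers cannot get stuck sending results while the dispatcher waits for a free worker, which is the deadlock described above.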

Identify the associated w3wp process for a web role instance

I am working on monitoring the performance of an Azure service.
There are currently two web role instances (for the same website) running, each with its own w3wp.exe (w3wp and w3wp#1).
How can I find out which w3wp process belongs to which role instance?
With this information I want to feed the azure.diagnostics.monitor with some performance counters, namely Process(w3wp)\ProcessorTime (%) and Thread Count.
But in order to get any meaningful data I have to append the process ID of the w3wp process to the performance counter, e.g. Process(w3wp_PID)\processorTime(%) (I don't know if the syntax is right, but there is a way to append the PID)
so the final result in the AzureStorage table WADPerformanceCounters only has entries like:
WebRoleInstance_n_0 | process(w3wp_1033)\processorTime (%) | 12.4
WebRoleInstance_n_1 | process(w3wp_1055)\processorTime (%) | 48.4
At the moment it looks like this:
WebRoleInstance_n_0 | process(w3wp)\processorTime (%) | 12.4
WebRoleInstance_n_1 | process(w3wp)\processorTime (%) | 12.4
I thought that if I started a DiagnosticsMonitor for each role, the monitor would use the correct process, the one belonging to the role instance that started the monitor. But that does not actually work, or at least I think it doesn't, judging by the resulting values.
Update: on the manage.windowsazure portal you can define custom metrics for performance monitoring. There it is possible to choose the web role instance to be monitored exclusively.
This is what I want to do as well. Insights into what this page actually does might also help.
for comparison:
The only stupid way I can think of to get this information is
to get a list of all processes before and after each w3wp is started, identify which one was added, and then decide from the code-base context which role instance was just started.
I got it working, although it was not really straightforward.
First of all I have to make some corrections to my previous statements, just so we are on the same page.
In the Cloud Service there are several virtual machines, each hosting either a WebRole instance or a WorkerRole instance.
Thus on a single VM there is either a single w3wp running, or no w3wp at all but a WaWorkerHost process.
In my special case it is possible to have two w3wp processes running on a single VM, so I needed to differentiate between those two, which required me to make some sort of process-to-instance association.
What I wanted to log was the total CPU load of a single VM and the CPU load of the instance process running on the VM (w3wp, WaWorkerHost).
The performance counter for total CPU load is easy and the same for each VM: \Processor(_Total)\% Processor Time
For the web role VM I couldn't just use the \Process(w3wp)\% Processor Time counter, because I cannot be sure it is the correct w3wp (see above).
Now here is what I did:
Since you have to start a performance counter monitor for each role instance in OnStart() in WebRole.cs or WorkerRole.cs, I figured this is the only place where I can somehow gather the required information.
In the WorkerRole.cs i did:
int pc = Environment.ProcessorCount;
string instance = RoleEnvironment.CurrentRoleInstance.Id;
SomeOtherManagementClass.StartDiagnosticMonitorService(pc, instance, Process.GetCurrentProcess());
In WebRole.cs the current process is also WaWorkerHost, so I had to move the above lines into the Global.asax of the WebRole. There the correct process is available.
In SomeOtherManagementClass I put StartDiagnosticMonitorService, which now receives the current process from which it was called
(from WorkerRole.cs it receives the WaWorkerHost process, and from web roles the w3wp process, including its PID):
public static void StartDiagnosticMonitorService(int coreCount, string currentRole, Process process)
{
    string processName = GetProcessInstanceName(process.Id);
    SetCPUCoreData(coreCount, currentRole, processName, process.Id);
    instanceProcessLoadCounterName = String.Format(@"\Process({0})\% Processor Time", processName);
}
GetProcessInstanceName(process.Id) is now called on each VM and returns the instance name for the provided process ID. This lets you differentiate between multiple w3wp processes on a single VM, because the instance names returned are w3wp, w3wp#1, w3wp#2 etc., in contrast to the process name provided by GetCurrentProcess, which is always w3wp. For this I modified a code sample I found here on Stack Overflow; you can find it below:
private static string GetProcessInstanceName(int pid)
{
    PerformanceCounterCategory cat = new PerformanceCounterCategory("Process");
    string[] instances = cat.GetInstanceNames();
    foreach (string instance in instances)
    {
        try
        {
            using (PerformanceCounter cnt = new PerformanceCounter("Process",
                "ID Process", instance, true))
            {
                int val = (int)cnt.RawValue;
                if (val == pid)
                    return instance;
            }
        }
        catch (InvalidOperationException)
        {
            // reached when a process terminates while iterating the process list, so it cannot be found
        }
    }
    return "";
}
Last but not least: SetCPUCoreData(coreCount, currentRole, processName, process.Id) saves all the relevant data about the processes to Azure storage so that it is available from everywhere in the application:
private static void SetCPUCoreData(int count, string roleinstance, string processName, int processID)
{
    string[] instances = roleinstance.Split('.');
    CloudStorageAccount storageAccount = CloudStorageAccount.Parse(GetSettingValue("LoadMonitor.Connection.String"));
    CloudTableClient cloudTableClient = storageAccount.CreateCloudTableClient();
    const string tableName = "PerformanceMonitorCoreCount";
    TableServiceContext serviceContext = cloudTableClient.GetDataServiceContext();
    PerformanceCounterCPUCoreEntity ent = new PerformanceCounterCPUCoreEntity(count, instances[instances.Count() - 1], processName, processID);
    serviceContext.AttachTo(tableName, ent);
}
PerformanceCounterCPUCoreEntity is a template for the storage table. Look into the Azure Storage API if you have any questions regarding this part, or just ask.
