Azure SSAS tabular full process: memory won't release

Can anyone tell me if I did anything wrong in my design? For some reason, the memory used for processing tabular models does not seem to be released correctly.
The whole story is:
I created an SSIS package to loop over all Azure SSAS tabular models and process each one with a pre-defined type, e.g. full, dataOnly + recalculate, clear + full, etc. If an error occurs, the package logs it and moves on to the next model.
In my test I processed 40+ models in full process mode; most of them are very small, except one that is around 400 MB. Some of them generated errors, as I expected. But then I noticed that the memory usage of the SSAS instance increased dramatically.
I tried running DMV queries to look at sessions and shrinkable memory allocations, but couldn't find any clues. In the end I had to restart the instance to release the memory. As shown below, after the service restart the memory usage dropped to just 5 GB.
My understanding is that with full process mode the memory used for processing should be released afterwards, but in the demonstration below it was not.
Has anyone encountered this problem before? Is it related to the loop + full process, and if so, what is the correct way to process multiple tabular models?
Thanks in advance.

Related

JMeter: What can be the reason for a sudden spike in the Response time graph, which then decreases and runs consistently?

As seen in the graphs above, Graph no. 1 is the Response time graph; it shows a sudden spike in the middle of the test, but afterwards the response time seems to run consistently.
On the other hand, the throughput graph, Graph no. 2, shows a dip rather than a sudden spike; it gradually decreased. I also got two different throughput levels, before and after the dip.
I first thought it was a memory issue, but then it should have affected the response time as well.
Could anyone help me understand the reason behind the sudden spike in the Response Time graph?
And what could the possible bottleneck be, if not a memory leak?
Unfortunately these two charts don't tell the full story, and without knowing your application's details and technology stack it's quite hard to suggest anything meaningful.
A couple of possible reasons could be:
Your application is capable of autoscaling, so when the load reaches a certain threshold it either adds more resources or spins up another node of the cluster
Your application is undergoing e.g. garbage collection because its heap is full of stale objects, and once the collection is done it starts working at full speed again. You might want to run a soak test to see whether the pattern repeats or not
Going forward, consider collecting information on what's going on on the application-under-test side using e.g. the JMeter PerfMon Plugin or the SSHMon Listener

How can I automatically test for memory leaks in Node?

I have some code in a library that has leaked badly in the past, and I would like to add regression tests to avoid that in the future. I understand how to find memory leaks manually, by looking at memory usage profiles or using Valgrind, but I have had trouble writing automatic tests for them.
I tried calling global.gc() followed by process.memoryUsage() after running the operation I was checking for leaks, then repeating this to try to establish a linear relationship between the number of operations and memory usage, but there is enough noise in the memory usage numbers to make this hard to measure accurately.
So, my question is this: is there an effective way to write a test in Node that consistently fails when an operation leaks memory, and passes when it does not?
One wrinkle that I should mention is that the memory leaks were occurring in a C++ addon, and some of the leaked memory was not managed by the Node VM, so I was measuring process.memoryUsage().rss.
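For reference, a minimal sketch of the kind of test described above might look like the following; leakyOperation() is a hypothetical placeholder for the code under test, the script assumes Node is started with --expose-gc, and the 1 KB-per-call threshold is arbitrary:
// leak-regression-sketch.js — run with: node --expose-gc leak-regression-sketch.js
const assert = require('assert');

// Placeholder for the operation being checked; replace with the real library call.
function leakyOperation() {
  // e.g. a call into the C++ addon suspected of leaking
}

// Force a collection, then read the resident set size (covers native memory too).
function rssAfterGc() {
  global.gc();
  return process.memoryUsage().rss;
}

function averageGrowthPerCall(operation, iterations) {
  // Warm up so lazy allocations (caches, JIT) don't look like a leak.
  for (let i = 0; i < 100; i++) operation();
  const before = rssAfterGc();
  for (let i = 0; i < iterations; i++) operation();
  const after = rssAfterGc();
  return (after - before) / iterations; // average bytes retained per call
}

const perCall = averageGrowthPerCall(leakyOperation, 10000);
assert.ok(perCall < 1024, `possible leak: ~${Math.round(perCall)} bytes retained per call`);
console.log('no significant RSS growth detected');
Averaging over many iterations smooths out some of the noise, but the threshold still has to be tuned for the operation in question.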
Automating and logging information to test for memory leaks in Node.js:
There is a great module called memwatch-next.
npm install --save memwatch-next
Add to app.js:
const memwatch = require('memwatch-next');
// ...
memwatch.on('leak', (info) => {
  // Some logging code...
  console.error('Memory leak detected:\n', info);
});
This will allow you to automatically detect when there is a memory leak.
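If you want a number to assert on rather than an event, memwatch-next also carries over node-memwatch's HeapDiff, which compares two heap snapshots; here is a rough sketch, where operationUnderTest() is a placeholder and the loop count is arbitrary:
const memwatch = require('memwatch-next');

// Placeholder for whatever you suspect of leaking.
function operationUnderTest() { /* ... */ }

const hd = new memwatch.HeapDiff();          // snapshot "before"
for (let i = 0; i < 1000; i++) operationUnderTest();
const diff = hd.end();                       // snapshot "after" and compare

// diff.change describes the net growth of the V8 heap between the two snapshots.
console.log('heap growth:', diff.change.size, `(${diff.change.size_bytes} bytes)`);
Note that HeapDiff only sees the V8 heap, so memory leaked by a native addon still has to be tracked via process.memoryUsage().rss as described in the question above.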
Now to put it to a test:
A good tool for this is Apache JMeter.
If your application speaks HTTP, you can use JMeter to soak test its endpoints.
Soak testing is done to verify a system's stability and performance characteristics over an extended period of time; it's good when you are looking for memory leaks, connection leaks, etc.
Continuous integration software:
Prior to deployment to production, if you are using continuous integration software like Jenkins, you can create a Jenkins job to do this for you: it will test the application with the parameters provided and, after the test, either deploy the application or report that there is a memory leak (depending on your Jenkins job configuration).
Hope it helps; update me on how it goes.
Good luck,
Given some arbitrary program, is it always possible to determine whether it will ever terminate? This is the halting problem, and in general it is undecidable. Consider the following program:
function collatz(n) {
  if (n == 1)
    return;
  if (n % 2 == 0)
    return collatz(n / 2);
  else
    return collatz(3 * n + 1);
}
Whether collatz terminates for every positive integer is a famous open problem, so even this tiny function resists analysis. The same idea can be applied to data in memory: it's not always possible to identify which memory isn't needed anymore and can thus be garbage collected. There is also the case of a program that is designed to consume a lot of memory in some situations. The only known option is coming up with some heuristic like you have done, but it will most likely result in occasional false positives and negatives. It may be easier to determine the root cause of the leak so it can be corrected.
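To make such a heuristic somewhat more robust against the measurement noise mentioned in the question, one option is to repeat the measurement over several batches and only flag a leak when memory grows in most of them. A rough sketch, assuming Node is run with --expose-gc and with suspectOperation() as a placeholder; the batch sizes and the 80% cut-off are arbitrary:
// Noise-tolerant leak heuristic (sketch) — run with: node --expose-gc heuristic.js
function suspectOperation() { /* placeholder for the code being checked */ }

function rssAfterGc() {
  global.gc();
  return process.memoryUsage().rss;
}

function looksLikeLeak(operation, batches = 10, opsPerBatch = 1000) {
  let growingBatches = 0;
  let previous = rssAfterGc();
  for (let b = 0; b < batches; b++) {
    for (let i = 0; i < opsPerBatch; i++) operation();
    const current = rssAfterGc();
    if (current > previous) growingBatches++;
    previous = current;
  }
  // If memory rose in almost every batch, the growth is probably real rather than noise.
  return growingBatches >= batches * 0.8;
}

console.log('leak suspected:', looksLikeLeak(suspectOperation));
Even so, as noted above, a heuristic like this can still be fooled by caches that grow legitimately or by allocators that hold on to freed pages.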

QSQLite Error: Database is locked

I am new to Qt development, the way it handles threads (signals and slots), and databases (SQLite in particular); I have been working with these technologies for four weeks. This is the first time I'm posting a question on SO, and I feel I have done my research before coming to you all. This may look a little long and possibly like a duplicate, but I request that you read it thoroughly before dismissing it as a duplicate or tl;dr.
Context:
I am working on a Windows application that performs a certain operation X on a database. The application is developed in Qt and uses QSQLITE as the database driver. It's a single-threaded application, i.e., the tables are processed sequentially. However, as the DB size grows (in number of tables and records), this processing becomes slower. The result of operation X is written to a separate results table in the same DB. The processing being done is immaterial to the problem, but in basic terms here's what it does:
Read a row from Table_X_1
Read a row from Table_X_2
Do some operations on the rows (only read)
Push the results in Table_X_Results table (this is the only write being performed on the DB)
Table_X_1 and Table_X_2 are identical in number and types of columns and number of rows, only the data may differ.
What I'm trying to do:
In order to improve performance, I am trying to make the application multi-threaded. Initially I am spawning two threads (using QtConcurrentRun). The tables can be categorized into two types, say A and B, and each thread takes care of the tables of one type. Processing within the threads remains the same, i.e., within each thread the tables are processed sequentially.
The function uses SELECT to fetch rows for processing and INSERT to write the results into the results table. For inserting the results I am using transactions.
I am creating all the intermediate tables, result tables and indices before starting the actual operation. I am opening and closing connections every time. For the threads, I create and open a connection before entering the loop (one for each thread).
THE PROBLEM:
Inside my processing function, I get the following (nasty, infamous, stubborn) error:
QSqlError(5, "Unable to fetch row", "database is locked")
I am getting this error when I try to read a row from the DB (using SELECT). This is in the same function in which I'm performing my INSERTs into the results table. The SELECT and the INSERT are in the same transaction (begin and commit pair). For the INSERT I'm using a prepared statement (SQLiteStatement).
Reasons for seemingly peculiar things that I am doing:
I am using QtConcurrentRun to create the threads because it is straightforward to do! I have tried using QThread (not subclassing QThread, but the other method). That also leads to the same problem.
I am compiling with -DSQLITE_THREADSAFE=0 to keep the application from crashing. If I use the default (-DSQLITE_THREADSAFE=1), my application crashes at SQLiteStatement::recordSet->Reset(). Also, with the default option, SQLite's internal synchronization mechanism comes into play, which may not be reliable. If need be, I'll employ explicit synchronization.
Making the application multi-threaded to improve performance, and not doing this. I'm taking care of all the optimizations recommended there.
Using QSqlDatabase::setConnectOptions with QSQLITE_BUSY_TIMEOUT=0. A link suggested that it would prevent the DB from getting locked immediately and hence give my thread(s) an appropriate amount of time to "die peacefully". This failed: the DB got locked much more frequently than before.
Observations:
The database gets locked only when, and as soon as, one of the threads returns. This behavior is consistent.
When compiling with -DSQLITE_THREADSAFE=1, the application crashes when one of the threads returns. The call stack points at SQLiteStatement::recordSet->Reset() in my function, and at winMutexEnter() (called from EnterCriticalSection()) in sqlite3.c. This is consistent as well.
The threads created using QtConcurrentRun do not die immediately.
If I use QThread, I can't get the threads to return. That is to say, I feel the thread never returns even though I have connected the signals and slots correctly. What is the correct way to wait for threads, and how long does it take them to die?
The thread that finishes execution never returns; it has locked the DB, and hence the error.
I checked for SQLITE_BUSY and tried to make the thread sleep but could not get it to work. What is the correct way to sleep in Qt (for threads created with QtConcurrentRun or QThreads)?
When I close my connections, I get this warning:
QSqlDatabasePrivate::removeDatabase: connection 'DB_CONN_CREATE_RESULTS' is still in use, all queries will cease to work.
Is this of any significance? Some links suggested that this warning arises because of using a local QSqlDatabase object, and will not arise if the connection is made a class member. However, could it be the reason for my problem?
Further experiments:
I am thinking of creating another database which will only contain the results table (Table_X_Results). The rationale is that while the threads read from one DB (the one I currently have), they will get to write to another DB. However, I may still face the same problem. Moreover, I read on the forums and wikis that it IS possible to have two threads doing reads and writes on the same DB. So why can't I get this scenario to work?
I am currently using SQLite version 3.6.17. Could that be the problem? Will things be better if I use version 3.8.5?
I was trying to post the web resources that I have already explored, but I get a message saying "I'd need 10 reps to post more than 2 links". Any help/suggestions would be much appreciated.

Slow performance in Mongoose after upgrading from 3.5.7 to 3.8.4

I changed the version number of mongoose from 3.5.7 to 3.8.4 and performance took a huge hit in an import process. This process reads lines from a file and populates an empty database (no indexes) with about 2.5 million rows.
This is the only change I've made; just upgrading the version. I can switch back and forth and see the difference in performance.
The performance hits are:
1) The process pegs at 100% CPU, where before it ran at maybe 25% or so.
2) Writing to the database is slow: 3.5.7 inserted about 10K records every 20 seconds, while 3.8.4 seems to insert more like 1 per second.
3) Node.js seems to "disappear" into something CPU-intensive and all other I/O gets blocked (HTTP requests, etc.); previously the system remained very responsive.
It's hard to isolate the code, but roughly here's what's happening:
Read a line from a file
Parse it
Run a query to see if the record already exists
Insert/update a record with the values from the line read
Write the existing record to an already-open file stream
Continue with the next line in the file
At a guess, I would say it's related to a change in how requests are throttled, either in the underlying driver that Mongoose depends on or in Mongoose itself. My first thought was to try to handle the case where requests are getting queued up by pausing the file read. This works really well when writing the results of a query (pausing the query stream when the file stream starts buffering writes, then resuming on drain). But I haven't been able to find where Mongoose might emit information about its back-pressure.
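For what it's worth, one way to approximate back-pressure without help from Mongoose is to count in-flight writes yourself and pause the line reader when the count gets too high. A rough sketch with Node's readline module; saveRecord() and parse() are placeholders for the Mongoose upsert and line parsing described above, and the file name and water marks are made up:
const fs = require('fs');
const readline = require('readline');

// Placeholders for the real parse + Mongoose create/update described above.
function parse(line) { return { raw: line }; }
function saveRecord(doc, callback) { setImmediate(callback); }

const rl = readline.createInterface({ input: fs.createReadStream('import.txt') });

const HIGH_WATER = 500; // pause reading above this many pending writes
const LOW_WATER = 50;   // resume reading once pending writes drain below this
let pending = 0;

rl.on('line', (line) => {
  pending++;
  if (pending >= HIGH_WATER) rl.pause();

  saveRecord(parse(line), (err) => {
    pending--;
    if (err) console.error(err);
    if (pending <= LOW_WATER) rl.resume();
  });
});

rl.on('close', () => console.log('file read finished; pending writes may still be draining'));
In practice you would replace saveRecord() with the actual model call and tune the water marks against observed memory and CPU usage.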
The reason I upgraded in the first place was a memory leak error I was getting when setting an event handler in Mongoose, which I had read was fixed (sorry, I lost the reference to that).
I'd like to stay upgraded and figure out the issue. Any thoughts on what it might be? Is there an event given somewhere that notifies me of back-pressure so I can pause/resume the input stream?
I solved this by simply reverting back to 3.5.7 and solving the original memory leak warnings another way.
I attempted to isolate my code, but the worst issue I was able to reproduce was high memory consumption (which I resolved by nulling objects, which apparently helps the GC figure out what it can collect). I started adding in unrelated code, but at some point it became clear that the issue wasn't with Mongoose or the MongoDB driver itself.
My only guess on what really causes the performance issue when I upgrade Mongoose is that some library that Mongoose depends on introduced a change that my non-Mongoose-related code isn't playing well with.
If I ever get to the bottom of it, I'll come back and post a clearer answer.

Too much data (at a time) for Core Data?

My iPhone app uses Core Data and things are fine for the most part. But here is a problem:
after a certain amount of data, it stalls on first-time execution (when the Core Data entities must be loaded).
Some experimenting showed that things are OK up to a certain amount of data loaded in Core Data at start.
If I go over a critical amount, the installation starts failing. The more data loaded at start, the higher the probability that it fails.
By making separate tests I made sure the data themselves are not faulty.
I also can say this problem does not appear in the simulator.
It also does not happen when I connect the debugger to the device.
It looks like loading too much data into Core Data in a short amount of time creates some kind of overload.
Is that true? Any idea on a possible solution?
At this point I have a partial workaround: I use a UIActionSheet object to kill some time (asking the user to push a button). This is not very satisfactory, though for the time being it works.
Any comment or advice for a better way would be appreciated.
It is not quite clear what you mean by "it fails".
However, if you are using SQLite and by "loading into Core Data" you mean creating and saving entities at startup to populate the store, then remember not to call [managedObjectContext save:...] only once at the end, especially with a large amount of data; instead, create and save the NSManagedObject instances in reasonably sized batches.
Otherwise, if you mean you have a large amount of data that is retrieved as NSManagedObject instances, probably loaded into a UITableView, consider using some kind of NSOperation for asynchronous loading.
If those two cases don't apply to you, just tell us the error you are getting, or what you mean by "fails" or "stalls".
