I have a user requirement that I have been battling with for a while with no success. I need to write an add-in that can read around 100 formula-driven cells (of a specific spreadsheet) once every couple of minutes and send them to a web service.
I'm more than happy to use Excel-DNA or VSTO, but everything I've tried so far causes the user interface to hang for an instant. Would this always be the case if the data is being read from the active spreadsheet (even from a different thread)?
Reading the sheet from a different thread is likely to have a worse effect than reading from the Excel main thread (say from an event handler). This is due to the COM threading switch required for the cross-thread calls; in the end, all the COM calls have to do their work on the main thread anyway.
You might have more success by hooking some of the Excel events, starting with the Workbook.SheetChange event, then checking whether the changes affect your watched Range(s) and updating an internal data structure with the new data.
You can then update the back-end periodically (or only when watched cells have changed) from a background thread.
You need to run a secondary thread to post the data to the web service to prevent any UI freeze.
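A minimal sketch of that pattern in a VSTO add-in. This is not a definitive implementation: WatchedAddress, the two-minute period and the PostSnapshot method are placeholders I've made up, not part of the question.

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using Excel = Microsoft.Office.Interop.Excel;

// Hook SheetChange on Excel's main thread, cache plain values,
// and post the cached snapshot from a background timer thread.
private const string WatchedAddress = "A1:D25";   // placeholder for the ~100 watched cells
private readonly ConcurrentDictionary<string, object> _latest = new ConcurrentDictionary<string, object>();
private System.Threading.Timer _timer;            // field reference keeps the timer alive

private void ThisAddIn_Startup(object sender, EventArgs e)
{
    Application.SheetChange += Application_SheetChange;   // fires on Excel's main thread
    _timer = new System.Threading.Timer(_ => PostSnapshot(_latest.ToArray()),
                                        null, TimeSpan.FromMinutes(2), TimeSpan.FromMinutes(2));
}

private void Application_SheetChange(object sh, Excel.Range target)
{
    // Still on the main thread, so touching the COM objects here is cheap.
    Excel.Range watched = ((Excel.Worksheet)sh).Range[WatchedAddress];
    Excel.Range hit = Application.Intersect(target, watched);
    if (hit == null) return;
    foreach (Excel.Range cell in hit.Cells)
        _latest[cell.Address] = cell.Value2;      // copy plain values out of COM
}

private void PostSnapshot(KeyValuePair<string, object>[] snapshot)
{
    // Runs on a ThreadPool thread: serialize `snapshot` and POST it to the web service,
    // so the network call never touches Excel's COM objects or blocks its UI.
}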
Over 2 years ago, Remy Lebeau gave me invaluable tips on threads in Delphi. His answers were very useful to me and I feel like I made great progress thanks to him. This post can be found here.
Today, I face a "conceptual problem" about threads. This is not really about code; it is about the approach one should choose for a certain problem. I know we are not supposed to ask for personal opinions; I am merely asking whether, from a technical point of view, one of these approaches must be avoided, or whether they are both viable.
My application has a list of unique product numbers (called SKUs) in a database. Querying an API with these SKUs, I get back a JSON file containing details about the products. This JSON file is processed, and the results are displayed on screen and saved in the database. So, at one step, a download process is involved, and it is executed in a worker thread.
I see two possible approaches for this whole procedure:
When the user clicks the start button, a query is fired, building a list of SKUs based on the user's criteria. A TStringList is then built and, for each element of the list, a thread is launched that downloads the JSON, sends the result back to the main thread and terminates.
When the user clicks the start button, a query is fired, building a list of SKUs based on the user's criteria. Instead of sending SKU numbers one after another to a worker thread, the whole list is sent, and the worker thread iterates through the list, sending results back to the main thread (via Synchronize) for display and saving. So we only have one worker thread, which works through the whole list before terminating.
I have coded these two approaches and they both work... each with its own downsides that I have experienced.
I am not a professional developer; this is a hobby, and before working my way further down one path or the other for "polishing", I would like to know whether, from a technical point of view and according to your knowledge and experience, one of the approaches I depicted should be avoided, and why.
Thanks for your time
Mathias
Another thing to consider in this case is latency to your API that is producing the JSON. For example, if it takes 30 msec to go back and forth to the server, and 0.01 msec to create the JSON on the server, then querying a single JSON record per request, even if each request is in a different thread, does not make much sense. In that case, it would make sense to do fewer requests to the server, returning more data on each request, and partition the results up among different threads.
The other thing is that threads are not a solution to every problem. I would question why you need to dedicate a single thread to each SKU. How long is each individual thread running, and how much processing is each thread doing? In general, creating lots of threads, each of which works for a fraction of a msec, does not make sense. You want the threads to be alive for as long as possible, processing as much data as they can for the job. You don't want the computer to spend as much time creating/destroying threads as doing useful work.
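To make the batching concrete, here is a rough C# sketch (the idea translates directly to a Delphi worker thread; the endpoint URL, the comma-separated query format and the batch size of 50 are all assumptions for illustration):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

class BatchFetcher
{
    // Fetch product details in batches of 50 SKUs per request instead of one
    // request per SKU, so the ~30 msec round trip is paid once per 50 records.
    public static async Task FetchAllAsync(IReadOnlyList<string> skus)
    {
        using (var http = new HttpClient())
        {
            var batches = skus.Select((sku, i) => new { sku, i })
                              .GroupBy(x => x.i / 50, x => x.sku);

            foreach (var batch in batches)
            {
                // Hypothetical endpoint that accepts a comma-separated SKU list.
                string url = "https://api.example.com/products?skus=" + string.Join(",", batch);
                string json = await http.GetStringAsync(url);
                // Parse `json` here and hand the records off to workers for processing.
            }
        }
    }
}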
I am handling RTD feeds again and remembering the difficulties, but now we have multi-core machines and multi-threading. Maybe someone can advise.
As I understand/remember: pushing data into Excel is not an option (obvious reasons), so the RTD server sends a friendly nod to say "your parcel is ready, come and get it". Then, when Excel has done its nails and feels in the mood, it might get the data.
So this kind of architecture is right back on the dance-floor and hopefully I can make it work.
Problem.
I need to run a bot that examines the data and responds to it:
1. Looping, or even running a WinAPI timer, is still enough to keep Excel busy and stop the data arriving, or so it seems. Executing bot logic, however small, will definitely bring on an Excel fainting fit.
2. Tried responding via the calculation event. Very hit-and-miss and definitely not up to the job: there is no obvious logic as to when and why it fires or does not, other than a "bad hair day".
3. Tried a WinAPI timer looking at the newly acquired data every second, comparing it to the old data in a separate data structure, running some EMAs and making a decision. No dice. The timer alone is enough to put a delay of up to 10 or even 20 seconds between occasional deliveries of data.
Options I am thinking about:
1. A timer running outside of the Excel environment and looking in at the data, e.g. an add-in via the PIAs etc. What I don't know is whether this add-in, perhaps in C# or VB.NET, could utilize multithreading (via Tasks, I think) and do its bit without "scaring the panties off her ladyship"?
2. I remember hearing that XLL UDFs could be asynchronous; does anyone know if this is a potential option? (A rough sketch of what I have in mind follows below.)
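For reference, a minimal untested sketch of option 2 using Excel-DNA's ExcelAsyncUtil; CheckFeed and the feedKey argument are placeholders for the actual bot logic:

using ExcelDna.Integration;

public static class BotFunctions
{
    // Asynchronous UDF: ExcelAsyncUtil.Run executes the delegate on a ThreadPool
    // thread and returns #N/A until the result arrives, so Excel's main thread
    // stays free. (Recent Excel-DNA versions wire up the async support automatically.)
    [ExcelFunction(Description = "Runs the bot check off Excel's main thread")]
    public static object RunBotCheck(string feedKey)
    {
        return ExcelAsyncUtil.Run("RunBotCheck", feedKey, () => CheckFeed(feedKey));
    }

    static object CheckFeed(string feedKey)
    {
        // Placeholder: compare new data to old, run the EMAs, make the decision.
        return "checked " + feedKey;
    }
}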
Any ideas?
Right now my FMX project is totally based on LiveBindings to connect the data sources to the editors on my form.
It works nicely, apart from being slow and not using paged loading (TListView).
However, I have many different data sources, the amount of data can be huge, and the connections can occasionally be slow.
My idea is to keep the user interface responsive and let background threads load the data, opening the data sources and putting them into the right state, and after that assigning the data sources to the controls on the form.
I have played with this using LiveBindings, but I cannot mix the main thread with background ones; some problems happened.
Having to load each record's fields into each control manually seems extremely unproductive. I have almost all the controls that I use already wrapped (I made my own controls based on the FMX ones), so I have the possibility of adding more functions.
I was wondering if there is something already done: any class or library that I could use to map the sources and targets, and that gives me control over when to activate the bindings, since I can have many data sources being loaded by threads at once.
This is not really a LiveBindings question.
Even without LiveBindings, when you retrieve data in a thread you have to respect the thread context: a dataset obtained from a connection is bound to that connection, and the connection is bound to the thread context.
The solution is to copy the dataset into a TClientDataSet and hand that CDS over to the UI thread. Now you can bind that CDS wherever you like. Remember that there is no link between the CDS and the data connection; you have to take care of writing back the changes yourself.
Don't know if this is still relevant. I use LiveBindings frequently, loading the underlying data in threads, utilizing TTask.Run and TThread.Queue. The important point is to have the LiveBinding's AutoActivate = False (i.e. on the TLinkGridToDataBindSource or other LiveBinding).
The request is done in TTask.Run by executing the query, and the LiveBinding's "Active" property is set to True in a TThread.Queue call [inside the TTask.Run]. The LiveBinding updates the UI, and that must occur in the main thread.
Subsequent updates/requests are done the same way, setting Active to False first.
I am new to Qt development, the way it handles threads (signals and slots) and databases (SQLite in particular). It has been 4 weeks since I started working with the mentioned technologies. This is the first time I'm posting a question on SO, and I feel I have done my research before coming to you all. This may look a little long, and possibly a duplicate, but I request you all to read it thoroughly before dismissing it as a duplicate or tl;dr.
Context:
I am working on a Windows application that performs a certain operation X on a database. The application is developed in Qt and uses SQLite as the database engine (through the QSQLITE driver). It's a single-threaded application, i.e., the tables are processed sequentially. However, as the DB size grows (in number of tables and records), this processing becomes slower. The result of this operation X is written to a separate results table in the same DB. The processing being done is immaterial to the problem, but in basic terms here's what it does:
Read a row from Table_X_1
Read a row from Table_X_2
Do some operations on the rows (only read)
Push the results into the Table_X_Results table (this is the only write being performed on the DB)
Table_X_1 and Table_X_2 are identical in number and types of columns and number of rows, only the data may differ.
What I'm trying to do:
In order to improve the performance, I am trying to make the application multi-threaded. Initially I am spawning two threads (using QtConcurrentRun). The tables can be categorized into two types, say A and B, and each thread takes care of the tables of one type. Processing within the threads remains the same, i.e., within each thread the tables are processed sequentially.
The function uses SELECT to fetch rows for processing and INSERT to insert the results into the results table. For the INSERTs I am using transactions.
I am creating all the intermediate tables, result tables and indices before starting my actual operation. I am opening and closing connections every time. For the threads, I create and open a connection before entering the loop (one for each thread).
THE PROBLEM:
Inside my processing function, I get the following (nasty, infamous, stubborn) error:
QSqlError(5, "Unable to fetch row", "database is locked")
I am getting this error when I'm trying to read a row from the DB (using SELECT). This is in the same function in which I'm performing my INSERTs into the results table. The SELECT and the INSERT are in the same transaction (begin and commit pair). For the INSERTs I'm using a prepared statement (SQLiteStatement).
Reasons for the seemingly peculiar things I am doing:
I am using QtConcurrentRun to create the threads because it is straightforward to do! I have tried using QThread (not subclassing QThread, but the other method). That also leads to the same problem.
I am compiling with -DSQLITE_THREADSAFE=0 to keep the application from crashing. If I use the default (-DSQLITE_THREADSAFE=1), my application crashes at SQLiteStatement::recordSet->Reset(). Also, with the default option, SQLite's internal sync mechanism comes into play, which may not be reliable. If the need be, I'll employ explicit sync.
Making the application multi-threaded to improve performance, and not doing this. I'm taking care of all the optimizations recommended there.
Using QSqlDatabase::setConnectOptions with QSQLITE_BUSY_TIMEOUT=0. A link suggested that it would prevent the DB from getting locked immediately, and hence might give my thread(s) an appropriate amount of time to "die peacefully". This failed: the DB got locked much more frequently than before.
Observations:
The database gets locked only, and precisely, when one of the threads returns. This behavior is consistent.
When compiling with -DSQLITE_THREADSAFE=1, the application crashes when one of the threads returns. The call stack points at SQLiteStatement::recordSet->Reset() in my function, and at winMutexEnter() (called from EnterCriticalSection()) in sqlite3.c. This is consistent as well.
The threads created using QtConcurrentRun do not die immediately.
If I use QThreads, I can't get them to return. That is to say, I feel the thread never returns even though I have connected the signals and the slots correctly. What is the correct way to wait for threads, and how long does it take them to die?
The thread that finishes execution never returns; it has locked the DB, and hence the error.
I checked for SQLITE_BUSY and tried to make the thread sleep, but could not get it to work. What is the correct way to sleep in Qt (for threads created with QtConcurrentRun or QThread)?
When I close my connections, I get this warning:
QSqlDatabasePrivate::removeDatabase: connection 'DB_CONN_CREATE_RESULTS' is still in use, all queries will cease to work.
Is this of any significance? Some links suggested that this warning arises from using a local QSqlDatabase, and will not arise if the connection is made a class member. However, could it be the reason for my problem?
Further experiments:
I am thinking of creating another database which will contain only the results table (Table_X_Results). The rationale is that while the threads read from one DB (the one I currently have), they will write to another DB. However, I may still face the same problem. Moreover, I read on the forums and wikis that it IS possible to have two threads doing reads and writes on the same DB. So why can I not get this scenario to work?
I am currently using SQLite version 3.6.17. Could that be the problem? Will things be better if I use version 3.8.5?
I was trying to post the web resources that I have already explored, but I get a message saying "I'd need 10 reps to post more than 2 links". Any help/suggestions would be much appreciated.
From MSDN it seems that IMessageFilter doesn't handle all failures; for example, at some points Office applications 'suspend' their object model, at which point it cannot be invoked and throws 0x800AC472 (VBA_E_IGNORE).
In order to get around this, you have to put your call in a loop and wait for it to succeed:
// Requires: using System.Runtime.InteropServices; using System.Threading;
// Retry while Excel has its object model suspended (VBA_E_IGNORE = 0x800AC472);
// back off briefly instead of spinning, and let any other COM error propagate.
const uint VBA_E_IGNORE = 0x800AC472;
while (true)
{
    try
    {
        office_app.DoSomething();
        break;
    }
    catch (COMException ce) when ((uint)ce.ErrorCode == VBA_E_IGNORE)
    {
        LOG(ce.Message);
        Thread.Sleep(100);
    }
}
My question is: if this boilerplate code is necessary for each call into the Office app's object model, is there any point in implementing IMessageFilter?
Yes, the two mechanisms are independent. IMessageFilter is a general COM mechanism to deal with COM servers that have gone catatonic and won't handle a call for 60 seconds. The VBA_E_IGNORE error is highly specific to Excel and happens when it gets into a state where the property browser is temporarily disabled. Think of it as a modal state, intentionally turned on when Excel needs to perform a critical operation that must complete before it can handle out-of-process calls again. A lock, if you will. It is not associated with time, like IMessageFilter, but purely with execution state. Geoff Darst of the VSTO team gives some background info in this MSDN forums thread.
Having to write these kinds of retry loops is horrible, of course. Given that it is an execution-state issue, and assuming that you are doing interop with Excel without the user also operating the program, there ought to be a way to sail around the problem. If you are pummeling Excel from a worker thread in your program, then do consider reviewing the interaction between the threads in your code to make sure you don't create a case where they issue Excel calls at roughly the same time.
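For completeness, the message filter itself is standard COM boilerplate. A minimal sketch of registering one on the STA thread that makes the Excel calls; the 100 ms retry delay is illustrative, not prescribed:

using System;
using System.Runtime.InteropServices;

[ComImport, Guid("00000016-0000-0000-C000-000000000046"),
 InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
interface IOleMessageFilter
{
    [PreserveSig] int HandleInComingCall(int dwCallType, IntPtr hTaskCaller, int dwTickCount, IntPtr lpInterfaceInfo);
    [PreserveSig] int RetryRejectedCall(IntPtr hTaskCallee, int dwTickCount, int dwRejectType);
    [PreserveSig] int MessagePending(IntPtr hTaskCallee, int dwTickCount, int dwPendingType);
}

class RetryMessageFilter : IOleMessageFilter
{
    [DllImport("ole32.dll")]
    static extern int CoRegisterMessageFilter(IOleMessageFilter newFilter, out IOleMessageFilter oldFilter);

    // Call once on the STA thread that drives Excel.
    public static void Register()
    {
        IOleMessageFilter old;
        CoRegisterMessageFilter(new RetryMessageFilter(), out old);
    }

    public int HandleInComingCall(int dwCallType, IntPtr hTaskCaller, int dwTickCount, IntPtr lpInterfaceInfo)
        => 0;                                // SERVERCALL_ISHANDLED

    public int RetryRejectedCall(IntPtr hTaskCallee, int dwTickCount, int dwRejectType)
        => dwRejectType == 2 ? 100 : -1;     // SERVERCALL_RETRYLATER: retry in ~100 ms, else cancel

    public int MessagePending(IntPtr hTaskCallee, int dwTickCount, int dwPendingType)
        => 2;                                // PENDINGMSG_WAITDEFPROCESS
}

Note that the filter only helps with "server is busy" rejections; the VBA_E_IGNORE retry loop above is still needed for the suspended-object-model state.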