Presenting background-loaded data through a ContentProvider Cursor - multithreading

I'm currently refactoring an Android project that in a few places loads data on background threads in order to update list views. The API being called to collect the data has a callback mechanism, so when a lot of data is returned (which takes a long time) I can handle the results asynchronously.
In the old code, this data was packaged up as an appropriate object and passed to a Handler on the UI thread, to be inserted into the list view's adapter. This worked well, but I've decided that presenting the data through a ContentProvider would make the project easier to maintain and extend.
This means I need to provide the data as a Cursor object when requested via the query method.
So far I've been unable to update the data in the Cursor after returning it. Does this mean that all of the data needs to be collected before returning the Cursor? The Android LoaderThrottleSupport sample suggests that I don't need to, but I have yet to get it working for anything other than an SQL backend.
Has anyone else tried to present non-SQL backed asynchronous data in this sort of way?

Related

Non-blocking insert into database with Node.js

Part of my Node.js app reads a file and, after some lightweight row-by-row processing, inserts those records into the database.
The original code did just that. The problem is that the file may contain a huge number of records, which are inserted row by row. According to some tests I did, a file of 10,000 rows blocks the app completely for some 10 seconds.
My considerations were:
Bulk create the whole object at once. This means reading the file, preparing the object by doing some calculation for each row, pushing it to the final object, and in the end using Sequelize's bulkCreate. There were two downsides:
A huge insert can be as blocking as thousands of single-row inserts.
This may make it hard to generate reports for rows that were not inserted.
Bulk create in smaller, reasonable chunks. This means reading the file, iterating over every n rows (e.g. 2,000) while doing the calculations and adding them to an object, then using Sequelize's bulkCreate for that object. Object preparation and the bulkCreate calls would run asynchronously (see the sketch at the end of this question). The downsides:
Choosing the chunk size seems arbitrary.
It also feels like a workaround on my side, while there might be existing, proven solutions for this particular situation.
Moving this part of the code to another process, ideally limiting CPU usage of that process to a reasonable level (I don't know if that can be done or whether it is smart).
Simply creating a new process for this (and other blocking parts of the code).
This is not the 'help me write some code' type of question. I have already looked around and it seems there is enough documentation. But I would like to invest in an efficient solution, using the proper tools. Other ideas are welcome.
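For illustration, here is a minimal sketch of the batched approach (option 2). It is only a sketch under assumptions not stated above: a Sequelize model named Record, a newline-delimited input file, a Node.js version that supports for await...of over readline, and a hypothetical transformRow() helper standing in for the per-row calculation. Awaiting each bulkCreate hands control back to the event loop between batches, which keeps the rest of the app responsive.
const fs = require('fs');
const readline = require('readline');

// Read the file line by line and insert in chunks of `batchSize` rows.
async function importFile(path, batchSize = 2000) {
  const rl = readline.createInterface({ input: fs.createReadStream(path) });
  let batch = [];
  const failed = [];                  // rows that could not be inserted, kept for reporting

  for await (const line of rl) {
    batch.push(transformRow(line));   // lightweight per-row processing (hypothetical helper)
    if (batch.length >= batchSize) {
      await flush(batch, failed);     // the await yields to the event loop between batches
      batch = [];
    }
  }
  if (batch.length > 0) await flush(batch, failed);
  return failed;
}

async function flush(rows, failed) {
  try {
    await Record.bulkCreate(rows);    // Record is an assumed Sequelize model
  } catch (err) {
    failed.push(...rows);             // keep the whole batch for a failure report
  }
}
The chunk size is still arbitrary, but measuring event-loop lag for a few candidate sizes usually settles it quickly.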

Robot's Tracker Threads and Display

Application: the proposed application has a TCP server able to handle several connections from the robots.
I chose to work with a database rather than files, so I'm using an SQLite database to save information about the robots and their full history, robot models, tasks, etc.
The robots send us various data such as odometry, task information, and so on.
I create a thread for every new robot connection to handle the messages and update the robots' information in the database. Now let's talk about my problems:
The application has to show information about the robots in real time, and I was thinking about using a QSqlQueryModel, setting the right query, and showing it in a QTableView, but then I ran into some problems/solutions to think about:
Problem 1: Some information to show in the QTableView is not in the database. I have the current consumption in the database and the actual charge (as capacity) in the database, but I also want to show the remaining battery time in my table. How can I add that column, with the right behaviour (the math implemented), to my QTableView?
Problem 2: I will be receiving messages every second for each robot, so updating the database and then the GUI (re-running the query) may not be the best solution when a large number of robots is connected. Is it better to update the table directly and only update the database every minute or so? If I use this method I can't use QSqlQueryModel to update the tables, so what approach do you recommend?
Thanks
SancheZ
I have run into a similar problem before; my conclusion was that QSqlQueryModel is not the best option for display purposes. You may want some processing on query results, or you may want to create, remove, or change display data based on the results for a fancier GUI. I think it is best to implement your own delegates and override the view-related methods - setData(), setEditorData().
This way you have control over all your columns and a direct mapping between the raw data and its display equivalent (i.e. the Qt::EditRole and Qt::UserRole data).
Yes, it is better to update your view in real time and run a batch execute at a lower frequency to update the bulk data. In general the app is the middle layer and the database is the bottom layer for data monitoring, unless you use the database as an in-memory shared cache.
EDIT: One important point: you cannot run updates from multiple threads (you can, but SQLite blocks each thread until it gets the lock), so it is best to run updates from a single thread.

Mongo cursor with node driver

I am writing an aggregate function for mongo using the native driver, v2.1.
My code looks something like this:
db.collection("whatever").aggregate(...).each(function(err, doc) {
// cursor processing
})
My question is: where is the cursor processing executed? On the client or on the server?
I am assuming that it's executed on the client side (Node), and if so, is there any way to run the cursor processing (or some other sort of data processing) on the server?
I am working with many GB of data, and I don't want to be transferring it back and forth with the Mongo server.
Thx!
A little bit about the internals of the 'mongodb' driver's Cursor constructor.
When each (a prototype method of the Cursor constructor) is invoked with a callback function, it fires the given query against the database, receives the result set over the wire, and pushes it into an array in memory on the client side (the Node application end).
It then invokes the callback given to each for every element of that array, Node-style: callback(err, doc).
So the point to note is that once the data is received from the database, building the array and iterating through it happen at the application's end. Loading and iterating over such an array can be memory intensive, and it is the caller's responsibility to make sure the entire result set fits in memory. On top of that, the amount of data transferred over the wire should also be considered.
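As a minimal illustration of that behaviour (assuming driver 2.x semantics, where the callback receives null once the cursor is exhausted; pipeline and handleDoc are placeholders, not part of the question):
var cursor = db.collection("whatever").aggregate(pipeline);   // pipeline: your aggregation stages

cursor.each(function (err, doc) {
  if (err) throw err;
  if (doc === null) return;   // the driver passes null once the cursor is exhausted
  handleDoc(doc);             // per-document work runs here, in the Node process, not on the server
});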
So here are my two cents when dealing with a substantial amount of data with the mongodb driver:
It is better to set the batch size of the cursor, for example cursor.batchSize(100). When the batch size is set, the cursor fetches data from the database in batches (of 100 docs in this example) instead of trying to get the complete result set in one go. Working in batches means relatively less memory is held at any one time and each response over the wire stays small, so performance is better.
Use projections in the query wherever possible. By projecting only the fields you need, in the right place and in the right way, you stop unnecessary data from being transferred to the client side. Less data to process means less memory and better performance.
Be careful about doing sort on cursors. sort operates on the complete set of documents matched by the query, and if that set is big it can slow down query execution. When you need to sort, check whether you can narrow the query with a filter before you sort. This is not exactly a client-side issue, but our queries should execute as fast as possible.
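Putting the first two points together, a hedged sketch with the 2.x driver (the collection name, query, and field names are placeholders):
var cursor = db.collection('whatever')
  .find({ status: 'active' }, { fields: { name: 1, total: 1 } })  // projection: only the needed fields
  .batchSize(100);                                                // fetch documents 100 at a time per round trip

// then iterate with cursor.each(...) as shown earlier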
Hope this information is useful.
Thank you.

How to initialize a database with default values in SQLAlchemy?

I want to put certain default values in the database when it is first created.
Is there a hook/function available for that, so that it executes only once, after the database is created?
One way could be to use the Inspector to check whether the table/database exists or not, set a flag before creating the table, and then use this flag to decide whether to insert the default values.
Is there a better way to do it?
I usually have a dedicated install function that is called for this purpose, since I can do anything I need in that function. However, if you just want to launch your application and call Base.metadata.create_all, then you can use the after_create event. You'd have to test whether it hands you the one MetaData object or the individual Table objects and handle that accordingly. In this context you even get a connection object that you can use to insert data. Depending on transaction management and database support, this could even mean that table creation is rolled back if the insert fails.
Depending on your needs, both ways are okay, but if you are certain you only need to insert data after creation then the event way is actually the best idea.

How to update fields automatically

In my CouchDB database I'd like all documents to have an 'updated_at' timestamp added when they're changed (and have this enforced).
I can't modify the document from validation functions.
Update functions won't run unless they're called explicitly (so it would be possible to update a document without calling the specific update function).
How should I go about implementing this?
There is no way to do this right now without going through _update handlers. Tracking document modification times is a nice idea, but it runs into problems with replication.
Replication works on top of the public API, and this means that:
If you enforce such a trigger, you'll break replication, since it becomes impossible to sync data as-is without modifying the documents. Because each document gets modified, it receives a new revision, which can easily lead to an endless loop if you replicate data from database A to B and from B to A in continuous mode.
If, on the other hand, replication is left intact, there will always be a way to work around your trigger.
I can suggest one workaround: you can create a view which emits the current date as a key (or as part of it):
function (doc) {
  emit(new Date(), null);
}
This will assign the current date to all documents as soon as view generation gets triggered (which happens on the first request to the view) and will assign a new date each time a specific document is updated.
Although the above should solve your issue, I would advise against using it for the reasons already explained by Kxepal: if you're on a replicated network, each node will assign its own dates. Taking this into account, the best I can recommend is to solve the issue on the client side and just post the documents with a date already embedded, as sketched below.
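For example, a minimal client-side sketch using the nano CouchDB client (nano, the database name, and the helper are my own assumptions, not part of the question):
var nano = require('nano')('http://localhost:5984');
var db = nano.db.use('mydb');

function saveWithTimestamp(doc, callback) {
  doc.updated_at = new Date().toISOString();  // stamp the document before it is stored
  db.insert(doc, callback);                   // works for creates and updates (keep doc._rev when updating)
}
The trade-off is that enforcement now depends on every client doing this, which is exactly the limitation described above.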
