Enhance my Core Data design. Experts only!

In AcaniUsers, I'm downloading the closest 20 users to me and displaying their profile pictures as thumbnails in a table view. User and Photo are both Resources because they each have an id (MongoDB BSON ObjectId) on the server. Each user has a unique_id. Each Photo has four different sizes (images) on the server: square: 75x75, square@2x: 150x150, large: 320x480, large@2x: 640x960. But each device will only have two of these sizes, depending on whether it's an iPhone 3 or 4 (Retina display). Each of these sizes has its own MongoDB collection, and all four images for each Photo have the same BSON ObjectId across these four collections.
In the future, I may give User a relationship called photos to allow a user to have more than one photo. Also, although I don't foresee this, I may add more Image sizes (types).
The fresh attribute on Image tells me whether I've downloaded the latest Image. I set this to NO whenever the Photo's ID has changed, and then back to YES after I've finished downloading the Image.
1. Should I store the four different images in Core Data or on the file system and just store their URLs in Core Data? I read somewhere that anything over 1 or 2 MB should be stored in the file system, not Core Data. So I was thinking of storing the square images in Core Data and the large images in the file system, but I'd rather store them all the same way to make things easier. So maybe I'll just store them all in the file system? What do you think?
2. Do you think I should discard the 75x75 and 320x480 sizes, since pretty soon iPhone 3s will be gone?
3. How can I improve my design of the entities and their attributes and relationships? For example, is the Resource entity even beneficial at all?
4. I'm displaying the Users with an NSFetchedResultsController. However, it doesn't know when the User's image gets updated, so the images don't show up until I scroll aggressively the first time. How do I let the NSFetchedResultsController know that a user's thumbnail has finished downloading? Do I have to use KVO?

To answer your questions:
1. I'd store them all in the file system and record the URL in the database. I've never been a big fan of storing image data in the DB. Plus, it'll simplify things a little to have all of the image storage uniform. That way, your image loading code doesn't have to worry about whether a given type is stored in the DB or on the file system.
2. No, I wouldn't do that yet. The iPhone 3 is going to be around for a bit longer. AT&T is still selling them as the cheap entry-level iPhone. I just saw a commercial the other night advertising them for $49.
3. Remove the Resource entity and add the id attribute to each of the classes. How you did it is actually bad. Abstract entities should only be used when you have a couple of entities that are almost identical and only have a few differences between them. Under the hood, Core Data will make only one table for an abstract entity and all of its children. So right now you're going to end up with only one table that contains both your user and photo entries, which can be bad when you're trying to query just one type of entity.
You should also delete the Image entity and move its attributes into the Photo entity. The Photo will always have those values associated with it, and the same values won't be shared between photos. Having them as a separate entity will cause a slowdown. You'll either need to load them with the photos, which will require a join (slow), or they'll be loaded one at a time when you access either the data or fresh attributes, which is also slow. When each of the faults is fired in the latter scenario, a separate query and round trip to the disk will happen for each object. So when you loop through your pictures for display in the table, you'll be firing n queries instead of one, which can be a big difference in performance.
4. You can use KVO to do it. Have your table cell observe the User or Picture (depending on whether you have the Picture already added to the user and are changing the data, or you're adding a new picture to the user on load completion). When the observer gets triggered, update the image being displayed.
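The "n queries instead of one" cost from point 3 is not specific to Core Data. Here is a minimal sketch of the access pattern, using Python's built-in sqlite3 module as a stand-in for the underlying store. The table and column names (photo, image, photo_flat, fresh) are hypothetical and are not how Core Data actually lays out its SQLite file; the point is only to count round trips.

```python
import sqlite3

# Hypothetical schema: `image` is a separate table (like the Image entity),
# `photo_flat` keeps the attribute on the photo row itself (the suggested design).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photo (id INTEGER PRIMARY KEY);
    CREATE TABLE image (photo_id INTEGER REFERENCES photo(id), fresh INTEGER);
    CREATE TABLE photo_flat (id INTEGER PRIMARY KEY, fresh INTEGER);
""")
ids = [(i,) for i in range(1, 21)]
conn.executemany("INSERT INTO photo (id) VALUES (?)", ids)
conn.executemany("INSERT INTO image (photo_id, fresh) VALUES (?, 1)", ids)
conn.executemany("INSERT INTO photo_flat (id, fresh) VALUES (?, 1)", ids)


def fresh_flags_one_by_one(conn):
    """One query for the photos, then one more per photo (n + 1 round trips)."""
    queries = 1
    photo_ids = [row[0] for row in conn.execute("SELECT id FROM photo ORDER BY id")]
    flags = {}
    for pid in photo_ids:
        row = conn.execute(
            "SELECT fresh FROM image WHERE photo_id = ?", (pid,)
        ).fetchone()
        queries += 1
        flags[pid] = bool(row[0])
    return flags, queries


def fresh_flags_single_query(conn):
    """Same information in a single round trip when the attribute is on the row."""
    rows = conn.execute("SELECT id, fresh FROM photo_flat ORDER BY id").fetchall()
    return {pid: bool(fresh) for pid, fresh in rows}, 1
```

Both functions return the same flags for the 20 rows; the first reports 21 queries and the second reports 1.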

Related

In React, speed up large initial fetch of data when web app loads

Similar to the project I am working on, this website has a search bar at the top of its home page:
On the linked website, the search bar works seemingly immediately when visiting the website. According to their own website, there have been roughly 20K MLB players in MLB history, and this is a good estimate for the number of dropdown items in this select widget.
On my project, it currently takes 10-15 seconds to make this fetch (from MongoDB, using Node + Express) for a table with about 15 MB of data that contains the data for the select's dropdown items. This 15 MB is as small as I could make this table, as it includes only two keys (one for the id and one for the name of each dropdown item). The table is large because there are more than 150K options to choose from in my project's select widget. I currently have to disable the widget for the first 15 seconds while the data loads, which results in a bad user experience.
Is there any way to make the data required for the select widget immediately available when users visit, so that the widget does not have to be disabled? In particular:
1. Can I use localStorage to store this table in the user's browser? Is 15 MB too big for localStorage? This table changes / increases in size daily (not too persistent), and a table in localStorage would then be outdated the next day, no?
2. Can I avoid having to do this fetch altogether? Potentially there is a way to load the correct data into React only when a user searches for that user?
3. Some other approach?
Saving / fetching this 15 MB of data more quickly will improve our React app's user experience by quite a bit.

The data on the site you link to is basically 20k in size. It does not contain all the players but fetches the data as needed when you click on a link in the drop-down. So if you have 15 MB of searchable data, then you need to find a way to only load it as required. How to do that sensibly depends on the nature of the data. Many search bars with large result sets behind them use a typeahead search, where the user's input is posted back as they type (with a decent debounce interval) and the search results matching the user's input are sent back in real time (usually with a limit of, say, the first 20 or 50 results).
So basically the answer is to find a way to only serve up the data that the user needs rather than basically downloading the entire database to the browser (option 2 of your list). You must obviously provide a search API to allow that to happen.
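As a rough illustration of the server-side half of such a typeahead, here is a minimal sketch of a prefix search with a result limit. It is written in Python purely as a sketch; in the asker's stack the same idea would live behind a Node + Express endpoint backed by an indexed MongoDB query. The function names (`build_index`, `search_prefix`) and the default limit of 20 are assumptions for illustration.

```python
from bisect import bisect_left


def build_index(names):
    """Sort (lowercased name, original name) pairs once so lookups can use binary search."""
    return sorted((name.lower(), name) for name in names)


def search_prefix(index, query, limit=20):
    """Return up to `limit` names that start with `query` (case-insensitive)."""
    q = query.strip().lower()
    if not q:
        return []
    # (q,) sorts before every (key, name) pair whose key starts with q.
    start = bisect_left(index, (q,))
    results = []
    for key, name in index[start:start + limit]:
        if not key.startswith(q):
            break
        results.append(name)
    return results
```

The client then only ever receives a handful of rows per keystroke (after debouncing), instead of the whole 150K-row table up front.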

Robot's Tracker Threads and Display

Application: The proposed application has a TCP server able to handle several connections with the robots.
I chose to work with a database rather than files, so I'm using an SQLite DB to save information about the robots and their full history, models of robots, tasks, etc.
The robots send us several kinds of data, like odometry, task information, and so on.
I create a thread for every new robot connection to handle the messages and update the robots' information in the database. Now let's talk about my problems:
The application has to show information about the robots in real time, and I was thinking about using QSqlQueryModel, setting the right query, and then showing it in a QTableView. But then I ran into some problems to think about:
Problem number 1: Some of the information to show in the QTableView is not in the database. I have the current consumption and the actual charge (as capacity) in the database, but I also want to show the remaining battery time in my table. How can I add that column with the right behaviour (math implemented) to my table view?
Problem number 2: I will be receiving messages each second for each robot, so updating the DB and the GUI (reloading the query) may not be the best solution when I have a large number of robots connected. Is it better to update the table, and only update the DB each minute or something like that? If I use this method, I can't use QSqlQueryModel to update the table, so what approach do you recommend?
Thanks
SancheZ

I have run into a similar problem before; my conclusion was that QSqlQueryModel is not the best option for display purposes. You may want some processing on query results, or you may want to create, remove, or change display data based on the result for a fancier GUI. I think the best approach is to implement your own delegates and override the view-related methods - setData, setEditor.
This way you have control over all your columns and a direct union of the raw data and its display equivalent (i.e. EditData, UserData).
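For the remaining-battery-time column from problem 1, the math itself is just charge divided by consumption. A minimal sketch of that derived column, in Python rather than Qt/C++: the units (charge in Ah, consumption in A) and the row layout `(name, charge_ah, consumption_a)` are assumptions, since the question does not state them. In a custom Qt model this computation would typically sit wherever the extra column's display value is produced.

```python
def remaining_hours(charge_ah, consumption_a):
    """Estimated runtime in hours; None when the robot draws no current."""
    if consumption_a <= 0:
        return None
    return charge_ah / consumption_a


def display_rows(db_rows):
    """Append the computed column to each (name, charge_ah, consumption_a) row."""
    return [
        (name, charge, current, remaining_hours(charge, current))
        for name, charge, current in db_rows
    ]
```

The database keeps only the raw values; the extra column exists only in the display layer.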
Yes, it is better to update your view in real time and run a batch execute at a lower frequency to update the bulk of the data. In general the app is the middle layer and the DB is the bottom layer for data monitoring, unless you use an in-memory DB with a shared cache.
EDIT: One important point: you cannot usefully run updates from multiple threads (you can, but SQLite blocks the thread until it gets the lock), so it is best to run updates from a single thread.
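A minimal sketch of that arrangement, in Python with the standard sqlite3, queue, and threading modules rather than Qt: connection threads only push updates onto a queue, and a single writer thread flushes them in one transaction per interval. The table name `robot_state`, the `(robot_id, charge)` tuple shape, and the `None` stop sentinel are assumptions for illustration. This version keeps only the latest value per robot between flushes; a history table would append rows instead.

```python
import queue
import sqlite3
import threading
import time


def writer_loop(db_path, updates, flush_every=60.0):
    """Single writer: collect (robot_id, charge) updates, flush them in batches.

    Putting None on the queue flushes whatever is pending and stops the loop.
    """
    conn = sqlite3.connect(db_path)  # connection is used by this thread only
    conn.execute(
        "CREATE TABLE IF NOT EXISTS robot_state "
        "(robot_id TEXT PRIMARY KEY, charge REAL)"
    )
    latest = {}
    deadline = time.monotonic() + flush_every
    running = True
    while running:
        try:
            item = updates.get(timeout=max(0.0, deadline - time.monotonic()))
            if item is None:
                running = False
            else:
                robot_id, charge = item
                latest[robot_id] = charge
        except queue.Empty:
            pass
        due = time.monotonic() >= deadline
        if latest and (due or not running):
            with conn:  # one transaction per batch
                conn.executemany(
                    "INSERT OR REPLACE INTO robot_state (robot_id, charge) "
                    "VALUES (?, ?)",
                    list(latest.items()),
                )
            latest.clear()
        if due:
            deadline = time.monotonic() + flush_every
    conn.close()
```

Each robot's connection thread would call `updates.put((robot_id, charge))` and update the GUI model directly; only `writer_loop`, started once with `threading.Thread(target=writer_loop, args=(path, updates))`, ever touches the database.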

Can you find the logical size of a single NotesDocument in a DAOS-enabled database or an uploaded file size?

I'm doing some feasibility work for an XPages application. One of the aspects is checking the amount of space used by users.
The database will be DAOS-enabled to minimise the size of the NSF. Is it possible to identify the logical size of a NotesDocument that has a DAOSed attachment? I know I can find the logical size of the overall database, but I need to identify it per user.
LotusScript or Java would be acceptable options.
The other option is to capture file sizes at upload time and store that information against the user. Is it possible to identify the attachment size at the point of upload and deletion? This would need to be captured before the attachment was moved to the DAOS store.

Paul,
As far as I know, the client can't tell whether a database or document has been DAOS'ed or not. So this means that using LotusScript against the document would report the document size as if the attachment(s) were in the document. I haven't tested it myself, so I can't give you a 100% guarantee, but you could test it very easily by enabling a database for DAOS and then creating 10 docs, all with the exact same attachment. If the docs report a size of around the attachment size when accessed via LotusScript, you will have your answer!

You could check the logical size of the database before and after saving the document. But unfortunately, you would have to rig the critical section of this code with a semaphore or some other mechanism that assures that only one instance can run at a time, otherwise two simultaneous saves would give you bad results.

Build a view with a column whose formula is @DocLength or @Sum(@AttachmentLengths). This will show the logical size of the docs as if DAOS was not active.
/Newbs

Storing images in Core Data or as file?

I have a set of data that also contains images. I want to cache this data. Should I store the images on the file system or in Core Data, and why?

There are two main options:
1. Store the file on disk, and then store the path to the image in Core Data
2. Store the binary data of the image in Core Data
I personally prefer the first option, since it allows me to choose when I want to load the actual image in memory. It also means that I don't have to remember what format the raw data is in; I can just use the path to alloc/init a new UIImage object.
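The first option boils down to "bytes on disk, path in the store". Here is a language-neutral sketch of that pattern in Python, with sqlite3 standing in for the Core Data store. The table name `image`, the `.png` extension, and the helper names are hypothetical; on iOS the equivalent would be a string attribute on the entity plus a file write.

```python
import sqlite3
from pathlib import Path


def open_store():
    """In-memory stand-in for the persistent store; holds paths, not image bytes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE image (id TEXT PRIMARY KEY, path TEXT NOT NULL)")
    return conn


def save_image(conn, image_dir, image_id, data):
    """Write the bytes to disk and record only the file path in the store."""
    image_dir = Path(image_dir)
    image_dir.mkdir(parents=True, exist_ok=True)
    path = image_dir / f"{image_id}.png"
    path.write_bytes(data)
    conn.execute(
        "INSERT OR REPLACE INTO image (id, path) VALUES (?, ?)",
        (image_id, str(path)),
    )
    conn.commit()
    return path


def load_image(conn, image_id):
    """Look up the path, then read the file only when the image is actually needed."""
    row = conn.execute("SELECT path FROM image WHERE id = ?", (image_id,)).fetchone()
    return Path(row[0]).read_bytes() if row else None
```

Fetching the row is cheap because it never pulls the image bytes; the file is only read when `load_image` is called.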

You might want to read this from the Core Data Programming Guide on how to deal with binary large objects (BLOBs). There are rules of thumb for what size binary data should and should not be stored within the actual Core Data store.
You might also look at Core Data iPad/iPhone BLOBS vs File system for 20k PDFs
If you do place binary data within Core Data store, you would do well to have a "Data" entity that holds the actual data and to have your "Image" entity separate. Create a relationship between the two entities, so that "Data" need only be loaded when actually needed. The "Image" entity can hold the meta-data such as title, data type, etc.

With regard to where to store the user data/files: I found "Application Support" to be a decent location, given that I was wary of the user moving, deleting, or altering the file in some way that would prevent my application from recovering and using the image later.
Take Minecraft as an example:
"~/Library/Application Support/minecraft/saves/"
I would agree with the previous comments and store paths to the images in Core Data, but store the images themselves as PNG files in their own folder outside of Core Data.

How do we get around the Lotus Notes 60 GB database barrier?

Are there ways to get around the upper database size limit on Notes databases? We are compacting a database that is still approaching 60 gigs in size. Thank you very much if you can offer a suggestion.

Even if you could find a way to get over the 64GB limit it would not be the recommended solution. Splitting up the application into multiple databases is far better if you wish to improve performance and retain the stability of your Domino server. If you think you have to have everything in the same database in order to be able to search, please look up domain search and multi-database search in the Domino Administrator help.
Maybe some parts of the data are "old" and could be put into one or more archive databases instead?
Maybe you have a lot of large attachments and can store them in a series of attachment databases?
Maybe you have a lot of complicated views that can be streamlined or eliminated and thereby save a lot of space and keep everything in the same database for the time being? (Remove sorting on columns where not needed, using "click on column header to sort" is a sure way to increase the size of the view index.)

I'm assuming your database is large because of file attachments as well. In that case look into DAOS - it will store all file attachments on filesystem (server functionality - transparent to clients and existing applications).
As a bonus it finds duplicates and stores them only once.
More here: http://www.ibm.com/developerworks/lotus/library/domino-green/

Just a stab in the dark:
Use the DB2 storage method instead of storing to a Domino server?

I'm guessing that 80-90% of that space is taken up by file attachments. My suggestion is to move all the attachments to a file share, provided everyone can access that share, or to an FTP server that everyone can connect to.
It's not ideal because security becomes an issue - now you need to manage credentials to the Notes database AND to the external file share - however it'll be worth the effort from a Notes administrator's perspective.
In the Notes documents, just provide a link to the file. If users are adding these files via a Notes form, perhaps you can add some background code to extract the file from the document after it has been saved, and replace it with a link to that file.

The 64 GB is not actually an absolute limit; you can go above that. I've seen 80 GB and even close to 100 GB, although once you're past 64 GB you can get problems at any time. The limit is not actually Notes, it's the underlying file system. I've seen this on AS400. The great thing about Notes is that even if you do get a huge crash, you can still access all the documents and pull everything out to new copies using scheduled agents, even if you can no longer get views to open in the client.
Your best bet is regular archiving. If it is file attachments, then anything over two years old doesn't need to be in the main system, just a brief synopsis and a link. You could even have a 5-year archive, a 2-year archive, a 1-year archive, etc. Data will continue to accumulate and has to be managed, irrespective of what platform you use to store it.

If the issue really is large file attachments, I would certainly recommend looking into implementing DAOS on your server / database. It is only available with Domino Server 8.5 and later. On the other hand, if your database contains over 100,000 documents, you may want to look seriously at dividing the data into multiple NSFs. At that number of documents, you need to be very careful about your view design, your lookup code, etc.
Some documented successes with DAOS:
http://www.edbrill.com/ebrill/edbrill.nsf/dx/yet-another-daos-success-story-from-darren-duke?opendocument&comments

If your database is getting to 60 GB, don't use a Domino solution; you need to switch to a relational database. You need to archive or move documents across several databases. Although you can get to 60 GB, you shouldn't do it. The performance hit for active databases is significant. It's not so much of a problem for static databases.

I would also look at removing any unnecessary views and their indexes. View indexes can occupy 80-90% of your disk space. If you can't remove them, simplify their sorting arrangements/formulas and remove any unnecessary column sorting options. I halved a 50 GB database down to 25 GB with a few simple changes like this, and virtually no users noticed.

One path could be, for once, to start with the user. Do all the users need to access all that data all the time? If no, it's time to split or archive. If yes, there is probably a flaw in the design of the application.
Technically, I would add to the previous comments a suggestion to check the many options for compaction. Quick and dirty: discard all view indices, but be sure to rebuild at least the one for the default view if you don't want your users to riot. See updall.

One more thing to check: make sure you have checked
[x] Use LZ1 compression for attachments
in db properties.
