I have an NSManagedObject *obj written to an NSManagedObjectContext. It has key/values which include a "data" key that returns a ~5-10 MB NSData *value. How do I get the URL / file path where that value is stored?
By default, Core Data stores everything into a flat persistent store file, usually SQLite. Thus there is no individual file on disk that holds the data you assign to an object.
You can turn on .allowsExternalBinaryDataStorage for individual attributes in your model if you wish. This allows Core Data to shove the data into a separate file on disk, if it sees fit.
It's important to note that this is for performance optimisation purposes. Core Data does not expose any API to tell you the URL of the file on disk.
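As a rough sketch, assuming you build or adjust the model in code rather than in the Xcode model editor (where this is just the "Allows External Storage" checkbox on the attribute), the flag looks like this:

```objc
#import <CoreData/CoreData.h>

// Minimal sketch: marking a binary attribute as eligible for external storage.
// The attribute name "data" matches the question; everything else is illustrative.
NSAttributeDescription *dataAttribute = [[NSAttributeDescription alloc] init];
dataAttribute.name = @"data";
dataAttribute.attributeType = NSBinaryDataAttributeType;
dataAttribute.allowsExternalBinaryDataStorage = YES;   // Core Data may keep large values outside the SQLite file

// Even when a value is written to an external record, Core Data exposes no public
// API that returns the URL of that file; you still read it through the attribute.
```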
There is no such file URL. The way you've designed this managed object and its attributes, the data is stored inside the persistent store along with everything else.
We have a map of a custom key object to a custom value object (a complex object). We set the in-memory-format to OBJECT. But IMap.get takes more time when the retrieved object is big. We cannot afford that latency here, and the value is required for further processing. IMap.get is called in the JVM where the cluster member is started. Is there a way to get objects back quickly regardless of their size?
This is partly the price you pay for in-memory-format==OBJECT.
To confirm, try in-memory-format==BINARY and compare the difference.
Store and retrieve are slower with OBJECT, but some queries will be faster. If you run enough of those queries, the penalty is justified.
If you do get(X) and the value is stored deserialized (OBJECT), the following sequence occurs:
1 - the object is serialized, from object to byte[]
2 - the byte array is sent to the caller, possibly across the network
3 - the object is deserialized by the caller, byte[] to object.
If you change to store serialized (BINARY), step 1 isn't needed.
If the caller is the same process, step 2 isn't needed.
If you can, it's worth upgrading (the latest is 5.1.3), as there are some newer options that may perform better. See this blog post for an explanation.
You also don't necessarily have to return the entire object to the caller. A read-only EntryProcessor can extract just the part of the data you need and return that across the network. A smaller network packet will help, but if the cost is in the serialization then the difference may not be significant.
If you're retrieving a non-local map entry (either because you're using the client-server deployment model, or an embedded deployment with multiple nodes so that some retrievals are remote), then a retrieval is going to require moving data across the network. There is no way to move data across the network that isn't affected by object size, so the solution is to find a way to make the objects more compact.
You don't mention what serialization method you're using, but the default Java serialization is horribly inefficient ... any other option would be an improvement. If your code is all Java, IdentifiedDataSerializable is the most performant. See the following blog for some numbers:
https://hazelcast.com/blog/comparing-serialization-options/
Also, if your data is stored in BINARY format, then it's stored in serialized form (whatever serialization option you've chosen), so at retrieval time the data is ready to be put on the wire. By storing in OBJECT form, you'll have to perform the serialization at retrieval time. This will make your GET operation slower. The trade-off is that if you're doing server-side compute (using the distributed executor service, EntryProcessors, or Jet pipelines), the server-side compute is faster if the data is in OBJECT format because it doesn't have to deserialize the data to access the data fields. So if you aren't using those server-side compute capabilities, you're better off with BINARY storage format.
Finally, if your objects are large, do you really need to be retrieving the entire object? Using the SQL API, you can do a SELECT of just certain fields in the object, rather than retrieving the entire object. (You can also do this with Projections and the older Predicate API but the SQL method is the preferred way to do this). If the client code doesn't need the entire object, selecting certain fields can save network bandwidth on the object transfer.
I have an application that uses two merged Core Data models mapped to two different data stores (both are SQLite stores) via model configurations (each unique configuration within each model is mapped to its own data store). The persistent store coordinator does a good job of saving the relevant data into the correct store. However, the problem is that when the stores are initially created by Core Data on the very first save operation, their schemas are absolutely identical and correspond to a union of the two merged models.
Is there any way to make Core Data create each store solely based on the configuration/model mapped to that store?
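For context, a minimal sketch of this kind of setup, with placeholder configuration and file names:

```objc
#import <CoreData/CoreData.h>

// One coordinator, two SQLite stores, each tied to a model configuration.
// "ConfigA"/"ConfigB" and the store file names are placeholders.
NSManagedObjectModel *model = [NSManagedObjectModel mergedModelFromBundles:nil];
NSPersistentStoreCoordinator *psc =
    [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:model];

NSURL *documentsURL = [[[NSFileManager defaultManager] URLsForDirectory:NSDocumentDirectory
                                                              inDomains:NSUserDomainMask] lastObject];
NSError *error = nil;
[psc addPersistentStoreWithType:NSSQLiteStoreType
                  configuration:@"ConfigA"   // entities assigned to ConfigA in the model editor
                            URL:[documentsURL URLByAppendingPathComponent:@"StoreA.sqlite"]
                        options:nil
                          error:&error];
[psc addPersistentStoreWithType:NSSQLiteStoreType
                  configuration:@"ConfigB"   // entities assigned to ConfigB
                            URL:[documentsURL URLByAppendingPathComponent:@"StoreB.sqlite"]
                        options:nil
                          error:&error];
// Saves route each object to the store whose configuration contains its entity,
// but the SQLite schema generated in each file still reflects the whole merged model.
```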
I guess not. If Core Data generated partial schemas in the different persistent stores, it might break the relationships between them and cause problems. At this stage I don't think Apple intends to change this.
I work with iPhone iOS 4.3.
In my project I need a read-only, pre-populated table of data (say a table with 20 rows and 20 fields).
This data has to be fetched by key on the row.
What is the better approach: Core Data, archives, SQLite, or something else? And how can I prepare and store this table?
Thank you.
I would use Core Data for that. Drawback: you have to write a program (desktop or iOS) to populate the persistent store.
For how to use a pre-populated store, have a look at Apple's Recipes sample code.
The simplest approach would be to use an NSArray of NSDictionary objects and then save the array to disk as a plist. Include the plist in your build and then open it read only from the app bundle at runtime.
Each "row" would be the element index of the array which would return a dictionary object wherein each "column" would be a key-value pair.
I've done this 2 different ways:
1 - Saved all my data as dictionaries in a plist, then deserialized everything and loaded it into the app during startup
2 - Created a program during development that populates the Core Data db, saved that db to the app bundle, then copied the db during app startup into the Documents folder for use as the persistent store (see the sketch below)
Both options are relatively easy, and if your initial data requirements get very large, the second has also proven to be the most performant for me.
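A minimal sketch of that first-launch copy, assuming a pre-built store named "Seed.sqlite" (a placeholder name) was added to the app bundle during development:

```objc
#import <Foundation/Foundation.h>

// Copy the seed store out of the read-only bundle into Documents, once.
NSString *documentsDir =
    [NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES) lastObject];
NSString *storePath = [documentsDir stringByAppendingPathComponent:@"Seed.sqlite"];

NSFileManager *fileManager = [NSFileManager defaultManager];
if (![fileManager fileExistsAtPath:storePath]) {
    // Only on first launch; afterwards the writable copy in Documents is the live store.
    NSString *seedPath = [[NSBundle mainBundle] pathForResource:@"Seed" ofType:@"sqlite"];
    [fileManager copyItemAtPath:seedPath toPath:storePath error:NULL];
}
// Point the persistent store coordinator at storePath when adding the SQLite store.
```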
I have a set of data which also contains images. I want to cache this data. Should I store the images on the file system or in Core Data, and why?
There are two main options:
1 - Store the file on disk, and then store the path to the image in Core Data
2 - Store the binary data of the image in Core Data
I personally prefer the 1st option, since it allows me to choose when I want to load the actual image in memory. It also means that I don't have to remember what format the raw data is in; I can just use the path to alloc/init a new UIImage object.
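A minimal sketch of that first option, assuming a hypothetical "imagePath" string attribute on an already-fetched managed object:

```objc
#import <UIKit/UIKit.h>
#import <CoreData/CoreData.h>

// The managed object holds only a path string; the image bytes stay on disk.
NSManagedObject *photo = /* result of an earlier fetch */ nil;
NSString *imagePath = [photo valueForKey:@"imagePath"];

// The bytes are read from disk only here, at the moment the image is actually needed.
UIImage *image = [[UIImage alloc] initWithContentsOfFile:imagePath];
```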
You might want to read this from the Core Data Programming Guide on how to deal with binary large objects (BLOBs). There are rules of thumb for what size binary data should and should not be stored within the actual Core Data store.
You might also look at Core Data iPad/iPhone BLOBS vs File system for 20k PDFs
If you do place binary data within Core Data store, you would do well to have a "Data" entity that holds the actual data and to have your "Image" entity separate. Create a relationship between the two entities, so that "Data" need only be loaded when actually needed. The "Image" entity can hold the meta-data such as title, data type, etc.
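Purely as an illustration (you would normally do this in the Xcode model editor), a sketch of that split built in code, with placeholder entity, attribute, and relationship names:

```objc
#import <CoreData/CoreData.h>

// "Image" keeps the meta-data; "Data" keeps the blob; a to-one relationship links them.
NSEntityDescription *imageEntity = [[NSEntityDescription alloc] init];
imageEntity.name = @"Image";
NSEntityDescription *dataEntity = [[NSEntityDescription alloc] init];
dataEntity.name = @"Data";

NSAttributeDescription *title = [[NSAttributeDescription alloc] init];
title.name = @"title";                                  // meta-data stays on Image
title.attributeType = NSStringAttributeType;

NSAttributeDescription *bytes = [[NSAttributeDescription alloc] init];
bytes.name = @"bytes";                                  // the BLOB lives on Data
bytes.attributeType = NSBinaryDataAttributeType;

NSRelationshipDescription *imageToData = [[NSRelationshipDescription alloc] init];
NSRelationshipDescription *dataToImage = [[NSRelationshipDescription alloc] init];
imageToData.name = @"data";
imageToData.destinationEntity = dataEntity;
imageToData.maxCount = 1;                               // to-one
imageToData.inverseRelationship = dataToImage;
dataToImage.name = @"image";
dataToImage.destinationEntity = imageEntity;
dataToImage.maxCount = 1;
dataToImage.inverseRelationship = imageToData;

imageEntity.properties = [NSArray arrayWithObjects:title, imageToData, nil];
dataEntity.properties = [NSArray arrayWithObjects:bytes, dataToImage, nil];
// Fetching Image objects now leaves the binary data as a fault until the
// "data" relationship is actually traversed.
```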
With regard to where to store the user data/files: I found "Application Support" to be a decent location, given that I was wary of the user moving, deleting, or altering the file in some way that would leave the image unrecoverable and unusable later by my application.
Take Minecraft as an example:
e.g. "~/Library/Application Support/minecraft/saves/"
I would agree with the previous answers: store paths to the images in Core Data, but store the images themselves as PNG files in their own folder outside of Core Data.
I'm converting an app from SQLitePersistentObjects to CoreData.
In the app, I have a class that I generate many* instances of from an XML file retrieved from my server. The UI can trigger actions that will require me to save some* of those objects until the next invocation of the app.
Other than having a separate NSManagedObjectContext for each of these objects (shared only with their subservient objects, which can include blobs), I can't see how I can have fine-grained control (i.e. at the object level) over which objects are persisted. If I try to have a single context for all newly created objects, I get an exception when I try to move one of my objects to a new context so I can persist it on its own. I'm guessing this is because the objects it owns are left in the 'old' context.
The other option I see is to have a single context, persist all my objects, and then delete the ones I don't need later. This feels like it's going to hit the database too much, but maybe Core Data does magic.
So:
1 - Am I missing something basic about the way my Core Data app should be architected?
2 - Is having a context per object a good design pattern?
3 - Is there a better way to move objects between contexts to avoid 2?
* where "many" means "tens, maybe hundreds, not thousands" and "some" is at least one order of magnitude less than "many"
Also cross posted to the Apple forums.
Core Data is really not an object persistence framework. It is an object graph management framework that just happens to be able to persist that graph to disk (see this previous SO answer for more info). So trying to use Core Data to persist just some of the objects in an object graph is going to be working against the grain. Core Data would much rather manage the entire graph of all objects that you're going to create. So, the options are not perfect, but I see several (including some you mentioned):
1 - You could create all the objects in the Core Data context, then delete the ones you don't want to save. Until you save the context, everything is in-memory so there won't be any "going back to the database" as you suggest. Even after saving to disk, Core Data is very good at caching instances in the contexts' row cache and there is surprisingly little overhead to just letting it do its thing and not worrying about what's on disk and what's in memory.
2 - If you can create all the objects first, then do all the processing in-memory before deciding which objects to save, you can create a single NSManagedObjectContext with a persistent store coordinator having only an in-memory persistent store. When you decide which objects to save, you can then add a persistent (XML/binary/SQLite) store to the persistent store coordinator, assign the objects you want to save to that store (using the context's -assignObject:toPersistentStore: method), and then save the context (see the sketch after this list).
3 - You could create all the objects outside of Core Data, then copy the objects to-be-saved into a Core Data context.
4 - You can create all the objects in a single in-memory context and write your own methods to copy those objects' properties and relationships to a new context to save just the instances you want. Unless the entities in your model have many relationships, this isn't that hard (see this page for tips on migrating objects from one store to another using a multi-pass approach; it describes the technique in the context of versioning managed object models and is no longer needed in 10.5 for that purpose, but the technique would apply to your use case as well).
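A rough sketch of option 2, with placeholder model, store URL, and collection names, and error handling omitted:

```objc
#import <CoreData/CoreData.h>

// Start with only an in-memory store, so nothing touches disk while you process.
NSManagedObjectModel *model = [NSManagedObjectModel mergedModelFromBundles:nil];
NSPersistentStoreCoordinator *psc =
    [[NSPersistentStoreCoordinator alloc] initWithManagedObjectModel:model];
[psc addPersistentStoreWithType:NSInMemoryStoreType configuration:nil URL:nil options:nil error:NULL];

NSManagedObjectContext *context =
    [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
context.persistentStoreCoordinator = psc;

// ... create and work with all the objects in `context` ...

// Once you know what to keep, add a persistent store and assign those objects to it.
NSURL *storeURL = [NSURL fileURLWithPath:@"/tmp/Store.sqlite"];   // placeholder location
NSPersistentStore *sqliteStore =
    [psc addPersistentStoreWithType:NSSQLiteStoreType configuration:nil URL:storeURL options:nil error:NULL];

NSArray *objectsToSave = [NSArray array];   // placeholder: the instances you decided to keep
for (NSManagedObject *keeper in objectsToSave) {
    [context assignObject:keeper toPersistentStore:sqliteStore];
    // (You can likewise assign the objects you are not keeping to the in-memory store.)
}
[context save:NULL];   // only now is anything written to disk
```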
Personally, I would go with option 1 -- let Core Data do its thing, including managing deletions from your object graph.
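A minimal sketch of that first option, assuming `context` holds every freshly parsed object and `shouldKeep:` is a hypothetical predicate of your own:

```objc
#import <CoreData/CoreData.h>

// Walk everything created since the last save and discard what you don't want.
NSArray *parsed = [context insertedObjects].allObjects;
for (NSManagedObject *candidate in parsed) {
    if (![self shouldKeep:candidate]) {
        [context deleteObject:candidate];   // deleting an unsaved insert never touches the database
    }
}
NSError *saveError = nil;
[context save:&saveError];                  // only the kept objects reach the store
```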