After adding a new Core Data model version to my app, I performed a lightweight migration, apparently successfully. The migrated file loaded fine, but upon the first attempt to access an attribute via a particular relationship, the app crashes with an NSRangeException: '*** -[__NSArrayM objectAtIndex:]: index 4294967295 beyond bounds [0 .. 35]'. This relationship worked fine prior to the migration. I know from other posts here that 4294967295 is really -1, but the only thing I can identify with 36 items in my app/data is that there are 36 total entities in the data model (for reference, the relationship that's being fetched has 58 items in its table).
The question:
My question is: based on the error I'm getting and the troubleshooting I've done below, is there a type of schema change that could pass the lightweight migration, but corrupt the data along the way, leading to the noted exception? I'm going to try breaking down the migration into smaller chunks over several versions to either isolate or avoid the issue, but it would be nice to be able to focus on specific schema changes that might be at fault.
The failure:
The failure occurs with the following code in "myobject":
[[self object2] text];
The object2 relationship is to-one, non-optional both ways and neither the forward nor inverse relationship was changed between data models. The text attribute is likely not relevant because when the error occurs, awakeFromFetch is not reached in object2. If I assign [self object2] to a variable prior to the above statement, the assignment is successful and reports data: <fault>.
The database:
Looking at the database in sqlite3, I notice the following:
The index values for the forward and inverse relationships appear to be correct in each table.
The object2 table has two columns for the inverse relationship instead of the one prior to migration (ZMYOBJECT as before and the additional Z2_MYOBJECT, which is empty for all rows). No other relationship were added to explain this column.
In the Z_PRIMARYKEY table, all entries post-migration show -1 for Z_MAX, whereas prior to migration they showed zero for empty tables and the maximum row number for populated tables. Manually updating Z_MAX to the proper values did not help with the exception. All Z_SUPER values were correct.
I set up a mapping model to see if anything looked awry with the automatic mappings, but everything looked fine.
Overall schema changes:
In the source version of the data model, there were fourteen entities, of which only four had been populated with data (the app is still in development). Seven were top-level entities and seven were sub-entities of three of the top-level entities.
In the target version of the data model, twenty-two entities were added, some top-level and some sub-entities, with dozens of relationships, including some added to existing entities.
Some attributes and relationships were removed from existing entities and others were added. No data types or relationship settings were changed, no attributes or relationships were renamed, and no special mappings were required.
Update (2/25/12): As I started working on a new intermediate model, I remembered that I had changed the class (representedClassName) for a number of entities from NSManagedObject to an NSManagedObject subclass, but hadn't generated the class files. I didn't suspect that would cause an issue and, indeed, creating all of the class files did not help with the exception. I just wanted to note that as another change between models.
Conclusions:
This is a wild guess, but if the 36 entity count is not a coincidence, it seems that when "myobject" attempts to fault in "object2" it does not have a valid reference for the table and is attempting to load table number -1, causing the exception. The fact that a simple assignment of [self object2] is successful, however, doesn't jibe with that conclusion.
Any ideas?
By working through several incremental migrations I was able to determine what is causing the issue, and a solution.
The problem:
One of the existing entities with data has no child entities in the current model. If I create a new model that simply adds a child entity, containing no attributes or relationships, and makes no other changes, the NSRangeException, Z_MAX observation, and doubling of the inverse relationship noted in my question all occur.
The solution:
After observing the failures following a "successful" lightweight migration for the case above, I created a mapping model. Since the only change was one additional entity, all but one of the entity mappings were straightforward. The question was what to do with the single added entity.
By default, the added entity with no attributes or relationships of its own was showing attribute and relationship mappings for all of the parent's properties. All of the mappings had empty value expressions by default, which I assumed meant that it would just skip them during the migration. Not true, apparently. By deleting all of the attribute and relationship mappings within the entity mapping and then turning off inferred mapping, the migration proceeded successfully.
I still have to tackle all of the remaining entities and will be trying this approach to do the rest in bulk, with all planned attributes and relationships intact.
Your posts were helpful when I encountered this problem. Thank you. [Have you reported the bug yet?]
Here are some more experimental results but, alas, not a great solution.
My schema change similarly added an entity subtype that has no additional attributes or relationships. The error message is the same as yours except the bounds are [0 .. 19]. That does correspond to 20 entity types, validating your hypothesis. Like your situation, the error happened when attempting to access an entity property after migration completed.
Adding a dummy attribute and a dummy self-relationship to the new entity type didn't avoid the post-migration crash. (However, I didn't test with that new entity type as the only schema change since I previously pushed that schema change to alpha testers.)
I observe the Z2_MYOBJECT column and Z_PRIMARYKEY.Z_MAX = -1 symptoms after successful migrations for other schema changes, so those may not be problematic at all. The -1 values get replaced lazily by the proper max values. The extra column might be used during migration.
In my case, the new entity's supertype has an ordered to-many relationship. In the very simple case where the entire data store contains just one object instance (an instance of that entity type with no outgoing relationship links), the schema migration succeeds. It does have the extra Z2_MYOBJECT column and Z_PRIMARYKEY.Z_MAX = -1 values and yet the resulting data store works fine when adding objects from there.
I tried creating a mapping model but was unsuccessful in getting Core Data to apply it. Turning off inferred mapping just made Core Data unable to migrate at all. Is there a trick to it? Do I have to write custom migration code to invoke a mapping model? This is Xcode 4.6.2 so the older bug is long gone.
When using git to roll the code & data model backwards or forwards to conduct an experiment, it seems to be necessary to (1) close & reopen the Xcode project and (2) do a clean build. Otherwise Xcode may crash and/or leave confounding state around.
To experimentally roll backwards, you must delete the .momd/ directory or the entire app from the target iOS simulator/device (or deploy the app via iTunes or TestFlight) since redeploying via Xcode won't remove obsolete files (like .mom and .omo data model definitions) which in turn lets the app do lightweight migrations that the actual deployed app can't do.
About the entity mapping to use for the added entity type, note that when Core Data applies a mapping model, it's copying entities from the old data store to a new one. It's not modifying the tables in place. You don't want it to "skip" properties (including inherited properties) unless you want to drop them.
However, since the schema change added an entity type, that entity has no instances to migrate so its custom mapping model rules do not matter.
Thus I wonder if something else caused your crashes to stop, like leftover experimental .mom files or custom migration code. Did your workaround hold up?
After 2 days of experimenting I decided my alpha testers would have to live without data migration this time. Fortunately this happened without production customers. But it doesn't give me confidence in Core Data.
I had the same sort of NSRangeException after adding a core data model version when accessing any instance of a particular entity after automatic lightweight migration. In my case also the range corresponded to the number of entities in my model.
I generated a mapping model with Xcode 4.6 (4H127) using File > New > File... and then selecting Core Data > Mapping Model. This caused the crash to (d)evolve into -[NSSymbolicExpression length]: unrecognized selector sent to instance...
Solution
The issue in my case was that my entity causing the original crash had a relationship named size, which is a reserved word listed in apple's Predicate Programming Guide. An examination of the mapping model revealed that the reserved word had been capitalized in the Value Expression for the relationship:
FUNCTION($manager, "destinationInstancesForEntityMappingNamed:sourceInstances:" , "PNSizeOptionToPNSizeOption", $source.SIZE)
I found the solution in Core Data Model Versioning and Data Migration Programming Guide:
Reserved words in custom value expressions: If you use a custom value
expression, you must escape reserved words such as SIZE, FIRST, and
LAST using a # (for example, $source.#size).
Unfortunately, Xcode's algorithm for generating the mapping model did not recognize the reserved word and I had to change the expression's key path in the Relationship Mapping inspector to $source.#size. This solved the problem. I assume that core data's inferred mapping model ran into a similar problem during lightweight migration.
There may be other causes of this kind crash and so this solution may not apply, but it may be worth checking the property names in your model against the list of reserved words in the Predicate Programming Guide.
Related
I have created a webpage for doctors where there are doctors, patients, and diagnoses stored in the database. There is an option to delete either of these entries (e.g. delete a doctor instance), but in, order to avoid data loss, I want to archive the data instead of deleting it entirely. I have found 2 options to do these:
1 - Keep an archive collection (e.g. doctors-archive, patients-archive) to move the entry there from the original collection.
2 - Keep an attribute isDeleted inside the original collection, so when an entry is deleted, isDeleted will become true, and when fetching the entries with isDeleted=true will not be returned.
Both of these options have their drawbacks - 1st option makes it hard to keep relations, as the patient and doctor have relations and if one of them is deleted the relation will be lost. The second option makes the original collection too heavy as the data will never be removed from it.
Is there a better option than these 2 to store archived data? If not, which of these options are better?
It's a very good discussion. I had the same confusion in one of my projects, and after a long research I found a solution that fit well my needs.
First of all, I found that the suppression is not reputed, developers avoid this action to avoid data loss wrongly.
Furthermore, the suppression of data has another disadvantages:
It will lead to having more data into your system, because you will create new models & collections (e.g., doctors-archive model for doctors model).
This will have a significant impact on the performance of the system: the suppression will be done after 3 (and may be more) queries:
find(): to select the object to be deleted.
insert(): to create an archive instance for the object to be deleted.
delete(): to delete the object.
Note that we can not use the findOneAndDelete() method since we have to create an archive instance before !
Contrariwise, the second option can be done by 1 query: findAndModify().
But, the best solution is not the first one, nor the second one ! But a combination of both of them: it consists of cleaning your data after a specific period (every 4 months for example).
In other worlds, when deleting an instance, you apply the 2nd solution (i.e., isDeleted = true), but after 4 months, you can delete this instance from the collection and keep a copy in the archive (also called backup). This solution will prevent the original collection from being too heavy.
Note that you can also have a separately backup database to avoid use of the originale database.
Both of these options have their drawbacks - 1st option makes it hard to keep relations, as the patient and doctor have relations and if one of them is deleted the relation will be lost. The second option makes the original collection too heavy as the data will never be removed from it. => What kind of relationship are you talking about?
If you still need a document post it is archived there has to be a middle ground there and by the why in those scenarios the data isn't lost you still have the power to do $lookup.
Based on my experience the option 2 has way too many negatives, it leads often to range index scans and on top of it you are maintaining extra burden for the indexes too where isDeleted will become part of every index you create.
TLDR; Definitely option 1 and if you need to query on archived content then use $lookup.
There are some techniques for mongodb which you can use to manage data growth in MongoDB:
Using capped collection
Using TTLs
Using multiple collections for months( rename old one and keep data in new one only)
I’ve an app on the App Store and I’ve noticed that in rare circumstances, when I update the fields of an object in CoreData, a new NSManagedObject is created with the new property value and all the relationships set to null.
So, I’ve an inconsistency in the database because there could be the possibility to have at the same time the old object and a duplicated one with null relationship and the updated property value.
I’m studying the problem and the solution could be to add a unique constraint (the objects have an id field which should be unique). The problem is that this migration causes the crash if duplicated objects are in the database. On a fresh install, instead, all works correctly. How can I manage the migration of the users with duplicate items?
Here some information about context of the problem I facing:
we have a semi-structured (JSON from node.js backend) data in datastore.
after saving an entity,
and getting a list of entities about them soon and even a while later,
returned data does not have one indexed property
I can find the entity by that property value.
I use Google Datastore via node.js client library. #google-cloud/datastore: "^2.0.0".
How it can be possible? I understood when due to eventual consistency some updates can be incompletely written etc. But when I getting same inconsistency - lack of whole property of entity saved e. g. hour ago?
I gone through scenario multiple times for same kind multiple times.
I do not have such issues with other kinds or other properties of that kind.
How I can avoid this type of issues with Google Datastore?
Answer for anyone who may encounter with such issue.
We mostly do not use any DTO (data-transfer objects) or any other wrappers for most of our kinds in this project, but for this one a DTO has been used, mostly to be sure the result objects have default values for properties omitted/absent in entity which usually happens for entities created by older version of code.
After reviewing my own code more carefully, I found a piece of code which is out of sync with other related pieces of code - there was no a line to copy this property from entity to the DTO object.
Side note: Actually all this situation remind me a story or meme about a guy who claimed he found a bug in compiler just because he was not able to find a mistake he made in his code.
So I am following the instructions found here: http://msdn.microsoft.com/en-US/data/jj691402 concerning how to handle multiple result sets from in EF.
I am trying to avoid the second solution as this will involve changing the EDMX by hand, which concerns me as I do not want to have to worry about other members on my team overwriting them.
But the first example still seems to be lacking. It refers to the ObjectContext.Translate<TEntity> method, but no where does it say how the <TEntity> is being created. Any time I create an Entity by hand, I of course get Error 2062, "no mapping between entity set and association set". Is there a step that I am missing? Or does the first solution not work with a DB first approach?
If you have create entity by hand in EF designer without mapping it to existing table or database view you indeed receive an error. Try to create complex type instead.
Does anyone have an example of how to efficiently provide a UITableView with data from a Core Data model, preferable including the use of sections (via a referenced property), without the use of NSFetchedResultsController?
How was this done before NSFetchedResultsController became available? Ideally the sample should only get the data that's being viewed and make extra requests when necessary.
Thanks,
Tim
For the record, I agree with CommaToast that there's at best a very limited set of reasons to implement an alternative version of NSFetchedResultsController. Indeed I'm unable to think of an occasion when I would advocate doing so.
That being said, for the purpose of education, I'd imagine that:
upon creation, NSFetchedResultsController runs the relevant NSFetchRequest against the managed object context to create the initial result set;
subsequently — if it has a delegate — it listens for NSManagedObjectContextObjectsDidChangeNotification from the managed object context. Upon receiving that notification it updates its result set.
Fetch requests sit atop predicates and predicates can't always be broken down into the keys they reference (eg, if you create one via predicateWithBlock:). Furthermore although the inserted and deleted lists are quite explicit, the list of changed objects doesn't provide clues as to how those objects have changed. So I'd imagine it just reruns the predicate supplied in the fetch request against the combined set of changed and inserted records, then suitably accumulates the results, dropping anything from the deleted set that it did previously consider a result.
There are probably more efficient things you could do whenever dealing with a fetch request with a fetch limit. Obvious observations, straight off the top of my head:
if you already had enough objects, none of those were deleted or modified and none of the newly inserted or modified objects have a higher sort position than the objects you had then there's obviously no changes to propagate and you needn't run a new query;
even if you've lost some of the objects you had, if you kept whichever was lowest then you've got an upper bound for everything that didn't change, so if the changed and inserted ones together with those you already had make more then enough then you can also avoid a new query.
The logical extension would seem to be that you need re-interrogate the managed object context only if you come out in a position where the deletions, insertions and changes modify your sorted list so that — before you chop it down to the given fetch limit — the bottom object isn't one you had from last time. The reasoning being that you don't already know anything about the stored objects you don't have hold of versus the insertions and modifications; you only know about how those you don't have hold of compare to those you previously had.