When the observer pattern causes GC problems

In a GC-enabled language, when an observer subscribes to a subject's events, the subject actually gets a reference to the observer.
So before an observer is dropped, it must unsubscribe first. Otherwise, because it is still referenced by the subject, it will never be garbage collected.
Normally there are 2 solutions:
Manually unsubscribing
Weak references
Both of them cause other problems.
So I usually don't like to use the observer pattern, but I still cannot find any replacement for it.
I mean, this pattern describes things in such a natural way that you could hardly find anything better.
What do you think about it?
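For illustration, the weak-reference option can be sketched roughly as follows. This is a minimal Python sketch, not taken from the question: Subject, Logger and on_event are hypothetical names, and it assumes observers are ordinary objects that can be weakly referenced.

```python
import gc
import weakref


class Subject:
    """Hypothetical subject that holds its observers only weakly."""

    def __init__(self):
        # WeakSet entries vanish once nothing else references the observer.
        self._observers = weakref.WeakSet()

    def subscribe(self, observer):
        self._observers.add(observer)

    def notify(self, event):
        # Copy first so an observer dying mid-loop cannot disturb iteration.
        for observer in list(self._observers):
            observer.on_event(event)

    def observer_count(self):
        return len(self._observers)


class Logger:
    """Hypothetical observer that records the events it receives."""

    def __init__(self):
        self.seen = []

    def on_event(self, event):
        self.seen.append(event)


subject = Subject()
logger = Logger()
subject.subscribe(logger)
subject.notify("changed")

count_before = subject.observer_count()
seen = list(logger.seen)

del logger      # drop the last strong reference to the observer
gc.collect()    # CPython frees it on `del`; other runtimes may need a GC pass
count_after = subject.observer_count()
```

One of the "other problems" shows up right away: an observer that nobody else references (a temporary object, for example) silently stops receiving events, because the subject alone does not keep it alive.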

In this scenario, you can use finalize() in Java. finalize() is a bad idea when you have to release a resource (like a DB connection) because some outside system is affected. In your case, the object which installed the observer will be GC'd during the runtime of your app and then, finalize() will be called and it can unsubscribe the observer.
It's not exactly what you want, but someone must decide "it's okay to unsubscribe now". That happens either when your subject goes away (but it should already kill all observers) or when the object which installed the observer goes away.
If your app terminates unexpectedly, well, it doesn't hurt that finalize() might not be called in this case.

If you want to remove an observer, you should first inform the publisher by unsubscribing. Otherwise it will try to send out events and, depending on how it is written, it could crash the app, quietly ignore the error, or remove the observer. But if you open something, close it; if you subscribe, unsubscribe.
The fact that you are not unsubscribing is a bad design, IMO. Don't blame the pattern for a poor implementation.
The observer pattern works well, but if you want to alleviate some of the issues, you could use AOP for the implementation:
http://www.cin.ufpe.br/~sugarloafplop/final_articles/20_ObserverAspects.pdf
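The "if you subscribe, unsubscribe" rule can be made hard to forget by tying both calls to one scope. The following is only a sketch under assumptions: Subject and the subscribed helper are hypothetical names, and it uses a Python context manager rather than the AOP approach from the linked paper.

```python
from contextlib import contextmanager


class Subject:
    """Hypothetical subject holding ordinary (strong) references."""

    def __init__(self):
        self._observers = []

    def subscribe(self, callback):
        self._observers.append(callback)

    def unsubscribe(self, callback):
        self._observers.remove(callback)

    def notify(self, event):
        for callback in list(self._observers):
            callback(event)


@contextmanager
def subscribed(subject, callback):
    """Subscribe on entry and always unsubscribe on exit, even on error."""
    subject.subscribe(callback)
    try:
        yield callback
    finally:
        subject.unsubscribe(callback)


subject = Subject()
received = []

with subscribed(subject, received.append):
    subject.notify("inside")    # delivered

subject.notify("outside")       # nobody is listening any more
remaining = len(subject._observers)
```

This only helps when the observer's lifetime matches a lexical scope; longer-lived observers still need an explicit owner that calls unsubscribe.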

Consider the scenario of an object which counts how often some observable thing changes. There are two types of references to the object: (1) those by entities which are interested in the count; (2) those used by the observable thing(s) which aren't really interested in the count, but need to update it. The entities which are interested in the count should hold a reference to an object which in turn holds a reference to the one that manages the count. The entities that will have to update the count but aren't really interested in it should just hold references to the second object.
If the first object holds a finalizer, it will be fired when the object goes out of scope. That could trigger the second object to unsubscribe, but it should probably not be unsubscribed directly. Unsubscription would probably require acquiring a lock, and finalizers should not wait on locks. Instead, the finalizer of the first object should probably add that object to a linked list maintained using Interlocked.CompareExchange, and some other thread should periodically poll that list for objects needing unsubscription.
Note, btw: If the first object holds a reference to the second object, the latter would be guaranteed to exist when the finalizer for the first object runs, but it would not be guaranteed to be in any particular state. The cleanup thread should not try to do anything with it other than unsubscribe.
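The two-object arrangement described above can be sketched like this. It is an illustration in Python rather than .NET, so several things are assumptions: CounterHandle, Counter, Subject and drain are hypothetical names, weakref.finalize stands in for the finalizer, and queue.SimpleQueue (whose put() is documented as safe to call from destructors and weakref callbacks in CPython) stands in for the Interlocked.CompareExchange linked list.

```python
import gc
import queue
import threading
import weakref


class Subject:
    """Hypothetical observable; unsubscribing requires its lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._observers = []

    def subscribe(self, observer):
        with self._lock:
            self._observers.append(observer)

    def unsubscribe(self, observer):
        with self._lock:
            self._observers.remove(observer)

    def notify(self):
        with self._lock:
            observers = list(self._observers)
        for observer in observers:
            observer.on_change()

    def observer_count(self):
        with self._lock:
            return len(self._observers)


class Counter:
    """The 'second object': the subject references it in order to update it."""

    def __init__(self):
        self.count = 0

    def on_change(self):
        self.count += 1


class CounterHandle:
    """The 'first object': what entities interested in the count hold."""

    def __init__(self, subject, pending):
        self._counter = Counter()
        subject.subscribe(self._counter)
        # The finalizer only enqueues; it never takes the subject's lock.
        weakref.finalize(self, pending.put, self._counter)

    @property
    def count(self):
        return self._counter.count


def drain(subject, pending):
    """Cleanup pass: unsubscribe every counter whose handle has died."""
    while True:
        try:
            counter = pending.get_nowait()
        except queue.Empty:
            return
        subject.unsubscribe(counter)   # the only thing done with it


subject = Subject()
pending = queue.SimpleQueue()
handle = CounterHandle(subject, pending)
subject.notify()
subject.notify()
count_seen = handle.count

del handle      # nobody is interested in the count any more
gc.collect()
subscribed_before_cleanup = subject.observer_count()

worker = threading.Thread(target=drain, args=(subject, pending))
worker.start()
worker.join()
subscribed_after_cleanup = subject.observer_count()
```

Here the cleanup thread runs once for the sake of the example; in the design described above it would poll the list periodically.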

Related

Is it required to lock shared variables in perl for read access?

I am using shared variables in Perl with use threads::shared.
Those variables are modified only from a single thread; all other threads only read them.
Is it required to lock in the 'reading' threads, like
{
    lock $shared_var;
    if ($shared_var > 0) { ... }
}
?
Isn't it safe to do a simple check without locking (in the 'reading' thread!), like
if ($shared_var > 0) { ... }
?
Locking is not required to maintain internal integrity when setting or fetching a scalar.
Whether it's needed or not in your particular case depends on the needs of the reader, the other readers and the writers. It rarely makes sense not to lock, but you haven't provided enough details for us to determine what your needs are.
For example, it might not be acceptable to use an old value after the writer has updated the shared variable. For starters, this can lead to a situation where one thread is still using the old value while another thread is using the new value, which can be undesirable if those two threads interact.
It depends on whether it's meaningful to test the condition at just some arbitrary point in time. The problem, however, is that in the vast majority of cases the Boolean test stands for other things, which might already have changed by the time you've finished reading the condition, so it may describe a previous state.
Think about it. If it's an insignificant test, then it means little--and you have to question why you are making it. If it's a significant test, then it is telltale of a coherent state that may or may not exist anymore--you won't know for sure, unless you lock it.
A lot of times, say in real-time reporting, you don't really care which snapshot the database hands you, you just want a relatively current one. But, as part of its transaction logic, it keeps a complete picture of how things are prior to a commit. I don't think you're likely to find this in code, where the current state is the current state--and even a state of being in a provisional state is a definite state.
I guess one of the times this can be different is cyclical access to a queue. If one consumer doesn't get the head record this time around, then one of them will get it the next time around. You can probably save some processing time by reading the queue counter without a lock. But here's a case where the test means little in the context of just one iteration.
In the case above, you would want to follow the test with some instructions under a lock that expect the queue might actually be empty even though your test suggested it had data. So, if it is just a preliminary test, your logic has to treat the test as being as unreliable as it actually is.
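The "preliminary test, then locked check" idea from the queue example might look like the sketch below. It is written in Python rather than Perl, and every name in it (items, queue_lock, try_take, consumer) is hypothetical; the point is only that the unlocked test is treated as a hint and the decision is repeated under the lock.

```python
import threading
from collections import deque

queue_lock = threading.Lock()
items = deque()


def try_take():
    """Return the head item, or None if the queue turned out to be empty."""
    # Cheap, unlocked hint: it may already be stale when we act on it.
    if not items:
        return None
    with queue_lock:
        # Authoritative check: another consumer may have emptied the queue
        # between the hint above and acquiring the lock.
        if not items:
            return None
        return items.popleft()


items.extend(range(100))
taken = []
taken_lock = threading.Lock()


def consumer():
    while True:
        item = try_take()
        if item is None:
            return
        with taken_lock:
            taken.append(item)


workers = [threading.Thread(target=consumer) for _ in range(4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

Every item is taken exactly once because the pop only happens after the locked re-check; the unlocked test merely lets idle consumers skip the lock when the queue looks empty.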

Can I use [self retain] to hold the object itself in objective-c?

I'm using [self retain] to hold an object itself, and [self release] to free it elsewhere. This is very convenient sometimes. But this is actually a reference loop, or deadlock, which most garbage-collection systems aim to solve. I wonder if Objective-C's autorelease pool may find the loops and surprise me by releasing the object before reaching [self release]. Is my way encouraged or not? How can I ensure that the garbage collector, if there is one, won't be too smart?
This way of working is very discouraged. It looks like you need some pointers on memory management.
Theoretically, an object should live as long as it is useful. Useful objects can easily be spotted: they are directly referenced somewhere on a thread stack, or, if you made a graph of all your objects, reachable through some path linked to an object referenced somewhere on a thread stack. Objects that live "by themselves", without being referenced, cannot be useful, since no thread can reach to them to make them perform something.
This is how a garbage collector works: it traverses your object graph and collects every unreferenced object. Mind you, Objective-C is not always garbage-collected, so some rules had to be established. These are the memory management guidelines for Cocoa.
In short, it is based on the concept of 'ownership'. When you look at the reference count of an object, you immediately know how many other objects depend on it. If an object has a reference count of 3, it means that three other objects need it to work properly (and thus own it). Every time you keep a reference to an object (except in rare conditions), you should call its retain method. And before you drop the reference, you should call its release method.
There are some other important rules regarding the creation of objects. When you call alloc, copy or mutableCopy, the object you get already has a refcount of 1. In this case, it means the calling code is responsible for releasing the object once it's not required. This can be problematic when you return references to objects: once you return it, in theory, you don't need it anymore, but if you call release on it, it'll be destroyed right away! This is where NSAutoreleasePool objects come in. By calling autorelease on an object, you give up ownership of it (as if you called release), except that the reference is not immediately revoked: instead, it is transferred to the NSAutoreleasePool, which will release it once it receives the release message itself. (Whenever some of your code is called back by the Cocoa framework, you can be assured that an autorelease pool already exists.)
It also means that you do not own objects if you did not call alloc, copy or mutableCopy on them; in other words, if you obtain a reference to such an object otherwise, you don't need to call release on it. If you need to keep around such an object, as usual, call retain on it, and then release when you're done.
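As a rough illustration of the ownership rules above, here is a toy model in Python. It is not how Cocoa is implemented: RefCounted, AutoreleasePool and make_greeting are hypothetical names, and the pool is passed explicitly only to keep the sketch self-contained (a real autorelease message takes no pool argument).

```python
class RefCounted:
    """Toy object with a manual retain count (a model, not real Cocoa)."""

    def __init__(self, name):
        self.name = name
        self.retain_count = 1       # like alloc/copy: the creator owns it
        self.deallocated = False

    def retain(self):
        self.retain_count += 1
        return self

    def release(self):
        self.retain_count -= 1
        if self.retain_count == 0:
            self.deallocated = True

    def autorelease(self, pool):
        # Give up ownership later: the pool will send release when drained.
        pool.add(self)
        return self


class AutoreleasePool:
    """Toy pool that releases everything handed to it when drained."""

    def __init__(self):
        self._pending = []

    def add(self, obj):
        self._pending.append(obj)

    def drain(self):
        for obj in self._pending:
            obj.release()
        self._pending.clear()


def make_greeting(pool):
    greeting = RefCounted("greeting")    # count 1, owned by this function
    return greeting.autorelease(pool)    # caller gets it without owning it


pool = AutoreleasePool()
temp = make_greeting(pool)               # used briefly, never retained
kept = make_greeting(pool).retain()      # caller wants to keep it: count 2

alive_before_drain = (temp.deallocated, kept.deallocated)
pool.drain()
temp_gone = temp.deallocated             # the pool held the only ownership
kept_alive = not kept.deallocated        # the caller's retain keeps it alive
kept.release()
kept_gone = kept.deallocated
```

The returned object survives until the pool is drained, which is what lets a method hand back an object it no longer owns; a caller that wants it for longer balances its own retain with a release.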
Now, if we try to apply this logic to your use case, it stands out as odd. An object cannot logically own itself, as it would mean that it can exist, standalone in memory, without being referenced by a thread. Obviously, if you have the occasion to call release on yourself, it means that one of your methods is being executed; therefore, there's gotta be a reference around for you, so you shouldn't need to retain yourself in the first place. I can't really say with the few details you've given, but you probably need to look into NSAutoreleasePool objects.
If you're using the retain/release memory model, it shouldn't be a problem. Nothing will go looking for your [self retain] and subvert it. That may not be the case, however, if you ever switch over to using garbage collection, where -retain and -release are no-ops.
Here's another thread on SO on the same topic.
I'd reiterate the answer that includes the phrase "overwhelming sense of ickyness." It's not illegal, but it feels like a poor plan unless there's a pretty strong reason. If nothing else, it seems sneaky, and that's never good in code. Do heed the warning in that thread to use -autorelease instead of -release.

Resolving an NSManagedObject conflict with multiple threads, relationships, and pointers

I'm having a conflict when saving a bunch of NSManagedObjects via an outside thread. For starters, I can tell you the following:
I'm using a separate MOC for each thread.
The MOCs share the same persistent store coordinator.
It's likely that an outside thread is modifying one or many of the records that I'm saving.
OK, so with that out of the way, here's what I'm doing.
In my outside thread, I'm doing some computation and updating a single value in a bunch of managed objects. I do this by looking up the object in the persistent store by my primary key, modifying the single decimal property, and then calling save on the bunch all at once.
In the meantime, I believe the main thread is doing some updating of its own.
When my outside thread does its big save on its managed object context, I get an exception thrown stating a large number of conflicts. All of the conflicts seem to be centered around a single relationship on each record. Though the managed object in the persistent store and my outside thread share the same ObjectID for this relationship, they don't share the same pointer. Based on what I see, that's the only thing that's different between the objects in my NSMergeConflict debug output.
It makes sense to me why the two objects have relationships with different pointers -- they're in different threads. However, as I understand it from Apple's documentation, the only thing cached when an object is first retrieved from the persistent store are the global IDs. So, one would think that when I run save on the outside thread MOC, it compares the ObjectIDs, sees they're the same, and lets it all through.
So, can anyone tell me why I'm getting a conflict?
Per the documentation in the Concurrency with Core Data chapter of The Core Data Programming Guide, the recommended configuration is for the contexts to share the same persistent store coordinator, not just the same persistent store.
Also, the section Track Changes in Other Threads Using Notifications of the same chapter states that if you're tracking updates with NSManagedObjectContextDidSaveNotification, you send -mergeChangesFromContextDidSaveNotification: to the main thread's context so it can merge the changes. But if you're tracking with NSManagedObjectContextObjectsDidChangeNotification, the external thread should send the object IDs of the modified objects to the main thread, which then sends -refreshObject:mergeChanges: to its context for each modified object.
And really, you should know if the main thread is also performing updates through its controller, and propagate its changes in like manner but in the opposite direction.
You need to have all your contexts listening for NSManagedObjectContextDidSaveNotification from any context that makes changes. Otherwise, only the front context will be aware of changes made on the background threads but the background context won't be aware of changes on the front thread.
So, if you have three threads and three contexts, each of which makes changes, all three contexts must register for notifications from the other two.
Unfortunately, it seems as though this bug was actually being caused by something else -- I was calling the operation causing the error more than once at the same time when I shouldn't have been. Although this doesn't answer the initial question as to why pointers matter in conflicts, updating my code to prevent this situation has resolved my issue.

Best way to prevent early garbage collection in CLR

I have written a managed class that wraps around an unmanaged C++ object, but I found that - when using it in C# - the GC kicks in early while I'm executing a method on the object. I have read up on garbage collection and how to prevent it from happening early. One way is to use a "using" statement to control when the object is disposed, but this puts the responsibility on the client of the managed object. I could add to the managed class:
void MyManagedObject::MyMethod()
{
    // Hold a GC handle to this wrapper so it stays reachable during the call.
    System::Runtime::InteropServices::GCHandle handle =
        System::Runtime::InteropServices::GCHandle::Alloc(this);
    // access unmanaged member
    handle.Free();
}
This appears to work. Being new to .NET, how do other people deal with this problem?
Thank you,
Johan
You might like to take a look at this article: http://www.codeproject.com/Tips/246372/Premature-NET-garbage-collection-or-Dude-wheres-my. I believe it describes your situation exactly. In short, the remedies are either a using block or a GC.KeepAlive. However, I agree that in many cases you will not wish to pass this burden onto the client of the unmanaged object; in this case, a call to GC.KeepAlive(this) at the end of every wrapper method is a good solution.
You can use GC.KeepAlive(this) in your method's body if you want to keep the finalizer from being called. As others noted correctly in the comments, if your this reference is not live during the method call, it is possible for the finalizer to be called and for memory to be reclaimed during the call.
See http://blogs.microsoft.co.il/blogs/sasha/archive/2008/07/28/finalizer-vs-application-a-race-condition-from-hell.aspx for a detailed case study.

Core Data: awakeFromFetch Not Getting Called For Unsaved Contexts

First, let me illustrate the steps to reproduce the 'bug'.
Create a new NSManagedObject.
Fault the managed object using refreshObject:mergeChanges:NO - At this time, the didTurnIntoFault notification is received by the object.
'Unfault' the object again by using willAccessValueForKey:nil - At this time, the awakeFromFetch notification is supposed to be received BUT NO NOTIFICATION COMES. All code relying on it firing fails, and the bread in the toaster burns :)
The interesting thing is that if I 'save' the managed object context before performing step 2, everything works okay and the awakeFromFetch notification comes as expected.
Currently the workaround that I am using is 'saving' the context at regular intervals, but that is more of a hack since we actually need to save the context once (when the application terminates).
Googling has so far returned nothing concrete, except a gentleman here that seems to have run into the same problem.
So my question is twofold - Is this really a bug, and if it is, then what other walkarounds (sic) do you suggest.
EDIT: THIS IS NOT A BUG BUT THAT WAS JUST ME BEING STUPID. See, if I turn an object to fault without saving it, then there is no history of the object to maintain. So in this case (i.e for an unsaved object) there is no logical concept of awakeFromFetch (since it was never saved). Please do let me know if I am still getting it all mixed up.
Anyways, it turns out my 'actual' problem was somewhere else - hidden well behind 2 gotchas:
If you use refreshObject:mergeChanges:NO to turn an object to fault in order to break any retain cycles that core data might have established, you have to do the same for the child objects also - Each child object which might have gotten involved in a cyclic retain with someone else will have to be manually faulted. What I had (wrongly) assumed was that faulting the parent will automatically break the cycles amongst the children.
The reverseTransform function of your custom transformers will NOT be called when such an object (i.e. one which has been forcefully faulted) is resurrected by firing a fault on it. This in my eyes IS a bug, since there is no other way for me to know when the object is alive again. Anyways, the workaround in this case was to set the staleness interval to an arbitrarily low value so that Core Data skips its cache and always calls the reverseTransform function to resurrect the object. Better suggestions are welcome.
It really has been one of those days :)

Resources