Storing job-specific data - multithreading

In my Play application I have several Jobs and a singleton class.
I would like each Job to store data in the singleton, and I would like another class to be able to retrieve from the singleton the data that corresponds to the current Job.
In other words, I would like to have something like this:
-Job 1 stores "Job1Data" in the singleton class
-Job 2 stores "Job2Data" in the singleton class
-Another class asks the singleton class for the data of the currently executing job (in the current thread, I guess) and uses it
To do this, I assumed each Job runs on a different thread. The data from each Job is stored in the singleton in a Map that maps the current thread id to the data.
However, I'm not sure this is the right approach: it may not be thread-safe (although Hashtable is said to be thread-safe), and a new thread may be created each time a Job is executed, which would make my Map grow a lot and never clear itself.
I thought of another way to do what I want. Maybe I could use the ThreadLocal class in my singleton to be sure it's thread-safe and that I store thread-specific data. However, I don't know if it will work well if a different thread is used each time a Job executes. Furthermore, I read somewhere that ThreadLocal creates memory leaks if the data is not removed, and the problem is that I don't know when I can remove the data.
So, would anybody have a solution for my issue? I would like to be sure that the data I store during Job execution is kept in a global class and can be accessed by another class (with access to the data of the correct Job, and thus the correct thread, I guess).
Thank you for your help
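A minimal sketch of the ThreadLocal idea described above, assuming each Job body runs entirely on one thread. The names JobContext and runWith are hypothetical and are not part of Play. The data is bound just before the job body runs and removed in a finally block, which is one way to know when it is safe to clear it:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical holder for per-job data; the name JobContext is illustrative only.
final class JobContext {
    // One slot per thread; each job sees only the value set on its own thread.
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private JobContext() {}

    // Called by any other class running on the job's thread.
    static String current() {
        return CURRENT.get();
    }

    // Runs the job body with its data bound to the executing thread and
    // always clears the slot afterwards, so pooled threads do not keep
    // stale data alive.
    static void runWith(String jobData, Runnable body) {
        CURRENT.set(jobData);
        try {
            body.run();
        } finally {
            CURRENT.remove();
        }
    }
}

final class JobContextDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        // Each job prints its own data, whichever pool thread it lands on.
        Future<?> f1 = pool.submit(() ->
                JobContext.runWith("Job1Data", () -> System.out.println(JobContext.current())));
        Future<?> f2 = pool.submit(() ->
                JobContext.runWith("Job2Data", () -> System.out.println(JobContext.current())));
        f1.get();
        f2.get();
        pool.shutdown();
    }
}
```

Because the value is removed when the job body returns, it does not matter whether a job gets a fresh thread or a reused one. This would not cover work that the job hands off to other threads.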

Related

Singleton in multi threaded environment

I have a singleton class called FILELOGGER with a property called number_of_lines.
I make sure only one object can be created for the FILELOGGER class, which makes it a singleton.
Throughout my application, this object writes to a file and updates the number_of_lines property on each write.
What if I use this design pattern in a multithreaded environment? How does it behave? I feel that the number_of_lines property should be locked while other threads are trying to update it, but then I might lose log data because of delays, and performance will suffer.
Say, for example, thread T1 is logging at 10:10:10 and T2 is logging at exactly the same time, and both try to update the number_of_lines property.
How can I solve this problem? Is there an alternative design pattern for this? Thanks for your time.
You can either synchronize access to the whole file as you've already done, or there's an alternative with some cons: snapshotting.
N threads write file contents to a string variable.
A dedicated thread does snapshots of in-memory data to disk and updates number_of_lines. number_of_lines will be synchronized when this dedicated thread needs to update it. Snapshotting may occur in time intervals like 10 seconds, 1 minute, 1 hour...
The main issue with this approach is an application/system crash will mean losing the data that wasn't persisted to disk since the last snapshot, but since your application works with in-memory data, it should increase overall performance.
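As a rough illustration of the snapshotting idea, here is a sketch in Java. It assumes a queue rather than a single string variable as the in-memory buffer, and the class and method names (SnapshotLogger, log, snapshot) are hypothetical:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical snapshotting logger; class and method names are illustrative.
final class SnapshotLogger {
    private final Path file;
    // Writers only touch this lock-free queue, never the file.
    private final ConcurrentLinkedQueue<String> pending = new ConcurrentLinkedQueue<>();
    // Counts only the lines that have actually been written to disk.
    private final AtomicLong numberOfLines = new AtomicLong();

    SnapshotLogger(Path file) {
        this.file = file;
    }

    // Called from any thread; does no I/O.
    void log(String line) {
        pending.add(line);
    }

    long numberOfLines() {
        return numberOfLines.get();
    }

    // Intended for the single dedicated snapshot thread; synchronized so
    // that two overlapping snapshots cannot interleave their writes.
    synchronized void snapshot() throws IOException {
        List<String> batch = new ArrayList<>();
        for (String line; (line = pending.poll()) != null; ) {
            batch.add(line);
        }
        if (batch.isEmpty()) {
            return;
        }
        Files.write(file, batch, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        numberOfLines.addAndGet(batch.size());
    }
}
```

A ScheduledExecutorService could call snapshot() at a fixed interval, for example every 10 seconds. Anything still in the queue when the process dies is lost, which is the trade-off described above.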
Also, you have to implement the singleton pattern in a thread-safe way. I think the best approach is a static inner (holder) class, which guarantees a single instance in a multithreaded application.
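The inner-class approach is commonly known in Java as the initialization-on-demand holder idiom. A minimal sketch, assuming a Java implementation and using an AtomicLong for the line counter (the recordLine method is a hypothetical stand-in for the real file write):

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of a singleton built with a nested holder class.
final class FileLogger {
    // Atomic counter, so concurrent updates are never lost.
    private final AtomicLong numberOfLines = new AtomicLong();

    private FileLogger() {}

    // The JVM initializes Holder lazily and exactly once, on first access,
    // so no explicit locking is needed to create the instance.
    private static final class Holder {
        static final FileLogger INSTANCE = new FileLogger();
    }

    static FileLogger getInstance() {
        return Holder.INSTANCE;
    }

    // Placeholder for the real file write; only the counter update is shown.
    long recordLine() {
        return numberOfLines.incrementAndGet();
    }

    long numberOfLines() {
        return numberOfLines.get();
    }
}
```

This only makes instance creation and the counter safe. The file write itself would still need its own synchronization or a buffering scheme.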

How to run a Haskell code by its name?

First of all, I'm new to Haskell and I'm curious how I can implement something that I have working in Java.
A bit of background:
I have a Java-based project with workers. Workers can be started periodically and do some work. A worker is just a Java class with some functionality implemented. New workers can be added and configured. Workers' names and parameters are stored in a database. Periodically, the application gathers the workers' details (names, params) from the DB and starts them. In Java, to start a worker by its name, I do the following:
Class<?> myClass = Class.forName("com.mycompany.superworker");
Constructor<?> constructor = myClass.getConstructor();
MyClass myInstance = (MyClass) constructor.newInstance();
myInstance.run("param1", "param2");
Hence, the application gets the worker's name (the actual class name) from the DB, gets the class and its constructor, creates a new instance of the class, and then just runs it.
Now, the question: how could I implement something similar in Haskell? I mean, if I have some function/module/class/whatever implemented in Haskell and I have its name stored in plain text, how can I run this code by its name (from the main Haskell-based application, of course)?
Thanks
UPDATE
A bit more about the app...
I have an application that grabs data from the Internet, does some parsing, and puts the result in a DB. We grab data from a variety of websites, so we have a variety of parsers. These parsers are the workers. A user can implement their own worker (a Java class) and put its details into the DB via the UI. So we store the names of workers (and their params) in the DB. When it's time, we go to the DB, gather the workers' class names, and instantiate and start every worker.
Workers do not need to communicate with each other, and the application doesn't need to communicate with the workers either. The application just starts a worker; the worker grabs data from the web, does some parsing, and puts the result into the DB. That's it.
So, a worker can be launched as a separate process.
The main problem (for me) is that we don't have a fixed number of workers. A user can implement their own worker, compile it, and restart the application, and the application should know how to start this new worker. So we store the workers' class names in the DB and use Java reflection to launch them.
I'm looking for how such an app could be written in Haskell, in a Haskell way, not necessarily by just copying the existing Java way.

why does NSManagedObject changedValues return all modified attributes and relationships even after saving changes in child MOC?

My setup is one main moc with a persistent store coordinator and an SQLite persistent store.
I'm trying to asynchronously (and potentially concurrently) retrieve data from a server, parse it into CoreData objects and then save those new objects into the persistent store and have them available in the main moc.
So I tried 2 approaches:
1. Each time I go fetch from the server, I do so inside a GCD block (concurrent global queue with normal priority) where I create a new context with NSConfinementConcurrencyType which shares the persistent store coordinator with the main moc. When I'm done parsing the JSON and have the new managed objects, I save this "local" context which sends NSManagedObjectContextDidSaveNotification to the main context, which in turn does the merge.
2. Each time I go fetch from the server, I don't dispatch a GCD block but rather I create a child context with NSPrivateQueueConcurrencyType. This context has the main context as parent but no store coordinator. Then I call -performBlock: on the child context, where I parse the JSON into CoreData and tell the child context to save, which in turn triggers the main context to merge.
Now, what I noticed is that approach 1 triggers what seems to be an exception but otherwise works. I say this because if I set a generic exception breakpoint to break on any Objective-C throw, it always halts when a local-to-GCD-block context saves. It's always a thread other than the main one, and even though it looks like an exception, the save error out param is nil after the save. What's more, the objects in the main context seem consistent (since I know the data they are supposed to have). And calling -changedValues on any of these objects (after the main context merged) returns no values, which is what I'd expect.
For the second approach, I don't get the exception breakpoint halting anywhere (which seems good), but... while the right data is in the right objects after the main context merge, calling -changedValues returns all the values (attributes and/or relationships) that were populated in the child context. This I would not expect, since in theory, I did save and the save should have been pushed up to the main context and the main context did merge.
So I'm confused.
I need -changedValues to only return values that were changed after the main context saved, since I use these values to figure out that my app has changed a mo's state and the new state needs to be pushed back to the server.
I'd really appreciate any help / pointers with either approach 1 or 2.
A child context saves into its parent. That is, once the child has saved its changes, these changes appear in the parent, and within the parent they are flagged as changed because the parent still needs to save them.
I'd generally discourage use of parent-child context setups, because they have a lot of drawbacks. More info in the Working with Multiple Contexts chapter in our book.

Silverlight Multithreading; Need to Synchronize?

I have a Silverlight app where I've implemented the M-V-VM pattern, so my actual UI elements (Views) are separated from the data (Models). At one point, after the user has made some selections and possibly other input, I'd like to asynchronously go through the model, scan it, and compile a list of options that the user has changed (different from the default). Eventually I'd update the UI with that as a summary, but that would be a final step.
My question is this: if I use a background worker to do this, then up until I actually want to do the UI updates I only read current values in one of my models, so I don't have to synchronize access to the model, right? I'm not modifying data, just reading current values...
There are lists (ObservableCollections), so I will have to call methods of those collections like "_ABCCollection.GetSelectedItems()", but again I'm just reading, not making changes. Since they are not primitives, will I have to synchronize access to them for just reads, or does that not matter?
I assume I'll have to synchronize my final step, as it will cause PropertyChanged events to fire and eventually the Views will request the new data through the bindings...
Thanks in advance for any and all advice.
You are correct. You can read from your Model objects and ObservableCollections on a worker thread without having a cross-thread violation. Getting or setting the value of a property on a UI element (more specifically, an object that derives from DispatcherObject) must be done on the UI thread (more specifically, the thread on which the DispatcherObject subclass instance was created). For more info about this, see here.

How do you create a non-Thread-based Guice custom Scope?

It seems that all Guice's out-of-the-box Scope implementations are inherently Thread-based (or ignore Threads entirely):
Scopes.SINGLETON and Scopes.NO_SCOPE ignore Threads and are the edge cases: global scope and no scope.
ServletScopes.REQUEST and ServletScopes.SESSION ultimately depend on retrieving scoped objects from a ThreadLocal<Context>. The retrieved Context holds a reference to the HttpServletRequest that holds a reference to the scoped objects stored as named attributes (where name is derived from com.google.inject.Key).
Class SimpleScope from the custom scope Guice wiki also provides a per-Thread implementation using a ThreadLocal<Map<Key<?>, Object>> member variable.
With that preamble, my question is this: how does one go about creating a non-Thread-based Scope? It seems that something that I can use to look up a Map<Key<?>, Object> is missing, as the only things passed in to Scope.scope() are a Key<T> and a Provider<T>.
Thanks in advance for your time.
It's a bit unclear what you want - you don't want scopes that are based on threads, and you don't want scopes that ignore threads.
But yes, scopes are intended to manage the lifecycle of an object and say when an instance should be reused. So really you're asking "what are the other possibilities for re-using an instance beyond 'always use the same instance', 'never use the same instance', and 'use an instance depending on the execution environment of the current thread'?"
Here's what comes to mind:
Use the same instance for a fixed amount of time. The example here would be of a configuration file that's reloaded and reparsed every ten minutes.
Perform some network call to query whether a given object should be re-used (maybe it's a fast call to determine whether we need to reconstruct the object, but the call for reconstructing the object is slow)
Re-use the same object until some outside call comes in telling us to reload
Re-use the same object per thread, but not with a scope that's explicitly entered and left like the servlet scopes. (So one instance per thread)
A "this thread and child threads" scope that is based on an InheritableThreadLocal, not a plain ThreadLocal.
Related to that, a Scope and a threadpool-based ExecutorService that work together so that instances are shared between a thread and jobs it submits for background execution.
Pull instances out of a pool; this is tricky, since we'd need a good way to return objects to the pool when finished. (Maybe you could combine this idea with something like the request scope, so that objects can be returned to the pool when the request ends)
A scope that composes two or more other scopes, so for example we could get a configuration object that is re-read every 10 minutes except that the same instance is used through the lifetime of a given request.
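To make the first idea in the list concrete (reuse an instance for a fixed amount of time), here is a sketch of the caching logic using only the JDK. It is not Guice code: java.util.function.Supplier stands in for Guice's Provider, and the class name ExpiringSupplier and the injectable clock are assumptions made for illustration. In an actual Scope, scope(Key<T>, Provider<T>) would presumably return a Provider that wraps the unscoped one in the same way, so the state lives in the returned wrapper rather than in a Map that has to be looked up.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical time-based caching wrapper. Supplier stands in for Guice's
// Provider so that the sketch compiles with the JDK alone.
final class ExpiringSupplier<T> implements Supplier<T> {
    private final Supplier<T> unscoped;   // creates a fresh instance on demand
    private final long ttlMillis;         // how long an instance is reused
    private final LongSupplier clock;     // injectable so tests can control time
    private T instance;
    private long createdAt;
    private boolean initialized;

    ExpiringSupplier(Supplier<T> unscoped, long ttlMillis, LongSupplier clock) {
        this.unscoped = unscoped;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Returns the cached instance until it is ttlMillis old, then rebuilds it.
    // synchronized so concurrent callers never trigger two rebuilds at once.
    @Override
    public synchronized T get() {
        long now = clock.getAsLong();
        if (!initialized || now - createdAt >= ttlMillis) {
            instance = unscoped.get();
            createdAt = now;
            initialized = true;
        }
        return instance;
    }
}
```

In production the clock argument could be System::currentTimeMillis, with a ttlMillis of 600000 for the ten-minute configuration example.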
