Singleton to hold collections of rows - JSF

In a web app, I need to display a view of 6 objects (rows of a database table) per page via JSF. Advancing to the next view should show a different random set of 6 objects, and so on.
So I was thinking of having an @Singleton that queries all the rows of the table with an @Schedule job at a given interval, say every hour. It would expose a getCollection() method.
Every visitor would then have an @SessionScoped CDI bean that reads the collection from the @Singleton and shuffles it to produce a random view for that specific user.
Since there will be many visitors, many CDI beans will be created and will access the getCollection() method concurrently.
Is this approach correct? Are any specific annotations needed for this case? Is there another way of doing this?
----- UPDATE -----
After speaking with friends, especially Luiggi Mendoza, they tell me the best approach here is to use EhCache or something similar instead of a Singleton. I think that's the way to go.

Will you have a cluster of web servers? In that case you need a distributed cache, or you need to update state through the database.
Otherwise I would just go for a simple map in a bean that is @ApplicationScoped.
I use JPA in a solution where a large set of data is almost always the same. There is more than one Tomcat involved, so a pure cache in an @ApplicationScoped bean won't work. To deal with that I have no second-level cache and instead cache the results of the database queries, which means each Tomcat has its own cache.
For every login a timestamp is read; if the data in the cache is not stale it is used, otherwise the cache is refreshed. The timestamp is updated when changes occur, with the help of database triggers.
@PrePersist
@PreUpdate
@PreRemove
public void newTimeStamp() {
    // save a new timestamp
}
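As a rough illustration of the query-result cache described above, here is a minimal sketch of an @ApplicationScoped bean that re-reads the timestamp on each access and reloads only when it has changed. The Row and TimestampEntity names, the JPQL, and the injection style are assumptions for the example, not the original code:

import java.util.Collections;
import java.util.List;
import javax.enterprise.context.ApplicationScoped;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@ApplicationScoped
public class CatalogCache {

    @PersistenceContext
    private EntityManager em;

    private volatile long cachedStamp = Long.MIN_VALUE;
    private volatile List<Row> cachedRows = Collections.emptyList();

    // Returns the cached rows, reloading them only when the stored timestamp has moved.
    public List<Row> getRows() {
        long currentStamp = loadTimestamp();
        if (currentStamp != cachedStamp) {
            // Cache is stale: re-run the query and remember the new timestamp.
            cachedRows = Collections.unmodifiableList(
                    em.createQuery("select r from Row r", Row.class).getResultList());
            cachedStamp = currentStamp;
        }
        return cachedRows;
    }

    private long loadTimestamp() {
        // Adjust to wherever the triggers/callbacks write the "last changed" marker.
        Number stamp = (Number) em.createQuery(
                "select max(t.updatedAt) from TimestampEntity t").getSingleResult();
        return stamp == null ? Long.MIN_VALUE : stamp.longValue();
    }
}

Concurrent requests may occasionally reload at the same time, which is usually harmless for a read-only cache; add synchronization if the reload itself is expensive.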
@Singleton is not part of the CDI specification, so I would refrain from using it.
Also, I would keep my client bean @RequestScoped and reload the 6 objects in @PostConstruct. That way you get a fresh set of 6 for every request.
If that is too short-lived, consider @ViewScoped (requires MyFaces CODI), @ConversationScoped, or @ViewAccessScoped (requires MyFaces CODI).
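Putting the pieces together, a sketch of the @RequestScoped client bean loading its 6 random objects in @PostConstruct could look like this (it assumes the hypothetical CatalogCache and Row from the sketch above; the bean and method names are invented for illustration):

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.annotation.PostConstruct;
import javax.enterprise.context.RequestScoped;
import javax.inject.Inject;
import javax.inject.Named;

@Named
@RequestScoped
public class RandomRowsBean {

    @Inject
    private CatalogCache cache;   // the application-scoped cache sketched earlier

    private List<Row> pageRows;   // the 6 rows shown for this request

    @PostConstruct
    void pickRandomRows() {
        List<Row> copy = new ArrayList<>(cache.getRows());
        Collections.shuffle(copy);                            // new random order per request
        pageRows = copy.subList(0, Math.min(6, copy.size()));
    }

    public List<Row> getPageRows() {
        return pageRows;          // referenced from the page, e.g. #{randomRowsBean.pageRows}
    }
}

With @ViewScoped or @ConversationScoped the same bean would simply keep its 6 rows for longer than a single request.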

Related

Should I use Hazelcast to detect duplicate requests to a REST service

I have a simple use case. I have a system where duplicate requests to a REST service (with dozens of instances) are not allowed. However, they are also difficult to prevent because of a complicated datastore configuration and downstream services.
So the only way I can prevent duplicate "transactions" is to have some centralized place where I write a unique hash of a request data. Each REST endpoint first checks if the hash of a new request already exists and only proceeds if no such hash exists.
For purposes of this question assume that it's not possible to do this with database constraints.
One solution is to create a table in the database where I store my request hashes and always write to this table before proceeding with the request. However, I want something lighter than that.
Another solution is to use something like Redis and write my unique hashes to Redis before proceeding with the request. However, I don't want to spin up a Redis cluster and maintain it, etc.
I was thinking of embedding Hazelcast in each of my app instances and write my unique hashes there. In theory, all instances will see the hash in the memory grid and will be able to detect duplicate requests. This solves my problem of having a lighter solution than a database and the other requirement of not having to maintain a Redis cluster.
OK, now for my question, finally: is it a good idea to use Hazelcast for this use case?
Will Hazelcast be fast enough to detect duplicate requests that come in milliseconds or microseconds apart?
Say request 1 comes into instance 1 and request 2 comes into instance 2 microseconds apart. Instance 1 writes a hash of the request to Hazelcast, and instance 2 checks Hazelcast for the existence of that hash only milliseconds later; will the hash be detected? Is Hazelcast going to propagate the data across the cluster in time? Does it even need to do that?
Thanks in advance, all ideas are welcome.
Hazelcast is definitely a good choice for this kind of use case. Especially if you just use a Map<String, Boolean> and test with Map::containsKey instead of retrieving the element and checking for null. You should also set a TTL when putting the element, so you won't run out of memory. However, same as with Redis, we recommend using Hazelcast with a standalone cluster for "bigger" datasets, as the lifecycle of cached elements normally interferes with the rest of the application and complicates GC optimization. Running Hazelcast embedded is a choice that should be taken only after serious consideration and testing of your application at runtime.
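A minimal sketch of that idea, assuming an embedded member and a map named "request-hashes" (both the map name and the wrapper class are invented for illustration). Instead of a separate containsKey followed by put, this uses IMap.putIfAbsent with a TTL, so the check-and-record step is a single atomic cluster operation:

import java.util.concurrent.TimeUnit;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;

public class DuplicateRequestGuard {

    private final IMap<String, Boolean> seenHashes;

    public DuplicateRequestGuard(HazelcastInstance hazelcast) {
        this.seenHashes = hazelcast.getMap("request-hashes");   // placeholder map name
    }

    // Returns true only for the first occurrence of the hash; the entry expires after 10 minutes.
    // putIfAbsent is atomic across the cluster, so two instances racing on the same hash
    // cannot both be told they were first.
    public boolean isFirstRequest(String requestHash) {
        return seenHashes.putIfAbsent(requestHash, Boolean.TRUE, 10, TimeUnit.MINUTES) == null;
    }

    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        DuplicateRequestGuard guard = new DuplicateRequestGuard(hz);
        System.out.println(guard.isFirstRequest("abc123"));   // true  - first time seen
        System.out.println(guard.isFirstRequest("abc123"));   // false - duplicate detected
        hz.shutdown();
    }
}

Whether the member is embedded or the client talks to a standalone cluster, the putIfAbsent call looks the same from the caller's side.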
Yes, you can use a Hazelcast distributed map to detect duplicate requests to a REST service: whenever a put happens on the Hazelcast map, the data becomes available to all the other clustered instances.
From what I've read and seen in tests, it doesn't actually replicate. It uses a data grid to distribute the primary data evenly across all the nodes, rather than each node keeping a full copy of everything and replicating to stay in sync. The great thing about this is that there is no data lag of the kind inherent to any replication strategy.
There is a backup copy of each node's data stored on another node, and that obviously depends on replication, but the backup copy is only used when a node crashes.
See the code below, which creates two clustered Hazelcast instances and gets the distributed map. One Hazelcast instance puts data into the distributed IMap and the other instance reads it back from the IMap.
import com.hazelcast.config.Config;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;

public class TestHazelcastDataReplication {

    // Create 1st instance
    public static final HazelcastInstance instanceOne = Hazelcast
            .newHazelcastInstance(new Config("distributedFirstInstance"));

    // Create 2nd instance
    public static final HazelcastInstance instanceTwo = Hazelcast
            .newHazelcastInstance(new Config("distributedSecondInstance"));

    // Insert into distributedMap using instance one
    static IMap<Long, Long> distributedInsertMap = instanceOne.getMap("distributedMap");

    // Read from distributedMap using instance two
    static IMap<Long, Long> distributedGetMap = instanceTwo.getMap("distributedMap");

    public static void main(String[] args) {
        new Thread(new Runnable() {
            @Override
            public void run() {
                for (long i = 0; i < 100000; i++) {
                    // Insert data into distributedMap using the 1st instance
                    distributedInsertMap.put(i, System.currentTimeMillis());
                    // Read data from distributedMap using the 2nd instance
                    System.out.println(i + " : " + distributedGetMap.get(i));
                }
            }
        }).start();
    }
}

Storing a bean during a Batch Job

How can a CDI bean be stored for the entire duration of a batch job?
I have a chunk job that I would like to monitor. I thought I could do it with a bean instance scoped to the entire duration of the job.
It's not really feasible to do this. Serialization doesn't work quite the way most people think for CDI.
What I would recommend is having your batch job fire events at various points and using CDI to observe those events and take action.
You could also develop a custom scope for the batch and use it that way, but if you end up with multiple nodes executing the batch job it won't work quite right.
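To make the event-based suggestion concrete, here is a rough sketch using a JSR-352 chunk listener that fires a CDI event after every chunk. The ChunkCompleted payload and the JobMonitor observer are invented names, and the listener still has to be registered on the chunk step in the job XML:

import javax.batch.api.chunk.listener.AbstractChunkListener;
import javax.enterprise.context.ApplicationScoped;
import javax.enterprise.event.Event;
import javax.enterprise.event.Observes;
import javax.inject.Inject;
import javax.inject.Named;

// Simple payload carrying whatever progress information you want to track.
class ChunkCompleted {
    final long timestamp = System.currentTimeMillis();
}

// Referenced from the chunk step in the job XML; fires a CDI event after each chunk.
@Named
public class ChunkProgressListener extends AbstractChunkListener {

    @Inject
    Event<ChunkCompleted> chunkEvents;

    @Override
    public void afterChunk() throws Exception {
        chunkEvents.fire(new ChunkCompleted());
    }
}

// Any CDI bean can observe the events and accumulate monitoring state.
@ApplicationScoped
class JobMonitor {

    private int chunksCompleted;

    void onChunkCompleted(@Observes ChunkCompleted event) {
        chunksCompleted++;   // or record timings, publish metrics, etc.
    }

    public int getChunksCompleted() {
        return chunksCompleted;
    }
}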

Using dataCache property in view control in XPages

In the data source of a view control there is a dataCache property with options of Full, ID and NoData. From some sources I gather that:
Full - The entire view is persisted
ID - Minimal scalar data: ID and position. Access to column values during a POST is not available
None - Enough said – the entire view needs to be reconstructed
But how exactly does this property affect performance of the XPage? Which methods/functionality can I use with each of these options? What is the suitability of each option?
I haven't tested, but I would presume the following are true. The more you persist in memory, the quicker it is to restore for expand/collapse etc. However, the more users and the bigger the view (number of columns rather than number of documents, because not all documents will get cached), the greater the risk of out-of-memory issues. The restriction on access to column values means you may have problems using getColumnValues() in SSJS from the view.
XPages is pretty quick, so unless you have specific performance issues, the default should be sufficient.

Core Data - How often to fetch objects?

So I do a lot of fetching of my objects. At startup, for instance, I set an unread count for the badge on a tab. To get this unread count I need to fetch my data model objects to see which objects have the unread flag, so there is one fetch. Then right after that method I do another fetch of all my data model objects to do something else. And then in the view controller I need to display my data model objects, so I do another fetch there, and so on.
So there are a lot of calls like this: NSArray *dataModelObjects = [moc executeFetchRequest:request error:&error];
This seems kind of redundant to me. Since I will be working a lot with my data model objects, can I not just fetch them once in the application and access them through an instance variable whenever I need them? But I always want to have up-to-date data, since data model objects can be added and/or deleted.
Am I making any sense about what I want to achieve here?
One of the concepts and benefits of Core Data is that you don't need to access the database every time you need an object; that's why NSManagedObjectContext was created. It stores the objects retrieved from the database, so if you try to get an object you've already fetched from the database, it will be really fast.
And every change made to those objects in the NSManagedObjectContext is automatically visible to you.
But if changes happen in the database, they may not be reflected in the NSManagedObjectContext, so you'll have to refresh the objects. You can read about keeping objects up-to-date here.

CoreData design pattern: persisting a single object of many - or - how many NSManagedObjectContexts should I have?

I'm converting an app from SQLitePersistentObjects to CoreData.
In the app, I have a class that I generate many* instances of from an XML file retrieved from my server. The UI can trigger actions that will require me to save some* of those objects until the next invocation of the app.
Other than having a single NSManagedObjectContext for each of these objects (shared only with their subservient objects, which can include blobs), I can't see a way to have fine-grained control (i.e. at the object level) over which objects are persisted. If I try to have a single context for all newly created objects, I get an exception when I try to move one of my objects to a new context so I can persist it on its own. I'm guessing this is because the objects it owns are left in the 'old' context.
The other option I see is to have a single context, persist all my objects and then delete the ones I don't need later - this feels like it's going to hit the database too much, but maybe CoreData does magic.
So:
1. Am I missing something basic about the way my CoreData app should be architected?
2. Is having a context per object a good design pattern?
3. Is there a better way to move objects between contexts to avoid 2?
* where "many" means "tens, maybe hundreds, not thousands" and "some" is at least one order of magnitude less than "many"
Also cross posted to the Apple forums.
Core Data is really not an object persistence framework. It is an object graph management framework that just happens to be able to persist that graph to disk (see this previous SO answer for more info). So trying to use Core Data to persist just some of the objects in an object graph is going to be working against the grain. Core Data would much rather manage the entire graph of all objects that you're going to create. So, the options are not perfect, but I see several (including some you mentioned):
1. You could create all the objects in the Core Data context, then delete the ones you don't want to save. Until you save the context, everything is in-memory, so there won't be any "going back to the database" as you suggest. Even after saving to disk, Core Data is very good at caching instances in the context's row cache, and there is surprisingly little overhead to just letting it do its thing and not worrying about what's on disk and what's in memory.
2. If you can create all the objects first, then do all the processing in-memory before deciding which objects to save, you can create a single NSManagedObjectContext with a persistent store coordinator having only an in-memory persistent store. When you decide which objects to save, you can then add a persistent (XML/binary/SQLite) store to the persistent store coordinator, assign the objects you want to save to that store (using the context's (void)assignObject:(id)object toPersistentStore:(NSPersistentStore *)store) and then save the context.
3. You could create all the objects outside of Core Data, then copy the objects to-be-saved into a Core Data context.
4. You can create all the objects in a single in-memory context and write your own methods to copy those objects' properties and relationships to a new context to save just the instances you want. Unless the entities in your model have many relationships, this isn't that hard (see this page for tips on migrating objects from one store to another using a multi-pass approach; it describes the technique in the context of versioning managed object models and is no longer needed in 10.5 for that purpose, but the technique would apply to your use case as well).
Personally, I would go with option 1 -- let Core Data do its thing, including managing deletions from your object graph.
