Hazelcast Management Center partition groups monitoring

How can I monitor partition groups using Hazelcast Management Center to validate my configuration?
Is there a way to understand how my custom partition grouping works?
Hazelcast Management Center seems to be missing that feature, and it is also not possible to see it in the logs.

It is possible to see the partition group config of a member in the Member details page under the "Member Configuration" box. There you can see the effective configuration of a member, including the partition group config.
I'm not sure what you mean by "how my custom partition grouping works". If there is a feature you would like to see added, you can create an issue in the Hazelcast GitHub repository. Please make sure to explain what you want in more precise terms if you decide to do so.
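To sanity-check what should appear there, the sketch below configures a CUSTOM partition group programmatically. It is only a rough illustration: the Hazelcast 3.x-style programmatic API and the interface patterns (10.10.1.*, 10.10.2.*) are assumptions, not details from the original question.

    import com.hazelcast.config.Config;
    import com.hazelcast.config.MemberGroupConfig;
    import com.hazelcast.config.PartitionGroupConfig;
    import com.hazelcast.config.PartitionGroupConfig.MemberGroupType;
    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;

    public class CustomPartitionGroupExample {
        public static void main(String[] args) {
            Config config = new Config();

            // Enable CUSTOM partition grouping: backup replicas are placed in a
            // different member group than the primary replica.
            PartitionGroupConfig pgConfig = config.getPartitionGroupConfig();
            pgConfig.setEnabled(true);
            pgConfig.setGroupType(MemberGroupType.CUSTOM);

            // Hypothetical groups keyed by IP ranges; adjust to your network.
            MemberGroupConfig groupA = new MemberGroupConfig();
            groupA.addInterface("10.10.1.*");
            MemberGroupConfig groupB = new MemberGroupConfig();
            groupB.addInterface("10.10.2.*");
            pgConfig.addMemberGroupConfig(groupA);
            pgConfig.addMemberGroupConfig(groupB);

            HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);
            // These are the settings the "Member Configuration" box should reflect.
            System.out.println(hz.getConfig().getPartitionGroupConfig());
        }
    }

Once members whose addresses fall into different groups have started, the same group type and member groups should show up in the effective configuration described above.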

Related

Clustered HTTP sessions using Hazelcast in jhipster generator

Which options do I need to pick during application generation to be able to select the option:
Clustered HTTP sessions using Hazelcast
This option is mentioned in the documentation https://www.jhipster.tech/creating-an-app/#2
But somehow I am not able to pick it. Is it automatically selected when I choose Hazelcast as the cache provider?
Is there an equivalent yo-rc.json setting?
It has been removed with this PR following a public vote on our mailing list. So obviously the documentation has not been updated.

What is the recommended approach towards multi-tenant databases in Cassandra?

I'm thinking of creating a multi-tenant app using Apache Cassandra.
I can think of three strategies:
All tenants in the same keyspace using tenant-specific fields for security
Table per tenant in a single shared DB
Keyspace per tenant
The voice in my head is suggesting that I go with option 3.
Thoughts and implications, anyone?
There are several considerations that you need to take into account:
Option 1: In pure Cassandra this option will work only if access to the database always goes through a "proxy" - for example an API that enforces filtering on the tenant field. Otherwise, if you provide direct CQL access, everybody can read all data. In this case you also need to design the data model carefully, with the tenant as part of a composite partition key. DataStax Enterprise (DSE) has additional functionality called row-level access control (RLAC) that lets you restrict which rows of a table each user can access.
Options 2 & 3: these are quite similar, except that with a keyspace per tenant you have the flexibility to set up a different replication strategy for each tenant - this can be useful for storing a customer's data in data centers bound to different geographic regions. But in both cases there are limits on the number of tables in the cluster - a reasonable number of tables is around 200, with a "hard stop" at more than 500. The reason is that every table needs additional resources, such as memory for auxiliary data structures (bloom filters, etc.), and this consumes both heap and off-heap memory.
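To make options 1 and 3 concrete, here is a rough sketch using the DataStax Java driver 4.x; the driver version, keyspace and table names, data-center name, and replication factor are all assumptions made for illustration.

    import com.datastax.oss.driver.api.core.CqlSession;

    public class TenantProvisioningSketch {
        public static void main(String[] args) {
            // Connects to localhost:9042 with default driver settings.
            try (CqlSession session = CqlSession.builder().build()) {

                // Option 3: one keyspace per tenant, each with its own replication
                // strategy (data-center name and RF are illustrative only).
                session.execute(
                    "CREATE KEYSPACE IF NOT EXISTS tenant_acme "
                  + "WITH replication = {'class': 'NetworkTopologyStrategy', 'eu_dc': 3}");

                // Option 1: a single shared keyspace where the tenant id is part of
                // a composite partition key, so every query is scoped to a tenant.
                session.execute(
                    "CREATE KEYSPACE IF NOT EXISTS shared "
                  + "WITH replication = {'class': 'NetworkTopologyStrategy', 'eu_dc': 3}");
                session.execute(
                    "CREATE TABLE IF NOT EXISTS shared.orders ("
                  + "  tenant_id text, order_id uuid, amount decimal,"
                  + "  PRIMARY KEY ((tenant_id, order_id)))");
            }
        }
    }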
I've done this for a few years now at large scale in the retail space. So my belief is that the recommended way to handle multi-tenancy in Cassandra is not to. No matter how you do it, the tenants will be hit by the "noisy neighbor" problem. Just wait until one tenant runs a BATCH update with 60k writes batched to the same table, and everyone else's performance falls off.
But the bigger problem is that there's no way you can guarantee that each tenant will even have a similar ratio of reads to writes. In fact they will likely be quite different. That's going to be a problem for options #1 and #2, as disk IOPS will be going to the same directory.
Option #3 is really the only way it realistically works. But again, all it takes is one ill-considered BATCH write to crush everyone. Also, want to upgrade your cluster? Now you have to coordinate it with multiple teams, instead of just one. Using SSL? Make sure multiple teams get the right certificate, instead of just one.
When we have new teams use Cassandra, each team gets their own cluster. That way, they can't hurt anyone else, and we can support them with fewer question marks about who is doing what.

Configuring Distributed Objects Dynamically

I'm currently evaluating using Hazelcast for our software. Would be glad if you could help me elucidate the following.
I have one specific requirement: I want to be able to configure distributed objects (say maps, queues, etc.) dynamically. That is, I can't have all the configuration data at hand when I start the cluster. I want to be able to initialise (and dispose) services on-demand, and their configuration possibly to change in-between.
The version I'm evaluating is 3.6.2.
The documentation I have available (the Reference Manual, the Deployment Guide, as well as the "Mastering Hazelcast" e-book) is very skimpy on details w.r.t. this subject, and even partially contradictory.
So, to clarify an intended usage: I want to start the cluster; then, at some point, create, say, a distributed map structure, use it across the nodes; then dispose it and use a map with a different configuration (say, number of backups, eviction policy) for the same purposes.
The documentation mentions, and this is to be expected, that bad things will happen if nodes have different configurations for the same distributed object. That makes perfect sense and is fine; I can ensure that the configs will be consistent.
Looking at the code, it would seem to be possible to do what I intend: when creating a distributed object, if it doesn't already have a proxy, the HazelcastInstance will look at its Config to create a new one and store it in its local list of proxies. When that object is destroyed, its proxy is removed from the list. On the next invocation, it would reload from the Config. Furthermore, that Config is writable, so if it has been changed in between, it should pick up those changes.
So this would seem like it should work, but given how silent the documentation is on the matter, I'd like some confirmation.
Is there any reason why the above shouldn't work?
If it should work, is there any reason not to do the above? For instance, are there plans to change the code in future releases in a way that would prevent this from working?
If so, is there any alternative?
Changing the configuration on the fly for an already created distributed object is not possible with the current version, though there is a plan to add this feature in a future release. Once created, the map configs stay at the node level, not at the cluster level.
As long as you create the distributed map fresh from the config, use it, and then destroy it, your approach should work without any issues.
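For confirmation purposes, here is a minimal sketch of that create/use/destroy/reconfigure cycle, assuming the Hazelcast 3.x programmatic API; the map name and the particular settings are invented.

    import com.hazelcast.config.Config;
    import com.hazelcast.config.EvictionPolicy;
    import com.hazelcast.config.MapConfig;
    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.IMap;

    public class OnDemandMapExample {
        public static void main(String[] args) {
            HazelcastInstance hz = Hazelcast.newHazelcastInstance(new Config());

            // First "generation" of the map: 1 backup, LRU eviction.
            hz.getConfig().addMapConfig(new MapConfig("orders")
                    .setBackupCount(1)
                    .setEvictionPolicy(EvictionPolicy.LRU));
            IMap<String, String> orders = hz.getMap("orders");
            orders.put("o-1", "pending");

            // Dispose the proxy and its data once this generation is done.
            orders.destroy();

            // Change the config before the next use; the proxy (and its MapConfig)
            // is resolved again on the next getMap() call.
            hz.getConfig().addMapConfig(new MapConfig("orders")
                    .setBackupCount(2)
                    .setEvictionPolicy(EvictionPolicy.NONE));
            IMap<String, String> ordersV2 = hz.getMap("orders");
            ordersV2.put("o-2", "pending");

            hz.shutdown();
        }
    }

Whether the second MapConfig is honored relies exactly on the proxy-recreation behavior described in the question, so it is worth verifying against the specific 3.6.2 build being evaluated.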

Heterogeneous GridGain grid configuration

Is there any guidance on how to configure a grid with nodes that have different roles?
For example I have some nodes that can store data and others that should only have a near cache - effectively clients of the grid.
Is this done by specifying separate config files for each node type or by overriding settings on different nodes in code?
Also generally what are the resolution rules for deploying differing configurations for the same entity (e.g. a particular named cache), or is this the mechanism to solve the above?
Thanks, Joe
You can configure the same cache on different grid nodes with different distribution modes. I think this link is what you are looking for: Cache Distribution Mode
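As a rough illustration of the data-node vs. near-cache-client split, here is a sketch against the Apache Ignite 2.x API (the open-source core of later GridGain releases); class names differ in older GridGain versions, and the cache name and system property are invented.

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cache.CacheMode;
    import org.apache.ignite.configuration.CacheConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;
    import org.apache.ignite.configuration.NearCacheConfiguration;

    public class HeterogeneousGridSketch {
        public static void main(String[] args) {
            // Pass -DdataNode=true on nodes that should store data.
            boolean dataNode = Boolean.getBoolean("dataNode");

            // Same code base, different role per node: server nodes store data,
            // client nodes only hold a near cache.
            IgniteConfiguration cfg = new IgniteConfiguration()
                    .setClientMode(!dataNode);
            Ignite ignite = Ignition.start(cfg);

            CacheConfiguration<Integer, String> cacheCfg =
                    new CacheConfiguration<Integer, String>("sharedCache")
                            .setCacheMode(CacheMode.PARTITIONED);

            IgniteCache<Integer, String> cache;
            if (dataNode) {
                cache = ignite.getOrCreateCache(cacheCfg);
            } else {
                // Client node: attach a near cache on top of the partitioned cache.
                cache = ignite.getOrCreateCache(cacheCfg,
                        new NearCacheConfiguration<Integer, String>());
            }
            cache.put(1, "hello");
        }
    }

Keeping a single cache configuration and only varying the node role sidesteps the question of conflicting per-cache settings across nodes.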

Using Hazelcast as a service directory?

I am exploring the notion of using Hazelcast (or another caching framework) to advertise services within a cluster. Ideally, when a cluster member departs, its services (or the objects advertising them) should be removed from the cache.
Is this at all possible?
It is possible for sure.
The question is: which solution do you prefer?
If the services can be stored in a map, you could create a map with a TTL of, e.g., a few minutes, and each member would need to periodically refresh its own entries to prevent its services from expiring.
An alternative solution is to listen to member changes using a MembershipListener and, once a member leaves, remove the services that belong to that member from the map (see the sketch at the end of this answer).
If you don't like any of these, you could create your own SPI-based implementation. The SPI is the lower-level infrastructure used by Hazelcast to create its distributed data structures. A lot more work, but also a lot of flexibility.
So there are many solutions.
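Here is a rough sketch of the MembershipListener approach, assuming a Hazelcast 3.x-style API; the map name, key scheme, and endpoint values are invented.

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.IMap;
    import com.hazelcast.core.MemberAttributeEvent;
    import com.hazelcast.core.MembershipEvent;
    import com.hazelcast.core.MembershipListener;

    public class ServiceDirectory {
        public static void main(String[] args) {
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();

            // Hypothetical directory: key = "<memberUuid>:<serviceName>", value = endpoint URL.
            IMap<String, String> services = hz.getMap("service-directory");
            String myUuid = hz.getCluster().getLocalMember().getUuid();
            services.put(myUuid + ":billing", "http://10.0.0.5:8080/billing");

            // Remove a departed member's entries as soon as it leaves the cluster.
            hz.getCluster().addMembershipListener(new MembershipListener() {
                @Override
                public void memberAdded(MembershipEvent event) {
                    // nothing to do; members advertise their own services
                }

                @Override
                public void memberRemoved(MembershipEvent event) {
                    String goneUuid = event.getMember().getUuid();
                    services.keySet().stream()
                            .filter(key -> key.startsWith(goneUuid + ":"))
                            .forEach(services::remove);
                }

                @Override
                public void memberAttributeChanged(MemberAttributeEvent event) {
                    // not relevant for this sketch (3.x-only callback)
                }
            });
        }
    }

Every surviving member runs the cleanup, which is harmless because the removes are idempotent; the TTL-based map from the first suggestion can be combined with this as a safety net.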

Resources