The Hazelcast docs state the following about cluster groups:
You can create cluster groups. To do this, use the group configuration element.
By specifying a group name and group password, you can separate your clusters in a simple way. Example groupings can be by development, production, test, app, etc. <...> Each Hazelcast instance can only participate in one group. Each Hazelcast instance only joins to its own group and does not interact with other groups.
<...>
The cluster members (nodes) and clients having the same group configuration (i.e., the same group name and password) form a private cluster.
Each cluster will have its own group and it will not interfere with other clusters.
But there are no details about data partitioning.
If I have 5 nodes and 2 cluster groups:
node1, node2 and node3 are members of GroupA
node4 and node5 are members of GroupB
does it mean that no data from GroupA will be stored on node4 and node5?
Yeah, that's what it means. Those groups are independent clusters and have nothing in common (except maybe the network ;-)).
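For illustration, here is a minimal sketch with the Hazelcast Python client 3.x (the group name, password, and address are made up; on the member side you would put the same values in the group configuration element of hazelcast.xml):

```python
import hazelcast

# Client for GroupA: it can only connect to GroupA members (node1..node3
# in the question) and will never see data held by GroupB.
config = hazelcast.ClientConfig()
config.group_config.name = "GroupA"
config.group_config.password = "groupA-pass"      # illustrative password
config.network_config.addresses.append("node1:5701")

client = hazelcast.HazelcastClient(config)
cities = client.get_map("cities").blocking()
cities.put("1", "London")   # stored (and backed up) only on GroupA members
client.shutdown()
```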
As for data partitioning, Hazelcast distributes data based on keys, but you can exert some influence by using data affinity (http://docs.hazelcast.org/docs/3.8/manual/html-single/index.html#data-affinity).
If you're looking for backup distribution, you might be interested in partition groups (http://docs.hazelcast.org/docs/3.8/manual/html-single/index.html#partition-group-configuration).
I have a CosmosDB instance on Azure, with 1 write replica and multiple read replicas. Normally we call SetCurrentLocation to direct calls to a read replica. My understanding is that this automatically creates PreferredLocations for us, but I'm not sure how the preferred locations work.
Now let's say the location passed to the SetCurrentLocation method is improper. That is, there's no replica in that single location we passed, but the location is a valid Azure region. In that case, will the call go to the write replica, or to a nearby read replica?
SetCurrentLocation orders the Azure regions by geographical distance from the region you indicate, and the SDK client then maps this ordered list against your account's available regions. So you end up with your account's available regions, ordered by distance from the region you passed to SetCurrentLocation.
For an account with a single write region, all write operations always go to that region; the preferred locations only affect read operations. More information at: https://learn.microsoft.com/azure/cosmos-db/troubleshoot-sdk-availability
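Note that SetCurrentLocation is a convenience of the .NET SDK; other SDKs take the ordered preference list directly. A minimal sketch with the Python SDK (azure-cosmos v4; the endpoint, key, database, container, and regions are placeholders):

```python
from azure.cosmos import CosmosClient

# Reads go to the first region in this list that the account is actually
# replicated to; names that match no account region are ignored. Writes
# always go to the (single) write region regardless of this list.
client = CosmosClient(
    "https://myaccount.documents.azure.com:443/",  # placeholder endpoint
    credential="<account-key>",                    # placeholder key
    preferred_locations=["West US 2", "East US"],
)
container = client.get_database_client("mydb").get_container_client("items")
item = container.read_item(item="id1", partition_key="pk1")
```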
Further adding to Matias's answer, from https://learn.microsoft.com/en-us/azure/cosmos-db/sql/troubleshoot-sdk-availability:
Primary region refers to the first region in the Azure Cosmos account region list. If the values specified as regional preference do not match with any existing Azure regions, they will be ignored. If they match an existing region but the account is not replicated to it, then the client will connect to the next preferred region that matches or to the primary region.
So if the specified location is bad, or there's no read replica there, the client will try to connect to the next location that matches, eventually falling back to the primary region (in this case, the single write replica).
I have created an ADF pipeline with a Notebook activity. This notebook activity automatically creates Databricks job clusters with autogenerated job cluster names.
1. Rename Job Cluster during runtime from ADF
I'm trying to rename this job cluster to a process-specific name at runtime, from ADF or the ADF linked service.
Instead of job-59, I want it to be replaced with <process_name>_
2. Rename ClusterName Tag
I want to replace the default generated ClusterName tag with the required process name.
Settings for the job can be updated using the Reset or Update endpoints.
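As a sketch of that Reset flow (the host, token, and job id 59 are placeholders, and this assumes the cluster belongs to a persistent job rather than a one-off ADF-submitted run; also note the default tags such as ClusterName are applied by Databricks itself, so custom tags supplement rather than replace them):

```python
import requests

HOST = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder URL
TOKEN = "<personal-access-token>"                            # placeholder
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

# Read the current job settings, merge in a custom tag for the job cluster,
# then write everything back: Reset overwrites the whole settings object,
# so we round-trip it instead of sending a fragment.
job = requests.get(f"{HOST}/api/2.0/jobs/get",
                   headers=HEADERS, params={"job_id": 59}).json()
settings = job["settings"]
settings.setdefault("new_cluster", {}).setdefault("custom_tags", {})[
    "process_name"] = "my_process"

resp = requests.post(f"{HOST}/api/2.0/jobs/reset", headers=HEADERS,
                     json={"job_id": 59, "new_settings": settings})
resp.raise_for_status()
```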
Cluster tags allow you to easily monitor the cost of cloud resources used by various groups in your organization. You can specify tags as key-value pairs when you create a cluster, and Azure Databricks applies these tags to cloud resources like VMs and disk volumes, as well as DBU usage reports.
For detailed information about how pool and cluster tag types work together, see Monitor usage using cluster, pool, and workspace tags.
For convenience, Azure Databricks applies four default tags to each cluster: Vendor, Creator, ClusterName, and ClusterId.
These tags propagate to detailed cost analysis reports that you can access in the Azure portal.
Check out an example of how billing works.
My question is an extension of the earlier discussion here:
Mongo Change Streams running multiple times (kind of): Node app running multiple instances
In my case, the application is deployed on Kubernetes pods. There will be at least 3 pods and a maximum of 5 pods. The solution mentioned in the link above suggests using <this instance's id> in the $mod operator. Since the application is deployed to K8s pods, pod names are dynamic. How can I achieve a similar solution for my scenario?
If you are running a stateless workload, I am not sure why you want to fix the name of a Pod (Deployment).
Fixing Pod names is only possible with StatefulSets.
You should be using a StatefulSet instead of a Deployment or ReplicationController (RC); note that ReplicationControllers have largely been superseded by ReplicaSets.
StatefulSet Pods have a unique identity that includes an ordinal. For any StatefulSet with N replicas, each Pod in the StatefulSet will be assigned an integer ordinal, from 0 up through N-1, that is unique across the set.
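Combining the two: a minimal sketch, assuming the pod's hostname ends in the StatefulSet ordinal and that your documents carry a numeric field seq to bucket on (the replica count, connection string, and names are illustrative):

```python
import os
import pymongo

REPLICAS = 3  # number of StatefulSet replicas (assumption)

# In a StatefulSet the pod hostname is "<name>-<ordinal>", e.g. "app-2",
# so the ordinal gives each instance a stable id in 0..REPLICAS-1.
instance_id = int(os.environ["HOSTNAME"].rsplit("-", 1)[1])

client = pymongo.MongoClient("mongodb://mongo:27017/?replicaSet=rs0")
coll = client["mydb"]["mycoll"]  # placeholder database/collection

# Each instance only handles events whose numeric field falls in its
# own $mod bucket, so every change is processed exactly once overall.
pipeline = [{"$match": {"$expr": {
    "$eq": [{"$mod": ["$fullDocument.seq", REPLICAS]}, instance_id]
}}}]

with coll.watch(pipeline, full_document="updateLookup") as stream:
    for change in stream:
        print("instance", instance_id, "handled", change["operationType"])
```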
Foreword
When you create a Kubernetes cluster on AKS you specify the type of VMs you want to use for your nodes (--node-vm-size). I read that you can't change this after you create the Kubernetes cluster, which would mean that you'd be scaling horizontally instead of vertically whenever you add resources.
However, you can create different node pools in an AKS cluster that use different types of VMs for your nodes. So, I thought, if you want to "change" the type of VM that you chose initially, maybe you could add a new node pool and remove the old one ("nodepool1")?
I tried that through the following steps:
1. Create a node pool named "stda1v2" with a VM type of "Standard_A1_v2"
2. Delete "nodepool1" (az aks nodepool delete --cluster-name ... -g ... -n nodepool1)
Unfortunately I was met with "Primary agentpool cannot be deleted".
Question
What is the purpose of the "primary agentpool" which cannot be deleted, and does it matter (a lot) what type of VM I choose when I create the AKS cluster (in a real world scenario)?
Can I create other node pools and let the primary one live its life? Will it cause trouble in the future if I have node pools that use larger VMs for their nodes while the primary one is still using "Standard_A1_v2", for example?
The primary node pool is the first node pool in the cluster, and you cannot delete it because that is currently not supported. You can create and delete additional node pools and just let the primary one be as it is. It will not cause any trouble.
For the primary node pool I suggest picking a VM size that makes more sense in the long run (since you cannot change it). The B-series would be a good fit, since they are cheap and their CPU/memory ratio is good for average workloads.
PS: You can always scale the primary node pool to 0 nodes, cordon the node, and shut it down. You will have to repeat this after every upgrade, but otherwise it will work.
It looks like this functionality was introduced around the time of your question, allowing you to add new system nodepools and delete old ones, including the initial nodepool. After encountering the same error message myself while trying to tidy up a cluster, I discovered I had to set another nodepool to a system type in order to delete the first.
There's more info about it here, but in short, Azure nodepools are split into two types ('modes' as they call it): System and User. When creating a single pool to begin with, it will be of a system type (favouring system pod scheduling -- so it might be good to have a dedicated pool of a node or two for system use, then a second user nodepool for the actual app pods).
So if you wish to delete your only system pool, you need to first create another nodepool with the --mode switch set to 'system' (with your preferred VM size etc.), then you'll be able to delete the first (and nodepool modes can't be changed after the fact, only on creation).
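For example (the cluster, resource group, pool name, and VM size below are placeholders):

```
az aks nodepool add --cluster-name myAKS --resource-group myRG \
    --name systempool --mode System --node-vm-size Standard_D2s_v3

az aks nodepool delete --cluster-name myAKS --resource-group myRG \
    --name nodepool1
```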
I'm working with an Azure SQL DW scaled to DW6000 and I want to put a user in the 'SloDWGroupC04' workload group.
However, DW6000 only provides the default smallrc, mediumrc, largerc, and xlargerc resource classes, which appear to map to C00, C05, C06, and C07 respectively, according to the documentation.
Usually I can run EXEC sp_addrolemember 'largerc', 'user' (which would put 'user' in C05) but the workload group C04 doesn't have a role yet.
Do I need to create a role first? How do I go about leveraging the other available workload groups beyond the default roles?
These SloDW* workload groups are for internal use only. This generic set of workload groups is mapped to the resource classes (i.e., mediumrc, largerc, etc.) depending on the DWU setting. For example, the article you referenced shows the mapping for DW500; in that case, C04 is used for xlargerc.
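In other words, you pick the resource class role and the service picks the workload group for you. A minimal example, using the DW500 mapping from the referenced article ('user' is a placeholder login):

```sql
-- You target a resource class; the service maps it to a SloDW* group
-- for the current DWU (e.g. xlargerc -> SloDWGroupC04 at DW500).
EXEC sp_addrolemember 'xlargerc', 'user';
```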
Unfortunately, you cannot alter the mappings yourself at this time; the mappings are fixed. If you would like to see specific improvements in this area, I would encourage you to post your suggestions on the SQL DW feedback page.