I'm setting up a Storage Account so I can dynamically create and use a persistent volume with Azure Files in Azure Kubernetes Service (AKS). I'm doing this to:
Have a PV and PVC for the database
A place to store the application files
AKS does create a storage account in the MC_<resource-group>_<aks-name>_<region> resource group that is automatically created. However, that storage account is destroyed if the node size/VM is changed (not node count), so it shouldn't be used since you'll lose your files and database if you need a node size/VM with more resources.
Neither this documentation nor any other I've come across says what the best practice is for the connectivity method:
Public endpoint (all networks)
Public endpoint (selected networks)
Private endpoint
The first option sounds like a bad idea.
The second option allows me to select a virtual network, and there are two choices:
MC_<resource-group>_<aks-name>_<region>... again, this doesn't seem like a good idea, because if the node size/VM is changed, the connection will be broken.
aks-vnet-<number>... not sure what this is, but it looks like it is part of the previous resource group, so it will also be destroyed in the previously mentioned scenario.
The third option contains a number of options, some of which are also included in the second option.
So how should I securely set this up for AKS to share files with the application and persist database files?
EDIT
Looking at both the "Firewalls and virtual networks" and "Private endpoint connections" settings for the storage account that comes with the AKS node, it looks like it is just set up for "All networks"... so maybe having that be where my actual PV and PVC are stored isn't such an issue...? I could use some clarity on the topic.
I'm not sure where the problem lies. All the assets generated by AKS are tied to the AKS lifecycle: if you delete AKS, it will delete the MC_* resource group (and that is 100% right). I'm not sure what you mean about the storage account being destroyed; it wouldn't be destroyed unless you remove the PVC and the reclaim policy is set to Delete.
Reading: https://learn.microsoft.com/en-us/azure/aks/azure-files-dynamic-pv
As for the networking part, "selected networks" with the AKS nodes' network selected should be the way to go. You can figure out which network that is by looking at the AKS nodes or the AKS agent pool definition(s). I don't think this is configurable using only Kubernetes primitives, so it would be a manual/scripted action after the storage account is created.
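To make the "selected networks" idea a bit more concrete, here is a minimal sketch of the `networkAcls` fragment a storage account would carry in an ARM deployment: deny by default, allow one subnet. The helper names and all IDs are made up for illustration, and it assumes the subnet also has the Microsoft.Storage service endpoint enabled. It only builds the JSON; it does not call Azure.

```python
import json
from typing import Dict, List


def subnet_resource_id(subscription_id: str, resource_group: str,
                       vnet_name: str, subnet_name: str) -> str:
    """Build the ARM resource ID of a subnet (hypothetical helper)."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Network/virtualNetworks/{vnet_name}"
        f"/subnets/{subnet_name}"
    )


def selected_networks_acls(subnet_ids: List[str]) -> Dict:
    """Return a networkAcls fragment that only allows the given subnets."""
    return {
        # Anything not explicitly allowed below is rejected.
        "defaultAction": "Deny",
        # Keep trusted Azure services working (an assumption; adjust as needed).
        "bypass": "AzureServices",
        "virtualNetworkRules": [
            {"id": subnet_id, "action": "Allow"} for subnet_id in subnet_ids
        ],
        "ipRules": [],
    }


if __name__ == "__main__":
    # Placeholder values; substitute the subnet your AKS nodes actually use.
    node_subnet = subnet_resource_id(
        "<subscription-id>", "<vnet-resource-group>", "<vnet-name>", "<subnet-name>"
    )
    print(json.dumps(selected_networks_acls([node_subnet]), indent=2))
```

The printed fragment would go under `properties.networkAcls` of the storage account resource.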
Related
Currently, for our Azure disaster recovery plan, we replicate workloads from a primary site/region to a secondary site, mirroring the source VM config and creating the required or associated resource groups, storage accounts, virtual networks, etc.
We are looking into an alternate method that wouldn't require a second resource group. This would require:
Use one, already existing resource group; i.e. testGroup-rg in East-US
Deploy new IaC components into the same RG but in Central-US
So in the singular resource group, if we wanted a function app, we would have two sets of components. testFuncApp in East-US and testFuncApp in Central-US.
This way we would only ever have one set of IaC created. Of course we would need to automate how to flow traffic etc. into a particular region if both exist.
Is this a possibility? If it is, is it even necessary/worth it?
Unfortunately, there is no way to use the same RG. You need a resource group in the target region; if there isn't one, Site Recovery creates a new resource group in the target region with an "asr" suffix.
I currently have Velero up and running and it's working great. The only issue I have is that the snapshots of the volumes are being created in the same region as the originals, which kind of defeats the purpose of disaster recovery. This flag
--snapshot-location-config
doesn't have an argument for region. I know there is a config for the default snapshot location
volumesnapshotlocations.velero.io "default"
Does anyone know how to modify the default so I can get my snapshots into new regions?
Creating snapshots in a region other than the one the volume lives in is not supported.
Azure zone-redundant snapshots and images for managed disks have a decent 99.9999999999% (12 9's) durability. The availability zones in a region are usually physically separated and even if an outage affects one AZ, you can still access your data from a redundant AZ.
However, if you fear calamities that can affect several square kilometers (multiple zones in a region), you can manually move the snapshots to a different region or even automate the process. Here is a guide to do it.
--snapshot-location-config doesn't have arg for region
--snapshot-location-config doesn't create the storage, you must do so yourself. You can specify a different region, a different Azure subscription, or even a different provider, like AWS.
For Azure, follow the instructions here to create your storage container.
If your provider supports a region config (Azure does not; see the Volume Snapshot Location Config doc and the Backup Storage Location Config doc), it is configurable using the --config flag, e.g. --config region=us-west-2. Check your provider plugin to see whether different regions are supported, what the key name is, and what possible values are supported.
Refer to the Velero locations documentation for examples of using multiple snapshot and backup locations.
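To illustrate the --config key=value format mentioned above, here is a small sketch that assembles the arguments for `velero snapshot-location create`. The function name and the location name "secondary" are made up for this example, and the `region` key only applies to providers whose plugin supports it (per the above, Azure does not). The sketch only builds the argument list; it does not run Velero.

```python
from typing import Dict, List


def snapshot_location_create_args(name: str, provider: str,
                                  config: Dict[str, str]) -> List[str]:
    """Build the argv for `velero snapshot-location create` (hypothetical helper)."""
    args = ["velero", "snapshot-location", "create", name, "--provider", provider]
    if config:
        # --config takes comma-separated key=value pairs.
        pairs = ",".join(f"{key}={value}" for key, value in config.items())
        args += ["--config", pairs]
    return args


if __name__ == "__main__":
    # Example taken from the text: a provider that accepts a region key.
    argv = snapshot_location_create_args("secondary", "aws", {"region": "us-west-2"})
    print(" ".join(argv))
```

The resulting list could be passed to `subprocess.run` once you have confirmed the keys your provider plugin accepts.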
Update:
Although velero snapshot-location create allows you to specify a --provider, the Limitations/Caveats section of the Location documentation specifically states that only a single set of credentials is supported, and furthermore that Azure specifically does not allow creation of snapshots in a different region:
Velero only supports a single set of credentials for VolumeSnapshotLocations. Velero will always use the credentials provided at install time (stored in the cloud-credentials secret) for volume snapshots.
Volume snapshots are still limited by where your provider allows you to create snapshots. For example, AWS and Azure do not allow you to create a volume snapshot in a different region than where the volume is. If you try to take a Velero backup using a volume snapshot location with a different region than where your cluster’s volumes are, the backup will fail.
I personally find this confusing: how could one use a different provider without specifying credentials? Regardless, it seems that storing snapshots in a different region in Azure is not possible.
A little context: I'm having to migrate a project from AWS, where I'm currently using ECS, to Azure, where I'll be using AKS since their ACS (ECS equivalent) is deprecated.
This is a regular Django app, with its configuration variables being fetched from a server-config.json hosted on a private S3 bucket. The EC2 instance has the correct role with S3FullAccess.
I've been looking into reproducing that same behavior but with Azure Blob Storage instead, having achieved no success whatsoever :-(.
I tried using the Service Principal concept and adding it to the AKS cluster with the Storage Blob Data Owner role, but that doesn't seem to work. Overall it's been quite a frustrating experience; maybe I'm just having a hard time grasping the right way to use the permissions/scopes. The fact that the AKS cluster creates its own resource group is unfathomable to me, but I've attempted attaching the policies to it as well, to no avail. I then moved on to a solution indicated by Microsoft.
I managed to bind my AKS pods to the correct User Managed Identity through their indicated solution, aad-pod-identity, but I feel like I'm missing something. I assigned Storage Blob Data Owner/Contributor to the identity, but still, when I enter the pods and try to access a blob (using the Python SDK), I get a resource not found message.
Is what I'm trying to achieve possible at all? Or will I have to change to a solution using Azure Key Vault or something along those lines?
First of all, you can use AKS Engine, which is more or less ACS for Kubernetes now.
As for access to the blob storage, you don't have to use Managed Service Identity; you can just use the account name/key (which is a bit less secure, but a lot less error-prone, and more examples exist). The fact that you are getting a resource not found error most likely means your auth part is fine and you just don't have access to the resource. According to this, Storage Blob Data Contributor should be fine if you assigned it at a proper scope. For this to work 100%, just give your identity Contributor access at subscription level; this way it's guaranteed to work.
I've found an example of using Python with MSI (here). You should start with that (and grant your identity Contributor access) and verify you can list resource groups. Once that works, getting blob reads working should be trivial.
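For reference, here is a rough sketch of both approaches with the Python SDK. It is not the linked example. It assumes the `azure-identity` and `azure-storage-blob` packages are installed in the pod, and the function names, account name, and container name are placeholders. The Azure imports are done inside the functions so the module itself loads without the SDK.

```python
from typing import Optional


def blob_account_url(account_name: str) -> str:
    """Public blob endpoint for a storage account (standard Azure cloud)."""
    return f"https://{account_name}.blob.core.windows.net"


def read_blob_with_managed_identity(account_name: str, container: str,
                                    blob_name: str,
                                    client_id: Optional[str] = None) -> bytes:
    """Download a blob using the pod's managed identity."""
    # Lazy imports: these packages are third-party (assumed installed).
    from azure.identity import ManagedIdentityCredential
    from azure.storage.blob import BlobServiceClient

    # client_id selects a specific user-assigned identity when given.
    credential = ManagedIdentityCredential(client_id=client_id)
    service = BlobServiceClient(account_url=blob_account_url(account_name),
                                credential=credential)
    blob = service.get_blob_client(container=container, blob=blob_name)
    return blob.download_blob().readall()


def read_blob_with_account_key(account_name: str, account_key: str,
                               container: str, blob_name: str) -> bytes:
    """Download a blob using the storage account key instead of an identity."""
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient(account_url=blob_account_url(account_name),
                                credential=account_key)
    blob = service.get_blob_client(container=container, blob=blob_name)
    return blob.download_blob().readall()
```

Inside a pod, something like `json.loads(read_blob_with_managed_identity("<account>", "<container>", "server-config.json"))` would then load the configuration file mentioned in the question.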
I'd like to move an instance of Azure Kubernetes Service to another subnet in the same virtual network. Is this possible, or is the only way to do it to recreate the AKS instance?
No, it is not possible; you need to redeploy AKS.
edit: 08.02.2023 - it's actually possible to some extent now: https://learn.microsoft.com/en-us/azure/aks/configure-azure-cni-dynamic-ip-allocation#configure-networking-with-dynamic-allocation-of-ips-and-enhanced-subnet-support---azure-cli
I'm not sure it can be updated on an existing cluster without recreating it (or the nodepool)
I know it's an old thread, but I'm responding in case someone finds it useful. You cannot change the subnet of the AKS cluster directly. However, you can change the subnets of the underlying components. In our case, we had a simple setup of 2 nodes and a LoadBalancer. We created a new subnet and changed the subnet on each of these components. It worked for us, but do check the services and the pods to make sure everything still works correctly.
When I create an AKS cluster using the Azure portal, I can see that new resource groups are created. It seems that I have no control over how they are named, especially the one with the "MC_" prefix. I also don't see an option to change its name when using an ARM template.
In addition, if I create a cluster in customer's subscription, where I only have access to 1 resource group, I don't even see the newly created RG and can't manage it.
Is there a way to force deployment of all AKS components into a single resource group?
No, there is no way to force it at this point in time. As for the access, you should request access to that RG. No real workarounds.
The secondary resource group name can be inferred, I think; it's something like:
MC_original-resource-group-name_aks-resource-name_location
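As a small illustration of that naming pattern, the sketch below derives the expected name from the three inputs. The function name is made up, and it does not account for any length limits Azure may apply to long names.

```python
def default_node_resource_group(resource_group: str, cluster_name: str,
                                location: str) -> str:
    """Guess the auto-created AKS resource group name from the pattern above."""
    return f"MC_{resource_group}_{cluster_name}_{location}"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(default_node_resource_group("my-rg", "my-aks", "eastus"))
```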
It also creates an OMS resource group (if you enable OMS) and a Network Watcher one (this can be disabled, by the way, but it's a provider setting). You have no control over those either.
There is a not-yet-implemented nodeResourceGroup property: https://learn.microsoft.com/en-us/rest/api/aks/managedclusters/createorupdate#examples
EDIT: this is actually working now, so the nodeResourceGroup property can be used. But it would still be a new resource group, so you would still need to request access to that group, and this property cannot be set through the portal (so you need ARM templates, Pulumi, or Terraform).
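As a sketch of where nodeResourceGroup sits in an ARM template, the snippet below builds a deliberately incomplete managed cluster resource. The function name and all values are placeholders, the API version is left for the caller to supply, and a real template needs further properties (agent pools, DNS prefix, identity, and so on) that are omitted here.

```python
import json
from typing import Dict


def managed_cluster_fragment(cluster_name: str, location: str,
                             api_version: str,
                             node_resource_group: str) -> Dict:
    """Partial ARM resource showing only the nodeResourceGroup setting."""
    return {
        "type": "Microsoft.ContainerService/managedClusters",
        "apiVersion": api_version,  # supplied by the caller, not guessed here
        "name": cluster_name,
        "location": location,
        "properties": {
            # Custom name for the group that would otherwise be MC_...
            "nodeResourceGroup": node_resource_group,
            # Other required properties intentionally left out of this sketch.
        },
    }


if __name__ == "__main__":
    fragment = managed_cluster_fragment(
        "<aks-name>", "<region>", "<api-version>", "<custom-node-rg-name>"
    )
    print(json.dumps(fragment, indent=2))
```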