I'm unclear on recommended practices for containerization in Azure, and I have some basic questions. Consider the standard data science software requirements (Jupyter / RStudio / Python / Bash), a VM with limited disk space (200 GB), and a corresponding storage account with plenty of space:
I have seen a few resources on using Docker in Azure, and I'm wondering whether alternative container solutions (Singularity) or virtual environments (conda, pyenv) are undesirable for any technical reason?
Given the VM's limited disk, is it possible to keep Docker containers and conda environments in blob storage rather than on the VM's limited disk?
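To make that concrete, here is the kind of relocation I have in mind, as a sketch. /mnt/bigdisk is a hypothetical mount point; my understanding is that Docker's overlay storage wants a real POSIX filesystem, so a mounted data disk should work where a blobfuse mount of the storage account might not:

    # Sketch: relocate Docker's storage and conda's environments off the small OS disk.
    # /mnt/bigdisk is a hypothetical mount point (a data disk, or blobfuse with caveats).
    import json
    import pathlib
    import subprocess

    # Docker: the "data-root" key in /etc/docker/daemon.json controls where images,
    # containers, and volumes live. Requires root; restart the daemon afterwards.
    daemon_cfg = pathlib.Path("/etc/docker/daemon.json")
    cfg = json.loads(daemon_cfg.read_text()) if daemon_cfg.exists() else {}
    cfg["data-root"] = "/mnt/bigdisk/docker"
    daemon_cfg.write_text(json.dumps(cfg, indent=2))
    subprocess.run(["systemctl", "restart", "docker"], check=True)

    # Conda: the envs_dirs and pkgs_dirs settings in .condarc move environments
    # and the package cache off the OS disk.
    subprocess.run(["conda", "config", "--add", "envs_dirs", "/mnt/bigdisk/conda/envs"], check=True)
    subprocess.run(["conda", "config", "--add", "pkgs_dirs", "/mnt/bigdisk/conda/pkgs"], check=True)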
Thanks in advance
We have a working NetApp and ESXi (VMware 5.5) setup, with multiple VMs running on 3 ESXi systems but residing entirely on NetApp storage.
We are thinking of moving this entire setup to a private cloud built on HP Nimble storage. This cloud is currently owned by one of our departments, and they are ready to give us space (in terms of storage) and ESXi hosts (a VM cluster) to run our VMs on a rental basis. The immediate advantages for us are more space, more network speed, a DR setup, and no longer having to worry about the hardware.
Of course it is still in the discussion phase, but I would like to ask you experts the following questions.
1. NetApp storage is all about data plus its configuration (snapshots, user quota policies, export rules, etc.). When we talk about storage space in the cloud, how will we control/administer those configuration aspects? Or will it no longer be possible for us to administer them, with the cloud administrators taking that control into their hands so that we depend on them for every configuration change? This is a very important factor.
2. Can the VMs running on NetApp storage be migrated without much effort? Is there a documented method for this?
Your view on this will be really helpful.
Thanks in advance.
Regards,
Admin
On point #1, a common way to provide multi-tenant administrator access on NetApp is to create a separate SVM [1] (Storage Virtual Machine) that a tenant administrator can use to manage volume capacity, snapshots, quotas, etc.
For #2, a common migration path for moving VMware VMs is to use Storage vMotion [2]. The private cloud provider can remap the ESXi hosts in your environment to be managed under their vCenter Server first. Then from there, they will have the ability to (non-disruptively, in most cases) move the VMs from your old NetApp datastores to new datastores on their array. They can do the same for vMotioning these VMs over to their managed ESXi hosts.
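If any of this ends up being scripted rather than driven through the vCenter UI, the same relocation is available via the vSphere API. A minimal sketch with pyVmomi, where the vCenter address, credentials, and object names are all hypothetical:

    # Sketch: storage vMotion via the vSphere API with pyVmomi.
    # vCenter address, credentials, and object names are hypothetical.
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    si = SmartConnect(host="vcenter.example.com",
                      user="administrator@vsphere.local",
                      pwd="secret",
                      disableSslCertValidation=True)  # recent pyVmomi; older versions take sslContext
    content = si.RetrieveContent()

    def find_by_name(vimtype, name):
        # Walk the whole inventory and return the first object with a matching name.
        view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
        return next(obj for obj in view.view if obj.name == name)

    vm = find_by_name(vim.VirtualMachine, "app-vm-01")       # VM to move
    target_ds = find_by_name(vim.Datastore, "nimble-ds-01")  # datastore on the new array

    # A RelocateSpec with only the datastore set is a storage vMotion:
    # the VM keeps running while its disks move.
    task = vm.RelocateVM_Task(vim.vm.RelocateSpec(datastore=target_ds))
    # ...poll task.info.state for completion, then Disconnect(si).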
[1] https://docs.netapp.com/us-en/ontap/concepts/storage-virtualization-concept.html
[2] https://docs.vmware.com/en/VMware-vSphere/6.5/com.vmware.vsphere.vcenterhost.doc/GUID-AB266895-BAA4-4BF3-894E-47F99DC7B77F.html
We have 2 Ubuntu VMs inside a Virtual Machine Scale Set with Flexible Orchestration that are behind an Application Gateway and are running Apache Tomcat web servers. When a client connects to one of the VMs and uploads files, those files also need to exist on the other virtual machine.
I have only found 2 options for doing that:
Azure File Share - $80/month for 1 TB of the Hot SKU, but the speed is only 1 MB/s when mounted as an SMB share on Ubuntu.
Azure NetApp Files - $600/month for the 4 TB minimum.
Neither option is good: the first one is too slow and the second one is too expensive. What can we use in the development and production environments to achieve file sharing between highly available VMs?
1 MB/s is awfully low; I am not sure where this is coming from. I am fairly sure I get about 30 MB/s for Standard SSD/HDD deployments when mounting them into Linux Docker containers, which should not perform worse.
An alternative to the mounted file shares would be to use shared disks. You can basically attach a disk to multiple VMs at the same time.
There are some limitations; for your case, mainly:
Shared disks can be attached to individual VMSS instances but can't be defined in the VMSS models or automatically deployed.
You can still expect to pay $50-200 for the disk, but you should be able to get much better speeds than what you are currently getting.
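For reference, creating such a disk comes down to setting maxShares at creation time. A sketch with the azure-mgmt-compute Python SDK (resource names and region are placeholders); note that a plain filesystem like ext4 mounted on two VMs at once will corrupt, so you would still need a cluster-aware filesystem or application-level coordination:

    # Sketch: create a Premium SSD managed disk attachable to 2 VMs at once.
    # Resource group, disk name, and region are hypothetical placeholders.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.compute import ComputeManagementClient

    compute = ComputeManagementClient(DefaultAzureCredential(), subscription_id="<sub-id>")

    poller = compute.disks.begin_create_or_update(
        "my-resource-group",
        "shared-data-disk",
        {
            "location": "westeurope",
            "sku": {"name": "Premium_LRS"},           # shared disks need Premium/Ultra SKUs
            "disk_size_gb": 1024,
            "max_shares": 2,                          # how many VMs may attach simultaneously
            "creation_data": {"create_option": "Empty"},
        },
    )
    disk = poller.result()
    print(disk.id)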
Use Blob storage and grant your virtual machines access via managed identity:
https://learn.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/tutorial-vm-windows-access-storage
Blob Pricing and IOPS:
https://azure.microsoft.com/es-es/pricing/details/storage/page-blobs/
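As a sketch of what this looks like from the VMs (account, container, and file names are placeholders; the VM's identity needs an RBAC role such as Storage Blob Data Contributor on the account):

    # Sketch: read/write blobs from an Azure VM using its managed identity,
    # so no keys or connection strings live on the VM. Names are placeholders.
    from azure.identity import ManagedIdentityCredential
    from azure.storage.blob import BlobServiceClient

    credential = ManagedIdentityCredential()
    service = BlobServiceClient(
        account_url="https://mystorageaccount.blob.core.windows.net",
        credential=credential,
    )
    container = service.get_container_client("uploads")

    # Upload from VM A...
    with open("/tmp/report.pdf", "rb") as f:
        container.upload_blob("report.pdf", f, overwrite=True)

    # ...and download on VM B (same container, same identity-based access).
    data = container.download_blob("report.pdf").readall()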
Azure Functions can be containerized, but what are the actual use cases for that? Is it portability and the ease of running them in any Kubernetes environment, on-prem or otherwise? Or anything further?
As far as I know,
we can run Azure Functions in a serverless fashion, i.e., with the backend VMs and servers managed by the vendor (Azure). Also, I believe there are 2 Azure container services: Container Instances and Kubernetes Service.
Azure Kubernetes Service handles large volumes of containers.
Much like running multiple virtual machines on a single physical host, you can run multiple containers on a single physical or virtual host.
With VMs, you have to manage the OS, disk, networking, updating and patching the VM, and updating the applications inside it, whereas with containers you don't have to look after the OS; you can easily provision services like databases or a Python runtime in the container and use them.
Example:
You have control over the VM, but containers are not like that.
Let's say I'm a web developer / data scientist / data analyst who wants to work only with a SQL database.
It can be installed on a virtual machine, and it is also available through containers.
The primary difference would be this:
when you deploy in a container, it is a simple package that lets you focus only on the SQL database; all the other dependencies and configuration (the OS, config files) come as part of that package and are taken care of by the container service.
But on a VM, the moment you install a SQL database, there are other dependencies you need to look after.
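To make the SQL example concrete, here is a sketch using the Docker SDK for Python and Microsoft's public SQL Server image; the password, container name, and port mapping are illustrative:

    # Sketch: run SQL Server as a container instead of installing it on a VM.
    # One call pulls the packaged image (OS layers, dependencies and all) and starts it.
    import docker

    client = docker.from_env()
    container = client.containers.run(
        "mcr.microsoft.com/mssql/server:2022-latest",
        environment={
            "ACCEPT_EULA": "Y",
            "MSSQL_SA_PASSWORD": "Str0ng!Passw0rd",  # illustrative only
        },
        ports={"1433/tcp": 1433},  # expose the SQL endpoint on the host
        detach=True,
        name="sql-dev",
    )
    print(container.status)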
Can we use a virtual machine (machine A) to take a backup of another virtual machine's snapshot (machine B)? If we can, what setup do we need to make (on machine A)? Can you give me a working example with some real virtualization techniques? The assumption is that both virtual machines are running on some cloud virtual machine management service, for example oVirt.
Although it is a general question, I think the feature you are really looking for is snapshots.
I use a lot of cloud-based VMs. Most cloud providers offer a way to snapshot your volumes, and this is the preferred way to do backups in the cloud, as it doesn't require you to stop or slow down your VM: the backup is done at the disk level.
Later on, you can restore your backups by creating an image from your disk snapshots and spinning up a new VM from that image.
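Providers differ (oVirt has its own API), but as one concrete example, here is a sketch of snapshotting machine B's disk with the Azure Python SDK; the subscription, resource group, and disk names are placeholders:

    # Sketch: snapshot machine B's OS disk from anywhere (e.g. machine A),
    # using Azure's management API as one concrete provider example.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.compute import ComputeManagementClient

    compute = ComputeManagementClient(DefaultAzureCredential(), subscription_id="<sub-id>")

    # Look up machine B's disk, then copy it into a snapshot.
    disk = compute.disks.get("my-resource-group", "machine-b-os-disk")
    snapshot = compute.snapshots.begin_create_or_update(
        "my-resource-group",
        "machine-b-backup-1",
        {
            "location": disk.location,
            "creation_data": {"create_option": "Copy", "source_resource_id": disk.id},
        },
    ).result()
    print(snapshot.id)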
On the other hand, if you really need to back up a running VM at the filesystem level, you can have a look at rsync on Linux/Unix hosts. For Windows, sorry, I don't have a clue...
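For the filesystem-level route, a minimal sketch that pulls files from machine B to machine A over SSH by shelling out to rsync (hostname, user, and paths are placeholders):

    # Sketch: file-level backup of a running Linux VM (machine B), pulled from
    # machine A by shelling out to rsync over SSH. Host and paths are placeholders.
    import subprocess

    subprocess.run(
        [
            "rsync",
            "-az",        # archive mode, compressed transfer
            "--delete",   # mirror deletions so the copy matches the source
            "-e", "ssh",
            "backup@machine-b.example.com:/var/www/",
            "/backups/machine-b/var-www/",
        ],
        check=True,
    )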
I have been looking into Azure VMs for machine learning. The standard Azure DSVM is a nice and easy solution for this. I also came across the Azure Deep Learning VM, which is preconfigured to be used as a GPU-based DSVM. However, I can also deploy the standard DSVM as a GPU-based VM.
What is the difference between these two VMs?
Is it worth the hassle of deploying the Deep Learning VM, since this one can only be deployed in its own Resource Group and Virtual Network?
There is not much difference between the two in terms of tools and frameworks. The Deep Learning VM has a few extra samples on deep learning, and its deployment is targeted at GPU-based VMs.
You are correct that "I can also deploy the standard DSVM as a GPU-based VM." So if you don't care about those few extra samples, you are good with the DSVM. Soon we are going to deprecate the Azure Deep Learning VM and keep only the DSVM.