Virtual Machine Specs on Google Cloud Platform for Data Science in Jupyter Notebook - scikit-learn

I am currently running out of memory on my 2013 MacBook Pro (8 GB of 1600 MHz DDR3 memory and a 2 GHz Intel Core i7 processor) while running different scikit-learn models (random search on MLPRegressor and GradientBoostingRegressor) on a 50,000-sample data set with ~70 features, most of which are categorical. I have set up a VM on Google Cloud Platform, but have not seen much of an improvement in execution time. Here are the specs of the VM: Machine type: n1-standard-8 (8 vCPUs, 30 GB memory), Source image: ubuntu-1604-xenial-v20180126. I'm wondering if anyone has any recommendations on tweaking VM specs for learning data science. I'm not looking to add any GPUs due to cost. Thank you.

Since you know the volume and nature of the data you plan to process, experimenting with different machine types on Google Cloud Platform is the best way to find the most effective option.
General information on the subject can be found in the "Google Cloud Platform for Data Scientists" document.
In case you reconsider adding GPUs, "Graphics Processing Unit (GPU)" provides an overview of this component in the context of data science.
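To make the "experiment with different machines" advice concrete, here is a minimal sketch of a timing harness that could be run unchanged on the laptop and on each candidate VM. The names `time_workload` and `sample_workload` are hypothetical, and `sample_workload` is only a stand-in for the real model fit.

```python
import statistics
import time


def time_workload(workload, repeats=3):
    """Run `workload` several times and return the median wall-clock seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        durations.append(time.perf_counter() - start)
    return statistics.median(durations)


def sample_workload():
    # Hypothetical stand-in for a model fit; replace the body with the
    # real call, for example the fit of the random search.
    return sum(i * i for i in range(200_000))


if __name__ == "__main__":
    seconds = time_workload(sample_workload)
    print(f"median wall time: {seconds:.3f} s")
```

If the median barely changes between the laptop and an 8 vCPU machine, one thing worth checking (an assumption, since the question does not show the code) is whether the search was given an `n_jobs` argument: scikit-learn's `RandomizedSearchCV` runs in a single process unless `n_jobs` is set, so extra vCPUs would otherwise sit idle.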

Related

Different processor for same size VMSS

We have a Service Fabric application that creates VMSS (virtual machine scale sets) when we create the clusters. During creation we have to select the VM size. We have deployed our application in 3 different regions.
Although the VM size selected is the same for all 3 regions, the processor assigned is different. That would not be a problem if the processors were similar in performance, but they are not.
https://learn.microsoft.com/en-us/azure/virtual-machines/dv2-dsv2-series#dv2-series
Above link states:
Dv2-series sizes run on Intel® Xeon® Platinum 8272CL (Cascade Lake), Intel® Xeon® 8171M 2.1GHz (Skylake) or the Intel® Xeon® E5-2673 v4 2.3 GHz (Broadwell) or the Intel® Xeon® E5-2673 v3 2.4 GHz (Haswell) processors with Intel Turbo Boost Technology 2.0.
With the same code, one region performs well, but the other regions usually have the CPU maxed out. When we talked to Microsoft support, they said processors are assigned randomly and they cannot change it.
The only option support suggested is to try to change the cluster and node by stopping and starting all VMSS instances at the same time, manually, in the Azure portal.
Here is the performance benchmark for the 2 processors assigned to us:
https://www.cpubenchmark.net/compare/Intel-Xeon-E5-2673-v4-vs-%5BDual-CPU%5D-Intel-Xeon-Platinum-8171M/2888vs3220.2
Now the only options we are left with are to try restarting the VMSS n times or to upgrade to a different size.
Has anyone faced a similar situation? If yes, what was the resolution?
Or is there any information on the design considerations by which Microsoft assigns a particular processor to a VM?
I wouldn't read too much into the comparative specs of each processor because you're ultimately not getting the full processor; you're only buying the vCPUs. Each vCPU is supposed to have similar performance to the others. This would suggest that Microsoft may pack more VMs onto an 8171M host than onto an E5-2673 host, so the vCPUs on either machine are closer to equivalent in performance within the SKU series.
Put simply, you have no idea how many VMs Microsoft is running off any given processor and it would only make sense that they run more off a higher performing host system.
If you want to have full operational performance of the processor, you'd have to buy a dedicated host. Note that the pricing sheet names precisely which processor SKU you get for your money unlike the vCPU mix and match happening in the D#_v2 SKU series.
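Since the processor behind a given VM size is not under your control, it can help to record which CPU each node actually reports, so that any performance measurements can be grouped by host processor rather than by region. The following is a minimal sketch using only the Python standard library; `describe_cpu` is a hypothetical helper name, and the format of the processor string varies by operating system and may not be the marketing name.

```python
import os
import platform


def describe_cpu():
    """Return basic facts about the CPU that the current VM exposes."""
    return {
        "hostname": platform.node(),
        # platform.processor() can return an empty string on some systems.
        "processor": platform.processor() or "unknown",
        "machine": platform.machine(),
        "logical_cpus": os.cpu_count(),
    }


if __name__ == "__main__":
    for key, value in describe_cpu().items():
        print(f"{key}: {value}")
```

Running this on one node per region and logging the output next to observed CPU load would show whether the slow regions consistently map to one processor model.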

Gaming on a Windows 10 VM fails with a "DX11 feature level 10.0 is required" error

I'm using macOS and tried to use a Windows 10 VM to run PUBG (downloaded from Steam), but received a message saying "DX11 feature level 10.0 is required to run the engine".
I tried the roll-back-driver solution, but I guess the Windows VM itself doesn't have a previous driver version.
From some googling, I learned that some users got the same message on their physical PC but the game worked on a Windows 10 VM.
My VM is an Azure Standard NV6 (6 vCPUs, 56 GB memory). Will the problem be solved if I upgrade the spec?
NV Series VMs are available with single or multiple NVIDIA GPUs as part
of the Azure N Series offering. These VMs are optimized for remote
visualization and VDI scenarios, using frameworks such as OpenGL and
DirectX.
From this description, the NV series VMs support DirectX. For more details, see here. To take advantage of the GPU capabilities of Azure N-series VMs running Windows, NVIDIA GPU drivers must be installed. You can take a look at this document, which shows how to install the drivers. Good luck.

Windows 10 IoT Core GPU utilization

How come there is no data visible for the GPU Utilization in the Windows IoT Core Device Portal?
That's because the driver needed to utilize the GPU is not yet available. You will notice this when using video and complex UIs.

Azure F-Series VM

What are those F-Size VMs that are listed in pricing and listed as an option to deploy? I am unable to find any information on them.
F Instances
The F-Series is intended for processor intensive workloads.
Taken from The Azure Virtual Machine Pricing
The F-Series virtual machines sport 2GB RAM and 16 GB of local solid state drive (SSD) per CPU core, and are optimized for compute intensive workloads. The F-series is based on the 2.4 GHz Intel Xeon® E5-2673 v3 (Haswell) processor, which can achieve clock speeds as high as 3.1 GHz with the Intel Turbo Boost Technology 2.0. These VMs are suitable for scenarios like batch processing, web servers, analytics and gaming.
They are already listed in the showcase. CPU-wise they are equivalent to Dv2 of the same size, but they have about half the memory and about a third of the disk space available to the user, and they are about 20 percent cheaper. The intent is that if your application uses a lot of processor core time but doesn't benefit from a large amount of memory and disk space, you can save about 20 percent by moving to F-series.
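The per-core figures quoted above translate into concrete sizes with simple arithmetic. The sketch below works that out; the function names are hypothetical, the core counts in the loop are illustrative, and the Dv2 hourly price of 1.00 is a made-up placeholder rather than a real Azure price.

```python
RAM_GB_PER_CORE = 2       # from the pricing text quoted above
SSD_GB_PER_CORE = 16      # from the pricing text quoted above
F_SERIES_DISCOUNT = 0.20  # "about 20 percent cheaper", an approximation


def f_series_resources(cores):
    """Memory and local SSD implied by the per-core figures."""
    return {
        "cores": cores,
        "ram_gb": cores * RAM_GB_PER_CORE,
        "local_ssd_gb": cores * SSD_GB_PER_CORE,
    }


def estimated_f_series_price(dv2_hourly_price):
    """Rough F-series price for a Dv2 price, using the ~20 percent figure."""
    return round(dv2_hourly_price * (1 - F_SERIES_DISCOUNT), 4)


if __name__ == "__main__":
    for cores in (1, 2, 4, 8, 16):  # illustrative core counts
        print(f_series_resources(cores))
    # Placeholder price, not taken from any Azure price list.
    print(estimated_f_series_price(1.00))
```

For example, an 8-core size works out to 16 GB of memory and 128 GB of local SSD under these figures.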
Microsoft recently announced these VM sizes. Per the announcement, they may be a better value than the Dv2 series if your workload is CPU bound. Note that the CPUs for both Dv2-Series and F-Series are the same - 2.4 GHz Intel Xeon E5-2673 v3 (Haswell).
This is the same CPU performance as the Dv2-Series of VMs with 2GB of
memory per CPU core at a lower per-hour price.
The F-Series VMs are an excellent choice for gaming servers, web
servers and batch processing. Any workload which does not need as much
memory or local SSD per CPU core will benefit from the value of the
F-Series. The F-Series sizes range from one to 16 CPU cores with
options for both standard and premium storage optimized sizes.

What kind of graphics card are Windows Azure Virtual Machines equipped with?

I am thinking about running some graphics-intensive programs on a Windows Azure virtual machine, but I am not sure what kind of hardware they have. Do all the VMs have the same GPU? What is your experience with them?
The GPUs in Azure Virtual Machines are likely to be very basic and will most probably not have anywhere near the processing power you will need for carrying out intensive graphics manipulation. To my knowledge MS don't publish the details of the graphics hardware behind their Virtual Machines (if they actually use any at all).
There's a question here on running WPF in an Azure cloud service which may be helpful.
Can Azure run WPF?
The N series Azure VMs support beefy GPUs. The NC series VMs sport the Tesla K80; with DDA (Discrete Device Assignment) they are supposed to provide close to bare-metal performance. NV series VMs offer the Tesla M60 with NVIDIA GRID.
More:
https://www.hpcwire.com/2015/09/29/microsoft-puts-gpu-boosters-on-azure-cloud/
https://blogs.technet.microsoft.com/hybridcloudbp/2016/12/13/n-series-azure-vms-with-gpu/
It's fascinating that there are FPGAs in Azure machines too (although not publicly accessible):
https://www.microsoft.com/en-us/research/project/project-catapult/
