What are those F-Size VMs that are listed in pricing and listed as an option to deploy? I am unable to find any information on them.
F Instances
The F-Series is intended for processor intensive workloads.
Taken from the Azure Virtual Machines pricing page:
The F-Series virtual machines sport 2GB RAM and 16 GB of local solid state drive (SSD) per CPU core, and are optimized for compute intensive workloads. The F-series is based on the 2.4 GHz Intel Xeon® E5-2673 v3 (Haswell) processor, which can achieve clock speeds as high as 3.1 GHz with the Intel Turbo Boost Technology 2.0. These VMs are suitable for scenarios like batch processing, web servers, analytics and gaming.
They are already listed in the showcase. Basically they are CPU-wise equivalent to Dv2 VMs of the same size, but have about half the memory and roughly a third of the local disk space available to the user, and they are about 20 percent cheaper. The intent is that if your application uses a lot of processor core time but doesn't benefit from a large amount of memory and disk space, you can save about 20 percent by moving to the F-series.
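The per-core trade-off described above can be sketched numerically. The F-series figures come from the quote above (2 GB RAM, 16 GB SSD per core); the Dv2 per-core figures below are ballpark values derived from published Dv2 sizes and should be checked against the current pricing page:

```python
# Back-of-envelope comparison of F-series vs Dv2-series resources per core.
# F-series figures are from the quoted pricing text; Dv2 figures are
# approximate (e.g. D2_v2: 2 cores, 7 GB RAM, 100 GB SSD -> 3.5 GB / 50 GB per core).

F_PER_CORE = {"ram_gb": 2, "ssd_gb": 16}
DV2_PER_CORE = {"ram_gb": 3.5, "ssd_gb": 50}

def memory_ratio():
    """How many times more RAM per core Dv2 offers over F."""
    return DV2_PER_CORE["ram_gb"] / F_PER_CORE["ram_gb"]

def ssd_ratio():
    """How many times more local SSD per core Dv2 offers over F."""
    return DV2_PER_CORE["ssd_gb"] / F_PER_CORE["ssd_gb"]

print(f"Dv2 has {memory_ratio():.2f}x the RAM per core of F")
print(f"Dv2 has {ssd_ratio():.2f}x the local SSD per core of F")
```

So if those ratios don't matter to your workload, the roughly 20 percent price difference is pure savings.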
Microsoft recently announced these VM sizes. Per the announcement, they may be a better value than the Dv2 series if your workload is CPU bound. Note that the CPUs for both Dv2-Series and F-Series are the same - 2.4 GHz Intel Xeon E5-2673 v3 (Haswell).
This is the same CPU performance as the Dv2-Series of VMs with 2GB of
memory per CPU core at a lower per-hour price.
The F-Series VMs are an excellent choice for gaming servers, web
servers and batch processing. Any workload which does not need as much
memory or local SSD per CPU core will benefit from the value of the
F-Series. The F-Series sizes range from one to 16 CPU cores with
options for both standard and premium storage optimized sizes.
We have a Service Fabric application that creates a VM scale set (VMSS) when we create the clusters. During creation we have to select the VM size. We have deployed our application in three different regions.
Although the VM size selected is the same for all three regions, the processor assigned is different. That would not be a problem if the processors were similar in performance, but they are not.
https://learn.microsoft.com/en-us/azure/virtual-machines/dv2-dsv2-series#dv2-series
Above link states:
Dv2-series sizes run on Intel® Xeon® Platinum 8272CL (Cascade Lake), Intel® Xeon® 8171M 2.1GHz (Skylake), the Intel® Xeon® E5-2673 v4 2.3 GHz (Broadwell), or the Intel® Xeon® E5-2673 v3 2.4 GHz (Haswell) processors with Intel Turbo Boost Technology 2.0.
With the same code, one region is performing well, but the other regions usually have the CPU maxed out. When we talked to Microsoft support, they said processors are assigned randomly and they cannot change that.
The only option support suggested was that we try to change the underlying node by stopping and starting all VMSS instances at the same time in the Azure portal manually.
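To see which processor a given instance actually landed on after such a restart, you can check the CPU model string on each VM. A minimal sketch, shown against an embedded sample so it is self-contained; on a real Linux VM you would pass `open("/proc/cpuinfo").read()` instead:

```python
# Extract the "model name" field from /proc/cpuinfo-style text.
# The sample below uses the published model string of the Xeon 8171M.

SAMPLE_CPUINFO = """\
processor\t: 0
vendor_id\t: GenuineIntel
model name\t: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
"""

def cpu_model(cpuinfo_text: str) -> str:
    """Return the first 'model name' value found, or 'unknown'."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("model name"):
            return line.split(":", 1)[1].strip()
    return "unknown"

print(cpu_model(SAMPLE_CPUINFO))
```

On Windows VMs the equivalent information is available via `wmic cpu get name`.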
If we look at the performance benchmark for the two processors assigned to us:
https://www.cpubenchmark.net/compare/Intel-Xeon-E5-2673-v4-vs-%5BDual-CPU%5D-Intel-Xeon-Platinum-8171M/2888vs3220.2
Now the only options we are left with are restarting the VMSS some number of times or upgrading to a different size.
Has anyone faced a similar situation? If so, what was the resolution?
Or is there any information on the design considerations by which Microsoft assigns a particular processor to a VM?
I wouldn't read too much into the comparative specs between the processors, because you're ultimately not getting the full processor - you're only buying the vCPUs. Each vCPU is supposed to have similar performance from one to another. This would suggest that Microsoft may pack more VMs onto an 8171M host than onto an E5-2673 host, so the vCPUs across either machine are closer in performance equivalency within the SKU series.
Put simply, you have no idea how many VMs Microsoft is running off any given processor and it would only make sense that they run more off a higher performing host system.
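The "more VMs per faster host" argument can be illustrated with the published chip specs. How many VMs Azure actually packs per host is not public, so the packing computed here is purely an assumption for illustration:

```python
# Upper bound on VMs per host if every vCPU maps to one hardware thread.
# Core/thread counts are the published specs for each chip; socket count
# and the 1 vCPU = 1 thread mapping are assumptions.

HOSTS = {
    "E5-2673 v3 (Haswell)": {"cores": 12, "threads": 24},
    "Xeon Platinum 8171M (Skylake)": {"cores": 26, "threads": 52},
}

def max_vms(host: str, vcpus_per_vm: int, sockets: int = 2) -> int:
    """Maximum VMs of a given size that fit on a host's hardware threads."""
    total_threads = HOSTS[host]["threads"] * sockets
    return total_threads // vcpus_per_vm

for name in HOSTS:
    print(name, "->", max_vms(name, vcpus_per_vm=4), "four-vCPU VMs max")
```

Under these assumptions the Skylake host can carry roughly twice as many VMs, which is consistent with the idea that per-vCPU performance ends up similar across host generations.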
If you want the full operational performance of the processor, you'd have to buy a dedicated host. Note that the dedicated-host pricing sheet names precisely which processor SKU you get for your money, unlike the vCPU mix-and-match happening in the D#_v2 SKU series.
When comparing two different VM series in Azure, I see that one has Cores and the other one vCPUs. Keeping aside the number of Cores/CPUs, Memory and Processor Type (Intel Xeon E/Platinum etc), what is the advantage of one over the other? I understand that CPU can have multiple cores, but in Azure what is the difference between 4 vCPUs and 4 vCores?
G Series with Core
D Series with vCPU
A core is a physical unit of a CPU.
A virtual CPU (vCPU), also known as a virtual processor, is a share of a physical central processing unit (CPU) that is assigned to a virtual machine (VM); with hyper-threading enabled, one vCPU typically corresponds to one hardware thread.
For more details, you can refer to these msdn answers: this and this.
I'm reading the documentation of the SQL Databases on Microsoft Azure about the performance between two kinds of database service, GEN4 and GEN5. Currently the documentation shows that GEN4 CPUs are based on Intel E5-2673 v3 (Haswell) 2.4 GHz processors and 1 vCore = 1 physical CPU, and GEN5 logical CPUs are based on Intel E5-2673 v4 (Broadwell) 2.3 GHz processors where 1 vCore = 1 Hyper thread.
My question is: is 1 physical CPU in GEN4 equivalent to an entire Intel E5-2673 v3 with 12 cores and 24 logical processors, or to an individual core? And is 1 hyper-thread in GEN5 equivalent to one logical core of a physical core on an Intel E5-2673 v4?
This is the link to the documentation: Azure SQL Database pricing
Is 1 physical CPU in GEN4 equivalent to an entire Intel E5-2673 v3 with 12 cores and 24 logical processors, or is it an individual core?
1 physical CPU in GEN4 represents one core of the Intel E5-2673 v3 (Haswell) 2.4 GHz processor, not the entire chip.
Is GEN5 1 hyper Thread equivalent to a logical core of a physical core on a Intel E5-2673 v4?
Introduction to hyper-threading:
Hyper-threading (officially called Hyper-Threading Technology or HT Technology, and abbreviated as HTT or HT) is Intel's proprietary simultaneous multithreading (SMT) implementation used to improve parallelization of computations (doing multiple tasks at once) performed on x86 microprocessors. It first appeared in February 2002 on Xeon server processors and in November 2002 on Pentium 4 desktop CPUs. Later, Intel included this technology in Itanium, Atom, and Core 'i' Series CPUs, among others.
For each processor core that is physically present, the operating system addresses two virtual (logical) cores and shares the workload between them when possible. The main function of hyper-threading is to increase the number of independent instructions in the pipeline; it takes advantage of superscalar architecture, in which multiple instructions operate on separate data in parallel. With HTT, one physical core appears as two processors to the operating system, allowing concurrent scheduling of two processes per core. In addition, two or more processes can use the same resources: if resources for one process are not available, then another process can continue if its resources are available.
In addition to requiring simultaneous multithreading (SMT) support in the operating system, hyper-threading can be properly utilized only with an operating system specifically optimized for it. Furthermore, Intel recommends HTT to be disabled when using operating systems unaware of this hardware feature.
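The core relationship described above is simple enough to state as code: with hyper-threading on, the OS sees two logical processors per physical core. A minimal sketch:

```python
# With Intel HTT, each physical core exposes two logical processors.

THREADS_PER_CORE = 2  # Intel's HTT implementation

def logical_processors(physical_cores: int, htt_enabled: bool = True) -> int:
    """Logical processors the OS sees for a given physical core count."""
    return physical_cores * (THREADS_PER_CORE if htt_enabled else 1)

# A 12-core E5-2673 v3 exposes 24 logical processors with HTT on:
print(logical_processors(12))         # HTT on
print(logical_processors(12, False))  # HTT off
```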
For more information about hyper-threading, refer to: Hyper-Threading
It seems that Microsoft is intentionally being deceptive with how they've labeled/described the CPU count between the two models. It seems pretty clear, based on the wording as you described and the performance we are seeing, that the same level in GEN5 maps to half as many physical cores. This makes sense when you figure that you get improved hardware in GEN5, but the prices are the same for the same levels.
We have many processor-intensive analytical queries; in testing over the last week, we had to go to GEN5_16 in order to get the same performance as GEN4_8. Unfortunately, the price skyrockets from $42k a year to $84k to do this. We moved to GEN5_8 over the holidays and are currently suffering from incredible contention on GEN5 and log-rate throttles on simple SELECT INTO queries, and are in a pickle. We are constantly bumping up against the 1 TB limit of GEN4 (MSSQL's log growth kills us - we don't need full recovery but have no choice), but we never had performance or throttling issues on GEN4_8.
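The GEN4-vs-GEN5 arithmetic in this thread can be sketched as follows. It assumes a hyper-thread delivers roughly half the throughput of a dedicated physical core, which is a crude simplification (real SMT gains vary by workload), but it matches the GEN5_16 ≈ GEN4_8 experience reported above:

```python
# Rough model: GEN4 vCore = 1 physical core, GEN5 vCore = 1 hyper-thread.
# The 0.5 cores-per-hyper-thread factor is an ASSUMPTION for illustration.

CORES_PER_VCORE = {"gen4": 1.0, "gen5": 0.5}

def physical_core_equivalent(gen: str, vcores: int) -> float:
    """Approximate physical-core capacity of a vCore count on each generation."""
    return vcores * CORES_PER_VCORE[gen]

print(physical_core_equivalent("gen4", 8))   # GEN4_8
print(physical_core_equivalent("gen5", 8))   # GEN5_8: half the core capacity
print(physical_core_equivalent("gen5", 16))  # GEN5_16: roughly matches GEN4_8
```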
I am trying to find equivalent of an i3.4xlarge AWS ec2 instance on Azure. I am not sure if Microsoft Azure VMs have NVMe drives.
Does Azure have NVMe based VMs?
Microsoft has announced NVMe storage is now available.
https://learn.microsoft.com/en-us/azure/virtual-machines/windows/sizes-storage
The Lsv2-series features high throughput, low latency, directly mapped local NVMe storage running on the AMD EPYC™ 7551 processor with an all core boost of 2.55GHz and a max boost of 3.0GHz. The Lsv2-series VMs come in sizes from 8 to 80 vCPU in a simultaneous multi-threading configuration. There is 8 GiB of memory per vCPU, and one 1.92TB NVMe SSD M.2 device per 8 vCPUs, with up to 19.2TB (10x1.92TB) available on the L80s v2.
Specifically for your question an AWS i3.4xlarge has 16 vCPUs, 122 GiB memory, and 2 x 1.9 TB NVMe while the generally equivalent Azure Standard_L16s_v2 has 16 vCPU, 128 GiB memory, and 2 x 1.92 TB NVMe.
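The Lsv2 sizing rule quoted above (8 GiB of memory per vCPU, one 1.92 TB NVMe device per 8 vCPUs) can be written as a small function and checked against the Standard_L16s_v2 figures in this answer:

```python
# Derive Lsv2 memory and NVMe layout from the vCPU count, per the
# quoted rule: 8 GiB RAM per vCPU, one 1.92 TB NVMe device per 8 vCPUs.

def lsv2_specs(vcpus: int) -> dict:
    assert vcpus % 8 == 0, "Lsv2 sizes come in multiples of 8 vCPUs"
    devices = vcpus // 8
    return {
        "memory_gib": vcpus * 8,
        "nvme_devices": devices,
        "nvme_tb_total": devices * 1.92,
    }

print(lsv2_specs(16))  # Standard_L16s_v2: 128 GiB RAM, 2 x 1.92 TB NVMe
print(lsv2_specs(80))  # L80s_v2: 640 GiB RAM, 10 devices, 19.2 TB total
```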
Short answer used to be no.
The closest match on Azure was the L-Series VMs, although the original Ls-series did not have NVMe and used 'Premium Storage'. The newer Lsv2-series, however, does include local NVMe:
The Lsv2-series features high throughput, low latency, directly mapped local NVMe storage running on the AMD EPYC™ 7551 processor with an all core boost of 2.55GHz and a max boost of 3.0GHz. The Lsv2-series VMs come in sizes from 8 to 80 vCPU in a simultaneous multi-threading configuration. There is 8 GiB of memory per vCPU, and one 1.92TB NVMe SSD M.2 device per 8 vCPUs, with up to 19.2TB (10x1.92TB) available on the L80s v2.
https://learn.microsoft.com/en-us/azure/virtual-machines/windows/sizes-storage
Please be aware that the NVMe disks provided with the Lsv2-series are ephemeral, so you'll lose everything if you need to stop/deallocate the VM for whatever reason:
2 Local NVMe disks are ephemeral, data will be lost on these disks if you stop/deallocate your VM.
I've learned it the hard way recently, I had to stop the VM to resize the main OS disk (as Azure doesn't support online resizing) and after the operation my whole database was gone!
I am currently running out of memory on my 2013 MacBook Pro (8 GB 1600 MHz DDR3 memory and a 2 GHz Intel Core i7 processor) while running different scikit-learn models (random search over MLPRegressor and GradientBoostingRegressor) on a 50,000-sample data set with ~70 features, most of which are categorical. I have set up a VM on Google Cloud Platform but have not seen much of an improvement in execution time. Here are the specs of the VM: machine type n1-standard-8 (8 vCPUs, 30 GB memory), source image ubuntu-1604-xenial-v20180126. I'm wondering if anyone has any recommendations on tweaking VM specs for learning data science. I'm not looking to add any GPUs due to cost. Thank you
As the volume and nature of the data you plan to process is known to you, simply experimenting with different machines on Google Cloud Platform is the best way to choose the most effective option.
General information on the subject can be found in the "Google Cloud Platform for Data Scientists" document.
In case you may still consider this option, "Graphics Processing Unit (GPU)" provides an overview of this component in the context of data science.
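Before sizing up the machine, it is worth a back-of-envelope check on whether the data set itself is the memory problem. The average category cardinality below is an assumption for illustration (one-hot encoding multiplies the column count by roughly the average number of categories per feature):

```python
# Rough dense-array memory estimate for 50,000 rows x ~70 features,
# before and after one-hot encoding. Assumes float64 (8 bytes per value)
# and an ASSUMED average of ~20 categories per categorical feature.

def dense_size_gib(rows: int, cols: int, bytes_per_value: int = 8) -> float:
    """Size in GiB of a dense float array with the given shape."""
    return rows * cols * bytes_per_value / 2**30

raw = dense_size_gib(50_000, 70)
one_hot = dense_size_gib(50_000, 70 * 20)
print(f"raw: {raw:.3f} GiB, one-hot (dense): {one_hot:.3f} GiB")
```

Even one-hot encoded, this data is well under 1 GiB dense, which suggests the memory pressure is more likely coming from the copies made by parallel search workers than from the data itself.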