Custom container using Databricks Runtime ML as base image

I would like to create a Docker image based on the Databricks Runtime for Machine Learning and extend it further.
I can extend the base runtime simply by doing the following in a Dockerfile:
FROM databricksruntime/standard:9.x
Can I do something similar for the ML runtime?
As a follow-up, can I see the source code of the ML runtime?
Cheers

ML Runtime = Standard Runtime + ML Libs.
Databricks Runtime 10.2 ML is built on top of Databricks Runtime 10.2.
Reference
You can recreate it by installing the libraries mentioned here.
Version 10.2 is used as an example in this answer.
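For example, a rough Dockerfile sketch: the base tag, the pip path (taken from the pattern used in the databricksruntime example Dockerfiles), and the package names are assumptions, so take the exact library list and versions from the 10.2 ML release notes linked above.
# Base tag is an assumption; pick the standard-runtime tag matching the ML runtime you want to mirror.
FROM databricksruntime/standard:10.x
# Install the ML libraries listed in the Databricks Runtime 10.2 ML release notes.
# The package names here are only illustrative, not the complete list; pin the versions from the notes.
RUN /databricks/python3/bin/pip install scikit-learn xgboost mlflow
# Extend further with your own dependencies.
RUN /databricks/python3/bin/pip install <your-extra-packages>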

Related

MSI/Exe to MSIX package conversion process using pipelines

I have been looking everywhere for guidance on converting MSI/exe files to MSIX packages using Azure DevOps pipelines, but I have been unable to find any information about it.
There are plenty of articles explaining the conversion process using the tool, but I am trying to automate the conversion process.
Can someone point me in the right direction, and is it even possible to convert the files into an MSIX package using pipelines?
A pipeline would help us deal with new binaries (in exe or MSI format) becoming available, so that they can be packaged into an MSIX automatically.
The idea is to use these packages to create a VHDX and attach it to Azure Virtual Desktop (using MSIX app attach).
Any guidance would be greatly appreciated.
I assume you are referring to the MSIX Packaging Tool. This tool is not recommended to be used if you have access to the source code.
The MSIX Packaging Tool is intended to be used by IT pros to create MSIX packages for apps where they don't have the source code. It is not designed for developers.
If you have access to your source code, which it sounds like you do, then you can use Visual Studio or third-party tools like Advanced Installer to create the MSIX package.
This older SO question has more details: How to build an MSIX from comandline
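For the automation side, one approach is to wrap the app in a Windows Application Packaging Project (.wapproj) and have the pipeline drive MSBuild. A rough Azure DevOps YAML sketch, assuming a solution called MyApp.sln that contains such a packaging project (the MSBuild property names should be checked against the current packaging docs, and signing is disabled here for simplicity):
pool:
  vmImage: 'windows-latest'

steps:
# Build the solution and let the packaging project emit an MSIX package.
- task: VSBuild@1
  inputs:
    solution: 'MyApp.sln'
    platform: 'x64'
    configuration: 'Release'
    msbuildArgs: '/p:UapAppxPackageBuildMode=SideloadOnly /p:AppxBundle=Never /p:AppxPackageSigningEnabled=false /p:AppxPackageDir=$(Build.ArtifactStagingDirectory)\msix\'

# Publish the generated package so it can be picked up for VHDX creation / MSIX app attach.
- task: PublishBuildArtifacts@1
  inputs:
    PathtoPublish: '$(Build.ArtifactStagingDirectory)\msix'
    ArtifactName: 'msix'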

Need to install python packages in Azure ML studio

I am new to Azure ML Studio and trying to run a Python script.
Currently I am working on text-analytics code, and as part of that I want to get the singular values of an SVD decomposition, something like below:
lsa=TruncatedSVD(algorithm='randomized',n_components=MaximumNumComponents,n_iter=20,random_state=42,tol=0.0)
U = lsa.fit_transform(X)
Sigma = lsa.singular_values_
The current version of scikit-learn in Azure ML Studio is 0.17, and singular_values_ is only available in later versions of scikit-learn such as 0.20.
So I need to upgrade the scikit-learn package to 0.20. I downloaded the scikit-learn 0.20 wheel file, zipped it, uploaded it as a dataset into Azure ML Studio and connected it, but I am still getting an error like the one below:
AttributeError: 'TruncatedSVD' object has no attribute 'singular_values_'
Process returned with non-zero exit code 1
I have already referred to the questions below as well:
Stack Overflow questions related to upgrading packages in Azure ML Studio
How can I install Python packages in Azure ML?
Currently this is a limitation of Azure ML Studio. The only way is to reach out to support, who can generate the necessary package, which you can then upload to Azure ML Studio.
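As a workaround while stuck on scikit-learn 0.17, the singular values can be recovered without the singular_values_ attribute: fit_transform returns roughly U * Sigma with orthonormal columns of U, so the column norms of the transformed matrix are the singular values. A minimal sketch (the random matrix stands in for the question's X and MaximumNumComponents):
import numpy as np
from sklearn.decomposition import TruncatedSVD

# Toy stand-in for the question's document-term matrix X and MaximumNumComponents.
rng = np.random.RandomState(42)
X = rng.rand(100, 50)
n_components = 10

lsa = TruncatedSVD(algorithm='randomized', n_components=n_components,
                   n_iter=20, random_state=42, tol=0.0)
X_transformed = lsa.fit_transform(X)

# No singular_values_ in 0.17, but X_transformed ~= U * Sigma with orthonormal U,
# so the column norms recover the singular values.
Sigma = np.linalg.norm(X_transformed, axis=0)
print(Sigma)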

Execute spark jobs in Azure ML studio

I am trying to run some Spark scripts using Execute Python Script in Azure ML Studio, and I am getting an error saying:
unable to import spark libraries
Basically, I am trying to create web services in ML Studio for the models that have been developed.
Is it possible or feasible to run Spark jobs using ML Studio?
Can anyone please help me on this.
Thanks in advance.

What is the best way to deploy a tensorflow trained graph into production?

I have been working on machine learning problems lately as part of my internship. So far I have been using TensorFlow with Python because that's what I am most comfortable with. Once a problem is solved using deep learning, I am left with the architecture of the network and the weights. Now, my problem is: how can I deploy my solution in production? I won't be using TensorFlow Serving because it is mainly for huge applications where you set up a remote server and your application makes requests to that server. In my case, I just want to develop a machine learning solution and integrate it into an already existing piece of software that uses C++ with Visual Studio 2017.
So far and after a lot of research, I have some solutions in mind :
1) Using the "dnn" module from OpenCV: this module can load graphs, and you can do inference and other operations (like extracting a specific layer from the network at run time). This module seemed very promising, but then I started facing problems when using networks that are a little different from the one used in the example described on the OpenCV GitHub. They used "inception5h" for the example, and when I tried to load "inception_v3" there was an error about an unknown layer in the network, namely the JPEG_decode layer.
2) Building TensorFlow from source and using it directly with C++. This solution seemed like the best one, but then I encountered many problems with parts of my code not compiling while others did. I am using Visual Studio 2017 with Windows 10. So although I was able to build TensorFlow from source, I wasn't able to compile all parts of my code; in fact, it wasn't even my code, it was an example from the TensorFlow website, this one: tensorflow C++ example.
3) Another possibility that I am entertaining is using TensorFlow for designing the solution and then using another machine learning framework such as Caffe2, CNTK, etc. for deployment into production. I have found some possibilities to convert graphs from one framework to another here: models converters. I thought this could be a reasonable solution, because all I have to do is find the framework most compatible with Windows and do a model conversion once I finish designing my solution in TensorFlow and Python. The conversion process, though, seems a little too good to be true; am I wrong?
4) A final possibility that I am thinking of is using CPython. Basically, I will create the prediction pipeline in Python, wrap it in some Python functions, then use <Python.h> in my Visual Studio project and call those functions from C++; here's an example: embedding python in C++ (a rough sketch of what I mean is included below). I have never used a solution like this before, and I am not sure about all the things that could go wrong.
So basically, what do you think is the best solution to deploy a machine learning solution into an already existing project on Visual Studio that uses C++? Am I missing a better solution? Any guidelines or hints are greatly appreciated!
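For what it's worth, here is a minimal sketch of what I mean by option 4; the module name predict and its predict(path) function are made up for illustration, predict.py would have to be importable (on sys.path or next to the executable), and error checking is omitted:
#include <Python.h>
#include <cstdio>

int main() {
    Py_Initialize();

    // Hypothetical module predict.py exposing predict(image_path) -> float.
    PyObject* module = PyImport_ImportModule("predict");
    PyObject* func   = PyObject_GetAttrString(module, "predict");

    // Call predict("image.jpg") and read back a double.
    PyObject* args   = Py_BuildValue("(s)", "image.jpg");
    PyObject* result = PyObject_CallObject(func, args);
    double score = PyFloat_AsDouble(result);
    std::printf("score: %f\n", score);

    Py_DECREF(result);
    Py_DECREF(args);
    Py_DECREF(func);
    Py_DECREF(module);
    Py_Finalize();
    return 0;
}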
I ended up using solution 2. After the new updates from tensorflow, it's now easier to build tensorflow from source on Windows. With this solution, I didn't need to worry about the compatibility of my models since I use tensorflow with python for prototyping and I use it with C++ for production.
[EDIT] : In 2021, I am now using ONNX Runtime (ORT) for deploying my models in production as part of a C++ application. The documentation for ORT is not great but the tool itself is very good.
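For reference, a minimal ONNX Runtime C++ inference sketch; the model path, tensor shape, and the input/output node names are placeholders, and error handling is omitted:
#include <onnxruntime_cxx_api.h>
#include <cstdio>
#include <vector>

int main() {
    // Load the exported ONNX model (wide-string path on Windows).
    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "demo");
    Ort::SessionOptions opts;
    Ort::Session session(env, L"model.onnx", opts);

    // Dummy input tensor of shape [1, 3, 224, 224].
    std::vector<int64_t> shape{1, 3, 224, 224};
    std::vector<float> input(1 * 3 * 224 * 224, 0.0f);
    Ort::MemoryInfo mem = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
    Ort::Value tensor = Ort::Value::CreateTensor<float>(
        mem, input.data(), input.size(), shape.data(), shape.size());

    // Input/output names must match the names in the exported graph.
    const char* input_names[]  = {"input"};
    const char* output_names[] = {"output"};
    auto outputs = session.Run(Ort::RunOptions{nullptr},
                               input_names, &tensor, 1, output_names, 1);

    float* scores = outputs[0].GetTensorMutableData<float>();
    std::printf("first output value: %f\n", scores[0]);
    return 0;
}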
I used CNTK from the beginning because I just wanted to stay in my C++ world in Visual Studio, and knew that I wanted to deploy as part of my C++ desktop App. No Tensorflow, no Python, no cloud, not even .NET, and no translating models. Just do it in CNTK from the start. I have a commercial product now using Deep Learning. Cool!
I'd consider exporting your NN model (this is not restricted to TensorFlow) via ONNX to Intel OpenVINO or TensorRT, so you can run it from C++ with an optimized CPU or GPU backend.
It is stated here that OpenVINO is twice as fast as TensorFlow.
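The export step itself can be done with the tf2onnx converter; a hedged one-liner, assuming a TensorFlow SavedModel on disk (the paths and opset are placeholders):
python -m tf2onnx.convert --saved-model ./my_saved_model --output model.onnx --opset 13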

How to use https://github.com/Azure/azure-powershell source code?

I am new to Windows PowerShell and Visual Studio. I followed the Azure PowerShell developer guide on GitHub (https://github.com/Azure/azure-powershell/wiki/Microsoft-Azure-PowerShell-Developer-Guide) up to the "Build Solution. Execute tools\Build.ps1" step. The build is failing with 802 errors. How do I import the project into Visual Studio and run the samples? What kind of project is it actually: C#, VB, or does it just contain cmdlets and sample tests?
There are two parts to the Azure PowerShell repository.
Firstly, there is the source code. Unless you're a developer or are interested in the inner workings of Azure PowerShell, this probably isn't what you are looking for (my apologies if it is!).
As a general overview though, Build.ps1 configures your machine to build the Azure PowerShell cmdlets. The source code is C# and is located here. Build.ps1 installs the following:
Install Windows Azure SDK
Install Python 2.7 x86
Install Django
Install WiX 3.8
Make sure that git.exe and the WiX bin folder are in your PATH environment variable.
Set the environment variable EnableNuGetPackageRestore to true.
If you do those things manually, you can then load the project into Visual Studio and compile the cmdlets.
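A minimal PowerShell sketch of those manual steps; the install paths below are assumptions for a default setup, so adjust them for your machine:
# Make git.exe and the WiX bin folder available on PATH for this session
# (install locations are assumptions; adjust for your machine).
$env:Path += ";C:\Program Files\Git\bin;C:\Program Files (x86)\WiX Toolset v3.8\bin"

# Enable NuGet package restore, as the build requires.
[Environment]::SetEnvironmentVariable("EnableNuGetPackageRestore", "true", "User")
$env:EnableNuGetPackageRestore = "true"

# Build the cmdlets from the repository root.
.\tools\Build.ps1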
However, the second purpose of that repository is what I suspect you are actually looking for. If you look here, you will find the releases of the installers that you can download to install Azure PowerShell. Once these are installed on your machine, you can browse and configure Azure services via PowerShell.
For instance, under version 1.0.1 you will find the download for the latest version of Azure PowerShell (at the time of writing, anyway!).
If you are just looking to learn PowerShell and Azure, then that is probably the place you want to start.
If you are looking to explore the source code, then that is there also.
