Running scripts from a datastore on Azure Machine Learning Service

I am migrating from Batch AI to the new Azure Machine Learning Service. Previously I had my Python scripts on an Azure Files share, and those scripts ran directly from there.
In the new service, when you create an Estimator you have to provide a source directory and an entry script. The documentation states that the source directory is a local directory that is copied to the remote computer.
However, the Estimator constructor also lets you specify a datastore name that is supposed to identify the datastore for the project share.
To me, this sounds like you can specify a datastore and the source directory is then resolved relative to it. However, this does not work: it still looks for the source directory on the local machine.
tf_est = TensorFlow(source_directory='./script',
                    source_directory_data_store=ds,
                    script_params=script_params,
                    compute_target=compute_target,
                    entry_script='helloworld.py',
                    use_gpu=False)
Does anybody know if it's possible to run a training job using a datastore for execution?

Related

SAP Commerce Cloud Hot Folder local setup

We are trying to use the cloud hot folder functionality, and to do so we are modifying our existing hot folder implementation, which was not originally written for use in the cloud.
We are following the steps on this help page:
https://help.sap.com/viewer/0fa6bcf4736c46f78c248512391eb467/SHIP/en-US/4abf9290a64f43b59fbf35a3d8e5ba4d.html
We are trying to test the cloud functionality locally. I have an Azurite Docker container running on my machine and I have modified the properties mentioned there in the local.properties file, but the files are not being picked up by hybris in any of the cases we have tried.
First, our local Azurite storage has a blob container called hybris. Within this container we have the folders master > hotfolder, and according to the docs, uploading a sample.csv file there should trigger a hot folder upload.
Also, we have a mapping for our hot folder import that scans the files within this folder: #{baseDirectory}/${tenantId}/sample/classifications. baseDirectory is configured using a property like so: ${HYBRIS_DATA_DIR}/sample/import
Can we keep these mappings within our hot folder xml definitions, or do we need to change them?
How should the blob container be named in order for it to be accessible to hybris?
Thank you very much,
I would be very happy to provide any further information.
In the end I did manage to run cloud hot folder imports on my local machine.
It was a matter of correctly configuring a number of properties that are used by the cloudhotfolder and azurecloudhotfolder extensions.
Use the following properties to set the desired behaviour of the system:
cluster.node.groups=integration,yHotfolderCandidate
azure.hotfolder.storage.account.connection-string=DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://127.0.0.1:32770/devstoreaccount1;
azure.hotfolder.storage.container.hotfolder=${tenantId}/your/path/here
cloud.hotfolder.default.mapping.file.name.pattern=^(customer|product|url_media|sampleFilePattern|anotherFileNamePattern)-\\d+.*
cloud.hotfolder.default.images.root.url=http://127.0.0.1:32785/devstoreaccount1/${azure.hotfolder.storage.container.name}/master/path/to/media/folder
cloud.hotfolder.default.mapping.header.catalog=YourProductCatalog
And that is it. If there are existing routings for the traditional hot folder import, these can also be used, but their mappings should be in the value of the cloud.hotfolder.default.mapping.file.name.pattern property.
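As a quick way to see which file names the pattern above would accept, here is a minimal Python sketch. It assumes that the doubled backslash in the properties file is only an escape (so the effective regex uses \d) and that Python's re module treats this simple pattern the same way the Java side does; the helper name is_picked_up is hypothetical.

```python
import re

# Pattern copied from cloud.hotfolder.default.mapping.file.name.pattern.
# Assumption: "\\d" in the .properties file is an escaped backslash,
# so the effective regex is "\d".
FILE_NAME_PATTERN = re.compile(
    r"^(customer|product|url_media|sampleFilePattern|anotherFileNamePattern)-\d+.*"
)


def is_picked_up(file_name: str) -> bool:
    """Return True if the file name matches the mapping pattern (hypothetical helper)."""
    return FILE_NAME_PATTERN.match(file_name) is not None


if __name__ == "__main__":
    for name in ["product-001.csv", "customer-20200101-extra.csv", "sample.csv", "product.csv"]:
        print(name, is_picked_up(name))
```

Under these assumptions a file needs one of the listed prefixes followed by a hyphen and at least one digit, so a plain sample.csv would not match.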
I am trying the same thing: setting up a local dev environment to test the cloud hot folder. It seems that you have had some success. Can you say where you found the azurecloudhotfolder extension, which is called out here: https://help.sap.com/viewer/0fa6bcf4736c46f78c248512391eb467/SHIP/en-US/4abf9290a64f43b59fbf35a3d8e5ba4d.html
Thanks

What is the meaning of each part of this luminoth command?

I am trying to train on a dataset using luminoth. However, as my computer has a poor GPU, I am planning to use gcloud. It seems that luminoth has gcloud integration according to the docs (https://media.readthedocs.org/pdf/luminoth/latest/luminoth.pdf).
Here is what I have done.
1. Create a Google Cloud project.
2. Install Google Cloud SDK on your machine.
3. gcloud auth login
4. Enable the following APIs:
• Compute Engine
• Cloud Machine Learning Engine
• Google Cloud Storage
I did it through the webconsole.
Now here is where I am stuck.
5. Upload your dataset’s TFRecord files to a Cloud Storage bucket:
The command for this is:
gsutil -o GSUtil:parallel_composite_upload_threshold=150M cp -r /path/to/dataset/tfrecords gs://your_bucket/path
I have the TFRecord files, and the data I need to train on, on my local drive. However, I am not sure what each part of the gsutil command means. For /path/to/dataset/, do I simply enter the directory my data is in? I have also already uploaded the files to a bucket. Do I simply provide the path to it?
Additionally, I am currently getting this error:
does not have permission to access project (or it may not exist)
Apologies if this may be a stupid question.
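To make the parts of the gsutil command above easier to read, here is a small Python sketch that assembles the same command as an argument list, with one comment per part. The function name build_upload_command and the example paths are hypothetical; the sketch only builds and prints the command and does not run it.

```python
import shlex
from typing import List


def build_upload_command(local_tfrecords_dir: str, bucket_path: str) -> List[str]:
    """Assemble the gsutil upload command from the guide (hypothetical helper)."""
    return [
        "gsutil",
        # -o overrides one configuration value for this invocation only.
        # Here: files larger than 150M are uploaded in parallel pieces.
        "-o", "GSUtil:parallel_composite_upload_threshold=150M",
        # cp copies files; -r makes the copy recursive (whole directory).
        "cp", "-r",
        # Source: the local directory that holds the TFRecord files.
        local_tfrecords_dir,
        # Destination: a gs:// URL made of the bucket name and a path inside it.
        bucket_path,
    ]


if __name__ == "__main__":
    # Example values only; replace with your own directory and bucket.
    command = build_upload_command("/home/me/dataset/tfrecords", "gs://my_bucket/dataset")
    print(shlex.join(command))
```

Read this way, /path/to/dataset/tfrecords stands for the local directory containing the TFRecord files, and gs://your_bucket/path stands for the bucket and the location inside it.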

Jenkins Slave Node : Can I use it to take over build done on different domain?

I have successfully set up Jenkins on a local domain as a test. It builds from SCM, zips the build, extracts it to a unique timestamped folder, and then copies the files over to the IIS folder.
I now have to set it up to deploy to an Azure VM. Now things are getting hairy.
I can get the file to copy across, but it takes a long time. Unzipping literally takes an hour.
Cross-domain user rights are also making things difficult, as the user running the Jenkins service does not exist on the production boxes, which are on Azure domains.
What are my options?
Should I install a slave node on the production box, "activate" the slave from the master, and then let the slave:
1. perhaps copy the file over from Azure storage to the production box?
2. extract the files
3. copy the files to the IIS folder.
Well, there's no clear answer to this; try what works best for you. The main options I see are:
1. Use a slave node in Azure, upload the zip to some place (an Azure storage account or whatever) and let the slave node handle the download\unpacking\etc.
2. Use remote PowerShell to connect directly to the servers in Azure, download the zip from the web (or Azure storage or whatever) and extract it.
3. Use a tool, like Octopus, which does literally the same, but is kind of built with deployments in mind.
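Whichever option does the work, the steps that run on the production box are the ones from the question: extract the zip into a timestamped folder, then copy the files into the site folder. A minimal Python sketch of that part follows; the function name deploy_build and all paths are hypothetical, and it assumes the zip has already been downloaded to the box.

```python
import shutil
import zipfile
from datetime import datetime
from pathlib import Path


def deploy_build(zip_path: Path, releases_root: Path, site_dir: Path) -> Path:
    """Extract a build zip into a timestamped folder, then copy it to the site folder."""
    release_dir = releases_root / datetime.now().strftime("%Y%m%d-%H%M%S")
    # Fail rather than silently reuse a folder from an earlier deployment.
    release_dir.mkdir(parents=True, exist_ok=False)

    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(release_dir)

    # Copy over the existing site content (requires Python 3.8+).
    shutil.copytree(release_dir, site_dir, dirs_exist_ok=True)
    return release_dir
```

Running the extraction locally on the target machine, rather than across the network from the master, is the point of options 1 and 2.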

access certain folder in azure cloud service

In my code (which has a worker role) I need to specify a path to a directory (a third-party library requires it). Locally I've included the folder in the project and just give the full path to it. However, after deployment I of course need a new path. How do I confirm that the whole folder has been deployed, and how do I determine the new path to it?
Edit:
I added the folder to the role node in Visual Studio and accessed it like this: Path.Combine(Environment.GetEnvironmentVariable("RoleRoot"), "my_folder");
Will this directory be used for reading and writing? If yes, you should use a LocalStorage resource. https://azure.microsoft.com/en-us/documentation/articles/cloud-services-configure-local-storage-resources/ shows how to use this.
If the directory is only for reading (i.e. you have binaries or config files there), then you can use the %RoleRoot% environment variable to identify the path where your package was deployed, then just append whatever folder you referenced in your project (e.g. %RoleRoot%\Myfiles).
I'd take a slightly different approach. Place the 3rd party package into Windows Azure blob storage, then during role startup, you can download/extract it and place the files into the available Local storage (giving it whatever permissions the app needs). Then leverage that location from your application via the same local storage configuration entry.
This should help you reduce the size of your deployment package as well as give you the ability to update the 3rd party components without completely redeploying your solution. And by leveraging it on startup, you can guarantee that the files will be there in case the role instance gets torn down and rebuilt.
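To check that a folder really arrived with the deployment, one option is to resolve it from the RoleRoot environment variable and compare its contents against a list of expected files. The sketch below shows that idea in Python purely for illustration; the helper names and the expected file list are hypothetical, and in a .NET role the same logic would be written with Path.Combine and File.Exists.

```python
import os
from pathlib import Path
from typing import Iterable, List


def resolve_role_folder(folder_name: str, env_var: str = "RoleRoot") -> Path:
    """Build the deployed folder path from the role root environment variable."""
    root = os.environ.get(env_var)
    if not root:
        raise RuntimeError(f"{env_var} is not set; is this running inside a role?")
    return Path(root) / folder_name


def missing_files(folder: Path, expected: Iterable[str]) -> List[str]:
    """Return the expected file names that are not present in the folder."""
    return [name for name in expected if not (folder / name).exists()]
```

Logging the resolved path and the result of missing_files once at role startup gives a simple confirmation that the folder was deployed where the application expects it.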

OnStart vs Startup Script for batch file?

I have a Ruby on Rails application that needs to find a home in an Azure Worker Role.
I currently automate the deployment of the application with a batch file - a file that takes the apache and ruby installers, runs them, and then drops the RoR app in the appropriate directory. After the batch script finishes, Apache is serving to and from the application via port 80.
I'm new to Azure and trying to figure out how to do this.
From my understanding, I have two options here: OnStart with the installation files in Blob Storage, or a startup script. I'm not sure how to do the latter, but I have located the onStart method within the WorkerRole.vb file in the new Azure project I just created.
My question: Is it recommended to use OnStart to deploy the application (using the batch script)? If so, how would I go about integrating the script into the project? And - how do I get started with storing and referencing the files in blob storage?
I know these are super high-level questions. Any input or suggested reading would be super helpful. I have tried to google / search for relevant resources but haven't been able to find much. Thank you for your time!
Inside the OnStart() function it is better to do role configuration tasks, e.g. IP binding. However, if you want to install a runtime, download an application zip, or configure role-specific settings, it is best to use a startup task. Please visit my blog post "Windows Azure: Startup task or OnStart(), which to choose?" to learn more about it.
In your case it is best to use a startup task. What you can do is the following:
1. Package your RoR application as a zip and place it in Windows Azure Blob Storage.
2. Create a command batch file which will:
2.1 Download the zip
2.2 Unzip the zip content to a specific location
2.3 Update the status back to Azure Blob Storage (optional)
3. In your OnStart() function you just need to configure the RoR application.
The code will look as below if you have a TCP endpoint named "RORWeb80" set to use port 80:
TcpListener RoRPortListener = new TcpListener(RoleEnvironment.CurrentRoleInstance.InstanceEndpoints["RORWeb80"].IPEndpoint);
RoRPortListener.Start();
I have written a sample app for a Tomcat/Java-based worker role which does exactly the same. So what you can do is just replace the Tomcat ZIP file with the RoR ZIP and reuse the code as is.
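The download and unzip steps of the startup task described above can be sketched as follows. This is an illustration only, written in Python rather than as the batch file the answer describes: it assumes the package blob is reachable through a plain URL (for example a public blob or a URL that carries an access token) and that a Python runtime is available on the instance. The function name and paths are hypothetical.

```python
import shutil
import urllib.request
import zipfile
from pathlib import Path


def download_and_unzip(package_url: str, work_dir: Path) -> Path:
    """Download a zip package from a URL and extract it under work_dir/app."""
    work_dir.mkdir(parents=True, exist_ok=True)
    zip_path = work_dir / "package.zip"

    # Stream the download to disk instead of holding it all in memory.
    with urllib.request.urlopen(package_url) as response, open(zip_path, "wb") as out:
        shutil.copyfileobj(response, out)

    target = work_dir / "app"
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target
```

Keeping the package outside the deployment means the same startup logic picks up a new application zip without a redeploy.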
As long as you don't need admin-level access (e.g. modifying registry, installing msi's, etc.) you can do your setup from OnStart(), including launching your script. Just include the startup script with your project (don't forget to set Copy Local to true).
The same goes for a startup script: you call your cmd file, which then executes the sequence for you. And if you give it elevated permissions, you can run installers, modify registry settings, install custom perf counters, whatever.
In either case: you can keep your apache zip, ruby installers, etc. in blob storage and, at startup, download them to local storage. This saves you from bundling everything within the deployment, which gives you a few advantages (being able to update ruby / apache without redeploy, reduced package size, etc.).
There's a sample app on CodePlex that demonstrates the basics of setting up Tomcat via a startup script. For one more example, you can look at the scripts installed via the Eclipse Windows Azure plugin for Java. These scripts are quite similar. The key is to have some way of downloading files from blob storage and then unzipping them. The CodePlex project I referred to points to a sample app that does simple blob downloading. The Eclipse packaging provides similar functionality in a .vbs app. Here's a snippet of one of my scripts from an Eclipse-based project:
SET SERVER_DIR_NAME=apache-tomcat-7.0.25
SET WAR_NAME=myapp.war
rd "\%ROLENAME%"
mklink /D "\%ROLENAME%" "%ROLEROOT%\approot"
cd /d "\%ROLENAME%"
cscript /NoLogo util\unzip.vbs jre7.zip "%CD%"
cscript /NoLogo util\unzip.vbs tomcat7.zip "%CD%"
copy %WAR_NAME% "%SERVER_DIR_NAME%\webapps\%WAR_NAME%"
cd "%SERVER_DIR_NAME%\bin"
set JAVA_HOME=\%ROLENAME%\jre7
set PATH=%PATH%;%JAVA_HOME%\bin
cmd /c startup.bat
The codeplex project has a similar-looking script.
Don't forget: you'll need to set up an Input Endpoint for your role (part of the role properties).
To get blobs into blob storage, there are both free tools (like Clumsy Leaf CloudXplorer) and paid tools (such as Cerebrata's Cloud Storage Studio).
To download blobs to local storage, you can either write a few lines of .NET code (from OnStart) or just use the utility pointed to in the CodePlex project.
