Hadoop on Azure Create New Cluster - azure

I am able to create a new Hadoop cluster through the interface, but need to create a new cluster on request. Does anyone know if an API exists to create a new cluster?

Not yet. As of now (Preview), you must use the Windows Azure Management Portal to create a Hadoop cluster in your Windows Azure subscription.
Since most Windows Azure management functionality is available in PowerShell, such a feature could be built into PowerShell on top of the REST API as described here; however, I don't know of any immediate plans.

Yes, you can.
The Azure CLI allows you to control HDInsight clusters from, for example, a batch file. It provides a set of HDInsight control commands. Type
azure hdinsight
to see all the built-in help. It covers the basics (listing, creating, and configuring clusters) and the multiple storage account functionality.
I believe this is based on the Node.js SDK. To get going, install Node.js and run
npm install azure-cli
This should give you what you need to manage the clusters from the command line.
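To create clusters "on request" from another program, the CLI can be driven from a script. The following is a minimal Python sketch, not an official API: the helper names `build_hdinsight_command` and `run_hdinsight` are made up for illustration, and the only CLI text taken from the answer is the `azure hdinsight` prefix. The subcommands and flags for creating a cluster are deliberately not hard-coded here; look them up in the built-in help first.

```python
import shutil
import subprocess
from typing import List, Sequence


def build_hdinsight_command(subcommand: Sequence[str],
                            executable: str = "azure") -> List[str]:
    """Return the argument list for `azure hdinsight <subcommand...>`.

    `subcommand` is passed through untouched; this sketch does not assume
    any particular subcommand or flag names.
    """
    return [executable, "hdinsight", *subcommand]


def run_hdinsight(subcommand: Sequence[str]) -> str:
    """Run the CLI and return whatever it printed to standard output."""
    # Resolve the executable first so a missing install fails with a clear message.
    executable = shutil.which("azure")
    if executable is None:
        raise RuntimeError(
            "azure CLI not found on PATH; install it with `npm install azure-cli`"
        )
    result = subprocess.run(
        build_hdinsight_command(subcommand, executable),
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling `run_hdinsight([])` corresponds to typing `azure hdinsight` by hand, so it should return the same help text. Keeping the command builder separate from the runner makes it easy to log or inspect the exact command before anything is executed.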

By "create a new Hadoop cluster", I assume you mean an HDInsight Hadoop cluster.
If so, you can write a PowerShell (.ps1) script to do the job.
Here is a sample script that might be useful:
http://mydailyfindingsit.blogspot.in/2016/01/create-script-hdinsight-cluster.html

Related

Using Kubernetes as docker containers orchestration in PCF

I have a requirement to use Docker containers in PCF deployed in Azure.
We now want to use Kubernetes for container orchestration.
Can Kubernetes be used here?
Or will PCF take care of the container orchestration?
Which would be the better approach here?
PKS is Pivotal's answer to running Kubernetes in PCF (regardless of IaaS).
Pivotal Cloud Foundry (PCF) is Pivotal's sophisticated answer to current cloud expectations. PCF offers a good platform for running Microsoft-based technology such as .NET, and it smoothly supports enterprise Java applications. You can run Kubernetes there with fine results, but for comfortable orchestration and management of containers I suggest reading about GKE or setting up your own Kubernetes cluster using the kubespray utility.

Pre-define custom scripts to run on Azure Container Service cluster VMs

I would like to provide some pre-defined scripts that need to run on the VMs of an Azure Container Service Swarm cluster. Is there a way to hand these scripts to ACS so that, when the cluster is deployed, they are automatically executed on all the VMs of that ACS cluster?
This is not a supported feature of ACS, though you can run arbitrary scripts on the VMs once they are created. Another option is to use ACS Engine, which allows much more control over the configuration of the cluster because it outputs templates that you can configure. See http://github.com/Azure/acs-engine.

Advantage of using Windows Azure for MapReduce

I'm trying to develop a MapReduce application using Hadoop that can run on top of Windows Azure,
i.e., deployed on Windows Azure clusters.
I would like to know the advantages of choosing Windows Azure over other cloud services such as Amazon EC2, Google, and so on.
Any help would be appreciated.
Azure has one-click cluster deployments, partnerships with Hadoop solution providers to optimize solutions on Azure, and VPN capabilities for seamless hybrid solutions, along with a plethora of other things.
You also get the ability to automate using the REST or PowerShell APIs, plus built-in automation and monitoring that can be predictive as well as reactive in order to autoscale on demand. The separation of compute from storage enables you to stand up and tear down clusters on demand, so you pay for them only when you need them.
Beyond this, billing by the minute is also ideal for anybody reselling compute time on clusters running on Azure.

HDInsight Server Locally multi-node | NOT on Azure

I am confused.
Can you install HDInsight Server (multiple nodes) locally, NOT on Azure?
I don't want to install Hortonworks (I know that HDInsight uses their distribution).
Also, I don't want to install any kind of emulator or developer preview with one node.
Again, I would like to install HDInsight on my Windows servers in PRODUCTION. How can I do that?
Chicago, I don't think this is possible. HDInsight is a service offered as part of the public Azure offering. If you want to install multiple nodes locally you will need to go with Hortonworks or the Apache Hadoop distribution.

Create sample Azure Hadoop job via Web UI or cross-platform CLI?

I'm trying to play with Hadoop on Azure using HDInsight, but I am a bit confused about how to run a Hadoop job on my newly created cluster. So far I've created an HDInsight cluster and attached a Storage Account to it. I also have the azure-cli installed on my local OS X box.
There's an Azure tutorial on launching Hadoop jobs, but it uses PowerShell, which I don't think is available via the Azure cross-platform CLI.
Aside from starting up a Windows VirtualBox, can a job be created via the Azure Web UI (e.g. like Amazon EMR provides) or via some other command-line tool that is compatible with OS X?
Thanks
Log in to the head node of the cluster using Remote Desktop Protocol and use the CLI tools there.
Or submit jobs using the WebHCat (Templeton) REST APIs (e.g. using curl).
HDInsight currently has XPlat CLI tools for cluster provisioning. Job submission XPlat CLI tools will be available later.
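The REST route mentioned above (WebHCat, also known as Templeton) works from OS X because it only needs HTTPS. Below is a rough Python sketch of what a curl-style submission looks like. It assumes the cluster exposes WebHCat at `https://<cluster>.azurehdinsight.net/templeton/v1/mapreduce/jar` with HTTP basic authentication; the function names and all parameter values are placeholders for illustration, so check the URL and credentials against your own cluster before relying on it.

```python
import base64
import urllib.parse
import urllib.request
from typing import Sequence


def build_mapreduce_request(cluster: str, user: str, password: str,
                            jar: str, main_class: str,
                            args: Sequence[str]) -> urllib.request.Request:
    """Build (but do not send) a WebHCat request that submits a MapReduce jar."""
    # Assumed gateway URL pattern; adjust if your cluster is exposed differently.
    url = f"https://{cluster}.azurehdinsight.net/templeton/v1/mapreduce/jar"

    # WebHCat takes form fields; `arg` is repeated once per job argument.
    fields = [("user.name", user), ("jar", jar), ("class", main_class)]
    fields += [("arg", value) for value in args]
    body = urllib.parse.urlencode(fields).encode("ascii")

    # HTTP basic authentication header, the equivalent of `curl -u user:password`.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Authorization": f"Basic {token}"},
        method="POST",
    )


def submit(request: urllib.request.Request) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Building the request separately from sending it lets you print `request.full_url` and `request.data` to verify what will be posted before touching a live cluster.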
