Opening a port on an HDInsight cluster on Azure

I have a Microsoft Azure HDInsight cluster.
I RDP into a node and start an application that binds to port 8080. I would like to be able to connect to this application from outside the cluster.
I have my cluster connection string (https://xxxxx.azurehdinsight.net), but when I try to connect to it, the connection times out.
I believe this is because I have not opened port 8080 to the public. How can I do this? Under the cluster settings I only see Hadoop Services and username....

At this point in time, we don't allow you to control or open additional network ports on an HDInsight cluster.
You can deploy an HDInsight cluster into an Azure Virtual Network if you'd like another machine in Azure to have access to all of the ports/nodes on the cluster. We've documented how to deploy into a vnet in this article.
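If you go the vnet route, a minimal Azure CLI sketch might look like the following (all resource names here are placeholders, not values from this thread; see the linked article for the authoritative steps):
# Create a vnet, then deploy the HDInsight cluster into it (placeholder names).
az network vnet create --resource-group my-rg --name my-vnet \
    --address-prefix 10.0.0.0/16 --subnet-name my-subnet --subnet-prefixes 10.0.0.0/24
az hdinsight create --resource-group my-rg --name my-cluster --type hadoop \
    --http-user admin --http-password '<password>' \
    --ssh-user sshuser --ssh-password '<password>' \
    --storage-account mystorageacct \
    --vnet-name my-vnet --subnet my-subnet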

Related

Connect to Azure HDInsight Kafka cluster from public network / local machine

Is there a way to connect to an Azure HDInsight Kafka cluster from the public network without using a VPN?
Or is it only possible to connect to an Azure Kafka cluster from within the Azure network?
Thanks
You don't need a VPN. Just open the firewall ports from Azure.
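If by firewall ports this means a network security group (NSG) rule on the cluster's subnet, a hedged Azure CLI sketch follows (NSG name, port, and source address are all assumptions; note that HDInsight Kafka brokers advertise private IPs, so an NSG rule alone may not be sufficient):
# Allow inbound traffic to Kafka's default broker port on the subnet's NSG.
az network nsg rule create --resource-group my-rg --nsg-name my-nsg \
    --name allow-kafka --priority 200 --direction Inbound --access Allow \
    --protocol Tcp --destination-port-ranges 9092 \
    --source-address-prefixes <your-public-ip>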

Failing to create Azure Databricks cluster because of unreachable instances

I'm trying to create a cluster in Azure Databricks and am getting the following error message:
Resources were not reachable via SSH. If the problem persists, this usually indicates a network environment misconfiguration. Please check your cloud provider configuration, and make sure that Databricks control plane can reach Spark clusters instances.
I have the default configuration:
Cluster mode: Standard
Pool: None
Runtime version: 5.5 LTS
Autoscaling enabled
Worker Type: Standard_DS3_v2
Driver Type: Standard_DS3_v2
From Log Analytics I can see that Azure tried to create the virtual machines and then, for no apparent reason (I suppose because they were unreachable), deleted all of them.
Has anyone faced this issue?
If the issue is temporary, it may be caused by the driver virtual machine going down or by a networking issue, since Azure Databricks was able to launch the cluster but lost the connection to the instance hosting the Spark driver (see this reference). You could try removing the cluster and creating it again.
If the problem persists, this can happen when you have an Azure Databricks workspace deployed to your own VNet. If the virtual network where the workspace is deployed is already peered or has an ExpressRoute connection to on-premises resources, the virtual network cannot make an SSH connection to the cluster nodes when Azure Databricks attempts to create a cluster. You can add a user-defined route (UDR) to give the Azure Databricks control plane SSH access to the cluster instances.
For detailed UDR instructions, see Step 3: Create user-defined routes and associate them with your Azure Databricks virtual network subnets. For more VNet-related troubleshooting information, see Troubleshooting.
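As a rough sketch, the UDR can be created with the Azure CLI (resource names are placeholders; the actual control-plane addresses for your region are listed in the linked article):
# Create a route table, add a route to the Databricks control plane,
# and associate the table with the workspace subnet (placeholder names).
az network route-table create --resource-group my-rg --name databricks-udr
az network route-table route create --resource-group my-rg \
    --route-table-name databricks-udr --name to-control-plane \
    --address-prefix <control-plane-ip>/32 --next-hop-type Internet
az network vnet subnet update --resource-group my-rg --vnet-name my-vnet \
    --name public-subnet --route-table databricks-udr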
Hope this helps.
Issue: Instances Unreachable: Resources were not reachable via SSH.
Possible cause: traffic from the control plane to the workers is blocked. If you are deploying to an existing virtual network connected to your on-premises network, review your setup using the information supplied in Connect your Azure Databricks Workspace to your On-Premises Network.
Reference: Azure Databricks - Troubleshooting
Hope this helps.

Network setup for accessing Azure Redis service from Azure AKS

We have an application that runs on an Ubuntu VM. This application connects to the Azure Redis, Azure Postgres, and Azure Cosmos DB (MongoDB) services.
I am currently working on moving this application to Azure AKS and intend to access all of the above services from the cluster. The services will continue to be external and will not reside inside the cluster.
I am trying to understand how the network/firewall of both the services and AKS should be configured so that pods inside the cluster can access the above services, or any Azure service in general.
I tried the following:
Created a ConfigMap containing the connection params (public IP/address, username/password, port, etc.) of all the services and used this ConfigMap in the Deployment resource.
Hardcoded the connection params of all the services as env vars inside the container image.
In the firewall/inbound rules of the services, added the AKS API server IP and the individual node IPs.
None of the above worked. Did I miss anything? What else should be configured?
I tested the setup locally on minikube, with all the services running on my local machine, and it worked fine.
I am currently working on moving this application to Azure AKS and intend to access all the above services from the cluster.
I assume that you would like all the services to be able to access each other, and that all the services are in the AKS cluster? If so, I advise you to configure an internal load balancer in the AKS cluster.
Internal load balancing makes a Kubernetes service accessible to applications running in the same virtual network as the Kubernetes cluster.
You can try following this document: Use an internal load balancer with Azure Kubernetes Service (AKS). Good luck!
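As a sketch (the service name is hypothetical, and depending on your AKS version the annotation may need to be set at creation time rather than added afterwards):
# Expose a deployment, then mark the resulting service as internal via the
# AKS annotation so it gets a private IP from the vnet instead of a public one.
kubectl expose deployment my-app --port=80 --type=LoadBalancer
kubectl annotate service my-app \
    service.beta.kubernetes.io/azure-load-balancer-internal=true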
Outbound traffic in Azure is SNAT-translated, as stated in this article. If you already have a service in your AKS cluster, the outbound connections from all pods in your cluster will come through the first LoadBalancer-type service IP; I strongly suggest you create one for the sole purpose of having a consistent outbound IP. You can also pre-create a public IP and use it, as stated in this article, via the loadBalancerIP spec.
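A sketch of the pre-created IP approach (resource group and names are placeholders; the IP must live in the cluster's node resource group, usually named MC_*, or the cluster identity needs permissions on it):
# Create a static public IP for the cluster's load balancer to use,
# then reference its address from the service's loadBalancerIP field.
az network public-ip create --resource-group MC_my-rg_my-aks_eastus \
    --name my-egress-ip --sku Standard --allocation-method Static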
On a side note: rather than a ConfigMap, given the sensitivity of the connection strings, I'd suggest you create a Secret and pass that down to your Deployment to be mounted or exported as environment variables.
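A minimal sketch of the Secret approach (all names and keys here are hypothetical):
# Store the connection parameters in a Secret instead of a ConfigMap.
kubectl create secret generic app-connections \
    --from-literal=REDIS_HOST=<redis-host> \
    --from-literal=REDIS_PASSWORD=<redis-password>
# Inject the Secret's keys into the Deployment as environment variables.
kubectl set env deployment/my-app --from=secret/app-connections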

Accessing Spark in Azure HDInsights via JDBC

I'm able to connect to Hive externally using the following URL for an HDInsight cluster in Azure.
jdbc:hive2://<host>:443/default;transportMode=http;ssl=true;httpPath=/
However, I'm not able to find such a string for Spark. The documentation says the port is 10002, but it's not open externally. How do I connect to the cluster to run SparkSQL queries through JDBC?
There isn't one available, but you can vote for the feature at https://feedback.azure.com/forums/217335-hdinsight/suggestions/14794632-create-a-jdbc-driver-for-spark-on-hdinsight.
HDInsight is deployed with a gateway. This is why HDInsight clusters, out of the box, only enable HTTPS (port 443) and SSH (ports 22, 23) communication to the cluster. If you don't deploy the cluster in a virtual network (vnet), there is no other way to communicate with HDInsight clusters. So port 443 is used instead of port 10002 if you want to access the Spark Thrift Server. If you deploy the cluster in a vnet, you could also access the Thrift Server via the IP address it is running on (one of the headnodes) and the standard port 10002. See also public and non-public ports in the documentation.
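For example, from a machine inside the same vnet, a Beeline connection might look like this (<headnode-ip> is a placeholder, and the exact transport settings can differ between cluster versions):
# Connect Beeline to the Spark Thrift Server on a headnode from inside the vnet.
beeline -u "jdbc:hive2://<headnode-ip>:10002/default" -n admin -p '<password>'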

Unable to access DB pod External IP from application

I've created two pods on top of an Azure Kubernetes cluster:
1) Application
2) MS SQL Server
Both pods are exposed via the Azure load balancer and both have external IPs. I am unable to use the external IP in my application config file, yet I can connect to that SQL Server from anywhere else. For some reason I am unable to telnet to the DB's external IP from the application container;
the connection times out. However, I can ping/telnet the DB's cluster IP, so I tried using the DB cluster IP in my config file to check whether the connection succeeds, but no luck.
Could someone help me with this ?
As Suresh said, we should not use the public IP address to connect them.
We can refer to this article to create an application and a database, then connect the front end to the back end using a service.
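A sketch of the service-based approach (names are hypothetical): expose the DB pod through a ClusterIP service and point the application at the service's DNS name rather than at any IP:
# Expose SQL Server internally; other pods resolve it by service name.
kubectl expose deployment mssql --name=mssql-svc --port=1433 --type=ClusterIP
# The application config then uses the in-cluster DNS name instead of an IP:
#   Server=mssql-svc.default.svc.cluster.local,1433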
This issue was fixed in another way, but running the application and the DB as separate services is still a nightmare in Azure Container Service (Kubernetes).
1) I combined the app and the DB in the same container and set the DB connection string to "localhost" or "localhost,1433" in my application config file.
2) Created a Docker image with the above setup.
3) Created the pod.
4) Exposed the pod with two listening ports: kubectl expose pods xxx --port=80,1433 --type=LoadBalancer
5) I can access the DB on port 1433.
With the above setup, we plan to keep the container in an auto-scaled environment with persistent volume storage.
We are also planning scheduled backups of the container, so we do not want to lose the DB data.
Does anybody have other thoughts? What major factors do we need to consider in the above setup?
This issue was fixed!
I created two pods for the application and the DB. Earlier, when I provided the DB cluster IP in the application config file, it never worked, even though I was able to telnet to 1433.
I then created another K8s cluster in Azure and tried the same setup (providing the cluster IP). This time it worked like a charm.
Thanks to @Suresh Vishnoi.
