Connect Squirrel to Azure HBase cluster

My objective is to access my HBase cluster on Azure from Squirrel, using a Phoenix driver running on my local computer.
My HBase cluster on Azure is operational: I can see it in the Ambari dashboard and I can access it using SSH. I can start Phoenix with the sqlline.py command pointing to one of the zookeeper nodes, and the !tables command returns four lines.
My HBase cluster is included in an Azure VNet. From my local computer (running Windows 10) I can connect to this VNet. I can ping the IP address (10.254.x.x) of the zookeeper node successfully, but pinging the FQDN of the zookeeper node results in an error message:
"Ping request could not find host zk1-.......ax.internal.cloudapp.net.
Please check the name and try again."
When I start Squirrel on my local computer with the URL pointing to the FQDN of the zookeeper node I get an error message:
"Unexpected Error occurred attempting to open an SQL connection". The
stack trace points to a java.util.concurrent.RuntimeException: "Unable
to establish connection"
When I start Squirrel on my local computer with the URL pointing to the IP address of the zookeeper node I get a different error:
"Unexpected Error occurred attempting to open an SQL connection". The
stack trace points to a java.util.concurrent.TimeoutException.
I suspect this has something to do with the domain name resolution problem described here: [https://superuser.com/questions/966832/windows-10-dns-resolution-via-vpn-connection-not-working]. I applied the resolution described by LikeARock47 on Feb 23; however, this did not improve the situation.
Does this indeed have to do with the Domain Name resolution issue or is the problem somewhere else?
Is there a better solution to the Domain Name resolution issue?

A JDBC connection from Squirrel on my local Windows 10 computer has successfully been established to the HBase cluster by using the zookeeper IP address, the port, and "/hbase-unsecure":
jdbc:phoenix:10.254.x.x:2181:/hbase-unsecure
I can manage my HBase cluster with a local Squirrel now!
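For reference, a minimal Java sketch of opening the same connection programmatically through the Phoenix thick driver (assuming the Phoenix client jar is on the classpath; the IP and znode parent are the placeholders from the URL above):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PhoenixSmokeTest {
    public static void main(String[] args) throws Exception {
        // Same URL that worked in Squirrel: ZooKeeper IP, client port,
        // and the unsecured HBase znode parent (10.254.x.x is a placeholder).
        String url = "jdbc:phoenix:10.254.x.x:2181:/hbase-unsecure";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT TABLE_NAME FROM SYSTEM.CATALOG LIMIT 10")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}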
I'd still be interested to find out how I can get the zookeeper FQDN resolved locally...

Related

MarkLogic - XDMP-HOSTOFFLINE: Host is offline or not responding

MarkLogic 9.0.9
Deployed in Azure with Managed Disk
While setting up a new MarkLogic cluster, we are facing an issue on 2 server nodes, as shown below:
This host is down. The following error occurred while trying to contact it:
XDMP-HOSTOFFLINE: Host is offline or not responding
Host <HostName>
Online Disconnected
While looking at the error log, I found this line:
2020-05-06 05:22:28.832 Warning: A valid hostname is required for proper functioning of MarkLogic Server: SVC-SOCHN: Socket hostname error: getaddrinfo .reddog.microsoft.com: Name or service not known (where as it should connect to )
I found a knowledge base article published in April 2020:
https://help.marklogic.com/Knowledgebase/Article/View/svc-sochn-warning-during-start-up-on-aws
Based on this article, I do not find the file under the /etc/ or /var/local folders mentioned in the article.
I am not sure whether it is because of this, but I am not able to open the MarkLogic Admin Interface (port 8001).
It seems that this name is stored somewhere in the MarkLogic configuration, but where exactly is the question.
Please find below the Hosts screen from within the MarkLogic interface. In this case, the disconnected status is for hosts 01 and 03,
whereas I can access the Admin Interface of 01, so I am wondering why.
After discussing the same issue with the infra team, they found an issue with DNS resolution: the full DNS name was not set as the hostname within MarkLogic.
I.e., the hostname was set to ml-01 instead of ml-01.abc.com, and since MarkLogic was running in Azure, it appended the suffix automatically, producing ml-01.reddog.microsoft.com.
So outside MarkLogic we were able to ping the server with the full name.
After the change in DNS resolution, I was able to add the MarkLogic server nodes to the cluster.
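As a quick way to see which name the box itself resolves to (and whether the domain suffix is missing, as it was here), a small Java check along these lines can help; ml-01.abc.com is just the hypothetical FQDN from the example above:

import java.net.InetAddress;

public class HostnameCheck {
    public static void main(String[] args) throws Exception {
        InetAddress local = InetAddress.getLocalHost();
        // Short name as configured on the box (e.g. "ml-01") versus the
        // fully qualified name the resolver returns for it.
        System.out.println("hostname          : " + local.getHostName());
        System.out.println("canonical hostname: " + local.getCanonicalHostName());
        System.out.println("address           : " + local.getHostAddress());

        // Forward lookup of the FQDN MarkLogic should be using
        // (ml-01.abc.com is a placeholder).
        InetAddress byFqdn = InetAddress.getByName("ml-01.abc.com");
        System.out.println("ml-01.abc.com -> " + byFqdn.getHostAddress());
    }
}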

Apache Spark Error creating pool to EC2 Cassandra

My configuration is:
1 Spark machine on EC2: c3.2xlarge.
Communicating with 4 nodes of Cassandra on EC2.
I am getting the following error:
16/08/03 22:41:10 ERROR Session: Error creating pool to /XX.XX.XXX.XX:9042
com.datastax.driver.core.TransportException: [/XX.XX.XXX.XX:9042] Cannot connect
The XXs are the public IP of the EC2 Cassandra node.
However, in my Spark configuration I am telling Spark to use the internal IP of the seed node; the spark-connector driver then receives the public IPs back from Cassandra.
Given how my IT team set the clusters up, I am assuming the following:
ERROR Session: Error creating pool to /127.0.0.1:9042
However, I don't want my clusters to connect via the public IPs and have to open up the firewall; I would like them to stay on the internal IPs of the cluster.
Is there a way to do this at the Spark code level or in the cassandra.yaml configuration?
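For context, and only as a sketch of the setup described above (not a confirmed fix), this is roughly how the connection host is set with the DataStax Spark Cassandra Connector in Java; 10.0.0.12 stands in for the internal seed IP, and the addresses the driver uses afterwards come from what the Cassandra nodes themselves advertise (e.g. broadcast_rpc_address / rpc_address in cassandra.yaml):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class CassandraConnectionSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("cassandra-pool-example")
                // Internal IP of the Cassandra seed node (placeholder).
                // The connector contacts this node first and then discovers
                // the rest of the ring from the addresses Cassandra advertises.
                .set("spark.cassandra.connection.host", "10.0.0.12")
                .set("spark.cassandra.connection.port", "9042");

        // Master URL is expected to be supplied via spark-submit.
        JavaSparkContext sc = new JavaSparkContext(conf);
        // ... build RDDs against Cassandra here ...
        sc.stop();
    }
}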

MapR control management gives connection refused

I have installed a cluster of 3 nodes on Amazon EC2. I stopped all instances; however, after restarting all instances, accessing the control console on port 9443 gives me a connection refused error.
Do I need to restart the MapR services, and how?
Thanks
Did you check the status of the webserver? It provides access to the MCS (MapR Control System).

Can my chef-server and workstation be on different clouds?

Say I have a scenario where my workstation is on my local network and my Chef server is in AWS. In knife.rb, I gave the AWS public IP in the Chef server URL. Will this work or not for open-source Chef?
I tried doing that and I am getting the following error:
ERROR: Network Error: Error connecting to https://xx.xx.xx.xx/cookbooks?num_versions=all - Connection timed out - connect(2)
Check your knife configuration and network settings
Can someone help me out with this?
Sure; as long as your workstation (usually your PC/Mac) has IP connectivity to the Chef server, that's how it works. Given your output, it looks like access to port 443 is not allowed (or you entered the wrong IP for your Chef server).
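A quick way to confirm the connectivity part of that diagnosis, independent of Chef, is a plain TCP check against port 443 of the server; xx.xx.xx.xx below is the placeholder public IP from the error message:

import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    public static void main(String[] args) {
        String host = "xx.xx.xx.xx";  // public IP of the Chef server (placeholder)
        int port = 443;
        try (Socket socket = new Socket()) {
            // 5-second timeout; a timeout here matches the knife error above.
            socket.connect(new InetSocketAddress(host, port), 5000);
            System.out.println("TCP connection to " + host + ":" + port + " succeeded");
        } catch (Exception e) {
            System.out.println("Cannot reach " + host + ":" + port + " - " + e);
        }
    }
}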

UnknownHostException on tasktracker in Hadoop cluster

I have set up a pseudo-distributed Hadoop cluster (with jobtracker, a tasktracker, and namenode all on the same box) per tutorial instructions and it's working fine. I am now trying to add in a second node to this cluster as another tasktracker.
When I examine the logs on Node 2, all the logs look fine except for the tasktracker. I'm getting an infinite loop of the error message listed below. It seems that the Task Tracker is trying to use the hostname SSP-SANDBOX-1.mysite.com rather than the ip address. This hostname is not in /etc/hosts so I'm guessing this is where the problem is coming from. I do not have root access in order to add this to /etc/hosts.
Is there any property or configuration I can change so that it will stop trying to connect using the hostname?
Thanks very much,
2011-01-18 17:43:22,896 ERROR org.apache.hadoop.mapred.TaskTracker:
Caught exception: java.net.UnknownHostException: unknown host: SSP-SANDBOX-1.mysite.com
at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:195)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:850)
at org.apache.hadoop.ipc.Client.call(Client.java:720)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
at $Proxy5.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:207)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:170)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1033)
at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1720)
at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2833)
This blog posting might be helpful:
http://western-skies.blogspot.com/2010/11/fix-for-exceeded-maxfaileduniquefetches.html
The short answer is that Hadoop performs reverse hostname lookups even if you specify IP addresses in your configuration files. In your environment, in order for you to make Hadoop work, SSP-SANDBOX-1.mysite.com must resolve to the IP address of that machine, and the reverse lookup for that IP address must resolve to SSP-SANDBOX-1.mysite.com.
So you'll need to talk to whoever is administering those machines to either fudge the hosts file or to provide a DNS server that will do the right thing.
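A small sketch of the kind of check that confirms both directions from the affected node (SSP-SANDBOX-1.mysite.com is the hostname from the stack trace above):

import java.net.InetAddress;

public class DnsRoundTripCheck {
    public static void main(String[] args) throws Exception {
        String host = "SSP-SANDBOX-1.mysite.com";

        // Forward lookup: hostname -> IP. This is what currently fails
        // with UnknownHostException in the TaskTracker log.
        InetAddress addr = InetAddress.getByName(host);
        System.out.println(host + " -> " + addr.getHostAddress());

        // Reverse lookup: IP -> hostname. Hadoop expects this to come back
        // to the same fully qualified name.
        String reverse = InetAddress.getByAddress(addr.getAddress()).getCanonicalHostName();
        System.out.println(addr.getHostAddress() + " -> " + reverse);
    }
}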
