MarkLogic - XDMP-HOSTOFFLINE: Host is offline or not responding - azure

MarkLogic 9.0.9
Deployed in Azure with Managed Disk
While setting up a new MarkLogic cluster, we are facing the following issue on 2 server nodes:
This host is down. The following error occurred while trying to contact it:
XDMP-HOSTOFFLINE: Host is offline or not responding
Host <HostName>
Online Disconnected
While looking at the error log, I found this line:
2020-05-06 05:22:28.832 Warning: A valid hostname is required for proper functioning of MarkLogic Server: SVC-SOCHN: Socket hostname error: getaddrinfo .reddog.microsoft.com: Name or service not known (where as it should connect to )
I found a knowledge base article, published in April 2020:
https://help.marklogic.com/Knowledgebase/Article/View/svc-sochn-warning-during-start-up-on-aws
I could not find any of the files under /etc/ or /var/local that the article mentions.
I am not sure whether this is the reason I cannot open the MarkLogic Admin Interface (port 8001).
It seems that this name is stored somewhere in the MarkLogic configuration, but I cannot tell where.
The screenshot below is from the host page in the MarkLogic interface. In this case, nodes 01 and 03 show the disconnected status.
However, I can access the Admin Interface of 01, which puzzles me.

After I discussed the issue with our infra team, they found a DNS resolution problem: the hostname configured in MarkLogic was not the fully qualified domain name.
That is, the hostname was set to ml-01 instead of ml-01.abc.com, and because MarkLogic was running in Azure, the name was automatically expanded to ml-01.reddog.microsoft.com.
Outside MarkLogic, we were able to ping the server using its full name.
After the DNS resolution was fixed, I was able to add the ML server nodes to the cluster.
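A quick way to compare how the short and the fully qualified names resolve on a node is to call getaddrinfo directly, the same lookup the warning in the error log refers to. The following is a minimal sketch, not part of MarkLogic; the names ml-01 and ml-01.abc.com are the example names from above, and the helper name resolve is made up for illustration.

```python
import socket


def resolve(hostname):
    """Return the sorted IP addresses a name resolves to, or [] if the lookup fails."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except (socket.gaierror, UnicodeError):
        # gaierror: the resolver could not find the name.
        # UnicodeError: the name is malformed (for example, an empty label).
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Example names from the post above; replace them with your own hosts.
    for name in ("ml-01", "ml-01.abc.com"):
        addresses = resolve(name)
        print(name, "->", addresses if addresses else "does not resolve")
```

If the short name fails or resolves through an unexpected domain suffix while the full name works, the hostname configured for the node is the likely culprit.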

Related

How do I get the exact IP address of my mongodb database?

I am trying to deploy my node app to my Cpanel however the page times out with an error
503 Service unavailable
The website works on Heroku, ngrok and localhost however on my hosting service, it doesn't.
I found out that the issue was due to port 27017 not being open.
On discussing with my hosting providers, they said
"We can open the ports for you but our policy is to open non-standard ports to specific IP's for better security of the server. Is it possible to get the exact IP addresses of the database server you are trying to connect to."
So I'm not familiar with mongodb database having a specific IP address. What could they mean?
To connect to your db, your node app needs a URL something like this.
const url = 'mongodb://hostnameOfMongo.example.com:27017'
Your database's hostname is the stuff after mongodb:// and before :27017.
Open up a shell (a command window) and type, on Windows,
ping -n 1 hostnameOfMongo.example.com
or, on Linux or macOS,
ping -c 1 hostnameOfMongo.example.com
It should show you the IP address associated with your mongo server.
(Obvs, put your actual db hostname into the command, not my example.)
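If ping is not available, the same lookup can be scripted. This is a rough sketch that pulls the hostname out of a connection URL like the one above and asks the system resolver for its IPv4 addresses. The function names are made up for illustration, and it assumes a plain single-host mongodb:// URL (not mongodb+srv:// or a comma-separated replica set list).

```python
import socket
from urllib.parse import urlsplit


def mongo_host(url):
    """Return (hostname, port) from a single-host mongodb:// URL."""
    parts = urlsplit(url)
    # 27017 is MongoDB's default port when the URL does not name one.
    return parts.hostname, parts.port or 27017


def ipv4_addresses(hostname):
    """Return the sorted IPv4 addresses the system resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Example URL from the answer; put your actual connection string here.
    host, port = mongo_host("mongodb://hostnameOfMongo.example.com:27017")
    print(host, port, ipv4_addresses(host))
```

Note that a hostname can resolve to several addresses, and a hosted database may change them over time, so it is worth asking the database provider whether the addresses are stable before handing them to the hosting company.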
It's a little strange that your hosting provider didn't ask for the hostname when you didn't know the IP address. If they were my hosting provider, my confidence in the competence of their support would go down a notch because of that.
And please be aware that running a db in one data center and a node app (or indeed any app that uses the db) in another data center is a formula for poor performance and unreliability. The app and the db work best with a short and private network connecting them. With respect, it doesn't seem likely you have the network engineering chops to make that sort of thing stable and reliable.
Not to mention the security problems with exposing mongodb to the public network. Your hosting service is reluctant to open a port for a very good reason. Read this. Because cybercreeps

Not able to fetch data or queue data from FTP server in Azure cloud nifi server?

My FTP server works locally: a local NiFi instance can fetch data from it. On a NiFi server in the Azure cloud, however, the same FTP server does not return a single record. I am using the ListFTP processor together with FetchFTP, with the same configuration for both processors that I used locally.
Can someone please suggest what is happening here? I checked the firewall and even disabled it. The FTP server is running in active connection mode. I have tried, but I am not able to figure out the exact reason.
I am attaching screenshots of my FTP processor configuration. One very important detail: with GetFTP, nothing is fetched even after running for hours and hours, and no exception or error is reported. With ListFTP and FetchFTP, an exception appears after about 15 minutes: "Failed to perform listing on remote host due to java.net.SocketException"
I suggest going through your conf/nifi.properties file and checking whether the keystore certificate is enabled; if it is disabled, enable it.
Here you can check the NiFi configuration documentation.

Unable to Add Azure DB Firewall Rule to Allow Build Server to Run Tests

We use a Visual Studio Online-hosted build server to automate our build process. As part of this I'm looking into adding unit and integration tests into this process.
These tests require access to our SQL Azure DBs (2 of them, both on the same server), which in turn requires access through the DB server's firewall.
I have a PowerShell script which uses New-AzureRmSqlServerFirewallRule to add IP addresses to the DB server, and these firewall rules are successfully showing up in the Azure portal.
Specifically, the script adds firewall rules for:
All IPv4 addresses* on the build server (as returned by Get-NetIPAddress)
Build server's external IP address (as returned by https://api.ipify.org)
In conjunction, it appears that the pre-defined AllowAllAzureIPs and AllowAllWindowsAzureIps rules are automatically added.
However, the tests subsequently fail with the exception:
System.Data.SqlClient.SqlException:
System.Data.SqlClient.SqlException: A network-related or
instance-specific error occurred while establishing a connection to
SQL Server. The server was not found or was not accessible. Verify
that the instance name is correct and that SQL Server is configured to
allow remote connections. (provider: Named Pipes Provider, error: 40 -
Could not open a connection to SQL Server)
I'm unsure why the build server is unable to reach the DB server - could it be that the host of the test processes is using yet a different IP address?
Update
As has been pointed out, the exception message mentions "Named Pipes Provider" which suggests that the DB connection is using a named pipe instead of an IP/TCP connection. To test this I changed the local app.config to contain an unknown/random/inaccessible IP and ran the tests locally (they otherwise run successfully locally): I received exactly the same exception message mentioning "Named Pipes Provider". Perhaps at some level the ReliableSqlConnection class resolves to a named pipe but my point is that I can induce this very same exception by changing to an unknown or inaccessible IP address in my DB connection string.
Furthermore, the DB connection string starts with tcp: which, as per this blog post, explicitly tells the connection to use TCP/IP and not named pipes.
I have also modified the firewall rule to permit all IP addresses (0.0.0.0 to 255.255.255.255) but the same exception is still thrown. This suggests that the SQL Azure firewall rule is not the cause of the 'blockage'.
My suspicion therefore turns to network access being blocked (though a whitelist is probably present to permit the build server to reach the code repository). I added a very simple PowerShell script to the start of the build process:
Test-Connection "172.217.18.100" #resolves to www.google.com
This results in
Testing connection to computer '172.217.18.100' failed: Error due to lack of resources
Have the build servers disabled ping/ICMP or is all outgoing traffic blocked?
* The script only considers IPv4 addresses because I haven't had any success in passing IPv6 addresses to New-AzureRmSqlServerFirewallRule.
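A failed ping does not by itself show that outgoing traffic is blocked, because ICMP can be disabled while TCP is allowed. A more direct check is to open a TCP connection to the port the tests actually need. The sketch below is only an illustration of that idea, written in Python rather than PowerShell; the helper name can_connect and the server name in the usage line are placeholders.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder server name; SQL Azure listens on TCP port 1433.
    print(can_connect("yourserver.database.windows.net", 1433))
```

If this returns False from the build agent but True from a developer machine, the network path or the target name is the problem; if it returns True, the connection string used by the tests deserves a closer look.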
We finally solved the issue. The problem had nothing to do with Firewalls. The issue was that the app.config files in our unit test didn't go through the transformation step that our web.config files did. So all the settings were from our local development and therefore wrong.
More about this here:
Connect to external services inside Visual Studio Online build/test task
What connection string are you using? Your error seems to indicate that this is not truly a firewall issue, but rather a connection is being attempted to a server that doesn't exist.
My *incorrect* hypothesis right now is that your connection string contains only the server name, without the .database.windows.net suffix, which causes the client driver to look for the server on the local network. The error presented does not appear to be a firewall-related issue.
(Edited to reflect author feedback.)
If you're connecting over TCP, then why is your error message saying Named Pipes?
[...]
(provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server)
I'd look into this paradox first.
The firewall test is very simple, allow 0.0.0.0 to 255.255.255.255 or 0.0.0.0/0 and re-test. My money is on the same error message.

SQLAzure database server - named pipes provider, error: 40 - the network path was not found

We access our database that is in SQL Azure, and every so often we hit this error while trying to connect. We connect from a corporate network, using SSMS or API.
The weird part is how it always successfully and instantly connects on retrying. We retry just 1 second after and it works.
We saw that the DTU Usage % was high and scaled our server up, but that did not help. We have employed a SqlAzureRetry policy while accessing the database from our API, which seems to be helping in mitigating the issue - but the root cause is still not identified.
Has anyone employed a configuration or strategy or faced a similar issue? (the underlying provider failed to open / network path not found).
Thanks!
The solution was to change the format of the server name to use TCP:
tcp:servername.database.windows.net,1433;
Also, if you're connecting from code, you should change to the above format in your connection string.
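To make the format concrete, here is a small sketch of a helper that normalizes a server name into the tcp:host,port form shown above. The helper name tcp_data_source is hypothetical, and appending .database.windows.net to a bare server name is an assumption that only makes sense for SQL Azure servers.

```python
def tcp_data_source(server, port=1433):
    """Return the server in 'tcp:host,port' form, forcing the TCP/IP provider."""
    name = server.strip()
    # Drop an existing protocol prefix so it is not duplicated.
    if name.lower().startswith("tcp:"):
        name = name[4:]
    # Keep an explicit port if the caller already supplied one.
    host, _, existing_port = name.partition(",")
    host = host.strip()
    # Assumption: a name without dots is a bare SQL Azure server name.
    if "." not in host:
        host += ".database.windows.net"
    return f"tcp:{host},{existing_port.strip() or port}"


if __name__ == "__main__":
    print(tcp_data_source("servername"))
```

The result would then be used as the Server (or Data Source) value of the connection string.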

Connect Squirrel to Azure HBase cluster

My objective is to access my Hbase cluster on Azure with Squirrel with a Phoenix driver running on my local computer.
My Hbase cluster on Azure is operational. I can see it in the Ambari dashboard and I can access it using SSH. I can start Phoenix with the sqlline.py command pointing to one of the zookeeper nodes. The !tables command returns four lines.
My Hbase cluster is included in an Azure VNet. From my local computer (running Windows 10) I can connect to this VNet. I can ping the IP address (10.254.x.x) of the zookeeper node successfully but pinging the FQDN of the zookeeper node results in an error message:
"Ping request could not find host zk1-.......ax.internal.cloudapp.net.
Please check the name and try again."
When I start Squirrel on my local computer with the URL pointing to the FQDN of the zookeeper node I get an error message:
"Unexpected Error occurred attempting to open an SQL connection". The
stack trace points to a java.util.concurrent.RuntimeException: "Unable
to establish connection"
When I start Squirrel on my local computer with the URL pointing to the IP address of the zookeeper node I get a different error:
"Unexpected Error occurred attempting to open an SQL connection". The
stack trace points to a java.util.concurrent.TimeoutException.
I suspect this has something to do with the Domain Name resolution problem as described here [https://superuser.com/questions/966832/windows-10-dns-resolution-via-vpn-connection-not-working]. I applied the resolution as described by LikeARock47 on Feb 23. This did not improve the situation however.
Does this indeed have to do with the Domain Name resolution issue or is the problem somewhere else?
Is there a better solution to the Domain Name resolution issue?
A JDBC connection from Squirrel on my local Windows 10 computer has successfully been established to the Hbase cluster by using the zookeeper IP address, the port, and "/hbase-unsecure":
jdbc:phoenix:10.254.x.x:2181:/hbase-unsecure
I can manage my HBase cluster with a local Squirrel now!
I'd still be interested to find out how I can get the zookeeper FQDN resolved locally.....
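For reference, the working URL above has three parts after the jdbc:phoenix: prefix: the zookeeper host (or a comma-separated list of hosts), the zookeeper port, and the znode. A small sketch of how it is assembled follows; the function name is made up for illustration, and the defaults are simply the values from the URL above.

```python
def phoenix_jdbc_url(zookeeper_hosts, port=2181, znode="/hbase-unsecure"):
    """Build a Phoenix JDBC URL of the form jdbc:phoenix:<hosts>:<port>:<znode>."""
    # Several zookeeper nodes are given as one comma-separated quorum.
    quorum = ",".join(zookeeper_hosts)
    return f"jdbc:phoenix:{quorum}:{port}:{znode}"


if __name__ == "__main__":
    # The IP address is masked here exactly as in the post above.
    print(phoenix_jdbc_url(["10.254.x.x"]))
```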
