How can I diagnose a connection failure to my Load-balanced Service Fabric Cluster in Azure?

How can I diagnose a connection failure to my Load-balanced Service Fabric Cluster in Azure? - azure

I'm taking my first foray into Azure Service Fabric using a cluster hosted in Azure. I've successfully deployed my cluster via ARM template, which includes the cluster manager resource, VMs for hosting Service Fabric, a Load Balancer, an IP Address and several storage accounts. I've successfully configured the certificate for the management interface and I've successfully written and deployed an application to my cluster. However, when I try to connect to my API via Postman (or even via browser, e.g. Chrome) the connection invariably times out and does not get a response. I've double checked all of my settings for the Load Balancer and traffic should be getting through since I've configured my load balancing rules using the same port for the front and back ends to use the same port for my API in Service Fabric. Can anyone provide me with some tips for how to troubleshoot this situation and find out where exactly the connection problem lies ?
To clarify, I've examined the documentation here, here and here

Have you tried logging in to one of your service fabric nodes via remote desktop and calling your API directly from the VM? I have found that if I can confirm it's working directly on a node, the issue likely lies within the LB or potentially an NSG.

Related

VNet Integration For Azure Web App and Azure SQL Server

I have an Azure Web App and an Azure SQL Server, both in the same subscription. Both of them are connected to the same VNet Subnet as shown in the below snapshots. The SQL Server is configured not to Allow Azure Resources and Services to access the server, as it should only permit access from either the connected subnet or a set of IP rules.
Unfortunately, the SQL Server is actively refusing any connection from the web app stating that the web app IP is not allowed to access the server.
The interesting thing is that I have the exact same configuration working on another subscription.
What could I be missing?
Snapshots:
1- Here you can see the web application connected to the "webapps" subnet
2- And here you can see the SQL Server connected to the same subnet
3- And that's the error I get

TLDR
The configuration is correct, but an app service restart may be required.
VNET Integration
The configuration of using a virtual network to connect a web app to a SQL database is correct: if the web app is connected to the same subnet/vnet which is allowed in the database's ACLs, and the Microsoft.Sql service endpoint is enabled on the subnet, the web app is able to communicate to the database. This is the whole reason for service endpoints: you do not need to configure with IP allowances on the database.
As to why the configuration still resulted in an error, it could be the order in which the resources were configured. We were experiencing the exact same setup and issue (which is what let me to this question)!
We connected our web app to the subnet/vnet but had not enabled the service endpoint on the subnet. We then added/allowed the subnet/vnet as an ACL in the database, during which we were prompted to enable the Microsoft.Sql service endpoint (we did). However, even after waiting ~20 minutes, we were still seeing the same connection issue.
However, once we restarted the app service, the issue went away and the web app could connect to the SQL database.
I suspect the issue is due to enabling the subnet's service endpoint after the app service was connected to the subnet. The app service must need a restart to refresh the app service's vnet config/routing.
Configuration NOT needed
Contrary to other answers, you do not need to configure firewall IP allowances nor enable access to Azure services and resources. In fact, there are downsides to both approaches:
Enabling access to Azure services and resources allows any Azure-based resource to connect to your database, which includes resources not owned by you. From doc:
This option configures the firewall to allow all connections from Azure, including connections from the subscriptions of other customers.
Unless you're using an App Service Environment (which is significantly more expensive than normal App Service plans), your web app's outbound IP addresses are neither static nor specific to your application. From doc:
Azure App Service is a multi-tenant service, except for App Service Environments. Apps that are not in an App Service environment (not in the Isolated tier) share network infrastructure with other apps. As a result, the inbound and outbound IP addresses of an app can be different, and can even change in certain situations.
The second point is further elaborated upon in this Github issue:
IPs are indeed shared with other App Service plans (including other customer's plans) that are deployed into the same shared webspace. The network resources are shared among the plans in a workspace even if the computing instances are dedicated (e.g. in Standard tier). This is inherent to the App Service multi-tenant model. The only way to have a dedicated webspace (i.e. outbound IPs) is to deploy an App Service plan into an App Service Environment (ASE) (i.e. Isolated tier). ASE is the only thing that offers true single-tenency in App Service.
So neither of the above options will truly harden your SQL database if you want to isolate communication from only your web app. If you have resources in the same subnet, using vnet integration is the correct way to solve the problem.
If resources cannot be in the same subnet, the solution is to use Private Endpoints.

Virtual networking in Azure is quite different from how it would work on premises.
I had similar problems in production environment and digging deep, the working solution (meeting security standards and create a secure connection to the database) was to create a private endpoint for SQL access in the virtual network. Then all the calls to the SQL were performed internally (it did not go on the internet), and the databases were denying all public calls.
In your case now, you deactivated the Allow Azure apps to access so when your app is trying to access the SQL the server checks the ip to find out if it is white listed or not. So fast solutions would be one of the following:
Enable Azure Web apps to access SQL
Find all outbound IPs of your web app and register them in you SQL firewall/ security settings.
If you talk about a proper production environment with security regulations I would suggest you go down the more tedious path of private endpoints.

You have to configure the outbound IPs from the app service in the sql fw.
You can find them under properties of your app service. Documentation.
The reason why is that the VNET integration doesn't give your app service an outbound IP in the VNET you configured it in, so the FW you configured doesn't work.

I have working web apps which access storage accounts and KVs. These storage accounts and KVs accept traffic from a particular subnet and the web apps have been configured to integrate with those subnets. I did face an issue where even after integration apps were not able to access these resources. What worked for me was, I changed the App service SKU from Standard to Premium and restarted the app. As you can see, it warns that "Outgoing IPs of your app might change". This is not guaranteed solution but it worked for me.. several times! Not sure about SQL server though. Private endpoint does seem like the way to go but you can give this a try.

Connecting to Azure VM from Azure App Service

I have an Azure virtual machine, on which a process listens on a certain port. A Node.js application on my local computer is able to connect to this process using the VM's public IP address. But the same Node.js application, deployed as an app service on Azure, is apparently not able to connect using any IP address, despite the fact that the VM allows all incoming traffic on all ports.
(Details: The VM process is running "q" (kdb+), and the Node.js application is using the "node-q" package to connect to it. Both the Azure VM and the Azure app service are Linux, but the local version of the app service is on Windows. The Azure app service is able to connect to my Azure SQL database.)
Any insights into this problem would be appreciated.

There are many reasons for Bad gateway error, probably you could verify these factors on your side:
Azure VM side. Make sure the Azure VM is running and the process port is listening when you request a connection from an application. You could run sudo netstat -plnt on Linux VM to check the listening ports. Or, a server can crash if it has exhausted its memory, due to a multitude of visitors on site or a DDOS attack.
Firewall blocks a request. You should allow all incoming traffic or Azure web app service outbound traffic on this listening port on the VM. In this scenario, you could verify the Network Security Group configuration for the VM and firewall inside the VM if you have. You could find NSG settings by clicking Virtual machine--Settings---Networking---inbound port rules on the Azure portal.
Faulty programming. It seems the Node.js application could work locally.
Temporary issue. Sometimes, there is no real issue but your browser thinks there is one thanks to an issue with your browser, a problem with your networking equipment, or some other reasons. You could refresh your web browser or clear cache and cookies to get the page back what you are looking for. More details you can refer to fixing 502 error.
If you still have any question, feel free to let me know.

It was faulty deployment. I didn't include all dependencies in the upload to Azure. Thank you.

Reach Service Fabric from internet

I’m playing with Microsoft Azure Service Fabric, but I'm having some problems reaching the services from internet.
My situation:
I Created the Service Fabric cluster:
Windows Server 2016 Datacenter.
Node type count: 1.
Custom Endpoint: empty.
“Enable reverse proxy” flagged.
All my services are developed base on .NET Core 2.1, REST API.
Using a web browser, all the services work fine locally (with Service Fabric Local Cluster and Azure Storage Emulator or Azure Storage). Then I published the application to the Azure cluster but I can not reach any of the service from internet.
Question
How can I setup the environment so to reach the services from internet?
I read some docs:
https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-connect-and-communicate-with-services
https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-reverseproxy
One of the service in the ServiceMnifest.xml file has the following configuration:
<Endpoint Protocol="http" Name="ServiceEndpoint" Type="Input" Port="8939" />
So, I added the following configurations in the load balancer:
Health probes: added a configuration for the 8939 port.
Load balancing rules: added a configuration for an 8939 => 8939 TCP passthrough using the previous health probes configuration.
But when I try to reach it from browser I get a timeout.
Any suggestion is appreciated.
Regards,
Attilio

So if you have the reverse proxy enabled, and want to use that the endpoint on port 8939 is not interesting as you should access the reverse proxy using the proxy url
As pointed out in the comments the format is http://[FQDN]:[ReversPoxyPort]/[ApplicationName]/[ServiceName]/[Controller] and typically the port is 19081 so the url becomes something like myawesomeservicefabric.westeurope.cloudapp.azure.com:19081/FabricApp1/Service1/Values
As with all other things the port should be configured in the load balancer with a probe and a rule.
I am not sure what the portal does with clusters now a days, but at some point it configured a network security group which might also be the cause of your issues.

RDP into the machine and test the endpoint on localhost. If it doesn't work, it's likely an application error.
Verify the firewall settings, for a rule for incoming traffic on port 8939.
Enable logging on the Azure Load Balancer to see if the health probe detects the endpoint.

Load balancer for Azure Service Fabric Cluster on-premises

As developers we wrote microservices on Azure Service Fabric and we can run them in Azure in some sort of PaaS concept for many customers. But some of our customers do not want to run in the cloud, as databases are on-premises and not going to be available from the outside, not even through a DMZ. It's ok, we promised to support it as Azure Service Fabric can be installed as a cluster on-premises.
We have an API-gateway microservice running inside the cluster on every virtual machine, which uses the name resolver, and requests are routed and distributed accordingly, but the API that the API gateway microservice provides is the entrance for another piece of client software which our customers use, that software runs outside of the cluster and have to send requests to the API.
I suggested to use an Load Balancer like HA-Proxy or Nginx on a seperate machine (or machines) where the client software send their requests to and then the reverse proxy would forward it to an available machine inside the cluster.
It seems that is not what our customer want, another machine as load balancer is not an option. They suggest: make the client software smarter to figure out which host to go to, in other words: we should write our own fail-over/load balancer inside the client software.
What other options do we have?
Install Network Load Balancer Feature on each of the virtual machine to give the cluster a single IP address, is this even possible? Something like https://www.poweradmin.com/blog/configuring-network-load-balancing-in-windows-server/
Suggest an API gateway outside the cluster, like KONG https://getkong.org/
Something else ?
PS: The client applications do not send many requests per second, maybe a few per minute.

Very similar problem, we have a many services and Service Fabric Cluster that runs on-premises. When it's time to use the load balancer we install IIS on the same machine where Service Fabric cluster runs. As the IIS is a good load balancer we use IIS as a reverse proxy only for API Gateway. Kestrel hosting is using for other services that communicate by HTTP. The API gateway microservice is the single entry point for all clients and has always static URI inside SF, we used that URI to configure IIS
If you do not have possibility to use IIS then look at Using nginx as HTTP load balancer

You don't need another machine just for HTTP forwarding. Just use/run it as a service on the cluster.
Did you consider using the built in Reverse Proxy of Service Fabric? This runs on all nodes, and it will forward http calls to services inside the cluster.
You can also run nginx as a guest executable or inside a Container on the cluster.

We have also faced the same situation when started working with service fabric cluster. We configured Application Gateway as Proxy but it would not provide the function like HTTP to HTTPS redirection.
For that, we configured Nginx Instead of Azure Application Gateway as Proxy to Service Fabric Application.

Secure communication between existing Azure App Service and Azure VM cluster

We have an application running in Azure that consists of the following:
A Web App front end, which talks to…
A WebApi running as a Web App as well, which can (as well as a couple other services) talk to…
A Cloud Service load balanced set of VMs which Are hosting an Elasticsearch cluster.
Additionally we have the scenario were dev’s whitelist their IPs so that their localhost version of the API can hit the VMs as well.
We have locked down our Elasticsearch VM’s by adding ACLs to the exposed end point. I whitelisted the outbound IPs that were listed on my App Services. I was under the mistaken impression that these were unique to my Api. It turns out that these are shared across the scale unit in Azure. Other services running in the same scale unit, could, if they knew the endpoint, access the data exposed on the endpoint in my cluster. I need to lock this down, and I am trying to find the easiest way. These are the things I am looking at, and I would appreciate advice and/or redirection.
Elastic Shield: Not being considered. This is a product by Elastic
that is designed to secure ES. This is ideal, but at the moment it
is out of scope (due to the cost and overhead)
List item
Elastic plugins: Not being considered. The main plugins (such as
Jetty) appear to be abandoned.
Azure VPN. I originally tried to set this up, but ran into too many
difficulties. The ACLs seemed to give me what I need without much
difficulty. I am not sure if I can set this up now. The things I
don’t know are:
I don’t think I can move existing VMs into a new VPN.
I think you have to recreate the VMs in that VPN from the get go
Could I move my Web App into the VPN? How does that work?
This would prob break my developer scenario as the localhost API
would not be able to access the VPN, right?
Add a certificate to requests: It would be ideal if I could have
requests require a cert or a header token. I assume to do this I
would need to create a proxy that would run on the VMs and do the
validation before forwarding the request on to my Elasticsearch.
Anything else? Is there another option I have not thought of?
Thanks!
~john

You can create a VPN point-to-site connecting your Web App with your IaaS VMs. This is the best solution because you will be able to use just internal IPs on your IaaS.
The easiest way to do that using Azure Portal is create a Web App and, create a new VPN and VNet using "setup" option at "Your Web App" -> Settings -> Networking -> VNET Integration -> Setup -> Create New Virtual Network.
After that, create your IaaS inside this new VNet.
You also can create a ARM template to create Web App, IaaS, VPN and everything that you need. Take a look at my ARM template to create PHP+MySQL using Web App and MariaDB Cluster connected by VPN: https://github.com/juliosene/azure-webapp-php-mariadb

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string