Monitoring Azure Data Factory access - azure

Kind of a simple question, but puzzling...
Is there a stat in Azure services to monitor how many times Data Factory is or was accessed?
So, as an example: if an automated system is set up to make persistent API calls to ADF with the malicious intent of exhausting it, is there a way to monitor for that and gather some kind of stats?

The monitoring built into the Azure Data Factory PaaS itself only covers legitimate, authenticated usage. You can see this on the https://adf.azure.com/en/monitoring/pipelineruns?factory=%2Fsubscriptions%... dashboard.
Notice how the root domain is adf.azure.com - this is the same for all tenants using Data Factory around the world. Your specific subscription / instance is just a query parameter in the URL. Microsoft Azure fully manages the actual hosting of this PaaS, which means they are entirely responsible for mitigating any DDoS or similar bad-actor attempts on this service. It's not something you have to worry about, and therefore not something you have much visibility into.
If you ever need or want to check in on how Microsoft is doing with this, head on over to https://status.azure.com/status and search for the "Azure Data Factory" row.
This is really one of the biggest selling points of using a fully-hosted cloud PaaS such as Data Factory. You are no longer responsible for the hardware, or even the range of IP addresses, backing this service - no more than you have to worry about someone DDoSing outlook.office.com, which probably serves your entire organisation's email. It could happen, but if it did, it would affect all of Microsoft's customers around the world, not just you personally, so there should be no expectation that you personally do anything special to mitigate against it.
Note that, more generically, if you want to monitor network traffic within your NSGs, interfaces, VNETs etc. on Azure, the thing to use is Azure Monitor's Network Insights at https://portal.azure.com/#view/Microsoft_Azure_Monitoring/AzureMonitoringBrowseBlade/~/networkInsights
This applies to all provisioned resources and services on Azure, though - it is not something specific to Azure Data Factory.
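
If you ever want to pull those legitimate-usage numbers programmatically rather than through the dashboard, something along these lines should work. This is only a rough sketch using the azure-mgmt-datafactory Python package; the subscription, resource group and factory names are placeholders, and the exact model and method names may differ between SDK versions, so check against the current docs.

    from datetime import datetime, timedelta, timezone

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import RunFilterParameters

    # Placeholder identifiers - substitute your own values.
    SUBSCRIPTION_ID = "<subscription-id>"
    RESOURCE_GROUP = "<resource-group>"
    FACTORY_NAME = "<factory-name>"

    client = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

    # Ask for every pipeline run updated in the last 24 hours.
    filters = RunFilterParameters(
        last_updated_after=datetime.now(timezone.utc) - timedelta(days=1),
        last_updated_before=datetime.now(timezone.utc),
    )
    runs = client.pipeline_runs.query_by_factory(RESOURCE_GROUP, FACTORY_NAME, filters)

    print(f"Pipeline runs in the last 24 hours: {len(runs.value)}")
    for run in runs.value:
        print(run.pipeline_name, run.status, run.run_start)

Again, this only sees authenticated activity against your own factory - it tells you nothing about anonymous traffic hitting Microsoft's shared front end.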

Related

Question about Azure Load Balancer/Azure Traffic Manager

If one application has Azure Service Bus and Event Hubs in different Azure namespaces, a web application, and also other Azure services (e.g. Cognitive Services), can these all be accessed with one URL by using a gateway, load balancer, Traffic Manager, or any other option?
My problem is: if we have different namespaces, we need to whitelist each new namespace every time one is added, and that could be too much work. So I am wondering if we can have one common DNS name/URL that would make life easier.
Today, Service Bus and Event Hubs don't support any sort of network gateway. This is due to the fact that the namespace in the connection string is used for authorization purposes on the service side.
To add a bit of context to Serkant's statement, support for this scenario is something that is on our roadmap, and hopefully in the near term. Unfortunately, I don't have a date to share currently. The work is being tracked [here] should you wish to keep an eye on it.

Is it possible to see when my Azure Resources are idling?

I want to see when my resources are idling (e.g. certain resources might only be used during business hours and not used for any other background process). I'd like to do that preferably through an API call.
It all depends on the type of resource and what you want to do. You could use the Azure Monitor API, or the Azure Data Explorer API with Kusto, to query specific metrics for your different services. Depending on the type of data, this may require you to have additional analytics enabled.
Here are some examples based on types of services.
Azure App Service - You could query for CPU, Memory, HTTP Requests, etc. This would give you an idea of activity. These same metrics tie into the auto-scaling.
Azure VMs - CPU, Memory, Disk IO, etc. You could determine your baseline then you would know when it is idle or not.
Azure Storage - Transactions, Ingress, Egress, Requests, etc. You could use that to determine if there is activity in your storage account.
As you can see, it all depends on how you define idling. If the goal is to reduce costs, that will be difficult with many of these services. You could scale your App Services up and down with some scripts, or scale in/out based on metrics. The same can be done with your Azure VMs, or by stopping and starting them. Storage cannot be adjusted, but you are only charged for storage and egress, so that is dictated by activity.
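
As a rough illustration of the API route, here is a minimal sketch using the azure-monitor-query Python package to pull hourly-averaged CPU for a single resource. The resource ID and metric name are placeholders (a VM in this example), and the package and method names are as I recall them, so double-check against the current SDK docs.

    from datetime import timedelta

    from azure.identity import DefaultAzureCredential
    from azure.monitor.query import MetricsQueryClient, MetricAggregationType

    # Placeholder resource ID - a VM here; swap in your own resource.
    RESOURCE_ID = (
        "/subscriptions/<subscription-id>/resourceGroups/<rg>"
        "/providers/Microsoft.Compute/virtualMachines/<vm-name>"
    )

    client = MetricsQueryClient(DefaultAzureCredential())

    # Hourly average CPU over the last 7 days.
    result = client.query_resource(
        RESOURCE_ID,
        metric_names=["Percentage CPU"],
        timespan=timedelta(days=7),
        granularity=timedelta(hours=1),
        aggregations=[MetricAggregationType.AVERAGE],
    )

    for metric in result.metrics:
        for series in metric.timeseries:
            for point in series.data:
                # Apply your own threshold (e.g. average < 5%) to call a period "idle".
                print(point.timestamp, point.average)

You would then layer your own definition of idle (threshold, business-hours window) on top of the raw numbers.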
Hope this helps.
No, this is not possible. How do you define "idling"? How would Azure know whether your service is doing anything or not? Besides, most PaaS resources cannot be stopped, so what's the use of that?
You can use Azure Advisor to get cost-optimization advice, or Azure Monitor directly to gather performance data and then analyze it, but it's not going to be trivial.

Resource Group Location vs. Tenant Region

I have been tasked with building a PoC in Azure to "simulate" a future global deployment where data transfer time is an important factor. The actual deployment will use fully on-prem resources. So, as odd as it sounds, I am looking for the worst performance possible between the two options.
Architecture A (single tenant):
Create a single Azure tenant in the US region
Create a Resource Group with a US-based location
Create another Resource Group with an EU-based location
Architecture B (dual tenant):
Create an Azure tenant in the US region with a US-based RG
Create an entirely separate Azure tenant in an EU region with an EU-based RG
Would the dual-tenant structure above make any measurable difference one way or the other compared with the single-tenant one (assuming all vNetworks, VMs, etc. are identical)? I am thinking the single-tenant setup would be faster since (presumably) the traffic never leaves the Azure Service Fabric. But that's just speculation.
Here is what I got back from a colleague. She is (obviously) far more versed in Azure IaaS than I am. Answer #3 below indicates that the closest analog to the client MPLS connection is via VPN/ER. Not really worth the cost but still good to know.
Can a single subscription be used to provision US and European region located resources? Yes
Can resources in US and European located regions be managed from a US based portal? Yes
When allowing resources in US and European regions to communicate with one another, what are our options? A couple of primary ways...
Inter-regional (tenant to tenant, region to region):
Communications can be provisioned to travel across the Microsoft Azure backbone. Traffic never hits the open Internet.
VPN or ExpressRoute:
Traffic travels either over the open internet (in an encrypted, TLS-like tunnel) or over a private route from one region to another. ExpressRoute, the MPLS-like option, does require advanced routing (BGP) and dedicated circuits at either end from connectivity providers. It is also expensive.
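
If you do stand the PoC up, one crude way to compare the two layouts is simply to time connections between a VM in each region under each architecture. The sketch below is plain Python with no Azure SDK involved; the hostnames and port are hypothetical, and it assumes each VM runs some TCP listener on a port opened in its NSG.

    import socket
    import statistics
    import time

    # Hypothetical endpoints: one VM per region, each running a TCP listener
    # (for example `nc -lk 7000`) on a port opened in its NSG.
    TARGETS = {
        "us-vm": ("us-poc-vm.example.com", 7000),
        "eu-vm": ("eu-poc-vm.example.com", 7000),
    }

    def tcp_connect_rtt(host, port, samples=20):
        """Measure TCP connect round-trip times in milliseconds."""
        rtts = []
        for _ in range(samples):
            start = time.perf_counter()
            with socket.create_connection((host, port), timeout=5):
                pass  # connection established; close immediately
            rtts.append((time.perf_counter() - start) * 1000)
            time.sleep(0.2)
        return rtts

    for name, (host, port) in TARGETS.items():
        rtts = tcp_connect_rtt(host, port)
        print(f"{name}: median={statistics.median(rtts):.1f} ms, "
              f"max={max(rtts):.1f} ms")

Run it from the US-side VM against both targets (and the other way round) in each architecture; if the numbers are indistinguishable, the tenant layout is not what drives your transfer times.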

What's the point of Azure Add-Ons?

Windows Azure has a store.
The things you can buy there are called Add-Ons, and they fall into two categories: services and data.
I understand the point of some of the service offerings, but not all, and I don't yet understand the point of the data offerings at all.
With services, some offerings are database deployments such as ClearDB (MySQL) and MongoLab. That makes sense to me: You get those databases deployed and monitored with a few clicks, yet those databases run in the same data center as the applications that consume them, which is good for performance and security.
For most other services (there is a simple scheduler application, for example), it seems that the only advantage is the unified billing method. Is that a correct observation, or is there more to it?
Then the data offerings: the fact that I can buy Bing query transactions cannot really have anything to do with the rest of my Azure account, right? Technically, it's just Bing (or whatever other data offering you look at), and presumably I'm going against the same Bing API that I would have used previously (I'm assuming that was possible). There is nothing really deployed in any Azure data center the moment I buy it, is there? So in what sense is that an Add-On?
In a nutshell, am I missing something, or are most Add-Ons just a method of buying external services and having them billed on my Azure account?
If you can answer the question for other 'app stores', you can answer it for Windows Azure. We know about THE App Store (as per the court battles over the name), which is the only way to get applications onto the closed (iOS) device. There is also a Mac App Store, which would seem unnecessary given the ability to install apps yourself (and which makes it more similar to the Azure store). In this case the reasons for the store are discoverability, association with the store brand (where the buyer assumes a degree of vetting), a single point for updates, and simplified billing.
The Windows Azure Store (and data marketplace) exists for similar reasons. It is less about the technical benefits than the association with the Azure brand. Since SO is technical, let me highlight some (largely) technical aspects:
Don't assume that the service will run in the same data centre. In most cases it probably won't.
There is an advantage to having everything in one place from an operational point of view. Granting operator access to the subscription means that you don't have to administer accounts on the service. I have had problems with this, though - where the service made it difficult to do other things (such as get support) because the Azure identity wasn't handled very well. (I had this with New Relic.)
The combined billing works on credit card payments only. Last time I checked (Summer 2013) there was no way to get an add-on with a pay-by-invoice subscription, so a second subscription (with credit card) was needed anyway.
Add-ons seem to still be in 'preview', which may indicate low adoption. Microsoft probably hasn't seen it grow the way they expected and may not develop it much in future. This is opinion only, and shouldn't affect the service (after all, the store is just a gateway and has little technical impact on the service provided).
Don't completely ignore the store, however. The biggest benefit seems to be the free tiers and reduced pricing, where Microsoft has managed to get service providers to make the store attractive. For example, the SendGrid free option provides 25,000 emails per month, and there doesn't seem to be a free option on SendGrid.com. New Relic pricing was (and maybe still is) significantly lower.
Pay attention mainly to the pricing benefits, rather than perceived technical benefits.

Minimize downtime in Azure

We are experiencing a very serious unscheduled downtime of our Azure application today for what is now coming up to 9 hours. We reported to Azure support and the ops team is actively trying to fix the problem and I do not doubt that. We managed to get our application running on another "test" hosted service that we have and redirected our CNAME to point at the instance so our customers are happy, but the "main" hosted service is still unavailable.
My own "finger in the air" instinct is that the issue is network related within our data center (west europe), and indeed, later on in the day the service dash board has gone red for that region with a message to that effect. (Our application is showing as "Healthy" in the portal, but is unreachable via our cloudapp.net URL. Additionally threads within our application are logging sql connection exceptions into our storage account as it cannot contact the DB)
What is very strange, though, is that the "test" instance I referred to above is also in the same data centre and has no issues contacting the DB and it's external endpoint is fully available.
I would like to ask the community if there is anything I could have done better to avoid this downtime. I obeyed the guidance with respect to having at least 2 role instances per role, yet I still got burned. Should I move to a more reliable data centre? Should I deploy my application to multiple data centres? How would I manage the fact that my SQL Azure DB is in the same data centre?
Any constructive guidance would be appreciated - being a techie, I've never had a more frustrating day being able to do nothing to help fix the issue.
There was an outage in the European data center today with respect to SQL Azure. Some of our clients got hit and had to move to another data center.
If you are running mission-critical applications that cannot be down, I would deploy the application into multiple regions. DNS resolution is obviously a weak link right now in Azure, but it can be worked around (if you only run a website it can be done very simply using Response.Redirect or similar).
Now, there is a data synchronization service from Microsoft that will sync up multiple SQL Azure databases. Check here. This way, you can have mirror sites up in different regions and keep them in sync from a SQL Azure perspective.
It would also be a good idea to employ a 3rd-party monitoring service that detects problems with your deployed instances externally. AzureWatch can notify you, or even deploy new nodes if you choose, when some of your instances turn "Unresponsive".
Hope this helps
I can offer some guidance based on our experience:
Host your application in multiple data centres, complete with SQL Azure databases. You can connect each application to its data-centre-specific SQL server. You can also cache any external assets (images/JS/CSS) on the data-centre-specific Windows Azure machine or leverage Azure Blob Storage. Note: extra costs will be incurred.
Set up one-way SQL replication between your primary SQL Azure DB and the instance in the other data centre. If you want to do bi-directional replication, take a look at the MSDN site for guidance.
Leverage Azure Traffic Manager to route traffic to the data centre closest to the user. It has geo-detection capabilities, which will also improve the latency of your application: you can map http://myapp.com to the internal URL of your data centre, and a user in Europe should automatically get directed to the European data centre, and vice versa for the USA. Note: at the time of writing this post, there is no way to automatically detect a failure and fail over to another data centre. Manual steps will be involved once a failure is detected, and the failover is a complete set (i.e. you will fail over both the Windows Azure AND SQL Azure instances). If you want micro-level failover, then I suggest putting all your config in the service config file and encrypting the values, so you can edit the connection string to point instance X at DB Y.
You are all set now. I would create or install a local application to detect the availability of the site. A better solution would be to create a page (a diagnostics page or web service) that checks the availability of application-specific components, and then poll it from a local computer - see the sketch below.
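
A minimal sketch of such a poller in Python, using the third-party requests library; the diagnostics URLs and deployment names are placeholders for pages you would write yourself inside the application:

    import time

    import requests  # third-party; pip install requests

    # Hypothetical diagnostics endpoints, one per deployment. Each would be a
    # lightweight page in the app that checks its own dependencies (DB, storage)
    # and returns HTTP 200 only when everything it needs is reachable.
    ENDPOINTS = {
        "primary (West Europe)": "https://myapp-primary.cloudapp.net/diagnostics",
        "secondary (North Europe)": "https://myapp-secondary.cloudapp.net/diagnostics",
    }

    def is_up(url, timeout=10):
        try:
            return requests.get(url, timeout=timeout).status_code == 200
        except requests.RequestException:
            return False

    while True:
        for name, url in ENDPOINTS.items():
            status = "UP" if is_up(url) else "DOWN"
            print(f"{time.strftime('%H:%M:%S')} {name}: {status}")
        time.sleep(60)  # poll every minute; alert or fail over on repeated DOWNs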
HTH
As you're deploying to Azure, you don't have much control over how SQL Server is set up; MS has already set it up so that it is highly available.
Having said that, it seems that MS has been having some issues with SQL Azure over the last few days. We've been told that it only affected "a small number of users". At one point the service dashboard had 5 data centres affected by a problem. I had 3 databases in one of those data centres go down twice, for about an hour each time, yet one database in another affected data centre had no interruption at all.
If having a database connection is critical to your app, then the only way in the Azure environment to guard against problems that MS hasn't prepared for (this latest technical problem, earthquakes, meteor strikes) is to co-locate your SQL data in another data centre. At the moment the most practical way to do this is to use the Sync Framework. There is an ability to copy SQL Azure databases, but this only works within a data centre. With your data located elsewhere, you could then point your app at the new database if the main one becomes unavailable.
While this looks good on paper, it may not have helped you with the latest problem, as it affected multiple data centres. If you'd just been making database copies on a regular basis, that might have been enough to get you through. Or not.
(I would have posted this answer on server fault, but I couldn't find the question)
This is just about a programming/architecture issue, but you may also want to ask the question on webmasters.stackexchange.com.
You need to find out the root cause before drawing any conclusions.
However, my guess is that one of two things was the problem:
The ISP connectivity differs between the test system and your production system. Either they use different ISPs, or different lines from the same ISP. When I worked at a hosting company, we made sure that our IP connectivity went through at least two different ISPs who did not share fibre to our premises (and, where we could, they had different physical routes to the building - the homing ability of backhoes when there's a critical piece of fibre to dig up is well proven).
Your data centre had an issue with some shared production infrastructure. This might be edge routers, firewalls, load balancers, intrusion detection systems, traffic shapers, etc. These are typically only installed on production systems. Defences here involve understanding the architecture and making sure the provider has a (tested!) DR plan for restoring SOME service when things go pear-shaped. The neatest hack I saw here was persuading an IPS (intrusion prevention system) that its own management servers were malicious - so it couldn't be reconfigured at all.
Just a thought - your DC doesn't host any of the WikiLeaks mirrors, or PayPal/Mastercard/Amazon (who are getting DDoSed by WikiLeaks supporters at the moment), does it?
