I have several virtual machines and virtual machine scale sets in Azure for which I want to collect Windows Security event logs. I attempted to add these events to the Log Analytics workspace used by Sentinel through the portal.
This produces the following error message.
'Security' event log cannot be collected by this intelligence pack
because Audit Success and Audit Failure event types are not currently
supported.
It's a hard requirement for me that Sentinel has access to these Security logs. I've been trying to figure out what my options are, and I haven't found a good one yet.
The prescribed approach appears to be setting up a Data Connector in Sentinel for the Security Events. I hit a couple of interesting things attempting this.
Virtual machine scale sets support is limited. No actions are
available at this moment.
It looks like I can't connect virtual machine scale sets, which is a big problem. Additionally, I can't even select the tier of the security events (see below) from this context.
So it looks like I have to use Azure Security Center. From within Azure Security Center the only way I can add these Security Events is to turn on Auto-Provisioning and install the Microsoft Monitoring agent (MMA) on every VM, something I don't want to do. I'm also concerned about costs using ASC.
Are there any other options? Am I going about this the wrong way?
The Security event log is automatically added behind the scenes when the monitoring agent is installed on the VM.
As for the VMSS, I am not sure what your options are there.
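If it helps, here's a minimal sketch (Python, using the azure-identity and azure-monitor-query packages) of how you could check whether SecurityEvent rows are actually arriving in the Sentinel workspace once an agent reports in. The workspace ID is a placeholder and the query is just an example:

    from datetime import timedelta

    from azure.identity import DefaultAzureCredential
    from azure.monitor.query import LogsQueryClient

    # Placeholder: workspace ID (GUID) of the Log Analytics workspace backing Sentinel
    WORKSPACE_ID = "<log-analytics-workspace-id>"

    client = LogsQueryClient(DefaultAzureCredential())

    # Count Security events per EventID over the last 24 hours
    query = "SecurityEvent | summarize count() by EventID | top 10 by count_"
    response = client.query_workspace(WORKSPACE_ID, query, timespan=timedelta(days=1))

    for table in response.tables:
        for row in table.rows:
            print(row)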
I am new to GCP and come from an Azure background. Is there an equivalent of "Azure Application Insights" on the GCP side for Monitoring Applications?
Let me explain my use case more clearly with an example: if I have a .NET based web application running on a Windows VM on GCP, can Google Cloud Monitoring help detect exceptions raised by the running application and send out alerts?
Any pointers/links to further explore this type of monitoring capability would be helpful.
Cloud Monitoring will provide you with many statistics - most probably with what you need. And if there aren't any metrics that suit your needs, you can create your own based on the logs collected from the VM.
By default a number of logs are ingested, but if you want the full range and to experiment with various ones, you may want to install the monitoring agent. Go through the documentation and have a look.
You can then use the metrics to create charts and have a live view of a number of things such as CPU utilisation, disk I/O, dropped/sent/received packets, etc. Here's the Cloud Monitoring documentation.
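To make that concrete, here's a small sketch with the google-cloud-monitoring Python client that reads an hour of CPU utilisation for the project's Compute Engine instances ("my-project" is a placeholder project ID):

    import time

    from google.cloud import monitoring_v3

    client = monitoring_v3.MetricServiceClient()
    project_name = "projects/my-project"  # placeholder project ID

    now = int(time.time())
    interval = monitoring_v3.TimeInterval(
        {"end_time": {"seconds": now}, "start_time": {"seconds": now - 3600}}
    )

    # Pull one hour of CPU utilisation time series for all GCE instances in the project
    results = client.list_time_series(
        request={
            "name": project_name,
            "filter": 'metric.type = "compute.googleapis.com/instance/cpu/utilization"',
            "interval": interval,
            "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
        }
    )
    for series in results:
        print(series.resource.labels["instance_id"], series.points[0].value.double_value)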
And finally, you can create alerts based on the metrics (set thresholds, time periods, etc.). They can be simple e-mail alerts, for example, but they can also be sent via Pub/Sub and trigger functions or apps.
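A rough sketch of creating such an alert programmatically with the same library (again "my-project" is a placeholder, and in practice you would also attach notification channels so the alert actually e-mails or pages someone):

    from google.cloud import monitoring_v3

    client = monitoring_v3.AlertPolicyServiceClient()
    project_name = "projects/my-project"  # placeholder project ID

    # Fire when a GCE instance's CPU utilisation stays above 80% for 5 minutes
    policy = monitoring_v3.AlertPolicy(
        display_name="High CPU on GCE instances",
        combiner=monitoring_v3.AlertPolicy.ConditionCombinerType.AND,
        conditions=[
            monitoring_v3.AlertPolicy.Condition(
                display_name="CPU > 80% for 5 min",
                condition_threshold=monitoring_v3.AlertPolicy.Condition.MetricThreshold(
                    filter='metric.type = "compute.googleapis.com/instance/cpu/utilization" '
                           'AND resource.type = "gce_instance"',
                    comparison=monitoring_v3.ComparisonType.COMPARISON_GT,
                    threshold_value=0.8,
                    duration={"seconds": 300},
                ),
            )
        ],
    )
    created = client.create_alert_policy(name=project_name, alert_policy=policy)
    print(created.name)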
Since you're new to GCP there's a lot of reading ahead of you, but you will easily find documentation for most of GCP's services.
If you provide more details I can update my answer and give you a more precise answer.
There's an awful lot of disjointed documentation on monitoring network/resources in Azure. What I'm looking for is which pieces are needed to get information from VMs, NVA firewalls, Azure load balancers, and other network resources and network connectivity into a single pane of glass in Azure. Only concerned about Azure, not on-prem for now.
I've come across Azure Monitor, Log Analytics workspaces, Event Hub, VM extensions, Network Watcher, insights, etc., but I'm not sure which are required and which are not. One doc leads to the next and I end up with 30 tabs open. I'll also need to be able to push logs to other security devices such as a SIEM.
Does anyone know of a deployment guide that wraps this all up in a more logical fashion? Does anyone have any feedback on which pieces from azure (not 3rd parties) are required at a minimum to accomplish a single pane of glass to view my Azure environment holistically?
General overview of observability in Azure
Likely, the thing you're looking for is Azure Monitor. It's an umbrella term for everything observability related inside Azure.
To store Metrics and Logs you need Log Analytics: it can query data with the Kusto Query Language (KQL), visualize results, and define alerts on queries.
Alerts are quite a complex beast, as they are spread across the entire cloud. The two types I use the most:
log-analytics alerts (which I mentioned above)
the Alerts tab, which is available on every Azure component view. For example, open a resource group and scroll down to the Monitoring section
Each component also has a subset of built-in metrics. You have likely noticed that many Azure components display some charts on the Overview view. For example, Azure Storage Account displays Total egress, Total ingress, and other line charts. When you click on these charts you can customize them. These metrics and charts are free to use.
Microsoft also has an all-in-one observability solution for Azure Functions and Web Apps: Application Insights.
Dashboards allow you to join multiple charts into a single view and share it with others.
If you care about security, Azure offers Azure Security Center.
Deployment/management strategy
I suggest starting with:
Create a Log Analytics workspace, which is the storage for metrics and logs. The Azure docs article explains how to design it: how many instances to use, how to rate-limit ingestion (it might get expensive if it goes out of control), how to access it, and so on.
To get logs from Azure components, look for the Diagnostic Settings tab on a component's page in the Azure portal, though not all components have it. I suggest the following (there's a code sketch at the end of this section):
sending the most critical data to the Log Analytics workspace, so it is stored in a queryable format for 30 days (included in the free tier); this is what you need for investigating current issues with your infrastructure
if you might need logs older than 30 days, send them to a Storage Account as well
you mentioned SIEM integration: route the required events to an Event Hub and then process the stream according to your requirements
So, if you need long-term storage, create an Azure Storage Account.
If you need real-time analysis, build a pipeline based on Azure Event Hubs.
If you have Azure Functions and Web Apps, add Application Insights. In my experience, it's best to start with a separate instance per Azure Function resource or service.
Create Alerts for each component separately. If you do it through the UI, open the component's page in the portal and look for the Alerts tab there. If you're automating the process (please do so as soon as possible), do not expect an easy ride: I have used ARM templates and Terraform, and in both cases there are dozens of barely documented features.
Join related components' core metrics into Dashboards and share them with the team. This guide is a good starting point. Note that when you share a dashboard, it is also persisted as an Azure resource in the subscription.
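Here's the sketch promised in the Diagnostic Settings step, using the azure-mgmt-monitor Python SDK. The subscription, resource, and workspace IDs are placeholders, and the available log categories differ per resource type, so check the component's Diagnostic Settings blade for the real category names before relying on this:

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.monitor import MonitorManagementClient

    SUBSCRIPTION_ID = "<subscription-id>"                     # placeholder
    RESOURCE_ID = "<ARM id of the component to monitor>"      # placeholder
    WORKSPACE_ID = "<ARM id of the Log Analytics workspace>"  # placeholder

    monitor = MonitorManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

    # Route platform metrics and a chosen log category of the resource to the workspace
    monitor.diagnostic_settings.create_or_update(
        resource_uri=RESOURCE_ID,
        name="send-to-log-analytics",
        parameters={
            "workspace_id": WORKSPACE_ID,
            "metrics": [{"category": "AllMetrics", "enabled": True}],
            "logs": [{"category": "<log category valid for this resource type>", "enabled": True}],
        },
    )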
I have the two questions below; can someone help with them?
1. Is there a script or a way to create a custom alert format for Azure alerts?
2. Is there a way to pin the status of all Azure VMs to a dashboard?
Regarding #1, the ability to customize or configure the alert email format is currently not supported. If interested, I suggest you raise your feedback/feature request in the UserVoice/feedback forum. The responsible product/feature team would triage it, check feasibility, and prioritize the feedback.
Regarding #2, if 'status' means 'PowerState' (i.e., whether the VM is running, deallocated, etc.), 'StatusCode' (i.e., ok, etc.), or 'ProvisioningState' (i.e., succeeded, etc.), then I don't think there is a straightforward way to ingest that particular data directly into a dashboard. That said, you could start from the 'Heartbeat' Log Analytics Kusto table and create a custom view as a dashboard using the view designer; however, since views in Azure Monitor are being phased out and replaced with workbooks, I suggest using workbooks instead.
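If PowerState is what you're after, a rough sketch with the azure-mgmt-compute Python SDK shows how that data could be pulled before you push it into a workbook, a custom log, or wherever else (the subscription ID is a placeholder):

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.compute import ComputeManagementClient

    compute = ComputeManagementClient(DefaultAzureCredential(), "<subscription-id>")  # placeholder

    for vm in compute.virtual_machines.list_all():
        # The resource group name can be parsed out of the VM's ARM id
        resource_group = vm.id.split("/")[4]
        view = compute.virtual_machines.instance_view(resource_group, vm.name)
        power_states = [s.code for s in view.statuses if s.code.startswith("PowerState/")]
        print(vm.name, power_states)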
If not, you may leverage a newer feature called Azure Monitor for VMs, which helps analyze the performance and health of your Windows and Linux VMs and monitor their processes and dependencies on other resources and external processes. Here again, you can create interactive reports for Azure Monitor for VMs with VM insights workbooks.
Hope these inputs help!
Is Azure diagnostics only implemented through code? Windows has the Event Viewer where various types of information can be accessed. ASP.NET websites have a Trace.axd file at the root that can be viewed for trace information.
I was thinking that something similar might exist in Azure. However, based on the following url, Azure Diagnostics appears to require a custom code implementation:
https://azure.microsoft.com/en-us/documentation/articles/cloud-services-dotnet-diagnostics/#overview
Is there an easier, more built-in way to access Azure diagnostics like I described for other systems above? Or does a custom Worker role need to be created to capture and process this information?
Azure Worker Roles have extensive diagnostics that you can configure.
You get to them via the Role configuration:
Then, through the various tabs, you can configure specific types of diagnostics and have them periodically transferred to a Table Storage account for later analysis.
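For what it's worth, once that transfer is running, the Windows event log entries land in the WADWindowsEventLogsTable table of that storage account, and you can peek at them with any table client. A minimal sketch with the azure-data-tables Python package (the connection string is a placeholder):

    from itertools import islice

    from azure.data.tables import TableServiceClient

    # Placeholder: connection string of the storage account the diagnostics transfer targets
    service = TableServiceClient.from_connection_string("<storage-connection-string>")
    table = service.get_table_client("WADWindowsEventLogsTable")

    # Dump a handful of diagnostic rows to see what has been transferred
    for entity in islice(table.list_entities(), 20):
        print(dict(entity))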
You can also enable a transfer of application specific logs, which is handy and something that I use to avoid having to remote into the service to view logs:
(here, I transfer all files under the AppRoot\logs folder to a blob container named wad-processor-logs, and do so every minute.)
If you go through the tabs, you will find that you have the ability to extensively monitor quite a bit of detail, including custom Performance Counters.
Finally, you can also connect to your cloud service via the Server Explorer, and dig into the same information:
Right-click on the instance, and select View Diagnostics Data.
(a recent deployment, so not much to see)
So, yes, you can get access to Event Logs, IIS Logs and custom application logs without writing custom code. Additionally, you can implement custom code to capture additional Performance Counters and other trace logging if you wish.
"Azure diagnostics" is a bit vague since there are a variety of services in Azure, each with potentially different diagnostic experiences. The article you linked to talks about Cloud Services, but are you restricted to using Cloud Services?
Another popular option is Azure App Service, which allows you many more options for capturing logs, including streaming them, etc. Here is an article which goes into more details: https://azure.microsoft.com/en-us/documentation/articles/web-sites-enable-diagnostic-log/
We are experiencing a very serious unscheduled downtime of our Azure application today for what is now coming up to 9 hours. We reported to Azure support and the ops team is actively trying to fix the problem and I do not doubt that. We managed to get our application running on another "test" hosted service that we have and redirected our CNAME to point at the instance so our customers are happy, but the "main" hosted service is still unavailable.
My own "finger in the air" instinct is that the issue is network related within our data center (west europe), and indeed, later on in the day the service dash board has gone red for that region with a message to that effect. (Our application is showing as "Healthy" in the portal, but is unreachable via our cloudapp.net URL. Additionally threads within our application are logging sql connection exceptions into our storage account as it cannot contact the DB)
What is very strange, though, is that the "test" instance I referred to above is also in the same data centre and has no issues contacting the DB and it's external endpoint is fully available.
I would like to ask the community if there is anything that I could have done better to avoid this downtime? I obeyed the guidance with respect to having at least 2 roles instances per role, yet I still got burned. Should I move to a more reliable data centre? Should I deploy my application to multiple data centres? How would I manage the fact that my SQL-Azure DB is in the same datacentre?
Any constructive guidance would be appreciated - being a techie, I've never had a more frustrating day being able to do nothing to help fix the issue.
There was an outage in the European data center today with respect to SQL Azure. Some of our clients got hit and had to move to another data center.
If you are running mission critical applications that cannot be down, I would deploy the application into multiple regions. DNS resolution is obviously a weak link right now in Azure, but can be worked around (if you only run a website it can be done very simply using Response.Redirects or similar)
Now, there is a data synchronization service from Microsoft that will sync up multiple SQL Azure databases. Check here. This way, you can have mirror sites up in different regions and keep them in sync from a SQL Azure perspective.
Also, it would be a good idea to employ a 3rd-party monitoring service that detects problems with your deployed instances externally. AzureWatch can notify you, or even deploy new nodes if you choose, when some of the instances turn "Unresponsive".
Hope this helps
I can offer some guidance based on our experience:
Host your application in multiple data centers, complete with SQL Azure databases. You can connect each application to its data-center-specific SQL Azure database. You can also cache any external assets (images/JS/CSS) on the data-center-specific Windows Azure machine or leverage Azure Blob Storage. Note: extra costs will be incurred.
Set up one-way SQL replication between your primary SQL Azure DB and the instance in the other data center. If you want to do bi-directional replication, take a look at the MSDN site for guidance.
Leverage Azure Traffic Manager to route traffic to the data center closest to the user. It has geo-detection capabilities, which will also improve the latency of your application. So you can map http://myapp.com to the internal URL of your data center, and a user in Europe should automatically get redirected to the European data center and vice versa for the USA. Note: at the time of writing this post, there is not a way to automatically detect and fail over to a data center. Manual steps will be involved once a failure is detected, and failover is all-or-nothing (i.e., you will fail over both the Windows Azure AND SQL Azure instances). If you want micro-level failover, then I suggest putting all your config in the service config file and encrypting the values, so you can edit the connection string to connect instance X to DB Y.
You are all set now. I would create or install a local application to detect the availability of the site. A better solution would be to check the availability of application-specific components by writing a diagnostic page or web service and then polling it from a local computer.
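As a rough sketch of that last suggestion, here's a tiny Python poller against a hypothetical /health page (the URL and interval are placeholders, and the alerting action is left as a stub):

    import time
    import urllib.request

    # Hypothetical diagnostic page that checks app-specific components (e.g. the SQL Azure
    # connection) and returns HTTP 200 when everything is healthy
    HEALTH_URL = "http://myapp.cloudapp.net/health"

    def is_healthy(timeout=10):
        try:
            with urllib.request.urlopen(HEALTH_URL, timeout=timeout) as response:
                return response.status == 200
        except Exception as exc:
            print("Health check failed:", exc)
            return False

    while True:
        if not is_healthy():
            # Hook in your alerting of choice here (e-mail, SMS, pager, ...)
            print("ALERT: application appears to be down")
        time.sleep(60)  # poll every minute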
HTH
As you're deploying to Azure you don't have much control over how SQL Server is set up. MS have already set it up so that it is highly available.
Having said that, it seems that MS has been having some issues with SQL Azure over the last few days. We've been told that it only affected "a small number of users". At one point the service dashboard had 5 data centres affected by a problem. I had 3 databases in one of those data centres go down twice for about an hour each time, but one database in another affected data centre had no interruption.
If having a database connection is critical to your app, then the only way in the Azure environment to guard against problems that MS haven't prepared for (this latest technical problem, earthquakes, meteor strikes) would be to keep a copy of your SQL data in another data centre. At the moment the most practical way to do this is to use the Sync Framework. There is an ability to copy SQL Azure databases, but this only works within a data centre. With your data located elsewhere you could then point your app at the new database if the main one becomes unavailable.
While this looks good on paper though, this may not have helped you with the latest problem as it did affect multiple data centres. If you'd just been making database copies on a regular basis, that might have been enough to get you through. Or not.
(I would have posted this answer on server fault, but I couldn't find the question)
This is just about a programming/architecture issue, but you may also want to ask the question on webmasters.stackexchange.com
You need to find out the root cause before drawing any conclusions.
However, my guess is that one of two things was the problem:
The ISP connectivity differs between the test system and your production system. Either they use different ISPs, or different lines from the same ISP. When I worked in a hosting company we made sure that our IP connectivity went through at least two different ISPs who did not share fibre to our premises (and where we could, they had different physical routes to the building; the homing ability of backhoes when there's a critical piece of fibre to dig up is well proven).
Your datacentre had an issue with some shared production infrastructure. These might be edge routers, firewalls, load balancers, intrusion detection systems, traffic shapers, etc. These are typically also only installed on production systems. Defences here involve understanding the architecture and making sure the provider has a (tested!) DR plan for restoring SOME service when things go pear-shaped. The neatest hack I saw here was persuading an IPS (intrusion prevention system) that its own management servers were malicious, so it couldn't be reconfigured at all.
Just a thought - your DC doesn't host any of the Wikileaks mirrors, or PayPal/Mastercard/Amazon (who are getting DDoS'd by Wikileaks supporters at the moment)?