Can a Worker Role process call Antimalware for Azure Cloud Services programmatically? - azure

I'm trying to find a solution that I can use to perform virus scanning on files that have been uploaded to Azure blob storage. I wanted to know if it is possible to copy the file to local storage on a Worker Role instance, call Antimalware for Azure Cloud Services to perform the scan on that specific file, and then depending on whether the file is clean, process the file accordingly.
If the Worker Role cannot call the scan programmatically, is there a definitive way to check if a file has been scanned and whether it is clean or not once it has been copied to local storage (I don't know if the service does a real-time scan when new files are added, or only runs on a schedule)?

There isn't a direct API that we've found, but the anti-malware services conform to the standards used by Windows desktop virus checkers in that they implement the IAttachmentExecute COM API.
So we ended up implementing a file upload service that writes the uploaded file to a Quarantine local resource and then calls the IAttachmentExecute API. If the file is infected then, depending on the anti-malware service in use, it will either throw an exception, silently delete the file or mark it as inaccessible. So by attempting to read the first byte of the file, we can test whether the file remains accessible.
var type = Type.GetTypeFromCLSID(new Guid("4125DD96-E03A-4103-8F70-E0597D803B9C"));
var svc = (IAttachmentExecute)Activator.CreateInstance(type);
try
{
    svc.SetClientGuid(ref clientGuid);
    svc.SetLocalPath(path);
    svc.Save();
}
finally
{
    svc.ClearClientState();
}
using (var fileStream = File.OpenRead(path))
{
    fileStream.ReadByte();
}

[Guid("73DB1241-1E85-4581-8E4F-A81E1D0F8C57")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface IAttachmentExecute
{
    void SetClientGuid(ref Guid guid);
    void SetLocalPath(string pszLocalPath);
    void Save();
    void ClearClientState();
}
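The "read the first byte" check described in this answer is independent of how the scan is triggered, so it can be illustrated on its own. The following is a minimal Python sketch of that check only; the function name is_file_accessible is hypothetical, and it assumes the anti-malware service has already had a chance to delete or lock the file before the check runs.

```python
import os
import tempfile


def is_file_accessible(path):
    """Return True if the first byte of the file can still be read."""
    try:
        with open(path, "rb") as handle:
            handle.read(1)
        return True
    except OSError:
        # Covers a file that was deleted (FileNotFoundError) as well as
        # one that was locked or made inaccessible (PermissionError).
        return False


if __name__ == "__main__":
    # Create a throwaway file standing in for an uploaded blob.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"sample upload")
    print(is_file_accessible(tmp.name))  # file is present and readable
    os.remove(tmp.name)                  # simulate a silent delete
    print(is_file_accessible(tmp.name))  # file is gone
```

A caller would treat a False result as "do not process this upload", whatever the underlying reason.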

I think the best way for you to find out is simply to take an Azure VM (IaaS) and activate the Microsoft Antimalware extension. Then you can log into it and run all the necessary checks and tests against the service.
Later, you can apply all of this to the Worker Role (there is a similar PaaS extension available for that, called PaaSAntimalware).
See the next excerpt from https://msdn.microsoft.com/en-us/library/azure/dn832621.aspx:
"In PaaS, the VM agent is called GuestAgent, and is always available on Web and Worker Role VMs. (For more information, see Azure Role Architecture.) The VM agent for Role VMs can now add extensions to the cloud service VMs in the same way that it does for persistent Virtual Machines.
The biggest difference between VM Extensions on role VMs and persistent VMs is that with role VMs, extensions are added to the cloud service first and then to the deployments within that cloud service.
Use the Get-AzureServiceAvailableExtension cmdlet to list all available role VM extensions."

Related

Access denied to storage account from Azure Data Factory

My goal is to run an exe file stored in a private Azure Blob container.
The exe is simple: it creates a text file, writes the current datetime in it, and then pushes it to the private Azure Blob container.
This has to be sent from Azure Data Factory. To do this, here is my environment :
Azure Data Factory running with the simple pipeline :
https://i.stack.imgur.com/txQ9r.png
Private storage account with the following configuration :
https://i.stack.imgur.com/SJrGX.png
A linked service connected to the storage account :
https://i.stack.imgur.com/8xW5l.png
A private managed virtual network approved :
https://i.stack.imgur.com/G2DH3.png
A linked service connected to an Azure Batch :
https://i.stack.imgur.com/Yaq6C.png
A batch account linked to the right storage account
A pool running on this batch account
Two things that I need to add in context :
When I set the storage account to public, it works and I find the text file in my blob storage. So the process works well, but there is a security issue somewhere I can't find.
All the resources used (ADF, Blob storage, Batch account) have a Contributor/Owner role on the blob through a managed identity.
Here is the error I get when I set the storage account to private :
{
    "errorCategory": 0,
    "code": "BlobAccessDenied",
    "message": "Access for one of the specified Azure Blob(s) is denied",
    "details": [
        {
            "Name": "BlobSource",
            "Value": "https://XXXXXXXXXXXXXXXXX/testv2.exe?sv=2018-03-28&sr=b&sig=XXXXXXXXXXXXXXXXXX&sp=r"
        },
        {
            "Name": "FilePath",
            "Value": "D:\\batch\\tasks\\workitems\\XXXXXXXXXXX\\job-1\\XXXXXXXXXXXXXXXXXXXXXXXX\\testv2.exe"
        }
    ]
}
Thank you for your help!
Solution found through Azure community support:
Check Subnet information under Network Configuration from the Azure portal > Batch Account > Pool > Properties. Take note and write the information down.
Navigate to the storage account, and select Networking. In the Firewalls and virtual networks setting, select Enable from selected virtual networks and IP addresses for Public network access. Add the Batch pool's subnet in the firewall allowlist.
If the subnet doesn't enable the service endpoint, when you select it, a notification will be displayed as follows:
The following networks don't have service endpoints enabled for 'Microsoft.Storage'. Enabling access will take up to 15 minutes to complete. After starting this operation, it is safe to leave and return later if you don't wish to wait.
Therefore, before you add the subnet, check it in the Batch virtual network to see if the service endpoint for the storage account is enabled.
After you complete the configurations above, the Batch nodes in the pool can access the storage account successfully.

Is there a way to create a Generation 2 VM using the Azure SDK?

Azure supports UEFI through Generation2 VM.
I am able to create a Generation2 VM using the Azure web console, but I cannot find a way to specify the generation of the VM through the Azure SDK.
I have found a link in Microsoft Docs to create a managed disk using PowerCLI
https://learn.microsoft.com/en-us/azure/virtual-machines/windows/generation-2#frequently-asked-questions
I looked into the online documentation of the Azure ComputeClient#virtual_machines#create_or_update() API, but I still cannot find any way in the Python code docs to specify HyperVGenerations for the VM.
Yes. It's kind of counterintuitive but it goes like this: you need to specify the VM generation on the disk; then the VM, created off of this disk would be of that same generation.
If you already have a gen2 disk then you just pick it up and specify it when creating the VM. However, I had to create the disk from a VHD file. So when you're creating the disk, you are going to need an IWithCreate instance and then chain a call to the WithHyperVGeneration method. Like this (C#):
public async Task<IDisk> MakeDisk(string vhdPath)
{
    return await Azure.Disks.Define(name)
        .WithRegion(Region.EuropeWest)
        .WithExistingResourceGroup("my-resources")
        .WithWindowsFromVhd(vhdPath)
        .WithStorageAccount("saname")
        .WithHyperVGeneration(HyperVGeneration.V2) // <--- This is how you specify the generation
        .WithSku(DiskSkuTypes.PremiumLRS)
        .CreateAsync();
}
Then create the VM:
var osDisk = await MakeDisk("template.vhd");
var vm = await Azure.VirtualMachines.Define("template-vm")
    .WithRegion(Region.EuropeWest)
    .WithExistingResourceGroup("the-rg")
    .WithExistingPrimaryNetworkInterface("some-nic")
    .WithSpecializedOSDisk(osDisk, OperatingSystemTypes.Windows) // <-- Pay attention
    .WithSize(VirtualMachineSizeTypes.StandardB2s)
    .CreateAsync();
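Since the question asks about Python, the same idea (set the generation on the disk, not on the VM) can be sketched as the request body for a managed disk. This is only an illustration under assumptions: the helper build_gen2_disk_payload is hypothetical, and the property names (hyperVGeneration, creationData, createOption, sourceUri, storageAccountId) follow the Disks REST API as I understand it, so check them against the API version you target before relying on them.

```python
def build_gen2_disk_payload(location, vhd_uri, storage_account_id):
    """Build a disk definition that imports a VHD as a Generation 2 OS disk.

    The dictionary mirrors the shape of a managed-disk create request;
    the field names are assumptions to verify against the current API.
    """
    return {
        "location": location,
        "properties": {
            "osType": "Windows",
            # The generation is declared on the disk; a VM created from
            # this disk then inherits it.
            "hyperVGeneration": "V2",
            "creationData": {
                "createOption": "Import",
                "sourceUri": vhd_uri,
                "storageAccountId": storage_account_id,
            },
        },
    }


if __name__ == "__main__":
    payload = build_gen2_disk_payload(
        "westeurope",
        "https://saname.blob.core.windows.net/vhds/template.vhd",  # placeholder
        "<storage-account-resource-id>",                           # placeholder
    )
    print(payload["properties"]["hyperVGeneration"])
```

The VM would then be created by attaching the resulting disk as a specialized OS disk, as in the C# example above.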

Do I need an Azure Storage Account to run a WebJob?

So I'm fairly new to working with Azure and there are some things I can't quite wrap my head around. One of them being the Azure Storage Account.
My web job keeps stopping with the following error: "Unhandled Exception: System.InvalidOperationException: The account credentials for '[account_name]' are incorrect." Understanding the error is not the problem, at least that's what I think. The problem lies in understanding why I need an Azure Storage Account to overcome it.
Please read on as I try to take you through the steps taken thus far. Hopefully the real question will become clearer to you.
In my efforts to deploy a WebJob on Azure we have created the following resources so far:
App Service Plan
App Service
SQL server
SQL database
I'm using the following code snippet to prevent my web job from exiting:
JobHostConfiguration config = new JobHostConfiguration();
config.DashboardConnectionString = null;
new JobHost(config).RunAndBlock();
To my understanding from other sources the Dashboard connection string is optional but the AzureWebJobsStorage connection string is required.
I tried setting the required connection string in portal using the configuration found here.
DefaultEndpointsProtocol=[http|https];AccountName=myAccountName;AccountKey=myAccountKey
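A connection string in the format above can be sanity-checked before it is pasted into the portal. The following Python sketch is only an illustration; the helper names parse_connection_string and missing_settings are hypothetical, and it checks nothing more than that the AccountName and AccountKey fields are present. It cannot tell whether the credentials are actually valid.

```python
def parse_connection_string(value):
    """Split 'Key=Value;Key=Value' pairs into a dictionary."""
    settings = {}
    for part in value.split(";"):
        if not part.strip():
            continue  # tolerate a trailing semicolon
        # Split on the first '=' only: account keys are base64 and
        # usually end in '=' padding characters.
        key, separator, setting = part.partition("=")
        if not separator:
            raise ValueError("Malformed segment: {0!r}".format(part))
        settings[key.strip()] = setting.strip()
    return settings


def missing_settings(value, required=("AccountName", "AccountKey")):
    """Return the required settings that are absent or empty."""
    settings = parse_connection_string(value)
    return [name for name in required if not settings.get(name)]


if __name__ == "__main__":
    sample = "DefaultEndpointsProtocol=https;AccountName=myAccountName;AccountKey=bXlLZXk="
    print(parse_connection_string(sample))
    print(missing_settings("DefaultEndpointsProtocol=https;AccountName=myAccountName"))
```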
Looking further I found this answer that clearly states where I would get the values needed, namely an/my missing Azure Storage Account.
So now for the actual question: Why do I need an Azure Storage Account when I seemingly have all the resources I need in place for the WebJob to run? What does it do? Is it a billing thing? I thought we had that defined in the App Service Plan. I've tried reading up on Azure Storage Accounts over here but I need a bit more help understanding how it relates to everything.
From the docs:
An Azure storage account provides resources for storing queue and blob data in the cloud.
It's also used by the WebJobs SDK to store logging data for the dashboard.
Refer to the getting started guide and documentation for further information
The answer to your question is "No": it is not mandatory to use Azure Storage when you are trying to set up and run an Azure web job.
If you are using JobHost or JobHostConfiguration, then there is indeed a dependency on a storage account.
A sample code snippet that avoids them is given below.
class Program
{
    static void Main()
    {
        Functions.ExecuteTask();
    }
}

public class Functions
{
    [NoAutomaticTrigger]
    public static void ExecuteTask()
    {
        // Execute your task here
    }
}
The answer is no, you don't. You can have a WebJob run without being tied to an Azure Storage Account. Like Murray mentioned, your WebJob dashboard does use a storage account to log data but that's completely independent.

Copying storage data from one Azure account to another

I would like to copy a very large storage container from one Azure storage account into another (which also happens to be in another subscription).
I would like an opinion on the following options:
1. Write a tool that would connect to both storage accounts and copy blobs one at a time using CloudBlob's DownloadToStream() and UploadFromStream(). This seems to be the worst option because it will incur costs when transferring the data and also be quite slow because data will have to come down to the machine running the tool and then get re-uploaded back to Azure.
2. Write a worker role to do the same - this should theoretically be faster and not incur any cost. However, this is more work.
3. Upload the tool to a running instance bypassing the worker role deployment and pray the tool finishes before the instance gets recycled/reset.
4. Use an existing tool - have not found anything interesting.
Any suggestions on the approach?
Update: I just found out that this functionality has finally been introduced (REST APIs only for now) for all storage accounts created on July 7th, 2012 or later:
http://msdn.microsoft.com/en-us/library/windowsazure/dd894037.aspx
You can also use AzCopy that is part of the Azure SDK.
Just click the download button for Windows Azure SDK and choose WindowsAzureStorageTools.msi from the list to download AzCopy.
After installing, you'll find AzCopy.exe here: %PROGRAMFILES(X86)%\Microsoft SDKs\Windows Azure\AzCopy
You can get more information on using AzCopy in this blog post: AzCopy – Using Cross Account Copy Blob
As well, you could remote desktop into an instance and use this utility for the transfer.
Update:
You can also copy blob data between storage accounts using Microsoft Azure Storage Explorer. Reference link
Since there's no direct way to migrate data from one storage account to another, you'd need to do something like what you were thinking. If this is within the same data center, option #2 is the best bet, and will be the fastest (especially if you use an XL instance, giving you more network bandwidth).
As far as complexity, it's no more difficult to create this code in a worker role than it would be with a local application. Just run this code from your worker role's Run() method.
To make things more robust, you could list the blobs in your containers, then place specific file-move request messages into an Azure queue (and optimize by putting more than one object name per message). Then use a worker role thread to read from the queue and process objects. Even if your role is recycled, at worst you'd reprocess one message. For performance increase, you could then scale to multiple worker role instances. Once the transfer is complete, you simply tear down the deployment.
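The queue-based approach described above (list the blobs, pack several names into each message, let workers drain the queue) can be sketched with Python's standard library. This is an in-memory illustration only: queue.Queue stands in for an Azure queue and does not model visibility timeouts or redelivery after a role recycle, and the names enqueue_copy_requests, drain and copy_blob are hypothetical.

```python
import queue


def enqueue_copy_requests(blob_names, work_queue, batch_size=3):
    """Put blob names on the queue, several names per message."""
    for start in range(0, len(blob_names), batch_size):
        work_queue.put(blob_names[start:start + batch_size])


def drain(work_queue, copy_blob):
    """Process messages until the queue is empty; return blobs copied."""
    processed = 0
    while True:
        try:
            batch = work_queue.get_nowait()
        except queue.Empty:
            return processed
        for name in batch:
            copy_blob(name)  # the real copy call would go here
            processed += 1
        # Acknowledge the message only after every blob in it is done,
        # which is the analogue of deleting the queue message last.
        work_queue.task_done()


if __name__ == "__main__":
    names = ["blob-{0}".format(i) for i in range(7)]
    work = queue.Queue()
    enqueue_copy_requests(names, work)
    copied = []
    print(drain(work, copied.append), "blobs processed")
```

With a real queue, running several workers against the same queue gives the scale-out described in the answer, at the cost of occasionally reprocessing a message.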
UPDATE - On June 12, 2012, the Windows Azure Storage API was updated, and now allows cross-account blob copy. See this blog post for all the details.
Here is some code that leverages the .NET SDK for Azure available at http://www.windowsazure.com/en-us/develop/net
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Microsoft.WindowsAzure.StorageClient;
using System.IO;
using System.Net;

namespace benjguinAzureStorageTool
{
    class Program
    {
        private static Context context = new Context();

        static void Main(string[] args)
        {
            try
            {
                string usage = string.Format("Possible Usages:\n"
                    + "benjguinAzureStorageTool CopyContainer account1SourceContainer account2SourceContainer account1Name account1Key account2Name account2Key\n"
                    );
                if (args.Length < 1)
                    throw new ApplicationException(usage);
                int p = 1;
                switch (args[0])
                {
                    case "CopyContainer":
                        if (args.Length != 7) throw new ApplicationException(usage);
                        context.Storage1Container = args[p++];
                        context.Storage2Container = args[p++];
                        context.Storage1Name = args[p++];
                        context.Storage1Key = args[p++];
                        context.Storage2Name = args[p++];
                        context.Storage2Key = args[p++];
                        CopyContainer();
                        break;
                    default:
                        throw new ApplicationException(usage);
                }
                Console.BackgroundColor = ConsoleColor.Black;
                Console.ForegroundColor = ConsoleColor.Yellow;
                Console.WriteLine("OK");
                Console.ResetColor();
            }
            catch (Exception ex)
            {
                Console.WriteLine();
                Console.BackgroundColor = ConsoleColor.Black;
                Console.ForegroundColor = ConsoleColor.Yellow;
                Console.WriteLine("Exception: {0}", ex.Message);
                Console.ResetColor();
                Console.WriteLine("Details: {0}", ex);
            }
        }

        private static void CopyContainer()
        {
            CloudBlobContainer container1Reference = context.CloudBlobClient1.GetContainerReference(context.Storage1Container);
            CloudBlobContainer container2Reference = context.CloudBlobClient2.GetContainerReference(context.Storage2Container);
            if (container2Reference.CreateIfNotExist())
            {
                Console.WriteLine("Created destination container {0}. Permissions will also be copied.", context.Storage2Container);
                container2Reference.SetPermissions(container1Reference.GetPermissions());
            }
            else
            {
                Console.WriteLine("destination container {0} already exists. Permissions won't be changed.", context.Storage2Container);
            }
            foreach (var b in container1Reference.ListBlobs(
                new BlobRequestOptions(context.DefaultBlobRequestOptions)
                { UseFlatBlobListing = true, BlobListingDetails = BlobListingDetails.All }))
            {
                var sourceBlobReference = context.CloudBlobClient1.GetBlobReference(b.Uri.AbsoluteUri);
                var targetBlobReference = container2Reference.GetBlobReference(sourceBlobReference.Name);
                Console.WriteLine("Copying {0}\n to\n{1}",
                    sourceBlobReference.Uri.AbsoluteUri,
                    targetBlobReference.Uri.AbsoluteUri);
                using (Stream targetStream = targetBlobReference.OpenWrite(context.DefaultBlobRequestOptions))
                {
                    sourceBlobReference.DownloadToStream(targetStream, context.DefaultBlobRequestOptions);
                }
            }
        }
    }
}
It's very simple with AzCopy. Download the latest version from https://azure.microsoft.com/en-us/documentation/articles/storage-use-azcopy/
and in azcopy type:
Copy a blob within a storage account:
AzCopy /Source:https://myaccount.blob.core.windows.net/mycontainer1 /Dest:https://myaccount.blob.core.windows.net/mycontainer2 /SourceKey:key /DestKey:key /Pattern:abc.txt
Copy a blob across storage accounts:
AzCopy /Source:https://sourceaccount.blob.core.windows.net/mycontainer1 /Dest:https://destaccount.blob.core.windows.net/mycontainer2 /SourceKey:key1 /DestKey:key2 /Pattern:abc.txt
Copy a blob from the secondary region
If your storage account has read-access geo-redundant storage enabled, then you can copy data from the secondary region.
Copy a blob to the primary account from the secondary:
AzCopy /Source:https://myaccount1-secondary.blob.core.windows.net/mynewcontainer1 /Dest:https://myaccount2.blob.core.windows.net/mynewcontainer2 /SourceKey:key1 /DestKey:key2 /Pattern:abc.txt
I'm a Microsoft Technical Evangelist and I have developed a sample and free tool (no support/no guarantee) to help in these scenarios.
The binaries and source-code are available here: https://blobtransferutility.codeplex.com/
The Blob Transfer Utility is a GUI tool to upload and download thousands of small/large files to/from Windows Azure Blob Storage.
Features:
Create batches to upload/download
Set the Content-Type
Transfer files in parallel
Split large files in smaller parts that are transferred in parallel
The 1st and 3rd features are the answer to your problem.
You can learn from the sample code how I did it, or you can simply run the tool and do what you need to do.
1. Write your tool as a simple .NET command-line or WinForms application.
2. Create and deploy a dummy web/worker role with RDP enabled.
3. Log in to the machine via RDP.
4. Copy your tool over the RDP connection.
5. Run the tool on the remote machine.
6. Delete the deployed role.
Like you, I am not aware of any off-the-shelf tool that supports copying between accounts.
You may like to consider just installing Cloud Storage Studio into the role though and dumping to disk then re-uploading. http://cerebrata.com/Products/CloudStorageStudiov2/Details.aspx?t1=0&t2=7
You could use 'Azure Storage Explorer' (free) or some other such tool. These tools provide a way to download and upload content. You will need to manually create containers and tables, and of course this will incur a transfer cost, but if you are short on time and your contents are of reasonable size then this is a viable option.
I recommend using azcopy: you can copy an entire storage account, a container, a directory or a single blob. Here is an example of cloning an entire storage account:
azcopy copy 'https://{SOURCE_ACCOUNT}.blob.core.windows.net{SOURCE_SAS_TOKEN}' 'https://{DESTINATION_ACCOUNT}.blob.core.windows.net{DESTINATION_SAS_TOKEN}' --recursive
You can get a SAS token from the Azure Portal. Navigate to the overview of each storage account (source and destination), then click "Shared access signature" in the side navigation and generate your own.
More examples here
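When the account-to-account copy needs to be scripted, the command line above can be assembled programmatically. The Python sketch below only builds the argument list for the azcopy copy command shown above and does not run it; the helper names account_url and build_azcopy_command are hypothetical, and it assumes each SAS token is the query string generated by the portal, with or without the leading "?".

```python
def account_url(account, sas_token):
    """Return the blob endpoint of an account with its SAS token appended."""
    # Normalise the token so it always starts with '?'.
    token = sas_token if sas_token.startswith("?") else "?" + sas_token
    return "https://{0}.blob.core.windows.net{1}".format(account, token)


def build_azcopy_command(src_account, src_sas, dst_account, dst_sas):
    """Build the argument list for a recursive account-to-account copy."""
    return [
        "azcopy",
        "copy",
        account_url(src_account, src_sas),
        account_url(dst_account, dst_sas),
        "--recursive",
    ]


if __name__ == "__main__":
    # Placeholder account names and tokens, not real credentials.
    command = build_azcopy_command("sourceaccount", "?sv=...", "destaccount", "sv=...")
    print(command)
```

Passing the list to subprocess.run(command, check=True) would execute it without shell quoting problems, provided azcopy is on the PATH.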
I had to do something similar to move 600 GB of content from a local file system to Azure Storage. After a couple of iterations of code, I ended up taking 'Azure Storage Explorer' and extending it with the ability to select folders instead of just files, recursively drill into the selected folders, and load a list of source/destination copy statements into an Azure Queue. I then changed the upload section of 'Azure Storage Explorer' to pull from the queue and execute the copy operation.
Then I launched about 10 instances of the 'Azure Storage Explorer' tool and had each pull from the queue and execute the copy operation. I was able to move the 600 GB of items in just over 2 days. I also added logic that uses the modified timestamps on files to skip files that have already been copied and to avoid queueing files that are already in sync. Now I can run "updates" or syncs within an hour or two across the whole library of content.
Try CloudBerry Explorer. It copies blobs within and between subscriptions.
For copying between subscriptions, change the storage account container's access from Private to Public Blob.
The copying process took a few hours to complete. If you choose to reboot your machine, the process will continue. Check the status by refreshing the target storage account container in the Azure management UI and checking the timestamp; the value gets updated until the copy process completes.

Azure Worker Role Control Start Stop and Status

I'm doing my first project, and a large one, developing an Azure application with an integration component.
Currently most of the integration is done using SSIS packages, and I would like to move it to a Worker Role in Azure.
Could someone please help me with the following questions regarding Worker Roles?
Is there way to start or stop the Worker role (just like SSIS or Windows Schedulers) via GUI? If not how to achieve this?
How do I know my worker role has been running or not running (including why it's not running ie. logs)
How do I spin multiple worker role based on time (i.e. (9:00AM to 11:00AM spin 4 roles and scale down on quiet period)
Does the following code create any poison messages or deadlocks if there are 10,000 messages to process and a new thread (PhotoProcessing.Run) is started every 5 seconds?
while (true)
{
    var thread = new Thread(PhotoProcessing.Run);
    thread.Start();
    Thread.Sleep(5000);
    Trace.WriteLine("Working", "Information");
}

public class PhotoProcessing
{
    public static void Run()
    {
        // Read from queue
        CloudQueueMessage msg = Storage.Queue.GetNextMessage();
        while (msg != null)
        {
            string[] message = msg.AsString.Split('$');
            if (message.Length == 2)
            {
                AddWatermark(message[0], message[1]);
            }
            // Message has been read so remove it
            Storage.Queue.DeleteMessage(msg);
            // Get next message if any
            msg = Storage.Queue.GetNextMessage();
        }
    }
}
Is there way to start or stop the Worker role (just like SSIS or Windows Schedulers) via GUI? If not how to achieve this?
There are actually many ways to achieve this. You can use the Windows Azure Portal to do so, or you could use 3rd party tools (like our Cloud Storage Studio), or you could write your own application using the Windows Azure Service Management API (http://msdn.microsoft.com/en-us/library/ee460799.aspx)
How do I know my worker role has been running or not running (including why it's not running ie. logs)
Again you could use one of the GUI based tools to see the status of your roles. As far as why the roles are not running, you would need to enable Windows Azure Diagnostics in your worker role (http://msdn.microsoft.com/en-us/library/gg433048.aspx)
How do I spin multiple worker role based on time (i.e. (9:00AM to 11:00AM spin 4 roles and scale down on quiet period)
You can write your own application using Windows Azure Service Management API to do so or you could make use of 3rd party tools like AzureWatch from Paraleap or Azure Management Cmdlets (both from Microsoft and our company). While the cmdlets will get the job done, I believe Azure Watch is much more sophisticated solution. We wrote a blog post for autoscaling some days back which you can find here: http://www.cerebrata.com/Blog/post/Scale-your-Windows-Azure-instances-with-Azure-Management-Cmdlets.aspx.
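Whichever tool applies the change, a time-based rule like the one in the question (4 instances from 9:00 AM to 11:00 AM, fewer otherwise) comes down to a small decision function. The Python sketch below shows that decision only; the name desired_instance_count is hypothetical, the quiet-period count of 1 is an assumption the question does not specify, and the call that actually changes the instance count through the Service Management API or the cmdlets is not shown.

```python
from datetime import time


def desired_instance_count(now, busy_start=time(9, 0), busy_end=time(11, 0),
                           busy_count=4, quiet_count=1):
    """Return how many role instances should run at the given time of day.

    The busy window is half-open: busy_start is included and busy_end
    is excluded, so scale-down starts exactly at busy_end.
    """
    if busy_start <= now < busy_end:
        return busy_count
    return quiet_count


if __name__ == "__main__":
    for moment in (time(8, 59), time(9, 30), time(11, 0)):
        print(moment, desired_instance_count(moment))
```

A scheduler would evaluate this periodically and request a scale change only when the result differs from the current instance count.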

Resources