Azure Storage CloudBlob.Properties are not initialized when using GetBlobReference()

I'm trying to get some information about an Azure blob (its last modified UTC date and time). This information is stored in the CloudBlob.Properties.LastModifiedUtc property.
If I use the method GetBlobReference() or GetBlockBlobReference(), the Properties of the blob are not initialized (LastModifiedUtc is DateTime.MinValue). If I use ListBlobs(), the Properties are initialized correctly (LastModifiedUtc has the correct value).
Am I doing something wrong when using the GetBlobReference function? Is there a way to get a CloudBlob instance for just one specific blob? I know I can misuse ListBlobs() and filter for the blob I'm interested in, or use ListBlobsWithPrefix() from the CloudBlobClient class, but I would expect to get all the metadata when I ask for a specific blob reference.
Code showing how I'm working with Azure blobs:
string storageAccountName = "test";
string storageAccountKey = @"testkey";
string blobUrl = "https://test.blob.core.windows.net";
string containerName = "testcontainer";
string blobName = "testbontainer";
var credentials = new StorageCredentialsAccountAndKey(storageAccountName, storageAccountKey);
var cloudBlobClient = new CloudBlobClient(blobUrl, credentials);
var containerReference = cloudBlobClient.GetContainerReference(string.Format("{0}/{1}", blobUrl, containerName));
// OK - Result is of type CloudBlockBlob, cloudBlob_ListBlobs.Properties.LastModifiedUtc > DateTime.MinValue
var cloudBlob_ListBlobs = containerReference.ListBlobs().Where(i => i is CloudBlob && ((CloudBlob)i).Name == blobName).FirstOrDefault() as CloudBlob;
// WRONG - Result is of type CloudBlob, cloudBlob_GetBlobReference.Properties.LastModifiedUtc == DateTime.MinValue
var cloudBlob_GetBlobReference = containerReference.GetBlobReference(string.Format("{0}/{1}/{2}", blobUrl, containerName, blobName));
// WRONG - Result is of type CloudBlockBlob, cloudBlob_GetBlockBlobReference.Properties.LastModifiedUtc == DateTime.MinValue
var cloudBlob_GetBlockBlobReference = containerReference.GetBlockBlobReference(string.Format("{0}/{1}/{2}", blobUrl, containerName, blobName));

I believe you have to make a separate call to fetch the attributes/metadata. Once you have the blob reference, issue the following line to retrieve the attributes.
cloudBlob_GetBlobReference.FetchAttributes();

This relates to the Java SDK: there, CloudBlockBlob derives from CloudBlob, so you may need the equivalent CloudBlob.downloadAttributes() call.

Related

How to Read a file from Azure Data Lake Storage with file Url?

Is there a way to read files from Azure Data Lake? I have the HTTP URL of the file and want to read it directly. How can I achieve this? I don't see a way to do it via the SDK.
Thanks for your help.
Regards
Did you check docs?
public async Task ListFilesInDirectory(DataLakeFileSystemClient fileSystemClient)
{
IAsyncEnumerator<PathItem> enumerator =
fileSystemClient.GetPathsAsync("my-directory").GetAsyncEnumerator();
await enumerator.MoveNextAsync();
PathItem item = enumerator.Current;
while (item != null)
{
Console.WriteLine(item.Name);
if (!await enumerator.MoveNextAsync())
{
break;
}
item = enumerator.Current;
}
}
You can also use the ADLS Gen2 REST API.
For example, you can write code like the following with SAS token authentication (you can also use shared key authentication):
string sasToken = "?sv=2018-03-28&ss=b&srt=sco&sp=rwdl&st=2019-04-15T08%3A07%3A49Z&se=2019-04-16T08%3A07%3A49Z&sig=xxxx";
string url = "https://xxxx.dfs.core.windows.net/myfilesys1/app.JPG" + sasToken;
var req = (HttpWebRequest)WebRequest.CreateDefault(new Uri(url));
//you can specify the Method as per your operation as per the api doc
req.Method = "HEAD";
var res = (HttpWebResponse)req.GetResponse();
Because Blob APIs and Data Lake Storage Gen2 APIs can operate on the same data, you can use the Azure Blob Storage SDK directly to read a file from ADLS Gen2.
First, install this nuget package: Microsoft.Azure.Storage.Blob, version 11.1.6.
Note that, in this case, you should use this kind of url "https://xxx.blob.core.windows.net/mycontainer/myfolder/test.txt" instead of that kind of url "https://xxx.dfs.core.windows.net/mycontainer/myfolder/test.txt".
Here is the sample code which is used to read a .txt file in ADLS Gen2:
var blob_url = "https://xxx.blob.core.windows.net/mycontainer/myfolder/test.txt";
//var blob_url = "https://xxx.dfs.core.windows.net/mycontainer/myfolder/test.txt";
var username = "xxxx";
var password = "xxxx";
StorageCredentials credentials = new StorageCredentials(username, password);
var blob = new CloudBlockBlob(new Uri(blob_url),credentials);
var mystream = blob.OpenRead();
using (StreamReader reader = new StreamReader(mystream))
{
Console.WriteLine("Read file content: " + reader.ReadToEnd());
}
//you can also use other method like below
//string text = blob.DownloadText();
//Console.WriteLine($"the text is: {text}");

Read the blob content on Azure Storage

I'm using the Microsoft.Azure.Storage.Blob NuGet package to get the list of blobs in a container and then read their content.
With the ListBlobs() method I see all the blobs.
Every blob item has a URI, but I cannot see the blob name that I need for GetBlobReferenceFromServer().
For this reason the blob name is a constant in following sample code.
What is the right way? Do I have to split and parse the URI to find the blob name?
Do I have to use another method?
Microsoft.Azure.Storage.Blob.CloudBlobContainer container =
new Microsoft.Azure.Storage.Blob.CloudBlobContainer(new Uri("https://myaccount.blob.core.windows.net/containername"),
new Microsoft.Azure.Storage.Auth.StorageCredentials("myaccount", "**********=="));
IEnumerable<Microsoft.Azure.Storage.Blob.IListBlobItem> blobs = container.ListBlobs();
foreach (var blobItem in blobs)
{
//string blobUri = blobItem.Uri.ToString();
Microsoft.Azure.Storage.Blob.ICloudBlob blockBlob = container.GetBlobReferenceFromServer("blobname");
MemoryStream downloadStream = new MemoryStream();
blockBlob.DownloadToStream(downloadStream);
string blobContent = Encoding.UTF8.GetString(downloadStream.ToArray());
}
"With the ListBlobs() method I see all the blobs. Every blob item has a URI but I cannot see the blob name that I need for the GetBlobReferenceFromServer()."
The reason for this is that the ListBlobs method returns an enumerable of type IListBlobItem, which does not have a Name property. To get the name of the blob, you can cast the item to either CloudBlob or CloudBlockBlob, both of which implement this interface; you can then pass that name to the GetBlobReferenceFromServer method.
BTW, once you have listed the blobs you don't really need to call GetBlobReferenceFromServer, as you already have all the information about each blob as part of the listing. GetBlobReferenceFromServer makes another request to storage to fetch the same set of properties that the listing already returned.
So your code can simply be:
foreach (var blobItem in blobs)
{
var blockBlob = (CloudBlockBlob) blobItem;
MemoryStream downloadStream = new MemoryStream();
blockBlob.DownloadToStream(downloadStream);
string blobContent = Encoding.UTF8.GetString(downloadStream.ToArray());
}
Or, if you'd rather not go down the casting route, you can simply create an instance of CloudBlockBlob using the URI you got as part of the listing.
Something like:
foreach (var blobItem in blobs)
{
var blockBlob = new CloudBlockBlob(blobItem.Uri, container.ServiceClient);
MemoryStream downloadStream = new MemoryStream();
blockBlob.DownloadToStream(downloadStream);
string blobContent = Encoding.UTF8.GetString(downloadStream.ToArray());
}
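For completeness, if only the URI is at hand and a name is still needed, it can be derived from the URI path. The following is a minimal sketch, not part of the storage library: the class and method names (BlobUriHelper, GetBlobName) are hypothetical, and it assumes the usual https://<account>.blob.core.windows.net/<container>/<blob name> layout (path-style endpoints such as the local emulator would need an extra segment stripped).

```csharp
using System;

public static class BlobUriHelper
{
    // Hypothetical helper: returns the blob name (including any virtual folder
    // prefix) from a blob URI of the assumed form
    // https://<account>.blob.core.windows.net/<container>/<blob name>.
    public static string GetBlobName(Uri blobUri, string containerName)
    {
        // AbsolutePath is percent-encoded; unescape it to get the name as stored.
        string path = Uri.UnescapeDataString(blobUri.AbsolutePath);
        string prefix = "/" + containerName + "/";

        if (!path.StartsWith(prefix, StringComparison.Ordinal))
        {
            throw new ArgumentException(
                "URI does not point into the given container.", nameof(blobUri));
        }

        // Everything after "/<container>/" is the blob name.
        return path.Substring(prefix.Length);
    }
}
```

Calling BlobUriHelper.GetBlobName(blobItem.Uri, container.Name) inside the loop would yield a name suitable for GetBlockBlobReference, although casting the listed item remains the simpler route.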

Azure Blob SAS with IP Range Restriction

I'm trying to create SAS URIs / Tokens to allow download of my Azure Storage Blobs.
I'd like to do this on a blob-level, in order to not inadvertently give access to an unintended resource.
The current code I use to do this is:
public static string GetBlobSasUri(string containerName, string reference)
{
// Create the CloudBlobContainer object
CloudBlobContainer container = blobClient.GetContainerReference(containerName);
container.CreateIfNotExists();
// Get a reference to a blob within the container.
CloudBlockBlob blob = container.GetBlockBlobReference(reference);
// Set the expiry time and permissions for the blob.
// In this case, the start time is specified as a few minutes in the past, to mitigate clock skew.
// The shared access signature will be valid immediately.
SharedAccessBlobPolicy sasConstraints = new SharedAccessBlobPolicy();
sasConstraints.SharedAccessStartTime = DateTimeOffset.UtcNow.AddMinutes(-5);
sasConstraints.SharedAccessExpiryTime = DateTimeOffset.UtcNow.AddMonths(1);
sasConstraints.Permissions = SharedAccessBlobPermissions.Read;
// Generate the shared access signature on the blob, setting the constraints directly on the signature.
string sasBlobToken = blob.GetSharedAccessSignature(sasConstraints);
// Return the URI string for the blob, including the SAS token.
return blob.Uri + sasBlobToken;
}
This is largely based on the example in Documentation here:
Generate a shared access signature URI for a blob
This works. However, I see in other SAS documentation that it is possible to restrict to a certain IP range as well:
Service SAS Uri Example
My understanding of SAS tokens is that the signature signs all parameters, so I don't think this is as easy as just appending my IP range to the SAS URI returned from the code I pasted above, since the signature would then not match.
However, the SharedAccessBlobPolicy only has three fields, which are the start/end times of the access, as well as the permissions. I don't see anything about IP ranges.
Is it possible to set these permitted ranges when generating SAS URIs at the blob level, not for a full account?
Please use the code below:
public static string GetBlobSasUri(string ipAddressFrom, string ipAddressTo)
{
CloudStorageAccount storageAccount = new CloudStorageAccount(new StorageCredentials("account_name", "account_key"), true);
CloudBlobClient cloudBlobClient = storageAccount.CreateCloudBlobClient();
var cloudBlobContainer = cloudBlobClient.GetContainerReference("test-1");
cloudBlobContainer.CreateIfNotExists();
CloudBlockBlob blob = cloudBlobContainer.GetBlockBlobReference("a.txt");
var ipAddressRange = new IPAddressOrRange(ipAddressFrom, ipAddressTo);
var sasBlobToken = blob.GetSharedAccessSignature(new SharedAccessBlobPolicy()
{
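// Read matches the download scenario in the question; List applies to containers, not to a single blob.
Permissions = SharedAccessBlobPermissions.Read,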
SharedAccessExpiryTime = new DateTimeOffset(DateTime.UtcNow.AddHours(1))
}, null, null,null, ipAddressRange);
return blob.Uri + sasBlobToken;
}

Unsure of correct method to list blobs with a SharedAccess Token

I upgraded WindowsAzure.Storage to 4.0.3
I want to output to a webpage a list of blobs in a folder, where clicking on the link downloads the blob. As the blobs are in a secure container each URI needs a shared access signature.
I used to have:
var dir = Container.GetDirectoryReference(folderName);
List<IListBlobItem> blobs = dir.ListBlobs().ToList();
var blobsInFolder = new List<Uri>();
foreach (IListBlobItem listBlobItem in blobs)
{
var blob = Container.GetBlockBlobReference(listBlobItem.Uri.ToString());
string sasBlobToken = blob.GetSharedAccessSignature(_sasConstraints);
blobsInFolder.Add(new Uri(blob.Uri + sasBlobToken));
}
return blobsInFolder;
This no longer works as GetBlockBlobReference no longer accepts a URI but a filename. IListBlobItem does not include the filename.
I could start chopping up the Uri to get the folder and filename:
var blob = Container.GetBlockBlobReference(folderName + "/" + Path.GetFileName(listBlobItem.Uri.AbsolutePath));
...but I feel that's going the wrong way (that I shouldn't have to do this?). Can someone point me in the right way please?
Try casting IListBlobItem to CloudBlockBlob:
foreach (IListBlobItem listBlobItem in blobs)
{
var blob = (CloudBlockBlob) listBlobItem;
string sasBlobToken = blob.GetSharedAccessSignature(_sasConstraints);
blobsInFolder.Add(new Uri(blob.Uri + sasBlobToken));
}
return blobsInFolder;

Create Shared Access Token with Microsoft.WindowsAzure.Storage returns 403

I have a fairly simple method that uses the NEW Storage API to create a SAS and copy a blob from one container to another.
I am trying to use this to copy a blob between storage accounts. I have two storage accounts with the exact same containers, and I am trying to copy a blob from one storage account's container to the other storage account's container.
I don't know if the SDK is built for that, but it seems like it would be a common scenario.
Some additional information:
I create the token on the Destination Container.
Does that token need to be created on both the source and destination? Does it take time to register the token? Do I need to create it for each request, or only once per token "lifetime"?
I should mention that a 403 is the Forbidden HTTP status code.
private static string CreateSharedAccessToken(CloudBlobClient blobClient, string containerName)
{
var container = blobClient.GetContainerReference(containerName);
var blobPermissions = new BlobContainerPermissions();
// The shared access policy provides read/write access to the container for 1 hour:
blobPermissions.SharedAccessPolicies.Add("SolutionPolicy", new SharedAccessBlobPolicy()
{
// To ensure SAS is valid immediately we don’t set start time
// so we can avoid failures caused by small clock differences:
SharedAccessExpiryTime = DateTime.UtcNow.AddHours(1),
Permissions = SharedAccessBlobPermissions.Write |
SharedAccessBlobPermissions.Read
});
blobPermissions.PublicAccess = BlobContainerPublicAccessType.Blob;
container.SetPermissions(blobPermissions);
return container.GetSharedAccessSignature(new SharedAccessBlobPolicy(), "SolutionPolicy");
}
Down the line I use this token to call a copy operation, which returns a 403:
var uri = new Uri(srcBlob.Uri.AbsoluteUri + blobToken);
destBlob.StartCopyFromBlob(uri);
My version of Azure.Storage is 2.1.0.2.
Here is the full copy method in case that helps:
private static void CopyBlobs(
CloudBlobContainer srcContainer, string blobToken,
CloudBlobContainer destContainer)
{
var srcBlobList
= srcContainer.ListBlobs(string.Empty, true, BlobListingDetails.All); // set to none in prod (4perf)
//// get the SAS token to use for all blobs
//string token = srcContainer.GetSharedAccessSignature(
// new SharedAccessBlobPolicy(), "SolutionPolicy");
bool pendingCopy = true;
foreach (var src in srcBlobList)
{
var srcBlob = src as ICloudBlob;
// Determine BlobType:
ICloudBlob destBlob;
if (srcBlob.Properties.BlobType == BlobType.BlockBlob)
{
destBlob = destContainer.GetBlockBlobReference(srcBlob.Name);
}
else
{
destBlob = destContainer.GetPageBlobReference(srcBlob.Name);
}
// Determine Copy State:
if (destBlob.CopyState != null)
{
switch (destBlob.CopyState.Status)
{
case CopyStatus.Failed:
log.Info(destBlob.CopyState);
break;
case CopyStatus.Aborted:
log.Info(destBlob.CopyState);
pendingCopy = true;
destBlob.StartCopyFromBlob(destBlob.CopyState.Source);
return;
case CopyStatus.Pending:
log.Info(destBlob.CopyState);
pendingCopy = true;
break;
}
}
// copy using only Policy ID:
var uri = new Uri(srcBlob.Uri.AbsoluteUri + blobToken);
destBlob.StartCopyFromBlob(uri);
//// copy using src blob as SAS
//var source = new Uri(srcBlob.Uri.AbsoluteUri + token);
//destBlob.StartCopyFromBlob(source);
}
}
And finally the account and client (vetted) code:
var credentials = new StorageCredentials("BAR", "FOO");
var account = new CloudStorageAccount(credentials, true);
var blobClient = account.CreateCloudBlobClient();
var sasToken = CreateSharedAccessToken(blobClient, "content");
When I use a REST client this seems to work... any ideas?
Consider also this problem:
var uri = new Uri(srcBlob.Uri.AbsoluteUri + blobToken);
You are probably calling the ToString method of Uri somewhere, which produces a "human-readable" version of the URL. If the blobToken contains special characters such as "+", this results in a malformed token, and the storage server will refuse access.
Use this instead:
String uri = srcBlob.Uri.AbsoluteUri + blobToken;
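The difference between the two string forms can be inspected with the BCL alone. This is a minimal sketch under stated assumptions: the class and method names (SasUriForms, AppendSas, Forms) are hypothetical, the account URL and token are made-up placeholders, and the token is assumed to start with '?'.

```csharp
using System;

public static class SasUriForms
{
    // Hypothetical helper: appends a SAS token (assumed to start with '?')
    // to the percent-encoded form of the blob URI.
    public static string AppendSas(Uri blobUri, string sasToken)
    {
        return blobUri.AbsoluteUri + sasToken;
    }

    // Returns both string forms of a URL so they can be compared side by side:
    // index 0 is AbsoluteUri (escaped), index 1 is ToString() (display form).
    public static string[] Forms(string url)
    {
        var uri = new Uri(url);
        return new[] { uri.AbsoluteUri, uri.ToString() };
    }
}
```

Printing both entries of SasUriForms.Forms(...) for a URL whose sig parameter contains escaped characters such as %2B shows whether the display form differs from the escaped one on a given runtime; the escaped AbsoluteUri form is the one to send to the service.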
Shared Access Tokens are not required for this task. I ended up with two accounts and it works fine:
var accountSrc = new CloudStorageAccount(credsSrc, true);
var accountDest = new CloudStorageAccount(credsSrc, true);
var blobClientSrc = accountSrc.CreateCloudBlobClient();
var blobClientDest = accountDest.CreateCloudBlobClient();
// Set permissions on the container.
var permissions = new BlobContainerPermissions {PublicAccess = BlobContainerPublicAccessType.Blob};
srcContainer.SetPermissions(permissions);
destContainer.SetPermissions(permissions);
//grab the blob
var sourceBlob = srcContainer.GetBlockBlobReference("FOO");
var destinationBlob = destContainer.GetBlockBlobReference("BAR");
//create a new blob
destinationBlob.StartCopyFromBlob(sourceBlob);
Since both CloudStorageAccount objects point to the same account, copying without a SAS token would work just fine as you also mentioned.
On the other hand, you need either a publicly accessible blob or a SAS token to copy from another account. So what you tried was correct, but you established a container-level access policy, which can take up to 30 seconds to take effect as also documented in MSDN. During this interval, a SAS token that is associated with the stored access policy will fail with status code 403 (Forbidden), until the access policy becomes active.
One more thing I would like to point out: when you call Get*BlobReference to create a new blob object, the CopyState property will not be populated until you perform a GET/HEAD operation such as FetchAttributes.
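One way to cope with the propagation delay of a stored access policy mentioned above is to retry the failing call for a bounded time. The following is only a sketch using the BCL: the class name PolicyPropagationRetry and its parameters are hypothetical, and the decision about which exceptions count as "policy not active yet" is left to the caller through a predicate.

```csharp
using System;
using System.Threading;

public static class PolicyPropagationRetry
{
    // Runs 'operation' until it succeeds or 'timeout' would be exceeded,
    // sleeping 'delay' between attempts. 'isTransient' decides which
    // exceptions are worth retrying; anything else propagates immediately.
    public static void Run(Action operation, Func<Exception, bool> isTransient,
                           TimeSpan timeout, TimeSpan delay)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        while (true)
        {
            try
            {
                operation();
                return;
            }
            catch (Exception ex) when (isTransient(ex) && DateTime.UtcNow + delay < deadline)
            {
                // Assumed transient (e.g. the access policy is not active yet): wait and retry.
                Thread.Sleep(delay);
            }
        }
    }
}
```

With the storage library, the operation would be the StartCopyFromBlob call and the predicate would typically check for a StorageException whose RequestInformation.HttpStatusCode is 403, with a timeout somewhat above 30 seconds. Creating the policy once ahead of time, rather than on every run, avoids the wait entirely.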
