NPOI writing XLS file converting to Azure Blob - excel

I'm trying to convert current application that uses NPOI for creating xls document on the server to Azure hosted application. I have little experience with NPOI and Azure so 2 strikes right there. I have the app uploading the xls to Blob container however it is always blank (9 bytes). From what I understand NPOI uses filestream to write to the file so I just changed that to write to the blob container.
Here is what i think are the relevant portions:
internal void GenerateExcel(DataSet ds, int QuoteID, string ReportFileName)
{
string ExcelFileName = string.Format("{0}_{1}.xls",ReportFileName,QuoteID);
try
{
//these 2 strings will get deleted but left here for now to run side by side at the moment
string ReportDirectoryPath = HttpContext.Current.Server.MapPath(".") + "\\Reports";
if (!Directory.Exists(ReportDirectoryPath))
{
Directory.CreateDirectory(ReportDirectoryPath);
}
string ExcelReportFullPath = ReportDirectoryPath + "\\" + ExcelFileName;
if (File.Exists(ExcelReportFullPath))
{
File.Delete(ExcelReportFullPath);
}
// Create a new workbook.
var workbook = new HSSFWorkbook();
//Rest of the NPOI XLS rows cells etc. etc. all works fine when writing to disk////////////////
// Retrieve storage account from connection string.
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
// Create the blob client.
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
// Retrieve a reference to a container.
CloudBlobContainer container = blobClient.GetContainerReference("pricingappreports");
// Create the container if it doesn't already exist.
if (container.CreateIfNotExists())
{
container.SetPermissions(new BlobContainerPermissions { PublicAccess = BlobContainerPublicAccessType.Blob });
}
// Retrieve reference to a blob with the same name.
CloudBlockBlob blockBlob = container.GetBlockBlobReference(ExcelFileName);
// Write the output to a file on the server
String file = ExcelReportFullPath;
using (FileStream fs = new FileStream(file, FileMode.Create))
{
workbook.Write(fs);
fs.Close();
}
// Write the output to a file on Azure Storage
String Blobfile = ExcelFileName;
using (FileStream fs = new FileStream(Blobfile, FileMode.Create))
{
workbook.Write(fs);
blockBlob.UploadFromStream(fs);
fs.Close();
}
}
I'm uploading to the Blob and the file exists, why doesn't the data get written to the xls?
Any help would be appreciated.
Update: I think I found the problem. Doesn't look like you can write to a file in Blob Storage. Found this Blog which pretty much answers my questions: it doesn't use NPOI but the concept is the same. http://debugmode.net/2011/08/28/creating-and-updating-excel-file-in-windows-azure-web-role-using-open-xml-sdk/
Thanks

Can you install fiddler and check the request and the response packets? You may also need to seek back to 0 between two writes . So the correct code here could be to add the below before trying to write the stream to blob.
workbook.Write(fs);
fs.Seek(0, SeekOrigin.Begin);
blockBlob.UploadFromStream(fs);
fs.Close();
I also noticed that you are using String Blobfile = ExcelFileName instead of String Blobfile = ExcelReportFullPath.

Related

Why can't I download an Azure Blob using an asp.net core application published to Azure server

I am trying to download a Blob from an Azure storage account container. When I run the application locally, I get the correct "Download" folder C:\Users\xxxx\Downloads. When I publish the application to Azure and try to download the file, I get an error. I have tried various "Knownfolders", and some return empty strings, others return the folders on the Azure server. I am able to upload files fine, list the files in a container, but am struggling with downloading a file.
string conn =
configuration.GetValue<string>"AppSettings:AzureContainerConn");
CloudStorageAccount storageAcct = CloudStorageAccount.Parse(conn);
CloudBlobClient blobClient = storageAcct.CreateCloudBlobClient();
CloudBlobContainer container =
blobClient.GetContainerReference(containerName);
Uri uriObj = new Uri(uri);
string filename = Path.GetFileName(uriObj.LocalPath);
// get block blob reference
CloudBlockBlob blockBlob = container.GetBlockBlobReference(filename);
Stream blobStream = await blockBlob.OpenReadAsync();
string _filepath = _knownfolder.Path + "\\projectfiles\\";
Directory.CreateDirectory(_filepath);
_filepath = _filepath + filename;
Stream _file = new MemoryStream();
try
{
_file = File.Open(_filepath, FileMode.Create, FileAccess.Write);
await blobStream.CopyToAsync(_file);
}
finally
{
_file.Dispose();
}
The expected end result is the file ends up in the folder within the users "Downloads" folder.
Since you're talking about publishing to Azure, the code is probably from a web application, right? And the code for the web application runs on the server. Which means the code is trying to download the blob to the server running the web application.
To present a downloadlink to the user to enable them to download the file, use the FileStreamResult which
Represents an ActionResult that when executed will write a file from a stream to the response.
A (pseudo code) example:
[HttpGet]
public FileStreamResult GetFile()
{
var stream = new MemoryStream();
CloudBlockBlob blockBlob = container.GetBlockBlobReference(filename);
blockBlob.DownloadToStream(stream);
blockBlob.Seek(0, SeekOrigin.Begin);
return new FileStreamResult(stream, new MediaTypeHeaderValue("text/plain"))
{
FileDownloadName = "someFile.txt"
};
}

Is it possible to use more than 1.5 Gb of memory with an Azure Function App V2

I'm currently using v2 of Azure Function Apps. I've set the environment to be 64 bit and am compiling to .Net Standard 2.0. Host Json specifies version 2.
I'm reading in a .csv and it works fine for smaller files. But when I read in a 180MB .csv into a List of string[] it's ballooning to over a GB on read and when I try to parse it, it's up over 2 GB but then throws the 'Out of Memory' Exception. Even running on an app service plan with more than 3.5 GB hasn't solved the issue.
Edit:
I'm using this:
Uri blobUri = AppendSasOnUri(blobName); _webClient = new WebClient();
Stream sourceStream = _webClient.OpenRead(blobUri);
_reader = new StreamReader(sourceStream);
However, since It's a csv, I'm splitting out entire columns of data. It's pretty hard to get away from this:
internal async Task<List<string[]>> ReadCsvAsync() {
while (!_reader.EndOfStream) {
string[] currentCsvRow = await ReadCsvRowAsync();
_fullBlobCsv.Add(currentCsvRow);
}
return _fullBlobCsv; }
Goal is to store json into blob when alls said and done.
Try using stream (StreamReader) to read the input .csv file and process one line at a time.
I'm able to parse 300mb files on consumption plan with streams. My use-case may not be same but similar. Parse a large concatenated pdf file and separate it to 5000+ smaller files and store the separated files into blob container. Below is my code for reference.
For your use case you may want to use CloudAppendBlob instead of CloudBlockBlob if you're pushing all parsed data into single blob.
public async static void ExtractSmallerFiles(CloudBlockBlob myBlob, string fileDate, ILogger log)
{
using (var reader = new StreamReader(await myBlob.OpenReadAsync()))
{
CloudBlockBlob blockBlob = null;
var fileContents = new StringBuilder(string.Empty);
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
if (line.StartsWith("%%MS_SKEY_0000_000_PDF:"))
{
var matches = Regex.Match(line, #"%%MS_SKEY_0000_000_PDF: A(\d+)_SMFL_B1234_D(\d{8})_A\d+_M(\d{15}) _N\d+");
var smallFileDate = matches.Groups[2];
var accountNumber = matches.Groups[3];
var fileName = $"SmallerFiles/{smallFileDate}/{accountNumber}.pdf";
blockBlob = myBlob.Container.GetBlockBlobReference(fileName);
}
fileContents.AppendLine(line);
if (line.Equals("%%EOF"))
{
log.LogInformation($"Uploading {fileContents.Length} bytes to {blockBlob.Name}");
await blockBlob.UploadTextAsync(fileContents.ToString());
fileContents = new StringBuilder(string.Empty);
}
}
await myBlob.DeleteAsync();
log.LogInformation("Extracted Smaller files");
}
}

How to copy an Azure File to an Azure Blob?

I've got some files sitting in an Azure File storage.
I'm trying to programatically archive them to Azure Blobs and I'm not sure how to do this efficiently.
I keep seeing code samples about copying from one blob container to another blob container .. but not from a File to a Blob.
Is it possible to do without downloading the entire File content locally and then uploading this content? Maybe use Uri's or something?
More Info:
The File and Blob containers are in the same Storage account.
The storage account is RA-GRS
Here was some sample code I was thinking of doing .. but it just doesn't feel right :( (pseudo code also .. with validation and checks omitted).
var file = await ShareRootDirectory.GetFileReference(fileName);
using (var stream = new MemoryStream())
{
await file.DownloadToStreamAsync(stream);
// Custom method that basically does:
// 1. GetBlockBlobReference
// 2. UploadFromStreamAsync
await cloudBlob.AddItemAsync("some-container", stream);
}
How to copy an Azure File to an Azure Blob?
We also can use CloudBlockBlob.StartCopy(CloudFile). CloudFile type is also can be accepted by the CloudBlockBlob.StartCopy function.
How to copy CloudFile to blob please refer to document. The following demo code is snippet from the document.
// Parse the connection string for the storage account.
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(
Microsoft.Azure.CloudConfigurationManager.GetSetting("StorageConnectionString"));
// Create a CloudFileClient object for credentialed access to File storage.
CloudFileClient fileClient = storageAccount.CreateCloudFileClient();
// Create a new file share, if it does not already exist.
CloudFileShare share = fileClient.GetShareReference("sample-share");
share.CreateIfNotExists();
// Create a new file in the root directory.
CloudFile sourceFile = share.GetRootDirectoryReference().GetFileReference("sample-file.txt");
sourceFile.UploadText("A sample file in the root directory.");
// Get a reference to the blob to which the file will be copied.
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
CloudBlobContainer container = blobClient.GetContainerReference("sample-container");
container.CreateIfNotExists();
CloudBlockBlob destBlob = container.GetBlockBlobReference("sample-blob.txt");
// Create a SAS for the file that's valid for 24 hours.
// Note that when you are copying a file to a blob, or a blob to a file, you must use a SAS
// to authenticate access to the source object, even if you are copying within the same
// storage account.
string fileSas = sourceFile.GetSharedAccessSignature(new SharedAccessFilePolicy()
{
// Only read permissions are required for the source file.
Permissions = SharedAccessFilePermissions.Read,
SharedAccessExpiryTime = DateTime.UtcNow.AddHours(24)
});
// Construct the URI to the source file, including the SAS token.
Uri fileSasUri = new Uri(sourceFile.StorageUri.PrimaryUri.ToString() + fileSas);
// Copy the file to the blob.
destBlob.StartCopy(fileSasUri);
Note:
If you are copying a blob to a file, or a file to a blob, you must use a shared access signature (SAS) to authenticate the source object, even if you are copying within the same storage account.
Use the Transfer Manager:
https://msdn.microsoft.com/en-us/library/azure/microsoft.windowsazure.storage.datamovement.transfermanager_methods.aspx
There are methods to copy from CloudFile to CloudBlob.
Add the "Microsoft.Azure.Storage.DataMovement" nuget package
using Microsoft.WindowsAzure.Storage.Blob;
using Microsoft.WindowsAzure.Storage.File;
using Microsoft.WindowsAzure.Storage.DataMovement;
private string _storageConnectionString = "your_connection_string_here";
public async Task CopyFileToBlob(string blobContainer, string blobPath, string fileShare, string fileName)
{
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(_connectionString);
CloudFileShare cloudFileShare = storageAccount.CreateCloudFileClient().GetShareReference(fileShare);
CloudFile source = cloudFileShare.GetRootDirectoryReference().GetFileReference(fileName);
CloudBlobContainer blobContainer = storageAccount.CreateCloudBlobClient().GetContainerReference(blobContainer);
CloudBlockBlob target = blobContainer.GetBlockBlobReference(blobPath);
await TransferManager.CopyAsync(source, target, true);
}

Can't rename blob file in Azure Storage

I am trying to rename blob in azure storage via .net API and it is I am unable to rename a blob file after a day : (
Here is how I am doing it, by creating new blob and copy from old one.
var newBlob = blobContainer.GetBlobReferenceFromServer(filename);
newBlob.StartCopyFromBlob(blob.Uri);
blob.Delete();
There is no new blob on server so I am getting http 404 Not Found exception.
Here is working example that i have found but it is for old .net Storage api.
CloudBlob blob = container.GetBlobReference(sourceBlobName);
CloudBlob newBlob = container.GetBlobReference(destBlobName);
newBlob.UploadByteArray(new byte[] { });
newBlob.CopyFromBlob(blob);
blob.Delete();
Currently I am using 2.0 API. Where I am I making a mistake?
I see that you're using GetBlobReferenceFromServer method to create an instance of new blob object. For this function to work, the blob must be present which will not be the case as you're trying to rename the blob.
What you could do is call GetBlobReferenceFromServer on the old blob, get it's type and then either create an instance of BlockBlob or PageBlob and perform copy operation on that. So your code would be something like:
CloudBlobContainer blobContainer = storageAccount.CreateCloudBlobClient().GetContainerReference("container");
var blob = blobContainer.GetBlobReferenceFromServer("oldblobname");
ICloudBlob newBlob = null;
if (blob is CloudBlockBlob)
{
newBlob = blobContainer.GetBlockBlobReference("newblobname");
}
else
{
newBlob = blobContainer.GetPageBlobReference("newblobname");
}
//Initiate blob copy
newBlob.StartCopyFromBlob(blob.Uri);
//Now wait in the loop for the copy operation to finish
while (true)
{
newBlob.FetchAttributes();
if (newBlob.CopyState.Status != CopyStatus.Pending)
{
break;
}
//Sleep for a second may be
System.Threading.Thread.Sleep(1000);
}
blob.Delete();
The code in OP was almost fine except that an async copy method was called. The simplest code in new API should be:
var oldBlob = cloudBlobClient.GetBlobReferenceFromServer(oldBlobUri);
var newBlob = container.GetBlobReference("newblobname");
newBlog.CopyFromBlob(oldBlob);
oldBlob.Delete();

Getting an error when uploading a file to Azure Storage

I'm converting a website from a standard ASP.NET website over to use Azure. The website had previously taken an Excel file uploaded by an administrative user and saved it on the file system. As part of the migration, I'm saving this file to Azure Storage. It works fine when running against my local storage through the Azure SDK. (I'm using version 1.3 since I didn't want to upgrade during the development process.)
When I point the code to run against Azure Storage itself, though, the process usually fails. The error I get is:
System.IO.IOException occurred
Message=Unable to read data from the transport connection: The connection was closed.
Source=Microsoft.WindowsAzure.StorageClient
StackTrace:
at Microsoft.WindowsAzure.StorageClient.Tasks.Task`1.get_Result()
at Microsoft.WindowsAzure.StorageClient.Tasks.Task`1.ExecuteAndWait()
at Microsoft.WindowsAzure.StorageClient.CloudBlob.UploadFromStream(Stream source, BlobRequestOptions options)
at Framework.Common.AzureBlobInteraction.UploadToBlob(Stream stream, String BlobContainerName, String fileName, String contentType) in C:\Development\RateSolution2010\Framework.Common\AzureBlobInteraction.cs:line 95
InnerException:
The code is as follows:
public void UploadToBlob(Stream stream, string BlobContainerName, string fileName,
string contentType)
{
// Setup the connection to Windows Azure Storage
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(GetConnStr());
DiagnosticMonitorConfiguration dmc = DiagnosticMonitor.GetDefaultInitialConfiguration();
dmc.Logs.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);
dmc.Logs.ScheduledTransferLogLevelFilter = LogLevel.Verbose;
DiagnosticMonitor.Start(storageAccount, dmc);
CloudBlobClient BlobClient = null;
CloudBlobContainer BlobContainer = null;
BlobClient = storageAccount.CreateCloudBlobClient();
// For large file copies you need to set up a custom timeout period
// and using parallel settings appears to spread the copy across multiple threads
// if you have big bandwidth you can increase the thread number below
// because Azure accepts blobs broken into blocks in any order of arrival.
BlobClient.Timeout = new System.TimeSpan(1, 0, 0);
Role serviceRole = RoleEnvironment.Roles.Where(s => s.Value.Name == "OnlineRates.Web").First().Value;
BlobClient.ParallelOperationThreadCount = serviceRole.Instances.Count;
// Get and create the container
BlobContainer = BlobClient.GetContainerReference(BlobContainerName);
BlobContainer.CreateIfNotExist();
//delete prior version if one exists
BlobRequestOptions options = new BlobRequestOptions();
options.DeleteSnapshotsOption = DeleteSnapshotsOption.None;
CloudBlob blobToDelete = BlobContainer.GetBlobReference(fileName);
Trace.WriteLine("Blob " + fileName + " deleted to be replaced by newer version.");
blobToDelete.DeleteIfExists(options);
//set stream to starting position
stream.Position = 0;
long totalBytes = 0;
//Open the stream and read it back.
using (stream)
{
// Create the Blob and upload the file
CloudBlockBlob blob = BlobContainer.GetBlockBlobReference(fileName);
try
{
BlobClient.ResponseReceived += new EventHandler<ResponseReceivedEventArgs>((obj, responseReceivedEventArgs)
=>
{
if (responseReceivedEventArgs.RequestUri.ToString().Contains("comp=block&blockid"))
{
totalBytes += Int64.Parse(responseReceivedEventArgs.RequestHeaders["Content-Length"]);
}
});
blob.UploadFromStream(stream);
// Set the metadata into the blob
blob.Metadata["FileName"] = fileName;
blob.SetMetadata();
// Set the properties
blob.Properties.ContentType = contentType;
blob.SetProperties();
}
catch (Exception exc)
{
Logging.ExceptionLogger.LogEx(exc);
}
}
}
I've tried a number of different alterations to the code: deleting a blob before replacing it (although the problem exists on new blobs as well), setting container permissions, not setting permissions, etc.
Your code looks like it should work, but it has lots of extra functionality that is not strictly required. I would cut it down to an absolute minimum and go from there. It's really only a gut feeling, but I think it might be the using statement giving you grief. This enture function could be written (presuming the container already exists) as:
public void UploadToBlob(Stream stream, string BlobContainerName, string fileName,
string contentType)
{
// Setup the connection to Windows Azure Storage
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(GetConnStr());
CloudBlobClient BlobClient = storageAccount.CreateCloudBlobClient();
CloudBlobContainer BlobContainer = BlobClient.GetContainerReference(BlobContainerName);
CloudBlockBlob blob = BlobContainer.GetBlockBlobReference(fileName);
stream.Position = 0;
blob.UploadFromStream(stream);
}
Notes on the stuff that I've removed:
You should set up diagnostics just once when you're app starts, not every time a method is called. Usually in the RoleEntryPoint.OnStart()
I'm not sure why you're trying to set ParallelOperationThreadCount higher if you have more instances. Those two things seem unrelated.
It's not good form to check for the existence of a container/table every time you save something to it. It's more usual to do that check once when your app starts or to have a process external to the website to make sure all the required containers/tables/queues exist. Of course if you're trying to dynamically create containers this is not true.
The problem turned out to be firewall settings on my laptop. It's my personal laptop originally set up at home and so the firewall rules weren't set up for a corporate environment resulting in slow performance on uploads and downloads.

Resources