Download an Excel file and read its content with Azure Functions

I am trying to write a C# Azure Function that downloads and opens an Excel file using the Open XML SDK.
Office Interop doesn't work here because Office is not available to the Azure Function.
The Open XML SDK seems to require a path to a saved file rather than a URL or a Stream downloaded from the remote URL.
Since I don't know of a way to temporarily store the Excel file in Azure Functions, I used Azure File Storage.
I uploaded the Excel file from the URL to Azure File Storage, but I cannot open it with the Open XML SDK.
I verified that the Excel file in Azure File Storage is intact; however, when I try to open a SpreadsheetDocument from a MemoryStream, I get an error indicating the file is corrupt.
If I try to open the SpreadsheetDocument by passing the file URI (https://learn.microsoft.com/en-us/azure/storage/storage-dotnet-how-to-use-files#develop-with-file-storage), the address exceeds the 260-character limit.
I'm open to using a library other than Open XML, and ideally I would prefer not to have to store the Excel file at all.

The Open XML SDK works fine in Azure Functions. I tested it on my side; here is the full code.
#r "DocumentFormat.OpenXml.dll"
#r "WindowsBase.dll"
using System.Net;
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
public static HttpResponseMessage Run(HttpRequestMessage req, TraceWriter log)
{
    log.Info($"C# HTTP trigger function processed a request. RequestUri={req.RequestUri}");
    WebClient client = new WebClient();
    byte[] buffer = client.DownloadData("http://amor-webapp-test.azurewebsites.net/Content/hello.xlsx");
    MemoryStream stream = new MemoryStream();
    stream.Write(buffer, 0, buffer.Length);
    stream.Position = 0;
    using (SpreadsheetDocument doc = SpreadsheetDocument.Open(stream, false))
    {
        WorkbookPart workbookPart = doc.WorkbookPart;
        SharedStringTablePart sstpart = workbookPart.GetPartsOfType<SharedStringTablePart>().First();
        SharedStringTable sst = sstpart.SharedStringTable;
        WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
        Worksheet sheet = worksheetPart.Worksheet;
        var cells = sheet.Descendants<Cell>();
        var rows = sheet.Descendants<Row>();
        log.Info(string.Format("Row count = {0}", rows.LongCount()));
        log.Info(string.Format("Cell count = {0}", cells.LongCount()));
        // One way: go through each cell in the sheet
        foreach (Cell cell in cells)
        {
            if ((cell.DataType != null) && (cell.DataType == CellValues.SharedString))
            {
                int ssid = int.Parse(cell.CellValue.Text);
                string str = sst.ChildElements[ssid].InnerText;
                log.Info(string.Format("Shared string {0}: {1}", ssid, str));
            }
            else if (cell.CellValue != null)
            {
                log.Info(string.Format("Cell contents: {0}", cell.CellValue.Text));
            }
        }
    }
    return req.CreateResponse(HttpStatusCode.OK, "Hello ");
}
To use the Open XML SDK, make sure you have created a bin folder under your function folder and uploaded DocumentFormat.OpenXml.dll and WindowsBase.dll to it.
"File contains corrupted data".
Have you tried another Excel file to check whether the issue is specific to that file? I suggest creating a new, simple Excel file and testing your code again.
"It didn't work on my file with the same "File contains corrupted data" message."
I downloaded your Excel file and found that it is in the older .xls format. The Open XML SDK only reads the newer ZIP-based formats such as .xlsx, which is why it reports the file as corrupted.
To fix the exception, you could convert the file to the newer .xlsx format or choose another Excel parsing library. ExcelDataReader can read both formats. You can install it through NuGet by searching for 'ExcelDataReader'. The following sample code parses an .xls file. I tested it in an Azure Function and it worked fine.
#r "Excel.dll"
#r "System.Data"
using System.Net;
using System.IO;
using Excel;
using System.Data;
public static HttpResponseMessage Run(HttpRequestMessage req, TraceWriter log)
{
    log.Info($"C# HTTP trigger function processed a request. RequestUri={req.RequestUri}");
    WebClient client = new WebClient();
    byte[] buffer = client.DownloadData("http://amor-webapp-test.azurewebsites.net/Content/abcdefg.xls");
    MemoryStream stream = new MemoryStream();
    stream.Write(buffer, 0, buffer.Length);
    stream.Position = 0;
    IExcelDataReader excelReader = ExcelReaderFactory.CreateBinaryReader(stream);
    DataSet result = excelReader.AsDataSet();
    for (int i = 0; i < result.Tables.Count; i++)
    {
        log.Info(result.Tables[i].TableName + " has " + result.Tables[i].Rows.Count + " rows.");
    }
    return req.CreateResponse(HttpStatusCode.OK, "Hello ");
}
Please add the "Excel.dll" file to the bin folder of your function before running the code above.
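Since the root cause in this thread was an .xls file being handed to an .xlsx parser, it may help to check the first bytes of the download before choosing a library. The sketch below is only a heuristic: ExcelFormatSniffer and Detect are hypothetical names, the ZIP signature matches any ZIP archive (not just .xlsx), and the OLE compound-file signature matches other legacy Office files as well as .xls.

```csharp
using System;

public static class ExcelFormatSniffer
{
    // ZIP local file header; .xlsx files are ZIP containers.
    private static readonly byte[] ZipSignature = { 0x50, 0x4B, 0x03, 0x04 };

    // OLE compound file header; legacy .xls files use this container.
    private static readonly byte[] OleSignature =
        { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    // Returns "xlsx", "xls", or "unknown" based on the leading bytes only.
    public static string Detect(byte[] data)
    {
        if (StartsWith(data, ZipSignature)) return "xlsx";
        if (StartsWith(data, OleSignature)) return "xls";
        return "unknown";
    }

    private static bool StartsWith(byte[] data, byte[] signature)
    {
        if (data == null || data.Length < signature.Length) return false;
        for (int i = 0; i < signature.Length; i++)
        {
            if (data[i] != signature[i]) return false;
        }
        return true;
    }
}
```

With the byte[] buffer returned by DownloadData, calling ExcelFormatSniffer.Detect(buffer) would let the function pick the Open XML SDK for "xlsx" and ExcelDataReader for "xls".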

If you do need to save a temporary file, Azure Functions provides a %TEMP% environment variable that points to a temporary folder. This folder is local to the VM that runs your function and is not persisted.
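However, saving the file locally or in Azure Files is unnecessary. You should be able to take the stream from the response to your GET request and hand its content to the Open XML SDK. The SDK needs a seekable stream, which a network response stream is not, so copy it into a MemoryStream first and open the document read-only.
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(originalExcelUrl);
using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
using (Stream responseStream = response.GetResponseStream())
using (MemoryStream stream = new MemoryStream())
{
    // Buffer the download so the package reader can seek within it.
    responseStream.CopyTo(stream);
    stream.Position = 0;
    using (var doc = SpreadsheetDocument.Open(stream, false))
    {
        // etc
    }
}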

Related

Copying files from FTP to Azure Blob Storage

I have set up an FTP server (ftp://xyz.in) with a user ID and credentials.
I have created an ASP.NET Core API application that will copy files from FTP to Azure Blob Storage.
My API solution is in the C://Test2/Test2 folder.
Here is my code:
FtpWebRequest request = (FtpWebRequest)WebRequest.Create("ftp:/xyz.in");
request.Method = WebRequestMethods.Ftp.UploadFile;
// This example assumes the FTP site uses anonymous logon.
request.Credentials = new NetworkCredential("pqr@efg.com", "lmn");
// Copy the contents of the file to the request stream.
byte[] fileContents;
// Getting error in below line.
using (StreamReader sourceStream = new StreamReader("ftp://xyz.in/abc.txt"))
{
fileContents = Encoding.UTF8.GetBytes(sourceStream.ReadToEnd());
}
request.ContentLength = fileContents.Length;
using (Stream requestStream = request.GetRequestStream())
{
requestStream.Write(fileContents, 0, fileContents.Length);
}
using (FtpWebResponse response = (FtpWebResponse)request.GetResponse())
{
Console.WriteLine($"Upload File Complete, status {response.StatusDescription}");
}
But on the line
using (StreamReader sourceStream = new StreamReader("ftp://xyz.in/abc.txt"))
I get the error: System.IO.IOException: 'The filename, directory name, or volume label syntax is incorrect : 'C:\Test2\Test2\ftp:\xyz.in\abc.txt''
I don't understand where the 'C:\Test2\Test2' prefix added to my FTP address comes from.
Test2 is the folder where my .NET Core application is located.
StreamReader() doesn't take a URL/URI; it takes a file path on your local system (see the documentation):
https://learn.microsoft.com/en-us/dotnet/api/system.io.streamreader.-ctor?view=net-5.0
StreamReader is interpreting the string you've supplied ("ftp://xyz.in/abc.txt") as a filename and looking for it relative to the current working folder, "C:\Test2\Test2". If your string were "abc.txt", it would look for a file called "abc.txt" in the current folder, e.g. C:\Test2\Test2\abc.txt.
What you want is to download the file using WebClient or something similar:
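WebClient request = new WebClient();
string url = "ftp://xyz.in/abc.txt";
request.Credentials = new NetworkCredential("username", "password");
try
{
    byte[] fileContents = request.DownloadData(url);
    // Do Something...
}
catch (WebException ex)
{
    // Connection, authentication, and missing-file failures surface here.
    Console.WriteLine(ex.Message);
}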

Can't upload multiple Azure blob contents (text and image) to an Azure SQL table

At the moment I can only upload one blob type (text) with the trigger below.
#r "System.Configuration"
#r "System.Data"
using System.Configuration;
using System.Data.SqlClient;
using System.Threading.Tasks;
using System;
public static void Run(Stream myBlob, string name, TraceWriter log)
{
log.Info($"C# Blob trigger function Processed blob\n Name:{name} \n Size: {myBlob.Length} Bytes");
string detail = ($"{name}");
var str = ConfigurationManager.ConnectionStrings["sqldb_connection"].ConnectionString;
using (SqlConnection conn = new SqlConnection(str))
{
conn.Open();
var text = "INSERT INTO PhotoTable(CreatedAt,UpdatedAt,IsDeleted, Url, Title) " +
"VALUES (SYSDATETIMEOFFSET(),SYSDATETIMEOFFSET(), 'true', 'yrhrh', @Name)";
using (SqlCommand cmd = new SqlCommand(text, conn))
{
cmd.Parameters.AddWithValue("@Name", name);
// Execute the command synchronously and log the number of rows affected.
var rows = cmd.ExecuteNonQuery();
log.Info($"{rows} rows were updated");
}
}
}
I have two questions:
1) Is there any way to upload two types at the same time to Azure SQL storage (such as a text blob and an image blob)?
2) With this trigger I only get the ID from blob storage, not the contents. Is there any way to get the contents of the blob as well?
Any help would be highly appreciated. Thank you.
1) It is possible to upload different kinds of files into Azure SQL using SSIS. The following video tutorial shows you a step-by-step guide on how to do it.
2) You can use OPENROWSET, which parses a file stored in Blob storage and returns the content of the file as a set of rows.
The following example shows a BULK INSERT command that loads the content of the file into SQL Database:
BULK INSERT Product
FROM 'data/product.dat'
WITH ( DATA_SOURCE = 'MyAzureBlobStorageAccount');
More info can be found here

Exporting XLSX file with Epplus generates file with error

Problem
I'm trying to send a List to an Excel file by using the LoadFromCollection method.
After I generate an XLSX file, it does not open but if I change the extension to XLS, it appears like this.
ExcelFileError
Code
[HttpGet]
[Route("GetExcel")]
public async Task<HttpResponse> GetExcel()
{
HttpContext.Current.Response.Clear();
HttpContext.Current.Response.ClearContent();
HttpContext.Current.Response.ClearHeaders();
HttpContext.Current.Response.Buffer = true;
HttpContext.Current.Response.Cache.SetCacheability(HttpCacheability.NoCache);
var stream = new MemoryStream();
using (var excelPackage = new ExcelPackage())
{
ExcelWorksheet ws = excelPackage.Workbook.Worksheets.Add("SoftwareVersao");
var result = await _versaoAppService.Selecionar();
ws.Cells["A1"].LoadFromCollection(result);
ws.Cells.AutoFitColumns();
excelPackage.SaveAs(stream);
}
HttpContext.Current.Response.BinaryWrite(stream.ToArray());
HttpContext.Current.Response.ContentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
HttpContext.Current.Response.AddHeader("content-disposition", "attachment;filename=" + "SoftwareVersao.xlsx");
HttpContext.Current.Response.Flush();
HttpContext.Current.ApplicationInstance.CompleteRequest();
return HttpContext.Current.Response;
}
This code generates a corrupted XLSX file. If I change the lines below, it generates the file shown in the image above.
HttpContext.Current.Response.ContentType = "application/vnd.ms-excel";
HttpContext.Current.Response.AddHeader("content-disposition", "attachment;filename=" + "SoftwareVersao.xls");
Attempts
I've made several attempts.
I used HttpResponseMessage instead of HttpResponse
I used a blank template in creating the file
I changed the encoding
I used several code snippets
What am I doing wrong? Is there something I forgot to implement?
Update
I found this question and can now generate the XLSX file, but I have the same problem as the author of that question: Excel has to repair the file.
WebApi using EF EPPlus returning gibberish in exce

NPOI writing XLS file converting to Azure Blob

I'm trying to convert a current application that uses NPOI to create an xls document on the server into an Azure-hosted application. I have little experience with NPOI and Azure, so that's 2 strikes right there. The app uploads the xls to the Blob container, but the file is always blank (9 bytes). From what I understand, NPOI uses a FileStream to write the file, so I just changed that to write to the blob container.
Here is what I think are the relevant portions:
internal void GenerateExcel(DataSet ds, int QuoteID, string ReportFileName)
{
string ExcelFileName = string.Format("{0}_{1}.xls",ReportFileName,QuoteID);
try
{
//these 2 strings will get deleted but left here for now to run side by side at the moment
string ReportDirectoryPath = HttpContext.Current.Server.MapPath(".") + "\\Reports";
if (!Directory.Exists(ReportDirectoryPath))
{
Directory.CreateDirectory(ReportDirectoryPath);
}
string ExcelReportFullPath = ReportDirectoryPath + "\\" + ExcelFileName;
if (File.Exists(ExcelReportFullPath))
{
File.Delete(ExcelReportFullPath);
}
// Create a new workbook.
var workbook = new HSSFWorkbook();
//Rest of the NPOI XLS rows cells etc. etc. all works fine when writing to disk////////////////
// Retrieve storage account from connection string.
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
// Create the blob client.
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
// Retrieve a reference to a container.
CloudBlobContainer container = blobClient.GetContainerReference("pricingappreports");
// Create the container if it doesn't already exist.
if (container.CreateIfNotExists())
{
container.SetPermissions(new BlobContainerPermissions { PublicAccess = BlobContainerPublicAccessType.Blob });
}
// Retrieve reference to a blob with the same name.
CloudBlockBlob blockBlob = container.GetBlockBlobReference(ExcelFileName);
// Write the output to a file on the server
String file = ExcelReportFullPath;
using (FileStream fs = new FileStream(file, FileMode.Create))
{
workbook.Write(fs);
fs.Close();
}
// Write the output to a file on Azure Storage
String Blobfile = ExcelFileName;
using (FileStream fs = new FileStream(Blobfile, FileMode.Create))
{
workbook.Write(fs);
blockBlob.UploadFromStream(fs);
fs.Close();
}
}
I'm uploading to the Blob and the file exists, why doesn't the data get written to the xls?
Any help would be appreciated.
Update: I think I found the problem. Doesn't look like you can write to a file in Blob Storage. Found this Blog which pretty much answers my questions: it doesn't use NPOI but the concept is the same. http://debugmode.net/2011/08/28/creating-and-updating-excel-file-in-windows-azure-web-role-using-open-xml-sdk/
Thanks
Can you install Fiddler and check the request and response packets? You may also need to seek back to 0 between the two operations: after workbook.Write, the stream position is at the end, so the upload reads nothing. The corrected code would add the seek before uploading the stream to the blob.
workbook.Write(fs);
fs.Seek(0, SeekOrigin.Begin);
blockBlob.UploadFromStream(fs);
fs.Close();
I also noticed that you are using String Blobfile = ExcelFileName instead of String Blobfile = ExcelReportFullPath.
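The seek-before-upload idea also works without a file on disk, by buffering the workbook in a MemoryStream. This is a minimal sketch under assumptions: BlobUploadSketch and WriteThenUpload are hypothetical names, the two delegates stand in for workbook.Write and blockBlob.UploadFromStream from the question, and the writer is assumed to leave the stream open after writing.

```csharp
using System;
using System.IO;

public static class BlobUploadSketch
{
    // Writes into an in-memory buffer, rewinds it, then hands it to the uploader.
    // Returns the number of bytes that were written, which helps spot empty output.
    public static long WriteThenUpload(Action<Stream> writeWorkbook, Action<Stream> upload)
    {
        using (var buffer = new MemoryStream())
        {
            writeWorkbook(buffer);

            // Without this rewind the uploader would start at the end and read nothing.
            buffer.Position = 0;

            upload(buffer);
            return buffer.Length;
        }
    }
}
```

In the question's code this could be called as BlobUploadSketch.WriteThenUpload(s => workbook.Write(s), s => blockBlob.UploadFromStream(s)); a return value of only a few bytes would indicate the workbook itself was written empty.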

The file you are trying to open is in a different format than specified by the file extension in ASP.NET

I get the error "the file you are trying to open is in a different format than specified by the file extension" when trying to open a file generated by my C# code in Excel.
Here is my code:
public ActionResult Export(string filterBy)
{
MemoryStream output = new MemoryStream();
StreamWriter writer = new StreamWriter(output, Encoding.UTF8);
var data = City.GetAll().Select(o => new
{
CountryName = o.CountryName,
StateName = o.StateName,
o.City.Name,
Title = o.City.STDCode
}).ToList();
var grid = new GridView { DataSource = data };
grid.DataBind();
var htw = new HtmlTextWriter(writer);
grid.RenderControl(htw);
writer.Flush();
output.Position = 0;
return File(output, "application/vnd.ms-excel", "test.xls");
}
When I try to open the file in Excel, I get this error:
the file you are trying to open is in a different format than
specified by the file extension
After clicking Yes, the file opens properly, but I don't want this message to appear.
I have used ClosedXML to solve the problem.
public static void ExportToExcel(IEnumerable<dynamic> data, string sheetName)
{
XLWorkbook wb = new XLWorkbook();
var ws = wb.Worksheets.Add(sheetName);
ws.Cell(2, 1).InsertTable(data);
HttpContext.Current.Response.Clear();
HttpContext.Current.Response.ContentType = "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet";
HttpContext.Current.Response.AddHeader("content-disposition", String.Format(@"attachment;filename={0}.xlsx",sheetName.Replace(" ","_")));
using (MemoryStream memoryStream = new MemoryStream())
{
wb.SaveAs(memoryStream);
memoryStream.WriteTo(HttpContext.Current.Response.OutputStream);
memoryStream.Close();
}
HttpContext.Current.Response.End();
}
I installed ClosedXML in my project using the NuGet Package Manager.
the file you are trying to open is in a different format than
specified by the file extension
You keep getting that warning because the file that gets created is not an actual Excel file. If you look inside the generated file, it's just a bunch of HTML tags. Remember that a GridView's RenderControl generates an HTML table.
To fix your issue, you need to either use a third-party tool that creates a real Excel file (one tool you might want to use is NPOI) or create a comma-delimited (CSV) file and return that instead.
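The CSV route could look roughly like the sketch below. CsvSketch, Escape, and ToCsv are hypothetical names rather than part of any library, and the quoting follows the common convention of wrapping fields that contain commas, quotes, or line breaks in double quotes and doubling any embedded quotes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class CsvSketch
{
    // Quotes a field only when it contains a comma, a quote, or a line break.
    public static string Escape(string field)
    {
        if (field == null) return "";
        bool mustQuote = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return mustQuote ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Joins each row with commas and terminates every row with CRLF.
    public static string ToCsv(IEnumerable<IEnumerable<string>> rows)
    {
        var sb = new StringBuilder();
        foreach (var row in rows)
        {
            sb.Append(string.Join(",", row.Select(Escape))).Append("\r\n");
        }
        return sb.ToString();
    }
}
```

In the Export action from the question, the resulting string could then be returned with File(Encoding.UTF8.GetBytes(csv), "text/csv", "test.csv"), so the extension and the content type both match what the file really contains.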
In case someone else stumbles across this... I needed to convert blobs back into files on the fly in C#. PDFs worked well, but Excel gave me the same error the OP describes.
This is the code I wrote, which handles Excel differently from other file types.
Giving Excel application/octet-stream with an actual filename solved my issue. Probably not the cleanest way to do it, but it was good enough for my purposes.
string theExt = Path.GetExtension(theDoc.documentFileName).ToUpper();
Response.Clear();
if (theExt == ".XLS" || theExt == ".XLSX"){
Response.ContentType = "application/octet-stream";
Response.AddHeader("Content-Disposition", string.Format("inline; filename={0}", theDoc.documentFileName));
}
else{
Response.ContentType = theDoc.documentMimeType;
Response.AddHeader("Content-Disposition", string.Format("inline; filename={0}", theDoc.documentTitle));
}
using (MemoryStream stream = new MemoryStream(theDoc.file))
{
stream.WriteTo(Response.OutputStream);
stream.Close();
};
Response.End();
In case someone needs to export a DataSet as an Excel file with ClosedXML:
DataSet ds = LoadDataSet(); // placeholder (hypothetical helper): your data from the database
var xlsx = new XLWorkbook();
var dataTable = ds.Tables[0];
xlsx.Worksheets.Add(dataTable);
xlsx.SaveAs("export.xlsx");
