Stream File in Azure App Service lost bytes - azure

I have a very unusual problem when I try to stream a file with ASP.NET Core and with the ASP.NET full framework.
I have tried different code samples which work in my local environment and in my local IIS 7. However, when these samples are executed in an App Service, they do not work correctly: it is impossible to download a complete file. On every run, the file loses a few bytes, which results in a corrupted download.
As I have tried multiple code samples, I am not including all of them here. They were taken either from Stack Overflow or from the official documentation.

Basically my code has this:
try
{
    WebRequest req = WebRequest.Create("[URL here]");
    WebResponse response = req.GetResponse();
    Stream stream = response.GetResponseStream();
    //...
}
catch (Exception)
{
    MessageBox.Show("There was a problem downloading the file");
}
Controller:
[HttpPost("Document")]
[Produces("application/octet-stream")]
[AllowAnonymous]
public async Task<IActionResult> Download([FromBody] DownloadDmsRequest data)
{
    // ...
}
If I read the response into a MemoryStream instead, the file downloads completely.
As a test, I downloaded the file into memory and then uploaded it to blob storage; the file uploaded successfully without problems.
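For reference, here is a minimal sketch of the buffering workaround described above, assuming an injected HttpClient field (_httpClient) and assuming DownloadDmsRequest exposes the source URL; both of those are assumptions, not code from the question.

[HttpPost("Document")]
[Produces("application/octet-stream")]
[AllowAnonymous]
public async Task<IActionResult> Download([FromBody] DownloadDmsRequest data)
{
    // data.Url is a hypothetical property; the real request type is not shown above.
    using var upstream = await _httpClient.GetAsync(data.Url, HttpCompletionOption.ResponseContentRead);
    upstream.EnsureSuccessStatusCode();

    // Buffer the whole payload before returning it instead of streaming it straight through.
    var buffer = new MemoryStream();
    await upstream.Content.CopyToAsync(buffer);
    buffer.Position = 0;

    return File(buffer, "application/octet-stream", "document.bin");
}

Buffering trades memory for reliability; streaming the upstream response directly avoids the extra copy, but that is exactly the path that loses bytes in the App Service scenario described above.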

Related

Trying to use HttpClient.GetStreamAsync straight to the adls FileClient.UploadAsync

I have an Azure Function that will call an external API via HttpClient. The external API returns a JSON response. I want to save the response directly to an ADLS File.
My simplistic code is:
public async Task UploadFileBulk(Stream contentToUpload)
{
    await this._theClient.FileClient.UploadAsync(contentToUpload);
}
this._theClient is a simple wrapper class around the various Azure Data Lake classes such as DataLakeServiceClient, DataLakeFileSystemClient, DataLakeDirectoryClient, and DataLakeFileClient.
I'm happy this wrapper works as I expect: I spin one up, set the service, filesystem, and directory, and then a filename to create. I've used this wrapper class to create directories etc., so it works as I expect.
I am calling the above method as follows:
await dlw.UploadFileBulk(await this._httpClient.GetStreamAsync("<endpoint>"));
I see the file getting created in the Lake directory with the name I want. However, if I then download the file using Storage Explorer and try to open it in, say, VS Code, it's not in a recognisable format (I can "force" VS Code to open it, but it looks like binary content to me).
If I sniff the traffic with fiddler I can see the content from the external API is JSON, content-type is application/json and the body shows in fiddler as JSON.
If I look at the calls to the ADLS endpoint I can see a PUT call followed by two PATCH calls.
The first PATCH call looks like it is the one sending the content; it has a Content-Type header of application/octet-stream, and the request body is the "binary looking" content.
I am using HttpClient.GetStreamAsync because I don't want my Function to have to load the entire API payload into memory (some of the external API endpoints return very large files, over 100 MB). I am thinking I can "stream the response from the external API straight into ADLS".
Is there a way to change how the ADLS FileClient.UploadAsync(Stream stream) method works so I can tell it to upload the file as a JSON file with a content type of application/json?
EDIT:
So it turns out the external API was sending back zipped content, and once I added the following AutomaticDecompression code to my Function's startup, the files were uploaded to ADLS as expected.
public override void Configure(IFunctionsHostBuilder builder)
{
    builder.Services.AddHttpClient("default", client =>
    {
        client.DefaultRequestHeaders.Add("Accept-Encoding", "gzip, deflate");
    }).ConfigurePrimaryHttpMessageHandler(() => new HttpClientHandler
    {
        AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
    });
}
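Note that the decompression handler only applies to clients resolved through the factory under that name. A minimal sketch of how that might be consumed (the class and field names here are assumptions, not from the original post):

// Hedged sketch: resolving the named "default" client so the handler
// configured above (with AutomaticDecompression) is actually used.
public class ExportFunction
{
    private readonly HttpClient _httpClient;

    public ExportFunction(IHttpClientFactory httpClientFactory)
    {
        _httpClient = httpClientFactory.CreateClient("default");
    }
}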
Gaurav Mantri has given me some pointers on whether the pattern of "streaming from an output to an input" is actually correct; I will research this further.
Regarding the issue, please refer to the following code
var uploadOptions = new DataLakeFileUploadOptions();
uploadOptions.HttpHeaders = new PathHttpHeaders();
uploadOptions.HttpHeaders.ContentType ="application/json";
await fileClient.UploadAsync(stream, uploadOptions);
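Putting that together with the streaming call from the question, a hedged sketch (the "<endpoint>" placeholder and the method wiring are assumptions, not the original wrapper code):

// Hedged sketch: stream the HTTP response straight into ADLS and set the
// Content-Type on the resulting file.
public async Task UploadJsonBulk(HttpClient httpClient, DataLakeFileClient fileClient)
{
    using Stream content = await httpClient.GetStreamAsync("<endpoint>");

    var uploadOptions = new DataLakeFileUploadOptions
    {
        HttpHeaders = new PathHttpHeaders { ContentType = "application/json" }
    };

    // The payload is streamed to the service rather than buffered in memory.
    await fileClient.UploadAsync(content, uploadOptions);
}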

problems uploading xlsx file in body of post request to .net core app on aws-lambda

I'm trying to send a POST request with Postman to our AWS Lambda server. Let me first state that, when running the web server on my laptop using the Visual Studio debugger, everything works fine. When trying to do exactly the same against the URL of the AWS Lambda, I'm getting the following errors when sifting through the logging:
When uploading the normal xlsx file (it has a size of 593 KB):
Split or spanned archives are not supported.
When uploading the same file but with a few worksheets removed (because I thought maybe the size was too big, which shouldn't matter, but let's try):
Number of entries expected in End Of Central Directory does not correspond to number of entries in Central Directory.
When uploading a random xlsx file:
Offset to Central Directory cannot be held in an Int64.
I do not know what is going on. It might have something to do with the way Postman serializes the xlsx file and the way my debug session (on a Windows machine) deserializes it, which is different from the way AWS Lambda deserializes it, but that's just a complete guess.
I always get a 400 - Bad Request response.
I'm at a loss and am hoping someone here knows what to do.
This is the method in my controller, however the problem occurs before this:
[HttpPost("productmodel")]
public async Task<IActionResult> SeedProductModel()
{
try
{
_logger.LogInformation("Starting seed product model");
var memoryStream = new MemoryStream();
_logger.LogInformation($"request body: {Request.Body}");
Request.Body.CopyTo(memoryStream);
var command = new SeedProductModelCommand(memoryStream);
var result = await _mediator.Send(command);
if (!result.Success)
{
return BadRequest(result.MissingProducts);
}
return Ok();
}
catch (Exception ex)
{
_logger.LogError(ex.Message);
return BadRequest();
}
}
Postman: we do not use API keys for our test environment.
Since you are uploading binary content to API Gateway, you need to enable binary support through the console.
Go to API Gateway -> select your API -> Settings -> Binary Media Types -> add application/octet-stream.
Save it and make sure to redeploy your API, otherwise your changes will have no effect.
To do so, select your API -> Actions -> Deploy API
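Before redeploying, a quick way to confirm the diagnosis from inside the SeedProductModel action is to check whether the forwarded body still looks like a ZIP archive (an xlsx file is a ZIP, so it should start with the bytes "PK"). This is a hypothetical diagnostic sketch, not part of the original answer:

// Hedged diagnostic: log whether the request body begins with the ZIP
// signature "PK". If it does not, API Gateway is still mangling the
// binary payload (for example by treating it as text).
var memoryStream = new MemoryStream();
await Request.Body.CopyToAsync(memoryStream);
byte[] head = memoryStream.ToArray();
bool looksLikeZip = head.Length >= 2 && head[0] == (byte)'P' && head[1] == (byte)'K';
_logger.LogInformation("Body starts with ZIP signature: {LooksLikeZip}", looksLikeZip);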

How to create a blob in node (server side) from a stream, a file or a base64 string?

I am trying to create a blob from a PDF I am creating with pdfmake so that I can send it to a remote API that only handles blobs.
This is how I get my PDF file:
var docDefinition = { content: 'This is an sample PDF printed with pdfMake' };
var pdfDoc = printer.createPdfKitDocument(docDefinition); // printer: a pdfmake PdfPrinter instance (setup not shown here)
pdfDoc.pipe(fs.createWriteStream('./pdfs/test.pdf'));
pdfDoc.end();
The above lines of code do produce a readable pdf.
Now how can I get a blob from there? I have tried many options (creating the blob from the stream with the blob-stream module, creating it from the file with fs, creating it from a base64 string with b64toBlob), but all of them at some point require using the Blob constructor, for which I always get an error even if I require the blob module:
TypeError: Blob is not a constructor
After some research I found that it seems that the Blob constructor is only supported client-side.
All the npm packages that I have found and which seem to deal with this issue seem to only work client-side: blob-stream, blob, blob-util, b64toBlob, etc.
So, how can I create a blob server-side on Node?
I don't understand why almost nobody else seems to need to create a blob server-side. The only thread I could find on the subject is this one.
According to that thread, apparently:
The Solution to this problem is to create a function which can convert between Array Buffers and Node Buffers. :)
Unfortunately this does not help me much as I clearly seem to lack some important knowledge here to be able to comprehend this.
Use the node-blob npm package:
const Blob = require('node-blob');
let myBlob = new Blob(["something"], { type: 'text/plain' });

Windows Azure blob leasing in SDK 1.4

I have been using the following code which I wrote after consulting the following thread - Use blob-leasing feature in the Azure cloud app
public static void UploadFromStreamWithLease(CloudBlob blob, Stream src, string leaseID)
{
    string url = blob.Uri.ToString();
    if (blob.ServiceClient.Credentials.NeedsTransformUri)
    {
        url = blob.ServiceClient.Credentials.TransformUri(url);
    }
    HttpWebRequest req = BlobRequest.Put(new Uri(url), 90, blob.Properties, BlobType.BlockBlob, leaseID, 0);
    BlobRequest.AddMetadata(req, blob.Metadata);
    // Write the raw bytes to the request stream (a StreamWriter would write
    // the byte array's ToString() rather than its content).
    using (Stream reqStream = req.GetRequestStream())
    {
        byte[] content = readFully(src);
        reqStream.Write(content, 0, content.Length);
    }
    blob.ServiceClient.Credentials.SignRequest(req);
    req.GetResponse().Close();
}
The readFully() method above simply reads the content of the stream into a byte[] array.
I have been using this code to upload content to any blob that has a valid lease ID. It was working fine until I moved to version 1.4 of the Azure SDK. With the new version of the SDK, I get a 400 error from the req.GetResponse() call.
Can someone please point out what has changed in Azure SDK 1.4 that's screwing this up?
Thanks
Kapil
The 400 code means "Bad Request"; there should be some additional error message, see http://paulsomers.blogspot.com/2010/10/azure-error-400-bad-request.html for some examples. You should try debugging or sniffing the network traffic to get the error message.
There were some bugs around downloading blobs in version 1.4, but they may not affect you. In any case, you should upgrade to the latest version.
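If upgrading to a later storage client library (Microsoft.WindowsAzure.Storage 2.x or newer) is an option, the lease ID can be passed as an AccessCondition instead of hand-building the REST request. A minimal sketch under that assumption:

// Hedged sketch, assuming the newer storage client library where leases
// are expressed as an AccessCondition on the upload call.
public static void UploadFromStreamWithLease(CloudBlockBlob blob, Stream src, string leaseId)
{
    AccessCondition leaseCondition = AccessCondition.GenerateLeaseCondition(leaseId);
    blob.UploadFromStream(src, leaseCondition);
}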

Windows Azure: Can't upload a 34 MB file on to the blob

I was trying to upload a 34 MB file to the blob, but it gives me this error:
XML Parsing Error: no element found
Location: http://127.0.0.1:83/Default.aspx
Line Number 1, Column 1:
What should I do? How can I solve this?
I am able to upload small files of around 500 KB, but I have a 34 MB file that needs to be uploaded into my blob container.
I tried it using:
protected void ButUpload_click(object sender, EventArgs e)
{
    // store the uploaded file in blob storage
    if (uplFileUpload.HasFile)
    {
        string name = uplFileUpload.FileName;
        // get a reference to the cloud blob container
        CloudBlobContainer blobContainer = cloudBlobClient.GetContainerReference("documents");
        // set the name for the uploaded file
        string UploadDocName = name;
        // get the blob reference and set the metadata properties
        CloudBlob blob = blobContainer.GetBlobReference(UploadDocName);
        blob.Metadata["FILETYPE"] = "text";
        blob.Properties.ContentType = uplFileUpload.PostedFile.ContentType;
        // upload the blob to the storage
        blob.UploadFromStream(uplFileUpload.FileContent);
    }
}
But I am not able to upload it. Can anyone tell me how to do that?
Blobs larger than 64MB must be uploaded using block blobs. You break the file into blocks, upload all the blocks (associating each block with a unique string identifier), and at the very end you post the list of block IDs to the blob to commit the entire batch in one go.
Uploading in blocks is also recommended for large blobs less than 64MB in size. It is very easy for a hiccup in the network connection or routing through the internet to lose a frame or two in a very large upload, which will corrupt or invalidate the entire upload. Use smaller blocks to reduce your exposure to cosmic events.
More info in this discussion thread: http://social.msdn.microsoft.com/Forums/en-NZ/windowsazure/thread/f4575746-a695-40ff-9e49-ffe4c99b28c7
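A minimal sketch of that block-based upload with the classic StorageClient API, reusing the container, file-upload control, and blob name from the question (the 4 MB block size and the "d6" block naming are assumptions):

// Hedged sketch: upload the file in 4 MB blocks and commit the block list.
// Block IDs must be base64-encoded strings of identical length.
// Requires System.Text and System.Collections.Generic.
CloudBlockBlob blockBlob = blobContainer.GetBlockBlobReference(UploadDocName);
const int blockSize = 4 * 1024 * 1024;
var blockIds = new List<string>();
byte[] buffer = new byte[blockSize];
int blockNumber = 0;
int read;

Stream source = uplFileUpload.FileContent;
while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
{
    string blockId = Convert.ToBase64String(Encoding.UTF8.GetBytes(blockNumber.ToString("d6")));
    using (var blockData = new MemoryStream(buffer, 0, read))
    {
        blockBlob.PutBlock(blockId, blockData, null);
    }
    blockIds.Add(blockId);
    blockNumber++;
}

// Committing the block list finalizes the blob in a single operation.
blockBlob.PutBlockList(blockIds);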
I would start by dropping some logging into the project to try and track the problem down. It may not be happening where you think. There might also be a permissions error. Try adding some dummy data into the database; if that still fails, it points to a potential problem area.
But track it down yourself with some debug, logging and some code review, I bet you can get to the bottom of the problem sooner that way. And it will also help to make your code more robust.
You can use Blobs here. I think it's an issue with your web request size. You can change this setting in web.config by increasing the maxRequestLength attribute on the <httpRuntime> element. If you are sending chunks of 500 KB, then you are wasting bandwidth and bringing down performance; send bigger chunks of data, such as 1-2 MB per chunk. See my Silverlight or HTML5 based upload control for chunked uploads: Pick Your Azure File Upload Control: Silverlight and TPL or HTML5 and AJAX
Use the Blob Transfer Utility to download and upload all your blob files.
It's a tool to handle thousands of (small/large) blob transfers in an effective way.
Binaries and source code, here: http://bit.ly/blobtransfer
