We have a requirement to upload large files, possibly 10 GB or even up to 50 GB, into SharePoint Online.
As per the new limit here, the maximum file upload size has been increased to 250 GB.
Can we upload a large file of 10 GB or 50 GB into SharePoint Online using an SPFx file upload or the SharePoint REST API?
If so, please guide me on the right approach. Do we need to split the file into chunks and upload it? If so, what is the maximum file chunk size?
I would recommend using Core.LargeFileUpload to upload large files. You can use the following link as a sample:
https://learn.microsoft.com/en-us/sharepoint/dev/solution-guidance/upload-large-files-sample-app-for-sharepoint
The recommended approach for this requirement is to use the CSOM library. The Microsoft Docs article titled Upload large files sample SharePoint Add-in contains useful examples showing how to upload large files using C#. Below is a code example that you can use in a console application:
public static void UploadDocumentContent(ClientContext ctx, string libraryName, string filePath)
{
    Web web = ctx.Web;
    // Using ContentStream (instead of loading the whole file into a byte array via the
    // Content property) avoids the exception thrown for files larger than 2 MB.
    using (System.IO.FileStream fs = System.IO.File.OpenRead(filePath))
    {
        FileCreationInformation newFile = new FileCreationInformation();
        newFile.ContentStream = fs;
        newFile.Url = System.IO.Path.GetFileName(filePath);
        newFile.Overwrite = true;
        // Get a reference to the given library.
        List docs = web.Lists.GetByTitle(libraryName);
        // Add the file to the library.
        Microsoft.SharePoint.Client.File uploadFile = docs.RootFolder.Files.Add(newFile);
        ctx.Load(uploadFile);
        ctx.ExecuteQuery();
    }
}
In the console application's Main method, call the method created above:
static void Main(string[] args)
{
    using (ClientContext ctx = new ClientContext("https://tenant.sharepoint.com/sites/business"))
    {
        // SharePointOnlineCredentials expects the password as a SecureString.
        System.Security.SecureString password = new System.Security.SecureString();
        foreach (char c in "your-password") password.AppendChar(c);
        ctx.Credentials = new SharePointOnlineCredentials("username@tenantdomain", password);
        UploadDocumentContent(ctx, "Documents", @"C:\FileFolder\FileName.ext");
    }
}
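To answer the chunking question directly: for files in the 10-50 GB range, the same Microsoft sample also demonstrates a sliced upload based on File.StartUpload, ContinueUpload and FinishUpload, where the file is sent in chunks so that no single request exceeds the per-request limit (commonly documented as 250 MB for SharePoint Online). Below is a condensed sketch of that pattern; the method name and the 10 MB chunk size are illustrative choices of mine, and it assumes the file is larger than a single chunk:
// Requires the same Microsoft.SharePoint.Client references as the example above.
public static void UploadFileInChunks(ClientContext ctx, string libraryName, string filePath, int chunkSizeInMB = 10)
{
    Guid uploadId = Guid.NewGuid();
    string fileName = System.IO.Path.GetFileName(filePath);
    List docs = ctx.Web.Lists.GetByTitle(libraryName);
    int blockSize = chunkSizeInMB * 1024 * 1024;
    long fileSize = new System.IO.FileInfo(filePath).Length;
    long bytesUploaded = 0;
    Microsoft.SharePoint.Client.File uploadFile = null;
    using (System.IO.FileStream fs = System.IO.File.OpenRead(filePath))
    {
        byte[] buffer = new byte[blockSize];
        int bytesRead;
        bool firstChunk = true;
        while ((bytesRead = fs.Read(buffer, 0, buffer.Length)) > 0)
        {
            bool lastChunk = bytesUploaded + bytesRead >= fileSize;
            using (var chunk = new System.IO.MemoryStream(buffer, 0, bytesRead))
            {
                if (firstChunk)
                {
                    // Create an empty placeholder file, then open the upload session
                    // with the first slice.
                    FileCreationInformation fci = new FileCreationInformation
                    {
                        ContentStream = new System.IO.MemoryStream(),
                        Url = fileName,
                        Overwrite = true
                    };
                    uploadFile = docs.RootFolder.Files.Add(fci);
                    ClientResult<long> offset = uploadFile.StartUpload(uploadId, chunk);
                    ctx.ExecuteQuery();
                    bytesUploaded = offset.Value;
                    firstChunk = false;
                }
                else if (!lastChunk)
                {
                    // Middle slices continue the upload session from the current offset.
                    ClientResult<long> offset = uploadFile.ContinueUpload(uploadId, bytesUploaded, chunk);
                    ctx.ExecuteQuery();
                    bytesUploaded = offset.Value;
                }
                else
                {
                    // The final slice commits the whole file.
                    uploadFile = uploadFile.FinishUpload(uploadId, bytesUploaded, chunk);
                    ctx.ExecuteQuery();
                    bytesUploaded += bytesRead;
                }
            }
        }
    }
}
It is called the same way as the method above, e.g. UploadFileInChunks(ctx, "Documents", @"C:\FileFolder\FileName.ext");.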
Related problem: a zip file containing CSV files generated from data appears to be corrupted after upload to Azure Blob Storage.
Before upload, the zip file opens correctly and everything works fine. The same zip file after upload is corrupted and can no longer be opened.
During the upload I use the Azure Storage Blob client library for Java (v12.7.0, but I have also tried previous versions). This is the code I use (similar to the example provided in the SDK readme):
public void uploadFileFromPath(String pathToFile, String blobName) {
BlobClient blobClient = blobContainerClient.getBlobClient(blobName);
blobClient.uploadFromFile(pathToFile);
}
When I download the uploaded file directly from Storage Explorer, it is already corrupted. What am I doing wrong?
Based on your description, I suggest you use the following uploadFromFile overload to upload your zip file:
public void uploadFromFile(String filePath, ParallelTransferOptions parallelTransferOptions, BlobHttpHeaders headers, Map<String,String> metadata, AccessTier tier, BlobRequestConditions requestConditions, Duration timeout)
We can use this method to set the content type. For example:
BlobHttpHeaders headers = new BlobHttpHeaders()
.setContentType("application/x-zip-compressed");
Integer blockSize = 4 * 1024 * 1024; // 4 MB
ParallelTransferOptions parallelTransferOptions = new ParallelTransferOptions(blockSize, null, null);
blobClient.uploadFromFile(pathToFile, parallelTransferOptions, headers, null, AccessTier.HOT, null, null);
For more details, please refer to the documentation.
Eventually it turned out to be my fault: I didn't close the ZipOutputStream before uploading the file. That isn't much of a problem when you use try-with-resources and only want to generate a local file, but in my case I wanted to upload the file to Blob Storage while still inside the try block. The file was incomplete (not closed), so it landed in storage with corrupted data. This is what I should have done from the very beginning:
private void addZipEntryAndDeleteTempCsvFile(String pathToFile, ZipOutputStream zipOut,
File file) throws IOException {
LOGGER.info("Adding zip entry: {}", pathToFile);
zipOut.putNextEntry(new ZipEntry(pathToFile));
try (FileInputStream fis = new FileInputStream(file)) {
byte[] bytes = new byte[1024];
int length;
while ((length = fis.read(bytes)) >= 0) {
zipOut.write(bytes, 0, length);
}
zipOut.closeEntry();
file.delete();
}
zipOut.close(); // missing part
}
Thank you for your help, @JimXu. I really appreciate it.
I'm currently using v2 of Azure Function Apps. I've set the environment to 64 bit and am compiling against .NET Standard 2.0. host.json specifies version 2.
I'm reading in a .csv file, and it works fine for smaller files. But when I read a 180 MB .csv into a List of string[], it balloons to over 1 GB on read, and when I try to parse it, usage climbs over 2 GB and then an OutOfMemoryException is thrown. Even running on an App Service plan with more than 3.5 GB of memory hasn't solved the issue.
Edit:
I'm using this:
Uri blobUri = AppendSasOnUri(blobName);
_webClient = new WebClient();
Stream sourceStream = _webClient.OpenRead(blobUri);
_reader = new StreamReader(sourceStream);
However, since it's a CSV, I'm splitting out entire columns of data. It's pretty hard to get away from this:
internal async Task<List<string[]>> ReadCsvAsync()
{
    while (!_reader.EndOfStream)
    {
        string[] currentCsvRow = await ReadCsvRowAsync();
        _fullBlobCsv.Add(currentCsvRow);
    }
    return _fullBlobCsv;
}
The goal is to store JSON into a blob when all is said and done.
Try using a stream (StreamReader) to read the input .csv file and process one line at a time, as in the sketch below.
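A minimal sketch of that idea, reusing the _reader field from the question; processRow is a hypothetical callback, and the key point is that rows are handled one at a time instead of being accumulated into _fullBlobCsv:
internal async Task StreamCsvAsync(Func<string[], Task> processRow)
{
    while (!_reader.EndOfStream)
    {
        string line = await _reader.ReadLineAsync();
        // Naive split; a real CSV with quoted fields needs a proper parser.
        string[] currentCsvRow = line.Split(',');
        // Handle one row, then let it go out of scope instead of adding it to a list.
        await processRow(currentCsvRow);
    }
}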
I'm able to parse 300 MB files on the Consumption plan with streams. My use case may not be the same, but it is similar: parse a large concatenated PDF file, separate it into 5,000+ smaller files, and store the separated files in a blob container. Below is my code for reference.
For your use case you may want to use CloudAppendBlob instead of CloudBlockBlob if you're pushing all parsed data into a single blob; see the sketch after the code below.
public async static void ExtractSmallerFiles(CloudBlockBlob myBlob, string fileDate, ILogger log)
{
using (var reader = new StreamReader(await myBlob.OpenReadAsync()))
{
CloudBlockBlob blockBlob = null;
var fileContents = new StringBuilder(string.Empty);
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
if (line.StartsWith("%%MS_SKEY_0000_000_PDF:"))
{
var matches = Regex.Match(line, @"%%MS_SKEY_0000_000_PDF: A(\d+)_SMFL_B1234_D(\d{8})_A\d+_M(\d{15}) _N\d+");
var smallFileDate = matches.Groups[2];
var accountNumber = matches.Groups[3];
var fileName = $"SmallerFiles/{smallFileDate}/{accountNumber}.pdf";
blockBlob = myBlob.Container.GetBlockBlobReference(fileName);
}
fileContents.AppendLine(line);
if (line.Equals("%%EOF"))
{
log.LogInformation($"Uploading {fileContents.Length} bytes to {blockBlob.Name}");
await blockBlob.UploadTextAsync(fileContents.ToString());
fileContents = new StringBuilder(string.Empty);
}
}
await myBlob.DeleteAsync();
log.LogInformation("Extracted Smaller files");
}
}
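Following up on the CloudAppendBlob suggestion above, here is a minimal sketch (using the same WindowsAzure.Storage SDK as the code above) that streams the CSV one line at a time and appends the converted output to a single append blob. The method name, the output blob name and the trivial one-column "conversion" are illustrative assumptions:
// Requires Microsoft.WindowsAzure.Storage.Blob, System.IO, System.Text and System.Threading.Tasks.
public static async Task ConvertCsvToJsonBlobAsync(CloudBlockBlob csvBlob, CloudBlobContainer outputContainer)
{
    CloudAppendBlob jsonBlob = outputContainer.GetAppendBlobReference("output/result.json");
    await jsonBlob.CreateOrReplaceAsync();
    var batch = new StringBuilder();
    using (var reader = new StreamReader(await csvBlob.OpenReadAsync()))
    {
        while (!reader.EndOfStream)
        {
            string line = await reader.ReadLineAsync();
            if (string.IsNullOrEmpty(line)) continue;
            string[] columns = line.Split(',');
            // Replace this with real serialization (e.g. JsonConvert.SerializeObject).
            batch.AppendLine("{\"firstColumn\":\"" + columns[0] + "\"}");
            // Flush in ~1 MB batches; each append call creates one block,
            // and an append blob allows at most 50,000 committed blocks.
            if (batch.Length > 1024 * 1024)
            {
                await jsonBlob.AppendTextAsync(batch.ToString());
                batch.Clear();
            }
        }
    }
    if (batch.Length > 0)
    {
        await jsonBlob.AppendTextAsync(batch.ToString());
    }
}
The important point is that neither the whole CSV nor the whole JSON output is ever held in memory at once.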
I want the file storage specifically, not the blob storage (I think). This is code for my Azure Function, and I just have a bunch of stuff in my node_modules folder.
What I would like to do is zip the entire app, upload that, and have Azure unpack it into a given folder. Is this possible?
Right now I'm essentially iterating over all of my files and calling:
var fileStream = new stream.Readable();
fileStream.push(myFileBuffer);
fileStream.push(null);
fileService.createFileFromStream('taskshare', 'taskdirectory', 'taskfile', fileStream, myFileBuffer.length, function(error, result, response) {
if (!error) {
// file uploaded
}
});
And this works, it's just too slow. So I'm wondering if there is a faster way to upload a bunch of files for use in apps.
If the Microsoft Azure Storage Data Movement Library is acceptable, please give it a try. The library is designed for high-performance uploading, downloading, and copying of Azure Storage blobs and files, and it is based on the core data movement framework that powers AzCopy.
We can also get the demo code from the GitHub documentation:
string storageConnectionString = "myStorageConnectionString";
CloudStorageAccount account = CloudStorageAccount.Parse(storageConnectionString);
CloudBlobClient blobClient = account.CreateCloudBlobClient();
CloudBlobContainer blobContainer = blobClient.GetContainerReference("mycontainer");
blobContainer.CreateIfNotExists();
string sourcePath = "path\\to\\test.txt";
CloudBlockBlob destBlob = blobContainer.GetBlockBlobReference("myblob");
// Setup the number of the concurrent operations
TransferManager.Configurations.ParallelOperations = 64;
// Setup the transfer context and track the upload progress
SingleTransferContext context = new SingleTransferContext();
context.ProgressHandler = new Progress<TransferStatus>((progress) =>
{
Console.WriteLine("Bytes uploaded: {0}", progress.BytesTransferred);
});
// Upload a local blob
var task = TransferManager.UploadAsync(
sourcePath, destBlob, null, context, CancellationToken.None);
task.Wait();
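Since the question asks about Azure Files (the taskshare/taskdirectory share) rather than Blob storage, the Data Movement Library can, as far as I know, also target CloudFile/CloudFileDirectory destinations and upload a whole local folder in one call. A sketch under that assumption, reusing the share and directory names from the question; appPath is a hypothetical local folder:
// Requires Microsoft.WindowsAzure.Storage, Microsoft.WindowsAzure.Storage.File
// and Microsoft.WindowsAzure.Storage.DataMovement.
string storageConnectionString = "myStorageConnectionString";
string appPath = @"C:\path\to\app";   // hypothetical local folder (e.g. containing node_modules)
CloudStorageAccount account = CloudStorageAccount.Parse(storageConnectionString);
CloudFileClient fileClient = account.CreateCloudFileClient();
CloudFileShare share = fileClient.GetShareReference("taskshare");
share.CreateIfNotExists();
CloudFileDirectory destDir = share.GetRootDirectoryReference().GetDirectoryReference("taskdirectory");
destDir.CreateIfNotExists();
// Setup the number of concurrent operations
TransferManager.Configurations.ParallelOperations = 64;
// Upload the whole directory, including subfolders, and track progress
UploadDirectoryOptions options = new UploadDirectoryOptions { Recursive = true };
DirectoryTransferContext context = new DirectoryTransferContext();
context.ProgressHandler = new Progress<TransferStatus>((progress) =>
{
    Console.WriteLine("Bytes uploaded: {0}", progress.BytesTransferred);
});
var task = TransferManager.UploadDirectoryAsync(appPath, destDir, options, context, CancellationToken.None);
task.Wait();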
I am using the code below to upload a file to a SharePoint 2010 library:
String fileToUpload = @"C:\YourFile.txt";
String sharePointSite = "http://yoursite.com/sites/Research/";
String documentLibraryName = "Shared Documents";
using (SPSite oSite = new SPSite(sharePointSite))
{
using (SPWeb oWeb = oSite.OpenWeb())
{
if (!System.IO.File.Exists(fileToUpload))
throw new FileNotFoundException("File not found.", fileToUpload);
SPFolder myLibrary = oWeb.Folders[documentLibraryName];
// Prepare to upload
Boolean replaceExistingFiles = true;
String fileName = System.IO.Path.GetFileName(fileToUpload);
// Upload document
using (FileStream fileStream = File.OpenRead(fileToUpload))
{
    SPFile spfile = myLibrary.Files.Add(fileName, fileStream, replaceExistingFiles);
    // Commit
    myLibrary.Update();
}
}
}
This worked well from my machine. But when I deployed it on the server and used the snippet to upload a file to the library from my machine, it gave an error: it could not find the file location (C:\YourFile.txt) on the local (client) machine.
When you run the code on the server, it runs under a different account (the app pool identity), which does not have permission to read the C: drive.
I don't know why you would want to read and upload a file from the same server; if you are simply testing the SharePoint object model, that is fine.
If you are expecting some other app or service to keep an updated file for SharePoint, the file should be moved to the web directory, i.e. \wwwroot\wss\VirtualDirectories\80, and then you can use your code to read it and update your document library (myLibrary) as you are doing.
Are you running this in a console app or "in SharePoint"?
Could it be that the account running the code doesn't have read permissions on C:\?
I am facing the error "the remote server returned an error: 400 bad request" while uploading files to Azure as block blobs. The strange thing is that sometimes the code works for uploading a particular file, and sometimes it fails for the same file.
My code looks like this:
List<string> blockIdList = new List<string>();
using (var file = new FileStream(_path, FileMode.Open, FileAccess.Read))
{
int blockId = 0;
int blockSize = 4096;
// open file
while (file.Position < file.Length)
{
// calculate buffer size (blockSize in KB)
long bufferSize = blockSize * 1024 < file.Length - file.Position ? blockSize * 1024 : file.Length - file.Position;
byte[] buffer = new byte[bufferSize];
// read data to buffer
file.Read(buffer, 0, buffer.Length);
// save data to memory stream and put to storage
using (var stream = new MemoryStream(buffer))
{
// set stream position to start
stream.Position = 0;
// convert block id to a Base64 encoded string
var blockIdBase64 = Convert.ToBase64String(System.BitConverter.GetBytes(blockId));
blockBlob.PutBlock(blockIdBase64, stream, null);
blockIdList.Add(blockIdBase64);
// increase block id
blockId++;
}
}
blockBlob.PutBlockList(blockIdList);
file.Close();
}
I don't know why this error is thrown and am looking for a possible solution.
Thanks
One thing I noticed is that you're using an integer value as the block id. This could be one reason why your upload is failing, because all block ids must have the same length. Your upload code would work if the file is split into 10 blocks (block ids 0-9), but if the file is split into more than 10 blocks, the upload would fail.
My recommendation would be to pad the string with zeros so that all block ids have the same length. Since you can split a blob into a maximum of 50,000 blocks, blockId.ToString("d6") should do the trick, as in the sketch below.
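A minimal sketch of that padding, dropped into the question's loop in place of the BitConverter line (System.Text is assumed to be available):
// Fixed-width id ("000000", "000001", ...) so every Base64-encoded block id has the same length.
string paddedBlockId = blockId.ToString("d6");
var blockIdBase64 = Convert.ToBase64String(System.Text.Encoding.UTF8.GetBytes(paddedBlockId));
blockBlob.PutBlock(blockIdBase64, stream, null);
blockIdList.Add(blockIdBase64);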
You may also find this blog post useful: http://gauravmantri.com/2013/05/18/windows-azure-blob-storage-dealing-with-the-specified-blob-or-block-content-is-invalid-error/.
I too was facing this problem. I gave a few incorrect parameters to the AzCopy command, and that was it: every new AzCopy run I issued started giving that frustrating error. I looked up a bunch of material on the internet, including Gaurav Mantri's blog post, which talks about 'committing' uncommitted blocks.
One easy way I found to purge every block from a container was to use the "Azure Storage Explorer" tool. It displays all the blocks; I just selected them and deleted them all. After that cleanup, my AzCopy runs worked peacefully!
(Note that these invalid or uncommitted blocks do not show up in the Azure management portal; I wonder why the Azure team doesn't support that directly. It is quite a pain.)