Upload rolling files to Azure Blob

I'm trying to put together a system where tweets flow into Azure Blob storage through the Twitter streaming API. I was following a tutorial from Microsoft, but it ends with the following scenario:
# $memStream and $destBlob are created earlier in the tutorial
$writeStream = New-Object System.IO.StreamWriter $memStream
$count = 0
$lineMax = 1000000
$sReader = New-Object System.IO.StreamReader($response.GetResponseStream())
$inrec = $sReader.ReadLine()
while (($null -ne $inrec) -and ($count -le $lineMax))
{
    if ($inrec -ne "")
    {
        $writeStream.WriteLine($inrec)
        $count++   # caps the number of lines written; without this the $lineMax check never triggers
    }
    $inrec = $sReader.ReadLine()
}
$writeStream.Flush()
$memStream.Seek(0, "Begin")
$destBlob.UploadFromStream($memStream)
$sReader.Close()
Now the problem is that if I want to use this at a larger scale, I suspect the file will become too big to be sent to Azure in one go. What is the correct approach to this problem? Should I roll the files locally to disk and then send them to Azure?

You might want to check out the new Append Blob. This lets you create a blob and keep appending to it (from multiple locations if needed). Here's some how-to information that may help.
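For example, with a recent Azure Storage .NET client library loaded into PowerShell (append blobs need version 5.x or later of Microsoft.WindowsAzure.Storage), the read loop could append each line to a blob as it arrives instead of buffering everything in one memory stream. This is only a rough sketch: the DLL path, account credentials, and container name are placeholders, and $sReader is the same StreamReader as in the question.
# Rough sketch: append incoming lines to an append blob instead of uploading one big block blob.
# The DLL path, account name/key and container name below are placeholders.
Add-Type -Path "C:\packages\Microsoft.WindowsAzure.Storage.dll"

$account   = [Microsoft.WindowsAzure.Storage.CloudStorageAccount]::Parse("DefaultEndpointsProtocol=https;AccountName=ACCOUNT;AccountKey=KEY")
$client    = $account.CreateCloudBlobClient()
$container = $client.GetContainerReference("tweets")
$container.CreateIfNotExists() | Out-Null

$appendBlob = $container.GetAppendBlobReference("tweets.log")
if (-not $appendBlob.Exists()) { $appendBlob.CreateOrReplace() }

$inrec = $sReader.ReadLine()
while ($null -ne $inrec)
{
    if ($inrec -ne "")
    {
        # Each append adds a block to the end of the existing blob; nothing is re-uploaded.
        $appendBlob.AppendText($inrec + "`n")
    }
    $inrec = $sReader.ReadLine()
}
In practice you would still batch the writes (for example, buffer a few thousand lines per append), since every append call is a separate request against storage.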

Related

Write & save an xlsx file located in blob storage via PowerShell

I have an xlsx file in Azure Blob storage. Now I want to access this xlsx file, make some edits, and save it back. I have done this locally, but I don't know how to do it when the file is in blob storage. The following code is what I use locally. Note: I don't want to save the file to my local drive first and then edit it; I want to directly edit it and save it via PowerShell.
$excel = New-Object -ComObject Excel.Application   # Excel COM object (omitted in the original snippet)
$workbook = $excel.Workbooks.Open("C:\Users\jubaiaral\OneDrive - BMW\Documents\Book1.xlsx")
$sheet = $workbook.Worksheets | Where-Object { $_.Name -eq 'Sheet1' }
# ...my edits come here...
$workbook.Save()
$excel.Quit()
Thanks in advance!
"I want to directly edit it and save it via powershell."
I don't think you can directly update the file and save it using PowerShell. You can:
Copy the file
Do the work
Copy it back to storage
A rough sketch of this copy/edit/copy-back flow is shown at the end of this answer. If you want to do it in place, you would use Spark (Azure Synapse / Azure Databricks).
This may be helpful: https://community.databricks.com/s/feed/0D53f00001HKHeOCAX
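A rough sketch of that copy/edit/copy-back flow, using the Az.Storage cmdlets plus the Excel COM object from the question, might look like the following; the storage account, key, container, and blob names are placeholders, and it assumes Excel is installed on the machine running the script.
# Rough sketch: download the workbook, edit it locally, upload it back (names below are placeholders).
$context  = New-AzStorageContext -StorageAccountName "mystorageaccount" -StorageAccountKey "KEY"
$tempFile = Join-Path $env:TEMP "Book1.xlsx"

# 1. Copy the file out of blob storage
Get-AzStorageBlobContent -Container "mycontainer" -Blob "Book1.xlsx" -Destination $tempFile -Context $context -Force | Out-Null

# 2. Do the work (requires Excel on this machine)
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
$workbook = $excel.Workbooks.Open($tempFile)
$sheet = $workbook.Worksheets | Where-Object { $_.Name -eq 'Sheet1' }
# ...edits go here...
$workbook.Save()
$excel.Quit()

# 3. Copy it back to storage, overwriting the original blob
Set-AzStorageBlobContent -File $tempFile -Container "mycontainer" -Blob "Book1.xlsx" -Context $context -Force | Out-Null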

How to make an IIS Site/Content Backup?

I want to learn ASP.NET, and to that end I've been reading some basic tutorials on managing the IIS web server. I'm wondering how I could make a full backup of my site (configuration and content). I'm running the IIS server on a Hyper-V Windows Server 2012 R2 Core machine and administering it over PowerShell Remoting.
On the Internet I found an article about some basic stuff (see here).
This article said I can make a full backup of my IIS configuration and content with
Backup-WebConfiguration -Name "My Backup"
and restore it with
Restore-WebConfiguration -Name "My Backup"
The problem is: it seems it really only backs up the configuration and not the content. For example, it restores the websites under IIS:\Sites but not the physical content, such as an application folder and the default.htm inside it. If I delete the default.htm and the folders and then run Restore-WebConfiguration, it still does not restore them, only the web configuration itself.
From the article I assumed it would also restore the content...
Did I do something wrong? How can I do what I want "from scratch", without scripts from MS Web Deploy 3.0?
Thanks for the help and best regards,
Backup-WebConfiguration only backs up the configuration items detailed in the applicationHost.config file. It does not deal with the actual content, just how that content is handled by IIS.
Doing both is easy enough. Here's a quick function that creates a zip file of your content (just pass in the path to your inetpub directory) and backs up the configuration; it requires PowerShell v3 or higher. The configuration backup automatically gets a creation date (you can see a list of your backups with Get-WebConfigurationBackup), so the function adds the date to the zip file name as well so the two can be matched up.
If you're making more than one backup on the same day, you'll need to tweak the name of the compressed file, as it only includes the date.
function Backup-WebServer
{
    [CmdletBinding()]
    Param
    (
        # Path to the content directory to archive (e.g. your inetpub folder)
        [Parameter(Mandatory=$true,ValueFromPipelineByPropertyName=$true)]
        [ValidateNotNullOrEmpty()]
        [ValidateScript({Test-Path $_})]
        [string]$Source,

        # Folder that will receive the zip file
        [Parameter(Mandatory=$true,ValueFromPipelineByPropertyName=$true)]
        [ValidateNotNullOrEmpty()]
        [ValidateScript({Test-Path $_})]
        [string]$Destination,

        # Name to give the IIS configuration backup
        [Parameter(Mandatory=$true,ValueFromPipelineByPropertyName=$true)]
        [ValidateNotNullOrEmpty()]
        [string]$ConfigurationName
    )

    Add-Type -Assembly System.IO.Compression.FileSystem
    Import-Module WebAdministration

    # Zip the content directory into "Inetpub-Backup <date>.zip" in the destination folder
    $compressionLevel = [System.IO.Compression.CompressionLevel]::Optimal
    $date = $(Get-Date -Format d).Replace('/','-')
    $fileName = "Inetpub-Backup $date.zip"
    $inetpubBackup = Join-Path -Path $Destination -ChildPath $fileName
    [System.IO.Compression.ZipFile]::CreateFromDirectory($Source,$inetpubBackup,$compressionLevel,$false)

    # Back up the IIS configuration (applicationHost.config) under the given name
    Backup-WebConfiguration -Name $ConfigurationName
}
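For example, to zip the content and take a configuration backup in one call (the paths and backup name here are just examples):
Backup-WebServer -Source "C:\inetpub" -Destination "D:\Backups" -ConfigurationName "My Backup"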

Enable Cache On Azure CDN

I am setting up Azure CDN, and having trouble setting the Cache-Control header.
I used Cloudberry Explorer to setup a sync between my server folders and the CDN. This is working well. All my files were copied to the CDN with no problem.
Under Tools > Http Headers > Edit Http Header I set the value for Cache-Control to be: public,max-age=604800
However, this does not appear to be having any effect (according to both Fiddler and Page Speed).
Any tips on setting the Cache-Control header for the Azure CDN would be GREATLY appreciated.
I had this issue myself and needed to update the Cache-Control header on thousands of files. To prevent caching issues in sites, I re-deploy these files with every release to a new path.
I was able to patch together some different suggestions online and ultimately landed on the following solution, which I currently use for deploying one of my production apps.
You need two files, and the script assumes they're in the same directory on your computer:
A text file with a listing of the files in the container (see example below)
The PowerShell script
The Text File (file-list.txt)
The file should be in the example format below, with the full file path as deployed to the CDN container. Note that it uses forward slashes and should not include the container name, since the container name is supplied in the script. The name of this text file is referenced in the PowerShell script below.
v12/app/app.js
v12/app/app.min.js
v12/app/app.min.js.map
v12/app/account/signup.js
v12/app/account/signup.min.js
... (and so on)
The Script (cdn-cache-control.ps1)
The full script is below. You'll need to replace the constants like STORAGE_ACCOUNT_NAME and STORAGE_KEY, and you may need to update the path to the Azure SDK DLL if you have a different version. There are also two possible implementations of $blobClient; I repurposed some of this code from a source online, and the uncommented one works for me.
The key difference between what I have here and what you'll find online is the inclusion of $blob.FetchAttributes(). Without explicitly calling this method, most of the blob properties, like Content-Type and Last-Modified, are loaded into memory as empty/default values; when $blob.SetProperties() is then called, those empty values blow away the existing ones in the CDN, causing files to be served without a Content-Type, among other problems.
Add-Type -Path "C:\Program Files\Microsoft SDKs\Azure\.NET SDK\v2.9\bin\Microsoft.WindowsAzure.StorageClient.dll"
$accountName = "STORAGE_ACCOUNT_NAME"
$accountKey = "STORAGE_KEY"
$blobContainerName = "STORAGE_CONTAINER_NAME"
$storageCredentials = New-Object Microsoft.WindowsAzure.StorageCredentialsAccountAndKey -ArgumentList $accountName,$accountKey
$storageAccount = New-Object Microsoft.WindowsAzure.CloudStorageAccount -ArgumentList $storageCredentials,$true
#$blobClient = $storageAccount.CreateCloudBlobClient()
$blobClient = [Microsoft.WindowsAzure.StorageClient.CloudStorageAccountStorageClientExtensions]::CreateCloudBlobClient($storageAccount)
$cacheControlValue = "max-age=31556926"
echo "Setting cache control: $cacheControlValue"
Get-Content "file-list.txt" | foreach {
$blobName = "$blobContainerName/$_".Trim()
$blob = $blobClient.GetBlobReference($blobName)
$blob.FetchAttributes()
$blob.Properties.CacheControl = $cacheControlValue
$blob.SetProperties()
echo $blobName
}
It was tricky to find information about mass-setting the Cache-Control header but I've run this script for multiple production releases with great success. I've verified the configuration of the header as well, and routinely run Google's PageSpeed Insights against my site to verify.
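If you're on the newer Az PowerShell modules rather than the classic storage client DLL, the same idea can be sketched with Get-AzStorageBlob. This is only a rough equivalent, assuming an Az.Storage version that still exposes the ICloudBlob property on the objects returned by Get-AzStorageBlob; the account, key, and container names are placeholders.
# Rough equivalent using the Az.Storage module (assumes ICloudBlob is still exposed).
Import-Module Az.Storage

$context = New-AzStorageContext -StorageAccountName "STORAGE_ACCOUNT_NAME" -StorageAccountKey "STORAGE_KEY"
$cacheControlValue = "max-age=31556926"

Get-Content "file-list.txt" | ForEach-Object {
    $blobName = $_.Trim()
    $blob = Get-AzStorageBlob -Container "STORAGE_CONTAINER_NAME" -Blob $blobName -Context $context
    # Get-AzStorageBlob populates the existing properties, so they are not blanked out on save.
    $blob.ICloudBlob.Properties.CacheControl = $cacheControlValue
    $blob.ICloudBlob.SetProperties()
    Write-Output $blobName
}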

Lucene.NET and storing data on Azure Blob Storage

I'm asking this question specifically because I don't want to use the AzureDirectory project. I am just trying something on my own.
cloudStorageAccount = CloudStorageAccount.Parse("DefaultEndpointsProtocol=http;AccountName=xxxx;AccountKey=xxxxx");
blobClient = cloudStorageAccount.CreateCloudBlobClient();
List<CloudBlobContainer> containerList = new List<CloudBlobContainer>();
IEnumerable<CloudBlobContainer> containers = blobClient.ListContainers();

// Used to test connectivity
if (containers != null)
{
    foreach (var item in containers)
    {
        Console.WriteLine(item.Uri);
    }
}

// State the file location of the index
string indexLocation = containers.Last().Name.ToString();
Lucene.Net.Store.Directory dir = Lucene.Net.Store.FSDirectory.Open(indexLocation);

// Create an analyzer to process the text
Lucene.Net.Analysis.Analyzer analyzer = new Lucene.Net.Analysis.Standard.StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30);

// Create the index writer with the directory and analyzer defined
bool indexExists = Lucene.Net.Index.IndexReader.IndexExists(dir);
Lucene.Net.Index.IndexWriter indexWriter = new Lucene.Net.Index.IndexWriter(dir, analyzer, !indexExists, Lucene.Net.Index.IndexWriter.MaxFieldLength.UNLIMITED);

// Create a document and add the path and content as fields
Lucene.Net.Documents.Document doc = new Lucene.Net.Documents.Document();
string path = "D:\\try.html";
TextReader reader = new FilterReader("D:\\try.html");
doc.Add(new Lucene.Net.Documents.Field("url", path, Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.NOT_ANALYZED));
doc.Add(new Lucene.Net.Documents.Field("content", reader.ReadToEnd(), Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.ANALYZED));

indexWriter.AddDocument(doc);
indexWriter.Optimize();
indexWriter.Commit();
indexWriter.Close();
Now the issue is that after indexing completes, I am not able to see any files created inside the container. Can anybody help me out?
You're using the FSDirectory there, which is going to write files to the local disk.
You're passing it a list of containers in blob storage. Blob storage is a service made available over a REST API, and is not addressable directly from the file system. Therefore the FSDirectory is not going to be able to write your index to storage.
Your options are:
Mount a VHD disk on the machine, and store the VHD in blob storage. There are some instructions on how to do this here: http://blogs.msdn.com/b/avkashchauhan/archive/2011/04/15/mount-a-page-blob-vhd-in-any-windows-azure-vm-outside-any-web-worker-or-vm-role.aspx
Use the Azure Directory, which you refer to in your question. I have rebuilt the AzureDirectory against the latest storage SDK: https://github.com/richorama/AzureDirectory
Another alternative for people looking around: I wrote a directory implementation that uses the Azure shared cache (preview), which can be an alternative to AzureDirectory (albeit for bounded search sets):
https://github.com/ajorkowski/AzureDataCacheDirectory

Automating App Deployment in Azure with LocalResource

I'm currently attempting to automate the deployment of an application to an Azure worker role by pulling a file into the role from blob storage and working with it via a batch script, also located in blob storage. I'm using OnStart to accomplish this. Here's a reduced version of my OnStart method:
Getting ready to pull the files down:
public override bool OnStart()
{
    CloudStorageAccount storageAccount = CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
    CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
    CloudBlobContainer container = blobClient.GetContainerReference("mycontainer");
    container.CreateIfNotExist();
    CloudBlob file = container.GetBlobReference("file.bat");
Actually getting the files into the role:
    LocalResource localResource = RoleEnvironment.GetLocalResource("localStore");
    string filePath = System.IO.Path.Combine(localResource.RootPath, "file.bat");
    using (var fileStream = System.IO.File.OpenWrite(filePath))
    {
        file.DownloadToStream(fileStream);
    }
This is how I get the batch file and its dependencies into the role. My problem now is that I originally built the batch file with the assumption that the other files would be dropped right on C:\ (for example C:\installer.exe, C:\archive.zip, etc.), but now the files are in local storage.
I'm thinking I can either (A) somehow tell the batch file where local storage is by writing the script dynamically in OnStart, or (B) change local storage to use C:\.
I'm not sure how to do either, or what the best thing to do here would be. Thoughts?
I would not change the local storage resource to use C:\ (how would you do that anyway?). Take a look at Steve's blog post: Using a Local Storage Resource From a Startup Task. He explains how you can get a LocalResource using PowerShell (and even call that script from a batch file).
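One way to get at that path from PowerShell (not necessarily exactly what the blog post does) is to load the ServiceRuntime assembly and ask RoleEnvironment for the resource. A minimal sketch, assuming a local storage resource named localStore as in the question:
# Minimal sketch: resolve the LocalResource path from PowerShell inside the role.
# Assumes the service definition declares a local storage resource named "localStore".
[Reflection.Assembly]::LoadWithPartialName("Microsoft.WindowsAzure.ServiceRuntime") | Out-Null

$localPath = [Microsoft.WindowsAzure.ServiceRuntime.RoleEnvironment]::GetLocalResource("localStore").RootPath
Write-Output "Local resource root: $localPath"

# Hand the path to the batch file, e.g. as an argument
& "$localPath\file.bat" $localPath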
And why not use the Windows Azure Bootstrapper? This is a little tool that can help you with the configuration of your role without having to write any code; you simply call it from a startup task and it can download files (also from blob storage, as you're doing), work with local resources, and more:
bootstrapper.exe -get http://download.microsoft.com/download/F/3/1/F31EF055-3C46-4E35-AB7B-3261A303A3B6/AspNetMVC3ToolsUpdateSetup.exe -lr $lr(temp) -run $lr(temp)\AspNetMVC3ToolsUpdateSetup.exe -args /q
Note: instead of using absolute paths in your batch file, make it use relative paths based on %~dp0 (which expands to the directory containing the batch file).
