Lucene.NET and storing data on Azure Blob Storage

The question I am asking is specifically because I don't want to use the AzureDirectory project. I am just trying something on my own.
cloudStorageAccount = CloudStorageAccount.Parse("DefaultEndpointsProtocol=http;AccountName=xxxx;AccountKey=xxxxx");
blobClient=cloudStorageAccount.CreateCloudBlobClient();
List<CloudBlobContainer> containerList = new List<CloudBlobContainer>();
IEnumerable<CloudBlobContainer> containers = blobClient.ListContainers();
if (containers != null)
{
foreach (var item in containers)
{
Console.WriteLine(item.Uri);
}
}
/* Used to test connectivity
*/
//state the file location of the index
string indexLocation = containers.Last().Name.ToString();
Lucene.Net.Store.Directory dir =
Lucene.Net.Store.FSDirectory.Open(indexLocation);
//create an analyzer to process the text
Lucene.Net.Analysis.Analyzer analyzer = new
Lucene.Net.Analysis.Standard.StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30);
//create the index writer with the directory and analyzer defined.
bool findexExists = Lucene.Net.Index.IndexReader.IndexExists(dir);
Lucene.Net.Index.IndexWriter indexWritr = new Lucene.Net.Index.IndexWriter(dir, analyzer,!findexExists, Lucene.Net.Index.IndexWriter.MaxFieldLength.UNLIMITED);
//create a document, add in a single field
Lucene.Net.Documents.Document doc = new Lucene.Net.Documents.Document();
string path="D:\\try.html";
TextReader reader = new FilterReader("D:\\try.html");
doc.Add(new Lucene.Net.Documents.Field("url",path,Lucene.Net.Documents.Field.Store.YES,Lucene.Net.Documents.Field.Index.NOT_ANALYZED));
doc.Add(new Lucene.Net.Documents.Field("content",reader.ReadToEnd().ToString(),Lucene.Net.Documents.Field.Store.YES,Lucene.Net.Documents.Field.Index.ANALYZED));
indexWritr.AddDocument(doc);
indexWritr.Optimize();
indexWritr.Commit();
indexWritr.Close();
Now the issue is after indexing is completed I am not able to see any files created inside the container. Can anybody help me out?

You're using the FSDirectory there, which is going to write files to the local disk.
You're passing it the name of a blob container. Blob storage is a service exposed over a REST API and is not addressable directly from the file system, so FSDirectory cannot write your index there; the name is simply treated as a local directory path.
Your options are:
Mount a VHD disk on the machine, and store the VHD in blob storage. There are some instructions on how to do this here: http://blogs.msdn.com/b/avkashchauhan/archive/2011/04/15/mount-a-page-blob-vhd-in-any-windows-azure-vm-outside-any-web-worker-or-vm-role.aspx
Use the Azure Directory, which you refer to in your question. I have rebuilt the AzureDirectory against the latest storage SDK: https://github.com/richorama/AzureDirectory
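To see why no files show up in the container, it may help to look at how a bare name is resolved by file-system APIs. The sketch below uses only System.IO; the class name, method name, and the container name "mycontainer" are hypothetical, and it assumes FSDirectory treats its argument like any other relative path.

```csharp
using System;
using System.IO;

public static class LocalPathDemo
{
    // Resolves a bare name the way file-system APIs do:
    // relative to the current working directory of the process.
    public static string ResolveAsLocalPath(string name)
    {
        return Path.GetFullPath(name);
    }
}
```

Calling LocalPathDemo.ResolveAsLocalPath("mycontainer") returns a path underneath the process's working directory, which is most likely where the index files from the question ended up.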

Another alternative for people looking around: I wrote a directory that uses the Azure Shared Cache (preview), which can be an alternative to AzureDirectory (albeit for bounded search sets):
https://github.com/ajorkowski/AzureDataCacheDirectory

Related

Is there a way to write a spreadsheetlight excel file directly to blob storage?

I am porting some old application services code into Microsoft Azure and need a little help.
I am pulling in some data from a stored procedure and creating a SpreadsheetLight document (this was brought in from the old code since my users want to keep the extensive formatting that was already built into this process). That code works fine, but I need to write the file directly into our azure blob container without saving a local copy first (as this process will run in a pipeline). Using some sample code I found as a guide, I was able to get it working in debug by saving locally and then uploading that out to the blob storage... but now I need to find a way to remove that local save prior to the upload.
Well I actually stumbled on the solution.
string connectionString = "YourConnectionString";
string containerName = "YourContainerName";
string strFileNameXLS = "YourOutputFile.xlsx";
BlobServiceClient blobServiceClient = new BlobServiceClient(connectionString);
BlobContainerClient blobContainerClient = blobServiceClient.GetBlobContainerClient(containerName);
BlobClient blobClient = blobContainerClient.GetBlobClient(strFileNameXLS);
SLDocument doc = YourSLDocument();
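using (MemoryStream ms = new MemoryStream())
{
    // Write the workbook into memory instead of a local file.
    doc.SaveAs(ms);
    // Rewind so the upload reads from the start of the stream.
    ms.Position = 0;
    // The second argument (true) overwrites an existing blob of the same name.
    await blobClient.UploadAsync(ms, true);
    // No explicit Close() is needed; the using block disposes the stream.
}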
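The ms.Position = 0 line is easy to overlook, so here is a small sketch of what happens without it. It uses only System.IO; the class and method names are hypothetical, and a second MemoryStream stands in for any consumer that reads from the stream's current position.

```csharp
using System.IO;

public static class StreamRewindDemo
{
    // Writes the payload into a MemoryStream, optionally rewinds it,
    // and returns how many bytes a reader starting at the current
    // position would actually see.
    public static long BytesVisibleToReader(byte[] payload, bool rewind)
    {
        using (var ms = new MemoryStream())
        {
            ms.Write(payload, 0, payload.Length);
            if (rewind)
            {
                ms.Position = 0;
            }

            using (var sink = new MemoryStream())
            {
                // CopyTo reads from the current position to the end.
                ms.CopyTo(sink);
                return sink.Length;
            }
        }
    }
}
```

Without the rewind the reader sees zero bytes, because writing leaves the position at the end of the stream; with the rewind it sees the whole payload.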

Can blob versions be retained while moving or renaming?

I am using the latest .NET SDK for Azure Storage and enabled versions in Blob storage. I can upload, list all versions of blobs using my C# code. I would like to maintain the versions in case of a move or rename. Is it possible to do such a thing automatically? If not, is there any workaround that might help?
Figured it out using the docs. Here are the snippets of my code that worked:
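// Materialize the listing so it is enumerated only once (requires System.Linq).
var files = blobContainerClient.GetBlobs(BlobTraits.Metadata, BlobStates.Version, prefix: prefix).ToList();
foreach (BlobItem file in files)
{
    // BlobItem exposes Name and VersionId; address this specific version of the blob.
    var versionFile = blobContainerClient.GetBlobClient(file.Name).WithVersion(file.VersionId);
    string newFileName = file.Name.Split("/").Last();
    string newPath = $"{destFolderPath}/{newFileName}";
    BlobClient newFile = rootDir.GetBlobClient(newPath);
    // Copy from the version-specific URI so each version is copied, not just the current one.
    await newFile.StartCopyFromUriAsync(versionFile.Uri);
    fileCount++;
    // After the last version has been copied, delete the source blob.
    if (file == files.Last())
    {
        var rootFile = blobContainerClient.GetBlobClient(file.Name);
        rootFile.DeleteIfExists();
    }
}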
Removed some of the project specific code here so anyone looking to use this, please initialize appropriate classes.
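The renaming step in the snippet above is plain string handling, so it can be tried in isolation. This is a minimal sketch with hypothetical class and method names; trimming a trailing slash from the destination folder is an addition that is not in the original code.

```csharp
using System;
using System.Linq;

public static class BlobPathHelper
{
    // Keeps only the last path segment of the source blob name
    // and places it under the destination folder.
    public static string BuildDestinationName(string sourceName, string destFolderPath)
    {
        string fileName = sourceName.Split('/').Last();
        return destFolderPath.TrimEnd('/') + "/" + fileName;
    }
}
```

For example, BlobPathHelper.BuildDestinationName("reports/2023/a.txt", "archive") yields "archive/a.txt".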
Azure Blob Storage supports blob versioning natively; you can enable it by following this doc.
You can also maintain old blob versions by creating Blob snapshots.
If you are using .NET SDK, this official doc could be helpful.

If using ImageResizer with Azure blobs do I need the AzureReader2 plugin?

I'm working on a personal project to manage users of my club, it's hosted on the free Azure package (for now at least), partly as an experiment to try out Azure. Part of creating their records is to add a photo, so I've got a Contact Card view that lets me see who they are, when they came and a photo.
I have installed ImageResizer and it's really easy to resize the 10MP photos from my camera and save them to the file system locally, but it seems that for Azure I need to use their Blobs to Upload Pictures to Windows Azure Web Sites, and that's new to me. The documentation on ImageResizer says that I need to use AzureReader2 in order to work with Azure blobs but it isn't free. It also says in their best practices #5 to
Use dynamic resizing instead of pre-resizing your images.
Which is not what I was thinking, I was going to resize to 300x300 and 75x75 (for thumbnail) when creating the users record. But if I should be storing full size images as blobs and dynamically resizing on the way out then can I just use standard means to Upload a blob into a container to save it to Azure, then when I want to display the images use the ImageResizer and pass it each image to resize as required. That way not needing to use the AzureReader2, or have I misunderstood what it does / how it works?
Is there another way to consider?
I've not yet implemented cropping, but that's next to tackle when I've worked out how to actually store the images properly
With some trepidation, I'm going to disagree with astaykov here. I believe you CAN use ImageResizer with Azure WITHOUT needing AzureReader2. Maybe I should qualify that by saying 'It works on my setup' :)
I'm using ImageResizer in an MVC 3 application. I have a standard Azure account with an images container.
Here's my test code for the view:
@using (Html.BeginForm( "UploadPhoto", "BasicProfile", FormMethod.Post, new { enctype = "multipart/form-data" }))
{
<input type="file" name="file" />
<input type="submit" value="OK" />
}
And here's the corresponding code in the Post Action method:
// This action handles the form POST and the upload
[HttpPost]
public ActionResult UploadPhoto(HttpPostedFileBase file)
{
// Verify that the user selected a file
if (file != null && file.ContentLength > 0)
{
string newGuid = Guid.NewGuid().ToString();
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(ConfigurationManager.AppSettings["StorageConnectionString"]);
// Create the blob client.
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
// Retrieve reference to a previously created container.
CloudBlobContainer container = blobClient.GetContainerReference("images");
// Retrieve reference to the blob we want to create
CloudBlockBlob blockBlob = container.GetBlockBlobReference(newGuid + ".jpg");
// Populate our blob with contents from the uploaded file.
using (var ms = new MemoryStream())
{
ImageResizer.ImageJob i = new ImageResizer.ImageJob(file.InputStream,
ms, new ImageResizer.ResizeSettings("width=800;height=600;format=jpg;mode=max"));
i.Build();
blockBlob.Properties.ContentType = "image/jpeg";
ms.Seek(0, SeekOrigin.Begin);
blockBlob.UploadFromStream(ms);
}
}
// redirect back to the index action to show the form once again
return RedirectToAction("UploadPhoto");
}
This is 'rough and ready' code to test the theory and could certainly stand improvement but, it does work both locally and when deployed on Azure. I can also view the images I've uploaded, which are correctly re-sized.
Hope this helps someone.
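For anyone wondering what size to expect from "width=800;height=600;mode=max": as I understand the ImageResizer documentation, width and height are treated as upper bounds and the aspect ratio is preserved, without enlarging smaller images. The sketch below is my own approximation of that arithmetic, not ImageResizer's code; the class name and the rounding rule are assumptions.

```csharp
using System;

public static class FitWithin
{
    // Scales (srcW, srcH) down to fit inside (maxW, maxH) while keeping
    // the aspect ratio. Images already small enough are left unchanged.
    public static (int Width, int Height) Max(int srcW, int srcH, int maxW, int maxH)
    {
        double scale = Math.Min(1.0, Math.Min((double)maxW / srcW, (double)maxH / srcH));
        return ((int)Math.Round(srcW * scale), (int)Math.Round(srcH * scale));
    }
}
```

Under these assumptions a 4000x3000 landscape photo becomes 800x600, while a 3000x4000 portrait photo becomes 450x600.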
The answer to the concrete question:
If using ImageResizer with Azure blobs do I need the AzureReader2 plugin?
is YES. As described in ImageResizer's documentation, that plugin is used to read, process, and serve images out of Blob Storage. So there is no doubt: if you are going to use ImageResizer, AzureReader2 is the plugin you need. It will take care of blob uploads and serving.
I do question the ImageResizer team's competency on Windows Azure, though, since they reference Azure SDK v2 while the most current Azure SDK version is 1.8. What they mean is the Azure Storage Client Library, which has versions 1.7 and 2.x; version 2.x is the recommended one and comes with Azure SDK 1.8. So do not search for Azure SDK 2.0; install the latest one, which is 1.8. And by the way, use the NuGet Package Manager to install the Azure Storage Library v2.0.x.
You can also upload resized versions to azure. So, you first upload the original image as a blob, say with the name /original/xxx.jpg; then you create a resize of the image and upload that to azure with the name say /thumbnail/xxx.jpg. If you want to create the resized versions on the fly or on a separate thread, you may need to temporarily save the original to disk.

Automating App Deployment in Azure with LocalResource

I'm currently attempting to automate the deployment of an application to an Azure Worker role by pulling a file into the role from blob storage and working with it via a batch script, also located in blob storage. I'm using OnStart to accomplish this. Here's a reduced version of my OnStart method:
Getting ready to pull the files down:
public override bool OnStart()
{
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(CloudConfigurationManager.GetSetting("StorageConnectionString"));
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
CloudBlobContainer container = blobClient.GetContainerReference("mycontainer");
container.CreateIfNotExist();
CloudBlob file = container.GetBlobReference("file.bat");
Actually getting the files into the role:
LocalResource localResource = RoleEnvironment.GetLocalResource("localStore");
string filePath = System.IO.Path.Combine(localResource.RootPath, "file.bat");
using (var fileStream = System.IO.File.OpenWrite(@filePath))
{
file.DownloadToStream(fileStream);
}
This is how I get the batch file and the dependencies into the role. My problem now is - originally, I built the batch file with the assumption that the other files would be dropped right on C:\. For example - C:\installer.exe, C:\archive.zip, etc. But now the files are in localStorage.
I'm thinking I can either A) Somehow tell the batch file where localStorage is by dynamically writing the script onStart, or B) change localStorage to use C:\.
I'm not sure how to do either, or what the best thing to do here would be. Thoughts?
I would not change the LocalStorage to use C: (how would you do this anyways?). Take a look at Steve's blogpost: Using a Local Storage Resource From a Startup Task. He explains how you can get a LocalResource using powershell (and even call that script from a batch file).
And why not use the Windows Azure Bootstrapper? This is a little tool that can help you with the configuration of your role without having to write any code, you simply call it from a startup task and it can download files (also from blob storage like you're doing), work with local resources, ...
bootstrapper.exe -get http://download.microsoft.com/download/F/3/1/F31EF055-3C46-4E35-AB7B-3261A303A3B6/AspNetMVC3ToolsUpdateSetup.exe -lr $lr(temp) -run $lr(temp)\AspNetMVC3ToolsUpdateSetup.exe -args /q
Note: Instead of using absolute references in your batch file, make it use relative paths using %~dp0
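If you do want to go with option A from the question, one way to tell the batch file where the local storage lives is to pass the path in an environment variable when launching it from OnStart. This is only a sketch: the class name, method name, and the variable name LOCALSTORE are my own choices, and it assumes the script is started through cmd.exe.

```csharp
using System.Diagnostics;

public static class BatchLauncher
{
    // Builds the start info for running a batch file with the local
    // storage root exposed as the LOCALSTORE environment variable.
    public static ProcessStartInfo BuildStartInfo(string batchFilePath, string localStoreRoot)
    {
        var psi = new ProcessStartInfo("cmd.exe", "/c \"" + batchFilePath + "\"")
        {
            // Environment variables can only be customized when the
            // process is started directly rather than via the shell.
            UseShellExecute = false,
            WorkingDirectory = localStoreRoot
        };
        psi.EnvironmentVariables["LOCALSTORE"] = localStoreRoot;
        return psi;
    }
}
```

In OnStart this could be used as Process.Start(BatchLauncher.BuildStartInfo(filePath, localResource.RootPath)), and the batch file would then refer to %LOCALSTORE%\installer.exe instead of C:\installer.exe.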

How to create a sub container in azure storage location

How can I create a sub container in the azure storage location?
Windows Azure doesn't provide the concept of hierarchical containers, but it does provide a mechanism to traverse a hierarchy by convention and API. All containers are stored at the same level. You can gain similar functionality by using naming conventions for your blob names.
For instance, you may create a container named "content" and create blobs with the following names in that container:
content/blue/images/logo.jpg
content/blue/images/icon-start.jpg
content/blue/images/icon-stop.jpg
content/red/images/logo.jpg
content/red/images/icon-start.jpg
content/red/images/icon-stop.jpg
Note that these blobs are a flat list against your "content" container. That said, using "/" as a conventional delimiter lets you traverse them in a hierarchical fashion.
protected IEnumerable<IListBlobItem>
GetDirectoryList(string directoryName, string subDirectoryName)
{
CloudStorageAccount account =
CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
CloudBlobClient client =
account.CreateCloudBlobClient();
CloudBlobDirectory directory =
client.GetBlobDirectoryReference(directoryName);
CloudBlobDirectory subDirectory =
directory.GetSubdirectory(subDirectoryName);
return subDirectory.ListBlobs();
}
You can then call this as follows:
GetDirectoryList("content/blue", "images")
Note the use of the GetBlobDirectoryReference and GetSubdirectory methods and the CloudBlobDirectory type instead of CloudBlobContainer. These provide the traversal functionality you are likely looking for.
This should help you get started. Let me know if this doesn't answer your question.
[ Thanks to Neil Mackenzie for inspiration ]
Are you referring to blob storage? If so, the hierarchy is simply StorageAccount/Container/BlobName. There are no nested containers.
Having said that, you can use slashes in your blob name to simulate nested containers in the URI. See this article on MSDN for naming details.
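To make the "simulated" part concrete, here is a sketch of how a folder view can be derived from a flat list of blob names by splitting on "/". It runs purely on strings and does not call the storage API; the class and method names are hypothetical, and it only illustrates the convention rather than how the service implements it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VirtualDirectory
{
    // Returns the immediate children under a prefix: virtual folders
    // end with "/", blobs are returned with their full name.
    public static List<string> ListChildren(IEnumerable<string> blobNames, string prefix)
    {
        var children = new SortedSet<string>(StringComparer.Ordinal);
        foreach (string name in blobNames)
        {
            if (!name.StartsWith(prefix, StringComparison.Ordinal))
            {
                continue;
            }

            string rest = name.Substring(prefix.Length);
            int slash = rest.IndexOf('/');
            // No further slash: a blob directly under the prefix.
            // Otherwise: collapse everything below into one folder entry.
            children.Add(slash < 0 ? name : prefix + rest.Substring(0, slash + 1));
        }
        return children.ToList();
    }
}
```

With the names "blue/images/logo.jpg", "red/images/logo.jpg" and "readme.txt", listing the empty prefix gives the two virtual folders "blue/" and "red/" plus the blob "readme.txt", even though all three blobs sit side by side in the container.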
I agree with tobint's answer and want to add something, because I needed the same thing: uploading my game's HTML files to Azure Storage with the following directory structure:
Games\Beautyshop\index.html
Games\Beautyshop\assets\apple.png
Games\Beautyshop\assets\aromas.png
Games\Beautyshop\customfont.css
Games\Beautyshop\jquery.js
Following the recommendations here, I tried to upload my content with the Azure Storage Explorer tool; you can download the tool and its source code here: Azure Storage Explorer
First I tried to upload through the tool itself, but it doesn't support hierarchical directory uploads, because you don't need them: How to create sub directory in a blob container
Finally, I debugged the Azure Storage Explorer source code and edited the Background_UploadBlobs method and the UploadFileList field in the StorageAccountViewModel.cs file. You can edit it however you want. I may have made spelling errors, sorry; this is only my recommendation.
If you are trying to upload files from the Azure portal:
To create a sub folder in container, while uploading a file you can go to Advanced options and select upload to a folder, which will create a new folder in the container and upload the file into that.
Kotlin Code
val blobClient = blobContainerClient.getBlobClient("$subDirNameTimeStamp/$fileName$extension");
This creates a virtual directory named after the timestamp, with your blob file inside it. Note the slash (/) in the code above: everything before the slash becomes the folder name under which the blob is nested.
Sample code
string myfolder = "<folderName>";
string myfilename = "<fileName>";
string fileName = String.Format("{0}/{1}.csv", myfolder, myfilename);
CloudBlockBlob blob = container.GetBlockBlobReference(fileName);
