Azure Storage - Exposing A Container With A File Structure

I'm trying to upload a folder structure to a private Azure blob storage container. The folder contains an html file, which references css and javascript files elsewhere within that same folder structure.
When the html file is browsed to, I'm expecting it to retrieve the css and javascript files, just like you'd see if it were an html file that was part of a website.
I can make this work if the container is completely public, but I'm struggling to achieve the same results when the container is private and I supply a SAS token.
Suppose the container contains an html file called "main.html" and a css file called "css/mystyles.css" (main.html will have the link tag for the css file pointing to the following relative url "css/mystyles.css").
If I create a SAS token to the container (Let's just call it "mySAS" for simplicity), then navigate to the main.html file, appending the SAS token like so:
https://my-storage-account.blob.core.windows.net/container-name/main.html?mySAS
The main.html file will load correctly because the SAS token is appended at the end; however, the request for the css file will not include the token, so it will return a 404.
I think I already know the answer, but is it even technically possible to store and present an html file and all of its associated files without putting them in a public container?
I should note that modifying the paths specified in the html file is not an option, as they're files I don't control, so I don't know what they'll look like ahead of time.
There are a few (very messy and undesirable) hybrid solutions, such as placing it in a private container, making the container public on demand, and switching it back to private after a time.
Or a more extreme hybrid: store it in a private container (which is never exposed) and, whenever it's requested, copy the contents to a short-lived public container (reducing the risk that someone notes down the public container from the first scenario and then accesses it again later, when it's perhaps not intended to be available to them).
I'd really rather stick with a private container and SAS token solution if it's at all possible.

If your question is about reusing a SAS token: You can't. You have to generate a URL+SAS combination, and the SAS is a hash based on the URL. You cannot just create a SAS token once and then append it to a URL. Otherwise, there's nothing stopping someone from guessing URLs and appending one URL's SAS to another URL.
If the goal is to have a static site, with no backend logic, then you need to have your content public (or you have to pre-generate all of your URLs to have SAS properly appended).
If you want dynamic access (e.g. you have no idea what content will be served up), you'd need to have some type of backend app server which serves content (e.g. returns a new html page with appropriate SAS-based links embedded in the various tags).
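To make the last approach concrete: the backend would generate a blob-level SAS for each asset and embed the resulting URLs in the html it returns. A minimal sketch using the @azure/storage-blob package (the account name, key source, one-hour expiry and read-only permission are placeholder choices, not something from the question):

import {
  StorageSharedKeyCredential,
  generateBlobSASQueryParameters,
  BlobSASPermissions,
} from "@azure/storage-blob";

// Placeholder credentials - substitute your own storage account name and key.
const accountName = "my-storage-account";
const credential = new StorageSharedKeyCredential(accountName, process.env.STORAGE_KEY ?? "");

// Build a read-only SAS scoped to a single blob, valid for one hour.
function sasUrlFor(containerName: string, blobName: string): string {
  const sas = generateBlobSASQueryParameters(
    {
      containerName,
      blobName,
      permissions: BlobSASPermissions.parse("r"),
      expiresOn: new Date(Date.now() + 60 * 60 * 1000),
    },
    credential
  ).toString();
  return `https://${accountName}.blob.core.windows.net/${containerName}/${blobName}?${sas}`;
}

// The backend would rewrite each relative reference in the served html with a URL like this:
console.log(sasUrlFor("container-name", "css/mystyles.css"));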

Related

How to force browser to download public asset from GCP Storage Bucket url?

I have assets from Wordpress being uploaded to a GCP Storage bucket. But when I then list links to these assets on the website I'm working on, I'd like the file to download automatically when the user clicks the link, instead of being viewed in the browser.
Is there an "easy" way to implement this behaviour?
The project is running with Wordpress as headless API, and Next.js frontend.
Thanks
You can change object metadata for your objects in Cloud Storage to force browsers to download files directly, instead of previewing them. You can do this through the content-disposition property: setting it to attachment will cause the content to be downloaded directly.
I quickly tested downloading public objects with and without this property and can confirm the behavior: downloads do happen directly. The documentation explains how to quickly change the metadata for existing objects in your bucket. While it is not directly mentioned, you can use wildcards to apply metadata changes to multiple objects at the same time. For example, this command will apply the content-disposition property to all objects in the bucket:
gsutil setmeta -h "content-disposition:attachment" gs://BUCKET_NAME/**
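If you'd rather set this from code (for example at upload time from the Node backend) than with gsutil, the Cloud Storage Node client exposes the same metadata field. A rough sketch, where the bucket and object names are placeholders:

import { Storage } from "@google-cloud/storage";

const storage = new Storage();

// Mark a single object so browsers download it instead of previewing it.
async function forceDownload(bucketName: string, objectName: string): Promise<void> {
  await storage
    .bucket(bucketName)
    .file(objectName)
    .setMetadata({ contentDisposition: "attachment" });
}

forceDownload("BUCKET_NAME", "uploads/report.pdf").catch(console.error);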

How to generate sitemap on user-generated content site in express js?

I'm creating a user-generated content site using Express.js. How can I add the URLs of this generated content to the sitemap automatically?
These URLs also need to be removed from the sitemap when a user deletes their account or their content.
I tried the sitemap builder npm packages created for Express.js, but none of them worked the way I wanted, or their intended use didn't match mine.
I am unsure if I understood your question, so I assume the following:
Your users can generate new URLs that you want to publish in a sitemap.xml returned from a specific endpoint, right?
If so, I'd suggest using the sitemap.js package. However, this package still needs the list of URLs and the metadata you want to deliver.
You could just save the URLs and the metadata to a database table, the filesystem, or whatever data storage you use. Every time content is generated or deleted you also update your URLs list there.
Now, if someone accesses the sitemap endpoint, URLs are read from storage and sitemap.js generates an XML. Goal achieved!
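A rough sketch of such an endpoint with Express and the sitemap package (the hostname and the loadUserGeneratedUrls lookup are placeholders for whatever storage you actually use):

import express from "express";
import { SitemapStream, streamToPromise } from "sitemap";

const app = express();

// Hypothetical lookup - replace with a query against your own database or file storage.
async function loadUserGeneratedUrls(): Promise<{ url: string; lastmod?: string }[]> {
  return [{ url: "/posts/123", lastmod: "2020-01-01" }];
}

app.get("/sitemap.xml", async (_req, res) => {
  try {
    const stream = new SitemapStream({ hostname: "https://example.com" });
    (await loadUserGeneratedUrls()).forEach((entry) => stream.write(entry));
    stream.end();
    const xml = await streamToPromise(stream); // Buffer containing the generated XML
    res.header("Content-Type", "application/xml").send(xml.toString());
  } catch (err) {
    res.status(500).end();
  }
});

app.listen(3000);

Because the list is read from storage on every request, newly created content shows up and deleted content disappears without any extra sitemap bookkeeping.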

Is it possible to have a link to raw content of file in Azure DevOps

It's possible to generate a link to the raw content of a file in GitHub; is it possible to do the same with VSTS/DevOps?
Even after reading the existing answers, I still struggled with this a bit, so I wanted to leave a bit more of a thorough response.
As others have said, the pattern is (query split onto separate lines for ease of reading):
https://dev.azure.com/{{organization}}/{{project}}/_apis/sourceProviders/{{providerName}}/filecontents
?repository={{repository}}
&path={{path}}
&commitOrBranch={{commitOrBranch}}
&api-version=5.0-preview.1
But how do you find the values for these variables? If you go into your Azure DevOps, choose Repos > Files from the left navigation, and select a particular file, your current url should look something like this:
https://dev.azure.com/{{organization}}/{{project}}/_git/{{repository}}?path=%2Fpackage.json
You should use those values for organization, project, and repository. For path, you'll see a URL-encoded version of the unix file path. %2F is the URL encoding for /, so that path is actually just /package.json (a tool like Postman will do that encoding for you).
Commit or branch is pretty self-explanatory; you either know what you want for this value or you should use master. I have "hard-coded" the api version in the above url because that's what the documentation currently points to.
For the last variable, you need providerName. In short, you should probably use TfsGit. I got this value from looking through the list of source providers and looking for one with a value of true for supportedCapabilities.queryFileContents.
However, if you just request this URL you'll get a "203 Non-Authoritative Information" response back because you still need to authenticate yourself. Referring again to the same documentation, it says to use Basic auth with any value for the username and a personal access token for the password. You can create a personal access token at https://dev.azure.com/{{organization}}/_usersSettings/tokens; ensure that it has the Token Administration - Read & Manage permission.
If you're unfamiliar with this sort of thing, again Postman is super helpful with getting these requests working before you get into the code.
So if you have a repository with a src directory at the root, and you're trying to get the file contents of src/package.json, your URL should look something like:
https://dev.azure.com/{{organization}}/{{project}}/_apis/sourceProviders/TfsGit/filecontents?repository={{repository}}&commitOrBranch=master&api-version={{api-version}}&path=src%2Fpackage.json
And don't forget the basic auth!
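If you're calling it from code rather than Postman, the Basic auth header is just the base64 of ":<PAT>" (empty username). A small sketch assuming Node 18+ with its global fetch; the organization, project, repository and path values are placeholders:

// Fetch raw file contents from Azure DevOps with a PAT (placeholders throughout).
const organization = "my-org";
const project = "my-project";
const repository = "my-repo";
const pat = process.env.AZDO_PAT ?? "";

const url =
  `https://dev.azure.com/${organization}/${project}/_apis/sourceProviders/TfsGit/filecontents` +
  `?repository=${repository}` +
  `&path=${encodeURIComponent("/src/package.json")}` +
  `&commitOrBranch=master&api-version=5.0-preview.1`;

async function main(): Promise<void> {
  const response = await fetch(url, {
    headers: {
      // Basic auth: any (even empty) username, PAT as the password.
      Authorization: "Basic " + Buffer.from(":" + pat).toString("base64"),
    },
  });
  console.log(await response.text());
}

main().catch(console.error);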
Sure, here's the REST call needed:
GET https://feeds.dev.azure.com/{organization}/_apis/packaging/Feeds/{feedId}/packages/{packageId}?includeAllVersions={includeAllVersions}&includeUrls={includeUrls}&isListed={isListed}&isRelease={isRelease}&includeDeleted={includeDeleted}&includeDescription={includeDescription}&api-version=5.0-preview.1
https://learn.microsoft.com/en-us/rest/api/azure/devops/artifacts/artifact%20%20details/get%20package?view=azure-devops-rest-5.0#package
I was able to get the raw contents of a file using this URL.
GET https://dev.azure.com/{organization}/{project}/_apis/sourceProviders/{providerName}/filecontents?serviceEndpointId={serviceEndpointId}&repository={repository}&commitOrBranch={commitOrBranch}&path={path}&api-version=5.0-preview.1
I got this from here.
https://learn.microsoft.com/en-us/rest/api/azure/devops/build/source%20providers/get%20file%20contents?view=azure-devops-rest-5.0
You can obtain the raw URL using Chrome.
Turn on Developer tools and view the Network tab.
Navigate to view the required file in the DevOps portal (Content panel). Once the content view is visible, check the Network tab again and find the request whose URL starts with "Items?Path"; this is a JSON response which contains the required "url:" element.
Drag the filename from the attachments window and drop it into any other MS application to get the raw URL or linked filename.
Most answers address this well, but in the context of a public repo with anonymous access the API is different. Here is the one that works in such a scenario:
https://dev.azure.com/{{your_user_name}}/{{project_name}}/_apis/git/repositories/{{repo_name_encoded}}/items?scopePath={{path_to_your_file}}&api-version=6.0
This is the exact equivalent of the "raw" url provided by Github.
Another way that may be helpful if you want to quickly get the raw URL for a specific file that you are browsing:
install the browser extension named "Undisposition"
from the dot menu (top right) choose "Download": the file will open in a new browser tab from which you can copy the URL
(edit: unfortunately this will only work for file types that the browser knows how to open, otherwise it will still offer to download it...)
I am fairly new to this and had an issue accessing a raw file in an Azure DevOps Repo. It's straightforward in Github.
I wanted to download a file in CMD and BASH using Curl.
First I browsed to the file contents in the browser and made a note of the bold sections:
https://dev.azure.com/**myOrg**/_git/**myProjectName**?path=%2F**MyFileName.ps1**
I then constructed the URL similar to what #Zach posted above.
https://dev.azure.com/**myOrg**/**myProjectName**/_apis/sourceProviders/TfsGit/filecontents?repository=**myProjectName**&commitOrBranch=**master**&api-version=5.0-preview.1&path=%2F**MyFileName.ps1**
Now when I paste the above URL in the browser it displays the content in RAW form similar to GitHub.
The difference was that I had to set up a PAT (Personal Access Token) in my Azure DevOps account and then authenticate the URL in DOS/Bash; example below:
curl -u "<username>:<password>" "https://dev.azure.com/myOrg/myProjectName/_apis/sourceProviders/TfsGit/filecontents?repository=myProjectName&commitOrBranch=master&api-version=5.0-preview.1&path=%2FMyFileName.ps1" -# -L -o MyFileName.ps1

How can I build a Kentico media selector to return media file GUID when integrating with Azure storage?

We have a Kentico 9 instance with media library integrated with Azure blob storage. This means that Kentico's default media selector form control returns an absolute URL of the Azure blob. However, as well as the URL, I need to access the media file info object itself to get additional properties (such as file width).
In the past when using Kentico's own file storage I've been able to build a custom media selector and pull the media file GUID from the returned URL. However, this isn't possible when integrating with Azure storage. Does anyone have any ideas how I might get the file ID or GUID without building my own media selector from scratch?
How about using a custom form control with a UniSelector control to which you would pass all the files from your Azure media library?
You could get the files using something like:
// Look up the Azure-backed media library, then query its files, including the GUID column
var mediaLibrary = MediaLibraryInfoProvider.GetMediaLibraryInfo("MyAzureLibrary", "SiteName");
var mediaFiles = MediaFileInfoProvider.GetMediaFiles()
    .Columns("FileName", "FilePath", "FileGUID")
    .WhereEquals("FileLibraryID", mediaLibrary.LibraryID);
This way you'd get a "nice" dialog that lists all the files in a particular folder, and you could set up the UniSelector to store the GUIDs of those files instead of their paths.
The disadvantage of this is that you don't get the nice tree view as you do in the Media library. Once you have the GUID of a file, you can then reconstruct the full absolute URL.
If you wanted to have the tree view you could use the CMSTreeView control, but it is more complicated and you would probably need to place it inside a modal window so that it doesn't overflow other content. Modifying the built-in MediaSelector form control is not really possible because it's part of the source code.
Try to enable the following setting:
Content -> Media -> Security -> Check files permissions
In that case inserted media URLs should remain as permanent URLs (because the media handler needs to check the permissions) and you should be able to extract the GUID from the URL as you are used to.

How do I ensure static web site pages are fresh?

I have a static web site hosted on Amazon S3. I regularly update it. However I am finding out that many of the users accessing it are looking at a stale copy.
By the way, the site is http://cosi165a-f2016.s3-website-us-west-2.amazonaws.com and it's generated with a Ruby static site generator called nanoc (very nice by the way). It compiles the source material for the site (https://github.com/Coursegen/cosi165a-f2016) into the html, css, js and other files.
I assume that this has to do with the page freshness, and the fact that the browser is caching pages.
How do I ensure that my users see a fresh page?
One common technique is to keep track of the last timestamp when you updated static assets to S3, then use that timestamp as a querystring parameter in your html.
Like this:
<script src="//assets.com/app.min.js?1474399850"></script>
The browser will still cache that result, but if the timestamp changes, the browser will have to get a new copy.
The technique is called "cachebusting".
There's a grunt module if you use grunt: https://www.npmjs.com/package/grunt-cachebuster. It will calculate the hash of your asset's contents and use that as the filename.
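Outside of grunt, the same idea is easy to reproduce by hand at build time; a minimal sketch (the file paths and the query parameter name are arbitrary examples):

import { createHash } from "crypto";
import { readFileSync } from "fs";

// Append a short content hash so the URL changes whenever the file contents change.
function cacheBustedUrl(localPath: string, publicUrl: string): string {
  const hash = createHash("md5").update(readFileSync(localPath)).digest("hex").slice(0, 8);
  return `${publicUrl}?v=${hash}`;
}

// e.g. //assets.com/app.min.js?v=3f2a9c1b (the hash depends on the file contents)
console.log(cacheBustedUrl("build/app.min.js", "//assets.com/app.min.js"));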
