Access an Azure Blob storage account from Azure Data Factory

I have a folder with a list of files in my storage account and have been trying to delete one of the files using a pipeline. To get that done I used the "Web" activity in the pipeline, and copied in the blob storage URL and access keys.
I tried using the access keys directly under Headers | Authorization. I also tried the Shared Key scheme described at https://learn.microsoft.com/en-us/azure/storage/common/storage-rest-api-auth#creating-the-authorization-header
I even tried getting this to work with curl, but it returned an authentication error every time I ran it:
# List the blobs in an Azure storage container.
echo "usage: ${0##*/} <storage-account-name> <container-name> <access-key>"
storage_account="$1"
container_name="$2"
access_key="$3"
blob_store_url="blob.core.windows.net"
authorization="SharedKey"
request_method="DELETE"
request_date=$(TZ=GMT LC_ALL=en_US.utf8 date "+%a, %d %h %Y %H:%M:%S %Z")
#request_date="Mon, 18 Apr 2016 05:16:09 GMT"
storage_service_version="2018-03-28"
# HTTP Request headers
x_ms_date_h="x-ms-date:$request_date"
x_ms_version_h="x-ms-version:$storage_service_version"
# Build the signature string
canonicalized_headers="${x_ms_date_h}\n${x_ms_version_h}"
canonicalized_resource="/${storage_account}/${container_name}"
string_to_sign="${request_method}\n\n\n\n\n\n\n\n\n\n\n\n${canonicalized_headers}\n${canonicalized_resource}\ncomp:list\nrestype:container"
# Decode the Base64 encoded access key, convert to Hex.
decoded_hex_key="$(echo -n $access_key | base64 -d -w0 | xxd -p -c256)"
# Create the HMAC signature for the Authorization header
signature=$(printf "$string_to_sign" | openssl dgst -sha256 -mac HMAC -macopt "hexkey:$decoded_hex_key" -binary | base64 -w0)
authorization_header="Authorization: $authorization $storage_account:$signature"
curl \
  -H "$x_ms_date_h" \
  -H "$x_ms_version_h" \
  -H "$authorization_header" \
  -H "Content-Length: 0" \
  -X DELETE "https://${storage_account}.${blob_store_url}/${container_name}/myfile.csv_123"
The curl command returns an error:
<?xml version="1.0" encoding="utf-8"?><Error><Code>AuthenticationFailed</Code><Message>Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
RequestId:XX
Time:2018-08-09T10:09:41.3394688Z</Message><AuthenticationErrorDetail>The MAC signature found in the HTTP request 'xxx' is not the same as any computed signature. Server used following string to sign: 'DELETE

You cannot authorize directly from Data Factory to the storage account API. I suggest that you use a Logic App. Logic Apps have built-in support for Blob storage:
https://learn.microsoft.com/en-us/azure/connectors/connectors-create-api-azureblobstorage
You can call the Logic App from the Data Factory Web Activity. Using the body of the Data Factory request, you can pass variables to the Logic App, such as the blob path.
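For illustration, the Web Activity body might look something like this. The property names here are hypothetical; they must match whatever schema you define on the Logic App's HTTP trigger:
{
    "containerName": "landing",
    "blobPath": "myfile.csv_123"
}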

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.Rest;
using Microsoft.Azure.Management.ResourceManager;
using Microsoft.Azure.Management.DataFactory;
using Microsoft.Azure.Management.DataFactory.Models;
using Microsoft.IdentityModel.Clients.ActiveDirectory;
using Microsoft.WindowsAzure.Storage;

namespace ClearLanding
{
    class Program
    {
        static void Main(string[] args)
        {
            CloudStorageAccount backupStorageAccount = CloudStorageAccount.Parse("DefaultEndpointsProtocol=https;AccountName=yyy;AccountKey=xxx;EndpointSuffix=core.windows.net");
            var backupBlobClient = backupStorageAccount.CreateCloudBlobClient();
            var backupContainer = backupBlobClient.GetContainerReference("landing");
            var tgtBlobClient = backupStorageAccount.CreateCloudBlobClient();
            var tgtContainer = tgtBlobClient.GetContainerReference("backup");
            // Folder names arrive as a single comma- or space-separated argument.
            string[] folderNames = args[0].Split(new char[] { ',', ' ' }, StringSplitOptions.RemoveEmptyEntries);
            foreach (string folderName in folderNames)
            {
                var list = backupContainer.ListBlobs(prefix: folderName + "/", useFlatBlobListing: false);
                foreach (Microsoft.WindowsAzure.Storage.Blob.IListBlobItem item in list)
                {
                    if (item.GetType() == typeof(Microsoft.WindowsAzure.Storage.Blob.CloudBlockBlob))
                    {
                        Microsoft.WindowsAzure.Storage.Blob.CloudBlockBlob blob = (Microsoft.WindowsAzure.Storage.Blob.CloudBlockBlob)item;
                        if (!blob.Name.ToUpper().Contains("DO_NOT_DEL"))
                        {
                            // Copy the blob to the backup container with a timestamp suffix, then
                            // delete the original. Note that StartCopy is asynchronous on the service
                            // side; production code should poll CopyState before deleting the source.
                            var tgtBlob = tgtContainer.GetBlockBlobReference(blob.Name + "_" + DateTime.Now.ToString("yyyyMMddHHmmss"));
                            tgtBlob.StartCopy(blob);
                            blob.Delete();
                        }
                    }
                }
            }
        }
    }
}
I resolved this by compiling the above code and referencing it from a Custom Activity in the pipeline. The snippet transfers files from the landing folder to a backup folder and then deletes them from landing.
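For reference, a hypothetical invocation of the compiled console app: args[0] is a single argument that the code splits on commas and spaces, so the folder names are passed together, e.g.:
ClearLanding.exe folder1,folder2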

Related

How to upload files into Azure Blob Storage with Postman?

I am trying to upload a base64 string as an image file to Azure Blob Storage. I tried to create the blob following the https://learn.microsoft.com/en-us/rest/api/storageservices/put-blob documentation.
Request Syntax:
PUT https://myaccount.blob.core.windows.net/mycontainer/myblockblob HTTP/1.1
Request Headers:
x-ms-version: 2015-02-21
x-ms-date: <date>
Content-Type: text/plain; charset=UTF-8
x-ms-blob-content-disposition: attachment; filename="fname.ext"
x-ms-blob-type: BlockBlob
x-ms-meta-m1: v1
x-ms-meta-m2: v2
Authorization: SharedKey myaccount:YhuFJjN4fAR8/AmBrqBz7MG2uFinQ4rkh4dscbj598g=
Content-Length: 11
Request Body:
hello world
I am getting the response below:
<?xml
version="1.0" encoding="utf-8"?>
<Error>
<Code>AuthenticationFailed</Code>
<Message>Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
RequestId:a5d32623-f01e-0040-4275-c1880d000000
Time:2020-11-23T08:45:49.6994297Z</Message>
<AuthenticationErrorDetail>The MAC signature found in the HTTP request 'YhuFJjN4fAR8/AmBrqBz7MG2uFinQ4rkh4dscbj598g=' is not the same as any computed signature. Server used following string to sign: 'PUT
11
text/plain; charset=UTF-8
x-ms-blob-content-disposition:attachment; filename="demo.txt"
x-ms-blob-type:BlockBlob
x-ms-date:Mon, 23 Nov 2020 13:08:11 GMT
x-ms-encryption-key:YhuFJjN4fAR8/AmBrqBz7MG2uFinQ4rkh4dscbj598g=
x-ms-meta-m1:v1
x-ms-meta-m2:v2
x-ms-version:2015-02-21
/<myaccount>/<mycontainer>/<myblob>'.</AuthenticationErrorDetail>
</Error>
How to resolve this issue?
A simple way to upload a blob is to use a SAS token.
Navigate to the Azure portal -> your storage account -> Shared access signature, select the services, resource types, and permissions you need, then click the Generate SAS and connection string button.
Then copy the SAS token and append it to the URL. The new URL looks like this: https://myaccount.blob.core.windows.net/mycontainer/myblockblob?sv=2019-12-12&ss=b&srt=coxxxxx
Next, paste the new URL into Postman. In the Headers, you can remove the Authorization field entirely.
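Assuming a SAS token generated as above, the request from the question reduces to something like the following sketch; the sv/ss/srt/se/sig values are placeholders, and no Authorization header or signature computation is needed:
PUT https://myaccount.blob.core.windows.net/mycontainer/myblockblob?sv=2019-12-12&ss=b&srt=co&sp=rwdlacx&se=2020-11-30T00%3A00%3A00Z&sig=xxxx HTTP/1.1
x-ms-blob-type: BlockBlob
Content-Type: text/plain; charset=UTF-8
Content-Length: 11

hello world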
@sathishKumar
If you look closely at the article Authorize with Shared Key,
the syntax is as below:
Authorization="[SharedKey|SharedKeyLite] <AccountName>:<Signature>"
It is the signature that is passed along, not the account key.
The signature is a Hash-based Message Authentication Code (HMAC) constructed from the request, computed using the SHA256 algorithm, and then encoded using Base64.
There are detailed steps for constructing it in the document above.
I also came across a post with a PowerShell script that builds the signature string, which could be useful for you:
Sample Powershell Script
C# implementation:
// Requires: using System; using System.Net.Http; using System.Net.Http.Headers;
// using System.Security.Cryptography; using System.Text.
// GetCanonicalizedHeaders and GetCanonicalizedResource are helper methods from the
// same Microsoft sample and are not shown here.
internal static AuthenticationHeaderValue GetAuthorizationHeader(
    string storageAccountName, string storageAccountKey, DateTime now,
    HttpRequestMessage httpRequestMessage, string ifMatch = "", string md5 = "")
{
    // This is the raw representation of the message signature.
    HttpMethod method = httpRequestMessage.Method;
    String MessageSignature = String.Format("{0}\n\n\n{1}\n{5}\n\n\n\n{2}\n\n\n\n{3}{4}",
        method.ToString(),
        (method == HttpMethod.Get || method == HttpMethod.Head) ? String.Empty
            : httpRequestMessage.Content.Headers.ContentLength.ToString(),
        ifMatch,
        GetCanonicalizedHeaders(httpRequestMessage),
        GetCanonicalizedResource(httpRequestMessage.RequestUri, storageAccountName),
        md5);
    // Now turn it into a byte array.
    byte[] SignatureBytes = Encoding.UTF8.GetBytes(MessageSignature);
    // Create the HMACSHA256 version of the storage key.
    HMACSHA256 SHA256 = new HMACSHA256(Convert.FromBase64String(storageAccountKey));
    // Compute the hash of the SignatureBytes and convert it to a base64 string.
    string signature = Convert.ToBase64String(SHA256.ComputeHash(SignatureBytes));
    // This is the actual header that will be added to the list of request headers.
    AuthenticationHeaderValue authHV = new AuthenticationHeaderValue("SharedKey",
        storageAccountName + ":" + signature);
    return authHV;
}
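For context, here is a minimal sketch of how such a helper might be called from an async method. This usage is illustrative, not from the original post; the account, container, blob name, storageAccountKey variable, and x-ms-version value are all placeholders:
// Illustrative only: build a request, sign it with the helper above, and send it.
var request = new HttpRequestMessage(HttpMethod.Put,
    "https://myaccount.blob.core.windows.net/mycontainer/myblockblob");
request.Content = new StringContent("hello world");
DateTime now = DateTime.UtcNow;
request.Headers.Add("x-ms-date", now.ToString("R")); // RFC 1123 GMT date
request.Headers.Add("x-ms-version", "2017-07-29");   // example service version
request.Headers.Add("x-ms-blob-type", "BlockBlob");
request.Headers.Authorization = GetAuthorizationHeader(
    "myaccount", storageAccountKey, now, request);    // storageAccountKey: base64 account key
HttpResponseMessage response = await new HttpClient().SendAsync(request);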

How to get MD5 of file stored in ADLS Gen2?

I receive daily files through sFTP to an ADLS Gen2 storage account. I need to verify each file by checking the MD5 of the file stored in ADLS Gen2.
I tried using the Blob API, but it currently does not support ADLS Gen2. I was able to get the Content-MD5 from blob properties when a file is stored in Blob storage.
Can someone help with how to get the Content-MD5 from ADLS Gen2?
As of now, the Blob API is not supported as you know, but you can take a look at the Data Lake Storage Gen2 REST API -> Path - Get Properties, which can be used to fetch properties of files stored in ADLS Gen2.
Here is some sample code (note that I append a SAS token to the API URL):
using System;
using System.Net;

namespace ConsoleApp3
{
    class Program
    {
        static void Main(string[] args)
        {
            string sasToken = "?sv=2018-03-28&ss=b&srt=sco&sp=rwdl&st=2019-04-15T08%3A07%3A49Z&se=2019-04-16T08%3A07%3A49Z&sig=xxxx";
            string url = "https://xxxx.dfs.core.windows.net/myfilesys1/app.JPG" + sasToken;
            var req = (HttpWebRequest)WebRequest.CreateDefault(new Uri(url));
            req.Method = "HEAD";
            var res = (HttpWebResponse)req.GetResponse();
            Console.WriteLine("the status code is: " + res.StatusCode);
            var headers = res.Headers;
            Console.WriteLine("the count of the headers is: " + headers.Count);
            Console.WriteLine("*********");
            Console.WriteLine();
            // List all the headers if you don't know the exact name of the property you need.
            foreach (var h in headers.Keys)
            {
                Console.WriteLine(h.ToString());
            }
            Console.WriteLine("*********");
            Console.WriteLine();
            // Take the Content-Type property for example; Content-MD5 (if it is set on
            // the file) can be read the same way.
            var myheader = res.GetResponseHeader("Content-Type");
            Console.WriteLine($"the header Content-Type is: {myheader}");
            Console.ReadLine();
        }
    }
}
If you don't know how to generate a SAS token, you can navigate to the Azure portal -> your storage account -> Shared access signature.

Google Cloud Storage get signedUrl from CDN npm

I am using code like the following to create a signed URL for my content:
var storage = require('@google-cloud/storage')();
var myBucket = storage.bucket('my-bucket');
var file = myBucket.file('my-file');
//-
// Generate a URL that allows temporary access to download your file.
//-
var request = require('request');
var config = {
    action: 'read',
    expires: '03-17-2025'
};
file.getSignedUrl(config, function(err, url) {
    if (err) {
        console.error(err);
        return;
    }
    // The file is now available to read from the URL.
});
This creates a URL that starts with https://storage.googleapis.com/my-bucket/
If I place that URL in the browser, it is readable.
However, I guess that URL is direct access to the bucket file and does not pass through my configured CDN.
I see in the docs (https://cloud.google.com/nodejs/docs/reference/storage/1.6.x/File#getSignedUrl) that you can pass a cname option, which transforms the URL to replace https://storage.googleapis.com/my-bucket/ with my bucket's CDN domain.
However, when I copy the resulting URL, the service account or resulting URL doesn't seem to have access to the resource.
I have added the Firebase Admin service account to the bucket, but I still get no access.
Also, from the docs, the CDN signed URL looks a lot different from the one signed through that API. Is it possible to create a CDN signed URL from the API, or should I manually create it as explained in: https://cloud.google.com/cdn/docs/using-signed-urls?hl=en_US&_ga=2.131493069.-352689337.1519430995#configuring_google_cloud_storage_permissions?
For anyone interested in the node code for that signing:
var url = 'URL of the endpoint served by Cloud CDN';
var key_name = 'Name of the signing key added to the Google Cloud Storage bucket or service';
var key = 'Signing key as urlsafe base64 encoded string';
var expiration = Math.round(new Date().getTime() / 1000) + 600; // ten minutes later, in seconds

var crypto = require("crypto");
var URLSafeBase64 = require('urlsafe-base64');

// Decode the URL-safe base64 encoded key
var decoded_key = URLSafeBase64.decode(key);

// Build the URL
var urlToSign = url
    + (url.indexOf('?') > -1 ? "&" : "?")
    + "Expires=" + expiration
    + "&KeyName=" + key_name;

// Sign the URL using the key and URL-safe base64 encode the signature
var hmac = crypto.createHmac('sha1', decoded_key);
var signature = hmac.update(urlToSign).digest();
var encoded_signature = URLSafeBase64.encode(signature);

// Concatenate the URL and encoded signature
urlToSign += "&Signature=" + encoded_signature;
The Cloud CDN content delivery network works with HTTP(S) load balancing to deliver content to your users. Are you using an HTTPS load balancer to deliver the content?
See this document [1] on using Google Cloud CDN with HTTP(S) load balancing and inserting content into the cache.
[1] https://cloud.google.com/cdn/docs/overview
[2] https://cloud.google.com/cdn/docs/concepts
What error code are you getting? Can you run the curl command and send the output with the error code for further analysis?
Could you confirm that your configuration meets the cacheability requirements? Not all HTTP responses are cacheable; Google Cloud CDN caches only those responses that satisfy specific conditions [3]. Upon confirmation, I will investigate further and advise you accordingly.
[3] Cacheability: https://cloud.google.com/cdn/docs/caching#cacheability
Could you provide the output of the two commands below, which will help me verify whether there is a permission issue on these objects? These commands dump all the current permission settings on the object.
gsutil acl get gs://[full_path_to_file_to_be_cached]
gsutil ls -L gs://[full_path_to_file_to_be_cached]
For more details on permission, refer to this GCP documentation [4]
[4] Setting bucket permissions: https://cloud.google.com/storage/docs/cloud-console#_bucketpermission
No, it is not possible to create a CDN signed URL from the API.
From what Google documents here, the answer provided by @htafoya seems legit.
However, I spent a couple of hours struggling to work out why the signed URL was not working: the CDN endpoint kept complaining about access denied. Eventually I found that the code using the crypto module doesn't produce the same HMAC-SHA1 hash value as what gcloud compute sign-url computes; I still don't know why.
At the same time, I found that this lib (jsSHA) is pretty cool: it generates the HMAC-SHA1 value exactly the same as gcloud, and it has a neat API. So I thought I should comment here so that others who hit the same struggle can benefit. This is the final code I used to sign a Cloud CDN URL:
import jsSHA from 'jssha';

// Assumed to be defined elsewhere: domain/path of the CDN endpoint, daySeconds
// (validity period in seconds), keyName (name of the signing key), signKey (base64 key).
const url = `https://{domain}/{path}`;
const expire = Math.round(new Date().getTime() / 1000) + daySeconds;
const extendedUrl = `${url}${url.indexOf('?') > -1 ? "&" : "?"}Expires=${expire}&KeyName=${keyName}`;
// use jssha
const shaObj = new jsSHA("SHA-1", "TEXT", { hmacKey: { value: signKey, format: "B64" } });
shaObj.update(extendedUrl);
// safeSign (not shown in the original post) presumably converts the base64 HMAC to
// the URL-safe alphabet (+ and / replaced with - and _).
const signature = safeSign(shaObj.getHMAC('B64'));
return `${extendedUrl}&Signature=${signature}`;
working great!

Need Working Example of Nest REST API without using Firebase API

I'm struggling to find a working example of writing data to the Nest Thermostat API using plain REST. I'm attempting to write a C# app and cannot use Firebase. The multiple curl examples posted so far do not work. I have a valid auth_token and can read data without issues. Finding the correct URL to write to is elusive. Can anyone assist?
Examples like
curl -v -X PUT "https://developer-api.nest.com/structures/g-9y-2xkHpBh1MGkVaqXOGJiKOB9MkoW1hhYyQk2vAunCK8a731jbg?auth=<AUTH_TOKEN>" -H "Content-Type: application/json" -d '{"away":"away"}'
don't change any data.
Two things. First, follow redirects with -L. Second, PUT directly to the away data location, like:
curl -v -L -X PUT "https://developer-api.nest.com/structures/g-9y-2xkHpBh1MGkVaqXOGJiKOB9MkoW1hhYyQk2vAunCK8a731jbg/away?auth=<AUTH_TOKEN>" -H "Content-Type: application/json" -d '"away"'
The PUT overwrites all data at a location. The previous command would logically be setting the structure's data to just {"away":"away"}.
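To verify the write took effect, a quick read of the same location should return the updated value. This is a sketch only, reusing the same placeholder structure ID and token as above:
curl -v -L "https://developer-api.nest.com/structures/g-9y-2xkHpBh1MGkVaqXOGJiKOB9MkoW1hhYyQk2vAunCK8a731jbg/away?auth=<AUTH_TOKEN>"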
user3791884,
Any luck with your C# PUT? Here is C# code that works:
using System.Net.Http;

// structure.structure_id and AccessToken come from the surrounding app.
private async void changeAway()
{
    using (HttpClient http = new HttpClient())
    {
        string url = "https://developer-api.nest.com/structures/" + structure.structure_id + "/?auth=" + AccessToken;
        StringContent content = new StringContent("{\"away\":\"home\"}"); // derived from HttpContent
        HttpResponseMessage rep = await http.PutAsync(new Uri(url), content);
        if (null != rep)
        {
            Debug.WriteLine("http.PutAsync2=" + rep.ToString());
        }
    }
}
Debug.WriteLine writes this to the Output window:
"http.PutAsync2=StatusCode: 200, ReasonPhrase: 'OK', Version: 1.1, Content: System.Net.Http.StreamContent, Headers:
{
Access-Control-Allow-Origin: *
Cache-Control: no-cache, max-age=0, private
Content-Length: 15
Content-Type: application/json; charset=UTF-8
}"
These two methods return a valid structure of my data.
1/ command line
curl -v -k -L -X GET "https://developer-api.nest.com/structures/Za6hCZpmt4g6mBTaaA96yuY87lzLtsucYjbxW_b_thAuJJ7oUOelKA/?auth=c.om2...AeiE"
2/ C#
// AccessToken and deserializeStructure come from the surrounding app.
private bool getStructureInfo()
{
    bool success = false;
    try
    {
        // Create a new HttpWebRequest object for the structures URL.
        HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create("https://developer-api.nest.com/structures/?auth=" + AccessToken);
        // Define the request access method.
        myHttpWebRequest.Method = "GET";
        myHttpWebRequest.MaximumAutomaticRedirections = 3;
        myHttpWebRequest.AllowAutoRedirect = true;
        myHttpWebRequest.Credentials = CredentialCache.DefaultCredentials;
        using (HttpWebResponse myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse())
        {
            if (null != myHttpWebResponse)
            {
                // Store the response.
                Stream sData = myHttpWebResponse.GetResponseStream();
                // Pipe the stream to a higher-level stream reader with the required encoding format.
                StreamReader readStream = new StreamReader(sData, Encoding.UTF8);
                Debug.WriteLine("Response Structure stream received.");
                string data = readStream.ReadToEnd();
                Debug.WriteLine(data);
                readStream.Close();
                success = deserializeStructure(data);
            }
        }
    }
    catch (Exception ex)
    {
        Debug.WriteLine("getStructure Exception=" + ex.ToString());
    }
    return success;
}

How to get static image url from flickr URL?

Is it possible to get the static image URL from a Flickr URL via an API call or some script?
For example:
Flickr URL -> http://www.flickr.com/photos/53067560@N00/2658147888/in/set-72157606175084388/
Static image URL -> http://farm4.static.flickr.com/3221/2658147888_826edc8465.jpg
By specifying extras=url_o you get a link to the original image:
https://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=YOURAPIKEY&format=json&nojsoncallback=1&text=cats&extras=url_o
For downscaled images, you use the following parameters: url_t, url_s, url_q, url_m, url_n, url_z, url_c, url_l
Alternatively, you can construct the URL as described:
http://farm{farm-id}.staticflickr.com/{server-id}/{id}_{secret}.jpg
or
http://farm{farm-id}.staticflickr.com/{server-id}/{id}_{secret}_[mstzb].jpg
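For instance, the static image URL from your question fits this template with farm-id = 4, server-id = 3221, id = 2658147888, and secret = 826edc8465, giving http://farm4.static.flickr.com/3221/2658147888_826edc8465.jpg.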
In your Flickr URL, the photo ID is 2658147888. You can use flickr.photos.getSizes to get the various sizes of the photo available and pick the URL you want, depending on the size. There are several ways to access the API, so please specify if you want more details for a particular language; an example call is shown below.
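For example, a getSizes call for the photo in your URL would look like this (substitute your own API key; the response lists each available size with its source URL):
https://api.flickr.com/services/rest/?method=flickr.photos.getSizes&api_key=YOURAPIKEY&photo_id=2658147888&format=json&nojsoncallback=1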
You can also access the original image using the photo ID (the number before the first underscore):
http://flickr.com/photo.gne?id=photoId
In your case it would be:
https://www.flickr.com/photo.gne?id=2658147888
Not sure if you can get it directly through a single API call, but this link explains how the URLs for the images are constructed: link
Here's some code I wrote to retrieve metadata from a Flickr Photo based on its ID:
I first defined a JavaScript object FlickrPhoto to hold the photo's metadata:
function FlickrPhoto(title, owner, flickrURL, imageURL) {
    this.title = title;
    this.owner = owner;
    this.flickrURL = flickrURL;
    this.imageURL = imageURL;
}
I then created a FlickrService object to hold my Flickr API Key and all my ajax calls to the RESTful API.
The getPhotoInfo function takes the photo ID as a parameter, constructs the appropriate AJAX call, and passes a FlickrPhoto object containing the photo metadata to a callback function.
function FlickrService() {
    this.flickrApiKey = "763559574f01aba248683d2c09e3f701";
    this.flickrGetInfoURL = "https://api.flickr.com/services/rest/?method=flickr.photos.getInfo&nojsoncallback=1&format=json";
    this.getPhotoInfo = function(photoId, callback) {
        var ajaxOptions = {
            type: 'GET',
            url: this.flickrGetInfoURL,
            data: { api_key: this.flickrApiKey, photo_id: photoId },
            dataType: 'json',
            success: function (data) {
                if (data.stat == "ok") {
                    var photo = data.photo;
                    var photoTitle = photo.title._content;
                    var photoOwner = photo.owner.realname;
                    var photoWebURL = photo.urls.url[0]._content;
                    var photoStaticURL = "https://farm" + photo.farm + ".staticflickr.com/" + photo.server + "/" + photo.id + "_" + photo.secret + "_b.jpg";
                    var flickrPhoto = new FlickrPhoto(photoTitle, photoOwner, photoWebURL, photoStaticURL);
                    callback(flickrPhoto);
                }
            }
        };
        $.ajax(ajaxOptions);
    }
}
You can then use the service as follows:
var photoId = "11837138576";
var flickrService = new FlickrService();
flickrService.getPhotoInfo(photoId, function(photo) {
    console.log(photo.imageURL);
    console.log(photo.owner);
});
Hope it helps.
Below is a solution without using the Flickr APIs, only standard Linux commands (I actually ran it on MS Windows with Cygwin):
- Put your list of URLs in the tmp variable.
- If you are downloading private photos like me, the protocol will be https and you will need to pass the authentication cookies to wget. I logged on with a browser (Chrome) and exported the cookies file using an extension.
- If you access public URLs, just remove the parameter --load-cookies $cookies.
- The script downloads the photos into the local folder in their original format.
- If you want just the URL of the static image, remove the last command | xargs wget --load-cookies $cookies.
Here is the script; you can use it as a starting point for your explorations:
cookies=~/cookies.txt
root="https://www.flickr.com/photos/131469243@N02/"
tmp="https://www.flickr.com/photos/131469243@N02/29765108124/in/album-72157673986011342/
https://www.flickr.com/photos/131469243@N02/29765103724/in/album-72157673986011342/
https://www.flickr.com/photos/131469243@N02/29765102344/in/album-72157673986011342/"
while read -r url; do
    if [[ $url == http* ]] ;
    then
        url2=$root`echo -n $url | grep -oP '(?<=https://www.flickr.com/photos/131469243@N02/)\w+'`/sizes/o
        wget -q --load-cookies $cookies -O - $url2 | grep -io 'https://c[0-9].staticflickr.com.*_o_d.jpg' | xargs wget --load-cookies $cookies
    fi
done <<< "$tmp";
