Copy data from GCP to Azure Storage when the GCP bucket has Requester Pays enabled

I am trying to copy data from GCP to Azure Storage, but the bucket in GCP has Requester Pays enabled. I tried the transfer using both AzCopy and Azure Data Factory; at the end of the Azure configuration I can see the bucket in GCP, but when I hit the bucket I get a 400 Bad Request error because the bucket has Requester Pays enabled. What additional configuration do I have to do to copy the data? I already have the credentials of the GCP service account.

I don't think you will be able to use AzCopy. Looking at the GCP documentation on Requester Pays, you need to send the billing PROJECT_IDENTIFIER along with the request. There are several ways to do this. I would suggest looking at the REST API or the code samples.
https://cloud.google.com/storage/docs/using-requester-pays#code-samples_2
Azure Data Factory supports pulling data from a REST API so you could use the GCP REST call with it. Just make sure you send the billing PROJECT_IDENTIFIER along with your REST call.
https://cloud.google.com/storage/docs/using-requester-pays#rest-access-requester-pays
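For reference, on the GCS JSON API the billing project is passed as the userProject query parameter. Below is a minimal Python sketch of that REST call; the bucket, object, billing project, and key-file names are placeholders, and the same parameter should also be appendable to the request URL you configure in Azure Data Factory's REST connector.

    import urllib.parse

    import google.auth.transport.requests
    import requests
    from google.oauth2 import service_account

    # Placeholders -- substitute your own values.
    BUCKET = "my-requester-pays-bucket"
    OBJECT_NAME = "path/to/file.csv"
    BILLING_PROJECT = "my-billing-project-id"

    # Get an OAuth token from the GCP service account key you already have.
    credentials = service_account.Credentials.from_service_account_file(
        "service-account.json",
        scopes=["https://www.googleapis.com/auth/devstorage.read_only"],
    )
    credentials.refresh(google.auth.transport.requests.Request())

    # Download the object; userProject tells GCS which project to bill for the request.
    url = (
        "https://storage.googleapis.com/storage/v1/b/"
        f"{BUCKET}/o/{urllib.parse.quote(OBJECT_NAME, safe='')}"
    )
    response = requests.get(
        url,
        params={"alt": "media", "userProject": BILLING_PROJECT},
        headers={"Authorization": f"Bearer {credentials.token}"},
    )
    response.raise_for_status()
    data = response.content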

Related

Storage agnostic apis to connect to different object storage i.e. Aws s3, Azure Blob storage, Google storage etc

We use Node.js and the MinIO gateway to connect to Azure Blob as well as AWS S3 and Google Cloud Storage. Given the deprecation of the MinIO gateway, can anyone help with an alternative API way to communicate with different object storage from a Node application?
We don't want to use cloud-specific SDKs; we need a common way to push and pull data to the cloud platforms.

Transfer buckets from S3 to GCS and reverse as well using python3 without downloading the buckets first?

I know there's gsutil, which I'm looking into now.
But I'm trying to figure out if there is a way to transfer buckets from AWS S3 to GCS, and from GCS to S3, without having to download all the data locally from one and then re-upload it to the other.
I would recommend using the Google Cloud Storage Transfer Service.
You can follow this tutorial:
Transferring data from Amazon S3 to Cloud Storage using VPC Service Controls and Storage Transfer Service
This tutorial describes how to harden data transfers from Amazon Simple Storage Service (Amazon S3) to Cloud Storage using Storage Transfer Service with a VPC Service Controls perimeter. This tutorial is intended for data owners who have data that resides in Amazon S3, and who want to process or migrate that data securely to Google Cloud.
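If you want to stay in python3, the same transfer job can also be created programmatically. A rough sketch with the google-cloud-storage-transfer client is below; the project, bucket names, and AWS keys are placeholders, and the field names follow the service's create-transfer-job request, so double-check them against the current client library. The copy runs server-side, so nothing is downloaded to your machine.

    from datetime import datetime, timezone

    from google.cloud import storage_transfer

    # Placeholders -- substitute your own values.
    PROJECT_ID = "my-gcp-project"
    SOURCE_S3_BUCKET = "my-s3-bucket"
    SINK_GCS_BUCKET = "my-gcs-bucket"
    AWS_ACCESS_KEY_ID = "AKIA..."
    AWS_SECRET_ACCESS_KEY = "..."

    client = storage_transfer.StorageTransferServiceClient()
    today = datetime.now(timezone.utc)

    transfer_job = {
        "project_id": PROJECT_ID,
        "description": "One-off S3 -> GCS copy",
        "status": storage_transfer.TransferJob.Status.ENABLED,
        # Start and end on the same day so the job runs once.
        "schedule": {
            "schedule_start_date": {"day": today.day, "month": today.month, "year": today.year},
            "schedule_end_date": {"day": today.day, "month": today.month, "year": today.year},
        },
        "transfer_spec": {
            "aws_s3_data_source": {
                "bucket_name": SOURCE_S3_BUCKET,
                "aws_access_key": {
                    "access_key_id": AWS_ACCESS_KEY_ID,
                    "secret_access_key": AWS_SECRET_ACCESS_KEY,
                },
            },
            "gcs_data_sink": {"bucket_name": SINK_GCS_BUCKET},
        },
    }

    job = client.create_transfer_job({"transfer_job": transfer_job})
    print(f"Created transfer job: {job.name}")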

Saving a JSON data from an Azure Function

I have integrated an Azure Service Bus and an Azure Function to receive a message, and then update a SQL DB.
I want to save a JSON created from a query from the same Database to a Azure Blob.
My questions are:
I can save the JSON by calling the Azure Blob REST API. Is it a Cloud Native pattern to call one service from another service?
Is sending the JSON to Azure Service Bus and having another Azure Function save the data to the Blob an optimal approach?
Is there a resource other than Azure Blob for saving the JSON data from an Azure Function that would make the integration easier?
There are many ways of saving a file to Azure Blob storage. If you want to save over HTTP, use the Azure Blob REST API. You can also use the Microsoft Azure Storage SDK, which you can integrate into your application; there are storage clients for many languages (.NET, Python, JavaScript, Go, etc.). And if you are using an Azure Function, you can use an output binding.
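As a hedged illustration of the SDK route with azure-storage-blob (the connection string, container, and blob names below are placeholders):

    import json

    from azure.storage.blob import BlobServiceClient

    # Placeholders -- substitute your own values.
    CONNECTION_STRING = "<storage-account-connection-string>"
    CONTAINER = "query-results"

    payload = {"orderId": 42, "status": "processed"}  # JSON built from your SQL query

    # Upload the JSON document as a blob.
    blob_service = BlobServiceClient.from_connection_string(CONNECTION_STRING)
    blob_client = blob_service.get_blob_client(container=CONTAINER, blob="result-42.json")
    blob_client.upload_blob(json.dumps(payload), overwrite=True)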
It depends... Blob Storage is not the only location where you can save JSON; you can also save JSON straight into a SQL database, for instance.
The easiest way to save from an Azure function is to use Azure Blob storage output binding for Azure Functions.
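A minimal sketch of that binding, assuming the v2 Python programming model with a Service Bus queue trigger; the queue name, connection settings, and container path are placeholders:

    import json

    import azure.functions as func

    app = func.FunctionApp()

    # Triggered by a Service Bus message; the blob output binding writes the JSON result.
    @app.service_bus_queue_trigger(
        arg_name="msg", queue_name="orders", connection="ServiceBusConnection"
    )
    @app.blob_output(
        arg_name="outputblob",
        path="query-results/{rand-guid}.json",
        connection="AzureWebJobsStorage",
    )
    def save_json(msg: func.ServiceBusMessage, outputblob: func.Out[str]) -> None:
        body = json.loads(msg.get_body().decode("utf-8"))
        # ...query the SQL DB and build the JSON document here...
        outputblob.set(json.dumps(body))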

Can Azure computer vision API access image file in AWS S3?

AWS is currently the approved cloud vendor in my organization. For a new use case related to OCR, I'm exploring the Computer Vision service in Azure (read that this service is better than the corresponding AWS Textract service). Our approach is to maintain the input image files in S3, use AWS Lambda function to invoke the Azure Computer Vision service (either through REST API or Python SDK). AWS will be the primary cloud vendor for most aspects (specifically storage) and we plan to access Azure services through API for additional needs.
My question is: will the Azure API/SDK accept the image file in S3 as input (of course we will do whatever is needed to make the file in S3 securely accessible to the Azure API)? When I read the Azure documentation, it says the image should be accessible as a URL, and there is no mention that the image needs to exist in Azure Storage. Does the image URL have to be publicly accessible (I believe this should not be the case)?
Unfortunately the Cognitive API does not support authentication when passing a URL.
An approach would be to write an Azure Function that:
Authenticates correctly to S3 and read the image file using the Amazon S3 API
Convert the file into a byte array
Send the byte array to the Computer Vision API
Receive the JSON result from the Computer Vision API and process accordingly
You could have the Azure function perform the business logic you need and process and store the results, or you could build the function as a web service proxy that takes a S3 location in as a parameter and returns the Computer Vision result.
You could also build the functionality out in AWS using Lambda.
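A rough sketch of that flow from a Lambda (or any Python host) is below; it assumes the synchronous OCR endpoint, and the bucket, key, endpoint, and subscription key are placeholders:

    import boto3
    import requests

    # Placeholders -- substitute your own values.
    S3_BUCKET = "my-input-images"
    S3_KEY = "invoices/scan-001.png"
    CV_ENDPOINT = "https://my-cv-resource.cognitiveservices.azure.com"
    CV_KEY = "<computer-vision-subscription-key>"

    # 1. Read the image from S3 into memory (no public URL needed).
    s3 = boto3.client("s3")
    image_bytes = s3.get_object(Bucket=S3_BUCKET, Key=S3_KEY)["Body"].read()

    # 2. Send the raw bytes to the Computer Vision OCR endpoint.
    response = requests.post(
        f"{CV_ENDPOINT}/vision/v3.2/ocr",
        headers={
            "Ocp-Apim-Subscription-Key": CV_KEY,
            "Content-Type": "application/octet-stream",
        },
        data=image_bytes,
    )
    response.raise_for_status()

    # 3. Process the JSON result.
    result = response.json()
    print(result)

If you need the newer Read API instead, the binary upload is similar, but the call is asynchronous and you poll the URL returned in the Operation-Location header for the result.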

Privacy of data for Azure Cognitive Speech Services

Our company wants to use the Azure Speech service for ASR of kids' speech. In our agreement with the parents for this project (and also per company policy), we need to be sure the WAV data (and transcriptions) we send to cloud-based services like this one are not stored, logged, or kept in any way by the service provider. I could not find information on what is stored when we use the REST API or Speech API going to the Azure Speech service. Any help is appreciated.
In addition to the other comment, if you want to have complete control over the data and when it gets deleted, you can create your own storage account and use Azure's Bring Your Own Storage feature. This way your Speech Services account uses the storage account that you created for all storage instead of having its own internal storage.
You can then do something like setting up a storage policy that deletes everything inside the storage after x number of days to make sure nothing is being kept.
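As a hedged illustration of that cleanup policy via the azure-mgmt-storage lifecycle-management API (the subscription, resource group, account name, and the 30-day window are placeholders; verify the model names against the SDK version you install):

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.storage import StorageManagementClient
    from azure.mgmt.storage.models import (
        DateAfterModification,
        ManagementPolicy,
        ManagementPolicyAction,
        ManagementPolicyBaseBlob,
        ManagementPolicyDefinition,
        ManagementPolicyFilter,
        ManagementPolicyRule,
        ManagementPolicySchema,
    )

    # Placeholders -- substitute your own values.
    SUBSCRIPTION_ID = "<subscription-id>"
    RESOURCE_GROUP = "speech-rg"
    STORAGE_ACCOUNT = "speechbyosaccount"

    client = StorageManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

    # Delete every block blob 30 days after its last modification.
    rule = ManagementPolicyRule(
        enabled=True,
        name="purge-speech-data-after-30-days",
        type="Lifecycle",
        definition=ManagementPolicyDefinition(
            filters=ManagementPolicyFilter(blob_types=["blockBlob"]),
            actions=ManagementPolicyAction(
                base_blob=ManagementPolicyBaseBlob(
                    delete=DateAfterModification(days_after_modification_greater_than=30)
                )
            ),
        ),
    )

    client.management_policies.create_or_update(
        RESOURCE_GROUP,
        STORAGE_ACCOUNT,
        "default",
        ManagementPolicy(policy=ManagementPolicySchema(rules=[rule])),
    )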
