Azure Data Factory and Power Query

I am using Azure Data Factory and Power Query.
I have set up a linked service and dataset that successfully connect to an Azure Data Lake Storage Gen2 account.
When I create a new Power Query and point it at the dataset, the following message is returned...
The word 'undefined' is inserted into the Power Query call to the storage account in place of the account URL. When I manually paste the actual storage account URL, the Power Query works and returns the data, but when I save the Power Query, the URL reverts to 'undefined'.
let
AdfDoc = AzureStorage.DataLakeContents("undefined/data-lake/CovidLoad/Report.csv"),
Csv = Csv.Document(AdfDoc, [Delimiter = ",", Encoding = TextEncoding.Utf8, QuoteStyle = QuoteStyle.Csv]),
PromotedHeaders = Table.PromoteHeaders(Csv, [PromoteAllScalars = true])
in
PromotedHeaders
Any idea how to fix this issue?

When I configure the linked service to connect to the storage account using the account URL and account key, the Power Query works and returns data.

Related

Error when creating view on pipeline (problem with BULK path)

Good morning everybody!
My team and I managed to create part of an Azure Synapse pipeline that selects the database and creates a data source named 'files'.
Now we want to create a view in the same pipeline using a Script activity. However, this error comes up:
Error message here
Even when we hardcode the folder names and the file name in the path, the pipeline does not recognise the existence of the file in question.
This is our query. If we run it manually on a script in the Develop section everything works smoothly:
CREATE VIEW query here
We expected to get every file with ".parquet" extension inside every folder available on our data_source named 'files'. However, running this query on the Azure Synapse Pipeline won't work. If we run it on a script in Develop section, it works perfectly. We want to achieve that result.
Could anyone help us out?
Thanks in advance!
I tried to reproduce this in my environment and got an error as well.
The likely cause is either that the Synapse service principal (or the user accessing the storage account) does not have the Storage Blob Data Contributor role assigned, or that there is a problem with your external data source. Try creating a new external data source with a SAS token.
Sample code:
CREATE DATABASE SCOPED CREDENTIAL SasToken
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = 'SAS token';
GO
CREATE EXTERNAL DATA SOURCE mysample1
WITH ( LOCATION = 'storage account',
CREDENTIAL = SasToken
);
GO
CREATE VIEW [dbo].[View4] AS SELECT [result].filepath(1) as [YEAR], [result].filepath(2) as [MONTH], [result].filepath(3) as [DAY], *
FROM
OPENROWSET(
BULK 'fsn2p/*-*-*.parquet',
DATA_SOURCE = 'mysample1',
FORMAT = 'PARQUET'
) AS [result]
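To make the filepath(n) calls in the view concrete, here is a rough Python sketch of how each wildcard in the BULK pattern lines up with filepath(1), filepath(2), and filepath(3). The file name fsn2p/2023-01-15.parquet is made up for illustration, and the regex translation only approximates the wildcard matching that OPENROWSET performs.

```python
import re


def wildcard_captures(pattern, path):
    """Return the text matched by each '*' in pattern, or None if path does not match."""
    # Escape the literal pieces and join them with one capture group per wildcard.
    regex = "([^/]*)".join(re.escape(piece) for piece in pattern.split("*"))
    match = re.fullmatch(regex, path)
    return match.groups() if match else None


# Hypothetical file name; the pattern is the one used in the view.
parts = wildcard_captures("fsn2p/*-*-*.parquet", "fsn2p/2023-01-15.parquet")
print(parts)  # ('2023', '01', '15')
```

Under that assumption, the three captured values correspond to the YEAR, MONTH, and DAY columns of the view.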

How to migrate data from local storage to CosmosDB Table API?

I followed the documentation and was able to migrate data from Azure Table storage to local storage, but when I try to migrate the data from local storage to the Cosmos DB Table API, I run into issues with the destination endpoint of the Table API. Does anyone know which destination endpoint to use? Right now I'm using the Table API endpoint from the Overview section.
cmd error
The problem I see here is that you are not using the table name correctly. TablesDB is not the table name. Please check the screenshot below for what to use as the table name (in this case, mytable1). So your source and destination should be something like:
/Source:C:\myfolder\ /Dest:https://xxxxxxxx.table.cosmos.azure.com:443/mytable1/
To reiterate, I followed the steps below and was able to migrate successfully:
Export from Azure Table storage to a local folder using the article below. The table name should match the name of the table in the storage account:
AzCopy /Source:https://xxxxxxxxxxx.table.core.windows.net/myTable/ /Dest:C:\myfolder\ /SourceKey:key
Export data from Table storage
Import from the local folder to the Azure Cosmos DB Table API using the command below, where the table name is the one created in the Cosmos DB Table API account, DestKey is the primary key, and the destination is copied exactly from the connection string with the table name appended:
AzCopy /Source:C:\myfolder\ /Dest:https://xxxxxxxx.table.cosmos.azure.com:443/mytable1/ /DestKey:key /Manifest:"myaccount_mytable_20140103T112020.manifest" /EntityOperation:InsertOrReplace
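If you script the import, assembling the command as a list of separate arguments makes it harder to run two options together by accident. The following is only a sketch: build_import_command is a hypothetical helper, it uses just the options that appear in the command above, and the endpoint, key, and manifest name are the same placeholders.

```python
def build_import_command(source_dir, endpoint, table, dest_key, manifest):
    """Assemble the AzCopy import arguments, one list element per option."""
    # The destination is the Table API endpoint followed by the table name.
    dest = f"{endpoint.rstrip('/')}/{table}/"
    return [
        "AzCopy",
        f"/Source:{source_dir}",
        f"/Dest:{dest}",
        f"/DestKey:{dest_key}",
        f"/Manifest:{manifest}",
        "/EntityOperation:InsertOrReplace",
    ]


# Placeholder values taken from the example command.
cmd = build_import_command(
    "C:\\myfolder\\",
    "https://xxxxxxxx.table.cosmos.azure.com:443",
    "mytable1",
    "key",
    "myaccount_mytable_20140103T112020.manifest",
)
print(" ".join(cmd))
```

On a machine where AzCopy is installed, the list could be handed to subprocess.run instead of being printed.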

Write Azure Table Storage - Different behaviour local and cloud

I have a simple Azure Function that periodically writes some data to Azure Table Storage.
var storageAccount = new CloudStorageAccount(new Microsoft.WindowsAzure.Storage.Auth.StorageCredentials("mystorage","xxxxx"),true);
var tableClient = storageAccount.CreateCloudTableClient();
myTable = tableClient.GetTableReference("myData");
TableOperation insertOperation = TableOperation.Insert(data);
myTable.ExecuteAsync(insertOperation);
The code runs well locally in Visual Studio, and all data is written correctly to the Table Storage in Azure.
When I deploy the same code unchanged to Azure as an Azure Function, it also runs without any exception, and logging shows that every line of code is executed.
But no data is written to the Table Storage: same name, same credentials, same code.
Is Azure blocking this connection (Azure Function in Azure > Azure Table Storage) in some way, in contrast to "Local Azure Function > Azure Table Storage"?
No, Azure is not blocking the connection or anything of that sort.
You have to await the table operation you start with ExecuteAsync, because otherwise control moves on before that method has completed. Change your last line of code to
await myTable.ExecuteAsync(insertOperation);
For background, see the C# compiler warning "Because this call is not awaited, the current method continues to run before the call is completed" (CS4014).
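The effect of a missing await can be reproduced outside Azure. The sketch below is a Python asyncio analogy, not the C# code from the question: insert_entity is a hypothetical stand-in for the table insert, and asyncio.run plays the role of a host that shuts down as soon as the function returns.

```python
import asyncio

written = []  # stands in for the table contents


async def insert_entity(entity):
    """Hypothetical insert that takes a moment, like a network call."""
    await asyncio.sleep(0.05)
    written.append(entity)


async def fire_and_forget():
    # The insert is started but never awaited, so the function returns first.
    asyncio.create_task(insert_entity("row-1"))


async def awaited():
    # The function does not return until the insert has finished.
    await insert_entity("row-2")


asyncio.run(fire_and_forget())  # pending insert is cancelled on shutdown
asyncio.run(awaited())
print(written)  # ['row-2']
```

Only the awaited insert is recorded, and no exception is raised for the one that was lost.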
The problem was the row key:
I used DateTime.Now for the row key (since Table Storage does not provide auto-increment values).
My local format was "1.1.2019 18:19:20", while the server's format was "1/1/2019 ...".
And "/" is not allowed in the RowKey string.
Now that I format the DateTime string explicitly, everything works fine.
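As an illustration of the row key issue, here is a small Python sketch (the thread itself uses C#). It checks a candidate key against the characters the Table service rejects in PartitionKey and RowKey values and builds a key from a fixed numeric format so that the result does not depend on the machine's regional settings. The helper names and the chosen format are assumptions for the example.

```python
from datetime import datetime

# Characters the Table service rejects in PartitionKey and RowKey values.
FORBIDDEN = set("/\\#?")


def is_valid_key(key):
    """Return True if key contains no forbidden or control characters."""
    return not any(
        ch in FORBIDDEN or ord(ch) < 0x20 or 0x7F <= ord(ch) <= 0x9F
        for ch in key
    )


def make_row_key(moment):
    """Format a timestamp with digits only, independent of locale."""
    return moment.strftime("%Y%m%dT%H%M%S")


moment = datetime(2019, 1, 1, 18, 19, 20)
print(is_valid_key("1.1.2019 18:19:20"))  # True
print(is_valid_key("1/1/2019 18:19:20"))  # False
print(make_row_key(moment))               # 20190101T181920
```

A fixed format like this also sorts chronologically as a string, which is convenient for row keys.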

Is it possible to access the Azure Table service from Databricks?

I have loaded data into the Azure Table service. I would like to access the data from Databricks the same way we access data from Azure Blob storage.
Unfortunately, Azure Databricks does not support Azure Table storage as a data source.
For more details about the Data Sources of Azure Databricks, refer to this link.
Besides, if you would like Azure Databricks to support it, you could post your idea in the feedback.
I think the above answer is outdated, so here is my update.
I am currently accessing data from Azure Tables through Databricks like this:
import pandas as pd
from azure.cosmosdb.table.tableservice import TableService

# Connect to the storage account using a SAS token instead of the account key
table_service = TableService(account_name='accountX',
                             account_key=None, sas_token='tokenX')
data = table_service.query_entities('tableX')  # read all entities from the table
df_raw = pd.DataFrame([asset for asset in data])  # move it to pandas if you prefer
You need your own values for account_name and sas_token; tableX is the name of the table you want to access.

Azure Blob to Azure SQL tables creation

I am trying to load blob files into Azure SQL Database tables using BULK INSERT.
Here is the reference from Microsoft:
https://azure.microsoft.com/en-us/updates/preview-loading-files-from-azure-blob-storage-into-sql-database/
My DATA in CSV looks like this
100,"37415B4EAF943043E1111111A05370E","ONT","000","S","ABCDEF","AB","001","000002","001","04","20110902","11111111","20110830152048.1837780","",""
My blob container has a public access level.
Step 1: Created the storage credential. I had generated a shared access signature (SAS token).
CREATE DATABASE SCOPED CREDENTIAL Abablobv1BlobStorageCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = 'sv=2017-07-29&ss=bfqt&srt=sco&sp=rwdlacup&se=2018-04-10T18:05:55Z&st=2018-04-09T10:05:55Z&sip=141.6.1.0-141.6.1.255&spr=https&sig=XIFs1TWafAakQT3Ig%3D';
GO
Step 2: Created an EXTERNAL DATA SOURCE that references the storage credential
CREATE EXTERNAL DATA SOURCE Abablobv1BlobStorage
WITH ( TYPE = BLOB_STORAGE, LOCATION = 'https://abcd.blob.core.windows.net/', CREDENTIAL = Abablobv1BlobStorageCredential );
GO
Step 3: BULK INSERT statement using the external data source and the DB table
BULK INSERT dbo.TWCS
FROM 'TWCSSampleData.csv'
WITH ( DATA_SOURCE = 'Abablobv1BlobStorage', FORMAT = 'CSV');
GO
I am facing this error:
Bad or inaccessible location specified in external data source
"Abablobv1BlobStorage".
Does anyone have some idea about this?
I changed the location of the EXTERNAL DATA SOURCE to Location = abcd.blob.core.windows.net/invoapprover/SampleData.csv. Now I get: Cannot bulk load because the file "SampleData.csv" could not be opened. Operating system error code 5(Access is denied.). This happens for both BULK INSERT and OPENROWSET. I am not sure which access should be changed, because the file is in Azure Blob storage and not on my machine. Any ideas?
Please try the following query
SELECT * FROM OPENROWSET(
BULK 'TWCSSampleData.csv',
DATA_SOURCE = 'Abablobv1BlobStorage',
SINGLE_CLOB) AS DataFile;
Check whether the file is located inside a container in the blob storage account. If it is, you need to specify the container in the LOCATION parameter of the external data source. For example, if you have a container named "files", the location should be 'https://abcd.blob.core.windows.net/files'.
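One way to picture how the two settings fit together: the external data source LOCATION carries the account and container, and the BULK path is whatever remains relative to it. The Python sketch below splits a full blob URL along that line. The helper name split_blob_url and the example URL (which combines the "files" container with the file name from the question) are assumptions for illustration.

```python
from urllib.parse import urlsplit


def split_blob_url(blob_url):
    """Split a blob URL into (LOCATION, path relative to LOCATION)."""
    parts = urlsplit(blob_url)
    # The first path segment is the container; the rest is the file path.
    container, _, relative_path = parts.path.lstrip("/").partition("/")
    if not container or not relative_path:
        raise ValueError("expected https://<account>/<container>/<file>")
    location = f"{parts.scheme}://{parts.netloc}/{container}"
    return location, relative_path


location, bulk_path = split_blob_url(
    "https://abcd.blob.core.windows.net/files/TWCSSampleData.csv"
)
print(location)   # https://abcd.blob.core.windows.net/files
print(bulk_path)  # TWCSSampleData.csv
```

The first value would go into LOCATION and the second into the BULK (or FROM) clause.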
More examples of bulk import here.
