I can't connect to the input container, but the container is accessible and the file is there - Azure

I am learning Azure, specifically Data Factory, through a basic exercise:
1 - I should create an input container and an output container (using Azure Storage v2).
2 - After that, I created the datasets for input and output.
3 - And finally, I should connect the data flow to my input dataset.
But:
I can test connections on the datasets to prove that I created them without problems, but I can't test the connection from my data flow to the input dataset.
I tried:
recreating it with different names;
keeping only the needed file in the storage;
using a different input file (I am using a sample similar to the "movies.csv" expected by the exercise).

I created an Azure blob container and uploaded the file.
I created a linked service with the Azure storage account.
I created a dataset with the above linked service, following the procedure below.
I tested the connection, and it connected successfully.
I didn't get any error. The error you mention above is related to dynamic content. If you assign any parameters in the dataset, provide the values of those parameters correctly. I added parameters in the dataset as below (see the sketch after these steps).
When I tried to test the dataset, I got an error.
I added values for the parameters in the debug settings.
I tested the connection, and it connected successfully.
Otherwise, add the sink to the data flow and try to debug it; it may work.
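To make the parameter point above concrete, here is a hedged sketch (not the poster's exact dataset) of what a parameterized delimited-text Blob dataset looks like in ADF's code view, written as a Python dict mirroring the JSON. The dataset name, linked service name, and the parameters container and fileName are assumptions; when you debug a data flow against such a dataset, you have to supply concrete values for these parameters in Debug Settings.

```python
# Hedged sketch of a parameterized ADF dataset definition (JSON shown as a Python dict).
# Names ("InputMoviesDataset", "AzureBlobStorageLS", "container", "fileName") are illustrative.
input_dataset = {
    "name": "InputMoviesDataset",
    "properties": {
        "linkedServiceName": {
            "referenceName": "AzureBlobStorageLS",   # assumed linked service name
            "type": "LinkedServiceReference",
        },
        # Dataset parameters: a data-flow debug run must be given values for these.
        "parameters": {
            "container": {"type": "String"},
            "fileName": {"type": "String"},
        },
        "type": "DelimitedText",
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                # The parameters are consumed through @dataset() expressions.
                "container": {"value": "@dataset().container", "type": "Expression"},
                "fileName": {"value": "@dataset().fileName", "type": "Expression"},
            },
            "columnDelimiter": ",",
            "firstRowAsHeader": True,
        },
    },
}
```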

I think I found the solution.
When I am working with debug turned on and, for some reason, I create another data flow, I can't connect to the new datasets.
But
if I restart the debug session (turn it off and on again), the connections start working again.

Related

Script Activity in ADF does not take the new parameter values for a dynamic linked service

I have a dynamic linked service created in ADF with parameters to pass values dynamically whenever I want to change the server name in the linked service. I am using this linked service in a Script activity in a pipeline.
The first time, the Script activity works fine. After I change the server name, it does not pick up the new server name by default.
For example, I created the linked service with the Dev SQL Server and created a Script activity in the pipeline. It works fine for Dev. If I change the SQL Server name in the dynamic linked service to the QA server, the Script activity is still pointing to the Dev server; it does not take the new parameter value.
I tried changing the parameter value. The same scenario works fine for the dataset which I used in Copy Data in the pipeline.
I have reproduced the above and was able to change the server name in the Script activity successfully.
First, I created the linked service (Azure SQL Database) parameters.
Don't give any default values.
I gave the parameters like below (see the sketch after these steps).
Then, in the Script activity, I gave pipeline parameters to it. You can give the values directly using dynamic content. As a sample, I have given a select query.
I have two SQL servers, rakeshserver and rakeshserver2. While debugging, it will ask for the pipeline parameter values. If you gave values using dynamic content in the Script activity, then it executes directly.
First server and the table in the first server:
Result:
Second server and the table in the second server:
Result:
If your Script activity is giving the result for the same server, taking the default value might be the reason for it. Give the parameter value either during debug or using dynamic content, and then it may work as above.
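As a rough illustration of the setup described above (a sketch of the ADF JSON shown as Python dicts, not the answerer's exact definitions): a parameterized Azure SQL linked service with no default values, and the linked service reference a Script activity would use to push pipeline parameters into it through dynamic content. The names DynamicAzureSqlLS, serverName, and databaseName, and the connection string layout, are assumptions.

```python
# Hedged sketch of a parameterized Azure SQL linked service (ADF JSON as a Python dict).
dynamic_sql_linked_service = {
    "name": "DynamicAzureSqlLS",                      # illustrative name
    "properties": {
        "type": "AzureSqlDatabase",
        # No default values: the caller must always supply these.
        "parameters": {
            "serverName": {"type": "String"},
            "databaseName": {"type": "String"},
        },
        "typeProperties": {
            # The linked-service parameters are consumed via @{linkedService()} expressions.
            "connectionString": {
                "value": (
                    "Server=tcp:@{linkedService().serverName}.database.windows.net,1433;"
                    "Database=@{linkedService().databaseName};"
                ),
                "type": "Expression",
            },
        },
    },
}

# Sketch of the linked service reference inside the Script activity: pipeline
# parameters (dynamic content) are mapped onto the linked-service parameters,
# so a new server name passed at run time is actually used.
script_activity_linked_service = {
    "referenceName": "DynamicAzureSqlLS",
    "type": "LinkedServiceReference",
    "parameters": {
        "serverName": {"value": "@pipeline().parameters.serverName", "type": "Expression"},
        "databaseName": {"value": "@pipeline().parameters.databaseName", "type": "Expression"},
    },
}
```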

AzureML: pass data between pipeline steps without saving it

I have made two scripts using PythonScriptStep, where data_prep.py prepares a dataset by doing some data transformation; the result is thereafter sent to train.py for training an ML model in AzureML.
It is possible to pass data between pipeline steps using PipelineData and OutputFileDatasetConfig; however, these seem to save the data in Azure blob storage.
Q: How can I send the data between the steps without saving the data anywhere?
The data has to be passed somehow.
You can influence the storage account by changing the output datastore. If the data is just a collection of numbers, you can pass "dummy" data (e.g., an empty text file) between the scripts, have the upstream one log those numbers as metrics using Run.get_context().log(*) or MLflow, and have the downstream one load those values (see the sketch below).
Fundamentally, there's no way to pass information between steps without it being stored somewhere, whether that's the "default blob store", another storage account, or metrics in the workspace.
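A minimal sketch of the metrics approach mentioned above, assuming the azureml-core (SDK v1) Run API the question is already using. The metric names are made up, and the steps would still need something like an empty PipelineData/OutputFileDatasetConfig output between them just to enforce execution order.

```python
from azureml.core import Run

# --- upstream step, e.g. data_prep.py ---
# Log the small values you want to hand off as metrics on the parent pipeline
# run instead of writing a dataset to blob storage.
run = Run.get_context()
run.parent.log("n_rows", 12345)        # illustrative metric name/value
run.parent.log("scale_factor", 0.25)   # illustrative metric name/value

# --- downstream step, e.g. train.py ---
# Read the same metrics back from the parent pipeline run.
run = Run.get_context()
metrics = run.parent.get_metrics()
scale_factor = metrics["scale_factor"]
```

Note that, as the answer says, these metrics are still persisted in the workspace; this only avoids writing the intermediate dataset to blob storage.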

Azure Labelling tool throwing error 401 on running layout OCR

I have been trying to train a custom model for a document with some fixed-layout text and information. I have successfully created the project, connection, and container, and got the URL for the blob container. When I open the labelling tool to mark text recognition, it throws error code 401; I am not sure what's wrong here.
Please note: I have been running other projects with a different layout and document and was able to train the model and use it.
What are the chances of this error occurring under the same account but with a new storage account, resource group, different endpoint, and API?

Copying a file from an FTP location into Azure DataLake

I have followed all the steps shown in the MSDN documentation to copy a file from FTP.
So far, the datasets are created, the linked services are created, and the pipeline is created. The diagram for the pipeline shows the logical flow. However, when I schedule the ADF to do the work for me, it fails. The input dataset passes, but when executing the output dataset, I am presented with the following error:
Copy activity encountered a user error at Source side:
ErrorCode=UserErrorFileNotFound,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Cannot find the file specified. Folder path: 'Test/', File filter: 'Testfile.text'.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Net.WebException,Message=The remote server returned an error: (500) Syntax error, command unrecognized.,Source=System,'.
I can physically navigate to the folder and see the file for myself, but when using ADF, I am having issues. The firewall is set to allow the connection, yet I am still getting this error. As there is very minimal logging, I am unable to nail down the issue. Could someone help me out here?
PS: Cross Posted at MSDN
I encountered the same error and was able to solve it by adding "enableSsl": true and "enableServerCertificateValidation": true to the FTP linked service, as shown below.

Azure Machine Learning Endpoint SQL Access fails, works in experiment

I've created a classification endpoint using Azure ML, the input for which is a database query to retrieve the database row to classify.
When I run my experiment in the Machine Learning Studio, it works and connects properly to my database. When I submit the same query as a web service parameter on the Import Data module, I get the following error:
Ignoring the dangers of an SQL query as input, why am I getting this? Shouldn't it work the same?
Sidenote: I've used an SQL query on my training endpoint in the exact same way on the same database, and this didn't cause any problems.
UPDATE: It seems as if this is only a problem when I create a new endpoint for a service. If I use the default endpoint it does indeed work, but any new endpoints do not.
UPDATE 2: It also works when I submit my request as a batch run. If I use Request-Response, it fails.
According to Microsoft, this is a known bug and they are working on it. Another workaround, although NOT recommended, is to pass the password in as a web service parameter (this may be OK for test/proof-of-concept apps).
Here is a link to the thread on the MS forums that states this is a bug.
