Unable to upload workspace packages and requirements.txt files on Azure Synapse Analytics Spark pool - Azure

When trying to import Python libraries at the Spark pool level by applying an uploaded requirements.txt file and custom packages, I get the following error with no other details:
CreateOrUpdateSparkComputeFailed
Error occured while processing the request
It was working perfectly fine a few days back. The last successful upload was on 12/3/2021.
Also, the SystemReservedJob-LibraryManagement application job is not getting triggered.
Environment Details:
Azure Synapse Analytics
Apache Spark pool - 3.1
We tried the following:
Increased the vCore size up to 200.
Uploaded the same packages to a resource in a different subscription, where they work fine.
Increased the Spark pool size.
Please suggest a fix.
Thank you.

Make sure you have the required packages in your requirements.txt.
Before that, we need to check which packages are installed and which are not. You can get the details of all installed packages by running the lines of code below, then work out which packages are missing and put them in place:
import pkg_resources
for d in pkg_resources.working_set:
    print(d)
Install the missing libraries via requirements.txt.
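As a quick way to spot what is missing, here is a small sketch (the requirements.txt path is an assumption - use whichever file you plan to upload to the pool) that flags entries of a local requirements.txt that are not installed in the current session:
import pkg_resources
# Names of everything installed in the current session, lower-cased for comparison.
installed = {d.project_name.lower() for d in pkg_resources.working_set}
with open('requirements.txt') as f:
    for line in f:
        entry = line.strip()
        if not entry or entry.startswith('#'):
            continue  # skip blank lines and comments
        name = pkg_resources.Requirement.parse(entry).project_name.lower()
        print(('OK      ' if name in installed else 'MISSING ') + entry)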
I faced a similar use case and got good information and a step-by-step procedure from the MS Docs; have a look at it to handle workspace libraries.

Related

How to uninstall pip library from Azure databricks notebook - without removing it from cluster library utility?

Trying to start Data Factory from Databricks.
I am having a conflict between Azure libraries installed at the cluster level:
from azure.identity import ClientSecretCredential
from azure.mgmt.resource import ResourceManagementClient
from azure.mgmt.datafactory import DataFactoryManagementClient
azure_client_id = dbutils.secrets.get(scope="Azure_KeyVault", key="_Application_Id")
azure_client_secret = dbutils.secrets.get(scope="Azure_KeyVault", key="_Client_Secret")
azure_tenant_id = dbutils.secrets.get(scope="Azure_KeyVault", key="__Tenant_Id")
# example of trigger_object['topic']: /subscriptions/f8354c08-de3d-4a67-95ae-c7cbdb37fbf6/resourceGroups/WeS06DvBing15064/providers/Microsoft.Storage/storageAccounts/wes06dvraw15064
subscription_id = 'f4379743884938948398938493793749830'
credentials = ClientSecretCredential(client_id=azure_client_id, client_secret=azure_client_secret, tenant_id=azure_tenant_id)
dfmc = DataFactoryManagementClient(credentials, subscription_id, base_url="https://management.azure.com")
[f.id for f in dfmc.factories.list()]
Error message:
AttributeError: 'ClientSecretCredential' object has no attribute 'signed_session'
I think it could be because we have Azure installed on this cluster using the cluster libraries utility (given that it works if I remove this library at the cluster level).
When I run this in the notebook: %pip uninstall Azure
I'm getting:
Python interpreter will be restarted.
Found existing installation: azure 4.0.0
Not uninstalling azure at /databricks/python3/lib/python3.7/site-packages, outside environment /local_disk0/.ephemeral_nfs/envs/pythonEnv-6eab9ca4-4cd6-4bd9-843f-8e33a185c96a
Can't uninstall 'azure'. No files were found to uninstall.
Python interpreter will be restarted.
I don't quite understand this last error message. I want to uninstall the library in the notebook, but I do not want to remove it at the cluster library utility level (it is used in many other notebooks).
Libraries can be installed at two levels when it comes to Databricks:
Workspace library
Cluster library
1. Workspace library
Get into the folder containing the libraries.
Select the name of the library you need to uninstall.
Select the checkbox next to the library you need to uninstall, then confirm.
After confirmation, the status changes to "uninstall pending restart".
2. Cluster library
Go to the library folder.
Select the library.
Select the checkbox next to the name and select Uninstall.
After confirmation it will be in a pending state.
Restart the cluster.
With this procedure, workspace libraries and cluster libraries are kept isolated from each other.
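If the goal is only to change what a single notebook sees, without touching the cluster-scoped azure library, one further option - a sketch, not part of the steps above - is to install the specific packages that notebook needs at notebook scope with %pip, which takes precedence over the cluster installation for that notebook's Python environment only:
# Notebook cell: notebook-scoped install; the cluster-level azure library stays untouched.
# The package list is illustrative - install only what this notebook actually imports.
%pip install azure-identity azure-mgmt-resource azure-mgmt-datafactory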

Using ipython on a different linux account: command gets stuck

I installed miniconda3 on one Linux account, then created an environment py37, installed all the needed packages, and was able to launch ipython from the second account and import the package I wanted to import: hail. For that, I changed all of the permissions on the miniconda3 folder to 777. Somehow, the command gets stuck when run on the second account, but when executed on the initial one where miniconda3 is installed, it runs successfully:
import hail as hl
mt = hl.balding_nichols_model(n_populations=3, n_samples=50, n_variants=100)  # this is the command that gets stuck
mt.count()
The middle command gets stuck. No error, it is just not returning. When I run hl.balding_nichols_model on the original account, it is also giving me a warning (but runs successfully, giving the result in mt.count()):
WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
And the thing is that on the account where the command fails, I have Hadoop mounted, so I have a slight suspicion it is somehow related. I am totally stuck and would appreciate any suggestions. I need to do it this way - installing miniconda3 on one account, then accessing it from the other - because on the first account I have terabytes of free space, but on the second only 4 GB, and it unfortunately can't be expanded further. With miniconda3 I would quickly go over the limit.
Additional info regarding the actual software can be found here:
https://hail.is/docs/0.2/getting_started.html#installation
Update
I separately installed Python 3.7 in the conda installation present on the node and somehow it does not work either, so it is not a permissions problem; the issue is now limited to that particular Linux account. I installed Spark 2.4, but that did not fix the issue. So the middle command in the Python script gets stuck, and I do not even know how to get log output to see what is going on there.
The answer to the 'stuck' issue can be found here:
https://discuss.hail.is/t/spark-2-4-4-gets-stuck-in-initialization-phase/1178
I asked it on the Hail forum and then replied there myself after we fixed the issue. It turned out to be a disk space issue: Hadoop and Spark logs should be redirected to a different location when you do not have enough space on the partition you are working on.
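For anyone hitting the same symptom, a minimal sketch of what redirecting the logs and scratch space can look like at initialization time (the paths are placeholders, and passing spark.local.dir via spark_conf assumes a recent Hail 0.2 release):
import hail as hl
# Point Hail's log and temporary files, and Spark's scratch space, at a partition
# with enough free space (the /big_disk paths below are placeholders).
hl.init(
    log='/big_disk/hail.log',
    tmp_dir='/big_disk/hail_tmp',
    spark_conf={'spark.local.dir': '/big_disk/spark_tmp'},
)
mt = hl.balding_nichols_model(n_populations=3, n_samples=50, n_variants=100)
print(mt.count())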

Azure function - "Did not find any initialized language workers"

I'm running an Azure Function in Azure; the function is triggered by a file being uploaded to a blob storage container. The function detects the new blob (file) but then outputs the following message: Did not find any initialized language workers.
Setup:
Azure Function using Python 3.6.8
Running on a Linux machine
Built and deployed using Azure DevOps (for CI/CD capability)
Blob Trigger Function
I have run the code locally using the same blob storage container and the same configuration values, and the local instance of the Azure Function works as expected.
The function's core purpose is to read the .xml file uploaded to the blob storage container, then parse and transform the XML data to be stored as JSON in Cosmos DB.
I expect the process to complete as it does on my local instance, with my documents ending up in Cosmos DB, but it looks like the function doesn't actually get to process anything due to the following error:
Did not find any initialized language workers
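For reference, a blob-triggered Python function of the shape described above looks roughly like the following minimal sketch; the binding names, the Cosmos DB output binding, and the XML flattening are illustrative assumptions, not taken from the question:
# __init__.py of the blob trigger function (binding names must match function.json).
import json
import logging
import xml.etree.ElementTree as ET
import azure.functions as func

def main(inputblob: func.InputStream, outdoc: func.Out[func.Document]) -> None:
    logging.info('Processing blob %s (%s bytes)', inputblob.name, inputblob.length)
    root = ET.fromstring(inputblob.read())
    # Flatten the top-level XML elements into a dict; the real transformation logic goes here.
    doc = {child.tag: child.text for child in root}
    outdoc.set(func.Document.from_json(json.dumps(doc)))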
Troy Witthoeft's answer was almost certainly the right one at the time the question was asked, but this error message is very general. I've had this error recently on runtime 3.0.14287.0. I saw the error on many attempted invocations over about 1 hour, but before and after that everything worked fine with no intervention.
I worked with an Azure support engineer who gave some pointers that could be generally useful:
Python versions: if you have function runtime version ~3 set under the Configuration blade, then the platform may choose any of python versions 3.6, 3.7, or 3.8 to run your code. So you should test your code against all three of these versions. Or, as per that link's suggestion, create the function app using the --runtime-version switch to specify a specific python version.
Consumption plans: this error may be related to a consumption-priced app having idled off and taking a little longer to warm back up again. This depends, of course, on the usage pattern of the app. (I infer (but the Engineer didn't say this) that perhaps if the Azure datacenter my app is in happens to be quite busy when my app wants to restart, it might just have to wait for some resources to become available.) You could address this either by paying for an always-on function app, or by rigging some kind of heartbeat process to stop the app idling for too long. (Easiest with an HTTP trigger: probably just ping it? See the sketch after these notes.)
The Engineer was able to see a lower-level error message generated by the Azure platform, that wasn't available to me in Application Insights: ARM authentication token validation failed. This was raised in Microsoft.Azure.WebJobs.Script.WebHost.Security.Authentication.ArmAuthenticationHandler.HandleAuthenticate() at /src/azure-functions-host/src/WebJobs.Script.WebHost/Security/Authentication/Arm/ArmAuthenticationHandler.cs. There was a long stack trace with innermost exception being: System.Security.Cryptography.CryptographicException : Padding is invalid and cannot be removed.. Neither of us were able to make complete sense of this and I'm not clear whether the responsibility for this error lies within the HandleAuthenticate() call, or outside (invalid input token from... where?).
The last of these points may be some obscure bug within the Azure Functions Host codebase, or some other platform problem, or totally misleading and unrelated.
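Expanding on the heartbeat idea mentioned above, here is a minimal sketch of an HTTP-triggered 'ping' function (the function and binding names are assumptions) that an external scheduler, Logic App, or uptime monitor could hit every few minutes to keep a consumption-plan app warm:
import azure.functions as func

def main(req: func.HttpRequest) -> func.HttpResponse:
    # No real work is needed; responding at all keeps the worker initialized.
    return func.HttpResponse('alive', status_code=200)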
Same error, but different technology, environment, and root cause.
Technology: .NET 5, target system Windows. In my case, I was using dependency injection to add a few services and was reading one parameter from the environment variables inside the .ConfigureServices() section, but when I deployed I forgot to add that variable to the Application Settings in Azure; because of that I was getting this weird error.
This is due to the SDK version. I would suggest deploying a fresh function app in Azure and deploying your code there. Two things to check:
Make sure your local function app SDK version matches the Azure function app's.
Check the Python version on both sides.
This error is most likely GitHub issue #4384. This bug was identified, and a fix was released mid-June 2020. Apps running on version 3.0.14063 or greater should be fine. The list of versions is here.
You can use Azure Application Insights to check your version: query the logs with Kusto; in the exceptions table, the SDK version column has your version.
If you are on the dedicated App Service plan, you may be able to "pull" the latest version from Microsoft by deleting and redeploying your app. If you are on the consumption plan, then you may need to wait for this bugfix to roll out to all servers.
It took me a while to find the cause as well, but it was related to me explicitly installing a version of protobuf that conflicted with the one used by Azure Functions. To be fair, there was a warning about that in the docs. How I found it: I went to <your app name>.scm.azurewebsites.net/api/logstream and looked for any errors I could find.
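If you suspect a similar dependency conflict, one quick check - a sketch that only assumes protobuf is importable in the deployed environment - is to log the version resolved at runtime from inside the function and compare it with what you pinned:
import logging
import google.protobuf  # the PyPI package is 'protobuf'; the importable module is google.protobuf
logging.info('protobuf version resolved at runtime: %s', google.protobuf.__version__)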

Azure ML Workbench File from Blob

When trying to reference/load a dsource or dprep file generated with a data source file from blob storage, I receive the error "No files for given path(s)".
Tested with .py and .ipynb files. Here's the code:
# Use the Azure Machine Learning data source package
from azureml.dataprep import datasource
df = datasource.load_datasource('POS.dsource') #Error generated here
# Remove this line and add code that uses the DataFrame
df.head(10)
Please let me know what other information would be helpful. Thanks!
Encountered the same issue and it took some research to figure out!
Currently, data source files from blob storage are only supported for two cluster types: Azure HDInsight PySpark and Docker (Linux VM) PySpark
In order to get this to work, it's necessary to follow instructions in Configuring Azure Machine Learning Experimentation Service.
I also ran az ml experiment prepare -c <compute_name> to install all dependencies on the cluster before submitting the first command, since that deployment takes quite a bit of time (at least 10 minutes for my D12 v2 cluster.)
I got the .py files to run with an HDInsight PySpark compute cluster (for data stored in Azure blobs), but .ipynb files are still not working on my local Jupyter server - the cells never finish.
I'm from the Azure Machine Learning team - sorry you are having issues with Jupyter notebook. Have you tried running the notebook from the CLI? If you run from the CLI you should see the stderr/stdout. The IFrame in WB swallows the actual error messages. This might help you troubleshoot.

Excel Data Load using SSIS - Memory used up error

I am trying to load data into an Excel file using an SSIS package. Please find the details below:
Source: SQL Server table
Destination: Excel file
No. of rows: 646K
No. of columns: 132
I have deployed the package to the SQL Server Integration Services catalog and am trying to execute it from there, but the following errors are being thrown:
Not enough storage is available to complete this operation.
The attempt to add a row to the Data Flow task buffer failed with error code 0xC0047020.
SSIS Error Code DTS_E_PRIMEOUTPUTFAILED. The PrimeOutput method on SRC_MDM_ENTITYDUPLICATE returned error code 0xC02020C4. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing. There may be error messages posted before this with more information about the failure.
My DFT uses a Data Conversion transformation, since I am facing a datatype mismatch between Unicode and non-Unicode characters.
The package works fine on my local machine, with 95-99% resource utilization.
Since I have deployed the package in the production environment, I can't make any modifications to the server settings. I also suspect the high resource utilization is causing the issue when executing the package on the production server.
I tried reducing DefaultBufferMaxRows and increasing DefaultBufferSize, which didn't help.
Can somebody help me optimize my package and fix this issue?
Thanks much in advance.
I realized that the error is caused by a column in your package that does not exist in the Excel destination; as a solution, either delete that column from the package or add the missing (empty) columns to the Excel file.
